
 

Educator's Voice

Volume 6, Issue 2
February 9, 2005

Toward Objectivity in Assessment: Applying the NORMS

On the first day of class, most students have the same burning question: "What will I be doing in this course and how will I be evaluated?" They not only want this question answered, they want the answer to be specific, well-reasoned and clear. And they want the standards by which they are evaluated to be fair and applied consistently. As instructors, we, too, strive for fairness and consistency in evaluating our students. This is not an easy task, but it can be helped by devices such as rubrics, explicit grading criteria and the topic of this column, the NORMS. NORMS stands for Not Interpretation, Observable, Reliable, Measurable, and Specific. Together, they provide standards for evaluating information. In this case, the information we are concerned about is student performance.

Not Interpretation
It is tough to evaluate student work without interjecting some degree of our own interpretation of that performance. Indeed, we perceive everything through the filter of our own experiences. What this principle does advise, however, is not letting our personal biases get in the way of fair evaluation. When we hear the word "bias" we may automatically think of partiality toward certain cultures, ethnic groups or personal characteristics. But a bias can be more benign and develop from daily interactions with students. We form general impressions of students from these interactions, and these impressions may become a factor when we later have to evaluate those students formally. For example, consider a student we've all had in class at some point: the one who may be somewhat absent in daily sessions, not fully participating in class discussions, not engaging in group or individual activities. We may form an impression of this student as disengaged, not motivated, and perhaps even as a "slacker." This student, however, may produce a really top-notch paper. In such a case, we are faced with the task of objectively evaluating the paper according to stated grading criteria and applying the standards to the student's work, regardless of the student's other, often unrelated, behaviors.

Observable
When we write learning objectives, we essentially are describing the types of behaviors we would like our students to demonstrate after they have taken our courses. If we look at learning as a change in student behavior, in particular how they behave with respect to the material, our next task is to make sure that the behavior we describe in our learning objectives is actually observable.

For example, we all would like our students to achieve a deeper understanding of the course material. So, now let's ask ourselves: How do I know when my students have reached this level of understanding? And what level of understanding is sufficient for me to say that this learning objective has been met? That we must ask these additional questions means that "understanding" per se is not observable. But explaining concepts to another student is, as is effectively applying a mathematical model to a new set of data. To get to an observable definition, keep asking yourself, "How do I know this is happening?" until your answer is, "Because I can see it."

Reliable
Reliable information is that which is consistent across observations. If we were to observe the very same event twice, our observations should be identical. If they are, this information is reliable. For example, if you weigh yourself on the scale once and then again five minutes later, the two weights should be nearly identical (assuming you neither added nor subtracted clothing or other apparel). If they are, the scale is a reliable measurement tool and the weights themselves are reliable. In science, reliability is a necessary but often overlooked prerequisite for validity. That is, information must be reliable before it can be considered valid. Returning to the scale example, if you continue to weigh yourself every five minutes and each weight is within a fraction of a pound, you can be pretty certain that the scale is reflecting your true weight. If, on the other hand, each weight is different from the others by five to ten pounds, you are not likely to believe the scale is showing your true weight. In this case, the information is unreliable and, hence, invalid.

Applied to student academic performance, reliability is a bit harder to come by, but applying standards for performance consistently to all students helps ensure that the information we provide in our assessments of student work is reliable. For example, the following is an excerpt from an evaluation sheet I use to grade student papers. For each criterion, I circle the point value associated with the level of performance I observe in the paper.

| Criterion | Excellent | Good | Average | Fair | Poor/NA |
| Summarizes information accurately and precisely, providing sufficient information concisely | 25 | 20 | 15 | 10 | 0 |
| Procedures explained thoroughly enough to allow others to replicate exactly | 16 | 6 | 4 | 2 | 0 |

Using this sheet for all papers helps create reliable evaluations and, as a result, enhances their validity. If I were to read each student's paper twice, I would expect my second evaluation to produce the same results as my first. If so, this is a reliable evaluation tool and the actual evaluation is reliable as well.
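The "read it twice" check can be sketched in a few lines of Python. This is only an illustration, not the actual grading sheet: the point values are copied from the excerpt above, while the short criterion keys and the two sample readings are invented for the example.

```python
# Hypothetical encoding of the two rubric rows excerpted above.
# Point values are copied from the sheet; the keys "summary" and
# "procedures" are invented shorthand for the two criteria.
RUBRIC = {
    "summary": {"Excellent": 25, "Good": 20, "Average": 15, "Fair": 10, "Poor/NA": 0},
    "procedures": {"Excellent": 16, "Good": 6, "Average": 4, "Fair": 2, "Poor/NA": 0},
}


def score(ratings):
    """Total the points circled in one reading of a paper."""
    return sum(RUBRIC[criterion][level] for criterion, level in ratings.items())


def disagreements(first, second):
    """List the criteria on which two readings of the same paper differ."""
    return sorted(c for c in RUBRIC if first[c] != second[c])


# Two invented readings of the same paper by the same instructor.
first_reading = {"summary": "Good", "procedures": "Average"}
second_reading = {"summary": "Good", "procedures": "Fair"}

print(score(first_reading))                           # 24
print(score(second_reading))                          # 22
print(disagreements(first_reading, second_reading))   # ['procedures']
```

An empty list of disagreements is what a reliable evaluation would produce. A non-empty list points to the criterion whose wording may need tightening.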

It is important to remember that reliability is necessary but not sufficient for validity. Let's return again to the scale example. If each weighing produces the same result, this might be because the scale does not reset to zero each time, but to 10. Each weight is the same, but they are all ten pounds more than they should be. Thus, the information is reliable but not valid. With regard to our paper grading example, you may apply the same criteria to all student papers, thus producing reliable information, but your idea of good may be someone else's idea of excellent. This issue is one of personal interpretation and we certainly can't be expected to apply the exact same standards as our colleagues. Rather, it is crucial that we apply our own standards to all students consistently.
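For readers who like to see the numbers, here is a minimal sketch of the scale example in Python. The true weight and all of the readings are invented; the sketch simply separates the spread of repeated weighings (reliability) from their average distance from the true value (one ingredient of validity).

```python
from statistics import mean, pstdev


def summarize(readings, true_weight):
    """Return (spread, bias) for repeated weighings of the same person.

    spread: standard deviation of the readings; small means reliable.
    bias:   average reading minus the true weight; this only speaks to
            validity once the readings are already reliable.
    """
    spread = pstdev(readings)
    bias = mean(readings) - true_weight
    return spread, bias


true_weight = 150.0  # invented value for illustration

# A scale that resets to 10 instead of zero: consistent, but 10 pounds heavy.
offset_scale = [160.0, 160.0, 160.0, 160.0]

# A scale whose readings jump around by several pounds.
erratic_scale = [145.0, 155.0, 150.0, 142.0, 158.0]

spread, bias = summarize(offset_scale, true_weight)
print(spread, bias)  # 0.0 10.0 -> reliable, but not valid

spread, bias = summarize(erratic_scale, true_weight)
print(round(spread, 1))  # several pounds of spread -> unreliable, hence invalid
```

Note that the erratic scale's readings happen to average out to the true weight. That does not rescue it, because no single reading can be trusted.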

Measurable
This principle goes hand-in-hand with the principles of Observable and Specific, and introduces the concept of operational definitions. An operational definition is one that specifies not only what is being measured, but also how it is measured. Typically, an operational definition specifies some numerical measurement of the behavior or event being observed. Consider the table below to see how performance factors can be operationalized.

| Behavior | Vague Definition | Operational Definition |
| Participation | Participates in class discussion | Regularly initiates discussion (at least once per session); responds to at least 3 original comments from peers and to 2 responses to original post |
| Speaking | Delivers high-quality speeches in professional manner | States main thesis, provides supporting arguments and draws fitting conclusions; maintains eye contact with audience and uses appropriate gestures |
| Writing | Writes sophisticated papers in a clear, concise manner | Writes papers appropriate for the intended audience. Transitional sentences develop one idea from the previous one. Words carry precise meaning and sentences use only the words required and not any more. The paper is free of spelling, punctuation and grammatical errors. |
| Critical thinking | Demonstrates critical thinking skills | Analyzes information into component parts; draws connections between information from different sources; evaluates information according to specific standards |

Clearly, some of these operational definitions are written in such a way that numerical measurements cannot be made. Often, measurements are made according to some qualitative description as in, for example, the above operational definition of writing. The goal is to be descriptive enough that the definition itself requires as little interpretation as possible.
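One way to test whether a definition is truly operational is to ask whether it could be written as a simple check. Below is a sketch in Python for the participation definition. The thresholds (1, 3 and 2) come from the table; the class name, the field names, and the assumption that all three counts are tallied per session are mine.

```python
from dataclasses import dataclass


@dataclass
class SessionActivity:
    """Hypothetical tally of one student's discussion activity in one session."""
    threads_started: int         # discussions the student initiated
    replies_to_peer_posts: int   # responses to peers' original comments
    replies_to_responses: int    # replies to responses on the student's original post


def meets_participation(activity: SessionActivity) -> bool:
    """Apply the operational definition of participation from the table.

    Assumption: all three thresholds are counted within a single session.
    """
    return (
        activity.threads_started >= 1
        and activity.replies_to_peer_posts >= 3
        and activity.replies_to_responses >= 2
    )


print(meets_participation(SessionActivity(1, 3, 2)))  # True
print(meets_participation(SessionActivity(0, 5, 4)))  # False: never initiated
```

The vague definition, "participates in class discussion," cannot be written this way without first deciding what counts. That deciding is exactly the interpretation an operational definition is meant to remove.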

Specific
In psychology, human observation is a common data collection method. The nature of the subject matter is such that this is often our only way of recording human action. To produce as reliable and valid data as possible, it is crucial that the definitions of the behavior to be observed and the criteria applied by the observers be as specific as possible. Otherwise, the observers may very well be observing two different behaviors but recording them as the same.

We can apply the same rigor to our evaluation of student work. We can make efforts to write learning objectives, grading criteria and other assessment tools so that independent observers can agree on whether or not the student met those objectives or criteria. In the table above, the operational definitions are much more specific than the other definitions. As other examples of improved specificity, think about the instructions you give your students for papers and other assignments. You may provide any number of the following instructions:

Students appreciate such specificity. It not only assures them that the same standards are applied to everyone, but it also helps them prepare and complete their assignments, often much more effectively than they could otherwise.

Final Thoughts
Pure objectivity is never easy. Indeed, some may say it is impossible or something to which we can only aspire, a pipe dream of sorts. I hope this discussion of the NORMS has shown that it is possible to promote objectivity in assessing our students by consistently applying these principles. In the end, of course, it comes down to trusting our abilities to evaluate student work fairly.


       – Jennifer O'Donnell, Ph.D.

TIP

Effective Video Clips

Have you been thinking about adding video to your course? Videos can be valuable teaching tools for demonstrations and examples, or for adding a personal touch to your course. However, videos can also be a drain, technologically and mentally, if they are not used correctly.

It's important to find the right video length. If videos are too long, students often have trouble staying focused. We've found that the best practice for video length in an online class is five to ten minutes. Not only is this length best for student attention spans, but it is also a best practice for technological reasons. Longer videos are difficult for many students to download and can be difficult for their computers to handle.

You can make videos easier to view by using Streaming Media. Upload the .rm file (the video file extension used with Real Presenter) to the Streaming Media folder, which is located under the File Manager on the Course Home Page (click on the Streaming Media link). Then create a link in the course to this file. Because the file is on the media server, it will stream, which means that students do not have to download the file before viewing the video. They simply click on the link, and the file downloads as it plays.


       – Christa Palmer, M.A.