Re: Standardizing evaluation scores
In article [EMAIL PROTECTED], Dennis Roberts [EMAIL PROTECTED] wrote:

> sorry for late reply. ranking is the LEAST useful thing you can do ... so, i would never START with simple ranks. any sort of an absolute kind of scale ... imperfect as it is ... would generally be better ...

You can say that again!

> one can always convert more detailed scale values INTO ranks at the end if necessary, BUT you cannot go the reverse route.

This cannot be overemphasized. We see much of this: how valid are the values of the current IQ scales, where the scores are obtained by converting the raw scores to a normal distribution? The same is done in other tests of this type. We need to teach, in our beginning courses, not to transform unless one has a REALLY good reason to do so, and obtaining normality is not one.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]  Phone: (765) 494-6054  FAX: (765) 494-0558

= Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/ =
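[For concreteness, the rank-to-normal transformation Rubin criticizes above can be sketched as follows. This is a generic van der Waerden-style normalization, not any particular test publisher's procedure; the function name and the default mean/SD of 100/15 are illustrative assumptions.]

```python
from statistics import NormalDist

def normal_scores(raw, mean=100.0, sd=15.0):
    """Map raw scores onto a normal scale via their ranks.
    A sketch of the kind of normalization used for IQ-style
    scales (ties are ignored for brevity)."""
    n = len(raw)
    order = sorted(range(n), key=lambda i: raw[i])  # indices, lowest score first
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = rank / (n + 1)           # plotting position in (0, 1)
        z = NormalDist().inv_cdf(p)  # the matching standard normal quantile
        out[i] = mean + sd * z
    return out
```

[Note how the output depends only on the ranks: the raw scores 4, 12, 7 and 4, 120, 7 produce identical normal scores, which is exactly the information loss being objected to.]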
Re: Standardizing evaluation scores
sorry for late reply. ranking is the LEAST useful thing you can do ... so, i would never START with simple ranks. any sort of an absolute kind of scale ... imperfect as it is ... would generally be better ...

one can always convert more detailed scale values INTO ranks at the end if necessary, BUT you cannot go the reverse route. say we have 10 people measured on variable X ... and we end up with no ties ... so, we get ranks of 1 to 10 ... but these values give NO idea whatsoever as to the differences amongst the 10.

if i had a 3 person senior high school class with cumulative gpas of 4.00, 3.97, and 2.38 ... the ranks would be 1, 2, and 3 ... but clearly, there is a huge difference between either of the top 2 and the bottom ... ranks give no clue to this at all.

so, my message is ... DON'T START WITH RANKS

At 02:11 AM 12/19/01, Doug Federman wrote:

> I have a dilemma which I haven't found a good solution for. I work with students who rotate with different preceptors on a monthly basis. A student will have at least 12 evaluations over a year's time. A preceptor usually will evaluate several students over the same year. Unfortunately, the preceptors rarely agree on the grades. One preceptor is biased towards the middle of the 1-9 Likert scale and another may be biased towards the upper end. Rarely does a given preceptor use the 1-9 range completely. I suspect that a 6 from an easy grader is equivalent to a 3 from a tough grader.
>
> I have considered using ranks to give a better evaluation for a given student, but I have a serious constraint. At the end of each year, I must submit to another body their evaluation on the original 1-9 scale, which is lost when using ranks. Any suggestions?
>
> --
> It has often been remarked that an educated man has probably forgotten most of the facts he acquired in school and university. Education is what survives when what has been learned has been forgotten. - B.F. Skinner, New Scientist, 31 May 1964, p. 484

dennis roberts, educational psychology, penn state university
208 cedar, AC 814-863-2401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Standardizing evaluation scores
Thanks for your responses. I have already considered most of these issues, but reached no clear solution. I have tried educating the evaluators: high graders remain high graders, and some use the entire scale. I'll just have to try a few methods out and see what works. I do have external objective knowledge evaluations from standardized testing, and will see how they relate to the scores.
Re: Standardizing evaluation scores
Glen Barnett [EMAIL PROTECTED] wrote in sci.stat.edu:

> Stan Brown wrote:
>> But is it worth it? Don't the easy graders and tough graders pretty much cancel each other out anyway?
>
> Not if some students only get hard graders and some only get easy graders.

Right you are -- I read the OP's article as saying every student was evaluated by every preceptor, but when I look back again I see that I misread it.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com/
My theory was a perfectly good one. The facts were misleading. -- /The Lady Vanishes/ (1938)
Re: Standardizing evaluation scores
A classic problem of 'norming' or 'standardizing' the scale and the preceptors. Can you find a couple of students who fall near the bottom and top of the scale? Preferably ones whose final rankings are not 'permanent record'? Then you would have each preceptor use these two students as 'baseline' indicators of what a 2 means and what an 8 means. Then have each person do the regular ranking of students, using these as your indicators.

It might be possible for the attendant group of preceptors to agree on the ranking of a pair of students in each specialty or area, then use these for ranking within that specialty.

Failing this kind of development of mutual agreement, you might be able to describe a 2 or 3 rating, and a 7 or 8 rating, in such a way that generalized agreement would be obtained, and each grade would be set in comparison to this descriptive scale. This is essentially what the Baldrige Criteria do, for industrial/educational/health care operations.

Of course, if it's grades we are discussing, it is entirely likely that virtually nobody gets grades in certain ranges, such as the equivalent of C or below on an A-F scale. If Harvard can graduate over half a class as Cum Laude, the rest of us can skew grades anywhere we like.

Jay

Doug Federman wrote:
> I have a dilemma which I haven't found a good solution for. I work with students who rotate with different preceptors on a monthly basis. [snip]
> At the end of each year, I must submit to another body their evaluation on the original 1-9 scale, which is lost when using ranks. Any suggestions?

--
Jay Warner, Principal Scientist, Warner Consulting, Inc.
North Green Bay Road, Racine, WI 53404-1216 USA
Ph: (262) 634-9100  FAX: (262) 681-1133
email: [EMAIL PROTECTED]  web: http://www.a2q.com
The A2Q Method (tm) -- What do you want to improve today?
Re: Standardizing evaluation scores
A naive solution seems reasonable if I am willing to assume that students are randomly assigned to the preceptors for evaluation. If so, I'd expect the average rating given by each judge to be the same. So, force the judges' means to be equal to the overall mean by dividing each individual score appropriately, then calculate the students' averages. Of course the results are no longer integers 1 to 9, but that's where you, the filter to that other body, will have the Procrustean responsibility!

If there is no balance (if, for instance, some student is rated 12 times by the same judge while others are rated by 12 different judges), there probably is no good model for the process. The literature on rating the strength of sports teams in a league with an unbalanced schedule might give some hints on how to proceed.

Doug Federman wrote:
> I have a dilemma which I haven't found a good solution for. I work with students who rotate with different preceptors on a monthly basis. [snip]
> At the end of each year, I must submit to another body their evaluation on the original 1-9 scale, which is lost when using ranks. Any suggestions?

--
Neil W. Henry
Department of Sociology and Anthropology
Department of Statistical Sciences and Operations Research
Box 843083, Virginia Commonwealth University, Richmond VA 23284-3083
(804) 828-1301 x124  FAX: 828-8785
http://www.people.vcu.edu/~nhenry
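[The judge-mean equalization described above could be sketched as follows. The (judge, student, score) triple layout and both function names are my own assumptions about how the data might be organized; the adjustment itself follows the post, rescaling each judge's scores so every judge's mean equals the grand mean.]

```python
from collections import defaultdict

def equalize_judge_means(ratings):
    """Rescale scores so each judge's mean equals the overall mean:
    divide each score by that judge's mean, multiply by the grand mean.
    `ratings` is a list of (judge, student, score) triples."""
    by_judge = defaultdict(list)
    for judge, _, score in ratings:
        by_judge[judge].append(score)
    grand_mean = sum(s for _, _, s in ratings) / len(ratings)
    judge_mean = {j: sum(v) / len(v) for j, v in by_judge.items()}
    return [(j, st, s * grand_mean / judge_mean[j])
            for j, st, s in ratings]

def student_averages(adjusted):
    """Average each student's adjusted scores."""
    by_student = defaultdict(list)
    for _, st, s in adjusted:
        by_student[st].append(s)
    return {st: sum(v) / len(v) for st, v in by_student.items()}
```

[For example, if judge A gives students x and y scores of 6 and 8 while judge B gives the same two students 3 and 4, the adjustment maps both judges onto the same values, since they agree on the relative standing of the students.]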
Re: Standardizing evaluation scores
Doug Federman [EMAIL PROTECTED] wrote in sci.stat.edu:

> I have a dilemma which I haven't found a good solution for. [snip] Rarely does a given preceptor use the 1-9 range completely. I suspect that a 6 from an easy grader is equivalent to a 3 from a tough grader.

First, it is rare that _any_ survey gets a significant number of responses at either end. People tend to think, "Hmm, 1 to 9. Well, 1 would be perfect and 9 would be valueless [or vice versa]. Nobody's perfect, so I'll write down a 2."

> I have considered using ranks to give a better evaluation for a given student, but I have a serious constraint. At the end of each year, I must submit to another body their evaluation on the original 1-9 scale, which is lost when using ranks. Any suggestions?

You could make a case for almost any jiggering of the numbers -- and a case against it too. Since the 12 preceptors are grading the same group of people, you could justify various forms of normalizing. But is it worth it? Don't the easy graders and tough graders pretty much cancel each other out anyway?

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct reply address is more important to me than time spent deleting spam.
Re: Standardizing evaluation scores
Stan Brown wrote:
> But is it worth it? Don't the easy graders and tough graders pretty much cancel each other out anyway?

Not if some students only get hard graders and some only get easy graders. If all students got all graders an equal amount of time, it probably wouldn't matter at all.

Glen
Re: Standardizing evaluation scores
Glen Barnett [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

> Stan Brown wrote:
>> But is it worth it? Don't the easy graders and tough graders pretty much cancel each other out anyway?
>
> Not if some students only get hard graders and some only get easy graders. If all students got all graders an equal amount of time, it probably wouldn't matter at all.

If some graders use the whole scale and others only use part of the scale, or concentrate grades near the centre, then using raw scores means you are giving the full-scale graders more weight in the overall ranking of students. If this is undesirable, grades could be scaled to a common mean and equal mean deviation. (Standard deviation would give increased weight to the extremes of the scale.)

In all these adjustments we lose transparency of the process, and this must be weighed against the gains. I suspect that only sharp contrasts between the behaviour of the graders, and/or different students having different sets of graders, would justify this, and the problem may well be better dealt with by instructing the graders appropriately, after pointing out that it is desirable for all graders to have equal weight in the assessment.
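[The common-mean, equal-mean-deviation scaling suggested above might look like this per grader. A sketch under the post's own assumptions; the function name and target values are illustrative. Mean absolute deviation is used rather than standard deviation, for the reason given in the post.]

```python
def scale_scores(scores, target_mean, target_md):
    """Shift and scale one grader's scores to a common mean and a
    common mean (absolute) deviation, so that graders who use more
    of the scale do not get extra weight in the overall ranking."""
    m = sum(scores) / len(scores)
    md = sum(abs(s - m) for s in scores) / len(scores)
    if md == 0:                      # grader gave everyone the same score
        return [float(target_mean)] * len(scores)
    return [target_mean + (s - m) * target_md / md for s in scores]
```

[Each grader's scores would be passed through this separately, with the same targets (say, the pooled mean and pooled mean deviation), before averaging per student. Note the results can land outside 1-9, so some final mapping back to the reporting scale would still be needed.]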