Re: Evaluating students: A Statistical Perspective
On 7 Dec 2001 14:24:17 -0800, [EMAIL PROTECTED] (Dennis Roberts) wrote: At 08:08 PM 12/7/01 +, J. Williams wrote: On 6 Dec 2001 11:34:20 -0800, [EMAIL PROTECTED] (Dennis Roberts) wrote: if anything, selectivity has decreased at some of these top schools due to the fact that given their extremely high tuition ... i was just saying that IF anything had happened ... that it might have gone down ... i was certainly not saying that it had ... but i do think that it could probably not get too much more selective ... so it probably has sort of stayed where it has over the decades ... so if grade inflation has occurred there it would not likely be due to an increased smarter incoming class (In the NY Times) At Harvard in particular, the interviewees claimed that the present freshmen had notably better SATs than those of a generation ago -- There are not nearly so many people with so-so scores (alumni offspring?), and a quarter of the class now has SATs that are perfect 1600, or nearly that (?no explanation of what 'nearly' means). I go along with the notion that, in the long run, if there is to be special meaning to being an honors graduate from Harvard, it can't mean top 75% of the class. (I think that is what someone reported, somewhere.) I remember reading, years ago, that the Japanese school trajectory differed from ours -- they learnt a lot before college, and college was a long party before starting a career. (This was a few years ago.) Their life-long success was pre-determined largely by which-university accepted them; it sounded like the old-school-tie was a huge social asset. Reportedly, that was why their high school students worked so hard on cram courses and extra studying; college was 4 years of party. - Since they are sliding away from lifetime employment, etc., I wonder if the educational system is becoming more flexible and technocratic, too. Are our systems converging? 
-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Evaluating students: A Statistical Perspective
At 08:08 PM 12/7/01 +, J. Williams wrote: On 6 Dec 2001 11:34:20 -0800, [EMAIL PROTECTED] (Dennis Roberts) wrote: if anything, selectivity has decreased at some of these top schools due to the fact that given their extremely high tuition ... i was just saying that IF anything had happened ... that it might have gone down ... i was certainly not saying that it had ... but i do think that it could probably not get too much more selective ... so it probably has sort of stayed where it has over the decades ... so if grade inflation has occurred there it would not likely be due to an increased smarter incoming class _ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Evaluating students: A Statistical Perspective
Just in case someone is interested in the Harvard instance that I mentioned -- while you might get the article from a newsstand or a friend --

On Sun, 02 Dec 2001 19:19:38 -0500, Rich Ulrich [EMAIL PROTECTED] wrote: [ ... ] Now, in the NY Times, just a week or two ago. The dean of undergraduates at Harvard has a complaint about grade inflation. More than 48% of all undergraduate grades last year were A. (In 1986, it was only 34% or so.) Only 6% of present grades were C or D or F. The dean has asked the faculty to discuss it, which is as much as she can do. I don't know: Would the A's emerge as scores on-a-curve, or are the lessons so easy that all the answers are right? [ snip, rest]

Section A of the NY Times on Wed., Dec 5, had another article (page 14) and a column (page 21). There were specific comments *contrary* to some obvious notions of grade inflation as an arbitrary and bad thing: some were presented as opinion, and others as apparent fact.

Recent Harvard students have higher SATs than ever. Students at a particular level (of SAT, or otherwise) supposedly are performing better. The Dean of Harvard College (a subunit, I think) says that his students (in computer science) handle some previously-tough problems much more easily. [ And I wonder, Is that peculiar to cs?] Someone else was quoted, that the performance needed for an A had not changed.

Amongst the commentary in the column - Comments on educational research: Good students (some research says) learn more if top grades are kept lower, but lower grading can discourage poorer students and increase dropout rates. - Both effects are easy to imagine, somewhere, sometime.

-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Re: Evaluating students: A Statistical Perspective
generally speaking, it is kind of difficult to muster sufficient evidence that the amount of grade inflation that is observed ... within and across schools or colleges ... is due to an increase in student ability

i find it difficult to believe that the average ability at a place like harvard has gone up ... but if so, very much over the years ... if anything, selectivity has decreased at some of these top schools due to the fact that given their extremely high tuition ... they need to keep their dorms full, and making standards higher and higher would have the opposite effect on keeping dorms filled

At 11:58 AM 12/6/01 -0500, Rich Ulrich wrote: Just in case someone is interested in the Harvard instance that I mentioned -- while you might get the article from a newsstand or a friend -- On Sun, 02 Dec 2001 19:19:38 -0500, Rich Ulrich [EMAIL PROTECTED] wrote:

_ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Evaluating students: A Statistical Perspective
On Sun, 02 Dec 2001 19:19:38 -0500, Rich Ulrich [EMAIL PROTECTED] wrote: With the curve, and low, low averages, you do notice that a single *good* performance can outweigh several poor ones. So that is good.

It is good, but conversely having several high scores even with low, low averages and then receiving a single disastrously low score can be a bummer of the first order. I remember this happening to me a couple of times...no fun at all!
Re: Evaluating students: A Statistical Perspective
In article [EMAIL PROTECTED], Thom Baguley [EMAIL PROTECTED] wrote: Donald Burrill wrote: On Fri, 23 Nov 2001, L.C. wrote: The question got me thinking about this problem as a multiple comparison problem. Exam scores are typically sums of problem scores. The problem scores may be thought of as random variables. By the central limit theorem, the distribution of a large number of test scores should look like a Normal distribution, Provided, of course, that the test scores in question are iid. Now it is possible to imagine that test scores for different persons are measured independently (although I am aware of skepticism in the ranks on this point!), but that they are identically distributed seems unlikely at best. The number of people is completely irrelevant, although one could get a better estimate of the distribution. This is the case even if they are independent. I'd argue that they probably aren't that independent. If I ask three questions all involving simple algebra and a student doesn't understand simple algebra they'll probably get all three wrong. In my experience most statistics exams are better represented by a bimodal (possibly a mix of two skewed normals) than a normal distribution. Essay based exams tend to end up with a more unimodal distribution (though usually still skewed). If one used a large number of questions, the total score of an individual, GIVEN the ability of that individual, may well approximate normality. However, I believe that most current tests have too many small questions as it is. There is not even a fair reason for believing that the ability of people selected at random is at all close to the normal distribution, the origin of the word normal notwithstanding, and it is even less so for those in a class which is not a random sample from humanity. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. 
of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558
Re: Evaluating students: A Statistical Perspective
Hi On Tue, 27 Nov 2001, Thom Baguley wrote: I'd argue that they probably aren't that independent. If I ask three questions all involving simple algebra and a student doesn't understand simple algebra they'll probably get all three wrong. In my experience most statistics exams are better represented by a bimodal (possibly a mix of two skewed normals) than a normal distribution. Essay based exams tend to end up with a more unimodal distribution (though usually still skewed). The distribution of grades will depend on the distribution of difficulties of the items, one of the elements examined by psychometrists in the development of professional-quality assessments. To use the 3 question example, if the questions are of the same difficulty, then scores of 0 or 3 could easily result. But if the questions are graduated to be easy, moderate, and difficult, then scores of 0, 1, 2, and 3 are more likely to result, with the actual distribution depending largely on the distribution of the underlying ability (with considerable amounts of noise added in). With larger numbers of questions, then the distribution of scores will depend on the proportion of questions at different degrees of difficulty, on the distribution of the underlying ability (or abilities), and on extraneous factors. Best wishes Jim James M. Clark (204) 786-9757 Department of Psychology(204) 774-4134 Fax University of Winnipeg 4L05D Winnipeg, Manitoba R3B 2E9 [EMAIL PROTECTED] CANADA http://www.uwinnipeg.ca/~clark = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
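The three-question example above can be tried out numerically. What follows is only a rough sketch and was not posted in the thread: it assumes a logistic item response model with standard-normal ability, and the name simulate_scores, the discrimination value 4.0, and the difficulty values are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def simulate_scores(difficulties, n_students=20000, discrimination=4.0, seed=0):
    """Return a Counter of total scores under a simple logistic item model.

    Assumption (not from the thread):
    P(correct) = 1 / (1 + exp(-a * (ability - difficulty))),
    with ability drawn from a standard normal distribution.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_students):
        ability = rng.gauss(0.0, 1.0)
        score = 0
        for d in difficulties:
            p = 1.0 / (1.0 + math.exp(-discrimination * (ability - d)))
            if rng.random() < p:
                score += 1
        counts[score] += 1
    return counts


def extreme_share(counts):
    """Fraction of students scoring 0 or 3 on a three-item test."""
    total = sum(counts.values())
    return (counts[0] + counts[3]) / total


# Three items of the same difficulty versus easy, moderate, difficult.
equal = simulate_scores([0.0, 0.0, 0.0])
graduated = simulate_scores([-1.0, 0.0, 1.0])

for label, counts in (("equal", equal), ("graduated", graduated)):
    shares = {s: round(counts[s] / sum(counts.values()), 3) for s in range(4)}
    print(label, shares, "share at 0 or 3:", round(extreme_share(counts), 3))
```

Under these assumptions the equal-difficulty test should pile most students at 0 or 3, while the graduated test spreads them over 0, 1, 2, and 3. A different ability distribution or a noisier item model would change the exact shares.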
Re: Evaluating students: A Statistical Perspective
Hi On 25 Nov 2001, Herman Rubin wrote: If it is a good test, ability should predominate, and there is absolutely no reason for ability to even have close to a normal distribution. If one has two groups with different normal distributions, combining them will never get normality. I think that no reason is too strong. The typical explanation for normally distributed polygenic traits (ability, height, or whatever) is that each of a large number of genes contributes some small component to the trait. With enough genes, the ultimate distribution will be reasonably well approximated by the normal (analogous to the normal approximation to the binomial). You don't need to accept genetic mechanisms to find some reasonable reason to think that test performance and other trait measures will be normally distributed, or at least approximately so. If we appreciate that performance depends on a host of differentiated factors (e.g., having a good night's sleep, having just happened to study a particular kind of problem more than some other, having distracting thoughts or not, not misreading the question, different kinds of ability, and so on ...), then again a normal-like distribution will emerge. This isn't to deny Herman's basic point that a set of marks can contain results from different underlying populations. Best wishes Jim James M. Clark (204) 786-9757 Department of Psychology(204) 774-4134 Fax University of Winnipeg 4L05D Winnipeg, Manitoba R3B 2E9 [EMAIL PROTECTED] CANADA http://www.uwinnipeg.ca/~clark = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
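Both points in this exchange can be checked with a small simulation. This is a sketch under stated assumptions rather than anything from the thread: the "many small factors" are modelled as 100 independent coin flips, the two groups as an equal mixture of N(-2, 1) and N(+2, 1), and excess kurtosis is used as a crude yardstick of normality (about 0 for normal data, clearly negative for a two-humped distribution). All names and parameter values are hypothetical.

```python
import random
import statistics


def excess_kurtosis(xs):
    """Sample excess kurtosis: near 0 for normal data, negative for flat or bimodal data."""
    m = statistics.fmean(xs)
    var = statistics.fmean([(x - m) ** 2 for x in xs])
    fourth = statistics.fmean([(x - m) ** 4 for x in xs])
    return fourth / var ** 2 - 3.0


rng = random.Random(1)
n = 10000

# Many small independent contributions: each value is a sum of 100 coin-flip factors.
many_factors = [sum(rng.random() < 0.5 for _ in range(100)) for _ in range(n)]

# Two subpopulations with different means: equal mixture of N(-2, 1) and N(+2, 1).
mixture = [rng.gauss(-2.0 if rng.random() < 0.5 else 2.0, 1.0) for _ in range(n)]

k_sum = excess_kurtosis(many_factors)
k_mix = excess_kurtosis(mixture)
print(f"sum of 100 factors: excess kurtosis {k_sum:+.2f}")
print(f"two-group mixture:  excess kurtosis {k_mix:+.2f}")
```

The sum of many small factors should come out close to normal on this measure, while the two-group mixture should not. How visible the departure is depends on how far apart the group means are; the gap of four standard deviations used here is deliberately large.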
Re: Evaluating students: A Statistical Perspective
At 01:35 PM 11/28/01 -0600, jim clark wrote: Hi On Tue, 27 Nov 2001, Thom Baguley wrote: I'd argue that they probably aren't that independent. If I ask three questions all involving simple algebra and a student doesn't understand simple algebra they'll probably get all three wrong. In my experience most statistics exams are better represented by a bimodal (possibly a mix of two skewed normals) than a normal distribution. Essay based exams tend to end up with a more unimodal distribution (though usually still skewed). The distribution of grades will depend on the distribution of difficulties of the items, one of the elements examined by psychometrists in the development of professional-quality assessments. well, not exactly ... it depends on a joint function of how hard items turn OUT to be AND, where i set the cut scores for grades items can be real difficult ... but still exhibit some spread .. hence my distribution of grades may or may not exhibit some spread depending on where i set the A, B, etc. points item difficulties will determine (usually) the general SHAPE of the distribution of SCORES ... but grades are on top of scores and do NOT have to conform to the shape of the distribution of scores unless your semantics was equating the term grades with the term scores ... _ dennis roberts, educational psychology, penn state university 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED] http://roberts.ed.psu.edu/users/droberts/drober~1.htm = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
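The distinction between scores and grades can be made concrete. In this minimal sketch the scores, the cut points, and the function name assign_grades are all made up for illustration; the only point is that one fixed set of scores gives different grade distributions depending on where the A, B, etc. points are set.

```python
from bisect import bisect_right
from collections import Counter


def assign_grades(scores, cuts, labels=("F", "D", "C", "B", "A")):
    """Map scores to letter grades.

    cuts holds the ascending minimum scores for D, C, B, and A;
    anything below cuts[0] is an F.
    """
    return [labels[bisect_right(cuts, s)] for s in scores]


# One hypothetical set of scores on a hard test (nobody scores above 77).
scores = [35, 42, 48, 51, 55, 58, 61, 64, 70, 77]

# The same scores under two hypothetical sets of cut points.
strict = Counter(assign_grades(scores, cuts=[60, 70, 80, 90]))
lenient = Counter(assign_grades(scores, cuts=[40, 50, 60, 70]))

print("strict: ", dict(strict))
print("lenient:", dict(lenient))
```

The shape of the score distribution is identical in both cases; only the cut points differ.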
Re: Evaluating students: A Statistical Perspective
Hi On 28 Nov 2001, Dennis Roberts wrote: At 01:35 PM 11/28/01 -0600, jim clark wrote: The distribution of grades will depend on the distribution of difficulties of the items, one of the elements examined by psychometrists in the development of professional-quality assessments. unless your semantics was equating the term grades with the term scores ... Wasn't that obvious from the discussion which immediately followed the above introductory statement (and which you have cut out of your reply)? Best wishes Jim James M. Clark (204) 786-9757 Department of Psychology(204) 774-4134 Fax University of Winnipeg 4L05D Winnipeg, Manitoba R3B 2E9 [EMAIL PROTECTED] CANADA http://www.uwinnipeg.ca/~clark = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Evaluating students: A Statistical Perspective
Donald Burrill wrote: On Fri, 23 Nov 2001, L.C. wrote: The question got me thinking about this problem as a multiple comparison problem. Exam scores are typically sums of problem scores. The problem scores may be thought of as random variables. By the central limit theorem, the distribution of a large number of test scores should look like a Normal distribution, Provided, of course, that the test scores in question are iid. Now it is possible to imagine that test scores for different persons are measured independently (although I am aware of skepticism in the ranks on this point!), but that they are identically distributed seems unlikely at best. I'd argue that they probably aren't that independent. If I ask three questions all involving simple algebra and a student doesn't understand simple algebra they'll probably get all three wrong. In my experience most statistics exams are better represented by a bimodal (possibly a mix of two skewed normals) than a normal distribution. Essay based exams tend to end up with a more unimodal distribution (though usually still skewed). Thom = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Evaluating students: A Statistical Perspective
On Tue, 27 Nov 2001, Thom Baguley wrote in part: Donald Burrill wrote: On Fri, 23 Nov 2001, L.C. wrote: The question got me thinking about this problem as a multiple comparison problem. Exam scores are typically sums of problem scores. The problem scores may be thought of as random variables. By the central limit theorem, the distribution of a large number of test scores should look like a Normal distribution, Provided, of course, that the test scores in question are iid. Now it is possible to imagine that test scores for different persons are measured independently (although I am aware of skepticism in the ranks on this point!), but that they are identically distributed seems unlikely at best. I'd argue that they probably aren't that independent. If I ask three questions all involving simple algebra and a student doesn't understand simple algebra they'll probably get all three wrong. True. But this does not seem to me to speak to the issue of independence, which as I understand it is an assumption that responses made by student A to items on a test are unrelated to (i.e., do not affect and are not affected by) the responses made by student B to those items. Surely student A, who has not (let us suppose) adequately remembered what s/he needs to know of simple algebra, is not to be held responsible for the fact that student B doesn't remember any either? In my experience most statistics exams are better represented by a bimodal (possibly a mix of two skewed normals) than a normal distribution. Essay based exams tend to end up with a more unimodal distribution (though usually still skewed). Interesting. 
Scores on my exams tend to be negatively skewed in general, and to show evidence of several clusters (that may or may not show up as apparent modes): the several persons at the bottom, often clustered at some little distance from their nearest neighbor(s), who almost seem determined to fail; and two to four clusters moving up the scale from there, which sometimes fall into ranges useful for grades of D, C, B. Sometimes, but not always, there are another few students clustered at the top.

-- Don. Donald F. Burrill [EMAIL PROTECTED] 184 Nashua Road, Bedford, NH 03110 603-471-7128
Re: Evaluating students: A Statistical Perspective
In article [EMAIL PROTECTED], L.C. [EMAIL PROTECTED] wrote:

The question got me thinking about this problem as a multiple comparison problem. Exam scores are typically sums of problem scores. The problem scores may be thought of as random variables. By the central limit theorem, the distribution of a large number of test scores should look like a Normal distribution, and it typically (though not always) does. Hence the well known bell curve. (Assume, for the sake of argument, that it holds here.)

This is so completely erroneous as to demand being put in the garbage and removed completely. For one thing, few tests have that many problems, and better tests have fewer and longer problems, with unequal weights. Even then, the problem scores are not independent, but at least highly correlated.

Here's the problem. Is the bell curve the result of a distribution of abilities/preparations, or is it a distribution of totally random nonsense?

If it is a good test, ability should predominate, and there is absolutely no reason for ability to even have close to a normal distribution. If one has two groups with different normal distributions, combining them will never get normality.

When testing, say, the efficacy of a similar number of, say, drugs, we might be disturbed at a normal distribution. We would say that drugs A and B were in the top 5%, but that proves nothing because that many drugs would have turned out that way at random.

There is no reason here for anything like the normal distribution.

OTOH with students, we immediately leap to the conclusion that the top testers are superior to the others. Is either perspective justifiable? Why?

With most tests, it is questionable, but that is not because of the random variation, but because the tests are designed to test trivial pursuit, not long-term understanding. A good test is one on which someone who has merely memorized the book would not achieve a high score, but someone who understands the concepts and has not looked at many of the details would ace it.

-- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558
Re: Evaluating students: A Statistical Perspective
In article [EMAIL PROTECTED], Donald Burrill [EMAIL PROTECTED] wrote: On Sat, 24 Nov 2001, L.C. wrote: As for the iid, it's reasonable to believe the questions could be drawn from some population. Why not the answers? If the questions are selected in accordance with some table of specifications, they are not from _a_ population, but from many; and there is no _a priori_ reason I can think of to suppose that their item characteristics are iid. Much of the psychological theory here is derived from testing short-term memorization of nonsense syllables, to try to get to the model used. There is no reason that this should be relevant to evaluating how well students understand subjects. As for the answers, the usual reason for wanting to evaluate students is precisely because they are (or one hopes they are!) different in their levels of skill (or whatever): the task is to assess these skill levels, and it is nonsense to assume that all the persons are id on the measure on which one hopes to identify differences. (Hey! I've heard much worse justifications for statistical assumptions! :) At any rate, bell curves do arise often enough in this context to be written about. Of course, bell curve does not necessarily imply normal distribution. You can get quite nice bell curves from binomial distributions, e.g. Also of course, any real data must be discrete, not continuous, so cannot technically be normally distributed anyway. (It is possible that the distribution may be more or less well approximated by a normal distribution with the same mean variance, but that's not the same thing.) As for wanting gaps in the resulting distribution... That was my point. When you do have a bell curve, it shouldn't be satisfying; it should be disturbing. Depends on how bell-like the curve is. For almost any interesting variable that can be measured on humans, one expects rather a lot of people in the middle, and progressively fewer toward the extremes, of the distribution; doesn't one? 
(And if not, why not?) There are likely to be SAMPLE gaps. This is the maddening aspect of psychometry - they engineer these nice normal distributions on which to base their diagnoses. You'd think they'd *want* bimodal, discrete, or mixed continuous/discrete distributions, but no. They diagnose by Z scores (thereby defining their own prevalences :) and assert that they are discovering diseases, and not punishing unusual people. Anyone who converts data to normality, or even standardizes variances, is using statistics as pure ritual. There is often a justification for using procedures based on the normal distribution; they often work well in general. Least squares is one of these. Best Regards, -Larry (And they get to testify in court) C. Hmm. This thread started out as evaluating students, in the context of classes and teacher-made tests, as I recall. Not exactly the same thing as diagnosing (in a quasi-medical sense) or discovering diseases, I shouldn't think. One wonders, then, why you aren't posting these complaints in a newsgroup of psychometricians, rather than one of statistics teachers? He may have questions about the religious ritual, and want to get opinions from those who have not been brainwashed by the priests. Psychometricians and educationists do act as if those who are outstanding are in the category of diseased. They are doing their best to keep them from learning anything near what they can learn. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
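The remark about Z scores "defining their own prevalences" can be illustrated with a short calculation. This is a sketch that assumes the scores have been scaled to be exactly normal; the function name prevalence_above and the cutoff values are hypothetical. Under that assumption, the share of people flagged is fixed by the choice of cutoff alone.

```python
from statistics import NormalDist


def prevalence_above(z_cutoff):
    """Share of an exactly normal population scoring above the cutoff (in SD units)."""
    return 1.0 - NormalDist().cdf(z_cutoff)


# The flagged share depends only on the cutoff, not on what is being measured.
for z in (1.0, 1.5, 2.0, 3.0):
    print(f"cutoff z = {z:.1f}: flagged share = {prevalence_above(z):.4f}")
```

With real scores that are only roughly normal, the flagged share would differ somewhat from these figures, but the mechanism is the same.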
Re: Evaluating students: A Statistical Perspective
Donald Burrill wrote: On Sat, 24 Nov 2001, L.C. wrote: Thanks for the reply! As for the iid, it's reasonable to believe the questions could be drawn from some population. Why not the answers? If the questions are selected in accordance with some table of specifications, they are not from _a_ population, but from many; and there is no _a priori_ reason I can think of to suppose that their item characteristics are iid. Actually, it's not so much the questions, but the answers that must be iid. Also, suppose we have two subject areas. The specific questions for each are largely arbitrary, but the kind of question, particularly in, say, high school is not. So you get pools of two kinds of questions. I claim the responses to each may indeed be iid. If you take the sums of any proportion of the answers to one kind with those of another, you get iid responses. If you take sums of those, you get ~N. Your remark about the number of questions may be fulfilled by, say, a final or midterm. As for the answers, the usual reason for wanting to evaluate students is precisely because they are (or one hopes they are!) different in their levels of skill (or whatever): the task is to assess these skill levels, and it is nonsense to assume that all the persons are id on the measure on which one hopes to identify differences. (Hey! I've heard much worse justifications for statistical assumptions! :) At any rate, bell curves do arise often enough in this context to be written about. Of course, bell curve does not necessarily imply normal distribution. On the contrary. It does, as it is normally used You can get quite nice bell curves from binomial distributions, e.g. Also of course, any real data must be discrete, not continuous, so cannot technically be normally distributed anyway. (It is possible that the distribution may be more or less well approximated by a normal distribution with the same mean variance, but that's not the same thing.) 
As for wanting gaps in the resulting distribution... That was my point. When you do have a bell curve, it shouldn't be satisfying; it should be disturbing. Depends on how bell-like the curve is. For almost any interesting variable that can be measured on humans, one expects rather a lot of people in the middle, and progressively fewer toward the extremes, of the distribution; doesn't one? (And if not, why not?) This is the maddening aspect of psychometry - they engineer these nice normal distributions on which to base their diagnoses. You'd think they'd *want* bimodal, discrete, or mixed continuous/discrete distributions, but no. They diagnose by Z scores (thereby defining their own prevalences :) and assert that they are discovering diseases, and not punishing unusual people. Best Regards, -Larry (And they get to testify in court) C. Hmm. This thread started out as evaluating students, in the context of classes and teacher-made tests, as I recall. Not exactly the same thing as diagnosing (in a quasi-medical sense) or discovering diseases, I shouldn't think. One wonders, then, why you aren't posting these complaints in a newsgroup of psychometricians, rather than one of statistics teachers? I didn't post the complaints. I sent them to you. AND, I continue to thank you for the response. I admit the original question was a bit of an idle troll, and I got what I deserved. -Love and Regards Donald F. Burrill [EMAIL PROTECTED] 184 Nashua Road, Bedford, NH 03110 603-471-7128
Re: Evaluating students: A Statistical Perspective
On Sat, 24 Nov 2001, L.C. wrote: Thanks for the reply! As for the iid, it's reasonable to believe the questions could be drawn from some population. Why not the answers?

If the questions are selected in accordance with some table of specifications, they are not from _a_ population, but from many; and there is no _a priori_ reason I can think of to suppose that their item characteristics are iid. As for the answers, the usual reason for wanting to evaluate students is precisely because they are (or one hopes they are!) different in their levels of skill (or whatever): the task is to assess these skill levels, and it is nonsense to assume that all the persons are id on the measure on which one hopes to identify differences.

(Hey! I've heard much worse justifications for statistical assumptions! :) At any rate, bell curves do arise often enough in this context to be written about.

Of course, bell curve does not necessarily imply normal distribution. You can get quite nice bell curves from binomial distributions, e.g. Also of course, any real data must be discrete, not continuous, so cannot technically be normally distributed anyway. (It is possible that the distribution may be more or less well approximated by a normal distribution with the same mean and variance, but that's not the same thing.)

As for wanting gaps in the resulting distribution... That was my point. When you do have a bell curve, it shouldn't be satisfying; it should be disturbing.

Depends on how bell-like the curve is. For almost any interesting variable that can be measured on humans, one expects rather a lot of people in the middle, and progressively fewer toward the extremes, of the distribution; doesn't one? (And if not, why not?)

This is the maddening aspect of psychometry - they engineer these nice normal distributions on which to base their diagnoses. You'd think they'd *want* bimodal, discrete, or mixed continuous/discrete distributions, but no. They diagnose by Z scores (thereby defining their own prevalences :) and assert that they are discovering diseases, and not punishing unusual people. Best Regards, -Larry (And they get to testify in court) C.

Hmm. This thread started out as evaluating students, in the context of classes and teacher-made tests, as I recall. Not exactly the same thing as diagnosing (in a quasi-medical sense) or discovering diseases, I shouldn't think. One wonders, then, why you aren't posting these complaints in a newsgroup of psychometricians, rather than one of statistics teachers?

Donald F. Burrill [EMAIL PROTECTED] 184 Nashua Road, Bedford, NH 03110 603-471-7128
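The point that a binomial distribution can give a nice bell curve, well approximated by a normal distribution with the same mean and variance yet still discrete, can be checked directly. The values n = 40 and p = 0.5 below are hypothetical, chosen only for illustration, as are the function names.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


n, p = 40, 0.5
mean, var = n * p, n * p * (1.0 - p)  # same mean and variance as the binomial

# Largest pointwise gap between the discrete probabilities and the normal density.
max_gap = max(
    abs(binomial_pmf(k, n, p) - normal_pdf(k, mean, var)) for k in range(n + 1)
)
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"largest pointwise gap: {max_gap:.5f}")
print(f"binomial probabilities sum to {total:.6f}")
```

The gap is small, so the bell shape is a good approximation here, but the binomial still puts all of its probability on the integers 0 through 40, which is the sense in which it is "not the same thing" as a normal distribution.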