Replying to "Robert J. MacG. Dawson" <[EMAIL PROTECTED]>, I wrote:

It may be necessary to look into the way that the various states have
set up their testing programs in order to appreciate the value of
Rogosa's exercise.

Here are some answers to questions you raise, which seem to me to be
clear from the article.

>The  obvious question is: what are the actual standards for promotion?

Getting the mandated grade on the test. That's what the state's
"standards of learning" law requires.

>What does "deserving" it mean?

"Deserving" promotion means that the student's actual degree of
knowledge ("true score" in the jargon of testers) is above the mandated
level set by the state. That is a bit tricky to explain, which is why
Rogosa looked at the so-called reliabilities of these tests, as published
by the test publishers. Presumably reliability reflects the degree to which
a student with a fixed level of knowledge would fail to get exactly the
same score on the test (or a parallel form of it) if he or she took it
again.
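
One standard way to make that relation concrete is classical test theory,
where the standard error of measurement is roughly SD * sqrt(1 - reliability).
A student whose true score sits just above the cutoff can then still fail on
any single sitting. Here is a minimal sketch; every number is invented for
illustration (these are not Harcourt's figures):

    # Classical test theory: how a published reliability translates into
    # score noise for one student.  All numbers are hypothetical.
    from math import sqrt
    from statistics import NormalDist

    reliability = 0.90    # hypothetical published reliability coefficient
    sd_scores = 40.0      # hypothetical SD of the score scale
    cutoff = 600.0        # hypothetical mandated passing score
    true_score = 610.0    # a "deserving" student, just above the cutoff

    # Spread of observed scores around a fixed true score.
    sem = sd_scores * sqrt(1 - reliability)   # about 12.6 points here

    # Chance this deserving student nevertheless scores below the cutoff
    # on a single administration (assuming normally distributed errors).
    p_fail = NormalDist(mu=true_score, sigma=sem).cdf(cutoff)
    print(f"SEM = {sem:.1f}, P(fail | true score {true_score:.0f}) = {p_fail:.2f}")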

>The article _suggests_ that the tests have no objective pass mark but
>are set up so that, no matter how well students do overall, 30% will be
>held back. Is this true (it should not be in a well-run system!), or are
>the percentiles referred to just the percentiles currently corresponding
>to pass marks that have some objective justification?

The example is obviously there for illustrative purposes. "Percentiles",
in the standardized testing business, do not refer to the collection of
people who take the test this year in a particular school or state, but
to scores in a standard reference distribution. Thus the administrators
(or worse, the legislators) have set a fixed score as the mandated level,
and it is conceivable that everyone (or no one) in California will exceed
it, even though the publisher of the test describes it as the 30th
percentile mark.
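
A quick way to see the distinction (again with made-up numbers): the cutoff
is fixed against the publisher's norming distribution, so this year's
failure rate can come out well above or below 30%.

    # A cutoff pegged at the 30th percentile of the publisher's norming
    # sample does not force 30% of this year's students below it.
    # Hypothetical numbers throughout.
    from statistics import NormalDist

    norm_dist = NormalDist(mu=600, sigma=40)  # publisher's reference distribution
    cutoff = norm_dist.inv_cdf(0.30)          # the fixed "30th percentile" score

    this_year = NormalDist(mu=630, sigma=40)  # suppose this year's students score higher
    fail_rate = this_year.cdf(cutoff)         # fraction below the fixed cutoff
    print(f"cutoff = {cutoff:.0f}, this year's failure rate = {fail_rate:.0%}")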

>Whether the position of the pass mark relative to an "ideal pass mark"
>is appropriate given the magnitude of the random component of a
>student's score can only be determined by looking at actual performance.

It is obviously necessary to look at an individual student's performance
on more than one test in order to have a fair assessment of his or her
true score. That is what Rogosa, for one, considers to be a flaw in the
state's educational policy. But it is not necessary to have repeated
observations on all students in order to evaluate "the magnitude of the
random component". Sample data is sufficient for that, and that is what
the publishers of the tests claim to have in their possession. This is
what the article is referring to when it says, "He has simply converted
technical reliability information from test publishers (Harcourt
Educational Measurement, in this case) to more understandable 'accuracy'
guides."

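In other words, the publisher's reliability coefficient already pins down
how noisy a single score is; Rogosa's exercise just recasts it as something
like the chance of a wrong pass/fail call for students at various true-score
levels. A rough sketch of that conversion, under the same invented numbers
and normal-error assumption as above:

    # Recasting a reliability coefficient as the "accuracy" of a single
    # pass/fail decision, for students at several hypothetical true scores.
    from math import sqrt
    from statistics import NormalDist

    reliability, sd_scores, cutoff = 0.90, 40.0, 600.0   # hypothetical figures
    sem = sd_scores * sqrt(1 - reliability)               # standard error of measurement

    for true_score in (580, 595, 600, 605, 620):
        p_pass = 1 - NormalDist(mu=true_score, sigma=sem).cdf(cutoff)
        deserving = "deserves promotion" if true_score >= cutoff else "does not"
        print(f"true score {true_score} ({deserving}): P(pass on one test) = {p_pass:.2f}")
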
Rogosa is trying to point out the social implications of using
single-test criteria for graduation or promotion, when it is agreed that
the tests have less than perfect accuracy.

--
  *************************************************
 `o^o' * Neil W. Henry ([EMAIL PROTECTED])  *
 -<:>- * Virginia Commonwealth University *
 _/ \_ * Richmond VA 23284-2014  *
  *(804)828-1301 x124  (math sciences, 2037c Oliver) *
  *FAX: 828-8785   http://saturn.vcu.edu/~nhenry  *
  *************************************************



