Before you decide how the test should be set up, and what should be on
it, you _must_ decide what you want the test result to do.  That should
sound like a familiar refrain to people who use statistics to answer
questions and solve problems.  There are at least two options:

1)    The test has a 'standards' focus, in which everyone must reach a
certain score to advance, or graduate, or be considered 'nice' people.
Whatever.  In this case, we should set the questions to evaluate the
student's ability to understand and manipulate a core set of material,
which we have decided is the minimum necessary to survive at the next
level.  In the case of the MCAS tests, students must pass to graduate
from high school.  Asking questions at the high end of the expected
learning level (don't ask me for an operational definition of that - I
don't make these terms up:) is a waste of time - we only need to know
whether the student understands the core material.  It is very likely
that many students will breeze through such a test, suggesting that the
core material is no challenge for them.  Of course not!  The test seeks
to discover who barely makes it through HS, and who does not.  BTW, the
ISO 9000:2000 quality standards for a company fall into this category
of 'test.'

2)    The test seeks to assess the degree of learning achieved by the
students - what is sometimes called an achievement, or excellence,
focus.  Now you _want_ some questions that just about everyone gets
correct, and some that no one gets correct.  You have to, in order to
place everyone on the scoring scale.  When some students run off near
the top, or near the bottom, you have to change your exam to include
the extremes.  If you don't, then you cannot differentiate between the
students, which was your original goal.  (The sketch after this list
illustrates the ceiling effect you get otherwise.)  You don't much care
what the statewide curriculum is, except as a norm for most of the
questions.  You must adjust your questions to encompass the range of
student capabilities, not the expected subject matter.  BTW, the
Baldrige criteria for assessing a company fall into this category.  As
an assessor for the Wisconsin Forward Award (baby Baldrige), I myself
have not seen a company score extremely high on the scale, although
some firms have done so.  Thus, the scoring scale still extends beyond
the extremes of corporate performance.
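
To make the ceiling effect concrete, here is a rough sketch in Python.
The Rasch-style item model, the difficulty values, and the ability
distribution are all illustrative assumptions of mine, not anyone's
real test data:

    import math
    import random
    from statistics import stdev

    random.seed(0)

    def p_correct(ability, difficulty):
        # Rasch-style item response curve (an illustrative assumption).
        return 1 / (1 + math.exp(-(ability - difficulty)))

    def sitting(ability, difficulties):
        # One simulated test sitting: count of items answered correctly.
        return sum(random.random() < p_correct(ability, d)
                   for d in difficulties)

    easy_test = [-2.0] * 20                             # every item easy
    spread_test = [-2 + 4 * i / 19 for i in range(20)]  # easy through hard

    # The 100 strongest of 1000 simulated students -- the ones who
    # breeze through a standards-focused test.
    top = sorted(random.gauss(0, 1) for _ in range(1000))[-100:]

    for name, items in (("easy", easy_test), ("spread", spread_test)):
        sd = stdev(sitting(a, items) for a in top)
        print(f"{name:6s} test: SD of scores among top students = {sd:.2f}")

The all-easy test squeezes the strongest students into a point or two
of each other; the test with hard items still tells them apart.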

Your discussion mentions both of these, but they are neither
interchangeable nor mergeable.

For HS graduation, I suppose one would, for political reasons, want to
set the passing grade so that most students graduate.  However, when
done properly, the minimum 'learning' expectations for post-HS success
should be developed first, followed by tests of that level of learning.
Recommended curriculum content might fit in there, too, before the
tests.  If the expectations are valid (supportable independently of the
scores obtained), then the percentage who pass should be a dependent
variable, following from the test development.

In Florida, they didn't do the hard expectation development, and so
wound up with a committee debating what score should count as
'passing,' while for each proposed cut-off score the technical people
told them what percentage of students who took the exam the previous
year would have failed.  The ultimate decision was political,
completely independent of what is or is not necessary to survive past
HS.
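
That exercise is easy to sketch in Python.  The score distribution and
the candidate cut-offs below are invented for illustration - they are
not Florida's actual numbers:

    import random

    random.seed(1)
    # Stand-in for last year's exam scores on a 0-100 scale (invented).
    scores = [min(100, max(0, random.gauss(62, 12)))
              for _ in range(10_000)]

    for cut in (50, 55, 60, 65, 70):
        pct_fail = 100 * sum(s < cut for s in scores) / len(scores)
        print(f"proposed cut-off {cut}: {pct_fail:5.1f}% would have failed")

Every proposed cut-off maps directly to a failure percentage, which is
exactly why the committee debate became political.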

Then you said, "If I'm going to deny graduation to some individual on
the basis of this test, I'd sure like a test that clearly demonstrates
the kid is clueless before going into court.  A test consisting of very
simple items would accomplish this."

This is the major part of the issue.  A standards test for HS
graduation should aim to detect those suited for graduation, and those
not.  'Clueless' comes to mind, although there were some who would have
given me that title when I was there.  In one state (VA, I believe),
analysis found that among students who were tested multiple times over
the year in different topics, some 10% could very well have been
mislabeled at least once - as having failed when they deserved to pass,
or as having passed when they deserved to fail.  (Alpha risk and beta
risk; we've heard this stuff before.)  It's in the nature of
measurement, whenever the measure includes measurement error.  We've
heard about that, too.  BTW, passing instead of failing in this case
denied needy students some extra instructional help.
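
A quick simulation shows how measurement error alone produces that kind
of mislabeling.  This is a sketch, not the VA analysis; the cut score,
error SD, and score distribution are invented.  Which direction counts
as the alpha risk and which as the beta depends on which hypothesis you
take as the null:

    import random

    random.seed(2)
    CUT = 60.0    # hypothetical passing score
    SEM = 5.0     # hypothetical standard error of measurement
    N = 100_000

    flips = 0
    for _ in range(N):
        true = random.gauss(62, 12)         # the student's 'true' level
        seen = true + random.gauss(0, SEM)  # what one sitting reports
        if (true >= CUT) != (seen >= CUT):  # label disagrees with truth
            flips += 1

    print(f"{100 * flips / N:.1f}% mislabeled on a single sitting")

Retest the same students a few times over the year and the chance of at
least one flip per student climbs accordingly.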

'Clearly demonstrates' is not part of these tests, unless they are
expanded way beyond rationality.  Interpreting the test results as an
absolute measure of anything is fraud at worst, and demonstrated
ignorance on the legislators' part at best.  You can count up the
politics of it as you see fit.  It remains technically invalid.

Jay

"Olsen, Chris" wrote:

> Hello Dennis and All --
>
>   Please pardon the formatting of my response here -- apparently I cannot
> choose a different font, so I will bracket my comments by "-->" and "<---."
>
> Dennis writes...
>
>   <Snip>
>
> Since I was not the person posting the original item on this matter, I
> do not in fact know what the real purpose was for this particular
> item, and I do not know what the Mass. objectives are, or what
> material presented in typical classes then finds its way onto the
> assessment test.
>
> I would say, however, that IF the test included only 6 items related
> to statistics, out of the larger test, then the issue of whether in a
> boxplot the vertical bar...
>
>   <Snip>
>
>   ... is the mean or median is trivial, no matter what state the test
> is for.
>
> --->   I certainly would not argue that the question is anything but
> trivial, but that does not necessarily mean it is not a legitimate
> question for the purposes intended.  The "worth" of an individual
> question should be judged by its contribution to the overall
> objectives of the test itself.  One might argue, for instance, that
> the 6 probability and statistics questions (if such there be) should
> span a range of difficulties from relatively easy to relatively
> difficult, in order to get a relatively global picture of what a
> student knows, in order to inform future instruction.  Suppose (as
> certainly seems reasonable) that this is a very easy question, and
> almost nobody gets it right.  One hopes that such a circumstance would
> inform future instruction.  On the other hand, suppose almost
> everybody gets it right.  If nothing else, one might redefine the
> levels of difficulty of questions for next year's test.  Or, one might
> not, depending on the politics...
>
>    There is another variable in play here, that being the politics
> involved.  I am not familiar with the MCAS test, nor, for that matter,
> with the curriculum in Massachusetts.  However, I can imagine a public
> uproar if a high-stakes test delivered low percentages of correct
> scores for a significant part of the population, unless there was a
> very low cut score for passing.  (I'm remembering that North Carolina
> went that route, to the amusement of many.)  The untrained public
> might be perceived by the testers as having little truck with an
> argument that low percentages were OK because the test was relatively
> difficult.  Thus, it is at least conceivable that trivial questions
> would be encouraged in this sort of exam.  Do I think this would be
> good assessment practice?  No.  But I don't think good assessment
> practice is driving these tests -- politics, in the form of
> legislatures, is.
>
> <------------
>
> IF indeed there is an item on the test related to a boxplot, then it
> should be about interpreting data using the boxplot, not about some
> ditzy little tidbit like whether the line in the center part, or the +
> sign above, is the mean or the median...
>
> ----->  Cannot disagree there, but it would depend on the purpose, use, and
> interpretation of the test.  If I'm going to deny graduation to some
> individual on the basis of this test, I'd sure like a test that clearly
> demonstrates the kid is clueless before going into court.  A test consisting
> of very simple items would accomplish this.
> <-------
>
>   -- Chris
>
> Chris Olsen
> George Washington High School
> 2205 Forest Drive SE
> Cedar Rapids, IA
>
> (319)-398-2161
>

--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?
