On Mon, 22 May 2000 [EMAIL PROTECTED] wrote:

> First up, the purpose I have at hand is to make interpolations for
> percentages of students who have achieved above a certain score on a
> test (where this score may lie between two discrete score points on the 
> scale).

One might ask, pursuing the matter a little further, why you would not 
prefer a continuous approximating distribution (e.g., normal, if that is 
appropriate, as it often is).  The justification would be either that 
the empirical CFs at hand represent a sample drawn from such an idealized 
population, or that the continuous function is an adequate approximation 
to the true population distribution;  after all, the purpose you describe 
clearly is to apply the CF information to some (hypothetical?) set of 
students whose scores are not in fact represented in the data in hand.  
(Of course, the problem you describe below still arises in converting 
from the discrete empirical CF to the (idealized?) continuous function;  
it is much less of a problem if the continuous function is obtained from 
information other than the CFs themselves -- e.g., an approximating 
normal distribution would be derived from the empirical mean and standard 
deviation, not from the empirical CFs.)
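
For what it is worth, here is a minimal sketch of that alternative in 
Python, assuming a normal distribution really is an adequate 
approximation.  The scores and the function name percent_above are 
invented for illustration;  nothing here comes from your data.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical raw scores (illustrative only, not from any real test).
scores = [12, 15, 15, 17, 18, 18, 19, 20, 21, 21, 22, 23, 24, 26, 29]

def percent_above(cut, data):
    """Percent of an approximating normal population scoring above `cut`.

    The normal curve is fitted from the empirical mean and (sample)
    standard deviation, not from the cumulative frequencies.
    """
    dist = NormalDist(mu=mean(data), sigma=stdev(data))
    return 100.0 * (1.0 - dist.cdf(cut))

# A cut score that falls between two discrete score points.
print(round(percent_above(20.5, scores), 1))
```

Note that the cut score need not coincide with any observed score value, 
which is the point of using a continuous function in the first place.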

> It seems to me cumulative frequencies should be plotted at the exact
> upper limit of each interval.  This is the only simple method that
> makes sense to me.

If by "cumulative frequency" ("CF" above) you mean "observed frequency of 
responses less than or equal to this score value", and especially if 
these CFs have been cumulated over a grouped empirical frequency 
distribution, your logic is impeccable.  If you've been cumulating at the 
level of individual score values, there may be room for SOME quibbling.
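
To make the upper-limit convention concrete, here is a rough sketch in 
Python.  The grouped distribution is made up, the function names are my 
own, and the straight-line interpolation assumes that cases are spread 
evenly within each interval, which is an assumption and not a fact about 
any real data.

```python
from itertools import accumulate

# Hypothetical grouped distribution: (lower limit, upper limit, frequency).
intervals = [(0, 10, 5), (10, 20, 15), (20, 30, 20), (30, 40, 10)]

def cf_points(groups):
    """(score, '<=' CF) pairs, each CF anchored at its interval's upper limit."""
    cfs = accumulate(f for _, _, f in groups)
    points = [(groups[0][0], 0)]  # no case lies below the lowest lower limit
    points += [(hi, cf) for (_, hi, _), cf in zip(groups, cfs)]
    return points

def percent_above(cut, groups):
    """Percent of cases above `cut`, by linear interpolation between CF points."""
    pts = cf_points(groups)
    total = pts[-1][1]
    if cut <= pts[0][0]:
        return 100.0
    if cut >= pts[-1][0]:
        return 0.0
    for (x0, c0), (x1, c1) in zip(pts, pts[1:]):
        if x0 <= cut <= x1:
            below = c0 + (c1 - c0) * (cut - x0) / (x1 - x0)
            return 100.0 * (total - below) / total

print(percent_above(25, intervals))
```

Read at an interval's upper limit, this line returns exactly the cases 
that fell up to and including that interval, neither more nor fewer.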

> However, it has been suggested by others in the context I'm dealing
> with that frequencies/percentages can alternatively be plotted at the
> mid-point of each interval, or even at the lower limit!  Although I can 
> understand plotting graphs at the mid-point for ease of representation,
> this hardly seems suited to making interpolations.  This is because
> when you read off the graph at the upper limit of a given interval, you 
> will (probably) have more cases than fell up to and including the
> interval itself.  This is surely absurd, yet people seem to seriously
> believe it is a viable alternative.

First, make sure you're all on the same wavelength.  You clearly are 
thinking in terms of "<=" CFs;  plotting at the lower limit would be 
appropriate for "strictly <" CFs (or equivalently ">=" CFs).  Plotting at 
the midpoint would be reasonable if one took for one's CF the midpoint 
between a "strictly <" CF and a "<=" CF.  If upon examination it turns 
out that your colleagues (?) really think they're dealing with "<=" CFs: 

        You might ask them how they view the two intervals at the extreme 
ends of the CFs.  In terms of relative cumulative percents (C%s), what 
C% values then apply to the upper and lower limits of (1) the lowest 
non-empty score interval;  (2) the highest score interval?  And in 
particular, what C% applies to the upper limit of the highest interval? 
Either of the two alternatives you report implies a C% > 100% here, which 
ought to be absurd enough for anyone with a decent grasp of reality.
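
A small sketch in Python may make the absurdity visible.  The data are 
invented, read_cf is a hypothetical helper of my own, and I am assuming 
that beyond the last plotted point one simply continues the line at its 
final slope;  your colleagues may have some other rule in mind.

```python
# Hypothetical grouped distribution: (lower limit, upper limit, frequency).
intervals = [(0, 10, 5), (10, 20, 15), (20, 30, 20), (30, 40, 10)]

def read_cf(x, groups, anchor):
    """Read a '<=' CF line at score x when each CF is plotted at
    lower + anchor * width (1.0 = upper limit, 0.5 = midpoint, 0.0 = lower).

    Assumption: past the last plotted point, the final segment is extended.
    """
    xs, cfs, running = [], [], 0
    for lo, hi, f in groups:
        running += f
        xs.append(lo + anchor * (hi - lo))
        cfs.append(running)
    # Find the segment containing x, or use the last one if x lies beyond it.
    i = 0
    while i < len(xs) - 2 and x > xs[i + 1]:
        i += 1
    slope = (cfs[i + 1] - cfs[i]) / (xs[i + 1] - xs[i])
    return cfs[i] + slope * (x - xs[i])

total = sum(f for _, _, f in intervals)
for name, anchor in [("upper", 1.0), ("midpoint", 0.5), ("lower", 0.0)]:
    top = read_cf(40, intervals, anchor)   # upper limit of highest interval
    print(name, 100.0 * top / total)
```

With these made-up numbers the C% read at the upper limit of the highest 
interval is 100 for the upper-limit convention, 110 for the midpoint and 
120 for the lower limit.  The same thing happens inside the scale:  at a 
score of 20, where 20 cases actually fell at or below that limit, the 
midpoint line reads 30 and the lower-limit line reads 40.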

        Another approach is to inquire how one would arrange a CF 
downward -- i.e., where the C%s range from 0 at the maximum value to 
100% at the minimum, and the CFs represent the frequency of responses 
greater than or equal to this score value.  
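
One way to put that question concretely, sketched in Python with 
invented data and a hypothetical function name:  cumulate from the top 
interval down and anchor each ">=" CF at the LOWER limit of its interval, 
which is the mirror image of anchoring "<=" CFs at upper limits.

```python
from itertools import accumulate

# Hypothetical grouped distribution: (lower limit, upper limit, frequency).
intervals = [(0, 10, 5), (10, 20, 15), (20, 30, 20), (30, 40, 10)]

def downward_cf(groups):
    """(score, '>=' CF) pairs, cumulated from the highest interval down,
    each CF anchored at its interval's lower limit."""
    cfs = accumulate(f for _, _, f in reversed(groups))
    points = [(groups[-1][1], 0)]  # no case lies above the highest upper limit
    points += [(lo, cf) for (lo, _, _), cf in zip(reversed(groups), cfs)]
    return points

for score, cf in downward_cf(intervals):
    print(score, cf)
```

At every interval boundary the ">=" CF and the "<=" CF then sum to the 
total number of cases, which is the consistency one would want to ask 
the midpoint and lower-limit advocates about.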

> I'm really hoping for a good reference on this (preferably by a highly
> regarded author to make the case stronger :).  Any comments, or refs?

        Sorry, can't help you here, I don't think.  It has not been my 
habit to invoke appeals to the Irrelevant Authorities at Headquarters, 
nor am I much impressed by such appeals.  If the authorities invoked are 
in fact relevant, they have logical arguments on their side, and the 
logical arguments are what one needs, not the name(s) of the authorities. 

        Of course, if you're dealing with folks who DON'T have a decent 
grasp of reality, irrelevant authorities may be a surprisingly effective 
part of one's armamentarium.  In this case, look for any standard 
introductory statistics texts that deal in detail with CFs, which 
probably means texts three decades old or more (your local university 
library should have an adequate assortment), and pick one whose author(s) 
happen to be well-known in the field in which these folks think they 
operate.  (But make sure the authors' logic is correct!)
                                                        -- DFB.
 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128  



