Another approach is to use resampling.  Take your entire sample (2500
subjects) and consider that to be your population.  Justify this by arguing
that your sample is representative of the actual population -- or make your
argument hypothetical ("Assuming the sample is representative of the
population, ...").

From this "population", sample with replacement, creating resamples of size
2500.  For each resample, group the data by category, total the number in
each category, and order the categories by count.  Repeat this many times
(500 or more).  Take your highest category (#6).  In what percentage of the
resamples is this category first (the most common category)?  Do the same
for each of the other top categories and the rank it holds in your table.
If the percentages for the top 15 categories are all above some high
threshold (95%?), then you are in luck!  If they are not, then you've got
to deal with the consequences...
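
The procedure above can be sketched in Python roughly as follows.  This is
only a minimal sketch, not a definitive implementation: the function name
rank_stability, its parameters, and the example counts are my own
assumptions, and ties are broken by the order of the input (a choice the
description above does not specify).  Drawing n subjects with replacement
from the sample is the same as drawing categories with probabilities
proportional to the observed counts, which is what the sketch does.

```python
import random
from collections import Counter


def rank_stability(counts, top_k=15, n_resamples=500, seed=0):
    """For each of the top_k observed categories, return the fraction of
    resamples in which it holds the same rank as in the observed table.

    counts: dict mapping category -> observed count (hypothetical input).
    """
    rng = random.Random(seed)
    categories = list(counts)
    weights = [counts[c] for c in categories]
    n = sum(weights)  # resample size equals the original sample size

    # Observed ranking, most common first (stable sort keeps input order on ties).
    observed = sorted(categories, key=lambda c: -counts[c])[:top_k]
    held = [0] * len(observed)

    for _ in range(n_resamples):
        # Sampling subjects with replacement == weighted draws of categories.
        resample = Counter(rng.choices(categories, weights=weights, k=n))
        ranked = sorted(categories, key=lambda c: -resample[c])
        for i, c in enumerate(observed):
            if ranked[i] == c:
                held[i] += 1

    return {c: held[i] / n_resamples for i, c in enumerate(observed)}


if __name__ == "__main__":
    # Made-up counts for illustration only; they total 2500 subjects.
    example = {"#6": 400, "#2": 380, "#9": 300, "#1": 120, "other": 1300}
    for cat, frac in rank_stability(example, top_k=3).items():
        print(cat, frac)
```

Requiring each category to hold its exact rank is a strict criterion; if
you only care that the top 15 stay in the top 15, you would count membership
in ranked[:15] instead.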

To investigate your sensitivity to sample size, repeat the entire procedure
starting from a hypothetical sample of twice the size (four times the size,
etc.), built by simply multiplying each category count by two (by four,
etc.) so that the proportions stay the same.  Then repeat the resampling
process, compute the percentages [X% of resamples of size 5,000 have
category #6 in 1st place], and see how these percentages increase as the
sample size increases.  This may give you the minimum sample size necessary
to ensure that a given category (say #6) has more than a 95% chance of being
the primary category.
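
A rough Python sketch of this sample-size check follows.  Again, the names
(prob_first, smallest_scale), the default multipliers, and the choice to
count a tie for first place as a failure are my own assumptions, not part
of the procedure as stated.

```python
import random
from collections import Counter


def prob_first(counts, category, scale=1, n_resamples=500, seed=0):
    """Fraction of resamples in which `category` is strictly the most common,
    after multiplying every observed count by `scale`."""
    rng = random.Random(seed)
    categories = list(counts)
    weights = [counts[c] * scale for c in categories]
    n = sum(weights)  # scaled sample size, e.g. 5,000 when scale=2 and n was 2500
    wins = 0
    for _ in range(n_resamples):
        resample = Counter(rng.choices(categories, weights=weights, k=n))
        top = resample[category]
        # Ties for first place are counted as failures (an assumption).
        if all(top > resample[c] for c in categories if c != category):
            wins += 1
    return wins / n_resamples


def smallest_scale(counts, category, target=0.95, scales=(1, 2, 4, 8), **kwargs):
    """Return the first multiplier in `scales` at which `category` is first in
    at least `target` of the resamples, or None if none of them reaches it."""
    for s in scales:
        if prob_first(counts, category, scale=s, **kwargs) >= target:
            return s
    return None
```

If smallest_scale returns, say, 4 for a sample of 2500, that suggests a
sample of about 10,000 would be needed, always assuming the observed
proportions are close to the population proportions.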
Milo
==========================================================
Mike Stiso wrote in message <8ggn1m$32j$[EMAIL PROTECTED]>...
> <SNIP>  ...it's important to my project to ensure the
>accuracy of the ordering within the table for the top fifth of the
>categories -- in other words, I need to be reasonably certain (p = .01?)
>that the top, say, 15 categories in my table actually are the 15 most
>common categories occurring within the population, and also (less
>importantly) that their relative frequencies correspond to the ranking
>indicated by the table.
>Is such a determination possible? Or, looking at the question another way,
>is it possible to determine the optimal sample size for achieving a stable
>frequency table? And if so, can you point me in the right direction?
>Thanks very much for any help,
>Mike
