On Wed, 14 Mar 2001, Scott wrote:

> I am uncertain about the solution to the problem for which I am trying
> to solve.  I am hoping that someone might help guide me to the correct
> solution. 

First you'll have to be rather clearer about what the problem is.
Comments embedded below:

> I have a discrete distribution (say 100) 

By this I'd expect you to mean that you have a variable that can take on 
any of about 100 different values.  Are these values categories, or are 
they numerical values (that might be taken, e.g., to reflect an 
underlying continuous variable, and might reasonably be considered to be 
interval scale)?  [Or did you mean something else entirely by this 
statement?]

> with a dichotomous population (either good or bad). 

And by this you presumably refer to a dichotomous response variable. 
How, if at all, is this dichotomy related to the earlier "discrete 
distribution"?

> There is no knowledge of the population split until after sampling. 

Which means that (a) whether an observation is "good" or "bad" cannot be 
determined in advance, and may mean further that (b) the proportion of 
"good" observations in the population of interest (which is as yet 
undefined) cannot even be guessed at in advance.

> Since the testing of each sample cost time and money, only one sample 
> set will be taken.  I want to find the smallest sample size that
> has a 90% probability of being indicative of the entire population.

Umm.  Strictly speaking, ANY sample can be said to be "indicative of the 
population" from which it is drawn.  You must intend to refer to some 
characteristic(s?) of the population in question:  perhaps the proportion 
of "good" cases in the population, for example.  Until you specify what 
you want to "indicate", the problem remains undefined.  Further, you need 
to specify some degree of uncertainty (or "noise") in order to be able to 
determine a sample size.  (Think of a result of a survey:  "With 90% 
confidence, the proportion of the population who [have some attribute] is 
within 3 percentage points of 42%."  You appear to have specified the 
90%, but not the uncertainty implicit in "3 percentage points".)
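
To make the survey illustration concrete, here is a minimal sketch in
Python of the usual normal-approximation sample-size calculation for a
proportion.  It assumes that the quantity to be "indicated" is the
population proportion of "good" cases, that the population is large
relative to the sample, and that the margin of error (the "3 percentage
points") has been chosen; none of these has actually been stated in the
original question.  The function name and its parameters are
hypothetical, introduced only for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(confidence, margin, p_guess=0.5):
    """Smallest n for which a normal-approximation interval for a
    proportion has half-width <= margin at the given confidence level.

    confidence : two-sided confidence level, e.g. 0.90
    margin     : desired half-width as a proportion, e.g. 0.03
    p_guess    : advance guess at the proportion; 0.5 is the most
                 conservative choice when nothing is known (assumption)
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("margin must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z * z * p_guess * (1 - p_guess) / margin ** 2)


if __name__ == "__main__":
    # 90% confidence, margin of 3 percentage points, no advance guess.
    print(sample_size_for_proportion(0.90, 0.03))
    # Same, but guessing in advance that the proportion is near 42%.
    print(sample_size_for_proportion(0.90, 0.03, p_guess=0.42))
```

Note that the calculation cannot be carried out at all until the margin
is supplied, which is the point of the remark above: the 90% alone does
not determine a sample size.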

> The distribution seems like it would be a hypergeometric distribution

The distribution of what?  Your "discrete distribution"?  The 
distribution of the "good/bad" dichotomy?  The sampling distribution of 
the statistic you intend to use (which, by the way, you have not 
specified either)?

> (since I do not know the sample size, I do not know if a binomial
> distribution would be appropriate in this case). 

I have difficulty imagining circumstances in which the _sample_size_ can
possibly determine a distribution, except insofar as it may be germane to
the credibility of a convenient approximation to a sampling distribution
of interest. 
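
As an illustration of that last clause, the sketch below (Python)
compares the hypergeometric distribution, which describes the number of
"good" items when sampling without replacement from a finite population,
with the binomial distribution commonly used to approximate it.  The
population size of 100 and the count of 40 "good" items are hypothetical
figures chosen only for the example, as are the function names.  The
sample size does not change which distribution applies; it affects only
how closely the binomial approximation tracks the hypergeometric.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(exactly k good items) in a sample of n drawn without
    replacement from N items of which K are good."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binom_pmf(k, n, p):
    """P(exactly k successes) in n independent trials, each with
    success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def max_pmf_gap(N, K, n):
    """Largest absolute difference between the hypergeometric pmf and
    its binomial approximation (with p = K/N) over k = 0..n."""
    p = K / N
    return max(
        abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, p))
        for k in range(n + 1)
    )


if __name__ == "__main__":
    # Hypothetical population: N = 100 items, K = 40 of them good.
    for n in (5, 20, 50):
        print(n, round(max_pmf_gap(100, 40, n), 4))
```

With these figures the gap grows as the sample becomes a larger fraction
of the population, and shrinks if the population is made much larger
while the sample size is held fixed.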

> The information I have read on confidence intervals does not seem to be
> directly applicable to hypergeometric distributions.  The method of
> maximum likelihood seems like a method to relate the sample population 
> to the entire population, but it does not provide for determining the 
> most efficient sample size and it does not indicate the accuracy of the 
> sample distribution as compared to the entire population distribution. 

I may be utterly missing the point of your problem;  but "accuracy" is 
not a word I'd use to describe a sample.  A statistic, yes (although I 
might prefer "precision" to "accuracy").  In any case, I do not see what 
you want to mean by "the accuracy of the sample distribution", or "the 
accuracy of the sample distribution as compared to the entire population 
distribution".  

> If you can help me solve this problem, I would greatly appreciate it.

Without more help from you in _defining_ the problem, I doubt whether 
anyone can help you _solve_ it.

 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128  


