In article <[EMAIL PROTECTED]>,
James Ankeny <[EMAIL PROTECTED]> wrote:
> Hello, 
>   I am currently taking a first course in statistics, and I was hoping that
>perhaps someone might be kind enough to answer a question for me. I
>understand that, while a quantitative variable may not be normally
>distributed, we may calculate the mean of the sample, and use facts about
>the Central Limit Theorem, to form a 95% confidence interval for the
>population mean. As far as I know, this means that in 95/100 samples, the
>interval will contain the true population mean. This seems very useful at
>first, but then something begins to confuse me. Yes, we have an interval
>that may contain the true population mean, but ... if the distribution is
>heavily skewed to the right, say like income, why do we want an interval for
>the population mean, when we are taught that the median is a better measure
>of central tendency for skewed distributions?

I suggest that the idea of "central tendency" be removed from
consideration.  It is not even reasonable, and it is easy to
give examples of this.  One can construct interval estimates
for any parameter; if one is interested in the mean, the median
is inappropriate, and if one is interested in the median, the
mean is inappropriate.  It is only in parametric models that
the sample mean can be used for inference about the median, or
the sample median for inference about the mean; so if the model
is normal, use the sample mean in any case, and if it is double
exponential, use the sample median in any case.  Both of these 
distributions are symmetric.  I do not know of any skewed 
parametric family of distributions for which the median is
a good estimator.
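
To make the point about interval estimates concrete, here is a minimal
sketch in Python (standard library only).  The function names mean_ci
and median_ci and the simulated lognormal "income" data are hypothetical
and chosen only for illustration.  The mean interval uses the usual
normal approximation; the median interval uses order statistics with a
normal approximation to the binomial, so both are approximate.

```python
import math
import random
import statistics


def mean_ci(xs, z=1.96):
    """Approximate 95% interval for the population mean (normal approximation)."""
    n = len(xs)
    m = statistics.fmean(xs)
    se = statistics.stdev(xs) / math.sqrt(n)
    return m - z * se, m + z * se


def median_ci(xs, z=1.96):
    """Approximate 95% distribution-free interval for the population median.

    The endpoints are order statistics; their ranks come from a normal
    approximation to the Binomial(n, 1/2) count of observations below
    the median.
    """
    s = sorted(xs)
    n = len(s)
    half_width = z * math.sqrt(n) / 2
    lo = max(math.floor(n / 2 - half_width) - 1, 0)   # 0-based index
    hi = min(math.ceil(n / 2 + half_width), n - 1)    # 0-based index
    return s[lo], s[hi]


# Hypothetical right-skewed data: lognormal, so mean and median differ.
rng = random.Random(0)
incomes = [rng.lognormvariate(10.0, 1.0) for _ in range(2000)]

print("interval for the mean:  ", mean_ci(incomes))
print("interval for the median:", median_ci(incomes))
```

The two intervals answer different questions.  For data this skewed they
do not even overlap, so neither can stand in for the other.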

>This is what confuses me. I
>hope that I have phrased my question in such a way that people can
>understand what I am saying, and why I am confused. There is just one more
>thing I would like to get off my chest. My textbook talks about simple
>random sampling, where you can specify the probability of a sample being
>selected from the population. Yet, there are examples in the book which deal
>with conceptual populations, such as the set of all cars of a particular
>model which may be manufactured in the future. Suppose you have a sample of
>several of these autos, and you want to find a 95% confidence interval for
>mean miles/gallon. How is this an SRS when you can't specify the probability
>of a sample being selected, because the population is conceptual? Perhaps I
>am simply looking at everything the wrong way, but this is very confusing to
>me. Any help would be greatly appreciated. 



As to your other problem, almost all samples come from
conceptual populations.  Random sampling with equally likely
outcomes is the exception, and should not be used as the way
to introduce probability.  It leads to students forcing this
model when it is totally inappropriate.
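
One way to see what a confidence statement means for a conceptual
population is to treat the population as a probability model and each
observation as an independent draw from it.  The sketch below (Python,
standard library only) does this by simulation.  The exponential model,
the sample size, and the names clt_interval and coverage are all
assumptions made for illustration, not anything taken from a textbook
example.

```python
import math
import random
import statistics


def clt_interval(xs, z=1.96):
    """Nominal 95% interval for the mean, from the normal approximation."""
    m = statistics.fmean(xs)
    se = statistics.stdev(xs) / math.sqrt(len(xs))
    return m - z * se, m + z * se


def coverage(draw, true_mean, n, reps, seed=0):
    """Fraction of simulated samples whose interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        lo, hi = clt_interval(sample)
        hits += lo <= true_mean <= hi
    return hits / reps


# Hypothetical skewed model: exponential with rate 1, whose mean is exactly 1.
cov = coverage(lambda rng: rng.expovariate(1.0), true_mean=1.0, n=50, reps=2000)
print(f"observed coverage with n=50: {cov:.3f}")
```

No finite list of equally likely samples appears anywhere; the "95%"
is a long-run property of the procedure under the assumed model.  For
skewed models and moderate sample sizes the observed coverage tends to
fall somewhat short of the nominal value.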




-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558


