At 12:49 PM 11/21/01 -0500, Ronny Richardson wrote:
>As I understand it, the Central Limit Theorem (CLT) guarantees that the
>distribution of sample means is normally distributed regardless of the
>distribution of the underlying data as long as the sample size is large
>enough and the population standard deviation is known.
nope ... the CLT says nothing of the kind

it says that regardless of the shape of the target population ... as n increases, the shape of the sampling distribution of means is better and better APPROXIMATED by the normal distribution

that is, even if the target population is quite different from normal ... if we take decent sized samples ... we can say, and not be TOO wrong, that the sampling distribution of means looks something like a normal

here is a quick simulation taking samples of n=50 (based on 10000 samples) from a chi square distribution with 1 df

[character dotplot of C51, the 10000 sample means; axis ticks at 0.30, 0.60, 0.90, 1.20, 1.50, 1.80]

even though the chi square distribution is radically + skewed, the sampling distribution looks pretty darn close to a normal distribution ... but it never will be exactly one

the CLT does NOT say that the sampling distribution will GET to and BECOME a normal distribution ... if the population is not normal, the sampling distribution will not be exactly normal no matter how large n is ... but it could be that your EYES could not tell the difference

>It seems to me that most statistics books I see overoptimistically invoke
>the CLT not when n is over 30 and the population standard deviation is
>known but anytime n is over 30. This seems inappropriate to me or am I
>overlooking something?

you are mixing two metaphors ...

if we know the sd of the population ... then we know the real sampling error, ie, the standard error of the mean

if we do NOT know the population sd, and substitute our estimate of it from the sample, then we are only estimating the standard error of the mean

thus ... knowing or not knowing the population sd decides whether we know, or only estimate, the real standard error
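For anyone who wants to rerun the simulation described above (samples of n=50, 10000 samples, chi square with 1 df), here is a rough Python sketch. It is not the original session, and the names `mean_and_sd`, `skewness`, `summarize`, and the `seed` argument are made up for illustration. It relies on three standard facts about a chi square distribution with df degrees of freedom: its mean is df, its sd is sqrt(2*df), and its skewness is sqrt(8/df), so the skewness of the mean of n draws is sqrt(8/df)/sqrt(n). The output puts the known standard error next to the range of standard errors estimated from individual samples, and shows the skewness of the sample means shrinking as n grows without reaching 0.

```python
import math
import random


def mean_and_sd(xs):
    """Return the mean and the (n-1)-denominator standard deviation of xs."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, math.sqrt(var)


def skewness(xs):
    """Moment-based skewness; 0 for a perfectly symmetric set of values."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum(((x - m) / s) ** 3 for x in xs) / n


def summarize(n, reps=10_000, df=1, seed=0):
    """Simulate `reps` samples of size n from chi square(df) and summarize the means."""
    rng = random.Random(seed)
    means, est_ses = [], []
    for _ in range(reps):
        # chi square(df) is a gamma distribution with shape df/2 and scale 2
        sample = [rng.gammavariate(df / 2, 2.0) for _ in range(n)]
        m, s = mean_and_sd(sample)
        means.append(m)
        est_ses.append(s / math.sqrt(n))  # sample sd substituted for the population sd
    mean_of_means, sd_of_means = mean_and_sd(means)
    return {
        "n": n,
        "mean_of_means": mean_of_means,
        "sd_of_means": sd_of_means,
        "true_se": math.sqrt(2 * df) / math.sqrt(n),  # population sd known
        "est_se_min": min(est_ses),
        "est_se_max": max(est_ses),
        "skew_of_means": skewness(means),
        "theoretical_skew": math.sqrt(8 / df) / math.sqrt(n),
    }


if __name__ == "__main__":
    for n in (5, 50, 200):
        print(summarize(n))
```

The estimated standard errors vary from sample to sample while the true one is a single number; that is a separate matter from the skewness column, which depends only on the population shape and n.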
whether we know the population sd, though, is unconnected with the shape of the sampling distribution ... that shape is a function of both the shape of the population AND the random sample size

>When the population standard deviation is not known (which is almost all the
>time) it seems to me that the Student t (t) distribution is more
>appropriate. However, t requires that the underlying data be normal, or at
>least not too non-normal. My expectation is that most data sets are not
>nearly "normal enough" to make using t appropriate.
>
>So, if we do not know the population standard deviation and we cannot
>assume a normal population, what should we be doing, as opposed to just
>using the CLT as most business statistics books do?
>
>Ronny Richardson
>
>=================================================================
>Instructions for joining and leaving this list and remarks about
>the problem of INAPPROPRIATE MESSAGES are available at
>                  http://jse.stat.ncsu.edu/
>=================================================================

_________________________________________________________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm