I don't know whether showing p-values is the best approach either, but I'm using them only as indicators of how good the approximation becomes as the sample size increases. You may regard the p-values as a measure of goodness of fit. I don't think I need to address the hypothesis-testing question, as Duncan has already explained it.
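As a rough illustration of using p-values this way, here is a minimal R sketch. It is not the code from the animation: the helper name clt_pvalue(), the exponential parent distribution, the Kolmogorov-Smirnov test and the number of simulated means are all assumptions made for the example. The reference normal distribution uses the theoretical mean 1/rate and standard deviation 1/(rate*sqrt(n)) of the sample mean.

```r
# Hypothetical helper (not from the animation code): KS p-value for the fit of
# simulated sample means to the normal distribution implied by the CLT.
clt_pvalue <- function(n, n_means = 2000, rate = 1) {
  # Each column is one sample of size n from Exp(rate); colMeans gives its mean.
  means <- colMeans(matrix(rexp(n * n_means, rate = rate), nrow = n))
  # Theoretical parameters of the parent distribution, not simulated estimates.
  ks.test(means, "pnorm", mean = 1 / rate, sd = 1 / (rate * sqrt(n)))$p.value
}

set.seed(1)
sample_sizes <- c(1, 5, 30, 200)
p_values <- sapply(sample_sizes, clt_pvalue)
print(data.frame(n = sample_sizes, p.value = signif(p_values, 3)))
```

For n = 1 the means are just the raw exponential draws, so the p-value is essentially zero; for larger n the p-values will typically be larger, although they vary from run to run.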
Yes, you can generate normal random numbers at the same time and compare the p-values, but I prefer comparing the sample means with the theoretical population distribution instead of with simulated normal random numbers. The problem with most CLT demos is that we have no means of observing how good the approximation is. In your clt.examp() there is a graphical measure, i.e. comparing the density curve to the histogram, but that is not sufficient, as our eyes sometimes cannot easily detect differences between curves, e.g. between the t-distribution and the normal distribution. That's why I use numerical measures like p-values.

P.S. I think your code in clt.examp() needs a correction: the parameters of the theoretical normal distribution should not be computed from the *simulated* means and variances, but from the original theoretical distribution. For example, for the mean of n observations from the uniform distribution over (a, b), mean = (a+b)/2 and sd = (b-a)/sqrt(12*n) (although for large sample sizes the two sets of results will be very close).

Regards,
Yihui
--
Yihui Xie <[EMAIL PROTECTED]>
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China

On Thu, Oct 16, 2008 at 11:43 PM, Greg Snow <[EMAIL PROTECTED]> wrote:
> I wonder if including the p-values for the normality test is the best approach in your animation? The CLT does not say that the distribution of the means will be normal, just that it approaches normality (and therefore may be a decent approximation). The normality test can only reject the null that the data (simulated means) come from a normal distribution. Since the true distribution of the means is not normal (unless you use a sample size of Inf, and I for one have better things to do than wait for a computer to simulate several samples of size Inf), the null for the normality test is always false, and therefore the test will always result in either saying it is not normal or a type II error. The real goal is not to show normality, but to show that using the normal gives a "good enough" approximation. I would prefer the bottom plot to show either the proportion of p-values from a normal-based test on the simulated data that are less than alpha, or the proportion of confidence intervals based on the normal-based test that include the true parameter. Then the user can see when those values become a close enough approximation.
>
> What is your target audience for this demo? In my opinion, anyone who could understand the bottom plot should already understand the CLT well enough not to need the demo, and those that I would aim the demo at would just be confused by the current bottom plot.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [EMAIL PROTECTED]
> 801.408.8111
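
The alternative suggested in the quoted message, the proportion of normal-based confidence intervals that include the true parameter, could be sketched in R roughly as follows. This is only an illustration under stated assumptions: the helper name ci_coverage(), the exponential parent distribution, the 95% level and the number of simulations are all invented for the example and do not come from clt.examp() or the animation.

```r
# Hypothetical helper: fraction of normal-based intervals that contain the
# true mean when samples of size n (n >= 2) are drawn from Exp(rate).
ci_coverage <- function(n, n_sims = 2000, rate = 1, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)          # normal critical value
  true_mean <- 1 / rate                    # theoretical mean of Exp(rate)
  x <- matrix(rexp(n * n_sims, rate = rate), nrow = n)  # one sample per column
  centers <- colMeans(x)
  half_widths <- z * apply(x, 2, sd) / sqrt(n)
  # Proportion of intervals centers +/- half_widths that cover the true mean.
  mean(abs(centers - true_mean) <= half_widths)
}

set.seed(1)
sample_sizes <- c(2, 5, 30, 200)
coverage <- sapply(sample_sizes, ci_coverage)
print(data.frame(n = sample_sizes, coverage = coverage))
```

Plotting coverage against n would let the user judge at which sample size the observed proportion is close enough to the nominal level.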