On Dec 17, 2007 3:10 PM, hadley wickham <[EMAIL PROTECTED]> wrote:
> > This has nothing to do really with the question that Troels asked,
> > but the exposition quoted from the AA paper is unnecessarily
> > confusing.
> > The phrase ``Because X0 and X1 have identical marginal
> > distributions ...'' throws the reader off the track.  The identical
> > marginal distributions are irrelevant.  All one needs is that the
> > ***means*** of X0 and X1 be the same, and then the null hypothesis
> > tested by a paired t-test is true and so the p-values are
> > (asymptotically) Uniform[0,1].  With a sample size of 100, the
> > ``asymptotically'' bit can be safely ignored for any ``decent''
> > joint distribution of X0 and X1.  If one further assumes that
> > X0 - X1 is Gaussian (which has nothing to do with X0 and X1 having
> > identical marginal distributions) then ``asymptotically'' turns
> > into ``exactly''.
>
> Another related issue is that uniform distributions don't look very uniform:
>
> hist(runif(100))
> hist(runif(1000))
> hist(runif(10000))
>
> Be sure to calibrate your eyes (and your bin width) before rejecting
> the hypothesis that the distribution is uniform.
>
> Hadley

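As an aside, the quoted claim (equal means are enough for the paired
t-test p-values to be roughly Uniform[0,1] at n=100) is easy to check by
simulation. A rough sketch only: the two distributions, the seed and the
number of replicates below are my own arbitrary choices, not anything
from the AA paper. X0 is exponential and X1 is normal, so the marginals
differ but both means are 1.

```r
set.seed(1)      # arbitrary seed, only so the run is reproducible
n_sims <- 2000   # number of simulated data sets (arbitrary choice)
n <- 100         # sample size, as in the quoted text

# p-value of a paired t-test for one simulated data set
one_p <- function(n) {
  x0 <- rexp(n, rate = 1)            # mean 1, skewed
  x1 <- rnorm(n, mean = 1, sd = 1)   # mean 1, symmetric
  t.test(x0, x1, paired = TRUE)$p.value
}

p <- replicate(n_sims, one_p(n))

# If the p-values are close to uniform, about 5% fall below 0.05
reject_rate <- mean(p < 0.05)
reject_rate
```

If the claim holds, reject_rate should land near 0.05, and the same
vector p can be fed to hist() or to a quantile plot.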
Thanks for the example, Hadley.  To me, this suggests we should stop
teaching histograms in Stat 101 and instead use quantile plots, which
give excellent results for n=100 and even surprisingly good results
for n=10:

# 2 x 2 grid, one panel per sample size
par(mfrow=c(2,2))
for(i in c(10, 100, 1000, 10000)) {
  # sample quantiles of runif(i) against Uniform[0,1] quantiles
  qqplot(runif(i), qunif(seq(1/i, 1, length=i)), main=i,
         xlim=c(0,1), ylim=c(0,1),
         xlab="runif", ylab="Uniform distribution quantiles")
  # reference line y = x
  abline(0,1,col="lightgray")
}

Kevin (drifting even further off topic)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.