On Tue, Dec 10, 2013 at 7:33 PM, Rolf Turner <r.tur...@auckland.ac.nz> wrote:
> See inline below.
>
> On 12/11/13 11:28, Bert Gunter wrote:
>
>> This is not really an R question -- it is statistics.
>> In any case, you should do better posting this on the
>> R-Sig-Mixed-Models list, which concerns itself with matters like this.
>>
>> However, I'll hazard a guess at an answer: maybe. (Vague questions
>> elicit vague answers).
>
> No! Nay! Never! Well, hardly ever. The ***y*** values will rarely be
> Gaussian. (Think about a simple one-way anova, with 3 levels, and
> N(0,sigma^2) errors. The y values will have a distribution which is
> a mixture of 3 independent Gaussian distributions.)

Actually, the ***y*** values are semi-Gaussian much more often than you
might think. (Key word here is "semi-Gaussian".) I (like you) find this
somewhat surprising, but this is what I observe many times when making a
QQ plot of all the data. Even for split-split-plot designs with multiple
levels of treatments.

Why should this be the case? Think about it this way: If the differences
in the treatment means were big, you probably wouldn't even be doing a
statistical analysis, because the intra-ocular test would have noticed
the difference and you would be done. So, a priori, the differences in
means are modest or even small, and a mixture of Normals with small
differences in the means (relative to the standard deviations) is not
too unlike a Normal.

Because of this, I rarely find QQ plots of the whole data to be very
useful, whereas QQ plots of the data for each treatment level can be
more informative.

It may depend on the type of data you have -- your experience may be
different.

Kevin
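
As a rough numerical check of the mixture argument above, here is a minimal sketch in Python (standard library only) rather than R. Everything in it is an assumption made for illustration and is not from the thread: the function names `qq_correlation` and `simulate_one_way`, the three group means at -delta, 0 and +delta, sigma = 1, 200 observations per group, and the fixed seed. It pools the y values from a simulated one-way layout and computes the correlation of the points on a Normal QQ plot; a value near 1 means the pooled data would look close to a straight line.

```python
import random
from statistics import NormalDist, fmean


def qq_correlation(values):
    """Correlation between sorted values and standard Normal quantiles.

    This is the correlation of the points on a Normal QQ plot; values
    close to 1 mean the plot is close to a straight line.
    """
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n for i = 0..n-1, one common convention.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, my = fmean(theoretical), fmean(ordered)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, ordered))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / (sxx * syy) ** 0.5


def simulate_one_way(delta, n_per_group=200, sigma=1.0, seed=1):
    """Pooled y values from 3 groups with means -delta, 0, +delta (assumed design)."""
    rng = random.Random(seed)
    return [
        rng.gauss(mean, sigma)
        for mean in (-delta, 0.0, delta)
        for _ in range(n_per_group)
    ]


# delta is the spacing between group means, in units of sigma.
for delta in (0.0, 0.5, 1.0, 3.0):
    r = qq_correlation(simulate_one_way(delta))
    print(f"delta = {delta:>3}: QQ correlation = {r:.4f}")
```

Under these assumptions, the QQ correlation should stay very close to 1 while delta is small relative to sigma and fall away once the group means are several standard deviations apart. In R the analogous visual check would be `qqnorm()` on the pooled response versus on each treatment level separately.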