Re: [R] data distribution for lme

Kevin Wright Wed, 11 Dec 2013 08:20:10 -0800

On Tue, Dec 10, 2013 at 7:33 PM, Rolf Turner <r.tur...@auckland.ac.nz> wrote:

>
> See inline below.
>
>
> On 12/11/13 11:28, Bert Gunter wrote:
>
>> This is not really an R question -- it is statistics.
>> In any case, you should do better posting this on the
>> R-Sig-Mixed-Models list, which concerns itself with matters like this.
>>
>> However, I'll hazard a guess at an answer: maybe.  (Vague questions
>> elicit vague answers).
>>
>
> No! Nay! Never!  Well, hardly ever.  The ***y*** values will rarely be
> Gaussian.  (Think about a simple one-way anova, with 3 levels, and
> N(0,sigma^2) errors.  The y values will have a distribution which is a
> mixture of 3 independent Gaussian distributions.)
>

Actually, the ***y*** values are semi-Gaussian much more often than you
might think.  (The key word here is "semi-Gaussian".)  Like you, I find this
somewhat surprising, but it is what I often see when I make a QQ plot of
all the data, even for split-split-plot designs with multiple levels of
treatments.  Why should this be the case?  Think about it this way: if the
differences in the treatment means were big, you probably wouldn't even be
doing a statistical analysis, because the intra-ocular test (just looking
at the data) would have shown the difference and you would be done.  So, a
priori, the differences in means are modest or even small, and a mixture of
Normals with small differences in the means (relative to the standard
deviations) is not too unlike a Normal.  Because of this, I rarely find QQ
plots of the whole data to be very useful, whereas QQ plots of the data for
each treatment level can be more informative.  It may depend on the type of
data you have -- your experience may be different.
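
A minimal sketch of the point about mixtures, using simulated data rather
than anything from a real experiment.  Everything in it is an assumption
made for illustration: the three-level one-way layout (borrowed from the
example quoted above), the sample size, the treatment means, and the helper
names qq_cor() and sim_y().  The correlation between the ordered data and
the normal quantiles is only a rough stand-in for "how straight does the QQ
plot look".

```r
## Simulated one-way layout: 3 treatment levels with N(0, sigma^2) errors.
## All numbers here are made up for illustration.
set.seed(1)
n     <- 200                        # observations per level (assumed)
sigma <- 1                          # error standard deviation (assumed)
trt   <- gl(3, n, labels = c("A", "B", "C"))

## Hypothetical helper: correlation between the sorted data and the
## corresponding normal quantiles (closer to 1 = straighter QQ plot).
qq_cor <- function(y) {
  q <- qqnorm(y, plot.it = FALSE)
  cor(q$x, q$y)
}

## Hypothetical helper: treatment mean plus Gaussian error.
sim_y <- function(means) {
  means[as.integer(trt)] + rnorm(length(trt), sd = sigma)
}

y_small <- sim_y(c(0, 0.5, 1) * sigma)  # modest differences in means
y_large <- sim_y(c(0, 5, 10) * sigma)   # differences the eye would catch

qq_cor(y_small)
qq_cor(y_large)

## QQ plots of all the data, ignoring treatment.
op <- par(mfrow = c(1, 2))
qqnorm(y_small, main = "All data, small differences"); qqline(y_small)
qqnorm(y_large, main = "All data, large differences"); qqline(y_large)
par(op)

## QQ plots within each treatment level.
op <- par(mfrow = c(1, 3))
for (lev in levels(trt)) {
  qqnorm(y_small[trt == lev], main = paste("Level", lev))
  qqline(y_small[trt == lev])
}
par(op)
```

With means this close together, the pooled QQ plot should look nearly
straight, while the widely separated means give an obviously kinked one.
For a fitted model, the same kind of plot on the residuals is the more
usual check.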

Kevin


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
