Re: [R] Assumptions for ANOVA: the right way to check the normality

Frodo Jedi Thu, 06 Jan 2011 20:03:19 -0800

Thanks a lot Greg, 
you have been very helpful.

All the best





________________________________
From: Greg Snow <greg.s...@imail.org>

<r-help@r-project.org>
Sent: Thu, January 6, 2011 9:29:36 PM
Subject: RE: [R] Assumptions for ANOVA: the right way to check the normality


Some would argue to always use the kruskal wallis test since we never know for 
sure if we have normality.  Personally I am not sure that I understand what
exactly that test is really testing.  Plus in your case you are doing a two-way 
anova and kruskal.test does one-way, so it will not work for your case.  There 
are other non-parametric options.
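
To make the one-way limitation concrete, here is a minimal sketch that uses
only the first 29 rows of the table quoted further down (stimulus
flat_550_W_realism).  The data frame name d is chosen for illustration and is
not an object from the thread.  kruskal.test() takes a single grouping factor,
so it can compare the two conditions within one stimulus but cannot represent
the stimulus-by-condition design:

```r
# Responses for stimulus flat_550_W_realism, copied from rows 1-29 of the
# quoted table. 'd' is an illustrative name, not an object from the thread.
d <- data.frame(
  condition = factor(rep(c("A", "AH"), times = c(15, 14))),
  response  = c(3, 3, 5, 3, 3, 3, 3, 5, 3, 3, 5, 7, 5, 2, 3,
                7, 4, 5, 3, 6, 5, 3, 5, 5, 7, 2, 7, 5, 5)
)

# One grouping factor only: this is the one-way rank-based test.
kw <- kruskal.test(response ~ condition, data = d)
print(kw)
```

Running this once per stimulus would multiply the number of tests and still
say nothing about an interaction, which is why it is not a substitute for the
two-way analysis.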

Whether to use anova and other normality based tests is really a matter of what 
assumptions you are willing to live with and what level of "close enough" you 
are comfortable with.  Consulting with a local consultant with experience in 
these areas is useful if you don't have enough experience to decide what you 
are comfortable with.

For your description, I would try the proportional odds logistic regression, but 
again, you should probably consult with someone who has experience rather than 
trying that on your own until you have more training and experience.
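
For orientation only, a minimal sketch of what such a fit could look like in R,
assuming polr() from the MASS package and using just the first 29 rows of the
quoted table (stimulus flat_550_W_realism).  The names d and rating are
illustrative, and this is not a recommended final analysis:

```r
library(MASS)  # polr() lives in MASS, a recommended package shipped with R

# Rows 1-29 of the quoted table; 'd' and 'rating' are illustrative names.
d <- data.frame(
  condition = factor(rep(c("A", "AH"), times = c(15, 14))),
  response  = c(3, 3, 5, 3, 3, 3, 3, 5, 3, 3, 5, 7, 5, 2, 3,
                7, 4, 5, 3, 6, 5, 3, 5, 5, 7, 2, 7, 5, 5)
)

# Treat the rating as an ordered category, not as a number.
d$rating <- factor(d$response, ordered = TRUE)

# Proportional odds (cumulative logit) model with one predictor.
fit <- polr(rating ~ condition, data = d, Hess = TRUE)
print(summary(fit))
```

With the full data set the two-factor version would presumably use a formula
such as rating ~ stimulus * condition; that is an untested assumption here.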

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


Sent: Thursday, January 06, 2011 12:57 PM
To: Greg Snow; r-help@r-project.org
Subject: Re: [R] Assumptions for ANOVA: the right way to check the normality


Ok,
I see ;-)

Let's put it this way then. When do I have to use the Kruskal-Wallis test? I 
mean, when can I be very sure that I have to use it instead of ANOVA?

Thanks


Best regards

P.S.  In addition, which non-parametric method corresponds to a two-way 
ANOVA? Or do I have to repeat the Kruskal-Wallis test many times?

________________________________

From:Greg Snow <greg.s...@imail.org>

"r-help@r-project.org" <r-help@r-project.org>
Sent: Thu, January 6, 2011 7:07:17 PM
Subject: RE: [R] Assumptions for ANOVA: the right way to check the normality

Remember that a non-significant result (especially one that is still near alpha 
like yours) does not give evidence that the null is true.  The reason that the 
1st 2 tests below don't show significance is more due to lack of power than to 
some of the residuals being normal.  The only test that I would trust for this is 
SnowsPenultimateNormalityTest (TeachingDemos package, the help page is more
useful than the function itself).
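
The power point can be illustrated with a small simulation.  This is a sketch
under stated assumptions: the 7-point ratings are simulated with made-up
probabilities (they are not the data from this thread), and draw_ratings and
reject_rate are hypothetical helper names.  The population is never normal, yet
how often shapiro.test() says so depends heavily on the sample size:

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical 7-point ratings: discrete and bounded, so never exactly normal.
draw_ratings <- function(n) {
  sample(1:7, size = n, replace = TRUE,
         prob = c(1, 3, 6, 7, 6, 3, 1))
}

# Proportion of simulated samples in which shapiro.test() rejects at 0.05.
reject_rate <- function(n, reps = 500) {
  mean(replicate(reps, shapiro.test(draw_ratings(n))$p.value < 0.05))
}

rate_small <- reject_rate(15)
rate_large <- reject_rate(300)
print(c(n15 = rate_small, n300 = rate_large))
```

A p-value above 0.05 in the small-sample case therefore reflects the sample
size rather than any evidence that the data are normal.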

But I think that you are mixing up 2 different concepts (a very common 
misunderstanding).  What is important if we want to do normal theory inference 
is that the coefficients/effects/estimates are normally distributed.  Now since 
these coefficients can be shown to be linear combinations of the error terms, if 
the errors are iid normal then the coefficients are also normally distributed.  
So many people want to show that the residuals come from a perfectly normal
distribution.  But it is the theoretical errors, not the observed residuals, that 
are important (the observed residuals are not iid).  You need to think about the 
source of your data to see if this is a reasonable assumption.  Now I cannot 
fathom any universe (theoretical or real) in which normally distributed errors 
added to means that they are independent of will result in a finite set of
integers, so an assumption of exact normality is not reasonable (some may want 
to argue this, but convincing me will be very difficult).  But looking for exact 
normality is a bit of a red herring because we also have the Central Limit
Theorem, which says that if the errors are not normal (but still iid) then the 
distribution of the coefficients will approach normality as the sample size
increases.  This is what makes statistics doable (because no real dataset entered 
into the computer is exactly normal).  The more important question is: are the 
residuals "normal enough"?  For that there is no definitive test (experience 
and plots help).
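
The Central Limit Theorem argument can be checked by simulation.  A minimal
sketch, assuming an arbitrary fixed design x, a made-up true slope of 2, and
exponential errors shifted to mean zero (none of this comes from the data in
this thread; skewness is a small helper defined here):

```r
set.seed(2)  # arbitrary seed, for reproducibility only

# Sample skewness: near 0 for a symmetric distribution, 2 for the exponential.
skewness <- function(z) mean((z - mean(z))^3) / sd(z)^3

n <- 100
x <- ((1:n) / n)^2   # arbitrary fixed design, chosen for illustration
true_slope <- 2      # made-up value

# Refit the model many times, each time with fresh, strongly skewed iid errors.
slopes <- replicate(2000, {
  e <- rexp(n) - 1   # mean zero, but far from normal
  y <- 1 + true_slope * x + e
  coef(lm(y ~ x))[["x"]]
})

skew_errors <- skewness(rexp(1e5) - 1)
skew_slopes <- skewness(slopes)
print(c(errors = skew_errors, slopes = skew_slopes))
```

The errors stay heavily skewed while the estimated slopes are close to
symmetric around the true value; hist(slopes) or qqnorm(slopes) shows the same
thing graphically.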

But this all depends on another assumption that I don't think that you have even 
considered.  Yes we can use normal theory even when the random part of the data 
is not normally distributed, but this still assumes that the data is at least 
interval data, i.e. that we firmly believe that the difference between a 
response of 1 and a response of 2 is exactly the same as a difference between a 
6 and a 7 and that the difference from 4 to 6 is exactly twice that of 1 vs. 2.  
From your data and other descriptions, I don't think that that is a reasonable 
assumption.  If you are not willing to make that assumption (like me) then means 
and normal theory tests are meaningless and you should use other approaches.  
One possibility is to use non-parametric methods (which I believe Frank has
already suggested you use), another is to use proportional odds logistic 
regression.



--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Frodo Jedi
> Sent: Wednesday, January 05, 2011 3:22 PM
> To: Robert Baer; r-help@r-project.org
> Subject: Re: [R] Assumptions for ANOVA: the right way to check the
> normality
>
> Dear Robert,
> So you also think that I have to check only the residuals and not the
> data directly.
> Now just out of curiosity I ran the Shapiro test on the residuals. The
> problem is that for fit3 the test does not indicate
> that the data are normally distributed. Why? Here are the results:
>
> > shapiro.test(residuals(fit1))
>
>    Shapiro-Wilk normality test
>
> data:  residuals(fit1)
> W = 0.9848, p-value = 0.05693
>
> #Here the test is ok: the test says that the data are distributed
> normally (p-value greater than 0.05)
>
>
>
> > shapiro.test(residuals(fit2))
>
>    Shapiro-Wilk normality test
>
> data:  residuals(fit2)
> W = 0.9853, p-value = 0.06525
>
> #Here the test is ok: the test says that the data are distributed
> normally (p-value greater than 0.05)
>
>
>
> > shapiro.test(residuals(fit3))
>
>    Shapiro-Wilk normality test
>
> data:  residuals(fit3)
> W = 0.9621, p-value = 0.0001206
>
>
>
> Now the test gives a p-value lower than 0.05: so the residuals for fit3
> are not distributed normally....
> Why do I get this behaviour? Indeed in the histogram and Q-Q plot for
> fit3 residuals I get a normal distribution.
>
> ________________________________
> From: Robert Baer <rb...@atsu.edu>
>
> Sent: Wed, January 5, 2011 8:56:50 PM
> Subject: Re: [R] Assumptions for ANOVA: the right way to check the
> normality
>
> > Someone suggested to me that I don't have to check the normality of the
> > data, but the normality of the residuals I get after fitting the linear
> > model.
> > I really ask you to help me understand this point, as I can't find
> > enough material online to resolve it.
>
> Try the following:
> # using your scrd data and your proposed models
> fit1<- lm(response ~ stimulus + condition + stimulus:condition,
> data=scrd)
> fit2<- lm(response ~ stimulus + condition, data=scrd)
> fit3<- lm(response ~ condition, data=scrd)
>
> # Set up for 6 plots on 1 panel
> op = par(mfrow=c(2,3))
>
> # residuals function extracts residuals
> # Visual inspection is a good start for checking normality
> # You get a much better feel than from some "magic number" statistic
> hist(residuals(fit1))
> hist(residuals(fit2))
> hist(residuals(fit3))
>
> # especially qqnorm() plots which are linear for normal data
> qqnorm(residuals(fit1))
> qqnorm(residuals(fit2))
> qqnorm(residuals(fit3))
>
> # Restore plot parameters
> par(op)
>
> >
> > If the data are not normally distributed I have to use the Kruskal-Wallis
> > test and not the ANOVA...so please help
> > me to understand.
>
> Indeed - Kruskal-Wallis is a good test to use for one factor data that
> is ordinal so it is a good alternative to your fit3.
> Your "response" seems to be a discrete variable rather than a
> continuous variable.
> You must decide if it is reasonable to approximate it with a normal
> distribution which is by definition continuous.
>
> >
> > I make a numerical example, could you please tell me if the data in
> this table
> > are normally distributed or not?
> >
> > Help!
> >
> >
> > number                  stimulus condition response
> > 1            flat_550_W_realism        A        3
> > 2            flat_550_W_realism        A        3
> > 3            flat_550_W_realism        A        5
> > 4            flat_550_W_realism        A        3
> > 5            flat_550_W_realism        A        3
> > 6            flat_550_W_realism        A        3
> > 7            flat_550_W_realism        A        3
> > 8            flat_550_W_realism        A        5
> > 9            flat_550_W_realism        A        3
> > 10            flat_550_W_realism        A        3
> > 11            flat_550_W_realism        A        5
> > 12            flat_550_W_realism        A        7
> > 13            flat_550_W_realism        A        5
> > 14            flat_550_W_realism        A        2
> > 15            flat_550_W_realism        A        3
> > 16            flat_550_W_realism        AH        7
> > 17            flat_550_W_realism        AH        4
> > 18            flat_550_W_realism        AH        5
> > 19            flat_550_W_realism        AH        3
> > 20            flat_550_W_realism        AH        6
> > 21            flat_550_W_realism        AH        5
> > 22            flat_550_W_realism        AH        3
> > 23            flat_550_W_realism        AH        5
> > 24            flat_550_W_realism        AH        5
> > 25            flat_550_W_realism        AH        7
> > 26            flat_550_W_realism        AH        2
> > 27            flat_550_W_realism        AH        7
> > 28            flat_550_W_realism        AH        5
> > 29            flat_550_W_realism        AH        5
> > 30        bump_2_step_W_realism        A        1
> > 31        bump_2_step_W_realism        A        3
> > 32        bump_2_step_W_realism        A        5
> > 33        bump_2_step_W_realism        A        1
> > 34        bump_2_step_W_realism        A        3
> > 35        bump_2_step_W_realism        A        2
> > 36        bump_2_step_W_realism        A        5
> > 37        bump_2_step_W_realism        A        4
> > 38        bump_2_step_W_realism        A        4
> > 39        bump_2_step_W_realism        A        4
> > 40        bump_2_step_W_realism        A        4
> > 41        bump_2_step_W_realism        AH        3
> > 42        bump_2_step_W_realism        AH        5
> > 43        bump_2_step_W_realism        AH        1
> > 44        bump_2_step_W_realism        AH        5
> > 45        bump_2_step_W_realism        AH        4
> > 46        bump_2_step_W_realism        AH        4
> > 47        bump_2_step_W_realism        AH        5
> > 48        bump_2_step_W_realism        AH        4
> > 49        bump_2_step_W_realism        AH        3
> > 50        bump_2_step_W_realism        AH        4
> > 51        bump_2_step_W_realism        AH        5
> > 52        bump_2_step_W_realism        AH        4
> > 53        hole_2_step_W_realism        A        3
> > 54        hole_2_step_W_realism        A        3
> > 55        hole_2_step_W_realism        A        4
> > 56        hole_2_step_W_realism        A        1
> > 57        hole_2_step_W_realism        A        4
> > 58        hole_2_step_W_realism        A        3
> > 59        hole_2_step_W_realism        A        5
> > 60        hole_2_step_W_realism        A        4
> > 61        hole_2_step_W_realism        A        3
> > 62        hole_2_step_W_realism        A        4
> > 63        hole_2_step_W_realism        A        7
> > 64        hole_2_step_W_realism        A        5
> > 65        hole_2_step_W_realism        A        1
> > 66        hole_2_step_W_realism        A        4
> > 67        hole_2_step_W_realism        AH        7
> > 68        hole_2_step_W_realism        AH        5
> > 69        hole_2_step_W_realism        AH        5
> > 70        hole_2_step_W_realism        AH        1
> > 71        hole_2_step_W_realism        AH        5
> > 72        hole_2_step_W_realism        AH        5
> > 73        hole_2_step_W_realism        AH        5
> > 74        hole_2_step_W_realism        AH        2
> > 75        hole_2_step_W_realism        AH        6
> > 76        hole_2_step_W_realism        AH        5
> > 77        hole_2_step_W_realism        AH        5
> > 78        hole_2_step_W_realism        AH        6
> > 79    bump_2_heel_toe_W_realism        A        3
> > 80    bump_2_heel_toe_W_realism        A        3
> > 81    bump_2_heel_toe_W_realism        A        3
> > 82    bump_2_heel_toe_W_realism        A        2
> > 83    bump_2_heel_toe_W_realism        A        3
> > 84    bump_2_heel_toe_W_realism        A        3
> > 85    bump_2_heel_toe_W_realism        A        4
> > 86    bump_2_heel_toe_W_realism        A        3
> > 87    bump_2_heel_toe_W_realism        A        4
> > 88    bump_2_heel_toe_W_realism        A        4
> > 89    bump_2_heel_toe_W_realism        A        6
> > 90    bump_2_heel_toe_W_realism        A        5
> > 91    bump_2_heel_toe_W_realism        A        4
> > 92    bump_2_heel_toe_W_realism        AH        7
> > 93    bump_2_heel_toe_W_realism        AH        3
> > 94    bump_2_heel_toe_W_realism        AH        4
> > 95    bump_2_heel_toe_W_realism        AH        2
> > 96    bump_2_heel_toe_W_realism        AH        5
> > 97    bump_2_heel_toe_W_realism        AH        6
> > 98    bump_2_heel_toe_W_realism        AH        4
> > 99    bump_2_heel_toe_W_realism        AH        4
> > 100    bump_2_heel_toe_W_realism        AH        4
> > 101    bump_2_heel_toe_W_realism        AH        5
> > 102    bump_2_heel_toe_W_realism        AH        2
> > 103    bump_2_heel_toe_W_realism        AH        6
> > 104    bump_2_heel_toe_W_realism        AH        5
> > 105    hole_2_heel_toe_W_realism        A        3
> > 106    hole_2_heel_toe_W_realism        A        3
> > 107    hole_2_heel_toe_W_realism        A        1
> > 108    hole_2_heel_toe_W_realism        A        3
> > 109    hole_2_heel_toe_W_realism        A        3
> > 110    hole_2_heel_toe_W_realism        A        5
> > 111    hole_2_heel_toe_W_realism        A        2
> > 112    hole_2_heel_toe_W_realism        AH        5
> > 113    hole_2_heel_toe_W_realism        AH        1
> > 114    hole_2_heel_toe_W_realism        AH        3
> > 115    hole_2_heel_toe_W_realism        AH        6
> > 116    hole_2_heel_toe_W_realism        AH        5
> > 117    hole_2_heel_toe_W_realism        AH        4
> > 118    hole_2_heel_toe_W_realism        AH        4
> > 119    hole_2_heel_toe_W_realism        AH        3
> > 120    hole_2_heel_toe_W_realism        AH        3
> > 121    hole_2_heel_toe_W_realism        AH        1
> > 122    hole_2_heel_toe_W_realism        AH        5
> > 123 bump_2_combination_W_realism        A        4
> > 124 bump_2_combination_W_realism        A        2
> > 125 bump_2_combination_W_realism        A        4
> > 126 bump_2_combination_W_realism        A        1
> > 127 bump_2_combination_W_realism        A        4
> > 128 bump_2_combination_W_realism        A        4
> > 129 bump_2_combination_W_realism        A        2
> > 130 bump_2_combination_W_realism        A        4
> > 131 bump_2_combination_W_realism        A        2
> > 132 bump_2_combination_W_realism        A        4
> > 133 bump_2_combination_W_realism        A        2
> > 134 bump_2_combination_W_realism        A        6
> > 135 bump_2_combination_W_realism        AH        7
> > 136 bump_2_combination_W_realism        AH        3
> > 137 bump_2_combination_W_realism        AH        4
> > 138 bump_2_combination_W_realism        AH        1
> > 139 bump_2_combination_W_realism        AH        6
> > 140 bump_2_combination_W_realism        AH        5
> > 141 bump_2_combination_W_realism        AH        5
> > 142 bump_2_combination_W_realism        AH        6
> > 143 bump_2_combination_W_realism        AH        5
> > 144 bump_2_combination_W_realism        AH        4
> > 145 bump_2_combination_W_realism        AH        2
> > 146 bump_2_combination_W_realism        AH        4
> > 147 bump_2_combination_W_realism        AH        2
> > 148 bump_2_combination_W_realism        AH        5
> > 149 hole_2_combination_W_realism        A        5
> > 150 hole_2_combination_W_realism        A        2
> > 151 hole_2_combination_W_realism        A        4
> > 152 hole_2_combination_W_realism        A        1
> > 153 hole_2_combination_W_realism        A        5
> > 154 hole_2_combination_W_realism        A        4
> > 155 hole_2_combination_W_realism        A        3
> > 156 hole_2_combination_W_realism        A        5
> > 157 hole_2_combination_W_realism        A        2
> > 158 hole_2_combination_W_realism        A        5
> > 159 hole_2_combination_W_realism        A        5
> > 160 hole_2_combination_W_realism        A        1
> > 161 hole_2_combination_W_realism        AH        7
> > 162 hole_2_combination_W_realism        AH        5
> > 163 hole_2_combination_W_realism        AH        3
> > 164 hole_2_combination_W_realism        AH        1
> > 165 hole_2_combination_W_realism        AH        6
> > 166 hole_2_combination_W_realism        AH        4
> > 167 hole_2_combination_W_realism        AH        7
> > 168 hole_2_combination_W_realism        AH        5
> > 169 hole_2_combination_W_realism        AH        5
> > 170 hole_2_combination_W_realism        AH        2
> > 171 hole_2_combination_W_realism        AH        6
> > 172 hole_2_combination_W_realism        AH        2
> > 173 hole_2_combination_W_realism        AH        4
> >
> >
> >
> >
> > Thanks in advance
> >
> >
> >
> > [[alternative HTML version deleted]]
> >
> >
>
>
>
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
