I think I found the solution!

> cc <- factor(DATA$cars)      # take the columns from DATA; no attach() needed
> dd <- factor(DATA$driver)
> MODEL <- y ~ cc + dd + additive
> summary(aov(MODEL, data = DATA))
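
For anyone who wants to reproduce this without downloading the file, here is a minimal sketch. It rebuilds DATA by hand from the table quoted below and converts the two columns inside the data frame, so the term names in the ANOVA table stay "cars" and "driver". The form of MODEL is my assumption (it was not shown in the original post), and the names `fit` and `tab` are only for illustration.

```r
# Rebuild DATA from the table quoted below (values typed in by hand).
DATA <- data.frame(
  driver   = rep(1:4, times = 4),
  cars     = rep(1:4, each = 4),
  additive = factor(strsplit("ADBCBCDADACBCBAD", "")[[1]]),
  y        = c(19, 23, 15, 19, 24, 24, 14, 18,
               23, 19, 15, 19, 26, 30, 16, 16)
)

# Convert the numeric identifiers to factors in the data frame itself.
DATA$cars   <- factor(DATA$cars)
DATA$driver <- factor(DATA$driver)

# Assumed form of MODEL; the original post did not show its definition.
MODEL <- y ~ cars + driver + additive

fit <- aov(MODEL, data = DATA)
tab <- summary(fit)[[1]]   # the ANOVA table as a data frame
print(tab)
```

With factors, each of cars, driver and additive gets 3 degrees of freedom and 6 are left for the residuals, which matches the "brute force" table quoted below.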

On 14 Jun, 2010, at 2:52 PM, Andrea Bernasconi DG wrote:

> Hi R help,
> 
> Which is the easiest (most elegant) way to force "aov" to treat numerical 
> variables as categorical?
> 
> Sincerely, Andrea Bernasconi DG
> 
> PROBLEM EXAMPLE
> 
> I consider the Latin square example described on page 157 of the book
> Statistics for Experimenters: Design, Innovation, and Discovery by George E. 
> P. Box, J. Stuart Hunter, William G. Hunter.
> 
> This example uses the data file /BHH2-Data/tab0408.dat from 
> ftp://ftp.wiley.com/ in /sci_tech_med/statistics_experimenters/BHH2-Data.zip.
> 
> The file tab0408.dat contains the following DATA:
> > DATA
>    driver cars additive  y
> 1       1    1        A 19
> 2       2    1        D 23
> 3       3    1        B 15
> 4       4    1        C 19
> 5       1    2        B 24
> 6       2    2        C 24
> 7       3    2        D 14
> 8       4    2        A 18
> 9       1    3        D 23
> 10      2    3        A 19
> 11      3    3        C 15
> 12      4    3        B 19
> 13      1    4        C 26
> 14      2    4        B 30
> 15      3    4        A 16
> 16      4    4        D 16
> 
> Now
> > summary( aov(MODEL, data=DATA) )
>             Df Sum Sq Mean Sq F value Pr(>F)  
> cars         1   12.8  12.800  0.8889 0.3680  
> driver       1  115.2 115.200  8.0000 0.0179 *
> additive     3   40.0  13.333  0.9259 0.4634  
> Residuals   10  144.0  14.400                 
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> These results differ from the book's result on p. 159, since "cars" and 
> "driver" are treated as numerical variables by "aov".
> 
> BRUTE FORCE SOLUTION
> 
> Manually transforming "cars" and "driver" into categorical variables, I 
> obtain the correct result:
> > DATA_AB
>    driver cars additive  y
> 1      D1   C1        A 19
> 2      D2   C1        D 23
> 3      D3   C1        B 15
> 4      D4   C1        C 19
> 5      D1   C2        B 24
> 6      D2   C2        C 24
> 7      D3   C2        D 14
> 8      D4   C2        A 18
> 9      D1   C3        D 23
> 10     D2   C3        A 19
> 11     D3   C3        C 15
> 12     D4   C3        B 19
> 13     D1   C4        C 26
> 14     D2   C4        B 30
> 15     D3   C4        A 16
> 16     D4   C4        D 16
> > summary( aov(MODEL, data=DATA_AB) )
>             Df Sum Sq Mean Sq F value   Pr(>F)   
> cars         3     24   8.000     1.5 0.307174   
> driver       3    216  72.000    13.5 0.004466 **
> additive     3     40  13.333     2.5 0.156490   
> Residuals    6     32   5.333                    
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> 
> QUESTION
> 
> Which is the easiest (most elegant) way to force "driver" and "cars" from 
> DATA to be treated as categorical variables by "aov"?
> More generally, which is the easiest way to force "aov" to treat numerical 
> variables as categorical?
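
On the more general question, two variants seem to work. This is only a sketch under my own assumptions: DATA is rebuilt by hand from the table above, and the names `block_cols`, `DATA_F`, `fit_inline` and `fit_cols` are made up for illustration. The first variant wraps the terms in factor() directly in the formula and leaves DATA untouched; the second converts a chosen set of columns in one step.

```r
# DATA as read from the file: driver and cars are plain integers.
DATA <- data.frame(
  driver   = rep(1:4, times = 4),
  cars     = rep(1:4, each = 4),
  additive = factor(strsplit("ADBCBCDADACBCBAD", "")[[1]]),
  y        = c(19, 23, 15, 19, 24, 24, 14, 18,
               23, 19, 15, 19, 26, 30, 16, 16)
)

# Variant 1: call factor() inside the formula; DATA itself is not modified.
fit_inline <- aov(y ~ factor(cars) + factor(driver) + additive, data = DATA)

# Variant 2: convert several columns at once on a copy of the data frame.
block_cols <- c("cars", "driver")   # hypothetical list of columns to convert
DATA_F <- DATA
DATA_F[block_cols] <- lapply(DATA_F[block_cols], factor)
fit_cols <- aov(y ~ cars + driver + additive, data = DATA_F)

# Both fits should give the same degrees of freedom per term.
df_inline <- summary(fit_inline)[[1]]$Df
df_cols   <- summary(fit_cols)[[1]]$Df
print(rbind(df_inline, df_cols))
```

The inline form is convenient for a one-off fit; converting the columns is probably cleaner when the same data frame is used for several models.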
