Hi all,

I'm using R 2.0.1 for Windows to analyze the influence of the following
factors on response Y:

A (four levels)
B (three levels)
C (two levels)
D (29 levels) with
E (four replicates)

The dataset looks like this:
A       B       C       D       E       Y
0       1       1       1       1       491.9
0       1       1       1       2       618.7
0       1       1       1       3       448.2
0       1       1       1       4       632.9
250     1       1       1       1       92.4
250     1       1       1       2       117
250     1       1       1       3       35.5
250     1       1       1       4       102.7
500     1       1       1       1       47
500     1       1       1       2       57.4
500     1       1       1       3       6.5
500     1       1       1       4       50.9
1000    1       1       1       1       0.7
1000    1       1       1       2       6.2
1000    1       1       1       3       0.5
1000    1       1       1       4       1.1
0       2       2       2       1       6
0       2       2       2       2       4.2
0       2       2       2       3       20.3
0       2       2       2       4       3.5
250     2       2       2       1       8.4
250     2       2       2       2       2.8

etc.

If I ask the following: summary(aov(Y~A+B+C+D+E))

R gives me this answer:

                 Df  Sum Sq Mean Sq  F value Pr(>F)    
A                 3 135.602  45.201 310.2166 <2e-16 ***
B                 2   0.553   0.276   1.8976 0.1512    
C                 1   0.281   0.281   1.9264 0.1659    
D                25  92.848   3.714  25.4890 <2e-16 ***
E                 3   0.231   0.077   0.5279 0.6634    
Residuals       411  59.885   0.146
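
A minimal sketch of this kind of call on made-up data (the data frame name `dat` and its random values are assumptions for illustration, not my real dataset), with the numeric codes converted to factors before fitting:

```r
# Made-up toy data standing in for the real file (Y values are random).
set.seed(42)
dat <- expand.grid(E = 1:4, A = c(0, 250, 500, 1000), D = 1:6)
dat$Y <- rnorm(nrow(dat))

# Numeric codes are converted to factors; left numeric, each variable
# would be fitted as a single slope (1 Df) rather than one effect per level.
dat$A <- factor(dat$A)
dat$D <- factor(dat$D)
dat$E <- factor(dat$E)

# Number of levels per factor, to compare against the Df in the table.
sapply(dat[c("A", "D", "E")], nlevels)

fit <- aov(Y ~ A + D + E, data = dat)
summary(fit)
```

In this fully crossed toy layout, I would expect each factor to get (number of levels - 1) Df.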

Can someone explain to me why factor D has only 25 Df (instead of the 28 I
expected), and why this number changes when I leave out factors B or C (but
not A)? Why do factors B and C (but again: not A) not show up in the
calculation if they appear later in the formula than D?

When I ask summary.lm(aov(Y~A+B+C+D+E)), R tells me that three levels of D
were not defined because of "singularities" (what does this word mean?).
After checking and playing around with the dataset, I can find no pattern in
which levels are not defined. Even if I construct a "perfect" dataset
(balanced, no missing values), I never get the expected number of Df.
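
One check I can think of, sketched here on made-up toy data (the data frame `dat`, its values, and the nesting of D within B are assumptions for illustration only, not my real dataset), is to cross-tabulate D against B and count how many coefficients come back as NA. If each level of D occurs with only one level of B, some of the dummy columns for D are linear combinations of columns already in the model:

```r
# Made-up toy data (not the real dataset): six levels of D, each of which
# occurs with exactly one level of B, so D is nested within B.
set.seed(1)
dat <- expand.grid(E = 1:4, A = c(0, 250, 500, 1000), D = 1:6)
dat$B <- (dat$D - 1) %/% 2 + 1   # D 1-2 -> B 1, D 3-4 -> B 2, D 5-6 -> B 3
dat$Y <- rnorm(nrow(dat))
for (v in c("A", "B", "D", "E")) dat[[v]] <- factor(dat[[v]])

# Each row (level of D) has non-zero counts in only one column (level of B).
with(dat, table(D, B))

fit <- lm(Y ~ A + B + D, data = dat)
anova(fit)              # Df reported for each term
sum(is.na(coef(fit)))   # number of coefficients that could not be estimated
alias(fit)              # which coefficients depend linearly on others
```

The same cross-tabulation could be run for D against C on the real data.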

My other, similar datasets are analyzed as expected using similar function
calls. Am I doing something wrong here?

Many thanks,

René Eschen.

___
drs. René Eschen
CABI Bioscience Switzerland Centre
1 Rue des Grillons
CH-2800 Delémont
Switzerland
+41 32 421 48 87 (Direct)
+41 32 421 48 70 (Secretary)
+41 32 421 48 71 (Fax)

http://www.unifr.ch/biol/ecology/muellerschaerer/group/eschen/eschen.html

______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
