To my surprise, the R functions I employed did NOT verify the independence property. I've no quarrel with the theory--it's the R functions that worry me.
pG -----Original Message----- From: Douglas Bates [mailto:[EMAIL PROTECTED] Sent: Monday, July 04, 2005 9:13 AM To: [EMAIL PROTECTED]; r-help Subject: Re: [R] Lack of independence in anova() I have already had email exchanges off-list with Phillip Good pointing out that the independence property that he wishes to establish by simulation is a consequence of orthogonality of the column span of the row contrasts and the interaction contrasts. If you set the contrasts option to a set of orthogonal contrasts such as contr.helmert or contr.sum, which has no effect on the results of the anova, this is easily established. > build function(size, v = rnorm(sum(size))) { col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]), rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8])) row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6] +size[7]+size[8])) return(data.frame(c=factor(col), r=factor(row),yield=v)) } > size [1] 3 3 3 0 3 3 3 0 > set.seed(1234321) > vv <- build(size) > vv c r yield 1 0 0 1.2369081 2 0 0 1.5616230 3 0 0 1.8396185 4 1 0 0.3173245 5 1 0 1.0715115 6 1 0 -1.1459955 7 2 0 0.2488894 8 2 0 0.1158625 9 2 0 2.6200816 10 0 1 1.2624048 11 0 1 -0.9862578 12 0 1 -0.3235653 13 1 1 0.2039706 14 1 1 -1.4574576 15 1 1 1.9158713 16 2 1 -2.0333909 17 2 1 1.0050272 18 2 1 0.6789184 > options(contrasts = c('contr.helmert', 'contr.poly')) > crossprod(model.matrix(~c*r, vv)) (Intercept) c1 c2 r1 c1:r1 c2:r1 (Intercept) 18 0 0 0 0 0 c1 0 12 0 0 0 0 c2 0 0 36 0 0 0 r1 0 0 0 18 0 0 c1:r1 0 0 0 0 12 0 c2:r1 0 0 0 0 0 36 On 7/4/05, Phillip Good <[EMAIL PROTECTED]> wrote: > If the observations are normally distributed and the 2xk design is > balanced, theory requires that the tests for interaction and row effects be > independent. In my program, appended below, this would translate to cntT > (approx)= cntR*cntI/N if all R routines were functioning correctly. They > aren't. > > sim2=function(size,N,p){ > cntR=0 > cntC=0 > cntI=0 > cntT=0 > cntP=0 > for(i in 1:N){ > #generate data > v=gendata(size) > #analyze after build(ing) design containing data > lm.out=lm(yield~c*r,build(size,v)) > av.out=anova(lm.out) > #if column effect is significant, increment cntC > if (av.out[[5]][1]<=p)cntC=cntC+1 > #if row effect is significant, increment cntR > if (av.out[[5]][2]<=p){ > cntR=cntR+1 > tmp = 1 > } > else tmp =0 > if (av.out[[5]][3]<=p){ > #if interaction is significant, increment cntI > cntI=cntI+1 > #if both interaction and row effect are significant, increment cntT > cntT=cntT + tmp > } > } > list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT) > } > > build=function(size,v){ > #size is a vector containing the sample sizes > col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]), > rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8])) > row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6] > +size[7]+size[8])) > return(data.frame(c=factor(col), r=factor(row),yield=v)) > } > > gendata=function(size){ > ssize=sum(size); > return (rnorm(ssize)) > } > > #Example > size=c(3,3,3,0,3,3,3,0) > sim2(size,10000,10,.16) > > > > Phillip Good > Huntington Beach CA > > > > ______________________________________________ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > > ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html