[R] about lm restrictions...

2006-01-26 Thread klebyn

Hello all R-users


_question 1_

I need to fit a statistical model and produce the corresponding ANOVA table,
but I get different results from

the t-test (in the summary(lm.object) function) and
the F-test (in anova(lm.object)).

Shouldn't these two approaches give me the same result, i.e.,
indicate the same significant terms in both tests?

Note:

The system has two restrictions:
1) sum( x_i ) = 1
2) sum( z_j ) = 1



*output below*

_question 2_


Should the SST in the ANOVA table have:

1) N-2 d.f. because of the 2 restrictions?
 or
2) N-1 d.f. because of 1 global restriction: sum( x ) + sum( z ) = 2 ?


I can't find any paper, book, or other reference on this.
If someone could point me to references for this type of model (with 2
restrictions), I would be very grateful.


Thanks a lot.
Regards
 
 
Cleber N. Borges



###
# OUTPUT
###


Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  15.5000     0.5270  29.409 2.97e-10 ***
z1:x1        -5.0000     0.7454  -6.708 8.77e-05 ***
z1:x2         0.5000     0.7454   0.671 0.519177
z1:x3        -3.0000     0.7454  -4.025 0.002996 **
z2:x1        -6.0000     0.7454  -8.050 2.11e-05 ***
z2:x2        -5.0000     0.7454  -6.708 8.77e-05 ***
z2:x3        -4.5000     0.7454  -6.037 0.000193 ***
z3:x1         1.0000     0.7454   1.342 0.212580
z3:x2         1.5000     0.7454   2.012 0.075029 .
z3:x3             NA         NA      NA       NA

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)
z1:x1      1 16.674  16.674 30.0125 0.0003910 ***
z1:x2      1 13.580  13.580 24.4446 0.0007977 ***
z1:x3      1  1.190   1.190  2.1429 0.1772677
z2:x1      1 35.267  35.267 63.4800 2.287e-05 ***
z2:x2      1 32.400  32.400 58.3200 3.202e-05 ***
z2:x3      1 42.667  42.667 76.8000 1.061e-05 ***
z3:x1      1  0.083   0.083  0.1500 0.7075349
z3:x2      1  2.250   2.250  4.0500 0.0750295 .
Residuals  9  5.000   0.556
---





###
# DATA
###

  z1 z2 z3 x1 x2 x3  y
  1  0  0  1  0  0 10
  1  0  0  0  1  0 15
  1  0  0  0  0  1 12
  0  1  0  1  0  0 10
  0  1  0  0  1  0 11
  0  1  0  0  0  1 11
  0  0  1  1  0  0 16
  0  0  1  0  1  0 17
  0  0  1  0  0  1 15
  1  0  0  1  0  0 11
  1  0  0  0  1  0 17
  1  0  0  0  0  1 13
  0  1  0  1  0  0  9
  0  1  0  0  1  0 10
  0  1  0  0  0  1 11
  0  0  1  1  0  0 17
  0  0  1  0  1  0 17
  0  0  1  0  0  1 16



###
# CODE
###


 # Read the DATA table above after copying it to the clipboard
 x <- read.table(file("clipboard"), header = TRUE)

## NOT a Scheffé Model:

 x.lm <- lm( y ~ (z1+z2+z3):(x1+x2+x3), data=x)
 summary(x.lm)
 anova(x.lm)


## Scheffé Model: is the analysis below correct?

 x.lm <- lm( y ~ -1 + (z1+z2+z3):(x1+x2+x3), data=x)
 summary(x.lm)

 x.aov <- aov( y ~  (z1+z2+z3):(x1+x2+x3), data=x)
 summary(x.aov)
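
A minimal sketch, not part of the original analysis, of where the "1 not defined because of singularities" note probably comes from. It rebuilds the 18 rows of the DATA section and checks that, because sum( x_i ) = 1 and sum( z_j ) = 1 hold in every row, the nine z:x product columns add up to the intercept column. The names dat, X, fit_int and fit_noint are illustrative only.

```r
# Rebuild the 18-row design from the DATA section
# (two replicates of the same 3 x 3 layout).
z <- rep(rep(1:3, each = 3), times = 2)   # which z_j is 1 in each row
w <- rep(1:3, times = 6)                  # which x_i is 1 in each row
dat <- data.frame(
  z1 = as.numeric(z == 1), z2 = as.numeric(z == 2), z3 = as.numeric(z == 3),
  x1 = as.numeric(w == 1), x2 = as.numeric(w == 2), x3 = as.numeric(w == 3),
  y  = c(10, 15, 12, 10, 11, 11, 16, 17, 15,
         11, 17, 13,  9, 10, 11, 17, 17, 16)
)

# The nine product columns z_j * x_i, without an intercept column.
X <- model.matrix(~ -1 + (z1 + z2 + z3):(x1 + x2 + x3), data = dat)

# With both restrictions holding row by row, the products sum to 1 in
# every row, so together they reproduce the intercept column.
row_sums <- rowSums(X)

fit_int   <- lm(y ~      (z1 + z2 + z3):(x1 + x2 + x3), data = dat)
fit_noint <- lm(y ~ -1 + (z1 + z2 + z3):(x1 + x2 + x3), data = dat)

# Number of coefficients lm() could not estimate in the intercept model.
n_aliased <- sum(is.na(coef(fit_int)))

# Both parameterisations describe the same fitted values and residual d.f.
same_fit <- isTRUE(all.equal(unname(fitted(fit_int)),
                             unname(fitted(fit_noint))))
resid_df <- c(df.residual(fit_int), df.residual(fit_noint))

print(row_sums)
print(n_aliased)
print(same_fit)
print(resid_df)
```

If this reading is right, the intercept model and the no-intercept model are two parameterisations of the same fit: without the intercept, each coefficient is simply the mean of one z:x cell.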

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] about lm restrictions...

2006-01-26 Thread Liaw, Andy


From: klebyn
 
 Hello all R-users
 
 
 _question 1_
 
 I need to fit a statistical model and produce the corresponding ANOVA table,
 but I get different results from
 
 the t-test (in the summary(lm.object) function) and
 the F-test (in anova(lm.object)).
 
 Shouldn't these two approaches give me the same result, i.e.,
 indicate the same significant terms in both tests?

No, because they are not the same tests.  The t-tests in summary.lm() test
whether the coefficient is zero, when all other terms are present in the
model.  The F-tests in anova.lm() test the terms by sequentially adding them
into the model.  Here's an example:

> set.seed(1)
> d <- data.frame(x1=runif(20), x2=runif(20), y=rnorm(20))
> fm <- lm(y ~ ., d)
> summary(fm)$coef
              Estimate Std. Error    t value   Pr(>|t|)
(Intercept)  1.0187254  0.5534310  1.8407452 0.08318123
x1          -1.6914784  0.6377065 -2.6524404 0.01675543
x2          -0.1817831  0.6618875 -0.2746435 0.78689983
> anova(fm)
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value  Pr(>F)
x1         1  4.2341  4.2341  7.0936 0.01638 *
x2         1  0.0450  0.0450  0.0754 0.78690
Residuals 17 10.1472  0.5969
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(fm2 <- lm(y ~ x2 + x1, d))
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value  Pr(>F)
x2         1  0.0797  0.0797  0.1336 0.71928
x1         1  4.1994  4.1994  7.0354 0.01676 *
Residuals 17 10.1472  0.5969
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Notice how the p-value for x1 in the last output matches that of the t-test:
because both are testing if the coefficient for x1 is 0 given that x2 is
already in the model.  (It's the same reason that the p-value for x2 in the
first anova() output matches that of the summary.lm(), but not the second
anova() output.)
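
As a rough illustration of the same point, the sketch below checks the relationship numerically on the simulated data above. It uses drop1(), which is not used anywhere in this thread and is brought in here only as one way to get an F-test for each term given all the others. The variable names are illustrative.

```r
set.seed(1)
d  <- data.frame(x1 = runif(20), x2 = runif(20), y = rnorm(20))
fm <- lm(y ~ x1 + x2, data = d)

# t statistics from summary(): each term tested given all the others.
t_vals <- summary(fm)$coefficients[c("x1", "x2"), "t value"]

# Marginal F-tests: drop each term in turn from the full model.
f_marginal <- drop1(fm, test = "F")[c("x1", "x2"), "F value"]

# Sequential F-tests: x1 alone first, then x2 given x1.
f_sequential <- anova(fm)[c("x1", "x2"), "F value"]

# Each marginal F is the square of the corresponding t statistic.
print(all.equal(unname(t_vals^2), f_marginal))

# In the sequential table only the last term (x2) has F equal to t^2;
# the F for x1 ignores x2 and so differs from its squared t.
print(all.equal(unname(t_vals["x2"]^2), f_sequential[2]))
print(c(sequential = f_sequential[1], t_squared = unname(t_vals["x1"]^2)))
```

So when the question is whether each term matters given everything else in the model, the t-tests (or the marginal F-tests) answer it directly, while the anova() table depends on the order of the terms.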

I may be off, but I do not think the restrictions you mentioned have any
bearing on the analysis.  If x + z is restricted to something _for each
case_ then you do have to worry, but not the way you have it.  You can
choose the independent variables to take on any value you like (as in
designed experiments), so such restrictions should not matter.

Andy

