Re: [R] glm model syntax

2008-05-17 Thread Birgit Lemcke

Thanks a lot for your explanations.

Just to complete the picture:

I am using glm with a quasi-Poisson distribution for count data
variables, and I still have trouble interpreting the table that I get
back.

But that is probably more a matter of my lack of statistical knowledge.

Regards,

Birgit

On 16.05.2008 at 19:10, Doran, Harold wrote:

> Dear Berwin:
>
> Indeed, it seems I was incorrect. Using your data, it seems that only in
> the case that the variables are numeric would my earlier statements be
> true, as you note. For example, if we did
>
> lm(y ~ as.numeric(N)+as.numeric(M), dat)
> lm(y ~ as.numeric(N)*as.numeric(M), dat)
> lm(y ~ as.numeric(N):as.numeric(M), dat)
>
> Then the latter two are different, but only under the coercion to
> numeric.


Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]

175 Years of UZH
«marvel.experience.understand. Hands-on science.»
MNF anniversary event for young and old.
19 April 2008, 10:00 to 02:00
Campus Irchel, Winterthurerstrasse 190, 8057 Zürich
Further information: http://www.175jahre.uzh.ch/naturwissenschaft

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glm model syntax

2008-05-16 Thread Birgit Lemcke


Hello R users!

What is the difference between

glm(A~N+M)
glm(A~N:M)
glm(A~N*M)

Thanks in advance.

Birgit



Re: [R] glm model syntax

2008-05-16 Thread Doran, Harold
N+M gives only the main effects, N:M gives only the interaction, and N*M gives
the main effects and the interaction.



Re: [R] glm model syntax

2008-05-16 Thread Berwin A Turlach
G'day Harold,

On Fri, 16 May 2008 11:43:32 -0400
Doran, Harold [EMAIL PROTECTED] wrote:

> N+M gives only the main effects, N:M gives only the interaction, and
> N*M gives the main effects and the interaction.

I guess this raises the question of what you mean by "N:M gives only the
interaction" ;-)

Consider:

R> (M <- gl(2, 1, length=12))
 [1] 1 2 1 2 1 2 1 2 1 2 1 2
Levels: 1 2
R> (N <- gl(2, 6))
 [1] 1 1 1 1 1 1 2 2 2 2 2 2
Levels: 1 2
R> dat <- data.frame(y= rnorm(12), N=N, M=M)
R> dim(model.matrix(y~N+M, dat))
[1] 12  3
R> dim(model.matrix(y~N:M, dat))
[1] 12  5
R> dim(model.matrix(y~N*M, dat))
[1] 12  4

Why does the model matrix of y~N:M have more columns than the model
matrix of y~N*M if the former contains only the interactions and the
latter contains main terms and interactions?  Of course, if we drop the
dim() call and print the matrices, we will see why.  Moreover, the model
matrix constructed from y~N:M appears to have a redundant column.
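
Dropping dim() and looking at the column names makes the difference
visible. The following is a minimal sketch that assumes the same gl()
design as above; the set.seed() call and the object names X_colon,
X_star, rank_colon and rank_star are illustrative additions, not part of
the original session.

```r
# Rebuild the example data (the seed is an arbitrary choice for reproducibility)
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

X_colon <- model.matrix(y ~ N:M, dat)    # interaction term only
X_star  <- model.matrix(y ~ N * M, dat)  # main effects plus interaction

# Column names show how each formula was expanded into dummy variables
colnames(X_colon)
colnames(X_star)

# Numerical rank of each matrix; a rank below ncol() means redundant columns
rank_colon <- qr(X_colon)$rank
rank_star  <- qr(X_star)$rank
c(ncol(X_colon), rank_colon, ncol(X_star), rank_star)
```

Because all four N-by-M cells occur in the data, y~N:M produces an
intercept plus one indicator column per cell. The four indicators add up
to the intercept column, which is why one of the five columns is
redundant.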

Furthermore:

R> D1 <- model.matrix(y~N*M, dat)
R> D2 <- model.matrix(y~N:M, dat)
R> resid(lm(D1~D2-1))

This shows that the column space spanned by the model matrix of y~N*M is
completely contained in the column space spanned by the model matrix of
y~N:M (all residuals are zero up to rounding), and it is easy to check
that the reverse is also true.  So it seems to me that y~N:M and y~N*M
actually fit the same model.  To see how to construct one design matrix
from the other, try:

R> lm(D1~D2-1)
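
The column-space argument can also be checked numerically in both
directions. This is only a sketch under the same assumptions as the
session above; the seed and the names res_1_on_2 and res_2_on_1 are
made up for illustration.

```r
# Same example design as in the session above (seed chosen arbitrarily)
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

D1 <- model.matrix(y ~ N * M, dat)
D2 <- model.matrix(y ~ N:M, dat)

# Regress each design matrix on the other, column by column.
# Residuals that vanish (up to rounding) mean every column of the
# response matrix lies in the column space of the predictor matrix.
res_1_on_2 <- resid(lm(D1 ~ D2 - 1))
res_2_on_1 <- resid(lm(D2 ~ D1 - 1))

# Largest absolute residual in each direction
max(abs(res_1_on_2))
max(abs(res_2_on_1))
```

Both maxima should be zero apart from floating-point noise, which is the
"reverse is also true" part of the argument.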

Thus, I guess the answer is that y~N+M fits a model with main terms
only, while y~N:M and y~N*M fit the same model, namely one with main
and interaction terms.  The two formulations just create different
design matrices, which has to be taken into account when interpreting
the estimates.
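
One way to see "same model, different parameterisation" is to compare
fitted values and coefficients directly. The sketch below uses lm() on
the example data for simplicity; the seed and the names fit_star,
fit_colon and same_fit are assumptions introduced here.

```r
# Same example design as above (seed chosen arbitrarily)
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

fit_star  <- lm(y ~ N * M, dat)
fit_colon <- lm(y ~ N:M, dat)

# Same column space, hence the same fitted values
same_fit <- isTRUE(all.equal(fitted(fit_star), fitted(fit_colon)))
same_fit

# The coefficients are parameterised differently; the redundant
# column of y ~ N:M is reported as NA
coef(fit_star)
coef(fit_colon)
```

The fitted values agree, but the individual estimates do not mean the
same thing in the two fits, so the coefficient tables have to be read
against their own design matrix.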

Of course, all the above assumes that N and M are actually factors,
something that Birgit did not specify.  If N and M (or only one of
them) are numeric vectors, then the constructed matrices might be
different, but this is left as an exercise. ;-)  (Apparently, if N and
M are both numeric, then your summary is pretty much correct.)

Cheers,

Berwin

=== Full address ===
Berwin A Turlach                            Tel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability        +65 6515 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7         e-mail: [EMAIL PROTECTED]
Singapore 117546                    http://www.stat.nus.edu.sg/~statba



Re: [R] glm model syntax

2008-05-16 Thread Doran, Harold
Dear Berwin:

Indeed, it seems I was incorrect. Using your data, it seems that only in
the case that the variables are numeric would my earlier statements be
true, as you note. For example, if we did

lm(y ~ as.numeric(N)+as.numeric(M), dat)
lm(y ~ as.numeric(N)*as.numeric(M), dat)
lm(y ~ as.numeric(N):as.numeric(M), dat) 

Then the latter two are different, but only under the coercion to
numeric.
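
The numeric case can be inspected through the model matrices as well.
This is a sketch, not part of the original exchange: the seed and the
helper columns Nn and Mn are hypothetical names introduced here, and
as.numeric() on a factor simply returns its integer level codes.

```r
# Same example design as in Berwin's session (seed chosen arbitrarily)
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

# as.numeric() on a factor returns its integer level codes (here 1 and 2)
dat$Nn <- as.numeric(dat$N)
dat$Mn <- as.numeric(dat$M)

X_plus  <- model.matrix(y ~ Nn + Mn, dat)  # intercept, Nn, Mn
X_star  <- model.matrix(y ~ Nn * Mn, dat)  # intercept, Nn, Mn, Nn:Mn
X_colon <- model.matrix(y ~ Nn:Mn, dat)    # intercept, Nn:Mn (the product)

# Number of columns produced by each formula
n_cols <- c(plus = ncol(X_plus), star = ncol(X_star), colon = ncol(X_colon))
n_cols
```

With numeric predictors, Nn:Mn contributes a single product column, so
y ~ Nn:Mn really does contain only the interaction (plus the intercept)
and is a smaller model than y ~ Nn*Mn.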
