Re: [R] glm model syntax
Thanks a lot for your explanations.

Just to complete this: I am using glm with a quasi-Poisson distribution for count data variables, and I still have trouble interpreting the table that I get back. But that is probably more a problem of my lacking statistical knowledge.

Greetings,

Birgit

On 16.05.2008 at 19:10, Doran, Harold wrote:

> Dear Berwin:
>
> Indeed, it seems I was incorrect. Using your data, it seems that my
> earlier statements would be true only in the case that the variables
> are numeric, as you note. For example, if we did
>
>   lm(y ~ as.numeric(N)+as.numeric(M), dat)
>   lm(y ~ as.numeric(N)*as.numeric(M), dat)
>   lm(y ~ as.numeric(N):as.numeric(M), dat)
>
> then the latter two are different, but only under the coercion to
> numeric.

Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]

175 Jahre UZH
«staunen.erleben.begreifen. Naturwissenschaft zum Anfassen.»
MNF anniversary event for young and old.
19 April 2008, 10:00 to 02:00
Campus Irchel, Winterthurerstrasse 190, 8057 Zürich
Further information: http://www.175jahre.uzh.ch/naturwissenschaft

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] glm model syntax
Hello R users!

What is the difference between

  glm(A~N+M)
  glm(A~N:M)
  glm(A~N*M)

Thanks in advance.

Birgit
Re: [R] glm model syntax
N+M gives only the main effects, N:M gives only the interaction, and N*M gives the main effects and the interaction.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Birgit Lemcke
Sent: Friday, May 16, 2008 11:27 AM
To: R Hilfe
Subject: [R] glm model syntax

> Hello R users!
>
> What is the difference between
>
>   glm(A~N+M)
>   glm(A~N:M)
>   glm(A~N*M)
>
> Thanks in advance.
>
> Birgit
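One way to check how R expands the three formulas is to ask the formula parser for its term labels. This is only a minimal sketch: `term_labels` is a hypothetical helper name, and the output describes the terms in the formula, not the columns that end up in the design matrix (those depend on whether N and M are factors or numeric).

```r
# Hypothetical helper: list the terms that R derives from a formula.
# terms() works symbolically, so no data are needed here.
term_labels <- function(f) attr(terms(f), "term.labels")

labels_add   <- term_labels(A ~ N + M)  # the two main-effect terms
labels_colon <- term_labels(A ~ N:M)    # the single N:M term
labels_star  <- term_labels(A ~ N * M)  # expands to N + M + N:M

print(labels_add)
print(labels_colon)
print(labels_star)
```

The same helper can be applied to any other formula to see how operators such as `*` are expanded before a model is fitted.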
Re: [R] glm model syntax
G'day Harold,

On Fri, 16 May 2008 11:43:32 -0400 "Doran, Harold" [EMAIL PROTECTED] wrote:

> N+M gives only the main effects, N:M gives only the interaction, and
> N*M gives the main effects and the interaction.

I guess this begs the question what you mean with "N:M gives only the interaction" ;-)

Consider:

R> (M <- gl(2, 1, length=12))
 [1] 1 2 1 2 1 2 1 2 1 2 1 2
Levels: 1 2
R> (N <- gl(2, 6))
 [1] 1 1 1 1 1 1 2 2 2 2 2 2
Levels: 1 2
R> dat <- data.frame(y=rnorm(12), N=N, M=M)
R> dim(model.matrix(y~N+M, dat))
[1] 12  3
R> dim(model.matrix(y~N:M, dat))
[1] 12  5
R> dim(model.matrix(y~N*M, dat))
[1] 12  4

Why does the model matrix of y~N:M have more columns than the model matrix of y~N*M if the former contains the interactions only and the latter contains main terms and interactions? Of course, if we drop the dim() call and look at the matrices themselves, we will see why. Moreover, it seems that the model matrix constructed from y~N:M has a redundant column.

Furthermore:

R> D1 <- model.matrix(y~N*M, dat)
R> D2 <- model.matrix(y~N:M, dat)
R> resid(lm(D1~D2-1))

The residuals are all (numerically) zero, which shows that the column space of the model matrix of y~N*M is completely contained within the column space of the model matrix of y~N:M, and it is easy to check that the reverse is also true. So it seems to me that y~N:M and y~N*M actually fit the same model. To see how to construct one design matrix from the other, try:

R> lm(D1~D2-1)

Thus, I guess the answer is that y~N+M fits a model with main terms only, while y~N:M and y~N*M fit the same model, namely a model with main and interaction terms. These two formulations just create different design matrices, which has to be taken into account if one tries to interpret the estimates.

Of course, all the above assumes that N and M are actually factors, something that Birgit did not specify. If N and M (or only one of them) are numeric vectors, then the constructed matrices might be different, but this is left as an exercise. ;-) (Apparently, if N and M are both numeric, then your summary is pretty much correct.)

Cheers,

        Berwin

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability        +65 6515 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7         e-mail: [EMAIL PROTECTED]
Singapore 117546                    http://www.stat.nus.edu.sg/~statba
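The column-space argument can also be checked through matrix ranks and fitted values. The following is a sketch under stated assumptions rather than part of the original post: the seed is arbitrary, and the names `rank_D1`, `rank_D2`, `rank_both`, `fit_star`, `fit_colon` and `same_fit` are introduced here for illustration only.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

D1 <- model.matrix(y ~ N * M, dat)  # 4 columns
D2 <- model.matrix(y ~ N:M, dat)    # 5 columns, one of them redundant

# If both matrices and the two placed side by side all have the same
# rank, then the two column spaces coincide.
rank_D1   <- qr(D1)$rank
rank_D2   <- qr(D2)$rank
rank_both <- qr(cbind(D1, D2))$rank

# Identical column spaces imply identical fitted values.
fit_star  <- lm(y ~ N * M, dat)
fit_colon <- lm(y ~ N:M, dat)
same_fit  <- isTRUE(all.equal(fitted(fit_star), fitted(fit_colon)))

print(c(rank_D1, rank_D2, rank_both))
print(same_fit)
print(coef(fit_colon))  # the redundant column is reported as NA
```

Because lm() drops the aliased column on its own, the coefficients of the two fits differ in meaning even though the fitted values agree, which is the interpretation caveat raised in the message.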
Re: [R] glm model syntax
Dear Berwin:

Indeed, it seems I was incorrect. Using your data, it seems that my earlier statements would be true only in the case that the variables are numeric, as you note. For example, if we did

  lm(y ~ as.numeric(N)+as.numeric(M), dat)
  lm(y ~ as.numeric(N)*as.numeric(M), dat)
  lm(y ~ as.numeric(N):as.numeric(M), dat)

then the latter two are different, but only under the coercion to numeric.

-----Original Message-----
From: Berwin A Turlach [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 16, 2008 12:27 PM
To: Doran, Harold
Cc: Birgit Lemcke; R Hilfe
Subject: Re: [R] glm model syntax
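For the numeric case, a short sketch along the same lines shows that the `:` and `*` formulas then produce different design matrices and different fits. The seed and the column names `n` and `m` are assumptions introduced here; they simply hold the factor codes coerced with as.numeric().

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

# Hypothetical numeric copies of the factor codes (values 1 and 2).
dat$n <- as.numeric(dat$N)
dat$m <- as.numeric(dat$M)

cols_add   <- colnames(model.matrix(y ~ n + m, dat))
cols_star  <- colnames(model.matrix(y ~ n * m, dat))
cols_colon <- colnames(model.matrix(y ~ n:m, dat))

fit_star  <- lm(y ~ n * m, dat)
fit_colon <- lm(y ~ n:m, dat)
same_fit  <- isTRUE(all.equal(fitted(fit_star), fitted(fit_colon)))

print(cols_add)
print(cols_star)
print(cols_colon)
print(same_fit)
```

With numeric variables, `n:m` contributes a single product column next to the intercept, so that model has two parameters, while `n * m` has four.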