Something has been bugging me. I also asked about it on Cross Validated but did not get a response. Let's construct a simple example. The code is below.

y <- rnorm(8)   # placeholder response (an assumption: y was not defined in the original post; its values do not affect the model matrix)
A <- gl(2, 4)   # factor with 2 levels
B <- gl(4, 2)   # factor with 4 levels
df <- data.frame(y, A, B)

As you can see, B is nested within A. The peculiar result I am interested in is the model matrix produced when I fit a nested model.

*How does R decide what is included in the intercept?*

Since we are using dummy coding, each coefficient of a single-factor model is interpreted as the difference between a particular level and the reference level, which is the intercept. I understand that for the model ~A, A1 becomes the intercept, and that for the model ~A+B, A1 and B1 together become the intercept.

*I do not get why, when we use a nested model, A1:B2 appears as a column in the model matrix. Why isn't the first parameter of the interaction subspace A1:B1 or A2:B1?*

I think I am missing the concept. I think the intercept is A1.

*Hence, why do we not compare A1:B1 with A1 (the intercept), or A2:B1 with A1 (the intercept)?*

# nested model
> mod <- aov(y ~ A + A:B)
> model.matrix(mod)
  (Intercept) A2 A1:B2 A2:B2 A1:B3 A2:B3 A1:B4 A2:B4
1           1  0     0     0     0     0     0     0
2           1  0     0     0     0     0     0     0
3           1  0     1     0     0     0     0     0
4           1  0     1     0     0     0     0     0
5           1  1     0     0     0     1     0     0
6           1  1     0     0     0     1     0     0
7           1  1     0     0     0     0     0     1
8           1  1     0     0     0     0     0     1

--
Yours sincerely,
Justin

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.