I'm puzzled by a phenomenon that I have just encountered, and was hoping that someone out there might be able to give me some insight. It's not an R problem as such, but ..... well, there are R aspects.
I have a rather messy data set, with a response, a fixed effect, and 3 random effects: school, class, and student. The classes are nested within schools; the students are nested within schools; students are *not* nested within classes. The fixed effect is ``time'', with 6 levels. There are 1428 observations. The ``design'' (the data are from an observational study) is vastly imbalanced; there are brazillions of empty cells. I tried fitting two models: (1) y ~ time + school + class%in%school + student%in%school (2) y ~ time + cls.in.scl + std.in.scl where I formed the factors ``cls.in.scl'' and ``std.in.scl'' by using the interaction() function: cls.in.scl <- interaction(class,school,sep=".in.") std.in.scl <- interaction(student,school,sep=".in.") This was one way that I could think of to fit the model with the school factor dropped out (given that the levels of ``class'' and ``student'' were nested within the levels of ``school''). Now my puzzlement: The model matrix for (2) is singular; that for (1) is not. I can believe that the imbalance/existence of empty cells could easily make the model singular --- but since students and classes are nested within schools, I don't see how including a school factor could eliminate the singularity. I have been trying to concoct a low dimensional toy example of this phenomenon, and am not able to do so. Can anyone provide me with some enlightenment? I might remark that I have tried fitting the same two models, to the same data, in Minitab, and got the same sort of result --- model (1) ran to completion; model (2) came to a shuddering halt with a complaint about ``rank deficiency ... no further analysis will be done''. I mention this because it convinces me that the problem is not with my syntax for expressing the models --- I have never really managed to grok R's linear modelling syntax, but I feel comfortable/confident with Minitab's syntax. Minitab would write model (1) as (1) y ~ time + school + class(school) + student(school) While I'm on this topic --- a totally side-issue: I fitted the two models with lm(), i.e. completely ignoring the fact that school, class, and student are random effects (whence any correct inference from the fits is impossible). That's by-the-by, since my real concern at the moment is the singularity issue. But can any kind soul please tell me how I might *properly* fit the model in R? What package and function should I use, and what (very explicitly, in monosyllables, for I am a bear of very little brain and long words bother me) is the correct syntax for expressing the models for the aforesaid function? Yes, I have looked at lme4 and lmer(), and I have (tried to) read Pinheiro and Bates, but I couldn't get it. As I said, I am a bear of very little brain .... Thanks for taking the time to read this long-winded cri de coeur. If anyone is actually *interested* I would be very happy to make the data available to her or him. cheers, Rolf Turner ###################################################################### Attention:\ This e-mail message is privileged and confidenti...{{dropped}} ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.