I know this is three months old, but I had put it on a mental TODO list to see whether it was something that could be fixed, and I finally got a little spare time for that sort of thing. It turns out that it can NOT be fixed (without fundamental design changes), so I thought I'd better record the conclusion for the archives.
The reason is pretty simple: It is terms.formula that decides which terms require contrast coding and which need dummy coding. It does this based on the formula structure _only_; i.e., it has no information on whether variables are factors or numerical. E.g., you can do > terms(~ a:x + a:b) ~a:x + a:b attr(,"variables") list(a, x, b) attr(,"factors") a:x a:b a 2 2 x 2 0 b 0 1 attr(,"term.labels") [1] "a:x" "a:b" attr(,"order") [1] 2 2 attr(,"intercept") [1] 1 attr(,"response") [1] 0 attr(,".Environment") <environment: R_GlobalEnv> in which there is no indication to terms() what kind of variable "x" is. So if you want different behavior if x is a factor than if it is continuous, you're out of luck... Not quite sure the last three lines of my note below make sense, though. -pd On Jan 29, 2012, at 11:24 , peter dalgaard wrote: > > On Jan 29, 2012, at 02:42 , Ben Bolker wrote: >>> >> >> (quoting removed to make Gmane happy) >> >> AFAICS, this is a bug. >> >> >> I think so too, although I haven't got my head around it yet. >> >> Chuck, are you willing to post a summary of this to r-devel >> for discussion ... and/or post a bug report? > > You have to be very specific about what the bug is, if any... I.e., which > precisely are the rules that are broken by the current behavior? > > Preferably also suggest a fix --- the corner cases of model.matrix and > friends is some of the more impenetrable code in the R sources. > > Notice that order dependent parametrization of terms is not a bug per se, nor > is the automatic switch to dummy coding of factors. Consider these cases: > > d <- cbind(expand.grid(a=c("a","b"),b=c("X","Y"),c=c("U","W")),x=1:8) > model.matrix(~ a:b + a:c, d) > model.matrix(~ a:c + a:b, d) > model.matrix(~ a:b + a:x, d) > model.matrix(~ a:x + a:b, d) > > and notice that the logic applying to "x" is the same as that applying to > "c". > > The crux seems to be that the model ~a:c contains the model ~a whereas ~a:x > does not, and hence the rationale for _not_ expanding a subsequent a:b term > to dummies (namely that ~a is "already in" the model) fails. > > -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Email: pd....@cbs.dk Priv: pda...@gmail.com > > > > > > > > -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd....@cbs.dk Priv: pda...@gmail.com ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.