Thanks for your reply. It might be good to document the naming convention in ?contrasts. It is not obvious that .L stands for linear, .Q for quadratic, .C for cubic and ^n for higher degrees.
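For reference, here is a minimal sketch of the column names that the standard contrast functions return, and of how model.matrix() turns them into coefficient names. The four-level vector `lev` and the factor `f` are only illustrations and are not part of the example quoted further down.

```r
# Illustrative levels (an assumption, not taken from the thread's example).
lev <- c("a", "b", "c", "d")

# contr.treatment labels its columns with the non-baseline levels.
colnames(contr.treatment(lev))   # "b" "c" "d"

# contr.sum and contr.helmert leave the column names unset.
colnames(contr.sum(lev))         # NULL
colnames(contr.helmert(lev))     # NULL

# contr.poly uses .L, .Q, .C, then ^n for degree 4 and above.
colnames(contr.poly(5))          # ".L" ".Q" ".C" "^4"

# model.matrix() appends the contrast column names to the variable name,
# and falls back to the column index when there are none.
f <- factor(lev)
mm_trt <- model.matrix(~ f)
mm_sum <- model.matrix(~ f, contrasts.arg = list(f = "contr.sum"))
colnames(mm_trt)                 # "(Intercept)" "fb" "fc" "fd"
colnames(mm_sum)                 # "(Intercept)" "f1" "f2" "f3"
```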
For contr.sum, we could have used .Sum<level1>, .Sum<level2>…

Maybe the examples in ?model.matrix should use level names in the dd object, so that we can see when names are dropped.

Kind regards, Christophe

> On 14 June 2024 at 11:45, peter dalgaard <pda...@gmail.com> wrote:
>
> You're at the mercy of the various contr.XXX functions. They may or may not
> set the colnames on the matrices that they generate.
>
> The rationale for (not) setting them is not perfectly transparent, but you
> obviously cannot use level names on contr.poly, so it uses .L, .Q, etc.
>
> In MASS, contr.sdif is careful about labeling the columns with the levels
> that are being diff'ed.
>
> For contr.treatment, there is a straightforward connection to 0/1 dummy
> variables, so level names there are natural.
>
> One could use levels in contr.sum and contr.helmert, but it might confuse
> users that comparisons are with the average of all levels or preceding
> levels. (It can be quite confusing when coding is +1 for male and -1 for
> female, so that the gender difference is twice the coefficient.)
>
> -pd
>
>> On 14 Jun 2024, at 08:12 , Christophe Dutang <duta...@gmail.com> wrote:
>>
>> Dear list,
>>
>> Changing the default contrasts used in glm() made me aware of how
>> model.matrix() sets column names.
>>
>> With default contrasts, model.matrix() uses the level values to name the
>> columns. However, with other contrasts, model.matrix() uses the level indexes.
>> I don't see anything in the documentation related to this. Such behavior
>> does not seem natural.
>>
>> Any comment is welcome.
>>
>> An example is below.
>>
>> Kind regards, Christophe
>>
>>
>> #example from ?glm
>> counts <- c(18,17,15,20,10,20,25,13,12)
>> outcome <- paste0("O", gl(3,1,9))
>> treatment <- paste0("T", gl(3,3))
>>
>> X3 <- model.matrix(counts ~ outcome + treatment)
>> X4 <- model.matrix(counts ~ outcome + treatment, contrasts =
>>   list("outcome"="contr.sum"))
>> X5 <- model.matrix(counts ~ outcome + treatment, contrasts =
>>   list("outcome"="contr.helmert"))
>>
>> #check with original factor
>> cbind.data.frame(X3, outcome)
>> cbind.data.frame(X4, outcome)
>> cbind.data.frame(X5, outcome)
>>
>> #same issue with glm
>> glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
>> glm.D94 <- glm(counts ~ outcome + treatment, family = poisson(), contrasts =
>>   list("outcome"="contr.sum"))
>> glm.D95 <- glm(counts ~ outcome + treatment, family = poisson(), contrasts =
>>   list("outcome"="contr.helmert"))
>>
>> coef(glm.D93)
>> coef(glm.D94)
>> coef(glm.D95)
>>
>> #check linear predictor
>> cbind(X3 %*% coef(glm.D93), predict(glm.D93))
>> cbind(X4 %*% coef(glm.D94), predict(glm.D94))
>>
>> -------------------------------------------------
>> Christophe DUTANG
>> LJK, Ensimag, Grenoble INP, UGA, France
>> ILB research fellow
>> Web: http://dutangc.free.fr
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd....@cbs.dk Priv: pda...@gmail.com