Thanks for your reply.

It might be good to document the naming convention in ?contrasts. It is hard to 
guess that .L stands for linear, .Q for quadratic, .C for cubic and ^n for 
higher degrees.
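
For instance, a quick check of the names in question. This is only a sketch using contr.poly() from the stats package; the choice of 6 levels is arbitrary:

```r
# Column names that contr.poly() assigns for a 6-level ordered factor.
# The first three polynomial terms get letters, higher degrees get ^n.
cn <- colnames(contr.poly(6))
print(cn)
```

In a model matrix these suffixes are appended to the variable name, so they only make sense once you know the convention.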

For contr.sum, we could have used .Sum<level1>, .Sum<level2>…
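
To illustrate what I mean, here is a rough sketch. The wrapper contr.sum.named() is hypothetical (it does not exist in base R) and the ".Sum" prefix is just my suggestion; it only relabels the columns of the matrix returned by contr.sum():

```r
# Hypothetical wrapper, not part of base R: contr.sum() with columns
# labelled after the levels they correspond to.
contr.sum.named <- function(n, ...) {
  contr <- contr.sum(n, ...)
  # contr.sum() accepts either a number of levels or a vector of level
  # names; columns can only be labelled when names are supplied.
  if (length(n) > 1L) {
    lev <- as.character(n)
    colnames(contr) <- paste0(".Sum", lev[seq_len(ncol(contr))])
  }
  contr
}

# Made-up data in the spirit of the ?glm example
d <- data.frame(counts  = c(18, 17, 15, 20, 10, 20, 25, 13, 12),
                outcome = factor(paste0("O", gl(3, 1, 9))))

# Pass the labelled contrast matrix directly via contrasts.arg
cm <- contr.sum.named(levels(d$outcome))
X  <- model.matrix(counts ~ outcome, data = d,
                   contrasts.arg = list(outcome = cm))
print(colnames(X))
```

Passing the matrix itself, rather than a function name, avoids any question of where model.matrix() looks up the contrast function.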

Maybe the examples in ?model.matrix should use named levels in the dd object 
so that we can observe when the level names are dropped.
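
Something along these lines would make it visible. This is a sketch only: the data frame mimics a small balanced two-way layout, and the level labels are made up for illustration:

```r
# Balanced two-way layout with labelled (non-numeric) factor levels.
# The labels below are invented for this illustration.
dd <- data.frame(a = gl(3, 4, labels = c("low", "mid", "high")),
                 b = gl(4, 1, 12, labels = c("w", "x", "y", "z")))

# Default (treatment) contrasts: columns carry the level names
cn_trt <- colnames(model.matrix(~ a + b, dd))

# Sum contrasts on 'a': its columns fall back to the indexes 1, 2
cn_sum <- colnames(model.matrix(~ a + b, dd,
                                contrasts.arg = list(a = "contr.sum")))
print(cn_trt)
print(cn_sum)
```

With integer-like levels such as those produced by a plain gl(), the two namings look almost the same, which hides the difference.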

Kind regards, Christophe


> On 14 Jun 2024, at 11:45, peter dalgaard <pda...@gmail.com> wrote:
> 
> You're at the mercy of the various contr.XXX functions. They may or may not 
> set the colnames on the matrices that they generate. 
> 
> The rationale for (not) setting them is not perfectly transparent, but you 
> obviously cannot use level names on contr.poly, so it uses .L, .Q, etc. 
> 
> In MASS, contr.sdif is careful about labeling the columns with the levels 
> that are being diff'ed. 
> 
> For contr.treatment, there is a straightforward connection to 0/1 dummy 
> variables, so level names there are natural.
> 
> One could use levels in contr.sum and contr.helmert, but it might confuse 
> users that comparisons are with the average of all levels or preceding 
> levels. (It can be quite confusing when coding is +1 for male and -1 for 
> female, so that the gender difference is twice the coefficient.)
> 
> -pd
> 
>> On 14 Jun 2024, at 08:12 , Christophe Dutang <duta...@gmail.com> wrote:
>> 
>> Dear list,
>> 
>> Changing the default contrasts used in glm() made me aware of how 
>> model.matrix() sets column names.
>> 
>> With the default contrasts, model.matrix() uses the level values to name the 
>> columns. However, with other contrasts, model.matrix() uses the level indexes. 
>> I don’t see anything in the documentation related to this. It does not seem 
>> natural to have such behavior.
>> 
>> Any comment is welcome.
>> 
>> An example is below.
>> 
>> Kind regards, Christophe  
>> 
>> 
>> #example from ?glm
>> counts <- c(18,17,15,20,10,20,25,13,12)
>> outcome <- paste0("O", gl(3,1,9))
>> treatment <- paste0("T", gl(3,3))
>> 
>> X3 <- model.matrix(counts ~ outcome + treatment)
>> X4 <- model.matrix(counts ~ outcome + treatment,
>>                    contrasts.arg = list(outcome = "contr.sum"))
>> X5 <- model.matrix(counts ~ outcome + treatment,
>>                    contrasts.arg = list(outcome = "contr.helmert"))
>> 
>> #check with original factor
>> cbind.data.frame(X3, outcome)
>> cbind.data.frame(X4, outcome)
>> cbind.data.frame(X5, outcome)
>> 
>> #same issue with glm
>> glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
>> glm.D94 <- glm(counts ~ outcome + treatment, family = poisson(),
>>                contrasts = list(outcome = "contr.sum"))
>> glm.D95 <- glm(counts ~ outcome + treatment, family = poisson(),
>>                contrasts = list(outcome = "contr.helmert"))
>> 
>> coef(glm.D93)
>> coef(glm.D94)
>> coef(glm.D95)
>> 
>> #check linear predictor
>> cbind(X3 %*% coef(glm.D93), predict(glm.D93))
>> cbind(X4 %*% coef(glm.D94), predict(glm.D94))
>> 
>> -------------------------------------------------
>> Christophe DUTANG
>> LJK, Ensimag, Grenoble INP, UGA, France
>> ILB research fellow
>> Web: http://dutangc.free.fr
>> 
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd....@cbs.dk  Priv: pda...@gmail.com
> 
