On 16/02/18 15:28, Bert Gunter wrote:

This is really a statistical issue. What do you think the Intercept term
represents? See ?contrasts.

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

It's a *little* bit R-ish in that the results depend on the parametrisation of the model, which by default in R is formed using the so-called "treatment" contrasts.

To wander from R into statistics (sorry Bert) the problem arises because
the "usual" parametrisation of the model is the "over-parametrised" form:

Y_ij = mu + beta_i + E_ij (i = 1, ..., I, j = 1, ..., J_i)

where Y_ij is the j-th observation corresponding to the i-th "treatment" or group. (Things get a bit more complicated in "multi-way" models; let's not go there.)

The parameter "mu" is the "grand mean" and the "beta_i" are the "treatment" effects.

The trouble with this parametrisation is that the parameters are meaningless! (The usual jargon is to say that they are "not estimable", but "meaningless" is a more accurate description.)

In order to ascribe unique values to the parameters, one must apply a "constraint". With the "treatment contrasts" the constraint is that
beta_1 = 0.

As a result the mean for the first treatment is mu, that for the second
treatment is mu + beta_2, and so on.

Consequently the SE corresponding to "(Intercept)" is the SE of the estimated mean for treatment 1. The SE corresponding to beta_2 is the SE of the estimated *difference* between the mean for treatment 2 and that for treatment 1, and so on.
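To make that concrete, here is a small sketch with made-up data (the names `grp`, `y` and `fit` are just for illustration) confirming that, under the default treatment contrasts, "(Intercept)" estimates the first group's mean and the remaining coefficients estimate differences from that group:

```r
## Hypothetical example: under R's default treatment contrasts the
## intercept estimates the mean of the baseline (first) group, and the
## other coefficients estimate differences from that baseline.
set.seed(42)
grp <- factor(rep(c("t1", "t2", "t3"), each = 20))
y   <- rnorm(60, mean = c(10, 12, 15)[as.integer(grp)])
fit <- glm(y ~ grp)            # gaussian family: same fit as lm()
means <- tapply(y, grp, mean)  # per-group sample means

## (Intercept) is the t1 mean; grpt2 is mean(t2) - mean(t1), etc.
all.equal(unname(coef(fit)["(Intercept)"]), unname(means["t1"]))
all.equal(unname(coef(fit)["grpt2"]), unname(means["t2"] - means["t1"]))
```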

Frequently the constraint beta_1 + ... + beta_I = 0 is used. This sort of treats all of the beta_i equally. At this point it gets R-ish again. You can impose the foregoing constraint by using either "sum" or "Helmert" contrasts. You can get the "more natural", "fully" (rather than "over") parametrised model via the syntax:

    g <- glm(a ~ 0 + b, data=df)

or

    g <- glm(a ~ b - 1, data=df)

This syntax actually imposes the constraint that mu = 0.

(Here "contrasts" don't get involved --- a bit counterintuitive, that.)

The `a ~ 0 + b` and `a ~ b - 1` forms will give you estimates labelled "b0", "b1" and "b2" (nothing labelled "(Intercept)"); these estimates are of the treatment means, and the SEs are straightforward to interpret.
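A sketch with data simulated like Marc's (quoted below) shows this: the "no intercept" form estimates the group means directly, one row of the coefficient table per group.

```r
## Sketch using data like Marc's: the "no intercept" form gives one
## estimated mean (and its SE) per group.
set.seed(123)
a <- runif(100)
b <- sample(c("0", "1", "2"), size = 100, replace = TRUE)
fit0 <- glm(a ~ 0 + b)
summary(fit0)$coefficients  # rows b0, b1, b2: a mean and SE per group

## The estimates are exactly the per-group sample means.
all.equal(unname(coef(fit0)), unname(tapply(a, factor(b), mean)))
```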

There *is* a rationale for the use of the over-parametrised model, but I won't try to explain it here. I've raved on long enough.

Marc:  I hope this helps.

cheers,

Rolf

P.S. It is bad form to call a data frame "df" since this is the name of the density function for the family of F-distributions. There are circumstances in which such usage can lead to error messages which are impossible to interpret, e.g. "object of type 'closure' is not subsettable". (!!!)
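To see the pitfall, try this in a session in which you have *not* defined a data frame called `df`:

```r
## In a fresh R session `df` already exists: it is stats::df, the
## density *function* of the F-distribution.  Treating it as a data
## frame then fails with a famously cryptic message.
exists("df")  # TRUE, even before you have defined anything
try(df$x)     # Error: object of type 'closure' is not subsettable
```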

R.


On Thu, Feb 15, 2018 at 5:27 PM, Marc Girondot via R-help <r-help@r-project.org> wrote:

Dear R-er,

I try to get the standard error of fitted parameters for factors with a
glm, even the reference one:

a <- runif(100)
b <- sample(x=c("0", "1", "2"), size=100, replace = TRUE)
df <- data.frame(A=a, B=b, stringsAsFactors = FALSE)

g <- glm(a ~ b, data=df)
summary(g)$coefficients

# I don't get SE for the reference factor, here 0:

               Estimate Std. Error    t value     Pr(>|t|)
(Intercept)  0.50384827 0.05616631  8.9706490 2.236684e-14
b1          -0.03598386 0.07496151 -0.4800311 6.322860e-01
b2           0.03208039 0.07063113  0.4541962 6.507023e-01

# Then I try to change the order of factors, for example:

df$B[df$B=="0"] <- "3"
g <- glm(a ~ b, data=df)
summary(g)$coefficients

But I get the same...

Any idea?

Thanks

Marc

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
