What Titus said is correct. The only time the intercept will correspond to the grand mean in a generalized linear model is when you are using the identity link.

Basically, what is going on in a logit model is that the model assumes the random intercepts are normally distributed around the fixed intercept on the *logit* scale. So your intercept is the "conditional" estimate of the grand mean (conditional on a random effect of zero). This can differ from the "marginal" mean (the mean averaging over the distribution of your random effects). Indeed, the further the intercept is from zero in log odds space, the more the conditional and marginal means will diverge, due to the non-linear projection from log odds to probability space. The divergence doesn't mean there is anything wrong with your analysis; they are just different ways of looking at the same data. If you want to do your analysis on the marginal values, you would need to use a different approach: Generalized Estimating Equations (GEE).
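
Here is a minimal sketch of that divergence (the numbers are hypothetical, not from your data): random intercepts that are symmetric around the fixed intercept on the logit scale are no longer symmetric after the inverse-logit transform, so the marginal mean is pulled toward .5.

   b0 <- 2                       # conditional intercept in logits (made up)
   u  <- rnorm(1e5, sd=1.5)      # simulated random intercepts
   plogis(b0)                    # conditional mean in probability space, ~.88
   mean(plogis(b0 + u))          # marginal mean, noticeably closer to .5
   qlogis(mean(plogis(b0 + u)))  # back in logit space: smaller than b0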

You can read about "subject-specific" vs. "population average" models in Raudenbush & Bryk, Hierarchical Linear Models. I think there is also some discussion in Applied Longitudinal Analysis, the textbook by Fitzmaurice, Laird, and Ware.

On 24/09/15 20:33, Titus von der Malsburg wrote:

Hi Maria,

below is some code that illustrates what I think is going on.  The
central point is that the mean of the logit of two values (condition
means) is not the same as the logit of the mean of those values.

   # Generating the data:
   i <- c(135, 263-135, 250, 272-250, 139, 266-139, 244, 267-244)
   x <- data.frame(A=rep(c(-1,-1, 1, 1,-1,-1, 1, 1), times=i),
                   B=rep(c(-1,-1,-1,-1, 1, 1, 1, 1), times=i),
                   y=rep(c( 1, 0, 1, 0, 1, 0, 1, 0), times=i))

   # Sanity check:
   table(x$y)

   logit <- function(p) log(p/(1-p))
   # The logit of the ordinary mean (not exactly the logit of the grand
   # mean due to the imbalance):
   logit(768 / 1068)
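
   # For comparison, the logit of the unweighted grand mean, i.e. the
   # mean of the four cell proportions (the same value turns up again in
   # the check for B further down):
   logit(mean(c(135/263, 250/272, 139/266, 244/267)))  # 0.9306637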

   # The models that you mentioned, but using glm not lrm:
   m1 <- glm(y ~ 1,   data=x, family=binomial())
   m2 <- glm(y ~ A,   data=x, family=binomial())
   m3 <- glm(y ~ B,   data=x, family=binomial())
   m4 <- glm(y ~ A*B, data=x, family=binomial())
   # The results come out as you reported.
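
   # To see the four intercepts side by side (m1 and m3 near .94, m2 and
   # m4 near 1.23):
   sapply(list(m1=m1, m2=m2, m3=m3, m4=m4),
          function(m) coef(m)["(Intercept)"])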

   # Note that B does not have an effect:
   summary(m4)

   # Since B doesn't have an effect, it's unsurprising that the intercepts
   # of the first and the third model are almost the same.  Part of the
   # reason is that when the condition means are (nearly) equal, we get the
   # special case where the mean of the logit means really is the same as
   # the logit of the grand mean.  Let's check that:

   # Condition means:
   xx <- with(x, tapply(y, list(B,A), mean))

   # Contrast for B:
   b1 <- mean(xx[1,])
   b2 <- mean(xx[2,])
   logit(mean(c(b1, b2))) # 0.9306637
   mean(logit(c(b1, b2))) # 0.930669

   # But why are the intercepts of m2 and m4 different?

   # Contrast for A:
   a1 <- mean(xx[,1])
   a2 <- mean(xx[,2])
   logit(mean(c(a1, a2)))  # 0.9306637
   mean(logit(c(a1, a2)))  # 1.233657

So the point is that the non-linear logit transform has to be taken into
account when interpreting the intercept of a logistic regression
model.  Apparently you assumed that the intercept would be the logit of
the grand mean when in fact it is the mean of the logit means.  It’s an
easy mistake to make.
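
As a further check (using the logit function and the cell means xx from the code above): with the ±1 coding, the intercept of the saturated model m4 is exactly the unweighted mean of the four cell logits, which is why it lands near 1.23 rather than .94.

   coef(m4)["(Intercept)"]  # ~1.234
   mean(logit(xx))          # ~1.234, identical to the intercept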

Hope that helps.

   Titus

On 2015-09-24 Thu 09:42, m...@interfree.it wrote:


Hello everyone,

I need to pick your brains! I am running a mixed-effects logistic regression model (glmer) on some binomial data, where the dependent variable is coded as either 0 (failure) or 1 (success). I have two variables A and B, each with 2 levels.

I have applied a contr.sum(2) coding to both variables because I want the output to be interpreted as an ANOVA - so the intercept should represent the grand mean (in logits) and the main effects should represent the deviations from the grand mean (correct me if I am wrong).
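
For reference, this is the coding that contr.sum(2) assigns to the two levels (+1 for the first level, -1 for the second):

   contr.sum(2)
   #   [,1]
   # 1    1
   # 2   -1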

To double-check that I am using the correct contrast coding, I run lrm models on the same data alongside the glmer models - because the lrm model does not have random effects, the intercept it estimates should correspond to the actual grand mean, am I right? I have 768 successes and 300 failures, so the total number of answers is 1068, which should make a grand mean in logits of .94.

When I run the lrm model lrm(answer~A*B), I do NOT get the grand mean of .94 in the intercept, but something much higher: 1.23. When I run lrm models with only one predictor, I get an intercept of .94 (the grand mean) when I include only B, and one of 1.23 when I include only A (the same intercept as when I include both predictors and their interaction). I do not understand why.

Below I give the data breakdown by condition - the numerator gives the number of successes and the denominator the total number of answers in the condition:

              A level 1   A level 2
   B level 1    135/263     250/272
   B level 2    139/266     244/267


First I thought that it might have to do with the fact that the data set is not fully balanced. However, I get the same thing when I use new variables A and B that have been 'scaled as numeric' (I believe this scaling addresses the imbalance).

Here is the distribution of the number of observations across conditions:

              A level 1   A level 2
   B level 1        263         272
   B level 2        266         267

Does anyone have any clue as to why this is happening? I need to understand what I am doing, because later I need to analyse (actually re-analyse) some more complex data. I have only started to use contrasts recently (partly as a result of reading "Parsimonious mixed models" by Bates et al.) and I have found that using (appropriate) contrasts makes a big difference to the ease of convergence of a model, at least on my data.

Thanks!

Maria Nella Carminati




--
Dale Barr
Institute of Neuroscience and Psychology
University of Glasgow
58 Hillhead Street
Glasgow G12 8QB
