Re: [R] A model-building strategy in mixed-effects modelling

2010-01-19 Thread Ben Bolker
Stats Wolf stats.wolf at gmail.com writes:

 
 Dear all,
 
 Consider a completely randomized block design (let's use data(Oats)
 irrespoctive of the split-plot design it was arranged in). Look:
 
 library(nlme)
 fit - lme(yield ~ nitro, Oats, random = ~1|Block, method=ML)
 fit2 - lm(yield ~ nitro + Block, Oats)
 anova(fit, fit2)
 gives this:
  Model df  AIC  BIClogLik   Test  L.Ratio p-value
 fit  1  4 624.3245 633.4312 -308.1623
 fit2 2  8 611.9309 630.1442 -297.9654 1 vs 2 20.39366   4e-04
 
 Clearly, considering block a random term is worse than considering it
 a fixed term. 

  I have technical concerns with this step (and larger
philosophical concerns with the whole approach -- see below).
It's not at all clear that anova() applied to these two models
makes sense -- in fact I think it doesn't, because the two
models aren't nested. (I guess technically I could take the
limit as the random effects variance goes to infinity,
or 1/variance - 0, to get from the random to the fixed
specification.  Then the LRT would be suspect on the grounds
that the null hypothesis (1/variance=0) is on the boundary of
the allowable region, but that will typically only bias the
p-value by a factor of 2 ...)  Furthermore, I'm not 100% convinced
that the likelihoods computed by lm() and lme() are comparable,
constant terms are often left out.
  But let's suppose this comparison is OK ...


Let's see if blocking should be included in the model at
 all:
 
 fit3 - lm(yield ~nitro, Oats)
 anova(fit2,fit3)
 
 which gives a very small P value in favor of fit2, which suggests the
 block term should be included. So, I go for the second model, with
 block considered fixed.
 
 Is this indeed how I should generally proceed when choosing the
 optimum model for a situation that calls for mixed effects? Of course,
 the example above is overly simplistic, yet such situations can occur

  I think in general that you should decide on random vs fixed
on philosophical and practical grounds, _a priori_ ... if you're going
to be Bayesian (see various discussions by Andrew Gelman on the subject)
then you don't have any philosophical problems at all, because fixed
and pooled are just two ends of a spectrum (variance of the priors
of the effects fixed at infinity and zero, respectively), and it's
just a practical question of estimation.  Doing a lot of hypothesis
testing to decide on the best model starts down the road of
data-dredging ...

  In general, questions like this might get more attention
on r-sig-mixed-mod...@lists.r-project.org

  cheers
Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] A model-building strategy in mixed-effects modelling

2010-01-18 Thread Stats Wolf
Dear all,

Consider a completely randomized block design (let's use data(Oats)
irrespoctive of the split-plot design it was arranged in). Look:

library(nlme)
fit - lme(yield ~ nitro, Oats, random = ~1|Block, method=ML)
fit2 - lm(yield ~ nitro + Block, Oats)
anova(fit, fit2)
gives this:
 Model df  AIC  BIClogLik   Test  L.Ratio p-value
fit  1  4 624.3245 633.4312 -308.1623
fit2 2  8 611.9309 630.1442 -297.9654 1 vs 2 20.39366   4e-04

Clearly, considering block a random term is worse than considering it
a fixed term. Let's see if blocking should be included in the model at
all:

fit3 - lm(yield ~nitro, Oats)
anova(fit2,fit3)

which gives a very small P value in favor of fit2, which suggests the
block term should be included. So, I go for the second model, with
block considered fixed.

Is this indeed how I should generally proceed when choosing the
optimum model for a situation that calls for mixed effects? Of course,
the example above is overly simplistic, yet such situations can occur
-- from a complex model with a couple of random terms one can finally
get to a simple fixed-effects model. Please comment.

Thanks in advance,

Stats Wolf

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.