On Mon, Feb 25, 2008 at 1:46 PM, Peter Dalgaard
<[EMAIL PROTECTED]> wrote:
> Ben Bolker wrote:
> > stian mail.rockefeller.edu mail.rockefeller.edu> writes:
> >
> >> Hi,I am wondering how to conduct Kenward-Roger correction in
> >> the linear mixed model using R. Any idea?
> >>
> >> Thanks a lot,
> >>
> >> Suyan
> >>
> >
> > Not really possible, I'm afraid. Having already invested a huge
> > amount of time in making lme4 available (and this is the cutting-edge
> > version of linear mixed models for R), Doug Bates has declined to
> > spend effort implementing K-R because (1) he's not convinced of the
> > appropriateness of adjusting F-distribution degrees of freedom in
> > this way, (2) he doesn't think that the K-R algorithm will be
> > feasible for the sorts of large-data problems he's interested in,
> > (3) [can't find the reference] he finds the correspondence between
> > K-R's notation and his difficult.
Regarding (2), it is more an issue of the structure of the model than
of the size of the data set. One of the big differences between the
lme4 package and the nlme package, or other software for fitting
linear mixed models, is that lme4 is designed to handle models with
non-nested random effects. The random effects can be completely
crossed, such as in an experiment where a sample of subjects are each
exposed to the same selection of items and the model incorporates
random effects for both subject and item, or partially crossed where
annual test scores may be associated with both the student taking the
test and the teacher for the student that year.
It has been a while since I looked at the Kenward-Roger formulation
but my recollection is that it would be difficult to extend the
calculations to models with non-nested random effects. I am not
opposed to the method in principle - it's just that I am not about to
have the time to work on it myself in the foreseeable future. If
someone else wants to work it out then I say go for it.
> The last one could be due to me rather than Doug, and certainly, it is
> just an obstacle rather than anything prohibitive in itself. I tend to
> think that (2) is "nontrivial" rather than "infeasible", but it is true
> that the K-R formulas depend on items like the covariance matrix (say,
> Sigma) for the entire data set, which is at best implicitly available in
> large-data problems. Whether or not it would be possible to work with
> implicit representations of Sigma is what is difficult to see because of
> the notational differences (it's been a while, but my recollection is
> that Doug uses Sigma for a different purpose, and there are a couple of
> similarly confusing notational switches).
Beyond notation there is a question of emphasis in the formulation of
the model. Most of the theory of linear mixed models focuses on the
marginal distribution of the response random variable. I have found
it more useful to consider the distribution of the random effects,
conditional on the observed values of the response. My latest
attempts to explain the computational methods (slides at
http://www.stat.wisc.edu/~bates/2008-02-22-Emory.pdf) examine the
conditional distribution of the random effects - in particular, the
unnormalized conditional density of the random effects. In the case
of a linear mixed model the logarithm of this density is a quadratic
form. For a generalized linear mixed model or a nonlinear mixed model
or a generalized nonlinear mixed model the conditional density can be
optimized via a penalized IRLS or penalized NLS algorithm to obtain
the conditional means and the conditional variance-covariance matrix.
I consider a linear transformation, U, of the random effects, B, for
which Var(U) = sigma^2 I. By transforming to U I can incorporate the
effect of the variance component parameters into a (sparse) model
matrix A rather than into the variance-covariance matrix. The linear
predictor that was expressed as eta = X beta + Z b is now X beta + A'
u where A is the transpose of a sparse model matrix that is derived
from Z. Evaluation of the deviance and related quantities hinges on
calculating the sparse Cholesky factor L that satisfies LL' = AA' + I.
(There is also a fixed permutation matrix. P, involved in that
expression but it doesn't have any effect on the theory - although it
can have a tremendous effect on the speed of the calculation.)
One reason for expressing the model in this way is because it is
exactly the way that the calculations are carried out. There are
slots named X, Zt (for Z-transpose), A and L in the object returned by
lmer and they have exactly the meanings shown above.
It is possible to get an expression for the marginal
variance-covariance of the response vector but it will be an n by n
matrix and will only have nice sparsity patterns for models with
nested random effects. For models with non-nested random effects this
is potentially a dense n by n matrix and you really don't want to try
work