Re: [R] low-variance warning in lmer

2006-11-25 Thread Douglas Bates
On 11/24/06, Ben Bolker [EMAIL PROTECTED] wrote:

 For block effects with small variance, lmer will sometimes
 estimate the variance as being very close to zero and issue
 a warning.

The very close to zero is merely a computational convenience.  The
theoretical lower bound on the variance components is zero but
evaluating the profiled log-likelihood or log-restricted likelihood
when the variance-covariance matrix of the random effects is singular
would require  a completely different algorithm.  Evaluating the
gradient and the ECME update in this situation would be even more
complicated.

Instead of doing that I set the lower bounds to a value close to zero
but still large enough that the evaluation algorithm used elsewhere
works here too.  The value I use for the relative variances is 5e-10.
The corresponding estimate of the variance component is 5e-10 * s^2.

So the algorithm hasn't really converged to 5e-10 * s^2.  The estimate
of the variance component should be zero but I haven't allowed it to
go all the way to zero.

 I don't have a problem with this

I hope not.  The maximum likelihood or REML estimates of variance
components can legitimately be zero.

 -- I've explored
 things a bit with some simulations (see below) and conclude that
 this is probably inevitable when trying to incorporate
 random effects with not very much data (the means and medians
 of estimates are plausibly close to the nominal values:
 the fraction of runs with warnings/near-zero estimates drops
 from about 50% when the between-block variance is 1% of
 the total (with 2 treatments, 12 blocks nested within treatment,
 3 replicates per block), to 15% when between=30% of total,
 to near zero when between=50% of total.

 My question is: what should I suggest to students when this
 situation comes up?

If it is a variance component that is estimated as zero then just drop
that term from the model.  Having the MLE or the REML estimate of a
variance component be zero is pretty strong evidence that it is not
significantly greater than zero (in the usual sense of failing to
reject the null hypothesis - it still could be the case that it is
greater than zero but we don't have strong enough evidence to reject
the case that it is zero).  Hence we go with the simpler model that
eliminates this random effect.

The more difficult case is when the estimate of the
variance-covariance matrix of vector-valued random effects is
singular. Say you have a random effect for both the intercept and the
slope w.r.t. time for each subject.  It is possible to converge to a
variance-covariance matrix for these random effects with a non-zero
variance for the intercept and a non-zero variance for the slope but
perfect correlation between these random effects.  I'm not sure what
to suggest in that case as the reduced model is, as far as I can see,
not in the class of linear mixed models.  I think I might just not
bother mentioning this in a classroom.


 Can anyone point me to appropriate references?
 (I haven't found anything relevant in Pinheiro and Bates, but
 I may not have looked in the right place ...)

   thanks
 Ben Bolker


 self-contained but unnecessarily complicated simulation
 code/demonstration:
 ---
   library(lme4)
 library(lattice)

 simfun - function(reefeff,ntreat=2,nreef=12,
nreefpertreat=3,
t.eff=10,
totvar=25,seed=NA) {
   if (!is.na(seed)) set.seed(seed)
   ntot = nreef*nreefpertreat
   npertreat=ntot/ntreat
   reef = gl(nreef,nreefpertreat)
   treat = gl(ntreat,npertreat)
   r.sd = sqrt(totvar*reefeff)
   e.sd = sqrt(totvar*(1-reefeff))
   y.det = ifelse(treat==1,0,t.eff)
   r.vals = rnorm(nreef,sd=r.sd)
   e.vals = rnorm(ntot,sd=e.sd)
   y - y.det+r.vals[as.numeric(reef)]+e.vals
   data.frame(y,treat,reef)
 }

 getranvar - function(x) {
   vc - VarCorr(x)
   c(diag(vc[[1]]),attr(vc,sc)^2)
 }

 estfun - function(reefeff,...) {
   x - simfun(reefeff=reefeff,...)
   ow - options(warn=2)
   f1 - try(lmer(y~treat+(1|reef),data=x))
   w - (class(f1)==try-error  length(grep(effectively zero,f1))0)
   options(ow)
   f2 - lmer(y~treat+(1|reef),data=x)
   c(getranvar(f2),as.numeric(w))
 }


 rvec - rep(c(0.01,0.05,0.1,0.15,0.2,0.3,0.5),each=100)
 X - t(sapply(rvec,estfun))
 colnames(X) - c(reefvar,resvar,warn)
 rfrac - X[,reefvar]/(X[,reefvar]+X[,resvar])
 fracwarn - tapply(X[,warn],rvec,mean)
 est.mean - tapply(rfrac,rvec,mean)


 op - par(mfrow=c(1,2))
 plot(rvec,rfrac,type=n,xlim=c(-0.02,0.55),axes=FALSE)
 axis(side=2)
 box()
 boxplot(rfrac~rvec,at=unique(rvec),add=TRUE,pars=list(boxwex=0.03),
 col=gray)
 points(jitter(rvec),rfrac,col=X[,warn]+1)
 lines(unique(rvec),fracwarn,col=blue,type=b,lwd=2)
 text(0.4,0.1,fraction\nzero,col=blue)
 abline(a=0,b=1,lwd=2)
 points(unique(rvec),est.mean,cex=1.5,col=5)
 ##
 plot(rvec,rfrac,type=n,xlim=c(-0.02,0.55),axes=FALSE,log=y)
 axis(side=2)
 axis(side=1,at=unique(rvec))
 box()
 

[R] low-variance warning in lmer

2006-11-24 Thread Ben Bolker

  For block effects with small variance, lmer will sometimes
estimate the variance as being very close to zero and issue
a warning.  I don't have a problem with this -- I've explored
things a bit with some simulations (see below) and conclude that
this is probably inevitable when trying to incorporate
random effects with not very much data (the means and medians
of estimates are plausibly close to the nominal values:
the fraction of runs with warnings/near-zero estimates drops
from about 50% when the between-block variance is 1% of
the total (with 2 treatments, 12 blocks nested within treatment,
3 replicates per block), to 15% when between=30% of total,
to near zero when between=50% of total.

  My question is: what should I suggest to students when this
situation comes up?  Can anyone point me to appropriate references?
(I haven't found anything relevant in Pinheiro and Bates, but
I may not have looked in the right place ...)

  thanks
Ben Bolker


self-contained but unnecessarily complicated simulation
code/demonstration:
---
  library(lme4)
library(lattice)

simfun - function(reefeff,ntreat=2,nreef=12,
   nreefpertreat=3,
   t.eff=10,
   totvar=25,seed=NA) {
  if (!is.na(seed)) set.seed(seed)
  ntot = nreef*nreefpertreat
  npertreat=ntot/ntreat
  reef = gl(nreef,nreefpertreat)
  treat = gl(ntreat,npertreat)
  r.sd = sqrt(totvar*reefeff)
  e.sd = sqrt(totvar*(1-reefeff))
  y.det = ifelse(treat==1,0,t.eff)
  r.vals = rnorm(nreef,sd=r.sd)
  e.vals = rnorm(ntot,sd=e.sd)
  y - y.det+r.vals[as.numeric(reef)]+e.vals
  data.frame(y,treat,reef)
}

getranvar - function(x) {
  vc - VarCorr(x)
  c(diag(vc[[1]]),attr(vc,sc)^2)
}

estfun - function(reefeff,...) {
  x - simfun(reefeff=reefeff,...)
  ow - options(warn=2)
  f1 - try(lmer(y~treat+(1|reef),data=x))
  w - (class(f1)==try-error  length(grep(effectively zero,f1))0)
  options(ow)
  f2 - lmer(y~treat+(1|reef),data=x)
  c(getranvar(f2),as.numeric(w))
}


rvec - rep(c(0.01,0.05,0.1,0.15,0.2,0.3,0.5),each=100)
X - t(sapply(rvec,estfun))
colnames(X) - c(reefvar,resvar,warn)
rfrac - X[,reefvar]/(X[,reefvar]+X[,resvar])
fracwarn - tapply(X[,warn],rvec,mean)
est.mean - tapply(rfrac,rvec,mean)


op - par(mfrow=c(1,2))
plot(rvec,rfrac,type=n,xlim=c(-0.02,0.55),axes=FALSE)
axis(side=2)
box()
boxplot(rfrac~rvec,at=unique(rvec),add=TRUE,pars=list(boxwex=0.03),
col=gray)
points(jitter(rvec),rfrac,col=X[,warn]+1)
lines(unique(rvec),fracwarn,col=blue,type=b,lwd=2)
text(0.4,0.1,fraction\nzero,col=blue)
abline(a=0,b=1,lwd=2)
points(unique(rvec),est.mean,cex=1.5,col=5)
##
plot(rvec,rfrac,type=n,xlim=c(-0.02,0.55),axes=FALSE,log=y)
axis(side=2)
axis(side=1,at=unique(rvec))
box()
points(jitter(rvec),rfrac,col=X[,warn]+1)
curve(1*x,add=TRUE,lwd=2)

print(densityplot(~rfrac,groups=rvec,from=0,to=0.9,
panel=function(...) {
  panel.densityplot(...)
  cols - trellis.par.get(superpose.line)$col
  lpoints(unique(rvec),rep(8,length(rvec)),type=h,
  col=cols[1:length(rvec)])
}))




signature.asc
Description: OpenPGP digital signature
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] low-variance warning in lmer

2006-11-24 Thread Gregor Gorjanc
Ben Bolker bolker at zoo.ufl.edu writes:
   For block effects with small variance, lmer will sometimes
...
   My question is: what should I suggest to students when this
 situation comes up?  Can anyone point me to appropriate references?
 (I haven't found anything relevant in Pinheiro and Bates, but
 I may not have looked in the right place ...)

Gelman has quite some similar examples. Take a look at his schools example in
the Bayesian book.

Gregor

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.