[R] random effects with lme() -- comparison with lm()

2004-01-15 Thread Jerome Asselin

Hi all,

In the (very simple) example below, I have defined a random effect for
the residuals in lme(). So the model is equivalent to a FIXED effect
model. Could someone explain to me why lme() still gives two standard
deviation estimates? I would expect lme() to return either:
a) an error or a warning for having an unidentifiable model;
b) only one standard deviation estimate.

Thank you for your time.
Jerome Asselin

> library(nlme)
> simdat <- data.frame(A=1:4,Y=c(23,43,11,34))
> simdat
  A  Y
1 1 23
2 2 43
3 3 11
4 4 34
> lme(Y~1,data=simdat,random=~1|A)
<...snip...>
Random effects:
 Formula: ~1 | A
(Intercept) Residual
StdDev:12.96007 4.860027
<...snip...>
> summary(lm(Y~1,data=simdat))$sigma
[1] 13.84136
> sqrt(12.96007^2+4.860027^2)
[1] 13.84136

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] random effects with lme() -- comparison with lm()

2004-01-15 Thread Douglas Bates
Jerome Asselin <[EMAIL PROTECTED]> writes:

> Hi all,
> 
> In the (very simple) example below, I have defined a random effect for
> the residuals in lme(). So the model is equivalent to a FIXED effect
> model. Could someone explain to me why lme() still gives two standard
> deviation estimates? I would expect lme() to return either:
> a) an error or a warning for having an unidentifiable model;
> b) only one standard deviation estimate.
> 
> Thank you for your time.
> Jerome Asselin
> 
> > library(nlme)
> > simdat <- data.frame(A=1:4,Y=c(23,43,11,34))
> > simdat
>   A  Y
> 1 1 23
> 2 2 43
> 3 3 11
> 4 4 34
> > lme(Y~1,data=simdat,random=~1|A)
> <...snip...>
> Random effects:
>  Formula: ~1 | A
> (Intercept) Residual
> StdDev:12.96007 4.860027
> <...snip...>
> > summary(lm(Y~1,data=simdat))$sigma
> [1] 13.84136
> > sqrt(12.96007^2+4.860027^2)
> [1] 13.84136

The estimates from lme are REML estimates because, as you have seen,
the sum of the estimated variances is correct and in this case only
the sum is well-defined.  Of course there are an infinite number of
other possible REML estimates and that situation is not flagged.
(BTW, I wouldn't say that this is equivalent to a fixed effects
model.  It is still a random effects model with two variance
components.  It just doesn't have well-defined estimates for those two
variance components.)

What has happened is that lme set up the optimization problem and
passed it off to one of the optimizer functions which came up with
converged estimates according to some convergence criterion.  In a
simple situation like this it is easy to determined that the estimates
are not well defined.  However, lme is designed to handle very general
model specifications and catching situations where estimates are not
well defined in the general case is difficult.  So the way we designed
lme is to take the estimates from the optimizer without doing further
analysis to see if they are consistent.

You should find that intervals() applied to your fitted model produces
huge intervals on the variance components, which is one way of
diagnosing an ill-defined or nearly ill-defined model.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] random effects with lme() -- comparison with lm()

2004-01-15 Thread Jerome Asselin
On Thu, 2004-01-15 at 16:30, Douglas Bates wrote:
<...snip...>
> (BTW, I wouldn't say that this is equivalent to a fixed effects
> model.  It is still a random effects model with two variance
> components.  It just doesn't have well-defined estimates for those two
> variance components.)

Agreed.

<...snip...>
> You should find that intervals() applied to your fitted model produces
> huge intervals on the variance components, which is one way of
> diagnosing an ill-defined or nearly ill-defined model.

Following your suggestion, I got:
> intervals(lme(Y~1,data=simdat,random=~1|A))
Error in intervals.lme(lme(Y ~ 1, data = simdat, random = ~1 | A)) :
Cannot get confidence intervals on var-cov components:
Non-positive definite approximate variance-covariance

This led me to:
> lme(Y~1,data=simdat,random=~1|A)$apVar
[1] "Non-positive definite approximate variance-covariance"

As a new feature suggestion for lme(), would it be appropriate to use
"apVar" as a warning flag in this case?

Sincerely,
Jerome Asselin

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] random effects with lme() -- comparison with lm()

2004-01-16 Thread Douglas Bates
Jerome Asselin <[EMAIL PROTECTED]> writes:

> On Thu, 2004-01-15 at 16:30, Douglas Bates wrote:
> <...snip...>
> > (BTW, I wouldn't say that this is equivalent to a fixed effects
> > model.  It is still a random effects model with two variance
> > components.  It just doesn't have well-defined estimates for those two
> > variance components.)
> 
> Agreed.
> 
> <...snip...>
> > You should find that intervals() applied to your fitted model produces
> > huge intervals on the variance components, which is one way of
> > diagnosing an ill-defined or nearly ill-defined model.
> 
> Following your suggestion, I got:
> > intervals(lme(Y~1,data=simdat,random=~1|A))
> Error in intervals.lme(lme(Y ~ 1, data = simdat, random = ~1 | A)) :
> Cannot get confidence intervals on var-cov components:
> Non-positive definite approximate variance-covariance
> 
> This led me to:
> > lme(Y~1,data=simdat,random=~1|A)$apVar
> [1] "Non-positive definite approximate variance-covariance"
> 
> As a new feature suggestion for lme(), would it be appropriate to use
> "apVar" as a warning flag in this case?

Certainly.

You may know that we are doing a major revision of the lme
computational methods based on the ability to calculate both the
gradient and the Hessian of the profiled log-likelihood, as described
in
http://www.stat.wisc.edu/~bates/reports/MixedComp.pdf

I think that when we have both the gradient and the Hessian we will be
in a much better situation to diagnose ill-defined estimates.  The
apVar component in the current lme objects is an approximate
variance-covariance matrix from numerical derivatives.  Working with
an exact Hessian should be much more reliable.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html