[R] Problem trying to use boot and lme together

2005-06-21 Thread Michael Dewey
The outcome variable in my dataset is cost. I prefer not to transform it as 
readers will really be interested in pounds, not log(pounds) or 
sqrt(pounds). I have fitted my model using lme and now wish to use boot on 
it. I have therefore plagiarised the example in the article in RNews 2/3 
December 2002

after loading the dataset I source this file


require(boot)
require(nlme)
model.0 <- lme(tot ~ time + timepost + pct + totpat
+ (time + timepost) : single + single
+ (time + timepost) : train + train
+ time:gpattend + timepost:gpattend + gpattend,
data = common,
random = ~time + timepost | gp
)
ints.9 <- intervals(model.0, which="fixed")$fixed[9,]
#
common$fit <- fitted(model.0)
common$res <- resid(model.0, type = "response")
cats.fit <- function(data) {
mod <- lme(tot ~ time + timepost + pct + totpat
   + (time + timepost) : single + single
   + (time + timepost) : train + train
   + time:gpattend + timepost:gpattend + gpattend,
data = data,
random = ~ time + timepost | gp)
summ <- summary(mod)
c(fixef(summ), diag(summ$varFix))
}
model.fun <- function(d, i) {
d$tot <- d$fit+d$res[i]
cats.fit(d)
}
tot.boot <- boot(common, model.fun, R=999)

So I fit the model and then generate fitted values and residuals which I 
use within the model.fun function to generate the bootstrap resample.

If I do this the plot looks fine as does the jack.after.boot plot but the 
confidence intervals are about 10% of the width of the ones from the lme 
output. I wondered whether I was using the wrong fitted and residuals so I 
added level = 0 to the calls of fitted and resid respectively (level = 1 is 
the default) but this seems to lead to resamples to which lme cannot fit 
the model.

Can anyone spot what I am doing wrong?

I would appreciate a cc'ed response as my ISP has taken to bouncing the 
digest posts from R-help with probability approximately 0.3.

FWIW I am using 2.1.0 under XP (SP2) with the versions of boot and nlme 
which shipped with the binary.


Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem trying to use boot and lme together

2005-06-21 Thread Prof Brian Ripley
We don't have the structure of your dataset.  But it seems pretty clear 
that your resampling is not preserving the random effects structure: you 
are always fitting to the same clusters labelled by gp and so missing the 
major source of variability.

You either need to resample clusters, or resample the cluster random 
effects, as well as resample the within-cluster effects.  I doubt if there 
is much theoretical support for such bootstrapping schemes, nor software 
support: I would prefer a so-called parametric bootstrapping approach, 
that is resample from the fitted model -- see simulate.lme.



On Tue, 21 Jun 2005, Michael Dewey wrote:

> The outcome variable in my dataset is cost. I prefer not to transform it as
> readers will really be interested in pounds, not log(pounds) or
> sqrt(pounds). I have fitted my model using lme and now wish to use boot on
> it. I have therefore plagiarised the example in the article in RNews 2/3
> December 2002
>
> after loading the dataset I source this file
>
> 
> require(boot)
> require(nlme)
> model.0 <- lme(tot ~ time + timepost + pct + totpat
>+ (time + timepost) : single + single
>+ (time + timepost) : train + train
>+ time:gpattend + timepost:gpattend + gpattend,
>data = common,
>random = ~time + timepost | gp
> )
> ints.9 <- intervals(model.0, which="fixed")$fixed[9,]
> #
> common$fit <- fitted(model.0)
> common$res <- resid(model.0, type = "response")
> cats.fit <- function(data) {
>mod <- lme(tot ~ time + timepost + pct + totpat
>   + (time + timepost) : single + single
>   + (time + timepost) : train + train
>   + time:gpattend + timepost:gpattend + gpattend,
>data = data,
>random = ~ time + timepost | gp)
>summ <- summary(mod)
>c(fixef(summ), diag(summ$varFix))
> }
> model.fun <- function(d, i) {
>d$tot <- d$fit+d$res[i]
>cats.fit(d)
> }
> tot.boot <- boot(common, model.fun, R=999)
> 
> So I fit the model and then generate fitted values and residuals which I
> use within the model.fun function to generate the bootstrap resample.
>
> If I do this the plot looks fine as does the jack.after.boot plot but the
> confidence intervals are about 10% of the width of the ones from the lme
> output. I wondered whether I was using the wrong fitted and residuals so I
> added level = 0 to the calls of fitted and resid respectively (level = 1 is
> the default) but this seems to lead to resamples to which lme cannot fit
> the model.
>
> Can anyone spot what I am doing wrong?
>
> I would appreciate a cc'ed response as my ISP has taken to bouncing the
> digest posts from R-help with probability approximately 0.3.
>
> FWIW I am using 2.1.0 under XP (SP2) with the versions of boot and nlme
> which shipped with the binary.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem trying to use boot and lme together

2005-06-21 Thread Søren Højsgaard
The problem with simulate.lme is that it only returns logL for a given model 
fitted to a simulated data set  - not the simulated data set itself (which one 
might have expected a function with that name to do...). It would be nice with 
that functionality...
Søren
 



Fra: [EMAIL PROTECTED] på vegne af Prof Brian Ripley
Sendt: ti 21-06-2005 18:53
Til: Michael Dewey
Cc: r-help-stat.math.ethz.ch
Emne: Re: [R] Problem trying to use boot and lme together



We don't have the structure of your dataset.  But it seems pretty clear
that your resampling is not preserving the random effects structure: you
are always fitting to the same clusters labelled by gp and so missing the
major source of variability.

You either need to resample clusters, or resample the cluster random
effects, as well as resample the within-cluster effects.  I doubt if there
is much theoretical support for such bootstrapping schemes, nor software
support: I would prefer a so-called parametric bootstrapping approach,
that is resample from the fitted model -- see simulate.lme.



On Tue, 21 Jun 2005, Michael Dewey wrote:

> The outcome variable in my dataset is cost. I prefer not to transform it as
> readers will really be interested in pounds, not log(pounds) or
> sqrt(pounds). I have fitted my model using lme and now wish to use boot on
> it. I have therefore plagiarised the example in the article in RNews 2/3
> December 2002
>
> after loading the dataset I source this file
>
> 
> require(boot)
> require(nlme)
> model.0 <- lme(tot ~ time + timepost + pct + totpat
>+ (time + timepost) : single + single
>+ (time + timepost) : train + train
>+ time:gpattend + timepost:gpattend + gpattend,
>data = common,
>random = ~time + timepost | gp
> )
> ints.9 <- intervals(model.0, which="fixed")$fixed[9,]
> #
> common$fit <- fitted(model.0)
> common$res <- resid(model.0, type = "response")
> cats.fit <- function(data) {
>mod <- lme(tot ~ time + timepost + pct + totpat
>   + (time + timepost) : single + single
>   + (time + timepost) : train + train
>   + time:gpattend + timepost:gpattend + gpattend,
>data = data,
>random = ~ time + timepost | gp)
>summ <- summary(mod)
>c(fixef(summ), diag(summ$varFix))
> }
> model.fun <- function(d, i) {
>d$tot <- d$fit+d$res[i]
>cats.fit(d)
> }
> tot.boot <- boot(common, model.fun, R=999)
> 
> So I fit the model and then generate fitted values and residuals which I
> use within the model.fun function to generate the bootstrap resample.
>
> If I do this the plot looks fine as does the jack.after.boot plot but the
> confidence intervals are about 10% of the width of the ones from the lme
> output. I wondered whether I was using the wrong fitted and residuals so I
> added level = 0 to the calls of fitted and resid respectively (level = 1 is
> the default) but this seems to lead to resamples to which lme cannot fit
> the model.
>
> Can anyone spot what I am doing wrong?
>
> I would appreciate a cc'ed response as my ISP has taken to bouncing the
> digest posts from R-help with probability approximately 0.3.
>
> FWIW I am using 2.1.0 under XP (SP2) with the versions of boot and nlme
> which shipped with the binary.


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem trying to use boot and lme together

2005-06-21 Thread Douglas Bates
On 6/21/05, Søren Højsgaard <[EMAIL PROTECTED]> wrote:
> The problem with simulate.lme is that it only returns logL for a given model 
> fitted to a simulated data set  - not the simulated data set itself (which 
> one might have expected a function with that name to do...). It would be nice 
> with that functionality...
> Søren

You could add it.  You just need to create a matrix that will be large
enough to hold all the simulated data sets and fill a column (or row
if you prefer but column is probably better because of the way that
matrices are stored) during each iteration of the simulation and
remember to include that matrix in the returned object.

The reason that we didn't do that in the original design is because
that matrix can become rather large when you have either a lot of data
or a lot of simulations and we were working on the computers or 5 to
10 years ago.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem trying to use boot and lme together

2005-06-21 Thread Prof Brian Ripley

On Tue, 21 Jun 2005, Douglas Bates wrote:


On 6/21/05, Søren Højsgaard <[EMAIL PROTECTED]> wrote:

The problem with simulate.lme is that it only returns logL for a given model 
fitted to a simulated data set  - not the simulated data set itself (which one 
might have expected a function with that name to do...). It would be nice with 
that functionality...
Søren


You could add it.  You just need to create a matrix that will be large
enough to hold all the simulated data sets and fill a column (or row
if you prefer but column is probably better because of the way that
matrices are stored) during each iteration of the simulation and
remember to include that matrix in the returned object.


Note: you don't need to store it: you can do the analysis at that point 
and return the statistics you want, rather than just logL.


I did say `see simulate.lme', not `use simulate.lme'.  I know nlme is no 
longer being developed, but if it were I would be suggesting/contributing 
a modification that allowed the user to specify an `extraction' function 
from the fit -- quite a few pieces of bootstrap code work that way.


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Problem trying to use boot and lme together

2005-06-22 Thread Michael Dewey
At 23:09 21/06/05, Prof Brian Ripley wrote:
>On Tue, 21 Jun 2005, Douglas Bates wrote:
>
>>On 6/21/05, Søren Højsgaard <[EMAIL PROTECTED]> wrote:

Thanks everyone for your help, more comments at the foot

>>>The problem with simulate.lme is that it only returns logL for a given 
>>>model fitted to a simulated data set  - not the simulated data set 
>>>itself (which one might have expected a function with that name to 
>>>do...). It would be nice with that functionality...
>>>Søren
>>
>>You could add it.  You just need to create a matrix that will be large
>>enough to hold all the simulated data sets and fill a column (or row
>>if you prefer but column is probably better because of the way that
>>matrices are stored) during each iteration of the simulation and
>>remember to include that matrix in the returned object.
>
>Note: you don't need to store it: you can do the analysis at that point 
>and return the statistics you want, rather than just logL.
>
>I did say `see simulate.lme', not `use simulate.lme'.  I know nlme is no 
>longer being developed, but if it were I would be suggesting/contributing 
>a modification that allowed the user to specify an `extraction' function 
>from the fit -- quite a few pieces of bootstrap code work that way.
>
>--
>Brian D. Ripley,  [EMAIL PROTECTED]
>Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>University of Oxford, Tel:  +44 1865 272861 (self)
>1 South Parks Road, +44 1865 272866 (PA)
>Oxford OX1 3TG, UKFax:  +44 1865 272595

This has been most informative for me. Given the rather stern comments in 
Pinheiro and Bates that most things do not have the reference distribution 
you might naively think I now realise that simulate.lme is more important 
than its rather cursory treatment in the book. As suggested I have looked 
at the code but although I can see broadly what each section does I lack 
the skill to modify it myself. I will have to wait for someone more gifted.

If there is to be a successor edition to Pinheiro and Bates perhaps I could 
suggest that this topic merits a bit more discussion?


Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html