Re: [Rd] update.default: fall back on model.frame in case that the data frame is not in the parent environment

Duncan Murdoch Tue, 02 Aug 2011 12:09:29 -0700

On 02/08/2011 10:48 AM, Thaler,Thorn,LAUSANNE,Applied Mathematics wrote:

>  mm<- function(datf) {
>     lm(y ~ x, data = datf)
>  }
>  mydatf<- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
>                      z = rnorm(20))
>
>  l<- mm(mydatf)
>  update(l, . ~ . + z)   # This fails, z is not found


Good point. So let me rephrase the initial problem:

1.) An lm object is fitted somewhere with some data, which resides
somewhere in the memory.
2.) An ideal update function would know where the original data is
(rather than assuming that it is stored
   a.) in the parent frame
   b.) under the name given in the call slot of the lm object)

While from my point of view assumption a.) seems to be reasonable,
assumption b.) is awkward, as pointed out, because it makes it
cumbersome to update models that were created inside a function
(which should not be too rare a use case).

Thus, I have two questions:
1.) Is it somehow possible to retrieve the original data.frame with
which an lm is fitted just from the knowledge of the fit? I fear that
model.frame is the best I have.

I don't think so. You can get the environment in which the formula was
created from the "terms" component of the result; that's the second
place lm() will look. The first place it will look is in the explicitly
specified data variable, and you can get its name, but I don't think the
result object necessarily stores the full "data" argument or the
environment in which to look it up. (In your example, you can look up
"datf" in environment(l$terms) and get it, but that wouldn't work if the
formula had also been specified as an argument to mm().)
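
As a rough sketch of that lookup, assuming the formula is written
literally inside mm() as in the example: the names env, recovered and l2
are only illustrative, and set.seed() is added just to make the run
reproducible.

```r
# mm() and mydatf as in the example from this thread
mm <- function(datf) {
  lm(y ~ x, data = datf)
}
set.seed(1)
mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
                     z = rnorm(20))
l <- mm(mydatf)

# The formula was created in the evaluation frame of mm(), so that
# frame is the environment attached to the terms object.
env <- environment(l$terms)

# l$call$data holds the symbol `datf`; evaluate it in that frame.
recovered <- eval(l$call$data, envir = env)

# Pass the recovered data frame explicitly so update() can find z.
l2 <- update(l, . ~ . + z, data = recovered)
```

This only works because `y ~ x` is created inside mm(). If the formula
were passed in from the caller, environment(l$terms) would be the
caller's environment and `datf` would not be found there.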

2.) Is there any other way of making update aware of where to look for
the model building data?

By the way, another work-around I was just thinking of is to use

mm<- function(datf) {
    l<- lm(y ~ x, data = datf)
    call<- l$call
    call$data<- substitute(datf)
    l$call<- call
    l
}

which solves my issue (and which I can very well live with), but I
was wondering whether you see any chance that update could be made
smarter? Thanks for your input.
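
For illustration, a condensed version of this work-around together with
a call to update() might look as follows. It is only a sketch: it
assumes the caller passes a plain variable name (here mydatf) that is
still visible where update() is later called, and set.seed() and the
name l2 are additions for the example.

```r
# Condensed restatement of the work-around: patch the stored call so
# that `data` refers to the caller's expression instead of `datf`.
mm <- function(datf) {
  l <- lm(y ~ x, data = datf)
  l$call$data <- substitute(datf)  # the expression the caller supplied
  l
}

set.seed(1)
mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
                     z = rnorm(20))

l <- mm(mydatf)
l$call                     # data argument is now `mydatf`

# update() re-evaluates the patched call in the calling frame, where
# mydatf (and therefore z) can be found.
l2 <- update(l, . ~ . + z)
```

The patched call is re-evaluated wherever update() is called, so this
breaks if mydatf has been removed or changed there, or if mm() was
called with an expression that cannot be evaluated in that frame.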

I would suggest something simpler: return a list containing both l and
datf, and pass datf to update. You can attach a class to that list to
hide some of the ugliness if you like.
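
One way this could look, as a minimal sketch only: the class name
"lm_with_data", the component names fit and data, and the update method
below are hypothetical choices for illustration, not part of R.

```r
# Return the fit together with the data it was built from.
# "lm_with_data" is a made-up class name for this sketch.
mm <- function(datf) {
  fit <- lm(y ~ x, data = datf)
  structure(list(fit = fit, data = datf), class = "lm_with_data")
}

# S3 method for update(): always hand the stored data to the
# underlying update(), so it no longer matters where the original
# data frame lived or what it was called.
update.lm_with_data <- function(object, formula., ...) {
  dat <- object$data
  new_fit <- update(object$fit, formula., data = dat, ...)
  structure(list(fit = new_fit, data = dat), class = "lm_with_data")
}

set.seed(1)
mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
                     z = rnorm(20))

obj  <- mm(mydatf)
obj2 <- update(obj, . ~ . + z)   # z is found in the stored data
coef(obj2$fit)
```

The cost is that the data frame is kept alongside the fit, and other
generics (print, summary, predict, ...) would need similar small
methods or direct use of obj$fit.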


Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
