On 16/06/2014 19:35, Therneau, Terry M., Ph.D. wrote:
Someone has reported a problem with predict.coxph that I can't seem to
solve.  The underlying issue is with model.frame.coxph; the same issue
is also found in lm so I'll use that for the example.

--------------------------

 > test <- data.frame(y = 1:10 + runif(10), x=1:10)

 > myfun <- function(formula, nd) {
     fit <- lm(formula, data=nd, model=FALSE)
     model.frame(fit)
     }

 > myfun(test)
Error in is.data.frame(data): object "nd" not found

You have specified formula = test and given no value for nd. Is that really what you intended? It is undocumented that it works for lm().


--------------------

1. The key line, in both model.frame.coxph and model.frame.lm is
     eval(fcall, env, parent.frame())

and it appears (at least to me) that the parent.frame() part of this is
effectively ignored when fcall is itself a reference to model.frame.
I'd like to understand this better.

Way back (ca R 1.2.0) an advocate of lexical scoping changed model.frame.lm to refer to an environment not a data frame for 'env'. That pretty fundamental change means that your sort of example is not a recommended way to do this: you are mixing scoping models.
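
A minimal sketch of that mixing, assuming the call is written as myfun(y ~ x, test) and that no object named 'nd' exists on the search path; the names 'f', 'form_env' and 'res' are illustrative only. With model = FALSE, model.frame() re-evaluates the stored call, which says data = nd, in the environment of the formula rather than in the frame of myfun:

```r
test <- data.frame(y = 1:10 + runif(10), x = 1:10)

myfun <- function(formula, nd) {
    fit <- lm(formula, data = nd, model = FALSE)
    # fit$call records data = nd, a name that exists only inside myfun
    model.frame(fit)
}

f <- y ~ x
# The formula carries the environment in which it was created
form_env <- environment(f)

# Capture the error message instead of stopping the script
res <- tryCatch(myfun(f, test), error = function(e) conditionMessage(e))
print(res)
```

The lookup for 'nd' starts in form_env, where only 'test' is defined, so the call made by model.frame() cannot find it.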

2. The modeling functions coxph and survreg in the survival package default to
model=FALSE, originally in mimicry of lm and glm; I don't know when R
changed the default to model=TRUE for lm and glm.  One possible response

I am not sure R ever did: model = TRUE was the default 16 years ago at the beginning of the CVS/SVN archive.

to my question would be advice to change my routine's defaults too.  I'm
somewhat reluctant since I work with a few very large data sets, but
would entertain that discussion as well.   I'd still like to understand
how model.frame could be made to work under the current regimen.

For smaller problems using model = TRUE is the most robust solution. As the components of the model frame can be changed after fitting, there is no way to guarantee to recreate the model frame, so to be sure you need to store it.
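
As a rough illustration of the stored-frame route (the name 'myfun_stored' is hypothetical, and this is only a sketch for lm, not for coxph):

```r
test <- data.frame(y = 1:10 + runif(10), x = 1:10)

myfun_stored <- function(formula, nd) {
    # model = TRUE (the lm default) keeps the model frame in fit$model
    fit <- lm(formula, data = nd, model = TRUE)
    # model.frame() returns the stored copy, so 'nd' is not looked up again
    model.frame(fit)
}

mf <- myfun_stored(y ~ x, test)
str(mf)
```

The price is that the fit carries a copy of every variable used in the model, which is the concern for very large data sets.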

If you call myfun(y ~ x, test), model.frame() will look for 'nd' in the global environment, which is the environment of the formula. One way to get that to work more often is something like

myfun <- function(formula, nd) {
     qnd <- substitute(nd)
     fit <- lm(formula, data=nd, model=FALSE)
     fit$call$data <- qnd
     model.frame(fit)
}

so it looks for the object that was passed as 'nd' (here 'test') instead of for 'nd' itself.
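
A short usage sketch of the function above, assuming 'test' and the formula are both created at top level (the name 'mf2' is illustrative only):

```r
test <- data.frame(y = 1:10 + runif(10), x = 1:10)

myfun <- function(formula, nd) {
    qnd <- substitute(nd)      # the unevaluated argument, here the name 'test'
    fit <- lm(formula, data = nd, model = FALSE)
    fit$call$data <- qnd       # the stored call now says data = test
    model.frame(fit)
}

# 'test' is visible from the environment of y ~ x, so the frame is rebuilt
mf2 <- myfun(y ~ x, test)
str(mf2)
```

This still relies on the object passed as 'nd' being visible from the environment of the formula, hence "more often" rather than "always".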


--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
