[Rd] help with eval()
I've narrowed my scope problems with predict.coxph further. Here is a condensed example: fcall3 <- as.formula("time ~ age") dfun3 <- function(dcall) { fit <- lm(dcall, data=lung, model=FALSE) model.frame(fit) } dfun3(fcall3) The final call fails: it can't find 'dcall'. The relevant code in model.frame.lm is: env <- environment(formula$terms) if (is.null(env)) env <- parent.frame() eval(fcall, env, parent.frame()) If the environment of the formula is .Globalenv, as it is here, the contents of parent.frame() are ignored. Adding a print(ls(parent.frame())) statement just above the final call shows that it isn't a scope issue: the variables we want are there. I don't understand the logic behind looking for variables in the place the formula was first typed (this is not a complaint). The inability to look elsewhere however has stymied my efforts to fix the scoping problem in predict.coxph, unless I drop the env(formula) argument alltogether. But I assume there must be good reasons for it's inclusion and am reluctant to do so. Terry Therneau > sessionInfo() R version 2.13.0 RC (2011-04-12 r55424) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=C [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base PS. This also fails dfun3 <- function(dcall) { fit <- lm(dcall, data=lung) model.frame(fit, subset=1:10) } You just need to force model.frame.lm to recreate data. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] help with eval()
On 11-04-18 5:51 PM, Terry Therneau wrote: I've narrowed my scope problems with predict.coxph further. Here is a condensed example: fcall3<- as.formula("time ~ age") dfun3<- function(dcall) { fit<- lm(dcall, data=lung, model=FALSE) model.frame(fit) } dfun3(fcall3) The final call fails: it can't find 'dcall'. The relevant code in model.frame.lm is: env<- environment(formula$terms) if (is.null(env)) env<- parent.frame() eval(fcall, env, parent.frame()) If the environment of the formula is .Globalenv, as it is here, the contents of parent.frame() are ignored. Adding a print(ls(parent.frame())) statement just above the final call shows that it isn't a scope issue: the variables we want are there. I don't understand the logic behind looking for variables in the place the formula was first typed (this is not a complaint). The inability to look elsewhere however has stymied my efforts to fix the scoping problem in predict.coxph, unless I drop the env(formula) argument alltogether. But I assume there must be good reasons for it's inclusion and am reluctant to do so. The reason is that when a formula is created, the variables in it are assumed to have meaning in that context. Where you work with the formula after that should not be relevant: that's why formulas carry environments with them. When you create the formula before the variables, things go wrong. There's probably a way to associate the lung dataframe with the formula, or create the formula in such a way that things work, but I can't spot it. Duncan Murdoch Terry Therneau sessionInfo() R version 2.13.0 RC (2011-04-12 r55424) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=C [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base PS. This also fails dfun3<- function(dcall) { fit<- lm(dcall, data=lung) model.frame(fit, subset=1:10) } You just need to force model.frame.lm to recreate data. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] help with eval()
On Mon, Apr 18, 2011 at 5:51 PM, Terry Therneau wrote: > I've narrowed my scope problems with predict.coxph further. > Here is a condensed example: > > fcall3 <- as.formula("time ~ age") > dfun3 <- function(dcall) { > fit <- lm(dcall, data=lung, model=FALSE) > model.frame(fit) > } > dfun3(fcall3) > > The final call fails: it can't find 'dcall'. > > The relevant code in model.frame.lm is: > env <- environment(formula$terms) > if (is.null(env)) > env <- parent.frame() > eval(fcall, env, parent.frame()) > > If the environment of the formula is .Globalenv, as it is here, the > contents of parent.frame() are ignored. Adding a > print(ls(parent.frame())) > statement just above the final call shows that it isn't a scope issue: > the variables we want are there. > > I don't understand the logic behind looking for variables in the place > the formula was first typed (this is not a complaint). The inability to > look elsewhere however has stymied my efforts to fix the scoping problem > in predict.coxph, unless I drop the env(formula) argument alltogether. > But I assume there must be good reasons for it's inclusion and am > reluctant to do so. > Try using do.call. Using the built in BOD to illustrate, we first try the posted code to view the error: > fcall3 <- as.formula("demand ~ Time") > dfun3 <- function(dcall) { + fit <- lm(dcall, data=BOD, model=FALSE) + model.frame(fit) + } > dfun3(fcall3) Error in model.frame(formula = dcall, data = BOD, drop.unused.levels = TRUE) : object 'dcall' not found > > # now replace the lm call with a do.call("lm" ...) > # so that dcall gets substituted before the call to lm: > > fcall3 <- as.formula("demand ~ Time") > dfun3 <- function(dcall) { + fit <- do.call("lm", list(dcall, data = BOD, model = FALSE)) + model.frame(fit) + } > dfun3(fcall3) demand Time 18.31 2 10.32 3 19.03 4 16.04 5 15.65 6 19.87 -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] help with eval()
On Mon, 18 Apr 2011, Duncan Murdoch wrote: On 11-04-18 5:51 PM, Terry Therneau wrote: I've narrowed my scope problems with predict.coxph further. Here is a condensed example: fcall3<- as.formula("time ~ age") dfun3<- function(dcall) { fit<- lm(dcall, data=lung, model=FALSE) model.frame(fit) } dfun3(fcall3) The final call fails: it can't find 'dcall'. The relevant code in model.frame.lm is: env<- environment(formula$terms) if (is.null(env)) env<- parent.frame() eval(fcall, env, parent.frame()) If the environment of the formula is .Globalenv, as it is here, the contents of parent.frame() are ignored. Adding a print(ls(parent.frame())) statement just above the final call shows that it isn't a scope issue: the variables we want are there. I don't understand the logic behind looking for variables in the place the formula was first typed (this is not a complaint). The inability to look elsewhere however has stymied my efforts to fix the scoping problem in predict.coxph, unless I drop the env(formula) argument alltogether. But I assume there must be good reasons for it's inclusion and am reluctant to do so. The reason is that when a formula is created, the variables in it are assumed to have meaning in that context. Where you work with the formula after that should not be relevant: that's why formulas carry environments with them. When you create the formula before the variables, things go wrong. There's probably a way to associate the lung dataframe with the formula, or create the formula in such a way that things work, but I can't spot it. This is why model=FALSE is not the default. It avoids trying to find the data at a later date (and even if you can solve the scoping issues, the data may have been changed). Duncan Murdoch Terry Therneau sessionInfo() R version 2.13.0 RC (2011-04-12 r55424) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=C [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base PS. This also fails dfun3<- function(dcall) { fit<- lm(dcall, data=lung) model.frame(fit, subset=1:10) } You just need to force model.frame.lm to recreate data. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] help with eval()
On Apr 19, 2011, at 07:16 , Prof Brian Ripley wrote: > On Mon, 18 Apr 2011, Duncan Murdoch wrote: > >> On 11-04-18 5:51 PM, Terry Therneau wrote: >>> I've narrowed my scope problems with predict.coxph further. >>> Here is a condensed example: >>> fcall3<- as.formula("time ~ age") >>> dfun3<- function(dcall) { >>> fit<- lm(dcall, data=lung, model=FALSE) >>> model.frame(fit) >>> } >>> dfun3(fcall3) >>> [.] >>> I don't understand the logic behind looking for variables in the place >>> the formula was first typed (this is not a complaint). The inability to >>> look elsewhere however has stymied my efforts to fix the scoping problem >>> in predict.coxph, unless I drop the env(formula) argument alltogether. >>> But I assume there must be good reasons for it's inclusion and am >>> reluctant to do so. >> >> >> The reason is that when a formula is created, the variables in it are >> assumed to have meaning in that context. Where you work with the formula >> after that should not be relevant: that's why formulas carry environments >> with them. When you create the formula before the variables, things go wrong. >> >> There's probably a way to associate the lung dataframe with the formula, or >> create the formula in such a way that things work, but I can't spot it. > > This is why model=FALSE is not the default. It avoids trying to find the > data at a later date (and even if you can solve the scoping issues, the data > may have been changed). Yes, but there are other cases where a reevaluation is triggered. The example I found earlier involved doing model.frame on a subset, in which case the length(nargs) clause in model.frame.lm gets chosen. So something is not right: Either we should arrange that reevaluations are never necessary, or we there should be a mechanism to get them reevaluated in the same scope as the original call. An obvious way would be to add the evaluation environment as an attribute to the $call component, but what would the memory management and serialization consequences be? One workaround is, as Gabor points out, effectively to substitute the value of the arguments to lm() at the point of the call, using do.call(lm, list(.)) or some eval(substitute(.)) construct to the same effect. However, the result of do.call() will look awkward in the cases where the $call gets deparsed, though. E.g. in Gabor's example, if we modify it to show the actual fit, we get the result below (I'm sure you can imagine what would happen if a data frame with more than 7 rows got used!). On the other hand, NOT substituting such arguments leaves the scoping issues. Another possible workaround is to make sure that functions that call modelling code internally will do the evaluation in the frame of the caller (like the call to model.matrix inside lm does). However, that seems to defeat the purpose of adding environments to formulas in the first place. -pd > dfun3 <- function(dcall) { + fit <- do.call("lm", list(dcall, data = BOD, model = FALSE)) + print(model.frame(fit)) + fit} > dfun3(fcall3) demand Time 18.31 2 10.32 3 19.03 4 16.04 5 15.65 6 19.87 Call: lm(formula = demand ~ Time, data = structure(list(Time = c(1, 2, 3, 4, 5, 7), demand = c(8.3, 10.3, 19, 16, 15.6, 19.8)), .Names = c("Time", "demand"), row.names = c(NA, -6L), class = "data.frame", reference = "A1.4, p. 270"), model = FALSE) Coefficients: (Intercept) Time 8.5211.721 > -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel