[Rd] help with eval()

2011-04-18 Thread Terry Therneau
I've narrowed my scope problems with predict.coxph further.
Here is a condensed example:

fcall3 <- as.formula("time ~ age")
dfun3 <- function(dcall) {
fit <- lm(dcall, data=lung, model=FALSE)
model.frame(fit)
}
dfun3(fcall3)

The final call fails: it can't find 'dcall'.

The relevant code in model.frame.lm is:
   env <- environment(formula$terms)
   if (is.null(env)) 
env <- parent.frame()
eval(fcall, env, parent.frame())

If the environment of the formula is .Globalenv, as it is here, the
contents of parent.frame() are ignored.  Adding a
   print(ls(parent.frame())) 
statement just above the  final call shows that it isn't a scope issue:
the variables we want are there.

  I don't understand the logic behind looking for variables in the place
the formula was first typed (this is not a complaint).  The inability to
look elsewhere however has stymied my efforts to fix the scoping problem
in predict.coxph, unless I drop the env(formula) argument alltogether.
But I assume there must be good reasons for it's inclusion and am
reluctant to do so.

Terry Therneau

> sessionInfo()
R version 2.13.0 RC (2011-04-12 r55424)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C  
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=C  
 [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C 
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats graphics  grDevices utils datasets  methods
base 

PS. This also fails
dfun3 <- function(dcall) {
fit <- lm(dcall, data=lung)
model.frame(fit, subset=1:10)
}
You just need to force model.frame.lm to recreate data.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] help with eval()

2011-04-18 Thread Duncan Murdoch

On 11-04-18 5:51 PM, Terry Therneau wrote:

I've narrowed my scope problems with predict.coxph further.
Here is a condensed example:

fcall3<- as.formula("time ~ age")
dfun3<- function(dcall) {
 fit<- lm(dcall, data=lung, model=FALSE)
 model.frame(fit)
}
dfun3(fcall3)

The final call fails: it can't find 'dcall'.

The relevant code in model.frame.lm is:
env<- environment(formula$terms)
if (is.null(env))
 env<- parent.frame()
eval(fcall, env, parent.frame())

If the environment of the formula is .Globalenv, as it is here, the
contents of parent.frame() are ignored.  Adding a
print(ls(parent.frame()))
statement just above the  final call shows that it isn't a scope issue:
the variables we want are there.

   I don't understand the logic behind looking for variables in the place
the formula was first typed (this is not a complaint).  The inability to
look elsewhere however has stymied my efforts to fix the scoping problem
in predict.coxph, unless I drop the env(formula) argument alltogether.
But I assume there must be good reasons for it's inclusion and am
reluctant to do so.



The reason is that when a formula is created, the variables in it are 
assumed to have meaning in that context.  Where you work with the 
formula after that should not be relevant:  that's why formulas carry 
environments with them.  When you create the formula before the 
variables, things go wrong.


There's probably a way to associate the lung dataframe with the formula, 
or create the formula in such a way that things work, but I can't spot it.


Duncan Murdoch


Terry Therneau


sessionInfo()

R version 2.13.0 RC (2011-04-12 r55424)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8LC_COLLATE=C
  [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
  [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods
base

PS. This also fails
dfun3<- function(dcall) {
 fit<- lm(dcall, data=lung)
 model.frame(fit, subset=1:10)
}
You just need to force model.frame.lm to recreate data.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] help with eval()

2011-04-18 Thread Gabor Grothendieck
On Mon, Apr 18, 2011 at 5:51 PM, Terry Therneau  wrote:
> I've narrowed my scope problems with predict.coxph further.
> Here is a condensed example:
>
> fcall3 <- as.formula("time ~ age")
> dfun3 <- function(dcall) {
>    fit <- lm(dcall, data=lung, model=FALSE)
>    model.frame(fit)
> }
> dfun3(fcall3)
>
> The final call fails: it can't find 'dcall'.
>
> The relevant code in model.frame.lm is:
>       env <- environment(formula$terms)
>       if (is.null(env))
>            env <- parent.frame()
>        eval(fcall, env, parent.frame())
>
> If the environment of the formula is .Globalenv, as it is here, the
> contents of parent.frame() are ignored.  Adding a
>           print(ls(parent.frame()))
> statement just above the  final call shows that it isn't a scope issue:
> the variables we want are there.
>
>  I don't understand the logic behind looking for variables in the place
> the formula was first typed (this is not a complaint).  The inability to
> look elsewhere however has stymied my efforts to fix the scoping problem
> in predict.coxph, unless I drop the env(formula) argument alltogether.
> But I assume there must be good reasons for it's inclusion and am
> reluctant to do so.
>

Try using do.call.  Using the built in BOD to illustrate, we first try
the posted code to view the error:

> fcall3 <- as.formula("demand ~ Time")
> dfun3 <- function(dcall) {
+ fit <- lm(dcall, data=BOD, model=FALSE)
+ model.frame(fit)
+ }
> dfun3(fcall3)
Error in model.frame(formula = dcall, data = BOD, drop.unused.levels = TRUE) :
  object 'dcall' not found
>
> # now replace the lm call with a do.call("lm" ...)
> # so that dcall gets substituted before the call to lm:
>
> fcall3 <- as.formula("demand ~ Time")
> dfun3 <- function(dcall) {
+ fit <- do.call("lm", list(dcall, data = BOD, model = FALSE))
+ model.frame(fit)
+ }
> dfun3(fcall3)
  demand Time
18.31
2   10.32
3   19.03
4   16.04
5   15.65
6   19.87



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] help with eval()

2011-04-18 Thread Prof Brian Ripley

On Mon, 18 Apr 2011, Duncan Murdoch wrote:


On 11-04-18 5:51 PM, Terry Therneau wrote:

I've narrowed my scope problems with predict.coxph further.
Here is a condensed example:

fcall3<- as.formula("time ~ age")
dfun3<- function(dcall) {
 fit<- lm(dcall, data=lung, model=FALSE)
 model.frame(fit)
}
dfun3(fcall3)

The final call fails: it can't find 'dcall'.

The relevant code in model.frame.lm is:
env<- environment(formula$terms)
if (is.null(env))
 env<- parent.frame()
eval(fcall, env, parent.frame())

If the environment of the formula is .Globalenv, as it is here, the
contents of parent.frame() are ignored.  Adding a
print(ls(parent.frame()))
statement just above the  final call shows that it isn't a scope issue:
the variables we want are there.

   I don't understand the logic behind looking for variables in the place
the formula was first typed (this is not a complaint).  The inability to
look elsewhere however has stymied my efforts to fix the scoping problem
in predict.coxph, unless I drop the env(formula) argument alltogether.
But I assume there must be good reasons for it's inclusion and am
reluctant to do so.



The reason is that when a formula is created, the variables in it are assumed 
to have meaning in that context.  Where you work with the formula after that 
should not be relevant:  that's why formulas carry environments with them. 
When you create the formula before the variables, things go wrong.


There's probably a way to associate the lung dataframe with the formula, or 
create the formula in such a way that things work, but I can't spot it.


This is why model=FALSE is not the default.  It avoids trying to find 
the data at a later date (and even if you can solve the scoping 
issues, the data may have been changed).




Duncan Murdoch


Terry Therneau


sessionInfo()

R version 2.13.0 RC (2011-04-12 r55424)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8LC_COLLATE=C
  [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
  [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods
base

PS. This also fails
dfun3<- function(dcall) {
 fit<- lm(dcall, data=lung)
 model.frame(fit, subset=1:10)
}
You just need to force model.frame.lm to recreate data.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] help with eval()

2011-04-19 Thread peter dalgaard

On Apr 19, 2011, at 07:16 , Prof Brian Ripley wrote:

> On Mon, 18 Apr 2011, Duncan Murdoch wrote:
> 
>> On 11-04-18 5:51 PM, Terry Therneau wrote:
>>> I've narrowed my scope problems with predict.coxph further.
>>> Here is a condensed example:
>>> fcall3<- as.formula("time ~ age")
>>> dfun3<- function(dcall) {
>>> fit<- lm(dcall, data=lung, model=FALSE)
>>> model.frame(fit)
>>> }
>>> dfun3(fcall3)
>>> [.]
>>>   I don't understand the logic behind looking for variables in the place
>>> the formula was first typed (this is not a complaint).  The inability to
>>> look elsewhere however has stymied my efforts to fix the scoping problem
>>> in predict.coxph, unless I drop the env(formula) argument alltogether.
>>> But I assume there must be good reasons for it's inclusion and am
>>> reluctant to do so.
>> 
>> 
>> The reason is that when a formula is created, the variables in it are 
>> assumed to have meaning in that context.  Where you work with the formula 
>> after that should not be relevant:  that's why formulas carry environments 
>> with them. When you create the formula before the variables, things go wrong.
>> 
>> There's probably a way to associate the lung dataframe with the formula, or 
>> create the formula in such a way that things work, but I can't spot it.
> 
> This is why model=FALSE is not the default.  It avoids trying to find the 
> data at a later date (and even if you can solve the scoping issues, the data 
> may have been changed).

Yes, but there are other cases where a reevaluation is triggered. The example I 
found earlier involved doing model.frame on a subset, in which case the 
length(nargs) clause in model.frame.lm gets chosen.

So something is not right: Either we should arrange that reevaluations are 
never necessary, or we there should be a mechanism to get them reevaluated in 
the same scope as the original call. 

An obvious way would be to add the evaluation environment as an attribute to 
the $call component, but what would the memory management and serialization 
consequences be?

One workaround is, as Gabor points out, effectively to substitute the value of 
the arguments to lm() at the point of the call, using do.call(lm, list(.)) 
or some eval(substitute(.)) construct to the same effect. However, the 
result of do.call() will look awkward in the cases where the $call gets 
deparsed, though. E.g. in Gabor's example, if we modify it to show the actual 
fit, we get the result below (I'm sure you can imagine what would happen if a 
data frame with more than 7 rows got used!). On the other hand, NOT 
substituting such arguments leaves the scoping issues.

Another possible workaround is to make sure that functions that call modelling 
code internally will do the evaluation in the frame of the caller (like the 
call to model.matrix inside lm does). However, that seems to defeat the purpose 
of adding environments to formulas in the first place.  

-pd

> dfun3 <- function(dcall) {
+  fit <- do.call("lm", list(dcall, data = BOD, model = FALSE))
+ print(model.frame(fit))
+ fit}
> dfun3(fcall3)
  demand Time
18.31
2   10.32
3   19.03
4   16.04
5   15.65
6   19.87

Call:
lm(formula = demand ~ Time, data = structure(list(Time = c(1, 
2, 3, 4, 5, 7), demand = c(8.3, 10.3, 19, 16, 15.6, 19.8)), .Names = c("Time", 
"demand"), row.names = c(NA, -6L), class = "data.frame", reference = "A1.4, p. 
270"), 
model = FALSE)

Coefficients:
(Intercept) Time  
  8.5211.721  

> 

 

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel