> On Jan 6, 2017, at 11:03 AM, Jacob Wegelin <jacobwege...@fastmail.fm> wrote:
> 
> Given any regression model, created for instance by lm, lme, lmer, or rqs, 
> such as
> 
> z1<-lm(weight~poly(Time,2), data=ChickWeight)
> 
> I would like a general way to obtain only those variables used for the model. 
>  In the current example, this "minimal data frame" would consist of the 
> "weight" and "Time" variables and none of the other columns of ChickWeight.
> 
> (Motivation: Sometimes the data frame contains thousands of variables which 
> are not used in the current regression, and I do not want to keep copying and 
> propagating them.)
> 
> The "model" component of the regression object doesn't serve this purpose:
> 
>> head(z1$model)
>  weight poly(Time, 2).1 poly(Time, 2).2
> 1     42    -0.066020938     0.072002235
> 2     51    -0.053701293     0.031099018
> 3     59    -0.041381647    -0.001334588
> 4     64    -0.029062001    -0.025298582
> 5     76    -0.016742356    -0.040792965
> 6     93    -0.004422710    -0.047817737
> 
> The following awkward workaround seems to do it when variable names contain 
> only "word characters" as defined by regex:
> 
> minimalvariablesfrommodel20161120 <-function(object, originaldata){
> # stopifnot(!missing(originaldata))
> stopifnot(!missing(object))
> intersect(
>       unique(unlist(strsplit(format(object$call$formula), split="\\W", 
> perl=TRUE)))
>       , names(originaldata)
>       )
> }
> 
>> minimalvariablesfrommodel20161120(z1, ChickWeight)
> [1] "weight" "Time" 
>> 
> 
> But if a variable has a space in its name, my workaround fails:
> 
>> ChickWeight$"dog tail"<-ChickWeight$Time
>> z1<-lm(weight~poly(`dog tail`,2), data=ChickWeight)
>> head(z1$model)
>  weight poly(`dog tail`, 2).1 poly(`dog tail`, 2).2
> 1     42          -0.066020938           0.072002235
> 2     51          -0.053701293           0.031099018
> 3     59          -0.041381647          -0.001334588
> 4     64          -0.029062001          -0.025298582
> 5     76          -0.016742356          -0.040792965
> 6     93          -0.004422710          -0.047817737
>> minimalvariablesfrommodel20161120(z1, ChickWeight)
> [1] "weight"
>> 
> 
> Is there a more elegant, and hence more reliable, approach?
> 
> Thanks
> 
> Jacob A. Wegelin


Jacob,

In general, given a model object 'm', the following returns the names of the variables used in its formula:

  all.vars(terms(m))

See ?terms and ?all.vars; the latter help page also documents all.names().
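
As a rough sketch of how this might be applied to the ChickWeight example from the question: the helper name minimal_model_data() below is hypothetical (it is not part of any package), and the code assumes an lm fit. For other model classes, such as mixed models, terms() may cover only part of the formula, so the result should be checked.

```r
# Fit from the original question; ChickWeight ships with R (datasets package).
z1 <- lm(weight ~ poly(Time, 2), data = ChickWeight)

# Names of the variables referenced by the model formula.
used <- all.vars(terms(z1))
used
# [1] "weight" "Time"

# Hypothetical helper: keep only the columns the model uses.
# intersect() drops any name in the formula that is not a column of 'data'
# (for example a constant such as pi or an object in the workspace).
minimal_model_data <- function(model, data) {
  vars <- intersect(all.vars(terms(model)), names(data))
  data[, vars, drop = FALSE]
}

head(minimal_model_data(z1, ChickWeight))

# A non-syntactic column name (with a space) is returned without backticks.
cw <- as.data.frame(ChickWeight)
cw$`dog tail` <- cw$Time
z2 <- lm(weight ~ poly(`dog tail`, 2), data = cw)
all.vars(terms(z2))
# [1] "weight"   "dog tail"
```

Because all.vars() works on the parsed formula rather than on its printed text, it does not depend on the variable names matching a regular expression.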

Regards,

Marc Schwartz

