Re: [R] Get variable names from results of lm()

Marc Schwartz Wed, 23 May 2012 17:50:51 -0700

On May 23, 2012, at 2:52 PM, arun wrote:

> Hi Marc,
> 
> Just to point out some difference,
> 
> 
>  x <- 1:20
>  y <- x + (x/4 - 2)^3 + rnorm(20, sd = 3)
>  names(y) <- paste("O", x, sep = ".")
>  ww <- rep(1, 20); ww[13] <- 0
>  summary(lmxy <- lm(y ~ x + I(x^2) + I(x^3) + I((x - 10)^2),
>                     weights = ww), cor = TRUE)
> 
> 
>> all.vars(formula(lmxy))
> [1] "y" "x"
> 
> 
>> variable.names(lmxy)
> [1] "(Intercept)" "x"           "I(x^2)"      "I(x^3)"   
> 
> 
> 
> A.K.



<snip>

Hi Arun,

Note that as long as the model terms are not factors (or other terms that get 
'expanded' into several columns), variable.names() will return the names of the 
terms, plus of course the intercept. I suspect, however, that in your example 
you might want:

> variable.names(lmxy, full = TRUE)
[1] "(Intercept)"   "x"             "I(x^2)"        "I(x^3)"       
[5] "I((x - 10)^2)"

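since the last term was dropped from your output. I((x - 10)^2) is a linear 
combination of the intercept, x and I(x^2), so it is aliased and its 
coefficient is NA. Note that you would get essentially the same information 
from: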

> names(coef(lmxy))
[1] "(Intercept)"   "x"             "I(x^2)"        "I(x^3)"       
[5] "I((x - 10)^2)"


again, with no factors present.
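
To make the dropped term concrete, here is a minimal sketch that rebuilds the 
fit and lists the aliased coefficient. The seed and the name 'aliased' are my 
own additions for reproducibility, not part of the original example:

```r
set.seed(1)  # arbitrary seed, only so that the sketch is reproducible
x <- 1:20
y <- x + (x/4 - 2)^3 + rnorm(20, sd = 3)
ww <- rep(1, 20); ww[13] <- 0
lmxy <- lm(y ~ x + I(x^2) + I(x^3) + I((x - 10)^2), weights = ww)

# (x - 10)^2 = x^2 - 20*x + 100, so the last column adds no new information
# and lm() reports its coefficient as NA.
aliased <- names(coef(lmxy))[is.na(coef(lmxy))]
aliased

# The same name is what separates the two variable.names() calls.
setdiff(variable.names(lmxy, full = TRUE), variable.names(lmxy))
```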


However, with factors present, consider:

> LM <- lm(Sepal.Length ~ ., data = iris)

> all.vars(formula(LM))
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
[5] "Species"  

versus:

> variable.names(LM)
[1] "(Intercept)"       "Sepal.Width"       "Petal.Length"     
[4] "Petal.Width"       "Speciesversicolor" "Speciesvirginica" 


That does no better than:

> names(coef(LM))
[1] "(Intercept)"       "Sepal.Width"       "Petal.Length"     
[4] "Petal.Width"       "Speciesversicolor" "Speciesvirginica" 


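This is because variable.names() essentially gets its information from the 
column names of the QR decomposition stored in the fit (only the first 'rank' 
columns by default, all of them with full = TRUE):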

> colnames(lmxy$qr$qr)
[1] "(Intercept)"   "x"             "I(x^2)"        "I(x^3)"       
[5] "I((x - 10)^2)"

> colnames(LM$qr$qr)
[1] "(Intercept)"       "Sepal.Width"       "Petal.Length"     
[4] "Petal.Width"       "Speciesversicolor" "Speciesvirginica" 
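
A quick check of that relationship, as a sketch only. It relies on the 'rank' 
and 'qr' components of the lm object; 'qr_names' and 'from_qr_default' are 
names I made up for illustration:

```r
LM <- lm(Sepal.Length ~ ., data = iris)

# Column names of the stored QR decomposition, in pivoted order.
qr_names <- colnames(LM$qr$qr)

# Default behaviour: keep only the first 'rank' columns.
from_qr_default <- qr_names[seq_len(LM$rank)]

identical(variable.names(LM), from_qr_default)
identical(variable.names(LM, full = TRUE), qr_names)
```

For a full-rank fit such as this one, both comparisons agree, since no column 
is dropped.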



Other options include:

> labels(terms(lmxy))
[1] "x"             "I(x^2)"        "I(x^3)"        "I((x - 10)^2)"

> labels(terms(LM))
[1] "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"    


These get the information from the 'term.labels' attribute of the model terms 
object, which holds the terms on the RHS of the formula:

> attr(terms(LM), "term.labels")
[1] "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species" 
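
If you need to go from the expanded coefficient names back to these term 
labels, one possible route is the 'assign' attribute of the model matrix. 
This goes a little beyond what is shown above, so treat it as a sketch; 
'tl', 'asg' and 'term_of_coef' are illustrative names only:

```r
LM <- lm(Sepal.Length ~ ., data = iris)

tl  <- attr(terms(LM), "term.labels")
asg <- attr(model.matrix(LM), "assign")  # 0 = intercept, k = k-th term label

# Label each coefficient with the term that produced its column.
term_of_coef <- setNames(c("(Intercept)", tl)[asg + 1], names(coef(LM)))
term_of_coef
```

Both Speciesversicolor and Speciesvirginica map back to "Species" here.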


You could also use:

> colnames(model.frame(LM))
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
[5] "Species"     

> colnames(model.frame(lmxy))
[1] "y"             "x"             "I(x^2)"        "I(x^3)"       
[5] "I((x - 10)^2)" "(weights)" 


This gives slightly different information (the response, the terms as they 
were evaluated, and any weights column), but shows that there is more than 
one way to get information from an R object, depending upon needs.


Let me throw another twist into the mix:

> variable.names(lm(y ~ poly(x, 3)))
[1] "(Intercept)" "poly(x, 3)1" "poly(x, 3)2" "poly(x, 3)3"

> all.vars(formula(lm(y ~ poly(x, 3))))
[1] "y" "x"

> labels(terms(lm(y ~ poly(x, 3))))
[1] "poly(x, 3)"

> colnames(model.frame(lm(y ~ poly(x, 3))))
[1] "y"          "poly(x, 3)"


Which output do you want? That depends on the use case. One needs to be 
cautious about proposing a generic solution when the underlying problem has 
not yet been precisely defined.
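
To compare the candidates side by side for a given fit, something along these 
lines may help. 'name_views' is a hypothetical helper of my own, not an 
existing function, and the seed is arbitrary:

```r
# Hypothetical helper: collect the four kinds of "names" discussed above.
name_views <- function(fit) {
  list(
    variables    = all.vars(formula(fit)),            # raw variables, LHS and RHS
    terms        = labels(terms(fit)),                # RHS term labels
    frame        = colnames(model.frame(fit)),        # model frame columns
    coefficients = variable.names(fit, full = TRUE)   # expanded column names
  )
}

set.seed(1)  # arbitrary seed, only for reproducibility
x <- 1:20
y <- x + (x/4 - 2)^3 + rnorm(20, sd = 3)

views <- name_views(lm(y ~ poly(x, 3)))
str(views)
```

Each element answers a different question, so the helper only makes the 
choice visible; it does not make it for you.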

Food for thought...

Regards,

Marc
