On 06/06/2010 10:49 PM, Mark Seeto wrote:
Hello,

I have a couple of questions about the ols function in Frank Harrell's rms
package.

Is there any way to specify variables by their column number in the data
frame rather than by the variable name?

For example,

library(rms)
x1 <- rnorm(100, 0, 1)
x2 <- rnorm(100, 0, 1)
x3 <- rnorm(100, 0, 1)
y <- x2 + x3 + rnorm(100, 0, 5)
d <- data.frame(x1, x2, x3, y)
rm(x1, x2, x3, y)
lm(y ~ d[,2] + d[,3], data = d)  # This works
ols(y ~ d[,2] + d[,3], data = d) # Gives error
Error in if (!length(fname) || !any(fname == zname)) { :
   missing value where TRUE/FALSE needed

However, this works:
ols(y ~ x2 + d[,3], data = d)

The reason I want to do this is to program variable selection for
bootstrap model validation.

A related question: does ols allow "y ~ ." notation?

lm(y ~ ., data = d[, 2:4])  # This works
ols(y ~ ., data = d[, 2:4]) # Gives error
Error in terms.formula(formula) : '.' in formula and no 'data' argument

Thanks for any help you can give.

Regards,
Mark

Hi Mark,

It appears that you answered the questions yourself. rms wants real variables or transformations of them in the formula, not expressions such as d[,2], because it makes certain assumptions about the names of terms. The y ~ . notation should work, though; I'll have a look at that sometime.
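
One possible workaround for selecting predictors by column position is to turn the positions into names and build the formula from those, so that ols only ever sees real variable names. The sketch below uses base R's reformulate(); the seed, the `cols` vector and the object names are illustrative assumptions, not part of the original example, and the ols() call only runs if rms is installed. The same trick can expand "y ~ ." by hand.

```r
# Sketch: build the model formula from column positions instead of d[, j] terms.
set.seed(1)  # arbitrary seed, only so the illustration is reproducible
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- d$x2 + d$x3 + rnorm(100, 0, 5)

cols <- c(2, 3)  # hypothetical choice: column positions of the predictors

# Translate positions to names, then assemble y ~ x2 + x3
f <- reformulate(names(d)[cols], response = "y")

# Hand-expanded equivalent of "y ~ .": every column except the response
f_all <- reformulate(setdiff(names(d), "y"), response = "y")

fit_lm <- lm(f, data = d)  # base R fit, for comparison

# Assumption: ols() accepts a formula object holding ordinary variable names
if (requireNamespace("rms", quietly = TRUE)) {
  library(rms)
  fit_ols <- ols(f, data = d)
  fit_ols_all <- ols(f_all, data = d)
}
```

Because the formula contains only names, the same code can be reused inside a loop over candidate column sets.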

But these are small questions compared with what you really want. Why do you need variable selection, i.e., what is wrong with having insignificant variables in a model? If you do need variable selection, see whether backwards stepdown works for you. It is built into the rms bootstrap validation and calibration functions.
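
As a rough illustration of the built-in stepdown, the sketch below fits the full model with ols and asks validate() to repeat backward stepdown within each bootstrap resample via bw = TRUE. The simulated data, the seed and B = 200 are arbitrary assumptions for the example; calibrate() takes a bw argument in the same way. The code only runs if rms is installed.

```r
# Sketch: bootstrap validation with backward stepdown repeated in every resample.
if (requireNamespace("rms", quietly = TRUE)) {
  library(rms)

  set.seed(2)  # arbitrary seed for reproducibility
  d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
  d$y <- d$x2 + d$x3 + rnorm(100, 0, 5)

  # x = TRUE, y = TRUE store the design matrix and response,
  # which validate() needs in order to refit on resamples
  fit <- ols(y ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)

  # bw = TRUE: run backward stepdown on each bootstrap sample
  # (B = 200 resamples is an arbitrary choice here)
  v <- validate(fit, B = 200, bw = TRUE)
  print(v)
}
```

The printed table reports apparent and optimism-corrected indexes, so the cost of the selection step is reflected in the validation itself.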

Frank

--
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
