On Thu, Sep 22, 2011 at 2:54 PM, trekvana <trekv...@aol.com> wrote:
> Hello all,
>
> So I am using the (formula entry) method for randomForests:
>
> randomForest(y~x1+x2+...+x39+x40,data=xxx,...)
>
> but the issue is that some of the items in that package don't take a
> formula entry - you have to explicitly state the y and x vector:
>
> randomForest(x=xxx[,c('x1','x2',...,'x40')],y=xxx[,'y'],...)
>
> Now my question is whether there is a function/way to tell R to take a
> formula and make the two corresponding datasets [x,y] (that way I don't
> have to create the x dataset manually with all 40 variables I have).
>
> There must be a more elegant way to do this than
> x=xxx[,c('x1','x2',...,'x40')]

We assume that the formula is of the form:

  fo <- y ~ x1 + x2 + x3

Now if we set:

  v <- all.vars(fo)

and DF is our data frame, then DF[, v[1]] is the response and DF[v[-1]]
holds the predictors.

(Depending on what you intend to do next, you may need to add an
intercept to the predictors and convert them from a data frame to a
matrix.)

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
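
For reference, the all.vars() approach from the reply can be sketched end to end as below. The data frame DF and its values are made up for illustration (they stand in for the poster's xxx), and the model.matrix() step is only one possible way to get the intercept-plus-matrix form the reply mentions, not necessarily what was intended.

```r
# Hypothetical data frame standing in for the poster's `xxx`
DF <- data.frame(
  y  = c(1.2, 3.4, 5.6, 7.8),
  x1 = c(1, 2, 3, 4),
  x2 = c(0.5, 0.1, 0.9, 0.3),
  x3 = c(10, 20, 30, 40)
)

fo <- y ~ x1 + x2 + x3

# all.vars() returns the variable names in the formula, response first
v <- all.vars(fo)   # "y" "x1" "x2" "x3"

y <- DF[, v[1]]     # response as a plain vector
x <- DF[v[-1]]      # predictors as a data frame

# One way (an assumption, not part of the reply) to get a numeric
# matrix with an intercept column from the same formula
X <- model.matrix(fo, data = DF)
```

The pair x and y can then be passed to functions that expect separate arguments rather than a formula. Note that this relies on every predictor being written out: for a formula such as y ~ . the call all.vars() returns "y" and ".", so the dot is not expanded into column names.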