On 20.06.2013 16:46, David martin wrote:
Hi,
When using errorest on a large dataset (12000 variables), it runs
very slowly. The documentation of the randomForest package says that for
large datasets the use of the formula interface is discouraged.

So it's better to use the x and y arguments, as in the example below:
rf <- randomForest(x = df[trainindices, -1], y = df[trainindices, 1],
                   xtest = df[testindices, -1], ytest = df[testindices, 1],
                   do.trace = 5, ntree = 500)

Would it be possible to modify errorest so that it uses x and y rather
than a formula? I think that would increase speed on large datasets.

errorest(type ~ ., data = mydate, model = randomForest, mtry = 2)  # runs slowly
errorest(x = type, y = variables, data = mydate,
         model = randomForest, mtry = 2)  # would run faster if implemented
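
Until errorest offers such an interface, one possible workaround is to run the cross-validation loop by hand and call the model through its x/y interface. The sketch below is only an illustration under stated assumptions: cv_error_xy is a hypothetical helper (not a function from any package), y must be a factor, every class is assumed to appear in every training fold, and the result is the plain k-fold misclassification rate, which may differ from what errorest reports with its own settings.

```r
# Hypothetical helper, not from any package: k-fold cross-validated
# misclassification error for a model with an x/y interface.
cv_error_xy <- function(x, y, model, k = 10, ...) {
  stopifnot(is.factor(y), nrow(x) == length(y))
  # Randomly assign each observation to one of k folds of near-equal size.
  folds <- sample(rep(seq_len(k), length.out = length(y)))
  pred <- factor(rep(NA, length(y)), levels = levels(y))
  for (i in seq_len(k)) {
    test <- folds == i
    # Fit on the other k - 1 folds without building a formula or model frame.
    fit <- model(x = x[!test, , drop = FALSE], y = y[!test], ...)
    pred[test] <- predict(fit, x[test, , drop = FALSE])
  }
  # Share of held-out observations that were misclassified.
  mean(pred != y)
}

# Intended use with the data from the post (class label in column 1):
# library(randomForest)
# cv_error_xy(x = mydate[, -1], y = mydate[, 1],
#             model = randomForest, mtry = 2, ntree = 500)
```

The model is passed in as a function, in the same spirit as errorest's model argument, so any fitting function that accepts x and y and has a predict method could be substituted.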

Talk to the maintainer of the package you found errorest() in?

Best,
Uwe Ligges

thanks,
david

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
