Hi, I am using the caretNWS package to train some supervised regression models (gbm, lasso, random forest, and MARS). The problem started when my training data set grew in both the number of predictors and the number of observations.
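For scale, here is a rough back-of-the-envelope memory estimate (my own sketch, assuming the matrix is stored as doubles at 8 bytes each, and assuming nws serializes a full copy of the training data to each sleigh):

```r
## Rough memory arithmetic for a 22000 x 347 numeric matrix
## (assumption: 8 bytes per double; nws ships a copy to each sleigh)
n_obs  <- 22000   # ~10% sample of the overall data
n_cols <- 347
bytes_per_double <- 8

## Size of one in-memory copy, in MB
one_copy_mb <- n_obs * n_cols * bytes_per_double / 2^20
one_copy_mb   # roughly 58 MB per copy

## With 5 sleighs each holding a copy, plus the master process
## (and ignoring any extra working copies made by the model code):
n_sleighs <- 5
(1 + n_sleighs) * one_copy_mb
```

That multiplication is only a lower bound, since model-fitting code typically makes additional working copies, which may matter against a 2 GB memory.limit().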
The training data set has 347 numeric columns. The problem is that when there are more than 2500 observations, the 5 sleigh objects start but use no CPU and do not process any data:

N=100                    cpu(%)    memory(K)
  Rgui.exe                  0        91737
  5x sleighs (RTerm.exe)  15-25      ~27000

N=2500
  Rgui.exe                  0       160000
  5x sleighs (RTerm.exe)  15-25      ~74000

N=5000
  Rgui.exe                 50       193000
  5x sleighs (RTerm.exe)    0        ~19000

A 10% sample of my overall data is ~22000 observations.

Can someone give me an idea of the limitations of the nws and caretNWS packages in terms of the number of rows and columns of the training matrices, and whether there are other tuning/training functions that work faster on large datasets?

Thanks for your help.

Peter

> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          6.2
year           2008
month          02
day            08
svn rev        44383
language       R
version.string R version 2.6.2 (2008-02-08)

> memory.limit()
[1] 2047

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.