Hi,

I am using the caretNWS package to train some supervised regression models 
(gbm, lasso, random forest and mars). I have run into a problem that started 
when the number of predictors and the number of observations in my training 
data set increased.

The training data set has 347 numeric columns. When there are more than 2500 
observations, the 5 sleigh objects start but do not use any CPU resources and 
do not process any data.

N = 100                    CPU (%)    Memory (K)
Rgui.exe                   0          91737
5x sleighs (RTerm.exe)     15-25      ~27000

N = 2500                   CPU (%)    Memory (K)
Rgui.exe                   0          160000
5x sleighs (RTerm.exe)     15-25      ~74000

N = 5000                   CPU (%)    Memory (K)
Rgui.exe                   50         193000
5x sleighs (RTerm.exe)     0          ~19000


A 10% sample of my overall data is ~22000 observations.

Can someone give me an idea of the limitations of the nws and caretNWS packages 
in terms of the number of columns and rows of the training matrices, and 
whether there are other tuning/training functions that work faster on large 
datasets?

Thanks for your help.
Peter


> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          6.2
year           2008
month          02
day            08
svn rev        44383
language       R
version.string R version 2.6.2 (2008-02-08)

> memory.limit()
[1] 2047

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.