[R] Need Help! Poor performance about randomForest for large data

2010-05-25 Thread Jia ZJ Zou
Hi all, I am processing a data set with 60 columns and 286,730 rows. Most columns are numerical, and some are categorical. It turns out that when ntree is left at its default value (500), R reports that it cannot allocate a vector of size 1.1 GB. And when I set ntree to be a very

Re: [R] Need Help! Poor performance about randomForest for large data

2010-05-25 Thread Joris Meys
Hi Jia, without seeing the actual data, it's difficult to give solid options. But it is quite normal for this to run for hours: it has to make a whole lot of decisions, and it can grow tremendously large trees with that amount of data. The error is also quite logical: you just can't store all those huge
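
One common way to act on the point about tree size and memory is to make each tree smaller and to grow the forest in pieces. The following is only a rough sketch under stated assumptions, not something proposed in the thread: the data are synthetic (the poster's table is not available), the response is assumed to be numeric, and the particular values for sampsize, nodesize and the chunk size are arbitrary illustrations. It uses the randomForest arguments ntree, sampsize and nodesize, and randomForest::combine to merge separately grown forests.

```r
# Sketch only: synthetic data stands in for the poster's 286,730 x 60 table.
library(randomForest)

set.seed(1)
n <- 5000
x <- data.frame(matrix(rnorm(n * 10), ncol = 10))
y <- x[[1]] + rnorm(n)            # numeric response, so this is regression

# Grow one small forest. The values below are illustrative assumptions.
grow_chunk <- function(ntree) {
  randomForest(x, y,
               ntree    = ntree,
               sampsize = 1000,   # rows drawn for each tree
               nodesize = 50)     # larger terminal nodes give shallower trees
}

# Four chunks of 25 trees each, merged into a single 100-tree forest.
chunks <- lapply(rep(25, 4), grow_chunk)
rf <- do.call(randomForest::combine, chunks)
rf$ntree
```

Smaller sampsize and larger nodesize both limit how deep each tree can grow, which reduces the memory needed to hold the forest, at some possible cost in accuracy. Growing in chunks also means each chunk could be saved to disk or built in a separate R session before merging.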