Re: [Scikit-learn-general] RandomForest benchmark

2012-11-17 Thread Olivier Grisel
Yeah, actually they can only be better if the data is memmapped in advance (for instance using joblib.dump(data, filename) / joblib.load(filename, mmap_mode='c')). Also, this is only really interesting for large datasets (e.g. larger than 100MB), which in retrospect is probably not the case here. 201
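A minimal sketch of the dump / load pattern mentioned above, assuming `data` is a plain NumPy array. The array shape, the file name `data.pkl`, and the helper `dump_and_memmap` are made up for illustration and are not taken from the benchmark:

```python
import os
import tempfile

import joblib
import numpy as np


def dump_and_memmap(data, directory):
    """Write `data` to disk once, then reload it as a memory map.

    `dump_and_memmap` is a hypothetical helper, not part of joblib.
    """
    filename = os.path.join(directory, "data.pkl")  # illustrative file name
    # Dump without compression: compressed files cannot be memory mapped.
    joblib.dump(data, filename)
    # mmap_mode='c' is copy-on-write: in-memory changes never reach the file.
    return joblib.load(filename, mmap_mode="c"), filename


tmp_dir = tempfile.mkdtemp()
data = np.random.RandomState(0).rand(1000, 50)  # stand-in for a real dataset
mapped, path = dump_and_memmap(data, tmp_dir)
```

The copy-on-write mode lets the array be modified in memory while leaving the file on disk untouched. For an array this small the extra disk round trip is unlikely to pay off, which matches the caveat about dataset size.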

Re: [Scikit-learn-general] RandomForest benchmark

2012-11-17 Thread Peter Prettenhofer
Olivier,

I tested it with the joblib PR - results got a bit worse. See below.

best,
Peter

arcene
         r               py
score    0.2700 (0.03)   0.2633 (0.02)
train    3.9454 (0.09)   4.6661 (0.20)
test

Re: [Scikit-learn-general] RandomForest benchmark

2012-11-16 Thread Olivier Grisel
You can retry by replacing the sklearn/externals/joblib folder with the joblib folder of this branch: https://github.com/joblib/joblib/pull/44

Re: [Scikit-learn-general] RandomForest benchmark

2012-11-16 Thread Satrajit Ghosh
this would also be consistent with the evaluation done here: http://wise.io/wiserf.html

cheers,
satra

On Fri, Nov 16, 2012 at 2:25 PM, Peter Prettenhofer <peter.prettenho...@gmail.com> wrote:
> Hi,
>
> I did a quick benchmark to compare sklearn's RandomForestClassifier
> against R's randomFo