Yeah, actually they can only be better if the data is memmapped in
advance (for instance using joblib.dump(data, filename) /
joblib.load(filename, mmap_mode='c')). Also, this is only really
interesting for large datasets (e.g. larger than 100MB), which in
retrospect is probably not the case here.
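
For reference, a minimal sketch of the dump / load round trip mentioned
above. The array shape, the temporary folder and the file name
"data.pkl" are made up for illustration and are not from the benchmark;
the only calls taken from this thread are joblib.dump(data, filename)
and joblib.load(filename, mmap_mode='c'). It assumes an uncompressed
dump of a plain numeric array, which is the case where joblib can hand
back a memory map.

```python
import os
import tempfile

import numpy as np
import joblib

# Hypothetical stand-in for a training matrix; a real use case would be
# much larger (the thread suggests > 100MB before this pays off).
data = np.arange(1000, dtype=np.float64).reshape(100, 10)

# Hypothetical location for the dumped file.
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, "data.pkl")

# Write the array to disk once, uncompressed, so it can be memory mapped.
joblib.dump(data, filename)

# Reload it as a memory map. mmap_mode='c' is copy-on-write: writes go to
# memory only and are never flushed back to the file on disk.
mapped = joblib.load(filename, mmap_mode="c")

# Modify the mapped copy, then reload to check the file is unchanged.
mapped[0, 0] = -1.0
reloaded = joblib.load(filename)
```

With mmap_mode='c' the file on disk stays untouched, so several
processes can map the same dump without stepping on each other.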
Olivier,
I tested it with the joblib PR; the results got a bit worse
(see below).
best,
Peter
arcene
        r              py
score   0.2700 (0.03)  0.2633 (0.02)
train   3.9454 (0.09)  4.6661 (0.20)
test
You can retry by replacing the sklearn/externals/joblib folder with
the joblib folder of this branch:
https://github.com/joblib/joblib/pull/44
this would also be consistent with the evaluation done here:
http://wise.io/wiserf.html
cheers,
satra
On Fri, Nov 16, 2012 at 2:25 PM, Peter Prettenhofer <
peter.prettenho...@gmail.com> wrote:
> Hi,
>
> I did a quick benchmark to compare sklearn's RandomForestClassifier
> against R's randomFo