On 02/26/2014 05:55 PM, Peter Prettenhofer wrote:
>
> please make sure to pickle with the highest protocol - otherwise
> pickle uses a textual serialization format which is quite inefficient:
>
> pickle.dump(clf, f, protocol=pickle.HIGHEST_PROTOCOL)
Or simply protocol=-1. This usually makes a hu
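
As a rough sketch of the difference, the snippet below pickles the same object with protocol 0 (the textual format) and with the highest protocol, then compares the file sizes. The `model_like` dictionary and the `pickled_size` helper are made up for illustration and stand in for a fitted `clf`. Note that Python 3 already defaults to a binary protocol, so the gap shown here is against protocol 0 specifically.

```python
import os
import pickle
import random
import tempfile

# Hypothetical stand-in for a fitted model: a dict holding many floats.
# A real `clf` would be passed to pickle.dump in exactly the same way.
random.seed(0)
model_like = {"weights": [random.random() for _ in range(10000)]}


def pickled_size(obj, protocol):
    """Pickle obj to a temporary file and return the file size in bytes."""
    # NamedTemporaryFile opens in binary mode, which pickle requires.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        pickle.dump(obj, f, protocol=protocol)
        path = f.name
    try:
        return os.path.getsize(path)
    finally:
        os.remove(path)


size_text = pickled_size(model_like, 0)                          # textual format
size_highest = pickled_size(model_like, pickle.HIGHEST_PROTOCOL)
size_minus_one = pickled_size(model_like, -1)                    # shorthand for the highest

print("protocol 0:       %d bytes" % size_text)
print("highest protocol: %d bytes" % size_highest)
print("protocol=-1:      %d bytes" % size_minus_one)
```

Passing `protocol=-1` selects the same protocol as `pickle.HIGHEST_PROTOCOL`, so the last two sizes should match.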
You can control the size of your random forest by adjusting the
parameters n_estimators, min_samples_split and even max_depth (read
the documentation for more details).
It's up to you to find parameter values that match your constraints in
terms of accuracy vs model size in RAM and prediction speed.
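
One way to explore that trade-off is to fit forests with different settings and compare node counts, pickled size and held-out accuracy. The sketch below assumes a synthetic dataset from `make_classification`, and the `fit_and_measure` helper and the particular parameter values are illustrative choices only, not recommendations.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, purely illustrative; substitute your own X and y.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)


def fit_and_measure(**params):
    """Fit a forest and return (total tree nodes, pickled bytes, test accuracy)."""
    clf = RandomForestClassifier(random_state=0, **params).fit(X_train, y_train)
    n_nodes = sum(est.tree_.node_count for est in clf.estimators_)
    n_bytes = len(pickle.dumps(clf, protocol=pickle.HIGHEST_PROTOCOL))
    return n_nodes, n_bytes, clf.score(X_test, y_test)


# Fully grown trees versus fewer, shallower trees.
big_nodes, big_bytes, big_acc = fit_and_measure(n_estimators=50)
small_nodes, small_bytes, small_acc = fit_and_measure(
    n_estimators=10, max_depth=5, min_samples_split=20
)

print("large forest: %d nodes, %d bytes, accuracy %.3f" % (big_nodes, big_bytes, big_acc))
print("small forest: %d nodes, %d bytes, accuracy %.3f" % (small_nodes, small_bytes, small_acc))
```

The smaller forest pickles to far fewer bytes; whether the accuracy it gives up is acceptable depends on your data and has to be checked case by case.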
Hi Lorenzo,
please make sure to pickle with the highest protocol - otherwise pickle
uses a textual serialization format which is quite inefficient:
pickle.dump(clf, f, protocol=pickle.HIGHEST_PROTOCOL)
For large datasets, limit the number of tree nodes by specifying
``min_samples_leaf`` -- sett