On 02/26/2014 05:55 PM, Peter Prettenhofer wrote:
>
> please make sure to pickle with the highest protocol - otherwise
> pickle uses a textual serialization format which is quite inefficient:
>
> pickle.dump(clf, f, protocol=pickle.HIGHEST_PROTOCOL)
Or simply protocol=-1. This usually makes a huge difference.
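
For what it's worth, here is a minimal sketch on synthetic data (the dataset,
sizes and parameter values are placeholders, not Lorenzo's) showing the size
gap between the text-based protocol 0 and the binary protocol -1:

    import pickle
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic stand-in for the real training data.
    X, y = make_classification(n_samples=10000, n_features=20, random_state=0)
    clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

    # protocol=0 is the old text-based format; protocol=-1 selects
    # pickle.HIGHEST_PROTOCOL, a compact binary format.
    print("protocol  0: %.1f MB" % (len(pickle.dumps(clf, protocol=0)) / 1e6))
    print("protocol -1: %.1f MB" % (len(pickle.dumps(clf, protocol=-1)) / 1e6))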
You can control the size of your random forest by adjusting the
parameters n_estimators, min_samples_split and even max_depth (read
the documentation for more details).
It's up to you to find parameter values that match your constraints in
terms of accuracy vs model size in RAM and prediction speed.
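
As an illustration only, a rough sketch on synthetic data comparing pickle
sizes for two arbitrary parameter settings (the values are examples, not
recommendations):

    import pickle
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic data; tune the real parameters against your own
    # accuracy / memory / speed constraints.
    X, y = make_classification(n_samples=50000, n_features=20, random_state=0)

    for params in [dict(n_estimators=100),
                   dict(n_estimators=50, min_samples_split=20, max_depth=15)]:
        clf = RandomForestClassifier(random_state=0, **params).fit(X, y)
        size_mb = len(pickle.dumps(clf, protocol=-1)) / 1e6
        print(params, "-> %.1f MB" % size_mb)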
Hi Lorenzo,
please make sure to pickle with the highest protocol - otherwise pickle
uses a textual serialization format which is quite inefficient:
pickle.dump(clf, f, protocol=pickle.HIGHEST_PROTOCOL)
For large datasets, limit the number of tree nodes by specifying
``min_samples_leaf`` -- setting it to a value larger than the default of 1
limits how finely the trees split the data and can reduce the model size
considerably.
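
For example, a small sketch on synthetic data showing how raising
``min_samples_leaf`` shrinks the pickled model (values chosen arbitrarily):

    import pickle
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=50000, n_features=20, random_state=0)

    # Larger min_samples_leaf -> fewer leaf nodes per tree -> smaller pickles.
    for leaf in [1, 5, 20]:
        clf = RandomForestClassifier(n_estimators=50, min_samples_leaf=leaf,
                                     random_state=0).fit(X, y)
        print("min_samples_leaf=%d -> %.1f MB"
              % (leaf, len(pickle.dumps(clf, protocol=-1)) / 1e6))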
Dear All,
I am using RandomForest on a data set which has less than 20 features, but
about 40 lines.
The point is that, even if I work on a subset of about 3 lines to
train my model, when I save it using pickle I get a large file on the
order of several hundred MB (see