Hello, I am dumping a dataset vectorized with TfidfVectorizer, the target array, and the classifier OneVsRestClassifier(SGDClassifier(loss='log', n_iter=50, alpha=0.00001)), since I want to add it to a package. I use the joblib library from sklearn.externals to dump the vectors. Peak memory while training the classifier is 12 GB, but when the program starts dumping the classifier, usage jumps to 38 GB (which I assume is due to some internal copy?). I have about 32 GB of RAM, so is there a better way to store the classifier than joblib.dump(compress=9)? [I tried compress=3, 5, 7 and 9, and always get a memory error.] Without compression the dumped vectors total about 11 GB. Thanks
_______________________________________________ Scikit-learn-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
