[scikit-learn] baggingClassifier with pipeline

2019-06-27 Thread Roxana Danger
Hello, I would like to use a BaggingClassifier whose base estimator is a pipeline with multiple transformations, including a DataFrameMapper from sklearn_pandas. I am getting an error while fitting the DataFrameMapper, as the first step of the BaggingClassifier is to convert the DataFrame to a
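
A minimal sketch of a BaggingClassifier wrapping a pipeline, assuming (the message is cut off here) that the base estimator does not receive a DataFrame. It swaps DataFrameMapper for scikit-learn's ColumnTransformer with integer column positions, which do not rely on column names; the synthetic data, step names, and column choices are illustrative assumptions, not the poster's setup.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 200 samples, 3 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Select columns by position, so the step works on plain arrays.
preprocess = ColumnTransformer(
    [("scale", StandardScaler(), [0, 1])],
    remainder="passthrough",
)
pipeline = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])

# The pipeline is passed positionally as the estimator to be bagged.
bagging = BaggingClassifier(pipeline, n_estimators=5, random_state=0)
bagging.fit(X, y)
preds = bagging.predict(X)
```

Whether this resolves the reported error depends on details the preview does not show.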

Re: [scikit-learn] Scikit Learn in a Cray computer

2019-06-27 Thread Mauricio Reis
Finally, I was able to access the Cray computer and run the routine. Below are the files and commands I used and the result, where you can see "ncpus = 1" (I still do not know why 4 lines were printed; I only know that this number depends on the value of the "aprun" command us

Re: [scikit-learn] titanic dataset, use for book

2019-06-27 Thread Roman Yurchak via scikit-learn
Meanwhile, loading the CSV from OpenML (https://www.openml.org/d/40945) would also work: pd.read_csv('https://www.openml.org/data/get_csv/16826755/phpMYEkMl') -- Roman On 25/06/2019 17:04, Andreas Mueller wrote: > By the time your book comes out, it's likely to be merged, but might not > be r

[scikit-learn] Any drawbacks when using partial_fit?

2019-06-27 Thread lampahome
I am trying to use Birch to cluster time-series data incrementally. Because of insufficient memory, I train it batch by batch. Each batch has 1000 samples, and there are 50 batches. I found that when I train on only the first batch, it clusters well. After that first training, I train on the following batches with the same model and
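
A minimal sketch of the batch-by-batch setup described above, using Birch.partial_fit on 50 batches of 1000 samples. The synthetic 2-D blobs, the make_batch helper, and the threshold value are illustrative assumptions, not the poster's data or settings.

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)
# Three well-separated blob centers (assumption, stands in for real data).
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])

def make_batch(n_samples=1000):
    """Hypothetical helper: draw one batch of points around the centers."""
    idx = rng.integers(0, len(centers), size=n_samples)
    return centers[idx] + rng.normal(scale=0.5, size=(n_samples, 2))

# n_clusters=None skips the final global clustering step, so the labels
# refer directly to the subclusters built up in the CF tree.
model = Birch(threshold=0.5, n_clusters=None)
for _ in range(50):
    model.partial_fit(make_batch())  # updates the same tree each call

labels = model.predict(make_batch())
n_subclusters = len(model.subcluster_centers_)
```

Comparing n_subclusters and the predicted labels after the first batch and after all batches is one way to see how the model changes as more data arrives.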