[scikit-learn] help-Renaming features in Scikit-learn's CountVectorizer()

2018-03-05 Thread Ranjana Girish
Hi all, I have a very large pandas dataframe. Below is a sample:

Id  description
1   switvch for air conditioner transformer..
2   control tfrmr...
3   coling pad.
4   DRLG machine
5   hair smothing kit..

[scikit-learn] Need help in dealing with large dataset

2018-03-05 Thread CHETHAN MURALI
Dear All, I am working on building a CNN model for an image classification problem. As part of it, I have converted all my test images to a NumPy array. Now, when I try to split the array into training and test sets, I get a memory error. Details are below: X = np.load("./data/X_train.npy",

Re: [scikit-learn] Need help in dealing with large dataset

2018-03-05 Thread Guillaume Lemaître
If you work with deep nets, you need to check the utilities from the deep learning library. For instance, in Keras you should create a batch generator if you need to deal with a large dataset. In PyTorch, you can use the DataLoader and the ImageFolder from torchvision, which manage the loading for you.
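
The batch-generator idea can be illustrated without committing to a particular deep learning library. The following is a minimal sketch in plain NumPy, not the Keras or PyTorch API: the function name `batch_generator`, the demo file names, and the array shapes are all assumptions made for illustration. It relies on `np.load(..., mmap_mode="r")` so that only the requested slice is read from disk.

```python
import os
import tempfile

import numpy as np


def batch_generator(x_path, y_path, batch_size):
    """Yield (X_batch, y_batch) pairs without loading the full arrays.

    Hypothetical helper for illustration; not part of any library.
    """
    # mmap_mode="r" maps the .npy files instead of reading them into RAM.
    X = np.load(x_path, mmap_mode="r")
    y = np.load(y_path, mmap_mode="r")
    for start in range(0, X.shape[0], batch_size):
        stop = start + batch_size
        # np.array(...) copies only the current slice into memory.
        yield np.array(X[start:stop]), np.array(y[start:stop])


# Small stand-in data (assumed shapes), written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
x_path = os.path.join(tmp_dir, "X_demo.npy")
y_path = os.path.join(tmp_dir, "y_demo.npy")
np.save(x_path, np.random.rand(10, 8, 8, 3).astype(np.float32))
np.save(y_path, np.arange(10) % 2)

batch_shapes = [
    (xb.shape, yb.shape)
    for xb, yb in batch_generator(x_path, y_path, batch_size=4)
]
```

With 10 samples and a batch size of 4, the generator yields two full batches and one final batch of 2. A framework-specific loader would typically add shuffling and parallel loading on top of this pattern.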

Re: [scikit-learn] Need help in dealing with large dataset

2018-03-05 Thread Sebastian Raschka
Like Guillaume suggested, you don't want to load the whole array into memory if it's that large. There are many different ways to deal with this. The most naive way would be to break up your NumPy array into smaller NumPy arrays and load them iteratively with a running accuracy calculatio
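
The naive approach described here, splitting the array into smaller files and accumulating accuracy chunk by chunk, might look like the sketch below. The names `running_accuracy` and `toy_predict`, the chunk file names, and the toy data are assumptions for illustration; `toy_predict` stands in for whatever trained model is actually used.

```python
import os
import tempfile

import numpy as np


def running_accuracy(chunk_paths, predict):
    """Accumulate accuracy over chunks, holding one chunk in memory at a time."""
    correct = 0
    seen = 0
    for x_path, y_path in chunk_paths:
        X_chunk = np.load(x_path)
        y_chunk = np.load(y_path)
        correct += int(np.sum(predict(X_chunk) == y_chunk))
        seen += len(y_chunk)
    return correct / seen


def toy_predict(X_chunk):
    # Hypothetical stand-in for a trained model's predict function.
    return (X_chunk[:, 0] >= 6).astype(int)


# Toy data: one feature, labels switch from 0 to 1 at index 5.
X = np.arange(10).reshape(10, 1)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# Break the arrays into three smaller arrays and save each to its own file.
tmp_dir = tempfile.mkdtemp()
chunk_paths = []
for i, (X_part, y_part) in enumerate(zip(np.array_split(X, 3), np.array_split(y, 3))):
    x_path = os.path.join(tmp_dir, f"X_chunk_{i}.npy")
    y_path = os.path.join(tmp_dir, f"y_chunk_{i}.npy")
    np.save(x_path, X_part)
    np.save(y_path, y_part)
    chunk_paths.append((x_path, y_path))

accuracy = running_accuracy(chunk_paths, toy_predict)
```

Because only the counts of correct and seen samples are carried between chunks, the result is identical to computing accuracy on the full array at once.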

Re: [scikit-learn] help-Renaming features in Scikit-learn's CountVectorizer()

2018-03-05 Thread Joel Nothman
You can effectively merge features through matrix multiplication: multiply the CountVectorizer output by a sparse matrix of shape (n_features_in, n_features_out) which has 1 where the output feature corresponds to an input feature. Your spelling correction then consists of building this mapping mat
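
A minimal sketch of the matrix-multiplication merge described above, assuming a hand-written `corrections` dictionary (hypothetical; in practice it would come from a spelling corrector) and a few toy documents loosely based on the sample in the original question:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.feature_extraction.text import CountVectorizer

# Toy documents and hypothetical spelling corrections, for illustration only.
docs = [
    "switvch for air conditioner",
    "control tfrmr",
    "coling pad",
    "cooling pad switch",
]
corrections = {"switvch": "switch", "tfrmr": "transformer", "coling": "cooling"}

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)  # shape (n_docs, n_features_in)

in_vocab = vectorizer.vocabulary_  # maps term -> input column index
# Each input term maps to its corrected form, or to itself if no correction exists.
canonical = {term: corrections.get(term, term) for term in in_vocab}
out_terms = sorted(set(canonical.values()))
out_index = {term: j for j, term in enumerate(out_terms)}

# Sparse mapping of shape (n_features_in, n_features_out) with a 1 wherever
# an input feature corresponds to an output feature.
rows = [in_vocab[term] for term in in_vocab]
cols = [out_index[canonical[term]] for term in in_vocab]
data = np.ones(len(rows), dtype=X.dtype)
mapping = csr_matrix((data, (rows, cols)), shape=(len(in_vocab), len(out_terms)))

# Columns of misspelled terms are summed into their corrected columns.
X_merged = X @ mapping
```

Here `X_merged` has one column per corrected term, and the counts for "coling" and "cooling" end up in the same column. The total token count is unchanged, since every input feature maps to exactly one output feature.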

Re: [scikit-learn] transfer-learning for random forests

2018-03-05 Thread Andreas Mueller
http://scikit-learn.org/dev/faq.html#what-are-the-inclusion-criteria-for-new-algorithms On 02/16/2018 04:51 AM, peignier sergio wrote: Hello, I recently began a research project on Transfer Learning with some colleagues. We would like to contribute to scikit-learn incorporating Transfer Lear