Re: [scikit-learn] [Feature] drop_one in one hot encoder

2017-06-25 Thread Gael Varoquaux
On Sun, Jun 25, 2017 at 05:18:09PM +0530, Parminder Singh wrote:
> Hy Sci-kittens! :-)

Nice :). FYI: there is work in progress to replace the OneHotEncoder, as it has many strong limitations:
https://github.com/scikit-learn/scikit-learn/pull/9151
It might be useful to have a look at this PR to m

Re: [scikit-learn] [Feature] drop_one in one hot encoder

2017-06-25 Thread Sebastian Raschka
Hi, hm, I think that dropping a column from one-hot encoded features is quite uncommon in machine learning practice -- based on the applications and implementations I've seen. My guess is that the one-hot encoded features are multicollinear anyway!? There may be certain algorithms that benefit from
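
The multicollinearity mentioned above can be checked on a toy example. The following is a minimal sketch in plain Python, assuming a made-up 4-sample, 3-category one-hot matrix (the data and variable names are illustrative and do not come from the thread). It shows that every row sums to 1, so any one column can be recovered from the others.

```python
# Hypothetical one-hot matrix: 4 samples, 3 categories (illustrative data only).
rows = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]

# Each sample belongs to exactly one category, so every row sums to 1.
row_sums = [sum(r) for r in rows]

# Consequently the last column equals 1 minus the sum of the other columns,
# i.e. the full set of columns is linearly dependent on a constant column of ones.
last_from_others = [1 - sum(r[:-1]) for r in rows]
is_redundant = last_from_others == [r[-1] for r in rows]

print(row_sums)
print(is_redundant)
```

Whether this redundancy is a problem depends on the model; it mainly affects unregularized linear models that also fit an intercept.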

[scikit-learn] [Feature] drop_one in one hot encoder

2017-06-25 Thread Parminder Singh
Hy Sci-kittens! :-) I was taking the Machine Learning A-Z course on Udemy, where they said that every time one-hot encoding is done, one of the columns should be dropped, as it repeats information already carried by the other columns and is redundant to the model. I thought if instead of having the user find the index and drop it
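
To make the proposal concrete, here is a minimal sketch in plain Python of one-hot encoding with an optional dropped column. It is not scikit-learn's API: the `one_hot` helper and its `drop_first` flag are hypothetical names used only for illustration, and dropping the first category in sorted order is an arbitrary choice.

```python
def one_hot(values, drop_first=False):
    """One-hot encode a list of category labels.

    Returns (columns, rows), where columns are the kept categories in
    sorted order. `drop_first` is a hypothetical flag for this sketch.
    """
    categories = sorted(set(values))
    # Optionally drop the first category; it is then encoded as all zeros.
    kept = categories[1:] if drop_first else categories
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


colors = ["red", "green", "blue", "green"]

full_cols, full = one_hot(colors)
reduced_cols, reduced = one_hot(colors, drop_first=True)

print(full_cols, full)
print(reduced_cols, reduced)
```

With `drop_first=True`, the dropped category ("blue" here) is represented by a row of zeros, so no information is lost even though there is one column fewer.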