Hermes, that's an interesting function. Does it work with sklearn after factorize? Is there an example? Thanks!
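For reference, here is a minimal sketch of what factorizing a column before fitting a tree-based sklearn estimator might look like. The toy DataFrame, its column names, and the choice of DecisionTreeClassifier are assumptions made for illustration only; nothing here comes from the thread itself.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one categorical column, one numeric column.
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green", "red", "blue"],
    "size_cm": [1.0, 2.5, 3.0, 2.0, 1.5, 3.5],
    "label": [0, 1, 1, 1, 0, 1],
})

# pd.factorize returns integer codes plus the unique values,
# numbered in order of first appearance.
codes, uniques = pd.factorize(df["color"])
df["color_code"] = codes

X = df[["color_code", "size_cm"]]
y = df["label"]

# The estimator sees color_code as an ordinary numeric feature.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Reuse the same mapping for new data; unseen categories map to -1.
new_colors = pd.Series(["blue", "red", "purple"])
new_codes = uniques.get_indexer(new_colors)
```

Note that the integer codes impose an arbitrary order on the categories. Tree-based models can usually split around that, but linear models would treat the codes as magnitudes, so this is not a general replacement for one-hot encoding. sklearn.preprocessing.OrdinalEncoder does a similar job and fits into a Pipeline.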
On Thu, Apr 30, 2020 at 6:51 PM Hermes Morales <paisanoher...@hotmail.com> wrote:

> Perhaps pd.factorize could help?
>
> ------------------------------
> *From:* scikit-learn <scikit-learn-bounces+paisanohermes=hotmail....@python.org> on behalf of Gael Varoquaux <gael.varoqu...@normalesup.org>
> *Sent:* Thursday, April 30, 2020 5:12:06 PM
> *To:* Scikit-learn mailing list <scikit-learn@python.org>
> *Subject:* Re: [scikit-learn] Why does sklearn require one-hot-encoding for categorical features? Can we have a "factor" data type?
>
> On Thu, Apr 30, 2020 at 03:55:00PM -0400, C W wrote:
> > I've used R and Stata software, none needs such transformation. They have a
> > data type called "factors", which is different from "numeric".
> >
> > My problem with OHE:
> > One-hot-encoding results in large number of features. This really blows up
> > quickly. And I have to fight curse of dimensionality with PCA reduction. That's
> > not cool!
>
> Most statistical models still do one-hot encoding behind the hood. So, R and Stata do it too.
>
> Typically, tree-based models can be adapted to work directly on categorical data. Ours don't. It's work in progress.
>
> G
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn