Hi Andy, the last time I did this I used DictVectorizer too. It would be great, though, to have something like a OneHotTransformer.
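For reference, here is a minimal sketch of the DictVectorizer route. It assumes each integer is converted to a string so that DictVectorizer treats it as a category rather than a numeric value. The toy data, the helper rows_to_dicts and the feature names f0..f3 are made up for illustration.

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical toy data: each column holds an integer bin index with no ordering.
X = [[100, 1, 5, 10],
     [100, 2, 5, 11],
     [7, 1, 6, 10]]


def rows_to_dicts(rows):
    """Map each row to {"f<column>": "<value>"}.

    String values make DictVectorizer create one indicator column per
    (feature, value) pair instead of passing the number through as-is.
    """
    return [{"f%d" % j: str(int(v)) for j, v in enumerate(row)} for row in rows]


vec = DictVectorizer()  # returns a scipy.sparse matrix by default
X_onehot = vec.fit_transform(rows_to_dicts(X))

print(X_onehot.shape)      # one column per distinct (feature, value) pair
print(vec.feature_names_)  # names of the form "f0=100", "f1=2", ...
```

Each row of the result has exactly one nonzero entry per original feature, which is the one-hot encoding per feature asked about in the thread.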
BTW: Has anybody looked into patsy [1]? It has plenty of functionality for this kind of encoding (they call it treatment coding [2]).

best,
 Peter

[1] https://github.com/pydata/patsy
[2] http://patsy.readthedocs.org/en/latest/API-reference.html#patsy.Treatment

2012/10/19 Lars Buitinck <[email protected]>:
> 2012/10/19 Andreas Mueller <[email protected]>:
>> I'd like to convert an array of integer categorical features to a sparse
>> indicator matrix.
>> So my data points look like
>> x = [100, 1, 5, 10]
>> These are indices for feature-bins which don't really have an ordering.
>> Therefore I want to convert them to a one-hot encoding per feature.
>>
>> What is the best way in sklearn to achieve this? This looks a bit
>> like the DictVectorizer, I think.
>
> DictVectorizer is the only class we offer that does this, I think.
> (Unless you care to make strings out of your matrices and use
> CountVectorizer...).
>
>> If not, do you think this kind of encoding is common enough to
>> be included in sklearn?
>
> It's been requested on the ML over and over. At one point, I had a PR
> for a OneHotTransformer
> (https://github.com/scikit-learn/scikit-learn/pull/242) but that
> hasn't been updated in quite a while and I'm using DictVectorizer
> myself now. Feel free to pick it up if you need it, or start afresh.
>
> --
> Lars Buitinck
> Scientific programmer, ILPS
> University of Amsterdam

--
Peter Prettenhofer

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
