Hi David,

I recommend loading the data with pandas (``pandas.read_csv``). Scikit-learn does not support categorical features out of the box; you need to encode them as dummy variables (a.k.a. one-hot encoding). You can do this either with ``sklearn.feature_extraction.DictVectorizer`` or with ``pandas.get_dummies``.
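A rough sketch of the pandas route, under a few assumptions: the file is whitespace-separated with no header row, the column names (``y``, ``city``, ``a``, ``b``, ``level``) are made up for illustration, and the last two sample rows are invented so that the example has something to fit. Reading everything as strings keeps the numeric-looking columns categorical.

```python
import io

import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# First two rows are from the question; the last two are invented for illustration.
raw = io.StringIO(
    "1 NewYork 1 6 high\n"
    "0 LA 3 4 low\n"
    "1 NewYork 3 6 low\n"
    "0 LA 1 4 high\n"
)

# Hypothetical column names: the original data has no header.
columns = ["y", "city", "a", "b", "level"]

# dtype=str so that columns such as "a" and "b" are treated as categories, not numbers.
df = pd.read_csv(raw, sep=r"\s+", header=None, names=columns, dtype=str)

# Target is column one; everything else is one-hot encoded.
y = df["y"].astype(int)
X = pd.get_dummies(df.drop(columns="y"), dtype=float)

# Fit a regression tree on the 0/1 target and predict on the training rows.
tree = DecisionTreeRegressor(random_state=0).fit(X, y)
pred = tree.predict(X)
```

For a real file, replace ``raw`` with the path to your data. Since the target is 0/1, ``DecisionTreeClassifier`` with ``predict_proba`` may be worth considering as an alternative to a regression tree.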
HTH,
Peter

2013/2/27 David Montgomery <[email protected]>:
> Hi,
>
> I have a data structure that looks like this:
>
> 1 NewYork 1 6 high
> 0 LA 3 4 low
> .......
>
> I am trying to predict a probability, where Y is column one. All of the
> attributes of X are categorical, and I will use a dtree regression. How
> do I load this data into y and X?
>
> Thanks

--
Peter Prettenhofer

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
