OK, now I am really confused about how to interpret the tree.

I am trying to build a probability estimation tree.  All of the independent
variables are categorical, and I created dummies for them.  What is throwing
me off are the <= conditions.

I would expect a rule that says, e.g., if city in (LA, NY) and TIME=Noon then .20.

In the chart I see city=Dubai <= 0.5000.  What does that mean?  What I am trying
to see is a chart like the ones I would usually get from SPSS AnswerTree, SAS, etc.

So, how do I interpret city=Dubai <= 0.5000?
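
As far as I can tell, since each dummy column only takes the values 0 and 1, a
split such as city=Dubai <= 0.5 simply separates the rows where the dummy is 0
(city is not Dubai) from the rows where it is 1 (city is Dubai).  A minimal
sketch of that reading follows.  The helper name split_to_sql is made up for
illustration, and it assumes DictVectorizer-style "column=value" feature names:

```python
def split_to_sql(feature_name, threshold, goes_left):
    """Translate one split on a 0/1 dummy column into a SQL predicate.

    Hypothetical helper. Assumes the feature name has the form
    "column=value" and that the threshold lies strictly between 0 and 1.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("expected a threshold between 0 and 1 for a dummy column")
    column, value = feature_name.split("=", 1)
    # Left branch:  dummy <= threshold, i.e. dummy == 0, i.e. column is NOT value.
    # Right branch: dummy >  threshold, i.e. dummy == 1, i.e. column IS value.
    operator = "<>" if goes_left else "="
    return "%s %s '%s'" % (column, operator, value)


print(split_to_sql("city=Dubai", 0.5, goes_left=True))
print(split_to_sql("city=Dubai", 0.5, goes_left=False))
```

Read this way, the left child of a "city=Dubai <= 0.5" node holds every city
except Dubai, and the right child holds Dubai only.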

My aim is to get a node id for each rule and to create SQL rules to extract data.
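
For the node id part, one possibility (a sketch only, assuming a reasonably
recent scikit-learn in which tree estimators have an apply method) is to ask
the fitted tree which leaf each row ends up in.  The data below is the toy
example from earlier in this thread; random_state=0 is only there to make the
run repeatable:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeRegressor

measurements = [
    {'country': 'US', 'city': 'Dubai'},
    {'country': 'US', 'city': 'London'},
    {'country': 'US', 'city': 'San Fransisco'},
    {'country': 'US', 'city': 'Dubai'},
    {'country': 'AU', 'city': 'Mel'},
    {'country': 'AU', 'city': 'Sydney'},
    {'country': 'AU', 'city': 'Mel'},
    {'country': 'AU', 'city': 'Sydney'},
    {'country': 'AU', 'city': 'Mel'},
    {'country': 'AU', 'city': 'Sydney'},
]
y = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# One-hot encode the categorical values into a dense 0/1 array.
vec = DictVectorizer(sparse=False)
X = vec.fit_transform(measurements)

clf = DecisionTreeRegressor(random_state=0).fit(X, y)

# apply() returns, for every row, the id of the leaf node it falls into.
leaf_ids = clf.apply(X)
predictions = clf.predict(X)

for row, leaf, pred in zip(measurements, leaf_ids, predictions):
    print(int(leaf), round(float(pred), 2), row)
```

The ids printed here should correspond to the node numbers that show up in the
exported .dot file, so they could serve as the key for a rule table.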

Unless I am wrong, it appears that the dtree algo is not designed to extract
rules or to assign a rule to a node id.  Dtrees in scikits are solely
for prediction.  Is this a fair statement?

I will be taking the *.dot file, not to graph it, but to somehow parse it
so I can create my rules.
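
As an alternative to parsing the .dot file, the same information seems to be
available on the fitted estimator through its tree_ attribute (children_left,
children_right, feature, threshold, value).  Below is a rough sketch that walks
the tree and builds one SQL WHERE clause per leaf.  The function name
leaf_rules is hypothetical, and it assumes every feature is a 0/1 dummy named
"column=value", as DictVectorizer produces:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeRegressor


def leaf_rules(clf, feature_names):
    """Return {leaf_node_id: (sql_where_clause, predicted_value)}.

    Hypothetical helper; assumes all features are 0/1 dummies whose
    names look like "column=value".
    """
    t = clf.tree_
    rules = {}

    def walk(node, conditions):
        if t.children_left[node] == -1:  # -1 marks a leaf node
            where = " AND ".join(conditions) or "1=1"
            rules[int(node)] = (where, float(t.value[node].ravel()[0]))
            return
        column, value = feature_names[t.feature[node]].split("=", 1)
        # Left child: dummy <= threshold, so the column is NOT this value.
        walk(int(t.children_left[node]),
             conditions + ["%s <> '%s'" % (column, value)])
        # Right child: dummy > threshold, so the column IS this value.
        walk(int(t.children_right[node]),
             conditions + ["%s = '%s'" % (column, value)])

    walk(0, [])
    return rules


# Toy data from this thread.
measurements = (
    [{'country': 'US', 'city': c}
     for c in ('Dubai', 'London', 'San Fransisco', 'Dubai')]
    + [{'country': 'AU', 'city': c}
       for c in ('Mel', 'Sydney', 'Mel', 'Sydney', 'Mel', 'Sydney')]
)
y = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

vec = DictVectorizer(sparse=False)
X = vec.fit_transform(measurements)
clf = DecisionTreeRegressor(random_state=0).fit(X, y)

rules = leaf_rules(clf, vec.feature_names_)
for node_id, (where, value) in sorted(rules.items()):
    print("node %d: WHERE %s -> %.2f" % (node_id, where, value))
```

Each key is a leaf node id, so the clauses could be pasted into a query or
stored in a lookup table keyed by node.  Collapsing several "<>" conditions
into a single IN (...) list, as SPSS or SAS would display it, is left out of
this sketch.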

Thanks

On Wed, Feb 27, 2013 at 11:57 PM, Peter Prettenhofer <
[email protected]> wrote:

> Looks good to me - save the output to a file (e.g. foobar.dot) and run
> the following command:
>
>     $ dot -Tpdf foobar.dot -o foobar.pdf
>
> When I open the pdf all labels are correctly displayed - remember that
> they are now indicator features - so the thresholds are usually
> "country=AU <= 0.5".
>
> You can find more information here:
> http://scikit-learn.org/dev/modules/tree.html#classification
>
> 2013/2/27 David Montgomery <[email protected]>:
> > Thanks I used DictVectorizer()
> >
> > I am now trying to add labels to the tree graph.  Below are the labels and
> > the digraph Tree.  However, I don't see labels on the tree nodes.  Did I not
> > use the feature names correctly?
> >
> >
> >
> >
> > measurements = [
> > {'country':'US','city': 'Dubai'},
> > {'country':'US','city': 'London'},
> > {'country':'US','city': 'San Fransisco'},
> > {'country':'US','city': 'Dubai'},
> > {'country':'AU','city': 'Mel'},
> > {'country':'AU','city': 'Sydney'},
> > {'country':'AU','city': 'Mel'},
> > {'country':'AU','city': 'Sydney'},
> > {'country':'AU','city': 'Mel'},
> > {'country':'AU','city': 'Sydney'},
> > ]
> > y = [0,0,0,1,1,1,1,1,1,1]
> >
> >
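> > from sklearn import tree
> > from sklearn.feature_extraction import DictVectorizer
> >
> > vec = DictVectorizer()
> > X = vec.fit_transform(measurements)
> > feature_name = vec.get_feature_names()
> > clf = tree.DecisionTreeRegressor()
> > clf = clf.fit(X.toarray(), y)
> > with open("au.dot", 'w') as f:
> >     # export_graphviz writes to f; do not rebind f to its return value
> >     tree.export_graphviz(clf, out_file=f, feature_names=feature_name)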
> >
> >
> > feature_name = ['city=Dubai', 'city=London', 'city=Mel', 'city=San Fransisco',
> > 'city=Sydney', 'country=AU', 'country=US']
> >
> > digraph Tree {
> > 0 [label="country=AU <= 0.5000\nerror = 2.1\nsamples = 10\nvalue = [ 0.7]", shape="box"] ;
> > 1 [label="city=Dubai <= 0.5000\nerror = 0.75\nsamples = 4\nvalue = [ 0.25]", shape="box"] ;
> > 0 -> 1 ;
> > 2 [label="error = 0.0000\nsamples = 2\nvalue = [ 0.]", shape="box"] ;
> > 1 -> 2 ;
> > 3 [label="error = 0.5000\nsamples = 2\nvalue = [ 0.5]", shape="box"] ;
> > 1 -> 3 ;
> > 4 [label="error = 0.0000\nsamples = 6\nvalue = [ 1.]", shape="box"] ;
> > 0 -> 4 ;
> > }
> >
> >
> >
> >
> > On Wed, Feb 27, 2013 at 9:50 PM, Peter Prettenhofer
> > <[email protected]> wrote:
> >>
> >> Hi David,
> >>
> >> I recommend that you load the data using Pandas (``pandas.read_csv``).
> >> Scikit-learn does not support categorical features out-of-the-box; you
> >> need to encode them as dummy variables (aka one-hot encoding) - you
> >> can do this either using ``sklearn.preprocessing.DictVectorizer`` or
> >> via ``pandas.get_dummies`` .
> >>
> >> HTH,
> >>  Peter
> >>
> >> 2013/2/27 David Montgomery <[email protected]>:
> >> > Hi,
> >> >
> >> > I have a data structure that looks like this:
> >> >
> >> > 1 NewYork 1 6 high
> >> > 0 LA 3 4 low
> >> > .......
> >> >
> >> > I am trying to predict a probability, where Y is column one.  All of the
> >> > attributes of X are categorical, and I will use a dtree regression.  How
> >> > do I load this data into y and X?
> >> >
> >> > Thanks
> >> >
> >> >
> >> >
> ------------------------------------------------------------------------------
> >> > Everyone hates slow websites. So do we.
> >> > Make your web apps faster with AppDynamics
> >> > Download AppDynamics Lite for free today:
> >> > http://p.sf.net/sfu/appdyn_d2d_feb
> >> > _______________________________________________
> >> > Scikit-learn-general mailing list
> >> > [email protected]
> >> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> >> >
> >>
> >>
> >>
> >> --
> >> Peter Prettenhofer
> >>
> >>
> >>
> >
> >
> >
> >
> >
>
>
>
> --
> Peter Prettenhofer
>
>
>
