If your dataset has many categorical features, possibly with many
categories each, the tools in sklearn are quite limited.
There are, however, alternative implementations of boosted trees that
are designed with categorical features in mind. Take a look
at catboost [1], which has an sklearn-compatible API.
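
If you would rather stay within sklearn, one common workaround (not the
catboost route above) is to one-hot encode the categorical columns before
they reach the tree. Below is a minimal sketch of that idea using the toy
rows from the question. The column positions, the choice of
OneHotEncoder, and the variable names are my own assumptions for
illustration, and one-hot encoding can get unwieldy when a feature has
many categories.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy rows from the question; columns are Gender, Age, Income, Car.
X = np.array(
    [
        ["Male", 30, 10000, "BMW"],
        ["Female", 35, 9000, "Toyota"],
        ["Male", 50, 12000, "Audi"],
    ],
    dtype=object,
)
y = np.array(["Yes", "No", "Yes"])  # Attendance

# Assumption: columns 0 (Gender) and 3 (Car) are the categorical ones.
# They are one-hot encoded; Age and Income are passed through unchanged.
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), [0, 3])],
    remainder="passthrough",
)

# The tree only ever sees the numeric, encoded matrix.
model = make_pipeline(preprocess, DecisionTreeClassifier(random_state=0))
model.fit(X, y)
predictions = model.predict(X)
```

Putting the encoder inside the pipeline means the same encoding is
applied at fit and predict time, and handle_unknown="ignore" keeps
predict from failing on a category that was not seen during fit.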

J

[1] https://catboost.ai/

On Sat, Sep 14, 2019 at 3:40 AM C W <tmrs...@gmail.com> wrote:

> Hello all,
> I'm very confused. Can the decision tree module handle both continuous and
> categorical features in the dataset? In this case, it's just CART
> (Classification and Regression Trees).
>
> For example,
> Gender  Age  Income  Car     Attendance
> Male    30   10000   BMW     Yes
> Female  35    9000   Toyota  No
> Male    50   12000   Audi    Yes
>
> According to the documentation
> https://scikit-learn.org/stable/modules/tree.html#tree-algorithms-id3-c4-5-c5-0-and-cart,
> it cannot!
>
> It says: "scikit-learn implementation does not support categorical
> variables for now".
>
> Is this true? If not, can someone point me to an example? If yes, what do
> people do?
>
> Thank you very much!
>
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
