You can simply add a new binary indicator feature (one per feature that might
have a missing value) that is 1 if the value is missing and 0 otherwise.  The RF
can then work out what to do with this information.
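In code, a rough sketch of the idea might look like this (pandas-based; the
column names, fill value, and data are just illustrative -- note that you still
have to put *some* number in the missing cells for scikit-learn's trees, the
indicator columns just preserve the fact that the value was absent):

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    # toy data with missing values (column names are made up)
    X = pd.DataFrame({"a": [1.0, np.nan, 3.0, 4.0],
                      "b": [np.nan, 2.0, 2.5, np.nan]})
    y = [0, 1, 0, 1]

    # one binary indicator per feature that can be missing
    for col in ["a", "b"]:
        X[col + "_missing"] = X[col].isnull().astype(int)

    # the trees still need numbers everywhere, so fill the holes
    # with a constant; the indicator columns keep the missingness
    # information available to the splits
    X = X.fillna(0)

    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(X, y)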

I don't know how this compares in practice to more sophisticated approaches.

Raphael

On Thursday, October 13, 2016, Stuart Reynolds <stu...@stuartreynolds.net>
wrote:

> I'm looking for a decision tree and RF implementation that supports
> missing data (without imputation) -- ideally in Python, Java/Scala or C++.
>
> It seems that scikit's decision tree algorithm doesn't allow this --
> which is disappointing because it's one of the few methods that should be
> able to sensibly handle problems with high amounts of missingness.
>
> Are there plans to allow missing data in scikit's decision trees?
>
> Also, is there any particular reason why missing values weren't supported
> originally (e.g., do they integrate poorly with other features)?
>
> Regards
> - Stuart
>
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
