You can simply add a new binary feature, one per feature that might have a missing value, that is 1 if the value is missing and 0 otherwise. The RF can then work out what to do with this information.
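For instance, a minimal sketch with pandas and scikit-learn might look like the following (the column names and the fill value are made up for illustration; since scikit-learn's trees reject NaN, the original columns still need to be filled with some constant once the indicators are added):

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    # Toy data with missing values (hypothetical columns).
    X = pd.DataFrame({"age":    [25.0, np.nan, 40.0, 31.0],
                      "income": [50e3, 64e3, np.nan, np.nan]})
    y = [0, 1, 0, 1]

    # One binary indicator per feature that might be missing.
    for col in ["age", "income"]:
        X[col + "_missing"] = X[col].isnull().astype(int)

    # scikit-learn's trees won't accept NaN, so fill the original
    # columns with an arbitrary constant; the indicator columns
    # carry the missingness signal.
    X = X.fillna(0)

    clf = RandomForestClassifier(n_estimators=100).fit(X, y)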
I don't know how this compares in practice to more sophisticated approaches.

Raphael

On Thursday, October 13, 2016, Stuart Reynolds <stu...@stuartreynolds.net> wrote:
> I'm looking for a decision tree and RF implementation that supports
> missing data (without imputation) -- ideally in Python, Java/Scala or C++.
>
> It seems that scikit's decision tree algorithm doesn't allow this --
> which is disappointing because it's one of the few methods that should be
> able to sensibly handle problems with high amounts of missingness.
>
> Are there plans to allow missing data in scikit's decision trees?
>
> Also, is there any particular reason why missing values weren't supported
> originally (e.g. that they integrate poorly with other features)?
>
> Regards
> - Stuart