Spark's decision tree does not accept datasets in which a feature has only
a single value. Training fails with the following error:

> requirement failed: DecisionTree Strategy given invalid
> categoricalFeaturesInfo setting: feature 645 has 1 categories.  The number
> of categories should be >= 2

This scenario is not uncommon, since large datasets can contain features
with only a single value (see the training data in [1] for an example). As
the error is raised by Spark itself, there should be a way to handle such
datasets outside Spark, before training starts.

One possible solution is to allow users to discard features (columns), so
that they can remove features with a single value before training a
decision tree. Please suggest any other feasible solutions.
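
To make the proposal concrete, here is a minimal sketch in plain Python of
what discarding single-valued features could look like. It is only an
illustration, not an existing API: the function names are hypothetical, the
dataset is assumed to be a list of equal-length rows, and the categorical
info is assumed to be a map from feature index to number of categories (as
the categoricalFeaturesInfo setting in the error suggests). Note that the
indices in that map have to be shifted once columns are removed.

```python
def constant_columns(rows):
    """Return indices of columns that contain at most one distinct value."""
    if not rows:
        return []
    distinct = [set() for _ in rows[0]]
    for row in rows:
        for i, value in enumerate(row):
            distinct[i].add(value)
    return [i for i, values in enumerate(distinct) if len(values) <= 1]


def drop_columns(rows, to_drop):
    """Return a copy of the rows without the columns listed in to_drop."""
    drop = set(to_drop)
    return [[v for i, v in enumerate(row) if i not in drop] for row in rows]


def remap_categorical_info(info, to_drop, n_cols):
    """Re-key a {feature index: category count} map after columns are dropped.

    Entries for dropped columns are removed; the remaining entries are moved
    to the index their column has in the reduced dataset.
    """
    drop = set(to_drop)
    new_index = {}
    for old in range(n_cols):
        if old not in drop:
            new_index[old] = len(new_index)
    return {new_index[old]: k for old, k in info.items() if old not in drop}


# Hypothetical data: column 1 holds the single value 1 in every row.
rows = [[0, 1, 5.0], [1, 1, 6.0], [0, 1, 7.0]]
info = {0: 2, 1: 1}  # assumed shape of the categorical feature info

to_drop = constant_columns(rows)
reduced_rows = drop_columns(rows, to_drop)
reduced_info = remap_categorical_info(info, to_drop, len(rows[0]))
```

With the sample data above, column 1 is dropped and the reduced info no
longer contains a feature with fewer than 2 categories. On a real dataset
the distinct-value check would have to run as a distributed job rather than
over an in-memory list.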

Best regards,

Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Mobile: +94711228855