Spark's decision tree does not accept datasets in which a feature has only a single value. It produces the following error:

> requirement failed: DecisionTree Strategy given invalid
> categoricalFeaturesInfo setting: feature 645 has 1 categories. The number
> of categories should be >= 2

This is not an uncommon scenario, since large datasets can contain features with only a single value (see the training data in [1] for an example). Since this error comes from Spark, there should be a way to handle such datasets externally, before they are passed to Spark.

One possible solution is to allow users to discard features (columns), so that they can remove features with a single value before training a decision tree.

Please suggest any other feasible solutions.

Best regards,

[1] https://www.kaggle.com/c/digit-recognizer

--
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855
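
P.S. To make the column-discarding option concrete, here is a minimal sketch in plain Java with no Spark dependency. The class and method names (ConstantFeatureFilter, informativeColumns, project) are hypothetical and are not part of Spark or of our code base. It assumes the dataset fits in memory as a dense double[][]. If some features are categorical, the indices in categoricalFeaturesInfo would also have to be remapped to the new column positions, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

public class ConstantFeatureFilter {

    /**
     * Returns the indices of columns that hold at least two distinct values.
     * Hypothetical helper; NaN handling is not addressed in this sketch.
     */
    public static List<Integer> informativeColumns(double[][] rows) {
        List<Integer> keep = new ArrayList<>();
        if (rows.length == 0) {
            return keep;
        }
        int numFeatures = rows[0].length;
        for (int col = 0; col < numFeatures; col++) {
            double first = rows[0][col];
            for (double[] row : rows) {
                // A single differing value is enough to keep the column.
                if (row[col] != first) {
                    keep.add(col);
                    break;
                }
            }
        }
        return keep;
    }

    /** Copies only the kept columns, preserving their original order. */
    public static double[][] project(double[][] rows, List<Integer> keep) {
        double[][] out = new double[rows.length][keep.size()];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < keep.size(); c++) {
                out[r][c] = rows[r][keep.get(c)];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Column 0 is constant, so only columns 1 and 2 should survive.
        double[][] data = {
            {0.0, 1.0, 5.0},
            {0.0, 2.0, 5.0},
            {0.0, 3.0, 6.0},
        };
        List<Integer> keep = informativeColumns(data);
        double[][] filtered = project(data, keep);
        System.out.println("kept columns: " + keep);
        System.out.println("features per row: " + filtered[0].length);
    }
}
```

The list of kept indices is returned separately so that the same projection can be applied to test data at prediction time.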
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev