Joseph K. Bradley created SPARK-3934:
----------------------------------------

             Summary: RandomForest bug in sanity check in DTStatsAggregator
                 Key: SPARK-3934
                 URL: https://issues.apache.org/jira/browse/SPARK-3934
             Project: Spark
          Issue Type: Bug
          Components: MLlib
            Reporter: Joseph K. Bradley

When run with a mix of unordered categorical and continuous features, on multiclass classification, RandomForest fails. The bug is in the sanity checks in getFeatureOffset and getLeftRightFeatureOffsets, which use the wrong indices for checking whether features are unordered.

Proposal: Remove the sanity checks since they are not really needed, and since they would require DTStatsAggregator to keep track of an extra set of indices (for the feature subset).
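
For illustration only, here is a minimal Python sketch of the kind of index mix-up described in this issue. It is not the Spark code. It assumes the "wrong indices" are positions within a node's feature subset being looked up as if they were global feature indices, which is an inference from the mention of the feature subset. All names (unordered_features, feature_subset, is_unordered_buggy, is_unordered_fixed) are hypothetical.

```python
# Hypothetical illustration of an index mismatch; not Spark's actual code.

# Global feature indices are 0..3. Assume only global feature 2 is an
# unordered categorical feature; the others are continuous.
unordered_features = {2}

# In a random forest, a node may see only a subset of the features.
# The position in this list is the subset-local index; the value is
# the global feature index.
feature_subset = [2, 3]


def is_unordered_buggy(subset_index):
    # Bug pattern: the subset-local index is treated as a global index.
    return subset_index in unordered_features


def is_unordered_fixed(subset_index):
    # Map the subset-local index back to the global index before the lookup.
    return feature_subset[subset_index] in unordered_features


for subset_index, global_index in enumerate(feature_subset):
    print(
        f"subset index {subset_index} (global {global_index}): "
        f"buggy={is_unordered_buggy(subset_index)}, "
        f"fixed={is_unordered_fixed(subset_index)}"
    )
```

In this sketch, subset index 0 refers to global feature 2, which is unordered, but the buggy lookup reports it as not unordered. A sanity check built on that lookup would fail on valid input. The fixed variant needs the subset-to-global mapping, which corresponds to the extra set of indices the proposal prefers not to track.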