[ https://issues.apache.org/jira/browse/SPARK-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720824#comment-14720824 ]
Trevor Cai commented on SPARK-9963:
-----------------------------------

[~lkhamsurenl]: Are you still working on this?

I took a quick look at the code, and it looks to me like it's not trivial to convert the binnedFeatures and splits arrays into a single vector mapping feature to threshold. The below section of code seems to indicate to me that some features don't have a threshold and should always move right, and using Double.MAX_VALUE as the threshold in those cases seems like it could potentially cause issues.

{code}
override private[tree] def shouldGoLeft(binnedFeature: Int, splits: Array[Split]): Boolean = {
  if (binnedFeature == splits.length) {
    // > last split, so split right
    false
  } else {
    val featureValueUpperBound = splits(binnedFeature).asInstanceOf[ContinuousSplit].threshold
    featureValueUpperBound <= threshold
  }
}
{code}

As a result, if you're still working on this, your second proposal makes more sense. If not, I can pick up the issue and implement the second option.

[~josephkb] Does this seem reasonable to you?

> ML RandomForest cleanup: replace predictNodeIndex with predictImpl
> ------------------------------------------------------------------
>
>                 Key: SPARK-9963
>                 URL: https://issues.apache.org/jira/browse/SPARK-9963
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Priority: Trivial
>              Labels: starter
>
> Replace ml.tree.impl.RandomForest.predictNodeIndex with Node.predictImpl.
> This should be straightforward, but please ping me if anything is unclear.