[ 
https://issues.apache.org/jira/browse/SPARK-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720824#comment-14720824
 ] 

Trevor Cai commented on SPARK-9963:
-----------------------------------

[~lkhamsurenl]: Are you still working on this?
I took a quick look at the code, and converting the binnedFeatures and splits 
arrays into a single vector mapping feature to threshold does not look trivial 
to me. The code below suggests that some features have no threshold and should 
always go right, and using Double.MAX_VALUE as the threshold in those cases 
could cause issues.
{code}
override private[tree] def shouldGoLeft(binnedFeature: Int, splits: Array[Split]): Boolean = {
  if (binnedFeature == splits.length) {
    // > last split, so split right
    false
  } else {
    val featureValueUpperBound = splits(binnedFeature).asInstanceOf[ContinuousSplit].threshold
    featureValueUpperBound <= threshold
  }
}
{code}
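
To make the concern concrete, here is a minimal, self-contained sketch. It is 
not Spark code: SimpleContinuousSplit, BinToValueSketch, binToUpperBound, 
agrees and the sentinel parameter are hypothetical names used only for 
illustration, and the raw-value check is my assumption about how a 
predictImpl-style traversal compares a feature value to the threshold. It 
maps each bin index to an upper-bound value and checks whether the raw 
comparison gives the same answer as the binned one.

```scala
// Hypothetical stand-in for a continuous split; not Spark's actual class.
final case class SimpleContinuousSplit(featureIndex: Int, threshold: Double) {

  // Mirrors the binned check quoted above: bin i is bounded above by
  // candidates(i).threshold; the bin past the last split has no bound.
  def shouldGoLeftBinned(
      binnedFeature: Int,
      candidates: Array[SimpleContinuousSplit]): Boolean =
    if (binnedFeature == candidates.length) false
    else candidates(binnedFeature).threshold <= threshold

  // Assumed raw-value check for a traversal over unbinned feature values.
  def shouldGoLeftRaw(featureValue: Double): Boolean =
    featureValue <= threshold
}

object BinToValueSketch {

  // Hypothetical conversion from a bin index to a representative raw value.
  // The last bin has no upper bound, so some sentinel has to stand in for it.
  def binToUpperBound(
      bin: Int,
      candidates: Array[SimpleContinuousSplit],
      sentinel: Double): Double =
    if (bin == candidates.length) sentinel else candidates(bin).threshold

  // True when the raw check on the converted value matches the binned check.
  def agrees(
      node: SimpleContinuousSplit,
      bin: Int,
      candidates: Array[SimpleContinuousSplit],
      sentinel: Double): Boolean =
    node.shouldGoLeftBinned(bin, candidates) ==
      node.shouldGoLeftRaw(binToUpperBound(bin, candidates, sentinel))

  def main(args: Array[String]): Unit = {
    val candidates = Array(1.0, 2.0, 3.0).map(t => SimpleContinuousSplit(0, t))
    val ordinary = SimpleContinuousSplit(0, 2.0)
    val extreme = SimpleContinuousSplit(0, Double.MaxValue) // contrived case
    val lastBin = candidates.length

    for (bin <- 0 to lastBin) {
      println(s"bin $bin agrees: " +
        agrees(ordinary, bin, candidates, Double.MaxValue))
    }
    println("extreme node, MaxValue sentinel agrees: " +
      agrees(extreme, lastBin, candidates, Double.MaxValue))
    println("extreme node, infinity sentinel agrees: " +
      agrees(extreme, lastBin, candidates, Double.PositiveInfinity))
  }
}
```

In this sketch the two checks agree for every bin with an ordinary node 
threshold. They disagree only in the contrived case where the node's own 
threshold is Double.MaxValue, because the sentinel then compares as 
"less than or equal" and the last bin goes left instead of right. 
Double.PositiveInfinity avoids that particular case, but either way the 
conversion depends on a special value rather than on an explicit 
"no threshold" state.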

Given this, if you're still working on the issue, your second proposal makes 
more sense to me. If not, I can pick it up and implement the second option.

[~josephkb] Does this seem reasonable to you?


> ML RandomForest cleanup: replace predictNodeIndex with predictImpl
> ------------------------------------------------------------------
>
>                 Key: SPARK-9963
>                 URL: https://issues.apache.org/jira/browse/SPARK-9963
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Priority: Trivial
>              Labels: starter
>
> Replace ml.tree.impl.RandomForest.predictNodeIndex with Node.predictImpl.
> This should be straightforward, but please ping me if anything is unclear.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

