GitHub user smurching opened a pull request:

    https://github.com/apache/spark/pull/19758

    [SPARK-3162][MLlib] Local Tree Training Pt 1: Refactor RandomForest.scala into utility classes

    ## What changes were proposed in this pull request?
    
    This PR breaks up #19433 to help unblock #19666; once this PR is merged, #19666 can be merged.
    
    This PR contains the changes made to migrate functionality from RandomForest.scala into the following utility classes:
    
    * AggUpdateUtils
    * ImpurityUtils
    * SplitUtils
    
    The PR also adds tests for split selection logic in TreeSplitUtilsSuite.
    
    A follow-up PR will include the other changes from #19433:
    * Local decision tree data structures & tests
    * Local tree training logic & tests
    
    ## How was this patch tested?
    
    Adds unit tests for split selection logic in TreeSplitUtilsSuite.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/smurching/spark refactor-random-forest

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19758.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19758
    
----
commit f2e3fbd40eea2919d249710eae5b5789d97543b7
Author: Sid Murching <sid.murch...@databricks.com>
Date:   2017-11-15T17:52:01Z

    Local tree training part 1 (refactor RandomForest.scala into utility classes)

commit a2357c95672e94a148051d00e26b89245eb8e204
Author: Sid Murching <sid.murch...@databricks.com>
Date:   2017-11-15T17:57:55Z

    WIP adding TreeSplitUtilsSuite

commit 320c32ee8d0ac9bde457b0286d064470648c73af
Author: Sid Murching <sid.murch...@databricks.com>
Date:   2017-11-15T19:37:56Z

    WIP

commit b93f9f3da9cca0887c0264162f5b032f14fa87d7
Author: Sid Murching <sid.murch...@databricks.com>
Date:   2017-11-15T19:57:25Z

    Add TreeSplitUtilsSuite, refactor it to not depend on any local tree training code

----