GitHub user smurching reopened a pull request:

    https://github.com/apache/spark/pull/14872

    [SPARK-3162][MLlib][WIP] Add local tree training for decision tree regressors

    ## What changes were proposed in this pull request?
    
    Based on [Yggdrasil](https://github.com/fabuzaid21/yggdrasil), added local
    training of decision tree regressors.
    
    Some classes/objects largely correspond to Yggdrasil classes/objects.
    Specifically:
    * class LocalDecisionTreeRegressor --> class YggdrasilRegressor
    * object LocalDecisionTree --> object YggdrasilRegression
    * object LocalDecisionTreeUtils --> object Yggdrasil
    
    ## How was this patch tested?
    
    Added unit tests in ml/tree/impl/LocalTreeTrainingSuite.scala verifying
    that local and distributed training of a decision tree regressor produce
    the same tree.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/smurching/spark local-trees-pr

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14872.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14872
    
----
commit acf5b3e29a346a0cb86f621269855a6a98a9a74e
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-08-29T23:51:33Z

    Add local tree training for decision tree regressors

commit aa4fcc8d401385f38fe0cdfdb9fe39062c3a9f96
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-08-30T01:19:07Z

    Fix setting of impurity values for leaf nodes to match values produced by
    distributed Random Forest algorithm

commit f273fc6a4b5048ae577d03676def354dce5c87a7
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-08-31T07:01:26Z

    WIP refactoring single-machine tree code

commit 5e61e3b29c236d27e0d655d15a48f2fe3e13d26a
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-09-01T01:22:48Z

    Remove unused imports, remove array of single-node impurity aggregators

commit d2060fc460a97228a36bf81956cf8dd24c83106e
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-09-01T23:11:17Z

    WIP

commit 634a3223374608d68018daac5500a429034bbc20
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-09-02T00:21:10Z

    More work, tests still pass

commit eb7fde00e0db5aa5d04951f8f4a9cd62204f1609
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-09-02T17:02:06Z

    WIP: Added tests for classes upon which local tree training is dependent.
    Some integration tests fail

commit b748f05e3eaa7d58b1ad86d269e0dda5f35ee885
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-09-02T17:37:31Z

    WIP debugging

commit 297052242727e6693ccbacf89f44b3ff6db584f7
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-09-02T21:34:13Z

    Consolidate checking for valid splits

commit 8d443ce38f958e7b83b502e614e01c824cb63c4b
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-09-02T21:52:47Z

    Delete empty test suite

commit ee56ffe98756ed78cefbc3f782a471f04e80b256
Author: Siddharth Murching <smurch...@databricks.com>
Date:   2016-09-02T22:49:38Z

    Fix some style errors

----


