GitHub user jkbradley opened a pull request:

    https://github.com/apache/spark/pull/2349

    [mllib] DecisionTree: Add minInstancesPerNode, minInfoGain params to example and Python API

    Added minInstancesPerNode, minInfoGain params to:
    * DecisionTreeRunner.scala example
    * Python API (tree.py)
    
    Also:
    * Fixed typo in tree suite test "do not choose split that does not satisfy min instance per node requirements"
    * small style fixes
    
    CC: @mengxr
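
For readers unfamiliar with the two parameters, here is a rough sketch of the pre-pruning checks they stand for. It is plain Python, not the MLlib code touched by this pull request. The helper names (`entropy`, `info_gain`, `split_is_allowed`), the use of entropy as the impurity measure, and the default of 0.0 for the gain threshold are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted


def split_is_allowed(parent, left, right, min_instances_per_node=1, min_info_gain=0.0):
    """Hypothetical helper: True if a candidate split passes both pre-pruning checks."""
    # Check 1: each child must receive at least min_instances_per_node instances.
    if len(left) < min_instances_per_node or len(right) < min_instances_per_node:
        return False
    # Check 2: the split must reduce impurity by at least min_info_gain.
    return info_gain(parent, left, right) >= min_info_gain
```

With labels `[0, 0, 1, 1]` split into `[0, 0]` and `[1, 1]`, a call such as `split_is_allowed(parent, left, right, min_instances_per_node=2, min_info_gain=0.5)` accepts the split, while raising `min_instances_per_node` to 3 rejects it because each child holds only two instances.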

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jkbradley/spark chouqin-dt-preprune

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2349.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2349
    
----
commit ac4237808090237fe4c562da8c88c55c330d451f
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T03:17:58Z

    add min info gain and min instances per node parameters in decision tree

commit ff34845c8e43f5b9755dd1fdf428be8b2284c68b
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T04:29:12Z

    separate calculation of predict of node from calculation of info gain

commit 987cbf4b177f29e232bf2ba2ca595ea7015694da
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T04:30:01Z

    fix bug

commit f195e830a94097e5d6d42f22c67c32ca8900d848
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T06:04:20Z

    fix style

commit 845c6fa58c00bfba426e56e71eb46a6f8c3f5985
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T06:05:37Z

    fix style

commit e72c7e4d0ad015fdf25ea2959bdbf524056e38ca
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T06:52:24Z

    add comments

commit 46b891fd7f30b9f2d439134931b35dab387fe2b1
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T08:09:34Z

    fix bug

commit cadd569cf64d6eb7b9c9979a5066a2f63f15fed9
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T08:48:51Z

    add api docs

commit bb465cabc804ca53ef5005f6793b58aa2e4a5274
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T09:09:14Z

    Merge branch 'master' of https://github.com/apache/spark into dt-preprune
    
    Conflicts:
        mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala

commit 6728fad304511030611c61592b1a590214e7f434
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T09:16:27Z

    minor fix: remove empty lines

commit 10b801269864cda2c00159518688942b1985061b
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T10:10:24Z

    fix style

commit efcc7369f7f52de2810446c6fb976ab1743a63cf
Author: qiping.lqp <qiping....@alibaba-inc.com>
Date:   2014-09-09T12:33:37Z

    fix bug

commit d593ec70d70b633b72e260c38e89d87ab14fcd69
Author: chouqin <liqiping1...@gmail.com>
Date:   2014-09-09T23:57:27Z

    fix docs and change minInstancesPerNode to 1

commit 0278a1198017aae578be3109a8311abc1f9a8e14
Author: chouqin <liqiping1...@gmail.com>
Date:   2014-09-10T02:31:57Z

    remove `noSplit` and set `Predict` private to tree

commit c6e2dfcc62aaa0d26bff90fb34f5b81526ce71c8
Author: Joseph K. Bradley <joseph.kurata.brad...@gmail.com>
Date:   2014-09-10T04:51:35Z

    Added minInstancesPerNode and minInfoGain parameters to DecisionTreeRunner.scala and to Python API in tree.py

commit 39f9b60907050b4e1c78f7413282df13b7e6552c
Author: chouqin <liqiping1...@gmail.com>
Date:   2014-09-10T14:15:46Z

    change edge `minInstancesPerNode` to 2 and add one more test

commit c7ebaf1721ba414ed1539bfc4721c3bbfd70b77a
Author: chouqin <liqiping1...@gmail.com>
Date:   2014-09-10T14:27:08Z

    fix typo

commit f1d11d15fe519f9ef9d4e1158b309dc6af38864e
Author: chouqin <liqiping1...@gmail.com>
Date:   2014-09-10T14:30:22Z

    fix typo

commit 19b01af035719b7e9b67bc85611b4f04b790797a
Author: Joseph K. Bradley <joseph.kurata.brad...@gmail.com>
Date:   2014-09-10T15:52:14Z

    Merge remote-tracking branch 'chouqin/dt-preprune' into chouqin-dt-preprune

commit e2628b605459badb64b8d63059a2821dfff4bd4c
Author: Joseph K. Bradley <joseph.kurata.brad...@gmail.com>
Date:   2014-09-10T23:13:03Z

    Merge remote-tracking branch 'upstream/master' into chouqin-dt-preprune

commit 95c479d5a60b166d9c75b9a81cee82e808f23aa0
Author: Joseph K. Bradley <joseph.kurata.brad...@gmail.com>
Date:   2014-09-10T23:52:05Z

    * Fixed typo in tree suite test "do not choose split that does not satisfy min instance per node requirements"
    * small style fixes

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

