Re: Spark ML Decision Trees Algorithm

2016-10-02 Thread Yan Facai
Perhaps the best way is to read the code. The decision tree is implemented as a one-tree random forest, whose entry point is the `run` method: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala#L88 I'm not familiar with the so-called

Re: Spark ML Decision Trees Algorithm

2016-09-30 Thread janardhan shetty
It would be good to know which paper inspired the implementation used in Spark 2.0 decision trees.
On Fri, Sep 30, 2016 at 4:44 PM, Peter Figliozzi wrote:
> It's a good question. People have been publishing papers on decision trees and various

Re: Spark ML Decision Trees Algorithm

2016-09-30 Thread Peter Figliozzi
It's a good question. People have been publishing papers on decision trees and various methods of constructing and pruning them for over 30 years. I think it's rather a question for a historian at this point.
On Fri, Sep 30, 2016 at 5:08 PM, janardhan shetty wrote:
>

Re: Spark ML Decision Trees Algorithm

2016-09-30 Thread janardhan shetty
I read this explanation, but I am wondering whether the algorithm is based on a research paper that would give a more detailed understanding.
On Fri, Sep 30, 2016 at 1:36 PM, Kevin Mellott wrote:
> The documentation details the algorithm being used at
>

Re: Spark ML Decision Trees Algorithm

2016-09-30 Thread Kevin Mellott
The documentation details the algorithm being used at http://spark.apache.org/docs/latest/mllib-decision-tree.html
Thanks,
Kevin
On Fri, Sep 30, 2016 at 1:14 AM, janardhan shetty wrote:
> Hi,
>
> Any help here is appreciated.
>
> On Wed, Sep 28, 2016 at 11:34 AM,

Re: Spark ML Decision Trees Algorithm

2016-09-30 Thread janardhan shetty
Hi,
Any help here is appreciated.
On Wed, Sep 28, 2016 at 11:34 AM, janardhan shetty wrote:
> Is there a reference to the research paper that is implemented in Spark 2.0?
>
> On Wed, Sep 28, 2016 at 9:52 AM, janardhan shetty wrote:
>
>>

Re: Spark ML Decision Trees Algorithm

2016-09-28 Thread janardhan shetty
Is there a reference to the research paper that is implemented in Spark 2.0?
On Wed, Sep 28, 2016 at 9:52 AM, janardhan shetty wrote:
> Which algorithm is used under the covers for decision trees in Spark?
> For example, scikit-learn (Python) uses an

Spark ML Decision Trees Algorithm

2016-09-28 Thread janardhan shetty
Which algorithm is used under the covers for decision trees in Spark? For example, scikit-learn (Python) uses an optimised version of the CART algorithm.
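
For readers unfamiliar with what "CART" refers to in the question above, here is a minimal sketch of the core step of a CART-style classifier: picking the binary split on one numeric feature that most reduces Gini impurity. It is pure Python for illustration only; it is not Spark's or scikit-learn's code, and the names `gini` and `best_split` are made up for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the best binary split `value <= threshold` on one numeric feature.

    Returns (threshold, gain), or (None, 0.0) if no split reduces impurity.
    """
    n = len(labels)
    parent_impurity = gini(labels)
    best_threshold, best_gain = None, 0.0

    # Candidate thresholds: midpoints between consecutive distinct values.
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]

        # Impurity of the children, weighted by how many rows each receives.
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent_impurity - weighted
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain

    return best_threshold, best_gain


# Toy data (hypothetical): two well-separated classes on a single feature.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, gain = best_split(values, labels)
print(threshold, gain)
```

A full tree applies this step recursively to each child until a stopping rule (depth, minimum rows, or no gain) is reached. This sketch rescans the data for every candidate threshold, which is fine for a toy example but not how a production library would do it.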