[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-25 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-96282516 @mengxr Curious: Why does it say there are unmerged commits? (I checked, and the last commit was merged correctly.) --- If your project is set up for it, you can rep

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-25 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/5626

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-25 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-96267165 Merged into master. Thanks!

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-96174080 [Test build #698 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/698/consoleFull) for PR 5626 at commit [`729167a`](https://githu

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-96157201 [Test build #698 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/698/consoleFull) for PR 5626 at commit [`729167a`](https://github

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-24 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-96101280 test this please

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-24 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-96077458 test this please

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-24 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29086395 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -85,18 +82,16 @@ final class DecisionTreeClassifier

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-24 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-96066260 Updated. The only remaining question is about the ```(private[ml])``` notes. (See comment above.)

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-24 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29081639 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -85,18 +82,16 @@ final class DecisionTreeClassifier

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95806853 LGTM except some minor inline comments.

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024766 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/GBTExample.scala --- @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024771 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala --- @@ -0,0 +1,167 @@ +/* + * Licensed to the Apache Softwar

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024769 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -85,18 +82,16 @@ final class DecisionTreeClassifier

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024757 --- Diff: mllib/src/test/java/org/apache/spark/ml/classification/JavaGBTClassifierSuite.java --- @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Soft

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024748 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -296,5 +299,194 @@ private[ml] trait TreeRegressorParams extends Params {

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024762 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala --- @@ -0,0 +1,165 @@ +/* + * Licensed to the Apac

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024737 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,226 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024743 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala --- @@ -0,0 +1,184 @@ +/* + * Licensed to the Apache So

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024756 --- Diff: mllib/src/test/java/org/apache/spark/ml/classification/JavaGBTClassifierSuite.java --- @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Soft

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024746 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -296,5 +299,194 @@ private[ml] trait TreeRegressorParams extends Params {

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024741 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala --- @@ -0,0 +1,184 @@ +/* + * Licensed to the Apache So

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29024038 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95762202 [Test build #30882 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30882/consoleFull) for PR 5626 at commit [`bbae2a2`](https://gith

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95762210 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-9573 [Test build #30882 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30882/consoleFull) for PR 5626 at commit [`bbae2a2`](https://githu

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95739749 Updated! I think the only thing I didn't do was make stepSize a shared param. Copying from the comment above: > I'm hesitating about putting it in sharedParams sin

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r29001008 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -298,3 +302,200 @@ private[ml] object TreeRegressorParams { // T

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r2806 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -298,3 +302,200 @@ private[ml] object TreeRegressorParams { // T

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28997695 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala --- @@ -58,3 +58,43 @@ trait DecisionTreeModel { header + rootNode.subtreeT

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28997488 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -298,3 +302,200 @@ private[ml] object TreeRegressorParams { // T

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28997275 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -298,3 +302,200 @@ private[ml] object TreeRegressorParams { // T

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28995695 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28994568 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28994127 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28992440 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95675400 @jkbradley I made one pass on the public APIs. There are some issues from the ml.DT PR: 1. Node.prediction should say "leaf" node instead of "internal": https

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987617 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -298,3 +302,200 @@ private[ml] object TreeRegressorParams { // Thes

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987631 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -298,3 +302,200 @@ private[ml] object TreeRegressorParams { // Thes

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987623 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -298,3 +302,200 @@ private[ml] object TreeRegressorParams { // Thes

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987650 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala --- @@ -58,3 +58,43 @@ trait DecisionTreeModel { header + rootNode.subtreeToSt

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987585 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987643 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala --- @@ -58,3 +58,43 @@ trait DecisionTreeModel { header + rootNode.subtreeToSt

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987598 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987606 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala --- @@ -0,0 +1,180 @@ +/* + * Licensed to the Apache So

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987621 --- Diff: mllib/src/main/scala/org/apache/spark/ml/impl/tree/treeParams.scala --- @@ -298,3 +302,200 @@ private[ml] object TreeRegressorParams { // Thes

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987593 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987596 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/5626#discussion_r28987591 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -0,0 +1,225 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95285348 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95285321 [Test build #30764 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30764/consoleFull) for PR 5626 at commit [`855aa9a`](https://gith

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95254646 [Test build #30764 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30764/consoleFull) for PR 5626 at commit [`855aa9a`](https://githu

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95038170 [Test build #30729 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30729/consoleFull) for PR 5626 at commit [`ea3d901`](https://gith

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95038173 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5626#issuecomment-95038036 [Test build #30729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30729/consoleFull) for PR 5626 at commit [`ea3d901`](https://githu

[GitHub] spark pull request: [SPARK-6113] [ml] Tree ensembles for Pipelines...

2015-04-21 Thread jkbradley
GitHub user jkbradley opened a pull request: https://github.com/apache/spark/pull/5626 [SPARK-6113] [ml] Tree ensembles for Pipelines API This is a continuation of [https://github.com/apache/spark/pull/5530] (which was for Decision Trees), but for ensembles: Random Forests and Grad