[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182138566 Here it is: [https://github.com/viirya/spark-1/pull/1] --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182143591 **[Test build #51012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51012/consoleFull)** for PR 8734 at commit [`2bbe037`](https://github.com/apache/spark/commit/2bbe0373e9f3e2d04146bbade7279ca9fcadc6f1). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user sethah commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182135907 Yes, LGTM pending the improved test, thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182155032 LGTM! Thanks @viirya and @sethah I'll merge with master and see how far back I can backport it easily. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182142095 @jkbradley Great thanks. I've merged your PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182126572 @viirya Thanks for the updates. I think the code is correct now, but I'm going to send you a PR (to update this PR) in order to improve the test. I agree with @sethah that the current test does not really test anything. @sethah Does it look good to you, other than the test? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182151869 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182151871 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51012/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-182151701 **[Test build #51012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51012/consoleFull)** for PR 8734 at commit [`2bbe037`](https://github.com/apache/spark/commit/2bbe0373e9f3e2d04146bbade7279ca9fcadc6f1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/8734 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-174160685 **[Test build #49934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49934/consoleFull)** for PR 8734 at commit [`5c44e23`](https://github.com/apache/spark/commit/5c44e23cac423ad57e8e602c227dfa7a0f7b822c). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-174165133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49934/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-174165132 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-174165097 **[Test build #49934 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49934/consoleFull)** for PR 8734 at commit [`5c44e23`](https://github.com/apache/spark/commit/5c44e23cac423ad57e8e602c227dfa7a0f7b822c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50617823 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -331,12 +336,62 @@ class DecisionTreeSuite extends SparkFunSuite with MLlibTestSparkContext { assert(topNode.impurity !== -1.0) // set impurity and predict for child nodes -assert(topNode.leftNode.get.predict.predict === 0.0) -assert(topNode.rightNode.get.predict.predict === 1.0) +if (topNode.leftNode.get.predict.predict === 0.0) { + assert(topNode.rightNode.get.predict.predict === 1.0) +} else { + assert(topNode.leftNode.get.predict.predict === 1.0) + assert(topNode.rightNode.get.predict.predict === 0.0) +} assert(topNode.leftNode.get.impurity === 0.0) assert(topNode.rightNode.get.impurity === 0.0) } + test("Use soft prediction for binary classification with ordered categorical features") { --- End diff -- Hmm, I just want a test case to show it actually order the bins by soft prediction. Although @jkbradley suggested we should use directly `binsToBestSplit`, but in order to do that, we also need to expose many details of `findBestSplits` too, e.g., `binSeqOp`, `getNodeToFeatures` and `partitionAggregates`...etc. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50576539 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -294,8 +295,12 @@ class DecisionTreeSuite extends SparkFunSuite with MLlibTestSparkContext { assert(topNode.impurity !== -1.0) // set impurity and predict for child nodes -assert(topNode.leftNode.get.predict.predict === 0.0) -assert(topNode.rightNode.get.predict.predict === 1.0) +if (topNode.leftNode.get.predict.predict === 0.0) { --- End diff -- I think now that we are ordering by the correct thing, this change is not necessary. Since we order by proportion falling in outcome class 1, the hard prediction 0 will always be before hard prediction 1. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50572835 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -740,7 +740,11 @@ private[ml] object RandomForest extends Logging { val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { -categoryStats.predict +if (categoryStats.count == 2) { --- End diff -- I think you meant `categoryStats.stats.length == 2`. `categoryStats.count` is the count of data points falling into that particular bin. Since we are trying to determine here whether this is regression or binary classification, I think checking `if (binAggregates.metadata.isClassification)` is more clear. Additionally, the code under the if and else statements of `centroidForCategories` is identical except for a single line. It seems cleaner to restructure to something like: ```scala val centroidForCategories = Range(0, numCategories).map { case featureValue => val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { if (binAggregates.metadata.isMulticlass) { // multiclass classification categoryStats.calculate() } else if (binAggregates.metadata.isClassification) { // binary classification categoryStats.stats(1) } else { // regression categoryStats.predict } } else { Double.MaxValue } (featureValue, centroid) } ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50601582 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -331,12 +336,62 @@ class DecisionTreeSuite extends SparkFunSuite with MLlibTestSparkContext { assert(topNode.impurity !== -1.0) // set impurity and predict for child nodes -assert(topNode.leftNode.get.predict.predict === 0.0) -assert(topNode.rightNode.get.predict.predict === 1.0) +if (topNode.leftNode.get.predict.predict === 0.0) { + assert(topNode.rightNode.get.predict.predict === 1.0) +} else { + assert(topNode.leftNode.get.predict.predict === 1.0) + assert(topNode.rightNode.get.predict.predict === 0.0) +} assert(topNode.leftNode.get.impurity === 0.0) assert(topNode.rightNode.get.impurity === 0.0) } + test("Use soft prediction for binary classification with ordered categorical features") { --- End diff -- What is the goal of this test? I guessed that the goal would be to find a case where ordering by hard predictions produces a different (suboptimal) tree than ordering by soft predictions. However, I did a quick simulation for this dataset and the results I got were the same either way. Just wanted to clarify. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50614934 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -740,7 +740,11 @@ private[ml] object RandomForest extends Logging { val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { -categoryStats.predict +if (categoryStats.count == 2) { --- End diff -- @sethah Thanks. You are right. I didn't read this part of codes thoroughly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173506110 **[Test build #49874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49874/consoleFull)** for PR 8734 at commit [`a37d3d8`](https://github.com/apache/spark/commit/a37d3d8fc026a7a42405b5a16814e23c6fcfa3be). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173518885 **[Test build #49874 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49874/consoleFull)** for PR 8734 at commit [`a37d3d8`](https://github.com/apache/spark/commit/a37d3d8fc026a7a42405b5a16814e23c6fcfa3be). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173519027 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173519029 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49874/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50361275 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -740,7 +740,7 @@ private[ml] object RandomForest extends Logging { val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { -categoryStats.predict +categoryStats.prob(categoryStats.predict) --- End diff -- Finding the _proportion_ falling in outcome class 1 simply requires division of the counts by a constant. Since we're just using that number for an ordering, constant division won't matter. They are the same. My initial comment has a typo. It should say for a "binary **outcome**", not "binary category". --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50355886 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -740,7 +740,7 @@ private[ml] object RandomForest extends Logging { val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { -categoryStats.predict +categoryStats.prob(categoryStats.predict) --- End diff -- As I saw from the implementation, `categoryStats.stats(1)` is just the count of class 1. Are we going to order bins by that? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50366332 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -740,7 +740,7 @@ private[ml] object RandomForest extends Logging { val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { -categoryStats.predict +categoryStats.prob(categoryStats.predict) --- End diff -- Yeah, I see. I was thinking we are going to order them by soft prediction of each bin. Actually what we want is to order them by soft prediction of certain class. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173484334 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173484336 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49865/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173484306 **[Test build #49865 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49865/consoleFull)** for PR 8734 at commit [`cd25214`](https://github.com/apache/spark/commit/cd252142a29e6083648bdec6a0646f2b6921b603). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173477452 **[Test build #49865 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49865/consoleFull)** for PR 8734 at commit [`cd25214`](https://github.com/apache/spark/commit/cd252142a29e6083648bdec6a0646f2b6921b603). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50285840 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -740,7 +740,7 @@ private[ml] object RandomForest extends Logging { val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { -categoryStats.predict +categoryStats.prob(categoryStats.predict) --- End diff -- I don't believe this is correct. Ordering by the probability of the prediction is essentially the same as ordering by impurity. That's because when the impurity is low, the predicted value will have high probability and vice versa. From Hastie, Tibshirani, and Friedman: "We order the predictor classes according to the proportion falling in outcome class 1. Then we split this predictor as if it were an ordered predictor." For binary category I think it should be as @jkbradley suggested `categoryStats.stats(1)` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173088520 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173080387 **[Test build #49757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49757/consoleFull)** for PR 8734 at commit [`2c32350`](https://github.com/apache/spark/commit/2c3235054d2d2a98f9f88ae67243bd1ba89eca1e). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173088522 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49757/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-173088436 **[Test build #49757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49757/consoleFull)** for PR 8734 at commit [`2c32350`](https://github.com/apache/spark/commit/2c3235054d2d2a98f9f88ae67243bd1ba89eca1e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50212622 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -337,12 +342,62 @@ class DecisionTreeSuite extends SparkFunSuite with MLlibTestSparkContext { assert(topNode.impurity !== -1.0) // set impurity and predict for child nodes -assert(topNode.leftNode.get.predict.predict === 0.0) -assert(topNode.rightNode.get.predict.predict === 1.0) +if (topNode.leftNode.get.predict.predict === 0.0) { + assert(topNode.rightNode.get.predict.predict === 1.0) +} else { + assert(topNode.leftNode.get.predict.predict === 1.0) + assert(topNode.rightNode.get.predict.predict === 0.0) +} assert(topNode.leftNode.get.impurity === 0.0) assert(topNode.rightNode.get.impurity === 0.0) } + test("Use soft prediction for binary classification with ordered categorical features") { +val arr = Array( + LabeledPoint(0.0, Vectors.dense(1.0, 0.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(0.0, 1.0, 1.0)), // right node + LabeledPoint(0.0, Vectors.dense(2.0, 0.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(0.0, 2.0, 1.0)), // right node + LabeledPoint(1.0, Vectors.dense(1.0, 1.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 2.0))) // left node +val input = sc.parallelize(arr) + +val strategy = new Strategy(algo = Classification, impurity = Gini, maxDepth = 1, + numClasses = 2, categoricalFeaturesInfo = Map(0 -> 3)) +val metadata = DecisionTreeMetadata.buildMetadata(input, strategy) +val (splits, bins) = DecisionTree.findSplitsBins(input, metadata) + +val treeInput = TreePoint.convertToTreeRDD(input, bins, metadata) +val baggedInput = BaggedPoint.convertToBaggedRDD(treeInput, 1.0, 1, false) + +val topNode = Node.emptyNode(nodeIndex = 1) +assert(topNode.predict.predict === Double.MinValue) +assert(topNode.impurity === -1.0) +assert(topNode.isLeaf === false) + +val nodesForGroup = Map((0, Array(topNode))) +val treeToNodeToIndexInfo = Map((0, Map( + (topNode.id, new RandomForest.NodeIndexInfo(0, None)) +))) +val nodeQueue = new mutable.Queue[(Int, Node)]() +DecisionTree.findBestSplits(baggedInput, metadata, Array(topNode), --- End diff -- In order to call `binsToBestSplit` directly, we need to expose many details of `findBestSplits` too, e.g., `binSeqOp`, `getNodeToFeatures ` and `partitionAggregates`...etc., because `binsToBestSplit` needs `binAggregates` and `featuresForNode`..etc. as parameters. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-17141 @jkbradley Sorry for replying late. I will try to finish this soon. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-171491697 Ping! Please let me know if you don't have time to work on this, and I can take it over. Thanks --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-171500434 OK thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-168035189 I'll have bandwidth to get this merged now, so I'll watch for updates. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r48616278 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -776,7 +776,7 @@ private[ml] object RandomForest extends Logging { val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { -categoryStats.predict +categoryStats.calculate() --- End diff -- This should use the soft prediction, not simply the impurity. For regression, that means using predict(), but for binary classification, that will require using categoryStats.stats(1). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r48616306 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -337,12 +342,62 @@ class DecisionTreeSuite extends SparkFunSuite with MLlibTestSparkContext { assert(topNode.impurity !== -1.0) // set impurity and predict for child nodes -assert(topNode.leftNode.get.predict.predict === 0.0) -assert(topNode.rightNode.get.predict.predict === 1.0) +if (topNode.leftNode.get.predict.predict === 0.0) { + assert(topNode.rightNode.get.predict.predict === 1.0) +} else { + assert(topNode.leftNode.get.predict.predict === 1.0) + assert(topNode.rightNode.get.predict.predict === 0.0) +} assert(topNode.leftNode.get.impurity === 0.0) assert(topNode.rightNode.get.impurity === 0.0) } + test("Use soft prediction for binary classification with ordered categorical features") { +val arr = Array( + LabeledPoint(0.0, Vectors.dense(1.0, 0.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(0.0, 1.0, 1.0)), // right node + LabeledPoint(0.0, Vectors.dense(2.0, 0.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(0.0, 2.0, 1.0)), // right node + LabeledPoint(1.0, Vectors.dense(1.0, 1.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 2.0))) // left node +val input = sc.parallelize(arr) + +val strategy = new Strategy(algo = Classification, impurity = Gini, maxDepth = 1, + numClasses = 2, categoricalFeaturesInfo = Map(0 -> 3)) +val metadata = DecisionTreeMetadata.buildMetadata(input, strategy) +val (splits, bins) = DecisionTree.findSplitsBins(input, metadata) + +val treeInput = TreePoint.convertToTreeRDD(input, bins, metadata) +val baggedInput = BaggedPoint.convertToBaggedRDD(treeInput, 1.0, 1, false) + +val topNode = Node.emptyNode(nodeIndex = 1) +assert(topNode.predict.predict === Double.MinValue) +assert(topNode.impurity === -1.0) +assert(topNode.isLeaf === false) + +val nodesForGroup = Map((0, Array(topNode))) +val treeToNodeToIndexInfo = Map((0, Map( + (topNode.id, new RandomForest.NodeIndexInfo(0, None)) +))) +val nodeQueue = new mutable.Queue[(Int, Node)]() +DecisionTree.findBestSplits(baggedInput, metadata, Array(topNode), --- End diff -- Ping --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r42964746 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -294,8 +295,12 @@ class DecisionTreeSuite extends SparkFunSuite with MLlibTestSparkContext { assert(topNode.impurity !== -1.0) // set impurity and predict for child nodes -assert(topNode.leftNode.get.predict.predict === 0.0) -assert(topNode.rightNode.get.predict.predict === 1.0) +if (topNode.leftNode.get.predict.predict === 0.0) { --- End diff -- Because previously we rank the bins according to hard prediction. So for example the bin of hard prediction 0 is always in front of hard prediction 1. Now the order of bins can be switched. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r41326380 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -337,12 +342,62 @@ class DecisionTreeSuite extends SparkFunSuite with MLlibTestSparkContext { assert(topNode.impurity !== -1.0) // set impurity and predict for child nodes -assert(topNode.leftNode.get.predict.predict === 0.0) -assert(topNode.rightNode.get.predict.predict === 1.0) +if (topNode.leftNode.get.predict.predict === 0.0) { + assert(topNode.rightNode.get.predict.predict === 1.0) +} else { + assert(topNode.leftNode.get.predict.predict === 1.0) + assert(topNode.rightNode.get.predict.predict === 0.0) +} assert(topNode.leftNode.get.impurity === 0.0) assert(topNode.rightNode.get.impurity === 0.0) } + test("Use soft prediction for binary classification with ordered categorical features") { +val arr = Array( + LabeledPoint(0.0, Vectors.dense(1.0, 0.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(0.0, 1.0, 1.0)), // right node + LabeledPoint(0.0, Vectors.dense(2.0, 0.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(0.0, 2.0, 1.0)), // right node + LabeledPoint(1.0, Vectors.dense(1.0, 1.0, 0.0)), // left node + LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 2.0))) // left node +val input = sc.parallelize(arr) + +val strategy = new Strategy(algo = Classification, impurity = Gini, maxDepth = 1, + numClasses = 2, categoricalFeaturesInfo = Map(0 -> 3)) +val metadata = DecisionTreeMetadata.buildMetadata(input, strategy) +val (splits, bins) = DecisionTree.findSplitsBins(input, metadata) + +val treeInput = TreePoint.convertToTreeRDD(input, bins, metadata) +val baggedInput = BaggedPoint.convertToBaggedRDD(treeInput, 1.0, 1, false) + +val topNode = Node.emptyNode(nodeIndex = 1) +assert(topNode.predict.predict === Double.MinValue) +assert(topNode.impurity === -1.0) +assert(topNode.isLeaf === false) + +val nodesForGroup = Map((0, Array(topNode))) +val treeToNodeToIndexInfo = Map((0, Map( + (topNode.id, new RandomForest.NodeIndexInfo(0, None)) +))) +val nodeQueue = new mutable.Queue[(Int, Node)]() +DecisionTree.findBestSplits(baggedInput, metadata, Array(topNode), --- End diff -- Can you please update this test to call binsToBestSplit directly? You can change it to be ```private[tree]``` so that it's callable from this test suite. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r41326387 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -294,8 +295,12 @@ class DecisionTreeSuite extends SparkFunSuite with MLlibTestSparkContext { assert(topNode.impurity !== -1.0) // set impurity and predict for child nodes -assert(topNode.leftNode.get.predict.predict === 0.0) -assert(topNode.rightNode.get.predict.predict === 1.0) +if (topNode.leftNode.get.predict.predict === 0.0) { --- End diff -- What happened here? Did the leaf nodes switch because of the internal change? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/8734 [SPARK-10524][ML] Use the soft prediction to order categories' bins JIRA: https://issues.apache.org/jira/browse/SPARK-10524 Currently we use the hard prediction (`ImpurityCalculator.predict`) to order categories' bins. But we should use the soft prediction. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 dt-soft-centroids Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8734.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8734 commit 84260ca52bc037f2074bc062c7d5b88e1657314a Author: Liang-Chi HsiehDate: 2015-09-13T15:27:08Z Use the soft prediction to order categories' bins. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-139891838 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-139887804 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-139892475 [Test build #42382 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42382/consoleFull) for PR 8734 at commit [`84260ca`](https://github.com/apache/spark/commit/84260ca52bc037f2074bc062c7d5b88e1657314a). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-139896122 [Test build #42382 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42382/console) for PR 8734 at commit [`84260ca`](https://github.com/apache/spark/commit/84260ca52bc037f2074bc062c7d5b88e1657314a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-139896144 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8734#issuecomment-139896145 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42382/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org