[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21165 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90901/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90901 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90901/testReport)** for PR 21165 at commit [`74911b7`](https://github.com/apache/spark/commit/74911b7a8d7714618ab060b3227e33505b0c5d05). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90901 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90901/testReport)** for PR 21165 at commit [`74911b7`](https://github.com/apache/spark/commit/74911b7a8d7714618ab060b3227e33505b0c5d05). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21165 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90891/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90891 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90891/testReport)** for PR 21165 at commit [`74911b7`](https://github.com/apache/spark/commit/74911b7a8d7714618ab060b3227e33505b0c5d05). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90891 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90891/testReport)** for PR 21165 at commit [`74911b7`](https://github.com/apache/spark/commit/74911b7a8d7714618ab060b3227e33505b0c5d05). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21165 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 Gently ping @cloud-fan again. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90643/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90643/testReport)** for PR 21165 at commit [`59c2807`](https://github.com/apache/spark/commit/59c2807dca5cc1c9332a1b3e7fcc73214df436ec). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90643/testReport)** for PR 21165 at commit [`59c2807`](https://github.com/apache/spark/commit/59c2807dca5cc1c9332a1b3e7fcc73214df436ec). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21165 I think we can just update MimaExcludes, since it's developer API. cc @JoshRosen --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 Looks like that simply add fields with default values into case class will break binary compatibility. How should we deal with that? Add to MimaExcludes or add missing methods? @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90595 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90595/testReport)** for PR 21165 at commit [`945c1d5`](https://github.com/apache/spark/commit/945c1d5945e2cf836e36f4c670b5b80dc83b1c76). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90595/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21165 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90595 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90595/testReport)** for PR 21165 at commit [`945c1d5`](https://github.com/apache/spark/commit/945c1d5945e2cf836e36f4c670b5b80dc83b1c76). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90585 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90585/testReport)** for PR 21165 at commit [`05d1d9c`](https://github.com/apache/spark/commit/05d1d9cad761bb09e1131162458fecd5e34f02d2). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90585/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21165 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21165 **[Test build #90585 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90585/testReport)** for PR 21165 at commit [`05d1d9c`](https://github.com/apache/spark/commit/05d1d9cad761bb09e1131162458fecd5e34f02d2). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21165 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 ping @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 @jiangxb1987 @cloud-fan I think it's ready for review. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 > We should not do these 2 things together, and to me the second one is way simpler to get in and we should do it first. Agreed. For the scope of this pr, let's get killed tasks's accumulators into metrics first. After that we can discuss the possibility to expose the ability under users' request. > but please make sure this patch only touches internal accumulators that are used for metrics reporting. After a second look, this part is already be handled by Task's collectAccumulatorUpdates: ``` def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = { if (context != null) { // Note: internal accumulators representing task metrics always count failed values context.taskMetrics.nonZeroInternalAccums() ++ // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not // filter them out. context.taskMetrics.externalAccums.filter(a => !taskFailed || a.countFailedValues) } else { Seq.empty } } ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21165 > For example user may want to record CPU time for every task and get the total CPU time for the application. The problem is, shall we allow end users to collect metrics via accumulators? Currently only Spark can do that via internal accumulators which count failed tasks. We need a careful API design about how to expose this ability in the end users. In the meanwhile, since we already count failed tasks, it makes sense to also count killed tasks for internal metrics collecting. We should not do these 2 things together, and to me the second one is way simpler to get in and we should do it first. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 > However, I don't agree user side accumulators should get updates from killed tasks, that changes the semantic of accumulators. And I don't think end-users need to care about killed tasks. Similarly, when we implement task metrics, we need to count failed tasks, but user side accumulator still skips failed tasks. I think we should also follow that approach. I don't agree that end-user didn't care killed tasks. For example user may want to record CPU time for every task and get the total CPU time for the application. However the default behaviour should keep backward-compatibility with existing behaviour. ``` private[spark] case class AccumulatorMetadata( id: Long, name: Option[String], countFailedValues: Boolean) extends Serializable ``` The metadata has `countFailedValues` field, we can use this or add a new field? However we didn't expose this field to end user... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21165 I do agree task killed event should carry metrics update, as it's reasonable to count killed tasks for something like how many bytes were read from files. However, I don't agree user side accumulators should get updates from killed tasks, that changes the semantic of accumulators. And I don't think end-users need to care about killed tasks. I haven't read the PR yet, but please make sure this patch only touches internal accumulators that are used for metrics reporting. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 I add a note for accumulator update. Please comment if more document is needed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 > It should be [Spark-20087] instead of [Spark 20087] in the title. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark 20087][CORE] Attach accumulators / metrics to 'Ta...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21165 It should be `[Spark-20087]` instead of `[Spark 20087]` in the title. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21165: [Spark 20087][CORE] Attach accumulators / metrics to 'Ta...
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 > we should document the changes in a migration document or something, I think documentation is necessary, will update the documentation tomorrow (Beijing time) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org