[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12899 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-218243622 LGTM merging into master 2.0. Let's address any follow-ups in a future patch.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62723405

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
--- End diff --

ok I'll add it when I merge
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62711158

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
--- End diff --

Can we add a comment to say internal accumulators will always count on failures?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62710891

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

Sounds good to me.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62709430

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

If this is the only issue, can we merge this pull request and open a new PR to fix these semantics, which there is clearly some disagreement about?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62708707

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

If there is no spilling, we could say the size of spilling is not defined (null for unknown). We also have the total value, so we could know how many tasks had spilled. Right now, we can't know how many tasks had spilled, which is actually worse. I don't think it's wrong.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62622150

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

These are task-level statistics. Let's say an operator launches 100 tasks, 99 of them don't spill, and only one task spills 100 MB. Then the avg of `spilling size` will be 100 MB if we don't include zero values. This is obviously wrong.
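The arithmetic in the example above can be checked with a short sketch. It is plain Scala with no Spark dependency; the object name `SpillStats`, the helper names, and the use of bytes as the unit are illustrative assumptions and are not part of the patch.

```scala
// Illustrative only: models per-task spill sizes as a plain Seq[Long].
object SpillStats {
  // Mean over every task, counting the tasks that reported zero.
  def meanIncludingZeros(perTask: Seq[Long]): Double =
    if (perTask.isEmpty) 0.0 else perTask.sum.toDouble / perTask.size

  // Mean over only the non-zero tasks, i.e. what a reader would see
  // if zero-valued updates were never sent back to the driver.
  def meanExcludingZeros(perTask: Seq[Long]): Double = {
    val nonZero = perTask.filter(_ != 0L)
    if (nonZero.isEmpty) 0.0 else nonZero.sum.toDouble / nonZero.size
  }

  // 100 tasks: 99 spill nothing, one spills 100 MB.
  val mb: Long = 1024L * 1024L
  val spillBytes: Seq[Long] = Seq.fill(99)(0L) :+ (100L * mb)
}
```

With these inputs the mean is 1 MB when zeros are included and 100 MB when they are dropped, while the sum is the same in both cases.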
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62614867

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

We could treat a zero value as `null`; it's reasonable not to include `null` in sum/avg. Is there any downside to excluding these zeros?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-218042313 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58185/ Test PASSed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-218042312 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-218042198 **[Test build #58185 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58185/consoleFull)** for PR 12899 at commit [`18aa4ab`](https://github.com/apache/spark/commit/18aa4abb4ddd4cf0800e0b353077d083f66096de). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-218027664 **[Test build #58185 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58185/consoleFull)** for PR 12899 at commit [`18aa4ab`](https://github.com/apache/spark/commit/18aa4abb4ddd4cf0800e0b353077d083f66096de).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62595519

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala ---
@@ -66,7 +66,7 @@ private[sql] object SQLMetrics {
   def createMetric(sc: SparkContext, name: String): SQLMetric = {
     val acc = new SQLMetric(SUM_METRIC)
-    acc.register(sc, name = Some(name), countFailedValues = true)
+    acc.register(sc, name = Some(name), countFailedValues = false)
--- End diff --

It was a mistake, SQLMetric should not count failed values.
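To make the effect of the flag concrete, here is a toy model of a driver merging per-attempt updates for a single metric. It does not use Spark's classes; `Attempt`, `FailedValues.merged`, and the numbers are hypothetical, and the retry scenario is only one plausible reason a metric would not want failed values counted.

```scala
// Toy model: one metric, one update per task attempt.
final case class Attempt(succeeded: Boolean, update: Long)

object FailedValues {
  // Sum the updates, skipping failed attempts unless countFailedValues is set.
  def merged(attempts: Seq[Attempt], countFailedValues: Boolean): Long =
    attempts.filter(a => a.succeeded || countFailedValues).map(_.update).sum
}
```

For a task whose first attempt fails after recording 40 and whose retry succeeds after recording 100, `merged` gives 140 with the flag set and 100 without it.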
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62595350

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

and it's for UI.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62595291

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

Because SQL Metrics has some statistics, e.g. max, min, avg, we need all values, even zeros.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62539578

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

Another question: why are zero values of SQL metrics useful? Are they only useful for tests or the UI?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62539364

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

SQL metrics should not count on failures, will we fix that in this PR or a separate one? Then this part should also be updated.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217932594 This patch itself LGTM. I'm merging it into master 2.0
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62538599

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

we should probably change the semantics of internal to mean internal to Spark (i.e. include SQL metrics), but that's a separate issue.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62410420

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

There are 2 concepts:
1. internal accumulators: like GCtime, resultSize, which are internal to DAGScheduler.
2. `countFailedValues` accumulator: `countFailedValues` is an internal flag that can only be set by us.

All internal accumulators are `countFailedValues` accumulators, and SQLMetrics are also `countFailedValues` accumulators.
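The two concepts and the filter under review can be sketched with a toy model. This is not Spark's `AccumulatorV2`: the `Acc` case class, the `AccumulatorFilter` object, and the `"resultSize"` literal are stand-ins. The rule for external accumulators is an assumption based on the line the diff removes, since the quoted hunk stops before that part.

```scala
// Toy model: only the fields the filter reads.
final case class Acc(
    name: Option[String],
    value: Long,
    countFailedValues: Boolean) {
  def isZero: Boolean = value == 0L
}

object AccumulatorFilter {
  // Stand-in for InternalAccumulator.RESULT_SIZE; the literal is made up.
  val ResultSize = "resultSize"

  // Internal accumulators: drop zero values, but always keep RESULT_SIZE
  // because its value is filled in on the driver side.
  def keepInternal(a: Acc): Boolean =
    !a.isZero || a.name.contains(ResultSize)

  // External accumulators (assumed rule, taken from the removed line):
  // zero values are kept; after a failure only countFailedValues ones survive.
  def keepExternal(a: Acc, taskFailed: Boolean): Boolean =
    !taskFailed || a.countFailedValues

  def collect(internal: Seq[Acc], external: Seq[Acc], taskFailed: Boolean): Seq[Acc] =
    internal.filter(keepInternal) ++ external.filter(keepExternal(_, taskFailed))
}
```

In this model a zero-valued SQL metric is still sent back when the task succeeds, which is the behaviour the "we should not filter" comment in the diff asks for.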
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62397915

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

internal here means task metrics. "internal" to core.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62374320

--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -155,7 +155,13 @@ private[spark] abstract class Task[T](
    */
   def collectAccumulatorUpdates(taskFailed: Boolean = false): Seq[AccumulatorV2[_, _]] = {
     if (context != null) {
-      context.taskMetrics.accumulators().filter { a => !taskFailed || a.countFailedValues }
+      context.taskMetrics.internalAccums.filter { a =>
+        // RESULT_SIZE accumulator is always zero at executor, we need to send it back as its
+        // value will be updated at driver side.
+        !a.isZero || a.name == Some(InternalAccumulator.RESULT_SIZE)
+      // zero value external accumulators may still be useful, e.g. SQLMetrics, we should not filter
--- End diff --

We changed countOnFailure from false to true recently; is that a design change? Why are SQLMetrics external accumulators?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217394779 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217394784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57969/ Test PASSed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217394568 **[Test build #57969 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57969/consoleFull)** for PR 12899 at commit [`b4e7385`](https://github.com/apache/spark/commit/b4e7385823880b622f17d5cdf57fca037fe93cb7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217369918 **[Test build #57969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57969/consoleFull)** for PR 12899 at commit [`b4e7385`](https://github.com/apache/spark/commit/b4e7385823880b622f17d5cdf57fca037fe93cb7).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217352598 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217352599 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57959/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217352525 **[Test build #57959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57959/consoleFull)** for PR 12899 at commit [`cb21034`](https://github.com/apache/spark/commit/cb210349233e06a368f4693b92ed50314e168eab). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217344297 **[Test build #57959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57959/consoleFull)** for PR 12899 at commit [`cb21034`](https://github.com/apache/spark/commit/cb210349233e06a368f4693b92ed50314e168eab).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217340511 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217340512 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57946/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217340426 **[Test build #57946 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57946/consoleFull)** for PR 12899 at commit [`b48dda8`](https://github.com/apache/spark/commit/b48dda8402bd85ab02586e36bd7eb9440c140e00). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217338062 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217338038 **[Test build #57944 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57944/consoleFull)** for PR 12899 at commit [`41f5cb4`](https://github.com/apache/spark/commit/41f5cb4da8d9192bc75f547ec1b4dd68d6205161). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217338064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57944/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217330300 **[Test build #57946 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57946/consoleFull)** for PR 12899 at commit [`b48dda8`](https://github.com/apache/spark/commit/b48dda8402bd85ab02586e36bd7eb9440c140e00).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217324749 **[Test build #57944 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57944/consoleFull)** for PR 12899 at commit [`41f5cb4`](https://github.com/apache/spark/commit/41f5cb4da8d9192bc75f547ec1b4dd68d6205161).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217324631 retest this please
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217229568 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57898/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217229565 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217229469 **[Test build #57898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57898/consoleFull)** for PR 12899 at commit [`41f5cb4`](https://github.com/apache/spark/commit/41f5cb4da8d9192bc75f547ec1b4dd68d6205161). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217203206 retest this please
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217204592 **[Test build #57898 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57898/consoleFull)** for PR 12899 at commit [`41f5cb4`](https://github.com/apache/spark/commit/41f5cb4da8d9192bc75f547ec1b4dd68d6205161).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217200613 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217200615 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57893/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217200510 **[Test build #57893 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57893/consoleFull)** for PR 12899 at commit [`41f5cb4`](https://github.com/apache/spark/commit/41f5cb4da8d9192bc75f547ec1b4dd68d6205161). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217197047 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57892/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217197044 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217196930 **[Test build #57892 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57892/consoleFull)** for PR 12899 at commit [`2d3d3d4`](https://github.com/apache/spark/commit/2d3d3d4b65d659188f2a328282fbf81e35657014). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217172496 **[Test build #57893 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57893/consoleFull)** for PR 12899 at commit [`41f5cb4`](https://github.com/apache/spark/commit/41f5cb4da8d9192bc75f547ec1b4dd68d6205161).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62196091 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1097,8 +1097,8 @@ class DAGScheduler( throw new SparkException(s"attempted to access non-existent accumulator $id") } acc.merge(updates.asInstanceOf[AccumulatorV2[Any, Any]]) -// To avoid UI cruft, ignore cases where value wasn't updated -if (acc.name.isDefined && !updates.isZero) { +// Only display named accumulators on UI. +if (acc.name.isDefined) { --- End diff -- I reverted this change because we can't use an assert here. The code is surrounded by a `try`/`catch` that catches all `NonFatal` exceptions.
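The point about the surrounding `try`/`catch` can be illustrated with a minimal, self-contained sketch. It does not reproduce the `DAGScheduler` code; `SwallowedAssertSketch` and `updateWithAssert` are hypothetical names used only for illustration. The sketch assumes assertions are enabled (the Scala default) and relies on the fact that `scala.util.control.NonFatal` matches `java.lang.AssertionError`, so a failed `assert` inside such a block is handled by the catch clause instead of propagating.

```scala
import scala.util.control.NonFatal

// Hypothetical sketch, not Spark code: shows why an assert placed inside a
// try/catch on NonFatal cannot make the caller fail.
object SwallowedAssertSketch {
  // Returns a description of what happened to a single (pretend) update.
  def updateWithAssert(updateIsZero: Boolean): String =
    try {
      // A failed assert throws java.lang.AssertionError.
      assert(!updateIsZero, "unexpected zero-valued update")
      "merged"
    } catch {
      // NonFatal matches AssertionError, so the failure ends up here.
      case NonFatal(e) => s"logged: ${e.getMessage}"
    }
}
```

Calling `SwallowedAssertSketch.updateWithAssert(true)` returns a string starting with `logged:` rather than throwing, which is the behavior the comment above describes.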
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217171297 **[Test build #57892 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57892/consoleFull)** for PR 12899 at commit [`2d3d3d4`](https://github.com/apache/spark/commit/2d3d3d4b65d659188f2a328282fbf81e35657014).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217115149 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57866/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217115146 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217115088 **[Test build #57866 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57866/consoleFull)** for PR 12899 at commit [`c523fee`](https://github.com/apache/spark/commit/c523feeb58376ed4813d1e5119638fe6528f742a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217101715 **[Test build #57866 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57866/consoleFull)** for PR 12899 at commit [`c523fee`](https://github.com/apache/spark/commit/c523feeb58376ed4813d1e5119638fe6528f742a).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217100897 @davies, I think the added `assert` can guarantee this?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217100854 retest this please
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217075463 It would be great if we could have a test to make sure that we always have this behavior.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217075163 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57832/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217075088 **[Test build #57832 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57832/consoleFull)** for PR 12899 at commit [`c523fee`](https://github.com/apache/spark/commit/c523feeb58376ed4813d1e5119638fe6528f742a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217075162 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217064808 **[Test build #57832 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57832/consoleFull)** for PR 12899 at commit [`c523fee`](https://github.com/apache/spark/commit/c523feeb58376ed4813d1e5119638fe6528f742a).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-217002044 Since we have only one BlockStatusesAccumulator object in TaskMetrics, it may not be worth doing 2).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62112750 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1097,8 +1097,8 @@ class DAGScheduler( throw new SparkException(s"attempted to access non-existent accumulator $id") } acc.merge(updates.asInstanceOf[AccumulatorV2[Any, Any]]) -// To avoid UI cruft, ignore cases where value wasn't updated -if (acc.name.isDefined && !updates.isZero) { +// Only display named accumulators on UI. +if (acc.name.isDefined) { --- End diff -- It's because the executor no longer sends back updates whose value is zero, so the second condition (`!updates.isZero`) is always assumed to be true. Maybe we should add an assert or something instead.
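To make the reasoning above concrete, here is a minimal sketch that does not use Spark at all. `LongAcc` is a hypothetical stand-in for an accumulator (it is not `AccumulatorV2`), and `collectUpdates` and `namesToDisplay` are illustrative names. Under those assumptions, once the executor side filters out zero-valued accumulators, every update that reaches the driver side is non-zero, so checking only the name is enough there.

```scala
// Hypothetical stand-in for an accumulator; NOT Spark's AccumulatorV2.
final class LongAcc(val name: Option[String]) {
  private var sum = 0L
  def add(v: Long): Unit = sum += v
  def isZero: Boolean = sum == 0L
  def value: Long = sum
}

object AccumulatorFilterSketch {
  // "Executor side": only ship accumulators whose value is not zero.
  def collectUpdates(accs: Seq[LongAcc]): Seq[LongAcc] =
    accs.filter(a => !a.isZero)

  // "Driver side": display every named update; the zero check is redundant
  // because zero-valued updates were never sent.
  def namesToDisplay(updates: Seq[LongAcc]): Seq[String] =
    updates.flatMap(_.name)
}
```

For example, given one named accumulator with a non-zero value, one named accumulator left at zero, and one unnamed non-zero accumulator, `collectUpdates` keeps two of them and `namesToDisplay` returns only the first name.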
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62112538
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -291,25 +291,32 @@ private[spark] object TaskMetrics extends Logging {
 private[spark] class BlockStatusesAccumulator
   extends AccumulatorV2[(BlockId, BlockStatus), Seq[(BlockId, BlockStatus)]] {
-  private[this] var _seq = ArrayBuffer.empty[(BlockId, BlockStatus)]
+  private[this] var _seq: ArrayBuffer[(BlockId, BlockStatus)] = _
-  override def isZero(): Boolean = _seq.isEmpty
+  private def seq = {
--- End diff --
Can you add a return type?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216998738 I agree. I think (1) is straightforward and we should do it. (2) I'm not so sure since it only affects one of the accumulators.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216965921 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216965926 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57771/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216965728 **[Test build #57771 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57771/consoleFull)** for PR 12899 at commit [`b4571f9`](https://github.com/apache/spark/commit/b4571f96d0e73bdd5c0b53d2f90ff68c0bc98105). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216943351 @cloud-fan we don't need to do this here but I think we can also substantially cut down the size of a task result if we consolidate all the accumulators into a single one in TaskMetrics.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216940164
```
scala> ser.newInstance().serialize(ArrayBuffer.empty[Any])
res4: java.nio.ByteBuffer = java.nio.HeapByteBuffer[pos=0 lim=173 cap=214]

scala> ser.newInstance().serialize(null)
res5: java.nio.ByteBuffer = java.nio.HeapByteBuffer[pos=0 lim=5 cap=32]

scala> ser.newInstance().serialize(new java.util.ArrayList[Long])
res6: java.nio.ByteBuffer = java.nio.HeapByteBuffer[pos=0 lim=58 cap=64]

scala> ser.newInstance().serialize(1L)
res7: java.nio.ByteBuffer = java.nio.HeapByteBuffer[pos=0 lim=82 cap=128]

scala> ser.newInstance().serialize(Array(1L))
res8: java.nio.ByteBuffer = java.nio.HeapByteBuffer[pos=0 lim=35 cap=64]
```
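The REPL session above compares serialized sizes using a serializer instance `ser` whose setup is not shown. A rough way to reproduce that kind of comparison without Spark is to count the bytes plain Java serialization writes. This is only a sketch: `SerializedSizeSketch` and `serializedSize` are hypothetical names, and the byte counts from `java.io.ObjectOutputStream` may differ from what `ser` produced depending on the serializer and the Scala version.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical helper, not part of Spark.
object SerializedSizeSketch {
  // Number of bytes plain Java serialization writes for `value`.
  def serializedSize(value: AnyRef): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    // close() flushes any buffered data into `bytes`.
    try out.writeObject(value) finally out.close()
    bytes.size()
  }
}
```

For example, `SerializedSizeSketch.serializedSize(null)` can be compared against `SerializedSizeSketch.serializedSize(scala.collection.mutable.ArrayBuffer.empty[Any])` to see how much a null field saves over an empty buffer.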
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216937041 @cloud-fan How much can we gain from 2)?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62077628
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -291,25 +291,32 @@ private[spark] object TaskMetrics extends Logging {
 private[spark] class BlockStatusesAccumulator
   extends AccumulatorV2[(BlockId, BlockStatus), Seq[(BlockId, BlockStatus)]] {
-  private[this] var _seq = ArrayBuffer.empty[(BlockId, BlockStatus)]
+  private[this] var _seq: ArrayBuffer[(BlockId, BlockStatus)] = _
-  override def isZero(): Boolean = _seq.isEmpty
+  private def seq = {
--- End diff --
Should this be thread-safe?
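One possible answer to the thread-safety question, sketched outside Spark: keep the buffer null until the first add (presumably so an untouched accumulator serializes to a few bytes) and guard every access with `synchronized`, so two threads cannot both allocate it. `LazyBuffer` is a hypothetical stand-in, not the PR's `BlockStatusesAccumulator`, and the body of `seq` is an assumption because the diff above is cut off at that point.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a lazily allocated accumulator buffer.
final class LazyBuffer[T] {
  // Stays null until the first add.
  private var _seq: ArrayBuffer[T] = null

  // Explicit return type; synchronized so allocation happens at most once.
  private def seq: ArrayBuffer[T] = synchronized {
    if (_seq == null) _seq = ArrayBuffer.empty[T]
    _seq
  }

  def isZero: Boolean = synchronized { _seq == null || _seq.isEmpty }

  def add(v: T): Unit = synchronized { seq += v }

  // Returns an immutable snapshot so callers never see later mutation.
  def value: Seq[T] = synchronized {
    if (_seq == null) Nil else _seq.toList
  }
}
```

Locking on every call is the simplest choice here; whether that cost is acceptable depends on how often the accumulator is touched from more than one thread, which this thread does not establish.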
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12899#discussion_r62077575
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -1097,8 +1097,8 @@ class DAGScheduler(
       throw new SparkException(s"attempted to access non-existent accumulator $id")
     }
     acc.merge(updates.asInstanceOf[AccumulatorV2[Any, Any]])
-    // To avoid UI cruft, ignore cases where value wasn't updated
-    if (acc.name.isDefined && !updates.isZero) {
+    // Only display named accumulators on UI.
+    if (acc.name.isDefined) {
--- End diff --
Why this change?
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216935103 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216935106 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57767/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216935011 **[Test build #57767 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57767/consoleFull)** for PR 12899 at commit [`b4571f9`](https://github.com/apache/spark/commit/b4571f96d0e73bdd5c0b53d2f90ff68c0bc98105). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216934067 **[Test build #57771 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57771/consoleFull)** for PR 12899 at commit [`b4571f9`](https://github.com/apache/spark/commit/b4571f96d0e73bdd5c0b53d2f90ff68c0bc98105).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216932538 test this please
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216923365 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57766/ Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216923361 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216923251 **[Test build #57766 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57766/consoleFull)** for PR 12899 at commit [`57fc9af`](https://github.com/apache/spark/commit/57fc9afbda5265d389771c0e102266e39171034b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216900257 **[Test build #57767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57767/consoleFull)** for PR 12899 at commit [`b4571f9`](https://github.com/apache/spark/commit/b4571f96d0e73bdd5c0b53d2f90ff68c0bc98105).
[GitHub] spark pull request: [SPARK-12837][CORE] reduce network IO for accu...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12899#issuecomment-216899245 cc @davies @rxin