Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23207#discussion_r238633725

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala ---
    @@ -92,6 +92,12 @@ private[spark] class ShuffleMapTask(
           threadMXBean.getCurrentThreadCpuTime - deserializeStartCpuTime
         } else 0L

    +    // Register the shuffle write metrics reporter to shuffleWriteMetrics.
    +    if (dep.shuffleWriteMetricsReporter.isDefined) {
    +      context.taskMetrics().shuffleWriteMetrics.registerExternalShuffleWriteReporter(
    --- End diff --

    A simpler idea:

    1. Create a `class GroupedShuffleWriteMetricsReporter(reporters: Seq[ShuffleWriteMetricsReporter]) extends ShuffleWriteMetricsReporter`, which proxies all metrics updates to the input reporters.
    2. Create a `GroupedShuffleWriteMetricsReporter` instance here: `new GroupedShuffleWriteMetricsReporter(Seq(dep.shuffleWriteMetricsReporter.get, context.taskMetrics().shuffleWriteMetrics))`, and pass it to `manager.getWriter`.

    I think we can use the same approach for read metrics as well.
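The proxy class the comment proposes could be sketched roughly as follows. Note this is only an illustration: the real `ShuffleWriteMetricsReporter` trait lives in `org.apache.spark.shuffle` and is `private[spark]`, so the simplified trait below is a stand-in with the same method names, not the actual Spark interface.

```scala
// Simplified stand-in for Spark's private[spark] ShuffleWriteMetricsReporter
// trait (org.apache.spark.shuffle); method names mirror the real interface.
trait ShuffleWriteMetricsReporter {
  def incBytesWritten(v: Long): Unit
  def incRecordsWritten(v: Long): Unit
  def incWriteTime(v: Long): Unit
  def decBytesWritten(v: Long): Unit
  def decRecordsWritten(v: Long): Unit
}

// Proxies every metrics update to all of the underlying reporters, so the
// external reporter and TaskMetrics' shuffleWriteMetrics both stay in sync.
class GroupedShuffleWriteMetricsReporter(reporters: Seq[ShuffleWriteMetricsReporter])
    extends ShuffleWriteMetricsReporter {
  override def incBytesWritten(v: Long): Unit = reporters.foreach(_.incBytesWritten(v))
  override def incRecordsWritten(v: Long): Unit = reporters.foreach(_.incRecordsWritten(v))
  override def incWriteTime(v: Long): Unit = reporters.foreach(_.incWriteTime(v))
  override def decBytesWritten(v: Long): Unit = reporters.foreach(_.decBytesWritten(v))
  override def decRecordsWritten(v: Long): Unit = reporters.foreach(_.decRecordsWritten(v))
}
```

The caller would then build one grouped instance from both reporters and hand it to the shuffle writer, so each `incBytesWritten` etc. fans out to every reporter in the sequence.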