Github user advancedxy commented on the issue:

    https://github.com/apache/spark/pull/21165

> However, I don't agree that user-side accumulators should get updates from killed tasks; that changes the semantics of accumulators. And I don't think end users need to care about killed tasks. Similarly, when we implemented task metrics, we needed to count failed tasks, but user-side accumulators still skip failed tasks. I think we should follow that approach here as well.

I don't agree that end users don't care about killed tasks. For example, a user may want to record the CPU time of every task and get the total CPU time for the application. The default behaviour, however, should remain backward-compatible with the existing behaviour.

```
private[spark] case class AccumulatorMetadata(
    id: Long,
    name: Option[String],
    countFailedValues: Boolean) extends Serializable
```

The metadata already has a `countFailedValues` field; should we reuse it, or add a new field? Note, though, that this field is not exposed to end users today...
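As a concrete illustration of the use case above, here is a minimal sketch of summing per-task CPU time in a user-side accumulator, using only existing Spark APIs (`longAccumulator` and the JDK's `ThreadMXBean`). Under the current semantics, updates from killed tasks are dropped, so the accumulated total would undercount the CPU actually spent by the application:

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("cpu-time-sketch").getOrCreate()
val sc = spark.sparkContext

// Named accumulators registered this way also show up in the web UI.
val cpuTime = sc.longAccumulator("totalCpuTimeNanos")

sc.parallelize(1 to 100).foreach { _ =>
  val bean = java.lang.management.ManagementFactory.getThreadMXBean
  val start = bean.getCurrentThreadCpuTime
  // ... task work ...
  // Updates from killed (e.g. speculative) tasks are not counted today.
  cpuTime.add(bean.getCurrentThreadCpuTime - start)
}
```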