Github user advancedxy commented on the issue:

    https://github.com/apache/spark/pull/21165

> However, I don't agree that user-side accumulators should get updates from killed tasks; that changes the semantics of accumulators. And I don't think end users need to care about killed tasks. Similarly, when we implemented task metrics, we needed to count failed tasks, but user-side accumulators still skip failed tasks. I think we should follow that approach here as well.

I don't agree that end users don't care about killed tasks. For example, a user may want to record the CPU time of every task and get the total CPU time for the application. The default behaviour, however, should remain backward-compatible with the existing behaviour.

```
private[spark] case class AccumulatorMetadata(
    id: Long,
    name: Option[String],
    countFailedValues: Boolean) extends Serializable
```

The metadata already has a `countFailedValues` field; should we reuse it, or add a new field? Note, though, that this field is not exposed to end users today...
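As a concrete illustration of the use case above, here is a minimal sketch of summing per-task CPU time in a user-side accumulator, using only existing Spark APIs (`longAccumulator` and the JDK's `ThreadMXBean`). Under the current semantics, updates from killed tasks are dropped, so the accumulated total would undercount the CPU actually spent by the application:

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("cpu-time-sketch").getOrCreate()
val sc = spark.sparkContext

// Named accumulators registered this way also show up in the web UI.
val cpuTime = sc.longAccumulator("totalCpuTimeNanos")

sc.parallelize(1 to 100).foreach { _ =>
  val bean = java.lang.management.ManagementFactory.getThreadMXBean
  val start = bean.getCurrentThreadCpuTime
  // ... task work ...
  // Updates from killed (e.g. speculative) tasks are not counted today.
  cpuTime.add(bean.getCurrentThreadCpuTime - start)
}
```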