Patrick Wendell created SPARK-10620:
---------------------------------------

             Summary: Look into whether accumulator mechanism can replace TaskMetrics
                 Key: SPARK-10620
                 URL: https://issues.apache.org/jira/browse/SPARK-10620
             Project: Spark
          Issue Type: Task
          Components: Spark Core
            Reporter: Patrick Wendell
            Assignee: Andrew Or


This task is simply to explore whether the internal representation used by 
TaskMetrics could be implemented with accumulators rather than maintaining two 
separate mechanisms. Note that we need to continue to preserve the existing 
"Task Metric" data structures that are exposed to users through event logs etc. 
The question is whether we can use a single internal code path and perhaps make 
this easier to extend in the future.
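
As a rough illustration of the single-code-path idea (not a proposal for the 
actual API), the sketch below models a task metric as a named accumulator-style 
counter rather than a dedicated mutable field on TaskMetrics, while the 
user-facing "Task Metric" view is rendered from those counters. The 
InternalAccumulator and TaskMetricsView names here are hypothetical and only 
show the shape of the consolidation.

{code}
// Hypothetical sketch: a task metric backed by an accumulator-style counter
// instead of a dedicated mutable field on TaskMetrics. Names are illustrative.
class InternalAccumulator(val name: String, private var value: Long = 0L) {
  def add(delta: Long): Unit = { value += delta }                        // task-side update path
  def merge(other: InternalAccumulator): Unit = { value += other.current } // driver-side aggregation
  def current: Long = value
  def reset(): Unit = { value = 0L }                                     // relevant to retry semantics
}

// The user-facing "Task Metric" view stays the same shape; it is simply
// rendered from accumulators instead of being its own mutable structure.
class TaskMetricsView(accums: Map[String, InternalAccumulator]) {
  def bytesRead: Long = accums.get("input.bytesRead").map(_.current).getOrElse(0L)
  def recordsRead: Long = accums.get("input.recordsRead").map(_.current).getOrElse(0L)
}

object SingleCodePathDemo extends App {
  val bytes = new InternalAccumulator("input.bytesRead")
  val records = new InternalAccumulator("input.recordsRead")
  bytes.add(4096); records.add(128)   // updates from the task's read path
  val view = new TaskMetricsView(Map(bytes.name -> bytes, records.name -> records))
  println(s"bytesRead=${view.bytesRead}, recordsRead=${view.recordsRead}")
}
{code}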

I think there are a few things to look into:
- How do the semantics of accumulators on stage retries differ from aggregate 
TaskMetrics for a stage? Could we give internal accumulators clearer retry 
semantics so that the two match, for instance by zeroing accumulator values 
when a stage is retried (see the discussion in SPARK-10042, and the sketch 
after this list)?
- Are there metrics that do not fit well into the accumulator model, or that 
would be difficult to update as accumulators?
- If we expose metrics through accumulators in the future rather than 
continuing to add fields to TaskMetrics, what is the best way to ensure 
compatibility?
- Is it worth it to do this, or is the consolidation too complicated to justify?
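
To make the retry question in the first bullet concrete (see also SPARK-10042), 
the sketch below contrasts two hypothetical policies for a stage-level 
aggregate when a stage attempt fails and is rerun: carrying partial values 
forward versus zeroing them at the start of the new attempt. The StageAggregate 
type is made up purely for illustration.

{code}
// Hypothetical illustration of the two retry policies discussed above.
class StageAggregate(var total: Long = 0L) {
  def addTaskValue(v: Long): Unit = { total += v }
  def resetForRetry(): Unit = { total = 0L }
}

object RetrySemanticsDemo extends App {
  // Attempt 0: two tasks report values, a third fails, and the stage is resubmitted.
  val keepAcrossRetries = new StageAggregate()
  val zeroOnRetry       = new StageAggregate()
  Seq(100L, 200L).foreach { v => keepAcrossRetries.addTaskValue(v); zeroOnRetry.addTaskValue(v) }

  // Stage retry begins: one policy keeps the partial counts, the other clears them.
  zeroOnRetry.resetForRetry()

  // Attempt 1: all three tasks succeed.
  Seq(100L, 200L, 300L).foreach { v => keepAcrossRetries.addTaskValue(v); zeroOnRetry.addTaskValue(v) }

  println(s"keep-across-retries total = ${keepAcrossRetries.total}") // 900: double-counts attempt 0
  println(s"zero-on-retry total       = ${zeroOnRetry.total}")       // 600: matches per-stage TaskMetrics aggregates
}
{code}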


