[ 
https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186379#comment-15186379
 ] 

Liwei Lin commented on SPARK-10620:
-----------------------------------

hi [~andrewor14], in the "\[3\] A Simpler Accumulator API" section of the 
design doc:
{quote}
Since the design of this is mostly orthogonal to the rest of this document, 
here we only outline 
the desire for a new, simpler API, and does not discuss the solution. The 
actual design will be in 
a separate design doc.
{quote}

Anywhere to find that separate "Simpler Accumulator API" design doc please? 
Thanks!

> Look into whether accumulator mechanism can replace TaskMetrics
> ---------------------------------------------------------------
>
>                 Key: SPARK-10620
>                 URL: https://issues.apache.org/jira/browse/SPARK-10620
>             Project: Spark
>          Issue Type: Task
>          Components: Spark Core
>            Reporter: Patrick Wendell
>            Assignee: Andrew Or
>             Fix For: 2.0.0
>
>         Attachments: accums-and-task-metrics.pdf
>
>
> This task is simply to explore whether the internal representation used by 
> TaskMetrics could be performed by using accumulators rather than having two 
> separate mechanisms. Note that we need to continue to preserve the existing 
> "Task Metric" data structures that are exposed to users through event logs 
> etc. The question is can we use a single internal codepath and perhaps make 
> this easier to extend in the future.
> I think a full exploration would answer the following questions:
> - How do the semantics of accumulators on stage retries differ from aggregate 
> TaskMetrics for a stage? Could we implement clearer retry semantics for 
> internal accumulators to allow them to be the same - for instance, zeroing 
> accumulator values if a stage is retried (see discussion here: SPARK-10042).
> - Are there metrics that do not fit well into the accumulator model, or would 
> be difficult to update as an accumulator.
> - If we expose metrics through accumulators in the future rather than 
> continuing to add fields to TaskMetrics, what is the best way to coerce 
> compatibility?
> - Are there any other considerations?
> - Is it worth it to do this, or is the consolidation too complicated to 
> justify?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to