[ https://issues.apache.org/jira/browse/SPARK-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Or updated SPARK-10620:
------------------------------
    Attachment: AccumulatorsandTaskMetricsinSpark2.0.pdf

> Look into whether accumulator mechanism can replace TaskMetrics
> ---------------------------------------------------------------
>
>                 Key: SPARK-10620
>                 URL: https://issues.apache.org/jira/browse/SPARK-10620
>             Project: Spark
>          Issue Type: Task
>          Components: Spark Core
>            Reporter: Patrick Wendell
>            Assignee: Andrew Or
>         Attachments: AccumulatorsandTaskMetricsinSpark2.0.pdf
>
>
> This task is simply to explore whether the internal representation used by
> TaskMetrics could be implemented using accumulators rather than having two
> separate mechanisms. Note that we need to continue to preserve the existing
> "Task Metric" data structures that are exposed to users through event logs
> etc. The question is whether we can use a single internal code path and
> perhaps make this easier to extend in the future.
> I think a full exploration would answer the following questions:
> - How do the semantics of accumulators on stage retries differ from
> aggregate TaskMetrics for a stage? Could we implement clearer retry
> semantics for internal accumulators to allow them to match - for instance,
> zeroing accumulator values if a stage is retried (see discussion here:
> SPARK-10042)?
> - Are there metrics that do not fit well into the accumulator model, or
> that would be difficult to update as an accumulator?
> - If we expose metrics through accumulators in the future rather than
> continuing to add fields to TaskMetrics, what is the best way to preserve
> compatibility?
> - Are there any other considerations?
> - Is it worth doing this, or is the consolidation too complicated to
> justify?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
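The retry-semantics question in the issue description (zeroing accumulator values when a stage is retried, per the SPARK-10042 discussion) can be sketched in a few lines. This is a hypothetical, self-contained model - `MetricAccumulator` and `RetrySemanticsDemo` are illustrative names, not Spark's actual API - showing why a reset-on-retry rule keeps a retried stage from double-counting partial updates from the failed attempt:

```java
// Hypothetical model (not Spark's real accumulator API): an internal
// per-stage metric kept as an accumulator, zeroed when the stage is retried.
class MetricAccumulator {
    private final String name;
    private long value = 0L;

    MetricAccumulator(String name) { this.name = name; }

    // Called from tasks as they run.
    void add(long v) { value += v; }

    // Merge a per-task partial into the stage-level total.
    void merge(MetricAccumulator other) { value += other.value; }

    // Clearer retry semantics: zero the value before re-running the stage,
    // so the failed attempt's partial updates are discarded.
    void reset() { value = 0L; }

    long current() { return value; }
}

public class RetrySemanticsDemo {
    public static void main(String[] args) {
        MetricAccumulator bytesRead =
            new MetricAccumulator("internal.metrics.input.bytesRead");

        // Attempt 0: two tasks finish; a third fails after adding 50 bytes.
        bytesRead.add(100);
        bytesRead.add(200);
        bytesRead.add(50);
        System.out.println("after failed attempt: " + bytesRead.current());

        // Without reset(), the retry would add on top of the 350 already
        // recorded. Zeroing first makes the retried stage's total exact.
        bytesRead.reset();
        bytesRead.add(100);
        bytesRead.add(200);
        bytesRead.add(300);
        System.out.println("after retry: " + bytesRead.current());
    }
}
```

Under the non-reset semantics, the same run would report 650 bytes for a stage that read 600, which is the kind of discrepancy between accumulator totals and aggregate TaskMetrics the issue asks about.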