[ 
https://issues.apache.org/jira/browse/TEZ-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-1744:
-----------------------------
    Target Version/s: 0.7.0  (was: 0.6.0)

> It is not necessary to check whether dag is commit in RecoveryTransition
> ------------------------------------------------------------------------
>
>                 Key: TEZ-1744
>                 URL: https://issues.apache.org/jira/browse/TEZ-1744
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.1
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: TEZ-1744.patch
>
>
> It is not necessary to check whether the DAG is committed in 
> RecoveryTransition, because we already check that in RecoveryParser by using 
> the summary event.
> Copying the comments from TEZ-1737:
> bq. But even if the non-summary VertexFinishedEvent is seen, its 
> VertexRecoverableEventsGeneratedEvent may still be lost. I think there's no 
> guarantee that VertexRecoverableEventsGeneratedEvent is logged before 
> VertexFinishedEvent.
> The expectation was that all tasks are completed before a vertex has 
> finished. Also, a TaskFinishedEvent is only seen after all its data movement 
> events are generated and therefore logged.
> The handling is for the general case where a lot of data movement events are 
> generated, and the commit starts and then ends. In a scenario where the 
> commit starts but does not end, the summary log helps catch the problem. Now, 
> in a scenario where the commit finished successfully, there could be a 
> situation where the AM crashed before all data movement events were stored to 
> recovery. In this scenario, we cannot do anything, as the commit has already 
> been done but we have no idea what was lost. The crux of the answer to your 
> question is that a committer cannot be invoked twice.
> Agree that VertexRecoverableEventsGeneratedEvent is a different problem. In 
> such cases, I believe that if VertexRecoverableEventsGeneratedEvent is not 
> seen before a VertexFinished is seen, there needs to be some additional 
> handling for that scenario too. If a VertexRecoverableEventsGeneratedEvent is 
> always guaranteed to be generated for a vertex and it is not seen, then that 
> is a potentially non-recoverable case when the vertex itself was seen to have 
> been completed.
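
To make the commit reasoning above concrete, here is a minimal sketch of a summary-log check of the kind described. It is not the actual RecoveryParser code: the class, enum, and method names are all hypothetical, and the summary log is reduced to a plain list of markers. It only illustrates the rule that a committer cannot be invoked twice, so a commit that started but never finished cannot be safely recovered.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names are real Tez APIs. */
public class CommitRecoverySketch {

    /** Simplified stand-ins for entries in a recovery summary log. */
    enum SummaryEvent { DAG_SUBMITTED, COMMIT_STARTED, COMMIT_FINISHED }

    /** Outcome of scanning the summary log during recovery. */
    enum RecoveryDecision { RESUME, ALREADY_COMMITTED, NON_RECOVERABLE }

    /** Decide how to recover based only on the commit markers in the summary log. */
    static RecoveryDecision decide(List<SummaryEvent> summaryLog) {
        boolean commitStarted = false;
        boolean commitFinished = false;
        for (SummaryEvent event : summaryLog) {
            if (event == SummaryEvent.COMMIT_STARTED) {
                commitStarted = true;
            } else if (event == SummaryEvent.COMMIT_FINISHED) {
                commitFinished = true;
            }
        }
        if (commitStarted && !commitFinished) {
            // The committer may have partially run and cannot be invoked again.
            return RecoveryDecision.NON_RECOVERABLE;
        }
        if (commitFinished) {
            // The commit is done; whatever was not logged before the crash is unknown.
            return RecoveryDecision.ALREADY_COMMITTED;
        }
        // No commit was attempted, so normal recovery can proceed.
        return RecoveryDecision.RESUME;
    }

    public static void main(String[] args) {
        List<SummaryEvent> log =
            Arrays.asList(SummaryEvent.DAG_SUBMITTED, SummaryEvent.COMMIT_STARTED);
        System.out.println(decide(log));
    }
}
```

Because a check like this runs once while the summary log is parsed, repeating it later in a recovery transition would add no new information, which is the point of this issue.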



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
