[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106483#comment-14106483
 ] 

Zhijie Shen commented on MAPREDUCE-5956:
----------------------------------------

[~leftnoteasy], thanks for your comments. I'd like to add a few more 
points.

* An undeleted staging dir is a very rare issue, though it can happen. The MR 
AM will ask for a retry (with the staging dir kept for the next retry) once a 
failure happens, regardless of what maxAttempts is. If the RM then finds that 
the job has used up its entire retry quota, the staging dir will not be 
cleaned up.

* The undeleted staging dir won't affect JHS or other MR logic, but you need 
to evaluate whether it will affect your business. Alternatively, you can wait 
until YARN-2261 is ready, but the downside is that the MR job may have 
slightly fewer retry opportunities than it expects to have.
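
To make the two points above concrete, here is a minimal sketch of the mismatch between the AM-side check and the RM-side count. Only the attemptId >= maxAppAttempts comparison comes from the issue description; the class name, the rmWillRetry and stagingDirLeftBehind helpers, and the way counted failures are modelled are assumptions for illustration, not the actual MR AM or RM code.

```java
public class StagingDirSketch {

    /** The check quoted in the issue description: attempt id vs. max-attempt count. */
    static boolean oldIsLastAMRetry(int attemptId, int maxAppAttempts) {
        return attemptId >= maxAppAttempts;
    }

    /**
     * Hypothetical RM-side view: only attempts that count as failures
     * (preempted attempts do not, after YARN-2074) consume the retry quota.
     * countedFailures includes the failure that just happened.
     */
    static boolean rmWillRetry(int countedFailures, int maxAppAttempts) {
        return countedFailures < maxAppAttempts;
    }

    /**
     * Behaviour described in the comment: a failing AM always keeps the
     * staging dir and asks for a retry, so the dir is left behind exactly
     * when the RM has no retry quota left. (Assumed model, not real code.)
     */
    static boolean stagingDirLeftBehind(int countedFailures, int maxAppAttempts) {
        return !rmWillRetry(countedFailures, maxAppAttempts);
    }

    public static void main(String[] args) {
        int maxAppAttempts = 2;

        // Attempt 1 was preempted (not counted); attempt 2 has just failed.
        int attemptId = 2;
        int countedFailures = 1;

        // The old check says "last retry", yet the RM would still retry.
        System.out.println("old check says last retry: "
                + oldIsLastAMRetry(attemptId, maxAppAttempts));
        System.out.println("RM will retry: "
                + rmWillRetry(countedFailures, maxAppAttempts));

        // Rare case: the failure that just happened used up the quota.
        System.out.println("staging dir left behind: "
                + stagingDirLeftBehind(2, maxAppAttempts));
    }
}
```

In the first scenario the old check would treat the attempt as the last one even though the RM still allows a retry, which is the case the issue describes. The last line models the rare case from the first point, where the kept staging dir is never cleaned up.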

> MapReduce AM should not use maxAttempts to determine if this is the last retry
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>    Affects Versions: 2.4.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Wangda Tan
>            Priority: Blocker
>             Fix For: 2.6.0
>
>         Attachments: MR-5956.patch, MR-5956.patch
>
>
> Found this while reviewing YARN-2074. The problem is that after YARN-2074, we 
> don't count AM preemption towards AM failures on RM side, but MapReduce AM 
> itself checks the attempt id against the max-attempt count to determine if 
> this is the last attempt.
> {code}
>     public void computeIsLastAMRetry() {
>       isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
>     }
> {code}
> This causes issues w.r.t. deletion of the staging directory, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
