[ https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056074#comment-14056074 ]
Zhijie Shen commented on MAPREDUCE-5956:
----------------------------------------

bq. b. Failure happened, and captured by MRAppMasterShutDownHook

How can (2) work for b? Since the MR AM doesn't know about the preemption, the only possibility is that the MR AM thinks it's not the last retry, but the RM thinks it is (the RM may also think it's not the last retry, but with one fewer attempt counted). In this case, the MR AM wants to get the correct last-retry flag from the RM. However, RMCommunicator is not supposed to unregister if the MR AM doesn't think it's the last retry. Hence I'm afraid the MR AM doesn't have a chance to communicate with the RM to inquire about the correct information, unless the logic that triggers unregistration is modified.

Please correct me if I'm missing something.

> MapReduce AM should not use maxAttempts to determine if this is the last retry
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Wangda Tan
>            Priority: Blocker
>
> Found this while reviewing YARN-2074. The problem is that after YARN-2074, we
> don't count AM preemption towards AM failures on RM side, but MapReduce AM
> itself checks the attempt id against the max-attempt count to determine if
> this is the last attempt.
> {code}
>   public void computeIsLastAMRetry() {
>     isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
>   }
> {code}
> This causes issues w.r.t. deletion of the staging directory, etc.

--
This message was sent by Atlassian JIRA
(v6.2#6252)