[ https://issues.apache.org/jira/browse/MAPREDUCE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445229#comment-13445229 ]
Hadoop QA commented on MAPREDUCE-4611:
--------------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12543142/MR-4611.txt
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 2 new or modified test files.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse. The patch built with eclipse:eclipse.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2792//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2792//console

This message is automatically generated.

> MR AM dies badly when Node is decomissioned
> -------------------------------------------
>
>                 Key: MAPREDUCE-4611
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4611
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>         Attachments: MR-4611.txt
>
>
> The MR AM always thinks that it is being killed by the RM when it gets a kill
> signal and it has not finished processing yet. In reality the RM kill signal
> is only sent when the client cannot communicate directly with the AM, which
> probably means that the AM is in a bad state already. The much more common
> case is that the node is marked as unhealthy or decommissioned.
> I propose that in the short term the AM will only clean up if
> # The process has been asked by the client to exit (kill)
> # The job has finished cleanly and is exiting already
> # This is the last of the AM retries.
> The downside here is that the .staging directory will be leaked and the job
> will not show up in the history server on a kill from the RM in some cases,
> at least until the full set of AM cleanup issues can be addressed, probably
> as part of MAPREDUCE-4428.
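
The three cleanup conditions proposed in the description can be read as a single predicate. The following is only a minimal sketch of that rule, not the code in MR-4611.txt: the class name `AmCleanupPolicy`, the method `shouldCleanUp`, and its three boolean parameters are hypothetical names introduced for illustration.

```java
/**
 * Hypothetical sketch of the short-term cleanup rule from the issue
 * description. None of these names come from the actual patch.
 */
final class AmCleanupPolicy {

    private AmCleanupPolicy() {
        // Static helper only; not meant to be instantiated.
    }

    /**
     * Decides whether the AM should clean up (for example, remove the
     * .staging directory) when it is shutting down.
     *
     * @param killedByClient the client asked the job to exit (kill)
     * @param jobFinished    the job finished cleanly and the AM is already exiting
     * @param lastRetry      this attempt is the last of the allowed AM retries
     * @return true if any one of the three proposed conditions holds
     */
    static boolean shouldCleanUp(boolean killedByClient,
                                 boolean jobFinished,
                                 boolean lastRetry) {
        return killedByClient || jobFinished || lastRetry;
    }
}
```

Under this sketch, a kill signal that arrives because the node was marked unhealthy or decommissioned, on an attempt that is not the last retry, yields `false`, so a later AM attempt can still find the staging data. The same `false` result is what produces the leaked .staging directory described above when no later attempt completes.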