[ https://issues.apache.org/jira/browse/MAPREDUCE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207308#comment-13207308 ]
Hadoop QA commented on MAPREDUCE-3802:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12514406/MAPREDUCE-3802-20120213.txt
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse. The patch built with eclipse:eclipse.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
                  org.apache.hadoop.mapreduce.v2.app.TestRecovery

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1848//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1848//console

This message is automatically generated.

> If an MR AM dies twice it looks like the process freezes
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-3802
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3802
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Robert Joseph Evans
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Critical
>             Fix For: 0.23.1
>
>         Attachments: MAPREDUCE-3802-20120213.txt, syslog
>
>
> It looks like recovering from an MR AM dying works very well on a single
> failure. But if it fails multiple times we appear to get into a livelock
> situation.
> {noformat}
> yarn jar
> hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*-SNAPSHOT.jar
> wordcount -Dyarn.app.mapreduce.am.log.level=DEBUG -Dmapreduce.job.reduces=30
> input output
> 12/02/03 21:06:57 WARN conf.Configuration: fs.default.name is deprecated.
> Instead, use fs.defaultFS
> 12/02/03 21:06:57 WARN conf.Configuration: mapred.used.genericoptionsparser
> is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
> 12/02/03 21:06:57 INFO input.FileInputFormat: Total input paths to process :
> 17
> 12/02/03 21:06:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 12/02/03 21:06:57 WARN snappy.LoadSnappy: Snappy native library not loaded
> 12/02/03 21:06:57 INFO mapreduce.JobSubmitter: number of splits:17
> 12/02/03 21:06:57 INFO mapred.ResourceMgrDelegate: Submitted application
> application_1328302034486_0003 to ResourceManager at HOST/IP:8040
> 12/02/03 21:06:57 INFO mapreduce.Job: The url to track the job:
> http://HOST:8088/proxy/application_1328302034486_0003/
> 12/02/03 21:06:57 INFO mapreduce.Job: Running job: job_1328302034486_0003
> 12/02/03 21:07:03 INFO mapreduce.Job: Job job_1328302034486_0003 running in
> uber mode : false
> 12/02/03 21:07:03 INFO mapreduce.Job: map 0% reduce 0%
> 12/02/03 21:07:09 INFO mapreduce.Job: map 5% reduce 0%
> 12/02/03 21:07:10 INFO mapreduce.Job: map 17% reduce 0%
> #KILLED AM with kill -9 here
> 12/02/03 21:07:16 INFO mapreduce.Job: map 29% reduce 0%
> 12/02/03 21:07:17 INFO mapreduce.Job: map 35% reduce 0%
> 12/02/03 21:07:30 INFO mapreduce.Job: map 52% reduce 0%
> 12/02/03 21:07:35 INFO mapreduce.Job: map 58% reduce 0%
> 12/02/03 21:07:37 INFO mapreduce.Job: map 70% reduce 0%
> 12/02/03 21:07:41 INFO mapreduce.Job: map 76% reduce 0%
> 12/02/03 21:07:43 INFO mapreduce.Job: map 82% reduce 0%
> 12/02/03 21:07:44 INFO mapreduce.Job: map 88% reduce 0%
> 12/02/03 21:07:47 INFO mapreduce.Job: map 94% reduce 0%
> 12/02/03 21:07:49 INFO mapreduce.Job: map 100% reduce 0%
> 12/02/03 21:07:53 INFO mapreduce.Job: map 100% reduce 3%
> 12/02/03 21:08:00 INFO mapreduce.Job: map 100% reduce 6%
> 12/02/03 21:08:06 INFO mapreduce.Job: map 100% reduce 10%
> 12/02/03 21:08:12 INFO mapreduce.Job: map 100% reduce 13%
> 12/02/03 21:08:18 INFO mapreduce.Job: map 100% reduce 16%
> #killed AM with kill -9 here
> 12/02/03 21:08:20 INFO ipc.Client: Retrying connect to server: HOST/IP:44223.
> Already tried 0 time(s).
> 12/02/03 21:08:21 INFO ipc.Client: Retrying connect to server: HOST/IP:44223.
> Already tried 1 time(s).
> 12/02/03 21:08:22 INFO ipc.Client: Retrying connect to server: HOST/IP:44223.
> Already tried 2 time(s).
> 12/02/03 21:08:26 INFO mapreduce.Job: map 64% reduce 16%
> #It never makes any more progress...
> {noformat}