[ https://issues.apache.org/jira/browse/MAPREDUCE-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024466#comment-15024466 ]
Junping Du commented on MAPREDUCE-6555:
---------------------------------------

The previous failure occurs because, since MAPREDUCE-5485, an MR job is allowed to retry after an AM failure during the commit stage (if the committer is repeatable). So MRAppMaster.initAndStartAppMaster() no longer throws a fatal exception when a commit start file exists (which hints that a previous AM failed in the middle of a commit), provided the committer is FileOutputCommitter with algorithm version 2, which is the default in trunk. I don't think we need this fix in branch-2, because the default algorithm version in branch-2 is 1.

> TestMRAppMaster fails on trunk
> ------------------------------
>
>                 Key: MAPREDUCE-6555
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6555
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Varun Saxena
>            Assignee: Junping Du
>         Attachments: MAPREDUCE-6555.patch
>
>
> Observed in QA report of YARN-3840
> {noformat}
> Running org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster
> Tests run: 9, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 20.699 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster
> testMRAppMasterMidLock(org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster) Time elapsed: 0.474 sec <<< FAILURE!
> java.lang.AssertionError: null
>     at org.junit.Assert.fail(Assert.java:86)
>     at org.junit.Assert.assertTrue(Assert.java:41)
>     at org.junit.Assert.assertTrue(Assert.java:52)
>     at org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster.testMRAppMasterMidLock(TestMRAppMaster.java:174)
> testMRAppMasterSuccessLock(org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster) Time elapsed: 0.175 sec <<< ERROR!
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File file:/home/varun/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/staging/history/done_intermediate/TestAppMasterUser/job_1317529182569_0004-1448100479292-TestAppMasterUser-%3Cmissing+job+name%3E-1448100479413-0-0-SUCCEEDED-default-1448100479292.jhist_tmp does not exist
>     at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:640)
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:866)
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:630)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
>     at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:372)
>     at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:513)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.moveTmpToDone(JobHistoryEventHandler.java:1346)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processDoneFiles(JobHistoryEventHandler.java:1154)
>     at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>     at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>     at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
>     at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1751)
>     at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1247)
>     at org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster.testMRAppMasterSuccessLock(TestMRAppMaster.java:254)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
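
For reference, the behaviour described in the comment can be sketched as a small standalone model. This is only an illustration and not the actual MRAppMaster code: the class `CommitRecoverySketch`, the `Outcome` enum and the `decide` method are hypothetical names, and the rule that a commit is repeatable exactly when the algorithm version is 2 is an assumption taken from the comment's wording about trunk and branch-2.

```java
public class CommitRecoverySketch {

    /** Hypothetical outcomes for a restarted AM; not a Hadoop type. */
    enum Outcome { RUN_NORMALLY, RETRY_COMMIT, FATAL_ERROR }

    /**
     * Assumption for this sketch: the committer is repeatable only with
     * algorithm version 2 (the trunk default per the comment); version 1
     * (the branch-2 default) is treated as not repeatable.
     */
    static boolean isCommitRepeatable(int algorithmVersion) {
        return algorithmVersion == 2;
    }

    /**
     * Decide what a new AM attempt does on startup.
     *
     * @param commitStartFileExists true if a previous attempt left a commit
     *                              start file, i.e. it may have failed mid-commit
     * @param algorithmVersion      committer algorithm version (1 or 2)
     */
    static Outcome decide(boolean commitStartFileExists, int algorithmVersion) {
        if (!commitStartFileExists) {
            // No earlier attempt reached the commit stage.
            return Outcome.RUN_NORMALLY;
        }
        // An earlier attempt died during commit: retry only if that is safe.
        return isCommitRepeatable(algorithmVersion)
                ? Outcome.RETRY_COMMIT
                : Outcome.FATAL_ERROR;
    }

    public static void main(String[] args) {
        System.out.println("version 2, start file present: " + decide(true, 2));
        System.out.println("version 1, start file present: " + decide(true, 1));
    }
}
```

Under these assumptions, a test that expects a fatal error whenever a commit start file is present passes with version 1 but fails with version 2, which matches the assertion failure reported for testMRAppMasterMidLock on trunk.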