[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024466#comment-15024466
 ] 

Junping Du commented on MAPREDUCE-6555:
---------------------------------------

The previous failure occurs because, since MAPREDUCE-5485, an MR job can be 
retried after an AM failure during the commit stage (if the committer is 
repeatable). As a result, MRAppMaster.initAndStartAppMaster() no longer throws 
a fatal exception when a commit start file exists (which hints that the 
previous AM failed in the middle of a commit) for FileOutputCommitter, which 
defaults to algorithm version 2 in trunk. I don't think we need this fix in 
branch-2, as the default algorithm version in branch-2 is 1.
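
To make the start-up behavior described above concrete, here is a minimal 
sketch of the decision as I understand it. It is not the MRAppMaster code: the 
class, enum, and method names are hypothetical, the file-existence checks are 
reduced to boolean flags, and the rule that only algorithm version 2 counts as 
repeatable is an assumption taken from the explanation above.

```java
/** Hypothetical sketch only; names are illustrative, not the MRAppMaster API. */
final class CommitRecoverySketch {

    /** Possible outcomes when a new AM attempt starts up. */
    enum Decision { START_NORMALLY, REPORT_SUCCEEDED, REPORT_FAILED, RETRY_COMMIT, FATAL_ERROR }

    // Assumption: only algorithm version 2 is treated as a repeatable commit.
    static boolean isCommitRepeatable(int algorithmVersion) {
        return algorithmVersion == 2;
    }

    /**
     * The three flags stand in for "does the commit start / success / failure
     * marker file from a previous AM attempt exist?".
     */
    static Decision decide(boolean commitStarted, boolean commitSucceeded,
                           boolean commitFailed, int algorithmVersion) {
        if (!commitStarted) {
            return Decision.START_NORMALLY;   // no earlier commit attempt
        }
        if (commitSucceeded) {
            return Decision.REPORT_SUCCEEDED; // earlier attempt finished the commit
        }
        if (commitFailed) {
            return Decision.REPORT_FAILED;    // earlier attempt recorded a failure
        }
        // Commit started but never finished: the previous AM died mid-commit.
        return isCommitRepeatable(algorithmVersion)
                ? Decision.RETRY_COMMIT       // repeatable: redo the commit
                : Decision.FATAL_ERROR;       // not repeatable: refuse to commit twice
    }

    public static void main(String[] args) {
        System.out.println("version 2: " + decide(true, false, false, 2));
        System.out.println("version 1: " + decide(true, false, false, 1));
    }
}
```

Under these assumptions, a test that leaves only a commit start file behind 
and expects a fatal error would hold for version 1 but not for version 2, 
which matches the difference between branch-2 and trunk noted above.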

> TestMRAppMaster fails on trunk
> ------------------------------
>
>                 Key: MAPREDUCE-6555
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6555
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Varun Saxena
>            Assignee: Junping Du
>         Attachments: MAPREDUCE-6555.patch
>
>
> Observed in QA report of YARN-3840 
> {noformat}
> Running org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster
> Tests run: 9, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 20.699 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster
> testMRAppMasterMidLock(org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster)  Time elapsed: 0.474 sec  <<< FAILURE!
> java.lang.AssertionError: null
>       at org.junit.Assert.fail(Assert.java:86)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at org.junit.Assert.assertTrue(Assert.java:52)
>       at org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster.testMRAppMasterMidLock(TestMRAppMaster.java:174)
> testMRAppMasterSuccessLock(org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster)  Time elapsed: 0.175 sec  <<< ERROR!
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File file:/home/varun/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/staging/history/done_intermediate/TestAppMasterUser/job_1317529182569_0004-1448100479292-TestAppMasterUser-%3Cmissing+job+name%3E-1448100479413-0-0-SUCCEEDED-default-1448100479292.jhist_tmp does not exist
>       at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:640)
>       at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:866)
>       at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:630)
>       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
>       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
>       at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:372)
>       at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:513)
>       at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.moveTmpToDone(JobHistoryEventHandler.java:1346)
>       at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processDoneFiles(JobHistoryEventHandler.java:1154)
>       at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>       at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>       at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
>       at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1751)
>       at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1247)
>       at org.apache.hadoop.mapreduce.v2.app.TestMRAppMaster.testMRAppMasterSuccessLock(TestMRAppMaster.java:254)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
