[ https://issues.apache.org/jira/browse/HADOOP-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887471#comment-16887471 ]

Steve Loughran commented on HADOOP-16207:
-----------------------------------------

Working on this. I finally got a log, and /tmp/hadoop-yarn/staging/ currently 
doesn't exist.

Assumption: all the miniYarnClusters share the same /tmp staging dir, so when 
one is shut down while another is still running, the second one fails because 
all its staging files go away. In that case, yes, it is a race condition. At 
least this time.

{code}
(TaskAttemptListenerImpl.java:fatalError(288)) - Task: attempt_1563401248365_0003_m_000000_0 - exited : java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/stevel/.staging/job_1563401248365_0003/job.split does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:666)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:987)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:656)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
        at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:917)
        at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:362)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
{code}
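
If the shared staging dir really is the cause, one possible mitigation is to give every mini cluster its own staging directory. The sketch below is not the fix applied for this issue; it only illustrates the idea with the JDK alone. The class name PerClusterStagingDir and the helper newClusterSettings are hypothetical, and the returned map stands in for the Hadoop Configuration each cluster would be started with. The one Hadoop detail relied on is the property yarn.app.mapreduce.am.staging-dir, whose default is the /tmp/hadoop-yarn/staging path seen in the trace.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: allocates a distinct MR staging dir per test cluster. */
public class PerClusterStagingDir {

  /** MapReduce AM staging dir property; defaults to /tmp/hadoop-yarn/staging. */
  static final String MR_AM_STAGING_DIR = "yarn.app.mapreduce.am.staging-dir";

  /**
   * Creates a fresh directory under baseDir and returns the settings a cluster
   * would need to use it. The map is a stand-in for a Hadoop Configuration
   * (assumption: the caller copies these entries into the real one).
   */
  static Map<String, String> newClusterSettings(Path baseDir, String clusterName)
      throws IOException {
    // createTempDirectory guarantees a unique name, so two clusters started
    // with the same clusterName still never share a staging dir.
    Path staging = Files.createTempDirectory(baseDir, clusterName + "-staging-");
    Map<String, String> settings = new HashMap<>();
    settings.put(MR_AM_STAGING_DIR, staging.toUri().toString());
    return settings;
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("miniyarn-");
    Map<String, String> first = newClusterSettings(base, "cluster");
    Map<String, String> second = newClusterSettings(base, "cluster");
    System.out.println(first.get(MR_AM_STAGING_DIR));
    System.out.println(second.get(MR_AM_STAGING_DIR));
  }
}
```

With separate directories, shutting down one cluster and cleaning up its staging dir could no longer delete the job.split of a job still running in another.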

> Fix ITestDirectoryCommitMRJob.testMRJob
> ---------------------------------------
>
>                 Key: HADOOP-16207
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16207
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, test
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Critical
>
> Reported failure of {{ITestDirectoryCommitMRJob}} in validation runs of 
> HADOOP-16186; assertIsDirectory with s3guard enabled and a parallel test run: 
> Path "is recorded as deleted by S3Guard"
> {code}
>     waitForConsistency();
>     assertIsDirectory(outputPath); /* here */
> {code}
> The file is there but there's a tombstone. Possibilities:
> * some race condition with another test
> * tombstones aren't timing out
> * committers aren't creating that base dir in a way which cleans up S3Guard's 
> tombstones. 
> Remember: we do have to delete that dest dir before the committer runs unless 
> overwrite==true, so at the start of the run there will be a tombstone. It 
> should be overwritten by a success.



