[ https://issues.apache.org/jira/browse/HADOOP-6688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856131#action_12856131 ]
Tom White commented on HADOOP-6688:
-----------------------------------

Could this be a consistency issue, like HADOOP-6208? I'm not sure a blanket ignore of FileNotFoundExceptions is quite right: if you call delete with a path that doesn't exist, shouldn't it throw FileNotFoundException? I can see that if a file in a subdirectory of the path being deleted doesn't exist, then that should not result in a FileNotFoundException. You might want to check whether the new FileContext API exhibits the same problem, so it can be considered for the contract there too.

I don't think this is a blocker. As a workaround you could create a wrapped FileOutputCommitter that catches and ignores FileNotFoundException.

> FileSystem.delete(...) implementations should not throw FileNotFoundException
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-6688
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6688
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs, fs/s3
>    Affects Versions: 0.20.2
>        Environment: Amazon EC2/S3
>            Reporter: Danny Leshem
>            Priority: Blocker
>             Fix For: 0.20.3, 0.21.0, 0.22.0
>
>
> S3FileSystem.delete(Path path, boolean recursive) may fail and throw a FileNotFoundException if a directory is being deleted while, at the same time, some of its files are being deleted in the background.
> This is definitely not the expected behavior of a delete method. If one of the to-be-deleted files is found missing, the method should not fail but simply continue. This is true for the general contract of FileSystem.delete, and also for its various implementations: RawLocalFileSystem (and specifically FileUtil.fullyDelete) exhibits the same problem.
> The fix is to silently catch and ignore FileNotFoundExceptions in delete loops. This can very easily be unit-tested, at least for RawLocalFileSystem.
> The reason this issue bothers me is that the cleanup part of a long (Mahout) MR job inconsistently fails for me, and I think this is the root problem. The log shows:
> {code}
> java.io.FileNotFoundException: s3://S3-BUCKET/tmp/0008E25BF7554CA9/2521362836721872/DistributedMatrix.times.outputVector/_temporary/_attempt_201004061215_0092_r_000002_0/part-00002: No such file or directory.
>     at org.apache.hadoop.fs.s3.S3FileSystem.getFileStatus(S3FileSystem.java:334)
>     at org.apache.hadoop.fs.s3.S3FileSystem.listStatus(S3FileSystem.java:193)
>     at org.apache.hadoop.fs.s3.S3FileSystem.delete(S3FileSystem.java:303)
>     at org.apache.hadoop.fs.s3.S3FileSystem.delete(S3FileSystem.java:312)
>     at org.apache.hadoop.mapred.FileOutputCommitter.cleanupJob(FileOutputCommitter.java:64)
>     at org.apache.hadoop.mapred.OutputCommitter.cleanupJob(OutputCommitter.java:135)
>     at org.apache.hadoop.mapred.Task.runJobCleanupTask(Task.java:826)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:292)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> {code}
> (similar errors are displayed for ReduceTask.run)
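Tom's suggested workaround could look roughly like the sketch below, assuming a job that uses the old {{org.apache.hadoop.mapred}} API shown in the stack trace. The class name {{LenientFileOutputCommitter}} is made up for illustration; it is not part of Hadoop.

{code}
// Sketch of the workaround: a FileOutputCommitter whose cleanup swallows
// FileNotFoundException, so a file that vanishes mid-cleanup (e.g. removed
// concurrently on S3) cannot fail the job. Hypothetical class name.
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.mapred.FileOutputCommitter;
import org.apache.hadoop.mapred.JobContext;

public class LenientFileOutputCommitter extends FileOutputCommitter {

  @Override
  public void cleanupJob(JobContext context) throws IOException {
    try {
      super.cleanupJob(context);
    } catch (FileNotFoundException e) {
      // The _temporary directory (or a file inside it) is already gone,
      // which is the end state cleanup wanted; ignore the error instead
      // of failing the whole job.
    }
  }
}
{code}

A job would then register it on its JobConf, e.g. with {{conf.setOutputCommitter(LenientFileOutputCommitter.class)}}.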
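The fix the description proposes ("silently catch and ignore FileNotFoundExceptions in delete loops") might look roughly like the generic sketch below. It is written against the public FileSystem API rather than the actual S3FileSystem.delete or FileUtil.fullyDelete code, and {{deleteIgnoringMissingChildren}} is a made-up helper name. Note that it also ignores a missing top-level path, which is exactly the contract question Tom raises above.

{code}
// Generic illustration of ignoring FileNotFoundException inside a recursive
// delete loop. Not the real S3FileSystem/FileUtil implementation.
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LenientDelete {

  /** Recursively deletes path, ignoring entries that vanish mid-delete. */
  public static boolean deleteIgnoringMissingChildren(FileSystem fs, Path path)
      throws IOException {
    FileStatus[] children;
    try {
      children = fs.listStatus(path);
    } catch (FileNotFoundException e) {
      return true; // the directory is already gone: the desired end state
    }
    if (children != null) {
      for (FileStatus child : children) {
        try {
          fs.delete(child.getPath(), true);
        } catch (FileNotFoundException e) {
          // Another process already removed this child; that is the outcome
          // delete wanted, so keep going rather than propagating the error.
        }
      }
    }
    try {
      return fs.delete(path, false);
    } catch (FileNotFoundException e) {
      return true; // removed concurrently between listing and deleting
    }
  }
}
{code}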