[ 
https://issues.apache.org/jira/browse/HADOOP-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760077#comment-16760077
 ] 

Steve Loughran commented on HADOOP-16090:
-----------------------------------------

Some possible options here:

# We go back to check & delete up the tree. This is bad because it is such a 
performance killer compared to the O(1) batch delete call.
# We make that check optional, so the slowdown is only paid on versioned 
stores.
# We add an option to check & delete only the parent dir, *caring only about 
the parent directory marker*. It's slower than the batch delete, but it would 
avoid blind deletes.
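
To make the trade-off concrete, here is a rough sketch of the blind batch delete next to option #3. It runs against a toy in-memory model of a versioned bucket, not the S3A code or the AWS SDK. Every name in it (DeleteMarkerSketch, writeWithBlindDelete, writeWithParentCheck) is made up for illustration. The only behaviour it assumes is the one from the issue description: an unversioned DELETE always adds a delete marker.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a versioned bucket; hypothetical, NOT S3AFileSystem or the AWS SDK. */
final class DeleteMarkerSketch {
    /** Per key: true while the current version is a live object. */
    private final Map<String, Boolean> live = new HashMap<>();
    /** Per key: number of delete markers accumulated so far. */
    private final Map<String, Integer> markers = new HashMap<>();
    /** Number of existence probes issued (stand-in for HEAD requests). */
    int headRequests = 0;

    void put(String key) { live.put(key, true); }

    boolean exists(String key) {
        headRequests++;
        return live.getOrDefault(key, false);
    }

    /** DELETE without a version ID: always adds a marker, even if the key is absent. */
    void delete(String key) {
        live.put(key, false);
        markers.merge(key, 1, Integer::sum);
    }

    int markerCount(String key) { return markers.getOrDefault(key, 0); }

    /** Ancestor directory-marker keys of an object key, nearest parent first. */
    static List<String> ancestorMarkers(String key) {
        List<String> result = new ArrayList<>();
        int slash = key.lastIndexOf('/');
        while (slash > 0) {
            result.add(key.substring(0, slash + 1));
            slash = key.lastIndexOf('/', slash - 1);
        }
        return result;
    }

    /** Blind delete after a write: no probes, but a new marker on every ancestor. */
    void writeWithBlindDelete(String key) {
        put(key);
        for (String dir : ancestorMarkers(key)) {
            delete(dir);
        }
    }

    /** Option #3: probe only the parent marker and delete it only if it exists. */
    void writeWithParentCheck(String key) {
        put(key);
        List<String> dirs = ancestorMarkers(key);
        if (!dirs.isEmpty() && exists(dirs.get(0))) {
            delete(dirs.get(0));
        }
    }

    public static void main(String[] args) {
        DeleteMarkerSketch blind = new DeleteMarkerSketch();
        blind.writeWithBlindDelete("a/b/f1");
        blind.writeWithBlindDelete("a/b/f2");
        blind.writeWithBlindDelete("a/c/f3");
        System.out.println("blind: a/=" + blind.markerCount("a/")
            + " a/b/=" + blind.markerCount("a/b/"));

        DeleteMarkerSketch checked = new DeleteMarkerSketch();
        checked.put("a/b/"); // an explicit empty-directory marker, as left by mkdir
        checked.writeWithParentCheck("a/b/f1");
        checked.writeWithParentCheck("a/b/f2");
        System.out.println("checked: a/b/=" + checked.markerCount("a/b/")
            + " probes=" + checked.headRequests);
    }
}
```

In this model the blind variant leaves one new marker per ancestor per write, so keys nearer the root collect the most. The parent-check variant pays one probe per write and only deletes a marker that is really there.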

You could point at option #3 and say it has problems with empty parent 
directories, but since we don't stop you doing things like creating a file 
under a file if you try hard enough, we will just have to trust people not to 
generate the problematic corner cases.

Rename does the delete too. That's potentially a more complex problem. We know 
that if you rename a path to a location then the destination must have an 
implicit or explicit parent directory. If there's an implicit one (i.e. some 
other child of the parent dir exists), we can assume the empty parent dir 
marker has already been deleted.

One more thing: looking through the code, I don't see innerMkdirs deleting any 
empty parent marker dir(s). It does walk up the tree to check for parents, and 
it should delete any empty directory marker it finds.
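
A rough sketch of what that walk might look like, again against a toy in-memory key set rather than the real innerMkdirs. MkdirsMarkerSketch and its mkdirs method are hypothetical names, and the real code's check that no ancestor is a file is left out:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for the store; hypothetical, NOT the S3A innerMkdirs implementation. */
final class MkdirsMarkerSketch {
    /** Keys in the toy store; directory markers end with '/'. */
    final Set<String> keys = new TreeSet<>();

    /**
     * Create the marker for dir (no leading or trailing slash, e.g. "a/b/c"),
     * then walk up the tree and delete each ancestor marker that actually
     * exists. Returns the markers that were deleted, nearest parent first.
     */
    List<String> mkdirs(String dir) {
        keys.add(dir + "/");
        List<String> deleted = new ArrayList<>();
        int slash = dir.lastIndexOf('/');
        while (slash > 0) {
            String parentMarker = dir.substring(0, slash + 1);
            // Only delete markers that exist, so no blind deletes are issued.
            if (keys.remove(parentMarker)) {
                deleted.add(parentMarker);
            }
            slash = dir.lastIndexOf('/', slash - 1);
        }
        return deleted;
    }

    public static void main(String[] args) {
        MkdirsMarkerSketch store = new MkdirsMarkerSketch();
        store.mkdirs("a/b");
        System.out.println(store.mkdirs("a/b/c")); // markers removed on the way up
        System.out.println(store.keys);            // what is left in the store
    }
}
```

Because the walk already has to look at each parent, deleting a marker it finds there costs no extra probe and never touches a key that does not exist.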

Overall then, yes, this is potentially a larger problem than you'd expect. We'd 
probably need a "versioned.store" flag which would tell us to be sparing with 
deletes at the possible expense of performance. Do you fancy getting involved 
here, in trunk?

In the meantime, if you are on the 2.8 branch, you need to know that any 
improvements here aren't going to go back into branch-2; we've moved on too 
much. What we could think about doing is backporting HADOOP-13421, v2 list 
support, to Hadoop 2.9.x (no earlier, due to SDK versions etc.). That would at 
least address some of the problems you are seeing.

Thoughts?





> deleteUnnecessaryFakeDirectories() creates unnecessary delete markers in a 
> versioned S3 bucket
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16090
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.1
>            Reporter: Dmitri Chmelev
>            Priority: Minor
>
> The fix to avoid calls to getFileStatus() for each path component in 
> deleteUnnecessaryFakeDirectories() (HADOOP-13164) results in accumulation of 
> delete markers in versioned S3 buckets. The above patch replaced 
> getFileStatus() checks with a single batch delete request formed by 
> generating all ancestor keys formed from a given path. Since the delete 
> request is not checking for existence of fake directories, it will create a 
> delete marker for every path component that did not exist (or was previously 
> deleted). Note that issuing a DELETE request without specifying a version ID 
> will always create a new delete marker, even if one already exists ([AWS S3 
> Developer 
> Guide|https://docs.aws.amazon.com/AmazonS3/latest/dev/RemDelMarker.html]).
> Since deleteUnnecessaryFakeDirectories() is called as a callback on 
> successful writes and on renames, delete markers accumulate rather quickly 
> and their rate of accumulation is inversely proportional to the depth of the 
> path. In other words, directories closer to the root will have more delete 
> markers than the leaves.
> This behavior negatively impacts performance of getFileStatus() operation 
> when it has to issue listObjects() request (especially v1) as the delete 
> markers have to be examined when the request searches for first current 
> non-deleted version of an object following a given prefix.
> I did a quick comparison against 3.x and the issue is still present: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2947|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2947]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

