[ https://issues.apache.org/jira/browse/HADOOP-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760157#comment-16760157 ]

Dmitri Chmelev commented on HADOOP-16090:
-----------------------------------------

A slight optimization of #1 is to avoid calling full-blown getFileStatus() and 
instead issue a HEAD request per path component, making sure the trailing 
slash is present. This effectively probes for the existence of the fake 
directory without using list-objects. It still keeps the cleanup at O(depth), 
however, so we trade write amplification for read amplification. That is 
currently the patch I plan to propose, predicated on the addition of a 
"versioned.store" flag.
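
For illustration, a minimal sketch of that per-component probe against the 
AWS SDK v1 client that S3A wraps (the helper name and wiring here are mine, 
not the actual patch):

{code:java}
import com.amazonaws.AmazonServiceException;
import com.amazonaws.services.s3.AmazonS3;

// Probe one ancestor path component for a fake directory marker with a
// HEAD request (getObjectMetadata) instead of a full getFileStatus().
static boolean fakeDirectoryExists(AmazonS3 s3, String bucket, String key) {
  String dirKey = key.endsWith("/") ? key : key + "/";
  try {
    s3.getObjectMetadata(bucket, dirKey);  // HEAD s3://bucket/dirKey
    return true;                           // marker object is present
  } catch (AmazonServiceException e) {
    if (e.getStatusCode() == 404) {
      return false;                        // no fake directory here
    }
    throw e;                               // other failures propagate
  }
}
{code}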

I am not sure #3 works if we limit the search to the immediate parent 
directory. The problem is "mkdir -p" and copyFromLocalFile(): either can 
shadow an existing empty directory anywhere along the destination path. One 
idea I had was to bail out of the search as soon as the candidate path 
component is confirmed to have two 'non-fake' children. However, this is not 
ideal for two reasons: 1) race conditions when multiple clients create 
objects in the same directory, which defeats the check; and 2) the 
pathological case where every path component along the path has exactly one 
child (likely uncommon, assuming an expected branching factor > 1). As far as 
#1 goes, I am actually concerned in general that the handling of fake 
directories today is racy and could lead to inconsistencies when multiple 
writers are involved (fake directories incorrectly created or removed, 
breaking lookup).
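
For concreteness, a rough sketch of the two-children bail-out as a bounded 
LIST (again SDK v1; the helper is hypothetical, and note this is exactly the 
check that race 1) above can defeat):

{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ListObjectsRequest;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;

// Bounded LIST: does dirKey (e.g. "a/b/") have >= 2 non-fake children?
static boolean hasTwoRealChildren(AmazonS3 s3, String bucket, String dirKey) {
  ObjectListing listing = s3.listObjects(new ListObjectsRequest()
      .withBucketName(bucket)
      .withPrefix(dirKey)
      .withDelimiter("/")
      .withMaxKeys(3));                    // 3 keys is enough to decide
  int children = listing.getCommonPrefixes().size();  // subdirectories
  for (S3ObjectSummary s : listing.getObjectSummaries()) {
    if (!s.getKey().equals(dirKey)) {      // skip the fake marker itself
      children++;
    }
  }
  return children >= 2;
}
{code}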

Regarding innerMkdirs() not deleting fake dirs, I believe this was fixed in 
HADOOP-14255. I intend to cherry-pick this change, as I reached the same 
conclusion while reading the code.

As for HADOOP-13421, it was already on my radar, and I was curious whether it 
could be backported easily to 2.8.x. Thanks for the heads-up about the SDK 
version. I believe it does not solve the underlying problem of delete marker 
accumulation. That accumulation could also be mitigated by adding lifecycle 
policy rules to perform the cleanup.
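
As an example, a sketch of such a rule with SDK v1 (the rule ID and the 
7-day retention are arbitrary placeholders): it purges orphaned delete 
markers and expires noncurrent versions.

{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.BucketLifecycleConfiguration;
import com.amazonaws.services.s3.model.lifecycle.LifecycleFilter;

// One bucket-wide rule: drop orphaned ("expired") delete markers and
// expire noncurrent object versions after 7 days.
static void installCleanupRule(AmazonS3 s3, String bucket) {
  BucketLifecycleConfiguration.Rule rule =
      new BucketLifecycleConfiguration.Rule()
          .withId("clean-delete-markers")
          .withFilter(new LifecycleFilter())  // empty filter = whole bucket
          .withExpiredObjectDeleteMarker(true)
          .withNoncurrentVersionExpirationInDays(7)
          .withStatus(BucketLifecycleConfiguration.ENABLED);
  s3.setBucketLifecycleConfiguration(bucket,
      new BucketLifecycleConfiguration().withRules(rule));
}
{code}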

> deleteUnnecessaryFakeDirectories() creates unnecessary delete markers in a 
> versioned S3 bucket
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16090
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.1
>            Reporter: Dmitri Chmelev
>            Priority: Minor
>
> The fix that avoids calls to getFileStatus() for each path component in 
> deleteUnnecessaryFakeDirectories() (HADOOP-13164) results in the 
> accumulation of delete markers in versioned S3 buckets. That patch replaced 
> the getFileStatus() checks with a single batch delete request generated 
> from all ancestor keys of a given path. Since the delete request does not 
> check for the existence of fake directories, it creates a delete marker for 
> every path component that did not exist (or was previously deleted). Note 
> that issuing a DELETE request without specifying a version ID always 
> creates a new delete marker, even if one already exists ([AWS S3 Developer 
> Guide|https://docs.aws.amazon.com/AmazonS3/latest/dev/RemDelMarker.html]).
> Since deleteUnnecessaryFakeDirectories() is called as a callback on 
> successful writes and on renames, delete markers accumulate quickly, and 
> their rate of accumulation is inversely proportional to the depth of the 
> path. In other words, directories closer to the root accumulate more delete 
> markers than the leaves.
> This behavior negatively impacts the performance of the getFileStatus() 
> operation whenever it has to issue a listObjects() request (especially v1), 
> as the delete markers have to be examined while the request searches for 
> the first current, non-deleted version of an object following a given 
> prefix.
> I did a quick comparison against 3.x and the issue is still present: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2947
>  


