This is an automated email from the ASF dual-hosted git repository.

stevel pushed a commit to branch branch-3.4.2
in repository https://gitbox.apache.org/repos/asf/hadoop.git


The following commit(s) were added to refs/heads/branch-3.4.2 by this push:
     new 8c27442888f HADOOP-19576. S3A: Disable Purging Pending MPUs Before Directory Purge (#7722)
8c27442888f is described below

commit 8c27442888f4e4929bd8c9206ef121be788c5cad
Author: Syed Shameerur Rahman <rhma...@amazon.com>
AuthorDate: Tue Jul 8 20:08:19 2025 +0530

    HADOOP-19576. S3A: Disable Purging Pending MPUs Before Directory Purge (#7722)
    
    Contributed by Syed Shameerur Rahman
---
 .../org/apache/hadoop/fs/s3a/S3AFileSystem.java    |  3 +--
 .../tools/hadoop-aws/troubleshooting_s3a.md        | 27 ++++++++++++++++++----
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
index c344022ebfc..7046ed9f110 100644
--- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
+++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
@@ -689,9 +689,8 @@ public void initialize(URI name, Configuration originalConf)
       s3ExpressStore = isS3ExpressStore(bucket, endpoint);
 
       // should the delete also purge uploads?
-      // happens if explicitly enabled, or if the store is S3Express storage.
       dirOperationsPurgeUploads = conf.getBoolean(DIRECTORY_OPERATIONS_PURGE_UPLOADS,
-          s3ExpressStore);
+          DIRECTORY_OPERATIONS_PURGE_UPLOADS_DEFAULT);
 
       this.isMultipartUploadEnabled = conf.getBoolean(MULTIPART_UPLOADS_ENABLED,
           DEFAULT_MULTIPART_UPLOAD_ENABLED);
diff --git a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md
index 6520e0dc026..151ee5bd8a4 100644
--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md
@@ -1218,10 +1218,29 @@ java.io.FileNotFoundException: Completing multi-part upload on fork-5/test/multi
 This can happen when all outstanding uploads have been aborted, including the
 active ones.
 
-If the bucket has a lifecycle policy of deleting multipart uploads, make sure
-that the expiry time of the deletion is greater than that required for all open
-writes to complete the write,
-*and for all jobs using the S3A committers to commit their work.*
+When working with S3A committers and multipart uploads (MPUs), consider these important guidelines:
+
+1. **Bucket Lifecycle Policies:**
+   - If your bucket has a lifecycle policy for deleting multipart uploads
+   - Set the deletion expiry time long enough to:
+     - Complete all open write operations
+     - Allow S3A committers to finish their commit process
+
+2. **Directory Operations and MPUs:**
+   - Setting `fs.s3a.directory.operations.purge.uploads=true` will abort all pending MPUs before directory cleanup
+   - For jobs using S3A committers:
+     - Set `fs.s3a.directory.operations.purge.uploads=false` when directories need to be overwritten before job completion
+     - This prevents accidental abortion of active uploads during the commit phase
+
+
+### S3 Express Store directory object not getting deleted
+
+When working with S3 Express store buckets (unlike standard S3 buckets), follow these steps to purge a directory object:
+
+1. Set `fs.s3a.directory.operations.purge.uploads=true` if you need to delete a directory object that has pending multipart uploads (MPUs).
+
+2. This setting ensures that all pending MPUs are aborted before the directory object is deleted, which is a requirement specific to S3 Express store buckets.
+
 
 ### Application hangs after reading a number of files
 

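For anyone applying the documentation change in this commit, a minimal configuration sketch might look like the following. The property name `fs.s3a.directory.operations.purge.uploads` comes from the diff above. The bucket name `example-express-bucket` is hypothetical, and the per-bucket entry assumes that S3A's usual `fs.s3a.bucket.<bucket>.` override prefix applies to this option. The value of `DIRECTORY_OPERATIONS_PURGE_UPLOADS_DEFAULT` is not shown in this diff, so the first entry sets the option explicitly rather than relying on the default.

```xml
<configuration>
  <!-- Jobs using S3A committers: do not abort pending MPUs during
       directory operations, so active uploads survive until commit. -->
  <property>
    <name>fs.s3a.directory.operations.purge.uploads</name>
    <value>false</value>
  </property>

  <!-- Assumed per-bucket override for a hypothetical S3 Express bucket:
       abort pending MPUs so the directory object can be deleted. -->
  <property>
    <name>fs.s3a.bucket.example-express-bucket.directory.operations.purge.uploads</name>
    <value>true</value>
  </property>
</configuration>
```

Scoping the `true` setting to the one bucket that needs it keeps committer jobs on other buckets from having their in-progress uploads aborted.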
