steveloughran commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2117702734

   +1
   
   I was about to merge when I realised that Yetus wasn't ready. Here is my
   draft commit message.
   
   1. As soon as this is in, I will rebase #6686 onto it; that PR extends
      WrappedIO and adds the reflection utility classes from Parquet to assist
      in testing.
   2. I'll leave you to do the cherry-pick and merge onto 3.4.x.
   3. I want to get a minimal version into 3.3.x, maybe with a page size of 1
      even on S3A, but without the safety checks, so that it still saves on
      LIST/HEAD calls.
   
   ----
   
   
   Create a BulkDelete implementation from a
   BulkDeleteSource; the BulkDelete interface provides
   pageSize(), the maximum number of entries which can be
   deleted in one call, and a bulkDelete(Collection<Path> paths)
   method which takes a collection of up to pageSize() paths.
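
A caller with more paths than pageSize() has to split them into pages itself. The following is a minimal, self-contained sketch of that loop, not the Hadoop code: `BulkDeleteSketch`, `InMemoryStore` and `PagedDelete` are hypothetical stand-ins, paths are plain strings rather than Hadoop `Path` objects, and the return type is simplified to a list of paths which failed.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the BulkDelete interface; paths are plain strings. */
interface BulkDeleteSketch {
    /** Maximum number of paths accepted by a single bulkDelete() call. */
    int pageSize();

    /** Deletes the given paths; returns the paths which could not be deleted. */
    List<String> bulkDelete(Collection<String> paths);
}

/** In-memory store used only to demonstrate paging; it counts bulk requests. */
final class InMemoryStore implements BulkDeleteSketch {
    final Set<String> objects = new HashSet<>();
    final int pageSize;
    int requests = 0;

    InMemoryStore(int pageSize, Collection<String> initial) {
        this.pageSize = pageSize;
        this.objects.addAll(initial);
    }

    @Override
    public int pageSize() {
        return pageSize;
    }

    @Override
    public List<String> bulkDelete(Collection<String> paths) {
        if (paths.size() > pageSize) {
            throw new IllegalArgumentException(
                "page of " + paths.size() + " exceeds page size " + pageSize);
        }
        requests++;
        objects.removeAll(paths);
        // This sketch never reports failures.
        return new ArrayList<>();
    }
}

final class PagedDelete {
    private PagedDelete() {
    }

    /** Deletes all paths in pages of at most pageSize() entries; collects failures. */
    static List<String> deleteAll(BulkDeleteSketch deleter, List<String> paths) {
        int page = deleter.pageSize();
        if (page < 1) {
            throw new IllegalArgumentException("page size must be at least 1");
        }
        List<String> failures = new ArrayList<>();
        for (int start = 0; start < paths.size(); start += page) {
            int end = Math.min(start + page, paths.size());
            failures.addAll(deleter.bulkDelete(paths.subList(start, end)));
        }
        return failures;
    }
}
```

With a page size of 3, seven paths are deleted in three requests; with a page size of 1 the same loop issues one request per path.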
   
   This is optimized for object stores with bulk delete APIs;
   the S3A connector will offer the page size of
   fs.s3a.bulk.delete.page.size unless bulk delete has
   been disabled.
   
   Even with a page size of 1, the S3A implementation is
   more efficient than delete(path)
   as there are no safety checks for the path being a directory
   or probes for the need to recreate directories.
   
   The interface BulkDeleteSource is implemented by
   all FileSystem implementations, with a page size
   of 1 and mapped to delete(pathToDelete, false).
   This means that callers do not need to have special
   case handling for object stores versus classic filesystems.
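
The page-size-1 fallback can be pictured as a thin adapter over the single-path delete call. This is an illustrative sketch only: `SimpleFs` and `SingleDeleteAdapter` are hypothetical names, paths are plain strings, and treating a `false` return from delete as a failure is an assumption of the sketch rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical minimal filesystem: only the single-path delete call. */
interface SimpleFs {
    boolean delete(String path, boolean recursive);
}

/** Page-size-1 fallback: each "bulk" call deletes at most one path. */
final class SingleDeleteAdapter {
    private final SimpleFs fs;

    SingleDeleteAdapter(SimpleFs fs) {
        this.fs = fs;
    }

    int pageSize() {
        return 1;
    }

    /** Maps the bulk call onto delete(path, false); returns the paths which failed. */
    List<String> bulkDelete(Collection<String> paths) {
        if (paths.size() > pageSize()) {
            throw new IllegalArgumentException(
                "at most " + pageSize() + " path per call, got " + paths.size());
        }
        List<String> failures = new ArrayList<>();
        for (String path : paths) {
            // recursive = false, as in the mapping described above
            if (!fs.delete(path, false)) {
                failures.add(path); // assumption: a false return counts as a failure
            }
        }
        return failures;
    }
}
```

Because the adapter exposes the same pageSize()/bulkDelete() shape, a caller that pages its deletes needs no branch for classic filesystems.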
   
   To aid use through reflection APIs, the class
   org.apache.hadoop.io.wrappedio.WrappedIO
   has been created with "reflection friendly" methods.
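
A library which must also compile against older Hadoop releases could bind to such a class by name at runtime. The sketch below shows only the generic lookup pattern; `ReflectiveLookup` is a hypothetical helper, and it is demonstrated against `java.lang.Math` so that it runs without Hadoop on the classpath. The WrappedIO method names and signatures are not listed here, so none are assumed.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Optional;

final class ReflectiveLookup {
    private ReflectiveLookup() {
    }

    /**
     * Looks up a public static method by class and method name.
     * Returns empty if the class or method is absent, so the caller can
     * fall back to another code path.
     */
    static Optional<Method> findStatic(String className, String methodName,
                                       Class<?>... parameterTypes) {
        try {
            Method method = Class.forName(className).getMethod(methodName, parameterTypes);
            return Modifier.isStatic(method.getModifiers())
                ? Optional.of(method)
                : Optional.empty();
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return Optional.empty();
        }
    }

    /** Invokes a static method, wrapping reflection failures in an unchecked exception. */
    static Object invokeStatic(Method method, Object... args) {
        try {
            return method.invoke(null, args);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("reflective call failed: " + method, e);
        }
    }
}
```

Static methods with simple parameter and return types keep this kind of lookup short, which is presumably what "reflection friendly" is aiming for.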
   
   Contributed by Mukund Thakur and Steve Loughran


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

