ReemaAlzaid opened a new pull request, #12026:
URL: https://github.com/apache/gluten/pull/12026

   ### What changes were proposed in this pull request?
   
   This PR adds a Gluten S3 filesystem registration path for the Velox backend.
   
   It introduces `GlutenS3FileSystem`, which extends Velox's `S3FileSystem`, 
and registers it from `VeloxBackend` when Gluten is built with S3 support. The 
current implementation preserves the existing Velox S3 behavior by delegating 
writes to Velox's S3 filesystem.
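
The subclass-and-register pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `BaseS3FileSystem`, `CustomS3FileSystem`, `Registry`, `isS3Path`, and `registerCustomS3` are hypothetical stand-ins, and none of them mirror the real Velox or Gluten signatures.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for the backend's S3 filesystem (NOT Velox's real API).
class BaseS3FileSystem {
 public:
  virtual ~BaseS3FileSystem() = default;
  virtual std::string openFileForWrite(std::string_view path) {
    return "base-write:" + std::string(path);
  }
};

// Hypothetical subclass: a place to hook logging or custom behavior.
// Here it only delegates to the base class, so behavior is unchanged.
class CustomS3FileSystem : public BaseS3FileSystem {
 public:
  std::string openFileForWrite(std::string_view path) override {
    return BaseS3FileSystem::openFileForWrite(path);
  }
};

using Matcher = std::function<bool(std::string_view)>;
using Factory = std::function<std::shared_ptr<BaseS3FileSystem>()>;

// Hypothetical registry: the first matching entry creates the filesystem.
class Registry {
 public:
  void registerFileSystem(Matcher matcher, Factory factory) {
    entries_.emplace_back(std::move(matcher), std::move(factory));
  }

  std::shared_ptr<BaseS3FileSystem> getFileSystem(std::string_view path) const {
    for (const auto& [matches, make] : entries_) {
      if (matches(path)) {
        return make();
      }
    }
    return nullptr;  // no registered filesystem handles this path
  }

 private:
  std::vector<std::pair<Matcher, Factory>> entries_;
};

// Assumed scheme check for illustration; the real matcher may differ.
inline bool isS3Path(std::string_view path) {
  return path.rfind("s3://", 0) == 0 || path.rfind("s3a://", 0) == 0;
}

// Registers the subclass instead of the base filesystem for S3 paths.
inline void registerCustomS3(Registry& registry) {
  registry.registerFileSystem(
      isS3Path, [] { return std::make_shared<CustomS3FileSystem>(); });
}
```

In this sketch, a caller that resolves an `s3a://` path through the registry receives the subclass, while writes still go through the base implementation. That is the property the new test name (`registeredFileSystemUsesGlutenSubclass`) suggests is being checked.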
   
   ### Why are the changes needed?
   
   Gluten currently registers Velox's S3 filesystem directly. That makes it 
hard to customize, debug, or extend S3 behavior from Gluten without changing 
the Velox-side registration.
   
   ### Does this PR introduce any user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   - Built the focused C++ test target:
     `cmake --build cpp/build --target gluten_s3_file_system_test -j 12`
   
   - Ran the new C++ test:
     `cpp/build/velox/tests/gluten_s3_file_system_test 
--gtest_filter=GlutenS3FileSystemTest.registeredFileSystemUsesGlutenSubclass`
   
   - Manually verified with Spark 4.0.1 and S3A that Gluten registers the 
custom S3 filesystem, native Parquet write uses the Gluten S3 write path, S3 
files are committed successfully, and the Velox S3 read plan is generated.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

