cbb330 opened a new pull request, #48984: URL: https://github.com/apache/arrow/pull/48984
## Summary

- Add comprehensive predicate pushdown support for the ORC reader in C++ and Python
- Implement OrcMultiStripeReader for efficient reading of filtered stripes
- Add detailed design documentation for ORC predicate pushdown
- Enhance Python bindings to expose stripe filtering capabilities
- Add extensive test coverage for predicate pushdown functionality

## Changes

- **C++ Core**: New OrcMultiStripeReader class handles reading from multiple selected stripes after predicate filtering
- **Python Bindings**: Expose stripe filtering through updated _orc.pyx interface
- **Dataset Integration**: Enhanced file_orc.cc with predicate pushdown support
- **Documentation**: Comprehensive design document explaining implementation approach
- **Testing**: Added unit tests for stripe filtering and predicate evaluation

## Test Plan

- New C++ tests in file_orc_test.cc verify stripe filtering behavior
- Python tests in test_orc.py validate end-to-end predicate pushdown
- Standalone test program cpp/test_orc_pushdown.cc for integration testing
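
For readers unfamiliar with the idea, the stripe filtering described above can be illustrated with a minimal sketch. This is not the code in this PR: `StripeStats`, `stripe_may_match`, and `select_stripes` are hypothetical names, and the sketch assumes a single integer column with per-stripe min/max statistics and no nulls. A stripe is skipped only when its statistics prove that no row can satisfy the predicate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StripeStats:
    """Min/max statistics for one column in one stripe (hypothetical structure)."""
    index: int
    min_value: int
    max_value: int


def stripe_may_match(stats: StripeStats, op: str, literal: int) -> bool:
    """Return False only when the statistics prove that no row can match."""
    if op == "==":
        return stats.min_value <= literal <= stats.max_value
    if op == "<":
        return stats.min_value < literal
    if op == "<=":
        return stats.min_value <= literal
    if op == ">":
        return stats.max_value > literal
    if op == ">=":
        return stats.max_value >= literal
    # Unknown operator: stay conservative and keep the stripe.
    return True


def select_stripes(all_stats: List[StripeStats], op: str, literal: int) -> List[int]:
    """Return the indices of stripes that still need to be read."""
    return [s.index for s in all_stats if stripe_may_match(s, op, literal)]


# Three stripes covering the value ranges 0-99, 100-199, and 200-299.
stats = [
    StripeStats(0, 0, 99),
    StripeStats(1, 100, 199),
    StripeStats(2, 200, 299),
]
selected = select_stripes(stats, ">", 150)  # stripe 0 can be skipped
```

Rows inside the surviving stripes still have to be filtered after reading, since min/max statistics can only rule stripes out, not confirm that every row matches.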
