adriangb opened a new pull request, #18720: URL: https://github.com/apache/datafusion/pull/18720
## Summary

This PR consolidates the separate `ArrowFileSource` and `ArrowStreamFileSource` implementations into a unified `ArrowSource` with an `ArrowFormat` enum. This is part of the larger projection refactoring effort tracked in https://github.com/apache/datafusion/pull/18627.

## Key Changes

- **Removed separate structs**: Eliminated duplicate `ArrowFileSource` and `ArrowStreamFileSource` implementations
- **Added `ArrowFormat` enum**: Simple enum with `File` and `Stream` variants to distinguish between Arrow IPC formats
- **Unified `ArrowSource` struct**: Single struct that uses `ArrowFormat` to dispatch to appropriate opener
- **Kept separate openers**: `ArrowFileOpener` and `ArrowStreamFileOpener` remain distinct as their implementations differ significantly
- **Format-specific behavior**: `repartitioned()` method returns `None` for Stream format (doesn't support parallel reading) and delegates to default logic for File format

## Benefits

- **Reduced code duplication**: ~144 net lines removed
- **Clearer architecture**: Single source of truth for Arrow file handling
- **Maintained separation**: Format-specific logic remains in separate openers
- **No behavior changes**: All existing tests pass without modification

## Testing

- All existing tests pass
- No changes to test files needed
- Both file and stream formats work correctly

## Related Work

This PR is independent and can be merged before or after:

- PR 1: Move Statistics Handling (if created)
- PR 3: Enhance Physical-Expr Projection Handling (if created)

Part of #18627

🤖 Generated with [Claude Code](https://claude.com/claude-code)
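## Illustrative Sketch

The dispatch described under Key Changes can be sketched roughly as follows. This is a simplified, standalone illustration and not the code in this PR: the `Opener` trait, the `describe()` method, the `partitions` field, and the signatures of `new()`, `create_opener()`, and `repartitioned()` are all assumptions made for the example. Only the names `ArrowFormat`, `ArrowSource`, `ArrowFileOpener`, and `ArrowStreamFileOpener` come from the description above, and the real DataFusion traits take more parameters.

```rust
/// Which Arrow IPC layout the source reads (variant names from the PR description).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArrowFormat {
    File,
    Stream,
}

/// Hypothetical stand-in for the opener abstraction; the real one is richer.
trait Opener {
    fn describe(&self) -> &'static str;
}

struct ArrowFileOpener;
struct ArrowStreamFileOpener;

impl Opener for ArrowFileOpener {
    fn describe(&self) -> &'static str {
        "file opener"
    }
}

impl Opener for ArrowStreamFileOpener {
    fn describe(&self) -> &'static str {
        "stream opener"
    }
}

/// Single source struct; the format field selects the opener.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ArrowSource {
    format: ArrowFormat,
    /// Hypothetical field standing in for the real partitioning state.
    partitions: usize,
}

impl ArrowSource {
    fn new(format: ArrowFormat) -> Self {
        Self { format, partitions: 1 }
    }

    /// Dispatch to the format-specific opener, which stays a separate type.
    fn create_opener(&self) -> Box<dyn Opener> {
        match self.format {
            ArrowFormat::File => Box::new(ArrowFileOpener),
            ArrowFormat::Stream => Box::new(ArrowStreamFileOpener),
        }
    }

    /// Stream format cannot be read in parallel, so it opts out with `None`.
    /// The File branch here is a placeholder for "delegate to default logic".
    fn repartitioned(&self, target_partitions: usize) -> Option<Self> {
        match self.format {
            ArrowFormat::Stream => None,
            ArrowFormat::File => Some(Self {
                format: self.format,
                partitions: target_partitions,
            }),
        }
    }
}
```

In this sketch, `ArrowSource::new(ArrowFormat::Stream).repartitioned(4)` yields `None`, while the same call on a `File` source yields a repartitioned copy. Keeping the openers as separate types means the `match` is the only place where the two formats meet.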
