zuston opened a new pull request, #2603: URL: https://github.com/apache/uniffle/pull/2603
### What changes were proposed in this pull request?

This PR introduces a mechanism for reporting the localfile read plan to the shuffle server before the actual read begins. The changes are scoped to the client side only; further changes will be needed in the shuffle server in the future.

### Why are the changes needed?

For normal partitions, reads are sequential, which makes read-ahead optimization feasible. This has already been verified in the Riffle project (see [issue #483](https://github.com/zuston/riffle/issues/483)).

For huge partitions, however, reads are no longer sequential: because of the AQE skew join optimization rule, a reader may skip over parts of the data. In such cases it is difficult for the server to predict the next read position and length.

Based on this analysis, we propose introducing a fixed read plan that is propagated from the client to the server. With the plan in hand, the server knows the next read offset and can therefore benefit from read-ahead optimization.

### Does this PR introduce _any_ user-facing change?

Yes. The feature is disabled by default.

### How was this patch tested?

Existing tests.
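To make the idea of a fixed read plan concrete, here is a minimal sketch of how a server could use a client-reported plan to find the next range to prefetch. It is illustrative only: `ReadPlanSketch`, `Segment`, `ReadPlan`, and `nextAfter` are hypothetical names, not the classes or RPC messages added by this PR, and the sketch assumes a plan is simply an ordered list of (offset, length) ranges with distinct offsets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; none of these types exist in Uniffle.
public class ReadPlanSketch {

  /** One planned read: a byte range in the partition's local data file. */
  static final class Segment {
    final long offset;
    final int length;

    Segment(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** The ordered list of ranges the client says it will read (assumed shape). */
  static final class ReadPlan {
    private final List<Segment> segments;

    ReadPlan(List<Segment> segments) {
      this.segments = new ArrayList<>(segments);
    }

    /**
     * Given the offset of the request being served now, return the range the
     * plan lists next, which is the candidate for read-ahead. Returns empty
     * if the offset is not in the plan or it is the last planned read.
     */
    Optional<Segment> nextAfter(long currentOffset) {
      for (int i = 0; i < segments.size(); i++) {
        if (segments.get(i).offset == currentOffset) {
          return i + 1 < segments.size()
              ? Optional.of(segments.get(i + 1))
              : Optional.empty();
        }
      }
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // A skippable pattern: the ranges are not contiguous, so the next
    // offset cannot be derived from the current offset plus length.
    ReadPlan plan = new ReadPlan(Arrays.asList(
        new Segment(0L, 1024),
        new Segment(8192L, 2048),
        new Segment(65536L, 512)));

    plan.nextAfter(0L).ifPresent(s ->
        System.out.println("prefetch offset=" + s.offset + " length=" + s.length));
  }
}
```

The linear scan keeps the sketch short; a real server would more likely keep a cursor into the plan or index it by offset.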
