metesynnada opened a new pull request, #6858: URL: https://github.com/apache/arrow-datafusion/pull/6858
# Which issue does this PR close?

Closes https://github.com/apache/arrow-datafusion/issues/6857

# Rationale for this change

In theory, a hash join can maintain the order of records on its probe side. This order preservation applies to Inner, RightSemi, and RightAnti joins. The changes in this pull request take advantage of that property to eliminate unnecessary sort operations. This is expected to improve performance, particularly when the probe side of the hash join is already sorted.

# What changes are included in this PR?

1. **Adjustment of HashJoinExec APIs**: We have modified the HashJoinExec operation to preserve the order of its probe side in Inner, RightSemi, and RightAnti joins, thereby eliminating unnecessary sort operations.
2. **Bug Fix on Sort Pushdown Rule**: We have fixed a minor bug in the Sort Pushdown rule.

# Are these changes tested?

The new tests are included in the `join_disable_repartition.slt` file.

# Are there any user-facing changes?

No.

cc @Dandandan
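To illustrate why the probe side's order can survive these join types, here is a minimal, self-contained sketch of a toy hash join. It is not DataFusion's `HashJoinExec`: the function names `inner_hash_join` and `right_semi_or_anti_join`, the `(key, value)` row layout, and the single-partition, in-memory setting are all assumptions made for illustration. The point it shows is that the build side is only hashed, while probe rows are streamed in their input order, so output rows appear in probe order.

```rust
use std::collections::{HashMap, HashSet};

/// Toy inner hash join (hypothetical helper, not a DataFusion API).
/// `build` is the hashed side; `probe` is the streamed side.
/// Returns (key, build_value, probe_value) rows.
fn inner_hash_join<'a>(
    build: &[(i32, &'a str)],
    probe: &[(i32, &'a str)],
) -> Vec<(i32, &'a str, &'a str)> {
    // Build phase: hash every build-side row by key.
    let mut table: HashMap<i32, Vec<&'a str>> = HashMap::new();
    for &(key, value) in build {
        table.entry(key).or_default().push(value);
    }

    // Probe phase: rows are visited in probe input order, and all matches
    // for one probe row are emitted before moving on to the next row,
    // so the output inherits the probe side's ordering.
    let mut out = Vec::new();
    for &(key, probe_value) in probe {
        if let Some(matches) = table.get(&key) {
            for &build_value in matches {
                out.push((key, build_value, probe_value));
            }
        }
    }
    out
}

/// Toy RightSemi (`anti == false`) or RightAnti (`anti == true`) join
/// (hypothetical helper). Both only filter probe rows, so the probe
/// order is kept as-is.
fn right_semi_or_anti_join<'a>(
    build: &[(i32, &str)],
    probe: &[(i32, &'a str)],
    anti: bool,
) -> Vec<(i32, &'a str)> {
    let build_keys: HashSet<i32> = build.iter().map(|&(key, _)| key).collect();
    probe
        .iter()
        .copied()
        // Semi keeps rows with a match; anti keeps rows without one.
        .filter(|(key, _)| build_keys.contains(key) != anti)
        .collect()
}
```

In this sketch, a probe side sorted by key produces output sorted by key regardless of how the build side is ordered, which is the property that makes a downstream sort on the probe-side ordering redundant. A real engine processes record batches and multiple partitions, so the same guarantee depends on details this toy version leaves out.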
