UBarney commented on PR #16443: URL: https://github.com/apache/datafusion/pull/16443#issuecomment-2993893069
> I'll find out why there is a performance improvement

From the flame graph (captured while executing the SQL `select t1.value from range(8192) t1 join range(8192) t2 on t1.value + t2.value < t1.value * t2.value;`), adjusting the size of the input indices reduced the execution time of these two functions:

- `apply_join_filter_to_indices` showed a reduction in execution time (sample count reduced from 528 million to 241 million).
- `build_batch_from_indices` (excluding the contribution of `apply_join_filter_to_indices`) showed a reduction in execution time (sample count reduced from 79 million to 35 million).

But I still can't explain why these two functions performed better. 😂

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org