UBarney commented on PR #16443: URL: https://github.com/apache/datafusion/pull/16443#issuecomment-2994063902
> * `apply_join_filter_to_indices` showed a reduction in execution time (sample count reduced from 528 million to 241 million).

The benchmark results indicate that restricting the row count of `intermediate_batch` in `apply_join_filter_to_indices` does improve performance: the `batch_size==8192` case takes about 488 ms, versus about 660 ms for `batch_size==8192*8192`.

```
Benchmarking nlj/filter/batch_size==8192: Collecting 100 samples in estimated 156.94 s (30
nlj/filter/batch_size==8192
                        time:   [487.31 ms 488.41 ms 489.68 ms]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  4 (4.00%) high severe

Benchmarking nlj/filter/batch_size==8192*8192: Collecting 100 samples in estimated 126.81
nlj/filter/batch_size==8192*8192
                        time:   [658.42 ms 659.90 ms 661.49 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
```

https://gist.github.com/UBarney/23fdb597f43bfcffe4f781fb6b99e579
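For readers unfamiliar with the idea, here is a minimal sketch of what "restricting the row count of the intermediate batch" can look like. It is not the code from this PR and does not use DataFusion or Arrow types: `filter_pairs_chunked`, `max_intermediate_rows`, and the closure-based `predicate` are hypothetical names chosen for illustration. The sketch assumes the join filter can be evaluated independently on each slice of candidate (build, probe) index pairs, so at most `max_intermediate_rows` rows are materialized at a time.

```rust
/// Hypothetical illustration, not DataFusion's API: evaluate a join filter
/// over candidate (build, probe) index pairs in bounded-size chunks.
fn filter_pairs_chunked<F>(
    build_indices: &[u64],
    probe_indices: &[u32],
    max_intermediate_rows: usize,
    predicate: F,
) -> (Vec<u64>, Vec<u32>)
where
    F: Fn(u64, u32) -> bool,
{
    assert_eq!(build_indices.len(), probe_indices.len());
    assert!(max_intermediate_rows > 0, "chunk size must be non-zero");

    let mut kept_build = Vec::new();
    let mut kept_probe = Vec::new();

    // Both slices have the same length, so their chunks line up one-to-one.
    for (build_chunk, probe_chunk) in build_indices
        .chunks(max_intermediate_rows)
        .zip(probe_indices.chunks(max_intermediate_rows))
    {
        // Stand-in for building the intermediate batch: it never holds more
        // than `max_intermediate_rows` rows, whatever the total pair count.
        let intermediate: Vec<(u64, u32)> = build_chunk
            .iter()
            .copied()
            .zip(probe_chunk.iter().copied())
            .collect();

        // Apply the filter to this chunk and keep the surviving pairs.
        for (b, p) in intermediate {
            if predicate(b, p) {
                kept_build.push(b);
                kept_probe.push(p);
            }
        }
    }

    (kept_build, kept_probe)
}
```

The result is the same for any chunk size; only the peak size of the temporary buffer changes, which is the effect the two benchmark cases above compare.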