UBarney commented on PR #16443:
URL: https://github.com/apache/datafusion/pull/16443#issuecomment-2994063902

   > * `apply_join_filter_to_indices` showed a reduction in execution time (sample count reduced from 528 million to 241 million).
   
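   The benchmark results confirm that restricting the row count of `intermediate_batch` in `apply_join_filter_to_indices` improves performance: the `batch_size==8192` case runs in about 488 ms, versus about 660 ms for `batch_size==8192*8192`, roughly 26% less time.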
   ```
   Benchmarking nlj/filter/batch_size==8192: Collecting 100 samples in estimated 156.94 s (30
   nlj/filter/batch_size==8192
                           time:   [487.31 ms 488.41 ms 489.68 ms]
   Found 6 outliers among 100 measurements (6.00%)
     1 (1.00%) low mild
     1 (1.00%) high mild
     4 (4.00%) high severe
   Benchmarking nlj/filter/batch_size==8192*8192: Collecting 100 samples in estimated 126.81 
   nlj/filter/batch_size==8192*8192
                           time:   [658.42 ms 659.90 ms 661.49 ms]
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   ```
   
   https://gist.github.com/UBarney/23fdb597f43bfcffe4f781fb6b99e579
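
To illustrate the idea of capping the intermediate batch, here is a minimal, dependency-free sketch. It is not the DataFusion implementation: `filter_indices_chunked`, its parameters, and the use of plain slices instead of Arrow arrays are all assumptions made for illustration. The point is only that the candidate index pairs are filtered in chunks of at most `max_rows`, so the per-chunk working set stays bounded.

```rust
/// Hypothetical sketch (not DataFusion's API): filter candidate
/// (build_idx, probe_idx) pairs in chunks of at most `max_rows` rows,
/// so the per-chunk intermediate data stays bounded in size.
fn filter_indices_chunked<F>(
    build_indices: &[u64],
    probe_indices: &[u32],
    max_rows: usize,
    filter: F,
) -> (Vec<u64>, Vec<u32>)
where
    F: Fn(u64, u32) -> bool,
{
    assert_eq!(build_indices.len(), probe_indices.len());
    assert!(max_rows > 0, "max_rows must be positive");

    let mut kept_build = Vec::new();
    let mut kept_probe = Vec::new();

    // Both slices have the same length, so their chunks line up pairwise.
    let chunks = build_indices
        .chunks(max_rows)
        .zip(probe_indices.chunks(max_rows));

    for (build_chunk, probe_chunk) in chunks {
        // Stand-in for materializing the intermediate batch and evaluating
        // the join filter on it: one boolean per candidate pair in the chunk.
        let mask: Vec<bool> = build_chunk
            .iter()
            .zip(probe_chunk)
            .map(|(&b, &p)| filter(b, p))
            .collect();

        // Keep only the index pairs that passed the filter.
        for ((&b, &p), keep) in build_chunk.iter().zip(probe_chunk).zip(mask) {
            if keep {
                kept_build.push(b);
                kept_probe.push(p);
            }
        }
    }

    (kept_build, kept_probe)
}
```

A caller would pass something like `filter_indices_chunked(&build, &probe, 8192, |b, p| b < u64::from(p))`. The result does not depend on `max_rows`; only the size of each intermediate `mask` does.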


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

