alamb commented on PR #7650:
URL: https://github.com/apache/arrow-rs/pull/7650#issuecomment-2983703192

   I played around with this PR for a while this morning to see why the coalesce kernel is slower for high selectivity filters even when the algorithm should be the same. I could reproduce it locally:
   
   
   this branch:
   ```
   cargo bench --bench coalesce_kernels -- "mixed_utf8, 8192, nulls: 0, selectivity: 0.8"
   
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8
                           time:   [3.7678 ms 3.8577 ms 3.9434 ms]
                           time:   [4.0622 ms 4.1009 ms 4.1415 ms] -- why does this slow down?
                           time:   [4.0251 ms 4.0980 ms 4.1655 ms]
   ```
   
   on main (the difference is real and measurable):
   
   ```
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8
                           time:   [3.9407 ms 3.9883 ms 4.0385 ms]
                           time:   [3.5373 ms 3.6533 ms 3.7646 ms]
                           time:   [3.4424 ms 3.5605 ms 3.6821 ms]
   ```
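To put a rough number on that difference, here is a small sketch (not part of the PR) that compares the two sets of timings. It assumes the middle value in each bracket is the point estimate and that the three `time:` lines pair up run for run between the branch and main. The helper name `percent_change` is made up for illustration.

```rust
/// Relative change of `branch_ms` versus `main_ms`, in percent.
/// A positive result means the branch is slower than main.
fn percent_change(main_ms: f64, branch_ms: f64) -> f64 {
    (branch_ms - main_ms) / main_ms * 100.0
}

fn main() {
    // Middle values of the three `time:` lines quoted above, in milliseconds.
    // Assumption: line N on the branch corresponds to line N on main.
    let branch = [3.8577, 4.1009, 4.0980];
    let main_runs = [3.9883, 3.6533, 3.5605];

    for (i, (b, m)) in branch.iter().zip(main_runs.iter()).enumerate() {
        println!("line {}: {:+.1}% vs main", i + 1, percent_change(*m, *b));
    }
}
```

Under those assumptions the second and third lines come out roughly 12% and 15% slower on the branch, while the first line is about 3% faster, so the gap is not uniform across the three measurements.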
   
   I poked around with the profiles and it seems like the answer may be the overhead due to `RecordBatch::slice`. I am looking into avoiding that.
   

