alamb commented on PR #7650: URL: https://github.com/apache/arrow-rs/pull/7650#issuecomment-2983703192
I played around with this PR for a while this morning to see why the coalesce kernel is slower for high selectivity filters even when the algorithm should be the same.

I could reproduce it locally.

This branch:

```
cargo bench --bench coalesce_kernels -- "mixed_utf8, 8192, nulls: 0, selectivity: 0.8"
filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8
time: [3.7678 ms 3.8577 ms 3.9434 ms]
time: [4.0622 ms 4.1009 ms 4.1415 ms] -- why does this slow down?
time: [4.0251 ms 4.0980 ms 4.1655 ms]
```

On main, there is a real and measurable difference:

```
filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8
time: [3.9407 ms 3.9883 ms 4.0385 ms]
time: [3.5373 ms 3.6533 ms 3.7646 ms]
time: [3.4424 ms 3.5605 ms 3.6821 ms]
```

I poked around with the profiles, and it seems like the answer may be the overhead of `RecordBatch::slice`. I am looking into avoiding that.
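To put a rough number on the gap, here is a minimal sketch that compares the middle value of each `time:` interval on this branch against main. It assumes the three `time:` lines appear in the same order in both runs (the comment does not label them) and that the middle value is the benchmark's point estimate. The helper `percent_change` is a hypothetical name for illustration and is not part of arrow-rs.

```rust
/// Percentage change of `branch_ms` relative to `main_ms`.
/// A positive result means the branch is slower than main.
/// (Hypothetical helper, not part of arrow-rs.)
fn percent_change(main_ms: f64, branch_ms: f64) -> f64 {
    (branch_ms - main_ms) / main_ms * 100.0
}

fn main() {
    // Middle value of each reported interval, copied from the output above.
    // Assumption: the i-th `time:` line on main corresponds to the
    // i-th `time:` line on this branch.
    let runs = [
        (3.9883, 3.8577), // (main, branch)
        (3.6533, 4.1009),
        (3.5605, 4.0980),
    ];

    for (i, (main_ms, branch_ms)) in runs.iter().enumerate() {
        println!(
            "time line {}: main {:.4} ms, branch {:.4} ms, change {:+.1}%",
            i + 1,
            main_ms,
            branch_ms,
            percent_change(*main_ms, *branch_ms)
        );
    }
}
```

Under that pairing, the first line is slightly faster on the branch while the second and third are slower by more than ten percent, which matches the line flagged with "why does this slow down?".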