[PR] Perf: optimize actual_buffer_size to use only data buffer capacity [arrow-rs]

via GitHub Sun, 20 Jul 2025 08:23:47 -0700


zhuqi-lucas opened a new pull request, #7967:
URL: https://github.com/apache/arrow-rs/pull/7967


   # Which issue does this PR close?
   
   The idea: when deciding whether to gc, calculate only the data buffer size, 
because gc matters almost exclusively for the data buffers, not for the 
views or null buffers.
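
A minimal sketch of the idea above, not the arrow-rs implementation. `ViewArraySketch`, its fields, and the "more than twice the ideal size" threshold are assumptions made for illustration; the only fact taken from the Arrow view layout is that strings of 12 bytes or fewer are stored inline in the view and therefore occupy no data buffer space.

```rust
/// Hypothetical stand-in for a string view array (NOT the arrow-rs type).
struct ViewArraySketch {
    /// Byte length of each string still referenced by a view.
    value_lens: Vec<usize>,
    /// Variable-length data buffers holding the non-inlined strings.
    data_buffers: Vec<Vec<u8>>,
}

/// Strings up to this length are inlined in the view itself.
const MAX_INLINE_LEN: usize = 12;

impl ViewArraySketch {
    /// Bytes held by the data buffers only; views and nulls are ignored.
    fn data_buffer_capacity(&self) -> usize {
        self.data_buffers.iter().map(|b| b.capacity()).sum()
    }

    /// Bytes the data buffers would need if they held only referenced strings.
    fn ideal_data_size(&self) -> usize {
        self.value_lens
            .iter()
            .copied()
            .filter(|&len| len > MAX_INLINE_LEN)
            .sum()
    }

    /// Assumed heuristic: compact when the buffers hold more than twice the ideal size.
    fn should_gc(&self) -> bool {
        self.data_buffer_capacity() > 2 * self.ideal_data_size()
    }
}

fn main() {
    // Mostly unreferenced buffer: one 20-byte string in a 1 KiB buffer.
    let sparse = ViewArraySketch {
        value_lens: vec![5, 20],
        data_buffers: vec![Vec::with_capacity(1024)],
    };
    // Fully used buffer: 50 referenced bytes in a 50-byte buffer.
    let dense = ViewArraySketch {
        value_lens: vec![20, 30],
        data_buffers: vec![Vec::with_capacity(50)],
    };
    println!("sparse: gc = {}", sparse.should_gc());
    println!("dense: gc = {}", dense.should_gc());
}
```

The point of the sketch is that the gc decision never touches the views or the null buffer, so nothing needs to be sized beyond the data buffers.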
   
   
   # Rationale for this change
   
   Computing `actual_buffer_size` from the data buffer capacity alone avoids 
sizing the views and null buffers, which do not matter for the gc decision.
   
   # What changes are included in this PR?
   
   `actual_buffer_size` now uses only the data buffer capacity.
   
   # Are these changes tested?
   
   The improvement is substantial for the high-selectivity benchmarks 
(selectivity 0.8): the time drops by about 58% with no nulls and by about 44% 
with a null ratio of 0.1:
   
   ```shell
   cargo bench --bench coalesce_kernels "single_utf8view"
      Compiling arrow-select v55.2.0 (/Users/zhuqi/arrow-rs/arrow-select)
      Compiling arrow-cast v55.2.0 (/Users/zhuqi/arrow-rs/arrow-cast)
      Compiling arrow-string v55.2.0 (/Users/zhuqi/arrow-rs/arrow-string)
      Compiling arrow-ord v55.2.0 (/Users/zhuqi/arrow-rs/arrow-ord)
      Compiling arrow-csv v55.2.0 (/Users/zhuqi/arrow-rs/arrow-csv)
      Compiling arrow-json v55.2.0 (/Users/zhuqi/arrow-rs/arrow-json)
      Compiling arrow v55.2.0 (/Users/zhuqi/arrow-rs/arrow)
       Finished `bench` profile [optimized] target(s) in 13.26s
        Running benches/coalesce_kernels.rs (target/release/deps/coalesce_kernels-bb9750abedb10ad6)
   filter: single_utf8view, 8192, nulls: 0, selectivity: 0.001
                           time:   [30.946 ms 31.071 ms 31.193 ms]
                           change: [−1.7086% −1.1581% −0.6036%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     4 (4.00%) low mild
     1 (1.00%) high mild
   
   filter: single_utf8view, 8192, nulls: 0, selectivity: 0.01
                           time:   [3.8178 ms 3.8311 ms 3.8444 ms]
                           change: [−4.0521% −3.5467% −3.0345%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) low mild
   
   Benchmarking filter: single_utf8view, 8192, nulls: 0, selectivity: 0.1: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.9s, enable flat sampling, or reduce sample count to 40.
   filter: single_utf8view, 8192, nulls: 0, selectivity: 0.1
                           time:   [1.9337 ms 1.9406 ms 1.9478 ms]
                           change: [+0.3699% +0.9557% +1.5666%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     2 (2.00%) low mild
     3 (3.00%) high severe
   
   filter: single_utf8view, 8192, nulls: 0, selectivity: 0.8
                           time:   [797.60 µs 805.31 µs 813.85 µs]
                           change: [−59.177% −58.412% −57.639%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.001
                           time:   [43.742 ms 43.924 ms 44.108 ms]
                           change: [−1.2146% −0.5778% +0.0828%] (p = 0.08 > 0.05)
                           No change in performance detected.
   
   filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.01
                           time:   [5.5736 ms 5.5987 ms 5.6247 ms]
                           change: [−0.2381% +0.4740% +1.1711%] (p = 0.18 > 0.05)
                           No change in performance detected.
   
   filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.1
                           time:   [2.2963 ms 2.3035 ms 2.3109 ms]
                           change: [−0.9314% −0.5125% −0.0931%] (p = 0.02 < 0.05)
                           Change within noise threshold.
   
   Benchmarking filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.8: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.1s, enable flat sampling, or reduce sample count to 50.
   filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.8
                           time:   [1.5482 ms 1.5697 ms 1.5903 ms]
                           change: [−45.794% −44.386% −43.000%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   ```
   
   
   # Are there any user-facing changes?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
