simonvandel commented on PR #4560:
URL: https://github.com/apache/arrow-rs/pull/4560#issuecomment-1650391720
I expanded the benchmarks in f472f3fbd0a9ab76903071b70696789cfefbe341, then
compared the code before this PR (with
f472f3fbd0a9ab76903071b70696789cfefbe341 applied) against this PR:
```
$ RUSTFLAGS='-C target-cpu=native' cargo +nightly bench --bench aggregate_kernels "sum" -- --baseline=before
    Finished bench [optimized] target(s) in 0.10s
     Running benches/aggregate_kernels.rs (target/release/deps/aggregate_kernels-da08a889b5821ed5)
sum 512 u8 no nulls     time:   [17.561 ns 17.569 ns 17.577 ns]
                        change: [+151.69% +157.94% +162.51%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

sum 512 u8 50% nulls    time:   [672.19 ns 672.84 ns 673.62 ns]
                        change: [+254.36% +255.68% +257.08%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

sum 512 ts_millis no nulls
                        time:   [158.65 ns 158.72 ns 158.80 ns]
                        change: [+443.21% +444.17% +445.21%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  1 (1.00%) high mild
  6 (6.00%) high severe

sum 512 ts_millis 50% nulls
                        time:   [84.507 ns 84.543 ns 84.577 ns]
                        change: [-56.741% -56.521% -56.356%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  6 (6.00%) high severe

sum 512 f32 no nulls    time:   [28.857 ns 28.886 ns 28.920 ns]
                        change: [-93.041% -92.961% -92.887%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  6 (6.00%) high severe

sum 512 f32 50% nulls   time:   [87.285 ns 87.330 ns 87.380 ns]
                        change: [-61.882% -61.786% -61.684%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
```
This is still on an i7-10750H, with
`rustc 1.73.0-nightly (0308df23e 2023-07-21)`.
Interestingly, only f32 (with and without nulls) and ts_millis with 50% nulls
get faster. The other three cases (u8 with and without nulls, and ts_millis
without nulls) regress.
@jhorstmann can you reproduce?
In any case, some more investigation into the regressions is needed before
this can be merged.
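
To put the percentages above in absolute terms, here is a rough sketch that back-computes an approximate "before" time for each benchmark from the reported point estimates. It assumes the `change` figure is `(new - old) / old` applied to the same estimate that `time` reports, which may not hold exactly, so treat the results as ballpark numbers only. The helper name `baseline_ns` is made up for this illustration and is not part of Criterion or arrow-rs.

```rust
/// Estimate the baseline time from the new time and the reported relative
/// change in percent. Assumption: change = (new - old) / old.
fn baseline_ns(new_ns: f64, change_pct: f64) -> f64 {
    new_ns / (1.0 + change_pct / 100.0)
}

fn main() {
    // (benchmark, new point estimate in ns, point-estimate change in %),
    // copied from the benchmark output above.
    let results = [
        ("sum 512 u8 no nulls", 17.569, 157.94),
        ("sum 512 u8 50% nulls", 672.84, 255.68),
        ("sum 512 ts_millis no nulls", 158.72, 444.17),
        ("sum 512 ts_millis 50% nulls", 84.543, -56.521),
        ("sum 512 f32 no nulls", 28.886, -92.961),
        ("sum 512 f32 50% nulls", 87.330, -61.786),
    ];

    for (name, new_ns, change_pct) in results {
        let old_ns = baseline_ns(new_ns, change_pct);
        // The ratio is > 1 for a regression and < 1 for an improvement.
        println!(
            "{name:<28} before ~{old_ns:>7.1} ns, after {new_ns:>7.1} ns ({:.2}x)",
            new_ns / old_ns
        );
    }
}
```

Under that assumption, `sum 512 f32 no nulls` would have been at roughly 410 ns before this PR.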