tustvold commented on PR #2666:
URL: https://github.com/apache/arrow-rs/pull/2666#issuecomment-1241796423
<details>
<summary>Final benchmarks</summary>
```
add(0) time: [11.227 us 11.246 us 11.266 us]
change: [+16.981% +17.118% +17.246%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
add_checked(0) time: [65.867 us 65.890 us 65.916 us]
change: [-92.090% -92.078% -92.060%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) low mild
1 (1.00%) high mild
5 (5.00%) high severe
add_scalar(0) time: [5.9757 us 5.9783 us 5.9812 us]
change: [-5.9046% -5.7997% -5.6941%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high severe
subtract(0) time: [10.012 us 10.026 us 10.040 us]
change: [+2.7087% +2.8497% +2.9788%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
subtract_checked(0) time: [66.243 us 66.253 us 66.265 us]
change: [-92.047% -92.030% -92.010%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low severe
2 (2.00%) high mild
2 (2.00%) high severe
subtract_scalar(0) time: [6.0194 us 6.0281 us 6.0367 us]
change: [-5.6497% -5.4648% -5.2827%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
multiply(0) time: [9.0082 us 9.0199 us 9.0313 us]
change: [-7.6503% -7.4907% -7.3226%] (p = 0.00 < 0.05)
Performance has improved.
multiply_checked(0) time: [65.879 us 65.896 us 65.916 us]
change: [-92.134% -92.122% -92.109%] (p = 0.00 < 0.05)
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low mild
3 (3.00%) high mild
8 (8.00%) high severe
multiply_scalar(0) time: [5.9307 us 5.9356 us 5.9408 us]
change: [-6.8873% -6.7886% -6.6976%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
divide(0) time: [11.193 us 11.194 us 11.196 us]
change: [-0.2047% +0.0109% +0.1345%] (p = 0.92 > 0.05)
No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
2 (2.00%) low mild
3 (3.00%) high mild
5 (5.00%) high severe
divide_checked(0) time: [92.997 us 93.031 us 93.067 us]
change: [-88.829% -88.810% -88.769%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) high mild
4 (4.00%) high severe
divide_scalar(0) time: [11.121 us 11.122 us 11.124 us]
change: [-6.5334% -5.2295% -3.9311%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
6 (6.00%) high mild
1 (1.00%) high severe
modulo(0) time: [285.83 us 285.93 us 286.04 us]
change: [-38.958% -38.888% -38.775%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
Benchmarking modulo_scalar(0): Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, enable flat sampling, or reduce sample count to 60.
modulo_scalar(0) time: [1.2745 ms 1.2751 ms 1.2759 ms]
change: [+0.5454% +0.6770% +0.8583%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
add(0.1) time: [10.884 us 10.943 us 11.019 us]
change: [-6.5805% -5.7599% -4.9249%] (p = 0.00 < 0.05)
Performance has improved.
add_checked(0.1) time: [152.23 us 152.30 us 152.38 us]
change: [-84.601% -84.578% -84.562%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe
add_scalar(0.1) time: [6.3819 us 6.3834 us 6.3850 us]
change: [+7.1879% +7.2832% +7.3988%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
subtract(0.1) time: [10.841 us 10.850 us 10.858 us]
change: [-2.3402% -1.5446% -0.7289%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
subtract_checked(0.1) time: [150.19 us 150.25 us 150.32 us]
change: [-84.988% -84.942% -84.915%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) high mild
5 (5.00%) high severe
subtract_scalar(0.1) time: [6.3538 us 6.3562 us 6.3591 us]
change: [+7.4412% +7.5009% +7.5576%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
multiply(0.1) time: [11.025 us 11.042 us 11.062 us]
change: [-3.1599% -2.5328% -1.8726%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
multiply_checked(0.1) time: [150.59 us 150.61 us 150.63 us]
change: [-84.775% -84.766% -84.758%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low severe
2 (2.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
multiply_scalar(0.1) time: [6.3859 us 6.3892 us 6.3931 us]
change: [+7.7047% +7.7979% +7.8903%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) high mild
4 (4.00%) high severe
divide(0.1) time: [11.957 us 11.965 us 11.974 us]
change: [-0.6239% -0.3466% -0.0980%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
9 (9.00%) high mild
4 (4.00%) high severe
divide_checked(0.1) time: [214.63 us 217.64 us 221.37 us]
change: [-77.104% -76.827% -76.523%] (p = 0.00 < 0.05)
Performance has improved.
divide_scalar(0.1) time: [11.122 us 11.125 us 11.128 us]
change: [-0.2726% +0.0005% +0.2795%] (p = 0.99 > 0.05)
No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) low mild
2 (2.00%) high mild
8 (8.00%) high severe
modulo(0.1) time: [340.24 us 340.60 us 340.93 us]
change: [-47.233% -47.195% -47.154%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
10 (10.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
Benchmarking modulo_scalar(0.1): Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.1s, enable flat sampling, or reduce sample count to 60.
modulo_scalar(0.1) time: [1.2175 ms 1.2177 ms 1.2180 ms]
change: [-0.1447% +0.0582% +0.1750%] (p = 0.66 > 0.05)
No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) low mild
4 (4.00%) high mild
1 (1.00%) high severe
add(0.5) time: [9.7497 us 9.7623 us 9.7760 us]
change: [-15.134% -14.995% -14.848%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
add_checked(0.5) time: [88.050 us 88.075 us 88.104 us]
change: [-93.946% -93.935% -93.929%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
4 (4.00%) high mild
2 (2.00%) high severe
add_scalar(0.5) time: [5.9287 us 5.9329 us 5.9380 us]
change: [-8.3559% -8.1373% -7.9945%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) high mild
3 (3.00%) high severe
subtract(0.5) time: [9.8349 us 9.8428 us 9.8503 us]
change: [-17.554% -17.021% -16.534%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
subtract_checked(0.5) time: [86.171 us 86.188 us 86.208 us]
change: [-94.066% -94.056% -94.050%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
6 (6.00%) high mild
1 (1.00%) high severe
subtract_scalar(0.5) time: [5.9195 us 5.9225 us 5.9269 us]
change: [-8.2995% -8.2140% -8.1189%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) high mild
4 (4.00%) high severe
multiply(0.5) time: [9.6635 us 9.6777 us 9.6943 us]
change: [-15.146% -14.954% -14.764%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
multiply_checked(0.5) time: [85.098 us 85.126 us 85.159 us]
change: [-94.132% -94.118% -94.102%] (p = 0.00 < 0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
2 (2.00%) low mild
5 (5.00%) high mild
7 (7.00%) high severe
multiply_scalar(0.5) time: [5.9200 us 5.9237 us 5.9278 us]
change: [-8.2128% -8.1431% -8.0698%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) high mild
4 (4.00%) high severe
divide(0.5) time: [12.046 us 12.049 us 12.051 us]
change: [-0.4953% -0.2927% -0.1787%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
divide_checked(0.5) time: [91.931 us 91.972 us 92.018 us]
change: [-93.638% -93.635% -93.632%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) low mild
3 (3.00%) high mild
1 (1.00%) high severe
divide_scalar(0.5) time: [11.355 us 11.463 us 11.604 us]
change: [+6.9121% +8.4076% +9.9229%] (p = 0.00 < 0.05)
Performance has regressed.
modulo(0.5) time: [207.41 us 207.46 us 207.52 us]
change: [-72.652% -72.601% -72.572%] (p = 0.00 < 0.05)
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
4 (4.00%) low mild
4 (4.00%) high mild
4 (4.00%) high severe
modulo_scalar(0.5) time: [915.27 us 915.38 us 915.50 us]
change: [+0.1323% +0.1758% +0.2137%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
8 (8.00%) high severe
add(0.9) time: [11.666 us 11.671 us 11.676 us]
change: [-3.3882% -3.2702% -3.1398%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
1 (1.00%) high mild
4 (4.00%) high severe
add_checked(0.9) time: [27.665 us 27.675 us 27.686 us]
change: [-97.033% -97.032% -97.031%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
add_scalar(0.9) time: [6.3798 us 6.3827 us 6.3867 us]
change: [+7.7749% +7.8943% +8.0098%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
7 (7.00%) high mild
5 (5.00%) high severe
subtract(0.9) time: [11.760 us 11.765 us 11.770 us]
change: [+13.484% +13.620% +13.752%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) low mild
4 (4.00%) high mild
2 (2.00%) high severe
subtract_checked(0.9) time: [26.822 us 26.834 us 26.848 us]
change: [-97.131% -97.127% -97.124%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
5 (5.00%) high mild
3 (3.00%) high severe
subtract_scalar(0.9) time: [6.3898 us 6.3927 us 6.3960 us]
change: [+7.9864% +8.0589% +8.1266%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
multiply(0.9) time: [11.780 us 11.792 us 11.804 us]
change: [+14.224% +14.359% +14.501%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) low mild
4 (4.00%) high mild
multiply_checked(0.9) time: [28.264 us 28.275 us 28.288 us]
change: [-96.974% -96.968% -96.959%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
multiply_scalar(0.9) time: [6.3647 us 6.3662 us 6.3681 us]
change: [+7.3504% +7.4791% +7.6189%] (p = 0.00 < 0.05)
Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
5 (5.00%) high mild
11 (11.00%) high severe
divide(0.9) time: [11.957 us 11.960 us 11.964 us]
change: [-2.6038% -2.5495% -2.4895%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
divide_checked(0.9) time: [30.427 us 30.438 us 30.450 us]
change: [-96.737% -96.733% -96.726%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) high mild
5 (5.00%) high severe
divide_scalar(0.9) time: [11.153 us 11.155 us 11.159 us]
change: [+0.0128% +0.2851% +0.5665%] (p = 0.01 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
5 (5.00%) high mild
7 (7.00%) high severe
modulo(0.9) time: [54.611 us 54.641 us 54.675 us]
change: [-85.530% -85.520% -85.510%] (p = 0.00 < 0.05)
Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
2 (2.00%) low severe
3 (3.00%) low mild
6 (6.00%) high mild
9 (9.00%) high severe
modulo_scalar(0.9) time: [347.32 us 347.43 us 347.58 us]
change: [-0.4787% -0.4314% -0.3671%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
2 (2.00%) low mild
4 (4.00%) high mild
5 (5.00%) high severe
add(1) time: [9.9205 us 9.9267 us 9.9329 us]
change: [-7.5383% -7.4679% -7.3991%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
5 (5.00%) high mild
add_checked(1) time: [4.6835 us 4.7599 us 4.8586 us]
change: [-99.416% -99.408% -99.398%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) high mild
9 (9.00%) high severe
add_scalar(1) time: [6.4029 us 6.4052 us 6.4078 us]
change: [+1.0775% +1.1667% +1.2739%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
subtract(1) time: [9.9767 us 9.9877 us 9.9982 us]
change: [-11.592% -11.492% -11.395%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
subtract_checked(1) time: [4.7278 us 4.8780 us 5.0544 us]
change: [-99.434% -99.427% -99.419%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) high mild
6 (6.00%) high severe
subtract_scalar(1) time: [6.4270 us 6.4293 us 6.4321 us]
change: [+1.1474% +1.2054% +1.2601%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low mild
2 (2.00%) high mild
3 (3.00%) high severe
multiply(1) time: [10.013 us 10.031 us 10.051 us]
change: [-8.6851% -8.4095% -8.1241%] (p = 0.00 < 0.05)
Performance has improved.
multiply_checked(1) time: [4.7410 us 4.8338 us 4.9462 us]
change: [-99.411% -99.402% -99.392%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) high mild
11 (11.00%) high severe
multiply_scalar(1) time: [6.4224 us 6.4264 us 6.4310 us]
change: [+1.2780% +1.3487% +1.4176%] (p = 0.00 < 0.05)
Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
1 (1.00%) high mild
6 (6.00%) high severe
divide(1) time: [12.043 us 12.048 us 12.053 us]
change: [-0.2281% -0.1755% -0.1296%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
2 (2.00%) high mild
3 (3.00%) high severe
divide_checked(1) time: [4.7782 us 4.8815 us 5.0043 us]
change: [-99.405% -99.394% -99.381%] (p = 0.00 < 0.05)
Performance has improved.
Found 16 outliers among 100 measurements (16.00%)
3 (3.00%) high mild
13 (13.00%) high severe
divide_scalar(1) time: [11.283 us 11.356 us 11.454 us]
change: [+6.4576% +7.8297% +9.3776%] (p = 0.00 < 0.05)
Performance has regressed.
modulo(1) time: [4.7719 us 4.9086 us 5.1167 us]
change: [-98.369% -98.349% -98.324%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) high mild
6 (6.00%) high severe
modulo_scalar(1) time: [217.53 us 217.57 us 217.61 us]
change: [-2.3072% -2.1226% -2.0109%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
4 (4.00%) high mild
3 (3.00%) high severe
```
</details>
So the checked kernels are anywhere from 50% to 99% faster :tada:. The
unchecked kernels regularly vary by ±10%, but we're talking single-digit
microseconds, so this isn't all that surprising.
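
For anyone who wants those percentages as speedup factors, here is a minimal sketch of the arithmetic. It assumes the `change:` figure is the relative change in run time against the previous baseline (negative means faster). The helper name `speedup_from_change` is made up for illustration and is not part of criterion or arrow-rs.

```rust
/// Hypothetical helper: convert a relative change in run time (in percent,
/// negative = faster) into a speedup factor.
/// Assumes new_time = old_time * (1 + change / 100).
fn speedup_from_change(change_percent: f64) -> f64 {
    let ratio = 1.0 + change_percent / 100.0;
    1.0 / ratio
}

fn main() {
    // Midpoint estimates copied from the benchmark output in this comment.
    let samples = [
        ("add_checked(0)", -92.078),
        ("divide_checked(0.1)", -76.827),
        ("add_checked(1)", -99.408),
    ];
    for (name, change) in samples {
        println!("{}: {:.1}x faster", name, speedup_from_change(change));
    }
}
```

Under that reading, a -92% change is roughly a 12.6x speedup, -77% is roughly 4.3x, and -99.4% is roughly 169x.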