jhorstmann commented on PR #7824: URL: https://github.com/apache/arrow-rs/pull/7824#issuecomment-3028829209
Benchmark results from a local run with `-Ctarget-cpu=native`:

```
write_batch primitive/4096 values primitive
        time:   [476.56 µs 477.29 µs 478.26 µs]
        thrpt:  [367.86 MiB/s 368.61 MiB/s 369.18 MiB/s]
    change:
        time:   [-28.207% -27.412% -26.910%] (p = 0.00 < 0.05)
        thrpt:  [+36.817% +37.763% +39.289%]
        Performance has improved.

write_batch primitive/4096 values primitive with bloom filter
        time:   [4.6939 ms 4.7261 ms 4.7606 ms]
        thrpt:  [36.956 MiB/s 37.226 MiB/s 37.482 MiB/s]
    change:
        time:   [-2.5791% -1.8121% -1.0755%] (p = 0.00 < 0.05)
        thrpt:  [+1.0872% +1.8456% +2.6474%]
        Performance has improved.

write_batch primitive/4096 values primitive non-null
        time:   [387.15 µs 389.86 µs 395.46 µs]
        thrpt:  [436.24 MiB/s 442.50 MiB/s 445.61 MiB/s]
    change:
        time:   [-13.168% -12.894% -12.420%] (p = 0.00 < 0.05)
        thrpt:  [+14.182% +14.802% +15.165%]
        Performance has improved.

write_batch primitive/4096 values primitive non-null with bloom filter
        time:   [4.6790 ms 4.6942 ms 4.7115 ms]
        thrpt:  [36.616 MiB/s 36.751 MiB/s 36.870 MiB/s]
    change:
        time:   [-2.2927% -1.8226% -1.3822%] (p = 0.00 < 0.05)
        thrpt:  [+1.4015% +1.8564% +2.3465%]
        Performance has improved.

write_batch primitive/4096 values bool
        time:   [34.476 µs 34.581 µs 34.717 µs]
        thrpt:  [30.547 MiB/s 30.667 MiB/s 30.761 MiB/s]
    change:
        time:   [-41.518% -40.820% -40.108%] (p = 0.00 < 0.05)
        thrpt:  [+66.966% +68.977% +70.994%]
        Performance has improved.

write_batch primitive/4096 values bool non-null
        time:   [20.286 µs 20.328 µs 20.380 µs]
        thrpt:  [28.077 MiB/s 28.148 MiB/s 28.207 MiB/s]
    change:
        time:   [-28.555% -28.463% -28.362%] (p = 0.00 < 0.05)
        thrpt:  [+39.591% +39.788% +39.967%]
        Performance has improved.

write_batch primitive/4096 values string
        time:   [512.18 µs 512.45 µs 512.75 µs]
        thrpt:  [3.9009 GiB/s 3.9032 GiB/s 3.9053 GiB/s]
    change:
        time:   [-5.8849% -5.7654% -5.6374%] (p = 0.00 < 0.05)
        thrpt:  [+5.9742% +6.1181% +6.2528%]
        Performance has improved.

write_batch primitive/4096 values string with bloom filter
        time:   [990.06 µs 991.34 µs 992.86 µs]
        thrpt:  [2.0146 GiB/s 2.0177 GiB/s 2.0203 GiB/s]
    change:
        time:   [-4.0814% -3.2002% -2.4531%] (p = 0.00 < 0.05)
        thrpt:  [+2.5148% +3.3060% +4.2551%]
        Performance has improved.

write_batch primitive/4096 values string #2
        time:   [225.11 µs 225.57 µs 226.14 µs]
        thrpt:  [558.08 MiB/s 559.49 MiB/s 560.65 MiB/s]
    change:
        time:   [-13.134% -12.724% -12.312%] (p = 0.00 < 0.05)
        thrpt:  [+14.041% +14.580% +15.119%]
        Performance has improved.

write_batch primitive/4096 values string with bloom filter #2
        time:   [542.18 µs 542.61 µs 543.16 µs]
        thrpt:  [232.35 MiB/s 232.59 MiB/s 232.77 MiB/s]
    change:
        time:   [-4.4430% -4.1789% -3.9231%] (p = 0.00 < 0.05)
        thrpt:  [+4.0833% +4.3612% +4.6496%]
        Performance has improved.

write_batch primitive/4096 values string dictionary
        time:   [263.02 µs 263.21 µs 263.44 µs]
        thrpt:  [3.8264 GiB/s 3.8298 GiB/s 3.8325 GiB/s]
    change:
        time:   [-12.989% -11.939% -11.008%] (p = 0.00 < 0.05)
        thrpt:  [+12.370% +13.558% +14.928%]
        Performance has improved.

write_batch primitive/4096 values string dictionary with bloom filter
        time:   [502.98 µs 503.29 µs 503.64 µs]
        thrpt:  [2.0015 GiB/s 2.0029 GiB/s 2.0041 GiB/s]
    change:
        time:   [-5.9241% -5.7646% -5.6083%] (p = 0.00 < 0.05)
        thrpt:  [+5.9415% +6.1172% +6.2971%]
        Performance has improved.

write_batch primitive/4096 values string non-null
        time:   [689.09 µs 690.48 µs 692.29 µs]
        thrpt:  [2.8879 GiB/s 2.8954 GiB/s 2.9013 GiB/s]
    change:
        time:   [-1.6423% -1.1567% -0.5306%] (p = 0.00 < 0.05)
        thrpt:  [+0.5335% +1.1702% +1.6698%]
        Change within noise threshold.

write_batch primitive/4096 values string non-null with bloom filter
        time:   [1.2297 ms 1.2313 ms 1.2334 ms]
        thrpt:  [1.6209 GiB/s 1.6236 GiB/s 1.6258 GiB/s]
    change:
        time:   [-1.8332% -1.5425% -1.2486%] (p = 0.00 < 0.05)
        thrpt:  [+1.2644% +1.5667% +1.8674%]
        Performance has improved.

write_batch primitive/4096 values float with NaNs
        time:   [316.17 µs 317.04 µs 318.12 µs]
        thrpt:  [172.77 MiB/s 173.36 MiB/s 173.84 MiB/s]
    change:
        time:   [-1.5436% -0.5349% +0.5982%] (p = 0.34 > 0.05)
        thrpt:  [-0.5946% +0.5377% +1.5678%]
        No change in performance detected.

write_batch nested/4096 values primitive list
        time:   [1.1048 ms 1.1141 ms 1.1251 ms]
        thrpt:  [1.8504 GiB/s 1.8687 GiB/s 1.8843 GiB/s]
    change:
        time:   [-12.662% -12.001% -11.382%] (p = 0.00 < 0.05)
        thrpt:  [+12.844% +13.638% +14.497%]
        Performance has improved.

write_batch nested/4096 values primitive list non-null
        time:   [1.1968 ms 1.1984 ms 1.2002 ms]
        thrpt:  [1.7310 GiB/s 1.7336 GiB/s 1.7358 GiB/s]
    change:
        time:   [-2.9200% -2.6182% -2.3050%] (p = 0.00 < 0.05)
        thrpt:  [+2.3594% +2.6886% +3.0078%]
        Performance has improved.
```

Very nice improvements for primitives, minor improvements once bloom filters or string types are involved.

I think the last two benchmarks are named incorrectly, they actually write 3 columns of types int32, bool and utf8.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org