cyb70289 commented on pull request #9604:
URL: https://github.com/apache/arrow/pull/9604#issuecomment-787804449


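   Besides better accuracy, pairwise summation also improves performance 
significantly in the common case where nulls are rare. In the numbers below:
   - big improvement (about 1.1x to 2.3x faster, largest for double) for 0.01% and 0% null count
   - moderate improvement (about 1.04x to 1.25x faster) for 1% null count
   - big drop (throughput falls to about 0.56x to 0.76x of the previous value) for 10% and 50% null count, because the runs of contiguous valid data are short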
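
For readers unfamiliar with the technique, here is a minimal Python sketch of pairwise summation applied to runs of contiguous valid values. It is illustrative only and is not the Arrow C++ kernel: the function names `pairwise_sum` and `sum_valid_runs`, the base-case size `BLOCK = 8`, and the plain boolean validity list are all assumptions made for this example.

```python
import math

BLOCK = 8  # hypothetical base-case size; ranges this short are summed naively


def pairwise_sum(values, lo=0, hi=None):
    """Sum values[lo:hi] by recursively splitting the range in half."""
    if hi is None:
        hi = len(values)
    n = hi - lo
    if n <= BLOCK:
        total = 0.0
        for i in range(lo, hi):
            total += values[i]
        return total
    mid = lo + n // 2
    return pairwise_sum(values, lo, mid) + pairwise_sum(values, mid, hi)


def sum_valid_runs(values, valid):
    """Pairwise-sum each contiguous run of valid entries, then add the run totals."""
    total = 0.0
    i, n = 0, len(values)
    while i < n:
        if not valid[i]:
            i += 1  # skip a null entry
            continue
        j = i
        while j < n and valid[j]:
            j += 1  # extend the run to the next null (or the end)
        total += pairwise_sum(values, i, j)
        i = j
    return total


# Compare rounding error against a correctly rounded reference sum.
data = [0.1] * 1_000_000
reference = math.fsum(data)
naive = 0.0
for x in data:
    naive += x
naive_error = abs(naive - reference)
pairwise_error = abs(pairwise_sum(data) - reference)
```

In this sketch a dense array is handled as one long run, so almost all of the work happens in the tight base-case loop. When nulls are frequent, each run is only a few elements long and the per-run bookkeeping dominates, which is consistent with the slowdown reported for 10% and 50% null counts.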
   
   Benchmark results on Skylake with clang-9 (Int32 benchmarks removed, as 
they are not touched by this change):
   
   ```
   Before
   ---------------------------------------------------------------------------------------------
   Benchmark                                   Time             CPU   Iterations UserCounters...
   ---------------------------------------------------------------------------------------------
   VarianceKernelInt64/1048576/10000         201 us          201 us         3479 bytes_per_second=4.85858G/s null_percent=0.01 size=1048.58k
   VarianceKernelInt64/1048576/100           235 us          235 us         2983 bytes_per_second=4.15903G/s null_percent=1 size=1048.58k
   VarianceKernelInt64/1048576/10            515 us          515 us         1362 bytes_per_second=1.89616G/s null_percent=10 size=1048.58k
   VarianceKernelInt64/1048576/2             826 us          825 us          848 bytes_per_second=1.18306G/s null_percent=50 size=1048.58k
   VarianceKernelInt64/1048576/1           0.866 us        0.866 us       806079 bytes_per_second=1.10133T/s null_percent=100 size=1048.58k
   VarianceKernelInt64/1048576/0             186 us          186 us         3766 bytes_per_second=5.23785G/s null_percent=0 size=1048.58k

   VarianceKernelFloat/1048576/10000         549 us          549 us         1274 bytes_per_second=1.77922G/s null_percent=0.01 size=1048.58k
   VarianceKernelFloat/1048576/100           579 us          579 us         1208 bytes_per_second=1.68558G/s null_percent=1 size=1048.58k
   VarianceKernelFloat/1048576/10           1040 us         1040 us          673 bytes_per_second=961.64M/s null_percent=10 size=1048.58k
   VarianceKernelFloat/1048576/2            1650 us         1650 us          424 bytes_per_second=606.079M/s null_percent=50 size=1048.58k
   VarianceKernelFloat/1048576/1           0.876 us        0.876 us       775800 bytes_per_second=1115.15G/s null_percent=100 size=1048.58k
   VarianceKernelFloat/1048576/0             541 us          541 us         1294 bytes_per_second=1.80506G/s null_percent=0 size=1048.58k

   VarianceKernelDouble/1048576/10000        275 us          275 us         2543 bytes_per_second=3.54668G/s null_percent=0.01 size=1048.58k
   VarianceKernelDouble/1048576/100          277 us          277 us         2530 bytes_per_second=3.52828G/s null_percent=1 size=1048.58k
   VarianceKernelDouble/1048576/10           500 us          500 us         1397 bytes_per_second=1.95315G/s null_percent=10 size=1048.58k
   VarianceKernelDouble/1048576/2            793 us          793 us          884 bytes_per_second=1.23201G/s null_percent=50 size=1048.58k
   VarianceKernelDouble/1048576/1          0.894 us        0.894 us       765871 bytes_per_second=1092.23G/s null_percent=100 size=1048.58k
   VarianceKernelDouble/1048576/0            271 us          271 us         2576 bytes_per_second=3.59923G/s null_percent=0 size=1048.58k


   After
   ---------------------------------------------------------------------------------------------
   Benchmark                                   Time             CPU   Iterations UserCounters...
   ---------------------------------------------------------------------------------------------
   VarianceKernelInt64/1048576/10000         138 us          138 us         5082 bytes_per_second=7.07395G/s null_percent=0.01 size=1048.58k
   VarianceKernelInt64/1048576/100           226 us          226 us         2858 bytes_per_second=4.32355G/s null_percent=1 size=1048.58k
   VarianceKernelInt64/1048576/10            679 us          679 us         1032 bytes_per_second=1.4386G/s null_percent=10 size=1048.58k
   VarianceKernelInt64/1048576/2            1121 us         1121 us          625 bytes_per_second=891.928M/s null_percent=50 size=1048.58k
   VarianceKernelInt64/1048576/1           0.835 us        0.835 us       815343 bytes_per_second=1.14166T/s null_percent=100 size=1048.58k
   VarianceKernelInt64/1048576/0             165 us          165 us         5278 bytes_per_second=5.91169G/s null_percent=0 size=1048.58k

   VarianceKernelFloat/1048576/10000         333 us          333 us         2103 bytes_per_second=2.93102G/s null_percent=0.01 size=1048.58k
   VarianceKernelFloat/1048576/100           550 us          550 us         1272 bytes_per_second=1.77448G/s null_percent=1 size=1048.58k
   VarianceKernelFloat/1048576/10           1699 us         1699 us          412 bytes_per_second=588.556M/s null_percent=10 size=1048.58k
   VarianceKernelFloat/1048576/2            2788 us         2788 us          251 bytes_per_second=358.73M/s null_percent=50 size=1048.58k
   VarianceKernelFloat/1048576/1           0.878 us        0.878 us       831430 bytes_per_second=1112.44G/s null_percent=100 size=1048.58k
   VarianceKernelFloat/1048576/0             327 us          327 us         2164 bytes_per_second=2.98677G/s null_percent=0 size=1048.58k

   VarianceKernelDouble/1048576/10000        121 us          121 us         5785 bytes_per_second=8.05841G/s null_percent=0.01 size=1048.58k
   VarianceKernelDouble/1048576/100          222 us          222 us         3145 bytes_per_second=4.40277G/s null_percent=1 size=1048.58k
   VarianceKernelDouble/1048576/10           789 us          789 us          887 bytes_per_second=1.23705G/s null_percent=10 size=1048.58k
   VarianceKernelDouble/1048576/2           1428 us         1428 us          490 bytes_per_second=700.099M/s null_percent=50 size=1048.58k
   VarianceKernelDouble/1048576/1          0.853 us        0.853 us       827831 bytes_per_second=1.11828T/s null_percent=100 size=1048.58k
   VarianceKernelDouble/1048576/0            140 us          140 us         5986 bytes_per_second=6.97951G/s null_percent=0 size=1048.58k
   ```
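
To make the before/after comparison easier to read, the following Python sketch computes the speedup ratio (time before divided by time after) for each kernel and null percentage. The times are copied from the benchmark output above; the dictionary layout and variable names are assumptions made for this example, and the 100% null case is left out because it finishes in under a microsecond in both runs.

```python
# Wall-clock times in microseconds, copied from the benchmark output.
# Inner keys are null percentages.
before = {
    "Int64": {0: 186, 0.01: 201, 1: 235, 10: 515, 50: 826},
    "Float": {0: 541, 0.01: 549, 1: 579, 10: 1040, 50: 1650},
    "Double": {0: 271, 0.01: 275, 1: 277, 10: 500, 50: 793},
}
after = {
    "Int64": {0: 165, 0.01: 138, 1: 226, 10: 679, 50: 1121},
    "Float": {0: 327, 0.01: 333, 1: 550, 10: 1699, 50: 2788},
    "Double": {0: 140, 0.01: 121, 1: 222, 10: 789, 50: 1428},
}

# A ratio above 1.0 means the new code is faster; below 1.0 means slower.
speedup = {
    kernel: {pct: before[kernel][pct] / after[kernel][pct] for pct in times}
    for kernel, times in before.items()
}

for kernel, ratios in speedup.items():
    print(kernel, {pct: round(r, 2) for pct, r in ratios.items()})
```

Ratios above 1.0 appear at 0%, 0.01%, and 1% nulls, and ratios below 1.0 at 10% and 50% nulls, matching the summary at the top of the comment.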

