[ 
https://issues.apache.org/jira/browse/ARROW-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281864#comment-17281864
 ] 

Kirill Lykov commented on ARROW-10899:
--------------------------------------

1. From the paper "Fast Sort on CPUs and GPUs: A Case for Bandwidth Oblivious 
SIMD Sort" by Satish et al:
They compare, beside other things, performance of radix sort and merge sort 
using intrinsics.
Both implementations are highly optimized for simd architecture and radix has 
many not obvious improvements to make it more cache-friendly.
Radix sort performance depends on the key size because the longer key we have 
the more passes we need to do. Radix is bandwidth-bound starting at 48-bits 
keys.
Further, Satish et al report that merge sort is 1.7X slower than radix sort on 
32- bit keys, but becomes comparable in performance to radix on 64-bit keys and 
1.3X better on 128-bit keys.
Their conclusion is that "bandwidth-oblivious SIMD friendly merge sort, should, 
therefore, be the sorting method of choice for future databases".

2. I haven't found a good opensource radix sort for CPUs. There is one in boost 
called spread sort but it is not actually a radix sort and according to my 
benchmarks for integers it demonstrates exactly the same performance as gcc's 
stable_sort. From description, It might be designed for strings.

> [C++] Investigate radix sort for integer arrays
> -----------------------------------------------
>
>                 Key: ARROW-10899
>                 URL: https://issues.apache.org/jira/browse/ARROW-10899
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: C++
>            Reporter: Antoine Pitrou
>            Assignee: Kirill Lykov
>            Priority: Major
>
> For integer arrays with a non-tiny range of values, we currently use a stable 
> sort. It may be faster to use a radix sort instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to