[ 
https://issues.apache.org/jira/browse/ARROW-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314850#comment-17314850
 ] 

Kirill Lykov commented on ARROW-10899:
--------------------------------------

I added a repository to put there some experiments i've done for the earlier 
added plots: [https://github.com/KirillLykov/int-sort-bmk]

Unfortunately, I couldn't achieve a fast progress on this ticket and since it 
is not my main activity I decided to freeze it on my side.
By fast progress I mean delivering a stable non-comparison-based sorting 
algorithm which is faster than std::stable_sort.
Naiveradix sort which is implemented there is much slower on int64_t as one 
might find by looking into the plots in scripts/imgs.


The last thing that I was trying to do is to modify boost's integer_sort to 
make it stable (as unstable version is really fast).
To simplify experiments with integer_sort I've extracted it in one separate 
file called 
[https://github.com/KirillLykov/int-sort-bmk/blob/master/src/boost_spread_sort.h]
One can find that integer_sort sometimes relies on pdqsort. This can be 
replaced with stable_sort.
A more interesting part of the code which I think makes integer_sort unstable 
is 
[https://github.com/KirillLykov/int-sort-bmk/blob/master/src/boost_spread_sort.h#L210]
I think in-place version can be replaced with non-in-place.

> [C++] Investigate radix sort for integer arrays
> -----------------------------------------------
>
>                 Key: ARROW-10899
>                 URL: https://issues.apache.org/jira/browse/ARROW-10899
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: C++
>            Reporter: Antoine Pitrou
>            Assignee: Kirill Lykov
>            Priority: Major
>         Attachments: Screen Shot 2021-02-09 at 17.48.13.png, Screen Shot 
> 2021-02-10 at 10.58.23.png
>
>
> For integer arrays with a non-tiny range of values, we currently use a stable 
> sort. It may be faster to use a radix sort instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to