I ran into a few issues with the rust sort kernels and would like to discuss possible solutions.
1. When sorting by multiple columns (lexsort_to_indices) the Float32 and Float64 data types are not supported because the implementation relies on the OrdArray trait. This trait is not implemented because f64/f32 only implements PartialOrd. The sort function for a single column (sort_to_indices) has some special logic which looks like it wants to treats NaN the same as null, but I'm also not convinced this is the correct way. For example postgres does the following (https://www.postgresql.org/docs/12/datatype-numeric.html#DATATYPE-FLOAT) "In order to allow floating-point values to be sorted and used in tree-based indexes, PostgreSQL treats NaN values as equal, and greater than all non-NaN values." I propose to do the same in an OrdArray impl for Float64Array/Float32Array and then simplifying the sort_to_indices function accordingly. 2. Sorting for dictionary encoded strings. The problem here is that DictionaryArray does not have a generic parameter for the value type so it is not currently possible to only implement OrdArray for string dictionaries. Again for the single column case, the value data type could be checked and a sort could be implemented by looking up each key in the dictionary. An optimization could be to check the is_sorted flag of DictionaryArray (which does not seem to be used really) and then directly sort by the keys. For the general case I see roughly to options - Somehow implement an OrdArray view of the dictionary array. This could be easier if OrdArray did not extend Array but was a completely separate trait. - Change the lexicographic sort impl to not use dynamic calls but instead sort multiple times. So for a query `ORDER BY a, b`, first sort by b and afterwards sort again by a. With a stable sort implementation this should result in the same ordering. I'm curious about the performance, it could avoid dynamic method calls for each comparison, but it would process the indices vector multiple times. Looking forward to any other suggestions or feedback. -- Jörn Horstmann | Senior Backend Engineer www.signavio.com Kurfürstenstraße 111, 10787 Berlin, Germany