On Wed, Dec 3, 2025 at 3:22 PM Chao Li <[email protected]> wrote:
> I played with this again today and found an optimization that seems to 
> dramatically improve the performance:
>
> ```
> +static void
> +radix_sort_tuple(SortTuple *begin, size_t n_elems, int level, Tuplesortstate *state)
> +{
> +       RadixPartitionInfo partitions[256] = {0};
> +       uint8_t         remaining_partitions[256] = {0};
> ```
>
> Here partitions and remaining_partitions are just temporary buffers;
> allocating them on the stack and initializing them seems slow. Passing them
> in as function parameters is much faster. See attached diff for my change.

The lesson here is: you can make it as fast as you like if you
accidentally blow away the state that we needed for this to work
correctly.
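
To make that concrete, here is a minimal standalone sketch. It is not
the patch's code: PartInfo, digit_at, msd_sort and is_sorted are
hypothetical names, and it sorts plain uint32_t keys with a scratch
array instead of SortTuples in place. The point it tries to show is
that a recursive MSD radix sort still reads its partition table after
each recursive call returns, so every level needs its own copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-partition bookkeeping. */
typedef struct
{
	size_t		offset;			/* start of the partition within the input */
	size_t		count;			/* number of elements in the partition */
} PartInfo;

/* Byte of v used at the given level; level 0 is the most significant. */
static unsigned
digit_at(uint32_t v, int level)
{
	return (v >> (8 * (3 - level))) & 0xFF;
}

/*
 * MSD radix sort of a[0..n), using tmp (at least n elements) as scratch.
 * If "shared" is NULL, each call keeps its partition table on its own
 * stack. If "shared" is non-NULL, every level reuses that one table,
 * which is the kind of change being discussed.
 */
static void
msd_sort(uint32_t *a, uint32_t *tmp, size_t n, int level, PartInfo *shared)
{
	PartInfo	local[256];
	PartInfo   *parts = shared ? shared : local;
	size_t		pos[256];
	size_t		offset = 0;

	assert(level >= 0);
	if (n < 2 || level > 3)
		return;

	/* Count the elements falling into each of the 256 partitions. */
	memset(parts, 0, 256 * sizeof(PartInfo));
	for (size_t i = 0; i < n; i++)
		parts[digit_at(a[i], level)].count++;

	/* Turn the counts into starting offsets. */
	for (int b = 0; b < 256; b++)
	{
		parts[b].offset = offset;
		pos[b] = offset;
		offset += parts[b].count;
	}

	/* Scatter into tmp by digit, then copy back. */
	for (size_t i = 0; i < n; i++)
		tmp[pos[digit_at(a[i], level)]++] = a[i];
	memcpy(a, tmp, n * sizeof(uint32_t));

	/*
	 * Recurse into each partition. parts[] is read again on every
	 * iteration, i.e. after earlier recursive calls have returned. With
	 * a shared table, those calls have already overwritten it.
	 */
	for (int b = 0; b < 256; b++)
	{
		if (parts[b].count > 1)
			msd_sort(a + parts[b].offset, tmp, parts[b].count,
					 level + 1, shared);
	}
}

/* Returns 1 if a[0..n) is in nondecreasing order, else 0. */
static int
is_sorted(const uint32_t *a, size_t n)
{
	for (size_t i = 1; i < n; i++)
	{
		if (a[i - 1] > a[i])
			return 0;
	}
	return 1;
}
```

With shared == NULL, the input {0x02000002, 0x01000002, 0x02000001,
0x01000001} comes back sorted. With a shared table, the recursion into
the first top-level partition overwrites the entry for the second one,
so that partition is never sorted. It does less work, which is why it
looks faster.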

--
John Naylor
Amazon Web Services

