On Wed, Feb 11, 2026 at 5:25 PM zengman <[email protected]> wrote: > I'm wondering if we should replace `state->memtuples` with `data` in the > `sort_byvalue_datum()` function here. > ``` > if (nulls_first) > { > null_start = state->memtuples;
Sounds good. On Wed, Feb 11, 2026 at 6:47 PM cca5507 <[email protected]> wrote: > +1. Now data == state->memtuples, how about remove the parameter "data" and > "n" and just > get them from "Tuplesortstate"? The parameter list currently looks like a sort function. I'm not sure why you think that's objectionable? > Some comments for v7: > > 1) > > ``` > /* > * Retrieve byte from datum, indexed by 'level': 0 for LSB, 7 for MSB > */ > static inline uint8 > current_byte(Datum key, int level) > { > int shift = (SIZEOF_DATUM - 1 - level) * > BITS_PER_BYTE; > > return (key >> shift) & 0xFF; > } > ``` > > Maybe "0 for MSB, 7 for LSB"? If level == 0, this function will return the > Most Significant Byte. Will fix, thanks. > 2) radix_sort_tuple() > > ``` > size_t end_offset = partitions[*rp].next_offset; > SortTuple *partition_end = begin + end_offset; > ptrdiff_t num_elements = end_offset - start_offset; > ``` > > Why the type of "num_elements" is "ptrdiff_t"? Maybe just "size_t"? That's from upstream. It's pretty rare in our codebase for in-core code, so I'll go ahead and change it. > 3) tuplesort_sort_memtuples() > > ``` > /* > * Do we have the leading column's value or abbreviation in > datum1? > */ > if (state->base.haveDatum1 && state->base.sortKeys) > { > SortSupportData ssup = state->base.sortKeys[0]; > ``` > > I think we should avoid the copy of SortSupportData. We can just use a > pointer. Will do, for consistency. > 4) > > Many places just consider "Datum" as integer, do we need to add a > DatumGetUInt**() for them? It already does that for the case where it specifically needs uint32. For the general case, I don't think so. We can treat a Datum as an unsigned integer and don't care about the actual size or what it contains. There is one exception that I can see: 0002 has a call to pg_leftmost_one_pos64(), so a macro for that would be more appropriate than an implicit cast. Now I'm wondering if 0002 should normalize the datum as it determines the common prefix. That should only matter for int32 where the input has a mix of positive and negative, but it could be worth the additional calculation. I'll hold off on committing 0002 for a bit longer, but 0001 (plus the changes that I've agreed to) is still on schedule. Also, looking at commit 6aebedc38, it's preferred to use "sizeof(Datum)" rather than the vestigial SIZEOF_DATUM, so I'll change that in 0001/2. -- John Naylor Amazon Web Services
