On Wed, Apr 17, 2019 at 4:22 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> As for the general usability argument, I'm not sure --- as we start
> to look at alternate AMs, we might have more use for them. When I first
> saw the functions, I thought maybe they were part of sort acceleration
> for TIDs; evidently they're not (yet), but that seems like another
> possible use-case.

There is also your join-or-to-union patch, which I thought might make
use of this for its TID sort. Maybe it would make sense to put this
infrastructure in tuplesort.c, but probably not.

TIDs are 6 bytes, which, as you once pointed out, is not something that
we have appropriate infrastructure for (there isn't a DatumGet*()
macro, and so on). The encoding scheme (which you originally suggested
as an alternative to my first idea, sort support for item pointers)
works particularly well as these things go -- it was about 3x faster
when everything fit in memory, and faster still with external sorts.
It allowed us to resolve comparisons at the SortTuple level within
tuplesort.c, and also allowed tuplesort.c to use the pass-by-value
datum qsort specialization. It even allowed sorted array entries
(TIDs/int8s) to be fetched without extra pointer chasing -- that can
be a big bottleneck these days.

The encoding scheme is a bit ugly, but I suspect it would be simpler
to stick to the same approach elsewhere than to try to hide all the
details within tuplesort.c, or something like that. Unless we're
willing to treat TIDs as a whole new type of tuple with its own set of
specialized functions in tuplesort.c, which has problems of its own,
it's kind of awkward to do it some other way.

--
Peter Geoghegan
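
For readers following the thread, the encoding idea discussed above can be sketched roughly as follows. This is a minimal standalone illustration, not the PostgreSQL source: the names DemoTid, demo_tid_encode, demo_tid_decode, demo_int64_cmp and demo_sort_encoded are all hypothetical, and it assumes only that a TID is a 32-bit block number plus a 16-bit offset. The point is that the 6-byte value packs into one 8-byte integer whose ordering matches (block, offset) ordering, so a plain pass-by-value integer sort is enough.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 6-byte TID (not PostgreSQL's ItemPointerData). */
typedef struct DemoTid
{
    uint32_t block;     /* heap block number */
    uint16_t offset;    /* item offset within the block */
} DemoTid;

/*
 * Pack the block number into the high bits and the offset into the low
 * 16 bits.  The result uses at most 48 bits, so it is never negative, and
 * integer order equals (block, offset) order.
 */
static int64_t
demo_tid_encode(DemoTid tid)
{
    return (int64_t) (((uint64_t) tid.block << 16) | tid.offset);
}

/* Reverse of demo_tid_encode(). */
static DemoTid
demo_tid_decode(int64_t encoded)
{
    DemoTid tid;

    assert(encoded >= 0);
    tid.block = (uint32_t) ((uint64_t) encoded >> 16);
    tid.offset = (uint16_t) (encoded & 0xFFFF);
    return tid;
}

/* Comparator over plain 8-byte integers; no pointer chasing into tuples. */
static int
demo_int64_cmp(const void *a, const void *b)
{
    int64_t x = *(const int64_t *) a;
    int64_t y = *(const int64_t *) b;

    return (x > y) - (x < y);
}

/* Sort an array of encoded TIDs in place. */
static void
demo_sort_encoded(int64_t *keys, size_t n)
{
    qsort(keys, n, sizeof(int64_t), demo_int64_cmp);
}
```

A caller would encode each TID, sort the integer array, and decode entries as they are read back. Whether a real sort facility would expose this encoding to callers or hide it internally is the design question raised in the message above.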