On Fri, Apr 30, 2021 at 10:10 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > There are two major reasons why I want variable-width tuple IDs. One
> > is global indexes, where you need as many bits as the AMs implementing
> > the partitions need, plus some extra bits to identify which partition
> > is relevant for a particular tuple. No fixed number of bits that you
> > make available can ever be sufficient here,
>
> I agree that global indexes need more bits, but it doesn't necessarily
> follow that we must have variable-width TIDs. We could for example
> say that "real" TIDs are only 48 bits and index AMs that want to be
> usable as global indexes must be capable of handling 64-bit TIDs,
> leaving 16 bits for partition ID. A more forward-looking definition
> would require global index AMs to store 96 bits (partition OID plus
> 64-bit TID). Either way would be far simpler for every moving part
> involved than going over to full varlena TIDs.
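
For concreteness, the 64-bit layout described in the quoted text (a 48-bit "real" TID plus 16 bits of partition ID) could be sketched roughly as below. All of the names here (GlobalTid, global_tid_pack, and so on) are hypothetical and exist nowhere in PostgreSQL. The only thing assumed from the existing code is that a heap TID is a 32-bit block number plus a 16-bit offset number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch only: none of these names exist in PostgreSQL. */
#define LOCAL_TID_BITS 48
#define LOCAL_TID_MASK ((UINT64_C(1) << LOCAL_TID_BITS) - 1)

/* High 16 bits: partition ID.  Low 48 bits: the "real" TID. */
typedef uint64_t GlobalTid;

/* Fold a 32-bit block number and a 16-bit offset number into 48 bits. */
static inline uint64_t
local_tid_from_block_offset(uint32_t block, uint16_t offset)
{
    return ((uint64_t) block << 16) | offset;
}

/* Combine a partition ID with a 48-bit local TID. */
static inline GlobalTid
global_tid_pack(uint16_t partition_id, uint64_t local_tid)
{
    /* The local TID must fit in the low 48 bits. */
    assert((local_tid & ~LOCAL_TID_MASK) == 0);
    return ((uint64_t) partition_id << LOCAL_TID_BITS) | local_tid;
}

/* Extract the partition ID from the high 16 bits. */
static inline uint16_t
global_tid_partition(GlobalTid gtid)
{
    return (uint16_t) (gtid >> LOCAL_TID_BITS);
}

/* Extract the 48-bit local TID from the low bits. */
static inline uint64_t
global_tid_local(GlobalTid gtid)
{
    return gtid & LOCAL_TID_MASK;
}
```

Note that a layout like this caps the number of partitions at 65536, and every index AM that wants to support global indexes has to understand the packed format.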

The question of how the on-disk format of indexes needs to be changed
to accommodate global indexes seems like an entirely separate question
from how we go about expanding or redefining TIDs. Global indexes
should work by adding an extra column that is somewhat like a TID, and
that may even have its own pg_attribute entry.

It's much more natural to make the partition number a separate column
IMV: nbtree suffix truncation and deduplication can work in about the
same way as before. Plus you'll need to do predicate pushdown using
the partition identifier in some scenarios anyway. You can make the
partition identifier variable-width without imposing the cost and
complexity of variable-width TIDs on index AMs.

I believe that the main reason why there have been so few problems
with any of the nbtree work in the past few releases is that it
avoided certain kinds of special cases. Any special cases in the
on-disk format, and in the space accounting used when choosing a split
point, ought to be avoided at all costs. We can probably afford to add
a lot of complexity to make global indexes work, but it ought to be
confined to cases that actually use global indexes in an obvious way.

--
Peter Geoghegan