On Tue, Jun 23, 2026 at 2:30 PM Matthias van de Meent <[email protected]> wrote:
> > I think that you meant that it can see different TIDs originating from
> > the same updated logical row.
>
> It could access and return the same TID twice, on multiple pages, if
> the TID is recycled between the page accesses. But, as mentioned, at
> most one of the rows indicated by the index' scan mechanism will be
> MVCC-visible for the IndexScan executor node.

That might be true with SnapshotAny, but bringing visibility concerns
into this discussion doesn't seem useful. The relevant invariant is
that the same physical TID cannot appear twice within the same index.
It is useful to think of it as an invariant that the index AM is
directly concerned with (and to ignore visibility stuff, which happens
at a higher level, and shouldn't be of concern to the index AM at
all).

In general, nbtree page deletion (and merging underfull pages)
modifies a physical data structure to improve space efficiency. It
isn't relevant why VACUUM deleted some index tuples (making that free
space available and indirectly triggering deletion/merging).

Using XIDs for BTPageGetDeleteXid is just a convenient (though very
conservative) way to implement "the drain technique". It would still
be correct to implement it differently, provided this alternative
approach also ensures that no backend follows a downlink and ends up
on a wholly unrelated page due to concurrent deletion. We must always
ensure that such a backend at least lands on a page marked deleted/a
tombstone page and then recovers by moving right -- no scan can ever
have an irredeemably bad picture of the tree structure. But we don't
fundamentally need to care about XIDs to make that work -- this is a
physical modification that's orthogonal to logical/transaction
considerations.

> > A non-hot update can create 2 separate TIDs that point to different
> > versions of the same logical row. In that case, both TIDs must be
> > returned to the scan (assuming both have index tuple values that
> > satisfy the scan keys). This doesn't really matter to the index AM; it
> > doesn't know about updates at all.
>
> Except the indexUnchanged flag for aminsert() -which index AMs should
> only use as a hint- but more generally, yes that's right.

That's just a hint used to trigger bottom-up index deletion, it really
isn't relevant.

--
Peter Geoghegan
