On Mon, Jul 29, 2019 at 12:11 PM Robert Haas <robertmh...@gmail.com> wrote:
> I find this hard to believe, because an UPDATE can always be broken up
> into a DELETE and an INSERT. If that were to be done, you would not
> have a stable heap TID and you would have a "new HOT chain," or your
> AM's equivalent of that concept. So if we can't handle an UPDATE that
> changes the TID, then we also can't handle a DELETE + INSERT. But
> surely handling that case is a hard requirement for any AM.

I'm not saying you can't handle it. But that necessitates "write amplification", in the sense that you must now create new index tuples even for indexes where the indexed columns were not logically altered. Isn't zheap supposed to fix that problem, at least in version 2 or version 3?

I also think that stable heap TIDs make index-only scans a lot easier and more effective. I think that indexes (or at least B-Tree indexes) will ideally almost always have tuples that are the latest versions with zheap. The exception is tuples whose ghost bit is set, whose visibility varies based on the MVCC snapshot in use. But the instant that the deleting/updating xact commits, it becomes legal to recycle the old heap TID. We don't need to go back to the index to permanently zap the tuple whose ghost bit we already set, because there is an undo pointer in the same leaf page, so nobody is in danger of getting confused and following the now-recycled heap TID.

This ghost bit design owes plenty to 2PL (which will fully remove the index tuple synchronously, rather than just setting a ghost bit). You could say that it's a 2PL/MVCC hybrid, while classic Postgres is "pure" MVCC because it uses explicit row versioning -- it doesn't need to impose restrictions on TID stability. That seems to be why we offer such a large variety of index access methods -- it's relatively straightforward for Postgres to add niche index AMs, such as SP-GiST.

--
Peter Geoghegan
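
To make the write amplification point above concrete, here is a toy model of which indexes need a new tuple after an UPDATE. It is only a sketch: the function name, the index names, and the table layout are all made up for illustration, and none of this is PostgreSQL or zheap code.

```python
# Toy model, not PostgreSQL code: every name here is hypothetical.
def index_inserts_for_update(indexed_cols, changed_cols, tid_is_stable):
    """Return the indexes that need a new index tuple after an UPDATE.

    indexed_cols: dict mapping index name -> set of column names it covers.
    changed_cols: set of column names the UPDATE logically modified.
    tid_is_stable: True if the new row version keeps the old heap TID.
    """
    if tid_is_stable:
        # Only indexes whose key columns changed need a new entry.
        return sorted(name for name, cols in indexed_cols.items()
                      if cols & changed_cols)
    # The TID changed (e.g. DELETE + INSERT), so every index must point
    # at the new TID, even where its key columns are unchanged.
    return sorted(indexed_cols)


indexes = {"t_pkey": {"id"}, "t_email_idx": {"email"}, "t_created_idx": {"created"}}
stable = index_inserts_for_update(indexes, {"email"}, tid_is_stable=True)
unstable = index_inserts_for_update(indexes, {"email"}, tid_is_stable=False)
print(stable)    # ['t_email_idx']
print(unstable)  # ['t_created_idx', 't_email_idx', 't_pkey']
```

With a stable TID, only the index on the changed column is touched. Once the TID moves, all three are, and that difference is the amplification.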