Hi, As of today, in the existing code in HeapTupleSatisfiesVacuum, if a row is inserted via an aborted transaction, then it's deemed to be dead immediately and can be cleaned up for re-use by HOT or vacuum, etc.
With logical decoding, there might arise a case that such a row, if it belongs to a system catalog table or even a user catalog table (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=66abc2608c7c00fcd449e00a9e23f13f02e65d04), then it might be being used for decoding at the very same moment that the abort came in. If the row is re-claimed immediately, then it's possible that the decoding happening alongside might face issues. The main issue here is that HeapTupleSatisfiesVacuum *assumes* that rows belonging to an aborted transaction are not visible to anyone else. As mentioned above, that assumption needs to change in the case when logical decoding is active and the row belongs to catalog tables. This was not a problem before, but now there are at least two patches out there which challenge these assumptions. One patch is for logical decoding of 2PC, in which case a PREPARED transaction might be getting decoded and a concurrent ROLLBACK PREPARED can happen alongside. Another patch is about streaming logical changes before commit/abort. The provided patch changes that assumption. We now pass in an additional argument to HeapTupleSatisfiesVacuum() to indicate if the current row belongs to a catalog table in the logical decoding environment. We call the existing RelationIsAccessibleInLogicalDecoding() function to ascertain if it's a catalog table worth considering for this case. So if this is the case, we return HEAPTUPLE_RECENTLY_DEAD if the xmin of the row is newer than OldestXmin. So, for the following case, we now will return HEAPTUPLE_RECENTLY_DEAD when earlier we would have returned HEAPTUPLE_DEAD: BEGIN; ALTER subscribed_table ADD COLUMN; .... ABORT; VACUUM pg_attribute; This is essentially the same logic that is used today for rows that got deleted (via update or delete). We return HEAPTUPLE_RECENTLY_DEAD for such deleted rows if the xmax of the row is newer than OldestXmin because some open transactions can/could still see the tuple. With changes to HeapTupleSatisfiesVacuum, we also had to tweak the logic in heap_prune_chain(). That function again assumes that only Updated tuples can return HEAPTUPLE_RECENTLY_DEAD. That assumption is not true and we check explicitly for the HeapTupleHeaderGetUpdateXid value to be valid before further processing. The patch had to make changes in all locations where HeapTupleSatisfiesVacuum gets called and I have done multiple "make check-world" runs with cassert enabled and things run fine. Also, since it only sets the flag for system/user catalog tables and that too ONLY in the logical decoding environment, it does not cause any performance issues in normal circumstances. Regards, Nikhils -- Nikhil Sontakke http://www.2ndQuadrant.com/ PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
HeapTupleSatisfiesVacuum_Logical_Decoding_26_12_17.patch
Description: Binary data