Hi, On 01/19/2018 03:34 PM, Simon Riggs wrote: > On 26 December 2017 at 14:21, Nikhil Sontakke <nikh...@2ndquadrant.com> wrote: > >> With logical decoding, there might arise a case that such a row, if it >> belongs to a system catalog table or even a user catalog table >> (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=66abc2608c7c00fcd449e00a9e23f13f02e65d04), >> then it might be being used for decoding at the very same moment that >> the abort came in. If the row is re-claimed immediately, then it's >> possible that the decoding happening alongside might face issues. > > If we want to make this change, I'd like to see something that > explains exactly how the problem in logical decoding occurs, or > preferably a test case. I can't understand the problem by reading the > archives. I'm not suggesting this patch doesn't work, I'm thinking > about whether there are other ways. > > Surely if you are decoding a transaction and a concurrent abort is > requested then decoding should be interrupted at the next sensible > point. Allowing the two actions to occur without interlock is an > issue, so I suggest we just don't do it, rather than allow it and fix > subsequent breakage, which is what I understand this patch to do. >
I don't quite understand how the interlock would work? Can you elaborate how would that be implemented? Consider that logical decoding (a) is a read-only process, so it can't modify database state (e.g. locking a row in a catalog) (b) is asynchronous process, i.e. it may be happening quite a bit of time after the changes actually happened An example of a problematic transaction was already outlined in Nikhil's post, i.e. it's a transaction that (1) does DDL, e.g. to add a column to the decoded table (2) inserts some data into the table (with the new column) (3) aborts (possibly hours / gigabytes of WAL in the future) Currently, vacuum can just go and cleanup the catalog rows added in (1) because the transaction is aborted (and postgres already knows that), and we do ignore OldestXmin for catalog rows. But if you try decode the changes (and hand them over to the plugin for apply) without waiting for the final commit/abort, that will fail. Because it will read changes from WAL with a column missing in any existing catalog row. Kabooooom! Until now this was not an issue, because the decoding happens at commit time (or more precisely, we do decode the changes incrementally, but the apply only happens on commit). But it's an issue for both Nikhil's and mine patches. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services