Re: [HACKERS] Issues with logical replication
> On 3 Jan 2018, at 23:35, Alvaro Herrera wrote: > > Pushed. Will you (Konstantin, Stas, Masahiko) please verify that after > this commit all the problems reported with logical replication are > fixed? Checked that with and without extra sleep in AssignTransactionId(). In both cases patch works. Thanks! -- Stas Kelvich Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Issues with logical replication
Alvaro Herrera wrote: > Will push this shortly after lunch. Pushed. Will you (Konstantin, Stas, Masahiko) please verify that after this commit all the problems reported with logical replication are fixed? Thanks -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Issues with logical replication
Stas Kelvich wrote: > Seems that having busy loop is the best idea out of several discussed. > > I thought about small sleep at the bottom of that loop if we reached topmost > transaction, but taking into account low probability of that event may be > it is faster to do just busy wait. In other locations where we do similar things we have 1ms sleeps. I agree with the need for a comment here. Proposed patch attached. I tried your reload.pgb test case in 9.4 (after changing pgoutput to test_decoding and removing the 3rd arg to a function call) and the crash takes about 3 seconds without patch in my machine. No crash with this patch. Will push this shortly after lunch. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From abca9e98d1ddf86e892bd23bd24878516225a2d1 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Wed, 3 Jan 2018 12:36:47 -0300 Subject: [PATCH] Make XactLockTableWait work for transactions that are not yet self-locked MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit XactLockTableWait assumed that its xid argument has already added itself to the lock table. That assumption led to another assumption that if locking the xid has succeeded but the xid is reported as still in progress, then the input xid must have been a subtransaction. These assumptions hold true for the original uses of this code in locking related to on-disk tuples, but they break down in logical replication slot snapshot building -- in particular, when a standby snapshot logged contains an xid that's already in ProcArray but not yet in the lock table. This leads to assertion failures that can be reproduced all the way back to 9.4, when logical decoding was introduced. The new coding does not assume that the xid is a subtransaction anymore, and will retry the lock with same toplevel xid it was given before (courtesy of SubTransGetTopmostTransaction()). For consistency, change ConditionalXactLockTableWait the same way. Author: Petr Jelínek Discussion: https://postgr.es/m/1b3e32d8-fcf4-40b4-aef9-5c0e3ac57...@postgrespro.ru Reported-by: Konstantin Knizhnik Diagnosed-by: Stas Kelvich, Petr Jelínek Reviewed-by: Andres Freund, Robert Haas --- src/backend/storage/lmgr/lmgr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c index 3754283d2b..c184f1f956 100644 --- a/src/backend/storage/lmgr/lmgr.c +++ b/src/backend/storage/lmgr/lmgr.c @@ -590,7 +590,7 @@ XactLockTableWait(TransactionId xid, Relation rel, ItemPointer ctid, if (!TransactionIdIsInProgress(xid)) break; - xid = SubTransGetParent(xid); + xid = SubTransGetTopmostTransaction(xid); } if (oper != XLTW_None) @@ -622,7 +622,7 @@ ConditionalXactLockTableWait(TransactionId xid) if (!TransactionIdIsInProgress(xid)) break; - xid = SubTransGetParent(xid); + xid = SubTransGetTopmostTransaction(xid); } return true; -- 2.11.0
Re: [HACKERS] Issues with logical replication
On 30/11/17 11:48, Simon Riggs wrote: > On 30 November 2017 at 11:30, Petr Jelinek > wrote: >> On 30/11/17 00:47, Andres Freund wrote: >>> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: I don't understand. I mean sure the SnapBuildWaitSnapshot() can live with it, but the problematic logic happens inside the XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of detecting the situation short of reimplementing the XactLockTableInsert() instead of calling it. >>> >>> Right. But we fairly trivially can change that. I'm remarking on it >>> because other people's, not yours, suggestions aimed at making this >>> bulletproof. I just wanted to make clear that I don't think that's >>> necessary at all. >>> >> >> Okay, then I guess we are in agreement. I can confirm that the attached >> fixes the issue in my tests. Using SubTransGetTopmostTransaction() >> instead of SubTransGetParent() means 3 more ifs in terms of extra CPU >> cost for other callers. I don't think it's worth worrying about given we >> are waiting for heavyweight lock, but if we did we can just inline the >> code directly into SnapBuildWaitSnapshot(). > > This will still fail an Assert in TransactionIdIsInProgress() when > snapshots are overflowed. > Hmm, which one, why? I see 2 Asserts there, one is: > Assert(nxids == 0); Which is inside the RecoveryInProgress(), surely on standbys there will still be no PGXACTs with assigned xids so that should be fine. The other one is: > Assert(TransactionIdIsValid(topxid)); Which should be again fine toplevel xid of toplevel xid is same xid which is a valid one. So I think we should be fine there. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Issues with logical replication
On 30 November 2017 at 11:30, Petr Jelinek wrote: > On 30/11/17 00:47, Andres Freund wrote: >> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: >>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live >>> with it, but the problematic logic happens inside the >>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of >>> detecting the situation short of reimplementing the >>> XactLockTableInsert() instead of calling it. >> >> Right. But we fairly trivially can change that. I'm remarking on it >> because other people's, not yours, suggestions aimed at making this >> bulletproof. I just wanted to make clear that I don't think that's >> necessary at all. >> > > Okay, then I guess we are in agreement. I can confirm that the attached > fixes the issue in my tests. Using SubTransGetTopmostTransaction() > instead of SubTransGetParent() means 3 more ifs in terms of extra CPU > cost for other callers. I don't think it's worth worrying about given we > are waiting for heavyweight lock, but if we did we can just inline the > code directly into SnapBuildWaitSnapshot(). This will still fail an Assert in TransactionIdIsInProgress() when snapshots are overflowed. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Issues with logical replication
> On 30 Nov 2017, at 03:30, Petr Jelinek wrote: > > On 30/11/17 00:47, Andres Freund wrote: >> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: >>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live >>> with it, but the problematic logic happens inside the >>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of >>> detecting the situation short of reimplementing the >>> XactLockTableInsert() instead of calling it. >> >> Right. But we fairly trivially can change that. I'm remarking on it >> because other people's, not yours, suggestions aimed at making this >> bulletproof. I just wanted to make clear that I don't think that's >> necessary at all. >> > > Okay, then I guess we are in agreement. I can confirm that the attached > fixes the issue in my tests. Using SubTransGetTopmostTransaction() > instead of SubTransGetParent() means 3 more ifs in terms of extra CPU > cost for other callers. I don't think it's worth worrying about given we > are waiting for heavyweight lock, but if we did we can just inline the > code directly into SnapBuildWaitSnapshot(). > > -- > Petr Jelinek http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > <0001-Make-XactLockTableWait-work-for-transactions-that-ar.patch> +1 Seems that having busy loop is the best idea out of several discussed. I thought about small sleep at the bottom of that loop if we reached topmost transaction, but taking into account low probability of that event may be it is faster to do just busy wait. Also some clarifying comment in code would be nice. Stas Kelvich Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Issues with logical replication
On 30/11/17 00:47, Andres Freund wrote: > On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: >> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live >> with it, but the problematic logic happens inside the >> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of >> detecting the situation short of reimplementing the >> XactLockTableInsert() instead of calling it. > > Right. But we fairly trivially can change that. I'm remarking on it > because other people's, not yours, suggestions aimed at making this > bulletproof. I just wanted to make clear that I don't think that's > necessary at all. > Okay, then I guess we are in agreement. I can confirm that the attached fixes the issue in my tests. Using SubTransGetTopmostTransaction() instead of SubTransGetParent() means 3 more ifs in terms of extra CPU cost for other callers. I don't think it's worth worrying about given we are waiting for heavyweight lock, but if we did we can just inline the code directly into SnapBuildWaitSnapshot(). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 7d230d6453e2ac6e4c47b2af989dea62e667fc4b Mon Sep 17 00:00:00 2001 From: Petr Jelinek Date: Thu, 30 Nov 2017 01:18:55 +0100 Subject: [PATCH] Make XactLockTableWait work for transactions that are not yet self locked The XactLockTableWait assumed that the xid given on input has already added itself to the locktable. This led to another assumption that if the lock of the xid has succeeded but the xid is still in progress, the inpud xid must have been subtransaction. These assumtions might not be true in logical slot snapshot building on rare occation where the standby snapshot logged contains xid that has already been added to ProcArray but not to lock table yet. The new coding will just not assume xid is subtransaction anymore and will retry the lock with same toplevel xid it was given before (courtesy of SubTransGetTopmostTransaction()). For consistency reasons, same change is made to ConditionalXactLockTableWait. --- src/backend/storage/lmgr/lmgr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c index da5679b7a3..9ba6ea00b1 100644 --- a/src/backend/storage/lmgr/lmgr.c +++ b/src/backend/storage/lmgr/lmgr.c @@ -590,7 +590,7 @@ XactLockTableWait(TransactionId xid, Relation rel, ItemPointer ctid, if (!TransactionIdIsInProgress(xid)) break; - xid = SubTransGetParent(xid); + xid = SubTransGetTopmostTransaction(xid); } if (oper != XLTW_None) @@ -622,7 +622,7 @@ ConditionalXactLockTableWait(TransactionId xid) if (!TransactionIdIsInProgress(xid)) break; - xid = SubTransGetParent(xid); + xid = SubTransGetTopmostTransaction(xid); } return true; -- 2.14.1
Re: [HACKERS] Issues with logical replication
On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote: > I don't understand. I mean sure the SnapBuildWaitSnapshot() can live > with it, but the problematic logic happens inside the > XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of > detecting the situation short of reimplementing the > XactLockTableInsert() instead of calling it. Right. But we fairly trivially can change that. I'm remarking on it because other people's, not yours, suggestions aimed at making this bulletproof. I just wanted to make clear that I don't think that's necessary at all. Greetings, Andres Freund
Re: [HACKERS] Issues with logical replication
On 30/11/17 00:40, Andres Freund wrote: > On 2017-11-30 00:25:58 +0100, Petr Jelinek wrote: >> Yes that helps thanks. Now that I reproduced it I understand, I was >> confused by the backtrace that said xid was 0 on the input to >> XactLockTableWait() but that's not the case, it's what xid is changed to >> in the inner loop. > >> So what happens is that we manage to do LogStandbySnapshot(), decode the >> logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in >> between GetNewTransactionId() and XactLockTableInsert() calls in >> AssignTransactionId() for that same transaction. >> >> I guess the probability of this happening is increased by the fact that >> GetRunningTransactionData() acquires XidGenLock so if there is >> GetNewTransactionId() running in parallel it will wait for it to finish >> and we'll log immediately after that. >> >> Hmm that means that Robert's loop idea will not help and ProcArrayLock >> will not save us either. Maybe we could either rewrite XactLockTableWait >> or create another version of it with SubTransGetParent() call replaced >> by SubTransGetTopmostTransaction() as that will return the same top >> level xid in case the input xid wasn't a subxact. That would make it >> safe to be called on transactions that didn't acquire lock on themselves >> yet. > > I've not really looked into this deeply, but afair we can just make this > code accept that edgecase be done with it. As the comment says: > > * Iterate through xids in record, wait for all older than the cutoff to > * finish. Then, if possible, log a new xl_running_xacts record. > * > --- highlight --- > * This isn't required for the correctness of decoding, but to: > --- highlight --- > * a) allow isolationtester to notice that we're currently waiting for > * something. > * b) log a new xl_running_xacts record where it'd be helpful, without having > * to write for bgwriter or checkpointer. > I don't understand. I mean sure the SnapBuildWaitSnapshot() can live with it, but the problematic logic happens inside the XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of detecting the situation short of reimplementing the XactLockTableInsert() instead of calling it. The problem is that SubTransGetParent() returns InvalidTransactionId when the race happens and SubTransGetParen() is called because XactLockTableInsert() assumes that if transaction lock was acquired and the xid is still in progress, the input must have been xid of a subtransaction). This is why I suggested replacing that call with SubTransGetTopmostTransaction() which will just return same xid it got on input and we'll simply retry to do the lock instead of crashing. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Issues with logical replication
On 2017-11-30 00:25:58 +0100, Petr Jelinek wrote: > Yes that helps thanks. Now that I reproduced it I understand, I was > confused by the backtrace that said xid was 0 on the input to > XactLockTableWait() but that's not the case, it's what xid is changed to > in the inner loop. > So what happens is that we manage to do LogStandbySnapshot(), decode the > logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in > between GetNewTransactionId() and XactLockTableInsert() calls in > AssignTransactionId() for that same transaction. > > I guess the probability of this happening is increased by the fact that > GetRunningTransactionData() acquires XidGenLock so if there is > GetNewTransactionId() running in parallel it will wait for it to finish > and we'll log immediately after that. > > Hmm that means that Robert's loop idea will not help and ProcArrayLock > will not save us either. Maybe we could either rewrite XactLockTableWait > or create another version of it with SubTransGetParent() call replaced > by SubTransGetTopmostTransaction() as that will return the same top > level xid in case the input xid wasn't a subxact. That would make it > safe to be called on transactions that didn't acquire lock on themselves > yet. I've not really looked into this deeply, but afair we can just make this code accept that edgecase be done with it. As the comment says: * Iterate through xids in record, wait for all older than the cutoff to * finish. Then, if possible, log a new xl_running_xacts record. * --- highlight --- * This isn't required for the correctness of decoding, but to: --- highlight --- * a) allow isolationtester to notice that we're currently waiting for *something. * b) log a new xl_running_xacts record where it'd be helpful, without having *to write for bgwriter or checkpointer. Greetings, Andres Freund
Re: [HACKERS] Issues with logical replication
On 29/11/17 20:11, Stas Kelvich wrote: > >> On 29 Nov 2017, at 18:46, Petr Jelinek wrote: >> >> What I don't understand is how it leads to crash (and I could not >> reproduce it using the pgbench file attached in this thread either) and >> moreover how it leads to 0 xid being logged. The only explanation I can >> come up is that some kind of similar race has to be in >> LogStandbySnapshot() but we explicitly check for 0 xid value there. >> > > Zero xid isn’t logged. Loop in XactLockTableWait() does following: > > for (;;) > { > Assert(TransactionIdIsValid(xid)); > Assert(!TransactionIdEquals(xid, GetTopTransactionIdIfAny())); > > <...> > > xid = SubTransGetParent(xid); > } > > So if last statement is reached for top transaction then next iteration > will crash in first assert. And it will be reached if whole this loop > happens before transaction acquired heavyweight lock. > > Probability of that crash can be significantly increased be adding sleep > between xid generation and lock insertion in AssignTransactionId(). > Yes that helps thanks. Now that I reproduced it I understand, I was confused by the backtrace that said xid was 0 on the input to XactLockTableWait() but that's not the case, it's what xid is changed to in the inner loop. So what happens is that we manage to do LogStandbySnapshot(), decode the logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in between GetNewTransactionId() and XactLockTableInsert() calls in AssignTransactionId() for that same transaction. I guess the probability of this happening is increased by the fact that GetRunningTransactionData() acquires XidGenLock so if there is GetNewTransactionId() running in parallel it will wait for it to finish and we'll log immediately after that. Hmm that means that Robert's loop idea will not help and ProcArrayLock will not save us either. Maybe we could either rewrite XactLockTableWait or create another version of it with SubTransGetParent() call replaced by SubTransGetTopmostTransaction() as that will return the same top level xid in case the input xid wasn't a subxact. That would make it safe to be called on transactions that didn't acquire lock on themselves yet. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Issues with logical replication
> On 29 Nov 2017, at 18:46, Petr Jelinek wrote: > > What I don't understand is how it leads to crash (and I could not > reproduce it using the pgbench file attached in this thread either) and > moreover how it leads to 0 xid being logged. The only explanation I can > come up is that some kind of similar race has to be in > LogStandbySnapshot() but we explicitly check for 0 xid value there. > Zero xid isn’t logged. Loop in XactLockTableWait() does following: for (;;) { Assert(TransactionIdIsValid(xid)); Assert(!TransactionIdEquals(xid, GetTopTransactionIdIfAny())); <...> xid = SubTransGetParent(xid); } So if last statement is reached for top transaction then next iteration will crash in first assert. And it will be reached if whole this loop happens before transaction acquired heavyweight lock. Probability of that crash can be significantly increased be adding sleep between xid generation and lock insertion in AssignTransactionId(). AssignTransactionId.patch Description: Binary data Stas Kelvich Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Issues with logical replication
Hi, (sorry for not being active here, I am still catching up after being away for some family issues) On 16/11/17 21:12, Robert Haas wrote: > On Thu, Nov 16, 2017 at 2:41 PM, Andres Freund wrote: >>> To me, it seems like SnapBuildWaitSnapshot() is fundamentally >>> misdesigned >> >> Maybe I'm confused, but why is it fundamentally misdesigned? It's not >> such an absurd idea to wait for an xid in a WAL record. I get that >> there's a race condition here, which obviously bad, but I don't really >> see as evidence of the above claim. >> >> I actually think this code used to be safe because ProcArrayLock used to >> be held while generating and logging the running snapshots record. That >> was removed when fixing some other bug, but perhaps that shouldn't have >> been done... > > OK. Well, I might be overstating the case. My comment about > fundamental misdesign was really just based on the assumption that > XactLockTableWait() could be used to wait for an XID the instant it > was generated. That was never gonna work and there's no obvious clean > workaround for the problem. Getting snapshot building to work > properly seems to be Hard (TM). > No kidding, we've been at it since 9.4. But anyway, while sure we have race condition because XactLockTableWait() finishes too early, all that should mean is we call LogStandbySnapshot() too early and as a result taking snapshot (and hence slot creation) will take longer as we'll wait for next call of LogStandbySnapshot() from some other caller (because the tx we care about will still be running). The whole SnapBuildWaitSnapshot() is just optimization to make slot creation take less time (and also to be able to write tests). What I don't understand is how it leads to crash (and I could not reproduce it using the pgbench file attached in this thread either) and moreover how it leads to 0 xid being logged. The only explanation I can come up is that some kind of similar race has to be in LogStandbySnapshot() but we explicitly check for 0 xid value there. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Issues with logical replication
On 4 October 2017 at 10:35, Petr Jelinek wrote: > On 02/10/17 18:59, Petr Jelinek wrote: >>> >>> Now fix the trigger function: >>> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$ >>> BEGIN >>> RETURN NEW; >>> END $$ LANGUAGE plpgsql; >>> >>> And manually perform at master two updates inside one transaction: >>> >>> postgres=# begin; >>> BEGIN >>> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26; >>> UPDATE 1 >>> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26; >>> UPDATE 1 >>> postgres=# commit; >>> >>> >>> and in replica log we see: >>> 2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply >>> worker for subscription "sub" has started >>> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible >>> tuple >>> 2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical >>> replication worker for subscription 16403 (PID 2954) exited with exit >>> code 1 >>> >>> Error happens in trigger.c: >>> >>> #3 0x0069bddb in GetTupleForTrigger (estate=0x2e36b50, >>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac, >>> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at >>> trigger.c:3103 >>> #4 0x0069b259 in ExecBRUpdateTriggers (estate=0x2e36b50, >>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac, >>> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748 >>> #5 0x006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50, >>> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240) >>> at execReplication.c:461 >>> #6 0x00820894 in apply_handle_update (s=0x7ffc442163b0) at >>> worker.c:736 >> >> We have locked the same tuple in RelationFindReplTupleByIndex() just >> before this gets called and didn't get the same error. I guess we do >> something wrong with snapshots. Will need to investigate more. >> > > Okay, so it's not snapshot. It's the fact that we don't set the > es_output_cid in replication worker which GetTupleForTrigger is using > when locking the tuple. Attached one-liner fixes it. Applied, thanks -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Issues with logical replication
On 4 October 2017 at 07:35, Petr Jelinek wrote: > On 02/10/17 18:59, Petr Jelinek wrote: > >> > >> Now fix the trigger function: > >> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER > AS $$ > >> BEGIN > >> RETURN NEW; > >> END $$ LANGUAGE plpgsql; > >> > >> And manually perform at master two updates inside one transaction: > >> > >> postgres=# begin; > >> BEGIN > >> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26; > >> UPDATE 1 > >> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26; > >> UPDATE 1 > >> postgres=# commit; > >> > >> > >> and in replica log we see: > >> 2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply > >> worker for subscription "sub" has started > >> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible > >> tuple > >> 2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical > >> replication worker for subscription 16403 (PID 2954) exited with exit > >> code 1 > >> > >> Error happens in trigger.c: > >> > >> #3 0x0069bddb in GetTupleForTrigger (estate=0x2e36b50, > >> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac, > >> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at > >> trigger.c:3103 > >> #4 0x0069b259 in ExecBRUpdateTriggers (estate=0x2e36b50, > >> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac, > >> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748 > >> #5 0x006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50, > >> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240) > >> at execReplication.c:461 > >> #6 0x00820894 in apply_handle_update (s=0x7ffc442163b0) at > >> worker.c:736 > > > > We have locked the same tuple in RelationFindReplTupleByIndex() just > > before this gets called and didn't get the same error. I guess we do > > something wrong with snapshots. Will need to investigate more. > > > > Okay, so it's not snapshot. It's the fact that we don't set the > es_output_cid in replication worker which GetTupleForTrigger is using > when locking the tuple. Attached one-liner fixes it. > This seems like a clear-cut bug with a simple fix. Lets get this committed, so we don't lose it. The rest of the thread is going off into the weeds a bit issues unrelated to the original problem. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Issues with logical replication
On Fri, Nov 17, 2017 at 5:12 AM, Robert Haas wrote: > On Thu, Nov 16, 2017 at 2:41 PM, Andres Freund wrote: >>> To me, it seems like SnapBuildWaitSnapshot() is fundamentally >>> misdesigned >> >> Maybe I'm confused, but why is it fundamentally misdesigned? It's not >> such an absurd idea to wait for an xid in a WAL record. I get that >> there's a race condition here, which obviously bad, but I don't really >> see as evidence of the above claim. >> >> I actually think this code used to be safe because ProcArrayLock used to >> be held while generating and logging the running snapshots record. That >> was removed when fixing some other bug, but perhaps that shouldn't have >> been done... > > OK. Well, I might be overstating the case. My comment about > fundamental misdesign was really just based on the assumption that > XactLockTableWait() could be used to wait for an XID the instant it > was generated. That was never gonna work and there's no obvious clean > workaround for the problem. Getting snapshot building to work > properly seems to be Hard (TM). The patches discussed here deserve tracking, so please note that I have added an entry in the CF app: https://commitfest.postgresql.org/16/1381/ -- Michael
Re: [HACKERS] Issues with logical replication
On Thu, Nov 16, 2017 at 2:41 PM, Andres Freund wrote: >> To me, it seems like SnapBuildWaitSnapshot() is fundamentally >> misdesigned > > Maybe I'm confused, but why is it fundamentally misdesigned? It's not > such an absurd idea to wait for an xid in a WAL record. I get that > there's a race condition here, which obviously bad, but I don't really > see as evidence of the above claim. > > I actually think this code used to be safe because ProcArrayLock used to > be held while generating and logging the running snapshots record. That > was removed when fixing some other bug, but perhaps that shouldn't have > been done... OK. Well, I might be overstating the case. My comment about fundamental misdesign was really just based on the assumption that XactLockTableWait() could be used to wait for an XID the instant it was generated. That was never gonna work and there's no obvious clean workaround for the problem. Getting snapshot building to work properly seems to be Hard (TM). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Issues with logical replication
On 2017-11-16 10:36:40 -0500, Robert Haas wrote: > On Wed, Nov 15, 2017 at 8:20 PM, Stas Kelvich > wrote: > > I did a sketch of first approach just to confirm that it solves the problem. > > But there I hold ProcArrayLock during update of flag. Since only reader is > > GetRunningTransactionData it possible to have a custom lock there. In > > this case GetRunningTransactionData will hold three locks simultaneously, > > since it already holds ProcArrayLock and XidGenLock =) > > To me, it seems like SnapBuildWaitSnapshot() is fundamentally > misdesigned Maybe I'm confused, but why is it fundamentally misdesigned? It's not such an absurd idea to wait for an xid in a WAL record. I get that there's a race condition here, which obviously bad, but I don't really see as evidence of the above claim. I actually think this code used to be safe because ProcArrayLock used to be held while generating and logging the running snapshots record. That was removed when fixing some other bug, but perhaps that shouldn't have been done... > and ideally Petr (who wrote the patch) or Andres (who committed it) > ought to get involved here and help fix this problem. My own first > inclination would be to rewrite this as a loop: if the transaction ID > precedes the oldest running XID, then continue; else if > TransactionIdDidCommit() || TransactionIdDidAbort() then conclude that > we don't need to wait; else XactLockTableWait() then loop. That way, > if you hit the race condition, you'll just busy-wait instead of doing > the wrong thing. Maybe insert a sleep(1) if we retry more than once. > That sucks, of course, but it seems like a better idea than trying to > redesign XactLockTableWait() or the procarray, which could affect an > awful lot of other things. Hm. Thinking about this. Greetings, Andres Freund
Re: [HACKERS] Issues with logical replication
On Wed, Nov 15, 2017 at 8:20 PM, Stas Kelvich wrote: > I did a sketch of first approach just to confirm that it solves the problem. > But there I hold ProcArrayLock during update of flag. Since only reader is > GetRunningTransactionData it possible to have a custom lock there. In > this case GetRunningTransactionData will hold three locks simultaneously, > since it already holds ProcArrayLock and XidGenLock =) To me, it seems like SnapBuildWaitSnapshot() is fundamentally misdesigned, and ideally Petr (who wrote the patch) or Andres (who committed it) ought to get involved here and help fix this problem. My own first inclination would be to rewrite this as a loop: if the transaction ID precedes the oldest running XID, then continue; else if TransactionIdDidCommit() || TransactionIdDidAbort() then conclude that we don't need to wait; else XactLockTableWait() then loop. That way, if you hit the race condition, you'll just busy-wait instead of doing the wrong thing. Maybe insert a sleep(1) if we retry more than once. That sucks, of course, but it seems like a better idea than trying to redesign XactLockTableWait() or the procarray, which could affect an awful lot of other things. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Issues with logical replication
> On 15 Nov 2017, at 23:09, Robert Haas wrote: > > Ouch. This seems like a bug that needs to be fixed, but do you think > it's related to to Petr's proposed fix to set es_output_cid? That fix > looks reasonable, since we shouldn't try to lock tuples without a > valid CommandId. > > Now, having said that, I understand how the lack of that fix could cause: > > 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible tuple > > But I do not understand how it could cause: > > #3 0x0086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0, > oper=XLTW_None) at lmgr.c:582 > I think that’s two separate bugs. Second one is due to race between XactLockTableWait() and acquiring of heavyweight transaction lock in AssignTransactionId(). XactLockTableWait assumes that transaction already set it’s hw transaction lock, but with current machinery of RunningXacts and snapshot building we can start such waiting before transaction gets it hw lock. I can see two ways out of this and both of them are quite unattractive: 1. Create a flag in pgproc indicating that hw lock is acquired and include in RunningXacts only transaction with this flag set. That's probably quite expensive since we need to hold one more exclusive lwlock in AssignTransactionId. 2. Rewrite XactLockTableWait() in a way that it is aware of the fact that valid xid can be topmost transaction but not yet have a hw lock entry. In such situation we can just wait and try to lock again. Good stop condition in this loop is TransactionIdIsInProgress(xid) becoming false. I did a sketch of first approach just to confirm that it solves the problem. But there I hold ProcArrayLock during update of flag. Since only reader is GetRunningTransactionData it possible to have a custom lock there. In this case GetRunningTransactionData will hold three locks simultaneously, since it already holds ProcArrayLock and XidGenLock =) Any better ideas? -- Stas Kelvich Postgres Professional: http://www.postgrespro.com The Russian Postgres Company xltw_fix.diff Description: Binary data
Re: [HACKERS] Issues with logical replication
On Mon, Oct 9, 2017 at 9:19 PM, Stas Kelvich wrote: > I investigated this case and it seems that XactLockTableWait() in > SnapBuildWaitSnapshot() > not always work as expected. XactLockTableWait() waits on LockAcquire() for > xid to be > completed and if we finally got this lock but transaction is still in > progress then such xid > assumed to be a subxid. However LockAcquire/LockRelease cycle can happen > after transaction > set xid, but before XactLockTableInsert(). > > Namely following history happened for xid 5225 and lead to crash: > > [backend] LOG: AssignTransactionId: XactTopTransactionId = 5225 >[walsender] LOG: LogCurrentRunningXacts: Wrote RUNNING_XACTS xctn=1, > xid[0]=5225 >[walsender] LOG: XactLockTableWait: LockAcquire 5225 >[walsender] LOG: XactLockTableWait: LockRelease 5225 > [backend] LOG: AssignTransactionId: LockAcquire ExclusiveLock 5225 >[walsender] LOG: TransactionIdIsInProgress: SVC->latestCompletedXid=5224 > < xid=5225 => true > [backend] LOG: CommitTransaction: ProcArrayEndTransaction xid=5225, ipw=0 > [backend] LOG: CommitTransaction: ResourceOwnerRelease locks xid=5225 Ouch. This seems like a bug that needs to be fixed, but do you think it's related to to Petr's proposed fix to set es_output_cid? That fix looks reasonable, since we shouldn't try to lock tuples without a valid CommandId. Now, having said that, I understand how the lack of that fix could cause: 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible tuple But I do not understand how it could cause: #3 0x0086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0, oper=XLTW_None) at lmgr.c:582 -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company