Re: [HACKERS] Issues with logical replication

2018-01-03 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Will push this shortly after lunch.

Pushed.  Will you (Konstantin, Stas, Masahiko) please verify that after
this commit all the problems reported with logical replication are
fixed?

Thanks

-- 
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Issues with logical replication

2018-01-03 Thread Alvaro Herrera
Stas Kelvich wrote:

> Seems that having a busy loop is the best idea out of the several discussed.
> 
> I thought about a small sleep at the bottom of that loop if we reach the topmost
> transaction, but taking into account the low probability of that event, it may be
> faster to just busy-wait.

In other locations where we do similar things we have 1ms sleeps.  I
agree with the need for a comment here.

Proposed patch attached.  I tried your reload.pgb test case in 9.4
(after changing pgoutput to test_decoding and removing the 3rd arg to a
function call) and the crash takes about 3 seconds without the patch on
my machine.  No crash with this patch.

Will push this shortly after lunch.

-- 
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From abca9e98d1ddf86e892bd23bd24878516225a2d1 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 3 Jan 2018 12:36:47 -0300
Subject: [PATCH] Make XactLockTableWait work for transactions that are not yet
 self-locked
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

XactLockTableWait assumed that its xid argument has already added itself
to the lock table.  That assumption led to another assumption that if
locking the xid has succeeded but the xid is reported as still in
progress, then the input xid must have been a subtransaction.

These assumptions hold true for the original uses of this code in
locking related to on-disk tuples, but they break down in logical
replication slot snapshot building -- in particular, when a logged
standby snapshot contains an xid that's already in ProcArray but not yet
in the lock table.  This leads to assertion failures that can be
reproduced all the way back to 9.4, when logical decoding was
introduced.

The new coding does not assume that the xid is a subtransaction anymore,
and will retry the lock with the same toplevel xid it was given before
(courtesy of SubTransGetTopmostTransaction()).

For consistency, change ConditionalXactLockTableWait the same way.

Author: Petr Jelínek
Discussion: 
https://postgr.es/m/1b3e32d8-fcf4-40b4-aef9-5c0e3ac57...@postgrespro.ru
Reported-by: Konstantin Knizhnik
Diagnosed-by: Stas Kelvich, Petr Jelínek
Reviewed-by: Andres Freund, Robert Haas
---
 src/backend/storage/lmgr/lmgr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index 3754283d2b..c184f1f956 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -590,7 +590,7 @@ XactLockTableWait(TransactionId xid, Relation rel, ItemPointer ctid,
 
if (!TransactionIdIsInProgress(xid))
break;
-   xid = SubTransGetParent(xid);
+   xid = SubTransGetTopmostTransaction(xid);
}
 
if (oper != XLTW_None)
@@ -622,7 +622,7 @@ ConditionalXactLockTableWait(TransactionId xid)
 
if (!TransactionIdIsInProgress(xid))
break;
-   xid = SubTransGetParent(xid);
+   xid = SubTransGetTopmostTransaction(xid);
}
 
return true;
-- 
2.11.0
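As an illustrative aside on why the one-line substitution in the patch suffices: the behavioral difference between the two subtrans calls can be sketched in a toy model. The Python below only mimics the contract of the named C functions; the parent map and xid values are invented for illustration.

```python
# Toy model of the subtrans lookup difference (illustrative only; these
# Python functions merely mimic the contract of the C functions named).
INVALID_XID = 0

# Parent map: only subtransactions have entries; 110 -> 105 -> 100 (toplevel).
PARENTS = {105: 100, 110: 105}

def sub_trans_get_parent(xid):
    # For a toplevel xid there is no parent entry, so this returns
    # InvalidTransactionId -- which the old loop then asserted against.
    return PARENTS.get(xid, INVALID_XID)

def sub_trans_get_topmost_transaction(xid):
    # Walks the parent chain; for a toplevel xid it returns the input
    # unchanged, so the caller can simply retry the lock on the same xid.
    while xid in PARENTS:
        xid = PARENTS[xid]
    return xid

print(sub_trans_get_parent(100))               # 0: invalid for a toplevel xid
print(sub_trans_get_topmost_transaction(100))  # 100: same xid comes back
print(sub_trans_get_topmost_transaction(110))  # 100: subxact resolves to top
```

With the old call, an in-progress toplevel xid was mapped to InvalidTransactionId before the next loop iteration; with the new call the loop simply sees the same xid again and retries the lock.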



Re: [HACKERS] Issues with logical replication

2017-12-01 Thread Petr Jelinek
On 30/11/17 11:48, Simon Riggs wrote:
> On 30 November 2017 at 11:30, Petr Jelinek wrote:
>> On 30/11/17 00:47, Andres Freund wrote:
>>> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote:
>>>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live
>>>> with it, but the problematic logic happens inside the
>>>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of
>>>> detecting the situation short of reimplementing the
>>>> XactLockTableInsert() instead of calling it.
>>>
>>> Right. But we fairly trivially can change that. I'm remarking on it
>>> because other people's, not yours, suggestions aimed at making this
>>> bulletproof. I just wanted to make clear that I don't think that's
>>> necessary at all.
>>>
>>
>> Okay, then I guess we are in agreement. I can confirm that the attached
>> fixes the issue in my tests. Using SubTransGetTopmostTransaction()
>> instead of SubTransGetParent() means 3 more ifs in terms of extra CPU
>> cost for other callers. I don't think it's worth worrying about given that
>> we are waiting for a heavyweight lock, but if we did, we could just inline
>> the code directly into SnapBuildWaitSnapshot().
> 
> This will still fail an Assert in TransactionIdIsInProgress() when
> snapshots are overflowed.
> 

Hmm, which one, why?

I see 2 Asserts there, one is:
>   Assert(nxids == 0);
Which is inside the RecoveryInProgress() branch; surely on standbys there
will still be no PGXACTs with assigned xids, so that should be fine.

The other one is:
>   Assert(TransactionIdIsValid(topxid));
Which should again be fine, since the toplevel xid of a toplevel xid is
the same xid, which is a valid one.

So I think we should be fine there.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Issues with logical replication

2017-11-30 Thread Simon Riggs
On 30 November 2017 at 11:30, Petr Jelinek  wrote:
> On 30/11/17 00:47, Andres Freund wrote:
>> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote:
>>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live
>>> with it, but the problematic logic happens inside the
>>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of
>>> detecting the situation short of reimplementing the
>>> XactLockTableInsert() instead of calling it.
>>
>> Right. But we fairly trivially can change that. I'm remarking on it
>> because other people's, not yours, suggestions aimed at making this
>> bulletproof. I just wanted to make clear that I don't think that's
>> necessary at all.
>>
>
> Okay, then I guess we are in agreement. I can confirm that the attached
> fixes the issue in my tests. Using SubTransGetTopmostTransaction()
> instead of SubTransGetParent() means 3 more ifs in terms of extra CPU
> cost for other callers. I don't think it's worth worrying about given that
> we are waiting for a heavyweight lock, but if we did, we could just inline
> the code directly into SnapBuildWaitSnapshot().

This will still fail an Assert in TransactionIdIsInProgress() when
snapshots are overflowed.

-- 
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Issues with logical replication

2017-11-29 Thread Andres Freund
On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote:
> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live
> with it, but the problematic logic happens inside the
> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of
> detecting the situation short of reimplementing the
> XactLockTableInsert() instead of calling it.

Right. But we fairly trivially can change that. I'm remarking on it
because other people's, not yours, suggestions aimed at making this
bulletproof. I just wanted to make clear that I don't think that's
necessary at all.

Greetings,

Andres Freund



Re: [HACKERS] Issues with logical replication

2017-11-29 Thread Petr Jelinek
On 30/11/17 00:40, Andres Freund wrote:
> On 2017-11-30 00:25:58 +0100, Petr Jelinek wrote:
>> Yes that helps thanks. Now that I reproduced it I understand, I was
>> confused by the backtrace that said xid was 0 on the input to
>> XactLockTableWait() but that's not the case, it's what xid is changed to
>> in the inner loop.
> 
>> So what happens is that we manage to do LogStandbySnapshot(), decode the
>> logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in
>> between GetNewTransactionId() and XactLockTableInsert() calls in
>> AssignTransactionId() for that same transaction.
>>
>> I guess the probability of this happening is increased by the fact that
>> GetRunningTransactionData() acquires XidGenLock so if there is
>> GetNewTransactionId() running in parallel it will wait for it to finish
>> and we'll log immediately after that.
>>
>> Hmm that means that Robert's loop idea will not help and ProcArrayLock
>> will not save us either. Maybe we could either rewrite XactLockTableWait
>> or create another version of it with SubTransGetParent() call replaced
>> by SubTransGetTopmostTransaction() as that will return the same top
>> level xid in case the input xid wasn't a subxact. That would make it
>> safe to be called on transactions that didn't acquire lock on themselves
>> yet.
> 
> I've not really looked into this deeply, but afair we can just make this
> code accept that edge case and be done with it. As the comment says:
> 
>  * Iterate through xids in record, wait for all older than the cutoff to
>  * finish.  Then, if possible, log a new xl_running_xacts record.
>  *
> --- highlight ---
>  * This isn't required for the correctness of decoding, but to:
> --- highlight ---
>  * a) allow isolationtester to notice that we're currently waiting for
>  *  something.
>  * b) log a new xl_running_xacts record where it'd be helpful, without having
>  *  to wait for bgwriter or checkpointer.
> 

I don't understand. I mean, sure, SnapBuildWaitSnapshot() can live
with it, but the problematic logic happens inside
XactLockTableWait(), and SnapBuildWaitSnapshot() has no way of
detecting the situation short of reimplementing
XactLockTableWait() instead of calling it.

The problem is that SubTransGetParent() returns InvalidTransactionId
when the race happens, and SubTransGetParent() is called because
XactLockTableWait() assumes that if the transaction lock was acquired and
the xid is still in progress, the input must have been the xid of a
subtransaction. This is why I suggested replacing that call with
SubTransGetTopmostTransaction(), which will just return the same xid it
got on input, and we'll simply retry the lock instead of crashing.
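A toy re-enactment of that failure mode (illustrative Python only; the dicts below stand in for pg_subtrans and the procarray, and the xid values are invented):

```python
# Toy re-enactment of the loop in XactLockTableWait (illustrative only;
# not PostgreSQL code -- the names just mirror the C functions discussed).
INVALID_XID = 0
PARENTS = {}                # empty: xid 7 is a toplevel transaction
in_progress = {7}           # running, but has not self-locked yet

def old_advance(xid):       # SubTransGetParent-style: invalid for toplevel
    return PARENTS.get(xid, INVALID_XID)

def new_advance(xid):       # SubTransGetTopmostTransaction-style: same xid
    while xid in PARENTS:
        xid = PARENTS[xid]
    return xid

def wait_loop(xid, advance, max_iters=3):
    for _ in range(max_iters):
        assert xid != INVALID_XID          # the Assert that fired
        # LockAcquire/LockRelease would go here; in the race window they
        # succeed immediately because no lock table entry exists yet.
        if xid not in in_progress:
            return "done"
        xid = advance(xid)
    return "still waiting"

try:
    wait_loop(7, old_advance)
    old_result = "no crash"
except AssertionError:
    old_result = "assertion failure"

print(old_result)                  # the old coding trips the Assert
print(wait_loop(7, new_advance))   # the new coding just keeps retrying
```

The old advance maps the in-progress toplevel xid to 0 and the next iteration's Assert fires; the new advance hands back the same xid, so the loop retries harmlessly until the transaction finishes.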

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Issues with logical replication

2017-11-29 Thread Andres Freund
On 2017-11-30 00:25:58 +0100, Petr Jelinek wrote:
> Yes that helps thanks. Now that I reproduced it I understand, I was
> confused by the backtrace that said xid was 0 on the input to
> XactLockTableWait() but that's not the case, it's what xid is changed to
> in the inner loop.

> So what happens is that we manage to do LogStandbySnapshot(), decode the
> logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in
> between GetNewTransactionId() and XactLockTableInsert() calls in
> AssignTransactionId() for that same transaction.
> 
> I guess the probability of this happening is increased by the fact that
> GetRunningTransactionData() acquires XidGenLock so if there is
> GetNewTransactionId() running in parallel it will wait for it to finish
> and we'll log immediately after that.
> 
> Hmm that means that Robert's loop idea will not help and ProcArrayLock
> will not save us either. Maybe we could either rewrite XactLockTableWait
> or create another version of it with SubTransGetParent() call replaced
> by SubTransGetTopmostTransaction() as that will return the same top
> level xid in case the input xid wasn't a subxact. That would make it
> safe to be called on transactions that didn't acquire lock on themselves
> yet.

I've not really looked into this deeply, but afair we can just make this
code accept that edge case and be done with it. As the comment says:

 * Iterate through xids in record, wait for all older than the cutoff to
 * finish.  Then, if possible, log a new xl_running_xacts record.
 *
--- highlight ---
 * This isn't required for the correctness of decoding, but to:
--- highlight ---
 * a) allow isolationtester to notice that we're currently waiting for
 *something.
 * b) log a new xl_running_xacts record where it'd be helpful, without having
 *    to wait for bgwriter or checkpointer.


Greetings,

Andres Freund



Re: [HACKERS] Issues with logical replication

2017-11-29 Thread Petr Jelinek
On 29/11/17 20:11, Stas Kelvich wrote:
> 
>> On 29 Nov 2017, at 18:46, Petr Jelinek  wrote:
>>
>> What I don't understand is how it leads to a crash (and I could not
>> reproduce it using the pgbench file attached in this thread either) and
>> moreover how it leads to a 0 xid being logged. The only explanation I can
>> come up with is that some kind of similar race has to be in
>> LogStandbySnapshot(), but we explicitly check for a 0 xid value there.
>>
> 
> Zero xid isn’t logged. The loop in XactLockTableWait() does the following:
> 
> for (;;)
> {
> Assert(TransactionIdIsValid(xid));
> Assert(!TransactionIdEquals(xid, GetTopTransactionIdIfAny()));
> 
>   <...>
>   
> xid = SubTransGetParent(xid);
> }
> 
> So if the last statement is reached for a top transaction, then the next
> iteration will crash in the first assert. And it will be reached if this
> whole loop happens before the transaction acquired its heavyweight lock.
> 
> The probability of that crash can be significantly increased by adding a
> sleep between xid generation and lock insertion in AssignTransactionId().
> 

Yes, that helps, thanks. Now that I reproduced it I understand; I was
confused by the backtrace that said xid was 0 on the input to
XactLockTableWait(), but that's not the case: it's what xid is changed to
in the inner loop.

So what happens is that we manage to do LogStandbySnapshot(), decode the
logged snapshot, and run SnapBuildWaitSnapshot() for a transaction in
between GetNewTransactionId() and XactLockTableInsert() calls in
AssignTransactionId() for that same transaction.

I guess the probability of this happening is increased by the fact that
GetRunningTransactionData() acquires XidGenLock, so if there is a
GetNewTransactionId() running in parallel it will wait for it to finish
and we'll log immediately after that.

Hmm, that means that Robert's loop idea will not help and ProcArrayLock
will not save us either. Maybe we could either rewrite XactLockTableWait
or create another version of it with the SubTransGetParent() call replaced
by SubTransGetTopmostTransaction(), as that will return the same top-level
xid in case the input xid wasn't a subxact. That would make it safe to
call on transactions that haven't acquired a lock on themselves yet.
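The interleaving described above (waiter runs between GetNewTransactionId() and XactLockTableInsert()) can be re-enacted deterministically in a toy two-thread model. This is illustrative Python only: events replace real timing, and the names only loosely mirror the PostgreSQL calls involved.

```python
# Deterministic toy reproduction of the race window (illustrative only).
import threading

lock_table = {}            # xid -> marker, filled by "XactLockTableInsert"
in_progress = set()        # what "TransactionIdIsInProgress" would consult
xid_visible = threading.Event()
waiter_checked = threading.Event()

def backend(xid):
    in_progress.add(xid)               # GetNewTransactionId: xid visible
    xid_visible.set()
    waiter_checked.wait()              # hold the race window open
    lock_table[xid] = "self-lock"      # XactLockTableInsert (too late)
    in_progress.discard(xid)           # commit

def waiter(xid, out):
    xid_visible.wait()
    # XactLockTableWait: "lock" the xid -- no entry exists yet, so the
    # wait returns instantly even though the xact is still in progress.
    if xid not in lock_table and xid in in_progress:
        out.append("raced")            # in-progress xid, no lock entry
    else:
        out.append("ok")
    waiter_checked.set()

out = []
b = threading.Thread(target=backend, args=(5225,))
w = threading.Thread(target=waiter, args=(5225, out))
b.start(); w.start(); b.join(); w.join()
print(out[0])   # "raced": the waiter observed the window
```

The events force exactly the ordering seen in Stas's log: the xid becomes visible, the waiter's lock attempt succeeds instantly, and only afterwards does the backend self-lock.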

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Issues with logical replication

2017-11-29 Thread Petr Jelinek
Hi,

(sorry for not being active here, I am still catching up after being
away for some family issues)

On 16/11/17 21:12, Robert Haas wrote:
> On Thu, Nov 16, 2017 at 2:41 PM, Andres Freund  wrote:
>>> To me, it seems like SnapBuildWaitSnapshot() is fundamentally
>>> misdesigned
>>
>> Maybe I'm confused, but why is it fundamentally misdesigned? It's not
>> such an absurd idea to wait for an xid in a WAL record.  I get that
>> there's a race condition here, which obviously bad, but I don't really
>> see as evidence of the above claim.
>>
>> I actually think this code used to be safe because ProcArrayLock used to
>> be held while generating and logging the running snapshots record.  That
>> was removed when fixing some other bug, but perhaps that shouldn't have
>> been done...
> 
> OK.  Well, I might be overstating the case.  My comment about
> fundamental misdesign was really just based on the assumption that
> XactLockTableWait() could be used to wait for an XID the instant it
> was generated.  That was never gonna work and there's no obvious clean
> workaround for the problem.  Getting snapshot building to work
> properly seems to be Hard (TM).
> 

No kidding, we've been at it since 9.4.

But anyway, while it's true that we have a race condition because
XactLockTableWait() finishes too early, all that should mean is that we
call LogStandbySnapshot() too early, and as a result taking the snapshot
(and hence slot creation) will take longer, as we'll wait for the next
call of LogStandbySnapshot() from some other caller (because the tx we
care about will still be running). The whole SnapBuildWaitSnapshot() is
just an optimization to make slot creation take less time (and also to
be able to write tests).

What I don't understand is how it leads to a crash (and I could not
reproduce it using the pgbench file attached in this thread either) and
moreover how it leads to a 0 xid being logged. The only explanation I can
come up with is that some kind of similar race has to be in
LogStandbySnapshot(), but we explicitly check for a 0 xid value there.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Issues with logical replication

2017-11-21 Thread Simon Riggs
On 4 October 2017 at 10:35, Petr Jelinek  wrote:
> On 02/10/17 18:59, Petr Jelinek wrote:
>>>
>>> Now fix the trigger function:
>>> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
>>> BEGIN
>>>   RETURN NEW;
>>> END $$ LANGUAGE plpgsql;
>>>
>>> And manually perform at master two updates inside one transaction:
>>>
>>> postgres=# begin;
>>> BEGIN
>>> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
>>> UPDATE 1
>>> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
>>> UPDATE 1
>>> postgres=# commit;
>>> 
>>>
>>> and in replica log we see:
>>> 2017-10-02 18:40:26.094 MSK [2954] LOG:  logical replication apply
>>> worker for subscription "sub" has started
>>> 2017-10-02 18:40:26.101 MSK [2954] ERROR:  attempted to lock invisible
>>> tuple
>>> 2017-10-02 18:40:26.102 MSK [2882] LOG:  worker process: logical
>>> replication worker for subscription 16403 (PID 2954) exited with exit
>>> code 1
>>>
>>> Error happens in trigger.c:
>>>
>>> #3  0x0069bddb in GetTupleForTrigger (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
>>> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at
>>> trigger.c:3103
>>> #4  0x0069b259 in ExecBRUpdateTriggers (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,
>>> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
>>> #5  0x006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)
>>> at execReplication.c:461
>>> #6  0x00820894 in apply_handle_update (s=0x7ffc442163b0) at
>>> worker.c:736
>>
>> We have locked the same tuple in RelationFindReplTupleByIndex() just
>> before this gets called and didn't get the same error. I guess we do
>> something wrong with snapshots. Will need to investigate more.
>>
>
> Okay, so it's not the snapshot. It's the fact that we don't set
> es_output_cid in the replication worker, which GetTupleForTrigger uses
> when locking the tuple. The attached one-liner fixes it.

Applied, thanks

-- 
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Issues with logical replication

2017-11-21 Thread Craig Ringer
On 4 October 2017 at 07:35, Petr Jelinek wrote:

> On 02/10/17 18:59, Petr Jelinek wrote:
> >>
> >> Now fix the trigger function:
> >> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER
> AS $$
> >> BEGIN
> >>   RETURN NEW;
> >> END $$ LANGUAGE plpgsql;
> >>
> >> And manually perform at master two updates inside one transaction:
> >>
> >> postgres=# begin;
> >> BEGIN
> >> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
> >> UPDATE 1
> >> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
> >> UPDATE 1
> >> postgres=# commit;
> >> 
> >>
> >> and in replica log we see:
> >> 2017-10-02 18:40:26.094 MSK [2954] LOG:  logical replication apply
> >> worker for subscription "sub" has started
> >> 2017-10-02 18:40:26.101 MSK [2954] ERROR:  attempted to lock invisible
> >> tuple
> >> 2017-10-02 18:40:26.102 MSK [2882] LOG:  worker process: logical
> >> replication worker for subscription 16403 (PID 2954) exited with exit
> >> code 1
> >>
> >> Error happens in trigger.c:
> >>
> >> #3  0x0069bddb in GetTupleForTrigger (estate=0x2e36b50,
> >> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
> >> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at
> >> trigger.c:3103
> >> #4  0x0069b259 in ExecBRUpdateTriggers (estate=0x2e36b50,
> >> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,
> >> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
> >> #5  0x006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50,
> >> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)
> >> at execReplication.c:461
> >> #6  0x00820894 in apply_handle_update (s=0x7ffc442163b0) at
> >> worker.c:736
> >
> > We have locked the same tuple in RelationFindReplTupleByIndex() just
> > before this gets called and didn't get the same error. I guess we do
> > something wrong with snapshots. Will need to investigate more.
> >
>
> > Okay, so it's not the snapshot. It's the fact that we don't set
> > es_output_cid in the replication worker, which GetTupleForTrigger uses
> > when locking the tuple. The attached one-liner fixes it.
>

This seems like a clear-cut bug with a simple fix.

Let's get this committed so we don't lose it. The rest of the thread is
going off into the weeds a bit on issues unrelated to the original problem.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Issues with logical replication

2017-11-16 Thread Robert Haas
On Wed, Nov 15, 2017 at 8:20 PM, Stas Kelvich  wrote:
> I did a sketch of the first approach just to confirm that it solves the
> problem. But there I hold ProcArrayLock during the update of the flag.
> Since the only reader is GetRunningTransactionData, it's possible to have
> a custom lock there. In this case GetRunningTransactionData will hold
> three locks simultaneously, since it already holds ProcArrayLock and
> XidGenLock =)

To me, it seems like SnapBuildWaitSnapshot() is fundamentally
misdesigned, and ideally Petr (who wrote the patch) or Andres (who
committed it) ought to get involved here and help fix this problem.
My own first inclination would be to rewrite this as a loop: if the
transaction ID precedes the oldest running XID, then continue; else if
TransactionIdDidCommit() || TransactionIdDidAbort() then conclude that
we don't need to wait; else XactLockTableWait() then loop.  That way,
if you hit the race condition, you'll just busy-wait instead of doing
the wrong thing.  Maybe insert a sleep(1) if we retry more than once.
That sucks, of course, but it seems like a better idea than trying to
redesign XactLockTableWait() or the procarray, which could affect an
awful lot of other things.
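A hedged sketch of the loop Robert describes (Python pseudocode only; the callables passed in stand in for TransactionIdDidCommit, TransactionIdDidAbort, XactLockTableWait, and TransactionIdIsInProgress, all of which are assumptions made for illustration, not real bindings):

```python
# Illustrative busy-wait retry loop, per the design sketched above.
import time

def wait_for_xid(xid, oldest_running_xid, did_commit, did_abort,
                 xact_lock_table_wait, is_in_progress):
    attempts = 0
    while True:
        if xid < oldest_running_xid:
            return "older than cutoff"        # nothing to wait for
        if did_commit(xid) or did_abort(xid):
            return "already finished"         # no wait needed
        xact_lock_table_wait(xid)             # may return too early (race)
        if not is_in_progress(xid):
            return "finished after wait"
        attempts += 1                         # raced: busy-wait and retry
        if attempts > 1:
            time.sleep(0.001)                 # the suggested sleep on retries

# Simulated xact that stays "in progress" for the first two checks:
checks = {"n": 0}
def fake_in_progress(xid):
    checks["n"] += 1
    return checks["n"] < 3

result = wait_for_xid(10, 5, lambda x: False, lambda x: False,
                      lambda x: None, fake_in_progress)
print(result)        # "finished after wait", after two busy retries
```

The point of the structure is exactly what the paragraph says: when the race is hit, the loop degrades to a brief busy-wait (with a small sleep after the first retry) instead of misclassifying the xid.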

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Issues with logical replication

2017-11-16 Thread Stas Kelvich

> On 15 Nov 2017, at 23:09, Robert Haas  wrote:
> 
> Ouch.  This seems like a bug that needs to be fixed, but do you think
> it's related to Petr's proposed fix to set es_output_cid?  That fix
> looks reasonable, since we shouldn't try to lock tuples without a
> valid CommandId.
> 
> Now, having said that, I understand how the lack of that fix could cause:
> 
> 2017-10-02 18:40:26.101 MSK [2954] ERROR:  attempted to lock invisible tuple
> 
> But I do not understand how it could cause:
> 
> #3  0x0086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0,
> oper=XLTW_None) at lmgr.c:582
> 

I think that’s two separate bugs. The second one is due to a race between
XactLockTableWait() and the acquisition of the heavyweight transaction
lock in AssignTransactionId(). XactLockTableWait() assumes that the
transaction has already set its hw transaction lock, but with the current
machinery of RunningXacts and snapshot building we can start such waiting
before the transaction gets its hw lock.

I can see two ways out of this and both of them are quite unattractive:

1. Create a flag in pgproc indicating that the hw lock is acquired, and
include in RunningXacts only transactions with this flag set.
That's probably quite expensive, since we need to hold one more exclusive
lwlock in AssignTransactionId.

2. Rewrite XactLockTableWait() in a way that makes it aware of the fact
that a valid xid can be a topmost transaction that does not yet have a hw
lock entry. In such a situation we can just wait and try to lock again. A
good stop condition in this loop is TransactionIdIsInProgress(xid)
becoming false.

I did a sketch of the first approach just to confirm that it solves the
problem. But there I hold ProcArrayLock during the update of the flag.
Since the only reader is GetRunningTransactionData, it's possible to have
a custom lock there. In this case GetRunningTransactionData will hold
three locks simultaneously, since it already holds ProcArrayLock and
XidGenLock =)

Any better ideas?

--
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



xltw_fix.diff
Description: Binary data


Re: [HACKERS] Issues with logical replication

2017-11-15 Thread Robert Haas
On Mon, Oct 9, 2017 at 9:19 PM, Stas Kelvich  wrote:
> I investigated this case and it seems that XactLockTableWait() in
> SnapBuildWaitSnapshot() does not always work as expected.
> XactLockTableWait() waits on LockAcquire() for the xid to be completed,
> and if we finally got this lock but the transaction is still in progress,
> then such an xid is assumed to be a subxid. However, a
> LockAcquire/LockRelease cycle can happen after the transaction sets its
> xid, but before XactLockTableInsert().
>
> Namely following history happened for xid 5225 and lead to crash:
>
> [backend] LOG:  AssignTransactionId: XactTopTransactionId = 5225
>[walsender] LOG:  LogCurrentRunningXacts: Wrote RUNNING_XACTS xctn=1, xid[0]=5225
>[walsender] LOG:  XactLockTableWait: LockAcquire 5225
>[walsender] LOG:  XactLockTableWait: LockRelease 5225
> [backend] LOG:  AssignTransactionId: LockAcquire ExclusiveLock 5225
>[walsender] LOG:  TransactionIdIsInProgress: SVC->latestCompletedXid=5224 < xid=5225 => true
> [backend] LOG:  CommitTransaction: ProcArrayEndTransaction xid=5225, ipw=0
> [backend] LOG:  CommitTransaction: ResourceOwnerRelease locks xid=5225

Ouch.  This seems like a bug that needs to be fixed, but do you think
it's related to Petr's proposed fix to set es_output_cid?  That fix
looks reasonable, since we shouldn't try to lock tuples without a
valid CommandId.

Now, having said that, I understand how the lack of that fix could cause:

2017-10-02 18:40:26.101 MSK [2954] ERROR:  attempted to lock invisible tuple

But I do not understand how it could cause:

#3  0x0086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0,
oper=XLTW_None) at lmgr.c:582

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company