Re: [HACKERS] More race conditions in logical replication

2017-08-12 Thread Michael Paquier
On Sun, Aug 13, 2017 at 10:25 AM, Noah Misch  wrote:
> Committed.  These counts broke three times in the v10 release cycle.  It's too
> bad this defect doesn't cause an error when building the docs.

That's another argument for generating the table dynamically. Thanks
for the commit.
-- 
Michael




Re: [HACKERS] More race conditions in logical replication

2017-08-12 Thread Noah Misch
On Sat, Aug 12, 2017 at 05:38:29PM +0900, Michael Paquier wrote:
> On Thu, Aug 10, 2017 at 4:22 PM, Michael Paquier
>  wrote:
> > On Tue, Aug 8, 2017 at 8:11 PM, Alvaro Herrera  
> > wrote:
>> Here's a patch.  It turned out to be a bit larger than I initially expected.
> >
> > Álvaro, 030273b7 did not get things completely right. A couple of wait
> > events were added in the docs for the Client, IPC and Activity sections,
> > but the commit forgot to update the morerows counts. Attached is a patch
> > to fix all that.
> 
> As the documentation format is weird because of this commit, I am
> adding an open item dedicated to that. Look for example at
> WalSenderWaitForWAL in
> https://www.postgresql.org/docs/devel/static/monitoring-stats.html.
> While looking again, I have noticed that one line in the LWLock section
> is weird, so attached is an updated patch.

Committed.  These counts broke three times in the v10 release cycle.  It's too
bad this defect doesn't cause an error when building the docs.




Re: [HACKERS] More race conditions in logical replication

2017-08-12 Thread Michael Paquier
On Thu, Aug 10, 2017 at 4:22 PM, Michael Paquier
 wrote:
> On Tue, Aug 8, 2017 at 8:11 PM, Alvaro Herrera  
> wrote:
>> Here's a patch.  It turned out to be a bit larger than I initially expected.
>
> Álvaro, 030273b7 did not get things completely right. A couple of wait
> events were added in the docs for the Client, IPC and Activity sections,
> but the commit forgot to update the morerows counts. Attached is a patch
> to fix all that.

As the documentation format is weird because of this commit, I am
adding an open item dedicated to that. Look for example at
WalSenderWaitForWAL in
https://www.postgresql.org/docs/devel/static/monitoring-stats.html.
While looking again, I have noticed that one line in the LWLock section
is weird, so attached is an updated patch.

Thanks,
-- 
Michael
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 12d5628266..5575c2c837 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -845,7 +845,7 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
 
   

-LWLock
+LWLock
 ShmemIndexLock
 Waiting to find or allocate space in shared memory.

@@ -1155,7 +1155,7 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
  Waiting to acquire a pin on a buffer.
 
 
- Activity
+ Activity
  ArchiverMain
  Waiting in main loop of the archiver process.
 
@@ -1212,7 +1212,7 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
  Waiting in main loop of WAL writer process.
 
 
- Client
+ Client
  ClientRead
  Waiting to read data from the client.
 
@@ -1250,7 +1250,7 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
  Waiting in an extension.
 
 
- IPC
+ IPC
  BgWorkerShutdown
  Waiting for background worker to shut down.
 



Re: [HACKERS] More race conditions in logical replication

2017-08-10 Thread Michael Paquier
On Tue, Aug 8, 2017 at 8:11 PM, Alvaro Herrera  wrote:
> Robert Haas wrote:
>> On Mon, Aug 7, 2017 at 8:14 PM, Alvaro Herrera  
>> wrote:
>> > BTW, I noticed that the PG_WAIT_LOCK value that we're using for wait
>> > event here (and in the replication slot case) is bogus.  We probably
>> > need something new here.
>>
>> Yeah, if you're adding a new wait point, you should add and document a
>> new constant in the appropriate section, probably something under
>> PG_WAIT_IPC in this case.
>
> Here's a patch.  It turned out to be a bit larger than I initially expected.

Álvaro, 030273b7 did not get things completely right. A couple of wait
events were added in the docs for the Client, IPC and Activity sections,
but the commit forgot to update the morerows counts. Attached is a patch
to fix all that.
-- 
Michael


docfix-wait-event-rows.patch
Description: Binary data



Re: [HACKERS] More race conditions in logical replication

2017-08-08 Thread Alvaro Herrera
Robert Haas wrote:
> On Mon, Aug 7, 2017 at 8:14 PM, Alvaro Herrera  
> wrote:
> > BTW, I noticed that the PG_WAIT_LOCK value that we're using for wait
> > event here (and in the replication slot case) is bogus.  We probably
> > need something new here.
> 
> Yeah, if you're adding a new wait point, you should add and document a
> new constant in the appropriate section, probably something under
> PG_WAIT_IPC in this case.

Here's a patch.  It turned out to be a bit larger than I initially expected.

Wait events are a maintainability fail IMO.  I think we need to rethink
this stuff; using generated files from a single source containing the C
symbol, text name and doc blurb sounds better.  That can wait for pg11
though.
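
Roughly, something along these lines -- an untested sketch just to
illustrate the idea; the macro and helper names here are made up for
illustration, not existing symbols:

/*
 * Untested sketch: keep each wait event's C symbol, display name and doc
 * blurb in one X-macro list, and generate the enum, the name lookup (and,
 * with a small script, the SGML table rows) from that single source.
 */
#define SKETCH_WAIT_EVENT_IPC_LIST(WE) \
    WE(SKETCH_WE_BGWORKER_SHUTDOWN, "BgWorkerShutdown", \
       "Waiting for background worker to shut down.") \
    WE(SKETCH_WE_REPLICATION_SLOT_DROP, "ReplicationSlotDrop", \
       "Waiting for a replication slot to become inactive to be dropped.")

/* the enum... */
#define WE_ENUM(sym, name, doc) sym,
typedef enum SketchWaitEventIPC
{
    SKETCH_WAIT_EVENT_IPC_LIST(WE_ENUM)
} SketchWaitEventIPC;
#undef WE_ENUM

/* ...and the display-name lookup, generated from the same list */
#define WE_NAME(sym, name, doc) case sym: return name;
static const char *
sketch_wait_event_ipc_name(SketchWaitEventIPC w)
{
    switch (w)
    {
        SKETCH_WAIT_EVENT_IPC_LIST(WE_NAME)
    }
    return "???";
}
#undef WE_NAME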

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From e050ef66956935e6dba41fc18e11632ff9f0068f Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 8 Aug 2017 13:33:47 -0400
Subject: [PATCH] Fix various inadequacies in wait events

In commit 9915de6c1cb2, we introduced a new wait point for replication
slots and incorrectly labelled it as wait event PG_WAIT_LOCK.  That's
wrong, so invent an appropriate new wait event instead, and document it
properly.

While at it, fix numerous other problems in the vicinity:
- two different walreceiver wait events were being mixed up in a single
  wait event; split it out so that they can be distinguished, and
  document the new symbols properly.

- ParallelBitmapPopulate was documented but didn't exist.

- ParallelBitmapScan was not documented

- Logical replication wait events weren't documented

- various symbols had been added in dartboard order in various places.
  Put them back in alphabetical order, as was originally intended.
---
 doc/src/sgml/monitoring.sgml   | 32 +--
 src/backend/postmaster/pgstat.c| 36 +-
 .../libpqwalreceiver/libpqwalreceiver.c|  4 +--
 src/backend/replication/slot.c |  3 +-
 src/include/pgstat.h   | 16 +-
 5 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index be3dc672bc..eb20c9c543 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1176,6 +1176,14 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
  Waiting in main loop of checkpointer process.
 
 
+ LogicalLauncherMain
+ Waiting in main loop of logical launcher process.
+
+
+ LogicalApplyMain
+ Waiting in main loop of logical apply process.
+
+
  PgStatMain
  Waiting in main loop of the statistics collector 
process.
 
@@ -1213,6 +1221,14 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
  Waiting to write data from the client.
 
 
+ LibPQWalReceiverConnect
+ Waiting in WAL receiver to establish connection to remote 
server.
+
+
+ LibPQWalReceiverReceive
+ Waiting in WAL receiver to receive data from remote 
server.
+
+
  SSLOpenServer
  Waiting for SSL while attempting connection.
 
@@ -1251,6 +1267,14 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
  Waiting for activity from child process when executing 
Gather node.
 
 
+ LogicalSyncData
+ Waiting for logical replication remote server to send data for 
initial table synchronization.
+
+
+ LogicalSyncStateChange
+ Waiting for logical replication remote server to change 
state.
+
+
  MessageQueueInternal
  Waiting for other process to be attached in shared message 
queue.
 
@@ -1271,14 +1295,18 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
  Waiting for parallel workers to finish computing.
 
 
- ParallelBitmapPopulate
- Waiting for the leader to populate the TidBitmap.
+ ParallelBitmapScan
+ Waiting for parallel bitmap scan to become initialized.
 
 
  ProcArrayGroupUpdate
  Waiting for group leader to clear transaction id at 
transaction end.
 
 
+ ReplicationSlotDrop
+ Waiting for a replication slot to become inactive to be 
dropped.
+
+
  SafeSnapshot
  Waiting for a snapshot for a READ ONLY DEFERRABLE 
transaction.
 
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index a0b0eecbd5..3f5fb796a5 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ 

Re: [HACKERS] More race conditions in logical replication

2017-08-07 Thread Robert Haas
On Mon, Aug 7, 2017 at 8:14 PM, Alvaro Herrera  wrote:
> BTW, I noticed that the PG_WAIT_LOCK value that we're using for wait
> event here (and in the replication slot case) is bogus.  We probably
> need something new here.

Yeah, if you're adding a new wait point, you should add and document a
new constant in the appropriate section, probably something under
PG_WAIT_IPC in this case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] More race conditions in logical replication

2017-08-07 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Alvaro Herrera wrote:

> > I just noticed that Jacana failed again in the subscription test, and it
> > looks like it's related to this.  I'll take a look tomorrow if no one
> > beats me to it.
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2017-07-26%2014%3A39%3A54
> 
> Ahh, I misread the message.  This one is actually about the replication
> *origin* being still active, not the replication *slot*.  I suppose
> origins need the same treatment as we just did for slots.  Anybody want
> to volunteer a patch?

So here's a patch.  I was able to reproduce the issue locally by adding
a couple of sleeps, just like Tom did in the case of replication slots.
With this patch it seems the problem is solved.  The mechanism is the
same one Petr used for replication slots -- if an origin is busy, sleep
on the CV until someone else signals us, and each time the acquirer PID
changes, broadcast a signal.  Like before, this is likely to be a
problem in older branches too, but we have no CVs there to backpatch
this with.
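
To make the shape clear, the drop path does roughly this (simplified
sketch, not the code from the attached patch; the wait event here is just
a placeholder -- see below about PG_WAIT_LOCK):

restart:
    LWLockAcquire(ReplicationOriginLock, LW_EXCLUSIVE);
    if (state->acquired_by != 0)
    {
        if (nowait)
            ereport(ERROR,
                    (errcode(ERRCODE_OBJECT_IN_USE),
                     errmsg("could not drop replication origin with OID %d, in use by PID %d",
                            state->roident, state->acquired_by)));

        /* wait until whoever holds it signals a change, then recheck */
        ConditionVariablePrepareToSleep(&state->origin_cv);
        LWLockRelease(ReplicationOriginLock);
        ConditionVariableSleep(&state->origin_cv, PLACEHOLDER_WAIT_EVENT);
        ConditionVariableCancelSleep();
        goto restart;
    }
    /* ... actually drop the origin ... */
    LWLockRelease(ReplicationOriginLock);

and every place that changes acquired_by does
ConditionVariableBroadcast(&state->origin_cv) so that waiters re-check.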

BTW, I noticed that the PG_WAIT_LOCK value that we're using for wait
event here (and in the replication slot case) is bogus.  We probably
need something new here.

I found four instances of this problem in the buildfarm:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2017-07-26%2014%3A39%3A54
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2017-07-28%2021%3A00%3A15
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2017-07-31%2007%3A03%3A20
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nightjar&dt=2017-08-04%2004%3A34%3A04

Jacana only stopped having it because it broke for unrelated reasons.  I
bet we'll see another failure soon ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/commands/subscriptioncmds.c 
b/src/backend/commands/subscriptioncmds.c
index 3593712791..ae40f7164d 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -939,7 +939,7 @@ DropSubscription(DropSubscriptionStmt *stmt, bool 
isTopLevel)
snprintf(originname, sizeof(originname), "pg_%u", subid);
originid = replorigin_by_name(originname, true);
if (originid != InvalidRepOriginId)
-   replorigin_drop(originid);
+   replorigin_drop(originid, false);
 
/*
 * If there is no slot associated with the subscription, we can finish
diff --git a/src/backend/replication/logical/origin.c 
b/src/backend/replication/logical/origin.c
index 1c665312a4..4f32e7861c 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -79,15 +79,15 @@
 #include "access/xact.h"
 
 #include "catalog/indexing.h"
-
 #include "nodes/execnodes.h"
 
 #include "replication/origin.h"
 #include "replication/logical.h"
-
+#include "pgstat.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/lmgr.h"
+#include "storage/condition_variable.h"
 #include "storage/copydir.h"
 
 #include "utils/builtins.h"
@@ -125,6 +125,11 @@ typedef struct ReplicationState
int acquired_by;
 
/*
+* Condition variable that's signalled when acquired_by changes.
+*/
+   ConditionVariable origin_cv;
+
+   /*
 * Lock protecting remote_lsn and local_lsn.
 */
LWLock  lock;
@@ -324,9 +329,9 @@ replorigin_create(char *roname)
  * Needs to be called in a transaction.
  */
 void
-replorigin_drop(RepOriginId roident)
+replorigin_drop(RepOriginId roident, bool nowait)
 {
-   HeapTuple   tuple = NULL;
+   HeapTuple   tuple;
Relationrel;
int i;
 
@@ -334,6 +339,8 @@ replorigin_drop(RepOriginId roident)
 
rel = heap_open(ReplicationOriginRelationId, ExclusiveLock);
 
+restart:
+   tuple = NULL;
/* cleanup the slot state info */
LWLockAcquire(ReplicationOriginLock, LW_EXCLUSIVE);
 
@@ -346,11 +353,21 @@ replorigin_drop(RepOriginId roident)
{
if (state->acquired_by != 0)
{
-   ereport(ERROR,
-   (errcode(ERRCODE_OBJECT_IN_USE),
-errmsg("could not drop 
replication origin with OID %d, in use by PID %d",
-   state->roident,
-   
state->acquired_by)));
+   ConditionVariable  *cv;
+
+   if (nowait)
+   ereport(ERROR,
+   
(errcode(ERRCODE_OBJECT_IN_USE),
+errmsg("could not drop 

Re: [HACKERS] More race conditions in logical replication

2017-07-27 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> 
> > Hmm, yeah, that's not good.  However, I didn't like the idea of putting
> > it inside the locked area, as it's too much code.  Instead I added it
> > just before acquiring the spinlock.  We cancel the sleep unconditionally
> > later on if we didn't need to sleep after all.
> 
> I just noticed that Jacana failed again in the subscription test, and it
> looks like it's related to this.  I'll take a look tomorrow if no one
> beats me to it.
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2017-07-26%2014%3A39%3A54

Ahh, I misread the message.  This one is actually about the replication
*origin* being still active, not the replication *slot*.  I suppose
origins need the same treatment as we just did for slots.  Anybody want
to volunteer a patch?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-26 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Hmm, yeah, that's not good.  However, I didn't like the idea of putting
> it inside the locked area, as it's too much code.  Instead I added it
> just before acquiring the spinlock.  We cancel the sleep unconditionally
> later on if we didn't need to sleep after all.

I just noticed that Jacana failed again in the subscription test, and it
looks like it's related to this.  I'll take a look tomorrow if no one
beats me to it.
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2017-07-26%2014%3A39%3A54

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-25 Thread Robert Haas
On Tue, Jul 25, 2017 at 5:47 AM, Petr Jelinek
 wrote:
> As a side note, ConditionVariablePrepareToSleep()'s comment could be
> improved, because currently it says the only advantage is that we skip the
> double test at the beginning of ConditionVariableSleep(). But that's not
> true; it's essential for preventing race conditions like the one above,
> because it puts the current process onto the wait list, so we can be sure
> it will be signaled on broadcast once ConditionVariablePrepareToSleep()
> has been called.

But if you don't call ConditionVariablePrepareToSleep() before calling
ConditionVariableSleep(), then the first call to the latter will call
the former and return without doing anything else.  So I don't see how
this can ever go wrong if you're using these primitives as documented.
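
That is, the documented pattern is essentially this (illustrative caller
code only, using the per-slot mutex and active_cv fields from the patches
in this thread):

    for (;;)
    {
        bool        slot_is_free;

        SpinLockAcquire(&slot->mutex);
        slot_is_free = (slot->active_pid == 0);
        SpinLockRelease(&slot->mutex);

        if (slot_is_free)
            break;

        /*
         * The first call just registers us on the CV's wait list and
         * returns; later calls actually sleep.  Either way we re-test the
         * condition above, so nothing is lost.
         */
        ConditionVariableSleep(&slot->active_cv,
                               WAIT_EVENT_REPLICATION_SLOT_DROP /* or whatever
                                                                 * we settle on */ );
    }
    ConditionVariableCancelSleep();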

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] More race conditions in logical replication

2017-07-25 Thread Robert Haas
On Tue, Jul 25, 2017 at 1:42 PM, Alvaro Herrera
 wrote:
> As I see it, we need to backpatch at least parts of this patch.  I've
> received reports that in earlier branches pglogical and BDR can
> sometimes leave slots behind when removing nodes, and I have a hunch
> that it can be explained by the bugs being fixed here.  Now we cannot
> use condition variables in back-branches, so we'll need to figure out
> how to actually do it ...

If all you had to back-patch was the condition variable code itself,
that might not really be all that bad, but it depends on the
WaitEventSet stuff, which I think is far too dangerous to back-patch.
However, you might be able to create a dumber, slower version that
only uses WaitLatch.
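
Something like this, say -- a rough illustration only, with an arbitrary
100ms poll interval (the three-argument WaitLatch() shown here is the
back-branch form; v10 added a wait_event_info argument):

    for (;;)
    {
        int         active_pid;

        SpinLockAcquire(&slot->mutex);
        active_pid = slot->active_pid;
        SpinLockRelease(&slot->mutex);

        if (active_pid == 0)
            break;              /* slot is free now */

        /*
         * Sleep up to 100ms, or until our latch gets set.  A real version
         * would also check the result for postmaster death.
         */
        WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH, 100L);
        ResetLatch(MyLatch);
        CHECK_FOR_INTERRUPTS();
    }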

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] More race conditions in logical replication

2017-07-25 Thread Alvaro Herrera
Petr Jelinek wrote:
> On 25/07/17 01:33, Alvaro Herrera wrote:
> > Alvaro Herrera wrote:
> >> I'm back at looking into this again, after a rather exhausting week.  I
> >> think I have a better grasp of what is going on in this code now,
> > 
> > Here's an updated patch, which I intend to push tomorrow.
> 
> I think the ConditionVariablePrepareToSleep() call in
> ReplicationSlotAcquire() needs to be done inside the mutex lock;
> otherwise there is a tiny race condition where the process which has the
> slot acquired might release it between the mutex unlock and the
> ConditionVariablePrepareToSleep() call, which means we'll never get
> signaled and ConditionVariableSleep() will wait forever.

Hmm, yeah, that's not good.  However, I didn't like the idea of putting
it inside the locked area, as it's too much code.  Instead I added it
just before acquiring the spinlock.  We cancel the sleep unconditionally
later on if we didn't need to sleep after all.

As I see it, we need to backpatch at least parts of this patch.  I've
received reports that in earlier branches pglogical and BDR can
sometimes leave slots behind when removing nodes, and I have a hunch
that it can be explained by the bugs being fixed here.  Now we cannot
use condition variables in back-branches, so we'll need to figure out
how to actually do it ...

> As a side note, ConditionVariablePrepareToSleep()'s comment could be
> improved, because currently it says the only advantage is that we skip the
> double test at the beginning of ConditionVariableSleep(). But that's not
> true; it's essential for preventing race conditions like the one above,
> because it puts the current process onto the wait list, so we can be sure
> it will be signaled on broadcast once ConditionVariablePrepareToSleep()
> has been called.

Hmm.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-25 Thread Petr Jelinek
On 25/07/17 01:33, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>> I'm back at looking into this again, after a rather exhausting week.  I
>> think I have a better grasp of what is going on in this code now,
> 
> Here's an updated patch, which I intend to push tomorrow.
> 

I think the ConditionVariablePrepareToSleep() call in
ReplicationSlotAcquire() needs to be done inside the mutex lock;
otherwise there is a tiny race condition where the process which has the
slot acquired might release it between the mutex unlock and the
ConditionVariablePrepareToSleep() call, which means we'll never get
signaled and ConditionVariableSleep() will wait forever.

Otherwise the patch looks good to me.

As a side note, ConditionVariablePrepareToSleep()'s comment could be
improved, because currently it says the only advantage is that we skip the
double test at the beginning of ConditionVariableSleep(). But that's not
true; it's essential for preventing race conditions like the one above,
because it puts the current process onto the wait list, so we can be sure
it will be signaled on broadcast once ConditionVariablePrepareToSleep()
has been called.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-24 Thread Alvaro Herrera
Alvaro Herrera wrote:
> I'm back at looking into this again, after a rather exhausting week.  I
> think I have a better grasp of what is going on in this code now,

Here's an updated patch, which I intend to push tomorrow.

> and it
> appears to me that we should change the locking so that active_pid is
> protected by ReplicationSlotControlLock instead of the slot's spinlock;
> but I haven't analyzed the idea fully yet and I don't have the patch
> done yet either.

This doesn't seem to be a good idea actually.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 33f08678bf20eed3a4cb3d10960bb06543a1b3db Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 12 Jul 2017 18:38:33 -0400
Subject: [PATCH v4] Wait for slot to become free when dropping it

---
 src/backend/replication/logical/logicalfuncs.c |   2 +-
 src/backend/replication/slot.c | 115 ++---
 src/backend/replication/slotfuncs.c|  32 ---
 src/backend/replication/walsender.c|   6 +-
 src/include/replication/slot.h |  10 ++-
 5 files changed, 110 insertions(+), 55 deletions(-)

diff --git a/src/backend/replication/logical/logicalfuncs.c 
b/src/backend/replication/logical/logicalfuncs.c
index 363ca82cb0..a3ba2b1266 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -244,7 +244,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, 
bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr();
 
-   ReplicationSlotAcquire(NameStr(*name));
+   ReplicationSlotAcquire(NameStr(*name), true);
 
PG_TRY();
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index dc7de20e11..ea9cd1f22b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -157,6 +157,7 @@ ReplicationSlotsShmemInit(void)
/* everything else is zeroed by the memset above */
SpinLockInit(>mutex);
LWLockInitialize(>io_in_progress_lock, 
LWTRANCHE_REPLICATION_SLOT_IO_IN_PROGRESS);
+   ConditionVariableInit(>active_cv);
}
}
 }
@@ -313,25 +314,35 @@ ReplicationSlotCreate(const char *name, bool db_specific,
LWLockRelease(ReplicationSlotControlLock);
 
/*
-* Now that the slot has been marked as in_use and in_active, it's safe 
to
+* Now that the slot has been marked as in_use and active, it's safe to
 * let somebody else try to allocate a slot.
 */
LWLockRelease(ReplicationSlotAllocationLock);
+
+   /* Let everybody know we've modified this slot */
+   ConditionVariableBroadcast(>active_cv);
 }
 
 /*
  * Find a previously created slot and mark it as used by this backend.
  */
 void
-ReplicationSlotAcquire(const char *name)
+ReplicationSlotAcquire(const char *name, bool nowait)
 {
-   ReplicationSlot *slot = NULL;
+   ReplicationSlot *slot;
+   int active_pid;
int i;
-   int active_pid = 0; /* Keep compiler quiet */
 
+retry:
Assert(MyReplicationSlot == NULL);
 
-   /* Search for the named slot and mark it active if we find it. */
+   /*
+* Search for the named slot and mark it active if we find it.  If the
+* slot is already active, we exit the loop with active_pid set to the 
PID
+* of the backend that owns it.
+*/
+   active_pid = 0;
+   slot = NULL;
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (i = 0; i < max_replication_slots; i++)
{
@@ -339,35 +350,59 @@ ReplicationSlotAcquire(const char *name)
 
if (s->in_use && strcmp(name, NameStr(s->data.name)) == 0)
{
+   /* Found the slot we want -- can we activate it? */
SpinLockAcquire(>mutex);
+
active_pid = s->active_pid;
if (active_pid == 0)
active_pid = s->active_pid = MyProcPid;
+
SpinLockRelease(>mutex);
slot = s;
+
break;
}
}
LWLockRelease(ReplicationSlotControlLock);
 
-   /* If we did not find the slot or it was already active, error out. */
+   /* If we did not find the slot, error out. */
if (slot == NULL)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
 errmsg("replication slot \"%s\" does not 
exist", name)));
+
+   /*
+* If we found the slot but it's already active in another backend, we
+* either error out or retry after a short wait, as caller 

Re: [HACKERS] More race conditions in logical replication

2017-07-21 Thread Alvaro Herrera
I'm back at looking into this again, after a rather exhausting week.  I
think I have a better grasp of what is going on in this code now, and it
appears to me that we should change the locking so that active_pid is
protected by ReplicationSlotControlLock instead of the slot's spinlock;
but I haven't analyzed the idea fully yet and I don't have the patch
done yet either.  I'm going to report, hopefully with a fix committed
this time, on Monday at 19:00 CLT.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-18 Thread Alvaro Herrera
Alvaro Herrera wrote:
> After going over this a bunch more times, I found other problems.  For
> example, I noticed that if I create a temporary slot in one session,
> then another session would rightly go to sleep if it tries to drop that
> slot.  But the slot goes away when the first session disconnects, so I'd
> expect the sleeping session to get a signal and wake up, but that
> doesn't happen.
> 
> I'm going to continue to look at this and report back Tuesday 18th.

I got tangled up in a really tough problem this week, so I won't have
time to work on this for a couple of days.  I can next update tomorrow
19:00 CLT (probably not with a final fix yet, though).

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-15 Thread Alvaro Herrera
After going over this a bunch more times, I found other problems.  For
example, I noticed that if I create a temporary slot in one session,
then another session would rightly go to sleep if it tries to drop that
slot.  But the slot goes away when the first session disconnects, so I'd
expect the sleeping session to get a signal and wake up, but that
doesn't happen.
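
Presumably the on-disconnect cleanup path needs to wake up waiters as
well; roughly something like this (sketch only, not a tested fix):

    /*
     * Wherever a temporary slot is released/dropped at backend exit, clear
     * active_pid under the spinlock as today, then broadcast on the slot's
     * CV so that a backend sleeping in the drop path re-checks the slot.
     */
    SpinLockAcquire(&slot->mutex);
    slot->active_pid = 0;
    SpinLockRelease(&slot->mutex);

    ConditionVariableBroadcast(&slot->active_cv);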

I'm going to continue to look at this and report back Tuesday 18th.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-14 Thread Noah Misch
On Wed, Jul 12, 2017 at 06:48:28PM -0400, Alvaro Herrera wrote:
> Petr Jelinek wrote:
> 
> > So the best idea I could come up with is to make use of the new condition
> > variable API. That lets us wait on a variable which can be set per slot.
> 
> Here's a cleaned up version of that patch, which I intend to get in the
> tree tomorrow.  I verified that this allows the subscription tests to
> pass with Tom's sleep additions.

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] More race conditions in logical replication

2017-07-12 Thread Alvaro Herrera
Petr Jelinek wrote:

> So the best idea I could come up with is to make use of the new condition
> variable API. That lets us wait on a variable which can be set per slot.

Here's a cleaned up version of that patch, which I intend to get in the
tree tomorrow.  I verified that this allows the subscription tests to
pass with Tom's sleep additions.

I noticed one point where we're reading the active_pid without locking;
marked it with an XXX comment.  Not yet sure whether this is a bug or
not.


I noticed something funny in CREATE_REPLICATION; apparently we first
create a slot and set it active, then we activate it by name.  With the
current coding it seems to work fine, because we're careful to make
activation idempotent, but it looks convoluted.  I don't think this is
new, though.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 55533aa3cdc2fecbf7b6b6c661649342a204e73b Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 12 Jul 2017 18:38:33 -0400
Subject: [PATCH v3 1/1] Wait for slot to become free when dropping it

---
 src/backend/replication/logical/logicalfuncs.c |  2 +-
 src/backend/replication/slot.c | 79 +++---
 src/backend/replication/slotfuncs.c|  2 +-
 src/backend/replication/walsender.c|  6 +-
 src/include/replication/slot.h |  8 ++-
 5 files changed, 69 insertions(+), 28 deletions(-)

diff --git a/src/backend/replication/logical/logicalfuncs.c 
b/src/backend/replication/logical/logicalfuncs.c
index 363ca82cb0..a3ba2b1266 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -244,7 +244,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, 
bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr();
 
-   ReplicationSlotAcquire(NameStr(*name));
+   ReplicationSlotAcquire(NameStr(*name), true);
 
PG_TRY();
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index dc7de20e11..76198a627d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -157,6 +157,7 @@ ReplicationSlotsShmemInit(void)
/* everything else is zeroed by the memset above */
SpinLockInit(>mutex);
LWLockInitialize(>io_in_progress_lock, 
LWTRANCHE_REPLICATION_SLOT_IO_IN_PROGRESS);
+   ConditionVariableInit(>active_cv);
}
}
 }
@@ -313,24 +314,27 @@ ReplicationSlotCreate(const char *name, bool db_specific,
LWLockRelease(ReplicationSlotControlLock);
 
/*
-* Now that the slot has been marked as in_use and in_active, it's safe 
to
+* Now that the slot has been marked as in_use and active, it's safe to
 * let somebody else try to allocate a slot.
 */
LWLockRelease(ReplicationSlotAllocationLock);
+
+   ConditionVariableBroadcast(>active_cv);
 }
 
 /*
  * Find a previously created slot and mark it as used by this backend.
  */
 void
-ReplicationSlotAcquire(const char *name)
+ReplicationSlotAcquire(const char *name, bool nowait)
 {
ReplicationSlot *slot = NULL;
+   int active_pid = 0;
int i;
-   int active_pid = 0; /* Keep compiler quiet */
 
Assert(MyReplicationSlot == NULL);
 
+retry:
/* Search for the named slot and mark it active if we find it. */
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (i = 0; i < max_replication_slots; i++)
@@ -339,27 +343,52 @@ ReplicationSlotAcquire(const char *name)
 
if (s->in_use && strcmp(name, NameStr(s->data.name)) == 0)
{
+   /* Found the slot we want -- can we activate it? */
SpinLockAcquire(>mutex);
+
active_pid = s->active_pid;
if (active_pid == 0)
active_pid = s->active_pid = MyProcPid;
+
SpinLockRelease(>mutex);
slot = s;
+
break;
}
}
LWLockRelease(ReplicationSlotControlLock);
 
-   /* If we did not find the slot or it was already active, error out. */
+   /* If we did not find the slot, error out. */
if (slot == NULL)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
 errmsg("replication slot \"%s\" does not 
exist", name)));
+
+   /*
+* If we found the slot but it's already active in another backend, we
+* either error out or retry after a short wait, as caller specified.
+*/
if (active_pid != MyProcPid)
-   ereport(ERROR,
- 

Re: [HACKERS] More race conditions in logical replication

2017-07-10 Thread Alvaro Herrera
Alvaro Herrera wrote:

> I'll next update this on or before Monday 10th at 19:00 CLT.

I couldn't get to this today as I wanted.  Next update on Wednesday
12th, same time.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-07 Thread Tom Lane
Alvaro Herrera  writes:
> I'll next update this on or before Monday 10th at 19:00 CLT.

Schedule note --- we'll be wrapping beta2 on Monday, a couple hours
before that.  Although it'd be great to have this issue fixed before
beta2, jamming in a patch just a few hours before wrap is probably
not prudent.  If you can't get it done over the weekend, I'd counsel
holding off till after beta2 is tagged.

regards, tom lane




Re: [HACKERS] More race conditions in logical replication

2017-07-07 Thread Alvaro Herrera
Petr Jelinek wrote:

> So the best idea I could come up with is to make use of the new
> condition variable API. That lets us wait on a variable which can be set
> per slot.
>
> It's not backportable, however; I am not sure how big a problem that is
> considering the lack of complaints until now (maybe we could backport
> using the ugly timeout version?).
>
> The attached patch is a prototype of such a solution, and there are some
> race conditions left (the variable can get signaled before the waiting
> process starts sleeping on it). I am mainly sending it to get feedback
> on the approach.

This one looks a lot better than the original one.

I wonder if it's possible to do this using lwlocks instead of condition
variables (similar to how we do the "wait for IO" thing both for slots
and buffers).  We could backport that easily ...

I'll next update this on or before Monday 10th at 19:00 CLT.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-07 Thread Petr Jelinek
On 06/07/17 18:20, Petr Jelinek wrote:
> On 06/07/17 17:33, Petr Jelinek wrote:
>> On 03/07/17 01:54, Tom Lane wrote:
>>> I noticed a recent failure that looked suspiciously like a race condition:
>>>
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2017-07-02%2018%3A02%3A07
>>>
>>> The critical bit in the log file is
>>>
>>> error running SQL: 'psql::1: ERROR:  could not drop the replication 
>>> slot "tap_sub" on publisher
>>> DETAIL:  The error was: ERROR:  replication slot "tap_sub" is active for 
>>> PID 3866790'
>>> while running 'psql -XAtq -d port=59543 host=/tmp/QpCJtafT7R 
>>> dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION 
>>> tap_sub' at 
>>> /home/nm/farm/xlc64/HEAD/pgsql.build/src/test/subscription/../../../src/test/perl/PostgresNode.pm
>>>  line 1198.
>>>
>>> After poking at it a bit, I found that I can cause several different
>>> failures of this ilk in the subscription tests by injecting delays at
>>> the points where a slot's active_pid is about to be cleared, as in the
>>> attached patch (which also adds some extra printouts for debugging
>>> purposes; none of that is meant for commit).  It seems clear that there
>>> is inadequate interlocking going on when we kill and restart a logical
>>> rep worker: we're trying to start a new one before the old one has
>>> gotten out of the slot.
>>>
>>
>> Thanks for the test case.
>>
>> It's not actually that we start the new worker too fast. It's that we
>> try to drop the slot right after the worker process was killed, but if
>> the code that clears the slot's active_pid takes too long, the slot
>> still looks like it's in use. I am quite sure it's possible to make this
>> happen for physical replication as well when using slots.
>>
>> This is not something that can be solved by locking on the subscriber.
>> ISTM we need to make pg_drop_replication_slot behave more nicely, like
>> making it wait for the slot to become available (either by default or as
>> an option).
>>
>> I'll have to think about how to do it without rewriting half of
>> replication slots or reimplementing a lock queue, though, because
>> replication slots don't use normal catalog access, so there is no object
>> locking with a wait queue. We could use a latch wait with a small
>> timeout, but that seems ugly, as that function can be called by a user
>> without the slot having been released beforehand, so the wait can be
>> quite long (as in "forever").
>>
> 
> A naive fix would be something like the attached. But as I said, it's
> not exactly pretty.
> 

So the best idea I could come up with is to make use of the new
condition variable API. That lets us wait on a variable which can be set
per slot.

It's not backportable, however; I am not sure how big a problem that is
considering the lack of complaints until now (maybe we could backport
using the ugly timeout version?).

The attached patch is a prototype of such a solution, and there are some
race conditions left (the variable can get signaled before the waiting
process starts sleeping on it). I am mainly sending it to get feedback
on the approach.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From b72ea52c865b2d7f0d7d29d0834d71e1ec33d54a Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Thu, 6 Jul 2017 18:16:44 +0200
Subject: [PATCH] Wait for slot to become free when dropping it

---
 src/backend/replication/logical/logicalfuncs.c |  2 +-
 src/backend/replication/slot.c | 43 +-
 src/backend/replication/slotfuncs.c|  2 +-
 src/backend/replication/walsender.c|  6 ++--
 src/include/replication/slot.h |  8 +++--
 5 files changed, 46 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 363ca82..a3ba2b1 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -244,7 +244,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
 	else
 		end_of_wal = GetXLogReplayRecPtr();
 
-	ReplicationSlotAcquire(NameStr(*name));
+	ReplicationSlotAcquire(NameStr(*name), true);
 
 	PG_TRY();
 	{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index dc7de20..2993bb9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -46,6 +46,7 @@
 #include "pgstat.h"
 #include "replication/slot.h"
 #include "storage/fd.h"
+#include "storage/ipc.h"
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/builtins.h"
@@ -157,6 +158,7 @@ ReplicationSlotsShmemInit(void)
 			/* everything else is zeroed by the memset above */
 			SpinLockInit(>mutex);
 			LWLockInitialize(>io_in_progress_lock, LWTRANCHE_REPLICATION_SLOT_IO_IN_PROGRESS);
+			ConditionVariableInit(>active_cv);
 		}
 	}
 }
@@ -323,7 +325,7 @@ ReplicationSlotCreate(const char *name, bool 

Re: [HACKERS] More race conditions in logical replication

2017-07-06 Thread Petr Jelinek
On 06/07/17 17:33, Petr Jelinek wrote:
> On 03/07/17 01:54, Tom Lane wrote:
>> I noticed a recent failure that looked suspiciously like a race condition:
>>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2017-07-02%2018%3A02%3A07
>>
>> The critical bit in the log file is
>>
>> error running SQL: 'psql::1: ERROR:  could not drop the replication 
>> slot "tap_sub" on publisher
>> DETAIL:  The error was: ERROR:  replication slot "tap_sub" is active for PID 
>> 3866790'
>> while running 'psql -XAtq -d port=59543 host=/tmp/QpCJtafT7R 
>> dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION 
>> tap_sub' at 
>> /home/nm/farm/xlc64/HEAD/pgsql.build/src/test/subscription/../../../src/test/perl/PostgresNode.pm
>>  line 1198.
>>
>> After poking at it a bit, I found that I can cause several different
>> failures of this ilk in the subscription tests by injecting delays at
>> the points where a slot's active_pid is about to be cleared, as in the
>> attached patch (which also adds some extra printouts for debugging
>> purposes; none of that is meant for commit).  It seems clear that there
>> is inadequate interlocking going on when we kill and restart a logical
>> rep worker: we're trying to start a new one before the old one has
>> gotten out of the slot.
>>
> 
> Thanks for the test case.
> 
> It's not actually that we start the new worker too fast. It's that we
> try to drop the slot right after the worker process was killed, but if
> the code that clears the slot's active_pid takes too long, the slot
> still looks like it's in use. I am quite sure it's possible to make this
> happen for physical replication as well when using slots.
>
> This is not something that can be solved by locking on the subscriber.
> ISTM we need to make pg_drop_replication_slot behave more nicely, like
> making it wait for the slot to become available (either by default or as
> an option).
>
> I'll have to think about how to do it without rewriting half of
> replication slots or reimplementing a lock queue, though, because
> replication slots don't use normal catalog access, so there is no object
> locking with a wait queue. We could use a latch wait with a small
> timeout, but that seems ugly, as that function can be called by a user
> without the slot having been released beforehand, so the wait can be
> quite long (as in "forever").
> 

A naive fix would be something like the attached. But as I said, it's
not exactly pretty.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
>From 6d016a3fad8635c2366bcf5438d49f41c905621c Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Thu, 6 Jul 2017 18:16:44 +0200
Subject: [PATCH] Wait for slot to become free when dropping it

---
 src/backend/replication/logical/logicalfuncs.c |  2 +-
 src/backend/replication/slot.c | 51 ++
 src/backend/replication/slotfuncs.c|  2 +-
 src/backend/replication/walsender.c|  6 +--
 src/include/replication/slot.h |  4 +-
 5 files changed, 50 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 363ca82..a3ba2b1 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -244,7 +244,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
 	else
 		end_of_wal = GetXLogReplayRecPtr();
 
-	ReplicationSlotAcquire(NameStr(*name));
+	ReplicationSlotAcquire(NameStr(*name), true);
 
 	PG_TRY();
 	{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index dc7de20..ab115f4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -46,6 +46,7 @@
 #include "pgstat.h"
 #include "replication/slot.h"
 #include "storage/fd.h"
+#include "storage/ipc.h"
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/builtins.h"
@@ -323,7 +324,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
  * Find a previously created slot and mark it as used by this backend.
  */
 void
-ReplicationSlotAcquire(const char *name)
+ReplicationSlotAcquire(const char *name, bool nowait)
 {
 	ReplicationSlot *slot = NULL;
 	int			i;
@@ -331,6 +332,8 @@ ReplicationSlotAcquire(const char *name)
 
 	Assert(MyReplicationSlot == NULL);
 
+retry:
+
 	/* Search for the named slot and mark it active if we find it. */
 	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
 	for (i = 0; i < max_replication_slots; i++)
@@ -350,16 +353,48 @@ ReplicationSlotAcquire(const char *name)
 	}
 	LWLockRelease(ReplicationSlotControlLock);
 
-	/* If we did not find the slot or it was already active, error out. */
+	/* If we did not find the slot, error out. */
 	if (slot == NULL)
 		ereport(ERROR,
 (errcode(ERRCODE_UNDEFINED_OBJECT),
  errmsg("replication slot \"%s\" does not exist", name)));
+
+	/*
+	 

Re: [HACKERS] More race conditions in logical replication

2017-07-06 Thread Petr Jelinek
On 03/07/17 01:54, Tom Lane wrote:
> I noticed a recent failure that looked suspiciously like a race condition:
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2017-07-02%2018%3A02%3A07
> 
> The critical bit in the log file is
> 
> error running SQL: 'psql::1: ERROR:  could not drop the replication 
> slot "tap_sub" on publisher
> DETAIL:  The error was: ERROR:  replication slot "tap_sub" is active for PID 
> 3866790'
> while running 'psql -XAtq -d port=59543 host=/tmp/QpCJtafT7R 
> dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION 
> tap_sub' at 
> /home/nm/farm/xlc64/HEAD/pgsql.build/src/test/subscription/../../../src/test/perl/PostgresNode.pm
>  line 1198.
> 
> After poking at it a bit, I found that I can cause several different
> failures of this ilk in the subscription tests by injecting delays at
> the points where a slot's active_pid is about to be cleared, as in the
> attached patch (which also adds some extra printouts for debugging
> purposes; none of that is meant for commit).  It seems clear that there
> is inadequate interlocking going on when we kill and restart a logical
> rep worker: we're trying to start a new one before the old one has
> gotten out of the slot.
> 

Thanks for the test case.

It's not actually that we start the new worker too fast. It's that we
try to drop the slot right after the worker process was killed, but if
the code that clears the slot's active_pid takes too long, the slot
still looks like it's in use. I am quite sure it's possible to make this
happen for physical replication as well when using slots.

This is not something that can be solved by locking on the subscriber.
ISTM we need to make pg_drop_replication_slot behave more nicely, like
making it wait for the slot to become available (either by default or as
an option).

I'll have to think about how to do it without rewriting half of
replication slots or reimplementing a lock queue, though, because
replication slots don't use normal catalog access, so there is no object
locking with a wait queue. We could use a latch wait with a small
timeout, but that seems ugly, as that function can be called by a user
without the slot having been released beforehand, so the wait can be
quite long (as in "forever").

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-05 Thread Alvaro Herrera
Noah Misch wrote:

> The above-described topic is currently a PostgreSQL 10 open item.  Peter,
> since you committed the patch believed to have created it, you own this open
> item.

I volunteer to own this item.  My next update is going to be on or
before Friday 7th at 19:00 Chilean time, though I don't think I can have
this ready before then.  I expect a fix for this to miss next week's
beta release, but I'll try to get it done sooner.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] More race conditions in logical replication

2017-07-03 Thread Noah Misch
On Sun, Jul 02, 2017 at 07:54:48PM -0400, Tom Lane wrote:
> I noticed a recent failure that looked suspiciously like a race condition:
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2017-07-02%2018%3A02%3A07
> 
> The critical bit in the log file is
> 
> error running SQL: 'psql::1: ERROR:  could not drop the replication 
> slot "tap_sub" on publisher
> DETAIL:  The error was: ERROR:  replication slot "tap_sub" is active for PID 
> 3866790'
> while running 'psql -XAtq -d port=59543 host=/tmp/QpCJtafT7R 
> dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION 
> tap_sub' at 
> /home/nm/farm/xlc64/HEAD/pgsql.build/src/test/subscription/../../../src/test/perl/PostgresNode.pm
>  line 1198.
> 
> After poking at it a bit, I found that I can cause several different
> failures of this ilk in the subscription tests by injecting delays at
> the points where a slot's active_pid is about to be cleared, as in the
> attached patch (which also adds some extra printouts for debugging
> purposes; none of that is meant for commit).  It seems clear that there
> is inadequate interlocking going on when we kill and restart a logical
> rep worker: we're trying to start a new one before the old one has
> gotten out of the slot.
> 
> I'm not particularly interested in fixing this myself, so I'm just
> going to add it to the open items list.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Peter,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




[HACKERS] More race conditions in logical replication

2017-07-02 Thread Tom Lane
I noticed a recent failure that looked suspiciously like a race condition:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2017-07-02%2018%3A02%3A07

The critical bit in the log file is

error running SQL: 'psql::1: ERROR:  could not drop the replication slot 
"tap_sub" on publisher
DETAIL:  The error was: ERROR:  replication slot "tap_sub" is active for PID 
3866790'
while running 'psql -XAtq -d port=59543 host=/tmp/QpCJtafT7R dbname='postgres' 
-f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION tap_sub' at 
/home/nm/farm/xlc64/HEAD/pgsql.build/src/test/subscription/../../../src/test/perl/PostgresNode.pm
 line 1198.

After poking at it a bit, I found that I can cause several different
failures of this ilk in the subscription tests by injecting delays at
the points where a slot's active_pid is about to be cleared, as in the
attached patch (which also adds some extra printouts for debugging
purposes; none of that is meant for commit).  It seems clear that there
is inadequate interlocking going on when we kill and restart a logical
rep worker: we're trying to start a new one before the old one has
gotten out of the slot.

I'm not particularly interested in fixing this myself, so I'm just
going to add it to the open items list.

regards, tom lane

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index dc7de20..78cd416 100644
*** a/src/backend/replication/slot.c
--- b/src/backend/replication/slot.c
*** ReplicationSlotAcquire(const char *name)
*** 358,364 
  	if (active_pid != MyProcPid)
  		ereport(ERROR,
  (errcode(ERRCODE_OBJECT_IN_USE),
!  errmsg("replication slot \"%s\" is active for PID %d",
  		name, active_pid)));
  
  	/* We made this slot active, so it's ours now. */
--- 358,364 
  	if (active_pid != MyProcPid)
  		ereport(ERROR,
  (errcode(ERRCODE_OBJECT_IN_USE),
!  errmsg("replication slot \"%s\" is active for PID %d (acq)",
  		name, active_pid)));
  
  	/* We made this slot active, so it's ours now. */
*** ReplicationSlotRelease(void)
*** 391,396 
--- 391,398 
  		 * Mark persistent slot inactive.  We're not freeing it, just
  		 * disconnecting.
  		 */
+ 		pg_usleep(10);
+ 		elog(LOG, "ReplicationSlotRelease: unmarking persistent slot");
  		SpinLockAcquire(>mutex);
  		slot->active_pid = 0;
  		SpinLockRelease(>mutex);
*** ReplicationSlotDropPtr(ReplicationSlot *
*** 523,528 
--- 525,532 
  	{
  		bool		fail_softly = slot->data.persistency != RS_PERSISTENT;
  
+ 		pg_usleep(10);
+ 		elog(LOG, "ReplicationSlotDropPtr: unmarking slot after rename fail");
  		SpinLockAcquire(>mutex);
  		slot->active_pid = 0;
  		SpinLockRelease(>mutex);
*** ReplicationSlotDropPtr(ReplicationSlot *
*** 540,545 
--- 544,551 
  	 * and nobody can be attached to this slot and thus access it without
  	 * scanning the array.
  	 */
+ 	pg_usleep(10);
+ 	elog(LOG, "ReplicationSlotDropPtr: unmarking slot");
  	LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
  	slot->active_pid = 0;
  	slot->in_use = false;
*** restart:
*** 876,882 
  		if (active_pid)
  			ereport(ERROR,
  	(errcode(ERRCODE_OBJECT_IN_USE),
! 	 errmsg("replication slot \"%s\" is active for PID %d",
  			slotname, active_pid)));
  
  		/*
--- 882,888 
  		if (active_pid)
  			ereport(ERROR,
  	(errcode(ERRCODE_OBJECT_IN_USE),
! 	 errmsg("replication slot \"%s\" is active for PID %d (drop)",
  			slotname, active_pid)));
  
  		/*
