Hi,

On 2014-05-16 16:37:16 -0400, Steve Singer wrote:
> I am finding that my logical walsender connections are being terminated due
> to a timeout on the CREATE REPLICATION SLOT command. with "terminating
> walsender process due to replication timeout"
>
> Below is the stack trace when this happens
>
> #3 0x000000000067df28 in WalSndCheckTimeOut (now=now@entry=453585463823871)
> at walsender.c:1748
> #4 0x000000000067eedc in WalSndWaitForWal (loc=691727888) at
> walsender.c:1216
> ...
> #9 0x0000000000680f16 in CreateReplicationSlot (cmd=0x1798b50) at
> walsender.c:800
> #10 exec_replication_command () at walsender.c:1291
> #11 0x00000000006bf4a1 in PostgresMain (argc=<optimized out>,
> argv=argv@entry=0x177db50, dbname=0x177db30 "test1",
>
> (gdb) p last_reply_timestamp
> $1 = 0
>
>
> I propose the attached patch sets last_reply_timestamp to now() it starts
> processing a command. Since receiving a command is hearing something from
> the client.

Hm. Yes, that's a problem.

> diff --git a/src/backend/replication/walsender.c
> b/src/backend/replication/walsender.c
> new file mode 100644
> index 5c11d68..56a2f10
> *** a/src/backend/replication/walsender.c
> --- b/src/backend/replication/walsender.c
> *************** exec_replication_command(const char *cmd
> *** 1276,1281 ****
> --- 1276,1282 ----
> parse_rc))));
>
> cmd_node = replication_parse_result;
> + last_reply_timestamp = GetCurrentTimestamp();
>
> switch (cmd_node->type)
> {

I don't think that's going to cut it though. The slot creation can take
longer than whatever wal_sender_timeout is set to (when there are lots of
long-running transactions). I think checking whether last_reply_timestamp
is 0 during timeout checking is more robust.

Greetings,

Andres Freund
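
To illustrate the idea of checking last_reply_timestamp during timeout checking, here is a minimal standalone sketch in C. It is not the walsender.c code and not a patch: the struct, the function name, and the microsecond timestamp type are hypothetical stand-ins, and only the names last_reply_timestamp and wal_sender_timeout come from this thread. It assumes that 0 means "no reply received yet" and that a timeout of 0 or less disables the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a timestamp: microseconds since some epoch. */
typedef int64_t timestamp_us;

/* Hypothetical container for the two values discussed in the thread. */
typedef struct
{
    timestamp_us last_reply_timestamp; /* 0 = nothing heard from the client yet */
    int          wal_sender_timeout;   /* milliseconds; <= 0 disables the check (assumption) */
} sender_state;

/*
 * Returns true if the sender should be considered timed out at "now".
 * A zero last_reply_timestamp is treated as "no baseline to measure from",
 * so the check is skipped instead of comparing against time zero.
 */
bool
sender_timed_out(const sender_state *s, timestamp_us now)
{
    timestamp_us deadline;

    assert(s != NULL);

    if (s->wal_sender_timeout <= 0)
        return false;           /* timeout checking disabled */

    if (s->last_reply_timestamp == 0)
        return false;           /* no reply seen yet, e.g. still creating the slot */

    /* Convert the millisecond timeout to microseconds before adding. */
    deadline = s->last_reply_timestamp +
        (timestamp_us) s->wal_sender_timeout * 1000;

    return now >= deadline;
}
```

With last_reply_timestamp left at 0, as in the gdb output quoted above, this check never fires no matter how long the command runs, which is the property that setting the timestamp once at command start cannot give.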