On 25.12.2019 16:51, Alexey Kondratov wrote:
On 25.12.2019 07:03, Kyotaro Horiguchi wrote:
As the result I think what is needed there is just checking if the
returned lsn is equal or larger than moveto. Doen't the following
change work?

-    if (XLogRecPtrIsInvalid(endlsn))
+    if (moveto <= endlsn)

Yep, it helps with physical replication slot persistence after advance, but the whole validation (moveto <= endlsn) does not make sense for me. The value of moveto should be >= than minlsn == confirmed_flush / restart_lsn, while endlsn == retlsn is also always initialized with confirmed_flush / restart_lsn. Thus, your condition seems to be true in any case, even if it was no-op one, which we were intended to catch.

I will recheck everything again and try to come up with something during this week.

If I get it correctly, then we already keep previous slot position in the minlsn, so we just have to compare endlsn with minlsn and treat endlsn <= minlsn as a no-op without slot state flushing.

Attached is a patch that does this, so it fixes the bug without affecting any user-facing behavior. Detailed comment section and DEBUG output are also added. What do you think now?

I have also forgotten to mention that all versions down to 11.0 should be affected with this bug.


Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

>From e08299ddf92abc3fb4e802e8b475097fa746c458 Mon Sep 17 00:00:00 2001
From: Alexey Kondratov <kondratov.alek...@gmail.com>
Date: Wed, 25 Dec 2019 20:12:42 +0300
Subject: [PATCH v2] Make physical replslot advance persistent

---
 src/backend/replication/slotfuncs.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6683fc3f9b..bc5c93b089 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -573,9 +573,17 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
 	values[0] = NameGetDatum(&MyReplicationSlot->data.name);
 	nulls[0] = false;
 
-	/* Update the on disk state when lsn was updated. */
-	if (XLogRecPtrIsInvalid(endlsn))
+	/*
+	 * Update the on disk state when LSN was updated.  Here we rely on the facts
+	 * that: 1) minlsn is initialized with restart_lsn and confirmed_flush LSN for
+	 * physical and logical replication slot respectively, and 2) endlsn is set in
+	 * the same way by pg_*_replication_slot_advance, but after advance.  Thus,
+	 * endlsn <= minlsn is treated as a no-op.
+	 */
+	if (endlsn > minlsn)
 	{
+		elog(DEBUG1, "flushing replication slot '%s' state",
+							NameStr(MyReplicationSlot->data.name));
 		ReplicationSlotMarkDirty();
 		ReplicationSlotsComputeRequiredXmin(false);
 		ReplicationSlotsComputeRequiredLSN();

base-commit: 8ce3aa9b5914d1ac45ed3f9bc484f66b3c4850c7
-- 
2.17.1

Reply via email to