Hi, here are some review comments for v45-0001 (excluding the test code)
======
doc/src/sgml/config.sgml
1.
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from primary server (i.e., standby slots having
nit - /from primary server/from the primary server/
======
src/backend/replication/slot.c
2. ReplicationSlotAcquire
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive
for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout.")));
nit - "replication_slot_inactive_timeout." - should be no period
inside that GUC name literal
~~~
3. ReportSlotInvalidation
I didn't understand why there was a hint for:
"You might need to increase \"%s\".", "max_slot_wal_keep_size"
But you don't have an equivalent hint for timeout invalidation:
"You might need to increase \"%s\".", "replication_slot_inactive_timeout"
Why aren't these similar cases consistent?
~~~
4. RestoreSlotFromDisk
+ /* Use the same inactive_since time for all the slots. */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
Is the deferred assignment really necessary? Why not just
unconditionally assign the 'now' just before the for-loop? Or even at
the declaration? e.g. The 'replication_slot_inactive_timeout' is
measured in seconds so I don't think 'inactive_since' being wrong by a
millisecond here will make any difference.
======
src/include/replication/slot.h
5. ReplicationSlotSetInactiveSince
+/*
+ * Set slot's inactive_since property unless it was previously invalidated due
+ * to inactive timeout.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+ bool acquire_lock)
+{
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ if (s->data.invalidated != RS_INVAL_INACTIVE_TIMEOUT)
+ s->inactive_since = *now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
Is the logic correct? What if the slot was already invalid due to some
reason other than RS_INVAL_INACTIVE_TIMEOUT? Is an Assert needed?
======
Kind Regards,
Peter Smith.
Fujitsu Australia
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 27b2285..97b4fb5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4582,7 +4582,7 @@ restore_command = 'copy "C:\\server\\archivedir\\%f"
"%p"' # Windows
<para>
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
- from primary server (i.e., standby slots having
+ from the primary server (i.e., standby slots having
<link
linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
value <literal>true</literal>).
Synced slots are always considered to be inactive because they don't
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d92b92b..8cc67b4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -640,7 +640,7 @@ retry:
errmsg("can no longer get changes from
replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This slot has been invalidated
because it was inactive for longer than the amount of time specified by
\"%s\".",
-
"replication_slot_inactive_timeout.")));
+
"replication_slot_inactive_timeout")));
}
/*