On Tue, Feb 13, 2024 at 9:38 AM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote: > > Here is the V85_2 patch set that added the test and fixed one typo, > there are no other code changes. >
Few comments on the latest changes: ============================== 1. +# Confirm that the invalidated slot has been dropped. +$standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid 5/, + $log_offset); Is it okay to hardcode dbid 5? I am a bit worried that it can lead to instability in the test. 2. +check_primary_info(WalReceiverConn *wrconn, int elevel) +{ .. + bool primary_info_valid; I don't think for 0001, we need an elevel as an argument, so let's remove it. Additionally, can we change the variable name primary_info_valid to primary_slot_valid? Also, can we change the function name to validate_remote_info() as the remote can be both primary or standby? 3. +SyncReplicationSlots(WalReceiverConn *wrconn) +{ + PG_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, PointerGetDatum(wrconn)); + { + check_primary_info(wrconn, ERROR); + + synchronize_slots(wrconn); + } + PG_END_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, PointerGetDatum(wrconn)); + + walrcv_disconnect(wrconn); It is better to disconnect in the caller where we have made the connection. -- With Regards, Amit Kapila.
diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml index bfa2a13377..a9ff6acd31 100644 --- a/doc/src/sgml/logicaldecoding.sgml +++ b/doc/src/sgml/logicaldecoding.sgml @@ -365,19 +365,18 @@ postgres=# select * from pg_logical_slot_get_changes('regression_slot', NULL, NU <title>Replication Slot Synchronization</title> <para> The logical replication slots on the primary can be synchronized to - the hot standby by enabling <literal>failover</literal> during slot - creation (e.g., using the <literal>failover</literal> parameter of + the hot standby by using the <literal>failover</literal> parameter of <link linkend="pg-create-logical-replication-slot"> <function>pg_create_logical_replication_slot</function></link>, or using the <link linkend="sql-createsubscription-params-with-failover"> <literal>failover</literal></link> option of - <command>CREATE SUBSCRIPTION</command>), and then calling + <command>CREATE SUBSCRIPTION</command> during slot creation, and then calling <link linkend="pg-sync-replication-slots"> <function>pg_sync_replication_slots</function></link> on the standby. For the synchronization to work, it is mandatory to - have a physical replication slot between the primary and the standby (e.g., + have a physical replication slot between the primary and the standby aka <link linkend="guc-primary-slot-name"><varname>primary_slot_name</varname></link> - should be configured on the standby), and + should be configured on the standby, and <link linkend="guc-hot-standby-feedback"><varname>hot_standby_feedback</varname></link> must be enabled on the standby. It is also necessary to specify a valid <literal>dbname</literal> in the diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c index feb04e1451..eb238cef92 100644 --- a/src/backend/replication/logical/slotsync.c +++ b/src/backend/replication/logical/slotsync.c @@ -304,6 +304,12 @@ reserve_wal_for_local_slot(XLogRecPtr restart_lsn) * number. However, if no WAL segment files have been removed by a * checkpoint since startup, we need to search for the oldest segment * file from the current timeline existing in XLOGDIR. + * + * XXX: Currently, we are searching for the oldest segment in the + * current timeline as there is less chance of the slot's restart_lsn + * from being some prior timeline, and even if it happens, in the worst + * case, we will wait to sync till the slot's restart_lsn moved to the + * current timeline. */ oldest_segno = XLogGetLastRemovedSegno() + 1; @@ -685,11 +691,10 @@ synchronize_slots(WalReceiverConn *wrconn) } /* - * Checks the primary server info. + * Checks the remote server info. * - * Using the specified primary server connection, check whether we are a - * cascading standby. It also validates primary_slot_name for non-cascading - * standbys. + * We ensure that the 'primary_slot_name' exists on the remote server and + * the remote server is not a standby node. */ static void check_primary_info(WalReceiverConn *wrconn, int elevel)