Here are some review comments for the patch v20240805_2-0003.

======
doc/src/sgml/catalogs.sgml

nitpick - removed the word "either"

======
doc/src/sgml/ref/alter_subscription.sgml

I felt the discussions about "how to handle warnings" are a bit scattered:
e.g.1 - ALTER SUBSCRIPTION REFRESH PUBLICATION copy data referred to
CREATE SUBSCRIPTION copy data.
e.g.2 - ALTER SUBSCRIPTION REFRESH explains what to do, but now the
explanation is in 2 places.
e.g.3 - CREATE SUBSCRIPTION copy data explains what to do (again), but
IMO it belongs better in the common "Notes" part

FYI, I've moved all the information to one place (in the CREATE
SUBSCRIPTION "Notes") and others refer to this central place. See the
attached nitpicks diff.

REFRESH PUBLICATION copy_data
nitpick - now refers to  CREATE SUBSCRIPTION "Notes". I also moved it
to be nearer to the other sequence stuff.

REFRESH PUBLICATION SEQUENCES:
nitpick - now refers to CREATE SUBSCRIPTION "Notes".

======
doc/src/sgml/ref/create_subscription.sgml

REFRESH PUBLICATION copy_data
nitpick - now refers to CREATE SUBSCRIPTION "Notes"

Notes:
nitpick - the explanation of, and what to do about sequence WARNINGS,
is moved to here

======
src/backend/commands/sequence.c

pg_sequence_state:
nitpick - I just moved the comment in pg_sequence_state() to below the
NOTE, which talks about "page LSN".

======
src/backend/catalog/pg_subscription.c

1. HasSubscriptionRelations

Should function 'HasSubscriptionRelations' be renamed to
'HasSubscriptionTables'?

~~~

GetSubscriptionRelations:
nitpick - tweak some "skip" comments.

======
src/backend/commands/subscriptioncmds.c

2. CreateSubscription

  tables = fetch_table_list(wrconn, publications);
- foreach(lc, tables)
+ foreach_ptr(RangeVar, rv, tables)
+ {
+ Oid relid;
+
+ relid = RangeVarGetRelid(rv, AccessShareLock, false);
+
+ /* Check for supported relkind. */
+ CheckSubscriptionRelkind(get_rel_relkind(relid),
+ rv->schemaname, rv->relname);
+
+ AddSubscriptionRelState(subid, relid, table_state,
+ InvalidXLogRecPtr, true);
+ }
+
+ /* Add the sequences in init state */
+ sequences = fetch_sequence_list(wrconn, publications);
+ foreach_ptr(RangeVar, rv, sequences)

These 2 loops (first for tables and then for sequences) seem to be
executing the same code. If you wanted you could combine the lists
up-front, and then have one code loop instead of 2. It would mean less
code. OTOH, maybe the current code is more readable? I am not sure
what is best, so just bringing this to your attention.

~~~

AlterSubscription_refresh:
nitpick = typo /indicating tha/indicating that/

~~~

3. fetch_sequence_list

+ appendStringInfoString(&cmd, "SELECT DISTINCT n.nspname, c.relname,
s.seqtypid, s.seqmin, s.seqmax, s.seqstart, s.seqincrement,
s.seqcycle"
+    " FROM pg_publication p, LATERAL
pg_get_publication_sequences(p.pubname::text) gps(relid),"
+    " pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace JOIN
pg_sequence s ON c.oid = s.seqrelid"
+    " WHERE c.oid = gps.relid AND p.pubname IN (");
+ get_publications_str(publications, &cmd, true);
+ appendStringInfoChar(&cmd, ')');

Please wrap this better to make the SQL more readable.

~~

4.
+ if (seqform->seqtypid != seqtypid || seqform->seqmin != seqmin ||
+ seqform->seqmax != seqmax || seqform->seqstart != seqstart ||
+ seqform->seqincrement != seqincrement ||
+ seqform->seqcycle != seqcycle)
+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("Sequence option in remote and local is not same for \"%s.%s\"",
+    get_namespace_name(get_rel_namespace(relid)), get_rel_name(relid)),
+ errhint("Alter/Re-create the sequence using the same options as in remote."));

4a.
Are these really known as "options"? Or should they be called
"sequence parameters", or something else, like "sequence attributes"?

4a.
Is there a way to give more helpful information by identifying what
was different in the log? OTOH, maybe it would become too messy if
there were multiple differences...

======
src/backend/replication/logical/launcher.c

5. logicalrep_sync_worker_count

- if (isTablesyncWorker(w) && w->subid == subid)
+ if ((isTableSyncWorker(w) || isSequenceSyncWorker(w)) &&
+ w->subid == subid)

You could micro-optimize this -- it may be more efficient to write the
condition the other way around.

SUGGESTION
if (w->subid == subid && (isTableSyncWorker(w) || isSequenceSyncWorker(w)))

======
.../replication/logical/sequencesync.c

File header comment:
nitpick - there seems a large cut/paste mistake (the first 2
paragraphs are almost the same).
nitpick - reworded with the help of Chat-GPT for slightly better
clarity. Also fixed a couple of typos.
nitpick - it mentioned MAX_SEQUENCES_SYNC_PER_BATCH several times so I
changed the wording of one of them

~~~

fetch_remote_sequence_data:
nitpick - all other params have the same name as sequence members, so
change the parameter name /lsn/page_lsn/

~

copy_sequence:
nitpick - rename var /seq_lsn/seq_page_lsn/

======
src/backend/replication/logical/tablesync.c

6. process_syncing_sequences_for_apply

+ * If a sequencesync worker is running already, there is no need to start a new
+ * one; the existing sequencesync worker will synchronize all the sequences. If
+ * there are still any sequences to be synced after the sequencesync worker
+ * exited, then a new sequencesync worker can be started in the next iteration.
+ * To prevent starting the sequencesync worker at a high frequency after a
+ * failure, we store its last failure time. We start the sync worker for the
+ * same relation after waiting at least wal_retrieve_retry_interval.

Why is it talking about "We start the sync worker for the same
relation ...". The sequencesync_failuretime is per sync worker, not
per relation. And, I don't see any 'same relation' check in the code.

======
src/include/catalog/pg_subscription_rel.h

GetSubscriptionRelations:
nitpick - changed parameter name /all_relations/all_states/

======
src/test/subscription/t/034_sequences.pl

nitpick - add some ########## comments to highlight the main test
parts to make it easier to read.
nitpick - fix typo /syned/synced/

7. More test cases?
IIUC you can also get a sequence mismatch warning during "ALTER ...
REFRESH PUBLICATION", and "CREATE SUBSCRIPTION". So, should those be
tested also?

======
Kind Regards,
Peter Smith.
Fujitsu Australia
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 16c427e..5c66797 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8109,7 +8109,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration 
count&gt;</replaceable>:<replaceable>&l
 
   <para>
    This catalog only contains tables and sequences known to the subscription
-   after running either
+   after running
    <link linkend="sql-createsubscription"><command>CREATE 
SUBSCRIPTION</command></link> or
   <link linkend="sql-altersubscription-params-refresh-publication">
    <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link> or
diff --git a/doc/src/sgml/ref/alter_subscription.sgml 
b/doc/src/sgml/ref/alter_subscription.sgml
index eb7d544..f280019 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -200,6 +200,12 @@ ALTER SUBSCRIPTION <replaceable 
class="parameter">name</replaceable> RENAME TO <
           <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION 
SEQUENCES</command></link>
          </para>
          <para>
+          See <xref linkend="sql-createsubscription-notes"/> for 
recommendations on how
+          to handle any warnings about differences in the sequence definition
+          between the publisher and the subscriber, which might occur when
+          <literal>copy_data = true</literal>.
+         </para>
+         <para>
           See <xref linkend="sql-createsubscription-notes"/> for details of
           how <literal>copy_data = true</literal> can interact with the
           <link 
linkend="sql-createsubscription-params-with-origin"><literal>origin</literal></link>
@@ -211,11 +217,6 @@ ALTER SUBSCRIPTION <replaceable 
class="parameter">name</replaceable> RENAME TO <
           parameter of <command>CREATE SUBSCRIPTION</command> for details about
           copying pre-existing data in binary format.
          </para>
-         <para>
-          See the <link 
linkend="sql-createsubscription-params-with-copy-data"><literal>copy_data</literal></link>
-          on how to handle the warnings regarding the difference in sequence
-          definition between the publisher and the subscriber.
-         </para>
         </listitem>
        </varlistentry>
       </variablelist></para>
@@ -230,12 +231,12 @@ ALTER SUBSCRIPTION <replaceable 
class="parameter">name</replaceable> RENAME TO <
       sequence data with the publisher. Unlike <link 
linkend="sql-altersubscription-params-refresh-publication">
       <literal>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</literal></link> 
which
       only synchronizes newly added sequences, <literal>REFRESH PUBLICATION 
SEQUENCES</literal>
-      will re-synchronize the sequence data for all subscribed sequences. The
-      sequence definition can differ between the publisher and the subscriber,
-      this is detected and a WARNING is logged to the user, but the warning is
-      only an indication of a potential problem; it is recommended to alter the
-      sequence to keep the sequence option same as the publisher and execute
-      the command again.
+      will re-synchronize the sequence data for all subscribed sequences.
+     </para>
+     <para>
+      See <xref linkend="sql-createsubscription-notes"/> for recommendations 
on how
+      to handle any warnings about differences in the sequence definition
+      between the publisher and the subscriber.
      </para>
     </listitem>
    </varlistentry>
diff --git a/doc/src/sgml/ref/create_subscription.sgml 
b/doc/src/sgml/ref/create_subscription.sgml
index de3bdb8..e28ed96 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -264,12 +264,10 @@ CREATE SUBSCRIPTION <replaceable 
class="parameter">subscription_name</replaceabl
           <literal>origin</literal> parameter.
          </para>
          <para>
-          The sequence definition can differ between the publisher and the
-          subscriber, this is detected and a WARNING is logged to the user, but
-          the warning is only an indication of a potential problem; it is
-          recommended to alter the sequence to keep the sequence option same as
-          the publisher and execute <link 
linkend="sql-altersubscription-params-refresh-publication-sequences">
-          <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION 
SEQUENCES</command></link>.
+          See <xref linkend="sql-createsubscription-notes"/> for 
recommendations on how
+          to handle any warnings about differences in the sequence definition
+          between the publisher and the subscriber, which might occur when
+          <literal>copy_data = true</literal>.
          </para>
         </listitem>
        </varlistentry>
@@ -543,6 +541,17 @@ WHERE N.nspname = PT.schemaname AND
       PT.pubname IN (&lt;pub-names&gt;);
 </programlisting></para>
 
+  <para>
+   Sequence definitions can differ between the publisher and the subscriber.
+   If this is detected, a WARNING is logged to inform the user of a
+   potential problem. It is recommended to use
+   <link linkend="sql-altersequence"><command>ALTER SEQUENCE</command></link>
+   to keep the subscriber sequence parameters the same as the publisher
+   sequence parameters. Then, execute
+   <link linkend="sql-altersubscription-params-refresh-publication-sequences">
+   <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION 
SEQUENCES</command></link>.
+  </para>
+
  </refsect1>
 
  <refsect1>
diff --git a/src/backend/catalog/pg_subscription.c 
b/src/backend/catalog/pg_subscription.c
index 8a2161e..876ab96 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -562,7 +562,7 @@ GetSubscriptionRelations(Oid subid, bool get_tables, bool 
get_sequences,
                        if (!get_sequences)
                                continue;
 
-                       /* Skip all non-init sequences if not all_states were 
requested */
+                       /* Skip all non-init sequences unless all_states was 
requested */
                        if (!all_states && (subrel->srsubstate != 
SUBREL_STATE_INIT))
                                continue;
                }
@@ -572,7 +572,7 @@ GetSubscriptionRelations(Oid subid, bool get_tables, bool 
get_sequences,
                        if (!get_tables)
                                continue;
 
-                       /* Skip all ready tables if not all_states were 
requested */
+                       /* Skip all ready tables unless all_states was 
requested */
                        if (!all_states && (subrel->srsubstate == 
SUBREL_STATE_READY))
                                continue;
                }
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index ec7d5bb..322eb71 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1898,13 +1898,13 @@ pg_sequence_last_value(PG_FUNCTION_ARGS)
 /*
  * Return the current on-disk state of the sequence.
  *
+ * Note: This is roughly equivalent to selecting the data from the sequence,
+ * except that it also returns the page LSN.
+ *
  * The page_lsn allows the user to determine if the sequence has been updated
  * since the last synchronization with the subscriber. This is done by
  * comparing the current page_lsn with the value stored in pg_subscription_rel
  * from the last synchronization.
- *
- * Note: This is roughly equivalent to selecting the data from the sequence,
- * except that it also returns the page LSN.
  */
 Datum
 pg_sequence_state(PG_FUNCTION_ARGS)
diff --git a/src/backend/replication/logical/sequencesync.c 
b/src/backend/replication/logical/sequencesync.c
index 2a5c8c5..44fa6ac 100644
--- a/src/backend/replication/logical/sequencesync.c
+++ b/src/backend/replication/logical/sequencesync.c
@@ -18,30 +18,19 @@
  * ALTER SUBSCRIPTION ... REFRESH PUBLICATION
  * ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCE
  *
- * Sequencesync worker will get the sequences that should be synchronized from
- * pg_subscription_rel catalog table. It synchronizes
- * MAX_SEQUENCES_SYNC_PER_BATCH (100) sequences within a single transaction by
- * getting the sequence value from the remote publisher and updating it to the
- * local subscriber sequence and updates the seqeunce state to READY. It also
- * updates the remote sequence's lsn to pg_subscription_rel which can be
- * later used to compare it with the pg_sequence_state page_lsn value to
- * identify if sequence is changed since the last synchronization.
- *
- * The sequencesync worker retrieves the sequences that need to be synchronized
- * from the pg_subscription_rel catalog table. It synchronizes up to
- * MAX_SEQUENCES_SYNC_PER_BATCH (100) sequences in a single transaction by
- * fetching the sequence values and the sequence's page_lsn from the remote
- * publisher and updating them in the local subscriber sequence. After
- * synchronization, it sets the sequence state to READY. This LSN can later be
- * compared with the pg_sequence_state page LSN value to determine if the
+ * The sequencesync worker retrieves the sequences to be synchronized from the
+ * pg_subscription_rel catalog table.  It synchronizes multiple sequences per
+ * single transaction by fetching the sequence value and page LSN from the
+ * remote publisher and updating them in the local subscriber sequence.  After
+ * synchronization, it sets the sequence state to READY.  The page LSN can
+ * later be compared with the pg_sequence_state page_lsn to determine if the
  * sequence has changed since the last synchronization.
  *
  * So the state progression is always just: INIT -> READY.
  *
- * Here MAX_SEQUENCES_SYNC_PER_BATCH (100) sequences are synchronized within a
- * single transaction  to avoid creating a lot of transactions and also the
- * locks on the sequence relation will be periodically released during the
- * commit transaction.
+ * To avoid creating too many transactions, up to MAX_SEQUENCES_SYNC_PER_BATCH
+ * (100) sequences are synchronized per transaction. The locks on the sequence
+ * relation will be periodically released at each transaction commit.
  *
  *-------------------------------------------------------------------------
  */
@@ -70,12 +59,12 @@
  *
  * The sequence last_value will be returned directly, while
  * log_cnt, is_called and page_lsn will be returned via the output
- * parameters log_cnt, is_called and lsn, respectively.
+ * parameters log_cnt, is_called and page_lsn, respectively.
  */
 static int64
 fetch_remote_sequence_data(WalReceiverConn *conn, Oid remoteid, char *nspname,
                                                   char *relname, int64 
*log_cnt, bool *is_called,
-                                                  XLogRecPtr *lsn)
+                                                  XLogRecPtr *page_lsn)
 {
        WalRcvExecResult *res;
        StringInfoData cmd;
@@ -114,7 +103,7 @@ fetch_remote_sequence_data(WalReceiverConn *conn, Oid 
remoteid, char *nspname,
        *is_called = DatumGetBool(slot_getattr(slot, 3, &isnull));
        Assert(!isnull);
 
-       *lsn = DatumGetLSN(slot_getattr(slot, 4, &isnull));
+       *page_lsn = DatumGetLSN(slot_getattr(slot, 4, &isnull));
        Assert(!isnull);
 
        ExecDropSingleTupleTableSlot(slot);
@@ -138,7 +127,7 @@ copy_sequence(WalReceiverConn *conn, Relation rel)
        int64           seq_last_value;
        int64           seq_log_cnt;
        bool            seq_is_called;
-       XLogRecPtr      seq_lsn = InvalidXLogRecPtr;
+       XLogRecPtr      seq_page_lsn = InvalidXLogRecPtr;
        WalRcvExecResult *res;
        Oid                     tableRow[] = {OIDOID, CHAROID};
        TupleTableSlot *slot;
@@ -185,13 +174,13 @@ copy_sequence(WalReceiverConn *conn, Relation rel)
 
        seq_last_value = fetch_remote_sequence_data(conn, remoteid, nspname,
                                                                                
                relname, &seq_log_cnt, &seq_is_called,
-                                                                               
                &seq_lsn);
+                                                                               
                &seq_page_lsn);
 
        SetSequence(RelationGetRelid(rel), seq_last_value, seq_is_called,
                                seq_log_cnt);
 
        /* return the LSN when the sequence state was set */
-       return seq_lsn;
+       return seq_page_lsn;
 }
 
 /*
diff --git a/src/include/catalog/pg_subscription_rel.h 
b/src/include/catalog/pg_subscription_rel.h
index 58abed9..1c954c9 100644
--- a/src/include/catalog/pg_subscription_rel.h
+++ b/src/include/catalog/pg_subscription_rel.h
@@ -92,6 +92,6 @@ extern void RemoveSubscriptionRel(Oid subid, Oid relid);
 extern bool HasSubscriptionRelations(Oid subid);
 extern List *GetSubscriptionRelations(Oid subid, bool get_tables,
                                                                          bool 
get_sequences,
-                                                                         bool 
all_relations);
+                                                                         bool 
all_states);
 
 #endif                                                 /* 
PG_SUBSCRIPTION_REL_H */
diff --git a/src/test/subscription/t/034_sequences.pl 
b/src/test/subscription/t/034_sequences.pl
index 88f2705..100a420 100644
--- a/src/test/subscription/t/034_sequences.pl
+++ b/src/test/subscription/t/034_sequences.pl
@@ -67,9 +67,11 @@ my $result = $node_subscriber->safe_psql(
 ));
 is($result, '100|32|t', 'initial test data replicated');
 
-# ALTER SUBSCRIPTION ... REFRESH PUBLICATION should cause sync of new
+##########
+## ALTER SUBSCRIPTION ... REFRESH PUBLICATION should cause sync of new
 # sequences of the publisher, but changes to existing sequences should
 # not be synced.
+##########
 
 # Create a new sequence 'regress_s2', and update existing sequence 'regress_s1'
 $node_publisher->safe_psql(
@@ -105,9 +107,11 @@ $result = $node_subscriber->safe_psql(
 is($result, '100|32|t',
        'REFRESH PUBLICATION will sync newly published sequence');
 
-# ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES should cause sync of
+##########
+## ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES should cause sync of
 # new sequences of the publisher, and changes to existing sequences should
 # also be synced.
+##########
 
 # Create a new sequence 'regress_s3', and update the existing sequence
 # 'regress_s2'.
@@ -128,7 +132,7 @@ $result = $node_subscriber->safe_psql(
 $node_subscriber->poll_query_until('postgres', $synced_query)
   or die "Timed out while waiting for subscriber to synchronize data";
 
-# Check - existing sequences are syned
+# Check - existing sequences are synced
 $result = $node_subscriber->safe_psql(
        'postgres', qq(
        SELECT last_value, log_cnt, is_called FROM regress_s1;
@@ -150,8 +154,10 @@ $result = $node_subscriber->safe_psql(
 is($result, '100|32|t',
        'REFRESH PUBLICATION SEQUENCES will sync newly published sequence');
 
+##########
 # ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES should throw a warning
 # for sequence definition not matching between the publisher and the 
subscriber.
+##########
 
 # Create a new sequence 'regress_s4' whose START value is not the same in the
 # publisher and subscriber.
@@ -165,6 +171,7 @@ $node_subscriber->safe_psql(
        CREATE SEQUENCE regress_s4 START 10 INCREMENT 2;
 ));
 
+# Do ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES
 ($result, my $stdout, my $stderr) = $node_subscriber->psql(
        'postgres', "
         ALTER SUBSCRIPTION regress_seq_sub REFRESH PUBLICATION SEQUENCES");

Reply via email to