On Tue, Nov 28, 2023 at 4:12 PM vignesh C <vignes...@gmail.com> wrote:
>

Few comments on the latest patch:
===========================
1.
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n");
+ else
+ appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n");
+
+ if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " s.subenabled\n");
+ else
+ appendPQExpBufferStr(query, " false AS subenabled\n");
+
+ appendPQExpBufferStr(query,
+ "FROM pg_subscription s\n");
+
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query,
+ "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+ "    ON o.external_id = 'pg_' || s.oid::text \n");

Why 'subenabled' have a check for binary_upgrade but
'suboriginremotelsn' doesn't?

2.
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+ Relation rel;
+ HeapTuple tup;
+ Oid subid;
+ Form_pg_subscription form;
+ char    *subname;
+ Oid relid;
+ char relstate;
+ XLogRecPtr sublsn;
+
+ CHECK_IS_BINARY_UPGRADE;
+
+ /* We must check these things before dereferencing the arguments */
+ if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+ elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is
not allowed");
+
+ subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ relid = PG_GETARG_OID(1);
+ relstate = PG_GETARG_CHAR(2);
+ sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+ tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+ if (!HeapTupleIsValid(tup))
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relation %u does not exist", relid));
+ ReleaseSysCache(tup);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);

Why there is no locking for relation? I see that during subscription
operation, we do acquire AccessShareLock on the relation before adding
a corresponding entry in pg_subscription_rel. See the following code:

CreateSubscription()
{
...
foreach(lc, tables)
{
RangeVar   *rv = (RangeVar *) lfirst(lc);
Oid relid;

relid = RangeVarGetRelid(rv, AccessShareLock, false);

/* Check for supported relkind. */
CheckSubscriptionRelkind(get_rel_relkind(relid),
rv->schemaname, rv->relname);

AddSubscriptionRelState(subid, relid, table_state,
InvalidXLogRecPtr);
...
}

3.
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
{
...
...
+ AddSubscriptionRelState(subid, relid, relstate, sublsn);
...
}

I see a problem with directly using this function which is that it
doesn't release locks which means it expects either the caller to
release those locks or postpone to release them at the transaction
end. However, all the other binary_upgrade support functions don't
postpone releasing locks till the transaction ends. I think we should
add an additional parameter to indicate whether we want to release
locks and then pass it true from the binary upgrade support function.

4.
extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
  int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);

getSubscriptions() and getSubscriptionTables() are defined in the
opposite order in .c file. I think it is better to change the order in
.c file unless there is a reason for not doing so.

5. At this stage, no need to update/send the 0002 patch, we can look
at it after the main patch is committed. That is anyway not directly
related to the main patch.

Apart from the above, I have modified a few comments and messages in
the attached. Kindly review and include the changes if you are fine
with those.

-- 
With Regards,
Amit Kapila.
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4a4bafba11..a4e723f922 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -5014,18 +5014,21 @@ dumpSubscription(Archive *fout, const SubscriptionInfo 
*subinfo)
 
        appendPQExpBufferStr(query, ");\n");
 
+       /*
+        * In binary-upgrade mode, we allow the replication to continue after 
the
+        * upgrade.
+        */
        if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
        {
                if (subinfo->suboriginremotelsn)
                {
                        /*
                         * Preserve the remote_lsn for the subscriber's 
replication
-                        * origin. This value will be stale if the publisher 
gets
-                        * upgraded, we don't have a mechanism to distinguish 
this
-                        * scenario currently. There is no problem even if the 
remote_lsn
-                        * is updated with a stale value in this case as 
upgrade ensures
-                        * that all the transactions will be replicated before 
upgrading
-                        * the publisher.
+                        * origin. This value is required to start the 
replication from the
+                        * position before the upgrade. This value will be 
stale if the
+                        * publisher gets upgraded before the subscriber node. 
However,
+                        * this shouldn't be a problem as the upgrade ensures 
that all the
+                        * transactions were replicated before upgrading the 
publisher.
                         */
                        appendPQExpBufferStr(query,
                                                                 "\n-- For 
binary upgrade, must preserve the remote_lsn for the subscriber's replication 
origin.\n");
@@ -5037,6 +5040,10 @@ dumpSubscription(Archive *fout, const SubscriptionInfo 
*subinfo)
 
                if (strcmp(subinfo->subenabled, "t") == 0)
                {
+                       /*
+                        * Enable the subscription to allow the replication to 
continue
+                        * after the upgrade.
+                        */
                        appendPQExpBufferStr(query,
                                                                 "\n-- For 
binary upgrade, must preserve the subscriber's running state.\n");
                        appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s 
ENABLE;\n", qsubname);
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 4d6ae77e2d..9fd1417f0a 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -123,8 +123,8 @@ check_and_dump_old_cluster(bool live_check)
                check_old_cluster_for_valid_slots(live_check);
 
                /*
-                * Subscription dependencies can be migrated since PG17. See 
comments
-                * atop get_db_subscription_count().
+                * Subscription and its dependencies can be migrated since 
PG17. See
+                * comments atop get_db_subscription_count().
                 */
                check_old_cluster_subscription_state();
        }
@@ -1554,7 +1554,8 @@ check_new_cluster_logical_replication_slots(void)
  * check_new_cluster_subscription_configuration()
  *
  * Verify that the max_replication_slots configuration specified is enough for
- * creating the subscriptions.
+ * creating the subscriptions. This is required to create the replication
+ * origin for each subscription.
  */
 static void
 check_new_cluster_subscription_configuration(void)
@@ -1564,7 +1565,7 @@ check_new_cluster_subscription_configuration(void)
        int                     nsubs_on_old;
        int                     max_replication_slots;
 
-       /* Logical slots can be migrated since PG17. */
+       /* Subscriptions and its dependencies can be migrated since PG17. */
        if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
                return;
 
@@ -1726,15 +1727,18 @@ check_old_cluster_subscription_state(void)
                }
 
                /*
-                * A slot not created yet refers to the 'i' (initialize) state, 
while
-                * 'r' (ready) state refers to a slot created previously but 
already
-                * dropped. These states are supported for pg_upgrade. The other
+                * We don't allow upgrade if there is a risk of dangling slot 
or origin
+                * corresponding to initial sync after upgrade.
+                *
+                * A slot/origin not created yet refers to the 'i' (initialize) 
state,
+                * while 'r' (ready) state refers to a slot/origin created 
previously but
+                * already dropped. These states are supported for pg_upgrade. 
The other
                 * states listed below are not supported:
                 *
                 * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this 
state
                 * would retain a replication slot, which could not be dropped 
by the
                 * sync worker spawned after the upgrade because the 
subscription ID
-                * tracked by the publisher does not match anymore.
+                * used for the slot name won't match anymore.
                 *
                 * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this 
state
                 * would retain the replication origin when there is a failure 
in
@@ -1786,6 +1790,7 @@ check_old_cluster_subscription_state(void)
                fclose(script);
                pg_log(PG_REPORT, "fatal");
                pg_fatal("Your installation contains subscriptions without 
origin or having relations not in i (initialize) or r (ready) state.\n"
+                                "You can allow the initial sync to finish for 
all relations and then restart the upgrade.\n"
                                 "A list of the problem subscriptions is in the 
file:\n"
                                 "    %s", output_path);
        }
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index fb8250002f..cc73c0fc0c 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -738,7 +738,7 @@ count_old_cluster_logical_slots(void)
 /*
  * get_db_subscription_count()
  *
- * Gets the number of subscription count of the database.
+ * Gets the number of subscriptions of the database referred to by "dbinfo".
  *
  * Note: This function will not do anything if the old cluster is pre-PG17.
  * This is because before that the logical slots are not upgraded, so we will

Reply via email to