On Mon, Feb 19, 2024 at 9:46 AM shveta malik <shveta.ma...@gmail.com> wrote:
>
> Okay I see. Thanks for pointing it out. Here are the patches
> addressing your comments. Changes are in patch001, rest are rebased.
>

Few comments on 0001
====================
1. I think it is better to error out when the valid GUC or option is
not set in ensure_valid_slotsync_params() and
ensure_valid_remote_info() instead of waiting. And we shouldn't start
the worker in the first place if not all required GUCs are set. This
will additionally simplify the code a bit.

2.
+typedef struct SlotSyncWorkerCtxStruct
 {
- /* prevents concurrent slot syncs to avoid slot overwrites */
+ pid_t pid;
+ bool stopSignaled;
  bool syncing;
+ time_t last_start_time;
  slock_t mutex;
-} SlotSyncCtxStruct;
+} SlotSyncWorkerCtxStruct;

I think we don't need to change the name of this struct as this can be
used both by the worker and the backend. We can probably add the
comment to indicate that all the fields except 'syncing' are used by
slotsync worker.

3. Similar to above, function names like SlotSyncShmemInit() shouldn't
be changed to SlotSyncWorkerShmemInit().

4.
+ReplSlotSyncWorkerMain(int argc, char *argv[])
{
...
+ on_shmem_exit(slotsync_worker_onexit, (Datum) 0);
...
+ before_shmem_exit(slotsync_failure_callback, PointerGetDatum(wrconn));
...
}

Do we need two separate callbacks? Can't we have just one (say
slotsync_failure_callback) that cleans additional things in case of
slotsync worker? And, if we need both those callbacks then please add
some comments for both and why one is before_shmem_exit() and the
other is on_shmem_exit().

In addition to the above, I have made a few changes in the comments
and code (cosmetic code changes). Please include those in the next
version if you find them okay. You need to rename .txt file to remove
.txt and apply atop v90-0001*.

-- 
With Regards,
Amit Kapila.
diff --git a/src/backend/postmaster/postmaster.c 
b/src/backend/postmaster/postmaster.c
index 3080d4aa53..790799c164 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -463,9 +463,9 @@ static void InitPostmasterDeathWatchHandle(void);
 /*
  * Slot sync worker allowed to start up?
  *
- * If we are on a hot standby, slot sync parameter is enabled, and it is
- * the first time of worker's launch, or enough time has passed since the
- * worker was launched last, then it is allowed to be started.
+ * We allow to start the slot sync worker when we are on a hot standby,
+ * sync_replication_slots is enabled, and it is the first time of worker's
+ * launch, or enough time has passed since the worker was launched last.
  */
 #define SlotSyncWorkerAllowed()        \
        (sync_replication_slots && pmState == PM_HOT_STANDBY && \
diff --git a/src/backend/replication/logical/slotsync.c 
b/src/backend/replication/logical/slotsync.c
index 40ab87bce1..0e0f3cdf76 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1052,9 +1052,9 @@ ensure_valid_slotsync_params(void)
 /*
  * Re-read the config file.
  *
- * If any of the slot sync GUCs have changed, exit the worker and
- * let it get restarted by the postmaster. The worker to be exited for
- * restart purpose only if the caller passed restart as true.
+ * Exit if any of the slot sync GUCs have changed. The postmaster will
+ * restart it. The worker to be exited for restart purpose only if the
+ * caller passed restart as true.
  */
 static void
 slotsync_reread_config(bool restart)
@@ -1295,9 +1295,9 @@ ReplSlotSyncWorkerMain(int argc, char *argv[])
        dbname = walrcv_get_dbname_from_conninfo(PrimaryConnInfo);
 
        /*
-        * Connect to the database specified by user in primary_conninfo. We 
need
-        * a database connection for walrcv_exec to work. Please see comments 
atop
-        * libpqrcv_exec.
+        * Connect to the database specified by the user in primary_conninfo. We
+        * need a database connection for walrcv_exec to work which we use to 
fetch
+        * slot information from the remote node. See comments atop 
libpqrcv_exec.
         */
        InitPostgres(dbname, InvalidOid, NULL, InvalidOid, 0, NULL);
 
@@ -1310,12 +1310,11 @@ ReplSlotSyncWorkerMain(int argc, char *argv[])
                appendStringInfo(&app_name, "%s", "slotsyncworker");
 
        /*
-        * Establish the connection to the primary server for slots
+        * Establish the connection to the primary server for slot
         * synchronization.
         */
        wrconn = walrcv_connect(PrimaryConnInfo, false, false, false,
-                                                       app_name.data,
-                                                       &err);
+                                                       app_name.data, &err);
        pfree(app_name.data);
 
        if (!wrconn)
@@ -1332,7 +1331,7 @@ ReplSlotSyncWorkerMain(int argc, char *argv[])
         */
        ensure_valid_remote_info(wrconn);
 
-       /* Main wait loop */
+       /* Main loop to synchronize slots */
        for (;;)
        {
                bool            some_slot_updated = false;
@@ -1345,7 +1344,7 @@ ReplSlotSyncWorkerMain(int argc, char *argv[])
        }
 
        /*
-        * The slot sync worker can not get here because it will only stop when 
it
+        * The slot sync worker can't get here because it will only stop when it
         * receives a SIGINT from the startup process, or when there is an 
error.
         */
        Assert(false);

Reply via email to