On Fri, Mar 27, 2026 at 10:49 PM Fujii Masao <[email protected]> wrote: > > On Fri, Mar 27, 2026 at 9:38 PM Nisha Moond <[email protected]> wrote: > > Attached the updated patch. > > Thanks for updating the patch! It looks good overall. >
Thank you Fujii-san for the review.
> Regarding the comments in SlotSyncCtxStruct, since the role of
> stopSignaled field has changed, those comments should be updated
> accordingly? For example,
>
> -------------------------
> - * the SQL function pg_sync_replication_slots(). When the startup process
> sets
> - * 'stopSignaled' during promotion, it uses this 'pid' to wake up the
> currently
> - * synchronizing process so that the process can immediately stop its
> - * synchronizing work on seeing 'stopSignaled' set.
> - * Setting 'stopSignaled' is also used to handle the race condition when the
> + * the SQL function pg_sync_replication_slots(). On promotion,
> + * the startup process sets 'stopSignaled' and uses this 'pid' to wake up
> + * the currently synchronizing process so that the process can
> + * immediately stop its synchronizing work.
> + * Setting 'stopSignaled' is used to handle the race condition when the
> -------------------------
>
Updated as suggested.
>
> +/*
> + * Interrupt flag set when PROCSIG_SLOTSYNC_MESSAGE is received, asking the
> + * slotsync worker or pg_sync_replication_slots() to stop because
> + * standby promotion has been triggered.
> + */
> +volatile sig_atomic_t SlotSyncShutdown = false;
>
> For the interrupt flag set in procsignal_sigusr1_handler(), other flags
> use a *Pending suffix (e.g., ProcSignalBarrierPending,
> ParallelApplyMessagePending), so SlotSyncShutdownPending would
> be more consistent.
>
>
> +void
> +HandleSlotSyncMessage(void)
>
> Functions called from ProcessInterrupts() typically use the Process* prefix
> (e.g., ProcessProcSignalBarrier(), ProcessParallelApplyMessages()),
> so ProcessSlotSyncMessage would be more consistent than HandleSlotSyncMessage.
>
Agree, fixed.
>
> + ereport(LOG,
> + errmsg("replication slot synchronization worker will stop because
> promotion is triggered"));
> +
> + proc_exit(0);
> + }
> + else
> + {
> + /*
> + * For the backend executing SQL function
> + * pg_sync_replication_slots().
> + */
> + ereport(ERROR,
> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("replication slot synchronization will stop because promotion
> is triggered"));
>
> The log messages say "will stop", but since sync hasn't started yet,
> "will not start" seems clearer here. For example, "replication slot
> synchronization worker will not start because promotion was triggered"
> and "replication slot synchronization will not start because promotion was
> triggered". Thought?
>
We were using the same log message in two places:
check_and_set_sync_info() and HandleSlotSyncMessage().
I think “will not start” fits better in the first case, while “will
stop” makes sense to keep in the second.
--
Thanks,
Nisha
v5-0001-Prevent-slotsync-worker-API-hang-during-standby-p.patch
Description: Binary data
