On Tue, Feb 17, 2026 at 9:45 AM Amit Kapila <[email protected]> wrote:
>
> On Tue, Feb 17, 2026 at 9:13 AM shveta malik <[email protected]> wrote:
> >
> > On Mon, Feb 16, 2026 at 4:35 PM Amit Kapila <[email protected]> wrote:
> > >
> > > On Fri, Feb 13, 2026 at 7:54 AM Zhijie Hou (Fujitsu)
> > > <[email protected]> wrote:
> > > >
> > > > Thanks for pushing! Here are the remaining patches.
> > > >
> > >
> > > One thing that bothers me about the remaining patch is that it could
> > > lead to infinite re-tires in the worst case. For example, in first
> > > try, slot-1 is not synced say due to physical replication delays in
> > > flushing WALs up to the confirmed_flush_lsn of that slot, then in next
> > > (re-)try, the same thing happened for slot-2, then in next (re-)try,
> > > slot-3 appears to invalidated on standby but it is valid on primary,
> > > and so on. What do you think?
> >
> > Yes, that is a possibility we cannot rule out. This can also happen
> > during the first invocation of the API (even without the new changes)
> > when we attempt to create new slots, they may remain in a temporary
> > state indefinitely. However, that risk is limited to the initial sync,
> > until the slots are persisted, which is somewhat expected behavior.
> >
>
> Right.
>
> > With the current changes though, the possibility of an indefinite wait
> > exists during every run. So the question becomes: what would be more
> > desirable for users -- for the API to finish with the risk that a few
> > slots are not synced, or for the API to wait longer to ensure that all
> > slots are properly synced?
> >
> > I think that if the primary use case of this API is when a user plans
> > to run it before a scheduled failover, then it would be better for the
> > API to wait and ensure everything is properly synced.
> >
>
> I don't think we can guarantee that all slots are synced as per latest
> primary state in one invocation because some newly created slots can
> anyway be missed.
Oh, right.
> So why take the risk of infinite waits in the API? I
> think it may be better to extend the usage of this API (probably with
> more parameters) based on more user feedback.
I agree.
> > But I am not
> > very very sure on the use case though. What do you think?
> >
> > > Independent of whether we consider the entire patch, the following bit
> > > in the patch in useful as we retry to sync the slots via API.
> > > @@ -218,7 +219,7 @@ update_local_synced_slot(RemoteSlot *remote_slot,
> > > Oid remote_dbid)
> > > * Can get here only if GUC 'synchronized_standby_slots' on the
> > > * primary server was not configured correctly.
> > > */
> > > - ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR,
> > > + ereport(LOG,
> > > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > > errmsg("skipping slot synchronization because the received slot sync"
> > > " LSN %X/%08X for slot \"%s\" is ahead of the standby position
> > > %X/%08X",
> > >
> >
> > yes. I agree.
> >
>
> Let's wait for Hou-San's opinion on this one.
>
Sure.
thanks
Shveta