Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication

Ashutosh Sharma Wed, 25 Feb 2026 23:42:35 -0800

Hi Amit,

On Thu, Feb 26, 2026 at 11:50 AM Amit Kapila <[email protected]> wrote:
>
> On Thu, Feb 26, 2026 at 10:28 AM Ashutosh Sharma <[email protected]> 
> wrote:
> >
> >
> > > >
> > > > Proposal:
> > > >
> > > > Make synchronized_standby_slots quorum aware i.e. extend the GUC to 
> > > > accept an ANY M (slot1, slot2, ...) syntax similar to 
> > > > synchronous_standby_names, so StandbySlotsHaveCaughtup() can return 
> > > > true when M of N slots (where M <= N and M >= 1) have caught up. I 
> > > > still prefer two different GUCs for this as the list of slots to be 
> > > > synchronized can still be different (for example, DBA may want to 
> > > > ensure Geo standby to be sync before allowing the logical decoding 
> > > > client to read the changes). I kept synchronized_standby_slots parse
> > > > logic similar to synchronous_standby_names to keep things simple. The
> > > > default behavior is also not changed for synchronized_standby_slots.
> > > >
> ...
> >
> > Thinking about this further, using quorum settings for
> > synchronized_standby_slots can/will certainly result in at least one
> > sync standby lagging behind the logical replica, making it probably
> > impossible to continue with the existing logical replication setup
> > after a failover to the standby that lags behind. Here is what I
> > mean:
> >
>
> But won't that be true even for synchronous_standby_names? I think in
> the case of quorum, it is the responsibility of the failover solution
> to select the most recent synced standby among all the standbys
> specified in synchronous_standby_names. Similarly here, before failing
> over the logical subscriber to one of the physical standbys, the
> failover tool needs to ensure it is switching over to the synced
> replica. We have
> given steps in the docs [1] that could be used to identify the replica
> where the subscriber can switchover. Will that address your concern?
>


Here's my understanding of this:

I don't think we should be comparing "synchronous_standby_names" with
"synchronized_standby_slots", even though they appear similar in
purpose. All values listed in synchronous_standby_names represent
synchronous standbys exclusively, whereas synchronized_standby_slots
can hold values for both synchronous and asynchronous standbys. In
other words, every server referenced by synchronous_standby_names is
of the same type, but that may not be the case with
synchronized_standby_slots.

If a GUC can hold values of different types (sync vs. async), does it
really make sense to use a qualifier like ANY 1 (val1, val2) when val1
and val2 are different in nature? For example, suppose val1 is a
synchronous standby and val2 is an asynchronous standby, and we
configure ANY 1 (val1, val2). Since val2 alone satisfies the quorum,
it's possible for val2 to get ahead of val1 in terms of replication
progress, which in turn could mean the logical replica is also ahead
of val1. So if we were to fail over to val1 (since it's the only
synchronous standby), we would not be able to continue with the
existing logical replication setup.
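
To make the scenario concrete, here is a toy Python model of it. This is
only a sketch and not PostgreSQL code: LSNs are modelled as plain
integers, the slot positions are made-up numbers, and the function names
(slots_have_caught_up, subscriber_can_follow) are hypothetical helpers
that only mimic the "M of N slots have caught up" check from the
proposal.

```python
from typing import Dict


def slots_have_caught_up(slot_lsns: Dict[str, int], wait_lsn: int, quorum: int) -> bool:
    """Return True when at least `quorum` slots have reached `wait_lsn`.

    quorum == len(slot_lsns) models the current "all listed slots" behaviour;
    quorum == 1 models the proposed ANY 1 (val1, val2) setting.
    """
    caught_up = sum(1 for lsn in slot_lsns.values() if lsn >= wait_lsn)
    return caught_up >= quorum


def subscriber_can_follow(standby_lsn: int, subscriber_lsn: int) -> bool:
    """After promotion, the subscriber must not be ahead of the new primary."""
    return standby_lsn >= subscriber_lsn


# Assumed positions: val1 (synchronous standby) lags, val2 (asynchronous) is ahead.
slot_lsns = {"val1": 100, "val2": 200}
wait_lsn = 200  # position the logical walsender would like to send up to

all_mode = slots_have_caught_up(slot_lsns, wait_lsn, quorum=len(slot_lsns))
any_one = slots_have_caught_up(slot_lsns, wait_lsn, quorum=1)

# With ANY 1, val2 alone releases the changes, so the subscriber reaches wait_lsn.
subscriber_lsn = wait_lsn if any_one else min(slot_lsns.values())

print("all slots required, changes released:", all_mode)
print("ANY 1, changes released:", any_one)
print("subscriber can follow val1 after failover:",
      subscriber_can_follow(slot_lsns["val1"], subscriber_lsn))
```

With every listed slot required, the changes are held back until val1
catches up. With a quorum of one they are released on the strength of
val2 alone, and the subscriber ends up ahead of val1.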

Please correct me if I have misunderstood anything here.

--
With Regards,
Ashutosh Sharma.
