On Fri, Apr 3, 2026 at 9:46 AM shveta malik <[email protected]> wrote:
>
> On Thu, Apr 2, 2026 at 3:55 PM Ashutosh Sharma <[email protected]> wrote:
> >
> > Hi Shveta,
> >
> > On Wed, Apr 1, 2026 at 12:06 PM shveta malik <[email protected]> wrote:
> > >
> > > On Thu, Mar 26, 2026 at 5:23 PM Ashutosh Sharma <[email protected]> wrote:
> > > >
> > > >
> > > > PFA patch addressing all the comments above and let me know for any
> > > > further comments.
> > > >
> > >
> > > Thank You Ashutosh. Doc looks good to me. Few comments:
> > >
> > > 3)
> > > What is the execution time for this new test?
> > > I ran it on my VM (which is slightly on the slower side), and the
> > > runtime varies between ~60 seconds and ~140 seconds. I executed it
> > > around 10–15 times. Most runs completed in about 65 seconds (which is
> > > still more), but a few were significantly longer (100+ seconds).
> > > During the longer runs, I noticed the following entry in pub.log
> > > (possibly related to Test Scenario E taking more time?). Could you
> > > please try running this on your end as well?
> > >
> > > 2026-03-31 19:45:45.557 IST client backend[145705]
> > > 053_synchronized_standby_slots_quorum.pl LOG:  statement: SELECT
> > > active_pid IS NOT NULL
> > >   AND restart_lsn IS NOT NULL
> > >   AND restart_lsn < '0/03000450'::pg_lsn
> > > FROM pg_replication_slots
> > > WHERE slot_name = 'sb1_slot';
> > >
> > > Just for reference, the complete  failover test
> > > (t/040_standby_failover_slots_sync.pl) takes somewhere between 7 to
> > > 10sec on my VM.
> > >
> >
> > My concern with this new test is that it's both slow to run and prone
> > to flakiness, which makes me question whether it's worth keeping.
> >
>
> will review and share my thoughts.
>

I gave it more thought. Another idea for a shorter and quicker
testcase is to check the wait_event for that particular
application_name in pg_stat_activity. A lagging standby will result in
wait_event=WaitForStandbyConfirmation with backend_type=walsender.
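
For illustration, a query along these lines could be used to look at the
waiting session by hand. This is only a sketch: the columns are standard
pg_stat_activity columns, but which backend_type shows up is an
assumption that depends on how the slot is consumed (a walsender for a
streaming client, a regular backend when a SQL decoding function is
called from psql).

```sql
-- Sketch: list the sessions currently waiting for standby confirmation.
SELECT pid, application_name, backend_type, wait_event_type, wait_event
FROM pg_stat_activity
WHERE wait_event = 'WaitForStandbyConfirmation';
```

An empty result would mean that no session is blocked on a lagging
standby at that moment.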

I have put sample code for this in the attached txt file; please have
a look. I discussed it with Hou-San offline, and he is okay with this
idea. Please see if it works and change it as needed.

thanks
Shveta
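# Check that logical decoding on the primary blocks on a lagging standby
# when synchronized_standby_slots is set to $mode.
sub test_lagging_standbys
{
        my ($mode, $label, $message) = @_;

        # Apply the quorum setting under test.
        $primary->adjust_conf('postgresql.conf',
                'synchronized_standby_slots', "'$mode'");
        $primary->reload;

        # Generate WAL for the logical slot to decode.
        $primary->safe_psql('postgres',
                "SELECT pg_logical_emit_message(true, 'qtest', '$message');"
        );

        # standby2 catches up; standby1 feedback is held back by the caller.
        $primary->wait_for_replay_catchup($standby2);

        my $bg = $primary->background_psql(
                'postgres',
                on_error_stop => 0,
                timeout => $PostgreSQL::Test::Utils::timeout_default);

        # Start decoding in a background session; it is expected to block.
        # The query text ends with a newline so that psql reads the SELECT
        # as a complete line.
        $bg->query_until(
                qr/decode_start/, q(
   \echo decode_start
   SELECT pg_logical_slot_peek_changes('logical_failover', NULL, NULL);
));

        # The blocked session should show up with the expected wait event.
        ok( $primary->poll_query_until(
                        'postgres', q{
SELECT EXISTS (
        SELECT 1
        FROM pg_stat_activity
        WHERE wait_event = 'WaitForStandbyConfirmation');
        }),
                $label);

        # Reset the setting so that the blocked session can finish.
        $primary->adjust_conf('postgresql.conf',
                'synchronized_standby_slots', "''");
        $primary->reload;

        $bg->quit;
}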


# Hold back standby1 feedback
$standby1->adjust_conf(
        'postgresql.conf',
        'wal_receiver_status_interval',
        "'0'"
);
$standby1->reload;

# FIRST 1 must wait
test_lagging_standbys(
        "FIRST 1 (sb1_slot, sb2_slot)",
        'FIRST 1 waits for lagging higher-priority slot',
        'first_1_lagging_blocks'
);

# ANY 2 must wait
test_lagging_standbys(
        "ANY 2 (sb1_slot, sb2_slot)",
        'ANY 2 waits for lagging sb1_slot slot',
        'any_2_lagging_blocks'
);
