Hi,

On 4/28/23 5:55 AM, Amit Kapila wrote:
On Wed, Apr 26, 2023 at 7:53 PM Drouvot, Bertrand
<bertranddrouvot...@gmail.com> wrote:

+# Get the restart_lsn from an invalidated slot
+my $restart_lsn = $node_standby->safe_psql('postgres',
+ "SELECT restart_lsn from pg_replication_slots WHERE slot_name =
'vacuum_full_activeslot' and conflicting is true;"
+);
+
+chomp($restart_lsn);
+
+# Get the WAL file name associated to this lsn on the primary
+my $walfile_name = $node_primary->safe_psql('postgres',
+ "SELECT pg_walfile_name('$restart_lsn')");
+
+chomp($walfile_name);
+
+# Check the WAL file is still on the primary
+ok(-f $node_primary->data_dir . '/pg_wal/' . $walfile_name,
+ "WAL file still on the primary");

How is it guaranteed that the WAL file corresponding to the
invalidated slot on standby will still be present on primary?

The slot(s) have been invalidated by the "vacuum full" test just above
this one. So I think the WAL we are looking for is the last one being used
by the primary. As no activity happened on it since the vacuum full it looks to
me that it should still be present.

But I may have missed something and maybe that's not guarantee that this WAL is 
still there in all the cases.
In that case I think it's better to remove this test (it does not provide added 
value here).

Test removed in V7 attached.

Can you
please explain the logic behind this test a bit more like how the WAL
file switch helps you to achieve the purpose?


The idea was to generate enough "wal switch" on the primary to ensure
the WAL file has been removed.

I gave another thought on it and I think we can skip the test that the WAL is
not on the primary any more. That way, one "wal switch" seems to be enough
to see it removed on the standby.

It's done in V7.

V7 is not doing "extra tests" than necessary and I think it's probably better 
like this.

I can see V7 failing on "Cirrus CI / macOS - Ventura - Meson" only (other 
machines are not complaining).

It does fail on "invalidated logical slots do not lead to retaining WAL", see 
https://cirrus-ci.com/task/4518083541336064

I'm not sure why it is failing, any idea?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 2ab08214415023505244c954a6a5ebf42ec9aebb Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <bertranddrouvot...@gmail.com>
Date: Fri, 28 Apr 2023 07:27:20 +0000
Subject: [PATCH v7] Add a test to verify that invalidated logical slots do not
 lead to retaining WAL.

---
 .../t/035_standby_logical_decoding.pl         | 39 ++++++++++++++++++-
 1 file changed, 37 insertions(+), 2 deletions(-)
 100.0% src/test/recovery/t/

diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl 
b/src/test/recovery/t/035_standby_logical_decoding.pl
index 66d264f230..b32c1002b0 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -500,9 +500,44 @@ $node_standby->restart;
 check_slots_conflicting_status(1);
 
 ##################################################
-# Verify that invalidated logical slots do not lead to retaining WAL
+# Verify that invalidated logical slots do not lead to retaining WAL.
 ##################################################
-# XXXXX TODO
+
+# Before removing WAL file(s), ensure the cascading standby catch up
+$node_standby->wait_for_replay_catchup($node_cascading_standby,
+       $node_primary);
+
+# Get the restart_lsn from an invalidated slot
+my $restart_lsn = $node_standby->safe_psql('postgres',
+       "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 
'vacuum_full_activeslot' and conflicting is true;"
+);
+
+chomp($restart_lsn);
+
+# As pg_walfile_name() can not be executed on the standby,
+# get the WAL file name associated to this lsn from the primary
+my $walfile_name = $node_primary->safe_psql('postgres',
+       "SELECT pg_walfile_name('$restart_lsn')");
+
+chomp($walfile_name);
+
+# Generate some activity and switch WAL file on the primary
+$node_primary->safe_psql(
+       'postgres', "create table retain_test(a int);
+                                                                        insert 
into retain_test values(1);
+                                                                        select 
pg_switch_wal();
+                                                                        
checkpoint;");
+
+# Wait for the standby to catch up
+$node_primary->wait_for_catchup($node_standby);
+
+# Request a checkpoint on the standby to trigger the WAL file(s) removal
+$node_standby->safe_psql('postgres', 'checkpoint;');
+
+# Verify that the wal file has not been retained on the standby
+my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;
+ok(!-f "$standby_walfile",
+       "invalidated logical slots do not lead to retaining WAL");
 
 ##################################################
 # Recovery conflict: Invalidate conflicting slots, including in-use slots
-- 
2.34.1

Reply via email to