Re: TAP test to cover "EndOfLogTLI != replayTLI" case

Amul Sul Mon, 17 Jan 2022 03:38:46 -0800

On Sat, Jan 15, 2022 at 11:35 AM Julien Rouhaud <rjuju...@gmail.com> wrote:
>
> Hi,
>
> On Mon, Jan 10, 2022 at 09:46:23AM +0530, Amul Sul wrote:
> >
> > Thanks for the note, I can see the same test is failing on my centos
> > vm too with the latest master head(376ce3e404b).  The failing reason is
> > the "recovery_target_inclusive = off" setting which is unnecessary for
> > this test, the attached patch removing the same.
>
> This version still fails on windows according to the cfbot:
>
> https://cirrus-ci.com/task/5882621321281536
>
> [18:14:02.639] c:\cirrus>call perl src/tools/msvc/vcregress.pl recoverycheck
> [18:14:56.114]
> [18:14:56.122] #   Failed test 'check standby content before timeline switch 0/500FB30'
> [18:14:56.122] #   at t/003_recovery_targets.pl line 234.
> [18:14:56.122] #          got: '6000'
> [18:14:56.122] #     expected: '7000'
>
> I'm switching the cf entry to Waiting on Author.


Thanks for the note.

I am not sure what really went wrong, but I think the 'standby_9'
server was shut down too early, before it had a chance to archive a
required WAL file. The attached patch shuts down that server only
after the required WAL file has been archived. Unfortunately, I couldn't
test that on Windows; let's see how cfbot reacts to this version.
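
The wait-for-archive step can be sketched outside the TAP framework as
follows. This is only a minimal stand-alone illustration, not the patch
itself: wal_archived() and poll_until() are hypothetical helpers, and the
@archived list is made-up data standing in for successive values of
pg_stat_archiver.last_archived_wal. It assumes that WAL segment names are
fixed-width uppercase hex strings beginning with the timeline ID, so a plain
string comparison orders them.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep);

# Hypothetical helper: true once the last archived file name sorts at or
# after the target segment name. Segment names are 24 uppercase hex digits
# (the leading 8 are the timeline ID), so string comparison is sufficient.
sub wal_archived
{
	my ($last_archived, $target) = @_;
	return 0 unless defined $last_archived && $last_archived ne '';
	return ($last_archived ge $target) ? 1 : 0;
}

# Hypothetical helper: call $check until it returns true or the attempts
# run out; returns 1 on success and 0 on timeout.
sub poll_until
{
	my ($check, $max_attempts, $delay_us) = @_;
	for my $attempt (1 .. $max_attempts)
	{
		return 1 if $check->();
		usleep($delay_us);
	}
	return 0;
}

# Made-up data standing in for pg_stat_archiver.last_archived_wal: each call
# reports the next file the simulated archiver has finished.
my @archived = (
	'', '000000010000000000000003',
	'00000002.history', '000000020000000000000003');
my $i = 0;
my $fetch_last_archived =
  sub { return $archived[ $i < $#archived ? $i++ : $#archived ]; };

my $target = '000000020000000000000003';
my $ok = poll_until(
	sub { wal_archived($fetch_last_archived->(), $target) },
	10, 1000);
print $ok ? "archived\n" : "timed out\n";
```

In the test itself, the same comparison is done in SQL against
pg_stat_archiver, and the server is stopped only after the poll succeeds.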

Regards,
Amul

From 08d19c0ef2f464e8bf722ce13457acf3f9be47e8 Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Mon, 17 Jan 2022 06:25:30 -0500
Subject: [PATCH v3] TAP test for EndOfLogTLI

---
 src/test/recovery/t/003_recovery_targets.pl | 57 ++++++++++++++++++++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/t/003_recovery_targets.pl b/src/test/recovery/t/003_recovery_targets.pl
index 24da78c0bcd..928799b9490 100644
--- a/src/test/recovery/t/003_recovery_targets.pl
+++ b/src/test/recovery/t/003_recovery_targets.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
-use Test::More tests => 9;
+use Test::More tests => 10;
 use Time::HiRes qw(usleep);
 
 # Create and test a standby from given backup, with a certain recovery target.
@@ -182,3 +182,58 @@ $logfile = slurp_file($node_standby->logfile());
 ok( $logfile =~
 	  qr/FATAL: .* recovery ended before configured recovery target was reached/,
 	'recovery end before target reached is a fatal error');
+
+# Test to cover the case where we are looking for a WAL record that ought to
+# be in, e.g., 000000010000000000000001, but we don't find it there; instead
+# we find 000000020000000000000003, for reasons such as a timeline switch in
+# that segment, so that we are reading the old WAL from a segment belonging
+# to a higher timeline, or because our recovery target timeline is 2 or
+# something that has 2 in its history.
+
+# Insert some more data on the primary
+$node_primary->safe_psql('postgres',
+	"INSERT INTO tab_int VALUES (generate_series(6001,7000))");
+my $lsn6 = $node_primary->safe_psql('postgres',
+	"SELECT pg_current_wal_lsn()");
+
+# Set up a new standby and enable WAL archiving so that it archives WAL files
+# to the same location as the primary.
+my $archive_cmd = $node_primary->safe_psql('postgres',
+	"SELECT current_setting('archive_command')");
+$node_standby = PostgreSQL::Test::Cluster->new('standby_9');
+$node_standby->init_from_backup(
+	$node_primary, 'my_backup',
+	has_streaming => 1);
+$node_standby->append_conf(
+        'postgresql.conf', qq(
+archive_mode = on
+archive_command = '$archive_cmd'
+));
+$node_standby->start;
+# Wait until necessary replay has been done on standby
+$node_primary->wait_for_catchup($node_standby, 'replay',
+	$node_primary->lsn('write'));
+$node_standby->promote;
+$node_standby->safe_psql('postgres',
+	"INSERT INTO tab_int VALUES (generate_series(7001,8000))");
+# Force archiving of WAL file
+my $last_wal = $node_standby->safe_psql('postgres',
+	"SELECT pg_walfile_name(pg_switch_wal())");
+# Wait until this WAL file has been archived
+my $check_archive = "SELECT last_archived_wal >= '$last_wal' FROM pg_stat_archiver";
+$node_standby->poll_query_until('postgres', $check_archive)
+	or die "Timed out while waiting for $last_wal to be archived";
+$node_standby->stop;
+
+# Another standby whose recovery target LSN lies in a WAL file that has
+# a different TLI than the one the target LSN belongs to.
+$node_standby = PostgreSQL::Test::Cluster->new('standby_10');
+$node_standby->init_from_backup(
+	$node_primary, 'my_backup',
+	has_restoring => 1);
+$node_standby->append_conf(
+        'postgresql.conf', qq(recovery_target_lsn = '$lsn6'));
+$node_standby->start;
+my $result = $node_standby->safe_psql('postgres',
+	"SELECT count(*) FROM tab_int");
+is($result, '7000', "check standby content before timeline switch $lsn6");
-- 
2.18.0
