On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <[email protected]> wrote:
>
> On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <[email protected]> wrote:
>
> > So, if we put the reordering logic outside the streamer, we’d
> > sometimes be receiving buffers containing mixed data from two WAL
> > files. The caller would then need to correctly identify WAL file
> > boundaries within those buffers. This would require passing extra
> > metadata -- like segment numbers for the WAL files in the buffer, plus
> > start and end offsets of those segments within the buffer. While not
> > impossible, it feels a bit hacky and I'm unsure if that’s the best
> > approach.
>
> I agree that we need that kind of metadata, but I don't see why our
> need for it depends on where we do the reordering. That is, if we do
> the reordering above the astreamer layer, we need to keep track of the
> origin of each chunk of WAL bytes, and if we do the reordering within
> the astreamer layer, we still need to keep track of the origin of the
> WAL bytes. Doing the ordering properly requires that tracking, but it
> doesn't say anything about where that tracking has to be performed.
>
> I think it might be better if we didn't write to the astreamer's
> buffer at all. For example, suppose we create a struct that looks
> approximately like this:
>
> struct ChunkOfDecodedWAL
> {
>      XLogSegNo segno; // could also be XLogRecPtr start_lsn or char *walfilename or whatever
>      StringInfoData buffer;
>      char *spillfilename; // or whatever we use to identify the temporary files
>      bool already_removed;
>      // potentially other metadata
> };
>
> Then, create a hash table and key it on the segno or whatever. Have the
> astreamer write to the hash table: when it gets a chunk of WAL, it
> looks up or creates the relevant hash table entry and appends the data
> to the buffer. At any convenient point in the code, you can decide to
> write the data from the buffer to a spill file, after which you
> resetStringInfo() on the buffer and populate the spill file name. When
> you've used up the data, you remove the spill file and set the
> already_removed flag.
>
> I think this could also help with the error reporting stuff. When you
> get to the end of the file, you'll know all the files you saw and how
> much data you read from each of them. So you could possibly do
> something like
>
> ERROR: LSN %08X/%08X not found in archive \"%s\"
> DETAIL: WAL segment %s is not present in the archive
> -or-
> DETAIL: WAL segment %s was expected to be %u bytes, but was only %u bytes
> -or-
> DETAIL: whatever else can go wrong
>
> The point is that every file you've ever seen has a hash table entry,
> and in that hash table entry you can store everything about that file
> that you need to know, whether that's the file data, the disk file
> that contains the file data, the fact that we already threw the data
> away, or any other fact that you can imagine wanting to know.
>
> Said differently, the astreamer buffer is not really a great place to
> write data. It exists because when we're just forwarding data from one
> astreamer to the next, we will often need to buffer a small amount of
> data to avoid terrible performance. However, it's only there to be
> used when we don't have something better. I don't think any astreamer
> that is intended to be the last one in the chain currently writes to
> the buffer -- they write to the output file, or whatever, because
> using an in-memory buffer as your final output destination is not a
> real good plan.
>

Makes sense. I implemented this approach in the attached version, but
with a different structure name and a slightly different error
message: the error output uses the WAL file name instead of the LSN.
This is because the LSN at that point may differ from the
user-provided one (it might have been adjusted to the start of a WAL
page by xlogreader). This follows the same style used in the routine
that reads the WAL file. The user-provided LSN values are only used
in error messages generated at the very beginning, specifically in the
main() function of pg_waldump.
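
To illustrate the point about the adjusted LSN, here is a minimal
standalone sketch, not taken from the patch. It assumes the default
8 kB WAL page size and 16 MB segment size, and the Sketch*/sketch_*
names are hypothetical stand-ins for XLogRecPtr, XLogSegNo and
XLogFileName(). It shows that rounding an LSN down to its page start
changes the LSN but not the segment, so the WAL file name remains a
stable identifier for error messages:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for PostgreSQL's types and constants. */
typedef uint64_t SketchRecPtr;		/* plays the role of XLogRecPtr */
typedef uint64_t SketchSegNo;		/* plays the role of XLogSegNo */

#define SKETCH_BLCKSZ	8192				/* assumed WAL page size */
#define SKETCH_SEGSZ	(16 * 1024 * 1024)	/* assumed WAL segment size */

/* Round an LSN down to the start of the WAL page that contains it. */
static SketchRecPtr
sketch_page_start(SketchRecPtr lsn)
{
	return lsn - (lsn % SKETCH_BLCKSZ);
}

/* Compute the number of the segment that contains an LSN. */
static SketchSegNo
sketch_segno(SketchRecPtr lsn, uint32_t segsz)
{
	return lsn / segsz;
}

/*
 * Build the 24-character WAL file name: timeline, then the segment
 * number split into a high and a low part, each as 8 hex digits.
 */
static void
sketch_wal_file_name(char *fname, size_t len, uint32_t tli,
					 SketchSegNo segno, uint32_t segsz)
{
	uint64_t	segs_per_id = UINT64_C(0x100000000) / segsz;

	snprintf(fname, len, "%08X%08X%08X",
			 (unsigned) tli,
			 (unsigned) (segno / segs_per_id),
			 (unsigned) (segno % segs_per_id));
}
```

With these assumptions, LSN 0/3000028 is rounded down to 0/3000000,
yet both values map to segment 3 on timeline 1, so the file name is
the same either way.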

I have also restructured the code by moving most of the tar file
reading logic out of pg_waldump.c into astreamer_waldump.c, which has
now been renamed to archive_waldump.c.
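
For anyone following along, the per-segment bookkeeping can be
pictured with the standalone sketch below. It is not the patch's code:
the patch uses simplehash.h keyed on the segment number, whereas this
sketch swaps in a small fixed-size array with linear lookup to stay
self-contained, and all Sketch*/sketch_* names are hypothetical. It
only shows how one entry per segment lets the error detail distinguish
a missing segment from a short one:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_SEGMENTS 64	/* arbitrary limit for this sketch */

/* One entry per WAL segment seen in the archive. */
typedef struct SketchWalEntry
{
	uint64_t	segno;			/* lookup key: WAL segment number */
	uint64_t	total_read;		/* bytes seen so far for this segment */
} SketchWalEntry;

typedef struct SketchWalTable
{
	SketchWalEntry entries[SKETCH_MAX_SEGMENTS];
	int			nentries;
} SketchWalTable;

/* Return the entry for segno, or NULL if that segment was never seen. */
static SketchWalEntry *
sketch_lookup(SketchWalTable *tab, uint64_t segno)
{
	for (int i = 0; i < tab->nentries; i++)
	{
		if (tab->entries[i].segno == segno)
			return &tab->entries[i];
	}
	return NULL;
}

/* Account for len more bytes of a segment; returns -1 if the table is full. */
static int
sketch_note_bytes(SketchWalTable *tab, uint64_t segno, uint64_t len)
{
	SketchWalEntry *entry = sketch_lookup(tab, segno);

	if (entry == NULL)
	{
		if (tab->nentries >= SKETCH_MAX_SEGMENTS)
			return -1;
		entry = &tab->entries[tab->nentries++];
		entry->segno = segno;
		entry->total_read = 0;
	}
	entry->total_read += len;
	return 0;
}

/* Format a detail message describing the state of one segment. */
static void
sketch_describe_problem(SketchWalTable *tab, uint64_t segno,
						uint64_t expected, char *msg, size_t msglen)
{
	SketchWalEntry *entry = sketch_lookup(tab, segno);

	if (entry == NULL)
		snprintf(msg, msglen,
				 "WAL segment %llu is not present in the archive",
				 (unsigned long long) segno);
	else if (entry->total_read < expected)
		snprintf(msg, msglen,
				 "WAL segment %llu was expected to be %llu bytes, but was only %llu bytes",
				 (unsigned long long) segno,
				 (unsigned long long) expected,
				 (unsigned long long) entry->total_read);
	else
		snprintf(msg, msglen, "WAL segment %llu is complete",
				 (unsigned long long) segno);
}
```

A caller would invoke sketch_note_bytes() for every chunk of member
contents it receives, and sketch_describe_problem() only once a read
cannot be satisfied.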

Kindly have a look at the attached version. Thank you!

Regards,
Amul
From 9bfed15797bcecf15e828d2b48f64caead36e9bb Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v5 1/8] Refactor: pg_waldump: Move some declarations to new
 pg_waldump.h

This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
 src/bin/pg_waldump/pg_waldump.c | 11 ++---------
 src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 9 deletions(-)
 create mode 100644 src/bin/pg_waldump/pg_waldump.h

diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 13d3ec2f5be..a49b2fd96c7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
 #include "common/logging.h"
 #include "common/relpath.h"
 #include "getopt_long.h"
+#include "pg_waldump.h"
 #include "rmgrdesc.h"
 #include "storage/bufpage.h"
 
@@ -39,19 +40,11 @@
 
 static const char *progname;
 
-static int	WalSegSz;
+int			WalSegSz = DEFAULT_XLOG_SEG_SIZE;
 static volatile sig_atomic_t time_to_stop = false;
 
 static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
 
-typedef struct XLogDumpPrivate
-{
-	TimeLineID	timeline;
-	XLogRecPtr	startptr;
-	XLogRecPtr	endptr;
-	bool		endptr_reached;
-} XLogDumpPrivate;
-
 typedef struct XLogDumpConfig
 {
 	/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		  src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int	WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+	TimeLineID	timeline;
+	XLogRecPtr	startptr;
+	XLogRecPtr	endptr;
+	bool		endptr_reached;
+} XLogDumpPrivate;
+
+#endif		/* end of PG_WALDUMP_H */
-- 
2.47.1

From 830dcb9c9f98de3bfc6d0b19d56865ed1e175860 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v5 2/8] Refactor: pg_waldump: Separate logic used to calculate
 the required read size.

This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
 src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a49b2fd96c7..8d0cd9e7156 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
 	return NULL;				/* not reached */
 }
 
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+				  int reqLen)
+{
+	int			count = XLOG_BLCKSZ;
+
+	if (private->endptr != InvalidXLogRecPtr)
+	{
+		if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+			count = XLOG_BLCKSZ;
+		else if (targetPagePtr + reqLen <= private->endptr)
+			count = private->endptr - targetPagePtr;
+		else
+		{
+			private->endptr_reached = true;
+			return -1;
+		}
+	}
+
+	return count;
+}
+
 /* pg_waldump's XLogReaderRoutine->segment_open callback */
 static void
 WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 				XLogRecPtr targetPtr, char *readBuff)
 {
 	XLogDumpPrivate *private = state->private_data;
-	int			count = XLOG_BLCKSZ;
+	int			count = required_read_len(private, targetPagePtr, reqLen);
 	WALReadError errinfo;
 
-	if (private->endptr != InvalidXLogRecPtr)
-	{
-		if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
-			count = XLOG_BLCKSZ;
-		else if (targetPagePtr + reqLen <= private->endptr)
-			count = private->endptr - targetPagePtr;
-		else
-		{
-			private->endptr_reached = true;
-			return -1;
-		}
-	}
+	if (private->endptr_reached)
+		return -1;
 
 	if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
 				 &errinfo))
-- 
2.47.1

From fdf23c243bc21cefd062d2b4960460722805bbee Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v5 3/8] Refactor: pg_waldump: Restructure TAP tests.

Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
 src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
 1 file changed, 67 insertions(+), 56 deletions(-)

diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
 	],
 	qr/./,
 	'runs with start and end segment specified');
-command_fails_like(
-	[ 'pg_waldump', '--path' => $node->data_dir ],
-	qr/error: no start WAL location given/,
-	'path option requires start location');
-command_like(
-	[
-		'pg_waldump',
-		'--path' => $node->data_dir,
-		'--start' => $start_lsn,
-		'--end' => $end_lsn,
-	],
-	qr/./,
-	'runs with path option and start and end locations');
-command_fails_like(
-	[
-		'pg_waldump',
-		'--path' => $node->data_dir,
-		'--start' => $start_lsn,
-	],
-	qr/error: error in WAL record at/,
-	'falling off the end of the WAL results in an error');
-
 command_like(
 	[
 		'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
 	],
 	qr/^$/,
 	'no output with --quiet option');
-command_fails_like(
-	[
-		'pg_waldump', '--quiet',
-		'--path' => $node->data_dir,
-		'--start' => $start_lsn
-	],
-	qr/error: error in WAL record at/,
-	'errors are shown with --quiet');
-
 
 # Test for: Display a message that we're skipping data if `from`
 # wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
 
 	my $result = IPC::Run::run [
 		'pg_waldump',
-		'--path' => $node->data_dir,
 		'--start' => $start_lsn,
 		'--end' => $end_lsn,
 		@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
 
 my @lines;
 
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+	{
+		'path' => $node->data_dir
+	});
 
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+	my $path = $scenario->{'path'};
 
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+	SKIP:
+	{
+		command_fails_like(
+			[ 'pg_waldump', '--path' => $path ],
+			qr/error: no start WAL location given/,
+			'path option requires start location');
+		command_like(
+			[
+				'pg_waldump',
+				'--path' => $path,
+				'--start' => $start_lsn,
+				'--end' => $end_lsn,
+			],
+			qr/./,
+			'runs with path option and start and end locations');
+		command_fails_like(
+			[
+				'pg_waldump',
+				'--path' => $path,
+				'--start' => $start_lsn,
+			],
+			qr/error: error in WAL record at/,
+			'falling off the end of the WAL results in an error');
 
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+		command_fails_like(
+			[
+				'pg_waldump', '--quiet',
+				'--path' => $path,
+				'--start' => $start_lsn
+			],
+			qr/error: error in WAL record at/,
+			'errors are shown with --quiet');
 
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+		@lines = test_pg_waldump('--path' => $path);
+		is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
 
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+		@lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+		is(@lines, 6, 'limit option observed');
 
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+		@lines = test_pg_waldump('--path' => $path, '--fullpage');
+		is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
 
-@lines = test_pg_waldump(
-	'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
-	0, 'only lines for selected relation');
+		@lines = test_pg_waldump('--path' => $path, '--stats');
+		like($lines[0], qr/WAL statistics/, "statistics on stdout");
+		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
-@lines = test_pg_waldump(
-	'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
-	'--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+		@lines = test_pg_waldump('--path' => $path, '--stats=record');
+		like($lines[0], qr/WAL statistics/, "statistics on stdout");
+		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
+		@lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+		is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+		@lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+		is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+		@lines = test_pg_waldump('--path' => $path,
+			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+		is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+			0, 'only lines for selected relation');
+
+		@lines = test_pg_waldump('--path' => $path,
+			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+			'--block' => 1);
+		is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+	}
+}
 
 done_testing();
-- 
2.47.1

From 787fc3c94431dedcc0d37d3d6d9329b62e4d00c5 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 5 Nov 2025 15:40:36 +0530
Subject: [PATCH v5 4/8] pg_waldump: Add support for archived WAL decoding.

pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.

Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
 doc/src/sgml/ref/pg_waldump.sgml     |   8 +-
 src/bin/pg_waldump/Makefile          |   7 +-
 src/bin/pg_waldump/archive_waldump.c | 577 +++++++++++++++++++++++++++
 src/bin/pg_waldump/meson.build       |   4 +-
 src/bin/pg_waldump/pg_waldump.c      | 222 ++++++++---
 src/bin/pg_waldump/pg_waldump.h      |  36 +-
 src/bin/pg_waldump/t/001_basic.pl    |  84 +++-
 src/tools/pgindent/typedefs.list     |   3 +
 8 files changed, 863 insertions(+), 78 deletions(-)
 create mode 100644 src/bin/pg_waldump/archive_waldump.c

diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
       <term><option>--path=<replaceable>path</replaceable></option></term>
       <listitem>
        <para>
-        Specifies a directory to search for WAL segment files or a
-        directory with a <literal>pg_wal</literal> subdirectory that
+        Specifies a tar archive or a directory to search for WAL segment files
+        or a directory with a <literal>pg_wal</literal> subdirectory that
         contains such files.  The default is to search in the current
         directory, the <literal>pg_wal</literal> subdirectory of the
         current directory, and the <literal>pg_wal</literal> subdirectory
         of <envar>PGDATA</envar>.
        </para>
+       <para>
+        If a tar archive is provided, its WAL segment files must be in
+        sequential order; otherwise, an error will be reported.
+       </para>
       </listitem>
      </varlistentry>
 
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..05ac5763a57 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
 PGFILEDESC = "pg_waldump - decode and display WAL"
 PGAPPICON=win32
 
+# make these available to TAP test scripts
+export TAR
+
 subdir = src/bin/pg_waldump
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
 	$(WIN32RES) \
 	compat.o \
 	pg_waldump.o \
+	archive_waldump.o \
 	rmgrdesc.o \
 	xlogreader.o \
 	xlogstats.o
 
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
 
 RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
 RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..e619e29d5d4
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,577 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ *		A generic facility for reading WAL data from tar archives via archive
+ *		streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE				(128 * 1024)
+
+/* Structure for storing the WAL segment data from the archive */
+typedef struct ArchivedWALEntry
+{
+	uint32		status;			/* hash status */
+	XLogSegNo	segno;			/* hash key: WAL segment number */
+	TimeLineID	timeline;		/* timeline of this wal file */
+
+	StringInfoData buf;
+	bool		tmpseg_exists;	/* spill file exists? */
+
+	int			total_read;		/* total read of this WAL segment, including
+								 * buffered and temporarily written data */
+} ArchivedWALEntry;
+
+#define SH_PREFIX				ArchivedWAL
+#define SH_ELEMENT_TYPE			ArchivedWALEntry
+#define SH_KEY_TYPE				XLogSegNo
+#define SH_KEY					segno
+#define SH_HASH_KEY(tb, key)	murmurhash64((uint64) key)
+#define SH_EQUAL(tb, a, b)		(a == b)
+#define SH_GET_HASH(tb, a)		a->hash
+#define SH_SCOPE				static inline
+#define SH_RAW_ALLOCATOR		pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static ArchivedWAL_hash *ArchivedWAL_HTAB = NULL;
+
+typedef struct astreamer_waldump
+{
+	astreamer	base;
+	XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static int	read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
+											   XLogDumpPrivate *privateInfo);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+									  astreamer_member *member,
+									  const char *data, int len,
+									  astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+							   astreamer_member *member,
+							   XLogSegNo *curSegNo,
+							   TimeLineID *curTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+	.content = astreamer_waldump_content,
+	.finalize = astreamer_waldump_finalize,
+	.free = astreamer_waldump_free
+};
+
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+bool
+is_archive_file(const char *fname, pg_compress_algorithm *compression)
+{
+	int			fname_len = strlen(fname);
+	pg_compress_algorithm compress_algo;
+
+	/* Now, check the compression type of the tar */
+	if (fname_len > 4 &&
+		strcmp(fname + fname_len - 4, ".tar") == 0)
+		compress_algo = PG_COMPRESSION_NONE;
+	else if (fname_len > 4 &&
+			 strcmp(fname + fname_len - 4, ".tgz") == 0)
+		compress_algo = PG_COMPRESSION_GZIP;
+	else if (fname_len > 7 &&
+			 strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+		compress_algo = PG_COMPRESSION_GZIP;
+	else if (fname_len > 8 &&
+			 strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+		compress_algo = PG_COMPRESSION_LZ4;
+	else if (fname_len > 8 &&
+			 strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+		compress_algo = PG_COMPRESSION_ZSTD;
+	else
+		return false;
+
+	*compression = compress_algo;
+
+	return true;
+}
+
+/*
+ * Initializes the tar archive reader to read WAL files from the archive,
+ * creates a hash table to store them, performs quick existence checks for WAL
+ * entries in the archive and retrieves the WAL segment size, and sets up
+ * filtering criteria for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+					pg_compress_algorithm compression)
+{
+	int			fd;
+	astreamer  *streamer;
+	ArchivedWALEntry *entry = NULL;
+	XLogLongPageHeader longhdr;
+
+	/* Open tar archive and store its file descriptor */
+	fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+	if (fd < 0)
+		pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+	privateInfo->archive_fd = fd;
+
+	streamer = astreamer_waldump_new(privateInfo);
+
+	/* Before that we must parse the tar archive. */
+	streamer = astreamer_tar_parser_new(streamer);
+
+	/* Before that we must decompress, if archive is compressed. */
+	if (compression == PG_COMPRESSION_GZIP)
+		streamer = astreamer_gzip_decompressor_new(streamer);
+	else if (compression == PG_COMPRESSION_LZ4)
+		streamer = astreamer_lz4_decompressor_new(streamer);
+	else if (compression == PG_COMPRESSION_ZSTD)
+		streamer = astreamer_zstd_decompressor_new(streamer);
+
+	privateInfo->archive_streamer = streamer;
+
+	/* Hash table storing WAL entries read from the archive */
+	ArchivedWAL_HTAB = ArchivedWAL_create(16, NULL);
+
+	/*
+	 * Verify that the archive contains valid WAL files and fetch WAL segment
+	 * size
+	 */
+	while (entry == NULL || entry->buf.len < XLOG_BLCKSZ)
+	{
+		if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+			pg_fatal("could not find WAL in \"%s\" archive",
+					 privateInfo->archive_name);
+
+		entry = privateInfo->cur_wal;
+	}
+
+	/* Set WalSegSz if WAL data is successfully read */
+	longhdr = (XLogLongPageHeader) entry->buf.data;
+
+	WalSegSz = longhdr->xlp_seg_size;
+
+	if (!IsValidWalSegSize(WalSegSz))
+	{
+		pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+							  "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+							  WalSegSz),
+					 privateInfo->archive_name, WalSegSz);
+		pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+		exit(1);
+	}
+
+	/*
+	 * With the WAL segment size available, we can now initialize the
+	 * dependent start and end segment numbers.
+	 */
+	XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
+	XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+	/*
+	 * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+	 * flush any remaining buffered data or to ensure the end of the tar
+	 * archive is reached. However, when decoding a WAL file, once we hit the
+	 * end LSN, any remaining WAL data in the buffer or the tar archive's
+	 * unreached end can be safely ignored.
+	 */
+	astreamer_free(privateInfo->archive_streamer);
+
+	/* Close the file. */
+	if (close(privateInfo->archive_fd) != 0)
+		pg_log_error("could not close file \"%s\": %m",
+					 privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+					  Size count, char *readBuff)
+{
+	char	   *p = readBuff;
+	Size		nbytes = count;
+	XLogRecPtr	recptr = targetPagePtr;
+	XLogSegNo	segno;
+	ArchivedWALEntry *entry;
+
+	XLByteToSeg(targetPagePtr, segno, WalSegSz);
+	entry = get_archive_wal_entry(segno, privateInfo);
+
+	while (nbytes > 0)
+	{
+		char	   *buf = entry->buf.data;
+		int			len = entry->buf.len;
+
+		/* WAL record range that the buffer contains */
+		XLogRecPtr	endPtr;
+		XLogRecPtr	startPtr;
+
+		XLogSegNoOffsetToRecPtr(entry->segno, entry->total_read,
+								WalSegSz, endPtr);
+		startPtr = endPtr - len;
+
+		Assert((endPtr - startPtr) == len);
+
+		/*
+		 * pg_waldump never asks for the same WAL bytes more than once, so if we're
+		 * now being asked for data beyond the end of what we've already read,
+		 * that means none of the data we currently have in the buffer will
+		 * ever be consulted again. So, we can discard the existing buffer
+		 * contents and start over.
+		 */
+		if (recptr >= endPtr)
+		{
+			len = 0;
+
+			/* Discard the buffered data */
+			resetStringInfo(&entry->buf);
+		}
+
+		if (len > 0 && recptr > startPtr)
+		{
+			int			skipBytes = 0;
+
+			/*
+			 * The required offset is not at the start of the buffer, so skip
+			 * bytes until reaching the desired offset of the target page.
+			 */
+			skipBytes = recptr - startPtr;
+
+			buf += skipBytes;
+			len -= skipBytes;
+		}
+
+		if (len > 0)
+		{
+			int			readBytes = len >= nbytes ? nbytes : len;
+
+			/* Ensure reading correct WAL record */
+			Assert(recptr >= startPtr && recptr < endPtr);
+
+			memcpy(p, buf, readBytes);
+
+			/* Update state for read */
+			nbytes -= readBytes;
+			p += readBytes;
+			recptr += readBytes;
+		}
+		else
+		{
+			/*
+			 * Fetch more data; raise an error if it's not the current segment
+			 * being read by the archive streamer or if reading of the
+			 * archived file has finished.
+			 */
+			if (privateInfo->cur_wal != entry ||
+				read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+			{
+				char		fname[MAXFNAMELEN];
+
+				XLogFileName(fname, privateInfo->timeline, entry->segno,
+							 WalSegSz);
+				pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+						 fname, privateInfo->archive_name,
+						 (long long int) (count - nbytes),
+						 (long long int) count);
+			}
+		}
+	}
+
+	/*
+	 * We should have either successfully read all the requested bytes or
+	 * reported a failure before this point.
+	 */
+	Assert(nbytes == 0);
+
+	/*
+	 * NB: We return the fixed value provided as input. Although we could
+	 * return a boolean since we either successfully read the WAL page or
+	 * raise an error, the caller expects this value to be returned. The
+	 * routine that reads WAL pages from the physical WAL file follows the
+	 * same convention.
+	 */
+	return count;
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+	int			rc;
+	char	   *buffer;
+
+	buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+	rc = read(privateInfo->archive_fd, buffer, count);
+	if (rc < 0)
+		pg_fatal("could not read file \"%s\": %m",
+				 privateInfo->archive_name);
+
+	/*
+	 * Decompress (if required), and then parse the previously read contents
+	 * of the tar file.
+	 */
+	if (rc > 0)
+		astreamer_content(privateInfo->archive_streamer, NULL,
+						  buffer, rc, ASTREAMER_UNKNOWN);
+	pg_free(buffer);
+
+	return rc;
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file and retrieve the entry if
+ * it is not already in hash table.
+ */
+static ArchivedWALEntry *
+get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
+{
+	ArchivedWALEntry *entry = NULL;
+	char		fname[MAXFNAMELEN];
+
+	/* Search hash table */
+	entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+	if (entry != NULL)
+		return entry;
+
+	/* The needed WAL has not been read from the archive yet; read it now */
+	while (1)
+	{
+		entry = privateInfo->cur_wal;
+
+		/* Fetch more data */
+		if (entry == NULL || entry->buf.len == 0)
+		{
+			if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+				break;			/* archive file ended */
+		}
+
+		/*
+		 * Either we are here for the first time, or the archive streamer is
+		 * reading a non-WAL file or an irrelevant WAL file.
+		 */
+		if (entry == NULL)
+			continue;
+
+		/* Found the required entry */
+		if (entry->segno == segno)
+			return entry;
+
+		/*
+		 * Ignore if the timeline is different or the current segment is not
+		 * the desired one.
+		 */
+		if (privateInfo->timeline != entry->timeline ||
+			privateInfo->startSegNo > entry->segno ||
+			privateInfo->endSegNo < entry->segno)
+		{
+			privateInfo->cur_wal = NULL;
+			continue;
+		}
+
+		/* WAL segments must be archived in order */
+		pg_log_error("WAL files are not archived in sequential order");
+		pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+							segno, entry->segno);
+		exit(1);
+	}
+
+	/* Requested WAL segment not found */
+	XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+	pg_fatal("could not find file \"%s\" in archive", fname);
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+	astreamer_waldump *streamer;
+
+	streamer = palloc0(sizeof(astreamer_waldump));
+	*((const astreamer_ops **) &streamer->base.bbs_ops) =
+		&astreamer_waldump_ops;
+
+	streamer->privateInfo = privateInfo;
+
+	return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+						  const char *data, int len,
+						  astreamer_archive_context context)
+{
+	astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+	XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+	Assert(context != ASTREAMER_UNKNOWN);
+
+	switch (context)
+	{
+		case ASTREAMER_MEMBER_HEADER:
+			{
+				XLogSegNo	segno;
+				TimeLineID	timeline;
+				ArchivedWALEntry *entry;
+				bool		found;
+
+				pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
+
+				if (!member_is_wal_file(mystreamer, member,
+										&segno, &timeline))
+					break;
+
+				entry = ArchivedWAL_insert(ArchivedWAL_HTAB, segno, &found);
+
+				/*
+				 * Shouldn't happen, but if it does, simply ignore the
+				 * duplicate WAL file.
+				 */
+				if (found)
+				{
+					pg_log_warning("ignoring duplicate WAL file found in archive: \"%s\"",
+								   member->pathname);
+					break;
+				}
+
+				initStringInfo(&entry->buf);
+				entry->timeline = timeline;
+				entry->total_read = 0;
+
+				privateInfo->cur_wal = entry;
+			}
+			break;
+
+		case ASTREAMER_MEMBER_CONTENTS:
+			if (privateInfo->cur_wal)
+			{
+				appendBinaryStringInfo(&privateInfo->cur_wal->buf, data, len);
+				privateInfo->cur_wal->total_read += len;
+			}
+			break;
+
+		case ASTREAMER_MEMBER_TRAILER:
+			privateInfo->cur_wal = NULL;
+			break;
+
+		case ASTREAMER_ARCHIVE_TRAILER:
+			break;
+
+		default:
+			/* Shouldn't happen. */
+			pg_fatal("unexpected state while parsing tar file");
+	}
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+	Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+	Assert(streamer->bbs_next == NULL);
+	pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment number and timeline.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+				   XLogSegNo *curSegNo, TimeLineID *curTimeline)
+{
+	int			pathlen;
+	XLogSegNo	segNo;
+	TimeLineID	timeline;
+	char	   *fname;
+
+	/* We are only interested in normal files. */
+	if (member->is_directory || member->is_link)
+		return false;
+
+	pathlen = strlen(member->pathname);
+	if (pathlen < XLOG_FNAME_LEN)
+		return false;
+
+	/* The WAL file name may be preceded by a directory path */
+	fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+	if (!IsXLogFileName(fname))
+		return false;
+
+	/*
+	 * XXX: On some systems (e.g., OpenBSD), the tar utility includes
+	 * PaxHeaders when creating an archive. These are special entries that
+	 * store extended metadata for the file entry immediately following them,
+	 * and they share the exact same name as that file.
+	 */
+	if (strstr(member->pathname, "PaxHeaders."))
+		return false;
+
+	/* Parse position from file */
+	XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+	*curSegNo = segNo;
+	*curTimeline = timeline;
+
+	return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..da00746587c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
 pg_waldump_sources = files(
   'compat.c',
   'pg_waldump.c',
+  'archive_waldump.c',
   'rmgrdesc.c',
 )
 
@@ -18,7 +19,7 @@ endif
 
 pg_waldump = executable('pg_waldump',
   pg_waldump_sources,
-  dependencies: [frontend_code, lz4, zstd],
+  dependencies: [frontend_code, lz4, zstd, libpq],
   c_args: ['-DFRONTEND'], # needed for xlogreader et al
   kwargs: default_bin_args,
 )
@@ -29,6 +30,7 @@ tests += {
   'sd': meson.current_source_dir(),
   'bd': meson.current_build_dir(),
   'tap': {
+    'env': {'TAR': tar.found() ? tar.full_path() : ''},
     'tests': [
       't/001_basic.pl',
       't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8d0cd9e7156..8a838f16ba2 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -177,7 +177,7 @@ split_path(const char *path, char **dir, char **fname)
  *
  * return a read only fd
  */
-static int
+int
 open_file_in_directory(const char *directory, const char *fname)
 {
 	int			fd = -1;
@@ -436,6 +436,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 	return count;
 }
 
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+					  TimeLineID *tli_p)
+{
+	/* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+	/* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+				   XLogRecPtr targetPtr, char *readBuff)
+{
+	XLogDumpPrivate *private = state->private_data;
+	int			count = required_read_len(private, targetPagePtr, reqLen);
+
+	if (private->endptr_reached)
+		return -1;
+
+	/* Read the WAL page from the archive streamer */
+	return read_archive_wal_page(private, targetPagePtr, count, readBuff);
+}
+
 /*
  * Boolean to return whether the given WAL record matches a specific relation
  * and optionally block.
@@ -773,8 +811,8 @@ usage(void)
 	printf(_("  -F, --fork=FORK        only show records that modify blocks in fork FORK;\n"
 			 "                         valid names are main, fsm, vm, init\n"));
 	printf(_("  -n, --limit=N          number of records to display\n"));
-	printf(_("  -p, --path=PATH        directory in which to find WAL segment files or a\n"
-			 "                         directory with a ./pg_wal that contains such files\n"
+	printf(_("  -p, --path=PATH        tar archive or a directory in which to find WAL segment files or\n"
+			 "                         a directory with a ./pg_wal that contains such files\n"
 			 "                         (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
 	printf(_("  -q, --quiet            do not print any output, except for errors\n"));
 	printf(_("  -r, --rmgr=RMGR        only show records generated by resource manager RMGR;\n"
@@ -806,7 +844,10 @@ main(int argc, char **argv)
 	XLogRecord *record;
 	XLogRecPtr	first_record;
 	char	   *waldir = NULL;
+	char	   *walpath = NULL;
 	char	   *errormsg;
+	bool		is_archive = false;
+	pg_compress_algorithm compression;
 
 	static struct option long_options[] = {
 		{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +979,7 @@ main(int argc, char **argv)
 				}
 				break;
 			case 'p':
-				waldir = pg_strdup(optarg);
+				walpath = pg_strdup(optarg);
 				break;
 			case 'q':
 				config.quiet = true;
@@ -1102,10 +1143,27 @@ main(int argc, char **argv)
 		goto bad_argument;
 	}
 
-	if (waldir != NULL)
+	if (walpath != NULL)
 	{
+		/* validate path points to tar archive */
+		if (is_archive_file(walpath, &compression))
+		{
+			char	   *fname = NULL;
+
+			split_path(walpath, &waldir, &fname);
+
+			/*
+			 * A NULL WAL directory indicates that the archive file is located
+			 * in the current working directory of the pg_waldump execution
+			 */
+			if (waldir == NULL)
+				waldir = pg_strdup(".");
+
+			private.archive_name = fname;
+			is_archive = true;
+		}
 		/* validate path points to directory */
-		if (!verify_directory(waldir))
+		else if (!verify_directory(walpath))
 		{
-			pg_log_error("could not open directory \"%s\": %m", waldir);
+			pg_log_error("could not open directory \"%s\": %m", walpath);
 			goto bad_argument;
@@ -1123,46 +1181,36 @@ main(int argc, char **argv)
 		int			fd;
 		XLogSegNo	segno;
 
+		/*
+		 * If a tar archive is passed using the --path option, all other
+		 * arguments become unnecessary.
+		 */
+		if (is_archive)
+		{
+			pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+						 argv[optind]);
+			goto bad_argument;
+		}
+
 		split_path(argv[optind], &directory, &fname);
 
-		if (waldir == NULL && directory != NULL)
+		if (walpath == NULL && directory != NULL)
 		{
-			waldir = directory;
+			walpath = directory;
 
-			if (!verify_directory(waldir))
+			if (!verify_directory(walpath))
-				pg_fatal("could not open directory \"%s\": %m", waldir);
+				pg_fatal("could not open directory \"%s\": %m", walpath);
 		}
 
-		waldir = identify_target_directory(waldir, fname);
-		fd = open_file_in_directory(waldir, fname);
-		if (fd < 0)
-			pg_fatal("could not open file \"%s\"", fname);
-		close(fd);
-
-		/* parse position from file */
-		XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
-		if (XLogRecPtrIsInvalid(private.startptr))
-			XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
-		else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+		if (fname != NULL && is_archive_file(fname, &compression))
 		{
-			pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
-						 LSN_FORMAT_ARGS(private.startptr),
-						 fname);
-			goto bad_argument;
+			waldir = walpath ? pg_strdup(walpath) : pg_strdup(".");
+			private.archive_name = fname;
+			is_archive = true;
 		}
-
-		/* no second file specified, set end position */
-		if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
-			XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
-		/* parse ENDSEG if passed */
-		if (optind + 1 < argc)
+		else
 		{
-			XLogSegNo	endsegno;
-
-			/* ignore directory, already have that */
-			split_path(argv[optind + 1], &directory, &fname);
+			waldir = identify_target_directory(walpath, fname);
 
 			fd = open_file_in_directory(waldir, fname);
 			if (fd < 0)
@@ -1170,32 +1218,63 @@ main(int argc, char **argv)
 			close(fd);
 
 			/* parse position from file */
-			XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+			XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
 
-			if (endsegno < segno)
-				pg_fatal("ENDSEG %s is before STARTSEG %s",
-						 argv[optind + 1], argv[optind]);
+			if (XLogRecPtrIsInvalid(private.startptr))
+				XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+			else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+			{
+				pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+							 LSN_FORMAT_ARGS(private.startptr),
+							 fname);
+				goto bad_argument;
+			}
 
-			if (XLogRecPtrIsInvalid(private.endptr))
-				XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
-										private.endptr);
+			/* no second file specified, set end position */
+			if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+				XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
 
-			/* set segno to endsegno for check of --end */
-			segno = endsegno;
-		}
+			/* parse ENDSEG if passed */
+			if (optind + 1 < argc)
+			{
+				XLogSegNo	endsegno;
 
+				/* ignore directory, already have that */
+				split_path(argv[optind + 1], &directory, &fname);
 
-		if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
-			private.endptr != (segno + 1) * WalSegSz)
-		{
-			pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
-						 LSN_FORMAT_ARGS(private.endptr),
-						 argv[argc - 1]);
-			goto bad_argument;
+				fd = open_file_in_directory(waldir, fname);
+				if (fd < 0)
+					pg_fatal("could not open file \"%s\"", fname);
+				close(fd);
+
+				/* parse position from file */
+				XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+				if (endsegno < segno)
+					pg_fatal("ENDSEG %s is before STARTSEG %s",
+							 argv[optind + 1], argv[optind]);
+
+				if (XLogRecPtrIsInvalid(private.endptr))
+					XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+											private.endptr);
+
+				/* set segno to endsegno for check of --end */
+				segno = endsegno;
+			}
+
+
+			if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+				private.endptr != (segno + 1) * WalSegSz)
+			{
+				pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+							 LSN_FORMAT_ARGS(private.endptr),
+							 argv[argc - 1]);
+				goto bad_argument;
+			}
 		}
 	}
-	else
-		waldir = identify_target_directory(waldir, NULL);
+	else if (!is_archive)
+		waldir = identify_target_directory(walpath, NULL);
 
 	/* we don't know what to print */
 	if (XLogRecPtrIsInvalid(private.startptr))
@@ -1207,12 +1286,30 @@ main(int argc, char **argv)
 	/* done with argument parsing, do the actual work */
 
 	/* we have everything we need, start reading */
-	xlogreader_state =
-		XLogReaderAllocate(WalSegSz, waldir,
-						   XL_ROUTINE(.page_read = WALDumpReadPage,
-									  .segment_open = WALDumpOpenSegment,
-									  .segment_close = WALDumpCloseSegment),
-						   &private);
+	if (is_archive)
+	{
+		/* Set up for reading tar file */
+		init_archive_reader(&private, waldir, compression);
+
+		/* Routine to decode WAL files in tar archive */
+		xlogreader_state =
+			XLogReaderAllocate(WalSegSz, waldir,
+							   XL_ROUTINE(.page_read = TarWALDumpReadPage,
+										  .segment_open = TarWALDumpOpenSegment,
+										  .segment_close = TarWALDumpCloseSegment),
+							   &private);
+	}
+	else
+	{
+		/* Routine to decode WAL files */
+		xlogreader_state =
+			XLogReaderAllocate(WalSegSz, waldir,
+							   XL_ROUTINE(.page_read = WALDumpReadPage,
+										  .segment_open = WALDumpOpenSegment,
+										  .segment_close = WALDumpCloseSegment),
+							   &private);
+	}
+
 	if (!xlogreader_state)
 		pg_fatal("out of memory while allocating a WAL reading processor");
 
@@ -1321,6 +1418,9 @@ main(int argc, char **argv)
 
 	XLogReaderFree(xlogreader_state);
 
+	if (is_archive)
+		free_archive_reader(&private);
+
 	return EXIT_SUCCESS;
 
 bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 9e62b64ead5..54758c3548a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,9 +12,13 @@
 #define PG_WALDUMP_H
 
 #include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
 
 extern int	WalSegSz;
 
+/* Forward declaration */
+struct ArchivedWALEntry;
+
 /* Contains the necessary information to drive WAL decoding */
 typedef struct XLogDumpPrivate
 {
@@ -22,6 +26,36 @@ typedef struct XLogDumpPrivate
 	XLogRecPtr	startptr;
 	XLogRecPtr	endptr;
 	bool		endptr_reached;
+
+	/* Fields required to read WAL from archive */
+	char	   *archive_name;	/* Tar archive name */
+	int			archive_fd;		/* File descriptor for the open tar file */
+
+	astreamer  *archive_streamer;
+
+	/* What the archive streamer is currently reading */
+	struct ArchivedWALEntry *cur_wal;
+
+	/*
+	 * Although these values can be easily derived from startptr and endptr,
+	 * doing so repeatedly for each archived member would be inefficient, as
+	 * it would involve recalculating and filtering out irrelevant WAL
+	 * segments.
+	 */
+	XLogSegNo	startSegNo;
+	XLogSegNo	endSegNo;
 } XLogDumpPrivate;
 
-#endif		/* end of PG_WALDUMP_H */
+extern int	open_file_in_directory(const char *directory, const char *fname);
+
+extern bool is_archive_file(const char *fname,
+							pg_compress_algorithm *compression);
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+								const char *waldir,
+								pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int	read_archive_wal_page(XLogDumpPrivate *privateInfo,
+								  XLogRecPtr targetPagePtr,
+								  Size count, char *readBuff);
+
+#endif							/* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..443126a9ce6 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
 
 use strict;
 use warnings FATAL => 'all';
+use Cwd;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
+my $tar = $ENV{TAR};
+
 program_help_ok('pg_waldump');
 program_version_ok('pg_waldump');
 program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
 sub test_pg_waldump
 {
 	local $Test::Builder::Level = $Test::Builder::Level + 1;
-	my @opts = @_;
+	my ($path, @opts) = @_;
 
 	my ($stdout, $stderr);
 
@@ -243,6 +246,7 @@ sub test_pg_waldump
 		'pg_waldump',
 		'--start' => $start_lsn,
 		'--end' => $end_lsn,
+		'--path' => $path,
 		@opts
 	  ],
 	  '>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
 	return @lines;
 }
 
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+	my ($archive, $directory, $compression_flags) = @_;
+
+	my @files;
+	opendir my $dh, $directory or die "opendir: $!";
+	while (my $entry = readdir $dh) {
+		# Skip '.' and '..'
+		next if $entry eq '.' || $entry eq '..';
+		push @files, $entry;
+	}
+	closedir $dh;
+
+	@files = sort @files;
+
+	# move into the WAL directory before archiving files
+	my $cwd = getcwd;
+	chdir($directory) || die "chdir: $!";
+	command_ok([$tar, $compression_flags, $archive, @files]);
+	chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
 
 my @scenario = (
 	{
-		'path' => $node->data_dir
+		'path' => $node->data_dir,
+		'is_archive' => 0,
+		'enabled' => 1
+	},
+	{
+		'path' => "$tmp_dir/pg_wal.tar",
+		'compression_method' => 'none',
+		'compression_flags' => '-cf',
+		'is_archive' => 1,
+		'enabled' => 1
+	},
+	{
+		'path' => "$tmp_dir/pg_wal.tar.gz",
+		'compression_method' => 'gzip',
+		'compression_flags' => '-czf',
+		'is_archive' => 1,
+		'enabled' => check_pg_config("#define HAVE_LIBZ 1")
 	});
 
 for my $scenario (@scenario)
@@ -267,6 +310,19 @@ for my $scenario (@scenario)
 
 	SKIP:
 	{
+		skip "tar command is not available", 3
+		  if !defined $tar || $tar eq '';
+		skip "$scenario->{'compression_method'} compression not supported by this build", 3
+		  if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+		# create pg_wal archive
+		if ($scenario->{'is_archive'})
+		{
+			generate_archive($path,
+				$node->data_dir . '/pg_wal',
+				$scenario->{'compression_flags'});
+		}
+
 		command_fails_like(
 			[ 'pg_waldump', '--path' => $path ],
 			qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenario)
 			qr/error: error in WAL record at/,
 			'errors are shown with --quiet');
 
-		@lines = test_pg_waldump('--path' => $path);
+		my @lines;
+		@lines = test_pg_waldump($path);
 		is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
 
-		@lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+		@lines = test_pg_waldump($path, '--limit' => 6);
 		is(@lines, 6, 'limit option observed');
 
-		@lines = test_pg_waldump('--path' => $path, '--fullpage');
+		@lines = test_pg_waldump($path, '--fullpage');
 		is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
 
-		@lines = test_pg_waldump('--path' => $path, '--stats');
+		@lines = test_pg_waldump($path, '--stats');
 		like($lines[0], qr/WAL statistics/, "statistics on stdout");
 		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
-		@lines = test_pg_waldump('--path' => $path, '--stats=record');
+		@lines = test_pg_waldump($path, '--stats=record');
 		like($lines[0], qr/WAL statistics/, "statistics on stdout");
 		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
-		@lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+		@lines = test_pg_waldump($path, '--rmgr' => 'Btree');
 		is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
 
-		@lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+		@lines = test_pg_waldump($path, '--fork' => 'init');
 		is(grep(!/fork init/, @lines), 0, 'only init fork lines');
 
-		@lines = test_pg_waldump('--path' => $path,
+		@lines = test_pg_waldump($path,
 			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
 		is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
 			0, 'only lines for selected relation');
 
-		@lines = test_pg_waldump('--path' => $path,
+		@lines = test_pg_waldump($path,
 			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
 			'--block' => 1);
 		is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+		# Cleanup.
+		unlink $path if $scenario->{'is_archive'};
 	}
 }
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index bb4e1b37005..de2ad42bcab 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -139,6 +139,8 @@ ArchiveOpts
 ArchiveShutdownCB
 ArchiveStartupCB
 ArchiveStreamState
+ArchivedWALEntry
+ArchivedWAL_hash
 ArchiverOutput
 ArchiverStage
 ArrayAnalyzeExtraData
@@ -3453,6 +3455,7 @@ astreamer_recovery_injector
 astreamer_tar_archiver
 astreamer_tar_parser
 astreamer_verify
+astreamer_waldump
 astreamer_zstd_frame
 auth_password_hook_typ
 autovac_table
-- 
2.47.1

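Note on member_is_wal_file() in the patch above: the name check takes the
last XLOG_FNAME_LEN characters of the member path and validates them as a
WAL file name. Below is a minimal standalone sketch of that logic, assuming
the usual 24-hex-digit naming (timeline, log id, segment). The helper name
parse_wal_member and the WAL_FNAME_LEN macro are hypothetical and exist only
for illustration; the patch itself relies on IsXLogFileName() and
XLogFromFileName().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* 8 hex digits each for timeline, log id, and segment */
#define WAL_FNAME_LEN 24

/*
 * Hypothetical standalone helper, not part of the patch: returns true and
 * fills *timeline and *segno if the tail of pathname looks like a WAL file.
 */
static bool
parse_wal_member(const char *pathname, uint32_t wal_seg_size,
				 uint32_t *timeline, uint64_t *segno)
{
	size_t		pathlen = strlen(pathname);
	const char *fname;
	unsigned int tli, log, seg;

	if (pathlen < WAL_FNAME_LEN)
		return false;

	/* Skip pax extended-header entries, as the patch does */
	if (strstr(pathname, "PaxHeaders.") != NULL)
		return false;

	/* The file name is the last WAL_FNAME_LEN characters of the path */
	fname = pathname + (pathlen - WAL_FNAME_LEN);
	if (strspn(fname, "0123456789ABCDEF") != WAL_FNAME_LEN)
		return false;

	if (sscanf(fname, "%08X%08X%08X", &tli, &log, &seg) != 3)
		return false;

	*timeline = tli;
	/* Segments per 4GB log id depend on the WAL segment size */
	*segno = (uint64_t) log * (UINT64_C(0x100000000) / wal_seg_size) + seg;
	return true;
}
```

With 16MB segments, "pg_wal/000000010000000000000003" yields timeline 1 and
segment 3, while a ".partial" file is rejected because its last 24 characters
are not all hex digits. One thing this makes visible: the check never looks
at the character preceding the name, so it may be worth confirming that a
'/' (or the start of the path) comes right before it.
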
From 866225e20f1389b94e25c40314b58332d7e0a6c5 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 6 Nov 2025 13:48:33 +0530
Subject: [PATCH v5 5/8] pg_waldump: Remove the restriction on the order of
 archived WAL files.

With the previous patch, pg_waldump would stop decoding if the WAL files
in the archive were not in sequential order. With this patch, decoding
continues: any WAL file that is out of order is written to a temporary
location, from which it is read later. Once a temporary file has been
read, it is removed.
---
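For reference, the temporary files are named by segment number only. A
minimal standalone sketch of the naming used by get_tmp_walseg_path(),
assuming the "waldump.tmp" prefix from this patch; the helper name
format_tmp_walseg_path is hypothetical and only for illustration:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical standalone helper mirroring get_tmp_walseg_path(): writes
 * "<tmpdir>/waldump.tmp.<logid><seg>" into dst, both parts as 8 hex digits.
 */
static int
format_tmp_walseg_path(char *dst, size_t dstlen, const char *tmpdir,
					   uint64_t segno, uint32_t wal_seg_size)
{
	/* Segments per 4GB log id, as computed by XLogSegmentsPerXLogId() */
	uint64_t	segs_per_id = UINT64_C(0x100000000) / wal_seg_size;

	return snprintf(dst, dstlen, "%s/waldump.tmp.%08X%08X", tmpdir,
					(unsigned int) (segno / segs_per_id),
					(unsigned int) (segno % segs_per_id));
}
```

With 16MB segments, segment number 511 in /tmp maps to
/tmp/waldump.tmp.00000001000000FF. The timeline is not part of the name,
which matches the hash table being keyed on segno alone.
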
 src/bin/pg_waldump/archive_waldump.c | 207 +++++++++++++++++++++++++--
 src/bin/pg_waldump/pg_waldump.c      |  41 +++++-
 src/bin/pg_waldump/pg_waldump.h      |   4 +
 src/bin/pg_waldump/t/001_basic.pl    |   3 +-
 4 files changed, 243 insertions(+), 12 deletions(-)

diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index e619e29d5d4..4a280b58ec2 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
 #include <unistd.h>
 
 #include "access/xlog_internal.h"
+#include "common/file_perm.h"
 #include "common/hashfn.h"
 #include "common/logging.h"
 #include "fe_utils/simple_list.h"
@@ -27,6 +28,11 @@
  */
 #define READ_CHUNK_SIZE				(128 * 1024)
 
+#define TEMP_FILE_PREFIX "waldump.tmp"
+
+/* Temporary exported WAL file directory */
+static char *TmpWalSegDir = NULL;
+
 /* Structure for storing the WAL segment data from the archive */
 typedef struct ArchivedWALEntry
 {
@@ -65,6 +71,11 @@ typedef struct astreamer_waldump
 static int	read_archive_file(XLogDumpPrivate *privateInfo, Size count);
 static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
 											   XLogDumpPrivate *privateInfo);
+static void setup_tmpseg_dir(const char *waldir);
+static void cleanup_tmpseg_dir_atexit(void);
+
+static FILE *prepare_tmp_write(XLogSegNo segno);
+static void perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file);
 
 static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
 static void astreamer_waldump_content(astreamer *streamer,
@@ -120,10 +131,11 @@ is_archive_file(const char *fname, pg_compress_algorithm *compression)
 }
 
 /*
- * Initializes the tar archive reader to read WAL files from the archive,
- * creates a hash table to store them, performs quick existence checks for WAL
- * entries in the archive and retrieves the WAL segment size, and sets up
- * filtering criteria for relevant entries.
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
  */
 void
 init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -194,6 +206,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
 	 */
 	XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
 	XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+
+	/*
+	 * Setup temporary directory to store WAL segments and set up an exit
+	 * callback to remove it upon completion.
+	 */
+	setup_tmpseg_dir(waldir);
+	atexit(cleanup_tmpseg_dir_atexit);
 }
 
 /*
@@ -362,13 +381,16 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
 /*
  * Returns the archived WAL entry from the hash table if it exists. Otherwise,
  * it invokes the routine to read the archived file and retrieve the entry if
- * it is not already in hash table.
+ * it is not already present in the hash table. If the archive streamer happens
+ * to be reading a WAL from archive file that is not currently needed, that WAL
+ * data is written to a temporary file.
  */
 static ArchivedWALEntry *
 get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
 {
 	ArchivedWALEntry *entry = NULL;
 	char		fname[MAXFNAMELEN];
+	FILE	   *write_fp = NULL;
 
 	/* Search hash table */
 	entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
@@ -411,11 +433,32 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
 			continue;
 		}
 
-		/* WAL segments must be archived in order */
-		pg_log_error("WAL files are not archived in sequential order");
-		pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
-							segno, entry->segno);
-		exit(1);
+		/*
+		 * Archive streamer is currently reading a file that isn't the one
+		 * asked for, but it will be needed later. It should be written to a
+		 * temporary location for retrieval when needed.
+		 */
+
+		/* Create a temporary file if one does not already exist */
+		if (!entry->tmpseg_exists)
+		{
+			write_fp = prepare_tmp_write(entry->segno);
+			entry->tmpseg_exists = true;
+		}
+
+		/* Flush data from the buffer to the file */
+		perform_tmp_write(entry->segno, &entry->buf, write_fp);
+		resetStringInfo(&entry->buf);
+
+		/*
+		 * The change in the current segment entry indicates that the reading
+		 * of this file has ended.
+		 */
+		if (entry != privateInfo->cur_wal && write_fp != NULL)
+		{
+			fclose(write_fp);
+			write_fp = NULL;
+		}
 	}
 
 	/* Requested WAL segment not found */
@@ -423,6 +466,150 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
 	pg_fatal("could not find file \"%s\" in archive", fname);
 }
 
+/*
+ * Set up a temporary directory to temporarily store WAL segments.
+ */
+static void
+setup_tmpseg_dir(const char *waldir)
+{
+	/*
+	 * Use the directory specified by the TMPDIR environment variable. If it's
+	 * not set, use the provided WAL directory to store the extracted WAL
+	 * files temporarily.
+	 */
+	TmpWalSegDir = getenv("TMPDIR") ?
+		pg_strdup(getenv("TMPDIR")) : pg_strdup(waldir);
+	canonicalize_path(TmpWalSegDir);
+}
+
+/*
+ * Remove the temporarily stored WAL segments, if any, at exit.
+ */
+static void
+cleanup_tmpseg_dir_atexit(void)
+{
+	ArchivedWAL_iterator it;
+	ArchivedWALEntry *entry;
+
+	ArchivedWAL_start_iterate(ArchivedWAL_HTAB, &it);
+	while ((entry = ArchivedWAL_iterate(ArchivedWAL_HTAB, &it)) != NULL)
+	{
+		if (entry->tmpseg_exists)
+		{
+			remove_tmp_walseg(entry->segno, false);
+			entry->tmpseg_exists = false;
+		}
+	}
+}
+
+/*
+ * Generate the temporary WAL file path.
+ *
+ * Note that the caller is responsible for pfree'ing the result.
+ */
+char *
+get_tmp_walseg_path(XLogSegNo segno)
+{
+	char	   *fpath = (char *) palloc(MAXPGPATH);
+
+	snprintf(fpath, MAXPGPATH, "%s/%s.%08X%08X",
+			 TmpWalSegDir,
+			 TEMP_FILE_PREFIX,
+			 (uint32) (segno / XLogSegmentsPerXLogId(WalSegSz)),
+			 (uint32) (segno % XLogSegmentsPerXLogId(WalSegSz)));
+
+	return fpath;
+}
+
+/*
+ * Routine to check whether a temporary file exists for the corresponding WAL
+ * segment number.
+ */
+bool
+tmp_walseg_exists(XLogSegNo segno)
+{
+	ArchivedWALEntry *entry;
+
+	entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+	if (entry == NULL)
+		return false;
+
+	return entry->tmpseg_exists;
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(XLogSegNo segno)
+{
+	FILE	   *file;
+	char	   *fpath;
+
+	fpath = get_tmp_walseg_path(segno);
+
+	/* Create an empty placeholder */
+	file = fopen(fpath, PG_BINARY_W);
+	if (file == NULL)
+		pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+	if (chmod(fpath, pg_file_create_mode))
+		pg_fatal("could not set permissions on file \"%s\": %m",
+				 fpath);
+#endif
+
+	pg_log_debug("temporarily exporting file \"%s\"", fpath);
+	pfree(fpath);
+
+	return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file)
+{
+	Assert(file);
+
+	errno = 0;
+	if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+	{
+		/*
+		 * If write didn't set errno, assume problem is no disk space
+		 */
+		if (errno == 0)
+			errno = ENOSPC;
+		pg_fatal("could not write to file \"%s\": %m",
+				 get_tmp_walseg_path(segno));
+	}
+}
+
+/*
+ * Remove temporary file
+ */
+void
+remove_tmp_walseg(XLogSegNo segno, bool update_entry)
+{
+	char	   *fpath = get_tmp_walseg_path(segno);
+
+	if (unlink(fpath) == 0)
+		pg_log_debug("removed file \"%s\"", fpath);
+	pfree(fpath);
+
+	/* Update entry if requested */
+	if (update_entry)
+	{
+		ArchivedWALEntry *entry;
+
+		entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+		Assert(entry != NULL);
+		entry->tmpseg_exists = false;
+	}
+}
+
 /*
  * Create an astreamer that can read WAL from tar file.
  */
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8a838f16ba2..8acb7809645 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -466,11 +466,50 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 {
 	XLogDumpPrivate *private = state->private_data;
 	int			count = required_read_len(private, targetPagePtr, reqLen);
+	XLogSegNo	nextSegNo;
 
 	if (private->endptr_reached)
 		return -1;
 
-	/* Read the WAL page from the archive streamer */
+	/*
+	 * If the target page is in a different segment, first check for the WAL
+	 * segment's physical existence in the temporary directory.
+	 */
+	nextSegNo = state->seg.ws_segno;
+	if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+	{
+		if (state->seg.ws_file >= 0)
+		{
+			close(state->seg.ws_file);
+			state->seg.ws_file = -1;
+
+			/* Remove this file, as it is no longer needed. */
+			remove_tmp_walseg(nextSegNo, true);
+		}
+
+		XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+		state->seg.ws_tli = private->timeline;
+		state->seg.ws_segno = nextSegNo;
+
+		/*
+		 * If the next segment exists, open it and continue reading from there
+		 */
+		if (tmp_walseg_exists(nextSegNo))
+		{
+			char	   *fpath;
+
+			fpath = get_tmp_walseg_path(nextSegNo);
+			state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+			pfree(fpath);
+		}
+	}
+
+	/* Continue reading from the open WAL segment, if any */
+	if (state->seg.ws_file >= 0)
+		return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+							   readBuff);
+
+	/* Otherwise, read the WAL page from the archive streamer */
 	return read_archive_wal_page(private, targetPagePtr, count, readBuff);
 }
 
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 54758c3548a..5c1fb1e080a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -58,4 +58,8 @@ extern int	read_archive_wal_page(XLogDumpPrivate *privateInfo,
 								  XLogRecPtr targetPagePtr,
 								  Size count, char *readBuff);
 
+extern char *get_tmp_walseg_path(XLogSegNo segno);
+extern bool tmp_walseg_exists(XLogSegNo segno);
+extern void remove_tmp_walseg(XLogSegNo segno, bool update_entry);
+
 #endif							/* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 443126a9ce6..d5fa1f6d28d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
+use List::Util qw(shuffle);
 
 my $tar = $ENV{TAR};
 
@@ -272,7 +273,7 @@ sub generate_archive
 	}
 	closedir $dh;
 
-	@files = sort @files;
+	@files = shuffle @files;
 
 	# move into the WAL directory before archiving files
 	my $cwd = getcwd;
-- 
2.47.1

From f750f5fece87a9f642225065a540ad4a2d209496 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v5 6/8] pg_verifybackup: Delay default WAL directory
 preparation.

We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
 src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5e6c13bb921..31ebc1581fb 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
 		manifest_path = psprintf("%s/backup_manifest",
 								 context.backup_directory);
 
-	/* By default, look for the WAL in the backup directory, too. */
-	if (wal_directory == NULL)
-		wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
 	/*
 	 * Try to read the manifest. We treat any errors encountered while parsing
 	 * the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
 	if (context.format == 'p' && !context.skip_checksums)
 		verify_backup_checksums(&context);
 
+	/* By default, look for the WAL in the backup directory, too. */
+	if (wal_directory == NULL)
+		wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
 	/*
 	 * Try to parse the required ranges of WAL records, unless we were told
 	 * not to do so.
-- 
2.47.1

From 33299daf17137bded756aedbe122232cc4ecc244 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v5 7/8] pg_verifybackup: Rename the wal-directory switch to
 wal-path

With the previous patches, pg_waldump can now decode WAL directly from
tar files.  This means you can specify a tar archive path instead of a
traditional WAL directory.

To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WAL files. This
change also aligns it with the existing manifest-path switch naming.
---
 doc/src/sgml/ref/pg_verifybackup.sgml     |  2 +-
 src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
 src/bin/pg_verifybackup/po/de.po          |  4 ++--
 src/bin/pg_verifybackup/po/el.po          |  4 ++--
 src/bin/pg_verifybackup/po/es.po          |  4 ++--
 src/bin/pg_verifybackup/po/fr.po          |  4 ++--
 src/bin/pg_verifybackup/po/it.po          |  4 ++--
 src/bin/pg_verifybackup/po/ja.po          |  4 ++--
 src/bin/pg_verifybackup/po/ka.po          |  4 ++--
 src/bin/pg_verifybackup/po/ko.po          |  4 ++--
 src/bin/pg_verifybackup/po/ru.po          |  4 ++--
 src/bin/pg_verifybackup/po/sv.po          |  4 ++--
 src/bin/pg_verifybackup/po/uk.po          |  4 ++--
 src/bin/pg_verifybackup/po/zh_CN.po       |  4 ++--
 src/bin/pg_verifybackup/po/zh_TW.po       |  4 ++--
 src/bin/pg_verifybackup/t/007_wal.pl      |  4 ++--
 16 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
 
      <varlistentry>
       <term><option>-w <replaceable class="parameter">path</replaceable></option></term>
-      <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+      <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
       <listitem>
        <para>
         Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31ebc1581fb..1ee400199da 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
 								 uint8 *buffer);
 static void parse_required_wal(verifier_context *context,
 							   char *pg_waldump_path,
-							   char *wal_directory);
+							   char *wal_path);
 static astreamer *create_archive_verifier(verifier_context *context,
 										  char *archive_name,
 										  Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
 		{"progress", no_argument, NULL, 'P'},
 		{"quiet", no_argument, NULL, 'q'},
 		{"skip-checksums", no_argument, NULL, 's'},
-		{"wal-directory", required_argument, NULL, 'w'},
+		{"wal-path", required_argument, NULL, 'w'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -135,7 +135,7 @@ main(int argc, char **argv)
 	char	   *manifest_path = NULL;
 	bool		no_parse_wal = false;
 	bool		quiet = false;
-	char	   *wal_directory = NULL;
+	char	   *wal_path = NULL;
 	char	   *pg_waldump_path = NULL;
 	DIR		   *dir;
 
@@ -221,8 +221,8 @@ main(int argc, char **argv)
 				context.skip_checksums = true;
 				break;
 			case 'w':
-				wal_directory = pstrdup(optarg);
-				canonicalize_path(wal_directory);
+				wal_path = pstrdup(optarg);
+				canonicalize_path(wal_path);
 				break;
 			default:
 				/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
 		verify_backup_checksums(&context);
 
 	/* By default, look for the WAL in the backup directory, too. */
-	if (wal_directory == NULL)
-		wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+	if (wal_path == NULL)
+		wal_path = psprintf("%s/pg_wal", context.backup_directory);
 
 	/*
 	 * Try to parse the required ranges of WAL records, unless we were told
 	 * not to do so.
 	 */
 	if (!no_parse_wal)
-		parse_required_wal(&context, pg_waldump_path, wal_directory);
+		parse_required_wal(&context, pg_waldump_path, wal_path);
 
 	/*
 	 * If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
  */
 static void
 parse_required_wal(verifier_context *context, char *pg_waldump_path,
-				   char *wal_directory)
+				   char *wal_path)
 {
 	manifest_data *manifest = context->manifest;
 	manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
 		char	   *pg_waldump_cmd;
 
 		pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
-								  pg_waldump_path, wal_directory, this_wal_range->tli,
+								  pg_waldump_path, wal_path, this_wal_range->tli,
 								  LSN_FORMAT_ARGS(this_wal_range->start_lsn),
 								  LSN_FORMAT_ARGS(this_wal_range->end_lsn));
 		fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
 	printf(_("  -P, --progress              show progress information\n"));
 	printf(_("  -q, --quiet                 do not print any output, except for errors\n"));
 	printf(_("  -s, --skip-checksums        skip checksum verification\n"));
-	printf(_("  -w, --wal-directory=PATH    use specified path for WAL files\n"));
+	printf(_("  -w, --wal-path=PATH         use specified path for WAL files\n"));
 	printf(_("  -V, --version               output version information, then exit\n"));
 	printf(_("  -?, --help                  show this help, then exit\n"));
 	printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr "  -s, --skip-checksums        Überprüfung der Prüfsummen überspringe
 
 #: pg_verifybackup.c:1379
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PFAD    angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PFAD    angegebenen Pfad für WAL-Dateien verwenden\n"
 
 #: pg_verifybackup.c:1380
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr "  -s, --skip-checksums        παράκαμψε την επαλήθευ
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH    χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr "  -s, --skip-checksums        omitir la verificación de la suma de comp
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    utilizar la ruta especificada para los archivos WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH    utilizar la ruta especificada para los archivos WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr "  -s, --skip-checksums        ignore la vérification des sommes de cont
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=CHEMIN  utilise le chemin spécifié pour les fichiers WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=CHEMIN  utilise le chemin spécifié pour les fichiers WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr "  -s, --skip-checksums         salta la verifica del checksum\n"
 
 #: pg_verifybackup.c:911
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH     usa il percorso specificato per i file WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH     usa il percorso specificato per i file WAL\n"
 
 #: pg_verifybackup.c:912
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr "  -s, --skip-checksums        チェックサム検証をスキップ\n"
 
 #: pg_verifybackup.c:1379
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    WALファイルに指定したパスを使用する\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH    WALファイルに指定したパスを使用する\n"
 
 #: pg_verifybackup.c:1380
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr "  -s, --skip-checksums        საკონტროლო ჯამ
 
 #: pg_verifybackup.c:1379
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=ბილიკი    WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=ბილიკი    WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
 
 #: pg_verifybackup.c:1380
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr "  -s, --skip-checksums        체크섬 검사 건너뜀\n"
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=경로    WAL 파일이 있는 경로 지정\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=경로    WAL 파일이 있는 경로 지정\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr "  -s, --skip-checksums        пропустить проверку ко
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
 msgstr ""
-"  -w, --wal-directory=ПУТЬ    использовать заданный путь к файлам WAL\n"
+"  -w, --wal-path=ПУТЬ    использовать заданный путь к файлам WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr "  -s, --skip-checksums        hoppa över verifiering av kontrollsummor\
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=SÖKVÄG  använd denna sökväg till WAL-filer\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=SÖKVÄG  använd denna sökväg till WAL-filer\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr "  -s, --skip-checksums не перевіряти контрольні с
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr "  -s, --skip-checksums        跳过校验和验证\n"
 
 #: pg_verifybackup.c:919
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    对WAL文件使用指定路径\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH    对WAL文件使用指定路径\n"
 
 #: pg_verifybackup.c:920
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr "  -s, --skip-checksums        跳過檢查碼驗證\n"
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    用指定的路徑存放 WAL 檔\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH    用指定的路徑存放 WAL 檔\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
 command_ok(
 	[
 		'pg_verifybackup',
-		'--wal-directory' => $relocated_pg_wal,
+		'--wal-path' => $relocated_pg_wal,
 		$backup_path
 	],
-	'--wal-directory can be used to specify WAL directory');
+	'--wal-path can be used to specify WAL directory');
 
 # Move directory back to original location.
 rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
-- 
2.47.1

From 2498f315388fc8a1a840a2b883bca107b0113c0e Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v5 8/8] pg_verifybackup: Enable WAL parsing for tar-format
 backups

Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
 doc/src/sgml/ref/pg_verifybackup.sgml         |  5 +-
 src/bin/pg_verifybackup/pg_verifybackup.c     | 66 +++++++++++++------
 src/bin/pg_verifybackup/t/002_algorithm.pl    |  4 --
 src/bin/pg_verifybackup/t/003_corruption.pl   |  4 +-
 src/bin/pg_verifybackup/t/008_untar.pl        |  5 +-
 src/bin/pg_verifybackup/t/010_client_untar.pl |  5 +-
 6 files changed, 50 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
    <literal>backup_manifest</literal> generated by the server at the time
    of the backup. The backup may be stored either in the "plain" or the "tar"
    format; this includes tar-format backups compressed with any algorithm
-   supported by <application>pg_basebackup</application>. However, at present,
-   <literal>WAL</literal> verification is supported only for plain-format
-   backups. Therefore, if the backup is stored in tar-format, the
-   <literal>-n, --no-parse-wal</literal> option should be used.
+   supported by <application>pg_basebackup</application>.
   </para>
 
   <para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 1ee400199da..4bfe6fdff16 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
 											  const char *fmt,...)
 			pg_attribute_printf(2, 3);
 
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+							  char **base_archive_path,
+							  char **wal_archive_path);
 static void verify_plain_backup_directory(verifier_context *context,
 										  char *relpath, char *fullpath,
 										  DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
 static void verify_control_file(const char *controlpath,
 								uint64 manifest_system_identifier);
 static void precheck_tar_backup_file(verifier_context *context, char *relpath,
-									 char *fullpath, SimplePtrList *tarfiles);
+									 char *fullpath, SimplePtrList *tarfiles,
+									 char **base_archive_path,
+									 char **wal_archive_path);
 static void verify_tar_file(verifier_context *context, char *relpath,
 							char *fullpath, astreamer *streamer);
 static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
 	bool		no_parse_wal = false;
 	bool		quiet = false;
 	char	   *wal_path = NULL;
+	char	   *base_archive_path = NULL;
+	char	   *wal_archive_path = NULL;
 	char	   *pg_waldump_path = NULL;
 	DIR		   *dir;
 
@@ -327,17 +333,6 @@ main(int argc, char **argv)
 		pfree(path);
 	}
 
-	/*
-	 * XXX: In the future, we should consider enhancing pg_waldump to read WAL
-	 * files from an archive.
-	 */
-	if (!no_parse_wal && context.format == 't')
-	{
-		pg_log_error("pg_waldump cannot read tar files");
-		pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
-		exit(1);
-	}
-
 	/*
 	 * Perform the appropriate type of verification appropriate based on the
 	 * backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
 		verify_plain_backup_directory(&context, NULL, context.backup_directory,
 									  dir);
 	else
-		verify_tar_backup(&context, dir);
+		verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
 
 	/*
 	 * The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
 	if (context.format == 'p' && !context.skip_checksums)
 		verify_backup_checksums(&context);
 
-	/* By default, look for the WAL in the backup directory, too. */
+	/*
+	 * By default, WAL files are expected to be found in the backup directory
+	 * for plain-format backups. In the case of tar-format backups, if a
+	 * separate WAL archive is not found, the WAL files are most likely
+	 * included within the main data directory archive.
+	 */
 	if (wal_path == NULL)
-		wal_path = psprintf("%s/pg_wal", context.backup_directory);
+	{
+		if (context.format == 'p')
+			wal_path = psprintf("%s/pg_wal", context.backup_directory);
+		else if (wal_archive_path)
+			wal_path = wal_archive_path;
+		else if (base_archive_path)
+			wal_path = base_archive_path;
+		else
+		{
+			pg_log_error("WAL archive not found");
+			pg_log_error_hint("Specify the correct path using the option -w/--wal-path, "
+							  "or use -n/--no-parse-wal when verifying a tar-format backup.");
+			exit(1);
+		}
+	}
 
 	/*
 	 * Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
  * close when we're done with it.
  */
 static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+				  char **wal_archive_path)
 {
 	struct dirent *dirent;
 	SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
 			char	   *fullpath;
 
 			fullpath = psprintf("%s/%s", context->backup_directory, filename);
-			precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+			precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+									 base_archive_path, wal_archive_path);
 			pfree(fullpath);
 		}
 	}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
  *
  * The arguments to this function are mostly the same as the
  * verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
  */
 static void
 precheck_tar_backup_file(verifier_context *context, char *relpath,
-						 char *fullpath, SimplePtrList *tarfiles)
+						 char *fullpath, SimplePtrList *tarfiles,
+						 char **base_archive_path, char **wal_archive_path)
 {
 	struct stat sb;
 	Oid			tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
 	 * extension such as .gz, .lz4, or .zst.
 	 */
 	if (strncmp("base", relpath, 4) == 0)
+	{
 		suffix = relpath + 4;
+
+		*base_archive_path = pstrdup(fullpath);
+	}
 	else if (strncmp("pg_wal", relpath, 6) == 0)
+	{
 		suffix = relpath + 6;
+
+		*wal_archive_path = pstrdup(fullpath);
+	}
 	else
 	{
 		/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
 	{
 		# Add switch to get a tar-format backup
 		push @backup, ('--format' => 'tar');
-
-		# Add switch to skip WAL verification, which is not yet supported for
-		# tar-format backups
-		push @verify, ('--no-parse-wal');
 	}
 
 	# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
 			command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
 			chdir($cwd) || die "chdir: $!";
 
-			# Now check that the backup no longer verifies. We must use -n
-			# here, because pg_waldump can't yet read WAL from a tarfile.
 			command_fails_like(
-				[ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+				[ 'pg_verifybackup', $tar_backup_path ],
 				$scenario->{'fails_like'},
 				"corrupt backup fails verification: $name");
 
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..09079a94fee 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
 		SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
 
 my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
 
 my @test_configuration = (
 	{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
 		# Verify tar backup.
 		$primary->command_ok(
 			[
-				'pg_verifybackup', '--no-parse-wal',
-				'--exit-on-error', $backup_path,
+				'pg_verifybackup', '--exit-on-error', $backup_path,
 			],
 			"verify backup, compression $method");
 
 		# Cleanup.
 		rmtree($backup_path);
-		rmtree($extract_path);
 	}
 }
 
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..5b0e76ee69d 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
 close $jf;
 
 my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
 
 my @test_configuration = (
 	{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
 		# Verify tar backup.
 		$primary->command_ok(
 			[
-				'pg_verifybackup', '--no-parse-wal',
-				'--exit-on-error', $backup_path,
+				'pg_verifybackup', '--exit-on-error', $backup_path,
 			],
 			"verify backup, compression $method");
 
 		# Cleanup.
-		rmtree($extract_path);
 		rmtree($backup_path);
 	}
 }
-- 
2.47.1

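The default WAL path selection that patch 8/8 adds to main() can be read in isolation as a small fallback chain. The following is only a minimal standalone sketch of that order, not code from the patch: choose_wal_path() and its buf/buflen parameters are hypothetical names introduced for illustration, and plain snprintf() stands in for psprintf(). The order of preference is assumed to be exactly what the hunk shows: an explicit -w/--wal-path value, then <backup_directory>/pg_wal for plain format, then the pg_wal archive, then the base archive.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): return the path that would
 * be handed to pg_waldump, or NULL when nothing usable was found.
 *
 * user_wal_path     value of -w/--wal-path, or NULL if not given
 * format            'p' for plain, 't' for tar
 * wal_archive_path  full path of a pg_wal archive seen while scanning, or NULL
 * base_archive_path full path of a base archive seen while scanning, or NULL
 * buf, buflen       caller-provided scratch space for the plain-format default
 */
const char *
choose_wal_path(const char *user_wal_path, char format,
				const char *backup_directory,
				const char *wal_archive_path,
				const char *base_archive_path,
				char *buf, size_t buflen)
{
	/* An explicit switch always wins. */
	if (user_wal_path != NULL)
		return user_wal_path;

	/* Plain format: WAL is expected in the backup directory itself. */
	if (format == 'p')
	{
		snprintf(buf, buflen, "%s/pg_wal", backup_directory);
		return buf;
	}

	/* Tar format: prefer a separate WAL archive when one exists. */
	if (wal_archive_path != NULL)
		return wal_archive_path;

	/* Otherwise the WAL is most likely inside the main archive. */
	if (base_archive_path != NULL)
		return base_archive_path;

	/* Caller should report an error and suggest -w or -n. */
	return NULL;
}
```

A NULL result corresponds to the error branch in the patch, where the user is told to pass -w/--wal-path or -n/--no-parse-wal.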