On Tue, Aug 20, 2024 at 3:56 PM Amul Sul <sula...@gmail.com> wrote:
>
> On Sat, Aug 17, 2024 at 1:34 AM Robert Haas <robertmh...@gmail.com> wrote:
> >
> > On Fri, Aug 16, 2024 at 3:53 PM Robert Haas <robertmh...@gmail.com> wrote:
[...]
> > There's probably more to look at here but I'm running out of energy for 
> > today.
> >
>
> Thank you for the review and committing 0004 and 0006 patches.
>

I have reworked a few comments, revised error messages, and made some
minor tweaks in the attached version.

Additionally, I would like to discuss a couple of concerns regarding
error placement and function names to gather your suggestions.

0007 patch: Regarding error placement:

1. I'm a bit unsure about the (bytes_read != m->size) check that I
placed in verify_checksum() and whether it's in the right place. Per
our previous discussion, this check is applicable to plain backup
files since they can change while being read, but not for files
belonging to tar backups. For consistency, I included the check for
tar backups as well, as it doesn't cause any harm. Is it okay to keep
this check in verify_checksum(), or should I move it back to
verify_file_checksum() and apply it only to the plain backup format?

2. For the verify_checksum() function, I kept the argument name as
bytes_read. Should we rename it to something more meaningful like
computed_bytes, computed_size, or checksum_computed_size?


0011 patch: Regarding function names:

1. named the function verify_tar_backup_file() to align with
verify_plain_backup_file(), but it does not perform the complete
verification as verify_plain_backup_file does. Not sure if it is the
right name.

2. verify_tar_file_contents() is the second and final part of tar
backup verification. Should its name be aligned with
verify_tar_backup_file()? I’m unsure what the best name would be.
Perhaps verify_tar_backup_file_final(), but then
verify_tar_backup_file() would need to be renamed to something like
verify_tar_backup_file_initial(), which might be too lengthy.

3. verify_tar_contents() is the core of verify_tar_file_contents()
that handles the actual verification. I’m unsure about the current
naming. Should we rename it to something like
verify_tar_contents_core()? It wouldn’t be an issue if we renamed
verify_tar_file_contents() as pointed in point #2.

Regards,
Amul
From 8c5c624fa89eaa42e8812dad840842b30ec35ec7 Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Wed, 14 Aug 2024 10:42:37 +0530
Subject: [PATCH v12 06/12] Refactor: split verify_backup_file() function and
 rename it.

The function verify_backup_file() has now been renamed to
verify_plain_backup_file() to make it clearer that it is specifically
used for verifying files in a plain backup. Similarly, in a future
patch, we would have a verify_tar_backup_file() function for
verifying TAR backup files.

In addition to that, moved the manifest entry verification code into a
new function called verify_manifest_entry() so that it can be reused
for tar backup verification. If verify_manifest_entry() doesn't find
an entry, it reports an error as before and returns NULL to the
caller. This is why a NULL check is added to should_verify_checksum().
---
 src/bin/pg_verifybackup/pg_verifybackup.c | 58 +++++++++++++++--------
 src/bin/pg_verifybackup/pg_verifybackup.h |  6 ++-
 2 files changed, 42 insertions(+), 22 deletions(-)

diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 3fcfb167217..5bfc98e7874 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -64,8 +64,8 @@ static void report_manifest_error(JsonManifestParseContext *context,
 
 static void verify_backup_directory(verifier_context *context,
 									char *relpath, char *fullpath);
-static void verify_backup_file(verifier_context *context,
-							   char *relpath, char *fullpath);
+static void verify_plain_backup_file(verifier_context *context,
+									 char *relpath, char *fullpath);
 static void verify_control_file(const char *controlpath,
 								uint64 manifest_system_identifier);
 static void report_extra_backup_files(verifier_context *context);
@@ -570,7 +570,7 @@ verify_backup_directory(verifier_context *context, char *relpath,
 			newrelpath = psprintf("%s/%s", relpath, filename);
 
 		if (!should_ignore_relpath(context, newrelpath))
-			verify_backup_file(context, newrelpath, newfullpath);
+			verify_plain_backup_file(context, newrelpath, newfullpath);
 
 		pfree(newfullpath);
 		pfree(newrelpath);
@@ -591,7 +591,8 @@ verify_backup_directory(verifier_context *context, char *relpath,
  * verify_backup_directory.
  */
 static void
-verify_backup_file(verifier_context *context, char *relpath, char *fullpath)
+verify_plain_backup_file(verifier_context *context, char *relpath,
+						 char *fullpath)
 {
 	struct stat sb;
 	manifest_file *m;
@@ -627,6 +628,32 @@ verify_backup_file(verifier_context *context, char *relpath, char *fullpath)
 		return;
 	}
 
+	/* Check the backup manifest entry for this file. */
+	m = verify_manifest_entry(context, relpath, sb.st_size);
+
+	/*
+	 * Validate the manifest system identifier, not available in manifest
+	 * version 1.
+	 */
+	if (context->manifest->version != 1 &&
+		strcmp(relpath, "global/pg_control") == 0 &&
+		m != NULL && m->matched && !m->bad)
+		verify_control_file(fullpath, context->manifest->system_identifier);
+
+	/* Update statistics for progress report, if necessary */
+	if (show_progress && !context->skip_checksums &&
+		should_verify_checksum(m))
+		total_size += m->size;
+}
+
+/*
+ * Verify file and its size entry in the manifest.
+ */
+manifest_file *
+verify_manifest_entry(verifier_context *context, char *relpath, int64 filesize)
+{
+	manifest_file *m;
+
 	/* Check whether there's an entry in the manifest hash. */
 	m = manifest_files_lookup(context->manifest->files, relpath);
 	if (m == NULL)
@@ -634,40 +661,29 @@ verify_backup_file(verifier_context *context, char *relpath, char *fullpath)
 		report_backup_error(context,
 							"\"%s\" is present on disk but not in the manifest",
 							relpath);
-		return;
+		return NULL;
 	}
 
 	/* Flag this entry as having been encountered in the filesystem. */
 	m->matched = true;
 
 	/* Check that the size matches. */
-	if (m->size != sb.st_size)
+	if (m->size != filesize)
 	{
 		report_backup_error(context,
 							"\"%s\" has size %lld on disk but size %zu in the manifest",
-							relpath, (long long int) sb.st_size, m->size);
+							relpath, (long long int) filesize, m->size);
 		m->bad = true;
 	}
 
-	/*
-	 * Validate the manifest system identifier, not available in manifest
-	 * version 1.
-	 */
-	if (context->manifest->version != 1 &&
-		strcmp(relpath, "global/pg_control") == 0)
-		verify_control_file(fullpath, context->manifest->system_identifier);
-
-	/* Update statistics for progress report, if necessary */
-	if (show_progress && !context->skip_checksums &&
-		should_verify_checksum(m))
-		total_size += m->size;
-
 	/*
 	 * We don't verify checksums at this stage. We first finish verifying that
 	 * we have the expected set of files with the expected sizes, and only
 	 * afterwards verify the checksums. That's because computing checksums may
 	 * take a while, and we'd like to report more obvious problems quickly.
 	 */
+
+	return m;
 }
 
 /*
@@ -830,7 +846,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
 
 	/*
 	 * Double-check that we read the expected number of bytes from the file.
-	 * Normally, a file size mismatch would be caught in verify_backup_file
+	 * Normally, a file size mismatch would be caught in verify_manifest_entry
 	 * and this check would never be reached, but this provides additional
 	 * safety and clarity in the event of concurrent modifications or
 	 * filesystem misbehavior.
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.h b/src/bin/pg_verifybackup/pg_verifybackup.h
index d8c566ed587..ff9476e356e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.h
+++ b/src/bin/pg_verifybackup/pg_verifybackup.h
@@ -37,7 +37,8 @@ typedef struct manifest_file
 } manifest_file;
 
 #define should_verify_checksum(m) \
-	(((m)->matched) && !((m)->bad) && (((m)->checksum_type) != CHECKSUM_TYPE_NONE))
+	(((m) != NULL) && ((m)->matched) && !((m)->bad) && \
+	 (((m)->checksum_type) != CHECKSUM_TYPE_NONE))
 
 /*
  * Define a hash table which we can use to store information about the files
@@ -93,6 +94,9 @@ typedef struct verifier_context
 	bool		saw_any_error;
 } verifier_context;
 
+extern manifest_file *verify_manifest_entry(verifier_context *context,
+											char *relpath, int64 filesize);
+
 extern void report_backup_error(verifier_context *context,
 								const char *pg_restrict fmt,...)
 			pg_attribute_printf(2, 3);
-- 
2.18.0

From 0d01f43ee33c20cbd1341a3454e312d85612104e Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Wed, 14 Aug 2024 15:14:15 +0530
Subject: [PATCH v12 07/12] Refactor: split verify_file_checksum() function.

Move the core functionality of verify_file_checksum to a new function
to reuse it instead of duplicating the code.

The verify_file_checksum() function is designed to take a file path,
open and read the file contents, and then calculate the checksum.
However, for TAR backups, instead of a file path, we receive the file
content in chunks, and the checksum needs to be calculated
incrementally. While the initial operations for plain and TAR backup
checksum calculations differ, the final checks and error handling are
the same. By moving the shared logic to a separate function, we can
reuse the code for both types of backups.
---
 src/bin/pg_verifybackup/pg_verifybackup.c | 20 +++++++++++++++++---
 src/bin/pg_verifybackup/pg_verifybackup.h |  3 +++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5bfc98e7874..e44d0377cd5 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -792,8 +792,6 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
 	int			fd;
 	int			rc;
 	size_t		bytes_read = 0;
-	uint8		checksumbuf[PG_CHECKSUM_MAX_LENGTH];
-	int			checksumlen;
 
 	/* Open the target file. */
 	if ((fd = open(fullpath, O_RDONLY | PG_BINARY, 0)) < 0)
@@ -844,6 +842,22 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
 	if (rc < 0)
 		return;
 
+	/* Do the final computation and verification. */
+	verify_checksum(context, m, &checksum_ctx, bytes_read);
+}
+
+/*
+ * A helper function to finalize checksum computation and verify it against the
+ * backup manifest information.
+ */
+void
+verify_checksum(verifier_context *context, manifest_file *m,
+				pg_checksum_context *checksum_ctx, int64 bytes_read)
+{
+	const char *relpath = m->pathname;
+	int			checksumlen;
+	uint8		checksumbuf[PG_CHECKSUM_MAX_LENGTH];
+
 	/*
 	 * Double-check that we read the expected number of bytes from the file.
 	 * Normally, a file size mismatch would be caught in verify_manifest_entry
@@ -860,7 +874,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
 	}
 
 	/* Get the final checksum. */
-	checksumlen = pg_checksum_final(&checksum_ctx, checksumbuf);
+	checksumlen = pg_checksum_final(checksum_ctx, checksumbuf);
 	if (checksumlen < 0)
 	{
 		report_backup_error(context,
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.h b/src/bin/pg_verifybackup/pg_verifybackup.h
index ff9476e356e..fe0ce8a89aa 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.h
+++ b/src/bin/pg_verifybackup/pg_verifybackup.h
@@ -96,6 +96,9 @@ typedef struct verifier_context
 
 extern manifest_file *verify_manifest_entry(verifier_context *context,
 											char *relpath, int64 filesize);
+extern void verify_checksum(verifier_context *context, manifest_file *m,
+							pg_checksum_context *checksum_ctx,
+							int64 bytes_read);
 
 extern void report_backup_error(verifier_context *context,
 								const char *pg_restrict fmt,...)
-- 
2.18.0

From b877ed19cdb43bdce5aabb715b4dd6a4a3642d1a Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Wed, 21 Aug 2024 10:51:23 +0530
Subject: [PATCH v12 08/12] Refactor: split verify_control_file.

Separated the manifest entry verification code into a new function and
introduced the should_verify_control_data() macro, similar to
should_verify_checksum().

Like verify_file_checksum(), verify_control_file() is too design to
accept the pg_control file patch which will be opened and respective
information will be verified. But, in case of tar backup we would be
having pg_control file contents instead, that needs to be verified in
the same way.  For that reason the code that doing the verification is
separated into separate function to so that can be reused for the tar
backup verification as well.
---
 src/bin/pg_verifybackup/pg_verifybackup.c | 42 ++++++++++-------------
 src/bin/pg_verifybackup/pg_verifybackup.h | 14 ++++++++
 2 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index e44d0377cd5..d04e1d8c8ac 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -66,8 +66,6 @@ static void verify_backup_directory(verifier_context *context,
 									char *relpath, char *fullpath);
 static void verify_plain_backup_file(verifier_context *context,
 									 char *relpath, char *fullpath);
-static void verify_control_file(const char *controlpath,
-								uint64 manifest_system_identifier);
 static void report_extra_backup_files(verifier_context *context);
 static void verify_backup_checksums(verifier_context *context);
 static void verify_file_checksum(verifier_context *context,
@@ -631,14 +629,20 @@ verify_plain_backup_file(verifier_context *context, char *relpath,
 	/* Check the backup manifest entry for this file. */
 	m = verify_manifest_entry(context, relpath, sb.st_size);
 
-	/*
-	 * Validate the manifest system identifier, not available in manifest
-	 * version 1.
-	 */
-	if (context->manifest->version != 1 &&
-		strcmp(relpath, "global/pg_control") == 0 &&
-		m != NULL && m->matched && !m->bad)
-		verify_control_file(fullpath, context->manifest->system_identifier);
+	/* Validate the pg_control information */
+	if (should_verify_control_data(context->manifest, m))
+	{
+		ControlFileData *control_file;
+		bool		crc_ok;
+
+		pg_log_debug("reading \"%s\"", fullpath);
+		control_file = get_controlfile_by_exact_path(fullpath, &crc_ok);
+
+		verify_control_data(control_file, fullpath, crc_ok,
+							context->manifest->system_identifier);
+		/* Release memory. */
+		pfree(control_file);
+	}
 
 	/* Update statistics for progress report, if necessary */
 	if (show_progress && !context->skip_checksums &&
@@ -687,18 +691,13 @@ verify_manifest_entry(verifier_context *context, char *relpath, int64 filesize)
 }
 
 /*
- * Sanity check control file and validate system identifier against manifest
- * system identifier.
+ * Sanity check control file data and validate system identifier against
+ * manifest system identifier.
  */
-static void
-verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
+void
+verify_control_data(ControlFileData *control_file, const char *controlpath,
+					bool crc_ok, uint64 manifest_system_identifier)
 {
-	ControlFileData *control_file;
-	bool		crc_ok;
-
-	pg_log_debug("reading \"%s\"", controlpath);
-	control_file = get_controlfile_by_exact_path(controlpath, &crc_ok);
-
 	/* Control file contents not meaningful if CRC is bad. */
 	if (!crc_ok)
 		report_fatal_error("%s: CRC is incorrect", controlpath);
@@ -714,9 +713,6 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
 						   controlpath,
 						   (unsigned long long) manifest_system_identifier,
 						   (unsigned long long) control_file->system_identifier);
-
-	/* Release memory. */
-	pfree(control_file);
 }
 
 /*
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.h b/src/bin/pg_verifybackup/pg_verifybackup.h
index fe0ce8a89aa..818064c6eed 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.h
+++ b/src/bin/pg_verifybackup/pg_verifybackup.h
@@ -40,6 +40,17 @@ typedef struct manifest_file
 	(((m) != NULL) && ((m)->matched) && !((m)->bad) && \
 	 (((m)->checksum_type) != CHECKSUM_TYPE_NONE))
 
+/*
+ * Validate the control file and its system identifier against the manifest
+ * system identifier. Note that this feature is not available in manifest
+ * version 1. This validation should only be performed after the manifest entry
+ * validation for the pg_control file has been completed without errors.
+ */
+#define should_verify_control_data(manifest, m) \
+	(((manifest)->version != 1) && \
+	 ((m) != NULL) && ((m)->matched) && !((m)->bad) && \
+	 (strcmp((m)->pathname, "global/pg_control") == 0))
+
 /*
  * Define a hash table which we can use to store information about the files
  * mentioned in the backup manifest.
@@ -99,6 +110,9 @@ extern manifest_file *verify_manifest_entry(verifier_context *context,
 extern void verify_checksum(verifier_context *context, manifest_file *m,
 							pg_checksum_context *checksum_ctx,
 							int64 bytes_read);
+extern void verify_control_data(ControlFileData *control_file,
+								const char *controlpath, bool crc_ok,
+								uint64 manifest_system_identifier);
 
 extern void report_backup_error(verifier_context *context,
 								const char *pg_restrict fmt,...)
-- 
2.18.0

From 61edb2a348d87563aa3fde6b7c6b9e3a6dbeda9f Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Thu, 8 Aug 2024 16:01:33 +0530
Subject: [PATCH v12 09/12] Add simple_ptr_list_destroy() and
 simple_ptr_list_destroy_deep() API.

We didn't have any helper function to destroy SimplePtrList, likely
because it wasn't needed before, but it's required in a later patch in
this set.  I've added two functions for this purpose, inspired by
list_free() and list_free_deep().
---
 src/fe_utils/simple_list.c         | 39 ++++++++++++++++++++++++++++++
 src/include/fe_utils/simple_list.h |  2 ++
 2 files changed, 41 insertions(+)

diff --git a/src/fe_utils/simple_list.c b/src/fe_utils/simple_list.c
index 2d88eb54067..9d218911c31 100644
--- a/src/fe_utils/simple_list.c
+++ b/src/fe_utils/simple_list.c
@@ -173,3 +173,42 @@ simple_ptr_list_append(SimplePtrList *list, void *ptr)
 		list->head = cell;
 	list->tail = cell;
 }
+
+/*
+ * Destroy a pointer list and optionally the pointed-to element
+ */
+static void
+simple_ptr_list_destroy_private(SimplePtrList *list, bool deep)
+{
+	SimplePtrListCell *cell;
+
+	cell = list->head;
+	while (cell != NULL)
+	{
+		SimplePtrListCell *next;
+
+		next = cell->next;
+		if (deep)
+			pg_free(cell->ptr);
+		pg_free(cell);
+		cell = next;
+	}
+}
+
+/*
+ * Destroy a pointer list and the pointed-to element
+ */
+void
+simple_ptr_list_destroy_deep(SimplePtrList *list)
+{
+	simple_ptr_list_destroy_private(list, true);
+}
+
+/*
+ * Destroy only pointer list and not the pointed-to element
+ */
+void
+simple_ptr_list_destroy(SimplePtrList *list)
+{
+	simple_ptr_list_destroy_private(list, false);
+}
diff --git a/src/include/fe_utils/simple_list.h b/src/include/fe_utils/simple_list.h
index d42ecded8ed..5b7cbec8a62 100644
--- a/src/include/fe_utils/simple_list.h
+++ b/src/include/fe_utils/simple_list.h
@@ -66,5 +66,7 @@ extern void simple_string_list_destroy(SimpleStringList *list);
 extern const char *simple_string_list_not_touched(SimpleStringList *list);
 
 extern void simple_ptr_list_append(SimplePtrList *list, void *ptr);
+extern void simple_ptr_list_destroy_deep(SimplePtrList *list);
+extern void simple_ptr_list_destroy(SimplePtrList *list);
 
 #endif							/* SIMPLE_LIST_H */
-- 
2.18.0

From 0ba8580c0ee6489f1195bae4d3d5427543479f76 Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Wed, 21 Aug 2024 11:03:44 +0530
Subject: [PATCH v12 10/12] pg_verifybackup: Add backup format and compression
 option

----
NOTE: This patch is not meant to be committed separately. It should
be squashed with the next patch that implements tar format support.
----
---
 src/bin/pg_verifybackup/pg_verifybackup.c | 73 ++++++++++++++++++++++-
 src/bin/pg_verifybackup/pg_verifybackup.h |  1 +
 2 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index d04e1d8c8ac..c186678a592 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -62,6 +62,7 @@ static void report_manifest_error(JsonManifestParseContext *context,
 								  const char *fmt,...)
 			pg_attribute_printf(2, 3) pg_attribute_noreturn();
 
+static char find_backup_format(verifier_context *context);
 static void verify_backup_directory(verifier_context *context,
 									char *relpath, char *fullpath);
 static void verify_plain_backup_file(verifier_context *context,
@@ -97,6 +98,7 @@ main(int argc, char **argv)
 		{"exit-on-error", no_argument, NULL, 'e'},
 		{"ignore", required_argument, NULL, 'i'},
 		{"manifest-path", required_argument, NULL, 'm'},
+		{"format", required_argument, NULL, 'F'},
 		{"no-parse-wal", no_argument, NULL, 'n'},
 		{"progress", no_argument, NULL, 'P'},
 		{"quiet", no_argument, NULL, 'q'},
@@ -118,6 +120,7 @@ main(int argc, char **argv)
 	progname = get_progname(argv[0]);
 
 	memset(&context, 0, sizeof(context));
+	context.format = '\0';
 
 	if (argc > 1)
 	{
@@ -154,7 +157,7 @@ main(int argc, char **argv)
 	simple_string_list_append(&context.ignore_list, "recovery.signal");
 	simple_string_list_append(&context.ignore_list, "standby.signal");
 
-	while ((c = getopt_long(argc, argv, "ei:m:nPqsw:", long_options, NULL)) != -1)
+	while ((c = getopt_long(argc, argv, "eF:i:m:nPqsw:", long_options, NULL)) != -1)
 	{
 		switch (c)
 		{
@@ -173,6 +176,15 @@ main(int argc, char **argv)
 				manifest_path = pstrdup(optarg);
 				canonicalize_path(manifest_path);
 				break;
+			case 'F':
+				if (strcmp(optarg, "p") == 0 || strcmp(optarg, "plain") == 0)
+					context.format = 'p';
+				else if (strcmp(optarg, "t") == 0 || strcmp(optarg, "tar") == 0)
+					context.format = 't';
+				else
+					pg_fatal("invalid backup format \"%s\", must be \"plain\" or \"tar\"",
+							 optarg);
+				break;
 			case 'n':
 				no_parse_wal = true;
 				break;
@@ -220,11 +232,26 @@ main(int argc, char **argv)
 		pg_fatal("cannot specify both %s and %s",
 				 "-P/--progress", "-q/--quiet");
 
+	/* Determine the backup format if it hasn't been specified. */
+	if (context.format == '\0')
+		context.format = find_backup_format(&context);
+
 	/* Unless --no-parse-wal was specified, we will need pg_waldump. */
 	if (!no_parse_wal)
 	{
 		int			ret;
 
+		/*
+		 * XXX: In the future, we should consider enhancing pg_waldump to read
+		 * WAL files from the tar archive.
+		 */
+		if (context.format != 'p')
+		{
+			pg_log_error("pg_waldump cannot read from a tar archive");
+			pg_log_error_hint("You must use -n or --no-parse-wal when verifying a tar-format backup.");
+			exit(1);
+		}
+
 		pg_waldump_path = pg_malloc(MAXPGPATH);
 		ret = find_other_exec(argv[0], "pg_waldump",
 							  "pg_waldump (PostgreSQL) " PG_VERSION "\n",
@@ -279,8 +306,13 @@ main(int argc, char **argv)
 	/*
 	 * Now do the expensive work of verifying file checksums, unless we were
 	 * told to skip it.
+	 *
+	 * We were only checking the plain backup here. For the tar backup, file
+	 * checksums verification (if requested) will be done immediately when the
+	 * file is accessed, as we don't have random access to the files like we
+	 * do with plain backups.
 	 */
-	if (!context.skip_checksums)
+	if (!context.skip_checksums && context.format == 'p')
 		verify_backup_checksums(&context);
 
 	/*
@@ -981,6 +1013,42 @@ should_ignore_relpath(verifier_context *context, const char *relpath)
 	return false;
 }
 
+/*
+ * To detect the backup format, it checks for the PG_VERSION file in the backup
+ * directory. If found, it will be considered a plain-format backup; otherwise,
+ * it will be assumed to be a tar-format backup.
+ */
+static char
+find_backup_format(verifier_context *context)
+{
+	char		result;
+	char	   *path;
+	struct stat sb;
+
+	/* Should be here only if the backup format is unknown */
+	Assert(context->format == '\0');
+
+	/* Check PG_VERSION file. */
+	path = psprintf("%s/%s", context->backup_directory, "PG_VERSION");
+	if (stat(path, &sb) == 0)
+		result = 'p';
+	else
+	{
+		if (errno != ENOENT)
+		{
+			pg_log_error("cannot determine backup format : %m");
+			pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+			exit(1);
+		}
+
+		/* Otherwise, it is assumed to be a tar format backup. */
+		result = 't';
+	}
+	pfree(path);
+
+	return result;
+}
+
 /*
  * Print a progress report based on the global variables.
  *
@@ -1038,6 +1106,7 @@ usage(void)
 	printf(_("  -e, --exit-on-error         exit immediately on error\n"));
 	printf(_("  -i, --ignore=RELATIVE_PATH  ignore indicated path\n"));
 	printf(_("  -m, --manifest-path=PATH    use specified path for manifest\n"));
+	printf(_("  -F, --format=p|t            backup format (plain, tar)\n"));
 	printf(_("  -n, --no-parse-wal          do not try to parse WAL files\n"));
 	printf(_("  -P, --progress              show progress information\n"));
 	printf(_("  -q, --quiet                 do not print any output, except for errors\n"));
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.h b/src/bin/pg_verifybackup/pg_verifybackup.h
index 818064c6eed..80031ad4dbc 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.h
+++ b/src/bin/pg_verifybackup/pg_verifybackup.h
@@ -100,6 +100,7 @@ typedef struct verifier_context
 	manifest_data *manifest;
 	char	   *backup_directory;
 	SimpleStringList ignore_list;
+	char		format;			/* backup format:  p(lain)/t(ar) */
 	bool		skip_checksums;
 	bool		exit_on_error;
 	bool		saw_any_error;
-- 
2.18.0

From 34ef4b8f5bad4c232aec86e837bad135449a0f02 Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Wed, 21 Aug 2024 12:49:04 +0530
Subject: [PATCH v12 11/12] pg_verifybackup: Read tar files and verify its
 contents

This patch implements TAR format backup verification.

For progress reporting support, we perform this verification in two
passes: the first pass calculates total_size, and the second pass
updates done_size as verification progresses.

For the verification, in the first pass, we call verify_tar_backup_file(),
which performs basic verification by expecting only base.tar, pg_wal.tar, or
<tablespaceoid>.tar files and raises an error for any other files.  It
also determines the compression type of the archive file. All this
information is stored in a newly added tarFile struct, which is
appended to a list that will be used in the second pass (by
verify_tar_content()) for the final verification. In the second pass,
the tar archives are read, decompressed, and the required verification
is carried out.

For reading and decompression, fe_utils/astreamer.h is used. For
verification, a new archive streamer has been added in
astreamer_verify.c to handle TAR member files and their contents; see
astreamer_verify_content() for details. The stack of astreamers will
be set up for each TAR file in verify_tar_content(), depending on its
compression type which is detected in the first pass.

When information about a TAR member file (i.e., ASTREAMER_MEMBER_HEADER)
is received, we first verify its entry against the backup manifest. We
then decide if further checks are needed, such as checksum
verification and control data verification (if it is a pg_control
file), once the member file contents are received. Although this
decision could be made when the contents are received, it is more
efficient to make it earlier since the member file contents are
received in multiple iterations. In short, we process
ASTREAMER_MEMBER_CONTENTS multiple times but only once for other
ASTREAMER_MEMBER_* cases. We maintain this information in the
astreamer_verify structure for each member file, which is reset when
the file ends.

Unlike in a plain backup, checksum verification here occurs in two
steps. First, as the contents are received, the checksum is computed
incrementally (see member_compute_checksum). Then, at the end of
processing the member file, the final verification is performed (see
member_verify_checksum).

Similarly, during the content receiving stage, if the file is
pg_control, the data will be copied into a local buffer (see
member_copy_control_data).  The verification will then be carried out
at the end of the member file processing (see member_verify_control_data)
---
 src/bin/pg_verifybackup/Makefile           |   2 +
 src/bin/pg_verifybackup/astreamer_verify.c | 365 +++++++++++++++++++++
 src/bin/pg_verifybackup/meson.build        |   1 +
 src/bin/pg_verifybackup/pg_verifybackup.c  | 287 +++++++++++++++-
 src/bin/pg_verifybackup/pg_verifybackup.h  |   6 +
 src/tools/pgindent/typedefs.list           |   2 +
 6 files changed, 658 insertions(+), 5 deletions(-)
 create mode 100644 src/bin/pg_verifybackup/astreamer_verify.c

diff --git a/src/bin/pg_verifybackup/Makefile b/src/bin/pg_verifybackup/Makefile
index 7c045f142e8..374d4a8afd1 100644
--- a/src/bin/pg_verifybackup/Makefile
+++ b/src/bin/pg_verifybackup/Makefile
@@ -17,10 +17,12 @@ top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
 # We need libpq only because fe_utils does.
+override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
 
 OBJS = \
 	$(WIN32RES) \
+	astreamer_verify.o \
 	pg_verifybackup.o
 
 all: pg_verifybackup
diff --git a/src/bin/pg_verifybackup/astreamer_verify.c b/src/bin/pg_verifybackup/astreamer_verify.c
new file mode 100644
index 00000000000..f08868170b4
--- /dev/null
+++ b/src/bin/pg_verifybackup/astreamer_verify.c
@@ -0,0 +1,365 @@
+/*-------------------------------------------------------------------------
+ *
+ * astreamer_verify.c
+ *
+ * Extend fe_utils/astreamer.h archive streaming facility to verify TAR
+ * format backup.
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ *
+ * src/bin/pg_verifybackup/astreamer_verify.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include "pg_verifybackup.h"
+
+typedef struct astreamer_verify
+{
+	astreamer	base;
+	verifier_context *context;
+	char	   *archive_name;
+	Oid			tblspc_oid;
+	pg_checksum_context *checksum_ctx;
+
+	/* Hold information for a member file verification */
+	manifest_file *mfile;
+	int64		received_bytes;
+	bool		verify_checksum;
+	bool		verify_control_data;
+} astreamer_verify;
+
+static void astreamer_verify_content(astreamer *streamer,
+									 astreamer_member *member,
+									 const char *data, int len,
+									 astreamer_archive_context context);
+static void astreamer_verify_finalize(astreamer *streamer);
+static void astreamer_verify_free(astreamer *streamer);
+
+static void member_verify_header(astreamer *streamer, astreamer_member *member);
+static void member_compute_checksum(astreamer *streamer,
+									astreamer_member *member,
+									const char *data, int len);
+static void member_verify_checksum(astreamer *streamer);
+static void member_copy_control_data(astreamer *streamer,
+									 astreamer_member *member,
+									 const char *data, int len);
+static void member_verify_control_data(astreamer *streamer);
+static void member_reset_info(astreamer *streamer);
+
+static const astreamer_ops astreamer_verify_ops = {
+	.content = astreamer_verify_content,
+	.finalize = astreamer_verify_finalize,
+	.free = astreamer_verify_free
+};
+
+/*
+ * Create a astreamer that can verifies content of a TAR file.
+ */
+astreamer *
+astreamer_verify_content_new(astreamer *next, verifier_context *context,
+							 char *archive_name, Oid tblspc_oid)
+{
+	astreamer_verify *streamer;
+
+	streamer = palloc0(sizeof(astreamer_verify));
+	*((const astreamer_ops **) &streamer->base.bbs_ops) =
+		&astreamer_verify_ops;
+
+	streamer->base.bbs_next = next;
+	streamer->context = context;
+	streamer->archive_name = archive_name;
+	streamer->tblspc_oid = tblspc_oid;
+	initStringInfo(&streamer->base.bbs_buffer);
+
+	if (!context->skip_checksums)
+		streamer->checksum_ctx = pg_malloc(sizeof(pg_checksum_context));
+
+	return &streamer->base;
+}
+
+/*
+ * The main entry point of the archive streamer for verifying tar members.
+ */
+static void
+astreamer_verify_content(astreamer *streamer, astreamer_member *member,
+						 const char *data, int len,
+						 astreamer_archive_context context)
+{
+	astreamer_verify *mystreamer = (astreamer_verify *) streamer;
+
+	Assert(context != ASTREAMER_UNKNOWN);
+
+	switch (context)
+	{
+		case ASTREAMER_MEMBER_HEADER:
+
+			/*
+			 * Perform the initial check and setup verification steps.
+			 */
+			member_verify_header(streamer, member);
+			break;
+
+		case ASTREAMER_MEMBER_CONTENTS:
+
+			/*
+			 * Since we are receiving the member content in chunks, it must be
+			 * processed according to the flags set by the member header
+			 * processing routine. This includes performing incremental
+			 * checksum computations and copying control data to the local
+			 * buffer.
+			 */
+			if (mystreamer->verify_checksum)
+				member_compute_checksum(streamer, member, data, len);
+
+			if (mystreamer->verify_control_data)
+				member_copy_control_data(streamer, member, data, len);
+			break;
+
+		case ASTREAMER_MEMBER_TRAILER:
+
+			/*
+			 * We have reached the end of the member file. By this point, we
+			 * should have successfully computed the checksum of the received
+			 * content and copied the entire pg_control file data into our
+			 * local buffer. We can now proceed with the final verification.
+			 */
+			if (mystreamer->verify_checksum)
+				member_verify_checksum(streamer);
+
+			if (mystreamer->verify_control_data)
+				member_verify_control_data(streamer);
+
+			/*
+			 * Reset the temporary information stored for the verification.
+			 */
+			member_reset_info(streamer);
+			break;
+
+		case ASTREAMER_ARCHIVE_TRAILER:
+			break;
+
+		default:
+			/* Shouldn't happen. */
+			pg_fatal("unexpected state while parsing tar file");
+	}
+}
+
+/*
+ * End-of-stream processing for a astreamer_verify stream.
+ */
+static void
+astreamer_verify_finalize(astreamer *streamer)
+{
+	Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with a astreamer_verify stream.
+ */
+static void
+astreamer_verify_free(astreamer *streamer)
+{
+	astreamer_verify *mystreamer = (astreamer_verify *) streamer;
+
+	if (mystreamer->checksum_ctx)
+		pfree(mystreamer->checksum_ctx);
+
+	pfree(streamer->bbs_buffer.data);
+	pfree(streamer);
+}
+
+/*
+ * Verifies whether the tar member entry exists in the backup manifest.
+ *
+ * If the archive being processed is a tablespace, it prepares the necessary
+ * file path first. If a valid entry is found in the backup manifest, it then
+ * determines whether checksum and control data verification should be
+ * performed during file content processing.
+ */
+static void
+member_verify_header(astreamer *streamer, astreamer_member *member)
+{
+	astreamer_verify *mystreamer = (astreamer_verify *) streamer;
+	manifest_file *m;
+
+	/* We are only interested in files that are not in the ignore list. */
+	if (member->is_directory || member->is_link ||
+		should_ignore_relpath(mystreamer->context, member->pathname))
+		return;
+
+	/*
+	 * The backup manifest stores a relative path to the base directory for
+	 * files belonging to a tablespace, while the tablespace backup tar
+	 * archive does not include this path. Ensure the required path is
+	 * prepared; otherwise, the manifest entry verification will fail.
+	 */
+	if (OidIsValid(mystreamer->tblspc_oid))
+	{
+		char		temp[MAXPGPATH];
+
+		/* Copy original name at temporary space */
+		memcpy(temp, member->pathname, MAXPGPATH);
+
+		snprintf(member->pathname, MAXPGPATH, "%s/%d/%s",
+				 "pg_tblspc", mystreamer->tblspc_oid, temp);
+	}
+
+	/* Check the manifest entry */
+	m = verify_manifest_entry(mystreamer->context, member->pathname,
+							  member->size);
+	mystreamer->mfile = (void *) m;
+
+	/*
+	 * Prepare for checksum and control data verification.
+	 *
+	 * We could have these checks while receiving contents. However, since
+	 * contents are received in multiple iterations, this would result in
+	 * these lengthy checks being performed multiple times. Instead, setting
+	 * flags here and using them before proceeding with verification will be
+	 * more efficient.
+	 */
+	mystreamer->verify_checksum =
+		(!mystreamer->context->skip_checksums && should_verify_checksum(m));
+	mystreamer->verify_control_data =
+		should_verify_control_data(mystreamer->context->manifest, m);
+
+	/* Initialize the context required for checksum verification. */
+	if (mystreamer->verify_checksum &&
+		pg_checksum_init(mystreamer->checksum_ctx, m->checksum_type) < 0)
+	{
+		report_backup_error(mystreamer->context,
+							"%s: could not initialize checksum of file \"%s\"",
+							mystreamer->archive_name, m->pathname);
+
+		/*
+		 * Checksum verification cannot be performed without proper context
+		 * initialization.
+		 */
+		mystreamer->verify_checksum = false;
+	}
+}
+
+/*
+ * Computes the checksum incrementally for the received file content.
+ *
+ * Should have a correctly initialized checksum_ctx, which will be used for
+ * incremental checksum computation.
+ */
+static void
+member_compute_checksum(astreamer *streamer, astreamer_member *member,
+						const char *data, int len)
+{
+	astreamer_verify *mystreamer = (astreamer_verify *) streamer;
+	pg_checksum_context *checksum_ctx = mystreamer->checksum_ctx;
+	manifest_file *m = mystreamer->mfile;
+
+	Assert(mystreamer->verify_checksum);
+
+	/* Should have came for the right file */
+	Assert(strcmp(member->pathname, m->pathname) == 0);
+
+	/*
+	 * The checksum context should match the type noted in the backup
+	 * manifest.
+	 */
+	Assert(checksum_ctx->type == m->checksum_type);
+
+	/*
+	 * Update the total count of computed checksum bytes for cross-checking
+	 * with the file size in the final verification stage.
+	 */
+	mystreamer->received_bytes += len;
+
+	if (pg_checksum_update(checksum_ctx, (uint8 *) data, len) < 0)
+	{
+		report_backup_error(mystreamer->context,
+							"could not update checksum of file \"%s\"",
+							m->pathname);
+		mystreamer->verify_checksum = false;
+	}
+}
+
+/*
+ * Perform the final computation and checksum verification after the entire
+ * file content has been processed.
+ */
+static void
+member_verify_checksum(astreamer *streamer)
+{
+	astreamer_verify *mystreamer = (astreamer_verify *) streamer;
+
+	Assert(mystreamer->verify_checksum);
+
+	verify_checksum(mystreamer->context, mystreamer->mfile,
+					mystreamer->checksum_ctx, mystreamer->received_bytes);
+}
+
+/*
+ * Stores the pg_control file contents into a local buffer; we need the entire
+ * control file data for verification.
+ */
+static void
+member_copy_control_data(astreamer *streamer, astreamer_member *member,
+						 const char *data, int len)
+{
+	/* Should be here only for control file */
+	Assert(strcmp(member->pathname, "global/pg_control") == 0);
+	Assert(((astreamer_verify *) streamer)->verify_control_data);
+
+	/* Copy enough control file data needed for verification. */
+	astreamer_buffer_until(streamer, &data, &len, sizeof(ControlFileData));
+}
+
+/*
+ * Performs the CRC calculation of pg_control data and then calls the routines
+ * that execute the final verification of the control file information.
+ */
+static void
+member_verify_control_data(astreamer *streamer)
+{
+	astreamer_verify *mystreamer = (astreamer_verify *) streamer;
+	manifest_data *manifest = mystreamer->context->manifest;
+	ControlFileData control_file;
+	pg_crc32c	crc;
+	bool		crc_ok;
+
+	/* Should be here only for control file */
+	Assert(strcmp(mystreamer->mfile->pathname, "global/pg_control") == 0);
+	Assert(mystreamer->verify_control_data);
+
+	/* Should have enough control file data needed for verification. */
+	if (streamer->bbs_buffer.len != sizeof(ControlFileData))
+		report_fatal_error("%s: unexpected control file size: %d, should be %zu",
+						   mystreamer->archive_name, streamer->bbs_buffer.len,
+						   sizeof(ControlFileData));
+
+	memcpy(&control_file, streamer->bbs_buffer.data, sizeof(ControlFileData));
+
+	/* Check the CRC. */
+	INIT_CRC32C(crc);
+	COMP_CRC32C(crc, (char *) (&control_file), offsetof(ControlFileData, crc));
+	FIN_CRC32C(crc);
+
+	crc_ok = EQ_CRC32C(crc, control_file.crc);
+
+	/* Do the final control data verification. */
+	verify_control_data(&control_file, mystreamer->mfile->pathname, crc_ok,
+						manifest->system_identifier);
+}
+
+/*
+ * Reset flags and free memory allocations for member file verification.
+ */
+static void
+member_reset_info(astreamer *streamer)
+{
+	astreamer_verify *mystreamer = (astreamer_verify *) streamer;
+
+	mystreamer->mfile = NULL;
+	mystreamer->received_bytes = 0;
+	mystreamer->verify_checksum = false;
+	mystreamer->verify_control_data = false;
+}
diff --git a/src/bin/pg_verifybackup/meson.build b/src/bin/pg_verifybackup/meson.build
index 7c7d31a0350..0e09d1379d1 100644
--- a/src/bin/pg_verifybackup/meson.build
+++ b/src/bin/pg_verifybackup/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 pg_verifybackup_sources = files(
+  'astreamer_verify.c',
   'pg_verifybackup.c'
 )
 
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index c186678a592..80b1a639803 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -22,6 +22,7 @@
 #include "common/parse_manifest.h"
 #include "fe_utils/simple_list.h"
 #include "getopt_long.h"
+#include "limits.h"
 #include "pg_verifybackup.h"
 #include "pgtime.h"
 
@@ -44,6 +45,16 @@
  */
 #define READ_CHUNK_SIZE				(128 * 1024)
 
+/*
+ * Tar file information needed for content verification.
+ */
+typedef struct tar_file
+{
+	char	   *relpath;
+	Oid			tblspc_oid;
+	pg_compress_algorithm compress_algorithm;
+} tar_file;
+
 static manifest_data *parse_manifest_file(char *manifest_path);
 static void verifybackup_version_cb(JsonManifestParseContext *context,
 									int manifest_version);
@@ -65,8 +76,14 @@ static void report_manifest_error(JsonManifestParseContext *context,
 static char find_backup_format(verifier_context *context);
 static void verify_backup_directory(verifier_context *context,
 									char *relpath, char *fullpath);
-static void verify_plain_backup_file(verifier_context *context,
-									 char *relpath, char *fullpath);
+static void verify_plain_backup_file(verifier_context *context, char *relpath,
+									 char *fullpath);
+static void verify_tar_backup_file(verifier_context *context, char *relpath,
+								   char *fullpath, SimplePtrList *tarfiles);
+static void verify_tar_file_contents(verifier_context *context,
+									 SimplePtrList *tarfiles);
+static void verify_tar_contents(verifier_context *context, char *relpath,
+								char *fullpath, astreamer *streamer);
 static void report_extra_backup_files(verifier_context *context);
 static void verify_backup_checksums(verifier_context *context);
 static void verify_file_checksum(verifier_context *context,
@@ -75,6 +92,10 @@ static void verify_file_checksum(verifier_context *context,
 static void parse_required_wal(verifier_context *context,
 							   char *pg_waldump_path,
 							   char *wal_directory);
+static astreamer *create_archive_verifier(verifier_context *context,
+										  char *archive_name,
+										  Oid tblspc_oid,
+										  pg_compress_algorithm compress_algo);
 
 static void progress_report(bool finished);
 static void usage(void);
@@ -243,11 +264,11 @@ main(int argc, char **argv)
 
 		/*
 		 * XXX: In the future, we should consider enhancing pg_waldump to read
-		 * WAL files from the tar archive.
+		 * WAL files from the tar file.
 		 */
 		if (context.format != 'p')
 		{
-			pg_log_error("pg_waldump cannot read from a tar archive");
+			pg_log_error("pg_waldump cannot read from a tar");
 			pg_log_error_hint("You must use -n or --no-parse-wal when verifying a tar-format backup.");
 			exit(1);
 		}
@@ -561,6 +582,7 @@ verify_backup_directory(verifier_context *context, char *relpath,
 {
 	DIR		   *dir;
 	struct dirent *dirent;
+	SimplePtrList tarfiles = {NULL, NULL};
 
 	dir = opendir(fullpath);
 	if (dir == NULL)
@@ -600,12 +622,23 @@ verify_backup_directory(verifier_context *context, char *relpath,
 			newrelpath = psprintf("%s/%s", relpath, filename);
 
 		if (!should_ignore_relpath(context, newrelpath))
-			verify_plain_backup_file(context, newrelpath, newfullpath);
+		{
+			if (context->format == 'p')
+				verify_plain_backup_file(context, newrelpath, newfullpath);
+			else
+				verify_tar_backup_file(context, newrelpath, newfullpath,
+									   &tarfiles);
+		}
 
 		pfree(newfullpath);
 		pfree(newrelpath);
 	}
 
+	/* Perform the final verification of the tar contents, if any. */
+	Assert(tarfiles.head == NULL || context->format == 't');
+	if (tarfiles.head != NULL)
+		verify_tar_file_contents(context, &tarfiles);
+
 	if (closedir(dir))
 	{
 		report_backup_error(context,
@@ -682,6 +715,215 @@ verify_plain_backup_file(verifier_context *context, char *relpath,
 		total_size += m->size;
 }
 
+/*
+ * Verify one tar file.
+ *
+ * This function does not perform a complete verification; it only carries out
+ * basic validation of the tar format backup file, detects the compression
+ * type, and appends that information to the tarfiles list. An error will be
+ * reported if the tar file is inaccessible, or if the file type, name, or
+ * compression type is not as expected.
+ *
+ * The arguments to this function are mostly the same as the
+ * verify_plain_backup_file. The additional argument outputs a list of valid
+ * tar files.
+ */
+static void
+verify_tar_backup_file(verifier_context *context, char *relpath,
+					   char *fullpath, SimplePtrList *tarfiles)
+{
+	struct stat sb;
+	Oid			tblspc_oid = InvalidOid;
+	pg_compress_algorithm compress_algorithm;
+	tar_file   *tar;
+	char	   *suffix = NULL;
+
+	/* Should be tar format backup */
+	Assert(context->format == 't');
+
+	/* Get file information */
+	if (stat(fullpath, &sb) != 0)
+	{
+		report_backup_error(context,
+							"could not stat file or directory \"%s\": %m",
+							relpath);
+		return;
+	}
+
+	/* In a tar format backup, we expect only plain files. */
+	if (!S_ISREG(sb.st_mode))
+	{
+		report_backup_error(context,
+							"\"%s\" is not a plain file",
+							relpath);
+		return;
+	}
+
+	/*
+	 * We expect tar files for backing up the main directory, tablespace, and
+	 * pg_wal directory.
+	 *
+	 * pg_basebackup writes the main data directory to an archive file named
+	 * base.tar, the pg_wal directory to pg_wal.tar, and the tablespace
+	 * directory to <tablespaceoid>.tar, each followed by a compression type
+	 * extension such as .gz, .lz4, or .zst.
+	 */
+	if (strncmp("base", relpath, 4) == 0)
+		suffix = relpath + 4;
+	else if (strncmp("pg_wal", relpath, 6) == 0)
+		suffix = relpath + 6;
+	else
+	{
+		/* Expected a <tablespaceoid>.tar file here. */
+		uint64		num = strtoul(relpath, &suffix, 10);
+
+		/*
+		 * Report an error if we didn't consume at least one character, if the
+		 * result is 0, or if the value is too large to be a valid OID.
+		 */
+		if (suffix == NULL || num <= 0 || num > OID_MAX)
+			report_backup_error(context,
+								"\"%s\" unexpected file in the tar format backup",
+								relpath);
+		tblspc_oid = (Oid) num;
+	}
+
+	/* Now, check the compression type of the tar */
+	if (strcmp(suffix, ".tar") == 0)
+		compress_algorithm = PG_COMPRESSION_NONE;
+	else if (strcmp(suffix, ".tgz") == 0)
+		compress_algorithm = PG_COMPRESSION_GZIP;
+	else if (strcmp(suffix, ".tar.gz") == 0)
+		compress_algorithm = PG_COMPRESSION_GZIP;
+	else if (strcmp(suffix, ".tar.lz4") == 0)
+		compress_algorithm = PG_COMPRESSION_LZ4;
+	else if (strcmp(suffix, ".tar.zst") == 0)
+		compress_algorithm = PG_COMPRESSION_ZSTD;
+	else
+	{
+		report_backup_error(context,
+							"\"%s\" unexpected file in the tar format backup",
+							relpath);
+		return;
+	}
+
+	/*
+	 * Ignore WALs, as reading and verification will be handled through
+	 * pg_waldump.
+	 */
+	if (strncmp("pg_wal", relpath, 6) == 0)
+		return;
+
+	/*
+	 * Append the information to the list for complete verification at a later
+	 * stage.
+	 */
+	tar = pg_malloc(sizeof(tar_file));
+	tar->relpath = pstrdup(relpath);
+	tar->tblspc_oid = tblspc_oid;
+	tar->compress_algorithm = compress_algorithm;
+
+	simple_ptr_list_append(tarfiles, tar);
+
+	/* Update statistics for progress report, if necessary */
+	if (show_progress)
+		total_size += sb.st_size;
+}
+
+/*
+ * This is the final part of tar file verification, which prepares the
+ * archive streamer stack according to the tar compression format for
+ * each tar file and invokes them for reading, decompressing, and ultimately
+ * verifying the contents.
+ *
+ * The arguments to this function should be a list of valid tar files to
+ * verify, and the allocation will be freed once the verification is complete.
+ */
+static void
+verify_tar_file_contents(verifier_context *context, SimplePtrList *tarfiles)
+{
+	SimplePtrListCell *cell;
+
+	progress_report(false);
+
+	for (cell = tarfiles->head; cell != NULL; cell = cell->next)
+	{
+		tar_file   *tar = (tar_file *) cell->ptr;
+		astreamer  *streamer;
+		char	   *fullpath;
+
+		/* Prepare archive streamer stack */
+		streamer = create_archive_verifier(context,
+										   tar->relpath,
+										   tar->tblspc_oid,
+										   tar->compress_algorithm);
+
+		/* Compute the full pathname to the target file. */
+		fullpath = psprintf("%s/%s", context->backup_directory,
+							tar->relpath);
+
+		/* Invoke the streamer for reading, decompressing, and verifying. */
+		verify_tar_contents(context, tar->relpath, fullpath, streamer);
+
+		/* Cleanup. */
+		pfree(tar->relpath);
+		pfree(tar);
+		pfree(fullpath);
+
+		astreamer_finalize(streamer);
+		astreamer_free(streamer);
+	}
+	simple_ptr_list_destroy(tarfiles);
+
+	progress_report(true);
+}
+
+/*
+ * Performs the actual work for tar content verification. It reads a given tar
+ * archive in predefined chunks and passes it to the streamer, which initiates
+ * routines for decompression (if necessary) and then verifies each member
+ * within the tar file.
+ */
+static void
+verify_tar_contents(verifier_context *context, char *relpath, char *fullpath,
+					astreamer *streamer)
+{
+	int			fd;
+	int			rc;
+	char	   *buffer;
+
+	pg_log_debug("reading \"%s\"", fullpath);
+
+	/* Open the target file. */
+	if ((fd = open(fullpath, O_RDONLY | PG_BINARY, 0)) < 0)
+	{
+		report_backup_error(context, "could not open file \"%s\": %m",
+							relpath);
+		return;
+	}
+
+	buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+	/* Perform the reads */
+	while ((rc = read(fd, buffer, READ_CHUNK_SIZE)) > 0)
+	{
+		astreamer_content(streamer, NULL, buffer, rc, ASTREAMER_UNKNOWN);
+
+		/* Report progress */
+		done_size += rc;
+		progress_report(false);
+	}
+
+	if (rc < 0)
+		report_backup_error(context, "could not read file \"%s\": %m",
+							relpath);
+
+	/* Close the file. */
+	if (close(fd) != 0)
+		report_backup_error(context, "could not close file \"%s\": %m",
+							relpath);
+}
+
 /*
  * Verify file and its size entry in the manifest.
  */
@@ -1049,6 +1291,41 @@ find_backup_format(verifier_context *context)
 	return result;
 }
 
+/*
+ * Identifies the necessary steps for verifying the contents of the
+ * provided tar file.
+ */
+static astreamer *
+create_archive_verifier(verifier_context *context, char *archive_name,
+						Oid tblspc_oid, pg_compress_algorithm compress_algo)
+{
+	astreamer  *streamer = NULL;
+
+	/* Should be here only for tar backup */
+	Assert(context->format == 't');
+
+	/*
+	 * To verify the contents of the tar, the initial step is to parse its
+	 * content.
+	 */
+	streamer = astreamer_verify_content_new(streamer, context, archive_name,
+											tblspc_oid);
+	streamer = astreamer_tar_parser_new(streamer);
+
+	/*
+	 * If the tar is compressed, we must perform the appropriate decompression
+	 * operation before proceeding with the verification of its contents.
+	 */
+	if (compress_algo == PG_COMPRESSION_GZIP)
+		streamer = astreamer_gzip_decompressor_new(streamer);
+	else if (compress_algo == PG_COMPRESSION_LZ4)
+		streamer = astreamer_lz4_decompressor_new(streamer);
+	else if (compress_algo == PG_COMPRESSION_ZSTD)
+		streamer = astreamer_zstd_decompressor_new(streamer);
+
+	return streamer;
+}
+
 /*
  * Print a progress report based on the global variables.
  *
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.h b/src/bin/pg_verifybackup/pg_verifybackup.h
index 80031ad4dbc..be7438af346 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.h
+++ b/src/bin/pg_verifybackup/pg_verifybackup.h
@@ -18,6 +18,7 @@
 #include "common/hashfn_unstable.h"
 #include "common/logging.h"
 #include "common/parse_manifest.h"
+#include "fe_utils/astreamer.h"
 #include "fe_utils/simple_list.h"
 
 /*
@@ -123,4 +124,9 @@ extern void report_fatal_error(const char *pg_restrict fmt,...)
 extern bool should_ignore_relpath(verifier_context *context,
 								  const char *relpath);
 
+extern astreamer *astreamer_verify_content_new(astreamer *next,
+											   verifier_context *context,
+											   char *archive_name,
+											   Oid tblspc_oid);
+
 #endif							/* PG_VERIFYBACKUP_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 6d424c89186..48f37131e6a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3330,6 +3330,7 @@ astreamer_plain_writer
 astreamer_recovery_injector
 astreamer_tar_archiver
 astreamer_tar_parser
+astreamer_verify
 astreamer_zstd_frame
 bgworker_main_type
 bh_node_type
@@ -3951,6 +3952,7 @@ substitute_phv_relids_context
 subxids_array_status
 symbol
 tablespaceinfo
+tar_file
 td_entry
 teSection
 temp_tablespaces_extra
-- 
2.18.0

From 00fc4d6018fb6f8c3826bf18144d578fd51a66e8 Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Wed, 21 Aug 2024 16:04:37 +0530
Subject: [PATCH v12 12/12] pg_verifybackup: Tests and document

----
NOTE: This patch is not meant to be committed separately. It should
be squashed with the previous patch that implements tar format support.
----
---
 doc/src/sgml/ref/pg_verifybackup.sgml         | 43 ++++++++++-
 src/bin/pg_verifybackup/pg_verifybackup.c     |  6 +-
 src/bin/pg_verifybackup/t/001_basic.pl        |  6 +-
 src/bin/pg_verifybackup/t/004_options.pl      |  2 +-
 src/bin/pg_verifybackup/t/005_bad_manifest.pl |  2 +-
 src/bin/pg_verifybackup/t/008_untar.pl        | 73 ++++++-------------
 src/bin/pg_verifybackup/t/010_client_untar.pl | 48 +-----------
 7 files changed, 76 insertions(+), 104 deletions(-)

diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index a3f167f9f6e..ea6bc3ccb12 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -34,8 +34,10 @@ PostgreSQL documentation
    integrity of a database cluster backup taken using
    <command>pg_basebackup</command> against a
    <literal>backup_manifest</literal> generated by the server at the time
-   of the backup.  The backup must be stored in the "plain"
-   format; a "tar" format backup can be checked after extracting it.
+   of the backup.  The backup must be stored in the "plain" or "tar"
+   format.  Verification is supported for <literal>gzip</literal>,
+   <literal>lz4</literal>, and  <literal>zstd</literal> compressed tar backup;
+   any other compressed format backups can be checked after decompressing them.
   </para>
 
   <para>
@@ -168,6 +170,43 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-F <replaceable class="parameter">format</replaceable></option></term>
+      <term><option>--format=<replaceable class="parameter">format</replaceable></option></term>
+      <listitem>
+       <para>
+        Specifies the format of the backup. <replaceable>format</replaceable>
+        can be one of the following:
+
+        <variablelist>
+         <varlistentry>
+          <term><literal>p</literal></term>
+          <term><literal>plain</literal></term>
+          <listitem>
+           <para>
+            Backup consists of plain files with the same layout as the
+            source server's data directory and tablespaces.
+           </para>
+          </listitem>
+         </varlistentry>
+
+         <varlistentry>
+          <term><literal>t</literal></term>
+          <term><literal>tar</literal></term>
+          <listitem>
+           <para>
+            Backup consists of tar files.  A valid backup includes the main data
+            directory in a file named <filename>base.tar</filename>, the WAL
+            files in <filename>pg_wal.tar</filename>, and separate tar files for
+            each tablespace, named after the tablespace's OID, followed by the
+            compression extension.
+           </para>
+           </listitem>
+         </varlistentry>
+        </variablelist></para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-n</option></term>
       <term><option>--no-parse-wal</option></term>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 80b1a639803..8302f022b22 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -880,9 +880,9 @@ verify_tar_file_contents(verifier_context *context, SimplePtrList *tarfiles)
 
 /*
  * Performs the actual work for tar content verification. It reads a given tar
- * archive in predefined chunks and passes it to the streamer, which initiates
- * routines for decompression (if necessary) and then verifies each member
- * within the tar file.
+ * file in predefined chunks and passes it to the archive streamer, which
+ * initiates routines for decompression (if necessary) and then verifies each
+ * member within the tar file.
  */
 static void
 verify_tar_contents(verifier_context *context, char *relpath, char *fullpath,
diff --git a/src/bin/pg_verifybackup/t/001_basic.pl b/src/bin/pg_verifybackup/t/001_basic.pl
index 2f3e52d296f..ca5b0402b7d 100644
--- a/src/bin/pg_verifybackup/t/001_basic.pl
+++ b/src/bin/pg_verifybackup/t/001_basic.pl
@@ -17,11 +17,11 @@ command_fails_like(
 	qr/no backup directory specified/,
 	'target directory must be specified');
 command_fails_like(
-	[ 'pg_verifybackup', $tempdir ],
+	[ 'pg_verifybackup', '-Fp', $tempdir ],
 	qr/could not open file.*\/backup_manifest\"/,
 	'pg_verifybackup requires a manifest');
 command_fails_like(
-	[ 'pg_verifybackup', $tempdir, $tempdir ],
+	[ 'pg_verifybackup', '-Fp', $tempdir, $tempdir ],
 	qr/too many command-line arguments/,
 	'multiple target directories not allowed');
 
@@ -31,7 +31,7 @@ close($fh);
 
 # but then try to use an alternate, nonexisting manifest
 command_fails_like(
-	[ 'pg_verifybackup', '-m', "$tempdir/not_the_manifest", $tempdir ],
+	[ 'pg_verifybackup', '-Fp', '-m', "$tempdir/not_the_manifest", $tempdir ],
 	qr/could not open file.*\/not_the_manifest\"/,
 	'pg_verifybackup respects -m flag');
 
diff --git a/src/bin/pg_verifybackup/t/004_options.pl b/src/bin/pg_verifybackup/t/004_options.pl
index 8ed2214408e..2f197648740 100644
--- a/src/bin/pg_verifybackup/t/004_options.pl
+++ b/src/bin/pg_verifybackup/t/004_options.pl
@@ -108,7 +108,7 @@ unlike(
 # Test valid manifest with nonexistent backup directory.
 command_fails_like(
 	[
-		'pg_verifybackup', '-m',
+		'pg_verifybackup', '-Fp', '-m',
 		"$backup_path/backup_manifest", "$backup_path/fake"
 	],
 	qr/could not open directory/,
diff --git a/src/bin/pg_verifybackup/t/005_bad_manifest.pl b/src/bin/pg_verifybackup/t/005_bad_manifest.pl
index c4ed64b62d5..28c51b6feb0 100644
--- a/src/bin/pg_verifybackup/t/005_bad_manifest.pl
+++ b/src/bin/pg_verifybackup/t/005_bad_manifest.pl
@@ -208,7 +208,7 @@ sub test_bad_manifest
 	print $fh $manifest_contents;
 	close($fh);
 
-	command_fails_like([ 'pg_verifybackup', $tempdir ], $regexp, $test_name);
+	command_fails_like([ 'pg_verifybackup', '-Fp', $tempdir ], $regexp, $test_name);
 	return;
 }
 
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index 7a09f3b75b2..1c83f38d5b5 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -16,6 +16,20 @@ my $primary = PostgreSQL::Test::Cluster->new('primary');
 $primary->init(allows_streaming => 1);
 $primary->start;
 
+# Create a tablespace directory.
+my $TS1_LOCATION = $primary->backup_dir .'/ts1';
+$TS1_LOCATION =~ s/\/\.\//\//g;    # collapse foo/./bar to foo/bar
+mkdir($TS1_LOCATION);
+
+# Create a tablespace with table in it.
+$primary->safe_psql('postgres', qq(
+		CREATE TABLESPACE regress_ts1 LOCATION '$TS1_LOCATION';
+		SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1';
+		CREATE TABLE regress_tbl1(i int) TABLESPACE regress_ts1;
+		INSERT INTO regress_tbl1 VALUES(generate_series(1,5));));
+my $tsoid = $primary->safe_psql('postgres', qq(
+		SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
+
 my $backup_path = $primary->backup_dir . '/server-backup';
 my $extract_path = $primary->backup_dir . '/extracted-backup';
 
@@ -23,39 +37,31 @@ my @test_configuration = (
 	{
 		'compression_method' => 'none',
 		'backup_flags' => [],
-		'backup_archive' => 'base.tar',
+		'backup_archive' => ['base.tar', "$tsoid.tar"],
 		'enabled' => 1
 	},
 	{
 		'compression_method' => 'gzip',
 		'backup_flags' => [ '--compress', 'server-gzip' ],
-		'backup_archive' => 'base.tar.gz',
-		'decompress_program' => $ENV{'GZIP_PROGRAM'},
-		'decompress_flags' => ['-d'],
+		'backup_archive' => [ 'base.tar.gz', "$tsoid.tar.gz" ],
 		'enabled' => check_pg_config("#define HAVE_LIBZ 1")
 	},
 	{
 		'compression_method' => 'lz4',
 		'backup_flags' => [ '--compress', 'server-lz4' ],
-		'backup_archive' => 'base.tar.lz4',
-		'decompress_program' => $ENV{'LZ4'},
-		'decompress_flags' => [ '-d', '-m' ],
+		'backup_archive' => ['base.tar.lz4', "$tsoid.tar.lz4" ],
 		'enabled' => check_pg_config("#define USE_LZ4 1")
 	},
 	{
 		'compression_method' => 'zstd',
 		'backup_flags' => [ '--compress', 'server-zstd' ],
-		'backup_archive' => 'base.tar.zst',
-		'decompress_program' => $ENV{'ZSTD'},
-		'decompress_flags' => ['-d'],
+		'backup_archive' => [ 'base.tar.zst', "$tsoid.tar.zst" ],
 		'enabled' => check_pg_config("#define USE_ZSTD 1")
 	},
 	{
 		'compression_method' => 'zstd',
 		'backup_flags' => [ '--compress', 'server-zstd:level=1,long' ],
-		'backup_archive' => 'base.tar.zst',
-		'decompress_program' => $ENV{'ZSTD'},
-		'decompress_flags' => ['-d'],
+		'backup_archive' => [ 'base.tar.zst', "$tsoid.tar.zst" ],
 		'enabled' => check_pg_config("#define USE_ZSTD 1")
 	});
 
@@ -86,47 +92,16 @@ for my $tc (@test_configuration)
 		my $backup_files = join(',',
 			sort grep { $_ ne '.' && $_ ne '..' } slurp_dir($backup_path));
 		my $expected_backup_files =
-		  join(',', sort ('backup_manifest', $tc->{'backup_archive'}));
+		  join(',', sort ('backup_manifest', @{ $tc->{'backup_archive'} }));
 		is($backup_files, $expected_backup_files,
 			"found expected backup files, compression $method");
 
-		# Decompress.
-		if (exists $tc->{'decompress_program'})
-		{
-			my @decompress = ($tc->{'decompress_program'});
-			push @decompress, @{ $tc->{'decompress_flags'} }
-			  if $tc->{'decompress_flags'};
-			push @decompress, $backup_path . '/' . $tc->{'backup_archive'};
-			system_or_bail(@decompress);
-		}
-
-	  SKIP:
-		{
-			my $tar = $ENV{TAR};
-			# don't check for a working tar here, to accommodate various odd
-			# cases. If tar doesn't work the init_from_backup below will fail.
-			skip "no tar program available", 1
-			  if (!defined $tar || $tar eq '');
-
-			# Untar.
-			mkdir($extract_path);
-			system_or_bail($tar, 'xf', $backup_path . '/base.tar',
-				'-C', $extract_path);
-
-			# Verify.
-			$primary->command_ok(
-				[
-					'pg_verifybackup', '-n',
-					'-m', "$backup_path/backup_manifest",
-					'-e', $extract_path
-				],
-				"verify backup, compression $method");
-		}
+		# Verify tar backup.
+		$primary->command_ok(['pg_verifybackup', '-n', '-e', $backup_path],
+			"verify backup, compression $method");
 
 		# Cleanup.
-		unlink($backup_path . '/backup_manifest');
-		unlink($backup_path . '/base.tar');
-		unlink($backup_path . '/' . $tc->{'backup_archive'});
+		rmtree($backup_path);
 		rmtree($extract_path);
 	}
 }
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 8c076d46dee..6b7d7483f6e 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -29,41 +29,30 @@ my @test_configuration = (
 		'compression_method' => 'gzip',
 		'backup_flags' => [ '--compress', 'client-gzip:5' ],
 		'backup_archive' => 'base.tar.gz',
-		'decompress_program' => $ENV{'GZIP_PROGRAM'},
-		'decompress_flags' => ['-d'],
 		'enabled' => check_pg_config("#define HAVE_LIBZ 1")
 	},
 	{
 		'compression_method' => 'lz4',
 		'backup_flags' => [ '--compress', 'client-lz4:5' ],
 		'backup_archive' => 'base.tar.lz4',
-		'decompress_program' => $ENV{'LZ4'},
-		'decompress_flags' => ['-d'],
-		'output_file' => 'base.tar',
 		'enabled' => check_pg_config("#define USE_LZ4 1")
 	},
 	{
 		'compression_method' => 'zstd',
 		'backup_flags' => [ '--compress', 'client-zstd:5' ],
 		'backup_archive' => 'base.tar.zst',
-		'decompress_program' => $ENV{'ZSTD'},
-		'decompress_flags' => ['-d'],
 		'enabled' => check_pg_config("#define USE_ZSTD 1")
 	},
 	{
 		'compression_method' => 'zstd',
 		'backup_flags' => [ '--compress', 'client-zstd:level=1,long' ],
 		'backup_archive' => 'base.tar.zst',
-		'decompress_program' => $ENV{'ZSTD'},
-		'decompress_flags' => ['-d'],
 		'enabled' => check_pg_config("#define USE_ZSTD 1")
 	},
 	{
 		'compression_method' => 'parallel zstd',
 		'backup_flags' => [ '--compress', 'client-zstd:workers=3' ],
 		'backup_archive' => 'base.tar.zst',
-		'decompress_program' => $ENV{'ZSTD'},
-		'decompress_flags' => ['-d'],
 		'enabled' => check_pg_config("#define USE_ZSTD 1"),
 		'possibly_unsupported' =>
 		  qr/could not set compression worker count to 3: Unsupported parameter/
@@ -118,40 +107,9 @@ for my $tc (@test_configuration)
 		is($backup_files, $expected_backup_files,
 			"found expected backup files, compression $method");
 
-		# Decompress.
-		if (exists $tc->{'decompress_program'})
-		{
-			my @decompress = ($tc->{'decompress_program'});
-			push @decompress, @{ $tc->{'decompress_flags'} }
-			  if $tc->{'decompress_flags'};
-			push @decompress, $backup_path . '/' . $tc->{'backup_archive'};
-			push @decompress, $backup_path . '/' . $tc->{'output_file'}
-			  if $tc->{'output_file'};
-			system_or_bail(@decompress);
-		}
-
-	  SKIP:
-		{
-			my $tar = $ENV{TAR};
-			# don't check for a working tar here, to accommodate various odd
-			# cases. If tar doesn't work the init_from_backup below will fail.
-			skip "no tar program available", 1
-			  if (!defined $tar || $tar eq '');
-
-			# Untar.
-			mkdir($extract_path);
-			system_or_bail($tar, 'xf', $backup_path . '/base.tar',
-				'-C', $extract_path);
-
-			# Verify.
-			$primary->command_ok(
-				[
-					'pg_verifybackup', '-n',
-					'-m', "$backup_path/backup_manifest",
-					'-e', $extract_path
-				],
-				"verify backup, compression $method");
-		}
+		# Verify tar backup.
+		$primary->command_ok( [ 'pg_verifybackup', '-n', '-e', $backup_path ],
+			"verify backup, compression $method");
 
 		# Cleanup.
 		rmtree($extract_path);
-- 
2.18.0

Reply via email to