Hi,


On Tue, 22 Aug 2023 at 07:32, Michael Paquier <mich...@paquier.xyz> wrote:

>
>
> I don't like much this patch.  While it takes correctly advantage of
> the backward record read logic from SimpleXLogPageRead() able to
> handle correctly timeline jumps, it creates a hidden dependency in the
> code between the hash table from filemap.c and the page callback.
> Wouldn't it be simpler to build a list of the segment names using the
> information from WALOpenSegment and build this list in
> findLastCheckpoint()?


I think the first version of the patch more or less did that. Not
necessarily a list, but a hash table of WAL file names that we want to
keep. But Kyotaro Horiguchi didn't like it and suggested creating entries
in the filemap.c hash table instead.
But, I agree, doing it directly from the findLastCheckpoint() makes the
code easier to understand.



>   Also, I am wondering if we should be smarter
> with any potential conflict handling between the source and the
> target, rather than just enforcing a FILE_ACTION_NONE for all these
> files.  In short, could it be better to copy the WAL file from the
> source if it exists there?
>

Before the switchpoint these files are supposed to be the same on the old
primary, new primary, and also in the archive. Also, if there is a
restore_command postgres will fetch the same file from the archive even if
it already exists in pg_wal, which effectively discards all pg_rewind
efforts on copying WAL files.


>
> +       /*
> +        * Some entries (WAL segments) already have an action assigned
> +        * (see SimpleXLogPageRead()).
> +        */
> +       if (entry->action == FILE_ACTION_UNDECIDED)
> +           entry->action = decide_file_action(entry);
>
> This change makes me a bit uneasy, per se my previous comment with the
> additional code dependencies.
>

We can revert to the original approach (see
v1-0001-pg_rewind-wal-deletion.patch from the very first email) if you like.


> I think that this scenario deserves a test case.  If one wants to
> emulate a delay in WAL archiving, it is possible to set
> archive_command to a command that we know will fail, for instance.
>

Yes, I totally agree, it is on our radar, but meanwhile please see the new
version, just to check if I correctly understood your idea.

Regards,
--
Alexander Kukushkin
From 1a8504419f7fafc04443c09d86807dccffc750f8 Mon Sep 17 00:00:00 2001
From: Alexander Kukushkin <cyberd...@gmail.com>
Date: Wed, 23 Aug 2023 13:40:47 +0200
Subject: [PATCH v4] Be more picky with WAL segment deletion in pg_rewind

Make pg_rewind to be a bit wiser in terms of creating filemap:
preserve on the target all WAL segments that contain records between the
last common checkpoint and the point of divergence.

Co-authored-by: Polina Bungina <bung...@gmail.com>
---
 src/bin/pg_rewind/filemap.c   | 19 ++++++++++++++++++-
 src/bin/pg_rewind/filemap.h   |  1 +
 src/bin/pg_rewind/parsexlog.c | 22 ++++++++++++++++++++++
 src/bin/pg_rewind/pg_rewind.c |  6 +++---
 4 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index bd5c598e20..a0c8dbc1b6 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -795,7 +795,12 @@ decide_file_actions(void)
 	filehash_start_iterate(filehash, &it);
 	while ((entry = filehash_iterate(filehash, &it)) != NULL)
 	{
-		entry->action = decide_file_action(entry);
+		/*
+		 * Some entries (WAL segments) already have an action assigned
+		 * (see findLastCheckpoint() function).
+		 */
+		if (entry->action == FILE_ACTION_UNDECIDED)
+			entry->action = decide_file_action(entry);
 	}
 
 	/*
@@ -818,6 +823,18 @@ decide_file_actions(void)
 	return filemap;
 }
 
+/*
+ * Prevent a given file deletion during rewind
+ */
+void
+preserve_file(char *filepath)
+{
+	file_entry_t *entry;
+
+	entry = insert_filehash_entry(filepath);
+	entry->action = FILE_ACTION_NONE;
+}
+
 
 /*
  * Helper function for filemap hash table.
diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h
index 48f240dff1..47c7a7fd09 100644
--- a/src/bin/pg_rewind/filemap.h
+++ b/src/bin/pg_rewind/filemap.h
@@ -109,5 +109,6 @@ extern void process_target_wal_block_change(ForkNumber forknum,
 extern filemap_t *decide_file_actions(void);
 extern void calculate_totals(filemap_t *filemap);
 extern void print_filemap(filemap_t *filemap);
+extern void preserve_file(char *filepath);
 
 #endif							/* FILEMAP_H */
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index 27782237d0..40c7073522 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -176,6 +176,10 @@ findLastCheckpoint(const char *datadir, XLogRecPtr forkptr, int tliIndex,
 	char	   *errormsg;
 	XLogPageReadPrivate private;
 
+	/* Track WAL segments opened while searching a checkpoint */
+	XLogSegNo	segno = 0;
+	TimeLineID	tli = 0;
+
 	/*
 	 * The given fork pointer points to the end of the last common record,
 	 * which is not necessarily the beginning of the next record, if the
@@ -217,6 +221,24 @@ findLastCheckpoint(const char *datadir, XLogRecPtr forkptr, int tliIndex,
 						 LSN_FORMAT_ARGS(searchptr));
 		}
 
+		/* We are trying to detect if the new WAL file was opened */
+		if (xlogreader->seg.ws_tli != tli || xlogreader->seg.ws_segno != segno)
+		{
+			char		xlogfname[MAXFNAMELEN];
+
+			tli = xlogreader->seg.ws_tli;
+			segno = xlogreader->seg.ws_segno;
+
+			snprintf(xlogfname, MAXPGPATH, XLOGDIR "/");
+			XLogFileName(xlogfname + strlen(xlogfname),
+						 xlogreader->seg.ws_tli,
+						 xlogreader->seg.ws_segno, WalSegSz);
+
+			/* Make sure pg_rewind doesn't remove this file, because it is
+			 * required for postgres to start after rewind. */
+			preserve_file(xlogfname);
+		}
+
 		/*
 		 * Check if it is a checkpoint record. This checkpoint record needs to
 		 * be the latest checkpoint before WAL forked and not the checkpoint
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index f7f3b8227f..bb52aa5466 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -446,14 +446,14 @@ main(int argc, char **argv)
 		exit(0);
 	}
 
+	/* Initialize the hash table to track the status of each file */
+	filehash_init();
+
 	findLastCheckpoint(datadir_target, divergerec, lastcommontliIndex,
 					   &chkptrec, &chkpttli, &chkptredo, restore_command);
 	pg_log_info("rewinding from last common checkpoint at %X/%X on timeline %u",
 				LSN_FORMAT_ARGS(chkptrec), chkpttli);
 
-	/* Initialize the hash table to track the status of each file */
-	filehash_init();
-
 	/*
 	 * Collect information about all files in the both data directories.
 	 */
-- 
2.34.1

Reply via email to