Hi,
On Tue, 22 Aug 2023 at 07:32, Michael Paquier <mich...@paquier.xyz> wrote: > > > I don't like much this patch. While it takes correctly advantage of > the backward record read logic from SimpleXLogPageRead() able to > handle correctly timeline jumps, it creates a hidden dependency in the > code between the hash table from filemap.c and the page callback. > Wouldn't it be simpler to build a list of the segment names using the > information from WALOpenSegment and build this list in > findLastCheckpoint()? I think the first version of the patch more or less did that. Not necessarily a list, but a hash table of WAL file names that we want to keep. But Kyotaro Horiguchi didn't like it and suggested creating entries in the filemap.c hash table instead. But, I agree, doing it directly from the findLastCheckpoint() makes the code easier to understand. > Also, I am wondering if we should be smarter > with any potential conflict handling between the source and the > target, rather than just enforcing a FILE_ACTION_NONE for all these > files. In short, could it be better to copy the WAL file from the > source if it exists there? > Before the switchpoint these files are supposed to be the same on the old primary, new primary, and also in the archive. Also, if there is a restore_command postgres will fetch the same file from the archive even if it already exists in pg_wal, which effectively discards all pg_rewind efforts on copying WAL files. > > + /* > + * Some entries (WAL segments) already have an action assigned > + * (see SimpleXLogPageRead()). > + */ > + if (entry->action == FILE_ACTION_UNDECIDED) > + entry->action = decide_file_action(entry); > > This change makes me a bit uneasy, per se my previous comment with the > additional code dependencies. > We can revert to the original approach (see v1-0001-pg_rewind-wal-deletion.patch from the very first email) if you like. > I think that this scenario deserves a test case. If one wants to > emulate a delay in WAL archiving, it is possible to set > archive_command to a command that we know will fail, for instance. > Yes, I totally agree, it is on our radar, but meanwhile please see the new version, just to check if I correctly understood your idea. Regards, -- Alexander Kukushkin
From 1a8504419f7fafc04443c09d86807dccffc750f8 Mon Sep 17 00:00:00 2001 From: Alexander Kukushkin <cyberd...@gmail.com> Date: Wed, 23 Aug 2023 13:40:47 +0200 Subject: [PATCH v4] Be more picky with WAL segment deletion in pg_rewind Make pg_rewind to be a bit wiser in terms of creating filemap: preserve on the target all WAL segments that contain records between the last common checkpoint and the point of divergence. Co-authored-by: Polina Bungina <bung...@gmail.com> --- src/bin/pg_rewind/filemap.c | 19 ++++++++++++++++++- src/bin/pg_rewind/filemap.h | 1 + src/bin/pg_rewind/parsexlog.c | 22 ++++++++++++++++++++++ src/bin/pg_rewind/pg_rewind.c | 6 +++--- 4 files changed, 44 insertions(+), 4 deletions(-) diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c index bd5c598e20..a0c8dbc1b6 100644 --- a/src/bin/pg_rewind/filemap.c +++ b/src/bin/pg_rewind/filemap.c @@ -795,7 +795,12 @@ decide_file_actions(void) filehash_start_iterate(filehash, &it); while ((entry = filehash_iterate(filehash, &it)) != NULL) { - entry->action = decide_file_action(entry); + /* + * Some entries (WAL segments) already have an action assigned + * (see findLastCheckpoint() function). + */ + if (entry->action == FILE_ACTION_UNDECIDED) + entry->action = decide_file_action(entry); } /* @@ -818,6 +823,18 @@ decide_file_actions(void) return filemap; } +/* + * Prevent a given file deletion during rewind + */ +void +preserve_file(char *filepath) +{ + file_entry_t *entry; + + entry = insert_filehash_entry(filepath); + entry->action = FILE_ACTION_NONE; +} + /* * Helper function for filemap hash table. diff --git a/src/bin/pg_rewind/filemap.h b/src/bin/pg_rewind/filemap.h index 48f240dff1..47c7a7fd09 100644 --- a/src/bin/pg_rewind/filemap.h +++ b/src/bin/pg_rewind/filemap.h @@ -109,5 +109,6 @@ extern void process_target_wal_block_change(ForkNumber forknum, extern filemap_t *decide_file_actions(void); extern void calculate_totals(filemap_t *filemap); extern void print_filemap(filemap_t *filemap); +extern void preserve_file(char *filepath); #endif /* FILEMAP_H */ diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c index 27782237d0..40c7073522 100644 --- a/src/bin/pg_rewind/parsexlog.c +++ b/src/bin/pg_rewind/parsexlog.c @@ -176,6 +176,10 @@ findLastCheckpoint(const char *datadir, XLogRecPtr forkptr, int tliIndex, char *errormsg; XLogPageReadPrivate private; + /* Track WAL segments opened while searching a checkpoint */ + XLogSegNo segno = 0; + TimeLineID tli = 0; + /* * The given fork pointer points to the end of the last common record, * which is not necessarily the beginning of the next record, if the @@ -217,6 +221,24 @@ findLastCheckpoint(const char *datadir, XLogRecPtr forkptr, int tliIndex, LSN_FORMAT_ARGS(searchptr)); } + /* We are trying to detect if the new WAL file was opened */ + if (xlogreader->seg.ws_tli != tli || xlogreader->seg.ws_segno != segno) + { + char xlogfname[MAXFNAMELEN]; + + tli = xlogreader->seg.ws_tli; + segno = xlogreader->seg.ws_segno; + + snprintf(xlogfname, MAXPGPATH, XLOGDIR "/"); + XLogFileName(xlogfname + strlen(xlogfname), + xlogreader->seg.ws_tli, + xlogreader->seg.ws_segno, WalSegSz); + + /* Make sure pg_rewind doesn't remove this file, because it is + * required for postgres to start after rewind. */ + preserve_file(xlogfname); + } + /* * Check if it is a checkpoint record. This checkpoint record needs to * be the latest checkpoint before WAL forked and not the checkpoint diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index f7f3b8227f..bb52aa5466 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -446,14 +446,14 @@ main(int argc, char **argv) exit(0); } + /* Initialize the hash table to track the status of each file */ + filehash_init(); + findLastCheckpoint(datadir_target, divergerec, lastcommontliIndex, &chkptrec, &chkpttli, &chkptredo, restore_command); pg_log_info("rewinding from last common checkpoint at %X/%X on timeline %u", LSN_FORMAT_ARGS(chkptrec), chkpttli); - /* Initialize the hash table to track the status of each file */ - filehash_init(); - /* * Collect information about all files in the both data directories. */ -- 2.34.1