On 10/12/23 10:19, David Steele wrote:
On 10/11/23 18:10, Thomas Munro wrote:
As Stephen mentioned[1], we could perhaps also complain if both backup
label and control file exist, and then hint that the user should
remove the *control file* (not the backup label!). I had originally
suggested we would just overwrite the control file, but by explicitly
complaining about it we would also bring the matter to tool/script
authors' attention, ie that they shouldn't be backing that file up, or
should be removing it in a later step if they copy everything. He
also mentions that there doesn't seem to be anything stopping us from
back-patching changes to the backup label contents if we go this way.
I don't have a strong opinion on that and we could leave the question
for later.
I'm worried about the possibility of back patching this unless the
solution comes out to be simpler than I think and that rarely comes to
pass. Surely throwing errors on something that is currently valid (i.e.
backup_label and pg_control both present).
But perhaps there is a simpler, acceptable solution we could back patch
(transparent to all parties except Postgres) and then a more advanced
solution we could go forward with.
I guess I had better get busy on this.
Attached is a very POC attempt at something along these lines that could
be back patched. I stopped when I realized excluding pg_control from the
backup is a necessity to make this work (else we still could end up with
a torn copy of pg_control even if there is a good copy elsewhere). To
enumerate the back patch issues as I see them:
1) We still need a copy of pg_control in order to get Postgres to start
and that copy might be torn (pretty much where we are now). We can
handle this easily in pg_basebackup but most backup software will not be
able to do so. In my opinion teaching Postgres to start without
pg_control is too big a change to possibly back patch.
2) We need to deal with backups made with a prior *minor* version that
did not include pg_control in the backup_label. Doable, but...
3) We need to move backup_label to the end of the main pg_basebackup
tar, which could cause unforeseen breakage for tools.
4) This patch is less efficient for backups taken from standby because
it will overwrite pg_control on restart and force replay back to the
original MinRecoveryPoint.
5) We still need a solution for exclusive backup (still valid in PG <=
14). Doable but it would still have the weakness of 1.
All of this is fixable in HEAD, but seems incredibly dangerous to back
patch. Even so, I have attached the patch in case somebody sees an
opportunity that I do not.
Regards,
-David
diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index c0e4ca50899..e4a4478af75 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -74,6 +74,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "port/atomics.h"
+#include "port/pg_crc32c.h"
#include "port/pg_iovec.h"
#include "postmaster/bgwriter.h"
#include "postmaster/startup.h"
@@ -8691,6 +8692,21 @@ do_pg_backup_stop(BackupState *state, bool
waitforarchive)
"and should not be used. "
"Try taking another online
backup.")));
+ /*
+ * Create a copy of control data to be stored in the backup label.
+ * Recalculate the CRC so recovery can validate the contents but do not
+ * bother with the timestamp since that will be applied before it is
+ * written by recovery.
+ */
+ LWLockAcquire(ControlFileLock, LW_SHARED);
+ state->controlFile = *ControlFile;
+ LWLockRelease(ControlFileLock);
+
+ INIT_CRC32C(state->controlFile.crc);
+ COMP_CRC32C(state->controlFile.crc, (char *)&state->controlFile,
+ offsetof(ControlFileData, crc));
+ FIN_CRC32C(state->controlFile.crc);
+
/*
* During recovery, we don't write an end-of-backup record. We assume
that
* pg_control was backed up last and its minimum recovery point can be
@@ -8741,11 +8757,8 @@ do_pg_backup_stop(BackupState *state, bool
waitforarchive)
"Enable
full_page_writes and run CHECKPOINT on the primary, "
"and then try an
online backup again.")));
-
- LWLockAcquire(ControlFileLock, LW_SHARED);
- state->stoppoint = ControlFile->minRecoveryPoint;
- state->stoptli = ControlFile->minRecoveryPointTLI;
- LWLockRelease(ControlFileLock);
+ state->stoppoint = state->controlFile.minRecoveryPoint;
+ state->stoptli = state->controlFile.minRecoveryPointTLI;
}
else
{
diff --git a/src/backend/access/transam/xlogbackup.c
b/src/backend/access/transam/xlogbackup.c
index 21d68133ae1..79559d78acb 100644
--- a/src/backend/access/transam/xlogbackup.c
+++ b/src/backend/access/transam/xlogbackup.c
@@ -16,6 +16,7 @@
#include "access/xlog.h"
#include "access/xlog_internal.h"
#include "access/xlogbackup.h"
+#include "common/base64.h"
/*
* Build contents for backup_label or backup history file.
@@ -76,6 +77,17 @@ build_backup_content(BackupState *state, bool ishistoryfile)
appendStringInfo(result, "STOP TIME: %s\n", stopstrfbuf);
appendStringInfo(result, "STOP TIMELINE: %u\n", state->stoptli);
}
+ /* Include a copy of control data */
+ else
+ {
+ char controlbuf[((sizeof(ControlFileData) + 2) / 3 * 4) + 1];
+
+ pg_b64_encode((char *)&state->controlFile,
sizeof(ControlFileData),
+ controlbuf, sizeof(controlbuf) - 1);
+ controlbuf[sizeof(controlbuf) - 1] = '\0';
+
+ appendStringInfo(result, "CONTROL DATA: %s\n", controlbuf);
+ }
data = result->data;
pfree(result);
diff --git a/src/backend/access/transam/xlogrecovery.c
b/src/backend/access/transam/xlogrecovery.c
index becc2bda62e..1abd9494bf0 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -43,6 +43,7 @@
#include "backup/basebackup.h"
#include "catalog/pg_control.h"
#include "commands/tablespace.h"
+#include "common/base64.h"
#include "common/file_utils.h"
#include "miscadmin.h"
#include "pgstat.h"
@@ -390,7 +391,8 @@ static void readRecoverySignalFile(void);
static void validateRecoveryParameters(void);
static bool read_backup_label(XLogRecPtr *checkPointLoc,
TimeLineID
*backupLabelTLI,
- bool
*backupEndRequired, bool *backupFromStandby);
+ bool
*backupEndRequired, bool *backupFromStandby,
+ ControlFileData
*ControlFile);
static bool read_tablespace_map(List **tablespaces);
static void xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI);
@@ -610,7 +612,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool
*wasShutdown_ptr,
primary_image_masked = (char *) palloc(BLCKSZ);
if (read_backup_label(&CheckPointLoc, &CheckPointTLI,
&backupEndRequired,
- &backupFromStandby))
+ &backupFromStandby,
ControlFile))
{
List *tablespaces = NIL;
@@ -1167,7 +1169,8 @@ validateRecoveryParameters(void)
*/
static bool
read_backup_label(XLogRecPtr *checkPointLoc, TimeLineID *backupLabelTLI,
- bool *backupEndRequired, bool
*backupFromStandby)
+ bool *backupEndRequired, bool
*backupFromStandby,
+ ControlFileData *ControlFile)
{
char startxlogfilename[MAXFNAMELEN];
TimeLineID tli_from_walseg,
@@ -1180,6 +1183,7 @@ read_backup_label(XLogRecPtr *checkPointLoc, TimeLineID
*backupLabelTLI,
char backuptime[128];
uint32 hi,
lo;
+ char controlB64Buf[684 + 1];
/* suppress possible uninitialized-variable warnings */
*checkPointLoc = InvalidXLogRecPtr;
@@ -1285,6 +1289,51 @@ read_backup_label(XLogRecPtr *checkPointLoc, TimeLineID
*backupLabelTLI,
tli_from_file,
BACKUP_LABEL_FILE)));
}
+ /*
+ * Read control data to be used to create pg_control. Control data may
not
+ * exist if the backup was made with an older version, in which case the
+ * control file read from disk will be used.
+ */
+ if (fscanf(lfp, "CONTROL DATA: %684s\n", controlB64Buf) == 1)
+ {
+ ControlFileData controlBuf;
+ pg_crc32c crc;
+
+ /* Check that the base64 data is the correct length */
+ if (strlen(controlB64Buf) != (sizeof(ControlFileData) + 2) / 3
* 4)
+ ereport(FATAL,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("invalid data in file \"%s\"",
BACKUP_LABEL_FILE),
+ errdetail("Control data expected %zu
base64 characters.",
+
(sizeof(ControlFileData) + 2) / 3 * 4)));
+
+ /* Decode control file */
+ if (pg_b64_decode(controlB64Buf, strlen(controlB64Buf), (char
*)&controlBuf,
+ sizeof(ControlFileData)) == -1)
+ ereport(FATAL,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("invalid data in file \"%s\"",
BACKUP_LABEL_FILE),
+ errdetail("Control data contains
invalid base64 encoding.")));
+
+ /* CRC check on stored control file */
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, (char *)&controlBuf, offsetof(ControlFileData,
crc));
+ FIN_CRC32C(crc);
+
+ if (!EQ_CRC32C(crc, controlBuf.crc))
+ ereport(FATAL,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("invalid data in file \"%s\"",
BACKUP_LABEL_FILE),
+ errdetail("Control data contains
invalid CRC.")));
+
+ ereport(DEBUG1,
+ (errmsg_internal("backup control data in file
\"%s\"",
+
BACKUP_LABEL_FILE)));
+
+ /* Copy control data */
+ memcpy(ControlFile, &controlBuf, sizeof(ControlFileData));
+ }
+
if (ferror(lfp) || FreeFile(lfp))
ereport(FATAL,
(errcode_for_file_access(),
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 7d025bcf382..f49dabf63ef 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -330,13 +330,7 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
bbsink_begin_archive(sink, "base.tar");
- /* In the main tar, include the backup_label
first... */
- backup_label =
build_backup_content(backup_state, false);
- sendFileWithContent(sink, BACKUP_LABEL_FILE,
-
backup_label, &manifest);
- pfree(backup_label);
-
- /* Then the tablespace_map file, if required...
*/
+ /* Send the tablespace_map file, if required...
*/
if (opt->sendtblspcmapfile)
{
sendFileWithContent(sink,
TABLESPACE_MAP,
@@ -348,7 +342,7 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
sendDir(sink, ".", 1, false, state.tablespaces,
sendtblspclinks, &manifest,
NULL);
- /* ... and pg_control after everything else. */
+ /* ... and pg_control after everything but
backup_label */
if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
ereport(ERROR,
(errcode_for_file_access(),
@@ -356,6 +350,16 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
XLOG_CONTROL_FILE)));
sendFile(sink, XLOG_CONTROL_FILE,
XLOG_CONTROL_FILE, &statbuf,
false, InvalidOid, &manifest,
NULL);
+
+ /* End the backup before sending backup_label */
+ basebackup_progress_wait_wal_archive(&state);
+ do_pg_backup_stop(backup_state, !opt->nowait);
+
+ /* Last in the main tar, include backup_label */
+ backup_label =
build_backup_content(backup_state, false);
+ sendFileWithContent(sink, BACKUP_LABEL_FILE,
+
backup_label, &manifest);
+ pfree(backup_label);
}
else
{
@@ -389,9 +393,6 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
}
}
- basebackup_progress_wait_wal_archive(&state);
- do_pg_backup_stop(backup_state, !opt->nowait);
-
endptr = backup_state->stoppoint;
endtli = backup_state->stoptli;
diff --git a/src/include/access/xlogbackup.h b/src/include/access/xlogbackup.h
index 1611358137b..bd53405670e 100644
--- a/src/include/access/xlogbackup.h
+++ b/src/include/access/xlogbackup.h
@@ -15,6 +15,7 @@
#define XLOG_BACKUP_H
#include "access/xlogdefs.h"
+#include "catalog/pg_control.h"
#include "pgtime.h"
/* Structure to hold backup state. */
@@ -33,6 +34,7 @@ typedef struct BackupState
XLogRecPtr stoppoint; /* backup stop WAL location */
TimeLineID stoptli; /* backup stop TLI */
pg_time_t stoptime; /* backup stop time */
+ ControlFileData controlFile; /* Copy of control data */
} BackupState;
extern char *build_backup_content(BackupState *state,