On 11/6/23 02:35, Michael Paquier wrote:
On Sun, Nov 05, 2023 at 01:45:39PM -0400, David Steele wrote:
Rebased on 151ffcf6.
I like this patch a lot. Even if the backup_label file is removed, we
still have all the debug information from the backup history file,
thanks to its LABEL, BACKUP METHOD and BACKUP FROM, so no information
is lost. It does a 1:1 replacement of the contents parsed from the
backup_label needed by recovery by fetching them from the control
file. Sounds like a straight-forward change to me.
That's the plan, at least!
The patch is failing the recovery test 039_end_of_wal.pl. Could you
look at the failure?
I'm not seeing this failure, and CI seems happy [1]. Can you give
details of the error message?
/* Build and save the contents of the backup history file */
- history_file = build_backup_content(state, true);
+ history_file = build_backup_content(state);
build_backup_content() sounds like an incorrect name if it is a
routine only used to build the contents of backup history files.
Good point, I have renamed this to build_backup_history_content().
Why is there nothing updated in src/bin/pg_controldata/?
Oops, added.
+ /* Clear fields used to initialize recovery */
+ ControlFile->backupCheckPoint = InvalidXLogRecPtr;
+ ControlFile->backupStartPointTLI = 0;
+ ControlFile->backupRecoveryRequired = false;
+ ControlFile->backupFromStandby = false;
These variables in the control file are cleaned up when the
backup_label file was read previously, but backup_label is renamed to
backup_label.old a bit later than that. Your logic looks correct seen
from here, but shouldn't these variables be set much later, i.e. just
*after* UpdateControlFile()? This gap between the initialization of
the control file and the in-memory reset makes the code quite brittle,
IMO.
If we set these fields where backup_label was renamed, the logic would
not be exactly the same since pg_control won't be updated until the next
time through the loop. Since the fields should be updated before
UpdateControlFile() I thought it made sense to keep all the updates
together.
Overall I think it is simpler, and we don't need to acquire a lock on
ControlFile.
- basebackup_progress_wait_wal_archive(&state);
- do_pg_backup_stop(backup_state, !opt->nowait);
Why is that moved?
do_pg_backup_stop() generates the updated pg_control so it needs to run
before we transmit pg_control.
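For readers following along, a rough sketch of what "generates the updated pg_control" amounts to: snapshot the live control data, set the backup recovery fields on the copy, and recompute the CRC over everything before the crc member. All names here (DemoControlFile, demo_crc32c, demo_make_backup_copy) are hypothetical, the struct is a cut-down stand-in, and the bitwise CRC-32C routine is a plain substitute for the server's optimized CRC macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t
demo_crc32c(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;

    while (len-- > 0)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical, cut-down control data; crc must stay the last member. */
typedef struct DemoControlFile
{
    uint64_t    checkPoint;
    uint64_t    backupCheckPoint;
    uint64_t    backupStartPoint;
    uint32_t    backupStartPointTLI;
    bool        backupEndRequired;
    bool        backupRecoveryRequired;
    bool        backupFromStandby;
    uint32_t    crc;
} DemoControlFile;

/*
 * Build the copy that would be shipped with the backup. The live data is
 * never modified; only the copy gets the recovery fields and a fresh CRC.
 */
void
demo_make_backup_copy(const DemoControlFile *live, DemoControlFile *copy,
                      uint64_t checkpointloc, uint64_t startpoint,
                      uint32_t starttli, bool from_standby)
{
    assert(live != NULL && copy != NULL);

    /* Snapshot; the patch does the equivalent under ControlFileLock. */
    memcpy(copy, live, sizeof(*copy));

    copy->backupRecoveryRequired = true;
    copy->backupFromStandby = from_standby;
    copy->backupEndRequired = true;
    copy->backupCheckPoint = checkpointloc;
    copy->backupStartPoint = startpoint;
    copy->backupStartPointTLI = starttli;

    /* The CRC covers every byte that precedes the crc member. */
    copy->crc = demo_crc32c(copy, offsetof(DemoControlFile, crc));
}
```

Since the copy only exists once this has run, anything that transmits pg_control has to come after it, which is why the call order changed.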
- The backup label
- file includes the label string you gave to
<function>pg_backup_start</function>,
- as well as the time at which <function>pg_backup_start</function> was run,
and
- the name of the starting WAL file. In case of confusion it is therefore
- possible to look inside a backup file and determine exactly which
- backup session the dump file came from. The tablespace map file includes
+ The tablespace map file includes
It may be worth mentioning that the backup history file holds this
information on the primary's pg_wal, as well.
OK, reworded.
The changes in sendFileWithContent() may be worth a patch of its own.
Thomas included this change in his pg_basebackup changes so I did the
same. Maybe wait a bit before we split this out? Seems like a pretty
small change...
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -146,6 +146,9 @@ typedef struct ControlFileData
@@ -160,14 +163,25 @@ typedef struct ControlFileData
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
+ XLogRecPtr backupCheckPoint;
XLogRecPtr backupStartPoint;
+ TimeLineID backupStartPointTLI;
XLogRecPtr backupEndPoint;
+ bool backupRecoveryRequired;
+ bool backupFromStandby;
This increases the size of the control file from 296B to 312B with an
8-byte alignment, as far as I can see. The size of the control file
has been always a sensitive subject especially with the hard limit of
PG_CONTROL_MAX_SAFE_SIZE. Well, the point of this patch is that this
is the price to pay to prevent users from doing something stupid with
a removal of a backup_label when they should not. Do others have an
opinion about this increase in size?
Actually, grouping backupStartPointTLI and minRecoveryPointTLI should
reduce the size further with some alignment magic, no?
I thought about this, but it seemed to me that existing fields had been
positioned to make the grouping logical rather than to optimize
alignment, e.g. minRecoveryPointTLI. Ideally that would have been placed
near backupEndRequired (or vice versa). But if the general opinion is to
rearrange for alignment, I'm OK with that.
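As a rough illustration of the trade-off, the sketch below compares the field order from the hunk with an order sorted by alignment. Both structs are hypothetical subsets of ControlFileData with stand-in typedefs, so the absolute sizes say nothing about the real 296B/312B figures; only the difference between the two layouts is of interest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs (assumptions for this sketch). */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

/* Field order as in the hunk above: grouped logically. */
typedef struct LogicalOrder
{
    XLogRecPtr  minRecoveryPoint;
    TimeLineID  minRecoveryPointTLI;    /* likely followed by padding */
    XLogRecPtr  backupCheckPoint;
    XLogRecPtr  backupStartPoint;
    TimeLineID  backupStartPointTLI;    /* likely followed by padding */
    XLogRecPtr  backupEndPoint;
    bool        backupRecoveryRequired;
    bool        backupFromStandby;
} LogicalOrder;

/* Same fields, widest first, so the two TLIs share one 8-byte slot. */
typedef struct AlignedOrder
{
    XLogRecPtr  minRecoveryPoint;
    XLogRecPtr  backupCheckPoint;
    XLogRecPtr  backupStartPoint;
    XLogRecPtr  backupEndPoint;
    TimeLineID  minRecoveryPointTLI;
    TimeLineID  backupStartPointTLI;
    bool        backupRecoveryRequired;
    bool        backupFromStandby;
} AlignedOrder;

/* Bytes saved by reordering on the platform this is compiled for. */
size_t
demo_padding_saved(void)
{
    assert(sizeof(AlignedOrder) <= sizeof(LogicalOrder));
    return sizeof(LogicalOrder) - sizeof(AlignedOrder);
}
```

On a common 64-bit ABI one would expect the reordered struct to come out 8 bytes smaller, but that depends on the platform's alignment rules, so it is worth measuring rather than assuming.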
backupRecoveryRequired in the control file is switched to false for
pg_rewind and true for streamed backups. My gut feeling is telling me
that this should be OK, as out-of-core tools would need an upgrade if
they relied on the backup_label file anyway. I can see that this
change makes us lose some documentation, unfortunately. Shouldn't
these removed lines be moved to pg_control.h instead for the
description of backupEndRequired?
Updated description in pg_control.h -- it's a bit vague but not sure it
is a good idea to get into the inner workings of pg_rewind here?
doc/src/sgml/ref/pg_rewind.sgml and
src/backend/access/transam/xlogrecovery.c still include references to
the backup_label file.
Fixed.
Attached is a new patch based on 18b585155.
Regards,
-David
[1] https://cirrus-ci.com/build/4939808120766464
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 8cb24d6ae54..584384875be 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -935,19 +935,20 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
ready to archive.
</para>
<para>
- <function>pg_backup_stop</function> will return one row with three
- values. The second of these fields should be written to a file named
- <filename>backup_label</filename> in the root directory of the backup. The
- third field should be written to a file named
- <filename>tablespace_map</filename> unless the field is empty. These files are
+ <function>pg_backup_stop</function> returns the
+ <filename>pg_control</filename> file, which must be stored in the
+ <filename>global</filename> directory of the backup. It also returns the
+ <filename>tablespace_map</filename> file, which should be written in the
+ root directory of the backup unless the field is empty. These files are
vital to the backup working and must be written byte for byte without
- modification, which may require opening the file in binary mode.
+ modification, which will require opening the file in binary mode.
</para>
</listitem>
<listitem>
<para>
Once the WAL segment files active during the backup are archived, you are
- done. The file identified by <function>pg_backup_stop</function>'s first return
+ done. The file identified by <function>pg_backup_stop</function>'s
+ <parameter>lsn</parameter> return
value is the last segment that is required to form a complete set of
backup files. On a primary, if <varname>archive_mode</varname> is enabled and the
<literal>wait_for_archive</literal> parameter is <literal>true</literal>,
@@ -1013,7 +1014,15 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
</para>
<para>
- You should, however, omit from the backup the files within the
+ You must exclude <filename>global/pg_control</filename> from your backup
+ and put the contents of the <parameter>pg_control_file</parameter> column
+ returned from <function>pg_backup_stop</function> in your backup at
+ <filename>global/pg_control</filename>. This file contains the information
+ required to safely recover.
+ </para>
+
+ <para>
+ You should also omit from the backup the files within the
cluster's <filename>pg_wal/</filename> subdirectory. This
slight adjustment is worthwhile because it reduces the risk
of mistakes when restoring. This is easy to arrange if
@@ -1062,11 +1071,11 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
</para>
<para>
- The backup label
- file includes the label string you gave to <function>pg_backup_start</function>,
+ The backup history file (which is archived like WAL) includes the label
+ string you gave to <function>pg_backup_start</function>,
as well as the time at which <function>pg_backup_start</function> was run, and
the name of the starting WAL file. In case of confusion it is therefore
- possible to look inside a backup file and determine exactly which
+ possible to look inside a backup history file and determine exactly which
backup session the dump file came from. The tablespace map file includes
the symbolic link names as they exist in the directory
<filename>pg_tblspc/</filename> and the full path of each symbolic link.
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index d963f0a0a00..ed3e5b9dce6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26845,7 +26845,10 @@ LOG: Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
<parameter>label</parameter> <type>text</type>
<optional>, <parameter>fast</parameter> <type>boolean</type>
</optional> )
- <returnvalue>pg_lsn</returnvalue>
+ <returnvalue>record</returnvalue>
+ ( <parameter>lsn</parameter> <type>pg_lsn</type>,
+ <parameter>timeline_id</parameter> <type>int8</type>,
+ <parameter>start</parameter> <type>timestamptz</type> )
</para>
<para>
Prepares the server to begin an on-line backup. The only required
@@ -26857,6 +26860,13 @@ LOG: Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
as possible. This forces an immediate checkpoint which will cause a
spike in I/O operations, slowing any concurrently executing queries.
</para>
+ <para>
+ The result columns contain information about the start of the backup
+ and can be ignored: the <parameter>lsn</parameter> column holds the
+ starting write-ahead log location, the
+ <parameter>timeline_id</parameter> column holds the starting timeline,
+ and the <parameter>start</parameter> column holds the starting timestamp.
+ </para>
<para>
This function is restricted to superusers by default, but other users
can be granted EXECUTE to run the function.
@@ -26872,13 +26882,15 @@ LOG: Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
<optional><parameter>wait_for_archive</parameter> <type>boolean</type>
</optional> )
<returnvalue>record</returnvalue>
- ( <parameter>lsn</parameter> <type>pg_lsn</type>,
- <parameter>labelfile</parameter> <type>text</type>,
- <parameter>spcmapfile</parameter> <type>text</type> )
+ ( <parameter>pg_control_file</parameter> <type>text</type>,
+ <parameter>tablespace_map_file</parameter> <type>text</type>,
+ <parameter>lsn</parameter> <type>pg_lsn</type>,
+ <parameter>timeline_id</parameter> <type>int8</type>,
+ <parameter>stop</parameter> <type>timestamptz</type> )
</para>
<para>
Finishes performing an on-line backup. The desired contents of the
- backup label file and the tablespace map file are returned as part of
+ pg_control file and the tablespace map file are returned as part of
the result of the function and must be written to files in the
backup area. These files must not be written to the live data directory
(doing so will cause PostgreSQL to fail to restart in the event of a
@@ -26910,13 +26922,16 @@ LOG: Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
backup.
</para>
<para>
- The result of the function is a single record.
- The <parameter>lsn</parameter> column holds the backup's ending
- write-ahead log location (which again can be ignored). The second
- column returns the contents of the backup label file, and the third
- column returns the contents of the tablespace map file. These must be
- stored as part of the backup and are required as part of the restore
- process.
+ The result of the function is a single record. The first column returns
+ the contents of the <filename>pg_control</filename> file and the
+ second column returns the contents of the
+ <filename>tablespace_map</filename> file. These must be stored as part
+ of the backup and are required as part of the restore process. The
+ remainder of the columns contain information about the end of the backup
+ and can be ignored: the <parameter>lsn</parameter> column holds the
+ ending write-ahead log location, the <parameter>timeline_id</parameter>
+ column holds the ending timeline, and the <parameter>stop</parameter>
+ column holds the ending timestamp.
</para>
<para>
This function is restricted to superusers by default, but other users
diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml
index 8e0000d39fb..889add4c5e4 100644
--- a/doc/src/sgml/ref/pg_rewind.sgml
+++ b/doc/src/sgml/ref/pg_rewind.sgml
@@ -400,7 +400,6 @@ GRANT EXECUTE ON function pg_catalog.pg_read_binary_file(text, bigint, bigint, b
<filename>pg_serial/</filename>, <filename>pg_snapshots/</filename>,
<filename>pg_stat_tmp/</filename>, and <filename>pg_subtrans/</filename>
are omitted from the data copied from the source cluster. The files
- <filename>backup_label</filename>,
<filename>tablespace_map</filename>,
<filename>pg_internal.init</filename>,
<filename>postmaster.opts</filename>, and
@@ -410,7 +409,7 @@ GRANT EXECUTE ON function pg_catalog.pg_read_binary_file(text, bigint, bigint, b
</step>
<step>
<para>
- Create a <filename>backup_label</filename> file to begin WAL replay at
+ Update <filename>pg_control</filename> file to begin WAL replay at
the checkpoint created at failover and configure the
<filename>pg_control</filename> file with a minimum consistency LSN
defined as the result of <literal>pg_current_wal_insert_lsn()</literal>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b541be8eec2..34311cfc2b9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -74,6 +74,7 @@
#include "pg_trace.h"
#include "pgstat.h"
#include "port/atomics.h"
+#include "port/pg_crc32c.h"
#include "port/pg_iovec.h"
#include "postmaster/bgwriter.h"
#include "postmaster/startup.h"
@@ -5116,7 +5117,6 @@ StartupXLOG(void)
bool wasShutdown;
bool didCrash;
bool haveTblspcMap;
- bool haveBackupLabel;
XLogRecPtr EndOfLog;
TimeLineID EndOfLogTLI;
TimeLineID newTLI;
@@ -5240,13 +5240,14 @@ StartupXLOG(void)
/*
* Prepare for WAL recovery if needed.
*
- * InitWalRecovery analyzes the control file and the backup label file, if
- * any. It updates the in-memory ControlFile buffer according to the
- * starting checkpoint, and sets InRecovery and ArchiveRecoveryRequested.
+ * InitWalRecovery analyzes the control file and checks if backup recovery
+ * has been requested. It updates the in-memory ControlFile buffer
+ * according to the starting checkpoint, and sets InRecovery and
+ * ArchiveRecoveryRequested.
+ *
* It also applies the tablespace map file, if any.
*/
- InitWalRecovery(ControlFile, &wasShutdown,
- &haveBackupLabel, &haveTblspcMap);
+ InitWalRecovery(ControlFile, &wasShutdown, &haveTblspcMap);
checkPoint = ControlFile->checkPointCopy;
/* initialize shared memory variables from the checkpoint record */
@@ -5389,20 +5390,6 @@ StartupXLOG(void)
*/
UpdateControlFile();
- /*
- * If there was a backup label file, it's done its job and the info
- * has now been propagated into pg_control. We must get rid of the
- * label file so that if we crash during recovery, we'll pick up at
- * the latest recovery restartpoint instead of going all the way back
- * to the backup start point. It seems prudent though to just rename
- * the file out of the way rather than delete it completely.
- */
- if (haveBackupLabel)
- {
- unlink(BACKUP_LABEL_OLD);
- durable_rename(BACKUP_LABEL_FILE, BACKUP_LABEL_OLD, FATAL);
- }
-
/*
* If there was a tablespace_map file, it's done its job and the
* symlinks have been created. We must get rid of the map file so
@@ -5552,10 +5539,8 @@ StartupXLOG(void)
* (at which point we reset backupStartPoint to be Invalid), for
* backup-from-replica (which can't inject records into the WAL stream),
* that point is when we reach the minRecoveryPoint in pg_control (which
- * we purposefully copy last when backing up from a replica). For
- * pg_rewind (which creates a backup_label with a method of "pg_rewind")
- * or snapshot-style backups (which don't), backupEndRequired will be set
- * to false.
+ * we purposefully copy last when backing up). For pg_rewind or
+ * snapshot-style backups, backupEndRequired will be set to false.
*
* Note: it is indeed okay to look at the local variable
* LocalMinRecoveryPoint here, even though ControlFile->minRecoveryPoint
@@ -8725,11 +8710,33 @@ do_pg_backup_stop(BackupState *state, bool waitforarchive)
int seconds_before_warning;
int waits = 0;
bool reported_waiting = false;
+ ControlFileData *controlFileCopy = (ControlFileData *)state->controlFile;
Assert(state != NULL);
backup_stopped_in_recovery = RecoveryInProgress();
+ /*
+ * Create a copy of control data and update it with fields required for
+ * recovery. Also recalculate the CRC.
+ */
+ memset(controlFileCopy, 0, PG_CONTROL_MAX_SAFE_SIZE);
+
+ LWLockAcquire(ControlFileLock, LW_SHARED);
+ memcpy(controlFileCopy, ControlFile, sizeof(ControlFileData));
+ LWLockRelease(ControlFileLock);
+
+ controlFileCopy->backupRecoveryRequired = true;
+ controlFileCopy->backupFromStandby = backup_stopped_in_recovery;
+ controlFileCopy->backupEndRequired = true;
+ controlFileCopy->backupCheckPoint = state->checkpointloc;
+ controlFileCopy->backupStartPoint = state->startpoint;
+ controlFileCopy->backupStartPointTLI = state->starttli;
+
+ INIT_CRC32C(controlFileCopy->crc);
+ COMP_CRC32C(controlFileCopy->crc, controlFileCopy, offsetof(ControlFileData, crc));
+ FIN_CRC32C(controlFileCopy->crc);
+
/*
* During recovery, we don't need to check WAL level. Because, if WAL
* level is not sufficient, it's impossible to get here during recovery.
@@ -8831,11 +8838,8 @@ do_pg_backup_stop(BackupState *state, bool waitforarchive)
"Enable full_page_writes and run CHECKPOINT on the primary, "
"and then try an online backup again.")));
-
- LWLockAcquire(ControlFileLock, LW_SHARED);
- state->stoppoint = ControlFile->minRecoveryPoint;
- state->stoptli = ControlFile->minRecoveryPointTLI;
- LWLockRelease(ControlFileLock);
+ state->stoppoint = controlFileCopy->minRecoveryPoint;
+ state->stoptli = controlFileCopy->minRecoveryPointTLI;
}
else
{
@@ -8877,7 +8881,7 @@ do_pg_backup_stop(BackupState *state, bool waitforarchive)
histfilepath)));
/* Build and save the contents of the backup history file */
- history_file = build_backup_content(state, true);
+ history_file = build_backup_history_content(state);
fprintf(fp, "%s", history_file);
pfree(history_file);
diff --git a/src/backend/access/transam/xlogbackup.c b/src/backend/access/transam/xlogbackup.c
index 21d68133ae1..22c95f3c4c9 100644
--- a/src/backend/access/transam/xlogbackup.c
+++ b/src/backend/access/transam/xlogbackup.c
@@ -18,19 +18,19 @@
#include "access/xlogbackup.h"
/*
- * Build contents for backup_label or backup history file.
- *
- * When ishistoryfile is true, it creates the contents for a backup history
- * file, otherwise it creates contents for a backup_label file.
+ * Build contents for backup history file.
*
* Returns the result generated as a palloc'd string.
*/
char *
-build_backup_content(BackupState *state, bool ishistoryfile)
+build_backup_history_content(BackupState *state)
{
char startstrbuf[128];
+ char stopstrfbuf[128];
char startxlogfile[MAXFNAMELEN]; /* backup start WAL file */
+ char stopxlogfile[MAXFNAMELEN]; /* backup stop WAL file */
XLogSegNo startsegno;
+ XLogSegNo stopsegno;
StringInfo result = makeStringInfo();
char *data;
@@ -45,16 +45,10 @@ build_backup_content(BackupState *state, bool ishistoryfile)
appendStringInfo(result, "START WAL LOCATION: %X/%X (file %s)\n",
LSN_FORMAT_ARGS(state->startpoint),
startxlogfile);
- if (ishistoryfile)
- {
- char stopxlogfile[MAXFNAMELEN]; /* backup stop WAL file */
- XLogSegNo stopsegno;
-
- XLByteToSeg(state->stoppoint, stopsegno, wal_segment_size);
- XLogFileName(stopxlogfile, state->stoptli, stopsegno, wal_segment_size);
- appendStringInfo(result, "STOP WAL LOCATION: %X/%X (file %s)\n",
- LSN_FORMAT_ARGS(state->stoppoint), stopxlogfile);
- }
+ XLByteToSeg(state->stoppoint, stopsegno, wal_segment_size);
+ XLogFileName(stopxlogfile, state->stoptli, stopsegno, wal_segment_size);
+ appendStringInfo(result, "STOP WAL LOCATION: %X/%X (file %s)\n",
+ LSN_FORMAT_ARGS(state->stoppoint), stopxlogfile);
appendStringInfo(result, "CHECKPOINT LOCATION: %X/%X\n",
LSN_FORMAT_ARGS(state->checkpointloc));
@@ -65,17 +59,12 @@ build_backup_content(BackupState *state, bool ishistoryfile)
appendStringInfo(result, "LABEL: %s\n", state->name);
appendStringInfo(result, "START TIMELINE: %u\n", state->starttli);
- if (ishistoryfile)
- {
- char stopstrfbuf[128];
-
- /* Use the log timezone here, not the session timezone */
- pg_strftime(stopstrfbuf, sizeof(stopstrfbuf), "%Y-%m-%d %H:%M:%S %Z",
- pg_localtime(&state->stoptime, log_timezone));
+ /* Use the log timezone here, not the session timezone */
+ pg_strftime(stopstrfbuf, sizeof(stopstrfbuf), "%Y-%m-%d %H:%M:%S %Z",
+ pg_localtime(&state->stoptime, log_timezone));
- appendStringInfo(result, "STOP TIME: %s\n", stopstrfbuf);
- appendStringInfo(result, "STOP TIMELINE: %u\n", state->stoptli);
- }
+ appendStringInfo(result, "STOP TIME: %s\n", stopstrfbuf);
+ appendStringInfo(result, "STOP TIMELINE: %u\n", state->stoptli);
data = result->data;
pfree(result);
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 45a70668b1c..2388a60a5e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -53,7 +53,7 @@ static MemoryContext backupcontext = NULL;
* pg_backup_start: set up for taking an on-line backup dump
*
* Essentially what this does is to create the contents required for the
- * backup_label file and the tablespace map.
+ * tablespace map.
*
* Permission checking for this function is managed through the normal
* GRANT system.
@@ -61,6 +61,10 @@ static MemoryContext backupcontext = NULL;
Datum
pg_backup_start(PG_FUNCTION_ARGS)
{
+#define PG_BACKUP_START_V2_COLS 3
+ TupleDesc tupdesc;
+ Datum values[PG_BACKUP_START_V2_COLS] = {0};
+ bool nulls[PG_BACKUP_START_V2_COLS] = {0};
text *backupid = PG_GETARG_TEXT_PP(0);
bool fast = PG_GETARG_BOOL(1);
char *backupidstr;
@@ -69,6 +73,10 @@ pg_backup_start(PG_FUNCTION_ARGS)
backupidstr = text_to_cstring(backupid);
+ /* Initialize attributes information in the tuple descriptor */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
if (status == SESSION_BACKUP_RUNNING)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -102,7 +110,12 @@ pg_backup_start(PG_FUNCTION_ARGS)
register_persistent_abort_backup_handler();
do_pg_backup_start(backupidstr, fast, NULL, backup_state, tablespace_map);
- PG_RETURN_LSN(backup_state->startpoint);
+ values[0] = LSNGetDatum(backup_state->startpoint);
+ values[1] = Int64GetDatum(backup_state->starttli);
+ values[2] = TimestampTzGetDatum(time_t_to_timestamptz(backup_state->starttime));
+
+ /* Returns the record as Datum */
+ PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
@@ -113,14 +126,12 @@ pg_backup_start(PG_FUNCTION_ARGS)
* allows the user to choose if they want to wait for the WAL to be archived
* or if we should just return as soon as the WAL record is written.
*
- * This function stops an in-progress backup, creates backup_label contents and
- * it returns the backup stop LSN, backup_label and tablespace_map contents.
+ * This function stops an in-progress backup and returns the backup stop LSN,
+ * pg_control and tablespace_map contents.
*
- * The backup_label contains the user-supplied label string (typically this
- * would be used to tell where the backup dump will be stored), the starting
- * time, starting WAL location for the dump and so on. It is the caller's
- * responsibility to write the backup_label and tablespace_map files in the
- * data folder that will be restored from this backup.
+ * The pg_control file contains the recovery information for the backup. It is
+ * the caller's responsibility to write the pg_control and tablespace_map files
+ * in the data folder that will be restored from this backup.
*
* Permission checking for this function is managed through the normal
* GRANT system.
@@ -128,12 +139,12 @@ pg_backup_start(PG_FUNCTION_ARGS)
Datum
pg_backup_stop(PG_FUNCTION_ARGS)
{
-#define PG_BACKUP_STOP_V2_COLS 3
+#define PG_BACKUP_STOP_V2_COLS 5
TupleDesc tupdesc;
Datum values[PG_BACKUP_STOP_V2_COLS] = {0};
bool nulls[PG_BACKUP_STOP_V2_COLS] = {0};
bool waitforarchive = PG_GETARG_BOOL(0);
- char *backup_label;
+ bytea *pg_control_bytea;
SessionBackupState status = get_backup_status();
/* Initialize attributes information in the tuple descriptor */
@@ -152,15 +163,16 @@ pg_backup_stop(PG_FUNCTION_ARGS)
/* Stop the backup */
do_pg_backup_stop(backup_state, waitforarchive);
- /* Build the contents of backup_label */
- backup_label = build_backup_content(backup_state, false);
-
- values[0] = LSNGetDatum(backup_state->stoppoint);
- values[1] = CStringGetTextDatum(backup_label);
- values[2] = CStringGetTextDatum(tablespace_map->data);
+ /* Build the contents of pg_control */
+ pg_control_bytea = (bytea *) palloc(PG_CONTROL_MAX_SAFE_SIZE + VARHDRSZ);
+ SET_VARSIZE(pg_control_bytea, PG_CONTROL_MAX_SAFE_SIZE + VARHDRSZ);
+ memcpy(VARDATA(pg_control_bytea), backup_state->controlFile, PG_CONTROL_MAX_SAFE_SIZE);
- /* Deallocate backup-related variables */
- pfree(backup_label);
+ values[0] = PointerGetDatum(pg_control_bytea);
+ values[1] = CStringGetTextDatum(tablespace_map->data);
+ values[2] = LSNGetDatum(backup_state->stoppoint);
+ values[3] = Int64GetDatum(backup_state->stoptli);
+ values[4] = TimestampTzGetDatum(time_t_to_timestamptz(backup_state->stoptime));
/* Clean up the session-level state and its memory context */
backup_state = NULL;
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index c61566666aa..f43ea39f963 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -6,7 +6,7 @@
* This source file contains functions controlling WAL recovery.
* InitWalRecovery() initializes the system for crash or archive recovery,
* or standby mode, depending on configuration options and the state of
- * the control file and possible backup label file. PerformWalRecovery()
+ * the control file and possible backup recovery. PerformWalRecovery()
* performs the actual WAL replay, calling the rmgr-specific redo routines.
* FinishWalRecovery() performs end-of-recovery checks and cleanup actions,
* and prepares information needed to initialize the WAL for writes. In
@@ -152,11 +152,12 @@ static bool recovery_signal_file_found = false;
/*
* CheckPointLoc is the position of the checkpoint record that determines
- * where to start the replay. It comes from the backup label file or the
- * control file.
+ * where to start the replay. It comes from the control file, either from the
+ * default location or from a backup recovery field.
*
- * RedoStartLSN is the checkpoint's REDO location, also from the backup label
- * file or the control file. In standby mode, XLOG streaming usually starts
+ * RedoStartLSN is the checkpoint's REDO location, also from the default
+ * control file location or from a backup recovery field. In standby mode,
+ * XLOG streaming usually starts
* from the position where an invalid record was found. But if we fail to
* read even the initial checkpoint record, we use the REDO location instead
* of the checkpoint location as the start position of XLOG streaming.
@@ -388,9 +389,6 @@ static void ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, Time
static void EnableStandbyMode(void);
static void readRecoverySignalFile(void);
static void validateRecoveryParameters(void);
-static bool read_backup_label(XLogRecPtr *checkPointLoc,
- TimeLineID *backupLabelTLI,
- bool *backupEndRequired, bool *backupFromStandby);
static bool read_tablespace_map(List **tablespaces);
static void xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI);
@@ -492,8 +490,8 @@ EnableStandbyMode(void)
* Prepare the system for WAL recovery, if needed.
*
* This is called by StartupXLOG() which coordinates the server startup
- * sequence. This function analyzes the control file and the backup label
- * file, if any, and figures out whether we need to perform crash recovery or
+ * sequence. This function analyzes the control file and backup recovery
+ * info, if any, and figures out whether we need to perform crash recovery or
* archive recovery, and how far we need to replay the WAL to reach a
* consistent state.
*
@@ -510,7 +508,7 @@ EnableStandbyMode(void)
*/
void
InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
- bool *haveBackupLabel_ptr, bool *haveTblspcMap_ptr)
+ bool *haveTblspcMap_ptr)
{
XLogPageReadPrivate *private;
struct stat st;
@@ -518,7 +516,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
XLogRecord *record;
DBState dbstate_at_startup;
bool haveTblspcMap = false;
- bool haveBackupLabel = false;
+ bool backupRecoveryRequired = false;
CheckPoint checkPoint;
bool backupFromStandby = false;
@@ -549,7 +547,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
/*
* Set the WAL reading processor now, as it will be needed when reading
- * the checkpoint record required (backup_label or not).
+ * the checkpoint record required (backup recovery required or not).
*/
private = palloc0(sizeof(XLogPageReadPrivate));
xlogreader =
@@ -585,18 +583,34 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
primary_image_masked = (char *) palloc(BLCKSZ);
/*
- * Read the backup_label file. We want to run this part of the recovery
- * process after checking for signal files and after performing validation
- * of the recovery parameters.
+ * Load recovery settings from pg_control. We want to run this part of the
+ * recovery process after checking for signal files and after performing
+ * validation of the recovery parameters.
*/
- if (read_backup_label(&CheckPointLoc, &CheckPointTLI, &backupEndRequired,
- &backupFromStandby))
+ if (ControlFile->backupRecoveryRequired)
{
List *tablespaces = NIL;
+ /* Initialize recovery from fields stored in pg_control */
+ CheckPointLoc = ControlFile->backupCheckPoint;
+ CheckPointTLI = ControlFile->backupStartPointTLI;
+ RedoStartLSN = ControlFile->backupStartPoint;
+ RedoStartTLI = ControlFile->backupStartPointTLI;
+ backupEndRequired = ControlFile->backupEndRequired;
+ backupFromStandby = ControlFile->backupFromStandby;
+
+ /* Clear fields used to initialize recovery */
+ ControlFile->backupCheckPoint = InvalidXLogRecPtr;
+ ControlFile->backupStartPointTLI = 0;
+ ControlFile->backupRecoveryRequired = false;
+ ControlFile->backupFromStandby = false;
+
+ /* Indicate that recovery was requested */
+ backupRecoveryRequired = true;
+
/*
- * Archive recovery was requested, and thanks to the backup label
- * file, we know how far we need to replay to reach consistency. Enter
+ * Archive recovery was requested, and thanks to the recovery
+ * info, we know how far we need to replay to reach consistency. Enter
* archive recovery directly.
*/
InArchiveRecovery = true;
@@ -604,8 +618,9 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
EnableStandbyMode();
/*
- * When a backup_label file is present, we want to roll forward from
- * the checkpoint it identifies, rather than using pg_control.
+ * When backup recovery is requested, we want to roll forward from
+ * the checkpoint it identifies, rather than using the default
+ * checkpoint.
*/
record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc,
CheckPointTLI);
@@ -620,9 +635,8 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
/*
* Make sure that REDO location exists. This may not be the case
- * if there was a crash during an online backup, which left a
- * backup_label around that references a WAL segment that's
- * already been archived.
+ * if recovery.signal is missing and the WAL has already been
+ * archived.
*/
if (checkPoint.redo < CheckPointLoc)
{
@@ -631,20 +645,16 @@ InitWalRecovery(ControlFileData *ControlFile, bool
*wasShutdown_ptr,
checkPoint.ThisTimeLineID))
ereport(FATAL,
(errmsg("could not find
redo location referenced by checkpoint record"),
- errhint("If you are
restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\"
and add required recovery options.\n"
- "If
you are not restoring from a backup, try removing the file
\"%s/backup_label\".\n"
- "Be
careful: removing \"%s/backup_label\" will result in a corrupt cluster if
restoring from a backup.",
-
DataDir, DataDir, DataDir, DataDir)));
+ errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
+ DataDir, DataDir)));
}
}
else
{
ereport(FATAL,
(errmsg("could not locate required
checkpoint record"),
- errhint("If you are restoring from a
backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required
recovery options.\n"
- "If you are not
restoring from a backup, try removing the file \"%s/backup_label\".\n"
- "Be careful: removing
\"%s/backup_label\" will result in a corrupt cluster if restoring from a
backup.",
- DataDir, DataDir,
DataDir, DataDir)));
+ errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add required recovery options.",
+ DataDir, DataDir)));
wasShutdown = false; /* keep compiler quiet */
}
@@ -679,37 +689,32 @@ InitWalRecovery(ControlFileData *ControlFile, bool
*wasShutdown_ptr,
/* tell the caller to delete it later */
haveTblspcMap = true;
}
-
- /* tell the caller to delete it later */
- haveBackupLabel = true;
}
else
{
- /* No backup_label file has been found if we are here. */
-
/*
- * If tablespace_map file is present without backup_label file,
there
- * is no use of such file. There is no harm in retaining it,
but it
- * is better to get rid of the map file so that we don't have
any
+ * If tablespace_map file is present without backup recovery
requested,
+ * there is no use of such file. There is no harm in retaining
it, but
+ * it is better to get rid of the map file so that we don't
have any
* redundant file in data directory and it will avoid any sort
of
* confusion. It seems prudent though to just rename the file
out of
* the way rather than delete it completely, also we ignore any
error
* that occurs in rename operation as even if map file is
present
- * without backup_label file, it is harmless.
+ * without backup recovery requested, it is harmless.
*/
if (stat(TABLESPACE_MAP, &st) == 0)
{
unlink(TABLESPACE_MAP_OLD);
if (durable_rename(TABLESPACE_MAP, TABLESPACE_MAP_OLD,
DEBUG1) == 0)
ereport(LOG,
- (errmsg("ignoring file \"%s\"
because no file \"%s\" exists",
- TABLESPACE_MAP,
BACKUP_LABEL_FILE),
+ (errmsg("ignoring file \"%s\"
because backup recovery was not requested",
+ TABLESPACE_MAP),
errdetail("File \"%s\" was
renamed to \"%s\".",
TABLESPACE_MAP, TABLESPACE_MAP_OLD)));
else
ereport(LOG,
- (errmsg("ignoring file \"%s\"
because no file \"%s\" exists",
- TABLESPACE_MAP,
BACKUP_LABEL_FILE),
+ (errmsg("ignoring file \"%s\"
because backup recovery was not requested",
+ TABLESPACE_MAP),
errdetail("Could not rename
file \"%s\" to \"%s\": %m.",
TABLESPACE_MAP, TABLESPACE_MAP_OLD)));
}
@@ -943,7 +948,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool
*wasShutdown_ptr,
* Any other state indicates that the backup somehow became
corrupted
* and we can't sensibly continue with recovery.
*/
- if (haveBackupLabel)
+ if (backupRecoveryRequired)
{
ControlFile->backupStartPoint = checkPoint.redo;
ControlFile->backupEndRequired = backupEndRequired;
@@ -953,7 +958,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool
*wasShutdown_ptr,
if (dbstate_at_startup !=
DB_IN_ARCHIVE_RECOVERY &&
dbstate_at_startup !=
DB_SHUTDOWNED_IN_RECOVERY)
ereport(FATAL,
- (errmsg("backup_label
contains data inconsistent with control file"),
+ (errmsg("pg_control
contains inconsistent data for standby backup"),
errhint("This means
that the backup is corrupted and you will "
"have
to use another backup for recovery.")));
ControlFile->backupEndPoint =
ControlFile->minRecoveryPoint;
@@ -983,7 +988,6 @@ InitWalRecovery(ControlFileData *ControlFile, bool
*wasShutdown_ptr,
missingContrecPtr = InvalidXLogRecPtr;
*wasShutdown_ptr = wasShutdown;
- *haveBackupLabel_ptr = haveBackupLabel;
*haveTblspcMap_ptr = haveTblspcMap;
}
@@ -1156,154 +1160,6 @@ validateRecoveryParameters(void)
}
}
-/*
- * read_backup_label: check to see if a backup_label file is present
- *
- * If we see a backup_label during recovery, we assume that we are recovering
- * from a backup dump file, and we therefore roll forward from the checkpoint
- * identified by the label file, NOT what pg_control says. This avoids the
- * problem that pg_control might have been archived one or more checkpoints
- * later than the start of the dump, and so if we rely on it as the start
- * point, we will fail to restore a consistent database state.
- *
- * Returns true if a backup_label was found (and fills the checkpoint
- * location and TLI into *checkPointLoc and *backupLabelTLI, respectively);
- * returns false if not. If this backup_label came from a streamed backup,
- * *backupEndRequired is set to true. If this backup_label was created during
- * recovery, *backupFromStandby is set to true.
- *
- * Also sets the global variables RedoStartLSN and RedoStartTLI with the LSN
- * and TLI read from the backup file.
- */
-static bool
-read_backup_label(XLogRecPtr *checkPointLoc, TimeLineID *backupLabelTLI,
- bool *backupEndRequired, bool
*backupFromStandby)
-{
- char startxlogfilename[MAXFNAMELEN];
- TimeLineID tli_from_walseg,
- tli_from_file;
- FILE *lfp;
- char ch;
- char backuptype[20];
- char backupfrom[20];
- char backuplabel[MAXPGPATH];
- char backuptime[128];
- uint32 hi,
- lo;
-
- /* suppress possible uninitialized-variable warnings */
- *checkPointLoc = InvalidXLogRecPtr;
- *backupLabelTLI = 0;
- *backupEndRequired = false;
- *backupFromStandby = false;
-
- /*
- * See if label file is present
- */
- lfp = AllocateFile(BACKUP_LABEL_FILE, "r");
- if (!lfp)
- {
- if (errno != ENOENT)
- ereport(FATAL,
- (errcode_for_file_access(),
- errmsg("could not read file \"%s\":
%m",
- BACKUP_LABEL_FILE)));
- return false; /* it's not there, all is fine
*/
- }
-
- /*
- * Read and parse the START WAL LOCATION and CHECKPOINT lines (this code
- * is pretty crude, but we are not expecting any variability in the file
- * format).
- */
- if (fscanf(lfp, "START WAL LOCATION: %X/%X (file %08X%16s)%c",
- &hi, &lo, &tli_from_walseg, startxlogfilename, &ch)
!= 5 || ch != '\n')
- ereport(FATAL,
-
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("invalid data in file \"%s\"",
BACKUP_LABEL_FILE)));
- RedoStartLSN = ((uint64) hi) << 32 | lo;
- RedoStartTLI = tli_from_walseg;
- if (fscanf(lfp, "CHECKPOINT LOCATION: %X/%X%c",
- &hi, &lo, &ch) != 3 || ch != '\n')
- ereport(FATAL,
-
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("invalid data in file \"%s\"",
BACKUP_LABEL_FILE)));
- *checkPointLoc = ((uint64) hi) << 32 | lo;
- *backupLabelTLI = tli_from_walseg;
-
- /*
- * BACKUP METHOD lets us know if this was a typical backup ("streamed",
- * which could mean either pg_basebackup or the pg_backup_start/stop
- * method was used) or if this label came from somewhere else (the only
- * other option today being from pg_rewind). If this was a streamed
- * backup then we know that we need to play through until we get to the
- * end of the WAL which was generated during the backup (at which point
we
- * will have reached consistency and backupEndRequired will be reset to
be
- * false).
- */
- if (fscanf(lfp, "BACKUP METHOD: %19s\n", backuptype) == 1)
- {
- if (strcmp(backuptype, "streamed") == 0)
- *backupEndRequired = true;
- }
-
- /*
- * BACKUP FROM lets us know if this was from a primary or a standby. If
- * it was from a standby, we'll double-check that the control file state
- * matches that of a standby.
- */
- if (fscanf(lfp, "BACKUP FROM: %19s\n", backupfrom) == 1)
- {
- if (strcmp(backupfrom, "standby") == 0)
- *backupFromStandby = true;
- }
-
- /*
- * Parse START TIME and LABEL. Those are not mandatory fields for
recovery
- * but checking for their presence is useful for debugging and the next
- * sanity checks. Cope also with the fact that the result buffers have a
- * pre-allocated size, hence if the backup_label file has been generated
- * with strings longer than the maximum assumed here an incorrect
parsing
- * happens. That's fine as only minor consistency checks are done
- * afterwards.
- */
- if (fscanf(lfp, "START TIME: %127[^\n]\n", backuptime) == 1)
- ereport(DEBUG1,
- (errmsg_internal("backup time %s in file
\"%s\"",
- backuptime,
BACKUP_LABEL_FILE)));
-
- if (fscanf(lfp, "LABEL: %1023[^\n]\n", backuplabel) == 1)
- ereport(DEBUG1,
- (errmsg_internal("backup label %s in file
\"%s\"",
- backuplabel,
BACKUP_LABEL_FILE)));
-
- /*
- * START TIMELINE is new as of 11. Its parsing is not mandatory, still
use
- * it as a sanity check if present.
- */
- if (fscanf(lfp, "START TIMELINE: %u\n", &tli_from_file) == 1)
- {
- if (tli_from_walseg != tli_from_file)
- ereport(FATAL,
-
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("invalid data in file \"%s\"",
BACKUP_LABEL_FILE),
- errdetail("Timeline ID parsed is %u,
but expected %u.",
- tli_from_file,
tli_from_walseg)));
-
- ereport(DEBUG1,
- (errmsg_internal("backup timeline %u in file
\"%s\"",
- tli_from_file,
BACKUP_LABEL_FILE)));
- }
-
- if (ferror(lfp) || FreeFile(lfp))
- ereport(FATAL,
- (errcode_for_file_access(),
- errmsg("could not read file \"%s\": %m",
- BACKUP_LABEL_FILE)));
-
- return true;
-}
-
/*
* read_tablespace_map: check to see if a tablespace_map file is present
*
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index b537f462197..01d09dbdd21 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -22,6 +22,7 @@
#include "backup/basebackup.h"
#include "backup/basebackup_sink.h"
#include "backup/basebackup_target.h"
+#include "catalog/pg_control.h"
#include "commands/defrem.h"
#include "common/compression.h"
#include "common/file_perm.h"
@@ -94,7 +95,7 @@ static bool verify_page_checksum(Page page, XLogRecPtr
start_lsn,
BlockNumber
blkno,
uint16
*expected_checksum);
static void sendFileWithContent(bbsink *sink, const char *filename,
- const char
*content,
+ const char
*content, int len,
backup_manifest_info *manifest);
static int64 _tarWriteHeader(bbsink *sink, const char *filename,
const char
*linktarget, struct stat *statbuf,
@@ -192,10 +193,9 @@ static const struct exclude_list_item excludeFiles[] =
{RELCACHE_INIT_FILENAME, true},
/*
- * backup_label and tablespace_map should not exist in a running cluster
- * capable of doing an online backup, but exclude them just in case.
+ * tablespace_map should not exist in a running cluster capable of doing
+ * an online backup, but exclude it just in case.
*/
- {BACKUP_LABEL_FILE, false},
{TABLESPACE_MAP, false},
/*
@@ -325,23 +325,15 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
if (ti->path == NULL)
{
- struct stat statbuf;
bool sendtblspclinks = true;
- char *backup_label;
bbsink_begin_archive(sink, "base.tar");
- /* In the main tar, include the backup_label
first... */
- backup_label =
build_backup_content(backup_state, false);
- sendFileWithContent(sink, BACKUP_LABEL_FILE,
-
backup_label, &manifest);
- pfree(backup_label);
-
- /* Then the tablespace_map file, if required...
*/
+ /* Send the tablespace_map file, if required...
*/
if (opt->sendtblspcmapfile)
{
sendFileWithContent(sink,
TABLESPACE_MAP,
-
tablespace_map->data, &manifest);
+
tablespace_map->data, -1, &manifest);
sendtblspclinks = false;
}
@@ -349,14 +341,14 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
sendDir(sink, ".", 1, false, state.tablespaces,
sendtblspclinks, &manifest,
InvalidOid);
- /* ... and pg_control after everything else. */
- if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
- ereport(ERROR,
-
(errcode_for_file_access(),
- errmsg("could not stat
file \"%s\": %m",
-
XLOG_CONTROL_FILE)));
- sendFile(sink, XLOG_CONTROL_FILE,
XLOG_CONTROL_FILE, &statbuf,
- false, InvalidOid, InvalidOid,
&manifest);
+ /* End the backup before sending pg_control */
+ basebackup_progress_wait_wal_archive(&state);
+ do_pg_backup_stop(backup_state, !opt->nowait);
+
+ /* Send copy of pg_control containing recovery
info */
+ sendFileWithContent(sink, XLOG_CONTROL_FILE,
+ (char
*)backup_state->controlFile,
+
PG_CONTROL_MAX_SAFE_SIZE, &manifest);
}
else
{
@@ -390,9 +382,6 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
}
}
- basebackup_progress_wait_wal_archive(&state);
- do_pg_backup_stop(backup_state, !opt->nowait);
-
endptr = backup_state->stoppoint;
endtli = backup_state->stoptli;
@@ -601,7 +590,7 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
* complete segment.
*/
StatusFilePath(pathbuf, walFileName, ".done");
- sendFileWithContent(sink, pathbuf, "", &manifest);
+ sendFileWithContent(sink, pathbuf, "", -1, &manifest);
}
/*
@@ -629,7 +618,7 @@ perform_base_backup(basebackup_options *opt, bbsink *sink)
/* unconditionally mark file as archived */
StatusFilePath(pathbuf, fname, ".done");
- sendFileWithContent(sink, pathbuf, "", &manifest);
+ sendFileWithContent(sink, pathbuf, "", -1, &manifest);
}
/* Properly terminate the tar file. */
@@ -1040,22 +1029,21 @@ SendBaseBackup(BaseBackupCmd *cmd)
*/
static void
sendFileWithContent(bbsink *sink, const char *filename, const char *content,
- backup_manifest_info *manifest)
+ int len, backup_manifest_info *manifest)
{
struct stat statbuf;
- int bytes_done = 0,
- len;
+ int bytes_done = 0;
pg_checksum_context checksum_ctx;
if (pg_checksum_init(&checksum_ctx, manifest->checksum_type) < 0)
elog(ERROR, "could not initialize checksum of file \"%s\"",
filename);
- len = strlen(content);
+ if (len < 0)
+ len = strlen(content);
/*
- * Construct a stat struct for the backup_label file we're injecting in
- * the tar.
+ * Construct a stat struct for the file we're injecting in the tar.
*/
/* Windows doesn't have the concept of uid and gid */
#ifdef WIN32
diff --git a/src/backend/catalog/system_functions.sql
b/src/backend/catalog/system_functions.sql
index 35d738d5763..24bf34b45eb 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -384,13 +384,15 @@ BEGIN ATOMIC
END;
CREATE OR REPLACE FUNCTION
- pg_backup_start(label text, fast boolean DEFAULT false)
- RETURNS pg_lsn STRICT VOLATILE LANGUAGE internal AS 'pg_backup_start'
+ pg_backup_start(label text, fast boolean DEFAULT false, OUT lsn pg_lsn,
+ OUT timeline_id int8, OUT start timestamptz)
+ RETURNS record STRICT VOLATILE LANGUAGE internal AS 'pg_backup_start'
PARALLEL RESTRICTED;
CREATE OR REPLACE FUNCTION pg_backup_stop (
- wait_for_archive boolean DEFAULT true, OUT lsn pg_lsn,
- OUT labelfile text, OUT spcmapfile text)
+ wait_for_archive boolean DEFAULT true, OUT pg_control_file bytea,
+ OUT tablespace_map_file text, OUT lsn pg_lsn, OUT timeline_id int8,
+ OUT stop timestamptz)
RETURNS record STRICT VOLATILE LANGUAGE internal as 'pg_backup_stop'
PARALLEL RESTRICTED;
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b9f5e1266b4..c655cb03352 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -171,8 +171,8 @@ SKIP:
# Write some files to test that they are not copied.
foreach my $filename (
- qw(backup_label tablespace_map postgresql.auto.conf.tmp
- current_logfiles.tmp global/pg_internal.init.123))
+ qw(tablespace_map postgresql.auto.conf.tmp current_logfiles.tmp
+ global/pg_internal.init.123))
{
open my $file, '>>', "$pgdata/$filename";
print $file "DONOTCOPY";
@@ -261,14 +261,13 @@ foreach my $filename (@tempRelationFiles)
"base/$postgresOid/$filename not copied");
}
-# Make sure existing backup_label was ignored.
-isnt(slurp_file("$tempdir/backup/backup_label"),
- 'DONOTCOPY', 'existing backup_label not copied');
+# Make sure existing tablespace_map was ignored.
+ok(!-f "$tempdir/backup/tablespace_map", 'tablespace_map not in backup');
rmtree("$tempdir/backup");
-# Now delete the bogus backup_label file since it will interfere with startup
-unlink("$pgdata/backup_label")
- or BAIL_OUT("unable to unlink $pgdata/backup_label");
+# Now delete the bogus tablespace_map file since it will interfere with startup
+unlink("$pgdata/tablespace_map")
+ or BAIL_OUT("unable to unlink $pgdata/tablespace_map");
$node->command_ok(
[
diff --git a/src/bin/pg_controldata/pg_controldata.c
b/src/bin/pg_controldata/pg_controldata.c
index 93e0837947c..cc515b622ff 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -277,10 +277,18 @@ main(int argc, char *argv[])
LSN_FORMAT_ARGS(ControlFile->minRecoveryPoint));
printf(_("Min recovery ending loc's timeline: %u\n"),
ControlFile->minRecoveryPointTLI);
+ printf(_("Backup checkpoint location: %X/%X\n"),
+ LSN_FORMAT_ARGS(ControlFile->backupCheckPoint));
printf(_("Backup start location: %X/%X\n"),
LSN_FORMAT_ARGS(ControlFile->backupStartPoint));
+ printf(_("Backup start location's timeline: %u\n"),
+ ControlFile->backupStartPointTLI);
printf(_("Backup end location: %X/%X\n"),
LSN_FORMAT_ARGS(ControlFile->backupEndPoint));
+ printf(_("Backup recovery required: %s\n"),
+ ControlFile->backupRecoveryRequired ? _("yes") : _("no"));
+ printf(_("Backup from standby: %s\n"),
+ ControlFile->backupFromStandby ? _("yes") : _("no"));
printf(_("End-of-backup record required: %s\n"),
ControlFile->backupEndRequired ? _("yes") : _("no"));
printf(_("wal_level setting: %s\n"),
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index ecadd69dc53..213f4e71b88 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -139,11 +139,10 @@ static const struct exclude_list_item excludeFiles[] =
{"pg_internal.init", true}, /* defined as RELCACHE_INIT_FILENAME */
/*
- * If there is a backup_label or tablespace_map file, it indicates that
a
- * recovery failed and this cluster probably can't be rewound, but
exclude
- * them anyway if they are found.
+ * If there is a tablespace_map file, it indicates that a recovery
failed
+ * and this cluster probably can't be rewound, but exclude it anyway if
it
+ * is found.
*/
- {"backup_label", false}, /* defined as BACKUP_LABEL_FILE */
{"tablespace_map", false}, /* defined as TABLESPACE_MAP */
/*
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index bfd44a284e2..f42782e2eab 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -39,9 +39,6 @@ static void perform_rewind(filemap_t *filemap, rewind_source
*source,
TimeLineID chkpttli,
XLogRecPtr chkptredo);
-static void createBackupLabel(XLogRecPtr startpoint, TimeLineID starttli,
- XLogRecPtr
checkpointloc);
-
static void digestControlFile(ControlFileData *ControlFile,
const char *content,
size_t size);
static void getRestoreCommand(const char *argv0);
@@ -654,7 +651,7 @@ perform_rewind(filemap_t *filemap, rewind_source *source,
pg_log_info("creating backup label and updating control file");
/*
- * Create a backup label file, to tell the target where to begin the WAL
+ * Get recovery fields to tell the target where to begin the WAL
* replay. Normally, from the last common checkpoint between the source
* and the target. But if the source is a standby server, it's possible
* that the last common checkpoint is *after* the standby's
restartpoint.
@@ -672,7 +669,6 @@ perform_rewind(filemap_t *filemap, rewind_source *source,
chkpttli = ControlFile_source.checkPointCopy.ThisTimeLineID;
chkptrec = ControlFile_source.checkPoint;
}
- createBackupLabel(chkptredo, chkpttli, chkptrec);
/*
* Update control file of target, to tell the target how far it must
@@ -722,6 +718,12 @@ perform_rewind(filemap_t *filemap, rewind_source *source,
ControlFile_new.minRecoveryPoint = endrec;
ControlFile_new.minRecoveryPointTLI = endtli;
ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
+ ControlFile_new.backupRecoveryRequired = true;
+ ControlFile_new.backupFromStandby = true;
+ ControlFile_new.backupEndRequired = false;
+ ControlFile_new.backupCheckPoint = chkptrec;
+ ControlFile_new.backupStartPoint = chkptredo;
+ ControlFile_new.backupStartPointTLI = chkpttli;
if (!dry_run)
update_controlfile(datadir_target, &ControlFile_new, do_sync);
}
@@ -729,7 +731,10 @@ perform_rewind(filemap_t *filemap, rewind_source *source,
static void
sanityChecks(void)
{
- /* TODO Check that there's no backup_label in either cluster */
+ /*
+ * TODO Check that neither cluster has backupRecoveryRequired set in
+ * pg_control.
+ */
/* Check system_identifier match */
if (ControlFile_target.system_identifier !=
ControlFile_source.system_identifier)
@@ -951,51 +956,6 @@ findCommonAncestorTimeline(TimeLineHistoryEntry
*a_history, int a_nentries,
}
}
-
-/*
- * Create a backup_label file that forces recovery to begin at the last common
- * checkpoint.
- */
-static void
-createBackupLabel(XLogRecPtr startpoint, TimeLineID starttli, XLogRecPtr
checkpointloc)
-{
- XLogSegNo startsegno;
- time_t stamp_time;
- char strfbuf[128];
- char xlogfilename[MAXFNAMELEN];
- struct tm *tmp;
- char buf[1000];
- int len;
-
- XLByteToSeg(startpoint, startsegno, WalSegSz);
- XLogFileName(xlogfilename, starttli, startsegno, WalSegSz);
-
- /*
- * Construct backup label file
- */
- stamp_time = time(NULL);
- tmp = localtime(&stamp_time);
- strftime(strfbuf, sizeof(strfbuf), "%Y-%m-%d %H:%M:%S %Z", tmp);
-
- len = snprintf(buf, sizeof(buf),
- "START WAL LOCATION: %X/%X (file %s)\n"
- "CHECKPOINT LOCATION: %X/%X\n"
- "BACKUP METHOD: pg_rewind\n"
- "BACKUP FROM: standby\n"
- "START TIME: %s\n",
- /* omit LABEL: line */
- LSN_FORMAT_ARGS(startpoint), xlogfilename,
- LSN_FORMAT_ARGS(checkpointloc),
- strfbuf);
- if (len >= sizeof(buf))
- pg_fatal("backup label buffer too small"); /* shouldn't
happen */
-
- /* TODO: move old file out of the way, if any. */
- open_target_file("backup_label", true); /* BACKUP_LABEL_FILE */
- write_target_range(buf, 0, len);
- close_target_file();
-}
-
/*
* Check CRC of control file
*/
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index a14126d164f..3aac6839a70 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -293,8 +293,6 @@ extern SessionBackupState get_backup_status(void);
/* File path names (all relative to $PGDATA) */
#define RECOVERY_SIGNAL_FILE "recovery.signal"
#define STANDBY_SIGNAL_FILE "standby.signal"
-#define BACKUP_LABEL_FILE "backup_label"
-#define BACKUP_LABEL_OLD "backup_label.old"
#define TABLESPACE_MAP "tablespace_map"
#define TABLESPACE_MAP_OLD "tablespace_map.old"
diff --git a/src/include/access/xlogbackup.h b/src/include/access/xlogbackup.h
index 1611358137b..f2c3672fed6 100644
--- a/src/include/access/xlogbackup.h
+++ b/src/include/access/xlogbackup.h
@@ -15,6 +15,7 @@
#define XLOG_BACKUP_H
#include "access/xlogdefs.h"
+#include "catalog/pg_control.h"
#include "pgtime.h"
/* Structure to hold backup state. */
@@ -33,9 +34,18 @@ typedef struct BackupState
XLogRecPtr stoppoint; /* backup stop WAL location */
TimeLineID stoptli; /* backup stop TLI */
pg_time_t stoptime; /* backup stop time */
+
+ /*
+ * After pg_backup_stop() returns, this field will contain a copy of
+ * pg_control that should be stored with the backup. Fields have been
+ * updated for recovery and the CRC has been recalculated. The buffer
+ * is padded to PG_CONTROL_MAX_SAFE_SIZE so that pg_control is always
+ * a consistent size but smaller (and hopefully easier to handle) than
+ * PG_CONTROL_FILE_SIZE. Bytes after sizeof(ControlFileData) are zeroed.
+ */
+ uint8_t controlFile[PG_CONTROL_MAX_SAFE_SIZE];
} BackupState;
-extern char *build_backup_content(BackupState *state,
- bool
ishistoryfile);
+extern char *build_backup_history_content(BackupState *state);
#endif /* XLOG_BACKUP_H */
diff --git a/src/include/access/xlogrecovery.h
b/src/include/access/xlogrecovery.h
index ee0bc742782..981266f7340 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -80,8 +80,7 @@ extern Size XLogRecoveryShmemSize(void);
extern void XLogRecoveryShmemInit(void);
extern void InitWalRecovery(ControlFileData *ControlFile,
- bool *wasShutdown_ptr,
bool *haveBackupLabel_ptr,
- bool
*haveTblspcMap_ptr);
+ bool *wasShutdown_ptr,
bool *haveTblspcMap_ptr);
extern void PerformWalRecovery(void);
/*
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 2ae72e3b266..258da052563 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -146,6 +146,9 @@ typedef struct ControlFileData
* to disk, we mustn't start up until we reach X again. Zero when not
* doing archive recovery.
*
+ * backupCheckPoint is the backup start checkpoint and is set to zero
after
+ * recovery is initialized.
+ *
* backupStartPoint is the redo pointer of the backup start checkpoint,
if
* we are recovering from an online backup and haven't reached the end
of
* backup yet. It is reset to zero when the end of backup is reached,
and
@@ -160,14 +163,27 @@ typedef struct ControlFileData
* pg_control which was backed up last. It is reset to zero when the end
* of backup is reached, and we mustn't start up before that.
*
+ * backupRecoveryRequired indicates that the pg_control file was
provided
+ * by a backup or pg_rewind and recovery settings need to be copied. It
will
+ * be set to false when the settings have been copied.
+ *
+ * backupFromStandby indicates that the backup was taken on a standby. It is
+ * required to initialize recovery and set to false afterwards.
+ *
* If backupEndRequired is true, we know for sure that we're restoring
* from a backup, and must see a backup-end record before we can safely
- * start up.
+ * start up. Currently backupEndRequired should only be false if
recovery
+ * settings were configured by pg_rewind, which does not require an end
+ * point.
*/
XLogRecPtr minRecoveryPoint;
TimeLineID minRecoveryPointTLI;
+ XLogRecPtr backupCheckPoint;
XLogRecPtr backupStartPoint;
+ TimeLineID backupStartPointTLI;
XLogRecPtr backupEndPoint;
+ bool backupRecoveryRequired;
+ bool backupFromStandby;
bool backupEndRequired;
/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f14aed422a7..cc8156c57e7 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6413,13 +6413,17 @@
prosrc => 'pg_terminate_backend' },
{ oid => '2172', descr => 'prepare for taking an online backup',
proname => 'pg_backup_start', provolatile => 'v', proparallel => 'r',
- prorettype => 'pg_lsn', proargtypes => 'text bool',
+ prorettype => 'record', proargtypes => 'text bool',
+ proallargtypes => '{text,bool,pg_lsn,int8,timestamptz}',
+ proargmodes => '{i,i,o,o,o}',
+ proargnames => '{label,fast,lsn,timeline_id,start}',
prosrc => 'pg_backup_start' },
{ oid => '2739', descr => 'finish taking an online backup',
proname => 'pg_backup_stop', provolatile => 'v', proparallel => 'r',
prorettype => 'record', proargtypes => 'bool',
- proallargtypes => '{bool,pg_lsn,text,text}', proargmodes => '{i,o,o,o}',
- proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile}',
+ proallargtypes => '{bool,bytea,text,pg_lsn,int8,timestamptz}',
+ proargmodes => '{i,o,o,o,o,o}',
+ proargnames =>
'{wait_for_archive,pg_control_file,tablespace_map_file,lsn,timeline_id,stop}',
prosrc => 'pg_backup_stop' },
{ oid => '3436', descr => 'promote standby server',
proname => 'pg_promote', provolatile => 'v', prorettype => 'bool',