On 11/20/23 15:08, David Steele wrote:
On 11/20/23 14:27, Andres Freund wrote:

I've wondered whether it's worth also adding an explicit message just after
ReachedEndOfBackup(), but it seems far less urgent due to the existing
"consistent recovery state reached at %X/%X" message.

I think the current message is sufficient, but what do you have in mind?

Well, the consistency message is emitted after every restart. Whereas a single instance only should go through backup recovery once. So it seems worthwhile
to differentiate the two in log messages.

Ah, right. That works for me, then.

Any status on this patch? If we do back patch it would be nice to see this in the upcoming minor releases. I'm in favor of a back patch, as I think this is minimally invasive and would be very useful for debugging recovery issues.

I like the phrasing you demonstrated in [1] but doesn't seem like there's a new patch for that, so I have attached one.

Happy to do whatever else I can to get this across the line.

Regards,
-David

---

[1] https://www.postgresql.org/message-id/20231120183633.c4lhoq4hld4u56dd%40awork3.anarazel.de
diff --git a/src/backend/access/transam/xlogrecovery.c 
b/src/backend/access/transam/xlogrecovery.c
index 1b48d7171a..c0918e6c75 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -603,6 +603,22 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
*wasShutdown_ptr,
                if (StandbyModeRequested)
                        EnableStandbyMode();
 
+               /*
+                * Omitting backup_label when creating a new replica, PITR node 
etc.
+                * unfortunately is a common cause of corruption. Logging that
+                * backup_label was used makes it a bit easier to exclude that 
as the
+                * cause of observed corruption.
+                *
+                * Do so before we try to read the checkpoint record (which can 
fail),
+                * as otherwise it can be hard to understand why a checkpoint 
other
+                * than ControlFile->checkPoint is used.
+                */
+               ereport(LOG,
+                               (errmsg("starting backup recovery with redo LSN 
%X/%X, checkpoint LSN %X/%X, on timeline ID %u",
+                                               LSN_FORMAT_ARGS(RedoStartLSN),
+                                               LSN_FORMAT_ARGS(CheckPointLoc),
+                                               CheckPointTLI)));
+
                /*
                 * When a backup_label file is present, we want to roll forward 
from
                 * the checkpoint it identifies, rather than using pg_control.
@@ -742,6 +758,16 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
*wasShutdown_ptr,
                                EnableStandbyMode();
                }
 
+               /*
+                * For the same reason as when starting up with backup_label 
present,
+                * emit a log message when we continue initializing from a base
+                * backup.
+                */
+               if (ControlFile->backupStartPoint != InvalidXLogRecPtr)
+                       ereport(LOG,
+                                       (errmsg("restarting backup recovery 
with redo LSN %X/%X",
+                                                       
LSN_FORMAT_ARGS(ControlFile->backupStartPoint))));
+
                /* Get the last valid checkpoint record. */
                CheckPointLoc = ControlFile->checkPoint;
                CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;
@@ -2155,6 +2181,8 @@ CheckRecoveryConsistency(void)
        if (!XLogRecPtrIsInvalid(backupEndPoint) &&
                backupEndPoint <= lastReplayedEndRecPtr)
        {
+               XLogRecPtr oldBackupStartPoint = backupStartPoint;
+
                elog(DEBUG1, "end of backup reached");
 
                /*
@@ -2165,6 +2193,10 @@ CheckRecoveryConsistency(void)
                backupStartPoint = InvalidXLogRecPtr;
                backupEndPoint = InvalidXLogRecPtr;
                backupEndRequired = false;
+
+               ereport(LOG,
+                               (errmsg("completed backup recovery with redo 
LSN %X/%X",
+                                               
LSN_FORMAT_ARGS(oldBackupStartPoint))));
        }
 
        /*

Reply via email to