On Wed, Dec 8, 2021 at 2:08 PM Michael Paquier <mich...@paquier.xyz> wrote:
> > Do you want me to add "received from stream" in general?
>
> "from stream" would be fine, IMHO.

Done in the attached v5 patch.

> > How about we invent a new GUC log_recovery, with default set to false,
> > similar to log_checkpoints? And, these messages can be like:
> > ereport(log_recovery ? LOG : DEBUG1, .....)? This GUC can be useful to
> > add some recovery stats in future as well. Thoughts?
>
> This discussion does not justify a new GUC at this stage IMO.

Having "log_recovery" GUC gives the flexibility to the users to enable
the recovery logs added by this patch if they are concerned otherwise
the logs just be at DEBUG1 level. Setting log_min_messages level to
DEBUG1 would definitely flood the server logs.

> While quickly testing, I got reminded that the LOG for a segment
> retrieved from the local pg_wal would generate some noise when running
> for example the bootstrap process.  Is this one really interesting
> compared to the two others?

If we just have the archive related message at LOG level and others at
DEBUG1 level, it is a kind of partial recovery log information that
one has in the log files, missing recovery messages from the pg_wal
directory in the logs.

Another option is to just have all the messages (added in the patch)
at DEBUG1 level, let's users deal with it.

IMO, the new GUC "log_recovery" is way better as it can be extended to
include some recovery stats info as well.

Thoughts?

Regards,
Bharath Rupireddy.
From a486f3643cc7b84bae3cd6a72cd4d3323c224043 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 8 Dec 2021 09:01:49 +0000
Subject: [PATCH v5] add log messages for recovery

The messages at LOG level would be helpful to know what's happening
with the system while in recovery. Although these messages seem to
be filling up the server logs, but the advantages we gain with them
are plenty. Of course, having a good log consumption and rotation
mechanism will help the server logs not fill up the disk.

These LOG messages will help us know how much time a restore command
takes to fetch the WAL file, what is the current WAL file the server
is recovering and where is it recovering from.

As a developer or admin, one can monitor these logs and do bunch of
things:
1) see how many WAL files left to be recovered by looking at the WAL
files in the archive location or pg_wal directory or from primary
2) provide an approximate estimation of when the server will come up
or how much more the recovery takes by looking at these previous LOG
messages, one can know the number of WAL files that server recovered
over a minute and with the help of left-over WAL files calculated.
---
 src/backend/access/transam/xlog.c        | 30 +++++++++++++++++++++++-
 src/backend/access/transam/xlogarchive.c |  2 +-
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d894af310a..d799e12a3a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3787,6 +3787,10 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 					 xlogfname);
 			set_ps_display(activitymsg);
 
+			ereport(LOG,
+					(errmsg("waiting for WAL segment \"%s\" from archive",
+							xlogfname)));
+
 			if (!RestoreArchivedFile(path, xlogfname,
 									 "RECOVERYXLOG",
 									 wal_segment_size,
@@ -3829,6 +3833,31 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 				 xlogfname);
 		set_ps_display(activitymsg);
 
+		switch (source)
+		{
+			case XLOG_FROM_ARCHIVE:
+				ereport(LOG,
+						(errmsg("recovering WAL segment \"%s\" received from archive",
+								xlogfname)));
+				break;
+			case XLOG_FROM_PG_WAL:
+				ereport(LOG,
+						(errmsg("recovering WAL segment \"%s\" received from pg_wal",
+								xlogfname)));
+				break;
+			case XLOG_FROM_STREAM:
+				ereport(LOG,
+						(errmsg("recovering WAL segment \"%s\" received from stream",
+								xlogfname)));
+				break;
+			default:
+				/*
+				 * Intentionally doing nothing here as invalid source check has
+				 * already been perofrmed at the beginning of this function.
+				 */
+				break;
+		}
+
 		/* Track source of data in assorted state variables */
 		readSource = source;
 		XLogReceiptSource = source;
@@ -3919,7 +3948,6 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, XLogSource source)
 							  XLOG_FROM_ARCHIVE, true);
 			if (fd != -1)
 			{
-				elog(DEBUG1, "got WAL segment from archive");
 				if (!expectedTLEs)
 					expectedTLEs = tles;
 				return fd;
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 4ddeac1fb9..1f724f4acf 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -214,7 +214,7 @@ RestoreArchivedFile(char *path, const char *xlogfname,
 			else
 			{
 				ereport(LOG,
-						(errmsg("restored log file \"%s\" from archive",
+						(errmsg("restored WAL segment \"%s\" from archive",
 								xlogfname)));
 				strcpy(path, xlogpath);
 				return true;
-- 
2.25.1

Reply via email to