On Mon, Nov 14, 2022 at 9:31 PM Robert Haas <robertmh...@gmail.com> wrote:
>
> On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs
> <simon.ri...@enterprisedb.com> wrote:
> > > Whilte at it, I noticed that we report redo progress for PITR, but we
> > > don't report when standby enters archive recovery mode, say due to a
> > > failure in the connection to primary or after the promote signal is
> > > found. Isn't it useful to report in this case as well to know the
> > > recovery progress?
> >
> > I think your patch disables progress too early, effectively turning
> > off the standby progress feature. The purpose was to report on things
> > that take long periods during recovery, not just prior to recovery.
> >
> > I would advocate that we disable progress only while waiting, as I've done 
> > here:
> > https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com
>
> Maybe I'm confused here, but I think that, on a standby, startup
> progress messages are only printed until the main redo loop is
> reached. Otherwise, we would print a message on a standby every 10s
> forever, which seems like a thing that most users would not like. So I
> think that Bharath has the right idea here.

Yes, the idea is to disable the timeout on standby completely since we
actually don't report any recovery progress. Keeping it enabled,
unnecessarily calls startup_progress_timeout_handler() every
log_startup_progress_interval seconds i.e. 10 seconds. That's the
intention of the patch.

> I don't think that his patch is right in detail, though. I don't think
> the call to disable_timeout() needs to be conditional,

Yes, disable_timeout() returns if the timeout was previously disabled
i.e. all_timeouts[STARTUP_PROGRESS_TIMEOUT].active is false. I changed
it in the v2 patch.

> and I don't
> think the Assert is correct.

You're right. My intention there was to check if the timeout is
enabled while ereport_startup_progress() is called. In the v2 patch,
when we actually disable the timeout startup_progress_timer_expired
gets set to false and has_startup_progress_timeout_expired() just
returns in such a case.

> Also, I think that your patch has the
> right idea in encapsulating the disable_timeout() call inside a new
> function disable_startup_progress_timeout(), rather than having the
> details known directly by xlogrecovery.c.

Yes, I too like Simon's idea of {enable,
disable}_startup_progress_timeout functions, I utilized them in the v2
patch here.

I actually want to get rid of begin_startup_progress_phase() which now
becomes a thin wrapper calling disable and enable functions and ensure
the callers do follow enable()-report_progress()-disable() way to use
the feature, however I didn't code for that as it needs changes across
many files. If okay, I can code for that too. Thoughts?

Please review the v2 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 0edba6df408bbde80bc2a955990021995f3ec07e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 15 Nov 2022 13:01:33 +0000
Subject: [PATCH v2] Disable STARTUP_PROGRESS_TIMEOUT in standby mode

In standby mode, we actually don't report progress of recovery to
not bloat server logs as the standby is always in recovery unless
promoted. However, startup_progress_timeout_handler() gets called
every log_startup_progress_interval seconds, which is unnecessary.

Therefore, we disable the timeout in standby mode.

Reported-by: Thomas Munro
Authors: Bharath Rupireddy, Simon Riggs
Reviewed-by: Robert Haas
Backpatch-through: 15
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c |  9 ++++++-
 src/backend/postmaster/startup.c          | 29 ++++++++++++++++++++---
 src/include/postmaster/startup.h          |  2 ++
 3 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index cb07694aea..20c0dfd9d1 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1653,7 +1653,14 @@ PerformWalRecovery(void)
 		 */
 		do
 		{
-			if (!StandbyMode)
+			/*
+			 * To avoid server log bloat, we don't report recovery progress in
+			 * a standby as it will always be in recovery unless promoted. We
+			 * also disable the timeout as we don't need it anymore.
+			 */
+			if (StandbyMode)
+				disable_startup_progress_timeout();
+			else
 				ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X",
 										 LSN_FORMAT_ARGS(xlogreader->ReadRecPtr));
 
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index f99186eab7..2705e425e8 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -314,11 +314,22 @@ startup_progress_timeout_handler(void)
 	startup_progress_timer_expired = true;
 }
 
+void
+disable_startup_progress_timeout(void)
+{
+	/* Feature is disabled. */
+	if (log_startup_progress_interval == 0)
+		return;
+
+	disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
+	startup_progress_timer_expired = false;
+}
+
 /*
  * Set the start timestamp of the current operation and enable the timeout.
  */
 void
-begin_startup_progress_phase(void)
+enable_startup_progress_timeout(void)
 {
 	TimestampTz fin_time;
 
@@ -326,8 +337,6 @@ begin_startup_progress_phase(void)
 	if (log_startup_progress_interval == 0)
 		return;
 
-	disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
-	startup_progress_timer_expired = false;
 	startup_progress_phase_start_time = GetCurrentTimestamp();
 	fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time,
 										   log_startup_progress_interval);
@@ -335,6 +344,20 @@ begin_startup_progress_phase(void)
 						 log_startup_progress_interval);
 }
 
+/*
+ * A thin wrapper to first disable and then enable the startup progress timeout.
+ */
+void
+begin_startup_progress_phase(void)
+{
+	/* Feature is disabled. */
+	if (log_startup_progress_interval == 0)
+		return;
+
+	disable_startup_progress_timeout();
+	enable_startup_progress_timeout();
+}
+
 /*
  * Report whether startup progress timeout has occurred. Reset the timer flag
  * if it did, set the elapsed time to the out parameters and return true,
diff --git a/src/include/postmaster/startup.h b/src/include/postmaster/startup.h
index d66ec1fcb1..12897531c3 100644
--- a/src/include/postmaster/startup.h
+++ b/src/include/postmaster/startup.h
@@ -32,6 +32,8 @@ extern void PostRestoreCommand(void);
 extern bool IsPromoteSignaled(void);
 extern void ResetPromoteSignaled(void);
 
+extern void enable_startup_progress_timeout(void);
+extern void disable_startup_progress_timeout(void);
 extern void begin_startup_progress_phase(void);
 extern void startup_progress_timeout_handler(void);
 extern bool has_startup_progress_timeout_expired(long *secs, int *usecs);
-- 
2.34.1

Reply via email to