On Mon, Nov 14, 2022 at 9:31 PM Robert Haas <robertmh...@gmail.com> wrote: > > On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs > <simon.ri...@enterprisedb.com> wrote: > > > Whilte at it, I noticed that we report redo progress for PITR, but we > > > don't report when standby enters archive recovery mode, say due to a > > > failure in the connection to primary or after the promote signal is > > > found. Isn't it useful to report in this case as well to know the > > > recovery progress? > > > > I think your patch disables progress too early, effectively turning > > off the standby progress feature. The purpose was to report on things > > that take long periods during recovery, not just prior to recovery. > > > > I would advocate that we disable progress only while waiting, as I've done > > here: > > https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com > > Maybe I'm confused here, but I think that, on a standby, startup > progress messages are only printed until the main redo loop is > reached. Otherwise, we would print a message on a standby every 10s > forever, which seems like a thing that most users would not like. So I > think that Bharath has the right idea here.
Yes, the idea is to disable the timeout on standby completely since we actually don't report any recovery progress. Keeping it enabled, unnecessarily calls startup_progress_timeout_handler() every log_startup_progress_interval seconds i.e. 10 seconds. That's the intention of the patch. > I don't think that his patch is right in detail, though. I don't think > the call to disable_timeout() needs to be conditional, Yes, disable_timeout() returns if the timeout was previously disabled i.e. all_timeouts[STARTUP_PROGRESS_TIMEOUT].active is false. I changed it in the v2 patch. > and I don't > think the Assert is correct. You're right. My intention there was to check if the timeout is enabled while ereport_startup_progress() is called. In the v2 patch, when we actually disable the timeout startup_progress_timer_expired gets set to false and has_startup_progress_timeout_expired() just returns in such a case. > Also, I think that your patch has the > right idea in encapsulating the disable_timeout() call inside a new > function disable_startup_progress_timeout(), rather than having the > details known directly by xlogrecovery.c. Yes, I too like Simon's idea of {enable, disable}_startup_progress_timeout functions, I utilized them in the v2 patch here. I actually want to get rid of begin_startup_progress_phase() which now becomes a thin wrapper calling disable and enable functions and ensure the callers do follow enable()-report_progress()-disable() way to use the feature, however I didn't code for that as it needs changes across many files. If okay, I can code for that too. Thoughts? Please review the v2 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
From 0edba6df408bbde80bc2a955990021995f3ec07e Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 15 Nov 2022 13:01:33 +0000 Subject: [PATCH v2] Disable STARTUP_PROGRESS_TIMEOUT in standby mode In standby mode, we actually don't report progress of recovery to not bloat server logs as the standby is always in recovery unless promoted. However, startup_progress_timeout_handler() gets called every log_startup_progress_interval seconds, which is unnecessary. Therefore, we disable the timeout in standby mode. Reported-by: Thomas Munro Authors: Bharath Rupireddy, Simon Riggs Reviewed-by: Robert Haas Backpatch-through: 15 Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com --- src/backend/access/transam/xlogrecovery.c | 9 ++++++- src/backend/postmaster/startup.c | 29 ++++++++++++++++++++--- src/include/postmaster/startup.h | 2 ++ 3 files changed, 36 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index cb07694aea..20c0dfd9d1 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -1653,7 +1653,14 @@ PerformWalRecovery(void) */ do { - if (!StandbyMode) + /* + * To avoid server log bloat, we don't report recovery progress in + * a standby as it will always be in recovery unless promoted. We + * also disable the timeout as we don't need it anymore. + */ + if (StandbyMode) + disable_startup_progress_timeout(); + else ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X", LSN_FORMAT_ARGS(xlogreader->ReadRecPtr)); diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index f99186eab7..2705e425e8 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -314,11 +314,22 @@ startup_progress_timeout_handler(void) startup_progress_timer_expired = true; } +void +disable_startup_progress_timeout(void) +{ + /* Feature is disabled. */ + if (log_startup_progress_interval == 0) + return; + + disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); + startup_progress_timer_expired = false; +} + /* * Set the start timestamp of the current operation and enable the timeout. */ void -begin_startup_progress_phase(void) +enable_startup_progress_timeout(void) { TimestampTz fin_time; @@ -326,8 +337,6 @@ begin_startup_progress_phase(void) if (log_startup_progress_interval == 0) return; - disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); - startup_progress_timer_expired = false; startup_progress_phase_start_time = GetCurrentTimestamp(); fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time, log_startup_progress_interval); @@ -335,6 +344,20 @@ begin_startup_progress_phase(void) log_startup_progress_interval); } +/* + * A thin wrapper to first disable and then enable the startup progress timeout. + */ +void +begin_startup_progress_phase(void) +{ + /* Feature is disabled. */ + if (log_startup_progress_interval == 0) + return; + + disable_startup_progress_timeout(); + enable_startup_progress_timeout(); +} + /* * Report whether startup progress timeout has occurred. Reset the timer flag * if it did, set the elapsed time to the out parameters and return true, diff --git a/src/include/postmaster/startup.h b/src/include/postmaster/startup.h index d66ec1fcb1..12897531c3 100644 --- a/src/include/postmaster/startup.h +++ b/src/include/postmaster/startup.h @@ -32,6 +32,8 @@ extern void PostRestoreCommand(void); extern bool IsPromoteSignaled(void); extern void ResetPromoteSignaled(void); +extern void enable_startup_progress_timeout(void); +extern void disable_startup_progress_timeout(void); extern void begin_startup_progress_phase(void); extern void startup_progress_timeout_handler(void); extern bool has_startup_progress_timeout_expired(long *secs, int *usecs); -- 2.34.1