Re: when the startup process doesn't (logging startup delays)
On Wed, Feb 8, 2023 at 11:13 PM Tom Lane wrote: > > Bharath Rupireddy writes: > > Thanks a lot Robert for taking care of this. The patch is committed on > > HEAD and reverted on v15. Now that the minor version branches are > > stamped, is it time for us to get this to v15? I can then close the CF > > entry - https://commitfest.postgresql.org/42/4012/. > > No objection to un-reverting from here. Thanks Robert, Tom. It is now un-reverted on v15. I've closed the CF entry. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: when the startup process doesn't (logging startup delays)
Bharath Rupireddy writes: > Thanks a lot Robert for taking care of this. The patch is committed on > HEAD and reverted on v15. Now that the minor version branches are > stamped, is it time for us to get this to v15? I can then close the CF > entry - https://commitfest.postgresql.org/42/4012/. No objection to un-reverting from here. regards, tom lane
Re: when the startup process doesn't (logging startup delays)
On Mon, Feb 6, 2023 at 9:39 PM Robert Haas wrote: > > On Mon, Feb 6, 2023 at 11:07 AM Tom Lane wrote: > > Umm ... is this really the sort of patch to be committing on a > > release wrap day? > > Oh, shoot, I wasn't thinking about that. Would you like me to revert > it in v15 for now? Thanks a lot Robert for taking care of this. The patch is committed on HEAD and reverted on v15. Now that the minor version branches are stamped, is it time for us to get this to v15? I can then close the CF entry - https://commitfest.postgresql.org/42/4012/. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: when the startup process doesn't (logging startup delays)
On Mon, Feb 6, 2023 at 11:15 AM Tom Lane wrote: > Robert Haas writes: > > On Mon, Feb 6, 2023 at 11:07 AM Tom Lane wrote: > >> Umm ... is this really the sort of patch to be committing on a > >> release wrap day? > > > Oh, shoot, I wasn't thinking about that. Would you like me to revert > > it in v15 for now? > > Yeah, seems like the safest course. I wouldn't object to it going in > after the release is over, but right now there's zero margin for error. Done. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
Robert Haas writes: > On Mon, Feb 6, 2023 at 11:07 AM Tom Lane wrote: >> Umm ... is this really the sort of patch to be committing on a >> release wrap day? > Oh, shoot, I wasn't thinking about that. Would you like me to revert > it in v15 for now? Yeah, seems like the safest course. I wouldn't object to it going in after the release is over, but right now there's zero margin for error. regards, tom lane
Re: when the startup process doesn't (logging startup delays)
On Mon, Feb 6, 2023 at 11:07 AM Tom Lane wrote: > Umm ... is this really the sort of patch to be committing on a > release wrap day? Oh, shoot, I wasn't thinking about that. Would you like me to revert it in v15 for now? -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
Robert Haas writes: > On Thu, Feb 2, 2023 at 11:22 PM Bharath Rupireddy > wrote: >> I took the v4 patch, added some comments and attached it as the v7 >> patch here. Please find it. > Committed and back-patched to v15. Umm ... is this really the sort of patch to be committing on a release wrap day? regards, tom lane
Re: when the startup process doesn't (logging startup delays)
On Thu, Feb 2, 2023 at 11:22 PM Bharath Rupireddy wrote: > I took the v4 patch, added some comments and attached it as the v7 > patch here. Please find it. Committed and back-patched to v15. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Fri, Feb 3, 2023 at 2:29 AM Robert Haas wrote: > Thanks for looking at this. > On Tue, Nov 22, 2022 at 6:05 AM Bharath Rupireddy > wrote: > > If we place just the Assert(!StandbyMode); in > > enable_startup_progress_timeout(), it fails for > > begin_startup_progress_phase() in ResetUnloggedRelations() because the > > InitWalRecovery(), that sets StandbyMode to true, is called before > > ResetUnloggedRelations() . However, with the if (StandbyMode) { > > return; }, we fail to report progress of ResetUnloggedRelations() in a > > standby, which isn't a good idea at all because we only want to > > disable the timeout during the recovery's main loop. > > Ugh. Well, in that case, I guess my vote is to forget about this whole > Assert business and just commit what you had in v4. Does that work for > you? Yes, it seems reasonable to me. > Protecting against specifically the situation where we're in the > standby's main redo apply loop is not really what I had in mind here, > but this is already sort of weirdly complicated-looking, and making it > more weirdly complicated-looking to achieve the kind of protection > that I had in mind doesn't really seem worth it. IMHO, the responsibility of whether or not to report progress of any operation can lie with the developers writing features using the progress reporting mechanism. The progress reporting mechanism can just be independent of all that. I took the v4 patch, added some comments and attached it as the v7 patch here. Please find it. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From 0ff4bc13db8abf70f8da158d11eecdfbe7a7e199 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Fri, 3 Feb 2023 03:30:49 + Subject: [PATCH v7] Disable STARTUP_PROGRESS_TIMEOUT in standby mode In standby mode, we actually don't report progress of recovery to not bloat server logs as the standby is always in recovery unless promoted. However, startup_progress_timeout_handler() gets called every log_startup_progress_interval seconds, which is unnecessary. Therefore, we disable the timeout in standby mode. Reported-by: Thomas Munro Author: Bharath Rupireddy Reviewed-by: Robert Haas, Simon Riggs Reviewed-by: Thomas Munro Backpatch-through: 15 Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com --- src/backend/access/transam/xlogrecovery.c | 25 --- src/backend/postmaster/startup.c | 30 --- src/include/postmaster/startup.h | 2 ++ 3 files changed, 51 insertions(+), 6 deletions(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 2a5352f879..dbe9394762 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -385,6 +385,7 @@ static bool recoveryStopAfter; /* prototypes for local functions */ static void ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *replayTLI); +static void EnableStandbyMode(void); static void readRecoverySignalFile(void); static void validateRecoveryParameters(void); static bool read_backup_label(XLogRecPtr *checkPointLoc, @@ -469,6 +470,24 @@ XLogRecoveryShmemInit(void) ConditionVariableInit(&XLogRecoveryCtl->recoveryNotPausedCV); } +/* + * A thin wrapper to enable StandbyMode and do other preparatory work as + * needed. + */ +static void +EnableStandbyMode(void) +{ + StandbyMode = true; + + /* + * To avoid server log bloat, we don't report recovery progress in a + * standby as it will always be in recovery unless promoted. We disable + * startup progress timeout in standby mode to avoid calling + * startup_progress_timeout_handler() unnecessarily. + */ + disable_startup_progress_timeout(); +} + /* * Prepare the system for WAL recovery, if needed. * @@ -602,7 +621,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, */ InArchiveRecovery = true; if (StandbyModeRequested) - StandbyMode = true; + EnableStandbyMode(); /* * When a backup_label file is present, we want to roll forward from @@ -739,7 +758,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, { InArchiveRecovery = true; if (StandbyModeRequested) -StandbyMode = true; +EnableStandbyMode(); } /* Get the last valid checkpoint record. */ @@ -3117,7 +3136,7 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode, (errmsg_internal("reached end of WAL in pg_wal, entering archive recovery"))); InArchiveRecovery = true; if (StandbyModeRequested) - StandbyMode = true; + EnableStandbyMode(); SwitchIntoArchiveRecovery(xlogreader->EndRecPtr, replayTLI); minRecoveryPoint = xlogreader->EndRecPtr; diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index bcd23542f1..efc2580536 100644 --- a/src/backend/pos
Re: when the startup process doesn't (logging startup delays)
On Tue, Nov 22, 2022 at 6:05 AM Bharath Rupireddy wrote: > If we place just the Assert(!StandbyMode); in > enable_startup_progress_timeout(), it fails for > begin_startup_progress_phase() in ResetUnloggedRelations() because the > InitWalRecovery(), that sets StandbyMode to true, is called before > ResetUnloggedRelations() . However, with the if (StandbyMode) { > return; }, we fail to report progress of ResetUnloggedRelations() in a > standby, which isn't a good idea at all because we only want to > disable the timeout during the recovery's main loop. Ugh. Well, in that case, I guess my vote is to forget about this whole Assert business and just commit what you had in v4. Does that work for you? Protecting against specifically the situation where we're in the standby's main redo apply loop is not really what I had in mind here, but this is already sort of weirdly complicated-looking, and making it more weirdly complicated-looking to achieve the kind of protection that I had in mind doesn't really seem worth it. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Mon, Nov 21, 2022 at 10:37 PM Robert Haas wrote: > > On Sun, Nov 20, 2022 at 9:20 PM Kyotaro Horiguchi > wrote: > > I prefer Robert's approach as it is more robust for future changes and > > simple. I prefer to avoid this kind of piggy-backing and it doesn't > > seem to be needed in this case. XLogShutdownWalRcv() looks like a > > similar case to me and honestly I don't like it in the sense of > > robustness but it is simpler than checking walreceiver status at every > > site that refers to the flag. > > I don't understand what you want changed. Can you be more specific > about what you mean by "Robert's approach"? > > I don't agree with Bharath's logic for preferring an if-test to an > Assert. There are some cases where we think we've written the code > correctly but also recognize that the logic is complicated enough that > something might slip through the cracks. Then, a runtime check makes > sense, because otherwise something real bad might happen on a > production instance. But here, I don't think that's the main risk. I > think the main risk is that some future patch tries to add code that > should print startup log messages later on. That would probably be a > coding mistake, and Assert would alert the patch author about that, > whereas amending the if-test would just make the code do something > differently then the author intended. > > But I don't feel super-strongly about this, which is why I mentioned > both options in my previous email. I'm not clear on whether you are > expressing an opinion on this point in particular or something more > general. If we place just the Assert(!StandbyMode); in enable_startup_progress_timeout(), it fails for begin_startup_progress_phase() in ResetUnloggedRelations() because the InitWalRecovery(), that sets StandbyMode to true, is called before ResetUnloggedRelations() . However, with the if (StandbyMode) { return; }, we fail to report progress of ResetUnloggedRelations() in a standby, which isn't a good idea at all because we only want to disable the timeout during the recovery's main loop. I introduced an assert-only function returning true when we're in standby's main redo apply loop and modified the assertion to be Assert(!InStandbyMainRedoApplyLoop()); works better but it complicates the code a bit. FWIW, I'm attaching the v6 patch with this change. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com v6-0001-Disable-STARTUP_PROGRESS_TIMEOUT-in-standby-mode.patch Description: Binary data
Re: when the startup process doesn't (logging startup delays)
On Sun, Nov 20, 2022 at 9:20 PM Kyotaro Horiguchi wrote: > I prefer Robert's approach as it is more robust for future changes and > simple. I prefer to avoid this kind of piggy-backing and it doesn't > seem to be needed in this case. XLogShutdownWalRcv() looks like a > similar case to me and honestly I don't like it in the sense of > robustness but it is simpler than checking walreceiver status at every > site that refers to the flag. I don't understand what you want changed. Can you be more specific about what you mean by "Robert's approach"? I don't agree with Bharath's logic for preferring an if-test to an Assert. There are some cases where we think we've written the code correctly but also recognize that the logic is complicated enough that something might slip through the cracks. Then, a runtime check makes sense, because otherwise something real bad might happen on a production instance. But here, I don't think that's the main risk. I think the main risk is that some future patch tries to add code that should print startup log messages later on. That would probably be a coding mistake, and Assert would alert the patch author about that, whereas amending the if-test would just make the code do something differently then the author intended. But I don't feel super-strongly about this, which is why I mentioned both options in my previous email. I'm not clear on whether you are expressing an opinion on this point in particular or something more general. Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
At Fri, 18 Nov 2022 15:55:00 +0530, Bharath Rupireddy wrote in > On Fri, Nov 18, 2022 at 12:42 AM Robert Haas wrote: > > > > On Thu, Nov 17, 2022 at 2:22 AM Bharath Rupireddy > > wrote: > > > Duplication is a problem that I agree with and I have an idea here - > > > how about introducing a new function, say EnableStandbyMode() that > > > sets StandbyMode to true and disables the startup progress timeout, > > > something like the attached? > > > > That works for me, more or less. But I think that > > enable_startup_progress_timeout should be amended to either say if > > (log_startup_progress_interval == 0 || StandbyMode) return; or else it > > should at least Assert(!StandbyMode), so that we can't accidentally > > re-enable the timer after we shut it off. > > Hm, an assertion may not help in typical production servers running on > non-assert builds. I've modified the if condition, please see the > attached v5 patch. I prefer Robert's approach as it is more robust for future changes and simple. I prefer to avoid this kind of piggy-backing and it doesn't seem to be needed in this case. XLogShutdownWalRcv() looks like a similar case to me and honestly I don't like it in the sense of robustness but it is simpler than checking walreceiver status at every site that refers to the flag. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: when the startup process doesn't (logging startup delays)
On Fri, Nov 18, 2022 at 12:42 AM Robert Haas wrote: > > On Thu, Nov 17, 2022 at 2:22 AM Bharath Rupireddy > wrote: > > Duplication is a problem that I agree with and I have an idea here - > > how about introducing a new function, say EnableStandbyMode() that > > sets StandbyMode to true and disables the startup progress timeout, > > something like the attached? > > That works for me, more or less. But I think that > enable_startup_progress_timeout should be amended to either say if > (log_startup_progress_interval == 0 || StandbyMode) return; or else it > should at least Assert(!StandbyMode), so that we can't accidentally > re-enable the timer after we shut it off. Hm, an assertion may not help in typical production servers running on non-assert builds. I've modified the if condition, please see the attached v5 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From 758d4cc81b1c7ed087969df07a958e92a374ee0f Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Fri, 18 Nov 2022 09:53:15 + Subject: [PATCH v5] Disable STARTUP_PROGRESS_TIMEOUT in standby mode In standby mode, we actually don't report progress of recovery to not bloat server logs as the standby is always in recovery unless promoted. However, startup_progress_timeout_handler() gets called every log_startup_progress_interval seconds, which is unnecessary. Therefore, we disable the timeout in standby mode. Reported-by: Thomas Munro Authors: Bharath Rupireddy, Simon Riggs Reviewed-by: Robert Haas Backpatch-through: 15 Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com --- src/backend/access/transam/xlogrecovery.c | 21 --- src/backend/postmaster/startup.c | 33 +++ src/include/postmaster/startup.h | 2 ++ 3 files changed, 48 insertions(+), 8 deletions(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index cb07694aea..2feae1ebd3 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -386,6 +386,7 @@ static bool recoveryStopAfter; /* prototypes for local functions */ static void ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *replayTLI); +static void EnableStandbyMode(void); static void readRecoverySignalFile(void); static void validateRecoveryParameters(void); static bool read_backup_label(XLogRecPtr *checkPointLoc, @@ -470,6 +471,20 @@ XLogRecoveryShmemInit(void) ConditionVariableInit(&XLogRecoveryCtl->recoveryNotPausedCV); } +static void +EnableStandbyMode(void) +{ + StandbyMode = true; + + /* + * To avoid server log bloat, we don't report recovery progress in a + * standby as it will always be in recovery unless promoted. We disable + * startup progress timeout in standby mode to avoid calling + * startup_progress_timeout_handler() unnecessarily. + */ + disable_startup_progress_timeout(); +} + /* * Prepare the system for WAL recovery, if needed. * @@ -603,7 +618,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, */ InArchiveRecovery = true; if (StandbyModeRequested) - StandbyMode = true; + EnableStandbyMode(); /* * When a backup_label file is present, we want to roll forward from @@ -740,7 +755,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, { InArchiveRecovery = true; if (StandbyModeRequested) -StandbyMode = true; +EnableStandbyMode(); } /* Get the last valid checkpoint record. */ @@ -3115,7 +3130,7 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode, (errmsg_internal("reached end of WAL in pg_wal, entering archive recovery"))); InArchiveRecovery = true; if (StandbyModeRequested) - StandbyMode = true; + EnableStandbyMode(); SwitchIntoArchiveRecovery(xlogreader->EndRecPtr, replayTLI); minRecoveryPoint = xlogreader->EndRecPtr; diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index f99186eab7..4a46d2070c 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -314,20 +314,29 @@ startup_progress_timeout_handler(void) startup_progress_timer_expired = true; } +void +disable_startup_progress_timeout(void) +{ + /* Quick exit if feature is disabled. */ + if (log_startup_progress_interval == 0) + return; + + disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); + startup_progress_timer_expired = false; +} + /* * Set the start timestamp of the current operation and enable the timeout. */ void -begin_startup_progress_phase(void) +enable_startup_progress_timeout(void) { TimestampTz fin_time; - /* Feature is disabled. */ - if (log_startup_progress_interval == 0) + /* Quick exit if feature is disabled or we're in standby mode */ + if (log_startup_progress_interval == 0 || StandbyMode)
Re: when the startup process doesn't (logging startup delays)
On Thu, Nov 17, 2022 at 2:22 AM Bharath Rupireddy wrote: > Duplication is a problem that I agree with and I have an idea here - > how about introducing a new function, say EnableStandbyMode() that > sets StandbyMode to true and disables the startup progress timeout, > something like the attached? That works for me, more or less. But I think that enable_startup_progress_timeout should be amended to either say if (log_startup_progress_interval == 0 || StandbyMode) return; or else it should at least Assert(!StandbyMode), so that we can't accidentally re-enable the timer after we shut it off. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Thu, Nov 17, 2022 at 12:21 AM Robert Haas wrote: > > On Wed, Nov 16, 2022 at 1:47 AM Bharath Rupireddy > wrote: > > That can be done, only if we can disable the timeout in another place > > when the StandbyMode is set to true in ReadRecord(), that is, after > > the standby server finishes crash recovery and enters standby mode. > > Oh, interesting. I didn't realize that we would need to worry about that case. > > > I'm attaching the v3 patch for further review. Please find the CF > > entry here - https://commitfest.postgresql.org/41/4012/. > > I kind of dislike having to have logic for this in two places. Seems > like it could create future bugs. Duplication is a problem that I agree with and I have an idea here - how about introducing a new function, say EnableStandbyMode() that sets StandbyMode to true and disables the startup progress timeout, something like the attached? > How about the attached approach, instead? This way, the first time the > timer expires after we reach standby mode, we reactively disable it. Hm. I'm not really sure if it's a good idea. While it simplifies the code, the has_startup_progress_timeout_expired() gets called for every WAL record in standby mode. Isn't this an unnecessary thing? Currently, the if (!StandbyMode) condition blocks the function calls. And I'm also a little concerned that we move the StandbyMode variable to startup.c which so far tiled to xlogrecovery.c. Maybe these are not really concerns at all. Maybe others are okay with this approach. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From b25838215534488c53855a5cd5009fba09bd358a Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Thu, 17 Nov 2022 07:13:27 + Subject: [PATCH v4] Disable STARTUP_PROGRESS_TIMEOUT in standby mode In standby mode, we actually don't report progress of recovery to not bloat server logs as the standby is always in recovery unless promoted. However, startup_progress_timeout_handler() gets called every log_startup_progress_interval seconds, which is unnecessary. Therefore, we disable the timeout in standby mode. Reported-by: Thomas Munro Authors: Bharath Rupireddy, Simon Riggs Reviewed-by: Robert Haas Backpatch-through: 15 Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com --- src/backend/access/transam/xlogrecovery.c | 21 +--- src/backend/postmaster/startup.c | 29 --- src/include/postmaster/startup.h | 2 ++ 3 files changed, 46 insertions(+), 6 deletions(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index cb07694aea..2feae1ebd3 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -386,6 +386,7 @@ static bool recoveryStopAfter; /* prototypes for local functions */ static void ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *replayTLI); +static void EnableStandbyMode(void); static void readRecoverySignalFile(void); static void validateRecoveryParameters(void); static bool read_backup_label(XLogRecPtr *checkPointLoc, @@ -470,6 +471,20 @@ XLogRecoveryShmemInit(void) ConditionVariableInit(&XLogRecoveryCtl->recoveryNotPausedCV); } +static void +EnableStandbyMode(void) +{ + StandbyMode = true; + + /* + * To avoid server log bloat, we don't report recovery progress in a + * standby as it will always be in recovery unless promoted. We disable + * startup progress timeout in standby mode to avoid calling + * startup_progress_timeout_handler() unnecessarily. + */ + disable_startup_progress_timeout(); +} + /* * Prepare the system for WAL recovery, if needed. * @@ -603,7 +618,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, */ InArchiveRecovery = true; if (StandbyModeRequested) - StandbyMode = true; + EnableStandbyMode(); /* * When a backup_label file is present, we want to roll forward from @@ -740,7 +755,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr, { InArchiveRecovery = true; if (StandbyModeRequested) -StandbyMode = true; +EnableStandbyMode(); } /* Get the last valid checkpoint record. */ @@ -3115,7 +3130,7 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode, (errmsg_internal("reached end of WAL in pg_wal, entering archive recovery"))); InArchiveRecovery = true; if (StandbyModeRequested) - StandbyMode = true; + EnableStandbyMode(); SwitchIntoArchiveRecovery(xlogreader->EndRecPtr, replayTLI); minRecoveryPoint = xlogreader->EndRecPtr; diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index f99186eab7..2705e425e8 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -314,11 +314,22 @@ startup_progress_timeout_handler(void)
Re: when the startup process doesn't (logging startup delays)
On Thu, Nov 17, 2022 at 7:51 AM Robert Haas wrote: + * up, since standby mode is a state that is intendeded to persist typo Otherwise LGTM.
Re: when the startup process doesn't (logging startup delays)
On Wed, Nov 16, 2022 at 1:47 AM Bharath Rupireddy wrote: > That can be done, only if we can disable the timeout in another place > when the StandbyMode is set to true in ReadRecord(), that is, after > the standby server finishes crash recovery and enters standby mode. Oh, interesting. I didn't realize that we would need to worry about that case. > I'm attaching the v3 patch for further review. Please find the CF > entry here - https://commitfest.postgresql.org/41/4012/. I kind of dislike having to have logic for this in two places. Seems like it could create future bugs. How about the attached approach, instead? This way, the first time the timer expires after we reach standby mode, we reactively disable it. -- Robert Haas EDB: http://www.enterprisedb.com another-way.patch Description: Binary data
Re: when the startup process doesn't (logging startup delays)
On Wed, Nov 16, 2022 at 2:28 PM Simon Riggs wrote: > > On Wed, 16 Nov 2022 at 06:47, Bharath Rupireddy > wrote: > > > > On Tue, Nov 15, 2022 at 10:55 PM Robert Haas wrote: > > > > > > On Tue, Nov 15, 2022 at 8:33 AM Bharath Rupireddy > > > wrote: > > > > Please review the v2 patch. > > > > > > It seems to me that this will call disable_startup_progress_timeout > > > once per WAL record, which seems like an unnecessary expense. How > > > about leaving the code inside the loop just as we have it, and putting > > > if (StandbyMode) disable_startup_progress_timeout() before entering > > > the loop? > > > > That can be done, only if we can disable the timeout in another place > > when the StandbyMode is set to true in ReadRecord(), that is, after > > the standby server finishes crash recovery and enters standby mode. > > > > I'm attaching the v3 patch for further review. Please find the CF > > entry here - https://commitfest.postgresql.org/41/4012/. > > begin_startup_progress_phase() checks to see if feature is disabled > twice, so I think you can skip the check and just rely on the check in > enable(). Yes, I did that intentionally to avoid begin_startup_progress_phase() calling disable and enable functions when the feature is disabled. I'll leave it to the committer whether to retain it or delete it. > Otherwise, all good. Thanks. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: when the startup process doesn't (logging startup delays)
On Wed, 16 Nov 2022 at 06:47, Bharath Rupireddy wrote: > > On Tue, Nov 15, 2022 at 10:55 PM Robert Haas wrote: > > > > On Tue, Nov 15, 2022 at 8:33 AM Bharath Rupireddy > > wrote: > > > Please review the v2 patch. > > > > It seems to me that this will call disable_startup_progress_timeout > > once per WAL record, which seems like an unnecessary expense. How > > about leaving the code inside the loop just as we have it, and putting > > if (StandbyMode) disable_startup_progress_timeout() before entering > > the loop? > > That can be done, only if we can disable the timeout in another place > when the StandbyMode is set to true in ReadRecord(), that is, after > the standby server finishes crash recovery and enters standby mode. > > I'm attaching the v3 patch for further review. Please find the CF > entry here - https://commitfest.postgresql.org/41/4012/. begin_startup_progress_phase() checks to see if feature is disabled twice, so I think you can skip the check and just rely on the check in enable(). Otherwise, all good. -- Simon Riggshttp://www.EnterpriseDB.com/
Re: when the startup process doesn't (logging startup delays)
On Tue, Nov 15, 2022 at 10:55 PM Robert Haas wrote: > > On Tue, Nov 15, 2022 at 8:33 AM Bharath Rupireddy > wrote: > > Please review the v2 patch. > > It seems to me that this will call disable_startup_progress_timeout > once per WAL record, which seems like an unnecessary expense. How > about leaving the code inside the loop just as we have it, and putting > if (StandbyMode) disable_startup_progress_timeout() before entering > the loop? That can be done, only if we can disable the timeout in another place when the StandbyMode is set to true in ReadRecord(), that is, after the standby server finishes crash recovery and enters standby mode. I'm attaching the v3 patch for further review. Please find the CF entry here - https://commitfest.postgresql.org/41/4012/. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From 612f17d5e81181b69c3d711524de840c88cb3b4a Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Wed, 16 Nov 2022 06:20:15 + Subject: [PATCH v3] Disable STARTUP_PROGRESS_TIMEOUT in standby mode In standby mode, we actually don't report progress of recovery to not bloat server logs as the standby is always in recovery unless promoted. However, startup_progress_timeout_handler() gets called every log_startup_progress_interval seconds, which is unnecessary. Therefore, we disable the timeout in standby mode. Reported-by: Thomas Munro Authors: Bharath Rupireddy, Simon Riggs Reviewed-by: Robert Haas Backpatch-through: 15 Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com --- src/backend/access/transam/xlogrecovery.c | 30 +-- src/backend/postmaster/startup.c | 29 +++--- src/include/postmaster/startup.h | 2 ++ 3 files changed, 56 insertions(+), 5 deletions(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index cb07694aea..0d57ea17ca 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -1644,9 +1644,24 @@ PerformWalRecovery(void) (errmsg("redo starts at %X/%X", LSN_FORMAT_ARGS(xlogreader->ReadRecPtr; - /* Prepare to report progress of the redo phase. */ - if (!StandbyMode) + if (StandbyMode) + { + /* + * To avoid server log bloat, we don't report recovery progress in + * a standby as it will always be in recovery unless promoted. We + * disable startup progress timeout in standby mode to avoid + * calling startup_progress_timeout_handler() unnecessarily. + */ + disable_startup_progress_timeout(); + } + else + { + /* + * When not in standby mode, prepare to report progress of the redo + * phase. + */ begin_startup_progress_phase(); + } /* * main redo apply loop @@ -3115,8 +3130,19 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode, (errmsg_internal("reached end of WAL in pg_wal, entering archive recovery"))); InArchiveRecovery = true; if (StandbyModeRequested) +{ StandbyMode = true; + /* + * To avoid server log bloat, we don't report + * recovery progress in a standby as it will always be in + * recovery unless promoted. We disable startup progress + * timeout in standby mode to avoid calling + * startup_progress_timeout_handler() unnecessarily. + */ + disable_startup_progress_timeout(); +} + SwitchIntoArchiveRecovery(xlogreader->EndRecPtr, replayTLI); minRecoveryPoint = xlogreader->EndRecPtr; minRecoveryPointTLI = replayTLI; diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index f99186eab7..2705e425e8 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -314,11 +314,22 @@ startup_progress_timeout_handler(void) startup_progress_timer_expired = true; } +void +disable_startup_progress_timeout(void) +{ + /* Feature is disabled. */ + if (log_startup_progress_interval == 0) + return; + + disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); + startup_progress_timer_expired = false; +} + /* * Set the start timestamp of the current operation and enable the timeout. */ void -begin_startup_progress_phase(void) +enable_startup_progress_timeout(void) { TimestampTz fin_time; @@ -326,8 +337,6 @@ begin_startup_progress_phase(void) if (log_startup_progress_interval == 0) return; - disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); - startup_progress_timer_expired = false; startup_progress_phase_start_time = GetCurrentTimestamp(); fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time, log_startup_progress_interval); @@ -335,6 +344,20 @@ begin_startup_progress_phase(void) log_startup_progress_interval); } +/* + * A thin wrapper to first disable and then enable the startup progress timeout. + */ +void +begin_startup_progr
Re: when the startup process doesn't (logging startup delays)
On Tue, Nov 15, 2022 at 8:33 AM Bharath Rupireddy wrote: > Please review the v2 patch. It seems to me that this will call disable_startup_progress_timeout once per WAL record, which seems like an unnecessary expense. How about leaving the code inside the loop just as we have it, and putting if (StandbyMode) disable_startup_progress_timeout() before entering the loop? -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Tue, 15 Nov 2022 at 13:33, Bharath Rupireddy wrote: > > On Mon, Nov 14, 2022 at 9:31 PM Robert Haas wrote: > > > > On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs > > wrote: > > > > Whilte at it, I noticed that we report redo progress for PITR, but we > > > > don't report when standby enters archive recovery mode, say due to a > > > > failure in the connection to primary or after the promote signal is > > > > found. Isn't it useful to report in this case as well to know the > > > > recovery progress? > > > > > > I think your patch disables progress too early, effectively turning > > > off the standby progress feature. The purpose was to report on things > > > that take long periods during recovery, not just prior to recovery. > > > > > > I would advocate that we disable progress only while waiting, as I've > > > done here: > > > https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com > > > > Maybe I'm confused here, but I think that, on a standby, startup > > progress messages are only printed until the main redo loop is > > reached. Otherwise, we would print a message on a standby every 10s > > forever, which seems like a thing that most users would not like. So I > > think that Bharath has the right idea here. > > Yes, the idea is to disable the timeout on standby completely since we > actually don't report any recovery progress. Keeping it enabled, > unnecessarily calls startup_progress_timeout_handler() every > log_startup_progress_interval seconds i.e. 10 seconds. That's the > intention of the patch. As long as we don't get the SIGALRMs that Thomas identified, then I'm happy. -- Simon Riggshttp://www.EnterpriseDB.com/
Re: when the startup process doesn't (logging startup delays)
On Mon, Nov 14, 2022 at 9:31 PM Robert Haas wrote: > > On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs > wrote: > > > Whilte at it, I noticed that we report redo progress for PITR, but we > > > don't report when standby enters archive recovery mode, say due to a > > > failure in the connection to primary or after the promote signal is > > > found. Isn't it useful to report in this case as well to know the > > > recovery progress? > > > > I think your patch disables progress too early, effectively turning > > off the standby progress feature. The purpose was to report on things > > that take long periods during recovery, not just prior to recovery. > > > > I would advocate that we disable progress only while waiting, as I've done > > here: > > https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com > > Maybe I'm confused here, but I think that, on a standby, startup > progress messages are only printed until the main redo loop is > reached. Otherwise, we would print a message on a standby every 10s > forever, which seems like a thing that most users would not like. So I > think that Bharath has the right idea here. Yes, the idea is to disable the timeout on standby completely since we actually don't report any recovery progress. Keeping it enabled, unnecessarily calls startup_progress_timeout_handler() every log_startup_progress_interval seconds i.e. 10 seconds. That's the intention of the patch. > I don't think that his patch is right in detail, though. I don't think > the call to disable_timeout() needs to be conditional, Yes, disable_timeout() returns if the timeout was previously disabled i.e. all_timeouts[STARTUP_PROGRESS_TIMEOUT].active is false. I changed it in the v2 patch. > and I don't > think the Assert is correct. You're right. My intention there was to check if the timeout is enabled while ereport_startup_progress() is called. In the v2 patch, when we actually disable the timeout startup_progress_timer_expired gets set to false and has_startup_progress_timeout_expired() just returns in such a case. > Also, I think that your patch has the > right idea in encapsulating the disable_timeout() call inside a new > function disable_startup_progress_timeout(), rather than having the > details known directly by xlogrecovery.c. Yes, I too like Simon's idea of {enable, disable}_startup_progress_timeout functions, I utilized them in the v2 patch here. I actually want to get rid of begin_startup_progress_phase() which now becomes a thin wrapper calling disable and enable functions and ensure the callers do follow enable()-report_progress()-disable() way to use the feature, however I didn't code for that as it needs changes across many files. If okay, I can code for that too. Thoughts? Please review the v2 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From 0edba6df408bbde80bc2a955990021995f3ec07e Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Tue, 15 Nov 2022 13:01:33 + Subject: [PATCH v2] Disable STARTUP_PROGRESS_TIMEOUT in standby mode In standby mode, we actually don't report progress of recovery to not bloat server logs as the standby is always in recovery unless promoted. However, startup_progress_timeout_handler() gets called every log_startup_progress_interval seconds, which is unnecessary. Therefore, we disable the timeout in standby mode. Reported-by: Thomas Munro Authors: Bharath Rupireddy, Simon Riggs Reviewed-by: Robert Haas Backpatch-through: 15 Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com --- src/backend/access/transam/xlogrecovery.c | 9 ++- src/backend/postmaster/startup.c | 29 --- src/include/postmaster/startup.h | 2 ++ 3 files changed, 36 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index cb07694aea..20c0dfd9d1 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -1653,7 +1653,14 @@ PerformWalRecovery(void) */ do { - if (!StandbyMode) + /* + * To avoid server log bloat, we don't report recovery progress in + * a standby as it will always be in recovery unless promoted. We + * also disable the timeout as we don't need it anymore. + */ + if (StandbyMode) +disable_startup_progress_timeout(); + else ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X", LSN_FORMAT_ARGS(xlogreader->ReadRecPtr)); diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index f99186eab7..2705e425e8 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -314,11 +314,22 @@ startup_progress_timeout_handler(void) startup_progress_timer_expired = true; } +voi
Re: when the startup process doesn't (logging startup delays)
On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs wrote: > > Whilte at it, I noticed that we report redo progress for PITR, but we > > don't report when standby enters archive recovery mode, say due to a > > failure in the connection to primary or after the promote signal is > > found. Isn't it useful to report in this case as well to know the > > recovery progress? > > I think your patch disables progress too early, effectively turning > off the standby progress feature. The purpose was to report on things > that take long periods during recovery, not just prior to recovery. > > I would advocate that we disable progress only while waiting, as I've done > here: > https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com Maybe I'm confused here, but I think that, on a standby, startup progress messages are only printed until the main redo loop is reached. Otherwise, we would print a message on a standby every 10s forever, which seems like a thing that most users would not like. So I think that Bharath has the right idea here. I don't think that his patch is right in detail, though. I don't think the call to disable_timeout() needs to be conditional, and I don't think the Assert is correct. Also, I think that your patch has the right idea in encapsulating the disable_timeout() call inside a new function disable_startup_progress_timeout(), rather than having the details known directly by xlogrecovery.c. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Tue, 8 Nov 2022 at 12:33, Bharath Rupireddy wrote: > > On Tue, Nov 8, 2022 at 4:35 PM Thomas Munro wrote: > > > > On Sat, Oct 30, 2021 at 7:44 AM Robert Haas wrote: > > > Committed. > > > > Is it expected that an otherwise idle standby's recovery process > > receives SIGALRM every N seconds, or should the timer be canceled at > > that point, as there is no further progress to report? > > Nice catch. Yeah, that seems unnecessary, see the below standby logs. > I think we need to disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);, > something like the attached? I think there'll be no issue with the > patch since the StandbyMode gets reset only at the end of recovery (in > FinishWalRecovery()) but it can very well be set during recovery (in > ReadRecord()). Note that I also added an assertion in > has_startup_progress_timeout_expired(), just in case. > > 2022-11-08 11:28:23.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received > 2022-11-08 11:28:23.563 UTC [980909] LOG: > startup_progress_timeout_handler called > 2022-11-08 11:28:33.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received > 2022-11-08 11:28:33.563 UTC [980909] LOG: > startup_progress_timeout_handler called > 2022-11-08 11:28:43.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received > 2022-11-08 11:28:43.563 UTC [980909] LOG: > startup_progress_timeout_handler called > 2022-11-08 11:28:53.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received > 2022-11-08 11:28:53.563 UTC [980909] LOG: > startup_progress_timeout_handler called > > Whilte at it, I noticed that we report redo progress for PITR, but we > don't report when standby enters archive recovery mode, say due to a > failure in the connection to primary or after the promote signal is > found. Isn't it useful to report in this case as well to know the > recovery progress? I think your patch disables progress too early, effectively turning off the standby progress feature. The purpose was to report on things that take long periods during recovery, not just prior to recovery. I would advocate that we disable progress only while waiting, as I've done here: https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com -- Simon Riggshttp://www.EnterpriseDB.com/
Re: when the startup process doesn't (logging startup delays)
On Tue, Nov 8, 2022 at 4:35 PM Thomas Munro wrote: > > On Sat, Oct 30, 2021 at 7:44 AM Robert Haas wrote: > > Committed. > > Is it expected that an otherwise idle standby's recovery process > receives SIGALRM every N seconds, or should the timer be canceled at > that point, as there is no further progress to report? Nice catch. Yeah, that seems unnecessary, see the below standby logs. I think we need to disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);, something like the attached? I think there'll be no issue with the patch since the StandbyMode gets reset only at the end of recovery (in FinishWalRecovery()) but it can very well be set during recovery (in ReadRecord()). Note that I also added an assertion in has_startup_progress_timeout_expired(), just in case. 2022-11-08 11:28:23.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received 2022-11-08 11:28:23.563 UTC [980909] LOG: startup_progress_timeout_handler called 2022-11-08 11:28:33.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received 2022-11-08 11:28:33.563 UTC [980909] LOG: startup_progress_timeout_handler called 2022-11-08 11:28:43.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received 2022-11-08 11:28:43.563 UTC [980909] LOG: startup_progress_timeout_handler called 2022-11-08 11:28:53.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received 2022-11-08 11:28:53.563 UTC [980909] LOG: startup_progress_timeout_handler called Whilte at it, I noticed that we report redo progress for PITR, but we don't report when standby enters archive recovery mode, say due to a failure in the connection to primary or after the promote signal is found. Isn't it useful to report in this case as well to know the recovery progress? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com From 40ac498a76f38903bbee37108812907a76bb1a78 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Tue, 8 Nov 2022 12:18:11 + Subject: [PATCH v1] Disable STARTUP_PROGRESS_TIMEOUT in standby mode --- src/backend/access/transam/xlogrecovery.c | 13 - src/backend/postmaster/startup.c | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index cb07694aea..5326a98633 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -63,6 +63,7 @@ #include "utils/pg_lsn.h" #include "utils/ps_status.h" #include "utils/pg_rusage.h" +#include "utils/timeout.h" /* Unsupported old recovery command file names (relative to $PGDATA) */ #define RECOVERY_COMMAND_FILE "recovery.conf" @@ -1653,7 +1654,17 @@ PerformWalRecovery(void) */ do { - if (!StandbyMode) + /* + * To avoid server log bloat, we don't report recovery progress in + * a standby as it will always be in recovery unless promoted. We + * also disable the timeout as we don't need it anymore. + */ + if (StandbyMode) + { +if (get_timeout_active(STARTUP_PROGRESS_TIMEOUT)) + disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); + } + else ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X", LSN_FORMAT_ARGS(xlogreader->ReadRecPtr)); diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index f99186eab7..1456e3ad3a 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -347,6 +347,8 @@ has_startup_progress_timeout_expired(long *secs, int *usecs) int useconds; TimestampTz now; + Assert(get_timeout_active(STARTUP_PROGRESS_TIMEOUT) == true); + /* No timeout has occurred. */ if (!startup_progress_timer_expired) return false; -- 2.34.1
Re: when the startup process doesn't (logging startup delays)
On Sat, Oct 30, 2021 at 7:44 AM Robert Haas wrote: > Committed. Is it expected that an otherwise idle standby's recovery process receives SIGALRM every N seconds, or should the timer be canceled at that point, as there is no further progress to report?
Re: when the startup process doesn't (logging startup delays)
On Fri, Oct 29, 2021 at 9:10 AM Nitin Jadhav wrote: > > I think you're wrong. If we did that, the previous timer could fire > > right after we set startup_progress_timer_expired = false, and before > > we reschedule the timeout. It seems annoying to have to disable the > > timeout and immediately turn around and re-enable it, but I don't see > > how to avoid the race condition otherwise. > > Right. There is a possibility of race conditions. In that case the > above changes look good to me. Committed. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> I think you're wrong. If we did that, the previous timer could fire > right after we set startup_progress_timer_expired = false, and before > we reschedule the timeout. It seems annoying to have to disable the > timeout and immediately turn around and re-enable it, but I don't see > how to avoid the race condition otherwise. Right. There is a possibility of race conditions. In that case the above changes look good to me. Thanks & Regards, Nitin Jadhav On Fri, Oct 29, 2021 at 6:10 PM Robert Haas wrote: > > On Fri, Oct 29, 2021 at 7:37 AM Nitin Jadhav > wrote: > > ereport_startup_progress() logs a message. So I feel just setting > > 'startup_progress_timer_expired' to false in > > begin_startup_progress_phase() would work. Please correct me if I am > > wrong. > > I think you're wrong. If we did that, the previous timer could fire > right after we set startup_progress_timer_expired = false, and before > we reschedule the timeout. It seems annoying to have to disable the > timeout and immediately turn around and re-enable it, but I don't see > how to avoid the race condition otherwise. > > -- > Robert Haas > EDB: http://www.enterprisedb.com > >
Re: when the startup process doesn't (logging startup delays)
On Fri, Oct 29, 2021 at 7:37 AM Nitin Jadhav wrote: > ereport_startup_progress() logs a message. So I feel just setting > 'startup_progress_timer_expired' to false in > begin_startup_progress_phase() would work. Please correct me if I am > wrong. I think you're wrong. If we did that, the previous timer could fire right after we set startup_progress_timer_expired = false, and before we reschedule the timeout. It seems annoying to have to disable the timeout and immediately turn around and re-enable it, but I don't see how to avoid the race condition otherwise. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> I was fooling around with a test setup today, working on an unrelated > problem, and this happened: > > 2021-10-28 14:21:23.145 EDT [92010] LOG: resetting unlogged relations > (init), elapsed time: 0.00 s, current path: base/13020 Nice catch and interesting case. > That's not supposed to happen. I assume the problem is that the > timeout for the previous phase fired just as we were beginning a new > one, and the code got confused. I think we probably need to do > something like this to make sure that the timeout from one operation > can't trigger a log message for the next: > > diff --git a/src/backend/postmaster/startup.c > b/src/backend/postmaster/startup.c > index 28e68dd871..47ec737888 100644 > --- a/src/backend/postmaster/startup.c > +++ b/src/backend/postmaster/startup.c > @@ -320,6 +320,8 @@ begin_startup_progress_phase(void) > if (log_startup_progress_interval == 0) > return; > > +disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); > +startup_progress_timer_expired = false; > startup_progress_phase_start_time = GetCurrentTimestamp(); > fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time, >log_startup_progress_interval); > > Thoughts? Yes. I agree that this is not an expected behaviour and I do agree that, probably the timeout for the previous phase fired just as we were beginning a new one. For each operation, we call begin_startup_progress_phase() before starting the operation and one/multiple calls to ereport_startup_progress() to report the progress and intentionally we have not added any functionality to disable the timer at the end of the operation. The timer remains active and there may be multiple calls to startup_progress_timeout_handler() which sets a flag to true. So whenever we call a begin_startup_progress_phase() for the next operation, we do enable the timer (In my understanding it will internally disable the old timer and schedule a new one) but the flag is already set to true by the previous timer. Hence the next call to ereport_startup_progress() logs a message. So I feel just setting 'startup_progress_timer_expired' to false in begin_startup_progress_phase() would work. Please correct me if I am wrong. Thanks & Regards, Nitin Jadhav On Thu, Oct 28, 2021 at 11:59 PM Robert Haas wrote: > > On Mon, Oct 25, 2021 at 11:56 AM Robert Haas wrote: > > This version looks fine, so I have committed it (and my > > enable_timeout_every patch also, as a necessary prerequisite). > > I was fooling around with a test setup today, working on an unrelated > problem, and this happened: > > 2021-10-28 14:21:23.145 EDT [92010] LOG: resetting unlogged relations > (init), elapsed time: 0.00 s, current path: base/13020 > > That's not supposed to happen. I assume the problem is that the > timeout for the previous phase fired just as we were beginning a new > one, and the code got confused. I think we probably need to do > something like this to make sure that the timeout from one operation > can't trigger a log message for the next: > > diff --git a/src/backend/postmaster/startup.c > b/src/backend/postmaster/startup.c > index 28e68dd871..47ec737888 100644 > --- a/src/backend/postmaster/startup.c > +++ b/src/backend/postmaster/startup.c > @@ -320,6 +320,8 @@ begin_startup_progress_phase(void) > if (log_startup_progress_interval == 0) > return; > > +disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); > +startup_progress_timer_expired = false; > startup_progress_phase_start_time = GetCurrentTimestamp(); > fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time, > log_startup_progress_interval); > > Thoughts? > > -- > Robert Haas > EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Mon, Oct 25, 2021 at 11:56 AM Robert Haas wrote: > This version looks fine, so I have committed it (and my > enable_timeout_every patch also, as a necessary prerequisite). I was fooling around with a test setup today, working on an unrelated problem, and this happened: 2021-10-28 14:21:23.145 EDT [92010] LOG: resetting unlogged relations (init), elapsed time: 0.00 s, current path: base/13020 That's not supposed to happen. I assume the problem is that the timeout for the previous phase fired just as we were beginning a new one, and the code got confused. I think we probably need to do something like this to make sure that the timeout from one operation can't trigger a log message for the next: diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index 28e68dd871..47ec737888 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -320,6 +320,8 @@ begin_startup_progress_phase(void) if (log_startup_progress_interval == 0) return; +disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); +startup_progress_timer_expired = false; startup_progress_phase_start_time = GetCurrentTimestamp(); fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time, log_startup_progress_interval); Thoughts? -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Tue, Oct 26, 2021 at 4:19 AM Bharath Rupireddy wrote: > Can we also log the total time the startup process took to recover, > and also the total time each stage of the recovery/redo processing > took: 1) into a file or 2) emitting that info via a new hook 3) into a > system catalog table (assuming at the end of the recovery the database > is in a consistent state, but I'm not sure if we ever update any > catalog tables in/after the startup/recovery phase). #3 would be hard to do because there could be any number of databases and it is unclear which one we ought to update. Also, we'd have to launch a background worker to connect to that database in order to do it, and be prepared for what happens if that worker fails to get the work done for whatever reason. Also, it is unclear why we should only log this specific thing to a system catalog and not anything else. #1 and #2 could certainly be done, but I'm not sure what the use case is, and I think it's a separate proposal from what we did here. So I think a new thread would be appropriate if you want to make a new proposal. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Mon, Oct 25, 2021 at 9:26 PM Robert Haas wrote: > > On Tue, Oct 19, 2021 at 9:06 AM Nitin Jadhav > wrote: > > Thanks for sharing the patch. Overall approach looks good to me. But > > just one concern about using enable_timeout_every() functionality. As > > per my understanding the caller should calculate the first scheduled > > timeout (now + interval) and pass it as the second argument but this > > is not the same in 'v2-0002-Quick-testing-hack.patch'. Anyways I have > > done the changes as I have mentioned (like now + interval). Kindly > > correct me if I am wrong. I am attaching 2 patches here. > > 'v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch' is > > the same as Robert's v2 patch. I have rebased my patch on top of this > > and it is 'v19-0002-startup-progress.patch'. > > This version looks fine, so I have committed it (and my > enable_timeout_every patch also, as a necessary prerequisite). Thanks for getting this in. I have few more thoughts: Can we also log the total time the startup process took to recover, and also the total time each stage of the recovery/redo processing took: 1) into a file or 2) emitting that info via a new hook 3) into a system catalog table (assuming at the end of the recovery the database is in a consistent state, but I'm not sure if we ever update any catalog tables in/after the startup/recovery phase). This will help the users/admins/developers for summarizing, analytical and debugging purposes. This information can easily help us to understand the recovery patterns. Thoughts? If okay, I can spend some more time and start a separate thread to discuss. Regards, Bharath Rupireddy.
Re: when the startup process doesn't (logging startup delays)
On Tue, Oct 19, 2021 at 9:06 AM Nitin Jadhav wrote: > Thanks for sharing the patch. Overall approach looks good to me. But > just one concern about using enable_timeout_every() functionality. As > per my understanding the caller should calculate the first scheduled > timeout (now + interval) and pass it as the second argument but this > is not the same in 'v2-0002-Quick-testing-hack.patch'. Anyways I have > done the changes as I have mentioned (like now + interval). Kindly > correct me if I am wrong. I am attaching 2 patches here. > 'v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch' is > the same as Robert's v2 patch. I have rebased my patch on top of this > and it is 'v19-0002-startup-progress.patch'. This version looks fine, so I have committed it (and my enable_timeout_every patch also, as a necessary prerequisite). Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> Apparently not, but here's a v2 anyway. In this version I made > enable_timeout_every() a three-argument function, so that the caller > can specify both the first time at which the timeout routine should be > called and the interval between them, instead of only the latter. That > seems to be more convenient for this use case, and is more powerful in > general. Thanks for sharing the patch. Overall approach looks good to me. But just one concern about using enable_timeout_every() functionality. As per my understanding the caller should calculate the first scheduled timeout (now + interval) and pass it as the second argument but this is not the same in 'v2-0002-Quick-testing-hack.patch'. Anyways I have done the changes as I have mentioned (like now + interval). Kindly correct me if I am wrong. I am attaching 2 patches here. 'v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch' is the same as Robert's v2 patch. I have rebased my patch on top of this and it is 'v19-0002-startup-progress.patch'. > I just noticed something else which I realize is the indirect result > of my own suggestion but which doesn't actually look all that nice. > You've now got a call to RegisterTimeout(STARTUP_PROGRESS_TIMEOUT, > ...) in InitPostgres, guarded by ! IsPostmasterEnvironment, and then > another one in StartupProcessMain(). I think that's so that the > feature works in single-user mode, but that means that technically, > we're not reporting on the progress of the startup process. We're > reporting progress on the startup operations that are normally > performed by the startup process. But that means that the > documentation isn't quite accurate (because it mentions the startup > process specifically) and that the placement of the code in startup.c > is suspect (because that's specifically for the startup process) and > that basically every instance of the word "process" in the patch is > technically a little bit wrong. I'm not sure if that's a big enough > problem to be worth worrying about or exactly what we ought to do > about it, but it doesn't seem fantastic. At a minimum, I think we > should probably try to eliminate as many references to the startup > process as we can, and talk about startup progress or startup > operations or something like that. Yeah right. I have modified the comments accordingly and also fixed the other review comments related to the code comments. I have modified the code to include a call to RegisterTimeout() in only one place than the two calls previously. Initially I thought to call this in begin_startup_progress_phase(). I feel this is not a better choice since begin_startup_progress_phase function is getting called in many places. So it calls RegisterTimeout() many times which is not required. I feel StartupXLOG() is a better place for this since StartupXLOG() gets called during the startup process, bootstrap process or standalone backend. As per earlier discussion we need support for this in the case of startup process and standalone backend. Hence guarded this with '!IsBootstrapProcessingMode()'. Also verified the behaviour in both of the cases. Please correct me if I am wrong. Thanks & Regards, Nitin Jadhav On Mon, Oct 18, 2021 at 9:15 PM Robert Haas wrote: > > On Thu, Sep 30, 2021 at 5:08 PM Robert Haas wrote: > > Any thoughts on the patch I attached? > > Apparently not, but here's a v2 anyway. In this version I made > enable_timeout_every() a three-argument function, so that the caller > can specify both the first time at which the timeout routine should be > called and the interval between them, instead of only the latter. That > seems to be more convenient for this use case, and is more powerful in > general. > > -- > Robert Haas > EDB: http://www.enterprisedb.com v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch Description: Binary data v19-0002-startup-progress.patch Description: Binary data
Re: when the startup process doesn't (logging startup delays)
On Thu, Sep 30, 2021 at 5:08 PM Robert Haas wrote: > Any thoughts on the patch I attached? Apparently not, but here's a v2 anyway. In this version I made enable_timeout_every() a three-argument function, so that the caller can specify both the first time at which the timeout routine should be called and the interval between them, instead of only the latter. That seems to be more convenient for this use case, and is more powerful in general. -- Robert Haas EDB: http://www.enterprisedb.com v2-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch Description: Binary data v2-0002-Quick-testing-hack.patch Description: Binary data
Re: when the startup process doesn't (logging startup delays)
On Wed, Oct 13, 2021 at 9:06 AM Amul Sul wrote: > I think the "elapsed time" part can be implicitly added to the error > message inside ereport_startup_progress() which is common to all > calls. Not if it means having to call psprintf there! If there's some way we could do it with macro tricks, it might be worth considering, but I'm not sure there is, or that it would be less confusing. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Wed, Sep 29, 2021 at 11:10 PM Robert Haas wrote: > > On Wed, Sep 29, 2021 at 10:08 AM Nitin Jadhav > wrote: > > Sorry. There was a misunderstanding about this and for the patch > > shared on September 27th, I had tested for the value '0' and observed > > that no progress messages were getting logged, probably the time at > > which 'enable_timeout_at' is getting called is past the 'next_timeout' > > value. This behaviour is completely dependent on the system. Now added > > an extra condition to disable the feature in case of '0' setting. > > Oh, interesting. I failed to consider that the behavior might vary > from one system to another. > > I just noticed something else which I realize is the indirect result > of my own suggestion but which doesn't actually look all that nice. > You've now got a call to RegisterTimeout(STARTUP_PROGRESS_TIMEOUT, > ...) in InitPostgres, guarded by ! IsPostmasterEnvironment, and then > another one in StartupProcessMain(). I think that's so that the > feature works in single-user mode, but that means that technically, > we're not reporting on the progress of the startup process. We're > reporting progress on the startup operations that are normally > performed by the startup process. But that means that the > documentation isn't quite accurate (because it mentions the startup > process specifically) and that the placement of the code in startup.c > is suspect (because that's specifically for the startup process) and > that basically every instance of the word "process" in the patch is > technically a little bit wrong. I'm not sure if that's a big enough > problem to be worth worrying about or exactly what we ought to do > about it, but it doesn't seem fantastic. At a minimum, I think we > should probably try to eliminate as many references to the startup > process as we can, and talk about startup progress or startup > operations or something like that. > > + * Start timestamp of the operation that occur during startup process. > > This is not correct grammar - it would need to be "operations that > occur" or "operation that occurs". But neither seems particularly > clear about what the variable actually does. How about "Time at which > the most recent startup operation started"? > > + * Indicates the timestamp at which the timer was supposed to expire. > > "Indicates" can be deleted, but also I think it would be better to > state the purpose of the timer i.e. "Timestamp at which the next > startup progress message should be logged." > > + enable_timeout_at(STARTUP_PROGRESS_TIMEOUT, next_timeout); > + scheduled_startup_progress_timeout = next_timeout; > + startup_progress_timer_expired = false; > > I think you should set startup_progress_timer_expired to false before > calling enable_timeout_at. Otherwise there's a race condition. It's > unlikely that the timer could expire and the interrupt handler fire > before we reach startup_progress_timer_expired = false, but it seems > like there's no reason to take a chance. > > + * Calculates the timestamp at which the next timer should expire and enables > > So in some places you have verbs with an "s" on the end, like here, > and in other places without, like in the next example. In "telegraph > style" comments like this, this implicit subject is "it" or "this," > but you don't write that. However you write the rest of the sentence > as if it were there. So this should say "calculate" and "enable" > rather than "calculates" and "enables". > > + * Schedule a wakeup call for logging the progress of startup process. > > This isn't really an accurate description, I think. It's not > scheduling anything, and I don't know what a "wakeup call" is anyway. > "Set a flag indicating that it's time to log a progress report" might > be better. > > + * Sets the start timestamp of the current operation and also enables the > > Set. enable. > > + * timeout for logging the progress of startup process. > > I think you could delete "for logging the progress of startup process" > here; that seems clear enough, and this reads a bit awkwardly. If you > want to keep something like this I wrote write "...enable the timeout > so that the progress of the startup progress will be logged." > > + * the timer if it did, otheriwse return false. > > otherwise > > + * Begin the startup progress phase to report the progress of syncing > + * data directory (syncfs). > > All of the comments that start with "Begin the startup progress phase" > should instead start with "Begin startup progress phase". > > I think this could be condensed down to "Prepare to report progress > syncing the data directory via syncfs", and likewise "Prepare to > report progress of the pre-fsync phase", "Prepare to report progress > resetting unlogged relations," etc. > > + gettext_noop("Sets the time interval between each progress update " > + "of the startup process."), > > This is actually inaccurate. It's sort of the same problem I was > worried about with respect to the documentation:
Re: when the startup process doesn't (logging startup delays)
On Thu, Sep 30, 2021 at 05:08:17PM -0400, Robert Haas wrote: > It's certainly less of an issue than it used to be back in my day. > > Any thoughts on the patch I attached? I don't know. Anyway, this is actively worked on, so I have taken the liberty to move that to the next CF. -- Michael signature.asc Description: PGP signature
Re: when the startup process doesn't (logging startup delays)
On Thu, Sep 30, 2021 at 3:10 PM Tom Lane wrote: > That would be lovely, certainly. But aren't you moving the goalposts > rather far? I don't think we make any promises about such things > today, so why has the issue suddenly gotten more pressing? Yeah, perhaps it's best to not to worry about it. I dislike failure to worry about that case on general principle, but I agree with you that it seems to be moving the goalposts a disproportionate distance. > In particular, > why do you think Nitin's patch is proof against this? Seems to me it's > probably got *more* failure cases, not fewer, if the system clock is > acting funny. You might be right. I sort of assumed that timeout.c had some defense against this, but since that seems not to be the case, I suppose no facility that depends on it can hope to stay out of trouble either. > On the whole, in these days of NTP, I'm not sure I care to spend > large amounts of effort on dealing with a bogus system clock. It's certainly less of an issue than it used to be back in my day. Any thoughts on the patch I attached? -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
Robert Haas writes: > ... When I say I want my handler to be > fired in 3 s, I don't mean that I want it to be fired when the system > time is 3 seconds greater than it is right now. I mean I want it to be > fired in 3 actual seconds, regardless of what dumb thing the system > clock may choose to do. That would be lovely, certainly. But aren't you moving the goalposts rather far? I don't think we make any promises about such things today, so why has the issue suddenly gotten more pressing? In particular, why do you think Nitin's patch is proof against this? Seems to me it's probably got *more* failure cases, not fewer, if the system clock is acting funny. BTW, one could imagine addressing this concern by having timeout.c work with CLOCK_MONOTONIC instead of the regular wall clock. But I fear we'd have to drop enable_timeout_at(), for lack of ability to translate between CLOCK_MONOTONIC timestamps and those used by anybody else. Also get_timeout_start_time/get_timeout_finish_time would become problematic. Maybe we only really care about deltas, so the more restrictive API would be workable, but it seems like a nontrivial amount of work. On the whole, in these days of NTP, I'm not sure I care to spend large amounts of effort on dealing with a bogus system clock. regards, tom lane
Re: when the startup process doesn't (logging startup delays)
On Wed, Sep 29, 2021 at 5:12 PM Tom Lane wrote: > I didn't claim there are any other places that could use the feature > *today*. But once we've got one, it seems like there could be more > tomorrow. In any case, I dislike keeping timeout state data outside > timeout.c, because it's so likely to get out-of-sync that way. Well, I had a quick go at implementing this (attached). It seems to do a satisfactory job preventing drift over time, but it doesn't behave nicely if you set the system clock backward. With a bit of extra debugging output: 2021-09-30 14:23:50.291 EDT [2279] LOG: scheduling wakeup in 2 secs, 998727 usecs 2021-09-30 14:23:53.291 EDT [2279] LOG: scheduling wakeup in 2 secs, 998730 usecs 2021-09-30 14:23:56.291 EDT [2279] LOG: scheduling wakeup in 2 secs, 998728 usecs 2021-09-30 14:20:01.154 EDT [2279] LOG: scheduling wakeup in 238 secs, 135532 usecs 2021-09-30 14:23:59.294 EDT [2279] LOG: scheduling wakeup in 2 secs, 995922 use The issue here is that fin_time is really the driving force behind everything timeout.c does. In particular, it determines the order of active_timeouts[]. And that's not really correct either for enable_timeout_after(), or for the new function I added in this draft patch, enable_timeout_every(). When I say I want my handler to be fired in 3 s, I don't mean that I want it to be fired when the system time is 3 seconds greater than it is right now. I mean I want it to be fired in 3 actual seconds, regardless of what dumb thing the system clock may choose to do. I don't really think that precise behavior can be implemented, but ideally if a timeout that was supposed to happen after 3 s is now scheduled for a time that is more than 3 seconds beyond the current value of the system clock, we'd move the firing time backwards to 3 seconds beyond the current system clock. That way, if you set the system time backward by 4 minutes, you might see a few seconds of delay before the next firing, but you wouldn't go into the tank for 4 minutes. I don't see an obvious way of making timeout.c behave like that without major surgery, though. If nobody else does either, then we could (1) stick with something closer to Nitin's current patch, which tries to handle this concern outside of timeout.c, (2) adopt something like the attached 0001 and leave the question of improved behavior in case of backwards system clock adjustments for another day, or (3) undertake to rewrite timeout.c as a precondition of being able to log messages about why startup is slow. I'm not a huge fan of (3) but I'm open to suggestions. -- Robert Haas EDB: http://www.enterprisedb.com 0002-Quick-testing-hack.patch Description: Binary data 0001-Add-enable_timeout_every-to-fire-the-same-timeout-re.patch Description: Binary data
Re: when the startup process doesn't (logging startup delays)
Robert Haas writes: > On Wed, Sep 29, 2021 at 2:06 PM Tom Lane wrote: >> The real comment I'd have here, though, is that writing one-off >> code for this purpose is bad. If we have a need for a repetitive >> timeout, it'd be better to add the feature to timeout.c explicitly. >> That would probably also remove the need for extra copies of the >> timeout time. > I'm not sure that really helps very much, honestly. I didn't claim there are any other places that could use the feature *today*. But once we've got one, it seems like there could be more tomorrow. In any case, I dislike keeping timeout state data outside timeout.c, because it's so likely to get out-of-sync that way. regards, tom lane
Re: when the startup process doesn't (logging startup delays)
On Wed, Sep 29, 2021 at 2:06 PM Tom Lane wrote: > The real comment I'd have here, though, is that writing one-off > code for this purpose is bad. If we have a need for a repetitive > timeout, it'd be better to add the feature to timeout.c explicitly. > That would probably also remove the need for extra copies of the > timeout time. I'm not sure that really helps very much, honestly. I mean it would be useful in this particular case, but there are other cases where we have logic like this already, and this wouldn't do anything about those. For example, consider autoprewarm_main(). Like this code, that code thinks (perhaps just because I'm the one who reviewed it) that the next time should be measured from the last time ... but an enhancement to the timeout machinery wouldn't help it at all. I suspect there are other cases like this elsewhere, because this is what I personally tend to think is the right behavior and I feel like it comes up in patch reviews from time to time, but I'm not finding any at the moment. Even if I'm right that they exist, I'm not sure they look much like each other or can easily reuse any code. And then again on the other hand, BackgroundWriterMain() thinks that the next time should be measured from the time we got around to doing it, not the scheduled time. I guess we don't really have any consistent practice here. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Wed, Sep 29, 2021 at 1:52 PM Alvaro Herrera wrote: > I think one person casting an opinion on one aspect does not set that > aspect in stone. Of course not. I was just explaining that how the patch ended up like it did. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
Alvaro Herrera writes: > On 2021-Sep-29, Robert Haas wrote: >> Well, this was my suggestion, because if you don't do this, you get >> drift, which I think looks weird. Like the timestamps will be: >> >> 13:41:05.012456 >> 13:41:15.072484 >> 13:41:25.149632 >> >> ...and it gets further and further off as it goes on.' > Right ... I actually *expect* this drift to occur. Maybe people > generally don't like this, it just seems natural to me. Are there other > opinions on this aspect? FWIW, I agree with Robert that it's nicer if the timeout doesn't drift. There's a limit to how much complexity I'm willing to tolerate for that, but it doesn't seem like this exceeds it. The real comment I'd have here, though, is that writing one-off code for this purpose is bad. If we have a need for a repetitive timeout, it'd be better to add the feature to timeout.c explicitly. That would probably also remove the need for extra copies of the timeout time. regards, tom lane
Re: when the startup process doesn't (logging startup delays)
On 2021-Sep-29, Justin Pryzby wrote: > Robert requested the current behavior here. > https://www.postgresql.org/message-id/CA%2BTgmoYkS1ZeWdSMFMBecMNxWonHk6J5eoX4FEQrpKtvEbXqGQ%40mail.gmail.com > > It's confusing (at least) to get these kind of requests to change the behavior > back and forth. Well, I did scan the thread to see if this had been discussed, and I overlooked that message. But there was no reply to that message, so it's not clear whether this was just Robert's opinion or consensus; in fact we now have exactly two votes on it (mine and Robert's). I think one person casting an opinion on one aspect does not set that aspect in stone. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "It takes less than 2 seconds to get to 78% complete; that's a good sign. A few seconds later it's at 90%, but it seems to have stuck there. Did somebody make percentages logarithmic while I wasn't looking?" http://smylers.hates-software.com/2005/09/08/1995c749.html
Re: when the startup process doesn't (logging startup delays)
On 2021-Sep-29, Robert Haas wrote: > Well, this was my suggestion, because if you don't do this, you get > drift, which I think looks weird. Like the timestamps will be: > > 13:41:05.012456 > 13:41:15.072484 > 13:41:25.149632 > > ...and it gets further and further off as it goes on.' Right ... I actually *expect* this drift to occur. Maybe people generally don't like this, it just seems natural to me. Are there other opinions on this aspect? -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/ "Nadie está tan esclavizado como el que se cree libre no siéndolo" (Goethe)
Re: when the startup process doesn't (logging startup delays)
On Wed, Sep 29, 2021 at 02:36:14PM -0300, Alvaro Herrera wrote: > Why is it that we set the next timeout to fire not at "now + interval" > but at "when-it-should-have-fired-but-didn't + interval"? As a user, if > I request a message to be logged every N milliseconds, and one > of them is a little bit delayed, then (assuming I set it to 10s) I still > expect the next one to occur at now+10s. I don't expect the next at > "now+5s" if one is delayed 5s. > > In other words, I think this function should just be > enable_timeout_after(STARTUP_PROGRESS_TIMEOUT, > log_startup_progress_interval); > > This means you can remove the scheduled_startup_progress_timeout > variable. Robert requested the current behavior here. https://www.postgresql.org/message-id/CA%2BTgmoYkS1ZeWdSMFMBecMNxWonHk6J5eoX4FEQrpKtvEbXqGQ%40mail.gmail.com It's confusing (at least) to get these kind of requests to change the behavior back and forth. -- Justin
Re: when the startup process doesn't (logging startup delays)
On Wed, Sep 29, 2021 at 1:36 PM Alvaro Herrera wrote: > Why is it that we set the next timeout to fire not at "now + interval" > but at "when-it-should-have-fired-but-didn't + interval"? As a user, if > I request a message to be logged every N milliseconds, and one > of them is a little bit delayed, then (assuming I set it to 10s) I still > expect the next one to occur at now+10s. I don't expect the next at > "now+5s" if one is delayed 5s. Well, this was my suggestion, because if you don't do this, you get drift, which I think looks weird. Like the timestamps will be: 13:41:05.012456 13:41:15.072484 13:41:25.149632 ...and it gets further and further off as it goes on.' I guess my expectation is different from yours: I expect that if I ask for a message every 10 seconds, the time between messages is going to be 10s, at least on average, not 10s + however much latency the system has. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Wed, Sep 29, 2021 at 10:08 AM Nitin Jadhav wrote: > Sorry. There was a misunderstanding about this and for the patch > shared on September 27th, I had tested for the value '0' and observed > that no progress messages were getting logged, probably the time at > which 'enable_timeout_at' is getting called is past the 'next_timeout' > value. This behaviour is completely dependent on the system. Now added > an extra condition to disable the feature in case of '0' setting. Oh, interesting. I failed to consider that the behavior might vary from one system to another. I just noticed something else which I realize is the indirect result of my own suggestion but which doesn't actually look all that nice. You've now got a call to RegisterTimeout(STARTUP_PROGRESS_TIMEOUT, ...) in InitPostgres, guarded by ! IsPostmasterEnvironment, and then another one in StartupProcessMain(). I think that's so that the feature works in single-user mode, but that means that technically, we're not reporting on the progress of the startup process. We're reporting progress on the startup operations that are normally performed by the startup process. But that means that the documentation isn't quite accurate (because it mentions the startup process specifically) and that the placement of the code in startup.c is suspect (because that's specifically for the startup process) and that basically every instance of the word "process" in the patch is technically a little bit wrong. I'm not sure if that's a big enough problem to be worth worrying about or exactly what we ought to do about it, but it doesn't seem fantastic. At a minimum, I think we should probably try to eliminate as many references to the startup process as we can, and talk about startup progress or startup operations or something like that. + * Start timestamp of the operation that occur during startup process. This is not correct grammar - it would need to be "operations that occur" or "operation that occurs". But neither seems particularly clear about what the variable actually does. How about "Time at which the most recent startup operation started"? + * Indicates the timestamp at which the timer was supposed to expire. "Indicates" can be deleted, but also I think it would be better to state the purpose of the timer i.e. "Timestamp at which the next startup progress message should be logged." + enable_timeout_at(STARTUP_PROGRESS_TIMEOUT, next_timeout); + scheduled_startup_progress_timeout = next_timeout; + startup_progress_timer_expired = false; I think you should set startup_progress_timer_expired to false before calling enable_timeout_at. Otherwise there's a race condition. It's unlikely that the timer could expire and the interrupt handler fire before we reach startup_progress_timer_expired = false, but it seems like there's no reason to take a chance. + * Calculates the timestamp at which the next timer should expire and enables So in some places you have verbs with an "s" on the end, like here, and in other places without, like in the next example. In "telegraph style" comments like this, this implicit subject is "it" or "this," but you don't write that. However you write the rest of the sentence as if it were there. So this should say "calculate" and "enable" rather than "calculates" and "enables". + * Schedule a wakeup call for logging the progress of startup process. This isn't really an accurate description, I think. It's not scheduling anything, and I don't know what a "wakeup call" is anyway. "Set a flag indicating that it's time to log a progress report" might be better. + * Sets the start timestamp of the current operation and also enables the Set. enable. + * timeout for logging the progress of startup process. I think you could delete "for logging the progress of startup process" here; that seems clear enough, and this reads a bit awkwardly. If you want to keep something like this I wrote write "...enable the timeout so that the progress of the startup progress will be logged." + * the timer if it did, otheriwse return false. otherwise + * Begin the startup progress phase to report the progress of syncing + * data directory (syncfs). All of the comments that start with "Begin the startup progress phase" should instead start with "Begin startup progress phase". I think this could be condensed down to "Prepare to report progress syncing the data directory via syncfs", and likewise "Prepare to report progress of the pre-fsync phase", "Prepare to report progress resetting unlogged relations," etc. + gettext_noop("Sets the time interval between each progress update " + "of the startup process."), This is actually inaccurate. It's sort of the same problem I was worried about with respect to the documentation: it's NOT the interval between progress updates, because it applies separately to each operation. We need to say something that makes that clear, or at least that doesn't get overtly the opposite impression. It's hard to do that briefly
Re: when the startup process doesn't (logging startup delays)
So, I've wondered about this part all along: > +/* > + * Calculates the timestamp at which the next timer should expire and enables > + * the timer accordingly. > + */ > +static void > +reset_startup_progress_timeout(TimestampTz now) > +{ > + TimestampTz next_timeout; > + > + next_timeout = > TimestampTzPlusMilliseconds(scheduled_startup_progress_timeout, > + >log_startup_progress_interval); > + if (next_timeout < now) > + { > + /* > + * Either the timeout was processed so late that we missed an > + * entire cycle or system clock was set backwards. > + */ > + next_timeout = TimestampTzPlusMilliseconds(now, > log_startup_progress_interval); > + } > + > + enable_timeout_at(STARTUP_PROGRESS_TIMEOUT, next_timeout); Why is it that we set the next timeout to fire not at "now + interval" but at "when-it-should-have-fired-but-didn't + interval"? As a user, if I request a message to be logged every N milliseconds, and one of them is a little bit delayed, then (assuming I set it to 10s) I still expect the next one to occur at now+10s. I don't expect the next at "now+5s" if one is delayed 5s. In other words, I think this function should just be enable_timeout_after(STARTUP_PROGRESS_TIMEOUT, log_startup_progress_interval); This means you can remove the scheduled_startup_progress_timeout variable. -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/ "No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)
Re: when the startup process doesn't (logging startup delays)
> It is common to mention what the default is as part of the > documentation of a GUC. I don't see why this one should be an > exception, especially since not mentioning it reduces the length of > the documentation by exactly one word. > > I don't mind the sentence you added at the end to clarify the default > units, but the way you've rewritten the first sentence makes it, in my > opinion, much less clear. Ok. I have kept your documentation as it is and added the sentence at the end to clarify the default units. > v9 was posted on August 3rd. I told you that it wasn't working on > September 23rd. You posted a new version that still does not work on > September 27th. I think you should have tested each version of your > patch before posting it, and especially after any major refactorings. > And if for whatever reason you didn't, then certainly after I told you > on September 23rd that it didn't work, I think you should have fixed > it before posting a new version. Sorry. There was a misunderstanding about this and for the patch shared on September 27th, I had tested for the value '0' and observed that no progress messages were getting logged, probably the time at which 'enable_timeout_at' is getting called is past the 'next_timeout' value. This behaviour is completely dependent on the system. Now added an extra condition to disable the feature in case of '0' setting. > I think this comment can be worded better. It says it "decides", but it > doesn't actually decide on any action to take -- it just reports whether > the timer expired or not, to allow its caller to make the decision. In > such situations we just say something like "Report whether startup > progress has caused a timeout, return true and rearm the timer if it > did, or just return false otherwise"; and we don't indicate what the > value is going to be used *for*. Then the caller can use the boolean > return value to make a decision, such as whether something is going to > be logged. This function can be oblivious to details such as this: > > here we can just say "No timeout has occurred" and make no inference > about what's going to happen or not happen. Modified the comment. > Also, for functions that do things like this we typically use English > sentence structure with the function name starting with the verb -- > perhaps has_startup_progress_timeout_expired(). Sometimes we are lax > about this if we have some sort of poor-man's modularisation by using a > common prefix for several functions doing related things, which perhaps > could be "startup_progress_*" in your case, but your other functions are > already not doing that (such as begin_startup_progress_phase) so it's > not clear why you would not use the most natural name for this one. I agree that has_startup_progress_timeout_expired() is better than the existing function name. So I changed the function name accordingly. > Please make sure to add ereport_startup_progress() as a translation > trigger in src/backend/nls.mk. I have added ereport_startup_progress() under the section GETTEXT_TRIGGERS and GETTEXT_FLAGS in src/backend/nls.mk. Also verified the messages in src/backend/po/postgres.pot file. Kindly let me know if I have missed anything. Thanks & Regards, Nitin Jadhav On Tue, Sep 28, 2021 at 8:29 PM Robert Haas wrote: > > On Tue, Sep 28, 2021 at 8:06 AM Nitin Jadhav > wrote: > > I thought mentioning the unit in milliseconds and the example in > > seconds would confuse the user, so I changed the example to > > milliseconds.The message behind the above description looks good to me > > however I feel some sentences can be explained in less words. The > > information related to the units is missing and I feel it should be > > mentioned in the document. The example says, if the setting has the > > default value of 10 seconds, then it explains the behaviour. I feel it > > may not be the default value, it can be any value set by the user. So > > mentioning 'default' in the example does not look good to me. I feel > > we just have to mention "if this setting is set to the value of 10 > > seconds". Below is the modified version of the above information. > > It is common to mention what the default is as part of the > documentation of a GUC. I don't see why this one should be an > exception, especially since not mentioning it reduces the length of > the documentation by exactly one word. > > I don't mind the sentence you added at the end to clarify the default > units, but the way you've rewritten the first sentence makes it, in my > opinion, much less clear. > > > I had added additional code to check the value of the > > 'log_startup_progress_interval' variable and disable the feature in > > case of -1 in the earlier versions of the patch (Specifically > > v9.patch). There was a review comment for v9 patch and it resulted in > > major refactoring of the patch. > ... > > The answer for the question of "I don't understand why you posted the > > previous version of the pa
Re: when the startup process doesn't (logging startup delays)
On Tue, Sep 28, 2021 at 8:06 AM Nitin Jadhav wrote: > I thought mentioning the unit in milliseconds and the example in > seconds would confuse the user, so I changed the example to > milliseconds.The message behind the above description looks good to me > however I feel some sentences can be explained in less words. The > information related to the units is missing and I feel it should be > mentioned in the document. The example says, if the setting has the > default value of 10 seconds, then it explains the behaviour. I feel it > may not be the default value, it can be any value set by the user. So > mentioning 'default' in the example does not look good to me. I feel > we just have to mention "if this setting is set to the value of 10 > seconds". Below is the modified version of the above information. It is common to mention what the default is as part of the documentation of a GUC. I don't see why this one should be an exception, especially since not mentioning it reduces the length of the documentation by exactly one word. I don't mind the sentence you added at the end to clarify the default units, but the way you've rewritten the first sentence makes it, in my opinion, much less clear. > I had added additional code to check the value of the > 'log_startup_progress_interval' variable and disable the feature in > case of -1 in the earlier versions of the patch (Specifically > v9.patch). There was a review comment for v9 patch and it resulted in > major refactoring of the patch. ... > The answer for the question of "I don't understand why you posted the > previous version of the patch without testing that it works" is, for > the value of -1, the above description was my understanding and for > the value of 0, the older versions of the patch was behaving as > documented. But with the later versions the behaviour got changed and > I missed to modify the documentation. So I modified it in the last > version. v9 was posted on August 3rd. I told you that it wasn't working on September 23rd. You posted a new version that still does not work on September 27th. I think you should have tested each version of your patch before posting it, and especially after any major refactorings. And if for whatever reason you didn't, then certainly after I told you on September 23rd that it didn't work, I think you should have fixed it before posting a new version. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> That's really not what I'm complaining about. I think if we're going > to give an example at all, 10ms is a better example than 100ms, > because 10s is a value that people are more likely to find useful. But > I'm not sure that it's necessary to mention a specific value, and if > it is, I think it needs to be phrased in a less confusing way. > > > >>It also looks pretty silly to say that if you set the value to 10s, > > >>something > > >>will happen every 10s. What else would anyone expect to happen? > > > > @Robert: that's consistent with existing documentation, even though it might > > seem obvious and silly to us. > > > > | For example, if you set this to 250ms then all automatic vacuums and > > analyzes that run 250ms or longer will be logged > > | For example, if you set it to 250ms then all SQL statements that run > > 250ms or longer will be logged > > Fair enough, but I still don't like it much. I tried my hand at > rewriting this and came up with the attached: > > + Sets the amount of time after which the startup process will log > + a message about a long-running operation that is still in progress, > + as well as the interval between further progress messages for that > + operation. This setting is applied separately to each operation. > + For example, if syncing the data directory takes 25 seconds and > + thereafter resetting unlogged relations takes 8 seconds, and if this > + setting has the default value of 10 seconds, then a messages will be > + logged for syncing the data directory after it has been in progress > + for 10 seconds and again after it has been in progress for 20 > seconds, > + but nothing will be logged for resetting unlogged operations. > + A setting of 0 disables the feature. > > I prefer this to Nitin's version because I think it could be unclear > to someone that the value applies separately to each operation, > whereas I don't think we need to document that we can't guarantee that > the messages will be perfectly on time - that's true of every kind of > scheduled event in pretty much every computer system - or what happens > if the system clock goes backwards. Those are things we should try to > get right, as well as we can anyway, but we don't need to tell the > user that we got them right. I thought mentioning the unit in milliseconds and the example in seconds would confuse the user, so I changed the example to milliseconds.The message behind the above description looks good to me however I feel some sentences can be explained in less words. The information related to the units is missing and I feel it should be mentioned in the document. The example says, if the setting has the default value of 10 seconds, then it explains the behaviour. I feel it may not be the default value, it can be any value set by the user. So mentioning 'default' in the example does not look good to me. I feel we just have to mention "if this setting is set to the value of 10 seconds". Below is the modified version of the above information. + Sets the amount of time after every such interval the startup process + will log a message about a long-running operation that is still in + progress. This setting is applied separately to each operation. + For example, if syncing the data directory takes 25 seconds and + thereafter resetting unlogged relations takes 8 seconds, and if this + setting is set to the value of 10 seconds, then a messages will be + logged for syncing the data directory after it has been in progress + for 10 seconds and again after it has been in progress for 20 seconds, + but nothing will be logged for resetting unlogged operations. + A setting of 0 disables the feature. If this value + is specified without units, it is taken as milliseconds. > Well, I see that -1 is now disallowed, and that's good as far as it > goes, but 0 still does not actually disable the feature. I don't > understand why you posted the previous version of the patch without > testing that it works, and I even less understand why you are posting > another version without fixing the bug that I pointed out to you in > the last version. I had added additional code to check the value of the 'log_startup_progress_interval' variable and disable the feature in case of -1 in the earlier versions of the patch (Specifically v9.patch). There was a review comment for v9 patch and it resulted in major refactoring of the patch. The comment was > With these changes you'd have only 1 place in the code that needs to > care about log_startup_progress_interval, as opposed to 3 as you have > it currently, and only one place that enables the timeout, as opposed > to 2 as you have it currently. I think that would be tidier. Based on the above comment and the idea behind enabling the timer, it does not log anything if the value is set to -1. So I
Re: when the startup process doesn't (logging startup delays)
> +/* > + * Decides whether to log the startup progress or not based on whether the > + * timer is expired or not. Returns FALSE if the timer is not expired, > otherwise > + * calculates the elapsed time and sets the respective out parameters secs > and > + * usecs. Enables the timer for the next log message and returns TRUE. > + */ > +bool > +startup_progress_timeout_has_expired(long *secs, int *usecs) I think this comment can be worded better. It says it "decides", but it doesn't actually decide on any action to take -- it just reports whether the timer expired or not, to allow its caller to make the decision. In such situations we just say something like "Report whether startup progress has caused a timeout, return true and rearm the timer if it did, or just return false otherwise"; and we don't indicate what the value is going to be used *for*. Then the caller can use the boolean return value to make a decision, such as whether something is going to be logged. This function can be oblivious to details such as this: > + /* If the timeout has not occurred, then no need to log the details. */ > + if (!startup_progress_timer_expired) > + return false; here we can just say "No timeout has occurred" and make no inference about what's going to happen or not happen. Also, for functions that do things like this we typically use English sentence structure with the function name starting with the verb -- perhaps has_startup_progress_timeout_expired(). Sometimes we are lax about this if we have some sort of poor-man's modularisation by using a common prefix for several functions doing related things, which perhaps could be "startup_progress_*" in your case, but your other functions are already not doing that (such as begin_startup_progress_phase) so it's not clear why you would not use the most natural name for this one. > + ereport_startup_progress("syncing data directory (syncfs), elapsed > time: %ld.%02d s, current path: %s", > + path); Please make sure to add ereport_startup_progress() as a translation trigger in src/backend/nls.mk. -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
Re: when the startup process doesn't (logging startup delays)
On Mon, Sep 27, 2021 at 9:32 AM Justin Pryzby wrote: > >>It also looks pretty silly to say that if you set the value to 10s, > >>something > >>will happen every 10s. What else would anyone expect to happen? > > @Robert: that's consistent with existing documentation, even though it might > seem obvious and silly to us. > > | For example, if you set this to 250ms then all automatic vacuums and > analyzes that run 250ms or longer will be logged > | For example, if you set it to 250ms then all SQL statements that run 250ms > or longer will be logged Fair enough, but I still don't like it much. I tried my hand at rewriting this and came up with the attached: + Sets the amount of time after which the startup process will log + a message about a long-running operation that is still in progress, + as well as the interval between further progress messages for that + operation. This setting is applied separately to each operation. + For example, if syncing the data directory takes 25 seconds and + thereafter resetting unlogged relations takes 8 seconds, and if this + setting has the default value of 10 seconds, then a messages will be + logged for syncing the data directory after it has been in progress + for 10 seconds and again after it has been in progress for 20 seconds, + but nothing will be logged for resetting unlogged operations. + A setting of 0 disables the feature. I prefer this to Nitin's version because I think it could be unclear to someone that the value applies separately to each operation, whereas I don't think we need to document that we can't guarantee that the messages will be perfectly on time - that's true of every kind of scheduled event in pretty much every computer system - or what happens if the system clock goes backwards. Those are things we should try to get right, as well as we can anyway, but we don't need to tell the user that we got them right. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Mon, Sep 27, 2021 at 7:26 AM Nitin Jadhav wrote: > > I really don't know what to say about this. You say that the time is > > measured in milliseconds, and then immediately turn around and say > > "For example, if you set it to 10s". Now we do expect that most people > > will set it to intervals that are measured in seconds rather than > > milliseconds, but saying that setting it to a value measured in > > seconds is an example of setting it in milliseconds is not logical. > > Based on the statement "I suggest making the GUC GUC_UNIT_MS rather > than GUC_UNIT_S, but expressing the default in postgresl.conf.sample > as 10s rather than 1ms", I have used the default value in the > postgresl.conf.sample as 10s rather than 1ms. So I just used the > same value in the example too in config.sgml. If it is really getting > confusing, I will change it to 100ms in config.sgml. That's really not what I'm complaining about. I think if we're going to give an example at all, 10ms is a better example than 100ms, because 10s is a value that people are more likely to find useful. But I'm not sure that it's necessary to mention a specific value, and if it is, I think it needs to be phrased in a less confusing way. > Made changes which indicate 0 mean disabled, > 0 mean interval in > millisecond and removed -1. Well, I see that -1 is now disallowed, and that's good as far as it goes, but 0 still does not actually disable the feature. I don't understand why you posted the previous version of the patch without testing that it works, and I even less understand why you are posting another version without fixing the bug that I pointed out to you in the last version. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Mon, Sep 27, 2021 at 04:57:20PM +0530, Nitin Jadhav wrote: > > It is also not logical to define 0 as meaning that "all messages for > > such operations are logged". What does that even mean? It makes sense > > for something like log_autovacuum_min_duration, because there we are > > talking about logging one message per operation, and we could log > > messages for all operations or just some of them. Here we are talking > > about the time between one message and the next, so talking about "all > > messages" does not really seem to make a lot of sense. Experimentally, > > what 0 actually does is cause the system to spam log lines in a tight > > loop, which cannot be what anyone wants. I think you should make 0 > > mean disabled, and a positive value mean log at that interval, and > > disallow -1 altogether. > > Made changes which indicate 0 mean disabled, > 0 mean interval in > millisecond and removed -1. > > Please find the patch attached. I think you misunderstood - Robert was saying that interval=0 doesn't work, not suggesting that you write more documentation about it. Also, I agree with Robert that the documentation is too verbose. I don't think you need to talk about what happens if the clock goes backwards (It just needs to behave conveniently). Look at the other _duration statements for what they say about units. "If this value is specified without units, it is taken as milliseconds." https://www.postgresql.org/docs/14/runtime-config-logging.html log_autovacuum_min_duration log_min_duration_statement >>It also looks pretty silly to say that if you set the value to 10s, something >>will happen every 10s. What else would anyone expect to happen? @Robert: that's consistent with existing documentation, even though it might seem obvious and silly to us. | For example, if you set this to 250ms then all automatic vacuums and analyzes that run 250ms or longer will be logged | For example, if you set it to 250ms then all SQL statements that run 250ms or longer will be logged -- Justin
Re: when the startup process doesn't (logging startup delays)
> I really don't know what to say about this. You say that the time is > measured in milliseconds, and then immediately turn around and say > "For example, if you set it to 10s". Now we do expect that most people > will set it to intervals that are measured in seconds rather than > milliseconds, but saying that setting it to a value measured in > seconds is an example of setting it in milliseconds is not logical. Based on the statement "I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but expressing the default in postgresl.conf.sample as 10s rather than 1ms", I have used the default value in the postgresl.conf.sample as 10s rather than 1ms. So I just used the same value in the example too in config.sgml. If it is really getting confusing, I will change it to 100ms in config.sgml. > It also looks pretty silly to say that if you set the value to 10s, > something will happen every 10s. What else would anyone expect to > happen? You really need to give some thought to how to explain the > behavior in a way that is clear and logical but not overly wordy. Added a few lines about that. "For example, if you set it to 1000ms, then it tries to emit a log every 1000ms. If the log message is not available at every 100th millisecond, then there is a possibility of delay in emitting the log. If the delay is more than a cycle or if the system clock gets set backwards then the next attempt is done based on the last logging time, otherwise the delay gets adjusted in the next attempt." Please correct the explanation if it does not meet your expectations. > Also, please note that you've got spaces around the literals, which > does not match the surrounding style and does not render properly in > HTML. Fixed. > It is also not logical to define 0 as meaning that "all messages for > such operations are logged". What does that even mean? It makes sense > for something like log_autovacuum_min_duration, because there we are > talking about logging one message per operation, and we could log > messages for all operations or just some of them. Here we are talking > about the time between one message and the next, so talking about "all > messages" does not really seem to make a lot of sense. Experimentally, > what 0 actually does is cause the system to spam log lines in a tight > loop, which cannot be what anyone wants. I think you should make 0 > mean disabled, and a positive value mean log at that interval, and > disallow -1 altogether. Made changes which indicate 0 mean disabled, > 0 mean interval in millisecond and removed -1. Please find the patch attached. On Thu, Sep 23, 2021 at 9:44 PM Robert Haas wrote: > > On Wed, Sep 22, 2021 at 10:28 AM Nitin Jadhav > wrote: > > Thanks Justin for the detailed explanation. Done the necessary changes. > > Not really. The documentation here does not make a ton of sense: > > + Sets the time interval between each progress update of the > operations > + performed by the startup process. This produces the log messages for > + those operations which take longer than the specified > duration. The unit > + used to specify the value is milliseconds. For example, if > you set it to > + 10s , then every 10s > , a log is > + emitted indicating which operation is ongoing, and the > elapsed time from > + the beginning of the operation. If the value is set to > 0 , > + then all messages for such operations are logged. > -1 > + disables the feature. The default value is 10s > > I really don't know what to say about this. You say that the time is > measured in milliseconds, and then immediately turn around and say > "For example, if you set it to 10s". Now we do expect that most people > will set it to intervals that are measured in seconds rather than > milliseconds, but saying that setting it to a value measured in > seconds is an example of setting it in milliseconds is not logical. It > also looks pretty silly to say that if you set the value to 10s, > something will happen every 10s. What else would anyone expect to > happen? You really need to give some thought to how to explain the > behavior in a way that is clear and logical but not overly wordy. > Also, please note that you've got spaces around the literals, which > does not match the surrounding style and does not render properly in > HTML. > > It is also not logical to define 0 as meaning that "all messages for > such operations are logged". What does that even mean? It makes sense > for something like log_autovacuum_min_duration, because there we are > talking about logging one message per operation, and we could log > messages for all operations or just some of them. Here we are talking > about the time between one message and the next, so talking about "all > messages" does not really seem to make a lot of sense. Experimentally, > what 0 actually does is cause the system to spam log lines in a tight > loop, which cannot be what
Re: when the startup process doesn't (logging startup delays)
On Wed, Sep 22, 2021 at 10:28 AM Nitin Jadhav wrote: > Thanks Justin for the detailed explanation. Done the necessary changes. Not really. The documentation here does not make a ton of sense: + Sets the time interval between each progress update of the operations + performed by the startup process. This produces the log messages for + those operations which take longer than the specified duration. The unit + used to specify the value is milliseconds. For example, if you set it to + 10s , then every 10s , a log is + emitted indicating which operation is ongoing, and the elapsed time from + the beginning of the operation. If the value is set to 0 , + then all messages for such operations are logged. -1 + disables the feature. The default value is 10s I really don't know what to say about this. You say that the time is measured in milliseconds, and then immediately turn around and say "For example, if you set it to 10s". Now we do expect that most people will set it to intervals that are measured in seconds rather than milliseconds, but saying that setting it to a value measured in seconds is an example of setting it in milliseconds is not logical. It also looks pretty silly to say that if you set the value to 10s, something will happen every 10s. What else would anyone expect to happen? You really need to give some thought to how to explain the behavior in a way that is clear and logical but not overly wordy. Also, please note that you've got spaces around the literals, which does not match the surrounding style and does not render properly in HTML. It is also not logical to define 0 as meaning that "all messages for such operations are logged". What does that even mean? It makes sense for something like log_autovacuum_min_duration, because there we are talking about logging one message per operation, and we could log messages for all operations or just some of them. Here we are talking about the time between one message and the next, so talking about "all messages" does not really seem to make a lot of sense. Experimentally, what 0 actually does is cause the system to spam log lines in a tight loop, which cannot be what anyone wants. I think you should make 0 mean disabled, and a positive value mean log at that interval, and disallow -1 altogether. And on that note, I tested your patch with log_startup_progress_interval=-1 and found that -1 behaves just like 0. In other words, contrary to what the documentation says, -1 does not disable the feature. It instead behaves just like 0. It's really confusing to me how you write documentation that says -1 has a certain behavior without thinking about the fact that you haven't written any code that would make -1 behave that way. And apparently you didn't test it, either. It took me approximately 1 minute of testing to find that this is broken, which really is not very much. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> Michael is right. You updated some of the units based on Robert's suggestion > to use MS, but didn't update all of the corresponding parts of the patch. > guc.c says that the units are in MS, which means that unqualified values are > interpretted as such. But postgresql.conf.sample still says "seconds", and > guc.c says the default value is "10", and you still do: > > + interval_in_ms = log_startup_progress_interval * 1000; > > I checked that this currently does not interpret the value as ms: > |./tmp_install/usr/local/pgsql/bin/postgres -D > src/test/regress/tmp_check/data/ -c log_startup_progress_interval=1 > |2021-09-07 06:28:58.694 CDT startup[18865] LOG: redo in progress, elapsed > time: 1.00 s, current LSN: 0/E94ED88 > |2021-09-07 06:28:59.694 CDT startup[18865] LOG: redo in progress, elapsed > time: 2.00 s, current LSN: 0/10808EE0 > |2021-09-07 06:29:00.694 CDT startup[18865] LOG: redo in progress, elapsed > time: 3.00 s, current LSN: 0/126B8C80 > > (Also, the GUC value is in the range 0..INT_MAX, so multiplying and storing to > another int could overflow.) > > I think the convention is to for GUC global vars to be initialized with the > same default as in guc.c, so both should be 1, like: > > +int log_startup_progress_interval = 10 * 1000 /* 10sec */ Thanks Justin for the detailed explanation. Done the necessary changes. Please find the updated patch attached. On Mon, Sep 13, 2021 at 8:32 PM Nitin Jadhav wrote: > > > I think you're confusing discussions. > > > > Robert was talking about initdb/bootstrap/single, and I separately and > > independently asked about hot standbys. It seems like Robert and I agreed > > about the desired behavior and there was no further discussion. > > Sorry for including 2 separate points into one. > > > Looking over this version, I realized something I (or you) should have > > noticed sooner: you've added the RegisterTimeout call to > > InitPostgres(), but that's for things that are used by all backends, > > and this is only used by the startup process. So it seems to me that > > the right place is StartupProcessMain. That would have the further > > advantage of allowing startup_progress_timeout_handler to be made > > static. begin_startup_progress_phase() and > > startup_progress_timeout_has_expired() are the actual API functions > > though so they will need to remain extern. > > The earlier discussion wrt this point is as follows. > > > > I also agree that this is the better place to do it. Hence modified > > > the patch accordingly. The condition "!AmStartupProcess()" is added to > > > differentiate whether the call is done from a startup process or some > > > other process. Actually StartupXLOG() gets called in 2 places. one in > > > StartupProcessMain() and the other in InitPostgres(). As the logging > > > of the startup progress is required only during the startup process > > > and not in the other cases, > > > > The InitPostgres() case occurs when the server is started in bootstrap > > mode (during initdb) or in single-user mode (postgres --single). I do > > not see any reason why we shouldn't produce progress messages in at > > least the latter case. I suspect that someone who is in the rather > > desperate scenario of having to use single-user mode would really like > > to know how long the server is going to take to start up. > > > > Perhaps during initdb we don't want messages, but I'm not sure that we > > need to do anything about that here. None of the messages that the > > server normally produces show up when you run initdb, so I guess they > > are getting redirected to /dev/null or something. > > > > So I don't think that using AmStartupProcess() for this purpose is right. > > This point is really confusing. As per the earlier discussion we > concluded to include RegisterTimeout() call even in case of > InitPostgres() to support logging in case of single-user mode. Now if > we remove the RegisterTimeout() call from InitPostgres(), we are not > going to support that anymore. Is this what you're trying to convey? > or we should add some checks and disable the code to RegisterTimeout() > if it is other than single-user mode. I have added a check if > (!IsPostmasterEnvironment) in the attached patch for this scenario. > Kindly confirm my understanding. > > > > Should this feature distinguish between crash recovery and archive > > > recovery on > > > a hot standby ? Otherwise the standby will display this all the time. > > > > > >2021-08-14 16:13:33.139 CDT startup[11741] LOG: redo in progress, elapsed > > >time: 124.42 s, current LSN: 0/EEE2100 > > > > > >If so, I think maybe you'd check !InArchiveRecovery (but until Robert > > >finishes > > > cleanup of xlog.c variables, I can't say that with much confidence). > > > > Hmm. My inclination is to think that on an actual standby, you > > wouldn't want to get messages like this, but if you were doing a > > point-in-time-recovery, then you would. So I think maybe > > InArchiveReco
Re: when the startup process doesn't (logging startup delays)
> I think you're confusing discussions. > > Robert was talking about initdb/bootstrap/single, and I separately and > independently asked about hot standbys. It seems like Robert and I agreed > about the desired behavior and there was no further discussion. Sorry for including 2 separate points into one. > Looking over this version, I realized something I (or you) should have > noticed sooner: you've added the RegisterTimeout call to > InitPostgres(), but that's for things that are used by all backends, > and this is only used by the startup process. So it seems to me that > the right place is StartupProcessMain. That would have the further > advantage of allowing startup_progress_timeout_handler to be made > static. begin_startup_progress_phase() and > startup_progress_timeout_has_expired() are the actual API functions > though so they will need to remain extern. The earlier discussion wrt this point is as follows. > > I also agree that this is the better place to do it. Hence modified > > the patch accordingly. The condition "!AmStartupProcess()" is added to > > differentiate whether the call is done from a startup process or some > > other process. Actually StartupXLOG() gets called in 2 places. one in > > StartupProcessMain() and the other in InitPostgres(). As the logging > > of the startup progress is required only during the startup process > > and not in the other cases, > > The InitPostgres() case occurs when the server is started in bootstrap > mode (during initdb) or in single-user mode (postgres --single). I do > not see any reason why we shouldn't produce progress messages in at > least the latter case. I suspect that someone who is in the rather > desperate scenario of having to use single-user mode would really like > to know how long the server is going to take to start up. > > Perhaps during initdb we don't want messages, but I'm not sure that we > need to do anything about that here. None of the messages that the > server normally produces show up when you run initdb, so I guess they > are getting redirected to /dev/null or something. > > So I don't think that using AmStartupProcess() for this purpose is right. This point is really confusing. As per the earlier discussion we concluded to include RegisterTimeout() call even in case of InitPostgres() to support logging in case of single-user mode. Now if we remove the RegisterTimeout() call from InitPostgres(), we are not going to support that anymore. Is this what you're trying to convey? or we should add some checks and disable the code to RegisterTimeout() if it is other than single-user mode. I have added a check if (!IsPostmasterEnvironment) in the attached patch for this scenario. Kindly confirm my understanding. > > Should this feature distinguish between crash recovery and archive recovery > > on > > a hot standby ? Otherwise the standby will display this all the time. > > > >2021-08-14 16:13:33.139 CDT startup[11741] LOG: redo in progress, elapsed > >time: 124.42 s, current LSN: 0/EEE2100 > > > >If so, I think maybe you'd check !InArchiveRecovery (but until Robert > >finishes > > cleanup of xlog.c variables, I can't say that with much confidence). > > Hmm. My inclination is to think that on an actual standby, you > wouldn't want to get messages like this, but if you were doing a > point-in-time-recovery, then you would. So I think maybe > InArchiveRecovery is not the right thing. Perhaps StandbyMode? I also feel that the log messages should be recorded in case of point-in-time-recovery. So I have added a check if (!StandbyMode) and verified the replication and point-in-time-recovery scenario. > > Some doc comments: > > Thanks for the suggestions. I will take care in the next patch. Fixed. > Michael is right. You updated some of the units based on Robert's suggestion > to use MS, but didn't update all of the corresponding parts of the patch. > guc.c says that the units are in MS, which means that unqualified values are > interpretted as such. But postgresql.conf.sample still says "seconds", and > guc.c says the default value is "10", and you still do: > > + interval_in_ms = log_startup_progress_interval * 1000; > > I checked that this currently does not interpret the value as ms: > |./tmp_install/usr/local/pgsql/bin/postgres -D > src/test/regress/tmp_check/data/ -c log_startup_progress_interval=1 > |2021-09-07 06:28:58.694 CDT startup[18865] LOG: redo in progress, elapsed > time: 1.00 s, current LSN: 0/E94ED88 > |2021-09-07 06:28:59.694 CDT startup[18865] LOG: redo in progress, elapsed > time: 2.00 s, current LSN: 0/10808EE0 > |2021-09-07 06:29:00.694 CDT startup[18865] LOG: redo in progress, elapsed > time: 3.00 s, current LSN: 0/126B8C80 > > (Also, the GUC value is in the range 0..INT_MAX, so multiplying and storing to > another int could overflow.) > > I think the convention is to for GUC global vars to be initialized with the > same default as in guc.c, so both should be 1, like: > > +int log_st
Re: when the startup process doesn't (logging startup delays)
On Tue, Sep 07, 2021 at 03:07:15PM +0530, Nitin Jadhav wrote: > > Looking over this version, I realized something I (or you) should have > > noticed sooner: you've added the RegisterTimeout call to > > InitPostgres(), but that's for things that are used by all backends, > > and this is only used by the startup process. So it seems to me that > > the right place is StartupProcessMain. That would have the further > > advantage of allowing startup_progress_timeout_handler to be made > > static. begin_startup_progress_phase() and > > startup_progress_timeout_has_expired() are the actual API functions > > though so they will need to remain extern. > > > > I agree with Robert that a standby server should not continuously show > > timing > > messages for WAL replay. > > The earlier discussion wrt this point is as follows. I think you're confusing discussions. Robert was talking about initdb/bootstrap/single, and I separately and independently asked about hot standbys. It seems like Robert and I agreed about the desired behavior and there was no further discussion. > > Honestly, I don't see why we should have > > a GUC for what's proposed here. A value too low would bloat the logs > > with entries that are not that meaningful. A value too large would > > just prevent access to some information wanted. Wouldn't it be better > > to just pick up a value like 10s or 20s? I don't think bloating logs is a issue for values > 10sec. You agreed that it's important to choose the "right" value, but I think that will vary between users. Some installations with large checkpoint_timeout might anticipate taking 15+min to perform recovery, but others might even have a strict requirement that recovery must not take more than (say) 10sec; someone might want to use this to verify that, or to optimize the slow parts of recovery, with an interval that someone else might not care about. > > + {"log_startup_progress_interval", PGC_SIGHUP, LOGGING_WHEN, > > + gettext_noop("Sets the time interval between each progress > > update " > > +"of the startup process."), > > + gettext_noop("0 logs all messages. -1 turns this feature off."), > > + GUC_UNIT_MS, |+ 10, -1, INT_MAX, > > The unit is incorrect here, as that would default to 10ms, contrary to > > what the documentation says about 10s. > > Kindly refer the previous few discussions wrt this point. You copied and pasted unrelated emails, which isn't helpful. Michael is right. You updated some of the units based on Robert's suggestion to use MS, but didn't update all of the corresponding parts of the patch. guc.c says that the units are in MS, which means that unqualified values are interpretted as such. But postgresql.conf.sample still says "seconds", and guc.c says the default value is "10", and you still do: + interval_in_ms = log_startup_progress_interval * 1000; I checked that this currently does not interpret the value as ms: |./tmp_install/usr/local/pgsql/bin/postgres -D src/test/regress/tmp_check/data/ -c log_startup_progress_interval=1 |2021-09-07 06:28:58.694 CDT startup[18865] LOG: redo in progress, elapsed time: 1.00 s, current LSN: 0/E94ED88 |2021-09-07 06:28:59.694 CDT startup[18865] LOG: redo in progress, elapsed time: 2.00 s, current LSN: 0/10808EE0 |2021-09-07 06:29:00.694 CDT startup[18865] LOG: redo in progress, elapsed time: 3.00 s, current LSN: 0/126B8C80 (Also, the GUC value is in the range 0..INT_MAX, so multiplying and storing to another int could overflow.) I think the convention is to for GUC global vars to be initialized with the same default as in guc.c, so both should be 1, like: +int log_startup_progress_interval = 10 * 1000 /* 10sec */ -- Justin
Re: when the startup process doesn't (logging startup delays)
> Looking over this version, I realized something I (or you) should have > noticed sooner: you've added the RegisterTimeout call to > InitPostgres(), but that's for things that are used by all backends, > and this is only used by the startup process. So it seems to me that > the right place is StartupProcessMain. That would have the further > advantage of allowing startup_progress_timeout_handler to be made > static. begin_startup_progress_phase() and > startup_progress_timeout_has_expired() are the actual API functions > though so they will need to remain extern. > > I agree with Robert that a standby server should not continuously show timing > messages for WAL replay. The earlier discussion wrt this point is as follows. > > I also agree that this is the better place to do it. Hence modified > > the patch accordingly. The condition "!AmStartupProcess()" is added to > > differentiate whether the call is done from a startup process or some > > other process. Actually StartupXLOG() gets called in 2 places. one in > > StartupProcessMain() and the other in InitPostgres(). As the logging > > of the startup progress is required only during the startup process > > and not in the other cases, > > The InitPostgres() case occurs when the server is started in bootstrap > mode (during initdb) or in single-user mode (postgres --single). I do > not see any reason why we shouldn't produce progress messages in at > least the latter case. I suspect that someone who is in the rather > desperate scenario of having to use single-user mode would really like > to know how long the server is going to take to start up. > > Perhaps during initdb we don't want messages, but I'm not sure that we > need to do anything about that here. None of the messages that the > server normally produces show up when you run initdb, so I guess they > are getting redirected to /dev/null or something. > > So I don't think that using AmStartupProcess() for this purpose is right. So as per the recent discussion, RegisterTimeout call should be removed from InitPostgres() and the condition "!AmStartupProcess()" is to be added in begin_startup_progress_phase() and ereport_startup_progress() to differentiate whether the call is from a startup process or some other process. Kindly correct me if I am wrong. > Some doc comments: Thanks for the suggestions. I will take care in the next patch. > Clearly. This should be limited to crash recovery, and maybe there > could be some checks to make sure that nothing is logged once a > consistent point is reached. The purpose here is to show the progress of the operation if it is taking longer than the interval set by the user until it completes the operation. Users should know what operation is happening in the background and to show the progress, displaying the elapsed time. So according to me the consistent point is nothing but the end of the operation. Kindly let me know if you have something in mind and that could be the better consistent point. > Honestly, I don't see why we should have > a GUC for what's proposed here. A value too low would bloat the logs > with entries that are not that meaningful. A value too large would > just prevent access to some information wanted. Wouldn't it be better > to just pick up a value like 10s or 20s? It is difficult to finalise the value and use that value without providing an option to change. If we choose one value (say 10s), it may be too less for some users or too large for some other users. So I feel it is better to provide an option to users so that they can choose the value according to their need. Anyway the default value set for this setting is 10s. > + {"log_startup_progress_interval", PGC_SIGHUP, LOGGING_WHEN, > + gettext_noop("Sets the time interval between each progress update > " > +"of the startup process."), > + gettext_noop("0 logs all messages. -1 turns this feature off."), > + GUC_UNIT_MS, > The unit is incorrect here, as that would default to 10ms, contrary to > what the documentation says about 10s. Kindly refer the previous few discussions wrt this point. On Tue, Sep 7, 2021 at 10:58 AM Michael Paquier wrote: > > On Fri, Sep 03, 2021 at 09:23:27PM -0500, Justin Pryzby wrote: > > On Fri, Sep 03, 2021 at 01:23:56PM +0530, Nitin Jadhav wrote: > > > Please find the updated patch attached. > > > > Please check > > CA+TgmoZtbqxaOLdpNkBcDbz=41tWALA8kpH4M=rwtpyhc7-...@mail.gmail.com > > > > I agree with Robert that a standby server should not continuously show > > timing > > messages for WAL replay. > > Clearly. This should be limited to crash recovery, and maybe there > could be some checks to make sure that nothing is logged once a > consistent point is reached. Honestly, I don't see why we should have > a GUC for what's proposed here. A value too low would bloat the logs > with entries that are not that meaningful. A value too large would > just prevent access to some informa
Re: when the startup process doesn't (logging startup delays)
> I also agree that this is the better place to do it. Hence modified > the patch accordingly. The condition "!AmStartupProcess()" is added to > differentiate whether the call is done from a startup process or some > other process. Actually StartupXLOG() gets called in 2 places. one in > StartupProcessMain() and the other in InitPostgres(). As the logging > of the startup progress is required only during the startup process > and not in the other cases, The InitPostgres() case occurs when the server is started in bootstrap mode (during initdb) or in single-user mode (postgres --single). I do not see any reason why we shouldn't produce progress messages in at least the latter case. I suspect that someone who is in the rather desperate scenario of having to use single-user mode would really like to know how long the server is going to take to start up. Perhaps during initdb we don't want messages, but I'm not sure that we need to do anything about that here. None of the messages that the server normally produces show up when you run initdb, so I guess they are getting redirected to /dev/null or something. So I don't think that using AmStartupProcess() for this purpose is right. On Tue, Sep 7, 2021 at 10:58 AM Michael Paquier wrote: > > On Fri, Sep 03, 2021 at 09:23:27PM -0500, Justin Pryzby wrote: > > On Fri, Sep 03, 2021 at 01:23:56PM +0530, Nitin Jadhav wrote: > > > Please find the updated patch attached. > > > > Please check > > CA+TgmoZtbqxaOLdpNkBcDbz=41tWALA8kpH4M=rwtpyhc7-...@mail.gmail.com > > > > I agree with Robert that a standby server should not continuously show > > timing > > messages for WAL replay. > > Clearly. This should be limited to crash recovery, and maybe there > could be some checks to make sure that nothing is logged once a > consistent point is reached. Honestly, I don't see why we should have > a GUC for what's proposed here. A value too low would bloat the logs > with entries that are not that meaningful. A value too large would > just prevent access to some information wanted. Wouldn't it be better > to just pick up a value like 10s or 20s? > > Looking at v13.. > > + {"log_startup_progress_interval", PGC_SIGHUP, LOGGING_WHEN, > + gettext_noop("Sets the time interval between each progress update > " > +"of the startup process."), > + gettext_noop("0 logs all messages. -1 turns this feature off."), > + GUC_UNIT_MS, > The unit is incorrect here, as that would default to 10ms, contrary to > what the documentation says about 10s. > -- > Michael
Re: when the startup process doesn't (logging startup delays)
On Fri, Sep 03, 2021 at 09:23:27PM -0500, Justin Pryzby wrote: > On Fri, Sep 03, 2021 at 01:23:56PM +0530, Nitin Jadhav wrote: > > Please find the updated patch attached. > > Please check > CA+TgmoZtbqxaOLdpNkBcDbz=41tWALA8kpH4M=rwtpyhc7-...@mail.gmail.com > > I agree with Robert that a standby server should not continuously show timing > messages for WAL replay. Clearly. This should be limited to crash recovery, and maybe there could be some checks to make sure that nothing is logged once a consistent point is reached. Honestly, I don't see why we should have a GUC for what's proposed here. A value too low would bloat the logs with entries that are not that meaningful. A value too large would just prevent access to some information wanted. Wouldn't it be better to just pick up a value like 10s or 20s? Looking at v13.. + {"log_startup_progress_interval", PGC_SIGHUP, LOGGING_WHEN, + gettext_noop("Sets the time interval between each progress update " +"of the startup process."), + gettext_noop("0 logs all messages. -1 turns this feature off."), + GUC_UNIT_MS, The unit is incorrect here, as that would default to 10ms, contrary to what the documentation says about 10s. -- Michael signature.asc Description: PGP signature
Re: when the startup process doesn't (logging startup delays)
On Fri, Sep 03, 2021 at 01:23:56PM +0530, Nitin Jadhav wrote: > Please find the updated patch attached. Please check CA+TgmoZtbqxaOLdpNkBcDbz=41tWALA8kpH4M=rwtpyhc7-...@mail.gmail.com I agree with Robert that a standby server should not continuously show timing messages for WAL replay. Some doc comments: + Sets the time interval between each progress update of the operations + performed during startup process. This produces the log message after Either say "performed by the startup process" or "performed during startup". s/the/a/ + every interval of time for the operations that take longer time. The unit ..for those operations which take longer than the specified duration. + used to specify the value is seconds. For example, if you set it to + 10s , then after every 10s there remove "after" + is a log message indicating which operation is going on and what is the say "..every 10s, a log is emitted indicating which operation is ongoing, and the elapsed time from the beginning of the operation.." + elapsed time from beginning. If the value is set to 0 , + then it logs all the available messages for such operations. -1 "..then all messages for such operations are logged." + disables the feature. The default value is set to 10s + "The default value is >10s<."
Re: when the startup process doesn't (logging startup delays)
> > Anything that's not used in other files should be declared static in > > the file itself, and not present in the header. Once you fix this for > > reset_startup_progress_timeout, the header won't need to include > > datatype/timestamp.h any more, which is good, because we don't want > > header files to depend on more other header files than necessary. > > Thanks for identifying this. I will take care in the next patch. Fixed. > > This hunk should be removed. > > I will remove it in the next patch. Removed. Please find the updated patch attached. On Wed, Aug 18, 2021 at 12:23 PM Nitin Jadhav wrote: > > > Anything that's not used in other files should be declared static in > > the file itself, and not present in the header. Once you fix this for > > reset_startup_progress_timeout, the header won't need to include > > datatype/timestamp.h any more, which is good, because we don't want > > header files to depend on more other header files than necessary. > > Thanks for identifying this. I will take care in the next patch. > > > Looking over this version, I realized something I (or you) should have > > noticed sooner: you've added the RegisterTimeout call to > > InitPostgres(), but that's for things that are used by all backends, > > and this is only used by the startup process. So it seems to me that > > the right place is StartupProcessMain. That would have the further > > advantage of allowing startup_progress_timeout_handler to be made > > static. begin_startup_progress_phase() and > > startup_progress_timeout_has_expired() are the actual API functions > > though so they will need to remain extern. > > Yes. I had noticed this earlier and the RegisterTimeout() call was > only present in StartupProcessMain() and not in InitPostgres() in the > earlier versions (v7) of the patch. Since StartupXLOG() gets called in > the 2 places, I had restricted the InitPostgres() flow by checking for > the !AmStartupProcess() in the newly added functions. But later we had > discussion and concluded to add the RegisterTimeout() call even in > case of InitPostgres(). Kindly refer to the discussion just after the > v7 patch in this thread and let me know your thoughts. > > > This hunk should be removed. > > I will remove it in the next patch. > > On Tue, Aug 17, 2021 at 1:08 AM Robert Haas wrote: > > > > On Sat, Aug 14, 2021 at 5:47 PM Justin Pryzby wrote: > > > Should this feature distinguish between crash recovery and archive > > > recovery on > > > a hot standby ? Otherwise the standby will display this all the time. > > > > > > 2021-08-14 16:13:33.139 CDT startup[11741] LOG: redo in progress, > > > elapsed time: 124.42 s, current LSN: 0/EEE2100 > > > > > > If so, I think maybe you'd check !InArchiveRecovery (but until Robert > > > finishes > > > cleanup of xlog.c variables, I can't say that with much confidence). > > > > Hmm. My inclination is to think that on an actual standby, you > > wouldn't want to get messages like this, but if you were doing a > > point-in-time-recovery, then you would. So I think maybe > > InArchiveRecovery is not the right thing. Perhaps StandbyMode? > > > > -- > > Robert Haas > > EDB: http://www.enterprisedb.com v13-0001-startup-process-progress.patch Description: Binary data
Re: when the startup process doesn't (logging startup delays)
> Anything that's not used in other files should be declared static in > the file itself, and not present in the header. Once you fix this for > reset_startup_progress_timeout, the header won't need to include > datatype/timestamp.h any more, which is good, because we don't want > header files to depend on more other header files than necessary. Thanks for identifying this. I will take care in the next patch. > Looking over this version, I realized something I (or you) should have > noticed sooner: you've added the RegisterTimeout call to > InitPostgres(), but that's for things that are used by all backends, > and this is only used by the startup process. So it seems to me that > the right place is StartupProcessMain. That would have the further > advantage of allowing startup_progress_timeout_handler to be made > static. begin_startup_progress_phase() and > startup_progress_timeout_has_expired() are the actual API functions > though so they will need to remain extern. Yes. I had noticed this earlier and the RegisterTimeout() call was only present in StartupProcessMain() and not in InitPostgres() in the earlier versions (v7) of the patch. Since StartupXLOG() gets called in the 2 places, I had restricted the InitPostgres() flow by checking for the !AmStartupProcess() in the newly added functions. But later we had discussion and concluded to add the RegisterTimeout() call even in case of InitPostgres(). Kindly refer to the discussion just after the v7 patch in this thread and let me know your thoughts. > This hunk should be removed. I will remove it in the next patch. On Tue, Aug 17, 2021 at 1:08 AM Robert Haas wrote: > > On Sat, Aug 14, 2021 at 5:47 PM Justin Pryzby wrote: > > Should this feature distinguish between crash recovery and archive recovery > > on > > a hot standby ? Otherwise the standby will display this all the time. > > > > 2021-08-14 16:13:33.139 CDT startup[11741] LOG: redo in progress, elapsed > > time: 124.42 s, current LSN: 0/EEE2100 > > > > If so, I think maybe you'd check !InArchiveRecovery (but until Robert > > finishes > > cleanup of xlog.c variables, I can't say that with much confidence). > > Hmm. My inclination is to think that on an actual standby, you > wouldn't want to get messages like this, but if you were doing a > point-in-time-recovery, then you would. So I think maybe > InArchiveRecovery is not the right thing. Perhaps StandbyMode? > > -- > Robert Haas > EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Sat, Aug 14, 2021 at 5:47 PM Justin Pryzby wrote: > Should this feature distinguish between crash recovery and archive recovery on > a hot standby ? Otherwise the standby will display this all the time. > > 2021-08-14 16:13:33.139 CDT startup[11741] LOG: redo in progress, elapsed > time: 124.42 s, current LSN: 0/EEE2100 > > If so, I think maybe you'd check !InArchiveRecovery (but until Robert finishes > cleanup of xlog.c variables, I can't say that with much confidence). Hmm. My inclination is to think that on an actual standby, you wouldn't want to get messages like this, but if you were doing a point-in-time-recovery, then you would. So I think maybe InArchiveRecovery is not the right thing. Perhaps StandbyMode? -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
Should this feature distinguish between crash recovery and archive recovery on a hot standby ? Otherwise the standby will display this all the time. 2021-08-14 16:13:33.139 CDT startup[11741] LOG: redo in progress, elapsed time: 124.42 s, current LSN: 0/EEE2100 If so, I think maybe you'd check !InArchiveRecovery (but until Robert finishes cleanup of xlog.c variables, I can't say that with much confidence).
Re: when the startup process doesn't (logging startup delays)
On Thu, Aug 12, 2021 at 7:40 AM Nitin Jadhav wrote: > > I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but > > expressing the default in postgresl.conf.sample as 10s rather than > > 1ms. I tried values measured in milliseconds just for testing > > purposes and didn't initially understand why it wasn't working. I > > don't think there's any reason it can't work. > > As suggested, I have changed it to GUC_UNIT_MS and kept the default > value to 10s. I would like to know the reason why it can't be > GUC_UNIT_S as we are expressing the values in terms of seconds. I mean, it *could* be. There's just no advantage. Values in seconds will work correctly either way. But values in milliseconds will only work if it's GUC_UNIT_MS. It seems to me that it's better to make more things work rather than fewer. > > There's no precedent in the tree for the use of ##__VA_ARGS__. On my > > system it seems to work fine if I just leave out the ##. Any reason > > not to do that? > > I had added this to support if no variable argument are passed to the > macro. For example, in the previous patches we used to log the > progress at the end of the operation like > "ereport_startup_progress(true, "data directory sync (syncfs) complete > after %ld.%02d s");". Since these calls are removed now, ## is not > required. Hence removed in the attached patch. Hmm, OK. That's actually a pretty good reason for using ## there. It just made me nervous because we have no similar uses of ## in the backend code. We rely on it elsewhere for concatenation, but not for comma removal. Let's try leaving it out for now unless somebody else shows up with a different opinion. > I had not added extern since those function were not used in the other > files. Now added to match the project style. Anything that's not used in other files should be declared static in the file itself, and not present in the header. Once you fix this for reset_startup_progress_timeout, the header won't need to include datatype/timestamp.h any more, which is good, because we don't want header files to depend on more other header files than necessary. Looking over this version, I realized something I (or you) should have noticed sooner: you've added the RegisterTimeout call to InitPostgres(), but that's for things that are used by all backends, and this is only used by the startup process. So it seems to me that the right place is StartupProcessMain. That would have the further advantage of allowing startup_progress_timeout_handler to be made static. begin_startup_progress_phase() and startup_progress_timeout_has_expired() are the actual API functions though so they will need to remain extern. @@ -679,7 +680,6 @@ static char *recovery_target_lsn_string; /* should be static, but commands/variable.c needs to get at this */ char *role_string; - /* * Displayable names for context types (enum GucContext) * This hunk should be removed. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> startup_progress_timer_expired should be declared as sig_atomic_t like > we do in other cases (see interrupt.c). Fixed. > The default value of the new GUC is 10s in postgresql.conf.sample, but > -1 in guc.c. They should both be 10s, and the one in > postgresql.conf.sample should be commented out. Fixed. > I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but > expressing the default in postgresl.conf.sample as 10s rather than > 1ms. I tried values measured in milliseconds just for testing > purposes and didn't initially understand why it wasn't working. I > don't think there's any reason it can't work. As suggested, I have changed it to GUC_UNIT_MS and kept the default value to 10s. I would like to know the reason why it can't be GUC_UNIT_S as we are expressing the values in terms of seconds. > I would prefer to see log_startup_progress_interval declared and > defined in startup.c/startup.h rather than guc.c/guc.h. Fixed. > There's no precedent in the tree for the use of ##__VA_ARGS__. On my > system it seems to work fine if I just leave out the ##. Any reason > not to do that? I had added this to support if no variable argument are passed to the macro. For example, in the previous patches we used to log the progress at the end of the operation like "ereport_startup_progress(true, "data directory sync (syncfs) complete after %ld.%02d s");". Since these calls are removed now, ## is not required. Hence removed in the attached patch. > Two of the declarations in startup.h forgot the leading "extern", > while the other two that are right next to them have it, matching > project style. I had not added extern since those function were not used in the other files. Now added to match the project style. > I'm reasonably happy with most of the identifier names now, but I > think init_startup_progress() is confusing. The reason I think that is > that we call it more than once, which is not really what people think > about when they think of an "init" function, I think. It's not > initializing the startup progress facility in general; it's preparing > for the next phase of startup progress reporting. How about renaming > it to begin_startup_progress_phase()? And then > startup_process_op_start_time could be > startup_process_phase_start_time to match. Yes begin_startup_progress_phase() looks more appropriate. Instead of startup_process_phase_start_time, startup_progress_phase_start_time looks more appropriate. Changed these in the attached patch. > SyncDataDirectory() potentially walks over the data directory three > times: once to call do_syncfs(), once to call pre_sync_fname(), and > once to call datadir_fsync_fname(). With this patch, the first and > third will emit progress messages if the operation runs long, but the > second will not. I think they should all be treated the same. Also, > the locations where you've inserted calls to init_startup_progress() > don't really make it clear with what code that's associated. I'd put > them *immediately* before the call to do_syncfs() or walkdir(). Fixed. > Remember that PostgreSQL comments are typically written "telegraph > style," so function comments should say "Does whatever" not "It does > whatever". Most of them are correct, but there's one sentence you need > to fix. Fixed in the function comments of startup_progress_timeout_has_expired(). Please let me now if this is not the one you wanted me to correct. > You mentioned previously that you would add documentation, but I do > not see it here. Sorry. I missed this. I have added the documentation in the attached patch. On Tue, Aug 10, 2021 at 8:56 PM Robert Haas wrote: > > On Tue, Aug 10, 2021 at 9:28 AM Nitin Jadhav > wrote: > > Please find the updated patch attached. > > I think this is getting close. The output looks nice. However, I still > see a bunch of issues. > > You mentioned previously that you would add documentation, but I do > not see it here. > > startup_progress_timer_expired should be declared as sig_atomic_t like > we do in other cases (see interrupt.c). > > The default value of the new GUC is 10s in postgresql.conf.sample, but > -1 in guc.c. They should both be 10s, and the one in > postgresql.conf.sample should be commented out. > > I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but > expressing the default in postgresl.conf.sample as 10s rather than > 1ms. I tried values measured in milliseconds just for testing > purposes and didn't initially understand why it wasn't working. I > don't think there's any reason it can't work. > > I would prefer to see log_startup_progress_interval declared and > defined in startup.c/startup.h rather than guc.c/guc.h. > > There's no precedent in the tree for the use of ##__VA_ARGS__. On my > system it seems to work fine if I just leave out the ##. Any reason > not to do that? > > Two of the declarations in startup.h forgot the leading "extern", > while the other two that are right next to them have it, matchin
Re: when the startup process doesn't (logging startup delays)
On Tue, Aug 10, 2021 at 9:28 AM Nitin Jadhav wrote: > Please find the updated patch attached. I think this is getting close. The output looks nice. However, I still see a bunch of issues. You mentioned previously that you would add documentation, but I do not see it here. startup_progress_timer_expired should be declared as sig_atomic_t like we do in other cases (see interrupt.c). The default value of the new GUC is 10s in postgresql.conf.sample, but -1 in guc.c. They should both be 10s, and the one in postgresql.conf.sample should be commented out. I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but expressing the default in postgresl.conf.sample as 10s rather than 1ms. I tried values measured in milliseconds just for testing purposes and didn't initially understand why it wasn't working. I don't think there's any reason it can't work. I would prefer to see log_startup_progress_interval declared and defined in startup.c/startup.h rather than guc.c/guc.h. There's no precedent in the tree for the use of ##__VA_ARGS__. On my system it seems to work fine if I just leave out the ##. Any reason not to do that? Two of the declarations in startup.h forgot the leading "extern", while the other two that are right next to them have it, matching project style. I'm reasonably happy with most of the identifier names now, but I think init_startup_progress() is confusing. The reason I think that is that we call it more than once, which is not really what people think about when they think of an "init" function, I think. It's not initializing the startup progress facility in general; it's preparing for the next phase of startup progress reporting. How about renaming it to begin_startup_progress_phase()? And then startup_process_op_start_time could be startup_process_phase_start_time to match. SyncDataDirectory() potentially walks over the data directory three times: once to call do_syncfs(), once to call pre_sync_fname(), and once to call datadir_fsync_fname(). With this patch, the first and third will emit progress messages if the operation runs long, but the second will not. I think they should all be treated the same. Also, the locations where you've inserted calls to init_startup_progress() don't really make it clear with what code that's associated. I'd put them *immediately* before the call to do_syncfs() or walkdir(). Remember that PostgreSQL comments are typically written "telegraph style," so function comments should say "Does whatever" not "It does whatever". Most of them are correct, but there's one sentence you need to fix. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> I'd really like to avoid this. I don't see why it's necessary. You say > it causes a problem, but you don't explain what problem it causes. > enable_timeout_at() will disable the timer if not already done. I > think all we need to do is set scheduled_startup_progress_timeout = 0 > before calling reset_startup_progress_timeout() in the "init" case and > don't do that for the non-init case. If that's not quite right, maybe > you can work out something that does work. But adding an is_init flag > to a function and having no common code between the is_init = true > case and the is_init = false case is exactly the kind of thing that I > don't want here. I want as much common code as possible. Setting set scheduled_startup_progress_timeout = 0 in the "init" case solves the problem. The problem was if we call init_start_progress() continuously then the first call to reset_startup_progress_timeout() sets the value of scheduled_startup_progress_timeout to "now + interval". Later call to reset_startup_progress_timeout() uses the previously set value of scheduled_startup_progress_timeout which was not correct and it was not behaving as expected. I could see that the first log gets printed far later than the expected interval. > To some degree, we tend to use names_like_this() for low-level > functions and NamesLikeThis() for higher-level functions, but that is > not a very consistent practice. Ok. Thanks for the information. > But this strategy cannot be used if the drift is larger than the > interval. If we are trying to log a message every 1000ms and the timer > doesn't fire for 14ms, we can wait only 986ms the next time. If it > doesn't fire for 140ms, we can wait only 860ms the next time. But if > the timer doesn't fire for 1400ms, we cannot wait for -400ms the next > time. So what should we do? My proposal is to just wait for the > configured interval, 1000ms, essentially giving up on drift > correction. Now you could argue that we ought to just wait for 600ms > in the hopes of making it 2 * 1000ms after the previous status > message, but I'm not sure that really has any value, and it doesn't > seem especially likely to work. The only way timer interrupts are > likely to be that badly delayed is if the system is horrifically > overloaded, and if that's the case the next timer interrupt isn't > likely to fire on schedule anyway. Trying to correct for drift in such > a situation seems more likely to be confusing than to produce any > helpful result. This is what I was trying to convey in case-2. I agree that it is better to consider "now + interval" in such a case instead of trying to adjust the drift. Please find the updated patch attached. On Tue, Aug 10, 2021 at 1:06 AM Robert Haas wrote: > > On Mon, Aug 9, 2021 at 11:20 AM Nitin Jadhav > wrote: > > Modified the reset_startup_progress_timeout() to take the second > > parameter which indicates whether it is called for initialization or > > for resetting. Without this parameter there is a problem if we call > > init_startup_progress() more than one time if there is no call to > > ereport_startup_progress() in between as the code related to disabling > > the timer has been removed. > > I'd really like to avoid this. I don't see why it's necessary. You say > it causes a problem, but you don't explain what problem it causes. > enable_timeout_at() will disable the timer if not already done. I > think all we need to do is set scheduled_startup_progress_timeout = 0 > before calling reset_startup_progress_timeout() in the "init" case and > don't do that for the non-init case. If that's not quite right, maybe > you can work out something that does work. But adding an is_init flag > to a function and having no common code between the is_init = true > case and the is_init = false case is exactly the kind of thing that I > don't want here. I want as much common code as possible. > > > > This makes sense, but I think I'd like to have all the functions in > > > this patch use names_like_this() rather than NamesLikeThis(). > > > > I have changed all the function names accordingly. But I would like to > > know why it should be names_like_this() as there are many functions > > with the format NamesLikeThis(). I would like to understand when to > > use what, just to incorporate in the future patches. > > There is, unfortunately, no hard-and-fast rule, but we want to > maintain as much consistency with the existing style as we can. I > wasn't initially sure what would work best for this particular patch, > but after we committed to a name like ereport_startup_progress() that > to me was a strong hint in favor of using names_like_this() > throughout. It seems impossible to imagine punctuating it as > EreportStartupProgress() or something since that would be wildly > inconsistent with the existing function name, and there seems to be no > good reason why this patch can't be internally consistent. > > To some degree, we tend to use names_like_this() for low-level > functio
Re: when the startup process doesn't (logging startup delays)
On Mon, Aug 9, 2021 at 11:20 AM Nitin Jadhav wrote: > Modified the reset_startup_progress_timeout() to take the second > parameter which indicates whether it is called for initialization or > for resetting. Without this parameter there is a problem if we call > init_startup_progress() more than one time if there is no call to > ereport_startup_progress() in between as the code related to disabling > the timer has been removed. I'd really like to avoid this. I don't see why it's necessary. You say it causes a problem, but you don't explain what problem it causes. enable_timeout_at() will disable the timer if not already done. I think all we need to do is set scheduled_startup_progress_timeout = 0 before calling reset_startup_progress_timeout() in the "init" case and don't do that for the non-init case. If that's not quite right, maybe you can work out something that does work. But adding an is_init flag to a function and having no common code between the is_init = true case and the is_init = false case is exactly the kind of thing that I don't want here. I want as much common code as possible. > > This makes sense, but I think I'd like to have all the functions in > > this patch use names_like_this() rather than NamesLikeThis(). > > I have changed all the function names accordingly. But I would like to > know why it should be names_like_this() as there are many functions > with the format NamesLikeThis(). I would like to understand when to > use what, just to incorporate in the future patches. There is, unfortunately, no hard-and-fast rule, but we want to maintain as much consistency with the existing style as we can. I wasn't initially sure what would work best for this particular patch, but after we committed to a name like ereport_startup_progress() that to me was a strong hint in favor of using names_like_this() throughout. It seems impossible to imagine punctuating it as EreportStartupProgress() or something since that would be wildly inconsistent with the existing function name, and there seems to be no good reason why this patch can't be internally consistent. To some degree, we tend to use names_like_this() for low-level functions and NamesLikeThis() for higher-level functions, but that is not a very consistent practice. > > reset_startup_progress_timeout(TimeStampTz now) > > { > > next_timeout = last_startup_progress_timeout + interval; > > if (next_timeout < now) > > { > > // Either the timeout was processed so late that we missed an entire > > cycle, > > // or the system clock was set backwards. > > next_timeout = now + interval; > > } > > enable_timeout_at(next_timeout); > > last_startup_progress_timeout = next_timeout; > > } > > As per the above logic, I would like to discuss 2 cases. > > Case-1: > First scheduled timeout is after 1 sec. But the time out occurred > after 1.5 sec. So the log msg prints after 1.5 sec. Next timer is > scheduled after 2 sec (scheduled_startup_progress_timeout + interval). > The condition (next_timeout < now) gets evaluated to false and > everything goes smooth. > > Case-2: > First scheduled timeout is after 1 sec. But the timeout occurred after > 2.5 sec. So the log msg prints after 2.5 sec. Now next time is > scheduled after 2 sec (scheduled_startup_progress_timeout + interval). > So the condition (next_timeout < now) will fail and the next_timeout > will get reset to 3.5 sec (2.5 + 1) and it continues. Is this > behaviour ok or should we set the next_timeout to 3 sec. Please share > your thoughts on this. I can't quite follow this, because it seems like you are sometimes viewing the interval as 1 second and sometimes as 2 seconds. Maybe you could clarify that, and perhaps show example output? My feeling is that the timer will almost always be slightly late, but it will very rarely be extremely late, and it will also very rarely be early (only if someone resets the system clock). So let's consider those two cases separately. If the timer is a little bit late each time, we want to avoid drift, so we want to shorten the next sleep time by the amount that the previous interrupt was late. If the interval is 1000ms and the interrupt fires 1ms late then we should sleep 999ms the next time; if 2ms late, 998ms. That way, although there will be some variation in which the messages are logged, the drift won't accumulate over time and even after many minutes of recovery the messages will be printed at ABOUT the same number of milliseconds after the second every time, instead of drifting further and further off course. But this strategy cannot be used if the drift is larger than the interval. If we are trying to log a message every 1000ms and the timer doesn't fire for 14ms, we can wait only 986ms the next time. If it doesn't fire for 140ms, we can wait only 860ms the next time. But if the timer doesn't fire for 1400ms, we cannot wait for -400ms the next time. So what should we do? My proposal is to just wait for the configured interval, 1
Re: when the startup process doesn't (logging startup delays)
Modified the reset_startup_progress_timeout() to take the second parameter which indicates whether it is called for initialization or for resetting. Without this parameter there is a problem if we call init_startup_progress() more than one time if there is no call to ereport_startup_progress() in between as the code related to disabling the timer has been removed. reset_startup_progress_timeout(TimeStampTz now, bool is_init) { if (is_init) next_timeout = now + interval; else { next_timeout = scheduled_startup_progress_timeout + interval; if (next_timeout < now) { // Either the timeout was processed so late that we missed an entire cycle, // or the system clock was set backwards. next_timeout = now + interval; } enable_timeout_at(next_timeout); scheduled_startup_progress_timeout = next_timeout; } } I have incorporated this in the attached patch. Please correct me if I am wrong. > This makes sense, but I think I'd like to have all the functions in > this patch use names_like_this() rather than NamesLikeThis(). I have changed all the function names accordingly. But I would like to know why it should be names_like_this() as there are many functions with the format NamesLikeThis(). I would like to understand when to use what, just to incorporate in the future patches. > Hmm, yeah, that seems good, but given this change, maybe the variables > need a little renaming. Like change last_startup_progress_timeout to > scheduled_startup_progress_timeout, perhaps. Right. Changed the variable name. > I guess this one needs to return a Boolean, actually. Yes. > reset_startup_progress_timeout(TimeStampTz now) > { > next_timeout = last_startup_progress_timeout + interval; > if (next_timeout < now) > { > // Either the timeout was processed so late that we missed an entire > cycle, > // or the system clock was set backwards. > next_timeout = now + interval; > } > enable_timeout_at(next_timeout); > last_startup_progress_timeout = next_timeout; > } As per the above logic, I would like to discuss 2 cases. Case-1: First scheduled timeout is after 1 sec. But the time out occurred after 1.5 sec. So the log msg prints after 1.5 sec. Next timer is scheduled after 2 sec (scheduled_startup_progress_timeout + interval). The condition (next_timeout < now) gets evaluated to false and everything goes smooth. Case-2: First scheduled timeout is after 1 sec. But the timeout occurred after 2.5 sec. So the log msg prints after 2.5 sec. Now next time is scheduled after 2 sec (scheduled_startup_progress_timeout + interval). So the condition (next_timeout < now) will fail and the next_timeout will get reset to 3.5 sec (2.5 + 1) and it continues. Is this behaviour ok or should we set the next_timeout to 3 sec. Please share your thoughts on this. Thanks & Regards, Nitin Jadhav On Thu, Aug 5, 2021 at 7:49 PM Robert Haas wrote: > > On Thu, Aug 5, 2021 at 7:41 AM Nitin Jadhav > wrote: > > This seems a little confusing. As we are setting > > last_startup_progress_timeout = now and then calling > > reset_startup_progress_timeout() which will calculate the next_time > > based on the value of last_startup_progress_timeout initially and > > checks whether next_timeout is less than now. It doesn't make sense to > > me. I feel we should calculate the next_timeout based on the time that > > it is supposed to fire next time. So we should set > > last_startup_progress_timeout = next_timeout after enabling the timer. > > Also I feel with the 2 functions mentioned above, we also need > > InitStartupProgress() which sets the initial value to > > startupProcessOpStartTime which is used to calculate the time > > difference and display in the logs. I could see those functions like > > below. > > > > InitStartupProgress(void) > > { > > startupProcessOpStartTime = GetCurrentTimestamp(); > > ResetStartupProgressTimeout(startupProcessOpStartTime); > > } > > This makes sense, but I think I'd like to have all the functions in > this patch use names_like_this() rather than NamesLikeThis(). > > > reset_startup_progress_timeout(TimeStampTz now) > > { > > next_timeout = last_startup_progress_timeout + interval; > > if (next_timeout < now) > > { > > // Either the timeout was processed so late that we missed an entire > > cycle, > > // or the system clock was set backwards. > > next_timeout = now + interval; > > } > > enable_timeout_at(next_timeout); > > last_startup_progress_timeout = next_timeout; > > } > > Hmm, yeah, that seems good, but given this change, maybe the variables > need a little renaming. Like change last_startup_progress_timeout to > scheduled_startup_progress_timeout, perhaps. > > > startup_progress_timeout_has_expired() > > { > >if (!startup_progress_timer_expired) > > return; > > now = GetCurrentTimestamp(); > > // compute timestamp difference based on startupProcessOpStartTime > > reset_startup
Re: when the startup process doesn't (logging startup delays)
On Thu, Aug 5, 2021 at 7:41 AM Nitin Jadhav wrote: > This seems a little confusing. As we are setting > last_startup_progress_timeout = now and then calling > reset_startup_progress_timeout() which will calculate the next_time > based on the value of last_startup_progress_timeout initially and > checks whether next_timeout is less than now. It doesn't make sense to > me. I feel we should calculate the next_timeout based on the time that > it is supposed to fire next time. So we should set > last_startup_progress_timeout = next_timeout after enabling the timer. > Also I feel with the 2 functions mentioned above, we also need > InitStartupProgress() which sets the initial value to > startupProcessOpStartTime which is used to calculate the time > difference and display in the logs. I could see those functions like > below. > > InitStartupProgress(void) > { > startupProcessOpStartTime = GetCurrentTimestamp(); > ResetStartupProgressTimeout(startupProcessOpStartTime); > } This makes sense, but I think I'd like to have all the functions in this patch use names_like_this() rather than NamesLikeThis(). > reset_startup_progress_timeout(TimeStampTz now) > { > next_timeout = last_startup_progress_timeout + interval; > if (next_timeout < now) > { > // Either the timeout was processed so late that we missed an entire > cycle, > // or the system clock was set backwards. > next_timeout = now + interval; > } > enable_timeout_at(next_timeout); > last_startup_progress_timeout = next_timeout; > } Hmm, yeah, that seems good, but given this change, maybe the variables need a little renaming. Like change last_startup_progress_timeout to scheduled_startup_progress_timeout, perhaps. > startup_progress_timeout_has_expired() > { >if (!startup_progress_timer_expired) > return; > now = GetCurrentTimestamp(); > // compute timestamp difference based on startupProcessOpStartTime > reset_startup_progress_timeout(now); > } I guess this one needs to return a Boolean, actually. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
Thanks for the detailed explanation. > +elapsed_ms = (seconds * 1000) + (useconds / 1000); > +interval_in_ms = log_startup_progress_interval * 1000; > +enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT, > + (interval_in_ms - (elapsed_ms % interval_in_ms))); > > This will work correctly only if elapsed_ms is equal to interval_in_ms > or slightly greater than interval_ms. But if elapsed_ms is greater > than two times interval_ms, then it will produce pretty much random > results. If elapsed_ms is negative because the system clock gets set > backward, a possibility I've already mentioned to you in a previous > review, then it will also misbehave. I actually don't think > enable_timeout_after() is very easy to use for this kind of thing. At > least for me, it's way easier to think about calculating the timestamp > at which I want the timer to expire. Maybe something along these > lines: > > next_timeout = last_startup_progress_timeout + interval; > if (next_timeout < now) > { >// Either the timeout was processed so late that we missed an entire cycle, >// or the system clock was set backwards. >next_timeout = now + interval; > } > enable_timeout_at(next_timeout); > > So I think we should take this out, which would permit simplifying a > bunch of code.The four places where you call > ereport_startup_progress(true, ...) would go away. > ereport_startup_progress() would no longer need a Boolean argument, > and neither would LogStartupProgressTimerExpired() / > startup_progress_timer_has_expired(). Note that there's no real need > to disable the timeout when we're done with it. It's fine if we do, > but if we don't, it's also not a big deal; all that happens if we > leave the timer scheduled and let it expire is that it will set a > Boolean flag that nobody will care about. So what I'm thinking about > is that we could just have, say, reset_startup_progress_timeout() and > startup_progress_timeout_has_expired(). > reset_startup_progress_timeout() would just do exactly what I showed > above to reset the timeout, and you'd call it at the beginning of each > phase. And startup_progress_timeout_has_expired() would look roughly > like this: > > if (!startup_progress_timer_expired) >return; > now = GetCurrentTimestamp(); > // compute timestamp difference > last_startup_progress_timeout = now; > reset_startup_progress_timeout(); This seems a little confusing. As we are setting last_startup_progress_timeout = now and then calling reset_startup_progress_timeout() which will calculate the next_time based on the value of last_startup_progress_timeout initially and checks whether next_timeout is less than now. It doesn't make sense to me. I feel we should calculate the next_timeout based on the time that it is supposed to fire next time. So we should set last_startup_progress_timeout = next_timeout after enabling the timer. Also I feel with the 2 functions mentioned above, we also need InitStartupProgress() which sets the initial value to startupProcessOpStartTime which is used to calculate the time difference and display in the logs. I could see those functions like below. InitStartupProgress(void) { startupProcessOpStartTime = GetCurrentTimestamp(); ResetStartupProgressTimeout(startupProcessOpStartTime); } reset_startup_progress_timeout(TimeStampTz now) { next_timeout = last_startup_progress_timeout + interval; if (next_timeout < now) { // Either the timeout was processed so late that we missed an entire cycle, // or the system clock was set backwards. next_timeout = now + interval; } enable_timeout_at(next_timeout); last_startup_progress_timeout = next_timeout; } startup_progress_timeout_has_expired() { if (!startup_progress_timer_expired) return; now = GetCurrentTimestamp(); // compute timestamp difference based on startupProcessOpStartTime reset_startup_progress_timeout(now); } Kindly share your thoughts and correct me if I am wrong. > I think we also should consider where to put the new functions > introduced by this patch, and the GUC. You put them in xlog.c which is > reasonable since that is where StartupXLOG() lives. However, xlog.c is > also a gigantic file, and xlog.h is included in a lot of places, most > of which aren't going to care about the new things you're adding to > that file at all. So I'm thinking it might make more sense to put the > new code in src/backend/postmaster/startup.c. That is actually a > better thematic fit given that this is really about the startup > process specifically, not WAL-logging in general. Then reinit.c would > include startup.h instead of xlog.h, which seems better, because I > don't think we want any actual xlog operations to happen from within > that file, so better not to be including xlog.h. > > The patch currently lacks documentation. It needs to update config.sgml. I agree and I will take care in the next patch. Thanks & Regards, Nitin Jadhav On Tue, Aug 3, 202
Re: when the startup process doesn't (logging startup delays)
On Tue, Aug 3, 2021 at 2:48 AM Nitin Jadhav wrote: > Implemented the above approach and verified the patch. Kindly have a > look and share your thoughts. +LogStartupProgressTimerExpired(bool force, long *secs, int *usecs) The name of this function begins with "log" but it does not log anything, so that's probably a sign that you should rethink the name of the function. I suggested startup_progress_timer_expired() upthread, but now I think maybe we should call it startup_progress_timer_has_expired() and then renaming the Boolean you've called logStartupProgressTimeout to startup_progress_timer_expired. Also, the argument "bool force" doesn't really make sense for this function, which is why I suggested above calling it "bool done" instead. +elapsed_ms = (seconds * 1000) + (useconds / 1000); +interval_in_ms = log_startup_progress_interval * 1000; +enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT, + (interval_in_ms - (elapsed_ms % interval_in_ms))); This will work correctly only if elapsed_ms is equal to interval_in_ms or slightly greater than interval_ms. But if elapsed_ms is greater than two times interval_ms, then it will produce pretty much random results. If elapsed_ms is negative because the system clock gets set backward, a possibility I've already mentioned to you in a previous review, then it will also misbehave. I actually don't think enable_timeout_after() is very easy to use for this kind of thing. At least for me, it's way easier to think about calculating the timestamp at which I want the timer to expire. Maybe something along these lines: next_timeout = last_startup_progress_timeout + interval; if (next_timeout < now) { // Either the timeout was processed so late that we missed an entire cycle, // or the system clock was set backwards. next_timeout = now + interval; } enable_timeout_at(next_timeout); Also, I said before that I thought it was OK that you were logging a line at the end of every operation as well as after every N milliseconds. But, the more I think about it, the less I like it. We already have a 'redo done' line that shows up at the end of redo, which the patch wisely does not duplicate. But it's not quite clear that any of these other things are important enough to bother mentioning in the log unless they take a long time. After an immediate shutdown of an empty cluster, with this patch applied, I get 3 extra log messages: 2021-08-03 10:17:49.630 EDT [17567] LOG: data directory sync (fsync) complete after 0.13 s 2021-08-03 10:17:49.633 EDT [17567] LOG: resetting unlogged relations (cleanup) complete after 0.00 s 2021-08-03 10:17:49.635 EDT [17567] LOG: resetting unlogged relations (init) complete after 0.00 s That doesn't seem like information anyone really needs. If it had taken a long time, it would have been worth logging, but in the normal case where it doesn't, it's just clutter. Another funny thing is that, as you've coded it, those additional log lines only appear when log_startup_progress_interval != 0. That's strange. It seems particularly strange because of the existing precedent where 'redo done' appears regardless of any setting, but also because when I set, say, a 10s interval, I guess I expect something to happen every 10s. Making something happen once at the end is different. So I think we should take this out, which would permit simplifying a bunch of code.The four places where you call ereport_startup_progress(true, ...) would go away. ereport_startup_progress() would no longer need a Boolean argument, and neither would LogStartupProgressTimerExpired() / startup_progress_timer_has_expired(). Note that there's no real need to disable the timeout when we're done with it. It's fine if we do, but if we don't, it's also not a big deal; all that happens if we leave the timer scheduled and let it expire is that it will set a Boolean flag that nobody will care about. So what I'm thinking about is that we could just have, say, reset_startup_progress_timeout() and startup_progress_timeout_has_expired(). reset_startup_progress_timeout() would just do exactly what I showed above to reset the timeout, and you'd call it at the beginning of each phase. And startup_progress_timeout_has_expired() would look roughly like this: if (!startup_progress_timer_expired) return; now = GetCurrentTimestamp(); // compute timestamp difference last_startup_progress_timeout = now; reset_startup_progress_timeout(); With these changes you'd have only 1 place in the code that needs to care about log_startup_progress_interval, as opposed to 3 as you have it currently, and only one place that enables the timeout, as opposed to 2 as you have it currently. I think that would be tidier. I think we also should consider where to put the new functions introduced by this patch, and the GUC. You put them in xlog.c which is reasonable since that is where StartupXLOG() lives. However, xlog.c is also a gigantic file, and xlog.h is included in
Re: when the startup process doesn't (logging startup delays)
Two issues that I saw: The first syncfs message is not output, because it's before InitStartupProgress() is called: 2021-08-03 07:53:02.176 CDT startup[9717] LOG: database system was interrupted; last known up at 2021-08-03 07:52:15 CDT 2021-08-03 07:53:02.733 CDT startup[9717] LOG: data directory sync (syncfs) complete after 0.55 s 2021-08-03 07:53:02.734 CDT startup[9717] LOG: database system was not properly shut down; automatic recovery in progress FP exception when the GUC is set to 0: 2021-08-03 07:53:02.877 CDT postmaster[9715] LOG: startup process (PID 9717) was terminated by signal 8: Floating point exception Probably due to mod zero operation. This prevents the process from starting. + enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT, +(interval_in_ms - (elapsed_ms % interval_in_ms))); -- Justin
Re: when the startup process doesn't (logging startup delays)
> Thanks. Can you please have a look at what I suggested down toward the > bottom of > http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com ? Implemented the above approach and verified the patch. Kindly have a look and share your thoughts. Thanks & Regards, Nitin Jadhav On Fri, Jul 30, 2021 at 10:40 AM Nitin Jadhav wrote: > > > Thanks. Can you please have a look at what I suggested down toward the > > bottom of > > http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com > > ? > > > > If we're going to go the route of combining the functions, I agree > > that a Boolean is the way to go; I think we have some existing > > precedent for 'bool finished' rather than 'bool done'. > > > > But I kind of wonder if we should have an enum in the first place. It > > feels like we've got code in a bunch of places that just exists to > > decide which enum value to use, and then code someplace else that > > turns around and decides which message to produce. That would be > > sensible if we were using the same enum values in lots of places, but > > that's not the case here. So suppose we just moved the messages to the > > places where we're now calling LogStartupProgress() or > > LogStartupProgressComplete()? So something like this: > > Sorry. I thought it is related to the discussion of deciding whether > LogStartupProgress() and LogStartupProgressComplete() should be > combined or not. I feel it's a really nice design. With this we avoid > a "action at a distance" issue and its easy to use. If we are > reporting the same kind of msgs at multiple places then the current > approach of using enum will be more suitable since we don't have to > worry about matching the log msg string. But in the current scenario, > we are not using the same kind of msgs at multiple places (I feel such > scenario will not occur in future also. Even if there is similar > operation, it can be distinguished like resetting unlogged relations > is distinguished by init and clean. Kindly mention if you can oversee > any such scenario), hence the approach you are suggesting will be a > best suit. > > Thanks & Regards, > Nitin Jadhav > > On Thu, Jul 29, 2021 at 9:48 PM Robert Haas wrote: > > > > On Thu, Jul 29, 2021 at 4:56 AM Nitin Jadhav > > wrote: > > > Thanks for sharing the information. I have done the necessary changes > > > to show the logs during the latter case (postgres --single) and > > > verified the log messages. > > > > Thanks. Can you please have a look at what I suggested down toward the > > bottom of > > http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com > > ? > > > > -- > > Robert Haas > > EDB: http://www.enterprisedb.com v9-0001-startup-process-progress.patch Description: Binary data
Re: when the startup process doesn't (logging startup delays)
> Thanks. Can you please have a look at what I suggested down toward the > bottom of > http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com > ? > > If we're going to go the route of combining the functions, I agree > that a Boolean is the way to go; I think we have some existing > precedent for 'bool finished' rather than 'bool done'. > > But I kind of wonder if we should have an enum in the first place. It > feels like we've got code in a bunch of places that just exists to > decide which enum value to use, and then code someplace else that > turns around and decides which message to produce. That would be > sensible if we were using the same enum values in lots of places, but > that's not the case here. So suppose we just moved the messages to the > places where we're now calling LogStartupProgress() or > LogStartupProgressComplete()? So something like this: Sorry. I thought it is related to the discussion of deciding whether LogStartupProgress() and LogStartupProgressComplete() should be combined or not. I feel it's a really nice design. With this we avoid a "action at a distance" issue and its easy to use. If we are reporting the same kind of msgs at multiple places then the current approach of using enum will be more suitable since we don't have to worry about matching the log msg string. But in the current scenario, we are not using the same kind of msgs at multiple places (I feel such scenario will not occur in future also. Even if there is similar operation, it can be distinguished like resetting unlogged relations is distinguished by init and clean. Kindly mention if you can oversee any such scenario), hence the approach you are suggesting will be a best suit. Thanks & Regards, Nitin Jadhav On Thu, Jul 29, 2021 at 9:48 PM Robert Haas wrote: > > On Thu, Jul 29, 2021 at 4:56 AM Nitin Jadhav > wrote: > > Thanks for sharing the information. I have done the necessary changes > > to show the logs during the latter case (postgres --single) and > > verified the log messages. > > Thanks. Can you please have a look at what I suggested down toward the > bottom of > http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com > ? > > -- > Robert Haas > EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Thu, Jul 29, 2021 at 4:56 AM Nitin Jadhav wrote: > Thanks for sharing the information. I have done the necessary changes > to show the logs during the latter case (postgres --single) and > verified the log messages. Thanks. Can you please have a look at what I suggested down toward the bottom of http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com ? -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> The InitPostgres() case occurs when the server is started in bootstrap > mode (during initdb) or in single-user mode (postgres --single). I do > not see any reason why we shouldn't produce progress messages in at > least the latter case. I suspect that someone who is in the rather > desperate scenario of having to use single-user mode would really like > to know how long the server is going to take to start up. Thanks for sharing the information. I have done the necessary changes to show the logs during the latter case (postgres --single) and verified the log messages. > +1 to emit the same log messages in single-user mode and basically > whoever calls StartupXLOG. Do we need to adjust the GUC parameter > log_startup_progress_interval(a reasonable value) so that the logs are > generated by default? At present, this feature is enabled by default and the initial value set for log_startup_progress_interval is 10 seconds. On Wed, Jul 28, 2021 at 9:07 PM Robert Haas wrote: > > On Wed, Jul 28, 2021 at 11:25 AM Bharath Rupireddy > wrote: > > > Perhaps during initdb we don't want messages, but I'm not sure that we > > > need to do anything about that here. None of the messages that the > > > server normally produces show up when you run initdb, so I guess they > > > are getting redirected to /dev/null or something. > > > > I enabled the below log message in XLogFlush and ran initdb, it is > > printing these logs onto the stdout, looks like the logs have not been > > redirected to /dev/null in initdb/bootstrap mode. > > > > #ifdef WAL_DEBUG > > if (XLOG_DEBUG) > > elog(LOG, "xlog flush request %X/%X; write %X/%X; flush %X/%X", > > LSN_FORMAT_ARGS(record), > > LSN_FORMAT_ARGS(LogwrtResult.Write), > > LSN_FORMAT_ARGS(LogwrtResult.Flush)); > > #endif > > > > So, even in bootstrap mode, can we use the above #ifdef WAL_DEBUG and > > XLOG_DEBUG to print those logs? It looks like the problem with these > > macros is that they are not settable by normal users in the production > > environment, but only by the developers. I'm not sure how much it is > > helpful to print the startup process progress logs in bootstrap mode. > > Maybe we can use the IsBootstrapProcessingMode macro to disable these > > logs in the bootstrap mode. > > I don't think we should drag XLOG_DEBUG into this. That's a debugging > facility that isn't really relevant to the topic at hand. The point is > the server normally prints a bunch of messages that we don't see in > bootstrap mode. For example: > > [rhaas pgsql]$ postgres > 2021-07-28 11:32:33.824 EDT [36801] LOG: starting PostgreSQL 15devel > on x86_64-apple-darwin19.6.0, compiled by clang version 5.0.2 > (tags/RELEASE_502/final), 64-bit > 2021-07-28 11:32:33.825 EDT [36801] LOG: listening on IPv6 address > "::1", port 5432 > 2021-07-28 11:32:33.825 EDT [36801] LOG: listening on IPv4 address > "127.0.0.1", port 5432 > 2021-07-28 11:32:33.826 EDT [36801] LOG: listening on Unix socket > "/tmp/.s.PGSQL.5432" > 2021-07-28 11:32:33.846 EDT [36802] LOG: database system was shut > down at 2021-07-28 11:32:32 EDT > 2021-07-28 11:32:33.857 EDT [36801] LOG: database system is ready to > accept connections > > None of that shows up when you run initdb. Taking a fast look at the > code, I don't think the reasons are the same for all of those > messages. Some of the code isn't reached, whereas e.g. "database > system was shut down at 2021-07-28 11:32:32 EDT" is special-cased. I'm > not sure right off the top of my head what this code should do, but > ideally it looks something like one of the cases we've already got. > > -- > Robert Haas > EDB: http://www.enterprisedb.com From 55c73bb0920e3487b72d2588c43c1c5e8798e6d4 Mon Sep 17 00:00:00 2001 From: Nitin Date: Thu, 29 Jul 2021 14:11:24 +0530 Subject: [PATCH] Shows the progress of the operations performed during startup process. The interval between each progress update is configurable and it should be mentioned in seconds --- src/backend/access/transam/xlog.c | 163 ++ src/backend/postmaster/startup.c | 1 + src/backend/storage/file/fd.c | 13 ++ src/backend/storage/file/reinit.c | 17 +++ src/backend/utils/init/postinit.c | 1 + src/backend/utils/misc/guc.c | 12 ++ src/backend/utils/misc/postgresql.conf.sample | 6 + src/include/access/xlog.h | 18 +++ src/include/utils/timeout.h | 1 + 9 files changed, 232 insertions(+) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 3479402..5a39da7 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -79,6 +79,7 @@ #include "utils/relmapper.h" #include "utils/pg_rusage.h" #include "utils/snapmgr.h" +#include "utils/timeout.h" #include "utils/timestamp.h" extern uint32 bootstrap_data_checksum_version; @@ -397,6 +398,23 @@ static bool doRequestWalReceiverReply; */ s
Re: when the startup process doesn't (logging startup delays)
On Wed, Jul 28, 2021 at 11:25 AM Bharath Rupireddy wrote: > > Perhaps during initdb we don't want messages, but I'm not sure that we > > need to do anything about that here. None of the messages that the > > server normally produces show up when you run initdb, so I guess they > > are getting redirected to /dev/null or something. > > I enabled the below log message in XLogFlush and ran initdb, it is > printing these logs onto the stdout, looks like the logs have not been > redirected to /dev/null in initdb/bootstrap mode. > > #ifdef WAL_DEBUG > if (XLOG_DEBUG) > elog(LOG, "xlog flush request %X/%X; write %X/%X; flush %X/%X", > LSN_FORMAT_ARGS(record), > LSN_FORMAT_ARGS(LogwrtResult.Write), > LSN_FORMAT_ARGS(LogwrtResult.Flush)); > #endif > > So, even in bootstrap mode, can we use the above #ifdef WAL_DEBUG and > XLOG_DEBUG to print those logs? It looks like the problem with these > macros is that they are not settable by normal users in the production > environment, but only by the developers. I'm not sure how much it is > helpful to print the startup process progress logs in bootstrap mode. > Maybe we can use the IsBootstrapProcessingMode macro to disable these > logs in the bootstrap mode. I don't think we should drag XLOG_DEBUG into this. That's a debugging facility that isn't really relevant to the topic at hand. The point is the server normally prints a bunch of messages that we don't see in bootstrap mode. For example: [rhaas pgsql]$ postgres 2021-07-28 11:32:33.824 EDT [36801] LOG: starting PostgreSQL 15devel on x86_64-apple-darwin19.6.0, compiled by clang version 5.0.2 (tags/RELEASE_502/final), 64-bit 2021-07-28 11:32:33.825 EDT [36801] LOG: listening on IPv6 address "::1", port 5432 2021-07-28 11:32:33.825 EDT [36801] LOG: listening on IPv4 address "127.0.0.1", port 5432 2021-07-28 11:32:33.826 EDT [36801] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2021-07-28 11:32:33.846 EDT [36802] LOG: database system was shut down at 2021-07-28 11:32:32 EDT 2021-07-28 11:32:33.857 EDT [36801] LOG: database system is ready to accept connections None of that shows up when you run initdb. Taking a fast look at the code, I don't think the reasons are the same for all of those messages. Some of the code isn't reached, whereas e.g. "database system was shut down at 2021-07-28 11:32:32 EDT" is special-cased. I'm not sure right off the top of my head what this code should do, but ideally it looks something like one of the cases we've already got. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Wed, Jul 28, 2021 at 7:02 PM Robert Haas wrote: > > On Wed, Jul 28, 2021 at 5:24 AM Nitin Jadhav > wrote: > > I also agree that this is the better place to do it. Hence modified > > the patch accordingly. The condition "!AmStartupProcess()" is added to > > differentiate whether the call is done from a startup process or some > > other process. Actually StartupXLOG() gets called in 2 places. one in > > StartupProcessMain() and the other in InitPostgres(). As the logging > > of the startup progress is required only during the startup process > > and not in the other cases, > > The InitPostgres() case occurs when the server is started in bootstrap > mode (during initdb) or in single-user mode (postgres --single). I do > not see any reason why we shouldn't produce progress messages in at > least the latter case. I suspect that someone who is in the rather > desperate scenario of having to use single-user mode would really like > to know how long the server is going to take to start up. +1 to emit the same log messages in single-user mode and basically whoever calls StartupXLOG. Do we need to adjust the GUC parameter log_startup_progress_interval(a reasonable value) so that the logs are generated by default? > Perhaps during initdb we don't want messages, but I'm not sure that we > need to do anything about that here. None of the messages that the > server normally produces show up when you run initdb, so I guess they > are getting redirected to /dev/null or something. I enabled the below log message in XLogFlush and ran initdb, it is printing these logs onto the stdout, looks like the logs have not been redirected to /dev/null in initdb/bootstrap mode. #ifdef WAL_DEBUG if (XLOG_DEBUG) elog(LOG, "xlog flush request %X/%X; write %X/%X; flush %X/%X", LSN_FORMAT_ARGS(record), LSN_FORMAT_ARGS(LogwrtResult.Write), LSN_FORMAT_ARGS(LogwrtResult.Flush)); #endif So, even in bootstrap mode, can we use the above #ifdef WAL_DEBUG and XLOG_DEBUG to print those logs? It looks like the problem with these macros is that they are not settable by normal users in the production environment, but only by the developers. I'm not sure how much it is helpful to print the startup process progress logs in bootstrap mode. Maybe we can use the IsBootstrapProcessingMode macro to disable these logs in the bootstrap mode. > So I don't think that using AmStartupProcess() for this purpose is right. Agree. Regards, Bharath Rupireddy.
Re: when the startup process doesn't (logging startup delays)
On Wed, Jul 28, 2021 at 5:24 AM Nitin Jadhav wrote: > I also agree that this is the better place to do it. Hence modified > the patch accordingly. The condition "!AmStartupProcess()" is added to > differentiate whether the call is done from a startup process or some > other process. Actually StartupXLOG() gets called in 2 places. one in > StartupProcessMain() and the other in InitPostgres(). As the logging > of the startup progress is required only during the startup process > and not in the other cases, The InitPostgres() case occurs when the server is started in bootstrap mode (during initdb) or in single-user mode (postgres --single). I do not see any reason why we shouldn't produce progress messages in at least the latter case. I suspect that someone who is in the rather desperate scenario of having to use single-user mode would really like to know how long the server is going to take to start up. Perhaps during initdb we don't want messages, but I'm not sure that we need to do anything about that here. None of the messages that the server normally produces show up when you run initdb, so I guess they are getting redirected to /dev/null or something. So I don't think that using AmStartupProcess() for this purpose is right. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
> > > I saw that you fixed it by calling InitStartupProgress() after the > > > walkdir() > > > calls which do pre_sync_fname. So then walkdir is calling > > > LogStartupProgress(STARTUP_PROCESS_OP_FSYNC) even when it's not doing > > > fsync, > > > and then LogStartupProgress() is returning because !AmStartupProcess(). > > > > I don't think walkdir() has any business calling LogStartupProgress() > > at all. It's supposed to be a generic function, not one that is only > > available to be called from the startup process, or has different > > behavior there. From my point of view, the right thing is to put the > > logging calls into the particular callbacks that SyncDataDirectory() > > uses. > > You're right - this is better. I also agree that this is the better place to do it. Hence modified the patch accordingly. The condition "!AmStartupProcess()" is added to differentiate whether the call is done from a startup process or some other process. Actually StartupXLOG() gets called in 2 places. one in StartupProcessMain() and the other in InitPostgres(). As the logging of the startup progress is required only during the startup process and not in the other cases, so added the condition to confirm the call is from the startup process. I did not find any other way to differentiate. Kindly let me know if there is any other better approach to do this. > > > Maybe I'm missing something here, but I don't understand the purpose > > > of this. You can always combine two functions into one, but it's only > > > worth doing if you end up with less code, which doesn't seem to be the > > > case here. > > > > 4 files changed, 39 insertions(+), 71 deletions(-) > > Hmm. I don't completely hate that, but I don't love it either. I don't > think the result is any easier to understand than the original, and in > some ways it's worse. In particular, you've flattened the separate > comments for each of the individual functions into a single-line > comment that is more generic than the comments it replaced. You could > fix that and you'd still be slightly ahead on LOC, but I'm not > convinced that it's actually a real win. As per my understanding there are no changes required wrt this. Hence not done any changes. Please find the updated patch attached. Kindly share your comments if any. On Mon, Jul 26, 2021 at 10:41 PM Robert Haas wrote: > > On Mon, Jul 26, 2021 at 11:30 AM Justin Pryzby wrote: > > > Maybe I'm missing something here, but I don't understand the purpose > > > of this. You can always combine two functions into one, but it's only > > > worth doing if you end up with less code, which doesn't seem to be the > > > case here. > > > > 4 files changed, 39 insertions(+), 71 deletions(-) > > Hmm. I don't completely hate that, but I don't love it either. I don't > think the result is any easier to understand than the original, and in > some ways it's worse. In particular, you've flattened the separate > comments for each of the individual functions into a single-line > comment that is more generic than the comments it replaced. You could > fix that and you'd still be slightly ahead on LOC, but I'm not > convinced that it's actually a real win. > > > > ... but I'm not exactly sure how to get there from here. Having only > > > LogStartupProgress() but having it do a giant if-statement to figure > > > out whether we're mid-phase or end-of-phase does not seem like the > > > right approach. > > > > I used a bool arg and negation to handle within a single switch. Maybe it's > > cleaner to use a separate enum value for each DONE, and set a local done > > flag. > > If we're going to go the route of combining the functions, I agree > that a Boolean is the way to go; I think we have some existing > precedent for 'bool finished' rather than 'bool done'. > > But I kind of wonder if we should have an enum in the first place. It > feels like we've got code in a bunch of places that just exists to > decide which enum value to use, and then code someplace else that > turns around and decides which message to produce. That would be > sensible if we were using the same enum values in lots of places, but > that's not the case here. So suppose we just moved the messages to the > places where we're now calling LogStartupProgress() or > LogStartupProgressComplete()? So something like this: > > if (should_report_startup_progress()) > ereport(LOG, > (errmsg("syncing data directory (syncfs), elapsed > time: %ld.%02d s, current path: %s", >secs, (usecs / 1), path))); > > Well, that doesn't quite work, because "secs" and "usecs" aren't going > to exist in the caller. We can fix that easily enough: let's just make > should_report_startup_progress take arguments long *secs, int *usecs, > and bool done. Then the above becomes: > > if (should_report_startup_progress(&secs, &usecs, false)) > ereport(LOG, > (errmsg("syncing data directory (syncfs), elapsed > time: %ld.%02d
Re: when the startup process doesn't (logging startup delays)
On Mon, Jul 26, 2021 at 11:30 AM Justin Pryzby wrote: > > Maybe I'm missing something here, but I don't understand the purpose > > of this. You can always combine two functions into one, but it's only > > worth doing if you end up with less code, which doesn't seem to be the > > case here. > > 4 files changed, 39 insertions(+), 71 deletions(-) Hmm. I don't completely hate that, but I don't love it either. I don't think the result is any easier to understand than the original, and in some ways it's worse. In particular, you've flattened the separate comments for each of the individual functions into a single-line comment that is more generic than the comments it replaced. You could fix that and you'd still be slightly ahead on LOC, but I'm not convinced that it's actually a real win. > > ... but I'm not exactly sure how to get there from here. Having only > > LogStartupProgress() but having it do a giant if-statement to figure > > out whether we're mid-phase or end-of-phase does not seem like the > > right approach. > > I used a bool arg and negation to handle within a single switch. Maybe it's > cleaner to use a separate enum value for each DONE, and set a local done flag. If we're going to go the route of combining the functions, I agree that a Boolean is the way to go; I think we have some existing precedent for 'bool finished' rather than 'bool done'. But I kind of wonder if we should have an enum in the first place. It feels like we've got code in a bunch of places that just exists to decide which enum value to use, and then code someplace else that turns around and decides which message to produce. That would be sensible if we were using the same enum values in lots of places, but that's not the case here. So suppose we just moved the messages to the places where we're now calling LogStartupProgress() or LogStartupProgressComplete()? So something like this: if (should_report_startup_progress()) ereport(LOG, (errmsg("syncing data directory (syncfs), elapsed time: %ld.%02d s, current path: %s", secs, (usecs / 1), path))); Well, that doesn't quite work, because "secs" and "usecs" aren't going to exist in the caller. We can fix that easily enough: let's just make should_report_startup_progress take arguments long *secs, int *usecs, and bool done. Then the above becomes: if (should_report_startup_progress(&secs, &usecs, false)) ereport(LOG, (errmsg("syncing data directory (syncfs), elapsed time: %ld.%02d s, current path: %s", secs, (usecs / 1), path))); And if this were the call site that corresponds to LogStartupProgressComplete(), we'd replace false with true. Now, the only real disadvantage of this that I can see is that it requires the caller to declare 'secs' and 'usecs', which is not a big deal, but mildly annoying perhaps. I think we can do better still with a little macro magic. Suppose we define a macro report_startup_progress(force, msg, ...) that expands to: { long secs; int usecs; if (startup_progress_timer_expired(&secs, &usecs, force)) ereport(LOG, errmsg(msg, secs, usecs, ...)); } Then the above just becomes this: report_startup_progress(false, "syncing data directory (syncfs), elapsed time: %ld.%02d s, current path: %s", path); This would have the advantage that the strings we're using would be present in the code that arranges to emit them, instead of being removed to some other module, so I think it would be clearer. It would also have the advantage of making it much easier to add further calls, if someone feels they want to do that. You don't have to run around and update enums and all the various things that use them, just copy and paste the line above and adjust as required. With this design, we avoid a lot of "action at a distance". We don't define the message strings in a place far-removed from the code that wants to emit them any more. When someone wants a new progress message, they can just add another call to report_statup_progress() wherever it needs to go and they're done. They don't have to go run and update the enum and various switch statements. They're just done. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Mon, Jul 26, 2021 at 10:13:09AM -0400, Robert Haas wrote: > I don't think walkdir() has any business calling LogStartupProgress() > at all. It's supposed to be a generic function, not one that is only > available to be called from the startup process, or has different > behavior there. From my point of view, the right thing is to put the > logging calls into the particular callbacks that SyncDataDirectory() > uses. You're right - this is better. On Wed, Jul 21, 2021 at 04:47:32PM +0530, Bharath Rupireddy wrote: > > > 1) I still don't see the need for two functions LogStartupProgress and > > > LogStartupProgressComplete. Most of the code is duplicate. I think we > > > can just do it with a single function something like [1]: > > > > I agree that one function can do this more succinctly. I think it's best to > > use a separate enum value for START operations and END operations. > > Maybe I'm missing something here, but I don't understand the purpose > of this. You can always combine two functions into one, but it's only > worth doing if you end up with less code, which doesn't seem to be the > case here. 4 files changed, 39 insertions(+), 71 deletions(-) > ... but I'm not exactly sure how to get there from here. Having only > LogStartupProgress() but having it do a giant if-statement to figure > out whether we're mid-phase or end-of-phase does not seem like the > right approach. I used a bool arg and negation to handle within a single switch. Maybe it's cleaner to use a separate enum value for each DONE, and set a local done flag. startup[29675] LOG: database system was interrupted; last known up at 2021-07-26 10:23:04 CDT startup[29675] LOG: syncing data directory (fsync), elapsed time: 1.38 s, current path: ./pg_ident.conf startup[29675] LOG: data directory sync (fsync) complete after 1.72 s startup[29675] LOG: database system was not properly shut down; automatic recovery in progress startup[29675] LOG: resetting unlogged relations (cleanup) complete after 0.00 s startup[29675] LOG: redo starts at 0/17BE500 startup[29675] LOG: redo in progress, elapsed time: 1.00 s, current LSN: 0/35D7CB8 startup[29675] LOG: redo in progress, elapsed time: 2.00 s, current LSN: 0/54A6918 startup[29675] LOG: redo in progress, elapsed time: 3.00 s, current LSN: 0/7370570 startup[29675] LOG: redo in progress, elapsed time: 4.00 s, current LSN: 0/924D8A0 startup[29675] LOG: redo done at 0/9B8 system usage: CPU: user: 4.28 s, system: 0.15 s, elapsed: 4.44 s startup[29675] LOG: resetting unlogged relations (init) complete after 0.03 s startup[29675] LOG: checkpoint starting: end-of-recovery immediate startup[29675] LOG: checkpoint complete: wrote 9872 buffers (60.3%); 0 WAL file(s) added, 0 removed, 8 recycled; write=0.136 s, sync=0.801 s, total=1.260 s; sync files=21, longest=0.774 s, average=B >From 9443005040be52225498d58678b5faa1668bd2ad Mon Sep 17 00:00:00 2001 From: Nitin Date: Fri, 23 Jul 2021 15:46:56 +0530 Subject: [PATCH 1/2] Shows the progress of the operations performed during startup process. The interval between each progress update is configurable and it should be mentioned in seconds --- src/backend/access/transam/xlog.c | 178 ++ src/backend/postmaster/startup.c | 1 + src/backend/storage/file/fd.c | 13 ++ src/backend/storage/file/reinit.c | 18 ++ src/backend/utils/misc/guc.c | 13 +- src/backend/utils/misc/postgresql.conf.sample | 6 + src/include/access/xlog.h | 18 ++ src/include/utils/timeout.h | 1 + 8 files changed, 247 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 3479402272..4f052050f3 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -79,6 +79,7 @@ #include "utils/relmapper.h" #include "utils/pg_rusage.h" #include "utils/snapmgr.h" +#include "utils/timeout.h" #include "utils/timestamp.h" extern uint32 bootstrap_data_checksum_version; @@ -397,6 +398,23 @@ static bool doRequestWalReceiverReply; */ static XLogRecPtr RedoStartLSN = InvalidXLogRecPtr; +/* + * Start timestamp of the operation that occur during startup process. + */ +static TimestampTz startupProcessOpStartTime; + +/* + * Indicates whether the startup progress interval mentioned by the user is + * elapsed or not. TRUE if timeout occurred, FALSE otherwise. + */ +static bool logStartupProgressTimeout; + +/* + * The interval between each progress update of the operations that occur + * during startup process. + */ +int log_startup_progress_interval; + /*-- * Shared-memory data structures for XLOG control * @@ -7351,6 +7369,8 @@ StartupXLOG(void) (errmsg("redo starts at %X/%X", LSN_FORMAT_ARGS(ReadRecPtr; + InitStartupProgress(); + /* * main redo apply loop */ @@ -7358,6 +7378,8 @@
Re: when the startup process doesn't (logging startup delays)
On Sun, Jul 25, 2021 at 1:56 PM Justin Pryzby wrote: > On Fri, Jul 23, 2021 at 04:09:47PM +0530, Nitin Jadhav wrote: > > > I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, > > > path); > > > when action == datadir_fsync_fname. > > > > I agree and fixed it. > > I saw that you fixed it by calling InitStartupProgress() after the walkdir() > calls which do pre_sync_fname. So then walkdir is calling > LogStartupProgress(STARTUP_PROCESS_OP_FSYNC) even when it's not doing fsync, > and then LogStartupProgress() is returning because !AmStartupProcess(). > > That seems indirect, fragile, and confusing. I suggest that walkdir() should > take an argument for which operation to pass to LogStartupProgress(). You can > pass a special enum for cases where nothing should be logged, like > STARTUP_PROCESS_OP_NONE. I don't think walkdir() has any business calling LogStartupProgress() at all. It's supposed to be a generic function, not one that is only available to be called from the startup process, or has different behavior there. From my point of view, the right thing is to put the logging calls into the particular callbacks that SyncDataDirectory() uses. > On Wed, Jul 21, 2021 at 04:47:32PM +0530, Bharath Rupireddy wrote: > > 1) I still don't see the need for two functions LogStartupProgress and > > LogStartupProgressComplete. Most of the code is duplicate. I think we > > can just do it with a single function something like [1]: > > I agree that one function can do this more succinctly. I think it's best to > use a separate enum value for START operations and END operations. Maybe I'm missing something here, but I don't understand the purpose of this. You can always combine two functions into one, but it's only worth doing if you end up with less code, which doesn't seem to be the case here. The strings are all different and that's most of the function, and the other stuff that gets done isn't the same either, so you'd just end up with a bunch of if-statements. That doesn't seem like an improvement. Thinking further, I guess I understand it from the caller's perspective. It's not necessarily clear why in some places we call LogStartupProgress() and others LogStartupProgressComplete(). Someone might expect a function with "complete" in the name like that to only be called once at the very end, rather than once at the end of a phase, and it does sort of make sense that you'd want to call one function everywhere rather than sometimes one and sometimes the other ... but I'm not exactly sure how to get there from here. Having only LogStartupProgress() but having it do a giant if-statement to figure out whether we're mid-phase or end-of-phase does not seem like the right approach. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't (logging startup delays)
On Fri, Jul 23, 2021 at 04:09:47PM +0530, Nitin Jadhav wrote: > > I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, > > path); > > when action == datadir_fsync_fname. > > I agree and fixed it. I saw that you fixed it by calling InitStartupProgress() after the walkdir() calls which do pre_sync_fname. So then walkdir is calling LogStartupProgress(STARTUP_PROCESS_OP_FSYNC) even when it's not doing fsync, and then LogStartupProgress() is returning because !AmStartupProcess(). That seems indirect, fragile, and confusing. I suggest that walkdir() should take an argument for which operation to pass to LogStartupProgress(). You can pass a special enum for cases where nothing should be logged, like STARTUP_PROCESS_OP_NONE. On Wed, Jul 21, 2021 at 04:47:32PM +0530, Bharath Rupireddy wrote: > 1) I still don't see the need for two functions LogStartupProgress and > LogStartupProgressComplete. Most of the code is duplicate. I think we > can just do it with a single function something like [1]: I agree that one function can do this more succinctly. I think it's best to use a separate enum value for START operations and END operations. switch(operation) { case STARTUP_PROCESS_OP_SYNCFS_START: ereport(...); break; case STARTUP_PROCESS_OP_SYNCFS_END: ereport(...); break; case STARTUP_PROCESS_OP_FSYNC_START: ereport(...); break; case STARTUP_PROCESS_OP_FSYNC_END: ereport(...); break; ... -- Justin
Re: when the startup process doesn't (logging startup delays)
> I still don't see the need for two functions LogStartupProgress and > LogStartupProgressComplete. Most of the code is duplicate. I think we > can just do it with a single function something like [1]: Initially I had written a common function for these 2. You can see that in the earlier version of the patch. Later separated it since it was too much for one function. If others also agree to make it common, I can make that change. > Why isn't there a > LogStartupProgressComplete(STARTUP_PROCESS_OP_REDO)? Is it because of > the below existing log message? > ereport(LOG, > (errmsg("redo done at %X/%X system usage: %s", > LSN_FORMAT_ARGS(ReadRecPtr), > pg_rusage_show(&ru0; Yes. Adding another log message makes it redundant. > I think it should be, "," after occurred instead of "." > + * elapsed or not. TRUE if timeout occurred, FALSE otherwise. > instead of > + * elapsed or not. TRUE if timeout occurred. FALSE otherwise. Fixed. > I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, > path); > when action == datadir_fsync_fname. I agree and fixed it. > ResetUnloggedRelations() is calling > ResetUnloggedRelationsInTablespaceDir("base", op); > before calling InitStartupProgress(). Fixed. > This shows StartupXLOG() calling ResetUnloggedRelations() twice. > Should they both be shown ? If so, maybe they should be distinguished as > here: > >elog(DEBUG1, "resetting unlogged relations: cleanup %d init %d", > (op & UNLOGGED_RELATION_CLEANUP) != 0, > (op & UNLOGGED_RELATION_INIT) != 0); Fixed. Added separate codes to distinguish. Please find the patch attached. On Wed, Jul 21, 2021 at 6:43 PM Justin Pryzby wrote: > > I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, > path); > when action == datadir_fsync_fname. > > ResetUnloggedRelations() is calling > ResetUnloggedRelationsInTablespaceDir("base", op); > before calling InitStartupProgress(). > > This shows StartupXLOG() calling ResetUnloggedRelations() twice. > Should they both be shown ? If so, maybe they should be distinguished as > here: > > elog(DEBUG1, "resetting unlogged relations: cleanup %d init %d", > (op & UNLOGGED_RELATION_CLEANUP) != 0, > (op & UNLOGGED_RELATION_INIT) != 0); > > On Wed, Jul 21, 2021 at 12:52:24PM +0530, Nitin Jadhav wrote: > > 2021-07-20 18:47:32.046 IST [102230] LOG: listening on IPv4 address > > "127.0.0.1", port 5445 > > 2021-07-20 18:47:32.048 IST [102230] LOG: listening on Unix socket > > "/tmp/.s.PGSQL.5445" > > 2021-07-20 18:47:32.051 IST [102234] LOG: database system was interrupted; > > last known up at 2021-07-20 18:46:27 IST > > 2021-07-20 18:47:32.060 IST [102234] LOG: data directory sync (fsync) > > complete after 0.00 s > > 2021-07-20 18:47:32.060 IST [102234] LOG: database system was not properly > > shut down; automatic recovery in progress > > 2021-07-20 18:47:32.063 IST [102234] LOG: unlogged relations reset after > > 0.00 s > > 2021-07-20 18:47:32.063 IST [102234] LOG: redo starts at 0/14EF418 > > 2021-07-20 18:47:33.063 IST [102234] LOG: redo in progress, elapsed time: > > 1.00 s, current LSN: 0/5C13D28 > > 2021-07-20 18:47:34.063 IST [102234] LOG: redo in progress, elapsed time: > > 2.00 s, current LSN: 0/A289160 > > 2021-07-20 18:47:35.063 IST [102234] LOG: redo in progress, elapsed time: > > 3.00 s, current LSN: 0/EE2DE10 > > 2021-07-20 18:47:35.563 IST [102234] LOG: invalid record length at > > 0/115C63E0: wanted 24, got 0 > > 2021-07-20 18:47:35.563 IST [102234] LOG: redo done at 0/115C63B8 system > > usage: CPU: user: 3.58 s, system: 0.14 s, elapsed: 3.50 s > > 2021-07-20 18:47:35.564 IST [102234] LOG: unlogged relations reset after > > 0.00 s > > 2021-07-20 18:47:35.706 IST [102230] LOG: database system is ready to > > accept connections > > -- > Justin From 85c0e711cc48bcf2f76ed8ed0257af3ecf2de306 Mon Sep 17 00:00:00 2001 From: Nitin Date: Fri, 23 Jul 2021 15:46:56 +0530 Subject: [PATCH] Shows the progress of the operations performed during startup process. The interval between each progress update is configurable and it should be mentioned in seconds --- src/backend/access/transam/xlog.c | 178 ++ src/backend/postmaster/startup.c | 1 + src/backend/storage/file/fd.c | 13 ++ src/backend/storage/file/reinit.c | 18 +++ src/backend/utils/misc/guc.c | 13 +- src/backend/utils/misc/postgresql.conf.sample | 6 + src/include/access/xlog.h | 18 +++ src/include/utils/timeout.h | 1 + 8 files changed, 247 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 3479402..4f05205 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -79,6 +79,7 @@ #include "utils/relmapper.h" #include "utils/pg_rusage.h
Re: when the startup process doesn't (logging startup delays)
I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, path); when action == datadir_fsync_fname. ResetUnloggedRelations() is calling ResetUnloggedRelationsInTablespaceDir("base", op); before calling InitStartupProgress(). This shows StartupXLOG() calling ResetUnloggedRelations() twice. Should they both be shown ? If so, maybe they should be distinguished as here: elog(DEBUG1, "resetting unlogged relations: cleanup %d init %d", (op & UNLOGGED_RELATION_CLEANUP) != 0, (op & UNLOGGED_RELATION_INIT) != 0); On Wed, Jul 21, 2021 at 12:52:24PM +0530, Nitin Jadhav wrote: > 2021-07-20 18:47:32.046 IST [102230] LOG: listening on IPv4 address > "127.0.0.1", port 5445 > 2021-07-20 18:47:32.048 IST [102230] LOG: listening on Unix socket > "/tmp/.s.PGSQL.5445" > 2021-07-20 18:47:32.051 IST [102234] LOG: database system was interrupted; > last known up at 2021-07-20 18:46:27 IST > 2021-07-20 18:47:32.060 IST [102234] LOG: data directory sync (fsync) > complete after 0.00 s > 2021-07-20 18:47:32.060 IST [102234] LOG: database system was not properly > shut down; automatic recovery in progress > 2021-07-20 18:47:32.063 IST [102234] LOG: unlogged relations reset after > 0.00 s > 2021-07-20 18:47:32.063 IST [102234] LOG: redo starts at 0/14EF418 > 2021-07-20 18:47:33.063 IST [102234] LOG: redo in progress, elapsed time: > 1.00 s, current LSN: 0/5C13D28 > 2021-07-20 18:47:34.063 IST [102234] LOG: redo in progress, elapsed time: > 2.00 s, current LSN: 0/A289160 > 2021-07-20 18:47:35.063 IST [102234] LOG: redo in progress, elapsed time: > 3.00 s, current LSN: 0/EE2DE10 > 2021-07-20 18:47:35.563 IST [102234] LOG: invalid record length at > 0/115C63E0: wanted 24, got 0 > 2021-07-20 18:47:35.563 IST [102234] LOG: redo done at 0/115C63B8 system > usage: CPU: user: 3.58 s, system: 0.14 s, elapsed: 3.50 s > 2021-07-20 18:47:35.564 IST [102234] LOG: unlogged relations reset after > 0.00 s > 2021-07-20 18:47:35.706 IST [102230] LOG: database system is ready to accept > connections -- Justin
Re: when the startup process doesn't (logging startup delays)
On Wed, Jul 21, 2021 at 12:52 PM Nitin Jadhav wrote: > Please find the v5 patch attached. Kindly let me know your comments. Thanks for the patch. Here are some comments on v5: 1) I still don't see the need for two functions LogStartupProgress and LogStartupProgressComplete. Most of the code is duplicate. I think we can just do it with a single function something like [1]: 2) Why isn't there a LogStartupProgressComplete(STARTUP_PROCESS_OP_REDO)? Is it because of the below existing log message? ereport(LOG, (errmsg("redo done at %X/%X system usage: %s", LSN_FORMAT_ARGS(ReadRecPtr), pg_rusage_show(&ru0; 3) I think it should be, "," after occurred instead of "." + * elapsed or not. TRUE if timeout occurred, FALSE otherwise. instead of + * elapsed or not. TRUE if timeout occurred. FALSE otherwise. [1] +/* + * Logs the progress of the operations performed during the startup process. + * is_complete true/false indicates that the operation is finished/ + * in-progress respectively. + */ +void +LogStartupProgress(StartupProcessOp op, const char *path, + bool is_complete) +{ + long secs; + int usecs; + int elapsed_ms; + int interval_in_ms; + + /* If not called from the startup process then this feature is of no use. */ + if (!AmStartupProcess()) + return; + + /* If the feature is disabled, then no need to proceed further. */ + if (log_startup_progress_interval < 0) + return; + + /* +* If the operation is in-progress and the timeout hasn't occurred, then +* no need to log the details. +*/ + if (!is_complete && !logStartupProgressTimeout) + return; + + /* Timeout has occurred. */ + TimestampDifference(startupProcessOpStartTime, + GetCurrentTimestamp(), + &secs, &usecs); + + /* +* If the operation is in-progress, enable the timer for the next log +* message based on the time that current log message timer was supposed to +* fire. +*/ + if (!is_complete) + { + elapsed_ms = (secs * 1000) + (usecs / 1000); + interval_in_ms = log_startup_progress_interval * 1000; + enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT, + (interval_in_ms - (elapsed_ms % interval_in_ms))); + } + + switch(op) + { + case STARTUP_PROCESS_OP_SYNCFS: + { + if (is_complete) + ereport(LOG, + (errmsg("data directory sync (syncfs) complete after %ld.%02d s", + secs, (usecs / 1; + else + ereport(LOG, + (errmsg("syncing data directory (syncfs), elapsed time: %ld.%02d s, current path: %s", + secs, (usecs / 1), path))); + } + break; + case STARTUP_PROCESS_OP_FSYNC: + { + if (is_complete) + ereport(LOG, + (errmsg("data directory sync (fsync) complete after %ld.%02d s", + secs, (usecs / 1; + else + ereport(LOG, + (errmsg("syncing data directory (fsync), elapsed time: %ld.%02d s, current path: %s", + secs, (usecs / 1), path))); + } + break; + case STARTUP_PROCESS_OP_REDO: + { + /* +* No need to log redo completion status here, as it will be +* done in the caller. +*/ + if (!is_complete) + ereport(LOG, + (errmsg("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X", + secs, (usecs / 1), LSN_FORMAT_ARGS(ReadRecPtr; + } + break; + case STARTUP_PROCESS_OP_RESET_UNLOGGED_REL: + { + if (is_complete) + ereport(LOG, + (errmsg("unlogged relations reset after %ld.%02d s", + secs, (usecs / 1; + else + ereport(LOG, + (errmsg("resetting unlogged relations, elapsed time: %ld.%02d s, current path: %s", + secs, (usecs / 1), path))); + } + break; + default: + ereport(E
Re: when the startup process doesn't (logging startup delays)
> I am not sure I am getting the code flow correctly. From > CloseStartupProgress() > naming it seems, it corresponds to InitStartupProgress() but what it is doing > is similar to LogStartupProgress(). I think it should be renamed to be inlined > with LogStartupProgress(), IMO. Renamed CloseStartupProgress() to LogStartupProgressComplete(). > This part should be an assertion, it's the developer's responsibility to call > correctly. This code is not required at all due to the fix of the next comment. > Since we do have a separate call for the in-progress operation and the > end-operation, only a single enum would have been enough. If we do this, then > I > think we should remove get_startup_process_operation_string() move messages to > the respective function. Fixed. > I'd really like to see this enabled by default, say with a default > interval of 10 seconds. If it has to be enabled explicitly, most > people won't, but I think a lot of people would benefit from knowing > why their system is slow to start up when that sort of thing happens. > I don't see much downside to having it on by default either, since it > shouldn't be expensive. I think the GUC's units should be seconds, not > milliseconds, though. I agree that it is better to enable it by default. Changed the code accordingly and changed the GUC's units to seconds. > The messages you've added are capitalized, but the ones PostgreSQL > has currently are not. You should conform to the existing style. Fixed. > The "crash recovery complete" message looks redundant with the "redo > done" message. Also, in my mind, "redo" is one particular phase of > crash recovery, so it looks really odd that "crash recovery" finishes > first and then "redo" finishes. I think some thought is needed about > the terminology here. Yes. "redo" is one phase of the crash recovery. Even "resetting unlogged relations" is also a part of the crash recovery. These 2 are the major time consuming operations of the crash recovery task. There is a separate log message to indicate the progress of "resetting the unlogged relations". So instead of saying 'performing crash recovery", it is better to say "redo in progress" and not add any additional message at the end of redo, instead retain the existing message to avoid redundancy. > I'm not clear why I get a message about the data directory fsync but > not about resetting unlogged relations. I wasn't really expecting to > get a message about things that completed in less than the configured > interval, although I find that I don't mind having it there either. > But then it seems like it should be consistent across the various > things we're timing, and resetting unlogged relations should certainly > be one of those. It is the same across all the things we'are timing. I was able to see those messages on my machine. I feel there is not much overhead of logging one message at the end of the operation even though it completes within the configured interval. Following are the log messages shown on my machine. 2021-07-20 18:47:32.046 IST [102230] LOG: listening on IPv4 address "127.0.0.1", port 5445 2021-07-20 18:47:32.048 IST [102230] LOG: listening on Unix socket "/tmp/.s.PGSQL.5445" 2021-07-20 18:47:32.051 IST [102234] LOG: database system was interrupted; last known up at 2021-07-20 18:46:27 IST 2021-07-20 18:47:32.060 IST [102234] LOG: data directory sync (fsync) complete after 0.00 s 2021-07-20 18:47:32.060 IST [102234] LOG: database system was not properly shut down; automatic recovery in progress 2021-07-20 18:47:32.063 IST [102234] LOG: unlogged relations reset after 0.00 s 2021-07-20 18:47:32.063 IST [102234] LOG: redo starts at 0/14EF418 .2021-07-20 18:47:33.063 IST [102234] LOG: redo in progress, elapsed time: 1.00 s, current LSN: 0/5C13D28 .2021-07-20 18:47:34.063 IST [102234] LOG: redo in progress, elapsed time: 2.00 s, current LSN: 0/A289160 .2021-07-20 18:47:35.063 IST [102234] LOG: redo in progress, elapsed time: 3.00 s, current LSN: 0/EE2DE10 2021-07-20 18:47:35.563 IST [102234] LOG: invalid record length at 0/115C63E0: wanted 24, got 0 2021-07-20 18:47:35.563 IST [102234] LOG: redo done at 0/115C63B8 system usage: CPU: user: 3.58 s, system: 0.14 s, elapsed: 3.50 s 2021-07-20 18:47:35.564 IST [102234] LOG: unlogged relations reset after 0.00 s 2021-07-20 18:47:35.706 IST [102230] LOG: database system is ready to accept connections > The way you've coded this has some drift. In a perfect world, I'd > get a progress report at 1000.00 ms, 2000.00 ms, 3000.00 ms, etc. > That's never going to be the case, because there's always going to be > a slightly delay in responding to the timer interrupt. However, as > you've coded it, the delay increases over time. The first "Performing > crash recovery" message is only 373 ms late, but the last one is 4916 > ms late. To avoid this, you want to reschedule the timer interrupt > based on the time the last one was supposed to fire, not the time it > actually d