Re: recovery_target immediate timestamp
On 2020/11/13 8:39, Euler Taveira wrote: On Wed, 11 Nov 2020 at 22:40, Fujii Masao mailto:masao.fu...@oss.nttdata.com>> wrote: On 2020/11/12 6:00, Euler Taveira wrote: > The first patch adds a new message that prints the latest completed checkpoint > when the consistent state is reached. I'm not sure how useful this information is in practice. Fujii, thanks for reviewing it. It provides the same information as the "last completed transaction was" message. > It also exposes the checkpoint timestamp > in debug messages. ereport(DEBUG1, (errmsg("checkpoint record is at %X/%X", (uint32) (checkPointLoc >> 32), (uint32) checkPointLoc))); + ereport(DEBUG1, + (errmsg("checkpoint time is %s", str_time(checkPoint.time; The above first debug message displays the LSN of the checkpoint record. OTOH, the second message displays the time when the checkpoint started (not the time when checkpoint record was written at the end of checkpoint). So isn't it confusing to display those inconsistent information together? Indeed the checkpoint timestamp is from before it determines the REDO LSN. Are you saying the this checkpoint timestamp is inconsistent because it is not near the point it saves the RedoRecPtr? If so, let's move checkPoint.time a few lines below. No. What I'd like to say is; checkPointLoc that the first debug message outputs is the LSN of checkpoint record, not the checkpoint REDO location. The checkpoint REDO location is determined at the early stage of checkpointing. OTOH, the location of checkpoint record is determined at the end of checkpointing. They are different. The checkpoint time that the second debug message you added outputs is the timestamp determined at the beginning of checkpointing. So it seems not reasonable to display the location of checkpoint record and the checkpoint time because they are determined at the different timing. Am I missing something? /* * Here we update the shared RedoRecPtr for future XLogInsert calls; this * must be done while holding all the insertion locks. * * Note: if we fail to complete the checkpoint, RedoRecPtr will be left * pointing past where it really needs to point. This is okay; the only * consequence is that XLogInsert might back up whole buffers that it * didn't really need to. We can't postpone advancing RedoRecPtr because * XLogInserts that happen while we are dumping buffers must assume that * their buffer changes are not included in the checkpoint. */ RedoRecPtr = XLogCtl->Insert.RedoRecPtr = checkPoint.redo; checkPoint.time = (pg_time_t) time(NULL); I realized that I was using the wrong variable in one of the debug messages. > The second patch provides the checkpoint timestamp in the pg_waldump output and > also when you enable wal_debug parameter. The pg_waldump output looks like +1 +#ifdef FRONTEND + strftime(checkpointstr, sizeof(checkpointstr), "%c", localtime(&time_tmp)); +#else + pg_strftime(checkpointstr, sizeof(checkpointstr), "%c", pg_localtime(&time_tmp, log_timezone)); +#endif You can simplify the code by using timestamptz_to_str() here instead, like xact_desc_commit() does. I have the same idea until I realized that checkPoint.time is pg_time_t and not TimestampTz. [digging the code a bit...] I figure out there is a function that converts from pg_time_t to TimestampTz: time_t_to_timestamptz(). I removed that ugly code but have to duplicate this function into compat.c. I don't have a strong preference but I'm attaching a new patch. Thanks for updating the patch! At least for me this approach looks better. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: recovery_target immediate timestamp
On Wed, 11 Nov 2020 at 22:40, Fujii Masao wrote: > > On 2020/11/12 6:00, Euler Taveira wrote: > > > The first patch adds a new message that prints the latest completed > checkpoint > > when the consistent state is reached. > > I'm not sure how useful this information is in practice. > > Fujii, thanks for reviewing it. It provides the same information as the "last completed transaction was" message. > > It also exposes the checkpoint timestamp > > in debug messages. > > ereport(DEBUG1, > (errmsg("checkpoint record is at > %X/%X", > (uint32) > (checkPointLoc >> 32), (uint32) checkPointLoc))); > + ereport(DEBUG1, > + (errmsg("checkpoint time is %s", > str_time(checkPoint.time; > > The above first debug message displays the LSN of the checkpoint record. > OTOH, the second message displays the time when the checkpoint started > (not the time when checkpoint record was written at the end of checkpoint). > So isn't it confusing to display those inconsistent information together? > > Indeed the checkpoint timestamp is from before it determines the REDO LSN. Are you saying the this checkpoint timestamp is inconsistent because it is not near the point it saves the RedoRecPtr? If so, let's move checkPoint.time a few lines below. /* * Here we update the shared RedoRecPtr for future XLogInsert calls; this * must be done while holding all the insertion locks. * * Note: if we fail to complete the checkpoint, RedoRecPtr will be left * pointing past where it really needs to point. This is okay; the only * consequence is that XLogInsert might back up whole buffers that it * didn't really need to. We can't postpone advancing RedoRecPtr because * XLogInserts that happen while we are dumping buffers must assume that * their buffer changes are not included in the checkpoint. */ RedoRecPtr = XLogCtl->Insert.RedoRecPtr = checkPoint.redo; checkPoint.time = (pg_time_t) time(NULL); I realized that I was using the wrong variable in one of the debug messages. > > The second patch provides the checkpoint timestamp in the pg_waldump > output and > > also when you enable wal_debug parameter. The pg_waldump output looks > like > > +1 > > +#ifdef FRONTEND > + strftime(checkpointstr, sizeof(checkpointstr), "%c", > localtime(&time_tmp)); > +#else > + pg_strftime(checkpointstr, sizeof(checkpointstr), "%c", > pg_localtime(&time_tmp, log_timezone)); > +#endif > > You can simplify the code by using timestamptz_to_str() here instead, like > xact_desc_commit() does. > > I have the same idea until I realized that checkPoint.time is pg_time_t and not TimestampTz. [digging the code a bit...] I figure out there is a function that converts from pg_time_t to TimestampTz: time_t_to_timestamptz(). I removed that ugly code but have to duplicate this function into compat.c. I don't have a strong preference but I'm attaching a new patch. At the end, I asked myself if it is worth changing this type from pg_time_t to TimestampTz. -- Euler Taveira http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From df8f4d295ebc205d46e77c5037c399afda17c546 Mon Sep 17 00:00:00 2001 From: Euler Taveira Date: Wed, 11 Nov 2020 11:24:25 -0300 Subject: [PATCH 1/2] Report latest completed checkpoint timestamp A new detail message prints the latest completed checkpoint when reaching the consistent state. This information is useful when you use recovery_target = immediate because it is not clear what timestamp it stops applying the WAL. It also adds new debug messages that report the checkpoint timestamp. It might be useful for debug purposes. --- src/backend/access/transam/xlog.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index aa63f37615..2ff83c49e2 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6524,6 +6524,8 @@ StartupXLOG(void) ereport(DEBUG1, (errmsg("checkpoint record is at %X/%X", (uint32) (checkPointLoc >> 32), (uint32) checkPointLoc))); + ereport(DEBUG1, + (errmsg("checkpoint time is %s", str_time(checkPoint.time; InRecovery = true; /* force recovery even if SHUTDOWNED */ /* @@ -6657,6 +6659,8 @@ StartupXLOG(void) ereport(DEBUG1, (errmsg("checkpoint record is at %X/%X", (uint32) (checkPointLoc >> 32), (uint32) checkPointLoc))); + ereport(DEBUG1, + (errmsg("checkpoint time is %s", str_time(ControlFile->checkPointCopy.time; } else { @@ -8033,7 +8037,9 @@ CheckRecoveryConsistency(void) ereport(LOG, (errmsg("consistent recovery state reached at %X/%X", (uint32) (lastReplayedEndRecPtr >> 32), -
Re: recovery_target immediate timestamp
On 2020/11/12 6:00, Euler Taveira wrote: Hi, While restoring a backup using recovery_target immediate, I noticed that there isn't a timestamp available. LOG: consistent recovery state reached at 0/A000100 LOG: recovery stopping after reaching consistency LOG: pausing at the end of recovery HINT: Execute pg_wal_replay_resume() to promote. LOG: database system is ready to accept read only connections if you decide to use one of the targets or just recover until the end of the WAL, you get a (last completed transaction) timestamp. LOG: redo done at 0/10FFEC38 system usage: CPU: user: 0.10 s, system: 0.05 s, elapsed: 1.65 s LOG: last completed transaction was at log time 2020-11-11 17:27:31.715251-03 LOG: restored log file "00010010" from archive cp: cannot stat '/a/pgarchive/0002.history': No such file or directory LOG: selected new timeline ID: 2 LOG: archive recovery complete cp: cannot stat '/a/pgarchive/0001.history': No such file or directory LOG: database system is ready to accept connections I dig into the pg_waldump output to figure out the timestamp, however, the checkpoint timestamp isn't printed by pg_waldump. The checkpoint timestamp might be useful information at least when you set large values for a checkpoint or need to investigate a performance/corruption issue. The first patch adds a new message that prints the latest completed checkpoint when the consistent state is reached. I'm not sure how useful this information is in practice. It also exposes the checkpoint timestamp in debug messages. ereport(DEBUG1, (errmsg("checkpoint record is at %X/%X", (uint32) (checkPointLoc >> 32), (uint32) checkPointLoc))); + ereport(DEBUG1, + (errmsg("checkpoint time is %s", str_time(checkPoint.time; The above first debug message displays the LSN of the checkpoint record. OTOH, the second message displays the time when the checkpoint started (not the time when checkpoint record was written at the end of checkpoint). So isn't it confusing to display those inconsistent information together? LOG: consistent recovery state reached at 0/A000100 DETAIL: Last completed checkpoint was at log time 2020-11-11 17:31:50 -03. LOG: recovery stopping after reaching consistency LOG: pausing at the end of recovery HINT: Execute pg_wal_replay_resume() to promote. LOG: database system is ready to accept read only connections . . . DEBUG: checkpoint record is at 0/A60 DEBUG: checkpoint time is 2020-11-11 17:34:19 -03 The second patch provides the checkpoint timestamp in the pg_waldump output and also when you enable wal_debug parameter. The pg_waldump output looks like +1 +#ifdef FRONTEND + strftime(checkpointstr, sizeof(checkpointstr), "%c", localtime(&time_tmp)); +#else + pg_strftime(checkpointstr, sizeof(checkpointstr), "%c", pg_localtime(&time_tmp, log_timezone)); +#endif You can simplify the code by using timestamptz_to_str() here instead, like xact_desc_commit() does. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
recovery_target immediate timestamp
Hi, While restoring a backup using recovery_target immediate, I noticed that there isn't a timestamp available. LOG: consistent recovery state reached at 0/A000100 LOG: recovery stopping after reaching consistency LOG: pausing at the end of recovery HINT: Execute pg_wal_replay_resume() to promote. LOG: database system is ready to accept read only connections if you decide to use one of the targets or just recover until the end of the WAL, you get a (last completed transaction) timestamp. LOG: redo done at 0/10FFEC38 system usage: CPU: user: 0.10 s, system: 0.05 s, elapsed: 1.65 s LOG: last completed transaction was at log time 2020-11-11 17:27:31.715251-03 LOG: restored log file "00010010" from archive cp: cannot stat '/a/pgarchive/0002.history': No such file or directory LOG: selected new timeline ID: 2 LOG: archive recovery complete cp: cannot stat '/a/pgarchive/0001.history': No such file or directory LOG: database system is ready to accept connections I dig into the pg_waldump output to figure out the timestamp, however, the checkpoint timestamp isn't printed by pg_waldump. The checkpoint timestamp might be useful information at least when you set large values for a checkpoint or need to investigate a performance/corruption issue. The first patch adds a new message that prints the latest completed checkpoint when the consistent state is reached. It also exposes the checkpoint timestamp in debug messages. LOG: consistent recovery state reached at 0/A000100 DETAIL: Last completed checkpoint was at log time 2020-11-11 17:31:50 -03. LOG: recovery stopping after reaching consistency LOG: pausing at the end of recovery HINT: Execute pg_wal_replay_resume() to promote. LOG: database system is ready to accept read only connections . . . DEBUG: checkpoint record is at 0/A60 DEBUG: checkpoint time is 2020-11-11 17:34:19 -03 The second patch provides the checkpoint timestamp in the pg_waldump output and also when you enable wal_debug parameter. The pg_waldump output looks like rmgr: XLOGlen (rec/tot):114/ 114, tx: 0, lsn: 0/0A60, prev 0/0A28, desc: CHECKPOINT_ONLINE redo 0/A28; timestamp qua 11 nov 2020 17:34:19 -03; tli 1; prev tli 1; fpw true; xid 0:519; oid 24576; multi 1; offset 0; oldest xid 501 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 519; online and the debug messages are DEBUG: checkpoint record is at 0/A60 DEBUG: checkpoint time is 2020-11-11 17:37:47 -03 LOG: REDO @ 0/A60; LSN 0/AD8: prev 0/A28; xid 0; len 88 - XLOG/CHECKPOINT_ONLINE: redo 0/A28; timestamp Wed Nov 11 17:37:47 2020; tli 1; prev tli 1; fpw true; xid 0:519; oid 24576; multi 1; offset 0; oldest xid 501 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 519; online Regards, -- Euler Taveira http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From 4b725a45dfb89ee9a463e6ee7987cd8b660053cd Mon Sep 17 00:00:00 2001 From: Euler Taveira Date: Wed, 11 Nov 2020 11:24:25 -0300 Subject: [PATCH 1/2] Report latest completed checkpoint timestamp A new detail message prints the latest completed checkpoint when reaching the consistent state. This information is useful when you use recovery_target = immediate because it is not clear what timestamp it stops applying the WAL. It also adds new debug messages that report the checkpoint timestamp. It might be useful for debug purposes. --- src/backend/access/transam/xlog.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index aa63f37615..cf70950932 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6524,6 +6524,8 @@ StartupXLOG(void) ereport(DEBUG1, (errmsg("checkpoint record is at %X/%X", (uint32) (checkPointLoc >> 32), (uint32) checkPointLoc))); + ereport(DEBUG1, + (errmsg("checkpoint time is %s", str_time(checkPoint.time; InRecovery = true; /* force recovery even if SHUTDOWNED */ /* @@ -6657,6 +6659,8 @@ StartupXLOG(void) ereport(DEBUG1, (errmsg("checkpoint record is at %X/%X", (uint32) (checkPointLoc >> 32), (uint32) checkPointLoc))); + ereport(DEBUG1, + (errmsg("checkpoint time is %s", str_time(checkPoint.time; } else { @@ -8033,7 +8037,9 @@ CheckRecoveryConsistency(void) ereport(LOG, (errmsg("consistent recovery state reached at %X/%X", (uint32) (lastReplayedEndRecPtr >> 32), - (uint32) lastReplayedEndRecPtr))); + (uint32) lastReplayedEndRecPtr), + errdetail("Last completed checkpoint was at log time %s.", + str_time(ControlFile->checkPointCopy.time; } /* -- 2.20.1 From 9b7b3effad8c860a003dc60eb1ea53aa15a26ad9 Mon Sep 17 00:00:00 2001 From: Euler Taveira Date: Wed, 11 Nov 202