On Tue, Feb 1, 2022 at 9:39 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote: > I found that CreateRestartPoint() already reported the redo lsn as follows > after emitting the restartpoint log message. To avoid duplicated logging of > the same information, we should update this code? > > ereport((log_checkpoints ? LOG : DEBUG2), > (errmsg("recovery restart point at %X/%X", > LSN_FORMAT_ARGS(lastCheckPoint.redo)), > xtime ? errdetail("Last completed transaction was at > log time %s.", > > timestamptz_to_str(xtime)) : 0)); > > This code reports lastCheckPoint.redo as redo lsn. OTOH, with the patch, > LogCheckpointEnd() reports ControlFile->checkPointCopy.redo. They may be > different, for example, when the current DB state is not > DB_IN_ARCHIVE_RECOVERY. In this case, which lsn should we report as redo lsn?
Do we ever reach CreateRestartPoint when ControlFile->stat != DB_IN_ARCHIVE_RECOVERY? Assert(ControlFile->state == DB_IN_ARCHIVE_RECOVERY); in CreateRestartPoint doesn't fail any regression tests. Here's what can happen: lastCheckPoint.redo is 100 and ControlFile->checkPointCopy.redo is 105, so, "skipping restartpoint, already performed at %X/%X" LogCheckpointEnd isn't reached lastCheckPoint.redo is 105 and ControlFile->checkPointCopy.redo is 100 and ControlFile->state == DB_IN_ARCHIVE_RECOVERY, then the control file gets updated and LogCheckpointEnd prints the right redo lsn. lastCheckPoint.redo is 105 and ControlFile->checkPointCopy.redo is 100 and ControlFile->state != DB_IN_ARCHIVE_RECOVERY, the the control file doesn't get updated and LogCheckpointEnd just prints the control redo lsn. Looks like this case is rare given Assert(ControlFile->state == DB_IN_ARCHIVE_RECOVERY); doesn't fail any tests. I think we should just let LogCheckpointEnd print the redo lsn from the control file. We can just remove the above errmsg("recovery restart point at %X/%X" message altogether or just print it only in the rare scenario, something like below: if (ControlFile->state != DB_IN_ARCHIVE_RECOVERY) { ereport((log_checkpoints ? LOG : DEBUG2), (errmsg("performed recovery restart point at %X/%X while the database state is %s", LSN_FORMAT_ARGS(lastCheckPoint.redo)), getDBState(ControlFile->state))); } And the last commit/abort records's timestamp will always get logged even before we reach here in the main redo loop (errmsg("last completed transaction was at log time %s"). Or another way is to just pass the redo lsn to LogCheckpointEnd and pass the lastCheckPoint.redo in if (ControlFile->state != DB_IN_ARCHIVE_RECOVERY) cases or when control file isn't updated but restart point happened. Thoughts? Regards, Bharath Rupireddy.