At Mon, 20 Jun 2022 11:57:20 -0400, Robert Haas <robertmh...@gmail.com> wrote 
in 
> It seems to me that what we want to do is: if we're about to start
> allowing WAL writes, then consider whether to insert an aborted
> contrecord record. Now, if we are about to start allowing WAL write,
> we must determine the LSN at which the first such record will be
> written. Then there are two possibilities: either that LSN is on an
> existing record boundary, or it isn't. In the former case, no aborted
> contrecord record is required. In the latter case, we're writing at
> LSN which would have been in the middle of a previous record, except
> that the record in question was never finished or at least we don't
> have all of it. So the first WAL we insert at our chosen starting LSN
> must be an aborted contrecord record.

Right.

> I'm not quite sure I understand the nature of the remaining bug here,
> but this logic seems quite sketchy to me:
> 
>             /*
>              * When not in standby mode we find that WAL ends in an incomplete
>              * record, keep track of that record.  After recovery is done,
>              * we'll write a record to indicate to downstream WAL readers that
>              * that portion is to be ignored.
>              */
>             if (!StandbyMode &&
>                 !XLogRecPtrIsInvalid(xlogreader->abortedRecPtr))
>             {
>                 abortedRecPtr = xlogreader->abortedRecPtr;
>                 missingContrecPtr = xlogreader->missingContrecPtr;
>             }
> 
> I don't see what StandbyMode has to do with anything here. The
> question is whether the place where we're going to begin writing WAL
> is on an existing record boundary or not. If it so happens that when
> StandbyMode is turned on we can never decide to promote in the middle
> of a record, then maybe there's no need to track this information when
> StandbyMode = true, but I don't see a good reason why we should make
> that assumption. What if in the future somebody added a command that

Right.  I came to the same conclusion before reading this section. It
is rearer than other cases but surely possible.

> says "don't keep trying to read more WAL, just promote RIGHT HERE?". I
> think this logic would surely be incorrect in that case. It feels to
> me like the right thing to do is to always keep track if WAL ends in
> an incomplete record, and then when we promote, we write an aborted
> contrecord record if WAL ended in an incomplete record, full stop.

Agreed. Actually, with the second patch applied, removing !StandbyMode
from the condition makes no difference in behavior.

> Now, we do need to keep in mind that, while in StandbyMode, we might
> reach the end of WAL more than once, because we have a polling loop
> that looks for more WAL. So it does not work to just set the values
> once and then assume that's the whole truth forever. But why not
> handle that by storing the abortedRecPtr and missingContrecPtr
> unconditionally after every call to XLogPrefetcherReadRecord()? They
> will go back and forth between XLogRecPtrIsInvalid and other values
> many times but the last value should -- I think -- be however things
> ended up at the point where we decided to stop reading WAL.

Unfortunately it doesn't work because we read a record already known
to be complete again at the end of recovery.  It is the reason of
"abortedRecPtr < xlogreader->EndRecPtr" in my PoC patch.  Without it,
abrotedRecPtr is erased when it is actually needed.  I don't like that
expedient-looking condition, but the strict condition for resetting
abortedRecPtr is iff "we have read a complete record at the same or
grater LSN ofabortedRecPtr"...

Come to think of this, I noticed that we can get rid of the
file-global abortedRecPtr by letting FinishWalRecovery copy
abortedRecPtr *before* reading the last record. This also allows us to
remove the "expedient" condition.


> I haven't really dived into this issue much so I might be totally off
> base, but it just doesn't feel right to me that this should depend on
> whether we're in standby mode. That seems at best incidentally related
> to whether an aborted contrecord record is required.
> 
> P.S. "aborted contrecord record" is really quite an awkward turn of phrase!

Thats true!  Always open for better phrasings:(

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 6eba626420..c1be15039d 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -360,15 +360,6 @@ typedef struct XLogRecoveryCtlData
 
 static XLogRecoveryCtlData *XLogRecoveryCtl = NULL;
 
-/*
- * abortedRecPtr is the start pointer of a broken record at end of WAL when
- * recovery completes; missingContrecPtr is the location of the first
- * contrecord that went missing.  See CreateOverwriteContrecordRecord for
- * details.
- */
-static XLogRecPtr abortedRecPtr;
-static XLogRecPtr missingContrecPtr;
-
 /*
  * if recoveryStopsBefore/After returns true, it saves information of the stop
  * point here
@@ -944,12 +935,6 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		minRecoveryPointTLI = 0;
 	}
 
-	/*
-	 * Start recovery assuming that the final record isn't lost.
-	 */
-	abortedRecPtr = InvalidXLogRecPtr;
-	missingContrecPtr = InvalidXLogRecPtr;
-
 	*wasShutdown_ptr = wasShutdown;
 	*haveBackupLabel_ptr = haveBackupLabel;
 	*haveTblspcMap_ptr = haveTblspcMap;
@@ -1430,6 +1415,14 @@ FinishWalRecovery(void)
 		lastRec = XLogRecoveryCtl->lastReplayedReadRecPtr;
 		lastRecTLI = XLogRecoveryCtl->lastReplayedTLI;
 	}
+
+	/*
+	 *  record aborted contrecord status before the next ReadRecord erases the
+	 *  state
+	 */
+	result->abortedRecPtr = xlogreader->abortedRecPtr;
+	result->missingContrecPtr = xlogreader->missingContrecPtr;
+
 	XLogPrefetcherBeginRead(xlogprefetcher, lastRec);
 	(void) ReadRecord(xlogprefetcher, PANIC, false, lastRecTLI);
 	endOfLog = xlogreader->EndRecPtr;
@@ -1503,9 +1496,6 @@ FinishWalRecovery(void)
 	result->lastRecTLI = lastRecTLI;
 	result->endOfLog = endOfLog;
 
-	result->abortedRecPtr = abortedRecPtr;
-	result->missingContrecPtr = missingContrecPtr;
-
 	result->standby_signal_file_found = standby_signal_file_found;
 	result->recovery_signal_file_found = recovery_signal_file_found;
 
@@ -1970,10 +1960,6 @@ xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI)
 				 LSN_FORMAT_ARGS(xlrec.overwritten_lsn),
 				 LSN_FORMAT_ARGS(record->overwrittenRecPtr));
 
-		/* We have safely skipped the aborted record */
-		abortedRecPtr = InvalidXLogRecPtr;
-		missingContrecPtr = InvalidXLogRecPtr;
-
 		ereport(LOG,
 				(errmsg("successfully skipped missing contrecord at %X/%X, overwritten at %s",
 						LSN_FORMAT_ARGS(xlrec.overwritten_lsn),
@@ -2971,19 +2957,6 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode,
 		record = XLogPrefetcherReadRecord(xlogprefetcher, &errormsg);
 		if (record == NULL)
 		{
-			/*
-			 * When not in standby mode we find that WAL ends in an incomplete
-			 * record, keep track of that record.  After recovery is done,
-			 * we'll write a record to indicate to downstream WAL readers that
-			 * that portion is to be ignored.
-			 */
-			if (!StandbyMode &&
-				!XLogRecPtrIsInvalid(xlogreader->abortedRecPtr))
-			{
-				abortedRecPtr = xlogreader->abortedRecPtr;
-				missingContrecPtr = xlogreader->missingContrecPtr;
-			}
-
 			if (readFile >= 0)
 			{
 				close(readFile);

Reply via email to