On Tue, Aug 30, 2022 at 6:04 AM Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > On 8/29/22 12:21, Thomas Munro wrote: > > 0001: Instead of figuring out when to invalidate the cache, let's > > just invalidate it before every read attempt. It is only marked valid > > after success (ie state->readLen > 0). No need to worry about error > > cases. > > Maybe I misunderstand how all this works, but won't this have a really > bad performance impact. If not, why do we need the cache at all?
It's a bit confusing because there are several levels of "read". The cache remains valid as long as the caller of ReadPageInternal() keeps asking for data that is in range (see early return after comment "/* check whether we have all the requested data already */"). As soon as the caller asks for something not in range, this patch marks the cache invalid before calling the page_read() callback (= XLogPageRead()). It is only marked valid again after that succeeds. Here's a new version with no code change, just a better commit message to try to explain that more clearly.
From d02d851ab515d4a1f883f657bc43a9cae600a5d8 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Mon, 29 Aug 2022 19:56:27 +1200 Subject: [PATCH v2 1/3] Fix cache invalidation in rare recovery_prefetch cases. XLogPageRead() can retry internally in rare cases after a read has succeeded, overwriting the page buffer. Namely, short reads, and page validation failures while in standby mode (see commit 0668719801). Due to an oversight in commit 3f1ce973, these cases could leave stale data in the internal cache of xlogreader.c without marking it invalid. The main defense against stale cached data was in the error handling path of ReadPageInternal(), but that wasn't quite enough for errors handled internally by XLogPageRead()'s retry loop. Instead of trying to invalidate the cache in the relevant failure cases, ReadPageInternal() now invalidates it before even calling the page_read callback (ie XLogPageRead()). It is only marked valid by setting state->readLen etc after page_read() succeeds. It remains valid until the caller of ReadPageInternal() eventually asks for something out of range and we have to go back to page_read() for more. The removed line that was setting errormsg_deferred was redundant because it's already set by report_invalid_record(). Back-patch to 15. Reported-by: Noah Misch <n...@leadboat.com> Reviewed-by: Discussion: https://postgr.es/m/20220807003627.GA4168930%40rfd.leadboat.com --- src/backend/access/transam/xlogreader.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index f17e80948d..e883ade607 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -987,6 +987,13 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen) targetPageOff == state->segoff && reqLen <= state->readLen) return state->readLen; + /* + * Invalidate contents of internal buffer before read attempt. Just set + * the length to 0, rather than a full XLogReaderInvalReadState(), so we + * don't forget the segment we last read. + */ + state->readLen = 0; + /* * Data is not in our buffer. * @@ -1067,11 +1074,6 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen) return readLen; err: - if (state->errormsg_buf[0] != '\0') - { - state->errormsg_deferred = true; - XLogReaderInvalReadState(state); - } return XLREAD_FAIL; } -- 2.30.2
From 6439f5ed5b9cb6524c307f06a43c9c54f51d3b86 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Mon, 29 Aug 2022 21:08:25 +1200 Subject: [PATCH v2 2/3] Improve xlogrecovery.c/xlogreader.c boundary. Create a function XLogReaderResetError() to clobber the error message inside an XLogReaderState, instead of doing it directly from xlogrecovery.c. This is older code from commit 0668719801, but commit 5dc0418f probably should have tidied this up because it leaves the internal state a bit out of sync (not a live bug, but a little confusing). --- src/backend/access/transam/xlogreader.c | 10 ++++++++++ src/backend/access/transam/xlogrecovery.c | 2 +- src/include/access/xlogreader.h | 3 +++ 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index e883ade607..c100e18333 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -1325,6 +1325,16 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr, return true; } +/* + * Forget about an error produced by XLogReaderValidatePageHeader(). + */ +void +XLogReaderResetError(XLogReaderState *state) +{ + state->errormsg_buf[0] = '\0'; + state->errormsg_deferred = false; +} + /* * Find the first record with an lsn >= RecPtr. * diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index a59a0e826b..422322cf5a 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -3335,7 +3335,7 @@ retry: (errmsg_internal("%s", xlogreader->errormsg_buf))); /* reset any error XLogReaderValidatePageHeader() might have set */ - xlogreader->errormsg_buf[0] = '\0'; + XLogReaderResetError(xlogreader); goto next_record_is_invalid; } diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h index 87ff00feb7..97b6f00114 100644 --- a/src/include/access/xlogreader.h +++ b/src/include/access/xlogreader.h @@ -373,6 +373,9 @@ extern DecodedXLogRecord *XLogReadAhead(XLogReaderState *state, extern bool XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr, char *phdr); +/* Forget error produced by XLogReaderValidatePageHeader(). */ +extern void XLogReaderResetError(XLogReaderState *state); + /* * Error information from WALRead that both backend and frontend caller can * process. Currently only errors from pread can be reported. -- 2.30.2
From 394d2b08a896a7e3094ffb052adaa11486b7b14f Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Mon, 29 Aug 2022 21:14:25 +1200 Subject: [PATCH v2 3/3] Don't retry inside XLogPageRead() if reading ahead. Give up immediately if we have a short read or a page header invalidation failure inside XLogPageRead(). Once recovery catches up to this point, it can deal with it. This doesn't fix any known live bug, but it fits with the general philosophy of giving up fast when prefetching runs into trouble. --- src/backend/access/transam/xlogrecovery.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 422322cf5a..76ce9704ca 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -3342,6 +3342,14 @@ retry: return readLen; next_record_is_invalid: + /* + * If we're reading ahead, give up fast. Standby retries and error + * reporting will be handled by a later read when recovery catches up to + * this point. + */ + if (xlogreader->nonblocking) + return XLREAD_WOULDBLOCK; + lastSourceFailed = true; if (readFile >= 0) -- 2.30.2