I spent a lot of time trying to figure out why xlog.c has global variables ReadRecPtr and EndRecPtr instead of just relying on the eponymous structure members inside the XLogReaderState. I concluded that the values are the same at most points in the code, and thus that we could just use xlogreaderstate->{Read,End}RecPtr instead. There are two places where this wouldn't produce the same results we're getting today. Both of those appear to be bugs.
The reason why it's generally the case that ReadRecPtr == xlogreaderstate->ReadRecPtr and likewise for EndRecPtr is that ReadRecord() calls XLogReadRecord(), which sets the values inside the XLogReaderState, and then immediately assigns the values stored there to the global variables. There's no other code that changes the other global variables, and the only other code that changes the structure members is XLogBeginRead(). So the values can be unequal from the time XLogBeginRead() is called up until the time that the XLogReadRecord() call inside ReadRecord() returns. In practice, StartupXLOG() always calls ReadRecord() right after it calls XLogBeginRead(), and ReadRecord() does not reference either global variable before calling XLogReadRecord(), so the problem surface is limited to code that runs underneath XLogReadRecord(). XLogReadRecord() is part of xlogreader.c, but it uses a callback interface: the callback is XLogPageRead(), which itself references EndRecPtr, and also calls WaitForWALToBecomeAvailable(), which in turn calls rescanLatestTimeLine(), which also references EndRecPtr. So these are the two problem cases: XLogPageRead(), and rescanLatestTimeLine(). In rescanLatestTimeLine(), the problem is IMHO probably serious enough to justify a separate commit with back-patching. The problem is that EndRecPtr is being used here to reject impermissible attempts to switch to a bad timeline, but if pg_wal starts out empty, EndRecPtr will be 0 here, which causes the code to fail to detect a case that should be prohibited. Consider the following test case: - create a primary - create standby #1 from the primary - start standby #1 and promote it - take a backup from the primary using -Xnone to create standby #2 - clear primary_conninfo on standby #2 and then start it - copy 00000002.history from standby #1 to standby #2 You get: 2021-11-17 15:34:26.213 EST [7474] LOG: selected new timeline ID: 2 But with the attached patch, you get: 2021-11-17 16:12:01.566 EST [20900] LOG: new timeline 2 forked off current database system timeline 1 before current recovery point 0/A000060 Had the problem occurred at some later point in the WAL stream rather than before fetching the very first record, I think everything is fine; at that point, I think that the global variable EndRecPtr will be initialized. I'm not entirely sure that it contains exactly the right value, but it's someplace in the right ballpark, at least. In XLogPageRead(), the problem is just cosmetic. We're only using EndRecPtr as an argument to emode_for_corrupt_record(), which is all about suppressing duplicate complaints about the same LSN. But if the xlogreader has been repositioned using XLogBeginRead() since the last call to ReadRecord(), or if there are no preceding calls to ReadRecord(), then the value of EndRecPtr here is left over from the previous read position and is not particularly related to the record we're reading now. xlogreader->EndRecPtr, OTOH, is. This doesn't seem worth a separate commit to me, or a back-patch, but it seems worth fixing while I'm cleaning up these global variables. Review appreciated. Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
v1-0001-xlog.c-Remove-global-variables-ReadRecPtr-and-End.patch
Description: Binary data