Hi, > 2. Patch 0002 (remote source WAL race) I think goes the wrong way. > > Moving the consistency LSN capture before collecting the file list > breaks the rule that backup stop relies on: the consistency point must > be established AFTER the data pages are read, never before. The source > is a live server and pg_rewind reads pages from disk one by one with > pg_read_binary_file(). A page modified on the source after we captured > flush LSN, but before we copy it, lands on the target with an LSN > greater than minRecoveryPoint. The target can then reach minRecoveryPoint > and be declared consistent while holding pages from the future whose WAL > was never replayed - a torn, cross-page inconsistent state. > > The good half of the patch is insert LSN -> flush LSN. Note that > pg_rewind reads pages from disk, and by the WAL-before-data rule any > on-disk page has LSN <= flush LSN. So flush LSN is the tight upper bound > on what we can copy: insert LSN overshoots (may point past any copied > page, possibly into WAL still only in shared memory), flush-before-copy > undershoots (torn pages), and flush-after-copy is exactly right: it > covers every copied page and is durable. > > So I think the fix should keep the capture where it was (end of > perform_rewind, after the copy) and only switch insert LSN to flush LSN. > Missing WAL up to minRecoveryPoint is then handled like a base backup: > the target fetches it via streaming or restore_command, and patch 0001 > turns the unreachable case into a clean FATAL instead of silent > corruption. This also means the 012_remote_wal_race test, which checks > that the copied pg_wal alone covers minRecoveryPoint, encodes the > questionable assumption and would need to be reworked.
+1. I reported a related bug and wrote a patch doing "insert LSN -> flush LSN" in: https://commitfest.postgresql.org/patch/6679/ -- Regards, ChangAo Chen
