Re: Bug? pg_rewind produces unusable but starting database with standby recovery

cca5507 Wed, 10 Jun 2026 20:16:32 -0700

Hi,

> 2. Patch 0002 (remote source WAL race) I think goes the wrong way.
> 
> Moving the consistency LSN capture before collecting the file list
> breaks the rule that backup stop relies on: the consistency point must
> be established AFTER the data pages are read, never before. The source
> is a live server and pg_rewind reads pages from disk one by one with
> pg_read_binary_file(). A page modified on the source after we captured
> flush LSN, but before we copy it, lands on the target with an LSN
> greater than minRecoveryPoint. The target can then reach minRecoveryPoint
> and be declared consistent while holding pages from the future whose WAL
> was never replayed - a torn, cross-page inconsistent state.
> 
> The good half of the patch is insert LSN -> flush LSN. Note that
> pg_rewind reads pages from disk, and by the WAL-before-data rule any
> on-disk page has LSN <= flush LSN. So flush LSN is the tight upper bound
> on what we can copy: insert LSN overshoots (may point past any copied
> page, possibly into WAL still only in shared memory), flush-before-copy
> undershoots (torn pages), and flush-after-copy is exactly right: it
> covers every copied page and is durable.
> 
> So I think the fix should keep the capture where it was (end of
> perform_rewind, after the copy) and only switch insert LSN to flush LSN.
> Missing WAL up to minRecoveryPoint is then handled like a base backup:
> the target fetches it via streaming or restore_command, and patch 0001
> turns the unreachable case into a clean FATAL instead of silent
> corruption. This also means the 012_remote_wal_race test, which checks
> that the copied pg_wal alone covers minRecoveryPoint, encodes the
> questionable assumption and would need to be reworked.


+1. I reported a related bug and wrote a patch doing "insert LSN -> flush LSN" 
in:

https://commitfest.postgresql.org/patch/6679/

--
Regards,
ChangAo Chen

Re: Bug? pg_rewind produces unusable but starting database with standby recovery

Reply via email to