On Sun, 8 Feb 2026 at 23:38, Andres Freund <[email protected]> wrote:
>
> Consider:
>
> 1) modify page w/ FPI
> 2) redo pointer determined at X
> 3) modify page w/o FPI, as the page hasn't yet been flushed at X+1
> 4) checkpointer flushes page
> 5) checkpoint completes, at X+2
> 6) page is dirtied, w/o FPI X+3, as X+1 > X
> 7) in the middle of writing out the page, we crash, the page is torn
>
> For recovery we will replay starting from position X. Then will replay the
> record from 3), which will be skipped due to the LSN. Then we will replay X+3,
> which either will be skipped due to the LSN condition (if the page header
> survived the torn page), leading to the changes to the "old portion" of the
> torn page not being replayed, or we will replay the WAL record, applying it to
> a torn page (or failing to read in the page due to checksum errors).
>
> If we only needed to think about buffers that stay in memory, we could "just"
> tackle this by remember that the page will need to be FPId during the next
> modification in the BufferDesc, but that doesn't help us if the page is
> evicted and reread...
>

Hmm, after thinking about this, I wonder if we can actually have a TAP
test for this sequence of events? It might be desirable to exercise
this rare recovery code path. But I'm unsure whether there is any
reliable way to make the OS keep a buffer in its page cache, but not
on disk, once the buffer has been evicted.

--
Best regards,
Kirill Reshke
