On Tue, Dec 16, 2025 at 04:25:37PM +0530, Nitin Jadhav wrote:
> it seems reasonable to align the checkpoint‑record‑missing case as well.
> The existing PANIC dates back to an era before online backups and archive
> recovery existed, when external manipulation of WAL was not expected and
> such conditions were treated as internal faults. With all such features, it
> is much more realistic for WAL segments to go missing due to operational
> issues, and such cases are often recoverable. So switching this to FATAL
> appears appropriate.
>
> Please share your thoughts.

FWIW, I think that we should lift the PANIC pattern in this case, at
least to be able to provide more tests around the manipulation of WAL
segments when triggering recovery, with or without a backup_label as
much as with or without a recovery/standby.signal defined in the tree.
The PANIC pattern to blow up the backend when missing a checkpoint
record at the beginning of recovery is a historical artifact of
4d14fe0048cf. The backend has evolved a lot since, particularly with
WAL archives that came much later than that.

Lowering that to a FATAL does not imply a loss of information, just the
lack of a backtrace that can be triggered depending on how one has set
up a cluster to start (say a recovery.signal was forgotten and pg_wal/
has no contents, etc.). And I doubt that a trace is really useful
anyway in this specific code path.

I'd love to hear the opinion of others on the matter, so if anybody has
comments, feel free.

I'd be curious to look at the number of tests related to recovery
startup you have in mind anyway, Nitin.
--
Michael
