Re: The danger of deleting backup_label

Robert Haas Mon, 16 Oct 2023 09:26:23 -0700

On Mon, Oct 16, 2023 at 11:45 AM David Steele <[email protected]> wrote:
> Hmmm, the reason to back patch this is that it would fix [1], which sure
> looks like a problem to me even if it is not a "bug". We can certainly
> require backup software to retry pg_control until the checksum is valid
> but that seems like a pretty big ask, even considering how complicated
> backup is.

That seems like a problem with pg_control not being written atomically
when the standby server is updating it during recovery, rather than a
problem with backup_label not being used at the start of recovery.
Unless I'm confused.
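
For illustration only, the "retry until the checksum is valid" approach
quoted above could be sketched roughly as follows. This is a minimal
sketch under loud assumptions: the record layout (payload followed by a
4-byte little-endian CRC-32 trailer) and the names make_record,
checksum_ok and read_until_valid are hypothetical. The real pg_control
uses a CRC-32C over a version-specific struct, which this does not
reproduce.

```python
import struct
import time
import zlib


def make_record(payload: bytes) -> bytes:
    # Hypothetical layout: payload followed by a little-endian CRC-32 trailer.
    # (Not the pg_control format; it only stands in for "data plus checksum".)
    return payload + struct.pack("<I", zlib.crc32(payload))


def checksum_ok(raw: bytes) -> bool:
    # True if the trailer matches the CRC-32 of everything before it.
    if len(raw) < 4:
        return False
    payload, stored = raw[:-4], struct.unpack("<I", raw[-4:])[0]
    return zlib.crc32(payload) == stored


def read_until_valid(read_once, attempts=5, delay=0.0):
    # Call read_once() until it returns a copy whose checksum validates,
    # sleeping between attempts; give up after `attempts` tries.
    for _ in range(attempts):
        raw = read_once()
        if checksum_ok(raw):
            return raw[:-4]
        time.sleep(delay)
    raise RuntimeError("no consistent copy after %d attempts" % attempts)
```

A caller would pass a function that re-reads the file each time it is
called. Note that this only works around a torn read after the fact; it
does nothing about the file not being written atomically.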

> If you start from the last checkpoint (which is what will generally be
> stored in pg_control) then the effect is pretty similar.

If the backup didn't span a checkpoint, then restoring from the one in
pg_control actually works fine. Not that I'm encouraging that. But if
you replay WAL from the control file, you at least get the last
checkpoint's worth of WAL; if you use pg_resetwal, you get nothing.

I don't really want to get hung up on this though. My main point here
is that I have trouble believing that an error after you've already
screwed up your backup helps much. I think what we need is to make it
less likely that you will screw up your backup in the first place.

> Right now the user can remove backup_label and get a "successful"
> restore and not realize that they have just corrupted their cluster.
> This is independent of the backup/restore tool doing all the right things.

I don't think it's independent of that at all.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
