On 10/16/23 12:25, Robert Haas wrote:
On Mon, Oct 16, 2023 at 11:45 AM David Steele <da...@pgmasters.net> wrote:
Hmmm, the reason to back patch this is that it would fix [1], which sure
looks like a problem to me even if it is not a "bug". We can certainly
require backup software to retry pg_control until the checksum is valid
but that seems like a pretty big ask, even considering how complicated
backup is.

That seems like a problem with pg_control not being written atomically
when the standby server is updating it during recovery, rather than a
problem with backup_label not being used at the start of recovery.
Unless I'm confused.

You are not confused. But the fact that it practically can be read as torn at all on the standby is a function of how rapidly it is being written to update min recovery point. Writing it atomically via a temp file may affect performance in this area, but only during the backup, which may cause recovery to run more slowly during a backup.

I don't have proof of this because I don't have any hosts large enough to test the theory.

Right now the user can remove backup_label and get a "successful"
restore and not realize that they have just corrupted their cluster.
This is independent of the backup/restore tool doing all the right things.

I don't think it's independent of that at all.

I think it is. Imagine the user does backup/recovery using their favorite tool and everything is fine. But due to some misconfiguration or a problem in the WAL archive, they get this error:

FATAL:  could not locate required checkpoint record
2023-10-16 16:42:50.132 UTC HINT: If you are restoring from a backup, touch "/home/dev/test/data/recovery.signal" or "/home/dev/test/data/standby.signal" and add required recovery options. If you are not restoring from a backup, try removing the file "/home/dev/test/data/backup_label". Be careful: removing "/home/dev/test/data/backup_label" will result in a corrupt cluster if restoring from a backup.

I did this by setting restore_command=false, but it could just as easily be the actual restore command that returns false due to a variety of reasons. The user has no idea what "could not locate required checkpoint record" means and if there is enough automation they may not even realize they just restored from a backup.

After some agonizing (we hope) they decide to delete backup_label and, wow, it just works! So now they merrily go on their way with a corrupted cluster. They also remember for the next time that deleting backup_label is definitely a good procedure.

The idea behind this patch is that deleting backup_label would produce a hard error because pg_control would be missing as well (if the backup software did its job). If both pg_control and backup_label are present (but pg_control has not been loaded with the contents of backup_label, i.e. it is the running copy from the backup cluster) we can also error.

It's not perfect, because they could backup (or restore) pg_control but not backup_label, but we are narrowing the cases where something can go wrong and they have silent corruption, especially if their backup/restore software follows the directions.

Regards,
-David


Reply via email to