On 4/16/24 23:50, Robert Haas wrote:
On Wed, Apr 10, 2024 at 9:36 PM David Steele <da...@pgmasters.net> wrote:
I've been playing around with the incremental backup feature trying to
get a sense of how it can be practically used. One of the first things I
always try is to delete random files and see what happens.

You can delete pretty much anything you want from the most recent
incremental backup (not the manifest) and it will not be detected.

Sure, but you can also delete anything you want from the most recent
non-incremental backup and it will also not be detected. There's no
reason at all to hold incremental backup to a higher standard than we
do in general.

Except that we are running pg_combinebackup on the incremental, which the user might reasonably expect to check backup integrity. It actually does a bunch of integrity checks -- but not this one.

Maybe the answer here is to update the docs to specify that
pg_verifybackup should be run on all backup directories before
pg_combinebackup is run. Right now that is not at all clear.

I don't want to make those kinds of prescriptive statements. If you
want to verify the backups that you use as input to pg_combinebackup,
you can use pg_verifybackup to do that, but it's not a requirement.
I'm not averse to having some kind of statement in the documentation
along the lines of "Note that pg_combinebackup does not attempt to
verify that the individual backups are intact; for that, use
pg_verifybackup."

I think we should do this at a minimum.

But I think it should be blindingly obvious to
everyone that you can't go whacking around the inputs to a program and
expect to get perfectly good output. I know it isn't blindingly
obvious to everyone, which is why I'm not averse to adding something
like what I just mentioned, and maybe it wouldn't be a bad idea to
document in a few other places that you shouldn't randomly remove
files from the data directory of your live cluster, either, because
people seem to keep doing it, but really, the expectation that you
can't just blow files away and expect good things to happen afterward
should hardly need to be stated.

And yet, we see it all the time.

I think it's very easy to go overboard with warnings of this type.
Weird stuff comes to me all the time because people call me when the
weird stuff happens, and I'm guessing that your experience is similar.
But my actual personal experience, as opposed to the cases reported to
me by others, practically never features files evaporating into the
ether.

Same -- if it happens at all it is very rare. Virtually every time I am able to track down the cause of missing files it is because the user deleted them, usually to "save space" or because they "did not seem important".

But given that users deleting files like this is pretty common in my experience, I think it is smart to guard against it, rather than just take it on faith that the user hasn't done anything destructive.

Especially given how pg_combinebackup works, backups are going to undergo a lot of user manipulation (pushing to and pulling from storage, decompressing, untarring, etc.), and I think that means we should take extra care.
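
To make the verify-first workflow discussed above concrete, here is a minimal Python sketch, not a definitive implementation. It assumes pg_verifybackup and pg_combinebackup are on PATH and that the backup directories are listed oldest (the full backup) first. The function name verify_then_combine and the injectable run parameter are hypothetical, used only for illustration; depending on the setup, pg_verifybackup may need extra options.

```python
import subprocess
from typing import Callable, Sequence


def verify_then_combine(
    backup_dirs: Sequence[str],
    output_dir: str,
    run: Callable[[list], int] = subprocess.call,
) -> bool:
    """Verify every input backup, then combine them only if all pass.

    backup_dirs must be ordered oldest (the full backup) to newest.
    Returns True only if every verification and the combine step succeed.
    """
    for backup_dir in backup_dirs:
        # pg_verifybackup checks the directory against its backup_manifest
        # and exits non-zero on problems such as a missing file.
        if run(["pg_verifybackup", backup_dir]) != 0:
            return False  # stop before combining a damaged chain
    # All inputs verified; hand the whole chain to pg_combinebackup.
    return run(["pg_combinebackup", "-o", output_dir, *backup_dirs]) == 0
```

Called as verify_then_combine(["/backups/full", "/backups/incr1"], "/restore/data") (hypothetical paths), this stops at the first backup that fails verification instead of producing a combined directory from damaged input.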

Regards,
-David

