On 10/18/23 08:39, Robert Haas wrote:
On Tue, Oct 17, 2023 at 4:17 PM David Steele <da...@pgmasters.net> wrote:
Given that the above can't be back patched, I'm thinking we don't need
backup_label at all going forward. We just write the values we need for
recovery into pg_control and return *that* from pg_backup_stop() and
tell the user to store it with their backup. We already have "These
files are vital to the backup working and must be written byte for byte
without modification, which may require opening the file in binary
mode." in the documentation so dealing with pg_control should not be a
problem. pg_control also has a CRC so we will know if it gets munged.

Yeah, I was thinking about this kind of idea, too. I think it might be
a good idea, although I'm not completely certain about that, either.

<snip>

First, anything that is stored in the backup_label but not the control
file has to (a) move into the control file,

I'd rather avoid this.

(b) be stored someplace
else,

I don't think the additional fields *need* to be stored anywhere at all, at least not by us. We can provide them as output from pg_backup_stop() and the caller can do as they please. None of those fields are part of the restore process.

or (c) be eliminated as a concept.

I think we should keep them as above since I don't believe they hurt anything.

We're likely to get
complaints about (a), especially if the data in question is anything
big. Any proposal to do (b) risks undermining the whole theory under
which this is a good proposal, namely that removing backup_label gives
us one less thing to worry about. So that brings us to (c).
Personally, I would lose very little sleep if the LABEL field died and
never came back, and I wouldn't miss START TIME and STOP TIME either,
but somebody else might well feel differently. I don't think it's
trivial to get rid of BACKUP METHOD, as there unfortunately seems to
be code that depends on knowing the difference between BACKUP FROM:
streamed and BACKUP FROM: pg_rewind. I suspect that BACKUP FROM:
primary/standby might have the same issue, but I'm not sure.

BACKUP FROM: primary/standby we can definitely handle, and BACKUP METHOD: pg_rewind just means backupEndRequired is false. That will need to be written by pg_rewind (instead of writing backup_label).

STOP
TIMELINE could be a problem too. I think that if somebody could do
some rejiggering to eliminate some of the differences between the
cases here, that could be really good general cleanup irrespective of
what we decide about this proposal, and moving some things in to
pg_control is probably reasonable too. For instance, it would seem
crazy to me to argue that storing the backup end location in the
control file is OK, but storing the backup end TLI there would not be
OK.

We have a place in pg_control for the end TLI (minRecoveryPointTLI). That's only valid for backup from standby since a primary backup can never change timelines.

But the point here is that there's probably a good deal of careful
thinking that would need to be done here about exactly where all of
the stuff that currently exists in the backup_label file but not in
pg_control needs to end up.

Agreed.

Second, right now, the stuff that we return at the end of a backup is
all text data. With this proposal, it becomes binary data. I entirely
realize that people should only be doing these kinds of backups using
automated tools that that those automated tools should be perfectly
capable of handling binary data without garbling anything. But that's
about as realistic as supposing that people won't instantly remove the
backup_label file the moment it seems like it will solve some problem,
even when the directions clearly state that this should only be done
in some other situation that is not the one the user is facing.

Well, we do specify that backup_label and tablespace_map should be saved byte for byte. But We've already seen users mess this up in the past and add \r characters that made the files unreadable.

If it can be done wrong, it will be done wrong by somebody.

It
just occurred to me that one thing we could do to improve the user
experience here is offer some kind of command-line utility to assist
with taking a low-level backup. This could be done even if we don't
proceed with this proposal e.g.

pg_llbackup -d $CONNTR --backup-label=PATH --tablespace-map=PATH
--copy-data-directory=SHELLCOMMAND

I don't know for sure how much that would help, but I wonder if it
might actually help quite a bit, because right now people do things
like use psql in a shell script to try to juggle a database connection
and then in some other part of the shell script do the data copying.
But it is quite easy to screw up the error handling or the psql
session lifetime or something like that, and this would maybe provide
a nicer interface.

I think in most cases where this would be useful the user should just be using pg_basebackup. If the backup is trying to use snapshots, then backup_label needs to be stored outside the snapshot and we won't be able to easily help.

There might be other problems, too. This is just what occurs to me off
the top of my head. But I think it's an interesting angle to explore
further.

There may definitely be other problems and I'm pretty sure there will be. My feeling is that they will be surmountable, but I won't know for sure until I finish the patch.

But I also feel it's a good idea to explore further. I'll work on the patch and should have something to share soon.

Regards,
-David


Reply via email to