On 11/20/23 12:11, Robert Haas wrote:
On Sun, Nov 19, 2023 at 8:16 PM Michael Paquier <mich...@paquier.xyz> wrote:
(I am not exactly sure how, but we've lost pgsql-hackers on the way
when you sent v5.  Now added back in CC with the two latest patches
you've proposed attached.)

Here is a short summary of what has been missed by the lists:
- I've commented that the patch should not create, not show up in
fields returned the SQL functions or stream control files with a size
of 512B, just stick to 8kB.  If this is worth changing this should be
applied consistently across the board including initdb, discussed on
its own thread.
- The backup-related fields in the control file are reset at the end
of recovery.  I've suggested to not do that to keep a trace of what
was happening during recovery.  The latest version of the patch resets
the fields.
- With the backup_label file gone, we lose some information in the
backups themselves, which is not good.  Instead, you have suggested an
approach where this data is added to the backup manifest, meaning that
no information would be lost, particularly useful for self-contained
backups.  The fields planned to be added to the backup manifest are:
-- The start and end time of the backup, the end timestamp being
useful to know when stop time can be used for PITR.
-- The backup label.
I've agreed that it may be the best thing to do at this end to not
lose any data related to the removal of the backup_label file.

I think we need more votes to make a change this big. I have a
concern, which I think I've expressed before, that we keep whacking
around the backup APIs, and that has a cost which is potentially
larger than the benefits.

From my perspective it's not that big a change for backup software but it does bring a lot of benefits, including fixing an outstanding bug in Postgres, i.e. reading pg_control without getting a torn copy.

The last time we changed the API, we changed
pg_stop_backup to pg_backup_stop, but this doesn't do that, and I
wonder if that's OK. Even if it is, do we really want to change this
API around again after such a short time?

This is a good point. We could just rename again, but not sure what names to go for this time. OTOH if the backup software is selecting fields then they will get an error because the names have changed. If the software is grabbing fields by position then they'll get a valid-looking result (even if querying by position is a terrible idea).

Another thing we could do is explicitly error if we see backup_label in PGDATA during recovery. That's just a few lines of code so would not be a big deal to maintain. This error would only be visible on restore, so it presumes that backup software is being tested.

Maybe just a rename to something like pg_backup_begin/end would be the way to go.

That said, I don't have an intrinsic problem with moving this
information from the backup_label to the backup_manifest file since it
is purely informational. I do think there should perhaps be some
additions to the test cases, though.

A little hard to add to the tests, I think, since they are purely informational, i.e. not pushed up by the parser. Maybe we could just grep for the fields?

I am concerned about the interaction of this proposal with incremental
backup. When you take an incremental backup, you get something that
looks a lot like a usable data directory but isn't. To prevent that
from causing avoidable disasters, the present version of the patch
adds INCREMENTAL FROM LSN and INCREMENTAL FROM TLI fields to the
backup_label. pg_combinebackup knows to look for those fields, and the
server knows that if they are present it should refuse to start. With
this change, though, I suppose those fields would end up in
pg_control. But that does not feel entirely great, because we have a
goal of keeping the amount of real data in pg_control below 512 bytes,
the traditional sector size, and this adds another 12 bytes of stuff
to that file that currently doesn't need to be there. I feel like
that's kind of a problem.

I think these fields would be handled the same as the rest of the fields in backup_label: returned from pg_backup_stop() and also stored in backup_manifest. Third-party software can do as they like with them and pg_combinebackup can just read from backup_manifest.

As for the pg_control file -- it might be best to give it a different name for backups that are not essentially copies of PGDATA. On the other hand, pgBackRest has included pg_control in incremental backups since day one and we've never had a user mistakenly do a manual restore of one and cause a problem (though manual restores are not the norm). Still, probably can't hurt to be a bit careful.

But my main point here is ... if we have a few more senior hackers
weigh in and vote in favor of this change, well then that's one thing.
But IMHO a discussion that's mostly between 2 people is not nearly a
strong enough consensus to justify this amount of disruption.

We absolutely need more people to look at this and sign off. I'm glad they have not so far because it has allowed time to whack the patch around and get it into better shape.

Regards,
-David


Reply via email to