Re: post-freeze damage control

David Steele Thu, 11 Apr 2024 20:11:30 -0700



On 4/12/24 12:15, Robert Haas wrote:

On Thu, Apr 11, 2024 at 5:48 PM David Steele <da...@pgmasters.net> wrote:

But they'll try because it is a new pg_basebackup feature and they'll
assume it is there to be used. Maybe it would be a good idea to make it
clear in the documentation that significant tooling will be required to
make it work.


I don't agree with that idea. LOTS of what we ship takes a significant
amount of effort to make it work. You may well need a connection
pooler. You may well need a failover manager which may or may not be
separate from your connection pooler. You need a backup tool. You need
a replication management tool which may or may not be separate from
your backup tool and may or may not be separate from your failover
tool. You probably need various out-of-core connections for the
programming languages you need. You may need a management tool, and
you probably need a monitoring tool. Some of the tools you might
choose to do all that stuff themselves have a whole bunch of complex
dependencies. It's a mess.

The difference here is you *can* use Postgres without a connection pooler (I have many times) or failover (if downtime is acceptable) but most people would agree that you really *need* backup.

The backup tool should be clear and easy to use or misery will inevitably result. pg_basebackup is difficult enough to use and automate because it has no notion of a repository, no expiration, and no WAL handling just to name a few things. Now there is an even more advanced feature that is even harder to use. So, no, I really don't think this feature is practically usable by the vast majority of end users.

Now, if someone were to say that we ought to talk about these issues
in our documentation and maybe give people some ideas about how to get
started, I would likely be in favor of that, modulo the small
political problem that various people would want their solution to be
the canonical one to which everyone gets referred. But I think it's
wrong to pretend like this feature is somehow special, that it's
somehow more raw or unfinished than tons of other things. I actually
think it's significantly *better* than a lot of other things. If we
add a disclaimer to the documentation saying "hey, this new
incremental backup feature is half-finished garbage!", and meanwhile
the documentation still says "hey, you can use cp as your
archive_command," then we have completely lost our minds.

Fair point on cp, but that just points to an overall lack in our documentation and built-in backup/recovery tools in general.

I also think that you're being more negative about this than the facts
justify. As I said to several colleagues today, I *fully* acknowledge
that you have a lot more practical experience in this area than I do,
and a bunch of good ideas. I was really pleased to see you talking
about how it would be good if these tools worked on tar files - and I
completely agree, and I hope that will happen, and I hope to help in
making that happen. I think there are a bunch of other problems too,
only some of which I can guess at. However, I think saying that this
feature is not realistically intended to be used by end-users or that
they will not be able to do so is over the top, and is actually kind
of insulting.

It is not meant to be insulting, but I still believe it to be true. After years of working with users on backup problems I think I have a pretty good bead on what the vast majority of admins are capable of and/or willing to do. Making this feature work is pretty high above that bar.

If the primary motivation is to provide a feature that can be integrated with third party tools, as Tomas suggests, then I guess usability is somewhat moot. But you are insisting that is not the case and I just don't see it that way.

There has been more enthusiasm for this feature on this
mailing list and elsewhere than I've gotten for anything I've
developed in years. And I don't think that's because all of the people
who have expressed enthusiasm are silly geese who don't understand how
terrible it is.

No doubt there is enthusiasm. It's a great feature to have. In particular I think the WAL summarizer is cool. But I do think the shortcomings are significant and that will become very apparent when people start to implement. The last minute effort to add COW support is an indication of problems that people will see in the field.

Further, I do think some less than ideal design decisions were made. In particular, I think sidelining manifests, i.e. making them optional, is not a good choice. This has led directly to the issue we see in [1]. If we require a manifest to make an incremental backup, why make it optional for combine?

This same design decision has led us to have "marker files" for zero-length files and unchanged files, which just seems extremely wasteful when these could be noted in the manifest. There are good reasons for writing everything out in a full backup, but for an incremental that can only be reconstructed using our tool the manifest should be sufficient.

Maybe all of this can be improved in a future release, along with tar reading, but none of those potential future improvements help me to believe that this is a user-friendly feature in this release.


Regards,
-David

---

[1] https://www.postgresql.org/message-id/flat/9badd24d-5bd9-4c35-ba85-4c38a2feb73e%40pgmasters.net
