On Sat, Feb 18, 2017 at 4:52 AM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> I have my doubts about this actually addressing gitlab-like mistakes,
> though, because it's a helluva jump from "It's waiting and not doing
> anything," to "We need to remove the datadir." (One of the reasons being
> that non-empty directory is a local issue, and there's no reason why the
> tool should wait instead of just reporting an error.)

It's pretty clear that the gitlab postmortem involves multiple people
making multiple serious errors, including failing to test that the
ostensible backups could actually be restored. I was taught that rule
#1 as far as backups are concerned is to test that you can restore
them, so that seems like a big miss.

However, I don't think the fact they made other mistakes is a reason
not to improve the things we can improve and, certainly, having some
way for pg_basebackup to tell you that it's waiting for the master to
checkpoint will help the next person who is confused by that
particular thing. That person may go on to be confused by something
else, but then again maybe not. Improving the reporting in this case
stands on its own merits.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company