From: "Daniel Farina" <dan...@heroku.com>
On Fri, Jun 7, 2013 at 12:14 PM, Josh Berkus <j...@agliodbs.com> wrote:
Right now, what we're telling users is "You can have continuous backup
with Postgres, but you'd better hire and expensive consultant to set it
up for you, or use this external tool of dubious provenance which
there's no packages for, or you might accidentally cause your database
to shut down in the middle of the night."

At which point most sensible users say "no thanks, I'll use something else".

Inverted and just as well supported: "if you want to not accidentally
lose data, you better hire an expensive consultant to check your
systems for all sorts of default 'safety = off' features."  This
being but the hypothetical first one.

Furthermore, I see no reason why high quality external archiving
software cannot exist.  Maybe some even exists already, and no doubt
they can be improved and the contract with Postgres enriched to that
purpose.

Finally, it's not that hard to teach any archiver how to no-op at
user-peril, or perhaps Postgres can learn a way to do this expressly
to standardize the procedure a bit to ease publicly shared recipes, perhaps.

Yes, I feel designing reliable archiving, even for the simplest case - copy WAL to disk, is very difficult. I know there are following three problems if you just follow the PostgreSQL manual. Average users won't notice them. I guess even professional DBAs migrating from other DBMSs won't, either.

1. If the machine or postgres crashes while archive_command is copying a WAL file, later archive recovery fails. This is because cp leaves a file of less than 16MB in archive area, and postgres refuses to start when it finds such a small archive WAL file. The solution, which IIRC Tomas san told me here, is to do like "cp %p /archive/dir/%f.tmp && mv /archive/dir/%f.tmp /archive/dir/%f".

2. archive_command dumps core when you run pg_ctl stop -mi.
This is because postmaster sends SIGQUIT to all its descendants. The core files accumulate in the data directory, which will be backed up with the database. Of course those core files are garbage.
archive_command script needs to catch SIGQUIT and exit.

3. You cannot know the reason of archive_command failure (e.g. archive area full) if you don't use PostgreSQL's server logging.
This is because archive_command failure is not logged in syslog/eventlog.


I hope PostgreSQL will provide a reliable archiving facility that is ready to use.

Regards
MauMau







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to