Albe Laurenz wrote:
> The documentation states in
> http://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL
> "The archive command should generally be designed to refuse to overwrite any
> pre-existing archive file."
> and suggests an archive_command like "test ! -f .../%f && cp %p .../%f".
>
> We ran into (small) problems with an archive_command similar to this,
> as follows:
>
> The server received a fast shutdown request while a WAL segment was being
> archived.
> The archiver stopped and left behind a half-written archive file.
Hmm, if I'm reading the code correctly, a fast shutdown request
shouldn't kill an ongoing archive command.
> Now when the server was restarted, the archiver tried to archive the same
> WAL segment again and got an error because the destination file already
> existed.
> That means that WAL archiving is stuck until somebody manually removes
> the partial archived file.
Yeah, that's a good point. Even if it turns out that the reason for your
partial write wasn't the fast shutdown request, the archive_command
could be interrupted for some other reason and leave a partially
written file behind.
> I suggest that the documentation be changed so that it does not
> recommend this setup. WAL segment names are unique anyway.
Well, the documentation states the reason to do that:
"This is an important safety feature to preserve the integrity of your archive
in case of administrator error (such as sending the output of two different
servers to the same archive directory)"
which seems like a reasonable concern too. Perhaps it should suggest
something like:

    test ! -f .../%f && cp %p .../%f.tmp && mv .../%f.tmp .../%f

i.e. copy under a different filename first, and rename the file into place
only after it's completely written, assuming that mv is atomic. A leftover
.tmp file from an interrupted run is then simply overwritten on the next
attempt, so archiving no longer gets stuck. It gets a bit complicated, though.
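
Since the one-liner gets unwieldy, the same idea could be moved into a small
script. The following is only a rough sketch, not a tested recommendation:
the function name archive_wal and its three arguments (source path,
archive directory, file name) are made up for illustration, and it assumes
the temporary file and the final file live on the same file system so that
mv is a plain rename.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL):
#   archive_wal SRC DEST_DIR NAME
# Copies SRC to DEST_DIR/NAME via a temporary file and a rename.
archive_wal() {
    src=$1
    dest_dir=$2
    name=$3

    # Refuse to overwrite a completed archive file; this keeps the
    # safety check the documentation asks for.
    if [ -f "$dest_dir/$name" ]; then
        return 1
    fi

    # Copy under a temporary name. A stale .tmp file left behind by an
    # interrupted run is overwritten here instead of blocking archiving.
    cp "$src" "$dest_dir/$name.tmp" || return 1

    # Rename into place. Assumption: same file system, so the final
    # name only ever appears with complete contents.
    mv "$dest_dir/$name.tmp" "$dest_dir/$name"
}
```

A script that ends by calling archive_wal "$1" "$2" "$3" could then be used
with something like archive_command = '/path/to/script %p /archive/dir %f',
where both paths are placeholders. Note that the sketch does not fsync the
file, and the existence check and the rename are separate steps, so two
servers archiving the same segment name at the same moment could still race.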
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com