Heikki Linnakangas wrote:
> > The documentation states
> >
> > "The archive command should generally be designed to refuse to overwrite
> > any pre-existing archive file."
> >
> > and suggests an archive_command like "test ! -f .../%f && cp %p .../%f".
> >
> > We ran into (small) problems with an archive_command similar to this
> > as follows:
> >
> > The server received a fast shutdown request while a WAL segment was being
> > archived.
> > The archiver stopped and left behind a half-written archive file.
>
> Hmm, if I'm reading the code correctly, a fast shutdown request
> shouldn't kill an ongoing archive command.

Maybe it died because of a signal 1, I don't know.
But it left behind a half-written file.

> > Now when the server was restarted, the archiver tried to archive the same
> > WAL segment again and got an error because the destination file already
> > existed.
> >
> > That means that WAL archiving is stuck until somebody manually removes
> > the partial archived file.
>
> Yeah, that's a good point. Even if it turns out that the reason for your
> partial write wasn't the fast shutdown request, the archive_command
> could be interrupted for some other reason and leave behind a partially
> written file behind.
>
> > I suggest that the documentation be changed so that it does not
> > recommend this setup. WAL segment names are unique anyway.
>
> Well, the documentation states the reason to do that:
>
> > This is an important safety feature to preserve the
> > integrity of your archive in case of administrator error
> > (such as sending the output of two different servers to the
> > same archive directory)
>
> which seems like a reasonable concern too.

Of course, that's why I did that at first.

But isn't it true that the vast majority of people have only one
PostgreSQL cluster per machine, and that it is highly unlikely that
somebody else creates a file with the same name as a WAL segment
in the archive directory?

> Perhaps it should suggest
> something like:
>
> test ! -f .../%f && cp %p .../%f.tmp && mv .../%f.tmp .../%f
>
> ie. copy under a different filename first, and rename the file in place
> after it's completely written, assuming that mv is atomic. It gets a bit
> complicated, though.

That's a good idea (although it could lead to race conditions in
the extremely rare case that two clusters want to archive equally
named files at the same time).

I'll write a patch for that and send it as a basis for discussion.
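
For illustration, the copy-then-rename idea could be sketched as a small
shell function instead of a one-liner. This is only a sketch under stated
assumptions: the name archive_wal and its three arguments are hypothetical
and not part of PostgreSQL, and it assumes the temporary file and the
final file live in the same directory, so that mv is a plain rename on
one file system.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Usage: archive_wal <path-to-segment> <archive-dir> <file-name>
archive_wal() {
    src=$1
    dir=$2
    name=$3

    # Refuse to overwrite an archive file that was already completed.
    if [ -f "$dir/$name" ]; then
        return 1
    fi

    # Copy under a temporary name. A leftover .tmp file from an
    # interrupted earlier attempt is simply overwritten on retry.
    cp "$src" "$dir/$name.tmp" || return 1

    # Rename into place only after the copy has finished.
    mv "$dir/$name.tmp" "$dir/$name"
}
```

A script built around such a function would receive %p and %f from
archive_command as its first and last arguments. A half-written .tmp file
no longer blocks archiving, but the window between the existence test and
the mv is still there, so two clusters archiving equally named files at
the same moment could still collide.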

Yours,
Laurenz Albe