On 01/09/2014 10:55 PM, Josh Berkus wrote:
On 01/09/2014 12:05 PM, Heikki Linnakangas wrote:

Actually, why is the partially-filled 000000010000000000000002 file
archived in the first place? Looking at the code, it's been like that
forever, but it seems like a bad idea. If the original server is still
up and running, and writing more data to that file, what will happen is
that when the original server later tries to archive it, it will fail
because the partial version of the file is already in the archive. Or
worse, the partial version overwrites a previously archived more
complete version.

Oh!  This explains some transient errors I've seen.

Wouldn't it be better to not archive the old segment, and instead switch
to a new segment after writing the end-of-recovery checkpoint, so that
the segment on the new timeline is archived sooner?

It would be better to zero-fill and switch segments, yes.  We should
NEVER be in a position of archiving two different versions of the same
segment.

Ok, I think we're in agreement that that's the way to go for master.

Now, what to do about back-branches? On one hand, I'd like to apply the same fix to all stable branches, as the current behavior is silly and always has been. On the other hand, we haven't heard any complaints about it, so we probably shouldn't fix what ain't broken. Perhaps we should apply it to 9.3, as that's where we have the acute problem the OP reported. Thoughts?

In summary, I propose that we change master and REL9_3_STABLE to not archive the partial segment from previous timeline. Older branches will keep the current behavior.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to