Re: [HACKERS] streaming replication breaks horribly if master crashes

Fujii Masao Wed, 16 Jun 2010 22:58:27 -0700

On Thu, Jun 17, 2010 at 5:26 AM, Robert Haas <[email protected]> wrote:
> On Wed, Jun 16, 2010 at 4:14 PM, Josh Berkus <[email protected]> wrote:
>>> The first problem I noticed is that the slave never seems to realize
>>> that the master has gone away.  Every time I crashed the master, I had
>>> to kill the wal receiver process on the slave to get it to reconnect;
>>> otherwise it just sat there waiting, either forever or at least for
>>> longer than I was willing to wait.
>>
>> Yes, I've noticed this.  That was the reason for forcing walreceiver to
>> shut down on a restart per prior discussion and patches.  This needs to
>> be on the open items list ... possibly it'll be fixed by Simon's
>> keepalive patch?  Or is it just a tcp_keeplalive issue?
>
> I think a TCP keepalive might be enough, but I have not tried to code
> or test it.


The "keepalive on libpq" patch would help.
https://commitfest.postgresql.org/action/patch_view?id=281

>>> and this just
>>> makes it more likely.  After the most recent crash, the master thought
>>> pg_current_xlog_location() was 1/86CD4000; the slave thought
>>> pg_last_xlog_receive_location() was 1/8733C000.  After reconnecting to
>>> the master, the slave then thought that
>>> pg_last_xlog_receive_location() was 1/87000000.
>>
>> So, *in this case*, detecting out-of-sequence xlogs (and PANICing) would
>> have actually prevented the slave from being corrupted.
>>
>> My question, though, is detecting out-of-sequence xlogs *enough*?  Are
>> there any crash conditions on the master which would cause the master to
>> reuse the same locations for different records, for example?  I don't
>> think so, but I'd like to be certain.
>
> The real problem here is that we're sending records to the slave which
> might cease to exist on the master if it unexpectedly reboots.  I
> believe that what we need to do is make sure that the master only
> sends WAL it has already fsync'd (Tom suggested on another thread that
> this might be necessary, and I think it's now clear that it is 100%
> necessary).

The attached patch changes walsender so that it always sends WAL up to
LogwrtResult.Flush instead of LogwrtResult.Write.

> But I'm not sure how this will play with fsync=off - if
> we never fsync, then we can't ever really send any WAL without risking
> this failure mode.  Similarly with synchronous_commit=off, I believe
> that the next checkpoint will still fsync WAL, but the lag might be
> long.

First of all, we should not restart the master after the crash in
fsync=off case. That would cause the corruption of the master database
itself.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

send_after_fsync_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] streaming replication breaks horribly if master crashes

Reply via email to