On 01/01/2011 06:29 PM, Simon Riggs wrote:
On Sat, 2011-01-01 at 18:13 +0100, Stefan Kaltenbrunner wrote:
On 01/01/2011 05:55 PM, Simon Riggs wrote:
It appears to me there has been substantial confusion over alternatives,
because of a misunderstanding about how synchronisation works. Requiring
confirmation that standbys are in sync is *not* the same thing as them
actually being in sync. Every single proposal made by anybody here on
hackers that supports multiple standby servers suffers from the same
issue: when the primary crashes you need to work out which standby
server is ahead.
aaah that was exactly what I was after - so the problem is that when you
have a sync standby it will technically always be "in front" of the
master (because it needs to fsync/apply/whatever before the master).
In the end the question boils down to what is "the bigger problem" in
the case of a lost master:
a) a transaction that was confirmed on the master but might not be on
any of the surviving sync standbys (or you will never know if it is) -
this is how I understand the proposal so far
No that cannot happen, the current situation is that we will fsync WAL
on the master, then fsync WAL on the standby, then reply to the master.
The standby is never ahead of the master, at any point.
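
A minimal sketch of the ordering described above, as a toy model rather than PostgreSQL code. `Node`, `flushed_lsn` and `sync_commit` are hypothetical names made up for illustration, and WAL positions are plain integers. The assertions check that the standby's flushed position never exceeds the master's at any step.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a server; flushed_lsn is its last fsynced WAL position."""
    name: str
    flushed_lsn: int = 0


def sync_commit(master: Node, standby: Node, lsn: int) -> list[str]:
    """Run one commit in the order described above and return the event log."""
    events = []

    master.flushed_lsn = lsn                 # master fsyncs WAL first
    events.append("master fsync")
    assert standby.flushed_lsn <= master.flushed_lsn

    standby.flushed_lsn = lsn                # then the standby fsyncs the same WAL
    events.append("standby fsync")
    assert standby.flushed_lsn <= master.flushed_lsn

    events.append("standby reply")           # standby replies to the master
    events.append("commit confirmed")        # only now is the commit confirmed
    return events


master = Node("master")
standby = Node("standby1")
log = sync_commit(master, standby, lsn=1)
```

In this model the standby can lag the master or match it, but it cannot get ahead, because its fsync always comes after the master's.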
hmm, maybe my "surviving" standbys was not clear (the case I'm wondering
about is whole-datacenter failures, which might take out more than just
the master) - consider three boxes, one master and two standbys, with
semisync replication (i.e. a reply from any one of the standbys is enough).
1. master fsyncs WAL
2. standby #1 fsyncs and replies
3. master confirms commit
4. disaster strikes and destroys master and standby #1 while standby #2
never had time to apply the change (IO/CPU load, latency, whatever)
5. now you have a sync standby that is missing something that was
committed on the master and confirmed to the client and no way to verify
that this thing happened (same problem with more than two standbys - as
long as you lose ONE standby and the master at the same time you will
never be sure)
what is it that I'm missing here?
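
The five steps above can be traced with a toy model. This is only a sketch of the scenario as described, not PostgreSQL behaviour: the node names, the transaction id `T1` and the function `simulate_semisync_loss` are all hypothetical, and "fsynced" is modelled as membership in a set.

```python
def simulate_semisync_loss() -> dict[str, set[str]]:
    """Replay the scenario and report what each surviving node is missing."""
    # flushed[name] holds the transaction ids whose WAL that node has fsynced
    flushed = {"master": set(), "standby1": set(), "standby2": set()}
    confirmed_to_client = set()
    txn = "T1"

    flushed["master"].add(txn)        # 1. master fsyncs WAL
    flushed["standby1"].add(txn)      # 2. standby #1 fsyncs and replies
    confirmed_to_client.add(txn)      # 3. one reply is enough, master confirms

    # 4. disaster destroys master and standby #1; standby #2 never got the change
    for lost in ("master", "standby1"):
        del flushed[lost]

    # 5. compare what the client was told against what the survivors hold
    return {name: confirmed_to_client - have for name, have in flushed.items()}


missing = simulate_semisync_loss()
```

Here `missing` maps the only survivor, `standby2`, to the confirmed transaction it never received. Note that the survivor itself has no copy of `confirmed_to_client`, so in a real outage it could not even compute this difference.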
b) a transaction that was not yet confirmed on the master but might have
been applied on the surviving standby before the disaster - this is what I
understand "confirm from all sync standbys" could result in.
Yes, that is described in the docs changes I published.
(a) was discussed, but ruled out, since it would require any crash/immediate
shutdown of the master to become a failover, or have some kind of weird
back channel to give the missing data back.
There hasn't been any difference of opinion in this area, that I am
aware of. All proposals have offered (b).
hmm I'm confused now - any chance you mixed up a & b here because in a)
no backchannel is needed because the standby could just fetch the
missing data from the master?
If that is the case I agree that it would be hard to get the replication
up again after a crash of the master with a standby that is ahead, but in
the end it would be a business decision (as in conflict resolution) on
what to do - take the "ahead" standby's data and use that, or destroy the
old standby and recreate it.
Stefan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)