Re: [HACKERS] Synchronization levels in SR

Heikki Linnakangas Wed, 26 May 2010 09:56:08 -0700

On 26/05/10 18:31, Robert Haas wrote:

And frankly, I don't think it's possible for quorum commit to reduce
the number of parameters.  Even if we have that feature available, not
everyone will want to use it.  And the people who don't will
presumably need whatever parameters they would have needed if quorum
commit hadn't been available in the first place.


Agreed, quorum commit is not a panacea.

For example, suppose that you have two servers, master and a standby,
and you want transactions to be synchronously committed to both, so that
in the event of a meteor striking the master, you don't lose any
transactions that have been replied to the client as committed.

Now you want to set up a temporary replica of the master at a
development server, for testing purposes. If you set quorum to 2, your
development server becomes critical infrastructure, which is not what
you want. If you set quorum to 1, it also becomes critical
infrastructure, because it's possible that a transaction has been
replicated to the test server but not the real production standby, and a
meteor strikes.

Per-standby settings would let you express that, but not OTOH the quorum
behavior where you require N out of M to acknowledge the commit before
returning to client.
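
The development-replica problem can be made concrete with a small
sketch. This is only an illustration in Python, not PostgreSQL code: the
standby names and both helper functions are hypothetical, and "quorum"
here simply means counting acknowledgements.

```python
# Hypothetical standby names; nothing here is PostgreSQL configuration.
PROD_STANDBY = "prod_standby"
DEV_REPLICA = "dev_replica"


def quorum_satisfied(acked, quorum):
    """Quorum rule: any `quorum` standbys acknowledging is enough."""
    return len(acked) >= quorum


def per_standby_satisfied(acked, required):
    """Per-standby rule: every standby marked as required must acknowledge."""
    return required <= acked


# quorum = 2: with the dev replica down, the commit is never acknowledged,
# so the dev server has become critical infrastructure.
blocked_without_dev = not quorum_satisfied({PROD_STANDBY}, quorum=2)

# quorum = 1: an ack from the dev replica alone releases the commit, even
# though the production standby may not have the transaction yet.
dev_ack_alone_suffices = quorum_satisfied({DEV_REPLICA}, quorum=1)

# Per-standby: only the production standby counts, whatever the dev
# replica does.
required = {PROD_STANDBY}
prod_ack_suffices = per_standby_satisfied({PROD_STANDBY}, required)
dev_ack_ignored = not per_standby_satisfied({DEV_REPLICA}, required)

print(blocked_without_dev, dev_ack_alone_suffices,
      prod_ack_suffices, dev_ack_ignored)
```

All four flags come out true: both quorum values make the dev replica
matter, while the per-standby rule does not, at the price of losing the
N-out-of-M behavior.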

There's really no limit to how complex a setup can be. For example,
imagine that you have two data centers, with two servers in each. You
want to replicate the master to all four servers, but for commit to
return to the client, it's enough that the transaction has been
replicated to one server in each data center. How do you express that in
the config file? And it would be nice to have per-transaction control
too, like with synchronous_commit...
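
Leaving config-file syntax aside, the two-data-center rule itself is
easy to state as a predicate. The following Python sketch is only meant
to pin down what is being asked for; the server and data center names
and the function are made up for illustration, and it says nothing about
how such a rule would be written in a PostgreSQL config file.

```python
# Hypothetical topology: two data centers with two standbys each.
DATA_CENTERS = {
    "dc1": {"dc1_a", "dc1_b"},
    "dc2": {"dc2_a", "dc2_b"},
}


def commit_may_return(acked, data_centers):
    """True once at least one standby in every data center has acked."""
    return all(
        any(server in acked for server in members)
        for members in data_centers.values()
    )


# One ack from each data center is enough.
one_per_dc = commit_may_return({"dc1_a", "dc2_b"}, DATA_CENTERS)

# Two acks from the same data center are not, although a plain
# "quorum of 2 out of 4" would accept them.
same_dc_only = commit_may_return({"dc1_a", "dc1_b"}, DATA_CENTERS)

print(one_per_dc, same_dc_only)
```

The second case is the interesting one: a flat quorum count cannot tell
the two situations apart, which is why this kind of setup needs
something richer than a single number.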


So this is a tradeoff between
* flexibility, how complex a setup you can express?
* code complexity, how complicated is it to implement?
* user-friendliness, how easy is it to configure?

One way out of this is to implement something very simple in PostgreSQL,
and build external WAL proxying tools in pgfoundry that allow you to
cascade and disseminate the WAL in as complex scenarios as you want.

Your reply has again avoided the subject of how we would handle failure
modes with per-standby settings. That is important.


I don't think anyone is avoiding that, we just haven't discussed it.
The thing is, I don't think quorum commit actually does anything to
address that problem.  If I have a master and a standby configured for
sync rep and the standby goes down, we have to decide what impact that
has on the master.  If I have a master and two standbys configured for
sync rep with quorum commit such that I only need an ack from one of
them, and they both go down, we still have to decide what impact that
has on the master.  I agree we need to talk about, but I don't agree
that putting in quorum commit will remove the need to design that
case.

Right, failure modes need to be discussed, but how quorum commit or
whatnot is configured is irrelevant to that.

No-one has come up with a scheme on how to abort a transaction if you
don't get a reply from a synchronous standby (or all standbys or a
quorum of standbys). Until someone does, a commit on the master will
have to always succeed. The "synchronous" aspect will provide a
guarantee that if a standby is connected, any transaction in the master
will become visible (or fsync'd or just streamed to, depending on the
level) on the standby too before it's acknowledged as committed to the
client, nothing more, nothing less.
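
To spell out how weak that guarantee is, here is a minimal Python
sketch of the rule as described above. The function names and the
set-based model are assumptions made for illustration; this is not how
the master would actually track standbys.

```python
def standbys_to_wait_for(sync_standbys, connected):
    """Only synchronous standbys that are currently connected delay a commit."""
    return sync_standbys & connected


def commit_acknowledged(sync_standbys, connected, acked):
    """The client is answered once every connected sync standby has acked."""
    return standbys_to_wait_for(sync_standbys, connected) <= acked


sync_standbys = {"standby1"}  # hypothetical standby name

# Standby connected but not yet caught up: the client keeps waiting.
waiting = not commit_acknowledged(sync_standbys, {"standby1"}, set())

# Standby connected and acked: the client gets its answer.
acked_ok = commit_acknowledged(sync_standbys, {"standby1"}, {"standby1"})

# Standby disconnected: nothing to wait for, the commit succeeds anyway.
succeeds_when_down = commit_acknowledged(sync_standbys, set(), set())

print(waiting, acked_ok, succeeds_when_down)
```

The last case is the point: with the standby gone, the commit is
acknowledged with no replica holding it.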

One way to do that would be to refrain from flushing the commit record
to disk on the master until the standby has acknowledged it. The
downside is that the master is in a very severe state at that point:
until you flush the WAL, you can buffer only a small amount of WAL
traffic until you run out of wal_buffers, stalling all write activity in
the master, with backends waiting. You can't even shut down the server
cleanly. But if you value your transaction integrity much higher than
availability, maybe that's what you want.
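
The stall can be illustrated with a toy model. The Python class below is
a deliberately simplified assumption, not PostgreSQL's WAL code: it
treats wal_buffers as a fixed byte budget and lets the flush position
advance only as far as the standby has acknowledged.

```python
class WalBufferModel:
    """Toy model: unflushed WAL must fit in a fixed-size buffer."""

    def __init__(self, capacity):
        self.capacity = capacity  # stands in for wal_buffers, in bytes
        self.inserted = 0         # total WAL bytes generated so far
        self.flushed = 0          # total WAL bytes flushed to disk

    def try_insert(self, nbytes):
        """Return False when the backend would have to wait for a flush."""
        if self.inserted + nbytes - self.flushed > self.capacity:
            return False
        self.inserted += nbytes
        return True

    def flush_up_to(self, position):
        """Flush is allowed only up to the position the standby has acked."""
        self.flushed = max(self.flushed, min(position, self.inserted))


wal = WalBufferModel(capacity=64)

# The standby is silent, so nothing is flushed: four 16-byte records fit,
# the fifth write stalls.
accepted = [wal.try_insert(16) for _ in range(5)]

# The standby acknowledges the first 32 bytes; writes can resume.
wal.flush_up_to(32)
accepted_after_ack = wal.try_insert(16)

print(accepted, accepted_after_ack)
```

In the model, as in the scenario described above, every writer blocks as
soon as the buffer is full, and only an acknowledgement from the standby
frees space again.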

PS. I whole-heartedly agree with Simon's concern upthread that if we
allow a standby to specify in its config file that it wants to be a
synchronous standby, that's a bit dangerous because connecting such a
standby to the master will suddenly make all commits on the master a lot
slower. Adding a synchronous standby should require some action in the
master, since it affects the behavior on master.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
