Re: [HACKERS] Sync Rep Design

Heikki Linnakangas Fri, 31 Dec 2010 04:41:31 -0800

On 31.12.2010 13:48, Simon Riggs wrote:

On Fri, 2010-12-31 at 12:06 +0200, Heikki Linnakangas wrote:

Regarding the rest of the proposal, I would still prefer the UI
discussed here:

http://archives.postgresql.org/message-id/[email protected]

It ought to be the same amount of work to implement, and provides the
same feature set, but makes administration a bit easier by being able to
name the standbys. Also, I dislike the idea of having the standby
specify that it's a synchronous standby that the master has to wait for.
Behavior on the master should be configured on the master.


Good point; I've added the people on the copy list from that post. This
question is the key, so please respond after careful thought on my
points below.

There are ways to blend together the two approaches, discussed later,
though first we need to look at the reasons behind my proposals.

I see significant real-world issues with configuring replication using
multiple named servers, as described in the link above:

All of these points only apply to specifying *multiple* named servers in the synchronous_standbys='...' list. That's certainly a more complicated scenario, and the configuration is more complicated as a result. With your proposal, it's not possible in the first place.

Multiple synchronous standbys probably isn't needed by most people, so I'm fine with leaving that out for now, keeping the design the same otherwise. I included it in the proposal because it easily falls out of the design. So, if you're worried about the complexities of multiple synchronous standbys, let's keep the UI exactly the same as what I described in the link above, but only allow one name in the synchronous_standbys setting, instead of a list.

3. Administrative complexity just jumped a huge amount.

(a) If you add or remove servers to the config you need to respecify all
the parameters, which need to be specific to the exact set of servers.

Hmm, this could be alleviated by allowing the master to have a name too. All the configs could then be identical, except for the unique name for each server. For example, for a configuration with three servers that are all synchronous with each other, each server would have "synchronous_standbys='server1, server2, server3'" in the config file. The master would simply ignore the entry for itself.
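
To make the "ignore the entry for itself" idea concrete, here is a minimal sketch in Python (not PostgreSQL's C, and not an actual implementation). The function name and the own_name argument are hypothetical; the proposal only says that each server would have a unique name and that the list would be identical everywhere.

```python
from typing import List


def standbys_to_wait_for(synchronous_standbys: str, own_name: str) -> List[str]:
    """Return the servers this node must wait for when it acts as master.

    synchronous_standbys is the shared, comma-separated setting;
    own_name is this server's unique name (a hypothetical per-server setting).
    """
    # Split the shared list and discard empty entries and stray whitespace.
    names = [n.strip() for n in synchronous_standbys.split(",") if n.strip()]
    # The master simply ignores the entry that names itself.
    return [n for n in names if n != own_name]


# The same setting is deployed on all three servers.
shared_setting = "server1, server2, server3"
print(standbys_to_wait_for(shared_setting, "server1"))  # ['server2', 'server3']
```

With this rule, whichever server is promoted after a failover derives its own wait list from the unchanged shared setting, so the config file does not need to be regenerated.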

(b) After failover, the list of synchronous_standbys needs to be
re-specified, yet what is the correct list of servers? The only way to
make that config work is with complex middleware that automatically
generates new config files.

It depends on what you want. I think you're envisioning that the original server is taken out of the system and not waited for, meaning that you accept a lower level of persistence after failover. Yes, then you need to change the config. Or more likely you prepare the config file in the standby that way to begin with.

I don't think that is "the same amount of
work to implement", it's an order of magnitude harder overall.

I meant it's the same amount of work to implement the feature in PostgreSQL. No doubt that maintaining such a setup in production is more complicated.

5. Requesting sync from more than one server performs poorly, since you
must wait for additional servers. If there are sporadic or systemic
network performance issues you will be badly hit by them. Monitoring
that just got harder also. First-response-wins is more robust in the
case of volatile resources since it implies responsiveness to changing
conditions.

6. You just lost the ability to control performance on the master, with
a userset. Performance is a huge issue with sync rep. If you can't
control it, you'll simply turn it off. Having a feature that we daren't
ever use because it performs poorly helps nobody. This is not a tick-box
in our marketing checklist, I want it to be genuinely real-world usable.

You could make synchronous_standbys a user-settable GUC, just like your proposed boolean switch. You could then control on a per-transaction basis which servers you want to wait to respond. Although perhaps it would be more user-friendly to just have an additional boolean GUC, similar to synchronous_commit=on/off. Or maybe synchronous_commit is enough to control that.

I suppose we might regard the feature set I am proposing as being the
same as making synchronous_standbys a USERSET parameter, and allowing
just two options:
"none" - allowing the user to specify async if they wish it
"*" - allowing people to specify that syncing to *any* standby is
acceptable

We can blend the two approaches together, if we wish, by having two
parameters (plus server naming)
   synchronous_replication = on | off (USERSET)
   synchronous_standbys = '...'
If synchronous_standbys is not set and synchronous_replication = on then
we sync to any standby. If  synchronous_replication = off then we use
async replication, whatever synchronous_standbys is set to.
If synchronous_standbys is set, then we use sync rep to all listed
servers.


Sounds good.

I still don't like the synchronous_standbys='' and synchronous_replication=on combination, though. IMHO that still amounts to letting the standby control the behavior on master, and it makes it impossible to temporarily add an asynchronous standby to the mix. I could live with it, you wouldn't be forced to use it that way after all, but I would still prefer to throw an error on that combination. Or at least document the pitfalls and recommend always naming the standbys.
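
For reference, the blended rules and the stricter variant I would prefer can be written down as a small decision function. This is only a sketch in Python under stated assumptions: the function name, the strict flag and the "*" return marker are hypothetical, and only the two parameter names and the rules themselves come from the discussion above.

```python
from typing import List

ANY_STANDBY = "*"  # marker meaning "the first standby to respond is enough"


def commit_wait_targets(synchronous_replication: bool,
                        synchronous_standbys: str,
                        strict: bool = False) -> List[str]:
    """Decide which standbys a commit waits for under the blended proposal.

    Returns [] for asynchronous replication, ["*"] for "any standby",
    or the explicit list of named standbys.
    """
    names = [n.strip() for n in synchronous_standbys.split(",") if n.strip()]

    # synchronous_replication = off always means async,
    # whatever synchronous_standbys is set to.
    if not synchronous_replication:
        return []

    # Named standbys: sync rep to all listed servers.
    if names:
        return names

    # Empty list with sync rep on: the combination in dispute.
    if strict:
        # The preferred variant: reject it instead of letting
        # whichever standby connects decide the master's behavior.
        raise ValueError("synchronous_replication = on requires "
                         "synchronous_standbys to name at least one standby")
    return [ANY_STANDBY]


print(commit_wait_targets(True, "server2, server3"))
print(commit_wait_targets(True, ""))
print(commit_wait_targets(False, "server2"))
```

The strict branch is where the two positions differ: without it, any standby that connects becomes one the master may wait for, which is why a temporary asynchronous standby cannot be added in that mode.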

My proposal amounts to "let's add synchronous_standbys as a parameter in
9.2". If you really think that we need that functionality in this
release, let's get the basic stuff added now and then fold in those ideas
on top afterwards. If we do that, I will help. However, my only
insistence is that we explain the above points very clearly in the docs
to specifically dissuade people from using those features for typical
cases.

Huh, wait, if you leave out synchronous_standbys, that's a completely different UI again. I think we've finally reached agreement on how this should be configured, let's stick to that, please.

(I would be fine with limiting synchronous_standbys to just one server in this release though.)

If you wondered why I ignored your post previously, it's because I
understood that Fujii's post of 15 Oct, one week later, effectively
accepted my approach, albeit with two additional parameters. That is the
UI that I had been following.
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01009.php

That thread makes no mention of how to specify which standbys are synchronous and which are not. It's about specifying the timeout and whether to wait for a disconnected standby. Yeah, Fujii-san's proposal seems reasonable for configuring that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
