On 31.12.2010 13:48, Simon Riggs wrote:
On Fri, 2010-12-31 at 12:06 +0200, Heikki Linnakangas wrote:

Regarding the rest of the proposal, I would still prefer the UI
discussed here:

http://archives.postgresql.org/message-id/4cae030a.2060...@enterprisedb.com

It ought to be the same amount of work to implement, and provides the
same feature set, but makes administration a bit easier by being able to
name the standbys. Also, I dislike the idea of having the standby
specify that it's a synchronous standby that the master has to wait for.
Behavior on the master should be configured on the master.

Good point; I've added the people on the copy list from that post. This
question is the key, so please respond after careful thought on my
points below.

There are ways to blend together the two approaches, discussed later,
though first we need to look at the reasons behind my proposals.

I see significant real-world issues with configuring replication using
multiple named servers, as described in the link above:

All of these points only apply to specifying *multiple* named servers in the synchronous_standbys='...' list. That's certainly a more complicated scenario, and the configuration is more complicated as a result. With your proposal, it's not possible in the first place.

Multiple synchronous standbys probably aren't needed by most people, so I'm fine with leaving that out for now, keeping the design the same otherwise. I included it in the proposal because it easily falls out of the design. So, if you're worried about the complexities of multiple synchronous standbys, let's keep the UI exactly the same as what I described in the link above, but only allow one name in the synchronous_standbys setting, instead of a list.

3. Administrative complexity just jumped a huge amount.

(a) If you add or remove servers to the config you need to respecify all
the parameters, which need to be specific to the exact set of servers.

Hmm, this could be alleviated by allowing the master to have a name too. All the configs could then be identical, except for the unique name for each server. For example, for a configuration with three servers that are all synchronous with each other, each server would have "synchronous_standbys='server1, server2, server3'" in the config file. The master would simply ignore the entry for itself.
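
As a rough illustration of the "master ignores the entry for itself" idea, here is a minimal Python sketch. It is not PostgreSQL code: the function name standbys_to_wait_for and the notion of a per-server own_name are hypothetical, used only to show how one shared setting could be interpreted differently on each server.

```python
def standbys_to_wait_for(own_name, synchronous_standbys):
    """Return the standby names this server would wait for when it is master.

    own_name and this function are hypothetical; they only model the
    suggestion that every server carries the same list and skips itself.
    """
    # Split the comma-separated setting and drop surrounding whitespace.
    names = [n.strip() for n in synchronous_standbys.split(",") if n.strip()]
    # The server acting as master ignores its own entry.
    return [n for n in names if n != own_name]


# The same setting string is deployed on all three servers.
shared_setting = "server1, server2, server3"
for server in ("server1", "server2", "server3"):
    print(server, "waits for", standbys_to_wait_for(server, shared_setting))
```

With this interpretation, only the per-server name differs between config files, and whichever server is promoted derives its own wait list from the shared setting.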

(b) After failover, the list of synchronous_standbys needs to be
re-specified, yet what is the correct list of servers? The only way to
make that config work is with complex middleware that automatically
generates new config files.

It depends on what you want. I think you're envisioning that the original server is taken out of the system and not waited for, meaning that you accept a lower level of persistence after failover. Yes, then you need to change the config. Or more likely you prepare the config file in the standby that way to begin with.

I don't think that is "the same amount of
work to implement"; it's an order of magnitude harder overall.

I meant it's the same amount of work to implement the feature in PostgreSQL. No doubt that maintaining such a setup in production is more complicated.

5. Requesting sync from more than one server performs poorly, since you
must wait for additional servers. If there are sporadic or systemic
network performance issues you will be badly hit by them. Monitoring
also just got harder. First-response-wins is more robust in the
case of volatile resources since it implies responsiveness to changing
conditions.

6. You just lost the ability to control performance on the master, with
a userset. Performance is a huge issue with sync rep. If you can't
control it, you'll simply turn it off. Having a feature that we daren't
ever use because it performs poorly helps nobody. This is not a tick-box
in our marketing checklist, I want it to be genuinely real-world usable.

You could make synchronous_standbys a user-settable GUC, just like your proposed boolean switch. You could then control on a per-transaction basis which servers you want to wait to respond. Although perhaps it would be more user-friendly to just have an additional boolean GUC, similar to synchronous_commit=on/off. Or maybe synchronous_commit is enough to control that.
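
To make the trade-off in point 5 concrete, here is a small Python sketch comparing how long a commit would wait under "first response wins" versus "wait for all listed standbys". The function name commit_wait_ms and the latency figures are made up purely for illustration; they are not measurements and not PostgreSQL behavior.

```python
def commit_wait_ms(ack_latencies_ms, policy):
    """Return how long a commit waits, given per-standby ack latencies.

    policy "first": the commit returns when the fastest standby acknowledges.
    policy "all":   the commit returns only when every standby has acknowledged.
    """
    if not ack_latencies_ms:
        raise ValueError("at least one standby latency is required")
    if policy == "first":
        return min(ack_latencies_ms)
    if policy == "all":
        return max(ack_latencies_ms)
    raise ValueError("unknown policy: " + policy)


# Hypothetical latencies: two nearby standbys and one on a slow link.
latencies = [2.0, 3.5, 120.0]
print("first response wins:", commit_wait_ms(latencies, "first"), "ms")
print("wait for all:", commit_wait_ms(latencies, "all"), "ms")
```

Under "all", the slowest standby sets the commit latency, so a single degraded link affects every synchronous commit; under "first", it only matters if every standby is slow.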

I suppose we might regard the feature set I am proposing as being the
same as making synchronous_standbys a USERSET parameter, and allowing
just two options:
"none" - allowing the user to specify async if they wish it
"*" - allowing people to specify that syncing to *any* standby is
acceptable

We can blend the two approaches together, if we wish, by having two
parameters (plus server naming)
   synchronous_replication = on | off (USERSET)
   synchronous_standbys = '...'
If synchronous_standbys is not set and synchronous_replication = on then
we sync to any standby. If synchronous_replication = off then we use
async replication, whatever synchronous_standbys is set to.
If synchronous_standbys is set, then we use sync rep to all listed
servers.

Sounds good.

I still don't like the synchronous_standbys='' and synchronous_replication=on combination, though. IMHO that still amounts to letting the standby control the behavior on master, and it makes it impossible to temporarily add an asynchronous standby to the mix. I could live with it; you wouldn't be forced to use it that way, after all. But I would still prefer to throw an error on that combination, or at least document the pitfalls and recommend always naming the standbys.
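
The blended rule quoted above can be sketched as a small decision function. This is only a Python model of the three cases as described in this thread, not an implementation; the function name commit_wait_policy and the returned mode labels are hypothetical.

```python
def commit_wait_policy(synchronous_replication, synchronous_standbys):
    """Model the blended rule for the two proposed parameters.

    Returns a (mode, names) tuple:
      ("async", [])    - replication is asynchronous, nothing is waited for
      ("any", [])      - wait for any standby (no names configured)
      ("all", [names]) - wait for every listed standby
    """
    # synchronous_replication = off wins, whatever synchronous_standbys says.
    if not synchronous_replication:
        return ("async", [])
    names = [n.strip() for n in synchronous_standbys.split(",") if n.strip()]
    # Unset list plus synchronous_replication = on: sync to any standby.
    # This is the combination objected to above; a stricter variant
    # could raise an error here instead.
    if not names:
        return ("any", [])
    return ("all", names)


print(commit_wait_policy(False, "server1, server2"))
print(commit_wait_policy(True, ""))
print(commit_wait_policy(True, "server1, server2"))
```

The "any" branch is where a standby effectively decides whether the master waits for it, simply by connecting.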

My proposal amounts to "let's add synchronous_standbys as a parameter in
9.2". If you really think that we need that functionality in this
release, let's get the basic stuff added now and then fold in those ideas
on top afterwards. If we do that, I will help. However, my only
insistence is that we explain the above points very clearly in the docs
to specifically dissuade people from using those features for typical
cases.

Huh, wait, if you leave out synchronous_standbys, that's a completely different UI again. I think we've finally reached agreement on how this should be configured, let's stick to that, please.

(I would be fine with limiting synchronous_standbys to just one server in this release though.)

If you wondered why I ignored your post previously, it's because I
understood that Fujii's post of 15 Oct, one week later, effectively
accepted my approach, albeit with two additional parameters. That is the
UI that I had been following.
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01009.php

That thread makes no mention of how to specify which standbys are synchronous and which are not. It's about specifying the timeout and whether to wait for a disconnected standby. Yeah, Fujii-san's proposal seems reasonable for configuring that.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
