On Mon, 2010-09-20 at 09:27 +0300, Heikki Linnakangas wrote:
> On 18/09/10 22:59, Robert Haas wrote:
> > On Sat, Sep 18, 2010 at 4:50 AM, Simon Riggs<si...@2ndquadrant.com>  wrote:
> >> Waiting might sound attractive. In practice, waiting will make all of
> >> your connections lock up and it will look to users as if their master
> >> has stopped working as well. (It has!). I can't imagine why anyone would
> >> ever want an option to select that; it's the opposite of high
> >> availability. Just sounds like a serious footgun.
> >
> > Nevertheless, it seems that some people do want exactly that behavior,
> > no matter how crazy it may seem to you.
> 
> Yeah, I agree with both of you. I have a hard time imagining a situation
> where you would actually want that. It's not high availability, it's 
> high durability. When a transaction is acknowledged as committed, you 
> know it's never ever going to disappear even if a meteor strikes the 
> current master server within the next 10 milliseconds. In practice, 
> people want high availability instead.
> 
> That said, the timeout option also feels a bit wishy-washy to me. With a 
> timeout, acknowledgment of a commit means "your transaction is safely 
> committed in the master and slave. Or not, if there was some glitch with 
> the slave". That doesn't seem like a very useful guarantee; if you're 
> happy with that why not just use async replication?
> 
> However, the "wait forever" behavior becomes useful if you have a 
> monitoring application outside the DB that decides when enough is enough 
> and tells the DB that the slave can be considered dead. So "wait 
> forever" actually means "wait until I tell you that you can give up". 
> The monitoring application can STONITH to ensure that the slave stays 
> down, before letting the master proceed with the commit.

Err... what is the difference between a timeout and STONITH? None. We
still proceed without the slave in both cases after the decision point. 

In all cases, we would clearly have a user-accessible function to stop
particular sessions, or all sessions, from waiting for the standby to
return.

You would have 3 choices:
* set automatic timeout
* set wait forever and then wait for manual resolution
* set wait forever and then trust to external clusterware

Many people have asked for timeouts and I agree it's probably the
easiest thing to do if you just have 1 standby.

> With that in mind, we have to make sure that a transaction that's 
> waiting for acknowledgment of the commit from a slave is woken up if the 
> configuration changes.

There's a misunderstanding here of what I've said, and it's a subtle one.

My patch supports a timeout of 0, i.e. wait forever. Which means I agree
that functionality is desired and should be included. This operates by
saying that if a currently-connected-standby goes down we will wait
until the timeout. So I agree all 3 choices should be available to
users.
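
As a rough sketch of how the three choices could share one wait
primitive (illustrative Python only, not code from the patch; the names
`CommitWaiter`, `standby_ack` and `release` are hypothetical), a timeout
of 0 waits forever, and a single release call stands in for both manual
resolution and external clusterware:

```python
import threading
import time


class CommitWaiter:
    """Hypothetical model of one session waiting for a standby ack."""

    def __init__(self):
        self._cond = threading.Condition()
        self._acked = False
        self._released = False

    def standby_ack(self):
        # Called when the standby confirms the commit.
        with self._cond:
            self._acked = True
            self._cond.notify_all()

    def release(self):
        # Called by an admin or by external clusterware to stop waiting.
        with self._cond:
            self._released = True
            self._cond.notify_all()

    def wait(self, timeout_s=0):
        # Assumption taken from the mail: timeout 0 means "wait forever".
        deadline = None if timeout_s == 0 else time.monotonic() + timeout_s
        with self._cond:
            while not (self._acked or self._released):
                remaining = None
                if deadline is not None:
                    remaining = deadline - time.monotonic()
                    if remaining <= 0:
                        return "timed_out"
                self._cond.wait(remaining)
            return "acked" if self._acked else "released"
```

In this model the automatic timeout is `wait(timeout_s=30)`, while the
two "wait forever" choices are both `wait(0)` and differ only in who
eventually calls `release()`.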

Discussion has been about what happens to ought-to-have-been-connected
standbys. Heikki had argued we need standby registration because if a
server *ought* to have been there, yet isn't currently there when we
wait for sync rep, we would still wait forever for it to return. To do
this you require standby registration.

But there is a hidden issue there: If you care about high availability
AND sync rep you have two standbys. If one goes down, the other is still
there. In general, if you want high availability on N servers then you
have N+1 standbys. If one goes down, the other standbys provide the
required level of durability and we do not wait.

So the only case where standby registration is required is where you
deliberately choose *not* to have N+1 redundancy and yet still
require all N standbys to acknowledge. That is a suicidal config and
nobody would sanely choose that. It's not a large or useful use case for
standby reg. (But it does raise the question again of whether we need
quorum commit).
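
The arithmetic behind the N+1 argument can be sketched as follows
(illustrative Python; `commit_must_wait` and its parameters are
hypothetical names, and the count-based model is an assumption made
only for this example):

```python
def commit_must_wait(total_standbys, required_acks, failed_standbys):
    """Return True if too few standbys remain to satisfy the ack requirement."""
    available = total_standbys - failed_standbys
    return available < required_acks


# N = 1 required ack with N + 1 = 2 standbys: one failure does not block.
print(commit_must_wait(total_standbys=2, required_acks=1, failed_standbys=1))  # False

# All N = 2 standbys must acknowledge: the same single failure blocks commits.
print(commit_must_wait(total_standbys=2, required_acks=2, failed_standbys=1))  # True
```

Only the second configuration, with no spare standby, ends up waiting
on a server that has gone away.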

My take is that if the above use case occurs, it is because one standby
has just gone down and the system is, for a hopefully short period, in
a degraded state, and the service should respond to that. So in my
proposal, if a standby is not there *now* we don't wait for it.
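
To make the contrast with standby registration concrete, here is a
small sketch (illustrative Python; the function names and the use of
plain sets are assumptions for this example, not anything in the
patch):

```python
def standbys_to_wait_for(connected_now, registered=None):
    """Pick the standbys whose acknowledgment a commit waits for.

    registered=None models the proposal: only standbys connected right
    now count. Passing a registered set models standby registration:
    every registered standby counts, whether it is connected or not.
    """
    if registered is None:
        return set(connected_now)
    return set(registered)


def waits_on_absent_standby(connected_now, registered=None):
    """True if the wait set includes a standby that is not connected."""
    wait_set = standbys_to_wait_for(connected_now, registered)
    return bool(wait_set - set(connected_now))
```

With `connected_now={"s1"}` and `registered={"s1", "s2"}` the
registration model waits on the absent `s2`, whereas without
registration the wait set is just `{"s1"}`.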

That cuts out a huge bag of code, specification and suchlike that
isn't required to support sane use cases, and that would only be more
stuff to get wrong and regret in later releases. It is the KISS
principle, just as we apply in all other cases.

If we did have standby registration, then I would implement it in a
table, not in an external config file. That way when we performed a
failover the data would be accessible on the new master. But I don't
suggest we have CREATE/ALTER STANDBY syntax. We already have
CREATE/ALTER SERVER if we wanted to do it in SQL. If we did that, ISTM
we should choose functions.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
