On Mon, 2010-09-20 at 09:27 +0300, Heikki Linnakangas wrote:
> On 18/09/10 22:59, Robert Haas wrote:
> > On Sat, Sep 18, 2010 at 4:50 AM, Simon Riggs<si...@2ndquadrant.com> wrote:
> >> Waiting might sound attractive. In practice, waiting will make all of
> >> your connections lock up and it will look to users as if their master
> >> has stopped working as well. (It has!). I can't imagine why anyone would
> >> ever want an option to select that; it's the opposite of high
> >> availability. Just sounds like a serious footgun.
> >
> > Nevertheless, it seems that some people do want exactly that behavior,
> > no matter how crazy it may seem to you.
>
> Yeah, I agree with both of you. I have a hard time imagining a situation
> where you would actually want that. It's not high availability, it's
> high durability. When a transaction is acknowledged as committed, you
> know it's never ever going to disappear even if a meteor strikes the
> current master server within the next 10 milliseconds. In practice,
> people want high availability instead.
>
> That said, the timeout option also feels a bit wishy-washy to me. With a
> timeout, acknowledgment of a commit means "your transaction is safely
> committed in the master and slave. Or not, if there was some glitch with
> the slave". That doesn't seem like a very useful guarantee; if you're
> happy with that why not just use async replication?
>
> However, the "wait forever" behavior becomes useful if you have a
> monitoring application outside the DB that decides when enough is enough
> and tells the DB that the slave can be considered dead. So "wait
> forever" actually means "wait until I tell you that you can give up".
> The monitoring application can STONITH to ensure that the slave stays
> down, before letting the master proceed with the commit.

err... what is the difference between a timeout and stonith? None. We
still proceed without the slave in both cases after the decision point.

In all cases, we would clearly have a user-accessible function to stop
particular sessions, or all sessions, from waiting for a standby to
return.

You would have 3 choices:
* set automatic timeout
* set wait forever and then wait for manual resolution
* set wait forever and then trust to external clusterware

Many people have asked for timeouts and I agree it's probably the
easiest thing to do if you just have 1 standby.

> With that in mind, we have to make sure that a transaction that's
> waiting for acknowledgment of the commit from a slave is woken up if the
> configuration changes.

There's a misunderstanding here of what I've said and it's a subtle one.

My patch supports a timeout of 0, i.e. wait forever. Which means I agree
that functionality is desired and should be included. This operates by
saying that if a currently-connected-standby goes down we will wait
until the timeout. So I agree all 3 choices should be available to
users.

Discussion has been about what happens to ought-to-have-been-connected
standbys. Heikki had argued we need standby registration because if a
server *ought* to have been there, yet isn't currently there when we
wait for sync rep, we would still wait forever for it to return. To do
this you require standby registration.

But there is a hidden issue there: If you care about high availability
AND sync rep you have two standbys. If one goes down, the other is still
there. In general, if you want high availability on N servers then you
have N+1 standbys. If one goes down, the other standbys provide the
required level of durability and we do not wait.

So the only case where standby registration is required is where you
deliberately choose to *not* have N+1 redundancy and yet still require
all N standbys to acknowledge. That is a suicidal config and nobody
would sanely choose that.
It's not a large or useful use case for standby reg. (But it does raise
the question again of whether we need quorum commit).

My take is that if the above use case occurs it is because one standby
has just gone down and the standby is, for a hopefully short period, in
a degraded state and that the service responds to that. So in my
proposal, if a standby is not there *now* we don't wait for it.

Which cuts out a huge bag of code, specification and such like that
isn't required to support sane use cases. All of that is just more stuff
to get wrong and regret in later releases. The KISS principle, just like
we apply in all other cases.

If we did have standby registration, then I would implement it in a
table, not in an external config file. That way when we performed a
failover the data would be accessible on the new master. But I don't
suggest we have CREATE/ALTER STANDBY syntax. We already have
CREATE/ALTER SERVER if we wanted to do it in SQL. If we did that, ISTM
we should choose functions.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services
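
For illustration only, the waiting rule discussed in this message can be sketched in a few lines of Python. This is not code from the patch: the function name, its parameters and the single-acknowledgment rule are all assumptions made for the sketch. It only restates three points from the text: a timeout of 0 means wait forever, a user-accessible release (manual or from external clusterware) ends the wait, and if no standby is there now we do not wait at all.

```python
def should_keep_waiting(acks: int,
                        standbys_connected: int,
                        waited_s: float,
                        timeout_s: float,
                        released: bool) -> bool:
    """Hypothetical sketch: should a committing session keep waiting?

    acks               -- standbys that have acknowledged this commit
    standbys_connected -- standbys that were connected when the wait began
    waited_s           -- seconds this commit has waited so far
    timeout_s          -- 0 means "wait forever" (as in the text)
    released           -- set by an admin function or external clusterware
    """
    if acks >= 1:
        return False          # some standby has the commit; durable enough
    if standbys_connected == 0:
        return False          # no standby is there now, so we don't wait
    if released:
        return False          # manual resolution or clusterware said give up
    if timeout_s == 0:
        return True           # wait forever until acknowledged or released
    return waited_s < timeout_s   # automatic timeout
```

With `timeout_s=0` and `released=False` the session waits indefinitely for a connected standby; setting `released=True` covers both the manual-resolution and the external-clusterware choices, which differ only in who flips the flag.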