On Mon, Dec 6, 2010 at 9:54 AM, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com> wrote:
>>> This occurred to me that the timeout would be required even for
>>> asynchronous streaming replication. So, how about implementing the
>>> replication timeout feature before synchronous replication itself?
>>
>> Here is the patch. This is one of features required for synchronous
>> replication, so I added this into current CF as a part of synchronous
>> replication.
>
> Hmm, that's actually a quite different timeout than what's required for
> synchronous replication. In synchronous replication, you need to get an
> acknowledgment within a timeout. This patch only puts a timeout on how long
> we wait to have enough room in the TCP send buffer. That doesn't seem all
> that useful.

Yeah. If we rely on the TCP send buffer filling up, then the amount of
time the master takes to notice a dead standby is going to be hard for
the user to predict. I think the standby ought to send some sort of
heartbeat and the master should declare the standby dead if it doesn't
see a heartbeat soon enough. Maybe the heartbeat could even include
the receive/fsync/replay LSNs, so that sync rep can use the same
machinery but with more aggressive policies about when they must be
sent.

I also can't help noticing that this approach requires drilling a hole
through the abstraction stack. We just invented latches; if the API
is going to have to change every time someone wants to implement a
feature, we've built ourselves an awfully porous abstraction layer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
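
To make the heartbeat idea above a bit more concrete, here is a rough sketch of the master-side bookkeeping. It is not taken from any patch and is not PostgreSQL code: the names Heartbeat, StandbyMonitor, timeout_s, is_dead and is_acknowledged are all made up for illustration, LSNs are modeled as plain integers, and time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Heartbeat:
    """Hypothetical heartbeat message sent by the standby."""
    receive_lsn: int  # last WAL position received
    fsync_lsn: int    # last WAL position flushed to disk
    replay_lsn: int   # last WAL position replayed


class StandbyMonitor:
    """Master-side state for one standby (illustrative only)."""

    def __init__(self, timeout_s: float, now: float) -> None:
        self.timeout_s = timeout_s
        # Assumption: the moment of connection counts as the first sign of life.
        self.last_seen = now
        self.last_heartbeat: Optional[Heartbeat] = None

    def on_heartbeat(self, hb: Heartbeat, now: float) -> None:
        """Record an incoming heartbeat and when it arrived."""
        self.last_seen = now
        self.last_heartbeat = hb

    def is_dead(self, now: float) -> bool:
        """Declare the standby dead if no heartbeat arrived within the timeout."""
        return now - self.last_seen > self.timeout_s

    def is_acknowledged(self, commit_lsn: int) -> bool:
        """Sync-rep style check: has the standby fsynced up to commit_lsn?"""
        hb = self.last_heartbeat
        return hb is not None and hb.fsync_lsn >= commit_lsn


# Example: a 30-second timeout, with one heartbeat arriving at t=10.
mon = StandbyMonitor(timeout_s=30.0, now=0.0)
mon.on_heartbeat(Heartbeat(receive_lsn=300, fsync_lsn=200, replay_lsn=100), now=10.0)
alive_at_35 = not mon.is_dead(now=35.0)   # 25 s since the last heartbeat
dead_at_41 = mon.is_dead(now=41.0)        # 31 s since the last heartbeat
```

The point of the sketch is that the time to detect a dead standby is bounded by timeout_s rather than by how long the TCP send buffer takes to fill, and that the same message could answer the sync rep question ("has this commit been fsynced on the standby?") if the LSNs ride along with it.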