On Jun 21, 2010, at 0:13, Greg Stark wrote:
>> Keepalive is therefore extremely unlikely to break things - in the very 
>> worst case, a (really, really stupid) firewall might decide to drop packets 
>> with zero bytes of payload, causing inactive connections to abort after a 
>> while. AFAIK walreceiver will simply reconnect in this case.
> 
> Stateful firewalls' whole raison d'etre is to block packets which
> aren't consistent with the current TCP state -- such as packets with a
> sequence number earlier than the last acked sequence number.
> Keepalives do in fact violate the basic TCP spec so they wouldn't be
> entirely crazy to block them. 

Keepalives play games with the spec, but I'd say they don't outright violate 
it. The sender bluffs by retransmitting data it *knows* has been ACK'ed. But 
since nobody else can prove with certainty that the sender actually saw that 
ACK (think NIC-internal buffer overflow), nobody is able to call that bluff. 

> Of course a firewall that blocked them
> would be pretty criminally stupid given how ubiquitous they are.


Very true, and another reason to stop worrying about possibly brain-dead 
firewalls.

>> Plus, the postmaster enables keepalive on all incoming connections
>> *already*, so any problems ought to have caused bug reports about
>> dropped client connections.
> 
> Really? Since when? I thought there was some discussion about this
> about a year ago and I made it very clear this had to be an optional
> feature which defaulted to off.

For about 10 years now. The setsockopt call is in StreamConnection() in 
src/backend/libpq/pqcomm.c.

Here's the corresponding commit:

commit 5aa160abba32a1f2d7818b9f49213f38c99b3fd8
Author: Tatsuo Ishii <is...@postgresql.org>
Date:   Sat May 20 13:10:54 2000 +0000

    Add KEEPALIVE option to the socket of backend. This will automatically
    terminate the backend that has no frontend anymore.
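
For illustration only, here is roughly what that amounts to at the socket level. This is a Python sketch, not the backend's C code; the helper name enable_keepalive is made up for the example. It only sets SO_KEEPALIVE and leaves the probe timing at the OS defaults.

```python
import socket


def enable_keepalive(sock: socket.socket) -> bool:
    """Turn on TCP keepalive for a socket and report whether it took effect.

    Hypothetical helper for illustration; it mirrors the idea of calling
    setsockopt(SO_KEEPALIVE) on each accepted connection.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Some platforms report a nonzero value other than 1, so test for != 0.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    # A server would call this on the socket returned by accept();
    # an unconnected socket is enough to show the option being set.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print("keepalive enabled:", enable_keepalive(s))
    finally:
        s.close()
```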

> Keepalives introduce spurious disconnections in working TCP
> connections that have transient outages which is basic TCP
> functionality that's supposed to work. There are cases where that's
> what you want but it isn't the kind of thing that should be on by
> default, let alone on unconditionally.

I'd buy that if all timeouts and retry counts defaulted to +infinity. But 
they don't, and hence sufficiently long network outages *will* cause connection 
aborts anyway. That a particular connection might survive due to inactivity 
proves nothing, since whether the connection is active or inactive during an 
outage is usually outside of anyone's control.
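
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes stock Linux settings (tcp_retries2 = 15, tcp_keepalive_time = 7200, tcp_keepalive_intvl = 75, tcp_keepalive_probes = 9) and a simplified backoff model in which the retransmission timeout starts at 200 ms, doubles on every retry and is capped at 120 s. Real stacks derive the timeout from the measured round-trip time, so treat the results as orders of magnitude only. The function names are made up for the example.

```python
def retransmit_give_up_seconds(retries: int = 15,
                               rto_min: float = 0.2,
                               rto_max: float = 120.0) -> float:
    """Approximate time until an *active* connection is aborted.

    Simplified model: the timeout starts at rto_min, doubles after each
    unanswered transmission and is capped at rto_max. The connection is
    dropped once the timeout after the last retry has also expired.
    """
    return sum(min(rto_min * 2 ** i, rto_max) for i in range(retries + 1))


def keepalive_detect_seconds(idle: int = 7200,
                             interval: int = 75,
                             probes: int = 9) -> int:
    """Time until an *idle* connection with keepalive is declared dead."""
    return idle + interval * probes


if __name__ == "__main__":
    print("active connection gives up after ~%.1f s"
          % retransmit_give_up_seconds())
    print("idle connection with keepalive gives up after %d s"
          % keepalive_detect_seconds())
```

Under these assumptions an active connection is given up on after about a quarter of an hour of outage, while an idle one with keepalive only after more than two hours. An idle connection without keepalive is never given up on.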

I really fail to see why anyone would prefer connections (and therefore 
transactions!) getting stuck forever over a few spurious disconnects. The 
former always require manual intervention and cause all sorts of performance 
and disk-space issues, while the latter won't even be an issue for well-written 
clients that just reconnect and retry.
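
What I mean by "reconnect and retry" is sketched below. Everything here is hypothetical: connect and work are placeholders the caller supplies, the connection object only needs a close() method, and retrying blindly is only safe if the failed unit of work was rolled back on the server or is idempotent.

```python
import time
from typing import Any, Callable


def run_with_reconnect(connect: Callable[[], Any],
                       work: Callable[[Any], Any],
                       attempts: int = 3,
                       delay: float = 1.0) -> Any:
    """Run work(conn) on a fresh connection, retrying on ConnectionError.

    Hypothetical helper for illustration. Each attempt opens a new
    connection, so a connection killed by a keepalive timeout simply
    leads to one more attempt.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            conn = connect()
            try:
                return work(conn)
            finally:
                conn.close()
        except ConnectionError as exc:
            last_error = exc
            if attempt + 1 < attempts:
                time.sleep(delay)  # brief pause before reconnecting
    raise last_error
```

A client would wrap each transaction in such a call, passing its own connect function and a work function that runs the transaction from the start.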

best regards,
Florian Pflug

