Re: [p2p-hackers] TCP Keepalive timeouts

David Barrett Thu, 26 Mar 2009 09:42:24 -0700

Agreed with Wesley's comments.  I tend to do:

- 60s keepalive messages over TCP connections, more to detect connection
failure than for any router/NAT reason.

- Xs keepalive for all UDP peer connections, to keep the NAT binding
alive (and to detect peer failures).

The trick, of course, is what's a good "X", and the answer is "there's no
single answer that works everywhere, or even a single answer that always
works for the same router".

I suggest setting up a set of servers you are certain have good
connectivity, and then having clients ping them, with the server delaying
each response by an exponentially increasing interval.  Keep doing this
until a response fails to arrive because it was sent too late (i.e., after
the NAT has dropped the binding), at which point you've bracketed the NAT
timeout.  Set "X" to the last delay that worked, and use that for all peers.
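
A minimal sketch of the probing loop described above, in Python.  The
names here are assumptions for illustration, not part of any real
library: probe(delay) stands for "ask a well-connected server to wait
`delay` seconds before replying, and report whether the reply got back
through the NAT", and simulated_probe() is a stand-in so the loop can be
exercised without a network.  The starting delay, growth factor and cap
are arbitrary defaults.

```python
from typing import Callable, Optional


def find_keepalive_interval(
    probe: Callable[[float], bool],
    start: float = 15.0,
    factor: float = 2.0,
    max_delay: float = 3600.0,
) -> Optional[float]:
    """Return the longest probed delay (seconds) whose reply still arrived.

    Returns None if even the first (shortest) delay failed.
    """
    last_ok = None
    delay = start
    while delay <= max_delay:
        if not probe(delay):
            break  # reply was lost: the NAT binding expired before `delay`
        last_ok = delay
        delay *= factor  # exponential growth of the server-side delay
    return last_ok


def simulated_probe(nat_timeout: float) -> Callable[[float], bool]:
    """Hypothetical stand-in for a real UDP probe.

    Pretends the NAT drops the binding after `nat_timeout` idle seconds.
    """
    return lambda delay: delay < nat_timeout


if __name__ == "__main__":
    x = find_keepalive_interval(simulated_probe(100.0))
    print("use keepalive interval X =", x)
```

Because the delay grows geometrically, the result only brackets the real
timeout (somewhere between the last delay that worked and the first that
failed); a smaller factor gives a tighter estimate at the cost of more
probes.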

However, X can change.  So rather than doing this once and being done
with it, use this approach to "walk up" to a keepalive interval that is
too long (a frequency that is "too low"), then walk back "down" to a
frequency that is "unnecessarily high", and then repeat.  I've found X
can change for a given router (or collection of routers), perhaps under
load.  Re-probing like this keeps the system on its toes.
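
One way the walk-up/walk-down cycle might be sketched in Python.  This
is an illustration under assumptions, not the exact scheme used in any
deployed system: the class name, the bounds `low` and `high`, and the
doubling/halving step are all hypothetical choices.  The caller probes
at `interval`, reports whether the reply arrived, and gets back the next
interval to try; `last_good` tracks the most recent interval that worked.

```python
class KeepaliveWalker:
    """Oscillate the probe interval between a floor and the point of failure."""

    def __init__(self, low: float = 15.0, high: float = 600.0,
                 factor: float = 2.0) -> None:
        self.low = low          # assumed "unnecessarily frequent" floor, seconds
        self.high = high        # assumed ceiling, seconds
        self.factor = factor
        self.interval = low     # interval to probe next
        self.last_good = low    # most recent interval that worked
        self.direction = +1     # +1 = walking up, -1 = walking down

    def record(self, ok: bool) -> float:
        """Record the outcome of a probe at `interval`; return the next one."""
        if ok:
            self.last_good = self.interval
        if not ok:
            self.direction = -1   # binding expired: interval too long, walk down
        elif self.interval <= self.low:
            self.direction = +1   # reached the floor: start walking up again
        elif self.interval >= self.high:
            self.direction = -1   # reached the ceiling: walk back down
        if self.direction > 0:
            nxt = self.interval * self.factor
        else:
            nxt = self.interval / self.factor
        self.interval = min(self.high, max(self.low, nxt))
        return self.interval


if __name__ == "__main__":
    walker = KeepaliveWalker()
    for _ in range(8):
        tried = walker.interval
        ok = tried < 100.0        # simulated NAT with a 100 s binding timeout
        walker.record(ok)
        print(tried, "ok" if ok else "lost")
```

A failure always turns the walk downward, so if the NAT's timeout
shrinks (for example under load), the next few probes move back toward
shorter intervals before climbing again.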

Does this make sense?

-david
Follow me at http://twitter.com/quinthar

Wesley Eddy wrote:
> On Thu, Mar 26, 2009 at 10:09:20AM +0000, Will Morton wrote:
>> On 26/03/2009, Richard Price <r.m.pr...@cs.bham.ac.uk> wrote:
>>>  But my question is, say I'm in a P2P network and I'm connected to
>>>  multiple peers. Is one keepalive message from a single peer, say every 2
>>>  minutes, enough to stop my router timing out all my connections? Or do
>>>  all peers each have to ensure that their individual connection does not
>>>  timeout by regularly sending keepalive messages? If so how regularly?
>>>
>> If you are going through a NAT device, you need to send or receive a
>> packet to/from each host every N seconds or else the NAT mapping for
>> that host will be dropped.  These devices don't have very much memory
>> so they don't wait very long before dumping idle connections.  I
>> believe most of them use a Least-Recently Used algorithm to keep track
>> of which details to dump when memory gets tight, so in a sense the
>> different connections are competing with each other.  Anecdotal
>> evidence suggests a safe value is about 30 seconds, but I have no hard
>> data to back that up.
>>
>> If you're not going through NAT, you don't need to send a keep-alive
>> to keep the connection up, but you'd want one anyway to detect dead
>> hosts, although you could safely use one a lot longer than 30s.  How
>> much longer would depend on how much memory you're happy to have
>> sitting around keeping track of dead hosts.  Trade-offs, always. :-)
>>
> 
> 
> Even without NAT, you may still need this for some stateful firewalls
> that tend to throw away their state too soon.
> 
> Whether or not you need this at all heavily depends upon the exact
> NAT(s) and firewall implementations you're going through as well as
> their configuration parameters, and they both vary widely in the wild.
> Some follow the BEHAVE guidelines, but that number is still far from
> 100%, so for at least several more years, you still have to code for the
> worst behaving cases. 
> 
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@lists.zooko.com
> http://lists.zooko.com/mailman/listinfo/p2p-hackers


