Rick Jones wrote:
This is probably not something that happens in real world deployments.

But it's not 60,000 concurrent connections, it's 60,000 within a 2-minute span.

Sounds like a case of "Doctor! Doctor! It hurts when I do this!"

I guess. In the cases where it matters, we use LDAP over Unix Domain Sockets instead of TCP. Smarter clients that do connection pooling would help too, but this only came to our attention in the first place because not all clients out there are that smart.

Since we have an alternative that works, I'm not really worried about it. I just thought it was worthwhile to raise the question.

I'm not saying this is a high-priority problem; I only encountered it in a test scenario where I was deliberately trying to max out the server.

Ideally the 2MSL parameter would be dynamically adjusted based on the route to the destination and the weights associated with those routes. In the simplest case, connections between machines on the same subnet (i.e., no router hops involved) should have a much smaller default value than connections that traverse any routers. I'd settle for a two-level setting: with no router hops, use the small value; with any router hops, use the large value.
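
To make the two-level idea concrete, here is a toy user-space sketch of that policy. None of these names are real kernel symbols; the route lookup is reduced to a single invented flag, and the 2-second on-link value is just a placeholder:

#include <stdbool.h>
#include <stdio.h>

#define TW_LEN_LOCAL     2   /* seconds: same subnet, no router hops */
#define TW_LEN_DEFAULT 120   /* seconds: the classic 2*MSL */

/* Hypothetical result of a route lookup for the peer. */
struct route_info {
        bool has_gateway;    /* true if any router hop is involved */
};

static int time_wait_len(const struct route_info *rt)
{
        /* Directly connected peer: an old segment can't be wandering
         * the wider Internet, so a much shorter TIME_WAIT is plausible. */
        return rt->has_gateway ? TW_LEN_DEFAULT : TW_LEN_LOCAL;
}

int main(void)
{
        struct route_info local  = { .has_gateway = false };
        struct route_info routed = { .has_gateway = true };

        printf("on-link peer: TIME_WAIT = %ds\n", time_wait_len(&local));
        printf("routed peer:  TIME_WAIT = %ds\n", time_wait_len(&routed));
        return 0;
}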

With transparent bridging, nobody knows how long a datagram may be out there. Admittedly, the chances of a datagram living for a full two minutes these days are probably nil, but being in the same IP subnet doesn't really tell you anything about physical locality.

Bridging isn't necessarily a problem, though. The 2MSL timeout is designed to prevent problems from delayed packets that got sent through multiple paths. In a bridged setup you don't allow multiple active paths; that's exactly what STP exists to prevent. If you want to build a network with multiple paths, you need a router, not a bridge.

SPECweb benchmarking has had to deal with attempted TIME_WAIT reuse going back to 1997. It copes by not relying on the client's configured local/anonymous/ephemeral port range, and instead making explicit bind() calls across (more or less) the entire unprivileged port range (it may actually just be 5000 to 65535, but still), roughly as sketched below.
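
For illustration, the bind()-walking trick looks roughly like this. This is a from-scratch sketch, not SPECweb source; the 5000-65535 range is the one mentioned above, and a real client would cap the retry loop rather than spin forever:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connect to dst, choosing our own source port instead of letting
 * the kernel pick one from its ephemeral range. *next_port holds the
 * next port to try and should start somewhere in 5000..65535. */
static int connect_with_explicit_port(const struct sockaddr_in *dst,
                                      unsigned short *next_port)
{
        for (;;) {
                int fd = socket(AF_INET, SOCK_STREAM, 0);
                if (fd < 0)
                        return -1;

                struct sockaddr_in src;
                memset(&src, 0, sizeof(src));
                src.sin_family      = AF_INET;
                src.sin_addr.s_addr = htonl(INADDR_ANY);
                src.sin_port        = htons(*next_port);

                /* Advance, wrapping around within 5000..65535. */
                *next_port = (*next_port >= 65535) ? 5000 : *next_port + 1;

                if (bind(fd, (struct sockaddr *)&src, sizeof(src)) == 0 &&
                    connect(fd, (const struct sockaddr *)dst,
                            sizeof(*dst)) == 0)
                        return fd;

                /* EADDRINUSE (port still in TIME_WAIT) or other
                 * failure: close and try the next port. */
                close(fd);
        }
}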

That still doesn't solve the problem; it only ~doubles the available port range. That means it takes 0.6 seconds to trigger the problem instead of only 0.3 seconds...

Now, if it weren't necessary to fully randomize the ISNs, the chances of a successful transition from TIME_WAIT to ESTABLISHED might be greater, but going back to the good old days of more or less purely clock-driven ISNs isn't likely.

In an environment where connections are opened and closed very quickly with only a small amount of data carried per connection, it might make sense to remember the last sequence number used on a port and use that as the floor of the next randomly generated ISN. Monotonically increasing sequence numbers aren't a security risk if there's still a randomly determined gap from one connection to the next. But I don't think it's necessary to consider this at the moment.
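
As a toy model of that scheme, the per-port state is reduced to a single variable, and rand() stands in for whatever cryptographically strong generator a real stack would use:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Next ISN for a port: the previous ISN is the floor, plus a random
 * gap, so the sequence is monotonic (mod 2^32) but the step from one
 * connection to the next stays unpredictable. */
static uint32_t next_isn(uint32_t *last_isn)
{
        /* Gap in [1, 65536]: hard to guess, but small enough not to
         * burn through the 32-bit sequence space. */
        uint32_t gap = (uint32_t)(rand() % 65536) + 1;

        *last_isn += gap;       /* wraps naturally modulo 2^32 */
        return *last_isn;
}

int main(void)
{
        srand((unsigned)time(NULL));

        uint32_t last = (uint32_t)rand();  /* first ISN: fully random */
        for (int i = 0; i < 5; i++)
                printf("connection %d: ISN = %u\n", i, next_isn(&last));
        return 0;
}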
--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  Chief Architect, OpenLDAP     http://www.openldap.org/project/
