Rick Jones wrote:
>> This is probably not something that happens in real-world deployments.
>> But it's not 60,000 concurrent connections, it's 60,000 within a
>> 2-minute span.
> Sounds like a case of Doctor! Doctor! It hurts when I do this.
I guess. In the cases where it matters, we use LDAP over Unix Domain
Sockets instead of TCP. Smarter clients that do connection pooling would
help too, but this came to our attention in the first place because not
all clients out there are that smart.
Since we have an alternative that works, I'm not really worried about
it. I just thought it was worthwhile to raise the question.
I'm not saying this is a high-priority problem; I only encountered it
in a test scenario where I was deliberately trying to max out the server.
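
For the record, switching to ldapi:// is trivial on the client side.
A minimal sketch using libldap; the socket path below is an assumption
(a common default) and must match the ldapi listener slapd was actually
started with:

/* Talk to slapd over a Unix domain socket (ldapi://) instead of TCP,
 * so no ephemeral ports or TIME_WAIT entries are consumed.
 * Build roughly as: cc ldapi_demo.c -o ldapi_demo -lldap -llber
 */
#define LDAP_DEPRECATED 1       /* for ldap_simple_bind_s() */
#include <stdio.h>
#include <ldap.h>

int main(void)
{
	LDAP *ld;
	int vers = LDAP_VERSION3;
	/* '/' in the socket path is %2F-escaped per LDAP URL rules;
	 * plain "ldapi:///" would use the library's compiled-in default.
	 */
	const char *uri = "ldapi://%2Fvar%2Frun%2Fslapd%2Fldapi";

	if (ldap_initialize(&ld, uri) != LDAP_SUCCESS) {
		fprintf(stderr, "ldap_initialize failed\n");
		return 1;
	}
	ldap_set_option(ld, LDAP_OPT_PROTOCOL_VERSION, &vers);

	/* anonymous bind, just to prove the connection works */
	if (ldap_simple_bind_s(ld, NULL, NULL) != LDAP_SUCCESS) {
		fprintf(stderr, "bind failed\n");
		ldap_unbind_ext_s(ld, NULL, NULL);
		return 1;
	}
	printf("connected over ldapi://, no TCP port consumed\n");
	ldap_unbind_ext_s(ld, NULL, NULL);
	return 0;
}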
>> Ideally the 2MSL parameter would be dynamically adjusted based on the
>> route to the destination and the weights associated with those routes.
>> In the simplest case, connections between machines on the same subnet
>> (i.e., no router hops involved) should have a much smaller default
>> value than connections that traverse any routers. I'd settle for a
>> two-level setting: with no router hops, use the small value; with any
>> router hops, use the large value.
> With transparent bridging, nobody knows how long the datagram may be
> out there. Admittedly, the chances of a datagram living for a full two
> minutes these days are probably nil, but just being in the same IP
> subnet doesn't really mean anything when it comes to physical locality.
Bridging isn't necessarily a problem though. The 2MSL timeout is
designed to prevent problems from delayed packets that got sent through
multiple paths. In a bridging setup you don't allow multiple active
paths; that's exactly what STP is designed to prevent. If you want to
configure a network that allows multiple paths, you need to use a
router, not a bridge.
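
Just to make the two-level idea concrete, here's a user-space sketch of
the selection heuristic. Illustration only: in the Linux kernel the
TIME_WAIT length is the fixed TCP_TIMEWAIT_LEN constant, not selected
per route, and the short on-link value below is made up:

/* If the peer is on the local subnet (no router hops), use a short
 * TIME_WAIT; otherwise use the full 2MSL.  Input validation elided.
 */
#include <stdio.h>
#include <arpa/inet.h>

#define TW_ONLINK_SECS    1	/* hypothetical on-link value */
#define TW_2MSL_SECS    120	/* the 2-minute figure above  */

static int choose_tw_secs(const char *local, const char *mask,
			  const char *peer)
{
	struct in_addr l, m, p;

	inet_pton(AF_INET, local, &l);
	inet_pton(AF_INET, mask, &m);
	inet_pton(AF_INET, peer, &p);
	/* same subnet <=> identical network prefix under the mask */
	if ((l.s_addr & m.s_addr) == (p.s_addr & m.s_addr))
		return TW_ONLINK_SECS;
	return TW_2MSL_SECS;
}

int main(void)
{
	printf("on-link peer: %d s\n", choose_tw_secs("192.168.1.10",
	       "255.255.255.0", "192.168.1.20"));
	printf("routed peer:  %d s\n", choose_tw_secs("192.168.1.10",
	       "255.255.255.0", "10.0.0.5"));
	return 0;
}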
> SPECweb benchmarking has had to deal with the issue of attempted
> TIME_WAIT reuse going back to 1997. It deals with it by not relying on
> the client's configured local/anonymous/ephemeral port number range,
> instead making explicit bind() calls across (more or less) the entire
> unprivileged port range (actually it may just be from 5000 to 65535,
> but still).
That still doesn't solve the problem; it only ~doubles the available
port range. That means it takes 0.6 seconds to trigger the problem
instead of only 0.3 seconds...
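
The explicit-bind trick looks roughly like this. A sketch only: the
loopback peer and port 80 are placeholders, error handling is trimmed,
and the 5000-65535 range is the one Rick mentions:

/* Bind each client socket to an explicit local port, cycling through
 * the whole unprivileged range ourselves instead of relying on the
 * kernel's smaller ephemeral range.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>

#define PORT_LO  5000
#define PORT_HI 65535

int main(void)
{
	unsigned int next = PORT_LO;
	struct sockaddr_in local, peer;
	int i;

	memset(&peer, 0, sizeof(peer));
	peer.sin_family = AF_INET;
	peer.sin_port = htons(80);			/* placeholder */
	inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr);

	for (i = 0; i < 10; i++) {	/* a few iterations for demo */
		int s = socket(AF_INET, SOCK_STREAM, 0);

		memset(&local, 0, sizeof(local));
		local.sin_family = AF_INET;
		local.sin_addr.s_addr = htonl(INADDR_ANY);
		local.sin_port = htons(next);
		if (++next > PORT_HI)
			next = PORT_LO;	/* wrap around the range */

		/* EADDRINUSE here once TIME_WAIT entries pile up */
		if (bind(s, (struct sockaddr *)&local, sizeof(local)) < 0) {
			perror("bind");
			close(s);
			continue;
		}
		if (connect(s, (struct sockaddr *)&peer, sizeof(peer)) < 0)
			perror("connect");
		close(s);	/* active close -> TIME_WAIT on client */
	}
	return 0;
}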
> Now, if it weren't necessary to fully randomize the ISNs, the chances
> of a successful transition from TIME_WAIT to ESTABLISHED might be
> greater, but going back to the good old days of more or less purely
> clock-driven ISNs isn't likely.
In an environment where connections are opened and closed very quickly
with only a small amount of data carried per connection, it might make
sense to remember the last sequence number used on a port and use that
as the floor of the next randomly generated ISN. Monotonically
increasing sequence numbers aren't a security risk if there's still a
randomly determined gap from one connection to the next. But I don't
think it's necessary to consider this at the moment.
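
A sketch of what that might look like. rand() and the 2^20 gap bound
are stand-ins (a real stack would use a keyed CSPRNG, a la RFC 1948),
and the per-port state is collapsed to a single counter:

/* Remember the last ISN used and pick the next one as last + random
 * gap.  Sequence numbers stay monotonic, so a new SYN can safely
 * supersede a TIME_WAIT entry, while the random gap keeps the next
 * ISN unpredictable.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

static uint32_t last_isn;	/* per-port in a real implementation */

static uint32_t next_isn(void)
{
	/* random gap in [1, 2^20]; 32-bit sequence space wraps */
	uint32_t gap = ((uint32_t)rand() & 0xFFFFF) + 1;

	last_isn += gap;
	return last_isn;
}

int main(void)
{
	int i;

	srand(12345);		/* fixed seed, demo only */
	for (i = 0; i < 5; i++)
		printf("ISN %d: %u\n", i, next_isn());
	return 0;
}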
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
Chief Architect, OpenLDAP http://www.openldap.org/project/