André Warnier wrote:
TVFoodMaps wrote:
Hi,
My website is set up using Apache 2, mod_jk 1.37 and Tomcat 6. Most of
the connector settings are at their defaults and things normally run
pretty well. However, during certain spikes in traffic my server seems
to hang, apparently because of "leaking connections". What I see is
about 250 rows like this from a netstat:
tcp 1 0 ::ffff:127.0.0.1:8009 ::ffff:127.0.0.1:48387
CLOSE_WAIT
tcp 1 0 ::ffff:127.0.0.1:8009 ::ffff:127.0.0.1:48309
CLOSE_WAIT
tcp 689 0 ::ffff:127.0.0.1:8009 ::ffff:127.0.0.1:48423
CLOSE_WAIT
tcp 686 0 ::ffff:127.0.0.1:8009 ::ffff:127.0.0.1:48413
CLOSE_WAIT
I am also seeing a lot of these:
tcp 1 0 ::ffff:127.0.0.1:49261 ::ffff:127.0.0.1:8080
CLOSE_WAIT
tcp 1 0 ::ffff:127.0.0.1:52836 ::ffff:127.0.0.1:8080
CLOSE_WAIT
tcp 0 0 ::ffff:127.0.0.1:58262 ::ffff:127.0.0.1:8080
TIME_WAIT
(Note: the application makes direct calls to port 8080 for a specific
API I'm using (SOLR).)
I'm really not sure which is actually causing the problem, but I was
hoping for some guidance on which settings I should look into tweaking.
Hi.
1) here is one (among many) explanation of the CLOSE_WAIT state.
http://blogs.technet.com/b/janelewis/archive/2010/03/09/explaining-close-wait.aspx
(and many more if you search google for "tcp close_wait")
Basically, it is a normal state through which any TCP connection passes
at some point.
It is only pathological when you have many of them persisting for a
long time.
2) assuming yours are pathological and persist a long time :
2.a) the ones like this :
> tcp 1 0 ::ffff:127.0.0.1:8009 ::ffff:127.0.0.1:48387
> CLOSE_WAIT
involve port 8009, which is the AJP port of Tomcat, in this case the
"server" side. The other side is mod_jk within Apache httpd, in this
case the client side (because it is mod_jk which first establishes the
connection to Tomcat, so mod_jk is the client here).
If one of these persists for a long time, it means that the client
(mod_jk) does not entirely close its connection to Tomcat.
Why that could be, another person here would have to explain.
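If these do turn out to be pathological, the usual suspects are the
mod_jk connection-pool settings and the matching idle timeout on the
Tomcat AJP connector. A sketch of the kind of settings to look at (the
worker name and the values here are illustrative, not recommendations;
check the mod_jk workers.properties reference for your version):

```
# workers.properties (mod_jk side, illustrative worker name "tomcat")
worker.tomcat.type=ajp13
worker.tomcat.host=127.0.0.1
worker.tomcat.port=8009
# close idle pooled connections after this many seconds
worker.tomcat.connection_pool_timeout=600
```

```
<!-- server.xml (Tomcat side): keep in sync, note this is milliseconds -->
<Connector port="8009" protocol="AJP/1.3"
           connectionTimeout="600000" />
```

The point is that connection_pool_timeout (in seconds) and the AJP
connector's connectionTimeout (in milliseconds) should agree, so that
both sides drop an idle connection at the same time instead of one side
leaving it half-closed.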
2.b) the other ones like
> tcp 1 0 ::ffff:127.0.0.1:49261 ::ffff:127.0.0.1:8080
> CLOSE_WAIT
relate to your application (as a client), which does not entirely
close() its connections to port 8080.
In my own experience - and assuming that your application is a java
application - this can happen for example as follows :
- the explicit connection to port 8080 is made from within some object,
as part of the creation of that object
- then when the application doesn't need the "connection object"
anymore, it discards it.
- the object is left on the heap, waiting to be garbage-collected
- when it is (eventually) garbage-collected, it will really be
destroyed, and any lingering "socket" within it will be closed (and the
line above will disappear from your netstat output)
but..
while it sits on the heap waiting to be garbage-collected (which could
be for a long time if your system has a lot of spare heap memory), that
inner socket is still there, not totally closed (the server closed its
side, but the client didn't).
Eventually, you may have so many sockets in the CLOSE_WAIT state, that
your system's TCP stack becomes unresponsive. (That's what I have seen
happening under Linux).
I do not really know the real underlying reason for this behaviour, but
my guess is that below the level of Java and the JVM, a Java socket at
some level relies on an OS-level socket object. And as long as that
OS-level socket object is not explicitly told to close() the connection,
it doesn't.
So make sure that before you discard your high-level objects containing
a connection, you explicitly close that connection.
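As a minimal sketch of what I mean (this uses a throwaway local server
as a stand-in for your port-8080 service, so it is self-contained; the
class and method names are mine, not from your application):

```java
import java.net.ServerSocket;
import java.net.Socket;

public class CloseWaitDemo {

    // Connects to a throwaway local server, lets the server close its
    // side first (which puts our side into CLOSE_WAIT), then closes
    // explicitly instead of waiting for garbage collection.
    static boolean connectAndCloseProperly() throws Exception {
        ServerSocket server = new ServerSocket(0); // any free port
        Socket client = new Socket("127.0.0.1", server.getLocalPort());
        Socket accepted = server.accept();

        // Server closes its side first: from here until the client
        // also calls close(), netstat shows the client side in CLOSE_WAIT.
        accepted.close();

        try {
            // ... the application would read/write here ...
        } finally {
            // Without this explicit close(), the socket lingers in
            // CLOSE_WAIT until the wrapping object is garbage-collected.
            client.close();
        }
        server.close();
        return client.isClosed();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("explicitly closed: " + connectAndCloseProperly());
    }
}
```

The finally block is the important part: it guarantees close() runs
even if reading the response throws, so the socket never depends on the
garbage collector to be released.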
Addendum : this may be interesting too :
http://www.michaelprivat.com/?p=63
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org