TVFoodMaps wrote:
Hi,

My website is set up using apache 2, mod jk 1.37 and tomcat 6.  Most of the
connector settings are set to defaults and things normally run pretty well.
However, during certain spikes in traffic my server seems to hang, apparently
because of "leaking connections".  What I see is about 250
rows like this from a netstat:

tcp        1      0 ::ffff:127.0.0.1:8009       ::ffff:127.0.0.1:48387      CLOSE_WAIT
tcp        1      0 ::ffff:127.0.0.1:8009       ::ffff:127.0.0.1:48309      CLOSE_WAIT
tcp      689      0 ::ffff:127.0.0.1:8009       ::ffff:127.0.0.1:48423      CLOSE_WAIT
tcp      686      0 ::ffff:127.0.0.1:8009       ::ffff:127.0.0.1:48413      CLOSE_WAIT

I am also seeing a lot of these:

tcp        1      0 ::ffff:127.0.0.1:49261      ::ffff:127.0.0.1:8080       CLOSE_WAIT
tcp        1      0 ::ffff:127.0.0.1:52836      ::ffff:127.0.0.1:8080       CLOSE_WAIT
tcp        0      0 ::ffff:127.0.0.1:58262      ::ffff:127.0.0.1:8080       TIME_WAIT

(Note the application makes direct calls to port 8080 for a specific API
I'm using (SOLR)).

I'm really not sure which is actually causing the problem, but I was hoping
for some guidance on which settings I should look into tweaking.


Hi.
1) here is one explanation (among many) of the CLOSE_WAIT state.
http://blogs.technet.com/b/janelewis/archive/2010/03/09/explaining-close-wait.aspx
(and many more if you search google for "tcp close_wait")
Basically, it is a normal state through which a TCP connection passes at some
point: the side that receives the other side's close sits in CLOSE_WAIT until
the local application closes its own end of the connection.
It is only pathological when you have many of them persisting for a long time.
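
To tell whether the CLOSE_WAIT entries are "many" and "persisting", it can help to count them per port and compare two snapshots taken a few minutes apart. Below is a minimal sketch in Java, assuming the Linux netstat column layout shown in the question (proto, Recv-Q, Send-Q, local address, remote address, state). The class and method names are made up for illustration, and the sample text is just a few lines in the same shape as the output above.

```java
class CloseWaitCount {

    // Counts netstat records that are in the given state and have the given
    // port on either end. Works on whitespace-separated tokens, so a record
    // whose state was wrapped onto the next line is still read correctly.
    static int countInState(String netstatText, String state, int port) {
        String suffix = ":" + port;
        String[] tok = netstatText.trim().split("\\s+");
        int count = 0;
        for (int i = 0; i + 5 < tok.length; i++) {
            if (!tok[i].startsWith("tcp")) {
                continue; // not the start of a record
            }
            String local = tok[i + 3];
            String remote = tok[i + 4];
            String recordState = tok[i + 5];
            if (recordState.equals(state)
                    && (local.endsWith(suffix) || remote.endsWith(suffix))) {
                count++;
            }
            i += 5; // skip the rest of this record
        }
        return count;
    }

    public static void main(String[] args) {
        // Illustrative sample only, shaped like the netstat output in the question.
        String sample =
              "tcp   1 0 ::ffff:127.0.0.1:8009  ::ffff:127.0.0.1:48387\n CLOSE_WAIT\n"
            + "tcp 689 0 ::ffff:127.0.0.1:8009  ::ffff:127.0.0.1:48423\n CLOSE_WAIT\n"
            + "tcp   1 0 ::ffff:127.0.0.1:49261 ::ffff:127.0.0.1:8080\nCLOSE_WAIT\n"
            + "tcp   0 0 ::ffff:127.0.0.1:58262 ::ffff:127.0.0.1:8080\nTIME_WAIT\n";
        System.out.println(countInState(sample, "CLOSE_WAIT", 8009)); // prints 2
        System.out.println(countInState(sample, "CLOSE_WAIT", 8080)); // prints 1
    }
}
```

If the count for a port stays high, or keeps growing, between two snapshots, the connections are probably of the pathological kind.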

2) assuming yours are pathological and persist a long time :
2.a) the ones like this :
> tcp        1      0 ::ffff:127.0.0.1:8009       ::ffff:127.0.0.1:48387      CLOSE_WAIT

involve port 8009, which is the AJP port of Tomcat, in this case the "server" side. The other side is mod_jk within Apache httpd, in this case the client side (because it is mod_jk which first establishes the connection to Tomcat, so mod_jk is the client here). Port 8009 is in the local-address column, so this socket is the Tomcat end of the connection, and CLOSE_WAIT means that the other end (mod_jk) has already closed while Tomcat has not yet closed its own end. If one of these persists for a long time, it means that the Tomcat side is not completing the close of a connection that mod_jk has dropped.
Why that could be, another person here would have to explain.

2.b) the other ones like
> tcp        1      0 ::ffff:127.0.0.1:49261      ::ffff:127.0.0.1:8080       CLOSE_WAIT

relate to your application (as a client), which does not entirely close() its connections to port 8080. In my own experience - and assuming that your application is a Java application - this can happen, for example, as follows :
- the explicit connection to port 8080 is made from within some object, as part of the creation of that object
- then when the application doesn't need the "connection object" anymore, it discards it.
- the object is left on the heap, waiting to be garbage-collected
- when it is (eventually) garbage-collected, it will really be destroyed, and any lingering "socket" within it will be closed (and the line above will disappear from your netstat output)
but...
while it sits on the heap waiting to be garbage-collected (which could be for a long time if your system has a lot of spare heap memory), that inner socket is still there, not totally closed (the server closed its side, but the client didn't).

Eventually, you may have so many sockets in the CLOSE_WAIT state, that your system's TCP stack becomes unresponsive. (That's what I have seen happening under Linux).

I do not really know the real underlying reason for this behaviour, but my guess is that below the level of Java and the JVM, a Java socket at some level relies on an OS-level socket object. And as long as that OS-level socket object is not explicitly told to close() the connection, it doesn't. So make sure that before you discard your high-level objects containing a connection, you explicitly close that connection.
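
As an illustration of "explicitly close that connection", here is a minimal sketch, assuming Java 7 or later so that try-with-resources is available (on older JVMs the same thing is done with close() in a finally block). The class, the ask() helper and the throwaway echo server are made up for the example and are not taken from your application. The point is only that the socket is closed when the block exits, instead of whenever the garbage collector gets around to it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class ExplicitClose {

    // Hypothetical client helper: sends one line and returns the one-line reply.
    static String ask(String host, int port, String line) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        } // socket.close() is called here, even if an exception was thrown
    }

    // Starts a throwaway one-shot echo server on a free local port and queries it.
    static String demo() throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread echo = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8));
                    String line = in.readLine();
                    OutputStream out = peer.getOutputStream();
                    out.write(("echo:" + line + "\n").getBytes(StandardCharsets.UTF_8));
                    out.flush();
                } catch (IOException e) {
                    // Demo only: a failed accept/read just ends the server thread.
                }
            });
            echo.start();
            String reply = ask("127.0.0.1", server.getLocalPort(), "ping");
            echo.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints "echo:ping"
    }
}
```

If the connection is wrapped by a higher-level client object rather than used as a raw socket, the same idea applies: call whatever close or shutdown method that object offers before dropping the last reference to it.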


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
