Trying to add some info below.

On 21.05.2009 05:09, Caldarale, Charles R wrote:
>> From: Pantvaidya, Vishwajit
>> Subject: RE: Running out of tomcat threads - why many threads in
>> RUNNABLE stage even with no activity
>>
>> So socket_keepalive is already 1. So does this mean that firewall is
>> dropping connections in spite of it.
>
> The doc does not mention using 1 here, just true (although other variables
> allow either). Would be best to get Rainer's opinion when the sun comes up
> in Europe.
>
>> My netstat o/p had only 2 tomcat connections active in FIN_WAIT2 and
>> about 11 in keepalive on httpd side - I guess this does not indicate
>> any hanging connections?
>
> It's not what one would like to see, especially since none of the ports match
> - all of the connections are broken.
>
>> Could that be because currently connectionTimeout is
>> active in my server.xml?
>
> I think so; it appears that setting the connectionTimeout on the Tomcat side
> will effectively disable the expected persistence; it's an expensive
> workaround for the problem.

1) If you want to analyze your original problem, you need to get back to
the original situation, i.e. without connectionTimeout. It doesn't make
much sense to guess about the original problem by looking at something
very different.

2) The output of netstat and the content of a thread dump change over
time. If you want to understand the exact relations between the two
netstat outputs and a thread dump, you need to produce those three things
as close in time as possible. I'm talking about seconds, not milliseconds.

3) I think I already indicated that you do not want to look at entries in
TIME_WAIT state. This state is special and not related to any threads in
Apache or in Tomcat. A connection in TIME_WAIT state is no longer
associated with a process in any way and will no longer handle any data.
It's a placeholder to prevent new connections from using the same ports
and possibly getting confused by old packets for the previous connections
coming in late.

The only reason to care about TIME_WAIT connections is when the total
number of all connections on the system (all ports and including
TIME_WAIT) gets above 10000. Some systems can cope with more, like 60000,
but if you go above 10000, then you need to start thinking about it. I
assume this is in no way the case here. So let's for the moment always
forget about the TIME_WAIT connections.

4) Firewall idle connection drop

First read

http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html#Firewall%20Connection%20Dropping

carefully and try to understand it.

Any mod_jk attribute that takes a boolean value will accept 1, true, True,
t or T as true, and 0, false, False, f or F as false (and maybe even more).

5) Matching port numbers

Usually the port numbers should match. Non-matching port numbers could
indicate that there is a firewall in between, although most firewall
systems will be transparent to the ports (yes, I know there are many
variations).
Since the port numbers are very close, I would guess that the reason for
not matching is that the netstat outputs were taken a couple of seconds or
more apart, and your connections are only used for a very short time, so
we have lots of new connection activity.

6) TCP states

LISTEN on the Tomcat side corresponds to the one TP thread that does a
socket accept.

ESTABLISHED: both sides still want to use this connection. On the Tomcat
side it shows up as socketRead0() in the thread dump.

CLOSE_WAIT: the other side has closed the connection, the local side
hasn't yet. E.g. if Tomcat closes the connection because of
connectionTimeout, but Apache doesn't have a configured idle timeout and
didn't yet try to reuse the half-closed connection, the connection will be
shown as CLOSE_WAIT on the httpd side. If Apache closed the connection,
but Tomcat hasn't noticed yet, it will be CLOSE_WAIT at the Tomcat end. In
this case it could also be socketRead0() in the thread dump.

FIN_WAIT2: most likely the other end of CLOSE_WAIT.

7) mod_jk update

Before you start to fix your mod_jk configuration, go to your ops people
and tell them that they are using a very bad mod_jk version and they have
to update. The right version to update to is 1.2.28. It makes no sense at
all to try to fix this with your old version. Solve your problem in the
right way, by setting many more attributes on the JK side than simply the
connectionTimeout on the Tomcat side.

Most important: read the above timeouts page fully.

Regards,

Rainer
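
For points 2, 3 and 6, a small script can make netstat snapshots taken a
few seconds apart easier to compare. The following is only a sketch and
not something taken from this thread: it assumes Linux-style "netstat -an"
output (six whitespace-separated columns, state last), and the function
name tally_states, the sample addresses and the use of port 8009 as the
AJP port are assumptions made up for illustration. It counts TCP states
for one port and leaves out TIME_WAIT, as suggested above.

```python
from collections import Counter


def tally_states(netstat_text, port, ignore=("TIME_WAIT",)):
    """Count TCP states of connections whose local or remote port is `port`.

    Assumes lines shaped like Linux `netstat -an`:
    proto recv-q send-q local-address remote-address state
    """
    counts = Counter()
    suffix = ":" + str(port)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, UDP lines and anything without a state column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        # TIME_WAIT is not tied to any thread, so it is ignored by default.
        if state in ignore:
            continue
        if local.endswith(suffix) or remote.endswith(suffix):
            counts[state] += 1
    return counts


# Hypothetical sample data, not output from the poster's system.
SAMPLE = """\
tcp        0      0 0.0.0.0:8009            0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:8009           10.0.0.4:41234          ESTABLISHED
tcp        0      0 10.0.0.5:8009           10.0.0.4:41235          CLOSE_WAIT
tcp        0      0 10.0.0.5:8009           10.0.0.4:41236          TIME_WAIT
tcp        0      0 10.0.0.5:22             10.0.0.9:50022          ESTABLISHED
"""

if __name__ == "__main__":
    for state, n in sorted(tally_states(SAMPLE, 8009).items()):
        print(state, n)
```

Running this against output captured on the httpd host and on the Tomcat
host, right before and after taking the thread dump, gives per-state
counts that can be set against the number of threads sitting in
socketRead0().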
