Trying to add some info below.

On 21.05.2009 05:09, Caldarale, Charles R wrote:
>> From: Pantvaidya, Vishwajit [mailto:vpant...@selectica.com]
>> Subject: RE: Running out of tomcat threads - why many threads in
>> RUNNABLE stage even with no activity
>>
>> So socket_keepalive is already 1. So does this mean that firewall is
>> dropping connections in spite of it.
> 
> The doc does not mention using 1 here, just true (although other variables 
> allow either).  Would be best to get Rainer's opinion when the sun comes up 
> in Europe.
> 
>> My netstat o/p had only 2 tomcat connections active in FIN_WAIT2 and
>> about 11 in keepalive on httpd side - I guess this does not indicate
>> any hanging connections?
> 
> It's not what one would like to see, especially since none of the ports match 
> - all of the connections are broken.
> 
>> Could that be because currently connectionTimeout is 
>> active in my server.xml?
> 
> I think so; it appears that setting the connectionTimeout on the Tomcat side 
> will effectively disable the expected persistence; it's an expensive 
> workaround for the problem.

1) If you want to analyze your original problem, you need to get back to
the original situation, i.e. without connectionTimeout. It doesn't make
much sense to guess about the original problem by looking at something
very different.

2) The output of netstat and the content of a thread dump change over
time. If you want to understand the exact relationship between the two
netstat outputs and a thread dump, you need to produce all three as
close together in time as possible. I'm talking about seconds, not
milliseconds.
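
A rough sketch of how one could take the snapshots back to back, assuming
Python 3.7+ is available. The helper names (capture, spread_seconds) are
made up for illustration, and the commands you pass in (for example
netstat -an on each host, run via ssh, plus jstack <pid> for the thread
dump) are assumptions you need to adapt to your systems.

```python
import subprocess
import time


def capture(commands):
    """Run each (label, argv) command in order.

    Returns a list of (label, start_time, stdout) tuples so that the
    outputs can later be compared knowing when each one was taken.
    """
    results = []
    for label, argv in commands:
        started = time.time()
        proc = subprocess.run(argv, capture_output=True, text=True)
        results.append((label, started, proc.stdout))
    return results


def spread_seconds(results):
    """Seconds between the start of the first and the last command."""
    times = [started for _, started, _ in results]
    return max(times) - min(times)
```

If spread_seconds() reports more than a few seconds, the snapshots are
probably too far apart to be compared line by line.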

3) I think I already indicated that you do not want to look at entries
in TIME_WAIT state. This state is special and not related to any threads
in Apache or in Tomcat. A connection in TIME_WAIT state is no longer
associated with any process and will no longer handle any data.
It's a placeholder that prevents new connections from using the same
ports and possibly getting confused by late packets from the previous
connection. The only reason to care about TIME_WAIT connections is when
the total number of connections on the system (all ports, including
TIME_WAIT) exceeds 10000. Some systems can cope with more, like 60000,
but if you go above 10000, you need to start thinking about it. I assume
this is in no way the case here. So for the moment let's forget about
the TIME_WAIT connections.
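
To check the total and to filter out TIME_WAIT, something like the
following could be used. It is only a sketch: it assumes the Linux
netstat -an column layout (state in the sixth column of each tcp line),
and the function names are made up for illustration.

```python
from collections import Counter


def count_states(netstat_text):
    """Count TCP connections per state in `netstat -an` style output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local remote state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


def summarize(counts, warn_total=10000):
    """Return (total, over_limit, counts without TIME_WAIT).

    The total includes TIME_WAIT, because that is the number that
    matters for the 10000 threshold mentioned above.
    """
    total = sum(counts.values())
    interesting = {s: n for s, n in counts.items() if s != "TIME_WAIT"}
    return total, total > warn_total, interesting
```

Feed it the saved netstat output; only the third return value is worth
comparing against the thread dump.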

4) Firewall idle connection drop: First read

http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html#Firewall%20Connection%20Dropping

carefully and try to understand it.

Any mod_jk attribute that takes a boolean value will accept 1, true,
True, t or T as true, and 0, false, False, f or F as false (and maybe
even more). So socket_keepalive=1 is equivalent to socket_keepalive=true.

5) Matching port numbers

Usually the port numbers should match. A mismatch could indicate that
there is a firewall in between, although most firewall systems will be
transparent to the ports (yes, I know there are many variations). Since
the port numbers are very close, I would guess that they don't match
because the netstat outputs were taken a couple of seconds or more
apart and your connections are only used for a very short time, so
there is a lot of new connection activity.
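
Matching the two netstat outputs by hand is tedious, so here is a small
sketch of how it could be automated. It assumes the Linux netstat -an
column layout and the default AJP port 8009 (adjust ajp_port if yours
differs); the function names are made up for illustration.

```python
def parse_connections(netstat_text):
    """Return (local, remote, state) for each tcp line."""
    conns = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local remote state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            conns.append((fields[3], fields[4], fields[5]))
    return conns


def pair_up(httpd_text, tomcat_text, ajp_port="8009"):
    """Pair httpd-side AJP connections with their Tomcat-side peers.

    Returns (httpd_local, tomcat_addr, httpd_state, tomcat_state)
    tuples; tomcat_state is None when Tomcat shows no connection
    with the same address and port pair.
    """
    suffix = ":" + ajp_port
    # On the Tomcat side local and remote are swapped, so key the
    # lookup by (remote, local) to line up with the httpd view.
    tomcat_side = {
        (remote, local): state
        for local, remote, state in parse_connections(tomcat_text)
        if local.endswith(suffix)
    }
    pairs = []
    for local, remote, state in parse_connections(httpd_text):
        if remote.endswith(suffix):
            peer_state = tomcat_side.get((local, remote))
            pairs.append((local, remote, state, peer_state))
    return pairs
```

Entries with None as the last element are the ones with no matching
port on the Tomcat side. If nearly all of them come out as None, the
two outputs were most likely taken too far apart.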

6) TCP states

LISTEN on the Tomcat side corresponds to the one TP thread that does the
socket accept.

ESTABLISHED: both sides still want to use this connection. On the Tomcat
side it shows up as socketRead0() in the thread dump.

CLOSE_WAIT: the other side has closed the connection, the local side
hasn't yet. E.g. if Tomcat closes the connection because of
connectionTimeout, but Apache doesn't have a configured idle timeout and
hasn't yet tried to reuse the half-closed connection, the connection will
be shown as CLOSE_WAIT on the httpd side. If Apache closed the
connection, but Tomcat hasn't noticed yet, it will be CLOSE_WAIT at the
Tomcat end. In this case it could also show up as socketRead0() in the
thread dump.

FIN_WAIT2: most likely the other end of CLOSE_WAIT.


7) mod_jk update

Before you start to fix your mod_jk configuration, go to your ops people
and tell them that they are using a very bad mod_jk version and that
they have to update. The right version to update to is 1.2.28. It makes
no sense at all to try to fix this with your old version. Solve your
problem the right way, by setting many more attributes on the JK side
rather than just the connectionTimeout on the Tomcat side.

Most important: read the above timeouts page fully.

Regards,

Rainer
