In the last two weeks I've had two occurrences where a single CentOS 7
production server hosting a public webpage has become unresponsive. The
first time, all 300 available "https-jsse-nio-8443" threads were consumed,
all in "S" status, with the maximum age around 45 minutes. The second
time, all 300 were again consumed in "S" status, the oldest around
16 minutes. On both occasions a restart of Tomcat freed the threads and
the website became responsive again. The requests are plain POST/GET
calls that shouldn't take very long at all.
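
Next time it happens, I plan to capture a thread dump and connection
counts before restarting, roughly like this (the pgrep pattern is just
illustrative for my install, and jstack assumes a JDK is present):

    # Find the Tomcat JVM PID (adjust the pattern for your install)
    PID=$(pgrep -f 'org.apache.catalina.startup.Bootstrap')

    # Thread dump, to see what the https-jsse-nio-8443 threads are blocked on
    jstack "$PID" > /tmp/tomcat-threads-$(date +%s).txt

    # Count established connections on the connector port
    ss -tan state established '( sport = :8443 )' | wc -l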

CPU, memory, and the JVM all appear to be within normal operating limits.
I've not had much luck finding articles on this behavior, nor remedies
for it. As far as I can tell, the default timeout values are in use both
in Tomcat and in the applications running inside it. Hopefully someone
has insight into why this could be occurring: why isn't Tomcat killing
these connections? Even if a connection is sitting in a RST/ACK state,
shouldn't Tomcat terminate it after the default timeout once no ACK
arrives from the client?
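
For reference, this is roughly the shape of the connector in server.xml
(a sketch; the timeout attributes are shown at what I understand to be
their defaults for the NIO connector, and the SSL details are omitted):

    <Connector port="8443"
               protocol="org.apache.coyote.http11.Http11NioProtocol"
               SSLEnabled="true" scheme="https" secure="true"
               maxThreads="300"
               connectionTimeout="60000"
               keepAliveTimeout="60000">
        <!-- keystore/SSLHostConfig details omitted -->
    </Connector>

My possibly wrong understanding is that connectionTimeout only covers the
wait for the request line after a connection is accepted, not a thread
that is already servicing a request, which may be why nothing is reaping
these.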

Is there a graceful way to script the termination of these threads in
case Tomcat isn't able to do it for whatever reason? My research into
killing threads turns up system threads or application threads, not
Tomcat Connector connection threads, so I'm not sure this is even viable.
I'm also looking into ways to terminate these aged sessions via the F5,
and as a stopgap I'm considering a watchdog along the lines of the sketch
below. At this point I'm open to any suggestion that would automate a
resolution and keep the system from experiencing downtime, or any insight
on where to look for a root cause. Thanks in advance for any guidance you
can lend.
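
The watchdog idea, roughly: poll the manager status page for the busy
thread count and restart Tomcat when it pegs at the ceiling, since
killing individual connector threads doesn't appear to be safe or
supported. A rough sketch, assuming a user with the manager-status role
and a systemd unit named "tomcat" (URL, credentials, and threshold are
all placeholders):

    #!/bin/bash
    # Watchdog sketch: restart Tomcat when the connector's busy thread
    # count sits at the 300-thread ceiling. All values are placeholders.
    STATUS_URL="https://localhost:8443/manager/status?XML=true"
    CREDS="statususer:changeme"   # needs the manager-status role
    THRESHOLD=300

    # head -1 in case more than one connector is reported
    busy=$(curl -sk --max-time 10 -u "$CREDS" "$STATUS_URL" \
            | grep -o 'currentThreadsBusy="[0-9]*"' \
            | grep -o '[0-9]*' | head -n 1)

    if [ -n "$busy" ] && [ "$busy" -ge "$THRESHOLD" ]; then
        # Grab a thread dump first so there's evidence for a root cause
        jstack "$(pgrep -f org.apache.catalina.startup.Bootstrap)" \
            > "/var/log/tomcat-watchdog-$(date +%s).txt"
        systemctl restart tomcat
    fi

The obvious flaw is that if all 300 threads really are wedged, the status
request itself may time out, so I'd also have to decide whether an
unreachable manager counts as unhealthy.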

Thanks, David
