On 12.01.2018 07:58, Toom Andreas wrote:
Hi,
We are having an issue with an application running Apache Tomcat 8.0.47 on an
Oracle JVM 1.8.0_151 on Linux (RHEL 7). The Tomcat process is consuming CPU at
a constantly high level even though there is very little incoming traffic.
When inspecting the process health using JMX/JVisualVM's CPU sampler, we see
that there are 4 ajp-nio-exec threads which, together with a
NioBlockingSelector.BlockPoller thread, consume most of the CPU.
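For reference, the thread dumps below can also be captured on the command
line; a minimal sketch, assuming the JDK tools are on the PATH (the PID 12345
is of course a placeholder):

  # find the Tomcat PID
  jps -l | grep org.apache.catalina.startup.Bootstrap
  # take a thread dump including lock/synchronizer information
  jstack -l 12345 > tomcat-threads.txt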
A stack trace of one of the ajp-nio-exec threads looks like this:
"ajp-nio-48317-exec-14233" - Thread t@201195
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <342fab60> (a
java.util.concurrent.CountDownLatch$Sync)
at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at
java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at
org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(NioEndpoint.java:1400)
at
org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(NioEndpoint.java:1402)
at
org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:185)
at
org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:250)
at
org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:231)
at
org.apache.coyote.ajp.AjpNioProcessor.readSocket(AjpNioProcessor.java:194)
at
org.apache.coyote.ajp.AjpNioProcessor.read(AjpNioProcessor.java:160)
at
org.apache.coyote.ajp.AbstractAjpProcessor.readMessage(AbstractAjpProcessor.java:1091)
at
org.apache.coyote.ajp.AbstractAjpProcessor.process(AbstractAjpProcessor.java:804)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1539)
at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1495)
- locked <279cc7f7> (a
org.apache.tomcat.util.net.NioChannel)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- locked <10e1f474> (a
java.util.concurrent.ThreadPoolExecutor$Worker)
A stack trace of the NioBlockingSelector.BlockPoller thread looks like this:
"NioBlockingSelector.BlockPoller-2" - Thread t@17
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native
Method)
at
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <648895e4> (a sun.nio.ch.Util$3)
- locked <17921532> (a
java.util.Collections$UnmodifiableSet)
- locked <67762111> (a
sun.nio.ch.EPollSelectorImpl)
at
sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:105)
at
org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:339)
Locked ownable synchronizers:
- None
It seems like the worker thread is trying to read from the underlying socket.
When we inspect the OS TCP/IP statistics using the ss command
(ss -i src :<ajp-port>), the following connections look suspicious:
State       Recv-Q  Send-Q  Local Address:Port  Peer Address:Port

CLOSE-WAIT  2       0       10.x.x.x:48317      10.x.x.11:53724
      cubic wscale:7,7 rto:202 rtt:1.724/2.424 ato:40 mss:1448 cwnd:10
      ssthresh:31 bytes_acked:343869507 bytes_received:22259372
      segs_out:302687 segs_in:141943 send 67.2Mbps lastsnd:1290791171
      lastrcv:1290786778 lastack:1290786767 pacing_rate 134.4Mbps retrans:0/433
      rcv_rtt:1.875 rcv_space:28960

CLOSE-WAIT  1       0       10.x.x.x:48317      10.x.x.12:41990
      cubic wscale:7,7 rto:202 rtt:1.132/0.899 ato:40 mss:1448 cwnd:10
      ssthresh:22 bytes_acked:37347221 bytes_received:3346200 segs_out:35273
      segs_in:20608 send 102.3Mbps lastsnd:289840681 lastrcv:289730802
      lastack:289720796 pacing_rate 204.5Mbps retrans:0/13 reordering:24
      rcv_rtt:1.875 rcv_space:28960

CLOSE-WAIT  1       0       10.x.x.x:48317      10.x.x.12:60068
      cubic wscale:7,7 rto:209 rtt:8.757/14.317 ato:40 mss:1448 cwnd:10
      ssthresh:36 bytes_acked:180903401 bytes_received:16401965 segs_out:170017
      segs_in:100392 send 13.2Mbps lastsnd:395602314 lastrcv:395602267
      lastack:395602257 pacing_rate 26.5Mbps retrans:0/106
      rcv_rtt:1.875 rcv_space:28960

CLOSE-WAIT  2       0       10.x.x.x:48317      10.x.x.11:50628
      cubic wscale:7,7 rto:203 rtt:2.992/4.252 ato:40 mss:1448 cwnd:13
      ssthresh:25 bytes_acked:4185335 bytes_received:471220 segs_out:4193
      segs_in:2181 send 50.3Mbps lastsnd:1036916135 lastrcv:1036871377
      lastack:1036871367 pacing_rate 100.6Mbps retrans:0/1 reordering:26
      rcv_rtt:6.375 rcv_space:28960
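In case it is useful to anyone checking the same thing, the suspicious
sockets can also be isolated directly with socket-state filters (a sketch;
ss is from iproute2 and lsof is available as a package on RHEL 7):

  # list only CLOSE-WAIT connections on the AJP port
  ss -tni state close-wait '( sport = :48317 )'
  # confirm which process still holds these half-closed sockets open
  lsof -nP -iTCP:48317 -sTCP:CLOSE_WAIT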
We are running our Tomcat nodes behind an Apache httpd server and use the
Tomcat Connector (mod_jk) as a load balancer. When looking at the hosts
running the Apache node, the corresponding sockets are already closed. A
socket stays in CLOSE-WAIT after the peer's FIN until the local application
closes it itself, so to us it looks as if Tomcat has not properly closed the
underlying socket after Apache/mod_jk initiated the FIN, and is instead
continuously trying to read from a socket that has already been closed on the
Apache side.
I know that we are one version behind the latest Tomcat 8.0.x release, but I
checked the changelog and could not see any fixed issue matching this
problem. Unfortunately we only see this in production and have not been able
to reproduce it in test. Has anyone experienced a similar issue before?
For reference our connector configuration in server.xml looks like this:
<Connector port="48317"
protocol="AJP/1.3"
maxThreads="200"
connectionTimeout="600000"
xpoweredBy="false"
allowTrace="false"
URIEncoding="UTF-8"
requiredSecret="xxx"
secure="true"
scheme="https"
proxyPort="443" />
And our Tomcat Connector configuration for Apache in workers.properties looks
like this:
worker.list=status,app
# Create workers for app
worker.app-1.type=ajp13
worker.app-1.host=host8128.corpdomain.internal
worker.app-1.port=48317
worker.app-1.ping_mode=A
worker.app-1.connection_pool_timeout=600
worker.app-1.secret=xxx
worker.app-2.type=ajp13
worker.app-2.host=host8129.corpdomain.internal
worker.app-2.port=48317
worker.app-2.ping_mode=A
worker.app-2.connection_pool_timeout=600
worker.app-2.secret=xxx
# Create a loadbalancer for app
worker.app.type=lb
worker.app.sticky_session=1
worker.app.balance_workers=app-1, app-2
Best regards,
Andreas

Andreas Toom
Systems Developer / IT
TEL: +46 702 03 04 98
Volvofinans Bank AB - http://www.volvofinans.se/
Hi.
First, thanks for the very complete issue description.
This page may be of help:
https://tomcat.apache.org/connectors-doc/common_howto/timeouts.html
(which, considering the above, you have probably already read).
I did not really analyse your description in detail, nor that page.
But I am just wondering if, in this case, setting both
worker.app-x.connection_pool_timeout=600
and
<Connector .. connectionTimeout="600000" ..>
(which amount to exactly the same value, 600 seconds, since connectionTimeout
is in milliseconds and connection_pool_timeout is in seconds) is not in the
end counter-productive.
This is just a hunch, because determining the exact interaction and
side-effects of these multiple timeout parameters is somewhat of a black art,
but it may be worth changing one of them (for example back to its default
value) and checking what happens then; see the sketch below for one way to do
that.
(You say that traffic is low, so you probably do hit one or more of these
timeouts at some point.)
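For example (just a sketch, untested), you could leave workers.properties as
it is and drop the explicit timeout on the Tomcat side, so that the AJP
connector falls back to its default connectionTimeout of -1 (infinite):

  <!-- connectionTimeout removed: the AJP default (-1, no timeout) applies -->
  <Connector port="48317"
             protocol="AJP/1.3"
             maxThreads="200"
             xpoweredBy="false"
             allowTrace="false"
             URIEncoding="UTF-8"
             requiredSecret="xxx"
             secure="true"
             scheme="https"
             proxyPort="443" />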
That is not to say that there may not be some bug somewhere in the code. But doing the
experiment above and reporting the result may provide some more clues for someone to
analyse your issue in detail.
You may also want to try swapping the AJP Connector implementation, using the
following attribute:
http://tomcat.apache.org/tomcat-8.5-doc/config/ajp.html#Common_Attributes
--> protocol
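For instance (again a sketch, untested): your stack traces show the NIO
implementation, and Tomcat 8 also ships NIO2 and APR/native AJP
implementations, so you could pin the connector to one of those explicitly
instead of letting "AJP/1.3" auto-select:

  <!-- select the NIO2 AJP implementation; remaining attributes
       as in your current config -->
  <Connector port="48317"
             protocol="org.apache.coyote.ajp.AjpNio2Protocol"
             maxThreads="200"
             requiredSecret="xxx" />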
Of course, doing such changes one at a time facilitates the analysis.