Chris,

On Thu, Nov 20, 2014 at 3:16 PM, Christopher Schultz
<ch...@christopherschultz.net> wrote:
>
> Lisa,
>
> On 11/19/14 1:36 PM, Lisa Woodring wrote:
>> On Tue, Nov 18, 2014 at 2:43 PM, Christopher Schultz
>> <ch...@christopherschultz.net> wrote:
>>>
>>> Lisa,
>>>
>>> On 11/18/14 11:52 AM, Lisa Woodring wrote:
>>>> We recently upgraded from Tomcat 6.0.29 to Tomcat 8.0.14.
>>>> Everything appears to be working fine, except that Tomcat is
>>>> keeping a high # of threads (in TIMED_WAITING state) -- and the
>>>> CPU has a high load and low idle time. We are currently running
>>>> Tomcat 8 on 2 internal test machines, where we also monitor
>>>> their statistics. In order to monitor the availability of the
>>>> HTTPS/AJP port (Apache --> Tomcat), our monitoring software opens
>>>> a port to verify that this works -- but then does not follow
>>>> that up with an actual request. This happens every 2 minutes.
>>>> We have noticed that the high thread/load activity on Tomcat
>>>> coincides with this monitoring. If we disable our monitoring,
>>>> the issue does not happen. We have enabled/disabled the
>>>> monitoring on both machines over several days (and there is
>>>> only very minimal, sometimes non-existent, internal traffic
>>>> otherwise) -- in order to verify that the monitoring is really
>>>> the issue. Once these threads ramp up, they stay there or keep
>>>> increasing. We had no issues running on Tomcat 6 (the thread
>>>> count stayed low, low load, high idle time).
>>>>
>>>> The thread backtraces for these threads look like this:
>>>> -----------------------------------------------------------------------------
>>>> Thread[catalina-exec-24,5,main]
>>>>   at sun.misc.Unsafe.park(Native Method)
>>>>   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>>>>   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>>>>   at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
>>>>   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
>>>>   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
>>>>   at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>   at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>>>>   at java.lang.Thread.run(Thread.java:745)
>>>> -----------------------------------------------------------------------------
>>>>
>>>> The thread count grows over time (goes up to 130-150 threads after 2
>>>> hours). Setting 'connectionTimeout' (as opposed to the default
>>>> of never timing out) does seem to help "some" -- the # of
>>>> threads isn't quite as bad (only 60-80 threads after 2 hours).
>>>> However, the CPU Idle % is still not good -- it was only 10% idle
>>>> with default Tomcat settings, and is something like 40% idle with
>>>> the current settings. We also tried setting Apache's 'KeepAliveTimeout
>>>> = 5' (currently set to 15) but this did not make any
>>>> difference.
>>>>
>>>> Is there some configuration we can set to make Tomcat tolerant
>>>> of this monitoring?
>>>> (We have tried setting connectionTimeout
>>>> and keepAliveTimeout on the Connector. And we have tried putting
>>>> the Connector behind an Executor with maxIdleTime.) OR, should
>>>> we modify our monitoring somehow? And if so, suggestions?
>>>>
>>>> * Running on Linux CentOS release 5.9
>>>> * running Apache in front of Tomcat for authentication, using mod_jk
>>>> * Tomcat 8.0.14
>>>>
>>>> relevant sections of tomcat/conf/server.xml:
>>>> ------------------------------------------------------------------------
>>>> <Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
>>>>     maxThreads="250" minSpareThreads="20" maxIdleTime="60000" />
>>>>
>>>> <Connector executor="tomcatThreadPool" port="8080"
>>>>     protocol="HTTP/1.1" connectionTimeout="20000"
>>>>     redirectPort="8443" />
>>>>
>>>> <Connector executor="tomcatThreadPool" port="8009"
>>>>     protocol="AJP/1.3" redirectPort="8443" maxThreads="256"
>>>>     connectionTimeout="3000" keepAliveTimeout="60000" />
>>>
>>> Both of these connectors should be NIO connectors, so they should
>>> not block while waiting for more input. That means that you
>>> should not run out of threads (which is good), but those
>>> connections will sit in the poller queue for a long time (20
>>> seconds for HTTP, 3 seconds for AJP) and then sit in the acceptor
>>> queue for the same amount of time (to check for a "next"
>>> keepAlive request). Are you properly shutting down the connection
>>> on the client end every 2 minutes?
>>
>> The monitoring software is trying to test that the AJP port
>> itself is actually accepting connections. With Apache in front in
>> a production system, it could forward the actual request to one of
>> several Tomcat boxes -- but we don't know which one from the
>> outside.
>
> Given that the whole point is to test whether the AJP connection is
> available, why would you bother making an HTTP request to the web
> server and then be sent arbitrarily to an unknown back-end Tomcat server?
>
> Instead, might I suggest making a connection directly to the Tomcat
> you want to test using the AJP protocol?
>
Actually, that is what we're doing. It tests whether it can successfully
open a connection to port 8009. But then it doesn't actually do anything
after opening the connection (doesn't send any type of request).

>> The monitoring software is trying to test -- for each Tomcat
>> instance -- if it is accepting connections. It used to send an
>> "nmap" request, but now sends essentially a "tcp ping" -- to port
>> 8009, gets a response, and moves on. So, no, it does not shut down
>> the connection -- it's pretty simple/dumb.
>>
>> My main questions are:
>> 1) Why was this ok on Tomcat 6, but now an issue with Tomcat 8?
>
> I'm not sure.
>
>> 2) Suggestions on how to monitor this better?
>
> We use check_ajp for Nagios, which you can easily find online. I'm not
> exactly sure what it does, but it doesn't clog up our request queues
> in production.

Thanks, I will take a look.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
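P.S. In case it helps anyone searching the archives later: an AJP-aware
probe would send a request on the connection and then close the socket,
instead of leaving it half-open for the poller to time out. Below is a
minimal sketch of such a probe in Python. The packet bytes are my reading
of the AJPv13 protocol (client packets start with the magic bytes 0x12
0x34 and a 2-byte length; CPing is payload byte 10, and the container
replies with CPong, payload byte 9, behind the server magic 'AB') -- this
is not from the thread above, and I can't say whether check_ajp does
exactly the same thing:

```python
import socket
import struct

# Assumed AJPv13 framing (from the protocol reference, not this thread):
# client->server magic 0x12 0x34, 2-byte payload length, then payload.
CPING = struct.pack(">BBHB", 0x12, 0x34, 1, 10)   # CPing = opcode 10
CPONG = b"AB" + struct.pack(">HB", 1, 9)          # CPong = opcode 9

def ajp_alive(host: str, port: int = 8009, timeout: float = 5.0) -> bool:
    """Probe an AJP connector with CPing/CPong, closing the socket cleanly."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(CPING)
            return s.recv(len(CPONG)) == CPONG
    except OSError:
        # connection refused / timed out / reset -> connector not healthy
        return False
    # Leaving the 'with' block closes the connection, so the connector's
    # poller is not left waiting out connectionTimeout/keepAliveTimeout
    # on a connection that will never carry a request.
```

Because the probe both speaks the protocol and closes the connection, it
should avoid the idle-connection buildup described above; a plain
open-and-walk-away "tcp ping" verifies only that the accept queue is
alive, not that the connector can process requests.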