Chris,

On Thu, Nov 20, 2014 at 3:16 PM, Christopher Schultz
<ch...@christopherschultz.net> wrote:
>
> Lisa,
>
> On 11/19/14 1:36 PM, Lisa Woodring wrote:
>> On Tue, Nov 18, 2014 at 2:43 PM, Christopher Schultz
>> <ch...@christopherschultz.net> wrote:
>>>
>>> Lisa,
>>>
>>> On 11/18/14 11:52 AM, Lisa Woodring wrote:
>>>> We recently upgraded from Tomcat 6.0.29 to Tomcat 8.0.14.
>>>> Everything appears to be working fine, except that Tomcat is
>>>> keeping a high # of threads (in TIMED_WAITING state) -- and the
>>>> CPU has a high load & low idle time.  We are currently running
>>>> Tomcat8 on 2 internal test machines, where we also monitor
>>>> their statistics.  In order to monitor the availability of the
>>>> HTTPS/AJP port (Apache-->Tomcat), our monitoring software opens
>>>> a port to verify that this works -- but then does not follow
>>>> that up with an actual request.  This happens every 2 minutes.
>>>> We have noticed that the high thread/load activity on Tomcat
>>>> coincides with this monitoring.  If we disable our monitoring,
>>>> the issue does not happen.  We have enabled/disabled the
>>>> monitoring on both machines over several days (and there is
>>>> only very minimal, sometimes non-existent, internal traffic
>>>> otherwise) -- in order to verify that the monitoring is really
>>>> the issue.  Once these threads ramp up, they stay there or keep
>>>> increasing.  We had no issues running on Tomcat 6 (the thread
>>>> count stayed low, low load, high idle time).
>>>>
>>>> The thread backtraces for these threads look like this:
>>>> -----------------------------------------------------------------------------
>>>> Thread[catalina-exec-24,5,main]
>>>>   at sun.misc.Unsafe.park(Native Method)
>>>>   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>>>>   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>>>>   at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
>>>>   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
>>>>   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
>>>>   at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>   at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>>>>   at java.lang.Thread.run(Thread.java:745)
>>>> -----------------------------------------------------------------------------
>>>>
>>>> The thread count grows over time (goes up to 130-150 threads after
>>>> 2 hours).  Setting 'connectionTimeout' (as opposed to the default
>>>> of never timing out) does seem to help "some" -- the # of
>>>> threads isn't quite as bad (only 60-80 threads after 2 hours).
>>>> However, the CPU idle % is still not good -- it was only 10% idle
>>>> with default Tomcat settings, and is something like 40% idle with
>>>> current settings.  We also tried setting Apache's 'KeepAliveTimeout
>>>> = 5' (currently set to 15) but this did not make any difference.
>>>>
>>>>
>>>> Is there some configuration we can set to make Tomcat tolerant
>>>> of this monitoring?  (We have tried setting connectionTimeout
>>>> & keepAliveTimeout on the Connector.  And we have tried putting
>>>> the Connector behind an Executor with maxIdleTime.) OR, should
>>>> we modify our monitoring somehow?  And if so, suggestions?
>>>>
>>>>
>>>> * Running on Linux CentOS release 5.9
>>>> * Running Apache in front of Tomcat for authentication, using mod_jk
>>>> * Tomcat 8.0.14
>>>>
>>>> relevant sections of tomcat/conf/server.xml:
>>>> ------------------------------------------------------------------------
>>>> <Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
>>>>     maxThreads="250" minSpareThreads="20" maxIdleTime="60000" />
>>>>
>>>> <Connector executor="tomcatThreadPool" port="8080"
>>>>     protocol="HTTP/1.1" connectionTimeout="20000"
>>>>     redirectPort="8443" />
>>>>
>>>> <Connector executor="tomcatThreadPool" port="8009"
>>>>     protocol="AJP/1.3" redirectPort="8443" maxThreads="256"
>>>>     connectionTimeout="3000" keepAliveTimeout="60000" />
>>>
>>> Both of these connectors should be NIO connectors, so they should
>>> not block while waiting for more input. That means that you
>>> should not run out of threads (which is good), but those
>>> connections will sit in the poller queue for a long time (20
>>> seconds for HTTP, 3 seconds for AJP) and then sit in the acceptor
>>> queue for the same amount of time (to check for a "next"
>>> keepAlive request). Are you properly shutting down the connection
>>> on the client end every 2 minutes?
>>>
>>
>> The monitoring software is trying to test whether the AJP port
>> itself is actually accepting connections.  With Apache in front in
>> a production system, it could forward the actual request to one of
>> several Tomcat boxes -- but we don't know which one from the
>> outside.
>
> Given that the whole point is to test whether the AJP connection is
> available, why would you bother making an HTTP request to the web
> server and then be sent arbitrarily to an unknown back-end Tomcat server?
>
> Instead, might I suggest making a connection directly to the Tomcat
> you want to test using the AJP protocol?
>


Actually, that is what we're doing.  It tests whether it can
successfully open a connection to port 8009, but then doesn't do
anything after opening the connection (it doesn't send any type of
request).
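
For illustration, here is a minimal sketch of a probe that does the
same connect-only check but then shuts the connection down cleanly,
which is what Chris is asking about above.  The host name and timeout
values are placeholders, not what our monitoring actually uses:

-----------------------------------------------------------------------------
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class AjpPortProbe {

    // Opens a TCP connection to the AJP port and closes it cleanly.
    // Closing the socket lets Tomcat see EOF and release the connection
    // instead of holding a poller slot until a timeout fires.
    static boolean probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;   // connected; try-with-resources closes the socket
        } catch (IOException e) {
            return false;  // refused, unreachable, or timed out
        }
    }

    public static void main(String[] args) {
        System.out.println(probe("tomcat-test-1", 8009, 2000) ? "OK" : "DOWN");
    }
}
-----------------------------------------------------------------------------

The only difference from our current probe is the explicit close; a
probe that abandons the socket leaves Tomcat holding its end until
connectionTimeout/keepAliveTimeout expires.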


>> The monitoring software is trying to test -- for each Tomcat
>> instance -- if it is accepting connections.  It used to send an
>> "nmap" request, but now sends essentially a "tcp ping" -- to port
>> 8009, gets a response & moves on.  So, no, it does not shut down the
>> connection -- it's pretty simple/dumb.
>>
>> My main questions are: 1) Why was this OK on Tomcat 6, but is now an
>> issue with Tomcat 8?
>
> I'm not sure.
>
>> 2) Suggestions on how to monitor this better?
>
> We use check_ajp for Nagios, which you can easily find online. I'm not
> exactly sure what it does, but it doesn't clog up our request queues
> in production.


Thanks, I will take a look.
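
For reference, what a check_ajp-style probe effectively does can also
be hand-rolled: the AJPv13 protocol defines a CPing/CPong handshake
for exactly this kind of health check.  A minimal sketch -- the packet
bytes follow the AJP protocol documentation, while the host name and
timeouts are placeholder assumptions:

-----------------------------------------------------------------------------
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

public class AjpCPing {

    public static void main(String[] args) throws Exception {
        // AJP13 CPing packet: magic 0x12 0x34, payload length 1, type 10 (CPING)
        byte[] cping = {0x12, 0x34, 0x00, 0x01, 0x0A};

        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress("tomcat-test-1", 8009), 2000);
            socket.setSoTimeout(2000);

            OutputStream out = socket.getOutputStream();
            out.write(cping);
            out.flush();

            // Expected CPong reply: 'A' 'B', payload length 1, type 9 (CPONG)
            InputStream in = socket.getInputStream();
            byte[] reply = new byte[5];
            int read = 0;
            while (read < reply.length) {
                int n = in.read(reply, read, reply.length - read);
                if (n < 0) break;   // container closed the connection early
                read += n;
            }
            boolean pong = read == reply.length
                    && reply[0] == 0x41 && reply[1] == 0x42 && reply[4] == 0x09;
            System.out.println(pong ? "CPONG -- AJP connector alive" : "no CPONG");
        }   // try-with-resources closes the socket, so Tomcat sees a clean shutdown
    }
}
-----------------------------------------------------------------------------

Unlike a bare TCP connect, this exercises the connector itself, and
because it closes the socket afterwards it shouldn't leave worker
threads parked the way the current probe does.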

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
