On 4 October 2017 15:17:25 BST, Mark Thomas <ma...@apache.org> wrote:
>On 04/10/17 13:51, TurboChargedDad . wrote:
>>  Hello all..
>> I am going to do my best to describe my problem.  Hopefully someone will
>> have some sort of insight.
>> 
>> Tomcat 7.0.41 (working on updating that)
>> Java 1.6 (working on getting this updated to the latest minor release)
>> RHEL Linux
>> 
>> I inherited a multi-tenant setup.  Individual user accounts on the system
>> each have their own Tomcat instance, and each is started using sysinit.
>> This is done to keep each website in its own permissible world so one
>> website can't interfere with another's data.
>> 
>> There are two load balanced apache proxies at the edge that point to one
>> Tomcat server (I know I know but again I inherited this)
>> 
>> Apache sits on top of tomcat to terminate SSL and uses AJP to
>> proxypass to each tomcat instance based on the user's assigned port.
>> 
>> Things have run fine for years (so I am being told anyway) until recently.
>> Let me give an example of an outage.
>> 
>> User1, user2 and user3 all use unique databases on a shared database
>> server, SQL server 10.
>> 
>> User 4 runs on a windows jboss server and also has a database on shared
>> database server 10.
>> 
>> Users 5-50 all run on the mentioned Linux server using tomcat and have
>> databases on *other* various shared database servers but have nothing to
>> do with database server 10.
>> 
>> User 4 had a stored proc go wild on database server 10 basically knocking
>> it offline.
>> 
>> Now one would expect sites 1-4 to experience interruption of service
>> because they use a shared DBMS platform.  However.
>> 
>> Every single site goes down. I monitor the connections for each site with
>> a custom tool.  When this happens, the connections start stacking up across
>> all the components. (Proxies all the way through the stack)
>> Looking at the AJP connection pool threads for user 9 shows that user has
>> exhausted their AJP connection pool threads.  They are maxed out at 300 yet
>> that user doesn't have high activity at all. The CPU load, memory usage and
>> traffic for everything except SQL server 10 is stable during this outage.
>> The proxies start consuming more and more memory the longer the outage
>> lasts but that's expected as the connection counts stack up into the
>> thousands.  After a short time all the sites' apache / ssl termination
>> layers start throwing AJP timeout errors.  Shortly after that the edge
>> proxies will naturally also start throwing timeout errors of their own.
>> 
>> I am only watching user 9 using a tool that allows me to have insight into
>> what's going on using JMX metrics but I suspect that once I get all the
>> others instrumented that I will see the same thing. Maxed out AJP
>> connection pools.
>> 
>> Aren't those supposed to be unique per user/ JVM? Am I missing something
>> in the docs?
>> 
>> Any assistance from the tomcat gods is much appreciated.
>
>TL;DR - Try switching to the NIO AJP connector on Tomcat.
>
>Take a look at this session I just uploaded from TomcatCon London last
>week. You probably want to start around 35:00 and the topic of thread
>exhaustion.

Whoops. Here is the link.

https://youtu.be/2QYWp1k5QQM
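
For reference, the NIO AJP connector suggested above is selected through
the protocol attribute of the AJP <Connector> element in each instance's
server.xml. A minimal sketch for Tomcat 7 follows. The port (the default
AJP port 8009), connectionTimeout and redirectPort are assumed values for
illustration only; each instance would keep its own assigned port, and
maxThreads simply mirrors the 300 mentioned in the original post.

```xml
<!-- Sketch only: port, connectionTimeout and redirectPort are assumed values -->
<Connector port="8009"
           protocol="org.apache.coyote.ajp.AjpNioProtocol"
           maxThreads="300"
           connectionTimeout="60000"
           redirectPort="8443" />
```

The likely benefit is that the blocking (BIO) AJP connector ties up one
thread for every open connection from httpd, idle or not, whereas the NIO
connector only needs a thread while a request is actually being processed.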

Mark


>
>HTH,
>
>Mark
>
>P.S. The other sessions we have are on the way. I plan to update the
>site and post links once I have them all uploaded.
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>For additional commands, e-mail: users-h...@tomcat.apache.org

