My initial reading about BIO vs. NIO seems to involve terminating SSL at the Tomcat instance, which we do not do. Am I running off into the weeds with that?
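(Editor's note: the BIO/NIO choice is a property of each connector rather than of SSL termination. The AJP connector has blocking and NIO implementations just like the HTTP connector, so the suggested switch applies even with httpd terminating SSL in front. A minimal sketch of the change in one instance's server.xml follows; the port, maxThreads value, and timeout are placeholders, not the actual settings from this setup:)

```xml
<!-- Before: "AJP/1.3" selects the default blocking (BIO) implementation
     on Tomcat 7, so each keep-alive connection from httpd holds a thread
     even while it sits idle. -->
<Connector port="8009" protocol="AJP/1.3" maxThreads="300" />

<!-- After: name the NIO implementation explicitly. A poller watches idle
     connections, and a thread is only tied up while a request is
     actually being processed. -->
<Connector port="8009"
           protocol="org.apache.coyote.ajp.AjpNioProtocol"
           maxThreads="300"
           connectionTimeout="60000" />
```

With the blocking implementation, every persistent AJP connection from httpd pins a Tomcat thread, which is how a 300-thread pool can be exhausted by a site with almost no traffic; that is the thread-exhaustion pattern the TomcatCon session referenced below discusses.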
Thanks,
TCD

On Wed, Oct 4, 2017 at 9:17 AM, Mark Thomas <ma...@apache.org> wrote:
> On 04/10/17 13:51, TurboChargedDad . wrote:
> > Hello all..
> > I am going to do my best to describe my problem. Hopefully someone will
> > have some sort of insight.
> >
> > Tomcat 7.0.41 (working on updating that)
> > Java 1.6 (working on getting this updated to the latest minor release)
> > RHEL Linux
> >
> > I inherited a multi-tenant setup. Individual user accounts on the system
> > each have their own Tomcat instance, each started using sysinit. This
> > is done to keep each website in its own permissible world, so one website
> > can't interfere with another's data.
> >
> > There are two load-balanced Apache proxies at the edge that point to one
> > Tomcat server (I know, I know, but again, I inherited this).
> >
> > Apache sits in front of Tomcat to terminate SSL and uses AJP to
> > ProxyPass to each Tomcat instance based on the user's assigned port.
> >
> > Things have run fine for years (so I am told, anyway) until recently.
> > Let me give an example of an outage.
> >
> > User1, user2 and user3 all use unique databases on a shared database
> > server, SQL server 10.
> >
> > User 4 runs on a Windows JBoss server and also has a database on shared
> > database server 10.
> >
> > Users 5-50 all run on the mentioned Linux server using Tomcat and have
> > databases on *other* shared database servers, but have nothing to
> > do with database server 10.
> >
> > User 4 had a stored proc go wild on database server 10, basically
> > knocking it offline.
> >
> > Now one would expect sites 1-4 to experience an interruption of service,
> > because they use a shared DBMS platform. However:
> >
> > Every single site goes down. I monitor the connections for each site
> > with a custom tool. When this happens, the connections start stacking up
> > across all the components (proxies all the way through the stack).
> > Looking at the AJP connection pool threads for user 9 shows that user
> > has exhausted their AJP connection pool threads. They are maxed out at
> > 300, yet that user doesn't have high activity at all. The CPU load,
> > memory usage and traffic for everything except SQL server 10 are stable
> > during this outage. The proxies consume more and more memory the longer
> > the outage lasts, but that's expected as the connection counts stack up
> > into the thousands. After a short time, every site's Apache/SSL
> > termination layer starts throwing AJP timeout errors. Shortly after
> > that, the edge proxies naturally start throwing timeout errors of
> > their own.
> >
> > I am only watching user 9, using a tool that gives me insight into
> > what's going on via JMX metrics, but I suspect that once I get all the
> > others instrumented I will see the same thing: maxed-out AJP
> > connection pools.
> >
> > Aren't those supposed to be unique per user/JVM? Am I missing something
> > in the docs?
> >
> > Any assistance from the Tomcat gods is much appreciated.
>
> TL;DR - Try switching to the NIO AJP connector on Tomcat.
>
> Take a look at this session I just uploaded from TomcatCon London last
> week. You probably want to start around 35:00 and the topic of thread
> exhaustion.
>
> HTH,
>
> Mark
>
> P.S. The other sessions we have are on the way. I plan to update the
> site and post links once I have them all uploaded.
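(Editor's note: the slow stack-up described above — backend stalls, then the SSL layer times out, then the edge proxies — can also be bounded on the httpd side with per-worker mod_proxy timeouts, so a hung backend fails fast instead of queueing connections for minutes. A sketch for one tenant's vhost, assuming mod_proxy_ajp is loaded; the port 8109 and the /app path are placeholders, not values from this setup:)

```apacheconf
# Hypothetical tenant vhost fragment; port and path are illustrative only.
# connectiontimeout: how long to wait when opening the AJP connection
# timeout:           how long to wait for the backend to answer a request
# retry:             how long a failed worker stays in the error state
ProxyPass        "/app" "ajp://127.0.0.1:8109/app" connectiontimeout=5 timeout=30 retry=60
ProxyPassReverse "/app" "ajp://127.0.0.1:8109/app"
```

Without an explicit per-worker timeout, the proxy falls back to the global ProxyTimeout/Timeout (commonly 60 seconds or more), which matches the pattern of the whole edge only gradually starting to throw timeout errors after the backend stalls.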