Hi Mark,

On Wed, 13 Nov 2019 at 15:38, Mark Thomas <ma...@apache.org> wrote:

> On 12/11/2019 19:11, M. Manna wrote:
> > Hi Mark,
> >
> > Following my previous reply, we have now confirmed that it is indeed 8.5.45
> > with APR 1.2.23 that is causing such high JVM CPU usage.
> > We took 2 of the 50 servers out of the load balancer config,
> > reverted Tomcat, and redeployed. With near-identical user traffic, the
> > two servers respond normally with and without traffic on 8.5.41.
> > The JVM dump looks a lot better with 8.5.41.
> >
> > We do think that the recent changes in APR and some other Tomcat jar may
> > have caused a compatibility issue on the Windows Server 2016 (64-bit) platform.
> > But unfortunately, we cannot pinpoint exactly what change may have caused
> > this (i.e. actual OS vs Security Updates). With this in mind, we are also
> > wary of moving to 8.5.47 as we don't know if the same issue will occur
> > again. Since 8.5.41 has been packaged with the previously accepted application
> > installer, we are more comfortable rolling back.
>
> Just to confirm, you see this high CPU usage with a clean install (no
> additional web applications deployed, no configuration changes) on
> Windows 2016 DataCenter (64-bit)?
>
> If this is the case, it should be fairly easy to reproduce.
>
> Mark
>
We do not deploy multiple applications. In fact, under Tomcat's
webapps/ROOT we only have one application (ours). Each Tomcat instance is
hosted on a VM (50 in total) and all of them are identically configured
(server.xml, web.xml, logging, CPU/RAM).
We have not made any other configuration change between 8.5.41 and 8.5.45.
And yes, I agree with you that it's fairly easy to reproduce.
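
In case it helps anyone trying to reproduce this, here is a minimal sketch of how CPU figures like the ones quoted further down can be sampled from inside the JVM. It assumes a HotSpot/OpenJDK runtime, where the platform bean can be cast to com.sun.management.OperatingSystemMXBean. The class name CpuSampler and its two helper methods are made up for illustration and are not our production code; only the MXBean calls come from the JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class CpuSampler {

    /** Converts a 0.0-1.0 load fraction to a percentage; a negative input means "not available". */
    static double toPercent(double load) {
        return load < 0 ? -1.0 : load * 100.0;
    }

    /** Returns the name of the live thread with the most accumulated CPU time, or null if unknown. */
    static String busiestThread() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return null;
        }
        long bestId = -1;
        long bestNanos = -1;
        for (long id : threads.getAllThreadIds()) {
            long nanos = threads.getThreadCpuTime(id); // -1 if the thread has already died
            if (nanos > bestNanos) {
                bestNanos = nanos;
                bestId = id;
            }
        }
        if (bestId < 0) {
            return null;
        }
        ThreadInfo info = threads.getThreadInfo(bestId);
        return info == null ? null : info.getThreadName();
    }

    public static void main(String[] args) {
        // Assumption: the cast works on HotSpot/OpenJDK; other JVMs may not provide this interface.
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();

        System.out.println("process CPU %: " + toPercent(os.getProcessCpuLoad()));
        System.out.println("system CPU %:  " + toPercent(os.getSystemCpuLoad()));
        System.out.println("busiest thread: " + busiestThread());
    }
}
```

Run inside the Tomcat JVM (for example from a diagnostic servlet) on an idle instance, this should show whether a thread such as the "https-openssl-apr-0.0.0.0-8443-Poller" from the dump below is the one accumulating CPU time.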


Thanks,


>
> >
> > I would appreciate it if this could be looked into.
> >
> > On Tue, 12 Nov 2019 at 11:27, M. Manna <manme...@gmail.com> wrote:
> >
> >> Hey Mark (appreciate your response in US holiday time)
> >>
> >> On Tue, 12 Nov 2019 at 07:51, Mark Thomas <ma...@apache.org> wrote:
> >>
> >>> On November 12, 2019 12:54:53 AM UTC, "M. Manna" <manme...@gmail.com>
> >>> wrote:
> >>>> Just to give an update again:
> >>>>
> >>>> 1) We reverted the APR to 1.2.21 - but observed no difference.
> >>>> 2) We took 3 thread dumps at 1-minute intervals (without any user
> >>>> sessions) - all threads are Tomcat's internal pool threads.
> >>>>
> >>>> When we checked the thread details (using fastthread.io) - we didn't see any
> >>>> of our application stack. Since there is no user traffic, this is coming
> >>>> from Tomcat internally. At this stage, we cannot really figure out what's
> >>>> the root cause.
> >>>>
> >>>> Any help is appreciated.
> >>>
> >>> Migrated from what (full version info please)?
> >>>
> >>  from 8.5.41 to 8.5.45 (we migrate 3 times a year, last was in June)
> >>
> >>>
> >>> Operating system exact version?
> >>>
> >>  Microsoft Windows Server 2016 DataCentre (64-bit)
> >>
> >>>
> >>> JRE vendor and exact version?
> >>>
> >>  C:\jdk1.8.0\bin>java.exe -version
> >> java version "1.8.0_162"
> >> Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
> >> Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
> >>
> >>
> >>> Do you see the same behavior with the latest 8.5.x and latest Tomcat
> >>> Native?
> >>>
> >>   We are using APR 1.2.23, which I can also see in the latest Tomcat. Due to
> >> production due diligence we cannot roll to a different version that easily.
> >> Normally, we lag behind by 2 monthly releases of Tomcat. We also reverted
> >> the APR to 1.2.21 (but no difference).
> >>
> >>>
> >>> What triggers this behaviour?
> >>>
> >>  That is quite strange. Due to US holidays, we had low traffic on our
> >> servers, and nothing has crept in to suggest that it's application-driven.
> >> We took one Tomcat instance out of the 50 and removed all user
> >> sessions (i.e. no application activity or threads). Upon restart of
> >> Tomcat, the CPU spike lingered past the initial servlet startup period. We
> >> monitored that over 1-2 hours but there was no difference.
> >>
> >>>
> >>> How often do you see this behaviour?
> >>>
> >> We took 2 sets of data:
> >> 1) 3 jstack dumps at 10-second intervals.
> >> 2) 3 jstack dumps at 1-minute intervals.
> >>
> >> Both of the above reveal that all background threads (http, pool etc.) were
> >> from Tomcat. We didn't have any application threads lingering in those 3
> >> samples. So yes, we see this almost all the time if we take samples.
> >> However, when we compared with pre-production instances (with Windows
> >> Server R2 x64), we don't see such an abnormal spike. In fact, the
> >> application instance doesn't incur such a big CPU spike. Whilst composing
> >> this email, I am now wondering whether the APR is indeed incompatible with
> >> Windows Server R2 (or the presence of any Windows Updates), which blocks the
> >> native poll() call longer than usual.
> >>
> >> An example is that on Windows Server 2012, the APR poll() call takes about
> >> 30% CPU time, but with Windows Server 2016 it's almost always 95%.
> >>
> >>
> >>>
> >>> And anything else you think might be relevant.
> >>>
> >>
> >> We are using end-to-end encryption using APR (with Certificate and
> >> SSLConfig resource setup in server.xml). But it has survived the past 3 Tomcat
> >> upgrades without any issue.
> >> Apart from the OS, we don't have any obvious culprit identified at the moment.
> >>
> >> Thanks,
> >>
> >>>
> >>> Mark
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>> On Mon, 11 Nov 2019 at 20:57, M. Manna <manme...@gmail.com> wrote:
> >>>>
> >>>>> Hello All,
> >>>>>
> >>>>> Any thoughts regarding this? Slightly clueless at this point, so any
> >>>>> direction will be appreciated.
> >>>>>
> >>>>> We are seeing the poll taking all the CPU time. We are using
> >>>>> OperatingSystemMXBean.getProcessCpuLoad() and
> >>>>> OperatingSystemMXBean.getSystemCpuLoad() to get our metrics (then x100 to
> >>>>> get the percentage).
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>
> >>>>> On Mon, 11 Nov 2019 at 17:46, M. Manna <manme...@gmail.com> wrote:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> After migrating to 8.5.45, we are seeing a lot of CPU load in the following
> >>>>>> JVM thread dump:
> >>>>>>
> >>>>>> "https-openssl-apr-0.0.0.0-8443-Poller" : 102 : RUNNABLE :
> >>>>>> cpu=172902703125000 : cpuLoad= 74.181015
> >>>>>>
> >>>>>> BlockedCount:8464 BlockedTime:0 LockName:null LockOwnerID:-1
> >>>>>> LockOwnerName:null
> >>>>>>
> >>>>>> WaitedCount:5397 WaitedTime:0 InNative:false IsSuspended:false
> >>>>>>
> >>>>>>             at org.apache.tomcat.jni.Poll.poll(Poll.java:-2)
> >>>>>>
> >>>>>>             at org.apache.tomcat.util.net.AprEndpoint$Poller.run(AprEndpoint.java:1547)
> >>>>>>
> >>>>>>             at java.lang.Thread.run(Thread.java:748)
> >>>>>>
> >>>>>>
> >>>>>> These appear after 2-3 successful JVM dumps. Is this something
> >>>>>> familiar to anybody?
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>
> >>>
> >
