One thing I didn't respond to:

The VM host is unlikely to have anything to do with this. We consistently see three events coincide in time:

1. Full GC
2. High CPU
3. App freezes

So it seems really unlikely the host is responsible.

The odd thing is that the same app, on the same OS, in other installations (where of course we also see periodic Full GCs) never shows any freeze.

Regards,

George I. Develekos | Senior SW Engineer | t: +30.210.6930664 | e: gdevele...@omilia.com

------------------------------------------------------------------------

Technology that Listens, Understands and Cares - www.omilia.com

On 11/16/2016 6:11 PM, john.e.gr...@wellsfargo.com wrote:

Sorry for top posting.  The format got weird.

Those numbers aren’t bad. The most important number for me is the throughput on the summary tab. Yours is 99.38%. That means the JVM was doing real work 99.38% of the time and garbage collecting the other .62%. You could improve that if you worked hard enough, but going from 99.4 to 99.5 or 99.6 probably isn’t worth the effort. If the number was 90% or something, then you’d have more room for improvement.
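
To put that .62% in absolute terms: assuming the log really does cover about 19 hours, that is roughly 0.0062 x 68,400 seconds, or a bit over 400 seconds (around seven minutes) of total GC time across the whole period.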

Look at the GC performance numbers on the summary tab. Obviously minor GCs are much faster in this regard than major GCs. You can reduce your total GC time by increasing the size of your young generation. Young collections will become less frequent (each one may take a bit longer), while old collections become fewer and faster. Overall the total time will be less than it is now and the longest pauses will be shorter.

As others have said, though, something doesn’t add up. CMS is only stop-the-world during certain phases. (Not the ones with “concurrent” in the name.) If you feel these GC events coincide with pauses in your app, you can try a thread dump or three (kill -3 <pid>) during the pause. Use a tool like Samurai to parse the output. This might only be practical for longer pauses, though.
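
If you want to grab a few dumps automatically while a pause is in progress, a rough sketch (the PID and the sleep interval are placeholders; adjust for your setup):

  PID=12345                                  # placeholder: the Tomcat JVM's process id (e.g. from ps -ef | grep java)
  for i in 1 2 3; do kill -3 $PID; sleep 10; done
  # each dump goes to the JVM's stdout, typically catalina.out, which Samurai can parse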

Also, is it possible the VM itself is having a problem? Maybe you should talk to your virtualization team to see how stressed the hardware is. VMware has an informative Java best-practices doc: http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/enterprise-java-applications-on-vmware-best-practices-guide.pdf

John

*From:* George I. Develekos [mailto:gdevele...@omilia.com]
*Sent:* Wednesday, November 16, 2016 5:21 AM
*To:* users@tomcat.apache.org
*Subject:* Re: Please help with Tomcat Garbage Collection

I'm attaching three screenshots of the GCViewer app as it processed the complete *gc.log* file (about 19 hours).

Please have a look and advise on what I can do to limit Full-GC times. As of now I have a recommendation to increase the Young Gen.

The setup in summary:

We are using Java 6 (stuck with CentOS 5.8 at this time) and Tomcat 7.0.64.

Xmx is 5G, Xms is 2G, and GC options are -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
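
For completeness, in setenv.sh terms that boils down to roughly the following (the variable name and file are the usual Tomcat convention; this is a paraphrase rather than our exact file):

  CATALINA_OPTS="$CATALINA_OPTS -Xms2g -Xmx5g -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"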

On 11/15/2016 11:45 PM, john.e.gr...@wellsfargo.com wrote:

        -----Original Message-----

        From: George I. Develekos [mailto:gdevele...@omilia.com]

        Sent: Tuesday, November 15, 2016 3:00 PM

        To: users@tomcat.apache.org

        Subject: Re: Please help with Tomcat Garbage Collection

        The system does very little swapping, both when it's GC'ing and when it's not.

        Less than 100MB worth of swap is taken.

        Giving Tomcat its own HW is not an option at this time, especially as there's no guarantee it'll solve the problem. Besides, it would be a VM anyway, not physical dedicated HW. The current server is also a VM.

        On 15-Nov-16 10:55 PM, Zdeněk Henek wrote:

            I would start with moving this tomcat to its own hw.

            Did you check swap? These long pauses could be because part of your heap is swapped to hdd.

            Regards,

            Zdenek Henek

            On Tue, Nov 15, 2016, 21:37 George I. Develekos <gdevele...@omilia.com> wrote:

                On 15-Nov-16 10:22 PM, Christopher Schultz wrote:

                    George,

                    On 11/15/16 10:46 AM, George I. Develekos wrote:

                        Hello guys,

                        We are having problems on a production system with very long "full GC" times, as long as 1200 sec real time (!!!).

                        We are using Java 6 (stuck with CentOS 5.8 at this time) and Tomcat 7.0.64.

                        Xmx is 5G, Xms is 2G, and GC options are -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode

                        No other custom memory-related settings are in place.

                        Looking at the GC log, the last few Full-GC entries are:

                        1367.020: [Full GC 1367.020: [CMS: 1178831K->527456K(1926784K), 2.1117220 secs] 1250378K->527456K(2080128K), [CMS Perm : 169762K->56187K(169984K)] icms_dc=0 , 2.1118160 secs] [Times: user=1.96 sys=0.13, real=2.11 secs]

                        2579.317: [Full GC 2579.317: [CMS2581.876: [CMS-concurrent-mark: 2.558/1212.733 secs] [Times: user=113.05 sys=28.01, real=1212.49 secs]

                        3539.969: [Full GC 3539.969: [CMS3540.056: [CMS-concurrent-sweep: 1.571/23.223 secs] [Times: user=6.12 sys=1.36, real=23.21 secs]

                        4070.456: [Full GC 4070.457: [CMS: 1252569K->591200K(1926784K), 2.3447040 secs] 1270617K->591200K(2080128K), [CMS Perm : 169983K->56598K(169984K)] icms_dc=0 , 2.3448140 secs] [Times: user=2.18 sys=0.14, real=2.34 secs]

                        What can we do?

                    1367.020 Full GC duration=2.11s

                    2579.317 Full GC duration=1212.49s

                    So your full GC immediately started another full GC that took 20 minutes?

                    Are you only showing certain Full GC activity from your log, or is that everything?

                    CMS should have a mark and then a sweep each time, but your times don't seem to add up.

                    Also note that the whole point of CMS is that there isn't any stop-the-world during the mark portion of the process.

                    Are you actually experiencing a problem, or are you just suffering from instrumentor's remorse?

                    -chris

                Chris,

                What I listed is the result of the command:

                grep "Full GC" gc.log

                So (obviously) I have skipped other GC activity, i.e. whatever GC activity didn't include the "Full GC" string.
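
                If more context would help, something along these lines should pull the surrounding lines as well (the -B/-A line counts are just an example):

                grep -B 5 -A 5 "Full GC" gc.log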

                Yes, we are having app trouble due to the GC delays, so this is a real problem. Our application has real-time constraints, so the GC delays cannot be tolerated. I selected those GC options _in order to avoid_ long GC times.

                Additionally, these periods coincide with high CPU for that JVM process.  From the 5-20% CPU where it normally sits, it jumps to 60% or more.

                Once GC is done, our app rushes to catch up with tasks that had to wait for GC to finish.

                Answering another question from a member who has kindly responded: yes, the server is running other stuff. Basically it runs three Tomcats, the main one being this one. It also runs a DB2 database that has close-to-zero activity.

                George

    It might be helpful if you could post a larger chunk of your GC log, at least long enough to cover the start and end of the CMS phases and maybe even more.  Additionally, try using a tool like GCViewer to analyze the log.
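
    For what it's worth, GCViewer can be launched straight on the file, something like this (the jar name/version is just an example; use whatever build you downloaded):

    java -jar gcviewer-1.36.jar gc.log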

    How many CPUs do you have?  60% CPU usage isn't usually a big deal.
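
    (If you're not sure, something like grep -c ^processor /proc/cpuinfo run inside the VM will show how many CPUs the guest sees.)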

    Like Chris already said, this is not a stop-the-world phase, so your application should continue working in parallel with the garbage collector.

    Looks like your young generation is only 150MB (2080128K - 1926784K).  That's very small for a 2-5GB heap.  Are you explicitly setting it somewhere or is the JVM choosing that for you?  It's so small that your old generation might be filling up faster than it should, leading to more frequent full collections.  You could try setting the young generation to something like 25-50% of the total heap.  You'd get a lot of small pauses as the young gen is collected but fewer long ones.
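
    A sketch of what that could look like (values are illustrative and would need tuning; use either an explicit size or a ratio, not both):

    -Xmn1536m          # fixed 1.5GB young generation, roughly 30% of the 5GB max heap
    -XX:NewRatio=2     # alternatively: old/young ratio of 2, i.e. the young gen is one third of the heap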

    John
