On Feb 20, 2013, at 3:52 AM, Zoran Avtarovski wrote:

> Hi Guys,
> 
> It's been a while but the nature of this problem means it may be a while
> between crashes. But we just had a big one which hung the system and
> required a reboot.

Can you elaborate on this?  What OS are you running?  What do you mean by 
"hung the system"?  Did you get a kernel panic or BSOD?

> I have changed the Tomcat options, in line with all the advice and
> material I read, to be as follows:

This can be dangerous, especially if you haven't tested the settings and 
verified that they improve performance and lower GC overhead for your system 
and applications.  Applications are unique, and what works to tune one of them 
may not work well for others.
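
One rough way to put a number on "GC overhead" before and after a settings
change is to read the JVM's own collector counters.  The sketch below is an
illustration only: the class and method names are made up for this example, and
it reports on whichever JVM it runs in, so it would have to be called from
inside the Tomcat process (for example from a servlet or JSP) to say anything
about your application.  The times are the approximate accumulated values the
JVM reports, not exact pause times.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    // Share of elapsed time spent collecting, as a percentage.
    // Returns 0 when the uptime is not positive.
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    // Adds up the accumulated collection time of every collector in this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, overheadPercent(gc, uptime));
    }
}
```

Comparing this figure under the same load test with and without a given option
is one way to check whether the option actually helps.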

> 
> -server -Xms1460m -Xmx11460m -Djava.awt.headless=true
> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
> -XX:MaxPermSize=512M -XX:NewSize=4500m -XX:+CMSClassUnloadingEnabled
> -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit
> -XX:CMSInitiatingOccupancyFraction=80 -verbose:gc -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps -Xloggc:/usr/local/tomcat/logs/gc.log

First, what JVM are you running (vendor and version)?  If you are running 
anything but the latest version of that JVM, upgrade to the latest version 
and see if the problem is still present.
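
If it is not obvious which JVM Tomcat is actually using, the standard system
properties will tell you.  A minimal sketch (the class name is just for
illustration); it only reports the JVM it runs in, so run it with the same
java binary that starts Tomcat:

```java
public class JvmInfo {
    // Builds a one-line description from standard system properties.
    static String describe() {
        return System.getProperty("java.vendor") + " / "
                + System.getProperty("java.vm.name") + " / "
                + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```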

Some comments on your JVM options...

1.) You have -XX:+UseConcMarkSweepGC listed twice.

2.) You have -XX:+CMSIncrementalMode set.  Does the following describe your 
system?  If not, remove this setting.

"This feature is useful when applications that need the low pause times 
provided by the concurrent collector are run on machines with small numbers of 
processors (e.g., 1 or 2)."

  http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#icms

3.) I'm not a fan of specifying "-XX:NewSize=4500m".  I think the JVM's default 
usually works fine, plus it's difficult to manually specify this value and get 
it correct.  My suggestion would be to remove this option, unless you have load 
tested your application with and without the setting and you can 100% guarantee 
that it is helping performance.

4.) You have set -XX:-UseGCOverheadLimit, which could be dangerous.  "This 
feature is designed to prevent applications from running for an extended period 
of time while making little or no progress because the heap is too small."

  http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms.oom

Disabling this would seem unnecessary if your JVM options are tuned correctly.

5.) This option, -XX:CMSInitiatingOccupancyFraction, is another one where I 
would suggest using the JVM default, unless you have load tested with and 
without the setting and can show that setting this value improves 
performance.
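
Repeated options like the one in item 1 are easy to miss in a long line.  As a
rough sketch, the JVM can list the arguments it was started with, and those can
be checked for repeats.  The class and helper names here are hypothetical, and
the matching is deliberately simple: -XX options are compared by name (ignoring
the +/- toggle and any value), everything else must match exactly.

```java
import java.lang.management.ManagementFactory;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class JvmFlagCheck {
    // Reduces "-XX:+Name", "-XX:-Name" and "-XX:Name=value" to "-XX:Name".
    // Any other option is returned unchanged.
    static String key(String option) {
        if (!option.startsWith("-XX:")) {
            return option;
        }
        String body = option.substring(4);
        if (body.startsWith("+") || body.startsWith("-")) {
            body = body.substring(1);
        }
        int eq = body.indexOf('=');
        return "-XX:" + (eq >= 0 ? body.substring(0, eq) : body);
    }

    // Returns the keys that occur more than once, in order of first repeat.
    static Set<String> repeated(List<String> options) {
        Set<String> seen = new HashSet<String>();
        Set<String> duplicates = new LinkedHashSet<String>();
        for (String option : options) {
            String k = key(option);
            if (!seen.add(k)) {
                duplicates.add(k);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Arguments this JVM was started with (not the arguments to main).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Repeated options: " + repeated(jvmArgs));
    }
}
```

Because toggles are ignored, this also flags an option that is switched on in
one place and off in another, which is usually worth a second look.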

> 
> The garbage collection log had the following details just prior to the
> crash:
> 
> 4163.757: [GC [1 CMS-initial-mark: 0K(5376K)] 1834200K(4152576K),
> 1.9237250 secs] [Times: user=1.92 sys=0.00, real=1.92 secs]
> 4165.682: [CMS-concurrent-mark-start]
> 4165.834: [CMS-concurrent-mark: 0.152/0.152 secs] [Times: user=0.15
> sys=0.00, real=0.16 secs]
> 4165.834: [CMS-concurrent-preclean-start]
> 4165.849: [CMS-concurrent-preclean: 0.015/0.015 secs] [Times: user=0.01
> sys=0.00, real=0.01 secs]
> 4165.849: [CMS-concurrent-abortable-preclean-start]
> CMS: abort preclean due to time 4171.285:
> [CMS-concurrent-abortable-preclean: 5.035/5.436 secs] [Times: user=5.05
> sys=0.00, real=5.44 secs]
> 4171.285: [GC[YG occupancy: 1834200 K (4147200 K)]4171.286: [Rescan
> (parallel) , 1.5184720 secs]4172.804: [weak refs processing, 0.0001420
> secs]4172.804: [class unloading, 0.0118860 secs]4172.816: [scrub symbol &
> string tables, 0.0141570 secs] [1 CMS-remark: 0K(5376K)]
> 1834200K(4152576K), 1.5484470 secs]
> 
>       
> And the JavaMelody monitoring indicated the crash occurred at the same
> time as garbage collection took place. Basically the Garbage collector
> time chart peaked at 20 and ran for about 15 minutes.
> 
> 
> I had a look at the garbage collector chart over a longer period and when
> the collector runs more frequently it appears to be more stable.
> 
> Any advice on where to go next?

1.) Look at the "Basic Troubleshooting" section here.

  http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#icms.troubleshooting

2.) If possible, take some heap dumps when you start to notice a problem.  Then 
you can analyze them with a profiler and see what is happening in the heap.

3.) Load test with a profiler hooked directly up to your application.  Try to 
recreate the problem.
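
For item 2, one possible way to take a heap dump from inside the application is
the HotSpot diagnostic MXBean.  This is only a sketch under some assumptions: a
HotSpot-based JVM, Java 7 or later (for getPlatformMXBean), and a class name
and default file name that are made up for the example.  The target file must
not already exist, and newer JDKs expect the name to end in ".hprof".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    // Writes a heap dump of the current JVM to the given path.
    // liveOnly = true dumps only objects that are still reachable.
    static void dump(String path, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name; pass a path as the first argument to override.
        String path = args.length > 0 ? args[0] : "tomcat-heap.hprof";
        dump(path, true);
        System.out.println("Heap dump written to " + path);
    }
}
```

As with the other sketches, this dumps the JVM it runs in, so it would need to
be triggered from within Tomcat.  With a heap of several gigabytes, expect the
dump to take a while and to need a similar amount of free disk space.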

Hope that helps.

Dan



> 
> 
> Z.
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
