Hi Shanti! Thanks for your script! I
ran it (with minor modifications), and it turns out that the
garbage collection threads are indeed taking up the majority of
CPU (see below). What would you recommend I do to
tune/troubleshoot GC further? My server has 32G or RAM, JVM has
heap size of 26G
Thanks, Kirill PID CPU Mem JStack Info 0x253d 80 90.7 "GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000001938800 nid=0x253d runnable 0x2540 76 90.7 "GC task thread#6 (ParallelGC)" prio=10 tid=0x000000000193e000 nid=0x2540 runnable 0x253b 74 90.7 "GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000001934800 nid=0x253b runnable 0x253e 74 90.7 "GC task thread#4 (ParallelGC)" prio=10 tid=0x000000000193a000 nid=0x253e runnable 0x2541 74 90.7 "GC task thread#7 (ParallelGC)" prio=10 tid=0x000000000193f800 nid=0x2541 runnable 0x253c 66 90.7 "GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000001936800 nid=0x253c runnable 0x253a 47 90.7 "GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000001933000 nid=0x253a runnable 0x253f 39 90.7 "GC task thread#5 (ParallelGC)" prio=10 tid=0x000000000193c000 nid=0x253f runnable 0x2542 19 90.7 "VM Thread" prio=10 tid=0x00000000019b6000 nid=0x2542 runnable 0x2539 0 90.7 "main" prio=10 tid=0x0000000001925000 nid=0x2539 runnable [0x00007fee9d444000] 0x2543 0 90.7 "Reference Handler" daemon prio=10 tid=0x00000000019bd800 nid=0x2543 runnable [0x00007fee500d8000] 0x2544 0 90.7 "Finalizer" daemon prio=10 tid=0x00000000019c5800 nid=0x2544 in Object.wait() [0x00007fee4ffd7000] 0x2545 0 90.7 "Signal Dispatcher" daemon prio=10 tid=0x0000000001a12000 nid=0x2545 runnable [0x0000000000000000] 0x2546 0 90.7 "C2 CompilerThread0" daemon prio=10 tid=0x00007fee48001000 nid=0x2546 waiting on condition [0x0000000000000000] 0x2547 0 90.7 "C2 CompilerThread1" daemon prio=10 tid=0x00007fee48004000 nid=0x2547 waiting on condition [0x0000000000000000] 0x2548 0 90.7 "Service Thread" daemon prio=10 tid=0x0000000001a14800 nid=0x2548 runnable [0x0000000000000000] 0x2549 0 90.7 "VM Periodic Task Thread" prio=10 tid=0x0000000001a1f800 nid=0x2549 waiting on condition 0x254a 0 90.7 "GC Daemon" daemon prio=10 tid=0x00007fee404b3000 nid=0x254a in Object.wait() [0x00007fee4e1bd000] 0x254d 0 90.7 "ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon prio=10 tid=0x00007fee40389800 nid=0x254d runnable [0x00007fee4d820000] 0x254e 0 90.7 "http-apr-8080-Poller-0" daemon prio=10 tid=0x00007fee40378000 nid=0x254e in Object.wait() [0x00007fee4d661000] 0x254f 0 90.7 "http-apr-8080-CometPoller-0" daemon prio=10 tid=0x00007fee40395000 nid=0x254f in Object.wait() [0x00007fee4d4a2000] 0x2550 0 90.7 "http-apr-8080-Sendfile-0" daemon prio=10 tid=0x00007fee40397000 nid=0x2550 in Object.wait() [0x00007fee4d2b0000] 0x2551 0 90.7 "http-apr-8080-Acceptor-0" daemon prio=10 tid=0x00007fee4072c000 nid=0x2551 runnable [0x00007fee4d1af000] 0x2552 0 90.7 "http-apr-8080-AsyncTimeout" daemon prio=10 tid=0x00007fee4072e000 nid=0x2552 sleeping[0x00007fee4d0ac000] 0x2553 0 90.7 "http-apr-8443-Poller-0" daemon prio=10 tid=0x00007fee40730000 nid=0x2553 in Object.wait() [0x00007fee4cfab000] 0x2554 0 90.7 "http-apr-8443-CometPoller-0" daemon prio=10 tid=0x00007fee40732000 nid=0x2554 in Object.wait() [0x00007fee4cdee000] 0x2555 0 90.7 "http-apr-8443-Acceptor-0" daemon prio=10 tid=0x00007fee40734800 nid=0x2555 runnable [0x00007fee4cced000] 0x2556 0 90.7 "http-apr-8443-AsyncTimeout" daemon prio=10 tid=0x00007fee4073b800 nid=0x2556 sleeping[0x00007fee4cbec000] 0x2557 0 90.7 "ajp-apr-8009-Poller-0" daemon prio=10 tid=0x00007fee4073d800 nid=0x2557 runnable [0x00007fee4ca2f000] 0x2558 0 90.7 "ajp-apr-8009-CometPoller-0" daemon prio=10 tid=0x00007fee4073f800 nid=0x2558 in Object.wait() [0x00007fee4c872000] 0x2559 0 90.7 "ajp-apr-8009-Acceptor-0" daemon prio=10 tid=0x00007fee40741800 nid=0x2559 runnable [0x00007fee4c771000] 0x255a 0 90.7 "ajp-apr-8009-AsyncTimeout" daemon prio=10 tid=0x00007fee40743800 nid=0x255a sleeping[0x00007fee4c670000] 0x255f 0 90.7 "ajp-apr-8009-exec-5" daemon prio=10 tid=0x0000000001cef000 nid=0x255f runnable [0x00007fee4c157000] 0x2560 0 90.7 "ajp-apr-8009-exec-6" daemon prio=10 tid=0x0000000001e93800 nid=0x2560 runnable [0x00007fee47ffd000] 0x2561 0 90.7 "ajp-apr-8009-exec-7" daemon prio=10 tid=0x0000000001c1e800 nid=0x2561 runnable [0x00007fee4db39000] 0x2562 0 90.7 "ajp-apr-8009-exec-8" daemon prio=10 tid=0x0000000001d41000 nid=0x2562 runnable [0x00007fee4dc3a000] 0x2563 0 90.7 "ajp-apr-8009-exec-9" daemon prio=10 tid=0x0000000001bf9800 nid=0x2563 runnable [0x00007fee47efb000] 0x2564 0 90.7 "ajp-apr-8009-exec-10" daemon prio=10 tid=0x0000000001cfe800 nid=0x2564 waiting on condition [0x00007fee47dfa000] 0x2586 0 90.7 "ajp-apr-8009-exec-12" daemon prio=10 tid=0x0000000001e85000 nid=0x2586 waiting on condition [0x00007fee47bf6000] 0x2587 0 90.7 "pool-1-thread-1" prio=10 tid=0x0000000001fe4000 nid=0x2587 waiting on condition [0x00007fee47af3000] 0x2635 0 90.7 "ajp-apr-8009-exec-15" daemon prio=10 tid=0x000000000232d800 nid=0x2635 runnable [0x00007fee38991000] 0x2637 0 90.7 "ajp-apr-8009-exec-17" daemon prio=10 tid=0x0000000002392800 nid=0x2637 waiting on condition [0x00007fee38790000] 0x2638 0 90.7 "ajp-apr-8009-exec-18" daemon prio=10 tid=0x0000000002395000 nid=0x2638 waiting on condition [0x00007fee3868f000] 0x263a 0 90.7 "ajp-apr-8009-exec-20" daemon prio=10 tid=0x0000000002312800 nid=0x263a waiting on condition [0x00007fee3848b000] 0x263d 0 90.7 "ajp-apr-8009-exec-23" daemon prio=10 tid=0x0000000002324800 nid=0x263d runnable [0x00007fee38187000] 0x50b 0 90.7 "http-apr-8443-exec-1" daemon prio=10 tid=0x0000000002209000 nid=0x50b waiting on condition [0x00007fee232b1000] 0x50c 0 90.7 "http-apr-8443-exec-2" daemon prio=10 tid=0x000000000213c000 nid=0x50c waiting on condition [0x00007fee233b2000] 0x50d 0 90.7 "http-apr-8443-exec-3" daemon prio=10 tid=0x000000000239e000 nid=0x50d waiting on condition [0x00007fee22baa000] 0x50e 0 90.7 "http-apr-8443-exec-4" daemon prio=10 tid=0x0000000002aeb800 nid=0x50e waiting on condition [0x00007fee3939d000] 0x50f 0 90.7 "http-apr-8443-exec-5" daemon prio=10 tid=0x0000000002ad3000 nid=0x50f waiting on condition [0x00007fee38f99000] 0x510 0 90.7 "http-apr-8443-exec-6" daemon prio=10 tid=0x00000000025a1800 nid=0x510 waiting on condition [0x00007fee397a1000] 0x511 0 90.7 "http-apr-8443-exec-7" daemon prio=10 tid=0x0000000002188800 nid=0x511 waiting on condition [0x00007fee45db8000] 0x512 0 90.7 "http-apr-8443-exec-8" daemon prio=10 tid=0x00000000023b3800 nid=0x512 waiting on condition [0x00007fee44aa5000] 0x513 0 90.7 "http-apr-8443-exec-9" daemon prio=10 tid=0x0000000002164800 nid=0x513 waiting on condition [0x00007fee3dae4000] 0x514 0 90.7 "http-apr-8443-exec-10" daemon prio=10 tid=0x0000000002af6800 nid=0x514 waiting on condition [0x00007fee279f8000] 0x19a7 0 90.7 "http-apr-8080-exec-1" daemon prio=10 tid=0x0000000001ac2800 nid=0x19a7 waiting on condition [0x00007fee3747b000] 0x19a8 0 90.7 "http-apr-8080-exec-2" daemon prio=10 tid=0x000000000257d000 nid=0x19a8 waiting on condition [0x00007fee3767d000] 0x28b0 0 90.7 "http-apr-8080-exec-3" daemon prio=10 tid=0x000000000274c000 nid=0x28b0 waiting on condition [0x00007fee3f8f7000] 0x28b1 0 90.7 "http-apr-8080-exec-4" daemon prio=10 tid=0x00000000021c8800 nid=0x28b1 waiting on condition [0x00007fee3adb7000] 0x46e7 0 90.7 "Keep-Alive-Timer" daemon prio=10 tid=0x0000000002160800 nid=0x46e7 runnable [0x00007fee3d5df000] 0x29c5 0 90.7 "Keep-Alive-SocketCleaner" daemon prio=10 tid=0x000000000242b000 nid=0x29c5 in Object.wait() [0x00007fee1f877000] 0x6ef8 0 90.7 "http-apr-8080-exec-5" daemon prio=10 tid=0x0000000002a56000 nid=0x6ef8 waiting on condition [0x00007fee454af000] 0x6ef9 0 90.7 "http-apr-8080-exec-6" daemon prio=10 tid=0x00000000022de800 nid=0x6ef9 waiting on condition [0x00007fee45ab5000] 0x3d38 0 90.7 "http-apr-8080-exec-7" daemon prio=10 tid=0x0000000002233000 nid=0x3d38 waiting on condition [0x00007fee37980000] 0x3d39 0 90.7 "http-apr-8080-exec-8" daemon prio=10 tid=0x00000000021f7800 nid=0x3d39 waiting on condition [0x00007fee38087000] 0x64a3 0 90.7 "ajp-apr-8009-exec-36" daemon prio=10 tid=0x00000000022dd000 nid=0x64a3 waiting on condition [0x00007fee35158000] 0x64a6 0 90.7 "ajp-apr-8009-exec-39" daemon prio=10 tid=0x0000000002add800 nid=0x64a6 runnable [0x00007fee3bdc7000] 0x64a7 0 90.7 "ajp-apr-8009-exec-40" daemon prio=10 tid=0x00000000029c0800 nid=0x64a7 waiting on condition [0x00007fee24fce000] 0x64ab 0 90.7 "ajp-apr-8009-exec-44" daemon prio=10 tid=0x0000000002924000 nid=0x64ab runnable [0x00007fee20f8d000] 0x64ac 0 90.7 "ajp-apr-8009-exec-45" daemon prio=10 tid=0x0000000002555800 nid=0x64ac waiting on condition [0x00007fee256d5000] 0x64af 0 90.7 "ajp-apr-8009-exec-48" daemon prio=10 tid=0x0000000001f8a800 nid=0x64af waiting on condition [0x00007fee24ccb000] 0x64b0 0 90.7 "ajp-apr-8009-exec-49" daemon prio=10 tid=0x000000000257b000 nid=0x64b0 runnable [0x00007fee36d73000] 0x64b1 0 90.7 "ajp-apr-8009-exec-50" daemon prio=10 tid=0x0000000001bb3000 nid=0x64b1 runnable [0x00007fee36b71000] 0x64b2 0 90.7 "ajp-apr-8009-exec-51" daemon prio=10 tid=0x000000000224a000 nid=0x64b2 runnable [0x00007fee270ee000] 0x64b3 0 90.7 "ajp-apr-8009-exec-52" daemon prio=10 tid=0x000000000216e800 nid=0x64b3 runnable [0x00007fee20c8a000] 0xf1b 0 90.7 "http-apr-8080-exec-9" daemon prio=10 tid=0x0000000001c9e800 nid=0xf1b waiting on condition [0x00007fee3929c000] 0xf1c 0 90.7 "http-apr-8080-exec-10" daemon prio=10 tid=0x0000000001e4c000 nid=0xf1c waiting on condition [0x00007fee3d8e2000] 0x3eda 0 90.7 "ajp-apr-8009-exec-55" daemon prio=10 tid=0x0000000002155000 nid=0x3eda runnable [0x00007fee27cfb000] 0x3edb 0 90.7 "ajp-apr-8009-exec-56" daemon prio=10 tid=0x000000000232f800 nid=0x3edb runnable [0x00007fee27bfa000] 0x51c7 0 90.7 "ajp-apr-8009-exec-57" daemon prio=10 tid=0x000000000211c800 nid=0x51c7 waiting on condition [0x00007fee37c83000] On 9/27/12 12:17 PM, Shanti Suresh wrote: Hi Kirill, Like Mark, Bill and Jeff said, those threads are normal request-processing threads. I have included a script that might help with isolating high CPU issues with Tomcat.Also, I think it might be helpful to see how the Java heap is performing as well. Please bring up Jconsole and let it run over the week. Inspect the graphs for Memory, CPU and threads. Since you say that high CPU occurs intermittently several times during the week and clears itself, I wonder if it is somehow related with the garbage collection options you are using for the server. Or it may be a code-related problem. Things to look at may include: (1) Are high CPU times related to Java heap reductions happening at the same time? ==> GC possibly needs tuning (2) Are high CPU times related to increase in thread usage? ==> possible livelock in looping code? (3) how many network connections come into the Tomcat server during high-CPU times? Possible overload-related? Here is the script. I made a couple of small changes, for e.g., changing the username. But didn't test it after the change. During high-CPU times, invoke the script a few times, say 30 seconds apart. And then compare the thread-dumps. I like to use TDA for thread-dump analysis of Tomcat thread-dumps. Mark, et al, please feel free to help me refine this script. I would like to have a script to catch STUCK threads too :-) Let me know if anyone has a script already. Thanks. --------------high_cpu_diagnostics.pl:----- #!/usr/bin/perl # use Cwd; # Make a dated directory for capturing current diagnostics my ($sec,$min,$hour,$mday,$mon,$year, $wday,$yday,$isdst) = localtime time; $year += 1900; $mon += 1; my $pwd = cwd(); my $preview_diag_dir = "/tmp/Preview_Diag.$year-$mon-$mday-$hour:$min:$sec"; print "$preview_diag_dir\n"; mkdir $preview_diag_dir, 0755; chdir($preview_diag_dir) or die "Can't chdir into $preview_diag_dir $!\n"; # Capture Preview thread dump my $process_pattern = "preview"; my $preview_pid = `/usr/bin/pgrep -f $process_pattern`; my $login = getpwuid($<) ; if (kill 0, $preview_pid){ #Possible to send a signal to the Preview Tomcat - either "webinf" or "root" my $count = kill 3, $preview_pid; }else { # Not possible to send a signal to the VCM - use "sudo" system ("/usr/bin/sudo /bin/kill -3 $preview_pid"); } # Capture Preview thread dump system ("/usr/bin/jmap -dump:format=b,file=$preview_diag_dir/preview_heapdump.hprof $preview_pid"); # Gather the top threads; keep around for reference on what other threads are running @top_cmd = ("/usr/bin/top", "-H", "-n1", "-b"); @sort_cmd = ("/bin/sort", "-r", "-n", "-k", "9,9"); @sed_cmd = ("/bin/sed", "-n", "'8,$p'"); system("@top_cmd 1> top_all_threads.log"); # Get your tomcat user's threads, i.e. threads of user, "webinf" system('/usr/bin/tail -n+6 top_all_threads.log | /bin/sort -r -n -k "9,9" | /bin/grep webinf top_all_threads.log 1> top_user_webinf_threads.log'); # Get the thread dump my @output=`/usr/bin/jstack -l ${preview_pid}`; open (my $file, '>', 'preview_threaddump.txt') or die "Could not open file: $!"; print $file @output; close $file; open LOG, "top_user_webinf_threads.log" or die $!; open (STDOUT, "| tee -ai top_cpu_preview_threads.log"); print "PID\tCPU\tMem\tJStack Info\n"; while ($l = <LOG>) { chop $l; $pid = $l; $pid =~ s/webinf.*//g; $pid =~ s/ *//g; ## Hex PID is available in the Sun HotSpot Stack Trace */ $hex_pid = sprintf("%#x", $pid); @values = split(/\s+/, $l); $pct = $values[8]; $mem = $values[9]; # Debugger breakpoint: $DB::single = 1; # Find the Java thread that corresponds to the thread-id from the TOP output for my $j (@output) { chop $j; ($j =~ m/nid=$hex_pid/) && print $hex_pid . "\t" . $pct . "\t" . $mem . "\t" . $j . "\n"; } } close (STDOUT); close LOG; ------------------end of script -------------------------- Thanks. -Shanti On Thu, Sep 27, 2012 at 2:11 PM, Bill Miller < millebi.subscripti...@gmail.com> wrote:I agree; we have reproducible instances where PermGen is not set to our requirements on the Tomcat startup parameters and it will cause a "lockup" every time. Do some JMX monitoring and you may discover a memory spike that's killing Tomcat. Bill -----Original Message----- From: Jeff MAURY [mailto:jeffma...@gmail.com] Sent: September-27-2012 2:01 PM To: Tomcat Users List Subject: Re: high CPU usage on tomcat 7 This is probably due to out of memory, I have the same problem on my ubuntu ci machine Did you monitor your tomcat with jmx ? Jeff --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org --
Kirill Kireyev, PhD Founder/CTO instaGrok.com kir...@instagrok.com Twitter: @instaGrok FB: facebook.com/instagrok |
- high CPU usage on tomcat 7 Kirill Kireyev
- Re: high CPU usage on tomcat 7 Mark Thomas
- Re: high CPU usage on tomcat 7 Jeff MAURY
- RE: high CPU usage on tomcat 7 Bill Miller
- Re: high CPU usage on tomcat 7 Shanti Suresh
- Re: high CPU usage on tomcat 7 Shanti Suresh
- Re: high CPU usage on tomcat 7 Kirill Kireyev
- Re: high CPU usage on tomcat 7 mailingl...@j-b-s.de
- Re: high CPU usage on tomca... Jeff MAURY
- Re: high CPU usage on t... mailingl...@j-b-s.de
- Re: high CPU usage on tomcat 7 Kirill Kireyev
- Re: high CPU usage on tomcat 7 Shanti Suresh
- Re: high CPU usage on tomca... Shanti Suresh
- Re: high CPU usage on t... Shanti Suresh
- Re: high CPU usage on tomca... Kirill Kireyev
- RE: high CPU usage on t... Caldarale, Charles R
- RE: high CPU usage on t... Jeff MAURY
- Re: high CPU usage on t... Shanti Suresh
- Re: high CPU usage on tomcat 7 Mark Thomas
- Re: high CPU usage on tomcat 7 Jeff MAURY