Re: Sudden cpu spike

doug andrews Tue, 11 Aug 2009 13:27:51 -0700

I did get a thread dump from jconsole.

Whenever this condition occurs, I have several threads that look like this:


Stack trace:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:474)
com.webobjects.appserver.WOSessionStore.checkOutSessionWithID(WOSessionStore.java:191)
com.webobjects.appserver.WOApplication.restoreSessionWithID(WOApplication.java:1913)
com.webobjects.appserver._private.WOComponentRequestHandler._dispatchWithPreparedApplication(WOComponentRequestHandler.java:324)
com.webobjects.appserver._private.WOComponentRequestHandler._handleRequest(WOComponentRequestHandler.java:369)
com.webobjects.appserver._private.WOComponentRequestHandler.handleRequest(WOComponentRequestHandler.java:445)
com.webobjects.appserver.WOApplication.dispatchRequest(WOApplication.java:1687)
Application.dispatchRequest(Application.java:630)
com.webobjects.appserver._private.WOWorkerThread.runOnce(WOWorkerThread.java:144)
com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:226)
java.lang.Thread.run(Thread.java:613)

As mentioned before, this is a request for a session that never got checked back in.
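
A minimal sketch of how the number of threads parked in that frame could be counted from inside the JVM, using only `Thread.getAllStackTraces()` (available since Java 1.5). The class name `StuckThreadCounter` and the helper `countThreadsIn` are made up for illustration; the WebObjects class and method names are the ones from the trace above.

```java
import java.util.Map;

class StuckThreadCounter {

    // Counts the threads whose stack contains a frame for className.methodName.
    static int countThreadsIn(Map<Thread, StackTraceElement[]> traces,
                              String className, String methodName) {
        int count = 0;
        for (StackTraceElement[] stack : traces.values()) {
            for (StackTraceElement frame : stack) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    count++;
                    break; // count each thread only once
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Snapshot of every live thread in this JVM.
        int waiting = countThreadsIn(Thread.getAllStackTraces(),
                "com.webobjects.appserver.WOSessionStore",
                "checkOutSessionWithID");
        System.out.println("Threads in checkOutSessionWithID: " + waiting);
    }
}
```

A count that keeps growing between snapshots would be consistent with requests piling up behind a session that was never checked in.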

Has anyone else run into this problem, or does anyone know how to fix it?

As soon as this starts to happen, the CPU usage starts to climb to capacity, until the app is unreachable.

My session class does not override terminate or awake anymore, only sleep, which is:


public void sleep()
{
    try
    {
        makePageSecure = false;
        skipSecurityCheck = false;
    }
    catch (Exception i)
    {
        i.printStackTrace();
    }
    finally
    {
        super.sleep();
    }
}








On Aug 5, 2009, at 3:42 PM, Chuck Hill wrote:

It does work with Java 1.5. I have had trouble getting a thread dump like this from one Leopard server. I have not dug into it yet. The lack of thread names in jstack on OS X looks like a bug to me.
If you have GUI access to the server and http access out, you could try this: http://www.adaptj.com/main/download
jconsole is also worth a look.


Chuck

On Aug 5, 2009, at 11:22 AM, doug andrews wrote:
Does the method recommended on the gvcsitemaker site only work with Java 1.4?
I have 1.5, and all that shows up in /tmp/wothreaddump.log is:

Shared archive: sharing disabled for server vm
[2009-8-4 16:34:31 EDT] <main> WOApplication: Renamed previous WOOutputPath file to /Library/WebObjects/Logs/LeopardTest-1.20090804163431729
I modified /System/Library/WebObjects/JavaApplications/wotaskd.woa/Contents/Resources/SpawnOfWotaskd.sh and then rebooted the whole machine.
Also, my original assumption that this had something to do with WO 5.4 was wrong.
I can reproduce the error with WO 5.3



On Jul 31, 2009, at 5:52 PM, doug andrews wrote:
OK, never mind.
I did not read to the end of your last message.
I will try what's recommended on the gvcsitemaker site.

On Jul 31, 2009, at 5:27 PM, Chuck Hill wrote:
On Jul 31, 2009, at 2:17 PM, doug andrews wrote:
Memory: 4GB 800 MHz DDR2 FB-DIMM
Database: MySQL 4.1.15
Heap Space: Xmx1024m and Xms1024m
Output from top -u:
PID    COMMAND     %CPU    TIME      #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
17494  java        387.8%  8:24:25   81   1126   1091    10M    38M    736M   1322M
=8-0
20799  top         4.0%    0:00.79   1    18     29      604K   188K   1196K  18M
1826   Timbuktu H  2.5%    7:21.72   9    113    144     10M    8000K  16M    185M
1931   pmTool      2.1%    27:30.53  1    63     25      612K   1132K  1484K  27M
0      kernel_tas  1.9%    18:15.98  53   2      502     6116K  0      285M   112M
17761  Terminal    1.7%    0:08.43   3    105    188     4468K  9744K  12M    172M
1930   Activity M  1.6%    27:36.16  7    117    252     6616K  20M    15M    203M
58     hwmond      0.2%    17:11.70  3    65     46      2208K  416K   3852K  20M
183    WindowServ  0.2%    2:02.65   7    199    418     10M    18M    26M    164M
31     mDNSRespon  0.0%    0:10.91   2    36     27      572K   192K   1356K  20M
63     emond       0.0%    0:39.01   1    31     22      368K   1140K  1880K  27M
45     Python      0.0%    1:07.24   1    42     443     29M    1248K  33M    55M
44     java        0.0%    0:32.77   32   608    316     39M    16M    50M    322M
43     java        0.0%    0:55.18   32   661    321     37M    16M    49M    323M
17531  java        0.0%    0:13.42   32   603    312     46M    37M    63M    327M
17341  java        0.0%    0:20.98   31   691    309     59M    37M    77M    327M
17339  java        0.0%    0:13.08   30   585    305     42M    37M    58M    326M
17340  java        0.0%    0:14.07   30   590    302     42M    37M    59M    326M
17342  java        0.0%    0:14.69   32   604    310     45M    37M    62M    327M
40     ntpd        0.0%    0:06.40   1    15     19      196K   1

I will look at my session.terminate and sleep methods.
Can't do the kill QUIT right now because it's a live site.
a) at 388% CPU it is not very live  :-P
b) kill -QUIT <pid> just does a thread dump, it does not stop the process
See:
http://www.gvcsitemaker.com/gvc.webobjects/faq&mode=single&recordID=41413&nextMode=list
Note that if your wotaskd is not configured like this already, you will need to restart it and the application for kill -QUIT to provide you with any output.
Chuck
On Jul 31, 2009, at 5:06 PM, Chuck Hill wrote:
On Jul 31, 2009, at 1:47 PM, doug andrews wrote:
This is still an issue for us.

We have a site with the following:
OS 10.5.7
2.8 GHz Quad-Core Intel Xeon
How much memory?
WebObjects 5.4.3
Java 1.5.0_19
Apache 2.2 (We are using the Apache adaptor from Apple right now, and not the Wonder version)
There is one instance of our app.
There are about 7 sessions running.
Each session has a separate connection to the database.
Which database? Which version? Separate database connections can consume extra memory. Are you allocating enough heap space to the JVM?
Within about 5 hours or so, all 4 processors run at close to 100%, until eventually the app stops responding.
Which process(es) are using this CPU? top -u will quickly show this. I doubt that you will find it is Java (i.e. your WO app). You might find it is the database.
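
top only attributes CPU to a process. If the Java process does turn out to be the one at the top of the list, a sketch like the following could break the usage down per thread from inside the JVM. It uses the standard `java.lang.management` API (Java 1.5 and later). The names `ThreadCpuReport` and `report` are hypothetical, and how this would be exposed in a running WebObjects app is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

class ThreadCpuReport {

    // Returns one line per live thread: CPU time consumed so far and the thread name.
    // Returns an empty list if this JVM cannot measure per-thread CPU time.
    static List<String> report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<String>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return lines;
        }
        for (long id : mx.getAllThreadIds()) {
            long nanos = mx.getThreadCpuTime(id);   // -1 if the thread is no longer alive
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread is no longer alive
            if (nanos >= 0 && info != null) {
                lines.add((nanos / 1000000L) + " ms  " + info.getThreadName());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

Taking two reports a few seconds apart and comparing them shows which threads are actually burning CPU, as opposed to threads that are merely blocked.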
We have a few sites that have this problem, and the only thing I can see in common is running WO 5.4.x on Leopard.
I did manage to get the jstack output when this machine's CPU usage was close to capacity.
Below is the jstack output.

Does anything jump out as obviously wrong to anybody?
Not exactly. It looks idle. It was really slow to respond at some time in the past and created a lot of worker threads. Eventually those did finish processing. You have a bug in your session's sleep() and / or terminate() methods that is preventing them from being checked in. The bug is likely either a missing call to super (unlikely, you'd notice this right away), or an exception that is thrown. Wrap the body of each in try...finally:
public void sleep() {
        try {
                // your buggy code here :-)
        }
        finally {
                super.sleep();
        }
}
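
To illustrate why the finally block matters, here is a small self-contained sketch. `BaseSession`, `GuardedSession` and `UnguardedSession` are hypothetical stand-ins, not WebObjects classes; the `checkedIn` flag simply records whether the superclass sleep() was reached.

```java
// Hypothetical stand-in for the framework session class; NOT the WebObjects API.
class BaseSession {
    boolean checkedIn = false;

    // Stands in for the framework's sleep(), which must run for clean-up to happen.
    public void sleep() {
        checkedIn = true;
    }
}

// Application code throws, but finally still reaches super.sleep().
class GuardedSession extends BaseSession {
    @Override
    public void sleep() {
        try {
            throw new IllegalStateException("simulated bug in application code");
        } finally {
            super.sleep(); // runs whether or not the body above threw
        }
    }
}

// Same bug without try...finally: super.sleep() is never reached.
class UnguardedSession extends BaseSession {
    private final boolean fail = true;

    @Override
    public void sleep() {
        if (fail) {
            throw new IllegalStateException("simulated bug in application code");
        }
        super.sleep();
    }
}

class SleepFinallyDemo {

    // Calls sleep(), swallows the simulated failure, and reports whether the base class ran.
    static boolean checkedInAfterSleep(BaseSession session) {
        try {
            session.sleep();
        } catch (RuntimeException e) {
            // a real application would log this
        }
        return session.checkedIn;
    }

    public static void main(String[] args) {
        System.out.println("guarded:   " + checkedInAfterSleep(new GuardedSession()));
        System.out.println("unguarded: " + checkedInAfterSleep(new UnguardedSession()));
    }
}
```

The guarded variant reports true and the unguarded one false: the exception still propagates in both cases, but only with try...finally does the superclass get its turn.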
kill -QUIT gives better info, BTW. Apple's jstack seems to have gotten worse.
This is just an idle worker thread:
Thread t...@64771: (state = BLOCKED)
- java.net.PlainSocketImpl.accept(java.net.SocketImpl) @bci=0, line=382 (Interpreted frame)
- java.net.ServerSocket.implAccept(java.net.Socket) @bci=50, line=450 (Interpreted frame)
- java.net.ServerSocket.accept() @bci=48, line=421 (Interpreted frame)
- com.webobjects.appserver._private.WOWorkerThread.run() @bci=26, line=210 (Interpreted frame)
- java.lang.Thread.run() @bci=11, line=613 (Interpreted frame)
This is a request for a session that never got checked back in (see bug above):
Thread t...@80131: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Interpreted frame)
- com.webobjects.appserver.WOSessionStore.checkOutSessionWithID(java.lang.String, com.webobjects.appserver.WORequest) @bci=48, line=191 (Interpreted frame)
- com.webobjects.appserver.WOApplication.restoreSessionWithID(java.lang.String, com.webobjects.appserver.WOContext) @bci=9, line=1913 (Interpreted frame)
- com.webobjects.appserver._private.WOComponentRequestHandler._dispatchWithPreparedApplication(com.webobjects.appserver.WOApplication, com.webobjects.appserver.WOContext, com.webobjects.foundation.NSDictionary) @bci=55, line=324 (Interpreted frame)
- com.webobjects.appserver._private.WOComponentRequestHandler._handleRequest(com.webobjects.appserver.WORequest) @bci=113, line=369 (Interpreted frame)
- com.webobjects.appserver._private.WOComponentRequestHandler.handleRequest(com.webobjects.appserver.WORequest) @bci=46, line=445 (Interpreted frame)
- com.webobjects.appserver.WOApplication.dispatchRequest(com.webobjects.appserver.WORequest) @bci=32, line=1687 (Compiled frame)
- Application.dispatchRequest(com.webobjects.appserver.WORequest) @bci=2, line=629 (Interpreted frame)
- com.webobjects.appserver._private.WOWorkerThread.runOnce() @bci=473, line=144 (Compiled frame)
- com.webobjects.appserver._private.WOWorkerThread.run() @bci=129, line=226 (Interpreted frame)
- java.lang.Thread.run() @bci=11, line=613 (Interpreted frame)
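
The blocking seen in this trace can be modelled with a toy check-out/check-in store. This is a simplified sketch and not how WOSessionStore is actually implemented: `ToySessionStore` and its methods are made up, and the wait here has a timeout so the demo terminates, whereas the trace above shows an untimed `Object.wait()`.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of a session store; NOT WebObjects' implementation.
class ToySessionStore {
    private final Set<String> checkedOut = new HashSet<String>();

    // Waits until the session is free, or gives up after timeoutMillis.
    // Returns true if the caller now holds the session, false on timeout or interrupt.
    synchronized boolean checkOut(String sessionId, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (checkedOut.contains(sessionId)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        checkedOut.add(sessionId);
        return true;
    }

    // Releases the session and wakes up any waiting requests.
    synchronized void checkIn(String sessionId) {
        checkedOut.remove(sessionId);
        notifyAll();
    }
}

class ToySessionStoreDemo {
    public static void main(String[] args) {
        ToySessionStore store = new ToySessionStore();
        store.checkOut("abc", 100); // first request takes the session
        // ... the request fails before the session is checked back in ...
        System.out.println("second request got the session: " + store.checkOut("abc", 200));
        store.checkIn("abc");
        System.out.println("after check-in: " + store.checkOut("abc", 200));
    }
}
```

In this model the second request times out and only succeeds once the session has been checked in. With an untimed wait, as in the trace, each later request for that session would simply hold a worker thread indefinitely.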
These are jstack / OS X bugs (and might possibly be hiding the problem in your app but I don't think so):
Thread t...@95491: (state = IN_NATIVE)
Error occurred during stack walking:
sun.jvm.hotspot.debugger.UnalignedAddressException: Trying to read at address: 0xc9e58955 with alignment: 4
        at sun.jvm.hotspot.debugger.DebuggerUtilities.checkAlignment(DebuggerUtilities.java:40)
        at sun.jvm.hotspot.debugger.macosx.MacOSXDebuggerLocal2.readCInteger(MacOSXDebuggerLocal2.java:387)
        at sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:425)
        at sun.jvm.hotspot.debugger.macosx.MacOSXDebuggerLocal2.readAddress(MacOSXDebuggerLocal2.java:257)
        at sun.jvm.hotspot.debugger.macosx.MacOSXAddress.getAddressAt(MacOSXAddress.java:54)
        at sun.jvm.hotspot.runtime.x86.X86Frame.getLink(X86Frame.java:338)
        at sun.jvm.hotspot.runtime.x86.X86Frame.sender(X86Frame.java:218)
        at sun.jvm.hotspot.runtime.Frame.sender(Frame.java:184)
        at sun.jvm.hotspot.runtime.Frame.realSender(Frame.java:189)
        at sun.jvm.hotspot.runtime.VFrame.sender(VFrame.java:102)
        at sun.jvm.hotspot.runtime.VFrame.javaSender(VFrame.java:134)
        at sun.jvm.hotspot.runtime.JavaThread.getLastJavaVFrameDbg(JavaThread.java:231)
        at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:53)
        at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:27)
        at sun.jvm.hotspot.tools.JStack.run(JStack.java:41)
        at sun.jvm.hotspot.tools.Tool.start(Tool.java:204)
        at sun.jvm.hotspot.tools.JStack.main(JStack.java:62)


Thread t...@98051: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=120 (Interpreted frame)
- java.lang.ref.ReferenceQueue.remove() @bci=2, line=136 (Compiled frame)
Error occurred during stack walking:
java.lang.NullPointerException
        at sun.jvm.hotspot.runtime.Frame.addressOfStackSlot(Frame.java:214)
        at sun.jvm.hotspot.runtime.x86.X86Frame.getSenderSP(X86Frame.java:355)
        at sun.jvm.hotspot.runtime.x86.X86Frame.sender(X86Frame.java:218)
        at sun.jvm.hotspot.runtime.Frame.sender(Frame.java:184)
        at sun.jvm.hotspot.runtime.Frame.realSender(Frame.java:189)
        at sun.jvm.hotspot.runtime.VFrame.sender(VFrame.java:102)
        at sun.jvm.hotspot.runtime.VFrame.javaSender(VFrame.java:134)
        at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:53)
        at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:27)
        at sun.jvm.hotspot.tools.JStack.run(JStack.java:41)
        at sun.jvm.hotspot.tools.Tool.start(Tool.java:204)
        at sun.jvm.hotspot.tools.JStack.main(JStack.java:62)


Thread t...@98311: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=474 (Compiled frame)
Error occurred during stack walking:
java.lang.NullPointerException
        at sun.jvm.hotspot.runtime.Frame.addressOfStackSlot(Frame.java:214)
        at sun.jvm.hotspot.runtime.x86.X86Frame.getSenderSP(X86Frame.java:355)
        at sun.jvm.hotspot.runtime.x86.X86Frame.sender(X86Frame.java:218)
        at sun.jvm.hotspot.runtime.Frame.sender(Frame.java:184)
        at sun.jvm.hotspot.runtime.Frame.realSender(Frame.java:189)
        at sun.jvm.hotspot.runtime.VFrame.sender(VFrame.java:102)
        at sun.jvm.hotspot.runtime.VFrame.javaSender(VFrame.java:134)
        at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:53)
        at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:27)
        at sun.jvm.hotspot.tools.JStack.run(JStack.java:41)
        at sun.jvm.hotspot.tools.Tool.start(Tool.java:204)
        at sun.jvm.hotspot.tools.JStack.main(JStack.java:62)
Chuck
On Sep 23, 2008, at 5:03 PM, Chuck Hill wrote:
Hi Doug,


On Sep 23, 2008, at 12:19 PM, doug andrews wrote:
Still in need of help.
I took your advice and made a build specifically for this site using Java 1.5 and the latest Xcode and WO.
I am still having this problem.
We have one instance of our application that is causing each processor to spike to over 90 percent of its CPU capacity. When this happens, one random user will get the 'instance not available' message.
Have you gotten a thread dump from such a runaway instance?
http://www.gvcsitemaker.com/gvc.webobjects/faq&mode=single&recordID=41413
Is that one specific instance, or just randomly one of them? If one specific, what is different about it?
This only happens when someone upgrades to WO 5.4.2 or 5.4.3.
Tiger machines running WO 5.3.x have no problems.
Are the machines upgraded to WO 5.4 running a copy of the app compiled against the same version of WO?
These are Intel machines running OS 10.5.x.
Does anyone know of any changes made in WO 5.4.x (or OS 10.5.x) that could cause this?
Not off hand.

Chuck
On Aug 25, 2008, at 3:22 PM, Mike Schrag wrote:
No, they were built using Xcode and WO 5.3.3.
This is a really bad idea ... I would never deploy on a version that's different than development. It's just asking for problems. While it MAY work, you're just playing roulette with your app. Either 1) replace 5.4 with 5.3 on that server, 2) rebuild your app with WO 5.3 embedded, or 3) set up a proper 5.4.2 build and development environment and rebuild and test your app in 5.4.2.
ms
--
Chuck Hill             Senior Consultant / VP Development
Practical WebObjects - for developers who want to increase their overall knowledge of WebObjects or who are trying to solve specific problems.
http://www.global-village.net/products/practical_webobjects

