John,
You say you captured an SQL log. What about a combined API/Filter/SQL log?
I would be interested to see whether anything is going on while the CPU is
busy. 100% CPU is abnormal, but not unheard of by any means; whether it's
appropriate depends on what your system is doing. Since you are 100% custom,
you could have accidentally put something into a loop. What are your stack
and max filter settings?
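
If it helps, here is a rough sketch of how I would hunt for a loop in a big
log: collapse the digits on each line so that timestamps and IDs don't make
every line unique, then count which lines repeat the most. It assumes nothing
about the AR System log layout beyond "plain text, one entry per line", and
the function names are my own, not anything from BMC.

```python
import re
from collections import Counter


def normalize(line):
    """Replace every run of digits with '#' so lines that differ only by
    timestamps, thread IDs, or entry IDs compare as equal."""
    return re.sub(r"\d+", "#", line.strip())


def most_repeated(lines, top=10):
    """Return the `top` most frequent normalized lines as (text, count)
    pairs, most frequent first. Blank lines are ignored."""
    counts = Counter(normalize(line) for line in lines if line.strip())
    return counts.most_common(top)
```

You would call it as most_repeated(open("arsql.log", errors="replace")) and
look at whatever sits at the top. A statement or filter that shows up
thousands of times more often than everything else is a good loop candidate.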

-----Original Message-----
From: Action Request System discussion list(ARSList)
[mailto:arslist@ARSLIST.ORG] On Behalf Of Reiser, John J
Sent: Monday, June 27, 2011 3:27 PM
To: arslist@ARSLIST.ORG
Subject: arserver.exe is consuming 100% cpu - possible DB corruption? (Long
Post)

Hello Listers,
ARS 7.6.03
MS 2003 Enterprise
MS SQL 2005 (remote)
Total home grown system. No OOTB modules.


I have a real stumper here. It even has BMC scratching their heads.
I have a production system that is experiencing CPU overload: arserver.exe
runs up to 99% in the process list and sits there.
The ARSystem server is a virtual machine. We thought maybe it was a MS "Patch
Tuesday" issue and we removed the 10 recent MS patches one at a time and
restarted the machine each time. The problem still exists after the arserver
service starts. Sometimes immediately, and sometimes it will sit for 1-20
minutes before it starts to hog the CPUs.
To eliminate any other OS and file system issues we grabbed a two week old
backup image of the server and restored it.
The system came back ok for a short while and then started to lock up the
CPU again.
Working with BMC I set the logs on and restarted. We saw the system jump to
100% within a minute and captured a 10MB arsql.log file.
I can force the overload at any time by firing filter workflow with a
notification action in it.
I disabled this one filter but the system still loaded up. I added a Filter
that ran at execution order 0 and whose only action was Goto 1000, to skip
all Filter actions that fired on the change of the Status field in question.
Still no joy. 
I've disabled every piece of Notify workflow. That worked the best and kept
the system alive for longer stretches but we can't run a system that way.

I've come to the realization that there may be corrupted information in the
DB object tables and I wanted to get some feedback.
Using rrrChive I can pull a copy of every form's data since, say, two weeks
ago. Then have the DBA restore the entire system from that date. After the
restore I would use rrrChive to reload the two weeks' data ('Modified date' >
"06/11/2011") and hope for the best.

Any workflow that was changed in the last two weeks is negligible and could
be recreated/updated as needed.

Do you think this is a viable solution?
When I asked the BMC tech if I could dump the T, H & B tables, restore the
db, and reload the T, H & B tables, he reminded me that the arschema and other
meta tables would probably be out of sync.
That's when I thought of using rrrChive.

Sorry to be so long-winded but I need to get this back online, BMC can't
find anything in the logs and I don't want to lose the tickets we've taken
in the last week.




--- 
John J. Reiser 
Remedy Developer/Administrator 
Senior Software Development Analyst 
Lockheed Martin - MS2 
The star that burns twice as bright burns half as long. 
Pay close attention and be illuminated by its brilliance. - paraphrased by
me 

_______________________________________________________________________________
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org
attend wwrug11 www.wwrug.com ARSList: "Where the Answers Are"
