Re: Strange ARS Timeout Problem

ZHANG, ERIC L Fri, 21 Jan 2011 11:45:25 -0800


Thanks, Mark.


 

I did go through the escalations that were running during the timeout
and couldn't find anything out of the ordinary.  The escalation log
shows that all the escalations completed in a fraction of a second,
and no delays show up in the SQL log either.

 

Eric

 

 

-----Original Message-----
From: Brittain, Mark [mailto:mbritt...@navisite.com] 
Sent: Thursday, January 20, 2011 3:30 PM
Subject: Re: Strange ARS Timeout Problem

 

Hi Eric,

 

Couple things you might check. 

Have you checked the indexing against the Run If in the escalations?
NULL in the Run If ignores indexing and should be avoided.

 

If you have a time calculation, is the field alone on one side and the
calculation on the other, i.e. ('Create Date' < $TIMESTAMP$ - 3600)
rather than ('Create Date' + 3600 < $TIMESTAMP$)? Calculating on the
field value is slower because the index on the field cannot be used.
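
The effect of keeping the field alone on one side can be seen in any SQL database's query plan. Below is a minimal sketch that uses SQLite purely as a stand-in for the Oracle database behind ARS; the table `ticket`, the column `create_date`, and the index name are hypothetical and are not the names AR System generates. The exact plan text varies by SQLite version, but the first predicate should be reported as an index search and the second as a full scan.

```python
import sqlite3

def plan_for(conn, where_clause):
    """Return the first query-plan detail line for a SELECT with this WHERE clause."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM ticket WHERE " + where_clause,
        (1295380000,),  # stands in for $TIMESTAMP$ (epoch seconds)
    ).fetchall()
    return rows[0][3]  # the 4th column holds the human-readable plan detail

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one indexed timestamp column plus an unindexed column.
conn.execute(
    "CREATE TABLE ticket (id INTEGER PRIMARY KEY, create_date INTEGER, status TEXT)"
)
conn.execute("CREATE INDEX idx_ticket_create_date ON ticket (create_date)")

# Field alone on the left: the optimizer can use the index range.
field_alone = plan_for(conn, "create_date < ? - 3600")
# Arithmetic on the field: every row must be evaluated.
field_in_calc = plan_for(conn, "create_date + 3600 < ?")

print(field_alone)
print(field_in_calc)
```

Oracle behaves the same way for an ordinary index: once the indexed column is wrapped in an expression, the index on that column is no longer usable for the predicate unless a matching function-based index exists.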

 

Is there a SQL query to an external table in the Set Field action? Could
be a change/cause there. 

 

Likewise, is the escalation doing a Set Fields using information from
another form that your users frequently use? If so, the issue might be
the indexing there.

 

These are small things that you can get away with when there is a
relatively limited number of records. Then at some magic number the
warts start to show.

 

Hope this helps and good luck. 

 

Mark

 

From: Action Request System discussion list(ARSList)
[mailto:arslist@ARSLIST.ORG] On Behalf Of ZHANG, ERIC L
Sent: Thursday, January 20, 2011 4:11 PM
To: arslist@ARSLIST.ORG
Subject: Strange ARS Timeout Problem

 


Hi Listers.

 

We are experiencing intermittent timeouts with ARS. Without any
intervention from me, the AR System returns to normal after about 5
minutes. All users get a timeout (or the hourglass), but
armonitor.log shows no process being restarted.

 

This is the message showing in arerror.log:

 

Tue Jan 18 12:09:24 2011  Dispatch : Timeout during data retrieval due
to busy server -- retry the operation (server_name)  ARERR - 93

Tue Jan 18 12:10:04 2011  Approve : Timeout during database query --
consider using more specific search criteria to narrow the results, and
retry the operation (ARERR 94)
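
To see whether the ARERR 93/94 entries cluster around the stalls, the error log can be tallied per minute. This is a rough sketch that assumes every entry begins with a timestamp in the format shown above and that the code appears after the literal text "ARERR"; the function name `errors_per_minute` is made up for illustration. It tolerates entries wrapped over several lines, as they are in this post.

```python
import re
from collections import Counter
from datetime import datetime

# Entry header, e.g. "Tue Jan 18 12:09:24 2011  Dispatch : ..."
STAMP = re.compile(r"^\w{3} (\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})")
# Matches both "(ARERR 94)" and "ARERR - 93"
CODE = re.compile(r"ARERR\D{0,3}(\d+)")

def errors_per_minute(lines):
    """Count ARERR codes, keyed by (minute, code)."""
    counts = Counter()
    stamp = None
    for line in lines:
        header = STAMP.match(line)
        if header:
            stamp = datetime.strptime(header.group(1), "%b %d %H:%M:%S %Y")
        for code in CODE.findall(line):
            if stamp is not None:
                # Attribute the code to the most recent entry header seen.
                counts[(stamp.strftime("%Y-%m-%d %H:%M"), int(code))] += 1
    return counts

SAMPLE = [
    "Tue Jan 18 12:09:24 2011  Dispatch : Timeout during data retrieval due",
    "to busy server -- retry the operation (server_name)  ARERR - 93",
    "Tue Jan 18 12:10:04 2011  Approve : Timeout during database query --",
    "consider using more specific search criteria to narrow the results, and",
    "retry the operation (ARERR 94)",
]

result = errors_per_minute(SAMPLE)
for (minute, code), n in sorted(result.items()):
    print(minute, "ARERR", code, "x", n)
```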

 

In the API log, it shows a 5-minute gap:

 

<API > <TID: 0000000004> <RPC ID: 0000000000> <Queue: Admin     >
<Client-RPC: 999999   > <USER: Remedy Application Service
> /* Tue Jan 18 2011 12:06:16.2224 */-GLEWF            OK

<API > <TID: 0000000004> <RPC ID: 0000000000> <Queue: Admin     >
<Client-RPC: 999999   > <USER: Remedy Application Service
> /* Tue Jan 18 2011 12:11:16.0001 */+GLEWF  ARGetListEntryWithFields --
schema OBJSTR:Class from Unidentified Client (protocol 12) at IP address
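
Gaps like this one can be located without reading the API log by hand. The sketch below assumes each log entry carries a timestamp in the `/* Tue Jan 18 2011 12:06:16.2224 */` form shown above; the function name `find_gaps` and the 60-second threshold are arbitrary choices for illustration. The sample uses only the two timestamp-bearing lines from the excerpt.

```python
import re
from datetime import datetime

# Timestamp inside the comment markers; the weekday is skipped.
STAMP = re.compile(
    r"/\* \w{3} (\w{3} +\d{1,2} \d{4} \d{2}:\d{2}:\d{2}\.\d{1,6}) \*/"
)

def find_gaps(lines, threshold_seconds=60.0):
    """Return (previous, current, seconds) for consecutive entries that are
    at least threshold_seconds apart."""
    gaps = []
    previous = None
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue  # continuation line without a timestamp
        current = datetime.strptime(match.group(1), "%b %d %Y %H:%M:%S.%f")
        if previous is not None:
            delta = (current - previous).total_seconds()
            if delta >= threshold_seconds:
                gaps.append((previous, current, delta))
        previous = current
    return gaps

SAMPLE = [
    "> /* Tue Jan 18 2011 12:06:16.2224 */-GLEWF            OK",
    "> /* Tue Jan 18 2011 12:11:16.0001 */+GLEWF  ARGetListEntryWithFields --",
]

gaps = find_gaps(SAMPLE)
for start, end, seconds in gaps:
    print(f"{start} -> {end}: {seconds:.4f} s with no API activity")
```

Running the same function over the SQL and escalation logs for the same window would show whether those logs go quiet at the same moment or keep moving while the API log stalls.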

 

Our DBA was monitoring the database at the time and found very little
activity. The activity shown in the SQL log during the timeout was all
for user AR_ESCALATOR, which means escalations were still running
during that time. This can also be verified from the escalation log.

 

When this occurs, CPU and RAM utilization drop dramatically to their
lowest levels on both the ARS server and the database server. There
has been no application change in the last couple of months. The
problem started about two weeks ago. It can occur 3 times a day, and
sometimes the system runs fine for days without it occurring.

 

Our configuration/environment:

 

ARS: 7.1 patch 7

ITSM: 7.0.03 patch 9

SLM: 7.1 patch 2

SRM: 2.2 patch 4

Midtier: 7.6.03

 

ARS Server: Solaris 10 (16 GB of Physical Memory, 18 GB of SWAP, 8 CPUs)
- Dedicated to ARServer, ITSM, SLM, and SRM.

Midtier Server: Windows Server 2003 SP2 (2 CPUs, 2 GB of RAM) - Used
only by customers to submit service requests.

Database: Oracle: 10gR2 (remote)

 

The following are the thread settings in ar.conf:

 

Private-RPC-Socket:  390601   2   6

Private-RPC-Socket:  390603   2   2

Private-RPC-Socket:  390620  16  24  (FAST)

Private-RPC-Socket:  390626   8  16

Private-RPC-Socket:  390627   2  12

Private-RPC-Socket:  390635  24  30  (LIST)

Private-RPC-Socket:  390680  24  24

Private-RPC-Socket:  390693   2   4

Private-RPC-Socket:  390698   2   4
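
For a quick sanity check of these settings, the minimum and maximum thread counts can be totalled across queues. The sketch below assumes the three numbers on each line are queue number, minimum threads, and maximum threads, in that order; the function names are made up for illustration. If each worker thread can hold its own database connection (an assumption worth confirming for your version), the maximum total gives a rough ceiling to compare against the session limit on the Oracle side.

```python
import re

# Queue number, min threads, max threads; trailing notes such as "(FAST)" are ignored.
LINE = re.compile(r"^Private-RPC-Socket:\s+(\d+)\s+(\d+)\s+(\d+)")

def parse_queues(lines):
    """Map queue number -> (min_threads, max_threads)."""
    queues = {}
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            number, low, high = map(int, match.groups())
            queues[number] = (low, high)
    return queues

def thread_totals(queues):
    """Return (sum of minimums, sum of maximums) across all queues."""
    total_min = sum(low for low, _ in queues.values())
    total_max = sum(high for _, high in queues.values())
    return total_min, total_max

AR_CONF = """\
Private-RPC-Socket:  390601   2   6
Private-RPC-Socket:  390603   2   2
Private-RPC-Socket:  390620  16  24  (FAST)
Private-RPC-Socket:  390626   8  16
Private-RPC-Socket:  390627   2  12
Private-RPC-Socket:  390635  24  30  (LIST)
Private-RPC-Socket:  390680  24  24
Private-RPC-Socket:  390693   2   4
Private-RPC-Socket:  390698   2   4
""".splitlines()

queues = parse_queues(AR_CONF)
totals = thread_totals(queues)
print("queues:", len(queues), "min threads:", totals[0], "max threads:", totals[1])
```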

 

We have about 300 concurrent Remedy users during peak hours.
ARServer is running as a non-root process. The number of open file
descriptors for arserverd (~700) was well below the ulimit of 3072.
The FAST and LIST threads never reached their maximums.

 

I have an open ticket with BMC Support but thought I might get a
solution quicker from the Arslist here.

 

Thanks,

Eric

 

_attend WWRUG11 www.wwrug.com ARSlist: "Where the Answers Are"_ 

 

