Not sure if this will help.
We had the same issue: the system used to time out intermittently, and the 
only pattern was that the timeouts always happened at a fixed time.
Is that the case for you as well?

We found that some views were running at that time, which made the system do 
a full scan on the major form. We found no locks, but system performance did 
go down.

As Axton said, if there is any issue with the network, you should be able to 
see it in the error log itself.

Without ruling out the network issue, try changing the entry in the ORA file 
from hostname to IP address. We do this to minimize the impact of any DNS 
issue as well.
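
As a rough illustration of the hostname-to-IP suggestion, the sketch below pulls the HOST values out of an Oracle Net connect descriptor and lists the ones that are names rather than IP literals, since those are the entries that depend on DNS. It assumes the "ORA file" is tnsnames.ora; the aliases, hostname, IP address, and service name in the sample are invented for illustration and are not from this thread.

```python
import ipaddress
import re

# Matches "(HOST = value)" in a connect descriptor, case-insensitively.
HOST_RE = re.compile(r"\(\s*HOST\s*=\s*([^)\s]+)\s*\)", re.IGNORECASE)


def is_ip_literal(value):
    """Return True if value is an IPv4/IPv6 literal rather than a DNS name."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def hosts_needing_dns(tns_text):
    """List HOST values that are names, i.e. that require a DNS lookup."""
    return [h for h in HOST_RE.findall(tns_text) if not is_ip_literal(h)]


# Hypothetical entries: aliases, hostname, IP address and service name are made up.
SAMPLE = """
ARSDB_BY_NAME =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = arsdb)))
ARSDB_BY_IP =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.0.2.10)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = arsdb)))
"""

if __name__ == "__main__":
    print(hosts_needing_dns(SAMPLE))
```

Any name this reports is an entry where a DNS failure could delay or block the database connection.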

One more thing you need to find out is whether the issue happens through 
Midtier only, through the User client only, or both.

With Best Regards

Rajesh

________________________________
From: Action Request System discussion list(ARSList) 
[mailto:arslist@arslist.org] On Behalf Of Dennis Ruble
Sent: Friday, January 28, 2011 4:08 AM
To: arslist@arslist.org
Subject: Re: Strange ARS Timeout Problem

Eric,
You might add an nslookup command to your cron job to see if a DNS lookup is 
failing.  A DNS failure will give the same ARS symptom as a network outage 
because it is an operation that the server must complete before communications 
can happen.

Good luck,
Dennis
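
A minimal sketch of the same idea in Python, for anyone who would rather log lookup time and outcome in one line than parse nslookup output. It is an illustration only, not something used in this thread: the function names and log format are made up, and the host name is taken from the command line.

```python
import socket
import sys
import time
from datetime import datetime


def timed_lookup(hostname, resolver=socket.gethostbyname, clock=time.monotonic):
    """Resolve hostname once; return (ok, seconds_taken, address_or_error_text)."""
    start = clock()
    try:
        result = resolver(hostname)
        ok = True
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result = str(exc)
        ok = False
    return ok, clock() - start, result


def log_line(hostname, ok, seconds, detail, now=None):
    """Format one line suitable for appending to the cron job's output file."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    status = "OK" if ok else "FAIL"
    return f"{stamp} dns {hostname} {status} {seconds:.3f}s {detail}"


if __name__ == "__main__":
    # Pass the database server's name as the first argument; "localhost" is
    # only a harmless default for trying the script out.
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    print(log_line(host, *timed_lookup(host)))
```

Run from the same cron entry as the traceroute, a slow or failed lookup would show up with a timestamp that can be compared against the gap in the API log.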



"ZHANG, ERIC L" <ezh...@entergy.com>
Sent: 01/27/2011 04:26 PM
To: arslist@ARSLIST.ORG
Subject: Re: Strange ARS Timeout Problem

Good idea.  I just put a cron job on the ars server that runs traceroute 
<db_server> every minute and appends the output to an output file. Waiting for 
the next timeout.

-----Original Message-----
From: LJ LongWing [mailto:lj.longw...@gmail.com]
Sent: Thursday, January 27, 2011 9:18 AM
Subject: Re: Strange ARS Timeout Problem

OK, I just completely re-read the original post. All indications save one are 
that during that 5-minute interval the application server lost connectivity 
with the DB server.  The only exception appears to be the escalation thread, 
which continued processing during that 5-minute window. So what I would do is 
set up a cron job to run every 30 seconds or every minute, something along 
those lines, that issues a traceroute between your Remedy server and your DB 
server.  My primary thought is that you are losing network connectivity, even 
though the escalation server is still working. It's at least something you 
can try and report back.
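
A traceroute shows the path; a complementary check is whether a TCP connection to the database listener can be opened at all. The sketch below is one hedged way to do that from a scheduled job. It is not from this thread: the function names and output format are invented, and the host and port must be supplied by the caller (1521 is the usual Oracle listener default, but the actual port here is not stated).

```python
import socket
import sys
import time
from datetime import datetime


def probe_tcp(host, port, timeout=5.0, connect=socket.create_connection):
    """Try one TCP connect; return (ok, seconds_taken, error_text_or_empty)."""
    start = time.monotonic()
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError as exc:  # refused, unreachable, timed out, or DNS failure
        return False, time.monotonic() - start, str(exc)
    conn.close()
    return True, time.monotonic() - start, ""


def format_result(host, port, ok, seconds, error, now=None):
    """Format one timestamped line for the probe's output file."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    status = "OK" if ok else "FAIL"
    return f"{stamp} tcp {host}:{port} {status} {seconds:.3f}s {error}".rstrip()


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: probe.py <db_host> <listener_port>")
    else:
        host, port = sys.argv[1], int(sys.argv[2])
        print(format_result(host, port, *probe_tcp(host, port)))
```

Since cron cannot schedule more often than once a minute, a 30-second interval would need two runs per invocation with a sleep in between.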

From: Action Request System discussion list(ARSList) 
[mailto:arslist@ARSLIST.ORG] On Behalf Of ZHANG, ERIC L
Sent: Wednesday, January 26, 2011 7:19 PM
To: arslist@ARSLIST.ORG
Subject: Re: Strange ARS Timeout Problem

Yes, I did an initial log analysis. As I said in the original posting, there 
was a 5-minute gap in the api log, while the sql log and escalation log showed 
no gap, waiting, error, or long operation. All the SQL queries in the sql log 
were for user AR_ESCALATOR.


-----Original Message-----
From: Axton [mailto:axton.gr...@gmail.com]
Sent: Wednesday, January 26, 2011 8:18 AM
Subject: Re: Strange ARS Timeout Problem

What do the logs say?  I haven't seen that you've done analysis with the 
logs.  Is there a gap in time in the logs (indicating the server was not doing 
anything)?  Is there a gap in time in the logs (indicating a long operation 
was running)?

On Tue, Jan 25, 2011 at 5:49 PM, ZHANG, ERIC L <ezh...@entergy.com> wrote:

We have sent BMC tech support all the logs, including api, filter, sql, 
escalation, thread, plug-in, and arfork, and even pstack output taken during 
the hang, and so far they haven't been able to identify the cause of the 
problem.

-----Original Message-----
From: Axton [mailto:axton.gr...@gmail.com]
Sent: Monday, January 24, 2011 5:45 PM
Subject: Re: Strange ARS Timeout Problem

Try to get the api, filter, and sql logs leading up to the point where it 
started hanging.  Those are your best indicator.  Also check the arerror.log 
for crashes.

There are things that can cause behavior like this that the logs will 
indicate.  For example, try creating a computed group during production 
operations, or importing a deployable application.

On Thu, Jan 20, 2011 at 3:10 PM, ZHANG, ERIC L <ezh...@entergy.com> wrote:

Hi Listers.

We are experiencing intermittent timeouts with ARS. Without any intervention 
on my part, the AR System returns to normal after about 5 minutes. All users 
get a timeout (or hourglass), but armonitor.log shows no process being 
restarted.

This is the message showing in arerror.log:

Tue Jan 18 12:09:24 2011  Dispatch : Timeout during data retrieval due to busy server -- retry the operation (server_name)  ARERR - 93
Tue Jan 18 12:10:04 2011  Approve : Timeout during database query -- consider using more specific search criteria to narrow the results, and retry the operation (ARERR 94)
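
To check whether the timeouts cluster at a particular time of day, the timestamps of the timeout messages in arerror.log can be counted per hour. The following is a minimal sketch that assumes each record is a single line beginning with a timestamp in the format shown in the excerpt above; the function names are invented for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Leading timestamp as in the excerpt, e.g. "Tue Jan 18 12:09:24 2011".
STAMP_RE = re.compile(r"^(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s")


def timeout_times(lines):
    """Yield a datetime for every timestamped line that mentions a timeout."""
    for line in lines:
        match = STAMP_RE.match(line)
        if match and "Timeout" in line:
            # Collapse padding such as "Jan  8" before parsing.
            stamp = " ".join(match.group(1).split())
            yield datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def count_by_hour(lines):
    """Count timeout messages per hour of day (0-23)."""
    return Counter(t.hour for t in timeout_times(lines))


# The two messages quoted in the post, each on one line.
SAMPLE = [
    "Tue Jan 18 12:09:24 2011  Dispatch : Timeout during data retrieval due "
    "to busy server -- retry the operation (server_name)  ARERR - 93",
    "Tue Jan 18 12:10:04 2011  Approve : Timeout during database query -- "
    "consider using more specific search criteria to narrow the results, "
    "and retry the operation (ARERR 94)",
]

if __name__ == "__main__":
    for hour, count in sorted(count_by_hour(SAMPLE).items()):
        print(f"{hour:02d}:00  {count}")
```

Run over the log for the whole two-week period, a strong peak at one hour would point toward a scheduled job rather than a random network fault.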

In the API log, it shows a 5-minute gap:

<API > <TID: 0000000004> <RPC ID: 0000000000> <Queue: Admin     > <Client-RPC: 999999   > <USER: Remedy Application Service                   > /* Tue Jan 18 2011 12:06:16.2224 */-GLEWF            OK
<API > <TID: 0000000004> <RPC ID: 0000000000> <Queue: Admin     > <Client-RPC: 999999   > <USER: Remedy Application Service                   > /* Tue Jan 18 2011 12:11:16.0001 */+GLEWF  ARGetListEntryWithFields -- schema OBJSTR:Class from Unidentified Client (protocol 12) at IP address
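
A gap like this can be found mechanically by comparing consecutive timestamps in the API log. The sketch below assumes the timestamp format seen in the excerpt above (inside /* ... */ markers, with a fractional-seconds part); the function name and the 60-second threshold are arbitrary choices for illustration. Whitespace is collapsed first so that a timestamp split across a line break, as can happen when a log is pasted into email, still matches.

```python
import re
from datetime import datetime

# Timestamp inside the comment markers, e.g. "/* Tue Jan 18 2011 12:06:16.2224 */".
STAMP_RE = re.compile(r"/\* (\w{3} \w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}\.\d+) \*/")


def find_gaps(log_text, threshold_seconds=60.0):
    """Return (previous, current, gap_seconds) for entries further apart than the threshold."""
    flat = " ".join(log_text.split())  # undo line wrapping and padding
    stamps = [
        datetime.strptime(s, "%a %b %d %Y %H:%M:%S.%f")
        for s in STAMP_RE.findall(flat)
    ]
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta > threshold_seconds:
            gaps.append((prev, cur, delta))
    return gaps


# Abbreviated from the two entries quoted in the post, wrapped as in the email.
SAMPLE = """<API > <TID: 0000000004> <RPC ID: 0000000000> <Queue: Admin     > /* Tue Jan 18
2011 12:06:16.2224 */-GLEWF            OK
<API > <TID: 0000000004> <RPC ID: 0000000000> <Queue: Admin     > /* Tue Jan 18
2011 12:11:16.0001 */+GLEWF  ARGetListEntryWithFields -- schema OBJSTR:Class
"""

if __name__ == "__main__":
    for prev, cur, seconds in find_gaps(SAMPLE):
        print(f"gap of {seconds:.1f}s between {prev} and {cur}")
```

The same function can be pointed at the sql and escalation logs, provided the regular expression is adjusted if their timestamp layout differs.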

Our DBA was monitoring the database at the time and found little activity in 
the database. The activity shown in the SQL log during the timeout was all for 
user AR_ESCALATOR, which means the escalation was still running during that 
time. This can also be verified from the escalation log.

When this occurs, CPU and RAM utilization drop dramatically to their lowest 
levels on both the ARS server and the database server. There have been no 
application changes in the last couple of months. The problem started about 
two weeks ago. It can occur 3 times a day, and sometimes the system works 
fine for days without it occurring.

Our configuration/environment:

ARS: 7.1 patch 7
ITSM: 7.0.03 patch 9
SLM: 7.1 patch 2
SRM: 2.2 patch 4
Midtier: 7.6.03

ARS Server: Solaris 10 (16 GB of Physical Memory, 18 GB of SWAP, 8 CPUs) - 
Dedicated to ARServer, ITSM, SLM, and SRM.
Midtier Server: Windows Server 2003 SP2 (2 CPUs, 2 GB of RAM) - Used only by 
customers to submit service requests.
Database: Oracle 10gR2 (remote)

The following are the thread settings in ar.conf:

Private-RPC-Socket:  390601   2   6
Private-RPC-Socket:  390603   2   2
Private-RPC-Socket:  390620  16  24  (FAST)
Private-RPC-Socket:  390626   8  16
Private-RPC-Socket:  390627   2  12
Private-RPC-Socket:  390635  24  30  (LIST)
Private-RPC-Socket:  390680  24  24
Private-RPC-Socket:  390693   2   4
Private-RPC-Socket:  390698   2   4
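
For comparing these settings against observed thread usage, the lines can be parsed into a small table. The sketch below reads the three numbers as RPC program number, minimum threads, and maximum threads, which is an assumption about the column order, and it tolerates the (FAST) and (LIST) annotations as they appear in this post; the type and function names are invented for illustration.

```python
import re
from typing import NamedTuple


class QueueThreads(NamedTuple):
    rpc_number: int
    min_threads: int   # assumed meaning of the second column
    max_threads: int   # assumed meaning of the third column
    label: str         # annotation such as "FAST" or "LIST", may be empty


LINE_RE = re.compile(
    r"^Private-RPC-Socket:\s+(\d+)\s+(\d+)\s+(\d+)(?:\s+\((\w+)\))?\s*$"
)


def parse_thread_settings(text):
    """Parse Private-RPC-Socket lines; other lines are ignored."""
    queues = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            queues.append(
                QueueThreads(int(m.group(1)), int(m.group(2)),
                             int(m.group(3)), m.group(4) or "")
            )
    return queues


# The settings as posted.
SETTINGS = """
Private-RPC-Socket:  390601   2   6
Private-RPC-Socket:  390603   2   2
Private-RPC-Socket:  390620  16  24  (FAST)
Private-RPC-Socket:  390626   8  16
Private-RPC-Socket:  390627   2  12
Private-RPC-Socket:  390635  24  30  (LIST)
Private-RPC-Socket:  390680  24  24
Private-RPC-Socket:  390693   2   4
Private-RPC-Socket:  390698   2   4
"""

if __name__ == "__main__":
    queues = parse_thread_settings(SETTINGS)
    print("queues:", len(queues))
    print("total min threads:", sum(q.min_threads for q in queues))
    print("total max threads:", sum(q.max_threads for q in queues))
```

Under that reading, the totals give an upper bound on how many server threads could be holding database connections at once.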

We have about 300 concurrent Remedy users during peak hours. ARServer is 
running as a non-root process. The number of open file descriptors for 
arserverd (~700) was well below the ulimit of 3072.  The FAST and LIST threads 
never reached their maximums.

I have an open ticket with BMC Support but thought I might get a solution 
quicker from the Arslist here.

Thanks,
Eric

_attend WWRUG11 www.wwrug.com<http://www.wwrug.com/> ARSlist: "Where the 
Answers Are"_


_______________________________________________________________________________
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org
attend wwrug11 www.wwrug.com ARSList: "Where the Answers Are"
