What do the logs say? I haven't seen any analysis of the logs yet. Is there a gap in time in the logs (indicating the server was not doing anything)? Or is there a gap in time in the logs (indicating a long operation was running)?
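
If it helps, here is a minimal Python sketch for spotting such gaps. It assumes the timestamps look like the `/* Tue Jan 18 2011 12:06:16.2224 */` comments in the API log excerpt quoted below, and that entries appear in the file in time order. The names `parse_timestamps`, `find_gaps`, and `threshold` are mine, not part of any BMC tooling.

```python
import re
from datetime import datetime, timedelta

# Matches the timestamp comment seen in the quoted API log excerpt,
# e.g. /* Tue Jan 18 2011 12:06:16.2224 */
TIMESTAMP_RE = re.compile(
    r"/\*\s*(\w{3}\s+\w{3}\s+\d{1,2}\s+\d{4}\s+\d{2}:\d{2}:\d{2}\.\d{1,6})\s*\*/"
)


def parse_timestamps(lines):
    """Yield a datetime for every timestamp comment found in the lines."""
    for line in lines:
        for match in TIMESTAMP_RE.finditer(line):
            # Collapse runs of whitespace so strptime sees single spaces.
            text = " ".join(match.group(1).split())
            yield datetime.strptime(text, "%a %b %d %Y %H:%M:%S.%f")


def find_gaps(lines, threshold=timedelta(seconds=60)):
    """Return (start, end, length) for each pause of at least `threshold`.

    Assumes the entries are in time order, as they are within one log file.
    """
    gaps = []
    previous = None
    for current in parse_timestamps(lines):
        if previous is not None and current - previous >= threshold:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps
```

You could run it over an open log file, for example `find_gaps(open("arapi.log"))` (the file name is only a placeholder), and then repeat it on the filter and sql logs. A pause that shows up at the same time in all of them points to an idle server. A pause in only one of them points to a long-running operation.
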

On Tue, Jan 25, 2011 at 5:49 PM, ZHANG, ERIC L <ezh...@entergy.com> wrote:
> We have sent BMC tech support all the logs including api, filter, sql,
> escalation, thread, plug-in, arfork, even pstack output that were taken
> during hanging, and so far they haven’t been able to identify the cause of
> the problem.
>
> -----Original Message-----
> From: Axton [mailto:axton.gr...@gmail.com]
> Sent: Monday, January 24, 2011 5:45 PM
> Subject: Re: Strange ARS Timeout Problem
>
> Try to get the api, filter, and sql logs leading up to the point where
> it started hanging. Those are your best indicator. Also check the
> arerror.log for crashes.
>
> There are things that can cause behavior like this that the logs will
> indicate. For example, try creating a computed group during production
> operations, or importing a deployable application.
>
> On Thu, Jan 20, 2011 at 3:10 PM, ZHANG, ERIC L <ezh...@entergy.com> wrote:
>
> Hi Listers.
>
> We are experiencing intermittent timeouts with the ARS. Without me doing
> anything, the AR system becomes normal again after about 5 minutes. All
> users are getting timeout (or hourglass) but no process is being restarted
> in armonitor.log.
>
> This is the message showing in arerror.log:
>
> Tue Jan 18 12:09:24 2011 Dispatch : Timeout during data retrieval due to
> busy server -- retry the operation (server_name) ARERR - 93
>
> Tue Jan 18 12:10:04 2011 Approve : Timeout during database query --
> consider using more specific search criteria to narrow the results, and
> retry the operation (ARERR 94)
>
> In the API log, it shows a 5-minute gap:
>
> <API > <TID: 0000000004> <RPC ID: 0000000000> <Queue: Admin >
> <Client-RPC: 999999 > <USER: Remedy Application Service >
> /* Tue Jan 18 2011 12:06:16.2224 */-GLEWF OK
>
> <API > <TID: 0000000004> <RPC ID: 0000000000> <Queue: Admin >
> <Client-RPC: 999999 > <USER: Remedy Application Service >
> /* Tue Jan 18 2011 12:11:16.0001 */+GLEWF ARGetListEntryWithFields --
> schema OBJSTR:Class from Unidentified Client (protocol 12) at IP address
>
> Our DBA was monitoring the database during the time and found few
> activities in the database. The activities shown in SQL log during the
> timeout were all for user AR_ESCALATOR, which means the escalation was still
> running during the time. This can also be verified from the escalation log.
>
> When this occurs, the CPU and RAM utilizations are dramatically dropping to
> the lowest levels on both the ARS server and the database server. There was
> no application change in the last couple of months. The problem started
> about two weeks ago. It could occur 3 times a day and sometimes it works
> fine for days without it occurring.
>
> Our configuration/environment:
>
> ARS: 7.1 patch 7
> ITSM: 7.0.03 patch 9
> SLM: 7.1 patch 2
> SRM: 2.2 patch 4
> Midtier: 7.6.03
>
> ARS Server: Solaris 10 (16 GB of Physical Memory, 18 GB of SWAP, 8 CPUs) –
> Dedicated to ARServer, ITSM, SLM, and SRM.
>
> Midtier Server: Windows Server 2003 SP2 (2 CPUs, 2 GB of RAM) – Used only
> by customers to submit service request.
>
> Database: Oracle: 10gR2 (remote)
>
> The following are threads settings in ar.conf:
>
> Private-RPC-Socket: 390601 2 6
> Private-RPC-Socket: 390603 2 2
> Private-RPC-Socket: 390620 16 24 (FAST)
> Private-RPC-Socket: 390626 8 16
> Private-RPC-Socket: 390627 2 12
> Private-RPC-Socket: 390635 24 30 (LIST)
> Private-RPC-Socket: 390680 24 24
> Private-RPC-Socket: 390693 2 4
> Private-RPC-Socket: 390698 2 4
>
> We have about 300 concurrent Remedy users during the peak hours. ARServer
> is running as non-root process. The number of open file descriptors for
> arserverd (~700) was well below the ulimit 3072. The FAST and LIST threads
> never reached the maximums.
>
> I have an open ticket with BMC Support but thought I might get a solution
> quicker from the Arslist here.
>
> Thanks,
> Eric
>
> _attend WWRUG11 www.wwrug.com ARSlist: "Where the Answers Are"_
_______________________________________________________________________________
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org