***I apologize if you receive the posting twice - I tried to send the
posting with two pstack output attachments (less than 1 MB in size) but
couldn't.***

 

First of all, thanks to all who responded and provided valuable
suggestions on our issue.

 

Rejesh,

 

It happens randomly, not at a fixed time. Sometimes it happened early in
the morning or at night when there were just a few users, while other
times it happened during peak business hours. When it happens, it
affects both web users and native client users.

 

Fred,

 

I did steal the script from one of your earlier postings, tweaked it a
little, and have had it in place since we encountered the problem.
Thanks Fred.

 

What BMC support is really looking for at this time is the pstack output
for arplugin during the hang, because they think arplugin might be
causing the problem (see comments from BMC support below).  I will also
try to get truss and dtrace output (recommended by Axton) for arserverd
and the plugin server when it happens again.
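
For anyone who wants to grab the stacks without typing commands during a
hang, here is a minimal Python sketch of the capture step. It assumes
that pgrep and pstack are on the PATH and that the plugin server process
is named arplugin. The function names are made up for illustration and
are not part of any BMC tool.

```python
import subprocess
import time
from pathlib import Path


def find_pids(process_name, run=subprocess.run):
    """Return the PIDs whose process name matches exactly, using pgrep."""
    result = run(["pgrep", "-x", process_name], capture_output=True, text=True)
    return [int(token) for token in result.stdout.split()]


def capture_stacks(pids, out_dir, tool="pstack", run=subprocess.run):
    """Run the stack tool once per PID and save each output to its own file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Timestamp in the file name so repeated snapshots do not overwrite each other.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    written = []
    for pid in pids:
        result = run([tool, str(pid)], capture_output=True, text=True)
        path = out_dir / f"{tool}-{pid}-{stamp}.txt"
        path.write_text(result.stdout)
        written.append(path)
    return written
```

Calling capture_stacks(find_pids("arplugin"), "/tmp/arplugin-stacks") a
few times, several seconds apart, should show whether the same threads
stay blocked in the same place. The output directory here is only an
example path.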

 

Bob,

 

It's interesting you mentioned the dispatcher thread, because BMC tech
support has recommended turning on dispatcher logging.  I'm going to
look at the RPC-Non-Blocking-IO setting.  I am attaching a couple of the
pstack outputs for anyone who is familiar with pstack to take a look.

 

 

I have received updates from BMC support and implemented some changes:

 

Added into ar.conf:

 

External-Authentication-Return-Data-Capabilities: 31

Plugin-Filter-API-Threads: 4 20

Approval-RPC-Socket: 390631

Private-RPC-Socket:  390631   2   4

 

Updated in ar.conf:

 

Next-ID-Block-Size from 10 to 40

Delay-Recache-Time from 5 to 120

 

Adjusted thread counts:

 

- CAI Plugin threads: 

Private-RPC-Socket:  390680  24  24        to:
Private-RPC-Socket:  390680  16  24

 

- RPC Plugin Loopback threads

Private-RPC-Socket:  390626   8  16        to:
Private-RPC-Socket:  390626   4  10
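
To double-check the thread settings after editing, a small Python sketch
like the one below can list every Private-RPC-Socket entry with its
minimum and maximum threads. It assumes ar.conf uses "Name: value" lines
with "#" starting a comment, and that the three fields are the RPC
program number, minimum threads, and maximum threads, as in the entries
above. The function names are hypothetical.

```python
def parse_private_sockets(conf_text):
    """Map RPC program number -> (min_threads, max_threads)."""
    sockets = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (assumed '#' syntax)
        key, sep, value = line.partition(":")
        if not sep or key.strip() != "Private-RPC-Socket":
            continue
        fields = value.split()
        if len(fields) != 3:
            continue  # skip entries that do not list both thread counts
        program, low, high = (int(field) for field in fields)
        sockets[program] = (low, high)
    return sockets


def invalid_ranges(sockets):
    """Return program numbers whose minimum is below 1 or above the maximum."""
    return [
        program
        for program, (low, high) in sorted(sockets.items())
        if low < 1 or low > high
    ]


# The Private-RPC-Socket values from this post, after the changes.
SAMPLE = """
Approval-RPC-Socket: 390631
Private-RPC-Socket:  390631   2   4
Private-RPC-Socket:  390680  16  24
Private-RPC-Socket:  390626   4  10
"""

for program, (low, high) in sorted(parse_private_sockets(SAMPLE).items()):
    print(program, "min", low, "max", high)
```

The range check only catches obvious typos such as a minimum larger than
the maximum. It says nothing about whether the counts suit the load.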

 

 

Here are some comments from BMC support:

 

"It looks that the cause of the entire problem could be arplugin not
responding during that time, as in the logs we saw at least two threads
who were making a call to plugin server and they are waiting for a
response from plugin server, one for authentication and other for
getting the information via the vendor form and if other users are ITSM
users, so they would be using overview console which again use a
plugin."

 

"It showed that it might be waiting from the database and in one of the
other call on Thread 9 , it showed that plugin call is being made, and
that is taking time, that being the reason I suggested to add External
Authentication parameter, so that it don't have to authenticate for
everything."

 

"In one of the call, it also showed that one escalation is triggering,
that is giving a call to filter and some filter operations are performed
which is creating a db entry in there and that is taking time."

 

"In the plugin log, we see the last successful CAI plugin call and after
that plugin server stopped responding for some reason. Can you please
check how many records you have in the CAI:Events form? Is that too
many? Do you see any of the old or errored records as well?  If you see
the old records, will that be possible to remove those records from
CAI:events forms (or you can take a backup after exporting and then
delete), just incase if those are bad records or very old records."   -
I did clean up old records in CAI:Events and CAI:EventParams.

 

 

Thanks,

Eric

 

