I sent this to our premier support person and manager, but I'd be interested 
to see what others have to say about this too.

Original message below:

Hi -

This came up on our call and I wanted to write it out.

BMC has stated that the Windows User Tool (WUT) is going to be discontinued (in 
fact, it already was in 7.6x).  What I need to know is this: what is BMC's 
recommendation for diagnosing problems with the AR Servers in a server group?

Currently our users will report an issue like this: "Remedy is slow/locked 
up/whatever".  Routinely we get no more information than this.

Right now our first troubleshooting step is to identify which server(s) are 
having problems.  The fast way to do this is to log in to every server with the 
user tool.
tool.  We usually know within a few seconds if one of the AR servers is locked 
up, because we will not be able to log in to it.  Then we can bounce it and get 
service restored.

If they are responsive we then move on to the Mid-tier servers, etc.
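
For illustration, the "try every server and see which one does not answer" 
sweep could be approximated with a small script.  This is only a rough sketch 
under stated assumptions: the host names and port below are placeholders, not 
our real configuration, and a plain TCP connect is a weaker stand-in for an 
actual user login.  A server whose threads are hung may still accept the 
connection, so a "responding" result here does not prove users can log in.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Hypothetical server group; host names and port are placeholders only.
SERVERS = [
    ("arserver01.example.invalid", 2020),
    ("arserver02.example.invalid", 2020),
]


def probe(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def triage(servers, timeout=5.0):
    """Probe all servers in parallel and return {(host, port): reachable}."""
    if not servers:
        return {}
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        results = list(pool.map(lambda s: probe(s[0], s[1], timeout), servers))
    return dict(zip(servers, results))


def format_report(results):
    """Turn the triage result into one human-readable line per server."""
    return [
        f"{host}:{port} {'responding' if ok else 'NOT responding'}"
        for (host, port), ok in results.items()
    ]
```

Printing `"\n".join(format_report(triage(SERVERS)))` would give one line per 
server.  Because the probes run in parallel, the whole sweep should take about 
as long as the single slowest probe, bounded by the timeout once the host name 
has resolved, rather than the sum of all of them.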

With a large load-balanced environment there is no way to QUICKLY do this 
without the WUT.  I could log in with Developer Studio, but that doesn't use the 
same threads on the server as the WUT does.  We have seen instances where users 
are locked up and admins can log in with Dev Studio (and vice versa).  The same 
goes for Migrator and the Import tool.

Support suggested checking the AR error log, but there are two problems with 
that.  First, many lock-up scenarios do not result in errors in the 
arerror.log file.  There are numerous other logs to check on every server as 
well (CMDB, Email, AIE, etc).  Checking every log file on every server is 
time-consuming and not guaranteed to show us which server is locked up.

The second problem with support's suggestion is the sheer time it would take to 
log in to each server.  We are on Linux, so we need to connect via SSH using 
PuTTY.  We do that by first connecting to a gateway server.  Then we ssh to the 
actual AR server (direct access is not allowed).  Finally, we sudo to the user 
Remedy is running as.  That means we log in three times for every connection.  
If we multiply that by the 10 servers in our server group, it would take at 
least an hour just to triage the problem.

I can do the same thing with the WUT in seconds.

So here is the question: What is the proper way to QUICKLY triage which server 
is having problems without using the WUT or Dev Studio/Migrator/Import?

William Rentfrow


