Re: ARS 7.1 server group issue

Tony Worthington Wed, 25 Feb 2009 10:26:32 -0800

Was thread logging enabled when the server was started?

The Init should look like this:


<THRD> /* Sun Feb 22 2009 02:36:07.5150 */ Thread Trace Log -- ON (AR 
Server 7.1.00 Patch 002 200802011900)
<THRD> /* Sun Feb 22 2009 02:36:15.4060 */ Thread Id 3076 (thread number 
0) Thread Manager started.
<THRD> /* Sun Feb 22 2009 02:36:15.4060 */ Thread Id 3080 (thread number 
1) timed call thread started.
<THRD> /* Sun Feb 22 2009 02:36:15.4060 */ Thread Id 3084 (thread number 
2) on ADMIN queue started.
<THRD> /* Sun Feb 22 2009 02:36:19.5150 */ InitServerCache Begin
<THRD> /* Sun Feb 22 2009 02:43:20.3880 */ InitServerCache End: 
rpcCallProc=0 tid=3084

And re-caches look like this:

<THRD> /* Fri Feb 20 2009 13:17:16.3370 */ CopyCache Begin: 
rpcCallProc=10002 user="Remedy Application Service" tid=2808 rpcId=0
<THRD> /* Fri Feb 20 2009 13:19:27.7490 */ CopyCache End
<THRD> /* Fri Feb 20 2009 13:22:38.8550 */ FreeServerCache: rpcCallProc=5 
user="blah" tid=5776 rpcId=1761714

Can you verify that the server completes an InitServerCache before 
performing a CopyCache?
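
A minimal sketch of that check in Python, assuming each thread-log entry sits on one line in the real log file (the excerpts quoted here are wrapped by the mail client). The function name and the regular expression are illustrative, not part of AR System:

```python
import re

# Shape of the thread-log lines quoted above, e.g.
# <THRD> /* Sun Feb 22 2009 02:36:19.5150 */ InitServerCache Begin
THRD_LINE = re.compile(r"^<THRD> /\* (?P<stamp>.+?) \*/ (?P<text>.*)$")


def init_completed_before_copy(lines):
    """Return True if 'InitServerCache End' is logged before the first
    'CopyCache Begin'. If no CopyCache is logged at all, return whether
    an 'InitServerCache End' was seen."""
    init_done = False
    for line in lines:
        match = THRD_LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not a thread-log entry
        text = match.group("text")
        if text.startswith("InitServerCache End"):
            init_done = True
        elif text.startswith("CopyCache Begin"):
            return init_done
    return init_done
```

Feed it the lines of the thread log file; a False result means a CopyCache started (or the log ended) without a completed InitServerCache.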

  Tony Worthington
  Sr. Technical Analyst
  Kohl's Department Stores
  N56 W17000 Ridgewood Drive
  Menomonee Falls, WI 53051
  262.703.5911 (phone)
  tony.worthing...@kohls.com
  www.Kohls.com




From: Anthony K R <anthony_rathna...@dell.com>
To: arslist@ARSLIST.ORG
Date: 02/25/2009 10:40 AM
Subject: Re: ARS 7.1 server group issue
Sent by: "Action Request System discussion list(ARSList)" <arslist@ARSLIST.ORG>



Why is it doing "InitServerCache" instead of "CopyCache"?
 
-Anthony
 
From: Action Request System discussion list(ARSList) [
mailto:arsl...@arslist.org] On Behalf Of Walters, Mark
Sent: Wednesday, February 25, 2009 10:02 PM
To: arslist@ARSLIST.ORG
Subject: Re: ARS 7.1 server group issue
 
OK - that's just a failure of the admin thread then.  Another plus of 
Windows is that we seem to be able to handle individual thread failures 
more gracefully than Unix.  In this case the admin thread is getting a 
malloc error, dying and restarting to try again. 
 
Mark
 
From: Action Request System discussion list(ARSList) [
mailto:arsl...@arslist.org] On Behalf Of Anthony K R
Sent: 25 February 2009 15:36
To: arslist@ARSLIST.ORG
Subject: Re: ARS 7.1 server group issue
 
Mark,
 
No entry seen in armonitor.log, but the arerror.log says:
 
Wed Feb 25 05:32:28 2009  390600 : Malloc failed on server (ARERR 300)
Wed Feb 25 05:32:28 2009  390600 : AR System server terminated -- fatal 
error encountered (ARNOTE 21)
 
Thanks,
Anthony
 
From: Action Request System discussion list(ARSList) [
mailto:arsl...@arslist.org] On Behalf Of Walters, Mark
Sent: Wednesday, February 25, 2009 5:17 PM
To: arslist@ARSLIST.ORG
Subject: Re: ARS 7.1 server group issue
 
This looks like just the admin thread dying and not the arserver crashing? 
  What do the arerror.log and armonitor.log show at these times?
 
Mark
 
From: Action Request System discussion list(ARSList) [
mailto:arsl...@arslist.org] On Behalf Of Anthony K R
Sent: 25 February 2009 11:38
To: arslist@ARSLIST.ORG
Subject: Re: ARS 7.1 server group issue
 
Here are the entries from the thread log:
<THRD> /* Wed Feb 25 2009 05:16:37.8330 */ Thread Trace Log -- ON (AR 
Server 7.1.00 Patch 005 200809150630)
<THRD> /* Wed Feb 25 2009 05:22:15.5140 */ InitServerCache Begin
<THRD> /* Wed Feb 25 2009 05:22:29.0760 */ FreeServerCache: 
rpcCallProc=10004 user="Remedy Application Service" tid=3076 rpcId=390600
<THRD> /* Wed Feb 25 2009 05:22:29.3260 */ Thread Id 3076 (thread number 
1) on ADMIN queue died.
<THRD> /* Wed Feb 25 2009 05:22:29.3260 */ Thread Id 4600 (thread number 
1) on ADMIN queue restarted.
<THRD> /* Wed Feb 25 2009 05:32:15.5020 */ InitServerCache Begin
<THRD> /* Wed Feb 25 2009 05:32:28.7990 */ FreeServerCache: 
rpcCallProc=10004 user="Remedy Application Service" tid=4600 rpcId=390600
<THRD> /* Wed Feb 25 2009 05:32:29.0490 */ Thread Id 4600 (thread number 
1) on ADMIN queue died.
<THRD> /* Wed Feb 25 2009 05:32:29.0490 */ Thread Id 5916 (thread number 
1) on ADMIN queue restarted.
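
A rough Python sketch for spotting this die/restart pattern in a larger thread log. It assumes each entry is on a single line in the actual file (the excerpt above is wrapped by the mail client); the regular expression is modelled only on the lines shown here, and the function name is hypothetical:

```python
import re

# Modelled on: Thread Id 3076 (thread number 1) on ADMIN queue died.
THREAD_EVENT = re.compile(
    r"Thread Id (?P<tid>\d+) \(thread number (?P<num>\d+)\) "
    r"on (?P<queue>\w+) queue (?P<event>died|restarted)\."
)


def thread_deaths(lines):
    """Return a list of (queue, thread_id) for every 'died' entry."""
    deaths = []
    for line in lines:
        match = THREAD_EVENT.search(line)
        if match and match.group("event") == "died":
            deaths.append((match.group("queue"), int(match.group("tid"))))
    return deaths
```

Repeated deaths on the same queue a few minutes apart, as in the excerpt, would show up as several entries for that queue.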
 
 
Regards,
Anthony
 
From: Rathnappa, Anthony 
Sent: Wednesday, February 25, 2009 4:57 PM
To: arslist@ARSLIST.ORG
Subject: RE: ARS 7.1 server group issue
 
I have verified that the boot.ini file has the /3GB switch. Also, using the 
"dumpbin" tool, I confirmed that arserver can address more than 2GB.
After startup the memory consumed is ~1.3GB, as shown in Task Manager. 
This is still a pre-prod env, so there are no users.
 
In the Dev env, I had used the "CopyCache Begin" flag, where the log showed 
only "CopyCache Begin:" but no "CopyCache End".
 
Will enable both flags and update you.
 
Thanks,
Anthony
 
From: Action Request System discussion list(ARSList) [
mailto:arsl...@arslist.org] On Behalf Of Walters, Mark
Sent: Wednesday, February 25, 2009 1:46 PM
To: arslist@ARSLIST.ORG
Subject: Re: ARS 7.1 server group issue
 
By default the maximum memory arserver can access on 32-bit Windows is 
2GB.  If it tries to grow beyond this then it will fail.  This is an OS 
limitation that can be changed to 3GB by the addition of the /3GB switch 
to the appropriate line in the boot.ini file.  See 
http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx and 
many of the other pages returned by a Google search for "windows 3gb boot.ini".
 
The arserver is compiled with the large address aware flag that enables it 
to make use of the additional 1GB of RAM provided by this switch. 
 
However, I'd be interested to understand why your arserver process is 
getting so large that it is reaching the 2GB limit.  How much memory does 
arserver.exe consume after startup - at the point that users can log in? 
How many concurrent users?  The initial size of the process is largely 
determined by the amount of forms and workflow that you have on the system, 
as these are all read in to the server to create the cache.  If you have a 
full ITSM system with multiple language packs the initial size could be in 
excess of 700MB.  Once it is up and running the server will increase in 
size as it allocates memory to handle its day-to-day work - processing 
query results and so on.  One of the advantages of the Windows platform is 
that once the server releases the memory it is returned to the OS and the 
footprint should shrink again.  If the maximum process size (2 or 3 GB 
depending on the flag above) minus the current size of arserver is LESS 
than the startup size, a recache operation is likely to fail.
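
The headroom rule in the last sentence can be written out as a small Python sketch. The function name is hypothetical and the sizes in the usage lines are purely illustrative, not measurements from any server:

```python
MB = 1024 ** 2
GB = 1024 ** 3


def recache_likely_to_fail(max_process_bytes, current_bytes, startup_bytes):
    """Apply the rule of thumb: a recache needs roughly one more
    startup-sized copy of the cache, so it is at risk when the remaining
    address space is smaller than the startup size."""
    headroom = max_process_bytes - current_bytes
    return headroom < startup_bytes


# Illustrative numbers only: a 700MB startup size and a process that has
# grown to 1.5GB.
at_risk_2gb = recache_likely_to_fail(2 * GB, 1536 * MB, 700 * MB)
at_risk_3gb = recache_likely_to_fail(3 * GB, 1536 * MB, 700 * MB)
```

With these example figures the 2GB limit leaves only 512MB of headroom, less than the 700MB startup size, while the 3GB limit leaves 1.5GB.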
 
Things that you could do:
 
· Enable the /3GB option
· If your startup size is very large, look to remove unused views, 
forms and workflow from the system
· Set Large-Result-Logging-Threshold: 100000 in ar.cfg and enable 
thread logging on the secondary servers - this will show you if you have 
users running queries returning large datasets and consuming memory. 
· Set Copy-Cache-Logging: T too - this will record the recache 
operations in the thread log.  You want to make sure that you see the 
FreeServerCache that indicates that the server has released the original 
copy of the cache.  If you have long-running API calls it is possible for 
the server to end up with more than 2 copies of the cache - if this is a 
large cache you can very quickly hit the memory limit.
E.g. this is bad - multiple copies - you want to see a begin, end and free 
before the next begin:
CopyCache Begin: rpcCallProc=10002 user="Remedy Application Service" tid=5 
rpcId=0
CopyCache End
CopyCache Begin: rpcCallProc=10002 user="Remedy Application Service" tid=5 
rpcId=0
CopyCache End
FreeServerCache: rpcCallProc=10018 user="Remedy Application Service" tid=5 
rpcId=1178442632
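
A minimal Python sketch of the "begin, end and free before the next begin" rule, assuming one log entry per line and matching only on the entry names shown above. It is a simplification that does not model the server's actual cache accounting, and the function name is hypothetical:

```python
def count_overlapping_recaches(lines):
    """Count 'CopyCache Begin' entries that arrive before the previous
    copy has been followed by a 'FreeServerCache' entry."""
    overlaps = 0
    awaiting_free = False  # True between a CopyCache Begin and the next free
    for line in lines:
        if "CopyCache Begin" in line:
            if awaiting_free:
                overlaps += 1
            awaiting_free = True
        elif "FreeServerCache" in line:
            awaiting_free = False
    return overlaps
```

Applied to the five entries above it reports one overlap (the second begin arrives before any free); a clean begin, end, free sequence reports zero.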
 
Incidentally, if you are using 64-bit Windows I believe the maximum 
size of a large-address-aware 32-bit application is 4GB by default 
- http://msdn.microsoft.com/en-us/library/ms791558.aspx
 
Mark Walters
 
The opinions, statements, and/or suggested courses of action expressed in 
this E-mail do not necessarily reflect those of BMC Software, Inc.  My 
voluntary participation in this forum is not intended to convey a role as 
a spokesperson, liaison or support representative for BMC Software, Inc.
 
 
From: Action Request System discussion list(ARSList) [
mailto:arsl...@arslist.org] On Behalf Of Anthony K R
Sent: 25 February 2009 07:17
To: arslist@ARSLIST.ORG
Subject: Re: ARS 7.1 server group issue
 
Joe,
 
The chunk setting should not cause a malloc error. There is no timeout issue 
either.
 
Today I looked at the memory consumption report from when the recache was 
triggered on the secondary servers. It crosses 2GB before the malloc error. 
Is this a memory limitation of the OS or of the arserver process?
 
 
Regards,
Anthony
 
 
 
From: Action Request System discussion list(ARSList) [
mailto:arsl...@arslist.org] On Behalf Of Joe DeSouza
Sent: Wednesday, February 25, 2009 7:50 AM
To: arslist@ARSLIST.ORG
Subject: Re: ARS 7.1 server group issue
 
It's a known issue that ARS on Windows connected to a remote Oracle 
database takes forever to recache, and takes forever to restart if the 
services have been stopped and restarted. This is because of the way that 
data is read in chunks of 100 rows. It is as designed, and Remedy has 
nothing to do with the design, as it's more about how the Oracle client 
communicates with remote Oracle databases when the client is on Windows.
 
I didn't experience the kinds of problems you are talking about on UNIX 
ARS Servers connected to remote Oracle databases.
 
So I guessed your configuration from the symptoms you described. 
Unfortunately you have to live with it unless you decide to move to UNIX.
 
Joe
 

From: Lyle Taylor <tayl...@ldschurch.org>
To: arslist@ARSLIST.ORG
Sent: Tuesday, February 24, 2009 6:02:40 PM
Subject: Re: ARS 7.1 server group issue
Correct.
 
From: Action Request System discussion list(ARSList) [
mailto:arsl...@arslist.org] On Behalf Of Joe DeSouza
Sent: Tuesday, February 24, 2009 3:20 PM
To: arslist@ARSLIST.ORG
Subject: Re: ARS 7.1 server group issue
 
Your AR Servers are probably on Windows and connect to Oracle set up as a 
remote database?
 
Joe
 

From: Lyle Taylor <tayl...@ldschurch.org>
To: arslist@ARSLIST.ORG
Sent: Tuesday, February 24, 2009 4:27:56 PM
Subject: Re: ARS 7.1 server group issue

I see server groups as being more useful for load balancing and 
redundancy.  While you can indeed have users on the other systems while 
you perform the updates, the other servers become nearly unusable as the 
cache updates, especially for anything other than very minor changes. I've 
had fewer issues when I simply bring down the other servers during the 
changes and then bring them back up again afterwards.  In my experience, that 
actually provides a better user experience, because knowing that it's down 
for a short time is easier to deal with than extremely slow performance 
during a cache update.
 
Lyle
 

