ulimit = unlimited. Looking at all the services running: all under 500 open files (pfiles pid).
arserverd says 1654 memory in use, 1615 RSS, with prstat -J. No core bombs, no crashes.

On Fri, Sep 23, 2011 at 9:38 AM, Garrison, Sean (Norcross) <sean.garri...@fiserv.com> wrote:

> Try running ulimit -a to find the limits available to the user running
> Remedy (make sure you are logged in as that user).
>
> Check a couple of things:
>
> Find the process number of the arserverd process and do a pfiles
> {process id}; compare the result to the ulimit for open files.
>
> Do a top and hit Shift-M to sort by memory. Check whether the arserverd
> process is maxing out beyond your ulimit.
>
> I have had an issue in the past (Remedy 6.0) on Solaris where the
> “number of open files” limit was exceeded by Remedy, especially when I
> used web services. I only found this by running a script every few
> minutes for the pfiles and logging it in a file. Just before the crash,
> Remedy hit the max of open files.
>
> I’m not saying this is your issue, but it would help eliminate a
> possibility.
>
> Thanks,
>
> Sean
>
> From: Action Request System discussion list(ARSList) [mailto:arslist@ARSLIST.ORG] On Behalf Of Ben Chernys
> Sent: Friday, September 23, 2011 5:09 AM
> To: arslist@ARSLIST.ORG
> Subject: Re: ARS 7.1 P6 Server -- 4 days restarting (possible memory OS 32bit issue) signal is 11
>
> Hi Mark, Patrick,
>
> Signal 11 is SIGSEGV, which is not necessarily a malloc failure, though
> a malloc failure may indeed lead to it. It is not always possible to log
> malloc failures; after all, it takes some memory to cut a log record.
>
> A segmentation violation is always the result of bad code (accessing memory
> not allocated to the process or not in the process’s address space, of
> which 0 is a candidate (malloc’s return value on failure)).
>
> That being said, it is possible to avoid triggering the execution path
> with that bad code by altering filters etc., so definitely the route to
> go is along the lines Mark described: the core is always a wealth of
> info, even though ARS will not have debugging compiled in ;-) I would
> also turn on all logging (SQL, API, Filter) on the server, unlimited, and
> pointing to the same file until the next occurrence. Then you will have a
> wealth of ARS information to go through. Generally something will stand
> out.
>
> Recursive filter loops are usually trapped by the maximum filter limit,
> though if that is set high enough the process will run out of memory
> before hitting it. If yours is high, you could try setting it lower.
>
> You may also want to go to a higher patch level if one is available. I am
> no longer that familiar with the patches available on 7.1.
>
> Also, I know that memory on Solaris may be restricted by the admin. (I
> forget the commands to determine this, but they will be easily found on
> the web.) ulimits, perhaps?
>
> Cheers
>
> Ben Chernys
> Senior Software Architect
> Software Tool House Inc.
> Canada / Deutschland / Germany
> Mobile: +49 171 380 2329 GMT + 1 + [ DST ]
> Email: Ben.Chernys _AT_ softwaretoolhouse.com <ben.cher...@softwaretoolhouse.com>
> Web: www.softwaretoolhouse.com
>
> Check out Software Tool House's free Diary Editor.
>
> Meta-Update, our premium ARS Data tool, lets you automate
> your imports, migrations, in no time at all, without programming,
> without staging forms, without merge workflow.
> http://www.softwaretoolhouse.com/
>
> From: Action Request System discussion list(ARSList) [mailto:arslist@ARSLIST.ORG] On Behalf Of Walters, Mark
> Sent: September-23-11 09:08
> To: arslist@ARSLIST.ORG
> Subject: Re: ARS 7.1 P6 Server -- 4 days restarting (possible memory OS 32bit issue) signal is 11
>
> It may be memory, but I would expect to see malloc errors (ARERR 300) in
> the arerror.log if this was the case. The fact that you’re not seeing a
> stack trace like this:
>
> Mon Sep 20 08:33:52 2010 6
> Timestamp: Mon Sep 20 2010 08:33:52.1865
> Thread Id: 4
> Version: 7.1.00 Patch 009 201009200800
> ServerName: test71
> Database: SQL -- Oracle
> Hardware: sun4u
> OS: SunOS 5.10
> RPC Id: 337
> RPC Call: 106 (GLXS)
> RPC Queue: 390600
> Client: User Demo from Remedy Administrator (protocol 13) at IP address 192.168.1.54
> Form:
> Logging On:
>
> suggests it may be a recursive filter. On Solaris this often causes a
> crash without logging anything useful. Check whether there are any core
> files in the server/bin directory, as this is another symptom of this
> type of crash on Solaris. If cores are enabled (check with the OS coreadm
> command) then the server may create them even though you’re not running a
> debug build.
>
> If you do have some core files, run the pstack command against them
> (pstack core) and you will be able to see the stack of each thread within
> the server. If it is a recursive filter causing a stack overflow, then
> one of the threads should stand out as being much bigger than the others.
> Depending on what you see, you may then need to enable FILTER/SQL logging
> to try and capture the workflow that is causing the crash.
> It’s also worth checking the Filter-Max-Stack value in ar.conf. Various
> installers set this to a very high value; try reducing it back down to 50
> or so, and this should stop most filter recursion crashes and log an
> error instead.
>
> Mark
>
> I work for BMC, I don’t speak for them.
>
> From: Action Request System discussion list(ARSList) [mailto:arslist@ARSLIST.ORG] On Behalf Of patrick zandi
> Sent: 22 September 2011 21:07
> To: arslist@ARSLIST.ORG
> Subject: ARS 7.1 P6 Server -- 4 days restarting (possible memory OS 32bit issue) signal is 11
>
> Just a quick question: ARS 7.1 P6 on Solaris 10. I am seeing the
> operating system telling the ARS to shut down about every 4-6 days.
> Not positive; there is nothing in the debugging logs at all, only in the
> ARMONITOR.log, where it says:
>
> 2011 ARMonitor child process (pid:15277) died with 11. And the signal
> is 11.
> ./arserverd
>
> Can I assume signal 11 is memory? I have seen a lot of memory issues
> with signal 11 on the ARSList...
>
> --
> Patrick Zandi
> _attend WWRUG11 www.wwrug.com ARSlist: "Where the Answers Are"_

--
Patrick Zandi
_______________________________________________________________________________
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org
attend wwrug11 www.wwrug.com ARSList: "Where the Answers Are"
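
Sean's suggestion above (run a script every few minutes against the arserverd process and log the open-file count) could be sketched roughly as follows. This is only a sketch under assumptions, not the script Sean used: it is written in Python 3, which nobody in the thread mentions; it counts entries under /proc/<pid>/fd rather than parsing pfiles output; it assumes it runs as the Remedy user or root so that directory is readable; and the function names, the 80% warning threshold, and the log line format are all made up for illustration.

```python
import os
import resource
import time


def count_open_fds(pid="self"):
    """Return the number of entries in /proc/<pid>/fd, one per open descriptor."""
    return len(os.listdir("/proc/%s/fd" % pid))


def own_soft_limit():
    """Soft open-file limit of this process, or None when it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return None if soft == resource.RLIM_INFINITY else soft


def format_sample(pid, open_fds, limit, warn_ratio=0.8):
    """Build one log line; flag it when usage reaches warn_ratio of the limit."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    if limit is None:
        # Hypothetical convention: None means "ulimit reports unlimited".
        return "%s pid=%s open_fds=%d limit=unlimited" % (stamp, pid, open_fds)
    flag = " WARNING" if open_fds >= warn_ratio * limit else ""
    return "%s pid=%s open_fds=%d limit=%d%s" % (stamp, pid, open_fds, limit, flag)


def log_sample(pid, limit, logfile):
    """Append a single sample for pid to logfile and return the line written."""
    line = format_sample(pid, count_open_fds(pid), limit)
    with open(logfile, "a") as handle:
        handle.write(line + "\n")
    return line
```

Called from cron every few minutes, something like `log_sample(15277, None, "/tmp/arserverd_fds.log")` would build up the history to inspect after the next crash. The pid is the one from the ARMONITOR.log line quoted above, `None` reflects the "unlimited" ulimit reported at the top of the thread, and the log path is a placeholder. `own_soft_limit()` only reports the limit of the monitoring script itself, so it is a stand-in for the value `ulimit -a` shows for the Remedy user.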