ulimit = unlimited.
Looking at all the running services (pfiles pid): all are under 500 open files.

prstat -J shows arserverd with 1654 memory in use and 1615 RSS.

No core dumps, no crashes.

On Fri, Sep 23, 2011 at 9:38 AM, Garrison, Sean (Norcross) <
sean.garri...@fiserv.com> wrote:

> Try running ulimit -a to find the limits available to the user running
> Remedy (make sure you are logged in as that user).
>
> Check a couple of things:
>
> Find the process ID of the arserverd process and run pfiles
> {pid} against it, then compare the result to the ulimit for open files.
>
> Run top and press Shift-M to sort by memory.  Check whether the arserverd
> process is growing beyond your ulimit.
>
> I had an issue in the past (Remedy 6.0) on Solaris where the
> “number of open files” limit was exceeded by Remedy – especially when I used web
> services.  I only found this by running a script every few minutes that ran
> pfiles and logged the output to a file.  Just before the crash, Remedy hit the
> maximum number of open files.
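
A rough sketch of the kind of check such a logging script could do, in Python. The pfiles output layout assumed here (a "Current rlimit" line, then one "<fd>: S_IF..." line per descriptor) is from memory and may differ on your Solaris release; the function names and the sample text are made up for illustration and are not from the thread.

```python
import re

# Assumed shape of Solaris `pfiles` output: a "Current rlimit" header line
# followed by one "<fd>: S_IF..." line per open descriptor.
RLIMIT_RE = re.compile(r"Current rlimit:\s+(\d+|unlimited) file descriptors")
FD_RE = re.compile(r"^\s*(\d+):\s+S_IF", re.MULTILINE)


def summarize_pfiles(text):
    """Return (open_fd_count, rlimit) parsed from pfiles output.

    rlimit is an int, or None when pfiles reports "unlimited" or no
    rlimit line is present.
    """
    open_fds = len(FD_RE.findall(text))
    match = RLIMIT_RE.search(text)
    rlimit = None
    if match and match.group(1) != "unlimited":
        rlimit = int(match.group(1))
    return open_fds, rlimit


def near_limit(open_fds, rlimit, threshold=0.9):
    """True when the open descriptor count is within threshold of the limit."""
    return rlimit is not None and open_fds >= threshold * rlimit


# Hypothetical sample, not taken from the thread.
SAMPLE = """\
15277:  ./arserverd
  Current rlimit: 256 file descriptors
   0: S_IFCHR mode:0666 dev:292,0 ino:6815752 uid:0 gid:3 rdev:13,2
      O_RDONLY
   1: S_IFREG mode:0644 dev:85,5 ino:12345 uid:100 gid:100 size:2048
      O_WRONLY|O_APPEND
   2: S_IFREG mode:0644 dev:85,5 ino:12345 uid:100 gid:100 size:2048
      O_WRONLY|O_APPEND
"""

if __name__ == "__main__":
    count, limit = summarize_pfiles(SAMPLE)
    print(count, limit, near_limit(count, limit))
```

Run from cron every few minutes against the real pfiles output and append the result with a timestamp to a file; the last entries before a restart then show whether the descriptor count was climbing toward the limit.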
>
> I’m not saying this is your issue, but it would help eliminate a
> possibility.
>
> Thanks,
>
> Sean
>
> *From:* Action Request System discussion list(ARSList) [mailto:
> arslist@ARSLIST.ORG] *On Behalf Of* Ben Chernys
> *Sent:* Friday, September 23, 2011 5:09 AM
> *To:* arslist@ARSLIST.ORG
> *Subject:* Re: ARS 7.1 P6 Server -- 4 days restarting (possible memory OS
> 32bit issue) signal is 11
>
> Hi Mark, Patrick,
>
> Signal 11 is SIGSEGV, which is not necessarily a malloc failure, though
> a malloc failure may indeed lead to it.  It is not always possible to log
> malloc failures – after all, it takes some memory to cut a log record.
>
> A segmentation violation is always the result of bad code: accessing memory
> not allocated to the process or not in the process's address space.  Address 0
> is a candidate, since that is malloc’s return value on failure.
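
To make the "died with 11" line concrete, here is a small Python sketch that turns a child's exit status into a signal name, roughly the way a monitor process would see it. The function name describe_exit is my own invention, and the child here just sends SIGSEGV to itself; it does not reproduce a real memory fault in arserverd.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Describe a subprocess return code in the style of a monitor log."""
    if returncode < 0:
        # subprocess reports "killed by signal N" as a return code of -N.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # The child raises SIGSEGV on itself; no real memory fault is involved.
    child = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    result = subprocess.run(child)
    print(describe_exit(result.returncode))
```

The point is only that the number in the monitor log identifies the signal, not the cause: the same 11 shows up whether the fault came from a failed malloc, a stack overflow, or any other bad pointer.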
>
> That said, it is possible to avoid triggering the execution path with that
> bad code by altering filters etc., so the route to take is definitely along
> the lines Mark described: the core is always a wealth of info – even
> though ARS will not have debugging compiled in ;-)  I would also turn on all
> server logging (SQL, API, and Filter) with no size limit, all pointing to the
> same file, until the next occurrence.  Then you will have a wealth of ARS
> information to go through.  Generally something will stand out.
>
> Recursive filter loops are usually trapped by the maximum filter limit –
> though if that is set high enough, the process will run out of memory before
> reaching it.  If yours is high, you could try setting it lower.
>
> You may also want to go to a higher patch level if one is available.  I am
> no longer that familiar with the patches available on 7.1.
>
> Also, I know that memory on Solaris may be restricted by the admin.  (I
> forget the commands to determine this – but they will be easily found on the
> web.)  Perhaps ulimit?
>
> Cheers
>
> Ben Chernys
>
> Senior Software Architect
> Software Tool House Inc.
>
> Canada / Deutschland / Germany
> Mobile:      +49 171 380 2329    GMT + 1 + [ DST ]
> Email:       Ben.Chernys _AT_ 
> softwaretoolhouse.com<ben.cher...@softwaretoolhouse.com>
> Web:         www.softwaretoolhouse.com
>
> Check out Software Tool House's free Diary Editor.
>
> *Meta-Update*, our premium ARS Data tool, lets you automate
> your imports, migrations, *in no time at all*, without programming,
> without staging forms, without merge workflow.
> http://www.softwaretoolhouse.com/
>
> *From:* Action Request System discussion list(ARSList) [mailto:
> arslist@ARSLIST.ORG] *On Behalf Of* Walters, Mark
> *Sent:* September-23-11 09:08
> *To:* arslist@ARSLIST.ORG
> *Subject:* Re: ARS 7.1 P6 Server -- 4 days restarting (possible memory OS
> 32bit issue) signal is 11
>
> It may be memory, but I would expect to see malloc errors (ARERR 300) in
> arerror.log if that were the case.  The fact that you’re not seeing a stack
> trace like this:
>
> Mon Sep 20 08:33:52 2010     6
>   Timestamp: Mon Sep 20 2010 08:33:52.1865
>   Thread Id: 4
>   Version: 7.1.00 Patch 009 201009200800
>   ServerName: test71
>   Database: SQL -- Oracle
>   Hardware: sun4u
>   OS: SunOS 5.10
>   RPC Id: 337
>   RPC Call: 106 (GLXS)
>   RPC Queue: 390600
>   Client: User Demo from Remedy Administrator (protocol 13) at IP address 192.168.1.54
>   Form:
>   Logging On:
>
> suggests it may be a recursive filter – on Solaris this often causes a
> crash without logging anything useful.  Check whether there are any
> core files in the server/bin directory, as this is another symptom of this
> type of crash on Solaris.  If cores are enabled (check with the OS coreadm
> command), the server may create them even though you’re not running a
> debug build.
>
> If you do have some core files, run the pstack command against them
> (pstack core) and you will be able to see the stack of each thread within
> the server – if a recursive filter is causing a stack overflow, one
> of the threads should stand out as having a much deeper stack than the others.
> Depending on what you see, you may then need to enable FILTER/SQL logging to
> try to capture the workflow that is causing the crash.  It’s also worth
> checking the Filter-Max-Stack value in ar.conf – various installers set this
> to a very high value – try reducing it back down to 50 or so; this should
> stop most filter recursion crashes and log an error instead.
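
If the pstack output is long, something like the Python sketch below could pick out the thread with the deepest stack. It assumes each thread starts with a dashed "lwp# N / thread# N" header followed by one line per frame, which is how I remember pstack output looking; check it against your own output first. The function names and the sample frames (run_filter and so on) are hypothetical, not real arserverd symbols.

```python
import re

# Assumed pstack layout: a dashed header naming each lwp/thread, followed by
# one line per stack frame. Real output may differ between Solaris releases.
HEADER_RE = re.compile(r"^-+\s+(lwp#.*?)\s+-+\s*$")


def frames_per_thread(pstack_text):
    """Map each thread header to the number of frame lines beneath it."""
    counts = {}
    current = None
    for line in pstack_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            counts[current] = 0
        elif current is not None and line.strip():
            counts[current] += 1
    return counts


def deepest_thread(pstack_text):
    """Return (thread, frame_count) for the deepest stack, or None."""
    counts = frames_per_thread(pstack_text)
    if not counts:
        return None
    thread = max(counts, key=counts.get)
    return thread, counts[thread]


# Hypothetical sample: one shallow thread and one with 400 repeated frames.
SAMPLE = "\n".join(
    ["core 'core' of 15277:\t./arserverd",
     "-----------------  lwp# 1 / thread# 1  --------------------",
     " 0001a2b0 main (1, ffbff0a4)",
     " 00019f18 _start (0, 0)",
     "-----------------  lwp# 4 / thread# 4  --------------------"]
    + [" 0002c000 run_filter (0)"] * 400
)

if __name__ == "__main__":
    print(deepest_thread(SAMPLE))
```

A thread whose frame count is in the hundreds while the rest are in the tens is the one to look at, and the repeated function names in it should hint at which workflow is recursing.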
>
> Mark
>
> I work for BMC, I don’t speak for them.
>
> *From:* Action Request System discussion list(ARSList) [mailto:
> arslist@ARSLIST.ORG] *On Behalf Of* patrick zandi
> *Sent:* 22 September 2011 21:07
> *To:* arslist@ARSLIST.ORG
> *Subject:* ARS 7.1 P6 Server -- 4 days restarting (possible memory OS
> 32bit issue) signal is 11
>
> Just a quick question: on ARS 7.1 P6 on Solaris 10, the operating system
> appears to be telling ARS to shut down about every 4-6 days.  I am
> not positive; there is nothing in the debug logs at all, only in
> ARMONITOR.log, where it says:
>
> 2011     ARMonitor child process (pid:15277) died with 11. And the signal
> is 11.
>
> ./arserverd
>
>
> Can I assume signal 11 means memory?  I have seen a lot of memory issues
> with signal 11 on the ARSList...
>
>
> --
> Patrick Zandi
> _attend WWRUG11 www.wwrug.com ARSlist: "Where the Answers Are"_
>



-- 
Patrick Zandi

_______________________________________________________________________________
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org
attend wwrug11 www.wwrug.com ARSList: "Where the Answers Are"
