On Fri, 2010-02-05 at 16:49 -0600, mkitchin.pub...@gmail.com wrote:
> On 2/5/2010 4:32 PM, Scott Lawrence wrote:
> > Agreed, but so far the only information we could put in an issue is
> > "sometimes freeswitch sucks up the entire cpu".  We don't know what
> > configurations those systems have, or what event triggers it.  As one of
> > the people that screens New issues, I can promise that that report would
> > just be returned with a request for more information.
> >
> > If someone can get a core file while this is happening, and take a
> > snapshot that includes the symbol data (this must be done from the
> > command line: see sipx-snapshot --help) and attach that to an issue,
> > then we'd have a starting point.
> >
> >    
> I will be more than happy to. I want to make sure I get the steps right, 
> because I would prefer to get my server out of this state as soon as 
> possible when it does happen.
> 
> 1) kill -6 <pid> (get core file)
> 2) sipx-snapshot
> 
> On the snapshot, sipx-snapshot --help does not contain the word 
> 'symbol'. Is there a certain switch that needs to be used?


I would set the logging on DEBUG for Freeswitch, INFO for the proxy,
registrar, sipXivr and the conference system, and NOTICE for everything
else (making that change effective requires restarting things).

Once the problem is detected, log in as root.  Find the runaway process
pid; check to see if it's the same as the process id in the
file /var/run/sipxecs/freeswitch.pid (interesting if not), and then kill
the process and note the time in UTC and locally:

  kill -6 <pid> ; date -u +%R ; date +%R
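
For the pid comparison mentioned above, here is a minimal sketch.  It
assumes the pid file holds just the process id; same_pid is a
hypothetical helper name for illustration, not something shipped with
sipXecs.

```shell
# Hypothetical helper: succeeds (exit 0) when the given pid matches the
# pid recorded in the given pid file; returns 2 if the file is unreadable.
same_pid() {
  pid="$1"
  pidfile="$2"
  [ -r "$pidfile" ] || return 2
  # Strip whitespace/newlines so the comparison is on the bare number.
  recorded=$(tr -d '[:space:]' < "$pidfile")
  [ "$pid" = "$recorded" ]
}

# Usage (substitute the runaway process id for <pid>):
#   same_pid <pid> /var/run/sipxecs/freeswitch.pid \
#     && echo "matches pid file" || echo "does NOT match (interesting)"
```

Run the check before the kill, and mention in the issue whether the
two pids matched.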

You don't need to do anything to restart freeswitch - the supervisor
will do that on its own when it sees that the process has died.

cd to the logs directory - that's where the core file should be:

  cd /var/log/sipxpbx
  ls -l core*

Find the one created at the time you killed the process.
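
If there are several core files, something like the following picks the
most recently modified one.  This is only a sketch: it assumes the core
file names begin with "core" (as in the ls above), and newest_core is a
hypothetical helper name.

```shell
# Hypothetical helper: print the most recently modified core* file in a
# directory (default: the sipXecs log directory named above).
newest_core() {
  dir="${1:-/var/log/sipxpbx}"
  # ls -t sorts newest first; head keeps only the first entry.
  ls -t "$dir"/core* 2>/dev/null | head -n 1
}

# Usage:
#   newest_core
```

Check that its timestamp matches the time you noted when you killed the
process before attaching it.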

Then include that in the snapshot:

  sipx-snapshot --core <core-file-name> 

If you can identify the time that the problem started, you can (possibly
greatly) reduce the size of the snapshot by adding time qualifiers:

  --start-time <utc-time> --stop-time <time-from-date-dash-u-above>

where <utc-time> is a universal (aka GMT) time a little while (10
minutes?) before the problem started - err on the side of early
rather than late, but let's try to avoid gigabyte snapshot files.  Use
the difference between the two times printed above (UTC and local) to
calculate the time offset.
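
As a rough illustration of the offset arithmetic, the sketch below
converts a local time to UTC and backs it off by some minutes.  It
assumes GNU date (for -d), and utc_before is a hypothetical helper name.
The output format here is my assumption; check sipx-snapshot --help for
the time format it actually expects.

```shell
# Hypothetical helper: given a local time and a number of minutes, print
# the UTC time that many minutes earlier.  Requires GNU date.
utc_before() {
  when="$1"      # local time, e.g. "2010-02-05 16:49"
  minutes="$2"   # how far to back off, e.g. 10
  # Interpret the time in the local zone and convert to epoch seconds.
  epoch=$(date -d "$when" +%s) || return 1
  # Subtract the margin and print the result in UTC.
  date -u -d "@$((epoch - minutes * 60))" '+%Y-%m-%d %H:%M'
}

# Usage:
#   utc_before "2010-02-05 16:49" 10
```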




_______________________________________________
sipx-users mailing list sipx-users@list.sipfoundry.org
List Archive: http://list.sipfoundry.org/archive/sipx-users
Unsubscribe: http://list.sipfoundry.org/mailman/listinfo/sipx-users
sipXecs IP PBX -- http://www.sipfoundry.org/
