A good way to find out what is burning CPU, if the logs have no useful info, is to get a stack trace of all threads, several times, then just look at them and see what's going on. Even better is to use top's 'H' option to find out which specific thread IDs are burning CPU and correlate that with the stack traces. Best is to run oprofile (which is kind of a PITA to set up) and see where all the CPU is going with the calltree option.
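As a rough sketch of that correlation step: on Linux the thread IDs top shows in 'H' mode are the same numbers gdb prints as "LWP" in each thread header, so a few lines of Python can pick out the hot threads from several saved stack dumps and count where they are sitting. This assumes the usual gdb header form 'Thread N (Thread 0x... (LWP id)):'. The sample text, the addresses, and the function name worker_loop are made up for illustration and are not real freeswitch output.

```python
import re
from collections import Counter

# gdb thread header, e.g. "Thread 2 (Thread 0x41e0d940 (LWP 3725)):"
THREAD_HEADER = re.compile(r"^Thread \d+ \(.*?LWP (\d+)\)")
# gdb frame line, e.g. "#0  0x0000003c0a8cd1c3 in poll () from /lib64/libc.so.6"
FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?([^\s(]+)")


def parse_stacks(gdb_output):
    """Map each LWP id to its list of function names, innermost frame first."""
    stacks = {}
    current = None
    for line in gdb_output.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = int(header.group(1))
            stacks[current] = []
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            stacks[current].append(frame.group(1))
    return stacks


def hot_frame_counts(samples, hot_tids):
    """Count how often each hot thread is seen in each innermost function."""
    counts = Counter()
    for sample in samples:
        stacks = parse_stacks(sample)
        for tid in hot_tids:
            frames = stacks.get(tid)
            if frames:
                counts[(tid, frames[0])] += 1
    return counts


# Hypothetical dump, only to show the expected input shape.
SAMPLE = """\
Thread 2 (Thread 0x41e0d940 (LWP 3725)):
#0  0x0000003c0a8cd1c3 in poll () from /lib64/libc.so.6
#1  0x00002b5e8f2a1b10 in worker_loop (arg=0x0) at worker.c:42
Thread 1 (Thread 0x2b5e8f0c4e60 (LWP 3719)):
#0  0x0000003c0a89a541 in nanosleep () from /lib64/libc.so.6
"""

if __name__ == "__main__":
    # Pretend top's 'H' mode showed thread 3725 as the busy one.
    for (tid, func), n in hot_frame_counts([SAMPLE, SAMPLE], [3725]).most_common():
        print(tid, func, n)
```

If a hot thread keeps showing up in the same innermost function across samples, that function is the first place to look.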
I'm happy to try to help read a few stack dumps. I've already chased down the continuous 5% CPU issue this way (it seems to be 'by design', unfortunately: it idles/polls at a high rate). This isn't guaranteed to work, but it might give a hint as to what is getting stuck in Freeswitch and help point the direction for next steps.

Grabbing a core file (with gcore) is OK too, but you really want a few samples, and to start with only the stacks are interesting, not a full dump of memory (the core might be interesting later though).

1) As root, run 'ps ax' or 'top' to find the main pid of the freeswitch process burning cycles.
2) In top, turn on thread mode by typing 'H' and try to get a list of freeswitch thread IDs with high CPU.
3) Save all output from here on (use 'script xxx.out' or whatever tool you're familiar with).
4) 'gdb /proc/12345/exe 12345' (note this will suspend freeswitch while you do this). Be sure to use the main process ID, not one of the thread IDs.
5) 'info threads' - this should show 15+ threads. If it shows only 1 you didn't use the main pid; try 'ps ax | grep frees'.
6) 'thread apply all bt' - this will spew a stack trace for each thread. Do this a few times to get a rough sample; between each, do 'cont' to resume freeswitch, wait a bit, then hit ctrl-c (in gdb).
7) 'quit' out of gdb and say it's OK to detach the running process; freeswitch will continue running 'normally'.
8) For good measure (still as root) run 'lsof -p 12345' to get a list of open files. This frequently gives a clue as to what the process is doing.

When you look at the thread stacks, pay specific attention to the ones you determined were running at high CPU from top. If you post or send me the stacks and lsof output (and which threads were burning the CPU) I can help look for obvious clues.

The other easy debugging tool to use is 'strace'. You can run (for a short time) something like 'strace -f -o /tmp/xxx.out -s 1000 -p 12345', which will log every system call freeswitch is making.
Run this for a few seconds then hit ctrl-c (it will make a huge file and slow down freeswitch while it's running). If the stuck threads are interacting with the kernel this is a great way to get a clue as to what's going on in them (maybe we'll see it playing the same file over and over into a closed socket, for instance).

-Eric

On Feb 4, 2010, at 8:58 AM, mkitchin.pub...@gmail.com wrote:

> Rebooted last night, and it started up again a little while ago. I'm already
> at about 200% processor utilization. This looks like it could be a disaster
> for me.
> I've taken a snapshot of the logs to preserve any evidence that might be
> there.
>
> Cpu0 : 98.3%us, 0.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
> Cpu1 : 99.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 8177104k total, 3057448k used, 5119656k free, 138992k buffers
> Swap: 10223608k total, 0k used, 10223608k free, 1730152k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 3719 sipxchan 18 0 298m 46m 4888 S 196.7 0.6 273:22.00 freeswitch
> 3915 sipxchan 21 0 1438m 113m 9068 S 1.3 1.4 1:18.14 java
>
> [r...@nshpbx1 sipxpbx]# ps aux |grep "3719"
> root 790 0.0 0.0 61156 724 pts/0 S+ 08:58 0:00 grep 3719
> 500 3719 35.0 0.5 305748 47672 ? Sl Feb03 280:04
> /usr/local/freeswitch/bin/freeswitch -conf /etc/sipxpbx/freeswitch/conf -db
> /var/sipxdata/tmp/freeswitch -log /var/log/sipxpbx -htdocs
> /etc/sipxpbx/freeswitch/conf/htdocs -nc -nf -nosql
>
> On 2/3/2010 5:35 PM, mkitchin.pub...@gmail.com wrote:
>>
>> After reading a post today that mentioned size limits in posts, I realized
>> these posts never made it through because it had 2 screenshots. I have
>> removed the pictures and am resending.
>> Any help would be greatly appreciated!
>>
>> On 2/3/2010 12:05 PM, mkitchin.pub...@gmail.com wrote:
>>>
>>> I rebooted last night to resolve the problem. It just started happening
>>> again.
>>>
>>> It looks like there was a bug report briefly open about something similar:
>>> http://track.sipfoundry.org/browse/XX-5881
>>>
>>> Any ideas on this one? I will gladly provide any info that would help.
>>>
>>> Cpu0 : 12.0%us, 1.3%sy, 0.0%ni, 85.4%id, 0.0%wa, 0.0%hi, 1.3%si, 0.0%st
>>> Cpu1 : 92.0%us, 0.3%sy, 0.0%ni, 7.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>> Mem: 8177104k total, 2425588k used, 5751516k free, 152772k buffers
>>> Swap: 10223608k total, 0k used, 10223608k free, 1017704k cached
>>>
>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>> 13533 sipxchan 18 0 281m 33m 4944 S 99.5 0.4 25:31.50 freeswitch
>>>
>>> [r...@nshpbx1 sipxpbx]# ps aux |grep "13533"
>>> 500 13533 2.9 0.4 288392 34092 pts/0 Sl Feb02 26:32
>>> /usr/local/freeswitch/bin/freeswitch -conf /etc/sipxpbx/freeswitch/conf -db
>>> /var/sipxdata/tmp/freeswitch -log /var/log/sipxpbx -htdocs
>>> /etc/sipxpbx/freeswitch/conf/htdocs -nc -nf -nosql
>>
>>> I see others with the same issue:
>>> http://list.sipfoundry.org/archive/sipx-dev/msg21612.html
>>> I am not subscribed to the dev list. I don't think I could contribute too
>>> much. The discussion on this one seems to have died off. It would seem to
>>> me this is a pretty significant problem.
>>>
>>>>
>>>> On 2/2/2010 12:55 PM, mkitchin.pub...@gmail.com wrote:
>>>>>
>>>>> I know there is a similar thread about this, but it was a little
>>>>> different and I didn't want to hijack it.
>>>>> http://list.sipfoundry.org/archive/sipx-users/msg21074.html
>>>>>
>>>>> A freeswitch process has started using a large amount of CPU on my
>>>>> server. I can't see any obvious reason why.
>>>>>
>>>>> Tasks: 151 total, 1 running, 150 sleeping, 0 stopped, 0 zombie
>>>>> Cpu0 : 50.2%us, 0.3%sy, 0.0%ni, 48.8%id, 0.0%wa, 0.3%hi, 0.3%si, 0.0%st
>>>>> Cpu1 : 53.0%us, 0.3%sy, 0.0%ni, 46.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>>>> Mem: 8177104k total, 2887232k used, 5289872k free, 183680k buffers
>>>>> Swap: 10223608k total, 0k used, 10223608k free, 1232760k cached
>>>>>
>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>>> 3811 sipxchan 18 0 296m 49m 4944 S 97.8 0.6 1133:41 freeswitch
>>>>> 3591 sipxchan 18 0 1541m 403m 11m S 1.3 5.1 38:52.22 java
>>>>> 3946 sipxchan 18 0 1418m 197m 9100 S 1.3 2.5 25:33.71 java
>>>>> 3922 sipxchan 19 0 1379m 196m 9184 S 1.0 2.5 8:33.80 java
>>>>> 10359 postgres 15 0 121m 13m 10m S 1.0 0.2 2:49.72 postmaster
>>>>>
>>>>> This may provide some details as to what the process is doing:
>>>>> [r...@nshpbx1 sipxpbx]# ps aux |grep "freeswitch"
>>>>> 500 3811 13.9 0.6 303420 50768 ? Sl Jan27 1134:08
>>>>> /usr/local/freeswitch/bin/freeswitch -conf /etc/sipxpbx/freeswitch/conf
>>>>> -db /var/sipxdata/tmp/freeswitch -log /var/log/sipxpbx -htdocs
>>>>> /etc/sipxpbx/freeswitch/conf/htdocs -nc -nf -nosql
>>>>>
>>>>> Local CPU monitoring seems to have died shortly after it registered the
>>>>> spike in CPU.
>>>>> <removed - picture of sipx CPU stats showing CPU stat collection died>
>>>>>
>>>>> Remote monitoring is still recording the high CPU utilization:
>>>>> <removed - picture of zenoss showing CPU usage went way up and stayed up>
>>>>>
>>>>> I only have 1 warning entry in freeswitch.log from yesterday, and none
>>>>> from today.
>>>>> 2010-02-01 07:52:10 [WARNING] switch_core_file.c:119
>>>>> switch_core_perform_file_open() Sample rate doesn't match
>>>>>
>>>>> I only have a few active calls right now, and none active for more than
>>>>> an hour.
>>>>>
>>>>> Anyone have any idea what might be causing this?
>>>>>
>>>>> CentOS 5.4 64 Bit, Sipx 4.0.4, sipXbridge, Verizon VOIP, No firewall (not
>>>>> needed, private connection), Polycom 450s and 550s - bootrom 4.2.1,
>>>>> firmware 3.1.3C split.
>>>>>
>>>>> Thanks as always,
>>>>> Matthew
>>>>>
>>>>
>>>
>>
>
> _______________________________________________
> sipx-users mailing list sipx-users@list.sipfoundry.org
> List Archive: http://list.sipfoundry.org/archive/sipx-users
> Unsubscribe: http://list.sipfoundry.org/mailman/listinfo/sipx-users
> sipXecs IP PBX -- http://www.sipfoundry.org/