If they went away by themselves, they must not have been hung?

On Thu, Mar 5, 2009 at 5:39 PM, Nik Middleton <[email protected]> wrote:
> Well, if it's any consolation, I have a 4-day-ish old copy of SVN and I
> have around 200 of these hung calls, though after an hour or so they did
> seem to clear.
>
> That said, FS made 138,330 call attempts today, not too shabby, and
> throughout, the call quality was as good as the first one. Not sure how
> to debug this one.
>
> Version: FreeSWITCH Version 1.0.trunk (12276)
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Eric
> Liedtke
> Sent: 05 March 2009 23:23
> To: [email protected]
> Subject: Re: [Freeswitch-users] Hung Channels (SVN Rev 10231)
>
> Yup, as I mentioned to brian, I didn't want to clog jira with a bug that's
> been fixed or report against a rev 2k+ revs behind. I was trying to work
> through it as a learning exercise. And yeah, I actually added a bunch of
> stuff to the list_sessions function to spit out a variety of associated
> variables for each session, looking for a pattern somewhere to clue me
> into what might be happening.
>
> No proxy or bypass media here, just defaults.
>
> I will keep at it and once we update the production systems, if the
> problem persists I will open a bug in jira with all the necessary
> goodies.
>
> Thanks
> -e
>
> It seems fuzzy now but I think on Thu, Mar 05, 2009 at 05:55:33PM
> -0500, Mathieu Rene said:
> > Hi,
> >
> > If you suspect a bug, the place to report it is JIRA. See:
> > http://wiki.freeswitch.org/wiki/Reporting_Bugs
> > This gives the whole team a way of following up on issues.
> >
> > Also, can you upgrade to svn trunk? A lot of fixes get committed
> > daily, so it's good to stay up to date.
> >
> > As you seem familiar with GDB, you may symlink the .gdbinit file in
> > the support-d/ folder to your home directory.
> > This will give you some FS-specific macros such as "list_sessions"
> > which will dump a list of uuids to session pointers.
> >
> > In your jira, make sure you include "thread apply all bt",
> > "list_sessions" and show channels (this one goes in FS) but PLEASE
> > update to svn trunk and test again to see if it still happens.
> >
> > Also, are you using proxy/bypass media or just the default?
> >
> > Math
> >
> > On 5-Mar-09, at 5:38 PM, Eric Liedtke wrote:
> >
> > > Greetings,
> > >
> > > I've been using FS in production on this rev (I realize it's pretty
> > > far behind current) and it's been running well, save 1 issue.
> > >
> > > The basic setup is an SBC, 2 GigE ports, 1 public, 1 private. I have
> > > 2 sip profiles created, 1 per ip interface. This is being used to
> > > terminate traffic to a provider so calls are only 1 direction. They
> > > come into the private side profile, get routed via dialplan to the
> > > gateway defined in the external profile and on to the vendor. Pretty
> > > simple.
> > >
> > > I have noticed that under load (50 or so cps with ~800-900 bridged
> > > calls up) some channels on the public side seem to get "stuck" over
> > > time. Due to the nature of how this is being used, I would expect
> > > both sip profiles to show the same number of channels in use any time
> > > I do a 'sofia status' (or at least be within a channel or 2 of each
> > > other). However, after a day of heavy use I had a disparity of ~250
> > > channels. These extra channels also seem to put some continual load
> > > on the 'system cpu' as well, reported via top.
> > >
> > > Of course, due to the load on the box I have to keep logging turned
> > > way down. So I've been trying to troubleshoot it as best I can.
> > >
> > > Last night I grabbed a core file and started in with GDB today. I
> > > found the 120 or so threads that represented real active calls when
> > > I took the corefile. I also found ~250 threads that appeared to be
> > > stuck in the CS_NEW state.
> > > The backtraces on all of them look the same, annotated below.
> > >
> > > I walked through the code path by hand, based on the bt's, and I
> > > don't see how this could be happening unless it's a locking issue.
> > > But as far as I can tell each session has its own mutex defined in
> > > the switch_core_session_t struct, so I wouldn't think they would be
> > > stepping on each other. I also would have expected that if it were
> > > something of a deadlock nature it would stop processing calls
> > > altogether.
> > >
> > > I grabbed the commands from the .gdbinit (super handy btw!!) and
> > > have been trolling through the variables to try to ascertain
> > > something about why these threads seem to be stuck, but am not
> > > having much luck even coming up with a scenario to try to replicate
> > > the issue.
> > >
> > > If anyone has any pointers as to where I might look next it would be
> > > greatly appreciated.
> > >
> > > We will be updating to the newest release soon, however I was hoping
> > > to nail down what is going on so I can systematically replicate it
> > > and verify by testing in the lab that it is fixed, rather than just
> > > pushing the new release to production and hoping.
> > >
> > > Thanks in advance for any tips/pointers anyone may have.
> > >
> > > -e
> > >
> > > ......bt and bt full for a single "hung" thread
> > >
> > >
> > > #0  0xb7fd5410 in __kernel_vsyscall ()
> > > #1  0xb7d14cb6 in nanosleep () from /lib/tls/i686/cmov/libc.so.6
> > > #2  0xb7d4f1dc in usleep () from /lib/tls/i686/cmov/libc.so.6
> > > #3  0xb7ee02cd in switch_sleep (t=1000) at src/switch_time.c:143
> > > #4  0xb7e9da03 in switch_core_session_run (session=0x95fe270) at src/switch_core_state_machine.c:462
> > > #5  0xb7e9c765 in switch_core_session_thread (thread=0x9ada840, obj=0x95fe270) at src/switch_core_session.c:853
> > > #6  0xb7efd916 in dummy_worker (opaque=0x9ada840) at threadproc/unix/thread.c:138
> > > #7  0xb7e034fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
> > > #8  0xb7d55e5e in clone () from /lib/tls/i686/cmov/libc.so.6
> > > (gdb) bt full
> > > #0  0xb7fd5410 in __kernel_vsyscall ()
> > > No symbol table info available.
> > > #1  0xb7d14cb6 in nanosleep () from /lib/tls/i686/cmov/libc.so.6
> > > No symbol table info available.
> > > #2  0xb7d4f1dc in usleep () from /lib/tls/i686/cmov/libc.so.6
> > > No symbol table info available.
> > > #3  0xb7ee02cd in switch_sleep (t=1000) at src/switch_time.c:143
> > > No locals.
> > > #4  0xb7e9da03 in switch_core_session_run (session=0x95fe270) at src/switch_core_state_machine.c:462
> > >         exception = 0 '\0'
> > >         state = <value optimized out>
> > >         endstate = CS_NEW
> > >         endpoint_interface = <value optimized out>
> > >         driver_state_handler = (const switch_state_handler_table_t *) 0xb73b1720
> > >         application_state_handler = <value optimized out>
> > >         thread_id = 3085554955
> > >         env = {{__jmpbuf = {134603552, -1428248680, -1461722504,
> > >           9184, -1210273432, -1210014020}, __mask_was_saved = -1210034895,
> > >           __saved_mask = {__val = {0, 3084988404, 3084937740, 3086469280,
> > >           9184, 1, 2976641592, 2833244792, 3086590960,
> > >           168036728, 3084937740, 2833244808, 3085923728, 1, 3086590960,
> > >           2833244840, 3086590960, 0, 134564192, 2833244840, 3085923728,
> > >           134564244, 3086590960, 2833244872, 3085887870, 134564240, 168036728,
> > >           3085458203, 3086590960, 2976606624,
> > >           134564192, 2833244904}}}}
> > >         sig = <value optimized out>
> > >         __func__ = "switch_core_session_run"
> > >         __PRETTY_FUNCTION__ = "switch_core_session_run"
> > > #5  0xb7e9c765 in switch_core_session_thread (thread=0x9ada840, obj=0x95fe270) at src/switch_core_session.c:853
> > >         session = (switch_core_session_t *) 0x95fe270
> > >         event = <value optimized out>
> > >         event_str = 0x0
> > >         val = <value optimized out>
> > >         __func__ = "switch_core_session_thread"
> > >         __PRETTY_FUNCTION__ = "switch_core_session_thread"
> > > #6  0xb7efd916 in dummy_worker (opaque=0x9ada840) at threadproc/unix/thread.c:138
> > > No locals.
> > > #7  0xb7e034fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
> > > No symbol table info available.
> > > #8  0xb7d55e5e in clone () from /lib/tls/i686/cmov/libc.so.6
> > >
> > >
> > > _______________________________________________
> > > Freeswitch-users mailing list
> > > [email protected]
> > > http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
> > > UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
> > > http://www.freeswitch.org

-- 
Anthony Minessale II

FreeSWITCH http://www.freeswitch.org/
ClueCon http://www.cluecon.com/

AIM: anthm
MSN: [email protected]
GTALK/JABBER/PAYPAL: [email protected]
IRC: irc.freenode.net #freeswitch

FreeSWITCH Developer Conference
sip:[email protected]
iax:[email protected]/888
googletalk:[email protected]
pstn:213-799-1400
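
One rough way to check a saved dump for the pattern in the backtrace quoted above: the sketch below reads the text output of gdb's "thread apply all bt" and lists the threads whose stack shows switch_sleep called directly from switch_core_session_run. It is Python, not part of FreeSWITCH, and the function names (group_threads, count_signatures, sleeping_in_session_run) are made up for illustration. It assumes gdb's usual "Thread N (...)" header lines and "#N  0x... in func (" frame lines. A thread sitting in that sleep is not necessarily hung, so this only narrows the list; the endstate value from "bt full" is still needed to tell which sessions are parked in CS_NEW.

```python
import re
from collections import Counter

# Assumed gdb layout: "Thread 12 (Thread 0x... (LWP ...)):" then "#N  0x... in func (...)".
THREAD_RE = re.compile(r"^Thread (\d+) \(")
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?([A-Za-z_][A-Za-z0-9_]*) \(")


def group_threads(bt_text):
    """Map each gdb thread number to a tuple of function names, innermost first.

    Frames gdb could not symbolize ("?? ()") and wrapped continuation
    lines do not match FRAME_RE and are skipped.
    """
    stacks = {}
    current = None
    for line in bt_text.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            stacks[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            stacks[current].append(frame.group(2))
    return {tid: tuple(frames) for tid, frames in stacks.items()}


def count_signatures(stacks, depth=5):
    """Count how many threads share the same innermost `depth` frames."""
    return Counter(frames[:depth] for frames in stacks.values())


def sleeping_in_session_run(stacks):
    """Return sorted thread numbers where switch_sleep is called
    directly from switch_core_session_run."""
    hits = []
    for tid, frames in stacks.items():
        # Consecutive pairs are (callee, caller) because frames are innermost first.
        for callee, caller in zip(frames, frames[1:]):
            if callee == "switch_sleep" and caller == "switch_core_session_run":
                hits.append(tid)
                break
    return sorted(hits)
```

To use it, save the gdb output to a file (any name will do), pass its contents to group_threads(), and then look at sleeping_in_session_run() for the candidate threads or count_signatures().most_common() for the overall picture.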
