Hi! Yes, with -DF_MALLOC.
1.6.1 from sources, I build deb package. I use 128M of shared and 10*1024*1024 private memory (can increase - no problem). Hmmmm, "opensipsctl fifo get_statistics all" crashes/stops the opensips. 'fifo uptime' or 'fifo debug' are OK. strace while 'fifo get_statistics all': Process 9509 attached - interrupt to quit pause() = ? ERESTARTNOHAND (To be restarted) --- SIGUSR2 (User defined signal 2) @ 0 (0) --- sigreturn() = ? (mask now []) pause() = ? ERESTARTNOHAND (To be restarted) --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) waitpid(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGUSR2}], WNOHANG) = 9520 waitpid(-1, 0xbf84b4c8, WNOHANG) = 0 kill(0, SIGTERM) = 0 --- SIGTERM (Terminated) @ 0 (0) --- --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now [TERM]) sigreturn() = ? (mask now []) rt_sigaction(SIGALRM, {0x8065920, [ALRM], SA_RESTART}, {SIG_DFL}, 8) = 0 alarm(60) = 0 wait4(-1, NULL, 0, NULL) = 9514 wait4(-1, NULL, 0, NULL) = 9519 wait4(-1, NULL, 0, NULL) = 9521 wait4(-1, NULL, 0, NULL) = 9522 wait4(-1, NULL, 0, NULL) = 9512 --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) wait4(-1, NULL, 0, NULL) = 9510 wait4(-1, NULL, 0, NULL) = 9516 --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) wait4(-1, NULL, 0, NULL) = 9515 wait4(-1, NULL, 0, NULL) = 9517 wait4(-1, NULL, 0, NULL) = 9524 wait4(-1, NULL, 0, NULL) = 9525 --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) --- SIGCHLD (Child exited) @ 0 (0) --- sigreturn() = ? (mask now []) wait4(-1, NULL, 0, NULL) = 9511 wait4(-1, NULL, 0, NULL) = 9513 wait4(-1, NULL, 0, NULL) = 9518 wait4(-1, NULL, 0, NULL) = 9523 wait4(-1, NULL, 0, NULL) = -1 ECHILD (No child processes) rt_sigaction(SIGALRM, {0x8066080, [ALRM], SA_RESTART}, {0x8065920, [ALRM], SA_RESTART}, 8) = 0 stat64("/tmp/opensips_fifo", {st_mode=S_IFIFO|0660, st_size=0, ...}) = 0 unlink("/tmp/opensips_fifo") = 0 munmap(0xaed25000, 134217728) = 0 unlink("/var/run/opensips/opensips.pid") = 0 alarm(0) = 60 rt_sigaction(SIGALRM, {SIG_IGN}, {0x8066080, [ALRM], SA_RESTART}, 8) = 0 exit_group(0) = ? Process 9509 detached -- Best Regards, Alex Massover VoIP R&D TL Jajah Inc. > -----Original Message----- > From: users-boun...@lists.opensips.org [mailto:users- > boun...@lists.opensips.org] On Behalf Of Andrei Dragus > Sent: Thursday, January 21, 2010 3:09 PM > To: OpenSIPS users mailling list > Subject: Re: [OpenSIPS-Users] sched_yield() > > > Hi, > > Since all the backtraces are in allocation routines my guess is that > the > shared memory lock might be causing a problem. > > Are you compiling with -DF_MALLOC? > What version of OpenSIPS are you using? > What is the total shared memory pool you are allocating? > What amount of memory are you using? ( Use : opensipsctl fifo > get_statistics all ) > > Alex Massover wrote: > > Some more, > > > > (gdb) bt > > #0 0xb78dc424 in __kernel_vsyscall () > > #1 0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6 > > #2 0xb73d77fd in build_new_dlg () from > /usr/lib/opensips/modules/dialog.so > > #3 0xb73d4b81 in dlg_create_dialog () from > /usr/lib/opensips/modules/dialog.so > > #4 0xb73c9c9e in ?? () from /usr/lib/opensips/modules/dialog.so > > #5 0x08055030 in do_action () > > #6 0x08053ebf in run_action_list () > > #7 0x08056e7a in do_action () > > #8 0x08053ebf in run_action_list () > > #9 0x08057d99 in run_top_route () > > #10 0x0808ad6c in receive_msg () > > #11 0x080bd2f2 in udp_rcv_loop () > > #12 0x08069339 in main () > > > > > > (gdb) bt > > #0 0xb78dc424 in __kernel_vsyscall () > > #1 0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6 > > #2 0xb77242cd in build_cell () from /usr/lib/opensips/modules/tm.so > > #3 0xb7739c4a in t_newtran () from /usr/lib/opensips/modules/tm.so > > #4 0xb772e7b8 in t_relay_to () from /usr/lib/opensips/modules/tm.so > > #5 0xb773b501 in ?? () from /usr/lib/opensips/modules/tm.so > > #6 0x08055030 in do_action () > > #7 0x08053ebf in run_action_list () > > #8 0x08095cf2 in eval_expr () > > #9 0x080958d9 in eval_expr () > > #10 0x08095919 in eval_expr () > > #11 0x080554e2 in do_action () > > #12 0x08053ebf in run_action_list () > > #13 0x080569d8 in do_action () > > #14 0x08053ebf in run_action_list () > > #15 0x08056e7a in do_action () > > #16 0x08053ebf in run_action_list () > > #17 0x08057d99 in run_top_route () > > #18 0x0808ad6c in receive_msg () > > #19 0x080bd2f2 in udp_rcv_loop () > > #20 0x08069339 in main () > > > > -- > > Best Regards, > > Alex Massover > > VoIP R&D TL > > Jajah Inc. > > > > > >> -----Original Message----- > >> From: users-boun...@lists.opensips.org [mailto:users- > >> boun...@lists.opensips.org] On Behalf Of Alex Massover > >> Sent: Thursday, January 21, 2010 2:24 PM > >> To: OpenSIPS users mailling list > >> Subject: Re: [OpenSIPS-Users] sched_yield() > >> > >> Hi, > >> > >> Another one.. It hangs for a number of seconds (but it's enough to > >> cause to SIP timeouts - MSG queue jumps to 260K), it's hard to make > a > >> bt at the right moment. > >> This one looks better because there's sched_yield() there :) > >> > >> (gdb) bt > >> #0 0xb77d5424 in __kernel_vsyscall () > >> #1 0xb771041c in sched_yield () from /lib/i686/cmov/libc.so.6 > >> #2 0x080bf23d in new_avp () > >> #3 0x080bf53f in add_avp () > >> #4 0xb72c1c9c in ?? () from /usr/lib/opensips/modules/dialog.so > >> #5 0x08055030 in do_action () > >> #6 0x08053ebf in run_action_list () > >> #7 0x08056e7a in do_action () > >> #8 0x08053ebf in run_action_list () > >> #9 0x08056e7a in do_action () > >> #10 0x08053ebf in run_action_list () > >> #11 0x08056e7a in do_action () > >> #12 0x08053ebf in run_action_list () > >> #13 0x08057d99 in run_top_route () > >> #14 0x0808ad6c in receive_msg () > >> #15 0x080bd2f2 in udp_rcv_loop () > >> #16 0x08069339 in main () > >> > >> -- > >> Best Regards, > >> Alex Massover > >> VoIP R&D TL > >> Jajah Inc. > >> > >> > >>> -----Original Message----- > >>> From: users-boun...@lists.opensips.org [mailto:users- > >>> boun...@lists.opensips.org] On Behalf Of Alex Massover > >>> Sent: Thursday, January 21, 2010 2:05 PM > >>> To: OpenSIPS users mailling list > >>> Subject: Re: [OpenSIPS-Users] sched_yield() > >>> > >>> Hi Andrei, > >>> Hopefully this is it (with FASTLOCK) > >>> > >>> #0 0xb77d5424 in __kernel_vsyscall () > >>> #1 0xb772babb in poll () from /lib/i686/cmov/libc.so.6 > >>> #2 0xb77ba83a in ?? () from /lib/i686/cmov/libresolv.so.2 > >>> #3 0xb77b8946 in __libc_res_nquery () from > >>> /lib/i686/cmov/libresolv.so.2 > >>> #4 0xb77b8fdb in ?? () from /lib/i686/cmov/libresolv.so.2 > >>> #5 0xb77b92ae in __libc_res_nsearch () from > >>> /lib/i686/cmov/libresolv.so.2 > >>> #6 0xb77b96d4 in __res_nsearch () from > /lib/i686/cmov/libresolv.so.2 > >>> #7 0xb77b808a in res_search () from /lib/i686/cmov/libresolv.so.2 > >>> #8 0x0808c613 in get_record () > >>> #9 0x0808cf05 in ?? () > >>> #10 0x0808e385 in sip_resolvehost () > >>> #11 0x0807a26c in mk_proxy () > >>> #12 0xb7627d39 in t_relay_to () from > /usr/lib/opensips/modules/tm.so > >>> #13 0xb7634501 in ?? () from /usr/lib/opensips/modules/tm.so > >>> #14 0x08055030 in do_action () > >>> #15 0x08053ebf in run_action_list () > >>> #16 0x08095cf2 in eval_expr () > >>> #17 0x080958d9 in eval_expr () > >>> #18 0x08095919 in eval_expr () > >>> #19 0x080554e2 in do_action () > >>> #20 0x08053ebf in run_action_list () > >>> #21 0x08056e7a in do_action () > >>> #22 0x08053ebf in run_action_list () > >>> ---Type <return> to continue, or q <return> to quit--- > >>> #23 0x080569d8 in do_action () > >>> #24 0x08053ebf in run_action_list () > >>> #25 0x08056e7a in do_action () > >>> #26 0x08053ebf in run_action_list () > >>> #27 0x08057d99 in run_top_route () > >>> #28 0x0808ad6c in receive_msg () > >>> #29 0x080bd2f2 in udp_rcv_loop () > >>> #30 0x08069339 in main () > >>> (gdb) > >>> > >>> -- > >>> Best Regards, > >>> Alex Massover > >>> VoIP R&D TL > >>> Jajah Inc. > >>> > >>>> -----Original Message----- > >>>> From: users-boun...@lists.opensips.org [mailto:users- > >>>> boun...@lists.opensips.org] On Behalf Of Andrei Dragus > >>>> Sent: Wednesday, January 20, 2010 2:58 PM > >>>> To: OpenSIPS users mailling list > >>>> Subject: Re: [OpenSIPS-Users] sched_yield() > >>>> > >>>> Hi, > >>>> > >>>> I think that there is a lock that is being held more than it > should > >>>> > >>> be > >>> > >>>> and that's what causes starvation. It would help us if you could > >>>> > >>> attach > >>> > >>>> to a process using gdb and give us a full backtrace. > >>>> > >>>> Temporary solutions which should work would be to reduce the > number > >>>> > >>> of > >>> > >>>> processes to 4-6 or to recompile replacing -DFAST_LOCK with one of > >>>> > >>> the > >>> > >>>> other options (-DUSE_POSIX_SEM or -DUSE_PTHREAD_MUTEX) but we > >>>> > >> should > >> > >>>> see > >>>> where this is from to fix it. > >>>> > >>>> Alex Massover wrote: > >>>> > >>>>> Hi! > >>>>> > >>>>> Yes, from the source on debian, I build deb package. (I did some > >>>>> > >>>> minor changes to the source, but the problem happens also without > >>>> > >> my > >> > >>>> changes) > >>>> > >>>>> 16 children on 4 cores. > >>>>> > >>>>> What do you suggest to reduce it to 4? It runs on 2.6.32 on > >>>>> > >> VMware > >> > >>>> ESX. > >>>> > >>>>> I'm also trying now sleep(0) instead of sched_yield(). > >>>>> > >>>>> -- > >>>>> Best Regards, > >>>>> Alex Massover > >>>>> VoIP R&D TL > >>>>> Jajah Inc. > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: users-boun...@lists.opensips.org [mailto:users- > >>>>>> boun...@lists.opensips.org] On Behalf Of Andrei Dragus > >>>>>> Sent: Wednesday, January 20, 2010 1:05 PM > >>>>>> To: OpenSIPS users mailling list > >>>>>> Subject: Re: [OpenSIPS-Users] sched_yield() > >>>>>> > >>>>>> Hi Alex, > >>>>>> > >>>>>> Are you building OpenSIPS from source? > >>>>>> How many processes do you have and on how many cores? > >>>>>> > >>>>>> > >>>>>> Alex Massover wrote: > >>>>>> > >>>>>> > >>>>>>> Hello! > >>>>>>> > >>>>>>> I'm facing a strange problem, sometimes under a stress OpenSIPS > >>>>>>> "locks" - load average jumps, SIP processing delays, opensips > >>>>>>> > >> msg > >> > >>>>>>> queue fills with a lot of sip messages, opensips processes > >>>>>>> > >> start > >> > >>> to > >>> > >>>>>>> comsume a lot of CPU. > >>>>>>> > >>>>>>> And strace shows: > >>>>>>> > >>>>>>> sched_yield() > >>>>>>> > >>>>>>> sched_yield() > >>>>>>> > >>>>>>> sched_yield() > >>>>>>> > >>>>>>> sched_yield() > >>>>>>> > >>>>>>> .... > >>>>>>> > >>>>>>> for all processes. > >>>>>>> > >>>>>>> If I stop the stress - after a while (not immediately) - it > >>>>>>> > >>>> unlocks, > >>>> > >>>>>>> also suddenly, I can see in top that all opensips processes > >>>>>>> > >> stop > >> > >>> to > >>> > >>>>>>> consume CPU. > >>>>>>> > >>>>>>> What can it be? Some kind of starvation? > >>>>>>> > >>>>>>> -- > >>>>>>> > >>>>>>> Best Regards, > >>>>>>> > >>>>>>> Alex Massover > >>>>>>> > >>>>>>> VoIP R&D TL > >>>>>>> > >>>>>>> Jajah Inc. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> This mail was sent via Mail-SeCure System. > >>>>>>> --------------------------------------------------------------- > >>>>>>> > >> -- > >> > >>> -- > >>> > >>>> -- > >>>> > >>>>>> --- > >>>>>> > >>>>>> > >>>>>>> _______________________________________________ > >>>>>>> Users mailing list > >>>>>>> Users@lists.opensips.org > >>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>> -- > >>>>>> Andrei Dragus > >>>>>> www.voice-system.ro > >>>>>> > >>>>>> > >>>>>> _______________________________________________ > >>>>>> Users mailing list > >>>>>> Users@lists.opensips.org > >>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users > >>>>>> > >>>>>> This mail was received via Mail-SeCure System. > >>>>>> > >>>>>> > >>>>>> > >>>>> This mail was sent via Mail-SeCure System. > >>>>> > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> Users mailing list > >>>>> Users@lists.opensips.org > >>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users > >>>>> > >>>>> > >>>> -- > >>>> Andrei Dragus > >>>> www.voice-system.ro > >>>> > >>>> > >>>> _______________________________________________ > >>>> Users mailing list > >>>> Users@lists.opensips.org > >>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users > >>>> > >>>> This mail was received via Mail-SeCure System. > >>>> > >>>> > >>> This mail was sent via Mail-SeCure System. > >>> > >>> > >>> > >>> _______________________________________________ > >>> Users mailing list > >>> Users@lists.opensips.org > >>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users > >>> > >>> This mail was received via Mail-SeCure System. > >>> > >>> > >> This mail was sent via Mail-SeCure System. > >> > >> > >> > >> _______________________________________________ > >> Users mailing list > >> Users@lists.opensips.org > >> http://lists.opensips.org/cgi-bin/mailman/listinfo/users > >> > >> This mail was received via Mail-SeCure System. > >> > >> > > > > > > This mail was sent via Mail-SeCure System. > > > > > > > > _______________________________________________ > > Users mailing list > > Users@lists.opensips.org > > http://lists.opensips.org/cgi-bin/mailman/listinfo/users > > > > -- > Andrei Dragus > www.voice-system.ro > > > _______________________________________________ > Users mailing list > Users@lists.opensips.org > http://lists.opensips.org/cgi-bin/mailman/listinfo/users > > This mail was received via Mail-SeCure System. > This mail was sent via Mail-SeCure System. _______________________________________________ Users mailing list Users@lists.opensips.org http://lists.opensips.org/cgi-bin/mailman/listinfo/users