Re: [OpenSIPS-Users] Crash at dialog.so
Hi Liviu,

I took a quick look at dlg_timer.c myself (around the lines where the crash happens), and there is actually a FIXME note there too. It seems to me that this issue happens when using create_dialog("PpB"): if one side is already disconnected and PpB also doesn't receive a ping and decides to disconnect the call, the module crashes. Basically, if the call is already disconnected, we should not try to send a BYE to them either.

For now, I have removed "PpB" from create_dialog and the error has not occurred again. I think that can serve as the workaround for now.

Thanks,
Mark

On Wed, Nov 11, 2020 at 3:40 PM Liviu Chircu wrote:

> On 07.11.2020 13:27, M S wrote:
> > The server has a pretty high load, is Q_MALLOC_DBG safe?
>
> How "pretty" high of a load is that, in terms of "calls-per-second" and
> "max-concurrent-calls"? Q_MALLOC_DBG will typically add +30-50% to
> shared/private memory usage, as well as maybe a 10% increase in CPU
> usage. But OpenSIPS uses very little CPU and memory to begin with, so it
> will be quite safe to do.
>
> My advice: unless you're running more than 50 calls-per-second through
> it, there is no need to even start worrying about Q_MALLOC_DBG altering
> the system's behavior. Below the 50 CPS level, the difference is
> unnoticeable.
>
> > also, is that an opensips command-line option? I don't see it in the
> > opensips man page...
>
> Yes, it's an "opensips" binary command-line option. The man page needs
> updating, indeed; however, you could try "opensips -h" and you'll see
> that option.
>
> --
> Liviu Chircu
> www.twitter.com/liviuchircu | www.opensips-solutions.com

___
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
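Mark's point above, that a side which is already disconnected should not be sent a BYE when a ping timeout tears the call down, can be illustrated with a toy model. This is only a sketch in Python, not the dialog module's C code: the names `LegState`, `Dialog` and `on_ping_timeout` are hypothetical, and nothing here reflects how dlg_timer.c is actually structured.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class LegState(Enum):
    ACTIVE = auto()
    TERMINATED = auto()


@dataclass
class Dialog:
    """Hypothetical two-leg call; 'sent' records the requests we would emit."""
    caller: LegState = LegState.ACTIVE
    callee: LegState = LegState.ACTIVE
    sent: list = field(default_factory=list)


def on_ping_timeout(dlg: Dialog) -> None:
    """Tear the call down after a missed ping.

    A BYE is queued only for legs that are still active, so a leg that
    has already hung up is never touched a second time.
    """
    for name in ("caller", "callee"):
        if getattr(dlg, name) is LegState.ACTIVE:
            dlg.sent.append(f"BYE -> {name}")
            setattr(dlg, name, LegState.TERMINATED)


# The caller has already disconnected when the ping timeout fires.
dlg = Dialog(caller=LegState.TERMINATED)
on_ping_timeout(dlg)
print(dlg.sent)
```

Because the guard also marks each leg as terminated, running the handler twice on the same dialog sends nothing the second time.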
Re: [OpenSIPS-Users] Crash at dialog.so
On 07.11.2020 13:27, M S wrote:
> The server has a pretty high load, is Q_MALLOC_DBG safe?

How "pretty" high of a load is that, in terms of "calls-per-second" and "max-concurrent-calls"? Q_MALLOC_DBG will typically add +30-50% to shared/private memory usage, as well as maybe a 10% increase in CPU usage. But OpenSIPS uses very little CPU and memory to begin with, so it will be quite safe to do.

My advice: unless you're running more than 50 calls-per-second through it, there is no need to even start worrying about Q_MALLOC_DBG altering the system's behavior. Below the 50 CPS level, the difference is unnoticeable.

> also, is that an opensips command-line option? I don't see it in the
> opensips man page...

Yes, it's an "opensips" binary command-line option. The man page needs updating, indeed; however, you could try "opensips -h" and you'll see that option.

--
Liviu Chircu
www.twitter.com/liviuchircu | www.opensips-solutions.com
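To put the rough figures from the message above into numbers, here is a small Python sketch. The baseline values (1024 MB, 5% CPU) are made up for illustration, the function names are hypothetical, and the "10% increase in CPU usage" is assumed to be relative to the current usage, which the message does not spell out.

```python
CPS_THRESHOLD = 50  # calls-per-second level quoted in the message above


def q_malloc_dbg_estimate(mem_mb, cpu_pct):
    """Apply the thread's rough overheads to a measured baseline.

    Returns ((mem_low, mem_high), cpu): memory grows by 30-50%,
    CPU by an assumed relative 10%.
    """
    mem_range = (mem_mb * 1.30, mem_mb * 1.50)
    cpu = cpu_pct * 1.10
    return mem_range, cpu


def overhead_worth_checking(cps):
    """True only above the 50 CPS level mentioned in the advice."""
    return cps > CPS_THRESHOLD


# Hypothetical baseline: 1024 MB of shared/private memory, 5% CPU.
(mem_low, mem_high), cpu = q_malloc_dbg_estimate(mem_mb=1024, cpu_pct=5.0)
print(f"memory: {mem_low:.0f}-{mem_high:.0f} MB, CPU: {cpu:.1f}%")
print("worth checking at 30 CPS:", overhead_worth_checking(30))
```

Plugging in the real memory and CPU figures of the server gives a quick upper bound to compare against the available headroom before switching allocators.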
Re: [OpenSIPS-Users] Crash at dialog.so
Will do, I didn't notice that I hit reply instead of reply all.

The server has a pretty high load, is Q_MALLOC_DBG safe? Also, is that an opensips command-line option? I don't see it in the opensips man page...

On Sat, Nov 7, 2020 at 12:18 PM Liviu Chircu wrote:

> Please keep the "users" mailing list CC'ed at all times. Both the
> question and discussion are of public interest, after all.
>
> On 07.11.2020 13:06, M S wrote:
> > I'm using CentOS 8 and couldn't find opensips-dbg package for it,
> > please advise.
>
> It's enough, no need for "opensips-dbg" as you seem to have the debug
> symbols already.
>
> Okay, so we're dealing with some kind of memory corruption, and this
> backtrace only begins to offer hints as to what's wrong, without
> decisive help. The next step is to switch to the "-a Q_MALLOC_DBG"
> command-line option, trigger the crash again and see what that backtrace
> looks like.
>
> --
> Liviu Chircu
> www.twitter.com/liviuchircu | www.opensips-solutions.com
Re: [OpenSIPS-Users] Crash at dialog.so
Please keep the "users" mailing list CC'ed at all times. Both the question and discussion are of public interest, after all.

On 07.11.2020 13:06, M S wrote:
> I'm using CentOS 8 and couldn't find opensips-dbg package for it,
> please advise.

It's enough, no need for "opensips-dbg" as you seem to have the debug symbols already.

Okay, so we're dealing with some kind of memory corruption, and this backtrace only begins to offer hints as to what's wrong, without decisive help. The next step is to switch to the "-a Q_MALLOC_DBG" command-line option, trigger the crash again and see what that backtrace looks like.

--
Liviu Chircu
www.twitter.com/liviuchircu | www.opensips-solutions.com
Re: [OpenSIPS-Users] Crash at dialog.so
On 07.11.2020 12:25, M S wrote:
> I am using opensips 3.1.0:
> Any ideas what caused it?

Hi, Mark!

Could you install the "opensips-dbg" package and re-post the backtrace? Thanks to the debug symbols, the information provided should be much richer.

Also, if you're pushing less than 100 calls-per-second through that machine, then I recommend starting OpenSIPS with the "-a Q_MALLOC_DBG" command-line option and posting the new backtrace when the crash re-occurs. This quality assurance allocator contains extra runtime checks and it may detect the problem in some earlier, more relevant place in the code, as the bug occurs.

PS: to obtain a quality backtrace using `gdb` and not that garbled wall of text from systemd, see this tutorial [1]

[1]: https://www.opensips.org/Documentation/TroubleShooting-Crash

--
Liviu Chircu
www.twitter.com/liviuchircu | www.opensips-solutions.com
[OpenSIPS-Users] Crash at dialog.so
Hi all,

I am using opensips 3.1.0:

kernel: traps: opensips[98311] general protection ip:543d22 sp:7ffcb17316e0 error:0 in opensips[40+34c000]
systemd[1]: Started Process Core Dump (PID 100247/UID 0).
kernel: opensips[98295]: segfault at 8 ip 7f697539df40 sp 7ffcb1731cb0 error 4 in dialog.so[7f6975332000+9b000]
kernel: Code: 41 57 41 56 41 55 41 54 55 53 48 81 ec 98 00 00 00 48 89 fb 89 74 24 54 4c 8b 35 d3 fb 22 00 49 8b 06 44 8b 6f 1c 49 c1 e5 05 <4c> 03 68 08 48 8b 48 18 41 8b 55 18 48 8b 71 08 4c 8d 3c 96 bf 01
systemd[1]: Started Process Core Dump (PID 100249/UID 0).
systemd[1]: opensips.service: Main process exited, code=dumped, status=11/SEGV
systemd[1]: opensips.service: Failed with result 'core-dump'.
systemd-coredump[100248]: Process 98311 (opensips) of user 1001 dumped core.#012#012Stack trace of thread 98311:#012#0 0x00543d22 fm_status (opensips)#012#1 0x00502c51 sig_usr (opensips)#012#2 0x7f69803f8dd0 __restore_rt (libpthread.so.0)#012#3 0x00542cc3 fm_remove_free (opensips)#012#4 0x7f69753476da build_extra_hdr (dialog.so)#012#5 0x7f6975341717 dlg_options_routine (dialog.so)#012#6 0x004c4a5d handle_timer_job (opensips)#012#7 0x0061124b handle_io (opensips)#012#8 0x00615c61 udp_start_processes (opensips)#012#9 0x0041a742 main_loop (opensips)#012#10 0x7f69800476a3 __libc_start_main (libc.so.6)#012#11 0x0041b1ae _start (opensips)
systemd-coredump[100250]: Process 98295 (opensips) of user 1001 dumped core.#012#012Stack trace of thread 98295:#012#0 0x7f697539df40 _unref_dlg (dialog.so)#012#1 0x7f6975635c69 empty_tmcb_list (tm.so)#012#2 0x7f69755fb64a free_cell (tm.so)#012#3 0x7f69755fe42e free_hash_table (tm.so)#012#4 0x7f69755fab16 tm_shutdown (tm.so)#012#5 0x00507fb2 destroy_modules (opensips)#012#6 0x005033f3 cleanup (opensips)#012#7 0x00503ef1 shutdown_opensips (opensips)#012#8 0x0050475d handle_sigs (opensips)#012#9 0x0041ab01 main_loop (opensips)#012#10 0x7f69800476a3 __libc_start_main (libc.so.6)#012#11 0x0041b1ae _start (opensips)

Any ideas what caused it?

Thank you,
Mark
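The `#012` sequences in the systemd-coredump records above appear to be syslog's octal escape for a newline (octal 012 is LF), which is why each stack trace shows up as one long line. As a reading aid, the following Python sketch decodes such escapes and splits a record into frames. The helper names are hypothetical and the frame pattern is an assumption based only on the two traces in this message; the sample string is a shortened excerpt of the second record.

```python
import re


def decode_syslog_escapes(text):
    """Replace '#ooo' octal escapes (e.g. '#012') with the real character."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)


# Assumed frame layout: "#<index> 0x<address> <symbol> (<object>)"
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+\s+(\S+)\s+\((.+)\)$")


def parse_frames(record):
    """Return a list of (index, symbol, object) tuples found in one record."""
    frames = []
    for line in decode_syslog_escapes(record).splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


sample = (
    "Process 98295 (opensips) of user 1001 dumped core.#012#012"
    "Stack trace of thread 98295:"
    "#012#0 0x7f697539df40 _unref_dlg (dialog.so)"
    "#012#1 0x7f6975635c69 empty_tmcb_list (tm.so)"
    "#012#2 0x7f69755fb64a free_cell (tm.so)"
)

for index, symbol, obj in parse_frames(sample):
    print(f"#{index:<2} {symbol} ({obj})")
```

This only makes the log readable; it adds none of the detail that a `gdb` backtrace with debug symbols would provide.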