Bogdan,

The backtrace I provided is the only one I have with memory debugging compiled in. I’m going to re-enable memory debugging and push that build so that we will have that info when the crash recurs.

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Thursday, November 15, 2018 at 11:08 AM
To: Ben Newlin <ben.new...@genesys.com>, OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

Do you have the backtraces from more similar crashes? Maybe there is a pattern there.

Regards,

Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com
OpenSIPS Bootcamp 2018
http://opensips.org/training/OpenSIPS_Bootcamp_2018/

On 11/15/2018 05:01 PM, Ben Newlin wrote:

Bogdan,

It’s happening every few days, so it is pretty frequent. There was another one yesterday, but the DBG compile flags had been temporarily removed for that one. We have not been able to determine a sequence to reproduce it yet.

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Thursday, November 15, 2018 at 7:06 AM
To: Ben Newlin <ben.new...@genesys.com>, OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

How often does this crash happen? Are you able to reproduce it?

The acc extra should work in the branch route, no problem. Out of curiosity, I will try to reproduce your case (timeout -> failure route -> t_relay -> branch_route) to see if I can reproduce it.

Regards,
Bogdan-Andrei Iancu

On 11/13/2018 07:41 PM, Ben Newlin wrote:

Bogdan,

Yes, we are setting acc_extra variables in our branch routes, which are sometimes (but not always) called from failure route. Are acc_extra variables not available for use in branch_routes?

We don’t currently use drop_accounting anywhere in our script. If I call it before that branch_route then it will stop accounting for that call, right? We need to have accounting records for the call, so I’m not sure how that would resolve the issue.

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Tuesday, November 13, 2018 at 9:13 AM
To: Ben Newlin <ben.new...@genesys.com>, OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

Thanks for the info. The crash happens when you try to set an acc extra variable in branch route (when creating a new branch via failure route, on timeout).

Now, do you use drop accounting in your script? And considering the above scenario, is it possible to have the drop acc before the branch route?

Regards,
Bogdan-Andrei Iancu

On 11/12/2018 08:55 PM, Ben Newlin wrote:

Bogdan,

We upgraded to 2.4.3 and the crash reproduced today. Backtrace is available here: https://pastebin.com/CZxQnZdR

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Wednesday, November 7, 2018 at 6:18 AM
To: OpenSIPS devel mailling list <devel@lists.opensips.org>, Ben Newlin <ben.new...@genesys.com>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

The BT indicates a double free for the accounting context, and I noticed you use version 2.4.1. And yes, there was an issue related to the acc context, an issue that was fixed starting with 2.4.2. So, could you upgrade to the latest 2.4 and see if the crash still happens?

I think the fix is already there.

Regards,
Bogdan-Andrei Iancu

On 11/06/2018 11:13 PM, Bogdan-Andrei Iancu wrote:

Jackpot - you got it right!! I will start digging into the trace, but please keep the corefile, I might need it later.

Thanks and regards,
Bogdan-Andrei Iancu

On 11/06/2018 10:24 PM, Ben Newlin wrote:

Bogdan,

I have reproduced this crash and verified this time that the flags were set.

$ opensips -V
version: opensips 2.4.1 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, QM_MALLOC, DBG_MALLOC, FAST_LOCK-ADAPTIVE_WAIT, DBG_LOCK
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
git revision: 5d042cffc
main.c compiled on 23:38:55 Nov 5 2018 with gcc 7

Backtrace is available here: https://pastebin.com/KTQjkCwq

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Thursday, November 1, 2018 at 1:19 PM
To: Ben Newlin <ben.new...@genesys.com>, OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

According to the backtrace, the memory debugger was not activated. Do an "opensips -V" to check the resulting compile flags - do you see DBG_MALLOC and QM_MALLOC?

Regards,
Bogdan-Andrei Iancu

On 10/31/2018 05:04 PM, Ben Newlin wrote:

Bogdan,

I was able to compile with those options and the crash has occurred again. Backtrace is here: https://pastebin.com/dezi9xUU

Even though I had `memdump=1` set in my script, there was no extra memory debugging information in the logs prior to or at the time of the crash. I’m not sure if that is expected or not.

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Monday, October 29, 2018 at 8:11 AM
To: Ben Newlin <ben.new...@genesys.com>, OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

You can change the compile flags via the Makefile.conf file - menuconfig also updates that file. So during your build you can simply push a pre-modified Makefile.conf file with the options needed for memory debugging.

Regards,
Bogdan-Andrei Iancu

On 10/26/2018 05:14 PM, Ben Newlin wrote:

Bogdan,

Unfortunately, we have run into a similar issue before. Our build system is completely automated and there is no way to inject the interactive `make menuconfig` step into that process. If I were testing this locally I might be able to work something out, but I could never get such a build into our testing environment, which is where the crashes are occurring.

Do you have instructions for enabling memory debugging that do not require using the interactive TUI tool? What does the menuconfig program do when these options are selected? Are there some defines or other settings we can change ourselves to bypass menuconfig?

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Friday, October 26, 2018 at 4:59 AM
To: OpenSIPS devel mailling list <devel@lists.opensips.org>, Ben Newlin <ben.new...@genesys.com>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

All the BTs point to crashes while doing memory ops. I suspect a memory corruption that randomly triggers crashes in different parts of the code.

Could you try to re-compile with memory debugging support? See http://www.opensips.org/Documentation/TroubleShooting-OutOfMem, the "How to handle it" section.

Regards,
Bogdan-Andrei Iancu

On 10/24/2018 04:28 AM, Ben Newlin wrote:

We have had 2 more crashes today.

Crash 2: https://pastebin.com/rMruBQcZ
This crash appears to have occurred while processing an initial INVITE request. I could not see anything unusual about the request. I cannot tell if this crash is related to the others.

Crash 3: https://pastebin.com/Gmk1m4NT
This crash follows the pattern of the original crash I reported.

Ben Newlin

From: Devel <devel-boun...@lists.opensips.org> on behalf of Ben Newlin <ben.new...@genesys.com>
Reply-To: OpenSIPS devel mailling list <devel@lists.opensips.org>
Date: Monday, October 22, 2018 at 4:45 PM
To: OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Here is a better trace of the call: https://pastebin.com/gWpQR8E7

Ben Newlin

From: Ben Newlin <ben.new...@genesys.com>
Date: Monday, October 22, 2018 at 4:34 PM
To: OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: OpenSIPS Crash

Hello,

We have been having sporadic crashes and I was recently able to recover a core dump for one. I have uploaded it here: https://pastebin.com/ABktcYcH

I picked out a Call-ID from the crash data and took a look in our tracing. I have uploaded it here: https://pastebin.com/ZEzUUKZ5

It appears that a downstream server was extremely lagged and failed to respond to an INVITE. We sent the INVITE to another server and the call was connected, but then eventually the original server “caught up” and sent a burst of 200 OK responses. The crash seems to have occurred processing the ACK to one of these responses.

Ben Newlin

_______________________________________________
Devel mailing list
Devel@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
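
The flag check Bogdan asks for in the thread (looking for DBG_MALLOC and QM_MALLOC in the `opensips -V` output) could be automated in a non-interactive build, which is the situation Ben describes. The following is only a minimal Python sketch and is not part of the original exchange: the function name `missing_debug_flags` and the `REQUIRED_FLAGS` set are hypothetical, and it assumes the `flags:` line keeps the comma-separated layout shown in Ben's output.

```python
import re

# Flags named in the thread as evidence that memory debugging is compiled in.
REQUIRED_FLAGS = {"QM_MALLOC", "DBG_MALLOC"}


def missing_debug_flags(version_output: str) -> set:
    """Return the required flags that are absent from `opensips -V` output.

    Assumption: the output contains a line starting with "flags:" followed
    by a comma-separated list of compile flags, as in the thread.
    """
    match = re.search(r"flags:\s*(.*)", version_output)
    if match is None:
        # No flags line at all: treat every required flag as missing.
        return set(REQUIRED_FLAGS)
    # Split on commas and whitespace so each flag is compared as a whole token.
    tokens = {tok for tok in re.split(r"[,\s]+", match.group(1)) if tok}
    return REQUIRED_FLAGS - tokens
```

Feeding the function the captured text of `opensips -V` yields an empty set when both flags are present; a non-empty result could be used to fail the build before a binary without memory debugging reaches the test environment.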