Bugs item #2818650, was opened at 2009-07-08 20:49
Message generated for change (Comment added) made by bogdan_iancu
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2818650&group_id=232389

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: core
Group: 1.4.x
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Thomas V (tomvogt)
Assigned to: Bogdan-Andrei Iancu (bogdan_iancu)
Summary: signo=Could not find the frame base for "sig_alarm_abort"

Initial Comment:
opensips 1.4.5-notls, FreeBSD 7.0 (i386).

opensips core dumps after a few hours with:

#0  0x28268017 in kill () from /lib/libc.so.7
#1  0x28267f76 in raise () from /lib/libc.so.7
#2  0x28266b8a in abort () from /lib/libc.so.7
#3  0x08067ee3 in sig_alarm_abort (signo=Could not find the frame base for "sig_alarm_abort".
) at main.c:423
#4  <signal handler called>
#5  fm_status (qm=0x287fe000) at mem/f_malloc.c:512
#6  0x080679ed in cleanup (show_status=1) at main.c:356
#7  0x08068795 in handle_sigs () at main.c:524
#8  0x0806c969 in main (argc=3, argv=0xbfbfec58) at main.c:866

Any idea?

----------------------------------------------------------------------

>Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-13 15:14

Message:
Hi Thomas,

Somewhere in the code, a SEG FAULT happens:

Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8408]: INFO:core:handle_sigs: child process 8421 exited by a signal 11

This triggers a shutdown, but the shutdown fails and a second crash happens.
The new core file overwrites the original one, hiding the real cause of
the crash.

To avoid this, I attached here a small patch that skips the shutdown part,
so that the original core is not overwritten.

Please patch your code and post here the BT of the real core file.

----------------------------------------------------------------------

Comment By: Thomas V (tomvogt)
Date: 2009-07-10 16:07

Message:
It also crashes with more RAM. It looks like memory is not the cause.
I do not have more information.

With an older opensips 1.4.2-notls the system always crashed with:

SIP/2.0 180 Ringing\r\nVia: SIP/2.0/UDP 212.xxx.xxx.xxx;branch=z9hG4bK0622.99552112.0\r\nVia: SIP/2.0/UDP 85.xxx.xxx.xxx4:5060;received=85.xxx.xxx.xxx;rport=5060;branch=z9hG4bKc3ba5c8f50bc782082391762e9a2fa9e\r\nF"..., len=503, rcv_info=0xbfbfeb14

An ngrep capture taken while 1.4.2 was running showed:

U 85.xxx.xxx.xxx:5060 -> 212.xxx.xxx.xxx:5060
INVITE sip:xxxxx...@212.xxx.xxx.xxx SIP/2.0.
Via: SIP/2.0/UDP 85.xxx.xxx.xxx4:5060;rport;branch=z9hG4bK3759d84a0bea3edc47d08b9ec1cd2979.
To: <sip:xxxx...@212.xxx.xxx.xxx>.
From: <sip:xxxx...@212.xxx.xxx.xxx>;tag=AI60E7E8F4CC7D6BD2.
Call-ID: ai1e214f5edc57c...@192.168.1.240.
CSeq: 1 INVITE.
Max-Forwards: 70.
Contact: <sip:xxxx...@85.xxx.xxx.xxx;line=AIDF22AB43A35036D9>.
Accept: application/sdp.
Allow: ACK,BYE,CANCEL,INVITE,NOTIFY,OPTIONS,REFER.
P-Preferred-Identity: <sip:xxxxxx...@212.xxx.xxx.xxx5>.
Privacy: none.
User-Agent: Aastra Intelligate.
Content-Type: application/sdp.
Content-Length: 248.
. . . . . . . . .
#
I 212.xxx.xxx.xxx -> 85.xxx.xxx.xxx 3:3
...E(..k...;. Uv...e.}~ ....;l
#

We installed Revision: 5049. After that we had the sig_alarm_abort issue,
so we updated to 1.4.5-notls. Nothing changed; opensips still crashes.

----------------------------------------------------------------------

Comment By: Thomas V (tomvogt)
Date: 2009-07-10 12:13

Message:
Hi Bogdan

I'm not sure, but it looks like a memory leak. I upgraded the system to
FreeBSD 7.2 and added an additional 1 GB of memory (2 GB total). After 48h
opensips had reserved at least 1 GB (the other 1 GB is for postgresql).
We do not allow swap on the system.
If the system has less than 50mb left, opensips crashes.

Log:
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8408]: INFO:core:handle_sigs: child process 8421 exited by a signal 11
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8408]: INFO:core:handle_sigs: core was generated
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8408]: INFO:core:handle_sigs: terminating due to SIGCHLD
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8439]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8435]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8434]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8433]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8432]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8427]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8426]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8425]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8424]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8423]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8420]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8420]: CRITICAL:core:fm_status: different free frag. count: 15!=37 for hash 5
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8419]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8431]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8430]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8429]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8428]: INFO:core:sig_usr: signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8450]: INFO:core:sig_usr: signal 15 received
Jul 10 08:59:59 sip02 /usr/local/sbin/opensips[8408]: CRITICAL:core:sig_alarm_abort: BUG - shutdown timeout triggered, dying...

coredump:
(gdb) where
#0  0x2826de67 in kill () from /lib/libc.so.7
#1  0x2826ddc6 in raise () from /lib/libc.so.7
#2  0x2826c9da in abort () from /lib/libc.so.7
#3  0x08067ee3 in sig_alarm_abort (signo=14) at main.c:422
#4  <signal handler called>
#5  0x080daf8b in fm_status (qm=0x28807000) at mem/f_malloc.c:530
#6  0x080679ed in cleanup (show_status=1) at fastlock.h:181
#7  0x08068795 in handle_sigs () at main.c:523
#8  0x0806c969 in main (argc=3, argv=0xbfbfec7c) at main.c:860
(gdb) frame 6
#6  0x080679ed in cleanup (show_status=1) at fastlock.h:181
181     fastlock.h: No such file or directory.
        in fastlock.h

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-09 10:54

Message:
Hi Thomas,

the core is actually showing an unsuccessful shutdown (see the cleanup()
on frame 6). The final coredump was generated because the shutdown took
longer than 20 seconds (and there is an ALARM signal to prevent blocking
during the shutdown).

So the question is: in the logs, do you have any reports related to the
cause for the initial shutdown?

Regards,
Bogdan

----------------------------------------------------------------------
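
For readers following the backtraces in this thread: the pattern described above (an ALARM signal armed before cleanup, so that a blocked shutdown ends in abort() and leaves a core) can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the OpenSIPS code in main.c. The names shutdown_alarm_abort, guarded_shutdown, run_guarded_shutdown, quick_cleanup and stuck_cleanup are hypothetical, and the timeout is a parameter rather than the value OpenSIPS uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* SIGALRM handler (hypothetical name): the cleanup overran its time
 * budget, so abort() to terminate with SIGABRT and leave a core file. */
static void shutdown_alarm_abort(int signo)
{
    (void)signo;
    abort();
}

/* Hypothetical cleanup routines, used only for this demonstration. */
static void quick_cleanup(void) { /* returns immediately */ }
static void stuck_cleanup(void) { for (;;) pause(); /* never returns */ }

/* Arm the watchdog, run the cleanup, and exit normally if it finishes
 * before the alarm fires. Never returns to the caller. */
static void guarded_shutdown(void (*cleanup_fn)(void), unsigned timeout_sec)
{
    signal(SIGALRM, shutdown_alarm_abort);
    alarm(timeout_sec);   /* start the shutdown timer */
    cleanup_fn();
    alarm(0);             /* cleanup finished in time: cancel the timer */
    _exit(0);
}

/* Run guarded_shutdown() in a child process so the caller can observe
 * the outcome. Returns 0 on a clean exit, the terminating signal number
 * if the child was killed by a signal, or -1 on error. */
static int run_guarded_shutdown(void (*cleanup_fn)(void), unsigned timeout_sec)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        guarded_shutdown(cleanup_fn, timeout_sec);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_guarded_shutdown(stuck_cleanup, 1) should report SIGABRT after about one second, which mirrors the sig_alarm_abort frame sitting above the signal handler in the backtraces, while run_guarded_shutdown(quick_cleanup, 1) should report a clean exit. The sketch also shows why the resulting core points at the cleanup path rather than at the original fault: the abort happens in the process doing the shutdown, not in the child that received signal 11.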