The GitHub issue has the version info for that crash; for all the other 
crashes, the version is below.

$ opensips -V
version: opensips 2.4.5 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
git revision: 60097425d
main.c compiled on 18:06:35 Jun 13 2019 with gcc 7

Ben Newlin

From: Devel <devel-boun...@lists.opensips.org> on behalf of Ben Newlin 
<ben.new...@genesys.com>
Reply-To: OpenSIPS devel mailling list <devel@lists.opensips.org>
Date: Wednesday, June 19, 2019 at 2:30 PM
To: Bogdan-Andrei Iancu <bog...@opensips.org>, OpenSIPS devel mailling list 
<devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Bogdan,

I’m continuing to try to reproduce the timing for that crash. In the meantime, 
we’ve had quite a few more crashes, but they don’t seem to have the same cause.

I opened an issue on GitHub for a reproducible, consistent crash involving 
Dialog pinging that is new in 2.4.6. [1]

We also had several of our servers crash over the last few days due to what 
may be a double free of memory. That is just a guess; I’m not great at reading 
backtraces. [2] [3] [4] [5] [6] [7]

Finally, we had another crash that seems to be in TLS processing. These 
backtraces don’t show much, so I don’t know if they will be helpful. [8] [9]


[1] https://github.com/OpenSIPS/opensips/issues/1736
[2] https://pastebin.com/HeRPs5wt
[3] https://pastebin.com/Fs6iUD7b
[4] https://pastebin.com/EkRNi2iM
[5] https://pastebin.com/9ZAurMwa
[6] https://pastebin.com/QyWhygvf
[7] https://pastebin.com/vEUm4UtK
[8] https://pastebin.com/0VaQfX5B
[9] https://pastebin.com/LYUW0AqH


Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Monday, June 10, 2019 at 2:41 AM
To: Ben Newlin <ben.new...@genesys.com>, OpenSIPS devel mailling list 
<devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Ben,

What we have so far is too little data to reach a conclusion. Let's wait 
and see if the crash reproduces.

Regards,



Bogdan-Andrei Iancu



OpenSIPS Founder and Developer

  https://www.opensips-solutions.com

OpenSIPS Summit 2019

  https://www.opensips.org/events/Summit-2019Amsterdam/
On 06/07/2019 04:27 PM, Ben Newlin wrote:
Bogdan,

I no longer have the original backtrace I posted in May, but if it was the same 
issue then it has only happened those two times, both when under load. I have 
not been able to reproduce it reliably or with single calls.

On the double ACK: if the MF value changed, then the ACK was not just traced 
twice; it was actually sent twice. But what scenario would cause retransmission 
of a hop-by-hop ACK? I can’t think of one, so it still seems strange. It may 
be a symptom of the issue rather than a cause.

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Friday, June 7, 2019 at 9:15 AM
To: Ben Newlin <ben.new...@genesys.com>, OpenSIPS devel mailling list 
<devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

How often does this crash occur, and how easy is it to reproduce (if 
possible)? Brainstorming with Razvan, we suspect a race (on the msg saved in 
shmem in the transaction) between the process doing the cleanup after the 
async resume and the process running the failure route (due to the 503).

But this is just a supposition; perhaps you can validate it, or rule it out, 
by removing the async?

And on the double ACK: I'm not 100% sure it is actually a double one, as the 
second has a smaller MF value (69, versus 70 on the first ACK).

Regards,




Bogdan-Andrei Iancu



On 06/07/2019 03:52 PM, Ben Newlin wrote:
Bogdan,

Sorry, I should have thought to actually look at the trace and examine this 
call.

1) Yes
2) The Called Party is 10.32.20.60, which is another OpenSIPS instance. The 
crashed instance received the "503 Service Unavailable" approximately 8-10 ms 
after sending the INVITE.

There is a SIP trace of the exchange here: https://pastebin.com/6bttsSVD.

One oddity I saw is that the crashed process appears to send (or at least 
siptrace) the ACK twice.

Ben Newlin

From: Bogdan-Andrei Iancu <bog...@opensips.org>
Date: Thursday, June 6, 2019 at 11:42 AM
To: OpenSIPS devel mailling list <devel@lists.opensips.org>, Ben Newlin 
<ben.new...@genesys.com>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

Hi Ben,

Thanks for "another" report :).

Questions:
1) Do you do any async for the INVITE in this crash?
2) If the answer to (1) is yes, is the called party generating the "503 
Service Unavailable" (which triggers the crash), 10.32.20.60, a really close 
(from a network delay perspective) and fast-to-answer party?

Regards,





Bogdan-Andrei Iancu



On 06/05/2019 10:02 PM, Ben Newlin wrote:
We have had another crash today.

Backtrace is here: https://pastebin.com/q4RQC7kS

I found this in the log at the time of the crash:

Jun  5 17:54:10 [4978] CRITICAL:core:sig_usr: segfault in process pid: 4978, id: 8


Please let me know if any further information can be useful.

Ben Newlin

From: Devel <devel-boun...@lists.opensips.org> on behalf of Ben Newlin 
<ben.new...@genesys.com>
Reply-To: OpenSIPS devel mailling list <devel@lists.opensips.org>
Date: Friday, May 10, 2019 at 6:31 PM
To: OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: Re: [OpenSIPS-Devel] OpenSIPS Crash

I found this in the log at the time of the crash:

kernel: opensips[5003]: segfault at 30 ip 00007fbd4c8f59d0 sp 00007ffcaa850c80 error 6 in tm.so[7fbd4c887000+8e000]
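
For anyone reading along, here is a minimal sketch of how a kernel segfault 
line like the one above can be decoded. It assumes the usual x86 format 
(hex fault address, instruction pointer, page-fault error code, and the 
mapped object as base+size); the function name decode_segfault is only 
illustrative and is not part of OpenSIPS or any tool.

```python
import re

# Kernel log line quoted above, joined onto a single line.
LINE = ("kernel: opensips[5003]: segfault at 30 ip 00007fbd4c8f59d0 "
        "sp 00007ffcaa850c80 error 6 in tm.so[7fbd4c887000+8e000]")

# All numeric fields in this message are printed in hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split an x86 segfault line into its fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not an x86 segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        # Offset of the faulting instruction inside the shared object.
        "offset": ip - base,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),  # bit 0: 0 = page not mapped
        "write": bool(err & 2),         # bit 1: 1 = write access
        "user_mode": bool(err & 4),     # bit 2: 1 = fault in user mode
    }


info = decode_segfault(LINE)
print(info["object"], hex(info["offset"]), hex(info["fault_addr"]))
```

For this line the offset into tm.so works out to 0x6e9d0, and error 6 
decodes to a user-mode write to an unmapped page at address 0x30. That 
pattern is often a write through a NULL structure pointer, though that is 
only a guess from the log line. With debug symbols for the exact same build, 
the offset could be resolved with something like 
"addr2line -f -e tm.so 0x6e9d0" (the path to tm.so depends on the install).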

Ben Newlin

From: Devel <devel-boun...@lists.opensips.org> on behalf of Ben Newlin 
<ben.new...@genesys.com>
Reply-To: OpenSIPS devel mailling list <devel@lists.opensips.org>
Date: Friday, May 10, 2019 at 5:44 PM
To: OpenSIPS devel mailling list <devel@lists.opensips.org>
Subject: [OpenSIPS-Devel] OpenSIPS Crash

Hello,

We had a crash today of our OpenSIPS instance.

Backtrace is here: https://pastebin.com/QbRJimwx

# opensips -V
version: opensips 2.4.5 (x86_64/linux)
flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll, sigio_rt, select.
git revision: d025b4f61
main.c compiled on 20:58:31 May  9 2019 with gcc 7

Ben Newlin







_______________________________________________
Devel mailing list
Devel@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/devel