On 04/03/2013 08:15 PM, Duane Larson wrote:
It just happened again, on both machines at the same time, and this time I
was running debug on both servers. I am running OpenSIPS and load balancing
between the two servers, so my guess is that when the INVITE was sent to the
first server, that server froze for some reason; OpenSIPS then sent the
INVITE to the second server, which also froze/deadlocked because of the same
SIP message. On both servers, the last log line written when Asterisk
deadlocked was the following:
Asterisk version 11.0.1:
[Apr 3 21:39:42] DEBUG[12984] res_timing_timerfd.c: Expected to acknowledge 1 ticks but got 11805 instead
Asterisk version 11.2.1:
[Apr 3 21:39:50] DEBUG[1854] res_timing_timerfd.c: Expected to acknowledge 1 ticks but got 12423 instead
In my last email I posted the debug output from the Asterisk server running
11.0.1. Here is the debug output from the Asterisk server running 11.2.1:
http://pastebin.com/mbjSSAWM
This has to be a bug, right? I am thinking of opening an issue on the
Asterisk JIRA system.
A number of deadlocks were fixed in the current release of 11.3. Please
read the change log to see if any fit your issue.
http://downloads.asterisk.org/pub/telephony/asterisk/ChangeLog-11-current
On Wed, Apr 3, 2013 at 4:45 PM, Duane Larson <duane.lar...@gmail.com> wrote:
It just happened again on the 11.0.1 box and I was able to grab debug
output. I am hoping someone can tell me whether this is a bug or something
wrong with my config. I attached gdb to the running process with:
gdb asterisk-bin/sbin/asterisk 29048
The debug output is here:
http://pastebin.com/DGXx0BSk
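
Attaching interactively as above works. For a suspected deadlock it can also help to dump a backtrace of every thread non-interactively, so the output can be pasted or attached to an issue. A minimal sketch, assuming gdb is installed on the server: `-batch` and `-ex` are standard gdb options and `thread apply all bt` is a standard gdb command, while the helper names and the output file name are hypothetical and chosen only for illustration.

```python
import subprocess


def gdb_backtrace_cmd(binary, pid):
    """Build a gdb command line that attaches to `pid`, prints a backtrace
    for every thread, then detaches and exits (batch mode)."""
    return ["gdb", "-batch", "-ex", "thread apply all bt", binary, str(pid)]


def dump_backtraces(binary, pid, out_path):
    """Run gdb against the live process and write its output to `out_path`.

    Hypothetical helper: needs gdb and permission to attach to the process.
    """
    with open(out_path, "w") as out:
        subprocess.run(
            gdb_backtrace_cmd(binary, pid),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=False,
        )


# Example call (not executed here); the binary path and PID are the ones
# from the gdb command above, the output file name is made up:
# dump_backtraces("asterisk-bin/sbin/asterisk", 29048, "backtrace-threads.txt")
```

The process is paused while gdb walks the threads, so this is best run while Asterisk is already unresponsive.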
On Tue, Apr 2, 2013 at 7:42 PM, Duane Larson <duane.lar...@gmail.com> wrote:
I am currently running two different versions of Asterisk, 11.0.1 and
11.2.1, and I have seen the problem occur on both servers.
The issue is that when I try to dial a phone number, sometimes the call
never goes out. When I check the Asterisk server with ngrep, I can see that
the SIP messages are reaching Asterisk, but Asterisk isn't responding.
When I run "netstat -nap | grep 5060", I see that Asterisk has a large
value in the "Recv-Q" column.
It usually takes about 10 minutes before Asterisk becomes responsive
again. If I restart Asterisk before the 10 minutes are up, everything goes
back to normal.
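
The Recv-Q check described above can be scripted so the backlog is noticed while it is happening. A minimal sketch, assuming the usual Linux net-tools column order for `netstat -nap` (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, ...). The function name `sip_backlog` and the sample text are made up for illustration; the sample is not output from these servers.

```python
def sip_backlog(netstat_output, port=5060, threshold=0):
    """Return (local_address, recv_q) for each socket bound to `port`
    whose Recv-Q value is greater than `threshold`."""
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Keep only socket rows: Proto Recv-Q Send-Q Local Foreign ...
        if len(fields) < 5 or not fields[0].startswith(("udp", "tcp")):
            continue
        if not fields[1].isdigit():
            continue
        local = fields[3]
        # The port is whatever follows the last ':' in the local address.
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        recv_q = int(fields[1])
        if recv_q > threshold:
            hits.append((local, recv_q))
    return hits


# Hypothetical sample in net-tools layout (values invented for the example).
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
udp   213504      0 0.0.0.0:5060            0.0.0.0:*                           29048/asterisk
udp        0      0 0.0.0.0:4569            0.0.0.0:*                           29048/asterisk
"""

if __name__ == "__main__":
    for local, recv_q in sip_backlog(SAMPLE):
        print(f"{local} has {recv_q} bytes waiting in Recv-Q")
```

A Recv-Q that stays non-zero means the kernel is holding received data that the application has not yet read, which fits the picture of SIP messages arriving while Asterisk does not respond.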
I see the following warnings in the message logs:
On the 11.0.1 Asterisk server:
WARNING[23723][C-00000010] chan_sip.c: Unable to cancel schedule ID 11473. This is probably a bug (chan_sip.c: update_provisional_keepalive, line 4406).
On the 11.2.1 Asterisk server:
WARNING[3493][C-0000001f] chan_sip.c: Unable to cancel schedule ID 30810. This is probably a bug (chan_sip.c: update_provisional_keepalive, line 4683).
When I look at those line numbers in chan_sip.c on each server, I see that
they are the same line of code:
AST_SCHED_DEL_UNREF(sched, pvt->provisional_keepalive_sched_id,
    dialog_unref(pvt, "when you delete the provisional_keepalive_sched_id, you should dec the refcount for the stored dialog ptr"));
What could be causing this? It seems to happen at least once a day.
--
*--*--*--*--*--*
Duane
*--*--*--*--*--*
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
http://www.asterisk.org/hello
asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
http://lists.digium.com/mailman/listinfo/asterisk-users
--
Jim Lucas
http://www.cmsws.com/
http://www.cmsws.com/examples/