Bug#493055: BRIStuff patches introduce deadlock

2008-10-03 Thread Faidon Liambotis
Hi,

Faidon Liambotis wrote:
 Kevin Shanahan wrote:
 Well it looked promising, but no - the lockup still appears to be
 reproducable. The patch I used to test is attached. Let me know if I
 made any mistake in backporting it (it should apply cleanly as the
 last patch in the debian/patches/series file).
 Yes, you made a mistake in backporting it :)
 
 The right backported patch is this:
 http://svn.debian.org/viewsvn/pkg-voip/asterisk/trunk/debian/patches/zap-fix-deadlock?rev=6221view=auto
 
 The problem is where thw lock is getting held; I'm not sure if we should
  hold the lock there as well but it's definitely not what the original
 patch did.
Did you have any luck with this?
I'll make an upload tomorrow or the day after that, so I'd appreciate it
you could answer me until then. We can always make an upload later, but
it'd be nice to know if the bug will exist in lenny (hopefully not!)

Thanks,
Faidon



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#493055: BRIStuff patches introduce deadlock

2008-09-28 Thread Kevin Shanahan
On Sat, Sep 27, 2008 at 05:10:33AM +0300, Faidon Liambotis wrote:
 You seem to have a good overview of this particular bug.
 
 Do you think that the patch that Tzafrir mentioned[1] would fix it?
 If not, do you have a proposal for a fix?
 
 1: 
 http://repo.or.cz/w/asterisk-bristuff.git?a=commit;h=6e44531a8a112e36588b5dbced309b0521d6b64e

Well it looked promising, but no - the lockup still appears to be
reproducable. The patch I used to test is attached. Let me know if I
made any mistake in backporting it (it should apply cleanly as the
last patch in the debian/patches/series file).

I'll see if I can get a useful backtrace when I have a bit more time
where I can afford to have the phones offline.

Regards,
Kevin.



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#493055: BRIStuff patches introduce deadlock

2008-09-28 Thread Tzafrir Cohen
On Sun, Sep 28, 2008 at 08:19:57PM +0930, Kevin Shanahan wrote:
 On Sat, Sep 27, 2008 at 05:10:33AM +0300, Faidon Liambotis wrote:
  You seem to have a good overview of this particular bug.
  
  Do you think that the patch that Tzafrir mentioned[1] would fix it?
  If not, do you have a proposal for a fix?
  
  1: 
  http://repo.or.cz/w/asterisk-bristuff.git?a=commit;h=6e44531a8a112e36588b5dbced309b0521d6b64e
 
 Well it looked promising, but no - the lockup still appears to be
 reproducable. The patch I used to test is attached. Let me know if I
 made any mistake in backporting it (it should apply cleanly as the
 last patch in the debian/patches/series file).
 
 I'll see if I can get a useful backtrace when I have a bit more time
 where I can afford to have the phones offline.

I noticed that 1.4.21 - 1.4.22 cleared quite a few deadlocks (see the
DEADLOCK_AVOIDANCE macro).

Can you give a reproducable deadlock scenario?

-- 
   Tzafrir Cohen
icq#16849755  jabber:[EMAIL PROTECTED]
+972-50-7952406   mailto:[EMAIL PROTECTED]
http://www.xorcom.com  iax:[EMAIL PROTECTED]/tzafrir



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#493055: BRIStuff patches introduce deadlock

2008-09-28 Thread Kevin Shanahan
On Sun, Sep 28, 2008 at 02:00:17PM +0300, Tzafrir Cohen wrote:
 I noticed that 1.4.21 - 1.4.22 cleared quite a few deadlocks (see the
 DEADLOCK_AVOIDANCE macro).

Remeber, my deadlock only occurs when the bristuff patches have been
applied.

 Can you give a reproducable deadlock scenario?

I don't know that I can give you an easy way to reproduce it yourself,
but I can certainly reproduce this at will here. All I need to do is
call into the PBX on a PRI number from my home phone.

Regards,
Kevin.



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#493055: BRIStuff patches introduce deadlock

2008-09-28 Thread Tzafrir Cohen
On Sun, Sep 28, 2008 at 08:44:14PM +0930, Kevin Shanahan wrote:
 On Sun, Sep 28, 2008 at 02:00:17PM +0300, Tzafrir Cohen wrote:
  I noticed that 1.4.21 - 1.4.22 cleared quite a few deadlocks (see the
  DEADLOCK_AVOIDANCE macro).
 
 Remeber, my deadlock only occurs when the bristuff patches have been
 applied.
 
  Can you give a reproducable deadlock scenario?
 
 I don't know that I can give you an easy way to reproduce it yourself,
 but I can certainly reproduce this at will here. All I need to do is
 call into the PBX on a PRI number from my home phone.

Can you please give a full trace (both 'pri debug span N' and 'core set
verbose/debug' with verbose/debug logging enabled)?

What version do you use exactly?

-- 
   Tzafrir Cohen
icq#16849755  jabber:[EMAIL PROTECTED]
+972-50-7952406   mailto:[EMAIL PROTECTED]
http://www.xorcom.com  iax:[EMAIL PROTECTED]/tzafrir



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#493055: BRIStuff patches introduce deadlock

2008-09-28 Thread Kevin Shanahan
On Sun, Sep 28, 2008 at 08:19:57PM +0930, Kevin Shanahan wrote:
 Well it looked promising, but no - the lockup still appears to be
 reproducable. The patch I used to test is attached. Let me know if I
 made any mistake in backporting it (it should apply cleanly as the
 last patch in the debian/patches/series file).

Ugh, sorry - forgot to attach the patch.
Locking fixes in chan_zap: lock chan before pvt

Taken from 
http://repo.or.cz/w/asterisk-bristuff.git?a=commit;h=6e44531a8a112e36588b5dbced309b0521d6b64e

diff -urNp a/channels/chan_zap.c b/channels/chan_zap.c
--- a/channels/chan_zap.c   2008-09-28 19:33:09.0 +0930
+++ b/channels/chan_zap.c   2008-09-28 19:37:27.0 +0930
@@ -8851,7 +8851,7 @@ static void *pri_dchannel(void *vpri)
int haveidles;
int activeidles;
int nextidle = -1;
-   struct ast_channel *c;
+   struct ast_channel *c = NULL;
struct timeval tv, lowest, *next;
struct timeval lastidle = { 0, 0 };
int doidling=0;
@@ -9539,6 +9539,7 @@ static void *pri_dchannel(void *vpri)
if 
(e-ring.redirectingreason = 0)

pbx_builtin_setvar_helper(c, PRIREDIRECTREASON, 
redirectingreason2str(e-ring.redirectingreason));

+   
ast_muted_lock(c-lock);

ast_mutex_lock(pri-pvts[chanpos]-lock);

ast_mutex_lock(pri-lock);
 
@@ -9637,6 +9638,8 @@ static void *pri_dchannel(void *vpri)
if (crv)
ast_mutex_unlock(crv-lock);

ast_mutex_unlock(pri-pvts[chanpos]-lock);
+   if (c)
+   ast_mutex_unlock(c-lock);
} else {
if (e-ring.flexible)
pri_hangup(pri-pri, 
e-ring.call, PRI_CAUSE_NORMAL_CIRCUIT_CONGESTION, -1);


Bug#493055: BRIStuff patches introduce deadlock

2008-09-28 Thread Faidon Liambotis
Kevin, hi,

Kevin Shanahan wrote:
 Well it looked promising, but no - the lockup still appears to be
 reproducable. The patch I used to test is attached. Let me know if I
 made any mistake in backporting it (it should apply cleanly as the
 last patch in the debian/patches/series file).
Yes, you made a mistake in backporting it :)

The right backported patch is this:
http://svn.debian.org/viewsvn/pkg-voip/asterisk/trunk/debian/patches/zap-fix-deadlock?rev=6221view=auto

The problem is where thw lock is getting held; I'm not sure if we should
 hold the lock there as well but it's definitely not what the original
patch did.

Thanks a lot,
Faidon



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#493055: BRIStuff patches introduce deadlock

2008-09-26 Thread Faidon Liambotis
Kevin, hi,

Kevin Shanahan wrote:
 I had some problems with my Asterisk installation having the PRI
 channels lock up completely when certain types of calls were
 received. Eventually this was traced back to a deadlock caused by the
 bristuff patches being applied.
 
 Various information, logs and a stack trace of the lockup here:
http://bugs.digium.com/view.php?id=13192
 
 I'm choosing severity grave because this effectively will allow
 someone calling in with particlar caller ID options/flags to lock up
 your PRI span on demand (i.e. DoS).
You seem to have a good overview of this particular bug.

Do you think that the patch that Tzafrir mentioned[1] would fix it?
If not, do you have a proposal for a fix?

Thanks,
Faidon

1: 
http://repo.or.cz/w/asterisk-bristuff.git?a=commit;h=6e44531a8a112e36588b5dbced309b0521d6b64e



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#493055: BRIStuff patches introduce deadlock

2008-09-11 Thread Tzafrir Cohen
On Thu, Jul 31, 2008 at 01:14:46PM +0930, Kevin Shanahan wrote:
 Package: asterisk
 Version: 1.4.21.2~dfsg-1
 Severity: grave
 
 I had some problems with my Asterisk installation having the PRI
 channels lock up completely when certain types of calls were
 received. Eventually this was traced back to a deadlock caused by the
 bristuff patches being applied.
 
 Various information, logs and a stack trace of the lockup here:
http://bugs.digium.com/view.php?id=13192
 
 I'm choosing severity grave because this effectively will allow
 someone calling in with particlar caller ID options/flags to lock up
 your PRI span on demand (i.e. DoS).

The following patch has fixed at least one case of reproducable
deadlocks for us. It still needs to be backported to chan_zap.

It is currently at
http://repo.or.cz/w/asterisk-bristuff.git?a=commit;h=6e44531a8a112e36588b5dbced309b0521d6b64e
(Should be at the tag bristuff-0.4.0-RC4-xr1, but the mirroring seems
to be lagging)

-- 
   Tzafrir Cohen
icq#16849755  jabber:[EMAIL PROTECTED]
+972-50-7952406   mailto:[EMAIL PROTECTED]
http://www.xorcom.com  iax:[EMAIL PROTECTED]/tzafrir
commit ccd11da0599c190f5b678aed3f164579ca873c71
Author: Tzafrir Cohen [EMAIL PROTECTED]
Date:   Mon Sep 8 10:59:06 2008 +

Locking fixes in chan_dahdi: lock chan before pvt

diff --git a/channels/chan_dahdi.c b/channels/chan_dahdi.c
index 35c4e4e..c12c112 100644
--- a/channels/chan_dahdi.c
+++ b/channels/chan_dahdi.c
@@ -9105,7 +9105,7 @@ static void *pri_dchannel(void *vpri)
 	int haveidles;
 	int activeidles;
 	int nextidle = -1;
-	struct ast_channel *c;
+	struct ast_channel *c = NULL;
 	struct timeval tv, lowest, *next;
 	struct timeval lastidle = { 0, 0 };
 	int doidling=0;
@@ -9846,6 +9846,7 @@ static void *pri_dchannel(void *vpri)
 snprintf(calledtonstr, sizeof(calledtonstr)-1, %d, e-ring.calledplan);
 pbx_builtin_setvar_helper(c, CALLEDTON, calledtonstr);
 
+ast_mutex_lock(c-lock);
 ast_mutex_lock(pri-pvts[chanpos]-lock);
 ast_mutex_lock(pri-lock);
 
@@ -9893,6 +9894,8 @@ static void *pri_dchannel(void *vpri)
 	if (crv)
 		ast_mutex_unlock(crv-lock);
 	ast_mutex_unlock(pri-pvts[chanpos]-lock);
+	if (c)
+		ast_mutex_unlock(c-lock);
 } else {
 	if (e-ring.flexible)
 		pri_hangup(pri-pri, e-ring.call, PRI_CAUSE_NORMAL_CIRCUIT_CONGESTION);


Bug#493055: BRIStuff patches introduce deadlock

2008-07-30 Thread Kevin Shanahan
Package: asterisk
Version: 1.4.21.2~dfsg-1
Severity: grave

I had some problems with my Asterisk installation having the PRI
channels lock up completely when certain types of calls were
received. Eventually this was traced back to a deadlock caused by the
bristuff patches being applied.

Various information, logs and a stack trace of the lockup here:
   http://bugs.digium.com/view.php?id=13192

I'm choosing severity grave because this effectively will allow
someone calling in with particlar caller ID options/flags to lock up
your PRI span on demand (i.e. DoS).

Regards,
Kevin.



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#493055: BRIStuff patches introduce deadlock

2008-07-30 Thread Faidon Liambotis

severity 493055 important
thanks

Kevin Shanahan wrote:
I had some problems with my Asterisk installation having the PRI

channels lock up completely when certain types of calls were
received. Eventually this was traced back to a deadlock caused by the
bristuff patches being applied.

Various information, logs and a stack trace of the lockup here:
   http://bugs.digium.com/view.php?id=13192

I'm choosing severity grave because this effectively will allow
someone calling in with particlar caller ID options/flags to lock up
your PRI span on demand (i.e. DoS).

Well, this is definitely a very nasty bug but from a release prespective

I'd like to point out two things:
a) You are one of the _very_ few people to experience this deadlock;
   I've experienced it once or twice in one of my (many!) installations
   all this time and noone else has reported it so far. Perhaps it has
   something to do with what the network sends to Asterisk, I'm not so
   sure myself.
b) Asterisk is a very complicated program with dozens of threads and
   locks between them. I don't think we (as in, Debian) can solve this
   and I highly doubt that upstream will, at least in time for lenny.

In summary, while I acknowledge that this breaks your setup -and believe 
me, I'd like to have this fixed as much as you do- I don't think it's a 
release critical bug and that it should prevent Asterisk from being 
shipped with our next stable release.


Thanks,
Faidon



--
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]