Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM

2006-09-04 Thread Alexey Kuznetsov
Hello!

 This path obviously breaks assumption 1) and therefore can lead to ABBA
 dead-locks.

Yes...


 I've looked at the history and there seems to be no reason for the lock
 to be held at all in dev_watchdog_up.  The lock appeared in day one and
 even there it was unnecessary.

Seems, it serializes mod_timer and timer handler to keep timer
in predictable state. Maybe, this is not necessary. A priori, it is required.

Note that in dev_watchdog_down() queue_lock is released before
taking xmit_lock. Probably, this is the thing which was supposed
to be done in dev_watchdog_up() too.

Alexey

-- 
VGER BF report: U 0.46385
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM

2006-09-04 Thread Herbert Xu
On Mon, Sep 04, 2006 at 12:44:14PM +0400, Alexey Kuznetsov wrote:
 
 Seems, it serializes mod_timer and timer handler to keep timer
 in predictable state. Maybe, this is not necessary. A priori, it is required.
 
 Note that in dev_watchdog_down() queue_lock is released before
 taking xmit_lock. Probably, this is the thing which was supposed
 to be done in dev_watchdog_up() too.

Right, in that case this should definitely be unncessary because both
dev_watchdog_up and dev_watchdog_up are called under RTNL.

The function __dev_watchdog_up could possibly be dodgy though but that's
a different story.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

-- 
VGER BF report: H 3.52484e-12
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM

2006-09-03 Thread Krzysztof Halasa
Francois Romieu [EMAIL PROTECTED] writes:

 dev_activate takes BH disabling locks only. How could a softirq happen
 on the same cpu and trigger a deadlock ?

That's BTW possible circular locking dependency and there is only
one CPU (no HT, single core - an old mobile Celeron P3).

I have to admit this machine has a history of mysterious hangs (black
screen of death), though I think they are related to LAN and maybe disk
activity, not GRE/PPP/ATM/ADSL (RAM tests ok, the hardware is rather
common - i440BX etc. but who knows).
-- 
Krzysztof Halasa

-- 
VGER BF report: S 0.999877
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM

2006-09-03 Thread Herbert Xu
[NET]: Drop tx lock in dev_watchdog_up

On Sat, Sep 02, 2006 at 08:39:28PM +, Krzysztof Halasa wrote:
 
 ===
 [ INFO: possible circular locking dependency detected ]
 ---
 swapper/0 is trying to acquire lock:
  (dev-queue_lock){-+..}, at: [c02c8c46] dev_queue_xmit+0x56/0x290
 
 but task is already holding lock:
  (dev-_xmit_lock){-+..}, at: [c02c8e14] dev_queue_xmit+0x224/0x290
 
 which lock already depends on the new lock.

This turns out to be a genuine bug.  The queue lock and xmit lock are
intentionally taken out of order.  Two things are supposed to prevent
dead-locks from occuring:

1) When we hold the queue_lock we're supposed to only do try_lock on the
tx_lock.

2) We always drop the queue_lock after taking the tx_lock and before doing
anything else.

 
 the existing dependency chain (in reverse order) is:
 
 - #1 (dev-_xmit_lock){-+..}:
[c012e7b6] lock_acquire+0x76/0xa0
[c0336241] _spin_lock_bh+0x31/0x40
[c02d25a9] dev_activate+0x69/0x120

This path obviously breaks assumption 1) and therefore can lead to ABBA
dead-locks.

I've looked at the history and there seems to be no reason for the lock
to be held at all in dev_watchdog_up.  The lock appeared in day one and
even there it was unnecessary.  In fact, people added __dev_watchdog_up
precisely in order to get around the tx lock there.

The function dev_watchdog_up is already serialised by rtnl_lock since
its only caller dev_activate is always called under it.

So here is a simple patch to remove the tx lock from dev_watchdog_up.
In 2.6.19 we can eliminate the unnecessary __dev_watchdog_up and
replace it with dev_watchdog_up.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 0834c2e..6f91518 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -238,9 +238,7 @@ void __netdev_watchdog_up(struct net_dev
 
 static void dev_watchdog_up(struct net_device *dev)
 {
-   netif_tx_lock_bh(dev);
__netdev_watchdog_up(dev);
-   netif_tx_unlock_bh(dev);
 }
 
 static void dev_watchdog_down(struct net_device *dev)

-- 
VGER BF report: U 0.5
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM

2006-09-03 Thread Krzysztof Halasa
Herbert Xu [EMAIL PROTECTED] writes:

 [NET]: Drop tx lock in dev_watchdog_up
 --- a/net/sched/sch_generic.c
 +++ b/net/sched/sch_generic.c
 @@ -238,9 +238,7 @@ void __netdev_watchdog_up(struct net_dev
  
  static void dev_watchdog_up(struct net_device *dev)
  {
 - netif_tx_lock_bh(dev);
   __netdev_watchdog_up(dev);
 - netif_tx_unlock_bh(dev);
  }
  
  static void dev_watchdog_down(struct net_device *dev)

Many thanks for looking into this. The lockdep warning is gone now.
-- 
Krzysztof Halasa

-- 
VGER BF report: H 3.43998e-11
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM

2006-09-03 Thread Larry Finger
Herbert Xu wrote:
 [NET]: Drop tx lock in dev_watchdog_up
 
 On Sat, Sep 02, 2006 at 08:39:28PM +, Krzysztof Halasa wrote:
 ===
 [ INFO: possible circular locking dependency detected ]
 ---
 swapper/0 is trying to acquire lock:
  (dev-queue_lock){-+..}, at: [c02c8c46] dev_queue_xmit+0x56/0x290

 but task is already holding lock:
  (dev-_xmit_lock){-+..}, at: [c02c8e14] dev_queue_xmit+0x224/0x290

 which lock already depends on the new lock.
 
 This turns out to be a genuine bug.  The queue lock and xmit lock are
 intentionally taken out of order.  Two things are supposed to prevent
 dead-locks from occuring:
 
 1) When we hold the queue_lock we're supposed to only do try_lock on the
 tx_lock.
 
 2) We always drop the queue_lock after taking the tx_lock and before doing
 anything else.
 
 the existing dependency chain (in reverse order) is:

 - #1 (dev-_xmit_lock){-+..}:
[c012e7b6] lock_acquire+0x76/0xa0
[c0336241] _spin_lock_bh+0x31/0x40
[c02d25a9] dev_activate+0x69/0x120
 
 This path obviously breaks assumption 1) and therefore can lead to ABBA
 dead-locks.
 
 I've looked at the history and there seems to be no reason for the lock
 to be held at all in dev_watchdog_up.  The lock appeared in day one and
 even there it was unnecessary.  In fact, people added __dev_watchdog_up
 precisely in order to get around the tx lock there.
 
 The function dev_watchdog_up is already serialised by rtnl_lock since
 its only caller dev_activate is always called under it.
 
 So here is a simple patch to remove the tx lock from dev_watchdog_up.
 In 2.6.19 we can eliminate the unnecessary __dev_watchdog_up and
 replace it with dev_watchdog_up.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

This patch also fixes a circular locking problem that I found on Linville's 
wireless-dev tree
involving ieee80211 and wpa_supplicant.

Larry Finger




-- 
VGER BF report: H 1.06514e-06
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM

2006-09-02 Thread Francois Romieu
Krzysztof Halasa [EMAIL PROTECTED] :
[...]
 ===
 [ INFO: possible circular locking dependency detected ]
 ---
 swapper/0 is trying to acquire lock:
  (dev-queue_lock){-+..}, at: [c02c8c46] dev_queue_xmit+0x56/0x290
 
 but task is already holding lock:
  (dev-_xmit_lock){-+..}, at: [c02c8e14] dev_queue_xmit+0x224/0x290
 
 which lock already depends on the new lock.
 
 
 the existing dependency chain (in reverse order) is:
 
 - #1 (dev-_xmit_lock){-+..}:
[c012e7b6] lock_acquire+0x76/0xa0
[c0336241] _spin_lock_bh+0x31/0x40
[c02d25a9] dev_activate+0x69/0x120
[...]
[c0169957] vfs_ioctl+0x57/0x290
[c0169bc9] sys_ioctl+0x39/0x60
[c0102c8d] sysenter_past_esp+0x56/0x8d

 
 - #0 (dev-queue_lock){-+..}:
[c012e7b6] lock_acquire+0x76/0xa0
[c03361fc] _spin_lock+0x2c/0x40
[c02c8c46] dev_queue_xmit+0x56/0x290
[...]
[c01194b5] __do_softirq+0x55/0xc0
[c0104b13] do_softirq+0x63/0xd0

dev_activate takes BH disabling locks only. How could a softirq happen
on the same cpu and trigger a deadlock ?

-- 
Ueimor

-- 
VGER BF report: U 0.500151
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html