Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM
Hello! This path obviously breaks assumption 1) and therefore can lead to ABBA dead-locks. Yes... I've looked at the history and there seems to be no reason for the lock to be held at all in dev_watchdog_up. The lock appeared in day one and even there it was unnecessary. Seems, it serializes mod_timer and timer handler to keep timer in predictable state. Maybe, this is not necessary. A priori, it is required. Note that in dev_watchdog_down() queue_lock is released before taking xmit_lock. Probably, this is the thing which was supposed to be done in dev_watchdog_up() too. Alexey -- VGER BF report: U 0.46385 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM
On Mon, Sep 04, 2006 at 12:44:14PM +0400, Alexey Kuznetsov wrote: Seems, it serializes mod_timer and timer handler to keep timer in predictable state. Maybe, this is not necessary. A priori, it is required. Note that in dev_watchdog_down() queue_lock is released before taking xmit_lock. Probably, this is the thing which was supposed to be done in dev_watchdog_up() too. Right, in that case this should definitely be unncessary because both dev_watchdog_up and dev_watchdog_up are called under RTNL. The function __dev_watchdog_up could possibly be dodgy though but that's a different story. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- VGER BF report: H 3.52484e-12 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM
Francois Romieu [EMAIL PROTECTED] writes: dev_activate takes BH disabling locks only. How could a softirq happen on the same cpu and trigger a deadlock ? That's BTW possible circular locking dependency and there is only one CPU (no HT, single core - an old mobile Celeron P3). I have to admit this machine has a history of mysterious hangs (black screen of death), though I think they are related to LAN and maybe disk activity, not GRE/PPP/ATM/ADSL (RAM tests ok, the hardware is rather common - i440BX etc. but who knows). -- Krzysztof Halasa -- VGER BF report: S 0.999877 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM
[NET]: Drop tx lock in dev_watchdog_up On Sat, Sep 02, 2006 at 08:39:28PM +, Krzysztof Halasa wrote: === [ INFO: possible circular locking dependency detected ] --- swapper/0 is trying to acquire lock: (dev-queue_lock){-+..}, at: [c02c8c46] dev_queue_xmit+0x56/0x290 but task is already holding lock: (dev-_xmit_lock){-+..}, at: [c02c8e14] dev_queue_xmit+0x224/0x290 which lock already depends on the new lock. This turns out to be a genuine bug. The queue lock and xmit lock are intentionally taken out of order. Two things are supposed to prevent dead-locks from occuring: 1) When we hold the queue_lock we're supposed to only do try_lock on the tx_lock. 2) We always drop the queue_lock after taking the tx_lock and before doing anything else. the existing dependency chain (in reverse order) is: - #1 (dev-_xmit_lock){-+..}: [c012e7b6] lock_acquire+0x76/0xa0 [c0336241] _spin_lock_bh+0x31/0x40 [c02d25a9] dev_activate+0x69/0x120 This path obviously breaks assumption 1) and therefore can lead to ABBA dead-locks. I've looked at the history and there seems to be no reason for the lock to be held at all in dev_watchdog_up. The lock appeared in day one and even there it was unnecessary. In fact, people added __dev_watchdog_up precisely in order to get around the tx lock there. The function dev_watchdog_up is already serialised by rtnl_lock since its only caller dev_activate is always called under it. So here is a simple patch to remove the tx lock from dev_watchdog_up. In 2.6.19 we can eliminate the unnecessary __dev_watchdog_up and replace it with dev_watchdog_up. Signed-off-by: Herbert Xu [EMAIL PROTECTED] Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 0834c2e..6f91518 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -238,9 +238,7 @@ void __netdev_watchdog_up(struct net_dev static void dev_watchdog_up(struct net_device *dev) { - netif_tx_lock_bh(dev); __netdev_watchdog_up(dev); - netif_tx_unlock_bh(dev); } static void dev_watchdog_down(struct net_device *dev) -- VGER BF report: U 0.5 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM
Herbert Xu [EMAIL PROTECTED] writes: [NET]: Drop tx lock in dev_watchdog_up --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -238,9 +238,7 @@ void __netdev_watchdog_up(struct net_dev static void dev_watchdog_up(struct net_device *dev) { - netif_tx_lock_bh(dev); __netdev_watchdog_up(dev); - netif_tx_unlock_bh(dev); } static void dev_watchdog_down(struct net_device *dev) Many thanks for looking into this. The lockdep warning is gone now. -- Krzysztof Halasa -- VGER BF report: H 3.43998e-11 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM
Herbert Xu wrote: [NET]: Drop tx lock in dev_watchdog_up On Sat, Sep 02, 2006 at 08:39:28PM +, Krzysztof Halasa wrote: === [ INFO: possible circular locking dependency detected ] --- swapper/0 is trying to acquire lock: (dev-queue_lock){-+..}, at: [c02c8c46] dev_queue_xmit+0x56/0x290 but task is already holding lock: (dev-_xmit_lock){-+..}, at: [c02c8e14] dev_queue_xmit+0x224/0x290 which lock already depends on the new lock. This turns out to be a genuine bug. The queue lock and xmit lock are intentionally taken out of order. Two things are supposed to prevent dead-locks from occuring: 1) When we hold the queue_lock we're supposed to only do try_lock on the tx_lock. 2) We always drop the queue_lock after taking the tx_lock and before doing anything else. the existing dependency chain (in reverse order) is: - #1 (dev-_xmit_lock){-+..}: [c012e7b6] lock_acquire+0x76/0xa0 [c0336241] _spin_lock_bh+0x31/0x40 [c02d25a9] dev_activate+0x69/0x120 This path obviously breaks assumption 1) and therefore can lead to ABBA dead-locks. I've looked at the history and there seems to be no reason for the lock to be held at all in dev_watchdog_up. The lock appeared in day one and even there it was unnecessary. In fact, people added __dev_watchdog_up precisely in order to get around the tx lock there. The function dev_watchdog_up is already serialised by rtnl_lock since its only caller dev_activate is always called under it. So here is a simple patch to remove the tx lock from dev_watchdog_up. In 2.6.19 we can eliminate the unnecessary __dev_watchdog_up and replace it with dev_watchdog_up. Signed-off-by: Herbert Xu [EMAIL PROTECTED] This patch also fixes a circular locking problem that I found on Linville's wireless-dev tree involving ieee80211 and wpa_supplicant. Larry Finger -- VGER BF report: H 1.06514e-06 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.18-rc5 with GRE, iptables and Speedtouch ADSL, PPP over ATM
Krzysztof Halasa [EMAIL PROTECTED] : [...] === [ INFO: possible circular locking dependency detected ] --- swapper/0 is trying to acquire lock: (dev-queue_lock){-+..}, at: [c02c8c46] dev_queue_xmit+0x56/0x290 but task is already holding lock: (dev-_xmit_lock){-+..}, at: [c02c8e14] dev_queue_xmit+0x224/0x290 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: - #1 (dev-_xmit_lock){-+..}: [c012e7b6] lock_acquire+0x76/0xa0 [c0336241] _spin_lock_bh+0x31/0x40 [c02d25a9] dev_activate+0x69/0x120 [...] [c0169957] vfs_ioctl+0x57/0x290 [c0169bc9] sys_ioctl+0x39/0x60 [c0102c8d] sysenter_past_esp+0x56/0x8d - #0 (dev-queue_lock){-+..}: [c012e7b6] lock_acquire+0x76/0xa0 [c03361fc] _spin_lock+0x2c/0x40 [c02c8c46] dev_queue_xmit+0x56/0x290 [...] [c01194b5] __do_softirq+0x55/0xc0 [c0104b13] do_softirq+0x63/0xd0 dev_activate takes BH disabling locks only. How could a softirq happen on the same cpu and trigger a deadlock ? -- Ueimor -- VGER BF report: U 0.500151 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html