Re: [BUG] RTNL and flush_scheduled_work deadlocks
On Fri, Feb 16, 2007 at 08:06:25AM -0800, Ben Greear wrote:
...
> Well, I had lockdep and all of the locking debugging I could find
> enabled, but it did not catch this problem. I had to use sysrq-t and
> manually dig through the backtraces to find the deadlock. It may be
> that lockdep could be enhanced to catch this sort of thing.

I think you are really good at tracing very interesting (subtle)
problems. I guess the scenario is like this:

1) some process takes some lock (e.g. RTNL),
2) a kthread runs a work function, which tries to take the same lock,
3) the process holding the lock calls flush_scheduled_work,
4) flush_cpu_workqueue waits for the kthread to finish.

So process #1 (holding the lock) waits for process #2 to finish, while
process #2 waits for the lock held by process #1. Of course it's a
lockup - similar to a circular dependency, but not the same: there is
only one lock. I don't think lockdep can be blamed here - if what a
process is waiting on is not a lock, lockdep cannot know why process #1
is blocked.

In my opinion the solution should be looked for in the workqueue code.
My idea: maybe there should be an additional lock, taken by the kthread
before running each work function and by any process calling the flush.
Then lockdep should have no problem with this dependency. This lock
could be under #ifdef DEBUG_LOCK..., so it only exists where it can be
analyzed. Of course there may be some simpler solution to this
otherwise hard-to-track problem.

I'm CC-ing this message to Ingo Molnar and hope he can find some time
to think about it.

Regards,
Jarek P.
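A minimal sketch of the extra dependency Jarek proposes, assuming a
lockdep-only pseudo-lock attached to each workqueue; the names used here
(wq->lockdep_map, lock_map_acquire()/lock_map_release(), run_one_work())
are illustrative assumptions, not the workqueue API of this kernel. The
worker "holds" the pseudo-lock around every work function, and
flush_workqueue() also "takes" it, so lockdep records RTNL -> workqueue
in the flushing task and workqueue -> RTNL in the work function and can
report the cycle instead of the machine hanging:

#include <linux/lockdep.h>
#include <linux/workqueue.h>

#ifdef CONFIG_LOCKDEP
/* called by the worker kthread for each queued work item */
static void run_one_work(struct workqueue_struct *wq, struct work_struct *work)
{
	/* the worker "holds" the pseudo-lock while the work function runs */
	lock_map_acquire(&wq->lockdep_map);	/* assumed per-workqueue lockdep class */
	work->func(work);
	lock_map_release(&wq->lockdep_map);
}

void flush_workqueue(struct workqueue_struct *wq)
{
	/*
	 * The flusher "takes" the same pseudo-lock, so a caller that
	 * already holds RTNL records RTNL -> wq, while a work function
	 * that calls rtnl_lock() records wq -> RTNL: lockdep then
	 * complains about the inversion.
	 */
	lock_map_acquire(&wq->lockdep_map);
	lock_map_release(&wq->lockdep_map);

	/* ... then really wait for all pending work to finish ... */
}
#endif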
Re: [BUG] RTNL and flush_scheduled_work deadlocks
On Thu, Feb 15, 2007 at 11:40:32PM -0800, Ben Greear wrote:
...
> Maybe there should be something like an ASSERT_NOT_RTNL() in the
> flush_scheduled_work() method?  If it's performance critical, #ifdef
> it out if we're not debugging locks?

Yes! I thought about the same thing (at first). But in my opinion it
was not enough, so I thought about doing this in flush_workqueue. But
then I decided that was not enough either. Now I think something like
this should be done in rtnl_lock (under some debugging #if, of course).

Cheers,
Jarek P.
Re: [BUG] RTNL and flush_scheduled_work deadlocks
Jarek Poplawski wrote:
> On Thu, Feb 15, 2007 at 11:40:32PM -0800, Ben Greear wrote:
> ...
>> Maybe there should be something like an ASSERT_NOT_RTNL() in the
>> flush_scheduled_work() method?  If it's performance critical, #ifdef
>> it out if we're not debugging locks?
>
> Yes! I thought about the same thing (at first). But in my opinion it
> was not enough, so I thought about doing this in flush_workqueue. But
> then I decided that was not enough either. Now I think something like
> this should be done in rtnl_lock (under some debugging #if, of course).

The reason these bugs have been hidden is that most of the time there
is nothing on the pending work queue that will try to grab RTNL. But
flush_scheduled_work is still called with RTNL held, so an assert would
find this much earlier than waiting for someone to get lucky and
actually catch (and debug and report) a deadlock...

I don't see how asserting it in rtnl_lock would help anything, because
at that point we are about to deadlock anyway... (and this is probably
very rare, as mentioned above.)

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com
Re: [BUG] RTNL and flush_scheduled_work deadlocks
On Fri, Feb 16, 2007 at 12:23:05AM -0800, Ben Greear wrote:
> Jarek Poplawski wrote:
>> On Thu, Feb 15, 2007 at 11:40:32PM -0800, Ben Greear wrote:
>> ...
>>> Maybe there should be something like an ASSERT_NOT_RTNL() in the
>>> flush_scheduled_work() method?  If it's performance critical,
>>> #ifdef it out if we're not debugging locks?
>>
>> Yes! I thought about the same thing (at first). But in my opinion it
>> was not enough, so I thought about doing this in flush_workqueue.
>> But then I decided that was not enough either. Now I think something
>> like this should be done in rtnl_lock (under some debugging #if, of
>> course).
>
> The reason these bugs have been hidden is that most of the time there
> is nothing on the pending work queue that will try to grab RTNL. But
> flush_scheduled_work is still called with RTNL held, so an assert
> would find this much earlier than waiting for someone to get lucky
> and actually catch (and debug and report) a deadlock...

Yes, regarding this you are right - it should be better. But
cancel_rearming... is equally dangerous, so the question is where to
put the check: in every place where these are used, in workqueue.c, or
maybe make it semi-obligatory in networking and add some net wrappers,
e.g. net_flush_work_queue and net_cancel_rearming, with this
ASSERT_NOT_RTNL?

> I don't see how asserting it in rtnl_lock would help anything,
> because at that point we are about to deadlock anyway... (and this is
> probably very rare, as mentioned above.)

But it's happening now. Probably lockdep is not enough and rtnl_lock is
used in too many places, so I meant that some additional checks should
be possible. By "something like this" I meant a check for whether we
already hold the lock before trying to take it again. But maybe that is
really too much, and your proposal is sufficient.

Jarek P.
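A rough sketch of the networking wrappers Jarek floats above, under the
assumption that an ASSERT_NOT_RTNL() debug check exists as discussed
earlier in the thread; both the wrapper names and the assert are
hypothetical, not interfaces this kernel provides:

#include <linux/rtnetlink.h>
#include <linux/workqueue.h>

/* ASSERT_NOT_RTNL() is the hypothetical debug-only check discussed in
 * this thread, assumed to warn when the calling task holds RTNL. */

static inline void net_flush_scheduled_work(void)
{
	ASSERT_NOT_RTNL();	/* refuse to block on the workqueue under RTNL */
	flush_scheduled_work();
}

static inline void net_cancel_rearming_delayed_work(struct delayed_work *dwork)
{
	ASSERT_NOT_RTNL();	/* cancel_rearming_... can also wait for the work */
	cancel_rearming_delayed_work(dwork);
}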
Re: [BUG] RTNL and flush_scheduled_work deadlocks
On Fri, Feb 16, 2007 at 10:04:25AM +0100, Jarek Poplawski wrote:
> On Fri, Feb 16, 2007 at 12:23:05AM -0800, Ben Greear wrote:
> ...
>> I don't see how asserting it in rtnl_lock would help anything,
>> because at that point we are about to deadlock anyway... (and this
>> is probably very rare, as mentioned above.)
>
> But it's happening now. Probably lockdep is not enough and rtnl_lock
> is used in too many places, so I meant that some additional checks
> should be possible.

And of course this could already be done with DEBUG_MUTEXES or
DEBUG_SPINLOCK, so I went too far and it's really a bad idea.

Jarek P.
Re: [BUG] RTNL and flush_scheduled_work deadlocks
Jarek Poplawski wrote:
> On Fri, Feb 16, 2007 at 10:04:25AM +0100, Jarek Poplawski wrote:
>> On Fri, Feb 16, 2007 at 12:23:05AM -0800, Ben Greear wrote:
>> ...
>>> I don't see how asserting it in rtnl_lock would help anything,
>>> because at that point we are about to deadlock anyway... (and this
>>> is probably very rare, as mentioned above.)
>>
>> But it's happening now. Probably lockdep is not enough and rtnl_lock
>> is used in too many places, so I meant that some additional checks
>> should be possible.
>
> And of course this could already be done with DEBUG_MUTEXES or
> DEBUG_SPINLOCK, so I went too far and it's really a bad idea.

Well, I had lockdep and all of the locking debugging I could find
enabled, but it did not catch this problem. I had to use sysrq-t and
manually dig through the backtraces to find the deadlock. It may be
that lockdep could be enhanced to catch this sort of thing.

Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com
Re: [BUG] RTNL and flush_scheduled_work deadlocks
On Thu, 15 Feb 2007 23:40:32 -0800 Ben Greear [EMAIL PROTECTED] wrote:

> Jarek Poplawski wrote:
>> On 14-02-2007 22:27, Stephen Hemminger wrote:
>>> Ben found this, but the problem seems pretty widespread. The
>>> following places are subject to deadlock between
>>> flush_scheduled_work and the RTNL mutex. What can happen is that a
>>> work queue routine (like bridge port_carrier_check) is waiting
>>> forever for RTNL, and the driver routine has called
>>> flush_scheduled_work with RTNL held and is waiting for the work
>>> queue to clear. Several other places have comments like: "can't
>>> call flush_scheduled_work here or it will deadlock". Most of the
>>> problem places are in device close routines. My recommendation
>>> would be to add a check for netif_running in whatever work routine
>>> is used, and move the flush_scheduled_work to the remove routine.
>>>
>>> 8139too.c:      rtl8139_close -> rtl8139_stop_thread
>>> r8169.c:        rtl8169_down
>>> cassini.c:      cas_change_mtu
>>> iseries_veth.c: veth_stop_connection
>>> s2io.c:         s2io_close
>>> sis190.c:       sis190_down
>>
>> There is probably more than this...
>
> Maybe there should be something like an ASSERT_NOT_RTNL() in the
> flush_scheduled_work() method?  If it's performance critical, #ifdef
> it out if we're not debugging locks?

You can't safely add a check like that. What if another CPU had
acquired RTNL and was unrelated?

--
Stephen Hemminger [EMAIL PROTECTED]
Re: [BUG] RTNL and flush_scheduled_work deadlocks
Stephen Hemminger wrote:
> On Thu, 15 Feb 2007 23:40:32 -0800 Ben Greear [EMAIL PROTECTED] wrote:
>> Maybe there should be something like an ASSERT_NOT_RTNL() in the
>> flush_scheduled_work() method?  If it's performance critical, #ifdef
>> it out if we're not debugging locks?
>
> You can't safely add a check like that. What if another CPU had
> acquired RTNL and was unrelated?

I guess there isn't a way to see if *this* thread is currently the
owner of the RTNL? I think lockdep knows the owner... maybe we could
query it somehow, or just save the owner in the mutex object when
debugging is enabled...

Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com
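A minimal sketch of the owner-based check Ben describes, assuming a
debug-only record of the mutex owner (as mutex debugging keeps) and an
accessor for the otherwise static rtnl_mutex; rtnl_mutex_debug_owner()
and ASSERT_NOT_RTNL() are names made up for illustration, not existing
kernel API. Unlike a plain "is RTNL locked" test, this only fires when
the current task is the owner, which answers Stephen's objection about
an unrelated CPU holding RTNL:

#include <linux/mutex.h>
#include <linux/sched.h>

/* Hypothetical accessor: returns the task recorded as holding the RTNL
 * mutex while mutex debugging is enabled, or NULL when it is free.
 * rtnl_mutex is static in net/core/rtnetlink.c, so a real version
 * would have to live there. */
struct task_struct *rtnl_mutex_debug_owner(void);

#ifdef CONFIG_DEBUG_MUTEXES
#define ASSERT_NOT_RTNL() \
	WARN_ON(rtnl_mutex_debug_owner() == current)
#else
#define ASSERT_NOT_RTNL() do { } while (0)
#endif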
Re: [BUG] RTNL and flush_scheduled_work deadlocks
Francois Romieu wrote:
> Ben Greear [EMAIL PROTECTED] :
> [...]
>> I seem to be able to trigger this within about 1 minute on a
>> particular 2.6.18.2 system with some 8139too devices, so if someone
>> has a patch that could be tested, I'll gladly test it. For whatever
>> reason, I haven't hit this problem on 2.6.20 yet, but that could
>> easily be dumb luck, and I haven't been running .20 very much.
>
> Bandaid below. It is not complete if your device hits the tx_watchdog
> hard but it should help.

So far I've been running several hours without problems, so this does
appear to be helping.

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com
Re: [BUG] RTNL and flush_scheduled_work deadlocks
On 14-02-2007 22:27, Stephen Hemminger wrote:
> Ben found this, but the problem seems pretty widespread. The
> following places are subject to deadlock between flush_scheduled_work
> and the RTNL mutex. What can happen is that a work queue routine
> (like bridge port_carrier_check) is waiting forever for RTNL, and the
> driver routine has called flush_scheduled_work with RTNL held and is
> waiting for the work queue to clear. Several other places have
> comments like: "can't call flush_scheduled_work here or it will
> deadlock". Most of the problem places are in device close routines.
> My recommendation would be to add a check for netif_running in
> whatever work routine is used, and move the flush_scheduled_work to
> the remove routine.
>
> 8139too.c:      rtl8139_close -> rtl8139_stop_thread
> r8169.c:        rtl8169_down
> cassini.c:      cas_change_mtu
> iseries_veth.c: veth_stop_connection
> s2io.c:         s2io_close
> sis190.c:       sis190_down

There is probably more than this... I think the same problem exists
with cancel_rearming_delayed_work, plus indirect calls to these
functions, e.g. via ieee80211softmac_stop. I found these dangerous
places (probably not all):

cxgb3/cxgb3_main.c (cxgb_close -> cxgb_down),
macb.c (macb_close),
skge.c (skge_down),
wireless/bcm43xx/bcm43xx_main.c (bcm_net_stop, both ieee80211... and flush_...),
wireless/zd1211rw/zd_mac.c (zd_mac_stop -> housekeeping_disable),
chelsio/my3126.c (t1_interrupts_disable -> my3126_interrupt_disable), /* not sure */
drivers/usb/net/kaweth.c (kaweth_close -> kaweth_kill_urbs)

Regards,
Jarek P.
Re: [BUG] RTNL and flush_scheduled_work deadlocks
Jarek Poplawski wrote:
> On 14-02-2007 22:27, Stephen Hemminger wrote:
>> Ben found this, but the problem seems pretty widespread. The
>> following places are subject to deadlock between
>> flush_scheduled_work and the RTNL mutex. What can happen is that a
>> work queue routine (like bridge port_carrier_check) is waiting
>> forever for RTNL, and the driver routine has called
>> flush_scheduled_work with RTNL held and is waiting for the work
>> queue to clear. Several other places have comments like: "can't
>> call flush_scheduled_work here or it will deadlock". Most of the
>> problem places are in device close routines. My recommendation
>> would be to add a check for netif_running in whatever work routine
>> is used, and move the flush_scheduled_work to the remove routine.
>>
>> 8139too.c:      rtl8139_close -> rtl8139_stop_thread
>> r8169.c:        rtl8169_down
>> cassini.c:      cas_change_mtu
>> iseries_veth.c: veth_stop_connection
>> s2io.c:         s2io_close
>> sis190.c:       sis190_down
>
> There is probably more than this...

Maybe there should be something like an ASSERT_NOT_RTNL() in the
flush_scheduled_work() method?  If it's performance critical, #ifdef
it out if we're not debugging locks?

Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com
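For reference, a sketch of the naive form of this assert, assuming an
"is RTNL locked" helper (rtnl_is_locked() is used here as an
assumption, not necessarily available in this kernel); as Stephen
points out elsewhere in the thread, this version can trigger when an
unrelated CPU happens to hold RTNL, which is why an owner-based test
was discussed instead:

#include <linux/rtnetlink.h>
#include <linux/workqueue.h>

#ifdef CONFIG_DEBUG_KERNEL
#define ASSERT_NOT_RTNL()	WARN_ON(rtnl_is_locked())	/* assumed helper */
#else
#define ASSERT_NOT_RTNL()	do { } while (0)
#endif

/* hypothetical wrapper showing where the assert would sit */
void flush_scheduled_work_checked(void)
{
	ASSERT_NOT_RTNL();	/* catch "flush with RTNL held" before it hangs */
	flush_scheduled_work();
}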
[BUG] RTNL and flush_scheduled_work deadlocks
Ben found this, but the problem seems pretty widespread. The following
places are subject to deadlock between flush_scheduled_work and the
RTNL mutex. What can happen is that a work queue routine (like bridge
port_carrier_check) is waiting forever for RTNL, and the driver routine
has called flush_scheduled_work with RTNL held and is waiting for the
work queue to clear. Several other places have comments like: "can't
call flush_scheduled_work here or it will deadlock". Most of the
problem places are in device close routines.

My recommendation would be to add a check for netif_running in whatever
work routine is used, and move the flush_scheduled_work to the remove
routine.

8139too.c:      rtl8139_close -> rtl8139_stop_thread
r8169.c:        rtl8169_down
cassini.c:      cas_change_mtu
iseries_veth.c: veth_stop_connection
s2io.c:         s2io_close
sis190.c:       sis190_down
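A hedged illustration of the recommendation above, using a made-up
"example" driver; all names here are assumptions chosen for the sketch,
but the shape follows the text: the work routine gives up when the
device is no longer running, and flush_scheduled_work() moves out of
the close path (which runs under RTNL) into the remove path (which does
not hold RTNL):

#include <linux/netdevice.h>
#include <linux/pci.h>
#include <linux/rtnetlink.h>
#include <linux/workqueue.h>

struct example_priv {
	struct net_device *dev;
	struct work_struct reset_task;
};

static void example_reset_task(struct work_struct *work)
{
	struct example_priv *priv =
		container_of(work, struct example_priv, reset_task);

	/* per the recommendation: bail out once the device is closed */
	if (!netif_running(priv->dev))
		return;

	rtnl_lock();
	/* ... reprogram the hardware under RTNL ... */
	rtnl_unlock();
}

static int example_close(struct net_device *dev)
{
	/*
	 * Runs with RTNL held.  No flush_scheduled_work() here: it could
	 * wait forever for a work routine that is itself blocked on RTNL.
	 */
	netif_stop_queue(dev);
	return 0;
}

static void example_remove(struct pci_dev *pdev)
{
	struct net_device *dev = pci_get_drvdata(pdev);

	unregister_netdev(dev);
	flush_scheduled_work();		/* safe: this path does not hold RTNL */
	free_netdev(dev);
}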
Re: [BUG] RTNL and flush_scheduled_work deadlocks
Stephen Hemminger wrote:
> Ben found this, but the problem seems pretty widespread. The
> following places are subject to deadlock between flush_scheduled_work
> and the RTNL mutex. What can happen is that a work queue routine
> (like bridge port_carrier_check) is waiting forever for RTNL, and the
> driver routine has called flush_scheduled_work with RTNL held and is
> waiting for the work queue to clear. Several other places have
> comments like: "can't call flush_scheduled_work here or it will
> deadlock". Most of the problem places are in device close routines.
> My recommendation would be to add a check for netif_running in
> whatever work routine is used, and move the flush_scheduled_work to
> the remove routine.

I seem to be able to trigger this within about 1 minute on a particular
2.6.18.2 system with some 8139too devices, so if someone has a patch
that could be tested, I'll gladly test it. For whatever reason, I
haven't hit this problem on 2.6.20 yet, but that could easily be dumb
luck, and I haven't been running .20 very much.

To add to the list below, tg3 has this problem as well, as far as I can
tell by looking at the code.

Thanks,
Ben

> 8139too.c:      rtl8139_close -> rtl8139_stop_thread
> r8169.c:        rtl8169_down
> cassini.c:      cas_change_mtu
> iseries_veth.c: veth_stop_connection
> s2io.c:         s2io_close
> sis190.c:       sis190_down

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com
Re: [BUG] RTNL and flush_scheduled_work deadlocks
Ben Greear [EMAIL PROTECTED] :
[...]
> I seem to be able to trigger this within about 1 minute on a
> particular 2.6.18.2 system with some 8139too devices, so if someone
> has a patch that could be tested, I'll gladly test it. For whatever
> reason, I haven't hit this problem on 2.6.20 yet, but that could
> easily be dumb luck, and I haven't been running .20 very much.

Bandaid below. It is not complete if your device hits the tx_watchdog
hard but it should help.

I'll return in 24h.

diff --git a/drivers/net/8139too.c b/drivers/net/8139too.c
index 35ad5cf..cbee350 100644
--- a/drivers/net/8139too.c
+++ b/drivers/net/8139too.c
@@ -1603,18 +1603,20 @@ static void rtl8139_thread (struct work_struct *work)
 	struct net_device *dev = tp->mii.dev;
 	unsigned long thr_delay = next_tick;
 
+	rtnl_lock();
+
+	if (!netif_running(dev))
+		goto unlock;
+
 	if (tp->watchdog_fired) {
 		tp->watchdog_fired = 0;
 		rtl8139_tx_timeout_task(work);
-	} else if (rtnl_trylock()) {
-		rtl8139_thread_iter (dev, tp, tp->mmio_addr);
-		rtnl_unlock ();
-	} else {
-		/* unlikely race. mitigate with fast poll. */
-		thr_delay = HZ / 2;
-	}
+	} else
+		rtl8139_thread_iter(dev, tp, tp->mmio_addr);
 
 	schedule_delayed_work(&tp->thread, thr_delay);
+unlock:
+	rtnl_unlock ();
 }
 
 static void rtl8139_start_thread(struct rtl8139_private *tp)
@@ -1626,19 +1628,11 @@ static void rtl8139_start_thread(struct rtl8139_private *tp)
 		return;
 
 	tp->have_thread = 1;
+	tp->watchdog_fired = 0;
 
 	schedule_delayed_work(&tp->thread, next_tick);
 }
 
-static void rtl8139_stop_thread(struct rtl8139_private *tp)
-{
-	if (tp->have_thread) {
-		cancel_rearming_delayed_work(&tp->thread);
-		tp->have_thread = 0;
-	} else
-		flush_scheduled_work();
-}
-
 static inline void rtl8139_tx_clear (struct rtl8139_private *tp)
 {
 	tp->cur_tx = 0;
@@ -2233,8 +2227,6 @@ static int rtl8139_close (struct net_device *dev)
 
 	netif_stop_queue (dev);
 
-	rtl8139_stop_thread(tp);
-
 	if (netif_msg_ifdown(tp))
 		printk(KERN_DEBUG "%s: Shutting down ethercard, status was 0x%4.4x.\n",
 			dev->name, RTL_R16 (IntrStatus));