We are seeing a number of softlockups occurring with HTB upon removing
the qdisc. We are still attempting to repro the exact circumstances,
however looking at the code I'm very suspicious of this block in
net_tx_action and its interaction with dev_deactivate (called through
tc_modify_qdisc):

       if (!test_bit(__QDISC_STATE_DEACTIVATED,
                     &q->state)) {
               __netif_reschedule(q);
       } else {
               smp_mb__before_atomic();
               clear_bit(__QDISC_STATE_SCHED,
                         &q->state);
       }

I think the following scenario could lead to badness:

0) net_tx_action spin_trylock fails, taking non-locked block
1) net_tx_action checks for __QDISC_STATE_DEACTIVATED, it's not set
   at this point
2) dev_deactive has lock and sets __QDISC_STATE_DEACTIVATED
3) dev_deactivate_many performs some_qdisc_is_busy(dev), neither
   __QDISC_STATE_SCHED nor __QDISC_STATE_BUSY are set at this
  point, so some_qdisc_busy fails (not seen as busy)
4) net_tx_action sets __QDISC_STATE_SCHED

At this point dev_deactivate_many finishes so the qdisc may
be freed in the tc_modify_qdisc path, however the qdisc is
also "successfully" rescheduled to run by net_tx_action.

The propsed fix for this is to eliminate the spin_trylock in
net_tx_action and always take the lock.

Signed-off-by: Tom Herbert <t...@herbertland.com>
---
 net/core/dev.c | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index edb7179..77ec0c1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3855,22 +3855,14 @@ static void net_tx_action(struct softirq_action *h)
                        head = head->next_sched;
 
                        root_lock = qdisc_lock(q);
-                       if (spin_trylock(root_lock)) {
-                               smp_mb__before_atomic();
-                               clear_bit(__QDISC_STATE_SCHED,
-                                         &q->state);
-                               qdisc_run(q);
-                               spin_unlock(root_lock);
-                       } else {
-                               if (!test_bit(__QDISC_STATE_DEACTIVATED,
-                                             &q->state)) {
-                                       __netif_reschedule(q);
-                               } else {
-                                       smp_mb__before_atomic();
-                                       clear_bit(__QDISC_STATE_SCHED,
-                                                 &q->state);
-                               }
-                       }
+                       spin_lock(root_lock);
+
+                       smp_mb__before_atomic();
+                       clear_bit(__QDISC_STATE_SCHED,
+                                 &q->state);
+                       qdisc_run(q);
+
+                       spin_unlock(root_lock);
                }
        }
 }
-- 
2.6.5

Reply via email to