Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Also should we count dropped packets here?

Right. And since it seems that we can't get by with just one bit,
here's the original patch again, with dropped packet counter fixed.


---

Fix the following race scenario:
Device is up.
Port event or set mcast list triggers ipoib_mcast_stop_thread,
this cancels the query and waits on mcast "done" completion.
Completion is called and "done" is set.
Meanwhile, ipoib_mcast_send arrives and starts a new query,
re-initializing "done".

Further, there's an additional issue that I saw in testing:
ipoib_mcast_send may get called when priv->broadcast is NULL
(e.g. if the device was downed and then upped internally because
of a port event).
If this happens and the sendonly join request gets completed before
priv->broadcast is set, we get an oops.

----
Do not send multicasts if mcast thread is stopped or if
priv->broadcast is not set.
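
For illustration only, here is a minimal user-space sketch of that rule.
It is not driver code: struct model_priv and the model_*() helpers are
hypothetical stand-ins, a pthread mutex plays the role of priv->lock, and
a plain bool plays the role of the IPOIB_MCAST_STARTED bit. The point it
tries to show is that the "started" state and priv->broadcast are both
tested under the same lock that the stop path takes to clear the state,
so a send that arrives after stop is dropped and counted rather than
starting a new query.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct ipoib_dev_priv. */
struct model_priv {
	pthread_mutex_t lock;     /* plays the role of priv->lock */
	bool mcast_started;       /* plays the role of IPOIB_MCAST_STARTED */
	const void *broadcast;    /* plays the role of priv->broadcast */
	unsigned long tx_dropped; /* plays the role of priv->stats.tx_dropped */
	unsigned long tx_queued;  /* packets that got past the guard */
};

static void model_init(struct model_priv *p)
{
	assert(p != NULL);
	pthread_mutex_init(&p->lock, NULL);
	p->mcast_started = false;
	p->broadcast = NULL;
	p->tx_dropped = 0;
	p->tx_queued = 0;
}

/* Start path: mark the multicast machinery as started, under the lock. */
static void model_start(struct model_priv *p, const void *broadcast)
{
	pthread_mutex_lock(&p->lock);
	p->broadcast = broadcast;
	p->mcast_started = true;
	pthread_mutex_unlock(&p->lock);
}

/* Stop path: clear the state first, so later sends are refused. */
static void model_stop(struct model_priv *p)
{
	pthread_mutex_lock(&p->lock);
	p->mcast_started = false;
	pthread_mutex_unlock(&p->lock);
}

/*
 * Send path: returns true if the packet may proceed, false if it was
 * dropped. Both conditions are tested while holding the lock.
 */
static bool model_send(struct model_priv *p)
{
	bool ok;

	pthread_mutex_lock(&p->lock);
	ok = p->mcast_started && p->broadcast != NULL;
	if (ok)
		++p->tx_queued;
	else
		++p->tx_dropped; /* count the drop, as the patch does */
	pthread_mutex_unlock(&p->lock);

	return ok;
}
```

In this model a send is refused in three cases: before start, after start
with a NULL broadcast pointer, and after stop. Each refusal increments the
drop counter.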

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>

Index: linux-2.6.15/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===================================================================
--- linux-2.6.15.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c    2006-01-23 21:24:10.000000000 +0200
+++ linux-2.6.15/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2006-01-23 21:25:19.000000000 +0200
@@ -600,6 +600,10 @@ int ipoib_mcast_start_thread(struct net_
                queue_work(ipoib_workqueue, &priv->mcast_task);
        mutex_unlock(&mcast_mutex);
 
+       spin_lock_irq(&priv->lock);
+       set_bit(IPOIB_MCAST_STARTED, &priv->flags);
+       spin_unlock_irq(&priv->lock);
+
        return 0;
 }
 
@@ -610,6 +614,10 @@ int ipoib_mcast_stop_thread(struct net_d
 
        ipoib_dbg_mcast(priv, "stopping multicast thread\n");
 
+       spin_lock_irq(&priv->lock);
+       clear_bit(IPOIB_MCAST_STARTED, &priv->flags);
+       spin_unlock_irq(&priv->lock);
+
        mutex_lock(&mcast_mutex);
        clear_bit(IPOIB_MCAST_RUN, &priv->flags);
        cancel_delayed_work(&priv->mcast_task);
@@ -692,6 +700,12 @@ void ipoib_mcast_send(struct net_device 
         */
        spin_lock(&priv->lock);
 
+       if (!test_bit(IPOIB_MCAST_STARTED, &priv->flags) || !priv->broadcast) {
+               ++priv->stats.tx_dropped;
+               dev_kfree_skb_any(skb);
+               goto unlock;
+       }
+
        mcast = __ipoib_mcast_find(dev, mgid);
        if (!mcast) {
                /* Let's create a new send only group now */
@@ -753,6 +767,7 @@ out:
                ipoib_send(dev, skb, mcast->ah, IB_MULTICAST_QPN);
        }
 
+unlock:
        spin_unlock(&priv->lock);
 }
 
Index: linux-2.6.15/drivers/infiniband/ulp/ipoib/ipoib.h
===================================================================
--- linux-2.6.15.orig/drivers/infiniband/ulp/ipoib/ipoib.h      2006-01-23 21:24:10.000000000 +0200
+++ linux-2.6.15/drivers/infiniband/ulp/ipoib/ipoib.h   2006-01-23 21:24:46.000000000 +0200
@@ -85,6 +85,7 @@ enum {
        IPOIB_FLAG_SUBINTERFACE   = 4,
        IPOIB_MCAST_RUN           = 5,
        IPOIB_STOP_REAPER         = 6,
+       IPOIB_MCAST_STARTED       = 7,
 
        IPOIB_MAX_BACKOFF_SECONDS = 16,
 

-- 
MST
_______________________________________________
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
