On Tue, Aug 12, 2014 at 4:38 PM, Doug Ledford <dledf...@redhat.com> wrote:
> Our mcast_dev_flush routine and our mcast_restart_task can race against
> each other.  In particular, they both hold the priv->lock while
> manipulating the rbtree and while removing mcast entries from the
> multicast_list and while adding entries to the remove_list, but they
> also both drop their locks prior to doing the actual removes.  The
> mcast_dev_flush routine is run entirely under the rtnl lock and so has
> at least some locking.  The actual race condition is like this:
>
> Thread 1                                Thread 2
> ifconfig ib0 up
>   start multicast join for broadcast
>   multicast join completes for broadcast
>   start to add more multicast joins
>     call mcast_restart_task to add new entries
>                                         ifconfig ib0 down
>                                           mcast_dev_flush
>                                             mcast_leave(mcast A)
>     mcast_leave(mcast A)
>
> As mcast_leave calls ib_sa_multicast_leave, and as member in
> core/multicast.c is ref counted, we run into an unbalanced refcount
> issue.  To avoid stomping on each others removes, take the rtnl lock
> specifically when we are deleting the entries from the remove list.

Isn't "test_and_clear_bit()" atomic so it is unlikely that
ib_sa_free_multicast() can run multiple times  ?

    638 static int ipoib_mcast_leave(struct net_device *dev, struct
ipoib_mcast         *mcast)
    639 {
    640         struct ipoib_dev_priv *priv = netdev_priv(dev);
    641         int ret = 0;
    642
    643         if (test_and_clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
    644                 ib_sa_free_multicast(mcast->mc);
    645
    646         if (test_and_clear_bit(IPOIB_MCAST_FLAG_ATTACHED,
&mcast->flags)        ) {


-- Wendy

>
> Signed-off-by: Doug Ledford <dledf...@redhat.com>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 37 
> ++++++++++++++++++++++----
>  1 file changed, 32 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 
> b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> index f5e8da530d9..19e3fe75ebf 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> @@ -810,7 +810,10 @@ void ipoib_mcast_dev_flush(struct net_device *dev)
>
>         spin_unlock_irqrestore(&priv->lock, flags);
>
> -       /* seperate between the wait to the leave*/
> +       /*
> +        * make sure the in-flight joins have finished before we attempt
> +        * to leave
> +        */
>         list_for_each_entry_safe(mcast, tmcast, &remove_list, list)
>                 if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
>                         wait_for_completion(&mcast->done);
> @@ -931,14 +934,38 @@ void ipoib_mcast_restart_task(struct work_struct *work)
>         netif_addr_unlock(dev);
>         local_irq_restore(flags);
>
> -       /* We have to cancel outside of the spinlock */
> +       /*
> +        * make sure the in-flight joins have finished before we attempt
> +        * to leave
> +        */
> +       list_for_each_entry_safe(mcast, tmcast, &remove_list, list)
> +               if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
> +                       wait_for_completion(&mcast->done);
> +
> +       /*
> +        * We have to cancel outside of the spinlock, but we have to
> +        * take the rtnl lock or else we race with the removal of
> +        * entries from the remove list in mcast_dev_flush as part
> +        * of ipoib_stop() which will call mcast_stop_thread with
> +        * flush == 1 while holding the rtnl lock, and the
> +        * flush_workqueue won't complete until this restart_mcast_task
> +        * completes.  So do like the carrier on task and attempt to
> +        * take the rtnl lock, but if we can't before the ADMIN_UP flag
> +        * goes away, then just return and know that the remove list will
> +        * get flushed later by mcast_dev_flush.
> +        */
> +       while (!rtnl_trylock()) {
> +               if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
> +                       return;
> +               else
> +                       msleep(20);
> +       }
>         list_for_each_entry_safe(mcast, tmcast, &remove_list, list) {
>                 ipoib_mcast_leave(mcast->dev, mcast);
>                 ipoib_mcast_free(mcast);
>         }
> -
> -       if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
> -               ipoib_mcast_start_thread(dev);
> +       ipoib_mcast_start_thread(dev);
> +       rtnl_unlock();
>  }
>
>  #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
> --
> 1.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to