On Fri, May 22, 2009 at 01:24:17PM +0300, Yossi Etigin wrote:
> ...
> So, ipoib tries to list_del(neigh) twice because the second time
> the condition (neigh != NULL) is not protected with a lock.
> How about this:
> 
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
> b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> index ab2c192..993b5a7 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -845,23 +845,24 @@ static void ipoib_neigh_cleanup(struct neighbour *n)
>       unsigned long flags;
>       struct ipoib_ah *ah = NULL;
>  
> +     spin_lock_irqsave(&priv->lock, flags); <-- deadlock here
> +
>       neigh = *to_ipoib_neigh(n);
>       if (neigh)
>               priv = netdev_priv(neigh->dev);
>       else
> -             return;
> +             goto out;
>       ipoib_dbg(priv,
>                 "neigh_cleanup for %06x %pI6\n",
>                 IPOIB_QPN(n->ha),
>                 n->ha + 4);
>  
> -     spin_lock_irqsave(&priv->lock, flags);
> -
>       if (neigh->ah)
>               ah = neigh->ah;
>       list_del(&neigh->list);
>       ipoib_neigh_free(n->dev, neigh);
>  
> +out:
>       spin_unlock_irqrestore(&priv->lock, flags);
>  
>       if (ah)
> 

This is essentially what I did first time around, but a deadlock on 
the line marked above was quickly found. 

Instead what we've been doing is:


--- e/ofa_kernel-1.3.1/drivers/infiniband/ulp/ipoib/ipoib_main.c        
2008-06-06 12:04:20.791744390 -0700
+++ f/ofa_kernel-1.3.1/drivers/infiniband/ulp/ipoib/ipoib_main.c        
2008-06-06 12:10:14.129143660 -0700
@@ -835,11 +835,14 @@ static void ipoib_neigh_cleanup(struct n
                IPOIB_GID_RAW_ARG(n->ha + 4));

        spin_lock_irqsave(&priv->lock, flags);
-
-       if (neigh->ah)
-       ah = neigh->ah;
-               list_del(&neigh->list);
-       ipoib_neigh_free(n->dev, neigh);
+
+       neigh = *to_ipoib_neigh(n);
+       if (neigh) {
+               if (neigh->ah)
+                       ah = neigh->ah;
+               list_del(&neigh->list);
+               ipoib_neigh_free(n->dev, neigh);
+       }

        spin_unlock_irqrestore(&priv->lock, flags);


This has worked in practice, but it obviously leaves a small hole 
open.

-- 
Arthur

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to