On Sun, Mar 29, 2009 at 12:07:09PM +0300, Moni Shoua wrote: > Eli Cohen wrote: > > On Wed, Mar 25, 2009 at 09:20:19PM +0200, Yossi Etigin wrote: > >> Wouldn't this patch break the fast bond+SM failover recovery? > >> > >> Up till now, in such case neigh->dgid was the old gid until path record > >> query was successful, and that caused to retry the path query until it's > >> successful. With that patch, a new path query will not be issued until > >> the next arp refresh because neigh->dgid will be set to a valid value > >> right after the first packet. > > Looks to me like > > > >> How about keeping the memcpy in path_rec-completion, and also adding it > >> to ib_mcast_send (where ipoib neigh is created)? > >> > > I did not like the idea that a unicast ipoib_neigh can be created and > > destroyed till the path is resolved so I preferred to initialize at > > creation time. How about the following patch to solve the problem of > > failure to resolve a path: > > > > > >>From 4e6e3dd0186803f7547f5e4c233d7525dfcdaec6 Mon Sep 17 00:00:00 2001 > > From: Eli Cohen <e...@mellanox.co.il> > > Date: Wed, 25 Mar 2009 16:22:28 +0200 > > Subject: [PATCH] IB/ipoib: set neigh->dgid upon ipoib_neigh creation > > > > If we fail to do that, there can be superfluous calls to > > ipoib_neigh_free()/ipoib_neigh_alloc() due to the following if statement in > > ipoib_start_xmit(): > > > > if (unlikely((memcmp(&neigh->dgid.raw, > > skb->dst->neighbour->ha + 4, > > sizeof(union ib_gid))) || > > > > In the case of a unicast GID, it can happen until the path is resolved and > > is > > not so severe. In the case of a multicast address, neigh->dgid is never even > > updated, so the corresponding ipoib_neigh will be destroyed and created on > > every multicast packet sent. > > > > Signed-off-by: Eli Cohen <e...@mellanox.co.il> > > --- > > drivers/infiniband/ulp/ipoib/ipoib_main.c | 7 ++++--- > > 1 files changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c > > b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > index 0bd2a4f..b680483 100644 > > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > @@ -460,8 +460,6 @@ static void path_rec_completion(int status, > > } > > kref_get(&path->ah->ref); > > neigh->ah = path->ah; > > - memcpy(&neigh->dgid.raw, &path->pathrec.dgid.raw, > > - sizeof(union ib_gid)); > > > > if (ipoib_cm_enabled(dev, neigh->neighbour)) { > > if (!ipoib_cm_get(neigh)) > > @@ -481,7 +479,9 @@ static void path_rec_completion(int status, > > __skb_queue_tail(&skqueue, skb); > > } > > path->valid = 1; > > - } > > + } else > > + list_for_each_entry_safe(neigh, tn, &path->neigh_list, list) > > + memset(&neigh->dgid, 0, sizeof neigh->dgid); > > > > path->query = NULL; > > complete(&path->done); > > @@ -878,6 +878,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour > > *neighbour, > > neigh->neighbour = neighbour; > > neigh->dev = dev; > > *to_ipoib_neigh(neighbour) = neigh; > > + memcpy(neigh->dgid.raw, neighbour->ha + 4, 16); > > skb_queue_head_init(&neigh->queue); > > ipoib_cm_set(neigh, NULL); > > > > This might work but still, I don't understand what this patch fixes. What was > the initial problem?
The problem was that for multicast sends, the corresponding ipoib_neigh would be created and destroyed for every packet because this if will fail: if (unlikely((memcmp(&neigh->dgid.raw, skb->dst->neighbour->ha + 4, sizeof(union ib_gid))) || There is a less severe problem with unicast flow where the ipoib_neigh object might be created and destroyed until the path is resolved. > I agree with Yossi's early comments that the patch breaks the fast fail over > we've worked on in the past. _______________________________________________ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg