Re: [PATCH FIX for-3.19] IB/ipoib: Fix failed multicast joins/sends

2015-01-14 Thread Erez Shitrit

On 1/14/2015 6:09 PM, Doug Ledford wrote:

On Wed, 2015-01-14 at 18:02 +0200, Erez Shitrit wrote:

Hi Doug,

Perhaps I am missing something here, but ping6 still doesn't work for me
in many cases.

I think the reason is that your original patch does the following in
ipoib_mcast_join_task():
	if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
		ipoib_mcast_sendonly_join(mcast);
	else
		ipoib_mcast_join(dev, mcast, 1);
	return;
The flow for sendonly_join doesn't include handling the mc_task, so only
the first mc in the list (if it is a sendonly mcg) will be sent, and no
further mcgs in the ipoib mc list will be sent. (See how this is handled
in the ipoib_mcast_join flow.)

Yes, I know what you are talking about.  However, my patches did not add
this bug, it was present in the original code.  Please check a plain
v3.18 kernel, which does not have my patches, and you will see that
ipoib_mcast_sendonly_join_complete also fails to restart the mcast join
thread there as well.

Agreed.
But in 3.18 there was no call from the mc_task to sendonly_join, only to
the full-member join, so there was no need at that point to handle the
task. (The sendonly join was triggered on demand, whenever the kernel
sent a new packet to the mcg.)

Only in 3.19 is the sendonly join called explicitly from the mc_task.
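
For context, a rough sketch of that pre-3.19 on-demand path, reconstructed
from the v3.18 ipoib_mcast_send() flow. The function and flag names match
the upstream code, but treat the exact body as an approximation rather
than a verbatim quote:

	/* Approximate v3.18 flow in ipoib_mcast_send(): no address vector
	 * yet, so queue the packet and, if this is a sendonly group with
	 * no join already in flight, start the sendonly join directly
	 * from the send path.  Nothing here depends on the mcast task,
	 * so there was nothing to requeue on completion.
	 */
	if (!mcast->ah) {
		if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
			skb_queue_tail(&mcast->pkt_queue, skb);
		else
			dev_kfree_skb_any(skb);

		if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
			ipoib_dbg_mcast(priv, "multicast join already started\n");
		else if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
			ipoib_mcast_sendonly_join(mcast);
	}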



I can demonstrate it with the ipoib log:
I am trying to ping6 fe80::202:c903:9f:3b0a via ib0

The log is:
ib0: restarting multicast task
ib0: setting up send only multicast group for
ff12:601b:ffff:0000:0000:0000:0000:0016
ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0001:ff43:3bf1
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff88081afb5f40,
LID 0xc015, SL 0
ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081e1c42c0,
LID 0xc014, SL 0
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
ib0: setting up send only multicast group for
ff12:601b:ffff:0000:0000:0000:0000:0002
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
ib0: setting up send only multicast group for
ff12:601b:ffff:0000:0000:0001:ff9f:3b0a
<-- here you can see that the IPv6 address is added and queued to the list
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
<-- the IPv6 mcg will not be sent because it comes after some other
sendonly group, and nothing in that flow re-queues the mc_task.

This is a problem with the design of the original mcast task thread.
I'm looking at a fix now.  Currently the design only allows one join to
be outstanding at a time.  Is there a reason for that that I'm not aware
of?  Some historical context that I don't know about?
IMHO, the reason for only one mc in flight at a time was to make our
life easier; otherwise there are locks to take and manage, races between
multiple responses, etc. Also, the multicast module in the core
serializes all the requests anyway.
Perhaps you can reuse the relevant code from the full-member join in the
sendonly join in order to handle the mc_task (sketched below), or move
the sendonly call back to mcast_send instead of the mc_task.
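
A minimal sketch of that first option, assuming the 3.19-rc structure of
the completion handlers and Doug's per-device priv->wq from this series.
The requeue at the end mirrors what the full-member
ipoib_mcast_join_complete() already does; ipoib_mcast_sendonly_drain() is
a hypothetical stand-in for the existing error-path cleanup:

static int ipoib_mcast_sendonly_join_complete(int status,
					      struct ib_sa_multicast *multicast)
{
	struct ipoib_mcast *mcast = multicast->context;
	struct net_device *dev = mcast->dev;
	struct ipoib_dev_priv *priv = netdev_priv(dev);

	if (!status)
		status = ipoib_mcast_join_finish(mcast, &multicast->rec);
	else
		ipoib_mcast_sendonly_drain(mcast);	/* hypothetical helper */

	clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);

	/* The missing piece under discussion: requeue the task so the
	 * next entry in priv->multicast_list gets its turn, exactly as
	 * the full-member completion path does.
	 */
	mutex_lock(&mcast_mutex);
	if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
	mutex_unlock(&mcast_mutex);

	return status;
}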








Re: [PATCH FIX for-3.19] IB/ipoib: Fix failed multicast joins/sends

2015-01-14 Thread Doug Ledford
On Wed, 2015-01-14 at 20:47 +0200, Erez Shitrit wrote:
 On 1/14/2015 6:09 PM, Doug Ledford wrote:
  On Wed, 2015-01-14 at 18:02 +0200, Erez Shitrit wrote:
  Hi Doug,
 
  Perhaps I am missing something here, but ping6 still doesn't work for me
  in many cases.
 
  I think the reason is that your original patch does the following in
  ipoib_mcast_join_task():
  	if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
  		ipoib_mcast_sendonly_join(mcast);
  	else
  		ipoib_mcast_join(dev, mcast, 1);
  	return;
  The flow for sendonly_join doesn't include handling the mc_task, so only
  the first mc in the list (if it is a sendonly mcg) will be sent, and no
  further mcgs in the ipoib mc list will be sent. (See how this is handled
  in the ipoib_mcast_join flow.)
  Yes, I know what you are talking about.  However, my patches did not add
  this bug, it was present in the original code.  Please check a plain
  v3.18 kernel, which does not have my patches, and you will see that
  ipoib_mcast_sendonly_join_complete also fails to restart the mcast join
  thread there as well.
 Agreed.
 But in 3.18 there was no call from the mc_task to sendonly_join, only to
 the full-member join, so there was no need at that point to handle the
 task. (The sendonly join was triggered on demand, whenever the kernel
 sent a new packet to the mcg.)
 Only in 3.19 is the sendonly join called explicitly from the mc_task.

I just sent a patch set that fixes this.

 
  I can demonstrate it with the ipoib log:
  I am trying to ping6 fe80::202:c903:9f:3b0a via ib0
 
  The log is:
  ib0: restarting multicast task
  ib0: setting up send only multicast group for
  ff12:601b:ffff:0000:0000:0000:0000:0016
  ib0: adding multicast entry for mgid
  ff12:601b:ffff:0000:0000:0001:ff43:3bf1
  ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
  starting sendonly join
  ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
  ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff88081afb5f40,
  LID 0xc015, SL 0
  ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
  ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081e1c42c0,
  LID 0xc014, SL 0
  ib0: sendonly multicast join failed for
  ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
  ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
  starting sendonly join
  ib0: sendonly multicast join failed for
  ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
  ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
  starting sendonly join
  ib0: sendonly multicast join failed for
  ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
  ib0: setting up send only multicast group for
  ff12:601b:ffff:0000:0000:0000:0000:0002
  ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
  starting sendonly join
  ib0: sendonly multicast join failed for
  ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
  ib0: setting up send only multicast group for
  ff12:601b:ffff:0000:0000:0001:ff9f:3b0a
  <-- here you can see that the IPv6 address is added and queued to the list
  ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
  starting sendonly join
  ib0: sendonly multicast join failed for
  ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
  <-- the IPv6 mcg will not be sent because it comes after some other
  sendonly group, and nothing in that flow re-queues the mc_task.
  This is a problem with the design of the original mcast task thread.
  I'm looking at a fix now.  Currently the design only allows one join to
  be outstanding at a time.  Is there a reason for that that I'm not aware
  of?  Some historical context that I don't know about?
 IMHO, the reason for only one mc in flight at a time was to make our
 life easier; otherwise there are locks to take and manage, races between
 multiple responses, etc. Also, the multicast module in the core
 serializes all the requests anyway.
 Perhaps you can reuse the relevant code from the full-member join in the
 sendonly join in order to handle the mc_task, or move the sendonly call
 back to mcast_send instead of the mc_task.

I reworked things a bit, but yes, the send only task now does the right
thing.  Please review the latest patchset I posted.  It's working just
fine for me here.

-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD






Re: [PATCH FIX for-3.19] IB/ipoib: Fix failed multicast joins/sends

2015-01-14 Thread Erez Shitrit

Hi Doug,

Perhaps I am missing something here, but ping6 still doesn't work for me 
in many cases.


I think the reason is that your original patch does the following in
ipoib_mcast_join_task():
	if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
		ipoib_mcast_sendonly_join(mcast);
	else
		ipoib_mcast_join(dev, mcast, 1);
	return;
The flow for sendonly_join doesn't include handling the mc_task, so only
the first mc in the list (if it is a sendonly mcg) will be sent, and no
further mcgs in the ipoib mc list will be sent. (See how this is handled
in the ipoib_mcast_join flow.)


I can demonstrate it with the ipoib log:
I am trying to ping6 fe80::202:c903:9f:3b0a via ib0

The log is:
ib0: restarting multicast task
ib0: setting up send only multicast group for
ff12:601b:ffff:0000:0000:0000:0000:0016

ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0001:ff43:3bf1
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join

ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff88081afb5f40,
LID 0xc015, SL 0

ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081e1c42c0,
LID 0xc014, SL 0
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
ib0: setting up send only multicast group for
ff12:601b:ffff:0000:0000:0000:0000:0002
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
ib0: setting up send only multicast group for
ff12:601b:ffff:0000:0000:0001:ff9f:3b0a
<-- here you can see that the IPv6 address is added and queued to the list
ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
starting sendonly join
ib0: sendonly multicast join failed for
ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
<-- the IPv6 mcg will not be sent because it comes after some other
sendonly group, and nothing in that flow re-queues the mc_task.


Thanks, Erez

The usage of IPOIB_MCAST_RUN as a flag is inconsistent.  In some places
it is used to mean our device is administratively allowed to send
multicast joins/leaves/packets and in other places it means our
multicast join task thread is currently running and will process your
request if you put it on the queue.  However, this latter meaning is in
fact flawed as there is a race condition between the join task testing
the mcast list and finding it empty of remaining work, dropping the
mcast mutex and also the priv->lock spinlock, and clearing the
IPOIB_MCAST_RUN flag.  Further, there are numerous locations that use
the flag in the former fashion, and when all tasks complete and the task
thread clears the RUN flag, all of those other locations will fail to
ever again queue any work.  This results in the interface coming up fine
initially, but having problems adding new multicast groups after the
first round of groups have all been added and the RUN flag is cleared by
the join task thread when it thinks it is done.  To resolve this issue,
convert all locations in the code to treat the RUN flag as an indicator
that the multicast portion of this interface is in fact administratively
up and joins/leaves/sends can be performed.  There is no harm (other
than a slight performance penalty) to never clearing this flag and using
it in this fashion as it simply means that a few places that used to
micro-optimize how often this task was queued on a work queue will now
queue the task a few extra times.  We can address that suboptimal
behavior in future patches.

Signed-off-by: Doug Ledford <dledf...@redhat.com>
---
  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 11 +--
  1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index bc50dd0d0e4..91b8fe118ec 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -630,8 +630,6 @@ void ipoib_mcast_join_task(struct work_struct *work)
}
  
 	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");

-
-	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
  }
  
  int ipoib_mcast_start_thread(struct net_device *dev)

@@ -641,8 +639,8 @@ int ipoib_mcast_start_thread(struct net_device *dev)
 	ipoib_dbg_mcast(priv, "starting multicast thread\n");
  

Re: [PATCH FIX for-3.19] IB/ipoib: Fix failed multicast joins/sends

2015-01-14 Thread Doug Ledford
On Wed, 2015-01-14 at 18:02 +0200, Erez Shitrit wrote:
 Hi Doug,
 
 Perhaps I am missing something here, but ping6 still doesn't work for me 
 in many cases.
 
 I think the reason is that your original patch does the following in
 ipoib_mcast_join_task():
 	if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
 		ipoib_mcast_sendonly_join(mcast);
 	else
 		ipoib_mcast_join(dev, mcast, 1);
 	return;
 The flow for sendonly_join doesn't include handling the mc_task, so only
 the first mc in the list (if it is a sendonly mcg) will be sent, and no
 further mcgs in the ipoib mc list will be sent. (See how this is handled
 in the ipoib_mcast_join flow.)

Yes, I know what you are talking about.  However, my patches did not add
this bug, it was present in the original code.  Please check a plain
v3.18 kernel, which does not have my patches, and you will see that
ipoib_mcast_sendonly_join_complete also fails to restart the mcast join
thread there as well.

 
 I can demonstrate it with the ipoib log:
 I am trying to ping6 fe80::202:c903:9f:3b0a via ib0
 
 The log is:
 ib0: restarting multicast task
 ib0: setting up send only multicast group for
 ff12:601b:ffff:0000:0000:0000:0000:0016
 ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0001:ff43:3bf1
 ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
 starting sendonly join
 ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
 ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff88081afb5f40,
 LID 0xc015, SL 0
 ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
 ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081e1c42c0,
 LID 0xc014, SL 0
 ib0: sendonly multicast join failed for
 ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
 ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
 starting sendonly join
 ib0: sendonly multicast join failed for
 ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
 ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
 starting sendonly join
 ib0: sendonly multicast join failed for
 ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
 ib0: setting up send only multicast group for
 ff12:601b:ffff:0000:0000:0000:0000:0002
 ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
 starting sendonly join
 ib0: sendonly multicast join failed for
 ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
 ib0: setting up send only multicast group for
 ff12:601b:ffff:0000:0000:0001:ff9f:3b0a
 <-- here you can see that the IPv6 address is added and queued to the list
 ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
 starting sendonly join
 ib0: sendonly multicast join failed for
 ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
 <-- the IPv6 mcg will not be sent because it comes after some other
 sendonly group, and nothing in that flow re-queues the mc_task.

This is a problem with the design of the original mcast task thread.
I'm looking at a fix now.  Currently the design only allows one join to
be outstanding at a time.  Is there a reason for that that I'm not aware
of?  Some historical context that I don't know about?

-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD






Re: [PATCH FIX for-3.19] IB/ipoib: Fix failed multicast joins/sends

2015-01-14 Thread Doug Ledford
On Wed, 2015-01-14 at 11:54 +0200, Or Gerlitz wrote:
 On 1/14/2015 8:38 AM, Doug Ledford wrote:
  The usage of IPOIB_MCAST_RUN as a flag is inconsistent.  In some places
  it is used to mean our device is administratively allowed to send
  multicast joins/leaves/packets [...]
 What? Please tell me which of these upstream hits (prior to any fix) has
 these semantics?

Actually, almost all of them.  I don't think they were originally
intended to be that way, but as a result of the mechanics of how
multicast joins are done, the net result is what it is.  But even more
importantly, this semantic is what we *want*.  For example, in
join_finish() where it tests IPOIB_MCAST_RUN before queueing delayed
work on a join failure, we *want* this to queue work whether the current
task thinks it is done or not as this is an async failure case and the
thread could have thought it was finished and cleared the flag before we
got our failure.  Treating this flag with the old semantic leaves this
situation with a dangling error condition that doesn't get retried until
something else restarts the mcast thread.  In fact, in almost all of the
places that this flag is checked, we really do want to run the thread
again unless we are in the middle of a mcast flush or we are downing the
dev.  In all other cases, we want to requeue the task.  So the semantic
I'm applying here, even if not what you or some other people had in
mind, is in fact closer to the truth of how the flag is used, and is
more in line with how we *want* the flag used.
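
To make that concrete, a condensed sketch of the failure path in the join
completion handler (reconstructed, so the placement and the surrounding
code are approximate):

	/* join completion error path, condensed */
	if (status) {
		mcast->backoff *= 2;
		if (mcast->backoff > IPOIB_MAX_BACKOFF_SECONDS)
			mcast->backoff = IPOIB_MAX_BACKOFF_SECONDS;

		mutex_lock(&mcast_mutex);
		spin_lock_irq(&priv->lock);
		/* Old semantic ("is the join task running?"): the task may
		 * already have cleared RUN, so this async failure is
		 * silently dropped and nothing retries the join until
		 * something else restarts the mcast thread.
		 * New semantic ("is multicast administratively up?"): the
		 * bit stays set until flush/down, so the retry is reliably
		 * queued with the backoff delay.
		 */
		if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
			queue_delayed_work(priv->wq, &priv->mcast_task,
					   mcast->backoff * HZ);
		spin_unlock_irq(&priv->lock);
		mutex_unlock(&mcast_mutex);
	}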

 # git grep -n IPOIB_MCAST_RUN drivers/infiniband/ulp/ipoib
 drivers/infiniband/ulp/ipoib/ipoib.h:90: IPOIB_MCAST_RUN   = 6,
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:434: if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:466: if (status && test_bit(IPOIB_MCAST_RUN, &priv->flags))
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:536: if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:550: if (!test_bit(IPOIB_MCAST_RUN, &priv->flags))
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:576: if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:634: clear_bit(IPOIB_MCAST_RUN, &priv->flags);
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:644: if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:658: clear_bit(IPOIB_MCAST_RUN, &priv->flags);
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c:728: if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
 


-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD






Re: [PATCH FIX for-3.19] IB/ipoib: Fix failed multicast joins/sends

2015-01-14 Thread Or Gerlitz

On 1/14/2015 8:38 AM, Doug Ledford wrote:

The usage of IPOIB_MCAST_RUN as a flag is inconsistent.  In some places
it is used to mean our device is administratively allowed to send
multicast joins/leaves/packets [...]
What? Please tell me which of these upstream hits (prior to any fix) has
these semantics?


# git grep -n IPOIB_MCAST_RUN drivers/infiniband/ulp/ipoib
drivers/infiniband/ulp/ipoib/ipoib.h:90: IPOIB_MCAST_RUN   = 6,
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:434: if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:466: if (status && test_bit(IPOIB_MCAST_RUN, &priv->flags))
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:536: if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:550: if (!test_bit(IPOIB_MCAST_RUN, &priv->flags))
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:576: if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:634: clear_bit(IPOIB_MCAST_RUN, &priv->flags);
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:644: if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:658: clear_bit(IPOIB_MCAST_RUN, &priv->flags);
drivers/infiniband/ulp/ipoib/ipoib_multicast.c:728: if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))




[PATCH FIX for-3.19] IB/ipoib: Fix failed multicast joins/sends

2015-01-13 Thread Doug Ledford
The usage of IPOIB_MCAST_RUN as a flag is inconsistent.  In some places
it is used to mean our device is administratively allowed to send
multicast joins/leaves/packets and in other places it means our
multicast join task thread is currently running and will process your
request if you put it on the queue.  However, this latter meaning is in
fact flawed as there is a race condition between the join task testing
the mcast list and finding it empty of remaining work, dropping the
mcast mutex and also the priv->lock spinlock, and clearing the
IPOIB_MCAST_RUN flag.  Further, there are numerous locations that use
the flag in the former fashion, and when all tasks complete and the task
thread clears the RUN flag, all of those other locations will fail to
ever again queue any work.  This results in the interface coming up fine
initially, but having problems adding new multicast groups after the
first round of groups have all been added and the RUN flag is cleared by
the join task thread when it thinks it is done.  To resolve this issue,
convert all locations in the code to treat the RUN flag as an indicator
that the multicast portion of this interface is in fact administratively
up and joins/leaves/sends can be performed.  There is no harm (other
than a slight performance penalty) to never clearing this flag and using
it in this fashion as it simply means that a few places that used to
micro-optimize how often this task was queued on a work queue will now
queue the task a few extra times.  We can address that suboptimal
behavior in future patches.

Signed-off-by: Doug Ledford <dledf...@redhat.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index bc50dd0d0e4..91b8fe118ec 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -630,8 +630,6 @@ void ipoib_mcast_join_task(struct work_struct *work)
}
 
 	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
-
-	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
 }
 
 int ipoib_mcast_start_thread(struct net_device *dev)
@@ -641,8 +639,8 @@ int ipoib_mcast_start_thread(struct net_device *dev)
 	ipoib_dbg_mcast(priv, "starting multicast thread\n");
 
 	mutex_lock(&mcast_mutex);
-	if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
-		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
+	set_bit(IPOIB_MCAST_RUN, &priv->flags);
+	queue_delayed_work(priv->wq, &priv->mcast_task, 0);
 	mutex_unlock(&mcast_mutex);
 
return 0;
@@ -725,7 +723,7 @@ void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb)
 	memcpy(mcast->mcmember.mgid.raw, mgid, sizeof (union ib_gid));
 	__ipoib_mcast_add(dev, mcast);
 	list_add_tail(&mcast->list, &priv->multicast_list);
-	if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags))
+	if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
 		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
 }
 
@@ -951,7 +949,8 @@ void ipoib_mcast_restart_task(struct work_struct *work)
/*
 * Restart our join task if needed
 */
-	ipoib_mcast_start_thread(dev);
+	if (test_bit(IPOIB_MCAST_RUN, &priv->flags))
+		queue_delayed_work(priv->wq, &priv->mcast_task, 0);
rtnl_unlock();
 }
 
-- 
2.1.0
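
For readers following the thread, a condensed sketch of the race described
in the commit message above. The list walk is paraphrased into a
hypothetical helper, first_unjoined_mcast(), so treat this as an
illustration of the window rather than the literal upstream code:

void ipoib_mcast_join_task(struct work_struct *work)
{
	struct ipoib_dev_priv *priv =
		container_of(work, struct ipoib_dev_priv, mcast_task.work);
	struct ipoib_mcast *mcast;

	mutex_lock(&mcast_mutex);
	spin_lock_irq(&priv->lock);
	mcast = first_unjoined_mcast(priv);	/* hypothetical helper */
	spin_unlock_irq(&priv->lock);
	mutex_unlock(&mcast_mutex);

	if (mcast) {
		ipoib_mcast_join(mcast->dev, mcast, 1);
		return;
	}

	/*
	 * Race window: right here, ipoib_mcast_send() can add a new group
	 * to priv->multicast_list; its !test_and_set_bit(IPOIB_MCAST_RUN)
	 * sees the bit still set and skips queue_delayed_work().  The
	 * clear_bit() below then runs, and nothing ever joins the new
	 * group until something else restarts the multicast thread.
	 */
	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
}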
