On Thu, 2015-01-15 at 09:19 +0000, Erez Shitrit wrote:
> Hi Doug,
> 
> Thank you for the quick response.
> 
> I now see two issues that I want to draw your attention to:
> 
> 1. If there is an mcg that the driver failed to join, the mc_task enters an 
> endless re-queue loop, and the log fills up with the following messages:
> [682560.569826] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.580136] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.590364] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.600504] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.610627] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.620769] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.631082] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.640835] ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> [682560.651033] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.660758] ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> [682560.670923] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.680676] ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> [682560.690898] ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016, starting sendonly join
> [682560.700630] ib0: sendonly multicast join failed for ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> 
> around 100 times a sec.

OK, this looks like the send only joins that fail are not setting a
fallback properly or something like that.  There is a separate bug that
I've isolated and am going to fix; then we can see whether that fix
affects things here, as it very well might.
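
For illustration only, here is a minimal sketch of what "setting a
fallback" on a failed join could look like, assuming the term refers to
a retry delay that grows after each failure. The struct, the function
names, and the 16 second cap are hypothetical and are not taken from
ipoib_multicast.c; time is modelled as a plain seconds counter rather
than jiffies.

```c
#include <assert.h>

/* Hypothetical cap on the retry delay, in seconds (an assumption). */
#define JOIN_BACKOFF_MAX_SEC 16

/* Hypothetical per-group state; not the driver's real structure. */
struct mcast_entry {
    unsigned int backoff_sec;  /* current retry delay, starts at 1 */
    unsigned long retry_at;    /* earliest time (seconds) for the next attempt */
};

/*
 * Called when a join attempt fails: schedule the next attempt after
 * the current delay, then double the delay up to the cap.
 */
static void join_failed(struct mcast_entry *m, unsigned long now)
{
    assert(m->backoff_sec > 0);
    m->retry_at = now + m->backoff_sec;
    m->backoff_sec *= 2;
    if (m->backoff_sec > JOIN_BACKOFF_MAX_SEC)
        m->backoff_sec = JOIN_BACKOFF_MAX_SEC;
}

/*
 * The join task would skip entries that are not yet due instead of
 * re-queueing them immediately.
 */
static int join_due(const struct mcast_entry *m, unsigned long now)
{
    return now >= m->retry_at;
}
```

With something along these lines, a group that keeps failing with
status -22 would be retried at most once per capped interval rather
than roughly 100 times a second.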

> 2. IPv6 still doesn't work for me in the same case, where it is not the 
> first mcg in the list.

Can you give me some sort of instructions on how to replicate your
testing?  Things are working for me here, but I don't have a complex
IPv6 setup and mine may be too simple to reproduce what you are seeing.

> Thanks, Erez
> 
> -----Original Message-----
> From: Doug Ledford [mailto:dledf...@redhat.com] 
> Sent: Wednesday, January 14, 2015 9:53 PM
> To: linux-rdma@vger.kernel.org; rol...@kernel.org
> Cc: Amir Vadai; Eyal Perry; Erez Shitrit; Or Gerlitz; Doug Ledford
> Subject: [PATCH V3 FIX For-3.19 0/3] IB/ipoib: Fix multicast join flow
> 
> This patch series fixes the multicast join behavior problems introduced by my 
> previous patchset.  In particular, the original code did not use the send 
> only join code from the multicast thread context, and so it did not need to 
> restart the multicast thread.  After my previous patchset, it does get called 
> from the thread context, and so the send only join completion areas need to 
> restart the join thread but they don't.  This patchset makes them do so.  It 
> then adds in some cleanups for restarting the thread, and fixes the fact that 
> one delayed join holds up the entire list of joins.
> 
> v3: Resend because the last send didn't register in patchworks properly
>     (because the subject-prefix was not on all of the emails, only the
>     first) and because the Cc: list did not pass from cover letter
>     to patches
> 
> v2: Added two new patches, the first creates a helper to restart the
>     multicast join thread and also adds using it in the two places where
>     it should have been used but wasn't, the second allows the joins to
>     proceed around a delayed join instead of stalling everything.
> 
> v1: Addressed the usage of the IPOIB_MCAST_RUN flag
> 
> Doug Ledford (3):
>   IB/ipoib: Fix failed multicast joins/sends
>   IB/ipoib: Add a helper to restart the multicast task
>   IB/ipoib: make delayed tasks not hold up everything
> 
>  drivers/infiniband/ulp/ipoib/ipoib.h           |  1 +
>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 94 ++++++++++++++++++--------
>  2 files changed, 66 insertions(+), 29 deletions(-)
> 
> --
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
Doug Ledford <dledf...@redhat.com>
              GPG KeyID: 0E572FDD

