On Wed, 2015-01-21 at 22:34 +0200, Or Gerlitz wrote:
> On Wed, Jan 21, 2015 at 7:19 PM, Roland Dreier <rol...@kernel.org> wrote:
> > On Tue, Jan 20, 2015 at 8:16 AM, Erez Shitrit <ere...@dev.mellanox.co.il> 
> > wrote:
> 
> >> After trying your V4 patch series, I can tell that first, the endless 
> >> scheduling of
> >> the mcast task is indeed over, but still, the multicast functionality in 
> >> ipoib is unstable.
> 
> > Is this worse than 3.18?  (Have you tested that?)
> 
> Roland, Doug,
> 
> To be fully clear here by "this" we're talking on seven patches of
> complexity and volume which I think go way beyond post -rc5 timeline:
> 
>   IB/ipoib: Fix failed multicast joins/sends
>   IB/ipoib: Add a helper to restart the multicast task
>   IB/ipoib: make delayed tasks not hold up everything
>   IB/ipoib: Handle -ENETRESET properly in our callback
>   IB/ipoib: don't restart our thread on ENETRESET
>   IB/ipoib: remove unneeded locks
>   IB/ipoib: fix race between mcast_dev_flush and mcast_join
> 
>  drivers/infiniband/ulp/ipoib/ipoib.h           |   1 +
>  drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 204 
> +++++++++++++++----------
>  2 files changed, 121 insertions(+), 84 deletions(-)

I don't usually look at the line count or the patch count, I look at the
changes.  Most of the 7 patches above are very self contained and very
small.  The exception is the final patch and even though it is large,
the net result is that it takes a needlessly complex function and makes
it easier to read, easier to understand, easier to verify that it does
what it's supposed to do, and then fixes a locking issue.  I also look
at testing.  While it's not obvious to the people on this list, I am
testing this rather profusely and so is our internal QE department and a
few others.  I'm really down to only the rmmod race (which I suspect the
7 patches above did not introduce, but the original 8 patches did).

> Doug, I understand your claim and frustration that with 3.18 and such
> (older kernels) your ifdown/up loop manages to break the driver,

Yes.  Quite easily.

>  but
> fixing the driver such that this test works and in the same time
> practically breaking IPv6 and IPv5 multicast introduces a deep
> regression vs. 3.18

You're right.  That's why there are these 7 patches.

>  - which as you wrote here, would be wrong to fix
> with such a further big change.
> 
> Are you really sure that reverting the offending patch 016d9fb25cd9
> "IPoIB: fix MCAST_FLAG_BUSY usage" and maybe some more dependent
> related hunks from downstream patches of that series isn't possible?

Positive.  The 8 patches change the semantics of the MCAST_BUSY_FLAG
usage throughout the code.  If you pick some and leave others, you can
end up with parts using the flag one way and parts another and then you
really blow your locking and usage to hell.

> If this is the case, I would suggest that we either revive the review
> on the fix we sent [1] or drop the whole 3.19-rc1 changes. I vote for
> the former.
> 
> [1] http://marc.info/?l=linux-rdma&m=142064313123254&w=2

So, my original 8 patches were broken.  That's easy enough to see.  My
"sin" in my testing of those patches was that my test methodology was
too myopic.  I focused on the ifup/ifdown loop scenario, and didn't test
other things that might be effected.  Similarly, if people are only
testing this latest set of code one way (for example, using rmmod/insmod
as the means to force the connection up and down and test the
joins/leaves), that can be similarly myopic.  This patch from Erez is
horribly broken for cases other than rmmod/insmod testing.  I'm guessing
you didn't see it because you never tested any other way, like using
ifup/ifdown or pulling a cable or turning a switch port off and back on.
If you had, there would not be a question of using it, at least as it
stands.

Look, smaller is not always better.  A fix that seems to work and is
smaller, but renders the code almost unintelligible because it makes the
code internally inconsistent between various areas is not a preferable
patch to a larger patch that fixes things right and makes the overall
code consistent and understandable.

> > Because Doug's changes fixed some bad, easy-to-reproduce issues.  On
> > the other hand we don't want to introduce new regressions to fix the
> > old issues.
> 
> See above, we did introduced regressions.

Which are already fixed by this 7 patch patchset.  The one remaining
issue isn't actually an ipv6 or ipv4 multicast issue, but specifically
an issue related to locking around rmmod/insmod and that we are allowing
ourselves to get into an inconsistent state during these cycles which
then effects later running (for instance, we might not clean out all of
our multicast joins before rmmod on ipoib succeeds, so on the next
insmod and attempt to join that same multicast group, the ib_sa layer
multicast membership state no longer agrees with us and we have
problems).  I've got to fix that.  But if you run your tests without
rmmod/insmod being how you bring the link up/down, I think you'll find
it now passes those tests with flying colors.  At least it does in my
testing and I would greatly value confirmation of that.

> > I think we only have a few days to decide whether to revert back to
> > 3.18 code, or push forward with these fixes.
> 
> Or.


-- 
Doug Ledford <dledf...@redhat.com>
              GPG KeyID: 0E572FDD


Attachment: signature.asc
Description: This is a digitally signed message part

Reply via email to