Re: shrink struct ib_send_wr V4

2015-11-04 Thread Doug Ledford
On 11/02/2015 11:16 PM, Greg Kroah-Hartman wrote:
> This too was something we discussed at the kernel summit, it's
> recommended that you just delete the drivers from where they are now, no
> need to move them to staging first.

So, don't take any more patches against them then.  I'll delete them in
a week or so when I send my pull request.  I have a few other things to
touch up before then and I have patches that touch those drivers, so in
order to avoid merge issues that would seem the best course at this point.


-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V4

2015-11-04 Thread Doug Ledford
On 11/04/2015 12:16 PM, Greg Kroah-Hartman wrote:
> On Wed, Nov 04, 2015 at 10:00:38AM -0500, Doug Ledford wrote:
>> On 11/02/2015 11:16 PM, Greg Kroah-Hartman wrote:
>>> This too was something we discussed at the kernel summit, it's
>>> recommended that you just delete the drivers from where they are now, no
>>> need to move them to staging first.
>>
>> So, don't take any more patches against them then.  I'll delete them in
>> a week or so when I send my pull request.  I have a few other things to
>> touch up before then and I have patches that touch those drivers, so in
>> order to avoid merge issues that would seem the best course at this point.
> 
> Which drivers specifically are you referring to?

All three of the drivers that are scheduled for deletion: ehca,
amso1100, ipath.

> And I'm not queueing up any new patches in my tree due to the merge
> window, so there's nothing for me to take right now :)
> 
> thanks,
> 
> greg k-h
> 


-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V4

2015-11-04 Thread Greg Kroah-Hartman
On Wed, Nov 04, 2015 at 10:00:38AM -0500, Doug Ledford wrote:
> On 11/02/2015 11:16 PM, Greg Kroah-Hartman wrote:
> > This too was something we discussed at the kernel summit, it's
> > recommended that you just delete the drivers from where they are now, no
> > need to move them to staging first.
> 
> So, don't take any more patches against them then.  I'll delete them in
> a week or so when I send my pull request.  I have a few other things to
> touch up before then and I have patches that touch those drivers, so in
> order to avoid merge issues that would seem the best course at this point.

Which drivers specifically are you referring to?

And I'm not queueing up any new patches in my tree due to the merge
window, so there's nothing for me to take right now :)

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-03 Thread Chuck Lever

> On Nov 2, 2015, at 6:37 PM, Greg Kroah-Hartman  
> wrote:
> 
> On Mon, Nov 02, 2015 at 06:20:40PM -0500, Doug Ledford wrote:
>> 1) Aging, but working, drivers that will be removed in the future.
>> Since we no longer have a deprecation mechanism, I was informed that the
>> normal procedure now is to move the driver to staging for a while and
>> then remove it permanently on a future schedule.  This provides an
>> orderly removal process.  But, if you make it so that random people can
>> break the driver with no responsibility for fixing up their breakage, in
>> contrast to the entire rest of the kernel, then you eliminate that
>> orderly removal process and turn it into a completely non-deterministic
>> and chaotic removal process.  So, no, if that's how it will be handled,
>> I will move the deprecated drivers back to the main rdma tree and time
>> them out and perform the orderly removal myself.
> 
> So you have what, one more kernel release before these are deleted?
> Odds are, nothing is going to be broken so badly that a simple patch
> will not fix up before they are removed.  So don't worry about this.
> 
> And why would you want a developer wasting time on fixing up these
> drivers if they are about to be deleted anyway?  Obviously no one even
> uses them, otherwise they wouldn't be about to be removed.

I understand the stated policy, but I was recently
bitten by it. I got dinged by the kbuild robot when I
made a change that broke something in staging. I was
caught between the staging policy and the guidelines
about fixing kbuild nits before a merge window.

My change was in the IB core, but the breakage was in
Lustre. Their fix (which was quick because I had warned
them in advance) was substantial, and not something I
could have provided myself.

If the bits in staging are truly to be ignored, then
the kbuild robot shouldn’t warn about new staging
breakage. Or the warning messages should state that
fixing said breakage is optional.


>> 2) New drivers (one currently, one other one submitted but not yet
>> pulled in) under active development but which need specific things fixed
>> up.  These have people already working on them full time.  They were
>> submitted to the staging tree with a specific TODO in order to get out.
>> If you then break that driver and force the driver author to fix it,
>> you have in essence changed the TODO list.  That means you now have a
>> moving goal post scenario.
> 
> Again, this shouldn't be taking you all that long to get out of staging,
> right?  And again, the number of api changes that cause breakage are
> usually very minor, if they happen at all (remember, this is a
> theoretical, not what usually happens.)  And a driver in this state
> doesn't deserve to be in the "real" portion of the kernel, so you can't
> merge it anywhere else, so overall it still benifits being in the
> staging tree, so a few minor breakages every once in a while should be
> easy for you to fix up, _if_ they happen.
> 
> Again, I don't know of any recent api change that has caused any
> problems in a long time, but the VFS developers really hate having to
> touch lustre code, and I don't blame them, so sometimes that codebase
> breaks.  That will not affect your drivers at all, so I wouldn't worry
> about this.
> 
> So relax, this isn't going to be an issue, or if it is, you can easily
> handle it, it is no different from any other tree-wide api change that
> happens.
> 
> thanks,
> 
> greg k-h
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

—
Chuck Lever



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Greg Kroah-Hartman
On Mon, Nov 02, 2015 at 06:20:40PM -0500, Doug Ledford wrote:
> 1) Aging, but working, drivers that will be removed in the future.
> Since we no longer have a deprecation mechanism, I was informed that the
> normal procedure now is to move the driver to staging for a while and
> then remove it permanently on a future schedule.  This provides an
> orderly removal process.  But, if you make it so that random people can
> break the driver with no responsibility for fixing up their breakage, in
> contrast to the entire rest of the kernel, then you eliminate that
> orderly removal process and turn it into a completely non-deterministic
> and chaotic removal process.  So, no, if that's how it will be handled,
> I will move the deprecated drivers back to the main rdma tree and time
> them out and perform the orderly removal myself.

So you have what, one more kernel release before these are deleted?
Odds are, nothing is going to be broken so badly that a simple patch
will not fix up before they are removed.  So don't worry about this.

And why would you want a developer wasting time on fixing up these
drivers if they are about to be deleted anyway?  Obviously no one even
uses them, otherwise they wouldn't be about to be removed.

> 2) New drivers (one currently, one other one submitted but not yet
> pulled in) under active development but which need specific things fixed
> up.  These have people already working on them full time.  They were
> submitted to the staging tree with a specific TODO in order to get out.
>  If you then break that driver and force the driver author to fix it,
> you have in essence changed the TODO list.  That means you now have a
> moving goal post scenario.

Again, this shouldn't be taking you all that long to get out of staging,
right?  And again, the number of api changes that cause breakage are
usually very minor, if they happen at all (remember, this is a
theoretical, not what usually happens.)  And a driver in this state
doesn't deserve to be in the "real" portion of the kernel, so you can't
merge it anywhere else, so overall it still benifits being in the
staging tree, so a few minor breakages every once in a while should be
easy for you to fix up, _if_ they happen.

Again, I don't know of any recent api change that has caused any
problems in a long time, but the VFS developers really hate having to
touch lustre code, and I don't blame them, so sometimes that codebase
breaks.  That will not affect your drivers at all, so I wouldn't worry
about this.

So relax, this isn't going to be an issue, or if it is, you can easily
handle it, it is no different from any other tree-wide api change that
happens.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Jason Gunthorpe
On Mon, Nov 02, 2015 at 07:02:04PM -0500, Doug Ledford wrote:

> Because they are *scheduled* for removal.  If I simply didn't care if
> they went away, then I wouldn't screw around with deprecating them or
> tagging them to be removed, I'd just delete them.  Breaking them before
> the scheduled removal time defeats that entire purpose.

You don't see in the compiles, but IIRC ipath is unusable after the
PAT rework.

We've also been ripping out the ULP side of things that would let
amso1100/ehca work in any meningful way with the kernel ULPs.

The reason I lobbied to get rid of them is specifically because they
don't work and maintaining them (ie the driver and the single-use ULP
codepath side) is a huge pain. Keeping them around and keeping them
compiling defeats the entire point. Just delete them, we don't need to
wait for 4.6.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Doug Ledford
On 11/01/2015 01:06 PM, Greg Kroah-Hartman wrote:
> On Sun, Nov 01, 2015 at 02:10:48PM +0200, Or Gerlitz wrote:
>> On 10/29/2015 1:51 PM, Christoph Hellwig wrote:
>>> On Wed, Oct 28, 2015 at 10:57:59PM -0400, Doug Ledford wrote:
>>> I had to do a minor hand merge to get this to apply, but it has been
>>> pulled in for 4.4.
>
> This breaks all of the drivers in staging BTW.  That will need fixed up
> before the pull request goes in during the merge window.
>>> That latest branch on infradead.org should have fixed all the staging
>>> drivers.  But Linus just clarified that we indeed do not have to care for
>>> staging drivers, I asked him at kernel summit in front of the whole 
>>> audience.
>>
>> Can you and/or Greg clarify this a little further... when someone modified
>> an API called by others
>> drivers,  they don't have to deal with staging drivers?
> 
> Yes, that is correct.
> 
>> that is, the folks that care for this staging code need to handle
>> that?
> 
> Yes, that is correct.
> 
> It's up to the owners of the staging drivers to do the work, we do not
> make it the responsibility of anyone else.  Now most of the time people
> are nice and fix up everything else, but it's not a requirement at all.
> 
> hope this helps,

Not really.  That may be a satisfactory answer for most of the things in
the staging area, but it is not a satisfactory for the items I moved
into that area.  The things I moved there belong in one of two categories:

1) Aging, but working, drivers that will be removed in the future.
Since we no longer have a deprecation mechanism, I was informed that the
normal procedure now is to move the driver to staging for a while and
then remove it permanently on a future schedule.  This provides an
orderly removal process.  But, if you make it so that random people can
break the driver with no responsibility for fixing up their breakage, in
contrast to the entire rest of the kernel, then you eliminate that
orderly removal process and turn it into a completely non-deterministic
and chaotic removal process.  So, no, if that's how it will be handled,
I will move the deprecated drivers back to the main rdma tree and time
them out and perform the orderly removal myself.

2) New drivers (one currently, one other one submitted but not yet
pulled in) under active development but which need specific things fixed
up.  These have people already working on them full time.  They were
submitted to the staging tree with a specific TODO in order to get out.
 If you then break that driver and force the driver author to fix it,
you have in essence changed the TODO list.  That means you now have a
moving goal post scenario.

And this behavior is somewhat justified if you have a driver that is
languishing in staging and being mostly ignored by the
authors/maintainers of the driver.  If the maintainers don't care enough
about it to fix it up, why should the core kernel developers keeping the
kernel modern fix them up?

But what about when the drivers *are* being actively maintained and
worked on?  From the perspective of a maintainer trying to get their
driver out of staging, this policy means that drivers that are already
in the main tree get help from the core developer who *must* fix them up
to work with their changes according to policy, while the same core
developer is allowed to ignore any responsibility for fixing up any
staging drivers.  At the same time, it is highly likely that since the
core kernel drivers have reached a level of maturity that they don't
really *need* a lot of help keeping up with changes nor require a lot of
work.  The maintainer trying to get their driver out of staging however
could probably use the help the most, and certainly likely doesn't need
any extra work dumped on their back.

So, no, for the drivers I have moved into the staging tree, I don't
support this policy.  This policy is really just a means of not making
core developers work on drivers that no one cares about.  The *means* of
saving those core developers that work turns out to effectively be a
punishment to any driver maintainer in the staging area who isn't
allowing their driver to languish and go unmaintained.  Because the
staging tree is not limited to just drivers deserving of this
punishment, the policy itself is inappropriate IMO.  If that's going to
be a problem, I have no issue moving the entire staging/rdma tree back
into the drivers/infiniband area and fixing it up appropriately.


-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Doug Ledford
On 11/02/2015 06:37 PM, Greg Kroah-Hartman wrote:
> On Mon, Nov 02, 2015 at 06:20:40PM -0500, Doug Ledford wrote:
>> 1) Aging, but working, drivers that will be removed in the future.
>> Since we no longer have a deprecation mechanism, I was informed that the
>> normal procedure now is to move the driver to staging for a while and
>> then remove it permanently on a future schedule.  This provides an
>> orderly removal process.  But, if you make it so that random people can
>> break the driver with no responsibility for fixing up their breakage, in
>> contrast to the entire rest of the kernel, then you eliminate that
>> orderly removal process and turn it into a completely non-deterministic
>> and chaotic removal process.  So, no, if that's how it will be handled,
>> I will move the deprecated drivers back to the main rdma tree and time
>> them out and perform the orderly removal myself.
> 
> So you have what, one more kernel release before these are deleted?

4.6 merge window.

> Odds are, nothing is going to be broken so badly that a simple patch
> will not fix up before they are removed.  So don't worry about this.
> 
> And why would you want a developer wasting time on fixing up these
> drivers if they are about to be deleted anyway?  Obviously no one even
> uses them, otherwise they wouldn't be about to be removed.

Because they are *scheduled* for removal.  If I simply didn't care if
they went away, then I wouldn't screw around with deprecating them or
tagging them to be removed, I'd just delete them.  Breaking them before
the scheduled removal time defeats that entire purpose.

>> 2) New drivers (one currently, one other one submitted but not yet
>> pulled in) under active development but which need specific things fixed
>> up.  These have people already working on them full time.  They were
>> submitted to the staging tree with a specific TODO in order to get out.
>>  If you then break that driver and force the driver author to fix it,
>> you have in essence changed the TODO list.  That means you now have a
>> moving goal post scenario.
> 
> Again, this shouldn't be taking you all that long to get out of staging,
> right?

No, the one item on the TODO list is fairly big and requires
multi-vendor joint engineering efforts.  It could easily take multiple
kernel releases.

>  And again, the number of api changes that cause breakage are
> usually very minor, if they happen at all (remember, this is a
> theoretical, not what usually happens.)

We've already had two in the 4.4 for-next window in the InfiniBand
subsystem.  I requested that the authors fix up the staging drivers and
included that with the patches that required the fix ups, so you didn't
see it.

>  And a driver in this state
> doesn't deserve to be in the "real" portion of the kernel,

That's arguable.  Certainly scheduling a driver for removal doesn't mean
it no longer "deserves" to be in the "real" portion of the kernel.  It's
on its way out, yes, but that doesn't make it automatically and
immediately undeserving of being in the "real" kernel.

> so you can't
> merge it anywhere else,

Of course I could.  The deprecated drivers didn't have to be moved, I
could have created an INFINIBAND_DEPRECATED group and moved them in
their via Kconfig without ever moving the files themselves anywhere, and
then deleted them when the time came.

The new driver(s) could have been merged as it was.  It's only crime is
that it was the third driver to have a software transfer engine in it,
and so we wanted to get a transfer library built and move the transfer
management code out of the driver and into the library.  We did the same
thing with the SCSI stack, and we didn't make every new driver go
through staging while we built the library, which is what we are doing
here.  Staging is a carrot to get the library built, not an indication
that this driver is unsuitable for the "real" kernel.

> so overall it still benifits being in the
> staging tree, so a few minor breakages every once in a while should be
> easy for you to fix up, _if_ they happen.
> 
> Again, I don't know of any recent api change that has caused any
> problems in a long time, but the VFS developers really hate having to
> touch lustre code, and I don't blame them, so sometimes that codebase
> breaks.  That will not affect your drivers at all, so I wouldn't worry
> about this.

Yes.  I get that.  And I understand that.  But because of Lustre, you
have made a global policy that effects all staging drivers without
legitimate cause, proudly broadcast that policy at a conference, even
answered a question from Christoph confirming that policy, and now in
this email thread where I object to that policy you say "It doesn't
really happen anyway, don't worry about it.'  But the person who quoted
you (Christoph) is the author of one of the API changes that *did* break
the RDMA drivers in the staging tree and he fixed them up when I asked
him to.  Christoph then later quoted you and his interaction 

Re: shrink struct ib_send_wr V4

2015-11-02 Thread Greg Kroah-Hartman
On Mon, Nov 02, 2015 at 07:02:04PM -0500, Doug Ledford wrote:
> > so overall it still benifits being in the
> > staging tree, so a few minor breakages every once in a while should be
> > easy for you to fix up, _if_ they happen.
> > 
> > Again, I don't know of any recent api change that has caused any
> > problems in a long time, but the VFS developers really hate having to
> > touch lustre code, and I don't blame them, so sometimes that codebase
> > breaks.  That will not affect your drivers at all, so I wouldn't worry
> > about this.
> 
> Yes.  I get that.  And I understand that.  But because of Lustre, you
> have made a global policy that effects all staging drivers without
> legitimate cause, proudly broadcast that policy at a conference, even
> answered a question from Christoph confirming that policy, and now in
> this email thread where I object to that policy you say "It doesn't
> really happen anyway, don't worry about it.'

No, it's not "because" of lustre, it's been that way since day one when
we started the staging tree.  Christoph was just annoyed that someone
was trying to tell him otherwise.  And he is right.

> But the person who quoted
> you (Christoph) is the author of one of the API changes that *did* break
> the RDMA drivers in the staging tree and he fixed them up when I asked
> him to.  Christoph then later quoted you and his interaction with you at
> conference to indicate he didn't have to.

He didn't have to.  He was being nice.  That's your job to fix them up
if you want the code in staging.

> And what I'm telling him, and
> you since you are the person he's quoting to justify not fixing up
> things, is that I don't care what you say, when it comes to those
> drivers that I moved into staging, if he wants me to take his core
> patches, then he needs to make sure they don't break those drivers I
> moved into staging.

Nope, not true.  If you don't like this, I'll gladly just delete the
drivers from staging, sorry.  You don't get a "free ride" in staging at
all.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Greg Kroah-Hartman
On Mon, Nov 02, 2015 at 05:18:45PM -0700, Jason Gunthorpe wrote:
> On Mon, Nov 02, 2015 at 07:02:04PM -0500, Doug Ledford wrote:
> 
> > Because they are *scheduled* for removal.  If I simply didn't care if
> > they went away, then I wouldn't screw around with deprecating them or
> > tagging them to be removed, I'd just delete them.  Breaking them before
> > the scheduled removal time defeats that entire purpose.
> 
> You don't see in the compiles, but IIRC ipath is unusable after the
> PAT rework.
> 
> We've also been ripping out the ULP side of things that would let
> amso1100/ehca work in any meningful way with the kernel ULPs.
> 
> The reason I lobbied to get rid of them is specifically because they
> don't work and maintaining them (ie the driver and the single-use ULP
> codepath side) is a huge pain. Keeping them around and keeping them
> compiling defeats the entire point. Just delete them, we don't need to
> wait for 4.6.

Great, which ones can I delete, I'll go do that right now :)

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Doug Ledford
On 11/02/2015 07:18 PM, Jason Gunthorpe wrote:
> On Mon, Nov 02, 2015 at 07:02:04PM -0500, Doug Ledford wrote:
> 
>> Because they are *scheduled* for removal.  If I simply didn't care if
>> they went away, then I wouldn't screw around with deprecating them or
>> tagging them to be removed, I'd just delete them.  Breaking them before
>> the scheduled removal time defeats that entire purpose.
> 
> You don't see in the compiles, but IIRC ipath is unusable after the
> PAT rework.

It shouldn't be.  I reviewed those changes and they looked right (given
the limitations).  All you needed was to boot with nopat on the kernel
command line to get the old kernel behavior and it would continue to
work as before, and it would print out a message telling you to do so if
you hadn't already.

> We've also been ripping out the ULP side of things that would let
> amso1100/ehca work in any meningful way with the kernel ULPs.

For amso1100, kernel ULPs has never been its target.  Didn't we only
recently got any support for iWARP in iSER?  And we never had it or will
have it in IPoIB or SRP/SRPT.  That only leaves RDS and NFSoRDMA.
RDS/RDMA upstream has been a mess for a long time.  That only leaves
NFSoRDMA.  So it's fair to say amso1100 has been primarily a user space
RDMA device, not a kernel ULP device.

For ehca, that's another matter entirely.  I gave up trying to work with
that driver a long time ago.  If IBM tells me something is broke, I'll
review their patches to fix it, but as I have no working hardware, and
only every had hardware I couldn't get working in our environment, there
isn't much I can do about it.

> The reason I lobbied to get rid of them is specifically because they
> don't work and maintaining them (ie the driver and the single-use ULP
> codepath side) is a huge pain. Keeping them around and keeping them
> compiling defeats the entire point. Just delete them, we don't need to
> wait for 4.6.

That's not true.  User space continues to work, and amso1100 shouldn't
be greatly negatively impacted by recent changes.  Nor should ipath.

-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Jason Gunthorpe
On Mon, Nov 02, 2015 at 07:52:05PM -0500, Doug Ledford wrote:
> It shouldn't be.  I reviewed those changes and they looked right (given
> the limitations).  All you needed was to boot with nopat on the kernel
> command line to get the old kernel behavior and it would continue to
> work as before, and it would print out a message telling you to do so if
> you hadn't already.

Alright, that was done after it was pretty clear the driver was
useless and I stopped looking at that part of the series. I don't know
why we made Luis jump through such hoops instead of just deleting it
right then and there.

> > For amso1100, kernel ULPs has never been its target.  Didn't we only
> > recently got any support for iWARP in iSER?

I don't like that reasoning. In 4.4 we expect some ULPs to work on
iWarp, and amso110 can't do it. That is a big reason why the driver
is getting chopped.

> > The reason I lobbied to get rid of them is specifically because they
> > don't work and maintaining them (ie the driver and the single-use ULP
> > codepath side) is a huge pain. Keeping them around and keeping them
> > compiling defeats the entire point. Just delete them, we don't need to
> > wait for 4.6.
> 
> That's not true.  User space continues to work, and amso1100 shouldn't
> be greatly negatively impacted by recent changes.  Nor should ipath.

So what if user space works? The kernel consumers are known broken.

We've commited to removing the driver from the kernel. We have support
of the driver authors to do this. We have found no users or
testers. We've committed to removing the ULP support (ie MR-only code
is being ripped out).

*WHY* spend an ounce of time fixing up code that *NO-ONE* will ever
even run? Wasted effort. Just delete them now.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Greg Kroah-Hartman
On Mon, Nov 02, 2015 at 09:03:35PM -0500, Doug Ledford wrote:
> On 11/02/2015 08:28 PM, Jason Gunthorpe wrote:
> > On Mon, Nov 02, 2015 at 07:52:05PM -0500, Doug Ledford wrote:
> >> It shouldn't be.  I reviewed those changes and they looked right (given
> >> the limitations).  All you needed was to boot with nopat on the kernel
> >> command line to get the old kernel behavior and it would continue to
> >> work as before, and it would print out a message telling you to do so if
> >> you hadn't already.
> > 
> > Alright, that was done after it was pretty clear the driver was
> > useless and I stopped looking at that part of the series. I don't know
> > why we made Luis jump through such hoops instead of just deleting it
> > right then and there.
> > 
> >>> For amso1100, kernel ULPs has never been its target.  Didn't we only
> >>> recently got any support for iWARP in iSER?
> > 
> > I don't like that reasoning. In 4.4 we expect some ULPs to work on
> > iWarp, and amso110 can't do it. That is a big reason why the driver
> > is getting chopped.
> > 
> >>> The reason I lobbied to get rid of them is specifically because they
> >>> don't work and maintaining them (ie the driver and the single-use ULP
> >>> codepath side) is a huge pain. Keeping them around and keeping them
> >>> compiling defeats the entire point. Just delete them, we don't need to
> >>> wait for 4.6.
> >>
> >> That's not true.  User space continues to work, and amso1100 shouldn't
> >> be greatly negatively impacted by recent changes.  Nor should ipath.
> > 
> > So what if user space works? The kernel consumers are known broken.
> 
> No, one kernel consumer that never worked on iWARP before now works on a
> different iWARP controller but doesn't work on the old iWARP controller.
>  Hardly the end of the world.
> 
> > We've commited to removing the driver from the kernel. We have support
> > of the driver authors to do this. We have found no users or
> > testers. We've committed to removing the ULP support (ie MR-only code
> > is being ripped out).
> > 
> > *WHY* spend an ounce of time fixing up code that *NO-ONE* will ever
> > even run? Wasted effort. Just delete them now.
> 
> You don't just "delete them now" because it denies anyone advanced
> warning of the change and denies them the chance to speak up in
> opposition to the driver's removal.  We queried the kernel community and
> found no one using it.  There are a lot of users out there that don't
> involve themselves in the kernel community at all.  For those users,
> there is a normal push back mechanism that involves the change trickling
> down into distros and then people complaining if they don't want the
> drivers removed.  We could speed that push back cycle up if the various
> distros proactively queried their customer's hardware usage.  But
> without either a query activity or a push back cycle, your assertion
> that "*NO-ONE* will ever even run?" is mere assumption not a statement
> of fact.

Moving things into the staging directory for 2 releases usually doesn't
bring out these types of users, they only show up every few years when
their distro kernel suddenly stops working on their hardware.

This too was something we discussed at the kernel summit, it's
recommended that you just delete the drivers from where they are now, no
need to move them to staging first.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Greg Kroah-Hartman
On Mon, Nov 02, 2015 at 10:14:06PM -0500, Doug Ledford wrote:
> On 11/02/2015 07:49 PM, Greg Kroah-Hartman wrote:
> > On Mon, Nov 02, 2015 at 07:02:04PM -0500, Doug Ledford wrote:
> >>> so overall it still benifits being in the
> >>> staging tree, so a few minor breakages every once in a while should be
> >>> easy for you to fix up, _if_ they happen.
> >>>
> >>> Again, I don't know of any recent api change that has caused any
> >>> problems in a long time, but the VFS developers really hate having to
> >>> touch lustre code, and I don't blame them, so sometimes that codebase
> >>> breaks.  That will not affect your drivers at all, so I wouldn't worry
> >>> about this.
> >>
> >> Yes.  I get that.  And I understand that.  But because of Lustre, you
> >> have made a global policy that effects all staging drivers without
> >> legitimate cause, proudly broadcast that policy at a conference, even
> >> answered a question from Christoph confirming that policy, and now in
> >> this email thread where I object to that policy you say "It doesn't
> >> really happen anyway, don't worry about it.'
> > 
> > No, it's not "because" of lustre, it's been that way since day one when
> > we started the staging tree.  Christoph was just annoyed that someone
> > was trying to tell him otherwise.  And he is right.
> 
> No, he's not.  As I've already pointed out, the policy is only
> appropriate for neglected drivers.

No, it's appropriate for all of drivers/staging/ and was what I ensured
the rest of the kernel community when I created it many years ago.

> I don't care if the policy was applied to everything in staging,
> whether neglected or not, from day one.  It was evidently bad from the
> beginning.  I'm calling it out.  I've even provided justification for
> calling it out.

And I disagree, sorry.  And as maintainer of this portion of the kernel,
I don't want to go back on my word to the rest of the kernel community,
that would be rude.

> So far, your only response has been "That's the way it's always been,
> and that's the way it is", which does nothing to justify the status quo
> but stands only to argue that it is the status quo and therefore should
> remain such.  Inertia is a poor argument.

My argument is, "No kernel developer should care about what is in
drivers/staging/ as it is obviously code that is not in a mergable
state, otherwise it would not be in this location."  Now yes, we have
started to use it to delete drivers, but that's now not really a good
idea (see my other email about that.)

If your code was in mergable state, then please take it out of staging
now, but from what I have seen, that is not the case, so it needs to
live here until it can get cleaned up properly.

> >> But the person who quoted
> >> you (Christoph) is the author of one of the API changes that *did* break
> >> the RDMA drivers in the staging tree and he fixed them up when I asked
> >> him to.  Christoph then later quoted you and his interaction with you at
> >> conference to indicate he didn't have to.
> > 
> > He didn't have to.  He was being nice.
> 
> Again, I disagree.

Fair enough, but that doesn't change the fact that he had no requirement
to make this change.

> >  That's your job to fix them up
> > if you want the code in staging.
> 
> As I pointed out in an earlier email, allowing deprecated drivers to
> simply break defeats the whole purpose of deprecating them in the first
> place.

Then we should just delete them from the tree, that makes things simpler
for everyone involved.

> >> And what I'm telling him, and
> >> you since you are the person he's quoting to justify not fixing up
> >> things, is that I don't care what you say, when it comes to those
> >> drivers that I moved into staging, if he wants me to take his core
> >> patches, then he needs to make sure they don't break those drivers I
> >> moved into staging.
> > 
> > Nope, not true.  If you don't like this, I'll gladly just delete the
> > drivers from staging, sorry.
> 
> Maybe you should be more clear about your intention here.  Under
> precisely which actions of mine will you resort to retaliatory deletion
> of code, and which code?

If you insist to other kernel developers that they have to fix up code
in drivers/staging/ I will not accept that, and will ask for that code
to be removed as that is not how this portion of the kernel tree works,
sorry.

> >  You don't get a "free ride" in staging at
> > all.
> 
> I'm not sure what you are calling a "free ride".  Certainly, if what I
> asked Christoph for is a "free ride", then drivers in the core kernel
> get "free rides" all the time.  As such, I didn't ask for anything
> unusual or out of the ordinary.  Because of the reasons I stated in my
> previous email, I asked for the drivers I moved to staging to be treated
> the same as any other driver would be.  Equal treatment is hardly a
> "free ride".

drivers/staging is not "equal" to the rest of the kernel, I think that
is the misunderstanding here.

hope 

Re: shrink struct ib_send_wr V4

2015-11-02 Thread Jason Gunthorpe
On Mon, Nov 02, 2015 at 09:03:35PM -0500, Doug Ledford wrote:

> No, one kernel consumer that never worked on iWARP before now works on a
> different iWARP controller but doesn't work on the old iWARP controller.
>  Hardly the end of the world.

NFS is gone/going as well, and that used to work. No iWarp enabled ULP
(now or in future) will work without at least FMRs..

I just disagree we should continue to hang on to unused hardware that
isn't even capable of running the whole stack we are maintaining. That
makes no sense to me. If it doesn't run our full stack then ditch it.

> > *WHY* spend an ounce of time fixing up code that *NO-ONE* will ever
> > even run? Wasted effort. Just delete them now.
> 
> You don't just "delete them now" because it denies anyone advanced
> warning of the change and denies them the chance to speak up in
> opposition to the driver's removal.

We've already gone through a cycle, 'now' is a fine time to do it.

You are talking like deletion is irreversible. It isn't. If someone
comes forward with a need and a desire to test that it still works
then reverting a delete shouldn't be that difficult.

On the other hand, the developer time wasted keeping it working while
we wait is not recoverable.

> distros proactively queried their customer's hardware usage.  But
> without either a query activity or a push back cycle, your assertion
> that "*NO-ONE* will ever even run?" is mere assumption not a statement
> of fact.

No, it is a statement based on my knowledge of the history of these
cards and our user base, combined with the responses from the vendors
who do have a better sense.

Ammasso went out of buisness before the driver was even in mainline.
Tom got it in mainline after the fact as a early way to work on
generic iWarp support. There were few users (<<100) and similarly few
cards. As far as I'm concerned all users got the 'obsolete, do not
use' message 10 years ago.

ipath only supports the HTX cards, which where a short run engineering
experiment that were not market successful (10 years ago), few were
made. Those cards are unsuable in modern hardware. The vendor
announced they were no longer supported when the qib driver was made,
as far as I'm concerned all users got the 'obsolete, do no use'
message 5 years ago.

ehca was part of IBM's high end systems. A >= 4.4 kernel isn't going
to be qualified or tested on that hardware. Nobody would run
production like that. I actually bet some are still running
somewhere..

Seriously, there are zero production users, it isn't even a question.

If a hobby user appears, then they can restore the driver and fix it
up, and explain why we should carry it when it doesn't run the full
stack.

> I suspect you are right and we *could* delete them now without a
> problem.  But without that push back cycle or a query to find out for
> sure, you're asking me to take it on faith instead of doing due
> diligence, and I don't have any intention of being called out later for
> failing to do due diligence.

Nobody is going to call you out for removing a driver, especially when
the *vendor* said it was OK.

On the other hand, people have been sending unhappy messages with
the current arrangement...

kernel.org isn't a distro - we can flip flop ...

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-02 Thread Doug Ledford
On 11/02/2015 07:49 PM, Greg Kroah-Hartman wrote:
> On Mon, Nov 02, 2015 at 07:02:04PM -0500, Doug Ledford wrote:
>>> so overall it still benifits being in the
>>> staging tree, so a few minor breakages every once in a while should be
>>> easy for you to fix up, _if_ they happen.
>>>
>>> Again, I don't know of any recent api change that has caused any
>>> problems in a long time, but the VFS developers really hate having to
>>> touch lustre code, and I don't blame them, so sometimes that codebase
>>> breaks.  That will not affect your drivers at all, so I wouldn't worry
>>> about this.
>>
>> Yes.  I get that.  And I understand that.  But because of Lustre, you
>> have made a global policy that effects all staging drivers without
>> legitimate cause, proudly broadcast that policy at a conference, even
>> answered a question from Christoph confirming that policy, and now in
>> this email thread where I object to that policy you say "It doesn't
>> really happen anyway, don't worry about it.'
> 
> No, it's not "because" of lustre, it's been that way since day one when
> we started the staging tree.  Christoph was just annoyed that someone
> was trying to tell him otherwise.  And he is right.

No, he's not.  As I've already pointed out, the policy is only
appropriate for neglected drivers.  I don't care if the policy was
applied to everything in staging, whether neglected or not, from day
one.  It was evidently bad from the beginning.  I'm calling it out.
I've even provided justification for calling it out.

So far, your only response has been "That's the way it's always been,
and that's the way it is", which does nothing to justify the status quo
but stands only to argue that it is the status quo and therefore should
remain such.  Inertia is a poor argument.

>> But the person who quoted
>> you (Christoph) is the author of one of the API changes that *did* break
>> the RDMA drivers in the staging tree and he fixed them up when I asked
>> him to.  Christoph then later quoted you and his interaction with you at
>> conference to indicate he didn't have to.
> 
> He didn't have to.  He was being nice.

Again, I disagree.

>  That's your job to fix them up
> if you want the code in staging.

As I pointed out in an earlier email, allowing deprecated drivers to
simply break defeats the whole purpose of deprecating them in the first
place.  Moving the burden of keeping them running from the person doing
a core API change to the maintainer just because they are now deprecated
and in staging makes no sense.  That just serves to overload the
maintainer.  And it's not like I want to work on those drivers any more
than anyone else.  I'm simply trying to follow a process that allows for
an orderly removal with some sort of ability for end users to have a
chance to push back if they need to.  I'm having a hard time
understanding why this deprecation procedure should be so maintainer
responsibility heavy.  Regardless though, I'm almost certain not to
follow this specific deprecation process again in the future because of
how you are suggesting it should be handled.

>> And what I'm telling him, and
>> you since you are the person he's quoting to justify not fixing up
>> things, is that I don't care what you say, when it comes to those
>> drivers that I moved into staging, if he wants me to take his core
>> patches, then he needs to make sure they don't break those drivers I
>> moved into staging.
> 
> Nope, not true.  If you don't like this, I'll gladly just delete the
> drivers from staging, sorry.

Maybe you should be more clear about your intention here.  Under
precisely which actions of mine will you resort to retaliatory deletion
of code, and which code?

>  You don't get a "free ride" in staging at
> all.

I'm not sure what you are calling a "free ride".  Certainly, if what I
asked Christoph for is a "free ride", then drivers in the core kernel
get "free rides" all the time.  As such, I didn't ask for anything
unusual or out of the ordinary.  Because of the reasons I stated in my
previous email, I asked for the drivers I moved to staging to be treated
the same as any other driver would be.  Equal treatment is hardly a
"free ride".

-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V4

2015-11-01 Thread Or Gerlitz

On 10/29/2015 1:51 PM, Christoph Hellwig wrote:

On Wed, Oct 28, 2015 at 10:57:59PM -0400, Doug Ledford wrote:

> >I had to do a minor hand merge to get this to apply, but it has been
> >pulled in for 4.4.

>
>This breaks all of the drivers in staging BTW.  That will need fixed up
>before the pull request goes in during the merge window.

That latest branch on infradead.org should have fixed all the staging
drivers.  But Linus just clarified that we indeed do not have to care for
staging drivers, I asked him at kernel summit in front of the whole audience.


Can you and/or Greg clarify this a little further... when someone 
modified an API called by others
drivers,  they don't have to deal with staging drivers? that is, the 
folks that care for this staging code

need to handle that?


Or.


If you were not merging the latest branch you might also have to pick up
the mlx5 fix separately, Eli or Sagi might be able to help with that as
I'll be offline for a few days.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-11-01 Thread Greg Kroah-Hartman
On Sun, Nov 01, 2015 at 02:10:48PM +0200, Or Gerlitz wrote:
> On 10/29/2015 1:51 PM, Christoph Hellwig wrote:
> >On Wed, Oct 28, 2015 at 10:57:59PM -0400, Doug Ledford wrote:
>  >I had to do a minor hand merge to get this to apply, but it has been
>  >pulled in for 4.4.
> >>>
> >>>This breaks all of the drivers in staging BTW.  That will need fixed up
> >>>before the pull request goes in during the merge window.
> >That latest branch on infradead.org should have fixed all the staging
> >drivers.  But Linus just clarified that we indeed do not have to care for
> >staging drivers, I asked him at kernel summit in front of the whole audience.
> 
> Can you and/or Greg clarify this a little further... when someone modified
> an API called by others
> drivers,  they don't have to deal with staging drivers?

Yes, that is correct.

> that is, the folks that care for this staging code need to handle
> that?

Yes, that is correct.

It's up to the owners of the staging drivers to do the work, we do not
make it the responsibility of anyone else.  Now most of the time people
are nice and fix up everything else, but it's not a requirement at all.

hope this helps,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-10-29 Thread Christoph Hellwig
On Wed, Oct 28, 2015 at 10:57:59PM -0400, Doug Ledford wrote:
> > I had to do a minor hand merge to get this to apply, but it has been
> > pulled in for 4.4.
> 
> This breaks all of the drivers in staging BTW.  That will need fixed up
> before the pull request goes in during the merge window.

That latest branch on infradead.org should have fixed all the staging
drivers.  But Linus just clarified that we indeed do not have to care for
staging drivers, I asked him at kernel summit in front of the whole audience.

If you were not merging the latest branch you might also have to pick up
the mlx5 fix separately, Eli or Sagi might be able to help with that as
I'll be offline for a few days.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-10-29 Thread Doug Ledford
On 10/29/2015 07:51 AM, Christoph Hellwig wrote:
> On Wed, Oct 28, 2015 at 10:57:59PM -0400, Doug Ledford wrote:
>>> I had to do a minor hand merge to get this to apply, but it has been
>>> pulled in for 4.4.
>>
>> This breaks all of the drivers in staging BTW.  That will need fixed up
>> before the pull request goes in during the merge window.
> 
> That latest branch on infradead.org should have fixed all the staging
> drivers.

You're right, it did.  It was my mistake thinking it was your patchset
that caused the problem (that's what I get for working on very little
sleep).  You're patchset was pulled in as a preparatory requirement for
the fast-reg patchset from Sagi.  It was the fast-reg patchset that
skips the staging drivers.  I had to remove the top patch from that
patchset (the one that finally removed the old core API) in order to get
my kernel to compile again).  Sorry about the confusion.


-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V4

2015-10-28 Thread Doug Ledford
On 10/28/2015 10:25 PM, Doug Ledford wrote:
> On 09/13/2015 11:13 AM, Christoph Hellwig wrote:
>> This series shrinks the WR size by splitting out the different WR
>> types.
>>
>> Patch number one is too large for the mailinglist, so if you didn't
>> get it grab it here:
>>
>>  
>> http://git.infradead.org/users/hch/rdma.git/commitdiff_plain/c752d80937ff2d71f25ae7fcdd1a151054fb2ceb
>>
>> or the full git tree at:
>>
>>  git://git.infradead.org/users/hch/rdma.git wr-cleanup
> 
> I had to do a minor hand merge to get this to apply, but it has been
> pulled in for 4.4.

This breaks all of the drivers in staging BTW.  That will need fixed up
before the pull request goes in during the merge window.



-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V4

2015-10-28 Thread Doug Ledford
On 09/13/2015 11:13 AM, Christoph Hellwig wrote:
> This series shrinks the WR size by splitting out the different WR
> types.
> 
> Patch number one is too large for the mailinglist, so if you didn't
> get it grab it here:
> 
>   
> http://git.infradead.org/users/hch/rdma.git/commitdiff_plain/c752d80937ff2d71f25ae7fcdd1a151054fb2ceb
> 
> or the full git tree at:
> 
>   git://git.infradead.org/users/hch/rdma.git wr-cleanup

I had to do a minor hand merge to get this to apply, but it has been
pulled in for 4.4.

> 
> Now that we have 4.3-rc1 out it might be a good idea to get the rdma tree
> for 4.4 started with it.  Both I and others have additional large patches
> pending that build ontop of this.
> 
> Changes since V3:
>   - dropped the old first patch as it has been merged
>   - rebase to 4.3-rc1
> 
> Changes since V2:
>  - fixed patch one to accept SEND - note that this was alredy fixed by
>patch 2
>  - added a CC: stable for patch 1
>  - added additional review tags
> 
> Changes since V1:
>  - new patch 1 which rejects invalid opcodes.  Patch 2 was doing
>this implicitly except for UD QPs, but I think we should a)
>do this explicitly and b) ensure this goes into 4.2 and -stable
>as I can see quite a lot of harm from submitting such malformed
>operations
>  - patch 2 now covers all drivers including those in staging to
>side step any sort of discussions on the staging tree.
>  - patch 2 now explicitly replaces the weird overloading in the mlx5
>driver with an explicit embedding of struct ib_send_wr, similar
>to what we do for all other MRs.
>  - new patch to drop another unused send_wr field.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V4

2015-10-11 Thread Christoph Hellwig
On Sun, Sep 13, 2015 at 05:13:33PM +0200, Christoph Hellwig wrote:
> This series shrinks the WR size by splitting out the different WR
> types.
> 
> Patch number one is too large for the mailinglist, so if you didn't
> get it grab it here:
> 
>   
> http://git.infradead.org/users/hch/rdma.git/commitdiff_plain/c752d80937ff2d71f25ae7fcdd1a151054fb2ceb
> 
> or the full git tree at:
> 
>   git://git.infradead.org/users/hch/rdma.git wr-cleanup

This git tree has been updated with the latests ACKs and has been
rebased to the latest rdma tree which introduced a few trivial rejects.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V4

2015-09-29 Thread Haggai Eran
On 13/09/2015 18:13, Christoph Hellwig wrote:
> This series shrinks the WR size by splitting out the different WR
> types.
> 
> Patch number one is too large for the mailinglist, so if you didn't
> get it grab it here:
> 
>   
> http://git.infradead.org/users/hch/rdma.git/commitdiff_plain/c752d80937ff2d71f25ae7fcdd1a151054fb2ceb
> 
> or the full git tree at:
> 
>   git://git.infradead.org/users/hch/rdma.git wr-cleanup
> 

Hi,

I've tested the current wr-cleanup branch with a Connect-IB device to
make sure the changes to the mlx5-internal UMR work requests are fine
(at least when invoked from user space). To test I also had to add
Sagi's patches to remove the reserved lkey, but with those patches it works.

Regards,
Haggai

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V3

2015-09-03 Thread Doug Ledford
On 08/31/2015 08:24 PM, Doug Ledford wrote:
> On 08/31/2015 12:11 PM, Christoph Hellwig wrote:
>> On Sun, Aug 30, 2015 at 06:31:35PM +0300, Sagi Grimberg wrote:
   - patch 2 now explicitly replaces the weird overloading in the mlx5
 driver with an explicit embedding of struct ib_send_wr, similar
 to what we do for all other MRs.
>>>
>>> That's nice,
>>>
>>> There is one non-trivial spot that was missed in mlx5_ib_post_send
>>> though:
>>
>> Oh, that was a weird abuse of the old casts.
>>
>> I've foled both your fixes and force pushed to the wr-cleanup branch.
>>
>> I do not plan to resend the series until the merge window for 4.4
>> is open.  Doug, any chance you could pick up the first patch in the
>> series for 4.3-rc?  It's marked for stable as well.
> 
> Yes, I can do that.
> 

I've applied 1 of 3.  For the remaining two, I plan to pull them into a
test kernel and run some internal tests before committing.  Just FYI.

-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V3

2015-08-31 Thread Christoph Hellwig
On Sun, Aug 30, 2015 at 06:31:35PM +0300, Sagi Grimberg wrote:
>>   - patch 2 now explicitly replaces the weird overloading in the mlx5
>> driver with an explicit embedding of struct ib_send_wr, similar
>> to what we do for all other MRs.
>
> That's nice,
>
> There is one non-trivial spot that was missed in mlx5_ib_post_send
> though:

Oh, that was a weird abuse of the old casts.

I've foled both your fixes and force pushed to the wr-cleanup branch.

I do not plan to resend the series until the merge window for 4.4
is open.  Doug, any chance you could pick up the first patch in the
series for 4.3-rc?  It's marked for stable as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V3

2015-08-31 Thread Doug Ledford
On 08/31/2015 12:11 PM, Christoph Hellwig wrote:
> On Sun, Aug 30, 2015 at 06:31:35PM +0300, Sagi Grimberg wrote:
>>>   - patch 2 now explicitly replaces the weird overloading in the mlx5
>>> driver with an explicit embedding of struct ib_send_wr, similar
>>> to what we do for all other MRs.
>>
>> That's nice,
>>
>> There is one non-trivial spot that was missed in mlx5_ib_post_send
>> though:
> 
> Oh, that was a weird abuse of the old casts.
> 
> I've foled both your fixes and force pushed to the wr-cleanup branch.
> 
> I do not plan to resend the series until the merge window for 4.4
> is open.  Doug, any chance you could pick up the first patch in the
> series for 4.3-rc?  It's marked for stable as well.

Yes, I can do that.

-- 
Doug Ledford 
  GPG KeyID: 0E572FDD




signature.asc
Description: OpenPGP digital signature


Re: shrink struct ib_send_wr V3

2015-08-30 Thread Sagi Grimberg

On 8/30/2015 6:31 PM, Sagi Grimberg wrote:

  - patch 2 now explicitly replaces the weird overloading in the mlx5
driver with an explicit embedding of struct ib_send_wr, similar
to what we do for all other MRs.


That's nice,

There is one non-trivial spot that was missed in mlx5_ib_post_send
though:

diff --git a/drivers/infiniband/hw/mlx5/qp.c
b/drivers/infiniband/hw/mlx5/qp.c
index 7ddfb74..35a18d6 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -2802,7 +2802,7 @@ int mlx5_ib_post_send(struct ib_qp *ibqp, struct
ib_send_wr *wr,
 goto out;
 }
 qp-sq.wr_data[idx] = MLX5_IB_WR_UMR;
-   ctrl-imm = cpu_to_be32(fast_reg_wr(wr)-rkey);
+   ctrl-imm = cpu_to_be32(umr_wr(wr)-mkey);
 set_reg_umr_segment(seg, wr);
 seg += sizeof(struct mlx5_wqe_umr_ctrl_seg);
 size += sizeof(struct mlx5_wqe_umr_ctrl_seg) / 16;

Care to fold this in?



There is another problem I'm seeing in this patch. in reg_umr the
local variable is a type ib_send_wr which is passed to
prep_umr_reg_wqe() which casts it to mlx5_umr_wr (container_of). It
should have probably been mlx5_umr_wr to begin with...

I think this needs to be folded in as well:
diff --git a/drivers/infiniband/hw/mlx5/mr.c 
b/drivers/infiniband/hw/mlx5/mr.c

index 9fac2c1..6f8f3d8 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -752,7 +752,8 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, 
struct ib_umem *umem,

struct device *ddev = dev-ib_dev.dma_device;
struct umr_common *umrc = dev-umrc;
struct mlx5_ib_umr_context umr_context;
-   struct ib_send_wr wr, *bad;
+   struct mlx5_umr_wr umrwr;
+   struct ib_send_wr *bad;
struct mlx5_ib_mr *mr;
struct ib_sge sg;
int size;
@@ -798,14 +799,14 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd 
*pd, struct ib_umem *umem,

goto free_pas;
}

-   memset(wr, 0, sizeof(wr));
-   wr.wr_id = (u64)(unsigned long)umr_context;
-   prep_umr_reg_wqe(pd, wr, sg, dma, npages, mr-mmr.key, page_shift,
-virt_addr, len, access_flags);
+   memset(umrwr, 0, sizeof(umrwr));
+   umrwr.wr.wr_id = (u64)(unsigned long)umr_context;
+   prep_umr_reg_wqe(pd, umrwr.wr, sg, dma, npages, mr-mmr.key,
+page_shift, virt_addr, len, access_flags);

mlx5_ib_init_umr_context(umr_context);
down(umrc-sem);
-   err = ib_post_send(umrc-qp, wr, bad);
+   err = ib_post_send(umrc-qp, umrwr.wr, bad);
if (err) {
mlx5_ib_warn(dev, post send failed, err %d\n, err);
goto unmap_dma;
@@ -1134,16 +1135,17 @@ static int unreg_umr(struct mlx5_ib_dev *dev, 
struct mlx5_ib_mr *mr)

 {
struct umr_common *umrc = dev-umrc;
struct mlx5_ib_umr_context umr_context;
-   struct ib_send_wr wr, *bad;
+   struct mlx5_umr_wr umrwr;
+   struct ib_send_wr *bad;
int err;

-   memset(wr, 0, sizeof(wr));
-   wr.wr_id = (u64)(unsigned long)umr_context;
-   prep_umr_unreg_wqe(dev, wr, mr-mmr.key);
+   memset(umrwr, 0, sizeof(umrwr));
+   umrwr.wr.wr_id = (u64)(unsigned long)umr_context;
+   prep_umr_unreg_wqe(dev, umrwr.wr, mr-mmr.key);

mlx5_ib_init_umr_context(umr_context);
down(umrc-sem);
-   err = ib_post_send(umrc-qp, wr, bad);
+   err = ib_post_send(umrc-qp, umrwr.wr, bad);
if (err) {
up(umrc-sem);
mlx5_ib_dbg(dev, err %d\n, err);
--
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr V3

2015-08-30 Thread Sagi Grimberg

  - patch 2 now explicitly replaces the weird overloading in the mlx5
driver with an explicit embedding of struct ib_send_wr, similar
to what we do for all other MRs.


That's nice,

There is one non-trivial spot that was missed in mlx5_ib_post_send
though:

diff --git a/drivers/infiniband/hw/mlx5/qp.c 
b/drivers/infiniband/hw/mlx5/qp.c

index 7ddfb74..35a18d6 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -2802,7 +2802,7 @@ int mlx5_ib_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *wr,

goto out;
}
qp-sq.wr_data[idx] = MLX5_IB_WR_UMR;
-   ctrl-imm = cpu_to_be32(fast_reg_wr(wr)-rkey);
+   ctrl-imm = cpu_to_be32(umr_wr(wr)-mkey);
set_reg_umr_segment(seg, wr);
seg += sizeof(struct mlx5_wqe_umr_ctrl_seg);
size += sizeof(struct mlx5_wqe_umr_ctrl_seg) / 16;

Care to fold this in?

Cheers,
Sagi.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


shrink struct ib_send_wr V3

2015-08-26 Thread Christoph Hellwig
This series shrinks the WR size by splitting out the different WR
types.

Patch number two is too large for the mailinglist, so if you didn't
get it grab it here:


http://git.infradead.org/users/hch/rdma.git/commitdiff/2b63f958de7bd630aba85caf65986831d4372869

or the full git tree at:

git://git.infradead.org/users/hch/rdma.git wr-cleanup

Changes since V2:

 - fixed patch one to accept SEND - note that this was alredy fixed by
   patch 2
 - added a CC: stable for patch 1
 - added additional review tags

Changes since the version sent around a couple of times:

 - new patch 1 which rejects invalid opcodes.  Patch 2 was doing
   this implicitly except for UD QPs, but I think we should a)
   do this explicitly and b) ensure this goes into 4.2 and -stable
   as I can see quite a lot of harm from submitting such malformed
   operations
 - patch 2 now covers all drivers including those in staging to
   side step any sort of discussions on the staging tree.
 - patch 2 now explicitly replaces the weird overloading in the mlx5
   driver with an explicit embedding of struct ib_send_wr, similar
   to what we do for all other MRs.
 - new patch to drop another unused send_wr field.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: shrink struct ib_send_wr

2015-08-20 Thread Sagi Grimberg

  - patch 2 now explicitly replaces the weird overloading in the mlx5
driver with an explicit embedding of struct ib_send_wr, similar
to what we do for all other MRs.


This is on the user-space memory registration path.

Haggai, can you grab it for a Tested-by tag?
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


shrink struct ib_send_wr

2015-08-19 Thread Christoph Hellwig
This series shrinks the WR size by splitting out the different WR
types.

Patch number two is too large for the mailinglist, so if you didn't
get it grab it here:


http://git.infradead.org/users/hch/rdma.git/commitdiff/30e522ee6c1d7adb614d7308f09fbfd71c6d3e07

or the full git tree at:

git://git.infradead.org/users/hch/rdma.git wr-cleanup

Changes since the version sent around a couple of times:

 - new patch 1 which rejects invalid opcodes.  Patch 2 was doing
   this implicitly except for UD QPs, but I think we should a)
   do this explicitly and b) ensure this goes into 4.2 and -stable
   as I can see quite a lot of harm from submitting such malformed
   operations
 - patch 2 now covers all drivers including those in staging to
   side step any sort of discussions on the staging tree.
 - patch 2 now explicitly replaces the weird overloading in the mlx5
   driver with an explicit embedding of struct ib_send_wr, similar
   to what we do for all other MRs.
 - new patch to drop another unused send_wr field.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html