Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-05 Thread Michael S. Tsirkin
> Quoting Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] [PATCH] RE:  regression in ofed 1.2
> 
> >>The name is "ib_mcast_wq" which is too long for older kernels.
> >>
> >>Did we lose a backport patch?
> > 
> > 
> > Not sure what happened here.
> > Sean, could you rename ib_mcast_wq to ib_mcast please?
> 
> I renamed the workqueue for what I requested to pull upstream, and I added a 
> patch to my pull request to rename a couple of other workqueues.
> 
> Didn't you already apply a rename patch to the ofed code?

Yes, but I assumed it was in your branch, so I threw it out when I took your
latest code.

-- 
MST

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-05 Thread Sean Hefty
>>The name is "ib_mcast_wq" which is too long for older kernels.
>>
>>Did we lose a backport patch?
> 
> 
> Not sure what happened here.
> Sean, could you rename ib_mcast_wq to ib_mcast please?

I renamed the workqueue for what I requested to pull upstream, and I added a 
patch to my pull request to rename a couple of other workqueues.

Didn't you already apply a rename patch to the ofed code?

- Sean




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-04 Thread Michael S. Tsirkin
> Quoting Steve Wise <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] RE:  regression in ofed 1.2
> 
> Um, now on rhel4u4 we crash creating the mcast workqueue.
> 
> The name is "ib_mcast_wq" which is too long for older kernels.
> 
> Did we lose a backport patch?

Not sure what happened here.
Sean, could you rename ib_mcast_wq to ib_mcast please?


-- 
MST




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Michael S. Tsirkin
> Quoting Steve Wise <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] RE:  regression in ofed 1.2
> 
> Um, now on rhel4u4 we crash creating the mcast workqueue.
> 
> The name is "ib_mcast_wq" which is too long for older kernels.
> 
> Did we lose a backport patch?

Sean, please rename the multicast wq to ib_mcast as we agreed.

I just pushed the following out:

commit efedfe57a21a134a65d951bcca73af46da609c5e
Author: Michael S. Tsirkin <[EMAIL PROTECTED]>
Date:   Thu Feb 1 21:09:16 2007 +0200

Make multicast WQ name shorter.

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>

diff --git a/kernel_patches/fixes/merged_sean_rdma_dev_ofed_1_2.patch b/kernel_patches/fixes/merged_sean_rdma_dev_ofed_1_2.patch
index e70d4da..4b968db 100644
--- a/kernel_patches/fixes/merged_sean_rdma_dev_ofed_1_2.patch
+++ b/kernel_patches/fixes/merged_sean_rdma_dev_ofed_1_2.patch
@@ -2225,7 +2225,7 @@ index 000..039f1eb
 +{
 +  int ret;
 +
-+  mcast_wq = create_singlethread_workqueue("ib_mcast_wq");
++  mcast_wq = create_singlethread_workqueue("ib_mcast");
 +  if (!mcast_wq)
 +  return -ENOMEM;
 +

-- 
MST




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Michael S. Tsirkin
> Quoting Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] [PATCH] RE:  regression in ofed 1.2
> 
> > - Sean, please base your branches on specific -rc from linus
> >   (OFED 1.2 is now -rc7).
> 
> My branches should be in sync with rc6.

If you check, they are not. The ofed_1_2 branch has an extra
commit on top of -rc6. But I figured it out already.

> so that they get completely rebuilt off of the latest kernel?

No need to do anything at this point.

> > - Now that we are entering feature freeze, we should not do full replaces
> >   anymore.
> >   So Sean, please post incremental patches, labeled ofed-1.2 clearly.
> 
> Additional patches will be posted to my ofed_1_2 branch, which you should be
> able to pull.

First, please post patches on list as well.
We can then just take the patch from git or from mail and add it under fixes.

> Do you see a problem with this process?

Yes. I had to jump through some hoops: first to get a patch I could put in OFED,
due to the issue outlined above, and then to get the resulting diff to apply
without conflicts, since the port randomization code conflicted with the QoS
patches. All solved now - just put your patch before the QoS one - but these
conflicts should be figured out by whoever submits the patches.

> I don't understand why you would need to do a full replace.

We won't do a full replace, just add patches in fixes directory.

What I do expect everyone to do, however, to get patches into OFED,
is to test that the patches they post work in the OFED git tree, not just
against upstream-based git trees.

This currently includes build-testing against older kernels on various
architectures (Vlad and I put a cross-build setup for this at staging;
it now has kernel.org kernels, and we will be adding distro kernels)
and testing on at least one of the main supported enterprise distros
(RHEL/SLES).

I simply can't take untested patches - I have nightly tests but no time to test
all ULPs before I apply.

-- 
MST




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Steve Wise
Um, now on rhel4u4 we crash creating the mcast workqueue.

The name is "ib_mcast_wq" which is too long for older kernels.

Did we lose a backport patch?
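
For anyone following along, here is a minimal user-space sketch of the kind of
length check that bites us here. It assumes the older kernels reject workqueue
names longer than 10 characters; that value and the names WQ_NAME_MAX_LEN and
wq_name_fits() are assumptions made up for illustration, not kernel API.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed limit for illustration only: older 2.6 kernels are taken to
 * reject workqueue names longer than this many characters. */
#define WQ_NAME_MAX_LEN 10

/* Hypothetical helper: returns true if the name fits the assumed limit. */
bool wq_name_fits(const char *name)
{
    return strlen(name) <= WQ_NAME_MAX_LEN;
}
```

Under that assumption, "ib_mcast_wq" (11 characters) fails the check while
"ib_mcast" (8 characters) passes, which is why a shorter name avoids the crash.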


On Thu, 2007-02-01 at 15:55 +0200, Michael S. Tsirkin wrote:
> > Quoting Steve Wise <[EMAIL PROTECTED]>:
> > Subject: Re: [PATCH] RE:  regression in ofed 1.2
> > 
> > > Okay - I _think_ the problem is that OFED 1.2 pulled code from my git
> > > tree before I created an ofed_1_2 branch (which contains the fix), and
> > > didn't update to match my ofed_1_2 branch.  The crash that you reported
> > > occurring over iWarp should also happen over IB for the same reason, so
> > > both are likely broken atm...
> > > 
> > > Vlad, can you please update the ofed build by pulling from the ofed_1_2
> > > branches of my rdma-dev.git and librdmacm.git trees?
> > 
> > I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
> > you made there will resolve this issue.  It just needs to be pulled into
> > ofed_1_2.
> 
> OK, I've updated ofed to code from rdma-dev ofed_1_2 branch. Some notes: 
> 
> - Sean, please base your branches on specific -rc from linus
>   (OFED 1.2 is now -rc7).
> - Now that we are entering feature freeze, we should not do full replaces
>   anymore.
>   So Sean, please post incremental patches, labeled ofed-1.2 clearly.
> 





Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Sean Hefty
> - Sean, please base your branches on specific -rc from linus
>   (OFED 1.2 is now -rc7).

My branches should be in sync with rc6.  The original branches were built from 
an earlier rc version, and updated by pulling in the latest rc from Linus 
through my master branch.  Are you wanting the history of the branches reworked 
so that they get completely rebuilt off of the latest kernel?

> - Now that we are entering feature freeze, we should not do full replaces
>   anymore.
>   So Sean, please post incremental patches, labeled ofed-1.2 clearly.

Additional patches will be posted to my ofed_1_2 branch, which you should be 
able to pull.  Do you see a problem with this process?  I don't understand why 
you would need to do a full replace.

- Sean




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Michael S. Tsirkin
> Quoting Steve Wise <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] RE:  regression in ofed 1.2
> 
> On Thu, 2007-02-01 at 03:37 -0600, Steve Wise wrote:
> > > Okay - I _think_ the problem is that OFED 1.2 pulled code from my git
> > > tree before I created an ofed_1_2 branch (which contains the fix), and
> > > didn't update to match my ofed_1_2 branch.  The crash that you reported
> > > occurring over iWarp should also happen over IB for the same reason, so
> > > both are likely broken atm...
> > > 
> > > Vlad, can you please update the ofed build by pulling from the ofed_1_2
> > > branches of my rdma-dev.git and librdmacm.git trees?
> > 
> > I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
> > you made there will resolve this issue.  It just needs to be pulled into
> > ofed_1_2.
> > 
> 
> Also, I just pulled down and built the latest ofed_1_2 kernel and user
> code against 2.6.20-rc7, and the ucma abi is 4.  So rdma_create_qp()
> will still crash even with the librdmacm code to avoid the call to
> rdma_init_qp_attr for ABI 3 kernels.
> 
> 
> Steve.

I'm a bit confused. Can you please try with latest code I've just pushed out?

-- 
MST




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Steve Wise
>> >
>>
>> Also, I just pulled down and built the latest ofed_1_2 kernel and user
>> code against 2.6.20-rc7, and the ucma abi is 4.  So rdma_create_qp()
>> will still crash even with the librdmacm code to avoid the call to
>> rdma_init_qp_attr for ABI 3 kernels.
>>
>>
>> Steve.
>
> I'm a bit confused. Can you please try with latest code I've just pushed out?
>

Will do.  This was before you pulled in Sean's code.





Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Michael S. Tsirkin
> Quoting Steve Wise <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] RE:  regression in ofed 1.2
> 
> > Okay - I _think_ the problem is that OFED 1.2 pulled code from my git
> > tree before I created an ofed_1_2 branch (which contains the fix), and
> > didn't update to match my ofed_1_2 branch.  The crash that you reported
> > occurring over iWarp should also happen over IB for the same reason, so
> > both are likely broken atm...
> > 
> > Vlad, can you please update the ofed build by pulling from the ofed_1_2
> > branches of my rdma-dev.git and librdmacm.git trees?
> 
> I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
> you made there will resolve this issue.  It just needs to be pulled into
> ofed_1_2.

OK, I've updated ofed to code from rdma-dev ofed_1_2 branch. Some notes: 

- Sean, please base your branches on specific -rc from linus
  (OFED 1.2 is now -rc7).
- Now that we are entering feature freeze, we should not do full replaces
  anymore.
  So Sean, please post incremental patches, labeled ofed-1.2 clearly.

-- 
MST




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Steve Wise
On Thu, 2007-02-01 at 03:37 -0600, Steve Wise wrote:
> > Okay - I _think_ the problem is that OFED 1.2 pulled code from my git
> > tree before I created an ofed_1_2 branch (which contains the fix), and
> > didn't update to match my ofed_1_2 branch.  The crash that you reported
> > occurring over iWarp should also happen over IB for the same reason, so
> > both are likely broken atm...
> > 
> > Vlad, can you please update the ofed build by pulling from the ofed_1_2
> > branches of my rdma-dev.git and librdmacm.git trees?
> 
> I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
> you made there will resolve this issue.  It just needs to be pulled into
> ofed_1_2.
> 

Also, I just pulled down and built the latest ofed_1_2 kernel and user
code against 2.6.20-rc7, and the ucma abi is 4.  So rdma_create_qp()
will still crash even with the librdmacm code to avoid the call to
rdma_init_qp_attr for ABI 3 kernels.


Steve.






Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-02-01 Thread Steve Wise
> Okay - I _think_ the problem is that OFED 1.2 pulled code from my git tree
> before I created an ofed_1_2 branch (which contains the fix), and didn't
> update to match my ofed_1_2 branch.  The crash that you reported occurring
> over iWarp should also happen over IB for the same reason, so both are
> likely broken atm...
> 
> Vlad, can you please update the ofed build by pulling from the ofed_1_2
> branches of my rdma-dev.git and librdmacm.git trees?

I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
you made there will resolve this issue.  It just needs to be pulled into
ofed_1_2.

Thanks!

Steve.









Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Michael S. Tsirkin
> His ofed_1_2_multicast didn't have an rdma_user_cm.h file, so I'm not sure
> about that branch.

That one should be removed. It was created as a debugging aid to help people
debug crashes observed by Dotan in the multicast module.

-- 
MST




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
>> OFED will be ABI 4, since it will include multicast support (which is what
>> causes the ABI to bump from 3 to 4).
>>
>
>Has the ofed tree been updated to ABI 4 yet?

I just looked in vlad's git tree a while ago, and his ofed_1_2 branch had ABI 3.
His ofed_1_2_multicast didn't have an rdma_user_cm.h file, so I'm not sure about
that branch.

- Sean




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Steve Wise
On Wed, 2007-01-31 at 14:04 -0800, Sean Hefty wrote:
> > Fixed it for IB maybe, but not for iWarp, right?
> 
> It should be fixed for both.
> 
> > So OFED 1.2 will be ABI 3, right?
> 
> OFED will be ABI 4, since it will include multicast support (which is what 
> causes the ABI to bump from 3 to 4).
> 

Has the ofed tree been updated to ABI 4 yet?







Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
>> The librdmacm shipped with OFED 1.1 shouldn't hit this issue.  And neither
>> should the upcoming OFED 1.2 version of the librdmacm (with the previously
>> posted patch applied), when paired with either the OFED 1.2 kernel code, what
>> was requested to go into 2.6.21, or older kernels.
>>
>
>What patch?

I was referring to the patch to the librdmacm that I posted earlier today.  I
just committed this patch to my librdmacm.git tree.

>Well the OFED 1.2 builds are busted now for iWARP.  I guess I missed
>whatever patch you submitted that will fix this.

Okay - I _think_ the problem is that OFED 1.2 pulled code from my git tree
before I created an ofed_1_2 branch (which contains the fix), and didn't update
to match my ofed_1_2 branch.  The crash that you reported occurring over iWarp
should also happen over IB for the same reason, so both are likely broken atm...

Vlad, can you please update the ofed build by pulling from the ofed_1_2 branches
of my rdma-dev.git and librdmacm.git trees?

- Sean




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Steve Wise
On Wed, 2007-01-31 at 14:35 -0800, Sean Hefty wrote:
> >But there still exists an iwarp issue that I need to fix because
> >librdmacm (the one shipped in OFED) now calls the kernel
> >rdma_init_qp_attr() function via ucma before the library calls kernel
> >rdma_connect() via ucma...
> 
> Can you clarify which versions of the librdmacm and kernel you are using?
> 

The 0130-0200 OFED 1.2 daily kernel and user builds applied to any
kernel.  But I'm using 2.6.20-rc6.

> The librdmacm shipped with OFED 1.1 shouldn't hit this issue.  And neither
> should the upcoming OFED 1.2 version of the librdmacm (with the previously
> posted patch applied), when paired with either the OFED 1.2 kernel code, what
> was requested to go into 2.6.21, or older kernels.
> 

What patch? 

> I just think that this problem is only exposed by developmental librdmacm code
> paired with older developmental rdma_cm multicast code.

Well the OFED 1.2 builds are busted now for iWARP.  I guess I missed
whatever patch you submitted that will fix this.

Steve.








Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
>But there still exists an iwarp issue that I need to fix because
>librdmacm (the one shipped in OFED) now calls the kernel
>rdma_init_qp_attr() function via ucma before the library calls kernel
>rdma_connect() via ucma...

Can you clarify which versions of the librdmacm and kernel you are using?

The librdmacm shipped with OFED 1.1 shouldn't hit this issue.  And neither
should the upcoming OFED 1.2 version of the librdmacm (with the previously
posted patch applied), when paired with either the OFED 1.2 kernel code, what
was requested to go into 2.6.21, or older kernels.

I just think that this problem is only exposed by developmental librdmacm code
paired with older developmental rdma_cm multicast code.

- Sean




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Steve Wise
On Wed, 2007-01-31 at 14:04 -0800, Sean Hefty wrote:
> > Fixed it for IB maybe, but not for iWarp, right?
> 
> It should be fixed for both.
> 

Ok. 

But there still exists an iwarp issue that I need to fix because
librdmacm (the one shipped in OFED) now calls the kernel
rdma_init_qp_attr() function via ucma before the library calls kernel
rdma_connect() via ucma...

 






Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
> Fixed it for IB maybe, but not for iWarp, right?

It should be fixed for both.

> So OFED 1.2 will be ABI 3, right?

OFED will be ABI 4, since it will include multicast support (which is what 
causes the ABI to bump from 3 to 4).

- Sean




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Steve Wise
On Wed, 2007-01-31 at 13:55 -0800, Sean Hefty wrote:
> Steve Wise wrote:
> > Should this be a problem for OFED 1.2?  I would think the ABI for all
> > backports should be the same, so it wouldn't be a problem.  Is this
> > true?  I'm assuming all backported UCMA modules would have the same
> > ABI.  
> 
> This is a problem for anyone that tries to use a newer version of the
> librdmacm (like an OFED 1.2 version) with an older kernel (e.g. 2.6.20).
> As you pointed out, the issue is that the kernel rdma_cm crashes if
> rdma_init_qp_attr() is called before the user calls rdma_connect().  The
> problem affects both IB and iWarp.  The latest changes to the librdmacm
> exposed this bug, but the latest kernel multicast code also fixed it.
> 

Fixed it for IB maybe, but not for iWarp, right?

> As far as I know, only ABI 3 has been released anywhere.  ABI 4 is only
> available from my git tree.  This problem will occur on any code based on
> ABI 3 or older code snapshots of ABI 4.
> 
> Hopefully this makes sense.

So OFED 1.2 will be ABI 3, right?

Sorry if I'm being dense...





Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Sean Hefty
Steve Wise wrote:
> Should this be a problem for OFED 1.2?  I would think the ABI for all
> backports should be the same, so it wouldn't be a problem.  Is this
> true?  I'm assuming all backported UCMA modules would have the same
> ABI.  

This is a problem for anyone that tries to use a newer version of the librdmacm 
(like an OFED 1.2 version) with an older kernel (e.g. 2.6.20).  As you pointed 
out, the issue is that the kernel rdma_cm crashes if rdma_init_qp_attr() is 
called before the user calls rdma_connect().  The problem affects both IB and 
iWarp.  The latest changes to the librdmacm exposed this bug, but the latest 
kernel multicast code also fixed it.

As far as I know, only ABI 3 has been released anywhere.  ABI 4 is only 
available from my git tree.  This problem will occur on any code based on ABI 3 
or older code snapshots of ABI 4.
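
To make the workaround concrete, here is a minimal sketch of the decision the
library has to make. It assumes the library already knows the kernel's ucma ABI
version; the enum and the function name choose_qp_init_path() are hypothetical
and only mirror the "abi_ver == 3" test in the posted librdmacm patch.

```c
/* Hypothetical labels for the two ways of preparing QP attributes. */
enum qp_init_path {
    QP_INIT_LOCAL,   /* library fills in the attributes itself */
    QP_INIT_KERNEL   /* library asks the kernel via rdma_init_qp_attr() */
};

/* Pick the path for a given kernel ucma ABI version.  ABI 3 kernels are
 * assumed to crash if asked for QP attributes before rdma_connect(), so
 * the library must avoid that call there. */
enum qp_init_path choose_qp_init_path(int abi_ver)
{
    if (abi_ver == 3)
        return QP_INIT_LOCAL;
    return QP_INIT_KERNEL;
}
```

Note that a version check like this cannot tell a fixed ABI 4 kernel from an
older ABI 4 snapshot, so those snapshots still need the kernel-side fix.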

Hopefully this makes sense.

- Sean




Re: [openib-general] [PATCH] RE: regression in ofed 1.2

2007-01-31 Thread Steve Wise
Should this be a problem for OFED 1.2?  I would think the ABI for all
backports should be the same, so it wouldn't be a problem.  Is this
true?  I'm assuming all backported UCMA modules would have the same
ABI.  



On Wed, 2007-01-31 at 11:19 -0800, Sean Hefty wrote:
> Here's a first attempt at a patch to allow the latest librdmacm to work with
> kernel ABI version 3 without crashing the kernel.  If you're trying to use a
> developmental kernel that has ABI 4, you'll have to update the kernel cma.
> 
> Note that I didn't actually run this against an older kernel (I need to
> reload that on my system), but did test this fix by forcing the abi to
> version 3 with a newer kernel loaded.
> 
> Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
> ---
> diff --git a/src/cma.c b/src/cma.c
> index 2d2a587..c5f8cd9 100644
> --- a/src/cma.c
> +++ b/src/cma.c
> @@ -653,11 +653,49 @@ static int ucma_modify_qp_err(struct rdma_cm_id *id)
>   return ibv_modify_qp(id->qp, &qp_attr, IBV_QP_STATE);
>  }
>  
> +static int ucma_find_pkey(struct cma_device *cma_dev, uint8_t port_num,
> +   uint16_t pkey, uint16_t *pkey_index)
> +{
> + int ret, i;
> + uint16_t chk_pkey;
> +
> + for (i = 0, ret = 0; !ret; i++) {
> + ret = ibv_query_pkey(cma_dev->verbs, port_num, i, &chk_pkey);
> + if (!ret && pkey == chk_pkey) {
> + *pkey_index = (uint16_t) i;
> + return 0;
> + }
> + }
> + return -EINVAL;
> +}
> +
> +static int ucma_init_conn_qp3(struct cma_id_private *id_priv, struct ibv_qp *qp)
> +{
> + struct ibv_qp_attr qp_attr;
> + int ret;
> +
> + ret = ucma_find_pkey(id_priv->cma_dev, id_priv->id.port_num,
> +  id_priv->id.route.addr.addr.ibaddr.pkey,
> +  &qp_attr.pkey_index);
> + if (ret)
> + return ret;
> +
> + qp_attr.port_num = id_priv->id.port_num;
> + qp_attr.qp_state = IBV_QPS_INIT;
> + qp_attr.qp_access_flags = 0;
> +
> + return ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_ACCESS_FLAGS |
> +IBV_QP_PKEY_INDEX | IBV_QP_PORT);
> +}
> +
>  static int ucma_init_conn_qp(struct cma_id_private *id_priv, struct ibv_qp *qp)
>  {
>   struct ibv_qp_attr qp_attr;
>   int qp_attr_mask, ret;
>  
> + if (abi_ver == 3)
> + return ucma_init_conn_qp3(id_priv, qp);
> +
>   qp_attr.qp_state = IBV_QPS_INIT;
>   ret = rdma_init_qp_attr(&id_priv->id, &qp_attr, &qp_attr_mask);
>   if (ret)
> @@ -666,11 +704,44 @@ static int ucma_init_conn_qp(struct cma_id_private *id_priv, struct ibv_qp *qp)
>   return ibv_modify_qp(qp, &qp_attr, qp_attr_mask);
>  }
>  
> +static int ucma_init_ud_qp3(struct cma_id_private *id_priv, struct ibv_qp *qp)
> +{
> + struct ibv_qp_attr qp_attr;
> + int ret;
> +
> + ret = ucma_find_pkey(id_priv->cma_dev, id_priv->id.port_num,
> +  id_priv->id.route.addr.addr.ibaddr.pkey,
> +  &qp_attr.pkey_index);
> + if (ret)
> + return ret;
> +
> + qp_attr.port_num = id_priv->id.port_num;
> + qp_attr.qp_state = IBV_QPS_INIT;
> + qp_attr.qkey = RDMA_UDP_QKEY;
> +
> + ret = ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_QKEY |
> +   IBV_QP_PKEY_INDEX | IBV_QP_PORT);
> + if (ret)
> + return ret;
> +
> + qp_attr.qp_state = IBV_QPS_RTR;
> + ret = ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE);
> + if (ret)
> + return ret;
> +
> + qp_attr.qp_state = IBV_QPS_RTS;
> + qp_attr.sq_psn = 0;
> + return ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_SQ_PSN);
> +}
> +
>  static int ucma_init_ud_qp(struct cma_id_private *id_priv, struct ibv_qp *qp)
>  {
>   struct ibv_qp_attr qp_attr;
>   int qp_attr_mask, ret;
>  
> + if (abi_ver == 3)
> + return ucma_init_ud_qp3(id_priv, qp);
> +
>   qp_attr.qp_state = IBV_QPS_INIT;
>   ret = rdma_init_qp_attr(&id_priv->id, &qp_attr, &qp_attr_mask);
>   if (ret)
> 
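
The P_Key lookup in the quoted patch can be tried in isolation with a small
stand-alone sketch. The table contents, mock_query_pkey() and find_pkey_index()
below are hypothetical stand-ins (the real code calls ibv_query_pkey() on a
device); only the loop shape is taken from ucma_find_pkey() above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical P_Key table standing in for a real port's table. */
static const uint16_t mock_pkey_table[] = { 0xffff, 0x8001, 0x7fff };
#define MOCK_PKEY_COUNT (sizeof(mock_pkey_table) / sizeof(mock_pkey_table[0]))

/* Hypothetical replacement for ibv_query_pkey(): returns 0 and stores the
 * entry on success, or -1 once the index runs past the end of the table. */
int mock_query_pkey(int index, uint16_t *pkey)
{
    if (index < 0 || (size_t)index >= MOCK_PKEY_COUNT)
        return -1;
    *pkey = mock_pkey_table[index];
    return 0;
}

/* Same loop shape as ucma_find_pkey(): walk the table until the query
 * fails, and report the index of the first matching P_Key. */
int find_pkey_index(uint16_t pkey, uint16_t *pkey_index)
{
    uint16_t chk_pkey;
    int i, ret;

    for (i = 0, ret = 0; !ret; i++) {
        ret = mock_query_pkey(i, &chk_pkey);
        if (!ret && pkey == chk_pkey) {
            *pkey_index = (uint16_t)i;
            return 0;
        }
    }
    return -EINVAL;
}
```

The loop relies on the query failing at the end of the table to terminate, so
a P_Key that is not present yields -EINVAL rather than an out-of-range index.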

