Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-07-12 Thread Marty Schlining
I am performing more tests for Vlad on Linux 4.8 and Linux 4.12.

-----Original Message-----
From: Davis, Arlin R [mailto:arlin.r.da...@intel.com] 
Sent: Wednesday, July 12, 2017 11:08 AM
To: Steve Wise <sw...@opengridcomputing.com>; 'Pradeep Kankipati' 
<pradeep.kankip...@broadcom.com>; Woodruff, Robert J 
<robert.j.woodr...@intel.com>; 'Vladimir Sokolovsky' <v...@dev.mellanox.co.il>; 
'Bart Van Assche' <bart.vanass...@wdc.com>; rsda...@soft-forge.com; Marty 
Schlining <mschlin...@ddn.com>
Cc: Mike Davis <mda...@ddn.com>; ewg@lists.openfabrics.org; Cedric Fernandes 
<cfernan...@ddn.com>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Vlad and Marty,

Can you update the EWG on progress? 
Is there a possibility of a fix this week or shall we move ahead with Woody's 
proposal?

Thanks, Arlin

> -----Original Message-----
> From: Steve Wise [mailto:sw...@opengridcomputing.com]
> Sent: Wednesday, July 05, 2017 9:34 AM
> To: 'Pradeep Kankipati' <pradeep.kankip...@broadcom.com>; Woodruff, 
> Robert J <robert.j.woodr...@intel.com>; Davis, Arlin R 
> <arlin.r.da...@intel.com>; 'Vladimir Sokolovsky' 
> <v...@dev.mellanox.co.il>; 'Bart Van Assche' <bart.vanass...@wdc.com>; 
> rsda...@soft-forge.com; Hanania, Amir <amir.hana...@intel.com>; 
> mschlin...@ddn.com
> Cc: mda...@ddn.com; ewg@lists.openfabrics.org; cfernan...@ddn.com
> Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> 
> Ditto.
> 
> > -----Original Message-----
> > From: ewg [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of 
> > Pradeep Kankipati
> > Sent: Monday, July 03, 2017 4:02 AM
> > To: Woodruff, Robert J; Davis, Arlin R; Vladimir Sokolovsky; Bart 
> > Van Assche; rsda...@soft-forge.com; Hanania, Amir; 
> > mschlin...@ddn.com
> > Cc: mda...@ddn.com; ewg@lists.openfabrics.org; cfernan...@ddn.com
> > Subject: Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> >
> > I agree with Woody's proposal; it makes sense to me.
> >
> > Thanks,
> > Pradeep
> > --
> >
> > > -----Original Message-----
> > > From: ewg [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of 
> > > Woodruff, Robert J
> > > Sent: Friday, June 30, 2017 10:03 PM
> > > To: Davis, Arlin R; Vladimir Sokolovsky; Bart Van Assche;
> > > rsdance@soft-forge.com; Hanania, Amir; mschlin...@ddn.com
> > > Cc: mda...@ddn.com; ewg@lists.openfabrics.org; cfernan...@ddn.com
> > > Subject: Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> > >
> > > I vote for documenting the problem as a known issue, with
> > > documentation on any known workarounds (such as using Bart's
> > > srp_backports driver), and then working on getting an in-box fix into
> > > OFED-4.8-1, especially since we are attempting to have a quick
> > > turn-around for OFED-4.8-1: the new content being added is limited,
> > > and thus it should not take too long to get done.
> > >
> > > Vlad, what is your position on this one ?
> > >
> > > And also, does anyone else have an opinion ?
> > >
> > > -----Original Message-----
> > > From: Davis, Arlin R
> > > Sent: Friday, June 30, 2017 9:12 AM
> > > To: Woodruff, Robert J <robert.j.woodr...@intel.com>; Vladimir 
> > > Sokolovsky <v...@dev.mellanox.co.il>; Bart Van Assche 
> > > <bart.vanass...@wdc.com>; rsda...@soft-forge.com; Hanania, Amir 
> > > <amir.hana...@intel.com>; mschlin...@ddn.com
> > > Cc: mda...@ddn.com; ewg@lists.openfabrics.org; cfernan...@ddn.com
> > > Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> > >
> > > We need community input here or some direction from the co-chairs 
> > > (Woody and Vlad) so we can move forward. I have no problem waiting 
> > > for a fix if someone can give the EWG a timely ETA on an upstream 
> > > fix. From what I can tell, the maintainer doesn't have the 
> > > bandwidth and DDN doesn't have the expertise to fix so I don't see 
> > > any quick resolution. There are several OFA members anxiously 
> > > waiting for OFED
> > > 4.8-1 to add their new drivers and many that need to get OFED 4.8 
> > > GA done. Not sure we can wait weeks and/or months for this to get 
> > > resolved.
> > >
> > > The EWG meeting next Monday is on a US holiday and will be 
> > > cancelled so we don't want to wait 2 weeks for a decision.
> > >
> > > I am asking EWG and OFA members to please reply with their 
> > > recommendations and suggestion

Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-06-29 Thread Marty Schlining
All,

I do object. I do not see the sense in releasing OFED 4.8 where it completely 
breaks the functionality of the previous OFED release. I'd like to see the 
issues fixed before the release of OFED 4.8.

I understand that there is a bandwidth issue from the maintainer. How could I 
be of assistance in this regard? Granted, I am not up to speed on the SRP 
codebase or set up to debug it properly, but that could change with some 
assistance. Vlad, what area of the code should I be looking at?

Thanks,
Marty

-----Original Message-----
From: Davis, Arlin R [mailto:arlin.r.da...@intel.com] 
Sent: Thursday, June 29, 2017 2:10 PM
To: Davis, Arlin R <arlin.r.da...@intel.com>; Vladimir Sokolovsky 
<v...@dev.mellanox.co.il>; Bart Van Assche <bart.vanass...@wdc.com>; 
rsda...@soft-forge.com; Hanania, Amir <amir.hana...@intel.com>; Woodruff, 
Robert J <robert.j.woodr...@intel.com>; Marty Schlining <mschlin...@ddn.com>
Cc: Mike Davis <mda...@ddn.com>; ewg@lists.openfabrics.org; Cedric Fernandes 
<cfernan...@ddn.com>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

All,

Here is a quick summary of OFED 4.8 and SRP blocking bugs. Please 
correct/comment if I have it wrong.

- SL7.2 and SL7.3 SRP support is a must for DDN, need latest upstream SRP 4.11 
fixes
- No resources available to backport or integrate the SRP 4.11 base or fixes 
into OFED 4.8
- ib_srp_backport (4.11 base) installs/builds/runs on OFED 4.8 for both 7.2 and 
7.3
- SL7.3+OFED+srp_backport fixes 2632
- SL7.2+OFED+srp_backport still hitting bug 2632
- SL7.2+inbox_infiniband+srp_backport (no OFED) still hitting 2632 and 2634 
(slightly different but still seeing CM DREQ)
- Bug 2632, 2634 still open, no resolution and no ETA on fix from maintainer
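
The results above can be written as a small lookup, which may be handy when scripting release-note checks. This is only a sketch: the function name `srp_open_bugs` and the short labels `ofed` and `inbox` are made up for illustration, and the answers are just the ones reported in this thread.

```shell
#!/bin/sh
# Sketch: encode the SRP test results summarized above.
# srp_open_bugs and the labels "ofed"/"inbox" are hypothetical names.
# Every configuration listed here includes the srp_backport (4.11 base) driver.
srp_open_bugs() {
    os="$1"      # SL7.2 or SL7.3
    stack="$2"   # ofed (OFED 4.8) or inbox (distro infiniband, no OFED)
    case "$os+$stack" in
        SL7.3+ofed)  echo "none" ;;
        SL7.2+ofed)  echo "2632" ;;
        SL7.2+inbox) echo "2632 2634" ;;
        *)           echo "untested" ;;
    esac
}
```

For example, `srp_open_bugs SL7.2 inbox` prints `2632 2634`; any combination not covered in the summary is reported as `untested` rather than guessed.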

Given that we still have the same issues without OFED installed and no ETA on 
fixes, I would suggest moving forward and releasing OFED 4.8 GA. We can 
document issues and process for applying upstream SRP fixes in the release 
notes. 

If there are no objections, I would like Vlad to modify the OFED release notes, add 
the 2 remaining bugs to known issues, and add the backport procedure for SRP 
upstream as follows:

http://bugs.openfabrics.org/show_bug.cgi?id=2632
http://bugs.openfabrics.org/show_bug.cgi?id=2634

The ib_srp-backport procedure is as follows:
 1. Install the OS (RHEL, SLES, ...).
 2. Install OFED.
 3. Install the ib_srp-backport driver:
git clone https://github.com/bvanassche/ib_srp-backport
cd ib_srp-backport
make rpm
sudo rpm -U $PWD/rpmbuilddir/RPMS/*/*.rpm
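
The steps above could be wrapped as follows for the release notes. This is a minimal sketch, not a tested installer: the function names are hypothetical, and the check for step 2 assumes that the OFED install placed its `ofed_info` utility on the PATH. The commands themselves are the ones from the list.

```shell
#!/bin/sh
# Sketch of the procedure above as reusable functions.
# Function names are hypothetical; the commands are the ones from the list.

# Step 2 check: assumes OFED put its ofed_info utility on PATH.
ofed_present() {
    command -v ofed_info >/dev/null 2>&1
}

# Step 3: print the install commands, one per line, so they can be
# reviewed first and then piped to "sh -e" to stop at the first failure.
srp_backport_steps() {
    cat <<'EOF'
git clone https://github.com/bvanassche/ib_srp-backport
cd ib_srp-backport
make rpm
sudo rpm -U $PWD/rpmbuilddir/RPMS/*/*.rpm
EOF
}
```

A possible invocation is `ofed_present && srp_backport_steps | sh -e`, which skips the driver install when OFED is not detected and aborts at the first failing command.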

We can then release GA package tomorrow with these changes and latest daily 
build.
Any other release notes changes?

Thanks,

Arlin

> -----Original Message-----
> From: ewg [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of 
> Davis, Arlin R
> Sent: Friday, June 23, 2017 12:59 PM
> To: Vladimir Sokolovsky <v...@dev.mellanox.co.il>; Bart Van Assche 
> <bart.vanass...@wdc.com>; rsda...@soft-forge.com; Hanania, Amir 
> <amir.hana...@intel.com>; Woodruff, Robert J 
> <robert.j.woodr...@intel.com>; mschlin...@ddn.com
> Cc: mda...@ddn.com; ewg@lists.openfabrics.org; cfernan...@ddn.com
> Subject: Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> 
> It seems like we now have a way to apply the latest SRP upstream fixes 
> to OFED 4.8, tested with RH 7.2 and RH7.3. However, we still have 2 
> critical bugs
> (1 new one) on RH 7.2 even after the upstream fixes are applied. RH 
> 7.3+OFED4.8+srp_backport seems to be fine.
> 
> http://bugs.openfabrics.org/show_bug.cgi?id=2632
> http://bugs.openfabrics.org/show_bug.cgi?id=2634
> 
> How does DDN want to proceed?
> Is it acceptable to apply upstream version/fixes to OFED 4.8 as 
> follows, and document this in the release notes?
> Can you tell us if the remaining bugs are related to OFED 4.8 or if 
> RH7.2+infiniband+srp_backport has the same issues?
> 
> The ib_srp-backport procedure is as follows:
> 1. Install the OS (RHEL, SLES, ...).
> 2. Install OFED.
> 3. Install the ib_srp-backport driver:
>   git clone https://github.com/bvanassche/ib_srp-backport
>   cd ib_srp-backport
>   make rpm
>   sudo rpm -U $PWD/rpmbuilddir/RPMS/*/*.rpm
> 
> Thanks,
> 
> Arlin
> 
> > SRP is included already in OFED-4.8 as a part of linux-4.8 kernel 
> > and I don't see any reason to include SRP code twice.
> > If some bug fixes have to be added there then there is a specific 
> > place in compat-rdma git tree for the relevant patches.
> > If somebody is interested to add an additional SRP standalone 
> > package then please send me the relevant patches for 
> > compat-rdma/ofed_scripts and build git trees.
> >
> > Regards,
> > Vladimir
> >
> >
> > On 06/20/2017 09:01 PM, Davis, Arlin R wrote:
> > >

Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-06-20 Thread Marty Schlining
The upstream test was for Bug 2631, SRP Reject Issue. The Base OS was SL 7.3 
(not SL 7.2) w/ OFED 4.8-rc4 and the ib_srp_backport_4_11. That combination 
worked properly.

From: Davis, Arlin R [mailto:arlin.r.da...@intel.com]
Sent: Monday, June 19, 2017 4:35 PM
To: Marty Schlining <mschlin...@ddn.com>; Woodruff, Robert J 
<robert.j.woodr...@intel.com>; RSD@SFI <rsda...@soft-forge.com>; 'Vladimir 
Sokolovsky' <v...@dev.mellanox.co.il>
Cc: bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric Fernandes 
<cfernan...@ddn.com>; Mike Davis <mda...@ddn.com>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

I tested SL 7.2 with OFED 4.8-rc4. Scanning SRP targets yielded the following 
response:

Ok, so this is a kernel.org 4.8 base SRP driver backported to 3.10.
Can someone tell me if the same problem exists with upstream SRP driver in 
kernel 4.8?
This will tell me if it is an OFED 4.8 backport issue or if it is a kernel 
4.8 SRP driver issue.

ib_srp: Sending CM DREQ failed and the SRP targets were never scanned. (See Bug 
2632. http://bugs.openfabrics.org/show_bug.cgi?id=2632)

I attempted to use Bart Van Assche’s ib_srp_backport driver on top of SL 7.2 
and OFED 4.8-rc4 according to his instructions. The ib_srp_backport driver 
failed to compile (see attached text file). The ib_srp_backport driver will not 
compile with OFED 4.8-rc4 installed. I was able to uninstall OFED 4.8-rc4, then 
compile the driver. However, when OFED 4.8-rc4 was reinstalled, the 
ib_srp_backport driver was not compatible.

Ok, I think I understand.
This is the upstream test that you said passed? A SL 
7.2+infiniband+ib_srp_backport_from_4.11?
Correct?

   -arlin

That’s all of the testing so far.

-Marty

From: Davis, Arlin R [mailto:arlin.r.da...@intel.com]
Sent: Monday, June 19, 2017 4:02 PM
To: Marty Schlining <mschlin...@ddn.com<mailto:mschlin...@ddn.com>>; Woodruff, 
Robert J <robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; 
RSD@SFI <rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir 
Sokolovsky' <v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Sorry, I am still trying to understand what is working and what is needed to 
fix this for OFED 4.8.

When you said you tested the upstream kernel, was it SL 7.2 kernel plus the 
backports from Bart’s 4.11 drop? Do you need the 4.11 kernel base driver set 
for SRP to work properly?

-arlin

From: Marty Schlining [mailto:mschlin...@ddn.com]
Sent: Monday, June 19, 2017 9:42 AM
To: Davis, Arlin R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; 
Woodruff, Robert J 
<robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

No upstream kernel. SL 7.2 (3.10.0-327.el7.x86_64), OFED-4.8-rc4. I was unable to 
build the ib_srp backport driver on this platform with OFED 4.8-rc4 
installed; there were multiple build errors. However, if I uninstall OFED-4.8-rc4, I am 
able to build the upstream ib_srp RPMs on SL 7.2.

Backport ib_srp driver obtained by the following mechanism:

git clone https://github.com/bvanassche/ib_srp-backport.git

-Marty


From: Davis, Arlin R [mailto:arlin.r.da...@intel.com]
Sent: Monday, June 19, 2017 12:36 PM
To: Marty Schlining <mschlin...@ddn.com<mailto:mschlin...@ddn.com>>; Woodruff, 
Robert J <robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; 
RSD@SFI <rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir 
Sokolovsky' <v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Marty,

What upstream kernel did you use to verify SRP? Was it on a 4.8 base or 
something newer? If newer, maybe we are missing some key SRP fixes in our

Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-06-20 Thread Marty Schlining
No upstream kernel. SL 7.2 (3.10.0-327.el7.x86_64), OFED-4.8-rc4. I was unable to 
build the ib_srp backport driver on this platform with OFED 4.8-rc4 
installed; there were multiple build errors. However, if I uninstall OFED-4.8-rc4, I am 
able to build the upstream ib_srp RPMs on SL 7.2.

Backport ib_srp driver obtained by the following mechanism:

git clone https://github.com/bvanassche/ib_srp-backport.git
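
Because the build result depends on the base OS, it may help to record which EL7 minor release a kernel string belongs to when reporting results. The helper below is a hypothetical sketch: `el7_minor` is not part of any package, the 3.10.0-327 mapping comes from the platform described above, and the 3.10.0-514 mapping is an assumption about the stock EL 7.3 kernel series.

```shell
#!/bin/sh
# Sketch: map a kernel release string to the EL7 minor release.
# el7_minor is a hypothetical helper name.
el7_minor() {
    case "$1" in
        3.10.0-327.*) echo "7.2" ;;      # e.g. 3.10.0-327.el7.x86_64, as reported above
        3.10.0-514.*) echo "7.3" ;;      # assumption: stock EL 7.3 kernel series
        *)            echo "unknown" ;;  # anything else is not classified
    esac
}
```

On a test host this would be called as `el7_minor "$(uname -r)"`.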

-Marty


From: Davis, Arlin R [mailto:arlin.r.da...@intel.com]
Sent: Monday, June 19, 2017 12:36 PM
To: Marty Schlining <mschlin...@ddn.com>; Woodruff, Robert J 
<robert.j.woodr...@intel.com>; RSD@SFI <rsda...@soft-forge.com>; 'Vladimir 
Sokolovsky' <v...@dev.mellanox.co.il>
Cc: bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric Fernandes 
<cfernan...@ddn.com>; Mike Davis <mda...@ddn.com>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Marty,

What upstream kernel did you use to verify SRP? Was it on a 4.8 base or 
something newer? If newer, maybe we are missing some key SRP fixes in our OFED 
4.8 kernel base.

Thanks, Arlin

From: Marty Schlining [mailto:mschlin...@ddn.com]
Sent: Friday, June 16, 2017 1:00 PM
To: Davis, Arlin R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; 
Woodruff, Robert J 
<robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

I can’t say for sure. I am attempting to build the upstream ib_srp driver for 
this platform, but I am running into build issues. Bart, I will send you a 
separate email on that subject.

From: Davis, Arlin R [mailto:arlin.r.da...@intel.com]
Sent: Friday, June 16, 2017 3:58 PM
To: Woodruff, Robert J 
<robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; Marty 
Schlining <mschlin...@ddn.com<mailto:mschlin...@ddn.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Another SRP critical issue just opened. Is this also a backport issue?

Bug 2632<http://bugs.openfabrics.org/show_bug.cgi?id=2632> - SRP Login failure 
for SL 7.2 and OFED 4.8-rc4 (ib_srp: Sending CM DREQ failed)


From: Woodruff, Robert J
Sent: Friday, June 16, 2017 10:24 AM
To: Marty Schlining <mschlin...@ddn.com<mailto:mschlin...@ddn.com>>; Davis, 
Arlin R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Ok, then it sounds like it is important enough to hold the release until we have 
a fix. I assume this is something that you can look at, Vlad?

From: Marty Schlining [mailto:mschlin...@ddn.com]
Sent: Friday, June 16, 2017 10:17 AM
To: Woodruff, Robert J 
<robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; Davis, Arlin 
R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

I tested the upstream driver from Bart Van Assche and it does not have the same 
issue as the OFED-4.8-rc4 ib_srp driver. The upstream driver handles the 
SRP_Reject as expected without crashing the SL 7.3 kernel. That is also 
detailed in

Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-06-20 Thread Marty Schlining
I can’t say for sure. I am attempting to build the upstream ib_srp driver for 
this platform, but I am running into build issues. Bart, I will send you a 
separate email on that subject.

From: Davis, Arlin R [mailto:arlin.r.da...@intel.com]
Sent: Friday, June 16, 2017 3:58 PM
To: Woodruff, Robert J <robert.j.woodr...@intel.com>; Marty Schlining 
<mschlin...@ddn.com>; RSD@SFI <rsda...@soft-forge.com>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il>
Cc: bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric Fernandes 
<cfernan...@ddn.com>; Mike Davis <mda...@ddn.com>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Another SRP critical issue just opened. Is this also a backport issue?

Bug 2632<http://bugs.openfabrics.org/show_bug.cgi?id=2632> - SRP Login failure 
for SL 7.2 and OFED 4.8-rc4 (ib_srp: Sending CM DREQ failed)


From: Woodruff, Robert J
Sent: Friday, June 16, 2017 10:24 AM
To: Marty Schlining <mschlin...@ddn.com<mailto:mschlin...@ddn.com>>; Davis, 
Arlin R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Ok, then it sounds like it is important enough to hold the release until we have 
a fix. I assume this is something that you can look at, Vlad?

From: Marty Schlining [mailto:mschlin...@ddn.com]
Sent: Friday, June 16, 2017 10:17 AM
To: Woodruff, Robert J 
<robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; Davis, Arlin 
R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

I tested the upstream driver from Bart Van Assche and it does not have the same 
issue as the OFED-4.8-rc4 ib_srp driver. The upstream driver handles the 
SRP_Reject as expected without crashing the SL 7.3 kernel. That is also 
detailed in the defect.

From: Woodruff, Robert J [mailto:robert.j.woodr...@intel.com]
Sent: Friday, June 16, 2017 1:14 PM
To: Marty Schlining <mschlin...@ddn.com<mailto:mschlin...@ddn.com>>; Davis, 
Arlin R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

So I guess the next question is, does anyone have a fix that can be included? 
Also, do you know if this is a problem with the backport or is it also broken 
in the upstream kernel?

From: Marty Schlining [mailto:mschlin...@ddn.com]
Sent: Friday, June 16, 2017 10:03 AM
To: Woodruff, Robert J 
<robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; Davis, Arlin 
R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

I do not think this is an acceptable workaround. An SRP target could have many 
hosts connected (IO nodes), not just the one that may have been abruptly 
rebooted (due to a power failure or pulled power cord). Are you suggesting that 
customers must reboot all of the IO nodes in their cluster attached to the SRP 
target and the SRP target as a workaround thereby taking the entire file system 
offline?

The backported ib_s

Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-06-20 Thread Marty Schlining
I tested the upstream driver from Bart Van Assche and it does not have the same 
issue as the OFED-4.8-rc4 ib_srp driver. The upstream driver handles the 
SRP_Reject as expected without crashing the SL 7.3 kernel. That is also 
detailed in the defect.

From: Woodruff, Robert J [mailto:robert.j.woodr...@intel.com]
Sent: Friday, June 16, 2017 1:14 PM
To: Marty Schlining <mschlin...@ddn.com>; Davis, Arlin R 
<arlin.r.da...@intel.com>; RSD@SFI <rsda...@soft-forge.com>; 'Vladimir 
Sokolovsky' <v...@dev.mellanox.co.il>
Cc: bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric Fernandes 
<cfernan...@ddn.com>; Mike Davis <mda...@ddn.com>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

So I guess the next question is, does anyone have a fix that can be included? 
Also, do you know if this is a problem with the backport or is it also broken 
in the upstream kernel?

From: Marty Schlining [mailto:mschlin...@ddn.com]
Sent: Friday, June 16, 2017 10:03 AM
To: Woodruff, Robert J 
<robert.j.woodr...@intel.com<mailto:robert.j.woodr...@intel.com>>; Davis, Arlin 
R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; RSD@SFI 
<rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir Sokolovsky' 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Cedric Fernandes 
<cfernan...@ddn.com<mailto:cfernan...@ddn.com>>; Mike Davis 
<mda...@ddn.com<mailto:mda...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

I do not think this is an acceptable workaround. An SRP target could have many 
hosts connected (IO nodes), not just the one that may have been abruptly 
rebooted (due to a power failure or pulled power cord). Are you suggesting that 
customers must reboot all of the IO nodes in their cluster attached to the SRP 
target and the SRP target as a workaround thereby taking the entire file system 
offline?

The backported ib_srp driver has a defect where it does not handle an 
SRP_Reject properly, leaving a null pointer for the blk_mq driver. Normally, 
this would just be a note in the log and the srp_daemon could be set up to 
attempt a reconnect after 30 seconds. But that is no longer possible with the 
current defect: the host would crash and reboot every time a connection is 
attempted.

-Marty Schlining

From: Woodruff, Robert J [mailto:robert.j.woodr...@intel.com]
Sent: Friday, June 16, 2017 12:18 PM
To: Davis, Arlin R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; 
RSD@SFI <rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir 
Sokolovsky' <v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; Marty Schlining 
<mschlin...@ddn.com<mailto:mschlin...@ddn.com>>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

When I read the bug, it looks like the workaround is to also reboot the target 
whenever the host is rebooted. Is this workaround acceptable for the short 
term, with the issue then fixed in OFED-4.8-1, which will be a quick-turnaround 
release since the new functionality being added to OFED-4.8-1 is limited? If 
people are OK with that, I would recommend moving to GA for OFED-4.8 and then 
starting work on OFED-4.8-1 right away. OFED-4.8 has been dragging on forever 
and there are people that will want to start using it.

From: ewg [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Davis, Arlin R
Sent: Friday, June 16, 2017 9:13 AM
To: RSD@SFI <rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir 
Sokolovsky' <v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; 
mschlin...@ddn.com<mailto:mschlin...@ddn.com>
Subject: Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

What is the visibility and impact of the bug? Is this something that would be 
seen in normal use cases?
Is this actively being worked? What is the ETA for a fix?

From: RSD@SFI [mailto:rsda...@soft-forge.com]
Sent: Thursday, June 15, 2017 8:02 PM
To: Davis, Arlin R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; 
'Vladimir Sokolovsky' <v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

SRP is very important to DDN and they have been a major supporter of IB and the 
OFA for a long time. Marty always attends the OFA Interop events and has been a 
big supporter of the OFA Logo program. There

Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-06-20 Thread Marty Schlining
I do not think this is an acceptable workaround. An SRP target could have many 
hosts connected (IO nodes), not just the one that may have been abruptly 
rebooted (due to a power failure or pulled power cord). Are you suggesting that 
customers must reboot all of the IO nodes in their cluster attached to the SRP 
target and the SRP target as a workaround thereby taking the entire file system 
offline?

The backported ib_srp driver has a defect where it does not handle an 
SRP_Reject properly, leaving a null pointer for the blk_mq driver. Normally, 
this would just be a note in the log and the srp_daemon could be set up to 
attempt a reconnect after 30 seconds. But that is no longer possible with the 
current defect: the host would crash and reboot every time a connection is 
attempted.
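
To illustrate the expected behavior (log the rejection, wait, try again), here is a minimal sketch of a fixed-delay retry loop. It is not how srp_daemon is implemented, and `retry_with_delay` is a hypothetical helper; it only makes sense once the driver survives an SRP_Reject instead of crashing the host.

```shell
#!/bin/sh
# Sketch: retry a command with a fixed delay between attempts.
# retry_with_delay is a hypothetical helper, not part of srp_daemon.
retry_with_delay() {
    delay="$1"      # seconds to wait between attempts, e.g. 30
    attempts="$2"   # maximum number of attempts
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        "$@" && return 0                            # connected: stop retrying
        [ "$n" -lt "$attempts" ] && sleep "$delay"  # wait before the next attempt
        n=$((n + 1))
    done
    return 1                                        # every attempt failed
}
```

A call such as `retry_with_delay 30 5 some_connect_command` would try up to five times, 30 seconds apart, where `some_connect_command` stands in for whatever performs the SRP login.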

-Marty Schlining

From: Woodruff, Robert J [mailto:robert.j.woodr...@intel.com]
Sent: Friday, June 16, 2017 12:18 PM
To: Davis, Arlin R <arlin.r.da...@intel.com>; RSD@SFI <rsda...@soft-forge.com>; 
'Vladimir Sokolovsky' <v...@dev.mellanox.co.il>
Cc: bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Marty Schlining 
<mschlin...@ddn.com>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

When I read the bug, it looks like the workaround is to also reboot the target 
whenever the host is rebooted. Is this workaround acceptable for the short 
term, with the issue then fixed in OFED-4.8-1, which will be a quick-turnaround 
release since the new functionality being added to OFED-4.8-1 is limited? If 
people are OK with that, I would recommend moving to GA for OFED-4.8 and then 
starting work on OFED-4.8-1 right away. OFED-4.8 has been dragging on forever 
and there are people that will want to start using it.

From: ewg [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Davis, Arlin R
Sent: Friday, June 16, 2017 9:13 AM
To: RSD@SFI <rsda...@soft-forge.com<mailto:rsda...@soft-forge.com>>; 'Vladimir 
Sokolovsky' <v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: bart.vanass...@gmail.com<mailto:bart.vanass...@gmail.com>; 
ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>; 
mschlin...@ddn.com<mailto:mschlin...@ddn.com>
Subject: Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

What is the visibility and impact of the bug? Is this something that would be 
seen in normal use cases?
Is this actively being worked? What is the ETA for a fix?

From: RSD@SFI [mailto:rsda...@soft-forge.com]
Sent: Thursday, June 15, 2017 8:02 PM
To: Davis, Arlin R <arlin.r.da...@intel.com<mailto:arlin.r.da...@intel.com>>; 
'Vladimir Sokolovsky' <v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>
Subject: RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

SRP is very important to DDN and they have been a major supporter of IB and the 
OFA for a long time. Marty always attends the OFA Interop events and has been a 
big supporter of the OFA Logo program. Therefore I vote that we wait and find a 
fix.

From: ewg [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Davis, Arlin R
Sent: Thursday, June 15, 2017 4:20 PM
To: Vladimir Sokolovsky 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>
Cc: ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>
Subject: Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

I just noticed a new critical SRP bug with RC4. Do we document as “known 
issues” or hold off on GA for a fix?

http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2631


From: ewg [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Davis, Arlin R
Sent: Thursday, June 15, 2017 10:57 AM
To: Vladimir Sokolovsky 
<v...@dev.mellanox.co.il<mailto:v...@dev.mellanox.co.il>>; Schmidt, William R 
<william.r.schm...@intel.com<mailto:william.r.schm...@intel.com>>; Schulfer, 
Pawel <pawel.schul...@intel.com<mailto:pawel.schul...@intel.com>>
Cc: ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>
Subject: Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

Last call for release note changes. Please get them to Vlad by end of the day 
today so we can wrap this up.

Vlad, please roll GA tomorrow with all updated release notes.

Thanks everyone!

From: ewg [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Vladimir 
Sokolovsky
Sent: Thursday, June 15, 2017 8:58 AM
To: Schmidt, William R 
<william.r.schm...@intel.com<mailto:william.r.schm...@intel.com>>; Schulfer, 
Pawel <pawel.schul...@intel.com<mailto:pawel.schul...@intel.com>>
Cc: ewg@lists.openfabrics.org<mailto:ewg@lists.openfabrics.org>
Subject: Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

This fix will get to GA as it was already applied as I mentioned before and the 
GA was not built yet.

Regards,
Vladimir
On 06/15/2017 12:51 AM, Schmidt, William R wrote:
Vlad,

Did this fix get into 

[ewg] Forcing a DDR HCA to SDR speeds

2008-09-09 Thread Marty Schlining
With OFED 1.3.1 or 1.4, is it possible to force the link speed of a DDR HCA 
port, or of the entire DDR HCA, from DDR down to strictly SDR? If so, how can it 
be done? The HCA in question is a Mellanox MT25208 dual-port HCA, rev A3, 
firmware 4.8.2.

Martin Schlining
[EMAIL PROTECTED]