Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-06-20 Thread Bart Van Assche
Hello Arlin,

Adding a URL to the OFED README is not the same as including a version of the
ib_srp-backport driver in OFED. How can OFED users know what version of the
ib_srp-backport driver has been tested in combination with which OFED version?

Thanks,

Bart.

On Tue, 2017-06-20 at 17:18 +, Davis, Arlin R wrote:
> Marty,
> 
> Would this work for DDN? We could add the URL and the SRP backport readme to
> the OFED release notes. I believe you had an issue when applying the backport
> to SL 7.2 and OFED 4.8 RC4 but I think Bart fixed that already.
> 
> Thanks, Arlin
>  
> > Hello Arlin and Robert,
> > 
> > A backported version of the 4.11 ib_srp driver that already has been tested
> > is available at https://github.com/bvanassche/ib_srp-backport.
> > That driver builds fine against kernel.org kernels, RHEL/CentOS kernels,
> > openSuSE and SLES kernels and also against MOFED and OFED. Has it already
> > been considered to use that code base instead of duplicating the backporting
> > effort of the ib_srp driver?
> > 
> > Thanks,
> > 
> > Bart.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/ewg

Re: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8

2017-06-20 Thread Bart Van Assche
Hello Arlin and Robert,

A backported version of the 4.11 ib_srp driver that already has been
tested is available at https://github.com/bvanassche/ib_srp-backport.
That driver builds fine against kernel.org kernels, RHEL/CentOS kernels,
openSuSE and SLES kernels and also against MOFED and OFED. Has it
already been considered to use that code base instead of duplicating the
backporting effort of the ib_srp driver?

Thanks,

Bart.

On 06/20/17 09:32, Woodruff, Robert J wrote:
> The question is, do we need everything that is in 4.11 or is there just
> one or two fixes in 4.11 that would fix this specific issue rather than
> trying to backport the entire 4.11 srp driver ?
> 
>  
> 
> *From:*Davis, Arlin R
> *Sent:* Tuesday, June 20, 2017 9:21 AM
> *To:* Hanania, Amir; Marty Schlining; Woodruff, Robert J; RSD@SFI;
> 'Vladimir Sokolovsky'
> *Cc:* bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric
> Fernandes; Mike Davis
> *Subject:* RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> 
>  
> 
> Bart and Vlad,
> 
>  
> 
> What is the best way to pull in 4.11 fixes into OFED 4.8?
> 
> Looks like a lot of changes between 4.8 and 4.11.
> 
>  
> 
> -arlin
> 
>  
> 
>  
> 
> *From:*Hanania, Amir
> *Sent:* Monday, June 19, 2017 4:26 PM
> *To:* Davis, Arlin R; Marty Schlining; Woodruff, Robert J; RSD@SFI;
> 'Vladimir Sokolovsky'
> *Cc:* bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric
> Fernandes; Mike Davis
> *Subject:* RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> 
>  
> 
> I updated the bug with the below comment:
> 
>  
> 
> I'm not sure whether this ib_srp-backport patch is related, but it is
> worth taking a look.
> 
>  
> 
> IB/srp: Improve an error path
> 
> Avoid that the following message is printed if login fails: scsi host0:
> ib_srp: Sending CM DREQ failed
> 
>  
> 
>  
> 
>  
> 
> https://github.com/bvanassche/ib_srp-backport/commit/75c59cb20e6b7d20948f5a7b5e4fd92bd7436a60
> 
>  
> 
>  
> 
> *From:*Davis, Arlin R
> *Sent:* Monday, June 19, 2017 2:21 PM
> *To:* Marty Schlining; Woodruff, Robert J; RSD@SFI; 'Vladimir Sokolovsky'
> *Cc:* bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric
> Fernandes; Mike Davis; Hanania, Amir
> *Subject:* RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> 
>  
> 
> Ok, so it’s safe to say we are missing some bug fixes after kernel.org 4.8.
> 
>  
> 
> +Amir, to help isolate and validate.
> 
>  
> 
>  
> 
> *From:*Marty Schlining [mailto:mschlin...@ddn.com]
> *Sent:* Monday, June 19, 2017 1:40 PM
> *To:* Davis, Arlin R; Woodruff, Robert J; RSD@SFI; 'Vladimir Sokolovsky'
> *Cc:* bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric
> Fernandes; Mike Davis
> *Subject:* RE: [ewg] [GIT PULL compat-rdma] qib for OFED 4.8
> 
>  
> 
> The upstream test was for Bug 2631, SRP Reject Issue. The Base OS was SL
> 7.3 (not SL 7.2) w/ OFED 4.8-rc4 and the ib_srp_backport_4_11. That
> combination worked properly.
> 
>  
> 
> *From:*Davis, Arlin R [mailto:arlin.r.da...@intel.com]
> *Sent:* Monday, June 19, 2017 4:35 PM
> *To:* Marty Schlining; Woodruff, Robert J; RSD@SFI; 'Vladimir Sokolovsky'
> *Cc:* bart.vanass...@gmail.com; ewg@lists.openfabrics.org; Cedric
> Fernandes

[ewg] srptools v1.0.3

2015-02-11 Thread Bart Van Assche
Hello Vlad,

Please consider srptools v1.0.3 for inclusion in OFED 3.18. The changes
compared to v1.0.2 are as follows (from srptools.spec.in):
- srp_daemon: Survive catastrophic HCA errors.
- srp_daemon: Fix ib_dev name and port assignments for non-default umad
  devices.
- srp_daemon: Add support for allow_ext_sg, cmd_sg_entries and
  sg_tablesize in /etc/srp_daemon.conf.
- srp_daemon: Reduce time needed to stop.
- srp_daemon: Log start and end of trap deregistration.
- srp_daemon: Avoid that clang complains about an invalid conversion
  specifier.
- srp_daemon: Fix memory leaks in error paths.
- ibsrpdm: Do not start trap threads in ibsrpdm.
- configure.ac: Add subdir-objects to AM_INIT_AUTOMAKE.
- srptools.spec: Avoid redundant stop in pre-uninstall.
- Debian: Fix build-deb.sh to read version from configure.ac.
- Debian: Fix package build.

See also http://downloads.openfabrics.org/downloads/srptools/ and
http://git.openfabrics.org/?p=~bvanassche/srptools.git/.git;a=summary.

Thanks,

Bart.




Re: [ewg] links and such

2014-04-02 Thread Bart Van Assche
Hello Ken,

Do you perhaps know when this migration will be finished? If I navigate to
http://openfabrics.org/ all I see is the source code of a Perl script instead
of the actual OFA website.

Thanks,

Bart.

-----Original Message-----
From: ewg-boun...@lists.openfabrics.org
[mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of
k...@flatbed.openfabrics.org
Sent: Monday, 31 March 2014 8:13
To: nvme...@openfabrics.org; e...@openfabrics.org
Subject: [ewg] links and such

We are migrating all web services to hardware. Some links and URLs are not yet
working, but I am diligently trying to solve the issues. The web site, lists
server, and mail server are running. Bugs are at bugs.openfabrics.org/bugzilla/.
The git daemon is running, but its web interface is not yet up. SVN is
available through a client at svn://flatbed.openfabrics.org; its web interface
is not up yet either. My goal is to have them running today.

Thanks for your patience. And thanks to Vladimir for help in getting the git 
daemon running. 

Ken

This e-mail (and any attachments) is confidential and may be privileged.  Any 
unauthorized use, copying, disclosure or dissemination of this communication is 
prohibited.  If you are not the intended recipient,  please notify the sender 
immediately and delete all copies of the message and its attachments.


[ewg] srptools 1.0.2

2014-02-21 Thread Bart Van Assche
Hi Vlad,

Can you please update the srptools package in OFED-3.12 to version 1.0.2?
The only change compared to version 1.0.1 is:

* Thu Feb 20 2014 Bart Van Assche bvanass...@acm.org - 1.0.2
- Added support for specifying tl_retry_count in srp_daemon.conf. Changed
  default behavior for tl_retry_timeout parameter from setting it to 2 into
  leaving it at its default value (7). This makes srp_daemon again compatible
  with the SRP initiator driver from kernel 3.12 and before.

The package is available here:
http://www.openfabrics.org/downloads/srptools/srptools-1.0.2.tar.gz

Thanks,

Bart.



[ewg] [ANNOUNCE] srptools 1.0.1 released

2014-02-04 Thread Bart Van Assche
Changes since version 1.0.0:
- Make process uniqueness check work.
- Unsubscribe from subnet manager for traps before exiting.
- Added support for the comp_vector and queue_size configuration file
  options.
Changes since version 0.0.4:
- srp_daemon keeps working even if the LID changes of the port it is using to
  scan the fabric or if a P_Key change occurs.
- Added P_Key support to srp_daemon and ibsrpdm.
- Fixed month in srp_daemon.log (OFED bug #2281). srp_daemon now uses syslog
  and logrotate for logging.
- srp_daemon is now only started for InfiniBand ports. It is no longer
  attempted to start srp_daemon on Ethernet ports.
- Added support for specifying the tl_retry_count parameter. By default use
  tl_retry_count=2.
- Allow srp_daemon to be started without configuration file.
- Fixed a memory leak in srp_daemon that was triggered once during every
  fabric rescan.
- Reduced memory consumption of the srp_daemon process.
- MAD transaction ID 0 is skipped after 2**32 rescans.
- Installation: SRPHA_ENABLE=no / SRP_DAEMON_ENABLE=no is only added to
  /etc/infiniband/openibd.conf if these variables did not yet exist in that
  file.
- Changed range of the srp_daemon and ibsrpdm exit codes from 0..127 into
  0..1.
- Changed ibsrpdm such that it uses the new umad P_Key ABI. Running ibsrpdm
  no longer causes a warning to be logged (user_mad: process ibsrpdm did
  not enable P_Key index support / user_mad:
  Documentation/infiniband/user_mad.txt has info on the new ABI).
- Fixed spelling of several help texts and diagnostic messages.
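
The "transaction ID 0 is skipped" item above can be pictured with a small
sketch. This is not taken from the srp_daemon sources; next_tid() is a
hypothetical helper that only shows one way a 32-bit counter can step over 0
when it wraps around:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption, not srp_daemon code): advance a
 * 32-bit transaction ID and never hand out 0 after wrap-around.
 */
static uint32_t next_tid(uint32_t tid)
{
	tid++;			/* unsigned arithmetic wraps to 0 */
	if (tid == 0)
		tid = 1;	/* skip the reserved value */
	return tid;
}
```

Starting from UINT32_MAX the helper yields 1 rather than 0, so 0 is never
used as an ID.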

See also
http://www.openfabrics.org/downloads/srptools/srptools-1.0.1.tar.gz.

Bart.



Re: [ewg] OFED 3.12 SRP client compatibility for RHEL 6.5

2013-12-10 Thread Bart Van Assche
On 12/09/13 18:54, Vladimir Sokolovsky wrote:
> I adopted 0025-ib_srp-Backport-to-older-kernels.patch from OFED-3.5 into
> compat-rdma (master) including changes Amir sent to me.
> Please try
> http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/OFED-3.12-20131209-0842.tgz

Hi Vladimir,

I had a look at
http://git.openfabrics.org/git?p=compat-rdma/compat-rdma.git;a=blob;f=patches/0005-BACKPORT-ib_srp.patch
instead. I hope that you don't mind that I have a few questions about
that patch (which bears my signed-off-by but which is a patch I hadn't
seen before) ?
- Which upstream kernel versions and which vendor distros will be
supported by OFED 3.12 ?
- Building the SRP initiator requires a recent version of the kernel
source files drivers/scsi/Makefile, drivers/scsi/scsi_priv.h,
drivers/scsi/scsi_transport_srp.c,
drivers/scsi/scsi_transport_srp_internal.h and
include/scsi/scsi_transport_srp.h. I haven't seen these in the master
branch of the linux-3.12 git repository
(http://git.openfabrics.org/git?p=compat-rdma/linux-3.12.git;a=tree;hb=master).
Is that correct ?
- Are there any plans to include the upstream kernel 3.13 SRP initiator
patches in OFED 3.12 ? I think these patches improve the SRP initiator
significantly in a multipath setup.

Thanks,

Bart.



Re: [ewg] OFED 3.12 SRP client compatibility for RHEL 6.5

2013-12-06 Thread Bart Van Assche
On 12/06/13 19:26, Hanania, Amir wrote:
> I had a need to run ofed 3.12 SRP client on _RHEL 6.5_.
>
> So, I fixed:
> 1. Compilation issue when SRP option is set. (scsi_target_unblock)
> 2. Crashing issue when I first tried to use it. (srp_queuecommand)
>
> It is now running on my RHEL 6.5 system. _Please see attached patch_.
>
> My question is what is the way to make this patch part of the official
> OFED-3.12 release ?

Please have a look at a similar patch that is present in OFED 3.5. It is
much more elaborate than this patch and e.g. enables lockless queueing,
which improves performance significantly. See also the patch called
compat-rdma-3.5/patches/0025-ib_srp-Backport-to-older-kernels.patch in
SRPMS/compat-rdma-3.5-OFED.3.5.src.rpm in OFED-3.5.tgz.

The same patch can also be found here:
http://git.openfabrics.org/git?p=compat-rdma/compat-rdma.git;a=tree;f=patches;hb=ofed_3_5.

Bart.


Re: [ewg] Bugs discovered: OFA Interop Debug Event - October 2012

2012-10-25 Thread Bart Van Assche

On 10/25/12 16:32, Rupert Dance wrote:

> Hi All,
>
> I just wanted to make sure that you were aware of the new bugs
> discovered during the OFA Interop Debug event. These were filed by Marty
> Schlining from DDN who has been working with us all week at UNH-IOL to
> debug SRP issues in OFED-3.5-20121020-0600.tgz.
>
> Here are the bug numbers:
> 2393 http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2393
> 2394 http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2394
> 2395 http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2395
> 2396 http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2396
>
> Today we are going to install OFED-3.5-20121022-0458 and will let you
> know if we see any difference.


The use-after-free fix for unloading ib_srp is present in
OFED-3.5-20121022-0458 but not in OFED-3.5-20121020-0600, so I'm curious
about the new test results.


Bart.


[ewg] [PATCH 1/2 for compat-rdma] ib_srp: Avoid use-after-free during module unload on pre-2.6.36 kernels

2012-10-18 Thread Bart Van Assche
Since ib_unregister_client() flushes ib_wq that workqueue must be
destroyed after client unregistration instead of before.

This patch has been tested on RHEL 6.3 and SLES 11 SP2.
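
The ordering constraint can be illustrated in user space. The following is a
toy sketch, not kernel code: toy_wq, toy_flush(), toy_unregister_client() and
toy_destroy() are hypothetical stand-ins. The only property modeled is that
unregistering flushes the queue, so the queue must still exist at that point.

```c
#include <assert.h>

/* Toy stand-in for a workqueue; not the kernel's struct workqueue_struct. */
struct toy_wq {
	int alive;	/* 0 once destroyed */
	int pending;	/* queued work items */
};

/* Flushing a destroyed queue models the use-after-free; report it as -1. */
static int toy_flush(struct toy_wq *wq)
{
	if (!wq->alive)
		return -1;
	wq->pending = 0;
	return 0;
}

static void toy_destroy(struct toy_wq *wq)
{
	wq->alive = 0;
}

/* Models the behavior described above: unregistering flushes the queue. */
static int toy_unregister_client(struct toy_wq *wq)
{
	return toy_flush(wq);
}

/* Order before the fix: destroy first, then unregister. */
static int cleanup_old_order(struct toy_wq *wq)
{
	toy_destroy(wq);
	return toy_unregister_client(wq);
}

/* Order after the fix: unregister first, then destroy. */
static int cleanup_new_order(struct toy_wq *wq)
{
	int ret = toy_unregister_client(wq);

	toy_destroy(wq);
	return ret;
}
```

With a live queue, cleanup_old_order() reports the modeled use-after-free
while cleanup_new_order() drains the queue before tearing it down.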

Signed-off-by: Bart Van Assche bvanass...@acm.org
---
 .../0025-ib_srp-Backport-to-older-kernels.patch|   52 ++--
 1 file changed, 38 insertions(+), 14 deletions(-)

diff --git a/patches/0025-ib_srp-Backport-to-older-kernels.patch b/patches/0025-ib_srp-Backport-to-older-kernels.patch
index d070430..50e0644 100644
--- a/patches/0025-ib_srp-Backport-to-older-kernels.patch
+++ b/patches/0025-ib_srp-Backport-to-older-kernels.patch
@@ -127,32 +127,56 @@ index bcbf22e..d42e9c4 100644
  	.queuecommand			= srp_queuecommand,
  	.eh_abort_handler		= srp_abort,
  	.eh_device_reset_handler	= srp_reset_device,
-@@ -2491,11 +2553,25 @@ static int __init srp_init_module(void)
- 		return ret;
+@@ -2468,15 +2530,28 @@ static int __init srp_init_module(void)
+ 		indirect_sg_entries = cmd_sg_entries;
  	}
  
 +#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 36)
 +	srp_wq = create_workqueue("srp");
-+	if (IS_ERR(srp_wq)) {
-+		ib_unregister_client(&srp_client);
-+		ib_sa_unregister_client(&srp_sa_client);
-+		class_unregister(&srp_class);
-+		srp_release_transport(ib_srp_transport_template);
++	if (IS_ERR(srp_wq))
 +		return PTR_ERR(srp_wq);
-+	}
 +#endif
 +
- 	return 0;
- }
+ 	ib_srp_transport_template =
+ 		srp_attach_transport(&ib_srp_transport_functions);
+-	if (!ib_srp_transport_template)
++	if (!ib_srp_transport_template) {
++#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 36)
++		destroy_workqueue(srp_wq);
++#endif
+ 		return -ENOMEM;
++	}
  
- static void __exit srp_cleanup_module(void)
- {
+ 	ret = class_register(&srp_class);
+ 	if (ret) {
+ 		pr_err("couldn't register class infiniband_srp\n");
+ 		srp_release_transport(ib_srp_transport_template);
 +#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 36)
-+	destroy_workqueue(srp_wq);
++		destroy_workqueue(srp_wq);
 +#endif
- 	ib_unregister_client(&srp_client);
+ 		return ret;
+ 	}
+ 
+@@ -2488,6 +2563,9 @@ static int __init srp_init_module(void)
+ 		srp_release_transport(ib_srp_transport_template);
+ 		ib_sa_unregister_client(&srp_sa_client);
+ 		class_unregister(&srp_class);
++#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 36)
++		destroy_workqueue(srp_wq);
++#endif
+ 		return ret;
+ 	}
+ 
+@@ -2500,6 +2578,9 @@ static void __exit srp_cleanup_module(void)
  	ib_sa_unregister_client(&srp_sa_client);
  	class_unregister(&srp_class);
+ 	srp_release_transport(ib_srp_transport_template);
++#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 36)
++	destroy_workqueue(srp_wq);
++#endif
+ }
+ 
+ module_init(srp_init_module);
 -- 
 1.7.9.5
 
-- 
1.7.10.4



[ewg] [PATCH 2/2 for compat-rdma] ib_srp: Add post-3.5 upstream patches

2012-10-18 Thread Bart Van Assche
The following three patches have been accepted upstream after kernel
3.5 was released:
- IB-srp-Fix-a-race-condition: avoid that late replies can trigger
  a crash.
- IB-srp-Fix-use-after-free-in-srp_reset_req
- IB-srp-Avoid-having-aborted-requests-hang

Add these patches to OFED 3.5.

Signed-off-by: Bart Van Assche bvanass...@acm.org
---
 patches/0026-IB-srp-Fix-a-race-condition.patch |  160 
 ...B-srp-Fix-use-after-free-in-srp_reset_req.patch |   35 +
 ...IB-srp-Avoid-having-aborted-requests-hang.patch |   30 
 3 files changed, 225 insertions(+)
 create mode 100644 patches/0026-IB-srp-Fix-a-race-condition.patch
 create mode 100644 
patches/0027-IB-srp-Fix-use-after-free-in-srp_reset_req.patch
 create mode 100644 patches/0028-IB-srp-Avoid-having-aborted-requests-hang.patch

diff --git a/patches/0026-IB-srp-Fix-a-race-condition.patch 
b/patches/0026-IB-srp-Fix-a-race-condition.patch
new file mode 100644
index 000..969a3c1
--- /dev/null
+++ b/patches/0026-IB-srp-Fix-a-race-condition.patch
@@ -0,0 +1,160 @@
+From 220329916c72ee3d54ae7262b215a050f04a18fc Mon Sep 17 00:00:00 2001
+From: Bart Van Assche bvanass...@acm.org
+Date: Tue, 14 Aug 2012 13:18:53 +
+Subject: [PATCH] IB/srp: Fix a race condition
+
+Avoid a crash caused by the scmnd->scsi_done(scmnd) call in
+srp_process_rsp() being invoked with scsi_done == NULL.  This can
+happen if a reply is received during or after a command abort.
+
+Reported-by: Joseph Glanville joseph.glanvi...@orionvm.com.au
+Reference: http://marc.info/?l=linux-rdma&m=134314367801595
+Cc: sta...@vger.kernel.org
+Acked-by: David Dillow dillo...@ornl.gov
+Signed-off-by: Bart Van Assche bvanass...@acm.org
+Signed-off-by: Roland Dreier rol...@purestorage.com
+---
+ drivers/infiniband/ulp/srp/ib_srp.c |   87 +--
+ 1 file changed, 63 insertions(+), 24 deletions(-)
+
+diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
+index bcbf22e..1b5b0c7 100644
+--- a/drivers/infiniband/ulp/srp/ib_srp.c
++++ b/drivers/infiniband/ulp/srp/ib_srp.c
+@@ -586,24 +586,62 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
+ 			scmnd->sc_data_direction);
+ }
+ 
+-static void srp_remove_req(struct srp_target_port *target,
+-			   struct srp_request *req, s32 req_lim_delta)
++/**
++ * srp_claim_req - Take ownership of the scmnd associated with a request.
++ * @target: SRP target port.
++ * @req: SRP request.
++ * @scmnd: If NULL, take ownership of @req->scmnd. If not NULL, only take
++ *         ownership of @req->scmnd if it equals @scmnd.
++ *
++ * Return value:
++ * Either NULL or a pointer to the SCSI command the caller became owner of.
++ */
++static struct scsi_cmnd *srp_claim_req(struct srp_target_port *target,
++				       struct srp_request *req,
++				       struct scsi_cmnd *scmnd)
++{
++	unsigned long flags;
++
++	spin_lock_irqsave(&target->lock, flags);
++	if (!scmnd) {
++		scmnd = req->scmnd;
++		req->scmnd = NULL;
++	} else if (req->scmnd == scmnd) {
++		req->scmnd = NULL;
++	} else {
++		scmnd = NULL;
++	}
++	spin_unlock_irqrestore(&target->lock, flags);
++
++	return scmnd;
++}
++
++/**
++ * srp_free_req() - Unmap data and add request to the free request list.
++ */
++static void srp_free_req(struct srp_target_port *target,
++			 struct srp_request *req, struct scsi_cmnd *scmnd,
++			 s32 req_lim_delta)
+ {
+ 	unsigned long flags;
+ 
+-	srp_unmap_data(req->scmnd, target, req);
++	srp_unmap_data(scmnd, target, req);
++
+ 	spin_lock_irqsave(&target->lock, flags);
+ 	target->req_lim += req_lim_delta;
+-	req->scmnd = NULL;
+ 	list_add_tail(&req->list, &target->free_reqs);
+ 	spin_unlock_irqrestore(&target->lock, flags);
+ }
+ 
+ static void srp_reset_req(struct srp_target_port *target, struct srp_request *req)
+ {
+-	req->scmnd->result = DID_RESET << 16;
+-	req->scmnd->scsi_done(req->scmnd);
+-	srp_remove_req(target, req, 0);
++	struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL);
++
++	if (scmnd) {
++		scmnd->result = DID_RESET << 16;
++		scmnd->scsi_done(scmnd);
++		srp_free_req(target, req, scmnd, 0);
++	}
+ }
+ 
+ static int srp_reconnect_target(struct srp_target_port *target)
+@@ -1073,11 +,18 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
+ 		complete(&target->tsk_mgmt_done);
+ 	} else {
+ 		req = &target->req_ring[rsp->tag];
+-		scmnd = req->scmnd;
+-		if (!scmnd)
++		scmnd = srp_claim_req(target, req, NULL);
++		if (!scmnd) {
+ 			shost_printk(KERN_ERR, target->scsi_host,

Re: [ewg] [PATCH] ib_srp: Avoid that module removal can trigger a deadlock

2012-10-17 Thread Bart Van Assche

On 10/17/12 05:12, Rupert Dance wrote:

> However the Module took a long time (~1-2 minutes) to unload
> 2. Message saying something to the effect of 'stale connection...retrying'
> was observed


That behavior is consistent with the behavior of the ib_srp driver in 
the latest upstream kernel (3.7-rc1), isn't it ?


Bart.



[ewg] [PATCH] ib_srp: Avoid that module removal can trigger a deadlock

2012-10-12 Thread Bart Van Assche
Avoid that scsi_remove_host() is invoked from the context of a work
queue thread on which work has been queued that scsi_remove_host()
might be waiting for. That avoids that module removal of ib_srp
triggers a deadlock on a pre-2.6.36 kernel. This patch has been
tested on RHEL 6.1, RHEL 6.2, RHEL 6.3 and SLES 11 SP2.
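
For readers unfamiliar with the version guards this backport relies on, the
sketch below mimics the comparison in user space. KERNEL_VERSION is
reproduced with the classic (a << 16) + (b << 8) + c encoding from
linux/version.h; needs_private_wq() is a hypothetical helper name used only
for illustration and is not part of ib_srp.

```c
#include <assert.h>

/*
 * Classic encoding from <linux/version.h>, redefined here so that the
 * sketch builds without kernel headers.
 */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper: returns 1 for kernels older than 2.6.36, i.e. the
 * kernels on which the backport creates its own workqueue, and 0 otherwise.
 */
static int needs_private_wq(unsigned int version_code)
{
	return version_code < (unsigned int)KERNEL_VERSION(2, 6, 36);
}
```

A 2.6.32-based kernel falls on the "private workqueue" side of the guard;
2.6.36 and anything newer does not.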

Reported-by: Rupert Dance rsda...@soft-forge.com
Signed-off-by: Bart Van Assche bvanass...@acm.org
---
 .../0025-ib_srp-Backport-to-older-kernels.patch|   59 +++-
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/patches/0025-ib_srp-Backport-to-older-kernels.patch b/patches/0025-ib_srp-Backport-to-older-kernels.patch
index 20edccf..d070430 100644
--- a/patches/0025-ib_srp-Backport-to-older-kernels.patch
+++ b/patches/0025-ib_srp-Backport-to-older-kernels.patch
@@ -12,7 +12,7 @@ Signed-off-by: Bart Van Assche bvanass...@acm.org
   1 file changed, 108 insertions(+), 3 deletions(-)
  
  diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
-index bcbf22e..fab74e0 100644
+index bcbf22e..d42e9c4 100644
  --- a/drivers/infiniband/ulp/srp/ib_srp.c
  +++ b/drivers/infiniband/ulp/srp/ib_srp.c
  @@ -30,8 +30,13 @@
@@ -29,7 +29,7 @@ index bcbf22e..fab74e0 100644
   #include <linux/module.h>
   #include <linux/init.h>
   #include <linux/slab.h>
-@@ -41,21 +46,27 @@
+@@ -41,21 +46,32 @@
   #include <linux/random.h>
   #include <linux/jiffies.h>
   
@@ -57,22 +57,15 @@ index bcbf22e..fab74e0 100644
  +#define pr_warn pr_warning
  +#endif
  +
++#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 36)
++static struct workqueue_struct *srp_wq;
++#define ib_wq srp_wq
++#endif
++
   MODULE_AUTHOR("Roland Dreier");
   MODULE_DESCRIPTION("InfiniBand SCSI RDMA Protocol initiator "
  		   "v" DRV_VERSION " (" DRV_RELDATE ")");
-@@ -675,7 +686,11 @@ err:
- 	if (target->state == SRP_TARGET_CONNECTING) {
- 		target->state = SRP_TARGET_DEAD;
- 		INIT_WORK(&target->work, srp_remove_work);
-+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 36)
- 		queue_work(ib_wq, &target->work);
-+#else
-+		schedule_work(&target->work);
-+#endif
- 	}
- 	spin_unlock_irq(&target->lock);
- 
-@@ -1254,7 +1269,50 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
+@@ -1254,7 +1270,50 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
  	}
   }
   
@@ -124,7 +117,7 @@ index bcbf22e..fab74e0 100644
   {
  	struct srp_target_port *target = host_to_target(shost);
  	struct srp_request *req;
-@@ -1822,6 +1880,9 @@ static struct scsi_host_template srp_template = {
+@@ -1822,6 +1881,9 @@ static struct scsi_host_template srp_template = {
  	.name				= "InfiniBand SRP initiator",
  	.proc_name			= DRV_NAME,
  	.info				= srp_target_info,
@@ -134,18 +127,32 @@ index bcbf22e..fab74e0 100644
  	.queuecommand			= srp_queuecommand,
  	.eh_abort_handler		= srp_abort,
  	.eh_device_reset_handler	= srp_reset_device,
-@@ -2412,7 +2473,11 @@ static void srp_remove_one(struct ib_device *device)
- 		 * started before we marked our target ports as
- 		 * removed, and any target port removal tasks.
- 		 */
-+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 36)
- 		flush_workqueue(ib_wq);
-+#else
-+		flush_scheduled_work();
+@@ -2491,11 +2553,25 @@ static int __init srp_init_module(void)
+ 		return ret;
+ 	}
+ 
++#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 36)
++	srp_wq = create_workqueue("srp");
++	if (IS_ERR(srp_wq)) {
++		ib_unregister_client(&srp_client);
++		ib_sa_unregister_client(&srp_sa_client);
++		class_unregister(&srp_class);
++		srp_release_transport(ib_srp_transport_template);
++		return PTR_ERR(srp_wq);
++	}
 +#endif
++
+ 	return 0;
+ }
  
- 		list_for_each_entry_safe(target, tmp_target,
- 					 &host->target_list, list) {
+ static void __exit srp_cleanup_module(void)
+ {
++#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 36)
++	destroy_workqueue(srp_wq);
++#endif
+ 	ib_unregister_client(&srp_client);
+ 	ib_sa_unregister_client(&srp_sa_client);
+ 	class_unregister(&srp_class);
 -- 
 1.7.9.5
 
-- 
1.7.10.4





Re: [ewg] Not ready for OFED-3.5 GA

2012-10-12 Thread Bart Van Assche

On 10/12/12 00:31, Rupert Dance wrote:

> We are still seeing some issues with the SRP module during our testing at
> UNH-IOL. We can now load the module and run the tests but whenever we unload
> the module it causes a system crash.
>
> I will update the bug with this information.


Hello Rupert,

Sorry but I'm not aware of any code in ib_srp that could make module 
unload crash. However, the patch I just posted on this mailing list 
should fix a deadlock that could be triggered by module unload on 
pre-2.6.36 kernels. Regarding the screenshot attached to 
http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2374: several 
essential pieces of information are missing from that screenshot. E.g. 
was the kernel reporting a BUG: or just an INFO: message that a task was 
blocked for more than 120s ? The call trace itself is missing too. Such 
information can be collected either via a serial console or by gathering 
kernel logs on a remote system via netconsole.


Bart.



[ewg] [PATCH for compat-rdma] ib_srp: Unbreak build on SLES 11 SP2

2012-10-04 Thread Bart Van Assche
Use the scsi/srp.h header from compat-rdma instead of the
scsi/srp.h header from the kernel-devel package provided by
the OS for the srp_cred_req and related structure definitions.
Also, undefine pr_fmt() before redefining it since the
compat-rdma build process includes linux/printk.h from the
command line.
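
The pr_fmt() point can be reproduced outside the kernel. A minimal sketch,
assuming a stand-in default definition in place of the one that
linux/printk.h supplies; format_msg() is a hypothetical helper, and only the
#undef/#define pair mirrors what the patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the default definition that an earlier include provides. */
#define pr_fmt(fmt) fmt

/*
 * Without the #undef, the redefinition below is an incompatible macro
 * redefinition and draws a compiler diagnostic.
 */
#undef pr_fmt
#define PFX "ib_srp: "
#define pr_fmt(fmt) PFX fmt

/* Hypothetical helper: format a message with the driver prefix into buf. */
static int format_msg(char *buf, size_t len, int value)
{
	return snprintf(buf, len, pr_fmt("value=%d"), value);
}
```

Because PFX and the format string are adjacent string literals, the
preprocessor output concatenates into a single prefixed format string.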

Signed-off-by: Bart Van Assche bvanass...@acm.org
---
 .../0025-ib_srp-Backport-to-older-kernels.patch|   61 +---
 1 file changed, 13 insertions(+), 48 deletions(-)

diff --git a/patches/0025-ib_srp-Backport-to-older-kernels.patch b/patches/0025-ib_srp-Backport-to-older-kernels.patch
index eb1945a..20edccf 100644
--- a/patches/0025-ib_srp-Backport-to-older-kernels.patch
+++ b/patches/0025-ib_srp-Backport-to-older-kernels.patch
@@ -15,8 +15,11 @@ diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib
 index bcbf22e..fab74e0 100644
 --- a/drivers/infiniband/ulp/srp/ib_srp.c
 +++ b/drivers/infiniband/ulp/srp/ib_srp.c
-@@ -32,6 +32,10 @@
+@@ -30,8 +30,13 @@
+  * SOFTWARE.
+  */
  
++#undef  pr_fmt
  #define pr_fmt(fmt) PFX fmt
  
 +#define DRV_NAME	"ib_srp"
@@ -26,7 +29,7 @@ index bcbf22e..fab74e0 100644
  #include <linux/module.h>
  #include <linux/init.h>
  #include <linux/slab.h>
-@@ -41,7 +45,11 @@
+@@ -41,21 +46,27 @@
  #include <linux/random.h>
  #include <linux/jiffies.h>
  
@@ -38,7 +41,10 @@ index bcbf22e..fab74e0 100644
  
  #include <scsi/scsi.h>
  #include <scsi/scsi_device.h>
-@@ -51,11 +59,54 @@
+ #include <scsi/scsi_dbg.h>
+-#include <scsi/srp.h>
++#include "../../../../include/scsi/srp.h"
+ #include <scsi/scsi_transport_srp.h>
  
  #include "ib_srp.h"
  
@@ -51,51 +57,10 @@ index bcbf22e..fab74e0 100644
 +#define pr_warn pr_warning
 +#endif
 +
-+#if !defined(RHEL_MAJOR) && LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 37) || \
-+	RHEL_MAJOR -0 < 6 || RHEL_MAJOR -0 == 6 && RHEL_MINOR -0 == 0
-+struct srp_cred_req {
-+	u8	opcode;
-+	u8	sol_not;
-+	u8	reserved[2];
-+	__be32	req_lim_delta;
-+	u64	tag;
-+};
-+
-+struct srp_cred_rsp {
-+	u8	opcode;
-+	u8	reserved[7];
-+	u64	tag;
-+};
-+
-+/*
-+ * The SRP spec defines the fixed portion of the AER_REQ structure to be
-+ * 36 bytes, so it needs to be packed to avoid having it padded to 40 bytes
-+ * on 64-bit architectures.
-+ */
-+struct srp_aer_req {
-+	u8	opcode;
-+	u8	sol_not;
-+	u8	reserved[2];
-+	__be32	req_lim_delta;
-+	u64	tag;
-+	u32	reserved2;
-+	__be64	lun;
-+	__be32	sense_data_len;
-+	u32	reserved3;
-+	u8	sense_data[0];
-+} __attribute__((packed));
-+
-+struct srp_aer_rsp {
-+	u8	opcode;
-+	u8	reserved[7];
-+	u64	tag;
-+};
-+#endif
-+
  MODULE_AUTHOR("Roland Dreier");
  MODULE_DESCRIPTION("InfiniBand SCSI RDMA Protocol initiator "
  		   "v" DRV_VERSION " (" DRV_RELDATE ")");
-@@ -675,7 +726,11 @@ err:
+@@ -675,7 +686,11 @@ err:
  	if (target->state == SRP_TARGET_CONNECTING) {
  		target->state = SRP_TARGET_DEAD;
  		INIT_WORK(&target->work, srp_remove_work);
@@ -107,7 +72,7 @@ index bcbf22e..fab74e0 100644
  	}
  	spin_unlock_irq(&target->lock);
  
-@@ -1254,7 +1309,50 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
+@@ -1254,7 +1269,50 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
  	}
  }
  
@@ -159,7 +124,7 @@ index bcbf22e..fab74e0 100644
  {
  	struct srp_target_port *target = host_to_target(shost);
  	struct srp_request *req;
-@@ -1822,6 +1920,9 @@ static struct scsi_host_template srp_template = {
+@@ -1822,6 +1880,9 @@ static struct scsi_host_template srp_template = {
  	.name				= "InfiniBand SRP initiator",
  	.proc_name			= DRV_NAME,
  	.info				= srp_target_info,
@@ -169,7 +134,7 @@ index bcbf22e..fab74e0 100644
  	.queuecommand			= srp_queuecommand,
  	.eh_abort_handler		= srp_abort,
  	.eh_device_reset_handler	= srp_reset_device,
-@@ -2412,7 +2513,11 @@ static void srp_remove_one(struct ib_device *device)
+@@ -2412,7 +2473,11 @@ static void srp_remove_one(struct ib_device *device)
  		 * started before we marked our target ports as
  		 * removed, and any target port removal tasks.
  		 */
-- 
1.7.10.4



[ewg] [PATCH for compat-rdma] /etc/init.d/openibd: Fix LSB header

2012-09-21 Thread Bart Van Assche
The Required-Stop tag lists the services that must remain available
while a service is being shut down. Avoid specifying the same
runlevel in both the Default-Start and the Default-Stop tag. Also,
the default start runlevels on Debian are 2, 3, 4 and 5.

See also http://refspecs.linuxfoundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/initscrcomconv.html.

Signed-off-by: Bart Van Assche <bvanass...@acm.org>
---
 compat-rdma.spec |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/compat-rdma.spec b/compat-rdma.spec
index 6a10f1c..073b941 100755
--- a/compat-rdma.spec
+++ b/compat-rdma.spec
@@ -359,9 +359,9 @@ if [ -f /etc/SuSE-release ]; then
 ### BEGIN INIT INFO
 # Provides:   openibd
 # Required-Start: $local_fs
-# Required-Stop: opensmd $openiscsi
+# Required-Stop: $local_fs
 # Default-Start:  2 3 5
-# Default-Stop: 0 1 2 6
+# Default-Stop: 0 1 4 6
 # Description:Activates/Deactivates InfiniBand Driver to \
 # start at boot time.
 ### END INIT INFO
@@ -386,9 +386,9 @@ if [ -f /etc/debian_version ]; then
 ### BEGIN INIT INFO
 # Provides:   openibd
 # Required-Start: $local_fs
-# Required-Stop: opensmd $openiscsi
-# Default-Start:  2 3 5
-# Default-Stop: 0 1 2 6
+# Required-Stop: $local_fs
+# Default-Start:  2 3 4 5
+# Default-Stop: 0 1 6
 # Description:Activates/Deactivates InfiniBand Driver to \
 # start at boot time.
 ### END INIT INFO
-- 
1.7.10.4



[ewg] [PATCH] IB/srp: Fix a race condition

2012-09-12 Thread Bart Van Assche
commit 220329916c72ee3d54ae7262b215a050f04a18fc upstream (v3.6).
commit 16ff8d53da93e20c57e88067a254d3106b7aed5f upstream (v3.5).

Avoid a crash caused by the scmnd->scsi_done(scmnd) call in
srp_process_rsp() being invoked with scsi_done == NULL.  This can
happen if a reply is received during or after a command abort.
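
A userspace analogue of the ownership hand-off may help when reviewing this
patch. It is only a sketch, not the driver code: struct cmd, struct request
and claim_cmd() are hypothetical stand-ins for struct scsi_cmnd, struct
srp_request and srp_claim_req(), and a pthread mutex plays the role of the
target spinlock. Whichever path claims the command first (response, abort or
reset) gets the pointer; every later caller gets NULL and must not complete
the command.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct cmd {
	int result;
};

struct request {
	pthread_mutex_t lock;	/* plays the role of target->lock */
	struct cmd *cmd;	/* plays the role of req->scmnd */
};

/*
 * Claim ownership of req->cmd. If expected is NULL, claim whatever command
 * is attached; otherwise claim it only if it is the expected command.
 * Returns the claimed command, or NULL if there was nothing to claim.
 */
static struct cmd *claim_cmd(struct request *req, struct cmd *expected)
{
	struct cmd *claimed = NULL;

	pthread_mutex_lock(&req->lock);
	if (req->cmd && (!expected || req->cmd == expected)) {
		claimed = req->cmd;
		req->cmd = NULL;	/* nobody else can claim it now */
	}
	pthread_mutex_unlock(&req->lock);

	return claimed;
}
```

Because the pointer is cleared under the lock, the completion callback can be
invoked at most once per command, which is the property the fix depends on.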

Reported-by: Joseph Glanville <joseph.glanvi...@orionvm.com.au>
Reference: http://marc.info/?l=linux-rdma&m=134314367801595
Acked-by: David Dillow <dillo...@ornl.gov>
Signed-off-by: Bart Van Assche <bvanass...@acm.org>
Signed-off-by: Roland Dreier <rol...@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |   87 +--
 1 files changed, 63 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index bcbf22e..1b5b0c7 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -586,24 +586,62 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
 			scmnd->sc_data_direction);
 }
 
-static void srp_remove_req(struct srp_target_port *target,
-			   struct srp_request *req, s32 req_lim_delta)
+/**
+ * srp_claim_req - Take ownership of the scmnd associated with a request.
+ * @target: SRP target port.
+ * @req: SRP request.
+ * @scmnd: If NULL, take ownership of @req->scmnd. If not NULL, only take
+ *         ownership of @req->scmnd if it equals @scmnd.
+ *
+ * Return value:
+ * Either NULL or a pointer to the SCSI command the caller became owner of.
+ */
+static struct scsi_cmnd *srp_claim_req(struct srp_target_port *target,
+				       struct srp_request *req,
+				       struct scsi_cmnd *scmnd)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&target->lock, flags);
+	if (!scmnd) {
+		scmnd = req->scmnd;
+		req->scmnd = NULL;
+	} else if (req->scmnd == scmnd) {
+		req->scmnd = NULL;
+	} else {
+		scmnd = NULL;
+	}
+	spin_unlock_irqrestore(&target->lock, flags);
+
+	return scmnd;
+}
+
+/**
+ * srp_free_req() - Unmap data and add request to the free request list.
+ */
+static void srp_free_req(struct srp_target_port *target,
+			 struct srp_request *req, struct scsi_cmnd *scmnd,
+			 s32 req_lim_delta)
 {
 	unsigned long flags;
 
-	srp_unmap_data(req->scmnd, target, req);
+	srp_unmap_data(scmnd, target, req);
+
 	spin_lock_irqsave(&target->lock, flags);
 	target->req_lim += req_lim_delta;
-	req->scmnd = NULL;
 	list_add_tail(&req->list, &target->free_reqs);
 	spin_unlock_irqrestore(&target->lock, flags);
 }
 
 static void srp_reset_req(struct srp_target_port *target, struct srp_request *req)
 {
-	req->scmnd->result = DID_RESET << 16;
-	req->scmnd->scsi_done(req->scmnd);
-	srp_remove_req(target, req, 0);
+	struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL);
+
+	if (scmnd) {
+		scmnd->result = DID_RESET << 16;
+		scmnd->scsi_done(scmnd);
+		srp_free_req(target, req, scmnd, 0);
+	}
 }
 
 static int srp_reconnect_target(struct srp_target_port *target)
@@ -1073,11 +,18 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 		complete(&target->tsk_mgmt_done);
 	} else {
 		req = &target->req_ring[rsp->tag];
-		scmnd = req->scmnd;
-		if (!scmnd)
+		scmnd = srp_claim_req(target, req, NULL);
+		if (!scmnd) {
 			shost_printk(KERN_ERR, target->scsi_host,
 				     "Null scmnd for RSP w/tag %016llx\n",
 				     (unsigned long long) rsp->tag);
+
+			spin_lock_irqsave(&target->lock, flags);
+			target->req_lim += be32_to_cpu(rsp->req_lim_delta);
+			spin_unlock_irqrestore(&target->lock, flags);
+
+			return;
+		}
 		scmnd->result = rsp->status;
 
 		if (rsp->flags & SRP_RSP_FLAG_SNSVALID) {
@@ -1092,7 +1137,9 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 		else if (rsp->flags & (SRP_RSP_FLAG_DIOVER | SRP_RSP_FLAG_DIUNDER))
 			scsi_set_resid(scmnd, be32_to_cpu(rsp->data_in_res_cnt));
 
-		srp_remove_req(target, req, be32_to_cpu(rsp->req_lim_delta));
+		srp_free_req(target, req, scmnd,
+			     be32_to_cpu(rsp->req_lim_delta));
+
 		scmnd->host_scribble = NULL;
 		scmnd->scsi_done(scmnd);
 	}
@@ -1631,25 +1678,17 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 {
 	struct srp_target_port *target = host_to_target(scmnd->device->host);
struct srp_request

[ewg] [PATCH for OFED-3.5] ib_srp: Backport to older kernels

2012-09-12 Thread Bart Van Assche
This patch has been tested on RHEL 6.0, RHEL 6.1, RHEL 6.2, RHEL 6.3
and Ubuntu 10.04.
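
The backport selects code paths at compile time by comparing
LINUX_VERSION_CODE with KERNEL_VERSION(a, b, c). As a rough userspace
illustration of how such comparisons evaluate, the sketch below re-creates
the encoding used by kernels of that era (major, minor and patch level packed
into one integer). KVER and use_linux_atomic_h() are names made up for this
example; they are not part of the patch or of the kernel.

```c
#include <assert.h>

/*
 * Same packing as the kernel's KERNEL_VERSION() macro of that era:
 * (major << 16) + (minor << 8) + patchlevel.
 */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper mirroring the check that guards the atomic.h
 * include in this patch: nonzero means "use <linux/atomic.h>", zero
 * means "fall back to <asm/atomic.h>".
 */
static int use_linux_atomic_h(int version_code)
{
	return version_code >= KVER(2, 6, 37);
}
```

With this encoding a 2.6.32 based kernel compares lower than 2.6.37, so it
takes the fallback branch, while a 3.2 kernel takes the first branch. Distro
kernels that backport features need the extra RHEL_MAJOR / RHEL_MINOR tests
because their version code alone does not reveal which APIs they carry.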

Signed-off-by: Bart Van Assche <bvanass...@acm.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c |  111 ++-
 1 files changed, 108 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 1b5b0c7..523fe57 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -32,6 +32,10 @@
 
 #define pr_fmt(fmt) PFX fmt
 
+#define DRV_NAME	"ib_srp"
+#define PFX		DRV_NAME ": "
+
+#include <linux/version.h>
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/slab.h>
@@ -41,7 +45,11 @@
 #include <linux/random.h>
 #include <linux/jiffies.h>
 
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 37)
 #include <linux/atomic.h>
+#else
+#include <asm/atomic.h>
+#endif
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_device.h>
@@ -51,11 +59,54 @@
 
 #include "ib_srp.h"
 
-#define DRV_NAME	"ib_srp"
-#define PFX		DRV_NAME ": "
 #define DRV_VERSION	"0.2"
 #define DRV_RELDATE	"November 1, 2005"
 
+#ifndef pr_warn
+#define pr_warn pr_warning
+#endif
+
+#if !defined(RHEL_MAJOR) && LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 37) || \
+	RHEL_MAJOR -0 < 6 || RHEL_MAJOR -0 == 6 && RHEL_MINOR -0 == 0
+struct srp_cred_req {
+	u8	opcode;
+	u8	sol_not;
+	u8	reserved[2];
+	__be32	req_lim_delta;
+	u64	tag;
+};
+
+struct srp_cred_rsp {
+	u8	opcode;
+	u8	reserved[7];
+	u64	tag;
+};
+
+/*
+ * The SRP spec defines the fixed portion of the AER_REQ structure to be
+ * 36 bytes, so it needs to be packed to avoid having it padded to 40 bytes
+ * on 64-bit architectures.
+ */
+struct srp_aer_req {
+	u8	opcode;
+	u8	sol_not;
+	u8	reserved[2];
+	__be32	req_lim_delta;
+	u64	tag;
+	u32	reserved2;
+	__be64	lun;
+	__be32	sense_data_len;
+	u32	reserved3;
+	u8	sense_data[0];
+} __attribute__((packed));
+
+struct srp_aer_rsp {
+	u8	opcode;
+	u8	reserved[7];
+	u64	tag;
+};
+#endif
+
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("InfiniBand SCSI RDMA Protocol initiator "
 		   "v" DRV_VERSION " (" DRV_RELDATE ")");
@@ -713,7 +764,11 @@ err:
 	if (target->state == SRP_TARGET_CONNECTING) {
 		target->state = SRP_TARGET_DEAD;
 		INIT_WORK(&target->work, srp_remove_work);
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 36)
 		queue_work(ib_wq, &target->work);
+#else
+		schedule_work(&target->work);
+#endif
 	}
 	spin_unlock_irq(&target->lock);
 
@@ -1301,7 +1356,50 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
 	}
 }
 
-static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 36)
+/*
+ * Kernel with host lock push-down patch. See also upstream commit
+ * f281233d3eba15fb225d21ae2e228fd4553d824a.
+ */
+#define SRP_QUEUECOMMAND srp_queuecommand
+#elif defined(RHEL_MAJOR) && RHEL_MAJOR -0 == 6 && RHEL_MINOR -0 >= 2
+/*
+ * Kernel with lockless SCSI command dispatching enabled.
+ * See also the RHEL 6.2 release notes (http://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/6.2_Release_Notes/index.html).
+ */
+static int srp_queuecommand_wrk(struct Scsi_Host *shost,
+				struct scsi_cmnd *scmnd);
+static int srp_queuecommand(struct scsi_cmnd *scmnd,
+			    void (*done)(struct scsi_cmnd *))
+{
+	scmnd->scsi_done = done;
+	return srp_queuecommand_wrk(scmnd->device->host, scmnd);
+}
+#define SRP_QUEUECOMMAND srp_queuecommand_wrk
+#else
+/*
+ * Kernel that invokes srp_queuecommand with the SCSI host lock held.
+ */
+static int srp_queuecommand_wrk(struct Scsi_Host *shost,
+				struct scsi_cmnd *scmnd);
+static int srp_queuecommand(struct scsi_cmnd *scmnd,
+			    void (*done)(struct scsi_cmnd *))
+{
+	struct Scsi_Host *shost = scmnd->device->host;
+	int res;
+
+	spin_unlock_irq(shost->host_lock);
+
+	scmnd->scsi_done = done;
+	res = srp_queuecommand_wrk(shost, scmnd);
+
+	spin_lock_irq(shost->host_lock);
+	return res;
+}
+#define SRP_QUEUECOMMAND srp_queuecommand_wrk
+#endif
+
+static int SRP_QUEUECOMMAND(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 {
 	struct srp_target_port *target = host_to_target(shost);
 	struct srp_request *req;
@@ -1861,6 +1959,9 @@ static struct scsi_host_template srp_template = {
 	.name				= "InfiniBand SRP initiator",
 	.proc_name			= DRV_NAME,
 	.info				= srp_target_info,
+#if defined(RHEL_MAJOR) && RHEL_MAJOR -0 == 6 && RHEL_MINOR -0 >= 2
+	.lockless			= true,
+#endif

Re: [ewg] crash system when srp targets are loaded

2012-09-09 Thread Bart Van Assche
As far as I can see the ib_srp source code in OFED-3.5 is identical to
what's present in the upstream Linux kernel 3.2. That means that the
current OFED 3.5 ib_srp driver can't work on RHEL 6.x since the RHEL 6.x
SCSI LLD API is different from the Linux kernel 3.2 SCSI LLD API.

A quote from the Linux kernel 3.2 include file scsi/scsi_host.h:

int (* queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);

A quote from the RHEL 6.3 include file scsi/scsi_host.h:

/*
 * [ ... ]
 * NOTE: The 'lockless' flag in the scsi_host_template indicates
 * whether the host_lock should be held before calling this
 * routine. Also, the lockless queuecommand, as implemented
 * upstream has a different signature.
 */
int (* queuecommand)(struct scsi_cmnd *,
 void (*done)(struct scsi_cmnd *));
[ ... ]
/*
 * True if we are calling queuecommand without the
 * host_lock held. LLDs may want to do this for
 * performance reasons.
 */
unsigned lockless:1;
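
To make the mismatch concrete, here is a small userspace sketch of how a
driver written against the two-argument upstream signature can be exposed
through the older (scsi_cmnd, done) signature. The struct definitions are
simplified stand-ins invented for this example (the real kernel reaches the
host via scmnd->device->host), and new_queuecommand(), old_queuecommand() and
count_done() are hypothetical names, so please read it as an illustration
only.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel types. */
struct Scsi_Host {
	int unused;
};

struct scsi_cmnd {
	struct Scsi_Host *host;
	void (*scsi_done)(struct scsi_cmnd *);
	int result;
};

/* Driver entry point written against the upstream 3.2 style signature. */
static int new_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
{
	(void)shost;
	scmnd->result = 0;
	scmnd->scsi_done(scmnd);	/* complete immediately for the demo */
	return 0;
}

/* Adapter that offers the older signature to the SCSI core. */
static int old_queuecommand(struct scsi_cmnd *scmnd,
			    void (*done)(struct scsi_cmnd *))
{
	scmnd->scsi_done = done;	/* the old API passes the callback separately */
	return new_queuecommand(scmnd->host, scmnd);
}

/* Completion callback used to observe that the command was finished. */
static int completions;

static void count_done(struct scsi_cmnd *scmnd)
{
	(void)scmnd;
	completions++;
}
```

Without such an adapter the SCSI core calls the driver with arguments in the
wrong positions, which is consistent with the crash reported in this thread.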

Bart.

On 09/09/12 04:36, Rupert Dance wrote:
> Hello,
>
> UNH-IOL has verified the same problem using SL 6.2 and SL 6.3. There have
> been two bugs filed in OFA Bugzilla:
>
> Bug 2374: http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2374
>
> Bug 2378: http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2378
>
> It is important to the industry and the end users that this be resolved
> before OFED 3.5 RC2 is released. We also have an OFA Interop debug event
> scheduled for a month from now that will depend on OFED 3.X
>
> Thank you,
>
> Rupert Dance
>
>
> -----Original Message-----
> From: ewg-boun...@lists.openfabrics.org
> [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Shuichi Ihara
> Sent: Saturday, September 08, 2012 10:28 PM
> To: ewg@lists.openfabrics.org
> Subject: [ewg] crash system when srp targets are loaded
>
>
> Hi,
>
> I'm testing SRP initiator with OFED-3.5 (OFED-3.5-20120831-0600.tgz), but
> once SRP targets are loaded, the system crashes. This is reproducible.
> I filed this problem on bugzilla a week ago and submitted backtrace logs as
> well.
> http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2374
>
> Could anyone have a look at it and advise, please?
>
> Thanks
> Ihara



Re: [ewg] [RFC] – Proposal for new process for OFED releases

2011-12-23 Thread Bart Van Assche
On Thu, Dec 1, 2011 at 7:53 PM, Tziporet Koren <tzipo...@mellanox.com> wrote:

> We propose a new process for the OFED releases starting from next OFED
> release:
> - OFED content will be the relevant kernel.org modules and user space
> released packages
> - OFED will offer only backports to the distros (no fixes)
> - OFED package will be used for easy installation of all packages in a
> friendly manner
>
> The main goals of this change:
> 1. Ensure OFED and the upstream kernel are the same
> 2. Provide customers a way to use the new features in latest kernels on
> existing distros
> 3. OFED qualification will contribute to the stability of the upstream code
>
> We think that at this point of the RDMA technology maturity this is the
> right way to go.
> In this way OFED is not conflicting with the kernel or the distros, and
> still provides value for early adopters of new features.
>
> Versions:
> We suggest that the OFED version will be the same as kernel.org
> For example, for kernel 3.2 the OFED release would be OFED-3.2.
> This would make it easy for people to associate the OFED code with the
> corresponding kernel.org code.
>
> Some open questions that we should consider:
> - How to handle experimental features?
> - Need to follow up kernel stable releases if bug fixes are relevant to
> OFA modules
> - Should we have a release for every kernel release (I think yes)
> - What should we do with modules like SDP that are not in kernel?
>
> Comments and responses are welcome


Personally I would appreciate it a lot if everything that is not a kernel
module would be moved out of the kernel-ib RPM. That would make it a lot
easier to use the OFED user space components in combination with upstream
or distro-provided kernel IB modules. The relevant files are:

# rpm -ql kernel-ib | grep -v lib/modules
/etc/infiniband
/etc/infiniband/connectx.conf
/etc/infiniband/info
/etc/infiniband/openib.conf
/etc/init.d/openibd
/etc/modprobe.d/ib_ipoib.conf
/etc/modprobe.d/mlx4_en.conf
/etc/udev/rules.d/90-ib.rules
/sbin/connectx_port_config
/sbin/sysctl_perf_tuning
/usr/bin/ibdev2netdev

Bart.

Re: [ewg] [RFC] – Proposal for new process for OFED releases

2011-12-02 Thread Bart Van Assche
On Fri, Dec 2, 2011 at 7:12 PM, Christoph Lameter <c...@linux.com> wrote:
> What were the issues that prevented the merging of the SDP
> implementation?

At least AF_INET_SDP - there might have been other issues. See e.g.
http://lkml.org/lkml/2006/3/6/70.

Bart.


Re: [ewg] [RFC] – Proposal for new process for OFED releases

2011-12-01 Thread Bart Van Assche
On Fri, Dec 2, 2011 at 1:04 AM, Hefty, Sean <sean.he...@intel.com> wrote:
> > - What should we do with modules like SDP that are not in kernel?
>
> Either remove them or carry them forward as experimental features.

What I expect is that reworking the SDP implementation such that it can
be included upstream will take less work in the long term than
maintaining it as out-of-tree code.

Bart.


[ewg] Re: [ofa-general] EWG (OFED) meeting minutes for Mar 12 09

2009-03-16 Thread Bart Van Assche
On Sun, Mar 15, 2009 at 9:42 PM, Roland Dreier <rdre...@cisco.com> wrote:
>  > As far as I know SDP is currently implemented in OFED as a separate
>  > address family (AF_INET_SDP). This is an unfortunate approach because:
>  > * This approach will never be accepted upstream by the Linux kernel
>  >   maintainers.
>
>  > One possible approach is to extend the BSD socket API with support for
>  > multiple IP stack implementations. This can be implemented by e.g.
>  > adding a system call msocket() that has four parameters -- the three
>  > classic socket() parameters and a fourth parameter for the IP stack.
>
> I'm not sure why you make this assertion... a new protocol family for
> SOCK_STREAM sockets seems far more likely to be accepted upstream than
> adding support for multiple networking stacks.  For example RDS was
> recently queued for 2.6.30 as a new protocol family.  I have a hard time
> imagining multiple IP stacks being accepted upstream on the other hand.

I would like to clarify that I do not know the opinion of the network
subsystem maintainers about msocket(), but that I posted this
information only as an example of an alternative for AF_INET_SDP.

Bart.


[ewg] Re: [ofa-general] EWG (OFED) meeting minutes for Mar 12 09

2009-03-15 Thread Bart Van Assche
On Sat, Mar 14, 2009 at 7:05 PM, Tziporet Koren <tzipo...@mellanox.co.il> wrote:
> 1348    major   RHEL 5  am...@mellanox.co.il    Sdp sockets doesnt closed
> after programs end

As far as I know SDP is currently implemented in OFED as a separate
address family (AF_INET_SDP). This is an unfortunate approach because:
* This approach will never be accepted upstream by the Linux kernel maintainers.
* The approach of preloading a library in order to make
applications use SDP without modifying these applications is error
prone -- it is really hard to make such a library 100% correct.

When will work start on an SDP API that is acceptable for inclusion in
the mainstream Linux kernel?

One possible approach is to extend the BSD socket API with support for
multiple IP stack implementations. This can be implemented by e.g.
adding a system call msocket() that has four parameters -- the three
classic socket() parameters and a fourth parameter for the IP stack.
See also 
http://wiki.virtualsquare.org/index.php/Multi_stack_support_for_Berkeley_Sockets
or http://www.fosdem.org/2009/schedule/events/ipn_msockets for a more
complete explanation.

A previous discussion about SDP can be found here:
http://www.mail-archive.com/net...@vger.kernel.org/msg08546.html

Bart.


[ewg] Re: [Scst-devel] SRP performance in SCST

2008-03-10 Thread Bart Van Assche
On Mon, Mar 10, 2008 at 3:20 PM, Doron Shoham [EMAIL PROTECTED] wrote:
> I used the following commands:
> for write commands:
> sgp_dd of=/dev/sg1 if=/dev/zero bs=512 bpt=<from 2-512> thr=8 time=1 count=20M
> for read commands:
> sgp_dd if=/dev/sg1 of=/dev/null bs=512 bpt=<from 2-512> thr=8 time=1 count=20M
>
> And the results are (all in MB/s):
> bpt:    2     4     8     16    32    64    128   256   512
> Write   22.5  41.4  80.6  160   275   443   570   740   810
> Read    42    80    170   313   550   940   1200  1312  1300
>
> What can be the reason for such difference between the bandwidth of Read and
> Write commands?

Can you please try to use direct I/O instead of buffered I/O?

Bart.


[ewg] Re: [Scst-devel] SRP performance in SCST

2008-03-10 Thread Bart Van Assche
On Mon, Mar 10, 2008 at 3:42 PM, Doron Shoham [EMAIL PROTECTED] wrote:
>
> Bart Van Assche wrote:
> > Can you please try to use direct I/O instead of buffered I/O ?
> >
> > Bart.
>
> I have tried to use dio but when the bpt was 128 (or higher) I got the
> following error:
>
> echo 1 > /proc/scsi/sg/allow_dio
>
> sgp_dd if=/dev/sg1 of=/dev/null bs=512 bpt=128 thr=8 time=1 count=5M dio=1
> time to transfer data was 2.236290 secs, 1200.36 MB/sec
> 5242880+0 records in
> 5242880+0 records out
> >> Direct IO requested but incomplete 40960 times

I'm not familiar with the /proc/scsi/sg/allow_dio entry. I was
referring to options like iflag=direct or oflag=direct for dd --
all performance tests I have performed in the past were with dd and
xdd. Without direct I/O, read tests use Linux' buffering. If the
amount of data being read is less than the available RAM, buffered I/O
tests only measure how fast Linux can copy data in RAM.
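
For readers who prefer to test from a small program instead of dd, the sketch
below shows the same idea in C on Linux: open the device or file with
O_DIRECT and read into a suitably aligned buffer, so that the page cache is
bypassed. The helper names dio_alloc() and dio_read() and the 4096-byte
alignment are assumptions made for this example; the alignment a given device
or file system actually requires may differ, and some file systems reject
O_DIRECT altogether.

```c
#define _GNU_SOURCE		/* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define DIO_ALIGN 4096		/* assumed alignment, see the note above */

/* Allocate a buffer aligned for O_DIRECT transfers; NULL on failure. */
static void *dio_alloc(size_t size)
{
	void *buf = NULL;

	if (posix_memalign(&buf, DIO_ALIGN, size) != 0)
		return NULL;
	return buf;
}

/*
 * Read up to size bytes from the start of path without going through the
 * page cache. size should be a multiple of DIO_ALIGN. Returns the number
 * of bytes read, or -1 on error (including a rejected O_DIRECT open).
 */
static ssize_t dio_read(const char *path, void *buf, size_t size)
{
	ssize_t n;
	int fd = open(path, O_RDONLY | O_DIRECT);

	if (fd < 0)
		return -1;
	n = read(fd, buf, size);
	close(fd);
	return n;
}
```

Timing a loop around dio_read() measures the storage path, whereas the same
loop without O_DIRECT may mostly measure memory copies once the data is
cached.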

Bart.


[ewg] Re: [Stgt-devel] [ANNOUNCE] open iSCSI over iSER target RPM is available

2008-02-05 Thread Bart Van Assche
On Feb 5, 2008 3:41 PM, Erez Zilber [EMAIL PROTECTED] wrote:
> stgt (SCSI target) is an open-source framework for storage target
> drivers. It supports iSCSI over iSER among other storage target drivers.
>
> Voltaire added a git tree for stgt that will be added to OFED 1.4:
> http://www2.openfabrics.org/git/?p=~dorons/tgt.git;a=summary
>
> Until OFED 1.4 gets released, it is possible to install the stgt RPM on
> top of OFED 1.3. For more details about how to install and use stgt,
> please refer to https://wiki.openfabrics.org/tiki-index.php?page=ISER-target
>
> Some performance numbers that were measured by OSC (using SDR cards):
>
> * READ: 920 MB/sec
> * WRITE: 850 MB/sec
>
> We hope to have DDR measurements numbers soon.

Hello Erez,

Can you please post more information about how these numbers were
obtained (test program and configuration parameters)?

Bart Van Assche.


[ewg] Re: Distributing the SRP target source code

2008-01-29 Thread Bart Van Assche
On Jan 28, 2008 6:07 PM, Vu Pham [EMAIL PROTECTED] wrote:
>
> In the srpt readme file, the prerequisite is to install SCST BEFORE
> ofed-1.3 or, as Vlad warns, to recompile ofed if you
> install scst after installing ofed.

This is what will happen if someone installs Linux kernel headers +
SCST + OFED in this order:
1. Linux kernel headers matching the running kernel are installed in
/usr/src/linux-.../include or equivalent, and a symbolic link to the
kernel headers is created in /lib/modules/$(uname -r)/build/include.
2. By building and installing SCST, SCST modules are installed in
/lib/modules/$(uname -r)/extra and SCST kernel headers are installed
in /usr/local/include, among others SCST's scsi_tgt.h header file, the
interface between SCST and mid-level SCSI drivers.
3. Next, OFED kernel modules are being built. During this process the
SRP target module is compiled with the header file
drivers/infiniband/ulp/srpt/scsi_tgt.h. The version of this file
distributed with OFED 1.3 is incompatible with the one distributed
with the latest version of SCST. In other words, the kernel will
probably crash as soon as someone starts using the SRP target module,
even if the official build procedure outlined above was followed.
Including /usr/local/include/scsi_tgt.h in the SRP target module is
not an option -- kernel modules must not include userspace headers,
except for the well known exceptions like stdarg.h.

All this trouble can be avoided by distributing the SRP target code
with SCST instead of with OFED.

Furthermore, all kernel headers that define inter-module interfaces
should reside in <kernel source root dir>/include/<subdir>/...  The
SRP target breaks this convention by having a private copy of an
inter-module interface in a local directory
(drivers/infiniband/ulp/srpt/scsi_tgt.h).


> here is one of the reasons srpt is part of ofed not scst:
>
> SCST is GPL
> ofed + srpt is GPL or BSD

This is not an issue -- if you have a look at the Linux kernel, you
will see that all source files are licensed under at least the GPLv2
and some source files are licensed under GPLv2 + one or more other
licenses, e.g. BSD.

Bart.


[ewg] Re: Distributing the SRP target source code

2008-01-29 Thread Bart Van Assche
On Jan 29, 2008 9:20 AM, Vu Pham [EMAIL PROTECTED] wrote:
> There are two include paths. The first one is
> /usr/local/include/scst and the second one is
> drivers/infiniband/ulp/srpt. Therefore, building srpt in
> ofed will always use the /usr/local/include/scst path first
> and if you already install scst then there won't be any problem
>
> As you already know /usr/local/include/scst/scsi_tgt.h is
> not userspace header. SCST is not part of kernel yet; srpt
> is also not part of kernel

Please remove drivers/infiniband/ulp/srpt/scsi_tgt.h and scst_const.h
from the OFED distribution. It's better that the SRP target doesn't
build if SCST was not yet installed instead of having to experience a
kernel crash when OFED was built before SCST.

> > All this trouble can be avoided by distributing the SRP target code
> > with SCST instead of with OFED.
>
> The same problem would appear if someone uses different ofed
> versions

Personally I never use OFED kernel modules built from the OFED source
distribution but instead I use the InfiniBand kernel modules included
with the Linux distribution in use. This guarantees consistency
between the kernel core and the InfiniBand kernel modules. And
whenever I use the SRP target code, I copy it to the kernel source
tree and build it from there instead of relying on the OFED kernel
build process.

Bart.


[ewg] Re: Distributing the SRP target source code

2008-01-29 Thread Bart Van Assche
On Jan 29, 2008 11:09 AM, Vu Pham [EMAIL PROTECTED] wrote:
> Bart Van Assche wrote:
> > Please remove drivers/infiniband/ulp/srpt/scsi_tgt.h and scst_const.h
> > from the OFED distribution. It's better that the SRP target doesn't
> > build if SCST was not yet installed instead of having to experience a
> > kernel crash when OFED was built before SCST.
>
> It's clear from both ofed/srpt readme and Vlad's SCST big
> fat warning. You either build scst before ofed or
> rebuild ofed

After having installed SCST and OFED 1.3 on a system there will be two
incompatible versions present on that system of SCST's header file
scsi_tgt.h. This is confusing and questionable.

Furthermore, the SRP target will only build correctly if
/usr/local/include is in the include path before . (current
directory). Relying on the order of directories in the include path is
a very questionable practice too.

Bart.


[ewg] Re: [ofa-general] OFED 1.3 RC2 release is available

2008-01-28 Thread Bart Van Assche
On Jan 16, 2008 5:22 PM, Tziporet Koren [EMAIL PROTECTED] wrote:
>
> Hi,
> OFED 1.3 RC2 release is available on
> http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc2.tgz
>
> To get BUILD_ID run ofed_info
>
> Please report any issues in bugzilla https://bugs.openfabrics.org/
> The RC3 release is expected on January 30

Apparently OFED 1.3 includes SRP target support? Although I consider
SRP target support to be a very valuable contribution, it should not be
included in the OFED distribution but in the SCST distribution. The
reason is that the SRP target relies on SCST interfaces that can
potentially change with each new SCST release. Consider e.g. the
scsi_tgt.h header file, which defines the interface between SCST core
and SCST mid-level modules. The version of this file included with
git://git.openfabrics.org/~vu/ofed_1_3.git (0.9.6-pre3) is
incompatible with the latest scsi_tgt.h file from the SCST project
(0.9.6-rc1). This may cause kernel crashes for OFED 1.3 SRP target
users who combine OFED 1.3 with the latest SCST version.

Sorry for this late notice.

Bart.


[ewg] Re: [Scst-devel] [ofa-general] OFED 1.3 RC2 release is available

2008-01-28 Thread Bart Van Assche
On Jan 28, 2008 12:47 PM, Vladislav Bolkhovitin [EMAIL PROTECTED] wrote:
>
> Bart Van Assche wrote:
> > Apparently OFED 1.3 includes SRP target support ? Although I consider
> > SRP target support as a very valuable contribution, it should not be
> > included in the OFED distribution but in the SCST distribution. The
> > reason is that the SRP target relies on SCST interfaces that can
> > potentially change with each new SCST release. Consider e.g. the
> > scsi_tgt.h header file, which defines the interface between SCST core
> > and SCST mid-level modules. The version of this file included with
> > git://git.openfabrics.org/~vu/ofed_1_3.git (0.9.6-pre3) is
> > incompatible with the latest scsi_tgt.h file from the SCST project
> > (0.9.6-rc1). This may cause kernel crashes for OFED 1.3 SRP target
> > users who combine OFED 1.3 with the latest SCST version.
>
> No it won't crash, it will refuse to run. I've recently added in SCST
> protection against attempts running mixed SCST and target driver versions.
>
> BTW, there is a
>
> *!! BIG FAT WARNING ABOUT MIXED VERSIONS PROBLEM !!*

Hello Vladislav,

I did not test the above scenario -- what I wrote was the result of
source reading. It is very good that interface versions are checked
inside SCST before mid-level drivers are used. Even with interface
version checking in place, my opinion is that the SRP target code
should be included in the SCST project and not in the OFED project.
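
For readers unfamiliar with this kind of protection, the following userspace
sketch shows the general idea of an interface version check at registration
time. It is not SCST's actual code: struct target_driver,
register_target_driver() and CORE_INTERFACE_VERSION are hypothetical names
chosen for the example, and only the two version strings are taken from this
thread.

```c
#include <assert.h>
#include <string.h>

/* Interface version the (hypothetical) core was built with. */
#define CORE_INTERFACE_VERSION "0.9.6-rc1"

struct target_driver {
	const char *name;
	/* Version of the interface header the driver was compiled against. */
	const char *interface_version;
};

/*
 * Returns 0 if the driver may be used, -1 if it was built against a
 * different interface version and must be refused.
 */
static int register_target_driver(const struct target_driver *drv)
{
	if (strcmp(drv->interface_version, CORE_INTERFACE_VERSION) != 0)
		return -1;	/* refuse to run instead of risking a crash */
	return 0;
}
```

A driver compiled against the 0.9.6-pre3 header would be rejected by such a
check, which turns a potential crash into a load-time error.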

Bart.


[ewg] Re: [Scst-devel] [ofa-general] OFED 1.3 RC2 release is available

2008-01-28 Thread Bart Van Assche
On Jan 28, 2008 1:23 PM, Vladislav Bolkhovitin [EMAIL PROTECTED] wrote:
> But that won't change anything. The problem will be simply inverted:
> there will be a possibility to run a SRPT driver compiled for a wrong
> OFED version.

The SRP target driver indeed relies on several OFED kernel headers,
but these kernel headers have been included in the mainstream Linux
kernel for some time. When I need OFED kernel modules, I use the modules
included with the mainstream Linux kernel and not those included with
the OFED distribution.

With regard to distribution of kernel code that is newer than the most
recent mainstream Linux kernel I prefer the model followed by the
realtime community: do not distribute the whole source tree but
publish an up-to-date patch every time a new kernel version is
released. See also
http://www.kernel.org/pub/linux/kernel/projects/rt/. Keeping such a
patch up to date is more work but is a lot easier to review than
having to compare source trees.

Bart.