Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread Vu Pham


  

Though, now that I've unpacked it -- I don't think it is OK for
dev_loss_tmo to be off, but fast IO to be on? That drops another
conditional.
  
The combination of dev_loss_tmo off and reconnect_delay  0 worked fine 
in my tests. An I/O failure was detected shortly after the cable to the 
target was pulled. I/O resumed shortly after the cable to the target was 
reinserted.



Perhaps I don't understand your answer -- I'm asking about dev_loss_tmo
 0, and fast_io_fail_tmo = 0. The other transports do not allow this
scenario, and I'm asking if it makes sense for SRP to allow it.

But now that you mention reconnect_delay, what is the meaning of that
when it is negative? That's not in the documentation. And should it be
considered in srp_tmo_valid() -- are there values of reconnect_delay
that cause problems?

I'm starting to get a bit concerned about this patch -- can you, Vu, and
Sebastian comment on the testing you have done?

  

Hello Bart,

After running cable pull test on two local IB links for several hrs, 
I/Os got stuck.

Further commands multipath -ll or fdisk -l got stuck and never return
Here are the stack dump for srp-x kernel threads.
I'll run with #DEBUG to get more debug info on scsi host  rport

-vu


srp_threads.txt.tgz
Description: application/compressed


Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread Bart Van Assche
On 07/08/13 19:26, Vu Pham wrote:
 After running cable pull test on two local IB links for several hrs, 
 I/Os got stuck.
 Further commands multipath -ll or fdisk -l got stuck and never return
 Here are the stack dump for srp-x kernel threads.
 I'll run with #DEBUG to get more debug info on scsi host  rport

Hello Vu,

I had a quick look at the stack dump that was attached to your e-mail.
It shows that scsi_execute_req() hangs in blk_execute_rq(). It would be
appreciated if you could continue your tests with the kernel patch below
applied on top of v3 of this patch series. This patch should avoid that
a transport layer error that occurs after device removal has started can
cause the SCSI device state to change into blocked. This patch also
causes such TL errors to fail I/O quickly (scsi_host_alloc() zero-
initializes the memory it allocates so no explicit initialization of the
deleted variable is necessary).

Thanks,

Bart.

diff --git a/drivers/scsi/scsi_transport_srp.c 
b/drivers/scsi/scsi_transport_srp.c
index 1b9ebd5..1bb7c63 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -522,6 +522,12 @@ void srp_start_tl_fail_timers(struct srp_rport *rport)
int fast_io_fail_tmo, dev_loss_tmo, delay;
 
mutex_lock(rport-mutex);
+   if (rport-deleted) {
+   srp_rport_set_state(rport, SRP_RPORT_FAIL_FAST);
+   scsi_target_unblock(shost-shost_gendev,
+   SDEV_TRANSPORT_OFFLINE);
+   goto unlock;
+   }
delay = rport-reconnect_delay;
fast_io_fail_tmo = rport-fast_io_fail_tmo;
dev_loss_tmo = rport-dev_loss_tmo;
@@ -542,6 +548,7 @@ void srp_start_tl_fail_timers(struct srp_rport *rport)
if (dev_loss_tmo = 0)
queue_delayed_work(system_long_wq, rport-dev_loss_work,
   1UL * dev_loss_tmo * HZ);
+unlock:
mutex_unlock(rport-mutex);
 }
 EXPORT_SYMBOL(srp_start_tl_fail_timers);
@@ -730,6 +737,7 @@ void srp_rport_del(struct srp_rport *rport)
mutex_lock(rport-mutex);
if (rport-state == SRP_RPORT_BLOCKED)
__rport_fast_io_fail_timedout(rport);
+   rport-deleted = true;
mutex_unlock(rport-mutex);
 
put_device(dev);
diff --git a/include/scsi/scsi_transport_srp.h 
b/include/scsi/scsi_transport_srp.h
index fbcc985..a4addcf 100644
--- a/include/scsi/scsi_transport_srp.h
+++ b/include/scsi/scsi_transport_srp.h
@@ -54,6 +54,7 @@ struct srp_rport {
int dev_loss_tmo;
struct delayed_work fast_io_fail_work;
struct delayed_work dev_loss_work;
+   booldeleted;
 };
 
 struct srp_function_template {

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread David Dillow
On Thu, 2013-07-04 at 10:01 +0200, Bart Van Assche wrote:
 On 07/03/13 20:57, David Dillow wrote:
  And I'm getting the strong sense that the answer to my question about
  fast_io_fail_tmo = 0 when dev_loss_tmo is that we should not allow that
  combination, even if it doesn't break the kernel. If it doesn't make
  sense, there is no reason to create an opportunity for user confusion.
 
 Let's take a step back. I think we agree that the only combinations of 
 timeout parameters that are useful are those combinations that guarantee 
 that SCSI commands will be finished in a reasonable time and also that 
 allow multipath to detect failed paths. The first requirement comes down 
 to limiting the value fast_io_fail_tmo can be set to. The second 
 requirement means that either reconnect_delay or fast_io_fail_tmo must 
 be set (please note that a reconnect failure changes the state of all 
 involved SCSI devices into SDEV_TRANSPORT_OFFLINE). So how about 
 modifying srp_tmo_valid() as follows:
 * Add an argument called reconnect_delay.
 * Add the following code in that function:
  if (reconnect_delay  0  fast_io_fail_tmo  0  dev_loss_tmo  0)
  return -EINVAL;

I think this sounds reasonable; I need to make sure I understand the new
behaviors of the code to be sure.

I'm also concerned about Vu's bug report at this late stage; I'll be
watching for its resolution -- hopefully in time for inclusion.

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-04 Thread Bart Van Assche

On 07/03/13 20:57, David Dillow wrote:

And I'm getting the strong sense that the answer to my question about
fast_io_fail_tmo = 0 when dev_loss_tmo is that we should not allow that
combination, even if it doesn't break the kernel. If it doesn't make
sense, there is no reason to create an opportunity for user confusion.


Let's take a step back. I think we agree that the only combinations of 
timeout parameters that are useful are those combinations that guarantee 
that SCSI commands will be finished in a reasonable time and also that 
allow multipath to detect failed paths. The first requirement comes down 
to limiting the value fast_io_fail_tmo can be set to. The second 
requirement means that either reconnect_delay or fast_io_fail_tmo must 
be set (please note that a reconnect failure changes the state of all 
involved SCSI devices into SDEV_TRANSPORT_OFFLINE). So how about 
modifying srp_tmo_valid() as follows:

* Add an argument called reconnect_delay.
* Add the following code in that function:
if (reconnect_delay  0  fast_io_fail_tmo  0  dev_loss_tmo  0)
return -EINVAL;

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-04 Thread Bart Van Assche

On 07/04/13 10:01, Bart Van Assche wrote:

On 07/03/13 20:57, David Dillow wrote:

And I'm getting the strong sense that the answer to my question about
fast_io_fail_tmo = 0 when dev_loss_tmo is that we should not allow that
combination, even if it doesn't break the kernel. If it doesn't make
sense, there is no reason to create an opportunity for user confusion.


Let's take a step back. I think we agree that the only combinations of
timeout parameters that are useful are those combinations that guarantee
that SCSI commands will be finished in a reasonable time and also that
allow multipath to detect failed paths. The first requirement comes down
to limiting the value fast_io_fail_tmo can be set to. The second
requirement means that either reconnect_delay or fast_io_fail_tmo must
be set (please note that a reconnect failure changes the state of all
involved SCSI devices into SDEV_TRANSPORT_OFFLINE). So how about
modifying srp_tmo_valid() as follows:
* Add an argument called reconnect_delay.
* Add the following code in that function:
 if (reconnect_delay  0  fast_io_fail_tmo  0  dev_loss_tmo  0)
 return -EINVAL;


(replying to my own e-mail)

A small correction to what I wrote above: a reconnect failure only 
causes the SCSI device state to be changed into SDEV_TRANSPORT_OFFLINE 
if the fast_io_fail mechanism has been disabled.


Bart.

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Bart Van Assche
Add the necessary functions in the SRP transport module to allow
an SRP initiator driver to implement transport layer error handling
similar to the functionality already provided by the FC transport
layer. This includes:
- Support for implementing fast_io_fail_tmo, the time that should
  elapse after having detected a transport layer problem and
  before failing I/O.
- Support for implementing dev_loss_tmo, the time that should
  elapse after having detected a transport layer problem and
  before removing a remote port.

Signed-off-by: Bart Van Assche bvanass...@acm.org
Cc: Roland Dreier rol...@purestorage.com
Cc: James Bottomley jbottom...@parallels.com
Cc: David Dillow dillo...@ornl.gov
Cc: Vu Pham v...@mellanox.com
Cc: Sebastian Riemer sebastian.rie...@profitbricks.com
---
 Documentation/ABI/stable/sysfs-transport-srp |   38 +++
 drivers/scsi/scsi_transport_srp.c|  468 +-
 include/scsi/scsi_transport_srp.h|   62 +++-
 3 files changed, 565 insertions(+), 3 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-transport-srp 
b/Documentation/ABI/stable/sysfs-transport-srp
index b36fb0d..52babb9 100644
--- a/Documentation/ABI/stable/sysfs-transport-srp
+++ b/Documentation/ABI/stable/sysfs-transport-srp
@@ -5,6 +5,24 @@ Contact:   linux-scsi@vger.kernel.org, 
linux-r...@vger.kernel.org
 Description:   Instructs an SRP initiator to disconnect from a target and to
remove all LUNs imported from that target.
 
+What:  /sys/class/srp_remote_ports/port-h:n/dev_loss_tmo
+Date:  October 1, 2013
+KernelVersion: 3.11
+Contact:   linux-scsi@vger.kernel.org, linux-r...@vger.kernel.org
+Description:   Number of seconds the SCSI layer will wait after a transport
+   layer error has been observed before removing a target port.
+   Zero means immediate removal. Setting this attribute to off
+   will disable this behavior.
+
+What:  /sys/class/srp_remote_ports/port-h:n/fast_io_fail_tmo
+Date:  October 1, 2013
+KernelVersion: 3.11
+Contact:   linux-scsi@vger.kernel.org, linux-r...@vger.kernel.org
+Description:   Number of seconds the SCSI layer will wait after a transport
+   layer error has been observed before failing I/O. Zero means
+   failing I/O immediately. Setting this attribute to off will
+   disable this behavior.
+
 What:  /sys/class/srp_remote_ports/port-h:n/port_id
 Date:  June 27, 2007
 KernelVersion: 2.6.24
@@ -12,8 +30,28 @@ Contact: linux-scsi@vger.kernel.org
 Description:   16-byte local SRP port identifier in hexadecimal format. An
example: 4c:49:4e:55:58:20:56:49:4f:00:00:00:00:00:00:00.
 
+What:  /sys/class/srp_remote_ports/port-h:n/reconnect_delay
+Date:  October 1, 2013
+KernelVersion: 3.11
+Contact:   linux-scsi@vger.kernel.org, linux-r...@vger.kernel.org
+Description:   Number of seconds the SCSI layer will wait after a reconnect
+   attempt failed before retrying.
+
 What:  /sys/class/srp_remote_ports/port-h:n/roles
 Date:  June 27, 2007
 KernelVersion: 2.6.24
 Contact:   linux-scsi@vger.kernel.org
 Description:   Role of the remote port. Either SRP Initiator or SRP Target.
+
+What:  /sys/class/srp_remote_ports/port-h:n/state
+Date:  October 1, 2013
+KernelVersion: 3.11
+Contact:   linux-scsi@vger.kernel.org, linux-r...@vger.kernel.org
+Description:   State of the transport layer used for communication with the
+   remote port. running if the transport layer is operational;
+   blocked if a transport layer error has been encountered but
+   the fail_io_fast_tmo timer has not yet fired; fail-fast
+   after the fail_io_fast_tmo timer has fired and before the
+   dev_loss_tmo timer has fired; lost after the
+   dev_loss_tmo timer has fired and before the port is finally
+   removed.
diff --git a/drivers/scsi/scsi_transport_srp.c 
b/drivers/scsi/scsi_transport_srp.c
index f7ba94a..1b9ebd5 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -24,12 +24,15 @@
 #include linux/err.h
 #include linux/slab.h
 #include linux/string.h
+#include linux/delay.h
 
 #include scsi/scsi.h
+#include scsi/scsi_cmnd.h
 #include scsi/scsi_device.h
 #include scsi/scsi_host.h
 #include scsi/scsi_transport.h
 #include scsi/scsi_transport_srp.h
+#include scsi_priv.h
 #include scsi_transport_srp_internal.h
 
 struct srp_host_attrs {
@@ -38,7 +41,7 @@ struct srp_host_attrs {
 #define to_srp_host_attrs(host)((struct srp_host_attrs 
*)(host)-shost_data)
 
 #define SRP_HOST_ATTRS 0
-#define SRP_RPORT_ATTRS 3
+#define SRP_RPORT_ATTRS 8
 
 struct srp_internal {
struct scsi_transport_template t;
@@ -54,6 +57,26 @@ struct srp_internal {
 
 #definedev_to_rport(d) container_of(d, struct srp_rport, dev)
 

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread David Dillow
On Wed, 2013-07-03 at 14:54 +0200, Bart Van Assche wrote:
 +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
 +{
 + return (fast_io_fail_tmo  0 || dev_loss_tmo  0 ||
 + fast_io_fail_tmo  dev_loss_tmo) 
 + fast_io_fail_tmo = SCSI_DEVICE_BLOCK_MAX_TIMEOUT 
 + dev_loss_tmo  LONG_MAX / HZ ? 0 : -EINVAL;
 +}
 +EXPORT_SYMBOL_GPL(srp_tmo_valid);

This would have been more readable:

int srp_tmo_valid(int fast_io_fail_tmp, int dev_loss_tmo)
{
/* Fast IO fail must be off, or no greater than the max timeout */
if (fast_io_fail_tmo  SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
return -EINVAL;

/* Device timeout must be off, or fit into jiffies */
if (dev_loss_tmo = LONG_MAX / HZ)
return -EINVAL;

/* Fast IO must trigger before device loss, or one of the
 * timeouts must be disabled.
 */
if (fast_io_fail_tmo  0 || dev_loss_tmo  0)
return 0;
if (fast_io_fail  dev_loss_tmo)
return 0;

return -EINVAL;  
}

Though, now that I've unpacked it -- I don't think it is OK for
dev_loss_tmo to be off, but fast IO to be on? That drops another
conditional.

Also, FC caps dev_loss_tmo at SCSI_DEVICE_BLOCK_MAX_TIMEOUT if
fail_io_fast_tmo is off; I agree with your reasoning about leaving it
unlimited if fast fail is on, but does that still hold if it is off?



--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Bart Van Assche

On 07/03/13 19:27, David Dillow wrote:

On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote:

The combination of dev_loss_tmo off and reconnect_delay  0 worked fine
in my tests. An I/O failure was detected shortly after the cable to the
target was pulled. I/O resumed shortly after the cable to the target was
reinserted.


Perhaps I don't understand your answer -- I'm asking about dev_loss_tmo
 0, and fast_io_fail_tmo = 0. The other transports do not allow this
scenario, and I'm asking if it makes sense for SRP to allow it.

But now that you mention reconnect_delay, what is the meaning of that
when it is negative? That's not in the documentation. And should it be
considered in srp_tmo_valid() -- are there values of reconnect_delay
that cause problems?


None of the combinations that can be configured from user space can 
bring the kernel in trouble. If reconnect_delay = 0 that means that the 
time-based reconnect mechanism is disabled.



I'm starting to get a bit concerned about this patch -- can you, Vu, and
Sebastian comment on the testing you have done?


All combinations of reconnect_delay, fast_io_fail_tmo and dev_loss_tmo 
that result in different behavior have been tested.



Also, FC caps dev_loss_tmo at SCSI_DEVICE_BLOCK_MAX_TIMEOUT if
fail_io_fast_tmo is off; I agree with your reasoning about leaving it
unlimited if fast fail is on, but does that still hold if it is off?


I think setting dev_loss_tmo to a large value only makes sense if the
value of reconnect_delay is not too large. Setting both to a large value
would result in slow recovery after a transport layer failure has been
corrected.


So you agree it should be capped? I can't tell from your response.


Not all combinations of reconnect_delay / fail_io_fast_tmo / 
dev_loss_tmo result in useful behavior. It is up to the user to choose a 
meaningful combination.


Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread David Dillow
On Wed, 2013-07-03 at 20:24 +0200, Bart Van Assche wrote:
 On 07/03/13 19:27, David Dillow wrote:
  On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote:
  The combination of dev_loss_tmo off and reconnect_delay  0 worked fine
  in my tests. An I/O failure was detected shortly after the cable to the
  target was pulled. I/O resumed shortly after the cable to the target was
  reinserted.
 
  Perhaps I don't understand your answer -- I'm asking about dev_loss_tmo
   0, and fast_io_fail_tmo = 0. The other transports do not allow this
  scenario, and I'm asking if it makes sense for SRP to allow it.
 
  But now that you mention reconnect_delay, what is the meaning of that
  when it is negative? That's not in the documentation. And should it be
  considered in srp_tmo_valid() -- are there values of reconnect_delay
  that cause problems?
 
 None of the combinations that can be configured from user space can 
 bring the kernel in trouble. If reconnect_delay = 0 that means that the 
 time-based reconnect mechanism is disabled.

Then it should use the same semantics as the other attributes, and have
the user store off to turn it off.

And I'm getting the strong sense that the answer to my question about
fast_io_fail_tmo = 0 when dev_loss_tmo is that we should not allow that
combination, even if it doesn't break the kernel. If it doesn't make
sense, there is no reason to create an opportunity for user confusion.
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Vu Pham

David Dillow wrote:

On Wed, 2013-07-03 at 20:24 +0200, Bart Van Assche wrote:
  

On 07/03/13 19:27, David Dillow wrote:


On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote:
  

The combination of dev_loss_tmo off and reconnect_delay  0 worked fine
in my tests. An I/O failure was detected shortly after the cable to the
target was pulled. I/O resumed shortly after the cable to the target was
reinserted.


Perhaps I don't understand your answer -- I'm asking about dev_loss_tmo
 0, and fast_io_fail_tmo = 0. The other transports do not allow this
scenario, and I'm asking if it makes sense for SRP to allow it.

But now that you mention reconnect_delay, what is the meaning of that
when it is negative? That's not in the documentation. And should it be
considered in srp_tmo_valid() -- are there values of reconnect_delay
that cause problems?
  
None of the combinations that can be configured from user space can 
bring the kernel in trouble. If reconnect_delay = 0 that means that the 
time-based reconnect mechanism is disabled.



Then it should use the same semantics as the other attributes, and have
the user store off to turn it off.

And I'm getting the strong sense that the answer to my question about
fast_io_fail_tmo = 0 when dev_loss_tmo is that we should not allow that
combination, even if it doesn't break the kernel. If it doesn't make
sense, there is no reason to create an opportunity for user confusion.
  

Hello Dave,

when dev_loss_tmo expired, srp not only removes the rport but also 
removes the associated scsi_host.
One may wish to set fast_io_fail_tmo =0 for I/Os to fail-over fast to 
other paths, and dev_loss_tmo off to keep the scsi_host around until the 
target coming back.


-vu
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html