Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread Vu Pham
Though, now that I've unpacked it -- I don't think it is OK for dev_loss_tmo to be off, but fast IO to be on? That drops another conditional. The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine in my tests. An I/O failure was detected shortly after the cable to the

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread Bart Van Assche
On 07/08/13 19:26, Vu Pham wrote: After running a cable pull test on two local IB links for several hours, I/Os got stuck. Further commands such as multipath -ll or fdisk -l also got stuck and never returned. Here are the stack dumps for the srp-x kernel threads. I'll run with #DEBUG to get more debug info on scsi

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread David Dillow
On Thu, 2013-07-04 at 10:01 +0200, Bart Van Assche wrote: On 07/03/13 20:57, David Dillow wrote: And I'm getting the strong sense that the answer to my question about fast_io_fail_tmo >= 0 when dev_loss_tmo is that we should not allow that combination, even if it doesn't break the kernel. If

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-04 Thread Bart Van Assche
On 07/03/13 20:57, David Dillow wrote: And I'm getting the strong sense that the answer to my question about fast_io_fail_tmo >= 0 when dev_loss_tmo is that we should not allow that combination, even if it doesn't break the kernel. If it doesn't make sense, there is no reason to create an

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-04 Thread Bart Van Assche
On 07/04/13 10:01, Bart Van Assche wrote: On 07/03/13 20:57, David Dillow wrote: And I'm getting the strong sense that the answer to my question about fast_io_fail_tmo >= 0 when dev_loss_tmo is that we should not allow that combination, even if it doesn't break the kernel. If it doesn't make

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread David Dillow
On Wed, 2013-07-03 at 14:54 +0200, Bart Van Assche wrote: +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo) +{ + return (fast_io_fail_tmo < 0 || dev_loss_tmo < 0 || + fast_io_fail_tmo < dev_loss_tmo) && + fast_io_fail_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT && +
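
For readers following the timeout discussion, here is a minimal user-space sketch of the part of the quoted check that is visible in the snippet above. It assumes the comparison operators lost in the archive were <, <= and &&, that a negative value means "off", and that SCSI_DEVICE_BLOCK_MAX_TIMEOUT is 600 seconds as in the kernel headers. The function name tmo_combination_valid is hypothetical, and whatever conditions follow the truncation point in the original patch are not reproduced here.

```c
#include <errno.h>

/* Assumption: same value as the kernel constant (600 seconds). */
#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT 600

/*
 * Hypothetical stand-in for the visible part of the quoted srp_tmo_valid().
 * A negative timeout is treated as "off". Returns 0 when the combination
 * is accepted and -EINVAL otherwise.
 */
int tmo_combination_valid(int fast_io_fail_tmo, int dev_loss_tmo)
{
    /* Either timeout is off, or fast I/O fail fires before device loss. */
    int ordering_ok = fast_io_fail_tmo < 0 || dev_loss_tmo < 0 ||
                      fast_io_fail_tmo < dev_loss_tmo;

    /* The fast I/O fail timeout may not exceed the device block limit. */
    int bound_ok = fast_io_fail_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT;

    return (ordering_ok && bound_ok) ? 0 : -EINVAL;
}
```

Under these assumptions the check accepts fast_io_fail_tmo = 0 with dev_loss_tmo off, which appears to be the combination questioned elsewhere in this thread.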

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Bart Van Assche
On 07/03/13 19:27, David Dillow wrote: On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote: The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine in my tests. An I/O failure was detected shortly after the cable to the target was pulled. I/O resumed shortly after the cable

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread David Dillow
On Wed, 2013-07-03 at 20:24 +0200, Bart Van Assche wrote: On 07/03/13 19:27, David Dillow wrote: On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote: The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine in my tests. An I/O failure was detected shortly after the cable

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Vu Pham
David Dillow wrote: On Wed, 2013-07-03 at 20:24 +0200, Bart Van Assche wrote: On 07/03/13 19:27, David Dillow wrote: On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote: The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine in my tests. An I/O failure was