Re: [PATCH 7/8] scsi: Add 'eh_deadline' to limit SCSI EH runtime
Hi, Hannes:

On 10/23/2013 04:51 PM, Hannes Reinecke wrote:
This patch adds an 'eh_deadline' sysfs attribute to the scsi host which limits the overall runtime of the SCSI EH.

As you know, adding it to the scsi host means the interface is also added to SATA and USB controllers. I think users may find the three points below confusing:

1) The sysfs interface should not exist for SATA controllers, because it will not work under SATA's own EH policy;
2) The sysfs interface should not exist for USB controllers, because users probably do not expect EH recovery to apply to USB ones;
3) Users may not want to affect SATA/USB controllers (or even their sysfs interfaces) when setting the global value through the scsi module parameter.

I have been thinking about how to mask SATA/USB controllers, but haven't found a good solution so far; masking them in each controller driver does not seem clever enough. Do you have any ideas about the above?

Thanks,
Ren
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] scsi: improved eh timeout handler
Hi, Hannes:

I'm sorry, but I don't know why you didn't consider my earlier patch below, which not only lowers the minimum valid value of 'eh_deadline' to '0' for your former patchset but also includes some fixes for this patchset:

http://www.spinics.net/lists/linux-scsi/msg69361.html

If you would prefer that I post the minimum value issue as an improvement once your patchset is accepted, I have no problem ;-)

On 10/31/2013 09:02 PM, Hannes Reinecke wrote:
+void
+scmd_eh_abort_handler(struct work_struct *work)
+{
+	struct scsi_cmnd *scmd =
+		container_of(work, struct scsi_cmnd, abort_work.work);
+	struct scsi_device *sdev = scmd->device;
+	unsigned long flags;
+	int rtn;
+
+	spin_lock_irqsave(sdev->host->host_lock, flags);
+	if (scsi_host_eh_past_deadline(sdev->host)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p eh timeout, not aborting\n",
+				    scmd));
+	} else {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "aborting command %p\n", scmd));
+		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
+		if (rtn == SUCCESS) {
+			scmd->result |= DID_TIME_OUT << 16;
+			if (!scsi_noretry_cmd(scmd) &&
+			    (++scmd->retries <= scmd->allowed)) {

scsi_host_eh_past_deadline() should also be checked here, before starting a long-running retry.

+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p retry "
+						    "aborted command\n", scmd));
+				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			} else {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p finish "
+						    "aborted command\n", scmd));
+				scsi_finish_command(scmd);
+			}
+			return;
+		}
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p abort failed, rtn %d\n",
+				    scmd, rtn));
+	}
+
+	if (scsi_eh_scmd_add(scmd, 0)) {

scsi_finish_command() should be invoked if scsi_eh_scmd_add() returns failure.

Thanks,
Ren

+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_WARNING, scmd,
+				    "scmd %p terminate "
+				    "aborted command\n", scmd));
+		scmd->result |= DID_TIME_OUT << 16;
+		scsi_finish_command(scmd);
+	}
+}
Re: [PATCH 7/8] scsi: Add 'eh_deadline' to limit SCSI EH runtime
Hi, Hannes:

On 10/23/2013 04:51 PM, Hannes Reinecke wrote:
This patch adds an 'eh_deadline' sysfs attribute to the scsi host which limits the overall runtime of the SCSI EH.

The 'eh_deadline' value is stored in the now obsolete field 'resetting'. When a command is failed the start time of the EH is stored in 'last_reset'. If the overall runtime of the SCSI EH is longer than last_reset + eh_deadline, the EH is short-circuited and falls through to issue a host reset only.

Signed-off-by: Hannes Reinecke
---
 drivers/scsi/hosts.c | 7 +++
 drivers/scsi/scsi_error.c | 130 +++---
 drivers/scsi/scsi_sysfs.c | 37 +
 include/scsi/scsi_host.h | 4 +-
 4 files changed, 170 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index df0c3c7..f334859 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -316,6 +316,12 @@ static void scsi_host_dev_release(struct device *dev)
 	kfree(shost);
 }

+static unsigned int shost_eh_deadline;
+
+module_param_named(eh_deadline, shost_eh_deadline, uint, S_IRUGO|S_IWUSR);
+MODULE_PARM_DESC(eh_deadline,
+		 "SCSI EH timeout in seconds (should be between 1 and 2^32-1)");

Sorry, but didn't you consider '0' as the minimum valid value, as we discussed on Oct 9?

Thanks,
Ren

+	int eh_deadline;
 	unsigned long last_reset;

 	/*
Re: [PATCH] scsi: Set the minimum valid value of 'eh_deadline' as 0
Hi, Ewan, Hannes:

On 10/09/2013 08:28 PM, Ewan Milne wrote:
On Wed, 2013-10-09 at 15:43 +0800, Ren Mingxin wrote:
The current minimum valid value of 'eh_deadline' is 1s, which means EH can be cut short no earlier than 1 second after a command fails or times out. So if we want to skip EH steps ASAP, we still have to wait until the first EH step has finished. If the first EH step takes a long time, this wait is excruciating. It is therefore necessary to accept 0 as the minimum valid value for 'eh_deadline'.

According to my test, which also had Hannes' patchset 'New EH command timeout handler' applied, the minimum IO time improves from 73s (eh_deadline = 1) to 43s (eh_deadline = 0) when commands are timed out by disabling RSCN and the target port.

Another thing: scsi_finish_command() should be invoked if scsi_eh_scmd_add() returns failure - let EH finish those commands.

Signed-off-by: Ren Mingxin
---
 drivers/scsi/hosts.c | 14 +++---
 drivers/scsi/scsi_error.c | 40 +++-
 drivers/scsi/scsi_sysfs.c | 36 +---
 include/scsi/scsi_host.h | 2 +-
 4 files changed, 64 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index f334859..e84123a 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -316,11 +316,11 @@ static void scsi_host_dev_release(struct device *dev)
 	kfree(shost);
 }

-static unsigned int shost_eh_deadline;
+static unsigned int shost_eh_deadline = -1;

This should probably be "static int shost_eh_deadline = -1;". And the range tests in scsi_host_alloc() and store_shost_eh_deadline() below should probably use INT_MAX rather than UINT_MAX.

The maximum value is decreased then. Hannes, agree?

 module_param_named(eh_deadline, shost_eh_deadline, uint, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(eh_deadline,
-		 "SCSI EH timeout in seconds (should be between 1 and 2^32-1)");
+		 "SCSI EH timeout in seconds (should be between 0 and 2^32-1)");

And the description above should be changed to:

+		 "SCSI EH timeout in seconds (should be between 0 and (2^31-1)/HZ)");

 static struct device_type scsi_host_type = {
 	.name = "scsi_host",
@@ -394,7 +394,15 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->unchecked_isa_dma = sht->unchecked_isa_dma;
 	shost->use_clustering = sht->use_clustering;
 	shost->ordered_tag = sht->ordered_tag;
-	shost->eh_deadline = shost_eh_deadline * HZ;
+	if (shost_eh_deadline == -1)
+		shost->eh_deadline = -1;
+	else if ((ulong) shost_eh_deadline * HZ > UINT_MAX) {
+		printk(KERN_WARNING "scsi%d: eh_deadline %u exceeds the "
+		       "maximum, setting to %u\n",
+		       shost->host_no, shost_eh_deadline, UINT_MAX / HZ);
+		shost->eh_deadline = UINT_MAX / HZ * HZ;

Just use "shost->eh_deadline = INT_MAX" here, leave off the "/ HZ * HZ".

Nod.

Thanks,
Ren
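The range handling discussed in this message (a signed parameter, -1 for "disabled", and a clamp to INT_MAX once seconds are converted to ticks) can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: SKETCH_HZ and eh_deadline_secs_to_ticks() are hypothetical names, the tick rate is fixed at 1000 for the example, and treating every negative input as "disabled" is an assumption the thread does not settle.

```c
#include <limits.h>

/* Hypothetical tick rate for this sketch; the kernel's HZ is a build-time constant. */
#define SKETCH_HZ 1000

/*
 * Convert an eh_deadline given in seconds to ticks.
 * Negative input (assumption: any negative, not only -1) means "disabled"
 * and is reported as -1. A tick count that would not fit in an int is
 * clamped to INT_MAX, as suggested in the review above.
 */
static int eh_deadline_secs_to_ticks(int secs)
{
	if (secs < 0)
		return -1;
	/* Widen before multiplying so the comparison itself cannot overflow. */
	if ((long long)secs * SKETCH_HZ > INT_MAX)
		return INT_MAX;
	return secs * SKETCH_HZ;
}
```

With a tick rate of 1000, the largest value that converts without clamping is INT_MAX / 1000 seconds, which is roughly what the proposed "(2^31-1)/HZ" wording in the parameter description expresses.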
[PATCH] scsi: Set the minimum valid value of 'eh_deadline' as 0
The current minimum valid value of 'eh_deadline' is 1s, which means EH can be cut short no earlier than 1 second after a command fails or times out. So if we want to skip EH steps ASAP, we still have to wait until the first EH step has finished. If the first EH step takes a long time, this wait is excruciating. It is therefore necessary to accept 0 as the minimum valid value for 'eh_deadline'.

According to my test, which also had Hannes' patchset 'New EH command timeout handler' applied, the minimum IO time improves from 73s (eh_deadline = 1) to 43s (eh_deadline = 0) when commands are timed out by disabling RSCN and the target port.

Another thing: scsi_finish_command() should be invoked if scsi_eh_scmd_add() returns failure - let EH finish those commands.

Signed-off-by: Ren Mingxin
---
 drivers/scsi/hosts.c | 14 +++---
 drivers/scsi/scsi_error.c | 40 +++-
 drivers/scsi/scsi_sysfs.c | 36 +---
 include/scsi/scsi_host.h | 2 +-
 4 files changed, 64 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index f334859..e84123a 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -316,11 +316,11 @@ static void scsi_host_dev_release(struct device *dev)
 	kfree(shost);
 }

-static unsigned int shost_eh_deadline;
+static unsigned int shost_eh_deadline = -1;

 module_param_named(eh_deadline, shost_eh_deadline, uint, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(eh_deadline,
-		 "SCSI EH timeout in seconds (should be between 1 and 2^32-1)");
+		 "SCSI EH timeout in seconds (should be between 0 and 2^32-1)");

 static struct device_type scsi_host_type = {
 	.name = "scsi_host",
@@ -394,7 +394,15 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->unchecked_isa_dma = sht->unchecked_isa_dma;
 	shost->use_clustering = sht->use_clustering;
 	shost->ordered_tag = sht->ordered_tag;
-	shost->eh_deadline = shost_eh_deadline * HZ;
+	if (shost_eh_deadline == -1)
+		shost->eh_deadline = -1;
+	else if ((ulong) shost_eh_deadline * HZ > UINT_MAX) {
+		printk(KERN_WARNING "scsi%d: eh_deadline %u exceeds the "
+		       "maximum, setting to %u\n",
+		       shost->host_no, shost_eh_deadline, UINT_MAX / HZ);
+		shost->eh_deadline = UINT_MAX / HZ * HZ;
+	} else
+		shost->eh_deadline = shost_eh_deadline * HZ;

 	if (sht->supported_mode == MODE_UNKNOWN)
 		/* means we didn't set it ... default to INITIATOR */
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index adb4cbe..c2f9431 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -90,7 +90,7 @@ EXPORT_SYMBOL_GPL(scsi_schedule_eh);

 static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
 {
-	if (!shost->last_reset || !shost->eh_deadline)
+	if (!shost->last_reset || shost->eh_deadline == -1)
 		return 0;

 	if (time_before(jiffies,
@@ -127,29 +127,43 @@ scmd_eh_abort_handler(struct work_struct *work)
 		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
 		if (rtn == SUCCESS) {
 			scmd->result |= DID_TIME_OUT << 16;
-			if (!scsi_noretry_cmd(scmd) &&
+			spin_lock_irqsave(sdev->host->host_lock, flags);
+			if (scsi_host_eh_past_deadline(sdev->host)) {
+				spin_unlock_irqrestore(sdev->host->host_lock,
+						       flags);
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_INFO, scmd,
+						    "scmd %p eh timeout, "
+						    "not retrying aborted "
+						    "command\n", scmd));
+			} else if (!scsi_noretry_cmd(scmd) &&
 			    (++scmd->retries <= scmd->allowed)) {
+				spin_unlock_irqrestore(sdev->host->host_lock,
+						       flags);
 				SCSI_LOG_ERROR_RECOVERY(3,
 					scmd_printk(KERN_WARNING, scmd,
 						    "scmd %p retry "
 						    "aborted command\n", scmd));
 				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+
Re: [PATCH 2/3] scsi: improved eh timeout handler
Hi, Hannes:

On 09/02/2013 07:58 PM, Hannes Reinecke wrote:
+scmd_eh_abort_handler(struct work_struct *work)
+{
+	struct scsi_cmnd *scmd =
+		container_of(work, struct scsi_cmnd, abort_work.work);
+	struct scsi_device *sdev = scmd->device;
+	unsigned long flags;
+	int rtn;
+
+	spin_lock_irqsave(sdev->host->host_lock, flags);
+	if (scsi_host_eh_past_deadline(sdev->host)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "scmd %p eh timeout, not aborting\n", scmd));
+	} else {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "aborting command %p\n", scmd));
+		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
+		if (rtn == SUCCESS) {
+			scmd->result |= DID_TIME_OUT << 16;
+			if (!scsi_noretry_cmd(scmd) &&
+			    (++scmd->retries <= scmd->allowed)) {

I think scsi_host_eh_past_deadline() should be checked here, like:

-			if (!scsi_noretry_cmd(scmd) &&
+			if (!scsi_host_eh_past_deadline(sdev->host) &&
+			    !scsi_noretry_cmd(scmd) &&

According to my test, a single retry takes 30 seconds. If eh_deadline has already been reached, we can stop EH here without waiting for the long retry.

Thanks,
Ren

+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p retry "
+						    "aborted command\n", scmd));
+				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			} else {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "scmd %p finish "
+						    "aborted command\n", scmd));
+				scsi_finish_command(scmd);
+			}
+			return;
+		}
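The decision proposed in this message (skip the retry once the EH deadline has passed) can be modelled in a few lines of userspace C. This is a rough sketch, not the kernel function: struct cmd_sketch, enum abort_outcome and after_successful_abort() are hypothetical names, and the deadline state is passed in as a plain boolean instead of being read under the host lock.

```c
#include <stdbool.h>

/* Hypothetical outcome of a successful abort in this sketch. */
enum abort_outcome { ABORT_RETRY, ABORT_FINISH };

/* Hypothetical stand-in for the few scsi_cmnd fields the decision uses. */
struct cmd_sketch {
	int retries;   /* retries performed so far */
	int allowed;   /* maximum retries permitted */
	bool noretry;  /* models scsi_noretry_cmd() returning true */
};

/*
 * Decide what to do after a command was aborted successfully.
 * The deadline test comes first, so a command is never requeued
 * (and its retry counter never bumped) once the deadline has passed.
 */
static enum abort_outcome
after_successful_abort(struct cmd_sketch *c, bool past_deadline)
{
	if (!past_deadline && !c->noretry && ++c->retries <= c->allowed)
		return ABORT_RETRY;
	return ABORT_FINISH;
}
```

Because of short-circuit evaluation, putting the deadline test first also means the retry counter is left untouched when the deadline has expired, which matches the ordering in the suggested diff.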
Re: [PATCH 7/7] scsi: Add 'eh_deadline' to limit SCSI EH runtime
Hi, Hannes:

On 07/01/2013 02:50 PM, Hannes Reinecke wrote:
This patch adds an 'eh_deadline' sysfs attribute to the scsi host which limits the overall runtime of the SCSI EH.

The 'eh_deadline' value is stored in the now obsolete field 'resetting'. When a command is failed the start time of the EH is stored in 'last_reset'. If the overall runtime of the SCSI EH is longer than last_reset + eh_deadline, the EH is short-circuited and falls through to issue a host reset only.

There is one thing I noticed during my test: if I want to stop EH ASAP, I can only set 'eh_deadline' to the minimum value, 1 second. But on my box, after a scsi command times out, it takes less than 1 second to reach the first check point, where the overall runtime of the SCSI EH is compared with last_reset + eh_deadline as you described. So EH can only be stopped at a check point reached more than 1 second after it started, rather than at the first one.

The same problem exists in your second patchset "New EH command timeout handler": less than 1 second elapses before the check point in scsi_abort_command().

So, should 1 second get special handling? E.g., we treat the eh deadline as already passed when 1 second is set, even if 1 second has not yet elapsed. Or should 0 seconds mean stopping EH ASAP rather than disabling eh_deadline?

Signed-off-by: Hannes Reinecke

@@ -1059,14 +1107,28 @@ static int scsi_eh_abort_cmds(struct list_head *work_q,
 	struct scsi_cmnd *scmd, *next;
 	LIST_HEAD(check_list);
 	int rtn;
+	struct Scsi_Host *shost;
+	unsigned long flags;

 	list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
 		if (!(scmd->eh_eflags & SCSI_EH_CANCEL_CMD))
 			continue;
+		shost = scmd->device->host;
+		spin_lock_irqsave(shost->host_lock, flags);
+		if (scsi_host_eh_past_deadline(shost)) {

In particular: could we remove this check point? In other words, could we keep aborting? According to my test, scsi_try_to_abort_cmd() takes so little time that we can ignore it. So continuing to abort won't noticeably delay stopping EH, and it is worth trying. I'd also like to remove the check point in the newly added scmd_eh_abort_handler() in your second patchset.

Thanks,
Ren

+			spin_unlock_irqrestore(shost->host_lock, flags);
+			list_splice_init(&check_list, work_q);
+			SCSI_LOG_ERROR_RECOVERY(3,
+				shost_printk(KERN_INFO, shost,
+					     "skip %s, past eh deadline\n",
+					     __func__));
+			return list_empty(work_q);
+		}
+		spin_unlock_irqrestore(shost->host_lock, flags);
 		SCSI_LOG_ERROR_RECOVERY(3, printk("%s: aborting cmd:"
 						  "0x%p\n", current->comm,
 						  scmd));
-		rtn = scsi_try_to_abort_cmd(scmd->device->host->hostt, scmd);
+		rtn = scsi_try_to_abort_cmd(shost->hostt, scmd);
 		if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
 			scmd->eh_eflags &= ~SCSI_EH_CANCEL_CMD;
 			if (rtn == FAST_IO_FAIL)
Re: [PATCH 2/3] scsi: improved eh timeout handler
Hi, Hannes:

On 09/02/2013 07:58 PM, Hannes Reinecke wrote:
If abort succeeds the command is either retried or terminated, depending on the number of allowed retries. However, 'eh_eflags' records the abort, so if the retry would fail again the command is pushed onto the error handler without trying to abort it (again); it'll be cleared up from SCSI EH.

I'm still thinking about the abort step 'scsi_eh_abort_cmds()' in SCSI EH: does it make sense to abort in SCSI EH when we have already tried to abort via your scsi_abort_command()? Admittedly, the abort in SCSI EH will handle commands which haven't been aborted in scsi_abort_command() because EH had already been engaged.

Thanks,
Ren
Re: [PATCH 3/9] scsi: improved eh timeout handler
Hi, Hannes:

On 07/01/2013 10:24 PM, Hannes Reinecke wrote:
When a command runs into a timeout we need to send an 'ABORT TASK' TMF. This is typically done by the 'eh_abort_handler' LLDD callback.

Conceptually, however, this function is a normal SCSI command, so there is no need to enter the error handler.

This patch implements a new scsi_abort_command() function which invokes an asynchronous function scsi_eh_abort_handler() to abort the commands via the usual 'eh_abort_handler'.

If abort succeeds the command is either retried or terminated, depending on the number of allowed retries. However, 'eh_eflags' records the abort, so if the retry would fail again the command is pushed onto the error handler without trying to abort it (again); it'll be cleared up from SCSI EH.

Signed-off-by: Hannes Reinecke
---
 drivers/scsi/scsi.c | 1 +
 drivers/scsi/scsi_error.c | 139 ++
 drivers/scsi/scsi_priv.h | 2 +
 include/scsi/scsi_cmnd.h | 2 +
 4 files changed, 132 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index ebe3b0a..06257cf 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -297,6 +297,7 @@ struct scsi_cmnd *scsi_get_command(struct scsi_device *dev, gfp_t gfp_mask)

 	cmd->device = dev;
 	INIT_LIST_HEAD(&cmd->list);
+	INIT_WORK(&cmd->abort_work, scmd_eh_abort_handler);
 	spin_lock_irqsave(&dev->list_lock, flags);
 	list_add_tail(&cmd->list, &dev->cmd_list);
 	spin_unlock_irqrestore(&dev->list_lock, flags);
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index e76e895..835f7e4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -55,6 +55,7 @@ static void scsi_eh_done(struct scsi_cmnd *scmd);
 #define HOST_RESET_SETTLE_TIME (10)

 static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
+static int scsi_try_to_abort_cmd(struct scsi_host_template *, struct scsi_cmnd *);

 /* called with shost->host_lock held */
 void scsi_eh_wakeup(struct Scsi_Host *shost)
@@ -102,6 +103,111 @@ static int scsi_host_eh_past_deadline(struct Scsi_Host *shost)
 }

 /**
+ * scmd_eh_abort_handler - Handle command aborts
+ * @work: command to be aborted.
+ */
+void
+scmd_eh_abort_handler(struct work_struct *work)
+{
+	struct scsi_cmnd *scmd =
+		container_of(work, struct scsi_cmnd, abort_work);
+	struct scsi_device *sdev = scmd->device;
+	unsigned long flags;
+	int rtn;
+
+	spin_lock_irqsave(sdev->host->host_lock, flags);
+	if (scsi_host_eh_past_deadline(sdev->host)) {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "eh timeout, not aborting\n"));

The command address should also be printed here, for easier debugging:

+				    "eh timeout, not aborting command %p\n", scmd));

+	} else {
+		spin_unlock_irqrestore(sdev->host->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "aborting command %p\n", scmd));
+		rtn = scsi_try_to_abort_cmd(sdev->host->hostt, scmd);
+		if (rtn == SUCCESS) {
+			scmd->result |= DID_TIME_OUT << 16;
+			if (!scsi_noretry_cmd(scmd) &&

I think 'scsi_device_online(scmd->device)' is also necessary here.

+			    (++scmd->retries <= scmd->allowed)) {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "retry aborted command\n"));

The command address should also be printed here.

+				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			} else {
+				SCSI_LOG_ERROR_RECOVERY(3,
+					scmd_printk(KERN_WARNING, scmd,
+						    "finish aborted command\n"));

The command address should also be printed here.

+				scsi_finish_command(scmd);
+			}
+			return;
+		}
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_INFO, scmd,
+				    "abort command failed, rtn %d\n", rtn));

The command address should also be printed here.

+	}
+
+	if (scsi_eh_scmd_add(scmd, 0)) {
+		SCSI_LOG_ERROR_RECOVERY(3,
+			scmd_printk(KERN_WARNING, scmd,
+				    "terminate aborted command\n"));

The command address should also be printed here.

+		scmd->result |= DID_TIME_OUT << 16;
+		scsi_finish_com
Re: [PATCHv3 0/9] New EH command timeout handler
Hi, Hannes: On 07/15/2013 02:05 PM, Ren Mingxin wrote: On 07/12/2013 06:27 PM, Hannes Reinecke wrote: On 07/12/2013 12:00 PM, Ren Mingxin wrote: On 07/12/2013 02:09 PM, Hannes Reinecke wrote: On 07/12/2013 06:14 AM, Ren Mingxin wrote: On 07/01/2013 10:24 PM, Hannes Reinecke wrote: With the original SCSI EH I got: # time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s real2m22.657s user0m0.013s sys0m0.145s With this patchset I got: # time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s real0m52.163s user0m0.012s sys0m0.145s Test was to disable RSCN on the target port, disable the target port, and then start the 'dd' command as indicated. Do you mean disabling RSCN/port is enough? I'm afraid I couldn't reproduce the problem by your steps. Both with and without your patchset are the same 'dd' result: 27s. Please let me know where I neglected or mistook: 1) I made a dm-multipath target 'dm-0' whose grouping policy was failover; 2) Disable RSCN/port via brocade fc switch: SW300:root> portcfg rscnsupr 15 --enable; portDisable 15 3) Start the 'dd' command: # time dd if=/dev/zero of=/dev/dm-0 bs=4k count=4k oflag=direct dd: writing `/dev/sde': Input/output error 1+0 records in 0+0 records out 0 bytes (0 B) copied, 27.8588 s, 0.0 kB/s real0m27.860s user0m0.001s sys 0m0.000s You are aware that you have to disable RSCNs on the _target_ port, right? Disabling RSCNs on the _initiator_ ports is a well-tested case, and the one which actually makes sense (and is even implemented in QLogic switches). Disabling RSCNs for the _target_ port, OTOH, has a very questionable nature (hence QLogic switches don't even allow you to do this). You're right. By disabling RSCNs on target port, I've reproduced this problem. Thank you so much. But I've encountered the bug I said before. 
I'll test again with your new patchset once you send. Could you check with the attached patch? That should convert it to delayed_work and avoid this issue. Unfortunately, the login prompt couldn't be entered in and BUGs were printed ceaselessly while os booting with this patch. The BUGs are like below: BUG: scheduling while atomic: swapper/0/0/0x1100 Modules linked in: mptsas(F+) mptscsih(F) mptbase(F) scsi_transport_sas(F) CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF3.10.0hannes+ #10 Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB-8GDIMM-CN, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012 88047ee03b68 8153ada4 88047ee03b78 8107389d 88047ee03c08 8153ca26 81a01fd8 00012d00 81a00010 00012d00 00012d00 Call Trace: [] dump_stack+0x19/0x1d [] __schedule_bug+0x4d/0x60 [] __schedule+0x646/0x6f0 [] __cond_resched+0x2a/0x40 [] _cond_resched+0x30/0x40 [] start_flush_work+0x2c/0x140 [] flush_work+0x1a/0x40 [] ? try_to_grab_pending+0x109/0x190 [] __cancel_work_timer+0x7e/0x110 [] cancel_delayed_work_sync+0x13/0x20 [] scsi_put_command+0x65/0xa0 This bug is caused by the sync function 'cancel_delayed_work_sync' which is invoked in the interrupt context. By replacing it by non- sync function 'cancel_delayed_work' in 'scsi_put_command' can avoid. Do you think there is such need to sync in the function 'scsi_put_ command'? Since SCSI command block will be freed here, it is NOT necessary to wait for the abort work to finish on it, yes? Thanks, Ren [] scsi_next_command+0x3a/0x60 [] scsi_end_request+0xab/0xb0 [] scsi_io_completion+0x9f/0x670 [] scsi_finish_command+0xd4/0x140 [] scsi_softirq_done+0x147/0x170 [] blk_done_softirq+0x74/0x90 [] __do_softirq+0xef/0x260 [] irq_exit+0xb5/0xc0 [] do_IRQ+0x66/0xe0 [] common_interrupt+0x6a/0x6a [] ? clockevents_notify+0x52/0x150 [] ? cpuidle_enter_state+0x53/0xd0 [] ? 
cpuidle_enter_state+0x4f/0xd0 [] cpuidle_idle_call+0xcf/0x160 [] arch_cpu_idle+0xe/0x30 [] cpu_idle_loop+0x65/0x1f0 [] cpu_startup_entry+0x70/0x80 [] rest_init+0x77/0x80 [] start_kernel+0x41a/0x427 [] ? repair_env_string+0x5b/0x5b [] x86_64_start_reservations+0x2a/0x2c [] x86_64_start_kernel+0x12f/0x136 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 0/7] Limit overall SCSI EH runtime
Hi, James:

On 07/11/2013 04:35 AM, Ewan Milne wrote:
Looks good. We have been testing this extensively.

Acked-by: Ewan D. Milne

Do you think this patchset can be applied? If so, when? Or are you perhaps waiting for someone's feedback?

We've also tested it: the duration was shortened from 6m26s to 44s when 'eh_deadline' was set to 1s (the minimum timeout value) and 16M of data was written (the I/O processing time itself, 0.7s, can be ignored). As Ewan said, this is effective for a fast failover policy in redundant environments.

Thanks,
Ren
Re: [PATCHv2 0/7] Limit overall SCSI EH runtime
Hi, Hannes:

On 07/15/2013 06:33 PM, Ren Mingxin wrote:
I noticed that the dd time was reduced from 6m+ to 2m+ when 'eh_deadline' was set to 30s, but the dd time was 6m+ (nearly the same as the default, 'eh_deadline' = 0) when 'eh_deadline' was set to 10s. I haven't been able to dig further, but I guess there is some restriction on setting this 'eh_deadline' interface. Maybe it should not be less than some timeout, otherwise the 'eh_deadline' setting will not work?

I've retried and confirmed that the anomaly above was caused by my own misoperation: I had two fc hosts forming a failover multipath, but I set 'eh_deadline' on only one host. When I tested with 10s, the 'eh_deadline' on the host of the active path wasn't set :-( Sorry for my mistake. So:

Tested-by: Ren Mingxin

Thanks,
Ren
Re: [PATCHv2 0/7] Limit overall SCSI EH runtime
Hi, Ewan:

On 07/12/2013 09:30 PM, Ewan Milne wrote:
On Fri, 2013-07-12 at 13:54 +0800, Ren Mingxin wrote:
I'm wondering how you tested: with special hardware or a self-made module? Would you mind pasting your test method and result?

This was tested in a SAN environment with an EMC Symmetrix and Brocade FC switches. The error was injected by the following commands:

portcfg rscnsupr <port> --enable
portdisable <port>

Where <port> is the FC port of the Symmetrix target. Multipath is used and the test records how long I/O from userspace takes to complete after the error handling stops and the I/O is retried on another path.

What happens is that the target never responds to anything the HBA sends, so commands and TMFs just timeout. The HBA doesn't see link down (since it is the target port) and doesn't get an RSCN. When the HBA is finally reset, however, it can't login to the target port and so further I/O gets an immediate error.

Unfortunately, not all SAN environments will exhibit the failing behavior -- it appears as if in some cases the HBA detects the problem regardless of the switch portcfg setting. But this has been verified to solve the problem of seemingly endless EH activity in testing at a large customer site.

Thanks for your detailed explanation. I've been able to reproduce with only this patchset.

Also, to be clear, we tested with the "Limit overall SCSI EH runtime" patchset but not the "New EH command timeout handler". I think the changes to issue the abort in the timeout handler are a good idea, though, because there really is no need to wait for all activity on the host to cease before issuing the abort as far as I can see.

Hmm, I agree with you. It is much better to issue aborts without waiting, which can shorten the timeout handling time.

Acked-by: Ewan D. Milne

Hi, Hannes:

I noticed that the dd time was reduced from 6m+ to 2m+ when 'eh_deadline' was set to 30s, but the dd time was 6m+ (nearly the same as the default, 'eh_deadline' = 0) when 'eh_deadline' was set to 10s. I haven't been able to dig further, but I guess there is some restriction on setting this 'eh_deadline' interface. Maybe it should not be less than some timeout, otherwise the 'eh_deadline' setting will not work?

Thanks,
Ren
Re: [PATCHv3 0/9] New EH command timeout handler
Hi, Hannes: On 07/12/2013 06:27 PM, Hannes Reinecke wrote: On 07/12/2013 12:00 PM, Ren Mingxin wrote: On 07/12/2013 02:09 PM, Hannes Reinecke wrote: On 07/12/2013 06:14 AM, Ren Mingxin wrote: On 07/01/2013 10:24 PM, Hannes Reinecke wrote: With the original SCSI EH I got: # time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s real2m22.657s user0m0.013s sys0m0.145s With this patchset I got: # time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s real0m52.163s user0m0.012s sys0m0.145s Test was to disable RSCN on the target port, disable the target port, and then start the 'dd' command as indicated. Do you mean disabling RSCN/port is enough? I'm afraid I couldn't reproduce the problem by your steps. Both with and without your patchset are the same 'dd' result: 27s. Please let me know where I neglected or mistook: 1) I made a dm-multipath target 'dm-0' whose grouping policy was failover; 2) Disable RSCN/port via brocade fc switch: SW300:root> portcfg rscnsupr 15 --enable; portDisable 15 3) Start the 'dd' command: # time dd if=/dev/zero of=/dev/dm-0 bs=4k count=4k oflag=direct dd: writing `/dev/sde': Input/output error 1+0 records in 0+0 records out 0 bytes (0 B) copied, 27.8588 s, 0.0 kB/s real0m27.860s user0m0.001s sys 0m0.000s You are aware that you have to disable RSCNs on the _target_ port, right? Disabling RSCNs on the _initiator_ ports is a well-tested case, and the one which actually makes sense (and is even implemented in QLogic switches). Disabling RSCNs for the _target_ port, OTOH, has a very questionable nature (hence QLogic switches don't even allow you to do this). You're right. By disabling RSCNs on target port, I've reproduced this problem. Thank you so much. But I've encountered the bug I said before. I'll test again with your new patchset once you send. 
> Could you check with the attached patch? That should convert it to
> delayed_work and avoid this issue.

Unfortunately, with this patch the system never reached the login prompt, and BUGs were printed continuously during boot. They look like this:

BUG: scheduling while atomic: swapper/0/0/0x1100
Modules linked in: mptsas(F+) mptscsih(F) mptbase(F) scsi_transport_sas(F)
CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF 3.10.0hannes+ #10
Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB-8GDIMM-CN, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012
 88047ee03b68 8153ada4 88047ee03b78 8107389d
 88047ee03c08 8153ca26 81a01fd8 00012d00
 81a00010 00012d00 00012d00
Call Trace:
 [] dump_stack+0x19/0x1d
 [] __schedule_bug+0x4d/0x60
 [] __schedule+0x646/0x6f0
 [] __cond_resched+0x2a/0x40
 [] _cond_resched+0x30/0x40
 [] start_flush_work+0x2c/0x140
 [] flush_work+0x1a/0x40
 [] ? try_to_grab_pending+0x109/0x190
 [] __cancel_work_timer+0x7e/0x110
 [] cancel_delayed_work_sync+0x13/0x20
 [] scsi_put_command+0x65/0xa0
 [] scsi_next_command+0x3a/0x60
 [] scsi_end_request+0xab/0xb0
 [] scsi_io_completion+0x9f/0x670
 [] scsi_finish_command+0xd4/0x140
 [] scsi_softirq_done+0x147/0x170
 [] blk_done_softirq+0x74/0x90
 [] __do_softirq+0xef/0x260
 [] irq_exit+0xb5/0xc0
 [] do_IRQ+0x66/0xe0
 [] common_interrupt+0x6a/0x6a
 [] ? clockevents_notify+0x52/0x150
 [] ? cpuidle_enter_state+0x53/0xd0
 [] ? cpuidle_enter_state+0x4f/0xd0
 [] cpuidle_idle_call+0xcf/0x160
 [] arch_cpu_idle+0xe/0x30
 [] cpu_idle_loop+0x65/0x1f0
 [] cpu_startup_entry+0x70/0x80
 [] rest_init+0x77/0x80
 [] start_kernel+0x41a/0x427
 [] ? repair_env_string+0x5b/0x5b
 [] x86_64_start_reservations+0x2a/0x2c
 [] x86_64_start_kernel+0x12f/0x136

If I have left out any information you need, please let me know.

Thanks,
Ren
Re: [PATCHv3 0/9] New EH command timeout handler
Hi, Hannes: On 07/12/2013 02:09 PM, Hannes Reinecke wrote: On 07/12/2013 06:14 AM, Ren Mingxin wrote: On 07/01/2013 10:24 PM, Hannes Reinecke wrote: With the original SCSI EH I got: # time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s real2m22.657s user0m0.013s sys0m0.145s With this patchset I got: # time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s real0m52.163s user0m0.012s sys0m0.145s Test was to disable RSCN on the target port, disable the target port, and then start the 'dd' command as indicated. Do you mean disabling RSCN/port is enough? I'm afraid I couldn't reproduce the problem by your steps. Both with and without your patchset are the same 'dd' result: 27s. Please let me know where I neglected or mistook: 1) I made a dm-multipath target 'dm-0' whose grouping policy was failover; 2) Disable RSCN/port via brocade fc switch: SW300:root> portcfg rscnsupr 15 --enable; portDisable 15 3) Start the 'dd' command: # time dd if=/dev/zero of=/dev/dm-0 bs=4k count=4k oflag=direct dd: writing `/dev/sde': Input/output error 1+0 records in 0+0 records out 0 bytes (0 B) copied, 27.8588 s, 0.0 kB/s real0m27.860s user0m0.001s sys 0m0.000s You are aware that you have to disable RSCNs on the _target_ port, right? Disabling RSCNs on the _initiator_ ports is a well-tested case, and the one which actually makes sense (and is even implemented in QLogic switches). Disabling RSCNs for the _target_ port, OTOH, has a very questionable nature (hence QLogic switches don't even allow you to do this). You're right. By disabling RSCNs on target port, I've reproduced this problem. Thank you so much. But I've encountered the bug I said before. I'll test again with your new patchset once you send. Thanks, Ren [ .. 
] Another question: I also tried to produce timeouts by modifying Yasui's module(please see APPENDIX A): http://www.spinics.net/lists/linux-scsi/msg35091.html But I got a bug with your this patchset by follwing steps(there was not such bug without your patchset): # grep lpfc_template /proc/kallsyms a00f9240 d lpfc_template[lpfc] # multipath -ll ... mpathb (36000b5d0006a006a14e7000c) dm-1 FUJITSU,ETERNUS_DX400 size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw |-+- policy='round-robin 0' prio=130 status=active | `- 2:0:0:1 sdf 8:80 active ready running `-+- policy='round-robin 0' prio=130 status=enabled `- 3:0:0:1 sdh 8:112 active ready running # insmod scsi_tmo_mod.ko param=0xa00f9240,2:0:0:1; time dd if=/dev/zero of=/dev/dm-1 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 151.194 s, 111 kB/s real2m31.195s user0m0.004s sys0m0.111s Please see logs in APPENDIX B. Do you think this bug is irrelevant to your patchset? Hmm. No, sadly not. 'cancel_work_sync' cannot be called from an interrupt context; guess I'll need to convert it to delayed work. Thanks for testing; will be updating the patchset. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 0/7] Limit overall SCSI EH runtime
Hi, Ewan:

On 07/11/2013 04:35 AM, Ewan Milne wrote:
> On Mon, 2013-07-01 at 08:50 +0200, Hannes Reinecke wrote:
>> This patchset implements a new 'eh_deadline' attribute to the SCSI
>> host. It will limit the overall SCSI EH runtime by a given timeout.
>> If the timeout is reached all intermediate EH steps will be skipped
>> and host reset will be scheduled immediately.
>>
>> For this patch I've re-used the existing 'last_reset' field of the
>> SCSI host to store the initial time SCSI EH started. Also the field
>> 'resetting' has been removed as it never has been used as intended.
>> As 'last_reset' might be in use by transport-specific EH
>> implementation I've disallowed eh_deadline setting there.
>>
>> Changes from the initial version:
>> - Add list_splice_init() calls to avoid stale commands
>> - Rename function to scsi_host_eh_past_deadline
>>
>> Hannes Reinecke (7):
>>   dpt_i2o: Remove DPTI_STATE_IOCTL
>>   dpt_i2o: return SCSI_MLQUEUE_HOST_BUSY when in reset
>>   advansys: Remove 'last_reset' references
>>   tmscsim: Move 'last_reset' into host structure
>>   dc395: Move 'last_reset' into internal host structure
>>   scsi: remove check for 'resetting'
>>   scsi: Add 'eh_deadline' to limit SCSI EH runtime
>>
>>  drivers/scsi/advansys.c   |   8 +--
>>  drivers/scsi/dc395x.c     |  24 +
>>  drivers/scsi/dpt_i2o.c    |  35 +
>>  drivers/scsi/dpti.h       |   1 -
>>  drivers/scsi/hosts.c      |   7 +++
>>  drivers/scsi/scsi.c       |  28 --
>>  drivers/scsi/scsi_error.c | 130 +++---
>>  drivers/scsi/scsi_sysfs.c |  37 +
>>  drivers/scsi/tmscsim.c    |  14 ++---
>>  drivers/scsi/tmscsim.h    |   1 +
>>  include/scsi/scsi_host.h  |   4 +-
>>  11 files changed, 208 insertions(+), 81 deletions(-)
>
> Looks good. We have been testing this extensively.

I'm wondering how you test this: with special hardware or with a self-made module? Would you mind posting your test method and results?

Thanks,
Ren

> Acked-by: Ewan D. Milne
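For readers following the thread, the deadline check described in the quoted cover letter can be sketched in plain C as below. This is only an illustration under stated assumptions, not the code from the patch: `struct eh_host`, `tick_before()` and `eh_past_deadline()` are hypothetical userspace stand-ins for the 'last_reset' and 'eh_deadline' host fields and for scsi_host_eh_past_deadline, and the tick counter is assumed to be a free-running unsigned value that may wrap.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two Scsi_Host fields named in the cover letter. */
struct eh_host {
    unsigned long last_reset;   /* tick at which EH started; 0 means EH is not running */
    unsigned long eh_deadline;  /* allowed EH runtime in ticks; 0 means no limit */
};

/* Wraparound-safe "a is before b" on a free-running tick counter. */
static bool tick_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/*
 * Returns true once EH has run for at least eh_deadline ticks.
 * At that point the intermediate EH steps would be skipped and
 * host reset scheduled directly.
 */
static bool eh_past_deadline(const struct eh_host *h, unsigned long now)
{
    if (h->last_reset == 0 || h->eh_deadline == 0)
        return false;
    return !tick_before(now, h->last_reset + h->eh_deadline);
}
```

Each escalation step (abort, device reset, target reset, bus reset) would call such a check before doing its work and bail out to host reset when it returns true.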
Re: [PATCHv3 0/9] New EH command timeout handler
Hi, Hannes: On 07/01/2013 10:24 PM, Hannes Reinecke wrote: With the original SCSI EH I got: # time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 142.652 s, 118 kB/s real2m22.657s user0m0.013s sys 0m0.145s With this patchset I got: # time dd if=/dev/zero of=/dev/dm-2 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 52.1579 s, 322 kB/s real0m52.163s user0m0.012s sys 0m0.145s Test was to disable RSCN on the target port, disable the target port, and then start the 'dd' command as indicated. Do you mean disabling RSCN/port is enough? I'm afraid I couldn't reproduce the problem by your steps. Both with and without your patchset are the same 'dd' result: 27s. Please let me know where I neglected or mistook: 1) I made a dm-multipath target 'dm-0' whose grouping policy was failover; 2) Disable RSCN/port via brocade fc switch: SW300:root> portcfg rscnsupr 15 --enable; portDisable 15 3) Start the 'dd' command: # time dd if=/dev/zero of=/dev/dm-0 bs=4k count=4k oflag=direct dd: writing `/dev/sde': Input/output error 1+0 records in 0+0 records out 0 bytes (0 B) copied, 27.8588 s, 0.0 kB/s real0m27.860s user0m0.001s sys 0m0.000s #) Corresponding logs in /var/log/messages Jul 9 14:56:06 build kernel: lpfc :0d:00.1: 1:1305 Link Down Event x4 received Data: x4 x20 x110 x0 x0 Jul 9 14:56:36 build kernel: rport-3:0-2: blocked FC remote port time out: removing target and saving binding Jul 9 14:56:36 build kernel: sd 3:0:0:0: rejecting I/O to offline device Jul 9 14:56:36 build kernel: lpfc :0d:00.1: 1:(0):0203 Devloss timeout on WWPN 20:41:00:0b:5d:6a:14:e7 NPort x620700 Data: x0 x8 x0 Jul 9 14:56:36 build kernel: sd 3:0:0:0: [sde] Synchronizing SCSI cache Jul 9 14:56:36 build kernel: sd 3:0:0:0: [sde] Jul 9 14:56:36 build kernel: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK Jul 9 14:56:36 build kernel: sd 3:0:0:1: [sdf] Synchronizing SCSI cache Jul 9 
14:56:36 build kernel: sd 3:0:0:1: [sdf] Jul 9 14:56:36 build kernel: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK Jul 9 14:56:36 build multipathd: sdf: remove path (uevent) Jul 9 14:56:36 build multipathd: mpatha: load table [0 104857600 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:112 1] Jul 9 14:56:36 build multipathd: sdf: path removed from map mpatha Jul 9 14:56:36 build udevd-work[8420]: error opening ATTR{/sys/devices/pci:00/:00:03.0/:01:00.0/:02:01.0/:0a:00.0/:0b:01.0/:0d:00.1/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sde/queue/iosched/slice_idle} for writing: No such file or directory Jul 9 14:56:36 build udevd-work[8420]: error opening ATTR{/sys/devices/pci:00/:00:03.0/:01:00.0/:02:01.0/:0a:00.0/:0b:01.0/:0d:00.1/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sde/queue/iosched/quantum} for writing: No such file or directory Jul 9 14:56:36 build multipathd: sde: remove path (uevent) Jul 9 14:56:36 build multipathd: mpathb: load table [0 104857600 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:96 1] Jul 9 14:56:36 build multipathd: sde: path removed from map mpathb * there are two disks sde and sdf connected via port 15 Another question: I also tried to produce timeouts by modifying Yasui's module(please see APPENDIX A): http://www.spinics.net/lists/linux-scsi/msg35091.html But I got a bug with your this patchset by follwing steps(there was not such bug without your patchset): # grep lpfc_template /proc/kallsyms a00f9240 d lpfc_template[lpfc] # multipath -ll ... 
mpathb (36000b5d0006a006a14e7000c) dm-1 FUJITSU,ETERNUS_DX400 size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw |-+- policy='round-robin 0' prio=130 status=active | `- 2:0:0:1 sdf 8:80 active ready running `-+- policy='round-robin 0' prio=130 status=enabled `- 3:0:0:1 sdh 8:112 active ready running # insmod scsi_tmo_mod.ko param=0xa00f9240,2:0:0:1; time dd if=/dev/zero of=/dev/dm-1 bs=4k count=4k oflag=direct 4096+0 records in 4096+0 records out 16777216 bytes (17 MB) copied, 151.194 s, 111 kB/s real2m31.195s user0m0.004s sys0m0.111s Please see logs in APPENDIX B. Do you think this bug is irrelevant to your patchset? Thanks, Ren APPENDIX A: /* * scsi timeout injection module */ #include #include #include #include static struct scsi_host_template *sht; static char config[32]; static struct target { short host; uint channel; uint id; uint lun; } st; static int (*org_qc)(struct Scsi_Host *, struct scsi_cmnd *); static inline int check_dev(struct target *st, struct scsi_cmnd *cmd) { return (st->host == cmd->device->host->host_no && st->channel == cmd->device->channel && st->id == cmd->device->id && st->lun == cmd->device->lun);
Re: [PATCH 0/7] Limit overall SCSI EH runtime
Hi, Hannes & James:

On 06/10/2013 07:11 PM, Hannes Reinecke wrote:
> This patchset implements a new 'eh_deadline' attribute to the SCSI
> host. It will limit the overall SCSI EH runtime by a given timeout.
> If the timeout expires all intermediate steps will be skipped and
> host reset will be scheduled immediately.

First of all, I think this patchset is useful for bounding EH runs that would otherwise be practically interminable.

BTW: There were some earlier patches which tried to add a user interface for customizing the hardware reset levels in order to shorten the EH duration; unfortunately they were not accepted. Your hard work on EH improvement is much appreciated :-)

But please let me know about your series of EH improvements: will redundant environments (such as multipath or mirroring) be given special consideration?

As for this patchset: is setting an appropriate timeout, so that every EH step except host reset is skipped, enough for a quick failover in redundant systems? In other words, do you think keeping the host reset, which takes the longest time of all the escalation levels, is acceptable in a redundant configuration?

Thanks,
Ren
Re: [PATCH 0/4] New SCSI command timeout handler
Hi, Hannes:

On 06/06/2013 05:43 PM, Hannes Reinecke wrote:
> this is the first step towards a new non-blocking error handler.
> This patch implements a new command timeout handler which will be
> sending command aborts inline without engaging SCSI EH.
>
> In addition the commands will be returned directly if the command
> abort succeeded, cutting down recovery times dramatically.
>
> With the original scsi error recovery I got:
> # time dd if=/dev/zero of=/mnt/test.blk bs=512 count=2048 oflag=sync
> 2048+0 records in
> 2048+0 records out
> 1048576 bytes (1.0 MB) copied, 3.72732 s, 281 kB/s
>
> real    2m14.475s
> user    0m0.000s
> sys     0m0.104s
>
> with this patchset I got:
> # time dd if=/dev/zero of=/mnt/test.blk bs=512 count=2048 oflag=sync
> 2048+0 records in
> 2048+0 records out
> 1048576 bytes (1.0 MB) copied, 31.5151 s, 33.3 kB/s
>
> real    0m31.519s
> user    0m0.000s
> sys     0m0.088s
>
> Test was to disable RSCN on the target port, disable the target port,
> and then start the 'dd' command as indicated.
>
> As a proof-of-concept I've also enabled the new timeout handler for
> virtio, so that things can be tested out more easily.

So this 31.5s was measured on virtio disks, right? That is much faster than your former test via FC.

This approach may not work for some LLDDs, as you said, but I wonder whether it is applicable to SAS (and whether there will be later patches for SAS).

Thanks,
Ren
Re: [PATCH 3/4] scsi: improved eh timeout handler
Hi, Hannes: On 06/07/2013 04:28 AM, Jörn Engel wrote: On Thu, 6 June 2013 22:39:14 +0200, Hannes Reinecke wrote: + spin_unlock_irqrestore(&sdev->list_lock, flags); + SCSI_LOG_ERROR_RECOVERY(3, + scmd_printk(KERN_INFO, scmd, + "aborting command %p\n", scmd)); + rtn = scsi_try_to_abort_cmd(shost->hostt, scmd); + if (rtn == SUCCESS || rtn == FAST_IO_FAIL) { + if (((scmd->request->cmd_flags& REQ_FAILFAST_DEV) || Am I being stupid again or should this be negated? Knowing you I would think the former; where do you see the negation? If REQ_FAILFAST_DEV is set, this runs scsi_queue_insert(), which I would expect it should run scsi_finish_command(). I also think (scmd->request->cmd_flags & REQ_FAILFAST_DEV) and (scmd->request->cmd_type == REQ_TYPE_BLOCK_PC) should be negated. I'm confused why not use !scsi_noretry_cmd(scmd) directly as your former patch here? +(scmd->request->cmd_type == REQ_TYPE_BLOCK_PC))&& + (++scmd->retries<= scmd->allowed)) { + SCSI_LOG_ERROR_RECOVERY(3, + scmd_printk(KERN_WARNING, scmd, + "retry aborted command\n")); + + scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY); + } else { + SCSI_LOG_ERROR_RECOVERY(3, + scmd_printk(KERN_WARNING, scmd, + "fast fail aborted command\n")); + scmd->result |= DID_TRANSPORT_FAILFAST<< 16; + scsi_finish_command(scmd); + } + } else { + if (!scsi_eh_scmd_add(scmd, 0)) { + SCSI_LOG_ERROR_RECOVERY(3, + scmd_printk(KERN_WARNING, scmd, + "terminate aborted command\n")); + scmd->result |= DID_TIME_OUT<< 16; + scsi_finish_command(scmd); + } + } + spin_lock_irqsave(&sdev->list_lock, flags); + } + spin_unlock_irqrestore(&sdev->list_lock, flags); ... +/** + * scsi_abort_command - schedule a command abort + * @scmd: scmd to abort. 
+ * + * We only need to abort commands after a command timeout + */ +void +scsi_abort_command(struct scsi_cmnd *scmd) +{ + unsigned long flags; + int kick_worker = 0; + struct scsi_device *sdev = scmd->device; + + spin_lock_irqsave(&sdev->list_lock, flags); + if (list_empty(&sdev->eh_abort_list)) + kick_worker = 1; + list_add(&scmd->eh_entry,&sdev->eh_abort_list); + SCSI_LOG_ERROR_RECOVERY(3, + scmd_printk(KERN_INFO, scmd, "adding to eh_abort_list\n")); + spin_unlock_irqrestore(&sdev->list_lock, flags); + if (kick_worker) + schedule_work(&sdev->abort_work); +} +EXPORT_SYMBOL_GPL(scsi_abort_command); Should the name of function above be more ideographic/understandable? For example, scsi_abort_scmd_add? I was bewildered among functions named scsi_abort_eh_cmnd, scsi_eh_abort_cmds... Thanks, Ren -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] scsi: Return ENODATA on medium error
Hi, Hannes: On 06/05/2013 03:11 PM, Hannes Reinecke wrote: When a medium error is detected the SCSI stack should return ENODATA to the upper layers. Signed-off-by: Hannes Reinecke --- drivers/scsi/scsi_error.c | 7 ++- drivers/scsi/scsi_lib.c | 5 + include/scsi/scsi.h | 2 ++ 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index bf5e61a..2ded10a 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -235,6 +235,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost, *NEEDS_RETRY *TARGET_ERROR *ALLOC_ERROR + * MEDIA_FAILURE * * Notes: *When a deferred error is detected the current command has @@ -375,7 +376,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd) if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */ sshdr.asc == 0x13 || /* AMNF DATA FIELD */ sshdr.asc == 0x14) { /* RECORD NOT FOUND */ - return TARGET_ERROR; + return MEDIA_FAILURE; } return NEEDS_RETRY; @@ -1598,6 +1599,10 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) /* target hit out-of-space condition */ set_host_byte(scmd, DID_ALLOC_FAILURE); rtn = SUCCESS; + } else if (rtn == MEDIA_FAILURE) { + /* medium error */ + set_host_byte(scmd, DID_MEDIUM_ERROR); + rtn = SUCCESS; } /* if rtn == FAILED, we have no sense information; * returning FAILED will wake the error handler thread diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 209a4d5..39d626e 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -711,6 +711,7 @@ EXPORT_SYMBOL(scsi_release_buffers); * -EREMOTEIO permanent target failure, do not retry * -EBADE permanent nexus failure, retry on other path * -ENOSPCNo write space available + * -ENODATAMedium error */ static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result) { @@ -732,6 +733,10 @@ static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result) set_host_byte(cmd, DID_OK); error = -ENOSPC; break; + case DID_MEDIUM_ERROR: 
+ set_host_byte(cmd, DID_OK); + error = -ENODATA; + break; It seems that there is a debugging requirement to announce the meaning of these new added error codes in the function blk_update_request()like this: diff --git a/block/blk-core.c b/block/blk-core.c index 33c33bc..a396eb6 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -2315,6 +2315,12 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes) case -EBADE: error_type = "critical nexus"; break; + case -ENOSPC: + error_type = "critical space allocation"; + break; + case -ENODATA: + error_type = "critical medium"; + break; case -EIO: default: error_type = "I/O"; # To tell the truth, I'm not understand why this patchset is needed # in practice for I've only just got limited info about LSF. I guess # this is one of the improvements for SCSI EH. Could you give an # example/condition the upper layers interest in? Thanks, Ren default: error = -EIO; break; diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index 5ead86b..c397684 100644 --- a/include/scsi/scsi.h +++ b/include/scsi/scsi.h @@ -453,6 +453,7 @@ static inline int scsi_is_wlun(unsigned int lun) #define DID_NEXUS_FAILURE 0x11 /* Permanent nexus failure, retry on other * paths might yield different results */ #define DID_ALLOC_FAILURE 0x12 /* Space allocation on the device failed */ +#define DID_MEDIUM_ERROR 0x13 /* Medium error */ #define DRIVER_OK 0x00 /* Driver status */ /* @@ -484,6 +485,7 @@ static inline int scsi_is_wlun(unsigned int lun) #define FAST_IO_FAIL 0x2009 #define TARGET_ERROR0x200A #define ALLOC_ERROR 0x200B +#define MEDIA_FAILURE 0x200C /* * Midlevel queue return values. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi: Document enhanced error codes
Hi, Hannes: I have two questions about the comments: On 06/05/2013 03:10 PM, Hannes Reinecke wrote: Document the various error codes returned on I/O failure. Signed-off-by: Hannes Reinecke --- drivers/scsi/scsi_error.c | 7 +-- drivers/scsi/scsi_lib.c | 11 +++ 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index f43de1e..443b0e3 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -229,8 +229,11 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost, * scsi_check_sense - Examine scsi cmd sense * @scmd: Cmd to have sense checked. * - * Return value: - * SUCCESS or FAILED or NEEDS_RETRY or TARGET_ERROR + * Possible return values: + * SUCCESS + * FAILED + * NEEDS_RETRY + * TARGET_ERROR This is more likely to be a historical non-update issue - there is another possible return value 'ADD_TO_MLQUEUE' which may be returned by the handler check_sense() or the case of this scsi_check_sense() below, right? switch (sshdr.sense_key) { case HARDWARE_ERROR: if (scmd->device->retry_hwerror) return ADD_TO_MLQUEUE; * * Notes: *When a deferred error is detected the current command has diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 86d5220..12bfa73 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -700,6 +700,17 @@ void scsi_release_buffers(struct scsi_cmnd *cmd) } EXPORT_SYMBOL(scsi_release_buffers); +/** + * __scsi_error_from_host_byte - translate SCSI error code into errno + * @cmd: SCSI command (unused) + * @result:scsi error code + * + * Translate SCSI error code into standard UNIX errno. + * Return values: + * -ENOLINKtemporary transport failure + * -EREMOTEIO permanent target failure, do not retry + * -EBADE permanent nexus failure, retry on other path Sorry, I'm afraid that I'm not clear why '-EIO' is not listed here... Perhaps some of them are not necessary to document for some reasons? 
Thanks, Ren + */ static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result) { int error = 0; -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
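To make the question about '-EIO' concrete, the translation being documented can be sketched as a small userspace function. This is an illustration only, assuming Linux errno values: `host_byte_of()` and `error_from_host_byte()` are hypothetical names, the value 0x11 for DID_NEXUS_FAILURE comes from the scsi.h hunk quoted elsewhere in this thread, and 0x0f and 0x10 for the other two host bytes are assumptions not shown here.

```c
#include <errno.h>

/*
 * Host byte values. 0x11 matches the scsi.h hunk quoted in this thread;
 * 0x0f and 0x10 are assumed, not taken from the thread.
 */
enum {
    DID_TRANSPORT_FAILFAST = 0x0f,
    DID_TARGET_FAILURE     = 0x10,
    DID_NEXUS_FAILURE      = 0x11,
};

/* The host byte lives in bits 16..23 of the SCSI result word. */
static int host_byte_of(int result)
{
    return (result >> 16) & 0xff;
}

/* Translate a failed command's host byte into an errno for the block layer. */
static int error_from_host_byte(int result)
{
    switch (host_byte_of(result)) {
    case DID_TRANSPORT_FAILFAST:
        return -ENOLINK;    /* temporary transport failure */
    case DID_TARGET_FAILURE:
        return -EREMOTEIO;  /* permanent target failure, do not retry */
    case DID_NEXUS_FAILURE:
        return -EBADE;      /* permanent nexus failure, retry on other path */
    default:
        return -EIO;        /* the catch-all that the comment does not list */
    }
}
```

The default branch is the case the documentation comment leaves out: every host byte without a dedicated errno still reaches the upper layers as -EIO.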
Re: [PATCH 0/4] New FC timeout handler
Hi, Hannes:

On 05/24/2013 05:50 PM, Hannes Reinecke wrote:
> this is the first step towards a new FC error handler.
> This patch implements a new FC command timeout handler which will be
> sending command aborts inline without engaging SCSI EH.
>
> In addition the commands will be returned directly if the command
> abort succeeded, cutting down recovery times dramatically.

For commands which can be aborted successfully, I guess your patchset has solved the problem that "the error handler can't even be called until host_failed == host_busy", because they no longer need to wait for the EH thread to be scheduled (SCSI EH is not engaged, as you said), right?

> For any other return code from 'eh_abort_handler' the command will be
> pushed onto the existing SCSI EH handler, or aborted with an error if
> that fails.

For commands which can NOT be aborted successfully, there is no improvement, since the SCSI EH will be invoked as usual. But shouldn't we consider the repeated, time-consuming work here, given that the SCSI EH handler will try to abort these commands again?

Thanks,
Ren
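The two paths discussed in this mail can be summarised as a small decision function. This is a rough sketch under stated assumptions, not the patch's code: all names below are hypothetical, the abort result is reduced to success or failure, and the retry branch follows the later revisions of the handler quoted elsewhere in this thread rather than this cover letter.

```c
#include <stdbool.h>

/* Hypothetical, simplified outcome of the LLDD's eh_abort_handler. */
enum abort_result { ABORT_SUCCESS, ABORT_FAILED };

/* What the timeout handler does with the command afterwards. */
enum disposition {
    RETRY_COMMAND,        /* abort worked and retries remain: requeue */
    FINISH_COMMAND,       /* abort worked, no retries left: return it directly */
    ESCALATE_TO_EH,       /* abort failed: hand over to the existing SCSI EH */
    FINISH_WITH_TIMEOUT,  /* abort failed and EH refused it: fail the command */
};

static enum disposition on_command_timeout(enum abort_result rtn,
                                           bool retry_allowed,
                                           bool eh_accepts_cmd)
{
    if (rtn == ABORT_SUCCESS)
        return retry_allowed ? RETRY_COMMAND : FINISH_COMMAND;
    return eh_accepts_cmd ? ESCALATE_TO_EH : FINISH_WITH_TIMEOUT;
}
```

Only the ESCALATE_TO_EH outcome still waits for the EH thread, which is where the second abort attempt raised above would happen.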
Re: [PATCH 0/5] scsi: Allow fast io fail without waiting through timeout
Hi, James,

On 05/20/2013 11:53 PM, James Smart wrote:
> Based on the discussion recently held at LSF 2013, we are reworking
> the error recovery path to address all the issues you are mentioning.
> That work contradicts these patches. So for now, these should be held
> off.

Interesting. Could you briefly share your general goal or idea, even if only by reference? Is the URL below the one you would point to?

http://lwn.net/Articles/548500

And could I ask about your current progress and schedule, especially when we can expect to see your patches? Much appreciated!

Thanks,
Ren

> On 5/20/2013 3:14 AM, Ren Mingxin wrote:
>> When a scsi command times out or fails, the scsi eh attempts a
>> thorough recovery, which is necessary for non-redundant systems.
>> However, thorough recovery usually takes a long time, which is not
>> acceptable for mission-critical systems.
>>
>> To reduce this latency on a redundant system, we should skip the
>> lengthy scsi eh recovery and fail over quickly to another path.
>>
>> This set of patches tries to implement the above.
>>
>> NOTE: the userland tools need to ensure the environment restriction,
>> which will be implemented later.
>>
>> Thanks,
>> Ren
>>
>> Ren Mingxin (5):
>>   scsi: rename return code FAST_IO_FAIL to FAST_IO
>>   FC transport: Add interface to specify fast io level for timed-out cmds
>>   SAS transport: Add interface to specify fast io level for timed-out cmds
>>   lpfc: Allow fast timed-out io recovery
>>   mptfusion: Allow fast timed-out io recovery
>>
>>  drivers/message/fusion/mptscsih.c   |  29 -
>>  drivers/scsi/lpfc/lpfc_scsi.c       |  34 ++
>>  drivers/scsi/scsi_error.c           |  18 ++---
>>  drivers/scsi/scsi_sas_internal.h    |   4 -
>>  drivers/scsi/scsi_transport_fc.c    | 112 ++--
>>  drivers/scsi/scsi_transport_iscsi.c |   6 -
>>  drivers/scsi/scsi_transport_sas.c   | 103 -
>>  include/scsi/scsi.h                 |   2
>>  include/scsi/scsi_transport_fc.h    |  11 +++
>>  include/scsi/scsi_transport_sas.h   |   8 ++
>>  10 files changed, 303 insertions(+), 24 deletions(-)
[PATCH 0/5] scsi: Allow fast io fail without waiting through timeout
When a scsi command times out or fails, the scsi eh attempts a thorough recovery, which is necessary for non-redundant systems. However, thorough recovery usually takes a long time, which is not acceptable for mission-critical systems.

To reduce this latency on a redundant system, we should skip the lengthy scsi eh recovery and fail over quickly to another path.

This set of patches tries to implement the above.

NOTE: the userland tools need to ensure the environment restriction, which will be implemented later.

Thanks,
Ren

Ren Mingxin (5):
  scsi: rename return code FAST_IO_FAIL to FAST_IO
  FC transport: Add interface to specify fast io level for timed-out cmds
  SAS transport: Add interface to specify fast io level for timed-out cmds
  lpfc: Allow fast timed-out io recovery
  mptfusion: Allow fast timed-out io recovery

 drivers/message/fusion/mptscsih.c   |  29 -
 drivers/scsi/lpfc/lpfc_scsi.c       |  34 ++
 drivers/scsi/scsi_error.c           |  18 ++---
 drivers/scsi/scsi_sas_internal.h    |   4 -
 drivers/scsi/scsi_transport_fc.c    | 112 ++--
 drivers/scsi/scsi_transport_iscsi.c |   6 -
 drivers/scsi/scsi_transport_sas.c   | 103 -
 include/scsi/scsi.h                 |   2
 include/scsi/scsi_transport_fc.h    |  11 +++
 include/scsi/scsi_transport_sas.h   |   8 ++
 10 files changed, 303 insertions(+), 24 deletions(-)
[PATCH 3/5] SAS transport: Add interface to specify fast io level for timed-out cmds
This patch introduces new interfaces through sysfs for sas hosts and rphys to allow users to avoid the scsi_eh recovery actions on different levels when scsi commands timed out, e.g. /sys/devices/pci***/.../hostN/sas_host/hostN/fast_io_tmo_flags /sys/devices/pci***/.../hostN/port-X:Y/end_device-X:Y/\ sas_device/end_device-X:Y/fast_io_tmo_flags This new added interface "fast_io_tmo_flags" is a 8-bit mask with low 5-bit available up to now: 0x01 - Ignore aborting commands 0x02 - Ignore device resets 0x04 - Ignore target resets 0x08 - Ignore bus resets 0x10 - Ignore host resets When scsi_eh unjams hosts, the corresponding bit fields will be checked by LLDD to decide whether to ignore specified recovery levels. Its value is zero by default, so it keeps existing behavior, which is necessary for non-redundant systems. This interface is mainly for redundant environments. To redundant systems, they need a quick give up and failover, instead of thorough recovery which usually takes much time. The actions in LLDD/redundant configurations should be implemented individually later. 
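As an illustration of how the mask described above might be parsed and tested, here is a minimal userspace sketch. It is not part of the patch: the enum names and both helper functions are hypothetical, only the bit values come from the description above, and the parser returns -1 where the in-kernel helper in this patch returns -EINVAL.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Bit values as listed in the description; the names are hypothetical. */
enum fast_io_tmo {
    FAST_IO_TMO_ABORT  = 0x01,  /* ignore aborting commands */
    FAST_IO_TMO_DEVICE = 0x02,  /* ignore device resets */
    FAST_IO_TMO_TARGET = 0x04,  /* ignore target resets */
    FAST_IO_TMO_BUS    = 0x08,  /* ignore bus resets */
    FAST_IO_TMO_HOST   = 0x10,  /* ignore host resets */
};

/*
 * Parse a sysfs-style string ("30", "0x1e", "036") into the 8-bit mask.
 * Returns 0 on success, -1 if the string does not start with a number.
 */
static int parse_fast_io_tmo_flags(const char *buf, uint8_t *val)
{
    char *end;
    unsigned long v = strtoul(buf, &end, 0);

    if (end == buf)
        return -1;
    *val = (uint8_t)(v & 0xff);
    return 0;
}

/* True if the given recovery level should be skipped for this mask. */
static bool fast_io_skips(uint8_t flags, enum fast_io_tmo level)
{
    return (flags & level) != 0;
}
```

With a mask of 0x1e, for example, command aborts are still attempted while device, target, bus and host resets are all skipped.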
Signed-off-by: Ren Mingxin --- drivers/scsi/scsi_sas_internal.h |4 +- drivers/scsi/scsi_transport_sas.c | 103 - include/scsi/scsi_transport_sas.h |8 +++ 3 files changed, 112 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/scsi_sas_internal.h b/drivers/scsi/scsi_sas_internal.h index 6266a5d..8c7ab08 100644 --- a/drivers/scsi/scsi_sas_internal.h +++ b/drivers/scsi/scsi_sas_internal.h @@ -1,10 +1,10 @@ #ifndef _SCSI_SAS_INTERNAL_H #define _SCSI_SAS_INTERNAL_H -#define SAS_HOST_ATTRS 0 +#define SAS_HOST_ATTRS 1 #define SAS_PHY_ATTRS 17 #define SAS_PORT_ATTRS 1 -#define SAS_RPORT_ATTRS7 +#define SAS_RPORT_ATTRS8 #define SAS_END_DEV_ATTRS 5 #define SAS_EXPANDER_ATTRS 7 diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c index 1b68142..960f3e5 100644 --- a/drivers/scsi/scsi_transport_sas.c +++ b/drivers/scsi/scsi_transport_sas.c @@ -37,6 +37,7 @@ #include #include #include +#include #include "scsi_sas_internal.h" struct sas_host_attrs { @@ -46,6 +47,7 @@ struct sas_host_attrs { u32 next_target_id; u32 next_expander_id; int next_port_id; + u8 fast_io_tmo_flags; }; #define to_sas_host_attrs(host)((struct sas_host_attrs *)(host)->shost_data) @@ -277,6 +279,59 @@ static void sas_bsg_remove(struct Scsi_Host *shost, struct sas_rphy *rphy) * SAS host attributes */ +static ssize_t +show_sas_private_host_fast_io_tmo_flags(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct Scsi_Host *shost = dev_to_shost(dev); + struct sas_host_attrs *sas_host = to_sas_host_attrs(shost); + + return sprintf(buf, "0x%02x\n", sas_host->fast_io_tmo_flags); +} + +static int sas_str_to_fast_io_tmo_flags(const char *buf, u8 *val) +{ + char *cp; + + *val = simple_strtoul(buf, &cp, 0) & 0xff; + if (cp == buf) + return -EINVAL; + + return 0; +} + +static ssize_t +store_sas_private_host_fast_io_tmo_flags(struct device *dev, +struct device_attribute *attr, +const char *buf, +size_t count) +{ + struct Scsi_Host *shost = dev_to_shost(dev); + struct 
sas_host_attrs *sas_host = to_sas_host_attrs(shost); + struct sas_rphy *rphy; + u8 val; + int rc; + unsigned long flags; + + if (count < 1) + return -EINVAL; + + rc = sas_str_to_fast_io_tmo_flags(buf, &val); + if (rc) + return rc; + + sas_host->fast_io_tmo_flags = val; + spin_lock_irqsave(shost->host_lock, flags); + list_for_each_entry(rphy, &sas_host->rphy_list, list) + rphy->fast_io_tmo_flags = val; + spin_unlock_irqrestore(shost->host_lock, flags); + return count; +} + +static SAS_DEVICE_ATTR(host, fast_io_tmo_flags, S_IRUGO | S_IWUSR, + show_sas_private_host_fast_io_tmo_flags, + store_sas_private_host_fast_io_tmo_flags); + static int sas_host_setup(struct transport_container *tc, struct device *dev, struct device *cdev) { @@ -1267,6 +1322,38 @@ sas_rphy_simple_attr(identify.sas_address, sas_address, "0x%016llx\n", unsigned long long); sas_rphy_simple_attr(identify.phy_identifier, phy_identifier, "%d\n", u8); +static ssize_t show_sas_rphy_fast_io_tmo_flags (struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct sas
[PATCH 1/5] scsi: rename return code FAST_IO_FAIL to FAST_IO
The return code FAST_IO_FAIL was introduced for fast failed io recovery. To use this code for fast timed-out io recovery as well, we'd rename it to FAST_IO. Signed-off-by: Ren Mingxin --- drivers/scsi/scsi_error.c | 18 +- drivers/scsi/scsi_transport_fc.c|4 ++-- drivers/scsi/scsi_transport_iscsi.c |6 +++--- include/scsi/scsi.h |2 +- 4 files changed, 15 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index f43de1e..9e8e37a 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -1067,9 +1067,9 @@ static int scsi_eh_abort_cmds(struct list_head *work_q, "0x%p\n", current->comm, scmd)); rtn = scsi_try_to_abort_cmd(scmd->device->host->hostt, scmd); - if (rtn == SUCCESS || rtn == FAST_IO_FAIL) { + if (rtn == SUCCESS || rtn == FAST_IO) { scmd->eh_eflags &= ~SCSI_EH_CANCEL_CMD; - if (rtn == FAST_IO_FAIL) + if (rtn == FAST_IO) scsi_eh_finish_cmd(scmd, done_q); else list_move_tail(&scmd->eh_entry, &check_list); @@ -1195,9 +1195,9 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost, " 0x%p\n", current->comm, sdev)); rtn = scsi_try_bus_device_reset(bdr_scmd); - if (rtn == SUCCESS || rtn == FAST_IO_FAIL) { + if (rtn == SUCCESS || rtn == FAST_IO) { if (!scsi_device_online(sdev) || - rtn == FAST_IO_FAIL || + rtn == FAST_IO || !scsi_eh_tur(bdr_scmd)) { list_for_each_entry_safe(scmd, next, work_q, eh_entry) { @@ -1248,7 +1248,7 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost, "to target %d\n", current->comm, id)); rtn = scsi_try_target_reset(scmd); - if (rtn != SUCCESS && rtn != FAST_IO_FAIL) + if (rtn != SUCCESS && rtn != FAST_IO) SCSI_LOG_ERROR_RECOVERY(3, printk("%s: Target reset" " failed target: " "%d\n", @@ -1259,7 +1259,7 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost, if (rtn == SUCCESS) list_move_tail(&scmd->eh_entry, &check_list); - else if (rtn == FAST_IO_FAIL) + else if (rtn == FAST_IO) scsi_eh_finish_cmd(scmd, done_q); else /* push back on work queue for further 
processing */ @@ -1311,10 +1311,10 @@ static int scsi_eh_bus_reset(struct Scsi_Host *shost, " %d\n", current->comm, channel)); rtn = scsi_try_bus_reset(chan_scmd); - if (rtn == SUCCESS || rtn == FAST_IO_FAIL) { + if (rtn == SUCCESS || rtn == FAST_IO) { list_for_each_entry_safe(scmd, next, work_q, eh_entry) { if (channel == scmd_channel(scmd)) { - if (rtn == FAST_IO_FAIL) + if (rtn == FAST_IO) scsi_eh_finish_cmd(scmd, done_q); else @@ -1354,7 +1354,7 @@ static int scsi_eh_host_reset(struct list_head *work_q, rtn = scsi_try_host_reset(scmd); if (rtn == SUCCESS) { list_splice_init(work_q, &check_list); - } else if (rtn == FAST_IO_FAIL) { + } else if (rtn == FAST_IO) { list_for_each_entry_safe(scmd, next, work_q, eh_entry) { scsi_eh_finish_cmd(scmd, done_q); } diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c index e106c27..7b29e00 100644 --- a/drivers/scsi/scsi_transport_fc.c +++ b/drivers/scsi/scsi_transport_fc.c @@ -3301,7 +3301,7 @@ fc_scsi_scan_rport(struct work_struct *work) * rports which would lead to offlined SCSI device
[PATCH 5/5] mptfusion: Allow fast timed-out io recovery
This patch implements fast timed-out io recovery in the LLDD (mptfusion) by
checking the corresponding bits of the newly added interface
"fast_io_tmo_flags" and returning "FAST_IO" to skip the scsi_eh recovery
actions on the corresponding levels.

This is mainly for redundant configurations. For non-redundant systems, the
thorough recovery is necessary. Furthermore, userland tools such as mdadm
should ensure that this policy is available only if more than one mirrored
device is active, which will be implemented later.

NOTE: the device reset handler isn't implemented and the bus reset handler
isn't defined for mptsas_driver_template.

Here is an example which shows the improvement of this patch on md-raid1
devices:

before:
- takes about 69s to write 8GB normally
# dd if=/dev/zero of=/dev/md0 bs=4k count=2000000
2000000+0 records in
2000000+0 records out
8192000000 bytes (8.2 GB) copied, 68.7898 s, 119 MB/s

- takes about 188s to write 8GB when I/Os timed out
# grep mptsas_driver_template /proc/kallsyms
a00485c0 d mptsas_driver_template [mptsas]
# insmod scsi_timeout.ko param=0xa00485c0,1:0:1:0[*]
# dd if=/dev/zero of=/dev/md0 bs=4k count=2000000
2000000+0 records in
2000000+0 records out
8192000000 bytes (8.2 GB) copied, 187.857 s, 43.6 MB/s

after:
- takes about 129s to write 8GB by using this patch when I/Os timed out
# echo 0x1f > /sys/devices/pci:00/:00:03.0/\
:01:00.0/:02:00.0/:03:00.0/\
:04:03.0/:08:00.0/host1/port-1:1/\
end_device-1:1/sas_device/end_device-1:1/\
fast_io_tmo_flags
# insmod scsi_timeout.ko param=0xa00485c0,1:0:1:0
# dd if=/dev/zero of=/dev/md127 bs=4k count=2000000
2000000+0 records in
2000000+0 records out
8192000000 bytes (8.2 GB) copied, 129.478 s, 63.3 MB/s

* scsi_timeout.ko is a self-made module which wraps the scsi queuecommand
  handler and drops I/Os to the specified device, so that none of them are
  passed to the LLDD.
Reference: http://www.spinics.net/lists/linux-scsi/msg35091.html So with this patch, we just spend time writing(about 69s) and waiting through timeout(60s), and save about 59s in scsi eh. Signed-off-by: Ren Mingxin --- drivers/message/fusion/mptscsih.c | 29 +++-- 1 files changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/message/fusion/mptscsih.c b/drivers/message/fusion/mptscsih.c index 727819c..47ef776 100644 --- a/drivers/message/fusion/mptscsih.c +++ b/drivers/message/fusion/mptscsih.c @@ -62,6 +62,7 @@ #include #include #include +#include #include "mptbase.h" #include "mptscsih.h" @@ -1698,6 +1699,12 @@ mptscsih_abort(struct scsi_cmnd * SCpnt) int retval; VirtDevice *vdevice; MPT_ADAPTER *ioc; + struct sas_rphy *rphy = target_to_rphy(SCpnt->device->sdev_target); + + if (rphy->fast_io_tmo_flags & SAS_RPHY_IGN_ABORT_CMDS) { + scsi_device_set_state(SCpnt->device, SDEV_OFFLINE); + return FAST_IO; + } /* If we can't locate our host adapter structure, return FAILED status. */ @@ -1818,6 +1825,12 @@ mptscsih_dev_reset(struct scsi_cmnd * SCpnt) int retval; VirtDevice *vdevice; MPT_ADAPTER *ioc; + struct sas_rphy *rphy = target_to_rphy(SCpnt->device->sdev_target); + + if (rphy->fast_io_tmo_flags & SAS_RPHY_IGN_TARGET_RESET) { + scsi_device_set_state(SCpnt->device, SDEV_OFFLINE); + return FAST_IO; + } /* If we can't locate our host adapter structure, return FAILED status. */ @@ -1878,6 +1891,12 @@ mptscsih_bus_reset(struct scsi_cmnd * SCpnt) int retval; VirtDevice *vdevice; MPT_ADAPTER *ioc; + struct sas_rphy *rphy = target_to_rphy(SCpnt->device->sdev_target); + + if (rphy->fast_io_tmo_flags & SAS_RPHY_IGN_BUS_RESET) { + scsi_device_set_state(SCpnt->device, SDEV_OFFLINE); + return FAST_IO; + } /* If we can't locate our host adapter structure, return FAILED status. 
*/ @@ -1924,10 +1943,16 @@ mptscsih_bus_reset(struct scsi_cmnd * SCpnt) int mptscsih_host_reset(struct scsi_cmnd *SCpnt) { - MPT_SCSI_HOST * hd; - int status = SUCCESS; + MPT_SCSI_HOST *hd; + int status = SUCCESS; MPT_ADAPTER *ioc; int retval; + struct sas_rphy *rphy = target_to_rphy(SCpnt->device->sdev_target); + + if (rphy->fast_io_tmo_flags & SAS_RPHY_IGN_HOST_RESET) { + scsi_device_set_state(SCpnt->device, SDEV_OFFLINE); +
[PATCH 2/5] FC transport: Add interface to specify fast io level for timed-out cmds
This patch introduces new sysfs interfaces for fc hosts and rports which allow
users to skip the scsi_eh recovery actions on different levels when scsi
commands time out, e.g.

/sys/devices/pci***/.../hostN/fc_host/hostN/fast_io_tmo_flags
/sys/devices/pci***/.../hostN/rport-X:Y-Z/fc_remote_ports/\
rport-X:Y-Z/fast_io_tmo_flags

The newly added interface "fast_io_tmo_flags" is an 8-bit mask of which the
low 5 bits are used so far:

0x01 - Ignore aborting commands
0x02 - Ignore device resets
0x04 - Ignore target resets
0x08 - Ignore bus resets
0x10 - Ignore host resets

When scsi_eh unjams hosts, the LLDD checks the corresponding bits to decide
whether to ignore the specified recovery levels. The value is zero by default,
which keeps the existing behavior; this is necessary for non-redundant
systems.

This interface is mainly for redundant environments. Redundant systems need to
give up quickly and fail over, instead of going through a thorough recovery
which usually takes much time. The actions in LLDDs and redundant
configurations should be implemented individually later.
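As a rough illustration of the bit mask described above, here is a minimal
userspace sketch. The TMO_IGN_* names and both helper functions are
hypothetical and are not taken from the patch; only the bit values come from
the list above. The parser loosely mirrors the patch's string helper by using
the standard C strtoul() with base 0 and keeping the low 8 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names; only the bit values come from the description. */
enum tmo_flag {
	TMO_IGN_ABORT_CMDS   = 0x01,
	TMO_IGN_DEVICE_RESET = 0x02,
	TMO_IGN_TARGET_RESET = 0x04,
	TMO_IGN_BUS_RESET    = 0x08,
	TMO_IGN_HOST_RESET   = 0x10,
};

/*
 * Parse a value as written to the sysfs file, e.g. "0x1f\n".
 * Returns 0 on success, -1 if no digits could be parsed.
 */
static int parse_tmo_flags(const char *buf, uint8_t *val)
{
	char *end;
	unsigned long v;

	assert(buf != NULL && val != NULL);
	v = strtoul(buf, &end, 0);	/* base 0: accepts 0x.., 0.., decimal */
	if (end == buf)
		return -1;
	*val = (uint8_t)(v & 0xff);	/* keep the low 8 bits only */
	return 0;
}

/* Non-zero if the given recovery level should be skipped. */
static int tmo_level_ignored(uint8_t flags, enum tmo_flag level)
{
	return (flags & level) != 0;
}
```

With this sketch, writing 0x1f would mark all five levels as ignored, while
0x05 would skip only command aborts and target resets.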
Signed-off-by: Ren Mingxin --- drivers/scsi/scsi_transport_fc.c | 108 +- include/scsi/scsi_transport_fc.h | 11 2 files changed, 117 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c index 7b29e00..155a658 100644 --- a/drivers/scsi/scsi_transport_fc.c +++ b/drivers/scsi/scsi_transport_fc.c @@ -310,9 +310,9 @@ static void fc_scsi_scan_rport(struct work_struct *work); * Increase these values if you add attributes */ #define FC_STARGET_NUM_ATTRS 3 -#define FC_RPORT_NUM_ATTRS 10 +#define FC_RPORT_NUM_ATTRS 11 #define FC_VPORT_NUM_ATTRS 9 -#define FC_HOST_NUM_ATTRS 29 +#define FC_HOST_NUM_ATTRS 30 struct fc_internal { struct scsi_transport_template t; @@ -995,6 +995,67 @@ store_fc_rport_fast_io_fail_tmo(struct device *dev, static FC_DEVICE_ATTR(rport, fast_io_fail_tmo, S_IRUGO | S_IWUSR, show_fc_rport_fast_io_fail_tmo, store_fc_rport_fast_io_fail_tmo); +/* + * fast_io_tmo_flags attribute + */ +static ssize_t +show_fc_rport_fast_io_tmo_flags(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct fc_rport *rport = transport_class_to_rport(dev); + + return sprintf(buf, "0x%02x\n", rport->fast_io_tmo_flags); +} + +static int fc_str_to_fast_io_tmo_flags(const char *buf, u8 *val) +{ + char *cp; + + *val = simple_strtoul(buf, &cp, 0) & 0xff; + if (cp == buf) + return -EINVAL; + + return 0; +} + +static int fc_rport_set_fast_io_tmo_flags(struct fc_rport *rport, u8 val) +{ + if ((rport->port_state == FC_PORTSTATE_BLOCKED) || + (rport->port_state == FC_PORTSTATE_DELETED) || + (rport->port_state == FC_PORTSTATE_NOTPRESENT)) + return -EBUSY; + + rport->fast_io_tmo_flags = val; + + return 0; +} + +static ssize_t +store_fc_rport_fast_io_tmo_flags(struct device *dev, +struct device_attribute *attr, +const char *buf, +size_t count) +{ + struct fc_rport *rport = transport_class_to_rport(dev); + u8 val; + int rc; + + if (count < 1) + return -EINVAL; + + rc = fc_str_to_fast_io_tmo_flags(buf, &val); + if 
(rc) + return rc; + + rc = fc_rport_set_fast_io_tmo_flags(rport, val); + if (rc) + return rc; + return count; +} +static FC_DEVICE_ATTR(rport, fast_io_tmo_flags, S_IRUGO | S_IWUSR, + show_fc_rport_fast_io_tmo_flags, store_fc_rport_fast_io_tmo_flags); + /* * FC SCSI Target Attribute Management @@ -1679,6 +1740,47 @@ static FC_DEVICE_ATTR(host, dev_loss_tmo, S_IRUGO | S_IWUSR, show_fc_host_dev_loss_tmo, store_fc_private_host_dev_loss_tmo); +static ssize_t +show_fc_private_host_fast_io_tmo_flags (struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct Scsi_Host *shost = transport_class_to_shost(dev); + + return sprintf(buf, "0x%02x\n", fc_host_fast_io_tmo_flags(shost)); +} + +static ssize_t +store_fc_private_host_fast_io_tmo_flags(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t count) +{ + struct Scsi_Host *shost = transport_class_to_shost(dev); + struct fc_host_attrs *fc_host = shost_to_fc_host(shost); + struct fc_rport *rport; + u8 val; + int rc; + unsigned long flags; + +
[PATCH 4/5] lpfc: Allow fast timed-out io recovery
This patch implements fast timed-out io recovery in the LLDD (lpfc) by
checking the corresponding bits of the newly added interface
"fast_io_tmo_flags" and returning "FAST_IO" to skip the scsi_eh recovery
actions on the corresponding levels.

This is mainly for redundant configurations. For non-redundant systems, the
thorough recovery is necessary. Furthermore, userland tools such as
multipath-tools should ensure that this policy is available only if more than
one path is active, which will be implemented later.

Here is an example which shows the improvement of this patch:

before:
- takes about 3s to write 800MB normally
# dd if=/dev/zero of=/dev/mapper/mpathb bs=4k count=200000
200000+0 records in
200000+0 records out
819200000 bytes (819 MB) copied, 3.10581 s, 264 MB/s

- takes about 105s to write 800MB when I/Os timed out
# grep lpfc_template /proc/kallsyms
a00f83a0 d lpfc_template[lpfc]
# insmod scsi_timeout.ko param=0xa00f83a0,2:0:0:1[*]
# dd if=/dev/zero of=/dev/mapper/mpathb bs=4k count=200000
200000+0 records in
200000+0 records out
819200000 bytes (819 MB) copied, 104.91 s, 7.8 MB/s

after:
- takes about 34s to write 800MB by using this patch when I/Os timed out
# echo 0x1f > /sys/devices/pci:00/:00:03.0/\
:01:00.0/:02:01.0/:0a:00.0/\
:0b:01.0/:0d:00.0/host2/rport-2:0-2/\
fc_remote_ports/rport-2:0-2/fast_io_tmo_flags
# insmod scsi_timeout.ko param=0xa00f83a0,2:0:0:1
# dd if=/dev/zero of=/dev/mapper/mpathb bs=4k count=200000
200000+0 records in
200000+0 records out
819200000 bytes (819 MB) copied, 33.7718 s, 24.3 MB/s

* scsi_timeout.ko is a self-made module which wraps the scsi queuecommand
  handler and drops I/Os to the specified device, so that none of them are
  passed to the LLDD.
  Reference: http://www.spinics.net/lists/linux-scsi/msg35091.html

So with this patch, we only spend time on writing (about 3s) and on waiting
for the timeout (30s), and save about 71s in scsi eh.
Signed-off-by: Ren Mingxin --- drivers/scsi/lpfc/lpfc_scsi.c | 34 -- 1 files changed, 32 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c index 8523b27..796893b 100644 --- a/drivers/scsi/lpfc/lpfc_scsi.c +++ b/drivers/scsi/lpfc/lpfc_scsi.c @@ -4798,6 +4798,7 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd) { struct Scsi_Host *shost = cmnd->device->host; struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata; + struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device)); struct lpfc_hba *phba = vport->phba; struct lpfc_iocbq *iocb; struct lpfc_iocbq *abtsiocb; @@ -4811,6 +4812,11 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd) if (status != 0 && status != SUCCESS) return status; + if (rport->fast_io_tmo_flags & FC_RPORT_IGN_ABORT_CMDS) { + scsi_device_set_state(cmnd->device, SDEV_OFFLINE); + return FAST_IO; + } + spin_lock_irqsave(&phba->hbalock, flags); /* driver queued commands are in process of being flushed */ if (phba->hba_flag & HBA_FCP_IOQ_FLUSH) { @@ -5150,6 +5156,7 @@ lpfc_device_reset_handler(struct scsi_cmnd *cmnd) { struct Scsi_Host *shost = cmnd->device->host; struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata; + struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device)); struct lpfc_rport_data *rdata = cmnd->device->hostdata; struct lpfc_nodelist *pnode; unsigned tgt_id = cmnd->device->id; @@ -5167,6 +5174,11 @@ lpfc_device_reset_handler(struct scsi_cmnd *cmnd) if (status != 0 && status != SUCCESS) return status; + if (rport->fast_io_tmo_flags & FC_RPORT_IGN_DEVICE_RESET) { + scsi_device_set_state(cmnd->device, SDEV_OFFLINE); + return FAST_IO; + } + status = lpfc_chk_tgt_mapped(vport, cmnd); if (status == FAILED) { lpfc_printf_vlog(vport, KERN_ERR, LOG_FCP, @@ -5217,6 +5229,7 @@ lpfc_target_reset_handler(struct scsi_cmnd *cmnd) { struct Scsi_Host *shost = cmnd->device->host; struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata; + struct fc_rport 
*rport = starget_to_rport(scsi_target(cmnd->device)); struct lpfc_rport_data *rdata = cmnd->device->hostdata; struct lpfc_nodelist *pnode; unsigned tgt_id = cmnd->device->id; @@ -5234,6 +5247,11 @@ lpfc_target_reset_handler(struct scsi_cmnd *cmnd) if (status != 0 && status != SUCCESS) retur
[PATCH] scsi_dh: remove unused declaration dm_pg_init_complete()
This patch removes dm_pg_init_complete()'s declaration as it is not needed
anymore since 2651f5d7d3bc5120a439e498f131e4d731f99b3e.

Signed-off-by: Ren Mingxin
---
 drivers/md/dm-mpath.h |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-mpath.h b/drivers/md/dm-mpath.h
index e230f71..9c36d0f 100644
--- a/drivers/md/dm-mpath.h
+++ b/drivers/md/dm-mpath.h
@@ -16,7 +16,4 @@ struct dm_path {
 	void *pscontext;	/* For path-selector use */
 };
 
-/* Callback for hwh_pg_init_fn to use when complete */
-void dm_pg_init_complete(struct dm_path *path, unsigned err_flags);
-
 #endif
-- 
1.7.1
Re: error handler scheduling
On 03/29/2013 12:02 AM, Elliott, Robert (Server Storage) wrote:

There are several possible reasons for SCSI command timeouts:
a) the command request did not get to the SCSI target port and logical unit
   (e.g., error on the wire)
b) logical unit is still working on the command
c) the command completed, but status didn't get to the SCSI initiator port
   and application client (e.g., error on the wire)

SCSI doesn't have a good way to detect case (c). For status delivery errors
detected by the logical unit, I once proposed that the logical unit establish
a unit attention condition and record the status delivery problem in a log
page (T10 proposal 04-072) but this proposal didn't draw much interest.

The QUERY TASK task management function can detect case (b) vs. the other
cases.

With SSDs, a lengthy timeout derived from ancient SCSI floppy drives doesn't
make sense. Timeouts should scale automatically based on the device type
(e.g., use microseconds for SSDs and seconds for HDDs). The REPORT SUPPORTED
OPERATION CODES command provides some command timeout values to facilitate
this.

For Base feature set drives I'm encouraging an approach like this for
handling command timeouts:

1) at discovery time:
   1a) send REPORT SUPPORTED OPERATION CODES to determine the nominal and
       maximum command timeouts
   1b) send REPORT SUPPORTED TASK MANAGEMENT FUNCTION to determine the TMF
       timeouts
2) send the command (e.g., READ, WRITE, FORMAT UNIT, ...)

If status arrives for the command at any time, exit out of this procedure.
If an I_T nexus loss occurs, then that handling overrides this procedure as
well.

Otherwise:
3) if the nominal command timeout is long (e.g., for a command like FORMAT
   UNIT with IMMED=0, but not for IO commands like READ and WRITE), then wait
   a short time and send QUERY TASK to ensure the command got there:
   3a) if the command is not there (probably lost in delivery, but possibly
       lost status), go to step (2) to resend the command
   3b) if the command is still being processed, keep waiting
4) if the nominal command timeout is reached, send QUERY TASK to determine
   what is happening:
   4a) if the command is not there (if step (3) was run, then this probably
       means lost status), go to step (2) to resend the command
   4b) if the command is still being processed, keep waiting
5) if the maximum command timeout is reached, send QUERY TASK to determine
   what is happening:
   5a) if the command is not there (since step (4) was run, this probably
       means lost status), go to step (2) to resend the command
   5b) if the command is still being processed, proceed to step (6) to abort
       the command
6) send ABORT TASK to abort the command
7) If ABORT TASK succeeds, either:
   7a) escalate to a stronger TMF or hard reset if this command keeps having
       repeated problems; or
   7b) go to step (2) to resend the command
8) If the ABORT TASK timeout is reached, either:
   8a) escalate to a stronger TMF or hard reset, then go to step (2) to
       resend the command; or
   8b) declare the logical unit is unavailable

Doug: for ***, In addition to WSNZ bit now letting the drive not support the
value of zero, T10 proposal 13-052 changes WRITE SAME so the NUMBER OF
LOGICAL BLOCKS set to zero (if supported) must honor the MAXIMUM WRITE SAME
LENGTH field, so the drive can provide a reasonable timeout value for the
command (not worry that the entire capacity might be specified).
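The QUERY TASK decisions in steps (3) to (5) of the procedure quoted above
can be sketched as a small decision function. This is only an illustration
under stated assumptions: the enum and function names are hypothetical, the
result of QUERY TASK is reduced to a single "command still present" flag, and
the policy choices of steps (7) and (8) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three points at which QUERY TASK is sent. */
enum cmd_phase {
	PHASE_EARLY_CHECK,	/* step 3: short wait for long-running commands */
	PHASE_NOMINAL_TIMEOUT,	/* step 4: nominal command timeout reached */
	PHASE_MAX_TIMEOUT,	/* step 5: maximum command timeout reached */
};

enum next_action {
	ACT_RESEND,		/* go back to step 2 */
	ACT_KEEP_WAITING,	/* command is still being processed */
	ACT_ABORT_TASK,		/* proceed to step 6 */
};

/*
 * cmd_present is true when QUERY TASK reports that the logical unit
 * still has the command.
 */
static enum next_action on_query_task(enum cmd_phase phase, bool cmd_present)
{
	assert(phase >= PHASE_EARLY_CHECK && phase <= PHASE_MAX_TIMEOUT);

	if (!cmd_present)
		return ACT_RESEND;		/* steps 3a, 4a, 5a */
	if (phase == PHASE_MAX_TIMEOUT)
		return ACT_ABORT_TASK;		/* step 5b */
	return ACT_KEEP_WAITING;		/* steps 3b, 4b */
}
```

In this reading, a missing command always leads to a resend, and only the
maximum timeout turns a command that is still being processed into an abort.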
Please let me summarize what this thread has discussed about scsi eh latency:

1) some scsi cmds' timeout values are inappropriate; we can avoid timeouts by:
   a) having sg_format set the IMMED bit and use TEST UNIT READY or REQUEST
      SENSE polling to monitor progress - by Douglas
   b) cutting big cmds into reasonably sized ones - by Douglas
   c) improving timeout values according to device types - by Elliott
2) call ->done() on the command after lun reset - by Hannes

And my question is:
- could we wake up the eh thread ASAP, instead of waiting for all cmds to
  complete, so that eh is scheduled faster?

BTW: my original question is here:
http://www.spinics.net/lists/linux-scsi/msg65107.html

Thanks,
Ren

---
Rob Elliott    HP Server Storage

-----Original Message-----
From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Douglas Gilbert
Sent: Wednesday, 27 March, 2013 9:39 AM
To: james.sm...@emulex.com
Cc: linux-scsi@vger.kernel.org
Subject: Re: error handler scheduling

On 13-03-26 10:11 PM, James Smart wrote:

In looking through the error handler, if a command times out and is added to
the eh_cmd_q for the shost, the error handler is only awakened once
shost->host_busy (total number of i/os posted to the shost) is equal to
shost->host_failed (number of i/o that have been failed and put on the
eh_cmd_q). Which means, any other i/o that was outstanding must either
complete or have their timeout fire. Additio
[PATCH] scsi/lpfc: add return code FAST_IO_FAIL in lpfc_abort_handler() comments
Signed-off-by: Ren Mingxin
---
 drivers/scsi/lpfc/lpfc_scsi.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 98af07c..cc6fc83 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -4426,6 +4426,7 @@ lpfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
  * Return code :
  *   0x2003 - Error
  *   0x2002 - Success
+ *   0x2009 - fast_io_fail_tmo fired
  **/
 static int
 lpfc_abort_handler(struct scsi_cmnd *cmnd)
-- 
1.7.1
scsi_error: improve the recovery latency for timeouted scsi cmds
Hi,

Please let me ask one question about improving the recovery latency for
timed-out scmds.

In the functions 'scsi_eh_wakeup()' and 'scsi_error_handler()', there are two
identical condition checks which ensure that the number of active scmds
equals the number of failed scmds:

	void scsi_eh_wakeup(struct Scsi_Host *shost)
	{
		if (shost->host_busy == shost->host_failed)
			wake_up_process(shost->ehandler);
	}

	int scsi_error_handler(void *data)
	{
		while (!kthread_should_stop()) {
			if ((shost->host_failed == 0 &&
			     shost->host_eh_scheduled == 0) ||
			    shost->host_failed != shost->host_busy) {
				schedule();
				continue;
			}
		}
	}

I think the original reason for not waking up the eh thread until all scmds
have completed or failed may be to avoid the overhead of waking the thread up
time after time, right?

But in the following situation this strategy does not seem appropriate: if
one scmd is issued and gets stuck, and then another scmd is issued, scsi eh
detects the timeout of the first scmd but has to wait until the second one
has timed out or completed. This means the first timed-out scmd cannot be
handled in time, which may be fatal to some extent (especially on critical
systems).

So, please let me know the original rationale for the wakeup strategy in eh.
We will investigate further based on your comments.

Any suggestions will be appreciated.

Thanks,
Ren
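To make the condition discussed in this message concrete, here is a small
userspace sketch. struct host_counters and eh_may_run() are hypothetical
stand-ins for the three struct Scsi_Host fields and the check quoted from
scsi_error_handler(); this is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the counters quoted from struct Scsi_Host. */
struct host_counters {
	unsigned int host_busy;		/* scmds currently active on the host */
	unsigned int host_failed;	/* scmds already failed and queued for eh */
	unsigned int host_eh_scheduled;	/* explicit eh requests */
};

/*
 * Negation of the quoted sleep condition: the eh thread proceeds only if
 * there is work for it and every active scmd has already failed.
 */
static bool eh_may_run(const struct host_counters *h)
{
	assert(h != NULL);

	if (h->host_failed == 0 && h->host_eh_scheduled == 0)
		return false;
	return h->host_failed == h->host_busy;
}
```

With two active scmds of which only one has timed out (host_busy = 2,
host_failed = 1), the function returns false, which is the delay described
above; it returns true only once the second scmd has failed as well.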
Re: [PATCH] lpfc: init: fix misspelling word in mailbox command waiting comments
On 12/11/2012 11:53 AM, re...@cn.fujitsu.com wrote:
From: Ren Mingxin

Superfluous, sorry for disturbing everyone :-(

Ren
[PATCH] lpfc: init: fix misspelling word in mailbox command waiting comments
Correct misspelling of "outstanding" in mailbox command waiting comments.

Signed-off-by: Ren Mingxin
Signed-off-by: Pan Dayu
---
 drivers/scsi/lpfc/lpfc_init.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 7dc4218..8533160 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -2566,7 +2566,7 @@ lpfc_block_mgmt_io(struct lpfc_hba *phba, int mbx_action)
 	}
 	spin_unlock_irqrestore(&phba->hbalock, iflag);
 
-	/* Wait for the outstnading mailbox command to complete */
+	/* Wait for the outstanding mailbox command to complete */
 	while (phba->sli.mbox_active) {
 		/* Check active mailbox complete status every 2ms */
 		msleep(2);
-- 
1.7.1