[PATCH] messages: fusion: Staticize local symbols
These local symbols are used only in this file. Fix the following sparse warnings: drivers/message/fusion/mptbase.c:7011:1: warning: symbol 'mpt_SoftResetHandler' was not declared. Should it be static? drivers/message/fusion/mptspi.c:624:1: warning: symbol 'mptscsih_quiesce_raid' was not declared. Should it be static? drivers/message/fusion/mptsas.c:1578:23: warning: symbol 'mptsas_refreshing_device_handles' was not declared. Should it be static? drivers/message/fusion/mptsas.c:3653:24: warning: symbol 'mptsas_expander_add' was not declared. Should it be static? drivers/message/fusion/mptsas.c:5327:1: warning: symbol 'mptsas_shutdown' was not declared. Should it be static? Signed-off-by: Jingoo Han jg1@samsung.com --- drivers/message/fusion/mptbase.c |2 +- drivers/message/fusion/mptsas.c |6 +++--- drivers/message/fusion/mptspi.c |2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/message/fusion/mptbase.c b/drivers/message/fusion/mptbase.c index 767ff4d..c783a42 100644 --- a/drivers/message/fusion/mptbase.c +++ b/drivers/message/fusion/mptbase.c @@ -7007,7 +7007,7 @@ EXPORT_SYMBOL(mpt_halt_firmware); * IOC doesn't reply to any outstanding request. This will transfer IOC * to READY state. **/ -int +static int mpt_SoftResetHandler(MPT_ADAPTER *ioc, int sleepFlag) { int rc; diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c index dd239bd..906f448 100644 --- a/drivers/message/fusion/mptsas.c +++ b/drivers/message/fusion/mptsas.c @@ -1575,7 +1575,7 @@ mptsas_del_end_device(MPT_ADAPTER *ioc, struct mptsas_phyinfo *phy_info) mptsas_port_delete(ioc, phy_info-port_details); } -struct mptsas_phyinfo * +static struct mptsas_phyinfo * mptsas_refreshing_device_handles(MPT_ADAPTER *ioc, struct mptsas_devinfo *sas_device) { @@ -3650,7 +3650,7 @@ mptsas_send_expander_event(struct fw_event_work *fw_event) * @handle: * */ -struct mptsas_portinfo * +static struct mptsas_portinfo * mptsas_expander_add(MPT_ADAPTER *ioc, u16 handle) { struct mptsas_portinfo buffer, *port_info; @@ -5323,7 +5323,7 @@ mptsas_probe(struct pci_dev *pdev, const struct pci_device_id *id) return error; } -void +static void mptsas_shutdown(struct pci_dev *pdev) { MPT_ADAPTER *ioc = pci_get_drvdata(pdev); diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c index 5653e50..ed8de0a 100644 --- a/drivers/message/fusion/mptspi.c +++ b/drivers/message/fusion/mptspi.c @@ -620,7 +620,7 @@ static void mptspi_read_parameters(struct scsi_target *starget) spi_width(starget) = (nego MPI_SCSIDEVPAGE0_NP_WIDE) ? 1 : 0; } -int +static int mptscsih_quiesce_raid(MPT_SCSI_HOST *hd, int quiesce, u8 channel, u8 id) { MPT_ADAPTER *ioc = hd-ioc; -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/7] scsi: ufs: amend the ocs handling with fatal error
On 7/30/2013 6:32 PM, Seungwon Jeon wrote: On Mon, July 29, 2013, Sujit Reddy Thumma wrote: On 7/29/2013 11:47 AM, Subhash Jadavani wrote: On 7/26/2013 7:15 PM, Seungwon Jeon wrote: Fatal error in OCS(overall command status) field indicates error conditions which is not covered by UFSHCI. It means that host cannot define the result of command status and therefore host may need to check transfer response UPIU's response and status field. It was actually found that 'CHECK CONDITION' is stored in status field of UPIU where OCS is 'FATAL ERROR'. It looks like you are assuming that there will be some kind of response from the device. But the spec. mentions - OCS_FATAL ERROR: within host controller that is not covered by the error conditions described above in this table. Yes, error interrupt doesn't happen actually. So spec. left everything to implementers on how to trigger this error. Also, it says within host controller so ufshcd_is_valid_req_rsp() might fail as well. I couldn't understand why there is a need for a host controller to interpret the SCSI command status and raise an OCS_FATAL_ERROR? I feel like OCS values are related to syntax problem except OCS_FATAL_ERROR. Basically, host controller may need to refer the Response field[Target Success/Target Failure] of response UPIU. It's a overall result from response UPIU. When Response field is 'Target Failure', if host controller updates the OCS field with 'SUCCESS', it's not proper behavior. In this case host may refer the Status field of response UPIU to decide the OCS result. Of course, it's not clear thing and could depends on host's implementation, because there is no obvious description for that. But if we consider the way to report UTP error from UFSHCI 8.2.4, we can check the Response UPIU for more detailed error condition regardless OCS value. Could you check your host side? This is what our host documentation says: SW may check the contents of Response UPIU when OCS !=0 for debug purposes. It may (or may not) help to understand the error scenario better. This is needed only for debug. When system is stable, OCS should be 0. So this clearly means that we shouldn't rely on the response UPIU if the OCS is non-zero hence simply ignore it. When Response field is 'Target Failure', if host controller updates the OCS field with 'SUCCESS', it's not proper behavior. I doubt whether your above understanding is true from Host controller specification point of view. Host controller would not decode the “Response” field of the “response UPIU”, it would only decode the “Transaction Type” “Task Tag” (and may be “LUN”) from the response UPIU in order to find out the correct slot in the transfer/task request list. Even 7.3.2.3 in UFSHCI spec also mentions the same. I guess with this i don't see a need for this patch unless you feel otherwise. Regards, Subhash Thanks, Seungwon Jeon If it is clarified by the spec. then we can have generic implementation otherwise I would prefer to make this specific to those host controllers that raise OCS_FATAL_ERROR for CHECK_CONDITION. Signed-off-by: Seungwon Jeon tgih@samsung.com --- drivers/scsi/ufs/ufshcd.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index b743bd6..4cf3a2d 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -1199,7 +1199,7 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) switch (ocs) { case OCS_SUCCESS: - +case OCS_FATAL_ERROR: /* check if the returned transfer response is valid */ result = ufshcd_is_valid_req_rsp(lrbp-ucd_rsp_ptr); I don't see the response UPIU data of the last command response is cleared anywhere in the driver. This means its quite possible that if the current command failed (and if it is using the same tag as the last succeeded command) with the OCS_FATAL_ERROR then response UPIU (pointed by lrbp-ucd_rsp_ptr) content may be still the same as the last one. If we ensure to clear the response UPIU data after every command completion then only we can rely on the response UPIU content in case of fatal errors. Regards, Subhash if (result) { @@ -1229,7 +1229,6 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) case OCS_MISMATCH_DATA_BUF_SIZE: case OCS_MISMATCH_RESP_UPIU_SIZE: case OCS_PEER_COMM_FAILURE: -case OCS_FATAL_ERROR: default: result |= DID_ERROR 16; dev_err(hba-dev, -- Regards, Sujit -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V5 0/4] scsi: ufs: Improve UFS error handling
Hi, I tested the new set of patches (V5 1-4) and it works. Thanks, Dolev The first patch fixes many issues with current task management handling in UFSHCD driver. Others improve error handling in various scenarios. These patches are rebased on: [PATCH 9/9] drivers/scsi/ufs: don't check resource with devm_ioremap_resource Changes from v4: - Addressed comments from Seungwon Jeon on V3 - Updated with proper locking while ufshcd state changes - Retained LUN reset instead of device reset with DME_END_POINT_RESET - Removed error handling decisions dependent on OCS value - Simplified fatal error handling to abort the requests first and then carry out reset. Changes from v3: - Rebased. Changes from v2: - [PATCH V3 1/4]: Make the task management command task tag unique across SCSI/NOP/QUERY request tags. - [PATCH V3 3/4]: While handling device/host reset, wait for pending fatal handler to return if running. Changes from v1: - [PATCH V2 1/4]: Fix a race condition because of overloading outstanding_tasks variable to lock the slots. A new variable tm_slots_in_use will track which slots are in use by the driver. - [PATCH V2 2/4]: Commit text update to clarify the hardware race with more details. - [PATCH V2 3/4]: Minor cleanup and rebase - [PATCH V2 4/4]: Fix a bug - sleeping in atomic context Sujit Reddy Thumma (4): scsi: ufs: Fix broken task management command implementation scsi: ufs: Fix hardware race conditions while aborting a command scsi: ufs: Fix device and host reset methods scsi: ufs: Improve UFS fatal error handling drivers/scsi/ufs/ufshcd.c | 684 - drivers/scsi/ufs/ufshcd.h | 20 +- 2 files changed, 501 insertions(+), 203 deletions(-) -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V5 2/4] scsi: ufs: Fix hardware race conditions while aborting a command
Tested-by: Dolev Raviv dra...@codeaurora.org There is a possible race condition in the hardware when the abort command is issued to terminate the ongoing SCSI command as described below: - A bit in the door-bell register is set in the controller for a new SCSI command. - In some rare situations, before controller get a chance to issue the command to the device, the software issued an abort command. - If the device recieves abort command first then it returns success because the command itself is not present. - Now if the controller commits the command to device it will be processed. - Software thinks that command is aborted and proceed while still the device is processing it. - The software, controller and device may go out of sync because of this race condition. To avoid this, query task presence in the device before sending abort task command so that after the abort operation, the command is guaranteed to be non-existent in both controller and the device. Signed-off-by: Sujit Reddy Thumma sthu...@codeaurora.org --- drivers/scsi/ufs/ufshcd.c | 70 +++- 1 files changed, 55 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index d7f3746..d4ee48d 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -2485,6 +2485,12 @@ static int ufshcd_host_reset(struct scsi_cmnd *cmd) * ufshcd_abort - abort a specific command * @cmd: SCSI command pointer * + * Abort the pending command in device by sending UFS_ABORT_TASK task management + * command, and in host controller by clearing the door-bell register. There can + * be race between controller sending the command to the device while abort is + * issued. To avoid that, first issue UFS_QUERY_TASK to check if the command is + * really issued and then try to abort it. + * * Returns SUCCESS/FAILED */ static int ufshcd_abort(struct scsi_cmnd *cmd) @@ -2493,7 +2499,8 @@ static int ufshcd_abort(struct scsi_cmnd *cmd) struct ufs_hba *hba; unsigned long flags; unsigned int tag; - int err; + int err = 0; + int poll_cnt; u8 resp = 0xF; struct ufshcd_lrb *lrbp; @@ -2501,33 +2508,59 @@ static int ufshcd_abort(struct scsi_cmnd *cmd) hba = shost_priv(host); tag = cmd-request-tag; - spin_lock_irqsave(host-host_lock, flags); + /* If command is already aborted/completed, return SUCCESS */ + if (!(test_bit(tag, hba-outstanding_reqs))) + goto out; - /* check if command is still pending */ - if (!(test_bit(tag, hba-outstanding_reqs))) { - err = FAILED; - spin_unlock_irqrestore(host-host_lock, flags); + lrbp = hba-lrb[tag]; + for (poll_cnt = 100; poll_cnt; poll_cnt--) { + err = ufshcd_issue_tm_cmd(hba, lrbp-lun, lrbp-task_tag, + UFS_QUERY_TASK, resp); + if (!err resp == UPIU_TASK_MANAGEMENT_FUNC_SUCCEEDED) { + /* cmd pending in the device */ + break; + } else if (!err resp == UPIU_TASK_MANAGEMENT_FUNC_COMPL) { + u32 reg; + + /* + * cmd not pending in the device, check if it is + * in transition. + */ + reg = ufshcd_readl(hba, REG_UTP_TRANSFER_REQ_DOOR_BELL); + if (reg (1 tag)) { + /* sleep for max. 2ms to stabilize */ + usleep_range(1000, 2000); + continue; + } + /* command completed already */ + goto out; + } else { + if (!err) + err = resp; /* service response error */ + goto out; + } + } + + if (!poll_cnt) { + err = -EBUSY; goto out; } - spin_unlock_irqrestore(host-host_lock, flags); - lrbp = hba-lrb[tag]; err = ufshcd_issue_tm_cmd(hba, lrbp-lun, lrbp-task_tag, UFS_ABORT_TASK, resp); if (err || resp != UPIU_TASK_MANAGEMENT_FUNC_COMPL) { - err = FAILED; + if (!err) + err = resp; /* service response error */ goto out; - } else { - err = SUCCESS; } + err = ufshcd_clear_cmd(hba, tag); + if (err) + goto out; + scsi_dma_unmap(cmd); spin_lock_irqsave(host-host_lock, flags); - - /* clear the respective UTRLCLR register bit */ - ufshcd_utrl_clear(hba, tag); - __clear_bit(tag, hba-outstanding_reqs); hba-lrb[tag].cmd = NULL; spin_unlock_irqrestore(host-host_lock, flags); @@ -2535,6 +2568,13 @@ static int ufshcd_abort(struct
Re: [PATCH V5 1/4] scsi: ufs: Fix broken task management command implementation
Tested-by: Dolev Raviv dra...@codeaurora.org Currently, sending Task Management (TM) command to the card might be broken in some scenarios as listed below: Problem: If there are more than 8 TM commands the implementation returns error to the caller. Fix: Wait for one of the slots to be emptied and send the command. Problem: Sometimes it is necessary for the caller to know the TM service response code to determine the task status. Fix: Propogate the service response to the caller. Problem: If the TM command times out no proper error recovery is implemented. Fix: Clear the command in the controller door-bell register, so that further commands for the same slot don't fail. Problem: While preparing the TM command descriptor, the task tag used should be unique across SCSI/NOP/QUERY/TM commands and not the task tag of the command which the TM command is trying to manage. Fix: Use a unique task tag instead of task tag of SCSI command. Problem: Since the TM command involves H/W communication, abruptly ending the request on kill interrupt signal might cause h/w malfunction. Fix: Wait for hardware completion interrupt with TASK_UNINTERRUPTIBLE set. Signed-off-by: Sujit Reddy Thumma sthu...@codeaurora.org --- drivers/scsi/ufs/ufshcd.c | 173 ++--- drivers/scsi/ufs/ufshcd.h |8 ++- 2 files changed, 122 insertions(+), 59 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index b36ca9a..d7f3746 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -53,6 +53,9 @@ /* Query request timeout */ #define QUERY_REQ_TIMEOUT 30 /* msec */ +/* Task management command timeout */ +#define TM_CMD_TIMEOUT 100 /* msecs */ + /* Expose the flag value from utp_upiu_query.value */ #define MASK_QUERY_UPIU_FLAG_LOC 0xFF @@ -183,13 +186,35 @@ ufshcd_get_tmr_ocs(struct utp_task_req_desc *task_req_descp) /** * ufshcd_get_tm_free_slot - get a free slot for task management request * @hba: per adapter instance + * @free_slot: pointer to variable with available slot value * - * Returns maximum number of task management request slots in case of - * task management queue full or returns the free slot number + * Get a free tag and lock it until ufshcd_put_tm_slot() is called. + * Returns 0 if free slot is not available, else return 1 with tag value + * in @free_slot. */ -static inline int ufshcd_get_tm_free_slot(struct ufs_hba *hba) +static bool ufshcd_get_tm_free_slot(struct ufs_hba *hba, int *free_slot) { - return find_first_zero_bit(hba-outstanding_tasks, hba-nutmrs); + int tag; + bool ret = false; + + if (!free_slot) + goto out; + + do { + tag = find_first_zero_bit(hba-tm_slots_in_use, hba-nutmrs); + if (tag = hba-nutmrs) + goto out; + } while (test_and_set_bit_lock(tag, hba-tm_slots_in_use)); + + *free_slot = tag; + ret = true; +out: + return ret; +} + +static inline void ufshcd_put_tm_slot(struct ufs_hba *hba, int slot) +{ + clear_bit_unlock(slot, hba-tm_slots_in_use); } /** @@ -1700,10 +1725,11 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev) * ufshcd_task_req_compl - handle task management request completion * @hba: per adapter instance * @index: index of the completed request + * @resp: task management service response * - * Returns SUCCESS/FAILED + * Returns non-zero value on error, zero on success */ -static int ufshcd_task_req_compl(struct ufs_hba *hba, u32 index) +static int ufshcd_task_req_compl(struct ufs_hba *hba, u32 index, u8 *resp) { struct utp_task_req_desc *task_req_descp; struct utp_upiu_task_rsp *task_rsp_upiup; @@ -1724,19 +1750,15 @@ static int ufshcd_task_req_compl(struct ufs_hba *hba, u32 index) task_req_descp[index].task_rsp_upiu; task_result = be32_to_cpu(task_rsp_upiup-header.dword_1); task_result = ((task_result MASK_TASK_RESPONSE) 8); - - if (task_result != UPIU_TASK_MANAGEMENT_FUNC_COMPL - task_result != UPIU_TASK_MANAGEMENT_FUNC_SUCCEEDED) - task_result = FAILED; - else - task_result = SUCCESS; + if (resp) + *resp = (u8)task_result; } else { - task_result = FAILED; - dev_err(hba-dev, - trc: Invalid ocs = %x\n, ocs_value); + dev_err(hba-dev, %s: failed, ocs = 0x%x\n, + __func__, ocs_value); } spin_unlock_irqrestore(hba-host-host_lock, flags); - return task_result; + + return ocs_value; } /** @@ -2237,7 +2259,7 @@ static void ufshcd_tmc_handler(struct ufs_hba *hba)
Re: [PATCH V5 3/4] scsi: ufs: Fix device and host reset methods
Tested-by: Dolev Raviv dra...@codeaurora.org As of now SCSI initiated error handling is broken because, the reset APIs don't try to bring back the device initialized and ready for further transfers. In case of timeouts, the scsi error handler takes care of handling aborts and resets. Improve the error handling in such scenario by resetting the device and host and re-initializing them in proper manner. Signed-off-by: Sujit Reddy Thumma sthu...@codeaurora.org --- drivers/scsi/ufs/ufshcd.c | 240 +++-- drivers/scsi/ufs/ufshcd.h |2 + 2 files changed, 189 insertions(+), 53 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index d4ee48d..c577e6e 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -69,9 +69,14 @@ enum { /* UFSHCD states */ enum { - UFSHCD_STATE_OPERATIONAL, UFSHCD_STATE_RESET, UFSHCD_STATE_ERROR, + UFSHCD_STATE_OPERATIONAL, +}; + +/* UFSHCD error handling flags */ +enum { + UFSHCD_EH_IN_PROGRESS = (1 0), }; /* Interrupt configuration options */ @@ -87,6 +92,16 @@ enum { INT_AGGR_CONFIG, }; +#define ufshcd_set_eh_in_progress(h) \ + (h-eh_flags |= UFSHCD_EH_IN_PROGRESS) +#define ufshcd_eh_in_progress(h) \ + (h-eh_flags UFSHCD_EH_IN_PROGRESS) +#define ufshcd_clear_eh_in_progress(h) \ + (h-eh_flags = ~UFSHCD_EH_IN_PROGRESS) + +static void ufshcd_tmc_handler(struct ufs_hba *hba); +static void ufshcd_async_scan(void *data, async_cookie_t cookie); + /* * ufshcd_wait_for_register - wait for register value to change * @hba - per-adapter interface @@ -840,10 +855,25 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) tag = cmd-request-tag; - if (hba-ufshcd_state != UFSHCD_STATE_OPERATIONAL) { + spin_lock_irqsave(hba-host-host_lock, flags); + switch (hba-ufshcd_state) { + case UFSHCD_STATE_OPERATIONAL: + break; + case UFSHCD_STATE_RESET: err = SCSI_MLQUEUE_HOST_BUSY; - goto out; + goto out_unlock; + case UFSHCD_STATE_ERROR: + set_host_byte(cmd, DID_ERROR); + cmd-scsi_done(cmd); + goto out_unlock; + default: + dev_WARN_ONCE(hba-dev, 1, %s: invalid state %d\n, + __func__, hba-ufshcd_state); + set_host_byte(cmd, DID_BAD_TARGET); + cmd-scsi_done(cmd); + goto out_unlock; } + spin_unlock_irqrestore(hba-host-host_lock, flags); /* acquire the tag to make sure device cmds don't use it */ if (test_and_set_bit_lock(tag, hba-lrb_in_use)) { @@ -880,6 +910,7 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) /* issue command to the controller */ spin_lock_irqsave(hba-host-host_lock, flags); ufshcd_send_command(hba, tag); +out_unlock: spin_unlock_irqrestore(hba-host-host_lock, flags); out: return err; @@ -1495,8 +1526,6 @@ static int ufshcd_make_hba_operational(struct ufs_hba *hba) if (hba-ufshcd_state == UFSHCD_STATE_RESET) scsi_unblock_requests(hba-host); - hba-ufshcd_state = UFSHCD_STATE_OPERATIONAL; - out: return err; } @@ -2245,8 +2274,12 @@ static void ufshcd_err_handler(struct ufs_hba *hba) } return; fatal_eh: - hba-ufshcd_state = UFSHCD_STATE_ERROR; - schedule_work(hba-feh_workq); + /* handle fatal errors only when link is functional */ + if (hba-ufshcd_state == UFSHCD_STATE_OPERATIONAL) { + /* block commands at driver layer until error is handled */ + hba-ufshcd_state = UFSHCD_STATE_ERROR; + schedule_work(hba-feh_workq); + } } /** @@ -2411,12 +2444,13 @@ static int ufshcd_issue_tm_cmd(struct ufs_hba *hba, int lun_id, int task_id, } /** - * ufshcd_device_reset - reset device and abort all the pending commands + * ufshcd_eh_device_reset_handler - device reset handler registered to + *scsi layer. * @cmd: SCSI command pointer * * Returns SUCCESS/FAILED */ -static int ufshcd_device_reset(struct scsi_cmnd *cmd) +static int ufshcd_eh_device_reset_handler(struct scsi_cmnd *cmd) { struct Scsi_Host *host; struct ufs_hba *hba; @@ -2425,6 +2459,7 @@ static int ufshcd_device_reset(struct scsi_cmnd *cmd) int err; u8 resp = 0xF; struct ufshcd_lrb *lrbp; + unsigned long flags; host = cmd-device-host; hba = shost_priv(host); @@ -2433,55 +2468,33 @@ static int ufshcd_device_reset(struct scsi_cmnd *cmd) lrbp = hba-lrb[tag]; err = ufshcd_issue_tm_cmd(hba, lrbp-lun, 0, UFS_LOGICAL_RESET, resp); if (err || resp != UPIU_TASK_MANAGEMENT_FUNC_COMPL) { - err = FAILED; + if (!err) + err
Re: [PATCH V5 4/4] scsi: ufs: Improve UFS fatal error handling
Tested-by: Dolev Raviv dra...@codeaurora.org Error handling in UFS driver is broken and resets the host controller for fatal errors without re-initialization. Correct the fatal error handling sequence according to UFS Host Controller Interface (HCI) v1.1 specification. o Processed requests which are completed w/wo error are reported to SCSI layer and any pending commands that are not started are aborted in the controller and re-queued into scsi mid-layer queue. o Upon determining fatal error condition the host controller may hang forever until a reset is applied. Block SCSI layer for sending new requests and apply reset in a separate error handling work. o SCSI is informed about the expected Unit-Attention exception from the device for the immediate command after a reset so that the SCSI layer take necessary steps to establish communication with the device. Signed-off-by: Sujit Reddy Thumma sthu...@codeaurora.org --- drivers/scsi/ufs/ufshcd.c | 229 - drivers/scsi/ufs/ufshcd.h | 10 ++- 2 files changed, 149 insertions(+), 90 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index c577e6e..4dca9b4 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -79,6 +79,14 @@ enum { UFSHCD_EH_IN_PROGRESS = (1 0), }; +/* UFSHCD UIC layer error flags */ +enum { + UFSHCD_UIC_DL_PA_INIT_ERROR = (1 0), /* Data link layer error */ + UFSHCD_UIC_NL_ERROR = (1 1), /* Network layer error */ + UFSHCD_UIC_TL_ERROR = (1 2), /* Transport Layer error */ + UFSHCD_UIC_DME_ERROR = (1 3), /* DME error */ +}; + /* Interrupt configuration options */ enum { UFSHCD_INT_DISABLE, @@ -101,6 +109,8 @@ enum { static void ufshcd_tmc_handler(struct ufs_hba *hba); static void ufshcd_async_scan(void *data, async_cookie_t cookie); +static int ufshcd_reset_and_restore(struct ufs_hba *hba); +static int ufshcd_clear_tm_cmd(struct ufs_hba *hba, int tag); /* * ufshcd_wait_for_register - wait for register value to change @@ -1523,9 +1533,6 @@ static int ufshcd_make_hba_operational(struct ufs_hba *hba) goto out; } - if (hba-ufshcd_state == UFSHCD_STATE_RESET) - scsi_unblock_requests(hba-host); - out: return err; } @@ -1651,66 +1658,6 @@ static int ufshcd_verify_dev_init(struct ufs_hba *hba) } /** - * ufshcd_do_reset - reset the host controller - * @hba: per adapter instance - * - * Returns SUCCESS/FAILED - */ -static int ufshcd_do_reset(struct ufs_hba *hba) -{ - struct ufshcd_lrb *lrbp; - unsigned long flags; - int tag; - - /* block commands from midlayer */ - scsi_block_requests(hba-host); - - spin_lock_irqsave(hba-host-host_lock, flags); - hba-ufshcd_state = UFSHCD_STATE_RESET; - - /* send controller to reset state */ - ufshcd_hba_stop(hba); - spin_unlock_irqrestore(hba-host-host_lock, flags); - - /* abort outstanding commands */ - for (tag = 0; tag hba-nutrs; tag++) { - if (test_bit(tag, hba-outstanding_reqs)) { - lrbp = hba-lrb[tag]; - if (lrbp-cmd) { - scsi_dma_unmap(lrbp-cmd); - lrbp-cmd-result = DID_RESET 16; - lrbp-cmd-scsi_done(lrbp-cmd); - lrbp-cmd = NULL; - clear_bit_unlock(tag, hba-lrb_in_use); - } - } - } - - /* complete device management command */ - if (hba-dev_cmd.complete) - complete(hba-dev_cmd.complete); - - /* clear outstanding request/task bit maps */ - hba-outstanding_reqs = 0; - hba-outstanding_tasks = 0; - - /* Host controller enable */ - if (ufshcd_hba_enable(hba)) { - dev_err(hba-dev, - Reset: Controller initialization failed\n); - return FAILED; - } - - if (ufshcd_link_startup(hba)) { - dev_err(hba-dev, - Reset: Link start-up failed\n); - return FAILED; - } - - return SUCCESS; -} - -/** * ufshcd_slave_alloc - handle initial SCSI device configurations * @sdev: pointer to SCSI device * @@ -1727,6 +1674,9 @@ static int ufshcd_slave_alloc(struct scsi_device *sdev) sdev-use_10_for_ms = 1; scsi_set_tag_type(sdev, MSG_SIMPLE_TAG); + /* allow SCSI layer to restart the device in case of errors */ + sdev-allow_restart = 1; + /* * Inform SCSI Midlayer that the LUN queue depth is same as the * controller queue depth. If a LUN queue depth is less than the @@ -1930,6 +1880,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) case OCS_ABORTED: result |= DID_ABORT 16; break; + case
Re: [PATCH] hpsa: fix warning with smp_processor_id() in preemptible
* John Kacur | 2013-07-26 16:42:30 [+0200]: Signed-off-by: John Kacur jka...@redhat.com Acked-by: Stephen scame...@beardog.cce.hp.com ping. I checked the branches for-next, scsi-fixes, fixes and misc at [0] and I didn't see it. I'm going to take this for 3.10-rt but please don't lose it on its way to Linus :) [0] git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git Sebastian -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] hpsa: fix warning with smp_processor_id() in preemptible
- Original Message - * John Kacur | 2013-07-26 16:42:30 [+0200]: Signed-off-by: John Kacur jka...@redhat.com Acked-by: Stephen scame...@beardog.cce.hp.com ping. I checked the branches for-next, scsi-fixes, fixes and misc at [0] and I didn't see it. I'm going to take this for 3.10-rt but please don't lose it on its way to Linus :) [0] git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git I hope it was clear to everyone that this patch was intended for upstream. It was discovered by running the real-time kernel, but it exposes a (minor) problem that should be fixed in the mainline kernel. Please apply it there Stephen, and push it upstream appropriately. Thank You. John Kacur -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] hpsa: fix warning with smp_processor_id() in preemptible
On Mon, Aug 12, 2013 at 09:33:32AM -0400, John Kacur wrote: - Original Message - * John Kacur | 2013-07-26 16:42:30 [+0200]: Signed-off-by: John Kacur jka...@redhat.com Acked-by: Stephen scame...@beardog.cce.hp.com ping. I checked the branches for-next, scsi-fixes, fixes and misc at [0] and I didn't see it. I'm going to take this for 3.10-rt but please don't lose it on its way to Linus :) [0] git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git I hope it was clear to everyone that this patch was intended for upstream. It was discovered by running the real-time kernel, but it exposes a (minor) problem that should be fixed in the mainline kernel. Please apply it there Stephen, and push it upstream appropriately. I don't have such pushing abilities. Acking it as I've done is all I can do. Bug James. :) -- steve -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
On Fri, Aug 09, 2013 at 11:07:37AM -0600, Jens Axboe wrote: You don't have to resubmit, I'll get it reviewed and applied today. Hi Jens, I limited the minimal queue depth to 4, which is apparently wrong in case of libata. I will post a new series. -- Jens Axboe -- Regards, Alexander Gordeev agord...@redhat.com -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open
On 08/06/2013 04:52 AM, Douglas Gilbert wrote: On 13-08-04 10:19 PM, vaughan wrote: On 08/03/2013 01:25 PM, Douglas Gilbert wrote: On 13-08-01 01:01 AM, Douglas Gilbert wrote: On 13-07-22 01:03 PM, Jörn Engel wrote: On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote: There is a race when open sg with O_EXCL flag. Also a race may happen between sg_open and sg_remove. Changes from v4: * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp * [4/4] fix conflict for cherry-pick from v3. Changes from v3: * release o_sem in sg_release(), not in sg_remove_sfp(). * not set exclude with sfd_lock held. Vaughan Cao (4): [SCSI] sg: use rwsem to solve race during exclusive open [SCSI] sg: no need sg_open_exclusive_lock [SCSI] sg: checking sdp-detached isn't protected when open [SCSI] sg: push file descriptor list locking down to per-device locking drivers/scsi/sg.c | 178 +- 1 file changed, 83 insertions(+), 95 deletions(-) Patchset looks good to me, although I didn't test it on hardware yet. Signed-off-by: Joern Engel jo...@logfs.org James, care to pick this up? Acked-by: Douglas Gilbert dgilb...@interlog.com Tested O_EXCL with multiple processes and threads; passed. sg driver prior to this patch had leaky O_EXCL logic according to the same test. Block device passed. James, could you clean this up: drivers/scsi/sg.c:242:6: warning: unused variable ‘res’ [-Wunused-variable] Further testing suggests this patch on the sg driver is broken, so I'll rescind my ack. The case it is broken for is when a device is opened without O_EXCL. Now if, while it is open, a second thread/process tries to open the same device O_EXCL then IMO the second open should fail with EBUSY. My testing shows that O_EXCL opens properly deflect other O_EXCL opens. Hi Doug, My test don't have this issue. The routine is something as below: I start three opens without O_EXCL, wait 30s each, and open with O_EXCL|O_NONBLOCK, it failed with EBUSY. And I also call myopen with/without O_EXCL many times in background at the same time, and the test is passed. I don't know why it failed in your test. Usage: myopen [-e][-n][-d delay] -f file -e: exclude -n: nonblock -d: delay N seconds and then close. [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 [1] 3417 [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 [2] 3418 [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 [3] 3419 [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug max_active_device=6(origin 1) def_reserved_size=32768 device=sg5 scsi5 chan=0 id=1 lun=0 em=0 sg_tablesize=55 excl=0 FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active [root@vacaowol5 16835013]# ./myopen -e -n -f /dev/sg5 -d 30 [4] 3422 [3422:3351] /dev/sg5:exclude: Device or resource busy [4]+ Exit 1 ./myopen -e -n -f /dev/sg5 -d 30 [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug max_active_device=6(origin 1) def_reserved_size=32768 device=sg5 scsi5 chan=0 id=1 lun=0 em=0 sg_tablesize=55 excl=0 FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug [1] Done./myopen -f /dev/sg5 -d 30 [2]- Done./myopen -f /dev/sg5 -d 30 [3]+ Done./myopen -f /dev/sg5 -d 30 Hi, After the initial failures about 36 hours ago, retesting yesterday and today has not produced any unexpected failures. And I have been trying hard on lk 3.10.4 and lk 3.10.5 . My test program is a bit more intense than yours and can be found in the sg3_utils beta in the News section of this page: http://sg.danny.cz/sg/ It is in the examples directory, two variants called sg_tst_excl and sg_tst_excl2 . You will need a recent gcc compiler, IOW something that can compile c++11 . gcc 4.7.3 in Ubuntu 13.04 only just manages, fedora 19 should do better with gcc 4.8.1 . The threading is implemented using pthreads so it should be reliable. Typically I run multiple instances (processes) and each has multiple threads. One instance can run '-x' which will cause its first thread not to use
Re: [PATCH v5 0/4] [SCSI] sg: fix race condition in sg_open
On 13-08-12 10:46 PM, vaughan wrote: On 08/06/2013 04:52 AM, Douglas Gilbert wrote: On 13-08-04 10:19 PM, vaughan wrote: On 08/03/2013 01:25 PM, Douglas Gilbert wrote: On 13-08-01 01:01 AM, Douglas Gilbert wrote: On 13-07-22 01:03 PM, Jörn Engel wrote: On Mon, 22 July 2013 12:40:29 +0800, Vaughan Cao wrote: There is a race when open sg with O_EXCL flag. Also a race may happen between sg_open and sg_remove. Changes from v4: * [3/4] use ERR_PTR series instead of adding another parameter in sg_add_sfp * [4/4] fix conflict for cherry-pick from v3. Changes from v3: * release o_sem in sg_release(), not in sg_remove_sfp(). * not set exclude with sfd_lock held. Vaughan Cao (4): [SCSI] sg: use rwsem to solve race during exclusive open [SCSI] sg: no need sg_open_exclusive_lock [SCSI] sg: checking sdp-detached isn't protected when open [SCSI] sg: push file descriptor list locking down to per-device locking drivers/scsi/sg.c | 178 +- 1 file changed, 83 insertions(+), 95 deletions(-) Patchset looks good to me, although I didn't test it on hardware yet. Signed-off-by: Joern Engel jo...@logfs.org James, care to pick this up? Acked-by: Douglas Gilbert dgilb...@interlog.com Tested O_EXCL with multiple processes and threads; passed. sg driver prior to this patch had leaky O_EXCL logic according to the same test. Block device passed. James, could you clean this up: drivers/scsi/sg.c:242:6: warning: unused variable ‘res’ [-Wunused-variable] Further testing suggests this patch on the sg driver is broken, so I'll rescind my ack. The case it is broken for is when a device is opened without O_EXCL. Now if, while it is open, a second thread/process tries to open the same device O_EXCL then IMO the second open should fail with EBUSY. My testing shows that O_EXCL opens properly deflect other O_EXCL opens. Hi Doug, My test don't have this issue. The routine is something as below: I start three opens without O_EXCL, wait 30s each, and open with O_EXCL|O_NONBLOCK, it failed with EBUSY. And I also call myopen with/without O_EXCL many times in background at the same time, and the test is passed. I don't know why it failed in your test. Usage: myopen [-e][-n][-d delay] -f file -e: exclude -n: nonblock -d: delay N seconds and then close. [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 [1] 3417 [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 [2] 3418 [root@vacaowol5 16835013]# ./myopen -f /dev/sg5 -d 30 [3] 3419 [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug max_active_device=6(origin 1) def_reserved_size=32768 device=sg5 scsi5 chan=0 id=1 lun=0 em=0 sg_tablesize=55 excl=0 FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active [root@vacaowol5 16835013]# ./myopen -e -n -f /dev/sg5 -d 30 [4] 3422 [3422:3351] /dev/sg5:exclude: Device or resource busy [4]+ Exit 1 ./myopen -e -n -f /dev/sg5 -d 30 [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug max_active_device=6(origin 1) def_reserved_size=32768 device=sg5 scsi5 chan=0 id=1 lun=0 em=0 sg_tablesize=55 excl=0 FD(1): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active FD(2): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active FD(3): timeout=6ms bufflen=32768 (res)sgat=1 low_dma=0 cmd_q=0 f_packid=0 k_orphan=0 closed=0 No requests active [root@vacaowol5 16835013]# cat /proc/scsi/sg/debug [1] Done./myopen -f /dev/sg5 -d 30 [2]- Done./myopen -f /dev/sg5 -d 30 [3]+ Done./myopen -f /dev/sg5 -d 30 Hi, After the initial failures about 36 hours ago, retesting yesterday and today has not produced any unexpected failures. And I have been trying hard on lk 3.10.4 and lk 3.10.5 . My test program is a bit more intense than yours and can be found in the sg3_utils beta in the News section of this page: http://sg.danny.cz/sg/ It is in the examples directory, two variants called sg_tst_excl and sg_tst_excl2 . You will need a recent gcc compiler, IOW something that can compile c++11 . gcc 4.7.3 in Ubuntu 13.04 only just manages, fedora 19 should do better with gcc 4.8.1 . The threading is implemented using pthreads so it should be reliable. Typically I run multiple instances (processes) and each has multiple threads. One instance can run '-x' which will cause its first thread not to use O_EXCL **. All my tests