Re: [PATCH v2 1/2] scsi: Handle Unit Attention when issuing SCSI command
On Mon, 2016-10-17 at 14:17 -0200, Gabriel Krisman Bertazi wrote: > James Bottomleywrites: > > > On Thu, 2016-10-13 at 15:47 -0300, Gabriel Krisman Bertazi wrote: > > > @@ -210,6 +219,13 @@ int scsi_execute(struct scsi_device *sdev, > > > const > > > unsigned char *cmd, > > >*/ > > > blk_execute_rq(req->q, NULL, req, 1); > > > > > > + if (scsi_sense_unit_attention(sense) && req->retries > > > > 0) { > > > + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); > > > + retries = req->retries - 1; > > > + blk_put_request(req); > > > + goto retry; > > > + } > > > > OK, so this is more theory, but I think you can actually reuse the > > same > > request to go around this loop without doing a get/put. I've cc'd > > Jens > > to confirm, since no other driver I can find does this, but if it's > > legal, it saves freeing and reallocating the request. You can then > > replace the goto with a do { } while (...) which makes the loop > > obvious > > to the next person looking at this. > > Hi James, > > I don't think the block layer currently has the machinery to reuse > the request. I think it would be easy to add for the MQ case but I > don't know about SQ. If we don't clean up or reinit the request > before re-sending, we'll hit the BUG_ON in blk_start_request: > > BUG_ON(test_bit(REQ_ATOM_COMPLETE, >atomic_flags)); > > Do you wanna take a v3 of the patch and fix this on a future patch, That works. I certainly believe, looking at the code, that we can reuse the request, but in the absence from confirmation from Jens I'm certainly not going to insist on it. James > or should I be looking into patching the block layer interface? I'll > be looking into it, but I need to get familiar with the SQ code > first. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/2] scsi: Handle Unit Attention when issuing SCSI command
On 10/17/2016 11:17 AM, Gabriel Krisman Bertazi wrote: > James Bottomleywrites: > >> On Thu, 2016-10-13 at 15:47 -0300, Gabriel Krisman Bertazi wrote: >>> @@ -210,6 +219,13 @@ int scsi_execute(struct scsi_device *sdev, const >>> unsigned char *cmd, >>> */ >>> blk_execute_rq(req->q, NULL, req, 1); >>> >>> + if (scsi_sense_unit_attention(sense) && req->retries > 0) { >>> + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); >>> + retries = req->retries - 1; >>> + blk_put_request(req); >>> + goto retry; >>> + } >> >> OK, so this is more theory, but I think you can actually reuse the same >> request to go around this loop without doing a get/put. I've cc'd Jens >> to confirm, since no other driver I can find does this, but if it's >> legal, it saves freeing and reallocating the request. You can then >> replace the goto with a do { } while (...) which makes the loop obvious >> to the next person looking at this. > > Hi James, > > I don't think the block layer currently has the machinery to reuse the > request. I think it would be easy to add for the MQ case but I > don't know about SQ. If we don't clean up or reinit the request before > re-sending, we'll hit the BUG_ON in blk_start_request: > > BUG_ON(test_bit(REQ_ATOM_COMPLETE, >atomic_flags)); > > Do you wanna take a v3 of the patch and fix this on a future patch, or > should I be looking into patching the block layer interface? I'll be > looking into it, but I need to get familiar with the SQ code first. Would it make more sense to try to stick this retry logic in scsi_error.c in check_sense? Or do we not have enough information there to decide whether or not to retry 06/29? -Brian -- Brian King Power Linux I/O IBM Linux Technology Center -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/2] scsi: Handle Unit Attention when issuing SCSI command
James Bottomleywrites: > On Thu, 2016-10-13 at 15:47 -0300, Gabriel Krisman Bertazi wrote: >> @@ -210,6 +219,13 @@ int scsi_execute(struct scsi_device *sdev, const >> unsigned char *cmd, >> */ >> blk_execute_rq(req->q, NULL, req, 1); >> >> +if (scsi_sense_unit_attention(sense) && req->retries > 0) { >> +memset(sense, 0, SCSI_SENSE_BUFFERSIZE); >> +retries = req->retries - 1; >> +blk_put_request(req); >> +goto retry; >> +} > > OK, so this is more theory, but I think you can actually reuse the same > request to go around this loop without doing a get/put. I've cc'd Jens > to confirm, since no other driver I can find does this, but if it's > legal, it saves freeing and reallocating the request. You can then > replace the goto with a do { } while (...) which makes the loop obvious > to the next person looking at this. Hi James, I don't think the block layer currently has the machinery to reuse the request. I think it would be easy to add for the MQ case but I don't know about SQ. If we don't clean up or reinit the request before re-sending, we'll hit the BUG_ON in blk_start_request: BUG_ON(test_bit(REQ_ATOM_COMPLETE, >atomic_flags)); Do you wanna take a v3 of the patch and fix this on a future patch, or should I be looking into patching the block layer interface? I'll be looking into it, but I need to get familiar with the SQ code first. -- Gabriel Krisman Bertazi -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/2] scsi: Handle Unit Attention when issuing SCSI command
On Thu, 2016-10-13 at 15:47 -0300, Gabriel Krisman Bertazi wrote: > Hi James, > > Thanks for your review. Please see the v2 below. > > > OK, so really this isn't what you want, because blk_execute_req may > > have used several of your retries, so you now get a maximum > > possible set of retries at UNIT_ATTENTION_RETRIES*retries. You > > need to start from the returned req->retries, which probably means > > this loop needs to be inside __scsi_execute. > > Hmm, I was aware of that, but I saw there were other places that may > have run retries^2 times, like scsi_test_unit_ready and > scsi_mode_sense, if I read the code correctly. There's a lot of this type of retry multiplication, but we really do need to reduce it, not increase it. > But, I see your point and I fixed it on v2. I also updated the > second patch to rework these cases. > > Another thing that got me confused is where the blk layer updates > req->retries. > > What do you think about the v2 below? > > Thanks, > > -- >8 -- > > Usually, re-sending the SCSI command is enough to recover from a Unit > Attention (UA). This adds a generic retry code to the SCSI command > path in case of an UA, before giving up and returning the error > condition to the caller. > > I added the UA verification into scsi_execute instead of > scsi_execute_req because there are at least a few callers that invoke > scsi_execute directly and would benefit from the internal UA retry. > Also, I didn't use scsi_normalize_sense to not duplicate > functionality with scsi_execute_req_flags. Instead, scsi_execute > uses a small helper function that verifies only the UA condition > directly from the raw sense buffer. If this design is not OK, I can > refactor to use scsi_normalize_sense. > > This prevents us from duplicating the retry code in at least a few > places. In particular, it fixes an issue found in some IBM > enclosures, in which the device may return an Unit Attention during > probe, breaking the bind with the ses module: > > scsi 1:0:7:0: Failed to get diagnostic page 0x802 > scsi 1:0:7:0: Failed to bind enclosure -19 > > Link: https://patchwork.kernel.org/patch/9336763/ > Suggested-by: Brian King> Suggested-by: James Bottomley > Signed-off-by: Gabriel Krisman Bertazi > --- > drivers/scsi/scsi_lib.c| 27 --- > include/scsi/scsi_common.h | 9 + > 2 files changed, 33 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index c71344aebdbb..9c6623abf120 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -187,15 +187,24 @@ int scsi_execute(struct scsi_device *sdev, > const unsigned char *cmd, > struct request *req; > int write = (data_direction == DMA_TO_DEVICE); > int ret = DRIVER_ERROR << 24; > + bool priv_sense = false; > > + if (!sense) { > + sense = kzalloc(SCSI_SENSE_BUFFERSIZE, GFP_NOIO); > + if (!sense) > + return ret; > + priv_sense = true; > + } I think here we might be able to stack allocate this and avoid a potential error return under memory pressure. The annoying thing is that blk_execute_rq does exactly this, so now we have an additional useless allocation bloating the stack, but I think it's worth it to get rid of priv_sense and all the error legs. > + retry: > req = blk_get_request(sdev->request_queue, write, > __GFP_RECLAIM); > if (IS_ERR(req)) > - return ret; > + goto free_sense; > blk_rq_set_block_pc(req); > > if (bufflen && blk_rq_map_kern(sdev->request_queue, > req, > buffer, bufflen, > __GFP_RECLAIM)) > - goto out; > + goto put_req; > > req->cmd_len = COMMAND_SIZE(cmd[0]); > memcpy(req->cmd, cmd, req->cmd_len); > @@ -210,6 +219,13 @@ int scsi_execute(struct scsi_device *sdev, const > unsigned char *cmd, >*/ > blk_execute_rq(req->q, NULL, req, 1); > > + if (scsi_sense_unit_attention(sense) && req->retries > 0) { > + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); > + retries = req->retries - 1; > + blk_put_request(req); > + goto retry; > + } OK, so this is more theory, but I think you can actually reuse the same request to go around this loop without doing a get/put. I've cc'd Jens to confirm, since no other driver I can find does this, but if it's legal, it saves freeing and reallocating the request. You can then replace the goto with a do { } while (...) which makes the loop obvious to the next person looking at this. > /* >* Some devices (USB mass-storage in particular) may > transfer >* garbage data together with a residue indicating that the > data > @@ -222,9 +238,14 @@ int scsi_execute(struct scsi_device *sdev, const
[PATCH v2 1/2] scsi: Handle Unit Attention when issuing SCSI command
Hi James, Thanks for your review. Please see the v2 below. > OK, so really this isn't what you want, because blk_execute_req may > have used several of your retries, so you now get a maximum possible > set of retries at UNIT_ATTENTION_RETRIES*retries. You need to start > from the returned req->retries, which probably means this loop needs to > be inside __scsi_execute. Hmm, I was aware of that, but I saw there were other places that may have run retries^2 times, like scsi_test_unit_ready and scsi_mode_sense, if I read the code correctly. But, I see your point and I fixed it on v2. I also updated the second patch to rework these cases. Another thing that got me confused is where the blk layer updates req->retries. What do you think about the v2 below? Thanks, -- >8 -- Usually, re-sending the SCSI command is enough to recover from a Unit Attention (UA). This adds a generic retry code to the SCSI command path in case of an UA, before giving up and returning the error condition to the caller. I added the UA verification into scsi_execute instead of scsi_execute_req because there are at least a few callers that invoke scsi_execute directly and would benefit from the internal UA retry. Also, I didn't use scsi_normalize_sense to not duplicate functionality with scsi_execute_req_flags. Instead, scsi_execute uses a small helper function that verifies only the UA condition directly from the raw sense buffer. If this design is not OK, I can refactor to use scsi_normalize_sense. This prevents us from duplicating the retry code in at least a few places. In particular, it fixes an issue found in some IBM enclosures, in which the device may return an Unit Attention during probe, breaking the bind with the ses module: scsi 1:0:7:0: Failed to get diagnostic page 0x802 scsi 1:0:7:0: Failed to bind enclosure -19 Link: https://patchwork.kernel.org/patch/9336763/ Suggested-by: Brian KingSuggested-by: James Bottomley Signed-off-by: Gabriel Krisman Bertazi --- drivers/scsi/scsi_lib.c| 27 --- include/scsi/scsi_common.h | 9 + 2 files changed, 33 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index c71344aebdbb..9c6623abf120 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -187,15 +187,24 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, struct request *req; int write = (data_direction == DMA_TO_DEVICE); int ret = DRIVER_ERROR << 24; + bool priv_sense = false; + if (!sense) { + sense = kzalloc(SCSI_SENSE_BUFFERSIZE, GFP_NOIO); + if (!sense) + return ret; + priv_sense = true; + } + + retry: req = blk_get_request(sdev->request_queue, write, __GFP_RECLAIM); if (IS_ERR(req)) - return ret; + goto free_sense; blk_rq_set_block_pc(req); if (bufflen && blk_rq_map_kern(sdev->request_queue, req, buffer, bufflen, __GFP_RECLAIM)) - goto out; + goto put_req; req->cmd_len = COMMAND_SIZE(cmd[0]); memcpy(req->cmd, cmd, req->cmd_len); @@ -210,6 +219,13 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, */ blk_execute_rq(req->q, NULL, req, 1); + if (scsi_sense_unit_attention(sense) && req->retries > 0) { + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); + retries = req->retries - 1; + blk_put_request(req); + goto retry; + } + /* * Some devices (USB mass-storage in particular) may transfer * garbage data together with a residue indicating that the data @@ -222,9 +238,14 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, if (resid) *resid = req->resid_len; ret = req->errors; - out: + + put_req: blk_put_request(req); + free_sense: + if (priv_sense) + kfree(sense); + return ret; } EXPORT_SYMBOL(scsi_execute); diff --git a/include/scsi/scsi_common.h b/include/scsi/scsi_common.h index 20bf7eaef05a..747b632d5b57 100644 --- a/include/scsi/scsi_common.h +++ b/include/scsi/scsi_common.h @@ -58,6 +58,15 @@ static inline bool scsi_sense_valid(const struct scsi_sense_hdr *sshdr) return (sshdr->response_code & 0x70) == 0x70; } +static inline bool scsi_sense_unit_attention(const char *sense) +{ + int resp = sense[0] & 0x7f; + + return ((resp & 0x70) && + ((resp >= 0x72 && (sense[1] & 0xf) == UNIT_ATTENTION) || +(resp < 0x72 && (sense[2] & 0xf) == UNIT_ATTENTION))); +} + extern bool scsi_normalize_sense(const u8 *sense_buffer, int sb_len, struct scsi_sense_hdr *sshdr); --