Re: [PATCH] scsi: Change return type to vm_fault_t
On 2018-04-14 02:47 PM, Souptick Joarder wrote: Use new return type vm_fault_t for fault handler in struct vm_operations_struct. Signed-off-by: Souptick Joarder <jrdr.li...@gmail.com> Reviewed-by: Matthew Wilcox <mawil...@microsoft.com> Acked-by: Douglas Gilbert <dgilb...@interlog.com> --- drivers/scsi/sg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index c198b963..c2b7d34 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1192,7 +1192,7 @@ static long sg_compat_ioctl(struct file *filp, unsigned int cmd_in, unsigned lon return fasync_helper(fd, filp, mode, >async_qp); } -static int +static vm_fault_t sg_vma_fault(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; -- 1.9.1
Re: [VERY EARLY RFC 00/13] Rework SCSI results handling
On 2018-04-18 11:01 AM, Johannes Thumshirn wrote: Hey all, here's a early preview of my SCSI results rework so we can eventually discuss things next week at LSF/MM (it still has compiler errors on aic7xxx and scsi_debug). The motivation behing this is that some drivers have failed to set the scsi_cmnd::result bytes correctly in the past and this is resulting in hard to case down errors. The open points: 1) 148 files changed, treewide. That's huge. Is it worth it? 2) remove the old status byte definitions 3) add a scsi_cmnd::result == 0 wrapper 3) convert aic7xx's CAM stuff so this series compiles cleanly 4) What the heck is the SDEG_RES_IMMED_MASK stuff in scsi_debug doing It's a hack. The cdb tables have only one delay value per command and for Start Stop Unit and Synchronize Cache those are relatively long delays. However as you may be aware both those commands have an IMMED bit which makes those commands return more or less immediately. In the work flow of the scsi_debug driver the work done by a specific command is done more or less immediately and any delay associated with that command is done by generic code just prior to calling scsi_cmnd::done(). So for those two commands I needed an additional return value from the resp_start_stop() and resp_sync_cache() to indicate that their normal delay should be short circuited. And the signatures of those functions are function pointers in many of the scsi_debug tables, so expanding the number of arguments for these two cases would have been messy. So the (hacky) solution was to overload the return int with an extra flag. And that return int is none other than the infamous SCSI result. With the benefit of hindsight a better solution is to add a new bool to struct sdebug_dev_info: bool immed_shortens_delay; that is set by those two commands when their IMMED bit is set (in their cdbs). Then the code in schedule_resp() can act on that bool (when set) to short circuit the delay. That will be a worthwhile improvement so other commands (if and when implemented) can react properly when their IMMED bit is set (e.g. FORMAT UNIT, PRE-FETCH and SANITIZE). Does that clear up this matter? Doug Gilbert 5) change scsi_execute() so we get a newish 'struct scsi_results' instead of an int 6) {to,from}_scsi_result() are odd 7) find suitable commit messages In case someone want's it in a more viewable form I've pushed the series to my kernel.org git: https://git.kernel.org/pub/scm/linux/kernel/git/jth/linux.git/log/?h=scsi-results Johannes Thumshirn (13): scsi: use host_byte() accessor scsi: remove Scsi_Cmnd typedef scsi: add enum for host byte codes scsi: add enum for driver byte codes scsi: add enum for message byte codes scsi: introduce set_scsi_result scsi: use set_driver_byte treewide: use set_host_byte scsi: use set_msg_byte scsi: introduce set_status_byte and convert LLDDs to use it scsi: Change status bytes to use SAM-3 version reewide: introduce clear_scsi_result() and convert drivers scsi: introduce struct scsi_result arch/ia64/hp/sim/simscsi.c | 6 +- block/bsg-lib.c | 8 +- block/bsg.c | 8 +- block/scsi_ioctl.c | 12 +- drivers/ata/libata-scsi.c | 54 +++--- drivers/firewire/sbp2.c | 2 +- drivers/infiniband/ulp/srp/ib_srp.c | 23 ++- drivers/message/fusion/mptfc.c | 8 +- drivers/message/fusion/mptsas.c | 3 +- drivers/message/fusion/mptscsih.c | 122 - drivers/message/fusion/mptspi.c | 6 +- drivers/s390/scsi/zfcp_fc.h | 2 +- drivers/s390/scsi/zfcp_scsi.c | 4 +- drivers/scsi/3w-9xxx.c | 18 +- drivers/scsi/3w-sas.c | 16 +- drivers/scsi/3w-.c | 36 ++-- drivers/scsi/53c700.c | 3 +- drivers/scsi/BusLogic.c | 24 ++- drivers/scsi/NCR5380.c | 39 +++-- drivers/scsi/a100u2w.c | 2 +- drivers/scsi/aacraid/aachba.c | 221 +--- drivers/scsi/aacraid/commsup.c | 5 +- drivers/scsi/aacraid/linit.c| 12 +- drivers/scsi/advansys.c | 58 +++ drivers/scsi/aha152x.c | 76 drivers/scsi/aha1542.c | 2 +- drivers/scsi/aha1740.c | 11 +- drivers/scsi/aha1740.h | 4 +- drivers/scsi/aic7xxx/aic79xx_osm.c | 5 +- drivers/scsi/aic7xxx/aic79xx_osm.h | 6 +- drivers/scsi/aic7xxx/aic7xxx_osm.c | 5 +- drivers/scsi/aic7xxx/aic7xxx_osm.h
[PATCH] scsi_debug: replace SDEG_RES_IMMED_MASK with cmnd flag
This patch uses a cleaner method to convey the presence of the IMMED cdb bit back to the generic delay code in the scsi_debug driver. The previous method used a temporary mask over the SCSI result value (a 32 bit integer soon to be restructured) that was not visible outside this driver. It has still caused some confusion so a more conventional method is now used: adding an extra flag to the scsi_cmnd "host_scribble" area. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- This patch is designed to simplify part of the patchset titled: "Rework SCSI results handling" that touched the scsi_debug driver. This patch is against Martin's 4.18/scsi-queue branch. It will probably break soon as that branch doesn't yet contain my "scsi_debug: IMMED related delay adjustments" patch which touches the same area. The technique being used should be clear. drivers/scsi/scsi_debug.c | 30 ++ 1 file changed, 14 insertions(+), 16 deletions(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 9ef5e3b810f6..65809f3d0d33 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -234,7 +234,7 @@ static const char *sdebug_version_date = "20180128"; #define F_INV_OP 0x200 #define F_FAKE_RW 0x400 #define F_M_ACCESS 0x800 /* media access */ -#define F_LONG_DELAY 0x1000 +#define F_LONG_DELAY 0x1000 /* IMMED bit on commands with this */ #define FF_RESPOND (F_RL_WLUN_OK | F_SKIP_UA | F_DELAY_OVERR) #define FF_MEDIA_IO (F_M_ACCESS | F_FAKE_RW) @@ -294,6 +294,7 @@ struct sdebug_queued_cmd { unsigned int inj_dix:1; unsigned int inj_short:1; unsigned int inj_host_busy:1; + bool immed_shortens_delay; }; struct sdebug_queue { @@ -399,14 +400,6 @@ static const unsigned char opcode_ind_arr[256] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, }; -/* - * The following "response" functions return the SCSI mid-level's 4 byte - * tuple-in-an-int. To handle commands with an IMMED bit, for a faster - * command completion, they can mask their return value with - * SDEG_RES_IMMED_MASK . - */ -#define SDEG_RES_IMMED_MASK 0x4000 - static int resp_inquiry(struct scsi_cmnd *, struct sdebug_dev_info *); static int resp_report_luns(struct scsi_cmnd *, struct sdebug_dev_info *); static int resp_requests(struct scsi_cmnd *, struct sdebug_dev_info *); @@ -1607,6 +1600,8 @@ static int resp_start_stop(struct scsi_cmnd *scp, { unsigned char *cmd = scp->cmnd; int power_cond, stop; + struct sdebug_queued_cmd *sqcp = + (struct sdebug_queued_cmd *)scp->host_scribble; power_cond = (cmd[4] & 0xf0) >> 4; if (power_cond) { @@ -1615,7 +1610,8 @@ static int resp_start_stop(struct scsi_cmnd *scp, } stop = !(cmd[4] & 1); atomic_xchg(>stopped, stop); - return (cmd[1] & 0x1) ? SDEG_RES_IMMED_MASK : 0; /* check IMMED bit */ + sqcp->immed_shortens_delay = !!(cmd[1] & 0x1); + return 0; } static sector_t get_sdebug_capacity(void) @@ -3586,6 +3582,8 @@ static int resp_sync_cache(struct scsi_cmnd *scp, u64 lba; u32 num_blocks; u8 *cmd = scp->cmnd; + struct sdebug_queued_cmd *sqcp = + (struct sdebug_queued_cmd *)scp->host_scribble; if (cmd[0] == SYNCHRONIZE_CACHE) { /* 10 byte cdb */ lba = get_unaligned_be32(cmd + 2); @@ -3598,7 +3596,8 @@ static int resp_sync_cache(struct scsi_cmnd *scp, mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0); return check_condition_result; } - return (cmd[1] & 0x2) ? SDEG_RES_IMMED_MASK : 0; /* check IMMED bit */ + sqcp->immed_shortens_delay = !!(cmd[1] & 0x2); + return 0; } #define RL_BUCKET_ELEMS 8 @@ -4399,12 +4398,12 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip, } cmnd->result = pfp != NULL ? pfp(cmnd, devip) : 0; - if (cmnd->result & SDEG_RES_IMMED_MASK) { + if (sqcp->immed_shortens_delay) { /* -* This is the F_DELAY_OVERR case. No delay. +* IMMED bit set overrides F_LONG_DELAY. */ - cmnd->result &= ~SDEG_RES_IMMED_MASK; - delta_jiff = ndelay = 0; + delta_jiff = 0; + ndelay = 0; } if (cmnd->result == 0 && scsi_result != 0) cmnd->result = scsi_result; @@ -4456,7 +4455,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip, respond_in_thread: /* call back to mid-layer using invocation thread */ cmnd->result = pfp != NULL ? pfp(cmnd, devip) : 0; - cmnd->result &= ~SDEG_RES_IMMED_MASK; if (cmnd->result == 0 && scsi_result != 0) cmnd->result = scsi_result; cmnd->scsi_done(cmnd); -- 2.14.1
Re: [PATCH v2 4/6] scsi_io_completion_action helper added
On 2018-03-26 08:13 PM, Bart Van Assche wrote: On Sun, 2018-03-18 at 21:59 +0100, Douglas Gilbert wrote: + /* sense not about current command is termed: deferred */ Do we really need comments that explain the SCSI specs? If such a comment is added I think it should be added above the definition of scsi_sense_is_deferred() together with a reference to the "Sense data" section in SPC. + if (result == 0) { + /* +* Unprep the request and put it back at the head of the +* queue. A new command will be prepared and issued. +* This block is the same as case ACTION_REPREP in +* scsi_io_completion_action() above. */ - if (q->mq_ops) { + if (q->mq_ops) scsi_mq_requeue_cmd(cmd); - } else { + else { scsi_release_buffers(cmd); scsi_requeue_command(q, cmd); } Have these changes been verified with checkpatch? Checkpatch should have reported the following about the above chunk of code: Unbalanced braces around else statement. Yes they were, did you check them? If so, with what command line options? Since with no options /scripts/checkpatch.pl returns clean for all patches in this set. Additionally, have you considered to introduce a new function instead of duplicating existing code? Yes, that could be done. Otherwise this patch looks fine to me. Doug Gilbert
[PATCH v3 5/7] scsi_io_completion_reprep helper added
Since the action "reprep" is called from two places, rather than repeat the code, make a new scsi_io_completion helper with "reprep" as its suffix. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- drivers/scsi/scsi_lib.c | 41 ++--- 1 file changed, 18 insertions(+), 23 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index b1cefd8e527f..e7e5eb5b9e2e 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,20 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* Helper for scsi_io_completion() when "reprep" action required. */ +static void scsi_io_completion_reprep(struct scsi_cmnd *cmd, + struct request_queue *q) +{ + /* A new command will be prepared and issued. */ + if (q->mq_ops) { + scsi_mq_requeue_cmd(cmd); + } else { + /* Unprep request and put it back at head of the queue. */ + scsi_release_buffers(cmd); + scsi_requeue_command(q, cmd); + } +} + /* Helper for scsi_io_completion() when special action required. */ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) { @@ -893,15 +907,7 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) return; /*FALLTHRU*/ case ACTION_REPREP: - /* Unprep the request and put it back at the head of the queue. -* A new command will be prepared and issued. -*/ - if (q->mq_ops) - scsi_mq_requeue_cmd(cmd); - else { - scsi_release_buffers(cmd); - scsi_requeue_command(q, cmd); - } + scsi_io_completion_reprep(cmd, q); break; case ACTION_RETRY: /* Retry the same command immediately */ @@ -1079,20 +1085,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * If there had been no error, but we have leftover bytes in the * requeues just queue the command up again. */ - if (result == 0) { - /* -* Unprep the request and put it back at the head of the -* queue. A new command will be prepared and issued. -* This block is the same as case ACTION_REPREP in -* scsi_io_completion_action() above. -*/ - if (q->mq_ops) - scsi_mq_requeue_cmd(cmd); - else { - scsi_release_buffers(cmd); - scsi_requeue_command(q, cmd); - } - } else + if (result == 0) + scsi_io_completion_reprep(cmd, q); + else scsi_io_completion_action(cmd, result); } -- 2.14.1
[PATCH v3 2/7] scsi_io_completion rename variables
Change some variable names and associated comments for clarity. Correct some misleading comments. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- drivers/scsi/scsi_lib.c | 57 ++--- 1 file changed, 30 insertions(+), 27 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index d44ee84df091..598836804745 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -777,18 +777,19 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, *be put back on the queue and retried using the same *command as before, possibly after a delay. * - * c) We can call scsi_end_request() with -EIO to fail - *the remainder of the request. + * c) We can call scsi_end_request() with blk_stat other than + *BLK_STS_OK, to fail the remainder of the request. */ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) { int result = cmd->result; struct request_queue *q = cmd->device->request_queue; struct request *req = cmd->request; - blk_status_t error = BLK_STS_OK; + blk_status_t blk_stat = BLK_STS_OK; struct scsi_sense_hdr sshdr; bool sense_valid = false; - int sense_deferred = 0, level = 0; + bool sense_current = true; /* false implies "deferred sense" */ + int level = 0; enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, ACTION_DELAYED_RETRY} action; unsigned long wait_for = (cmd->allowed + 1) * req->timeout; @@ -796,7 +797,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) if (result) { sense_valid = scsi_command_normalize_sense(cmd, ); if (sense_valid) - sense_deferred = scsi_sense_is_deferred(); + sense_current = !scsi_sense_is_deferred(); } if (blk_rq_is_passthrough(req)) { @@ -809,8 +810,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) min(8 + cmd->sense_buffer[7], SCSI_SENSE_BUFFERSIZE); } - if (!sense_deferred) - error = __scsi_error_from_host_byte(cmd, result); + if (sense_current) + blk_stat = __scsi_error_from_host_byte(cmd, + result); } /* * __scsi_error_from_host_byte may have reset the host_byte @@ -829,13 +831,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) BUG(); return; } - } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) { + } else if (blk_rq_bytes(req) == 0 && result && sense_current) { /* * Flush commands do not transfers any data, and thus cannot use * good_bytes != blk_rq_bytes(req) as the signal for an error. -* This sets the error explicitly for the problem case. +* This sets blk_stat explicitly for the problem case. */ - error = __scsi_error_from_host_byte(cmd, result); + blk_stat = __scsi_error_from_host_byte(cmd, result); } /* no bidi support for !blk_rq_is_passthrough yet */ @@ -865,8 +867,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) else if (!(req->rq_flags & RQF_QUIET)) scsi_print_sense(cmd); result = 0; - /* for passthrough error may be set */ - error = BLK_STS_OK; + /* for passthrough, blk_stat may be set */ + blk_stat = BLK_STS_OK; } /* * Another corner case: the SCSI status byte is non-zero but 'good'. @@ -877,23 +879,24 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) */ if (status_byte(result) && scsi_status_is_good(result)) { result = 0; - error = BLK_STS_OK; + blk_stat = BLK_STS_OK; } /* -* special case: failed zero length commands always need to -* drop down into the retry code. Otherwise, if we finished -* all bytes in the request we are done now. +* Next deal with any sectors which we were able to correctly +* handle. Failed, zero length commands always need to drop down +* to retry code. Fast path should return in this block. */ - if (!(b
[PATCH v3 1/7] scsi_io_completion comment on end_request return
scsi_end_request() is called multiple times from scsi_io_completion() which branches on its bool returned value. Add comment before the static definition of scsi_end_request() about the meaning of that return. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com> --- The reader might think that if scsi_end_request() had actually ended the request (i.e. no bytes remaining) it would return true. If so, the reader would be misled. drivers/scsi/scsi_lib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 393f9db8f41b..d44ee84df091 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -662,6 +662,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd) cmd->request->next_rq->special = NULL; } +/* Returns false when no more bytes to process, true if there are more */ static bool scsi_end_request(struct request *req, blk_status_t error, unsigned int bytes, unsigned int bidi_bytes) { -- 2.14.1
[PATCH v3 7/7] scsi_io_completion convert BUGs to WARNs
The scsi_io_completion function contains three BUG() and BUG_ON() calls. Replace them with WARN variants. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- drivers/scsi/scsi_lib.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 391bbb0b19b8..6fa24cff43a6 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1046,13 +1046,21 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid; if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req), blk_rq_bytes(req->next_rq))) - BUG(); + WARN_ONCE(true, + "Bidi command with remaining bytes"); return; } } /* no bidi support yet, other than in pass-through */ - BUG_ON(blk_bidi_rq(req)); + if (unlikely(blk_bidi_rq(req))) { + WARN_ONCE(true, "Only support bidi command in passthrough"); + scmd_printk(KERN_ERR, cmd, "Killing bidi command\n"); + if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req), +blk_rq_bytes(req->next_rq))) + WARN_ONCE(true, "Bidi command with remaining bytes"); + return; + } /* * Next deal with any sectors which we were able to correctly @@ -1075,7 +1083,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) /* Kill remainder if no retries */ if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) - BUG(); + WARN_ONCE(true, + "Bytes remaining after failed, no-retry command"); return; } -- 2.14.1
[PATCH v3 6/7] scsi_io_completion hints on fastpatch
Add likely() and unlikely() hints to conditionals on or near the fastpath. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- A reviewer wanted any performance improvement (or otherwise) quantified. The improvement was so small, that ftrace ignored it. Inline timing code suggests the improvement from this whole patchset is around 7 nanoseconds per invocation (tested on a Lenovo X270 (i5-7200U)). Not exactly huge. Another win might be the smaller size of scsi_io_completion() after the refactoring; this should allow more other code to fit in the instruction cache. drivers/scsi/scsi_lib.c | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index e7e5eb5b9e2e..391bbb0b19b8 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1028,17 +1028,17 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) struct request *req = cmd->request; blk_status_t blk_stat = BLK_STS_OK; - if (result) /* does not necessarily mean there is an error */ + if (unlikely(result)) /* a nz result may or may not be an error */ result = scsi_io_completion_nz_result(cmd, result, _stat); - if (blk_rq_is_passthrough(req)) { + if (unlikely(blk_rq_is_passthrough(req))) { /* * __scsi_error_from_host_byte may have reset the host_byte */ scsi_req(req)->result = cmd->result; scsi_req(req)->resid_len = scsi_get_resid(cmd); - if (scsi_bidi_cmnd(cmd)) { + if (unlikely(scsi_bidi_cmnd(cmd))) { /* * Bidi commands Must be complete as a whole, * both sides at once. @@ -1051,7 +1051,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) } } - /* no bidi support for !blk_rq_is_passthrough yet */ + /* no bidi support yet, other than in pass-through */ BUG_ON(blk_bidi_rq(req)); /* @@ -1067,15 +1067,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * handle. Failed, zero length commands always need to drop down * to retry code. Fast path should return in this block. */ - if (blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK) { - if (!scsi_end_request(req, blk_stat, good_bytes, 0)) - return; /* no bytes remaining */ + if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) { + if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0))) + return; /* no bytes remaining */ } - /* -* Kill remainder if no retrys. -*/ - if (blk_stat && scsi_noretry_cmd(cmd)) { + /* Kill remainder if no retries */ + if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) BUG(); return; @@ -1085,7 +1083,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * If there had been no error, but we have leftover bytes in the * requeues just queue the command up again. */ - if (result == 0) + if (likely(result == 0)) scsi_io_completion_reprep(cmd, q); else scsi_io_completion_action(cmd, result); -- 2.14.1
[PATCH v3 3/7] scsi_io_completion_nz_result function added
Break out several intertwined paths when cmd->result is non zero and place them in the scsi_io_completion_nz_result helper function. The logic is not changed. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- A reviewer requested the original helper function's two return values be reduced to one: the blk_stat variable. This required a hack to differentiate the default setting of blk_stat (BLK_STS_OK) from the case when the helper assigns BLK_STS_OK as the return value. The hack is to return the otherwise unused BLK_STS_NOTSUPP value as an indication that the helper didn't change anything. That hack was judged by another reviewer to be worse that the "two return values" ugliness it was trying to address. So back to the original "two return values" solution. Returning a structure containing result and blk_stat is another possibility but returning structures is frowned upon in some circles. drivers/scsi/scsi_lib.c | 126 1 file changed, 74 insertions(+), 52 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 598836804745..3f65bb541ca3 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,78 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* + * Helper for scsi_io_completion() when cmd->result is non-zero. Returns a + * new result that may suppress further error checking. Also modifies + * *blk_statp in some cases. + */ +static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, int result, + blk_status_t *blk_statp) +{ + bool sense_valid; + bool about_current; + struct request *req = cmd->request; + struct scsi_sense_hdr sshdr; + + sense_valid = scsi_command_normalize_sense(cmd, ); + about_current = sense_valid ? !scsi_sense_is_deferred() : true; + + if (blk_rq_is_passthrough(req)) { + if (sense_valid) { + /* +* SG_IO wants current and deferred errors +*/ + scsi_req(req)->sense_len = + min(8 + cmd->sense_buffer[7], + SCSI_SENSE_BUFFERSIZE); + } + if (about_current) + *blk_statp = __scsi_error_from_host_byte(cmd, result); + } else if (blk_rq_bytes(req) == 0 && about_current) { + /* +* Flush commands do not transfers any data, and thus cannot use +* good_bytes != blk_rq_bytes(req) as the signal for an error. +* This sets *blk_statp explicitly for the problem case. +*/ + *blk_statp = __scsi_error_from_host_byte(cmd, result); + } + /* +* Recovered errors need reporting, but they're always treated as +* success, so fiddle the result code here. For passthrough requests +* we already took a copy of the original into sreq->result which +* is what gets returned to the user +*/ + if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) { + bool do_print = true; + /* +* if ATA PASS-THROUGH INFORMATION AVAILABLE [0x0, 0x1d] +* skip print since caller wants ATA registers. Only occurs +* on SCSI ATA PASS_THROUGH commands when CK_COND=1 +*/ + if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d)) + do_print = false; + else if (req->rq_flags & RQF_QUIET) + do_print = false; + if (do_print) + scsi_print_sense(cmd); + /* for passthrough, *blk_statp may be set, so clear */ + *blk_statp = BLK_STS_OK; + result = 0; + } + /* +* Another corner case: the SCSI status byte is non-zero but 'good'. +* Example: PRE-FETCH command returns SAM_STAT_CONDITION_MET when +* it is able to fit nominated LBs in its cache (and SAM_STAT_GOOD +* if it can't fit). Treat SAM_STAT_CONDITION_MET and the related +* intermediate statuses (both obsolete in SAM-4) as good. +*/ + if (status_byte(result) && scsi_status_is_good(result)) { + *blk_statp = BLK_STS_OK; + result = 0; + } + return result; +} + /* * Function:scsi_io_completion() * @@ -794,26 +866,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) ACTION_DELAYED_RETRY} action; unsigned long wait_for = (cmd->allowed + 1) * req->timeout; - if (result) { + if (result) { /* does not necessarily mean there is an error */ sense_valid = scsi_command_normalize_sense(cmd, );
[PATCH v3 0/7] scsi_io_completion cleanup
While working of this "[PATCH v4] Make SCSI Status CONDITION MET equivalent to GOOD" the author had trouble finding how to add another corner case check in the scsi_io_completion() function (scsi_lib.c). That function is executed at the completion of every SCSI command but only 20 or so of its hundred of lines of code are executed for the vast majority of invocations (i.e. the fastpath). This patch refactors scsi_io_completion() by taking the bulk of its error processing code into three static helper functions, leaving only the 20 or so code lines that constitute the fastpath. In this process comments were added and tweaked, plus variables renamed. The last two patches in this set are optional: adding conditional hints along the fastpath and converting 3 BUG() calls in scsi_io_completion() to WARNs as requested by a reviewer. There is no attempt in this patchset to change the logic of the original scsi_io_completion(). Some conditional checks are saved by reducing the number of redundant tests (e.g. on the 'result' variable being non-zero). Also De Morgan's laws are applied to some complex conditions to simplify them from the reader's perspective. The fact remains, this is a very complex function. This patch is against Martin Petersen's 4.17/scsi-queue branch. Douglas Gilbert (7): scsi_io_completion comment on end_request return scsi_io_completion rename variables scsi_io_completion_nz_result function added scsi_io_completion_action helper added scsi_io_completion_reprep helper added scsi_io_completion hints on fastpatch scsi_io_completion convert BUGs to WARNs drivers/scsi/scsi_lib.c | 376 +++- 1 file changed, 213 insertions(+), 163 deletions(-) -- 2.14.1
[PATCH v3 4/7] scsi_io_completion_action helper added
Place scsi_io_completion()'s complex error processing associated with a local enumeration into a static helper function. That enumeration's values start with "ACTION_" so use the suffix "_action" in the helper function's name. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- drivers/scsi/scsi_lib.c | 326 ++-- 1 file changed, 174 insertions(+), 152 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 3f65bb541ca3..b1cefd8e527f 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,169 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* Helper for scsi_io_completion() when special action required. */ +static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) +{ + struct request_queue *q = cmd->device->request_queue; + struct request *req = cmd->request; + int level = 0; + enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, + ACTION_DELAYED_RETRY} action; + unsigned long wait_for = (cmd->allowed + 1) * req->timeout; + struct scsi_sense_hdr sshdr; + bool sense_valid = false; + bool sense_current = true; /* false implies "deferred sense" */ + blk_status_t blk_stat; + + if (scsi_command_normalize_sense(cmd, )) { + sense_valid = true; + sense_current = !scsi_sense_is_deferred(); + } + + blk_stat = __scsi_error_from_host_byte(cmd, result); + + if (host_byte(result) == DID_RESET) { + /* Third party bus reset or reset for error recovery +* reasons. Just retry the command and see what +* happens. +*/ + action = ACTION_RETRY; + } else if (sense_valid && sense_current) { + switch (sshdr.sense_key) { + case UNIT_ATTENTION: + if (cmd->device->removable) { + /* Detected disc change. Set a bit +* and quietly refuse further access. +*/ + cmd->device->changed = 1; + action = ACTION_FAIL; + } else { + /* Must have been a power glitch, or a +* bus reset. Could not have been a +* media change, so we just retry the +* command and see what happens. +*/ + action = ACTION_RETRY; + } + break; + case ILLEGAL_REQUEST: + /* If we had an ILLEGAL REQUEST returned, then +* we may have performed an unsupported +* command. The only thing this should be +* would be a ten byte read where only a six +* byte read was supported. Also, on a system +* where READ CAPACITY failed, we may have +* read past the end of the disk. +*/ + if ((cmd->device->use_10_for_rw && + sshdr.asc == 0x20 && sshdr.ascq == 0x00) && + (cmd->cmnd[0] == READ_10 || +cmd->cmnd[0] == WRITE_10)) { + /* This will issue a new 6-byte command. */ + cmd->device->use_10_for_rw = 0; + action = ACTION_REPREP; + } else if (sshdr.asc == 0x10) /* DIX */ { + action = ACTION_FAIL; + blk_stat = BLK_STS_PROTECTION; + /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */ + } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) { + action = ACTION_FAIL; + blk_stat = BLK_STS_TARGET; + } else + action = ACTION_FAIL; + break; + case ABORTED_COMMAND: + action = ACTION_FAIL; + if (sshdr.asc == 0x10) /* DIF */ + blk_stat = BLK_STS_PROTECTION; + break; + case NOT_READY: + /* If the device is in the process of becoming +* ready, or has a temporary blockage, retry. +*/ + if (sshdr.asc == 0x04) { + switch (sshdr.ascq) { +
[PATCH v5 4/7] scsi_io_completion_action helper added
Place scsi_io_completion()'s complex error processing associated with a local enumeration into a static helper function. That enumeration's values start with "ACTION_" so use the suffix "_action" in the helper function's name. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> drivers/scsi/scsi_lib.c | 321 ++-- 1 file changed, 171 insertions(+), 150 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 08c4db34af7b..831d1fad38b3 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,168 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* Helper for scsi_io_completion() when special action required. */ +static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) +{ + struct request_queue *q = cmd->device->request_queue; + struct request *req = cmd->request; + int level = 0; + enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, + ACTION_DELAYED_RETRY} action; + unsigned long wait_for = (cmd->allowed + 1) * req->timeout; + struct scsi_sense_hdr sshdr; + bool sense_valid; + bool sense_current = true; /* false implies "deferred sense" */ + blk_status_t blk_stat; + + sense_valid = scsi_command_normalize_sense(cmd, ); + if (sense_valid) + sense_current = !scsi_sense_is_deferred(); + + blk_stat = __scsi_error_from_host_byte(cmd, result); + + if (host_byte(result) == DID_RESET) { + /* Third party bus reset or reset for error recovery +* reasons. Just retry the command and see what +* happens. +*/ + action = ACTION_RETRY; + } else if (sense_valid && sense_current) { + switch (sshdr.sense_key) { + case UNIT_ATTENTION: + if (cmd->device->removable) { + /* Detected disc change. Set a bit +* and quietly refuse further access. +*/ + cmd->device->changed = 1; + action = ACTION_FAIL; + } else { + /* Must have been a power glitch, or a +* bus reset. Could not have been a +* media change, so we just retry the +* command and see what happens. +*/ + action = ACTION_RETRY; + } + break; + case ILLEGAL_REQUEST: + /* If we had an ILLEGAL REQUEST returned, then +* we may have performed an unsupported +* command. The only thing this should be +* would be a ten byte read where only a six +* byte read was supported. Also, on a system +* where READ CAPACITY failed, we may have +* read past the end of the disk. +*/ + if ((cmd->device->use_10_for_rw && + sshdr.asc == 0x20 && sshdr.ascq == 0x00) && + (cmd->cmnd[0] == READ_10 || +cmd->cmnd[0] == WRITE_10)) { + /* This will issue a new 6-byte command. */ + cmd->device->use_10_for_rw = 0; + action = ACTION_REPREP; + } else if (sshdr.asc == 0x10) /* DIX */ { + action = ACTION_FAIL; + blk_stat = BLK_STS_PROTECTION; + /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */ + } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) { + action = ACTION_FAIL; + blk_stat = BLK_STS_TARGET; + } else + action = ACTION_FAIL; + break; + case ABORTED_COMMAND: + action = ACTION_FAIL; + if (sshdr.asc == 0x10) /* DIF */ + blk_stat = BLK_STS_PROTECTION; + break; + case NOT_READY: + /* If the device is in the process of becoming +* ready, or has a temporary blockage, retry. +*/ + if (sshdr.asc == 0x04) { + switch (sshdr.ascq) { +
[PATCH v5 7/7] scsi_io_completion convert BUGs to WARNs
The scsi_io_completion function contains three BUG() and BUG_ON() calls. Replace them with WARN variants. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- drivers/scsi/scsi_lib.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 60b57324a244..779c45479ac3 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1046,13 +1046,21 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid; if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req), blk_rq_bytes(req->next_rq))) - BUG(); + WARN_ONCE(true, + "Bidi command with remaining bytes"); return; } } /* no bidi support yet, other than in pass-through */ - BUG_ON(blk_bidi_rq(req)); + if (unlikely(blk_bidi_rq(req))) { + WARN_ONCE(true, "Only support bidi command in passthrough"); + scmd_printk(KERN_ERR, cmd, "Killing bidi command\n"); + if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req), +blk_rq_bytes(req->next_rq))) + WARN_ONCE(true, "Bidi command with remaining bytes"); + return; + } /* * Next deal with any sectors which we were able to correctly @@ -1075,7 +1083,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) /* Kill remainder if no retries. */ if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) - BUG(); + WARN_ONCE(true, + "Bytes remaining after failed, no-retry command"); return; } -- 2.14.1
[PATCH v5 6/7] scsi_io_completion hints on fastpatch
Add likely() and unlikely() hints to conditionals on or near the fastpath. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- A reviewer wanted any performance improvement (or otherwise) quantified. The improvement was so small, that ftrace ignored it. Inline timing code suggests the improvement from this whole patchset is around 7 nanoseconds per invocation (tested on a Lenovo X270 (i5-7200U)). Not exactly huge. Another win might be the smaller size of scsi_io_completion() after the refactoring; this should allow more other code to fit in the instruction cache. drivers/scsi/scsi_lib.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index d24bd29cd94a..60b57324a244 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1028,17 +1028,17 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) struct request *req = cmd->request; blk_status_t blk_stat = BLK_STS_OK; - if (result) /* does not necessarily mean there is an error */ + if (unlikely(result)) /* a nz result may or may not be an error */ result = scsi_io_completion_nz_result(cmd, result, _stat); - if (blk_rq_is_passthrough(req)) { + if (unlikely(blk_rq_is_passthrough(req))) { /* * __scsi_error_from_host_byte may have reset the host_byte */ scsi_req(req)->result = cmd->result; scsi_req(req)->resid_len = scsi_get_resid(cmd); - if (scsi_bidi_cmnd(cmd)) { + if (unlikely(scsi_bidi_cmnd(cmd))) { /* * Bidi commands Must be complete as a whole, * both sides at once. @@ -1051,7 +1051,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) } } - /* no bidi support for !blk_rq_is_passthrough yet */ + /* no bidi support yet, other than in pass-through */ BUG_ON(blk_bidi_rq(req)); /* @@ -1067,13 +1067,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * handle. Failed, zero length commands always need to drop down * to retry code. Fast path should return in this block. */ - if (blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK) { - if (!scsi_end_request(req, blk_stat, good_bytes, 0)) + if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) { + if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0))) return; /* no bytes remaining */ } -/* Kill remainder if no retries. */ - if (blk_stat && scsi_noretry_cmd(cmd)) { + /* Kill remainder if no retries. */ + if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) BUG(); return; @@ -1083,7 +1083,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * If there had been no error, but we have leftover bytes in the * requeues just queue the command up again. */ - if (result == 0) + if (likely(result == 0)) scsi_io_completion_reprep(cmd, q); else scsi_io_completion_action(cmd, result); -- 2.14.1
[PATCH v5 5/7] scsi_io_completion_reprep helper added
Since the action "reprep" is called from two places, rather than repeat the code, make a new scsi_io_completion helper with "reprep" as its suffix. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- drivers/scsi/scsi_lib.c | 41 ++--- 1 file changed, 18 insertions(+), 23 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 831d1fad38b3..d24bd29cd94a 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,20 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* Helper for scsi_io_completion() when "reprep" action required. */ +static void scsi_io_completion_reprep(struct scsi_cmnd *cmd, + struct request_queue *q) +{ + /* A new command will be prepared and issued. */ + if (q->mq_ops) { + scsi_mq_requeue_cmd(cmd); + } else { + /* Unprep request and put it back at head of the queue. */ + scsi_release_buffers(cmd); + scsi_requeue_command(q, cmd); + } +} + /* Helper for scsi_io_completion() when special action required. */ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) { @@ -892,15 +906,7 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) return; /*FALLTHRU*/ case ACTION_REPREP: - /* Unprep the request and put it back at the head of the queue. -* A new command will be prepared and issued. -*/ - if (q->mq_ops) { - scsi_mq_requeue_cmd(cmd); - } else { - scsi_release_buffers(cmd); - scsi_requeue_command(q, cmd); - } + scsi_io_completion_reprep(cmd, q); break; case ACTION_RETRY: /* Retry the same command immediately */ @@ -1077,20 +1083,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * If there had been no error, but we have leftover bytes in the * requeues just queue the command up again. */ - if (result == 0) { - /* -* Unprep the request and put it back at the head of the -* queue. A new command will be prepared and issued. -* This block is the same as case ACTION_REPREP in -* scsi_io_completion_action() above. -*/ - if (q->mq_ops) { - scsi_mq_requeue_cmd(cmd); - } else { - scsi_release_buffers(cmd); - scsi_requeue_command(q, cmd); - } - } else + if (result == 0) + scsi_io_completion_reprep(cmd, q); + else scsi_io_completion_action(cmd, result); } -- 2.14.1
[PATCH v5 0/7] scsi_io_completion cleanup
While working of this "[PATCH v4] Make SCSI Status CONDITION MET equivalent to GOOD" the author had trouble finding how to add another corner case check in the scsi_io_completion() function (scsi_lib.c). That function is executed at the completion of every SCSI command but only 20 or so of its hundred of lines of code are executed for the vast majority of invocations (i.e. the fastpath). This patch refactors scsi_io_completion() by taking the bulk of its error processing code into three static helper functions, leaving only the 20 or so code lines that constitute the fastpath. In this process comments were added and tweaked, plus variables renamed. The last two patches in this set are optional: adding conditional hints along the fastpath and converting 3 BUG() calls in scsi_io_completion() to WARNs as requested by a reviewer. There is no attempt in this patchset to change the logic of the original scsi_io_completion() although the last patch attempts to continue from awkward situations rather than crashing the calling thread (and possibly the kernel). Some conditional checks are saved by reducing the number of redundant tests (e.g. multiple checks that the 'result' variable is non-zero). Also De Morgan's laws are applied to some complex conditions to simplify them from the reader's perspective. The fact remains, this is a very complex function. This patch is against Martin Petersen's 4.17/scsi-queue branch. Note: to apply this patchset to the lk 4.15 production kernels (tested on lk 4.15.13), these two lines near the end of scsi_io_completion(): __scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY, 0); and __scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY, 0); need that trailing 0 changed to 'false', prior to applying 4/7 . ChangeLog since v4 - introduce bool do_print to simplify conditionals in the case of a recovered error sense key in 2/7 rather than 3/7 ChangeLog since v3 - make use of bools sense_valid and sense_current consistent in 3/7 and 4/7 Douglas Gilbert (7): scsi_io_completion comment on end_request return scsi_io_completion rename variables scsi_io_completion_nz_result function added scsi_io_completion_action helper added scsi_io_completion_reprep helper added scsi_io_completion hints on fastpatch scsi_io_completion convert BUGs to WARNs drivers/scsi/scsi_lib.c | 378 +++- 1 file changed, 214 insertions(+), 164 deletions(-) -- 2.14.1
[PATCH v5 2/7] scsi_io_completion rename variables
Change and add some variable names, adjust some associated comments for clarity. Correct some misleading comments. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- drivers/scsi/scsi_lib.c | 72 ++--- 1 file changed, 39 insertions(+), 33 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index d44ee84df091..d4f2c3fac907 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -777,18 +777,19 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, *be put back on the queue and retried using the same *command as before, possibly after a delay. * - * c) We can call scsi_end_request() with -EIO to fail - *the remainder of the request. + * c) We can call scsi_end_request() with blk_stat other than + *BLK_STS_OK, to fail the remainder of the request. */ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) { int result = cmd->result; struct request_queue *q = cmd->device->request_queue; struct request *req = cmd->request; - blk_status_t error = BLK_STS_OK; + blk_status_t blk_stat = BLK_STS_OK; struct scsi_sense_hdr sshdr; bool sense_valid = false; - int sense_deferred = 0, level = 0; + bool sense_current = true; /* false implies "deferred sense" */ + int level = 0; enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, ACTION_DELAYED_RETRY} action; unsigned long wait_for = (cmd->allowed + 1) * req->timeout; @@ -796,7 +797,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) if (result) { sense_valid = scsi_command_normalize_sense(cmd, ); if (sense_valid) - sense_deferred = scsi_sense_is_deferred(); + sense_current = !scsi_sense_is_deferred(); } if (blk_rq_is_passthrough(req)) { @@ -809,8 +810,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) min(8 + cmd->sense_buffer[7], SCSI_SENSE_BUFFERSIZE); } - if (!sense_deferred) - error = __scsi_error_from_host_byte(cmd, result); + if (sense_current) + blk_stat = __scsi_error_from_host_byte(cmd, + result); } /* * __scsi_error_from_host_byte may have reset the host_byte @@ -829,13 +831,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) BUG(); return; } - } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) { + } else if (blk_rq_bytes(req) == 0 && result && sense_current) { /* * Flush commands do not transfers any data, and thus cannot use * good_bytes != blk_rq_bytes(req) as the signal for an error. -* This sets the error explicitly for the problem case. +* This sets blk_stat explicitly for the problem case. */ - error = __scsi_error_from_host_byte(cmd, result); + blk_stat = __scsi_error_from_host_byte(cmd, result); } /* no bidi support for !blk_rq_is_passthrough yet */ @@ -856,17 +858,22 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * is what gets returned to the user */ if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) { - /* if ATA PASS-THROUGH INFORMATION AVAILABLE skip + bool do_print = true; + + /* +* If ATA PASS-THROUGH INFORMATION AVAILABLE skip * print since caller wants ATA registers. Only occurs on * SCSI ATA PASS_THROUGH commands when CK_COND=1 */ if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d)) - ; - else if (!(req->rq_flags & RQF_QUIET)) + do_print = false; + else if (req->rq_flags & RQF_QUIET) + do_print = false; + if (do_print) scsi_print_sense(cmd); result = 0; - /* for passthrough error may be set */ - error = BLK_STS_OK; + /* for passthrough, blk_stat may be set */ + blk_stat = BLK_STS_OK; } /* * Another corner case: the SCSI status byte is non-zero but 'g
[PATCH v5 3/7] scsi_io_completion_nz_result function added
Break out several intertwined paths when cmd->result is non zero and place them in the scsi_io_completion_nz_result helper function. The logic is not changed. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- A reviewer requested the original helper function's two return values be reduced to one: the blk_stat variable. This required a hack to differentiate the default setting of blk_stat (BLK_STS_OK) from the case when the helper assigns BLK_STS_OK as the return value. The hack was to return the otherwise unused BLK_STS_NOTSUPP value as an indication that the helper didn't change anything. That hack was judged by another reviewer to be worse that the "two return values" ugliness it was trying to address. So back to the original "two return values" solution. drivers/scsi/scsi_lib.c | 132 +++- 1 file changed, 75 insertions(+), 57 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index d4f2c3fac907..08c4db34af7b 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,79 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* + * Helper for scsi_io_completion() when cmd->result is non-zero. Returns a + * new result that may suppress further error checking. Also modifies + * *blk_statp in some cases. + */ +static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, int result, + blk_status_t *blk_statp) +{ + bool sense_valid; + bool sense_current = true; /* false implies "deferred sense" */ + struct request *req = cmd->request; + struct scsi_sense_hdr sshdr; + + sense_valid = scsi_command_normalize_sense(cmd, ); + if (sense_valid) + sense_current = !scsi_sense_is_deferred(); + + if (blk_rq_is_passthrough(req)) { + if (sense_valid) { + /* +* SG_IO wants current and deferred errors +*/ + scsi_req(req)->sense_len = + min(8 + cmd->sense_buffer[7], + SCSI_SENSE_BUFFERSIZE); + } + if (sense_current) + *blk_statp = __scsi_error_from_host_byte(cmd, result); + } else if (blk_rq_bytes(req) == 0 && sense_current) { + /* +* Flush commands do not transfers any data, and thus cannot use +* good_bytes != blk_rq_bytes(req) as the signal for an error. +* This sets *blk_statp explicitly for the problem case. +*/ + *blk_statp = __scsi_error_from_host_byte(cmd, result); + } + /* +* Recovered errors need reporting, but they're always treated as +* success, so fiddle the result code here. For passthrough requests +* we already took a copy of the original into sreq->result which +* is what gets returned to the user +*/ + if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) { + bool do_print = true; + /* +* if ATA PASS-THROUGH INFORMATION AVAILABLE [0x0, 0x1d] +* skip print since caller wants ATA registers. Only occurs +* on SCSI ATA PASS_THROUGH commands when CK_COND=1 +*/ + if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d)) + do_print = false; + else if (req->rq_flags & RQF_QUIET) + do_print = false; + if (do_print) + scsi_print_sense(cmd); + result = 0; + /* for passthrough, *blk_statp may be set */ + *blk_statp = BLK_STS_OK; + } + /* +* Another corner case: the SCSI status byte is non-zero but 'good'. +* Example: PRE-FETCH command returns SAM_STAT_CONDITION_MET when +* it is able to fit nominated LBs in its cache (and SAM_STAT_GOOD +* if it can't fit). Treat SAM_STAT_CONDITION_MET and the related +* intermediate statuses (both obsolete in SAM-4) as good. +*/ + if (status_byte(result) && scsi_status_is_good(result)) { + result = 0; + *blk_statp = BLK_STS_OK; + } + return result; +} + /* * Function:scsi_io_completion() * @@ -794,26 +867,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) ACTION_DELAYED_RETRY} action; unsigned long wait_for = (cmd->allowed + 1) * req->timeout; - if (result) { + if (result) { /* does not necessarily mean there is an error */ sense_valid = scsi_command_normalize_sense(cmd, ); if (sense_valid) sense_current =
[PATCH v5 1/7] scsi_io_completion comment on end_request return
scsi_end_request() is called multiple times from scsi_io_completion() which branches on its bool returned value. Add comment before the static definition of scsi_end_request() about the meaning of that return. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com> --- The reader might think that if scsi_end_request() had actually ended the request (i.e. no bytes remaining) it would return true. If so, the reader would be misled. drivers/scsi/scsi_lib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 393f9db8f41b..d44ee84df091 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -662,6 +662,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd) cmd->request->next_rq->special = NULL; } +/* Returns false when no more bytes to process, true if there are more */ static bool scsi_end_request(struct request *req, blk_status_t error, unsigned int bytes, unsigned int bidi_bytes) { -- 2.14.1
Re: Multi-Actuator SAS HDD First Look
On 2018-03-26 11:08 AM, Hannes Reinecke wrote: On Fri, 23 Mar 2018 08:57:12 -0600 Tim Walkerwrote: Seagate announced their split actuator SAS drive, which will probably require some kernel changes for full support. It's targeted at cloud provider JBODs and RAID. Here are some of the drive's architectural points. Since the two LUNs share many common components (e.g. spindle) Seagate allocated some SCSI operations to be LUN specific and some to affect the entire device, that is, both LUNs. 1. Two LUNs, 0 & 1, each with independent lba space, and each connected to an independent read channel, actuator, and set of heads. 2. Each actuator addresses 1/2 of the media - no media is shared across the actuators. They seek independently. 3. One World Wide Name (WWN) is assigned to the port for device address. Each Logical Unit has a separate World Wide Name for identification in VPD page. 4. 128 deep command queue, shared across both LUNs 5. Each LUN can pull commands from the queue independently, so they can implement their own sorting and optimization. 6. Ordered tag attribute causes the command to be ordered across both Logical Units 7. Head of Queue attribute causes the command to be ordered with respect to a single Logical Unit 8. Mode pages are device-based (shared across both Logical Units) 9. Log pages are device-based. 10. Inquiry VPD pages (with minor exceptions) are device based. 11. Device health features (SMART, etc) are device based Seagate wants the multi-actuator design to integrate into the stack as painlessly as possible.The interface design is still in the early stages, so I am gathering requirements and recommendations, and also providing any information necessary to help scope integrating a multi-LUN device into the MQ stack. So, I am soliciting any pertinent feedback including: 1. Painful incompatibilities between the Seagate proposal and current MQ architecture 2. Linux changes needed 3. Drive interface changes needed 4. Anything else I may have overlooked So far it looks okay; just make sure to have VPD page 0x83 entries properly associated. To all intents and purposes these devices seem to look like 'normal' devices with two LUNs; nothing special with that. Real question would be in which areas those devices differentiate from the two indepdendent LUN scenario. There might be issues with per-device informations like SMART etc; ideally they are available from _both_ LUNs. Otherwise they'll show up as blank from one LUN, causing consternation with any management software. Further to this point, some types of damage, such as to a head or (one side of) a platter would degrade one LU, possibly making it unusable for storage, while the other side (and the other LU) would be fine. I'm curious how you plan to implement the START STOP UNIT command. If one side of the platter is in "start" state and the other side in "stop" state, will the heads on the stopped side be parked (if they can be parked)? And if both sides (LUs) are stopped I would hope you really would spin down the disk, then if either is started the disk would be spun up. Getting T10 to add a bit to the Block Device Characteristics VPD page might be helpful. It could be a "shares a spindle" bit with the other LUs identified in the SCSI Ports VPD page. Such an indication would help an enclosure find out if a Multi-Actuator disk was really spun down and ready to be removed or replaced. I think SES and smartmontools may need tweaks to handle this new device model sensibly. Doug Gilbert
Re: Multi-Actuator SAS HDD First Look
On 2018-04-02 11:34 AM, Tim Walker wrote: On Sat, Mar 31, 2018 at 10:52 AM, Douglas Gilbert <dgilb...@interlog.com> wrote: On 2018-03-30 04:01 PM, Bart Van Assche wrote: On Fri, 2018-03-30 at 12:36 -0600, Tim Walker wrote: Yes I will be there to discuss the multi-LUN approach. I wanted to get these interface details out so we could have some background and perhaps folks would come with ideas. I don't have much more to put out, but I will certainly answer questions - either on this list or directly. Hello Tim, As far as I know the Linux SCSI stack does not yet support any SCSI devices for which a single SCSI command affects multiple (two in this case) LUNs. Adding such support may take significant work. There will also be the disadvantage that most SCSI core contributors do not have access to a multi- actuator device and hence that their changes may break support for multi- actuator devices. Hmmm, INQUIRY (3PC bit) and REPORT LUNS seem to be counter examples to Bart's assertion. Plus there are a more that tell you about things outside the addressed LU, for example the SCSI Ports VPD page tells you about other SCSI ports hence other LUs) on the current device. From Tim's command list: Device level 0x0, 0x1: okay 0x4 (Format unit): yikes, that could be a nasty surprise, accessing a file system on the other LU and getting an error "Not ready, format in progress"!! 0x12: standard INQUIRY okay, VPD pages not so much LU id different; relative port id, different; target port id different (at the least) 0x1b (SSU): storage LUs need to know this model, otherwise the logic on each LU could get into a duel: "I want it spun up; no, I want it spun down ..." 0x35, 0x37, 0x3b, 0x3c: okay 0x48 (sanitize): similar to Format unit above 0x91,0x4c,0x4d: okay MODE SENSE/SELECT(6,10): depends on page, block descriptor needs to be partially device level (since LB size may be changed by FU which is device level) rest of device level: okay or (I) don't know 0xf7: READ UDS DATA, that's interesting, but proprietary I guess Perhaps you could add a rider on FU and SAN: they get rejected unless the other storage LU is in (logical) spun down state. LU specific --- all okay, I hoping READ(6,10,12,16,32) and their WRITE cousins will be there also :-) Plus the TMF: LU reset Device or LU all okay I'm intrigued by your 3 LU model. My wish list for that: LUN 0 would be processor device type (0x3) so it wouldn't confuse the OS (Linux) that it held storage (READ CAPACITY is mandatory for PDT 0x0 and cannot represent a 0 block LU) and you could pick and choose which SCSI commands to implement on it. LUN 0 TUR could report what the spindle was really doing, SSU could do precisely what it is told (and SSU on LUNs 1 and 2 would be an "and" for spin down and an "or" for spin up). I've got several boxes full of SAS cables and only one cable that I can think of that lets me get to the secondary SAS port. So on LUN 0 you could have a proprietary (unless T10 takes it on board) mode page to enable the user to say which ports that LUNs 1 and 2 where accessible on. Obviously LUN 0 would need to be accessible on both ports. [A non-accessible LUN would still respond to INQUIRY but with a first byte of 07f: PDT=0x1f (unknown) and PQ=3 which implies it is there but inaccessible via the current port.] A processor PDT opens up the possibility of putting a copy manager on LUN 0. Think offloaded (from main machine's perspective) backups and restores where LUN 1 or 2 is the source or destination. Enough dreaming. Doug Gilbert Thanks for all the input from everybody. I'll collate it for a meeting with our interface architect. Just to amplify the case against a FU or SAN being allowed at almost any time from either storage LU irrespective of what the other was doing. Imagine one initiator has some important data on one LU and has an EXCLUSIVE ACCESS persistent reservation on it. Then another initiator (e.g. on a different machine) sends a FU to the other LU which is honoured, wiping the whole device. One unhappy customer Doug Gilbert
Re: Regarding SCSI SANITIZE command support
On 2018-04-02 07:10 AM, Mahesh Rajashekhara wrote: Hello, I am RAID HBA driver engineer here at Microsemi. We are working on linux driver development for Microsemi SAS/SATA RAID HBA controllers. As per our understanding, while a drive is processing the SANITIZE command: - The drive should still be exposed to the OS. - The drive will fail all commands other than REQUEST SENSE Not sure that the drive should ever knowingly fail INQUIRY and REPORT LUNS. That is to allow a discovery process to occur. In the INQUIRY response the peripheral qualifier should be 000b by my reading. Go look at sbc4r15.pdf, specifically clause 4.11.2 to see all the commands that should be responded to during a SANITIZE (there are 9 of them). When we initiate SCSI SANITIZE operation on disk drive, from the OS dmesg, we could see that continuous SCSI READ (10) commands have been sent by the OS upper layer drivers and drive reports ASC: 04 ASCQ: 1B "LOGICAL UNIT NOT READY, SANITIZE IN PROGRESS". Since the sanitize operation is in progress, the drive will fail the commands. Although, OS could see drive is sanitizing, we are not sure why does OS upper layer driver is sending SCSI READ (10). During the OS boot, if the drive is sanitizing, OS boot is taking very long time and call trace issue has been occurred. It would be of great help if you could shed some light on this issue. Snippet of dmesg: [ 202.659196] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00 [ 202.659198] blk_update_request: I/O error, dev sdm, sector 0 [ 202.659201] Buffer I/O error on dev sdm, logical block 0, async page read [ 202.659294] sd 0:0:11:0: [sdm] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 202.659299] sd 0:0:11:0: [sdm] Sense Key : Not Ready [current] [ 202.659303] sd 0:0:11:0: [sdm] Add. Sense: Logical unit not ready, sanitize in progress [ 202.659308] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00 [ 202.659311] blk_update_request: I/O error, dev sdm, sector 0 [ 202.659314] Buffer I/O error on dev sdm, logical block 0, async page read [ 202.659322] sdm: unable to read partition table [ 202.659633] sd 0:0:11:0: [sdm] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 202.659638] sd 0:0:11:0: [sdm] Sense Key : Not Ready [current] [ 202.659641] sd 0:0:11:0: [sdm] Add. Sense: Logical unit not ready, sanitize in progress [ 202.659645] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00 [ 202.659648] blk_update_request: I/O error, dev sdm, sector 0 [ 202.660621] sd 0:0:11:0: [sdm] Spinning up disk... [ 204.067155] Buffer I/O error on dev sdm, logical block 488378624, async page read [ 204.122152] SGI XFS with ACLs, security attributes, no debug enabled [ 204.123494] XFS (dm-0): Mounting V5 Filesystem [ 204.289817] XFS (dm-0): Ending clean mount [ 240.069283] INFO: task systemd-udevd:251 blocked for more than 120 seconds. [ 240.069289] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message [ 203.661324] . [ 240.069293] systemd-udevd D [ 240.069296] c00c3580 0 251 1 0x0004 [ 240.069301] 88007e0e7d18 0082 88007e0ceeb0 88007e0e7fd8 [ 240.069306] 88007e0e7fd8 88007e0e7fd8 88007e0ceeb0 [ 240.069311] c00c35d0 0001 c00c3580 [ 240.069316] Call Trace: [ 240.069328] [] schedule+0x29/0x70 [ 240.069334] [] async_synchronize_cookie_domain+0x85/0x150 [ 240.069340] [] ? wake_up_atomic_t+0x30/0x30 [ 240.069346] [] async_synchronize_full+0x17/0x20 [ 240.069351] [] load_module+0x1fc0/0x29e0 [ 240.069358] [] ? ddebug_proc_write+0xf0/0xf0 [ 240.069365] [] SyS_init_module+0xc5/0x110 [ 240.069371] [] system_call_fastpath+0x16/0x1b The scsi mid-level and/or the sd driver sends TEST UNIT READY (TUR) commands and should take note of the NOT READY sense key and should not send READ commands until that condition clears. If we are talking about modern SCSI devices then REQUEST SENSE should respond with progress information. Doug Gilbert
[PATCH v4 4/7] scsi_io_completion_action helper added
Place scsi_io_completion()'s complex error processing associated with a local enumeration into a static helper function. That enumeration's values start with "ACTION_" so use the suffix "_action" in the helper function's name. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- drivers/scsi/scsi_lib.c | 325 ++-- 1 file changed, 173 insertions(+), 152 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 6a8b9dc7c40e..048b144de5b0 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,168 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* Helper for scsi_io_completion() when special action required. */ +static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) +{ + struct request_queue *q = cmd->device->request_queue; + struct request *req = cmd->request; + int level = 0; + enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, + ACTION_DELAYED_RETRY} action; + unsigned long wait_for = (cmd->allowed + 1) * req->timeout; + struct scsi_sense_hdr sshdr; + bool sense_valid; + bool sense_current = true; /* false implies "deferred sense" */ + blk_status_t blk_stat; + + sense_valid = scsi_command_normalize_sense(cmd, ); + if (sense_valid) + sense_current = !scsi_sense_is_deferred(); + + blk_stat = __scsi_error_from_host_byte(cmd, result); + + if (host_byte(result) == DID_RESET) { + /* Third party bus reset or reset for error recovery +* reasons. Just retry the command and see what +* happens. +*/ + action = ACTION_RETRY; + } else if (sense_valid && sense_current) { + switch (sshdr.sense_key) { + case UNIT_ATTENTION: + if (cmd->device->removable) { + /* Detected disc change. Set a bit +* and quietly refuse further access. +*/ + cmd->device->changed = 1; + action = ACTION_FAIL; + } else { + /* Must have been a power glitch, or a +* bus reset. Could not have been a +* media change, so we just retry the +* command and see what happens. +*/ + action = ACTION_RETRY; + } + break; + case ILLEGAL_REQUEST: + /* If we had an ILLEGAL REQUEST returned, then +* we may have performed an unsupported +* command. The only thing this should be +* would be a ten byte read where only a six +* byte read was supported. Also, on a system +* where READ CAPACITY failed, we may have +* read past the end of the disk. +*/ + if ((cmd->device->use_10_for_rw && + sshdr.asc == 0x20 && sshdr.ascq == 0x00) && + (cmd->cmnd[0] == READ_10 || +cmd->cmnd[0] == WRITE_10)) { + /* This will issue a new 6-byte command. */ + cmd->device->use_10_for_rw = 0; + action = ACTION_REPREP; + } else if (sshdr.asc == 0x10) /* DIX */ { + action = ACTION_FAIL; + blk_stat = BLK_STS_PROTECTION; + /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */ + } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) { + action = ACTION_FAIL; + blk_stat = BLK_STS_TARGET; + } else + action = ACTION_FAIL; + break; + case ABORTED_COMMAND: + action = ACTION_FAIL; + if (sshdr.asc == 0x10) /* DIF */ + blk_stat = BLK_STS_PROTECTION; + break; + case NOT_READY: + /* If the device is in the process of becoming +* ready, or has a temporary blockage, retry. +*/ + if (sshdr.asc == 0x04) { + switch (sshdr.ascq) { +
[PATCH v4 3/7] scsi_io_completion_nz_result function added
Break out several intertwined paths when cmd->result is non zero and place them in the scsi_io_completion_nz_result helper function. The logic is not changed. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- A reviewer requested the original helper function's two return values be reduced to one: the blk_stat variable. This required a hack to differentiate the default setting of blk_stat (BLK_STS_OK) from the case when the helper assigns BLK_STS_OK as the return value. The hack is to return the otherwise unused BLK_STS_NOTSUPP value as an indication that the helper didn't change anything. That hack was judged by another reviewer to be worse that the "two return values" ugliness it was trying to address. So back to the original "two return values" solution. drivers/scsi/scsi_lib.c | 127 1 file changed, 75 insertions(+), 52 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 598836804745..6a8b9dc7c40e 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,79 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* + * Helper for scsi_io_completion() when cmd->result is non-zero. Returns a + * new result that may suppress further error checking. Also modifies + * *blk_statp in some cases. + */ +static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, int result, + blk_status_t *blk_statp) +{ + bool sense_valid; + bool sense_current = true; /* false implies "deferred sense" */ + struct request *req = cmd->request; + struct scsi_sense_hdr sshdr; + + sense_valid = scsi_command_normalize_sense(cmd, ); + if (sense_valid) + sense_current = !scsi_sense_is_deferred(); + + if (blk_rq_is_passthrough(req)) { + if (sense_valid) { + /* +* SG_IO wants current and deferred errors +*/ + scsi_req(req)->sense_len = + min(8 + cmd->sense_buffer[7], + SCSI_SENSE_BUFFERSIZE); + } + if (sense_current) + *blk_statp = __scsi_error_from_host_byte(cmd, result); + } else if (blk_rq_bytes(req) == 0 && sense_current) { + /* +* Flush commands do not transfers any data, and thus cannot use +* good_bytes != blk_rq_bytes(req) as the signal for an error. +* This sets *blk_statp explicitly for the problem case. +*/ + *blk_statp = __scsi_error_from_host_byte(cmd, result); + } + /* +* Recovered errors need reporting, but they're always treated as +* success, so fiddle the result code here. For passthrough requests +* we already took a copy of the original into sreq->result which +* is what gets returned to the user +*/ + if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) { + bool do_print = true; + /* +* if ATA PASS-THROUGH INFORMATION AVAILABLE [0x0, 0x1d] +* skip print since caller wants ATA registers. Only occurs +* on SCSI ATA PASS_THROUGH commands when CK_COND=1 +*/ + if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d)) + do_print = false; + else if (req->rq_flags & RQF_QUIET) + do_print = false; + if (do_print) + scsi_print_sense(cmd); + /* for passthrough, *blk_statp may be set, so clear */ + *blk_statp = BLK_STS_OK; + result = 0; + } + /* +* Another corner case: the SCSI status byte is non-zero but 'good'. +* Example: PRE-FETCH command returns SAM_STAT_CONDITION_MET when +* it is able to fit nominated LBs in its cache (and SAM_STAT_GOOD +* if it can't fit). Treat SAM_STAT_CONDITION_MET and the related +* intermediate statuses (both obsolete in SAM-4) as good. +*/ + if (status_byte(result) && scsi_status_is_good(result)) { + *blk_statp = BLK_STS_OK; + result = 0; + } + return result; +} + /* * Function:scsi_io_completion() * @@ -794,26 +867,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) ACTION_DELAYED_RETRY} action; unsigned long wait_for = (cmd->allowed + 1) * req->timeout; - if (result) { + if (result) { /* does not necessarily mean there is an error */ sense_valid = scsi_command_normalize_sense(cmd, ); if (sense_valid) sense_current =
[PATCH v4 7/7] scsi_io_completion convert BUGs to WARNs
The scsi_io_completion function contains three BUG() and BUG_ON() calls. Replace them with WARN variants. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- drivers/scsi/scsi_lib.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index a9676d824452..a89ca1ddc01a 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1046,13 +1046,21 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid; if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req), blk_rq_bytes(req->next_rq))) - BUG(); + WARN_ONCE(true, + "Bidi command with remaining bytes"); return; } } /* no bidi support yet, other than in pass-through */ - BUG_ON(blk_bidi_rq(req)); + if (unlikely(blk_bidi_rq(req))) { + WARN_ONCE(true, "Only support bidi command in passthrough"); + scmd_printk(KERN_ERR, cmd, "Killing bidi command\n"); + if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req), +blk_rq_bytes(req->next_rq))) + WARN_ONCE(true, "Bidi command with remaining bytes"); + return; + } /* * Next deal with any sectors which we were able to correctly @@ -1075,7 +1083,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) /* Kill remainder if no retries */ if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) - BUG(); + WARN_ONCE(true, + "Bytes remaining after failed, no-retry command"); return; } -- 2.14.1
[PATCH v4 2/7] scsi_io_completion rename variables
Change some variable names and associated comments for clarity. Correct some misleading comments. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com> --- drivers/scsi/scsi_lib.c | 57 ++--- 1 file changed, 30 insertions(+), 27 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index d44ee84df091..598836804745 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -777,18 +777,19 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, *be put back on the queue and retried using the same *command as before, possibly after a delay. * - * c) We can call scsi_end_request() with -EIO to fail - *the remainder of the request. + * c) We can call scsi_end_request() with blk_stat other than + *BLK_STS_OK, to fail the remainder of the request. */ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) { int result = cmd->result; struct request_queue *q = cmd->device->request_queue; struct request *req = cmd->request; - blk_status_t error = BLK_STS_OK; + blk_status_t blk_stat = BLK_STS_OK; struct scsi_sense_hdr sshdr; bool sense_valid = false; - int sense_deferred = 0, level = 0; + bool sense_current = true; /* false implies "deferred sense" */ + int level = 0; enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, ACTION_DELAYED_RETRY} action; unsigned long wait_for = (cmd->allowed + 1) * req->timeout; @@ -796,7 +797,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) if (result) { sense_valid = scsi_command_normalize_sense(cmd, ); if (sense_valid) - sense_deferred = scsi_sense_is_deferred(); + sense_current = !scsi_sense_is_deferred(); } if (blk_rq_is_passthrough(req)) { @@ -809,8 +810,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) min(8 + cmd->sense_buffer[7], SCSI_SENSE_BUFFERSIZE); } - if (!sense_deferred) - error = __scsi_error_from_host_byte(cmd, result); + if (sense_current) + blk_stat = __scsi_error_from_host_byte(cmd, + result); } /* * __scsi_error_from_host_byte may have reset the host_byte @@ -829,13 +831,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) BUG(); return; } - } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) { + } else if (blk_rq_bytes(req) == 0 && result && sense_current) { /* * Flush commands do not transfers any data, and thus cannot use * good_bytes != blk_rq_bytes(req) as the signal for an error. -* This sets the error explicitly for the problem case. +* This sets blk_stat explicitly for the problem case. */ - error = __scsi_error_from_host_byte(cmd, result); + blk_stat = __scsi_error_from_host_byte(cmd, result); } /* no bidi support for !blk_rq_is_passthrough yet */ @@ -865,8 +867,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) else if (!(req->rq_flags & RQF_QUIET)) scsi_print_sense(cmd); result = 0; - /* for passthrough error may be set */ - error = BLK_STS_OK; + /* for passthrough, blk_stat may be set */ + blk_stat = BLK_STS_OK; } /* * Another corner case: the SCSI status byte is non-zero but 'good'. @@ -877,23 +879,24 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) */ if (status_byte(result) && scsi_status_is_good(result)) { result = 0; - error = BLK_STS_OK; + blk_stat = BLK_STS_OK; } /* -* special case: failed zero length commands always need to -* drop down into the retry code. Otherwise, if we finished -* all bytes in the request we are done now. +* Next deal with any sectors which we were able to correctly +* handle. Failed, zero length commands always need to drop down +* to retry code.
[PATCH v4 1/7] scsi_io_completion comment on end_request return
scsi_end_request() is called multiple times from scsi_io_completion() which branches on its bool returned value. Add comment before the static definition of scsi_end_request() about the meaning of that return. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com> --- The reader might think that if scsi_end_request() had actually ended the request (i.e. no bytes remaining) it would return true. If so, the reader would be misled. drivers/scsi/scsi_lib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 393f9db8f41b..d44ee84df091 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -662,6 +662,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd) cmd->request->next_rq->special = NULL; } +/* Returns false when no more bytes to process, true if there are more */ static bool scsi_end_request(struct request *req, blk_status_t error, unsigned int bytes, unsigned int bidi_bytes) { -- 2.14.1
[PATCH v4 0/7] scsi_io_completion cleanup
While working of this "[PATCH v4] Make SCSI Status CONDITION MET equivalent to GOOD" the author had trouble finding how to add another corner case check in the scsi_io_completion() function (scsi_lib.c). That function is executed at the completion of every SCSI command but only 20 or so of its hundred of lines of code are executed for the vast majority of invocations (i.e. the fastpath). This patch refactors scsi_io_completion() by taking the bulk of its error processing code into three static helper functions, leaving only the 20 or so code lines that constitute the fastpath. In this process comments were added and tweaked, plus variables renamed. The last two patches in this set are optional: adding conditional hints along the fastpath and converting 3 BUG() calls in scsi_io_completion() to WARNs as requested by a reviewer. There is no attempt in this patchset to change the logic of the original scsi_io_completion(). Some conditional checks are saved by reducing the number of redundant tests (e.g. on the 'result' variable being non-zero). Also De Morgan's laws are applied to some complex conditions to simplify them from the reader's perspective. The fact remains, this is a very complex function. This patch is against Martin Petersen's 4.17/scsi-queue branch. Note: to apply this patchset to the lk 4.15 production kernels (tested on lk 4.15.13), these two lines near the end of scsi_io_completion(): __scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY, 0); and __scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY, 0); need that trailing 0 changed to 'false', prior to applying 4/7 . Douglas Gilbert (7): scsi_io_completion comment on end_request return scsi_io_completion rename variables scsi_io_completion_nz_result function added scsi_io_completion_action helper added scsi_io_completion_reprep helper added scsi_io_completion hints on fastpatch scsi_io_completion convert BUGs to WARNs drivers/scsi/scsi_lib.c | 378 +++- 1 file changed, 214 insertions(+), 164 deletions(-) -- 2.14.1
[PATCH v4 6/7] scsi_io_completion hints on fastpatch
Add likely() and unlikely() hints to conditionals on or near the fastpath. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- A reviewer wanted any performance improvement (or otherwise) quantified. The improvement was so small, that ftrace ignored it. Inline timing code suggests the improvement from this whole patchset is around 7 nanoseconds per invocation (tested on a Lenovo X270 (i5-7200U)). Not exactly huge. Another win might be the smaller size of scsi_io_completion() after the refactoring; this should allow more other code to fit in the instruction cache. drivers/scsi/scsi_lib.c | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 7bf7d3c4b9c2..a9676d824452 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1028,17 +1028,17 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) struct request *req = cmd->request; blk_status_t blk_stat = BLK_STS_OK; - if (result) /* does not necessarily mean there is an error */ + if (unlikely(result)) /* a nz result may or may not be an error */ result = scsi_io_completion_nz_result(cmd, result, _stat); - if (blk_rq_is_passthrough(req)) { + if (unlikely(blk_rq_is_passthrough(req))) { /* * __scsi_error_from_host_byte may have reset the host_byte */ scsi_req(req)->result = cmd->result; scsi_req(req)->resid_len = scsi_get_resid(cmd); - if (scsi_bidi_cmnd(cmd)) { + if (unlikely(scsi_bidi_cmnd(cmd))) { /* * Bidi commands Must be complete as a whole, * both sides at once. @@ -1051,7 +1051,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) } } - /* no bidi support for !blk_rq_is_passthrough yet */ + /* no bidi support yet, other than in pass-through */ BUG_ON(blk_bidi_rq(req)); /* @@ -1067,15 +1067,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * handle. Failed, zero length commands always need to drop down * to retry code. Fast path should return in this block. */ - if (blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK) { - if (!scsi_end_request(req, blk_stat, good_bytes, 0)) - return; /* no bytes remaining */ + if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) { + if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0))) + return; /* no bytes remaining */ } - /* -* Kill remainder if no retrys. -*/ - if (blk_stat && scsi_noretry_cmd(cmd)) { + /* Kill remainder if no retries */ + if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) BUG(); return; @@ -1085,7 +1083,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * If there had been no error, but we have leftover bytes in the * requeues just queue the command up again. */ - if (result == 0) + if (likely(result == 0)) scsi_io_completion_reprep(cmd, q); else scsi_io_completion_action(cmd, result); -- 2.14.1
[PATCH v4 5/7] scsi_io_completion_reprep helper added
Since the action "reprep" is called from two places, rather than repeat the code, make a new scsi_io_completion helper with "reprep" as its suffix. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- drivers/scsi/scsi_lib.c | 41 ++--- 1 file changed, 18 insertions(+), 23 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 048b144de5b0..7bf7d3c4b9c2 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,20 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* Helper for scsi_io_completion() when "reprep" action required. */ +static void scsi_io_completion_reprep(struct scsi_cmnd *cmd, + struct request_queue *q) +{ + /* A new command will be prepared and issued. */ + if (q->mq_ops) { + scsi_mq_requeue_cmd(cmd); + } else { + /* Unprep request and put it back at head of the queue. */ + scsi_release_buffers(cmd); + scsi_requeue_command(q, cmd); + } +} + /* Helper for scsi_io_completion() when special action required. */ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) { @@ -892,15 +906,7 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) return; /*FALLTHRU*/ case ACTION_REPREP: - /* Unprep the request and put it back at the head of the queue. -* A new command will be prepared and issued. -*/ - if (q->mq_ops) - scsi_mq_requeue_cmd(cmd); - else { - scsi_release_buffers(cmd); - scsi_requeue_command(q, cmd); - } + scsi_io_completion_reprep(cmd, q); break; case ACTION_RETRY: /* Retry the same command immediately */ @@ -1079,20 +1085,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * If there had been no error, but we have leftover bytes in the * requeues just queue the command up again. */ - if (result == 0) { - /* -* Unprep the request and put it back at the head of the -* queue. A new command will be prepared and issued. -* This block is the same as case ACTION_REPREP in -* scsi_io_completion_action() above. -*/ - if (q->mq_ops) - scsi_mq_requeue_cmd(cmd); - else { - scsi_release_buffers(cmd); - scsi_requeue_command(q, cmd); - } - } else + if (result == 0) + scsi_io_completion_reprep(cmd, q); + else scsi_io_completion_action(cmd, result); } -- 2.14.1
Re: How to Locate drive directly attached to mpt3sas HBA
On 2018-03-19 11:40 AM, Jack Wang wrote: Hi list, Any one knows how can I locate a HDD directly attached to mpt3sas, sas3ircu only supports LOCATE commd to locates driver installed in a disk enclosure, but not directly attached. I know microsemi/PMCs supports SGPIO interface to locate drive eg: Adp80xxapp sgpio 0 set 0 1 I searched in latest upstream mpt3sas code, didn't find such interface. Do I miss something? Hi, If its a SAS disk, it might have an onboard LED. In the sdparm package there is a script called "sas_disk_blink" that will flash that LED. And I think either mpt2sas and/or mpt3sas HBAs (I don't have my hardware nearby) have a SMP target hidden away somewhere. Perhaps someone from LSI/Avago/Broadcom could supply more information about that. Doug Gilbert
[PATCH v2 6/6] scsi_io_completion convert BUGs to WARNs
The scsi_io_completion function contains three BUG() and BUG_ON() calls. Replace them with WARN variants. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- This was done at a reviewer's request. This patchset did not change those BUG() lines but the git-diff suggested it did. Then the checkpatch.pl warned "BUG() calls considered harmful". BUG() calls are simpler than WARNs because the programmer doesn't have to worry about a continuation after them. The continuations given here may not be ideal. For these reasons the author considers this patch optional. drivers/scsi/scsi_lib.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 63f79b0dbbf3..be8836d984f5 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1048,13 +1048,20 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid; if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req), blk_rq_bytes(req->next_rq))) - BUG(); + WARN(true, "Bidi command with remaining bytes"); return; } } /* no bidi support yet, other than in pass-through */ - BUG_ON(blk_bidi_rq(req)); + if (unlikely(blk_bidi_rq(req))) { + WARN_ONCE(true, "Only support bidi command in passthrough"); + scmd_printk(KERN_ERR, cmd, "Killing bidi command\n"); + if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req), +blk_rq_bytes(req->next_rq))) + WARN(true, "Bidi command with remaining bytes"); + return; + } /* * Next deal with any sectors which we were able to correctly @@ -1077,7 +1084,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) /* Kill remainder if no retrys */ if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) - BUG(); + WARN(true, + "Bytes remaining after failed, no-retry command"); return; } -- 2.14.1
[PATCH v2 2/6] scsi_io_completion rename variables
Change some variable names and associated comments for clarity. Correct some misleading comments. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- drivers/scsi/scsi_lib.c | 57 ++--- 1 file changed, 30 insertions(+), 27 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index d44ee84df091..a4da5773d42d 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -777,18 +777,19 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, *be put back on the queue and retried using the same *command as before, possibly after a delay. * - * c) We can call scsi_end_request() with -EIO to fail - *the remainder of the request. + * c) We can call scsi_end_request() with blk_stat other than + *BLK_STS_OK, to fail the remainder of the request. */ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) { int result = cmd->result; struct request_queue *q = cmd->device->request_queue; struct request *req = cmd->request; - blk_status_t error = BLK_STS_OK; + blk_status_t blk_stat = BLK_STS_OK; /* u8: BLK_STS_OK is only 0 */ struct scsi_sense_hdr sshdr; bool sense_valid = false; - int sense_deferred = 0, level = 0; + bool sense_current = true; /* false implies "deferred sense" */ + int level = 0; enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, ACTION_DELAYED_RETRY} action; unsigned long wait_for = (cmd->allowed + 1) * req->timeout; @@ -796,7 +797,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) if (result) { sense_valid = scsi_command_normalize_sense(cmd, ); if (sense_valid) - sense_deferred = scsi_sense_is_deferred(); + sense_current = !scsi_sense_is_deferred(); } if (blk_rq_is_passthrough(req)) { @@ -809,8 +810,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) min(8 + cmd->sense_buffer[7], SCSI_SENSE_BUFFERSIZE); } - if (!sense_deferred) - error = __scsi_error_from_host_byte(cmd, result); + if (sense_current) + blk_stat = __scsi_error_from_host_byte(cmd, + result); } /* * __scsi_error_from_host_byte may have reset the host_byte @@ -829,13 +831,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) BUG(); return; } - } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) { + } else if (blk_rq_bytes(req) == 0 && result && sense_current) { /* * Flush commands do not transfers any data, and thus cannot use * good_bytes != blk_rq_bytes(req) as the signal for an error. -* This sets the error explicitly for the problem case. +* This sets blk_stat explicitly for the problem case. */ - error = __scsi_error_from_host_byte(cmd, result); + blk_stat = __scsi_error_from_host_byte(cmd, result); } /* no bidi support for !blk_rq_is_passthrough yet */ @@ -865,8 +867,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) else if (!(req->rq_flags & RQF_QUIET)) scsi_print_sense(cmd); result = 0; - /* for passthrough error may be set */ - error = BLK_STS_OK; + /* for passthrough, blk_stat may be set */ + blk_stat = BLK_STS_OK; } /* * Another corner case: the SCSI status byte is non-zero but 'good'. @@ -877,23 +879,24 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) */ if (status_byte(result) && scsi_status_is_good(result)) { result = 0; - error = BLK_STS_OK; + blk_stat = BLK_STS_OK; } /* -* special case: failed zero length commands always need to -* drop down into the retry code. Otherwise, if we finished -* all bytes in the request we are done now. +* Next deal with any sectors which we were able to correctly +* handle. Failed, zero length commands always need to drop down +* to retry code. Fast path should return in this block.
[PATCH v2 0/6] scsi_io_completion cleanup
While working of this "[PATCH v4] Make SCSI Status CONDITION MET equivalent to GOOD" the author had trouble finding how to add another corner case check in the scsi_io_completion() function (scsi_lib.c). That function is executed at the completion of every SCSI command but only 20 or so of its hundred of lines of code are executed for the vast majority of invocations (i.e. the fastpath). This patch refactors scsi_io_completion() by taking the bulk of its error processing code into two static helper functions, leaving only the 20 or so code lines that constitute the fastpath. In this process comments were added and tweaked, plus variables renamed. The last two patches in this set are optional: adding conditional hints along the fastpath and converting 3 BUG() calls in scsi_io_completion() to WARNs as requested by a reviewer. There is no attempt in this patchset to change the logic of the original scsi_io_completion(). Some conditional checks are saved by reducing the number of redundant tests (e.g. on the 'result' variable being non-zero). Also De Morgan's laws are applied to some complex conditions to simplify them from the reader's perspective. The fact remains, this is a very complex function. This patch is against Martin Petersen's 4.17/scsi-queue branch. Note to reviewers: a patchset that refactors code, in this case one very long function into one short function and two longish helpers, does not "own" the logic in the original function. The main thing to review is whether the refactoring is worthwhile and whether the logic remains the same after the refactoring. Douglas Gilbert (6): scsi_io_completion comment on end_request return scsi_io_completion rename variables scsi_io_completion_nz_result function added scsi_io_completion_action helper added scsi_io_completion hints on fastpatch scsi_io_completion convert BUGs to WARNs drivers/scsi/scsi_lib.c | 376 1 file changed, 219 insertions(+), 157 deletions(-) -- 2.14.1
[PATCH v2 5/6] scsi_io_completion hints on fastpatch
Add likely() and unlikely() hints to conditionals on or near the fastpath. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- A reviewer wanted any performance improvement (or otherwise) quantified. The improvement was so small, that ftrace ignored it. Inline timing code suggests the improvement from this whole patchset is around 7 nanoseconds per invocation (tested on a Lenovo X270 (i5-7200U)). Not exactly huge. Another win might be the smaller size of scsi_io_completion() after the refactoring; this should allow more other code to fit in the instruction cache. drivers/scsi/scsi_lib.c | 24 +++- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 0e2dcdb63fcd..63f79b0dbbf3 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1020,7 +1020,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) struct request *req = cmd->request; blk_status_t blk_stat = BLK_STS_OK; /* u8: BLK_STS_OK is only 0 */ - if (result) { /* does not necessarily mean there is an error */ + if (unlikely(result)) { /* a nz result may or may not be an error */ blk_stat = scsi_io_completion_nz_result(cmd, result); if (blk_stat == BLK_STS_OK) result = 0; /* suppress error processing now */ @@ -1033,14 +1033,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) } - if (blk_rq_is_passthrough(req)) { + if (unlikely(blk_rq_is_passthrough(req))) { /* * __scsi_error_from_host_byte may have reset the host_byte */ scsi_req(req)->result = cmd->result; scsi_req(req)->resid_len = scsi_get_resid(cmd); - if (scsi_bidi_cmnd(cmd)) { + if (unlikely(scsi_bidi_cmnd(cmd))) { /* * Bidi commands Must be complete as a whole, * both sides at once. @@ -1053,7 +1053,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) } } - /* no bidi support for !blk_rq_is_passthrough yet */ + /* no bidi support yet, other than in pass-through */ BUG_ON(blk_bidi_rq(req)); /* @@ -1069,15 +1069,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * handle. Failed, zero length commands always need to drop down * to retry code. Fast path should return in this block. */ - if (blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK) { - if (!scsi_end_request(req, blk_stat, good_bytes, 0)) - return; /* no bytes remaining */ + if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) { + if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0))) + return; /* no bytes remaining */ } - /* -* Kill remainder if no retrys. -*/ - if (blk_stat && scsi_noretry_cmd(cmd)) { + /* Kill remainder if no retrys */ + if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) BUG(); return; @@ -1087,14 +1085,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) * If there had been no error, but we have leftover bytes in the * requeues just queue the command up again. */ - if (result == 0) { + if (likely(result == 0)) { /* * Unprep the request and put it back at the head of the * queue. A new command will be prepared and issued. * This block is the same as case ACTION_REPREP in * scsi_io_completion_action() above. */ - if (q->mq_ops) + if (likely(q->mq_ops)) scsi_mq_requeue_cmd(cmd); else { scsi_release_buffers(cmd); -- 2.14.1
[PATCH v2 1/6] scsi_io_completion comment on end_request return
scsi_end_request() is called multiple times from scsi_io_completion() which branches on its bool returned value. Add comment before the static definition of scsi_end_request() about the meaning of that return. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- The reader might think that if scsi_end_request() had actually ended the request (i.e. no bytes remaining) it would return true. If so, the reader would be misled. drivers/scsi/scsi_lib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 393f9db8f41b..d44ee84df091 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -662,6 +662,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd) cmd->request->next_rq->special = NULL; } +/* Returns false when no more bytes to process, true if there are more */ static bool scsi_end_request(struct request *req, blk_status_t error, unsigned int bytes, unsigned int bidi_bytes) { -- 2.14.1
[PATCH v2 4/6] scsi_io_completion_action helper added
Place scsi_io_completion()'s complex error processing associated with a local enumeration into a static helper function. That enumeration's values start with "ACTION_" so use the suffix "_action" in the helper function's name. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de> --- Replicate the ACTION_REPREP case code since the former 'goto requeue' would now need to jump into another function which is not permitted. drivers/scsi/scsi_lib.c | 322 ++-- 1 file changed, 172 insertions(+), 150 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 9731d0d3cc80..0e2dcdb63fcd 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,168 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* Helper for scsi_io_completion() when special action required. */ +static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result) +{ + struct request_queue *q = cmd->device->request_queue; + struct request *req = cmd->request; + int level = 0; + enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, + ACTION_DELAYED_RETRY} action; + unsigned long wait_for = (cmd->allowed + 1) * req->timeout; + struct scsi_sense_hdr sshdr; + bool sense_valid_and_current = false; + blk_status_t blk_stat; /* u8, BLK_STS_OK is only 0 */ + + /* sense not about current command is termed: deferred */ + if (scsi_command_normalize_sense(cmd, ) && + !scsi_sense_is_deferred()) + sense_valid_and_current = true; + + blk_stat = __scsi_error_from_host_byte(cmd, result); + + if (host_byte(result) == DID_RESET) { + /* Third party bus reset or reset for error recovery +* reasons. Just retry the command and see what +* happens. +*/ + action = ACTION_RETRY; + } else if (sense_valid_and_current) { + switch (sshdr.sense_key) { + case UNIT_ATTENTION: + if (cmd->device->removable) { + /* Detected disc change. Set a bit +* and quietly refuse further access. +*/ + cmd->device->changed = 1; + action = ACTION_FAIL; + } else { + /* Must have been a power glitch, or a +* bus reset. Could not have been a +* media change, so we just retry the +* command and see what happens. +*/ + action = ACTION_RETRY; + } + break; + case ILLEGAL_REQUEST: + /* If we had an ILLEGAL REQUEST returned, then +* we may have performed an unsupported +* command. The only thing this should be +* would be a ten byte read where only a six +* byte read was supported. Also, on a system +* where READ CAPACITY failed, we may have +* read past the end of the disk. +*/ + if ((cmd->device->use_10_for_rw && + sshdr.asc == 0x20 && sshdr.ascq == 0x00) && + (cmd->cmnd[0] == READ_10 || +cmd->cmnd[0] == WRITE_10)) { + /* This will issue a new 6-byte command. */ + cmd->device->use_10_for_rw = 0; + action = ACTION_REPREP; + } else if (sshdr.asc == 0x10) /* DIX */ { + action = ACTION_FAIL; + blk_stat = BLK_STS_PROTECTION; + /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */ + } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) { + action = ACTION_FAIL; + blk_stat = BLK_STS_TARGET; + } else + action = ACTION_FAIL; + break; + case ABORTED_COMMAND: + action = ACTION_FAIL; + if (sshdr.asc == 0x10) /* DIF */ + blk_stat = BLK_STS_PROTECTION; + break; + case NOT_READY: + /* If the device is in the process of becoming +* ready,
[PATCH v2 3/6] scsi_io_completion_nz_result function added
Break out several interwined paths when cmd->result is non zero and place them in scsi_io_completion_nz_result helper function. The logic is not changed. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- A reviewer requested the original helper function's two return values be reduced to one: the blk_stat variable. This required a hack to differentiate the default setting of blk_stat (BLK_STS_OK) from the case when the helper assigns BLK_STS_OK as the return value. The hack is to return the otherwise unused BLK_STS_NOTSUPP value as an indication that the helper didn't change anything. drivers/scsi/scsi_lib.c | 134 +--- 1 file changed, 82 insertions(+), 52 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index a4da5773d42d..9731d0d3cc80 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -751,6 +751,77 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* + * Helper for scsi_io_completion() when cmd->result is non-zero. Returns + * BLK_STS_NOTSUPP if this function does not change blk_status . + */ +static blk_status_t scsi_io_completion_nz_result(struct scsi_cmnd *cmd, +int result) +{ + bool sense_valid; + bool about_current; + /* __scsi_error_from_host_byte() does not return BLK_STS_NOTSUPP */ + blk_status_t blk_stat = BLK_STS_NOTSUPP; + struct request *req = cmd->request; + struct scsi_sense_hdr sshdr; + + sense_valid = scsi_command_normalize_sense(cmd, ); + about_current = sense_valid ? !scsi_sense_is_deferred() : true; + + if (blk_rq_is_passthrough(req)) { + if (sense_valid) { + /* +* SG_IO wants current and deferred errors +*/ + scsi_req(req)->sense_len = + min(8 + cmd->sense_buffer[7], + SCSI_SENSE_BUFFERSIZE); + } + if (about_current) + blk_stat = __scsi_error_from_host_byte(cmd, result); + } else if (blk_rq_bytes(req) == 0 && about_current) { + /* +* Flush commands do not transfers any data, and thus cannot use +* good_bytes != blk_rq_bytes(req) as the signal for an error. +* This sets blk_stat explicitly for the problem case. +*/ + blk_stat = __scsi_error_from_host_byte(cmd, result); + } + /* +* Recovered errors need reporting, but they're always treated as +* success, so fiddle the result code here. For passthrough requests +* we already took a copy of the original into sreq->result which +* is what gets returned to the user +*/ + if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) { + bool do_print = true; + /* +* if ATA PASS-THROUGH INFORMATION AVAILABLE [0x0, 0x1d] +* skip print since caller wants ATA registers. Only occurs +* on SCSI ATA PASS_THROUGH commands when CK_COND=1 +*/ + if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d)) + do_print = false; + else if (req->rq_flags & RQF_QUIET) + do_print = false; + if (do_print) + scsi_print_sense(cmd); + /* for passthrough, blk_stat may be set */ + blk_stat = BLK_STS_OK; + } + /* +* Another corner case: the SCSI status byte is non-zero but 'good'. +* Example: PRE-FETCH command returns SAM_STAT_CONDITION_MET when +* it is able to fit nominated LBs in its cache (and SAM_STAT_GOOD +* if it can't fit). Treat SAM_STAT_CONDITION_MET and the related +* intermediate statuses (both obsolete in SAM-4) as good. +*/ + if (status_byte(result) && scsi_status_is_good(result)) + blk_stat = BLK_STS_OK; + + return blk_stat; +} + /* * Function:scsi_io_completion() * @@ -794,26 +865,23 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) ACTION_DELAYED_RETRY} action; unsigned long wait_for = (cmd->allowed + 1) * req->timeout; - if (result) { + if (result) { /* does not necessarily mean there is an error */ sense_valid = scsi_command_normalize_sense(cmd, ); if (sense_valid) sense_current = !scsi_sense_is_deferred(); + blk_stat = scsi_io_completion_nz_result(cmd, result); + if (blk_stat == BLK_STS_OK) + result = 0; /* suppress error processing now */ + /* +* if scsi_io_complet
Re: [PATCH] scsi: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()
On 2018-02-26 02:39 AM, Hannes Reinecke wrote: When converting __scsi_error_from_host_byte() to BLK_STS error codes the case DID_OK was forgotten, resulting in it always returning an error. Fixes: 2a842acab109 ("block: introduce new block status code type") Cc: Doug Gilbert <dgilb...@interlog.com> Signed-off-by: Hannes Reinecke <h...@suse.com> Reviewed-by: Douglas Gilbert <dgilb...@interlog.com> --- drivers/scsi/scsi_lib.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index aea5a1ae318b..11d63136f0bd 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -720,6 +720,8 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result) { switch (host_byte(result)) { + case DID_OK: + return BLK_STS_OK; case DID_TRANSPORT_FAILFAST: return BLK_STS_TRANSPORT; case DID_TARGET_FAILURE:
Re: [PATCH v2] Make SCSI Status CONDITION MET equivalent to GOOD
On 2018-02-26 02:25 AM, Hannes Reinecke wrote: On 02/25/2018 07:29 PM, Douglas Gilbert wrote: The SCSI PRE-FETCH (10 or 16) command is present both on hard disks and some SSDs. It is useful when the address of the next block(s) to be read is known but it is not following the LBA of the current READ (so read-ahead won't help). It returns two "good" SCSI Status values. If the requested blocks have fitted (or will most likely fit (when the IMMED bit is set)) into the disk's cache, it returns CONDITION MET. If it didn't (or will not) fit then it returns GOOD status. The primary goal of this patch is to stop the SCSI subsystem treating the CONDITION MET SCSI status as an error. The current state makes the PRE-FETCH command effectively unusable via pass-throughs. The hunt to find where the erroneous decision was made led to the scsi_io_completion() function in scsi_lib.c . This is a complicated function made more complicated by the intertwining of good and error (or special case) processing paths. Future work: if the sd driver was to use the PRE-FETCH command, decide whether it and/or the block layer needs to know about the two different "good" statuses. If so a mechanism would be needed to do that. ChangeLog to v2: - further restructure the code, place some early non-zero result processing in a new helper function: scsi_io_completion_nz_result() - this reduces the number of error checks on the zero result path (the fast path) at the expense of some extra work for the non-zero result processing - rename some variables to make the code a little clearer ChangeLog to v1: - expand scsi_status_is_good() to check for CONDITION MET - add another corner case in scsi_io_completion() adjacent to the one for the RECOVERED ERROR sense key case. That is another "non-error" - structure code so extra checks are only on the error path (i.e. when cmd->result is non-zero) This patch is against mkp's 4.17/scsi-queue branch. It also applies to lk 4.15.x where it was re-tested on a SAS SSD. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- drivers/scsi/scsi_lib.c | 140 +--- include/scsi/scsi.h | 2 + 2 files changed, 87 insertions(+), 55 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index aea5a1ae318b..e1e974f08d52 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -738,6 +738,73 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } } +/* Helper for scsi_io_completion() when cmd->result is non-zero. */ +static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, + blk_status_t *blk_statp) +{ + bool sense_valid; + bool about_current; + int result = cmd->result; + struct request *req = cmd->request; + struct scsi_sense_hdr sshdr; + + sense_valid = scsi_command_normalize_sense(cmd, ); + about_current = sense_valid ? !scsi_sense_is_deferred() : true; + + if (blk_rq_is_passthrough(req)) { + if (sense_valid) { + /* +* SG_IO wants current and deferred errors +*/ + scsi_req(req)->sense_len = + min(8 + cmd->sense_buffer[7], + SCSI_SENSE_BUFFERSIZE); + } + if (about_current) + *blk_statp = __scsi_error_from_host_byte(cmd, result); + } else if (blk_rq_bytes(req) == 0 && about_current) { + /* +* Flush commands do not transfers any data, and thus cannot use +* good_bytes != blk_rq_bytes(req) as the signal for an error. +* This sets the error explicitly for the problem case. +*/ + *blk_statp = __scsi_error_from_host_byte(cmd, result); + } + /* +* Recovered errors need reporting, but they're always treated as +* success, so fiddle the result code here. For passthrough requests +* we already took a copy of the original into sreq->result which +* is what gets returned to the user +*/ + if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) { + /* +* if ATA PASS-THROUGH INFORMATION AVAILABLE skip +* print since caller wants ATA registers. Only occurs +* on SCSI ATA PASS_THROUGH commands when CK_COND=1 +*/ + if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d)) + ; + else if (!(req->rq_flags & RQF_QUIET)) + scsi_print_sense(cmd); + result = 0; + *blk_statp = BLK_STS_OK; + /* for passthrough, blk_stat may be set
[PATCH v2 1/2] scsi_io_completion split, fix CONDITION MET handling
The "ChangeLog for v1" section in 0/2 (the cover letter) of this patch set outlines the changes in this patch. Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- drivers/scsi/scsi_lib.c | 297 +--- include/scsi/scsi.h | 2 + 2 files changed, 181 insertions(+), 118 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index aea5a1ae318b..8074d0a14391 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -739,103 +739,41 @@ static blk_status_t __scsi_error_from_host_byte(struct scsi_cmnd *cmd, } /* - * Function:scsi_io_completion() - * - * Purpose: Completion processing for block device I/O requests. - * - * Arguments: cmd - command that is finished. - * - * Lock status: Assumed that no lock is held upon entry. - * - * Returns: Nothing - * - * Notes: We will finish off the specified number of sectors. If we - * are done, the command block will be released and the queue - * function will be goosed. If we are not done then we have to - * figure out what to do next: - * - * a) We can call scsi_requeue_command(). The request - *will be unprepared and put back on the queue. Then - *a new command will be created for it. This should - *be used if we made forward progress, or if we want - *to switch from READ(10) to READ(6) for example. - * - * b) We can call __scsi_queue_insert(). The request will - *be put back on the queue and retried using the same - *command as before, possibly after a delay. - * - * c) We can call scsi_end_request() with -EIO to fail - *the remainder of the request. + * Helper for scsi_io_completion() when cmd->result is non-zero. Returns + * BLK_STS_NOTSUPP if this function does not change blk_status . */ -void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) +static blk_status_t scsi_io_completion_nz_result(struct scsi_cmnd *cmd, +int result) { - int result = cmd->result; - struct request_queue *q = cmd->device->request_queue; + bool sense_valid; + bool about_current; + /* __scsi_error_from_host_byte() does not return BLK_STS_NOTSUPP */ + blk_status_t blk_stat = BLK_STS_NOTSUPP; struct request *req = cmd->request; - blk_status_t error = BLK_STS_OK; struct scsi_sense_hdr sshdr; - bool sense_valid = false; - int sense_deferred = 0, level = 0; - enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY, - ACTION_DELAYED_RETRY} action; - unsigned long wait_for = (cmd->allowed + 1) * req->timeout; - if (result) { - sense_valid = scsi_command_normalize_sense(cmd, ); - if (sense_valid) - sense_deferred = scsi_sense_is_deferred(); - } + sense_valid = scsi_command_normalize_sense(cmd, ); + about_current = sense_valid ? !scsi_sense_is_deferred() : true; if (blk_rq_is_passthrough(req)) { - if (result) { - if (sense_valid) { - /* -* SG_IO wants current and deferred errors -*/ - scsi_req(req)->sense_len = - min(8 + cmd->sense_buffer[7], - SCSI_SENSE_BUFFERSIZE); - } - if (!sense_deferred) - error = __scsi_error_from_host_byte(cmd, result); - } - /* -* __scsi_error_from_host_byte may have reset the host_byte -*/ - scsi_req(req)->result = cmd->result; - scsi_req(req)->resid_len = scsi_get_resid(cmd); - - if (scsi_bidi_cmnd(cmd)) { + if (sense_valid) { /* -* Bidi commands Must be complete as a whole, -* both sides at once. +* SG_IO wants current and deferred errors */ - scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid; - if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req), - blk_rq_bytes(req->next_rq))) - BUG(); - return; + scsi_req(req)->sense_len = + min(8 + cmd->sense_buffer[7], + SCSI_SENSE_BUFFERSIZE); } - } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) { +
[PATCH v2 2/2] scsi_io_completion convert BUG to WARN calls
ChangeLog: - convert 3 BUG calls to WARN calls - un-invert some conditional logic to uncover the real fast path - try to improve the comments, including noting what the bool return value from scsi_end_request() means Signed-off-by: Douglas Gilbert <dgilb...@interlog.com> --- drivers/scsi/scsi_lib.c | 45 ++--- 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 8074d0a14391..9ebcf6f15d1d 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -650,6 +650,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd) cmd->request->next_rq->special = NULL; } +/* Returns false when no more bytes to process, true if there are more */ static bool scsi_end_request(struct request *req, blk_status_t error, unsigned int bytes, unsigned int bidi_bytes) { @@ -1030,38 +1031,44 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) */ scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid; if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req), - blk_rq_bytes(req->next_rq))) - BUG(); +blk_rq_bytes(req->next_rq))) + WARN(true, "Bidi command with remaining bytes"); return; } } - /* no bidi support for !blk_rq_is_passthrough yet */ - BUG_ON(blk_bidi_rq(req)); + /* no bidi support yet, other than in pass-through */ + if (unlikely(blk_bidi_rq(req))) { + WARN_ONCE(true, "Only support bidi command in passthrough"); + scmd_printk(KERN_ERR, cmd, "Killing bidi command\n"); + if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req), +blk_rq_bytes(req->next_rq))) + WARN(true, "Bidi command with remaining bytes"); + return; + } - /* -* Next deal with any sectors which we were able to correctly -* handle. -*/ SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, cmd, "%u sectors total, %d bytes done.\n", blk_rq_sectors(req), good_bytes)); /* -* special case: failed zero length commands always need to -* drop down into the retry code. Otherwise, if we finished -* all bytes in the request we are done now. +* Next deal with any sectors which we were able to correctly +* handle. Fast path should return in this block. */ - if (unlikely(!(blk_stat && blk_rq_bytes(req) == 0) && -!scsi_end_request(req, blk_stat, good_bytes, 0))) - return; - + if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) { + if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0))) + return; /* no bytes remaining */ + } /* -* Kill remainder if no retrys. +* This leaves failed, zero length commands plus commands with +* some bytes remaining. */ + + /* Kill remainder if no retrys. */ if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) { if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0)) - BUG(); + WARN_ONCE(true, +"Bytes remaining after failed, no-retry command"); return; } @@ -1071,8 +1078,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) */ if (likely(result == 0)) { /* -* Fast path: Unprep the request and put it back at the head -* of the queue. A new command will be prepared and issued. +* Unprep the request and put it back at the head of the +* queue. A new command will be prepared and issued. * This block is the same as case ACTION_REPREP in * scsi_io_completion_action() above. */ -- 2.14.1
[PATCH v2 0/2] scsi_io_completion cleanup and fix CONDITION MET handling
This patch started as an attempt to fix the erroneous handling of CONDITION MET, a relatively rare special case. A solution meant adding another special case to the already complicated scsi_io_completion() function. To better understand that function the author found it useful to refactor the function into one relatively short function that the fast path passes through and two helper functions. These helper functions do the bulk of the error and special case handling. The SCSI PRE-FETCH (10 or 16) command is present both on hard disks and some SSDs. It is useful when the address of the next block(s) to be read is known but it is not following the LBA of the current READ (so read-ahead won't help). It returns two "good" SCSI Status values. If the requested blocks have fitted (or will most likely fit (when the IMMED bit is set)) into the disk's cache, it returns CONDITION MET. If it didn't (or will not) fit then it returns GOOD status. Future work: if the sd driver was to use the PRE-FETCH command, decide whether it and/or the block layer needs to know about the two different "good" statuses. If so a mechanism would be needed to do that. ChangeLog for v2: - after checkpatch.pl warning about BUG statements and confirmed by a reviewer, convert 3 BUG calls in scsi_io_completion() into WARN calls ChangeLog for v1: - split scsi_io_completion() into one short function that the fast path uses, and two helper functions - add conditional hints on the fast path - rename some variables to make the code a little clearer - add scsi_io_completion_nz_result() helper for the initial handling of the non-zero cmd->result case - add scsi_io_completion_action() helper for the 'action' processing of errors and special cases - expand inline function scsi_status_is_good() to check for CONDITION MET - add a corner case in scsi_io_completion_nz_result() for CONDITION MET (and similar "good" SCSI statuses) - structure code so extra checks are only on the error path (i.e. when cmd->result is non-zero) This patch is against mkp's 4.17/scsi-queue branch. It also applies to lk 4.15.x where it was tested on a SAS SSD. Douglas Gilbert (2): scsi_io_completion cleanup and fix CONDITION MET handling scsi_io_completion-change2warn drivers/scsi/scsi_lib.c | 304 +--- include/scsi/scsi.h | 2 + 2 files changed, 188 insertions(+), 118 deletions(-) -- 2.14.1
Re: [PATCH] bsg: convert to use blk-mq
On 2018-10-17 11:55 a.m., Benjamin Block wrote: On Tue, Oct 16, 2018 at 08:43:01AM -0600, Jens Axboe wrote: Requires a few changes to the FC transport class as well. Cc: Johannes Thumshirn Cc: Benjamin Block Cc: linux-scsi@vger.kernel.org Signed-off-by: Jens Axboe --- block/bsg-lib.c | 102 +-- drivers/scsi/scsi_transport_fc.c | 61 ++ 2 files changed, 91 insertions(+), 72 deletions(-) Hey Jens, I haven't had time to look into this in any deep way - but I did plan to -, but just to see whether it starts and runs some I/O I tried giving it a spin and came up with nothing (see line 3 and 5): [ 14.082079] scsi host0: scsi_eh_0: sleeping [ 14.082288] scsi host0: zfcp [ 14.086823] scsi host0: fc_host0: bsg interface failed to initialize - setup queue [ 14.089198] qdio: 0.0.1900 ZFCP on SC 107 using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W AP [ 15.228583] rport-0:0-0: failed to setup bsg queue [ 15.229886] scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36 [ 15.230801] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0 [ 15.230860] scsi 0:0:0:0: scsi scan: INQUIRY pass 2 length 164 [ 15.231171] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0 [ 15.231190] scsi 0:0:0:0: scsi scan: peripheral device type of 31, no device added [ 15.233171] scsi 0:0:0:0: scsi scan: Sending REPORT LUNS to (try 0) [ 15.234144] scsi 0:0:0:0: scsi scan: REPORT LUNS successful (try 0) result 0x0 [ 15.234156] scsi 0:0:0:0: scsi scan: REPORT LUN scan [ 15.235040] scsi 0:0:0:1082671104: scsi scan: INQUIRY pass 1 length 36 [ 15.235220] scsi host1: scsi_eh_1: sleeping [ 15.235336] scsi 0:0:0:1082671104: scsi scan: INQUIRY successful with code 0x0 [ 15.235355] scsi 0:0:0:1082671104: scsi scan: INQUIRY pass 2 length 164 [ 15.235434] scsi host1: zfcp [ 15.235667] scsi 0:0:0:1082671104: scsi scan: INQUIRY successful with code 0x0 [ 15.235709] scsi 0:0:0:1082671104: Direct-Access IBM 2107900 3230 PQ: 0 ANSI: 5 [ 15.238468] scsi host1: fc_host1: bsg interface failed to initialize - setup queue Seems to already fail at setting up the bsg interfaces for zFCP - at loading time of the module. This is just your patch on top of 4.19.0-rc8. Hi, Almost all of the utilities in the sg3_utils package (a few exceptions: sg_reset, sgm_dd, sgp_dd), when given a bsg file node (e.g. 'sg_inq /dev/bsg/0:0:0:1'), will use the bsg sg_v4 interface (i.e. as defined in ). Dito with the sdparm and ddpt packages. So there is a lot of test code for any changes to the bsg driver. Doug Gilbert P.S. that above is true from sg3_utils version 1.27 released on 20090411. The current version is 1.44
[PATCH 6/8] sg: complete locking changes on ioctl+debug
Complete the locking and structure changes of ioctl and debug ('cat /proc/scsi/sg/debug') handling. Signed-off-by: Douglas Gilbert --- This was the code that was "#if 0'-ed out 2 patches ago. It also shuts checkpatch.pl up as it doesn't like that technique but offers no viable substitute. drivers/scsi/sg.c | 204 +- 1 file changed, 130 insertions(+), 74 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 45bad8159e51..9b30d6667cc9 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1004,7 +1004,6 @@ srp_state_or_detaching(struct sg_device *sdp, struct sg_request *srp) return ret; } -#if 0 /* temporary to shorten big patch */ static long sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) { @@ -1036,24 +1035,37 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) return -ENXIO; if (!access_ok(VERIFY_WRITE, p, SZ_SG_IO_HDR)) return -EFAULT; - result = sg_new_write(sfp, filp, p, SZ_SG_IO_HDR, -1, read_only, 1, ); + result = sg_v3_write(sfp, filp, p, SZ_SG_IO_HDR, read_only, +true, ); if (result < 0) return result; result = wait_event_interruptible(sfp->read_wait, - (srp_done(sfp, srp) || atomic_read(>detaching))); - if (atomic_read(>detaching)) + srp_state_or_detaching(sdp, srp)); + + spin_lock_irqsave(>rq_entry_lck, iflags); + if (unlikely(result)) { /* -ERESTARTSYS because signal hit */ + srp->orphan = true; + srp->rq_state = SG_RQ_INFLIGHT; + spin_unlock_irqrestore(>rq_entry_lck, iflags); + SG_LOG(1, sdp, "%s: wait_event_interruptible-->%d\n", + __func__, result); + return result; + } + if (unlikely(atomic_read(>detaching))) { + srp->rq_state = SG_RQ_INACTIVE; + spin_unlock_irqrestore(>rq_entry_lck, iflags); return -ENODEV; - write_lock_irq(>rq_list_lock); - if (srp->done) { - srp->done = 2; - write_unlock_irq(>rq_list_lock); + } else if (likely(srp->rq_state == SG_RQ_AWAIT_READ)) { + srp->rq_state = SG_RQ_DONE_READ; + spin_unlock_irqrestore(>rq_entry_lck, iflags); result = sg_new_read(sfp, p, SZ_SG_IO_HDR, srp); return (result < 0) ? result : 0; } - srp->orphan = true; - write_unlock_irq(>rq_list_lock); - return result; /* -ERESTARTSYS because signal hit process */ + cp = sg_rq_state_str(srp->rq_state, true); + SG_LOG(1, sdp, "%s: unexpected srp=0x%p state: %s\n", __func__, + srp, cp); + spin_unlock_irqrestore(>rq_entry_lck, iflags); + return -EPROTO; /* Logic error */ case SG_SET_TIMEOUT: result = get_user(val, ip); if (result) @@ -1112,25 +1124,36 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) if (!access_ok(VERIFY_WRITE, ip, sizeof(int))) return -EFAULT; read_lock_irqsave(>rq_list_lock, iflags); - list_for_each_entry(srp, >rq_list, entry) { - if ((1 == srp->done) && (!srp->sg_io_owned)) { - read_unlock_irqrestore(>rq_list_lock, - iflags); - __put_user(srp->header.pack_id, ip); - return 0; + leave = false; + val = -1; + list_for_each_entry(srp, >rq_list, rq_entry) { + spin_lock(>rq_entry_lck); + if (srp->rq_state == SG_RQ_AWAIT_READ && + !srp->sync_invoc) { + val = srp->header.pack_id; + leave = true; } + spin_unlock(>rq_entry_lck); + if (leave) + break; } read_unlock_irqrestore(>rq_list_lock, iflags); - __put_user(-1, ip); + __put_user(val, ip); + SG_LOG(3, sdp, "%s:SG_GET_PACK_ID=%d\n", __func__, val); ret
[PATCH 3/8] sg: split header, expand and correct descriptions
Split scsi/sg.h into a smaller scsi/sg.h which includes a new header: uapi/scsi/sg.h . Overall expand the twin header files with new functionality in this patchset and functionality to be added in the next patchset to implement SG_IOSUBMIT and friends. Adjust format to modern kernel conventions. Correct and expand some comments. Signed-off-by: Douglas Gilbert --- The new header file: uapi/scsi/sg.h is expected to be in the kernel's public interface. This takes time (i.e. months or years) to find its way into glibc and Linux distributions. So the sooner it is presented and accepted the better. >From the C perspective, nothing is removed or changed (or shouldn't be), only additions. This means that the original typedefs stay (but they are not used in the driver source file: sg.c) since user space programs may be using them. include/scsi/sg.h | 252 ++--- include/uapi/scsi/sg.h | 415 + 2 files changed, 426 insertions(+), 241 deletions(-) create mode 100644 include/uapi/scsi/sg.h diff --git a/include/scsi/sg.h b/include/scsi/sg.h index f91bcca604e4..596f68746f66 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -7,23 +7,23 @@ /* * History: * Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user - * process control of SCSI devices. + * process control of SCSI devices. * Development Sponsored by Killy Corp. NY NY * * Original driver (sg.h): - * Copyright (C) 1992 Lawrence Foard + * Copyright (C) 1992 Lawrence Foard * Version 2 and 3 extensions to driver: - * Copyright (C) 1998 - 2014 Douglas Gilbert + * Copyright (C) 1998 - 2018 Douglas Gilbert * - * Version: 3.5.36 (20140603) - * This version is for 2.6 and 3 series kernels. + * Version: 3.9.01 (20181016) + * This version is for 2.6, 3 and 4 series kernels. * * Documentation * = * A web site for the SG device driver can be found at: - * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] + * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] * The documentation for the sg version 3 driver can be found at: - * http://sg.danny.cz/sg/p/sg_v3_ho.html + * http://sg.danny.cz/sg/p/sg_v3_ho.html * Also see: /Documentation/scsi/scsi-generic.txt * * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html @@ -33,242 +33,12 @@ extern int sg_big_buff; /* for sysctl */ #endif +#include -typedef struct sg_iovec /* same structure as used by readv() Linux system */ -{ /* call. It defines one scatter-gather element. */ -void __user *iov_base; /* Starting address */ -size_t iov_len; /* Length in bytes */ -} sg_iovec_t; - - -typedef struct sg_io_hdr -{ -int interface_id; /* [i] 'S' for SCSI generic (required) */ -int dxfer_direction;/* [i] data transfer direction */ -unsigned char cmd_len; /* [i] SCSI command length */ -unsigned char mx_sb_len;/* [i] max length to write to sbp */ -unsigned short iovec_count; /* [i] 0 implies no scatter gather */ -unsigned int dxfer_len; /* [i] byte count of data transfer */ -void __user *dxferp; /* [i], [*io] points to data transfer memory - or scatter gather list */ -unsigned char __user *cmdp; /* [i], [*i] points to command to perform */ -void __user *sbp; /* [i], [*o] points to sense_buffer memory */ -unsigned int timeout; /* [i] MAX_UINT->no timeout (unit: millisec) */ -unsigned int flags; /* [i] 0 -> default, see SG_FLAG... */ -int pack_id;/* [i->o] unused internally (normally) */ -void __user * usr_ptr; /* [i->o] unused internally */ -unsigned char status; /* [o] scsi status */ -unsigned char masked_status;/* [o] shifted, masked scsi status */ -unsigned char msg_status; /* [o] messaging level data (optional) */ -unsigned char sb_len_wr;/* [o] byte count actually written to sbp */ -unsigned short host_status; /* [o] errors from host adapter */ -unsigned short driver_status;/* [o] errors from software driver */ -int resid; /* [o] dxfer_len - actual_transferred */ -unsigned int duration; /* [o] time taken by cmd (unit: millisec) */ -unsigned int info; /* [o] auxiliary information */ -} sg_io_hdr_t; /* 64 bytes long (on i386) */ - -#define SG_INTERFACE_ID_ORIG 'S' - -/* Use negative values to flag difference from original sg_header structure */ -#define SG_DXFER_NONE (-1) /* e.g. a SCSI Test Unit Ready command */ -#define SG_DXFER_TO_DEV (-2)/* e.g. a SCSI WRITE command */ -#define SG_DXFER_FROM_DEV (-3) /* e.g. a SCSI READ command */ -#define SG_DXFER_TO_FROM_DEV (-4) /* treated like SG_DXFER_FROM_DEV with the - additional property than
[PATCH 7/8] sg: rework ioctl handling
Rework ioctl handling, report clearly to the log which ioctl has been invoked. Add a new "IOWR" ioctl: SG_SET_GET_EXTENDED which permits several integer and boolean values to be "_SET_" (i.e. passed into the driver, potentially changing its actions) and/or read from the driver (the "_GET_" part) in a single operation. Signed-off-by: Douglas Gilbert --- One feature of the new SG_SET_GET_EXTENDED ioctl is ability to fetch the sg device minor number (e.g. the "3" in /dev/sg3) associated with the current file descriptor. A boolean addition is the ability to change command timekeeping on the current file descriptor from milliseconds (the default) to nanoseconds. drivers/scsi/sg.c | 546 +++--- 1 file changed, 414 insertions(+), 132 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 9b30d6667cc9..583846ebc5e0 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -76,11 +76,11 @@ static int sg_proc_init(void); #define SG_MAX_CDB_SIZE 252 /* Following defines are states of sg_request::rq_state */ -#define SG_RQ_INACTIVE 0/* request not in use (e.g. on fl) */ -#define SG_RQ_INFLIGHT 1/* SCSI request issued, no response yet */ -#define SG_RQ_AWAIT_READ 2 /* response received, awaiting read */ -#define SG_RQ_DONE_READ 3 /* read is ongoing or done */ -#define SG_RQ_BUSY 4/* example: reserve request changing size */ +#define SG_RQ_INACTIVE 0 /* request not in use (e.g. on fl) */ +#define SG_RQ_INFLIGHT 1 /* SCSI request issued, no response yet */ +#define SG_RQ_AWAIT_READ 2 /* response received, awaiting read */ +#define SG_RQ_DONE_READ 3 /* read is ongoing or done */ +#define SG_RQ_BUSY 4 /* example: reserve request changing size */ /* free up requests larger than this dlen size after use */ #define SG_RQ_DATA_THRESHOLD (128 * 1024) @@ -88,8 +88,8 @@ static int sg_proc_init(void); /* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */ #define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024) -#define SG_TIME_UNIT_MS 0 /* milliseconds */ -#define SG_TIME_UNIT_NS 1 /* nanoseconds */ +#define SG_TIME_UNIT_MS 0 /* milliseconds */ +#define SG_TIME_UNIT_NS 1 /* nanoseconds */ #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) @@ -184,7 +184,6 @@ struct sg_fd { /* holds the state of a file descriptor */ bool cmd_q; /* true -> allow command queuing, false -> don't */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ - bool sse_seen; /* SG_SET_EXTENDED ioctl seen */ bool time_in_ns;/* report times in nanoseconds */ u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ struct sg_request *reserve_srp; /* allocate on open(), starts on fl */ @@ -238,6 +237,8 @@ static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp); static struct sg_device *sg_get_dev(int min_dev); static void sg_device_destroy(struct kref *kref); static const char *sg_rq_state_str(u8 rq_state, bool long_str); +static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first, + rwlock_t *rwlp, unsigned long *iflagsp); #define SZ_SG_HEADER sizeof(struct sg_header) /* v1 and v2 header */ #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr) /* v3 header */ @@ -505,9 +506,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; + SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (!access_ok(VERIFY_WRITE, buf, count)) return -EFAULT; @@ -812,11 +813,11 @@ sg_v3_write(struct sg_fd *sfp, struct file *file, const char __user *buf, return -ENOSYS; if (hp->flags & SG_FLAG_MMAP_IO) { if (!list_empty(>rq_list)) - return -EBUSY; /* already active requests on fd */ + return -EBUSY; /* already active requests on fd */ if (hp->dxfer_len > sfp->reserve_srp->data.dlen) - return -ENOMEM; /* MMAP_IO size must fit in reserve */ + return -ENOMEM; /* MMAP_IO size must fit in reserve */ if (hp->flags & SG_FLAG_DIRECT_IO) - return -EINVAL; /* not both MMAP_IO and DIRECT_IO */ + return -EINVAL; /* not both MMAP_IO and DIRECT_IO */ } sfp->
[PATCH 5/8] sg: add free list, rework locking
Remove fixed 16 sg_request object array and replace with an active rq_list plus a request free list. Add finer grained spin lock to sg_request and do a major rework on locking. sg_request objects now are only de-allocated when the owning file descriptor is closed. This simplifies locking issues. Signed-off-by: Douglas Gilbert --- This patch is big and complex. Towards the end the diff program completely loses the plot. Better to use difftool on a two pane window. drivers/scsi/sg.c | 1225 +++-- 1 file changed, 739 insertions(+), 486 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index c77701c0b939..45bad8159e51 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -139,44 +139,56 @@ struct sg_scatter_hold { /* holding area for scsi scatter gather info */ struct sg_device; /* forward declarations */ struct sg_fd; -struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ - struct list_head entry; /* list entry */ - struct sg_fd *parentfp; /* NULL -> not in use */ +/* + * For any file descriptor: at any time a sg_request object must be a member + * of sg_fd::rq_list or rq_free_list::rq_free_list. The only exception is + * within a rq_list_lock write lock when it is moving between those two lists. + */ + +struct sg_request {/* active SCSI command or inactive on free list (fl) */ + struct list_head rq_entry; /* member of rq_list (active cmd) */ + struct list_head free_entry;/* member of rq_free_list */ + spinlock_t rq_entry_lck; struct sg_scatter_hold data;/* hold buffer, perhaps scatter list */ union { struct sg_io_hdr header; /* see */ - struct sg_io_v4 hdr_v4; /* see */ + struct sg_v4_hold v4_hold;/* related to */ }; - u8 sense_b[SCSI_SENSE_BUFFERSIZE]; - bool hdr_v4_active; /* selector for anonymous union above */ - bool res_used; /* true -> use reserve buffer, false -> don't */ + struct execute_work ew; + ktime_t start_ts; /* used when sg_fd::time_in_ns is true */ + bool v4_active; /* selector for autonomous union above */ bool orphan;/* true -> drop on sight, false -> normal */ - bool sg_io_owned; /* true -> packet belongs to SG_IO */ - /* done protected by rq_list_lock */ - char done; /* 0->before bh, 1->before read, 2->read */ + bool sync_invoc;/* true -> synchronous (e.g. from ioctl(SG_IO)) */ + u8 rq_state;/* one of 5 states, see SG_RQ_* defines */ + u8 sense_b[SCSI_SENSE_BUFFERSIZE]; + struct sg_fd *parentfp; /* pointer to owning fd, even when on fl */ + struct sg_scatter_hold *d2p; /* optional 2nd data buffer for bidi */ struct request *rq; struct bio *bio; - struct execute_work ew; }; -struct sg_fd { /* holds the state of a file descriptor */ - struct list_head sfd_siblings; /* protected by device's sfd_lock */ +struct sg_fd { /* holds the state of a file descriptor */ + struct list_head sfd_entry; /* member sg_device::sfds list */ struct sg_device *parentdp; /* owning device */ wait_queue_head_t read_wait;/* queue read until command done */ - rwlock_t rq_list_lock; /* protect access to list in req_arr */ struct mutex f_mutex; /* protect against changes in this fd */ + rwlock_t rq_list_lock; /* protect access to sg_request lists */ + struct list_head rq_list; /* head of inflight sg_request list */ + struct list_head rq_free_list; /* head of sg_request free list */ int timeout;/* defaults to SG_DEFAULT_TIMEOUT */ int timeout_user; /* defaults to SG_DEFAULT_TIMEOUT_USER */ - struct sg_scatter_hold reserve; /* one held for this file descriptor */ - struct list_head rq_list; /* head of request list */ - struct fasync_struct *async_qp; /* used by asynchronous notification */ - struct sg_request req_arr[SG_MAX_QUEUE];/* used as singly-linked list */ + int rem_sgat_thresh;/* > this, request's sgat cleared after use */ + u32 tot_fd_thresh; /* E2BIG if sum_of(dlen) > this, 0: ignore */ + u32 sum_fd_dlens; /* when tot_fd_thresh>0 this is sum_of(dlen) */ bool force_packid; /* true -> pack_id input to read() */ bool cmd_q; /* true -> allow command queuing, false -> don't */ - u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ - bool res_in_use;/* true -> 'reserve' array in use */ + bool sse_seen; /* SG_SET_EXTENDED ioctl seen */ + bool
[PATCH 8/8] sg: user control for q_at_head or tail
Add a SG_SET_GET_EXTENDED ioctl control for whether commands will be queued_at_head or queued_at_tail by the block layer (together with the scsi mid-level). It has file scope. Signed-off-by: Douglas Gilbert --- The user can still override this setting on a per command basis with the SG_FLAG_Q_AT_HEAD and SG_FLAG_Q_AT_TAIL in the sg v3 and v4 structures. drivers/scsi/sg.c | 35 +++ include/uapi/scsi/sg.h | 3 ++- 2 files changed, 33 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 583846ebc5e0..258923aac50d 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -93,6 +93,10 @@ static int sg_proc_init(void); #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) +#define SG_FD_Q_AT_TAIL true +#define SG_FD_Q_AT_HEAD false +#define SG_DEFAULT_Q_AT SG_FD_Q_AT_HEAD/* for backward compatibility */ + int sg_big_buff = SG_DEF_RESERVED_SIZE; /* N.B. This variable is readable and writeable via /proc/scsi/sg/def_reserved_size . Each time sg_open() is called a buffer @@ -185,6 +189,7 @@ struct sg_fd { /* holds the state of a file descriptor */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ bool time_in_ns;/* report times in nanoseconds */ + bool q_at_tail; /* queue at tail if true, head when false */ u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ struct sg_request *reserve_srp; /* allocate on open(), starts on fl */ struct fasync_struct *async_qp; /* used by asynchronous notification */ @@ -894,9 +899,13 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr *hi_p, srp->start_ts = ktime_get_with_offset(TK_OFFS_BOOT); else hp->duration = jiffies_to_msecs(jiffies); - /* at tail if v3 or later interface and tail flag set */ - at_head = !(hp->interface_id != '\0' && - (SG_FLAG_Q_AT_TAIL & hp->flags)); + + if (hp->interface_id == '\0') /* v1 and v2 interface */ + at_head = true; /* backward compatibility */ + else if (sfp->q_at_tail) /* cmd flags can override sfd setting */ + at_head = (SG_FLAG_Q_AT_HEAD & hp->flags); + else/* this sfd is defaulting to head */ + at_head = !(SG_FLAG_Q_AT_TAIL & hp->flags); srp->rq->timeout = timeout; kref_get(>f_ref); /* sg_rq_end_io() does kref_put(). */ @@ -1186,8 +1195,10 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p) } SG_LOG(3, sdp, "%s: wr_mask=0x%x rd_mask=0x%x\n", __func__, seip->valid_wr_mask, seip->valid_rd_mask); + /* reserved_sz (u32), read-write */ if (or_masks & SG_SEIM_RESERVED_SIZE) result = sg_reserved_sz(sfp, seip); + /* rq_rem_sgat_threshold (u32), read-write [impacts re-use only] */ if (or_masks & SG_SEIM_RQ_REM_THRESH) { if (seip->valid_wr_mask & SG_SEIM_RQ_REM_THRESH) { uv = seip->rq_rem_sgat_thresh; @@ -1198,6 +1209,7 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p) if (seip->valid_rd_mask & SG_SEIM_RQ_REM_THRESH) seip->rq_rem_sgat_thresh = sfp->rem_sgat_thresh; } + /* tot_fd_thresh (u32), read-write [sum of active cmd dlen_s] */ if (or_masks & SG_SEIM_TOT_FD_THRESH) { if (seip->valid_wr_mask & SG_SEIM_TOT_FD_THRESH) { uv = seip->tot_fd_thresh; @@ -1208,8 +1220,9 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p) if (seip->valid_rd_mask & SG_SEIM_TOT_FD_THRESH) seip->tot_fd_thresh = sfp->tot_fd_thresh; } + /* check all boolean flags if either wr or rd mask set in or_mask */ if (or_masks & SG_SEIM_CTL_FLAGS) { - /* don't care whether wr or rd mask set in or_mask */ + /* TIME_IN_NS boolean, read-write */ if (seip->ctl_flags_wr_mask & SG_CTL_FLAGM_TIME_IN_NS) sfp->time_in_ns = !!(seip->ctl_flags & SG_CTL_FLAGM_TIME_IN_NS); @@ -1219,19 +1232,32 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p) else seip->ctl_flags &= ~SG_CTL_FLAGM_TIME_IN_NS; } + /* ORPHANS boolean, read-only */ if (seip->ctl_flags_rd_mask & SG_CTL_FLAGM_ORPHANS) { if (sg_any_persistent_orphans(sfp)) seip->ctl_flags |= SG_CTL_FLAGM_OR
[PATCH 1/8] sg: types and naming cleanup
Remove typedefs and use better type names like bool and u8 where appropriate. Rename some variables and functions for clarity. Adjust formatting (e.g. function definitions) to be more consistent across the driver. Signed-off-by: Douglas Gilbert --- The intention is to move to sg_version_num 40001 when a second patchset implementing SG_IOSUBMIT and friends is applied. In the meantime, move from the latest version number in the kernel (30536) to 30901 to indicate a significant change but not yet implementing the sg v4 interface. drivers/scsi/sg.c | 827 ++ 1 file changed, 467 insertions(+), 360 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index c6ad00703c5b..7e723f37a269 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -7,7 +7,7 @@ * Original driver (sg.c): *Copyright (C) 1992 Lawrence Foard * Version 2 and 3 extensions to driver: - *Copyright (C) 1998 - 2014 Douglas Gilbert + *Copyright (C) 1998 - 2018 Douglas Gilbert * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -16,16 +16,9 @@ * */ -static int sg_version_num = 30536; /* 2 digits for each component */ -#define SG_VERSION_STR "3.5.36" +static int sg_version_num = 30901; /* 2 digits for each component */ +#define SG_VERSION_STR "3.9.01" -/* - * D. P. Gilbert (dgilb...@interlog.com), notes: - * - scsi logging is available via SCSI_LOG_TIMEOUT macros. First - *the kernel/module needs to be built with CONFIG_SCSI_LOGGING - *(otherwise the macros compile to empty statements). - * - */ #include #include @@ -52,6 +45,8 @@ static int sg_version_num = 30536;/* 2 digits for each component */ #include #include #include /* for sg_check_file_access() */ +#include +#include #include "scsi.h" #include @@ -64,7 +59,7 @@ static int sg_version_num = 30536;/* 2 digits for each component */ #ifdef CONFIG_SCSI_PROC_FS #include -static char *sg_version_date = "20140603"; +static char *sg_version_date = "20181018"; static int sg_proc_init(void); #endif @@ -73,7 +68,8 @@ static int sg_proc_init(void); #define SG_MAX_DEVS 32768 -/* SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type +/* + * SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type * of sg_io_hdr::cmd_len can only represent 255. All SCSI commands greater * than 16 bytes are "variable length" whose length is a multiple of 4 */ @@ -100,44 +96,54 @@ static int sg_add_device(struct device *, struct class_interface *); static void sg_remove_device(struct device *, struct class_interface *); static DEFINE_IDR(sg_index_idr); -static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock - file descriptor list for device */ +static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock fd list for device */ static struct class_interface sg_interface = { .add_dev= sg_add_device, .remove_dev = sg_remove_device, }; -typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */ - unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */ - unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */ - unsigned bufflen; /* Size of (aggregate) data buffer */ - struct page **pages; - int page_order; - char dio_in_use;/* 0->indirect IO (or mmap), 1->dio */ - unsigned char cmd_opcode; /* first byte of command */ -} Sg_scatter_hold; +struct sg_v4_hold {/* parts of sg_io_v4 object needed in async usage */ + __user u8 *usr_ptr; /* derived from sg_io_v4::usr_ptr */ + __user u8 *sbp; /* derived from sg_io_v4::response */ + u16 cmd_len;/* truncated of sg_io_v4::request_len */ + u16 max_sb_len; /* truncated of sg_io_v4::max_response_len */ + u32 flags; /* copy of sg_io_v4::flags */ +}; + +struct sg_scatter_hold { /* holding area for scsi scatter gather info */ + struct page **pages;/* num_sgat element array of struct page* */ + int page_order; /* byte_len = (page_size * (2**page_order)) */ + int dlen; /* Byte length of data buffer */ + unsigned short num_sgat;/* actual number of scatter-gather segments */ + bool dio_in_use;/* false->indirect IO (or mmap), true->dio */ + u8 cmd_opcode; /* first byte of command */ +}; struct sg_device; /* forward declarations */ struct sg_fd; -typedef struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ +struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ struct list_head entry; /* list entry */
[PATCH 4/8] sg: expand request states
Introduce the new rq_state defines. SG_RQ_DATA_THRESHOLD is a default value that if the data length of a request exceeds then, after that request is completed, the data buffer will be freed up as the sg_request object is placed on the free list. SG_TOT_FD_THRESHOLD is a default, per file descriptor value that the sum of outstanding command data lengths is not allowed to exceed. Signed-off-by: Douglas Gilbert --- The '#if 0's are temporary and removed in a later patch. They allow the following large and complex patch to be a bit shorter and still compile. drivers/scsi/sg.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 71623685abfe..c77701c0b939 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -75,6 +75,22 @@ static int sg_proc_init(void); */ #define SG_MAX_CDB_SIZE 252 +/* Following defines are states of sg_request::rq_state */ +#define SG_RQ_INACTIVE 0/* request not in use (e.g. on fl) */ +#define SG_RQ_INFLIGHT 1/* SCSI request issued, no response yet */ +#define SG_RQ_AWAIT_READ 2 /* response received, awaiting read */ +#define SG_RQ_DONE_READ 3 /* read is ongoing or done */ +#define SG_RQ_BUSY 4/* example: reserve request changing size */ + +/* free up requests larger than this dlen size after use */ +#define SG_RQ_DATA_THRESHOLD (128 * 1024) + +/* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */ +#define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024) + +#define SG_TIME_UNIT_MS 0 /* milliseconds */ +#define SG_TIME_UNIT_NS 1 /* nanoseconds */ +#define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) int sg_big_buff = SG_DEF_RESERVED_SIZE; @@ -950,6 +966,7 @@ sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo) } } +#if 0 /* temporary to shorten big patch */ static long sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) { @@ -1227,6 +1244,7 @@ sg_compat_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) return -ENOIOCTLCMD; } #endif +#endif /* temporary to shorten big patch */ static __poll_t sg_poll(struct file *filp, poll_table * wait) @@ -1496,10 +1514,12 @@ static const struct file_operations sg_fops = { .read = sg_read, .write = sg_write, .poll = sg_poll, +#if 0 /* temporary to shorten big patch */ .unlocked_ioctl = sg_ioctl, #ifdef CONFIG_COMPAT .compat_ioctl = sg_compat_ioctl, #endif +#endif /* temporary to shorten big patch */ .open = sg_open, .mmap = sg_mmap, .release = sg_release, @@ -2422,12 +2442,16 @@ static const struct seq_operations devstrs_seq_ops = { .show = sg_proc_seq_show_devstrs, }; +#if 0 /* temporary to shorten big patch */ static int sg_proc_seq_show_debug(struct seq_file *s, void *v); +#endif /* temporary to shorten big patch */ static const struct seq_operations debug_seq_ops = { .start = dev_seq_start, .next = dev_seq_next, .stop = dev_seq_stop, +#if 0 /* temporary to shorten big patch */ .show = sg_proc_seq_show_debug, +#endif /* temporary to shorten big patch */ }; static int @@ -2601,6 +2625,8 @@ sg_proc_seq_show_devstrs(struct seq_file *s, void *v) return 0; } +#if 0 /* temporary to shorten big patch */ + /* must be called while holding sg_index_lock */ static void sg_proc_debug_helper(struct seq_file *s, struct sg_device *sdp) @@ -2704,6 +2730,7 @@ sg_proc_seq_show_debug(struct seq_file *s, void *v) read_unlock_irqrestore(_index_lock, iflags); return 0; } +#endif /* temporary to shorten big patch */ #endif /* CONFIG_SCSI_PROC_FS */ -- 2.17.1
[PATCH 0/8] sg: major cleanup, remove max_queue limit
The intention is to add two new ioctls as proposed by Linus Torvalds: SG_IOSUBMIT and SG_IORECEIVE to replace the write()/read() async interface. But first, clean up the driver and remove the SG_MAX_QUEUE limit of no more than 16 queued commands on a file descriptor at a time. A free list has been added and the de-allocation of sg_request objects is deferred until the close() of a file. Locking is extensively reworked, especially at the struct sg_fd and sg_request level. A new SG_SET_GET_EXTENDED ioctl has been added. It allows multiple integer values and booleans to be written to and/or read from the driver. An example is changing and/or reading the reserved request data length (there is one of these per fd). An example of a new feature is changing and/or reading the per-fd upper limit on the sum of outstanding data buffer sizes (default is 16 MB). An example of a boolean is a bit to do all following command timekeeping in nanoseconds rather that the default millseconds. A later patchset will add implementations for the SG_IOSUBMIT and SG_IORECEIVE plus handling of the sg v4 interface with the existing SG_IO ioctl. This patchset is against Martin Petersen 4.20/scsi-queue branch. Douglas Gilbert (8): sg: types and naming cleanup sg: introduce sg_log macro sg: split header, expand and correct descriptions sg: expand request states sg: add free list, rework locking sg: complete locking changes on ioctl+debug sg: rework ioctl handling sg: user control for q_at_head or tail drivers/scsi/sg.c | 2484 ++-- include/scsi/sg.h | 252 +--- include/uapi/scsi/sg.h | 416 +++ 3 files changed, 2036 insertions(+), 1116 deletions(-) create mode 100644 include/uapi/scsi/sg.h -- 2.17.1
[PATCH 2/8] sg: introduce sg_log macro
Introduce the SG_LOG macro to replace long-winded 'SCSI_LOG_TIMEOUT(3, sg_printk ...' debug messages. Use __func__ wherever appropriate to make the debug messages more portable. Signed-off-by: Douglas Gilbert --- drivers/scsi/sg.c | 162 +- 1 file changed, 72 insertions(+), 90 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 7e723f37a269..71623685abfe 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -216,9 +216,24 @@ static void sg_device_destroy(struct kref *kref); /* #define SZ_SG_IOVEC sizeof(struct sg_iovec) synonym for 'struct iovec' */ #define SZ_SG_REQ_INFO sizeof(struct sg_req_info) -#define sg_printk(prefix, sdp, fmt, a...) \ - sdev_prefix_printk(prefix, (sdp)->device, \ - (sdp)->disk->disk_name, fmt, ##a) +/* + * Kernel needs to be built with CONFIG_SCSI_LOGGING to see log messages. + * 'depth' is a number between 1 (most severe) and 7 (most noisy, most + * information). All messages are logged as informational (KERN_INFO). In + * the unexpected situation where sdp is NULL the macro reverts to a pr_info + * and ignores CONFIG_SCSI_LOGGING and always prints to the log. + */ +#define SG_LOG(depth, sdp, fmt, a...) \ + do {\ + if (IS_ERR_OR_NULL(sdp)) { \ + pr_info("sg: sdp=NULL_or_ERR, " fmt, ##a); \ + } else {\ + SCSI_LOG_TIMEOUT(depth, sdev_prefix_printk( \ +KERN_INFO, (sdp)->device, \ +(sdp)->disk->disk_name, fmt, \ +##a)); \ + } \ + } while (0) /* * The SCSI interfaces that use read() and write() as an asynchronous variant of @@ -316,9 +331,8 @@ sg_open(struct inode *inode, struct file *filp) sdp = sg_get_dev(min_dev); if (IS_ERR(sdp)) return PTR_ERR(sdp); - - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, - "sg_open: flags=0x%x\n", flags)); + SG_LOG(3, sdp, "%s: flags=0x%x; device open count prior=%d\n", + __func__, flags, sdp->open_cnt); /* This driver's module count bumped by fops_get in */ /* Prevent the device driver from vanishing while we sleep */ @@ -414,9 +428,10 @@ sg_release(struct inode *inode, struct file *filp) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "sg_release\n")); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; + SG_LOG(3, sdp, "%s: device open count prior=%d\n", __func__, + sdp->open_cnt); mutex_lock(>open_rel_lock); scsi_autopm_put_device(sdp->device); @@ -462,8 +477,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos) sdp = sfp->parentdp; if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n", -__func__, (int)count)); + SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (!access_ok(VERIFY_WRITE, buf, count)) return -EFAULT; @@ -663,10 +677,9 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; + SG_LOG(3, sdp, "%s: write(3rd arg) count=%d\n", __func__, (int)count); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n", -__func__, (int)count)); if (atomic_read(>detaching)) return -ENODEV; if (!((filp->f_flags & O_NONBLOCK) || @@ -687,8 +700,7 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos) return -EIO;/* minimum scsi command length is 6 bytes */ if (!(srp = sg_add_request(sfp))) { - SCSI_LOG_TIMEOUT(1, sg_printk(KERN_INFO, sdp, - "sg_write: queue full\n")); + SG_LOG(1, sdp, "%s: queue full\n", __func__); return -EDOM; } buf += SZ_SG_HEADER; @@ -703,9 +715,8 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
Re: [PATCH v2 3/8] sg: split header, expand and correct descriptions
On 2018-10-20 11:21 p.m., Douglas Gilbert wrote: Split scsi/sg.h into a smaller scsi/sg.h which includes a new header: uapi/scsi/sg.h . Overall expand the twin header files with new functionality in this patchset and functionality to be added in the next patchset to implement SG_IOSUBMIT and friends. Adjust format to modern kernel conventions. Correct and expand some comments. Signed-off-by: Douglas Gilbert --- The new header file: uapi/scsi/sg.h is expected to be in the kernel's public interface. This takes time (i.e. months or years) to find its way into glibc and Linux distributions. So the sooner it is presented and accepted the better. From the C perspective, nothing is removed or changed (or shouldn't be), only additions. This means that the original typedefs stay (but they are not used in the driver source file: sg.c) since user space programs may be using them. include/scsi/sg.h | 252 ++--- include/uapi/scsi/sg.h | 418 + 2 files changed, 429 insertions(+), 241 deletions(-) create mode 100644 include/uapi/scsi/sg.h diff --git a/include/scsi/sg.h b/include/scsi/sg.h index f91bcca604e4..596f68746f66 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -7,23 +7,23 @@ /* * History: * Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user - * process control of SCSI devices. + * process control of SCSI devices. * Development Sponsored by Killy Corp. NY NY * * Original driver (sg.h): - * Copyright (C) 1992 Lawrence Foard + * Copyright (C) 1992 Lawrence Foard * Version 2 and 3 extensions to driver: - * Copyright (C) 1998 - 2014 Douglas Gilbert + * Copyright (C) 1998 - 2018 Douglas Gilbert * - * Version: 3.5.36 (20140603) - * This version is for 2.6 and 3 series kernels. + * Version: 3.9.01 (20181016) + * This version is for 2.6, 3 and 4 series kernels. * * Documentation * = * A web site for the SG device driver can be found at: - * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] + * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] * The documentation for the sg version 3 driver can be found at: - * http://sg.danny.cz/sg/p/sg_v3_ho.html + * http://sg.danny.cz/sg/p/sg_v3_ho.html * Also see: /Documentation/scsi/scsi-generic.txt * * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html @@ -33,242 +33,12 @@ extern int sg_big_buff; /* for sysctl */ #endif +#include -typedef struct sg_iovec /* same structure as used by readv() Linux system */ -{ /* call. It defines one scatter-gather element. */ -void __user *iov_base; /* Starting address */ -size_t iov_len; /* Length in bytes */ -} sg_iovec_t; - - -typedef struct sg_io_hdr -{ -int interface_id; /* [i] 'S' for SCSI generic (required) */ -int dxfer_direction;/* [i] data transfer direction */ -unsigned char cmd_len; /* [i] SCSI command length */ -unsigned char mx_sb_len;/* [i] max length to write to sbp */ -unsigned short iovec_count; /* [i] 0 implies no scatter gather */ -unsigned int dxfer_len; /* [i] byte count of data transfer */ -void __user *dxferp; /* [i], [*io] points to data transfer memory - or scatter gather list */ -unsigned char __user *cmdp; /* [i], [*i] points to command to perform */ -void __user *sbp; /* [i], [*o] points to sense_buffer memory */ -unsigned int timeout; /* [i] MAX_UINT->no timeout (unit: millisec) */ -unsigned int flags; /* [i] 0 -> default, see SG_FLAG... */ -int pack_id;/* [i->o] unused internally (normally) */ -void __user * usr_ptr; /* [i->o] unused internally */ -unsigned char status; /* [o] scsi status */ -unsigned char masked_status;/* [o] shifted, masked scsi status */ -unsigned char msg_status; /* [o] messaging level data (optional) */ -unsigned char sb_len_wr;/* [o] byte count actually written to sbp */ -unsigned short host_status; /* [o] errors from host adapter */ -unsigned short driver_status;/* [o] errors from software driver */ -int resid; /* [o] dxfer_len - actual_transferred */ -unsigned int duration; /* [o] time taken by cmd (unit: millisec) */ -unsigned int info; /* [o] auxiliary information */ -} sg_io_hdr_t; /* 64 bytes long (on i386) */ - -#define SG_INTERFACE_ID_ORIG 'S' - -/* Use negative values to flag difference from original sg_header structure */ -#define SG_DXFER_NONE (-1) /* e.g. a SCSI Test Unit Ready command */ -#define SG_DXFER_TO_DEV (-2)/* e.g. a SCSI WRITE command */ -#define SG_DXFER_FROM_DEV (-3) /* e.g. a SCSI READ command */ -#define SG_DXFER_TO_FROM_DEV (-4) /* treated like SG_DX
[PATCH v2 4/8] sg: expand request states
Introduce the new sg_rq_state enumerations for tracking the lifetime of a sg_request. SG_RQ_DATA_THRESHOLD is a default value that if the data length of a request exceeds then, after that request is completed, the data buffer will be freed up as the sg_request object is placed on the free list. SG_TOT_FD_THRESHOLD is a default, per file descriptor value that the sum of outstanding command data lengths is not allowed to exceed. Signed-off-by: Douglas Gilbert --- The '#if 0's are temporary and removed in a later patch. They allow the following large and complex patch (5 of 8) to be a bit shorter and still compile. drivers/scsi/sg.c | 29 + 1 file changed, 29 insertions(+) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 94e13a1d21a5..a76395f16fb1 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -75,6 +75,24 @@ static int sg_proc_init(void); */ #define SG_MAX_CDB_SIZE 252 +/* Following enum contains the states of sg_request::rq_state */ +enum sg_rq_state { + SG_RQ_INACTIVE = 0, /* request not in use (e.g. on fl) */ + SG_RQ_INFLIGHT, /* SCSI request issued, no response yet */ + SG_RQ_AWAIT_READ, /* response received, awaiting read */ + SG_RQ_DONE_READ,/* read is ongoing or done */ + SG_RQ_BUSY, /* example: reserve request changing size */ +}; + +/* free up requests larger than this dlen size after use */ +#define SG_RQ_DATA_THRESHOLD (128 * 1024) + +/* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */ +#define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024) + +#define SG_TIME_UNIT_MS 0 /* milliseconds */ +#define SG_TIME_UNIT_NS 1 /* nanoseconds */ +#define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) int sg_big_buff = SG_DEF_RESERVED_SIZE; @@ -950,6 +968,7 @@ sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo) } } +#if 0 /* temporary to shorten big patch */ static long sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) { @@ -1227,6 +1246,7 @@ sg_compat_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) return -ENOIOCTLCMD; } #endif +#endif /* temporary to shorten big patch */ static __poll_t sg_poll(struct file *filp, poll_table * wait) @@ -1496,10 +1516,12 @@ static const struct file_operations sg_fops = { .read = sg_read, .write = sg_write, .poll = sg_poll, +#if 0 /* temporary to shorten big patch */ .unlocked_ioctl = sg_ioctl, #ifdef CONFIG_COMPAT .compat_ioctl = sg_compat_ioctl, #endif +#endif /* temporary to shorten big patch */ .open = sg_open, .mmap = sg_mmap, .release = sg_release, @@ -2422,12 +2444,16 @@ static const struct seq_operations devstrs_seq_ops = { .show = sg_proc_seq_show_devstrs, }; +#if 0 /* temporary to shorten big patch */ static int sg_proc_seq_show_debug(struct seq_file *s, void *v); +#endif /* temporary to shorten big patch */ static const struct seq_operations debug_seq_ops = { .start = dev_seq_start, .next = dev_seq_next, .stop = dev_seq_stop, +#if 0 /* temporary to shorten big patch */ .show = sg_proc_seq_show_debug, +#endif /* temporary to shorten big patch */ }; static int @@ -2601,6 +2627,8 @@ sg_proc_seq_show_devstrs(struct seq_file *s, void *v) return 0; } +#if 0 /* temporary to shorten big patch */ + /* must be called while holding sg_index_lock */ static void sg_proc_debug_helper(struct seq_file *s, struct sg_device *sdp) @@ -2704,6 +2732,7 @@ sg_proc_seq_show_debug(struct seq_file *s, void *v) read_unlock_irqrestore(_index_lock, iflags); return 0; } +#endif /* temporary to shorten big patch */ #endif /* CONFIG_SCSI_PROC_FS */ -- 2.17.1
[PATCH v2 6/8] sg: complete locking changes on ioctl+debug
Complete the locking and structure changes of ioctl and debug ('cat /proc/scsi/sg/debug') handling. Signed-off-by: Douglas Gilbert --- This was the code that was "#if 0'-ed out 2 patches ago. It also shuts checkpatch.pl up as it doesn't like that technique but offers no viable substitute. drivers/scsi/sg.c | 209 +- 1 file changed, 134 insertions(+), 75 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 4a2e9a616604..8c9fbf865106 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -239,6 +239,7 @@ static struct sg_request *sg_add_request(struct sg_fd *sfp, int dxfr_len, static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp); static struct sg_device *sg_get_dev(int min_dev); static void sg_device_destroy(struct kref *kref); +static const char *sg_rq_state_str(u8 rq_state, bool long_str); #define SZ_SG_HEADER sizeof(struct sg_header) /* v1 and v2 header */ #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr) /* v3 header */ @@ -358,7 +359,7 @@ sg_open(struct inode *inode, struct file *filp) nonseekable_open(inode, filp); if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) - return -EPERM; /* Can't lock it with read only access */ + return -EPERM;/* not permitted, need write access for O_EXCL */ sdp = sg_get_dev(min_dev); if (IS_ERR(sdp)) return PTR_ERR(sdp); @@ -1011,7 +1012,6 @@ srp_state_or_detaching(struct sg_device *sdp, struct sg_request *srp) return ret; } -#if 0 /* temporary to shorten big patch */ static long sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) { @@ -1043,24 +1043,37 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) return -ENXIO; if (!access_ok(VERIFY_WRITE, p, SZ_SG_IO_HDR)) return -EFAULT; - result = sg_new_write(sfp, filp, p, SZ_SG_IO_HDR, -1, read_only, 1, ); + result = sg_v3_write(sfp, filp, p, SZ_SG_IO_HDR, read_only, +true, ); if (result < 0) return result; result = wait_event_interruptible(sfp->read_wait, - (srp_done(sfp, srp) || atomic_read(>detaching))); - if (atomic_read(>detaching)) + srp_state_or_detaching(sdp, srp)); + + spin_lock_irqsave(>rq_entry_lck, iflags); + if (unlikely(result)) { /* -ERESTARTSYS because signal hit */ + srp->orphan = true; + srp->rq_state = SG_RQ_INFLIGHT; + spin_unlock_irqrestore(>rq_entry_lck, iflags); + SG_LOG(1, sdp, "%s: wait_event_interruptible-->%d\n", + __func__, result); + return result; + } + if (unlikely(atomic_read(>detaching))) { + srp->rq_state = SG_RQ_INACTIVE; + spin_unlock_irqrestore(>rq_entry_lck, iflags); return -ENODEV; - write_lock_irq(>rq_list_lock); - if (srp->done) { - srp->done = 2; - write_unlock_irq(>rq_list_lock); + } else if (likely(srp->rq_state == SG_RQ_AWAIT_READ)) { + srp->rq_state = SG_RQ_DONE_READ; + spin_unlock_irqrestore(>rq_entry_lck, iflags); result = sg_new_read(sfp, p, SZ_SG_IO_HDR, srp); return (result < 0) ? result : 0; } - srp->orphan = true; - write_unlock_irq(>rq_list_lock); - return result; /* -ERESTARTSYS because signal hit process */ + cp = sg_rq_state_str(srp->rq_state, true); + SG_LOG(1, sdp, "%s: unexpected srp=0x%p state: %s\n", __func__, + srp, cp); + spin_unlock_irqrestore(>rq_entry_lck, iflags); + return -EPROTO; /* Logic error */ case SG_SET_TIMEOUT: result = get_user(val, ip); if (result) @@ -1119,25 +1132,36 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) if (!access_ok(VERIFY_WRITE, ip, sizeof(int))) return -EFAULT; read_lock_irqsave(>rq_list_lock, iflags); - list_for_each_entry(srp, >rq_list, entry) { - if ((1 == srp->done) && (!srp->sg_io_owned)) { - read_unlock_irqrestore(>rq_list_lock, - iflags); -
[PATCH v2 0/8] sg: major cleanup, remove max_queue limit
The intention is to add two new ioctls as proposed by Linus Torvalds: SG_IOSUBMIT and SG_IORECEIVE to replace the write()/read() async interface. But first, clean up the driver and remove the SG_MAX_QUEUE limit of no more than 16 queued commands on a file descriptor at a time. A free list has been added and the de-allocation of sg_request objects is deferred until the close() of a file. Locking is extensively reworked, especially at the struct sg_fd and sg_request level. A new SG_SET_GET_EXTENDED ioctl has been added. It allows multiple integer values and booleans to be written to and/or read from the driver. An example is changing and/or reading the reserved request data length (there is one of these per fd). An example of a new feature is changing and/or reading the per-fd upper limit on the sum of outstanding data buffer sizes (default is 16 MB). An example of a boolean is a bit to do all following command timekeeping in nanoseconds rather that the default millseconds. A later patchset will add implementations for the SG_IOSUBMIT and SG_IORECEIVE plus handling of the sg v4 interface with the existing SG_IO ioctl. This patchset is against Martin Petersen's 4.20/scsi-queue branch. Changes since v1: - remove redundant casts from private_data field - introduce atomic to address locking problems around sg_fd::sum_fd_dlens - replace rq_state defines with an enumeration - add __must_hold() annotation to sg_fill_request_table() - fix compile/build problem around the 4th and 5th patches - add read_value[SG_SEIRV_*] options in SG_SET_GET_EXTENDED ioctl and increase structure size from 64 to 96 bytes Douglas Gilbert (8): sg: types and naming cleanup sg: introduce sg_log macro sg: split header, expand and correct descriptions sg: expand request states sg: add free list, rework locking sg: complete locking changes on ioctl+debug sg: rework ioctl handling sg: user controls for q_at_head, read_value drivers/scsi/sg.c | 2538 ++-- include/scsi/sg.h | 268 + include/uapi/scsi/sg.h | 412 +++ 3 files changed, 2079 insertions(+), 1139 deletions(-) create mode 100644 include/uapi/scsi/sg.h -- 2.17.1
[PATCH v2 2/8] sg: introduce sg_log macro
Introduce the SG_LOG macro to replace long-winded 'SCSI_LOG_TIMEOUT(3, sg_printk ...' debug messages. Use __func__ wherever appropriate to make the debug messages more portable. Signed-off-by: Douglas Gilbert --- drivers/scsi/sg.c | 162 +- 1 file changed, 72 insertions(+), 90 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 78a35e63d177..94e13a1d21a5 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -216,9 +216,24 @@ static void sg_device_destroy(struct kref *kref); /* #define SZ_SG_IOVEC sizeof(struct sg_iovec) synonym for 'struct iovec' */ #define SZ_SG_REQ_INFO sizeof(struct sg_req_info) -#define sg_printk(prefix, sdp, fmt, a...) \ - sdev_prefix_printk(prefix, (sdp)->device, \ - (sdp)->disk->disk_name, fmt, ##a) +/* + * Kernel needs to be built with CONFIG_SCSI_LOGGING to see log messages. + * 'depth' is a number between 1 (most severe) and 7 (most noisy, most + * information). All messages are logged as informational (KERN_INFO). In + * the unexpected situation where sdp is NULL the macro reverts to a pr_info + * and ignores CONFIG_SCSI_LOGGING and always prints to the log. + */ +#define SG_LOG(depth, sdp, fmt, a...) \ + do {\ + if (IS_ERR_OR_NULL(sdp)) { \ + pr_info("sg: sdp=NULL_or_ERR, " fmt, ##a); \ + } else {\ + SCSI_LOG_TIMEOUT(depth, sdev_prefix_printk( \ +KERN_INFO, (sdp)->device, \ +(sdp)->disk->disk_name, fmt, \ +##a)); \ + } \ + } while (0) /* * The SCSI interfaces that use read() and write() as an asynchronous variant of @@ -316,9 +331,8 @@ sg_open(struct inode *inode, struct file *filp) sdp = sg_get_dev(min_dev); if (IS_ERR(sdp)) return PTR_ERR(sdp); - - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, - "sg_open: flags=0x%x\n", flags)); + SG_LOG(3, sdp, "%s: flags=0x%x; device open count prior=%d\n", + __func__, flags, sdp->open_cnt); /* This driver's module count bumped by fops_get in */ /* Prevent the device driver from vanishing while we sleep */ @@ -414,9 +428,10 @@ sg_release(struct inode *inode, struct file *filp) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "sg_release\n")); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; + SG_LOG(3, sdp, "%s: device open count prior=%d\n", __func__, + sdp->open_cnt); mutex_lock(>open_rel_lock); scsi_autopm_put_device(sdp->device); @@ -462,8 +477,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos) sdp = sfp->parentdp; if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n", -__func__, (int)count)); + SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (!access_ok(VERIFY_WRITE, buf, count)) return -EFAULT; @@ -663,10 +677,9 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; + SG_LOG(3, sdp, "%s: write(3rd arg) count=%d\n", __func__, (int)count); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n", -__func__, (int)count)); if (atomic_read(>detaching)) return -ENODEV; if (!((filp->f_flags & O_NONBLOCK) || @@ -687,8 +700,7 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos) return -EIO;/* minimum scsi command length is 6 bytes */ if (!(srp = sg_add_request(sfp))) { - SCSI_LOG_TIMEOUT(1, sg_printk(KERN_INFO, sdp, - "sg_write: queue full\n")); + SG_LOG(1, sdp, "%s: queue full\n", __func__); return -EDOM; } buf += SZ_SG_HEADER; @@ -703,9 +715,8 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
[PATCH v2 3/8] sg: split header, expand and correct descriptions
Split scsi/sg.h into a smaller scsi/sg.h which includes a new header: uapi/scsi/sg.h . Overall expand the twin header files with new functionality in this patchset and functionality to be added in the next patchset to implement SG_IOSUBMIT and friends. Adjust format to modern kernel conventions. Correct and expand some comments. Signed-off-by: Douglas Gilbert --- The new header file: uapi/scsi/sg.h is expected to be in the kernel's public interface. This takes time (i.e. months or years) to find its way into glibc and Linux distributions. So the sooner it is presented and accepted the better. >From the C perspective, nothing is removed or changed (or shouldn't be), only additions. This means that the original typedefs stay (but they are not used in the driver source file: sg.c) since user space programs may be using them. include/scsi/sg.h | 252 ++--- include/uapi/scsi/sg.h | 418 + 2 files changed, 429 insertions(+), 241 deletions(-) create mode 100644 include/uapi/scsi/sg.h diff --git a/include/scsi/sg.h b/include/scsi/sg.h index f91bcca604e4..596f68746f66 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -7,23 +7,23 @@ /* * History: * Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user - * process control of SCSI devices. + * process control of SCSI devices. * Development Sponsored by Killy Corp. NY NY * * Original driver (sg.h): - * Copyright (C) 1992 Lawrence Foard + * Copyright (C) 1992 Lawrence Foard * Version 2 and 3 extensions to driver: - * Copyright (C) 1998 - 2014 Douglas Gilbert + * Copyright (C) 1998 - 2018 Douglas Gilbert * - * Version: 3.5.36 (20140603) - * This version is for 2.6 and 3 series kernels. + * Version: 3.9.01 (20181016) + * This version is for 2.6, 3 and 4 series kernels. * * Documentation * = * A web site for the SG device driver can be found at: - * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] + * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] * The documentation for the sg version 3 driver can be found at: - * http://sg.danny.cz/sg/p/sg_v3_ho.html + * http://sg.danny.cz/sg/p/sg_v3_ho.html * Also see: /Documentation/scsi/scsi-generic.txt * * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html @@ -33,242 +33,12 @@ extern int sg_big_buff; /* for sysctl */ #endif +#include -typedef struct sg_iovec /* same structure as used by readv() Linux system */ -{ /* call. It defines one scatter-gather element. */ -void __user *iov_base; /* Starting address */ -size_t iov_len; /* Length in bytes */ -} sg_iovec_t; - - -typedef struct sg_io_hdr -{ -int interface_id; /* [i] 'S' for SCSI generic (required) */ -int dxfer_direction;/* [i] data transfer direction */ -unsigned char cmd_len; /* [i] SCSI command length */ -unsigned char mx_sb_len;/* [i] max length to write to sbp */ -unsigned short iovec_count; /* [i] 0 implies no scatter gather */ -unsigned int dxfer_len; /* [i] byte count of data transfer */ -void __user *dxferp; /* [i], [*io] points to data transfer memory - or scatter gather list */ -unsigned char __user *cmdp; /* [i], [*i] points to command to perform */ -void __user *sbp; /* [i], [*o] points to sense_buffer memory */ -unsigned int timeout; /* [i] MAX_UINT->no timeout (unit: millisec) */ -unsigned int flags; /* [i] 0 -> default, see SG_FLAG... */ -int pack_id;/* [i->o] unused internally (normally) */ -void __user * usr_ptr; /* [i->o] unused internally */ -unsigned char status; /* [o] scsi status */ -unsigned char masked_status;/* [o] shifted, masked scsi status */ -unsigned char msg_status; /* [o] messaging level data (optional) */ -unsigned char sb_len_wr;/* [o] byte count actually written to sbp */ -unsigned short host_status; /* [o] errors from host adapter */ -unsigned short driver_status;/* [o] errors from software driver */ -int resid; /* [o] dxfer_len - actual_transferred */ -unsigned int duration; /* [o] time taken by cmd (unit: millisec) */ -unsigned int info; /* [o] auxiliary information */ -} sg_io_hdr_t; /* 64 bytes long (on i386) */ - -#define SG_INTERFACE_ID_ORIG 'S' - -/* Use negative values to flag difference from original sg_header structure */ -#define SG_DXFER_NONE (-1) /* e.g. a SCSI Test Unit Ready command */ -#define SG_DXFER_TO_DEV (-2)/* e.g. a SCSI WRITE command */ -#define SG_DXFER_FROM_DEV (-3) /* e.g. a SCSI READ command */ -#define SG_DXFER_TO_FROM_DEV (-4) /* treated like SG_DXFER_FROM_DEV with the - additional property than
[PATCH v2 7/8] sg: rework ioctl handling
Rework ioctl handling, report clearly to the log which ioctl has been invoked. Add a new "IOWR" ioctl: SG_SET_GET_EXTENDED which permits several integer and boolean values to be "_SET_" (i.e. passed into the driver, potentially changing its actions) and/or read from the driver (the "_GET_" part) in a single operation. Signed-off-by: Douglas Gilbert --- One feature of the new SG_SET_GET_EXTENDED ioctl is ability to fetch the sg device minor number (e.g. the "3" in /dev/sg3) associated with the current file descriptor. A boolean addition is the ability to change command timekeeping on the current file descriptor from units of milliseconds (the default) to nanoseconds. drivers/scsi/sg.c | 535 +++--- 1 file changed, 409 insertions(+), 126 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 8c9fbf865106..1a63b0a9279a 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -90,8 +90,8 @@ enum sg_rq_state { /* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */ #define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024) -#define SG_TIME_UNIT_MS 0 /* milliseconds */ -#define SG_TIME_UNIT_NS 1 /* nanoseconds */ +#define SG_TIME_UNIT_MS 0 /* milliseconds */ +#define SG_TIME_UNIT_NS 1 /* nanoseconds */ #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) @@ -186,7 +186,6 @@ struct sg_fd { /* holds the state of a file descriptor */ bool cmd_q; /* true -> allow command queuing, false -> don't */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ - bool sse_seen; /* SG_SET_EXTENDED ioctl seen */ bool time_in_ns;/* report times in nanoseconds */ u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ struct sg_request *reserve_srp; /* allocate on open(), starts on fl */ @@ -240,6 +239,8 @@ static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp); static struct sg_device *sg_get_dev(int min_dev); static void sg_device_destroy(struct kref *kref); static const char *sg_rq_state_str(u8 rq_state, bool long_str); +static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first, + rwlock_t *rwlp, unsigned long *iflagsp); #define SZ_SG_HEADER sizeof(struct sg_header) /* v1 and v2 header */ #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr) /* v3 header */ @@ -507,9 +508,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; + SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (!access_ok(VERIFY_WRITE, buf, count)) return -EFAULT; @@ -814,11 +815,11 @@ sg_v3_write(struct sg_fd *sfp, struct file *file, const char __user *buf, return -ENOSYS; if (hp->flags & SG_FLAG_MMAP_IO) { if (!list_empty(>rq_list)) - return -EBUSY; /* already active requests on fd */ + return -EBUSY; /* already active requests on fd */ if (hp->dxfer_len > sfp->reserve_srp->data.dlen) - return -ENOMEM; /* MMAP_IO size must fit in reserve */ + return -ENOMEM; /* MMAP_IO size must fit in reserve */ if (hp->flags & SG_FLAG_DIRECT_IO) - return -EINVAL; /* not both MMAP_IO and DIRECT_IO */ + return -EINVAL; /* not both MMAP_IO and DIRECT_IO */ } sfp->cmd_q = true; /* when sg_io_hdr seen, set command queuing on */ ul_timeout = msecs_to_jiffies(hp->timeout); @@ -856,7 +857,7 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr *hi_p, srp = sg_add_request(sfp, hi_p->dxfer_len, false); if (IS_ERR(srp)) return srp; - srp->header = *hi_p;/* structure assignment, could memcpy */ + srp->header = *hi_p;/* structure assignment, could memcpy */ hp = >header; srp->data.cmd_opcode = cmnd[0]; /* hold opcode of command */ hp->status = 0; @@ -1012,69 +1013,300 @@ srp_state_or_detaching(struct sg_device *sdp, struct sg_request *srp) return ret; } +/* For handling ioctl(SG_IO). Returns 0 on success else a negated errno */ +static int +sg_sg_io(struct file *filp, struct sg_device *sdp, struct sg_fd *sfp, +void __user *p) +{ + bool read_only = (O_RDWR != (filp->f_flags & O_ACCM
[PATCH v2 8/8] sg: user controls for q_at_head, read_value
Add a SG_SET_GET_EXTENDED ioctl control for whether commands will be queued_at_head or queued_at_tail by the block layer (together with the scsi mid-level). It has file scope. Also add a read_value integer the can be used by write a value from the SG_SEIRV_* group then the corresponding value will be returned. Signed-off-by: Douglas Gilbert --- The user can still override the new file scope setting on a a per command basis with the SG_FLAG_Q_AT_HEAD and SG_FLAG_Q_AT_TAIL in the sg v3 and v4 structures. An example of read_value usage is to write the value SG_SEIRV_FL_RQS to the read_value field. Then after the SG_SET_GET_EXTENDED ioctl is run, the number of (inactive) requests currently on this file descriptor's request free list is placed in the read_value field. drivers/scsi/sg.c | 73 +- include/scsi/sg.h | 32 -- include/uapi/scsi/sg.h | 48 --- 3 files changed, 92 insertions(+), 61 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 1a63b0a9279a..31f6c364f60f 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -59,7 +59,7 @@ static int sg_version_num = 30901;/* 2 digits for each component */ #ifdef CONFIG_SCSI_PROC_FS #include -static char *sg_version_date = "20181019"; +static char *sg_version_date = "20181020"; static int sg_proc_init(void); #endif @@ -95,6 +95,10 @@ enum sg_rq_state { #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) +#define SG_FD_Q_AT_TAIL true +#define SG_FD_Q_AT_HEAD false +#define SG_DEFAULT_Q_AT SG_FD_Q_AT_HEAD/* for backward compatibility */ + int sg_big_buff = SG_DEF_RESERVED_SIZE; /* N.B. This variable is readable and writeable via /proc/scsi/sg/def_reserved_size . Each time sg_open() is called a buffer @@ -187,6 +191,7 @@ struct sg_fd { /* holds the state of a file descriptor */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ bool time_in_ns;/* report times in nanoseconds */ + bool q_at_tail; /* queue at tail if true, head when false */ u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ struct sg_request *reserve_srp; /* allocate on open(), starts on fl */ struct fasync_struct *async_qp; /* used by asynchronous notification */ @@ -896,9 +901,13 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr *hi_p, srp->start_ts = ktime_get_with_offset(TK_OFFS_BOOT); else hp->duration = jiffies_to_msecs(jiffies); - /* at tail if v3 or later interface and tail flag set */ - at_head = !(hp->interface_id != '\0' && - (SG_FLAG_Q_AT_TAIL & hp->flags)); + + if (hp->interface_id == '\0') /* v1 and v2 interface */ + at_head = true; /* backward compatibility */ + else if (sfp->q_at_tail) /* cmd flags can override sfd setting */ + at_head = (SG_FLAG_Q_AT_HEAD & hp->flags); + else/* this sfd is defaulting to head */ + at_head = !(SG_FLAG_Q_AT_TAIL & hp->flags); srp->rq->timeout = timeout; kref_get(>f_ref); /* sg_rq_end_io() does kref_put(). */ @@ -1176,11 +1185,12 @@ static int sg_set_get_extended(struct sg_fd *sfp, void __user *p) { int result = 0; - u32 uv; + unsigned long iflags; + u32 uv, or_masks; struct sg_device *sdp = sfp->parentdp; struct sg_extended_info *seip; + struct sg_request *srp; struct sg_extended_info sei; - u32 or_masks; seip = if (!access_ok(VERIFY_READ, p, SZ_SG_EXTENDED_INFO)) @@ -1194,8 +1204,10 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p) } SG_LOG(3, sdp, "%s: wr_mask=0x%x rd_mask=0x%x\n", __func__, seip->valid_wr_mask, seip->valid_rd_mask); + /* reserved_sz (u32), read-write */ if (or_masks & SG_SEIM_RESERVED_SIZE) result = sg_reserved_sz(sfp, seip); + /* rq_rem_sgat_threshold (u32), read-write [impacts re-use only] */ if (or_masks & SG_SEIM_RQ_REM_THRESH) { if (seip->valid_wr_mask & SG_SEIM_RQ_REM_THRESH) { uv = seip->rq_rem_sgat_thresh; @@ -1206,6 +1218,7 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p) if (seip->valid_rd_mask & SG_SEIM_RQ_REM_THRESH) seip->rq_rem_sgat_thresh = sfp->rem_sgat_thresh; } + /* tot_fd_thresh (u32), read-write [sum of active cmd dlen_s] */ if (or_masks & SG_SEIM_TOT_FD_THRESH) { if (seip->valid_wr_mask & SG_S
[PATCH v2 5/8] sg: add free list, rework locking
Remove fixed 16 sg_request object array and replace with an active rq_list plus a request free list. Add finer grained spin lock to sg_request and do a major rework on locking. sg_request objects now are only de-allocated when the owning file descriptor is closed. This simplifies locking issues. Signed-off-by: Douglas Gilbert --- This patch is big and complex. Towards the end the diff program completely loses the plot. Better to use difftool on a two pane window, or simply view the before sg.c and the after sg.c . drivers/scsi/sg.c | 1241 +++-- 1 file changed, 751 insertions(+), 490 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index a76395f16fb1..4a2e9a616604 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -141,46 +141,58 @@ struct sg_scatter_hold { /* holding area for scsi scatter gather info */ struct sg_device; /* forward declarations */ struct sg_fd; -struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ - struct list_head entry; /* list entry */ - struct sg_fd *parentfp; /* NULL -> not in use */ +/* + * For any file descriptor: at any time a sg_request object must be a member + * of sg_fd::rq_list or rq_free_list::rq_free_list. The only exception is + * within a rq_list_lock write lock when it is moving between those two lists. + */ + +struct sg_request {/* active SCSI command or inactive on free list (fl) */ + struct list_head rq_entry; /* member of rq_list (active cmd) */ + struct list_head free_entry;/* member of rq_free_list */ + spinlock_t rq_entry_lck; struct sg_scatter_hold data;/* hold buffer, perhaps scatter list */ union { struct sg_io_hdr header; /* see */ - struct sg_io_v4 hdr_v4; /* see */ + struct sg_v4_hold v4_hold;/* related to */ }; - u8 sense_b[SCSI_SENSE_BUFFERSIZE]; - bool hdr_v4_active; /* selector for anonymous union above */ - bool res_used; /* true -> use reserve buffer, false -> don't */ + ktime_t start_ts; /* used when sg_fd::time_in_ns is true */ + enum sg_rq_state rq_state;/* tracks lifetime of each request */ + bool v4_active; /* selector for autonomous union above */ bool orphan;/* true -> drop on sight, false -> normal */ - bool sg_io_owned; /* true -> packet belongs to SG_IO */ - /* done protected by rq_list_lock */ - char done; /* 0->before bh, 1->before read, 2->read */ + bool sync_invoc;/* true -> synchronous (e.g. from ioctl(SG_IO)) */ + u8 sense_b[SCSI_SENSE_BUFFERSIZE]; + struct sg_fd *parentfp; /* pointer to owning fd, even when on fl */ + struct sg_scatter_hold *d2p; /* optional 2nd data buffer for bidi */ struct request *rq; struct bio *bio; - struct execute_work ew; + struct execute_work ew_orph;/* harvest orphan request */ }; -struct sg_fd { /* holds the state of a file descriptor */ - struct list_head sfd_siblings; /* protected by device's sfd_lock */ +struct sg_fd { /* holds the state of a file descriptor */ + struct list_head sfd_entry; /* member sg_device::sfds list */ struct sg_device *parentdp; /* owning device */ wait_queue_head_t read_wait;/* queue read until command done */ - rwlock_t rq_list_lock; /* protect access to list in req_arr */ struct mutex f_mutex; /* protect against changes in this fd */ + rwlock_t rq_list_lock; /* protect access to sg_request lists */ + struct list_head rq_list; /* head of inflight sg_request list */ + struct list_head rq_free_list; /* head of sg_request free list */ int timeout;/* defaults to SG_DEFAULT_TIMEOUT */ int timeout_user; /* defaults to SG_DEFAULT_TIMEOUT_USER */ - struct sg_scatter_hold reserve; /* one held for this file descriptor */ - struct list_head rq_list; /* head of request list */ - struct fasync_struct *async_qp; /* used by asynchronous notification */ - struct sg_request req_arr[SG_MAX_QUEUE];/* used as singly-linked list */ + int rem_sgat_thresh;/* > this, request's sgat cleared after use */ + int tot_fd_thresh; /* E2BIG if sum_of(dlen) > this, 0: ignore */ + atomic_t sum_fd_dlens; /* when tot_fd_thresh>0 this is sum_of(dlen) */ bool force_packid; /* true -> pack_id input to read() */ bool cmd_q; /* true -> allow command queuing, false -> don't */ - u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ - bool res_in_use;/* true -> 'reserve' array in
[PATCH v2 1/8] sg: types and naming cleanup
Remove typedefs and use better type names like bool and u8 where appropriate. Rename some variables and functions for clarity. Adjust formatting (e.g. function definitions) to be more consistent across the driver. Signed-off-by: Douglas Gilbert --- The intention is to move to sg_version_num 40001 when a second patchset implementing SG_IOSUBMIT and friends is applied. In the meantime, move from the latest version number in the kernel (30536) to 30901 to indicate a significant change but not yet implementing the sg v4 interface. drivers/scsi/sg.c | 827 ++ 1 file changed, 467 insertions(+), 360 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index c6ad00703c5b..78a35e63d177 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -7,7 +7,7 @@ * Original driver (sg.c): *Copyright (C) 1992 Lawrence Foard * Version 2 and 3 extensions to driver: - *Copyright (C) 1998 - 2014 Douglas Gilbert + *Copyright (C) 1998 - 2018 Douglas Gilbert * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -16,16 +16,9 @@ * */ -static int sg_version_num = 30536; /* 2 digits for each component */ -#define SG_VERSION_STR "3.5.36" +static int sg_version_num = 30901; /* 2 digits for each component */ +#define SG_VERSION_STR "3.9.01" -/* - * D. P. Gilbert (dgilb...@interlog.com), notes: - * - scsi logging is available via SCSI_LOG_TIMEOUT macros. First - *the kernel/module needs to be built with CONFIG_SCSI_LOGGING - *(otherwise the macros compile to empty statements). - * - */ #include #include @@ -52,6 +45,8 @@ static int sg_version_num = 30536;/* 2 digits for each component */ #include #include #include /* for sg_check_file_access() */ +#include +#include #include "scsi.h" #include @@ -64,7 +59,7 @@ static int sg_version_num = 30536;/* 2 digits for each component */ #ifdef CONFIG_SCSI_PROC_FS #include -static char *sg_version_date = "20140603"; +static char *sg_version_date = "20181019"; static int sg_proc_init(void); #endif @@ -73,7 +68,8 @@ static int sg_proc_init(void); #define SG_MAX_DEVS 32768 -/* SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type +/* + * SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type * of sg_io_hdr::cmd_len can only represent 255. All SCSI commands greater * than 16 bytes are "variable length" whose length is a multiple of 4 */ @@ -100,44 +96,54 @@ static int sg_add_device(struct device *, struct class_interface *); static void sg_remove_device(struct device *, struct class_interface *); static DEFINE_IDR(sg_index_idr); -static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock - file descriptor list for device */ +static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock fd list for device */ static struct class_interface sg_interface = { .add_dev= sg_add_device, .remove_dev = sg_remove_device, }; -typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */ - unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */ - unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */ - unsigned bufflen; /* Size of (aggregate) data buffer */ - struct page **pages; - int page_order; - char dio_in_use;/* 0->indirect IO (or mmap), 1->dio */ - unsigned char cmd_opcode; /* first byte of command */ -} Sg_scatter_hold; +struct sg_v4_hold {/* parts of sg_io_v4 object needed in async usage */ + __user u8 *usr_ptr; /* derived from sg_io_v4::usr_ptr */ + __user u8 *sbp; /* derived from sg_io_v4::response */ + u16 cmd_len;/* truncated of sg_io_v4::request_len */ + u16 max_sb_len; /* truncated of sg_io_v4::max_response_len */ + u32 flags; /* copy of sg_io_v4::flags */ +}; + +struct sg_scatter_hold { /* holding area for scsi scatter gather info */ + struct page **pages;/* num_sgat element array of struct page* */ + int page_order; /* byte_len = (page_size * (2**page_order)) */ + int dlen; /* Byte length of data buffer */ + unsigned short num_sgat;/* actual number of scatter-gather segments */ + bool dio_in_use;/* false->indirect IO (or mmap), true->dio */ + u8 cmd_opcode; /* first byte of command */ +}; struct sg_device; /* forward declarations */ struct sg_fd; -typedef struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ +struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ struct list_head entry; /* list entry */
Re: [PATCH 2/8] sg: introduce sg_log macro
On 2018-10-19 3:45 a.m., Johannes Thumshirn wrote: On 19/10/18 08:24, Douglas Gilbert wrote: [..] +/* + * Kernel needs to be built with CONFIG_SCSI_LOGGING to see log messages. + * 'depth' is a number between 1 (most severe) and 7 (most noisy, most + * information). All messages are logged as informational (KERN_INFO). In + * the unexpected situation where sdp is NULL the macro reverts to a pr_info + * and ignores CONFIG_SCSI_LOGGING and always prints to the log. + */ +#define SG_LOG(depth, sdp, fmt, a...) \ + do {\ + if (IS_ERR_OR_NULL(sdp)) { \ + pr_info("sg: sdp=NULL_or_ERR, " fmt, ##a);\ + } else {\ + SCSI_LOG_TIMEOUT(depth, sdev_prefix_printk( \ +KERN_INFO, (sdp)->device, \ +(sdp)->disk->disk_name, fmt, \ +##a)); \ + } \ + } while (0) Hi Doug, have you considered using the kernel's dynamic debug infrastructure instead? Hi, I'll follow what the scsi mid-level and the other ULDs do. IOW, no change. The debug messages they produce are quite helpful (to me, I use them a lot, and Tony B. has asked for more precision) and well-tuned to the SCSI subsystem (e.g. telling us what sdp represents in useful terms). And they can be compiled out (but not my pr_info above, probably should be a pr_warn). Doug Gilbert
Re: [PATCH 5/8] sg: add free list, rework locking
On 2018-10-19 11:22 a.m., Bart Van Assche wrote: On Fri, 2018-10-19 at 02:24 -0400, Douglas Gilbert wrote: static void -sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo) +sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo, + int max_num) { struct sg_request *srp; int val; - unsigned int ms; val = 0; - list_for_each_entry(srp, >rq_list, entry) { - if (val >= SG_MAX_QUEUE) - break; - rinfo[val].req_state = srp->done + 1; + list_for_each_entry(srp, >rq_list, rq_entry) { + if (val >= max_num) + return; What protects the sfp->rq_list against concurrent changes? It seems to me like all other code that iterates over or modifies that list protects that list with rq_list_lock? Bart, The function is called from sg_ioctl() case SG_GET_REQUEST_TABLE and at the call point the read_lock is held on rq_list_lock. Maybe I can add a comment above the function about the lock being held. [At least it is as the end of the patch series, and that is all I care about :-)] Doug Gilbert P.S. Best to look at sg.c after each patch is applied, not the !@#$ing stupid patches.
Re: [PATCH 3/8] sg: split header, expand and correct descriptions
On 2018-10-19 4:31 a.m., Johannes Thumshirn wrote: On 19/10/18 08:24, Douglas Gilbert wrote: +/* Alternate style type names, "..._t" variants preferred */ +typedef struct sg_io_hdr Sg_io_hdr; +typedef struct sg_io_vec Sg_io_vec; +typedef struct sg_scsi_id Sg_scsi_id; +typedef struct sg_req_info Sg_req_info; There are no _t variants for the above, or am I missing something? I've expanded the comment to make it clearer I'm referring to the definitions above in that header ***. For example: Referring to typedef struct sg_io_hdr { ... /* the definition of its fields */ } sg_io_hdr_t; So the suggestion is to prefer sg_io_hdr_t to Sg_io_hdr and if you are using C rather that C++ (in the user space) and you have very backward looking conventions then prefer: struct sg_io_hdr Is that clearer? Doug Gilbert *** If the unified diff doesn't show that, then that is one of the many weakness of using unified diffs for code reviews. One of the things I'm trying to clean up is 15 years of "laparoscopic" patches (and diff and patch are fighting back).
Re: [PATCH] sg: remove bad blk_end_request_all() call
On 2018-10-16 10:38 a.m., Jens Axboe wrote: We just need to free the request here. Additionally, this is currently wrong for a queue that's using MQ currently, it'll crash. Surprise removals are difficult code paths to check. That snippet is after the request has been generated and before the call to: blk_execute_rq_nowait() IOWs a small window. Currently if a surprise removal is detected by the completion callback (sg_rq_end_io() ) then a pr_info() is called and it otherwise follows the main path out: free-ing the request, keeping the bio. Is that okay? More generally, once a request is "in flight" and a surprise removal occurs, can the sg driver expect the block layer (or scsi mid-level) to react (and clean-up) or is it up to the sg driver to force those actions? Acked-by: Douglas Gilbert Cc: Doug Gilbert Cc: linux-scsi@vger.kernel.org Signed-off-by: Jens Axboe --- drivers/scsi/sg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 8a254bb46a9b..c6ad00703c5b 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -822,7 +822,7 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp, if (atomic_read(>detaching)) { if (srp->bio) { scsi_req_free_cmd(scsi_req(srp->rq)); - blk_end_request_all(srp->rq, BLK_STS_IOERR); + blk_put_request(srp->rq); srp->rq = NULL; }
Re: [PATCH 2/8] sg: introduce sg_log macro
On 2018-10-24 3:58 a.m., Martin K. Petersen wrote: Hi Doug, I'll follow what the scsi mid-level and the other ULDs do. IOW, no change. The debug messages they produce are quite helpful (to me, I use them a lot, and Tony B. has asked for more precision) and well-tuned to the SCSI subsystem (e.g. telling us what sdp represents in useful terms). And they can be compiled out (but not my pr_info above, probably should be a pr_warn). I agree with Johannes. SCSI logging is in sustaining mode. We're trying to remove it, not to add to it. The kernel has much more capable and flexible methods of getting information out to the user these days. No need to resort to arcane logging masks and the like. Examples please, preferably from the SCSI subsystem. If not, I'll do what most other drivers do, drop all debug statements. Doug Gilbert
[PATCH v3 5/8] sg: add free list, rework locking
Remove fixed 16 sg_request object array and replace with an active rq_list plus a request free list. Add finer grained spin lock to sg_request and do a major rework on locking. sg_request objects now are only de-allocated when the owning file descriptor is closed. This simplifies locking issues. Signed-off-by: Douglas Gilbert --- This patch is big and complex. Towards the end the diff program completely loses the plot. Better to use difftool on a two pane window, or simply view the before sg.c and the after sg.c . The requirement to keep the patch small enough to be reviewable but at the same time both compilable and buildable (i.e. no undefines when kernel is built) are in conflict, especially when the changes touch almost all functions in a driver. IMO you should be able to mark a patch (in the middle of a patchset) as "non compilable". drivers/scsi/sg.c | 1305 - 1 file changed, 805 insertions(+), 500 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index a76395f16fb1..5fbdb0f40739 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -141,46 +141,58 @@ struct sg_scatter_hold { /* holding area for scsi scatter gather info */ struct sg_device; /* forward declarations */ struct sg_fd; -struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ - struct list_head entry; /* list entry */ - struct sg_fd *parentfp; /* NULL -> not in use */ +/* + * For any file descriptor: at any time a sg_request object must be a member + * of sg_fd::rq_list or sg_fd::rq_free_list. The only exception is within a + * rq_list_lock write lock when it is moving between those two lists. + */ + +struct sg_request {/* active SCSI command or inactive on free list (fl) */ + struct list_head rq_entry; /* member of rq_list (active cmd) */ + struct list_head free_entry;/* member of rq_free_list */ + spinlock_t rq_entry_lck; struct sg_scatter_hold data;/* hold buffer, perhaps scatter list */ union { struct sg_io_hdr header; /* see */ - struct sg_io_v4 hdr_v4; /* see */ + struct sg_v4_hold v4_hold;/* related to */ }; - u8 sense_b[SCSI_SENSE_BUFFERSIZE]; - bool hdr_v4_active; /* selector for anonymous union above */ - bool res_used; /* true -> use reserve buffer, false -> don't */ + ktime_t start_ts; /* used when sg_fd::time_in_ns is true */ + enum sg_rq_state rq_state;/* tracks lifetime of each request */ + bool v4_active; /* selector for autonomous union above */ bool orphan;/* true -> drop on sight, false -> normal */ - bool sg_io_owned; /* true -> packet belongs to SG_IO */ - /* done protected by rq_list_lock */ - char done; /* 0->before bh, 1->before read, 2->read */ + bool sync_invoc;/* true -> synchronous (e.g. from ioctl(SG_IO)) */ + u8 sense_b[SCSI_SENSE_BUFFERSIZE]; + struct sg_fd *parentfp; /* pointer to owning fd, even when on fl */ + struct sg_scatter_hold *d2p; /* optional 2nd data buffer for bidi */ struct request *rq; struct bio *bio; - struct execute_work ew; + struct execute_work ew_orph;/* harvest orphan request */ }; -struct sg_fd { /* holds the state of a file descriptor */ - struct list_head sfd_siblings; /* protected by device's sfd_lock */ +struct sg_fd { /* holds the state of a file descriptor */ + struct list_head sfd_entry; /* member sg_device::sfds list */ struct sg_device *parentdp; /* owning device */ wait_queue_head_t read_wait;/* queue read until command done */ - rwlock_t rq_list_lock; /* protect access to list in req_arr */ struct mutex f_mutex; /* protect against changes in this fd */ + rwlock_t rq_list_lock; /* protect access to sg_request lists */ + struct list_head rq_list; /* head of inflight sg_request list */ + struct list_head rq_free_list; /* head of sg_request free list */ int timeout;/* defaults to SG_DEFAULT_TIMEOUT */ int timeout_user; /* defaults to SG_DEFAULT_TIMEOUT_USER */ - struct sg_scatter_hold reserve; /* one held for this file descriptor */ - struct list_head rq_list; /* head of request list */ - struct fasync_struct *async_qp; /* used by asynchronous notification */ - struct sg_request req_arr[SG_MAX_QUEUE];/* used as singly-linked list */ + int rem_sgat_thresh;/* > this, request's sgat cleared after use */ + int tot_fd_thresh; /* E2BIG if sum_of(dlen) > this, 0: ignore */ + atomic_t sum_fd_dlens; /* when tot_fd_thresh>0 this is sum_of(dlen) */ bool force_packid; /* true -> pack_id input to read() */ bool cmd_q; /* true -> allow com
[PATCH v3 8/8] sg: user controls for q_at_head, read_value
Add a SG_SET_GET_EXTENDED ioctl control for whether commands will be queued_at_head or queued_at_tail by the block layer (together with the scsi mid-level). It has file scope. Also add a read_value integer the can be used by write a value from the SG_SEIRV_* group then the corresponding value will be returned. Signed-off-by: Douglas Gilbert --- The user can still override the new file scope setting on a a per command basis with the SG_FLAG_Q_AT_HEAD and SG_FLAG_Q_AT_TAIL in the sg v3 and v4 structures. An example of read_value usage is to write the value SG_SEIRV_FL_RQS to the read_value field. Then after the SG_SET_GET_EXTENDED ioctl is run, the number of (inactive) requests currently on this file descriptor's request free list is placed in the read_value field. Added in v3 is SG_SEIRV_DEV_FL_RQS which is an expansion of SG_SEIRV_FL_RQS. SG_SEIRV_DEV_FL_RQS counts free list entries on all sg file descriptors currently open on the device that the file descriptor (given to ioctl()) is associated with. drivers/scsi/sg.c | 160 +++-- include/scsi/sg.h | 32 ++--- include/uapi/scsi/sg.h | 49 ++--- 3 files changed, 150 insertions(+), 91 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 3e89bbd508de..4d6966d40949 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -59,7 +59,7 @@ static int sg_version_num = 30901;/* 2 digits for each component */ #ifdef CONFIG_SCSI_PROC_FS #include -static char *sg_version_date = "20181019"; +static char *sg_version_date = "20181024"; static int sg_proc_init(void); #endif @@ -95,6 +95,10 @@ enum sg_rq_state { #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) +#define SG_FD_Q_AT_TAIL true +#define SG_FD_Q_AT_HEAD false +#define SG_DEFAULT_Q_AT SG_FD_Q_AT_HEAD/* for backward compatibility */ + int sg_big_buff = SG_DEF_RESERVED_SIZE; /* N.B. This variable is readable and writeable via /proc/scsi/sg/def_reserved_size . Each time sg_open() is called a buffer @@ -187,6 +191,7 @@ struct sg_fd { /* holds the state of a file descriptor */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ bool time_in_ns;/* report times in nanoseconds */ + bool q_at_tail; /* queue at tail if true, head when false */ u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ struct sg_request *reserve_srp; /* allocate on open(), starts on fl */ struct fasync_struct *async_qp; /* used by asynchronous notification */ @@ -238,7 +243,7 @@ static struct sg_request *sg_add_request(struct sg_fd *sfp, int dxfr_len, static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp); static struct sg_device *sg_get_dev(int min_dev); static void sg_device_destroy(struct kref *kref); -static const char *sg_rq_state_str(u8 rq_state, bool long_str); +static const char *sg_rq_state_str(enum sg_rq_state rq_state, bool long_str); static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first, rwlock_t *rwlp, unsigned long *iflagsp); @@ -855,7 +860,7 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr *hi_p, if (h4p || !hi_p) return ERR_PTR(-EOPNOTSUPP); - srp = sg_add_request(sfp, hi_p->dxfer_len, false); + srp = sg_add_request(sfp, hi_p->dxfer_len, sync); if (IS_ERR(srp)) return srp; srp->header = *hi_p;/* structure assignment, could memcpy */ @@ -897,9 +902,13 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr *hi_p, srp->start_ts = ktime_get_with_offset(TK_OFFS_BOOT); else hp->duration = jiffies_to_msecs(jiffies); - /* at tail if v3 or later interface and tail flag set */ - at_head = !(hp->interface_id != '\0' && - (SG_FLAG_Q_AT_TAIL & hp->flags)); + + if (hp->interface_id == '\0') /* v1 and v2 interface */ + at_head = true; /* backward compatibility */ + else if (sfp->q_at_tail) /* cmd flags can override sfd setting */ + at_head = (hp->flags & SG_FLAG_Q_AT_HEAD); + else/* this sfd is defaulting to head */ + at_head = !(hp->flags & SG_FLAG_Q_AT_TAIL); srp->rq->timeout = timeout; kref_get(>f_ref); /* sg_rq_end_io() does kref_put(). */ @@ -1084,30 +1093,30 @@ sg_reserved_sz(struct sg_fd *sfp, struct sg_extended_info *seip) { bool free_n_srp = false; int result = 0; - int val, mx_sect_bytes; + int new_sz, mx_sect_bytes; unsigned long iflags; - struct sg_request *srp; /* prior reserve request */
[PATCH v3 6/8] sg: complete locking changes on ioctl+debug
Complete the locking and structure changes of ioctl and debug ('cat /proc/scsi/sg/debug') handling. Signed-off-by: Douglas Gilbert --- This was the code that was "#if 0'-ed out 2 patches ago. It also shuts checkpatch.pl up as it doesn't like that technique but offers no viable substitute. drivers/scsi/sg.c | 217 +- 1 file changed, 136 insertions(+), 81 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 5fbdb0f40739..8b4a65340971 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -239,6 +239,7 @@ static struct sg_request *sg_add_request(struct sg_fd *sfp, int dxfr_len, static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp); static struct sg_device *sg_get_dev(int min_dev); static void sg_device_destroy(struct kref *kref); +static const char *sg_rq_state_str(u8 rq_state, bool long_str); #define SZ_SG_HEADER sizeof(struct sg_header) /* v1 and v2 header */ #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr) /* v3 header */ @@ -359,7 +360,7 @@ sg_open(struct inode *inode, struct file *filp) nonseekable_open(inode, filp); if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE))) - return -EPERM; /* Can't lock it with read only access */ + return -EPERM;/* not permitted, need write access for O_EXCL */ sdp = sg_get_dev(min_dev); if (IS_ERR(sdp)) return PTR_ERR(sdp); @@ -931,11 +932,6 @@ sg_ktime_sub_trunc(ktime_t now_ts, ktime_t ts0) return 0; } -/* - * Annotation under function arguments (i.e. '__must_hold...') states that - * this function expects that lock to be held, a read lock is sufficient in - * this case. - */ static void sg_fill_request_element(struct sg_fd *sfp, struct sg_request *srp, struct sg_req_info *rip) @@ -982,7 +978,6 @@ sg_fill_request_element(struct sg_fd *sfp, struct sg_request *srp, * Populate up to max_num struct sg_req_info objects, first from the active * list then, if there is still room, from the free list. All elements in * the free list should have SG_RQ_INACTIVE status. - * See sg_fill_request_element() for note about __must_hold annotation. */ static void sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo, @@ -1031,7 +1026,6 @@ srp_state_or_detaching(struct sg_device *sdp, struct sg_request *srp) return ret; } -#if 0 /* temporary to shorten big patch */ static long sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) { @@ -1063,24 +1057,37 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) return -ENXIO; if (!access_ok(VERIFY_WRITE, p, SZ_SG_IO_HDR)) return -EFAULT; - result = sg_new_write(sfp, filp, p, SZ_SG_IO_HDR, -1, read_only, 1, ); + result = sg_v3_write(sfp, filp, p, SZ_SG_IO_HDR, read_only, +true, ); if (result < 0) return result; result = wait_event_interruptible(sfp->read_wait, - (srp_done(sfp, srp) || atomic_read(>detaching))); - if (atomic_read(>detaching)) + srp_state_or_detaching(sdp, srp)); + + spin_lock_irqsave(>rq_entry_lck, iflags); + if (unlikely(result)) { /* -ERESTARTSYS because signal hit */ + srp->orphan = true; + srp->rq_state = SG_RQ_INFLIGHT; + spin_unlock_irqrestore(>rq_entry_lck, iflags); + SG_LOG(1, sdp, "%s: wait_event_interruptible-->%d\n", + __func__, result); + return result; + } + if (unlikely(atomic_read(>detaching))) { + srp->rq_state = SG_RQ_INACTIVE; + spin_unlock_irqrestore(>rq_entry_lck, iflags); return -ENODEV; - write_lock_irq(>rq_list_lock); - if (srp->done) { - srp->done = 2; - write_unlock_irq(>rq_list_lock); + } else if (likely(srp->rq_state == SG_RQ_AWAIT_READ)) { + srp->rq_state = SG_RQ_DONE_READ; + spin_unlock_irqrestore(>rq_entry_lck, iflags); result = sg_new_read(sfp, p, SZ_SG_IO_HDR, srp); return (result < 0) ? result : 0; } - srp->orphan = true; - write_unlock_irq(>rq_list_lock); - return result; /* -ERESTARTSYS because signal hit process */ + cp = sg_rq_state_str(srp->rq_state, true); + SG_LOG(1, sdp, "%s: unexpected
[PATCH v3 3/8] sg: split header, expand and correct descriptions
Split scsi/sg.h into a smaller scsi/sg.h which includes a new header: uapi/scsi/sg.h . Overall expand the twin header files with new functionality in this patchset and functionality to be added in the next patchset to implement SG_IOSUBMIT and friends. Adjust format to modern kernel conventions. Correct and expand some comments. Signed-off-by: Douglas Gilbert --- The new header file: uapi/scsi/sg.h is expected to be in the kernel's public interface. This takes time (i.e. months or years) to find its way into glibc and Linux distributions. So the sooner it is presented and accepted the better. >From the C perspective, nothing is removed or changed (or shouldn't be), only additions. This means that the original typedefs stay (but they are not used in the driver source file: sg.c) since user space programs may be using them. include/scsi/sg.h | 252 ++--- include/uapi/scsi/sg.h | 419 + 2 files changed, 430 insertions(+), 241 deletions(-) create mode 100644 include/uapi/scsi/sg.h diff --git a/include/scsi/sg.h b/include/scsi/sg.h index f91bcca604e4..596f68746f66 100644 --- a/include/scsi/sg.h +++ b/include/scsi/sg.h @@ -7,23 +7,23 @@ /* * History: * Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user - * process control of SCSI devices. + * process control of SCSI devices. * Development Sponsored by Killy Corp. NY NY * * Original driver (sg.h): - * Copyright (C) 1992 Lawrence Foard + * Copyright (C) 1992 Lawrence Foard * Version 2 and 3 extensions to driver: - * Copyright (C) 1998 - 2014 Douglas Gilbert + * Copyright (C) 1998 - 2018 Douglas Gilbert * - * Version: 3.5.36 (20140603) - * This version is for 2.6 and 3 series kernels. + * Version: 3.9.01 (20181016) + * This version is for 2.6, 3 and 4 series kernels. * * Documentation * = * A web site for the SG device driver can be found at: - * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] + * http://sg.danny.cz/sg [alternatively check the MAINTAINERS file] * The documentation for the sg version 3 driver can be found at: - * http://sg.danny.cz/sg/p/sg_v3_ho.html + * http://sg.danny.cz/sg/p/sg_v3_ho.html * Also see: /Documentation/scsi/scsi-generic.txt * * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html @@ -33,242 +33,12 @@ extern int sg_big_buff; /* for sysctl */ #endif +#include -typedef struct sg_iovec /* same structure as used by readv() Linux system */ -{ /* call. It defines one scatter-gather element. */ -void __user *iov_base; /* Starting address */ -size_t iov_len; /* Length in bytes */ -} sg_iovec_t; - - -typedef struct sg_io_hdr -{ -int interface_id; /* [i] 'S' for SCSI generic (required) */ -int dxfer_direction;/* [i] data transfer direction */ -unsigned char cmd_len; /* [i] SCSI command length */ -unsigned char mx_sb_len;/* [i] max length to write to sbp */ -unsigned short iovec_count; /* [i] 0 implies no scatter gather */ -unsigned int dxfer_len; /* [i] byte count of data transfer */ -void __user *dxferp; /* [i], [*io] points to data transfer memory - or scatter gather list */ -unsigned char __user *cmdp; /* [i], [*i] points to command to perform */ -void __user *sbp; /* [i], [*o] points to sense_buffer memory */ -unsigned int timeout; /* [i] MAX_UINT->no timeout (unit: millisec) */ -unsigned int flags; /* [i] 0 -> default, see SG_FLAG... */ -int pack_id;/* [i->o] unused internally (normally) */ -void __user * usr_ptr; /* [i->o] unused internally */ -unsigned char status; /* [o] scsi status */ -unsigned char masked_status;/* [o] shifted, masked scsi status */ -unsigned char msg_status; /* [o] messaging level data (optional) */ -unsigned char sb_len_wr;/* [o] byte count actually written to sbp */ -unsigned short host_status; /* [o] errors from host adapter */ -unsigned short driver_status;/* [o] errors from software driver */ -int resid; /* [o] dxfer_len - actual_transferred */ -unsigned int duration; /* [o] time taken by cmd (unit: millisec) */ -unsigned int info; /* [o] auxiliary information */ -} sg_io_hdr_t; /* 64 bytes long (on i386) */ - -#define SG_INTERFACE_ID_ORIG 'S' - -/* Use negative values to flag difference from original sg_header structure */ -#define SG_DXFER_NONE (-1) /* e.g. a SCSI Test Unit Ready command */ -#define SG_DXFER_TO_DEV (-2)/* e.g. a SCSI WRITE command */ -#define SG_DXFER_FROM_DEV (-3) /* e.g. a SCSI READ command */ -#define SG_DXFER_TO_FROM_DEV (-4) /* treated like SG_DXFER_FROM_DEV with the - additional property than
[PATCH v3 4/8] sg: expand request states
Introduce the new sg_rq_state enumerations for tracking the lifetime of a sg_request. SG_RQ_DATA_THRESHOLD is a default value that if the data length of a request exceeds then, after that request is completed, the data buffer will be freed up as the sg_request object is placed on the free list. SG_TOT_FD_THRESHOLD is a default, per file descriptor value that the sum of outstanding command data lengths is not allowed to exceed. Signed-off-by: Douglas Gilbert --- The '#if 0's are temporary and removed in a later patch. They allow the following large and complex patch (5 of 8) to be a bit shorter and still compile. drivers/scsi/sg.c | 29 + 1 file changed, 29 insertions(+) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 94e13a1d21a5..a76395f16fb1 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -75,6 +75,24 @@ static int sg_proc_init(void); */ #define SG_MAX_CDB_SIZE 252 +/* Following enum contains the states of sg_request::rq_state */ +enum sg_rq_state { + SG_RQ_INACTIVE = 0, /* request not in use (e.g. on fl) */ + SG_RQ_INFLIGHT, /* SCSI request issued, no response yet */ + SG_RQ_AWAIT_READ, /* response received, awaiting read */ + SG_RQ_DONE_READ,/* read is ongoing or done */ + SG_RQ_BUSY, /* example: reserve request changing size */ +}; + +/* free up requests larger than this dlen size after use */ +#define SG_RQ_DATA_THRESHOLD (128 * 1024) + +/* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */ +#define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024) + +#define SG_TIME_UNIT_MS 0 /* milliseconds */ +#define SG_TIME_UNIT_NS 1 /* nanoseconds */ +#define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) int sg_big_buff = SG_DEF_RESERVED_SIZE; @@ -950,6 +968,7 @@ sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo) } } +#if 0 /* temporary to shorten big patch */ static long sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) { @@ -1227,6 +1246,7 @@ sg_compat_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg) return -ENOIOCTLCMD; } #endif +#endif /* temporary to shorten big patch */ static __poll_t sg_poll(struct file *filp, poll_table * wait) @@ -1496,10 +1516,12 @@ static const struct file_operations sg_fops = { .read = sg_read, .write = sg_write, .poll = sg_poll, +#if 0 /* temporary to shorten big patch */ .unlocked_ioctl = sg_ioctl, #ifdef CONFIG_COMPAT .compat_ioctl = sg_compat_ioctl, #endif +#endif /* temporary to shorten big patch */ .open = sg_open, .mmap = sg_mmap, .release = sg_release, @@ -2422,12 +2444,16 @@ static const struct seq_operations devstrs_seq_ops = { .show = sg_proc_seq_show_devstrs, }; +#if 0 /* temporary to shorten big patch */ static int sg_proc_seq_show_debug(struct seq_file *s, void *v); +#endif /* temporary to shorten big patch */ static const struct seq_operations debug_seq_ops = { .start = dev_seq_start, .next = dev_seq_next, .stop = dev_seq_stop, +#if 0 /* temporary to shorten big patch */ .show = sg_proc_seq_show_debug, +#endif /* temporary to shorten big patch */ }; static int @@ -2601,6 +2627,8 @@ sg_proc_seq_show_devstrs(struct seq_file *s, void *v) return 0; } +#if 0 /* temporary to shorten big patch */ + /* must be called while holding sg_index_lock */ static void sg_proc_debug_helper(struct seq_file *s, struct sg_device *sdp) @@ -2704,6 +2732,7 @@ sg_proc_seq_show_debug(struct seq_file *s, void *v) read_unlock_irqrestore(_index_lock, iflags); return 0; } +#endif /* temporary to shorten big patch */ #endif /* CONFIG_SCSI_PROC_FS */ -- 2.17.1
[PATCH v3 7/8] sg: rework ioctl handling
Rework ioctl handling, report clearly to the log which ioctl has been invoked. Add a new "IOWR" ioctl: SG_SET_GET_EXTENDED which permits several integer and boolean values to be "_SET_" (i.e. passed into the driver, potentially changing its actions) and/or read from the driver (the "_GET_" part) in a single operation. Signed-off-by: Douglas Gilbert --- One feature of the new SG_SET_GET_EXTENDED ioctl is ability to fetch the sg device minor number (e.g. the "3" in /dev/sg3) associated with the current file descriptor. A boolean addition is the ability to change command timekeeping on the current file descriptor from units of milliseconds (the default) to nanoseconds. drivers/scsi/sg.c | 535 +++--- 1 file changed, 409 insertions(+), 126 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 8b4a65340971..3e89bbd508de 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -90,8 +90,8 @@ enum sg_rq_state { /* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */ #define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024) -#define SG_TIME_UNIT_MS 0 /* milliseconds */ -#define SG_TIME_UNIT_NS 1 /* nanoseconds */ +#define SG_TIME_UNIT_MS 0 /* milliseconds */ +#define SG_TIME_UNIT_NS 1 /* nanoseconds */ #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) @@ -186,7 +186,6 @@ struct sg_fd { /* holds the state of a file descriptor */ bool cmd_q; /* true -> allow command queuing, false -> don't */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ - bool sse_seen; /* SG_SET_EXTENDED ioctl seen */ bool time_in_ns;/* report times in nanoseconds */ u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ struct sg_request *reserve_srp; /* allocate on open(), starts on fl */ @@ -240,6 +239,8 @@ static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp); static struct sg_device *sg_get_dev(int min_dev); static void sg_device_destroy(struct kref *kref); static const char *sg_rq_state_str(u8 rq_state, bool long_str); +static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first, + rwlock_t *rwlp, unsigned long *iflagsp); #define SZ_SG_HEADER sizeof(struct sg_header) /* v1 and v2 header */ #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr) /* v3 header */ @@ -508,9 +509,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; + SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (!access_ok(VERIFY_WRITE, buf, count)) return -EFAULT; @@ -815,11 +816,11 @@ sg_v3_write(struct sg_fd *sfp, struct file *file, const char __user *buf, return -ENOSYS; if (hp->flags & SG_FLAG_MMAP_IO) { if (!list_empty(>rq_list)) - return -EBUSY; /* already active requests on fd */ + return -EBUSY; /* already active requests on fd */ if (hp->dxfer_len > sfp->reserve_srp->data.dlen) - return -ENOMEM; /* MMAP_IO size must fit in reserve */ + return -ENOMEM; /* MMAP_IO size must fit in reserve */ if (hp->flags & SG_FLAG_DIRECT_IO) - return -EINVAL; /* not both MMAP_IO and DIRECT_IO */ + return -EINVAL; /* not both MMAP_IO and DIRECT_IO */ } sfp->cmd_q = true; /* when sg_io_hdr seen, set command queuing on */ ul_timeout = msecs_to_jiffies(hp->timeout); @@ -857,7 +858,7 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr *hi_p, srp = sg_add_request(sfp, hi_p->dxfer_len, false); if (IS_ERR(srp)) return srp; - srp->header = *hi_p;/* structure assignment, could memcpy */ + srp->header = *hi_p;/* structure assignment, could memcpy */ hp = >header; srp->data.cmd_opcode = cmnd[0]; /* hold opcode of command */ hp->status = 0; @@ -1026,69 +1027,300 @@ srp_state_or_detaching(struct sg_device *sdp, struct sg_request *srp) return ret; } +/* For handling ioctl(SG_IO). Returns 0 on success else a negated errno */ +static int +sg_sg_io(struct file *filp, struct sg_device *sdp, struct sg_fd *sfp, +void __user *p) +{ + bool read_only = (O_RDWR != (filp->f_flags & O_ACCM
[PATCH v3 2/8] sg: introduce sg_log macro
Introduce the SG_LOG macro to replace long-winded 'SCSI_LOG_TIMEOUT(3, sg_printk ...' debug messages. Use __func__ wherever appropriate to make the debug messages more portable. Signed-off-by: Douglas Gilbert --- Reviewers want SCSI_LOG_* style logging replaced. But with what? One suggestion is the trace subsystem but that seems data and event oriented whereas the current logging is call based with extra messages when error paths are taken. There are many debugging hours of work crafted into the current SCSI_LOG_* messages that should not be thrown out for some pretty solution without additional benefits. drivers/scsi/sg.c | 162 +- 1 file changed, 72 insertions(+), 90 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 78a35e63d177..94e13a1d21a5 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -216,9 +216,24 @@ static void sg_device_destroy(struct kref *kref); /* #define SZ_SG_IOVEC sizeof(struct sg_iovec) synonym for 'struct iovec' */ #define SZ_SG_REQ_INFO sizeof(struct sg_req_info) -#define sg_printk(prefix, sdp, fmt, a...) \ - sdev_prefix_printk(prefix, (sdp)->device, \ - (sdp)->disk->disk_name, fmt, ##a) +/* + * Kernel needs to be built with CONFIG_SCSI_LOGGING to see log messages. + * 'depth' is a number between 1 (most severe) and 7 (most noisy, most + * information). All messages are logged as informational (KERN_INFO). In + * the unexpected situation where sdp is NULL the macro reverts to a pr_info + * and ignores CONFIG_SCSI_LOGGING and always prints to the log. + */ +#define SG_LOG(depth, sdp, fmt, a...) \ + do {\ + if (IS_ERR_OR_NULL(sdp)) { \ + pr_info("sg: sdp=NULL_or_ERR, " fmt, ##a); \ + } else {\ + SCSI_LOG_TIMEOUT(depth, sdev_prefix_printk( \ +KERN_INFO, (sdp)->device, \ +(sdp)->disk->disk_name, fmt, \ +##a)); \ + } \ + } while (0) /* * The SCSI interfaces that use read() and write() as an asynchronous variant of @@ -316,9 +331,8 @@ sg_open(struct inode *inode, struct file *filp) sdp = sg_get_dev(min_dev); if (IS_ERR(sdp)) return PTR_ERR(sdp); - - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, - "sg_open: flags=0x%x\n", flags)); + SG_LOG(3, sdp, "%s: flags=0x%x; device open count prior=%d\n", + __func__, flags, sdp->open_cnt); /* This driver's module count bumped by fops_get in */ /* Prevent the device driver from vanishing while we sleep */ @@ -414,9 +428,10 @@ sg_release(struct inode *inode, struct file *filp) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "sg_release\n")); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; + SG_LOG(3, sdp, "%s: device open count prior=%d\n", __func__, + sdp->open_cnt); mutex_lock(>open_rel_lock); scsi_autopm_put_device(sdp->device); @@ -462,8 +477,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos) sdp = sfp->parentdp; if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n", -__func__, (int)count)); + SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count); if (!access_ok(VERIFY_WRITE, buf, count)) return -EFAULT; @@ -663,10 +677,9 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos) return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO; } sdp = sfp->parentdp; + SG_LOG(3, sdp, "%s: write(3rd arg) count=%d\n", __func__, (int)count); if (IS_ERR_OR_NULL(sdp)) return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO; - SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n", -__func__, (int)count)); if (atomic_read(>detaching)) return -ENODEV; if (!((filp->f_flags & O_NONBLOCK) || @@ -687,8 +700,7 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos) return -EIO;/* minimum scsi command length is 6 bytes */ if (!(srp = sg_add_request(s
[PATCH v3 1/8] sg: types and naming cleanup
Remove typedefs and use better type names like bool and u8 where appropriate. Rename some variables and functions for clarity. Adjust formatting (e.g. function definitions) to be more consistent across the driver. Signed-off-by: Douglas Gilbert --- The intention is to move to sg_version_num 40001 when a second patchset implementing SG_IOSUBMIT and friends is applied. In the meantime, move from the latest version number in the kernel (30536) to 30901 to indicate a significant change but not yet implementing the sg v4 interface. drivers/scsi/sg.c | 827 ++ 1 file changed, 467 insertions(+), 360 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index c6ad00703c5b..78a35e63d177 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -7,7 +7,7 @@ * Original driver (sg.c): *Copyright (C) 1992 Lawrence Foard * Version 2 and 3 extensions to driver: - *Copyright (C) 1998 - 2014 Douglas Gilbert + *Copyright (C) 1998 - 2018 Douglas Gilbert * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -16,16 +16,9 @@ * */ -static int sg_version_num = 30536; /* 2 digits for each component */ -#define SG_VERSION_STR "3.5.36" +static int sg_version_num = 30901; /* 2 digits for each component */ +#define SG_VERSION_STR "3.9.01" -/* - * D. P. Gilbert (dgilb...@interlog.com), notes: - * - scsi logging is available via SCSI_LOG_TIMEOUT macros. First - *the kernel/module needs to be built with CONFIG_SCSI_LOGGING - *(otherwise the macros compile to empty statements). - * - */ #include #include @@ -52,6 +45,8 @@ static int sg_version_num = 30536;/* 2 digits for each component */ #include #include #include /* for sg_check_file_access() */ +#include +#include #include "scsi.h" #include @@ -64,7 +59,7 @@ static int sg_version_num = 30536;/* 2 digits for each component */ #ifdef CONFIG_SCSI_PROC_FS #include -static char *sg_version_date = "20140603"; +static char *sg_version_date = "20181019"; static int sg_proc_init(void); #endif @@ -73,7 +68,8 @@ static int sg_proc_init(void); #define SG_MAX_DEVS 32768 -/* SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type +/* + * SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type * of sg_io_hdr::cmd_len can only represent 255. All SCSI commands greater * than 16 bytes are "variable length" whose length is a multiple of 4 */ @@ -100,44 +96,54 @@ static int sg_add_device(struct device *, struct class_interface *); static void sg_remove_device(struct device *, struct class_interface *); static DEFINE_IDR(sg_index_idr); -static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock - file descriptor list for device */ +static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock fd list for device */ static struct class_interface sg_interface = { .add_dev= sg_add_device, .remove_dev = sg_remove_device, }; -typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */ - unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */ - unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */ - unsigned bufflen; /* Size of (aggregate) data buffer */ - struct page **pages; - int page_order; - char dio_in_use;/* 0->indirect IO (or mmap), 1->dio */ - unsigned char cmd_opcode; /* first byte of command */ -} Sg_scatter_hold; +struct sg_v4_hold {/* parts of sg_io_v4 object needed in async usage */ + __user u8 *usr_ptr; /* derived from sg_io_v4::usr_ptr */ + __user u8 *sbp; /* derived from sg_io_v4::response */ + u16 cmd_len;/* truncated of sg_io_v4::request_len */ + u16 max_sb_len; /* truncated of sg_io_v4::max_response_len */ + u32 flags; /* copy of sg_io_v4::flags */ +}; + +struct sg_scatter_hold { /* holding area for scsi scatter gather info */ + struct page **pages;/* num_sgat element array of struct page* */ + int page_order; /* byte_len = (page_size * (2**page_order)) */ + int dlen; /* Byte length of data buffer */ + unsigned short num_sgat;/* actual number of scatter-gather segments */ + bool dio_in_use;/* false->indirect IO (or mmap), true->dio */ + u8 cmd_opcode; /* first byte of command */ +}; struct sg_device; /* forward declarations */ struct sg_fd; -typedef struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ +struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */ struct list_head entry; /* list entry */
[PATCH v3 0/8] sg: major cleanup, remove max_queue limit
The intention is to add two new ioctls as proposed by Linus Torvalds: SG_IOSUBMIT and SG_IORECEIVE to replace the write()/read() async interface. But first, clean up the driver and remove the SG_MAX_QUEUE limit of no more than 16 queued commands on a file descriptor at a time. A free list has been added and the de-allocation of sg_request objects is deferred until the close() of a file. Locking is extensively reworked, especially at the struct sg_fd and sg_request level. A new SG_SET_GET_EXTENDED ioctl has been added. It allows multiple integer values and booleans to be written to and/or read from the driver. An example is changing and/or reading the reserved request data length (there is one of these per fd). An example of a new feature is changing and/or reading the per-fd upper limit on the sum of outstanding data buffer sizes (default is 16 MB). An example of a boolean is a bit to do all following command timekeeping in nanoseconds rather that the default millseconds. A later patchset will add implementations for the SG_IOSUBMIT and SG_IORECEIVE plus handling of the sg v4 interface with the existing SG_IO ioctl. This patchset is against Martin Petersen's 4.20/scsi-queue branch. ToDo: - work out more modern technique for logging function invocations info with some error paths. Changes since v2: - fix problem with reserve request not being placed on fl - fix locking problem with sg_reserved_sz() - fix sum_fd_dlens counting problem - change new ioctl base numbers back to 0x22 - arrange fl in ascending dlen order apart from dlen=0 entries at tail - SG_GET_REQUEST_TABLE ioctl first fill from active list then, if there is room, it fills from the free list - add SG_SEIRV_DEV_FL_RQS to fetch count of all free list entries on the device the fd is associated with Changes since v1: - remove redundant casts from private_data field - introduce atomic to address locking problems around sg_fd::sum_fd_dlens - replace rq_state defines with an enumeration - add __must_hold() annotation to sg_fill_request_table() - fix compile/build problem around the 4th and 5th patches - add read_value[SG_SEIRV_*] options in SG_SET_GET_EXTENDED ioctl and increase structure size from 64 to 96 bytes Douglas Gilbert (8): sg: types and naming cleanup sg: introduce sg_log macro sg: split header, expand and correct descriptions sg: expand request states sg: add free list, rework locking sg: complete locking changes on ioctl+debug sg: rework ioctl handling sg: user controls for q_at_head, read_value drivers/scsi/sg.c | 2621 ++-- include/scsi/sg.h | 268 +--- include/uapi/scsi/sg.h | 414 +++ 3 files changed, 2156 insertions(+), 1147 deletions(-) create mode 100644 include/uapi/scsi/sg.h -- 2.17.1
Re: [PATCH v3 0/8] sg: major cleanup, remove max_queue limit
For anyone thinking of testing this patchset, it also applies clean to lk 4.19 release. Doug Gilbert On 2018-10-26 12:48 p.m., Douglas Gilbert wrote: The intention is to add two new ioctls as proposed by Linus Torvalds: SG_IOSUBMIT and SG_IORECEIVE to replace the write()/read() async interface. But first, clean up the driver and remove the SG_MAX_QUEUE limit of no more than 16 queued commands on a file descriptor at a time. A free list has been added and the de-allocation of sg_request objects is deferred until the close() of a file. Locking is extensively reworked, especially at the struct sg_fd and sg_request level. A new SG_SET_GET_EXTENDED ioctl has been added. It allows multiple integer values and booleans to be written to and/or read from the driver. An example is changing and/or reading the reserved request data length (there is one of these per fd). An example of a new feature is changing and/or reading the per-fd upper limit on the sum of outstanding data buffer sizes (default is 16 MB). An example of a boolean is a bit to do all following command timekeeping in nanoseconds rather that the default millseconds. A later patchset will add implementations for the SG_IOSUBMIT and SG_IORECEIVE plus handling of the sg v4 interface with the existing SG_IO ioctl. This patchset is against Martin Petersen's 4.20/scsi-queue branch. ToDo: - work out more modern technique for logging function invocations info with some error paths. Changes since v2: - fix problem with reserve request not being placed on fl - fix locking problem with sg_reserved_sz() - fix sum_fd_dlens counting problem - change new ioctl base numbers back to 0x22 - arrange fl in ascending dlen order apart from dlen=0 entries at tail - SG_GET_REQUEST_TABLE ioctl first fill from active list then, if there is room, it fills from the free list - add SG_SEIRV_DEV_FL_RQS to fetch count of all free list entries on the device the fd is associated with Changes since v1: - remove redundant casts from private_data field - introduce atomic to address locking problems around sg_fd::sum_fd_dlens - replace rq_state defines with an enumeration - add __must_hold() annotation to sg_fill_request_table() - fix compile/build problem around the 4th and 5th patches - add read_value[SG_SEIRV_*] options in SG_SET_GET_EXTENDED ioctl and increase structure size from 64 to 96 bytes Douglas Gilbert (8): sg: types and naming cleanup sg: introduce sg_log macro sg: split header, expand and correct descriptions sg: expand request states sg: add free list, rework locking sg: complete locking changes on ioctl+debug sg: rework ioctl handling sg: user controls for q_at_head, read_value drivers/scsi/sg.c | 2621 ++-- include/scsi/sg.h | 268 +--- include/uapi/scsi/sg.h | 414 +++ 3 files changed, 2156 insertions(+), 1147 deletions(-) create mode 100644 include/uapi/scsi/sg.h
[PATCH v3.5 8/8] sg: user controls for q_at_head, read_value
Add a SG_SET_GET_EXTENDED ioctl control for whether commands will be queued_at_head or queued_at_tail by the block layer (together with the scsi mid-level). It has file scope. Also add a read_value integer the can be used by write a value from the SG_SEIRV_* group then the corresponding value will be returned. Signed-off-by: Douglas Gilbert --- The user can still override the new file scope setting on a a per command basis with the SG_FLAG_Q_AT_HEAD and SG_FLAG_Q_AT_TAIL in the sg v3 and v4 structures. An example of read_value usage is to write the value SG_SEIRV_FL_RQS to the read_value field. Then after the SG_SET_GET_EXTENDED ioctl is run, the number of (inactive) requests currently on this file descriptor's request free list is placed in the read_value field. Added in v3 is SG_SEIRV_DEV_FL_RQS which is an expansion of SG_SEIRV_FL_RQS. SG_SEIRV_DEV_FL_RQS counts free list entries on all sg file descriptors currently open on the device that the file descriptor (given to ioctl()) is associated with. Change since v3 [patches 1 to 7 of 8 are unchanged, this for 8/8]: - move sg_version_date definition from inside CONFIG_SCSI_PROC_FS conditional out to full file scope and near related definitions drivers/scsi/sg.c | 160 +++-- include/scsi/sg.h | 32 ++--- include/uapi/scsi/sg.h | 49 ++--- 3 files changed, 150 insertions(+), 91 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 3e89bbd508de..5ba92f112481 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -18,6 +18,7 @@ static int sg_version_num = 30901; /* 2 digits for each component */ #define SG_VERSION_STR "3.9.01" +static char *sg_version_date = "20181024"; #include @@ -59,7 +60,6 @@ static int sg_version_num = 30901;/* 2 digits for each component */ #ifdef CONFIG_SCSI_PROC_FS #include -static char *sg_version_date = "20181019"; static int sg_proc_init(void); #endif @@ -95,6 +95,10 @@ enum sg_rq_state { #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ) +#define SG_FD_Q_AT_TAIL true +#define SG_FD_Q_AT_HEAD false +#define SG_DEFAULT_Q_AT SG_FD_Q_AT_HEAD/* for backward compatibility */ + int sg_big_buff = SG_DEF_RESERVED_SIZE; /* N.B. This variable is readable and writeable via /proc/scsi/sg/def_reserved_size . Each time sg_open() is called a buffer @@ -187,6 +191,7 @@ struct sg_fd { /* holds the state of a file descriptor */ bool keep_orphan;/* false -> drop (def), true -> keep for read() */ bool mmap_called; /* false -> mmap() never called on this fd */ bool time_in_ns;/* report times in nanoseconds */ + bool q_at_tail; /* queue at tail if true, head when false */ u8 next_cmd_len;/* 0: automatic, >0: use on next write() */ struct sg_request *reserve_srp; /* allocate on open(), starts on fl */ struct fasync_struct *async_qp; /* used by asynchronous notification */ @@ -238,7 +243,7 @@ static struct sg_request *sg_add_request(struct sg_fd *sfp, int dxfr_len, static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp); static struct sg_device *sg_get_dev(int min_dev); static void sg_device_destroy(struct kref *kref); -static const char *sg_rq_state_str(u8 rq_state, bool long_str); +static const char *sg_rq_state_str(enum sg_rq_state rq_state, bool long_str); static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first, rwlock_t *rwlp, unsigned long *iflagsp); @@ -855,7 +860,7 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr *hi_p, if (h4p || !hi_p) return ERR_PTR(-EOPNOTSUPP); - srp = sg_add_request(sfp, hi_p->dxfer_len, false); + srp = sg_add_request(sfp, hi_p->dxfer_len, sync); if (IS_ERR(srp)) return srp; srp->header = *hi_p;/* structure assignment, could memcpy */ @@ -897,9 +902,13 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr *hi_p, srp->start_ts = ktime_get_with_offset(TK_OFFS_BOOT); else hp->duration = jiffies_to_msecs(jiffies); - /* at tail if v3 or later interface and tail flag set */ - at_head = !(hp->interface_id != '\0' && - (SG_FLAG_Q_AT_TAIL & hp->flags)); + + if (hp->interface_id == '\0') /* v1 and v2 interface */ + at_head = true; /* backward compatibility */ + else if (sfp->q_at_tail) /* cmd flags can override sfd setting */ + at_head = (hp->flags & SG_FLAG_Q_AT_HEAD); + else/* this sfd is defaulting to head */ + at_head = !(hp->flags & SG_FLAG_Q_AT_TAIL); srp->rq->timeout = timeout; kref_get(>f_ref)
Re: [PATCH v3 8/8] sg: user controls for q_at_head, read_value
On 2018-10-26 12:48 p.m., Douglas Gilbert wrote: Add a SG_SET_GET_EXTENDED ioctl control for whether commands will be queued_at_head or queued_at_tail by the block layer (together with the scsi mid-level). It has file scope. Also add a read_value integer the can be used by write a value from the SG_SEIRV_* group then the corresponding value will be returned. Signed-off-by: Douglas Gilbert --- See "v3.5" of this patch to fix the compile error when CONFIG_SCSI_PROC_FS is not defined in a kernel build. Doug Gilbert
Re: [PATCH] rescan-scsi-bus.sh: use LUN wildcard in idlist
On 2018-10-25 10:49 a.m., Martin Wilck wrote: By scanning for LUN 0 only, we may encounter a device that the kernel won't add (e.g. peripheral device type 31) and which may thus never appear in sysfs for us to use for REPORT LUNS. That causes LUN additions for such devices to be missed by "rescan-iscsi-bus.sh -a". Seems reasonable and applies clean to my development true. For rescan-scsi-bus.sh I usually defer to Hannes. I have put it in pending a decision from him. BTW My alphas and betas (and releases) of sg3_utils go out through Hannes' github account [https://github.com/hreinecke/sg3_utils] and/or http://sg.danny.cz/sg . At the moment the latter is more up to date. Doug Gilbert Signed-off-by: Martin Wilck --- scripts/rescan-scsi-bus.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/rescan-scsi-bus.sh b/scripts/rescan-scsi-bus.sh index 6989208..a2aa8d8 100755 --- a/scripts/rescan-scsi-bus.sh +++ b/scripts/rescan-scsi-bus.sh @@ -376,7 +376,7 @@ idlist () oldlist=$(ls /sys/class/scsi_device/ | sed -n "s/${host}:${channel}:\([0-9]*:[0-9]*\)/\1/p" | uniq) # Rescan LUN 0 to check if we found new targets - echo "${channel} - 0" > /sys/class/scsi_host/host${host}/scan + echo "${channel} - -" > /sys/class/scsi_host/host${host}/scan newlist=$(ls /sys/class/scsi_device/ | sed -n "s/${host}:${channel}:\([0-9]*:[0-9]*\)/\1/p" | uniq) for newid in $newlist ; do oldid=$newid
Re: [PATCH v1] sg3_utils: sg_write_buffer: convert string to integer while reading from stdio
On 2018-11-09 7:18 p.m., Bean Huo (beanhuo) wrote: This patch is to convert inputted string to the integer when read data from stdin. While entering data, the data between bytes can be separated by space, or by ',' or by '.'. Could you send me this patch against sg_write_buffer.c as found in the recently released version 1.44 . In there the version string for sg_write_buffer.c is: "1.28 20180628" Signed-off-by: Bean Huo --- src/sg_write_buffer.c | 31 +++ 1 file changed, 31 insertions(+) diff --git a/src/sg_write_buffer.c b/src/sg_write_buffer.c index 7560e7e..902bc5d 100644 --- a/src/sg_write_buffer.c +++ b/src/sg_write_buffer.c @@ -195,6 +195,7 @@ main(int argc, char * argv[]) const char * device_name = NULL; const char * file_name = NULL; unsigned char * dop = NULL; +unsigned char * read_buf= NULL; For example the above 2 lines now use uint8_t char * cp; const struct mode_s * mp; char ebuff[EBUFF_SZ]; @@ -394,6 +395,33 @@ main(int argc, char * argv[]) } } } + if (infd == STDIN_FILENO) { + if (NULL == (read_buf = (unsigned char *)malloc(DEF_XFER_LEN))) { + pr2serr(ME "out of memory\n"); + ret = SG_LIB_SYNTAX_ERROR; + goto err_out; + } + res = read(infd, read_buf, DEF_XFER_LEN); + if (res < 0) { + snprintf(ebuff, EBUFF_SZ, ME "couldn't read from STDIN"); + perror(ebuff); + ret = SG_LIB_FILE_ERROR; + goto err_out; + } + char * pch; + int val = 0; + res = 0; + pch = strtok(read_buf, ",. "); + while (pch != NULL) { + printf("read %s ", pch); + val = sg_get_num_nomult(pch); + if (val > 0 && val < 255) { Hmm, perhaps: if (val >= 0 && val < 256) else + dop[res] = val; + res++; + } + pch = strtok(NULL, ",. "); + } + } else { res = read(infd, dop, wb_len); if (res < 0) { snprintf(ebuff, EBUFF_SZ, ME "couldn't read from %s", @@ -404,6 +432,7 @@ main(int argc, char * argv[]) ret = SG_LIB_FILE_ERROR; goto err_out; } + } Tabbing looks a little off here. Thanks Doug Gilbert if (res < wb_len) { if (wb_len_given) { pr2serr("tried to read %d bytes from %s, got %d bytes\n", @@ -472,6 +501,8 @@ main(int argc, char * argv[]) err_out: if (dop) free(dop); +if (read_buf) + free(read_buf); res = sg_cmds_close_device(sg_fd); if (res < 0) { pr2serr("close error: %s\n", safe_strerror(-res));
Re: [PATCH V2] sg3_utils: sg_write_buffer: convert string to integer while reading from stdio
On 2018-11-12 11:32 a.m., Bean Huo (beanhuo) wrote: This patch is to convert inputted string to the integer when read data from stdin. While entering data flow, the data between bytes can be separated by either space, or ',' (or by '.'). V1-V2: 1. Rebased the patch on the latest sg_write_buffer.c 2. Added the wrong input checkup, and process 3. Changed the delimer from ",. " to ",. /n/t", in order to skip the spaces at the end of string because of misoperation. 4. Modified some wrong indents. Tested on my own UFS platform, used to issue VU command: ./sg_write_buffer -b 0 -i 0 -v -l 0x2c -m 1 -S 7 -r /dev/block/sda 0x01 0x40 0x20 tried to read 44 bytes from -, got 3 bytes pad with 0xff bytes and continue sending single write buffer, mode=0x1, mpsec=7, id=0, offset=0, len=44 Write buffer cdb: 3b e1 00 00 00 00 00 00 2c 00 ./sg_read_buffer -l 32 -m 1 -S 6 -v /dev/block/sda Read buffer(10) cdb: 3c c1 00 00 00 00 00 00 20 00 00 61 34 64 36 63 38 61 42 65 36 4b 4d 4c 34 4c 30 10 30 34 44 43 45 32 30 32 57 30 00 00 00 00 00 00 Signed-off-by: Bean Huo Thanks, applied. Just prior to getting this patch today I put up a sg3_utils-1.45 (early) beta: revision 795. It's in a tarball at the top of http://sg.danny.cz/sg/ This patch will be in revision 796. Doug Gilbert