Re: [PATCH] scsi: Change return type to vm_fault_t

2018-04-14 Thread Douglas Gilbert

On 2018-04-14 02:47 PM, Souptick Joarder wrote:

Use new return type vm_fault_t for fault handler
in struct vm_operations_struct.

Signed-off-by: Souptick Joarder <jrdr.li...@gmail.com>
Reviewed-by: Matthew Wilcox <mawil...@microsoft.com>


Acked-by: Douglas Gilbert <dgilb...@interlog.com>


---
  drivers/scsi/sg.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index c198b963..c2b7d34 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1192,7 +1192,7 @@ static long sg_compat_ioctl(struct file *filp, unsigned 
int cmd_in, unsigned lon
return fasync_helper(fd, filp, mode, >async_qp);
  }

-static int
+static vm_fault_t
  sg_vma_fault(struct vm_fault *vmf)
  {
struct vm_area_struct *vma = vmf->vma;
--
1.9.1






Re: [VERY EARLY RFC 00/13] Rework SCSI results handling

2018-04-18 Thread Douglas Gilbert

On 2018-04-18 11:01 AM, Johannes Thumshirn wrote:

Hey all,

here's a early preview of my SCSI results rework so we can eventually
discuss things next week at LSF/MM (it still has compiler errors on
aic7xxx and scsi_debug).

The motivation behing this is that some drivers have failed to set the
scsi_cmnd::result bytes correctly in the past and this is resulting in
hard to case down errors.

The open points:
1) 148 files changed, treewide. That's huge. Is it worth it?
2) remove the old status byte definitions
3) add a scsi_cmnd::result == 0 wrapper
3) convert aic7xx's CAM stuff so this series compiles cleanly
4) What the heck is the SDEG_RES_IMMED_MASK stuff in scsi_debug doing


It's a hack. The cdb tables have only one delay value per command and
for Start Stop Unit and Synchronize Cache those are relatively long
delays. However as you may be aware both those commands have an IMMED
bit which makes those commands return more or less immediately.

In the work flow of the scsi_debug driver the work done by a specific
command is done more or less immediately and any delay associated
with that command is done by generic code just prior to calling
scsi_cmnd::done(). So for those two commands I needed an additional
return value from the resp_start_stop() and resp_sync_cache() to
indicate that their normal delay should be short circuited.

And the signatures of those functions are function pointers in many
of the scsi_debug tables, so expanding the number of arguments for
these two cases would have been messy. So the (hacky) solution was to
overload the return int with an extra flag. And that return int is none
other than the infamous SCSI result.

With the benefit of hindsight a better solution is to add a new bool
to struct sdebug_dev_info:
bool immed_shortens_delay;

that is set by those two commands when their IMMED bit is set (in their
cdbs). Then the code in schedule_resp() can act on that bool (when set)
to short circuit the delay. That will be a worthwhile improvement so
other commands (if and when implemented) can react properly when their
IMMED bit is set (e.g. FORMAT UNIT, PRE-FETCH and SANITIZE).


Does that clear up this matter?

Doug Gilbert


5) change scsi_execute() so we get a newish 'struct scsi_results' instead of an 
int
6) {to,from}_scsi_result() are odd
7) find suitable commit messages

In case someone want's it in a more viewable form I've pushed the
series to my kernel.org git:
https://git.kernel.org/pub/scm/linux/kernel/git/jth/linux.git/log/?h=scsi-results

Johannes Thumshirn (13):
   scsi: use host_byte() accessor
   scsi: remove Scsi_Cmnd typedef
   scsi: add enum for host byte codes
   scsi: add enum for driver byte codes
   scsi: add enum for message byte codes
   scsi: introduce set_scsi_result
   scsi: use set_driver_byte
   treewide: use set_host_byte
   scsi: use set_msg_byte
   scsi: introduce set_status_byte and convert LLDDs to use it
   scsi: Change status bytes to use SAM-3 version
   reewide: introduce clear_scsi_result() and convert drivers
   scsi: introduce struct scsi_result

  arch/ia64/hp/sim/simscsi.c  |   6 +-
  block/bsg-lib.c |   8 +-
  block/bsg.c |   8 +-
  block/scsi_ioctl.c  |  12 +-
  drivers/ata/libata-scsi.c   |  54 +++---
  drivers/firewire/sbp2.c |   2 +-
  drivers/infiniband/ulp/srp/ib_srp.c |  23 ++-
  drivers/message/fusion/mptfc.c  |   8 +-
  drivers/message/fusion/mptsas.c |   3 +-
  drivers/message/fusion/mptscsih.c   | 122 -
  drivers/message/fusion/mptspi.c |   6 +-
  drivers/s390/scsi/zfcp_fc.h |   2 +-
  drivers/s390/scsi/zfcp_scsi.c   |   4 +-
  drivers/scsi/3w-9xxx.c  |  18 +-
  drivers/scsi/3w-sas.c   |  16 +-
  drivers/scsi/3w-.c  |  36 ++--
  drivers/scsi/53c700.c   |   3 +-
  drivers/scsi/BusLogic.c |  24 ++-
  drivers/scsi/NCR5380.c  |  39 +++--
  drivers/scsi/a100u2w.c  |   2 +-
  drivers/scsi/aacraid/aachba.c   | 221 +---
  drivers/scsi/aacraid/commsup.c  |   5 +-
  drivers/scsi/aacraid/linit.c|  12 +-
  drivers/scsi/advansys.c |  58 +++
  drivers/scsi/aha152x.c  |  76 
  drivers/scsi/aha1542.c  |   2 +-
  drivers/scsi/aha1740.c  |  11 +-
  drivers/scsi/aha1740.h  |   4 +-
  drivers/scsi/aic7xxx/aic79xx_osm.c  |   5 +-
  drivers/scsi/aic7xxx/aic79xx_osm.h  |   6 +-
  drivers/scsi/aic7xxx/aic7xxx_osm.c  |   5 +-
  drivers/scsi/aic7xxx/aic7xxx_osm.h  

[PATCH] scsi_debug: replace SDEG_RES_IMMED_MASK with cmnd flag

2018-04-18 Thread Douglas Gilbert
This patch uses a cleaner method to convey the presence of the IMMED
cdb bit back to the generic delay code in the scsi_debug driver. The
previous method used a temporary mask over the SCSI result value
(a 32 bit integer soon to be restructured) that was not visible
outside this driver. It has still caused some confusion so a more
conventional method is now used: adding an extra flag to the
scsi_cmnd "host_scribble" area.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---

This patch is designed to simplify part of the patchset titled:
"Rework SCSI results handling" that touched the scsi_debug driver.
This patch is against Martin's 4.18/scsi-queue branch. It will
probably break soon as that branch doesn't yet contain my
"scsi_debug: IMMED related delay adjustments" patch which touches
the same area. The technique being used should be clear.


 drivers/scsi/scsi_debug.c | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 9ef5e3b810f6..65809f3d0d33 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -234,7 +234,7 @@ static const char *sdebug_version_date = "20180128";
 #define F_INV_OP   0x200
 #define F_FAKE_RW  0x400
 #define F_M_ACCESS 0x800   /* media access */
-#define F_LONG_DELAY   0x1000
+#define F_LONG_DELAY   0x1000  /* IMMED bit on commands with this */
 
 #define FF_RESPOND (F_RL_WLUN_OK | F_SKIP_UA | F_DELAY_OVERR)
 #define FF_MEDIA_IO (F_M_ACCESS | F_FAKE_RW)
@@ -294,6 +294,7 @@ struct sdebug_queued_cmd {
unsigned int inj_dix:1;
unsigned int inj_short:1;
unsigned int inj_host_busy:1;
+   bool immed_shortens_delay;
 };
 
 struct sdebug_queue {
@@ -399,14 +400,6 @@ static const unsigned char opcode_ind_arr[256] = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 };
 
-/*
- * The following "response" functions return the SCSI mid-level's 4 byte
- * tuple-in-an-int. To handle commands with an IMMED bit, for a faster
- * command completion, they can mask their return value with
- * SDEG_RES_IMMED_MASK .
- */
-#define SDEG_RES_IMMED_MASK 0x4000
-
 static int resp_inquiry(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_report_luns(struct scsi_cmnd *, struct sdebug_dev_info *);
 static int resp_requests(struct scsi_cmnd *, struct sdebug_dev_info *);
@@ -1607,6 +1600,8 @@ static int resp_start_stop(struct scsi_cmnd *scp,
 {
unsigned char *cmd = scp->cmnd;
int power_cond, stop;
+   struct sdebug_queued_cmd *sqcp =
+   (struct sdebug_queued_cmd *)scp->host_scribble;
 
power_cond = (cmd[4] & 0xf0) >> 4;
if (power_cond) {
@@ -1615,7 +1610,8 @@ static int resp_start_stop(struct scsi_cmnd *scp,
}
stop = !(cmd[4] & 1);
atomic_xchg(>stopped, stop);
-   return (cmd[1] & 0x1) ? SDEG_RES_IMMED_MASK : 0; /* check IMMED bit */
+   sqcp->immed_shortens_delay = !!(cmd[1] & 0x1);
+   return 0;
 }
 
 static sector_t get_sdebug_capacity(void)
@@ -3586,6 +3582,8 @@ static int resp_sync_cache(struct scsi_cmnd *scp,
u64 lba;
u32 num_blocks;
u8 *cmd = scp->cmnd;
+   struct sdebug_queued_cmd *sqcp =
+   (struct sdebug_queued_cmd *)scp->host_scribble;
 
if (cmd[0] == SYNCHRONIZE_CACHE) {  /* 10 byte cdb */
lba = get_unaligned_be32(cmd + 2);
@@ -3598,7 +3596,8 @@ static int resp_sync_cache(struct scsi_cmnd *scp,
mk_sense_buffer(scp, ILLEGAL_REQUEST, LBA_OUT_OF_RANGE, 0);
return check_condition_result;
}
-   return (cmd[1] & 0x2) ? SDEG_RES_IMMED_MASK : 0; /* check IMMED bit */
+   sqcp->immed_shortens_delay = !!(cmd[1] & 0x2);
+   return 0;
 }
 
 #define RL_BUCKET_ELEMS 8
@@ -4399,12 +4398,12 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct 
sdebug_dev_info *devip,
}
 
cmnd->result = pfp != NULL ? pfp(cmnd, devip) : 0;
-   if (cmnd->result & SDEG_RES_IMMED_MASK) {
+   if (sqcp->immed_shortens_delay) {
/*
-* This is the F_DELAY_OVERR case. No delay.
+* IMMED bit set overrides F_LONG_DELAY.
 */
-   cmnd->result &= ~SDEG_RES_IMMED_MASK;
-   delta_jiff = ndelay = 0;
+   delta_jiff = 0;
+   ndelay = 0;
}
if (cmnd->result == 0 && scsi_result != 0)
cmnd->result = scsi_result;
@@ -4456,7 +4455,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct 
sdebug_dev_info *devip,
 
 respond_in_thread: /* call back to mid-layer using invocation thread */
cmnd->result = pfp != NULL ? pfp(cmnd, devip) : 0;
-   cmnd->result &= ~SDEG_RES_IMMED_MASK;
if (cmnd->result == 0 && scsi_result != 0)
cmnd->result = scsi_result;
cmnd->scsi_done(cmnd);
-- 
2.14.1



Re: [PATCH v2 4/6] scsi_io_completion_action helper added

2018-03-27 Thread Douglas Gilbert

On 2018-03-26 08:13 PM, Bart Van Assche wrote:

On Sun, 2018-03-18 at 21:59 +0100, Douglas Gilbert wrote:

+   /* sense not about current command is termed: deferred */


Do we really need comments that explain the SCSI specs? If such a comment is
added I think it should be added above the definition of 
scsi_sense_is_deferred()
together with a reference to the "Sense data" section in SPC.


+   if (result == 0) {
+   /*
+* Unprep the request and put it back at the head of the
+* queue. A new command will be prepared and issued.
+* This block is the same as case ACTION_REPREP in
+* scsi_io_completion_action() above.
 */
-   if (q->mq_ops) {
+   if (q->mq_ops)
scsi_mq_requeue_cmd(cmd);
-   } else {
+   else {
scsi_release_buffers(cmd);
scsi_requeue_command(q, cmd);
}


Have these changes been verified with checkpatch? Checkpatch should have 
reported
the following about the above chunk of code: Unbalanced braces around else 
statement.


Yes they were, did you check them? If so, with what command line options?
Since with no options /scripts/checkpatch.pl returns
clean for all patches in this set.


Additionally, have you considered to introduce a new function instead of 
duplicating
existing code?


Yes, that could be done.


Otherwise this patch looks fine to me.


Doug Gilbert


[PATCH v3 5/7] scsi_io_completion_reprep helper added

2018-03-27 Thread Douglas Gilbert
Since the action "reprep" is called from two places, rather than repeat
the code, make a new scsi_io_completion helper with "reprep" as its
suffix.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
 drivers/scsi/scsi_lib.c | 41 ++---
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b1cefd8e527f..e7e5eb5b9e2e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,20 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/* Helper for scsi_io_completion() when "reprep" action required. */
+static void scsi_io_completion_reprep(struct scsi_cmnd *cmd,
+ struct request_queue *q)
+{
+   /* A new command will be prepared and issued. */
+   if (q->mq_ops) {
+   scsi_mq_requeue_cmd(cmd);
+   } else {
+   /* Unprep request and put it back at head of the queue. */
+   scsi_release_buffers(cmd);
+   scsi_requeue_command(q, cmd);
+   }
+}
+
 /* Helper for scsi_io_completion() when special action required. */
 static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
 {
@@ -893,15 +907,7 @@ static void scsi_io_completion_action(struct scsi_cmnd 
*cmd, int result)
return;
/*FALLTHRU*/
case ACTION_REPREP:
-   /* Unprep the request and put it back at the head of the queue.
-* A new command will be prepared and issued.
-*/
-   if (q->mq_ops)
-   scsi_mq_requeue_cmd(cmd);
-   else {
-   scsi_release_buffers(cmd);
-   scsi_requeue_command(q, cmd);
-   }
+   scsi_io_completion_reprep(cmd, q);
break;
case ACTION_RETRY:
/* Retry the same command immediately */
@@ -1079,20 +1085,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * If there had been no error, but we have leftover bytes in the
 * requeues just queue the command up again.
 */
-   if (result == 0) {
-   /*
-* Unprep the request and put it back at the head of the
-* queue. A new command will be prepared and issued.
-* This block is the same as case ACTION_REPREP in
-* scsi_io_completion_action() above.
-*/
-   if (q->mq_ops)
-   scsi_mq_requeue_cmd(cmd);
-   else {
-   scsi_release_buffers(cmd);
-   scsi_requeue_command(q, cmd);
-   }
-   } else
+   if (result == 0)
+   scsi_io_completion_reprep(cmd, q);
+   else
scsi_io_completion_action(cmd, result);
 }
 
-- 
2.14.1



[PATCH v3 2/7] scsi_io_completion rename variables

2018-03-27 Thread Douglas Gilbert
Change some variable names and associated comments for clarity. Correct
some misleading comments.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
 drivers/scsi/scsi_lib.c | 57 ++---
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d44ee84df091..598836804745 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -777,18 +777,19 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
  *be put back on the queue and retried using the same
  *command as before, possibly after a delay.
  *
- * c) We can call scsi_end_request() with -EIO to fail
- *the remainder of the request.
+ * c) We can call scsi_end_request() with blk_stat other than
+ *BLK_STS_OK, to fail the remainder of the request.
  */
 void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 {
int result = cmd->result;
struct request_queue *q = cmd->device->request_queue;
struct request *req = cmd->request;
-   blk_status_t error = BLK_STS_OK;
+   blk_status_t blk_stat = BLK_STS_OK;
struct scsi_sense_hdr sshdr;
bool sense_valid = false;
-   int sense_deferred = 0, level = 0;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   int level = 0;
enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
  ACTION_DELAYED_RETRY} action;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
@@ -796,7 +797,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
if (result) {
sense_valid = scsi_command_normalize_sense(cmd, );
if (sense_valid)
-   sense_deferred = scsi_sense_is_deferred();
+   sense_current = !scsi_sense_is_deferred();
}
 
if (blk_rq_is_passthrough(req)) {
@@ -809,8 +810,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
min(8 + cmd->sense_buffer[7],
SCSI_SENSE_BUFFERSIZE);
}
-   if (!sense_deferred)
-   error = __scsi_error_from_host_byte(cmd, 
result);
+   if (sense_current)
+   blk_stat = __scsi_error_from_host_byte(cmd,
+  result);
}
/*
 * __scsi_error_from_host_byte may have reset the host_byte
@@ -829,13 +831,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
BUG();
return;
}
-   } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
+   } else if (blk_rq_bytes(req) == 0 && result && sense_current) {
/*
 * Flush commands do not transfers any data, and thus cannot use
 * good_bytes != blk_rq_bytes(req) as the signal for an error.
-* This sets the error explicitly for the problem case.
+* This sets blk_stat explicitly for the problem case.
 */
-   error = __scsi_error_from_host_byte(cmd, result);
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
}
 
/* no bidi support for !blk_rq_is_passthrough yet */
@@ -865,8 +867,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
else if (!(req->rq_flags & RQF_QUIET))
scsi_print_sense(cmd);
result = 0;
-   /* for passthrough error may be set */
-   error = BLK_STS_OK;
+   /* for passthrough, blk_stat may be set */
+   blk_stat = BLK_STS_OK;
}
/*
 * Another corner case: the SCSI status byte is non-zero but 'good'.
@@ -877,23 +879,24 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 */
if (status_byte(result) && scsi_status_is_good(result)) {
result = 0;
-   error = BLK_STS_OK;
+   blk_stat = BLK_STS_OK;
}
 
/*
-* special case: failed zero length commands always need to
-* drop down into the retry code. Otherwise, if we finished
-* all bytes in the request we are done now.
+* Next deal with any sectors which we were able to correctly
+* handle. Failed, zero length commands always need to drop down
+* to retry code. Fast path should return in this block.
 */
-   if (!(b

[PATCH v3 1/7] scsi_io_completion comment on end_request return

2018-03-27 Thread Douglas Gilbert
scsi_end_request() is called multiple times from scsi_io_completion()
which branches on its bool returned value. Add comment before the static
definition of scsi_end_request() about the meaning of that return.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com>
---
The reader might think that if scsi_end_request() had actually ended the
request (i.e. no bytes remaining) it would return true. If so, the reader
would be misled.

 drivers/scsi/scsi_lib.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 393f9db8f41b..d44ee84df091 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -662,6 +662,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd)
cmd->request->next_rq->special = NULL;
 }
 
+/* Returns false when no more bytes to process, true if there are more */
 static bool scsi_end_request(struct request *req, blk_status_t error,
unsigned int bytes, unsigned int bidi_bytes)
 {
-- 
2.14.1



[PATCH v3 7/7] scsi_io_completion convert BUGs to WARNs

2018-03-27 Thread Douglas Gilbert
The scsi_io_completion function contains three BUG() and BUG_ON() calls.
Replace them with WARN variants.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
 drivers/scsi/scsi_lib.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 391bbb0b19b8..6fa24cff43a6 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1046,13 +1046,21 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid;
if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req),
blk_rq_bytes(req->next_rq)))
-   BUG();
+   WARN_ONCE(true,
+ "Bidi command with remaining bytes");
return;
}
}
 
/* no bidi support yet, other than in pass-through */
-   BUG_ON(blk_bidi_rq(req));
+   if (unlikely(blk_bidi_rq(req))) {
+   WARN_ONCE(true, "Only support bidi command in passthrough");
+   scmd_printk(KERN_ERR, cmd, "Killing bidi command\n");
+   if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req),
+blk_rq_bytes(req->next_rq)))
+   WARN_ONCE(true, "Bidi command with remaining bytes");
+   return;
+   }
 
/*
 * Next deal with any sectors which we were able to correctly
@@ -1075,7 +1083,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
/* Kill remainder if no retries */
if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
-   BUG();
+   WARN_ONCE(true,
+   "Bytes remaining after failed, no-retry command");
return;
}
 
-- 
2.14.1



[PATCH v3 6/7] scsi_io_completion hints on fastpatch

2018-03-27 Thread Douglas Gilbert
Add likely() and unlikely() hints to conditionals on or near the fastpath.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
A reviewer wanted any performance improvement (or otherwise) quantified.
The improvement was so small, that ftrace ignored it. Inline timing code
suggests the improvement from this whole patchset is around 7 nanoseconds
per invocation (tested on a Lenovo X270 (i5-7200U)). Not exactly huge.
Another win might be the smaller size of scsi_io_completion() after the
refactoring; this should allow more other code to fit in the instruction
cache.

 drivers/scsi/scsi_lib.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index e7e5eb5b9e2e..391bbb0b19b8 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1028,17 +1028,17 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
struct request *req = cmd->request;
blk_status_t blk_stat = BLK_STS_OK;
 
-   if (result) /* does not necessarily mean there is an error */
+   if (unlikely(result))   /* a nz result may or may not be an error */
result = scsi_io_completion_nz_result(cmd, result, _stat);
 
-   if (blk_rq_is_passthrough(req)) {
+   if (unlikely(blk_rq_is_passthrough(req))) {
/*
 * __scsi_error_from_host_byte may have reset the host_byte
 */
scsi_req(req)->result = cmd->result;
scsi_req(req)->resid_len = scsi_get_resid(cmd);
 
-   if (scsi_bidi_cmnd(cmd)) {
+   if (unlikely(scsi_bidi_cmnd(cmd))) {
/*
 * Bidi commands Must be complete as a whole,
 * both sides at once.
@@ -1051,7 +1051,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
}
}
 
-   /* no bidi support for !blk_rq_is_passthrough yet */
+   /* no bidi support yet, other than in pass-through */
BUG_ON(blk_bidi_rq(req));
 
/*
@@ -1067,15 +1067,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * handle. Failed, zero length commands always need to drop down
 * to retry code. Fast path should return in this block.
 */
-   if (blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK) {
-   if (!scsi_end_request(req, blk_stat, good_bytes, 0))
-   return; /* no bytes remaining */
+   if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) {
+   if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0)))
+   return; /* no bytes remaining */
}
 
-   /*
-* Kill remainder if no retrys.
-*/
-   if (blk_stat && scsi_noretry_cmd(cmd)) {
+   /* Kill remainder if no retries */
+   if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
BUG();
return;
@@ -1085,7 +1083,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * If there had been no error, but we have leftover bytes in the
 * requeues just queue the command up again.
 */
-   if (result == 0)
+   if (likely(result == 0))
scsi_io_completion_reprep(cmd, q);
else
scsi_io_completion_action(cmd, result);
-- 
2.14.1



[PATCH v3 3/7] scsi_io_completion_nz_result function added

2018-03-27 Thread Douglas Gilbert
Break out several intertwined paths when cmd->result is non zero and
place them in the scsi_io_completion_nz_result helper function. The
logic is not changed.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
A reviewer requested the original helper function's two return values
be reduced to one: the blk_stat variable. This required a hack to
differentiate the default setting of blk_stat (BLK_STS_OK) from the case
when the helper assigns BLK_STS_OK as the return value. The hack is to
return the otherwise unused BLK_STS_NOTSUPP value as an indication that
the helper didn't change anything. That hack was judged by another
reviewer to be worse that the "two return values" ugliness it was
trying to address. So back to the original "two return values" solution.
Returning a structure containing result and blk_stat is another
possibility but returning structures is frowned upon in some circles.

 drivers/scsi/scsi_lib.c | 126 
 1 file changed, 74 insertions(+), 52 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 598836804745..3f65bb541ca3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,78 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/*
+ * Helper for scsi_io_completion() when cmd->result is non-zero. Returns a
+ * new result that may suppress further error checking. Also modifies
+ * *blk_statp in some cases.
+ */
+static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, int result,
+   blk_status_t *blk_statp)
+{
+   bool sense_valid;
+   bool about_current;
+   struct request *req = cmd->request;
+   struct scsi_sense_hdr sshdr;
+
+   sense_valid = scsi_command_normalize_sense(cmd, );
+   about_current = sense_valid ? !scsi_sense_is_deferred() : true;
+
+   if (blk_rq_is_passthrough(req)) {
+   if (sense_valid) {
+   /*
+* SG_IO wants current and deferred errors
+*/
+   scsi_req(req)->sense_len =
+   min(8 + cmd->sense_buffer[7],
+   SCSI_SENSE_BUFFERSIZE);
+   }
+   if (about_current)
+   *blk_statp = __scsi_error_from_host_byte(cmd, result);
+   } else if (blk_rq_bytes(req) == 0 && about_current) {
+   /*
+* Flush commands do not transfers any data, and thus cannot use
+* good_bytes != blk_rq_bytes(req) as the signal for an error.
+* This sets *blk_statp explicitly for the problem case.
+*/
+   *blk_statp = __scsi_error_from_host_byte(cmd, result);
+   }
+   /*
+* Recovered errors need reporting, but they're always treated as
+* success, so fiddle the result code here.  For passthrough requests
+* we already took a copy of the original into sreq->result which
+* is what gets returned to the user
+*/
+   if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) {
+   bool do_print = true;
+   /*
+* if ATA PASS-THROUGH INFORMATION AVAILABLE [0x0, 0x1d]
+* skip print since caller wants ATA registers. Only occurs
+* on SCSI ATA PASS_THROUGH commands when CK_COND=1
+*/
+   if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
+   do_print = false;
+   else if (req->rq_flags & RQF_QUIET)
+   do_print = false;
+   if (do_print)
+   scsi_print_sense(cmd);
+   /* for passthrough, *blk_statp may be set, so clear */
+   *blk_statp = BLK_STS_OK;
+   result = 0;
+   }
+   /*
+* Another corner case: the SCSI status byte is non-zero but 'good'.
+* Example: PRE-FETCH command returns SAM_STAT_CONDITION_MET when
+* it is able to fit nominated LBs in its cache (and SAM_STAT_GOOD
+* if it can't fit). Treat SAM_STAT_CONDITION_MET and the related
+* intermediate statuses (both obsolete in SAM-4) as good.
+*/
+   if (status_byte(result) && scsi_status_is_good(result)) {
+   *blk_statp = BLK_STS_OK;
+   result = 0;
+   }
+   return result;
+}
+
 /*
  * Function:scsi_io_completion()
  *
@@ -794,26 +866,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
  ACTION_DELAYED_RETRY} action;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
 
-   if (result) {
+   if (result) {   /* does not necessarily mean there is an error */
sense_valid = scsi_command_normalize_sense(cmd, );
   

[PATCH v3 0/7] scsi_io_completion cleanup

2018-03-27 Thread Douglas Gilbert
While working of this "[PATCH v4] Make SCSI Status CONDITION MET
equivalent to GOOD" the author had trouble finding how to add another
corner case check in the scsi_io_completion() function (scsi_lib.c).
That function is executed at the completion of every SCSI command but
only 20 or so of its hundred of lines of code are executed for the vast
majority of invocations (i.e. the fastpath).

This patch refactors scsi_io_completion() by taking the bulk of its
error processing code into three static helper functions, leaving only
the 20 or so code lines that constitute the fastpath. In this process
comments were added and tweaked, plus variables renamed. The last two
patches in this set are optional: adding conditional hints along the
fastpath and converting 3 BUG() calls in scsi_io_completion() to WARNs
as requested by a reviewer.

There is no attempt in this patchset to change the logic of the original
scsi_io_completion(). Some conditional checks are saved by reducing the
number of redundant tests (e.g. on the 'result' variable being non-zero).
Also De Morgan's laws are applied to some complex conditions to simplify
them from the reader's perspective. The fact remains, this is a very
complex function.

This patch is against Martin Petersen's 4.17/scsi-queue branch.

Douglas Gilbert (7):
  scsi_io_completion comment on end_request return
  scsi_io_completion rename variables
  scsi_io_completion_nz_result function added
  scsi_io_completion_action helper added
  scsi_io_completion_reprep helper added
  scsi_io_completion hints on fastpatch
  scsi_io_completion convert BUGs to WARNs

 drivers/scsi/scsi_lib.c | 376 +++-
 1 file changed, 213 insertions(+), 163 deletions(-)

-- 
2.14.1



[PATCH v3 4/7] scsi_io_completion_action helper added

2018-03-27 Thread Douglas Gilbert
Place scsi_io_completion()'s complex error processing associated with a
local enumeration into a static helper function. That enumeration's
values start with "ACTION_" so use the suffix "_action" in the helper
function's name.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
 drivers/scsi/scsi_lib.c | 326 ++--
 1 file changed, 174 insertions(+), 152 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 3f65bb541ca3..b1cefd8e527f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,169 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/* Helper for scsi_io_completion() when special action required. */
+static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
+{
+   struct request_queue *q = cmd->device->request_queue;
+   struct request *req = cmd->request;
+   int level = 0;
+   enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
+ ACTION_DELAYED_RETRY} action;
+   unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
+   struct scsi_sense_hdr sshdr;
+   bool sense_valid = false;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   blk_status_t blk_stat;
+
+   if (scsi_command_normalize_sense(cmd, )) {
+   sense_valid = true;
+   sense_current = !scsi_sense_is_deferred();
+   }
+
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
+
+   if (host_byte(result) == DID_RESET) {
+   /* Third party bus reset or reset for error recovery
+* reasons.  Just retry the command and see what
+* happens.
+*/
+   action = ACTION_RETRY;
+   } else if (sense_valid && sense_current) {
+   switch (sshdr.sense_key) {
+   case UNIT_ATTENTION:
+   if (cmd->device->removable) {
+   /* Detected disc change.  Set a bit
+* and quietly refuse further access.
+*/
+   cmd->device->changed = 1;
+   action = ACTION_FAIL;
+   } else {
+   /* Must have been a power glitch, or a
+* bus reset.  Could not have been a
+* media change, so we just retry the
+* command and see what happens.
+*/
+   action = ACTION_RETRY;
+   }
+   break;
+   case ILLEGAL_REQUEST:
+   /* If we had an ILLEGAL REQUEST returned, then
+* we may have performed an unsupported
+* command.  The only thing this should be
+* would be a ten byte read where only a six
+* byte read was supported.  Also, on a system
+* where READ CAPACITY failed, we may have
+* read past the end of the disk.
+*/
+   if ((cmd->device->use_10_for_rw &&
+   sshdr.asc == 0x20 && sshdr.ascq == 0x00) &&
+   (cmd->cmnd[0] == READ_10 ||
+cmd->cmnd[0] == WRITE_10)) {
+   /* This will issue a new 6-byte command. */
+   cmd->device->use_10_for_rw = 0;
+   action = ACTION_REPREP;
+   } else if (sshdr.asc == 0x10) /* DIX */ {
+   action = ACTION_FAIL;
+   blk_stat = BLK_STS_PROTECTION;
+   /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
+   } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
+   action = ACTION_FAIL;
+   blk_stat = BLK_STS_TARGET;
+   } else
+   action = ACTION_FAIL;
+   break;
+   case ABORTED_COMMAND:
+   action = ACTION_FAIL;
+   if (sshdr.asc == 0x10) /* DIF */
+   blk_stat = BLK_STS_PROTECTION;
+   break;
+   case NOT_READY:
+   /* If the device is in the process of becoming
+* ready, or has a temporary blockage, retry.
+*/
+   if (sshdr.asc == 0x04) {
+   switch (sshdr.ascq) {
+

[PATCH v5 4/7] scsi_io_completion_action helper added

2018-03-28 Thread Douglas Gilbert
Place scsi_io_completion()'s complex error processing associated with a
local enumeration into a static helper function. That enumeration's
values start with "ACTION_" so use the suffix "_action" in the helper
function's name.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>

 drivers/scsi/scsi_lib.c | 321 ++--
 1 file changed, 171 insertions(+), 150 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 08c4db34af7b..831d1fad38b3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,168 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/* Helper for scsi_io_completion() when special action required. */
+static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
+{
+   struct request_queue *q = cmd->device->request_queue;
+   struct request *req = cmd->request;
+   int level = 0;
+   enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
+ ACTION_DELAYED_RETRY} action;
+   unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
+   struct scsi_sense_hdr sshdr;
+   bool sense_valid;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   blk_status_t blk_stat;
+
+   sense_valid = scsi_command_normalize_sense(cmd, );
+   if (sense_valid)
+   sense_current = !scsi_sense_is_deferred();
+
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
+
+   if (host_byte(result) == DID_RESET) {
+   /* Third party bus reset or reset for error recovery
+* reasons.  Just retry the command and see what
+* happens.
+*/
+   action = ACTION_RETRY;
+   } else if (sense_valid && sense_current) {
+   switch (sshdr.sense_key) {
+   case UNIT_ATTENTION:
+   if (cmd->device->removable) {
+   /* Detected disc change.  Set a bit
+* and quietly refuse further access.
+*/
+   cmd->device->changed = 1;
+   action = ACTION_FAIL;
+   } else {
+   /* Must have been a power glitch, or a
+* bus reset.  Could not have been a
+* media change, so we just retry the
+* command and see what happens.
+*/
+   action = ACTION_RETRY;
+   }
+   break;
+   case ILLEGAL_REQUEST:
+   /* If we had an ILLEGAL REQUEST returned, then
+* we may have performed an unsupported
+* command.  The only thing this should be
+* would be a ten byte read where only a six
+* byte read was supported.  Also, on a system
+* where READ CAPACITY failed, we may have
+* read past the end of the disk.
+*/
+   if ((cmd->device->use_10_for_rw &&
+   sshdr.asc == 0x20 && sshdr.ascq == 0x00) &&
+   (cmd->cmnd[0] == READ_10 ||
+cmd->cmnd[0] == WRITE_10)) {
+   /* This will issue a new 6-byte command. */
+   cmd->device->use_10_for_rw = 0;
+   action = ACTION_REPREP;
+   } else if (sshdr.asc == 0x10) /* DIX */ {
+   action = ACTION_FAIL;
+   blk_stat = BLK_STS_PROTECTION;
+   /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
+   } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
+   action = ACTION_FAIL;
+   blk_stat = BLK_STS_TARGET;
+   } else
+   action = ACTION_FAIL;
+   break;
+   case ABORTED_COMMAND:
+   action = ACTION_FAIL;
+   if (sshdr.asc == 0x10) /* DIF */
+   blk_stat = BLK_STS_PROTECTION;
+   break;
+   case NOT_READY:
+   /* If the device is in the process of becoming
+* ready, or has a temporary blockage, retry.
+*/
+   if (sshdr.asc == 0x04) {
+   switch (sshdr.ascq) {
+

[PATCH v5 7/7] scsi_io_completion convert BUGs to WARNs

2018-03-28 Thread Douglas Gilbert
The scsi_io_completion function contains three BUG() and BUG_ON() calls.
Replace them with WARN variants.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
 drivers/scsi/scsi_lib.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 60b57324a244..779c45479ac3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1046,13 +1046,21 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid;
if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req),
blk_rq_bytes(req->next_rq)))
-   BUG();
+   WARN_ONCE(true,
+ "Bidi command with remaining bytes");
return;
}
}
 
/* no bidi support yet, other than in pass-through */
-   BUG_ON(blk_bidi_rq(req));
+   if (unlikely(blk_bidi_rq(req))) {
+   WARN_ONCE(true, "Only support bidi command in passthrough");
+   scmd_printk(KERN_ERR, cmd, "Killing bidi command\n");
+   if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req),
+blk_rq_bytes(req->next_rq)))
+   WARN_ONCE(true, "Bidi command with remaining bytes");
+   return;
+   }
 
/*
 * Next deal with any sectors which we were able to correctly
@@ -1075,7 +1083,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
/* Kill remainder if no retries. */
if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
-   BUG();
+   WARN_ONCE(true,
+   "Bytes remaining after failed, no-retry command");
return;
}
 
-- 
2.14.1



[PATCH v5 6/7] scsi_io_completion hints on fastpatch

2018-03-28 Thread Douglas Gilbert
Add likely() and unlikely() hints to conditionals on or near the fastpath.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
A reviewer wanted any performance improvement (or otherwise) quantified.
The improvement was so small, that ftrace ignored it. Inline timing code
suggests the improvement from this whole patchset is around 7 nanoseconds
per invocation (tested on a Lenovo X270 (i5-7200U)). Not exactly huge.
Another win might be the smaller size of scsi_io_completion() after the
refactoring; this should allow more other code to fit in the instruction
cache.

 drivers/scsi/scsi_lib.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d24bd29cd94a..60b57324a244 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1028,17 +1028,17 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
struct request *req = cmd->request;
blk_status_t blk_stat = BLK_STS_OK;
 
-   if (result) /* does not necessarily mean there is an error */
+   if (unlikely(result))   /* a nz result may or may not be an error */
result = scsi_io_completion_nz_result(cmd, result, _stat);
 
-   if (blk_rq_is_passthrough(req)) {
+   if (unlikely(blk_rq_is_passthrough(req))) {
/*
 * __scsi_error_from_host_byte may have reset the host_byte
 */
scsi_req(req)->result = cmd->result;
scsi_req(req)->resid_len = scsi_get_resid(cmd);
 
-   if (scsi_bidi_cmnd(cmd)) {
+   if (unlikely(scsi_bidi_cmnd(cmd))) {
/*
 * Bidi commands Must be complete as a whole,
 * both sides at once.
@@ -1051,7 +1051,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
}
}
 
-   /* no bidi support for !blk_rq_is_passthrough yet */
+   /* no bidi support yet, other than in pass-through */
BUG_ON(blk_bidi_rq(req));
 
/*
@@ -1067,13 +1067,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * handle. Failed, zero length commands always need to drop down
 * to retry code. Fast path should return in this block.
 */
-   if (blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK) {
-   if (!scsi_end_request(req, blk_stat, good_bytes, 0))
+   if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) {
+   if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0)))
return; /* no bytes remaining */
}
 
-/* Kill remainder if no retries. */
-   if (blk_stat && scsi_noretry_cmd(cmd)) {
+   /* Kill remainder if no retries. */
+   if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
BUG();
return;
@@ -1083,7 +1083,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * If there had been no error, but we have leftover bytes in the
 * requeues just queue the command up again.
 */
-   if (result == 0)
+   if (likely(result == 0))
scsi_io_completion_reprep(cmd, q);
else
scsi_io_completion_action(cmd, result);
-- 
2.14.1



[PATCH v5 5/7] scsi_io_completion_reprep helper added

2018-03-28 Thread Douglas Gilbert
Since the action "reprep" is called from two places, rather than repeat
the code, make a new scsi_io_completion helper with "reprep" as its
suffix.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
 drivers/scsi/scsi_lib.c | 41 ++---
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 831d1fad38b3..d24bd29cd94a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,20 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/* Helper for scsi_io_completion() when "reprep" action required. */
+static void scsi_io_completion_reprep(struct scsi_cmnd *cmd,
+ struct request_queue *q)
+{
+   /* A new command will be prepared and issued. */
+   if (q->mq_ops) {
+   scsi_mq_requeue_cmd(cmd);
+   } else {
+   /* Unprep request and put it back at head of the queue. */
+   scsi_release_buffers(cmd);
+   scsi_requeue_command(q, cmd);
+   }
+}
+
 /* Helper for scsi_io_completion() when special action required. */
 static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
 {
@@ -892,15 +906,7 @@ static void scsi_io_completion_action(struct scsi_cmnd 
*cmd, int result)
return;
/*FALLTHRU*/
case ACTION_REPREP:
-   /* Unprep the request and put it back at the head of the queue.
-* A new command will be prepared and issued.
-*/
-   if (q->mq_ops) {
-   scsi_mq_requeue_cmd(cmd);
-   } else {
-   scsi_release_buffers(cmd);
-   scsi_requeue_command(q, cmd);
-   }
+   scsi_io_completion_reprep(cmd, q);
break;
case ACTION_RETRY:
/* Retry the same command immediately */
@@ -1077,20 +1083,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * If there had been no error, but we have leftover bytes in the
 * requeues just queue the command up again.
 */
-   if (result == 0) {
-   /*
-* Unprep the request and put it back at the head of the
-* queue. A new command will be prepared and issued.
-* This block is the same as case ACTION_REPREP in
-* scsi_io_completion_action() above.
-*/
-   if (q->mq_ops) {
-   scsi_mq_requeue_cmd(cmd);
-   } else {
-   scsi_release_buffers(cmd);
-   scsi_requeue_command(q, cmd);
-   }
-   } else
+   if (result == 0)
+   scsi_io_completion_reprep(cmd, q);
+   else
scsi_io_completion_action(cmd, result);
 }
 
-- 
2.14.1



[PATCH v5 0/7] scsi_io_completion cleanup

2018-03-28 Thread Douglas Gilbert
While working of this "[PATCH v4] Make SCSI Status CONDITION MET
equivalent to GOOD" the author had trouble finding how to add another
corner case check in the scsi_io_completion() function (scsi_lib.c).
That function is executed at the completion of every SCSI command but
only 20 or so of its hundred of lines of code are executed for the vast
majority of invocations (i.e. the fastpath).

This patch refactors scsi_io_completion() by taking the bulk of its
error processing code into three static helper functions, leaving only
the 20 or so code lines that constitute the fastpath. In this process
comments were added and tweaked, plus variables renamed. The last two
patches in this set are optional: adding conditional hints along the
fastpath and converting 3 BUG() calls in scsi_io_completion() to WARNs
as requested by a reviewer.

There is no attempt in this patchset to change the logic of the original
scsi_io_completion() although the last patch attempts to continue from
awkward situations rather than crashing the calling thread (and possibly
the kernel). Some conditional checks are saved by reducing the number of
redundant tests (e.g. multiple checks that the 'result' variable is
non-zero). Also De Morgan's laws are applied to some complex conditions
to simplify them from the reader's perspective. The fact remains, this
is a very complex function.

This patch is against Martin Petersen's 4.17/scsi-queue branch.


Note: to apply this patchset to the lk 4.15 production kernels (tested
on lk 4.15.13), these two lines near the end of scsi_io_completion():
   __scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY, 0);
and
   __scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY, 0);

need that trailing 0 changed to 'false', prior to applying 4/7 .


ChangeLog since v4
  - introduce bool do_print to simplify conditionals in the case of
a recovered error sense key in 2/7 rather than 3/7
ChangeLog since v3
  - make use of bools sense_valid and sense_current consistent in 3/7
and 4/7


Douglas Gilbert (7):
  scsi_io_completion comment on end_request return
  scsi_io_completion rename variables
  scsi_io_completion_nz_result function added
  scsi_io_completion_action helper added
  scsi_io_completion_reprep helper added
  scsi_io_completion hints on fastpatch
  scsi_io_completion convert BUGs to WARNs

 drivers/scsi/scsi_lib.c | 378 +++-
 1 file changed, 214 insertions(+), 164 deletions(-)

-- 
2.14.1



[PATCH v5 2/7] scsi_io_completion rename variables

2018-03-28 Thread Douglas Gilbert
Change and add some variable names, adjust some associated comments
for clarity. Correct some misleading comments.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
 drivers/scsi/scsi_lib.c | 72 ++---
 1 file changed, 39 insertions(+), 33 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d44ee84df091..d4f2c3fac907 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -777,18 +777,19 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
  *be put back on the queue and retried using the same
  *command as before, possibly after a delay.
  *
- * c) We can call scsi_end_request() with -EIO to fail
- *the remainder of the request.
+ * c) We can call scsi_end_request() with blk_stat other than
+ *BLK_STS_OK, to fail the remainder of the request.
  */
 void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 {
int result = cmd->result;
struct request_queue *q = cmd->device->request_queue;
struct request *req = cmd->request;
-   blk_status_t error = BLK_STS_OK;
+   blk_status_t blk_stat = BLK_STS_OK;
struct scsi_sense_hdr sshdr;
bool sense_valid = false;
-   int sense_deferred = 0, level = 0;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   int level = 0;
enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
  ACTION_DELAYED_RETRY} action;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
@@ -796,7 +797,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
if (result) {
sense_valid = scsi_command_normalize_sense(cmd, );
if (sense_valid)
-   sense_deferred = scsi_sense_is_deferred();
+   sense_current = !scsi_sense_is_deferred();
}
 
if (blk_rq_is_passthrough(req)) {
@@ -809,8 +810,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
min(8 + cmd->sense_buffer[7],
SCSI_SENSE_BUFFERSIZE);
}
-   if (!sense_deferred)
-   error = __scsi_error_from_host_byte(cmd, 
result);
+   if (sense_current)
+   blk_stat = __scsi_error_from_host_byte(cmd,
+  result);
}
/*
 * __scsi_error_from_host_byte may have reset the host_byte
@@ -829,13 +831,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
BUG();
return;
}
-   } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
+   } else if (blk_rq_bytes(req) == 0 && result && sense_current) {
/*
 * Flush commands do not transfers any data, and thus cannot use
 * good_bytes != blk_rq_bytes(req) as the signal for an error.
-* This sets the error explicitly for the problem case.
+* This sets blk_stat explicitly for the problem case.
 */
-   error = __scsi_error_from_host_byte(cmd, result);
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
}
 
/* no bidi support for !blk_rq_is_passthrough yet */
@@ -856,17 +858,22 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * is what gets returned to the user
 */
if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) {
-   /* if ATA PASS-THROUGH INFORMATION AVAILABLE skip
+   bool do_print = true;
+
+   /*
+* If ATA PASS-THROUGH INFORMATION AVAILABLE skip
 * print since caller wants ATA registers. Only occurs on
 * SCSI ATA PASS_THROUGH commands when CK_COND=1
 */
if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
-   ;
-   else if (!(req->rq_flags & RQF_QUIET))
+   do_print = false;
+   else if (req->rq_flags & RQF_QUIET)
+   do_print = false;
+   if (do_print)
scsi_print_sense(cmd);
result = 0;
-   /* for passthrough error may be set */
-   error = BLK_STS_OK;
+   /* for passthrough, blk_stat may be set */
+   blk_stat = BLK_STS_OK;
}
/*
 * Another corner case: the SCSI status byte is non-zero but 'g

[PATCH v5 3/7] scsi_io_completion_nz_result function added

2018-03-28 Thread Douglas Gilbert
Break out several intertwined paths when cmd->result is non zero and
place them in the scsi_io_completion_nz_result helper function. The
logic is not changed.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
A reviewer requested the original helper function's two return values
be reduced to one: the blk_stat variable. This required a hack to
differentiate the default setting of blk_stat (BLK_STS_OK) from the case
when the helper assigns BLK_STS_OK as the return value. The hack was to
return the otherwise unused BLK_STS_NOTSUPP value as an indication that
the helper didn't change anything. That hack was judged by another
reviewer to be worse that the "two return values" ugliness it was
trying to address. So back to the original "two return values" solution.

 drivers/scsi/scsi_lib.c | 132 +++-
 1 file changed, 75 insertions(+), 57 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d4f2c3fac907..08c4db34af7b 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,79 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/*
+ * Helper for scsi_io_completion() when cmd->result is non-zero. Returns a
+ * new result that may suppress further error checking. Also modifies
+ * *blk_statp in some cases.
+ */
+static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, int result,
+   blk_status_t *blk_statp)
+{
+   bool sense_valid;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   struct request *req = cmd->request;
+   struct scsi_sense_hdr sshdr;
+
+   sense_valid = scsi_command_normalize_sense(cmd, );
+   if (sense_valid)
+   sense_current = !scsi_sense_is_deferred();
+
+   if (blk_rq_is_passthrough(req)) {
+   if (sense_valid) {
+   /*
+* SG_IO wants current and deferred errors
+*/
+   scsi_req(req)->sense_len =
+   min(8 + cmd->sense_buffer[7],
+   SCSI_SENSE_BUFFERSIZE);
+   }
+   if (sense_current)
+   *blk_statp = __scsi_error_from_host_byte(cmd, result);
+   } else if (blk_rq_bytes(req) == 0 && sense_current) {
+   /*
+* Flush commands do not transfers any data, and thus cannot use
+* good_bytes != blk_rq_bytes(req) as the signal for an error.
+* This sets *blk_statp explicitly for the problem case.
+*/
+   *blk_statp = __scsi_error_from_host_byte(cmd, result);
+   }
+   /*
+* Recovered errors need reporting, but they're always treated as
+* success, so fiddle the result code here.  For passthrough requests
+* we already took a copy of the original into sreq->result which
+* is what gets returned to the user
+*/
+   if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) {
+   bool do_print = true;
+   /*
+* if ATA PASS-THROUGH INFORMATION AVAILABLE [0x0, 0x1d]
+* skip print since caller wants ATA registers. Only occurs
+* on SCSI ATA PASS_THROUGH commands when CK_COND=1
+*/
+   if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
+   do_print = false;
+   else if (req->rq_flags & RQF_QUIET)
+   do_print = false;
+   if (do_print)
+   scsi_print_sense(cmd);
+   result = 0;
+   /* for passthrough, *blk_statp may be set */
+   *blk_statp = BLK_STS_OK;
+   }
+   /*
+* Another corner case: the SCSI status byte is non-zero but 'good'.
+* Example: PRE-FETCH command returns SAM_STAT_CONDITION_MET when
+* it is able to fit nominated LBs in its cache (and SAM_STAT_GOOD
+* if it can't fit). Treat SAM_STAT_CONDITION_MET and the related
+* intermediate statuses (both obsolete in SAM-4) as good.
+*/
+   if (status_byte(result) && scsi_status_is_good(result)) {
+   result = 0;
+   *blk_statp = BLK_STS_OK;
+   }
+   return result;
+}
+
 /*
  * Function:scsi_io_completion()
  *
@@ -794,26 +867,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
  ACTION_DELAYED_RETRY} action;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
 
-   if (result) {
+   if (result) {   /* does not necessarily mean there is an error */
sense_valid = scsi_command_normalize_sense(cmd, );
if (sense_valid)
sense_current = 

[PATCH v5 1/7] scsi_io_completion comment on end_request return

2018-03-28 Thread Douglas Gilbert
scsi_end_request() is called multiple times from scsi_io_completion()
which branches on its bool returned value. Add comment before the static
definition of scsi_end_request() about the meaning of that return.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com>
---
The reader might think that if scsi_end_request() had actually ended the
request (i.e. no bytes remaining) it would return true. If so, the reader
would be misled.

 drivers/scsi/scsi_lib.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 393f9db8f41b..d44ee84df091 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -662,6 +662,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd)
cmd->request->next_rq->special = NULL;
 }
 
+/* Returns false when no more bytes to process, true if there are more */
 static bool scsi_end_request(struct request *req, blk_status_t error,
unsigned int bytes, unsigned int bidi_bytes)
 {
-- 
2.14.1



Re: Multi-Actuator SAS HDD First Look

2018-03-29 Thread Douglas Gilbert

On 2018-03-26 11:08 AM, Hannes Reinecke wrote:

On Fri, 23 Mar 2018 08:57:12 -0600
Tim Walker  wrote:


Seagate announced their split actuator SAS drive, which will probably
require some kernel changes for full support. It's targeted at cloud
provider JBODs and RAID.

Here are some of the drive's architectural points. Since the two LUNs
share many common components (e.g. spindle) Seagate allocated some
SCSI operations to be LUN specific and some to affect the entire
device, that is, both LUNs.

1. Two LUNs, 0 & 1, each with independent lba space, and each
connected to an independent read channel, actuator, and set of heads.
2. Each actuator addresses 1/2 of the media - no media is shared
across the actuators. They seek independently.
3. One World Wide Name (WWN) is assigned to the port for device
address. Each Logical Unit has a separate World Wide Name for
identification in VPD page.
4. 128 deep command queue, shared across both LUNs
5. Each LUN can pull commands from the queue independently, so they
can implement their own sorting and optimization.
6. Ordered tag attribute causes the command to be ordered across both
Logical Units
7. Head of Queue attribute causes the command to be ordered with
respect to a single Logical Unit
8. Mode pages are device-based (shared across both Logical Units)
9. Log pages are device-based.
10. Inquiry VPD pages (with minor exceptions) are device based.
11. Device health features (SMART, etc) are device based

Seagate wants the multi-actuator design to integrate into the stack as
painlessly as possible.The interface design is still in the early
stages, so I am gathering requirements and recommendations, and also
providing any information necessary to help scope integrating a
multi-LUN device into the MQ stack. So, I am soliciting any pertinent
feedback including:

1. Painful incompatibilities between the Seagate proposal and current
MQ architecture
2. Linux changes needed
3. Drive interface changes needed
4. Anything else I may have overlooked


So far it looks okay; just make sure to have VPD page 0x83
entries properly associated.
To all intents and purposes these devices seem to look like 'normal'
devices with two LUNs; nothing special with that.
Real question would be in which areas those devices differentiate from
the two indepdendent LUN scenario.

There might be issues with per-device informations like SMART etc;
ideally they are available from _both_ LUNs.
Otherwise they'll show up as blank from one LUN, causing consternation
with any management software.


Further to this point, some types of damage, such as to a head
or (one side of) a platter would degrade one LU, possibly making
it unusable for storage, while the other side (and the other LU)
would be fine.

I'm curious how you plan to implement the START STOP UNIT command.
If one side of the platter is in "start" state and the other side
in "stop" state, will the heads on the stopped side be parked (if
they can be parked)? And if both sides (LUs) are stopped I would
hope you really would spin down the disk, then if either is started
the disk would be spun up.

Getting T10 to add a bit to the Block Device Characteristics VPD page
might be helpful. It could be a "shares a spindle" bit with the other
LUs identified in the SCSI Ports VPD page. Such an indication would
help an enclosure find out if a Multi-Actuator disk was really spun down
and ready to be removed or replaced. I think SES and smartmontools may
need tweaks to handle this new device model sensibly.

Doug Gilbert




Re: Multi-Actuator SAS HDD First Look

2018-04-02 Thread Douglas Gilbert

On 2018-04-02 11:34 AM, Tim Walker wrote:

On Sat, Mar 31, 2018 at 10:52 AM, Douglas Gilbert <dgilb...@interlog.com> wrote:

On 2018-03-30 04:01 PM, Bart Van Assche wrote:


On Fri, 2018-03-30 at 12:36 -0600, Tim Walker wrote:


Yes I will be there to discuss the multi-LUN approach. I wanted to get
these interface details out so we could have some background and
perhaps folks would come with ideas. I don't have much more to put
out, but I will certainly answer questions - either on this list or
directly.



Hello Tim,

As far as I know the Linux SCSI stack does not yet support any SCSI
devices
for which a single SCSI command affects multiple (two in this case) LUNs.
Adding such support may take significant work. There will also be the
disadvantage that most SCSI core contributors do not have access to a
multi-
actuator device and hence that their changes may break support for multi-
actuator devices.



Hmmm, INQUIRY (3PC bit) and REPORT LUNS seem to be counter examples to
Bart's assertion. Plus there are a more that tell you about things outside
the addressed LU, for example the SCSI Ports VPD page tells you about other
SCSI ports hence other LUs) on the current device.


 From Tim's command list:

Device level

0x0, 0x1: okay
0x4 (Format unit): yikes, that could be a nasty surprise, accessing a file
   system on the other LU and getting an error "Not ready, format in
progress"!!
0x12: standard INQUIRY okay, VPD pages not so much LU id different; relative
   port id, different; target port id different (at the least)
0x1b (SSU): storage LUs need to know this model, otherwise the logic on
   each LU could get into a duel: "I want it spun up; no, I want it spun
   down ..."
0x35, 0x37, 0x3b, 0x3c: okay
0x48 (sanitize): similar to Format unit above
0x91,0x4c,0x4d: okay
MODE SENSE/SELECT(6,10): depends on page, block descriptor needs to be
   partially device level (since LB size may be changed by FU which is
   device level)
rest of device level: okay or (I) don't know
0xf7: READ UDS DATA, that's interesting, but proprietary I guess

Perhaps you could add a rider on FU and SAN: they get rejected unless the
other storage LU is in (logical) spun down state.


LU specific
---
all okay, I hoping READ(6,10,12,16,32) and their WRITE cousins will be
   there also :-) Plus the TMF: LU reset

Device or LU

all okay


I'm intrigued by your 3 LU model. My wish list for that:

LUN 0 would be processor device type (0x3) so it wouldn't confuse the
OS (Linux) that it held storage (READ CAPACITY is mandatory for PDT 0x0
and cannot represent a 0 block LU) and you could pick and choose which
SCSI commands to implement on it. LUN 0 TUR could report what the spindle
was really doing, SSU could do precisely what it is told (and SSU on LUNs
1 and 2 would be an "and" for spin down and an "or" for spin up). I've
got several boxes full of SAS cables and only one cable that I can think
of that lets me get to the secondary SAS port. So on LUN 0 you could have
a proprietary (unless T10 takes it on board) mode page to enable the user
to say which ports that LUNs 1 and 2 where accessible on. Obviously LUN 0
would need to be accessible on both ports. [A non-accessible LUN would
still respond to INQUIRY but with a first byte of 07f: PDT=0x1f (unknown)
and PQ=3 which implies it is there but inaccessible via the current port.]

A processor PDT opens up the possibility of putting a copy manager on
LUN 0. Think offloaded (from main machine's perspective) backups and
restores where LUN 1 or 2 is the source or destination.

Enough dreaming.

Doug Gilbert




Thanks for all the input from everybody. I'll collate it for a meeting
with our interface architect.


Just to amplify the case against a FU or SAN being allowed at almost any
time from either storage LU irrespective of what the other was doing.
Imagine one initiator has some important data on one LU and has an
EXCLUSIVE ACCESS persistent reservation on it. Then another initiator
(e.g. on a different machine) sends a FU to the other LU which is
honoured, wiping the whole device. One unhappy customer 

Doug Gilbert


Re: Regarding SCSI SANITIZE command support

2018-04-03 Thread Douglas Gilbert

On 2018-04-02 07:10 AM, Mahesh Rajashekhara wrote:

Hello,

I am RAID HBA driver engineer here at Microsemi. We are working on linux driver 
development for Microsemi SAS/SATA RAID HBA controllers.

As per our understanding, while a drive is processing the SANITIZE command:
- The drive should still be exposed to the OS.
- The drive will fail all commands other than REQUEST SENSE


Not sure that the drive should ever knowingly fail INQUIRY and REPORT
LUNS. That is to allow a discovery process to occur. In the INQUIRY
response the peripheral qualifier should be 000b by my reading.

Go look at sbc4r15.pdf, specifically clause 4.11.2 to see all the
commands that should be responded to during a SANITIZE (there are
9 of them).


When we initiate SCSI SANITIZE operation on disk drive,  from the OS dmesg, we 
could see that continuous SCSI READ (10) commands have been sent by the OS 
upper layer drivers
and drive reports ASC: 04 ASCQ: 1B "LOGICAL UNIT NOT READY, SANITIZE IN 
PROGRESS". Since the sanitize operation is in progress, the drive will fail the 
commands.
Although, OS could see drive is sanitizing, we are not sure why does OS upper 
layer driver is sending SCSI READ (10).

During the OS boot, if the drive is sanitizing, OS boot is taking very long 
time and call trace issue has been occurred.

It would be of great help if you could shed some light on this issue.

Snippet of dmesg:

  [ 202.659196] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
  [ 202.659198] blk_update_request: I/O error, dev sdm, sector 0
  [ 202.659201] Buffer I/O error on dev sdm, logical block 0, async page read
  [ 202.659294] sd 0:0:11:0: [sdm] FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
  [ 202.659299] sd 0:0:11:0: [sdm] Sense Key : Not Ready [current]
  [ 202.659303] sd 0:0:11:0: [sdm] Add. Sense: Logical unit not ready, sanitize 
in progress
  [ 202.659308] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
  [ 202.659311] blk_update_request: I/O error, dev sdm, sector 0
  [ 202.659314] Buffer I/O error on dev sdm, logical block 0, async page read
  [ 202.659322] sdm: unable to read partition table
  [ 202.659633] sd 0:0:11:0: [sdm] FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
  [ 202.659638] sd 0:0:11:0: [sdm] Sense Key : Not Ready [current]
  [ 202.659641] sd 0:0:11:0: [sdm] Add. Sense: Logical unit not ready, sanitize 
in progress
  [ 202.659645] sd 0:0:11:0: [sdm] CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
  [ 202.659648] blk_update_request: I/O error, dev sdm, sector 0
  [ 202.660621] sd 0:0:11:0: [sdm] Spinning up disk...
  [ 204.067155] Buffer I/O error on dev sdm, logical block 488378624, async 
page read
  [ 204.122152] SGI XFS with ACLs, security attributes, no debug enabled
  [ 204.123494] XFS (dm-0): Mounting V5 Filesystem
  [ 204.289817] XFS (dm-0): Ending clean mount
  [ 240.069283] INFO: task systemd-udevd:251 blocked for more than 120 seconds.
  [ 240.069289] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message
  [  203.661324] .
  [ 240.069293] systemd-udevd D
  [ 240.069296] c00c3580 0 251 1 0x0004
  [ 240.069301] 88007e0e7d18 0082 88007e0ceeb0 
88007e0e7fd8
  [ 240.069306] 88007e0e7fd8 88007e0e7fd8 88007e0ceeb0 

  [ 240.069311]  c00c35d0 0001 
c00c3580
  [ 240.069316] Call Trace:
  [ 240.069328] [] schedule+0x29/0x70
  [ 240.069334] [] async_synchronize_cookie_domain+0x85/0x150
  [ 240.069340] [] ? wake_up_atomic_t+0x30/0x30
  [ 240.069346] [] async_synchronize_full+0x17/0x20
  [ 240.069351] [] load_module+0x1fc0/0x29e0
  [ 240.069358] [] ? ddebug_proc_write+0xf0/0xf0
  [ 240.069365] [] SyS_init_module+0xc5/0x110
  [ 240.069371] [] system_call_fastpath+0x16/0x1b


The scsi mid-level and/or the sd driver sends TEST UNIT READY (TUR)
commands and should take note of the NOT READY sense key and should
not send READ commands until that condition clears. If we are talking
about modern SCSI devices then REQUEST SENSE should respond with progress
information.

Doug Gilbert




[PATCH v4 4/7] scsi_io_completion_action helper added

2018-03-28 Thread Douglas Gilbert
Place scsi_io_completion()'s complex error processing associated with a
local enumeration into a static helper function. That enumeration's
values start with "ACTION_" so use the suffix "_action" in the helper
function's name.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
 drivers/scsi/scsi_lib.c | 325 ++--
 1 file changed, 173 insertions(+), 152 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6a8b9dc7c40e..048b144de5b0 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,168 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/* Helper for scsi_io_completion() when special action required. */
+static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
+{
+   struct request_queue *q = cmd->device->request_queue;
+   struct request *req = cmd->request;
+   int level = 0;
+   enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
+ ACTION_DELAYED_RETRY} action;
+   unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
+   struct scsi_sense_hdr sshdr;
+   bool sense_valid;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   blk_status_t blk_stat;
+
+   sense_valid = scsi_command_normalize_sense(cmd, );
+   if (sense_valid)
+   sense_current = !scsi_sense_is_deferred();
+
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
+
+   if (host_byte(result) == DID_RESET) {
+   /* Third party bus reset or reset for error recovery
+* reasons.  Just retry the command and see what
+* happens.
+*/
+   action = ACTION_RETRY;
+   } else if (sense_valid && sense_current) {
+   switch (sshdr.sense_key) {
+   case UNIT_ATTENTION:
+   if (cmd->device->removable) {
+   /* Detected disc change.  Set a bit
+* and quietly refuse further access.
+*/
+   cmd->device->changed = 1;
+   action = ACTION_FAIL;
+   } else {
+   /* Must have been a power glitch, or a
+* bus reset.  Could not have been a
+* media change, so we just retry the
+* command and see what happens.
+*/
+   action = ACTION_RETRY;
+   }
+   break;
+   case ILLEGAL_REQUEST:
+   /* If we had an ILLEGAL REQUEST returned, then
+* we may have performed an unsupported
+* command.  The only thing this should be
+* would be a ten byte read where only a six
+* byte read was supported.  Also, on a system
+* where READ CAPACITY failed, we may have
+* read past the end of the disk.
+*/
+   if ((cmd->device->use_10_for_rw &&
+   sshdr.asc == 0x20 && sshdr.ascq == 0x00) &&
+   (cmd->cmnd[0] == READ_10 ||
+cmd->cmnd[0] == WRITE_10)) {
+   /* This will issue a new 6-byte command. */
+   cmd->device->use_10_for_rw = 0;
+   action = ACTION_REPREP;
+   } else if (sshdr.asc == 0x10) /* DIX */ {
+   action = ACTION_FAIL;
+   blk_stat = BLK_STS_PROTECTION;
+   /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
+   } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
+   action = ACTION_FAIL;
+   blk_stat = BLK_STS_TARGET;
+   } else
+   action = ACTION_FAIL;
+   break;
+   case ABORTED_COMMAND:
+   action = ACTION_FAIL;
+   if (sshdr.asc == 0x10) /* DIF */
+   blk_stat = BLK_STS_PROTECTION;
+   break;
+   case NOT_READY:
+   /* If the device is in the process of becoming
+* ready, or has a temporary blockage, retry.
+*/
+   if (sshdr.asc == 0x04) {
+   switch (sshdr.ascq) {
+

[PATCH v4 3/7] scsi_io_completion_nz_result function added

2018-03-28 Thread Douglas Gilbert
Break out several intertwined paths when cmd->result is non zero and
place them in the scsi_io_completion_nz_result helper function. The
logic is not changed.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
A reviewer requested the original helper function's two return values
be reduced to one: the blk_stat variable. This required a hack to
differentiate the default setting of blk_stat (BLK_STS_OK) from the case
when the helper assigns BLK_STS_OK as the return value. The hack is to
return the otherwise unused BLK_STS_NOTSUPP value as an indication that
the helper didn't change anything. That hack was judged by another
reviewer to be worse that the "two return values" ugliness it was
trying to address. So back to the original "two return values" solution.

 drivers/scsi/scsi_lib.c | 127 
 1 file changed, 75 insertions(+), 52 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 598836804745..6a8b9dc7c40e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,79 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/*
+ * Helper for scsi_io_completion() when cmd->result is non-zero. Returns a
+ * new result that may suppress further error checking. Also modifies
+ * *blk_statp in some cases.
+ */
+static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, int result,
+   blk_status_t *blk_statp)
+{
+   bool sense_valid;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   struct request *req = cmd->request;
+   struct scsi_sense_hdr sshdr;
+
+   sense_valid = scsi_command_normalize_sense(cmd, );
+   if (sense_valid)
+   sense_current = !scsi_sense_is_deferred();
+
+   if (blk_rq_is_passthrough(req)) {
+   if (sense_valid) {
+   /*
+* SG_IO wants current and deferred errors
+*/
+   scsi_req(req)->sense_len =
+   min(8 + cmd->sense_buffer[7],
+   SCSI_SENSE_BUFFERSIZE);
+   }
+   if (sense_current)
+   *blk_statp = __scsi_error_from_host_byte(cmd, result);
+   } else if (blk_rq_bytes(req) == 0 && sense_current) {
+   /*
+* Flush commands do not transfers any data, and thus cannot use
+* good_bytes != blk_rq_bytes(req) as the signal for an error.
+* This sets *blk_statp explicitly for the problem case.
+*/
+   *blk_statp = __scsi_error_from_host_byte(cmd, result);
+   }
+   /*
+* Recovered errors need reporting, but they're always treated as
+* success, so fiddle the result code here.  For passthrough requests
+* we already took a copy of the original into sreq->result which
+* is what gets returned to the user
+*/
+   if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) {
+   bool do_print = true;
+   /*
+* if ATA PASS-THROUGH INFORMATION AVAILABLE [0x0, 0x1d]
+* skip print since caller wants ATA registers. Only occurs
+* on SCSI ATA PASS_THROUGH commands when CK_COND=1
+*/
+   if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
+   do_print = false;
+   else if (req->rq_flags & RQF_QUIET)
+   do_print = false;
+   if (do_print)
+   scsi_print_sense(cmd);
+   /* for passthrough, *blk_statp may be set, so clear */
+   *blk_statp = BLK_STS_OK;
+   result = 0;
+   }
+   /*
+* Another corner case: the SCSI status byte is non-zero but 'good'.
+* Example: PRE-FETCH command returns SAM_STAT_CONDITION_MET when
+* it is able to fit nominated LBs in its cache (and SAM_STAT_GOOD
+* if it can't fit). Treat SAM_STAT_CONDITION_MET and the related
+* intermediate statuses (both obsolete in SAM-4) as good.
+*/
+   if (status_byte(result) && scsi_status_is_good(result)) {
+   *blk_statp = BLK_STS_OK;
+   result = 0;
+   }
+   return result;
+}
+
 /*
  * Function:scsi_io_completion()
  *
@@ -794,26 +867,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
  ACTION_DELAYED_RETRY} action;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
 
-   if (result) {
+   if (result) {   /* does not necessarily mean there is an error */
sense_valid = scsi_command_normalize_sense(cmd, );
if (sense_valid)
sense_current = 

[PATCH v4 7/7] scsi_io_completion convert BUGs to WARNs

2018-03-28 Thread Douglas Gilbert
The scsi_io_completion function contains three BUG() and BUG_ON() calls.
Replace them with WARN variants.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
 drivers/scsi/scsi_lib.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a9676d824452..a89ca1ddc01a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1046,13 +1046,21 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid;
if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req),
blk_rq_bytes(req->next_rq)))
-   BUG();
+   WARN_ONCE(true,
+ "Bidi command with remaining bytes");
return;
}
}
 
/* no bidi support yet, other than in pass-through */
-   BUG_ON(blk_bidi_rq(req));
+   if (unlikely(blk_bidi_rq(req))) {
+   WARN_ONCE(true, "Only support bidi command in passthrough");
+   scmd_printk(KERN_ERR, cmd, "Killing bidi command\n");
+   if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req),
+blk_rq_bytes(req->next_rq)))
+   WARN_ONCE(true, "Bidi command with remaining bytes");
+   return;
+   }
 
/*
 * Next deal with any sectors which we were able to correctly
@@ -1075,7 +1083,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
/* Kill remainder if no retries */
if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
-   BUG();
+   WARN_ONCE(true,
+   "Bytes remaining after failed, no-retry command");
return;
}
 
-- 
2.14.1



[PATCH v4 2/7] scsi_io_completion rename variables

2018-03-28 Thread Douglas Gilbert
Change some variable names and associated comments for clarity. Correct
some misleading comments.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com>
---
 drivers/scsi/scsi_lib.c | 57 ++---
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d44ee84df091..598836804745 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -777,18 +777,19 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
  *be put back on the queue and retried using the same
  *command as before, possibly after a delay.
  *
- * c) We can call scsi_end_request() with -EIO to fail
- *the remainder of the request.
+ * c) We can call scsi_end_request() with blk_stat other than
+ *BLK_STS_OK, to fail the remainder of the request.
  */
 void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 {
int result = cmd->result;
struct request_queue *q = cmd->device->request_queue;
struct request *req = cmd->request;
-   blk_status_t error = BLK_STS_OK;
+   blk_status_t blk_stat = BLK_STS_OK;
struct scsi_sense_hdr sshdr;
bool sense_valid = false;
-   int sense_deferred = 0, level = 0;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   int level = 0;
enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
  ACTION_DELAYED_RETRY} action;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
@@ -796,7 +797,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
if (result) {
sense_valid = scsi_command_normalize_sense(cmd, );
if (sense_valid)
-   sense_deferred = scsi_sense_is_deferred();
+   sense_current = !scsi_sense_is_deferred();
}
 
if (blk_rq_is_passthrough(req)) {
@@ -809,8 +810,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
min(8 + cmd->sense_buffer[7],
SCSI_SENSE_BUFFERSIZE);
}
-   if (!sense_deferred)
-   error = __scsi_error_from_host_byte(cmd, 
result);
+   if (sense_current)
+   blk_stat = __scsi_error_from_host_byte(cmd,
+  result);
}
/*
 * __scsi_error_from_host_byte may have reset the host_byte
@@ -829,13 +831,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
BUG();
return;
}
-   } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
+   } else if (blk_rq_bytes(req) == 0 && result && sense_current) {
/*
 * Flush commands do not transfers any data, and thus cannot use
 * good_bytes != blk_rq_bytes(req) as the signal for an error.
-* This sets the error explicitly for the problem case.
+* This sets blk_stat explicitly for the problem case.
 */
-   error = __scsi_error_from_host_byte(cmd, result);
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
}
 
/* no bidi support for !blk_rq_is_passthrough yet */
@@ -865,8 +867,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
else if (!(req->rq_flags & RQF_QUIET))
scsi_print_sense(cmd);
result = 0;
-   /* for passthrough error may be set */
-   error = BLK_STS_OK;
+   /* for passthrough, blk_stat may be set */
+   blk_stat = BLK_STS_OK;
}
/*
 * Another corner case: the SCSI status byte is non-zero but 'good'.
@@ -877,23 +879,24 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 */
if (status_byte(result) && scsi_status_is_good(result)) {
result = 0;
-   error = BLK_STS_OK;
+   blk_stat = BLK_STS_OK;
}
 
/*
-* special case: failed zero length commands always need to
-* drop down into the retry code. Otherwise, if we finished
-* all bytes in the request we are done now.
+* Next deal with any sectors which we were able to correctly
+* handle. Failed, zero length commands always need to drop down
+* to retry code. 

[PATCH v4 1/7] scsi_io_completion comment on end_request return

2018-03-28 Thread Douglas Gilbert
scsi_end_request() is called multiple times from scsi_io_completion()
which branches on its bool returned value. Add comment before the static
definition of scsi_end_request() about the meaning of that return.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com>
---
The reader might think that if scsi_end_request() had actually ended the
request (i.e. no bytes remaining) it would return true. If so, the reader
would be misled.

 drivers/scsi/scsi_lib.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 393f9db8f41b..d44ee84df091 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -662,6 +662,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd)
cmd->request->next_rq->special = NULL;
 }
 
+/* Returns false when no more bytes to process, true if there are more */
 static bool scsi_end_request(struct request *req, blk_status_t error,
unsigned int bytes, unsigned int bidi_bytes)
 {
-- 
2.14.1



[PATCH v4 0/7] scsi_io_completion cleanup

2018-03-28 Thread Douglas Gilbert
While working of this "[PATCH v4] Make SCSI Status CONDITION MET
equivalent to GOOD" the author had trouble finding how to add another
corner case check in the scsi_io_completion() function (scsi_lib.c).
That function is executed at the completion of every SCSI command but
only 20 or so of its hundred of lines of code are executed for the vast
majority of invocations (i.e. the fastpath).

This patch refactors scsi_io_completion() by taking the bulk of its
error processing code into three static helper functions, leaving only
the 20 or so code lines that constitute the fastpath. In this process
comments were added and tweaked, plus variables renamed. The last two
patches in this set are optional: adding conditional hints along the
fastpath and converting 3 BUG() calls in scsi_io_completion() to WARNs
as requested by a reviewer.

There is no attempt in this patchset to change the logic of the original
scsi_io_completion(). Some conditional checks are saved by reducing the
number of redundant tests (e.g. on the 'result' variable being non-zero).
Also De Morgan's laws are applied to some complex conditions to simplify
them from the reader's perspective. The fact remains, this is a very
complex function.

This patch is against Martin Petersen's 4.17/scsi-queue branch.


Note: to apply this patchset to the lk 4.15 production kernels (tested
on lk 4.15.13), these two lines near the end of scsi_io_completion():
   __scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY, 0);
and
   __scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY, 0);

need that trailing 0 changed to 'false', prior to applying 4/7 .

Douglas Gilbert (7):
  scsi_io_completion comment on end_request return
  scsi_io_completion rename variables
  scsi_io_completion_nz_result function added
  scsi_io_completion_action helper added
  scsi_io_completion_reprep helper added
  scsi_io_completion hints on fastpatch
  scsi_io_completion convert BUGs to WARNs

 drivers/scsi/scsi_lib.c | 378 +++-
 1 file changed, 214 insertions(+), 164 deletions(-)

-- 
2.14.1



[PATCH v4 6/7] scsi_io_completion hints on fastpatch

2018-03-28 Thread Douglas Gilbert
Add likely() and unlikely() hints to conditionals on or near the fastpath.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
A reviewer wanted any performance improvement (or otherwise) quantified.
The improvement was so small, that ftrace ignored it. Inline timing code
suggests the improvement from this whole patchset is around 7 nanoseconds
per invocation (tested on a Lenovo X270 (i5-7200U)). Not exactly huge.
Another win might be the smaller size of scsi_io_completion() after the
refactoring; this should allow more other code to fit in the instruction
cache.

 drivers/scsi/scsi_lib.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7bf7d3c4b9c2..a9676d824452 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1028,17 +1028,17 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
struct request *req = cmd->request;
blk_status_t blk_stat = BLK_STS_OK;
 
-   if (result) /* does not necessarily mean there is an error */
+   if (unlikely(result))   /* a nz result may or may not be an error */
result = scsi_io_completion_nz_result(cmd, result, _stat);
 
-   if (blk_rq_is_passthrough(req)) {
+   if (unlikely(blk_rq_is_passthrough(req))) {
/*
 * __scsi_error_from_host_byte may have reset the host_byte
 */
scsi_req(req)->result = cmd->result;
scsi_req(req)->resid_len = scsi_get_resid(cmd);
 
-   if (scsi_bidi_cmnd(cmd)) {
+   if (unlikely(scsi_bidi_cmnd(cmd))) {
/*
 * Bidi commands Must be complete as a whole,
 * both sides at once.
@@ -1051,7 +1051,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
}
}
 
-   /* no bidi support for !blk_rq_is_passthrough yet */
+   /* no bidi support yet, other than in pass-through */
BUG_ON(blk_bidi_rq(req));
 
/*
@@ -1067,15 +1067,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * handle. Failed, zero length commands always need to drop down
 * to retry code. Fast path should return in this block.
 */
-   if (blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK) {
-   if (!scsi_end_request(req, blk_stat, good_bytes, 0))
-   return; /* no bytes remaining */
+   if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) {
+   if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0)))
+   return; /* no bytes remaining */
}
 
-   /*
-* Kill remainder if no retrys.
-*/
-   if (blk_stat && scsi_noretry_cmd(cmd)) {
+   /* Kill remainder if no retries */
+   if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
BUG();
return;
@@ -1085,7 +1083,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * If there had been no error, but we have leftover bytes in the
 * requeues just queue the command up again.
 */
-   if (result == 0)
+   if (likely(result == 0))
scsi_io_completion_reprep(cmd, q);
else
scsi_io_completion_action(cmd, result);
-- 
2.14.1



[PATCH v4 5/7] scsi_io_completion_reprep helper added

2018-03-28 Thread Douglas Gilbert
Since the action "reprep" is called from two places, rather than repeat
the code, make a new scsi_io_completion helper with "reprep" as its
suffix.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
 drivers/scsi/scsi_lib.c | 41 ++---
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 048b144de5b0..7bf7d3c4b9c2 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,20 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/* Helper for scsi_io_completion() when "reprep" action required. */
+static void scsi_io_completion_reprep(struct scsi_cmnd *cmd,
+ struct request_queue *q)
+{
+   /* A new command will be prepared and issued. */
+   if (q->mq_ops) {
+   scsi_mq_requeue_cmd(cmd);
+   } else {
+   /* Unprep request and put it back at head of the queue. */
+   scsi_release_buffers(cmd);
+   scsi_requeue_command(q, cmd);
+   }
+}
+
 /* Helper for scsi_io_completion() when special action required. */
 static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
 {
@@ -892,15 +906,7 @@ static void scsi_io_completion_action(struct scsi_cmnd 
*cmd, int result)
return;
/*FALLTHRU*/
case ACTION_REPREP:
-   /* Unprep the request and put it back at the head of the queue.
-* A new command will be prepared and issued.
-*/
-   if (q->mq_ops)
-   scsi_mq_requeue_cmd(cmd);
-   else {
-   scsi_release_buffers(cmd);
-   scsi_requeue_command(q, cmd);
-   }
+   scsi_io_completion_reprep(cmd, q);
break;
case ACTION_RETRY:
/* Retry the same command immediately */
@@ -1079,20 +1085,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * If there had been no error, but we have leftover bytes in the
 * requeues just queue the command up again.
 */
-   if (result == 0) {
-   /*
-* Unprep the request and put it back at the head of the
-* queue. A new command will be prepared and issued.
-* This block is the same as case ACTION_REPREP in
-* scsi_io_completion_action() above.
-*/
-   if (q->mq_ops)
-   scsi_mq_requeue_cmd(cmd);
-   else {
-   scsi_release_buffers(cmd);
-   scsi_requeue_command(q, cmd);
-   }
-   } else
+   if (result == 0)
+   scsi_io_completion_reprep(cmd, q);
+   else
scsi_io_completion_action(cmd, result);
 }
 
-- 
2.14.1



Re: How to Locate drive directly attached to mpt3sas HBA

2018-03-19 Thread Douglas Gilbert

On 2018-03-19 11:40 AM, Jack Wang wrote:

Hi list,

Any one knows how can I locate a HDD directly attached to mpt3sas,
sas3ircu only supports LOCATE commd to locates driver installed in a
disk enclosure, but not directly attached.

I know microsemi/PMCs supports SGPIO interface to locate drive eg:
  Adp80xxapp sgpio 0 set 0 1

I searched in latest upstream mpt3sas code, didn't find such
interface. Do I miss something?


Hi,
If its a SAS disk, it might have an onboard LED.

In the sdparm package there is a script called "sas_disk_blink"
that will flash that LED.


And I think either mpt2sas and/or mpt3sas HBAs (I don't have my hardware
nearby) have a SMP target hidden away somewhere. Perhaps someone from
LSI/Avago/Broadcom could supply more information about that.

Doug Gilbert



[PATCH v2 6/6] scsi_io_completion convert BUGs to WARNs

2018-03-18 Thread Douglas Gilbert
The scsi_io_completion function contains three BUG() and BUG_ON() calls.
Replace them with WARN variants.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
This was done at a reviewer's request. This patchset did not change those
BUG() lines but the git-diff suggested it did. Then the checkpatch.pl
warned "BUG() calls considered harmful". BUG() calls are simpler than
WARNs because the programmer doesn't have to worry about a continuation
after them. The continuations given here may not be ideal. For these
reasons the author considers this patch optional.

 drivers/scsi/scsi_lib.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 63f79b0dbbf3..be8836d984f5 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1048,13 +1048,20 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid;
if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req),
blk_rq_bytes(req->next_rq)))
-   BUG();
+   WARN(true, "Bidi command with remaining bytes");
return;
}
}
 
/* no bidi support yet, other than in pass-through */
-   BUG_ON(blk_bidi_rq(req));
+   if (unlikely(blk_bidi_rq(req))) {
+   WARN_ONCE(true, "Only support bidi command in passthrough");
+   scmd_printk(KERN_ERR, cmd, "Killing bidi command\n");
+   if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req),
+blk_rq_bytes(req->next_rq)))
+   WARN(true, "Bidi command with remaining bytes");
+   return;
+   }
 
/*
 * Next deal with any sectors which we were able to correctly
@@ -1077,7 +1084,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
/* Kill remainder if no retrys  */
if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
-   BUG();
+   WARN(true,
+   "Bytes remaining after failed, no-retry command");
return;
}
 
-- 
2.14.1



[PATCH v2 2/6] scsi_io_completion rename variables

2018-03-18 Thread Douglas Gilbert
Change some variable names and associated comments for clarity. Correct
some misleading comments.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
 drivers/scsi/scsi_lib.c | 57 ++---
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d44ee84df091..a4da5773d42d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -777,18 +777,19 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
  *be put back on the queue and retried using the same
  *command as before, possibly after a delay.
  *
- * c) We can call scsi_end_request() with -EIO to fail
- *the remainder of the request.
+ * c) We can call scsi_end_request() with blk_stat other than
+ *BLK_STS_OK, to fail the remainder of the request.
  */
 void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 {
int result = cmd->result;
struct request_queue *q = cmd->device->request_queue;
struct request *req = cmd->request;
-   blk_status_t error = BLK_STS_OK;
+   blk_status_t blk_stat = BLK_STS_OK; /* u8: BLK_STS_OK is only 0 */
struct scsi_sense_hdr sshdr;
bool sense_valid = false;
-   int sense_deferred = 0, level = 0;
+   bool sense_current = true;  /* false implies "deferred sense" */
+   int level = 0;
enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
  ACTION_DELAYED_RETRY} action;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
@@ -796,7 +797,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
if (result) {
sense_valid = scsi_command_normalize_sense(cmd, );
if (sense_valid)
-   sense_deferred = scsi_sense_is_deferred();
+   sense_current = !scsi_sense_is_deferred();
}
 
if (blk_rq_is_passthrough(req)) {
@@ -809,8 +810,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
min(8 + cmd->sense_buffer[7],
SCSI_SENSE_BUFFERSIZE);
}
-   if (!sense_deferred)
-   error = __scsi_error_from_host_byte(cmd, 
result);
+   if (sense_current)
+   blk_stat = __scsi_error_from_host_byte(cmd,
+  result);
}
/*
 * __scsi_error_from_host_byte may have reset the host_byte
@@ -829,13 +831,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
BUG();
return;
}
-   } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
+   } else if (blk_rq_bytes(req) == 0 && result && sense_current) {
/*
 * Flush commands do not transfers any data, and thus cannot use
 * good_bytes != blk_rq_bytes(req) as the signal for an error.
-* This sets the error explicitly for the problem case.
+* This sets blk_stat explicitly for the problem case.
 */
-   error = __scsi_error_from_host_byte(cmd, result);
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
}
 
/* no bidi support for !blk_rq_is_passthrough yet */
@@ -865,8 +867,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
else if (!(req->rq_flags & RQF_QUIET))
scsi_print_sense(cmd);
result = 0;
-   /* for passthrough error may be set */
-   error = BLK_STS_OK;
+   /* for passthrough, blk_stat may be set */
+   blk_stat = BLK_STS_OK;
}
/*
 * Another corner case: the SCSI status byte is non-zero but 'good'.
@@ -877,23 +879,24 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 */
if (status_byte(result) && scsi_status_is_good(result)) {
result = 0;
-   error = BLK_STS_OK;
+   blk_stat = BLK_STS_OK;
}
 
/*
-* special case: failed zero length commands always need to
-* drop down into the retry code. Otherwise, if we finished
-* all bytes in the request we are done now.
+* Next deal with any sectors which we were able to correctly
+* handle. Failed, zero length commands always need to drop down
+* to retry code. Fast path should return in this block.
 

[PATCH v2 0/6] scsi_io_completion cleanup

2018-03-18 Thread Douglas Gilbert
While working of this "[PATCH v4] Make SCSI Status CONDITION MET
equivalent to GOOD" the author had trouble finding how to add another
corner case check in the scsi_io_completion() function (scsi_lib.c).
That function is executed at the completion of every SCSI command but
only 20 or so of its hundred of lines of code are executed for the vast
majority of invocations (i.e. the fastpath).

This patch refactors scsi_io_completion() by taking the bulk of its
error processing code into two static helper functions, leaving only
the 20 or so code lines that constitute the fastpath. In this process
comments were added and tweaked, plus variables renamed. The last two
patches in this set are optional: adding conditional hints along the
fastpath and converting 3 BUG() calls in scsi_io_completion() to WARNs
as requested by a reviewer.

There is no attempt in this patchset to change the logic of the original
scsi_io_completion(). Some conditional checks are saved by reducing the
number of redundant tests (e.g. on the 'result' variable being non-zero). 
Also De Morgan's laws are applied to some complex conditions to simplify
them from the reader's perspective. The fact remains, this is a very
complex function.

This patch is against Martin Petersen's 4.17/scsi-queue branch.

Note to reviewers: a patchset that refactors code, in this case one very
long function into one short function and two longish helpers, does not
"own" the logic in the original function. The main thing to review is
whether the refactoring is worthwhile and whether the logic remains the
same after the refactoring. 

Douglas Gilbert (6):
  scsi_io_completion comment on end_request return
  scsi_io_completion rename variables
  scsi_io_completion_nz_result function added
  scsi_io_completion_action helper added
  scsi_io_completion hints on fastpatch
  scsi_io_completion convert BUGs to WARNs

 drivers/scsi/scsi_lib.c | 376 
 1 file changed, 219 insertions(+), 157 deletions(-)

-- 
2.14.1



[PATCH v2 5/6] scsi_io_completion hints on fastpatch

2018-03-18 Thread Douglas Gilbert
Add likely() and unlikely() hints to conditionals on or near the fastpath.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
A reviewer wanted any performance improvement (or otherwise) quantified.
The improvement was so small, that ftrace ignored it. Inline timing code
suggests the improvement from this whole patchset is around 7 nanoseconds
per invocation (tested on a Lenovo X270 (i5-7200U)). Not exactly huge.
Another win might be the smaller size of scsi_io_completion() after the
refactoring; this should allow more other code to fit in the instruction
cache.

 drivers/scsi/scsi_lib.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0e2dcdb63fcd..63f79b0dbbf3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1020,7 +1020,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
struct request *req = cmd->request;
blk_status_t blk_stat = BLK_STS_OK; /* u8: BLK_STS_OK is only 0 */
 
-   if (result) {   /* does not necessarily mean there is an error */
+   if (unlikely(result)) { /* a nz result may or may not be an error */
blk_stat = scsi_io_completion_nz_result(cmd, result);
if (blk_stat == BLK_STS_OK)
result = 0; /* suppress error processing now */
@@ -1033,14 +1033,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 
}
 
-   if (blk_rq_is_passthrough(req)) {
+   if (unlikely(blk_rq_is_passthrough(req))) {
/*
 * __scsi_error_from_host_byte may have reset the host_byte
 */
scsi_req(req)->result = cmd->result;
scsi_req(req)->resid_len = scsi_get_resid(cmd);
 
-   if (scsi_bidi_cmnd(cmd)) {
+   if (unlikely(scsi_bidi_cmnd(cmd))) {
/*
 * Bidi commands Must be complete as a whole,
 * both sides at once.
@@ -1053,7 +1053,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
}
}
 
-   /* no bidi support for !blk_rq_is_passthrough yet */
+   /* no bidi support yet, other than in pass-through */
BUG_ON(blk_bidi_rq(req));
 
/*
@@ -1069,15 +1069,13 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * handle. Failed, zero length commands always need to drop down
 * to retry code. Fast path should return in this block.
 */
-   if (blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK) {
-   if (!scsi_end_request(req, blk_stat, good_bytes, 0))
-   return; /* no bytes remaining */
+   if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) {
+   if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0)))
+   return; /* no bytes remaining */
}
 
-   /*
-* Kill remainder if no retrys.
-*/
-   if (blk_stat && scsi_noretry_cmd(cmd)) {
+   /* Kill remainder if no retrys  */
+   if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
BUG();
return;
@@ -1087,14 +1085,14 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 * If there had been no error, but we have leftover bytes in the
 * requeues just queue the command up again.
 */
-   if (result == 0) {
+   if (likely(result == 0)) {
/*
 * Unprep the request and put it back at the head of the
 * queue. A new command will be prepared and issued.
 * This block is the same as case ACTION_REPREP in
 * scsi_io_completion_action() above.
 */
-   if (q->mq_ops)
+   if (likely(q->mq_ops))
scsi_mq_requeue_cmd(cmd);
else {
scsi_release_buffers(cmd);
-- 
2.14.1



[PATCH v2 1/6] scsi_io_completion comment on end_request return

2018-03-18 Thread Douglas Gilbert
scsi_end_request() is called multiple times from scsi_io_completion()
which branches on its bool returned value. Add comment before the static
definition of scsi_end_request() about the meaning of that return.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
The reader might think that if scsi_end_request() had actually ended the
request (i.e. no bytes remaining) it would return true. If so, the reader
would be misled.

 drivers/scsi/scsi_lib.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 393f9db8f41b..d44ee84df091 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -662,6 +662,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd)
cmd->request->next_rq->special = NULL;
 }
 
+/* Returns false when no more bytes to process, true if there are more */
 static bool scsi_end_request(struct request *req, blk_status_t error,
unsigned int bytes, unsigned int bidi_bytes)
 {
-- 
2.14.1



[PATCH v2 4/6] scsi_io_completion_action helper added

2018-03-18 Thread Douglas Gilbert
Place scsi_io_completion()'s complex error processing associated with a
local enumeration into a static helper function. That enumeration's
values start with "ACTION_" so use the suffix "_action" in the helper
function's name.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
Reviewed-by: Johannes Thumshirn <jthumsh...@suse.de>
---
Replicate the ACTION_REPREP case code since the former 'goto requeue'
would now need to jump into another function which is not permitted.

 drivers/scsi/scsi_lib.c | 322 ++--
 1 file changed, 172 insertions(+), 150 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9731d0d3cc80..0e2dcdb63fcd 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,168 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/* Helper for scsi_io_completion() when special action required. */
+static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
+{
+   struct request_queue *q = cmd->device->request_queue;
+   struct request *req = cmd->request;
+   int level = 0;
+   enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
+ ACTION_DELAYED_RETRY} action;
+   unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
+   struct scsi_sense_hdr sshdr;
+   bool sense_valid_and_current = false;
+   blk_status_t blk_stat;  /* u8, BLK_STS_OK is only 0 */
+
+   /* sense not about current command is termed: deferred */
+   if (scsi_command_normalize_sense(cmd, ) &&
+   !scsi_sense_is_deferred())
+   sense_valid_and_current = true;
+
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
+
+   if (host_byte(result) == DID_RESET) {
+   /* Third party bus reset or reset for error recovery
+* reasons.  Just retry the command and see what
+* happens.
+*/
+   action = ACTION_RETRY;
+   } else if (sense_valid_and_current) {
+   switch (sshdr.sense_key) {
+   case UNIT_ATTENTION:
+   if (cmd->device->removable) {
+   /* Detected disc change.  Set a bit
+* and quietly refuse further access.
+*/
+   cmd->device->changed = 1;
+   action = ACTION_FAIL;
+   } else {
+   /* Must have been a power glitch, or a
+* bus reset.  Could not have been a
+* media change, so we just retry the
+* command and see what happens.
+*/
+   action = ACTION_RETRY;
+   }
+   break;
+   case ILLEGAL_REQUEST:
+   /* If we had an ILLEGAL REQUEST returned, then
+* we may have performed an unsupported
+* command.  The only thing this should be
+* would be a ten byte read where only a six
+* byte read was supported.  Also, on a system
+* where READ CAPACITY failed, we may have
+* read past the end of the disk.
+*/
+   if ((cmd->device->use_10_for_rw &&
+   sshdr.asc == 0x20 && sshdr.ascq == 0x00) &&
+   (cmd->cmnd[0] == READ_10 ||
+cmd->cmnd[0] == WRITE_10)) {
+   /* This will issue a new 6-byte command. */
+   cmd->device->use_10_for_rw = 0;
+   action = ACTION_REPREP;
+   } else if (sshdr.asc == 0x10) /* DIX */ {
+   action = ACTION_FAIL;
+   blk_stat = BLK_STS_PROTECTION;
+   /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
+   } else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
+   action = ACTION_FAIL;
+   blk_stat = BLK_STS_TARGET;
+   } else
+   action = ACTION_FAIL;
+   break;
+   case ABORTED_COMMAND:
+   action = ACTION_FAIL;
+   if (sshdr.asc == 0x10) /* DIF */
+   blk_stat = BLK_STS_PROTECTION;
+   break;
+   case NOT_READY:
+   /* If the device is in the process of becoming
+* ready, 

[PATCH v2 3/6] scsi_io_completion_nz_result function added

2018-03-18 Thread Douglas Gilbert
Break out several interwined paths when cmd->result is non zero and
place them in scsi_io_completion_nz_result helper function. The logic
is not changed.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
A reviewer requested the original helper function's two return values
be reduced to one: the blk_stat variable. This required a hack to
differentiate the default setting of blk_stat (BLK_STS_OK) from the case
when the helper assigns BLK_STS_OK as the return value. The hack is to
return the otherwise unused BLK_STS_NOTSUPP value as an indication that
the helper didn't change anything.

 drivers/scsi/scsi_lib.c | 134 +---
 1 file changed, 82 insertions(+), 52 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a4da5773d42d..9731d0d3cc80 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -751,6 +751,77 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
 }
 
+/*
+ * Helper for scsi_io_completion() when cmd->result is non-zero. Returns
+ * BLK_STS_NOTSUPP if this function does not change blk_status .
+ */
+static blk_status_t scsi_io_completion_nz_result(struct scsi_cmnd *cmd,
+int result)
+{
+   bool sense_valid;
+   bool about_current;
+   /* __scsi_error_from_host_byte() does not return BLK_STS_NOTSUPP */
+   blk_status_t blk_stat = BLK_STS_NOTSUPP;
+   struct request *req = cmd->request;
+   struct scsi_sense_hdr sshdr;
+
+   sense_valid = scsi_command_normalize_sense(cmd, );
+   about_current = sense_valid ? !scsi_sense_is_deferred() : true;
+
+   if (blk_rq_is_passthrough(req)) {
+   if (sense_valid) {
+   /*
+* SG_IO wants current and deferred errors
+*/
+   scsi_req(req)->sense_len =
+   min(8 + cmd->sense_buffer[7],
+   SCSI_SENSE_BUFFERSIZE);
+   }
+   if (about_current)
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
+   } else if (blk_rq_bytes(req) == 0 && about_current) {
+   /*
+* Flush commands do not transfers any data, and thus cannot use
+* good_bytes != blk_rq_bytes(req) as the signal for an error.
+* This sets blk_stat explicitly for the problem case.
+*/
+   blk_stat = __scsi_error_from_host_byte(cmd, result);
+   }
+   /*
+* Recovered errors need reporting, but they're always treated as
+* success, so fiddle the result code here.  For passthrough requests
+* we already took a copy of the original into sreq->result which
+* is what gets returned to the user
+*/
+   if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) {
+   bool do_print = true;
+   /*
+* if ATA PASS-THROUGH INFORMATION AVAILABLE [0x0, 0x1d]
+* skip print since caller wants ATA registers. Only occurs
+* on SCSI ATA PASS_THROUGH commands when CK_COND=1
+*/
+   if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
+   do_print = false;
+   else if (req->rq_flags & RQF_QUIET)
+   do_print = false;
+   if (do_print)
+   scsi_print_sense(cmd);
+   /* for passthrough, blk_stat may be set */
+   blk_stat = BLK_STS_OK;
+   }
+   /*
+* Another corner case: the SCSI status byte is non-zero but 'good'.
+* Example: PRE-FETCH command returns SAM_STAT_CONDITION_MET when
+* it is able to fit nominated LBs in its cache (and SAM_STAT_GOOD
+* if it can't fit). Treat SAM_STAT_CONDITION_MET and the related
+* intermediate statuses (both obsolete in SAM-4) as good.
+*/
+   if (status_byte(result) && scsi_status_is_good(result))
+   blk_stat = BLK_STS_OK;
+
+   return blk_stat;
+}
+
 /*
  * Function:scsi_io_completion()
  *
@@ -794,26 +865,23 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
  ACTION_DELAYED_RETRY} action;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
 
-   if (result) {
+   if (result) {   /* does not necessarily mean there is an error */
sense_valid = scsi_command_normalize_sense(cmd, );
if (sense_valid)
sense_current = !scsi_sense_is_deferred();
+   blk_stat = scsi_io_completion_nz_result(cmd, result);
+   if (blk_stat == BLK_STS_OK)
+   result = 0; /* suppress error processing now */
+   /*
+* if scsi_io_complet

Re: [PATCH] scsi: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()

2018-02-26 Thread Douglas Gilbert

On 2018-02-26 02:39 AM, Hannes Reinecke wrote:

When converting __scsi_error_from_host_byte() to BLK_STS error codes
the case DID_OK was forgotten, resulting in it always returning
an error.

Fixes: 2a842acab109 ("block: introduce new block status code type")
Cc: Doug Gilbert <dgilb...@interlog.com>
Signed-off-by: Hannes Reinecke <h...@suse.com>


Reviewed-by: Douglas Gilbert <dgilb...@interlog.com>


---
  drivers/scsi/scsi_lib.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index aea5a1ae318b..11d63136f0bd 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -720,6 +720,8 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
int result)
  {
switch (host_byte(result)) {
+   case DID_OK:
+   return BLK_STS_OK;
case DID_TRANSPORT_FAILFAST:
return BLK_STS_TRANSPORT;
case DID_TARGET_FAILURE:





Re: [PATCH v2] Make SCSI Status CONDITION MET equivalent to GOOD

2018-02-26 Thread Douglas Gilbert

On 2018-02-26 02:25 AM, Hannes Reinecke wrote:

On 02/25/2018 07:29 PM, Douglas Gilbert wrote:

The SCSI PRE-FETCH (10 or 16) command is present both on hard disks
and some SSDs. It is useful when the address of the next block(s) to
be read is known but it is not following the LBA of the current READ
(so read-ahead won't help). It returns two "good" SCSI Status values.
If the requested blocks have fitted (or will most likely fit (when
the IMMED bit is set)) into the disk's cache, it returns CONDITION
MET. If it didn't (or will not) fit then it returns GOOD status.

The primary goal of this patch is to stop the SCSI subsystem treating
the CONDITION MET SCSI status as an error. The current state makes the
PRE-FETCH command effectively unusable via pass-throughs. The hunt to
find where the erroneous decision was made led to the
scsi_io_completion() function in scsi_lib.c . This is a complicated
function made more complicated by the intertwining of good and error
(or special case) processing paths.

Future work: if the sd driver was to use the PRE-FETCH command,
decide whether it and/or the block layer needs to know about the
two different "good" statuses. If so a mechanism would be needed
to do that.

ChangeLog to v2:
   - further restructure the code, place some early non-zero
 result processing in a new helper function:
 scsi_io_completion_nz_result()
   - this reduces the number of error checks on the zero result
 path (the fast path) at the expense of some extra work for
 the non-zero result processing
   - rename some variables to make the code a little clearer

ChangeLog to v1:
   - expand scsi_status_is_good() to check for CONDITION MET
   - add another corner case in scsi_io_completion() adjacent
 to the one for the RECOVERED ERROR sense key case. That
 is another "non-error"
   - structure code so extra checks are only on the error
 path (i.e. when cmd->result is non-zero)

This patch is against mkp's 4.17/scsi-queue branch. It also applies
to lk 4.15.x where it was re-tested on a SAS SSD.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
  drivers/scsi/scsi_lib.c | 140 +---
  include/scsi/scsi.h |   2 +
  2 files changed, 87 insertions(+), 55 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index aea5a1ae318b..e1e974f08d52 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -738,6 +738,73 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
}
  }
  
+/* Helper for scsi_io_completion() when cmd->result is non-zero. */

+static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd,
+   blk_status_t *blk_statp)
+{
+   bool sense_valid;
+   bool about_current;
+   int result = cmd->result;
+   struct request *req = cmd->request;
+   struct scsi_sense_hdr sshdr;
+
+   sense_valid = scsi_command_normalize_sense(cmd, );
+   about_current = sense_valid ? !scsi_sense_is_deferred() : true;
+
+   if (blk_rq_is_passthrough(req)) {
+   if (sense_valid) {
+   /*
+* SG_IO wants current and deferred errors
+*/
+   scsi_req(req)->sense_len =
+   min(8 + cmd->sense_buffer[7],
+   SCSI_SENSE_BUFFERSIZE);
+   }
+   if (about_current)
+   *blk_statp = __scsi_error_from_host_byte(cmd, result);
+   } else if (blk_rq_bytes(req) == 0 && about_current) {
+   /*
+* Flush commands do not transfers any data, and thus cannot use
+* good_bytes != blk_rq_bytes(req) as the signal for an error.
+* This sets the error explicitly for the problem case.
+*/
+   *blk_statp = __scsi_error_from_host_byte(cmd, result);
+   }
+   /*
+* Recovered errors need reporting, but they're always treated as
+* success, so fiddle the result code here.  For passthrough requests
+* we already took a copy of the original into sreq->result which
+* is what gets returned to the user
+*/
+   if (sense_valid && (sshdr.sense_key == RECOVERED_ERROR)) {
+   /*
+* if ATA PASS-THROUGH INFORMATION AVAILABLE skip
+* print since caller wants ATA registers. Only occurs
+* on SCSI ATA PASS_THROUGH commands when CK_COND=1
+*/
+   if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
+   ;
+   else if (!(req->rq_flags & RQF_QUIET))
+   scsi_print_sense(cmd);
+   result = 0;
+   *blk_statp = BLK_STS_OK;
+   /* for passthrough, blk_stat may be set 

[PATCH v2 1/2] scsi_io_completion split, fix CONDITION MET handling

2018-02-27 Thread Douglas Gilbert
The "ChangeLog for v1" section in 0/2 (the cover letter) of this patch set
outlines the changes in this patch.

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
 drivers/scsi/scsi_lib.c | 297 +---
 include/scsi/scsi.h |   2 +
 2 files changed, 181 insertions(+), 118 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index aea5a1ae318b..8074d0a14391 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -739,103 +739,41 @@ static blk_status_t __scsi_error_from_host_byte(struct 
scsi_cmnd *cmd,
 }
 
 /*
- * Function:scsi_io_completion()
- *
- * Purpose: Completion processing for block device I/O requests.
- *
- * Arguments:   cmd   - command that is finished.
- *
- * Lock status: Assumed that no lock is held upon entry.
- *
- * Returns: Nothing
- *
- * Notes:   We will finish off the specified number of sectors.  If we
- * are done, the command block will be released and the queue
- * function will be goosed.  If we are not done then we have to
- * figure out what to do next:
- *
- * a) We can call scsi_requeue_command().  The request
- *will be unprepared and put back on the queue.  Then
- *a new command will be created for it.  This should
- *be used if we made forward progress, or if we want
- *to switch from READ(10) to READ(6) for example.
- *
- * b) We can call __scsi_queue_insert().  The request will
- *be put back on the queue and retried using the same
- *command as before, possibly after a delay.
- *
- * c) We can call scsi_end_request() with -EIO to fail
- *the remainder of the request.
+ * Helper for scsi_io_completion() when cmd->result is non-zero. Returns
+ * BLK_STS_NOTSUPP if this function does not change blk_status .
  */
-void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
+static blk_status_t scsi_io_completion_nz_result(struct scsi_cmnd *cmd,
+int result)
 {
-   int result = cmd->result;
-   struct request_queue *q = cmd->device->request_queue;
+   bool sense_valid;
+   bool about_current;
+   /* __scsi_error_from_host_byte() does not return BLK_STS_NOTSUPP */
+   blk_status_t blk_stat = BLK_STS_NOTSUPP;
struct request *req = cmd->request;
-   blk_status_t error = BLK_STS_OK;
struct scsi_sense_hdr sshdr;
-   bool sense_valid = false;
-   int sense_deferred = 0, level = 0;
-   enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
- ACTION_DELAYED_RETRY} action;
-   unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
 
-   if (result) {
-   sense_valid = scsi_command_normalize_sense(cmd, );
-   if (sense_valid)
-   sense_deferred = scsi_sense_is_deferred();
-   }
+   sense_valid = scsi_command_normalize_sense(cmd, );
+   about_current = sense_valid ? !scsi_sense_is_deferred() : true;
 
if (blk_rq_is_passthrough(req)) {
-   if (result) {
-   if (sense_valid) {
-   /*
-* SG_IO wants current and deferred errors
-*/
-   scsi_req(req)->sense_len =
-   min(8 + cmd->sense_buffer[7],
-   SCSI_SENSE_BUFFERSIZE);
-   }
-   if (!sense_deferred)
-   error = __scsi_error_from_host_byte(cmd, 
result);
-   }
-   /*
-* __scsi_error_from_host_byte may have reset the host_byte
-*/
-   scsi_req(req)->result = cmd->result;
-   scsi_req(req)->resid_len = scsi_get_resid(cmd);
-
-   if (scsi_bidi_cmnd(cmd)) {
+   if (sense_valid) {
/*
-* Bidi commands Must be complete as a whole,
-* both sides at once.
+* SG_IO wants current and deferred errors
 */
-   scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid;
-   if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req),
-   blk_rq_bytes(req->next_rq)))
-   BUG();
-   return;
+   scsi_req(req)->sense_len =
+   min(8 + cmd->sense_buffer[7],
+   SCSI_SENSE_BUFFERSIZE);
}
-   } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
+  

[PATCH v2 2/2] scsi_io_completion convert BUG to WARN calls

2018-02-27 Thread Douglas Gilbert
ChangeLog:
  - convert 3 BUG calls to WARN calls
  - un-invert some conditional logic to uncover the real fast path
  - try to improve the comments, including noting what the bool return
value from scsi_end_request() means

Signed-off-by: Douglas Gilbert <dgilb...@interlog.com>
---
 drivers/scsi/scsi_lib.c | 45 ++---
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8074d0a14391..9ebcf6f15d1d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -650,6 +650,7 @@ static void scsi_release_bidi_buffers(struct scsi_cmnd *cmd)
cmd->request->next_rq->special = NULL;
 }
 
+/* Returns false when no more bytes to process, true if there are more */
 static bool scsi_end_request(struct request *req, blk_status_t error,
unsigned int bytes, unsigned int bidi_bytes)
 {
@@ -1030,38 +1031,44 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 */
scsi_req(req->next_rq)->resid_len = scsi_in(cmd)->resid;
if (scsi_end_request(req, BLK_STS_OK, blk_rq_bytes(req),
-   blk_rq_bytes(req->next_rq)))
-   BUG();
+blk_rq_bytes(req->next_rq)))
+   WARN(true, "Bidi command with remaining bytes");
return;
}
}
 
-   /* no bidi support for !blk_rq_is_passthrough yet */
-   BUG_ON(blk_bidi_rq(req));
+   /* no bidi support yet, other than in pass-through */
+   if (unlikely(blk_bidi_rq(req))) {
+   WARN_ONCE(true, "Only support bidi command in passthrough");
+   scmd_printk(KERN_ERR, cmd, "Killing bidi command\n");
+   if (scsi_end_request(req, BLK_STS_IOERR, blk_rq_bytes(req),
+blk_rq_bytes(req->next_rq)))
+   WARN(true, "Bidi command with remaining bytes");
+   return;
+   }
 
-   /*
-* Next deal with any sectors which we were able to correctly
-* handle.
-*/
SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, cmd,
"%u sectors total, %d bytes done.\n",
blk_rq_sectors(req), good_bytes));
 
/*
-* special case: failed zero length commands always need to
-* drop down into the retry code. Otherwise, if we finished
-* all bytes in the request we are done now.
+* Next deal with any sectors which we were able to correctly
+* handle. Fast path should return in this block.
 */
-   if (unlikely(!(blk_stat && blk_rq_bytes(req) == 0) &&
-!scsi_end_request(req, blk_stat, good_bytes, 0)))
-   return;
-
+   if (likely(blk_rq_bytes(req) > 0 || blk_stat == BLK_STS_OK)) {
+   if (likely(!scsi_end_request(req, blk_stat, good_bytes, 0)))
+   return; /* no bytes remaining */
+   }
/*
-* Kill remainder if no retrys.
+* This leaves failed, zero length commands plus commands with
+* some bytes remaining.
 */
+
+   /* Kill remainder if no retrys. */
if (unlikely(blk_stat && scsi_noretry_cmd(cmd))) {
if (scsi_end_request(req, blk_stat, blk_rq_bytes(req), 0))
-   BUG();
+   WARN_ONCE(true,
+"Bytes remaining after failed, no-retry command");
return;
}
 
@@ -1071,8 +1078,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
 */
if (likely(result == 0)) {
/*
-* Fast path: Unprep the request and put it back at the head
-* of the queue. A new command will be prepared and issued.
+* Unprep the request and put it back at the head of the
+* queue. A new command will be prepared and issued.
 * This block is the same as case ACTION_REPREP in
 * scsi_io_completion_action() above.
 */
-- 
2.14.1



[PATCH v2 0/2] scsi_io_completion cleanup and fix CONDITION MET handling

2018-02-27 Thread Douglas Gilbert
This patch started as an attempt to fix the erroneous handling of
CONDITION MET, a relatively rare special case. A solution meant adding
another special case to the already complicated scsi_io_completion()
function. To better understand that function the author found it
useful to refactor the function into one relatively short function
that the fast path passes through and two helper functions. These
helper functions do the bulk of the error and special case handling.

The SCSI PRE-FETCH (10 or 16) command is present both on hard disks
and some SSDs. It is useful when the address of the next block(s) to
be read is known but it is not following the LBA of the current READ
(so read-ahead won't help). It returns two "good" SCSI Status values.
If the requested blocks have fitted (or will most likely fit (when
the IMMED bit is set)) into the disk's cache, it returns CONDITION
MET. If it didn't (or will not) fit then it returns GOOD status.

Future work: if the sd driver was to use the PRE-FETCH command,
decide whether it and/or the block layer needs to know about the
two different "good" statuses. If so a mechanism would be needed
to do that.

ChangeLog for v2:
  - after checkpatch.pl warning about BUG statements and confirmed
by a reviewer, convert 3 BUG calls in scsi_io_completion()
into WARN calls

ChangeLog for v1:
  - split scsi_io_completion() into one short function that the
fast path uses, and two helper functions
  - add conditional hints on the fast path
  - rename some variables to make the code a little clearer
  - add scsi_io_completion_nz_result() helper for the initial
handling of the non-zero cmd->result case
  - add scsi_io_completion_action() helper for the 'action'
processing of errors and special cases
  - expand inline function scsi_status_is_good() to check for
CONDITION MET
  - add a corner case in scsi_io_completion_nz_result() for
CONDITION MET (and similar "good" SCSI statuses)
  - structure code so extra checks are only on the error
path (i.e. when cmd->result is non-zero)

This patch is against mkp's 4.17/scsi-queue branch. It also
 applies to lk 4.15.x where it was tested on a SAS SSD.


Douglas Gilbert (2):
  scsi_io_completion cleanup and fix CONDITION MET handling
  scsi_io_completion-change2warn

 drivers/scsi/scsi_lib.c | 304 +---
 include/scsi/scsi.h |   2 +
 2 files changed, 188 insertions(+), 118 deletions(-)

-- 
2.14.1



Re: [PATCH] bsg: convert to use blk-mq

2018-10-17 Thread Douglas Gilbert

On 2018-10-17 11:55 a.m., Benjamin Block wrote:

On Tue, Oct 16, 2018 at 08:43:01AM -0600, Jens Axboe wrote:

Requires a few changes to the FC transport class as well.

Cc: Johannes Thumshirn 
Cc: Benjamin Block 
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe 
---
  block/bsg-lib.c  | 102 +--
  drivers/scsi/scsi_transport_fc.c |  61 ++
  2 files changed, 91 insertions(+), 72 deletions(-)



Hey Jens,

I haven't had time to look into this in any deep way - but I did plan to
-, but just to see whether it starts and runs some I/O I tried giving it
a spin and came up with nothing (see line 3 and 5):

[   14.082079] scsi host0: scsi_eh_0: sleeping
[   14.082288] scsi host0: zfcp
[   14.086823] scsi host0: fc_host0: bsg interface failed to initialize - setup 
queue
[   14.089198] qdio: 0.0.1900 ZFCP on SC 107 using AI:1 QEBSM:0 PRI:1 TDD:1 
SIGA: W AP
[   15.228583]  rport-0:0-0: failed to setup bsg queue
[   15.229886] scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36
[   15.230801] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0
[   15.230860] scsi 0:0:0:0: scsi scan: INQUIRY pass 2 length 164
[   15.231171] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0
[   15.231190] scsi 0:0:0:0: scsi scan: peripheral device type of 31, no device 
added
[   15.233171] scsi 0:0:0:0: scsi scan: Sending REPORT LUNS to (try 0)
[   15.234144] scsi 0:0:0:0: scsi scan: REPORT LUNS successful (try 0) result 
0x0
[   15.234156] scsi 0:0:0:0: scsi scan: REPORT LUN scan
[   15.235040] scsi 0:0:0:1082671104: scsi scan: INQUIRY pass 1 length 36
[   15.235220] scsi host1: scsi_eh_1: sleeping
[   15.235336] scsi 0:0:0:1082671104: scsi scan: INQUIRY successful with code 
0x0
[   15.235355] scsi 0:0:0:1082671104: scsi scan: INQUIRY pass 2 length 164
[   15.235434] scsi host1: zfcp
[   15.235667] scsi 0:0:0:1082671104: scsi scan: INQUIRY successful with code 
0x0
[   15.235709] scsi 0:0:0:1082671104: Direct-Access IBM  2107900
  3230 PQ: 0 ANSI: 5
[   15.238468] scsi host1: fc_host1: bsg interface failed to initialize - setup 
queue


Seems to already fail at setting up the bsg interfaces for zFCP - at
loading time of the module. This is just your patch on top of
4.19.0-rc8.


Hi,
Almost all of the utilities in the sg3_utils package (a few exceptions:
sg_reset, sgm_dd, sgp_dd), when given a bsg file node (e.g.
'sg_inq /dev/bsg/0:0:0:1'), will use the bsg sg_v4 interface (i.e. as
defined in ). Dito with the sdparm and ddpt packages.
So there is a lot of test code for any changes to the bsg driver.

Doug Gilbert


P.S. that above is true from sg3_utils version 1.27 released on
 20090411. The current version is 1.44





[PATCH 6/8] sg: complete locking changes on ioctl+debug

2018-10-19 Thread Douglas Gilbert
Complete the locking and structure changes of ioctl and debug
('cat /proc/scsi/sg/debug') handling.

Signed-off-by: Douglas Gilbert 
---

This was the code that was "#if 0'-ed out 2 patches ago. It
also shuts checkpatch.pl up as it doesn't like that technique
but offers no viable substitute.

 drivers/scsi/sg.c | 204 +-
 1 file changed, 130 insertions(+), 74 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 45bad8159e51..9b30d6667cc9 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1004,7 +1004,6 @@ srp_state_or_detaching(struct sg_device *sdp, struct 
sg_request *srp)
return ret;
 }
 
-#if 0  /* temporary to shorten big patch */
 static long
 sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 {
@@ -1036,24 +1035,37 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, 
unsigned long arg)
return -ENXIO;
if (!access_ok(VERIFY_WRITE, p, SZ_SG_IO_HDR))
return -EFAULT;
-   result = sg_new_write(sfp, filp, p, SZ_SG_IO_HDR,
-1, read_only, 1, );
+   result = sg_v3_write(sfp, filp, p, SZ_SG_IO_HDR, read_only,
+true, );
if (result < 0)
return result;
result = wait_event_interruptible(sfp->read_wait,
-   (srp_done(sfp, srp) || atomic_read(>detaching)));
-   if (atomic_read(>detaching))
+   srp_state_or_detaching(sdp, srp));
+
+   spin_lock_irqsave(>rq_entry_lck, iflags);
+   if (unlikely(result)) { /* -ERESTARTSYS because signal hit */
+   srp->orphan = true;
+   srp->rq_state = SG_RQ_INFLIGHT;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
+   SG_LOG(1, sdp, "%s:  wait_event_interruptible-->%d\n",
+  __func__, result);
+   return result;
+   }
+   if (unlikely(atomic_read(>detaching))) {
+   srp->rq_state = SG_RQ_INACTIVE;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
return -ENODEV;
-   write_lock_irq(>rq_list_lock);
-   if (srp->done) {
-   srp->done = 2;
-   write_unlock_irq(>rq_list_lock);
+   } else if (likely(srp->rq_state == SG_RQ_AWAIT_READ)) {
+   srp->rq_state = SG_RQ_DONE_READ;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
result = sg_new_read(sfp, p, SZ_SG_IO_HDR, srp);
return (result < 0) ? result : 0;
}
-   srp->orphan = true;
-   write_unlock_irq(>rq_list_lock);
-   return result;  /* -ERESTARTSYS because signal hit process */
+   cp = sg_rq_state_str(srp->rq_state, true);
+   SG_LOG(1, sdp, "%s: unexpected srp=0x%p  state: %s\n", __func__,
+  srp, cp);
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
+   return -EPROTO; /* Logic error */
case SG_SET_TIMEOUT:
result = get_user(val, ip);
if (result)
@@ -1112,25 +1124,36 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, 
unsigned long arg)
if (!access_ok(VERIFY_WRITE, ip, sizeof(int)))
return -EFAULT;
read_lock_irqsave(>rq_list_lock, iflags);
-   list_for_each_entry(srp, >rq_list, entry) {
-   if ((1 == srp->done) && (!srp->sg_io_owned)) {
-   read_unlock_irqrestore(>rq_list_lock,
-  iflags);
-   __put_user(srp->header.pack_id, ip);
-   return 0;
+   leave = false;
+   val = -1;
+   list_for_each_entry(srp, >rq_list, rq_entry) {
+   spin_lock(>rq_entry_lck);
+   if (srp->rq_state == SG_RQ_AWAIT_READ &&
+   !srp->sync_invoc) {
+   val = srp->header.pack_id;
+   leave = true;
}
+   spin_unlock(>rq_entry_lck);
+   if (leave)
+   break;
}
read_unlock_irqrestore(>rq_list_lock, iflags);
-   __put_user(-1, ip);
+   __put_user(val, ip);
+   SG_LOG(3, sdp, "%s:SG_GET_PACK_ID=%d\n", __func__, val);
ret

[PATCH 3/8] sg: split header, expand and correct descriptions

2018-10-19 Thread Douglas Gilbert
Split scsi/sg.h into a smaller scsi/sg.h which includes a new header:
uapi/scsi/sg.h . Overall expand the twin header files with new
functionality in this patchset and functionality to be added in the
next patchset to implement SG_IOSUBMIT and friends. Adjust format to
modern kernel conventions. Correct and expand some comments.

Signed-off-by: Douglas Gilbert 
---

The new header file: uapi/scsi/sg.h is expected to be in the
kernel's public interface. This takes time (i.e. months or years)
to find its way into glibc and Linux distributions. So the sooner
it is presented and accepted the better.

>From the C perspective, nothing is removed or changed (or shouldn't
be), only additions. This means that the original typedefs stay (but
they are not used in the driver source file: sg.c) since user space
programs may be using them.

 include/scsi/sg.h  | 252 ++---
 include/uapi/scsi/sg.h | 415 +
 2 files changed, 426 insertions(+), 241 deletions(-)
 create mode 100644 include/uapi/scsi/sg.h

diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index f91bcca604e4..596f68746f66 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -7,23 +7,23 @@
 /*
  * History:
  *  Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user
- *   process control of SCSI devices.
+ *  process control of SCSI devices.
  *  Development Sponsored by Killy Corp. NY NY
  *
  * Original driver (sg.h):
- *   Copyright (C) 1992 Lawrence Foard
+ *   Copyright (C) 1992 Lawrence Foard
  * Version 2 and 3 extensions to driver:
- * Copyright (C) 1998 - 2014 Douglas Gilbert
+ *   Copyright (C) 1998 - 2018 Douglas Gilbert
  *
- *  Version: 3.5.36 (20140603)
- *  This version is for 2.6 and 3 series kernels.
+ *  Version: 3.9.01 (20181016)
+ *  This version is for 2.6, 3 and 4 series kernels.
  *
  * Documentation
  * =
  * A web site for the SG device driver can be found at:
- * http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
+ *   http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
  * The documentation for the sg version 3 driver can be found at:
- * http://sg.danny.cz/sg/p/sg_v3_ho.html
+ *   http://sg.danny.cz/sg/p/sg_v3_ho.html
  * Also see: /Documentation/scsi/scsi-generic.txt
  *
  * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html
@@ -33,242 +33,12 @@
 extern int sg_big_buff; /* for sysctl */
 #endif
 
+#include 
 
-typedef struct sg_iovec /* same structure as used by readv() Linux system */
-{   /* call. It defines one scatter-gather element. */
-void __user *iov_base;  /* Starting address  */
-size_t iov_len; /* Length in bytes  */
-} sg_iovec_t;
-
-
-typedef struct sg_io_hdr
-{
-int interface_id;   /* [i] 'S' for SCSI generic (required) */
-int dxfer_direction;/* [i] data transfer direction  */
-unsigned char cmd_len;  /* [i] SCSI command length */
-unsigned char mx_sb_len;/* [i] max length to write to sbp */
-unsigned short iovec_count; /* [i] 0 implies no scatter gather */
-unsigned int dxfer_len; /* [i] byte count of data transfer */
-void __user *dxferp;   /* [i], [*io] points to data transfer memory
- or scatter gather list */
-unsigned char __user *cmdp; /* [i], [*i] points to command to perform */
-void __user *sbp;  /* [i], [*o] points to sense_buffer memory */
-unsigned int timeout;   /* [i] MAX_UINT->no timeout (unit: millisec) */
-unsigned int flags; /* [i] 0 -> default, see SG_FLAG... */
-int pack_id;/* [i->o] unused internally (normally) */
-void __user * usr_ptr;  /* [i->o] unused internally */
-unsigned char status;   /* [o] scsi status */
-unsigned char masked_status;/* [o] shifted, masked scsi status */
-unsigned char msg_status;   /* [o] messaging level data (optional) */
-unsigned char sb_len_wr;/* [o] byte count actually written to sbp */
-unsigned short host_status; /* [o] errors from host adapter */
-unsigned short driver_status;/* [o] errors from software driver */
-int resid;  /* [o] dxfer_len - actual_transferred */
-unsigned int duration;  /* [o] time taken by cmd (unit: millisec) */
-unsigned int info;  /* [o] auxiliary information */
-} sg_io_hdr_t;  /* 64 bytes long (on i386) */
-
-#define SG_INTERFACE_ID_ORIG 'S'
-
-/* Use negative values to flag difference from original sg_header structure */
-#define SG_DXFER_NONE (-1)  /* e.g. a SCSI Test Unit Ready command */
-#define SG_DXFER_TO_DEV (-2)/* e.g. a SCSI WRITE command */
-#define SG_DXFER_FROM_DEV (-3)  /* e.g. a SCSI READ command */
-#define SG_DXFER_TO_FROM_DEV (-4) /* treated like SG_DXFER_FROM_DEV with the
-  additional property than 

[PATCH 7/8] sg: rework ioctl handling

2018-10-19 Thread Douglas Gilbert
Rework ioctl handling, report clearly to the log which ioctl has
been invoked. Add a new "IOWR" ioctl: SG_SET_GET_EXTENDED which
permits several integer and boolean values to be "_SET_" (i.e.
passed into the driver, potentially changing its actions) and/or
read from the driver (the "_GET_" part) in a single operation.

Signed-off-by: Douglas Gilbert 
---

One feature of the new SG_SET_GET_EXTENDED ioctl is ability to
fetch the sg device minor number (e.g. the "3" in /dev/sg3)
associated with the current file descriptor. A boolean addition
is the ability to change command timekeeping on the current file
descriptor from milliseconds (the default) to nanoseconds.

 drivers/scsi/sg.c | 546 +++---
 1 file changed, 414 insertions(+), 132 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 9b30d6667cc9..583846ebc5e0 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -76,11 +76,11 @@ static int sg_proc_init(void);
 #define SG_MAX_CDB_SIZE 252
 
 /* Following defines are states of sg_request::rq_state */
-#define SG_RQ_INACTIVE 0/* request not in use (e.g. on fl) */
-#define SG_RQ_INFLIGHT 1/* SCSI request issued, no response yet */
-#define SG_RQ_AWAIT_READ 2  /* response received, awaiting read */
-#define SG_RQ_DONE_READ 3   /* read is ongoing or done */
-#define SG_RQ_BUSY 4/* example: reserve request changing size */
+#define SG_RQ_INACTIVE 0   /* request not in use (e.g. on fl) */
+#define SG_RQ_INFLIGHT 1   /* SCSI request issued, no response yet */
+#define SG_RQ_AWAIT_READ 2 /* response received, awaiting read */
+#define SG_RQ_DONE_READ 3  /* read is ongoing or done */
+#define SG_RQ_BUSY 4   /* example: reserve request changing size */
 
 /* free up requests larger than this dlen size after use */
 #define SG_RQ_DATA_THRESHOLD (128 * 1024)
@@ -88,8 +88,8 @@ static int sg_proc_init(void);
 /* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */
 #define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024)
 
-#define SG_TIME_UNIT_MS 0   /* milliseconds */
-#define SG_TIME_UNIT_NS 1   /* nanoseconds */
+#define SG_TIME_UNIT_MS 0  /* milliseconds */
+#define SG_TIME_UNIT_NS 1  /* nanoseconds */
 #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
@@ -184,7 +184,6 @@ struct sg_fd {  /* holds the state of a file 
descriptor */
bool cmd_q; /* true -> allow command queuing, false -> don't */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
-   bool sse_seen;  /* SG_SET_EXTENDED ioctl seen */
bool time_in_ns;/* report times in nanoseconds */
u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
struct sg_request *reserve_srp; /* allocate on open(), starts on fl */
@@ -238,6 +237,8 @@ static void sg_remove_request(struct sg_fd *sfp, struct 
sg_request *srp);
 static struct sg_device *sg_get_dev(int min_dev);
 static void sg_device_destroy(struct kref *kref);
 static const char *sg_rq_state_str(u8 rq_state, bool long_str);
+static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first,
+   rwlock_t *rwlp, unsigned long *iflagsp);
 
 #define SZ_SG_HEADER sizeof(struct sg_header)  /* v1 and v2 header */
 #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr)  /* v3 header */
@@ -505,9 +506,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, 
loff_t * ppos)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
+   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
 
if (!access_ok(VERIFY_WRITE, buf, count))
return -EFAULT;
@@ -812,11 +813,11 @@ sg_v3_write(struct sg_fd *sfp, struct file *file, const 
char __user *buf,
return -ENOSYS;
if (hp->flags & SG_FLAG_MMAP_IO) {
if (!list_empty(>rq_list))
-   return -EBUSY;  /* already active requests on fd */
+   return -EBUSY;  /* already active requests on fd */
if (hp->dxfer_len > sfp->reserve_srp->data.dlen)
-   return -ENOMEM; /* MMAP_IO size must fit in reserve */
+   return -ENOMEM; /* MMAP_IO size must fit in reserve */
if (hp->flags & SG_FLAG_DIRECT_IO)
-   return -EINVAL; /* not both MMAP_IO and DIRECT_IO */
+   return -EINVAL; /* not both MMAP_IO and DIRECT_IO */
}
sfp->

[PATCH 5/8] sg: add free list, rework locking

2018-10-19 Thread Douglas Gilbert
Remove fixed 16 sg_request object array and replace with an active
rq_list plus a request free list. Add finer grained spin lock to
sg_request and do a major rework on locking. sg_request objects now
are only de-allocated when the owning file descriptor is closed.
This simplifies locking issues.

Signed-off-by: Douglas Gilbert 
---

This patch is big and complex. Towards the end the diff program
completely loses the plot. Better to use difftool on a two pane
window.

 drivers/scsi/sg.c | 1225 +++--
 1 file changed, 739 insertions(+), 486 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index c77701c0b939..45bad8159e51 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -139,44 +139,56 @@ struct sg_scatter_hold { /* holding area for scsi 
scatter gather info */
 struct sg_device;  /* forward declarations */
 struct sg_fd;
 
-struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
-   struct list_head entry; /* list entry */
-   struct sg_fd *parentfp; /* NULL -> not in use */
+/*
+ * For any file descriptor: at any time a sg_request object must be a member
+ * of sg_fd::rq_list or rq_free_list::rq_free_list. The only exception is
+ * within a rq_list_lock write lock when it is moving between those two lists.
+ */
+
+struct sg_request {/* active SCSI command or inactive on free list (fl) */
+   struct list_head rq_entry;  /* member of rq_list (active cmd) */
+   struct list_head free_entry;/* member of rq_free_list */
+   spinlock_t rq_entry_lck;
struct sg_scatter_hold data;/* hold buffer, perhaps scatter list */
union {
struct sg_io_hdr header;  /* see  */
-   struct sg_io_v4 hdr_v4;   /* see  */
+   struct sg_v4_hold v4_hold;/* related to  */
};
-   u8 sense_b[SCSI_SENSE_BUFFERSIZE];
-   bool hdr_v4_active; /* selector for anonymous union above */
-   bool res_used;  /* true -> use reserve buffer, false -> don't */
+   struct execute_work ew;
+   ktime_t start_ts;   /* used when sg_fd::time_in_ns is true */
+   bool v4_active; /* selector for autonomous union above */
bool orphan;/* true -> drop on sight, false -> normal */
-   bool sg_io_owned;   /* true -> packet belongs to SG_IO */
-   /* done protected by rq_list_lock */
-   char done;  /* 0->before bh, 1->before read, 2->read */
+   bool sync_invoc;/* true -> synchronous (e.g. from ioctl(SG_IO)) */
+   u8 rq_state;/* one of 5 states, see SG_RQ_* defines */
+   u8 sense_b[SCSI_SENSE_BUFFERSIZE];
+   struct sg_fd *parentfp;  /* pointer to owning fd, even when on fl */
+   struct sg_scatter_hold *d2p; /* optional 2nd data buffer for bidi */
struct request *rq;
struct bio *bio;
-   struct execute_work ew;
 };
 
-struct sg_fd { /* holds the state of a file descriptor */
-   struct list_head sfd_siblings;  /* protected by device's sfd_lock */
+struct sg_fd { /* holds the state of a file descriptor */
+   struct list_head sfd_entry; /* member sg_device::sfds list */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
-   rwlock_t rq_list_lock;  /* protect access to list in req_arr */
struct mutex f_mutex;   /* protect against changes in this fd */
+   rwlock_t rq_list_lock;  /* protect access to sg_request lists */
+   struct list_head rq_list; /* head of inflight sg_request list */
+   struct list_head rq_free_list; /* head of sg_request free list */
int timeout;/* defaults to SG_DEFAULT_TIMEOUT  */
int timeout_user;   /* defaults to SG_DEFAULT_TIMEOUT_USER */
-   struct sg_scatter_hold reserve; /* one held for this file descriptor */
-   struct list_head rq_list; /* head of request list */
-   struct fasync_struct *async_qp; /* used by asynchronous notification */
-   struct sg_request req_arr[SG_MAX_QUEUE];/* used as singly-linked list */
+   int rem_sgat_thresh;/* > this, request's sgat cleared after use */
+   u32 tot_fd_thresh;  /* E2BIG if sum_of(dlen) > this, 0: ignore */
+   u32 sum_fd_dlens;   /* when tot_fd_thresh>0 this is sum_of(dlen) */
bool force_packid;  /* true -> pack_id input to read() */
bool cmd_q; /* true -> allow command queuing, false -> don't */
-   u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
-   bool res_in_use;/* true -> 'reserve' array in use */
+   bool sse_seen;  /* SG_SET_EXTENDED ioctl seen */
+   bool 

[PATCH 8/8] sg: user control for q_at_head or tail

2018-10-19 Thread Douglas Gilbert
Add a SG_SET_GET_EXTENDED ioctl control for whether commands
will be queued_at_head or queued_at_tail by the block layer
(together with the scsi mid-level). It has file scope.

Signed-off-by: Douglas Gilbert 
---

The user can still override this setting on a per command
basis with the SG_FLAG_Q_AT_HEAD and SG_FLAG_Q_AT_TAIL
in the sg v3 and v4 structures.

 drivers/scsi/sg.c  | 35 +++
 include/uapi/scsi/sg.h |  3 ++-
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 583846ebc5e0..258923aac50d 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -93,6 +93,10 @@ static int sg_proc_init(void);
 #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
+#define SG_FD_Q_AT_TAIL true
+#define SG_FD_Q_AT_HEAD false
+#define SG_DEFAULT_Q_AT SG_FD_Q_AT_HEAD/* for backward compatibility */
+
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
 /* N.B. This variable is readable and writeable via
/proc/scsi/sg/def_reserved_size . Each time sg_open() is called a buffer
@@ -185,6 +189,7 @@ struct sg_fd {  /* holds the state of a file 
descriptor */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
bool time_in_ns;/* report times in nanoseconds */
+   bool q_at_tail; /* queue at tail if true, head when false */
u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
struct sg_request *reserve_srp; /* allocate on open(), starts on fl */
struct fasync_struct *async_qp; /* used by asynchronous notification */
@@ -894,9 +899,13 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr 
*hi_p,
srp->start_ts = ktime_get_with_offset(TK_OFFS_BOOT);
else
hp->duration = jiffies_to_msecs(jiffies);
-   /* at tail if v3 or later interface and tail flag set */
-   at_head = !(hp->interface_id != '\0' &&
-   (SG_FLAG_Q_AT_TAIL & hp->flags));
+
+   if (hp->interface_id == '\0')   /* v1 and v2 interface */
+   at_head = true; /* backward compatibility */
+   else if (sfp->q_at_tail)  /* cmd flags can override sfd setting */
+   at_head = (SG_FLAG_Q_AT_HEAD & hp->flags);
+   else/* this sfd is defaulting to head */
+   at_head = !(SG_FLAG_Q_AT_TAIL & hp->flags);
 
srp->rq->timeout = timeout;
kref_get(>f_ref); /* sg_rq_end_io() does kref_put(). */
@@ -1186,8 +1195,10 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p)
}
SG_LOG(3, sdp, "%s: wr_mask=0x%x rd_mask=0x%x\n", __func__,
   seip->valid_wr_mask, seip->valid_rd_mask);
+   /* reserved_sz (u32), read-write */
if (or_masks & SG_SEIM_RESERVED_SIZE)
result = sg_reserved_sz(sfp, seip);
+   /* rq_rem_sgat_threshold (u32), read-write [impacts re-use only] */
if (or_masks & SG_SEIM_RQ_REM_THRESH) {
if (seip->valid_wr_mask & SG_SEIM_RQ_REM_THRESH) {
uv = seip->rq_rem_sgat_thresh;
@@ -1198,6 +1209,7 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p)
if (seip->valid_rd_mask & SG_SEIM_RQ_REM_THRESH)
seip->rq_rem_sgat_thresh = sfp->rem_sgat_thresh;
}
+   /* tot_fd_thresh (u32), read-write [sum of active cmd dlen_s] */
if (or_masks & SG_SEIM_TOT_FD_THRESH) {
if (seip->valid_wr_mask & SG_SEIM_TOT_FD_THRESH) {
uv = seip->tot_fd_thresh;
@@ -1208,8 +1220,9 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p)
if (seip->valid_rd_mask & SG_SEIM_TOT_FD_THRESH)
seip->tot_fd_thresh = sfp->tot_fd_thresh;
}
+   /* check all boolean flags if either wr or rd mask set in or_mask */
if (or_masks & SG_SEIM_CTL_FLAGS) {
-   /* don't care whether wr or rd mask set in or_mask */
+   /* TIME_IN_NS boolean, read-write */
if (seip->ctl_flags_wr_mask & SG_CTL_FLAGM_TIME_IN_NS)
sfp->time_in_ns =
!!(seip->ctl_flags & SG_CTL_FLAGM_TIME_IN_NS);
@@ -1219,19 +1232,32 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p)
else
seip->ctl_flags &= ~SG_CTL_FLAGM_TIME_IN_NS;
}
+   /* ORPHANS boolean, read-only */
if (seip->ctl_flags_rd_mask & SG_CTL_FLAGM_ORPHANS) {
if (sg_any_persistent_orphans(sfp))
seip->ctl_flags |= SG_CTL_FLAGM_OR

[PATCH 1/8] sg: types and naming cleanup

2018-10-19 Thread Douglas Gilbert
Remove typedefs and use better type names like bool and u8 where
appropriate. Rename some variables and functions for clarity.
Adjust formatting (e.g. function definitions) to be more
consistent across the driver.

Signed-off-by: Douglas Gilbert 
---

The intention is to move to sg_version_num 40001 when a second
patchset implementing SG_IOSUBMIT and friends is applied. In
the meantime, move from the latest version number in the kernel
(30536) to 30901 to indicate a significant change but not yet
implementing the sg v4 interface.

 drivers/scsi/sg.c | 827 ++
 1 file changed, 467 insertions(+), 360 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index c6ad00703c5b..7e723f37a269 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -7,7 +7,7 @@
  * Original driver (sg.c):
  *Copyright (C) 1992 Lawrence Foard
  * Version 2 and 3 extensions to driver:
- *Copyright (C) 1998 - 2014 Douglas Gilbert
+ *Copyright (C) 1998 - 2018 Douglas Gilbert
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -16,16 +16,9 @@
  *
  */
 
-static int sg_version_num = 30536; /* 2 digits for each component */
-#define SG_VERSION_STR "3.5.36"
+static int sg_version_num = 30901; /* 2 digits for each component */
+#define SG_VERSION_STR "3.9.01"
 
-/*
- *  D. P. Gilbert (dgilb...@interlog.com), notes:
- *  - scsi logging is available via SCSI_LOG_TIMEOUT macros. First
- *the kernel/module needs to be built with CONFIG_SCSI_LOGGING
- *(otherwise the macros compile to empty statements).
- *
- */
 #include 
 
 #include 
@@ -52,6 +45,8 @@ static int sg_version_num = 30536;/* 2 digits for each 
component */
 #include 
 #include 
 #include  /* for sg_check_file_access() */
+#include 
+#include 
 
 #include "scsi.h"
 #include 
@@ -64,7 +59,7 @@ static int sg_version_num = 30536;/* 2 digits for each 
component */
 
 #ifdef CONFIG_SCSI_PROC_FS
 #include 
-static char *sg_version_date = "20140603";
+static char *sg_version_date = "20181018";
 
 static int sg_proc_init(void);
 #endif
@@ -73,7 +68,8 @@ static int sg_proc_init(void);
 
 #define SG_MAX_DEVS 32768
 
-/* SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type
+/*
+ * SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type
  * of sg_io_hdr::cmd_len can only represent 255. All SCSI commands greater
  * than 16 bytes are "variable length" whose length is a multiple of 4
  */
@@ -100,44 +96,54 @@ static int sg_add_device(struct device *, struct 
class_interface *);
 static void sg_remove_device(struct device *, struct class_interface *);
 
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock fd list for device */
 
 static struct class_interface sg_interface = {
.add_dev= sg_add_device,
.remove_dev = sg_remove_device,
 };
 
-typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info 
*/
-   unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */
-   unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */
-   unsigned bufflen;   /* Size of (aggregate) data buffer */
-   struct page **pages;
-   int page_order;
-   char dio_in_use;/* 0->indirect IO (or mmap), 1->dio */
-   unsigned char cmd_opcode; /* first byte of command */
-} Sg_scatter_hold;
+struct sg_v4_hold {/* parts of sg_io_v4 object needed in async usage */
+   __user u8 *usr_ptr; /* derived from sg_io_v4::usr_ptr */
+   __user u8 *sbp; /* derived from sg_io_v4::response */
+   u16 cmd_len;/* truncated of sg_io_v4::request_len */
+   u16 max_sb_len; /* truncated of sg_io_v4::max_response_len */
+   u32 flags;  /* copy of sg_io_v4::flags */
+};
+
+struct sg_scatter_hold { /* holding area for scsi scatter gather info */
+   struct page **pages;/* num_sgat element array of struct page* */
+   int page_order; /* byte_len = (page_size * (2**page_order)) */
+   int dlen;   /* Byte length of data buffer */
+   unsigned short num_sgat;/* actual number of scatter-gather segments */
+   bool dio_in_use;/* false->indirect IO (or mmap), true->dio */
+   u8 cmd_opcode;  /* first byte of command */
+};
 
 struct sg_device;  /* forward declarations */
 struct sg_fd;
 
-typedef struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
+struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
struct list_head entry; /* list entry */

[PATCH 4/8] sg: expand request states

2018-10-19 Thread Douglas Gilbert
Introduce the new rq_state defines. SG_RQ_DATA_THRESHOLD is a
default value that if the data length of a request exceeds
then, after that request is completed, the data buffer will
be freed up as the sg_request object is placed on the free
list. SG_TOT_FD_THRESHOLD is a default, per file descriptor
value that the sum of outstanding command data lengths is not
allowed to exceed.

Signed-off-by: Douglas Gilbert 
---

The '#if 0's are temporary and removed in a later patch.
They allow the following large and complex patch to be
a bit shorter and still compile.

 drivers/scsi/sg.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 71623685abfe..c77701c0b939 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -75,6 +75,22 @@ static int sg_proc_init(void);
  */
 #define SG_MAX_CDB_SIZE 252
 
+/* Following defines are states of sg_request::rq_state */
+#define SG_RQ_INACTIVE 0/* request not in use (e.g. on fl) */
+#define SG_RQ_INFLIGHT 1/* SCSI request issued, no response yet */
+#define SG_RQ_AWAIT_READ 2  /* response received, awaiting read */
+#define SG_RQ_DONE_READ 3   /* read is ongoing or done */
+#define SG_RQ_BUSY 4/* example: reserve request changing size */
+
+/* free up requests larger than this dlen size after use */
+#define SG_RQ_DATA_THRESHOLD (128 * 1024)
+
+/* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */
+#define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024)
+
+#define SG_TIME_UNIT_MS 0   /* milliseconds */
+#define SG_TIME_UNIT_NS 1   /* nanoseconds */
+#define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
@@ -950,6 +966,7 @@ sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info 
*rinfo)
}
 }
 
+#if 0  /* temporary to shorten big patch */
 static long
 sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 {
@@ -1227,6 +1244,7 @@ sg_compat_ioctl(struct file *filp, unsigned int cmd_in, 
unsigned long arg)
return -ENOIOCTLCMD;
 }
 #endif
+#endif /* temporary to shorten big patch */
 
 static __poll_t
 sg_poll(struct file *filp, poll_table * wait)
@@ -1496,10 +1514,12 @@ static const struct file_operations sg_fops = {
.read = sg_read,
.write = sg_write,
.poll = sg_poll,
+#if 0  /* temporary to shorten big patch */
.unlocked_ioctl = sg_ioctl,
 #ifdef CONFIG_COMPAT
.compat_ioctl = sg_compat_ioctl,
 #endif
+#endif /* temporary to shorten big patch */
.open = sg_open,
.mmap = sg_mmap,
.release = sg_release,
@@ -2422,12 +2442,16 @@ static const struct seq_operations devstrs_seq_ops = {
.show  = sg_proc_seq_show_devstrs,
 };
 
+#if 0  /* temporary to shorten big patch */
 static int sg_proc_seq_show_debug(struct seq_file *s, void *v);
+#endif /* temporary to shorten big patch */
 static const struct seq_operations debug_seq_ops = {
.start = dev_seq_start,
.next  = dev_seq_next,
.stop  = dev_seq_stop,
+#if 0  /* temporary to shorten big patch */
.show  = sg_proc_seq_show_debug,
+#endif /* temporary to shorten big patch */
 };
 
 static int
@@ -2601,6 +2625,8 @@ sg_proc_seq_show_devstrs(struct seq_file *s, void *v)
return 0;
 }
 
+#if 0  /* temporary to shorten big patch */
+
 /* must be called while holding sg_index_lock */
 static void
 sg_proc_debug_helper(struct seq_file *s, struct sg_device *sdp)
@@ -2704,6 +2730,7 @@ sg_proc_seq_show_debug(struct seq_file *s, void *v)
read_unlock_irqrestore(_index_lock, iflags);
return 0;
 }
+#endif /* temporary to shorten big patch */
 
 #endif /* CONFIG_SCSI_PROC_FS */
 
-- 
2.17.1



[PATCH 0/8] sg: major cleanup, remove max_queue limit

2018-10-19 Thread Douglas Gilbert
The intention is to add two new ioctls as proposed by Linus Torvalds:
SG_IOSUBMIT and SG_IORECEIVE to replace the write()/read() async
interface.

But first, clean up the driver and remove the SG_MAX_QUEUE limit of
no more than 16 queued commands on a file descriptor at a time. A
free list has been added and the de-allocation of sg_request
objects is deferred until the close() of a file. Locking is
extensively reworked, especially at the struct sg_fd and
sg_request level.

A new SG_SET_GET_EXTENDED ioctl has been added. It allows multiple
integer values and booleans to be written to and/or read from the
driver. An example is changing and/or reading the reserved request
data length (there is one of these per fd). An example of a new
feature is changing and/or reading the per-fd upper limit on the
sum of outstanding data buffer sizes (default is 16 MB). An
example of a boolean is a bit to do all following command
timekeeping in nanoseconds rather that the default millseconds.

A later patchset will add implementations for the SG_IOSUBMIT and
SG_IORECEIVE plus handling of the sg v4 interface with the
existing SG_IO ioctl.

This patchset is against Martin Petersen 4.20/scsi-queue branch.

Douglas Gilbert (8):
  sg: types and naming cleanup
  sg: introduce sg_log macro
  sg: split header, expand and correct descriptions
  sg: expand request states
  sg: add free list, rework locking
  sg: complete locking changes on ioctl+debug
  sg: rework ioctl handling
  sg: user control for q_at_head or tail

 drivers/scsi/sg.c  | 2484 ++--
 include/scsi/sg.h  |  252 +---
 include/uapi/scsi/sg.h |  416 +++
 3 files changed, 2036 insertions(+), 1116 deletions(-)
 create mode 100644 include/uapi/scsi/sg.h

-- 
2.17.1



[PATCH 2/8] sg: introduce sg_log macro

2018-10-19 Thread Douglas Gilbert
Introduce the SG_LOG macro to replace long-winded
'SCSI_LOG_TIMEOUT(3, sg_printk ...' debug messages. Use __func__
wherever appropriate to make the debug messages more portable.

Signed-off-by: Douglas Gilbert 
---
 drivers/scsi/sg.c | 162 +-
 1 file changed, 72 insertions(+), 90 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 7e723f37a269..71623685abfe 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -216,9 +216,24 @@ static void sg_device_destroy(struct kref *kref);
 /* #define SZ_SG_IOVEC sizeof(struct sg_iovec) synonym for 'struct iovec' */
 #define SZ_SG_REQ_INFO sizeof(struct sg_req_info)
 
-#define sg_printk(prefix, sdp, fmt, a...) \
-   sdev_prefix_printk(prefix, (sdp)->device,   \
-  (sdp)->disk->disk_name, fmt, ##a)
+/*
+ * Kernel needs to be built with CONFIG_SCSI_LOGGING to see log messages.
+ * 'depth' is a number between 1 (most severe) and 7 (most noisy, most
+ * information). All messages are logged as informational (KERN_INFO). In
+ * the unexpected situation where sdp is NULL the macro reverts to a pr_info
+ * and ignores CONFIG_SCSI_LOGGING and always prints to the log.
+ */
+#define SG_LOG(depth, sdp, fmt, a...)  \
+   do {\
+   if (IS_ERR_OR_NULL(sdp)) {  \
+   pr_info("sg: sdp=NULL_or_ERR, " fmt, ##a);  \
+   } else {\
+   SCSI_LOG_TIMEOUT(depth, sdev_prefix_printk( \
+KERN_INFO, (sdp)->device,  \
+(sdp)->disk->disk_name, fmt,   \
+##a)); \
+   }   \
+   } while (0)
 
 /*
  * The SCSI interfaces that use read() and write() as an asynchronous variant 
of
@@ -316,9 +331,8 @@ sg_open(struct inode *inode, struct file *filp)
sdp = sg_get_dev(min_dev);
if (IS_ERR(sdp))
return PTR_ERR(sdp);
-
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp,
- "sg_open: flags=0x%x\n", flags));
+   SG_LOG(3, sdp, "%s: flags=0x%x; device open count prior=%d\n",
+  __func__, flags, sdp->open_cnt);
 
/* This driver's module count bumped by fops_get in  */
/* Prevent the device driver from vanishing while we sleep */
@@ -414,9 +428,10 @@ sg_release(struct inode *inode, struct file *filp)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "sg_release\n"));
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
+   SG_LOG(3, sdp, "%s: device open count prior=%d\n", __func__,
+  sdp->open_cnt);
 
mutex_lock(>open_rel_lock);
scsi_autopm_put_device(sdp->device);
@@ -462,8 +477,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, 
loff_t * ppos)
sdp = sfp->parentdp;
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n",
-__func__, (int)count));
+   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
 
if (!access_ok(VERIFY_WRITE, buf, count))
return -EFAULT;
@@ -663,10 +677,9 @@ sg_write(struct file *filp, const char __user *buf, size_t 
count, loff_t * ppos)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
+   SG_LOG(3, sdp, "%s: write(3rd arg) count=%d\n", __func__, (int)count);
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n",
-__func__, (int)count));
if (atomic_read(>detaching))
return -ENODEV;
if (!((filp->f_flags & O_NONBLOCK) ||
@@ -687,8 +700,7 @@ sg_write(struct file *filp, const char __user *buf, size_t 
count, loff_t * ppos)
return -EIO;/* minimum scsi command length is 6 bytes */
 
if (!(srp = sg_add_request(sfp))) {
-   SCSI_LOG_TIMEOUT(1, sg_printk(KERN_INFO, sdp,
- "sg_write: queue full\n"));
+   SG_LOG(1, sdp, "%s: queue full\n", __func__);
return -EDOM;
}
buf += SZ_SG_HEADER;
@@ -703,9 +715,8 @@ sg_write(struct file *filp, const char __user *buf, size_t 
count, loff_t * ppos)
   

Re: [PATCH v2 3/8] sg: split header, expand and correct descriptions

2018-10-21 Thread Douglas Gilbert

On 2018-10-20 11:21 p.m., Douglas Gilbert wrote:

Split scsi/sg.h into a smaller scsi/sg.h which includes a new header:
uapi/scsi/sg.h . Overall expand the twin header files with new
functionality in this patchset and functionality to be added in the
next patchset to implement SG_IOSUBMIT and friends. Adjust format to
modern kernel conventions. Correct and expand some comments.

Signed-off-by: Douglas Gilbert 
---

The new header file: uapi/scsi/sg.h is expected to be in the
kernel's public interface. This takes time (i.e. months or years)
to find its way into glibc and Linux distributions. So the sooner
it is presented and accepted the better.


From the C perspective, nothing is removed or changed (or shouldn't

be), only additions. This means that the original typedefs stay (but
they are not used in the driver source file: sg.c) since user space
programs may be using them.

  include/scsi/sg.h  | 252 ++---
  include/uapi/scsi/sg.h | 418 +
  2 files changed, 429 insertions(+), 241 deletions(-)
  create mode 100644 include/uapi/scsi/sg.h

diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index f91bcca604e4..596f68746f66 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -7,23 +7,23 @@
  /*
   * History:
   *  Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user
- *   process control of SCSI devices.
+ *  process control of SCSI devices.
   *  Development Sponsored by Killy Corp. NY NY
   *
   * Original driver (sg.h):
- *   Copyright (C) 1992 Lawrence Foard
+ *   Copyright (C) 1992 Lawrence Foard
   * Version 2 and 3 extensions to driver:
- * Copyright (C) 1998 - 2014 Douglas Gilbert
+ *   Copyright (C) 1998 - 2018 Douglas Gilbert
   *
- *  Version: 3.5.36 (20140603)
- *  This version is for 2.6 and 3 series kernels.
+ *  Version: 3.9.01 (20181016)
+ *  This version is for 2.6, 3 and 4 series kernels.
   *
   * Documentation
   * =
   * A web site for the SG device driver can be found at:
- * http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
+ *   http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
   * The documentation for the sg version 3 driver can be found at:
- * http://sg.danny.cz/sg/p/sg_v3_ho.html
+ *   http://sg.danny.cz/sg/p/sg_v3_ho.html
   * Also see: /Documentation/scsi/scsi-generic.txt
   *
   * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html
@@ -33,242 +33,12 @@
  extern int sg_big_buff; /* for sysctl */
  #endif
  
+#include 
  
-typedef struct sg_iovec /* same structure as used by readv() Linux system */

-{   /* call. It defines one scatter-gather element. */
-void __user *iov_base;  /* Starting address  */
-size_t iov_len; /* Length in bytes  */
-} sg_iovec_t;
-
-
-typedef struct sg_io_hdr
-{
-int interface_id;   /* [i] 'S' for SCSI generic (required) */
-int dxfer_direction;/* [i] data transfer direction  */
-unsigned char cmd_len;  /* [i] SCSI command length */
-unsigned char mx_sb_len;/* [i] max length to write to sbp */
-unsigned short iovec_count; /* [i] 0 implies no scatter gather */
-unsigned int dxfer_len; /* [i] byte count of data transfer */
-void __user *dxferp;   /* [i], [*io] points to data transfer memory
- or scatter gather list */
-unsigned char __user *cmdp; /* [i], [*i] points to command to perform */
-void __user *sbp;  /* [i], [*o] points to sense_buffer memory */
-unsigned int timeout;   /* [i] MAX_UINT->no timeout (unit: millisec) */
-unsigned int flags; /* [i] 0 -> default, see SG_FLAG... */
-int pack_id;/* [i->o] unused internally (normally) */
-void __user * usr_ptr;  /* [i->o] unused internally */
-unsigned char status;   /* [o] scsi status */
-unsigned char masked_status;/* [o] shifted, masked scsi status */
-unsigned char msg_status;   /* [o] messaging level data (optional) */
-unsigned char sb_len_wr;/* [o] byte count actually written to sbp */
-unsigned short host_status; /* [o] errors from host adapter */
-unsigned short driver_status;/* [o] errors from software driver */
-int resid;  /* [o] dxfer_len - actual_transferred */
-unsigned int duration;  /* [o] time taken by cmd (unit: millisec) */
-unsigned int info;  /* [o] auxiliary information */
-} sg_io_hdr_t;  /* 64 bytes long (on i386) */
-
-#define SG_INTERFACE_ID_ORIG 'S'
-
-/* Use negative values to flag difference from original sg_header structure */
-#define SG_DXFER_NONE (-1)  /* e.g. a SCSI Test Unit Ready command */
-#define SG_DXFER_TO_DEV (-2)/* e.g. a SCSI WRITE command */
-#define SG_DXFER_FROM_DEV (-3)  /* e.g. a SCSI READ command */
-#define SG_DXFER_TO_FROM_DEV (-4) /* treated like SG_DX

[PATCH v2 4/8] sg: expand request states

2018-10-20 Thread Douglas Gilbert
Introduce the new sg_rq_state enumerations for tracking the
lifetime of a sg_request. SG_RQ_DATA_THRESHOLD is a
default value that if the data length of a request exceeds
then, after that request is completed, the data buffer will
be freed up as the sg_request object is placed on the free
list. SG_TOT_FD_THRESHOLD is a default, per file descriptor
value that the sum of outstanding command data lengths is not
allowed to exceed.

Signed-off-by: Douglas Gilbert 
---

The '#if 0's are temporary and removed in a later patch.
They allow the following large and complex patch (5 of 8)
to be a bit shorter and still compile.

 drivers/scsi/sg.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 94e13a1d21a5..a76395f16fb1 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -75,6 +75,24 @@ static int sg_proc_init(void);
  */
 #define SG_MAX_CDB_SIZE 252
 
+/* Following enum contains the states of sg_request::rq_state */
+enum sg_rq_state {
+   SG_RQ_INACTIVE = 0, /* request not in use (e.g. on fl) */
+   SG_RQ_INFLIGHT, /* SCSI request issued, no response yet */
+   SG_RQ_AWAIT_READ,   /* response received, awaiting read */
+   SG_RQ_DONE_READ,/* read is ongoing or done */
+   SG_RQ_BUSY, /* example: reserve request changing size */
+};
+
+/* free up requests larger than this dlen size after use */
+#define SG_RQ_DATA_THRESHOLD (128 * 1024)
+
+/* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */
+#define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024)
+
+#define SG_TIME_UNIT_MS 0   /* milliseconds */
+#define SG_TIME_UNIT_NS 1   /* nanoseconds */
+#define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
@@ -950,6 +968,7 @@ sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info 
*rinfo)
}
 }
 
+#if 0  /* temporary to shorten big patch */
 static long
 sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 {
@@ -1227,6 +1246,7 @@ sg_compat_ioctl(struct file *filp, unsigned int cmd_in, 
unsigned long arg)
return -ENOIOCTLCMD;
 }
 #endif
+#endif /* temporary to shorten big patch */
 
 static __poll_t
 sg_poll(struct file *filp, poll_table * wait)
@@ -1496,10 +1516,12 @@ static const struct file_operations sg_fops = {
.read = sg_read,
.write = sg_write,
.poll = sg_poll,
+#if 0  /* temporary to shorten big patch */
.unlocked_ioctl = sg_ioctl,
 #ifdef CONFIG_COMPAT
.compat_ioctl = sg_compat_ioctl,
 #endif
+#endif /* temporary to shorten big patch */
.open = sg_open,
.mmap = sg_mmap,
.release = sg_release,
@@ -2422,12 +2444,16 @@ static const struct seq_operations devstrs_seq_ops = {
.show  = sg_proc_seq_show_devstrs,
 };
 
+#if 0  /* temporary to shorten big patch */
 static int sg_proc_seq_show_debug(struct seq_file *s, void *v);
+#endif /* temporary to shorten big patch */
 static const struct seq_operations debug_seq_ops = {
.start = dev_seq_start,
.next  = dev_seq_next,
.stop  = dev_seq_stop,
+#if 0  /* temporary to shorten big patch */
.show  = sg_proc_seq_show_debug,
+#endif /* temporary to shorten big patch */
 };
 
 static int
@@ -2601,6 +2627,8 @@ sg_proc_seq_show_devstrs(struct seq_file *s, void *v)
return 0;
 }
 
+#if 0  /* temporary to shorten big patch */
+
 /* must be called while holding sg_index_lock */
 static void
 sg_proc_debug_helper(struct seq_file *s, struct sg_device *sdp)
@@ -2704,6 +2732,7 @@ sg_proc_seq_show_debug(struct seq_file *s, void *v)
read_unlock_irqrestore(_index_lock, iflags);
return 0;
 }
+#endif /* temporary to shorten big patch */
 
 #endif /* CONFIG_SCSI_PROC_FS */
 
-- 
2.17.1



[PATCH v2 6/8] sg: complete locking changes on ioctl+debug

2018-10-20 Thread Douglas Gilbert
Complete the locking and structure changes of ioctl and debug
('cat /proc/scsi/sg/debug') handling.

Signed-off-by: Douglas Gilbert 
---

This was the code that was "#if 0'-ed out 2 patches ago. It
also shuts checkpatch.pl up as it doesn't like that technique
but offers no viable substitute.

 drivers/scsi/sg.c | 209 +-
 1 file changed, 134 insertions(+), 75 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 4a2e9a616604..8c9fbf865106 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -239,6 +239,7 @@ static struct sg_request *sg_add_request(struct sg_fd *sfp, 
int dxfr_len,
 static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp);
 static struct sg_device *sg_get_dev(int min_dev);
 static void sg_device_destroy(struct kref *kref);
+static const char *sg_rq_state_str(u8 rq_state, bool long_str);
 
 #define SZ_SG_HEADER sizeof(struct sg_header)  /* v1 and v2 header */
 #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr)  /* v3 header */
@@ -358,7 +359,7 @@ sg_open(struct inode *inode, struct file *filp)
 
nonseekable_open(inode, filp);
if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE)))
-   return -EPERM; /* Can't lock it with read only access */
+   return -EPERM;/* not permitted, need write access for O_EXCL */
sdp = sg_get_dev(min_dev);
if (IS_ERR(sdp))
return PTR_ERR(sdp);
@@ -1011,7 +1012,6 @@ srp_state_or_detaching(struct sg_device *sdp, struct 
sg_request *srp)
return ret;
 }
 
-#if 0  /* temporary to shorten big patch */
 static long
 sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 {
@@ -1043,24 +1043,37 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, 
unsigned long arg)
return -ENXIO;
if (!access_ok(VERIFY_WRITE, p, SZ_SG_IO_HDR))
return -EFAULT;
-   result = sg_new_write(sfp, filp, p, SZ_SG_IO_HDR,
-1, read_only, 1, );
+   result = sg_v3_write(sfp, filp, p, SZ_SG_IO_HDR, read_only,
+true, );
if (result < 0)
return result;
result = wait_event_interruptible(sfp->read_wait,
-   (srp_done(sfp, srp) || atomic_read(>detaching)));
-   if (atomic_read(>detaching))
+   srp_state_or_detaching(sdp, srp));
+
+   spin_lock_irqsave(>rq_entry_lck, iflags);
+   if (unlikely(result)) { /* -ERESTARTSYS because signal hit */
+   srp->orphan = true;
+   srp->rq_state = SG_RQ_INFLIGHT;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
+   SG_LOG(1, sdp, "%s:  wait_event_interruptible-->%d\n",
+  __func__, result);
+   return result;
+   }
+   if (unlikely(atomic_read(>detaching))) {
+   srp->rq_state = SG_RQ_INACTIVE;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
return -ENODEV;
-   write_lock_irq(>rq_list_lock);
-   if (srp->done) {
-   srp->done = 2;
-   write_unlock_irq(>rq_list_lock);
+   } else if (likely(srp->rq_state == SG_RQ_AWAIT_READ)) {
+   srp->rq_state = SG_RQ_DONE_READ;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
result = sg_new_read(sfp, p, SZ_SG_IO_HDR, srp);
return (result < 0) ? result : 0;
}
-   srp->orphan = true;
-   write_unlock_irq(>rq_list_lock);
-   return result;  /* -ERESTARTSYS because signal hit process */
+   cp = sg_rq_state_str(srp->rq_state, true);
+   SG_LOG(1, sdp, "%s: unexpected srp=0x%p  state: %s\n", __func__,
+  srp, cp);
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
+   return -EPROTO; /* Logic error */
case SG_SET_TIMEOUT:
result = get_user(val, ip);
if (result)
@@ -1119,25 +1132,36 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, 
unsigned long arg)
if (!access_ok(VERIFY_WRITE, ip, sizeof(int)))
return -EFAULT;
read_lock_irqsave(>rq_list_lock, iflags);
-   list_for_each_entry(srp, >rq_list, entry) {
-   if ((1 == srp->done) && (!srp->sg_io_owned)) {
-   read_unlock_irqrestore(>rq_list_lock,
-  iflags);
-

[PATCH v2 0/8] sg: major cleanup, remove max_queue limit

2018-10-20 Thread Douglas Gilbert
The intention is to add two new ioctls as proposed by Linus Torvalds:
SG_IOSUBMIT and SG_IORECEIVE to replace the write()/read() async
interface.

But first, clean up the driver and remove the SG_MAX_QUEUE limit of
no more than 16 queued commands on a file descriptor at a time. A
free list has been added and the de-allocation of sg_request
objects is deferred until the close() of a file. Locking is
extensively reworked, especially at the struct sg_fd and
sg_request level.

A new SG_SET_GET_EXTENDED ioctl has been added. It allows multiple
integer values and booleans to be written to and/or read from the
driver. An example is changing and/or reading the reserved request
data length (there is one of these per fd). An example of a new
feature is changing and/or reading the per-fd upper limit on the
sum of outstanding data buffer sizes (default is 16 MB). An
example of a boolean is a bit to do all following command
timekeeping in nanoseconds rather that the default millseconds.

A later patchset will add implementations for the SG_IOSUBMIT and
SG_IORECEIVE plus handling of the sg v4 interface with the
existing SG_IO ioctl.

This patchset is against Martin Petersen's 4.20/scsi-queue branch.

Changes since v1:
  - remove redundant casts from private_data field
  - introduce atomic to address locking problems around
sg_fd::sum_fd_dlens
  - replace rq_state defines with an enumeration
  - add __must_hold() annotation to sg_fill_request_table()
  - fix compile/build problem around the 4th and 5th patches
  - add read_value[SG_SEIRV_*] options in SG_SET_GET_EXTENDED
ioctl and increase structure size from 64 to 96 bytes


Douglas Gilbert (8):
  sg: types and naming cleanup
  sg: introduce sg_log macro
  sg: split header, expand and correct descriptions
  sg: expand request states
  sg: add free list, rework locking
  sg: complete locking changes on ioctl+debug
  sg: rework ioctl handling
  sg: user controls for q_at_head, read_value

 drivers/scsi/sg.c  | 2538 ++--
 include/scsi/sg.h  |  268 +
 include/uapi/scsi/sg.h |  412 +++
 3 files changed, 2079 insertions(+), 1139 deletions(-)
 create mode 100644 include/uapi/scsi/sg.h

-- 
2.17.1



[PATCH v2 2/8] sg: introduce sg_log macro

2018-10-20 Thread Douglas Gilbert
Introduce the SG_LOG macro to replace long-winded
'SCSI_LOG_TIMEOUT(3, sg_printk ...' debug messages. Use __func__
wherever appropriate to make the debug messages more portable.

Signed-off-by: Douglas Gilbert 
---
 drivers/scsi/sg.c | 162 +-
 1 file changed, 72 insertions(+), 90 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 78a35e63d177..94e13a1d21a5 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -216,9 +216,24 @@ static void sg_device_destroy(struct kref *kref);
 /* #define SZ_SG_IOVEC sizeof(struct sg_iovec) synonym for 'struct iovec' */
 #define SZ_SG_REQ_INFO sizeof(struct sg_req_info)
 
-#define sg_printk(prefix, sdp, fmt, a...) \
-   sdev_prefix_printk(prefix, (sdp)->device,   \
-  (sdp)->disk->disk_name, fmt, ##a)
+/*
+ * Kernel needs to be built with CONFIG_SCSI_LOGGING to see log messages.
+ * 'depth' is a number between 1 (most severe) and 7 (most noisy, most
+ * information). All messages are logged as informational (KERN_INFO). In
+ * the unexpected situation where sdp is NULL the macro reverts to a pr_info
+ * and ignores CONFIG_SCSI_LOGGING and always prints to the log.
+ */
+#define SG_LOG(depth, sdp, fmt, a...)  \
+   do {\
+   if (IS_ERR_OR_NULL(sdp)) {  \
+   pr_info("sg: sdp=NULL_or_ERR, " fmt, ##a);  \
+   } else {\
+   SCSI_LOG_TIMEOUT(depth, sdev_prefix_printk( \
+KERN_INFO, (sdp)->device,  \
+(sdp)->disk->disk_name, fmt,   \
+##a)); \
+   }   \
+   } while (0)
 
 /*
  * The SCSI interfaces that use read() and write() as an asynchronous variant 
of
@@ -316,9 +331,8 @@ sg_open(struct inode *inode, struct file *filp)
sdp = sg_get_dev(min_dev);
if (IS_ERR(sdp))
return PTR_ERR(sdp);
-
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp,
- "sg_open: flags=0x%x\n", flags));
+   SG_LOG(3, sdp, "%s: flags=0x%x; device open count prior=%d\n",
+  __func__, flags, sdp->open_cnt);
 
/* This driver's module count bumped by fops_get in  */
/* Prevent the device driver from vanishing while we sleep */
@@ -414,9 +428,10 @@ sg_release(struct inode *inode, struct file *filp)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "sg_release\n"));
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
+   SG_LOG(3, sdp, "%s: device open count prior=%d\n", __func__,
+  sdp->open_cnt);
 
mutex_lock(>open_rel_lock);
scsi_autopm_put_device(sdp->device);
@@ -462,8 +477,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, 
loff_t * ppos)
sdp = sfp->parentdp;
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n",
-__func__, (int)count));
+   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
 
if (!access_ok(VERIFY_WRITE, buf, count))
return -EFAULT;
@@ -663,10 +677,9 @@ sg_write(struct file *filp, const char __user *buf, size_t 
count, loff_t * ppos)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
+   SG_LOG(3, sdp, "%s: write(3rd arg) count=%d\n", __func__, (int)count);
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n",
-__func__, (int)count));
if (atomic_read(>detaching))
return -ENODEV;
if (!((filp->f_flags & O_NONBLOCK) ||
@@ -687,8 +700,7 @@ sg_write(struct file *filp, const char __user *buf, size_t 
count, loff_t * ppos)
return -EIO;/* minimum scsi command length is 6 bytes */
 
if (!(srp = sg_add_request(sfp))) {
-   SCSI_LOG_TIMEOUT(1, sg_printk(KERN_INFO, sdp,
- "sg_write: queue full\n"));
+   SG_LOG(1, sdp, "%s: queue full\n", __func__);
return -EDOM;
}
buf += SZ_SG_HEADER;
@@ -703,9 +715,8 @@ sg_write(struct file *filp, const char __user *buf, size_t 
count, loff_t * ppos)
   

[PATCH v2 3/8] sg: split header, expand and correct descriptions

2018-10-20 Thread Douglas Gilbert
Split scsi/sg.h into a smaller scsi/sg.h which includes a new header:
uapi/scsi/sg.h . Overall expand the twin header files with new
functionality in this patchset and functionality to be added in the
next patchset to implement SG_IOSUBMIT and friends. Adjust format to
modern kernel conventions. Correct and expand some comments.

Signed-off-by: Douglas Gilbert 
---

The new header file: uapi/scsi/sg.h is expected to be in the
kernel's public interface. This takes time (i.e. months or years)
to find its way into glibc and Linux distributions. So the sooner
it is presented and accepted the better.

>From the C perspective, nothing is removed or changed (or shouldn't
be), only additions. This means that the original typedefs stay (but
they are not used in the driver source file: sg.c) since user space
programs may be using them.

 include/scsi/sg.h  | 252 ++---
 include/uapi/scsi/sg.h | 418 +
 2 files changed, 429 insertions(+), 241 deletions(-)
 create mode 100644 include/uapi/scsi/sg.h

diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index f91bcca604e4..596f68746f66 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -7,23 +7,23 @@
 /*
  * History:
  *  Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user
- *   process control of SCSI devices.
+ *  process control of SCSI devices.
  *  Development Sponsored by Killy Corp. NY NY
  *
  * Original driver (sg.h):
- *   Copyright (C) 1992 Lawrence Foard
+ *   Copyright (C) 1992 Lawrence Foard
  * Version 2 and 3 extensions to driver:
- * Copyright (C) 1998 - 2014 Douglas Gilbert
+ *   Copyright (C) 1998 - 2018 Douglas Gilbert
  *
- *  Version: 3.5.36 (20140603)
- *  This version is for 2.6 and 3 series kernels.
+ *  Version: 3.9.01 (20181016)
+ *  This version is for 2.6, 3 and 4 series kernels.
  *
  * Documentation
  * =
  * A web site for the SG device driver can be found at:
- * http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
+ *   http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
  * The documentation for the sg version 3 driver can be found at:
- * http://sg.danny.cz/sg/p/sg_v3_ho.html
+ *   http://sg.danny.cz/sg/p/sg_v3_ho.html
  * Also see: /Documentation/scsi/scsi-generic.txt
  *
  * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html
@@ -33,242 +33,12 @@
 extern int sg_big_buff; /* for sysctl */
 #endif
 
+#include 
 
-typedef struct sg_iovec /* same structure as used by readv() Linux system */
-{   /* call. It defines one scatter-gather element. */
-void __user *iov_base;  /* Starting address  */
-size_t iov_len; /* Length in bytes  */
-} sg_iovec_t;
-
-
-typedef struct sg_io_hdr
-{
-int interface_id;   /* [i] 'S' for SCSI generic (required) */
-int dxfer_direction;/* [i] data transfer direction  */
-unsigned char cmd_len;  /* [i] SCSI command length */
-unsigned char mx_sb_len;/* [i] max length to write to sbp */
-unsigned short iovec_count; /* [i] 0 implies no scatter gather */
-unsigned int dxfer_len; /* [i] byte count of data transfer */
-void __user *dxferp;   /* [i], [*io] points to data transfer memory
- or scatter gather list */
-unsigned char __user *cmdp; /* [i], [*i] points to command to perform */
-void __user *sbp;  /* [i], [*o] points to sense_buffer memory */
-unsigned int timeout;   /* [i] MAX_UINT->no timeout (unit: millisec) */
-unsigned int flags; /* [i] 0 -> default, see SG_FLAG... */
-int pack_id;/* [i->o] unused internally (normally) */
-void __user * usr_ptr;  /* [i->o] unused internally */
-unsigned char status;   /* [o] scsi status */
-unsigned char masked_status;/* [o] shifted, masked scsi status */
-unsigned char msg_status;   /* [o] messaging level data (optional) */
-unsigned char sb_len_wr;/* [o] byte count actually written to sbp */
-unsigned short host_status; /* [o] errors from host adapter */
-unsigned short driver_status;/* [o] errors from software driver */
-int resid;  /* [o] dxfer_len - actual_transferred */
-unsigned int duration;  /* [o] time taken by cmd (unit: millisec) */
-unsigned int info;  /* [o] auxiliary information */
-} sg_io_hdr_t;  /* 64 bytes long (on i386) */
-
-#define SG_INTERFACE_ID_ORIG 'S'
-
-/* Use negative values to flag difference from original sg_header structure */
-#define SG_DXFER_NONE (-1)  /* e.g. a SCSI Test Unit Ready command */
-#define SG_DXFER_TO_DEV (-2)/* e.g. a SCSI WRITE command */
-#define SG_DXFER_FROM_DEV (-3)  /* e.g. a SCSI READ command */
-#define SG_DXFER_TO_FROM_DEV (-4) /* treated like SG_DXFER_FROM_DEV with the
-  additional property than 

[PATCH v2 7/8] sg: rework ioctl handling

2018-10-20 Thread Douglas Gilbert
Rework ioctl handling, report clearly to the log which ioctl has
been invoked. Add a new "IOWR" ioctl: SG_SET_GET_EXTENDED which
permits several integer and boolean values to be "_SET_" (i.e.
passed into the driver, potentially changing its actions) and/or
read from the driver (the "_GET_" part) in a single operation.

Signed-off-by: Douglas Gilbert 
---

One feature of the new SG_SET_GET_EXTENDED ioctl is ability to
fetch the sg device minor number (e.g. the "3" in /dev/sg3)
associated with the current file descriptor. A boolean addition
is the ability to change command timekeeping on the current file
descriptor from units of milliseconds (the default) to
nanoseconds.

 drivers/scsi/sg.c | 535 +++---
 1 file changed, 409 insertions(+), 126 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 8c9fbf865106..1a63b0a9279a 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -90,8 +90,8 @@ enum sg_rq_state {
 /* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */
 #define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024)
 
-#define SG_TIME_UNIT_MS 0   /* milliseconds */
-#define SG_TIME_UNIT_NS 1   /* nanoseconds */
+#define SG_TIME_UNIT_MS 0  /* milliseconds */
+#define SG_TIME_UNIT_NS 1  /* nanoseconds */
 #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
@@ -186,7 +186,6 @@ struct sg_fd {  /* holds the state of a file 
descriptor */
bool cmd_q; /* true -> allow command queuing, false -> don't */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
-   bool sse_seen;  /* SG_SET_EXTENDED ioctl seen */
bool time_in_ns;/* report times in nanoseconds */
u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
struct sg_request *reserve_srp; /* allocate on open(), starts on fl */
@@ -240,6 +239,8 @@ static void sg_remove_request(struct sg_fd *sfp, struct 
sg_request *srp);
 static struct sg_device *sg_get_dev(int min_dev);
 static void sg_device_destroy(struct kref *kref);
 static const char *sg_rq_state_str(u8 rq_state, bool long_str);
+static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first,
+   rwlock_t *rwlp, unsigned long *iflagsp);
 
 #define SZ_SG_HEADER sizeof(struct sg_header)  /* v1 and v2 header */
 #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr)  /* v3 header */
@@ -507,9 +508,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, 
loff_t * ppos)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
+   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
 
if (!access_ok(VERIFY_WRITE, buf, count))
return -EFAULT;
@@ -814,11 +815,11 @@ sg_v3_write(struct sg_fd *sfp, struct file *file, const 
char __user *buf,
return -ENOSYS;
if (hp->flags & SG_FLAG_MMAP_IO) {
if (!list_empty(>rq_list))
-   return -EBUSY;  /* already active requests on fd */
+   return -EBUSY;  /* already active requests on fd */
if (hp->dxfer_len > sfp->reserve_srp->data.dlen)
-   return -ENOMEM; /* MMAP_IO size must fit in reserve */
+   return -ENOMEM; /* MMAP_IO size must fit in reserve */
if (hp->flags & SG_FLAG_DIRECT_IO)
-   return -EINVAL; /* not both MMAP_IO and DIRECT_IO */
+   return -EINVAL; /* not both MMAP_IO and DIRECT_IO */
}
sfp->cmd_q = true; /* when sg_io_hdr seen, set command queuing on */
ul_timeout = msecs_to_jiffies(hp->timeout);
@@ -856,7 +857,7 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr 
*hi_p,
srp = sg_add_request(sfp, hi_p->dxfer_len, false);
if (IS_ERR(srp))
return srp;
-   srp->header = *hi_p;/* structure assignment, could memcpy */
+   srp->header = *hi_p;/* structure assignment, could memcpy */
hp = >header;
srp->data.cmd_opcode = cmnd[0]; /* hold opcode of command */
hp->status = 0;
@@ -1012,69 +1013,300 @@ srp_state_or_detaching(struct sg_device *sdp, struct 
sg_request *srp)
return ret;
 }
 
+/* For handling ioctl(SG_IO). Returns 0 on success else a negated errno */
+static int
+sg_sg_io(struct file *filp, struct sg_device *sdp, struct sg_fd *sfp,
+void __user *p)
+{
+   bool read_only = (O_RDWR != (filp->f_flags & O_ACCM

[PATCH v2 8/8] sg: user controls for q_at_head, read_value

2018-10-20 Thread Douglas Gilbert
Add a SG_SET_GET_EXTENDED ioctl control for whether commands
will be queued_at_head or queued_at_tail by the block layer
(together with the scsi mid-level). It has file scope.

Also add a read_value integer the can be used by write a
value from the SG_SEIRV_* group then the corresponding value
will be returned.

Signed-off-by: Douglas Gilbert 
---

The user can still override the new file scope setting on a
a per command basis with the SG_FLAG_Q_AT_HEAD and
SG_FLAG_Q_AT_TAIL in the sg v3 and v4 structures.

An example of read_value usage is to write the value
SG_SEIRV_FL_RQS to the read_value field. Then after the
SG_SET_GET_EXTENDED ioctl is run, the number of (inactive)
requests currently on this file descriptor's request free
list is placed in the read_value field.

 drivers/scsi/sg.c  | 73 +-
 include/scsi/sg.h  | 32 --
 include/uapi/scsi/sg.h | 48 ---
 3 files changed, 92 insertions(+), 61 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 1a63b0a9279a..31f6c364f60f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -59,7 +59,7 @@ static int sg_version_num = 30901;/* 2 digits for each 
component */
 
 #ifdef CONFIG_SCSI_PROC_FS
 #include 
-static char *sg_version_date = "20181019";
+static char *sg_version_date = "20181020";
 
 static int sg_proc_init(void);
 #endif
@@ -95,6 +95,10 @@ enum sg_rq_state {
 #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
+#define SG_FD_Q_AT_TAIL true
+#define SG_FD_Q_AT_HEAD false
+#define SG_DEFAULT_Q_AT SG_FD_Q_AT_HEAD/* for backward compatibility */
+
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
 /* N.B. This variable is readable and writeable via
/proc/scsi/sg/def_reserved_size . Each time sg_open() is called a buffer
@@ -187,6 +191,7 @@ struct sg_fd {  /* holds the state of a file 
descriptor */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
bool time_in_ns;/* report times in nanoseconds */
+   bool q_at_tail; /* queue at tail if true, head when false */
u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
struct sg_request *reserve_srp; /* allocate on open(), starts on fl */
struct fasync_struct *async_qp; /* used by asynchronous notification */
@@ -896,9 +901,13 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr 
*hi_p,
srp->start_ts = ktime_get_with_offset(TK_OFFS_BOOT);
else
hp->duration = jiffies_to_msecs(jiffies);
-   /* at tail if v3 or later interface and tail flag set */
-   at_head = !(hp->interface_id != '\0' &&
-   (SG_FLAG_Q_AT_TAIL & hp->flags));
+
+   if (hp->interface_id == '\0')   /* v1 and v2 interface */
+   at_head = true; /* backward compatibility */
+   else if (sfp->q_at_tail)  /* cmd flags can override sfd setting */
+   at_head = (SG_FLAG_Q_AT_HEAD & hp->flags);
+   else/* this sfd is defaulting to head */
+   at_head = !(SG_FLAG_Q_AT_TAIL & hp->flags);
 
srp->rq->timeout = timeout;
kref_get(>f_ref); /* sg_rq_end_io() does kref_put(). */
@@ -1176,11 +1185,12 @@ static int
 sg_set_get_extended(struct sg_fd *sfp, void __user *p)
 {
int result = 0;
-   u32 uv;
+   unsigned long iflags;
+   u32 uv, or_masks;
struct sg_device *sdp = sfp->parentdp;
struct sg_extended_info *seip;
+   struct sg_request *srp;
struct sg_extended_info sei;
-   u32 or_masks;
 
seip = 
if (!access_ok(VERIFY_READ, p, SZ_SG_EXTENDED_INFO))
@@ -1194,8 +1204,10 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p)
}
SG_LOG(3, sdp, "%s: wr_mask=0x%x rd_mask=0x%x\n", __func__,
   seip->valid_wr_mask, seip->valid_rd_mask);
+   /* reserved_sz (u32), read-write */
if (or_masks & SG_SEIM_RESERVED_SIZE)
result = sg_reserved_sz(sfp, seip);
+   /* rq_rem_sgat_threshold (u32), read-write [impacts re-use only] */
if (or_masks & SG_SEIM_RQ_REM_THRESH) {
if (seip->valid_wr_mask & SG_SEIM_RQ_REM_THRESH) {
uv = seip->rq_rem_sgat_thresh;
@@ -1206,6 +1218,7 @@ sg_set_get_extended(struct sg_fd *sfp, void __user *p)
if (seip->valid_rd_mask & SG_SEIM_RQ_REM_THRESH)
seip->rq_rem_sgat_thresh = sfp->rem_sgat_thresh;
}
+   /* tot_fd_thresh (u32), read-write [sum of active cmd dlen_s] */
if (or_masks & SG_SEIM_TOT_FD_THRESH) {
if (seip->valid_wr_mask & SG_S

[PATCH v2 5/8] sg: add free list, rework locking

2018-10-20 Thread Douglas Gilbert
Remove fixed 16 sg_request object array and replace with an active
rq_list plus a request free list. Add finer grained spin lock to
sg_request and do a major rework on locking. sg_request objects now
are only de-allocated when the owning file descriptor is closed.
This simplifies locking issues.

Signed-off-by: Douglas Gilbert 
---

This patch is big and complex. Towards the end the diff program
completely loses the plot. Better to use difftool on a two pane
window, or simply view the before sg.c and the after sg.c .

 drivers/scsi/sg.c | 1241 +++--
 1 file changed, 751 insertions(+), 490 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index a76395f16fb1..4a2e9a616604 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -141,46 +141,58 @@ struct sg_scatter_hold { /* holding area for scsi 
scatter gather info */
 struct sg_device;  /* forward declarations */
 struct sg_fd;
 
-struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
-   struct list_head entry; /* list entry */
-   struct sg_fd *parentfp; /* NULL -> not in use */
+/*
+ * For any file descriptor: at any time a sg_request object must be a member
+ * of sg_fd::rq_list or rq_free_list::rq_free_list. The only exception is
+ * within a rq_list_lock write lock when it is moving between those two lists.
+ */
+
+struct sg_request {/* active SCSI command or inactive on free list (fl) */
+   struct list_head rq_entry;  /* member of rq_list (active cmd) */
+   struct list_head free_entry;/* member of rq_free_list */
+   spinlock_t rq_entry_lck;
struct sg_scatter_hold data;/* hold buffer, perhaps scatter list */
union {
struct sg_io_hdr header;  /* see  */
-   struct sg_io_v4 hdr_v4;   /* see  */
+   struct sg_v4_hold v4_hold;/* related to  */
};
-   u8 sense_b[SCSI_SENSE_BUFFERSIZE];
-   bool hdr_v4_active; /* selector for anonymous union above */
-   bool res_used;  /* true -> use reserve buffer, false -> don't */
+   ktime_t start_ts;   /* used when sg_fd::time_in_ns is true */
+   enum sg_rq_state rq_state;/* tracks lifetime of each request */
+   bool v4_active; /* selector for autonomous union above */
bool orphan;/* true -> drop on sight, false -> normal */
-   bool sg_io_owned;   /* true -> packet belongs to SG_IO */
-   /* done protected by rq_list_lock */
-   char done;  /* 0->before bh, 1->before read, 2->read */
+   bool sync_invoc;/* true -> synchronous (e.g. from ioctl(SG_IO)) */
+   u8 sense_b[SCSI_SENSE_BUFFERSIZE];
+   struct sg_fd *parentfp;  /* pointer to owning fd, even when on fl */
+   struct sg_scatter_hold *d2p; /* optional 2nd data buffer for bidi */
struct request *rq;
struct bio *bio;
-   struct execute_work ew;
+   struct execute_work ew_orph;/* harvest orphan request */
 };
 
-struct sg_fd { /* holds the state of a file descriptor */
-   struct list_head sfd_siblings;  /* protected by device's sfd_lock */
+struct sg_fd { /* holds the state of a file descriptor */
+   struct list_head sfd_entry; /* member sg_device::sfds list */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
-   rwlock_t rq_list_lock;  /* protect access to list in req_arr */
struct mutex f_mutex;   /* protect against changes in this fd */
+   rwlock_t rq_list_lock;  /* protect access to sg_request lists */
+   struct list_head rq_list; /* head of inflight sg_request list */
+   struct list_head rq_free_list; /* head of sg_request free list */
int timeout;/* defaults to SG_DEFAULT_TIMEOUT  */
int timeout_user;   /* defaults to SG_DEFAULT_TIMEOUT_USER */
-   struct sg_scatter_hold reserve; /* one held for this file descriptor */
-   struct list_head rq_list; /* head of request list */
-   struct fasync_struct *async_qp; /* used by asynchronous notification */
-   struct sg_request req_arr[SG_MAX_QUEUE];/* used as singly-linked list */
+   int rem_sgat_thresh;/* > this, request's sgat cleared after use */
+   int tot_fd_thresh;  /* E2BIG if sum_of(dlen) > this, 0: ignore */
+   atomic_t sum_fd_dlens;  /* when tot_fd_thresh>0 this is sum_of(dlen) */
bool force_packid;  /* true -> pack_id input to read() */
bool cmd_q; /* true -> allow command queuing, false -> don't */
-   u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
-   bool res_in_use;/* true -> 'reserve' array in 

[PATCH v2 1/8] sg: types and naming cleanup

2018-10-20 Thread Douglas Gilbert
Remove typedefs and use better type names like bool and u8 where
appropriate. Rename some variables and functions for clarity.
Adjust formatting (e.g. function definitions) to be more
consistent across the driver.

Signed-off-by: Douglas Gilbert 
---

The intention is to move to sg_version_num 40001 when a second
patchset implementing SG_IOSUBMIT and friends is applied. In
the meantime, move from the latest version number in the kernel
(30536) to 30901 to indicate a significant change but not yet
implementing the sg v4 interface.

 drivers/scsi/sg.c | 827 ++
 1 file changed, 467 insertions(+), 360 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index c6ad00703c5b..78a35e63d177 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -7,7 +7,7 @@
  * Original driver (sg.c):
  *Copyright (C) 1992 Lawrence Foard
  * Version 2 and 3 extensions to driver:
- *Copyright (C) 1998 - 2014 Douglas Gilbert
+ *Copyright (C) 1998 - 2018 Douglas Gilbert
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -16,16 +16,9 @@
  *
  */
 
-static int sg_version_num = 30536; /* 2 digits for each component */
-#define SG_VERSION_STR "3.5.36"
+static int sg_version_num = 30901; /* 2 digits for each component */
+#define SG_VERSION_STR "3.9.01"
 
-/*
- *  D. P. Gilbert (dgilb...@interlog.com), notes:
- *  - scsi logging is available via SCSI_LOG_TIMEOUT macros. First
- *the kernel/module needs to be built with CONFIG_SCSI_LOGGING
- *(otherwise the macros compile to empty statements).
- *
- */
 #include 
 
 #include 
@@ -52,6 +45,8 @@ static int sg_version_num = 30536;/* 2 digits for each 
component */
 #include 
 #include 
 #include  /* for sg_check_file_access() */
+#include 
+#include 
 
 #include "scsi.h"
 #include 
@@ -64,7 +59,7 @@ static int sg_version_num = 30536;/* 2 digits for each 
component */
 
 #ifdef CONFIG_SCSI_PROC_FS
 #include 
-static char *sg_version_date = "20140603";
+static char *sg_version_date = "20181019";
 
 static int sg_proc_init(void);
 #endif
@@ -73,7 +68,8 @@ static int sg_proc_init(void);
 
 #define SG_MAX_DEVS 32768
 
-/* SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type
+/*
+ * SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type
  * of sg_io_hdr::cmd_len can only represent 255. All SCSI commands greater
  * than 16 bytes are "variable length" whose length is a multiple of 4
  */
@@ -100,44 +96,54 @@ static int sg_add_device(struct device *, struct 
class_interface *);
 static void sg_remove_device(struct device *, struct class_interface *);
 
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock fd list for device */
 
 static struct class_interface sg_interface = {
.add_dev= sg_add_device,
.remove_dev = sg_remove_device,
 };
 
-typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info 
*/
-   unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */
-   unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */
-   unsigned bufflen;   /* Size of (aggregate) data buffer */
-   struct page **pages;
-   int page_order;
-   char dio_in_use;/* 0->indirect IO (or mmap), 1->dio */
-   unsigned char cmd_opcode; /* first byte of command */
-} Sg_scatter_hold;
+struct sg_v4_hold {/* parts of sg_io_v4 object needed in async usage */
+   __user u8 *usr_ptr; /* derived from sg_io_v4::usr_ptr */
+   __user u8 *sbp; /* derived from sg_io_v4::response */
+   u16 cmd_len;/* truncated of sg_io_v4::request_len */
+   u16 max_sb_len; /* truncated of sg_io_v4::max_response_len */
+   u32 flags;  /* copy of sg_io_v4::flags */
+};
+
+struct sg_scatter_hold { /* holding area for scsi scatter gather info */
+   struct page **pages;/* num_sgat element array of struct page* */
+   int page_order; /* byte_len = (page_size * (2**page_order)) */
+   int dlen;   /* Byte length of data buffer */
+   unsigned short num_sgat;/* actual number of scatter-gather segments */
+   bool dio_in_use;/* false->indirect IO (or mmap), true->dio */
+   u8 cmd_opcode;  /* first byte of command */
+};
 
 struct sg_device;  /* forward declarations */
 struct sg_fd;
 
-typedef struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
+struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
struct list_head entry; /* list entry */

Re: [PATCH 2/8] sg: introduce sg_log macro

2018-10-19 Thread Douglas Gilbert

On 2018-10-19 3:45 a.m., Johannes Thumshirn wrote:

On 19/10/18 08:24, Douglas Gilbert wrote:
[..]

+/*
+ * Kernel needs to be built with CONFIG_SCSI_LOGGING to see log messages.
+ * 'depth' is a number between 1 (most severe) and 7 (most noisy, most
+ * information). All messages are logged as informational (KERN_INFO). In
+ * the unexpected situation where sdp is NULL the macro reverts to a pr_info
+ * and ignores CONFIG_SCSI_LOGGING and always prints to the log.
+ */
+#define SG_LOG(depth, sdp, fmt, a...)  \
+   do {\
+   if (IS_ERR_OR_NULL(sdp)) {  \
+   pr_info("sg: sdp=NULL_or_ERR, " fmt, ##a);\
+   } else {\
+   SCSI_LOG_TIMEOUT(depth, sdev_prefix_printk( \
+KERN_INFO, (sdp)->device,   \
+(sdp)->disk->disk_name, fmt, \
+##a)); \
+   }   \
+   } while (0)


Hi Doug,
have you considered using the kernel's dynamic debug infrastructure instead?


Hi,
I'll follow what the scsi mid-level and the other ULDs do. IOW, no
change. The debug messages they produce are quite helpful (to me, I
use them a lot, and Tony B. has asked for more precision) and well-tuned
to the SCSI subsystem (e.g. telling us what sdp represents in useful
terms).

And they can be compiled out (but not my pr_info above, probably
should be a pr_warn).

Doug Gilbert




Re: [PATCH 5/8] sg: add free list, rework locking

2018-10-19 Thread Douglas Gilbert

On 2018-10-19 11:22 a.m., Bart Van Assche wrote:

On Fri, 2018-10-19 at 02:24 -0400, Douglas Gilbert wrote:

static void
-sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo)
+sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo,
+ int max_num)
  {
 struct sg_request *srp;
 int val;
-   unsigned int ms;
  
 val = 0;

-   list_for_each_entry(srp, >rq_list, entry) {
-   if (val >= SG_MAX_QUEUE)
-   break;
-   rinfo[val].req_state = srp->done + 1;
+   list_for_each_entry(srp, >rq_list, rq_entry) {
+   if (val >= max_num)
+   return;


What protects the sfp->rq_list against concurrent changes? It seems to me
like all other code that iterates over or modifies that list protects that
list with rq_list_lock?


Bart,
The function is called from sg_ioctl() case SG_GET_REQUEST_TABLE and at the
call point the read_lock is held on rq_list_lock. Maybe I can add a comment
above the function about the lock being held. [At least it is as the end
of the patch series, and that is all I care about :-)]

Doug Gilbert

P.S. Best to look at sg.c after each patch is applied, not the !@#$ing
stupid patches.


Re: [PATCH 3/8] sg: split header, expand and correct descriptions

2018-10-19 Thread Douglas Gilbert

On 2018-10-19 4:31 a.m., Johannes Thumshirn wrote:

On 19/10/18 08:24, Douglas Gilbert wrote:

+/* Alternate style type names, "..._t" variants preferred */
+typedef struct sg_io_hdr Sg_io_hdr;
+typedef struct sg_io_vec Sg_io_vec;
+typedef struct sg_scsi_id Sg_scsi_id;
+typedef struct sg_req_info Sg_req_info;


There are no _t variants for the above, or am I missing something?


I've expanded the comment to make it clearer I'm referring to the
definitions above in that header ***.

For example: Referring to

typedef struct sg_io_hdr {
... /* the definition of its fields */
} sg_io_hdr_t;


So the suggestion is to prefer sg_io_hdr_t to Sg_io_hdr and if you
are using C rather that C++ (in the user space) and you have very
backward looking conventions then prefer:
   struct sg_io_hdr

Is that clearer?

Doug Gilbert


*** If the unified diff doesn't show that, then that is one of the
many weakness of using unified diffs for code reviews. One
of the things I'm trying to clean up is 15 years of
"laparoscopic" patches (and diff and patch are fighting back).


Re: [PATCH] sg: remove bad blk_end_request_all() call

2018-10-16 Thread Douglas Gilbert

On 2018-10-16 10:38 a.m., Jens Axboe wrote:

We just need to free the request here. Additionally, this is
currently wrong for a queue that's using MQ currently, it'll
crash.


Surprise removals are difficult code paths to check. That snippet
is after the request has been generated and before the call to:
  blk_execute_rq_nowait()

IOWs a small window.

Currently if a surprise removal is detected by the completion
callback (sg_rq_end_io() ) then a pr_info() is called and it
otherwise follows the main path out: free-ing the request,
keeping the bio. Is that okay?

More generally, once a request is "in flight" and a surprise
removal occurs, can the sg driver expect the block layer (or
scsi mid-level) to react (and clean-up) or is it up to the
sg driver to force those actions?

Acked-by: Douglas Gilbert 


Cc: Doug Gilbert 
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe 
---
  drivers/scsi/sg.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 8a254bb46a9b..c6ad00703c5b 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -822,7 +822,7 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
if (atomic_read(>detaching)) {
if (srp->bio) {
scsi_req_free_cmd(scsi_req(srp->rq));
-   blk_end_request_all(srp->rq, BLK_STS_IOERR);
+   blk_put_request(srp->rq);
srp->rq = NULL;
}
  





Re: [PATCH 2/8] sg: introduce sg_log macro

2018-10-24 Thread Douglas Gilbert

On 2018-10-24 3:58 a.m., Martin K. Petersen wrote:


Hi Doug,


I'll follow what the scsi mid-level and the other ULDs do. IOW, no
change. The debug messages they produce are quite helpful (to me, I
use them a lot, and Tony B. has asked for more precision) and
well-tuned to the SCSI subsystem (e.g. telling us what sdp represents
in useful terms).

And they can be compiled out (but not my pr_info above, probably
should be a pr_warn).


I agree with Johannes. SCSI logging is in sustaining mode. We're trying
to remove it, not to add to it.

The kernel has much more capable and flexible methods of getting
information out to the user these days. No need to resort to arcane
logging masks and the like.


Examples please, preferably from the SCSI subsystem. If not, I'll do what
most other drivers do, drop all debug statements.

Doug Gilbert




[PATCH v3 5/8] sg: add free list, rework locking

2018-10-26 Thread Douglas Gilbert
Remove fixed 16 sg_request object array and replace with an active
rq_list plus a request free list. Add finer grained spin lock to
sg_request and do a major rework on locking. sg_request objects now
are only de-allocated when the owning file descriptor is closed.
This simplifies locking issues.

Signed-off-by: Douglas Gilbert 
---

This patch is big and complex. Towards the end the diff program
completely loses the plot. Better to use difftool on a two pane
window, or simply view the before sg.c and the after sg.c .

The requirement to keep the patch small enough to be reviewable
but at the same time both compilable and buildable (i.e. no
undefines when kernel is built) are in conflict, especially when
the changes touch almost all functions in a driver. IMO you
should be able to mark a patch (in the middle of a patchset)
as "non compilable".

 drivers/scsi/sg.c | 1305 -
 1 file changed, 805 insertions(+), 500 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index a76395f16fb1..5fbdb0f40739 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -141,46 +141,58 @@ struct sg_scatter_hold { /* holding area for scsi 
scatter gather info */
 struct sg_device;  /* forward declarations */
 struct sg_fd;
 
-struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
-   struct list_head entry; /* list entry */
-   struct sg_fd *parentfp; /* NULL -> not in use */
+/*
+ * For any file descriptor: at any time a sg_request object must be a member
+ * of sg_fd::rq_list or sg_fd::rq_free_list. The only exception is within a
+ * rq_list_lock write lock when it is moving between those two lists.
+ */
+
+struct sg_request {/* active SCSI command or inactive on free list (fl) */
+   struct list_head rq_entry;  /* member of rq_list (active cmd) */
+   struct list_head free_entry;/* member of rq_free_list */
+   spinlock_t rq_entry_lck;
struct sg_scatter_hold data;/* hold buffer, perhaps scatter list */
union {
struct sg_io_hdr header;  /* see  */
-   struct sg_io_v4 hdr_v4;   /* see  */
+   struct sg_v4_hold v4_hold;/* related to  */
};
-   u8 sense_b[SCSI_SENSE_BUFFERSIZE];
-   bool hdr_v4_active; /* selector for anonymous union above */
-   bool res_used;  /* true -> use reserve buffer, false -> don't */
+   ktime_t start_ts;   /* used when sg_fd::time_in_ns is true */
+   enum sg_rq_state rq_state;/* tracks lifetime of each request */
+   bool v4_active; /* selector for autonomous union above */
bool orphan;/* true -> drop on sight, false -> normal */
-   bool sg_io_owned;   /* true -> packet belongs to SG_IO */
-   /* done protected by rq_list_lock */
-   char done;  /* 0->before bh, 1->before read, 2->read */
+   bool sync_invoc;/* true -> synchronous (e.g. from ioctl(SG_IO)) */
+   u8 sense_b[SCSI_SENSE_BUFFERSIZE];
+   struct sg_fd *parentfp;  /* pointer to owning fd, even when on fl */
+   struct sg_scatter_hold *d2p; /* optional 2nd data buffer for bidi */
struct request *rq;
struct bio *bio;
-   struct execute_work ew;
+   struct execute_work ew_orph;/* harvest orphan request */
 };
 
-struct sg_fd { /* holds the state of a file descriptor */
-   struct list_head sfd_siblings;  /* protected by device's sfd_lock */
+struct sg_fd { /* holds the state of a file descriptor */
+   struct list_head sfd_entry; /* member sg_device::sfds list */
struct sg_device *parentdp; /* owning device */
wait_queue_head_t read_wait;/* queue read until command done */
-   rwlock_t rq_list_lock;  /* protect access to list in req_arr */
struct mutex f_mutex;   /* protect against changes in this fd */
+   rwlock_t rq_list_lock;  /* protect access to sg_request lists */
+   struct list_head rq_list; /* head of inflight sg_request list */
+   struct list_head rq_free_list; /* head of sg_request free list */
int timeout;/* defaults to SG_DEFAULT_TIMEOUT  */
int timeout_user;   /* defaults to SG_DEFAULT_TIMEOUT_USER */
-   struct sg_scatter_hold reserve; /* one held for this file descriptor */
-   struct list_head rq_list; /* head of request list */
-   struct fasync_struct *async_qp; /* used by asynchronous notification */
-   struct sg_request req_arr[SG_MAX_QUEUE];/* used as singly-linked list */
+   int rem_sgat_thresh;/* > this, request's sgat cleared after use */
+   int tot_fd_thresh;  /* E2BIG if sum_of(dlen) > this, 0: ignore */
+   atomic_t sum_fd_dlens;  /* when tot_fd_thresh>0 this is sum_of(dlen) */
bool force_packid;  /* true -> pack_id input to read() */
bool cmd_q; /* true -> allow com

[PATCH v3 8/8] sg: user controls for q_at_head, read_value

2018-10-26 Thread Douglas Gilbert
Add a SG_SET_GET_EXTENDED ioctl control for whether commands
will be queued_at_head or queued_at_tail by the block layer
(together with the scsi mid-level). It has file scope.

Also add a read_value integer the can be used by write a
value from the SG_SEIRV_* group then the corresponding value
will be returned.

Signed-off-by: Douglas Gilbert 
---

The user can still override the new file scope setting on a
a per command basis with the SG_FLAG_Q_AT_HEAD and
SG_FLAG_Q_AT_TAIL in the sg v3 and v4 structures.

An example of read_value usage is to write the value
SG_SEIRV_FL_RQS to the read_value field. Then after the
SG_SET_GET_EXTENDED ioctl is run, the number of (inactive)
requests currently on this file descriptor's request free
list is placed in the read_value field.

Added in v3 is SG_SEIRV_DEV_FL_RQS which is an expansion of
SG_SEIRV_FL_RQS. SG_SEIRV_DEV_FL_RQS counts free list entries
on all sg file descriptors currently open on the device that
the file descriptor (given to ioctl()) is associated with.

 drivers/scsi/sg.c  | 160 +++--
 include/scsi/sg.h  |  32 ++---
 include/uapi/scsi/sg.h |  49 ++---
 3 files changed, 150 insertions(+), 91 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 3e89bbd508de..4d6966d40949 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -59,7 +59,7 @@ static int sg_version_num = 30901;/* 2 digits for each 
component */
 
 #ifdef CONFIG_SCSI_PROC_FS
 #include 
-static char *sg_version_date = "20181019";
+static char *sg_version_date = "20181024";
 
 static int sg_proc_init(void);
 #endif
@@ -95,6 +95,10 @@ enum sg_rq_state {
 #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
+#define SG_FD_Q_AT_TAIL true
+#define SG_FD_Q_AT_HEAD false
+#define SG_DEFAULT_Q_AT SG_FD_Q_AT_HEAD/* for backward compatibility */
+
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
 /* N.B. This variable is readable and writeable via
/proc/scsi/sg/def_reserved_size . Each time sg_open() is called a buffer
@@ -187,6 +191,7 @@ struct sg_fd {  /* holds the state of a file 
descriptor */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
bool time_in_ns;/* report times in nanoseconds */
+   bool q_at_tail; /* queue at tail if true, head when false */
u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
struct sg_request *reserve_srp; /* allocate on open(), starts on fl */
struct fasync_struct *async_qp; /* used by asynchronous notification */
@@ -238,7 +243,7 @@ static struct sg_request *sg_add_request(struct sg_fd *sfp, 
int dxfr_len,
 static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp);
 static struct sg_device *sg_get_dev(int min_dev);
 static void sg_device_destroy(struct kref *kref);
-static const char *sg_rq_state_str(u8 rq_state, bool long_str);
+static const char *sg_rq_state_str(enum sg_rq_state rq_state, bool long_str);
 static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first,
rwlock_t *rwlp, unsigned long *iflagsp);
 
@@ -855,7 +860,7 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr 
*hi_p,
 
if (h4p || !hi_p)
return ERR_PTR(-EOPNOTSUPP);
-   srp = sg_add_request(sfp, hi_p->dxfer_len, false);
+   srp = sg_add_request(sfp, hi_p->dxfer_len, sync);
if (IS_ERR(srp))
return srp;
srp->header = *hi_p;/* structure assignment, could memcpy */
@@ -897,9 +902,13 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr 
*hi_p,
srp->start_ts = ktime_get_with_offset(TK_OFFS_BOOT);
else
hp->duration = jiffies_to_msecs(jiffies);
-   /* at tail if v3 or later interface and tail flag set */
-   at_head = !(hp->interface_id != '\0' &&
-   (SG_FLAG_Q_AT_TAIL & hp->flags));
+
+   if (hp->interface_id == '\0')   /* v1 and v2 interface */
+   at_head = true; /* backward compatibility */
+   else if (sfp->q_at_tail)  /* cmd flags can override sfd setting */
+   at_head = (hp->flags & SG_FLAG_Q_AT_HEAD);
+   else/* this sfd is defaulting to head */
+   at_head = !(hp->flags & SG_FLAG_Q_AT_TAIL);
 
srp->rq->timeout = timeout;
kref_get(>f_ref); /* sg_rq_end_io() does kref_put(). */
@@ -1084,30 +1093,30 @@ sg_reserved_sz(struct sg_fd *sfp, struct 
sg_extended_info *seip)
 {
bool free_n_srp = false;
int result = 0;
-   int val, mx_sect_bytes;
+   int new_sz, mx_sect_bytes;
unsigned long iflags;
-   struct sg_request *srp; /* prior reserve request */

[PATCH v3 6/8] sg: complete locking changes on ioctl+debug

2018-10-26 Thread Douglas Gilbert
Complete the locking and structure changes of ioctl and debug
('cat /proc/scsi/sg/debug') handling.

Signed-off-by: Douglas Gilbert 
---

This was the code that was "#if 0'-ed out 2 patches ago. It
also shuts checkpatch.pl up as it doesn't like that technique
but offers no viable substitute.

 drivers/scsi/sg.c | 217 +-
 1 file changed, 136 insertions(+), 81 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 5fbdb0f40739..8b4a65340971 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -239,6 +239,7 @@ static struct sg_request *sg_add_request(struct sg_fd *sfp, 
int dxfr_len,
 static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp);
 static struct sg_device *sg_get_dev(int min_dev);
 static void sg_device_destroy(struct kref *kref);
+static const char *sg_rq_state_str(u8 rq_state, bool long_str);
 
 #define SZ_SG_HEADER sizeof(struct sg_header)  /* v1 and v2 header */
 #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr)  /* v3 header */
@@ -359,7 +360,7 @@ sg_open(struct inode *inode, struct file *filp)
 
nonseekable_open(inode, filp);
if ((flags & O_EXCL) && (O_RDONLY == (flags & O_ACCMODE)))
-   return -EPERM; /* Can't lock it with read only access */
+   return -EPERM;/* not permitted, need write access for O_EXCL */
sdp = sg_get_dev(min_dev);
if (IS_ERR(sdp))
return PTR_ERR(sdp);
@@ -931,11 +932,6 @@ sg_ktime_sub_trunc(ktime_t now_ts, ktime_t ts0)
return 0;
 }
 
-/*
- * Annotation under function arguments (i.e. '__must_hold...') states that
- * this function expects that lock to be held, a read lock is sufficient in
- * this case.
- */
 static void
 sg_fill_request_element(struct sg_fd *sfp, struct sg_request *srp,
struct sg_req_info *rip)
@@ -982,7 +978,6 @@ sg_fill_request_element(struct sg_fd *sfp, struct 
sg_request *srp,
  * Populate up to max_num struct sg_req_info objects, first from the active
  * list then, if there is still room, from the free list. All elements in
  * the free list should have SG_RQ_INACTIVE status.
- * See sg_fill_request_element() for note about __must_hold annotation.
  */
 static void
 sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info *rinfo,
@@ -1031,7 +1026,6 @@ srp_state_or_detaching(struct sg_device *sdp, struct 
sg_request *srp)
return ret;
 }
 
-#if 0  /* temporary to shorten big patch */
 static long
 sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 {
@@ -1063,24 +1057,37 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, 
unsigned long arg)
return -ENXIO;
if (!access_ok(VERIFY_WRITE, p, SZ_SG_IO_HDR))
return -EFAULT;
-   result = sg_new_write(sfp, filp, p, SZ_SG_IO_HDR,
-1, read_only, 1, );
+   result = sg_v3_write(sfp, filp, p, SZ_SG_IO_HDR, read_only,
+true, );
if (result < 0)
return result;
result = wait_event_interruptible(sfp->read_wait,
-   (srp_done(sfp, srp) || atomic_read(>detaching)));
-   if (atomic_read(>detaching))
+   srp_state_or_detaching(sdp, srp));
+
+   spin_lock_irqsave(>rq_entry_lck, iflags);
+   if (unlikely(result)) { /* -ERESTARTSYS because signal hit */
+   srp->orphan = true;
+   srp->rq_state = SG_RQ_INFLIGHT;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
+   SG_LOG(1, sdp, "%s:  wait_event_interruptible-->%d\n",
+  __func__, result);
+   return result;
+   }
+   if (unlikely(atomic_read(>detaching))) {
+   srp->rq_state = SG_RQ_INACTIVE;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
return -ENODEV;
-   write_lock_irq(>rq_list_lock);
-   if (srp->done) {
-   srp->done = 2;
-   write_unlock_irq(>rq_list_lock);
+   } else if (likely(srp->rq_state == SG_RQ_AWAIT_READ)) {
+   srp->rq_state = SG_RQ_DONE_READ;
+   spin_unlock_irqrestore(>rq_entry_lck, iflags);
result = sg_new_read(sfp, p, SZ_SG_IO_HDR, srp);
return (result < 0) ? result : 0;
}
-   srp->orphan = true;
-   write_unlock_irq(>rq_list_lock);
-   return result;  /* -ERESTARTSYS because signal hit process */
+   cp = sg_rq_state_str(srp->rq_state, true);
+   SG_LOG(1, sdp, "%s: unexpected 

[PATCH v3 3/8] sg: split header, expand and correct descriptions

2018-10-26 Thread Douglas Gilbert
Split scsi/sg.h into a smaller scsi/sg.h which includes a new header:
uapi/scsi/sg.h . Overall expand the twin header files with new
functionality in this patchset and functionality to be added in the
next patchset to implement SG_IOSUBMIT and friends. Adjust format to
modern kernel conventions. Correct and expand some comments.

Signed-off-by: Douglas Gilbert 
---

The new header file: uapi/scsi/sg.h is expected to be in the
kernel's public interface. This takes time (i.e. months or years)
to find its way into glibc and Linux distributions. So the sooner
it is presented and accepted the better.

>From the C perspective, nothing is removed or changed (or shouldn't
be), only additions. This means that the original typedefs stay (but
they are not used in the driver source file: sg.c) since user space
programs may be using them.

 include/scsi/sg.h  | 252 ++---
 include/uapi/scsi/sg.h | 419 +
 2 files changed, 430 insertions(+), 241 deletions(-)
 create mode 100644 include/uapi/scsi/sg.h

diff --git a/include/scsi/sg.h b/include/scsi/sg.h
index f91bcca604e4..596f68746f66 100644
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -7,23 +7,23 @@
 /*
  * History:
  *  Started: Aug 9 by Lawrence Foard (entr...@world.std.com), to allow user
- *   process control of SCSI devices.
+ *  process control of SCSI devices.
  *  Development Sponsored by Killy Corp. NY NY
  *
  * Original driver (sg.h):
- *   Copyright (C) 1992 Lawrence Foard
+ *   Copyright (C) 1992 Lawrence Foard
  * Version 2 and 3 extensions to driver:
- * Copyright (C) 1998 - 2014 Douglas Gilbert
+ *   Copyright (C) 1998 - 2018 Douglas Gilbert
  *
- *  Version: 3.5.36 (20140603)
- *  This version is for 2.6 and 3 series kernels.
+ *  Version: 3.9.01 (20181016)
+ *  This version is for 2.6, 3 and 4 series kernels.
  *
  * Documentation
  * =
  * A web site for the SG device driver can be found at:
- * http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
+ *   http://sg.danny.cz/sg  [alternatively check the MAINTAINERS file]
  * The documentation for the sg version 3 driver can be found at:
- * http://sg.danny.cz/sg/p/sg_v3_ho.html
+ *   http://sg.danny.cz/sg/p/sg_v3_ho.html
  * Also see: /Documentation/scsi/scsi-generic.txt
  *
  * For utility and test programs see: http://sg.danny.cz/sg/sg3_utils.html
@@ -33,242 +33,12 @@
 extern int sg_big_buff; /* for sysctl */
 #endif
 
+#include 
 
-typedef struct sg_iovec /* same structure as used by readv() Linux system */
-{   /* call. It defines one scatter-gather element. */
-void __user *iov_base;  /* Starting address  */
-size_t iov_len; /* Length in bytes  */
-} sg_iovec_t;
-
-
-typedef struct sg_io_hdr
-{
-int interface_id;   /* [i] 'S' for SCSI generic (required) */
-int dxfer_direction;/* [i] data transfer direction  */
-unsigned char cmd_len;  /* [i] SCSI command length */
-unsigned char mx_sb_len;/* [i] max length to write to sbp */
-unsigned short iovec_count; /* [i] 0 implies no scatter gather */
-unsigned int dxfer_len; /* [i] byte count of data transfer */
-void __user *dxferp;   /* [i], [*io] points to data transfer memory
- or scatter gather list */
-unsigned char __user *cmdp; /* [i], [*i] points to command to perform */
-void __user *sbp;  /* [i], [*o] points to sense_buffer memory */
-unsigned int timeout;   /* [i] MAX_UINT->no timeout (unit: millisec) */
-unsigned int flags; /* [i] 0 -> default, see SG_FLAG... */
-int pack_id;/* [i->o] unused internally (normally) */
-void __user * usr_ptr;  /* [i->o] unused internally */
-unsigned char status;   /* [o] scsi status */
-unsigned char masked_status;/* [o] shifted, masked scsi status */
-unsigned char msg_status;   /* [o] messaging level data (optional) */
-unsigned char sb_len_wr;/* [o] byte count actually written to sbp */
-unsigned short host_status; /* [o] errors from host adapter */
-unsigned short driver_status;/* [o] errors from software driver */
-int resid;  /* [o] dxfer_len - actual_transferred */
-unsigned int duration;  /* [o] time taken by cmd (unit: millisec) */
-unsigned int info;  /* [o] auxiliary information */
-} sg_io_hdr_t;  /* 64 bytes long (on i386) */
-
-#define SG_INTERFACE_ID_ORIG 'S'
-
-/* Use negative values to flag difference from original sg_header structure */
-#define SG_DXFER_NONE (-1)  /* e.g. a SCSI Test Unit Ready command */
-#define SG_DXFER_TO_DEV (-2)/* e.g. a SCSI WRITE command */
-#define SG_DXFER_FROM_DEV (-3)  /* e.g. a SCSI READ command */
-#define SG_DXFER_TO_FROM_DEV (-4) /* treated like SG_DXFER_FROM_DEV with the
-  additional property than 

[PATCH v3 4/8] sg: expand request states

2018-10-26 Thread Douglas Gilbert
Introduce the new sg_rq_state enumerations for tracking the
lifetime of a sg_request. SG_RQ_DATA_THRESHOLD is a
default value that if the data length of a request exceeds
then, after that request is completed, the data buffer will
be freed up as the sg_request object is placed on the free
list. SG_TOT_FD_THRESHOLD is a default, per file descriptor
value that the sum of outstanding command data lengths is not
allowed to exceed.

Signed-off-by: Douglas Gilbert 
---

The '#if 0's are temporary and removed in a later patch.
They allow the following large and complex patch (5 of 8)
to be a bit shorter and still compile.

 drivers/scsi/sg.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 94e13a1d21a5..a76395f16fb1 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -75,6 +75,24 @@ static int sg_proc_init(void);
  */
 #define SG_MAX_CDB_SIZE 252
 
+/* Following enum contains the states of sg_request::rq_state */
+enum sg_rq_state {
+   SG_RQ_INACTIVE = 0, /* request not in use (e.g. on fl) */
+   SG_RQ_INFLIGHT, /* SCSI request issued, no response yet */
+   SG_RQ_AWAIT_READ,   /* response received, awaiting read */
+   SG_RQ_DONE_READ,/* read is ongoing or done */
+   SG_RQ_BUSY, /* example: reserve request changing size */
+};
+
+/* free up requests larger than this dlen size after use */
+#define SG_RQ_DATA_THRESHOLD (128 * 1024)
+
+/* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */
+#define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024)
+
+#define SG_TIME_UNIT_MS 0   /* milliseconds */
+#define SG_TIME_UNIT_NS 1   /* nanoseconds */
+#define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
@@ -950,6 +968,7 @@ sg_fill_request_table(struct sg_fd *sfp, struct sg_req_info 
*rinfo)
}
 }
 
+#if 0  /* temporary to shorten big patch */
 static long
 sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 {
@@ -1227,6 +1246,7 @@ sg_compat_ioctl(struct file *filp, unsigned int cmd_in, 
unsigned long arg)
return -ENOIOCTLCMD;
 }
 #endif
+#endif /* temporary to shorten big patch */
 
 static __poll_t
 sg_poll(struct file *filp, poll_table * wait)
@@ -1496,10 +1516,12 @@ static const struct file_operations sg_fops = {
.read = sg_read,
.write = sg_write,
.poll = sg_poll,
+#if 0  /* temporary to shorten big patch */
.unlocked_ioctl = sg_ioctl,
 #ifdef CONFIG_COMPAT
.compat_ioctl = sg_compat_ioctl,
 #endif
+#endif /* temporary to shorten big patch */
.open = sg_open,
.mmap = sg_mmap,
.release = sg_release,
@@ -2422,12 +2444,16 @@ static const struct seq_operations devstrs_seq_ops = {
.show  = sg_proc_seq_show_devstrs,
 };
 
+#if 0  /* temporary to shorten big patch */
 static int sg_proc_seq_show_debug(struct seq_file *s, void *v);
+#endif /* temporary to shorten big patch */
 static const struct seq_operations debug_seq_ops = {
.start = dev_seq_start,
.next  = dev_seq_next,
.stop  = dev_seq_stop,
+#if 0  /* temporary to shorten big patch */
.show  = sg_proc_seq_show_debug,
+#endif /* temporary to shorten big patch */
 };
 
 static int
@@ -2601,6 +2627,8 @@ sg_proc_seq_show_devstrs(struct seq_file *s, void *v)
return 0;
 }
 
+#if 0  /* temporary to shorten big patch */
+
 /* must be called while holding sg_index_lock */
 static void
 sg_proc_debug_helper(struct seq_file *s, struct sg_device *sdp)
@@ -2704,6 +2732,7 @@ sg_proc_seq_show_debug(struct seq_file *s, void *v)
read_unlock_irqrestore(_index_lock, iflags);
return 0;
 }
+#endif /* temporary to shorten big patch */
 
 #endif /* CONFIG_SCSI_PROC_FS */
 
-- 
2.17.1



[PATCH v3 7/8] sg: rework ioctl handling

2018-10-26 Thread Douglas Gilbert
Rework ioctl handling, report clearly to the log which ioctl has
been invoked. Add a new "IOWR" ioctl: SG_SET_GET_EXTENDED which
permits several integer and boolean values to be "_SET_" (i.e.
passed into the driver, potentially changing its actions) and/or
read from the driver (the "_GET_" part) in a single operation.

Signed-off-by: Douglas Gilbert 
---

One feature of the new SG_SET_GET_EXTENDED ioctl is ability to
fetch the sg device minor number (e.g. the "3" in /dev/sg3)
associated with the current file descriptor. A boolean addition
is the ability to change command timekeeping on the current file
descriptor from units of milliseconds (the default) to
nanoseconds.

 drivers/scsi/sg.c | 535 +++---
 1 file changed, 409 insertions(+), 126 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 8b4a65340971..3e89bbd508de 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -90,8 +90,8 @@ enum sg_rq_state {
 /* If sum_of(dlen) of a fd exceeds this, write() will yield E2BIG */
 #define SG_TOT_FD_THRESHOLD (16 * 1024 * 1024)
 
-#define SG_TIME_UNIT_MS 0   /* milliseconds */
-#define SG_TIME_UNIT_NS 1   /* nanoseconds */
+#define SG_TIME_UNIT_MS 0  /* milliseconds */
+#define SG_TIME_UNIT_NS 1  /* nanoseconds */
 #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
@@ -186,7 +186,6 @@ struct sg_fd {  /* holds the state of a file 
descriptor */
bool cmd_q; /* true -> allow command queuing, false -> don't */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
-   bool sse_seen;  /* SG_SET_EXTENDED ioctl seen */
bool time_in_ns;/* report times in nanoseconds */
u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
struct sg_request *reserve_srp; /* allocate on open(), starts on fl */
@@ -240,6 +239,8 @@ static void sg_remove_request(struct sg_fd *sfp, struct 
sg_request *srp);
 static struct sg_device *sg_get_dev(int min_dev);
 static void sg_device_destroy(struct kref *kref);
 static const char *sg_rq_state_str(u8 rq_state, bool long_str);
+static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first,
+   rwlock_t *rwlp, unsigned long *iflagsp);
 
 #define SZ_SG_HEADER sizeof(struct sg_header)  /* v1 and v2 header */
 #define SZ_SG_IO_HDR sizeof(struct sg_io_hdr)  /* v3 header */
@@ -508,9 +509,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, 
loff_t * ppos)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
+   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
 
if (!access_ok(VERIFY_WRITE, buf, count))
return -EFAULT;
@@ -815,11 +816,11 @@ sg_v3_write(struct sg_fd *sfp, struct file *file, const 
char __user *buf,
return -ENOSYS;
if (hp->flags & SG_FLAG_MMAP_IO) {
if (!list_empty(>rq_list))
-   return -EBUSY;  /* already active requests on fd */
+   return -EBUSY;  /* already active requests on fd */
if (hp->dxfer_len > sfp->reserve_srp->data.dlen)
-   return -ENOMEM; /* MMAP_IO size must fit in reserve */
+   return -ENOMEM; /* MMAP_IO size must fit in reserve */
if (hp->flags & SG_FLAG_DIRECT_IO)
-   return -EINVAL; /* not both MMAP_IO and DIRECT_IO */
+   return -EINVAL; /* not both MMAP_IO and DIRECT_IO */
}
sfp->cmd_q = true; /* when sg_io_hdr seen, set command queuing on */
ul_timeout = msecs_to_jiffies(hp->timeout);
@@ -857,7 +858,7 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr 
*hi_p,
srp = sg_add_request(sfp, hi_p->dxfer_len, false);
if (IS_ERR(srp))
return srp;
-   srp->header = *hi_p;/* structure assignment, could memcpy */
+   srp->header = *hi_p;/* structure assignment, could memcpy */
hp = >header;
srp->data.cmd_opcode = cmnd[0]; /* hold opcode of command */
hp->status = 0;
@@ -1026,69 +1027,300 @@ srp_state_or_detaching(struct sg_device *sdp, struct 
sg_request *srp)
return ret;
 }
 
+/* For handling ioctl(SG_IO). Returns 0 on success else a negated errno */
+static int
+sg_sg_io(struct file *filp, struct sg_device *sdp, struct sg_fd *sfp,
+void __user *p)
+{
+   bool read_only = (O_RDWR != (filp->f_flags & O_ACCM

[PATCH v3 2/8] sg: introduce sg_log macro

2018-10-26 Thread Douglas Gilbert
Introduce the SG_LOG macro to replace long-winded
'SCSI_LOG_TIMEOUT(3, sg_printk ...' debug messages. Use __func__
wherever appropriate to make the debug messages more portable.

Signed-off-by: Douglas Gilbert 
---

Reviewers want SCSI_LOG_* style logging replaced. But with
what? One suggestion is the trace subsystem but that seems
data and event oriented whereas the current logging is call
based with extra messages when error paths are taken. There
are many debugging hours of work crafted into the current
SCSI_LOG_* messages that should not be thrown out for some
pretty solution without additional benefits.

 drivers/scsi/sg.c | 162 +-
 1 file changed, 72 insertions(+), 90 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 78a35e63d177..94e13a1d21a5 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -216,9 +216,24 @@ static void sg_device_destroy(struct kref *kref);
 /* #define SZ_SG_IOVEC sizeof(struct sg_iovec) synonym for 'struct iovec' */
 #define SZ_SG_REQ_INFO sizeof(struct sg_req_info)
 
-#define sg_printk(prefix, sdp, fmt, a...) \
-   sdev_prefix_printk(prefix, (sdp)->device,   \
-  (sdp)->disk->disk_name, fmt, ##a)
+/*
+ * Kernel needs to be built with CONFIG_SCSI_LOGGING to see log messages.
+ * 'depth' is a number between 1 (most severe) and 7 (most noisy, most
+ * information). All messages are logged as informational (KERN_INFO). In
+ * the unexpected situation where sdp is NULL the macro reverts to a pr_info
+ * and ignores CONFIG_SCSI_LOGGING and always prints to the log.
+ */
+#define SG_LOG(depth, sdp, fmt, a...)  \
+   do {\
+   if (IS_ERR_OR_NULL(sdp)) {  \
+   pr_info("sg: sdp=NULL_or_ERR, " fmt, ##a);  \
+   } else {\
+   SCSI_LOG_TIMEOUT(depth, sdev_prefix_printk( \
+KERN_INFO, (sdp)->device,  \
+(sdp)->disk->disk_name, fmt,   \
+##a)); \
+   }   \
+   } while (0)
 
 /*
  * The SCSI interfaces that use read() and write() as an asynchronous variant 
of
@@ -316,9 +331,8 @@ sg_open(struct inode *inode, struct file *filp)
sdp = sg_get_dev(min_dev);
if (IS_ERR(sdp))
return PTR_ERR(sdp);
-
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp,
- "sg_open: flags=0x%x\n", flags));
+   SG_LOG(3, sdp, "%s: flags=0x%x; device open count prior=%d\n",
+  __func__, flags, sdp->open_cnt);
 
/* This driver's module count bumped by fops_get in  */
/* Prevent the device driver from vanishing while we sleep */
@@ -414,9 +428,10 @@ sg_release(struct inode *inode, struct file *filp)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "sg_release\n"));
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
+   SG_LOG(3, sdp, "%s: device open count prior=%d\n", __func__,
+  sdp->open_cnt);
 
mutex_lock(>open_rel_lock);
scsi_autopm_put_device(sdp->device);
@@ -462,8 +477,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, 
loff_t * ppos)
sdp = sfp->parentdp;
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n",
-__func__, (int)count));
+   SG_LOG(3, sdp, "%s: read() count=%d\n", __func__, (int)count);
 
if (!access_ok(VERIFY_WRITE, buf, count))
return -EFAULT;
@@ -663,10 +677,9 @@ sg_write(struct file *filp, const char __user *buf, size_t 
count, loff_t * ppos)
return IS_ERR(sfp) ? PTR_ERR(sfp) : -ENXIO;
}
sdp = sfp->parentdp;
+   SG_LOG(3, sdp, "%s: write(3rd arg) count=%d\n", __func__, (int)count);
if (IS_ERR_OR_NULL(sdp))
return IS_ERR(sdp) ? PTR_ERR(sdp) : -ENXIO;
-   SCSI_LOG_TIMEOUT(3, sg_printk(KERN_INFO, sdp, "%s: count=%d\n",
-__func__, (int)count));
if (atomic_read(>detaching))
return -ENODEV;
if (!((filp->f_flags & O_NONBLOCK) ||
@@ -687,8 +700,7 @@ sg_write(struct file *filp, const char __user *buf, size_t 
count, loff_t * ppos)
return -EIO;/* minimum scsi command length is 6 bytes */
 
if (!(srp = sg_add_request(s

[PATCH v3 1/8] sg: types and naming cleanup

2018-10-26 Thread Douglas Gilbert
Remove typedefs and use better type names like bool and u8 where
appropriate. Rename some variables and functions for clarity.
Adjust formatting (e.g. function definitions) to be more
consistent across the driver.

Signed-off-by: Douglas Gilbert 
---

The intention is to move to sg_version_num 40001 when a second
patchset implementing SG_IOSUBMIT and friends is applied. In
the meantime, move from the latest version number in the kernel
(30536) to 30901 to indicate a significant change but not yet
implementing the sg v4 interface.

 drivers/scsi/sg.c | 827 ++
 1 file changed, 467 insertions(+), 360 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index c6ad00703c5b..78a35e63d177 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -7,7 +7,7 @@
  * Original driver (sg.c):
  *Copyright (C) 1992 Lawrence Foard
  * Version 2 and 3 extensions to driver:
- *Copyright (C) 1998 - 2014 Douglas Gilbert
+ *Copyright (C) 1998 - 2018 Douglas Gilbert
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -16,16 +16,9 @@
  *
  */
 
-static int sg_version_num = 30536; /* 2 digits for each component */
-#define SG_VERSION_STR "3.5.36"
+static int sg_version_num = 30901; /* 2 digits for each component */
+#define SG_VERSION_STR "3.9.01"
 
-/*
- *  D. P. Gilbert (dgilb...@interlog.com), notes:
- *  - scsi logging is available via SCSI_LOG_TIMEOUT macros. First
- *the kernel/module needs to be built with CONFIG_SCSI_LOGGING
- *(otherwise the macros compile to empty statements).
- *
- */
 #include 
 
 #include 
@@ -52,6 +45,8 @@ static int sg_version_num = 30536;/* 2 digits for each 
component */
 #include 
 #include 
 #include  /* for sg_check_file_access() */
+#include 
+#include 
 
 #include "scsi.h"
 #include 
@@ -64,7 +59,7 @@ static int sg_version_num = 30536;/* 2 digits for each 
component */
 
 #ifdef CONFIG_SCSI_PROC_FS
 #include 
-static char *sg_version_date = "20140603";
+static char *sg_version_date = "20181019";
 
 static int sg_proc_init(void);
 #endif
@@ -73,7 +68,8 @@ static int sg_proc_init(void);
 
 #define SG_MAX_DEVS 32768
 
-/* SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type
+/*
+ * SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type
  * of sg_io_hdr::cmd_len can only represent 255. All SCSI commands greater
  * than 16 bytes are "variable length" whose length is a multiple of 4
  */
@@ -100,44 +96,54 @@ static int sg_add_device(struct device *, struct 
class_interface *);
 static void sg_remove_device(struct device *, struct class_interface *);
 
 static DEFINE_IDR(sg_index_idr);
-static DEFINE_RWLOCK(sg_index_lock);   /* Also used to lock
-  file descriptor list 
for device */
+static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock fd list for device */
 
 static struct class_interface sg_interface = {
.add_dev= sg_add_device,
.remove_dev = sg_remove_device,
 };
 
-typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info 
*/
-   unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */
-   unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */
-   unsigned bufflen;   /* Size of (aggregate) data buffer */
-   struct page **pages;
-   int page_order;
-   char dio_in_use;/* 0->indirect IO (or mmap), 1->dio */
-   unsigned char cmd_opcode; /* first byte of command */
-} Sg_scatter_hold;
+struct sg_v4_hold {/* parts of sg_io_v4 object needed in async usage */
+   __user u8 *usr_ptr; /* derived from sg_io_v4::usr_ptr */
+   __user u8 *sbp; /* derived from sg_io_v4::response */
+   u16 cmd_len;/* truncated of sg_io_v4::request_len */
+   u16 max_sb_len; /* truncated of sg_io_v4::max_response_len */
+   u32 flags;  /* copy of sg_io_v4::flags */
+};
+
+struct sg_scatter_hold { /* holding area for scsi scatter gather info */
+   struct page **pages;/* num_sgat element array of struct page* */
+   int page_order; /* byte_len = (page_size * (2**page_order)) */
+   int dlen;   /* Byte length of data buffer */
+   unsigned short num_sgat;/* actual number of scatter-gather segments */
+   bool dio_in_use;/* false->indirect IO (or mmap), true->dio */
+   u8 cmd_opcode;  /* first byte of command */
+};
 
 struct sg_device;  /* forward declarations */
 struct sg_fd;
 
-typedef struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
+struct sg_request {/* SG_MAX_QUEUE requests outstanding per file */
struct list_head entry; /* list entry */

[PATCH v3 0/8] sg: major cleanup, remove max_queue limit

2018-10-26 Thread Douglas Gilbert
The intention is to add two new ioctls as proposed by Linus Torvalds:
SG_IOSUBMIT and SG_IORECEIVE to replace the write()/read() async
interface.

But first, clean up the driver and remove the SG_MAX_QUEUE limit of
no more than 16 queued commands on a file descriptor at a time. A
free list has been added and the de-allocation of sg_request
objects is deferred until the close() of a file. Locking is
extensively reworked, especially at the struct sg_fd and
sg_request level.

A new SG_SET_GET_EXTENDED ioctl has been added. It allows multiple
integer values and booleans to be written to and/or read from the
driver. An example is changing and/or reading the reserved request
data length (there is one of these per fd). An example of a new
feature is changing and/or reading the per-fd upper limit on the
sum of outstanding data buffer sizes (default is 16 MB). An
example of a boolean is a bit to do all following command
timekeeping in nanoseconds rather that the default millseconds.

A later patchset will add implementations for the SG_IOSUBMIT and
SG_IORECEIVE plus handling of the sg v4 interface with the
existing SG_IO ioctl.

This patchset is against Martin Petersen's 4.20/scsi-queue branch.

ToDo:
  - work out more modern technique for logging function invocations
info with some error paths.

Changes since v2:
  - fix problem with reserve request not being placed on fl
  - fix locking problem with sg_reserved_sz()
  - fix sum_fd_dlens counting problem
  - change new ioctl base numbers back to 0x22
  - arrange fl in ascending dlen order apart from dlen=0
entries at tail
  - SG_GET_REQUEST_TABLE ioctl first fill from active list
then, if there is room, it fills from the free list
  - add SG_SEIRV_DEV_FL_RQS to fetch count of all free list
entries on the device the fd is associated with

Changes since v1:
  - remove redundant casts from private_data field
  - introduce atomic to address locking problems around
sg_fd::sum_fd_dlens
  - replace rq_state defines with an enumeration
  - add __must_hold() annotation to sg_fill_request_table()
  - fix compile/build problem around the 4th and 5th patches
  - add read_value[SG_SEIRV_*] options in SG_SET_GET_EXTENDED
ioctl and increase structure size from 64 to 96 bytes


Douglas Gilbert (8):
  sg: types and naming cleanup
  sg: introduce sg_log macro
  sg: split header, expand and correct descriptions
  sg: expand request states
  sg: add free list, rework locking
  sg: complete locking changes on ioctl+debug
  sg: rework ioctl handling
  sg: user controls for q_at_head, read_value

 drivers/scsi/sg.c  | 2621 ++--
 include/scsi/sg.h  |  268 +---
 include/uapi/scsi/sg.h |  414 +++
 3 files changed, 2156 insertions(+), 1147 deletions(-)
 create mode 100644 include/uapi/scsi/sg.h

-- 
2.17.1



Re: [PATCH v3 0/8] sg: major cleanup, remove max_queue limit

2018-10-28 Thread Douglas Gilbert

For anyone thinking of testing this patchset, it also applies clean to
lk 4.19 release.

Doug Gilbert

On 2018-10-26 12:48 p.m., Douglas Gilbert wrote:

The intention is to add two new ioctls as proposed by Linus Torvalds:
SG_IOSUBMIT and SG_IORECEIVE to replace the write()/read() async
interface.

But first, clean up the driver and remove the SG_MAX_QUEUE limit of
no more than 16 queued commands on a file descriptor at a time. A
free list has been added and the de-allocation of sg_request
objects is deferred until the close() of a file. Locking is
extensively reworked, especially at the struct sg_fd and
sg_request level.

A new SG_SET_GET_EXTENDED ioctl has been added. It allows multiple
integer values and booleans to be written to and/or read from the
driver. An example is changing and/or reading the reserved request
data length (there is one of these per fd). An example of a new
feature is changing and/or reading the per-fd upper limit on the
sum of outstanding data buffer sizes (default is 16 MB). An
example of a boolean is a bit to do all following command
timekeeping in nanoseconds rather that the default millseconds.

A later patchset will add implementations for the SG_IOSUBMIT and
SG_IORECEIVE plus handling of the sg v4 interface with the
existing SG_IO ioctl.

This patchset is against Martin Petersen's 4.20/scsi-queue branch.

ToDo:
   - work out more modern technique for logging function invocations
 info with some error paths.

Changes since v2:
   - fix problem with reserve request not being placed on fl
   - fix locking problem with sg_reserved_sz()
   - fix sum_fd_dlens counting problem
   - change new ioctl base numbers back to 0x22
   - arrange fl in ascending dlen order apart from dlen=0
 entries at tail
   - SG_GET_REQUEST_TABLE ioctl first fill from active list
 then, if there is room, it fills from the free list
   - add SG_SEIRV_DEV_FL_RQS to fetch count of all free list
 entries on the device the fd is associated with

Changes since v1:
   - remove redundant casts from private_data field
   - introduce atomic to address locking problems around
 sg_fd::sum_fd_dlens
   - replace rq_state defines with an enumeration
   - add __must_hold() annotation to sg_fill_request_table()
   - fix compile/build problem around the 4th and 5th patches
   - add read_value[SG_SEIRV_*] options in SG_SET_GET_EXTENDED
 ioctl and increase structure size from 64 to 96 bytes


Douglas Gilbert (8):
   sg: types and naming cleanup
   sg: introduce sg_log macro
   sg: split header, expand and correct descriptions
   sg: expand request states
   sg: add free list, rework locking
   sg: complete locking changes on ioctl+debug
   sg: rework ioctl handling
   sg: user controls for q_at_head, read_value

  drivers/scsi/sg.c  | 2621 ++--
  include/scsi/sg.h  |  268 +---
  include/uapi/scsi/sg.h |  414 +++
  3 files changed, 2156 insertions(+), 1147 deletions(-)
  create mode 100644 include/uapi/scsi/sg.h





[PATCH v3.5 8/8] sg: user controls for q_at_head, read_value

2018-10-26 Thread Douglas Gilbert
Add a SG_SET_GET_EXTENDED ioctl control for whether commands
will be queued_at_head or queued_at_tail by the block layer
(together with the scsi mid-level). It has file scope.

Also add a read_value integer the can be used by write a
value from the SG_SEIRV_* group then the corresponding value
will be returned.

Signed-off-by: Douglas Gilbert 
---

The user can still override the new file scope setting on a
a per command basis with the SG_FLAG_Q_AT_HEAD and
SG_FLAG_Q_AT_TAIL in the sg v3 and v4 structures.

An example of read_value usage is to write the value
SG_SEIRV_FL_RQS to the read_value field. Then after the
SG_SET_GET_EXTENDED ioctl is run, the number of (inactive)
requests currently on this file descriptor's request free
list is placed in the read_value field.

Added in v3 is SG_SEIRV_DEV_FL_RQS which is an expansion of
SG_SEIRV_FL_RQS. SG_SEIRV_DEV_FL_RQS counts free list entries
on all sg file descriptors currently open on the device that
the file descriptor (given to ioctl()) is associated with.


Change since v3 [patches 1 to 7 of 8 are unchanged, this for 8/8]:
  - move sg_version_date definition from inside CONFIG_SCSI_PROC_FS
conditional out to full file scope and near related definitions

 drivers/scsi/sg.c  | 160 +++--
 include/scsi/sg.h  |  32 ++---
 include/uapi/scsi/sg.h |  49 ++---
 3 files changed, 150 insertions(+), 91 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 3e89bbd508de..5ba92f112481 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -18,6 +18,7 @@
 
 static int sg_version_num = 30901; /* 2 digits for each component */
 #define SG_VERSION_STR "3.9.01"
+static char *sg_version_date = "20181024";
 
 #include 
 
@@ -59,7 +60,6 @@ static int sg_version_num = 30901;/* 2 digits for each 
component */
 
 #ifdef CONFIG_SCSI_PROC_FS
 #include 
-static char *sg_version_date = "20181019";
 
 static int sg_proc_init(void);
 #endif
@@ -95,6 +95,10 @@ enum sg_rq_state {
 #define SG_DEF_TIME_UNIT SG_TIME_UNIT_MS
 #define SG_DEFAULT_TIMEOUT mult_frac(SG_DEFAULT_TIMEOUT_USER, HZ, USER_HZ)
 
+#define SG_FD_Q_AT_TAIL true
+#define SG_FD_Q_AT_HEAD false
+#define SG_DEFAULT_Q_AT SG_FD_Q_AT_HEAD/* for backward compatibility */
+
 int sg_big_buff = SG_DEF_RESERVED_SIZE;
 /* N.B. This variable is readable and writeable via
/proc/scsi/sg/def_reserved_size . Each time sg_open() is called a buffer
@@ -187,6 +191,7 @@ struct sg_fd {  /* holds the state of a file 
descriptor */
bool keep_orphan;/* false -> drop (def), true -> keep for read() */
bool mmap_called;   /* false -> mmap() never called on this fd */
bool time_in_ns;/* report times in nanoseconds */
+   bool q_at_tail; /* queue at tail if true, head when false */
u8 next_cmd_len;/* 0: automatic, >0: use on next write() */
struct sg_request *reserve_srp; /* allocate on open(), starts on fl */
struct fasync_struct *async_qp; /* used by asynchronous notification */
@@ -238,7 +243,7 @@ static struct sg_request *sg_add_request(struct sg_fd *sfp, 
int dxfr_len,
 static void sg_remove_request(struct sg_fd *sfp, struct sg_request *srp);
 static struct sg_device *sg_get_dev(int min_dev);
 static void sg_device_destroy(struct kref *kref);
-static const char *sg_rq_state_str(u8 rq_state, bool long_str);
+static const char *sg_rq_state_str(enum sg_rq_state rq_state, bool long_str);
 static struct sg_request *sg_mk_srp(struct sg_fd *sfp, bool first,
rwlock_t *rwlp, unsigned long *iflagsp);
 
@@ -855,7 +860,7 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr 
*hi_p,
 
if (h4p || !hi_p)
return ERR_PTR(-EOPNOTSUPP);
-   srp = sg_add_request(sfp, hi_p->dxfer_len, false);
+   srp = sg_add_request(sfp, hi_p->dxfer_len, sync);
if (IS_ERR(srp))
return srp;
srp->header = *hi_p;/* structure assignment, could memcpy */
@@ -897,9 +902,13 @@ sg_common_write(struct sg_fd *sfp, const struct sg_io_hdr 
*hi_p,
srp->start_ts = ktime_get_with_offset(TK_OFFS_BOOT);
else
hp->duration = jiffies_to_msecs(jiffies);
-   /* at tail if v3 or later interface and tail flag set */
-   at_head = !(hp->interface_id != '\0' &&
-   (SG_FLAG_Q_AT_TAIL & hp->flags));
+
+   if (hp->interface_id == '\0')   /* v1 and v2 interface */
+   at_head = true; /* backward compatibility */
+   else if (sfp->q_at_tail)  /* cmd flags can override sfd setting */
+   at_head = (hp->flags & SG_FLAG_Q_AT_HEAD);
+   else/* this sfd is defaulting to head */
+   at_head = !(hp->flags & SG_FLAG_Q_AT_TAIL);
 
srp->rq->timeout = timeout;
kref_get(>f_ref)

Re: [PATCH v3 8/8] sg: user controls for q_at_head, read_value

2018-10-26 Thread Douglas Gilbert

On 2018-10-26 12:48 p.m., Douglas Gilbert wrote:

Add a SG_SET_GET_EXTENDED ioctl control for whether commands
will be queued_at_head or queued_at_tail by the block layer
(together with the scsi mid-level). It has file scope.

Also add a read_value integer the can be used by write a
value from the SG_SEIRV_* group then the corresponding value
will be returned.

Signed-off-by: Douglas Gilbert 
---


See "v3.5" of this patch to fix the compile error when
CONFIG_SCSI_PROC_FS is not defined in a kernel build.

Doug Gilbert



Re: [PATCH] rescan-scsi-bus.sh: use LUN wildcard in idlist

2018-11-08 Thread Douglas Gilbert

On 2018-10-25 10:49 a.m., Martin Wilck wrote:

By scanning for LUN 0 only, we may encounter a device that the
kernel won't add (e.g. peripheral device type 31) and which may
thus never appear in sysfs for us to use for REPORT LUNS. That
causes LUN additions for such devices to be missed by
"rescan-iscsi-bus.sh -a".


Seems reasonable and applies clean to my development true. For
rescan-scsi-bus.sh I usually defer to Hannes. I have put it in
pending a decision from him.

BTW My alphas and betas (and releases) of sg3_utils go out through
Hannes' github account [https://github.com/hreinecke/sg3_utils] and/or
http://sg.danny.cz/sg . At the moment the latter is more up to date.

Doug Gilbert


Signed-off-by: Martin Wilck 
---
  scripts/rescan-scsi-bus.sh | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/rescan-scsi-bus.sh b/scripts/rescan-scsi-bus.sh
index 6989208..a2aa8d8 100755
--- a/scripts/rescan-scsi-bus.sh
+++ b/scripts/rescan-scsi-bus.sh
@@ -376,7 +376,7 @@ idlist ()
  
oldlist=$(ls /sys/class/scsi_device/ | sed -n "s/${host}:${channel}:\([0-9]*:[0-9]*\)/\1/p" | uniq)

# Rescan LUN 0 to check if we found new targets
-  echo "${channel} - 0" > /sys/class/scsi_host/host${host}/scan
+  echo "${channel} - -" > /sys/class/scsi_host/host${host}/scan
newlist=$(ls /sys/class/scsi_device/ | sed -n 
"s/${host}:${channel}:\([0-9]*:[0-9]*\)/\1/p" | uniq)
for newid in $newlist ; do
  oldid=$newid





Re: [PATCH v1] sg3_utils: sg_write_buffer: convert string to integer while reading from stdio

2018-11-09 Thread Douglas Gilbert

On 2018-11-09 7:18 p.m., Bean Huo (beanhuo) wrote:

This patch is to convert inputted string to the integer when
read data from stdin. While entering data, the data between bytes
can be separated by space, or by ',' or by '.'.


Could you send me this patch against sg_write_buffer.c as found
in the recently released version 1.44 . In there the version
string for sg_write_buffer.c is: "1.28 20180628"


Signed-off-by: Bean Huo 
---
  src/sg_write_buffer.c | 31 +++
  1 file changed, 31 insertions(+)

diff --git a/src/sg_write_buffer.c b/src/sg_write_buffer.c
index 7560e7e..902bc5d 100644
--- a/src/sg_write_buffer.c
+++ b/src/sg_write_buffer.c
@@ -195,6 +195,7 @@ main(int argc, char * argv[])
  const char * device_name = NULL;
  const char * file_name = NULL;
  unsigned char * dop = NULL;
+unsigned char * read_buf= NULL;


For example the above 2 lines now use uint8_t


  char * cp;
  const struct mode_s * mp;
  char ebuff[EBUFF_SZ];
@@ -394,6 +395,33 @@ main(int argc, char * argv[])
  }
  }
  }
+   if (infd == STDIN_FILENO) {
+   if (NULL == (read_buf = (unsigned char *)malloc(DEF_XFER_LEN))) 
{
+   pr2serr(ME "out of memory\n");
+   ret = SG_LIB_SYNTAX_ERROR;
+   goto err_out;
+   }
+   res = read(infd, read_buf, DEF_XFER_LEN);
+   if (res < 0) {
+   snprintf(ebuff, EBUFF_SZ, ME "couldn't read from 
STDIN");
+   perror(ebuff);
+   ret = SG_LIB_FILE_ERROR;
+   goto err_out;
+   }
+   char * pch;
+   int val = 0;
+   res = 0;
+   pch = strtok(read_buf, ",. ");
+   while (pch != NULL) {
+   printf("read %s ", pch);
+   val = sg_get_num_nomult(pch);
+   if (val > 0 && val < 255) {


Hmm, perhaps:
if (val >= 0 && val < 256)

else



+   dop[res] = val;
+   res++;
+   }
+   pch = strtok(NULL, ",. ");
+   }
+   } else {
  res = read(infd, dop, wb_len);
  if (res < 0) {
  snprintf(ebuff, EBUFF_SZ, ME "couldn't read from %s",
@@ -404,6 +432,7 @@ main(int argc, char * argv[])
  ret = SG_LIB_FILE_ERROR;
  goto err_out;
  }
+   }


Tabbing looks a little off here.

Thanks
Doug Gilbert


  if (res < wb_len) {
  if (wb_len_given) {
  pr2serr("tried to read %d bytes from %s, got %d bytes\n",
@@ -472,6 +501,8 @@ main(int argc, char * argv[])
  err_out:
  if (dop)
  free(dop);
+if (read_buf)
+   free(read_buf);
  res = sg_cmds_close_device(sg_fd);
  if (res < 0) {
  pr2serr("close error: %s\n", safe_strerror(-res));





Re: [PATCH V2] sg3_utils: sg_write_buffer: convert string to integer while reading from stdio

2018-11-12 Thread Douglas Gilbert

On 2018-11-12 11:32 a.m., Bean Huo (beanhuo) wrote:

This patch is to convert inputted string to the integer
when read data from stdin. While entering data flow, the
data between bytes can be separated by either space,
or ',' (or by '.').

V1-V2:
1. Rebased the patch on the latest sg_write_buffer.c
2. Added the wrong input checkup, and process
3. Changed the delimer from ",. " to ",. /n/t", in order
 to skip the spaces at the end of string because of misoperation.
4. Modified some wrong indents.

Tested on my own UFS platform, used to issue VU command:

./sg_write_buffer -b 0 -i 0 -v -l 0x2c -m 1  -S 7 -r /dev/block/sda
0x01 0x40 0x20
tried to read 44 bytes from -, got 3 bytes
pad with 0xff bytes and continue
sending single write buffer, mode=0x1, mpsec=7, id=0, offset=0, len=44
 Write buffer cdb: 3b e1 00 00 00 00 00 00 2c 00

./sg_read_buffer -l 32 -m 1 -S 6 -v /dev/block/sda
 Read buffer(10) cdb: 3c c1 00 00 00 00 00 00 20 00
  00 61 34 64 36 63 38 61 42  65 36 4b 4d 4c 34 4c 30
  10 30 34 44 43 45 32 30 32  57 30 00 00 00 00 00 00

Signed-off-by: Bean Huo 


Thanks, applied.

Just prior to getting this patch today I put up a sg3_utils-1.45
(early) beta: revision 795. It's in a tarball at the top of
http://sg.danny.cz/sg/

This patch will be in revision 796.

Doug Gilbert


<    3   4   5   6   7   8