RE: [patch] Revert "block: remove artifical max_hw_sectors cap"

2015-07-29 Thread Elliott, Robert (Server Storage)

> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Jeff Moyer
> Sent: Wednesday, July 29, 2015 11:53 AM
> To: Christoph Hellwig 
> Cc: Jens Axboe ; linux-ker...@vger.kernel.org;
> dmilb...@redhat.com

Adding linux-scsi...

> Subject: Re: [patch] Revert "block: remove artifical max_hw_sectors cap"
> 
> Christoph Hellwig  writes:
> 
> > On Mon, Jul 20, 2015 at 03:17:07PM -0400, Jeff Moyer wrote:
> >> For SAN storage, we've seen initial write and re-write performance drop
> >> 25-50% across all I/O sizes.  On locally attached storage, we've seen
> >> regressions of 40% for all I/O types, but only for I/O sizes larger
> than
> >> 1MB.
> >
> > Workload, and hardare please.  An only mainline numbers, not some old
> > hacked vendor kernel, please.
> 
> I've attached a simple fio config that reproduces the problem.  It just
> does sequential, O_DIRECT write I/O with I/O sizes of 1M, 2M and 4M.  So
> far I've tested it on an HP HSV400 and an IBM XIV SAN array connected
> via a qlogic adapter, a nearline sata driveand a WD Red (NAS) sata disk
> connected via an intel ich9r sata controller.  The kernel I tested was
> 4.2.0-rc3, and the testing was done across 3 different hosts (just
> because I don't have all the hardware connected to a single box).  I did
> 10 runs using max_sectors_kb set to 1024, and 10 runs with it set to
> 32767.  Results compare the averages of those 10 runs.  In no cases did
> I see a performance gain.  In two cases, there is a performance hit.
> 
> In addition to my testing, our performance teams have seen performance
> regressions running iozone on fibre channel-attached HP MSA1000 storage,
> as well as on an SSD hidden behind a megaraid controller.  I was not
> able to get the exact details on the SSD.  iozone configurations can be
> provided, but I think I've nailed the underlying problem with this test
> case.
> 
> But, don't take my word for it.  Run the fio script on your own
> hardware.  All you have to do is echo a couple of values into
> /sys/block/sdX/queue/max_sectors_kb to test, no kernel rebuilding
> required.
> 
> In the tables below, concentrate on the BW/IOPS numbers under the WRITE
> column.  Negative numbers indicate that max_sectors_kb of 32767 shows a
> performance regression of the indicated percentage when compared with a
> setting of 1024.
> 
> Christoph, did you have some hardware where a higher max_sectors_kb
> improved performance?

I don't still have performance numbers, but the old default of 
512 KiB was interfering with building large writes that RAID
controllers can treat as full stripe writes (avoiding the need
to read the old parity).

With the SCSI LLD value bumped up, some other limits remain:
* the block layer BIO_MAX_PAGES value of 256 limits IOs
  to a maximum of 1 MiB (bio chaining affects this too)
* the SCSI LLD maximum number of scatter gather entries
  reported in /sys/block/sdNN/queue/max_segments and
  /sys/block/sdNN/queue/max_segment_size creates a
  limit based on how fragmented the data buffer is
  in virtual memory
* the Block Limits VPD page MAXIMUM TRANSFER LENGTH field
  indicates the maximum transfer size for one command over
  the SCSI transport protocol supported by the drive itself

The patch let 1 MiB IOs flow through the stack, which
is a better fit for modern strip sizes than 512 KiB.

Software using large IOs must be prepared for long 
latencies in exchange for the potential bandwidth gains,
and must use a low (but greater than 1) queue depth 
to keep the IOs flowing back-to-back.

Are you finding real software generating such IOs
but relying on the storage stack to break them up
for decent performance?

Your fio script is using the sync IO engine, which
means no queuing.  This forces a turnaround time 
between IOs, preventing the device from looking ahead
to see what's next (for sequential IOs, probably
continuing data transfers with minimal delay).

If the storage stack breaks up large sync IOs, the 
drive might be better at detecting that the access
pattern is sequential (e.g., the gaps are between 
every set of 2 IOs rather than every IO).  This is
very drive-specific.

If we have to go back to that artificial limit, then
modern drivers (e.g., blk-mq capable drivers) need a
way to raise the default; relying on users to change
the sysfs settings means they're usually not changed.

---
Robert Elliott, HP Server Storage


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] target: add support for START_STOP_UNIT SCSI opcode

2015-07-23 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Spencer Baugh
> Sent: Thursday, July 23, 2015 5:28 PM
> To: Christoph Hellwig ; Spencer Baugh
> 
...
> Subject: Re: [PATCH] target: add support for START_STOP_UNIT SCSI opcode
> 
> From: Brian Bunker 
> 
> AIX servers using VIOS servers that virtualize FC cards will have a
> problem booting without support for START_STOP_UNIT.
> 
> v2: Cite sb3r36 exactly, clean up if conditions
> 
> Signed-off-by: Brian Bunker 
> Signed-off-by: Spencer Baugh 
...
> v2: Cite sb3r36 exactly, clean up if conditions
> 
...
> diff --git a/drivers/target/target_core_sbc.c
> b/drivers/target/target_core_sbc.c
> index e318ddb..85c3c0a 100644
> --- a/drivers/target/target_core_sbc.c
> +++ b/drivers/target/target_core_sbc.c
> @@ -154,6 +154,38 @@ sbc_emulate_readcapacity_16(struct se_cmd *cmd)
>   return 0;
>  }
> 
> +static sense_reason_t
> +sbc_emulate_startstop(struct se_cmd *cmd)
> +{
> + unsigned char *cdb = cmd->t_task_cdb;
> +
> + /*
> +  * See sb3r36 section 5.25

Global typo in this patch - sb3 should be sbc3.

Editorial comment:
Note that the officially published versions of the ISO and ANSI 
standards don't carry that revision number r36; they just have
the standard name and year.  SBC-3 revision 36 became 
"ANSI INCITS 514-2014 Information technology - SCSI Block 
Commands - 3 (SBC-3)".

T10 isn't really obligated to keep making particular working 
drafts available, although the ones that have been assigned
version descriptors (in SPC-n) are more likely to stick 
around.  For SBC-3, only revisions 35 and 36 earned those.

Section numbers are quite volatile, too, so you might be
better off including the section name.  It's starting to
become standardese, but this kind of wording might be better:

"See the SBC-3 START STOP UNIT command description
(e.g., sbc3r36 5.25)"

> + /* From SBC-3:
> +  * Immediate bit should be set since there is nothing to complete
> +  * POWER CONDITION MODIFIER 0h
> +  */
> + if (!(cdb[1] & 1) || (cdb[2] | cdb[3]))
> + return TCM_INVALID_CDB_FIELD;

Technical comment:
The application client is not obligated to set the IMMED bit here.
In fact, that's an unusual choice.  Using the IMMED bit means the 
application client must handle the initial status for the 
command, then poll for the functional results with REQUEST SENSE
and TEST UNIT READY commands, which is much more complicated.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] target: Drop iSCSI use of mutex around max_cmd_sn increment

2015-07-22 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Spencer Baugh
> Sent: Wednesday, July 22, 2015 5:08 PM
> Subject: [PATCH] target: Drop iSCSI use of mutex around max_cmd_sn
> increment
...
> diff --git a/drivers/target/iscsi/iscsi_target_configfs.c
> b/drivers/target/iscsi/iscsi_target_configfs.c
> index c1898c8..29d5930 100644
> --- a/drivers/target/iscsi/iscsi_target_configfs.c
> +++ b/drivers/target/iscsi/iscsi_target_configfs.c
> @@ -706,8 +706,8 @@ static ssize_t lio_target_nacl_show_info(
>   rb += sprintf(page+rb, " 0x%08x   0x%08x   0x%08x   0x%08x"
>   "   0x%08x   0x%08x\n",
>   sess->cmdsn_window,
> - (sess->max_cmd_sn - sess->exp_cmd_sn) + 1,
> - sess->exp_cmd_sn, sess->max_cmd_sn,
> + ((u32) atomic_read(&sess->max_cmd_sn) - sess-
> >exp_cmd_sn) + 1,
> + sess->exp_cmd_sn, (u32) atomic_read(&sess->max_cmd_sn),
>   sess->init_task_tag, sess->targ_xfer_tag);
>   rb += sprintf(page+rb, "--[iSCSI"
>   " Connections]-\n");

Two calls to atomic_read could pick up different values; 
calling it once and using the same value twice would ensure
the arguments are consistent with each other.

> diff --git a/drivers/target/iscsi/iscsi_target_device.c
> b/drivers/target/iscsi/iscsi_target_device.c
> index 5fabcd3..a526904 100644
> --- a/drivers/target/iscsi/iscsi_target_device.c
> +++ b/drivers/target/iscsi/iscsi_target_device.c
...
> @@ -57,9 +57,7 @@ void iscsit_increment_maxcmdsn(struct iscsi_cmd *cmd,
> struct iscsi_session *sess
> 
>   cmd->maxcmdsn_inc = 1;
> 
> - mutex_lock(&sess->cmdsn_mutex);
> - sess->max_cmd_sn += 1;
> - pr_debug("Updated MaxCmdSN to 0x%08x\n", sess->max_cmd_sn);
> - mutex_unlock(&sess->cmdsn_mutex);
> + atomic_inc(&sess->max_cmd_sn);
> + pr_debug("Updated MaxCmdSN to 0x%08x\n", atomic_read(&sess-
> >max_cmd_sn));
>  }
>  EXPORT_SYMBOL(iscsit_increment_maxcmdsn);

If there is another change before the atomic_read, this would
print a value unrelated to the increment done by this thread. 
atomic_inc_return would provide the appropriate value to print.

---
Robert Elliott, HP Server Storage
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH, RFC] scsi: use host wide tags by default

2015-04-17 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of James Bottomley
> Sent: Friday, April 17, 2015 4:43 PM
> To: Christoph Hellwig
> Cc: linux-scsi@vger.kernel.org; ax...@kernel.dk
> Subject: Re: [PATCH, RFC] scsi: use host wide tags by default
> 
> On Fri, 2015-04-17 at 22:11 +0200, Christoph Hellwig wrote:
...
> > +   spin_lock_irqsave(sdev->request_queue->queue_lock, flags);
> > +   sdev->queue_depth = depth;
> > +   spin_unlock_irqrestore(sdev->request_queue->queue_lock, flags);
> 
> This lock/unlock is a nasty global sync point which can be eliminated:
> we can rely on the architectural atomicity of 32 bit writes (might need
> to make sdev->queue_depth a u32 because I seem to remember 16 bit writes
> had to be done as two byte stores on some architectures).

If such a change is made, consider using atomic_set (for an int) or
atomic64_set (for a 64-bit value) plus a barrier when needed.  That
communicates the need for atomicity but reduces to a plain store if
the architecture always handles that width atomically.

I don't think there is an atomic_set_short for a short, though.

---
Robert Elliott, HP Server Storage


RE: [PATCH v4 42/43] hpsa: add PMC to copyright

2015-04-16 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Don Brace
> Sent: Thursday, April 16, 2015 8:51 AM
> Subject: [PATCH v4 42/43] hpsa: add PMC to copyright
> 
> need to add PMC to copyright notice and update the Hewlett-Packard
> copyright notification.
...
> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
...
>   *Disk Array driver for HP Smart Array SAS controllers
> - *Copyright 2000, 2014 Hewlett-Packard Development Company, L.P.
> + *Copyright 2014-2015 PMC-Sierra, Inc.
> + *Portions Copyright 2008-2014 Hewlett-Packard Development Company,
> L.P.
...
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
...
>   *Disk Array driver for HP Smart Array SAS controllers
> - *Copyright 2000, 2014 Hewlett-Packard Development Company, L.P.
> + *Copyright 2014-2015 PMC-Sierra, Inc.
> + *Portions Copyright 2008-2014 Hewlett-Packard Development Company,
> L.P.
...
> diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
...
>   *Disk Array driver for HP Smart Array SAS controllers
> - *Copyright 2000, 2014 Hewlett-Packard Development Company, L.P.
> + *Copyright 2014-2015 PMC-Sierra, Inc.
> + *Portions Copyright 2008-2014 Hewlett-Packard Development Company,
> L.P.
...

HP would prefer to not add the word "Portions" there.

Also, I don't think the earliest date should change from what was
submitted in the first commit edd163687ea5 (Dec 2009), which was
"2000, 2009".  Maybe use "2000, 2009-2015" based on:
* 2000: original earliest claimed date
* 2009: date of first inclusion in kernel
* 2015: some of the HP-era patches are still trickling out 
  this year.

---
Robert Elliott, HP Server Storage

N�r��yb�X��ǧv�^�)޺{.n�+{���"�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i

RE: smp_processor_id warning in megasas driver on 3.19.3

2015-04-08 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Andy Lutomirski
> Sent: Wednesday, April 8, 2015 1:27 PM
> To: James Bottomley
> Cc: Kashyap Desai; Sumit Saxena; Uday Lingala;
> megaraidlinux@avagotech.com; Linux SCSI List
> Subject: Re: smp_processor_id warning in megasas driver on 3.19.3
> 
...
> Would raw_smp_processor_id be a good compromise?  I'm testing a patch
> right now and, if it works, I can send it and cc stable.
> 
> --Andy

That's what hpsa uses.

---
Robert Elliott, HP Server Storage

N�r��yb�X��ǧv�^�)޺{.n�+{���"�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i

RE: [PATCH] [RESEND] aic7xxx: replace kmalloc/memset by kzalloc

2015-03-24 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Joe Perches [mailto:j...@perches.com]
> Sent: Tuesday, March 24, 2015 3:57 PM
> To: Michael Opdenacker
> Cc: Hannes Reinecke; jbottom...@parallels.com; Elliott, Robert (Server
> Storage); linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH] [RESEND] aic7xxx: replace kmalloc/memset by kzalloc
> 
> On Tue, 2015-03-24 at 13:46 -0700, Michael Opdenacker wrote:
...
> > On 03/22/2015 11:59 PM, Hannes Reinecke wrote:
> > > On 03/22/2015 05:31 PM, Michael Opdenacker wrote:
> > >> This replaces kmalloc + memset by a call to kzalloc
> > >> (or kcalloc when appropriate, which zeroes memory too)
> > >>
...
> > I'm sending a version that reverts the use of kcalloc() instead of
> > kzalloc(). For reasons I don't understand, I didn't see the end of
> > Robert Elliott's comment that the use of kcalloc() could prevent the
> > compiler from detecting an overflow.
> 
> I'm confused.  I don't see that comment either, but
> the entire point of kcalloc is to prevent overflows
> by returning NULL when an overflow might occur.

It was a reply to the original post on 2014-10-16, not the resend
this month. 

>From http://permalink.gmane.org/gmane.linux.kernel/1808168:

kcalloc is helpful when one of the values is a variable that 
might cause the multiply to overflow during runtime.  Here, 
two constants are being multiplied together, which can
be done and checked by the compiler at compile time.  

Since kcalloc and kmalloc_array are both static inline 
functions:
static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags)
{
if (size != 0 && n > SIZE_MAX / size)
return NULL;
return __kmalloc(n * size, flags);
}
static inline void *kcalloc(size_t n, size_t size, gfp_t flags)
{
return kmalloc_array(n, size, flags | __GFP_ZERO);
}

a compiler that detects an overflow will probably just reduce
that to an inlined "return NULL."

BUILD_BUG_ON could be used to trigger a compile-time error,
instead of building a kernel that returns a run-time error.



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] scsi_debug: rework resp_report_luns

2015-02-25 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Winkler, Tomas
> Sent: Wednesday, February 25, 2015 1:54 AM
> To: dgilb...@interlog.com; James E.J. Bottomley"
> Cc: linux-scsi@vger.kernel.org
> Subject: RE: [PATCH] scsi_debug: rework resp_report_luns
> 
> > On 15-02-24 04:37 PM, Tomas Winkler wrote:
> > > 1. Fix the error check: the alloc length should be > 16
> > > and not > 4
> >
> > You are proposing to make a marginally incorrect test
> > completely incorrect!
> 
> Quoting from the spec:
> The ALLOCATION LENGTH field is defined in 2.2.6. The allocation length
> should be at least 16.
> Note. Device servers compliant with SPC return CHECK CONDITION status,
> with the sense key set to ILLEGAL REQUEST, and the additional sense code
> set to INVALID FIELD IN CDB when the allocation length is less than 16 bytes.
> This is how it is implemented also in other drivers in the scsi drivers.

That was just a warning to software that very old devices (compliant with SPC-1)
returned CHECK CONDITION in that case, explaining why the "should" advice
is included.  New devices (SPC-2 and later) are not supposed to return
CHECK CONDITION.  "SPC" means SPC-1 only, not all versions.  This
warning note was in SPC-2 and SPC-3 and dropped in SPC-4.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Device removal lockup with mptsas + scsi-mq

2015-02-04 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Tony Battersby
> Sent: Wednesday, 04 February, 2015 12:39 PM
> To: linux-scsi; Jens Axboe; Christoph Hellwig
> Cc: Sreekanth Reddy
> Subject: Device removal lockup with mptsas + scsi-mq
> 
> Summary:
> 
> When removing a SCSI device with scsi-mq, blk_mq_update_tag_set_depth()
> ends up waiting for commands to *other* SCSI devices to complete.  If
> those other SCSI devices are in the SDEV_BLOCK state, then the removal
> deadlocks.
> 
...
> 
> So far the only way I can get device removal to be reliable with scsi-mq
> enabled is by disabling the call to scsi_device_set_state(sdev,
> SDEV_BLOCK) entirely.  Device removal completes successfully with
> scsi-mq disabled, both with an unmodified kernel and with the patch from
> 2012.
> 
...
> Regarding mptsas:
> 
> When the cable is pulled, mptsas calls scsi_device_set_state(sdev,
> SDEV_BLOCK) and sets vtarget->deleted = 1.  If mptsas queuecommand()
> sees vtarget->deleted, it fails the I/O with DID_NO_CONNECT.  There is
> nowhere in mptsas where it calls scsi_device_set_state(sdev,
> SDEV_RUNNING) or scsi_internal_device_unblock() (except in the patch
> from 2012 just before deleting the device).  So setting SDEV_BLOCK is
> just blocking commands that can never do anything but fail anyway, so it
> can probably either be removed, or else a call to
> scsi_internal_device_unblock() should be added somewhere to unblock a
> device that came back.
> 

I ran into issues with mpt3sas usage of SDEV_BLOCK last year, and
recommend dropping that as part of any solution.

Old description:
"After a drive SAS link goes down, I often see device_blocked get
set to 3 and stay there forever, even if the drive comes back.

Although it seems good to keep the CPUs from retrying over and
over again, it's bad that the processes hang and become
unkillable, and really bad that the system cannot shutdown.

Everything seems to work better if you return host_byte
set to DID_SOFT_ERROR, which causes the SCSI midlayer to retry
a few times, or DID_IMM_RETRY which causes infinite retries,
or DID_ERROR with CHECK CONDITION status and an additional sense
code explaining the error.

If the drive is gone too long, you want the application to 
give up and quit.  On the other hand, retrying while giving 
it time to come back is also important.  In SAS, the I_T
nexus loss time should be the basis for calculating how
long to wait."

---
Rob ElliottHP Server Storage


N�r��yb�X��ǧv�^�)޺{.n�+{���"�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i

RE: [PATCH 9/9] scsi_error: do not display kernel pointer in message logs

2015-01-10 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Hannes Reinecke
> Sent: Thursday, January 08, 2015 12:44 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 9/9] scsi_error: do not display kernel pointer in
> message logs
> 
> It is not good practice to display the kernel pointer
> in any message logs, and it doesn't display any additional
> information. And as we know have block-layer assigned tags
> we can use them to differentiate the messages.
> So remove any pointer references from the displayed messages.
> 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/scsi_error.c | 49 +++-
> ---
>  1 file changed, 20 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index e42fff6..10ffa21 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -124,41 +124,37 @@ scmd_eh_abort_handler(struct work_struct *work)
...
>   SCSI_LOG_ERROR_RECOVERY(3,
>   scmd_printk(KERN_INFO, scmd,
> - "scmd %p eh timeout, "
> - "not retrying aborted "
> - "command\n", scmd));
> + "eh timeout, not retrying "
> + "aborted command\n"));

checkpatch allows/encourages keeping strings on one line even when
they go past the 80th column.  So, I recommend joining those two
lines.

Reviewed-by: Robert Elliott 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3/5] scsi: add support for COMPARE_AND_WRITE

2014-12-17 Thread Elliott, Robert (Server Storage)
I haven't reviewed the whole patch, but noticed one thing:

> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of micha...@cs.wisc.edu
> Sent: Thursday, 16 October, 2014 12:37 AM
...
> @@ -635,6 +646,54 @@ static void sd_prot_op(struct scsi_cmnd *scmd,
> unsigned int dif)
>   scsi_set_prot_type(scmd, dif);
>  }
> 
> +static void sd_config_cmp_and_write(struct scsi_disk *sdkp)
> +{
> + if (sdkp->max_cmp_and_write_blocks > sdkp->max_xfer_blocks) {
> + /* Invalid settings returned. Do not try to support for now
> */
> + blk_queue_max_cmp_and_write_sectors(sdkp->disk->queue, 0);
> + return;
> + }
> +
> + /*
> +  * mult by 2, because the block layer wants the total number of
> +  * sectors that will be put in bios and transferred.
> +  */
> + blk_queue_max_cmp_and_write_sectors(sdkp->disk->queue,
> + 2 * sdkp->max_cmp_and_write_blocks *
> + (sdkp->device->sector_size  >> 9));
> +}
> +
> +/**
> + * sd_setup_cmp_and_write_cmnd - compare and write data
> + * @cmd: scsi_cmnd to prepare
> + **/
> +static int sd_setup_cmd_and_write_cmd(struct scsi_cmnd *cmd)

I suspect you intended to use 
setup_cmp_and_write
rather than 
setup_cmd_and_write
for that function name, to match 
sd_config_cmp_and_write

...
> @@ -1134,6 +1193,8 @@ static int sd_init_command(struct scsi_cmnd *cmd)
>   return sd_setup_write_same_cmnd(cmd);
>   else if (rq->cmd_flags & REQ_FLUSH)
>   return sd_setup_flush_cmnd(cmd);
> + else if (rq->cmd_flags & REQ_CMP_AND_WRITE)
> + return sd_setup_cmd_and_write_cmd(cmd);
>   else
>   return sd_setup_read_write_cmnd(cmd);
>  }

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Tagged overlapped commands

2014-12-17 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Andrew Brooks
> Sent: Tuesday, 16 December, 2014 4:23 AM
...
> 
> On 8 December 2014 at 07:02, James Bottomley
>  wrote:
> >
> > The error message likely means that the tape device had more than one
> > command outstanding at once (i.e. one command was sent while the current
> > command was still pending).  It could also mean that a tag got re-used
> > (same tag issued while command with that tag was outstanding) but for a
> > tape I think the former is more likely.  Is the queue depth set to one
> > for the device?
> >
...
> 
> st0: Sense Key : Aborted Command [current]
> st0: Add. Sense: Tagged overlapped commands (task tag 0)

That error could be due to:
1. send command with tag 0
2. timeout command with tag 0
3. abort command with tag 0, doesn't get through
4. send new command with tag 0

What are the serial log messages preceding those messages?

For example, if you see this print from mv_sas.c mvs_abort_task():
if (rc == 0) {
mv_printk("No such tag in %s\n", __func__);
rc = TMF_RESP_FUNC_FAILED;
return rc;
}
and this from libsas.c:
case TMF_RESP_FUNC_FAILED:
SAS_DPRINTK("%s: task 0x%p failed to abort\n",
__func__, task);
return TASK_ABORT_FAILED;
}

then it didn't send the ABORT TASK at all.  If it really needed
to, that could lead to an overlap in tag 0.

BTW, in SCSI architecture, ABORT TASK is supposed to return 
a service response of FUNCTION COMPLETE rather than an error
if the tag is not present (e.g., because the command has
completed by the time the ABORT TASK was received), so the 
mvs_abort_task() handling there is questionable.

---
Rob ElliottHP Server Storage












RE: [PATCH 09/22] [SCSI] mpt2sas, mpt3sas: Added a support to set cpu affinity for each MSIX vector enabled by the HBA

2014-12-10 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Sreekanth Reddy
> Sent: Tuesday, 09 December, 2014 6:17 AM
> To: martin.peter...@oracle.com; j...@kernel.org; h...@infradead.org
...
> Change_set:
> 1. Added affinity_hint varable of type cpumask_var_t in adapter_reply_queue
>structure. And allocated a memory for this varable by calling
> zalloc_cpumask_var.
> 2. Call the API irq_set_affinity_hint for each MSIx vector to affiniate it
>with calculated cpus at driver inilization time.
> 3. While freeing the MSIX vector, call this same API to release the cpu
> affinity mask
>for each MSIx vector by providing the NULL value in cpumask argument.
> 4. then call the free_cpumask_var API to free the memory allocated in step 2.
> 
...

> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c
> b/drivers/scsi/mpt3sas/mpt3sas_base.c
> index 1560115..f0f8ba0 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
...
> @@ -1609,6 +1611,10 @@ _base_request_irq(struct MPT3SAS_ADAPTER *ioc, u8
> index, u32 vector)
>   reply_q->ioc = ioc;
>   reply_q->msix_index = index;
>   reply_q->vector = vector;
> +
> + if (!zalloc_cpumask_var(&reply_q->affinity_hint, GFP_KERNEL))
> + return -ENOMEM;

I think this will create the problem Alex Thorlton just reported
with lpfc on a system with a huge number (6144) of CPUs.

See this thread:
[BUG] kzalloc overflow in lpfc driver on 6k core system

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/3] libata: Whitelist SSDs that are known to properly return zeroes after TRIM

2014-12-05 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Martin K. Petersen
> Sent: Thursday, 06 November, 2014 11:08 PM
> To: linux-scsi@vger.kernel.org; linux-...@vger.kernel.org; linux-
> fsde...@vger.kernel.org; ne...@suse.de
> Cc: Martin K. Petersen
> Subject: [PATCH 1/3] libata: Whitelist SSDs that are known to properly return
> zeroes after TRIM
> 
> As defined, the DRAT (Deterministic Read After Trim) and RZAT (Return
> Zero After Trim) flags in the ATA Command Set are unreliable in the
> sense that they only define what happens if the device successfully
> executed the DSM TRIM command. TRIM is only advisory, however, and the
> device is free to silently ignore all or parts of the request.
> 
> In practice this renders the DRAT and RZAT flags completely useless and
> because the results are unpredictable we decided to disable discard in
> MD for 3.18 to avoid the risk of data corruption.
> 
> Hardware vendors in the real world obviously need better guarantees than
> what the standards bodies provide. Unfortuntely those guarantees are
> encoded in product requirements documents rather than somewhere we can
> key off of them programatically. So we are compelled to disabling
> discard_zeroes_data for all devices unless we explicitly have data to
> support whitelisting them.
> 
> This patch whitelists SSDs from a few of the main vendors. None of the
> whitelists are based on written guarantees. They are purely based on
> empirical evidence collected from internal and external users that have
> tested or qualified these drives in RAID deployments.
> 
> The whitelist is only meant as a starting point and is by no means
> comprehensive:
> 
>- All intel SSD models except for 510
>- Micron M5*
>- Samsung SSDs
>- Seagate SSDs
> 

That description and Paolo's reply:
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Paolo Bonzini
> Sent: Friday, 05 December, 2014 10:45 AM
...
> I have a Crucial_CT256MX1 (i.e. MX100) and it does reliably zero.

make me concerned about this whitelist approach.

I think you need a manufacturer assertion that this is indeed
the design intent; you cannot be certain based on observation
from outside.

Since the SCSI and ATA standards allow ignoring the hint, it 
might be honored most of the time, but ignored in some rare
cases (e.g., drive firmware has a malloc() failure that only
happens when the drive is handling an overtemperature
condition and six other problems at the same time).

Maybe there should be two levels:
* vendor asserts the drive is designed to always honor the hint
* community observes the drive always seems to honor the hint

and a sysfs flag for users to select the level at which 
they feel safe.

A user running 3 replicas of the data in different sites 
might be more trusting than a user for which this is the 
only copy of the data.

---
Rob ElliottHP Server Storage




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] scsi_scan: Send TEST UNIT READY to LUN0 before LUN scanning

2014-12-04 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Hannes Reinecke
> Sent: Thursday, 04 December, 2014 10:39 AM
...
>  /**
> + * scsi_test_lun - waiting for a LUN to become ready
> + * @sdev:scsi_device to test
> + *
> + * Description:
> + * Wait for the lun associated with @sdev to become ready
> + *
> + * Send a TEST UNIT READY to detect any unit attention conditions.
> + * Retry TEST UNIT READY for up to @scsi_scan_timeout if the
> + * returned sense key is 02/04/01 (Not ready, Logical Unit is
> + * in process of becoming ready)
> + **/
> +static int
> +scsi_test_lun(struct scsi_device *sdev)
> +{
> + struct scsi_sense_hdr sshdr;
> + int res = SCSI_SCAN_TARGET_PRESENT;
> + int tur_result;
> + unsigned long tur_timeout = jiffies + scsi_scan_timeout * HZ;
> +
> + /* Skip for older devices */
> + if (sdev->scsi_level <= SCSI_3)
> + return SCSI_SCAN_LUN_PRESENT;
> +
> + /*
> +  * Wait for the device to become ready.
> +  *
> +  * Some targets take some time before the firmware is
> +  * fully initialized, during which time they might not
> +  * be able to fill out any REPORT_LUN command correctly.
> +  * And as we're not capable of handling the
> +  * INQUIRY DATA CHANGED unit attention correctly we'd
> +  * rather wait here.
> +  */
> + do {
> + tur_result = scsi_test_unit_ready(sdev, SCSI_TIMEOUT,
> +   3, &sshdr);
> + if (!tur_result) {
> + res = SCSI_SCAN_LUN_PRESENT;
> + break;
> + }
> + if ((driver_byte(tur_result) & DRIVER_SENSE) &&
> + scsi_sense_valid(&sshdr)) {
> + SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
> + "scsi_scan: tur returned %02x/%02x/%02x\n",
> + sshdr.sense_key, sshdr.asc, sshdr.ascq));
> + if (sshdr.sense_key == NOT_READY &&
> + sshdr.asc == 0x04 && sshdr.ascq == 0x01) {
> + /* Logical Unit is in process
> +  * of becoming ready */
> + msleep(100);
> + continue;
> + }
> + }

In SAS, you may see these after power on or after sending a
START STOP UNIT command with START=1, as the controller or 
expander is performing staggered spinup:
04h/11h LOGICAL UNIT NOT READY, NOTIFY (ENABLE SPINUP) REQUIRED

Once the drive receives NOTIFY (ENABLE SPINUP), then it starts
reporting this until spinup is done:
04h/01h LOGICAL UNIT IS IN PROCESS OF BECOMING READY

If this function is intended for general "wait for ready" use, 
then it should sit through both codes.

If this function is only intended for drives that violate the
rule to always be capable of returning REPORT LUNS data 
indicating LUN 0, it's hard to guess if 04h/11h could also
appear.  

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [BUG] kzalloc overflow in lpfc driver on 6k core system

2014-12-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Alex Thorlton
> Sent: Tuesday, 02 December, 2014 3:58 PM
...
> We've recently upgraded our big machine up to 6144 cores, and we're
> shaking out a number of bugs related to booting at that large core
> count.  Last night I tripped a warning from the lpfc driver that appears
> to be related to a kzalloc that uses the number of cores as part of it's
> size calculation.  Here's the backtrace from the warning:
...
> For a little bit more information on exactly what's going wrong, we're
> tripping the warning from lpfc_pci_probe_one_s4 (as you can see from the
> trace).  That function calls down to lpfc_sli4_driver_resource_setup,
> which contains the failing kzalloc here:
> 
> phba->sli4_hba.cpu_map = kzalloc((sizeof(struct lpfc_vector_map_info) *
>  phba->sli4_hba.num_present_cpu),
>  GFP_KERNEL);
> 
> As mentioned, it looks like we're multiplying the number available cpus
> by that struct size to get an allocation size, which ends up being
> greater than KMALLOC_MAX_SIZE.
> 
> Does anyone have any ideas on what could be done to break that
> allocation up into smaller pieces, or to make it in a different way so
> that we avoid this warning?
> 
> Any help is greatly appreciated.  Thanks!
> 

That structure includes an NR_CPU-based maskbits field, which is
probably too big.

include/cpumask.h:
typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

drivers/scsi/lpfc/lpfc_sli4.h:
struct lpfc_vector_map_info {
uint16_tphys_id;
uint16_tcore_id;
uint16_tirq;
uint16_tchannel_id;
struct cpumask  maskbits;
};

maskbits appears to only be used for setting IRQ affinity hints in
drivers/scsi/lpfc_init.c:
for (idx = 0; idx < vectors; idx++) {
cpup = phba->sli4_hba.cpu_map;
cpu = lpfc_find_next_cpu(phba, phys_id);
...
mask = &cpup->maskbits;
cpumask_clear(mask);
cpumask_set_cpu(cpu, mask);
i = irq_set_affinity_hint(phba->sli4_hba.msix_entries[idx].
  vector, mask);

In similar code, mpt3sas and lockless hpsa just call get_cpu_mask()
inside the loop:
cpu = cpumask_first(cpu_online_mask);
for (i = 0; i < h->msix_vector; i++) {
rc = irq_set_affinity_hint(h->intr[i], get_cpu_mask(cpu));
cpu = cpumask_next(cpu, cpu_online_mask);
}

get_cpu_mask() uses the global cpu_bit_bitmap array, which is declared
in kernel/cpu.c:
extern const unsigned long
cpu_bit_bitmap[BITS_PER_LONG+1][BITS_TO_LONGS(NR_CPUS)];

That approach should work for lpfc.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCHv2 00/10] scsi logging update: the real thing

2014-11-14 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:30 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCHv2 00/10] scsi logging update: the real thing
> 
> Hi all,
> 
> this is the second part of my scsi logging update, covering
> CDB and sense code printing. With this patchset CDBs and
> sense code bytes will be formatted on a single line
> (where possible), and prefixed with the device and command
> tag.
> 
> To ensure CDBs and sense code bytes are not broken up during
> printk I've implemented a per-cpu buffer for formatting
> messages. One can allocate a chunk from the buffer using
> scsi_log_reserve_buffer() and return it via
> scsi_log_release_buffer().
> Both function do an implicit preempt_disable() / preempt_enable()
> to ensure we're not scheduled away from that cpu whilst writing
> into the buffer.
> 
> With that I've been able to clean up constants.c to remove all the
> #ifdef CONFIG_SCSI_CONSTANTS statements. It'll now be compiled in
> conditionally if CONFIG_SCSI_CONSTANTS is set.
> 
> Additionally I've updated the SCSI command definitions to SPC-3.
> 
> Thanks go to Stephen Rostedt who suffered my annoying questions
> during LPC.
> 

I've run with v2 for a while.  Tag prints look good;
CDBs are coming out on one line, etc.

I haven't tested every line of every patch, but I think I
exercised patches 3, 5, 7, and 8 pretty thoroughly, so you
may add this to those:

Tested-by: Robert Elliott 

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 10/10] scsi: Do not display buffer pointers in scsi_log_send()

2014-11-14 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 10/10] scsi: Do not display buffer pointers in
> scsi_log_send()
> 
> scsi_log_send() would display buffer pointer for higher
> logging levels. This is not only of questionable value
> but also exposes kernel pointer to userspace, which is
> discouraged in some setups. So drop this message
> altogether.
> 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/scsi.c | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 92d5912..9ec576d 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -531,7 +531,7 @@ void scsi_log_send(struct scsi_cmnd *cmd)
>*
>* 3: same as 2
>*
> -  * 4: same as 3 plus dump extra junk
> +  * 4: same as 3
>*/
>   if (unlikely(scsi_logging_level)) {
>   level = SCSI_LOG_LEVEL(SCSI_LOG_MLQUEUE_SHIFT,
> @@ -540,13 +540,6 @@ void scsi_log_send(struct scsi_cmnd *cmd)
>   scmd_printk(KERN_INFO, cmd,
>   "Send: scmd 0x%p\n", cmd);
>   scsi_print_command(cmd);
> - if (level > 3) {
> - printk(KERN_INFO "buffer = 0x%p, bufflen =
> %d,"
> -" queuecommand 0x%p\n",
> - scsi_sglist(cmd), scsi_bufflen(cmd),
> - cmd->device->host->hostt-
> >queuecommand);
> -
> - }
>   }
>   }
>  }

Reviewed-by: Robert Elliott 

One more comment on the series:

After the tags, there are still a few scmd %p prints left
including the one above.  Is this series supposed to get
rid of the rest of them?

drivers/scsi/scsi.c:"Send: scmd 0x%p\n", cmd);
drivers/scsi/scsi_error.c:  "scmd %p eh 
timeout, not aborting\n",
drivers/scsi/scsi_error.c:  "aborting command 
%p\n", scmd));
drivers/scsi/scsi_error.c:  
"scmd %p eh timeout, "
drivers/scsi/scsi_error.c:  
"scmd %p retry "
drivers/scsi/scsi_error.c:  
"scmd %p finish "
drivers/scsi/scsi_error.c:  "scmd %p 
abort %s\n", scmd,
drivers/scsi/scsi_error.c:  "scmd %p terminate "
drivers/scsi/scsi_error.c:  "scmd %p previous 
abort failed\n", scmd));
drivers/scsi/scsi_error.c:  "scmd %p not 
aborting, host in recovery\n",
drivers/scsi/scsi_error.c:  "scmd %p abort 
scheduled\n", scmd));
drivers/scsi/scsi_error.c:  "%s scmd: %p result: %x\n",
drivers/scsi/scsi_error.c:  "%s: scmd: %p, timeleft: %ld\n",
drivers/scsi/scsi_error.c:  "sense requested for %p result 
%x\n",
drivers/scsi/scsi_error.c:  "%s: scmd %p rtn %x\n", __func__, scmd, 
rtn));
drivers/scsi/scsi_error.c:   "%s: flush 
retry cmd: %p\n",
drivers/scsi/scsi_error.c:   "%s: flush 
finish cmd: %p\n",
drivers/scsi/scsi_lib.c:"Inserting command %p into mlqueue\n", 
cmd));
drivers/scsi/scsi_scan.c:   printk(KERN_WARNING "%s no slave_alloc 
function in hostt %p\n", __func__, shost->hostt);
drivers/scsi/scsi_scan.c:   printk(KERN_WARNING "sdev %p scmd %p 
sending INQUIRY\n", sdev, scsi_cmd);
drivers/scsi/scsi_scan.c:   printk(KERN_WARNING "%s sdev %p from 
lookup_by_target\n", __func__, sdev);
drivers/scsi/scsi_scan.c:   printk(KERN_WARNING "%s sdev %p from 
alloc_sdev\n", __func__, sdev);
drivers/scsi/scsi_scan.c:   printk(KERN_WARNING "%s sdev %p from 
alloc_sdev\n", __func__, sdev);
drivers/scsi/scsi_scan.c:   printk(KERN_WARNING "%s sdev %p from 
lookup_by_target\n", __func__, sdev);
drivers/scsi/scsi_scan.c:   printk(KERN_WARNING "%s sdev %p from 
alloc_sdev\n", __func__, sdev);



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 09/10] scsi: Conditionally compile in constants.c

2014-11-14 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 09/10] scsi: Conditionally compile in constants.c
> 
> Instead of having constants.c littered with ifdef statements
> we should be moving dummy functions into the header and
> condintionally compile in constants.c if selected.
> 
> Suggested-by: Christoph Hellwig 
> Reviewed-by: Christoph Hellwig 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/Makefile  |  4 +--
>  drivers/scsi/constants.c   | 42 
>  drivers/xen/xen-scsiback.c |  1 +
>  include/scsi/scsi_dbg.h| 68
> +++---
>  4 files changed, 67 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
> index 4991b62..76f8932 100644
> --- a/drivers/scsi/Makefile
> +++ b/drivers/scsi/Makefile
> @@ -158,9 +158,9 @@ obj-$(CONFIG_SCSI_OSD_INITIATOR) += osd/
> 
>  # This goes last, so that "real" scsi devices probe earlier
>  obj-$(CONFIG_SCSI_DEBUG) += scsi_debug.o
> -
> -scsi_mod-y   += scsi.o hosts.o scsi_ioctl.o constants.o \
> +scsi_mod-y   += scsi.o hosts.o scsi_ioctl.o \
>  scsicam.o scsi_error.o scsi_lib.o
> +scsi_mod-$(CONFIG_SCSI_CONSTANTS) += constants.o
>  scsi_mod-$(CONFIG_SCSI_DMA)  += scsi_lib_dma.o
>  scsi_mod-y   += scsi_scan.o scsi_sysfs.o scsi_devinfo.o
>  scsi_mod-$(CONFIG_SCSI_NETLINK)  += scsi_netlink.o
...

I has no compile issues with and without CONFIG_SCSI_CONSTANTS.

make menuconfig says SCSI_CONSTANTS causes "(kernel size +=12K)".
That appears low now (although other config settings might
be exaggerating that on my machine).

Without CONFIG_SCSI_CONSTANTS:
-rwxrwxr-x 1 relliott relliott 133057434 Nov 14 16:27 vmlinux
-rw-rw-r-- 1 relliott relliott 313306350 Nov 14 16:27 vmlinux.o

With CONFIG_SCSI_CONSTANTS:
-rwxrwxr-x 1 relliott relliott 133132674 Nov 14 16:33 vmlinux
-rw-rw-r-- 1 relliott relliott 313575486 Nov 14 16:33 vmlinux.o

vmlinux is 75,240 bytes different.  So, you might want to
modify the Kconfig description.

Reviewed-by: Robert Elliott 

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 08/10] scsi: use per-cpu buffer for formatting scsi_print_result()

2014-11-14 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
...
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
...
> diff --git a/drivers/scsi/scsi_logging.c
> b/drivers/scsi/scsi_logging.c
> index 065792a3..e7e7cab 100644
> --- a/drivers/scsi/scsi_logging.c
> +++ b/drivers/scsi/scsi_logging.c
> @@ -398,3 +398,47 @@ void scsi_print_sense(const struct scsi_cmnd
> *cmd)
>cmd->sense_buffer, SCSI_SENSE_BUFFERSIZE);
>  }
>  EXPORT_SYMBOL(scsi_print_sense);
> +
> +void scsi_print_result(struct scsi_cmnd *cmd, const char *msg, int
> disposition)

Since this function does not modify anything pointed to by cmd,
consider adding const to that argument.

> +{
> + char *logbuf;
> + size_t off, logbuf_len;
> + const char *mlret_string = scsi_mlreturn_string(disposition);

As mentioned in the last series, it might be good to
change the midlayer string from SUCCESS to COMPLETE,
since that is printed even for commands that fail with
errors.  

(This patch series doesn't touch that function, so mentioning
the issue here at the only caller)

>
> + const char *hb_string = scsi_hostbyte_string(cmd->result);
> + const char *db_string = scsi_driverbyte_string(cmd->result);
> +
> + logbuf = scsi_log_reserve_buffer(&logbuf_len);
> + if (!logbuf)
> + return;
> +
> + off = sdev_format_header(logbuf, logbuf_len,
> +  scmd_name(cmd), cmd->request->tag);
> +
> + if (msg)
> + off += scnprintf(logbuf + off, logbuf_len - off,
> +  "%s: ", msg);
> + if (mlret_string)
> + off += scnprintf(logbuf + off, logbuf_len - off,
> +  "%s ", mlret_string);
> + else
> + off += scnprintf(logbuf + off, logbuf_len - off,
> +  "UNKNOWN(0x%02x) ", disposition);
> + off += scnprintf(logbuf + off, logbuf_len - off, "Result: ");
...

Consider printing "Result: " first.  That would make
the messages more consistent since "Done:" is not always there.

Current:
 [491784.462209] sd 1:0:0:0: [sdq] tag#0 Done: SUCCESS Result: hostbyte=DID_OK 
driverbyte=DRIVER_OK
 [491784.464962] sd 1:0:0:0: [sdq] tag#0 CDB: Write(10) 2a 00 39 8c fa 80 00 00 
08 00
 [  259.667383] sd 2:0:0:0: [sdr] tag#334 FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
 [  259.667387] sd 2:0:0:0: [sdr] tag#334 Sense Key : Hardware Error [current]
 [  259.667391] sd 2:0:0:0: [sdr] tag#334 Add. Sense: Logical unit failure
 [  259.667395] sd 2:0:0:0: [sdr] tag#334 CDB: Read(10) 28 00 1b 85 9d 30 00 00 
08 00
 [27907.647377] sd 1:0:0:0: [sdq] tag#768 Done: TIMEOUT_ERROR Result: 
hostbyte=DID_OK driverbyte=DRIVER_OK
 [27907.647380] sd 1:0:0:0: [sdq] tag#768 CDB: Write(10) 2a 00 39 8a f4 e0 00 
00 08 00
 [27907.647382] sd 1:0:0:0: [sdq] tag#768 scmd 880424761470 abort scheduled

Proposed:
 [491784.462209] sd 1:0:0:0: [sdq] tag#0 Result: Done: COMPLETE hostbyte=DID_OK 
driverbyte=DRIVER_OK
 [491784.464962] sd 1:0:0:0: [sdq] tag#0 CDB: Write(10) 2a 00 39 8c fa 80 00 00 
08 00
 [  259.667383] sd 2:0:0:0: [sdr] tag#334 Result: FAILED hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
 [  259.667387] sd 2:0:0:0: [sdr] tag#334 Sense Key : Hardware Error [current]
 [  259.667391] sd 2:0:0:0: [sdr] tag#334 Add. Sense: Logical unit failure
 [  259.667395] sd 2:0:0:0: [sdr] tag#334 CDB: Read(10) 28 00 1b 85 9d 30 00 00 
08 00
 [27907.647377] sd 1:0:0:0: [sdq] tag#768 Result: Done: TIMEOUT_ERROR 
hostbyte=DID_OK driverbyte=DRIVER_OK
 [27907.647380] sd 1:0:0:0: [sdq] tag#768 CDB: Write(10) 2a 00 39 8a f4 e0 00 
00 08 00
 [27907.647382] sd 1:0:0:0: [sdq] tag#768 scmd 880424761470 abort scheduled

Furthermore, consider dropping "Done" (the only msg argument 
ever used) altogether.  "Done" means scsi_log_completion is
printing this. No "Done" means scsi_io_completion's ACTION_FAIL
case is printing this.

The disposition (COMPLETE, FAILED, TIMEOUT_ERROR, etc.) seems
to convey that same information, in more detail.


Reviewed-by: Robert Elliott 

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 07/10] scsi: use per-cpu buffer for formatting sense

2014-11-14 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
...

> diff --git a/drivers/scsi/scsi_logging.c
...
> @@ -249,3 +255,146 @@ void scsi_print_command(struct scsi_cmnd *cmd)
>   }
>  }
>  EXPORT_SYMBOL(scsi_print_command);
> +
> +static size_t
> +scsi_format_extd_sense(char *buffer, size_t buf_len,
> +unsigned char asc, unsigned char ascq)
> +{
> + size_t off = 0;
> + const char *extd_sense_fmt = NULL;
> + const char *extd_sense_str = scsi_extd_sense_format(asc, ascq,
> + &extd_sense_fmt);

As Dan Carpenter noted, there's a missing {} in 
scsi_extd_sense_format (from the previous logging series)
that causes it to return incorrectly.

...
> +static void
> +scsi_log_print_sense_hdr(const struct scsi_device *sdev, const char
> *name,
> +  int tag, const struct scsi_sense_hdr *sshdr)
> +{
> + char *logbuf;
> + size_t off, logbuf_len;
> +
> + logbuf = scsi_log_reserve_buffer(&logbuf_len);
> + if (!logbuf)
> + return;
> + off = sdev_format_header(logbuf, logbuf_len, name, tag);
> + off += scsi_format_sense_hdr(logbuf + off, logbuf_len - off,
> sshdr);
> + dev_printk(KERN_INFO, &sdev->sdev_gendev, logbuf);
> + scsi_log_release_buffer(logbuf);
> +
> + logbuf = scsi_log_reserve_buffer(&logbuf_len);
> + if (!logbuf)
> + return;
> + off = sdev_format_header(logbuf, logbuf_len, name, tag);
> + off += scsi_format_extd_sense(logbuf + off, logbuf_len - off,
> +   sshdr->asc, sshdr->ascq);
> + dev_printk(KERN_INFO, &sdev->sdev_gendev, logbuf);
> + scsi_log_release_buffer(logbuf);
> +}

This releases the buffer, but reserves it on the next line.
Should the buffer just be held between these two portions?
The subfunctions being called aren't functions that will cause
delays and thus delay other functions waiting on a buffer.
Although the tag on each line helps tremendously, the serial
log will be even more readable if the prints are back-to-back.

The same question applies to scsi_print_command in patch 05/10.
That function prints several lines in a for loop (for long CDBs):

+   for (k = 0; k < cmd->cmd_len; k += 16) {
+   size_t linelen = min(cmd->cmd_len - k, 16);
+
+   logbuf = scsi_log_reserve_buffer(&logbuf_len);
...
+   scsi_log_release_buffer(logbuf);
+   }

Perhaps the reserve/release should be outside the for loop
so they all reuse one buffer and are printed adjacently.

...
> +/* Normalize and print sense buffer in SCSI command */
> +void scsi_print_sense(const struct scsi_cmnd *cmd)
> +{
> + struct gendisk *disk = cmd->request->rq_disk;
> + const char *disk_name = disk ? disk->disk_name : NULL;
> + scsi_log_print_sense(cmd->device, disk_name, cmd->request->tag,
> +  cmd->sense_buffer, SCSI_SENSE_BUFFERSIZE);
> +}
> +EXPORT_SYMBOL(scsi_print_sense);

This function should use the new scmd_name function, which does:
return scmd->request->rq_disk ?
scmd->request->rq_disk->disk_name : NULL;


Reviewed-by: Robert Elliott 


---
Rob ElliottHP Server Storage


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 01/10] scsi: Rename SERVICE_ACTION_IN to SERVICE_ACTION_IN_16

2014-11-14 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 01/10] scsi: Rename SERVICE_ACTION_IN to
> SERVICE_ACTION_IN_16
> 
> SPC-3 defines SERVICE ACTION IN(12) and SERVICE ACTION IN(16).
> So rename SERVICE_ACTION_IN to SERVICE_ACTION_IN_16 to be
> consistent with SPC and to allow for better distinction.
> 
> Reviewed-by: Christoph Hellwig 
> Signed-off-by: Hannes Reinecke 

Reviewed-by: Robert Elliott 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 06/10] libata: use __scsi_format_command()

2014-11-14 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke; linux-
> i...@vger.kernel.org; LKML
> Subject: [PATCH 06/10] libata: use __scsi_format_command()
> 
> libata already uses an internal buffer, so we should be using
> __scsi_format_command() here.
> 
> Cc: linux-...@vger.kernel.org
> Cc: LKML 
> Reviewed-by: Christoph Hellwig 
> Acked-by: Tejun Heo 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/ata/libata-eh.c | 13 +
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index dad83df..6550a9a 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -2509,15 +2509,12 @@ static void ata_eh_link_report(struct
> ata_link *link)
> 
>   if (ata_is_atapi(qc->tf.protocol)) {
>   if (qc->scsicmd)
> - scsi_print_command(qc->scsicmd);
> + __scsi_format_command(cdb_buf,
> sizeof(cdb_buf),
> +   qc->scsicmd->cmnd,
> +   qc->scsicmd->cmd_len);
>   else
> - snprintf(cdb_buf, sizeof(cdb_buf),
> -  "cdb %02x %02x %02x %02x %02x %02x %02x
> %02x  "
> -  "%02x %02x %02x %02x %02x %02x %02x %02x\n
> ",
> -  cdb[0], cdb[1], cdb[2], cdb[3],
> -  cdb[4], cdb[5], cdb[6], cdb[7],
> -  cdb[8], cdb[9], cdb[10], cdb[11],
> -  cdb[12], cdb[13], cdb[14], cdb[15]);
> + __scsi_format_command(cdb_buf,
> sizeof(cdb_buf),
> +   cdb, qc->dev->cdb_len);

Since results in just one "cdb" variable usage, you could change
"cdb" to qc->cdb" to get rid of the variable declaration and 
slightly simplify the code.
const u8 *cdb = qc->cdb;


>   } else {
>   const char *descr = ata_get_cmd_descript(cmd-
> >command);
>   if (descr)
> --
> 1.8.5.2

Reviewed-by: Robert Elliott 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: BUG in scsi_lib.c due to a bad commit

2014-11-12 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Barto
> Sent: Wednesday, November 12, 2014 9:28 PM
> To: Guenter Roeck; Bjorn Helgaas
> Cc: linux-ker...@vger.kernel.org; linux-scsi@vger.kernel.org; Joe
> Perches
> Subject: Re: BUG in scsi_lib.c due to a bad commit
> 
> reverting your commit 045065d8a300a37218c is a solution, but it's just a
> temporary solution,
> 
> it's better to search why your commit can create a random hang on boot
> on some PC configurations,
> 
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1774,7 +1774,7 @@ static void scsi_request_fn(struct request_queue
> *q)
> blk_requeue_request(q, req);
> atomic_dec(&sdev->device_busy);
> out_delay:
> - if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
> + if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
> blk_delay_queue(q, SCSI_QUEUE_DELAY);
> }
> 
> perhaps the atomic_read() function doesn't make the expected job on some
> rare circonstances, I have the same doubts about the blk_delay_queue()
> function

Were you running with scsi_mod.use_blk_mq=Y or =N?

device_busy is the active queue depth for the device (e.g.
5 means there are 5 commands submitted but not yet completed).

The function reaches this code if it has run out of tags, the host
has reached its limit of outstanding commands, or the target has
reached its limit.  It requeus the request:
* with delay if device_busy is zero
* without delay if device_busy is non_zero

I think this is the reasoning:
If device_busy is zero, trying to process the request again will
probably run into the same problem; a delay gives time for the
situation to change.  If device_busy is non-zero, then the 
requeued command goes behind others and might get a different
result.

With the polarity backwards, the lack of delay hung PA-RISC 
and SPARC64 systems), not just QEMU.  So, I don't think reverting
the fix is good.

Changing it to an unconditional delay might be safe - delay
regardless of device_busy (until the root cause is understood).

Also, SCSI_QUEUE_DELAY seems like an arbitrary magic number; 
maybe that value isn't working correctly anymore?




RE: [PATCH 05/10] scsi: use external buffer for command logging

2014-11-12 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 05/10] scsi: use external buffer for command logging
> 
...
> diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c
> index 4f502f9..ed0d10d 100644
> --- a/drivers/scsi/ch.c
> +++ b/drivers/scsi/ch.c
> @@ -195,8 +195,10 @@ ch_do_scsi(scsi_changer *ch, unsigned char *cmd,
> int cmd_len,
>   retry:
>   errno = 0;
>   if (debug) {
> - DPRINTK("command: ");
> - __scsi_print_command(cmd, cmd_len);
> + char logbuf[SCSI_LOG_BUFSIZE];
> +
> + __scsi_format_command(logbuf, SCSI_LOG_BUFSIZE, cmd,
> cmd_len);

Use sizeof(logbuf) rather than SCSI_LOG_BUFSIZE to avoid 
the macro name ever mismatching.

> + DPRINTK("command: %s", logbuf);
>   }
> 
>   result = scsi_execute_req(ch->device, cmd, direction, buffer,
...
> diff --git a/drivers/scsi/scsi_logging.c
> b/drivers/scsi/scsi_logging.c
> index dca45fe..d166d12 100644
> --- a/drivers/scsi/scsi_logging.c
> +++ b/drivers/scsi/scsi_logging.c
> @@ -13,10 +13,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  #define SCSI_LOG_SPOOLSIZE 4096
> -#define SCSI_LOG_BUFSIZE 128
> 
>  struct scsi_log_buf {
>   char buffer[SCSI_LOG_SPOOLSIZE];
> @@ -64,6 +64,21 @@ static void scsi_log_release_buffer(char *bufptr)
>   preempt_enable();
>  }
> 
> +static size_t scmd_format_header(char *logbuf, size_t logbuf_len,
> +  struct gendisk *disk, int tag)
> +{
> + size_t off = 0;
> +
> + if (disk)
> + off += scnprintf(logbuf + off, logbuf_len - off,
> +  "[%s] ", disk->disk_name);
> +
> + if (tag >= 0)
> + off += scnprintf(logbuf + off, logbuf_len - off,
> +  "tag#%d ", tag);

If the first scnprintf consumes the full logbuf_len, then
logbuf_len - off will cause an unsigned wrap (since it's
using size_t types; it'd go negative if it used signed types), 
triggering this from vsnprintf (called by scnprintf):
if (WARN_ON_ONCE((int) size < 0))
return 0;

It might be prudent to check that it hasn't gone too far
before calling scnprintf again.

> + return off;
> +}
> +
>  int sdev_prefix_printk(const char *level, const struct scsi_device
> *sdev,
>  const char *name, const char *fmt, ...)
>  {
> @@ -106,13 +121,8 @@ int scmd_printk(const char *level, const struct
> scsi_cmnd *scmd,
>   logbuf = scsi_log_reserve_buffer(&logbuf_len);
>   if (!logbuf)
>   return 0;
> - if (disk)
> - off += scnprintf(logbuf + off, logbuf_len - off,
> -  "[%s] ", disk->disk_name);
> -
> - if (scmd->request->tag >= 0)
> - off += scnprintf(logbuf + off, logbuf_len - off,
> -  "tag#%d ", scmd->request->tag);
> + off = scmd_format_header(logbuf, logbuf_len, disk,
> +  scmd->request->tag);
>   va_start(args, fmt);

Same comment about ensuring off hasn't gone too far.

>   off += vscnprintf(logbuf + off, logbuf_len - off, fmt, args);
>   va_end(args);
> @@ -121,3 +131,121 @@ int scmd_printk(const char *level, const struct
> scsi_cmnd *scmd,
>   return ret;
>  }
>  EXPORT_SYMBOL_GPL(scmd_printk);
> +
> +static size_t scsi_format_opcode_name(char *buffer, size_t buf_len,
> +   const unsigned char *cdbp)
> +{
> + int sa, cdb0;
> + const char *cdb_name = NULL, *sa_name = NULL;
> + size_t off;
> +
> + cdb0 = cdbp[0];
> + if (cdb0 == VARIABLE_LENGTH_CMD) {
> + int len = scsi_varlen_cdb_length(cdbp);
> +
> + if (len < 10) {
> + off = scnprintf(buffer, buf_len,
> + "short variable length command,
> len=%d",
> + len);
> + return off;
> + }
> + sa = (cdbp[8] << 8) + cdbp[9];
> + } else
> + sa = cdbp[1] & 0x1f;
> +
> + if (!scsi_opcode_sa_name(cdb0, sa, &cdb_name, &sa_name)) {
> + if (cdb_name)
> + off = scnprintf(buffer, buf_len, "%s", cdb_name);
> + else {
&

RE: [PATCH 03/10] scsi: Implement per-cpu logging buffer

2014-11-12 Thread Elliott, Robert (Server Storage)
One more comment on this one (which was PATCH 01 in v1 of the series,
to which I sent most of my comments)...

> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 03/10] scsi: Implement per-cpu logging buffer
> 
> +++ b/drivers/scsi/scsi_logging.c
> @@ -0,0 +1,119 @@
...
> +static DEFINE_PER_CPU(struct scsi_log_buf, scsi_format_log);
> +
> +static char *scsi_log_reserve_buffer(size_t *len)
> +{
> + struct scsi_log_buf *buf;
> + unsigned long map_bits = SCSI_LOG_SPOOLSIZE / SCSI_LOG_BUFSIZE;

sizeof(buf->buffer) would be a bit safer than 
SCSI_LOG_SPOOLSIZE - adapts automatically if the real value
used in the structure definition ever changes.


---
Rob ElliottHP Server Storage




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: making the queue_type attribute read only, was: Re: tag handling refactor V2

2014-11-12 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Wednesday, 12 November, 2014 11:42 AM
> To: Christoph Hellwig
> Cc: linux-scsi@vger.kernel.org; James Bottomley; Elliott, Robert
> (Server Storage); Hannes Reinecke; Martin K. Petersen; Bart van
> Assche; Mike Christie
> Subject: making the queue_type attribute read only, was: Re: tag
> handling refactor V2
> 
> >  - how is the change_queue_type API supposed to be used for most
> >drivers?  It only changes the tag type from none to simple
> >or back, but except for the special implementation in the
> >53c700 driver doesn't change the queue depth,
> >which might cause it to issue multiple non-tagged command.
> >Fortunately most drivers never look at this information, but
> >then again changing the queue type is useless to start with.
> >  - for those drivers looking at the command tagged information
> >we'd need to quiesce the LUN.  No driver but the 53c700
> >driver does that, and the 53c700 does it at a target-level,
> >which despite a comment claiming it's needed doesn't seem
> >to make sense given the code.  If we can make sure
> >to quience all LUNs we could avoid the per-command flag for
> >tagged commands and always look at the scsi_device flag.
> 
> I've not merged the series, but ondering the above two issues I've
> become convinved that we should make the queue_type sysfs attribute
> read only.
> 
> Switching off tagging isn't really something we should leave to
> users, as the hardware knows much better.  Changing the tag type
> might have made sense during the times of the ordered tags for
> barriers insanity, but now that it's just tagged vs untagged
> it's pretty pointless.
> Especially given that only a single driver ever got the
> implementation right.
> 
> Does anyone have objections to removing the change_queue_type method?

A queue_type of "none" disables the SCSI midlayer automatic queue depth
adjustments based on TASK SET FULL vs. other status values, while
"simple" enables them.  That's the feature where TASK SET FULL reception
causes the queue depth to be lowered, while several GOOD status values
in a row causes it to be increased.

That logic doesn't work very well right now, so that sysfs file
might be the best way to keep controlling the automatic adjustment
feature even if it no longer controls whether tags exist.

One problem is it adjusts the queue depths for all devices sharing
the same target number; we talked about letting the host choose
the policy (http://marc.info/?l=linux-scsi&m=140440001619574&w=2).

Even if that is corrected, the ramp-up/ramp-down algorithms
and controls are not quite right for the full range of device 
types (SSDs to HDDs).  With SSDs, I saw the queue_depth:
* ramp down by about 10 every 2 seconds
* with the default queue_ramp_up_period of 12: never ramp up
* with a queue_ramp_up_period of 120: ramp up by 16 ever 2 seconds

At 1M IOPS, 2 seconds is far too long to decide anything; some
algorithm changes are needed.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 04/10] scsi: log request tag for scmd_printk()

2014-11-12 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 04/10] scsi: log request tag for scmd_printk()
> 
> The request tag provides a concise identification of a SCSI
> command, so we should be printing that out for scmd_printk().
> 
> Suggested-by: Christoph Hellwig 
> Reviewed-by: Christoph Hellwig 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/scsi_logging.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_logging.c
> b/drivers/scsi/scsi_logging.c
> index 4a76796..dca45fe 100644
> --- a/drivers/scsi/scsi_logging.c
> +++ b/drivers/scsi/scsi_logging.c
> @@ -109,6 +109,10 @@ int scmd_printk(const char *level, const struct
> scsi_cmnd *scmd,
>   if (disk)
>   off += scnprintf(logbuf + off, logbuf_len - off,
>"[%s] ", disk->disk_name);
> +
> + if (scmd->request->tag >= 0)
> + off += scnprintf(logbuf + off, logbuf_len - off,
> +  "tag#%d ", scmd->request->tag);
>   va_start(args, fmt);
>   off += vscnprintf(logbuf + off, logbuf_len - off, fmt, args);
>   va_end(args);
> --
> 1.8.5.2

Reviewed-by: Robert Elliott 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 02/10] scsi: Add SPC-3 command definitions

2014-11-12 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Thursday, 06 November, 2014 2:31 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Ewan Milne; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 02/10] scsi: Add SPC-3 command definitions
> 
> SPC-3 defines SERVICE ACTION IN(12), SERVICE_ACTION OUT(12),
> SERVICE ACTION OUT(16), and SERVICE ACTION BIDIRECTIONAL.
> And READ MEDIA SERIAL NUMBER has long since been deprecated.
> So update callers to refer to the new cdb name.
> 
> Reviewed-by: Christoph Hellwig 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/constants.c   | 4 
>  drivers/target/target_core_pr.c| 2 +-
>  include/scsi/scsi.h| 6 +-
>  tools/lib/traceevent/plugin_scsi.c | 5 -
>  4 files changed, 10 insertions(+), 7 deletions(-)
> 
...
> diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
> index b354c0d..df00fd1 100644
> --- a/include/scsi/scsi.h
> +++ b/include/scsi/scsi.h
> @@ -128,8 +128,10 @@ enum scsi_timeouts {
>  #define MOVE_MEDIUM   0xa5
>  #define EXCHANGE_MEDIUM   0xa6
>  #define READ_12   0xa8
> +#define SERVICE_ACTION_OUT_12 0xa9
>  #define WRITE_12  0xaa
> -#define READ_MEDIA_SERIAL_NUMBER 0xab
> +#define READ_MEDIA_SERIAL_NUMBER 0xab /* Obsolete with SPC-2 */
> +#define SERVICE_ACTION_IN_12 0xab

That needs one more space before 0xab to line up with the others.

...
> diff --git a/tools/lib/traceevent/plugin_scsi.c
> b/tools/lib/traceevent/plugin_scsi.c
> index c699f47..63aba97 100644
> --- a/tools/lib/traceevent/plugin_scsi.c
> +++ b/tools/lib/traceevent/plugin_scsi.c
> @@ -85,8 +85,9 @@ typedef unsigned int u32;
>  #define MOVE_MEDIUM  0xa5
>  #define EXCHANGE_MEDIUM  0xa6
>  #define READ_12  0xa8
> +#define SERVICE_ACTION_out_120xa9
...

out should be capitalized

Reviewed-by: Robert Elliott 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 01/10] scsi: Use real functions for logging

2014-11-11 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Tuesday, 04 November, 2014 2:07 AM
...
> diff --git a/drivers/scsi/scsi_logging.c
> b/drivers/scsi/scsi_logging.c
...
> @@ -0,0 +1,119 @@
> +/*
> + * scsi_logging.c
> + *
> + * Copyright (C) 2014 SUSE Linux Products GmbH
> + * Copyright (C) 2014 Hannes Reinecke 
> + *
> + * This file is released under the GPLv2
> + */
> +
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define SCSI_LOG_SPOOLSIZE 4096
> +#define SCSI_LOG_BUFSIZE 128
> +
> +struct scsi_log_buf {
> + char buffer[SCSI_LOG_SPOOLSIZE];
> + unsigned long map;
> +};
> +
> +static DEFINE_PER_CPU(struct scsi_log_buf, scsi_format_log);
> +
> +static char *scsi_log_reserve_buffer(size_t *len)
> +{
> + struct scsi_log_buf *buf;
> + unsigned long map_bits = SCSI_LOG_SPOOLSIZE / SCSI_LOG_BUFSIZE;
> + unsigned long idx = 0;
> +
> + WARN_ON(map_bits > BITS_PER_LONG);

Since SCSI_LOG_SPOOLSIZE, SCSI_LOG_BUFSIZE, and BITS_PER_LONG
are constants, that can be a compile-time check.

> + preempt_disable();
> + buf = this_cpu_ptr(&scsi_format_log);
> + idx = find_first_zero_bit(&buf->map, map_bits);

If this fails to find a bit, it returns map_bits.
This could result in the next test_and_set_bit call 
accessing an address that is outside the bounds of
buf->map.

A safety check seems prudent before the test_and_set_bit:
if (likely(idx < map_bits))

> + while (test_and_set_bit(idx, &buf->map)) {
> + idx = find_next_zero_bit(&buf->map, map_bits, idx);
> + if (idx >= map_bits) {
> + break;
> + }

scripts/checkpatch.pl -f doesn't like the {} on that.

> + }
> + if (WARN_ON(idx >= map_bits)) {
> + preempt_enable();
> + return NULL;
> + }
> + *len = SCSI_LOG_BUFSIZE;
> + return buf->buffer + idx * SCSI_LOG_BUFSIZE;
> +}
> +
> +static void scsi_log_release_buffer(char *bufptr)
> +{
> + struct scsi_log_buf *buf;
> + unsigned long idx;
> + int ret;
> +
> + buf = this_cpu_ptr(&scsi_format_log);
> + if (bufptr < buf->buffer + SCSI_LOG_SPOOLSIZE) {

Should that also check that bufptr > buf->buffer?

> + idx = (bufptr - buf->buffer) / SCSI_LOG_BUFSIZE;
> + ret = test_and_clear_bit(idx, &buf->map);
> + WARN_ON(!ret);
> + }
> + preempt_enable();
> +}
> +
> +int sdev_prefix_printk(const char *level, const struct scsi_device
> *sdev,
> +const char *name, const char *fmt, ...)
> +{
> + va_list args;
> + char *logbuf;
> + size_t off = 0, logbuf_len;
> + int ret;
> +
> + if (!sdev)
> + return 0;
> +
> + logbuf = scsi_log_reserve_buffer(&logbuf_len);
> + if (!logbuf)
> + return 0;
> +
> + if (name)
> + off += scnprintf(logbuf + off, logbuf_len - off,
> +  "[%s] ", name);
> + va_start(args, fmt);
> + off += vscnprintf(logbuf + off, logbuf_len - off, fmt, args);
> + va_end(args);
> + ret = dev_printk(level, &sdev->sdev_gendev, "%s", logbuf);
> + scsi_log_release_buffer(logbuf);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(sdev_prefix_printk);
> +
> +int scmd_printk(const char *level, const struct scsi_cmnd *scmd,
> + const char *fmt, ...)
> +{
> + struct gendisk *disk = scmd->request->rq_disk;
> + va_list args;
> + char *logbuf;
> + size_t off = 0, logbuf_len;
> + int ret;
> +
> + if (!scmd || scmd->cmnd == NULL)
> + return 0;

!scmd->cmnd seems more common in neighboring code.

> +
> + logbuf = scsi_log_reserve_buffer(&logbuf_len);
> + if (!logbuf)
> + return 0;
> + if (disk)
> + off += scnprintf(logbuf + off, logbuf_len - off,
> +  "[%s] ", disk->disk_name);
> + va_start(args, fmt);
> + off += vscnprintf(logbuf + off, logbuf_len - off, fmt, args);
> + va_end(args);
> + ret = dev_printk(level, &scmd->device->sdev_gendev, "%s",
> logbuf);
> + scsi_log_release_buffer(logbuf);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(scmd_printk);
...

dev_printk is EXPORT_SYMBOL.  The former macro versions of 
sdev_printk and scmd_printk just called dev_printk, and not
being symbols, did not change that.

So, I think these function versions should use EXPORT_SYMBOL, so 
the hated binary drivers can still call them.


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: UNMAP command parameter list

2014-11-07 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Andras Kovacs
> Sent: Friday, 07 November, 2014 5:30 PM
> To: linux-scsi@vger.kernel.org
> Subject: UNMAP command parameter list
> 
> 
>   Hi all,
> 
>   I'm trying to see what parameters are passed to the UFS driver from the
> SCSI midlayer when an UNMAP command is issued. In the UFS driver I can see
> the 10 byte command itself in the unsigned char *cmnd field of the struct
> scsi_cmnd, and it tells me that the parameter list length is 24 bytes (an 8
> byte UNMAP Parameter List header followed by one 16 byte long UNMAP Block
> Descriptor), but I'm having trouble finding where this UNMAP Parameter List
> is. It doesn't seem to be in the cmnd[] byte array after the UNMAP
> command itself.
> 
>   Can anyone tell me where I can find the UNMAP Parameter List which
> belongs to a particular UNMAP command?
> 
>   Any and all info would be greatly appreciated.
> 
>   Andras

That is the data transferred for the command, not in the
command descriptor block itself.

Typical T10 terminology is:
* parameter list = data-out buffer (i.e., write data)
* parameter data = data-in buffer (i.e., read data)


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: absurdly high "optimal_io_size" on Seagate SAS disk

2014-11-07 Thread Elliott, Robert (Server Storage)
> commit 87c0103ea3f96615b8a9816b8aee8a7ccdf55d50
> Author: Martin K. Petersen 
> Date:   Thu Nov 6 12:31:43 2014 -0500
> 
> [SCSI] sd: Sanity check the optimal I/O size
> 
> We have come across a couple of devices that report crackpot
>   values in the optimal I/O size in the Block Limits VPD page. 
>   Since this is a 32-bit entity that gets multiplied by the
>   logical block size we can get
> disproportionately large values reported to the block layer.
> 
> Cap io_opt at 1 GB.

Another reasonable cap is the maximum transfer size.
There are lots of them:

* the block layer BIO_MAX_PAGES value of 256 limits IOs
  to a maximum of 1 MiB
* SCSI LLDs report their maximum transfer size in
  /sys/block/sdNN/queue/max_hw_sectors_kb
* the SCSI midlayer maximum transfer size is set/reported
  in /sys/block/sdNN/queue/max_sectors_kb
  and the default is 512 KiB
* the SCSI LLD maximum number of scatter gather entries
  reported in /sys/block/sdNN/queue/max_segments and
  /sys/block/sdNN/queue/max_segment_size creates a
  limit based on how fragmented the data buffer is
  in virtual memory
* the Block Limits VPD page MAXIMUM TRANSFER LENGTH field
  indicates the maximum transfer size for one command over
  the SCSI transport protocol supported by the drive itself

It is risky to use transfer sizes larger than linux and
Windows can generate, since drives are probably tested in
those environments.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 5/6] osst: call scsi_set_medium_removal directly

2014-11-06 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Thursday, 30 October, 2014 4:27 AM
> To: linux-scsi@vger.kernel.org
> Cc: Douglas Gilbert; Elliott, Robert (Server Storage)
> Subject: [PATCH 5/6] osst: call scsi_set_medium_removal directly
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/osst.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/scsi/osst.c b/drivers/scsi/osst.c
> index b6d63d6..8c38464 100644
> --- a/drivers/scsi/osst.c
> +++ b/drivers/scsi/osst.c
> @@ -3327,19 +3327,18 @@ static int osst_write_frame(struct osst_tape
> * STp, struct osst_request ** aSRpn
>  /* Lock or unlock the drive door. Don't use when struct osst_request
> allocated. */
>  static int do_door_lock(struct osst_tape * STp, int do_lock)
>  {
> - int retval, cmd;
> + int retval;
> 
> - cmd = do_lock ? SCSI_IOCTL_DOORLOCK : SCSI_IOCTL_DOORUNLOCK;
>  #if DEBUG
>   printk(OSST_DEB_MSG "%s:D: %socking drive door.\n",
> tape_name(STp), do_lock ? "L" : "Unl");
>  #endif
> - retval = scsi_ioctl(STp->device, cmd, NULL);
> - if (!retval) {
> +
> + retval = scsi_set_medium_removal(STp->device,
> + do_lock ? SCSI_REMOVAL_PREVENT :
> SCSI_REMOVAL_ALLOW);
> + if (!retval)
>   STp->door_locked = do_lock ? ST_LOCKED_EXPLICIT :
> ST_UNLOCKED;
> - }
> - else {
> + else
>   STp->door_locked = ST_LOCK_FAILS;
> - }
>   return retval;
>  }
> 

Reviewed-by: Robert Elliott 

One additional nit (not necessary):
The do_lock argument could be bool rather than int both 
here and in st.c.


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 4/6] st: call scsi_set_medium_removal directly

2014-11-06 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Thursday, 30 October, 2014 4:27 AM
> To: linux-scsi@vger.kernel.org
> Cc: Douglas Gilbert; Elliott, Robert (Server Storage)
> Subject: [PATCH 4/6] st: call scsi_set_medium_removal directly
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/st.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
> index 7d2e036..e46e02b2 100644
> --- a/drivers/scsi/st.c
> +++ b/drivers/scsi/st.c
> @@ -861,17 +861,16 @@ static int set_mode_densblk(struct scsi_tape *
> STp, struct st_modedef * STm)
>  /* Lock or unlock the drive door. Don't use when st_request
> allocated. */
>  static int do_door_lock(struct scsi_tape * STp, int do_lock)
>  {
> - int retval, cmd;
> + int retval;
> 
> - cmd = do_lock ? SCSI_IOCTL_DOORLOCK : SCSI_IOCTL_DOORUNLOCK;
>   DEBC_printk(STp, "%socking drive door.\n", do_lock ? "L" :
> "Unl");
> - retval = scsi_ioctl(STp->device, cmd, NULL);
> - if (!retval) {
> +
> + retval = scsi_set_medium_removal(STp->device,
> + do_lock ? SCSI_REMOVAL_PREVENT :
> SCSI_REMOVAL_ALLOW);
> + if (!retval)
>   STp->door_locked = do_lock ? ST_LOCKED_EXPLICIT :
> ST_UNLOCKED;
> - }
> - else {
> + else
>   STp->door_locked = ST_LOCK_FAILS;
> - }
>   return retval;
>  }

Reviewed-by: Robert Elliott 

A few comments spawned by this (and patch 5/6):

1. Although PREVENT ALLOW MEDIUM REMOVAL was a generic
command defined in SPC-2, In 2006 T10 proposal
06-248r1 changed it to be a command-set specific 
command for SPC-3.  MMC, SSC, SBC, and SMC had slight
differences and the groups wouldn't agree on a converged
definition.

For example:
* SSC and SBC only define two of the bit combinations 
  in byte 4 bits 1:0 - allowed and prevented
* MMC defines all four combinations of byte 4 bits 1:0 -
  unlocked, locked, persistent allow, and persistent prevent
* SMC follows SSC/SBC but adds a PREEMPT bit in 
  byte 4 bit 7

So, relying on one SCSI-wide scsi_set_medium_removal
call could theoretically be a problem (though not yet,
since none of those extra features are used).


2. Reviewing the callers for scsi_set_medium_removal,
I noticed sd.c sd_release ignores the return value 
from scsi_set_medium_removal (like many others).

Its comment says:
*  Returns 0.
but it is a void function so doesn't really return anything:
static void sd_release(struct gendisk *disk, fmode_t mode)


3. Reviewing the callers, st_release has an initialized
result variable but does nothing else with it but return it:
int result = 0;
...
return result;

It ignores the do_door_lock -> scsi_set_medium_removal
result (like many others).

---
Rob ElliottHP Server Storage







--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/6] scsi: split scsi_nonblockable_ioctl

2014-11-06 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Thursday, 30 October, 2014 4:27 AM
> To: linux-scsi@vger.kernel.org
> Cc: Douglas Gilbert; Elliott, Robert (Server Storage)
> Subject: [PATCH 2/6] scsi: split scsi_nonblockable_ioctl
> 
> The calling conventions for this function where bad as it could
> return
> -ENODEV both for a device not currently online and a not recognized
> ioctl.
> 
> Add a new scsi_ioctl_block_when_processing_errors function that wraps
> scsi_block_when_processing_errors with the a special case for the
> SG_SCSI_RESET ioctl command, and handle the SG_SCSI_RESET case itself
> in scsi_ioctl.  All callers of scsi_ioctl now must call the above
> helper to check for the EH state, so that the ioctl handler itself doesn't
> have to.
> 
> Reported-by: Robert Elliott 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/ch.c |  5 +
>  drivers/scsi/osst.c   |  6 +++---
>  drivers/scsi/scsi_ioctl.c | 47 +
> --
>  drivers/scsi/sd.c |  6 +++---
>  drivers/scsi/sg.c | 33 +++--
>  drivers/scsi/sr.c | 15 +--
>  drivers/scsi/st.c |  7 +++
>  include/scsi/scsi_ioctl.h |  4 ++--
>  8 files changed, 49 insertions(+), 74 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
> index 5ddc08f..712f159 100644
> --- a/drivers/scsi/scsi_ioctl.c
> +++ b/drivers/scsi/scsi_ioctl.c
...

> @@ -281,30 +270,20 @@ int scsi_ioctl(struct scsi_device *sdev, int
> cmd, void __user *arg)
>  }
>  EXPORT_SYMBOL(scsi_ioctl);
> 
> -/**
> - * scsi_nonblockable_ioctl() - Handle SG_SCSI_RESET
> - * @sdev: scsi device receiving ioctl
> - * @cmd: Must be SC_SCSI_RESET
> - * @arg: pointer to int containing
> SG_SCSI_RESET_{DEVICE,TARGET,BUS,HOST}
> - *   possibly OR-ed with SG_SCSI_RESET_NO_ESCALATE
> - * @ndelay: file mode O_NDELAY flag
> +/*
> + * We can process a reset even when a device isn't fully operable.
>   */
> -int scsi_nonblockable_ioctl(struct scsi_device *sdev, int cmd,
> - void __user *arg, int ndelay)
> +int scsi_ioctl_block_when_processing_errors(struct scsi_device
> *sdev, int cmd,
> + bool ndelay)
>  {
> - /* The first set of iocts may be executed even if we're doing
> -  * error processing, as long as the device was opened
> -  * non-blocking */
> - if (ndelay) {
> + if (cmd == SG_SCSI_RESET && ndelay) {
>   if (scsi_host_in_recovery(sdev->host))
>   return -ENODEV;
> - } else if (!scsi_block_when_processing_errors(sdev))
> - return -ENODEV;
> -
> - switch (cmd) {
> - case SG_SCSI_RESET:
> - return scsi_ioctl_reset(sdev, arg);
> + } else {
> + if (!scsi_block_when_processing_errors(sdev))
> + return -ENODEV;
>   }
> - return -ENODEV;
> +
> + return 0;
>  }
> -EXPORT_SYMBOL(scsi_nonblockable_ioctl);
> +EXPORT_SYMBOL_GPL(scsi_ioctl_block_when_processing_errors);

Most of the SCSI midlayer functions are exported as non-GPL; 
is it really necessary to prevent closed source drivers from 
using this new function, especially since it's just a
refactoring of some previously non-GPL exported functions?

...
> diff --git a/include/scsi/scsi_ioctl.h b/include/scsi/scsi_ioctl.h
> index b900684..8d19d1d 100644
> --- a/include/scsi/scsi_ioctl.h
> +++ b/include/scsi/scsi_ioctl.h
> @@ -40,9 +40,9 @@ typedef struct scsi_fctargaddress {
>   unsigned char host_wwn[8]; // include NULL term.
>  } Scsi_FCTargAddress;
> 
> +int scsi_ioctl_block_when_processing_errors(struct scsi_device
> *sdev,
> + int cmd, bool ndelay);
>  extern int scsi_ioctl(struct scsi_device *, int, void __user *);
> -extern int scsi_nonblockable_ioctl(struct scsi_device *sdev, int
> cmd,
> -void __user *arg, int ndelay);
> 
>  #endif /* __KERNEL__ */
>  #endif /* _SCSI_IOCTL_H */
> --
> 1.9.1

Beyond this patch, the existing scsi_block_when_processing_errors
description should be expanded to mention that it is used to
prevent ioctls and commands, not just commands:

scsi_error.c:
/**
 * scsi_block_when_processing_errors - Prevent cmds from being queued.
 * @sdev:   Device on which we are performing recovery.
 *
 * Description:
 * We block until the host is out of error recovery, and then check to
 * see whether the host or the device is offline.
 *
 * Return value:
 * 0 when dev was taken offline by error recovery. 1 OK to proceed.
 */
int scsi_block_when_processing_errors(struct scsi_device *sdev)
...


Reviewed-by: Robert Elliott 


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 6/6] scsi: return EAGAIN when resetting a device under EH

2014-11-06 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Thursday, 30 October, 2014 4:27 AM
> To: linux-scsi@vger.kernel.org
> Cc: Douglas Gilbert; Elliott, Robert (Server Storage)
> Subject: [PATCH 6/6] scsi: return EAGAIN when resetting a device
> under EH
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/scsi_ioctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
> index 712f159..c4f7b56 100644
> --- a/drivers/scsi/scsi_ioctl.c
> +++ b/drivers/scsi/scsi_ioctl.c
> @@ -278,7 +278,7 @@ int
> scsi_ioctl_block_when_processing_errors(struct scsi_device *sdev, int
> cmd,
>  {
>   if (cmd == SG_SCSI_RESET && ndelay) {
>   if (scsi_host_in_recovery(sdev->host))
> - return -ENODEV;
> + return -EAGAIN;
>   } else {
>   if (!scsi_block_when_processing_errors(sdev))
>   return -ENODEV;
> --
> 1.9.1

This is how sg_reset responds to -EAGAIN, which seems reasonable:
sg_reset: starting device reset
sg_reset: SG_SCSI_RESET failed: Resource temporarily unavailable

That is easy to trigger when running sg_reset in parallel to
multiple devices presented by the same controller.  In one
example to 16 devices, 10 got that result while 6 succeeded.

Tested-by: Robert Elliott 
Reviewed-by: Robert Elliott 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3/6] sd: fix up ->compat_ioctl

2014-11-06 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Thursday, 30 October, 2014 4:27 AM
> To: linux-scsi@vger.kernel.org
> Cc: Douglas Gilbert; Elliott, Robert (Server Storage)
> Subject: [PATCH 3/6] sd: fix up ->compat_ioctl
> 
> No need to verify the passthrough ioctls, the real handler will
> take care of that.  Also make sure not to block for resets on
> O_NONBLOCK fds.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/sd.c | 28 
>  1 file changed, 8 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 61e50ae..baf5cf4 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1541,31 +1541,19 @@ static int sd_compat_ioctl(struct
> block_device *bdev, fmode_t mode,
>  unsigned int cmd, unsigned long arg)
...
>   /*
>* Let the static ioctl translation table take care of it.
>*/
> - return -ENOIOCTLCMD;
> + if (!sdev->host->hostt->compat_ioctl)
> + return -ENOIOCTLCMD;

Although I don't see it in the quoted email, git complains that 
this line (and the original being deleted) have a space 
after -ENOIOCTLCMD;

> + return sdev->host->hostt->compat_ioctl(sdev, cmd, (void __user
> *)arg);
>  }
>  #endif
> 
> --
> 1.9.1

Reviewed-by: Robert Elliott 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/6] scsi: refactor scsi_reset_provider handling

2014-11-06 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
> Sent: Thursday, 30 October, 2014 4:27 AM
> To: linux-scsi@vger.kernel.org
> Cc: Douglas Gilbert; Elliott, Robert (Server Storage)
> Subject: [PATCH 1/6] scsi: refactor scsi_reset_provider handling
> 
> Pull the common code from the two callers into the function,
> and renamed it to scsi_ioctl_reset.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/scsi_error.c | 76 ++---
> --
>  drivers/scsi/scsi_ioctl.c | 33 +---
>  drivers/scsi/sg.c | 34 ++---
>  include/scsi/scsi_eh.h| 15 +-
>  4 files changed, 41 insertions(+), 117 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index fa7b5ec..ba19687 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -36,6 +36,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "scsi_priv.h"
>  #include "scsi_logging.h"
> @@ -2309,39 +2310,36 @@ scsi_reset_provider_done_command(struct
> scsi_cmnd *scmd)
>  {
>  }
> 
> -/*
> - * Function: scsi_reset_provider
> - *
> - * Purpose:  Send requested reset to a bus or device at any phase.
> - *
> - * Arguments:device  - device to send reset to
> - *   flag - reset type (see scsi.h)
> - *
> - * Returns:  SUCCESS/FAILURE.
> - *
> - * Notes:This is used by the SCSI Generic driver to provide
> - *   Bus/Device reset capability.
> +/**
> + * scsi_ioctl_reset: explicitly reset a host/bus/target/device
> + * @dev: scsi_device to operate on
> + * @val: reset type (see sg.h)
>   */
>  int
> -scsi_reset_provider(struct scsi_device *dev, int flag)
> +scsi_ioctl_reset(struct scsi_device *dev, int __user *arg)
>  {
>   struct scsi_cmnd *scmd;
>   struct Scsi_Host *shost = dev->host;
>   struct request req;

Is this 368 byte structure too big to place on the stack?  
Most other code using struct request uses an alloc() 
function.  Also, on the stack, it is unlikely to be
cacheline aligned.

>   unsigned long flags;
> - int rtn;
> + int error = 0, rtn, val;
> +

No need to initialize error, as it is unilaterally set 
4 lines below.

> + if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
> + return -EACCES;
> +
> + error = get_user(val, arg);
> + if (error)
> + return error;
> 
>   if (scsi_autopm_get_host(shost) < 0)
> - return FAILED;
> + return -EIO;
>

This could propagate the return value from scsi_autopm_get_host,
which is the value from pm_runtime_get_sync, 
which is the value from __pm_runtime_resume,
which is the value from rpm_resume,
which could be -EINVAL, -EACCESS, -EINPROGRESS, -EBUSY, or
the value from rpm_callback, 
which could be -ENOSYS, etc.

> - if (!get_device(&dev->sdev_gendev)) {
> - rtn = FAILED;
> + error = -EIO;
> + if (!get_device(&dev->sdev_gendev))
>   goto out_put_autopm_host;
> - }
> 
>   scmd = scsi_get_command(dev, GFP_KERNEL);
>   if (!scmd) {
> - rtn = FAILED;
>   put_device(&dev->sdev_gendev);
>   goto out_put_autopm_host;
>   }
> @@ -2362,39 +2360,37 @@ scsi_reset_provider(struct scsi_device *dev,
> int flag)
>   shost->tmf_in_progress = 1;
>   spin_unlock_irqrestore(shost->host_lock, flags);
> 
> - switch (flag) {
> - case SCSI_TRY_RESET_DEVICE:
> + switch (val & ~SG_SCSI_RESET_NO_ESCALATE) {
> + case SG_SCSI_RESET_NOTHING:
> + rtn = SUCCESS;
> + break;
> + case SG_SCSI_RESET_DEVICE:
>   rtn = scsi_try_bus_device_reset(scmd);
> - if (rtn == SUCCESS)
> + if (rtn == SUCCESS || (val & SG_SCSI_RESET_NO_ESCALATE))
>   break;
>   /* FALLTHROUGH */
> - case SCSI_TRY_RESET_TARGET:
> + case SG_SCSI_RESET_TARGET:
>   rtn = scsi_try_target_reset(scmd);
> - if (rtn == SUCCESS)
> + if (rtn == SUCCESS || (val & SG_SCSI_RESET_NO_ESCALATE))
>   break;
>   /* FALLTHROUGH */
> - case SCSI_TRY_RESET_BUS:
> + case SG_SCSI_RESET_BUS:
>   rtn = scsi_try_bus_reset(scmd);
> - if (rtn == SUCCESS)
> + if (rtn == SUCCESS || (val & SG_SCSI_RESET_NO_ESCALATE))
>   break;
>   /* FALLTHROUGH */

RE: [PATCH v2 12/12] IB/srp: Add multichannel support

2014-11-04 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Sagi Grimberg [mailto:sa...@dev.mellanox.co.il]
> Sent: Tuesday, November 04, 2014 6:15 AM
> To: Bart Van Assche; Elliott, Robert (Server Storage); Christoph Hellwig
> Cc: Jens Axboe; Sagi Grimberg; Sebastian Parschauer; Ming Lei; linux-
> s...@vger.kernel.org; linux-rdma
> Subject: Re: [PATCH v2 12/12] IB/srp: Add multichannel support
> 
...
> I think that Rob and I are not talking about the same issue. In
> case only a single core is servicing interrupts it is indeed expected
> that it will spend 100% in hard-irq, that's acceptable since it is
> pounded with completions all the time.
> 
> However, I'm referring to a condition where SRP will spend infinite
> time servicing a single interrupt (while loop on ib_poll_cq that never
> drains) which will lead to a hard lockup.
> 
> This *can* happen, and I do believe that with an optimized IO path
> it is even more likely to.

If the IB completions/interrupts are only for IOs submitted on this
CPU, then the CQ will eventually drain, because this CPU is not 
submitting anything new while stuck in the loop.

This can become bursty, though - submit a lot of IOs, then be busy
completing all of them and not submitting more, resulting in the 
queue depth bouncing from 0 to high to 0 to high.  I've seen
that with both hpsa and mpt3sas drivers.  The fio options
iodepth_batch, iodepth_batch_complete, and iodepth_low
can amplify and reduce that effect (using libaio).

I haven't found a good way for the LLD ISRs and the block
layer completion code to decide to yield the CPU based on how
much time they are taking - that would almost qualify as
a realtime kernel feature.  If you compile with
CONFIG_IRQ_TIME_ACCOUNTING, the kernel does keep track
of that information; perhaps that could be exported so
modules can use it?

---
Rob Elliott, HP Server Storage

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2 12/12] IB/srp: Add multichannel support

2014-11-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Sagi Grimberg [mailto:sa...@dev.mellanox.co.il]
> Sent: Sunday, November 02, 2014 7:03 AM
> To: Bart Van Assche; Christoph Hellwig
> Cc: Jens Axboe; Sagi Grimberg; Sebastian Parschauer; Elliott, Robert
> (Server Storage); Ming Lei; linux-scsi@vger.kernel.org; linux-rdma
> Subject: Re: [PATCH v2 12/12] IB/srp: Add multichannel support
> 
...
> IMHO, this is not iSER specific issue, it is easily indicated from the
> code that a specific workload SRP will poll recv completion queue
> forever in an interrupt context.
> 
> I encountered this issue on a virtual guest in a high workload (80+
> sessions with heavy traffic on all) because qemu smp_affinity setting
> was broken (might still be, didn't check that for a while). This caused
> all completion vectors to fire interrupts to core 0 causing a high
> events contention on a single event queue (causing lockup situations
> and starvation of other CQs). Using more completion queues will enhance
> this situation.
> 
> I think running multichannel code when all MSIX vectors affinity are
> directed to a single CPU can invoke what I'm talking about.

That's not an SRP specific problem either.  If you ask just one CPU to
service interrupts and block layer completions for submissions from lots
of other CPUs, it's bound to become overloaded.

Setting rq_affinity=2 helps quite a bit for the block layer completion
work.  This patch proposed making that the default for blk-mq:
https://lkml.org/lkml/2014/9/9/931

For SRP interrupt processing, irqbalance recently changed its default 
to ignore the affinity_hint; you now need to pass an option to honor
the hint, or provide a policy script to do so for selected irqs.  For
multi-million IOPS workloads, irqbalance takes far too long to reroute
them based on activity; you're likely to overload a CPU with 100% 
hardirq processing, creating self-detected stalls for the submitting
processes on that CPU and other problems.  Sending interrupts back 
to the submitting CPU provides self-throttling.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/1 linux-next] hpsa: remove set but unused variable rc

2014-10-29 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Sudip Mukherjee
> Sent: Thursday, October 30, 2014 12:55 AM
> To: Fabian Frederick
> Cc: linux-ker...@vger.kernel.org; Stephen M. Cameron; James E.J.
> Bottomley; iss_storage...@hp.com; linux-scsi@vger.kernel.org
> Subject: Re: [PATCH 1/1 linux-next] hpsa: remove set but unused variable
> rc
> 
> On Wed, Oct 29, 2014 at 04:15:04PM +0100, Fabian Frederick wrote:
> > Fix -Wunused-but-set-variable warning
> 
> you should also mention why you have left the call to
> irq_set_affinity_hint().
> i am not sure , but it looks like irq_set_affinity_hint() is only
> checking if
> the lock is available or not. It is just locking ,then if lock is
> successfull then
> returning 0 or if lock fails then return -EINVAL, and unlocks before
> returnig.
> not doing anything else.
> 
> thanks
> sudip

No, that function sets a mask value that shows up in 
/proc/irq/nnn/affinity_hint
that a program like irqbalance may use to set the CPU affinity mask
for each irq via 
/proc/irq/nnn/smp_affinity   (bitmap format)
/proc/irq/nnn/smp_affinity_list   (range format)

The reason is that in many cases, it is best when all these occur 
on the same CPU that submitted the IO:
* LLD submission queues (if multiple are supported)
* LDD completion queues
* MSI-X interrupt indicating completion
* LLD completion interrupt handler
* block layer completion handler

Benefits include:
* cache efficiency - the data structures for the IO aren't
pulled from CPU to CPU.

* avoid IPI overhead in the block layer to get the completion 
processed on the submitting CPU (which is done if using 
rq_affinity=2 and the interrupt is routed to on another CPU).

* self-throttle the CPUs - avoid swamping one CPU with 
completion processing for IOs submitted by many other CPUs
(which leads to stalls on the victim and timeouts on the
aggressors).

You must run irqbalance with an option to honor the hints;
some versions default to that, others don't.  Or, disable
the irqbalance service and set them up the affinities
manually.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: blk-mq problem on proliant DL380 G3 (cciss)

2014-10-29 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Meelis Roos
> Sent: Wednesday, 29 October, 2014 10:38 AM
> To: Jens Axboe
> Cc: linux-scsi@vger.kernel.org; Christoph Hellwig
> Subject: Re: blk-mq problem on proliant DL380 G3 (cciss)
> 
> > On 2014-10-29 05:46, Meelis Roos wrote:
> > > > I tried 3.18-rc2 with blk-mq default on on HP ProLiant DL380 G3
> (with HP
> > > > CCISS RAID controller). It fails late in the bootup with "task
> > > > scsi_eh_1:720 blocked for more than 120 seconds." messages.
> > > >
> > > > Booting with scsi_mod.use_blk_mq=0 fixes the problem.
> > >
> > > Another test server with MPT SCSI RAID has similar problem,
> > > scsi_mode.use_blk_mq=0 cures it but I can not get good trace (no
> > serail
> > > console). 3.18.0-rc2-00043-gf7e87a4 was tested there.
> >
> > The first issue looks like scsi cdrom and error handling, it must
> > be leaking
> > requests hence we hang on allocation of a new one. cciss doesn't
> > use blk_mq
> > regardless of the scsi setting. Does the mpt box also have a libata
> > driven cdrom?
> 
> Yes, it does.
> 
> --
> Meelis Roos (mr...@linux.ee)

In the log, the first soft lockup for scsi_eh_1 means the thread 
for host1, which is a pata controller:

...
[   15.069114] scsi host1: pata_serverworks
[   15.173491] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo 
Giometti 
[   15.184512] scsi host2: pata_serverworks
[   15.184673] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x2000 irq 14
[   15.184675] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x2008 irq 15
[   15.460452] ata1.00: ATAPI: COMPAQ  CD-ROM SN-124, N104, max PIO4
[   15.476445] ata1.00: configured for PIO4
[   15.477110] scsi 1:0:0:0: CD-ROMCOMPAQ   CD-ROM SN-124N104 
PQ: 0 ANSI: 5 
...
[  240.704040] INFO: task scsi_eh_1:720 blocked for more than 120 seconds.
[  240.783198]   Not tainted 3.18.0-rc2-dirty #22
[  240.840485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[  240.934172] scsi_eh_1   D c1264d7f 0   720  2 0x
[  241.010385]  f5bdbe54 0046 f5bdbde0 c1264d7f f5bdbe00 1412  
2be89db4
[  241.103850]  0004 2be8b1c6 0004 c1534000 f63bca10 c10892f5 6f223d9e 
0132
[  241.197335]   066087ce f5bdbe50 c10892f5 6f22478a 0132  
066087ce
[  241.290803] Call Trace:
[  241.320039]  [] ? put_device+0xf/0x20
[  241.374205]  [] ? ktime_get+0x45/0x110
[  241.429416]  [] ? ktime_get+0x45/0x110
[  241.484631]  [] schedule+0x1e/0x60
[  241.535679]  [] io_schedule+0x77/0xc0
[  241.589854]  [] bt_get+0xc3/0x140
[  241.639867]  [] ? __wake_up_sync+0x20/0x20
[  241.699240]  [] blk_mq_get_tag+0x9e/0xc0
...


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 26/27] scsi: correct return values for .eh_abort_handler implementations

2014-10-28 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Friday, 24 October, 2014 7:27 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Elliott, Robert (Server Storage); linux-
> s...@vger.kernel.org; Hannes Reinecke; sta...@vger.kernel.org
> Subject: [PATCH 26/27] scsi: correct return values for
> .eh_abort_handler implementations
> 
> The .eh_abort_handler needs to return SUCCESS, FAILED, or
> FAST_IO_FAIL. So fixup all callers to adhere to this requirement.
> 
> Cc: Robert Elliott 
> Cc: 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/NCR5380.c| 2 +-
>  drivers/scsi/aha1740.c| 2 +-
>  drivers/scsi/esas2r/esas2r_main.c | 2 +-
>  drivers/scsi/megaraid.c   | 8 
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
> index 50873bb..e2c9e73 100644
> --- a/drivers/scsi/NCR5380.c
> +++ b/drivers/scsi/NCR5380.c
> @@ -2703,7 +2703,7 @@ static int NCR5380_abort(Scsi_Cmnd * cmd) {
>   * aborted flag and get back into our main loop.
>   */
> 
> - return 0;
> + return SUCCESS;
>   }
>  #endif

The comment above this function still says:
* Returns : 0 - success, -1 on failure.

The same comment problem also exists in
atari_NCR5380.c and sun3_NCR5380.c, which have
functions with this same name.

With all those comments also updated...
Reviewed-by: Robert Elliott 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 25/27] scsi: check for correct return code in scsi_eh_abort_cmds()

2014-10-28 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Friday, 24 October, 2014 7:27 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Elliott, Robert (Server Storage); linux-
> s...@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 25/27] scsi: check for correct return code in
> scsi_eh_abort_cmds()
> 
> scsi_try_to_abort_cmd() might return SUCCESS, FAILED, or
> FAST_IO_FAIL. So just checking for FAILED will treat
> FAST_IO_FAIL as SUCCESS, which is wrong.
> 
> Cc: Robert Elliott 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/scsi_error.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index fa7b5ec..e94baf1 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1345,7 +1345,7 @@ static int scsi_eh_abort_cmds(struct list_head
> *work_q,
>   scmd_printk(KERN_INFO, scmd,
>"%s: aborting cmd\n", current->comm));
>   rtn = scsi_try_to_abort_cmd(shost->hostt, scmd);
> - if (rtn == FAILED) {
> + if (rtn != SUCCESS) {
>   SCSI_LOG_ERROR_RECOVERY(3,
>   scmd_printk(KERN_INFO, scmd,
>   "%s: aborting cmd failed\n",

The rest of the code in that function is:
current->comm));
list_splice_init(&check_list, work_q);
 return list_empty(work_q);
}   [closing the rtn != SUCCESS block]
scmd->eh_eflags &= ~SCSI_EH_CANCEL_CMD;
if (rtn == FAST_IO_FAIL)
scsi_eh_finish_cmd(scmd, done_q);
else
list_move_tail(&scmd->eh_entry, &check_list);
}

return scsi_eh_test_devices(&check_list, work_q, done_q, 0);
}

With this change, if rtn is FAST_IO_FAIL, the 
return list_empty(work_q);
will be taken and it'll never get to
if (rtn == FAST_IO_FAIL)

---
Rob ElliottHP Server Storage




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 27/27] scsi: ratelimit I/O error messages

2014-10-28 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Tuesday, 28 October, 2014 12:57 PM
> To: Elliott, Robert (Server Storage)
> Cc: Hannes Reinecke; James Bottomley; Christoph Hellwig; linux-
> s...@vger.kernel.org
> Subject: Re: [PATCH 27/27] scsi: ratelimit I/O error messages
> 
> On Fri, Oct 24, 2014 at 09:00:15PM +0000, Elliott, Robert (Server
> Storage) wrote:
> > Assuming the MLQUEUE -> MLCOMPLETE fix is put back in,
> > you may add:
> > Tested-by: Robert Elliott 
> > Reviewed-by: Robert Elliott 
> 
> Does this apply just to this patch or the whole series?

I didn't exercise every change in the series, so just
keep the Tested-by tag on this one.

1-24 have Reviewed-by tags (I reviewed v5 which ended there).
25-26 do not; I'll take a look at them now.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2] scsi: TUR path is down after adapter gets reset in multipath configuration(scsi_dh_alus.c)

2014-10-27 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of wenxi...@linux.vnet.ibm.com
> Sent: Monday, 27 October, 2014 1:02 PM
> To: james.bottom...@hansenpartnership.com
> Cc: h...@infradead.org; linux-scsi@vger.kernel.org;
> brk...@linux.vnet.ibm.com
> Subject: [PATCH 2/2] scsi: TUR path is down after adapter gets reset
> in multipath configuration(scsi_dh_alus.c)
> 
> This patch also fixes the 02/04/02 K/C/Q check in alua_check_sense
> handler.
> 
> Signed-off-by: Brian King 
> Teste-by: Wen Xiong 

Missing a d

> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c |7 +++
>  1 file changed, 7 insertions(+)
> 
> Index: b/drivers/scsi/device_handler/scsi_dh_alua.c
> ===
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c  2014-10-23
> 13:00:45.0 -0500
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c  2014-10-23
> 13:04:16.152079004 -0500
> @@ -474,6 +474,13 @@ static int alua_check_sense(struct scsi_
>* LUN Not Ready -- Offline
>*/
>   return SUCCESS;
> + if (sdev->allow_restart &&
> + (sense_hdr->asc == 0x04) && (sense_hdr->ascq ==
> 0x02))

The coding style in that function does not include the extra 
parenthesis, as shown by the next excerpt:

> + /*
> +  * if the device is not started, we need to wake
> +  * the error handler to start the motor
> +  */
> + return FAILED;
>   break;
>   case UNIT_ATTENTION:
>   if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00)

Thus function is used several places:
* installed as the scsi_device_handler .check_sense function 
* called to parse the response to REPORT TARGET PORT GROUPS
  in alua_rtpg
* called to parse the response to SET TARGET PORT GROUPS
  in stpg_endio

I'm not sure that adding NOT READY/LOGICAL UNIT NOT READY,
INITIALIZING COMMAND REQUIRED (2h/04h/02h) is a good idea
for the second case. The expected way to handle that 
response is to send START STOP UNIT with START=1.

There are conditions in which REPORT TARGET PORT GROUPS
is allowed and START STOP UNIT with START=1 is not allowed:
* CbCS (capabilities-based command security) only allows 
  START STOP UNIT if physical access (PHY ACC) is enabled,
  while REPORT TARGET PORT GROUPS is always allowed.
* the standby or unavailable asymmetric access states only 
  guarantee that REPORT TARGET PORT GROUPS is allowed, not
  START STOP UNIT.  The device is permitted to support START
  STOP UNIT, but it's not required.

So, it's really not a response that should be returned for
that command.

Any device that does return that response must also support
START STOP UNIT or it's misleading the application client.
In that case, falling through to the EH to send START STOP
UNIT is the right thing to do.

SET TARGET PORT GROUPS is questionable too; it also has 
different CbCS permissions and asymmetric access state
requirements than START STOP UNIT with START=1.

Perhaps that new "return FAILED" should be skipped if the
opcode is REPORT TARGET PORT GROUPS or SET TARGET PORT
GROUPS?

---
Rob ElliottHP Server Storage


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/2] scsi: TUR path is down after adapter gets reset in multipath configuration(scsi_error.c)

2014-10-27 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of wenxi...@linux.vnet.ibm.com
> Sent: Monday, 27 October, 2014 1:02 PM
> To: james.bottom...@hansenpartnership.com
> Cc: h...@infradead.org; linux-scsi@vger.kernel.org;
> brk...@linux.vnet.ibm.com
> Subject: [PATCH 1/2] scsi: TUR path is down after adapter gets reset
> in multipath configuration(scsi_error.c)
> 
> After an ipr adapter gets reset, all disk array devices require a
> start unit command to be issued to them before they will accept
> commands. So, with the SCSI EH change, we now end up in a scenario 
> with dual ipr adapters where the TUR getting issued from the health
> checker returns with a Not Ready response and since SCSI EH no
> longer triggers the Start Unit in this scenario, the path never
> recovers.
> 

This description doesn't make sense without reading the thread:
[dm-devel] [PATCH 1/1] multipath-tools: Change path checker for IBM IPR devices

Example: what is "the SCSI EH change"?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 27/27] scsi: ratelimit I/O error messages

2014-10-24 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Friday, 24 October, 2014 7:27 AM
> To: James Bottomley
> Cc: Christoph Hellwig; Elliott, Robert (Server Storage); linux-
> s...@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 27/27] scsi: ratelimit I/O error messages
> 
> There can be quite a lot of I/O error messages, even on smaller
> machines. So we need to ratelimit them to not overwhelm logging.
> 
> Cc: Robert Elliott 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/scsi_lib.c | 32 
>  1 file changed, 20 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 3ea5969..4572991 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
...
> @@ -1038,18 +1039,25 @@ void scsi_io_completion(struct scsi_cmnd
> *cmd, unsigned int good_bytes)
>   switch (action) {
>   case ACTION_FAIL:
>   /* Give up and fail the remainder of the request */
> - if (unlikely(scsi_logging_level))
> - level = SCSI_LOG_LEVEL(SCSI_LOG_MLQUEUE_SHIFT,
> -SCSI_LOG_MLQUEUE_BITS);
> - /*
> -  * if logging is enabled the failure will be printed
> -  * in scsi_log_completion(), so avoid duplicate messages
> -  */
> - if (!level && !(req->cmd_flags & REQ_QUIET)) {
> - scsi_print_result(cmd, NULL, FAILED);
> - if (driver_byte(result) & DRIVER_SENSE)
> - scsi_print_sense(cmd);
> - scsi_print_command(cmd);
> + if (!(req->cmd_flags & REQ_QUIET)) {
> + static DEFINE_RATELIMIT_STATE(_rs,
> + DEFAULT_RATELIMIT_INTERVAL,
> + DEFAULT_RATELIMIT_BURST);
> +
> + if (unlikely(scsi_logging_level))
> + level =
> SCSI_LOG_LEVEL(SCSI_LOG_MLQUEUE_SHIFT,
> +SCSI_LOG_MLQUEUE_BITS);
> +

This switched back to MLQUEUE from MLCOMPLETE (which v7 had
corrected).

With MLCOMPLETE level 1, the scsi_log_completion prints
also occur, so everything is doubled.

This should use MLCOMPLETE.


> + /*
> +  * if logging is enabled the failure will be
> printed
> +  * in scsi_log_completion(), so avoid duplicate
> messages
> +  */
> + if (!level && __ratelimit(&_rs)) {
> + scsi_print_result(cmd, NULL, FAILED);
> + if (driver_byte(result) & DRIVER_SENSE)
> + scsi_print_sense(cmd);
> + scsi_print_command(cmd);

The scsi_log_completion equivalent section calls scsi_print_command
before scsi_print_sense (noticed due to the previous issue).  You 
may want to make them the same (though in this case the subtle
difference was helpful).

> + }
>   }
>   if (!scsi_end_request(req, error, blk_rq_err_bytes(req),
> 0))
>   return;
> --
> 1.8.5.2

With MLQUEUE level 0 and MLCOMPLETE level 0, v8 runs those prints, 
and they work as expected:
* the value in the ratelimit message matches the block layer (82)
* the number of print_result/print_sense/print_command matches 
  the number of block layer prints (10)
* the SCSI prints end up interleaved, but that's for the next 
  patch series to fix

Assuming the MLQUEUE -> MLCOMPLETE fix is put back in,
you may add:
Tested-by: Robert Elliott 
Reviewed-by: Robert Elliott  

Thanks.

Excerpt with MLQUEUE level 0, MLCOMPLETE level 0:
[  789.844126] scsi_io_completion: 82 callbacks suppressed
[  789.844231] blk_update_request: 82 callbacks suppressed
[  789.844233] blk_update_request: critical target error, dev sds, sector 35768
[  789.844263] blk_update_request: critical target error, dev sds, sector 3984
[  789.844267] blk_update_request: critical target error, dev sds, sector 74936
[  789.844310] blk_update_request: critical target error, dev sds, sector 46592
[  789.844481] blk_update_request: critical target error, dev sds, sector 111072
[  789.844485] blk_update_request: critical target error, dev sds, sector 160608
[  789.844488] blk_update_request: critical target error, dev sds, sector 1232
[  789.844492] blk_update_request: critical target error, dev sds, sector 129896
[  789.844498] blk_update_request: critical target error, dev sds, sector 119672
[  789.844535] blk_update_request: critical target error, dev sds, sector 14272
[  789.8

RE: [PATCH v2 10/12] IB/srp: Use block layer tags

2014-10-23 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Thursday, October 23, 2014 3:48 AM
> To: Elliott, Robert (Server Storage)
> Cc: Bart Van Assche; Jens Axboe; Sagi Grimberg; Sebastian Parschauer;
> Ming Lei; linux-scsi@vger.kernel.org; linux-rdma; Scales, Webb; Don
> Brace (PMC)
> Subject: Re: [PATCH v2 10/12] IB/srp: Use block layer tags
> 
> On Wed, Oct 22, 2014 at 10:03:24PM +0000, Elliott, Robert (Server
> Storage) wrote:
> > Have you tested this with scsi_mod.use_blk_mq=n?
> >
> > Trying similar changes in hpsa, we still receive some INQUIRY commands
> > submitted through queuecommand with tag -1.  They are for devices for
> > which slave_alloc has not yet been run, implying this work needs to
> > be done even earlier.  Maybe the midlayer is missing a slave_alloc
> > call somewhere?
> 
> Did that version of hpsa really enable tagging in slave_alloc
> or just in slave_configure?  The latter would cause INQUIRY to be
> sent untagged.

Yes, it is slave_alloc, not slave_configure.  

However, it was looking at scmd->tag, which is always 0xff (at 
least in those early discovery commands).  scmd->request->tag 
looks like it is the field that has the correct values.

Also, I noticed that scmd->tag is just an 8 bit field, so
it could never represent a large number of tags.

Just to confirm: After calling scsi_init_shared_tag_map()
in non-mq mode, will scmd->request->tag be based on 
controller-wide tag allocation (never using the same
value at the same time for the request queues of multiple
devices in that controller)?


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 6/6] scsi: use dev_printk variants where possible

2014-10-22 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Hannes Reinecke
> Sent: Tuesday, 03 June, 2014 6:23 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Hannes Reinecke
> Subject: [PATCH 6/6] scsi: use dev_printk variants where possible
> 
> Using dev_printk variants prefixes the logging message with
> the originating device, which makes debugging easier.
> 
> Signed-off-by: Hannes Reinecke 
...

This patch (91921e016a2199e7afe5933c94bd9f723d946598
in the kernel) has left the devname[64] variable in this 
function unused.  There is an sprintf setting it, but it is 
never read.

char devname[64];
...
sprintf(devname, "host %d channel %d id %d",
shost->host_no, sdev->channel, sdev->id);

Here are the former uses:

> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index e02b3aa..46563b1 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
...
> @@ -1430,17 +1433,19 @@ static int scsi_report_lun_scan(struct
> scsi_target *starget, int bflags,
>* a retry.
>*/
>   for (retries = 0; retries < 3; retries++) {
> - SCSI_LOG_SCAN_BUS(3, printk (KERN_INFO "scsi scan:
> Sending"
> - " REPORT LUNS to %s (try %d)\n", devname,
> + SCSI_LOG_SCAN_BUS(3, sdev_printk (KERN_INFO, sdev,
> + "scsi scan: Sending REPORT LUNS to (try
> %d)\n",
>   retries));

The word "to" remains from:
"to %s", devname 

but the switch to sdev_printk probably meant to delete "to" too.

...
> @@ -1466,10 +1471,11 @@ static int scsi_report_lun_scan(struct
> scsi_target *starget, int bflags,
> 
>   num_luns = (length / sizeof(struct scsi_lun));
>   if (num_luns > max_scsi_report_luns) {
> - printk(KERN_WARNING "scsi: On %s only %d
> (max_scsi_report_luns)"
> -" of %d luns reported, try increasing"
> -" max_scsi_report_luns.\n", devname,
> -max_scsi_report_luns, num_luns);
> + sdev_printk(KERN_WARNING, sdev,
> + "Only %d (max_scsi_report_luns)"
> + " of %d luns reported, try increasing"
> + " max_scsi_report_luns.\n",
> + max_scsi_report_luns, num_luns);
>   num_luns = max_scsi_report_luns;
>   }
> 
> @@ -1495,15 +1501,15 @@ static int scsi_report_lun_scan(struct
> scsi_target *starget, int bflags,
>* this differs from what linux would print for the
>* integer LUN value.
>*/
> - printk(KERN_WARNING "scsi: %s lun 0x", devname);
> - data = (char *)lunp->scsi_lun;
> - for (i = 0; i < sizeof(struct scsi_lun); i++)
> - printk("%02x", data[i]);
> - printk(" has a LUN larger than currently
> supported.\n");
> + sdev_printk(KERN_WARNING, sdev,
> + "lun 0x%8phN has a LUN larger"
> + " than currently supported.\n",
> + lunp->scsi_lun);
>   } else if (lun > sdev->host->max_lun) {
> - printk(KERN_WARNING "scsi: %s lun%d has a LUN
> larger"
> -" than allowed by the host adapter\n",
> -devname, lun);
> + sdev_printk(KERN_WARNING, sdev,
> + "lun 0x%8phN has a LUN larger"
> + " than allowed by the host adapter\n",
> + lunp->scsi_lun);
>   } else {
>   int res;
> 
...
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2 10/12] IB/srp: Use block layer tags

2014-10-22 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: Bart Van Assche [mailto:bvanass...@acm.org]
> Sent: Tuesday, 07 October, 2014 8:07 AM
...
> @@ -1927,7 +1931,7 @@ static int srp_queuecommand(struct Scsi_Host
> *shost, struct scsi_cmnd *scmnd)
> 
>   cmd->opcode = SRP_CMD;
>   cmd->lun= cpu_to_be64((u64) scmnd->device->lun << 48);
> - cmd->tag= req->index;
> + cmd->tag= tag;
>   memcpy(cmd->cdb, scmnd->cmnd, scmnd->cmd_len);
> 
>   req->scmnd= scmnd;
...
> 
> +static int srp_slave_alloc(struct scsi_device *sdev)
> +{
> + sdev->tagged_supported = 1;
> +
> + scsi_activate_tcq(sdev, sdev->queue_depth);
> +
> + return 0;
> +}
> +

Have you tested this with scsi_mod.use_blk_mq=n?

Trying similar changes in hpsa, we still receive some INQUIRY commands 
submitted through queuecommand with tag -1.  They are for devices for
which slave_alloc has not yet been run, implying this work needs to 
be done even earlier.  Maybe the midlayer is missing a slave_alloc
call somewhere?

---
Rob ElliottHP Server Storage



N�r��yb�X��ǧv�^�)޺{.n�+{���"�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i

RE: [PATCH 27/27] scsi: ratelimit I/O error messages

2014-10-21 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Elliott, Robert (Server Storage)
...
> With scsi_logging_level --set --error=5, there are:
> * 10 "FAILED Result" prints
>   * the first has no CDB
>   * at the end, there is a much later CDB print not preceded by a
> FAILED print
> * 10 "blk_update_request: critical target error" prints
>   * their sector number matches the preceding CDB print
> * 1 "blk_update_request: 539 callbacks suppressed" print
> * fio printed 33 io_u errors
>   * the first at offset 27118616, matching the first print
>   * one of the later ones is at offset 28297168, matching the last
> print
> * no "callbacks suppressed" messages from scsi_io_completion
...
> [49190.337402] sd 2:0:0:1: [sds] FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [49190.337404] sd 2:0:0:1: [sds] Sense Key : Hardware Error [current]
> [49190.337406] sd 2:0:0:1: [sds] Add. Sense: Logical unit failure
> [49190.337408] sd 2:0:0:1: [sds] CDB:
> [49190.337412] Read(10): 28 00 01 9d cc 58 00 00 08 00
> [49190.337413] blk_update_request: critical target error, dev sds,
> sector 27118680 <0x19dcc58>
> [49190.337417] blk_update_request: critical target error, dev sds,
> sector 27118688 <0x19dcc60>
> [49190.554043] sd 2:0:0:1: [sds] CDB:
> [49190.555178] Read(10): 28 00 01 af c7 d0 00 00 08 00 <28297168>
...

After triggering another error after lunch, a ratelimit message
for scsi_io_completion did finally appear:
[57601.577000] scsi_io_completion: 536 callbacks suppressed
[57601.578865] sd 2:0:0:0: [sdr] FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
[57601.581535] sd 2:0:0:0: [sdr] Sense Key : Hardware Error [current]
[57601.584406] sd 2:0:0:0: [sdr] Add. Sense: Logical unit failure
[57601.586419] sd 2:0:0:0: [sdr] CDB:
[57601.587662] Read(10): 28 00 2e 92 90 00 00 00 08 00

---
Rob ElliottHP Server Storage





--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 27/27] scsi: ratelimit I/O error messages

2014-10-21 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Monday, 20 October, 2014 1:53 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Elliott, Robert (Server
> Storage); Hannes Reinecke
> Subject: [PATCH 27/27] scsi: ratelimit I/O error messages
> 
> There can be quite a lot of I/O error messages, even on smaller
> machines. So we need to ratelimit them to not overwhelm logging.
> 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/scsi_lib.c | 30 ++
>  1 file changed, 18 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 2df485f..9d248d2 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
>  #include 
> @@ -1037,18 +1038,23 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> unsigned int good_bytes)
>   switch (action) {
>   case ACTION_FAIL:
>   /* Give up and fail the remainder of the request */
> - if (unlikely(scsi_logging_level))
> - level = SCSI_LOG_LEVEL(SCSI_LOG_MLQUEUE_SHIFT,
> -SCSI_LOG_MLQUEUE_BITS);
> - /*
> -  * if logging is enabled the failure will be printed
> -  * in scsi_log_completion(), so avoid duplicate messages
> -  */
> - if (!level && !(req->cmd_flags & REQ_QUIET)) {
> - scsi_print_result(cmd, NULL, FAILED);
> - if (driver_byte(result) & DRIVER_SENSE)
> - scsi_print_sense(cmd);
> - scsi_print_command(cmd);
> + if (unlikely(scsi_logging_level)) {

I don't think wrapping all that inside:
if (unlikely(scsi_logging_level))
is correct.  That means unrelated logging levels (e.g., LLQUEUE)
have an impact on whether this print happens.

> + static DEFINE_RATELIMIT_STATE(_rs,
> + DEFAULT_RATELIMIT_INTERVAL,
> + DEFAULT_RATELIMIT_BURST);
> + level = SCSI_LOG_LEVEL(SCSI_LOG_MLCOMPLETE_SHIFT,
> +SCSI_LOG_MLCOMPLETE_BITS);
> + /*
> +  * if logging is enabled the failure will be printed
> +  * in scsi_log_completion(), so avoid duplicate messages
> +  */
> + if (__ratelimit(&_rs) && !level &&
> + !(req->cmd_flags & REQ_QUIET)) {

__ratelimit should only be invoked if the other two are 
leading towards printing, since it is fairly involved
(including a spinlock):
if (!level && !(req->cmd_flags & REQ_QUIET) && 
(__ratelimit(&_rs)) {

> + scsi_print_result(cmd, NULL, FAILED);
> + if (driver_byte(result) & DRIVER_SENSE)
> + scsi_print_sense(cmd);
> + scsi_print_command(cmd);
> + }
>   }
>   if (!scsi_end_request(req, error, blk_rq_err_bytes(req), 0))
>   return;

Here are some results from removing devices during sequential reads.

With the scsi logging level clear, just 10 block layer prints show up
(that outer "if" blocks everything):
[48973.034156] blk_update_request: critical target error, dev sdr, sector 
7447280
[48973.035298] blk_update_request: critical target error, dev sdr, sector 
7591344
[48973.035302] blk_update_request: critical target error, dev sdr, sector 
7591352
[48973.035351] blk_update_request: critical target error, dev sdr, sector 
7591360
[48973.035355] blk_update_request: critical target error, dev sdr, sector 
7591368
[48973.035391] blk_update_request: critical target error, dev sdr, sector 
7591376
[48973.035394] blk_update_request: critical target error, dev sdr, sector 
7591384
[48973.035436] blk_update_request: critical target error, dev sdr, sector 
7591392
[48973.035439] blk_update_request: critical target error, dev sdr, sector 
7591400
[48973.035471] blk_update_request: critical target error, dev sdr, sector 
7591408

With scsi_logging_level --set --error=5, there are: 
* 10 "FAILED Result" prints
  * the first has no CDB
  * at the end, there is a much later CDB print not preceded by a FAILED print
* 10 "blk_update_request: critical target error" prints
  * their sector number matches the preceding CDB print
* 1 "blk_update_request: 539 callbacks suppr

RE: [PATCH 0/5] block/scsi/lio support for COMPARE_AND_WRITE

2014-10-16 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Douglas Gilbert
> Sent: Thursday, 16 October, 2014 3:02 PM
...
> On 14-10-16 12:39 PM, Douglas Gilbert wrote:
...
> > The COMPARE AND WRITE command may fail for other reasons
> > such as a transport problem or a Unit Attention, so the
> > SCSI eh logic may need to know about it.
> 
> Elaborating ...
> 
> Hannes will enjoy this one: say a COMPARE AND WRITE (CAW) fails
> due to a transport error or timeout. What should the EH do *** ?
> Answer: read that LBA(s) to see whether the command succeeded
> (i.e. wrote the new data)! If it did, do nothing; if it didn't,
> repeat the CAW command. And naturally that second CAW may
> yield a MISCOMPARE.

I don't think that is safe.. detecting the write data is in
place doesn't prove the COMPARE AND WRITE really worked as
a whole.

Example: if two applications do this:
* application 1: COMPARE AND WRITE: change 4 to 6
* application 2: COMPARE AND WRITE: change 5 to 6

and they both run into transport errors, then they'll 
both read back 6 and think their commands worked when
really only one of them worked.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] aic7xxx: replace kmalloc/memset by kzalloc

2014-10-16 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Michael Opdenacker
> Sent: Thursday, 16 October, 2014 2:31 PM
...
> On 10/16/2014 09:28 PM, Joe Perches wrote:
> > On Thu, 2014-10-16 at 21:14 +0200, Michael Opdenacker wrote:
> >
> >
> > /* Allocate SCB resources */
> > -   scb_data->scbarray = kmalloc(sizeof(struct scb) * AHC_SCB_MAX_ALLOC,
> GFP_ATOMIC);
> > +   scb_data->scbarray = kzalloc(sizeof(struct scb) * AHC_SCB_MAX_ALLOC,
> > +   GFP_ATOMIC);
...
> >
> > Probably better as kcalloc.
> 
> Hey, well spotted! Thanks for your review. I will post a new version soon.

kcalloc is helpful when one of the values is a variable that 
might cause the multiply to overflow during runtime.  Here, 
two constants are being multiplied together, which can
be done and checked by the compiler at compile time.  

Since kcalloc and kmalloc_array are both static inline 
functions:
static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags)
{
if (size != 0 && n > SIZE_MAX / size)
return NULL;
return __kmalloc(n * size, flags);
}
static inline void *kcalloc(size_t n, size_t size, gfp_t flags)
{
return kmalloc_array(n, size, flags | __GFP_ZERO);
}

a compiler that detects an overflow will probably just reduce
that to an inlined "return NULL."

BUILD_BUG_ON could be used to trigger a compile-time error,
instead of building a kernel that returns a run-time error.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Concurrent SG_SCSI_RESET ioctls

2014-10-15 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Wednesday, 15 October, 2014 8:26 AM
...
> On Tue, Oct 14, 2014 at 12:42:26PM -0400, Douglas Gilbert wrote:
> > I don't mind if you change it. However I plan to release
> > sg3_utils-1.40 in the next 2 or 3 weeks, so that would
> > be the earliest a revised sg_reset would be available for
> > distros. Improving error reports is something I always
> > like to do (so ENODEV for the "in progress" case seems a
> > bit strident).
> 
> If sg_utils needs any changes for a different ENODEV we shouldn't
> bother - breaking backwards compatibility is a bad idea.

None of the values will cause the current version of sg_reset to
retry the request.  Maybe -EBUSY should be returned, and interpreted
that way by future versions of sg_reset?

> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Monday, 13 October, 2014 5:24 AM
...
> Both the existing code and my new code still serialize
> eh_*_reset_handler callers using the crude tmf_in_progress flag. Using
> a proper lock for it would seem preferable to me, as would be bouncing
> the work for SG_SCSI_RESET to the EH thread.

Eliminating serialization would be better, though.  Devices should
be independent, so bus device resets (that are not escalated)
should also be independent.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


SG_SCSI_RESET ioctl reset escalation

2014-10-13 Thread Elliott, Robert (Server Storage)
Currently, if you request a reset through ioctl and it fails,
the kernel escalates like it does for scsi_abort_eh_cmnd:
bus device reset ->
target reset ->
bus reset ->
host reset

This is from scsi_error.c scsi_ioctl_reset:

switch (val) {
case SG_SCSI_RESET_NOTHING:
break;
case SG_SCSI_RESET_DEVICE:
rtn = scsi_try_bus_device_reset(scmd);
if (rtn == SUCCESS)
break;
/* FALLTHROUGH */
case SG_SCSI_RESET_TARGET:
rtn = scsi_try_target_reset(scmd);
if (rtn == SUCCESS)
break;
/* FALLTHROUGH */
case SG_SCSI_RESET_BUS:
rtn = scsi_try_bus_reset(scmd);
if (rtn == SUCCESS)
break;
/* FALLTHROUGH */
case SG_SCSI_RESET_HOST:
rtn = scsi_try_host_reset(scmd);
if (rtn == SUCCESS)
break;
default:
/* FALLTHROUGH */
error = -EIO;
break;
}

Those stronger reset levels affect multiple devices,
which may or may not be desired.

sg-reset includes a --no-esc option to ask the kernel
to not escalate resets, based on a February 2013 thread,
but the matching kernel changes were never merged.

Original issue:
http://marc.info/?l=linux-scsi&m=136000904311597&w=2

Suggested fix:
http://marc.info/?l=linux-scsi&m=136095718729372&w=2

Could that fix be revived?

---
Rob ElliottHP Server Storage


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Concurrent SG_SCSI_RESET ioctls

2014-10-11 Thread Elliott, Robert (Server Storage)


> > From: Christoph Hellwig [mailto:h...@infradead.org]
> > Sent: Saturday, 11 October, 2014 11:11 AM
...
> >
> > Hi Robert,
> >
> > can you take a look at the patches at:
> >
> > http://git.infradead.org/users/hch/scsi.git/shortlog/refs/heads/scsi-ioctl
> >
> > and confirm if they fix your issue?
> 
> Thanks.  That's a bit better, but sg_reset can now run into
> "No such device" errors.
> 
> sg_reset: starting device reset
> sg_reset: starting device reset
> sg_reset: SG_SCSI_RESET failed: No such device
> sg_reset: completed device (or target or bus or host) reset
> sg_reset: starting device reset
> sg_reset: completed device (or target or bus or host) reset
> sg_reset: starting device reset
> sg_reset: completed device (or target or bus or host) reset
> 

Running fio concurrently with all the sg_resets, it also led to
a hang with:
* all the fio threads sitting in blk_mq_get_tag, reported
  as blocked tasks after 120 seconds
* all the sg_resets from pass 23 hung 

Starting with sg_reset pass 22, where 6 of 14 got
"No such device", here is the serial log (with a few
extra prints added in the scsi midlayer):

[ 2750.165637] -- loop-sg_reset pass 22:
[ 2750.175055] sd_ioctl continuing cmd=0x2284
[ 2750.175469] sd_ioctl continuing cmd=0x2284
[ 2750.175470] sd_ioctl default cmd=0x2284
[ 2750.175474] calling scsi_try_bus_device_reset
[ 2750.175476] hpsa_eh_device_reset_handler called with 8804097297c0
[ 2750.175482] hpsa :04:00.0: resetting scsi 2:0:0:0: Direct-Access HP  
 LOGICAL VOLUME   RAID-0 SSDSmartPathCap+ En- Exp=3
[ 2750.181328] hpsa :04:00.0: resetting scsi 2:0:0:0 completed successfully
[ 2750.181330] scsi host2: waking up host to restart after TMF
[ 2750.181377] sd_ioctl continuing cmd=0x2284
[ 2750.181377] sd_ioctl default cmd=0x2284
[ 2750.181379] sd_ioctl continuing cmd=0x2284
[ 2750.181381] sd_ioctl continuing cmd=0x2284
[ 2750.181382] sd_ioctl default cmd=0x2284
[ 2750.181382] calling scsi_try_bus_device_reset
[ 2750.181383] sd_ioctl default cmd=0x2284
[ 2750.181384] hpsa_eh_device_reset_handler called with 8803f55de500
[ 2750.181388] hpsa :04:00.0: resetting scsi 2:0:0:12: Direct-Access HP 
  LOGICAL VOLUME   RAID-0 SSDSmartPathCap+ En- Exp=3
[ 2750.181427] hpsa :04:00.0: resetting scsi 2:0:0:12 completed successfully
[ 2750.181428] scsi host2: waking up host to restart after TMF
[ 2750.181443] calling scsi_try_bus_device_reset
[ 2750.181444] calling scsi_try_bus_device_reset
[ 2750.181446] hpsa_eh_device_reset_handler called with 8803f966ae00
[ 2750.181447] hpsa_eh_device_reset_handler called with 88042d558cc0
[ 2750.181451] hpsa :04:00.0: resetting scsi 2:0:0:8: Direct-Access HP  
 LOGICAL VOLUME   RAID-0 SSDSmartPathCap+ En+ Exp=3
[ 2750.181454] hpsa :04:00.0: resetting scsi 2:0:0:9: Direct-Access HP  
 LOGICAL VOLUME   RAID-0 SSDSmartPathCap+ En- Exp=3
[ 2750.181562] hpsa :04:00.0: resetting scsi 2:0:0:8 completed successfully
[ 2750.181564] scsi host2: waking up host to restart after TMF
[ 2750.181570] hpsa :04:00.0: resetting scsi 2:0:0:9 completed successfully
[ 2750.181572] scsi host2: waking up host to restart after TMF
[ 2750.181] sd_ioctl continuing cmd=0x2285
[ 2750.181840] sd_ioctl default cmd=0x2285
[ 2750.181842] sd_ioctl continuing cmd=0x2285
[ 2750.181842] sd_ioctl default cmd=0x2285
[ 2750.181982] sd_ioctl continuing cmd=0x2285
[ 2750.181982] sd_ioctl default cmd=0x2285
[ 2750.181987] sd_ioctl continuing cmd=0x2285
[ 2750.181988] sd_ioctl default cmd=0x2285
[ 2750.181990] sd_ioctl continuing cmd=0x2285
[ 2750.181990] sd_ioctl default cmd=0x2285
[ 2750.182075] sd_ioctl continuing cmd=0x2285
[ 2750.182076] sd_ioctl default cmd=0x2285
[ 2750.182148] sd_ioctl continuing cmd=0x2285
[ 2750.182148] sd_ioctl default cmd=0x2285
[ 2750.182210] sd_ioctl continuing cmd=0x2285
[ 2750.182211] sd_ioctl default cmd=0x2285
[ 2750.182362] sd_ioctl continuing cmd=0x2285
[ 2750.182362] sd_ioctl default cmd=0x2285
[ 2750.182498] sd_ioctl continuing cmd=0x2285
[ 2750.182499] sd_ioctl default cmd=0x2285
[ 2750.185274] sd_ioctl continuing cmd=0x5331
[ 2750.185275] sd_ioctl default cmd=0x5331
[ 2750.185277] hpsa_ioctl rejecting cmd=0x5331 with -ENOTTY
[ 2750.185612] sd_ioctl continuing cmd=0x2285
[ 2750.185613] sd_ioctl default cmd=0x2285
[ 2750.185614] sd_ioctl continuing cmd=0x2285
[ 2750.185615] sd_ioctl default cmd=0x2285
[ 2750.185888] sd_ioctl continuing cmd=0x2284
[ 2750.185889] sd_ioctl default cmd=0x2284
[ 2750.185892] calling scsi_try_bus_device_reset
[ 2750.185894] hpsa_eh_device_reset_handler called with 8804097297c0
[ 2750.185900] hpsa :04:00.0: resetting scsi 2:0:0:5: Direct-Access HP  
 LOGICAL VOLUME   RAID-0 SSDSmartPathCap+ En+ Exp=3
[ 2750.191332] hpsa :04:00.0: resetting scsi 2:0:0:5 completed successfully
[ 2750.191334] scsi host2: waking up host to restart after TMF
[ 2750.191357] sd_ioctl continuing cmd=0x2285
[ 2750.191358] sd_ioctl continuing cmd=0x2285
[ 27

RE: Concurrent SG_SCSI_RESET ioctls

2014-10-11 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Saturday, 11 October, 2014 11:11 AM
> To: Elliott, Robert (Server Storage)
> Cc: James Bottomley (jbottom...@parallels.com); Christoph Hellwig;
> dgilb...@interlog.com; linux-scsi@vger.kernel.org; Don Brace (PMC); Scales,
> Webb
> Subject: Re: Concurrent SG_SCSI_RESET ioctls
> 
> Hi Robert,
> 
> can you take a look at the patches at:
> 
> http://git.infradead.org/users/hch/scsi.git/shortlog/refs/heads/scsi-ioctl
> 
> and confirm if they fix your issue?

Thanks.  That's a bit better, but sg_reset can now run into
"No such device" errors.

sg_reset: starting device reset
sg_reset: starting device reset
sg_reset: SG_SCSI_RESET failed: No such device
sg_reset: completed device (or target or bus or host) reset
sg_reset: starting device reset
sg_reset: completed device (or target or bus or host) reset
sg_reset: starting device reset
sg_reset: completed device (or target or bus or host) reset


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Concurrent SG_SCSI_RESET ioctls

2014-10-10 Thread Elliott, Robert (Server Storage)
The sg3_utils command
sg_reset --device /dev/sda

invokes an ioctl with SG_SCSI_RESET, an argument of
SG_SCSI_RESET_DEVICE, on a device opened with O_NDELAY.

The call chain is like this:

sd_ioctl[sd.c]
scsi_nonblockable_ioctl [scsi_ioctl.c]
if (ndelay) {   [this is true]
if (scsi_host_in_recovery(sdev->host))
return -ENODEV;
}
scsi_reset_provider [scsi_error.c]
shost->tmf_in_progress = 1;
scsi_try_bus_device_reset [scsi_error.c]
hostt->eh_device_reset_handler [LLD]
shost->tmf_in_progress = 1;
if that returns 0, return; otherwise try:
scsi_cmd_blk_ioctl
scsi_ioctl
sdev->host->hostt->ioctl[LLD]

where scsi_host_in_recovery is:

static inline int scsi_host_in_recovery(struct Scsi_Host *shost)
{
return shost->shost_state == SHOST_RECOVERY ||
shost->shost_state == SHOST_CANCEL_RECOVERY ||
shost->shost_state == SHOST_DEL_RECOVERY ||
shost->tmf_in_progress;
}

Problem
===
If you run sg_reset --device concurrently to multiple
devices on the same host, then some of them will run
into tmf_in_progress and have scsi_nonblockable_ioctl
return -ENODEV.  This causes sd_ioctl to send the ioctl
request to the LLD's ioctl function, where it gets
rejected as unsupported with -ENOTTY.  sg_reset ends
up displaying:
sg_reset: SG_SCSI_RESET failed: Inappropriate ioctl for device

Any suggestions for how to fix this?  

Is the check of scsi_host_in_recovery, which includes
tmf_in_progress, too strong?  Most LLDs are not parallel
SCSI where you can just have one TMF on the bus at a
time anymore.

Is returning -ENODEV if the host is in recovery the
wrong code?  There might be a device there...it's
just that access is temporarily blocked.

Should sd_ioctl not proceed to scsi_ioctl in so many
cases? Perhaps it should:
  * proceed if it gets back -EINVAL (and maybe -ENOTTY)
indicating the ioctl specified was bad
  * return if it gets back any other error (e.g., -ENODEV,
-EACCESS, or -EIO) 
That would at least result in a more meaningful error.

Also, should scsi_nonblockable_ioctl return -ENOTTY rather
than -ENODEV if cmd is unsupported?  There's not really
a no-device problem.

---
Rob ElliottHP Server Storage

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 21/26] scsi: simplify scsi_log_(send|completion)

2014-10-08 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Tuesday, 07 October, 2014 4:03 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Elliott, Robert (Server
> Storage); Hannes Reinecke
> Subject: [PATCH 21/26] scsi: simplify scsi_log_(send|completion)
> 
> Simplify scsi_log_(send|completion) by externalizing
> scsi_mlreturn_string() and always print the command address.
> 
...

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index b853659..2df485f 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -831,7 +831,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned
> int good_bytes)
>   int error = 0;
>   struct scsi_sense_hdr sshdr;
>   bool sense_valid = false;
> - int sense_deferred = 0;
> + int sense_deferred = 0, level = 0;
>   enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
> ACTION_DELAYED_RETRY} action;
>   unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
> @@ -1037,8 +1037,15 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> unsigned int good_bytes)
>   switch (action) {
>   case ACTION_FAIL:
>   /* Give up and fail the remainder of the request */
> - if (!(req->cmd_flags & REQ_QUIET)) {
> - scsi_print_result(cmd);
> + if (unlikely(scsi_logging_level))
> + level = SCSI_LOG_LEVEL(SCSI_LOG_MLQUEUE_SHIFT,
> +SCSI_LOG_MLQUEUE_BITS);

scsi_log_completion prints based on SCSI_LOG_MLCOMPLETE,
not SCSI_LOG_MLQUEUE, so this is not accomplishing what
the next comment says.

> + /*
> +  * if logging is enabled the failure will be printed
> +  * in scsi_log_completion(), so avoid duplicate messages
> +  */
> + if (!level && !(req->cmd_flags & REQ_QUIET)) {
> + scsi_print_result(cmd, NULL, FAILED);
>   if (driver_byte(result) & DRIVER_SENSE)
>   scsi_print_sense(cmd);
>   scsi_print_command(cmd);

This still results in prints on every completion with ACTION_FAIL.  
Since they're not ratelimited, the massive number of prints after
a device failure under heavy load causes other timeouts.  Since
the block layer also prints errors, with ratelimiting, these are
not really necessary (although they're more informative than the
block layer prints).  There needs to be a way to turn these off,
both in scsi_log_completion and here - maybe different MLCOMPLETE
levels?

The issue will be mitigated if the next patch series makes these
ratelimited.


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V6 01/18] scsi: fixing the "type" for well known LUs

2014-10-03 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Friday, 26 September, 2014 3:14 AM
> To: Dolev Raviv
> Cc: james.bottom...@hansenpartnership.com; h...@infradead.org; linux-
> s...@vger.kernel.org; linux-scsi-ow...@vger.kernel.org; linux-arm-
> m...@vger.kernel.org; santos...@gmail.com; Subhash Jadavani; Sujit Reddy
> Thumma; Elliott, Robert (Server Storage); Martin K. Petersen
> Subject: Re: [PATCH V6 01/18] scsi: fixing the "type" for well known LUs
> 
> Robert, I guess this version is okay with you?
> 
> On Thu, Sep 25, 2014 at 03:32:19PM +0300, Dolev Raviv wrote:
> > From: Subhash Jadavani 
> >
> > Some devices may respond with wrong type for well-known logical units.
> > This patch forces well-known type for devices which doesn't report it
> > correct.
> >
> > Signed-off-by: Subhash Jadavani 
> > Signed-off-by: Sujit Reddy Thumma 
> > Signed-off-by: Dolev Raviv 
> >
> > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> > index 56675db..1095d5a 100644
> > --- a/drivers/scsi/scsi_scan.c
> > +++ b/drivers/scsi/scsi_scan.c
> > @@ -805,6 +805,19 @@ static int scsi_add_lun(struct scsi_device *sdev,
> unsigned char *inq_result,
> > } else {
> > sdev->type = (inq_result[0] & 0x1f);
> > sdev->removable = (inq_result[1] & 0x80) >> 7;
> > +
> > +   /*
> > +* some devices may respond with wrong type for
> > +* well-known logical units. Force well-known type
> > +* to enumerate them correctly.
> > +*/
> > +   if (scsi_is_wlun(sdev->lun) && sdev->type != TYPE_WLUN) {
> > +   sdev_printk(KERN_WARNING, sdev,
> > +   "%s: correcting incorrect peripheral device type
> 0x%x for W-LUN 0x%16phN\n",
> > +   __func__, sdev->type, sdev->lun);
> > +   sdev->type = TYPE_WLUN;
> > +   }
> > +

Yes, that looks good.

Reviewed-by: Robert Elliott 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] scsi_debug: deadlock between completions and surprise module removal

2014-10-03 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
> Sent: Thursday, 25 September, 2014 7:13 AM
> To: Douglas Gilbert
> Cc: SCSI development list; linux-kernel; James Bottomley; Christoph Hellwig;
> Milan Broz
> Subject: Re: [PATCH] scsi_debug: deadlock between completions and surprise
> module removal
> 
> Review ping again?
> 
> While I think the shutdown code in scsi_debug needs a bit more of an
> overhault I'd really like to include the fix at least for 3.18 and
> 3.17-stable now that we have missed the 3.17 window.
> 
> On Sun, Aug 31, 2014 at 07:09:59PM -0400, Douglas Gilbert wrote:
> > A deadlock has been reported when the completion
> > of SCSI commands (simulated by a timer) was surprised
> > by a module removal. This patch removes one half of
> > the offending locks around timer deletions. This fix
> > is applied both to stop_all_queued() which is were
> > the deadlock was discovered and stop_queued_cmnd()
> > which has very similar logic.
> >
> > This patch should be applied both to the lk 3.17 tree
> > and Christoph's drivers-for-3.18 tree.
> >
> > Tested-and-reported-by: Milan Broz 
> > Signed-off-by: Douglas Gilbert 

Reviewed-by: Robert Elliott 



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 24/24] scsi_error: document scsi_try_to_abort_cmd

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Wednesday, 01 October, 2014 1:23 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Elliott, Robert (Server
> Storage); Hannes Reinecke
> Subject: [PATCH 24/24] scsi_error: document scsi_try_to_abort_cmd
> 
> scsi_try_to_abort_cmd() should only return
> SUCCESS or FAILED. So document that in the
> function description and simplify the
> logging message.
> 
> Suggested-by: Christoph Hellwig 
> Cc: Robert Elliott 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/scsi_error.c | 20 +---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index a41ef5b..75dd203 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -157,8 +157,7 @@ scmd_eh_abort_handler(struct work_struct *work)
>   } else {
>   SCSI_LOG_ERROR_RECOVERY(3,
>   scmd_printk(KERN_INFO, scmd,
> - "scmd %p abort failed, rtn %d\n",
> - scmd, rtn));
> + "scmd %p abort failed\n", scmd));
>   }
>   }
> 
> @@ -869,7 +868,22 @@ static int scsi_try_bus_device_reset(struct scsi_cmnd
> *scmd)
>   return rtn;
>  }
> 
> -static int scsi_try_to_abort_cmd(struct scsi_host_template *hostt, struct
> scsi_cmnd *scmd)
> +/**
> + * scsi_try_to_abort_cmd - Ask host to abort a SCSI command
> + * @scmd:SCSI cmd used to send a target reset
> + *
> + * Return value:
> + *   SUCCESS or FAILED
> + *
> + * Notes:
> + *SUCCESS does not necessarily indicate that the command
> + *has been aborted; it only indicates that the LLDDs
> + *has cleared all references to that command.
> + *LLDDs should return FAILED only if an abort was required
> + *but could not be executed.
> + */
> +static int scsi_try_to_abort_cmd(struct scsi_host_template *hostt,
> +  struct scsi_cmnd *scmd)
>  {
>   if (!hostt->eh_abort_handler)
>   return FAILED;

Since the rest of that function is just:
return hostt->eh_abort_handler(scmd);

this relies on the LLD to use only those values:
#define SUCCESS 0x2002
#define FAILED  0x2003

which is hard to ensure.

Randomly picking on one, megaraid.c registers this
function:
.eh_abort_handler   = megaraid_abort,

megaraid_abort returns the return code from 
megaraid_abort_and_reset, which returns TRUE or FALSE, 
not SUCCESS and FAILED.

Printing the return value helps expose such problems
(if the code path is exercised).

---
Rob ElliottHP Server Storage




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 22/24] scsi: fixup logging messages in scsi_error.c

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 554f885..7e1e190 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1156,9 +1156,10 @@ int scsi_eh_get_sense(struct list_head *work_q,
>   shost = scmd->device->host;
>   if (scsi_host_eh_past_deadline(shost)) {
>   SCSI_LOG_ERROR_RECOVERY(3,
> - shost_printk(KERN_INFO, shost,
> - "skip %s, past eh deadline\n",
> -  __func__));
> + scmd_printk(KERN_INFO, scmd,
> + "%s: skip request sense, "
> +  "past eh deadline\n",

checkpatch lets you keep strings on one line.

> +  current->comm));
>   break;
>   }
>   if (status_byte(scmd->result) != CHECK_CONDITION)
> @@ -1265,9 +1266,10 @@ static int scsi_eh_test_devices(struct list_head
> *cmd_list,
>   /* Push items back onto work_q */
>   list_splice_init(cmd_list, work_q);
>   SCSI_LOG_ERROR_RECOVERY(3,
> - shost_printk(KERN_INFO, sdev->host,
> -  "skip %s, past eh 
> deadline",
> -  __func__));
> + sdev_printk(KERN_INFO, sdev,
> + "%s: skip test device, "
> + "past eh deadline",

checkpatch lets you keep strings on one line.

> + current->comm));
>   break;
>   }
>   }
> @@ -1318,21 +1320,22 @@ static int scsi_eh_abort_cmds(struct list_head
> *work_q,
>   if (scsi_host_eh_past_deadline(shost)) {
>   list_splice_init(&check_list, work_q);
>   SCSI_LOG_ERROR_RECOVERY(3,
> - shost_printk(KERN_INFO, shost,
> - "skip %s, past eh deadline\n",
> -  __func__));
> + scmd_printk(KERN_INFO, scmd,
> + "%s: skip aborting cmd, "
> + "past eh deadline\n",

checkpatch lets you keep strings on one line.

> + current->comm));
>   return list_empty(work_q);
>   }
>   SCSI_LOG_ERROR_RECOVERY(3,
> - shost_printk(KERN_INFO, shost,
> -  "%s: aborting cmd: 0x%p\n",
> + scmd_printk(KERN_INFO, scmd,
> +  "%s: aborting cmd\n",
>current->comm, scmd));

As noted by Ewan, the extra scmd argument causes compiler
warnings.

...
> @@ -1390,9 +1393,10 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
>   shost_for_each_device(sdev, shost) {
>   if (scsi_host_eh_past_deadline(shost)) {
>   SCSI_LOG_ERROR_RECOVERY(3,
> - shost_printk(KERN_INFO, shost,
> - "skip %s, past eh deadline\n",
> -  __func__));
> + sdev_printk(KERN_INFO, sdev,
> + "%s: skip START_UNIT, "
> + "past eh deadline\n",

checkpatch lets you keep strings on one line.

> + current->comm));
>   break;
>   }
>   stu_scmd = NULL;
...
> @@ -1528,8 +1530,9 @@ static int scsi_eh_target_reset(struct Scsi_Host
> *shost,
>   list_splice_init(&tmp_list, work_q);
>   SCSI_LOG_ERROR_RECOVERY(3,
>   shost_printk(KERN_INFO, shost,
> - "skip %s, past eh deadline\n",
> -  __func__));
> + "%s: Skip target reset, "
> +  "past eh deadline\n",

checkpatch lets you keep strings on one line.

...
> @@ -2191,9 +2194,10 @@ int scsi_error_handler(void *data)
>*/
>   if (!shost->eh_noresume && scsi_autopm_get_host(shost) != 0) {
>   SCSI_LOG_ERROR_RECOVERY(1,
> - printk(KERN_ERR "Error handler scsi_eh_%d "
> - "unable to autoresume\n

RE: [PATCH 21/24] scsi: simplify scsi_log_(send|completion)

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Wednesday, 01 October, 2014 1:23 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Elliott, Robert (Server
> Storage); Hannes Reinecke
> Subject: [PATCH 21/24] scsi: simplify scsi_log_(send|completion)
> 
> Simplify scsi_log_(send|completion) by externalizing
> scsi_mlreturn_string() and always print the command address.
> 
> Reviewed-by: Christoph Hellwig 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/constants.c | 39 ---
>  drivers/scsi/scsi.c  | 43 ++-
>  drivers/scsi/scsi_lib.c  | 13 ++---
>  include/scsi/scsi_dbg.h  |  3 ++-
>  4 files changed, 54 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
> index b131900..207ebef 100644
> --- a/drivers/scsi/constants.c
> +++ b/drivers/scsi/constants.c
> @@ -1437,19 +1437,52 @@ const char *scsi_driverbyte_string(int result)
>  }
>  EXPORT_SYMBOL(scsi_driverbyte_string);
> 
> -void scsi_print_result(struct scsi_cmnd *cmd)
> +#ifdef CONFIG_SCSI_CONSTANTS
> +#define scsi_mlreturn_name(result)   { result, #result }
> +static const struct value_name_pair scsi_mlreturn_arr[] = {
> + scsi_mlreturn_name(NEEDS_RETRY),
> + scsi_mlreturn_name(SUCCESS),
> + scsi_mlreturn_name(FAILED),
> + scsi_mlreturn_name(QUEUED),
> + scsi_mlreturn_name(SOFT_ERROR),
> + scsi_mlreturn_name(ADD_TO_MLQUEUE),
> + scsi_mlreturn_name(TIMEOUT_ERROR),
> + scsi_mlreturn_name(SCSI_RETURN_NOT_HANDLED),
> + scsi_mlreturn_name(FAST_IO_FAIL)
> +};

SUCCESS is a misleading name to print on commands that were
really not successful.  Example:
[ 5978.573297] sd 2:0:0:8: [sdz] tag#209 Done: SUCCESS Result: hostbyte=DID_OK 
driverbyte=DRIVER_OK
[ 5978.576335] sd 2:0:0:8: [sdz] tag#209 CDB: Test Unit Ready 00 00 00 00 00 00
[ 5978.578721] sd 2:0:0:8: [sdz] tag#209 Sense Key : Unit Attention [current]
[ 5978.581177] sd 2:0:0:8: [sdz] tag#209 Add. Sense: Bus device reset function 
occurred

Maybe that mlreturn value should be printed as 
"COMPLETE" since it is similar to the SAM service
responses called COMMAND COMPLETE and FUNCTION COMPLETE.
That is a more neutral term.

---
Rob ElliottHP Server Storage




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 20/24] sd: Cleanup logging

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
...
> @@ -3328,9 +3319,19 @@ static void sd_print_sense_hdr(struct scsi_disk *sdkp,
>sshdr->asc, sshdr->ascq);
>  }
> 
> -static void sd_print_result(struct scsi_disk *sdkp, int result)
> +static void sd_print_result(struct scsi_disk *sdkp, const char *msg,
> + int result)
>  {
> - sd_printk(KERN_INFO, sdkp, " ");
> - scsi_show_result(result);
> + const char *hb_string = scsi_hostbyte_string(result);
> + const char *db_string = scsi_driverbyte_string(result);
> +
> + if (hb_string && db_string)
> + sd_printk(KERN_INFO, sdkp,
> +   "%s: Result: hostbyte=%s driverbyte=%s\n",
> +   msg, hb_string, db_string);
> + else
> + sd_printk(KERN_INFO, sdkp,
> +   "%s: Result: hostbyte=0x%02x driverbyte=0x%02x\n",
> +   msg, host_byte(result), driver_byte(result));

In a similar construct, patch 19 used || in scsi_print_result
to choose the string path, and printed "invalid" if one
of the strings was NULL.  They should probably match.


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 18/24] scsi: Remove scsi_print_command when calling abort

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Wednesday, 01 October, 2014 1:23 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Elliott, Robert (Server
> Storage); Hannes Reinecke
> Subject: [PATCH 18/24] scsi: Remove scsi_print_command when calling abort
> 
> Calling scsi_print_command should not be necessary during abort;
> if the information is required one should enable scsi logging.
> 
...
> diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
> index 45da3c8..50873bb 100644
> --- a/drivers/scsi/NCR5380.c
> +++ b/drivers/scsi/NCR5380.c
> @@ -2666,9 +2666,8 @@ static int NCR5380_abort(Scsi_Cmnd * cmd) {
>   struct Scsi_Host *instance = cmd->device->host;
>   struct NCR5380_hostdata *hostdata = (struct NCR5380_hostdata *)
> instance->hostdata;
>   Scsi_Cmnd *tmp, **prev;
> -
> - printk(KERN_WARNING "scsi%d : aborting command\n", instance->host_no);
> - scsi_print_command(cmd);
> +
> + scmd_printk(KERN_WARNING, cmd, "aborting command\n");
> 
>   NCR5380_print_status(instance);
> 
> diff --git a/drivers/scsi/arm/fas216.c b/drivers/scsi/arm/fas216.c
> index b2fd18e..c16aca1 100644
> --- a/drivers/scsi/arm/fas216.c
> +++ b/drivers/scsi/arm/fas216.c
> @@ -2425,14 +2425,11 @@ int fas216_eh_abort(struct scsi_cmnd *SCpnt)
> 
>   info->stats.aborts += 1;
> 
> - printk(KERN_WARNING "scsi%d: abort command ", info->host->host_no);
> - __scsi_print_command(SCpnt->cmnd);
> + sdev_printk(KERN_WARNING, SCpnt, "abort command\n");

That should be scmd_printk.

It'd be nice if the s*_printk macros could typecheck 
that pointer.

> 
>   print_debug_list();
>   fas216_dumpstate(info);
> 
> - printk(KERN_WARNING "scsi%d: abort %p ", info->host->host_no, SCpnt);
> -
>   switch (fas216_find_command(info, SCpnt)) {
>   /*
>* We found the command, and cleared it out.  Either
> @@ -2440,7 +2437,7 @@ int fas216_eh_abort(struct scsi_cmnd *SCpnt)
>* target, or the busylun bit is not set.
>*/
>   case res_success:
> - printk("success\n");
> + sdev_printk(KERN_WARNING, SCpnt, "abort %p success\n", SCpnt);

That should be scmd_printk, dropping the %p and the extra SCpnt
argument.

>   result = SUCCESS;
>   break;
> 
> @@ -2450,14 +2447,13 @@ int fas216_eh_abort(struct scsi_cmnd *SCpnt)
>* if the bus is free.
>*/
>   case res_hw_abort:
> -
> 
>   /*
>* We are unable to abort the command for some reason.
>*/
>   default:
>   case res_failed:
> - printk("failed\n");
> + sdev_printk(KERN_WARNING, SCpnt, "abort %p failed\n", SCpnt);

That should be scmd_printk, dropping the %p and the extra SCpnt
argument.

>   break;
>   }
> 
> diff --git a/drivers/scsi/atari_NCR5380.c b/drivers/scsi/atari_NCR5380.c
> index 79e6f04..322f361 100644
> --- a/drivers/scsi/atari_NCR5380.c
> +++ b/drivers/scsi/atari_NCR5380.c
> @@ -2623,8 +2623,7 @@ int NCR5380_abort(Scsi_Cmnd *cmd)
>   Scsi_Cmnd *tmp, **prev;
>   unsigned long flags;
> 
> - printk(KERN_NOTICE "scsi%d: aborting command\n", HOSTNO);
> - scsi_print_command(cmd);
> + sdev_printk(KERN_NOTICE, cmd, "aborting command\n");

That should be scmd_printk.

...
> diff --git a/drivers/scsi/sun3_NCR5380.c b/drivers/scsi/sun3_NCR5380.c
> index 1a2367a..68d0e1a 100644
> --- a/drivers/scsi/sun3_NCR5380.c
> +++ b/drivers/scsi/sun3_NCR5380.c
> @@ -2608,8 +2608,7 @@ static int NCR5380_abort(struct scsi_cmnd *cmd)
>  struct scsi_cmnd *tmp, **prev;
>  unsigned long flags;
> 
> -printk(KERN_NOTICE "scsi%d: aborting command\n", HOSTNO);
> -scsi_print_command(cmd);
> +sdev_printk(KERN_NOTICE, cmd, "aborting command\n");

That should be scmd_printk.


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 17/24] scsi: remove last argument from print_opcode_name()

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Wednesday, 01 October, 2014 1:23 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Elliott, Robert (Server
> Storage); Hannes Reinecke
> Subject: [PATCH 17/24] scsi: remove last argument from print_opcode_name()
> 
> print_opcode_name() was only ever called with a '0' argument
> from LLDDs and ULDs which were _not_ supporting variable length
> CDBs, so the 'if' clause was never triggered.
> 
> Reviewed-by: Christoph Hellwig 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/constants.c | 21 +++--
>  1 file changed, 7 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
> index 2110d61..713e1e0 100644
> --- a/drivers/scsi/constants.c
> +++ b/drivers/scsi/constants.c
> @@ -320,25 +320,21 @@ static bool scsi_opcode_sa_name(int cmd, int
> service_action,
>   return true;
>  }
> 
> -/* attempt to guess cdb length if cdb_len==0 . No trailing linefeed. */
> -static void print_opcode_name(unsigned char * cdbp, int cdb_len)
> +static void print_opcode_name(unsigned char *cdbp)

Add const since what cdbp points to is not modified.

>  {
> - int sa, len, cdb0;
> + int sa, cdb0;
>   const char *cdb_name = NULL, *sa_name = NULL;
> 
>   cdb0 = cdbp[0];
>   if (cdb0 == VARIABLE_LENGTH_CMD) {
> - len = scsi_varlen_cdb_length(cdbp);
> + int len = scsi_varlen_cdb_length(cdbp);

cdbp must point to a buffer containing at least 8 valid
bytes for that to be safe. Is that guaranteed?

>   if (len < 10) {
> - printk("short variable length command, "
> -"len=%d ext_len=%d", len, cdb_len);
> + printk("short variable length command, len=%d", len);
>   return;
>   }
>   sa = (cdbp[8] << 8) + cdbp[9];
> - } else {
> + } else
>   sa = cdbp[1] & 0x1f;

cdbp must point to a buffer containing at least 2 valid
bytes for that to be safe. Is that guaranteed?

> - len = cdb_len;
> - }
> 
>   if (!scsi_opcode_sa_name(cdb0, sa, &cdb_name, &sa_name)) {
>   if (cdb_name)
> @@ -356,9 +352,6 @@ static void print_opcode_name(unsigned char * cdbp, int
> cdb_len)
>   printk("%s, sa=0x%x", cdb_name, sa);
>   else
>   printk("cdb[0]=0x%x, sa=0x%x", cdb0, sa);
> -
> - if (cdb_len > 0 && len != cdb_len)
> - printk(", in_cdb_len=%d, ext_len=%d", len, cdb_len);
>   }
>  }
> 
> @@ -366,7 +359,7 @@ void __scsi_print_command(unsigned char *cdb)
...
> @@ -383,7 +376,7 @@ void scsi_print_command(struct scsi_cmnd *cmd)

Add const to each of those, since what cdbp and cmd point 
to is not modified.

---
Rob ElliottHP Server Storage




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 16/24] scsi: consolidate opcode lookup in scsi_opcode_sa_name()

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
...
> @@ -332,21 +340,20 @@ static void print_opcode_name(unsigned char * cdbp, int
> cdb_len)
>   len = cdb_len;
>   }
> 
> - if (!scsi_opcode_sa_name(cdb0, sa, &name)) {
> - if (cdb0 < 0xc0) {
> - if (ARRAY_SIZE(cdb_byte0_names)) {
> - name = cdb_byte0_names[cdb0];
> - if (name)
> - printk("%s", name);
> - else
> - printk("cdb[0]=0x%x (reserved)", cdb0);
> - } else
> - printk("cdb[0]=0x%x", cdb0);
> - } else
> + if (!scsi_opcode_sa_name(cdb0, sa, &cdb_name, &sa_name)) {
> + if (cdb_name)
> + printk("%s", cdb_name);
> + else if (cdb0 > VENDOR_SPECIFIC_CDB)
>   printk("cdb[0]=0x%x (vendor)", cdb0);
> + else if (cdb0 > 0x60 && cdb0 < 0x7e)
> + printk("cdb[0]=0x%x (reserved)", cdb0);
> + else
> + printk("cdb[0]=0x%x", cdb0);

Both of those > need to be >=, since the reserved
range starts at and includes 0x60, and the vendor
specific range starts at and includes 0xc0.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 15/24] scsi: merge print_opcode_name()

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> +struct sa_name_list {
> + int cmd;
> + const struct value_name_pair *arr;
> + int arr_sz;
> +};

The suggestion to rename cmd to opcode in patch 14 would
follow the movements here.

...
> @@ -273,7 +292,7 @@ static bool scsi_opcode_sa_name(int cmd, int
> service_action,
>   const struct value_name_pair *arr = NULL;
>   int arr_sz, k;
> 
> - for (k = 0; sa_name_ptr->arr; ++k, ++sa_name_ptr) {
> + for (k = 0; k < ARRAY_SIZE(sa_names_arr); ++k, ++sa_name_ptr) {
>   if (sa_name_ptr->cmd == cmd) {
>   arr = sa_name_ptr->arr;
>   arr_sz = sa_name_ptr->arr_sz;
> @@ -315,11 +334,14 @@ static void print_opcode_name(unsigned char * cdbp, int
> cdb_len)
> 
>   if (!scsi_opcode_sa_name(cdb0, sa, &name)) {
>   if (cdb0 < 0xc0) {
> - name = cdb_byte0_names[cdb0];
> - if (name)
> - printk("%s", name);
> - else
> - printk("cdb[0]=0x%x (reserved)", cdb0);
> + if (ARRAY_SIZE(cdb_byte0_names)) {
> + name = cdb_byte0_names[cdb0];
> + if (name)
> + printk("%s", name);
> + else
> + printk("cdb[0]=0x%x (reserved)", cdb0);
> + } else
> + printk("cdb[0]=0x%x", cdb0);

Just ARRAY_SIZE(cdb_byte0_names) is not right - that array 
is always defined and the statement is always true.

Patch 16 replaces that block of code and eliminates that
issue.


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 14/24] Implement scsi_opcode_sa_name

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> +struct sa_name_list {
> + int cmd;
> + const struct value_name_pair *arr;
> + int arr_sz;
> +};

cmd usually refers to a whole structure and is usually 
a pointer variable.  grep searches will be easier if 
you name that field "opcode".

> +
> +static struct sa_name_list sa_names_arr[] = {
> + {VARIABLE_LENGTH_CMD, variable_length_arr, VARIABLE_LENGTH_SZ},
> + {MAINTENANCE_IN, maint_in_arr, MAINT_IN_SZ},
> + {MAINTENANCE_OUT, maint_out_arr, MAINT_OUT_SZ},
> + {PERSISTENT_RESERVE_IN, pr_in_arr, PR_IN_SZ},
> + {PERSISTENT_RESERVE_OUT, pr_out_arr, PR_OUT_SZ},
> + {SERVICE_ACTION_IN_12, serv_in12_arr, SERV_IN12_SZ},
> + {SERVICE_ACTION_OUT_12, serv_out12_arr, SERV_OUT12_SZ},
> + {SERVICE_ACTION_BIDIRECTIONAL, serv_bidi_arr, SERV_BIDI_SZ},
> + {SERVICE_ACTION_IN_16, serv_in16_arr, SERV_IN16_SZ},
> + {SERVICE_ACTION_OUT_16, serv_out16_arr, SERV_OUT16_SZ},
> + {THIRD_PARTY_COPY_IN, tpc_in_arr, TPC_IN_SZ},
> + {THIRD_PARTY_COPY_OUT, tpc_out_arr, TPC_OUT_SZ},
> + {0, NULL, 0},
> +};
> +
> +static bool scsi_opcode_sa_name(int cmd, int service_action,
> + const char **sa_name)
...


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 12/24] scsi: use 'bool' as return value for scsi_normalize_sense()

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Wednesday, 01 October, 2014 1:23 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Elliott, Robert (Server
> Storage); Hannes Reinecke
> Subject: [PATCH 12/24] scsi: use 'bool' as return value for
> scsi_normalize_sense()
> 
> Convert scsi_normalize_sense() and frieds to return 'bool'
> instead of an integer.

frieds should be friends

...
> -int scsi_command_normalize_sense(struct scsi_cmnd *cmd,
> +bool scsi_command_normalize_sense(struct scsi_cmnd *cmd,
>struct scsi_sense_hdr *sshdr)

Add const for *cmd since what it points to is not
modified.  (prototype in scsi_eh.h)

...
> -static inline int scsi_sense_valid(struct scsi_sense_hdr *sshdr)
> +static inline bool scsi_sense_valid(struct scsi_sense_hdr *sshdr)
>  {
>   if (!sshdr)
> - return 0;
> + return false;
> 
>   return (sshdr->response_code & 0x70) == 0x70;
>  }

Add const to the argument, since what it points to is not
changed by this function:
const struct scsi_sense_hdr *sshdr


> -static inline int scsi_sense_is_deferred(struct scsi_sense_hdr *sshdr)
> +static inline bool scsi_sense_is_deferred(struct scsi_sense_hdr *sshdr)
>  {
>   return ((sshdr->response_code >= 0x70) && (sshdr->response_code & 1));
>  }

Add const to the argument, since what it points to is not
changed by this function:
const struct scsi_sense_hdr *sshdr

(also mentioned in reply to PATCH 4)


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 08/24] fas216: Update logging messages

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> @@ -2079,12 +2078,12 @@ fas216_std_done(FAS216_Info *info, struct scsi_cmnd
> *SCpnt, unsigned int result)
>   break;
> 
>   default:
> - printk(KERN_ERR "scsi%d.%c: incomplete data transfer "
> - "detected: res=%08X ptr=%p len=%X CDB: ",
> - info->host->host_no, '0' + SCpnt->device->id,
> - SCpnt->result, info->scsi.SCp.ptr,
> - info->scsi.SCp.this_residual);
> - __scsi_print_command(SCpnt->cmnd);
> + scmd_printk(KERN_ERR, SCpnt,
> + "incomplete data transfer "
> + "detected: res=%08X ptr=%p len=%X\n",

checkpatch lets you keep strings on one line even if the code
exceeds 80 characters.

> + SCpnt->result, info->scsi.SCp.ptr,
> + info->scsi.SCp.this_residual);
> + scsi_print_command(SCpnt);
>   set_host_byte(SCpnt, DID_ERROR);
>   goto request_sense;
>   }
> @@ -2157,12 +2156,12 @@ static void fas216_done(FAS216_Info *info, unsigned
> int result)
>* to transfer, we should not have a valid pointer.
>*/
>   if (info->scsi.SCp.ptr && info->scsi.SCp.this_residual == 0) {
> - printk("scsi%d.%c: zero bytes left to transfer, but "
> -"buffer pointer still valid: ptr=%p len=%08x CDB: ",
> -info->host->host_no, '0' + SCpnt->device->id,
> -info->scsi.SCp.ptr, info->scsi.SCp.this_residual);
> + scmd_printk(KERN_INFO, SCpnt,
> + "zero bytes left to transfer, but "
> + "buffer pointer still valid: ptr=%p len=%08x\n",

checkpatch lets you keep strings on one line even if the code
exceeds 80 characters.


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 05/24] scsi: Use sdev as argument for sense code printing

2014-10-02 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> @@ -1369,49 +1372,53 @@ scsi_extd_sense_format(unsigned char asc, unsigned
> char ascq) {
>  EXPORT_SYMBOL(scsi_extd_sense_format);
> 
>  void
> -scsi_show_extd_sense(unsigned char asc, unsigned char ascq)
> +scsi_show_extd_sense(struct scsi_device *sdev, const char *name,
> +  unsigned char asc, unsigned char ascq)

Since what sdev points to is not modified, that could be
const struct scsi_device *sdev.

(also adjust the function prototype in scsi_dbg.h to match)


>  {
> -const char *extd_sense_fmt = scsi_extd_sense_format(asc, ascq);
> + const char *extd_sense_fmt = NULL;
> + const char *extd_sense_str = scsi_extd_sense_format(asc, ascq,
> + &extd_sense_fmt);
> +
> + if (extd_sense_str) {
> + if (extd_sense_fmt)
> + sdev_prefix_printk(KERN_INFO, sdev, name,
> +"Add. Sense: %s (%s%x)",
> +extd_sense_str, extd_sense_fmt,
> +ascq);
> + else
> + sdev_prefix_printk(KERN_INFO, sdev, name,
> +"Add. Sense: %s", extd_sense_str);
> 
> - if (extd_sense_fmt) {
> - if (strstr(extd_sense_fmt, "%x")) {
> - printk("Add. Sense: ");
> - printk(extd_sense_fmt, ascq);
> - } else
> - printk("Add. Sense: %s", extd_sense_fmt);
>   } else {
> - if (asc >= 0x80)
> - printk("<> ASC=0x%x ASCQ=0x%x", asc,
> -ascq);
> - if (ascq >= 0x80)
> - printk("ASC=0x%x <> ASCQ=0x%x", asc,
> -ascq);
> - else
> - printk("ASC=0x%x ASCQ=0x%x", asc, ascq);
> + sdev_prefix_printk(KERN_INFO, sdev, name,
> +"%sASC=0x%x %sASCQ=0x%x\n",
> +asc >= 0x80 ? "<> " : "", asc,
> +ascq >= 0x80 ? "<> " : "", ascq);
>   }
> -
> - printk("\n");
>  }
>  EXPORT_SYMBOL(scsi_show_extd_sense);
> 
>  void
> -scsi_show_sense_hdr(struct scsi_sense_hdr *sshdr)
> +scsi_show_sense_hdr(struct scsi_device *sdev, const char *name,
> + struct scsi_sense_hdr *sshdr)

Add const for *sdev and *sshdr.

The latter requires also adding const to the argument in
scsi_sense_is_deferred() in scsi_eh.h.


...
> @@ -1419,12 +1426,11 @@ EXPORT_SYMBOL(scsi_show_sense_hdr);
>   * Print normalized SCSI sense header with a prefix.
>   */
>  void
> -scsi_print_sense_hdr(const char *name, struct scsi_sense_hdr *sshdr)
> +scsi_print_sense_hdr(struct scsi_device *sdev, const char *name,
> +  struct scsi_sense_hdr *sshdr)

Add const for *sdev and *sshdr.

...
> @@ -1513,33 +1519,26 @@ scsi_decode_sense_extras(const unsigned char
> *sense_buffer, int sense_len,
>  }
> 
>  /* Normalize and print sense buffer with name prefix */
> -void __scsi_print_sense(const char *name, const unsigned char *sense_buffer,
> - int sense_len)
> +void __scsi_print_sense(struct scsi_device *sdev, const char *name,
> + const unsigned char *sense_buffer, int sense_len)

Add const for *sdev.

>  {
>   struct scsi_sense_hdr sshdr;
> 
> - printk(KERN_INFO "%s: ", name);
>   scsi_decode_sense_buffer(sense_buffer, sense_len, &sshdr);
> - scsi_show_sense_hdr(&sshdr);
> + scsi_show_sense_hdr(sdev, name, &sshdr);
>   scsi_decode_sense_extras(sense_buffer, sense_len, &sshdr);
> - printk(KERN_INFO "%s: ", name);
> - scsi_show_extd_sense(sshdr.asc, sshdr.ascq);
> + scsi_show_extd_sense(sdev, name, sshdr.asc, sshdr.ascq);
>  }
>  EXPORT_SYMBOL(__scsi_print_sense);
> 
>  /* Normalize and print sense buffer in SCSI command */
> -void scsi_print_sense(char *name, struct scsi_cmnd *cmd)
> +void scsi_print_sense(struct scsi_cmnd *cmd)
...

Add const for *cmd.


---
Rob ElliottHP Server Storage





--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 03/24] aha152x: Debug output update and whitespace cleanup

2014-10-02 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> diff --git a/drivers/scsi/aha152x.c b/drivers/scsi/aha152x.c
> index e77b72f..e1aba73 100644
> --- a/drivers/scsi/aha152x.c
> +++ b/drivers/scsi/aha152x.c
...
> @@ -345,10 +311,10 @@ CMD_INC_RESID(struct scsi_cmnd *cmd, int inc)
> 
>  enum {
>   not_issued  = 0x0001,   /* command not yet issued */
> - selecting   = 0x0002,   /* target is beeing selected */
> + selecting   = 0x0002,   /* target is beeing selected */

You might want to fix the spelling of "being" 
while modifying that line.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 20/24] sd: Cleanup logging

2014-10-02 Thread Elliott, Robert (Server Storage)
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 848b17d..2cc8703 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -116,7 +116,7 @@ static int sd_eh_action(struct scsi_cmnd *, int);
>  static void sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer);
>  static void scsi_disk_release(struct device *cdev);
>  static void sd_print_sense_hdr(struct scsi_disk *, struct scsi_sense_hdr *);
> -static void sd_print_result(struct scsi_disk *, int);
> +static void sd_print_result(struct scsi_disk *, const char *, int);
> 
>  static DEFINE_SPINLOCK(sd_index_lock);
>  static DEFINE_IDA(sd_index_ida);
> @@ -1479,7 +1479,7 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
>   }
> 
>   if (res) {
> - sd_print_result(sdkp, res);
> + sd_print_result(sdkp, "SYNCHRONIZE_CACHE failed", res);

The cdb_byte0_names[] array in constants.c capitalizes
the candidates as:
Synchronize Cache(10)
Synchronize cache(16)

Here, it's specifically the 10 byte version (so far), so
Synchronize Cache(10) failed
might be a better string.  Or, if following SCSI standards,
SYNCHRONIZE CACHE(10)
without the _ would be better.

Consider changing cdb_byte0_names to Synchronize Cache(16) too.
(it has a lot of inconsistent capitalization)

> 
>   if (driver_byte(res) & DRIVER_SENSE)
>   sd_print_sense_hdr(sdkp, &sshdr);
> @@ -1700,17 +1700,9 @@ static int sd_done(struct scsi_cmnd *SCpnt)
>   if (sense_valid)
>   sense_deferred = scsi_sense_is_deferred(&sshdr);
>   }
> -#ifdef CONFIG_SCSI_LOGGING
> - SCSI_LOG_HLCOMPLETE(1, scsi_print_result(SCpnt));
> - if (sense_valid) {
> - SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
> -"sd_done: sb[respc,sk,asc,"
> -"ascq]=%x,%x,%x,%x\n",
> -sshdr.response_code,
> -sshdr.sense_key, sshdr.asc,
> -sshdr.ascq));
> - }
> -#endif
> + SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
> +"sd_done: completed %d bytes\n",
> +good_bytes));

Printing good_bytes here is misleading, because it is 
subsequently changed from 0 to nonzero in these cases:
* HARDWARE ERROR sense key
* MEDIUM ERROR sense key
* RECOVERED ERROR sense key
* ABORTED COMMAND sense key for ASC 0x10
* ILLEGAL REQUEST sense key for ASC 0x10
* ILLEGAL REQUEST sense key for ASC/ASCQ 0x20/0x24 
for WRITE SAME without the UNMAP bit

Suggestions:
1. Move the print down after the out: label, when good_bytes
has reached its final value.

2. Indicate if the command was able to transfer all the bytes
by adding:
"%d of %d bytes" good_bytes, scsi_bufflen(SCpnt) 

>   sdkp->medium_access_timed_out = 0;
> 
>   if (driver_byte(result) != DRIVER_SENSE &&
> @@ -1820,12 +1812,12 @@ sd_spinup_disk(struct scsi_disk *sdkp)
>   /* no sense, TUR either succeeded or failed
>* with a status error */
>   if(!spintime && !scsi_status_is_good(the_result)) {
> - sd_printk(KERN_NOTICE, sdkp, "Unit Not 
> Ready\n");
> - sd_print_result(sdkp, the_result);
> + sd_print_result(sdkp, "Unit Not Ready",
> + the_result);

"unit" by itself is not a real SCSI term.  "Logical Unit Not 
Ready" would be better, but this is printed for all non-GOOD 
status values, not just CHECK CONDITION status with a sense 
key of NOT READY and one of many additional sense codes with 
"LOGICAL UNIT NOT READY" in their names.

Following the other prints in this patch, how about this?
"TEST UNIT READY failed"

>   }
>   break;
>   }
> -
> +
>   /*
>* The device does not want the automatic start to be issued.
>*/
> @@ -1941,7 +1933,7 @@ static void read_capacity_error(struct scsi_disk *sdkp,
> struct scsi_device *sdp,
>   struct scsi_sense_hdr *sshdr, int sense_valid,
>   int the_result)
>  {
> - sd_print_result(sdkp, the_result);
> + sd_print_result(sdkp, "READ CAPACITY failed", the_result);

The cdb_byte0_names[] array in constants.c capitalizes
the candidates as:
Read Capacity(10)
Read capacity(16)

The two callers to this function already print the opcode
name, using all caps and not printing "(10)", at the
KERN_NOTICE level:
sd_printk(KERN_NOTICE, sdkp, "READ CAPACITY(16) failed\n");
read_capacity_error(sdkp, sdp, &sshdr, sense_valid, t

RE: [PATCHv5 00/24] scsi logging update (the boring part)

2014-10-02 Thread Elliott, Robert (Server Storage)

> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Wednesday, 01 October, 2014 1:23 AM
> To: James Bottomley
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org; Elliott, Robert (Server
> Storage); Hannes Reinecke
> Subject: [PATCHv5 00/24] scsi logging update (the boring part)
> 
> This is the next iteration of my scsi logging patchset.
> The patchset just contains some logging updates, code
> reshuffling and sanity fixes. Nothing major.
> 
> The second part of this patchset (the printk bit)
> has been vetoed for the moment, so I'll have to
> work on that. But in any case this patchset is
> pretty much independent.
> 
> Difference to v4:
> - Add 'Reviewed-by' tags from hch where applicable
> - Split off the fas216 patch into two different patches
> - Added a patch to document scsi_try_to_abort_cmd
> 
> Hannes Reinecke (24):
>   Remove scsi_cmd_print_sense_hdr()
>   sd: Remove scsi_print_sense() in sd_done()
>   aha152x: Debug output update and whitespace cleanup
>   scsi: introduce sdev_prefix_printk()
>   scsi: Use sdev as argument for sense code printing
>   acornscsi: use scsi_print_command()
>   fas216: Return DID_ERROR for incomplete data transfer
>   fas216: Update logging messages
>   53c700: remove scsi_print_sense() usage
>   scsi: stop decoding if scsi_normalize_sense() fails
>   scsi: do not decode sense extras
>   scsi: use 'bool' as return value for scsi_normalize_sense()
>   scsi: remove scsi_print_status()
>   Implement scsi_opcode_sa_name
>   scsi: merge print_opcode_name()
>   scsi: consolidate opcode lookup in scsi_opcode_sa_name()
>   scsi: remove last argument from print_opcode_name()
>   scsi: Remove scsi_print_command when calling abort
>   scsi: separate out scsi_(host|driver)byte_string()
>   sd: Cleanup logging
>   scsi: simplify scsi_log_(send|completion)
>   scsi: fixup logging messages in scsi_error.c
>   scsi: use shost argument in scsi_eh_prt_fail_stats
>   scsi_error: document scsi_try_to_abort_cmd

With this 24-part series applied to 3.17rc7, with the 
kernel set to print timestamps (e.g., printk.time=y on the kernel
command line), the prints still end up trickling out a byte
at a time, interleaved from different threads, when the system 
is overloaded.

Maybe that's addressed by the part that is deferred?

Example serial log excerpt:

[74465.705193] sd 2:0:0:0: [sdr] CDB:
[74465.705194] :
[74465.705196]  0
 [74465.705196] sd 2:0:0:0: [sdr] (null)FAILED Result: 
hostbyte=DID_OK driverbyte=DRIVER_SENSE
[74465.705197]  00
[74465.705198] :
[74465.705198] :
[74465.705199]  00
[74465.705200] Read(10)
[74465.705201]  28
[74465.705202]  f4
[74465.705202] sd 2:0:0:0: [sdr] Sense Key : Hardware Error [current]
[74465.705204]  b2
[74465.705204]  28
[74465.705205]  28
[74465.705206]  01
[74465.705207] :
[74465.705207]  00
[74465.705208]  70
[74465.705209]  f8
[74465.705210] sd 2:0:0:0: [sdr] Add. Sense: Logical unit failure
[74465.705211]  00
[74465.705211]  00
[74465.705212]  2f
[74465.705213]  28
[74465.705214]  00
[74465.705215]  00
[74465.705215]  00
[74465.705217]  00
[74465.705217] sd 2:0:0:0: [sdr] CDB:
[74465.705218]  00
[74465.705219]  70
[74465.705219]  00
[74465.705220]  00
[74465.705221]  00
[74465.705221]  00
[74465.705222]  01
[74465.705223] Read(10)
[74465.705223]  01
[74465.705224]  00
[74465.705225]  00
[74465.705226]  9c
[74465.705227]  08
[74465.705227]  08
[74465.705228]  81
[74465.705229] :
[74465.705229]  9d
[74465.705230]  00
[74465.705231]  01
[74465.705232]  70
[74465.705232]  00
[74465.705233]  00
[74465.705234]  50
[74465.705234]  28
[74465.705235]  e8
[74465.705236]  08
[74465.705237]  9f
[74465.705237]  00
[74465.705238]
[74465.705238]
[74465.705239]  00
[74465.705240]  00
[74465.705240]  00
[74465.705241]  00
[74465.705242]  68
[74465.705243]  00
[74465.705243]  00
[74465.705244]  00
[74465.705245]  00
[74465.705245]
[74465.705246]  00
[74465.705247]  08
[74465.705247]  08
[74465.705248]  01
[74465.705249]  08
[74465.705250]  00
[74465.705250]  00
[74465.705251]  00
[74465.705253]  33
[74465.705253] sd 2:0:0:0: [sdr] Done: SUCCESS Result: 
hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 04/24] scsi: introduce sdev_prefix_printk()

2014-10-02 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
...
> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
> index 4c3ab83..c01dc89 100644
> --- a/drivers/scsi/sd.h
> +++ b/drivers/scsi/sd.h
> @@ -103,9 +103,10 @@ static inline struct scsi_disk *scsi_disk(struct gendisk
> *disk)
> 
>  #define sd_printk(prefix, sdsk, fmt, a...)   \
>  (sdsk)->disk ?   
> \
> - sdev_printk(prefix, (sdsk)->device, "[%s] " fmt,\
> - (sdsk)->disk->disk_name, ##a) : \
> - sdev_printk(prefix, (sdsk)->device, fmt, ##a)
> +   sdev_prefix_printk(prefix, (sdsk)->device,\
> +  (sdsk)->disk->disk_name, fmt, ##a) :   \
> +   sdev_prefix_printk(prefix, (sdsk)->device,\
> +  NULL, fmt, ##a)
> 
>  #define sd_first_printk(prefix, sdsk, fmt, a...) \
>   do {\
...
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index 27ecee7..0b18a09 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -244,6 +244,15 @@ struct scsi_dh_data {
>  #define sdev_dbg(sdev, fmt, a...) \
>   dev_dbg(&(sdev)->sdev_gendev, fmt, ##a)
> 
> +/*
> + * like scmd_printk, but the device name is passed in
> + * as a string pointer
> + */
> +#define sdev_prefix_printk(l, sdev, p, fmt, a...)\
> + (p) ?   \
> + sdev_printk(l, sdev, "[%s] " fmt, p, ##a) : \
> + sdev_printk(l, sdev, fmt, ##a)
> +
>  #define scmd_printk(prefix, scmd, fmt, a...) \
>  (scmd)->request->rq_disk ?   \
>   sdev_printk(prefix, (scmd)->device, "[%s] " fmt,\
> --
> 1.8.5.2

This triggers lots of compiler warnings with gcc 4.4.7 like:

drivers/scsi/sd.c: In function 'sd_open':
drivers/scsi/sd.c:1179: warning: reading through null pointer (argument 4)
drivers/scsi/sd.c:1179: warning: format '%s' expects type 'char *', but 
argument 4 has type 'void *'


That is from:
SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n"));

Since:
#define NULL ((void *)0)

gcc probably doesn't realize the (p)? prevents the NULL (a void *) 
from being passed to sdev_printk.

Passing "" rather than NULL eliminates the compiler warnings.

There should probably be a () around p in the sdev_printk call, too.


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] block: remove artifical max_hw_sectors cap

2014-10-01 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
> Sent: Wednesday, 01 October, 2014 8:08 AM
> To: Jens Axboe; linux-ker...@vger.kernel.org; linux-scsi@vger.kernel.org; Wu
> Fengguang
> Subject: Re: [PATCH] block: remove artifical max_hw_sectors cap
> 
> As we still haven't made any progress on this let me explain why
> the limit does not make sense:  It only applies to _FS request,
> which basically have three use cases:
> 
>  - metadata I/O.  Generally small enough that the limit does not
>matter at all.
>  - buffered reads/writes.  We already have a self-tuning algorithm
>that limits writeback size, and a readahead size tunable that
>caps read sizes.  Imposing another confusing limit that does
>not interact with the visible tunables here is not helpful
>  - direct I/O.  Users should get something resembling their request
>as closely as possible on the write, and this is where our
>stupid limitation causes the most problems.

One supporting example: A low limit interferes with creation of
full stripe writes for RAID controllers.



> On Sat, Sep 06, 2014 at 04:08:05PM -0700, Christoph Hellwig wrote:
> > Set max_sectors to the value the drivers provides as hardware limit by
> > default.  Linux had proper I/O throttling for a long time and doesn't
> > rely on a artifically small maximum I/O size anymore.  By not limiting
> > the I/O size by default we remove an annoying tuning step required for
> > most Linux installation.
> >
> > Note that both the user, and if absolutely required the driver can still
> > impose a limit for FS requests below max_hw_sectors_kb.
> >
> > Signed-off-by: Christoph Hellwig 
> > ---
> >  block/blk-settings.c   | 4 +---
> >  drivers/block/aoe/aoeblk.c | 2 +-
> >  include/linux/blkdev.h | 1 -
> >  3 files changed, 2 insertions(+), 5 deletions(-)
> >
> > diff --git a/block/blk-settings.c b/block/blk-settings.c
> > index f1a1795..f52c223 100644
> > --- a/block/blk-settings.c
> > +++ b/block/blk-settings.c
> > @@ -257,9 +257,7 @@ void blk_limits_max_hw_sectors(struct queue_limits
> *limits, unsigned int max_hw_
> >__func__, max_hw_sectors);
> > }
> >
> > -   limits->max_hw_sectors = max_hw_sectors;
> > -   limits->max_sectors = min_t(unsigned int, max_hw_sectors,
> > -   BLK_DEF_MAX_SECTORS);
> > +   limits->max_sectors = limits->max_hw_sectors = max_hw_sectors;
> >  }
> >  EXPORT_SYMBOL(blk_limits_max_hw_sectors);

1. Documentation/block/biodoc.txt needs some updates:

blk_queue_max_sectors(q, max_sectors)
Sets two variables that limit the size of the request.

- The request queue's max_sectors, which is a soft size in
units of 512 byte sectors, and could be dynamically varied
by the core kernel.

- The request queue's max_hw_sectors, which is a hard limit
and reflects the maximum size request a driver can handle
in units of 512 byte sectors.

The default for both max_sectors and max_hw_sectors is
255. The upper limit of max_sectors is 1024.

There is no function with that name (it is now called
blk_queue_max_hw_sectors), the upper limit of max_sectors
is max_hw_sectors, and the default is misleading (255
is the default if the LLD doesn't provide max_hw_sectors).

2. Testing with hpsa and mpt3sas, this patch works as expected
for this setting.  I/O sizes are still limited by max_segments, 
which is expected.  Something else is still limiting I/O sizes
to 1 MiB, though; probably bio_get_nr_vecs enforcing a maximum
size per bio of BIO_MAX_PAGES 256 (which is 1 MiB with 4 KiB
pages).


Otherwise,
Reviewed-by: Robert Elliott 
Tested-by: Robert Elliott 

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] scsi: add a CONFIG_SCSI_MQ_DEFAULT option

2014-09-30 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
...
> Add a Kconfig option to enable the blk-mq path for SCSI by default
> to ease testing and deployment in setups that know they benefit
> from blk-mq.
...
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 1423cb1..79c77b4 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -1367,7 +1367,11 @@ MODULE_LICENSE("GPL");
>  module_param(scsi_logging_level, int, S_IRUGO|S_IWUSR);
>  MODULE_PARM_DESC(scsi_logging_level, "a bit mask of logging levels");
> 
> +#ifdef CONFIG_SCSI_MQ_DEFAULT
> +bool scsi_use_blk_mq = true;
> +#else
>  bool scsi_use_blk_mq = false;
> +#endif
>  module_param_named(use_blk_mq, scsi_use_blk_mq, bool, S_IWUSR | S_IRUGO);
> 
>  static int __init init_scsi(void)

A printk indicating the choice would be helpful; there's no indication
in the "Command line" or "Kernel command line" lines after this patch.

Reviewed-by: Robert Elliott 
Tested-by: Robert Elliott 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Question: request tag usage

2014-09-26 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Friday, September 26, 2014 3:21 AM
> To: Christoph Hellwig
> Cc: Elliott, Robert (Server Storage); SCSI Mailing List; Jens Axboe
> Subject: Re: Question: request tag usage
> 
> On 09/26/2014 10:03 AM, Christoph Hellwig wrote:
...
> >
> > I guess for now we'll need to stay with the command pointer address.
> > We can revisit this once the old request layer is gone.
> >
> Too bad. The tags would have provided a really nice concise way
> of identifying the command...

I would still go ahead and print the tag; it's extremely useful for
blk-mq.

How about having scmd_printk do this?
* if tag is -1, print "scmd %p"
* else print "tag %d"

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V5 01/17] scsi: fixing the "type" for well known LUs

2014-09-24 Thread Elliott, Robert (Server Storage)

> From: Christoph Hellwig [mailto:h...@infradead.org]
...
> On Wed, Sep 24, 2014 at 06:13:57PM +0300, Dolev Raviv wrote:
> > From: Subhash Jadavani 
> >
> > Some devices may respond with wrong type for well-known logical units.
> > This patch forces well-known type for devices which doesn't report it
> > correct.
> 
> This looks fine to me, as the well known LUN addresses seem to be nailed
> down nicely in t10, but let's see if Martin or Robert disagree..
> 
...
> > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> > index 56675db..a34db9e 100644
> > --- a/drivers/scsi/scsi_scan.c
> > +++ b/drivers/scsi/scsi_scan.c
> > @@ -805,6 +805,14 @@ static int scsi_add_lun(struct scsi_device *sdev,
> unsigned char *inq_result,
> > } else {
> > sdev->type = (inq_result[0] & 0x1f);
> > sdev->removable = (inq_result[1] & 0x80) >> 7;
> > +
> > +   /*
> > +* some devices may respond with wrong type for
> > +* well-known logical units. Force well-known type
> > +* to enumerate them correctly.
> > +*/
> > +   if (scsi_is_wlun(sdev->lun))
> > +   sdev->type = TYPE_WLUN;
> > }
...

My only concern is that the peripheral device type was included
at the outset in spc3r01 in 2001, so a design that can't get this
right might have other problems.  A print might be justified
to report something is amiss:

if (scsi_is_wlun(sdev->lun) && sdev->type != TYPE_WLUN) {
sdev_printk(KERN_WARNING, sdev,
"%s: correcting incorrect peripheral device type 0x%x 
for W-LUN 0x%16phN\n",
__func__, sdev->type, sdev->lun);
sdev->type = TYPE_WLUN;
}


---
Rob ElliottHP Server Storage





--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] hpsa: add missing pci_set_master in kdump path

2014-09-24 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Tomas Henzl [mailto:the...@redhat.com]
> Sent: Wednesday, September 24, 2014 5:32 AM
> To: 'linux-scsi@vger.kernel.org'
> Cc: Elliott, Robert (Server Storage); steve.came...@hp.com; Christoph
> Hellwig; Handzik, Joe
> Subject: Re: [PATCH] hpsa: add missing pci_set_master in kdump path
> 
> Christoph,
> 
> this is a fix for 132aa220b45d60e9b20def1e9d8be9422eed9616 hpsa: refine
> the pci enable/disable handling
> which is in 'for-3.18'.
> If this patch will not be reviewed and added to 3.18 the aforementioned
> patch needs to be reverted.
> 
> Tomas

Sorry for the delay - that is fine to go in directly, or will be part
of a larger hpsa series if HP or PMC is able to post one for 3.18.

Bit of business turmoil here lately...

For the interim you can add:
Reviewed-by: Robert Elliott 
Tested-by: Robert Elliott 
Acked-by: Robert Elliott 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [Bug 81861] Oops by mvsas v0.8.16: sas: ataX: end_device-Y:0:Z: dev error handler -> general protection fault, RIP: mvs_task_prep_ata+0x80/0x3a0

2014-09-23 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of bugzilla-dae...@bugzilla.kernel.org
> Sent: Tuesday, 23 September, 2014 4:56 PM
> To: linux-scsi@vger.kernel.org
> Subject: [Bug 81861] Oops by mvsas v0.8.16: sas: ataX: end_device-Y:0:Z: dev
> error handler -> general protection fault, RIP: mvs_task_prep_ata+0x80/0x3a0
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=81861
> 
> --- Comment #16 from linux-...@crashplan.pro ---
> When line-by-line dumping the called constants/vars from:
> 469del_q = TXQ_MODE_I | tag |
> 470(TXQ_CMD_STP << TXQ_CMD_SHIFT) |
> 471(MVS_PHY_ID << TXQ_PHY_SHIFT) |
> 472(mvi_dev->taskfileset << TXQ_SRS_SHIFT);
> 
> using the prepended statements:
> printk("slot=%p ", slot);
> printk(KERN_INFO "TXQ_MODE_I=%d ", TXQ_MODE_I);
> printk(KERN_INFO "tag=%d ", tag);
> printk(KERN_INFO "TXQ_CMD_STP=%d ", TXQ_CMD_STP);
> printk(KERN_INFO "TXQ_CMD_SHIFT=%d ", TXQ_CMD_SHIFT);
> printk(KERN_INFO "MVS_PHY_ID=%d ", MVS_PHY_ID);
> printk(KERN_INFO "TXQ_PHY_SHIFT=%d ", TXQ_PHY_SHIFT);
> del_q = TXQ_MODE_I | tag |
> (TXQ_CMD_STP << TXQ_CMD_SHIFT) |
> (MVS_PHY_ID << TXQ_PHY_SHIFT) |
> (mvi_dev->taskfileset << TXQ_SRS_SHIFT);
> 
> the kernel crash occurs after printing "TXQ_CMD_SHIFT" or when trying to
> output
> the value of "MVS_PHY_ID":
> [  529.113152] sas: DONE DISCOVERY on port 0, pid:133, result:0
> [  529.114313] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
> [  529.115460] sas: ata7: end_device-6:0:28: dev error handler
> [  529.115522] sas: ata8: end_device-6:0:29: dev error handler
> [  529.118706] sas: ata9: end_device-6:0:30: dev error handler
> [  529.119840] sas: ata10: end_device-6:0:31: dev error handler
> [  529.271634] [mvi=8800d368, mvi_dev=8800d36836a0 tag=0
> slot=8800d36a55b8
> [  529.271753] TXQ_MODE_I=268435456 tag=0
> [  529.272679] TXQ_CMD_STP=3 TXQ_CMD_SHIFT=29
> [  529.273618] MVS_PHY_ID=32768 TXQ_PHY_SHIFT=12 tx_prod=44]
> [  529.276091] [mvi=8800d368, mvi_dev=8800d3683618 tag=1
> slot=8800d36a5610
> [  529.276207] TXQ_MODE_I=268435456 tag=1
> [  529.277095] TXQ_CMD_STP=3 TXQ_CMD_SHIFT=29
> [  529.278038] MVS_PHY_ID=1 TXQ_PHY_SHIFT=12 tx_prod=46]
> [  529.280271] [mvi=8800d368, mvi_dev=8800d3683618 tag=1
> slot=8800d36a5610
> [  529.280385] TXQ_MODE_I=268435456 tag=1
> [  529.281445] TXQ_CMD_STP=3 TXQ_CMD_SHIFT=29
> [  529.282562] MVS_PHY_ID=1 TXQ_PHY_SHIFT=12 tx_prod=48]
> [  529.284894] [mvi=8800d368, mvi_dev=8800d36837b0 tag=2
> slot=8800d36a5668
> [  529.285010] TXQ_MODE_I=268435456 tag=2
> [  529.286248] TXQ_CMD_STP=3 TXQ_CMD_SHIFT=29
> [  529.287555] BUG: unable to handle kernel NULL pointer dereference at
> 0257
> [  529.290225] IP: [] mvs_task_prep+0x7cb/0xe50 [mvsas]
> [  529.291686] PGD 0
> [  529.293141] Oops:  [#1] SMP
> [  529.294630] Modules linked in: mvsas(OF) libsas scsi_transport_sas
> x86_pkg_temp_thermal intel_powerclamp coretemp kvm crct10dif_pclmul
> crc32_pclmul ghash_clmulni_intel cryptd serio_raw lpc_ich i915 mei_me mei
> drm_kms_helper video netconsole drm configfs mac_hid i2c_algo_bit psmouse
> r8169
> ahci mii libahci
> 
> Any suggestions why accessing "MVS_PHY_ID" leads to the kernel NULL pointer
> dereference oops?

1. Although MVS_PHY_ID looks like a constant, it's really not:
#define MVS_PHY_ID (1U << sas_phy->id)

2. This fault:
[   32.271218] BUG: unable to handle kernel NULL pointer dereference at 
0255
(although 255 looks like a decimal number 0xff, it's really hex 0x255)

at this line:
   0xa01c481e <+1838>:  mov0x254(%rbx),%ecx

implies that rbx contains 1, so 0x254 + 1 = 0x255.

3. pahole drivers/scsi/mvsas/mv_sas.o
shows there are two structures with fields at offset 596:
* asd_sas_phy.id
* asd_sas_port.sas_addr[8]

4. objdump -drS drivers/scsi/mvsas/mv_sas.o
shows only a few lines with 0x254(%something), one of which
is the del_q line you've identified:

mvs_task_prep_ata(struct mvs_info *mvi, struct mvs_task_exec_info *tei):
struct sas_ha_struct *sha = mvi->sas;
struct sas_task *task = tei->task;
struct domain_device *dev = task->dev;
struct sas_phy *sphy = dev->phy;
struct asd_sas_phy *sas_phy = sha->sas_phy[sphy->number];

...
del_q = TXQ_MODE_I | tag |
(TXQ_CMD_STP << TXQ_CMD_SHIFT) |
(MVS_PHY_ID << TXQ_PHY_SHIFT) |
(mvi_dev->taskfileset << TXQ_SRS_SHIFT);
mvi->tx[mvi->tx_prod] = cpu_to_le32(del_q);

MVS_PHY_ID =
sas_phy->id =
sha->sas_phy[sphy->number] =
mvi->sas->sas_phy[dev->phy->number] =
mvi->sas->sas_phy[task->dev->phy->number]->id
mvi->sas->sas_phy[tei->task->dev->phy->number]->id

Looking at the offsets reported by pahole, that means:
%rdi->56->344[%rsi->0->0->56->688]->

RE: boot stall regression due to blk-mq: use percpu_ref for mq usage count

2014-09-23 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Tejun Heo
> Sent: Tuesday, 23 September, 2014 1:12 AM
> To: Christoph Hellwig
> Cc: Jens Axboe; linux-ker...@vger.kernel.org; linux-scsi@vger.kernel.org
> Subject: Re: boot stall regression due to blk-mq: use percpu_ref for mq usage
> count
> 
> On Tue, Sep 23, 2014 at 08:09:06AM +0200, Christoph Hellwig wrote:
> > On Tue, Sep 23, 2014 at 02:01:41AM -0400, Tejun Heo wrote:
> > > On Tue, Sep 23, 2014 at 07:59:24AM +0200, Christoph Hellwig wrote:
> > > > "[PATCHSET percpu/for-3.18] percpu_ref: implement
> switch_to_atomic/percpu()"
> > > >
> > > > looks way to big for 3.17, and the regression was introduced in the
> 3.17
> > > > merge window.  I'm not sure what was broken before, but it defintively
> > > > survived a lot of testing.
> > >
> > > Do we even care about fixing it for 3.17?  scsi-mq isn't enabled by
> > > default even for 3.18.  The open-coded percpu ref thing was subtly
> > > broken there.  It'd be difficult to trigger but I'm fairly sure it'd
> > > crap out in the wild once in a blue moon.
> >
> > It's compiled in by default, and people are extremly eager to test it.
> 
> Ugh, I don't know.  It's not like we have a very good baseline we can
> go back to and reverting it for -stable and then redoing it seems
> kinda excessive for a yet experimental feature.  Jens?

scsi_mod.use_blk_mq is not listed in 3.17rc6 Documentation/kernel-
parameters.txt or Documentation/scsi/scsi-parameters.txt (which
does admit "This document may not be entirely up to date and
comprehensive.").

Perhaps a description with an "experimental" warning could be 
slipped into scsi-parameters.txt in 3.17, with plans to remove
that warning in 3.18.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 20/22] scsi: align logging messages

2014-09-18 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
...
> > scmd ties the messages together so you can tell which command
> > has gotten to which state.  grep works.
> 
> Can we just print the tag instead, that would be a much more human
> readable number normally.

I made a local patch to add the tag (scmd->request->tag), and it does
look good.  I endorse dropping scmd %p in favor of just the tag,
and trying to include the tag in all the lines (here, the Done,
Result, CDB, Sense Key, and Add. Sense lines do not)

[  624.607501] sd 2:0:0:3: [sdv] Done: TIMEOUT
[  624.609121] sd 2:0:0:3: [sdv] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  624.611542] sd 2:0:0:3: [sdv] CDB: 28 00 43 03 24 98 00 00 08 00
[  624.613637] sd 2:0:0:3: [sdv] scmd 880420891470 tag 1 abort scheduled
[  624.616015] sd 2:0:0:3: [sdv] Done: TIMEOUT
[  624.617510] sd 2:0:0:3: [sdv] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  624.619792] sd 2:0:0:3: [sdv] CDB: 28 00 52 83 20 80 00 00 08 00
[  624.621855] sd 2:0:0:3: [sdv] scmd 880420893a70 tag 3 abort scheduled
[  624.624148] sd 2:0:0:3: [sdv] Done: TIMEOUT
[  624.625739] sd 2:0:0:3: [sdv] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  624.628062] sd 2:0:0:3: [sdv] CDB: 28 00 58 a4 64 70 00 00 08 00
[  624.630132] sd 2:0:0:3: [sdv] scmd 880420894d70 tag 4 abort scheduled
[  624.632477] sd 2:0:0:3: [sdv] Done: TIMEOUT
[  624.633913] sd 2:0:0:3: [sdv] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  624.636291] sd 2:0:0:3: [sdv] CDB: 28 00 16 90 5c 98 00 00 08 00
[  624.638378] sd 2:0:0:3: [sdv] scmd 880420899970 tag 8 abort scheduled
...
[  624.780177] sd 2:0:0:3: [sdv] scmd 880420976070 tag 161 abort scheduled
[  624.782661] sd 2:0:0:3: [sdv] Done: TIMEOUT
[  624.784092] sd 2:0:0:3: [sdv] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  624.786553] sd 2:0:0:3: [sdv] CDB: 28 00 1e dc 59 e0 00 00 08 00
[  624.788617] sd 2:0:0:3: [sdv] scmd 880420977370 tag 162 abort scheduled
[  624.790995] sd 2:0:0:3: [sdv] Done: TIMEOUT
[  624.792443] sd 2:0:0:3: [sdv] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  624.794835] sd 2:0:0:3: [sdv] CDB: 28 00 1e 83 1f b8 00 00 08 00
[  624.796965] sd 2:0:0:3: [sdv] scmd 88042099ac70 tag 191 abort scheduled 

[  624.799433] sd 2:0:0:3: [sdv] aborting scmd 880420891470 tag 1
[  624.801601] sd 2:0:0:3: [sdv] scmd 880420891470 tag 1 abort failed, rtn 
FAILED
[  624.804172] sd 2:0:0:3: [sdv] aborting scmd 880420893a70 tag 3
[  624.806417] sd 2:0:0:3: [sdv] scmd 880420893a70 tag 3 abort failed, rtn 
FAILED
[  624.808980] sd 2:0:0:3: [sdv] aborting scmd 880420894d70 tag 4
[  624.811086] sd 2:0:0:3: [sdv] scmd 880420894d70 tag 4 abort failed, rtn 
FAILED
[  624.813776] sd 2:0:0:3: [sdv] aborting scmd 880420899970 tag 8
[  624.815978] sd 2:0:0:3: [sdv] scmd 880420899970 tag 8 abort failed, rtn 
FAILED
[  624.818653] sd 2:0:0:3: [sdv] aborting scmd 88042089ac70 tag 9
...
[  624.852244] sd 2:0:0:3: [sdv] scmd 880420976070 tag 161 abort failed, 
rtn FAILED
[  624.855200] sd 2:0:0:3: [sdv] aborting scmd 880420977370 tag 162
[  624.857380] sd 2:0:0:3: [sdv] scmd 880420977370 tag 162 abort failed, 
rtn FAILED
[  624.860057] sd 2:0:0:3: [sdv] aborting scmd 88042099ac70 tag 191
[  624.862250] sd 2:0:0:3: [sdv] scmd 88042099ac70 tag 191 abort failed, 
rtn FAILED 

[  624.865174] scsi host2: scsi_eh_2: waking up 0/11/11
[  624.866977] sd 2:0:0:3: scsi_eh_prt_fail_stats: cmds failed: 11, cancel: 0
[  624.869276] scsi host2: Total of 11 commands on 1 devices require eh work
[  624.871683] scsi host2: scsi_eh_2: Sending BDR sdev:88042b162800
[  624.974058] hpsa :04:00.0: resetting scsi 2:0:0:3: Direct-Access HP  
 LOGICAL VOLUME   RAID-6 SSDSmartPathCap+ En- Exp=3
[  624.978450] sd 2:0:0:3: [sdv] scsi_eh_done scmd 880420891470 tag 1 
result: 2
[  624.981114] sd 2:0:0:3: [sdv] Done: SUCCESS
[  624.982719] sd 2:0:0:3: [sdv] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  624.985083] sd 2:0:0:3: [sdv] CDB: 00 00 00 00 00 00
[  624.986874] sd 2:0:0:3: [sdv] Sense Key : Unit Attention [current]
[  624.988997] sd 2:0:0:3: [sdv] Add. Sense: Bus device reset function occurred 
((null)3)

[  624.991727] sd 2:0:0:3: [sdv] scsi_send_eh_cmnd: scmd 880420891470 tag 1 
timeleft: 9998
[  624.994927] sd 2:0:0:3: [sdv] scsi_send_eh_cmnd: scsi_eh_completed_normally 
rtn 2001
[  624.997560] sd 2:0:0:3: [sdv] scsi_eh_tur: scmd 880420891470 tag 1 rtn 
2001
[  625.001243] sd 2:0:0:3: [sdv] scsi_eh_done scmd 880420891470 tag 1 
result: 0
[  625.003856] sd 2:0:0:3: [sdv] scsi_send_eh_cmnd: scmd 880420891470 tag 1 
timeleft: 9997
[  625.006864] sd 2:0:0:3: [sdv] scsi_send_eh_cmnd: scsi_eh_completed_normally 
rtn 2002
[  625.009481] sd 2:0:0:3: [sdv] scsi_eh_tur: scmd 880420891470 tag 1 rtn 
2002
[  625.011971] sd 2:0:0:3: [sdv] scsi_eh_2: flush retry scmd 880420891470 
tag 1
[  625.014925] sd 2:0:0:3: [sdv] scsi_eh_2: flush retry scmd 8

RE: blk-mq timeout handling fixes

2014-09-17 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Saturday, 13 September, 2014 6:40 PM
> To: Jens Axboe
> Cc: Elliott, Robert (Server Storage); linux-scsi@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: blk-mq timeout handling fixes
> 
> This series fixes various issues with timeout handling that Robert
> ran into when testing scsi-mq heavily.  He tested an earlier version,
> and couldn't reproduce the issues anymore, although the series changed
> quite significantly since and should probably be retested.
> 
> In summary we not only start the blk-mq timer inside the drivers
> ->queue_rq method after the request has been fully setup, and we
> also tell the drivers if we're timing out a reserved (internal)
> request or a real one.  Many drivers including will need to handle
> those internal ones differently, e.g. for scsi-mq we don't even
> have a scsi command structure allocated for the reserved commands.

I have rerun a variety of tests on:
* Jens' for-next tree that went into 3.17rc5
* plus this series
* plus two patches for infinite recursion on flushes from 
  Ming and then Christoph

and have not been able to trigger the scsi_times_out req->special
NULL pointer dereference that prompted this series.

Testing includes:
* concurrent heavy workload generators:
  * fio high iodepth direct 512 byte random reads (> 1M IOPS)
  * programs generating large bursts of paged writes
* mkfs.ext4 (followed by e2fsck)
* mkfs.xfs (followed by xfs_check)
* ddpt
  * watch -n 0 sync to generate flushes
* scsi_logging_level MLCOMPLETE set to 0 or 1
  * scsi_lib.c patched to put all the ACTION_FAIL messages
under level 1 so they can be squelched (massive error 
prints cause more timeouts themselves)
* 4 hpsa and 16 mpt3sas devices (all made from SAS SSDs)
  * lockless hpsa driver
* injecting errors
  * device removal
  * device generating infinite errors
  * device generating a brief number of errors

The filesystems don't always recover properly, but nothing in 
the block or scsi midlayers crashed.

So, you may add this to the series:
Tested-by: Robert Elliott 

---
Rob ElliottHP Server Storage




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] scsi: scsi_devinfo.c: Cleaning up unnecessarily complicated in conjunction with strncpy

2014-09-14 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Rickard Strandqvist [mailto:rickard_strandqv...@spectrumdigital.se]
> How do you mean?
> 
> strncpy zeroes throughout the remainder of the string "from" until the
> length off to_length, or otherwise guaranteed trailing zero characters
> and a warning is printed.
> 
> Is not it exactly the functionality that is desired?

Ah, I see that in man 3 strcpy: 
"If the length of src is less than n, strncpy() pads the 
 remainder of dest with null bytes."

I agree that should work.

---
Rob ElliottHP Server Storage





RE: [PATCH] scsi: scsi_devinfo.c: Cleaning up unnecessarily complicated in conjunction with strncpy

2014-09-14 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Rickard Strandqvist
...
> diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c
...
>  static void scsi_strcpy_devinfo(char *name, char *to, size_t to_length,
>   char *from, int compatible)
>  {
> - size_t from_length;
> -
> - from_length = strlen(from);
> - strncpy(to, from, min(to_length, from_length));
> - if (from_length < to_length) {
> - if (compatible) {
> - /*
> -  * NUL terminate the string if it is short.
> -  */
> - to[from_length] = '\0';
> - } else {
> - /*
> -  * space pad the string if it is short.
> -  */
> - strncpy(&to[from_length], spaces,
> - to_length - from_length);
> - }
> - }
> - if (from_length > to_length)
> -  printk(KERN_WARNING "%s: %s string '%s' is too long\n",
> + strncpy(to, from, to_length);
> + if (to[to_length - 1] != '\0') {
> + to[to_length - 1] = '\0';
> + printk(KERN_WARNING "%s: %s string '%s' is too long\n",
>   __func__, name, from);
> + }

The caller of this function, scsi_dev_info_list_add_keyed, created
the "to" destination buffer, devinfo, with kmalloc, so it's not
guaranteed to be full of zeros.

If from_length is shorter than to_length, then this code will
be inspecting an uninitialized character that strncpy didn't
touch.

---
Rob ElliottHP Server Storage





--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 20/20] scsi_error: format abort error message

2014-09-12 Thread Elliott, Robert (Server Storage)
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
> Sent: Friday, 05 September, 2014 7:33 PM
> Subject: Re: [PATCH 20/20] scsi_error: format abort error message
> 
> On Wed, Sep 03, 2014 at 12:06:15PM +0200, Hannes Reinecke wrote:
...
> > Decode the return value if the command abort failed.
> > @@ -157,8 +157,8 @@ scmd_eh_abort_handler(struct work_struct *work)
> > } else {
> > SCSI_LOG_ERROR_RECOVERY(3,
> > scmd_printk(KERN_INFO, scmd,
> > -   "scmd %p abort failed, rtn %d\n",
> > -   scmd, rtn));
> > +   "scmd %p abort failed, rtn %s\n",
> > +   scmd, scsi_retval_string(rtn)));
...
> I've not seen an answer to my question in reply to the previous version
> of this.  Why would you do pretty printing of a variable that can just
> return SUCCESS or FAILURE?

Is there anything prohibiting the scsi_eh_abort_handler provided by
the LLD from returning any of the other values like NEEDS_RETRY?

#define NEEDS_RETRY 0x2001
#define SUCCESS 0x2002
#define FAILED  0x2003
#define QUEUED  0x2004
#define SOFT_ERROR  0x2005
#define ADD_TO_MLQUEUE  0x2006
#define TIMEOUT_ERROR   0x2007
#define SCSI_RETURN_NOT_HANDLED   0x2008
#define FAST_IO_FAIL0x2009

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: lk 3.17-rc4 blk_mq large write problems

2014-09-10 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Jens Axboe
> Sent: Wednesday, September 10, 2014 9:00 PM
> To: dgilb...@interlog.com; Christoph Hellwig
> Cc: SCSI development list
> Subject: Re: lk 3.17-rc4 blk_mq large write problems
> 
> On 2014-09-10 18:58, Douglas Gilbert wrote:
> > On 14-09-10 11:41 AM, Christoph Hellwig wrote:
> >> While it might not help with a blown stack, can you give the patch
> below
> >> a try?  I tries to solve a problem where the timeout handler hits
> >> before we've fully set up a command.  While I'd like to understand
> the
> >> root cause of why we're hitting it as well, I'd also really to fix
> that
> >> race. It would also be good to get a gdb listing of the exact area in
> >> scsi_times_out listed in the oops.
> >
> > RIP: 0010:[]  []
> > scsi_times_out+0xe/0x2e0
> >
> > (gdb) disassemble scsi_times_out
> > Dump of assembler code for function scsi_times_out:
> > 0x8127d030 <+0>:push   %rbp
> > 0x8127d031 <+1>:mov$0x2007,%esi
> > 0x8127d036 <+6>:push   %rbx
> > 0x8127d037 <+7>:mov0xf8(%rdi),%rbx
> > 0x8127d03e <+14>:mov(%rbx),%rax
> > 0x8127d041 <+17>:mov%rbx,%rdi
> > 0x8127d044 <+20>:mov(%rax),%rbp
> > 0x8127d047 <+23>:callq  0x81277c70
> > 
> > 0x8127d04c <+28>:cmpl   $0x,0x154(%rbp)
> > 0x8127d053 <+35>:je 0x8127d05f
> > 
> > ...
> >
> > which seems to agree 'objdump -drS scsi_error.o':
> >
> > 28b0 :
> >  28b0:55   push   %rbp
> >  28b1:be 07 20 00 00   mov$0x2007,%esi
> >  28b6:53   push   %rbx
> >  28b7:48 8b 9f f8 00 00 00 mov0xf8(%rdi),%rbx
> >  28be:48 8b 03 mov(%rbx),%rax
> >  28c1:48 89 df mov%rbx,%rdi
> >  28c4:48 8b 28 mov(%rax),%rbp
> >  28c7:e8 00 00 00 00   callq  28cc
> 
> >  28c8: R_X86_64_PC32scsi_log_completion-0x4
> >  28cc:83 bd 54 01 00 00 ff cmpl   $0x,0x154(%rbp)
> 
> This would be more useful if you had DEBUGINFO enabled. At least it
> would save use some time :-)
> 

On my system, that function compiles to:

enum blk_eh_timer_return scsi_times_out(struct request *req)
{
2580:   55  push   %rbp
2581:   48 89 e5mov%rsp,%rbp
2584:   41 55   push   %r13
2586:   41 54   push   %r12
2588:   53  push   %rbx
2589:   48 83 ec 08 sub$0x8,%rsp
258d:   e8 00 00 00 00  callq  2592 
258e: R_X86_64_PC32 mcount-0x4
struct scsi_cmnd *scmd = req->special;
2592:   48 8b 9f f8 00 00 00mov0xf8(%rdi),%rbx
enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
struct Scsi_Host *host = scmd->device->host;
2599:   48 8b 03mov(%rbx),%rax
259c:   4c 8b 20mov(%rax),%r12
259f:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)

So, Doug's +0xe corresponds to +0x19 here.

258d-2591 and the extra code preceding them are for stack
overflow checking, which I have enabled, and the location
for the optional ftrace jump if tracing this function is
turned on.

2592 is scmd = req->special.  %rdi is req, the first and
only function argument (per the x86_64 abi).  0xf8(%rdi)
is referencing req->special.  req must be OK, since that's
not the instruction pointer address that failed.

2599 is dereferencing req->special (i.e., scmd) to set host.
req->special must contain NULL, causing the exception
at this IP.

As a short-term workaround, it might be better to just let
this function exit if scmd is NULL, assuming that it was 
already dealt with properly.  That would still be misled
if the race condition, whatever it may be, allows scmd to
be reused before this function runs; this function would
then operate on a future command.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 18/21] uas: Use scsi_print_command

2014-09-10 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Hans de Goede
> Sent: Wednesday, 10 September, 2014 6:47 AM
> To: Greg Kroah-Hartman
> Cc: linux-...@vger.kernel.org; linux-scsi@vger.kernel.org;
> sta...@vger.kernel.org; Hans de Goede
> Subject: [PATCH 18/21] uas: Use scsi_print_command
> 
> Use scsi_print_command to print commands during errors, rather then
> printing
> the rather meaningless pointer to the command.
> 
> Signed-off-by: Hans de Goede 
> ---
>  drivers/usb/storage/uas.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
> index 46b8788..220f4c7 100644
> --- a/drivers/usb/storage/uas.c
> +++ b/drivers/usb/storage/uas.c
> @@ -229,8 +229,8 @@ static void uas_log_cmd_state(struct scsi_cmnd
> *cmnd, const char *caller)
>   struct uas_cmd_info *ci = (void *)&cmnd->SCp;
> 
>   scmd_printk(KERN_INFO, cmnd,
> - "%s %p tag %d,
> inflight:%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
> - caller, cmnd, uas_get_tag(cmnd),
> + "%s tag %d inflight:%s%s%s%s%s%s%s%s%s%s%s%s%s ",
> + caller, uas_get_tag(cmnd),
>   (ci->state & SUBMIT_STATUS_URB) ? " s-st"  : "",
>   (ci->state & ALLOC_DATA_IN_URB) ? " a-in"  : "",
>   (ci->state & SUBMIT_DATA_IN_URB)? " s-in"  : "",
> @@ -244,6 +244,7 @@ static void uas_log_cmd_state(struct scsi_cmnd
> *cmnd, const char *caller)
>   (ci->state & COMMAND_COMPLETED) ? " done"  : "",
>   (ci->state & COMMAND_ABORTED)   ? " abort" : "",
>   (ci->state & IS_IN_WORK_LIST)   ? " work"  : "");
> + scsi_print_command(cmnd);
>  }

The SCSI midlayer already does a scsi_print_command, under
the MLCOMPLETE logging level.  Printing the command in
the LLD is redundant; Printing the scmd pointer was not,
as it matched the prints in the SCSI midlayer, letting
you find all the messages for a command.

In Hannes' logging patch series, Christoph Hellwig suggested
printing the block layer tag instead of the scmd pointer.
If UAS uses that tag directly, then the uas_get_tag() print
already does that.  If they're not identical (looks like 
they differ by 2), then you might want to print both.

More general comment: 
scsi-mq stress testing has shown that calling printk
on each command with an error is problematic.  If the
system runs into hundreds or thousands of errors, the
printks take so long they induce more timeouts and errors.  

As discussed in the threads by Yoshihiro YUNOMAE and Hannes
Reinecke, you might want to:
* tuck the scmd_printk inside SCSI_LOG_LLCOMPLETE so
the user can control the verbosity
* use a _ratelimited version of each print
* plan to add and eventually switch to tracepoints, so
logging doesn't have performance impacts

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 04/11] megaraid_sas : Firmware crash dump feature support

2014-09-10 Thread Elliott, Robert (Server Storage)
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Sumit Saxena
> 
> >From: Tomas Henzl [mailto:the...@redhat.com]
> >
> >With several controllers in a system this may take a lot memory,
> > could you also in case when a kdump kernel is running lower it,
> > by not using this feature?
> >
> Agreed, we will disable this feature for kdump kernel by adding
> "reset_devices" global varaiable.
> That check is required for only one place, throughout the code, this
> feature will remain disabled.  Code snippet for the same-
> 
> instance->crash_dump_drv_support = (!reset_devices) &&
> crashdump_enable &&
> instance->crash_dump_fw_support &&
> instance->crash_dump_buf);
> if(instance->crash_dump_drv_support) {
> printk(KERN_INFO "megaraid_sas: FW Crash dump is
> supported\n");
> megasas_set_crash_dump_params(instance,
> MR_CRASH_BUF_TURN_OFF);
> 
> } else {
> ..
> }

Network drivers have been running into similar problems.

There's a new patch from Amir coming through net-next to make 
is_kdump_kernel() (in crash_dump.h) accessible to modules.
That may be a better signal than reset_devices that the
driver should use minimal resources.

http://comments.gmane.org/gmane.linux.network/324737

I'm not sure about the logistics of a SCSI patch depending
on a net-next patch.

---
Rob ElliottHP Server Storage






RE: [PATCH 04/11] megaraid_sas : Firmware crash dump feature support

2014-09-09 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Tomas Henzl
> Sent: Tuesday, 09 September, 2014 10:54 AM
> Subject: Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature
> support
> 
> On 09/06/2014 03:25 PM, sumit.sax...@avagotech.com wrote:
> > This feature will provide similar interface as kernel crash dump
> feature.
> > When megaraid firmware encounter any crash, driver will collect the
> firmware raw image and
> > dump it into pre-configured location.
> >
...
> With several controllers in a system this may take a lot memory,
> could you also
> in case when a kdump kernel is running lower it, by not using this
> feature?

What is the correct way for a driver to determine that it is
running in a kdump kernel?  The reset_devices global variable?


---
Rob ElliottHP Server Storage



N�r��yb�X��ǧv�^�)޺{.n�+{���"�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i

RE: [PATCH 0/6] blk-mq: initialize pdu of flush req explicitly

2014-09-08 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Ming Lei
> Sent: Monday, 08 September, 2014 11:55 AM
> To: Christoph Hellwig
> Cc: Jens Axboe; Linux Kernel Mailing List; Linux SCSI List
> Subject: Re: [PATCH 0/6] blk-mq: initialize pdu of flush req
> explicitly
> 
> On Mon, Sep 8, 2014 at 2:49 AM, Christoph Hellwig  wrote:
> > This works fine for me, although I still don't really like it very
> much.
> >
> > If you really want to go down the path of major surgery in this
> > area we should probably allocate a flush request per hw_ctx, and
> > initialize it using the normal init/exit functions.  If we want
> > to have proper multiqueue performance on devices needing flushes
> > we'll need something like that anyway.
> 
> Yes, that should be the final solution for the problem, and looks the
> whole flush machinery need to move into hctx, I will try to figure
> out one patch to do that.

Please change flush_rq allocation from kzalloc to kzalloc_node 
while operating on that code (would have affected PATCH 1/6).

blk_mq_init_queue currently has this for q->flush_rq:
q->flush_rq = kzalloc(round_up(sizeof(struct request) +
set->cmd_size, cache_line_size()),
GFP_KERNEL);

while all its other allocations are tied to set->numa_node:
hctxs = kmalloc_node(set->nr_hw_queues * sizeof(*hctxs), GFP_KERNEL,
set->numa_node);
q = blk_alloc_queue_node(GFP_KERNEL, set->numa_node);

or, for per-CPU structures, tied to the appropriate node:
for (i = 0; i < set->nr_hw_queues; i++) {
int node = blk_mq_hw_queue_to_node(map, i);

hctxs[i] = kzalloc_node(sizeof(struct blk_mq_hw_ctx),
GFP_KERNEL, node);

Per-hctx flush requests would mean following the hctxs[i]
approach.


---
Rob ElliottHP Server Storage





RE: scsi-mq and 3.17rc1

2014-09-07 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Monday, 25 August, 2014 9:51 AM
> To: Elliott, Robert (Server Storage)
> Cc: linux-scsi@vger.kernel.org
> Subject: Re: scsi-mq and 3.17rc1
> 
> On Mon, Aug 25, 2014 at 02:31:58PM +, Elliott, Robert (Server
> Storage) wrote:
> > Two scsi-mq tips:
> >
> > 1. Several people have been wondering how to enable scsi-mq.
> > Add this to your kernel command line (e.g., in /boot/grub/grub.conf
> > if using grub-1):
> > scsi_mod.use_blk_mq=Y
> 
> Btw, I heard a few complaints about this, and one suggestion would be
> to add a config option to enable it by default.  Would you or others
> like this?

Yes.  It shouldn't become a command line parameter that clutters up
user boot configuration files for years.  A config option will ease
the transition from defaulting to off to defaulting to on.

The command line parameter could override both ways:
config default=N, scsi_mod.use_blk_mq=Y: on
config default=N, scsi_mod.use_blk_mq=N: off
config default=N, scsi_mod.use_blk_mq not specified: off
config default=Y, scsi_mod.use_blk_mq=Y: on
config default=Y, scsi_mod.use_blk_mq=N: off
config default=Y, scsi_mod.use_blk_mq not specified: on

Right now, shost->hostt->disable_blk_mq also allows the LLD to
override to off if needed.  Perhaps that should move to
shost->disable_blk_mq so the LLD could have per-scsi_host 
control (e.g., let the user specify they want it off for for
an HBA attached mostly to HDDs and on for an HBA attached
mostly to SSDs).

I also suggest that scsi_add_host_with_dma use shost_printk 
to indicate what was selected for each scsi_host.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2] hpsa: refine the pci enable/disable handling

2014-09-06 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Tomas Henzl
...

> + /* kdump kernel is loading, we don't know in which state is
> +  * the pci interface. The dev->enable_cnt is equal zero
> +  * so we call enable+disable, wait a while and switch it on.
> +  */
> + rc = pci_enable_device(pdev);
> + if (rc) {
> + dev_warn(&pdev->dev, "Failed to enable PCI device\n");
> + return -ENODEV;
> + }
> + pci_disable_device(pdev);
> + msleep(260);/* a randomly chosen number */
> + rc = pci_enable_device(pdev);
> + if (rc) {
> + dev_warn(&pdev->dev, "failed to enable device.\n");
> + return -ENODEV;
> + }
> +
>   /* Reset the controller with a PCI power-cycle or via doorbell
> */

I tested this patch by adding:
reset_devices
to the kernel command line, which sets a kernel global variable that
triggers the driver to take this path.

Controller initialization failed with:
[   21.822789] hpsa :02:00.0: Waiting for controller to respond to no-op
[  121.822219] hpsa :02:00.0: controller message 03:00 timed out
[  121.854814] hpsa :02:00.0: no-op failed; re-trying
...

The reason is that pci_disable_device clears the Bus Master Enable bit,
and there is no call to pci_set_master after pci_enable.  The controller is
unable to return the response for the command sent by hpsa_noop down
below.  Adding pci_set_master(pdev) got it working.

A call to pci_request_regions is also missing (that's supposed to
be called before accessing the controller's memory space), but
that's not fatal here.

---
Rob ElliottHP Server Storage
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 21/22] scsi: reduce messages for command failure

2014-08-31 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
> Sent: Sunday, 31 August, 2014 5:29 PM
> To: Hannes Reinecke
> Cc: James Bottomley; Ewan Milne; Christoph Hellwig; linux-
> s...@vger.kernel.org; Hoffmann, Elliot; Yoshihiro Yunomae
> Subject: Re: [PATCH 21/22] scsi: reduce messages for command failure
> 
> On Thu, Aug 28, 2014 at 07:33:35PM +0200, Hannes Reinecke wrote:
> > When devices fail they can generate quite a lot of error
> > messages, possibly overloading the logging system.
> > So we should be putting these messages under logging control
> > to be able to silence them if requested.
> > And we should shorten the overall message; more verbose
> > logging can be requested by the normal scsi logging.
> 
> While the logging mask magic always confused me I'm fairly sure this
> disables logging the completion errors entirely by default.  If that
> was the intention it should be mentioned in the changelog.
> 
> I do however think printing them in a ratelimit way might be useful,
> so I'd like to see a few more arguments for not doing it that way if
> you really prefer it.

Before this patch set, ratelimited printing was impossible because
there were so many partial-line printk calls.  This patch set
may resolve that problem.

Multiple related lines may still cause problems; if it prints
CDB, sense key, and additional sense code on three lines, you
don't want the CDB and sense key suppressed with the additional
sense code still sneaking out (because command completions
are concurrent).

If those issues can be overcome, then I'd favor offering ratelimited
calls.  Maybe the logging level could be used to select the amount,
like:
0 = no prints
1..n = ratelimited versions of the prints
n+1..m = not ratelimited versions of the prints


---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 20/22] scsi: align logging messages

2014-08-31 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
> Sent: Sunday, 31 August, 2014 5:26 PM
...
> On Thu, Aug 28, 2014 at 07:33:34PM +0200, Hannes Reinecke wrote:
> > Always use 'scmd 0x%p' when logging a command pointer.
> 
> Is there any good reason to print the address of a command pointer?
 
It relates the messages to each other.  When you get lots of
interleaved messages like:

[ 3161.441116] sd 2:0:0:2: [sdt] scmd 88041a874e70 abort scheduled
[ 3161.444258] sd 2:0:0:2: [sdt] scmd 88041a8b3b30 abort scheduled
...
[ 3162.707427] sd 2:0:0:2: [sdt] aborting command 88041a8b3b30
[ 3162.710445] sd 2:0:0:2: [sdt] scmd 88041a8b3b30 abort failed, rtn 8195
[ 3162.713092] sd 2:0:0:2: [sdt] aborting command 88041a8dd530
[ 3162.715172] sd 2:0:0:2: [sdt] scmd 88041a8dd530 abort failed, rtn 8195
...
[ 3171.052277] sd 2:0:0:2: [sdt] scsi_eh_2: flush retry cmd: 88041a8b3b30
[ 3171.054910] sd 2:0:0:2: [sdt] scsi_eh_2: flush retry cmd: 88041a8dd530
...
[ 3203.406218] sd 2:0:0:0: [sdr] scmd 88041a8214b0 not aborting, host in 
recovery
[ 3203.406424] sd 2:0:0:0: [sdr] scmd 88041a8474f0 not aborting, host in 
recovery

scmd ties the messages together so you can tell which command
has gotten to which state.  grep works.

---
Rob ElliottHP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >