from:"James Bottomley"

Re: [PATCH v2] scsi: core: fix uninit-value access of variable sshdr

2019-10-11 Thread James Bottomley

On Sat, 2019-10-12 at 10:03 +0800, zhengbin (A) wrote:
> On 2019/10/12 9:58, James Bottomley wrote:
> > On Sat, 2019-10-12 at 09:26 +0800, zhengbin wrote:
> > > BTW: we can't just init sshdr->response_code, sr_do_ioctl use
> > > sshdr->sense_key
> > 
> > That's an actual bug, isn't it?
> 
> If we init sshdr in __scsi_execute, this will be ok

No I mean it's a bug because sr_do_ioctl shouldn't be acting on sense
that isn't valid.  So all uses of sshdr should be gated on a validity
check.

James

Re: [PATCH v2] scsi: core: fix uninit-value access of variable sshdr

2019-10-11 Thread James Bottomley

On Sat, 2019-10-12 at 09:26 +0800, zhengbin wrote:
> BTW: we can't just init sshdr->response_code, sr_do_ioctl use
> sshdr->sense_key

That's an actual bug, isn't it?

James

Re: [PATCH v3 17/20] sg: add sg_iosubmit_v3 and sg_ioreceive_v3 ioctls

2019-08-12 Thread James Bottomley

On Mon, 2019-08-12 at 12:14 -0400, Tony Battersby wrote:
> On 8/12/19 11:37 AM, Douglas Gilbert wrote:
> > On 2019-08-09 7:15 p.m., James Bottomley wrote:
> > > On Wed, 2019-08-07 at 13:42 +0200, Douglas Gilbert wrote:
> > > > Add ioctl(SG_IOSUBMIT_V3) and ioctl(SG_IORECEIVE_V3). These
> > > > ioctls
> > > > are meant to be (almost) drop-in replacements for the
> > > > write()/read()
> > > > async version 3 interface. They only accept the version 3
> > > > interface.
> > > 
> > > I don't think we should do this at all.  Anyone who wants to use
> > > the
> > > new async interfaces should use the v4 headers.  As Tony
> > > Battersby
> > > already said, the legacy v3 users aren't going to update, so
> > > there's no
> > > point at all introducing new interfaces for v3.  We simply keep
> > > the v3
> > > only read/write interface until there are no users left and it
> > > can be
> > > eliminated.
> > 
> > Tony Battersby wrote [20190809]:
> >"Actually I used the asynchronous write()/read()/poll() sg
> > interface to
> >implement RAID-like functionality for tape drives and medium
> > changers,
> >in a commercial product that has been around since 2002.  These
> > days our
> >products use a lot more disk I/O than tape I/O, so I don't write
> > much
> >new code using the sg interface anymore, although that code is
> > still
> >there and has to be maintained as needed.  So I don't have any
> > immediate
> >plans to use any of the new sgv4 features being introduced, and
> >unfortunately I am way too busy to even give it a good
> > review..."
> > 
> > That is quoted in full his post. And here is the only other post
> > from
> > Tony I can find on this subject, again quoted in full [20190808]:
> > 
> >"One of the reasons ioctls have a bad reputation is because they
> > can be
> >used to implement poorly-thought-out interfaces.  So kernel
> > maintainers
> >push back on adding new ioctls.  But the push back isn't about
> > the
> >number of ioctls, it is about the poor interfaces.  My advice
> > was that
> >in general, to implement a given API, it would be better to add
> > more
> >ioctls with a simple interface for each one rather than to add
> > fewer
> >extremely complex multiplexing ioctls."
> > 
> > Call me biased but I believe that taken together those posts
> > support
> > what I am proposing. And I can _not_ see how you deduce: "so
> > there's
> > no point at all introducing new interfaces for v3" in reference to
> > Tony's posts.
> > 
> > 
> > As I stated in a previous post, I do not consider the sg v3
> > interface
> > as legacy. Where simply implemented, I am prepared to implement new
> > features on both the sg v3 and v4 interfaces. One example of this
> > is
> > doing command timing in nanoseconds rather than the current
> > default,
> > which is timing in milliseconds. There is also the new option of
> > not
> > doing any command timing at all. In my current implementation it
> > would
> > actually be more code to implement that for the v4 interface but
> > not
> > for the v3 interface.
> > 
> > Replicating my argument from a previous post:
> > If the kernel had an API mapping layer that was sensitive to file
> > descriptors of a "special file" (e.g. "/dev/sg3") then it could
> > map:
> >  write(sg_fd, &sg_io_v3_obj, sizeof(sg_io_v3_obj))
> > to
> >  ioctl(sg_fd, SG_IOSUBMIT_V3, &sg_io_v3_obj)
> > 
> > Plus a similar mapping for read() to ioctl(SG_IORECEIVE_V3). If
> > such
> > a mapping did exist and was transparent to the user then write()
> > and read() could be retired from the sg driver.  And I assume that
> > would get a thumbs up from the kernel security folk.
> > 
> 
> FWIW, my employer will probably continue to use the async sg v3
> interface for a long time.  If the read/write syscalls are a security
> problem, and if we had ioctl()s that are mostly a drop-in replacement
> for them, then we could convert our products over to the new ioctl()s
> on our next kernel upgrade without too much work (our products are
> embedded devices, so we control the whole software stack).  So if you
> plan to deprecate the read/write syscall interface anytime soon, then
> having drop-in replacement ioctl()s would be beneficial, even if it
> can't be done transparently as Doug suggests.

So far we've mitigated the security threat by withdrawing the v4
r/w interface which you don't use and keeping the sg nodes root only
for v3 r/w.  Unless we get another security incident based on them, as
long as the use case doesn't expand, I think the prior issue is pretty
nasty but contained to root who should know what they're doing, so
there's no pressing need to remove it.

Given that shifting to ioctls or a different async interface would be
development anyway, is there a solid reason you couldn't also shift to
v4 if you do that?  I know all the field names changed but for a
standard SCSI command it should be very similar to v3.

James

Re: [PATCH v3 07/20] sg: move header to uapi section

2019-08-12 Thread James Bottomley

On Mon, 2019-08-12 at 07:24 -0700, Christoph Hellwig wrote:
> [note: question for the linux-spdx audience below]
> 
> > -
> >  #ifdef __KERNEL__
> >  extern int sg_big_buff; /* for sysctl */
> >  #endif
> 
> FYI, these __KERNEL__ ifdefs in non-uapi headers should go away.
> 
> >  
> > +/*
> > + * In version 3.9.01 of the sg driver, this file was spilt in two,
> > with the
> > + * bulk of the user space interface being placed in the file being
> > included
> > + * in the following line.
> > + */
> > +#include 
> 
> Splitting uapi headers is standard practive, no need for the comment,
> especially not with a meaningless driver version number.
> 
> > diff --git a/include/uapi/scsi/sg.h b/include/uapi/scsi/sg.h
> > new file mode 100644
> > index ..072b45bd732f
> > --- /dev/null
> > +++ b/include/uapi/scsi/sg.h
> > @@ -0,0 +1,329 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> 
> This needs the syscall noticed for uapi headers.  FYI, what is our
> stance of just adding that notice to headers newly moved to UAPI?
> Do we need agreement from everyone who touched the file?  Or just
> after we started the split and SPDX annotations, as in this case
> this header used to be available to user programs?

It's pretty much covered by the original linux COPYING file, which
contains the systemcall exception for everything.  I think our
intention is that this still applies globally, so things can be moved
into the UAPI without additional permissions.

James

Re: [PATCH v3 17/20] sg: add sg_iosubmit_v3 and sg_ioreceive_v3 ioctls

2019-08-09 Thread James Bottomley

On Wed, 2019-08-07 at 13:42 +0200, Douglas Gilbert wrote:
> Add ioctl(SG_IOSUBMIT_V3) and ioctl(SG_IORECEIVE_V3). These ioctls
> are meant to be (almost) drop-in replacements for the write()/read()
> async version 3 interface. They only accept the version 3 interface.

I don't think we should do this at all.  Anyone who wants to use the
new async interfaces should use the v4 headers.  As Tony Battersby
already said, the legacy v3 users aren't going to update, so there's no
point at all introducing new interfaces for v3.  We simply keep the v3
only read/write interface until there are no users left and it can be
eliminated.

James

Re: [PATCH v3 13/20] sg: add sg v4 interface support

2019-08-09 Thread James Bottomley

On Wed, 2019-08-07 at 13:42 +0200, Douglas Gilbert wrote:
> Add support for the sg v4 interface based on struct sg_io_v4 found
> in include/uapi/linux/bsg.h and only previously supported by the
> bsg driver. Add ioctl(SG_IOSUBMIT) and ioctl(SG_IORECEIVE) for
> async (non-blocking) usage of the sg v4 interface. Do not accept
> the v3 interface with these ioctls. Do not accept the v4
> interface with this driver's existing write() and read()
> system calls.
> 
> For sync (blocking) usage expand the existing ioctl(SG_IO)
> to additionally accept the sg v4 interface object.

First the meta comments:

Since this is effectively a new interface for sg, we're not constrained
by what happened before.  Specifically, we don't want to support the
read/write interface for v4, that should remain only for legacy v3.

We're already discussing what the correct async interface should look
like, I won't comment on the IOSUBMIT/IORECEIVE parts.

Given that we want to unify the v4 code paths, I think this:

> @@ -1293,15 +1528,25 @@ sg_ctl_sg_io(struct file *filp, struct
> sg_device *sdp, struct sg_fd *sfp,
>   return res;
>   if (copy_from_user(h3p, p, SZ_SG_IO_HDR))
>   return -EFAULT;
> - if (h3p->interface_id == 'S')
> - res = sg_submit(filp, sfp, h3p, true, &srp);
> - else
> + if (h3p->interface_id == 'Q') {
> + /* copy in rest of sg_io_v4 object */
> + if (copy_from_user(hu8arr + SZ_SG_IO_HDR,
> +((u8 __user *)p) + SZ_SG_IO_HDR,
> +SZ_SG_IO_V4 - SZ_SG_IO_HDR))
> + return -EFAULT;
> + res = sg_v4_submit(filp, sfp, p, h4p, true, &srp);

Can simply become

if (h3p->interface_id == 'Q')
return bsg_sg_io(sdp->request_queue, filp->file_mode, p);

And all the duplicate code could then be eliminated.  Of course, we
have to export bsg_sg_io, but that should be a trivial addition.

James

Re: [PATCH 3/4] Complain if scsi_target_block() fails

2019-07-26 Thread James Bottomley

On Fri, 2019-07-26 at 11:00 -0700, Bart Van Assche wrote:
> On 7/26/19 10:00 AM, James Bottomley wrote:
> > On Fri, 2019-07-26 at 09:48 -0700, Bart Van Assche wrote:
> > > If scsi_target_block() fails that can break the code that calls
> > > this
> > > function. Hence complain loudly if scsi_target_block() fails.
> > > 
> > > Cc: Christoph Hellwig 
> > > Cc: Hannes Reinecke 
> > > Cc: Johannes Thumshirn 
> > > Cc: Ming Lei 
> > > Signed-off-by: Bart Van Assche 
> > > ---
> > >   drivers/scsi/scsi_lib.c | 2 ++
> > >   1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index bbed72eff9c9..c9630bd59b5a 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -2770,6 +2770,8 @@ int scsi_target_block(struct device *dev)
> > >   else
> > >   device_for_each_child(dev, &ret, target_block);
> > >   
> > > + WARN_ONCE(ret, "ret = %d\n", ret);
> > > +
> > 
> > If this is the only point to the previous change to make SCSI
> > target block return an error, why not put the WARN_ONCE in
> > device_block?  That way you'll at least know which device was the
> > problem.
> 
> Hi James,
> 
> Adding a WARN_ON_ONCE() statement in device_block() sounds like a
> good idea to me. But since scsi_target_block() can fail, I think it
> should  have a return value that indicates whether or not it
> succeeded.

I don't disagree, but nothing in this patch set actually uses it ... is
there a plan for something to use the scsi_target_block return value to
perform some action other than issue a warning?   If not, it likely
makes better sense simply to add the warn once to the device block ...
unless there's some problem that might make it too verbose for a target
(say 100 devices each of which would issue the warning).

James

Re: [PATCH 3/4] Complain if scsi_target_block() fails

2019-07-26 Thread James Bottomley

On Fri, 2019-07-26 at 09:48 -0700, Bart Van Assche wrote:
> If scsi_target_block() fails that can break the code that calls this
> function. Hence complain loudly if scsi_target_block() fails.
> 
> Cc: Christoph Hellwig 
> Cc: Hannes Reinecke 
> Cc: Johannes Thumshirn 
> Cc: Ming Lei 
> Signed-off-by: Bart Van Assche 
> ---
>  drivers/scsi/scsi_lib.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index bbed72eff9c9..c9630bd59b5a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2770,6 +2770,8 @@ int scsi_target_block(struct device *dev)
>   else
>   device_for_each_child(dev, &ret, target_block);
>  
> + WARN_ONCE(ret, "ret = %d\n", ret);
> +

If this is the only point to the previous change to make SCSI target
block return an error, why not put the WARN_ONCE in device_block?  That
way you'll at least know which device was the problem.

James

Re: [PATCH] fcoe: avoid memset across pointer boundaries

2019-07-20 Thread James Bottomley

On Sat, 2019-07-20 at 12:21 -0700, Linus Torvalds wrote:
> On Tue, Jun 4, 2019 at 2:30 AM Hannes Reinecke  wrote:
> > 
> > Gcc-9 complains for a memset across pointer boundaries, which
> > happens as the code tries to allocate a flexible array on the
> > stack. Turns out we cannot do this without relying on gcc-isms, so
> > this patch converts the stack allocation in proper kzalloc() calls.
> > 
> > Signed-off-by: Hannes Reinecke 
> 
> So this patch apparently isn't making it into 5.3?
> 
> The gcc-9 warnings are still there, and as annoying as they were
> originally. Appended for your viewing "pleasure" once again, in case
> you don't have gcc-9 installed..

Is that a Reviewed-By or Tested-by?  If so we can slide it into one of
the -rc candidates.

James

Re: [PATCH] scsi: mpt3sas: clean up a sizeof()

2019-06-26 Thread James Bottomley

On Wed, 2019-06-26 at 13:12 +0300, Dan Carpenter wrote:
> This patch is just a cleanup and doesn't change run time because both
> sizeof EVENT and SCSI are 84 bytes.  But this is clearly a cut and
> paste
> error and the SCSI struct was intended.
> 
> Signed-off-by: Dan Carpenter 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_ctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
> b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
> index d4ecfbbe738c..06a901ed743c 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
> @@ -3280,7 +3280,7 @@ diag_trigger_scsi_store(struct device *cdev,
>   spin_lock_irqsave(&ioc->diag_trigger_lock, flags);
>   sz = min(sizeof(struct SL_WH_SCSI_TRIGGERS_T), count);
>   memset(&ioc->diag_trigger_scsi, 0,
> - sizeof(struct SL_WH_EVENT_TRIGGERS_T));
> + sizeof(struct SL_WH_SCSI_TRIGGERS_T));

Please can we make this the correct pattern of

sizeof(ioc->diag_trigger_scsi)

James

Re: [PATCH] scsi: mvumi: fix build warning

2019-06-20 Thread James Bottomley

On Thu, 2019-06-20 at 14:26 +0800, Ming Lei wrote:
> The local variable 'sg' should be initialized in the failure path of
> mvumi_make_sgl(), otherwise the following build warning is triggered:
> 
>   In file included from include/linux/pci-dma-compat.h:8,
>from include/linux/pci.h:2408,
>from drivers/scsi/mvumi.c:13:
>   drivers/scsi/mvumi.c: In function 'mvumi_queue_command':
>   include/linux/dma-mapping.h:608:34: warning: 'sg' may be used
> uninitialized in this function
>   +[-Wmaybe-uninitialized]
>#define dma_unmap_sg(d, s, n, r) dma_unmap_sg_attrs(d, s, n,
> r, 0)
> ^~
>   drivers/scsi/mvumi.c:192:22: note: 'sg' was declared here
> struct scatterlist *sg;
>   ^~
> Fixed it by removing the local variable reference in failure path.
> 
> Cc: Christoph Hellwig 
> Cc: Bart Van Assche 
> Cc: Ewan D. Milne 
> Fixes: 350d66a72adc ("scsi: mvumi: use sg helper to iterate over
> scatterlist")
> Signed-off-by: Ming Lei 
> ---
>  drivers/scsi/mvumi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c
> index 0022cd31500a..53f3563aca22 100644
> --- a/drivers/scsi/mvumi.c
> +++ b/drivers/scsi/mvumi.c
> @@ -217,7 +217,7 @@ static int mvumi_make_sgl(struct mvumi_hba *mhba,
> struct scsi_cmnd *scmd,
>   dev_err(&mhba->pdev->dev,
>   "sg count[0x%x] is bigger than max
> sg[0x%x].\n",
>   *sg_count, mhba->max_sge);
> - dma_unmap_sg(&mhba->pdev->dev, sg, sgnum,
> + dma_unmap_sg(&mhba->pdev->dev, scsi_sglist(scmd),
> sgnum,
>scmd->sc_data_direction);
>   return -1;
>   }

I think this is a serious enough problem for mvumi that it needs
folding with comment into the original path.  I believe it will cause
an oops when this code path is hit otherwise, right?

James

Re: [PATCH V2 00/15] scsi: use sg helper to operate sgl

2019-06-13 Thread James Bottomley

On Thu, 2019-06-13 at 15:13 +0800, Ming Lei wrote:
> Hi,
> 
> Most of drivers use sg helpers to operate sgl, however there is
> still a few drivers which operate sgl directly, this way can't
> work in case of chained sgl.

This isn't a useful explanation of the issue you make it sound like a
bug, which it isn't: it's a change of behaviour we'd like to introduce
in SCSI.  Please reword the explanation more along the lines of

---
Scsi MQ makes a large static allocation for the first scatter gather
list chunk for the driver to use.  This is a performance headache we'd
like to fix by reducing the size of the allocation to a 2 element
array.  Doing this will break the current guarantee that any driver
using SG_ALL doesn't need to use the scatterlist iterators and can get
away with directly dereferencing the array.  Thus we need to update all
drivers to use the scatterlist iterators and remove direct indexing of
the scatterlist array before reducing the initial scatterlist
allocation size in SCSI.
---

Which explains what we're trying to do and why.

In particular changelogs like this

> The current way isn't safe for chained sgl, so use sgl helper to
> operate sgl.

Are just plain wrong:  They were perfectly safe until you altered the
conditions for using non-chained sgls.  Please use the above
explanation in the patches, abbreviated if you like, so all recipients
know why this needs doing and that it isn't an existing bug.

James

Re: [PATCH V2 08/15] staging: unisys: visorhba: use sg helper to operate sgl

2019-06-13 Thread James Bottomley

On Thu, 2019-06-13 at 12:16 +0200, Greg Kroah-Hartman wrote:
> On Thu, Jun 13, 2019 at 06:04:11PM +0800, Ming Lei wrote:
> > On Thu, Jun 13, 2019 at 11:52:14AM +0200, Greg Kroah-Hartman wrote:
> > > On Thu, Jun 13, 2019 at 03:13:28PM +0800, Ming Lei wrote:
> > > > The current way isn't safe for chained sgl, so use sg helper to
> > > > operate sgl.
> > > 
> > > I can not make any sense out of this changelog.
> > > 
> > > What "isn't safe"?  What is a "sgl"?
> > 
> > sgl is 'scatterlist' in kernel, and several linear sgl can be
> > chained together, so accessing the sgl in linear way may see a
> > chained sg, which is like a link pointer, then may cause trouble
> > for driver.
> 
> What kind of "trouble"?  Is this a bug fix that needs to be
> backported to stable kernels?  How can this be triggered?

OK, stop.  I haven't seen the commit log since the original hasn't
appeared on the list yet, but the changelog needs to say something like
this to prevent questions like the above, as I asked in the last
review.  Please make it something like this:

---
Scsi MQ makes a large static allocation for the first scatter gather
list chunk for the driver to use.  This is a performance headache we'd
like to fix by reducing the size of the allocation to a 2 element
array.  Doing this will break the current guarantee that any driver
using SG_ALL doesn't need to use the scatterlist iterators and can get
away with directly dereferencing the array.  Thus we need to update all
drivers to use the scatterlist iterators and remove direct indexing of
the scatterlist array before reducing the initial scatterlist
allocation size in SCSI.
---

James

Re: [PATCH 2/5] scsi: advansys: use sg helper to operate sgl

2019-06-11 Thread James Bottomley

On Wed, 2019-06-12 at 09:39 +0800, Ming Lei wrote:
> Hi Martin,
> 
> On Tue, Jun 11, 2019 at 07:50:01PM -0400, Martin K. Petersen wrote:
> > 
> > Ming,
> > 
> > > 1) revert the 3 first, then re-organize the whole patchset in
> > > correct order(convert drivers first, then the 3 above drivers)
> > > 
> > > 2) simply apply the 5 patches now
> > > 
> > > Any comments?
> > 
> > I'm on the fence about this. Your patches were some of the first
> > ones that went into the 5.3 tree. So I'd have to rebase pretty much
> > the whole 5.3 queue.
> > 
> > Whereas merging your updates leaves a sequence of 100+ commits that
> > could lead to bisection problems in the future. I'm particularly
> > worried about ipr and lpfc but all these drivers are actively used.
> 
> The issue has been introduced, and people has complained, so I think
> we have to do something.

Studying the issue further, I think we have to do the rebase.  The
problem is that any driver which hasn't been updated can be persuaded
to walk of the end of the request and dereference the next struct
request.  It's not impossible for userspace to set up both requests, so
it looks like this could be used at least to leak information from the
kernel if not exploit it outright.  I think that means we have to have
every driver updated before this goes in.

> > As much as I like to see all drivers, without exception, use the sg
> > iterators, it would have been nice to have a smoother transition.
> 
> All the 5 drivers are found via static code analysis by eyes, and not
> see other ways for looking at this issue.

Can't coccinelle be persuaded?  All we're looking for is a semantic
search where we have a struct scatterlist that is either incremented or
indexed.

That said, it looks like the microtek scanner is yet another driver
that needs updating.

James

>  That said it is quite hard to prove 'all drivers, without exception,
> use the sg iterators'.
> 
> Even though some of them is missed, it should have been triggered
> easily if drivers are actively used, then it can be fixed easily too.

Re: [PATCH 2/5] scsi: advansys: use sg helper to operate sgl

2019-06-10 Thread James Bottomley

On Mon, 2019-06-10 at 15:36 -0400, Ewan D. Milne wrote:
> On Mon, 2019-06-10 at 11:37 -0700, James Bottomley wrote:
> > On Mon, 2019-06-10 at 23:03 +0800, Ming Lei wrote:
> > > The current way isn't safe for chained sgl, so use sgl helper to
> > > operate sgl.
> > 
> > The advansys driver doesn't currently use a chained
> > scatterlist.  In
> > theory it could; the 
> > 
> > if (shost->sg_tablesize > SG_ALL) {
> > shost->sg_tablesize = SG_ALL;
> > }
> > 
> > At around line 11226 is what prevents it and that could be
> > eliminated provided someone actually has the hardware to test.
> > 
> > However, provided drivers make the correct SG_ALL or less
> > declaration, they're entitled to treat scatterlists as fully
> > contiguous, so there's no real justification (beyond uniformity)
> > for making it use the chain helpers.
> > 
> > James
> > 
> 
> I thought the whole issue came about because Ming's earlier changes
> to scsi_lib.c made the previously SG_CHUNK_SIZE scatterlist allocated
> with the struct request much smaller, (SCSI_INLINE_SG_CNT is 2) so
> everything needs to support it?

Um, well, I'm just going by the commit log.  If the problem is someone
broke the SG_ALL no chaining assumption in an earlier commit, finger
that as the buggy commit you're fixing rather than implying the drivers
had a longstanding bug.  In fact, preferably do this work as a
precursor to the original instead of leaving the drivers broken for a
bisect.

James

Re: [PATCH 1/5] scsi: vmw_pscsi: use sgl helper to operate sgl

2019-06-10 Thread James Bottomley

On Mon, 2019-06-10 at 23:03 +0800, Ming Lei wrote:
> The current way isn't safe for chained sgl, so use sgl helper to
> operate sgl.

This also isn't a chained driver.  However, this driver seems to
achieve this by magic number matching, which looks unsafe.  I'd really
prefer it if vmw_pvscsi.h had

#define PVSCSI_MAX_NUM_SG_ENTRIES_PER_SEGMENT SG_ALL

James

Re: [PATCH 3/5] scsi: ipr: use sg helper to operate sgl

2019-06-10 Thread James Bottomley

On Mon, 2019-06-10 at 23:03 +0800, Ming Lei wrote:
> The current way isn't safe for chained sgl, so use sg helper to
> operate sgl.

ipr doesn't use chaining, so it can likewise assume a contiguous
scatterlist.  Since the hardware seems to have a 64 entry limit, it
looks like it never will use chaining.

James

Re: [PATCH 2/5] scsi: advansys: use sg helper to operate sgl

2019-06-10 Thread James Bottomley

On Mon, 2019-06-10 at 23:03 +0800, Ming Lei wrote:
> The current way isn't safe for chained sgl, so use sgl helper to
> operate sgl.

The advansys driver doesn't currently use a chained scatterlist.  In
theory it could; the 

if (shost->sg_tablesize > SG_ALL) {
shost->sg_tablesize = SG_ALL;
}

At around line 11226 is what prevents it and that could be eliminated
provided someone actually has the hardware to test.

However, provided drivers make the correct SG_ALL or less declaration,
they're entitled to treat scatterlists as fully contiguous, so there's
no real justification (beyond uniformity) for making it use the chain
helpers.

James

Re: [PATCH] mptsas: fix undefined behaviour of a shift of an int by more than 31 places

2019-05-09 Thread James Bottomley

On Thu, 2019-05-09 at 17:30 +0200, Hannes Reinecke wrote:
> On 5/8/19 4:24 PM, James Bottomley wrote:
> > On Wed, 2019-05-08 at 14:07 +0100, Colin Ian King wrote:
> > > On 05/05/2019 04:34, James Bottomley wrote:
> > > > On Sat, 2019-05-04 at 17:40 +0100, Colin King wrote:
> > > > > From: Colin Ian King 
> > > > > 
> > > > > Currently the shift of int value 1 by more than 31 places can
> > > > > result in undefined behaviour. Fix this by making the 1 a ULL
> > > > > value before the shift operation.
> > > > 
> > > > Fusion SAS is pretty ancient.  I thought the largest one ever
> > > > produced had four phys, so how did you produce the overflow?
> > > 
> > > This was an issue found by static analysis with Coverity; so I
> > > guess won't happen in the wild, in which case the patch could be
> > > ignored.
> > 
> > The point I was more making is that if we thought this could ever
> > happen in practice, we'd need more error handling than simply this:
> > we'd be setting the phy_bitmap to zero which would be every bit as
> > bad as some random illegal value.
> > 
> 
> Thing is, mptsas is used as the default emulation in VMWare, and
> that does allow you to do some pretty weird configurations (I've
> found myself  fixing a bug with SATA hotplug on mptsas once ...).
> So I wouldn't discard this issue out of hand.

I'm not, I'm just saying the proposed fix is no fix at all since it
would just produce undefined behaviour in the driver.  I thought the
issue might have been coming from VMWare, which is why I asked how the
bug was seen.  The proper fix is probably to fail attachment if the phy
number goes over a fixed value (16 sounds reasonable) but if it's never
a problem in the field, I'm happy doing nothing because we have no real
idea what the reasonable value is.

James

Re: [PATCH] mptsas: fix undefined behaviour of a shift of an int by more than 31 places

2019-05-08 Thread James Bottomley

On Wed, 2019-05-08 at 14:07 +0100, Colin Ian King wrote:
> On 05/05/2019 04:34, James Bottomley wrote:
> > On Sat, 2019-05-04 at 17:40 +0100, Colin King wrote:
> > > From: Colin Ian King 
> > > 
> > > Currently the shift of int value 1 by more than 31 places can
> > > result in undefined behaviour. Fix this by making the 1 a ULL
> > > value before the shift operation.
> > 
> > Fusion SAS is pretty ancient.  I thought the largest one ever
> > produced had four phys, so how did you produce the overflow?
> 
> This was an issue found by static analysis with Coverity; so I guess
> won't happen in the wild, in which case the patch could be ignored.

The point I was more making is that if we thought this could ever
happen in practice, we'd need more error handling than simply this:
we'd be setting the phy_bitmap to zero which would be every bit as bad
as some random illegal value.

James

[GIT PULL] first round of SCSI updates for the 5.1+ merge window

2019-05-07 Thread James Bottomley

 Malavali (8):
  scsi: qla2xxx: Change abort wait_loop from msleep to wait_event_timeout
  scsi: qla2xxx: Fix driver unload when FC-NVMe LUNs are connected
  scsi: qla2xxx: Set remote port devloss timeout to 0
  scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe
  scsi: qla2xxx: Increase the max_sgl_segments to 1024
  scsi: qla2xxx: Reset the FCF_ASYNC_{SENT|ACTIVE} flags
  scsi: qla2xxx: Set the qpair in SRB to NULL when SRB is released
  scsi: qla2xxx: Set the SCSI command result before calling the command done

Gustavo A. R. Silva (2):
  scsi: mptscsih: Mark expected switch fall-throughs
  scsi: mptfusion: mark expected switch fall-through

Hannes Reinecke (6):
  scsi: scsi_transport_fc: nvme: display FC-NVMe port roles
  scsi: qedf: fc_rport_priv reference counting fixes
  scsi: qedf: fixup bit operations
  scsi: qedf: fixup locking in qedf_restart_rport()
  scsi: qedf: missing kref_put in qedf_xmit()
  scsi: core: reshuffle no_scsi2_lun_in_cdb for better alignment

Himanshu Madhani (5):
  scsi: qla2xxx: Silence Successful ELS IOCB message
  scsi: qla2xxx: Fix read offset in qla24xx_load_risc_flash()
  scsi: qla2xxx: Update driver version to 10.01.00.16-k
  scsi: qla2xxx: Fix NULL pointer crash due to stale CPUID
  scsi: qla2xxx: Update driver version to 10.01.00.15-k

James Bottomley (1):
  scsi: lpfc: Fix build error

James Smart (36):
  scsi: lpfc: add support for posting FC events on FPIN reception
  scsi: scsi_transport_fc: Add FPIN fc event codes
  scsi: scsi_transport_fc: refactor event posting routines
  scsi: fc: add FPIN ELS definition
  scsi: lpfc: Fix missing wakeups on abort threads
  scsi: lpfc: Fixup eq_clr_intr references
  scsi: lpfc: Update lpfc version to 12.2.0.1
  scsi: lpfc: Update Copyright in driver version
  scsi: lpfc: Enhance 6072 log string
  scsi: lpfc: Fix duplicate log message numbers
  scsi: lpfc: Specify node affinity for queue memory allocation
  scsi: lpfc: Reduce memory footprint for lpfc_queue
  scsi: lpfc: Add loopback testing to trunking mode
  scsi: lpfc: Fix link speed reporting for 4-link trunk
  scsi: lpfc: Fix handling of trunk links state reporting
  scsi: lpfc: Fix protocol support on G6 and G7 adapters
  scsi: lpfc: Correct boot bios information to FDMI registration
  scsi: lpfc: Fix HDMI2 registration string for symbolic name
  scsi: lpfc: Fix fc4type information for FDMI
  scsi: lpfc: Fix FDMI manufacturer attribute value
  scsi: lpfc: Fix io lost on host resets
  scsi: lpfc: Fix mailbox hang on adapter init
  scsi: lpfc: Fix driver crash in target reset handler
  scsi: lpfc: Correct localport timeout duration error
  scsi: lpfc: Convert bootstrap mbx polling from msleep to udelay
  scsi: lpfc: Coordinate adapter error handling with offline handling
  scsi: lpfc: Stop adapter if pci errors detected
  scsi: lpfc: Fix deadlock due to nested hbalock call
  scsi: lpfc: Fix nvmet handling of first burst cmd
  scsi: lpfc: Fix lpfc_nvmet_mrq attribute handling when 0
  scsi: lpfc: Fix nvmet async receive buffer replenishment
  scsi: lpfc: Fix location of SCSI ktime counters
  scsi: lpfc: Fix SLI3 commands being issued on SLI4 devices
  scsi: lpfc: Fix use-after-free mailbox cmd completion
  scsi: lpfc: Resolve irq-unsafe lockdep heirarchy warning in lpfc_io_free
  scsi: lpfc: Resolve inconsistent check of hdwq in lpfc_scsi_cmd_iocb_cmpl

Jan Kotas (2):
  scsi: ufs-cdns: Add support for UFSHCI with M31 PHY
  scsi: dt-bindings: ufs-cdns: Update Cadence UFS compatibility list

Joe Carnuccio (12):
  scsi: qla2xxx: Add 28xx flash primary/secondary status/image mechanism
  scsi: qla2xxx: Simplification of register address used in qla_tmpl.c
  scsi: qla2xxx: Correction and improvement to fwdt processing
  scsi: qla2xxx: Update flash read/write routine
  scsi: qla2xxx: Add support for multiple fwdump templates/segments
  scsi: qla2xxx: Cleanups for NVRAM/Flash read/write path
  scsi: qla2xxx: Correctly report max/min supported speeds
  scsi: qla2xxx: Add Serdes support for ISP28XX
  scsi: qla2xxx: Add Device ID for ISP28XX
  scsi: qla2xxx: Fix routine qla27xx_dump_{mpi|ram}()
  scsi: qla2xxx: Remove FW default template
  scsi: qla2xxx: Add fw_attr and port_no SysFS node

John Garry (7):
  scsi: libsas: Print expander PHY indexes in decimal
  scsi: libsas: Do discovery on empty PHY to update PHY info
  scsi: libsas: Inject revalidate event for root port event
  scsi: libsas: Improve vague log in SAS rediscovery
  scsi: libsas: Try to retain programmed min linkrate for SATA min pathway 
unmatch fixing
  scsi: libsas: Stop hardcoding SAS address length
  scsi: hisi_sas: Fix for setting the PHY linkrate when disconnected

Kangjie Lu (2):
  scsi:

Re: [PATCH] mptsas: fix undefined behaviour of a shift of an int by more than 31 places

2019-05-04 Thread James Bottomley

On Sat, 2019-05-04 at 17:40 +0100, Colin King wrote:
> From: Colin Ian King 
> 
> Currently the shift of int value 1 by more than 31 places can result
> in undefined behaviour. Fix this by making the 1 a ULL value before
> the shift operation.

Fusion SAS is pretty ancient.  I thought the largest one ever produced
had four phys, so how did you produce the overflow?

James

Re: [PATCH 2/2] scsi: core: avoid to pre-allocate big chunk for sg list

2019-04-24 Thread James Bottomley

On Wed, 2019-04-24 at 09:09 -0700, Bart Van Assche wrote:
> On Wed, 2019-04-24 at 08:49 -0700, James Bottomley wrote:
> > On Wed, 2019-04-24 at 08:32 -0700, Bart Van Assche wrote:
> > > Another concern is whether this change can cause a livelock. If
> > > the system is running out of memory and the page cache submits a
> > > write request with a scatterlist with more than two elements, if
> > > the kmalloc() for the scatterlist fails, will that prevent the
> > > page cache from making any progress with writeback?
> > 
> > It's pool backed, as I said.  Is the concern there isn't enough
> > depth in the pools for a large write?
> 
> That memory pool is used by multiple drivers. Most but not all
> sg_alloc_table_chained() calls happen from inside .queue_rq()
> implementations. One sg_alloc_table_chained() call occurs in the NFS
> server code. I'm not sure whether it is guaranteed that an
> sg_alloc_table_chained() will succeed sooner or later under low
> memory conditions. Additionally, new sg_alloc_table_chained() could
> be added in drivers any time.

The number of users is irrelevant.  All we need is sequential forward
progress to guarantee freedom from memory allocation related live lock.
 Even if they make write progress one at a time (although the current
pool depth seems to be 2, so they make progress at least two at a
time), memory will be released by the write and reclaim will progress.

The guarantee required is ability to send or have outstanding at least
one write and also that that write will return eventually releasing
memory back to the pool for another write to proceed.

James

Re: [PATCH 2/2] scsi: core: avoid to pre-allocate big chunk for sg list

2019-04-24 Thread James Bottomley

On Wed, 2019-04-24 at 08:32 -0700, Bart Van Assche wrote:
> On Wed, 2019-04-24 at 08:24 -0700, James Bottomley wrote:
> > On Wed, 2019-04-24 at 15:52 +0800, Ming Lei wrote:
> > > On Tue, Apr 23, 2019 at 08:37:15AM -0700, Bart Van Assche wrote:
> > > > On Tue, 2019-04-23 at 18:32 +0800, Ming Lei wrote:
> > > > >  #define  SCSI_INLINE_PROT_SG_CNT  1
> > > > >  
> > > > > +#define  SCSI_INLINE_SG_CNT  2
> > > > 
> > > > So this patch inserts one kmalloc() and one kfree() call in the
> > > > hot path for every SCSI request with more than two elements in
> > > > its scatterlist? Isn't
> > > 
> > > Slab or its variants are designed for fast path, and NVMe PCI
> > > uses slab for allocating sg list in fast path too.
> > 
> > Actually, that's not really true  base kmalloc can do all sorts of
> > things including kick off reclaim so it's not really something we
> > like using in the fast path.  The only fast and safe kmalloc you
> > can rely on  in the fast path is GFP_ATOMIC which will fail quickly
> > if no memory can easily be found.  *However* the sg_table
> > allocation functions are all pool backed (lib/sg_pool.c), so they
> > use the lightweight GFP_ATOMIC mechanism for kmalloc initially
> > coupled with a backing pool in case of failure to ensure forward
> > progress.
> > 
> > So, I think you're both right: you shouldn't simply use kmalloc,
> > but this implementation doesn't, it uses the sg_table allocation
> > functions which correctly control kmalloc to be lightweight and
> > efficient and able to make forward progress.
> 
> Another concern is whether this change can cause a livelock. If the
> system is running out of memory and the page cache submits a write
> request with a scatterlist with more than two elements, if the
> kmalloc() for the scatterlist fails, will that prevent the page cache
> from making any progress with writeback?

It's pool backed, as I said.  Is the concern there isn't enough depth
in the pools for a large write?

James

Re: [PATCH 2/2] scsi: core: avoid to pre-allocate big chunk for sg list

2019-04-24 Thread James Bottomley

On Wed, 2019-04-24 at 15:52 +0800, Ming Lei wrote:
> On Tue, Apr 23, 2019 at 08:37:15AM -0700, Bart Van Assche wrote:
> > On Tue, 2019-04-23 at 18:32 +0800, Ming Lei wrote:
> > >  #define  SCSI_INLINE_PROT_SG_CNT  1
> > >  
> > > +#define  SCSI_INLINE_SG_CNT  2
> > 
> > So this patch inserts one kmalloc() and one kfree() call in the hot
> > path for every SCSI request with more than two elements in its
> > scatterlist? Isn't
> 
> Slab or its variants are designed for fast path, and NVMe PCI uses
> slab for allocating sg list in fast path too.

Actually, that's not really true  base kmalloc can do all sorts of
things including kick off reclaim so it's not really something we like
using in the fast path.  The only fast and safe kmalloc you can rely on
 in the fast path is GFP_ATOMIC which will fail quickly if no memory
can easily be found.  *However* the sg_table allocation functions are
all pool backed (lib/sg_pool.c), so they use the lightweight GFP_ATOMIC
mechanism for kmalloc initially coupled with a backing pool in case of
failure to ensure forward progress.

So, I think you're both right: you shouldn't simply use kmalloc, but
this implementation doesn't, it uses the sg_table allocation functions
which correctly control kmalloc to be lightweight and efficient and
able to make forward progress.

James

[GIT PULL] SCSI fixes for 5.1-rc5

2019-04-20 Thread James Bottomley

Three minor fixes: two obvious ones in drivers and a fix to the SG_IO
path to correctly return status on error.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Christoph Hellwig (1):
  scsi: aic7xxx: fix EISA support

Jaesoo Lee (1):
  scsi: core: set result when the command cannot be dispatched

Saurav Kashyap (1):
  Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"

And the diffstat:

 drivers/scsi/aic7xxx/aic7770_osm.c |  1 +
 drivers/scsi/aic7xxx/aic7xxx.h |  1 +
 drivers/scsi/aic7xxx/aic7xxx_osm.c | 10 --
 drivers/scsi/aic7xxx/aic7xxx_osm_pci.c |  1 +
 drivers/scsi/libfc/fc_rport.c  |  1 -
 drivers/scsi/scsi_lib.c|  6 +-
 6 files changed, 12 insertions(+), 8 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/aic7xxx/aic7770_osm.c 
b/drivers/scsi/aic7xxx/aic7770_osm.c
index 3d401d02c019..bdd177e3d762 100644
--- a/drivers/scsi/aic7xxx/aic7770_osm.c
+++ b/drivers/scsi/aic7xxx/aic7770_osm.c
@@ -91,6 +91,7 @@ aic7770_probe(struct device *dev)
ahc = ahc_alloc(&aic7xxx_driver_template, name);
if (ahc == NULL)
return (ENOMEM);
+   ahc->dev = dev;
error = aic7770_config(ahc, aic7770_ident_table + edev->id.driver_data,
   eisaBase);
if (error != 0) {
diff --git a/drivers/scsi/aic7xxx/aic7xxx.h b/drivers/scsi/aic7xxx/aic7xxx.h
index 5614921b4041..88b90f9806c9 100644
--- a/drivers/scsi/aic7xxx/aic7xxx.h
+++ b/drivers/scsi/aic7xxx/aic7xxx.h
@@ -943,6 +943,7 @@ struct ahc_softc {
 * Platform specific device information.
 */
ahc_dev_softc_t   dev_softc;
+   struct device *dev;
 
/*
 * Bus specific device information.
diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm.c 
b/drivers/scsi/aic7xxx/aic7xxx_osm.c
index 3c9c17450bb3..d5c4a0d23706 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm.c
@@ -860,8 +860,8 @@ int
 ahc_dmamem_alloc(struct ahc_softc *ahc, bus_dma_tag_t dmat, void** vaddr,
 int flags, bus_dmamap_t *mapp)
 {
-   *vaddr = pci_alloc_consistent(ahc->dev_softc,
- dmat->maxsize, mapp);
+   /* XXX: check if we really need the GFP_ATOMIC and unwind this mess! */
+   *vaddr = dma_alloc_coherent(ahc->dev, dmat->maxsize, mapp, GFP_ATOMIC);
if (*vaddr == NULL)
return ENOMEM;
return 0;
@@ -871,8 +871,7 @@ void
 ahc_dmamem_free(struct ahc_softc *ahc, bus_dma_tag_t dmat,
void* vaddr, bus_dmamap_t map)
 {
-   pci_free_consistent(ahc->dev_softc, dmat->maxsize,
-   vaddr, map);
+   dma_free_coherent(ahc->dev, dmat->maxsize, vaddr, map);
 }
 
 int
@@ -1123,8 +1122,7 @@ ahc_linux_register_host(struct ahc_softc *ahc, struct 
scsi_host_template *templa
 
host->transportt = ahc_linux_transport_template;
 
-   retval = scsi_add_host(host,
-   (ahc->dev_softc ? &ahc->dev_softc->dev : NULL));
+   retval = scsi_add_host(host, ahc->dev);
if (retval) {
printk(KERN_WARNING "aic7xxx: scsi_add_host failed\n");
scsi_host_put(host);
diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c 
b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
index 0fc14dac7070..717d8d1082ce 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
@@ -250,6 +250,7 @@ ahc_linux_pci_dev_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
}
}
ahc->dev_softc = pci;
+   ahc->dev = &pci->dev;
error = ahc_pci_config(ahc, entry);
if (error != 0) {
ahc_free(ahc);
diff --git a/drivers/scsi/libfc/fc_rport.c b/drivers/scsi/libfc/fc_rport.c
index dfba4921b265..5bf61431434b 100644
--- a/drivers/scsi/libfc/fc_rport.c
+++ b/drivers/scsi/libfc/fc_rport.c
@@ -2162,7 +2162,6 @@ static void fc_rport_recv_logo_req(struct fc_lport 
*lport, struct fc_frame *fp)
FC_RPORT_DBG(rdata, "Received LOGO request while in state %s\n",
 fc_rport_state(rdata));
 
-   rdata->flags &= ~FC_RP_STARTED;
fc_rport_enter_delete(rdata, RPORT_EV_STOP);
mutex_unlock(&rdata->rp_mutex);
kref_put(&rdata->kref, fc_rport_destroy);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 601b9f1de267..07dfc17d4824 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1706,8 +1706,12 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx 
*hctx,
ret = BLK_STS_DEV_RESOURCE;
break;
default:
+   if (unlikely(!scsi_device_online(sdev)))
+   scsi_req(req)->result = DID_NO_CONNECT << 16;
+   else
+

[GIT PULL] SCSI fixes for 5.1-rc4

2019-04-13 Thread James Bottomley

One obvious fix for a ciostor data corruption on error bug.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Varun Prakash (1):
  scsi: csiostor: fix missing data copy in csio_scsi_err_handler()

And the diffstat:

 drivers/scsi/csiostor/csio_scsi.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

With full diff below.

James

---

diff --git a/drivers/scsi/csiostor/csio_scsi.c 
b/drivers/scsi/csiostor/csio_scsi.c
index 462560b2855e..469d0bc9f5fe 100644
--- a/drivers/scsi/csiostor/csio_scsi.c
+++ b/drivers/scsi/csiostor/csio_scsi.c
@@ -1713,8 +1713,11 @@ csio_scsi_err_handler(struct csio_hw *hw, struct 
csio_ioreq *req)
}
 
 out:
-   if (req->nsge > 0)
+   if (req->nsge > 0) {
scsi_dma_unmap(cmnd);
+   if (req->dcopy && (host_status == DID_OK))
+   host_status = csio_scsi_copy_to_sgl(hw, req);
+   }
 
cmnd->result = (((host_status) << 16) | scsi_status);
cmnd->scsi_done(cmnd);

[GIT PULL] SCSI fixes for 5.1-rc3

2019-04-06 Thread James Bottomley

five small fixes.  Four in three drivers: qedi, lpfc and storvsc.  The
final one is labelled core, but merely adds a dh rdac entry for Lenovo
systems.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Colin Ian King (1):
  scsi: qedi: remove declaration of nvm_image from stack

James Smart (1):
  scsi: lpfc: Fix missing wakeups on abort threads

Michael Kelley (2):
  scsi: storvsc: Reduce default ring buffer size to 128 Kbytes
  scsi: storvsc: Fix calculation of sub-channel count

Xose Vazquez Perez (1):
  scsi: core: add new RDAC LENOVO/DE_Series device

With diffstat:

 drivers/scsi/lpfc/lpfc_scsi.c |  7 +++
 drivers/scsi/qedi/qedi_main.c |  7 ++-
 drivers/scsi/scsi_devinfo.c   |  1 +
 drivers/scsi/scsi_dh.c|  1 +
 drivers/scsi/storvsc_drv.c| 15 ---
 5 files changed, 19 insertions(+), 12 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index c98f264f1d83..a497b2c0cb79 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -3878,10 +3878,9 @@ lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct 
lpfc_iocbq *pIocbIn,
 * wake up the thread.
 */
spin_lock(&lpfc_cmd->buf_lock);
-   if (unlikely(lpfc_cmd->cur_iocbq.iocb_flag & LPFC_DRIVER_ABORTED)) {
-   lpfc_cmd->cur_iocbq.iocb_flag &= ~LPFC_DRIVER_ABORTED;
-   if (lpfc_cmd->waitq)
-   wake_up(lpfc_cmd->waitq);
+   lpfc_cmd->cur_iocbq.iocb_flag &= ~LPFC_DRIVER_ABORTED;
+   if (lpfc_cmd->waitq) {
+   wake_up(lpfc_cmd->waitq);
lpfc_cmd->waitq = NULL;
}
spin_unlock(&lpfc_cmd->buf_lock);
diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index e74a62448ba4..e5db9a9954dc 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -1392,10 +1392,8 @@ static void qedi_free_nvm_iscsi_cfg(struct qedi_ctx 
*qedi)
 
 static int qedi_alloc_nvm_iscsi_cfg(struct qedi_ctx *qedi)
 {
-   struct qedi_nvm_iscsi_image nvm_image;
-
qedi->iscsi_image = dma_alloc_coherent(&qedi->pdev->dev,
-  sizeof(nvm_image),
+  sizeof(struct 
qedi_nvm_iscsi_image),
   &qedi->nvm_buf_dma, GFP_KERNEL);
if (!qedi->iscsi_image) {
QEDI_ERR(&qedi->dbg_ctx, "Could not allocate NVM BUF.\n");
@@ -2236,14 +2234,13 @@ static void qedi_boot_release(void *data)
 static int qedi_get_boot_info(struct qedi_ctx *qedi)
 {
int ret = 1;
-   struct qedi_nvm_iscsi_image nvm_image;
 
QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
  "Get NVM iSCSI CFG image\n");
ret = qedi_ops->common->nvm_get_image(qedi->cdev,
  QED_NVM_IMAGE_ISCSI_CFG,
  (char *)qedi->iscsi_image,
- sizeof(nvm_image));
+ sizeof(struct 
qedi_nvm_iscsi_image));
if (ret)
QEDI_ERR(&qedi->dbg_ctx,
 "Could not get NVM image. ret = %d\n", ret);
diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c
index c4cbfd07b916..a08ff3bd6310 100644
--- a/drivers/scsi/scsi_devinfo.c
+++ b/drivers/scsi/scsi_devinfo.c
@@ -238,6 +238,7 @@ static struct {
{"NETAPP", "Universal Xport", "*", BLIST_NO_ULD_ATTACH},
{"LSI", "Universal Xport", "*", BLIST_NO_ULD_ATTACH},
{"ENGENIO", "Universal Xport", "*", BLIST_NO_ULD_ATTACH},
+   {"LENOVO", "Universal Xport", "*", BLIST_NO_ULD_ATTACH},
{"SMSC", "USB 2 HS-CF", NULL, BLIST_SPARSELUN | BLIST_INQUIRY_36},
{"SONY", "CD-ROM CDU-8001", NULL, BLIST_BORKEN},
{"SONY", "TSL", NULL, BLIST_FORCELUN},  /* DDS3 & DDS4 
autoloaders */
diff --git a/drivers/scsi/scsi_dh.c b/drivers/scsi/scsi_dh.c
index 5a58cbf3a75d..c14006ac98f9 100644
--- a/drivers/scsi/scsi_dh.c
+++ b/drivers/scsi/scsi_dh.c
@@ -75,6 +75,7 @@ static const struct scsi_dh_blist scsi_dh_blist[] = {
{"NETAPP", "INF-01-00", "rdac", },
{"LSI", "INF-01-00","rdac", },
{"ENGENIO", "INF-01-00","rdac", },
+   {"LENOVO", "DE_Series", "rdac", },
{NULL, NULL,NULL },
 };
 
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 84380bae20f1..8472de1007ff 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -385,7 +385,7 @@ enum storvsc_request_type {
  * This is the end of Protocol specific defines.
  */
 
-static int storvsc_ringbuffer_size = (256 * PAGE_SIZE);
+static int storvsc_ringbuffer_size = (128 * 1024);
 static u32 max_outstanding_req_per_channel;
 
 static in

Re: [PATCH 25/30] lpfc: Reduce memory footprint for lpfc_queue

2019-03-20 Thread James Bottomley

On Tue, 2019-03-19 at 20:35 -0700, James Smart wrote:
> > On Mar 19, 2019, at 6:02 PM, James Bottomley  > nPartnership.com> wrote:
> > 
> > On Tue, 2019-03-12 at 16:30 -0700, James Smart wrote:
> > > Currently the driver maintains a sideband structure which has a
> > > pointer for each queue element. However, at 8bytes a pointer, and
> > > up
> > > to 4k elements per queue, and 100's of queues, this can take up a
> > > lot
> > > of memory.
> > > 
> > > Convert the driver to using an access routine that calculates the
> > > element address based on it's index rather than using the pointer
> > > table.
> > 
> > We're getting a failure from the ppc builds according to linux-
> > next:
> > 
> > n file included from drivers/scsi/lpfc/lpfc_debugfs.c:46:
> > drivers/scsi/lpfc/lpfc_debugfs.c: In function
> > 'lpfc_idiag_queacc_write':
> > drivers/scsi/lpfc/lpfc_sli4.h:1083:14: error: inlining failed in
> > call to always_inline 'lpfc_sli4_qe': function body not available
> > inline void *lpfc_sli4_qe(struct lpfc_queue *, uint16_t);
> >  ^~~~
> > drivers/scsi/lpfc/lpfc_debugfs.c:4488:12: note: called from here
> >   pentry = lpfc_sli4_qe(pque, index);
> >^
> > In file included from drivers/scsi/lpfc/lpfc_debugfs.c:46:
> > drivers/scsi/lpfc/lpfc_sli4.h:1083:14: error: inlining failed in
> > call to always_inline 'lpfc_sli4_qe': function body not available
> > inline void *lpfc_sli4_qe(struct lpfc_queue *, uint16_t);
> >  ^~~~
> > drivers/scsi/lpfc/lpfc_debugfs.c:4488:12: note: called from here
> >   pentry = lpfc_sli4_qe(pque, index);
> >^
> > 
> > You can't declare a function inline in a header if it doesn't have
> > a
> > body available to the compiler.  So realistically you either don't
> > declare it inline or you make it a static inline in the header.  I
> > think the latter applies in this case, so this should be the fix
> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/lpfc/lpfc_sli.c
> > b/drivers/scsi/lpfc/lpfc_sli.c
> > index 6fc9ef13..d6ea0c473ed7 100644
> > --- a/drivers/scsi/lpfc/lpfc_sli.c
> > +++ b/drivers/scsi/lpfc/lpfc_sli.c
> > @@ -14574,12 +14574,6 @@ lpfc_sli4_queue_alloc(struct lpfc_hba
> > *phba, uint32_t page_size,
> > return NULL;
> > }
> > 
> > -inline void *lpfc_sli4_qe(struct lpfc_queue *q, uint16_t idx)
> > -{
> > -   return q->q_pgs[idx / q->entry_cnt_per_pg] +
> > -   (q->entry_size * (idx % q->entry_cnt_per_pg));
> > -}
> > -
> > /**
> >  * lpfc_dual_chute_pci_bar_map - Map pci base address register to
> > host memory
> >  * @phba: HBA structure that indicates port to create a queue on.
> > diff --git a/drivers/scsi/lpfc/lpfc_sli4.h
> > b/drivers/scsi/lpfc/lpfc_sli4.h
> > index bd5b5c3de35e..20bc6d3d0653 100644
> > --- a/drivers/scsi/lpfc/lpfc_sli4.h
> > +++ b/drivers/scsi/lpfc/lpfc_sli4.h
> > @@ -1080,4 +1080,8 @@ int lpfc_sli4_post_status_check(struct
> > lpfc_hba *);
> > uint8_t lpfc_sli_config_mbox_subsys_get(struct lpfc_hba *,
> > LPFC_MBOXQ_t *);
> > uint8_t lpfc_sli_config_mbox_opcode_get(struct lpfc_hba *,
> > LPFC_MBOXQ_t *);
> > void lpfc_sli4_ras_dma_free(struct lpfc_hba *phba);
> > -inline void *lpfc_sli4_qe(struct lpfc_queue *, uint16_t);
> > +static inline void *lpfc_sli4_qe(struct lpfc_queue *q, uint16_t
> > idx)
> > +{
> > +   return q->q_pgs[idx / q->entry_cnt_per_pg] +
> > +   (q->entry_size * (idx % q->entry_cnt_per_pg));
> > +}
> 
> 
> Agree. Thank You James.

So there are a couple of other instances you can fix at your leisure:
they're not causing immediate linux-next problems because the body of
they're only apparently used within one file so the body is available,
but if the use expands we'll get the same problem:

lpfc_sli4.h:inline void lpfc_sli4_eq_clr_intr(struct lpfc_queue *);
lpfc_sli4.h:inline void lpfc_sli4_if6_eq_clr_intr(struct lpfc_queue *q);

James

Re: [PATCH 25/30] lpfc: Reduce memory footprint for lpfc_queue

2019-03-19 Thread James Bottomley

On Tue, 2019-03-12 at 16:30 -0700, James Smart wrote:
> Currently the driver maintains a sideband structure which has a
> pointer for each queue element. However, at 8bytes a pointer, and up
> to 4k elements per queue, and 100's of queues, this can take up a lot
> of memory.
> 
> Convert the driver to using an access routine that calculates the
> element address based on it's index rather than using the pointer
> table.

We're getting a failure from the ppc builds according to linux-next:

n file included from drivers/scsi/lpfc/lpfc_debugfs.c:46:
drivers/scsi/lpfc/lpfc_debugfs.c: In function 'lpfc_idiag_queacc_write':
drivers/scsi/lpfc/lpfc_sli4.h:1083:14: error: inlining failed in call to 
always_inline 'lpfc_sli4_qe': function body not available
 inline void *lpfc_sli4_qe(struct lpfc_queue *, uint16_t);
  ^~~~
drivers/scsi/lpfc/lpfc_debugfs.c:4488:12: note: called from here
   pentry = lpfc_sli4_qe(pque, index);
^
In file included from drivers/scsi/lpfc/lpfc_debugfs.c:46:
drivers/scsi/lpfc/lpfc_sli4.h:1083:14: error: inlining failed in call to 
always_inline 'lpfc_sli4_qe': function body not available
 inline void *lpfc_sli4_qe(struct lpfc_queue *, uint16_t);
  ^~~~
drivers/scsi/lpfc/lpfc_debugfs.c:4488:12: note: called from here
   pentry = lpfc_sli4_qe(pque, index);
^

You can't declare a function inline in a header if it doesn't have a
body available to the compiler.  So realistically you either don't
declare it inline or you make it a static inline in the header.  I
think the latter applies in this case, so this should be the fix

James

---

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 6fc9ef13..d6ea0c473ed7 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -14574,12 +14574,6 @@ lpfc_sli4_queue_alloc(struct lpfc_hba *phba, uint32_t 
page_size,
return NULL;
 }
 
-inline void *lpfc_sli4_qe(struct lpfc_queue *q, uint16_t idx)
-{
-   return q->q_pgs[idx / q->entry_cnt_per_pg] +
-   (q->entry_size * (idx % q->entry_cnt_per_pg));
-}
-
 /**
  * lpfc_dual_chute_pci_bar_map - Map pci base address register to host memory
  * @phba: HBA structure that indicates port to create a queue on.
diff --git a/drivers/scsi/lpfc/lpfc_sli4.h b/drivers/scsi/lpfc/lpfc_sli4.h
index bd5b5c3de35e..20bc6d3d0653 100644
--- a/drivers/scsi/lpfc/lpfc_sli4.h
+++ b/drivers/scsi/lpfc/lpfc_sli4.h
@@ -1080,4 +1080,8 @@ int lpfc_sli4_post_status_check(struct lpfc_hba *);
 uint8_t lpfc_sli_config_mbox_subsys_get(struct lpfc_hba *, LPFC_MBOXQ_t *);
 uint8_t lpfc_sli_config_mbox_opcode_get(struct lpfc_hba *, LPFC_MBOXQ_t *);
 void lpfc_sli4_ras_dma_free(struct lpfc_hba *phba);
-inline void *lpfc_sli4_qe(struct lpfc_queue *, uint16_t);
+static inline void *lpfc_sli4_qe(struct lpfc_queue *q, uint16_t idx)
+{
+   return q->q_pgs[idx / q->entry_cnt_per_pg] +
+   (q->entry_size * (idx % q->entry_cnt_per_pg));
+}

Re: [PATCH] scsi_debug: fix write_same with virtual_gb problem

2019-03-19 Thread James Bottomley

On Tue, 2019-03-19 at 15:37 -0400, Martin K. Petersen wrote:
> Bart,
> 
> > Sorry, I was looking at the wrong branch. The patch is in Linus'
> > master but it seems like that patch is not yet in your 5.1/scsi-
> > queue branch?
> 
> It wouldn't be as 5.1/scsi-queue is based on 5.0-rc1 and the fix went
> into 5.0-rc5.

I tend to manage the merges and conflicts, so if you want everything
for both scsi-fixes and scsi-queue, you need to look at my for-next
branch.

James

[GIT PULL] first round of SCSI updates for the 5.0+ merge window

2019-03-06 Thread James Bottomley

This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
hisi_sas, target/iscsi and target/core.  Additionally Christoph
refactored gdth as part of the dma changes.  The major mid-layer change
this time is the removal of bidi commands and with them the whole of
the osd/exofs driver and filesystem.  This is a major simplification
for block and mq in particular.

Additionally, there are four existing and one potential conflict:

- The unpulled block tree updates the removed osd driver

- 750afb08ca71 cross-tree: phase out dma_zalloc_coherent() conflicts
with the arcmsr update.  The fix is simple: go with our version

The remaining conflicts are internal between updates we supplied in our
fixes branches and changes made to the misc branch:

- hisi_sas_v3_hw.c: This is the nastiest: the fix to move the
protection parameters (7bb25a89aad2) conflicts with the DIX feature
addition (b3cce125cb1e).  The resolution is to make sure the DIX
enablement follows the move of the prot_mask check in
hisi_sas_v3_probe().

- lpfc_nvme.c: the fix to avoid hang/use after free (7961cba6f7d8)
conflicts with moving the stats to HW queue structures (4c47efc140fa). 
The resolution a simple combination of both patches.

- qla_init.c: The fix for the panic after free (388a49959ee4) conflicts
with move marker behind QPair (9eb9c6dc3ab0).  The fix is a straight
combination plus the transformation of sp->fcport->loop_id to fcport-
>loop_id to preserve the panic fix.

I've put the resolution in the linus-resolved branch of the scsi tree
for you to see and also attached the --cc diff below.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc

The short changelog is:

Anil Gurumurthy (1):
  scsi: qla2xxx: Add support for setting port speed

Avri Altman (4):
  scsi: ufs-bsg: Allow reading descriptors
  scsi: ufs: Allow reading descriptor via raw upiu
  scsi: ufs-bsg: Change the calling convention for write descriptor
  scsi: clean obsolete return values of eh_timed_out

Bart Van Assche (27):
  scsi: core: Move resid from scsi_data_buffer to scsi_cmnd
  scsi: sd: Remove superfluous residual assignments
  scsi: uas: Use scsi_[gs]et_resid() where appropriate
  scsi: scsi_debug: Use scsi_[gs]et_resid() where appropriate
  scsi: libiscsi: Use scsi_[gs]et_resid() where appropriate
  scsi: scsi_debug: Fix a recently introduced regression
  scsi: target/iscsi: Simplify iscsit_handle_text_cmd()
  scsi: target/iscsi: Simplify iscsit_dump_data_payload()
  scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
  scsi: target/iscsi: Rename a function and a function pointer
  scsi: target/iscsi: Fix spelling of "unsolicited"
  scsi: target/iscsi: Convert comments about locking into runtime checks
  scsi: target/iscsi: Remove an incorrect comment
  scsi: RDMA/srpt: Fix a credit leak for aborted commands
  scsi: RDMA/srpt: Rework I/O context allocation
  scsi: RDMA/srpt: Fix handling of TMF submission failure
  scsi: RDMA/srpt: Fix handling of command / TMF submission failure
  scsi: target/core: Add target_send_busy()
  scsi: target/core: Inline transport_lun_remove_cmd()
  scsi: target/core: Simplify the LUN RESET implementation
  scsi: target/core: Remove several state tests from the TMF code
  scsi: target/core: Remove the write_pending_status() callback function
  scsi: sd: Protect against READ(6) or WRITE(6) with zero block transfer 
length
  scsi: libsas: Remove scsi_to_u32()
  scsi: core: Remove an atomic instruction from the hot path
  scsi: sd: Rename 'SCpnt' into 'cmd'
  scsi: sd: Remove a local variable

Benjamin Block (1):
  scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c

Bill Kuzeja (1):
  scsi: qla2xxx: Move debug messages before sending srb preventing panic

Chengguang Xu (1):
  scsi: ufs: fix a typo in comment

Ching Huang (15):
  scsi: arcmsr: Update driver version to v1.40.00.10-20190116
  scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2
  scsi: arcmsr: Use dma_alloc_coherent to replace dma_zalloc_coherent
  scsi: arcmsr: Update driver version to v1.40.00.10-20181217
  scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B
  scsi: arcmsr: Separate 'set dma mask' as a function
  scsi: arcmsr: Add an option of set dma_mask_64 for ACB_ADAPTER_TYPE_A
  scsi: arcmsr: Update ACB_ADAPTER_TYPE_D for >4GB ccb addressing
  scsi: arcmsr: Update ACB_ADAPTER_TYPE_C for >4GB ccb addressing
  scsi: arcmsr: Update ACB_ADAPTER_TYPE_B for >4GB ccb addressing
  scsi: arcmsr: Update ACB_ADAPTER_TYPE_A for >4GB ccb addressing
  scsi: arcmsr: Update arcmsr_alloc_ccb_pool for ccb buffer address above 
4GB
  scsi: arcmsr: Merge arcmsr_alloc_io_queue to arcmsr_alloc_ccb_pool
  scsi: arcmsr: Rename arcmsr_free_mu to arcmsr_free_io_queue
  scsi: arcmsr: Rename acb str

[GIT PULL] SCSI fixes for 5.0-rc7

2019-03-02 Thread James Bottomley

Nine small fixes.  The resume fix is a cosmetic removal of a warning
with an incorrect condition causing it to alarm people wrongly.  The
other eight patches correct a thinko in Christoph Hellwig's DMA
conversion series.  Without it all these drivers end up with 32 bit DMA
masks meaning they bounce any page over 4GB before sending it to the
controller.  Nowadays, even laptops mostly have memory above 4GB, so
this can lead to significant performance degradation with all the
bouncing.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Bart Van Assche (1):
  scsi: core: Avoid that system resume triggers a kernel warning

Hannes Reinecke (8):
  scsi: hptiop: fix calls to dma_set_mask()
  scsi: hisi_sas: fix calls to dma_set_mask_and_coherent()
  scsi: csiostor: fix calls to dma_set_mask_and_coherent()
  scsi: bfa: fix calls to dma_set_mask_and_coherent()
  scsi: aic94xx: fix calls to dma_set_mask_and_coherent()
  scsi: 3w-sas: fix calls to dma_set_mask_and_coherent()
  scsi: 3w-9xxx: fix calls to dma_set_mask_and_coherent()
  scsi: lpfc: fix calls to dma_set_mask_and_coherent()

And the diffstat:

 drivers/scsi/3w-9xxx.c | 14 +-
 drivers/scsi/3w-sas.c  | 12 
 drivers/scsi/aic94xx/aic94xx_init.c|  8 +---
 drivers/scsi/bfa/bfad.c| 18 +-
 drivers/scsi/csiostor/csio_init.c  |  7 +--
 drivers/scsi/hisi_sas/hisi_sas_main.c  |  8 ++--
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c |  8 +---
 drivers/scsi/hptiop.c  | 10 +++---
 drivers/scsi/lpfc/lpfc_init.c  | 19 ---
 drivers/scsi/scsi_lib.c|  1 -
 10 files changed, 70 insertions(+), 35 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c
index a3c20e3a8b7c..3337b1e80412 100644
--- a/drivers/scsi/3w-9xxx.c
+++ b/drivers/scsi/3w-9xxx.c
@@ -2009,7 +2009,7 @@ static int twa_probe(struct pci_dev *pdev, const struct 
pci_device_id *dev_id)
struct Scsi_Host *host = NULL;
TW_Device_Extension *tw_dev;
unsigned long mem_addr, mem_len;
-   int retval = -ENODEV;
+   int retval;
 
retval = pci_enable_device(pdev);
if (retval) {
@@ -2020,8 +2020,10 @@ static int twa_probe(struct pci_dev *pdev, const struct 
pci_device_id *dev_id)
pci_set_master(pdev);
pci_try_set_mwi(pdev);
 
-   if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) ||
-   dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
+   retval = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+   if (retval)
+   retval = dma_set_mask_and_coherent(&pdev->dev, 
DMA_BIT_MASK(32));
+   if (retval) {
TW_PRINTK(host, TW_DRIVER, 0x23, "Failed to set dma mask");
retval = -ENODEV;
goto out_disable_device;
@@ -2240,8 +2242,10 @@ static int twa_resume(struct pci_dev *pdev)
pci_set_master(pdev);
pci_try_set_mwi(pdev);
 
-   if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) ||
-   dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
+   retval = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+   if (retval)
+   retval = dma_set_mask_and_coherent(&pdev->dev, 
DMA_BIT_MASK(32));
+   if (retval) {
TW_PRINTK(host, TW_DRIVER, 0x40, "Failed to set dma mask during 
resume");
retval = -ENODEV;
goto out_disable_device;
diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c
index e8f5f7c63190..2d15b878bd94 100644
--- a/drivers/scsi/3w-sas.c
+++ b/drivers/scsi/3w-sas.c
@@ -1572,8 +1572,10 @@ static int twl_probe(struct pci_dev *pdev, const struct 
pci_device_id *dev_id)
pci_set_master(pdev);
pci_try_set_mwi(pdev);
 
-   if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) ||
-   dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
+   retval = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+   if (retval)
+   retval = dma_set_mask_and_coherent(&pdev->dev, 
DMA_BIT_MASK(32));
+   if (retval) {
TW_PRINTK(host, TW_DRIVER, 0x18, "Failed to set dma mask");
retval = -ENODEV;
goto out_disable_device;
@@ -1804,8 +1806,10 @@ static int twl_resume(struct pci_dev *pdev)
pci_set_master(pdev);
pci_try_set_mwi(pdev);
 
-   if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) ||
-   dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
+   retval = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+   if (retval)
+   retval = dma_set_mask_and_coherent(&pdev->dev, 
DMA_BIT_MASK(32));
+   if (retval) {
TW_PRINTK(host, TW_DRIVER, 0x25,

Re: [PATCH v2] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-03-01 Thread James Bottomley

On Fri, 2019-03-01 at 15:43 +, Erwan Velu wrote:
> Le 01/03/2019 à 16:26, James Bottomley a écrit :
> > [...]
> > Shouldn't this be a variant of sdev/scmd_printk?  Otherwise it
> > tells
> > you what disk in the array terms is the problem but not what device
> > in
> > your actual system is affected.
> 
> Hey James,
> 
> My initial take on that was that pqi_take_device_offline(), which is 
> called just after, will print the "re-scanning " message with the
> same 
> format.
> 
> As they will be both printed in the same error context and one after
> the 
> other, I though that would make sense to represent the same
> information 
> to ease the reading like cause -> consequence.
> 
> As the message is about the LUN itself, which is reported faulty, I 
> though it would worth reporting the info that way.
> 
> Shall I consider printing also the disk name in addition ?

I was thinking just

if (printk_ratelimit())
scmd_printk(KERN_ERR, scmd, "received 'logical unit failure' from 
controller for scsi %d:%d:%d:%d\n", ...

That will give all the necessary information

James

Re: [PATCH v2] scsi: smartpqi_init: Reporting 'logical unit failure'

2019-03-01 Thread James Bottomley

On Fri, 2019-03-01 at 15:58 +0100, Erwan Velu wrote:
> + dev_err_ratelimited(&ctrl_info->pci_dev-
> >dev, "received 'logical unit failure' from controller for scsi
> %d:%d:%d:%d\n",
> + ctrl_info-
> >scsi_host->host_no, device->bus,
> + device-
> >target, device->lun);

Shouldn't this be a variant of sdev/scmd_printk?  Otherwise it tells
you what disk in the array terms is the problem but not what device in
your actual system is affected.

James

[GIT PULL] SCSI fixes for 5.0-rc7

2019-02-22 Thread James Bottomley

Four small fixes: three in drivers and one in the core.  The core fix
is also minor in scope since the bug it fixes is only known to affect
systems using SCSI reservations.  Of the driver bugs, the libsas one is
 the most major because it can lead to multiple disks on the same
expander not being exposed.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Anoob Soman (1):
  scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task

John Garry (1):
  scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached

Martin Wilck (1):
  scsi: core: reset host byte in DID_NEXUS_FAILURE case

Masato Suzuki (1):
  scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation

And the diffstat:

 drivers/scsi/libiscsi.c| 6 ++
 drivers/scsi/libsas/sas_expander.c | 2 ++
 drivers/scsi/scsi_lib.c| 1 +
 drivers/scsi/sd_zbc.c  | 8 +---
 4 files changed, 14 insertions(+), 3 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index b8d325ce8754..120fc520f27a 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1459,7 +1459,13 @@ static int iscsi_xmit_task(struct iscsi_conn *conn)
if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx))
return -ENODATA;
 
+   spin_lock_bh(&conn->session->back_lock);
+   if (conn->task == NULL) {
+   spin_unlock_bh(&conn->session->back_lock);
+   return -ENODATA;
+   }
__iscsi_get_task(task);
+   spin_unlock_bh(&conn->session->back_lock);
spin_unlock_bh(&conn->session->frwd_lock);
rc = conn->session->tt->xmit_task(task);
spin_lock_bh(&conn->session->frwd_lock);
diff --git a/drivers/scsi/libsas/sas_expander.c 
b/drivers/scsi/libsas/sas_expander.c
index 17eb4185f29d..f21c93bbb35c 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -828,6 +828,7 @@ static struct domain_device *sas_ex_discover_end_dev(
rphy = sas_end_device_alloc(phy->port);
if (!rphy)
goto out_free;
+   rphy->identify.phy_identifier = phy_id;
 
child->rphy = rphy;
get_device(&rphy->dev);
@@ -854,6 +855,7 @@ static struct domain_device *sas_ex_discover_end_dev(
 
child->rphy = rphy;
get_device(&rphy->dev);
+   rphy->identify.phy_identifier = phy_id;
sas_fill_in_rphy(child, rphy);
 
list_add_tail(&child->disco_list_node, 
&parent->port->disco_list);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6d65ac584eba..f8d51c3d5582 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -655,6 +655,7 @@ static blk_status_t scsi_result_to_blk_status(struct 
scsi_cmnd *cmd, int result)
set_host_byte(cmd, DID_OK);
return BLK_STS_TARGET;
case DID_NEXUS_FAILURE:
+   set_host_byte(cmd, DID_OK);
return BLK_STS_NEXUS;
case DID_ALLOC_FAILURE:
set_host_byte(cmd, DID_OK);
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index fff86940388b..a340af797a85 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -142,10 +142,12 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t 
sector,
return -EOPNOTSUPP;
 
/*
-* Get a reply buffer for the number of requested zones plus a header.
-* For ATA, buffers must be aligned to 512B.
+* Get a reply buffer for the number of requested zones plus a header,
+* without exceeding the device maximum command size. For ATA disks,
+* buffers must be aligned to 512B.
 */
-   buflen = roundup((nrz + 1) * 64, 512);
+   buflen = min(queue_max_hw_sectors(disk->queue) << 9,
+roundup((nrz + 1) * 64, 512));
buf = kmalloc(buflen, gfp_mask);
if (!buf)
return -ENOMEM;

Re: [PATCH] scsi: mvumi: fix 32 bit shift of a 32 bit unsigned int

2019-02-18 Thread James Bottomley

On Mon, 2019-02-18 at 12:37 +0300, Dan Carpenter wrote:
> On Sat, Feb 16, 2019 at 05:27:16PM +0100, Walter Harms wrote:
> > Am 16.02.2019 15:44, schrieb Colin King:
> > > From: Colin Ian King 
> > > 
> > > Currently m_sg->baseaddr_h (a 32 bit unsigned int) is being
> > > shifted by a
> > > total of 32 bits; this always produces a 0 result.  Fix this by
> > > casting
> > > it to a dma_addr_t (a 64 bit unsigned int) before performing the
> > > shift.
> > > 
> > > Detected by CoverityScan, CID#147270 ("Operands don't affect
> > > result")
> > > 
> > > Fixes: f0c568a478f0 ("[SCSI] mvumi: Add Marvell UMI driver")
> > > Signed-off-by: Colin Ian King 
> > > ---
> > >  drivers/scsi/mvumi.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c
> > > index 36f64205ecfa..d3582accfd09 100644
> > > --- a/drivers/scsi/mvumi.c
> > > +++ b/drivers/scsi/mvumi.c
> > > @@ -313,7 +313,7 @@ static void mvumi_delete_internal_cmd(struct
> > > mvumi_hba
> > > *mhba,
> > >   sgd_getsz(mhba, m_sg, size);
> > >  
> > >   phy_addr = (dma_addr_t) m_sg->baseaddr_l 
> > > |
> > > - (dma_addr_t) ((m_sg->baseaddr_h
> > > << 16) << 16);
> > > + (((dma_addr_t) m_sg->baseaddr_h
> > > << 16) << 16);
> > >  
> > >   dma_free_coherent(&mhba->pdev->dev,
> > > size, cmd->data_buf,
> > >   
> > > phy_addr);
> > 
> > i would suggest to try a version with less casts to make it more
> > readable
> > like this untested suggestion:
> > 
> > phy_addr =(m_sg->baseaddr_h << 16)| m_sg->baseaddr_l;
> > phy_addr <<= 16;
> > 
> 
> That would be a behavior change but it also might be a bugfix?  Why
> doesn't the code just do:
> 
>   phy_addr = ((dma_addr_t)m_sg->baseaddr_h << 32) | m_sg-
> >baseaddr_l;
> 
> (Probably they broke it up into two shifts to silence a GCC warning
> that the shift was wrong because of the missing cast?)

No because dma_addr_t can be 32 bits and the warning would then always
appear on some builds.  The << 16 << 16 makes sure it doesn't.

James

[GIT PULL] SCSI fixes for 5.0-rc

2019-02-15 Thread James Bottomley

Two fairly small fixes: the qla one is a panic inducing use after free
and the entropy fix may seem minor but it has had huge userspace impact
thanks to an unrelated change in openssl that causes sshd to refuse
logins until it has enough entropy for the session keys, which causes
tens of minutes delay before the affected systems allow logins after
reboot.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Bill Kuzeja (1):
  scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmd

James Bottomley (1):
  scsi: sd: fix entropy gathering for most rotational disks

And the diffstat:

 drivers/scsi/qla2xxx/qla_init.c |  4 ++--
 drivers/scsi/sd.c   | 12 +---
 2 files changed, 11 insertions(+), 5 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 364bb52ed2a6..109587e62983 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -1785,13 +1785,13 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, 
uint32_t lun,
 
/* Issue Marker IOCB */
qla2x00_marker(vha, vha->hw->req_q_map[0],
-   vha->hw->rsp_q_map[0], sp->fcport->loop_id, lun,
+   vha->hw->rsp_q_map[0], fcport->loop_id, lun,
flags == TCF_LUN_RESET ? MK_SYNC_ID_LUN : MK_SYNC_ID);
}
 
 done_free_sp:
sp->free(sp);
-   sp->fcport->flags &= ~FCF_ASYNC_SENT;
+   fcport->flags &= ~FCF_ASYNC_SENT;
 done:
return rval;
 }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index b2da8a00ec33..5464d467e23e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2951,9 +2951,6 @@ static void sd_read_block_characteristics(struct 
scsi_disk *sdkp)
if (rot == 1) {
blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q);
-   } else {
-   blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
-   blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
}
 
if (sdkp->device->type == TYPE_ZBC) {
@@ -3090,6 +3087,15 @@ static int sd_revalidate_disk(struct gendisk *disk)
if (sdkp->media_present) {
sd_read_capacity(sdkp, buffer);
 
+   /*
+* set the default to rotational.  All non-rotational devices
+* support the block characteristics VPD page, which will
+* cause this to be updated correctly and any device which
+* doesn't support it should be treated as rotational.
+*/
+   blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
+   blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
+
if (scsi_device_supports_vpd(sdp)) {
sd_read_block_provisioning(sdkp);
sd_read_block_limits(sdkp);

Re: [PATCH -next] scsi: lpfc: Remove set but not used variable 'phys_id'

2019-02-14 Thread James Bottomley

On Thu, 2019-02-14 at 13:19 -0800, James Smart wrote:
> 
> On 2/14/2019 11:39 AM, James Bottomley wrote:
> > On Thu, 2019-02-14 at 10:52 -0800, James Smart wrote:
> > > On 2/13/2019 5:51 PM, YueHaibing wrote:
> > > > Fixes gcc '-Wunused-but-set-variable' warning:
> > > > 
> > > > drivers/scsi/lpfc/lpfc_init.c: In function
> > > > 'lpfc_cpu_affinity_check':
> > > > drivers/scsi/lpfc/lpfc_init.c:10599:19: warning:
> > > >variable 'phys_id' set but not used [-Wunused-but-set-
> > > > variable]
> > > > 
> > > > It never used since introduction in commit 6a828b0f6192 ("scsi:
> > > > lpfc: Support
> > > > non-uniform allocation of MSIX vectors to hardware queues")
> > > > 
> > > > Signed-off-by: YueHaibing 
> > > > ---
> > > >drivers/scsi/lpfc/lpfc_init.c | 3 +--
> > > >1 file changed, 1 insertion(+), 2 deletions(-)
> > > > 
> > > > 
> > > 
> > > Looks fine. Thanks
> > > 
> > > Signed-off-by:   James Smart  
> > 
> > Under the DCO this can't be a Signed-off-by tag: signoffs track the
> > patch transmission path under the DCO, so unless you send it you
> > can't
> > add your signoff.
> > 
> > If you just want Martin to apply it now, and you don't want to
> > gather and resend it with your other lpfc patches, I think the tag
> > you want is Acked-by.
> > 
> > James
> 
> I've been told multiple answers on which way to reply over the years.
> My initial position followed your statement, but a later rebuke
> (would have to look hard to find it)

If you could, that would help me find and correct the source ...

>  told me as maintainer that I should be doing something different. 
> I'll go back to the DCO definitions and follow
> those.

Great, thanks.  Our only tag with formal requirements from the DCO is
Signed-off-by: everything else is undefined by the DCO (and thus more
prone to being argued over and having more per-subsystem meanings that
lead to irreconcilable differences between subsystem use).

James


> Thanks for cleaning it up Martin.
> 
> -- james
>

Re: [PATCH -next] scsi: lpfc: Remove set but not used variable 'phys_id'

2019-02-14 Thread James Bottomley

On Thu, 2019-02-14 at 10:52 -0800, James Smart wrote:
> 
> On 2/13/2019 5:51 PM, YueHaibing wrote:
> > Fixes gcc '-Wunused-but-set-variable' warning:
> > 
> > drivers/scsi/lpfc/lpfc_init.c: In function
> > 'lpfc_cpu_affinity_check':
> > drivers/scsi/lpfc/lpfc_init.c:10599:19: warning:
> >   variable 'phys_id' set but not used [-Wunused-but-set-variable]
> > 
> > It never used since introduction in commit 6a828b0f6192 ("scsi:
> > lpfc: Support
> > non-uniform allocation of MSIX vectors to hardware queues")
> > 
> > Signed-off-by: YueHaibing 
> > ---
> >   drivers/scsi/lpfc/lpfc_init.c | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > 
> 
> Looks fine. Thanks
> 
> Signed-off-by:   James Smart  

Under the DCO this can't be a Signed-off-by tag: signoffs track the
patch transmission path under the DCO, so unless you send it you can't
add your signoff.

If you just want Martin to apply it now, and you don't want to gather
and resend it with your other lpfc patches, I think the tag you want is
Acked-by.

James

[PATCH] scsi: sd: fix entropy gathering for most rotational disks

2019-02-12 Thread James Bottomley

The problem is that the default for MQ is not to gather entropy,
whereas the default for the legacy queue was always to gather it.  The
original attempt to fix entropy gathering for rotational disks under MQ
added an else branch in sd_read_block_characteristics(). 
Unfortunately, the entire check isn't reached if the device has no
characteristics VPD page.  Since this page was only introduced in SBC-3 
and its optional anyway, most less expensive rotational disks don't
have one, meaning they all stopped gathering entropy when we made MQ
the default.  In a wholly unrelated change, openssl and openssh won't
function until the random number generator is initialised, meaning lots
of people have been seeing large delays before they could log into
systems with default MQ kernels due to this lack of entropy, because it
now can take tens of minutes to initialise the kernel random number
generator.

The fix is to set the non-rotational and add-randomness flags
unconditionally early on in the disk initialization path, so they can
be reset only if the device actually reports being non-rotational via
the VPD page.

Reported-by: Mikael Pettersson 
Fixes: 83e32a591077 ("scsi: sd: Contribute to randomness when running 
rotational device")
Cc: sta...@vger.kernel.org
Signed-off-by: James Bottomley 

---
I updated this slightly over the original proposal so we set the flags
even if the device doesn't have any VPD pages, so it should work for
very old disks.

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d0a980915801..3b8093c48eba 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2971,9 +2971,6 @@ static void sd_read_block_characteristics(struct 
scsi_disk *sdkp)
if (rot == 1) {
blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q);
-   } else {
-   blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
-   blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
}
 
if (sdkp->device->type == TYPE_ZBC) {
@@ -3110,6 +3107,15 @@ static int sd_revalidate_disk(struct gendisk *disk)
if (sdkp->media_present) {
sd_read_capacity(sdkp, buffer);
 
+   /*
+* set the default to rotational.  All non-rotational devices
+* support the block characteristics VPD page, which will
+* cause this to be updated correctly and any device which
+* doesn't support it should be treated as rotational.
+*/
+   blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
+   blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
+
if (scsi_device_supports_vpd(sdp)) {
sd_read_block_provisioning(sdkp);
sd_read_block_limits(sdkp);

Re: [5.0-rc5 regression] "scsi: kill off the legacy IO path" causes 5 minute delay during boot on Sun Blade 2500

2019-02-12 Thread James Bottomley

On Mon, 2019-02-11 at 19:50 -0700, Jens Axboe wrote:
> On 2/11/19 7:13 PM, James Bottomley wrote:
> > On Mon, 2019-02-11 at 09:31 -0700, Jens Axboe wrote:
> > > On 2/11/19 9:28 AM, James Bottomley wrote:
> > > > On Mon, 2019-02-11 at 08:46 -0700, Jens Axboe wrote:
> > > > > On 2/11/19 8:42 AM, James Bottomley wrote:
> > > > > > On Mon, 2019-02-11 at 08:28 -0700, Jens Axboe wrote:
> > > > > > > On 2/11/19 8:25 AM, James Bottomley wrote:
> > > > > > > > On Sun, 2019-02-10 at 09:35 -0700, Jens Axboe wrote:
> > > > > > > > > On 2/10/19 9:25 AM, James Bottomley wrote:
> > > > 
> > > > [...]
> > > > > > > > > > That check wasn't changed by the code removal.
> > > > > > > > > 
> > > > > > > > > As I said above, for sd. This isn't true for non-
> > > > > > > > > disks.
> > > > > > > > 
> > > > > > > > Yes, but the behaviour above doesn't change across a
> > > > > > > > switch
> > > > > > > > to MQ, so I don't quite understand how it bisects back
> > > > > > > > to
> > > > > > > > that change.  If we're not gathering entropy for the
> > > > > > > > device
> > > > > > > > now, we wouldn't have been before the switch, so the
> > > > > > > > entropy characteristics shouldn't have changed.
> > > > > > > 
> > > > > > > But it does, as I also wrote in that first email. The
> > > > > > > legacy
> > > > > > > queue flags had QUEUE_FLAG_ADD_RANDOM set by default, the
> > > > > > > MQ
> > > > > > > ones do not. Hence any non-sd device would previously
> > > > > > > ALWAYS
> > > > > > > have ADD_RANDOM set, now none of them do. Also see the
> > > > > > > patch
> > > > > > > I sent.
> > > > > > 
> > > > > > So your theory is that the disk in question never gets to
> > > > > > the
> > > > > > rotational check?  because the check will clear the flag if
> > > > > > it's non-rotational and set it if it's not, so the default
> > > > > > state of the flag shouldn't matter.
> > > > > 
> > > > > No, my point is about non-disks, devices that aren't driven
> > > > > by
> > > > > sd. The behavior for sd hasn't changed, as it sets/clears it
> > > > > unconditionally. 
> > > > 
> > > > I agree, but I don't think any of them were significant entropy
> > > > contributors before: things like nvme have always been outside
> > > > of
> > > > this and sr and st don't really contribute much to the seek
> > > > load
> > > > during boot because they're probed but not used by the boot
> > > > sequence, so I can't see how they would cause this
> > > > behaviour.  I
> > > > suppose it could be target probing, but even that seems
> > > > unlikely
> > > > because it should be dwarfed by the number of root disk reads
> > > > during boot.
> > > > 
> > > > For the rng to take an additional 5 minutes to initialize, we
> > > > must
> > > > have lost a significant entropy source somewhere.
> > > 
> > > I agree it's not a significant amount of entropy, but even just
> > > one
> > > bit could mean a long stall if that put us over the edge of just
> > > not
> > > having enough for whatever is blocking on /dev/random. Mikael's
> > > boot
> > > did have a CDROM, it's not impossible that the handful of
> > > commands we
> > > end up doing to that device would have contributed enough entropy
> > > to
> > > get the boot done without stalling for minutes.
> > > 
> > > One way to know for sure, and that's if Mikael tests the patch.
> > 
> > I think I've got the root cause.  I have one system in my test bed
> > exhibiting this behaviour.  It turns out the disk in it has no
> > characteristics VPD page.  The 0xB1 VPD was a SBC-3 addition, so
> > that's
> > not surprising.  However, the characteristics check bails before
> > setting the flags, so it takes the default flag which has flipped.
> > 
> > We can either fix this by setting the QUEUE_FLAG_ADD_RANDOM if
> > there's
> > no 0xB1 page or by setting the default as Jens proposed.
> 
> I'd recommend just doing my patch, since that'll be the same behavior
> that SCSI had before.

I've got the history now, it's this patch

Author: Xuewei Zhang 
Date:   Thu Sep 6 13:37:19 2018 -0700

scsi: sd: Contribute to randomness when running rotational device

It added the else branch to the if (rot == 1).  It's the position of
that else branch which is wrong because not all disks have a SBC-3
characteristics VPD page, so they're the ones under MQ which stop
contributing entropy.  Whichever patch we go with will need a fixes:
for this.

James

Re: [5.0-rc5 regression] "scsi: kill off the legacy IO path" causes 5 minute delay during boot on Sun Blade 2500

2019-02-11 Thread James Bottomley

On Tue, 2019-02-12 at 03:37 +, Elliott, Robert (Persistent Memory)
wrote:
> > -Original Message-
> > From: linux-kernel-ow...@vger.kernel.org  > ow...@vger.kernel.org> On Behalf Of Jens Axboe
> > Sent: Monday, February 11, 2019 8:50 PM
> > To: James Bottomley ; Mikael
> > Pettersson 
> > Cc: Linux SPARC Kernel Mailing List ;
> > linux-bl...@vger.kernel.org; linux-ker...@vger.kernel.org; linux-
> > scsi
> > 
> > Subject: Re: [5.0-rc5 regression] "scsi: kill off the legacy IO
> > path"
> > causes 5 minute delay during boot on Sun Blade 2500
> 
> ...
> 
> > > We can either fix this by setting the QUEUE_FLAG_ADD_RANDOM if
> > > there's no 0xB1 page or by setting the default as Jens proposed.
> > 
> > I'd recommend just doing my patch, since that'll be the same
> > behavior that SCSI had before.
> 
> When blk-mq and scsi-mq were introduced to boost SAS SSDs over
> 1 million IOPS, this was purposely kept off, because it generated
> a noticeable 3-12 us latency blip whenever the "fast_mix" code ran
> out of bits. If Jens' patch changes the default to be enabled for
> all scsi-mq users, that'll be a change for the newer legacy
> scsi-mq users (except for users that don't trust the default and
> always set the value in sysfs).

All high IOPs devices have a device characteristics page, so even if we
 default this to ON, it will get turned off for these devices which
should be the desired result regardless of the default setting. 

James

Re: [5.0-rc5 regression] "scsi: kill off the legacy IO path" causes 5 minute delay during boot on Sun Blade 2500

2019-02-11 Thread James Bottomley

On Mon, 2019-02-11 at 09:31 -0700, Jens Axboe wrote:
> On 2/11/19 9:28 AM, James Bottomley wrote:
> > On Mon, 2019-02-11 at 08:46 -0700, Jens Axboe wrote:
> > > On 2/11/19 8:42 AM, James Bottomley wrote:
> > > > On Mon, 2019-02-11 at 08:28 -0700, Jens Axboe wrote:
> > > > > On 2/11/19 8:25 AM, James Bottomley wrote:
> > > > > > On Sun, 2019-02-10 at 09:35 -0700, Jens Axboe wrote:
> > > > > > > On 2/10/19 9:25 AM, James Bottomley wrote:
> > 
> > [...]
> > > > > > > > That check wasn't changed by the code removal.
> > > > > > > 
> > > > > > > As I said above, for sd. This isn't true for non-disks.
> > > > > > 
> > > > > > Yes, but the behaviour above doesn't change across a switch
> > > > > > to MQ, so I don't quite understand how it bisects back to
> > > > > > that change.  If we're not gathering entropy for the device
> > > > > > now, we wouldn't have been before the switch, so the
> > > > > > entropy characteristics shouldn't have changed.
> > > > > 
> > > > > But it does, as I also wrote in that first email. The legacy
> > > > > queue flags had QUEUE_FLAG_ADD_RANDOM set by default, the MQ
> > > > > ones do not. Hence any non-sd device would previously ALWAYS
> > > > > have ADD_RANDOM set, now none of them do. Also see the patch
> > > > > I sent.
> > > > 
> > > > So your theory is that the disk in question never gets to the
> > > > rotational check?  because the check will clear the flag if
> > > > it's non-rotational and set it if it's not, so the default
> > > > state of the flag shouldn't matter.
> > > 
> > > No, my point is about non-disks, devices that aren't driven by
> > > sd. The behavior for sd hasn't changed, as it sets/clears it
> > > unconditionally. 
> > 
> > I agree, but I don't think any of them were significant entropy
> > contributors before: things like nvme have always been outside of
> > this and sr and st don't really contribute much to the seek load
> > during boot because they're probed but not used by the boot
> > sequence, so I can't see how they would cause this behaviour.  I
> > suppose it could be target probing, but even that seems unlikely
> > because it should be dwarfed by the number of root disk reads
> > during boot.
> > 
> > For the rng to take an additional 5 minutes to initialize, we must
> > have lost a significant entropy source somewhere.
> 
> I agree it's not a significant amount of entropy, but even just one
> bit could mean a long stall if that put us over the edge of just not
> having enough for whatever is blocking on /dev/random. Mikael's boot
> did have a CDROM, it's not impossible that the handful of commands we
> end up doing to that device would have contributed enough entropy to
> get the boot done without stalling for minutes.
> 
> One way to know for sure, and that's if Mikael tests the patch.

I think I've got the root cause.  I have one system in my test bed
exhibiting this behaviour.  It turns out the disk in it has no
characteristics VPD page.  The 0xB1 VPD was a SBC-3 addition, so that's
not surprising.  However, the characteristics check bails before
setting the flags, so it takes the default flag which has flipped.

We can either fix this by setting the QUEUE_FLAG_ADD_RANDOM if there's
no 0xB1 page or by setting the default as Jens proposed.

James

---

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d0a980915801..1f3a1474042e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2961,6 +2961,10 @@ static void sd_read_block_characteristics(struct 
scsi_disk *sdkp)
 
buffer = kmalloc(vpd_len, GFP_KERNEL);
 
+   /* set to rotational in case no device characteristic page exists */
+   blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
+   blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
+
if (!buffer ||
/* Block Device Characteristics VPD */
scsi_get_vpd_page(sdkp->device, 0xb1, buffer, vpd_len))
@@ -2971,9 +2975,6 @@ static void sd_read_block_characteristics(struct 
scsi_disk *sdkp)
if (rot == 1) {
blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q);
-   } else {
-   blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
-   blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
}
 
if (sdkp->device->type == TYPE_ZBC) {

[GIT PULL] SCSI fixes for 5.0-rc5

2019-02-08 Thread James Bottomley

This is a set of five minor fixes (although, tecnhincally, the aicxxx
fix is for a major problem in that the driver won't load without it,
but I think the fact it's taken us since 4.10 to discover this
indicates that the user base for these things has declined).

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Damien Le Moal (1):
  scsi: sd_zbc: Fix zone information messages

David Disseldorp (1):
  scsi: target: make the pi_prot_format ConfigFS path readable

James Bottomley (1):
  scsi: aic94xx: fix module loading

Ross Lagerwall (1):
  Revert "scsi: libfc: Add WARN_ON() when deleting rports"

Vaibhav Jain (1):
  scsi: cxlflash: Prevent deadlock when adapter probe fails

And the diffstat:

 drivers/scsi/aic94xx/aic94xx_init.c   |  8 
 drivers/scsi/cxlflash/main.c  |  2 ++
 drivers/scsi/libfc/fc_rport.c |  1 -
 drivers/scsi/sd_zbc.c | 12 
 drivers/target/target_core_configfs.c |  8 +++-
 5 files changed, 21 insertions(+), 10 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/aic94xx/aic94xx_init.c 
b/drivers/scsi/aic94xx/aic94xx_init.c
index f83f79b07b50..07efcb9b5b94 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -280,7 +280,7 @@ static ssize_t asd_show_dev_rev(struct device *dev,
return snprintf(buf, PAGE_SIZE, "%s\n",
asd_dev_rev[asd_ha->revision_id]);
 }
-static DEVICE_ATTR(revision, S_IRUGO, asd_show_dev_rev, NULL);
+static DEVICE_ATTR(aic_revision, S_IRUGO, asd_show_dev_rev, NULL);
 
 static ssize_t asd_show_dev_bios_build(struct device *dev,
   struct device_attribute *attr,char *buf)
@@ -477,7 +477,7 @@ static int asd_create_dev_attrs(struct asd_ha_struct 
*asd_ha)
 {
int err;
 
-   err = device_create_file(&asd_ha->pcidev->dev, &dev_attr_revision);
+   err = device_create_file(&asd_ha->pcidev->dev, &dev_attr_aic_revision);
if (err)
return err;
 
@@ -499,13 +499,13 @@ static int asd_create_dev_attrs(struct asd_ha_struct 
*asd_ha)
 err_biosb:
device_remove_file(&asd_ha->pcidev->dev, &dev_attr_bios_build);
 err_rev:
-   device_remove_file(&asd_ha->pcidev->dev, &dev_attr_revision);
+   device_remove_file(&asd_ha->pcidev->dev, &dev_attr_aic_revision);
return err;
 }
 
 static void asd_remove_dev_attrs(struct asd_ha_struct *asd_ha)
 {
-   device_remove_file(&asd_ha->pcidev->dev, &dev_attr_revision);
+   device_remove_file(&asd_ha->pcidev->dev, &dev_attr_aic_revision);
device_remove_file(&asd_ha->pcidev->dev, &dev_attr_bios_build);
device_remove_file(&asd_ha->pcidev->dev, &dev_attr_pcba_sn);
device_remove_file(&asd_ha->pcidev->dev, &dev_attr_update_bios);
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index bfa13e3b191c..c8bad2c093b8 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -3687,6 +3687,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
host->max_cmd_len = CXLFLASH_MAX_CDB_LEN;
 
cfg = shost_priv(host);
+   cfg->state = STATE_PROBING;
cfg->host = host;
rc = alloc_mem(cfg);
if (rc) {
@@ -3775,6 +3776,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
return rc;
 
 out_remove:
+   cfg->state = STATE_PROBED;
cxlflash_remove(pdev);
goto out;
 }
diff --git a/drivers/scsi/libfc/fc_rport.c b/drivers/scsi/libfc/fc_rport.c
index 9192a1d9dec6..dfba4921b265 100644
--- a/drivers/scsi/libfc/fc_rport.c
+++ b/drivers/scsi/libfc/fc_rport.c
@@ -184,7 +184,6 @@ void fc_rport_destroy(struct kref *kref)
struct fc_rport_priv *rdata;
 
rdata = container_of(kref, struct fc_rport_priv, kref);
-   WARN_ON(!list_empty(&rdata->peers));
kfree_rcu(rdata, rcu);
 }
 EXPORT_SYMBOL(fc_rport_destroy);
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 83365b29a4d8..fff86940388b 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -462,12 +462,16 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned 
char *buf)
sdkp->device->use_10_for_rw = 0;
 
/*
-* If something changed, revalidate the disk zone bitmaps once we have
-* the capacity, that is on the second revalidate execution during disk
-* scan and always during normal revalidate.
+* Revalidate the disk zone bitmaps once the block device capacity is
+* set on the second revalidate execution during disk scan and if
+* something changed when executing a normal revalidate.
 */
-   if (sdkp->first_scan)
+   if (sdkp->firs

[GIT PULL] SCSI fixes for 5.0-rc4

2019-02-01 Thread James Bottomley

Five minor bug fixes.  The libfc one is a tiny memory leak, the zfcp
one is an incorrect user visible parameter and the rest are on error
legs or obscure features.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Dan Carpenter (2):
  scsi: 53c700: pass correct "dev" to dma_alloc_attrs()
  scsi: bnx2fc: Fix error handling in probe()

Douglas Gilbert (1):
  scsi: scsi_debug: fix write_same with virtual_gb problem

Ming Lu (1):
  scsi: libfc: free skb when receiving invalid flogi resp

Steffen Maier (1):
  scsi: zfcp: fix sysfs block queue limit output for max_segment_size

And the diffstat

 drivers/s390/scsi/zfcp_aux.c|  1 -
 drivers/s390/scsi/zfcp_scsi.c   |  2 ++
 drivers/scsi/53c700.c   |  2 +-
 drivers/scsi/bnx2fc/bnx2fc_io.c |  4 ++--
 drivers/scsi/libfc/fc_lport.c   |  6 +++---
 drivers/scsi/scsi_debug.c   | 41 +
 6 files changed, 29 insertions(+), 27 deletions(-)

With full diff below.

James

---

diff --git a/drivers/s390/scsi/zfcp_aux.c b/drivers/s390/scsi/zfcp_aux.c
index 9cf30d124b9e..e390f8c6d5f3 100644
--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -403,7 +403,6 @@ struct zfcp_adapter *zfcp_adapter_enqueue(struct ccw_device 
*ccw_device)
goto failed;
 
/* report size limit per scatter-gather segment */
-   adapter->dma_parms.max_segment_size = ZFCP_QDIO_SBALE_LEN;
adapter->ccw_device->dev.dma_parms = &adapter->dma_parms;
 
adapter->stat_read_buf_num = FSF_STATUS_READS_RECOM;
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 00acc7144bbc..f4f6a07c5222 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -428,6 +428,8 @@ static struct scsi_host_template zfcp_scsi_host_template = {
.max_sectors = (((QDIO_MAX_ELEMENTS_PER_BUFFER - 1)
 * ZFCP_QDIO_MAX_SBALS_PER_REQ) - 2) * 8,
   /* GCD, adjusted later */
+   /* report size limit per scatter-gather segment */
+   .max_segment_size= ZFCP_QDIO_SBALE_LEN,
.dma_boundary= ZFCP_QDIO_SBALE_LEN - 1,
.shost_attrs = zfcp_sysfs_shost_attrs,
.sdev_attrs  = zfcp_sysfs_sdev_attrs,
diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
index 128d658d472a..16957d7ac414 100644
--- a/drivers/scsi/53c700.c
+++ b/drivers/scsi/53c700.c
@@ -295,7 +295,7 @@ NCR_700_detect(struct scsi_host_template *tpnt,
if(tpnt->sdev_attrs == NULL)
tpnt->sdev_attrs = NCR_700_dev_attrs;
 
-   memory = dma_alloc_attrs(hostdata->dev, TOTAL_MEM_SIZE, &pScript,
+   memory = dma_alloc_attrs(dev, TOTAL_MEM_SIZE, &pScript,
 GFP_KERNEL, DMA_ATTR_NON_CONSISTENT);
if(memory == NULL) {
printk(KERN_ERR "53c700: Failed to allocate memory for driver, 
detaching\n");
diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c
index 350257c13a5b..bc9f2a2365f4 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_io.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_io.c
@@ -240,6 +240,7 @@ struct bnx2fc_cmd_mgr *bnx2fc_cmd_mgr_alloc(struct 
bnx2fc_hba *hba)
return NULL;
}
 
+   cmgr->hba = hba;
cmgr->free_list = kcalloc(arr_sz, sizeof(*cmgr->free_list),
  GFP_KERNEL);
if (!cmgr->free_list) {
@@ -256,7 +257,6 @@ struct bnx2fc_cmd_mgr *bnx2fc_cmd_mgr_alloc(struct 
bnx2fc_hba *hba)
goto mem_err;
}
 
-   cmgr->hba = hba;
cmgr->cmds = (struct bnx2fc_cmd **)(cmgr + 1);
 
for (i = 0; i < arr_sz; i++)  {
@@ -295,7 +295,7 @@ struct bnx2fc_cmd_mgr *bnx2fc_cmd_mgr_alloc(struct 
bnx2fc_hba *hba)
 
/* Allocate pool of io_bdts - one for each bnx2fc_cmd */
mem_size = num_ios * sizeof(struct io_bdt *);
-   cmgr->io_bdt_pool = kmalloc(mem_size, GFP_KERNEL);
+   cmgr->io_bdt_pool = kzalloc(mem_size, GFP_KERNEL);
if (!cmgr->io_bdt_pool) {
printk(KERN_ERR PFX "failed to alloc io_bdt_pool\n");
goto mem_err;
diff --git a/drivers/scsi/libfc/fc_lport.c b/drivers/scsi/libfc/fc_lport.c
index be83590ed955..ff943f477d6f 100644
--- a/drivers/scsi/libfc/fc_lport.c
+++ b/drivers/scsi/libfc/fc_lport.c
@@ -1726,14 +1726,14 @@ void fc_lport_flogi_resp(struct fc_seq *sp, struct 
fc_frame *fp,
fc_frame_payload_op(fp) != ELS_LS_ACC) {
FC_LPORT_DBG(lport, "FLOGI not accepted or bad response\n");
fc_lport_error(lport, fp);
-   goto err;
+   goto out;
}
 
flp = fc_frame_payload_get(fp, sizeof(*flp));
if (!flp) {
FC_LPORT_DBG(lport, "FLOGI bad response\n");
fc_lport_error(lport, fp);
-   goto err;
+

Re: [PATCH] scsi: aic94xx: fix module loading

2019-02-01 Thread James Bottomley

On Wed, 2019-01-30 at 16:42 -0800, James Bottomley wrote:
> The aic94xx driver is currently failing to load with errors like
> 
> sysfs: cannot create duplicate filename 
> '/devices/pci:00/:00:03.0/:02:00.3/:07:02.0/revision'
> 
> Because the PCI code had recently added a file named 'revision' to
> every PCI device.  Fix this by renaming the aic94xx revision file to
> aic_revision.  This is safe to do for us because as far as I can tell,
> there's nothing in userspace relying on the current aic94xx revision
> file so it can be renamed without breaking anything.
> 
> Fixes: 702ed3be1b1b (PCI: Create revision file in sysfs)
> Cc: sta...@vger.kernel.org
> Signed-off-by: James Bottomley 

I finally managed to test this on actual hardware and it works for me. 
I don't have any aic94xx cards, just a couple of IBM donated machines
that have aic94xx built in on an internal PCI express bus daughter
card, so I had to find them and persuade them to boot modern kernels
before I could actually test anything, hence the delay.

James

[PATCH] scsi: aic94xx: fix module loading

2019-01-30 Thread James Bottomley

The aic94xx driver is currently failing to load with errors like

sysfs: cannot create duplicate filename 
'/devices/pci:00/:00:03.0/:02:00.3/:07:02.0/revision'

Because the PCI code had recently added a file named 'revision' to
every PCI device.  Fix this by renaming the aic94xx revision file to
aic_revision.  This is safe to do for us because as far as I can tell,
there's nothing in userspace relying on the current aic94xx revision
file so it can be renamed without breaking anything.

Fixes: 702ed3be1b1b (PCI: Create revision file in sysfs)
Cc: sta...@vger.kernel.org
Signed-off-by: James Bottomley 

---

Patch is currently compile tested only ... I'm just excavating my
aic94xx based sas system from the unused equipment pile in the garage.

diff --git a/drivers/scsi/aic94xx/aic94xx_init.c 
b/drivers/scsi/aic94xx/aic94xx_init.c
index 41c4d8abdd4a..38e768057129 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -281,7 +281,7 @@ static ssize_t asd_show_dev_rev(struct device *dev,
return snprintf(buf, PAGE_SIZE, "%s\n",
asd_dev_rev[asd_ha->revision_id]);
 }
-static DEVICE_ATTR(revision, S_IRUGO, asd_show_dev_rev, NULL);
+static DEVICE_ATTR(aic_revision, S_IRUGO, asd_show_dev_rev, NULL);
 
 static ssize_t asd_show_dev_bios_build(struct device *dev,
   struct device_attribute *attr,char *buf)
@@ -478,7 +478,7 @@ static int asd_create_dev_attrs(struct asd_ha_struct 
*asd_ha)
 {
int err;
 
-   err = device_create_file(&asd_ha->pcidev->dev, &dev_attr_revision);
+   err = device_create_file(&asd_ha->pcidev->dev, &dev_attr_aic_revision);
if (err)
return err;
 
@@ -500,13 +500,13 @@ static int asd_create_dev_attrs(struct asd_ha_struct 
*asd_ha)
 err_biosb:
device_remove_file(&asd_ha->pcidev->dev, &dev_attr_bios_build);
 err_rev:
-   device_remove_file(&asd_ha->pcidev->dev, &dev_attr_revision);
+   device_remove_file(&asd_ha->pcidev->dev, &dev_attr_aic_revision);
return err;
 }
 
 static void asd_remove_dev_attrs(struct asd_ha_struct *asd_ha)
 {
-   device_remove_file(&asd_ha->pcidev->dev, &dev_attr_revision);
+   device_remove_file(&asd_ha->pcidev->dev, &dev_attr_aic_revision);
device_remove_file(&asd_ha->pcidev->dev, &dev_attr_bios_build);
device_remove_file(&asd_ha->pcidev->dev, &dev_attr_pcba_sn);
device_remove_file(&asd_ha->pcidev->dev, &dev_attr_update_bios);

Re: [Bug 201609] sysfs duplicate filename on driver loading Adaptec AIC-9410W

2019-01-30 Thread James Bottomley

On Wed, 2019-01-30 at 11:43 +, bugzilla-dae...@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=201609
> 
> Emil Velikov (emil.l.veli...@gmail.com) changed:
> 
>What|Removed |Added
> ---
> -
>  CC||emil.l.velikov@gmail
> .com
> 
> --- Comment #5 from Emil Velikov (emil.l.veli...@gmail.com) ---
> The other five files ({subsystem_,}{vendor,device} and class) do not
> have a prefix, so the more reasonable thing was it not add one for
> the revision file.

No, it wasn't.  This is the problem.  In the early days we all grabbed
un-namespaced names in this directory.  If you add another un-
namespaced file, a clash is likely to happen which has serious
consequences.  If you want to add a generically named file, you have to
check all the drivers before doing it.  If you clash with a file that's
in use by userspace, your new addition gets reverted because we can't
rename an in-use file because it's an exported ABI.

> That said, I did spend a lot of time going through the Documentation
> looking for guidelines, rules or restrictions that should be applied
> and could not find any. The PCI subsystem maintainer did not suggest
> adding a prefix either, so I would not use "blatently silly" here.

Either you prefix the addition or you check every driver ... neither
happened in this case.

> On a more practical side - the aic94xx driver custom revision file
> solves the exact same problem my patch does. Admittedly, aic94xx
> lacks the leading "0x" for the hexadecimal number provided.
> 
> My personal inclination is that we'd want to check if existing
> userspace
> requires the leading 0x and if so to what extend:
>  - cosmetic - remove the aic94xx code, update userspace
>  - unused - remove the aic94xx code
>  - major impact - the aic94xx driver removes the generic revision
> file, before setting it's own
> 
> Does that sound reasonable, am I missing something?

I think we're safe and the aic file isn't used from userspace, so just
renaming the aic revision file should work.

James

Re: [Bug 201609] sysfs duplicate filename on driver loading Adaptec AIC-9410W

2019-01-30 Thread James Bottomley

On Wed, 2019-01-30 at 14:21 +, bugzilla-dae...@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=201609
> 
> Bjorn Helgaas (bhelg...@google.com) changed:
> 
>What|Removed |Added
> ---
> -
>  CC||bhelg...@google.com
> 
> --- Comment #7 from Bjorn Helgaas (bhelg...@google.com) ---
> I don't know what the sysfs namespace rules are (if there are
> any).  It doesn't sound ideal for the PCI core to prefix all the
> files in a PCI device directory with "pci_" or similar, since we
> already know by the path (/sys/devices/pci:00,
> /sys/bus/pci/devices, etc) that they are PCI-related).

In the early days, we all simply grabbed files in the directory.  This
resulted in a complete lack of namespacing for everything.  Arguably
this was the really silly thing to do but we stuck with it.  What it
means is that now we have to be very careful about adding files because
things like this can easily happen.  In particular, if you add a
generic name like "revision" there's a really good chance some driver
already has it.

THere's nothing we can do about the prior lack of namespacing, but if
you don't want to check every driver in the kernel, using a namespace
prefix is a good way of avoiding clashes like this.  You can still add
a non-namespaced file, it's just you're on the hook to check every
driver to prevent cockups like this.

The key point is that if the file you clashed with is in use by
userspace, we can't rename it because it's an ABI, so now the conflict
is unresolvable.  I don't think this applies in the aic driver, but
it's the reason why you need to be very careful about adding files.

James

Re: [Bug 201609] sysfs duplicate filename on driver loading Adaptec AIC-9410W

2019-01-25 Thread James Bottomley

On Sat, 2019-01-26 at 03:13 +, bugzilla-dae...@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=201609
> 
> Pablo Ruiz García (pablo.r...@gmail.com) changed:
> 
>What|Removed |Added
> ---
> -
>  CC||pablo.r...@gmail.com
> 
> --- Comment #2 from Pablo Ruiz García (pablo.r...@gmail.com) ---
> Hi,
> 
> I've stumbled upon this very same issue, and have created a github
> repo with a
> fix for the actual issue (see git commit's description for bug
> details).
> 
> https://github.com/pruiz/aic94xx-hotfix
> https://github.com/pruiz/aic94xx-hotfix/commit/01898ea7dff8eb52f43773
> ea3b59ab41ef2482c8
> 
> Hope it helps someone.

I think the correct fix is to rename the ai94xx revision file (say to
aic94x_revision). However I've got to say that commit

commit 702ed3be1b1bf4dea05954168321741c0910c645
Author: Emil Velikov 
Date:   Mon Nov 21 16:24:49 2016 -0600

PCI: Create revision file in sysfs

Was a blatently silly thing to do without trying to namespace the added
file (say calling it something less generic like pci_revision).  I
assume we're lucky and nothing in userspace depends on the 'revision'
sysfs file of the aic driver, but technically I should request
reversion of this patch on the grounds that it's broken our ABI (since
that's what exported sysfs files are).

James

[GIT PULL] SCSI fixes for 5.0-rc3

2019-01-25 Thread James Bottomley

Six fixes, all of which appear to have user visible consequences.  The
DMA one is a regression fix from the merge window and of the others,
four are driver specific and one specific to the target code.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Christoph Hellwig (1):
  scsi: communicate max segment size to the DMA mapping code

Ewan D. Milne (2):
  scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetport
  scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport

Marc Gonzalez (1):
  scsi: ufs: Use explicit access size in ufshcd_dump_regs

Varun Prakash (1):
  scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state()

Xiubo Li (1):
  scsi: tcmu: fix use after free

And the diffstat:

 drivers/ata/pata_macio.c  |  9 -
 drivers/ata/sata_inic162x.c   | 22 +-
 drivers/firewire/sbp2.c   |  5 +
 drivers/scsi/aacraid/linit.c  |  9 -
 drivers/scsi/csiostor/csio_attr.c |  2 +-
 drivers/scsi/lpfc/lpfc_nvme.c | 16 +---
 drivers/scsi/lpfc/lpfc_nvme.h |  2 +-
 drivers/scsi/lpfc/lpfc_nvmet.c|  8 +---
 drivers/scsi/lpfc/lpfc_nvmet.h|  2 +-
 drivers/scsi/scsi_lib.c   |  4 ++--
 drivers/scsi/ufs/ufshcd.c | 10 --
 drivers/target/target_core_user.c |  3 ++-
 12 files changed, 47 insertions(+), 45 deletions(-)

With full diff below.

James

---

diff --git a/drivers/ata/pata_macio.c b/drivers/ata/pata_macio.c
index 8cc9c429ad95..9e7fc302430f 100644
--- a/drivers/ata/pata_macio.c
+++ b/drivers/ata/pata_macio.c
@@ -915,6 +915,10 @@ static struct scsi_host_template pata_macio_sht = {
.sg_tablesize   = MAX_DCMDS,
/* We may not need that strict one */
.dma_boundary   = ATA_DMA_BOUNDARY,
+   /* Not sure what the real max is but we know it's less than 64K, let's
+* use 64K minus 256
+*/
+   .max_segment_size   = MAX_DBDMA_SEG,
.slave_configure= pata_macio_slave_config,
 };
 
@@ -1044,11 +1048,6 @@ static int pata_macio_common_init(struct pata_macio_priv 
*priv,
/* Make sure we have sane initial timings in the cache */
pata_macio_default_timings(priv);
 
-   /* Not sure what the real max is but we know it's less than 64K, let's
-* use 64K minus 256
-*/
-   dma_set_max_seg_size(priv->dev, MAX_DBDMA_SEG);
-
/* Allocate libata host for 1 port */
memset(&pinfo, 0, sizeof(struct ata_port_info));
pmac_macio_calc_timing_masks(priv, &pinfo);
diff --git a/drivers/ata/sata_inic162x.c b/drivers/ata/sata_inic162x.c
index e0bcf9b2dab0..174e84ce4379 100644
--- a/drivers/ata/sata_inic162x.c
+++ b/drivers/ata/sata_inic162x.c
@@ -245,8 +245,15 @@ struct inic_port_priv {
 
 static struct scsi_host_template inic_sht = {
ATA_BASE_SHT(DRV_NAME),
-   .sg_tablesize   = LIBATA_MAX_PRD,   /* maybe it can be larger? */
-   .dma_boundary   = INIC_DMA_BOUNDARY,
+   .sg_tablesize   = LIBATA_MAX_PRD, /* maybe it can be larger? */
+
+   /*
+* This controller is braindamaged.  dma_boundary is 0x like others
+* but it will lock up the whole machine HARD if 65536 byte PRD entry
+* is fed.  Reduce maximum segment size.
+*/
+   .dma_boundary   = INIC_DMA_BOUNDARY,
+   .max_segment_size   = 65536 - 512,
 };
 
 static const int scr_map[] = {
@@ -868,17 +875,6 @@ static int inic_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
return rc;
}
 
-   /*
-* This controller is braindamaged.  dma_boundary is 0x
-* like others but it will lock up the whole machine HARD if
-* 65536 byte PRD entry is fed. Reduce maximum segment size.
-*/
-   rc = dma_set_max_seg_size(&pdev->dev, 65536 - 512);
-   if (rc) {
-   dev_err(&pdev->dev, "failed to set the maximum segment size\n");
-   return rc;
-   }
-
rc = init_controller(hpriv->mmio_base, hpriv->cached_hctl);
if (rc) {
dev_err(&pdev->dev, "failed to initialize controller\n");
diff --git a/drivers/firewire/sbp2.c b/drivers/firewire/sbp2.c
index 09b845e90114..a785ffd5af89 100644
--- a/drivers/firewire/sbp2.c
+++ b/drivers/firewire/sbp2.c
@@ -1144,10 +1144,6 @@ static int sbp2_probe(struct fw_unit *unit, const struct 
ieee1394_device_id *id)
if (device->is_local)
return -ENODEV;
 
-   if (dma_get_max_seg_size(device->card->device) > SBP2_MAX_SEG_SIZE)
-   WARN_ON(dma_set_max_seg_size(device->card->device,
-SBP2_MAX_SEG_SIZE));
-
shost = scsi_host_alloc(&scsi_driver_template, sizeof(*tgt));
if (shost == NULL)
return -ENOMEM;
@@ -1610,6 +1606,7 @@ static struct scsi_host_templat

[GIT PULL] SCSI fixes for 5.0-rc2

2019-01-18 Thread James Bottomley

A set of 17 fixes.  Most of these are minor or trivial.  The one fix
that may be serious is the isci one: the bug can cause hba parameters
to be set from uninitialized memory.  I don't think it's exploitable,
but you never know.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Avri Altman (1):
  scsi: ufs: Fix geometry descriptor size

Gustavo A. R. Silva (2):
  scsi: lpfc: lpfc_sli: Mark expected switch fall-throughs
  scsi: smartpqi_init: fix boolean expression in pqi_device_remove_start

Ivan Mironov (1):
  scsi: sd: Fix cache_type_store()

John Garry (1):
  scsi: hisi_sas: Set protection parameters prior to adding SCSI host

Julia Lawall (1):
  scsi: pm80xx: reduce indentation

Leo Zhang (1):
  scsi: target/iscsi: fix error msg typo when create lio_qr_cache failed

Logan Gunthorpe (1):
  scsi: isci: initialize shost fully before calling scsi_add_host()

Manish Rangankar (1):
  scsi: qedi: Add ep_state for login completion on un-reachable targets

Ming Lei (1):
  scsi: qla2xxx: Use correct number of vectors for online CPUs

Shivasharan S (1):
  scsi: megaraid_sas: Retry reads of outbound_intr_status reg

Stanley Chu (2):
  scsi: ufs: Fix system suspend status
  scsi: core: Synchronize request queue PM status only on successful resume

Thomas Bogendoerfer (1):
  scsi: qla1280: set 64bit coherent mask

Tomas Henzl (1):
  scsi: megaraid_sas: correct an info message

Varun Prakash (1):
  scsi: cxgb4i: add wait_for_completion()

Xiubo Li (1):
  scsi: tcmu: avoid cmd/qfull timers updated whenever a new cmd comes

YueHaibing (1):
  scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param

And the diffstat:

 drivers/scsi/cxgbi/cxgb3i/cxgb3i.c  |  9 ++-
 drivers/scsi/cxgbi/cxgb4i/cxgb4i.c  | 28 ++---
 drivers/scsi/cxgbi/libcxgbi.c   |  7 ++-
 drivers/scsi/cxgbi/libcxgbi.h   |  5 +-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c  | 12 ++--
 drivers/scsi/isci/init.c| 14 ++---
 drivers/scsi/lpfc/lpfc_sli.c| 20 ---
 drivers/scsi/megaraid/megaraid_sas_base.c   |  2 +-
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +-
 drivers/scsi/pm8001/pm8001_sas.c|  2 +-
 drivers/scsi/qedi/qedi_iscsi.c  |  3 +
 drivers/scsi/qedi/qedi_iscsi.h  |  1 +
 drivers/scsi/qla1280.c  |  2 +-
 drivers/scsi/qla2xxx/qla_def.h  |  2 +
 drivers/scsi/qla2xxx/qla_isr.c  |  1 +
 drivers/scsi/qla2xxx/qla_os.c   |  2 +-
 drivers/scsi/qla4xxx/ql4_os.c   |  2 +
 drivers/scsi/scsi_pm.c  | 26 +
 drivers/scsi/sd.c   |  6 ++
 drivers/scsi/smartpqi/smartpqi_init.c   |  2 +-
 drivers/scsi/ufs/ufs.h  |  2 +-
 drivers/scsi/ufs/ufshcd.c   |  2 +
 drivers/target/iscsi/iscsi_target.c |  2 +-
 drivers/target/target_core_user.c   | 88 -
 24 files changed, 157 insertions(+), 86 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c 
b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
index 8a20411699d9..75e1273a44b3 100644
--- a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
+++ b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
@@ -1144,7 +1144,7 @@ static void ddp_clear_map(struct cxgbi_device *cdev, 
struct cxgbi_ppm *ppm,
 }
 
 static int ddp_setup_conn_pgidx(struct cxgbi_sock *csk,
-  unsigned int tid, int pg_idx, bool reply)
+   unsigned int tid, int pg_idx)
 {
struct sk_buff *skb = alloc_wr(sizeof(struct cpl_set_tcb_field), 0,
GFP_KERNEL);
@@ -1160,7 +1160,7 @@ static int ddp_setup_conn_pgidx(struct cxgbi_sock *csk,
req = (struct cpl_set_tcb_field *)skb->head;
req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_FORWARD));
OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_SET_TCB_FIELD, tid));
-   req->reply = V_NO_REPLY(reply ? 0 : 1);
+   req->reply = V_NO_REPLY(1);
req->cpu_idx = 0;
req->word = htons(31);
req->mask = cpu_to_be64(0xF000);
@@ -1177,11 +1177,10 @@ static int ddp_setup_conn_pgidx(struct cxgbi_sock *csk,
  * @tid: connection id
  * @hcrc: header digest enabled
  * @dcrc: data digest enabled
- * @reply: request reply from h/w
  * set up the iscsi digest settings for a connection identified by tid
  */
 static int ddp_setup_conn_digest(struct cxgbi_sock *csk, unsigned int tid,
-int hcrc, int dcrc, int reply)
+int hcrc, int dcrc)
 {
struct sk_buff *skb = alloc_wr(sizeof(struct cpl_set_tcb_field), 0,
GFP_KERNEL);
@@ -1197,7 +1196,7 @@ static int ddp_setup_conn_digest(struct cxgbi_sock *csk, 
unsigned int tid,

Re: [PATCH] scsi: advansys: use struct_size() in kzalloc()

2019-01-11 Thread James Bottomley

On Fri, 2019-01-11 at 08:54 -0800, Matthew Wilcox wrote:
> On Fri, Jan 11, 2019 at 08:41:43AM -0800, James Bottomley wrote:
> > On Fri, 2019-01-11 at 16:46 +0100, Hannes Reinecke wrote:
> > > > -   asc_sg_head = kzalloc(sizeof(asc_scsi_q-
> > > > >sg_head)
> > > > +
> > > > -   use_sg * sizeof(struct asc_sg_list),
> > > > GFP_ATOMIC);
> > > > +   asc_sg_head = kzalloc(struct_size(asc_sg_head,
> > > > sg_list, use_sg),
> > > > + GFP_ATOMIC);
> > > 
> > > If you want ...
> > 
> > Are we sure there's a benefit to this?  It's obvious that the
> > current
> > code is correct but no-one's likely to test the new code for quite
> > some
> > time, so changing the code introduces risk. What's the benefit of
> > making the change in legacy drivers?  Just because we have a new,
> > shiny
> > macro doesn't mean we have to force its use everywhere.
> > 
> > I would recommend we have a rational needs test: so run the
> > coccinelle
> > script over all the drivers to find out where this construct is
> > used,
> > but only update those that are actually buggy with the new macro.
> 
> It's hard to tell whether they're buggy.  The problem being defended
> against here is integer overflow.  So can 'use_sg' ever get large
> enough that sizeof(asc_scsi_q->sg_head) + use_sg * sizeof(struct
> asc_sg_list) is larger than 4 billion?  Probably not; I imagine
> there's some rational sane limit elsewhere that says "No more than
> 256 SG elements" or something.

OK so firstly describing why we're doing this would have been
enormously useful.

Secondly, as you say, even with the enhanced rationale I'm not sure it
provides any benefit:  he advansys is two drivers squashed together:
the asc_ and adv_ prefixes.  It looks like the adv_ variant does check
the number of sg elements against the max, but asc_ doesn't; it relies
on the host limit sg_tablesize.  In both cases the actual limit is
somewhere around 255, so if the user can control the value they can
definitely cause corruption long before we get to mathematical
overflow.

The limit should be enforced by blk_queue_max_segments() and I think
this is done in all cases (including SG_IO).

> But I don't know without checking.  Is there some device-specific
> ioctl where the user can specify 2^31 scatterlist entries and
> somebody forgot to check?  This macro is a defense-in-depth strategy,
> so using it as widely as possible makes more sense than arguing about
> whether there are already adequate safeguards in place.

OK, so this is a question worth asking (I believe the answer to be "no"
but I could be wrong) because if there is some way of getting the value
over the driver internal table max (which is fixed for quite a few
drivers) we can induce corruption which this macro won't defend against
and if we need to, we should probably defend in sg_map_dma().

James

Re: [PATCH] scsi: advansys: use struct_size() in kzalloc()

2019-01-11 Thread James Bottomley

On Fri, 2019-01-11 at 16:46 +0100, Hannes Reinecke wrote:
> On 1/4/19 10:22 PM, Gustavo A. R. Silva wrote:
> > One of the more common cases of allocation size calculations is
> > finding the
> > size of a structure that has a zero-sized array at the end, along
> > with memory
> > for some number of elements for that array. For example:
> > 
> > struct foo {
> >  int stuff;
> >  void *entry[];
> > };
> > 
> > instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count,
> > GFP_KERNEL);
> > 
> > Instead of leaving these open-coded and prone to type mistakes, we
> > can now
> > use the new struct_size() helper:
> > 
> > instance = kzalloc(struct_size(instance, entry, count),
> > GFP_KERNEL);
> > 
> > This code was detected with the help of Coccinelle.
> > 
> > Signed-off-by: Gustavo A. R. Silva 
> > ---
> >   drivers/scsi/advansys.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c
> > index d37584403c33..6c274e6e1c33 100644
> > --- a/drivers/scsi/advansys.c
> > +++ b/drivers/scsi/advansys.c
> > @@ -7576,8 +7576,8 @@ static int asc_build_req(struct asc_board
> > *boardp, struct scsi_cmnd *scp,
> > return ASC_ERROR;
> > }
> >   
> > -   asc_sg_head = kzalloc(sizeof(asc_scsi_q->sg_head)
> > +
> > -   use_sg * sizeof(struct asc_sg_list),
> > GFP_ATOMIC);
> > +   asc_sg_head = kzalloc(struct_size(asc_sg_head,
> > sg_list, use_sg),
> > + GFP_ATOMIC);
> > if (!asc_sg_head) {
> > scsi_dma_unmap(scp);
> > scp->result = HOST_BYTE(DID_SOFT_ERROR);
> > 
> 
> If you want ...

Are we sure there's a benefit to this?  It's obvious that the current
code is correct but no-one's likely to test the new code for quite some
time, so changing the code introduces risk. What's the benefit of
making the change in legacy drivers?  Just because we have a new, shiny
macro doesn't mean we have to force its use everywhere.

I would recommend we have a rational needs test: so run the coccinelle
script over all the drivers to find out where this construct is used,
but only update those that are actually buggy with the new macro.

James

Re: PROBLEM: syzkaller found / pool corruption-overwrite / page in user-area or NULL

2019-01-10 Thread James Bottomley

On Thu, 2019-01-10 at 19:12 +, Esme wrote:
> Sorry for the resend some mail servers rejected the mime type.
> 
> Hi, I've been getting more into Kernel stuff lately and forged ahead
> with some syzkaller bug finding.  I played with reducing it further
> as you can see from the attached c code but am moving on and hope to
> get better about this process moving forward as I'm still building
> out my test systems/debugging tools.
> 
> Attached is the report and C repro that still triggers on a fresh git
> pull as of a few minutes ago, if you need anything else please let me
> know.
> Esme
> 
> Linux syzkaller 5.0.0-rc1+ #5 SMP Tue Jan 8 20:39:33 EST 2019 x86_64
> GNU/Linux

I'm not sure I'm reading this right, but it seems that a simple
allocation inside block/scsi_ioctl.h

buffer = kzalloc(bytes, q->bounce_gfp | GFP_USER| __GFP_NOWARN);

(where bytes is < 4k) caused a slub padding check failure on free. 
>From the internal details, the freeing entity seems to be KASAN as part
of its quarantine reduction (albeit triggered by this kzalloc).  I'm
not remotely familiar with what KASAN is doing, but it seems the memory
corruption problem is somewhere within the KASAN tracking?

I added linux-mm in case they can confirm this diagnosis or give me a
pointer to what might be wrong in scsi.

James

Re: [PATCH v3] scsi: add debugfs directories

2019-01-04 Thread James Bottomley

On Fri, 2019-01-04 at 13:34 -0500, Douglas Gilbert wrote:
> On 2019-01-04 11:52 a.m., James Bottomley wrote:
> > On Fri, 2019-01-04 at 11:18 -0500, Douglas Gilbert wrote:
> > > Add a top level "scsi" directory in debugfs (usually at
> > > /sys/kernel/debugfs/scsi) with two subdirectories: "uld" and
> > > "lld". The idea is to place mid-level related 'knobs' in the
> > > "scsi" directory, and for the ULDs to make subsirectories like
> > > "scsi/uld/sd" and "scsi/uld/st" as required. LLDs could follow a
> > > similar pattern.
> > 
> > I'm really not sure this is useful, but just in case it is, I
> > suppose the first question should be why this isn't all simply
> > conditioned on BLK_DEBUG_FS?  Is there a solid use case where
> > someone would use this in a !BLK_DEBUG_FS situation.
> 
> Logically this extension should only depend on CONFIG_DEBUG_FS as
> that is the minimum requirement.

Yes, but it's not a reason .. you can use that to justify all sorts of
micro config options.  However to be useful, config options must
represent kernels people would actually build, so the pressure is on to
reduce the number.

>  Also it is not blindingly obviously why someone debugging the sg, st
> and ses driver (all char devices) needs to select
> CONFIG_BLK_DEBUG_FS. Why not have CONFIG_SCSI_DEBUG_FS ?

The reason I could see them being tied together is that if you're after
debug information for a SCSI command, part of it might be what it went
into block as ... i.e. the current request flags which are SCSI
specific and enabled on BLK_DEBUG_FS.  What I can't see is a reason to
build a special kernel that has SCSI debug stuff but not block.

> The patch evolved this way due to the kbuild bot splitting the
> BLK_DEBUG_FS and DEBUG_FS cases, causing a compile error.

Right, but that can be fixed by unifying them.

> In review comments on my sg v4 driver rewrite, reviewers comment that
> I should not be using the existing SCSI_LOGGING infrastructure since
> it is deprecated and "maintain-only". [IMO it works fine]. However
> there seems no or very little support within the SCSI subsystem for
> any other approach. So this patch is an attempt to make some progress
> on that front.

That's kind of orthogonal.  One of the best ways of getting debugging
information out of the kernel is tracepoints and if we're doing that we
should probably do it in a way that it plugs nicely into blktrace, so
if that's the genesis, we should probably have the discussion first
about how we want to best extract debug information about command flow.

> When I'm debugging (and I do that a lot lately), I assume that the
> calls to the mid-level, block and other layers do what they are
> supposed to do. So I'm not interested in their debug or a breakdown
> of their objects. So, for example, I have never used scsi_debugfs and
> I expect that if I moved /proc/scsi/sg/debug to
> /sys/kernel/debug/scsi/sg/debug then I still would not want or need
> however scsi_show_rq() is used.

but would you actually build a kernel specially with the sole object of
not allowing the information to be seen?  That's the point: this is a
Kconfig option there has to be a good reason to build every possible
kernel or else it's not a good option to have.

> Does debugging the hisi_sas driver need to access scsi_show_rq() ?

It's not about need to, it's about need not.  Is there a specific
reason not to build with it ... likelihood of looking at it isn't
really the best measure.  The kernel emits tons of stuff I don't
particularly care about ... it even annoys me when it gets in the way
of stuff I do care about but it doesn't annoy me enough to want to
build a special kernel without it.

James

Re: [PATCH v3] scsi: add debugfs directories

2019-01-04 Thread James Bottomley

On Fri, 2019-01-04 at 11:18 -0500, Douglas Gilbert wrote:
> Add a top level "scsi" directory in debugfs (usually at 
> /sys/kernel/debugfs/scsi) with two subdirectories: "uld" and "lld".
> The idea is to place mid-level related 'knobs' in the "scsi"
> directory, and for the ULDs to make subsirectories like
> "scsi/uld/sd" and "scsi/uld/st" as required. LLDs could follow a
> similar pattern.

I'm really not sure this is useful, but just in case it is, I suppose
the first question should be why this isn't all simply conditioned on
BLK_DEBUG_FS?  Is there a solid use case where someone would use this
in a !BLK_DEBUG_FS situation.

[...]
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -68,6 +68,7 @@
>  
>  #include "scsi_priv.h"
>  #include "scsi_logging.h"
> +#include "scsi_debugfs.h"
>  
>  #define CREATE_TRACE_POINTS
>  #include 
> @@ -812,9 +813,27 @@ static int __init init_scsi(void)
>   scsi_netlink_init();
>  
> +#ifdef CONFIG_DEBUG_FS
> + scsi_debugfs_root = debugfs_create_dir("scsi", NULL);
> + if (!scsi_debugfs_root)
> + goto cleanup_netlink;
> + scsi_debugfs_uld = debugfs_create_dir("uld",
> scsi_debugfs_root);
> + if (!scsi_debugfs_uld)
> + goto cleanup_debugfs;
> + scsi_debugfs_lld = debugfs_create_dir("lld",
> scsi_debugfs_root);
> + if (!scsi_debugfs_lld)
> + goto cleanup_debugfs;
> +#endif

Chunks of conditionally compiled code cause major problems (primarily
because you need multiple Kconfig runs to check them all.  The way to
avoid this is either to use the IF_ENABLED() etc machinery, or simply
do it in header files.  In this case, like all init functions, I think
there needs to be a scsi_debugfs_init() scsi_debugfs_exit() function
like there is for netlink ... in fact, do it precisely how netlink does
it.

[...]
> --- a/drivers/scsi/scsi_debugfs.c
> +++ b/drivers/scsi/scsi_debugfs.c
> @@ -4,6 +4,20 @@
>  #include 
>  #include "scsi_debugfs.h"
>  
> +#ifdef CONFIG_DEBUG_FS

This is already conditionally compiled on CONFIG_DEBUG_FS ... you don't
need another check of the condition here.

> +struct dentry *scsi_debugfs_root;
> +struct dentry *scsi_debugfs_uld;
> +struct dentry *scsi_debugfs_lld;
> +
> +EXPORT_SYMBOL(scsi_debugfs_root);
> +EXPORT_SYMBOL(scsi_debugfs_uld);
> +EXPORT_SYMBOL(scsi_debugfs_lld);
> +
> +#endif
> +
> +
> +#ifdef CONFIG_BLK_DEBUG_FS

So rather than doing this, it might be better to have two conditionally
compiled files, one for just DEBUG_FS and the other for BLK_DEBUG_FS
... assuming theres a good reason not to condition it all on
BLK_DEBUG_FS.

[...]
> --- a/drivers/scsi/scsi_debugfs.h
> +++ b/drivers/scsi/scsi_debugfs.h
> @@ -1,4 +1,14 @@
> +#include 
> +
> +#ifdef CONFIG_DEBUG_FS

This file didn't have any CONFIG guards before and it doesn't need them
now.

James

Re: [PATCH v8 01/25] scsi/atari_scsi: Don't select CONFIG_NVRAM

2018-12-30 Thread James Bottomley

On Sun, 2018-12-30 at 18:50 +0100, LEROY Christophe wrote:
> Arnd Bergmann  a écrit :
> > On Sat, Dec 29, 2018 at 3:51 AM Michael Schmitz
> >  wrote:
[...]
> > > (On second thought - I don't want to speculate whether there's
> > > weird compiler options that could result in the
> > > nvram_check_checksum and nvram_read_bytes symbols to still be
> > > referenced in the final link, even though
> > > IS_BUILTIN(CONFIG_NVRAM) always evaluates to false. Best
> > > leave this as-is.)
> > 
> > As far as I know, it's totally reliable with the supported
> > compilers  (gcc-4.6+). In the older compilers (e.g. 4.1), there was
> > a corner case, where it could have failed to eliminate a function
> > that was only referenced through a pointer from a discarded
> > variable, but a plain IS_ENABLED() check like the one here
> > was still ok, and lots of code relies on that.
> > 
> > Other than that, I agree either way is totally fine here, so no
> > objections to using the #ifdef.
> 
> As far as I know, kernel codying style promotes the use of  
> IS_ENABLED() etc. instead of #ifdefs when possible.

It's a preference, as with a lot of coding style stuff, which we leave
up to the maintainer.

That said, as has been pointed out, the current #ifdef has a failing
corner case when both are modular (because the code should then be
included).  The runtime macro that correctly expresses this is
IS_REACHABLE(CONFIG_NVRAM).

James

[GIT PULL] first round of SCSI updates for the 4.20+ merge window

2018-12-24 Thread James Bottomley

no connect 
conditions
  scsi: smartpqi: wake up drives after os resumes from suspend
  scsi: smartpqi: add sysfs attributes
  scsi: smartpqi: refactor sending controller raid requests
  scsi: smartpqi: turn off lun data caching for ptraid
  scsi: smartpqi: add no_write_same for logical volumes

David Disseldorp (9):
  scsi: target: perform t10_wwn ID initialisation in target_alloc_device()
  scsi: target: remove hardcoded T10 Vendor ID in INQUIRY response
  scsi: target: add device vendor_id configfs attribute
  scsi: target: consistently null-terminate t10_wwn strings
  scsi: target: use consistent left-aligned ASCII INQUIRY data
  scsi: target: replace fabric_ops.name with fabric_alias
  scsi: target: drop unnecessary get_fabric_name() accessor from fabric_ops
  scsi: target: drop unused pi_prot_format attribute storage
  scsi: target: add emulate_pr backstore attr to toggle PR support

Don Brace (4):
  scsi: smartpqi: fix build warnings
  scsi: smartpqi: update driver version
  scsi: smartpqi: bump driver version
  scsi: smartpqi: add smp_utils support

Ewan D. Milne (1):
  scsi: lpfc: do not set queue->page_count to 0 if pc_sli4_params.wqpcnt is 
invalid

Fedor Loshakov (1):
  scsi: zfcp: make DIX experimental, disabled, and independent of DIF

Fred Herard (1):
  scsi: iscsi: Capture iscsi debug messages using tracepoints

Giridhar Malavali (1):
  scsi: qla2xxx: Fix for FC-NVMe discovery for NPIV port

Gustavo A. R. Silva (14):
  scsi: isci: request: mark expected switch fall-through
  scsi: isci: remote_node_context: mark expected switch fall-throughs
  scsi: isci: remote_device: Mark expected switch fall-throughs
  scsi: isci: phy: Mark expected switch fall-through
  scsi: myrb: Mark expected switch fall-throughs
  scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through
  scsi: BusLogic: mark expected switch fall-through
  scsi: xen-scsifront: mark expected switch fall-through
  scsi: megaraid_sas_fusion: Mark expected switch fall-through
  scsi: megaraid: megaraid_sas_base: Mark expected switch fall-through
  scsi: libfc: fc_rport: Mark expected switch fall-through
  scsi: hpsa: mark expected switch fall-throughs
  scsi: aacraid: Mark expected switch fall-through
  scsi: aacraid: Mark expected switch fall-throughs

Himanshu Madhani (2):
  scsi: qla2xxx: Update driver version to 10.00.00.12-k
  scsi: qla2xxx: Fix NPIV handling for FC-NVMe

James Bottomley (1):
  scsi: aha1542: Fix zeroday __udivdi3 warning

James Smart (35):
  scsi: lpfc: Update lpfc version to 12.0.0.10
  scsi: lpfc: Adding ability to reset chip via pci bus reset
  scsi: lpfc: Add log messages to aid in debugging fc4type discovery issues
  scsi: lpfc: Fix discovery failure when PLOGI is defered
  scsi: lpfc: update fault value on successful trunk events.
  scsi: lpfc: Correct MDS loopback diagnostics support
  scsi: lpfc: Fix link state reporting for trunking when adapter is offline
  scsi: lpfc: Enable Management features for IF_TYPE=6
  scsi: lpfc: update driver version to 12.0.0.9
  scsi: lpfc: Fix dif and first burst use in write commands
  scsi: lpfc: Fix driver release of fw-logging buffers
  scsi: lpfc: Correct topology type reporting on G7 adapters
  scsi: lpfc: Correct code setting non existent bits in sli4 ABORT WQE
  scsi: lpfc: Defer LS_ACC to FLOGI on point to point logins
  scsi: lpfc: ls_rjt erroneus FLOGIs
  scsi: lpfc: rport port swap discovery issue.
  scsi: lpfc: Cap NPIV vports to 256
  scsi: lpfc: Fix kernel Oops due to null pring pointers
  scsi: lpfc: Fix a duplicate 0711 log message number.
  scsi: lpfc: Fix discovery failures during port failovers with lots of 
vports
  scsi: lpfc: refactor mailbox structure context fields
  scsi: lpfc: update manufacturer attribute to reflect Broadcom
  scsi: lpfc: Fix panic when FW-log buffsize is not initialized
  scsi: lpfc: update driver version to 12.0.0.8
  scsi: lpfc: add Trunking support
  scsi: lpfc: Implement GID_PT on Nameserver query to support faster 
failover
  scsi: lpfc: Correct loss of fc4 type on remote port address change
  scsi: lpfc: Fix odd recovery in duplicate FLOGIs in point-to-point
  scsi: lpfc: Correct LCB RJT handling
  scsi: lpfc: fcoe: Fix link down issue after 1000+ link bounces
  scsi: lpfc: Correct errors accessing fw log
  scsi: lpfc: Reset link or adapter instead of doing infinite nameserver 
PLOGI retry
  scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event
  scsi: lpfc: Fix lpfc_sli4_read_config return value check
  scsi: lpfc: Correct speeds on SFP swap

Janek Kotas (2):
  scsi: ufs: Add UFS platform driver for Cadence UFS
  scsi: dt-bindings: ufs: Add bindings for Cadence UFS

John Garry (6):
  scs

Re: [PATCH] sd: use mempool for discard special page

2018-12-22 Thread James Bottomley

On Fri, 2018-12-21 at 20:49 -0700, Jens Axboe wrote:
> On 12/21/18 7:48 PM, James Bottomley wrote:
> > On Wed, 2018-12-12 at 06:46 -0700, Jens Axboe wrote:
> > > When boxes are run near (or to) OOM, we have a problem with the
> > > discard page allocation in sd. If we fail allocating the special
> > > page, we return busy, and it'll get retried. But since ordering
> > > is
> > > honored for dispatch requests, we can keep retrying this same IO
> > > and
> > > failing. Behind that IO could be requests that want to free
> > > memory,
> > > but they never get the chance. This means you get repeated spews
> > > of
> > > traces like this:
> > > 
> > > [1201401.625972] Call Trace:
> > > [1201401.631748]  dump_stack+0x4d/0x65
> > > [1201401.639445]  warn_alloc+0xec/0x190
> > > [1201401.647335]  __alloc_pages_slowpath+0xe84/0xf30
> > > [1201401.657722]  ? get_page_from_freelist+0x11b/0xb10
> > > [1201401.668475]  ? __alloc_pages_slowpath+0x2e/0xf30
> > > [1201401.679054]  __alloc_pages_nodemask+0x1f9/0x210
> > > [1201401.689424]  alloc_pages_current+0x8c/0x110
> > > [1201401.699025]  sd_setup_write_same16_cmnd+0x51/0x150
> > > [1201401.709987]  sd_init_command+0x49c/0xb70
> > > [1201401.719029]  scsi_setup_cmnd+0x9c/0x160
> > > [1201401.727877]  scsi_queue_rq+0x4d9/0x610
> > > [1201401.736535]  blk_mq_dispatch_rq_list+0x19a/0x360
> > > [1201401.747113]  blk_mq_sched_dispatch_requests+0xff/0x190
> > > [1201401.758844]  __blk_mq_run_hw_queue+0x95/0xa0
> > > [1201401.768653]  blk_mq_run_work_fn+0x2c/0x30
> > > [1201401.777886]  process_one_work+0x14b/0x400
> > > [1201401.787119]  worker_thread+0x4b/0x470
> > > [1201401.795586]  kthread+0x110/0x150
> > > [1201401.803089]  ? rescuer_thread+0x320/0x320
> > > [1201401.812322]  ? kthread_park+0x90/0x90
> > > [1201401.820787]  ? do_syscall_64+0x53/0x150
> > > [1201401.829635]  ret_from_fork+0x29/0x40
> > > 
> > > Ensure that the discard page allocation has a mempool backing, so
> > > we
> > > know we can make progress.
> > > 
> > > Cc: sta...@vger.kernel.org
> > > Signed-off-by: Jens Axboe 
> > > 
> > > ---
> > > 
> > > We actually hit this in production, it's not a theoretical issue.
> > > 
> > > 
> > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> > > index 4a6ed2fc8c71..a1a44f52e0e8 100644
> > > --- a/drivers/scsi/sd.c
> > > +++ b/drivers/scsi/sd.c
> > 
> > [...]
> > > @@ -759,9 +760,10 @@ static blk_status_t
> > > sd_setup_unmap_cmnd(struct
> > > scsi_cmnd *cmd)
> > >   unsigned int data_len = 24;
> > >   char *buf;
> > >  
> > > - rq->special_vec.bv_page = alloc_page(GFP_ATOMIC |
> > > __GFP_ZERO);
> > > + rq->special_vec.bv_page = mempool_alloc(sd_page_pool,
> > > GFP_ATOMIC);
> > >   if (!rq->special_vec.bv_page)
> > >   return BLK_STS_RESOURCE;
> > > + clear_highpage(rq->special_vec.bv_page);
> > 
> > Since the kernel never accesses this page and you take pains to
> > kmap it
> > when clearing, shouldn't the allocation have __GFP_HIGHMEM?
> 
> It's not going to great lengths to kmap it, clear_page() conveniently
> takes an address, not a page. So it's either
> 
> clear_page(page_address(rq->special_vec.bv_page));
> 
> or the current clear_highpage(). Shrug.

Yes, I guessed that and agree our semantics are slightly problematic.

However, the question was, since this is about starvation, why not use
__GFP_HIGHMEM in the allocation?  I realise it makes no difference on
64 bit, but it does on 32 where it allows access to the HIGHMEM zone
which is often quite large and would lead to better behaviour under
memory pressure.

James

[GIT PULL] SCSI fixes for 4.20-rc7

2018-12-21 Thread James Bottomley

This is two simple target fixes and one discard related I/O starvation
problem in sd.  The discard problem occurs because the discard page
doesn't have a mempool backing so if the allocation fails due to memory
pressure, we then lose the forward progress we require if the writeout
is on the same device.  The fix is to back it with a mempool.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Jens Axboe (1):
  scsi: sd: use mempool for discard special page

Varun Prakash (2):
  scsi: target: iscsi: cxgbit: add missing spin_lock_init()
  scsi: target: iscsi: cxgbit: fix csk leak

And the diffstat:

 drivers/scsi/sd.c | 23 +++
 drivers/target/iscsi/cxgbit/cxgbit_cm.c   |  5 -
 drivers/target/iscsi/cxgbit/cxgbit_main.c |  1 +
 3 files changed, 24 insertions(+), 5 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 3bb2b3351e35..bd0a5c694a97 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -133,6 +133,7 @@ static DEFINE_MUTEX(sd_ref_mutex);
 
 static struct kmem_cache *sd_cdb_cache;
 static mempool_t *sd_cdb_pool;
+static mempool_t *sd_page_pool;
 
 static const char *sd_cache_types[] = {
"write through", "none", "write back",
@@ -759,9 +760,10 @@ static int sd_setup_unmap_cmnd(struct scsi_cmnd *cmd)
unsigned int data_len = 24;
char *buf;
 
-   rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+   rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
return BLKPREP_DEFER;
+   clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
@@ -792,9 +794,10 @@ static int sd_setup_write_same16_cmnd(struct scsi_cmnd 
*cmd, bool unmap)
u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
u32 data_len = sdp->sector_size;
 
-   rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+   rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
return BLKPREP_DEFER;
+   clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
@@ -822,9 +825,10 @@ static int sd_setup_write_same10_cmnd(struct scsi_cmnd 
*cmd, bool unmap)
u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
u32 data_len = sdp->sector_size;
 
-   rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+   rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
return BLKPREP_DEFER;
+   clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
@@ -1286,7 +1290,7 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
u8 *cmnd;
 
if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
-   __free_page(rq->special_vec.bv_page);
+   mempool_free(rq->special_vec.bv_page, sd_page_pool);
 
if (SCpnt->cmnd != scsi_req(rq)->cmd) {
cmnd = SCpnt->cmnd;
@@ -3623,6 +3627,13 @@ static int __init init_sd(void)
goto err_out_cache;
}
 
+   sd_page_pool = mempool_create_page_pool(SD_MEMPOOL_SIZE, 0);
+   if (!sd_page_pool) {
+   printk(KERN_ERR "sd: can't init discard page pool\n");
+   err = -ENOMEM;
+   goto err_out_ppool;
+   }
+
err = scsi_register_driver(&sd_template.gendrv);
if (err)
goto err_out_driver;
@@ -3630,6 +3641,9 @@ static int __init init_sd(void)
return 0;
 
 err_out_driver:
+   mempool_destroy(sd_page_pool);
+
+err_out_ppool:
mempool_destroy(sd_cdb_pool);
 
 err_out_cache:
@@ -3656,6 +3670,7 @@ static void __exit exit_sd(void)
 
scsi_unregister_driver(&sd_template.gendrv);
mempool_destroy(sd_cdb_pool);
+   mempool_destroy(sd_page_pool);
kmem_cache_destroy(sd_cdb_cache);
 
class_unregister(&sd_disk_class);
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c 
b/drivers/target/iscsi/cxgbit/cxgbit_cm.c
index 71888b979ab5..b19c960d5490 100644
--- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c
+++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c
@@ -641,8 +641,11 @@ static void cxgbit_send_halfclose(struct cxgbit_sock *csk)
 
 static void cxgbit_arp_failure_discard(void *handle, struct sk_buff *skb)
 {
+   struct cxgbit_sock *csk = handle;
+
pr_debug("%s cxgbit_device %p\n", __func__, handle);
kfree_skb(skb);
+   cxgbit_put_csk(csk);
 }
 
 static void cxgbit_abort_arp_failure(void *handl

Re: [PATCH] sd: use mempool for discard special page

2018-12-21 Thread James Bottomley

On Wed, 2018-12-12 at 06:46 -0700, Jens Axboe wrote:
> When boxes are run near (or to) OOM, we have a problem with the
> discard page allocation in sd. If we fail allocating the special
> page, we return busy, and it'll get retried. But since ordering is
> honored for dispatch requests, we can keep retrying this same IO and
> failing. Behind that IO could be requests that want to free memory,
> but they never get the chance. This means you get repeated spews of
> traces like this:
> 
> [1201401.625972] Call Trace:
> [1201401.631748]  dump_stack+0x4d/0x65
> [1201401.639445]  warn_alloc+0xec/0x190
> [1201401.647335]  __alloc_pages_slowpath+0xe84/0xf30
> [1201401.657722]  ? get_page_from_freelist+0x11b/0xb10
> [1201401.668475]  ? __alloc_pages_slowpath+0x2e/0xf30
> [1201401.679054]  __alloc_pages_nodemask+0x1f9/0x210
> [1201401.689424]  alloc_pages_current+0x8c/0x110
> [1201401.699025]  sd_setup_write_same16_cmnd+0x51/0x150
> [1201401.709987]  sd_init_command+0x49c/0xb70
> [1201401.719029]  scsi_setup_cmnd+0x9c/0x160
> [1201401.727877]  scsi_queue_rq+0x4d9/0x610
> [1201401.736535]  blk_mq_dispatch_rq_list+0x19a/0x360
> [1201401.747113]  blk_mq_sched_dispatch_requests+0xff/0x190
> [1201401.758844]  __blk_mq_run_hw_queue+0x95/0xa0
> [1201401.768653]  blk_mq_run_work_fn+0x2c/0x30
> [1201401.777886]  process_one_work+0x14b/0x400
> [1201401.787119]  worker_thread+0x4b/0x470
> [1201401.795586]  kthread+0x110/0x150
> [1201401.803089]  ? rescuer_thread+0x320/0x320
> [1201401.812322]  ? kthread_park+0x90/0x90
> [1201401.820787]  ? do_syscall_64+0x53/0x150
> [1201401.829635]  ret_from_fork+0x29/0x40
> 
> Ensure that the discard page allocation has a mempool backing, so we
> know we can make progress.
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Jens Axboe 
> 
> ---
> 
> We actually hit this in production, it's not a theoretical issue.
> 
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 4a6ed2fc8c71..a1a44f52e0e8 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
[...]
> @@ -759,9 +760,10 @@ static blk_status_t sd_setup_unmap_cmnd(struct
> scsi_cmnd *cmd)
>   unsigned int data_len = 24;
>   char *buf;
>  
> - rq->special_vec.bv_page = alloc_page(GFP_ATOMIC |
> __GFP_ZERO);
> + rq->special_vec.bv_page = mempool_alloc(sd_page_pool,
> GFP_ATOMIC);
>   if (!rq->special_vec.bv_page)
>   return BLK_STS_RESOURCE;
> + clear_highpage(rq->special_vec.bv_page);

Since the kernel never accesses this page and you take pains to kmap it
when clearing, shouldn't the allocation have __GFP_HIGHMEM?

James

Re: [PATCH 03/10] SCSI: fcoe: convert to use BUS_ATTR_WO

2018-12-21 Thread James Bottomley

[scsi list cc added]
On Fri, 2018-12-21 at 08:54 +0100, Greg Kroah-Hartman wrote:
> We are trying to get rid of BUS_ATTR() and the usage of that in the
> fcoe driver can be trivially converted to use BUS_ATTR_WO(), so use
> that instead.
> 
> At the same time remove a unneeded EXPORT_SYMBOL() marking for the
> sysfs callback function we are renaming, no idea of how that got into
> the tree...

The EXPORT_SYMBOL removal is fine, but

[...]
> --- a/include/scsi/libfcoe.h
> +++ b/include/scsi/libfcoe.h
> @@ -405,10 +405,8 @@ int fcoe_transport_attach(struct fcoe_transport
> *ft);
>  int fcoe_transport_detach(struct fcoe_transport *ft);
> 
>  /* sysfs store handler for ctrl_control interface */
> -ssize_t fcoe_ctlr_create_store(struct bus_type *bus,
> -const char *buf, size_t count);
> -ssize_t fcoe_ctlr_destroy_store(struct bus_type *bus,
> - const char *buf, size_t count);
> +ssize_t ctlr_create_store(struct bus_type *bus, const char *buf,
> size_t count);
> +ssize_t ctlr_destroy_store(struct bus_type *bus, const char *buf,
> size_t count);

You're really damaging our prefix namespace here.  It looks like the
ctlr_ name is a farly recent addition for sysfs (only myra/b) use it in
SCSI but it's inviting symbol clashes.

Since the XXX_ATTR_RO seem to be defined with only local file usage in
mind, I think we need static functions to shim the problem, like below
(provided this is the only instance, because if it isn't, you probably
need a XXX_ATTR_WO_prefix() macro)

James

---

diff --git a/drivers/scsi/fcoe/fcoe_sysfs.c b/drivers/scsi/fcoe/fcoe_sysfs.c
index 5c8310bade61..88e5c26ce381 100644
--- a/drivers/scsi/fcoe/fcoe_sysfs.c
+++ b/drivers/scsi/fcoe/fcoe_sysfs.c
@@ -671,8 +671,20 @@ static const struct device_type fcoe_fcf_device_type = {
.release = fcoe_fcf_device_release,
 };
 
-static BUS_ATTR(ctlr_create, S_IWUSR, NULL, fcoe_ctlr_create_store);
-static BUS_ATTR(ctlr_destroy, S_IWUSR, NULL, fcoe_ctlr_destroy_store);
+static ssize_t ctlr_create_store(struct bus_type *bus, const char *buf,
+size_t count)
+{
+   return fcoe_ctlr_create_store(bus, buf, count);
+}
+
+static ssize_t ctlr_destroy_store(struct bus_type *bus, const char *buf,
+ size_t count)
+{
+   return fcoe_ctlr_destroy_store(bus, buf, count);
+}
+
+static BUS_ATTR_WO(ctlr_create);
+static BUS_ATTR_WO(ctlr_destroy);
 
 static struct attribute *fcoe_bus_attrs[] = {
&bus_attr_ctlr_create.attr,

[GIT PULL] SCSI fixes for 4.20-rc7

2018-12-18 Thread James Bottomley

Three fixes: The t10-pi one is a regression from the 4.19 release, the
qla2xxx one is a 4.20 merge window regression and the bnx2fc is a very
old bug.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Dan Carpenter (1):
  scsi: bnx2fc: Fix NULL dereference in error handling

Himanshu Madhani (1):
  Revert "scsi: qla2xxx: Fix NVMe Target discovery"

Martin K. Petersen (1):
  scsi: t10-pi: Return correct ref tag when queue has no integrity profile

And the diffstat:

 drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 2 +-
 drivers/scsi/qla2xxx/qla_os.c | 4 ++--
 include/linux/t10-pi.h| 9 +
 3 files changed, 8 insertions(+), 7 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c 
b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index cd160f2ec75d..bcd30e2374f1 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -2364,7 +2364,7 @@ static int _bnx2fc_create(struct net_device *netdev,
if (!interface) {
printk(KERN_ERR PFX "bnx2fc_interface_create failed\n");
rc = -ENOMEM;
-   goto ifput_err;
+   goto netdev_err;
}
 
if (is_vlan_dev(netdev)) {
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index b658b9a5eb1e..d0ecc729a90a 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -4886,10 +4886,10 @@ void qla24xx_create_new_sess(struct scsi_qla_host *vha, 
struct qla_work_evt *e)
fcport->d_id = e->u.new_sess.id;
fcport->flags |= FCF_FABRIC_DEVICE;
fcport->fw_login_state = DSC_LS_PLOGI_PEND;
-   if (e->u.new_sess.fc4_type & FS_FC4TYPE_FCP)
+   if (e->u.new_sess.fc4_type == FS_FC4TYPE_FCP)
fcport->fc4_type = FC4_TYPE_FCP_SCSI;
 
-   if (e->u.new_sess.fc4_type & FS_FC4TYPE_NVME) {
+   if (e->u.new_sess.fc4_type == FS_FC4TYPE_NVME) {
fcport->fc4_type = FC4_TYPE_OTHER;
fcport->fc4f_nvme = FC4_TYPE_NVME;
}
diff --git a/include/linux/t10-pi.h b/include/linux/t10-pi.h
index b9626aa7e90c..3e2a80cc7b56 100644
--- a/include/linux/t10-pi.h
+++ b/include/linux/t10-pi.h
@@ -39,12 +39,13 @@ struct t10_pi_tuple {
 
 static inline u32 t10_pi_ref_tag(struct request *rq)
 {
+   unsigned int shift = ilog2(queue_logical_block_size(rq->q));
+
 #ifdef CONFIG_BLK_DEV_INTEGRITY
-   return blk_rq_pos(rq) >>
-   (rq->q->integrity.interval_exp - 9) & 0x;
-#else
-   return -1U;
+   if (rq->q->integrity.interval_exp)
+   shift = rq->q->integrity.interval_exp;
 #endif
+   return blk_rq_pos(rq) >> (shift - SECTOR_SHIFT) & 0x;
 }
 
 extern const struct blk_integrity_profile t10_pi_type1_crc;

Re: [PATCH] scsi: qla2xxx: fix unused function warning

2018-12-10 Thread James Bottomley

On Mon, 2018-12-10 at 22:28 +0100, Arnd Bergmann wrote:
> On Mon, Dec 10, 2018 at 10:01 PM Bart Van Assche 
> wrote:
> > 
> > On Mon, 2018-12-10 at 21:51 +0100, Arnd Bergmann wrote:
> > > In what seems to be a mismatch between the scsi-fixes branch and
> > > the scsi-mkp/for-next branch, a newly introduced variable from
> > > one patch got obsoleted in another one:
> > > 
> > > drivers/scsi/qla2xxx/qla_os.c: In function
> > > '__qla2x00_abort_all_cmds':
> > > drivers/scsi/qla2xxx/qla_os.c:1791:11: error: unused variable
> > > 'status' [-Werror=unused-variable]
> > > 
> > > Remove the variable again.
> > > 
> > > Fixes: c4e521b654e1 ("scsi: qla2xxx: Split the
> > > __qla2x00_abort_all_cmds() function")
> > > Fixes: f2ffd4e5bc7b ("scsi: qla2xxx: Timeouts occur on surprise
> > > removal of QLogic adapter")
> > > Signed-off-by: Arnd Bergmann 
> > > ---
> > > Maybe check carefully that the merge in linux-next is otherwise
> > > correct
> > > ---
> > >  drivers/scsi/qla2xxx/qla_os.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/scsi/qla2xxx/qla_os.c
> > > b/drivers/scsi/qla2xxx/qla_os.c
> > > index 63c47bc7ae59..db331cb5ba3c 100644
> > > --- a/drivers/scsi/qla2xxx/qla_os.c
> > > +++ b/drivers/scsi/qla2xxx/qla_os.c
> > > @@ -1788,7 +1788,7 @@ static void qla2x00_abort_srb(struct
> > > qla_qpair *qp, srb_t *sp, const int res,
> > >  static void
> > >  __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res)
> > >  {
> > > - int cnt, status;
> > > + int cnt;
> > >   unsigned long flags;
> > >   srb_t *sp;
> > >   scsi_qla_host_t *vha = qp->vha;
> > 
> > When I prepared commit c4e521b654e1 I verified that my patch did
> > not produce any
> > warnings when building with W=1. So something must be wrong at your
> > side. Did you
> > perhaps start from linux-next to prepare this patch? If so, please
> > submit this
> > patch to Stephen Rothwell.
> 
> Yes, I tried to make clear that the two branches are fine by
> themselves, sorry if I was still ambiguous. The patch is currently
> only needed on linux-next as far as I can tell, but we should avoid
> getting the same error when the branches are merged in mainline.

It should be fixed in linux-next as of the next release.  It was just a
problem with the original merge which I fixed when I did my version of
it.

James

[GIT PULL] SCSI fixes for 4.20-rc5

2018-12-05 Thread James Bottomley

Four obvious bug fixes.  The vmw_pscsi one is so old that it's amazing
no-one noticed before now.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Cathy Avery (1):
  scsi: vmw_pscsi: Rearrange code to avoid multiple calls to free_irq 
during unload

Dexuan Cui (1):
  scsi: storvsc: Fix a race in sub-channel creation that can cause panic

Fred Herard (1):
  scsi: libiscsi: Fix NULL pointer dereference in iscsi_eh_session_reset

Martin Wilck (1):
  scsi: lpfc: fix block guard enablement on SLI3 adapters

And the diffstat:

 drivers/scsi/libiscsi.c   |  4 +--
 drivers/scsi/lpfc/lpfc_init.c |  6 -
 drivers/scsi/lpfc/lpfc_sli.c  |  1 -
 drivers/scsi/storvsc_drv.c| 61 +--
 drivers/scsi/vmw_pvscsi.c |  4 +--
 5 files changed, 39 insertions(+), 37 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 93c66ebad907..f78d2e5c1471 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2416,8 +2416,8 @@ int iscsi_eh_session_reset(struct scsi_cmnd *sc)
 failed:
ISCSI_DBG_EH(session,
 "failing session reset: Could not log back into "
-"%s, %s [age %d]\n", session->targetname,
-conn->persistent_address, session->age);
+"%s [age %d]\n", session->targetname,
+session->age);
spin_unlock_bh(&session->frwd_lock);
mutex_unlock(&session->eh_mutex);
return FAILED;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 20fa6785a0e2..68d62d55a3a5 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -167,7 +167,11 @@ lpfc_config_port_prep(struct lpfc_hba *phba)
   sizeof(phba->wwpn));
}
 
-   phba->sli3_options = 0x0;
+   /*
+* Clear all option bits except LPFC_SLI3_BG_ENABLED,
+* which was already set in lpfc_get_cfgparam()
+*/
+   phba->sli3_options &= (uint32_t)LPFC_SLI3_BG_ENABLED;
 
/* Setup and issue mailbox READ REV command */
lpfc_read_rev(phba, pmb);
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 783a1540cfbe..b9e5cd79931a 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -4965,7 +4965,6 @@ lpfc_sli_config_port(struct lpfc_hba *phba, int sli_mode)
phba->sli3_options &= ~(LPFC_SLI3_NPIV_ENABLED |
LPFC_SLI3_HBQ_ENABLED |
LPFC_SLI3_CRP_ENABLED |
-   LPFC_SLI3_BG_ENABLED |
LPFC_SLI3_DSS_ENABLED);
if (rc != MBX_SUCCESS) {
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index f03dc03a42c3..8f88348ebe42 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -446,7 +446,6 @@ struct storvsc_device {
 
bool destroy;
bool drain_notify;
-   bool open_sub_channel;
atomic_t num_outstanding_req;
struct Scsi_Host *host;
 
@@ -636,33 +635,38 @@ static inline struct storvsc_device *get_in_stor_device(
 static void handle_sc_creation(struct vmbus_channel *new_sc)
 {
struct hv_device *device = new_sc->primary_channel->device_obj;
+   struct device *dev = &device->device;
struct storvsc_device *stor_device;
struct vmstorage_channel_properties props;
+   int ret;
 
stor_device = get_out_stor_device(device);
if (!stor_device)
return;
 
-   if (stor_device->open_sub_channel == false)
-   return;
-
memset(&props, 0, sizeof(struct vmstorage_channel_properties));
 
-   vmbus_open(new_sc,
-  storvsc_ringbuffer_size,
-  storvsc_ringbuffer_size,
-  (void *)&props,
-  sizeof(struct vmstorage_channel_properties),
-  storvsc_on_channel_callback, new_sc);
+   ret = vmbus_open(new_sc,
+storvsc_ringbuffer_size,
+storvsc_ringbuffer_size,
+(void *)&props,
+sizeof(struct vmstorage_channel_properties),
+storvsc_on_channel_callback, new_sc);
 
-   if (new_sc->state == CHANNEL_OPENED_STATE) {
-   stor_device->stor_chns[new_sc->target_cpu] = new_sc;
-   cpumask_set_cpu(new_sc->target_cpu, &stor_device->alloced_cpus);
+   /* In case vmbus_open() fails, we don't use the sub-channel. */
+   if (ret != 0) {
+   dev_err(dev, "Failed to open sub-channel: err=%d\n", ret);
+

Re: [PATCH RFC 00/15] Zero ****s, hugload of hugs <3

2018-11-30 Thread James Bottomley

On Fri, 2018-11-30 at 14:26 -0800, Jarkko Sakkinen wrote:
> On Fri, Nov 30, 2018 at 03:14:59PM -0700, Jonathan Corbet wrote:
[...]
> > Have you read Documentation/process/code-of-conduct-
> > interpretation.rst? 
> > As has been pointed out, it contains a clear answer to how things
> > should be interpreted here.
> 
> Ugh, was not aware that there two documents.
> 
> Yeah, definitely sheds light. Why the documents could not be merged
> to single common sense code of conduct?

The fact that we've arrived at essentially an original CoC
reinterpreted to the point where it's effectively a new CoC has been
the source of much debate and recrimination over the last few months
... you can read it in the ksummit-discuss archives, but I really think
we don't want to reopen that can of worms.

James

Re: [PATCH RFC 00/15] Zero ****s, hugload of hugs <3

2018-11-30 Thread James Bottomley

On Fri, 2018-11-30 at 14:12 -0800, Jarkko Sakkinen wrote:
[...]
> I pasted this already to another response and this was probably the
> part that ignited me to send the patch set (was a few days ago, so
> had to revisit to find the exact paragraph):

I replied in to the other thread.

> "Maintainers have the right and responsibility to remove, edit, or
> reject comments, commits, code, wiki edits, issues, and other
> contributions that are not aligned to this Code of Conduct, or to ban
> temporarily or permanently any contributor for other behaviors that
> they deem inappropriate, threatening, offensive, or harmful."
> 
> The whole patch set is neither a joke/troll nor something I would
> necessarily want to be include myself. It does have the RFC tag.
> 
> As a maintainer myself (and based on somewhat disturbed feedback from
> other maintainers) I can only make the conclusion that nobody knows
> what the responsibility part here means.
> 
> I would interpret, if I read it like at lawyer at least, that even
> for existing code you would need to do the changes postmorterm.

That's wrong in the light of the interpretation document, yes.

> Is this wrong interpretation?  Should I conclude that I made a
> mistake by reading the CoC and trying to understand what it
> *actually* says?

You can't read it in isolation, you need to read it along with the
interpretation document.  The latter was created precisely because
there was a lot of push back on interpretation problems and ambiguities
with the original CoC and it specifically covers this case (and a lot
of others).

James


> After this discussion, I can say that I understand it less than
> before.
> 
> /Jarkko
>

Re: [PATCH RFC 00/15] Zero ****s, hugload of hugs <3

2018-11-30 Thread James Bottomley

On Fri, 2018-11-30 at 13:54 -0800, Jarkko Sakkinen wrote:
> On Fri, Nov 30, 2018 at 01:48:08PM -0800, David Miller wrote:
> > From: Jarkko Sakkinen 
> > Date: Fri, 30 Nov 2018 13:44:05 -0800
> > 
> > > On Fri, Nov 30, 2018 at 01:01:02PM -0800, James Bottomley wrote:
> > > > No because use of what some people consider to be bad language
> > > > isn't
> > > > necessarily abusive, offensive or degrading.  Our most heavily
> > > > censored
> > > > medium is TV and "fuck" is now considered acceptable in certain
> > > > contexts on most channels in the UK and EU.
> > > 
> > > This makes following the CoC extremely hard to a non-native
> > > speaker as
> > > it is not too explicit on what is OK and what is not. I did
> > > through the
> > > whole thing with an eye glass and this what I deduced from it.
> > 
> > It would be helpful if you could explain what part of the language
> > is unclear wrt. explaining how CoC does not apply to existing code.
> > 
> > That part seems very explicit to me.
> 
> Well, now that I re-read it, it was this part to be exact:
> 
> "Maintainers have the right and responsibility to remove, edit, or
> reject comments, commits, code, wiki edits, issues, and other
> contributions that are not aligned to this Code of Conduct, or to ban
> temporarily or permanently any contributor for other behaviors that
> they deem inappropriate, threatening, offensive, or harmful."
> 
> How this should be interpreted?

Firstly, this is *only* about contributions going forward.  The
interpretation document says we don't have to look back into the
repository and we definitely can't remove something from git that's
already been committed.

As a Maintainer, if you feel bad language is inappropriate for your
subsystem then you say so and reject with that reason patches that come
in containing it (suggesting alternative rewordings).  However, your
determination about what is or isn't acceptable in your subsystem isn't
binding on other maintainers, who may have different standards ... this
is identical to what we do with coding, like your insistence on one
line per variable or other subsystem's insistence on reverse christmas
tree for includes ...

James

> I have not really followed the previous CoC discussions as I try to
> always use polite language so I'm sorry if this duplicate.
> 
> /Jarkko
>

Re: [PATCH RFC 00/15] Zero ****s, hugload of hugs <3

2018-11-30 Thread James Bottomley

On Fri, 2018-11-30 at 13:44 -0800, Jarkko Sakkinen wrote:
> On Fri, Nov 30, 2018 at 01:01:02PM -0800, James Bottomley wrote:
> > No because use of what some people consider to be bad language
> > isn't necessarily abusive, offensive or degrading.  Our most
> > heavily censored medium is TV and "fuck" is now considered
> > acceptable in certain contexts on most channels in the UK and EU.
> 
> This makes following the CoC extremely hard to a non-native speaker
> as it is not too explicit on what is OK and what is not. I did
> through the whole thing with an eye glass and this what I deduced
> from it.

OK, so something that would simply be considered in some quarters as
bad language isn't explicitly banned.  The thing which differentiates
simple bad language from "abusive, offensive or degrading language",
which is called out by the CoC, is the context and the target.

So when it's a simple expletive or the code of the author or even the
hardware is the target, I'd say it's an easy determination it's not a
CoC violation.  If someone else's code is the target or the inventor of
the hardware is targetted by name, I'd say it is.  Even non-native
English speakers should be able to determine target and context,
because that's the essence of meaning.

James

Re: [PATCH RFC 00/15] Zero ****s, hugload of hugs <3

2018-11-30 Thread James Bottomley

On Fri, 2018-11-30 at 12:55 -0800, Jarkko Sakkinen wrote:
> On Fri, Nov 30, 2018 at 11:56:52AM -0800, Davidlohr Bueso wrote:
> > On Fri, 30 Nov 2018, Kees Cook wrote:
> > 
> > > On Fri, Nov 30, 2018 at 11:27 AM Jarkko Sakkinen
> > >  wrote:
> > > > 
> > > > In order to comply with the CoC, replace  with a hug.
> > 
> > I hope this is some kind of joke. How would anyone get offended by
> > reading technical comments? This is all beyond me...
> 
> Well... Not a joke really but more like conversation starter :-)
> 
> I had 10h flight from Amsterdam to Portland and one of the things
> that I did was to read the new CoC properly.
> 
> This a direct quote from the CoC:
> 
> "Harassment includes the use of abusive, offensive or degrading
> language, intimidation, stalking, harassing photography or recording,
> inappropriate physical contact, sexual imagery and unwelcome sexual
> advances or requests for sexual favors."
> 
> Doesn't this fall into this category?

No because use of what some people consider to be bad language isn't
necessarily abusive, offensive or degrading.  Our most heavily censored
medium is TV and "fuck" is now considered acceptable in certain
contexts on most channels in the UK and EU.

> Your argument is not that great because you could say that from any
> LKML discussion. If you don't like hugging, please propose something
> else
> :-)

The interpretation document also says this:

   ontributions submitted for the kernel should use appropriate
   language. Content that already exists predating the Code of Conduct
   will not be addressed now as a violation.

So that definitely means there should be no hunting down of existing
comments in kernel code.

James

Re: [PATCH RFC 11/15] scsi: replace **** with a hug

2018-11-30 Thread James Bottomley

On Fri, 2018-11-30 at 11:27 -0800, Jarkko Sakkinen wrote:
> In order to comply with the CoC, replace  with a hug.
> 
> Signed-off-by: Jarkko Sakkinen 
> ---
>  drivers/scsi/qlogicpti.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/qlogicpti.h b/drivers/scsi/qlogicpti.h
> index 884ad72ade57..f1785154061c 100644
> --- a/drivers/scsi/qlogicpti.h
> +++ b/drivers/scsi/qlogicpti.h
> @@ -62,7 +62,7 @@
>  #define REQUEST_QUEUE_WAKEUP 0x8005
>  #define EXECUTION_TIMEOUT_RESET  0x8006
>  
> -/* Am I fucking pedantic or what? */
> +/* Am I hugging pedantic or what? */

I really don't think this is either necessary or a good idea.  The code
of conduct explicitly doesn't require this.  It says:

   Contributions submitted for the kernel should use appropriate
   language.
   Content that already exists predating the Code of Conduct will not
   be
   addressed now as a violation.

However, I assume you're thinking about the next sentence:

   Inappropriate language can be seen as a bug, though; such bugs will
   be fixed more quickly if any interested parties submit patches to
   that effect.

?

I would interpret the above comment as simply an expression of
frustration directed by the author at some of the endian problems
drivers like this bring.  Since it's not directed at anything other
than hardware I don't think it could be seen as threatening or abusive
or contributing to a negative environment even if submitted today.

For better or for worse, the word "fuck", which was once seen as so
beyond the pale that only loose tongued individuals like monty python
could get away with it (The great John Cleese use of the word fuck at a
funeral [Eric Chapman's] springing immediately to mind), has now moved
into the TV primetime mainstream and it's not our job to try and
reimpose language decency on the world.

James

Re: [PATCH] scsi: libsas: Add missing license and update to SPDX license identifier

2018-11-29 Thread James Bottomley

On Thu, 2018-11-29 at 11:52 +, John Garry wrote:
[...]
> Hi Greg,
> 
> I also note that currently we have an inconsistency in license of 
> sas_init.c:
> 
> /*
>   * Serial Attached SCSI (SAS) Transport Layer initialization
>   *
>   * Copyright (C) 2005 Adaptec, Inc.  All rights reserved.
>   * Copyright (C) 2005 Luben Tuikov 
>   *
>   * This file is licensed under GPLv2.
>   *
>   * This program is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU General Public License as
>   * published by the Free Software Foundation; either version 2 of
> the
>   * License, or (at your option) any later version.
>   *
>   * This program is distributed in the hope that it will be useful,
> but
>   * WITHOUT ANY WARRANTY; without even the implied warranty of
>   * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>   * General Public License for more details.
>   *
>   * You should have received a copy of the GNU General Public License
>   * along with this program; if not, write to the Free Software
>   * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-
> 1307
>   * USA
>   *
>   */
> 
> ...
> 
> MODULE_AUTHOR("Luben Tuikov ");
> MODULE_DESCRIPTION("SAS Transport Layer");
> MODULE_LICENSE("GPL v2");
> 
> So the license specifies v2+ but module license states v2.
> 
> I could not find a docment for guidance on this. I also note that
> making sas_task.c v2 would mean mixing v2 and v2+ into the module.

The point here is to get help.  There are three pieces of evidence in
the file one for v2+ and two for v2.  So you can look in the git tree
to when it was actually contributed:

commit 2908d778ab3e244900c310974e1fc1c69066e450
Author: James Bottomley 
Date:   Tue Aug 29 09:22:51 2006 -0500

[SCSI] aic94xx: new driver
 
and if you ask the original contributor he can tell you the original
intent was v2 only.  If you want to modify all the files in libsas and
aic9xxx to have that SPDX tag.

> At this point I'm reluctant to touch this in case I mess up, but
> there  is still the missing license in sas_task.c .

Again, the tree will tell you.  In this case it's

commit 366ca51f30de1cbb5b356c70b7bb22051c558e41
Author: James Bottomley 
Date:   Fri Jan 18 10:47:01 2008 -0600

[SCSI] libsas: abstract STP task status into a function

So that file is a direct extraction from an existing v2 only file in
aic9xxx, so it's licence is also v2 only.

James

Re: linux-next: Tree for Nov 27 (scsi/aha1542)

2018-11-27 Thread James Bottomley

On Wed, 2018-11-28 at 15:38 +1100, Stephen Rothwell wrote:
> Hi all,
> 
> On Tue, 27 Nov 2018 20:14:58 -0800 Randy Dunlap  g> wrote:
> > 
> > On 11/26/18 8:25 PM, Stephen Rothwell wrote:
> > > Hi all,
> > > 
> > > Changes since 20181126:
> > >   
> > 
> > on i386:
> > 
> > ERROR: "__udivdi3" [drivers/scsi/aha1542.ko] undefined!
> > 
> > somewhere in aha1542_interrupt() according to objdump.
> 
> Presumably caused by commit
> 
>   1794ef2b150d ("scsi: aha1542: convert to DMA mapping API")

Yes, it's because dma_addr_t can be u64 on pae systems but
isa_virt_to_bus only ever returns unsigned long (because an ISA
physical address can only be 24 bits).

I think this is the fix; there doesn't seem to be much point converting
to do_div given all the limitations.

James

---

diff --git a/drivers/scsi/aha1542.c b/drivers/scsi/aha1542.c
index a9c29757172f..afb693d7b44f 100644
--- a/drivers/scsi/aha1542.c
+++ b/drivers/scsi/aha1542.c
@@ -325,7 +325,7 @@ static irqreturn_t aha1542_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
};
 
-   mbo = (scsi2int(mb[mbi].ccbptr) - aha1542->ccb_handle) / 
sizeof(struct ccb);
+   mbo = (scsi2int(mb[mbi].ccbptr) - (unsigned 
long)aha1542->ccb_handle) / sizeof(struct ccb);
mbistatus = mb[mbi].status;
mb[mbi].status = 0;
aha1542->aha1542_last_mbi_used = mbi;

signature.asc
Description: This is a digitally signed message part

[GIT PULL] SCSI fixes for 4.20-rc3

2018-11-21 Thread James Bottomley

Two small fixes.  The qla2xxx is a regression from 4.18 and the ufs one
is a device enablement fix.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Bill Kuzeja (1):
  scsi: qla2xxx: Timeouts occur on surprise removal of QLogic adapter

Wei Li (1):
  scsi: ufs: Fix hynix ufs bug with quirk on hi36xx SoC

And the diffstat:

 drivers/scsi/qla2xxx/qla_os.c | 10 --
 drivers/scsi/ufs/ufs-hisi.c   |  9 +
 drivers/scsi/ufs/ufs_quirks.h |  6 ++
 drivers/scsi/ufs/ufshcd.c |  2 ++
 4 files changed, 25 insertions(+), 2 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 20c85eed1a75..b658b9a5eb1e 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1749,7 +1749,7 @@ qla2x00_loop_reset(scsi_qla_host_t *vha)
 static void
 __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res)
 {
-   int cnt;
+   int cnt, status;
unsigned long flags;
srb_t *sp;
scsi_qla_host_t *vha = qp->vha;
@@ -1799,10 +1799,16 @@ __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res)
if (!sp_get(sp)) {
spin_unlock_irqrestore
(qp->qp_lock_ptr, 
flags);
-   qla2xxx_eh_abort(
+   status = qla2xxx_eh_abort(
GET_CMD_SP(sp));
spin_lock_irqsave
(qp->qp_lock_ptr, 
flags);
+   /*
+* Get rid of extra reference 
caused
+* by early exit from 
qla2xxx_eh_abort
+*/
+   if (status == FAST_IO_FAIL)
+   
atomic_dec(&sp->ref_count);
}
}
sp->done(sp, res);
diff --git a/drivers/scsi/ufs/ufs-hisi.c b/drivers/scsi/ufs/ufs-hisi.c
index 46df707e6f2c..452e19f8fb47 100644
--- a/drivers/scsi/ufs/ufs-hisi.c
+++ b/drivers/scsi/ufs/ufs-hisi.c
@@ -20,6 +20,7 @@
 #include "unipro.h"
 #include "ufs-hisi.h"
 #include "ufshci.h"
+#include "ufs_quirks.h"
 
 static int ufs_hisi_check_hibern8(struct ufs_hba *hba)
 {
@@ -390,6 +391,14 @@ static void ufs_hisi_set_dev_cap(struct 
ufs_hisi_dev_params *hisi_param)
 
 static void ufs_hisi_pwr_change_pre_change(struct ufs_hba *hba)
 {
+   if (hba->dev_quirks & UFS_DEVICE_QUIRK_HOST_VS_DEBUGSAVECONFIGTIME) {
+   pr_info("ufs flash device must set VS_DebugSaveConfigTime 
0x10\n");
+   /* VS_DebugSaveConfigTime */
+   ufshcd_dme_set(hba, UIC_ARG_MIB(0xD0A0), 0x10);
+   /* sync length */
+   ufshcd_dme_set(hba, UIC_ARG_MIB(0x1556), 0x48);
+   }
+
/* update */
ufshcd_dme_set(hba, UIC_ARG_MIB(0x15A8), 0x1);
/* PA_TxSkip */
diff --git a/drivers/scsi/ufs/ufs_quirks.h b/drivers/scsi/ufs/ufs_quirks.h
index 71f73d1d1ad1..5d2dfdb41a6f 100644
--- a/drivers/scsi/ufs/ufs_quirks.h
+++ b/drivers/scsi/ufs/ufs_quirks.h
@@ -131,4 +131,10 @@ struct ufs_dev_fix {
  */
 #define UFS_DEVICE_QUIRK_HOST_PA_SAVECONFIGTIME(1 << 8)
 
+/*
+ * Some UFS devices require VS_DebugSaveConfigTime is 0x10,
+ * enabling this quirk ensure this.
+ */
+#define UFS_DEVICE_QUIRK_HOST_VS_DEBUGSAVECONFIGTIME   (1 << 9)
+
 #endif /* UFS_QUIRKS_H_ */
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 27db55b0ca7f..f1c57cd33b5b 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -231,6 +231,8 @@ static struct ufs_dev_fix ufs_fixups[] = {
UFS_FIX(UFS_VENDOR_SKHYNIX, UFS_ANY_MODEL, UFS_DEVICE_NO_VCCQ),
UFS_FIX(UFS_VENDOR_SKHYNIX, UFS_ANY_MODEL,
UFS_DEVICE_QUIRK_HOST_PA_SAVECONFIGTIME),
+   UFS_FIX(UFS_VENDOR_SKHYNIX, "hB8aL1" /*H28U62301AMR*/,
+   UFS_DEVICE_QUIRK_HOST_VS_DEBUGSAVECONFIGTIME),
 
END_FIX
 };

Re: [PATCH] drivers/scsi/fnic/fnic_trace.c: Use vzalloc

2018-11-13 Thread James Bottomley

On Tue, 2018-11-13 at 16:53 +0100, Johannes Thumshirn wrote:
> On 13/11/2018 16:44, John Garry wrote:
> > On 13/11/2018 15:08, Sabyasachi Gupta wrote:
[...]
> > > -fnic_trace_buf_p = (unsigned long)vmalloc((trace_max_pages *
> > > PAGE_SIZE));
> > > +fnic_trace_buf_p = (unsigned long)vzalloc((trace_max_pages *
> > > +PAGE_SIZE));
> > 
> > If you remove the extra brackets in vzalloc() argument then you may
> > not spill onto the next line.
> 
> And remove the unnecessary cast. vzalloc() (just like vmalloc())
> returns a void*, so no reason to cast it.

This is incorrect advice: there's no need to cast it to other *pointer*
types, but if you cast it to a non-pointer type (which this is doing)
the compiler will complain if there is no explicit cast.

James

[GIT PULL] SCSI fixes for 4.20-rc2

2018-11-13 Thread James Bottomley

This is mostly a set of minor and obvious fixes (three in one of the
new drivers).  The only substantial change is to move the ufs to the
blk-mq now that the merge window fixed the suspend/resume issues with
blk-mq.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Arnd Bergmann (4):
  scsi: myrs: only build on little-endian platforms
  scsi: myrs: avoid stack overflow warning
  scsi: lpfc: fix remoteport access
  scsi: myrb: fix sprintf buffer overflow warning

Bart Van Assche (1):
  scsi: target/core: Avoid that a kernel oops is triggered when COMPARE AND 
WRITE fails

Finn Thain (1):
  scsi: NCR5380: Return false instead of NULL

Martin K. Petersen (1):
  Revert "scsi: ufs: Disable blk-mq for now"

Masanari Iida (1):
  scsi: qla2xxx: Fix a typo in MODULE_PARM_DESC

Quinn Tran (1):
  scsi: qla2xxx: Initialize port speed to avoid setting lower speed

YueHaibing (1):
  scsi: hisi_sas: Remove set but not used variable 'dq_list'

And the diffstat:

 drivers/scsi/Kconfig   |  1 +
 drivers/scsi/NCR5380.c |  2 +-
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c |  2 --
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |  2 --
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c |  2 --
 drivers/scsi/lpfc/lpfc_debugfs.c   |  2 ++
 drivers/scsi/myrb.c|  3 ++-
 drivers/scsi/myrs.c| 13 -
 drivers/scsi/qla2xxx/qla_init.c|  1 +
 drivers/scsi/qla2xxx/qla_os.c  |  2 +-
 drivers/scsi/ufs/ufshcd.c  |  7 ---
 drivers/target/target_core_transport.c |  4 ++--
 12 files changed, 18 insertions(+), 23 deletions(-)

With full diff below

James

---

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index f07444d30b21..640cd1b31a18 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -578,6 +578,7 @@ config SCSI_MYRB
 config SCSI_MYRS
tristate "Mylex DAC960/DAC1100 PCI RAID Controller (SCSI Interface)"
depends on PCI
+   depends on !CPU_BIG_ENDIAN || COMPILE_TEST
select RAID_ATTRS
help
  This driver adds support for the Mylex DAC960, AcceleRAID, and
diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index 8429c855701f..01c23d27f290 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -1198,7 +1198,7 @@ static bool NCR5380_select(struct Scsi_Host *instance, 
struct scsi_cmnd *cmd)
 
 out:
if (!hostdata->selecting)
-   return NULL;
+   return false;
hostdata->selecting = NULL;
return ret;
 }
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
index f0e457e6884e..8df822a4a1bd 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
@@ -904,11 +904,9 @@ static void start_delivery_v1_hw(struct hisi_sas_dq *dq)
 {
struct hisi_hba *hisi_hba = dq->hisi_hba;
struct hisi_sas_slot *s, *s1, *s2 = NULL;
-   struct list_head *dq_list;
int dlvry_queue = dq->id;
int wp;
 
-   dq_list = &dq->list;
list_for_each_entry_safe(s, s1, &dq->list, delivery) {
if (!s->ready)
break;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index cc36b6473e98..77a85ead483e 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -1670,11 +1670,9 @@ static void start_delivery_v2_hw(struct hisi_sas_dq *dq)
 {
struct hisi_hba *hisi_hba = dq->hisi_hba;
struct hisi_sas_slot *s, *s1, *s2 = NULL;
-   struct list_head *dq_list;
int dlvry_queue = dq->id;
int wp;
 
-   dq_list = &dq->list;
list_for_each_entry_safe(s, s1, &dq->list, delivery) {
if (!s->ready)
break;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index bd4ce38b98d2..a369450a1fa7 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -886,11 +886,9 @@ static void start_delivery_v3_hw(struct hisi_sas_dq *dq)
 {
struct hisi_hba *hisi_hba = dq->hisi_hba;
struct hisi_sas_slot *s, *s1, *s2 = NULL;
-   struct list_head *dq_list;
int dlvry_queue = dq->id;
int wp;
 
-   dq_list = &dq->list;
list_for_each_entry_safe(s, s1, &dq->list, delivery) {
if (!s->ready)
break;
diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c
index 0c8005bb0f53..34d311a7dbef 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.c
+++ b/drivers/scsi/lpfc/lpfc_debugfs.c
@@ -698,6 +698,8 @@ lpfc_debugfs_nodelist_data(struct lpfc_vport *vport, char 
*buf, int size)
rport = lpfc_ndlp_get_nrport(ndlp);
if (rport)
nrport = rpor

[GIT PULL] final round of SCSI updates for the 4.19+ merge window

2018-11-02 Thread James Bottomley

This is a set of minor small (and safe changes) that didn't make the
initial pull request plus some bug fixes.  

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc

The short changelog is:

Bart Van Assche (7):
  scsi: qla2xxx: Remove two arguments from qlafx00_error_entry()
  scsi: qla2xxx: Make sure that qlafx00_ioctl_iosb_entry() initializes 'res'
  scsi: qla2xxx: Remove a set-but-not-used variable
  scsi: qla2xxx: Make qla2x00_sysfs_write_nvram() easier to analyze
  scsi: qla2xxx: Declare local functions 'static'
  scsi: qla2xxx: Improve several kernel-doc headers
  scsi: qla2xxx: Modify fall-through annotations

Nathan Chancellor (1):
  scsi: 3w-sas: 3w-9xxx: Use unsigned char for cdb

Roland Dreier (2):
  scsi: target: Don't request modules that aren't even built
  scsi: target: Set response length for REPORT TARGET PORT GROUPS

Sabyasachi Gupta (1):
  scsi: mvsas: Use dma_pool_zalloc

YueHaibing (1):
  scsi: mvsas: Remove set but not used variable 'id'

And the diffstat:

 drivers/scsi/3w-9xxx.c | 12 
 drivers/scsi/3w-sas.c  |  8 +---
 drivers/scsi/mvsas/mv_sas.c|  6 ++
 drivers/scsi/qla2xxx/qla_attr.c|  2 +-
 drivers/scsi/qla2xxx/qla_init.c|  7 ---
 drivers/scsi/qla2xxx/qla_iocb.c|  4 ++--
 drivers/scsi/qla2xxx/qla_isr.c |  6 +++---
 drivers/scsi/qla2xxx/qla_mbx.c |  6 +++---
 drivers/scsi/qla2xxx/qla_mr.c  | 21 +
 drivers/scsi/qla2xxx/qla_nx.c  |  2 +-
 drivers/scsi/qla2xxx/qla_nx2.c |  2 +-
 drivers/scsi/qla2xxx/qla_os.c  |  6 +++---
 drivers/scsi/qla2xxx/qla_sup.c |  2 +-
 drivers/scsi/qla2xxx/qla_target.c  |  8 
 drivers/target/target_core_alua.c  |  2 +-
 drivers/target/target_core_transport.c |  8 
 16 files changed, 52 insertions(+), 50 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c
index 05293babb031..2d655a97b959 100644
--- a/drivers/scsi/3w-9xxx.c
+++ b/drivers/scsi/3w-9xxx.c
@@ -143,7 +143,9 @@ static int twa_poll_status_gone(TW_Device_Extension 
*tw_dev, u32 flag, int secon
 static int twa_post_command_packet(TW_Device_Extension *tw_dev, int 
request_id, char internal);
 static int twa_reset_device_extension(TW_Device_Extension *tw_dev);
 static int twa_reset_sequence(TW_Device_Extension *tw_dev, int soft_reset);
-static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int 
request_id, char *cdb, int use_sg, TW_SG_Entry *sglistarg);
+static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id,
+  unsigned char *cdb, int use_sg,
+  TW_SG_Entry *sglistarg);
 static void twa_scsiop_execute_scsi_complete(TW_Device_Extension *tw_dev, int 
request_id);
 static char *twa_string_lookup(twa_message_type *table, unsigned int aen_code);
 
@@ -278,7 +280,7 @@ static int twa_aen_complete(TW_Device_Extension *tw_dev, 
int request_id)
 static int twa_aen_drain_queue(TW_Device_Extension *tw_dev, int no_check_reset)
 {
int request_id = 0;
-   char cdb[TW_MAX_CDB_LEN];
+   unsigned char cdb[TW_MAX_CDB_LEN];
TW_SG_Entry sglist[1];
int finished = 0, count = 0;
TW_Command_Full *full_command_packet;
@@ -423,7 +425,7 @@ static void twa_aen_queue_event(TW_Device_Extension 
*tw_dev, TW_Command_Apache_H
 /* This function will read the aen queue from the isr */
 static int twa_aen_read_queue(TW_Device_Extension *tw_dev, int request_id)
 {
-   char cdb[TW_MAX_CDB_LEN];
+   unsigned char cdb[TW_MAX_CDB_LEN];
TW_SG_Entry sglist[1];
TW_Command_Full *full_command_packet;
int retval = 1;
@@ -1798,7 +1800,9 @@ static int twa_scsi_queue_lck(struct scsi_cmnd *SCpnt, 
void (*done)(struct scsi_
 static DEF_SCSI_QCMD(twa_scsi_queue)
 
 /* This function hands scsi cdb's to the firmware */
-static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int 
request_id, char *cdb, int use_sg, TW_SG_Entry *sglistarg)
+static int twa_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id,
+  unsigned char *cdb, int use_sg,
+  TW_SG_Entry *sglistarg)
 {
TW_Command_Full *full_command_packet;
TW_Command_Apache *command_packet;
diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c
index 266bdac75304..480cf82700e9 100644
--- a/drivers/scsi/3w-sas.c
+++ b/drivers/scsi/3w-sas.c
@@ -287,7 +287,9 @@ static int twl_post_command_packet(TW_Device_Extension 
*tw_dev, int request_id)
 } /* End twl_post_command_packet() */
 
 /* This function hands scsi cdb's to the firmware */
-static int twl_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int 
request_id, char *cdb, int use_sg, TW_SG_Entry_ISO *sglistarg)
+static int twl_scsiop_execute_scsi(T

Re: [PATCH] scsi: aic7xxx: Fix unintended sign extension issue

2018-10-25 Thread James Bottomley

On Thu, 2018-10-25 at 16:13 +0100, Colin King wrote:
> From: Colin Ian King 
> 
> In the expression "ahc_inb(ahc, port+3) << 24", the initial value is
> a u8, but is promoted to a signed int, then sign-extended to
> uint64_t.

Why is this, that's highly non intuitive?  The compiler is supposed to
promote to the biggest type, which is uint64_t and then do the
calculation

James

[GIT PULL] first round of SCSI updates for the 4.19+ merge window

2018-10-24 Thread James Bottomley

27;t allow negative thresholds
  scsi: lpfc: remove an unnecessary NULL check

Darren Trapp (1):
  scsi: qla2xxx: Allow FC-NVMe underrun to be handled by transport

David Disseldorp (5):
  scsi: target: stash sess_err_stats on Data-Out timeout
  scsi: target: split out helper for cxn timeout error stashing
  scsi: target: log NOP ping timeouts as errors
  scsi: target: log Data-Out timeouts as errors
  scsi: target: use ISCSI_IQN_LEN in iscsi_target_stat

Deepak Ukey (4):
  scsi: pm80xx: Update driver version to 0.1.39
  scsi: pm80xx: Fixed system hang issue during kexec boot
  scsi: pm80xx: Corrected dma_unmap_sg() parameter
  scsi: pm80xx: Fix for phy enable/disable functionality

Evan Green (1):
  scsi: ufs: Schedule clk gating work on correct queue

Faisal Mehmood (1):
  scsi: 53c700: Fix spelling of 'NEGOTIATION'

Finn Thain (15):
  scsi: esp_scsi: Optimize PIO loops
  scsi: esp_scsi: De-duplicate PIO routines
  scsi: esp_scsi: Eliminate ESP_FLAG_DOING_SLOWCMD
  scsi: esp_scsi: Grant disconnect privilege for untagged commands
  scsi: esp_scsi: Track residual for PIO transfers
  scsi: zorro_esp: Limit DMA transfers to 65535 bytes
  scsi: NCR5380: Check for bus reset
  scsi: NCR5380: Handle BUS FREE during reselection
  scsi: NCR5380: Don't call dsprintk() following reselection interrupt
  scsi: NCR5380: Don't clear busy flag when abort fails
  scsi: NCR5380: Check for invalid reselection target
  scsi: NCR5380: Use DRIVER_SENSE to indicate valid sense data
  scsi: NCR5380: Withhold disconnect privilege for REQUEST SENSE
  scsi: NCR5380: Have NCR5380_select() return a bool
  scsi: NCR5380: Reduce goto statements in NCR5380_select()

Geert Uytterhoeven (1):
  scsi: arcmsr: Spelling s/rebulid/rebuild/

George Kennedy (1):
  scsi: sym53c8xx: fix NULL pointer dereference panic in sym_int_sir()

Giridhar Malavali (2):
  scsi: qla2xxx: Move log messages before issuing command to firmware
  scsi: qla2xxx: Fix for double free of SRB structure

Greg Edwards (1):
  scsi: target: iblock: split T10 PI SGL across command bios

Gustavo A. R. Silva (3):
  scsi: hisi_sas: Fix NULL pointer dereference
  scsi: aic94xx: mark expected switch fall-throughs
  scsi: ips: fix missing break in switch

Hannes Reinecke (5):
  scsi: myrs: Add Mylex RAID controller (SCSI interface)
  scsi: myrb: Add Mylex RAID controller (block interface)
  scsi: libfc: retry PRLI if we cannot analyse the payload
  scsi: core: Allow state transitions from OFFLINE to BLOCKED
  scsi: NCR5380: Clear all unissued commands on host reset

Himanshu Madhani (8):
  scsi: qla2xxx: Return switch command on a timeout
  scsi: qla2xxx: Fix driver hang when FC-NVMe LUNs are configured
  scsi: qla2xxx: Update driver version to 10.00.00.11-k
  scsi: qla2xxx: Update driver version to 10.00.00.10-k
  scsi: qla2xxx: Remove stale ADISC_DONE event
  scsi: qla2xxx: Remove ASYNC GIDPN switch command
  scsi: qla2xxx: Update driver to version 10.00.00.09-k
  scsi: qla2xxx: Fix incorrect port speed being set for FC adapters

Igor Stoppa (1):
  scsi: core: remove unnecessary unlikely()

James Bottomley (1):
  scsi: myrs: fix build failure on 32 bit

James Smart (11):
  scsi: lpfc: update driver version to 12.0.0.7
  scsi: lpfc: add support to retrieve firmware logs
  scsi: lpfc: reduce locking when updating statistics
  scsi: lpfc: Fix errors in log messages.
  scsi: lpfc: Correct invalid EQ doorbell write on if_type=6
  scsi: lpfc: Correct irq handling via locks when taking adapter offline
  scsi: lpfc: Correct soft lockup when running mds diagnostics
  scsi: lpfc: Correct race with abort on completion path
  scsi: lpfc: Raise nvme defaults to support a larger io and more 
connectivity
  scsi: lpfc: raise sg count for nvme to use available sg resources
  scsi: lpfc: Fix GFT_ID and PRLI logic for RSCN

Jason Yan (5):
  scsi: libsas: fix a race condition when smp task timeout
  scsi: libsas: check the ata device status by ata_dev_enabled()
  scsi: libsas: always unregister the old device if going to discover new
  scsi: libsas: make the lldd_port_deformed method optional
  scsi: libsas: delete dead code in scsi_transport_sas.c

Jens Axboe (3):
  scsi: fnic: replace gross legacy tag hack with blk-mq hack
  scsi: sg: remove bad blk_end_request_all() call
  scsi: osd: initiator should use mq variant of request ending

Joe Perches (7):
  scsi: mpt3sas: Remove unused macro MPT3SAS_FMT
  scsi: mpt3sas: Convert logging uses with MPT3SAS_FMT without logging 
levels
  scsi: mpt3sas: Remove KERN_WARNING from panic uses
  scsi: mpt3sas: Convert logging uses with MPT3SAS_FMT and reply_q_name to 
%s:
  scsi: mpt3sas: Convert mlsleading uses of pr_ with MPT3SAS_FMT
  sc

Re: [PATCH] scsi: lpfc: Uninitialized variable in lpfc_debugfs_nodelist_data()

2018-10-22 Thread James Bottomley

On Mon, 2018-10-22 at 09:50 +0300, Dan Carpenter wrote:
> There was a merge problem and we accidentally removed the "nrport"
> initialization.
> 
> Fixes: 77c5bf5647b5 ("Merge branch 'misc' into for-next")
> Signed-off-by: Dan Carpenter 
> ---
>  drivers/scsi/lpfc/lpfc_debugfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c
> b/drivers/scsi/lpfc/lpfc_debugfs.c
> index c6912394b56d..0c8005bb0f53 100644
> --- a/drivers/scsi/lpfc/lpfc_debugfs.c
> +++ b/drivers/scsi/lpfc/lpfc_debugfs.c
> @@ -550,7 +550,7 @@ lpfc_debugfs_nodelist_data(struct lpfc_vport
> *vport, char *buf, int size)
>   struct lpfc_nodelist *ndlp;
>   unsigned char *statep;
>   struct nvme_fc_local_port *localport;
> - struct nvme_fc_remote_port *nrport;
> + struct nvme_fc_remote_port *nrport = NULL;

Really? that's not the way I did the merge in my for-next:

https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git/commit/?h=for-next&id=1f86cc958e2a60cef07546b15bdce90713a05e80

77c5bf5647b5 looks to be the orphaned residue of a failed merge ... how
did you find it because it's not in any of the public branches?

James

[PATCH] scsi: myrs: fix build failure on 32 bit

2018-10-18 Thread James Bottomley

For 32 bit versions we have to be careful about divisions of 64 bit
quantities so use do_div() instead of a direct division.  This fixes a
warning about _uldivmod being undefined in certain configurations

Fixes: 77266186397c ("scsi: myrs: Add Mylex RAID controller")
Reported-by: kbuild test robot 
Signed-off-by: James Bottomley 
---
 drivers/scsi/myrs.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c
index b02ee0b0dd55..a9f9c77e889f 100644
--- a/drivers/scsi/myrs.c
+++ b/drivers/scsi/myrs.c
@@ -1978,7 +1978,8 @@ myrs_get_resync(struct device *dev)
struct scsi_device *sdev = to_scsi_device(dev);
struct myrs_hba *cs = shost_priv(sdev->host);
struct myrs_ldev_info *ldev_info = sdev->hostdata;
-   u8 percent_complete = 0, status;
+   u64 percent_complete = 0;
+   u8 status;
 
if (sdev->channel < cs->ctlr_info->physchan_present || !ldev_info)
return;
@@ -1986,8 +1987,8 @@ myrs_get_resync(struct device *dev)
unsigned short ldev_num = ldev_info->ldev_num;
 
status = myrs_get_ldev_info(cs, ldev_num, ldev_info);
-   percent_complete = ldev_info->rbld_lba * 100 /
-   ldev_info->cfg_devsize;
+   percent_complete = ldev_info->rbld_lba * 100;
+   do_div(percent_complete, ldev_info->cfg_devsize);
}
raid_set_resync(myrs_raid_template, dev, percent_complete);
 }
-- 
2.16.4

Re: [scsi:misc 194/233] ERROR: "__aeabi_uldivmod" [drivers/scsi/myrs.ko] undefined!

2018-10-18 Thread James Bottomley

On Thu, 2018-10-18 at 10:28 -0700, James Bottomley wrote:
> On Fri, 2018-10-19 at 01:18 +0800, kbuild test robot wrote:
> > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.g
> > it
> >  misc
> > head:   4d5b4ac1eae471bcd0fa381ab4099cc33e94e15d
> > commit: 77266186397c6c782a3f670d32808a9671806ec5 [194/233] scsi:
> > myrs: Add Mylex RAID controller (SCSI interface)
> > config: arm-allmodconfig (attached as .config)
> > compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
> > reproduce:
> > wget https://raw.githubusercontent.com/intel/lkp-tests/mast
> > er
> > /sbin/make.cross -O ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > git checkout 77266186397c6c782a3f670d32808a9671806ec5
> > # save the attached .config to linux build tree
> > GCC_VERSION=7.2.0 make.cross ARCH=arm 
> > 
> > All errors (new ones prefixed by >>):
> > 
> > > > ERROR: "__aeabi_uldivmod" [drivers/scsi/myrs.ko] undefined!
> 
> I think this is the fix, can someone with an arm build check?

This fix turned out to be bogus, but now I've dusted off my arm32 build
environment (had to fix a bug with arm32 emulation in qemu 2.11.2 would
you believe) and actually tried compiling it, the next patch is the fix
(build tested but not boot tested).

James

Re: [scsi:misc 194/233] ERROR: "__aeabi_uldivmod" [drivers/scsi/myrs.ko] undefined!

2018-10-18 Thread James Bottomley

On Fri, 2018-10-19 at 01:18 +0800, kbuild test robot wrote:
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git
>  misc
> head:   4d5b4ac1eae471bcd0fa381ab4099cc33e94e15d
> commit: 77266186397c6c782a3f670d32808a9671806ec5 [194/233] scsi:
> myrs: Add Mylex RAID controller (SCSI interface)
> config: arm-allmodconfig (attached as .config)
> compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
> wget https://raw.githubusercontent.com/intel/lkp-tests/master
> /sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> git checkout 77266186397c6c782a3f670d32808a9671806ec5
> # save the attached .config to linux build tree
> GCC_VERSION=7.2.0 make.cross ARCH=arm 
> 
> All errors (new ones prefixed by >>):
> 
> > > ERROR: "__aeabi_uldivmod" [drivers/scsi/myrs.ko] undefined!

I think this is the fix, can someone with an arm build check?

Thanks,

James

---

diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c
index b02ee0b0dd55..6bf3cb416505 100644
--- a/drivers/scsi/myrs.c
+++ b/drivers/scsi/myrs.c
@@ -1986,8 +1986,8 @@ myrs_get_resync(struct device *dev)
unsigned short ldev_num = ldev_info->ldev_num;
 
status = myrs_get_ldev_info(cs, ldev_num, ldev_info);
-   percent_complete = ldev_info->rbld_lba * 100 /
-   ldev_info->cfg_devsize;
+   percent_complete = do_div(ldev_info->rbld_lba * 100,
+ ldev_info->cfg_devsize);
}
raid_set_resync(myrs_raid_template, dev, percent_complete);
 }

Re: [possible bug] critical target error (out of range) when sending UNMAP on lsi2308 (mpt3sas) for the last sector of the drive

2018-10-10 Thread James Bottomley

On Wed, 2018-10-10 at 16:55 +0200, Michal Soltys wrote:
> Hi,
> 
> I have a server with old server with lsi2308 ("it" mode) and sas2x36
> expander in front. I've been testing how it handles ssd drives (among
> those if/how it copes with unmap). And it looks like
> stomped on a bug (it's 100% reproducible in 4.14.74 and 4.18.12).
> 
> The hardware in question:
> 
> mpt2sas_cm0: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x05),
> BiosVersion(07.39.02.00)
> scsi host6: Fusion MPT SAS Host
> scsi 6:0:19:0: Enclosure LSI  SAS2X36  0e12 PQ: 0
> ANSI: 5
> 
> Tested with old Samsung 840 pro drive.
> 
> Initially I was just doing a simple blkdiscard (which on the drive in
> question works just fine in plain sata controller), which greeted me
> with unexpected:
> 
> 16:00 # blkdiscard /dev/sdt
> blkdiscard: /dev/sdt: BLKDISCARD ioctl failed: Remote I/O error
> 
> Log in dmesg:
> 
> [  391.870979] sd 7:0:20:0: [sdt] 250069680 512-byte logical blocks:
> (128 GB/119 GiB)
> [  391.872702] sd 7:0:20:0: [sdt] Write Protect is off
> [  391.872705] sd 7:0:20:0: [sdt] Mode Sense: 7f 00 10 08
> [  391.872962] sd 7:0:20:0: [sdt] Write cache: enabled, read cache:
> enabled, supports DPO and FUA
> ...
> [  529.907273] sd 7:0:20:0: [sdt] tag#3625 FAILED Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [  529.907277] sd 7:0:20:0: [sdt] tag#3625 Sense Key : Illegal
> Request [current]
> [  529.907279] sd 7:0:20:0: [sdt] tag#3625 Add. Sense: Logical block
> address out of range
> [  529.907281] sd 7:0:20:0: [sdt] tag#3625 CDB: Unmap/Read sub-
> channel 42 00 00 00 00 00 00 00 18 00
> [  529.907283] print_req_error: critical target error, dev sdt,
> sector 247463877
> 
> Then I started testing it more manually and it turned out that if I
> try to discard any range that includes the last sector, the command
> fail in this fashion. For example:
> 
> 2 sectors from the end, 1 sector (works):
> blkdiscard -o $(( 250069678*512 )) -l 512 /dev/sdt
> 
> 1 sectors from the end, 1 sector (fails):
> blkdiscard -o $(( 250069679*512 )) -l 512 /dev/sdt
> blkdiscard: /dev/sdt: BLKDISCARD ioctl failed: Remote I/O error
> 
> [  702.139612] sd 7:0:20:0: [sdt] tag#2511 FAILED Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [  702.139618] sd 7:0:20:0: [sdt] tag#2511 Sense Key : Illegal
> Request [current]
> [  702.139622] sd 7:0:20:0: [sdt] tag#2511 Add. Sense: Logical block
> address out of range
> [  702.139627] sd 7:0:20:0: [sdt] tag#2511 CDB: Unmap/Read sub-
> channel 42 00 00 00 00 00 00 00 18 00
> [  702.139631] print_req_error: critical target error, dev sdt,
> sector 250069679
> 
> Similar results with sg_unmap:
> 
> 16:36 # sg_unmap -f -l 250069678 -n 1 /dev/sdt
> 16:36 # sg_unmap -f -l 250069679 -n 1 /dev/sdt
> sg_unmap failed: LBA out of range
> 
> The last sector on its own is of course normally readable/writable
> otherwise.
> 
> This looks like some +/-1 division/reminder issue somewhere.

This looks like a firmware error either on the drive or the mpt3sas.

I'm afraid the only ways to fix are either:

   1. Don't use discard at all and use the entire disk
   2. Use discard but shorten the disk in the partition map by one sector
  so the entire visible disk can be discarded

I'm afraid if you can't change the partition table, you can only use
option one (turn off discard).

James

[GIT PULL] SCSI fixes for 4.19-rc5

2018-10-06 Thread James Bottomley

Small fix for an unititialized mutex in the qedi driver.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Nilesh Javali (1):
  scsi: qedi: Initialize the stats mutex lock

and the diffstat:

 drivers/scsi/qedi/qedi_main.c | 1 +
 1 file changed, 1 insertion(+)

With full diff below.

James

---

diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index cc8e64dc65ad..e5bd035ebad0 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -2472,6 +2472,7 @@ static int __qedi_probe(struct pci_dev *pdev, int mode)
/* start qedi context */
spin_lock_init(&qedi->hba_lock);
spin_lock_init(&qedi->task_idx_lock);
+   mutex_init(&qedi->stats_lock);
}
qedi_ops->ll2->register_cb_ops(qedi->cdev, &qedi_ll2_cb_ops, qedi);
qedi_ops->ll2->start(qedi->cdev, ¶ms);

[GIT PULL] SCSI fixes for 4.19-rc5

2018-09-25 Thread James Bottomley

Nine obvious bug fixes mostly in individual drivers.  The target fix is
of particular importance because it's CVE related.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Adrian Hunter (1):
  scsi: ufs: Disable blk-mq for now

James Smart (1):
  scsi: lpfc: Synchronize access to remoteport via rport

Johannes Thumshirn (1):
  scsi: sd: don't crash the host on invalid commands

Laura Abbott (2):
  scsi: ibmvscsis: Ensure partition name is properly NUL terminated
  scsi: ibmvscsis: Fix a stringop-overflow warning

Vincent Pelletier (2):
  scsi: target: iscsi: Use bin2hex instead of a re-implementation
  scsi: target: iscsi: Use hex2bin instead of a re-implementation

Wen Xiong (1):
  scsi: ipr: System hung while dlpar adding primary ipr adapter back

Xuewei Zhang (1):
  scsi: sd: Contribute to randomness when running rotational device

And the diffstat:

 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c |   5 +-
 drivers/scsi/ipr.c   | 106 ++-
 drivers/scsi/ipr.h   |   1 +
 drivers/scsi/lpfc/lpfc_attr.c|  15 +++--
 drivers/scsi/lpfc/lpfc_debugfs.c |  10 +--
 drivers/scsi/lpfc/lpfc_nvme.c|  11 +++-
 drivers/scsi/sd.c|   6 +-
 drivers/scsi/ufs/ufshcd.c|   7 ++
 drivers/target/iscsi/iscsi_target_auth.c |  45 +
 9 files changed, 116 insertions(+), 90 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c 
b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
index fac377320158..f42a619198c4 100644
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -3474,11 +3474,10 @@ static int ibmvscsis_probe(struct vio_dev *vdev,
vscsi->dds.window[LOCAL].liobn,
vscsi->dds.window[REMOTE].liobn);
 
-   strcpy(vscsi->eye, "VSCSI ");
-   strncat(vscsi->eye, vdev->name, MAX_EYE);
+   snprintf(vscsi->eye, sizeof(vscsi->eye), "VSCSI %s", vdev->name);
 
vscsi->dds.unit_id = vdev->unit_address;
-   strncpy(vscsi->dds.partition_name, partition_name,
+   strscpy(vscsi->dds.partition_name, partition_name,
sizeof(vscsi->dds.partition_name));
vscsi->dds.partition_num = partition_number;
 
diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index f2ec80b0ffc0..271990bc065b 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -3335,6 +3335,65 @@ static void ipr_release_dump(struct kref *kref)
LEAVE;
 }
 
+static void ipr_add_remove_thread(struct work_struct *work)
+{
+   unsigned long lock_flags;
+   struct ipr_resource_entry *res;
+   struct scsi_device *sdev;
+   struct ipr_ioa_cfg *ioa_cfg =
+   container_of(work, struct ipr_ioa_cfg, scsi_add_work_q);
+   u8 bus, target, lun;
+   int did_work;
+
+   ENTER;
+   spin_lock_irqsave(ioa_cfg->host->host_lock, lock_flags);
+
+restart:
+   do {
+   did_work = 0;
+   if (!ioa_cfg->hrrq[IPR_INIT_HRRQ].allow_cmds) {
+   spin_unlock_irqrestore(ioa_cfg->host->host_lock, 
lock_flags);
+   return;
+   }
+
+   list_for_each_entry(res, &ioa_cfg->used_res_q, queue) {
+   if (res->del_from_ml && res->sdev) {
+   did_work = 1;
+   sdev = res->sdev;
+   if (!scsi_device_get(sdev)) {
+   if (!res->add_to_ml)
+   list_move_tail(&res->queue, 
&ioa_cfg->free_res_q);
+   else
+   res->del_from_ml = 0;
+   
spin_unlock_irqrestore(ioa_cfg->host->host_lock, lock_flags);
+   scsi_remove_device(sdev);
+   scsi_device_put(sdev);
+   
spin_lock_irqsave(ioa_cfg->host->host_lock, lock_flags);
+   }
+   break;
+   }
+   }
+   } while (did_work);
+
+   list_for_each_entry(res, &ioa_cfg->used_res_q, queue) {
+   if (res->add_to_ml) {
+   bus = res->bus;
+   target = res->target;
+   lun = res->lun;
+   res->add_to_ml = 0;
+   spin_unlock_irqrestore(ioa_cfg->host->host_lock, 
lock_flags);
+   scsi_add_device(ioa_cfg->host, bus, target, lun);
+   spin_lock_irqsave(ioa_cfg->host->host_lock, lock_flags);
+   goto restart;
+   }
+   }
+
+   ioa_cfg->scan_done = 1;
+   spin_unl

[GIT PULL] SCSI fixes for 4.19-rc3

2018-09-19 Thread James Bottomley

A couple of small but important fixes, one affecting big endian and the
other fixing a BUG_ON in scatterlist processing.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Dan Carpenter (1):
  scsi: qla2xxx: Fix an endian bug in fcpcmd_is_corrupted()

Laura Abbott (1):
  scsi: iscsi: target: Don't use stack buffer for scatterlist

And the diffstat:

 drivers/scsi/qla2xxx/qla_target.h   |  4 ++--
 drivers/target/iscsi/iscsi_target.c | 22 ++
 2 files changed, 16 insertions(+), 10 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/qla2xxx/qla_target.h 
b/drivers/scsi/qla2xxx/qla_target.h
index fecf96f0225c..199d3ba1916d 100644
--- a/drivers/scsi/qla2xxx/qla_target.h
+++ b/drivers/scsi/qla2xxx/qla_target.h
@@ -374,8 +374,8 @@ struct atio_from_isp {
 static inline int fcpcmd_is_corrupted(struct atio *atio)
 {
if (atio->entry_type == ATIO_TYPE7 &&
-   (le16_to_cpu(atio->attr_n_length & FCP_CMD_LENGTH_MASK) <
-   FCP_CMD_LENGTH_MIN))
+   ((le16_to_cpu(atio->attr_n_length) & FCP_CMD_LENGTH_MASK) <
+FCP_CMD_LENGTH_MIN))
return 1;
else
return 0;
diff --git a/drivers/target/iscsi/iscsi_target.c 
b/drivers/target/iscsi/iscsi_target.c
index 9cdfccbdd06f..cc756a123fd8 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -1416,7 +1416,8 @@ static void iscsit_do_crypto_hash_buf(struct 
ahash_request *hash,
 
sg_init_table(sg, ARRAY_SIZE(sg));
sg_set_buf(sg, buf, payload_length);
-   sg_set_buf(sg + 1, pad_bytes, padding);
+   if (padding)
+   sg_set_buf(sg + 1, pad_bytes, padding);
 
ahash_request_set_crypt(hash, sg, data_crc, payload_length + padding);
 
@@ -3910,10 +3911,14 @@ static bool iscsi_target_check_conn_state(struct 
iscsi_conn *conn)
 static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
 {
int ret;
-   u8 buffer[ISCSI_HDR_LEN], opcode;
+   u8 *buffer, opcode;
u32 checksum = 0, digest = 0;
struct kvec iov;
 
+   buffer = kcalloc(ISCSI_HDR_LEN, sizeof(*buffer), GFP_KERNEL);
+   if (!buffer)
+   return;
+
while (!kthread_should_stop()) {
/*
 * Ensure that both TX and RX per connection kthreads
@@ -3921,7 +3926,6 @@ static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
 */
iscsit_thread_check_cpumask(conn, current, 0);
 
-   memset(buffer, 0, ISCSI_HDR_LEN);
memset(&iov, 0, sizeof(struct kvec));
 
iov.iov_base= buffer;
@@ -3930,7 +3934,7 @@ static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
ret = rx_data(conn, &iov, 1, ISCSI_HDR_LEN);
if (ret != ISCSI_HDR_LEN) {
iscsit_rx_thread_wait_for_tcp(conn);
-   return;
+   break;
}
 
if (conn->conn_ops->HeaderDigest) {
@@ -3940,7 +3944,7 @@ static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
ret = rx_data(conn, &iov, 1, ISCSI_CRC_LEN);
if (ret != ISCSI_CRC_LEN) {
iscsit_rx_thread_wait_for_tcp(conn);
-   return;
+   break;
}
 
iscsit_do_crypto_hash_buf(conn->conn_rx_hash, buffer,
@@ -3964,7 +3968,7 @@ static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
}
 
if (conn->conn_state == TARG_CONN_STATE_IN_LOGOUT)
-   return;
+   break;
 
opcode = buffer[0] & ISCSI_OPCODE_MASK;
 
@@ -3975,13 +3979,15 @@ static void iscsit_get_rx_pdu(struct iscsi_conn *conn)
" while in Discovery Session, rejecting.\n", opcode);
iscsit_add_reject(conn, ISCSI_REASON_PROTOCOL_ERROR,
  buffer);
-   return;
+   break;
}
 
ret = iscsi_target_rx_opcode(conn, buffer);
if (ret < 0)
-   return;
+   break;
}
+
+   kfree(buffer);
 }
 
 int iscsi_target_rx_thread(void *arg)

Re: [PATCH v4] libata: Add hwmon support for SMART temperature sensors

2018-09-14 Thread James Bottomley

On Thu, 2018-09-13 at 16:22 +0200, Linus Walleij wrote:
[...]
> + /* Send ATA command to read SMART values */
> + memset(scsi_cmd, 0, sizeof(scsi_cmd));
> + scsi_cmd[0] = ATA_16;
> + scsi_cmd[1] = (4 << 1); /* PIO Data-in */
> + /*
> +  * No off.line or cc, read from dev, block count in sector
> count
> +  * field.
> +  */
> + scsi_cmd[2] = 0x0e;
> + scsi_cmd[4] = ATA_SMART_READ_VALUES;
> + scsi_cmd[6] = 1; /* Read 1 sector */
> + scsi_cmd[8] = 0; /* args[1]; */
> + scsi_cmd[10] = ATA_SMART_LBAM_PASS;
> + scsi_cmd[12] = ATA_SMART_LBAH_PASS;
> + scsi_cmd[14] = ATA_CMD_SMART;
> +
> + cmd_result = scsi_execute(ata->sdev, scsi_cmd,
> DMA_FROM_DEVICE,
> +   buf, ATA_SECT_SIZE,
> +   NULL, &sshdr, 10 * HZ, 5, 0, 0,
> NULL);

Given that you're using scsi_execute and this would work on most SAS
drives as well as SATA ones, why not use the SAS mode pages and we'll
translate it to SATA in the existing libata-scsi SAT?

That way this can work on all SCSI devices that support SMART not just
the SATA subset.

If you can't figure out how to do this initially, then simply
separating smart from libata is a good first start so we can build on
it in SCSI as well.

James

[GIT PULL] SCSI fixes for 4.19-rc3

2018-09-12 Thread James Bottomley

Three fixes, all in drivers (qedi and iscsi target) so no wider impact
even if the code changes are a bit extensive.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Mike Christie (1):
  scsi: iscsi: target: Fix conn_ops double free

Nilesh Javali (1):
  scsi: qedi: Add the CRC size within iSCSI NVM image

Vincent Pelletier (1):
  scsi: iscsi: target: Set conn->sess to NULL when 
iscsi_login_set_conn_values fails

And the diffstat:

 drivers/scsi/qedi/qedi.h  |   7 +-
 drivers/scsi/qedi/qedi_main.c |  28 +++---
 drivers/target/iscsi/iscsi_target.c   |   9 +-
 drivers/target/iscsi/iscsi_target_login.c | 149 --
 drivers/target/iscsi/iscsi_target_login.h |   2 +-
 5 files changed, 101 insertions(+), 94 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/qedi/qedi.h b/drivers/scsi/qedi/qedi.h
index fc3babc15fa3..a6f96b35e971 100644
--- a/drivers/scsi/qedi/qedi.h
+++ b/drivers/scsi/qedi/qedi.h
@@ -77,6 +77,11 @@ enum qedi_nvm_tgts {
QEDI_NVM_TGT_SEC,
 };
 
+struct qedi_nvm_iscsi_image {
+   struct nvm_iscsi_cfg iscsi_cfg;
+   u32 crc;
+};
+
 struct qedi_uio_ctrl {
/* meta data */
u32 uio_hsi_version;
@@ -294,7 +299,7 @@ struct qedi_ctx {
void *bdq_pbl_list;
dma_addr_t bdq_pbl_list_dma;
u8 bdq_pbl_list_num_entries;
-   struct nvm_iscsi_cfg *iscsi_cfg;
+   struct qedi_nvm_iscsi_image *iscsi_image;
dma_addr_t nvm_buf_dma;
void __iomem *bdq_primary_prod;
void __iomem *bdq_secondary_prod;
diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index aa96bccb5a96..cc8e64dc65ad 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -1346,23 +1346,26 @@ static int qedi_setup_int(struct qedi_ctx *qedi)
 
 static void qedi_free_nvm_iscsi_cfg(struct qedi_ctx *qedi)
 {
-   if (qedi->iscsi_cfg)
+   if (qedi->iscsi_image)
dma_free_coherent(&qedi->pdev->dev,
- sizeof(struct nvm_iscsi_cfg),
- qedi->iscsi_cfg, qedi->nvm_buf_dma);
+ sizeof(struct qedi_nvm_iscsi_image),
+ qedi->iscsi_image, qedi->nvm_buf_dma);
 }
 
 static int qedi_alloc_nvm_iscsi_cfg(struct qedi_ctx *qedi)
 {
-   qedi->iscsi_cfg = dma_zalloc_coherent(&qedi->pdev->dev,
-sizeof(struct nvm_iscsi_cfg),
-&qedi->nvm_buf_dma, GFP_KERNEL);
-   if (!qedi->iscsi_cfg) {
+   struct qedi_nvm_iscsi_image nvm_image;
+
+   qedi->iscsi_image = dma_zalloc_coherent(&qedi->pdev->dev,
+   sizeof(nvm_image),
+   &qedi->nvm_buf_dma,
+   GFP_KERNEL);
+   if (!qedi->iscsi_image) {
QEDI_ERR(&qedi->dbg_ctx, "Could not allocate NVM BUF.\n");
return -ENOMEM;
}
QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
- "NVM BUF addr=0x%p dma=0x%llx.\n", qedi->iscsi_cfg,
+ "NVM BUF addr=0x%p dma=0x%llx.\n", qedi->iscsi_image,
  qedi->nvm_buf_dma);
 
return 0;
@@ -1905,7 +1908,7 @@ qedi_get_nvram_block(struct qedi_ctx *qedi)
struct nvm_iscsi_block *block;
 
pf = qedi->dev_info.common.abs_pf_id;
-   block = &qedi->iscsi_cfg->block[0];
+   block = &qedi->iscsi_image->iscsi_cfg.block[0];
for (i = 0; i < NUM_OF_ISCSI_PF_SUPPORTED; i++, block++) {
flags = ((block->id) & NVM_ISCSI_CFG_BLK_CTRL_FLAG_MASK) >>
NVM_ISCSI_CFG_BLK_CTRL_FLAG_OFFSET;
@@ -2194,15 +2197,14 @@ static void qedi_boot_release(void *data)
 static int qedi_get_boot_info(struct qedi_ctx *qedi)
 {
int ret = 1;
-   u16 len;
-
-   len = sizeof(struct nvm_iscsi_cfg);
+   struct qedi_nvm_iscsi_image nvm_image;
 
QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
  "Get NVM iSCSI CFG image\n");
ret = qedi_ops->common->nvm_get_image(qedi->cdev,
  QED_NVM_IMAGE_ISCSI_CFG,
- (char *)qedi->iscsi_cfg, len);
+ (char *)qedi->iscsi_image,
+ sizeof(nvm_image));
if (ret)
QEDI_ERR(&qedi->dbg_ctx,
 "Could not get NVM image. ret = %d\n", ret);
diff --git a/drivers/target/iscsi/iscsi_target.c 
b/drivers/target/iscsi/iscsi_target.c
index 94bad43c41ff..9cdfccbdd06f 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4208,22 +4208,15 @@ int iscsit_close_connection(
crypto_free_ahash(tfm);

Re: [PATCH] mpt3sas: Fix for regression caused due to cf6bf9710c patch

2018-07-03 Thread James Bottomley

On Tue, 2018-07-03 at 22:49 +0900, David Miller wrote:
> From: Sreekanth Reddy 
> Date: Tue, 3 Jul 2018 17:48:49 +0530
> 
> > Any suggestion/update over my previous mail. I am using 4.13
> kernel.
> 
> I think the issue is that if you are reading a 32-bit word and then
> interpreting it as a struct full of individual bytes, you have to
> order the bytes in the structure appropriately for the cpu
> endianness.

This is undoubtedly it.  The point being if you read from a structure
using readX, you have to read every element at its correct length for
the endian swaps to work.  You can't do a readq on 2 32 bit words and
expect the endianness to be correct (you'll find they come out in the
wrong order).

I think you're using a shared (device and cpu) memory mapped structured
data with a doorbell register, which is pretty identical to how the
qla1280 does it.  We went through several iterations of fixing that
driver for big endian but finally settled on putting __le annotations
on all the structures and doing cpu_to_leX() swaps as we wrote them
(and obviously leX_to_cpu() swaps to read them), meaning the structure
in memory is always correct for the device.  Then we used a writeX to
poke the doorbell and the device just picked up the correct
information.

The rule you want to be following is: memory mapped structure, you're
responsible for annotation and swapping; readX/writeX to correctly
sized data, the API will swap for you.

So, can we just revert the original patch which is clearly now a
regression and try to get this fixed in the merge window?  I think the
actual bug is simply you're missing __leX annotations on the shared
memory mapped structure to fix sparse, but otherwise everything is
working.

James

Re: [PATCH] mpt3sas: Fix for regression caused due to cf6bf9710c patch

2018-06-29 Thread James Bottomley

On Fri, 2018-06-29 at 10:58 -0400, Chaitra P B wrote:
>  "scsi: mpt3sas: Bug fix for big endian systems"
> 
> Above patch with commit id "cf6bf9710cabba1fe94a4349f4eb8db623c77ebc"
> was posted to fix sparse warnings. While posting this patch it was
> assumed that readl() & writel() APIs internally calls le32_to_cpu() &
> cpu_to_le32() APIs respectively. Looks like it is not true for all
> architecture

Just a minute, it damn well should be.  The definition of readl/writel
is barriers and little endian (you can see this in asm-generic/io.h).

Which architecture is getting this wrong?  Because it sounds like
that's what we need to fix rather than doing something like this in all
drivers.

Sparc (and parisc) definitely do the little endian thing, so if this
code is what it takes to get them working again, it looks like you're
double swapping somewhere.  I really think cf6bf9710c needs to be
reverted and you should look again at your sparse warnings.  Since the
driver is using the non-raw readX/writeX it should be cpu endian
internally which cf6bf9710c upsets.

James

Re: [PATCH v2] scsi: qla2xxx: Spinlock recursion in qla_target

2018-06-13 Thread James Bottomley

On Wed, 2018-06-13 at 16:13 +, Madhani, Himanshu wrote:
> > On Jun 13, 2018, at 6:05 AM, Mikhail Malygin 
> > wrote:
> > 
> > Here is the patch used for verification:
> > 
> > [PATCH] scsi: qla2xxx: Fixup spinlock recursion in qla_target
> > 
> > The patch reverts changes done in qlt_schedule_sess_for_deletion()
> > To avoid spinlock recursion sess->vha->work_lock should be used
> > instead of ha->tgt.sess_lock, that can be locked in
> > callers: qlt_reset() or qlt_handle_login()
> > 
> 
> Thanks for testing out the change. 
> 
> > Fixes: 1c6cacf4ea6c04 ("scsi: qla2xxx: Fixup locking for session
> > deletion")
> > 
> > Signed-off-by: Mikhail Malygin 
> 
> I want to see following tags for correctness
> 
> Fixes: 1c6cacf4ea6c04 ("scsi: qla2xxx: Fixup locking for session
> deletion”)
> Cc:  #4.17
> Reported-by: Mikhail Malygin 
> Tested-by: Mikhail Malygin 
> Signed-off-by: Himanshu Madhani 

No on this last one: he can't add your signoff tag.  The Signed-off-by: 
tags track the patch submission path and has meaning under the DCO, so
if Mikhail sends it to you and you send it to the list (or you pick it
off the list and resend it) *you* can add your signoff but not if it
goes straight from Mikhail to the scsi tree (for this case we have an
Acked-by: tag instead if that works for you?).

James

Re: sd 6:0:0:0: [sdb] Unaligned partial completion

2018-06-11 Thread James Bottomley

On Mon, 2018-06-11 at 14:59 -0700, Ted Cabeen wrote:
> On 06/11/2018 02:40 PM, James Bottomley wrote:
> > On Mon, 2018-06-11 at 12:20 -0400, Douglas Gilbert wrote:
> > > I have also seen Aborted Command sense when doing heavy testing
> > > on one or more SAS disks behind a SAS expander. I put it down to
> > > a temporary lack of paths available (on the link between the
> > > host's HBA and the expander) when one of those SAS disks tries to
> > > get a connection back to the host with the data (data-in
> > > transfer) from an earlier READ command.
> > > 
> > > In my code (ddpt and sg_dd) I treat it as a "retry" type error
> > > and in my experience that works. IOW a follow-up READ with the
> > > same parameters is successful.
> > 
> > We do treat ABORTED_COMMAND as a retry.  However, it will tick down
> > the retry count (usually 3) and then fail if it still occurs.  How
> > long does this condition persist for? because if it's long lived we
> > could treat it as ADD_TO_MLQUEUE which would mean we'd retry until
> > the timeout condition was reached.
> 
> On my system, it's a bit hard to tell, as as soon as ZFS sees the
> read error, it starts resilvering to repair the sector that reported
> the I/O error.  Without the scrub, it happened once over a 5-day
> window.  During the scrub, it was usually 10s of minutes between
> occurrences that failed all the retries, but I had some occasions
> where it happened about 5-10 minutes apart.  It definitely seems to
> be load-related, so how long and hard the load stays elevated is a
> factor.

OK, try this: it will print a rate limited warning if it triggers
(showing it is this problem) and return ADD_TO_MLQUEUE for all the SAS
errors (we'll likely narrow this if it works, but for now let's do the
lot).

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 8932ae81a15a..94aa5cb94064 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -531,6 +531,11 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
if (sshdr.asc == 0xc1 && sshdr.ascq == 0x01 &&
sdev->sdev_bflags & BLIST_RETRY_ASC_C1)
return ADD_TO_MLQUEUE;
+   if (sshdr.asc == 0x4b) {
+   printk_ratelimited(KERN_WARNING "SAS/SATA link 
retry\n");
+   return ADD_TO_MLQUEUE;
+   }
+
 
return NEEDS_RETRY;
case NOT_READY:

Re: RAID6: "Bad block number requested"

2018-06-11 Thread James Bottomley

On Mon, 2018-06-11 at 17:56 -0400, Bryan Gurney wrote:
> On Mon, Jun 11, 2018 at 1:00 PM, Anthony Youngman
>  wrote:
> > On 11/06/18 16:06, James Bottomley wrote:
> > > Well, this is the problem: a 4k logical (presumably 4k physical)
> > > drive cannot be addressed in block sectors that are not divisible
> > > by 8.  This type of drive configuration is very unusual (although
> > > it was something we tested years ago before the industry realised
> > > it had to ship drives with 4k physical but 512 byte logical
> > > sectors because of the legacy problem).
> > 
> > I understood these drives were now becoming much more common,
> > especially enterprise-grade drives. I know there were problems
> > switching from 512/512 drives to 512/4096, but as you say I thought
> > they were pretty much addressed.
> 
> As soon as I saw the model number "HGST HUH721010AL", and did a
> search, I said, "Oh, it's _this_ drive."
> 
> The HGST Ultrastar He10 has both "512e Format" and "4K Native Format"
> part numbers, so it's easy to potentially buy the wrong type of drive
> (e.g.: accidentally buy a 4K Native drive, and discover some obscure
> I/O failures).
> 
> FYI, in my experience, when an application sends a
> smaller-than-4096-bytes I/O to a 4096-bytes block device, the usual
> error code that's sent by the driver is EINVAL (or "Invalid
> argument"), so see if there's a log message citing that error code.

We've done the work to make this function.  However, it was a while ago
and I don't believe anyone tests regularly now (particularly with the
corner cases) so errors can creep back into the stack.

> > I think it must be a couple of years ago now though, that I heard
> > (on LWN) enterprise drives were apparently switching over to
> > 4096/4096. With NO 512 emulation fall-back.
> 
> Some drive manufacturers seem to be more eager than others, but
> there's still work to be done.  For example, try this with a 4K-
> native drive:
> 
> 1. Write an ISO image to the drive with the command "dd
> if=isofile.iso of=/dev/testdevice bs=4096 oflag=direct"
> 
> 2. Create a test directory (for example, "/mnt/testdir"), then
> attempt to mount the device with "mount /dev/testdevice /mnt/testdir"

This is a textbook case of something that can never work: The
requirement for a 4k drive is that the stack must be aligned, meaning
4k or multiple of 4k block size all the way up and down.  The isofs
you're copying only has a 2k block size.  You get the same failure with
any non 4k multiple filesystem block size.  Fortunately most modern
filesystems have had 4k, or multiple thereof, block sizes for a while
now, so you're unlikely to see this on your old ext4 devices but, in
principle, it could happen.

James


> When I tried it on RHEL 7.5, I saw this: "kernel: isofs_fill_super:
> bread failed, dev=testdevice, iso_blknum=17, block=-2147483648"
> 
> Note that ISO filesystems have a 2048-byte block size (maximum), but
> in this test, it's stored on a block device with a block size of 4096
> bytes.
> 
> There may be more issues out there, but they have to be found first.
> And finding the issues is difficult, due to the obscurity of the
> error messages seen.
> 
> 
> Thanks,
> 
> Bryan
>

Re: sd 6:0:0:0: [sdb] Unaligned partial completion

2018-06-11 Thread James Bottomley

[readd linux-scsi]
On Mon, 2018-06-11 at 14:43 -0700, Ted Cabeen wrote:
> On 06/11/2018 02:40 PM, James Bottomley wrote:
> > > I have also seen Aborted Command sense when doing heavy testing
> > > on one or more SAS disks behind a SAS expander. I put it down to
> > > a temporary lack of paths available (on the link between the
> > > host's HBA and the expander) when one of those SAS disks tries to
> > > get a connection back to the host with the data (data-in
> > > transfer) from an earlier READ command.
> > > 
> > > In my code (ddpt and sg_dd) I treat it as a "retry" type error
> > > and in my experience that works. IOW a follow-up READ with the
> > > same parameters is successful.
> > 
> > We do treat ABORTED_COMMAND as a retry.  However, it will tick down
> > the retry count (usually 3) and then fail if it still occurs.  How
> > long does this condition persist for? because if it's long lived we
> > could treat it as ADD_TO_MLQUEUE which would mean we'd retry until
> > the timeout condition was reached.
> 
> When you retry, should that result in additional kernel messages, or 
> does the kernel message only appear after the 3 retrys have all
> failed?

The latter: without enabling logging, we don't print anything for
successfully retried commands.

James

Re: sd 6:0:0:0: [sdb] Unaligned partial completion

2018-06-11 Thread James Bottomley

On Mon, 2018-06-11 at 12:20 -0400, Douglas Gilbert wrote:
> On 2018-06-11 12:07 PM, Ted Cabeen wrote:
> > I'm seeing a similar behavior on my system, but across multiple
> > devices on a SAS drive array (front bays on a Supermicro-based
> > system with onboard mpt3sas card). 
> > The Sense Key here doesn't show a medium error, and the multiple-
> > drive behavior makes me think it's more likely either a controller
> > or cable problem. Interestingly, the issue only shows up under
> > heavy load (specifically a ZFS scrub).
> > 
> > During my next downtime window, I'm going to try to re-create the
> > problem while capturing a blktrace.  Any other things to try at
> > that time, or a filter-mask I should apply?
> > 
> > [Wed Jun  6 14:30:19 2018] blk_update_request: I/O error, dev sdn,
> > sector 
> > 1757633640
> > [Wed Jun  6 14:37:10 2018] sd 15:0:5:0: unaligned partial
> > completion avoided 
> > (xfer_cnt=3072, sector_sz=4096)
> > [Wed Jun  6 14:37:10 2018] sd 15:0:5:0: [sdr] FAILED Result:
> > hostbyte=DID_OK 
> > driverbyte=DRIVER_SENSE
> > [Wed Jun  6 14:37:10 2018] sd 15:0:5:0: [sdr] Sense Key : Aborted
> > Command 
> > [current] [descriptor]
> > [Wed Jun  6 14:37:10 2018] sd 15:0:5:0: [sdr] Add. Sense: Nak
> > received
> > [Wed Jun  6 14:37:10 2018] sd 15:0:5:0: [sdr] CDB: Read(10) 28 00
> > 07 8a c1 ca 00 
> > 00 01 00
> > [Wed Jun  6 14:37:10 2018] blk_update_request: I/O error, dev sdr,
> > sector 
> > 1012272720
> > [Wed Jun  6 15:20:43 2018] sd 15:0:8:0: unaligned partial
> > completion avoided 
> > (xfer_cnt=52224, sector_sz=4096)
> > [Wed Jun  6 15:20:43 2018] sd 15:0:8:0: [sdu] FAILED Result:
> > hostbyte=DID_OK 
> > driverbyte=DRIVER_SENSE
> > [Wed Jun  6 15:20:43 2018] sd 15:0:8:0: [sdu] Sense Key : Aborted
> > Command 
> > [current] [descriptor]
> > [Wed Jun  6 15:20:43 2018] sd 15:0:8:0: [sdu] Add. Sense: Nak
> > received
> > [Wed Jun  6 15:20:43 2018] sd 15:0:8:0: [sdu] CDB: Read(10) 28 00
> > 12 ab dc 52 00 
> > 00 19 00
> > [Wed Jun  6 15:20:43 2018] blk_update_request: I/O error, dev sdu,
> > sector 
> > 2506023568
> > [Wed Jun  6 15:46:20 2018] sd 15:0:2:0: unaligned partial
> > completion avoided 
> > (xfer_cnt=11264, sector_sz=4096)
> > [Wed Jun  6 15:46:20 2018] sd 15:0:2:0: [sdo] FAILED Result:
> > hostbyte=DID_OK 
> > driverbyte=DRIVER_SENSE
> > [Wed Jun  6 15:46:20 2018] sd 15:0:2:0: [sdo] Sense Key : Aborted
> > Command 
> > [current] [descriptor]
> > [Wed Jun  6 15:46:20 2018] sd 15:0:2:0: [sdo] Add. Sense: Nak
> > received
> > [Wed Jun  6 15:46:20 2018] sd 15:0:2:0: [sdo] CDB: Read(10) 28 00
> > 40 a8 ef b5 00 
> > 00 03 00
> > [Wed Jun  6 15:46:20 2018] blk_update_request: I/O error, dev sdo,
> > sector 
> > 8678505896
> > 
> 
> I have also seen Aborted Command sense when doing heavy testing on
> one or more SAS disks behind a SAS expander. I put it down to a
> temporary lack of paths available (on the link between the host's HBA
> and the expander) when one of those SAS disks tries to get a
> connection back to the host with the data (data-in transfer) from an
> earlier READ command.
> 
> In my code (ddpt and sg_dd) I treat it as a "retry" type error and in
> my experience that works. IOW a follow-up READ with the same
> parameters is successful.

We do treat ABORTED_COMMAND as a retry.  However, it will tick down the
retry count (usually 3) and then fail if it still occurs.  How long
does this condition persist for? because if it's long lived we could
treat it as ADD_TO_MLQUEUE which would mean we'd retry until the
timeout condition was reached.

James

Re: RAID6: "Bad block number requested"

2018-06-11 Thread James Bottomley

On Mon, 2018-06-11 at 11:18 -0400, Douglas Gilbert wrote:
> On 2018-06-11 11:06 AM, James Bottomley wrote:
> > On Mon, 2018-06-11 at 16:24 +0200, Sebastian Hegler wrote:
> > > Dear all,
> > > 
> > > First off: sorry for cross-posting. I don't know if this is a
> > > RAID
> > > issue or a SCSI issue, so I'll just ask y'all.
> > > 
> > > 
> > > For a RAID6 capacity upgrade (higher capacity drives), we bought
> > > some
> > > 10TB disks:
> > > ==
> > > Apr 17 11:16:05 kuiper kernel: [12795386.862031] scsi 6:0:36:0:
> > > Direct-Access ATA  HGST HUH721010AL T21D PQ: 0 ANSI: 6
> > > Apr 17 11:16:05 kuiper kernel: [12795386.919904] scsi 6:0:36:0:
> > > atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y),
> > > sw_preserve(y)
> > > Apr 17 11:16:05 kuiper kernel: [12795386.974186] sd 6:0:36:0:
> > > [sdl]
> > > 2441609216 4096-byte logical blocks: (10.0 TB/9.10 TiB)
> > 
> > Well, this is the problem: a 4k logical (presumably 4k physical)
> > drive
> > cannot be addressed in block sectors that are not divisible by
> > 8.  This
> > type of drive configuration is very unusual (although it was
> > something
> > we tested years ago before the industry realised it had to ship
> > drives
> > with 4k physical but 512 byte logical sectors because of the legacy
> > problem).
> > 
> > > Apr 17 11:16:05 kuiper kernel: [12795386.998016] sd 6:0:36:0:
> > > [sdl]
> > > Write Protect is off
> > > Apr 17 11:16:05 kuiper kernel: [12795387.000625] sd 6:0:36:0:
> > > Attached scsi generic sg12 type 0
> > > Apr 17 11:16:05 kuiper kernel: [12795387.035341] sd 6:0:36:0:
> > > [sdl]
> > > Mode Sense: 7f 00 10 08
> > > Apr 17 11:16:05 kuiper kernel: [12795387.035679] sd 6:0:36:0:
> > > [sdl]
> > > Write cache: enabled, read cache: enabled, supports DPO and FUA
> > > Apr 17 11:16:05 kuiper kernel: [12795387.098315] sd 6:0:36:0:
> > > [sdl]
> > > Attached SCSI disk
> > > ==
> > > 
> > > RAID add and rebuild operations went fine. However, some minutes
> > > after rebuild completion, several hundreds of these error
> > > messages
> > > started to appear:
> > > ==
> > > Apr 20 03:37:29 kuiper kernel: [13027072.454811] sd 6:0:36:0:
> > > [sdl]
> > > Bad block number requested
> > 
> > This means that somehow, something sent a non 4k aligned 4k sized
> > request. SCSI here is just the messenger.  However, if you apply
> > this
> > patch, it will capture the stack trace of what above it triggered
> > this,
> > which may help us in debugging.  It could be we may also want to
> > see
> > what the values of block and blk_rq_sectors(rq) actually are, but
> > lets
> > begin with the stack trace.
> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> > index 9421d9877730..ac865e048533 100644
> > --- a/drivers/scsi/sd.c
> > +++ b/drivers/scsi/sd.c
> > @@ -1109,6 +1109,7 @@ static int sd_setup_read_write_cmnd(struct
> > scsi_cmnd *SCpnt)
> >     if ((block & 7) || (blk_rq_sectors(rq) & 7)) {
> >     scmd_printk(KERN_ERR, SCpnt,
> >     "Bad block number
> > requested\n");
> 
> Not a very informative error message. How about a quasi SCSI one
> like:
>  Logical Block out of range, due to different block sizes

Well, this is supposed to be an impossible condition: if someone wants
to use non-512 byte logical drives, they're supposed to align the stack
to ensure it works.  In this case, it looks like the stack is aligned
but one component isn't completely 4k safe.  I don't think there's any
message we can print in SCSI that would help with debugging that.

I agree the message we print could be more informative about what went
wrong, but it's unlikely to be helpful to the user who sees it.

James

Re: RAID6: "Bad block number requested"

2018-06-11 Thread James Bottomley

On Mon, 2018-06-11 at 16:24 +0200, Sebastian Hegler wrote:
> Dear all,
> 
> First off: sorry for cross-posting. I don't know if this is a RAID
> issue or a SCSI issue, so I'll just ask y'all.
> 
> 
> For a RAID6 capacity upgrade (higher capacity drives), we bought some
> 10TB disks:
> ==
> Apr 17 11:16:05 kuiper kernel: [12795386.862031] scsi 6:0:36:0:
> Direct-Access ATA  HGST HUH721010AL T21D PQ: 0 ANSI: 6
> Apr 17 11:16:05 kuiper kernel: [12795386.919904] scsi 6:0:36:0:
> atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
> Apr 17 11:16:05 kuiper kernel: [12795386.974186] sd 6:0:36:0: [sdl]
> 2441609216 4096-byte logical blocks: (10.0 TB/9.10 TiB)

Well, this is the problem: a 4k logical (presumably 4k physical) drive
cannot be addressed in block sectors that are not divisible by 8.  This
type of drive configuration is very unusual (although it was something
we tested years ago before the industry realised it had to ship drives
with 4k physical but 512 byte logical sectors because of the legacy
problem).

> Apr 17 11:16:05 kuiper kernel: [12795386.998016] sd 6:0:36:0: [sdl]
> Write Protect is off
> Apr 17 11:16:05 kuiper kernel: [12795387.000625] sd 6:0:36:0:
> Attached scsi generic sg12 type 0
> Apr 17 11:16:05 kuiper kernel: [12795387.035341] sd 6:0:36:0: [sdl]
> Mode Sense: 7f 00 10 08
> Apr 17 11:16:05 kuiper kernel: [12795387.035679] sd 6:0:36:0: [sdl]
> Write cache: enabled, read cache: enabled, supports DPO and FUA
> Apr 17 11:16:05 kuiper kernel: [12795387.098315] sd 6:0:36:0: [sdl]
> Attached SCSI disk
> ==
> 
> RAID add and rebuild operations went fine. However, some minutes
> after rebuild completion, several hundreds of these error messages
> started to appear:
> ==
> Apr 20 03:37:29 kuiper kernel: [13027072.454811] sd 6:0:36:0: [sdl]
> Bad block number requested

This means that somehow, something sent a non 4k aligned 4k sized
request. SCSI here is just the messenger.  However, if you apply this
patch, it will capture the stack trace of what above it triggered this,
which may help us in debugging.  It could be we may also want to see
what the values of block and blk_rq_sectors(rq) actually are, but lets
begin with the stack trace.

James

---

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 9421d9877730..ac865e048533 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1109,6 +1109,7 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd 
*SCpnt)
if ((block & 7) || (blk_rq_sectors(rq) & 7)) {
scmd_printk(KERN_ERR, SCpnt,
"Bad block number requested\n");
+   WARN_ON_ONCE(1);
goto out;
} else {
block = block >> 3;

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 3148 matches

Mail list logo