Re: [PATCH] scsi: arm: fas216: Use %llu to print 'u64' format

2014-07-10 Thread Christoph Hellwig
On Fri, Jul 11, 2014 at 08:30:17AM +0200, Hannes Reinecke wrote:
> On 07/11/2014 08:23 AM, Christoph Hellwig wrote:
> >Thanks Fabio,
> >
> >this looks good to me.  Are there any other warnings you got from that
> >autobuilder?
> >
> It _should_ have been fixed with my latest patch for fixing the 64bit LUN
> fallout. Christoph, have you applied it?

Looks like this is in, indeed.  Seems like that latest drivers-for-3.17
update hasn't made it into James' for-next and thus linux-next, though.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter WRITE_SAME_16

2014-07-10 Thread h...@infradead.org
On Wed, Jul 09, 2014 at 10:27:24PM +, James Bottomley wrote:
> If we fix it at source, why would there be any need to filter?  That's
> the reason the no_write_same flag was introduced.  If we can find and
> fix the bug, it can go back into the stable trees as a bug fix, hence
> nothing should ever emit write_same(10 or 16) and additional driver code
> is redundant (and counter productive, since if this ever breaks again
> you're our best canary).
> 
> This looks like it might be the problem but Martin should confirm (I
> think the problem comes to us from the RC16 code which unconditionally
> sets WS16).

I think the problem is a differnet one.  If we have the logical
provisioning EVPD it configures what method to use, but if we don't have
one we simply check for a max unmap blocks field, and if that's not
present use WRITE SAME.

The patch checks the no_write_same flag before doing that, for which
we also have to do the write_same setup before the discard setup
in sd_revalidate_disk.

Ky: does hyperv support UNMAP?  If so any idea why it doesn't set
the maximum unmap block count field in the EVPD?

If we want to enable UNMAP in this case I'd prefer a blacklist entry
than trying UNMAP despite the device not advertising it.

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ba756b1..fbccfd2 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2614,9 +2614,10 @@ static void sd_read_block_limits(struct scsi_disk *sdkp)
 
if (sdkp->max_unmap_blocks)
sd_config_discard(sdkp, SD_LBP_UNMAP);
-   else
+   else if (!sdkp->device->no_write_same)
sd_config_discard(sdkp, SD_LBP_WS16);
-
+   else
+   sd_config_discard(sdkp, SD_LBP_DISABLE);
} else {/* LBP VPD page tells us what to use */
 
if (sdkp->lbpu && sdkp->max_unmap_blocks)
@@ -2766,6 +2767,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 */
if (sdkp->media_present) {
sd_read_capacity(sdkp, buffer);
+   sd_read_write_same(sdkp, buffer);
 
if (sd_try_extended_inquiry(sdp)) {
sd_read_block_provisioning(sdkp);
@@ -2776,7 +2778,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
sd_read_write_protect_flag(sdkp, buffer);
sd_read_cache_type(sdkp, buffer);
sd_read_app_tag_own(sdkp, buffer);
-   sd_read_write_same(sdkp, buffer);
}
 
sdkp->first_scan = 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: arm: fas216: Use %llu to print 'u64' format

2014-07-10 Thread Hannes Reinecke

On 07/11/2014 08:23 AM, Christoph Hellwig wrote:

Thanks Fabio,

this looks good to me.  Are there any other warnings you got from that
autobuilder?

It _should_ have been fixed with my latest patch for fixing the 
64bit LUN fallout. Christoph, have you applied it?


Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: arm: fas216: Use %llu to print 'u64' format

2014-07-10 Thread Christoph Hellwig
Thanks Fabio,

this looks good to me.  Are there any other warnings you got from that
autobuilder?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Christoph Hellwig
On Fri, Jul 11, 2014 at 06:02:03AM +, Elliott, Robert (Server Storage) 
wrote:
> Allowing longer run times before declaring success, the problem 
> does appear in all of the bisect trees.  I just let fio
> continue to run for many minutes - no ^Cs necessary.
> 
> no-rebase: good for > 45 minutes (I will leave that running for
>   8 more hours)

Ok, thanks.  If it's still running tomorrow morning let's look into the
aio reverts again.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Elliott, Robert (Server Storage)
> 
> I added some prints in aio_setup_ring and  ioctx_alloc and
> rebooted.  This time it took much longer to hit the problem.  It
> survived dozens of ^Cs.  Running a few minutes, though, IOPS
> eventually dropped.  So, sometimes it happens immediately,
> sometimes it takes time to develop.
> 
> I will rerun bisect-1 -2 and -3 for longer times to increase
> confidence that they didn't just appear good.

Allowing longer run times before declaring success, the problem 
does appear in all of the bisect trees.  I just let fio
continue to run for many minutes - no ^Cs necessary.

no-rebase: good for > 45 minutes (I will leave that running for
  8 more hours)
bisect-1: bad
bisect-2: bad
bisect-3: bad
bisect-4: bad


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] scsi: arm: fas216: Use %llu to print 'u64' format

2014-07-10 Thread Fabio Estevam
From: Fabio Estevam 

The following warning is seen when building for ARM:

drivers/scsi/arm/fas216.c:3003:3: warning: format '%d' expects argument of type
'int', but argument 4 has type 'u64' [-Wformat=]

Use %llu to print 'u64' format.

Reported-by: Olof's autobuilder 
Signed-off-by: Fabio Estevam 
---
 drivers/scsi/arm/fas216.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/arm/fas216.c b/drivers/scsi/arm/fas216.c
index b46a6f6..aeddd11 100644
--- a/drivers/scsi/arm/fas216.c
+++ b/drivers/scsi/arm/fas216.c
@@ -3000,7 +3000,7 @@ void fas216_print_devices(FAS216_Info *info, struct 
seq_file *m)
 
shost_for_each_device(scd, info->host) {
dev = &info->device[scd->id];
-   seq_printf(m, " %d/%d   ", scd->id, scd->lun);
+   seq_printf(m, " %d/%llu   ", scd->id, scd->lun);
if (scd->tagged_supported)
seq_printf(m, "%3sabled(%3d) ",
 scd->simple_tags ? "en" : "dis",
-- 
1.8.3.2

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 6/8] Drivers: scsi: storvsc: Implement an abort handler

2014-07-10 Thread KY Srinivasan


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Thursday, July 10, 2014 3:13 AM
> To: KY Srinivasan
> Cc: Christoph Hellwig; linux-ker...@vger.kernel.org;
> de...@linuxdriverproject.org; oher...@suse.com;
> jbottom...@parallels.com; jasow...@redhat.com; a...@canonical.com;
> linux-scsi@vger.kernel.org
> Subject: Re: [PATCH 6/8] Drivers: scsi: storvsc: Implement an abort handler
> 
> 
> Note that you could increase the timeout and/or implement an
> eh_timed_out handler that just returns BLK_EH_RESET_TIMER, but if the
> completion takes too long the expectation is that a command will eventually
> finish instead of beeing delayed by an unmound amount.

I like this idea; I will implement a eh_timed_out_handler.

K. Y

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 7/8] drivers: scsi: storvsc: Set srb_flags in all cases

2014-07-10 Thread KY Srinivasan


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Thursday, July 10, 2014 3:17 AM
> To: KY Srinivasan
> Cc: Christoph Hellwig; linux-ker...@vger.kernel.org;
> de...@linuxdriverproject.org; oher...@suse.com;
> jbottom...@parallels.com; jasow...@redhat.com; a...@canonical.com;
> linux-scsi@vger.kernel.org; sta...@vger.kernel.org
> Subject: Re: [PATCH 7/8] drivers: scsi: storvsc: Set srb_flags in all cases
> 
> On Wed, Jul 09, 2014 at 06:40:09PM +, KY Srinivasan wrote:
> > > On Tue, Jul 08, 2014 at 05:46:51PM -0700, K. Y. Srinivasan wrote:
> > > > Correctly set SRB flags for all valid I/O directions. Some IHV
> > > > drivers on the Windows host require this.
> > >
> > > What are IHV drivers?
> >
> > If the target is a SAN device, the host simply passes the request over to 
> > the
> native driver stack on the host. Some specific hardware (IHV - independent
> hadware vendor) drivers on Windows require that SRB flags be correctly set
> in all cases.
> >
> 
> I'm fine with putting this in, but treating I/O request from guests as 
> trusted in
> a hypervisor doesn't seem like a good idea in general..

The host does  validate the guest request before forwarding the request to the 
appropriate target.

K. Y
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter WRITE_SAME_16

2014-07-10 Thread James Bottomley
On Thu, 2014-07-10 at 21:02 +, KY Srinivasan wrote:
> 
> > -Original Message-
> > From: James Bottomley [mailto:jbottom...@parallels.com]
> > Sent: Wednesday, July 9, 2014 3:27 PM
> > To: KY Srinivasan
> > Cc: linux-ker...@vger.kernel.org; m...@mkp.net; h...@infradead.org;
> > de...@linuxdriverproject.org; a...@canonical.com; sta...@vger.kernel.org;
> > linux-scsi@vger.kernel.org; oher...@suse.com; jasow...@redhat.com
> > Subject: Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter WRITE_SAME_16
> > 
> > On Wed, 2014-07-09 at 21:14 +, KY Srinivasan wrote:
> > >
> > > > -Original Message-
> > > > From: James Bottomley [mailto:jbottom...@parallels.com]
> > > > Sent: Wednesday, July 9, 2014 12:57 PM
> > > > To: KY Srinivasan
> > > > Cc: linux-ker...@vger.kernel.org; h...@infradead.org;
> > > > de...@linuxdriverproject.org; a...@canonical.com;
> > > > sta...@vger.kernel.org; linux-scsi@vger.kernel.org;
> > > > oher...@suse.com; jasow...@redhat.com
> > > > Subject: Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter
> > > > WRITE_SAME_16
> > > >
> > > > On Wed, 2014-07-09 at 19:52 +, KY Srinivasan wrote:
> > > > >
> > > > > > -Original Message-
> > > > > > From: Christoph Hellwig [mailto:h...@infradead.org]
> > > > > > Sent: Wednesday, July 9, 2014 1:43 AM
> > > > > > To: KY Srinivasan
> > > > > > Cc: linux-ker...@vger.kernel.org; de...@linuxdriverproject.org;
> > > > > > oher...@suse.com; jbottom...@parallels.com;
> > jasow...@redhat.com;
> > > > > > a...@canonical.com; linux-scsi@vger.kernel.org;
> > > > > > sta...@vger.kernel.org
> > > > > > Subject: Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter
> > > > > > WRITE_SAME_16
> > > > > >
> > > > > > On Tue, Jul 08, 2014 at 05:46:48PM -0700, K. Y. Srinivasan wrote:
> > > > > > > Host does not handle WRITE_SAME_16; filter this command out.
> > > > > > > This patch is required to handle large devices (greater than 2 TB
> > disks).
> > > > > >
> > > > > > Storvsc already sets the no_write_same flag, where is the
> > > > > > command coming from?
> > > > >
> > > > > In spite of this flag,  I see WRITE_SAME_16 being issued when I
> > > > > format a device bigger than 2 TB; I tried both xfs and ext4.
> > > > > Windows hosts currently do not handle unsupported commands
> > > > > correctly - The information returned is not sufficient to effect 
> > > > > recovery
> > in the Linux guest.
> > > > While this may be addressed in future hosts, this patch fixes the 
> > > > problem.
> > > >
> > > > What Christoph means is that this looks like a bug somewhere in SCSI
> > itself.
> > > > That means we need to find it and kill it, not add workarounds to
> > > > every driver that sets no_write_same ...
> > >
> > > James,
> > >
> > > I will try to isolate this issue in the SCSI stack. If it is ok with
> > > you guys, I would still want to filter WRITE_SAME_16 (as we currently
> > > do WRITE_SAME) in our driver since this would address the problem for a
> > large number of customers on our platform.
> > 
> > If we fix it at source, why would there be any need to filter?  That's the
> > reason the no_write_same flag was introduced.  If we can find and fix the
> > bug, it can go back into the stable trees as a bug fix, hence nothing should
> > ever emit write_same(10 or 16) and additional driver code is redundant (and
> > counter productive, since if this ever breaks again you're our best canary).
> > 
> > This looks like it might be the problem but Martin should confirm (I think 
> > the
> > problem comes to us from the RC16 code which unconditionally sets WS16).
> > 
> > James
> 
> James,
> 
> This patch works for me; are you planning on committing this patch.

OK, if we go with this we can add your tested by.

It's not going in until Martin takes a look because there may be a
better way of doing this.

James



RE: [PATCH V2 2/8] Drivers: scsi: storvsc: Set cmd_per_lun to reflect value supported by the Host

2014-07-10 Thread KY Srinivasan


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Thursday, July 10, 2014 3:25 AM
> To: KY Srinivasan
> Cc: linux-ker...@vger.kernel.org; de...@linuxdriverproject.org;
> oher...@suse.com; jbottom...@parallels.com; h...@infradead.org;
> jasow...@redhat.com; a...@canonical.com; linux-scsi@vger.kernel.org;
> sta...@vger.kernel.org
> Subject: Re: [PATCH V2 2/8] Drivers: scsi: storvsc: Set cmd_per_lun to reflect
> value supported by the Host
> 
> > -   .cmd_per_lun =  1,
> > +   .cmd_per_lun =  255,
> > .can_queue =
>   STORVSC_MAX_IO_REQUESTS*STORVSC_MAX_TARGETS,
> 
> slave_configure immediately adjusts this down to
> STORVSC_MAX_IO_REQUESTS (250), any reson to start out with the magic
> 255 here?

The number 255 (set for cmd_per_lun) is what the host can support (I recently 
discovered this in an MSDN document). STORVSC_MAX_IO_REQUESTS is an 
implementation limitation in this driver.

Regards,

K. Y
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Jeff Moyer [mailto:jmo...@redhat.com]
> Sent: Thursday, 10 July, 2014 2:14 PM
> To: Elliott, Robert (Server Storage)
> Cc: Christoph Hellwig; Jens Axboe; dgilb...@interlog.com; James Bottomley;
> Bart Van Assche; Benjamin LaHaise; linux-scsi@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: Re: scsi-mq V2
> 
> "Elliott, Robert (Server Storage)"  writes:
> 
> >> -Original Message-
> >> From: Christoph Hellwig [mailto:h...@infradead.org]
> >> Sent: Thursday, 10 July, 2014 11:15 AM
> >> To: Elliott, Robert (Server Storage)
> >> Cc: Jens Axboe; dgilb...@interlog.com; James Bottomley; Bart Van Assche;
> >> Benjamin LaHaise; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org
> >> Subject: Re: scsi-mq V2
> >>
> >> On Thu, Jul 10, 2014 at 09:04:22AM -0700, Christoph Hellwig wrote:
> >> > It's starting to look weird.  I'll prepare another two bisect branches
> >> > around some MM changes, which seems the only other possible candidate.
> >>
> >> I've pushed out scsi-mq.3-bisect-3
> >
> > Good.
> >
> >> and scsi-mq.3-bisect-4 for you.
> >
> > Bad.
> >
> > Note: I had to apply the vdso2c.h patch to build this -rc3 based kernel:
> > diff --git a/arch/x86/vdso/vdso2c.h b/arch/x86/vdso/vdso2c.h
> > index df95a2f..11b65d4 100644
> > --- a/arch/x86/vdso/vdso2c.h
> > +++ b/arch/x86/vdso/vdso2c.h
> > @@ -93,6 +93,9 @@ static void BITSFUNC(copy_section)(struct
> BITSFUNC(fake_sections) *out,
> > uint64_t flags = GET_LE(&in->sh_flags);
> >
> > bool copy = flags & SHF_ALLOC &&
> > +   (GET_LE(&in->sh_size) ||
> > +   (GET_LE(&in->sh_type) != SHT_RELA &&
> > +   GET_LE(&in->sh_type) != SHT_REL)) &&
> > strcmp(name, ".altinstructions") &&
> > strcmp(name, ".altinstr_replacement");
> >
> > Results: fio started OK, getting 900K IOPS, but ^C led to 0 IOPS and
> > an fio hang, with one CPU (CPU 0) stuck in io_submit loops.
> 

I added some prints in aio_setup_ring and  ioctx_alloc and
rebooted.  This time it took much longer to hit the problem.  It 
survived dozens of ^Cs.  Running a few minutes, though, IOPS 
eventually dropped.  So, sometimes it happens immediately,
sometimes it takes time to develop.

I will rerun bisect-1 -2 and -3 for longer times to increase
confidence that they didn't just appear good.

On this bisect-4 run, as IOPS started to drop from 900K to 40K, 
I ran perf top when it was at 700K.  You can see io_submit times
creeping up.

  4.30%  [kernel][k] do_io_submit
  4.29%  [kernel][k] _raw_spin_lock_irqsave
  3.88%  libaio.so.1.0.1 [.] io_submit
  3.55%  [kernel][k] system_call
  3.34%  [kernel][k] put_compound_page
  3.11%  [kernel][k] io_submit_one
  3.06%  [kernel][k] system_call_after_swapgs
  2.89%  [kernel][k] copy_user_generic_string
  2.45%  [kernel][k] lookup_ioctx
  2.16%  [kernel][k] apic_timer_interrupt
  2.00%  [kernel][k] _raw_spin_lock
  1.97%  [scsi_debug][k] sdebug_q_cmd_hrt_complete
  1.84%  [kernel][k] __get_page_tail
  1.74%  [kernel][k] do_blockdev_direct_IO
  1.68%  [kernel][k] blk_flush_plug_list
  1.41%  [kernel][k] _raw_spin_unlock_irqrestore
  1.24%  [scsi_debug][k] schedule_resp

finally settling like before:
 14.15%  [kernel][k] do_io_submit
 13.61%  libaio.so.1.0.1 [.] io_submit
 11.81%  [kernel][k] system_call
 10.11%  [kernel][k] system_call_after_swapgs
  8.59%  [kernel][k] io_submit_one
  8.56%  [kernel][k] copy_user_generic_string
  7.96%  [kernel][k] lookup_ioctx
  5.33%  [kernel][k] blk_flush_plug_list
  3.11%  [kernel][k] blk_finish_plug
  2.84%  [kernel][k] sysret_check
  2.63%  fio [.] fio_libaio_commit
  2.27%  [kernel][k] blk_start_plug
  1.17%  [kernel][k] SyS_io_submit

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 4/8] Drivers: scsi: storvsc: Filter WRITE_SAME_16

2014-07-10 Thread KY Srinivasan


> -Original Message-
> From: James Bottomley [mailto:jbottom...@parallels.com]
> Sent: Wednesday, July 9, 2014 3:27 PM
> To: KY Srinivasan
> Cc: linux-ker...@vger.kernel.org; m...@mkp.net; h...@infradead.org;
> de...@linuxdriverproject.org; a...@canonical.com; sta...@vger.kernel.org;
> linux-scsi@vger.kernel.org; oher...@suse.com; jasow...@redhat.com
> Subject: Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter WRITE_SAME_16
> 
> On Wed, 2014-07-09 at 21:14 +, KY Srinivasan wrote:
> >
> > > -Original Message-
> > > From: James Bottomley [mailto:jbottom...@parallels.com]
> > > Sent: Wednesday, July 9, 2014 12:57 PM
> > > To: KY Srinivasan
> > > Cc: linux-ker...@vger.kernel.org; h...@infradead.org;
> > > de...@linuxdriverproject.org; a...@canonical.com;
> > > sta...@vger.kernel.org; linux-scsi@vger.kernel.org;
> > > oher...@suse.com; jasow...@redhat.com
> > > Subject: Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter
> > > WRITE_SAME_16
> > >
> > > On Wed, 2014-07-09 at 19:52 +, KY Srinivasan wrote:
> > > >
> > > > > -Original Message-
> > > > > From: Christoph Hellwig [mailto:h...@infradead.org]
> > > > > Sent: Wednesday, July 9, 2014 1:43 AM
> > > > > To: KY Srinivasan
> > > > > Cc: linux-ker...@vger.kernel.org; de...@linuxdriverproject.org;
> > > > > oher...@suse.com; jbottom...@parallels.com;
> jasow...@redhat.com;
> > > > > a...@canonical.com; linux-scsi@vger.kernel.org;
> > > > > sta...@vger.kernel.org
> > > > > Subject: Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter
> > > > > WRITE_SAME_16
> > > > >
> > > > > On Tue, Jul 08, 2014 at 05:46:48PM -0700, K. Y. Srinivasan wrote:
> > > > > > Host does not handle WRITE_SAME_16; filter this command out.
> > > > > > This patch is required to handle large devices (greater than 2 TB
> disks).
> > > > >
> > > > > Storvsc already sets the no_write_same flag, where is the
> > > > > command coming from?
> > > >
> > > > In spite of this flag,  I see WRITE_SAME_16 being issued when I
> > > > format a device bigger than 2 TB; I tried both xfs and ext4.
> > > > Windows hosts currently do not handle unsupported commands
> > > > correctly - The information returned is not sufficient to effect 
> > > > recovery
> in the Linux guest.
> > > While this may be addressed in future hosts, this patch fixes the problem.
> > >
> > > What Christoph means is that this looks like a bug somewhere in SCSI
> itself.
> > > That means we need to find it and kill it, not add workarounds to
> > > every driver that sets no_write_same ...
> >
> > James,
> >
> > I will try to isolate this issue in the SCSI stack. If it is ok with
> > you guys, I would still want to filter WRITE_SAME_16 (as we currently
> > do WRITE_SAME) in our driver since this would address the problem for a
> large number of customers on our platform.
> 
> If we fix it at source, why would there be any need to filter?  That's the
> reason the no_write_same flag was introduced.  If we can find and fix the
> bug, it can go back into the stable trees as a bug fix, hence nothing should
> ever emit write_same(10 or 16) and additional driver code is redundant (and
> counter productive, since if this ever breaks again you're our best canary).
> 
> This looks like it might be the problem but Martin should confirm (I think the
> problem comes to us from the RC16 code which unconditionally sets WS16).
> 
> James

James,

This patch works for me; are you planning on committing this patch.

Regards,

K. Y
> 
> ---
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 6825eda..8353a4c
> 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -634,6 +634,23 @@ static void sd_config_discard(struct scsi_disk *sdkp,
> unsigned int mode)
>   max(sdkp->physical_block_size,
>   sdkp->unmap_granularity * logical_block_size);
> 
> + if (sdkp->device->host->no_write_same) {
> + switch(mode) {
> +
> + case SD_LBP_WS16:
> + case SD_LBP_WS10:
> + case SD_LBP_ZERO:
> + /*
> +  * filter out all the WRITE SAME modes and map them
> +  * directly to UNMAP
> +  */
> + mode = SD_LBP_UNMAP;
> + /* fall through */
> + default:
> + /* everything else is OK */
> + break;
> + }
> + }
>   sdkp->provisioning_mode = mode;
> 
>   switch (mode) {

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jens Axboe

On 2014-07-10 22:05, Jeff Moyer wrote:

Jens Axboe  writes:


On 2014-07-10 17:11, Jeff Moyer wrote:

Benjamin LaHaise  writes:



[  186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339068] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339069] ioctx_alloc: nr_events=-2 aio_max_nr=65536


Something is horribly wrong here.  There is no way that value for nr_events
should be passed in to ioctx_alloc().  This implies that userland is calling
io_setup() with an impossibly large value for nr_events.  Can you post the
actual diff for your fs/aio.c relative to linus' tree?



fio does exactly this!  it passes INT_MAX.


That's correct, I had actually forgotten about this. It was a change
made a few years back, in correlation with the aio optimizations
posted then, basically telling aio to ignore that silly (and broken)
user ring.


I still don't see how you accomplish that.  Making it bigger doesn't get
rid of it.  ;-)


See the patches from back then - INT_MAX basically just meant the same 
as 0, but 0 could not be used because of the (silly) setup with the 
wrappers around the syscalls. So INT_MAX was overloaded to mean "no ring 
events, I don't care".


--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jeff Moyer
Jens Axboe  writes:

> On 2014-07-10 17:11, Jeff Moyer wrote:
>> Benjamin LaHaise  writes:
>>

 [  186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536
 [  186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536
 [  186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536
 [  186.339068] ioctx_alloc: nr_events=-2 aio_max_nr=65536
 [  186.339069] ioctx_alloc: nr_events=-2 aio_max_nr=65536
>>>
>>> Something is horribly wrong here.  There is no way that value for nr_events
>>> should be passed in to ioctx_alloc().  This implies that userland is calling
>>> io_setup() with an impossibly large value for nr_events.  Can you post the
>>> actual diff for your fs/aio.c relative to linus' tree?
>>>
>>
>> fio does exactly this!  it passes INT_MAX.
>
> That's correct, I had actually forgotten about this. It was a change
> made a few years back, in correlation with the aio optimizations
> posted then, basically telling aio to ignore that silly (and broken)
> user ring.

I still don't see how you accomplish that.  Making it bigger doesn't get
rid of it.  ;-)

Cheers,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jens Axboe

On 2014-07-10 17:11, Jeff Moyer wrote:

Benjamin LaHaise  writes:



[  186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339068] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339069] ioctx_alloc: nr_events=-2 aio_max_nr=65536


Something is horribly wrong here.  There is no way that value for nr_events
should be passed in to ioctx_alloc().  This implies that userland is calling
io_setup() with an impossibly large value for nr_events.  Can you post the
actual diff for your fs/aio.c relative to linus' tree?



fio does exactly this!  it passes INT_MAX.


That's correct, I had actually forgotten about this. It was a change 
made a few years back, in correlation with the aio optimizations posted 
then, basically telling aio to ignore that silly (and broken) user ring.


--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] Let device drivers disable msi on shutdown

2014-07-10 Thread Bjorn Helgaas
On Thu, Jul 10, 2014 at 12:53 PM, Keith Busch  wrote:
> On Thu, 10 Jul 2014, Bjorn Helgaas wrote:
>>
>> [+cc LKML, Greg KH for driver core async shutdown question]
>> On Tue, Jun 24, 2014 at 10:48:57AM -0600, Keith Busch wrote:
>>>
>>> To provide context why I want to do this asynchronously, NVM-Express has
>>> one PCI device per controller, of which there could be dozens in a
>>> system,
>>> and each one may take many seconds (I've heard over ten in some cases)
>>> to safely shutdown.
>>
>>
>> I don't see anything in device_shutdown() that would wait for this
>> sort of asynchronous shutdown to complete.  So how do we know it's
>> finished before we turn off the power, reset, kexec, etc.?
>>
>> If we need to do asynchronous shutdown, it seems like we need some
>> sort of driver core infrastructure to manage that.
>
>
> Yes, good point! To address that, I did submit this patch:
>
> http://lists.infradead.org/pipermail/linux-nvme/2014-May/000827.html
>
> I need to fix the EXPORT_SYMBOL_GPL usage for a v2, but before that, I
> wan't to know the reason the driver can't use MSIx in an async shutdown
> shutdown, and came to the patch mentioned above.
>
> I'd originally had the async shutdown use legacy interrupts, but I
> know some NVMe devices do not support legacy, so can't use my original
> proposal. If I can't rely on MSI/MSI-x being enabled in an async shutdown,
> then I have to add polling, which I suppose we can live with.

OK, I wasn't aware of your async shutdown patch.  Given that, I would
think we'd want to somehow delay the rest of pci_device_shutdown()
until after the driver's async shutdown completes.  We don't want to
do either the MSI shutdown or the pci_clear_master() until after it
completes, do we?  And if we delay that, there would be no need to
move the MSI shutdown from the core to the driver.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jeff Moyer
Jeff Moyer  writes:

> Hi, Rob,
>
> Can you get sysrq-t output for me?  I don't know how/why we'd continue
> to get io_submits for an exiting process.

Also, do you know what sys_io_submit is returning?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jeff Moyer
"Elliott, Robert (Server Storage)"  writes:

>> -Original Message-
>> From: Christoph Hellwig [mailto:h...@infradead.org]
>> Sent: Thursday, 10 July, 2014 11:15 AM
>> To: Elliott, Robert (Server Storage)
>> Cc: Jens Axboe; dgilb...@interlog.com; James Bottomley; Bart Van Assche;
>> Benjamin LaHaise; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org
>> Subject: Re: scsi-mq V2
>> 
>> On Thu, Jul 10, 2014 at 09:04:22AM -0700, Christoph Hellwig wrote:
>> > It's starting to look weird.  I'll prepare another two bisect branches
>> > around some MM changes, which seems the only other possible candidate.
>> 
>> I've pushed out scsi-mq.3-bisect-3 
>
> Good.
>
>> and scsi-mq.3-bisect-4 for you.
>
> Bad.
>
> Note: I had to apply the vdso2c.h patch to build this -rc3 based kernel:
> diff --git a/arch/x86/vdso/vdso2c.h b/arch/x86/vdso/vdso2c.h
> index df95a2f..11b65d4 100644
> --- a/arch/x86/vdso/vdso2c.h
> +++ b/arch/x86/vdso/vdso2c.h
> @@ -93,6 +93,9 @@ static void BITSFUNC(copy_section)(struct 
> BITSFUNC(fake_sections) *out,
>   uint64_t flags = GET_LE(&in->sh_flags);
>
>   bool copy = flags & SHF_ALLOC &&
> + (GET_LE(&in->sh_size) ||
> + (GET_LE(&in->sh_type) != SHT_RELA &&
> + GET_LE(&in->sh_type) != SHT_REL)) &&
>   strcmp(name, ".altinstructions") &&
>   strcmp(name, ".altinstr_replacement");
>
> Results: fio started OK, getting 900K IOPS, but ^C led to 0 IOPS and
> an fio hang, with one CPU (CPU 0) stuck in io_submit loops.

Hi, Rob,

Can you get sysrq-t output for me?  I don't know how/why we'd continue
to get io_submits for an exiting process.

Thanks,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] Let device drivers disable msi on shutdown

2014-07-10 Thread Keith Busch

On Thu, 10 Jul 2014, Bjorn Helgaas wrote:

[+cc LKML, Greg KH for driver core async shutdown question]
On Tue, Jun 24, 2014 at 10:48:57AM -0600, Keith Busch wrote:

To provide context why I want to do this asynchronously, NVM-Express has
one PCI device per controller, of which there could be dozens in a system,
and each one may take many seconds (I've heard over ten in some cases)
to safely shutdown.


I don't see anything in device_shutdown() that would wait for this
sort of asynchronous shutdown to complete.  So how do we know it's
finished before we turn off the power, reset, kexec, etc.?

If we need to do asynchronous shutdown, it seems like we need some
sort of driver core infrastructure to manage that.


Yes, good point! To address that, I did submit this patch:

http://lists.infradead.org/pipermail/linux-nvme/2014-May/000827.html

I need to fix the EXPORT_SYMBOL_GPL usage for a v2, but before that, I
wan't to know the reason the driver can't use MSIx in an async shutdown
shutdown, and came to the patch mentioned above.

I'd originally had the async shutdown use legacy interrupts, but I
know some NVMe devices do not support legacy, so can't use my original
proposal. If I can't rely on MSI/MSI-x being enabled in an async shutdown,
then I have to add polling, which I suppose we can live with.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Thursday, 10 July, 2014 11:15 AM
> To: Elliott, Robert (Server Storage)
> Cc: Jens Axboe; dgilb...@interlog.com; James Bottomley; Bart Van Assche;
> Benjamin LaHaise; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: scsi-mq V2
> 
> On Thu, Jul 10, 2014 at 09:04:22AM -0700, Christoph Hellwig wrote:
> > It's starting to look weird.  I'll prepare another two bisect branches
> > around some MM changes, which seems the only other possible candidate.
> 
> I've pushed out scsi-mq.3-bisect-3 

Good.

> and scsi-mq.3-bisect-4 for you.

Bad.

Note: I had to apply the vdso2c.h patch to build this -rc3 based kernel:
diff --git a/arch/x86/vdso/vdso2c.h b/arch/x86/vdso/vdso2c.h
index df95a2f..11b65d4 100644
--- a/arch/x86/vdso/vdso2c.h
+++ b/arch/x86/vdso/vdso2c.h
@@ -93,6 +93,9 @@ static void BITSFUNC(copy_section)(struct 
BITSFUNC(fake_sections) *out,
uint64_t flags = GET_LE(&in->sh_flags);

bool copy = flags & SHF_ALLOC &&
+   (GET_LE(&in->sh_size) ||
+   (GET_LE(&in->sh_type) != SHT_RELA &&
+   GET_LE(&in->sh_type) != SHT_REL)) &&
strcmp(name, ".altinstructions") &&
strcmp(name, ".altinstr_replacement");

Results: fio started OK, getting 900K IOPS, but ^C led to 0 IOPS and
an fio hang, with one CPU (CPU 0) stuck in io_submit loops.

perf top shows lookup_ioctx function alongside io_submit and
do_io_submit this time:
 14.96%  [kernel] [k] lookup_ioctx
 14.71%  libaio.so.1.0.1  [.] io_submit
 13.78%  [kernel] [k] system_call
 10.79%  [kernel] [k] system_call_after_swapgs
 10.17%  [kernel] [k] do_io_submit
  8.91%  [kernel] [k] copy_user_generic_string
  4.24%  [kernel] [k] io_submit_one
  3.93%  [kernel] [k] blk_flush_plug_list
  3.32%  fio  [.] fio_libaio_commit
  2.84%  [kernel] [k] sysret_check
  2.06%  [kernel] [k] blk_finish_plug
  1.89%  [kernel] [k] SyS_io_submit
  1.48%  [kernel] [k] blk_start_plug
  1.04%  fio  [.] io_submit@plt
  0.84%  [kernel] [k] __get_user_4
  0.74%  [kernel] [k] system_call_fastpath
  0.60%  [kernel] [k] _copy_from_user
  0.51%  diff [.] 0x7abb

ftrace on CPU 0 shows similar repetition to before:
 fio-4107  [000]    389.992300: lookup_ioctx <-do_io_submit
 fio-4107  [000]    389.992300: blk_start_plug <-do_io_submit
 fio-4107  [000]    389.992300: io_submit_one <-do_io_submit
 fio-4107  [000]    389.992300: blk_finish_plug <-do_io_submit
 fio-4107  [000]    389.992300: blk_flush_plug_list 
<-blk_finish_plug
 fio-4107  [000]    389.992301: SyS_io_submit 
<-system_call_fastpath
 fio-4107  [000]    389.992301: do_io_submit <-SyS_io_submit
 fio-4107  [000]    389.992301: lookup_ioctx <-do_io_submit
 fio-4107  [000]    389.992301: blk_start_plug <-do_io_submit
 fio-4107  [000]    389.992301: io_submit_one <-do_io_submit
 fio-4107  [000]    389.992301: blk_finish_plug <-do_io_submit
 fio-4107  [000]    389.992301: blk_flush_plug_list 
<-blk_finish_plug
 fio-4107  [000]    389.992301: SyS_io_submit 
<-system_call_fastpath
 fio-4107  [000]    389.992302: do_io_submit <-SyS_io_submit
 fio-4107  [000]    389.992302: lookup_ioctx <-do_io_submit
 fio-4107  [000]    389.992302: blk_start_plug <-do_io_submit
 fio-4107  [000]    389.992302: io_submit_one <-do_io_submit
 fio-4107  [000]    389.992302: blk_finish_plug <-do_io_submit
 fio-4107  [000]    389.992302: blk_flush_plug_list 
<-blk_finish_plug



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] Let device drivers disable msi on shutdown

2014-07-10 Thread Bjorn Helgaas
[+cc LKML, Greg KH for driver core async shutdown question]

On Tue, Jun 24, 2014 at 10:48:57AM -0600, Keith Busch wrote:
> I'd like to do shutdowns asynchronously so I can shutdown multiple
> devices in parallel, but the pci-driver disables interrupts after my
> driver returns from its '.shutdown', though it needs to rely on these
> interrupts in its asynchronously scheduled shutdown.
> 
> I tracked the reason for pci disabling msi to ...
> 
> | commit d52877c7b1afb8c37ebe17e2005040b79cb618b0
> | Author: Yinghai Lu 
> | Date:   Wed Apr 23 14:58:09 2008 -0700
> |
> | pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2
> 
> ... because mptfusion doesn't disable msi in its shutdown path.
> 
> Any reason we can't let the drivers do this instead?

I guess they *could*, but my general idea is that when we can do
something in the core, we probably should, because then we're not
depending on all the drivers to do the right thing.  So I would turn
this around and say "we need a pretty good reason to move something
out of the core and into the drivers."

A lot of the issues in the shutdown path are related to getting the
device to shut up so it doesn't cause problems with kexec.  So it does
seem like a good idea to have pci_device_shutdown() disable MSI to
remove one source of bothersome interrupts.

> To provide context why I want to do this asynchronously, NVM-Express has
> one PCI device per controller, of which there could be dozens in a system,
> and each one may take many seconds (I've heard over ten in some cases)
> to safely shutdown.

I don't see anything in device_shutdown() that would wait for this
sort of asynchronous shutdown to complete.  So how do we know it's
finished before we turn off the power, reset, kexec, etc.?

If we need to do asynchronous shutdown, it seems like we need some
sort of driver core infrastructure to manage that.

> In this patch, mptfusion was compile tested only; I didn't observe any
> adverse affects from running the pci portion.
> 
> Signed-off-by: Keith Busch 
> Cc: Nagalakshmi Nandigama 
> Cc: Sreekanth Reddy 
> Cc: Bjorn Helgaas 
> ---
>  drivers/message/fusion/mptscsih.c |3 +++
>  drivers/pci/pci-driver.c  |2 --
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/message/fusion/mptscsih.c 
> b/drivers/message/fusion/mptscsih.c
> index 2a1c6f2..3186e17 100644
> --- a/drivers/message/fusion/mptscsih.c
> +++ b/drivers/message/fusion/mptscsih.c
> @@ -1215,6 +1215,9 @@ mptscsih_remove(struct pci_dev *pdev)
>  void
>  mptscsih_shutdown(struct pci_dev *pdev)
>  {
> + MPT_ADAPTER *ioc = pci_get_drvdata(pdev);
> + if (ioc->msi_enable)
> + pci_disable_msi(pdev);
>  }
>  
>  #ifdef CONFIG_PM
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3f8e3db..8079d98 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -453,8 +453,6 @@ static void pci_device_shutdown(struct device *dev)
>  
>   if (drv && drv->shutdown)
>   drv->shutdown(pci_dev);
> - pci_msi_shutdown(pci_dev);
> - pci_msix_shutdown(pci_dev);
>  
>  #ifdef CONFIG_KEXEC
>   /*
> -- 
> 1.7.10.4
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Christoph Hellwig
On Thu, Jul 10, 2014 at 09:04:22AM -0700, Christoph Hellwig wrote:
> It's starting to look weird.  I'll prepare another two bisect branches
> around some MM changes, which seems the only other possible candidate.

I've pushed out scsi-mq.3-bisect-3 and scsi-mq.3-bisect-4 for you.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Christoph Hellwig
On Thu, Jul 10, 2014 at 03:51:44PM +, Elliott, Robert (Server Storage) 
wrote:
> > scsi-mq.3-bisect-1 branch that is rebased to just before the merge of
> > the block tree
> 
> good.
> 
> > and a scsi-mq.3-bisect-2 branch that is just after the merge of the 
> > block tree to get started.
> 
> good.

It's starting to look weird.  I'll prepare another two bisect branches
around some MM changes, which seems the only other possible candidate.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Thursday, 10 July, 2014 1:21 AM
> To: Elliott, Robert (Server Storage)
> Cc: Jens Axboe; dgilb...@interlog.com; Christoph Hellwig; James Bottomley;
> Bart Van Assche; Benjamin LaHaise; linux-scsi@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: Re: scsi-mq V2
> 
> On Thu, Jul 10, 2014 at 12:53:36AM +, Elliott, Robert (Server Storage)
> wrote:
> > the problem still occurs - fio results in low or 0 IOPS, with perf top
> > reporting unusual amounts of time spent in do_io_submit and io_submit.
> 
> The diff between the two version doesn't show too much other possible
> interesting commits, the most interesting being some minor block
> updates.
> 
> I guess we'll have to a manual bisect, I've pushed out a

> scsi-mq.3-bisect-1 branch that is rebased to just before the merge of
> the block tree

good.

> and a scsi-mq.3-bisect-2 branch that is just after the merge of the 
> block tree to get started.

good.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jeff Moyer
Benjamin LaHaise  writes:

>> 
>> [  186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536
>> [  186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536
>> [  186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536
>> [  186.339068] ioctx_alloc: nr_events=-2 aio_max_nr=65536
>> [  186.339069] ioctx_alloc: nr_events=-2 aio_max_nr=65536
>
> Something is horribly wrong here.  There is no way that value for nr_events 
> should be passed in to ioctx_alloc().  This implies that userland is calling 
> io_setup() with an impossibly large value for nr_events.  Can you post the 
> actual diff for your fs/aio.c relative to linus' tree?
>

fio does exactly this!  it passes INT_MAX.

Cheers,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] arm64: Fix the APM X-Gene SoC SATA PHY clock DTS node csr-mask of the SATA Host Controller 1.

2014-07-10 Thread Tejun Heo
On Thu, Jul 10, 2014 at 07:19:16PM +0530, Suman Tripathi wrote:
> This patch fixes the SATA PHY clock DTS node csr-mask of the
> SATA Host controller 1. This patch also fixes the status of
> the PHY clock node of SATA Host Controller 1.

Ditto.  Before, XXX was wrong and as a result YYY didn't work
properly.  This patch updates ZZZ to fix the issue.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/2] ata: Fix the watermark threshold for the APM X-Gene SATA host controller driver.

2014-07-10 Thread Tejun Heo
On Thu, Jul 10, 2014 at 07:19:15PM +0530, Suman Tripathi wrote:
> This patch fixes the watermark threshold of the receive FIFO for the
> APM X-Gene SATA host controller driver.

Can you please explain what are the effects of these patches?  Patch
descriptions should include why the specific changes are being made in
addition to the description of the changes themselves.  What are the
implications of applying this patch?

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Benjamin LaHaise
On Thu, Jul 10, 2014 at 02:36:40PM +, Elliott, Robert (Server Storage) 
wrote:
> 
> 
> > -Original Message-
> > From: Jens Axboe [mailto:ax...@kernel.dk]
> > Sent: Thursday, 10 July, 2014 8:53 AM
> > To: Christoph Hellwig; Benjamin LaHaise
> > Cc: Elliott, Robert (Server Storage); dgilb...@interlog.com; James 
> > Bottomley;
> > Bart Van Assche; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org
> > Subject: Re: scsi-mq V2
> > 
> > On 2014-07-10 15:50, Christoph Hellwig wrote:
> > > On Thu, Jul 10, 2014 at 09:36:09AM -0400, Benjamin LaHaise wrote:
> > >> There is one possible concern that could be exacerbated by other changes
> > in
> > >> the system: if the application is running close to the bare minimum 
> > >> number
> > >> of requests allocated in io_setup(), the per cpu reference counters will
> > >> have a hard time batching things.  It might be worth testing with an
> > >> increased number of requests being allocated if this is the case.
> > >
> > > Well, Robert said reverting the two aio commits didn't help.  Either he
> > > didn't manage to boot into the right kernel, or we need to look
> > > elsewhere for the culprit.
> > 
> > Rob, let me know what scsi_debug setup you use, and I can try and
> > reproduce it here as well.
> > 
> > --
> > Jens Axboe
> 
> This system has 6 online CPUs and 64 possible CPUs.
> 
> Printing avail and req_batch in that loop results in many of these:
> ** 3813 printk messages dropped ** [10643.503772] ctx 88042d8d4cc0 
> avail=0 req_batch=2
> 
> Adding CFLAGS_aio.o := -DDEBUG to the Makefile to enable
> those pr_debug prints results in nothing extra printing,
> so it's not hitting an error.
> 
> Printing nr_events and aio_max_nr at the top of ioctx_alloc results in
> these as fio starts:
> 
> [  186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.339068] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.339069] ioctx_alloc: nr_events=-2 aio_max_nr=65536

Something is horribly wrong here.  There is no way that value for nr_events 
should be passed in to ioctx_alloc().  This implies that userland is calling 
io_setup() with an impossibly large value for nr_events.  Can you post the 
actual diff for your fs/aio.c relative to linus' tree?

-ben


> [  186.339070] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.339071] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.339071] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.339074] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.339076] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.339076] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.359772] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.359971] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.359972] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.359985] sd 5:0:3:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.359986] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.359987] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.359995] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.359995] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.359998] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.359998] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.362529] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.362529] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.363510] sd 5:0:1:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.363513] sd 5:0:4:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.363520] sd 5:0:2:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.363521] sd 5:0:3:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.398113] sd 5:0:5:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.398115] sd 5:0:1:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.398121] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.398122] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.398124] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.398124] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.398130] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.398131] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.398164] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.398165] ioctx_alloc: nr_events=512 aio_max_nr=65536
> [  186.398499] sd 5:0:4:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.400489] sd 5:0:1:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.401478] sd 5:0:1:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.401491] sd 5:0:3:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.434522] ioctx_alloc: nr_events=-2 aio_max_nr=65536
> [  186.434523] sd 5:0:3:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.434526] sd 5:0:5:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.434533] sd 5:0:0:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
> [  186.435370] hrtimer: interrupt took 6868 ns
> [  186.435491] ioctx_alloc: nr_events=-2 aio_max_nr=65536
>

RE: scsi-mq V2

2014-07-10 Thread Elliott, Robert (Server Storage)


> -Original Message-
> From: Jens Axboe [mailto:ax...@kernel.dk]
> Sent: Thursday, 10 July, 2014 8:53 AM
> To: Christoph Hellwig; Benjamin LaHaise
> Cc: Elliott, Robert (Server Storage); dgilb...@interlog.com; James Bottomley;
> Bart Van Assche; linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: scsi-mq V2
> 
> On 2014-07-10 15:50, Christoph Hellwig wrote:
> > On Thu, Jul 10, 2014 at 09:36:09AM -0400, Benjamin LaHaise wrote:
> >> There is one possible concern that could be exacerbated by other changes
> in
> >> the system: if the application is running close to the bare minimum number
> >> of requests allocated in io_setup(), the per cpu reference counters will
> >> have a hard time batching things.  It might be worth testing with an
> >> increased number of requests being allocated if this is the case.
> >
> > Well, Robert said reverting the two aio commits didn't help.  Either he
> > didn't manage to boot into the right kernel, or we need to look
> > elsewhere for the culprit.
> 
> Rob, let me know what scsi_debug setup you use, and I can try and
> reproduce it here as well.
> 
> --
> Jens Axboe

This system has 6 online CPUs and 64 possible CPUs.

Printing avail and req_batch in that loop results in many of these:
** 3813 printk messages dropped ** [10643.503772] ctx 88042d8d4cc0 avail=0 
req_batch=2

Adding CFLAGS_aio.o := -DDEBUG to the Makefile to enable
those pr_debug prints results in nothing extra printing,
so it's not hitting an error.

Printing nr_events and aio_max_nr at the top of ioctx_alloc results in
these as fio starts:

[  186.339064] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339065] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339067] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339068] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339069] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339070] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.339071] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.339071] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.339074] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.339076] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.339076] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.359772] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.359971] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.359972] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.359985] sd 5:0:3:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.359986] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.359987] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.359995] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.359995] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.359998] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.359998] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.362529] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.362529] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.363510] sd 5:0:1:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.363513] sd 5:0:4:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.363520] sd 5:0:2:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.363521] sd 5:0:3:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.398113] sd 5:0:5:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.398115] sd 5:0:1:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.398121] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.398122] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.398124] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.398124] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.398130] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.398131] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.398164] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.398165] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.398499] sd 5:0:4:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.400489] sd 5:0:1:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.401478] sd 5:0:1:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.401491] sd 5:0:3:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.434522] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.434523] sd 5:0:3:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.434526] sd 5:0:5:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.434533] sd 5:0:0:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.435370] hrtimer: interrupt took 6868 ns
[  186.435491] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.435492] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.447864] sd 5:0:0:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.449896] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.449900] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.449901] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.449909] sd 5:0:0:0: scsi_debug_ioctl: BLKFLSBUF [0x1261]
[  186.449932] ioctx_alloc: nr_events=-2 aio_max_nr=65536
[  186.449933] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.461147] ioctx_alloc: nr_events=512 aio_max_nr=65536
[  186.461176] ioctx_alloc: nr_events=-2 aio_max

Re: [Bug 79901] New: Extremely slow boot on Promise VTrak E610f due to sd_mod RSOC usage

2014-07-10 Thread Janusz Dziemidowicz
2014-07-10 15:54 GMT+02:00 Christoph Hellwig :
> Sounds like we need to skip REPORT SUPPORTED OPERATION CODES on this
> array.
>
> For this it needs a new BLIST_NORSOC flag added to
> include/scsi/scsi_devinfo.h which sets sdev->no_report_opcodes in
> scsi_add_lun, and an entry for your array in the big black list array in
> drivers/scsi/scsi_devinfo.c.  Does this sound easy enough that you could
> prepare a patch for it?

Sure, I'll give it a try. It will probably take a couple of days as
currently I have only one server that is available for experiments for
a few hours a day.

-- 
Janusz Dziemidowicz
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bug 79901] New: Extremely slow boot on Promise VTrak E610f due to sd_mod RSOC usage

2014-07-10 Thread Christoph Hellwig
Sounds like we need to skip REPORT SUPPORTED OPERATION CODES on this
array.

For this it needs a new BLIST_NORSOC flag added to
include/scsi/scsi_devinfo.h which sets sdev->no_report_opcodes in
scsi_add_lun, and an entry for your array in the big black list array in
drivers/scsi/scsi_devinfo.c.  Does this sound easy enough that you could
prepare a patch for it?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jens Axboe

On 2014-07-10 15:50, Benjamin LaHaise wrote:

On Thu, Jul 10, 2014 at 03:48:10PM +0200, Jens Axboe wrote:

On 2014-07-10 15:44, Benjamin LaHaise wrote:

On Thu, Jul 10, 2014 at 03:39:57PM +0200, Jens Axboe wrote:

That's how fio always runs, it sets up the context with the exact queue
depth that it needs. Do we have a good enough understanding of other aio
use cases to say that this isn't the norm? I would expect it to be, it's
the way that the API would most obviously be used.


The problem with this approach is that it works very poorly with per cpu
reference counting's batching of references, which is pretty much a
requirement now that many core systems are the norm.  Allocating the bare
minimum is not the right thing to do today.  That said, the default limits
on the number of requests probably needs to be raised.


Sorry, that's a complete cop-out. Then you handle this internally,
allocate a bigger pool and cap the limit if you need to. Look at the
API. You pass in the number of requests you will use. Do you expect
anyone to double up, just in case? Will never happen.

But all of this is side stepping the point that there's a real bug
reported here. The above could potentially explain the "it's using X
more CPU, or it's Y slower". The above is a softlock, it never completes.


I'm not trying to cop out on this -- I'm asking for a data point to see
if changing the request limits has any effect.


Fair enough, if the question is "does it solve the regression", then 
it's a valid data point. Rob/Doug, for fio, you can just double the 
iodepth passed in in engines/libaio:fio_libaio_init() and test with that 
and see if it makes a difference.


--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jens Axboe

On 2014-07-10 15:50, Christoph Hellwig wrote:

On Thu, Jul 10, 2014 at 09:36:09AM -0400, Benjamin LaHaise wrote:

There is one possible concern that could be exacerbated by other changes in
the system: if the application is running close to the bare minimum number
of requests allocated in io_setup(), the per cpu reference counters will
have a hard time batching things.  It might be worth testing with an
increased number of requests being allocated if this is the case.


Well, Robert said reverting the two aio commits didn't help.  Either he
didn't manage to boot into the right kernel, or we need to look
elsewhere for the culprit.


Rob, let me know what scsi_debug setup you use, and I can try and 
reproduce it here as well.


--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Christoph Hellwig
On Thu, Jul 10, 2014 at 09:36:09AM -0400, Benjamin LaHaise wrote:
> There is one possible concern that could be exacerbated by other changes in 
> the system: if the application is running close to the bare minimum number 
> of requests allocated in io_setup(), the per cpu reference counters will 
> have a hard time batching things.  It might be worth testing with an 
> increased number of requests being allocated if this is the case.

Well, Robert said reverting the two aio commits didn't help.  Either he
didn't manage to boot into the right kernel, or we need to look
elsewhere for the culprit.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Benjamin LaHaise
On Thu, Jul 10, 2014 at 03:48:10PM +0200, Jens Axboe wrote:
> On 2014-07-10 15:44, Benjamin LaHaise wrote:
> >On Thu, Jul 10, 2014 at 03:39:57PM +0200, Jens Axboe wrote:
> >>That's how fio always runs, it sets up the context with the exact queue
> >>depth that it needs. Do we have a good enough understanding of other aio
> >>use cases to say that this isn't the norm? I would expect it to be, it's
> >>the way that the API would most obviously be used.
> >
> >The problem with this approach is that it works very poorly with per cpu
> >reference counting's batching of references, which is pretty much a
> >requirement now that many core systems are the norm.  Allocating the bare
> >minimum is not the right thing to do today.  That said, the default limits
> >on the number of requests probably needs to be raised.
> 
> Sorry, that's a complete cop-out. Then you handle this internally, 
> allocate a bigger pool and cap the limit if you need to. Look at the 
> API. You pass in the number of requests you will use. Do you expect 
> anyone to double up, just in case? Will never happen.
> 
> But all of this is side stepping the point that there's a real bug 
> reported here. The above could potentially explain the "it's using X 
> more CPU, or it's Y slower". The above is a softlock, it never completes.

I'm not trying to cop out on this -- I'm asking for a data point to see 
if changing the request limits has any effect.

-ben

> -- 
> Jens Axboe

-- 
"Thought is the essence of where you are now."
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/3] ata: Fixes related to APM X-Gene SATA host controller driver.

2014-07-10 Thread Suman Tripathi
This patch set contains a couple of fixes related to APM X-Gene SATA
controller driver.

v2 Change:
  - Drop the Link down retry patch from this patch set.

Signed-off-by: Loc Ho 
Signed-off-by: Suman Tripathi 
---

Suman Tripathi (2):
  ata: Fix the watermark threshold for the APM X-Gene SATA host
controller driver.
  arm64: Fix the APM X-Gene SoC SATA PHY clock DTS node csr-mask of the
SATA Host Controller 1.

 arch/arm64/boot/dts/apm-storm.dtsi | 5 +
 drivers/ata/ahci_xgene.c   | 7 +++
 2 files changed, 8 insertions(+), 4 deletions(-)

--
1.8.2.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/2] arm64: Fix the APM X-Gene SoC SATA PHY clock DTS node csr-mask of the SATA Host Controller 1.

2014-07-10 Thread Suman Tripathi
This patch fixes the SATA PHY clock DTS node csr-mask of the
SATA Host controller 1. This patch also fixes the status of
the PHY clock node of SATA Host Controller 1.

Signed-off-by: Loc Ho 
Signed-off-by: Suman Tripathi 
---
 arch/arm64/boot/dts/apm-storm.dtsi | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/arm64/boot/dts/apm-storm.dtsi 
b/arch/arm64/boot/dts/apm-storm.dtsi
index f8c40a6..ce6967c 100644
--- a/arch/arm64/boot/dts/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm-storm.dtsi
@@ -184,9 +184,8 @@
reg = <0x0 0x1f21c000 0x0 0x1000>;
reg-names = "csr-reg";
clock-output-names = "sataphy1clk";
-   status = "disabled";
csr-offset = <0x4>;
-   csr-mask = <0x00>;
+   csr-mask = <0x3a>;
enable-offset = <0x0>;
enable-mask = <0x06>;
};
@@ -198,7 +197,6 @@
reg = <0x0 0x1f22c000 0x0 0x1000>;
reg-names = "csr-reg";
clock-output-names = "sataphy2clk";
-   status = "ok";
csr-offset = <0x4>;
csr-mask = <0x3a>;
enable-offset = <0x0>;
@@ -212,7 +210,6 @@
reg = <0x0 0x1f23c000 0x0 0x1000>;
reg-names = "csr-reg";
clock-output-names = "sataphy3clk";
-   status = "ok";
csr-offset = <0x4>;
csr-mask = <0x3a>;
enable-offset = <0x0>;
--
1.8.2.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/2] ata: Fix the watermark threshold for the APM X-Gene SATA host controller driver.

2014-07-10 Thread Suman Tripathi
This patch fixes the watermark threshold of the receive FIFO for the
APM X-Gene SATA host controller driver.

Signed-off-by: Loc Ho 
Signed-off-by: Suman Tripathi 
---
 drivers/ata/ahci_xgene.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/ata/ahci_xgene.c b/drivers/ata/ahci_xgene.c
index 77c89bf..03b6b0f 100644
--- a/drivers/ata/ahci_xgene.c
+++ b/drivers/ata/ahci_xgene.c
@@ -67,6 +67,9 @@
 #define PORTAXICFG 0x00bc
 #define PORTAXICFG_OUTTRANS_SET(dst, src) \
(((dst) & ~0x00f0) | (((u32)(src) << 0x14) & 0x00f0))
+#define PORTRANSCFG0x00c8
+#define PORTRANSCFG_RXWM_SET(dst, src) \
+   (((dst) & ~0x007f) | (((u32)(src)) & 0x007f))

 /* SATA host controller AXI CSR */
 #define INT_SLV_TMOMASK0x0010
@@ -176,6 +179,10 @@ static void xgene_ahci_set_phy_cfg(struct 
xgene_ahci_context *ctx, int channel)
val = PORTAXICFG_OUTTRANS_SET(val, 0xe); /* Set outstanding */
writel(val, mmio + PORTAXICFG);
readl(mmio + PORTAXICFG); /* Force a barrier */
+   /* Set the watermark threshold of the receive FIFO */
+   val = readl(mmio + PORTRANSCFG);
+   val = PORTRANSCFG_RXWM_SET(val, 0x30);
+   writel(val, mmio + PORTRANSCFG);
 }

 /**
--
1.8.2.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jens Axboe

On 2014-07-10 15:44, Benjamin LaHaise wrote:

On Thu, Jul 10, 2014 at 03:39:57PM +0200, Jens Axboe wrote:

That's how fio always runs, it sets up the context with the exact queue
depth that it needs. Do we have a good enough understanding of other aio
use cases to say that this isn't the norm? I would expect it to be, it's
the way that the API would most obviously be used.


The problem with this approach is that it works very poorly with per cpu
reference counting's batching of references, which is pretty much a
requirement now that many core systems are the norm.  Allocating the bare
minimum is not the right thing to do today.  That said, the default limits
on the number of requests probably needs to be raised.


Sorry, that's a complete cop-out. Then you handle this internally, 
allocate a bigger pool and cap the limit if you need to. Look at the 
API. You pass in the number of requests you will use. Do you expect 
anyone to double up, just in case? Will never happen.


But all of this is side stepping the point that there's a real bug 
reported here. The above could potentially explain the "it's using X 
more CPU, or it's Y slower". The above is a softlock, it never completes.


--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Benjamin LaHaise
On Thu, Jul 10, 2014 at 03:39:57PM +0200, Jens Axboe wrote:
> That's how fio always runs, it sets up the context with the exact queue 
> depth that it needs. Do we have a good enough understanding of other aio 
> use cases to say that this isn't the norm? I would expect it to be, it's 
> the way that the API would most obviously be used.

The problem with this approach is that it works very poorly with per cpu 
reference counting's batching of references, which is pretty much a 
requirement now that many core systems are the norm.  Allocating the bare 
minimum is not the right thing to do today.  That said, the default limits 
on the number of requests probably needs to be raised.

-ben
-- 
"Thought is the essence of where you are now."
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Jens Axboe

On 2014-07-10 15:36, Benjamin LaHaise wrote:

On Wed, Jul 09, 2014 at 11:20:40PM -0700, Christoph Hellwig wrote:

On Thu, Jul 10, 2014 at 12:53:36AM +, Elliott, Robert (Server Storage) 
wrote:

the problem still occurs - fio results in low or 0 IOPS, with perf top
reporting unusual amounts of time spent in do_io_submit and io_submit.


The diff between the two version doesn't show too much other possible
interesting commits, the most interesting being some minor block
updates.

I guess we'll have to a manual bisect, I've pushed out a
scsi-mq.3-bisect-1 branch that is rebased to just before the merge of
the block tree, and a scsi-mq.3-bisect-2 branch that is just after
the merge of the block tree to get started.


There is one possible concern that could be exacerbated by other changes in
the system: if the application is running close to the bare minimum number
of requests allocated in io_setup(), the per cpu reference counters will
have a hard time batching things.  It might be worth testing with an
increased number of requests being allocated if this is the case.


That's how fio always runs, it sets up the context with the exact queue 
depth that it needs. Do we have a good enough understanding of other aio 
use cases to say that this isn't the norm? I would expect it to be, it's 
the way that the API would most obviously be used.


--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scsi-mq V2

2014-07-10 Thread Benjamin LaHaise
On Wed, Jul 09, 2014 at 11:20:40PM -0700, Christoph Hellwig wrote:
> On Thu, Jul 10, 2014 at 12:53:36AM +, Elliott, Robert (Server Storage) 
> wrote:
> > the problem still occurs - fio results in low or 0 IOPS, with perf top 
> > reporting unusual amounts of time spent in do_io_submit and io_submit.
> 
> The diff between the two version doesn't show too much other possible
> interesting commits, the most interesting being some minor block
> updates.
> 
> I guess we'll have to a manual bisect, I've pushed out a
> scsi-mq.3-bisect-1 branch that is rebased to just before the merge of
> the block tree, and a scsi-mq.3-bisect-2 branch that is just after
> the merge of the block tree to get started.

There is one possible concern that could be exacerbated by other changes in 
the system: if the application is running close to the bare minimum number 
of requests allocated in io_setup(), the per cpu reference counters will 
have a hard time batching things.  It might be worth testing with an 
increased number of requests being allocated if this is the case.

-ben
-- 
"Thought is the essence of where you are now."
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] scsi: pm8001: fix pm8001_store_update_fw

2014-07-10 Thread Tomas Henzl
On 07/10/2014 08:43 AM, Christoph Hellwig wrote:
> On Mon, Jul 07, 2014 at 05:20:01PM +0200, Tomas Henzl wrote:
>> The current implementation may mix the negative value returned
>> from pm8001_set_nvmd with with count. -(-ENOMEM) could be interpreted
>> as bytes programmed, this patch fixes it.
> This still doesn;t look correct to me as err mixes up the driver
> internal FAIL_* codes with Linux error codes.  It seems like for the
> FAIL_* codes should only go into ->fw_status and the return value
> should be a proper Linux error code.

And the fw_status might be later used to show error strings in 
pm8001_show_update_fw,
if it is so it depends on the flash utility but it seems likely. 

>
> Funny fact: the FAIL_* / FLASH_IN_PROGRESS codes seems to be the same
> between aic94xx and pm8001.

And similar story there too - asd_store_update_bios -...- asd_poll_flash(might 
return -ENOENT)

Maybe the flash utility ignores the return value or it has never happened.
-

I'll try to find what seems to be the most probable way and post it in few days.
This patch is not related to the patches 1-3/4, so just wait with this one.

Thanks, Tomas

>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 79901] New: Extremely slow boot on Promise VTrak E610f due to sd_mod RSOC usage

2014-07-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=79901

Bug ID: 79901
   Summary: Extremely slow boot on Promise VTrak E610f due to
sd_mod RSOC usage
   Product: IO/Storage
   Version: 2.5
Kernel Version: 3.14.7
  Hardware: x86-64
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: SCSI
  Assignee: linux-scsi@vger.kernel.org
  Reporter: rrapt...@nails.eu.org
Regression: No

Recently I've started upgrading all of my machines to kernel 3.14 (from Debian
wheezy backports to be precise). Mostly there were not problems, but I've
stumbled upon weird behavior on Fibre Channel servers (QLogic cards inside HP
blades) using Promise VTrak E610f arrays.

As soon as SCSI subsystem tries to detect partitions a lot of SCSI errors are
reported. The system stalls (but initramfs is responsive) for about 20-30
minutes (depending on number of arrays and LUNs). After that time, disk
detection finishes and system continues booting as usual. Everything works
perfectly afterwards.

I've spent some time fiddling with qla2xxx driver versions, SCSI scanning
options and anything else I could think of. Finally, I was able to find
culprit. The problem lies in sd_mod usage of scsi_report_opcode(). This
function is used to determine if the disk supports WRITE SAME command. It does
so by issuing REPORT SUPPORTED OPERATION CODES command. Unfortunately, it seems
Promise VTrak E610f really, really does not like RSOC. As soon as RSOC is
issued the array stalls for a while, then kernel tries to abort the command and
finally it must reset the port. Fortunately the array starts working again
after the reset. I've also verified this behavior with sg_opcodes utility.

Commit that introduced RSOC usage in sd_mod:
http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=98dcc2946adbe4349ef1ef9b99873b912831edd4
Removing it fixes the issue.

I'm not sure what is the correct way to fix this as I'm not very familiar with
SCSI spec. If RSOC support cannot be reliably determined then probably some
kind of blacklist should be introduced.

As a workaround, I've modified qla2xxx driver to set 'no_write_same' flag.
While not directly related it forces sd_mod not to issue RSOC and it is easier
for me to ship Debian package with modified single driver (I'd prefer not to
manage my own kernel packages).

-- 
You are receiving this mail because:
You are the assignee for the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] scsi_debug: support scsi-mq, queues and locks

2014-07-10 Thread Christoph Hellwig
This patch looks reasonable to me.  I'd prefer if we didn't have to keep
the optional host_lock mode for the long run, but if it's important to
you for now we can keep it for a while.

And chance you could split updates into a series of updates for future
changes?

Reviewed-by: Christoph Hellwig 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] Drivers: scsi: storvsc: Implement an abort handler

2014-07-10 Thread Richard Weinberger
On Wed, Jul 9, 2014 at 8:51 PM, KY Srinivasan  wrote:
>
>
>> -Original Message-
>> From: Christoph Hellwig [mailto:h...@infradead.org]
>> Sent: Wednesday, July 9, 2014 1:44 AM
>> To: KY Srinivasan
>> Cc: linux-ker...@vger.kernel.org; de...@linuxdriverproject.org;
>> oher...@suse.com; jbottom...@parallels.com; jasow...@redhat.com;
>> a...@canonical.com; linux-scsi@vger.kernel.org
>> Subject: Re: [PATCH 6/8] Drivers: scsi: storvsc: Implement an abort handler
>>
>> On Tue, Jul 08, 2014 at 05:46:50PM -0700, K. Y. Srinivasan wrote:
>> > Implement a simple abort handler. The host does not support "Abort";
>> > just ensure that all inflight I/Os have been accounted for.
>>
>> The abort handler should abort a single command, not wait for all of them.
>> What issue do you see that this tries to address?
>
> On Azure, we sometimes have unbounded I/O latencies and some distributions 
> (such as SLES12) based on recent kernels are invoking
> the "Abort Handler". Unfortunately, our scsi emulation on the host does not 
> support aborting a command.
> The issue I have seen is that the upper level scsi code attempts error 
> recovery when the command times out and finally frees up the command.
> The host subsequently responds to the command that has timed out and since 
> the memory has been freed up, we end up touching freed memory
> in this driver. Since the host is also doing error recovery, by just delaying 
> the error handler in the guest until we can account for all the in-flight 
> commands,
> we can get around the problem.

I see strange issues in Azure and maybe they are related to this.
Some Linux machines crash in a way that no disk IO is possible (thus,
no SSH for me) but they still respond to
ping. It happens rather seldom (every few weeks).

Do you see similar symptoms?

-- 
Thanks,
//richard
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/4] scsi: pm8001: fix update_flash

2014-07-10 Thread Suresh Thiagarajan


On Mon, Jul 7, 2014 at 8:49 PM, Tomas Henzl  wrote:
> The driver checks the return valu, but after he tries to wait_for_completion
> which might never happen.
> Also the ioctlbuffer is freed at the end of the function,
> so the first removal is not needed.

Looks good. Thanks Tomas.
Acked-by: Suresh Thiagarajan

>
> Signed-off-by: Tomas Henzl 
> ---
>  drivers/scsi/pm8001/pm8001_ctl.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm8001_ctl.c 
> b/drivers/scsi/pm8001/pm8001_ctl.c
> index 211ffd6..e9753bd 100644
> --- a/drivers/scsi/pm8001/pm8001_ctl.c
> +++ b/drivers/scsi/pm8001/pm8001_ctl.c
> @@ -617,11 +617,11 @@ static int pm8001_update_flash(struct pm8001_hba_info 
> *pm8001_ha)
>
> pm8001_ha->nvmd_completion = &completion;
> ret = PM8001_CHIP_DISP->fw_flash_update_req(pm8001_ha, 
> payload);
> +   if (ret)
> +   break;
> wait_for_completion(&completion);
> -   if (ret || (fwControl->retcode > FLASH_UPDATE_IN_PROGRESS)) {
> +   if (fwControl->retcode > FLASH_UPDATE_IN_PROGRESS) {
> ret = fwControl->retcode;
> -   kfree(ioctlbuffer);
> -   ioctlbuffer = NULL;
> break;
> }
> }
> --
> 1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 3/4] scsi: pm8001: fix a memory leak in flash_update

2014-07-10 Thread Suresh Thiagarajan


On Mon, Jul 7, 2014 at 8:50 PM, Tomas Henzl  wrote:
> ccb->fw_control_context is copied to local fw_control_context and
> the local variable is never used later
>
> Free ccb->fw_control_context.
> The task is forgotten thus also the reference to fw_control_context
> and the completion thread takes the info from virt_ptr again.

Looks good. Thanks Tomas.
Acked-by: Suresh Thiagarajan

>
> Signed-off-by: Tomas Henzl 
> ---
>  drivers/scsi/pm8001/pm8001_hwi.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm8001_hwi.c 
> b/drivers/scsi/pm8001/pm8001_hwi.c
> index dc70791..05beb69 100644
> --- a/drivers/scsi/pm8001/pm8001_hwi.c
> +++ b/drivers/scsi/pm8001/pm8001_hwi.c
> @@ -3617,15 +3617,11 @@ int pm8001_mpi_fw_flash_update_resp(struct 
> pm8001_hba_info *pm8001_ha,
> void *piomb)
>  {
> u32 status;
> -   struct fw_control_exfw_control_context;
> struct fw_flash_Update_resp *ppayload =
> (struct fw_flash_Update_resp *)(piomb + 4);
> u32 tag = le32_to_cpu(ppayload->tag);
> struct pm8001_ccb_info *ccb = &pm8001_ha->ccb_info[tag];
> status = le32_to_cpu(ppayload->status);
> -   memcpy(&fw_control_context,
> -   ccb->fw_control_context,
> -   sizeof(fw_control_context));
> switch (status) {
> case FLASH_UPDATE_COMPLETE_PENDING_REBOOT:
> PM8001_MSG_DBG(pm8001_ha,
> @@ -3668,11 +3664,11 @@ int pm8001_mpi_fw_flash_update_resp(struct 
> pm8001_hba_info *pm8001_ha,
> pm8001_printk("No matched status = %d\n", status));
> break;
> }
> -   ccb->fw_control_context->fw_control->retcode = status;
> -   complete(pm8001_ha->nvmd_completion);
> +   kfree(ccb->fw_control_context);
> ccb->task = NULL;
> ccb->ccb_tag = 0x;
> pm8001_tag_free(pm8001_ha, tag);
> +   complete(pm8001_ha->nvmd_completion);
> return 0;
>  }
>
> @@ -4876,6 +4872,10 @@ int pm8001_chip_set_nvmd_req(struct pm8001_hba_info 
> *pm8001_ha,
> break;
> }
> rc = pm8001_mpi_build_cmd(pm8001_ha, circularQ, opc, &nvmd_req, 0);
> +   if (rc) {
> +   kfree(fw_control_context);
> +   pm8001_tag_free(pm8001_ha, tag);
> +   }
> return rc;
>  }
>
> --
> 1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/4] scsi: pm8001: fix a memory leak in nvmd_resp

2014-07-10 Thread Suresh Thiagarajan


On Mon, Jul 7, 2014 at 8:49 PM, Tomas Henzl  wrote:
> Instead of copying information to fw_control_context free it.
>
> The task is forgotten thus also the reference to fw_control_context
> and the completion thread takes the info from virt_ptr again.

Looks good. Thanks Tomas.
Acked-by: Suresh Thiagarajan

>
> Signed-off-by: Tomas Henzl 
> ---
>  drivers/scsi/pm8001/pm8001_hwi.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm8001_hwi.c 
> b/drivers/scsi/pm8001/pm8001_hwi.c
> index cacb429..dc70791 100644
> --- a/drivers/scsi/pm8001/pm8001_hwi.c
> +++ b/drivers/scsi/pm8001/pm8001_hwi.c
> @@ -3129,7 +3129,6 @@ void pm8001_mpi_set_nvmd_resp(struct pm8001_hba_info 
> *pm8001_ha, void *piomb)
>  void
>  pm8001_mpi_get_nvmd_resp(struct pm8001_hba_info *pm8001_ha, void *piomb)
>  {
> -   struct fw_control_ex*fw_control_context;
> struct get_nvm_data_resp *pPayload =
> (struct get_nvm_data_resp *)(piomb + 4);
> u32 tag = le32_to_cpu(pPayload->tag);
> @@ -3138,7 +3137,6 @@ pm8001_mpi_get_nvmd_resp(struct pm8001_hba_info 
> *pm8001_ha, void *piomb)
> u32 ir_tds_bn_dps_das_nvm =
> le32_to_cpu(pPayload->ir_tda_bn_dps_das_nvm);
> void *virt_addr = pm8001_ha->memoryMap.region[NVMD].virt_ptr;
> -   fw_control_context = ccb->fw_control_context;
>
> PM8001_MSG_DBG(pm8001_ha, pm8001_printk("Get nvm data complete!\n"));
> if ((dlen_status & NVMD_STAT) != 0) {
> @@ -3179,13 +3177,11 @@ pm8001_mpi_get_nvmd_resp(struct pm8001_hba_info 
> *pm8001_ha, void *piomb)
> pm8001_printk("Get NVMD success, IR=0, dataLen=%d\n",
> (dlen_status & NVMD_LEN) >> 24));
> }
> -   memcpy(fw_control_context->usrAddr,
> -   pm8001_ha->memoryMap.region[NVMD].virt_ptr,
> -   fw_control_context->len);
> -   complete(pm8001_ha->nvmd_completion);
> +   kfree(ccb->fw_control_context);
> ccb->task = NULL;
> ccb->ccb_tag = 0x;
> pm8001_tag_free(pm8001_ha, tag);
> +   complete(pm8001_ha->nvmd_completion);
>  }
>
>  int pm8001_mpi_local_phy_ctl(struct pm8001_hba_info *pm8001_ha, void *piomb)
> --
> 1.8.3.1
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 2/8] Drivers: scsi: storvsc: Set cmd_per_lun to reflect value supported by the Host

2014-07-10 Thread Christoph Hellwig
> - .cmd_per_lun =  1,
> + .cmd_per_lun =  255,
>   .can_queue =STORVSC_MAX_IO_REQUESTS*STORVSC_MAX_TARGETS,

slave_configure immediately adjusts this down to STORVSC_MAX_IO_REQUESTS
(250), any reson to start out with the magic 255 here?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] drivers: scsi: storvsc: Set srb_flags in all cases

2014-07-10 Thread Christoph Hellwig
>   default:
>   vm_srb->data_in = UNKNOWN_TYPE;
> - vm_srb->win8_extension.srb_flags = 0;
> + vm_srb->win8_extension.srb_flags |= (SRB_FLAGS_DATA_IN |
> +  SRB_FLAGS_DATA_OUT);

This would usually be a command that doesn't transfer data (e.g.
TEST_UNIT_READY or SYNCHRONIZE_CACHE), do you really want to set the in
and out flags here?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] drivers: scsi: storvsc: Set srb_flags in all cases

2014-07-10 Thread Christoph Hellwig
On Wed, Jul 09, 2014 at 06:40:09PM +, KY Srinivasan wrote:
> > On Tue, Jul 08, 2014 at 05:46:51PM -0700, K. Y. Srinivasan wrote:
> > > Correctly set SRB flags for all valid I/O directions. Some IHV drivers
> > > on the Windows host require this.
> > 
> > What are IHV drivers?
> 
> If the target is a SAN device, the host simply passes the request over to the 
> native driver stack on the host. Some specific hardware (IHV - independent 
> hadware vendor) drivers on Windows require that SRB flags be correctly set in 
> all cases.
> 

I'm fine with putting this in, but treating I/O request from guests as
trusted in a hypervisor doesn't seem like a good idea in general..

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] Drivers: scsi: storvsc: Implement an abort handler

2014-07-10 Thread Christoph Hellwig
On Wed, Jul 09, 2014 at 06:51:38PM +, KY Srinivasan wrote:
> On Azure, we sometimes have unbounded I/O latencies and some distributions
> (such as SLES12) based on recent kernels are invoking the "Abort Handler".

Any kernel will invoke the abort handler if present, and then escalate
to the various resets.

> Unfortunately, our scsi emulation on the host does not support aborting 
> a command. The issue I have seen is that the upper level scsi code attempts
> error recovery when the command times out and finally frees up the command.
> The host subsequently responds to the command that has timed out and since
> the memory has been freed up, we end up touching freed memory in this
> driver. Since the host is also doing error recovery, by just delaying the
> error handler in the guest until we can account for all the in-flight 
> commands, we can get around the problem.

The storvsc driver does implement an bus reset error handler, and
after that completes successfully the midlayer frees the commands,
and the driver has to deal with this and not call scsi_done after
the reset finished (normally you'd expect the hardware to not complete
requests after an reset).

Note that you could increase the timeout and/or implement an
eh_timed_out handler that just returns BLK_EH_RESET_TIMER, but if the
completion takes too long the expectation is that a command will
eventually finish instead of beeing delayed by an unmound amount.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter WRITE_SAME_16

2014-07-10 Thread h...@infradead.org
On Wed, Jul 09, 2014 at 10:36:26PM +, KY Srinivasan wrote:
> Ok; I am concerned about older kernels that do not have no_write_same flag.
> I suppose I can work directly with these Distros and give them a choice: 
> either implement
> the no_write_same flag or filter the command in our driver. I will not 
> include this patch when

Exactly.  In general they should be interested in the flag as various
raid controllers react badly to it as well.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V3] scsi_lib: remove the description string in scsi_io_completion()

2014-07-10 Thread Hannes Reinecke

On 07/10/2014 09:41 AM, Maurizio Lombardi wrote:

During IO with fabric faults, one generally sees several "Unhandled error
code" messages in the syslog as shown below:

sd 4:0:6:2: [sdbw] Unhandled error code
sd 4:0:6:2: [sdbw] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
sd 4:0:6:2: [sdbw] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
end_request: I/O error, dev sdbw, sector 0

This comes from scsi_io_completion (in scsi_lib.c) while handling error
codes other than DID_RESET or not deferred sense keys i.e. this is
actually handled by the SCSI mid layer. But what gets displayed here is
"Unhandled error code" which is quite misleading as it indicates
something that is not addressed by the mid layer.

The description string is based on the sense key and sometimes on the
additional sense code;
since the ACTION_FAIL case always prints the sense key and the
additional sense code, this patch removes the description string
completely because it does not add useful information.

Signed-off-by: Maurizio Lombardi 

Very good. I've been wanting to do this for a long time, too.

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V3] scsi_lib: remove the description string in scsi_io_completion()

2014-07-10 Thread Christoph Hellwig
Thanks,

this looks very nice to me.  It should also allow for later splitting
the error handling from scsi_io_completion into a separate function away
from the fast path.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SCSI eats error from flush failure during hot plug

2014-07-10 Thread Christoph Hellwig
On Wed, Jul 09, 2014 at 03:38:18PM -0700, James Bottomley wrote:
> On Thu, 2014-07-03 at 10:18 -0700, Christoph Hellwig wrote:
> > On Thu, Jul 03, 2014 at 09:57:32AM -0700, Steven Haber wrote:
> > > Both patches work fine -- zero uncommitted writes over several hundred
> > > power cycles.
> > 
> > Thanks for the testing.
> > 
> > Below is James' patch with a trivial comment fix and a proper changelog.
> > James, can I get your signoff for this one?  I'll throw it into the
> > core-for-3.16 tree then.
> 
> OK, here it is.  I tweaked the comment to be accurate.

Thanks.  Can someone give me a second review for it as well?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3] scsi_lib: remove the description string in scsi_io_completion()

2014-07-10 Thread Maurizio Lombardi
During IO with fabric faults, one generally sees several "Unhandled error
code" messages in the syslog as shown below:

sd 4:0:6:2: [sdbw] Unhandled error code
sd 4:0:6:2: [sdbw] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
sd 4:0:6:2: [sdbw] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
end_request: I/O error, dev sdbw, sector 0

This comes from scsi_io_completion (in scsi_lib.c) while handling error
codes other than DID_RESET or not deferred sense keys i.e. this is
actually handled by the SCSI mid layer. But what gets displayed here is
"Unhandled error code" which is quite misleading as it indicates
something that is not addressed by the mid layer.

The description string is based on the sense key and sometimes on the
additional sense code;
since the ACTION_FAIL case always prints the sense key and the
additional sense code, this patch removes the description string
completely because it does not add useful information.

Signed-off-by: Maurizio Lombardi 
---
 drivers/scsi/scsi_lib.c | 40 
 1 file changed, 4 insertions(+), 36 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f7e3163..e953b9b 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -686,7 +686,6 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
int sense_deferred = 0;
enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
  ACTION_DELAYED_RETRY} action;
-   char *description = NULL;
unsigned long wait_for = (cmd->allowed + 1) * req->timeout;
 
if (result) {
@@ -803,7 +802,6 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
 * and quietly refuse further access.
 */
cmd->device->changed = 1;
-   description = "Media Changed";
action = ACTION_FAIL;
} else {
/* Must have been a power glitch, or a
@@ -831,27 +829,10 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
cmd->device->use_10_for_rw = 0;
action = ACTION_REPREP;
} else if (sshdr.asc == 0x10) /* DIX */ {
-   description = "Host Data Integrity Failure";
action = ACTION_FAIL;
error = -EILSEQ;
/* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
} else if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
-   switch (cmd->cmnd[0]) {
-   case UNMAP:
-   description = "Discard failure";
-   break;
-   case WRITE_SAME:
-   case WRITE_SAME_16:
-   if (cmd->cmnd[1] & 0x8)
-   description = "Discard failure";
-   else
-   description =
-   "Write same failure";
-   break;
-   default:
-   description = "Invalid command failure";
-   break;
-   }
action = ACTION_FAIL;
error = -EREMOTEIO;
} else
@@ -859,10 +840,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
break;
case ABORTED_COMMAND:
action = ACTION_FAIL;
-   if (sshdr.asc == 0x10) { /* DIF */
-   description = "Target Data Integrity Failure";
+   if (sshdr.asc == 0x10) /* DIF */
error = -EILSEQ;
-   }
break;
case NOT_READY:
/* If the device is in the process of becoming
@@ -881,42 +860,31 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
action = ACTION_DELAYED_RETRY;
break;
default:
-   description = "Device not ready";
action = ACTION_FAIL;
break;
}
-   } else {
-   description = "Device not ready";
+   } else
action = ACTIO