Re: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.

2014-08-19 Thread Kashyap Desai
On Tue, Aug 19, 2014 at 9:36 PM, Christoph Hellwig  wrote:
> On Tue, Aug 19, 2014 at 03:51:42AM +0530, Kashyap Desai wrote:
>> I read  this comment and find that very few drivers are using this
>> cmd_list.  I think if we remove this cmd_list, performance will scale as I
>> am seeing major contention in this lock.
>> Just thought to ping you to see if this is known limitation for now or any
>> plan to change this lock in near future ?
>
> Removing the lock entirely and pushing the list into the two drivers
> using it is on my TODO list.  Bart actually suggested keeping the code in the
> SCSI core and having a flag to enabled.  Given that I'm too busy to get my 
> full
> version done in time, it might be a good idea if someone picks up Barts
> idea.  Can you send me a patch to add a enable_cmd_list flag to the host
> template and only enable it for aacraid and dpt_i2o?
>

Sure. I will work on relevant code change and will post patch for review.

-- 
Device Driver Developer @ Avagotech
Kashyap D. Desai
Note - my new email address
kashyap.de...@avagotech.com
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.

2014-08-19 Thread Christoph Hellwig
On Tue, Aug 19, 2014 at 03:51:42AM +0530, Kashyap Desai wrote:
> I read  this comment and find that very few drivers are using this
> cmd_list.  I think if we remove this cmd_list, performance will scale as I
> am seeing major contention in this lock.
> Just thought to ping you to see if this is known limitation for now or any
> plan to change this lock in near future ?

Removing the lock entirely and pushing the list into the two drivers
using it is on my TODO list.  Bart actually suggested keeping the code in the
SCSI core and having a flag to enabled.  Given that I'm too busy to get my full
version done in time, it might be a good idea if someone picks up Barts
idea.  Can you send me a patch to add a enable_cmd_list flag to the host
template and only enable it for aacraid and dpt_i2o?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.

2014-08-19 Thread Kashyap Desai
On Tue, Aug 19, 2014 at 3:51 AM, Kashyap Desai
 wrote:
>
> > -Original Message-
> > From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> > ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
> > Sent: Friday, July 18, 2014 3:43 PM
> > To: James Bottomley; linux-scsi@vger.kernel.org
> > Cc: Jens Axboe; Bart Van Assche; Mike Christie; Martin K. Petersen;
> Robert
> > Elliott; Webb Scales; linux-ker...@vger.kernel.org
> > Subject: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.
> >
> > This patch adds support for an alternate I/O path in the scsi midlayer
> which
> > uses the blk-mq infrastructure instead of the legacy request code.
> >
> > Use of blk-mq is fully transparent to drivers, although for now a host
> > template field is provided to opt out of blk-mq usage in case any
> unforseen
> > incompatibilities arise.
> >
> > In general replacing the legacy request code with blk-mq is a simple and
> > mostly mechanical transformation.  The biggest exception is the new code
> > that deals with the fact the I/O submissions in blk-mq must happen from
> > process context, which slightly complicates the I/O completion handler.
> > The second biggest differences is that blk-mq is build around the
> concept of
> > preallocated requests that also include driver specific data, which in
> SCSI
> > context means the scsi_cmnd structure.  This completely avoids dynamic
> > memory allocations for the fast path through I/O submission.
> >
> > Due the preallocated requests the MQ code path exclusively uses the
> host-
> > wide shared tag allocator instead of a per-LUN one.  This only affects
> drivers
> > actually using the block layer provided tag allocator instead of their
> own.
> > Unlike the old path blk-mq always provides a tag, although drivers don't
> have
> > to use it.
> >
> > For now the blk-mq path is disable by defauly and must be enabled using
> the
> > "use_blk_mq" module parameter.  Once the remaining work in the block
> > layer to make blk-mq more suitable for slow devices is complete I hope
> to
> > make it the default and eventually even remove the old code path.
> >
> > Based on the earlier scsi-mq prototype by Nicholas Bellinger.
> >
> > Thanks to Bart Van Assche and Robert Elliot for testing, benchmarking
> and
> > various sugestions and code contributions.
> >
> > Signed-off-by: Christoph Hellwig 
> > Reviewed-by: Hannes Reinecke 
> > Reviewed-by: Webb Scales 
> > Acked-by: Jens Axboe 
> > Tested-by: Bart Van Assche 
> > Tested-by: Robert Elliott 
> > ---
> >  drivers/scsi/hosts.c  |  35 +++-
> >  drivers/scsi/scsi.c   |   5 +-
> >  drivers/scsi/scsi_lib.c   | 464
> > --
> >  drivers/scsi/scsi_priv.h  |   3 +
> >  drivers/scsi/scsi_scan.c  |   5 +-
> >  drivers/scsi/scsi_sysfs.c |   2 +
> >  include/scsi/scsi_host.h  |  18 +-
> >  include/scsi/scsi_tcq.h   |  28 ++-
> >  8 files changed, 488 insertions(+), 72 deletions(-)
> >
> > diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index
> 0632eee..6de80e3
> > 100644
> > --- a/drivers/scsi/hosts.c
> > +++ b/drivers/scsi/hosts.c
> > @@ -213,9 +213,24 @@ int scsi_add_host_with_dma(struct Scsi_Host
> > *shost, struct device *dev,
> >   goto fail;
> >   }
> >
> > + if (shost_use_blk_mq(shost)) {
> > + error = scsi_mq_setup_tags(shost);
> > + if (error)
> > + goto fail;
> > + }
> > +
> > + /*
> > +  * Note that we allocate the freelist even for the MQ case for
> now,
> > +  * as we need a command set aside for scsi_reset_provider.  Having
> > +  * the full host freelist and one command available for that is a
> > +  * little heavy-handed, but avoids introducing a special allocator
> > +  * just for this.  Eventually the structure of scsi_reset_provider
> > +  * will need a major overhaul.
> > +  */
> >   error = scsi_setup_command_freelist(shost);
> >   if (error)
> > - goto fail;
> > + goto out_destroy_tags;
> > +
> >
> >   if (!shost->shost_gendev.parent)
> >   shost->shost_gendev.parent = dev ? dev : &platform_bus;
> > @@ -226,7 +241,7 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost,
> > struct device *dev,
> >
> >   error = device_add(&shost->shost_gendev);
> >   if (error)
> > - goto out;
> > + goto out_destroy_freelist;
> >
> >   pm_runtime_set_active(&shost->shost_gendev);
> >   pm_runtime_enable(&shost->shost_gendev);
> > @@ -279,8 +294,11 @@ int scsi_add_host_with_dma(struct Scsi_Host
> > *shost, struct device *dev,
> >   device_del(&shost->shost_dev);
> >   out_del_gendev:
> >   device_del(&shost->shost_gendev);
> > - out:
> > + out_destroy_freelist:
> >   scsi_destroy_command_freelist(shost);
> > + out_destroy_tags:
> > + if (shost_use_blk_mq(shost))
> > + scsi_mq_destroy_tags(shost);
> >   fail:
> >   return error;
> >  }
> > @@ -309,8 

RE: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.

2014-08-18 Thread Kashyap Desai
> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
> Sent: Friday, July 18, 2014 3:43 PM
> To: James Bottomley; linux-scsi@vger.kernel.org
> Cc: Jens Axboe; Bart Van Assche; Mike Christie; Martin K. Petersen;
Robert
> Elliott; Webb Scales; linux-ker...@vger.kernel.org
> Subject: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.
>
> This patch adds support for an alternate I/O path in the scsi midlayer
which
> uses the blk-mq infrastructure instead of the legacy request code.
>
> Use of blk-mq is fully transparent to drivers, although for now a host
> template field is provided to opt out of blk-mq usage in case any
unforseen
> incompatibilities arise.
>
> In general replacing the legacy request code with blk-mq is a simple and
> mostly mechanical transformation.  The biggest exception is the new code
> that deals with the fact the I/O submissions in blk-mq must happen from
> process context, which slightly complicates the I/O completion handler.
> The second biggest differences is that blk-mq is build around the
concept of
> preallocated requests that also include driver specific data, which in
SCSI
> context means the scsi_cmnd structure.  This completely avoids dynamic
> memory allocations for the fast path through I/O submission.
>
> Due the preallocated requests the MQ code path exclusively uses the
host-
> wide shared tag allocator instead of a per-LUN one.  This only affects
drivers
> actually using the block layer provided tag allocator instead of their
own.
> Unlike the old path blk-mq always provides a tag, although drivers don't
have
> to use it.
>
> For now the blk-mq path is disable by defauly and must be enabled using
the
> "use_blk_mq" module parameter.  Once the remaining work in the block
> layer to make blk-mq more suitable for slow devices is complete I hope
to
> make it the default and eventually even remove the old code path.
>
> Based on the earlier scsi-mq prototype by Nicholas Bellinger.
>
> Thanks to Bart Van Assche and Robert Elliot for testing, benchmarking
and
> various sugestions and code contributions.
>
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Hannes Reinecke 
> Reviewed-by: Webb Scales 
> Acked-by: Jens Axboe 
> Tested-by: Bart Van Assche 
> Tested-by: Robert Elliott 
> ---
>  drivers/scsi/hosts.c  |  35 +++-
>  drivers/scsi/scsi.c   |   5 +-
>  drivers/scsi/scsi_lib.c   | 464
> --
>  drivers/scsi/scsi_priv.h  |   3 +
>  drivers/scsi/scsi_scan.c  |   5 +-
>  drivers/scsi/scsi_sysfs.c |   2 +
>  include/scsi/scsi_host.h  |  18 +-
>  include/scsi/scsi_tcq.h   |  28 ++-
>  8 files changed, 488 insertions(+), 72 deletions(-)
>
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index
0632eee..6de80e3
> 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -213,9 +213,24 @@ int scsi_add_host_with_dma(struct Scsi_Host
> *shost, struct device *dev,
>   goto fail;
>   }
>
> + if (shost_use_blk_mq(shost)) {
> + error = scsi_mq_setup_tags(shost);
> + if (error)
> + goto fail;
> + }
> +
> + /*
> +  * Note that we allocate the freelist even for the MQ case for
now,
> +  * as we need a command set aside for scsi_reset_provider.  Having
> +  * the full host freelist and one command available for that is a
> +  * little heavy-handed, but avoids introducing a special allocator
> +  * just for this.  Eventually the structure of scsi_reset_provider
> +  * will need a major overhaul.
> +  */
>   error = scsi_setup_command_freelist(shost);
>   if (error)
> - goto fail;
> + goto out_destroy_tags;
> +
>
>   if (!shost->shost_gendev.parent)
>   shost->shost_gendev.parent = dev ? dev : &platform_bus;
> @@ -226,7 +241,7 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost,
> struct device *dev,
>
>   error = device_add(&shost->shost_gendev);
>   if (error)
> - goto out;
> + goto out_destroy_freelist;
>
>   pm_runtime_set_active(&shost->shost_gendev);
>   pm_runtime_enable(&shost->shost_gendev);
> @@ -279,8 +294,11 @@ int scsi_add_host_with_dma(struct Scsi_Host
> *shost, struct device *dev,
>   device_del(&shost->shost_dev);
>   out_del_gendev:
>   device_del(&shost->shost_gendev);
> - out:
> + out_destroy_freelist:
>   scsi_destroy_command_freelist(shost);
> + out_destroy_tags:
> + if (shost_use_blk_mq(shost))
> + scsi_mq_destroy_tags(shost);
>   fail:
>   return error;
>  }
> @@ -309,8 +327,13 @@ static void scsi_host_dev_release(struct device
> *dev)
>   }
>
>   scsi_destroy_command_freelist(shost);
> - if (shost->bqt)
> - blk_free_tags(shost->bqt);
> + if (shost_use_blk_mq(shost)) {
> + if (shost->tag_set.tags)
> + scsi_mq_destroy_tag

Re: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.

2014-07-25 Thread Martin K. Petersen
> "Christoph" == Christoph Hellwig  writes:

Christoph> This patch adds support for an alternate I/O path in the scsi
Christoph> midlayer which uses the blk-mq infrastructure instead of the
Christoph> legacy request code.

Reviewed-by: Martin K. Petersen 

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.

2014-07-16 Thread Christoph Hellwig
On Wed, Jul 16, 2014 at 06:13:21AM -0500, Mike Christie wrote:
> I see the request timer is started before calling queue_rq, but I could
> not figure out what the cancel_delayed_work here is for exactly. It
> seems if the request were to time out and the eh started while queue_rq
> was running we could end up some nasty bugs like the requested requeued
> twice.
> 
> Is the cancel_delayed_work call just to be safe or is supposed to be
> handling a case where the abort_work could be queued at this time up due
> to a request timing out while queue_rq is running? Is this case mq specific?

It was cargo cult copy & paste from the old path.  I've merged a patch
from Bart to remove it from the old code, so it should go away here as well,
thanks for the reminder.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.

2014-07-16 Thread Mike Christie
On 06/25/2014 11:52 AM, Christoph Hellwig wrote:
> +static int scsi_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
> +{
> + struct request_queue *q = req->q;
> + struct scsi_device *sdev = q->queuedata;
> + struct Scsi_Host *shost = sdev->host;
> + struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);
> + int ret;
> + int reason;
> +
> + ret = prep_to_mq(scsi_prep_state_check(sdev, req));
> + if (ret)
> + goto out;
> +
> + ret = BLK_MQ_RQ_QUEUE_BUSY;
> + if (!get_device(&sdev->sdev_gendev))
> + goto out;
> +
> + if (!scsi_dev_queue_ready(q, sdev))
> + goto out_put_device;
> + if (!scsi_target_queue_ready(shost, sdev))
> + goto out_dec_device_busy;
> + if (!scsi_host_queue_ready(q, shost, sdev))
> + goto out_dec_target_busy;
> +
> + if (!(req->cmd_flags & REQ_DONTPREP)) {
> + ret = prep_to_mq(scsi_mq_prep_fn(req));
> + if (ret)
> + goto out_dec_host_busy;
> + req->cmd_flags |= REQ_DONTPREP;
> + }
> +
> + scsi_init_cmd_errh(cmd);
> + cmd->scsi_done = scsi_mq_done;
> +
> + reason = scsi_dispatch_cmd(cmd);
> + if (reason) {
> + scsi_set_blocked(cmd, reason);
> + ret = BLK_MQ_RQ_QUEUE_BUSY;
> + goto out_dec_host_busy;
> + }
> +
> + return BLK_MQ_RQ_QUEUE_OK;
> +
> +out_dec_host_busy:
> + cancel_delayed_work(&cmd->abort_work);

Hey Christoph,

I see the request timer is started before calling queue_rq, but I could
not figure out what the cancel_delayed_work here is for exactly. It
seems if the request were to time out and the eh started while queue_rq
was running we could end up some nasty bugs like the requested requeued
twice.

Is the cancel_delayed_work call just to be safe or is supposed to be
handling a case where the abort_work could be queued at this time up due
to a request timing out while queue_rq is running? Is this case mq specific?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/14] scsi: add support for a blk-mq based I/O path.

2014-07-09 Thread Hannes Reinecke

On 06/25/2014 06:52 PM, Christoph Hellwig wrote:

This patch adds support for an alternate I/O path in the scsi midlayer
which uses the blk-mq infrastructure instead of the legacy request code.

Use of blk-mq is fully transparent to drivers, although for now a host
template field is provided to opt out of blk-mq usage in case any unforseen
incompatibilities arise.

In general replacing the legacy request code with blk-mq is a simple and
mostly mechanical transformation.  The biggest exception is the new code
that deals with the fact the I/O submissions in blk-mq must happen from
process context, which slightly complicates the I/O completion handler.
The second biggest differences is that blk-mq is build around the concept
of preallocated requests that also include driver specific data, which
in SCSI context means the scsi_cmnd structure.  This completely avoids
dynamic memory allocations for the fast path through I/O submission.

Due the preallocated requests the MQ code path exclusively uses the
host-wide shared tag allocator instead of a per-LUN one.  This only
affects drivers actually using the block layer provided tag allocator
instead of their own.  Unlike the old path blk-mq always provides a tag,
although drivers don't have to use it.

For now the blk-mq path is disable by defauly and must be enabled using
the "use_blk_mq" module parameter.  Once the remaining work in the block
layer to make blk-mq more suitable for slow devices is complete I hope
to make it the default and eventually even remove the old code path.

Based on the earlier scsi-mq prototype by Nicholas Bellinger.

Thanks to Bart Van Assche and Robert Elliot for testing, benchmarking and
various sugestions and code contributions.

Signed-off-by: Christoph Hellwig 
---
  drivers/scsi/hosts.c  |   30 ++-
  drivers/scsi/scsi.c   |5 +-
  drivers/scsi/scsi_lib.c   |  475 +++--
  drivers/scsi/scsi_priv.h  |3 +
  drivers/scsi/scsi_scan.c  |5 +-
  drivers/scsi/scsi_sysfs.c |2 +
  include/scsi/scsi_host.h  |   18 +-
  include/scsi/scsi_tcq.h   |   28 ++-
  8 files changed, 494 insertions(+), 72 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 0632eee..6322e6c 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -213,9 +213,24 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct 
device *dev,
goto fail;
}

+   if (shost_use_blk_mq(shost)) {
+   error = scsi_mq_setup_tags(shost);
+   if (error)
+   goto fail;
+   }
+
+   /*
+* Note that we allocate the freelist even for the MQ case for now,
+* as we need a command set aside for scsi_reset_provider.  Having
+* the full host freelist and one command available for that is a
+* little heavy-handed, but avoids introducing a special allocator
+* just for this.  Eventually the structure of scsi_reset_provider
+* will need a major overhaul.
+*/
error = scsi_setup_command_freelist(shost);
if (error)
-   goto fail;
+   goto out_destroy_tags;
+

if (!shost->shost_gendev.parent)
shost->shost_gendev.parent = dev ? dev : &platform_bus;
@@ -226,7 +241,7 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct 
device *dev,

error = device_add(&shost->shost_gendev);
if (error)
-   goto out;
+   goto out_destroy_freelist;

pm_runtime_set_active(&shost->shost_gendev);
pm_runtime_enable(&shost->shost_gendev);
@@ -279,8 +294,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct 
device *dev,
device_del(&shost->shost_dev);
   out_del_gendev:
device_del(&shost->shost_gendev);
- out:
+ out_destroy_freelist:
scsi_destroy_command_freelist(shost);
+ out_destroy_tags:
+   if (shost_use_blk_mq(shost))
+   scsi_mq_destroy_tags(shost);
   fail:
return error;
  }
@@ -309,7 +327,9 @@ static void scsi_host_dev_release(struct device *dev)
}

scsi_destroy_command_freelist(shost);
-   if (shost->bqt)
+   if (shost_use_blk_mq(shost) && shost->tag_set.tags)
+   scsi_mq_destroy_tags(shost);
+   else if (shost->bqt)
blk_free_tags(shost->bqt);

kfree(shost->shost_data);
@@ -436,6 +456,8 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template 
*sht, int privsize)
else
shost->dma_boundary = 0x;

+   shost->use_blk_mq = scsi_use_blk_mq && !shost->hostt->disable_blk_mq;
+
device_initialize(&shost->shost_gendev);
dev_set_name(&shost->shost_gendev, "host%d", shost->host_no);
shost->shost_gendev.bus = &scsi_bus_type;
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index b362058..c089812 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -809,7 +809,7 @@ void scsi_adjust_que