Re: [PATCH V10 0/8] blk-mq-sched: improve sequential I/O performance

2017-10-14 Thread Ming Lei
On Sat, Oct 14, 2017 at 09:39:21AM -0600, Jens Axboe wrote: > On 10/14/2017 03:22 AM, Ming Lei wrote: > > Hi Jens, > > > > In Red Hat's internal storage tests of the blk-mq scheduler, we found that I/O > > performance is very poor with mq-deadline, especially for sequential I/O > > on some multi-queue

Re: [PATCH V10 0/8] blk-mq-sched: improve sequential I/O performance

2017-10-14 Thread Ming Lei
On Sat, Oct 14, 2017 at 07:38:29PM +0200, Oleksandr Natalenko wrote: > Hi. > > By any chance, could this be backported to 4.14? I'm confused by "SCSI: > allow to pass null rq to scsi_prep_state_check()" since it uses refactored > flags. > > === > if (req && !(req->rq_flags & RQF_PREEMPT)) > =

Re: [PATCH V10 0/8] blk-mq-sched: improve sequential I/O performance

2017-10-14 Thread Oleksandr Natalenko
Hi. By any chance, could this be backported to 4.14? I'm confused by "SCSI: allow to pass null rq to scsi_prep_state_check()" since it uses refactored flags. === if (req && !(req->rq_flags & RQF_PREEMPT)) === Is it safe to revert to REQ_PREEMPT here, or should rq_flags also be replaced with

Re: [PATCH V10 0/8] blk-mq-sched: improve sequential I/O performance

2017-10-14 Thread Jens Axboe
On 10/14/2017 03:22 AM, Ming Lei wrote: > Hi Jens, > > In Red Hat's internal storage tests of the blk-mq scheduler, we found that I/O > performance is very poor with mq-deadline, especially for sequential I/O > on some multi-queue SCSI devices (lpfc, qla2xxx, SRP...) > > Turns out one big issue causes

[PATCH V10 3/8] sbitmap: introduce __sbitmap_for_each_set()

2017-10-14 Thread Ming Lei
We need to iterate over ctxs starting from any ctx in a round-robin way, so introduce this helper. Reviewed-by: Omar Sandoval Cc: Omar Sandoval Signed-off-by: Ming Lei --- include/linux/sbitmap.h | 64 - 1 file changed, 47 insertions(+), 17 deletions(-)
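The round-robin iteration described in this patch summary can be illustrated with a minimal user-space sketch. This is not the kernel's sbitmap code: the names for_each_set_from, bit_fn, visit_log and record_bit are hypothetical, and the bitmap is a plain array of 64-bit words rather than a struct sbitmap. It only shows the idea of visiting every set bit once, starting from an arbitrary position and wrapping around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: return false to stop the walk early. */
typedef bool (*bit_fn)(unsigned int bit, void *data);

/*
 * Visit each set bit in a bitmap of 'nbits' bits exactly once, beginning
 * at 'start' and wrapping around to bit 0 (round-robin order).
 * Illustration only; this is not the kernel helper.
 */
static void for_each_set_from(const uint64_t *words, unsigned int nbits,
                              unsigned int start, bit_fn fn, void *data)
{
    assert(nbits > 0 && start < nbits);

    for (unsigned int i = 0; i < nbits; i++) {
        unsigned int bit = (start + i) % nbits;

        if (words[bit / 64] & (UINT64_C(1) << (bit % 64))) {
            if (!fn(bit, data))
                return; /* callback asked to stop */
        }
    }
}

/* Hypothetical helper that records the order in which bits are visited. */
struct visit_log {
    unsigned int bits[64];
    size_t count;
};

static bool record_bit(unsigned int bit, void *data)
{
    struct visit_log *log = data;

    if (log->count == 64)
        return false; /* log is full, stop the walk */
    log->bits[log->count++] = bit;
    return true;
}
```

Calling for_each_set_from() with a different start value on each run gives every set bit (every ctx, in the patch's terms) a turn at being served first, instead of always favouring the lowest-numbered one.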

[PATCH V10 1/8] blk-mq-sched: dispatch from scheduler only after progress is made on ->dispatch

2017-10-14 Thread Ming Lei
When the hw queue is busy, we shouldn't take requests from the scheduler queue any more; otherwise it is difficult to merge I/O. This patch fixes the awful I/O performance on some SCSI devices (lpfc, qla2xxx, ...) when mq-deadline/kyber is used, by not taking requests if the hw queue is busy. Reviewed-by: Oma
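As a rough illustration of the rule in this patch title, the toy model below only pulls from the scheduler queue once the requests already parked for dispatch have made progress. All names here (toy_hw, toy_hctx, toy_issue, toy_run) are hypothetical, and queues are reduced to simple counters; the real blk-mq code works on request lists and is considerably more involved.

```c
#include <assert.h>

/* Toy hardware: 'room' is how many more requests it can accept now. */
struct toy_hw {
    unsigned int room;
};

/* Toy hardware context: counts stand in for real request lists. */
struct toy_hctx {
    unsigned int parked; /* requests waiting on the dispatch list */
    unsigned int sched;  /* requests still held by the I/O scheduler */
};

/* Issue as many of '*pending' as the hardware has room for. */
static unsigned int toy_issue(struct toy_hw *hw, unsigned int *pending)
{
    unsigned int n = *pending < hw->room ? *pending : hw->room;

    *pending -= n;
    hw->room -= n;
    return n;
}

/*
 * Run the queue once. If parked requests exist and none of them can be
 * issued, the hardware is busy, so the scheduler queue is left untouched
 * and its requests stay available for merging.
 */
static unsigned int toy_run(struct toy_hw *hw, struct toy_hctx *h)
{
    unsigned int issued = 0;

    assert(hw && h);

    if (h->parked > 0) {
        issued = toy_issue(hw, &h->parked);
        if (issued == 0)
            return 0; /* no progress: do not touch the scheduler */
    }

    issued += toy_issue(hw, &h->sched);
    return issued;
}
```

The design point being sketched is that a request left in the scheduler queue can still be merged with later sequential I/O, whereas one that has already been taken out and is merely waiting for a busy device cannot.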

[PATCH V10 4/8] block: kyber: check if there is request in ctx in kyber_has_work()

2017-10-14 Thread Ming Lei
There may be requests in the sw queue that have not been fetched into the domain queue yet, so check for them in kyber_has_work(). Signed-off-by: Ming Lei --- block/kyber-iosched.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c index f58cab82105b..94df3ce9

[PATCH V10 2/8] blk-mq-sched: move actual dispatching into one helper

2017-10-14 Thread Ming Lei
This makes it easy to support dispatching from the sw queue in the following patch. No functional change. Reviewed-by: Bart Van Assche Reviewed-by: Omar Sandoval Suggested-by: Christoph Hellwig # for simplifying dispatch logic Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 43 ++

[PATCH V10 6/8] blk-mq-sched: improve dispatching from sw queue

2017-10-14 Thread Ming Lei
SCSI devices use a host-wide tagset, and the shared driver tag space is often quite big. Meanwhile, there is also a queue depth for each LUN (.cmd_per_lun), which is often small; for example, on both lpfc and qla2xxx, .cmd_per_lun is just 3. So lots of requests may stay in the sw queue, and we always flush

[PATCH V10 5/8] blk-mq: introduce .get_budget and .put_budget in blk_mq_ops

2017-10-14 Thread Ming Lei
For SCSI devices, there is often a per-request-queue depth, which needs to be respected before queuing a request. The current blk-mq always dequeues a request first, then calls .queue_rq() to dispatch the request to the LLD. One obvious issue with this approach is that I/O merging may not be good, because when th
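The "reserve budget first, dequeue second" ordering that this patch summary describes can be sketched in user space as follows. This is a toy model under stated assumptions, not the blk_mq_ops interface: toy_dev, toy_queue, toy_get_budget, toy_put_budget, toy_enqueue and toy_dispatch are all hypothetical names, and the per-device depth is modelled as a simple in-flight counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_CAP 16

/* Toy device with a small queue depth (compare .cmd_per_lun). */
struct toy_dev {
    unsigned int depth;
    unsigned int in_flight;
};

/* Toy software queue: a fixed array used as a simple FIFO. */
struct toy_queue {
    int reqs[TOY_QUEUE_CAP];
    size_t head;
    size_t tail;
};

/* Reserve one slot on the device; fails when the depth is exhausted. */
static bool toy_get_budget(struct toy_dev *d)
{
    if (d->in_flight >= d->depth)
        return false;
    d->in_flight++;
    return true;
}

/* Give a slot back: on completion, or when nothing was dequeued. */
static void toy_put_budget(struct toy_dev *d)
{
    assert(d->in_flight > 0);
    d->in_flight--;
}

static bool toy_enqueue(struct toy_queue *q, int rq)
{
    if (q->tail == TOY_QUEUE_CAP)
        return false;
    q->reqs[q->tail++] = rq;
    return true;
}

/*
 * Dispatch loop: a request is only dequeued after a budget slot has been
 * reserved, so requests the device cannot take yet stay in the queue.
 * Returns the number of requests handed to the (pretend) driver.
 */
static size_t toy_dispatch(struct toy_dev *d, struct toy_queue *q)
{
    size_t sent = 0;

    for (;;) {
        if (!toy_get_budget(d))
            break;             /* device full: leave requests queued */
        if (q->head == q->tail) {
            toy_put_budget(d); /* nothing to send: return the slot */
            break;
        }
        q->head++;             /* dequeue and "issue" one request */
        sent++;
    }
    return sent;
}
```

With a depth of 3 and five queued requests, one pass sends three and leaves two in the queue, where in a real scheduler they would remain candidates for merging until a completion frees a slot.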

[PATCH V10 7/8] SCSI: allow to pass null rq to scsi_prep_state_check()

2017-10-14 Thread Ming Lei
In the following patch, we will implement scsi_get_budget(), which needs to call scsi_prep_state_check() when rq isn't dequeued yet. Signed-off-by: Ming Lei --- drivers/scsi/scsi_lib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/sc

[PATCH V10 8/8] SCSI: implement .get_budget and .put_budget for blk-mq

2017-10-14 Thread Ming Lei
We need to tell blk-mq to reserve resources before queuing a request, so implement these two callbacks. Then blk-mq can avoid dequeuing requests too early, and I/O merging can be improved a lot. Signed-off-by: Ming Lei --- drivers/scsi/scsi_lib.c | 75 ++---

[PATCH V10 0/8] blk-mq-sched: improve sequential I/O performance

2017-10-14 Thread Ming Lei
Hi Jens, In Red Hat's internal storage tests of the blk-mq scheduler, we found that I/O performance is very poor with mq-deadline, especially for sequential I/O on some multi-queue SCSI devices (lpfc, qla2xxx, SRP...) Turns out one big issue causes the performance regression: requests are still dequeu

Re: [PATCH V9 6/7] SCSI: allow to pass null rq to scsi_prep_state_check()

2017-10-14 Thread Ming Lei
On Fri, Oct 13, 2017 at 11:16:12PM +, Bart Van Assche wrote: > On Sat, 2017-10-14 at 02:05 +0800, Ming Lei wrote: > > In the following patch, we will implement scsi_get_budget() > > which need to call scsi_prep_state_check() when rq isn't > > dequeued yet. > > My understanding is that this cha

Re: [PATCH V9 4/7] blk-mq: introduce .get_budget and .put_budget in blk_mq_ops

2017-10-14 Thread Ming Lei
On Fri, Oct 13, 2017 at 11:43:44PM +, Bart Van Assche wrote: > On Sat, 2017-10-14 at 02:05 +0800, Ming Lei wrote: > > @@ -89,19 +89,36 @@ static bool blk_mq_sched_restart_hctx(struct > > blk_mq_hw_ctx *hctx) > > return false; > > } > > > > -static void blk_mq_do_dispatch_sched(struct bl

Re: [PATCH V9 0/7] blk-mq-sched: improve sequential I/O performance

2017-10-14 Thread Ming Lei
On Fri, Oct 13, 2017 at 02:23:07PM -0600, Jens Axboe wrote: > On 10/13/2017 01:21 PM, Jens Axboe wrote: > > On 10/13/2017 01:08 PM, Jens Axboe wrote: > >> On 10/13/2017 12:05 PM, Ming Lei wrote: > >>> Hi Jens, > >>> > >>> In Red Hat internal storage test wrt. blk-mq scheduler, we found that I/O > >