Re: [PATCH 4/5] bcache: writeback: collapse contiguous IO better

2017-09-30 Thread Coly Li
On 2017/10/1 6:49 AM, Michael Lyle wrote: > One final attempt to resend, because gmail has been giving me trouble > sending plain text mail. > > Two instances of this. Tested as above, with a big set of random I/Os > that ultimately cover every block in a file (e.g. allowing sequential > writeback

Re: [PATCH 4/5] bcache: writeback: collapse contiguous IO better

2017-09-30 Thread Michael Lyle
One final attempt to resend, because gmail has been giving me trouble sending plain text mail. Two instances of this. Tested as above, with a big set of random I/Os that ultimately cover every block in a file (e.g. allowing sequential writeback). With the 5 patches, samsung 940 SSD cache + crumm

Re: [PATCH 8/9] nvme: implement multipath access to nvme subsystems

2017-09-30 Thread Johannes Thumshirn
[+Cc Hannes ] Keith Busch writes: > On Mon, Sep 25, 2017 at 03:40:30PM +0200, Christoph Hellwig wrote: >> The new block devices nodes for multipath access will show up as >> >> /dev/nvm-subXnZ > > Just thinking ahead ... Once this goes in, someone will want to boot their > OS from a multi

Re: [PATCH v4] blk-mq: fix nr_requests wrong value when modify it from sysfs

2017-09-30 Thread weiping zhang
On Fri, Sep 22, 2017 at 11:36:28PM +0800, weiping zhang wrote: > if blk-mq uses the "none" io scheduler, nr_requests gets a wrong value when > a number > tag_set->queue_depth is input. blk_mq_tag_update_depth will get > the smaller one min(nr, set->queue_depth), and then q->nr_requests gets a > wrong value. >

[PATCH 4/5] dm-mpath: cache ti->clone during requeue

2017-09-30 Thread Ming Lei
During requeue, the block layer won't change the request any more (e.g. no merging), so we can cache ti->clone and let .clone_and_map_rq check whether the cache can be hit. Signed-off-by: Ming Lei --- drivers/md/dm-mpath.c | 31 --- drivers/md/dm-rq.c| 41 +

[PATCH 5/5] dm-rq: improve I/O merge by dealing with underlying STS_RESOURCE

2017-09-30 Thread Ming Lei
If the underlying queue returns BLK_STS_RESOURCE, we let dm-rq handle the requeue instead of blk-mq; then I/O merge can be improved because the underlying queue's out-of-resource condition can be perceived and handled by dm-rq now. An IOPS test of mpath on lpfc follows, fio(libaio, bs:4k, dio, queue_depth:64, 8 jobs).

[PATCH 3/5] dm-mpath: return DM_MAPIO_REQUEUE in case of rq allocation failure

2017-09-30 Thread Ming Lei
blk-mq will rerun the queue via RESTART after one request is completed, so it is not necessary to wait a random time before requeueing; we should trust blk-mq to do it. More importantly, we need to return BLK_STS_RESOURCE to blk-mq so that dequeueing from the I/O scheduler can be stopped, then I/O merge gets improved. Sig

[PATCH 1/5] dm-mpath: remove annoying message of 'blk_get_request() returned -11'

2017-09-30 Thread Ming Lei
It is very normal to see allocation failures, so it is not necessary to dump them and annoy people. Signed-off-by: Ming Lei --- drivers/md/dm-mpath.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 11f273d2f018..e8094d8fbe0d 100644 --- a/drivers/md

[PATCH 0/5] dm-rq: improve sequential I/O performance

2017-09-30 Thread Ming Lei
Hi, The 1st patch removes one log message which can be triggered very easily. The 2nd patch removes the workaround of blk_mq_delay_run_hw_queue() in case of requeue; this workaround isn't necessary, and worse, it makes BLK_MQ_S_SCHED_RESTART not work and degrades I/O performance. The 3rd p

[PATCH 2/5] dm-mpath: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

2017-09-30 Thread Ming Lei
If .queue_rq() returns BLK_STS_RESOURCE, blk-mq will rerun the queue in three situations: 1) if BLK_MQ_S_SCHED_RESTART is set - the queue is rerun after one rq is completed, see blk_mq_sched_restart(), which is run from blk_mq_free_request() 2) BLK_MQ_S_TAG_WAITING is set - the queue is rerun after on

[PATCH V5 8/8] blk-mq: improve bio merge from blk-mq sw queue

2017-09-30 Thread Ming Lei
This patch uses a hash table to do bio merge from the sw queue, aligning with the way bio merge works in the blk-mq scheduler and the legacy block path. It turns out bio merge via hash table is more efficient than a simple merge against the last 8 requests in the sw queue. On SCSI SRP, ~10% higher IOPS is observed in sequent

[PATCH V5 5/8] block: add check on elevator for supporting bio merge via hashtable from blk-mq sw queue

2017-09-30 Thread Ming Lei
blk_mq_sched_try_merge() will be reused in following patches to support bio merge to the blk-mq sw queue, so add checks to related functions which are called from blk_mq_sched_try_merge(). Tested-by: Oleksandr Natalenko Tested-by: Tom Nguyen Tested-by: Paolo Valente Signed-off-by: Ming Lei --- b

[PATCH V5 7/8] blk-mq-sched: refactor blk_mq_sched_try_merge()

2017-09-30 Thread Ming Lei
This patch introduces one function __blk_mq_try_merge() which will be reused for bio merge to the sw queue in the following patch. No functional change. Tested-by: Oleksandr Natalenko Tested-by: Tom Nguyen Tested-by: Paolo Valente Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei --- block/b

[PATCH V5 6/8] block: introduce .last_merge and .hash to blk_mq_ctx

2017-09-30 Thread Ming Lei
Prepare for supporting bio merge to the sw queue when no blk-mq io scheduler is in use. Tested-by: Oleksandr Natalenko Tested-by: Tom Nguyen Tested-by: Paolo Valente Signed-off-by: Ming Lei --- block/blk-mq.h | 4 block/blk.h | 3 +++ block/elevator.c | 22 +++--- 3 fil

[PATCH V5 4/8] block: move actual bio merge code into __elv_merge

2017-09-30 Thread Ming Lei
So that we can reuse __elv_merge() to merge bio into requests from sw queue in the following patches. No functional change. Tested-by: Oleksandr Natalenko Tested-by: Tom Nguyen Tested-by: Paolo Valente Signed-off-by: Ming Lei --- block/elevator.c | 19 +-- 1 file changed, 13

[PATCH V5 1/8] blk-mq-sched: introduce blk_mq_sched_queue_depth()

2017-09-30 Thread Ming Lei
The following patch will use one hint to figure out the default queue depth for the scheduler queue, so introduce the helper blk_mq_sched_queue_depth() for this purpose. Reviewed-by: Christoph Hellwig Reviewed-by: Bart Van Assche Tested-by: Oleksandr Natalenko Tested-by: Tom Nguyen Tested-by: Paolo

[PATCH V5 3/8] block: introduce rqhash helpers

2017-09-30 Thread Ming Lei
We need these helpers to support using a hashtable to improve bio merge from the sw queue in the following patches. No functional change. Tested-by: Oleksandr Natalenko Tested-by: Tom Nguyen Tested-by: Paolo Valente Signed-off-by: Ming Lei --- block/blk.h | 52

[PATCH V5 2/8] blk-mq-sched: use q->queue_depth as hint for q->nr_requests

2017-09-30 Thread Ming Lei
SCSI sets q->queue_depth from shost->cmd_per_lun; q->queue_depth is per request_queue and more closely related to the scheduler queue than to the hw queue depth, which can be shared by queues, such as with TAG_SHARED. This patch tries to use q->queue_depth as a hint for computing q->nr_requests, which should b

[PATCH V5 0/8] blk-mq: improve bio merge for none scheduler

2017-09-30 Thread Ming Lei
Hi, Patches 1 ~ 2 use q->queue_depth as a hint for setting up the scheduler queue depth. Patches 3 ~ 8 improve bio merge via a hash table in the sw queue, which makes bio merge more efficient than the current approach, in which only the last 8 requests in the sw queue are checked. Also this way has been used in the block le

Re: [PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)

2017-09-30 Thread Ming Lei
On Sat, Sep 30, 2017 at 06:27:13PM +0800, Ming Lei wrote: > Hi Jens, > > In Red Hat internal storage tests wrt. the blk-mq scheduler, we > found that I/O performance is much worse with mq-deadline, especially > for sequential I/O on some multi-queue SCSI devices(lpfc, qla2xxx, > SRP...) > > Turns out

[PATCH V5 4/7] blk-mq: introduce blk_mq_dequeue_from_ctx()

2017-09-30 Thread Ming Lei
This function is introduced for dequeuing a request from the sw queue so that we can dispatch it in the scheduler's way. More importantly, some SCSI devices may set q->queue_depth, which is a per-request_queue limit applied to pending I/O from all hctxs. This function is introduced to avoid dequ

[PATCH V5 7/7] blk-mq-sched: don't dequeue request until all in ->dispatch are flushed

2017-09-30 Thread Ming Lei
During dispatching, we move all requests from hctx->dispatch to one temporary list, then dispatch them one by one from this list. Unfortunately during this period, a queue run from other contexts may think the queue is idle, then start to dequeue from the sw/scheduler queue and still try to dispatch bec

[PATCH V5 5/7] blk-mq-sched: move actual dispatching into one helper

2017-09-30 Thread Ming Lei
So that it becomes easy to support dispatching from the sw queue in the following patch. No functional change. Reviewed-by: Bart Van Assche Reviewed-by: Omar Sandoval Tested-by: Oleksandr Natalenko Tested-by: Tom Nguyen Tested-by: Paolo Valente Signed-off-by: Ming Lei --- block/blk-mq-sched.c

[PATCH V5 6/7] blk-mq-sched: improve dispatching from sw queue

2017-09-30 Thread Ming Lei
SCSI devices use a host-wide tagset, and the shared driver tag space is often quite big. Meanwhile there is also a queue depth for each lun(.cmd_per_lun), which is often small. So lots of requests may stay in the sw queue, and we always flush all requests belonging to the same hw queue and dispatch them all to the driver,

[PATCH V5 3/7] sbitmap: introduce __sbitmap_for_each_set()

2017-09-30 Thread Ming Lei
We need to iterate over ctxs starting from any ctx in a round-robin way, so introduce this helper. Cc: Omar Sandoval Tested-by: Oleksandr Natalenko Tested-by: Tom Nguyen Tested-by: Paolo Valente Signed-off-by: Ming Lei --- include/linux/sbitmap.h | 64

[PATCH V5 1/7] blk-mq: issue rq directly in blk_mq_request_bypass_insert()

2017-09-30 Thread Ming Lei
With issuing the rq directly in blk_mq_request_bypass_insert(), we can: 1) avoid acquiring hctx->lock; 2) return the dispatch result to dm-rq, so that dm-rq can use this information to improve I/O performance, which part 2 of this patchset will do; 3) also the following patch for impr

[PATCH V5 2/7] blk-mq-sched: fix scheduler bad performance

2017-09-30 Thread Ming Lei
When the hw queue is busy, we shouldn't take requests from the scheduler queue any more, otherwise it is difficult to do IO merge. This patch fixes the awful IO performance on some SCSI devices (lpfc, qla2xxx, ...) when mq-deadline/kyber is used, by not taking requests if the hw queue is busy. Tested-by: Oleks

[PATCH V5 00/14] blk-mq-sched: improve sequential I/O performance(part 1)

2017-09-30 Thread Ming Lei
Hi Jens, In Red Hat internal storage tests wrt. the blk-mq scheduler, we found that I/O performance is much worse with mq-deadline, especially for sequential I/O on some multi-queue SCSI devices(lpfc, qla2xxx, SRP...) Turns out one big issue causes the performance regression: requests are still dequeu

Re: [PATCH V7 0/6] block/scsi: safe SCSI quiescing

2017-09-30 Thread Ming Lei
Hi Martin, On Sat, Sep 30, 2017 at 11:47:10AM +0200, Martin Steigerwald wrote: > Hi Ming. > > Ming Lei - 30.09.17, 14:12: > > Please consider this patchset for V4.15; it fixes one > > kind of long-term I/O hang issue in either the block legacy path > > or blk-mq. > > > > The current SCSI quiesce

Re: [PATCH V7 0/6] block/scsi: safe SCSI quiescing

2017-09-30 Thread Martin Steigerwald
Hi Ming. Ming Lei - 30.09.17, 14:12: > Please consider this patchset for V4.15, and it fixes one > kind of long-term I/O hang issue in either block legacy path > or blk-mq. > > The current SCSI quiesce isn't safe and easy to trigger I/O deadlock. Isn't that material for -stable as well? I'd love

Re: [PATCH 4/5] bcache: writeback: collapse contiguous IO better

2017-09-30 Thread Michael Lyle
Just one more note--- IO merging is not happening properly right now. It's easy to get a case together where basically all the writeback is sequential. E.g. if your writeback dirty data target is 15GB, do something like: $ sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --n

Re: [PATCH 4/5] bcache: writeback: collapse contiguous IO better

2017-09-30 Thread Michael Lyle
On Sat, Sep 30, 2017 at 1:03 AM, Coly Li wrote: >>> If writeback_rate is not the minimum value, it means there are front end >>> write requests existing. >> >> This is wrong. Else we'd never make it back to the target. >> > > Of course we can :-) When dirty data is far beyond the dirty percent, > writebac

Re: [PATCH v2] null_blk: add "no_sched" module parameter

2017-09-30 Thread Jens Axboe
On 09/30/2017 03:49 AM, weiping zhang wrote: > add an option that disable io scheduler for null block device. Applied for 4.15, thanks. -- Jens Axboe

Re: [PATCH] blk-mq: remove unused function hctx_allow_merges

2017-09-30 Thread Jens Axboe
On 09/30/2017 10:15 AM, Jens Axboe wrote: > On 09/30/2017 09:01 AM, weiping zhang wrote: >> since 9bddeb2a5b981 "blk-mq: make per-sw-queue bio merge as default >> .bio_merge" >> there is no caller for this function. > > Looks like this should change the new check to actually use this function > i

Re: [PATCH] blk-mq: remove unused function hctx_allow_merges

2017-09-30 Thread Jens Axboe
On 09/30/2017 09:01 AM, weiping zhang wrote: > since 9bddeb2a5b981 "blk-mq: make per-sw-queue bio merge as default > .bio_merge" > there is no caller for this function. Looks like this should change the new check to actually use this function instead, or we will be doing merges if someone has tur

[RFC PATCH 2/2] block/cfq: Fix memory leak of async cfqq

2017-09-30 Thread Jeffy Chen
Currently we only unref the async cfqqs in cfq_pd_offline, which would not be called when CONFIG_CFQ_GROUP_IOSCHED is disabled. Kmemleak reported: unreferenced object 0xffc0cd9fc000 (size 240): comm "kworker/3:1", pid 52, jiffies 4294673527 (age 97.149s) hex dump (first 32 bytes): 01 0

[PATCH] blk-mq: wire up completion notifier for laptop mode

2017-09-30 Thread Jens Axboe
For some reason, the laptop mode IO completion notifier was never wired up for blk-mq. Ensure that we trigger the callback appropriately, to arm the laptop mode flush timer. Signed-off-by: Jens Axboe diff --git a/block/blk-mq.c b/block/blk-mq.c index 98a18609755e..09e92667be98 100644 --- a/block

Re: [PATCH 4/5] bcache: writeback: collapse contiguous IO better

2017-09-30 Thread Coly Li
On 2017/9/30 3:13 PM, Michael Lyle wrote: > Coly-- > > > On Fri, Sep 29, 2017 at 11:58 PM, Coly Li wrote: >> On 2017/9/30 11:17 AM, Michael Lyle wrote: >> [snip] >> >> If writeback_rate is not the minimum value, it means there are front end >> write requests existing. > > This is wrong. Else we'd ne

Re: [PATCH] blk-mq: remove unused function hctx_allow_merges

2017-09-30 Thread Ming Lei
On Sat, Sep 30, 2017 at 3:01 PM, weiping zhang wrote: > since 9bddeb2a5b981 "blk-mq: make per-sw-queue bio merge as default > .bio_merge" > there is no caller for this function. > > Signed-off-by: weiping zhang Reviewed-by: Ming Lei -- Ming Lei

Re: [PATCH 4/5] bcache: writeback: collapse contiguous IO better

2017-09-30 Thread Michael Lyle
Actually-- I give up. You've initially bounced every single one of my patches, even the ones that fix crash & data corruption bugs. I spend 10x as much time fighting with you as writing stuff for bcache, and basically every time it's turned out that you're wrong. I will go do something else wher

Re: [PATCH 4/5] bcache: writeback: collapse contiguous IO better

2017-09-30 Thread Michael Lyle
Coly-- On Fri, Sep 29, 2017 at 11:58 PM, Coly Li wrote: > On 2017/9/30 11:17 AM, Michael Lyle wrote: > [snip] > > If writeback_rate is not the minimum value, it means there are front end > write requests existing. This is wrong. Else we'd never make it back to the target. > In this case, backend w

[PATCH] blk-mq: remove unused function hctx_allow_merges

2017-09-30 Thread weiping zhang
since 9bddeb2a5b981 "blk-mq: make per-sw-queue bio merge as default .bio_merge" there is no caller for this function. Signed-off-by: weiping zhang --- block/blk-mq.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index f84d145..520d257 100644 --- a/block