On 2017/10/1 6:49 AM, Michael Lyle wrote:
> One final attempt to resend, because gmail has been giving me trouble
> sending plain text mail.
>
> Two instances of this. Tested as above, with a big set of random I/Os
> that ultimately cover every block in a file (e.g. allowing sequential
> writeback).
With the 5 patches, samsung 940 SSD cache + crumm
[+Cc Hannes ]
Keith Busch writes:
> On Mon, Sep 25, 2017 at 03:40:30PM +0200, Christoph Hellwig wrote:
>> The new block devices nodes for multipath access will show up as
>>
>> /dev/nvm-subXnZ
>
> Just thinking ahead ... Once this goes in, someone will want to boot their
> OS from a multi
On Fri, Sep 22, 2017 at 11:36:28PM +0800, weiping zhang wrote:
> if blk-mq uses the "none" io scheduler, nr_requests gets a wrong value when
> the input is larger than tag_set->queue_depth. blk_mq_tag_update_depth() will
> take the smaller one, min(nr, set->queue_depth), and then q->nr_requests gets
> a wrong value.
>
During requeue, the block layer won't change the request any
more (for example, no merging will happen), so we can cache ti->clone and
let .clone_and_map_rq check whether the cached clone can be reused.
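As a rough userspace sketch of the caching idea (names like `clone_and_map` and `requeue` are illustrative, not the actual dm-rq API):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not the real dm-rq code: a target stashes the
 * clone pointer when a request is requeued, and the next clone-and-map
 * call reuses it instead of allocating a fresh clone. */
struct clone { int data; };
struct target {
    struct clone *cached_clone;  /* set on requeue, NULL otherwise */
};

/* Returns the cached clone on a hit, or NULL meaning "allocate fresh". */
static struct clone *clone_and_map(struct target *ti)
{
    if (ti->cached_clone) {
        struct clone *c = ti->cached_clone;
        ti->cached_clone = NULL;   /* cache is single-use */
        return c;
    }
    return NULL;
}

static void requeue(struct target *ti, struct clone *c)
{
    ti->cached_clone = c;  /* safe: the request won't change during requeue */
}
```

The cache is valid precisely because, per the text above, the block layer will not modify a request while it is being requeued.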
Signed-off-by: Ming Lei
---
drivers/md/dm-mpath.c | 31 ---
drivers/md/dm-rq.c| 41 +
If the underlying queue returns BLK_STS_RESOURCE, we let dm-rq
handle the requeue instead of blk-mq; I/O merging then improves
because the underlying queue's out-of-resource condition can be
perceived and handled by dm-rq.
An IOPS test of mpath on lpfc follows (fio: libaio, bs=4k, direct I/O,
queue_depth=64, 8 jobs).
blk-mq will rerun the queue via RESTART after one request completes,
so there is no need to wait a random time before requeuing; we should
trust blk-mq to do it.
More importantly, we need to return BLK_STS_RESOURCE to blk-mq
so that dequeuing from the I/O scheduler can be stopped; I/O
merging then improves.
Sig
Allocation failures here are perfectly normal, so there is no need
to log them and annoy people.
Signed-off-by: Ming Lei
---
drivers/md/dm-mpath.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 11f273d2f018..e8094d8fbe0d 100644
--- a/drivers/md
Hi,
The 1st patch removes a log message which can be triggered
very easily.
The 2nd patch removes the workaround of blk_mq_delay_run_hw_queue()
in case of requeue; this workaround isn't necessary, and worse, it
keeps BLK_MQ_S_SCHED_RESTART from working and degrades I/O performance.
The 3rd p
If .queue_rq() returns BLK_STS_RESOURCE, blk-mq will rerun
the queue in three situations:
1) if BLK_MQ_S_SCHED_RESTART is set
- queue is rerun after one rq is completed, see blk_mq_sched_restart()
which is run from blk_mq_free_request()
2) BLK_MQ_S_TAG_WAITING is set
- queue is rerun after on
This patch uses a hash table to do bio merging from the sw queue,
aligning with the way the blk-mq scheduler and the block legacy path
do bio merging.
It turns out that bio merging via a hash table is more efficient than
a simple merge against the last 8 requests in the sw queue. On SCSI SRP,
~10% higher IOPS is observed in sequent
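A minimal userspace sketch of the hash-table back-merge lookup (bucket count and all names are assumptions, not the kernel implementation): pending requests are hashed by the sector at which they end, so a back-merge candidate for a bio starting at sector s is one bucket lookup instead of a list scan.

```c
#include <assert.h>
#include <stddef.h>

#define NR_BUCKETS 64

struct req {
    unsigned long long end_sector;  /* first sector after the request */
    struct req *hash_next;
};

struct merge_hash { struct req *bucket[NR_BUCKETS]; };

static unsigned hash_sector(unsigned long long s) { return s % NR_BUCKETS; }

static void hash_add(struct merge_hash *h, struct req *rq)
{
    unsigned b = hash_sector(rq->end_sector);
    rq->hash_next = h->bucket[b];
    h->bucket[b] = rq;
}

/* Find a request that ends exactly where the new bio begins,
 * i.e. a back-merge candidate. */
static struct req *find_back_merge(struct merge_hash *h,
                                   unsigned long long bio_start)
{
    struct req *rq;
    for (rq = h->bucket[hash_sector(bio_start)]; rq; rq = rq->hash_next)
        if (rq->end_sector == bio_start)
            return rq;
    return NULL;
}
```

The win over scanning the last 8 requests is that lookup cost stays constant no matter how deep the sw queue gets.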
blk_mq_sched_try_merge() will be reused in following patches
to support bio merging into the blk-mq sw queue, so add checks to
the related functions which are called from blk_mq_sched_try_merge().
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo Valente
Signed-off-by: Ming Lei
---
b
This patch introduces the function __blk_mq_try_merge(),
which will be reused for bio merging into the sw queue in
the following patch.
No functional change.
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo Valente
Reviewed-by: Bart Van Assche
Signed-off-by: Ming Lei
---
block/b
Prepare for supporting bio merging into the sw queue when no
blk-mq io scheduler is in use.
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo Valente
Signed-off-by: Ming Lei
---
block/blk-mq.h | 4
block/blk.h | 3 +++
block/elevator.c | 22 +++---
3 fil
So that we can reuse __elv_merge() to merge bios
into requests from the sw queue in the following patches.
No functional change.
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo Valente
Signed-off-by: Ming Lei
---
block/elevator.c | 19 +--
1 file changed, 13
The following patch will use a hint to figure out the
default queue depth for the scheduler queue, so introduce
the helper blk_mq_sched_queue_depth() for this purpose.
Reviewed-by: Christoph Hellwig
Reviewed-by: Bart Van Assche
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo
We need these helpers to support using a hash table to improve
bio merging from the sw queue in the following patches.
No functional change.
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo Valente
Signed-off-by: Ming Lei
---
block/blk.h | 52
SCSI sets q->queue_depth from shost->cmd_per_lun, and
q->queue_depth is per request_queue and more related to the
scheduler queue than the hw queue depth, which can be
shared between queues, such as with TAG_SHARED.
This patch tries to use q->queue_depth as a hint for computing
q->nr_requests, which should b
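One plausible reading of the heuristic, as a standalone sketch (the exact kernel formula may differ; the function name and the 2x factor are assumptions):

```c
#include <assert.h>

/* Hypothetical sketch: when the device exposes a per-request_queue
 * depth, size the scheduler queue relative to it instead of relative
 * to the (possibly shared) hw tag depth. */
static unsigned sched_queue_depth(unsigned q_depth, unsigned tag_depth)
{
    /* Prefer the per-request_queue limit as the hint when it is set... */
    if (q_depth)
        return 2 * q_depth;   /* headroom above the dispatch limit for merging */
    /* ...otherwise fall back to the hw tag space. */
    return 2 * tag_depth;
}
```

The point is that with cmd_per_lun often small, sizing the scheduler queue from the big shared tag space would let far more requests be dequeued than the lun can ever have in flight.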
Hi,
Patch 1 ~ 2 uses q->queue_depth as hint for setting up
scheduler queue depth.
Patch 3 ~ 8 improve bio merging via a hash table in the sw queue,
which makes bio merging more efficient than the current approach,
in which only the last 8 requests in the sw queue are checked.
Also this way has been used in block le
On Sat, Sep 30, 2017 at 06:27:13PM +0800, Ming Lei wrote:
> Hi Jens,
>
> In Red Hat internal storage tests wrt. the blk-mq scheduler, we
> found that I/O performance is much worse with mq-deadline, especially
> for sequential I/O on some multi-queue SCSI devices (lpfc, qla2xxx,
> SRP...)
>
> Turns out
This function is introduced for dequeuing requests
from the sw queue so that we can dispatch them in
the scheduler's way.
More importantly, some SCSI devices may set
q->queue_depth, which is a per-request_queue limit
applied to pending I/O from all hctxs. This
function is introduced for avoiding to dequ
During dispatching, we move all requests from hctx->dispatch to
a temporary list, then dispatch them one by one from this list.
Unfortunately, during this period, a queue run from another context
may think the queue is idle, then start to dequeue from the sw/scheduler
queue and still try to dispatch bec
So that it becomes easy to support to dispatch from
sw queue in the following patch.
No functional change.
Reviewed-by: Bart Van Assche
Reviewed-by: Omar Sandoval
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo Valente
Signed-off-by: Ming Lei
---
block/blk-mq-sched.c
SCSI devices use a host-wide tagset, and the shared
driver tag space is often quite big. Meanwhile,
there is also a queue depth for each lun (.cmd_per_lun),
which is often small.
So lots of requests may stay in the sw queue, and we
always flush all belonging to the same hw queue and
dispatch them all to the driver,
We need to iterate over ctxs starting from any ctx in a round-robin
way, so introduce this helper.
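The round-robin traversal itself can be sketched in plain C (illustrative only, not the sbitmap API):

```c
#include <assert.h>

/* Visit all n slots exactly once, starting from an arbitrary index and
 * wrapping around, recording the visit order. Returns the visit count.
 * Starting from a different ctx each time avoids always draining the
 * same ctx first. */
static int visit_round_robin(int start, int n, int *order_out)
{
    int i, count = 0;
    for (i = 0; i < n; i++)
        order_out[count++] = (start + i) % n;
    return count;
}
```

A caller would remember where the previous run stopped and pass that index as `start` on the next run.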
Cc: Omar Sandoval
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo Valente
Signed-off-by: Ming Lei
---
include/linux/sbitmap.h | 64
By issuing the rq directly in blk_mq_request_bypass_insert(),
we can:
1) avoid acquiring hctx->lock.
2) return the dispatch result to dm-rq, so that dm-rq
can use this information for improving I/O performance;
part 2 of this patchset will do that.
3) Also the following patch for impr
When the hw queue is busy, we shouldn't take requests from the
scheduler queue any more; otherwise it is difficult to do
I/O merging.
This patch fixes the awful I/O performance on some
SCSI devices (lpfc, qla2xxx, ...) when mq-deadline/kyber
is used, by not taking requests while the hw queue is busy.
Tested-by: Oleks
Hi Jens,
In Red Hat internal storage tests wrt. the blk-mq scheduler, we
found that I/O performance is much worse with mq-deadline, especially
for sequential I/O on some multi-queue SCSI devices (lpfc, qla2xxx,
SRP...)
Turns out one big issue causes the performance regression: requests
are still dequeu
Hi Martin,
On Sat, Sep 30, 2017 at 11:47:10AM +0200, Martin Steigerwald wrote:
> Hi Ming.
>
> Ming Lei - 30.09.17, 14:12:
> > Please consider this patchset for V4.15, and it fixes one
> > kind of long-term I/O hang issue in either block legacy path
> > or blk-mq.
> >
> > The current SCSI quiesce
Hi Ming.
Ming Lei - 30.09.17, 14:12:
> Please consider this patchset for V4.15, and it fixes one
> kind of long-term I/O hang issue in either block legacy path
> or blk-mq.
>
> The current SCSI quiesce isn't safe and easy to trigger I/O deadlock.
Isn't that material for -stable as well?
I'd love
Just one more note---
IO merging is not happening properly right now.
It's easy to get a case together where basically all the writeback is
sequential. E.g. if your writeback dirty data target is 15GB, do
something like:
$ sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1
--n
On Sat, Sep 30, 2017 at 1:03 AM, Coly Li wrote:
>>> If writeback_rate is not minimum value, it means there are front end
>>> write requests existing.
>>
>> This is wrong. Else we'd never make it back to the target.
>>
>
> Of course we can :-) When dirty data is far beyond the dirty percent,
> writebac
On 09/30/2017 03:49 AM, weiping zhang wrote:
> add an option that disables the io scheduler for the null block device.
Applied for 4.15, thanks.
--
Jens Axboe
On 09/30/2017 10:15 AM, Jens Axboe wrote:
> On 09/30/2017 09:01 AM, weiping zhang wrote:
>> since 9bddeb2a5b981 "blk-mq: make per-sw-queue bio merge as default
>> .bio_merge"
>> there is no caller for this function.
>
> Looks like this should change the new check to actually use this function
> i
On 09/30/2017 09:01 AM, weiping zhang wrote:
> since 9bddeb2a5b981 "blk-mq: make per-sw-queue bio merge as default
> .bio_merge"
> there is no caller for this function.
Looks like this should change the new check to actually use this function
instead, or we will be doing merges if someone has tur
Currently we only unref the async cfqqs in cfq_pd_offline, which would
not be called when CONFIG_CFQ_GROUP_IOSCHED is disabled.
Kmemleak reported:
unreferenced object 0xffc0cd9fc000 (size 240):
comm "kworker/3:1", pid 52, jiffies 4294673527 (age 97.149s)
hex dump (first 32 bytes):
01 0
For some reason, the laptop mode IO completion notification was never wired
up for blk-mq. Ensure that we trigger the callback appropriately, to arm
the laptop mode flush timer.
Signed-off-by: Jens Axboe
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 98a18609755e..09e92667be98 100644
--- a/block
On 2017/9/30 3:13 PM, Michael Lyle wrote:
> Coly--
>
>
> On Fri, Sep 29, 2017 at 11:58 PM, Coly Li wrote:
>> On 2017/9/30 11:17 AM, Michael Lyle wrote:
>> [snip]
>>
>> If writeback_rate is not minimum value, it means there are front end
>> write requests existing.
>
> This is wrong. Else we'd ne
On Sat, Sep 30, 2017 at 3:01 PM, weiping zhang
wrote:
> since 9bddeb2a5b981 "blk-mq: make per-sw-queue bio merge as default
> .bio_merge"
> there is no caller for this function.
>
> Signed-off-by: weiping zhang
Reviewed-by: Ming Lei
--
Ming Lei
Actually-- I give up.
You've initially bounced every single one of my patches, even the ones
that fix crash & data corruption bugs.
I spend 10x as much time fighting with you as writing stuff for
bcache, and basically every time it's turned out that you're wrong.
I will go do something else wher
Coly--
On Fri, Sep 29, 2017 at 11:58 PM, Coly Li wrote:
> On 2017/9/30 11:17 AM, Michael Lyle wrote:
> [snip]
>
> If writeback_rate is not minimum value, it means there are front end
> write requests existing.
This is wrong. Else we'd never make it back to the target.
> In this case, backend w
since 9bddeb2a5b981 "blk-mq: make per-sw-queue bio merge as default .bio_merge"
there is no caller for this function.
Signed-off-by: weiping zhang
---
block/blk-mq.c | 6 --
1 file changed, 6 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f84d145..520d257 100644
--- a/block