Hi Guys,

The 1st patch removes the blk_mq_delay_run_hw_queue() workaround in the
requeue case. The workaround isn't necessary and, worse, it stops
BLK_MQ_S_SCHED_RESTART from working and degrades I/O performance.

The 2nd patch returns DM_MAPIO_REQUEUE to dm-rq if allocation of the
underlying request fails, so that dm-rq can return BLK_STS_RESOURCE to
blk-mq and blk-mq can hold back the requests that are yet to be dequeued.
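
As a rough illustration of the status flow in patch 2, here is a minimal
userspace sketch. It is not the kernel code: the enums only mirror the
kernel's names (their values are arbitrary), and struct lower_queue,
sketch_mpath_map() and sketch_dm_rq_queue() are hypothetical names made
up for this example.

```c
/* Userspace stand-ins: names mirror the kernel's, values are arbitrary. */
enum dm_mapio { DM_MAPIO_REMAPPED, DM_MAPIO_REQUEUE };
enum blk_sts  { BLK_STS_OK, BLK_STS_RESOURCE };

/* Hypothetical model of the underlying queue's request pool. */
struct lower_queue {
	int free_requests;
};

/* Models the dm-mpath map step: allocating the underlying request may fail. */
static enum dm_mapio sketch_mpath_map(struct lower_queue *q)
{
	if (q->free_requests == 0)
		return DM_MAPIO_REQUEUE;	/* nothing to allocate: ask dm-rq to requeue */
	q->free_requests--;
	return DM_MAPIO_REMAPPED;
}

/* Models the dm-rq step: translate the map result into a status for blk-mq. */
static enum blk_sts sketch_dm_rq_queue(struct lower_queue *q)
{
	if (sketch_mpath_map(q) == DM_MAPIO_REQUEUE)
		return BLK_STS_RESOURCE;	/* tells the caller to stop and retry later */
	return BLK_STS_OK;
}
```

With one free request in the model, the first call to
sketch_dm_rq_queue() yields BLK_STS_OK and the second yields
BLK_STS_RESOURCE.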

The other 3 patches change the blk-mq part of blk_insert_cloned_request()
to use blk_mq_try_issue_directly(), so that both dm-rq and blk-mq get
the dispatch result of the underlying queue. With this information,
blk-mq can handle I/O merging much better, which improves sequential
I/O performance considerably.
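
The idea of handing the dispatch result back to the caller can be
sketched in userspace as below. This is only a model under stated
assumptions: ISSUE_BUSY stands in for BLK_STS_RESOURCE, and struct
underlying_queue, sketch_issue_directly() and sketch_dispatch() are
hypothetical names, not functions from the patches.

```c
/* Stand-ins for the dispatch result; ISSUE_BUSY plays BLK_STS_RESOURCE. */
enum issue_result { ISSUE_OK, ISSUE_BUSY };

/* Hypothetical underlying queue with a fixed number of free driver slots. */
struct underlying_queue {
	int free_slots;
};

/* Models direct issue: the result is returned to the caller, not hidden. */
static enum issue_result sketch_issue_directly(struct underlying_queue *q)
{
	if (q->free_slots == 0)
		return ISSUE_BUSY;
	q->free_slots--;
	return ISSUE_OK;
}

/*
 * Models the caller: issue up to nr requests and stop at the first busy
 * result. Returns how many were issued; the rest stay queued in the
 * upper layer, where later sequential I/O can still be merged into them.
 */
static int sketch_dispatch(struct underlying_queue *q, int nr)
{
	int issued = 0;

	while (issued < nr && sketch_issue_directly(q) == ISSUE_OK)
		issued++;
	return issued;
}
```

With two free slots and five pending requests, sketch_dispatch() issues
two and leaves three queued instead of pushing them down blindly.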

In my dm-mpath over virtio-scsi test, the whole patchset improves
sequential I/O performance by 3X ~ 5X.

V3:
        - rebase on the latest for-4.16/block of the block tree
        - add the missing pg_init_all_paths() call in patch 1, according
        to Bart's review

V2:
        - drop 'dm-mpath: cache ti->clone during requeue', which is a bit
        too complicated and shows no obvious performance improvement
        - make the blk-mq changes cleaner

Ming Lei (5):
  dm-mpath: don't call blk_mq_delay_run_hw_queue() in case of
    BLK_STS_RESOURCE
  dm-mpath: return DM_MAPIO_REQUEUE in case of rq allocation failure
  blk-mq: move actual issue into one helper
  blk-mq: return dispatch result to caller in blk_mq_try_issue_directly
  blk-mq: issue request directly for blk_insert_cloned_request

 block/blk-core.c      |  3 +-
 block/blk-mq.c        | 86 +++++++++++++++++++++++++++++++++++++++------------
 block/blk-mq.h        |  3 ++
 drivers/md/dm-mpath.c | 19 +++++++++---
 drivers/md/dm-rq.c    | 20 +++++++++---
 5 files changed, 101 insertions(+), 30 deletions(-)

-- 
2.9.5
