On Sun, Jun 02, 2019 at 02:42:02PM +0800, Ming Lei wrote:
> Hi Kashyap,
>
> Thanks for your test.
>
> On Sun, Jun 02, 2019 at 03:11:26AM +0530, Kashyap Desai wrote:
> > > SCSI's reply queue is very similar to blk-mq's hw queue: both are
> > > assigned by IRQ vector, so map the private reply queue into blk-mq's
> > > hw queue via .host_tagset.
> > >
> > > Then the private reply mapping can be removed.
> > >
> > > Another benefit is that the lost-request/IRQ issue may be solved in a
> > > generic way, because managed IRQs may be shut down during CPU hotplug.
> >
> > Ming,
> >
> > I quickly tested this patch series on a MegaRaid Aero controller.
> > Without this patch I get 3.0M IOPS, but once I apply it I see only
> > 1.2M IOPS (a 60% performance drop).
> > The HBA supports a can_queue of 5089.
> >
> > <perf top> output without patch -
> >
> > 3.39% [megaraid_sas] [k] complete_cmd_fusion
> > 3.36% [kernel] [k] scsi_queue_rq
> > 3.26% [kernel] [k] entry_SYSCALL_64
> > 2.57% [kernel] [k] syscall_return_via_sysret
> > 1.95% [megaraid_sas] [k] megasas_build_and_issue_cmd_fusion
> > 1.88% [kernel] [k] _raw_spin_lock_irqsave
> > 1.79% [kernel] [k] gup_pmd_range
> > 1.73% [kernel] [k] _raw_spin_lock
> > 1.68% [kernel] [k] __sched_text_start
> > 1.19% [kernel] [k] irq_entries_start
> > 1.13% [kernel] [k] scsi_dec_host_busy
> > 1.08% [kernel] [k] aio_complete
> > 1.07% [kernel] [k] read_tsc
> > 1.01% [kernel] [k] blk_mq_get_request
> > 0.93% [kernel] [k] __update_load_avg_cfs_rq
> > 0.92% [kernel] [k] aio_read_events
> > 0.91% [kernel] [k] lookup_ioctx
> > 0.91% fio [.] fio_gettime
> > 0.87% [kernel] [k] set_next_entity
> > 0.87% [megaraid_sas] [k] megasas_build_ldio_fusion
> >
> > <perf top> output with patch -
> >
> > 11.30% [kernel] [k] native_queued_spin_lock_slowpath
>
> I guess there must be a global lock taken in the megaraid submission
> path. Could you run 'perf record -g -a' to see which lock it is and
> what the stack trace looks like?
Meanwhile, please try the following patch and see if it makes a difference.
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 49d73d979cb3..d2abec3b0f60 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -589,7 +589,7 @@ static void __blk_mq_complete_request(struct request *rq)
* So complete IO reqeust in softirq context in case of single queue
* for not degrading IO performance by irqsoff latency.
*/
- if (q->nr_hw_queues == 1) {
+ if (q->nr_hw_queues == 1 || (rq->mq_hctx->flags & BLK_MQ_F_HOST_TAGS)) {
__blk_complete_request(rq);
return;
}
@@ -1977,7 +1977,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
/* bypass scheduler for flush rq */
blk_insert_flush(rq);
blk_mq_run_hw_queue(data.hctx, true);
- } else if (plug && (q->nr_hw_queues == 1 || q->mq_ops->commit_rqs)) {
+ } else if (plug && (q->nr_hw_queues == 1 || q->mq_ops->commit_rqs ||
+ (data.hctx->flags & BLK_MQ_F_HOST_TAGS))) {
/*
* Use plugging if we have a ->commit_rqs() hook as well, as
* we know the driver uses bd->last in a smart fashion.
thanks,
Ming