Re: [PATCH 05/14] blk-mq-sched: don't dequeue request until all in ->dispatch are flushed
On Tue, 2017-08-01 at 00:51 +0800, Ming Lei wrote:
> During dispatch, we moved all requests from hctx->dispatch to
> one temporary list, then dispatch them one by one from this list.
> Unfortunately during this period, run queue from other contexts
> may think the queue is idle and start to dequeue from sw/scheduler
> queue and try to dispatch because ->dispatch is empty.
>
> This way will hurt sequential I/O performance because requests are
> dequeued when queue is busy.
>
> Signed-off-by: Ming Lei
> ---
>  block/blk-mq-sched.c   | 24 ++--
>  include/linux/blk-mq.h |  1 +
>  2 files changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 3510c01cb17b..eb638063673f 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -112,8 +112,15 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	 */
>  	if (!list_empty_careful(&hctx->dispatch)) {
>  		spin_lock(&hctx->lock);
> -		if (!list_empty(&hctx->dispatch))
> +		if (!list_empty(&hctx->dispatch)) {
>  			list_splice_init(&hctx->dispatch, &rq_list);
> +
> +			/*
> +			 * BUSY won't be cleared until all requests
> +			 * in hctx->dispatch are dispatched successfully
> +			 */
> +			set_bit(BLK_MQ_S_BUSY, &hctx->state);
> +		}
>  		spin_unlock(&hctx->lock);
>  	}
>
> @@ -129,15 +136,20 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	if (!list_empty(&rq_list)) {
>  		blk_mq_sched_mark_restart_hctx(hctx);
>  		can_go = blk_mq_dispatch_rq_list(q, &rq_list);
> -	} else if (!has_sched_dispatch && !q->queue_depth) {
> -		blk_mq_flush_busy_ctxs(hctx, &rq_list);
> -		blk_mq_dispatch_rq_list(q, &rq_list);
> -		can_go = false;
> +		if (can_go)
> +			clear_bit(BLK_MQ_S_BUSY, &hctx->state);
>  	}
>
> -	if (!can_go)
> +	/* can't go until ->dispatch is flushed */
> +	if (!can_go || test_bit(BLK_MQ_S_BUSY, &hctx->state))
>  		return;
>
> +	if (!has_sched_dispatch && !q->queue_depth) {
> +		blk_mq_flush_busy_ctxs(hctx, &rq_list);
> +		blk_mq_dispatch_rq_list(q, &rq_list);
> +		return;
> +	}

Hello Ming,

Since setting, clearing and testing of BLK_MQ_S_BUSY can happen concurrently, and since clearing and testing happen without any locks held, I'm afraid this patch introduces the following race conditions:

* Clearing of BLK_MQ_S_BUSY immediately after this bit has been set, resulting in this bit not being set although there are requests on the dispatch list.

* Checking BLK_MQ_S_BUSY after requests have been added to the dispatch list but before that bit is set, resulting in test_bit(BLK_MQ_S_BUSY, &hctx->state) reporting that BLK_MQ_S_BUSY has not been set although there are requests on the dispatch list.

* Checking BLK_MQ_S_BUSY after requests have been removed from the dispatch list but before that bit is cleared, resulting in test_bit(BLK_MQ_S_BUSY, &hctx->state) reporting that BLK_MQ_S_BUSY has been set although there are no requests on the dispatch list.

Bart.
Re: [PATCH 04/14] blk-mq-sched: improve dispatching from sw queue
On Tue, 2017-08-01 at 00:51 +0800, Ming Lei wrote:
> SCSI devices use host-wide tagset, and the shared
> driver tag space is often quite big. Meantime
> there is also queue depth for each lun (.cmd_per_lun),
> which is often small.
>
> So lots of requests may stay in sw queue, and we
> always flush all belonging to same hw queue and
> dispatch them all to driver, unfortunately it is
> easy to cause queue busy because of the small
> per-lun queue depth. Once these requests are flushed
> out, they have to stay in hctx->dispatch, and no bio
> merge can participate into these requests, and
> sequential IO performance is hurt.
>
> This patch improves dispatching from sw queue when
> there is per-request-queue queue depth by taking
> request one by one from sw queue, just like the way
> of IO scheduler.
>
> Signed-off-by: Ming Lei
> ---
>  block/blk-mq-sched.c | 25 +++--
>  1 file changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 47a25333a136..3510c01cb17b 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -96,6 +96,9 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request;
>  	bool can_go = true;
>  	LIST_HEAD(rq_list);
> +	struct request *(*dispatch_fn)(struct blk_mq_hw_ctx *) =
> +		has_sched_dispatch ? e->type->ops.mq.dispatch_request :
> +		blk_mq_dispatch_rq_from_ctxs;
>
>  	/* RCU or SRCU read lock is needed before checking quiesced flag */
>  	if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)))
> @@ -126,26 +129,28 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	if (!list_empty(&rq_list)) {
>  		blk_mq_sched_mark_restart_hctx(hctx);
>  		can_go = blk_mq_dispatch_rq_list(q, &rq_list);
> -	} else if (!has_sched_dispatch) {
> +	} else if (!has_sched_dispatch && !q->queue_depth) {
>  		blk_mq_flush_busy_ctxs(hctx, &rq_list);
>  		blk_mq_dispatch_rq_list(q, &rq_list);
> +		can_go = false;
>  	}
>
> +	if (!can_go)
> +		return;
> +
>  	/*
>  	 * We want to dispatch from the scheduler if we had no work left
>  	 * on the dispatch list, OR if we did have work but weren't able
>  	 * to make progress.
>  	 */
> -	if (can_go && has_sched_dispatch) {
> -		do {
> -			struct request *rq;
> -
> -			rq = e->type->ops.mq.dispatch_request(hctx);
> -			if (!rq)
> -				break;
> -			list_add(&rq->queuelist, &rq_list);
> -		} while (blk_mq_dispatch_rq_list(q, &rq_list));
> -	}
> +	do {
> +		struct request *rq;
> +
> +		rq = dispatch_fn(hctx);
> +		if (!rq)
> +			break;
> +		list_add(&rq->queuelist, &rq_list);
> +	} while (blk_mq_dispatch_rq_list(q, &rq_list));
> }

Hello Ming,

Although I like the idea behind this patch, I'm afraid that this patch will cause a performance regression for high-performance SCSI LLD drivers, e.g. ib_srp. Have you considered reworking this patch as follows:

* Remove the code under "else if (!has_sched_dispatch && !q->queue_depth) {".

* Modify all blk_mq_dispatch_rq_list() functions such that these dispatch up to cmd_per_lun - (number of requests in progress) at once.

Thanks,

Bart.
Re: [PATCH 03/14] blk-mq: introduce blk_mq_dispatch_rq_from_ctxs()
On Tue, 2017-08-01 at 00:51 +0800, Ming Lei wrote:
> @@ -810,7 +810,11 @@ static void blk_mq_timeout_work(struct work_struct *work)
>
>  struct ctx_iter_data {
>  	struct blk_mq_hw_ctx	*hctx;
> -	struct list_head	*list;
> +
> +	union {
> +		struct list_head	*list;
> +		struct request		*rq;
> +	};
>  };

Hello Ming,

Please introduce a new data structure for dispatch_rq_from_ctx() / blk_mq_dispatch_rq_from_ctxs() instead of introducing a union in struct ctx_iter_data. That will prevent .list from being used in a context where a struct request * pointer has been stored in the structure, and vice versa.

>  static bool flush_busy_ctx(struct sbitmap *sb, unsigned int bitnr, void *data)
> @@ -826,6 +830,26 @@ static bool flush_busy_ctx(struct sbitmap *sb, unsigned int bitnr, void *data)
>  	return true;
>  }
>
> +static bool dispatch_rq_from_ctx(struct sbitmap *sb, unsigned int bitnr, void *data)
> +{
> +	struct ctx_iter_data *dispatch_data = data;
> +	struct blk_mq_hw_ctx *hctx = dispatch_data->hctx;
> +	struct blk_mq_ctx *ctx = hctx->ctxs[bitnr];
> +	bool empty = true;
> +
> +	spin_lock(&ctx->lock);
> +	if (unlikely(!list_empty(&ctx->rq_list))) {
> +		dispatch_data->rq = list_entry_rq(ctx->rq_list.next);
> +		list_del_init(&dispatch_data->rq->queuelist);
> +		empty = list_empty(&ctx->rq_list);
> +	}
> +	spin_unlock(&ctx->lock);
> +	if (empty)
> +		sbitmap_clear_bit(sb, bitnr);

This sbitmap_clear_bit() occurs without holding blk_mq_ctx.lock. Sorry but I don't think this is safe. Please either remove this sbitmap_clear_bit() call or make sure that it happens with blk_mq_ctx.lock held.

Thanks,

Bart.
Re: [PATCH 02/14] blk-mq: rename flush_busy_ctx_data as ctx_iter_data
On Tue, 2017-08-01 at 00:50 +0800, Ming Lei wrote: > The following patch need to reuse this data structure, > so rename as one generic name. Hello Ming, Please drop this patch (see also my comments on the next patch). Thanks, Bart.
Re: [PATCH 01/14] blk-mq-sched: fix scheduler bad performance
On Tue, 2017-08-01 at 00:50 +0800, Ming Lei wrote:
> When hw queue is busy, we shouldn't take requests from
> scheduler queue any more, otherwise IO merge will be
> difficult to do.
>
> This patch fixes the awful IO performance on some
> SCSI devices (lpfc, qla2xxx, ...) when mq-deadline/kyber
> is used by not taking requests if hw queue is busy.
>
> Signed-off-by: Ming Lei
> ---
>  block/blk-mq-sched.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 4ab69435708c..47a25333a136 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -94,7 +94,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	struct request_queue *q = hctx->queue;
>  	struct elevator_queue *e = q->elevator;
>  	const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request;
> -	bool did_work = false;
> +	bool can_go = true;
>  	LIST_HEAD(rq_list);
>
>  	/* RCU or SRCU read lock is needed before checking quiesced flag */
> @@ -125,7 +125,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	 */
>  	if (!list_empty(&rq_list)) {
>  		blk_mq_sched_mark_restart_hctx(hctx);
> -		did_work = blk_mq_dispatch_rq_list(q, &rq_list);
> +		can_go = blk_mq_dispatch_rq_list(q, &rq_list);
>  	} else if (!has_sched_dispatch) {
>  		blk_mq_flush_busy_ctxs(hctx, &rq_list);
>  		blk_mq_dispatch_rq_list(q, &rq_list);
> @@ -136,7 +136,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	 * on the dispatch list, OR if we did have work but weren't able
>  	 * to make progress.
>  	 */
> -	if (!did_work && has_sched_dispatch) {
> +	if (can_go && has_sched_dispatch) {
>  		do {
>  			struct request *rq;

Hello Ming,

Please choose a better name for the new variable, e.g. "do_sched_dispatch". Otherwise this patch looks fine to me. Hence:

Reviewed-by: Bart Van Assche

Bart.
Re: [patch 3/5] scsi/bnx2i: Prevent recursive cpuhotplug locking
On Mon, 24 Jul 2017 12:52:58 +0200 Thomas Gleixner wrote:
> The BNX2I module init/exit code installs/removes the hotplug callbacks with
> the cpu hotplug lock held. This worked with the old CPU locking
> implementation which allowed recursive locking, but with the new percpu
> rwsem based mechanism this is no longer allowed.
>
> Use the _cpuslocked() variants to fix this.
>
> Reported-by: Steven Rostedt

Tested-by: Steven Rostedt (VMware) (makes the lockdep splat go away)

-- Steve

> Signed-off-by: Thomas Gleixner
> ---
>  drivers/scsi/bnx2i/bnx2i_init.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> --- a/drivers/scsi/bnx2i/bnx2i_init.c
> +++ b/drivers/scsi/bnx2i/bnx2i_init.c
> @@ -516,15 +516,16 @@ static int __init bnx2i_mod_init(void)
>  	for_each_online_cpu(cpu)
>  		bnx2i_percpu_thread_create(cpu);
>
> -	err = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> -					"scsi/bnx2i:online",
> -					bnx2i_cpu_online, NULL);
> +	err = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN,
> +						   "scsi/bnx2i:online",
> +						   bnx2i_cpu_online, NULL);
>  	if (err < 0)
>  		goto remove_threads;
>  	bnx2i_online_state = err;
>
> -	cpuhp_setup_state_nocalls(CPUHP_SCSI_BNX2I_DEAD, "scsi/bnx2i:dead",
> -				  NULL, bnx2i_cpu_dead);
> +	cpuhp_setup_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2I_DEAD,
> +					     "scsi/bnx2i:dead",
> +					     NULL, bnx2i_cpu_dead);
>  	put_online_cpus();
>  	return 0;
>
> @@ -574,8 +575,8 @@ static void __exit bnx2i_mod_exit(void)
>  	for_each_online_cpu(cpu)
>  		bnx2i_percpu_thread_destroy(cpu);
>
> -	cpuhp_remove_state_nocalls(bnx2i_online_state);
> -	cpuhp_remove_state_nocalls(CPUHP_SCSI_BNX2I_DEAD);
> +	cpuhp_remove_state_nocalls_cpuslocked(bnx2i_online_state);
> +	cpuhp_remove_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2I_DEAD);
>  	put_online_cpus();
>
>  	iscsi_unregister_transport(&bnx2i_iscsi_transport);
>
[PATCH 1/1] qla2xxx: Fix system crash while triggering FW dump
From: Michael Hernandez

This patch fixes system hang/crash while firmware dump is attempted with Block MQ enabled in qla2xxx driver. Fix is to remove check in fw dump template entries for existing request and response queues so that full buffer size is calculated during template size calculation.

The following stack trace is seen during the firmware dump capture process:

[  694.390588] qla2xxx [:81:00.0]-5003:11: ISP System Error - mbx1=4b1fh mbx2=10h mbx3=2ah mbx7=0h.
[  694.402336] BUG: unable to handle kernel paging request at c90008c7b000
[  694.402372] IP: memcpy_erms+0x6/0x10
[  694.402386] PGD 105f01a067
[  694.402386] PUD 85f89c067
[  694.402398] PMD 10490cb067
[  694.402409] PTE 0
[  694.402421]
[  694.402437] Oops: 0002 [#1] PREEMPT SMP
[  694.402452] Modules linked in: netconsole configfs qla2xxx scsi_transport_fc nvme_fc nvme_fabrics bnep bluetooth rfkill xt_tcpudp unix_diag xt_multiport ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ipmi_ssif sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass igb crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel iTCO_wdt aes_x86_64 crypto_simd ptp iTCO_vendor_support glue_helper cryptd lpc_ich joydev i2c_i801 pcspkr ioatdma mei_me pps_core tpm_tis mei mfd_core acpi_power_meter tpm_tis_core ipmi_si ipmi_devintf tpm ipmi_msghandler shpchp wmi dca button acpi_pad btrfs xor uas usb_storage hid_generic usbhid raid6_pq crc32c_intel ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
[  694.402692] sysimgblt fb_sys_fops xhci_pci ttm ehci_pci sr_mod xhci_hcd cdrom ehci_hcd drm usbcore sg
[  694.402730] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-1-default+ #19
[  694.402753] Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1a 10/16/2015
[  694.402776] task: 81c0e4c0 task.stack: 81c0
[  694.402798] RIP: 0010:memcpy_erms+0x6/0x10
[  694.402813] RSP: 0018:88085fc03cd0 EFLAGS: 00210006
[  694.402832] RAX: c90008c7ae0c RBX: 0004
RCX: 0001fe0c
[  694.402856] RDX: 0002 RSI: 8810332c01f4 RDI: c90008c7b000
[  694.402879] RBP: 88085fc03d18 R08: 0002 R09: 00279e0a
[  694.402903] R10: R11: f000 R12: 88085fc03d80
[  694.402927] R13: c90008a01000 R14: c90008a056d4 R15: 881052ef17e0
[  694.402951] FS: () GS:88085fc0() knlGS:
[  694.402977] CS: 0010 DS: ES: CR0: 80050033
[  694.403012] CR2: c90008c7b000 CR3: 01c09000 CR4: 001406f0
[  694.403036] Call Trace:
[  694.403047]
[  694.403072] ? qla27xx_fwdt_entry_t263+0x18e/0x380 [qla2xxx]
[  694.403099] qla27xx_walk_template+0x9d/0x1a0 [qla2xxx]
[  694.403124] qla27xx_fwdump+0x1f3/0x272 [qla2xxx]
[  694.403149] qla2x00_async_event+0xb08/0x1a50 [qla2xxx]
[  694.403169] ? enqueue_task_fair+0xa2/0x9d0

Signed-off-by: Mike Hernandez
Signed-off-by: Joe Carnuccio
Signed-off-by: Himanshu Madhani

---
Hi Martin,

Please apply this patch to 4.13.0-rc4. Without this patch our capability to collect and analyze firmware dump in a customer environment will be greatly affected.

Thanks,
Himanshu
---
 drivers/scsi/qla2xxx/qla_tmpl.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_tmpl.c b/drivers/scsi/qla2xxx/qla_tmpl.c
index 33142610882f..b18646d6057f 100644
--- a/drivers/scsi/qla2xxx/qla_tmpl.c
+++ b/drivers/scsi/qla2xxx/qla_tmpl.c
@@ -401,9 +401,6 @@ qla27xx_fwdt_entry_t263(struct scsi_qla_host *vha,
 	for (i = 0; i < vha->hw->max_req_queues; i++) {
 		struct req_que *req = vha->hw->req_q_map[i];
 
-		if (!test_bit(i, vha->hw->req_qid_map))
-			continue;
-
 		if (req || !buf) {
 			length = req ?
 				req->length : REQUEST_ENTRY_CNT_24XX;
@@ -418,9 +415,6 @@ qla27xx_fwdt_entry_t263(struct scsi_qla_host *vha,
 	for (i = 0; i < vha->hw->max_rsp_queues; i++) {
 		struct rsp_que *rsp = vha->hw->rsp_q_map[i];
 
-		if (!test_bit(i, vha->hw->rsp_qid_map))
-			continue;
-
 		if (rsp || !buf) {
 			length = rsp ?
				rsp->length : RESPONSE_ENTRY_CNT_MQ;
@@ -660,9 +654,6 @@ qla27xx_fwdt_entry_t274(struct scsi_qla_host *vha,
 	for (i = 0; i < vha->hw->max_req_queues; i++) {
 		struct req_que *req = vha->hw->req_q_map[i];
 
-		if (!test_bit(i, vha->hw->req_qid_map))
-			continue;
-
 		if (req || !buf) {
 			qla27xx_insert16(i, buf, len);
 			qla27xx_insert16(1, buf, len);
@@ -67
[PATCH 14/14] blk-mq-sched: improve IO scheduling on SCSI device
SCSI devices often have a per-request_queue queue depth (.cmd_per_lun), which is actually applied across all hw queues; this patchset calls this the shared queue depth.

One principle of the scheduler is that we shouldn't dequeue requests from the sw/scheduler queue and dispatch them to the driver when the low-level queue is busy. For SCSI devices, queue busyness depends on the per-request_queue limit, so we should hold all hw queues if the request queue is busy.

This patch introduces a per-request_queue dispatch list for this purpose; only when all requests in this list are dispatched successfully can we restart dequeuing requests from the sw/scheduler queue and dispatching them to the LLD.

Signed-off-by: Ming Lei
---
 block/blk-mq.c         |  8 +++-
 block/blk-mq.h         | 14 +++---
 include/linux/blkdev.h |  5 +
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9b8b3a740d18..6d02901d798e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2667,8 +2667,14 @@ int blk_mq_update_sched_queue_depth(struct request_queue *q)
 	 * this queue depth limit
 	 */
 	if (q->queue_depth) {
-		queue_for_each_hw_ctx(q, hctx, i)
+		queue_for_each_hw_ctx(q, hctx, i) {
 			hctx->flags |= BLK_MQ_F_SHARED_DEPTH;
+			hctx->dispatch_lock = &q->__mq_dispatch_lock;
+			hctx->dispatch_list = &q->__mq_dispatch_list;
+
+			spin_lock_init(hctx->dispatch_lock);
+			INIT_LIST_HEAD(hctx->dispatch_list);
+		}
 	}
 
 	if (!q->elevator)
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a8788058da56..4853d422836f 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -138,19 +138,27 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
 static inline bool blk_mq_hctx_is_busy(struct request_queue *q,
 		struct blk_mq_hw_ctx *hctx)
 {
-	return test_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		return test_bit(BLK_MQ_S_BUSY, &hctx->state);
+	return q->mq_dispatch_busy;
 }
 
 static inline void blk_mq_hctx_set_busy(struct request_queue *q,
 		struct blk_mq_hw_ctx *hctx)
 {
-	set_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		set_bit(BLK_MQ_S_BUSY, &hctx->state);
+	else
+		q->mq_dispatch_busy = 1;
 }
 
 static inline void blk_mq_hctx_clear_busy(struct request_queue *q,
 		struct blk_mq_hw_ctx *hctx)
 {
-	clear_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		clear_bit(BLK_MQ_S_BUSY, &hctx->state);
+	else
+		q->mq_dispatch_busy = 0;
 }
 
 static inline bool blk_mq_has_dispatch_rqs(struct blk_mq_hw_ctx *hctx)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 25f6a0cb27d3..bc0e607710f2 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -395,6 +395,11 @@ struct request_queue {
 	atomic_t		shared_hctx_restart;
 
+	/* blk-mq dispatch list and lock for shared queue depth case */
+	struct list_head	__mq_dispatch_list;
+	spinlock_t		__mq_dispatch_lock;
+	unsigned int		mq_dispatch_busy;
+
 	struct blk_queue_stats	*stats;
 	struct rq_wb		*rq_wb;
-- 
2.9.4
[PATCH 13/14] blk-mq: pass 'request_queue *' to several helpers of operating BUSY
We need to support per-request_queue dispatch list for avoiding early dispatch in case of shared queue depth. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 6 +++--- block/blk-mq.h | 15 +-- 2 files changed, 12 insertions(+), 9 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 8ff74efe4172..37702786c6d1 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -132,7 +132,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) * more fair dispatch. */ if (blk_mq_has_dispatch_rqs(hctx)) - blk_mq_take_list_from_dispatch(hctx, &rq_list); + blk_mq_take_list_from_dispatch(q, hctx, &rq_list); /* * Only ask the scheduler for requests, if we didn't have residual @@ -147,11 +147,11 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) blk_mq_sched_mark_restart_hctx(hctx); can_go = blk_mq_dispatch_rq_list(q, &rq_list); if (can_go) - blk_mq_hctx_clear_busy(hctx); + blk_mq_hctx_clear_busy(q, hctx); } /* can't go until ->dispatch is flushed */ - if (!can_go || blk_mq_hctx_is_busy(hctx)) + if (!can_go || blk_mq_hctx_is_busy(q, hctx)) return; /* diff --git a/block/blk-mq.h b/block/blk-mq.h index d9795cbba1bb..a8788058da56 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -135,17 +135,20 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx) return hctx->nr_ctx && hctx->tags; } -static inline bool blk_mq_hctx_is_busy(struct blk_mq_hw_ctx *hctx) +static inline bool blk_mq_hctx_is_busy(struct request_queue *q, + struct blk_mq_hw_ctx *hctx) { return test_bit(BLK_MQ_S_BUSY, &hctx->state); } -static inline void blk_mq_hctx_set_busy(struct blk_mq_hw_ctx *hctx) +static inline void blk_mq_hctx_set_busy(struct request_queue *q, + struct blk_mq_hw_ctx *hctx) { set_bit(BLK_MQ_S_BUSY, &hctx->state); } -static inline void blk_mq_hctx_clear_busy(struct blk_mq_hw_ctx *hctx) +static inline void blk_mq_hctx_clear_busy(struct request_queue *q, + struct blk_mq_hw_ctx *hctx) { clear_bit(BLK_MQ_S_BUSY, &hctx->state); } @@ 
-179,8 +182,8 @@ static inline void blk_mq_add_list_to_dispatch_tail(struct blk_mq_hw_ctx *hctx, spin_unlock(hctx->dispatch_lock); } -static inline void blk_mq_take_list_from_dispatch(struct blk_mq_hw_ctx *hctx, - struct list_head *list) +static inline void blk_mq_take_list_from_dispatch(struct request_queue *q, + struct blk_mq_hw_ctx *hctx, struct list_head *list) { spin_lock(hctx->dispatch_lock); list_splice_init(hctx->dispatch_list, list); @@ -190,7 +193,7 @@ static inline void blk_mq_take_list_from_dispatch(struct blk_mq_hw_ctx *hctx, * in hctx->dispatch are dispatched successfully */ if (!list_empty(list)) - blk_mq_hctx_set_busy(hctx); + blk_mq_hctx_set_busy(q, hctx); spin_unlock(hctx->dispatch_lock); } -- 2.9.4
[PATCH 12/14] blk-mq: introduce pointers to dispatch lock & list
Prepare to support per-request-queue dispatch list, so introduce dispatch lock and list for avoiding to do runtime check. Signed-off-by: Ming Lei --- block/blk-mq-debugfs.c | 10 +- block/blk-mq.c | 7 +-- block/blk-mq.h | 26 +- include/linux/blk-mq.h | 3 +++ 4 files changed, 26 insertions(+), 20 deletions(-) diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index c4f70b453c76..4f8cddb8505f 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -370,23 +370,23 @@ static void *hctx_dispatch_start(struct seq_file *m, loff_t *pos) { struct blk_mq_hw_ctx *hctx = m->private; - spin_lock(&hctx->lock); - return seq_list_start(&hctx->dispatch, *pos); + spin_lock(hctx->dispatch_lock); + return seq_list_start(hctx->dispatch_list, *pos); } static void *hctx_dispatch_next(struct seq_file *m, void *v, loff_t *pos) { struct blk_mq_hw_ctx *hctx = m->private; - return seq_list_next(v, &hctx->dispatch, pos); + return seq_list_next(v, hctx->dispatch_list, pos); } static void hctx_dispatch_stop(struct seq_file *m, void *v) - __releases(&hctx->lock) + __releases(hctx->dispatch_lock) { struct blk_mq_hw_ctx *hctx = m->private; - spin_unlock(&hctx->lock); + spin_unlock(hctx->dispatch_lock); } static const struct seq_operations hctx_dispatch_seq_ops = { diff --git a/block/blk-mq.c b/block/blk-mq.c index 785145f60c1d..9b8b3a740d18 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1925,8 +1925,11 @@ static void blk_mq_exit_hw_queues(struct request_queue *q, static void blk_mq_init_dispatch(struct request_queue *q, struct blk_mq_hw_ctx *hctx) { - spin_lock_init(&hctx->lock); - INIT_LIST_HEAD(&hctx->dispatch); + hctx->dispatch_lock = &hctx->lock; + hctx->dispatch_list = &hctx->dispatch; + + spin_lock_init(hctx->dispatch_lock); + INIT_LIST_HEAD(hctx->dispatch_list); } static int blk_mq_init_hctx(struct request_queue *q, diff --git a/block/blk-mq.h b/block/blk-mq.h index 2ed355881996..d9795cbba1bb 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -152,38 +152,38 @@ 
static inline void blk_mq_hctx_clear_busy(struct blk_mq_hw_ctx *hctx) static inline bool blk_mq_has_dispatch_rqs(struct blk_mq_hw_ctx *hctx) { - return !list_empty_careful(&hctx->dispatch); + return !list_empty_careful(hctx->dispatch_list); } static inline void blk_mq_add_rq_to_dispatch(struct blk_mq_hw_ctx *hctx, struct request *rq) { - spin_lock(&hctx->lock); - list_add(&rq->queuelist, &hctx->dispatch); - spin_unlock(&hctx->lock); + spin_lock(hctx->dispatch_lock); + list_add(&rq->queuelist, hctx->dispatch_list); + spin_unlock(hctx->dispatch_lock); } static inline void blk_mq_add_list_to_dispatch(struct blk_mq_hw_ctx *hctx, struct list_head *list) { - spin_lock(&hctx->lock); - list_splice_init(list, &hctx->dispatch); - spin_unlock(&hctx->lock); + spin_lock(hctx->dispatch_lock); + list_splice_init(list, hctx->dispatch_list); + spin_unlock(hctx->dispatch_lock); } static inline void blk_mq_add_list_to_dispatch_tail(struct blk_mq_hw_ctx *hctx, struct list_head *list) { - spin_lock(&hctx->lock); - list_splice_tail_init(list, &hctx->dispatch); - spin_unlock(&hctx->lock); + spin_lock(hctx->dispatch_lock); + list_splice_tail_init(list, hctx->dispatch_list); + spin_unlock(hctx->dispatch_lock); } static inline void blk_mq_take_list_from_dispatch(struct blk_mq_hw_ctx *hctx, struct list_head *list) { - spin_lock(&hctx->lock); - list_splice_init(&hctx->dispatch, list); + spin_lock(hctx->dispatch_lock); + list_splice_init(hctx->dispatch_list, list); /* * BUSY won't be cleared until all requests @@ -191,7 +191,7 @@ static inline void blk_mq_take_list_from_dispatch(struct blk_mq_hw_ctx *hctx, */ if (!list_empty(list)) blk_mq_hctx_set_busy(hctx); - spin_unlock(&hctx->lock); + spin_unlock(hctx->dispatch_lock); } #endif diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 14f2ad3af31f..016f16c48f72 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -22,6 +22,9 @@ struct blk_mq_hw_ctx { unsigned long flags; /* BLK_MQ_F_* flags */ + spinlock_t 
*dispatch_lock; + struct list_head*dispatch_list; + void*sched_data; struct request_queue*queue; struct blk_flush_queue *fq; -- 2.9.4
[PATCH 11/14] blk-mq: introduce helpers for operating ->dispatch list
Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 19 +++ block/blk-mq.c | 18 +++--- block/blk-mq.h | 44 3 files changed, 58 insertions(+), 23 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 112270961af0..8ff74efe4172 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -131,19 +131,8 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) * If we have previous entries on our dispatch list, grab them first for * more fair dispatch. */ - if (!list_empty_careful(&hctx->dispatch)) { - spin_lock(&hctx->lock); - if (!list_empty(&hctx->dispatch)) { - list_splice_init(&hctx->dispatch, &rq_list); - - /* -* BUSY won't be cleared until all requests -* in hctx->dispatch are dispatched successfully -*/ - blk_mq_hctx_set_busy(hctx); - } - spin_unlock(&hctx->lock); - } + if (blk_mq_has_dispatch_rqs(hctx)) + blk_mq_take_list_from_dispatch(hctx, &rq_list); /* * Only ask the scheduler for requests, if we didn't have residual @@ -296,9 +285,7 @@ static bool blk_mq_sched_bypass_insert(struct blk_mq_hw_ctx *hctx, * If we already have a real request tag, send directly to * the dispatch list. 
*/ - spin_lock(&hctx->lock); - list_add(&rq->queuelist, &hctx->dispatch); - spin_unlock(&hctx->lock); + blk_mq_add_rq_to_dispatch(hctx, rq); return true; } diff --git a/block/blk-mq.c b/block/blk-mq.c index db635ef06a72..785145f60c1d 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -63,7 +63,7 @@ static int blk_mq_poll_stats_bkt(const struct request *rq) bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx) { return sbitmap_any_bit_set(&hctx->ctx_map) || - !list_empty_careful(&hctx->dispatch) || + blk_mq_has_dispatch_rqs(hctx) || blk_mq_sched_has_work(hctx); } @@ -1097,9 +1097,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list) rq = list_first_entry(list, struct request, queuelist); blk_mq_put_driver_tag(rq); - spin_lock(&hctx->lock); - list_splice_init(list, &hctx->dispatch); - spin_unlock(&hctx->lock); + blk_mq_add_list_to_dispatch(hctx, list); /* * If SCHED_RESTART was set by the caller of this function and @@ -1874,9 +1872,7 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, struct hlist_node *node) if (list_empty(&tmp)) return 0; - spin_lock(&hctx->lock); - list_splice_tail_init(&tmp, &hctx->dispatch); - spin_unlock(&hctx->lock); + blk_mq_add_list_to_dispatch_tail(hctx, &tmp); blk_mq_run_hw_queue(hctx, true); return 0; @@ -1926,6 +1922,13 @@ static void blk_mq_exit_hw_queues(struct request_queue *q, } } +static void blk_mq_init_dispatch(struct request_queue *q, + struct blk_mq_hw_ctx *hctx) +{ + spin_lock_init(&hctx->lock); + INIT_LIST_HEAD(&hctx->dispatch); +} + static int blk_mq_init_hctx(struct request_queue *q, struct blk_mq_tag_set *set, struct blk_mq_hw_ctx *hctx, unsigned hctx_idx) @@ -1939,6 +1942,7 @@ static int blk_mq_init_hctx(struct request_queue *q, INIT_DELAYED_WORK(&hctx->run_work, blk_mq_run_work_fn); spin_lock_init(&hctx->lock); INIT_LIST_HEAD(&hctx->dispatch); + blk_mq_init_dispatch(q, hctx); hctx->queue = q; hctx->flags = set->flags & ~BLK_MQ_F_TAG_SHARED; diff --git a/block/blk-mq.h b/block/blk-mq.h 
index d9f875093613..2ed355881996 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -150,4 +150,48 @@ static inline void blk_mq_hctx_clear_busy(struct blk_mq_hw_ctx *hctx) clear_bit(BLK_MQ_S_BUSY, &hctx->state); } +static inline bool blk_mq_has_dispatch_rqs(struct blk_mq_hw_ctx *hctx) +{ + return !list_empty_careful(&hctx->dispatch); +} + +static inline void blk_mq_add_rq_to_dispatch(struct blk_mq_hw_ctx *hctx, + struct request *rq) +{ + spin_lock(&hctx->lock); + list_add(&rq->queuelist, &hctx->dispatch); + spin_unlock(&hctx->lock); +} + +static inline void blk_mq_add_list_to_dispatch(struct blk_mq_hw_ctx *hctx, + struct list_head *list) +{ + spin_lock(&hctx->lock); + list_splice_init(list, &hctx->dispatch); + spin_unlock(&hctx->lock); +} + +static inline void blk_mq_add_list_to_dispatch_tail(struct blk_mq_hw_ctx *hctx, + struct list_head *list) +{ + spin_lock(&hctx->lock); + list_
[PATCH 10/14] blk-mq-sched: introduce helpers for query, change busy state
Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 6 +++--- block/blk-mq.h | 15 +++ 2 files changed, 18 insertions(+), 3 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 07ff53187617..112270961af0 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -140,7 +140,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) * BUSY won't be cleared until all requests * in hctx->dispatch are dispatched successfully */ - set_bit(BLK_MQ_S_BUSY, &hctx->state); + blk_mq_hctx_set_busy(hctx); } spin_unlock(&hctx->lock); } @@ -158,11 +158,11 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) blk_mq_sched_mark_restart_hctx(hctx); can_go = blk_mq_dispatch_rq_list(q, &rq_list); if (can_go) - clear_bit(BLK_MQ_S_BUSY, &hctx->state); + blk_mq_hctx_clear_busy(hctx); } /* can't go until ->dispatch is flushed */ - if (!can_go || test_bit(BLK_MQ_S_BUSY, &hctx->state)) + if (!can_go || blk_mq_hctx_is_busy(hctx)) return; /* diff --git a/block/blk-mq.h b/block/blk-mq.h index 44d3aaa03d7c..d9f875093613 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -135,4 +135,19 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx) return hctx->nr_ctx && hctx->tags; } +static inline bool blk_mq_hctx_is_busy(struct blk_mq_hw_ctx *hctx) +{ + return test_bit(BLK_MQ_S_BUSY, &hctx->state); +} + +static inline void blk_mq_hctx_set_busy(struct blk_mq_hw_ctx *hctx) +{ + set_bit(BLK_MQ_S_BUSY, &hctx->state); +} + +static inline void blk_mq_hctx_clear_busy(struct blk_mq_hw_ctx *hctx) +{ + clear_bit(BLK_MQ_S_BUSY, &hctx->state); +} + #endif -- 2.9.4
[PATCH 08/14] blk-mq: introduce BLK_MQ_F_SHARED_DEPTH
SCSI devices often provide a per-request_queue depth via q->queue_depth (.cmd_per_lun), which is a global limit on all hw queues. After the pending I/O submitted to one request queue reaches this limit, BLK_STS_RESOURCE will be returned to all dispatch paths. That means when one hw queue is stuck, actually all hctxs are stuck too.

This flag is introduced to improve blk-mq IO scheduling on this kind of device.

Signed-off-by: Ming Lei
---
 block/blk-mq-debugfs.c |  1 +
 block/blk-mq-sched.c   |  2 +-
 block/blk-mq.c         | 25 ++---
 include/linux/blk-mq.h |  1 +
 4 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 9ebc2945f991..c4f70b453c76 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -209,6 +209,7 @@ static const char *const hctx_flag_name[] = {
 	HCTX_FLAG_NAME(SG_MERGE),
 	HCTX_FLAG_NAME(BLOCKING),
 	HCTX_FLAG_NAME(NO_SCHED),
+	HCTX_FLAG_NAME(SHARED_DEPTH),
 };
 #undef HCTX_FLAG_NAME
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 3eb524ccb7aa..cc0687a4d0ab 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -144,7 +144,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 	if (!can_go || test_bit(BLK_MQ_S_BUSY, &hctx->state))
 		return;
 
-	if (!has_sched_dispatch && !q->queue_depth) {
+	if (!has_sched_dispatch && !(hctx->flags & BLK_MQ_F_SHARED_DEPTH)) {
 		blk_mq_flush_busy_ctxs(hctx, &rq_list);
 		blk_mq_dispatch_rq_list(q, &rq_list);
 		return;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7df68d31bc23..db635ef06a72 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2647,12 +2647,31 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 int blk_mq_update_sched_queue_depth(struct request_queue *q)
 {
 	unsigned nr;
+	struct blk_mq_hw_ctx *hctx;
+	unsigned int i;
+	int ret = 0;
 
-	if (!q->mq_ops || !q->elevator)
-		return 0;
+	if (!q->mq_ops)
+		return ret;
+
+	blk_mq_freeze_queue(q);
+	/*
+	 * if there is q->queue_depth, all hw queues share
+	 * this queue depth limit
+	 */
+	if (q->queue_depth) {
+		queue_for_each_hw_ctx(q, hctx, i)
+			hctx->flags |= BLK_MQ_F_SHARED_DEPTH;
+	}
+
+	if (!q->elevator)
+		goto exit;
 
 	nr = blk_mq_sched_queue_depth(q);
-	return __blk_mq_update_nr_requests(q, true, nr);
+	ret = __blk_mq_update_nr_requests(q, true, nr);
+ exit:
+	blk_mq_unfreeze_queue(q);
+	return ret;
 }
 
 static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 6d44b242b495..14f2ad3af31f 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -164,6 +164,7 @@ enum {
 	BLK_MQ_F_SG_MERGE	= 1 << 2,
 	BLK_MQ_F_BLOCKING	= 1 << 5,
 	BLK_MQ_F_NO_SCHED	= 1 << 6,
+	BLK_MQ_F_SHARED_DEPTH	= 1 << 7,
 
 	BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
 	BLK_MQ_F_ALLOC_POLICY_BITS = 1,
-- 
2.9.4
[PATCH 09/14] blk-mq-sched: cleanup blk_mq_sched_dispatch_requests()
This patch splits blk_mq_sched_dispatch_requests() into two parts: 1) the 1st part checks whether the queue is busy and handles the busy situation 2) the 2nd part is moved to __blk_mq_sched_dispatch_requests(), which focuses on dispatching from the sw queue or the scheduler queue. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 42 +- 1 file changed, 25 insertions(+), 17 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index cc0687a4d0ab..07ff53187617 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -89,16 +89,37 @@ static bool blk_mq_sched_restart_hctx(struct blk_mq_hw_ctx *hctx) return false; } -void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) +static void __blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) { struct request_queue *q = hctx->queue; struct elevator_queue *e = q->elevator; const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request; - bool can_go = true; - LIST_HEAD(rq_list); struct request *(*dispatch_fn)(struct blk_mq_hw_ctx *) = has_sched_dispatch ?
e->type->ops.mq.dispatch_request : blk_mq_dispatch_rq_from_ctxs; + LIST_HEAD(rq_list); + + if (!has_sched_dispatch && !(hctx->flags & BLK_MQ_F_SHARED_DEPTH)) { + blk_mq_flush_busy_ctxs(hctx, &rq_list); + blk_mq_dispatch_rq_list(q, &rq_list); + return; + } + + do { + struct request *rq; + + rq = dispatch_fn(hctx); + if (!rq) + break; + list_add(&rq->queuelist, &rq_list); + } while (blk_mq_dispatch_rq_list(q, &rq_list)); +} + +void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) +{ + struct request_queue *q = hctx->queue; + bool can_go = true; + LIST_HEAD(rq_list); /* RCU or SRCU read lock is needed before checking quiesced flag */ if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) @@ -144,25 +165,12 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) if (!can_go || test_bit(BLK_MQ_S_BUSY, &hctx->state)) return; - if (!has_sched_dispatch && !(hctx->flags & BLK_MQ_F_SHARED_DEPTH)) { - blk_mq_flush_busy_ctxs(hctx, &rq_list); - blk_mq_dispatch_rq_list(q, &rq_list); - return; - } - /* * We want to dispatch from the scheduler if we had no work left * on the dispatch list, OR if we did have work but weren't able * to make progress. */ - do { - struct request *rq; - - rq = dispatch_fn(hctx); - if (!rq) - break; - list_add(&rq->queuelist, &rq_list); - } while (blk_mq_dispatch_rq_list(q, &rq_list)); + __blk_mq_sched_dispatch_requests(hctx); } bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio, -- 2.9.4
[PATCH 07/14] blk-mq-sched: use q->queue_depth as hint for q->nr_requests
SCSI sets q->queue_depth from shost->cmd_per_lun. q->queue_depth is per request_queue and more closely related to the scheduler queue than the hw queue depth, which can be shared by queues (e.g. TAG_SHARED). This patch tries to use q->queue_depth as a hint for computing q->nr_requests, which should be more effective than the current way. Reviewed-by: Christoph Hellwig Signed-off-by: Ming Lei --- block/blk-mq-sched.h | 18 +++--- block/blk-mq.c | 27 +-- block/blk-mq.h | 1 + block/blk-settings.c | 2 ++ 4 files changed, 43 insertions(+), 5 deletions(-) diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 1d47f3fda1d0..bb772e680e01 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -99,12 +99,24 @@ static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx) static inline unsigned blk_mq_sched_queue_depth(struct request_queue *q) { /* -* Default to double of smaller one between hw queue_depth and 128, +* q->queue_depth is more close to scheduler queue, so use it +* as hint for computing scheduler queue depth if it is valid +*/ + unsigned q_depth = q->queue_depth ?: q->tag_set->queue_depth; + + /* +* Default to double of smaller one between queue depth and 128, * since we don't split into sync/async like the old code did. * Additionally, this is a per-hw queue depth. */ - return 2 * min_t(unsigned int, q->tag_set->queue_depth, - BLKDEV_MAX_RQ); + q_depth = 2 * min_t(unsigned int, q_depth, BLKDEV_MAX_RQ); + + /* +* when queue depth of driver is too small, we set queue depth +* of scheduler queue as 32 so that small queue device still +* can benefit from IO merging.
+*/ + return max_t(unsigned, q_depth, 32); } #endif diff --git a/block/blk-mq.c b/block/blk-mq.c index 86b8fdcb8434..7df68d31bc23 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2593,7 +2593,9 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set) } EXPORT_SYMBOL(blk_mq_free_tag_set); -int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) +static int __blk_mq_update_nr_requests(struct request_queue *q, + bool sched_only, + unsigned int nr) { struct blk_mq_tag_set *set = q->tag_set; struct blk_mq_hw_ctx *hctx; @@ -2612,7 +2614,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) * If we're using an MQ scheduler, just update the scheduler * queue depth. This is similar to what the old code would do. */ - if (!hctx->sched_tags) { + if (!sched_only && !hctx->sched_tags) { ret = blk_mq_tag_update_depth(hctx, &hctx->tags, min(nr, set->queue_depth), false); @@ -2632,6 +2634,27 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) return ret; } +int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) +{ + return __blk_mq_update_nr_requests(q, false, nr); +} + +/* + * When drivers update q->queue_depth, this API is called so that + * we can use this queue depth as hint for adjusting scheduler + * queue depth. 
+ */ +int blk_mq_update_sched_queue_depth(struct request_queue *q) +{ + unsigned nr; + + if (!q->mq_ops || !q->elevator) + return 0; + + nr = blk_mq_sched_queue_depth(q); + return __blk_mq_update_nr_requests(q, true, nr); +} + static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues) { diff --git a/block/blk-mq.h b/block/blk-mq.h index 0c398f29dc4b..44d3aaa03d7c 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -36,6 +36,7 @@ bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx); bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx, bool wait); struct request *blk_mq_dispatch_rq_from_ctxs(struct blk_mq_hw_ctx *hctx); +int blk_mq_update_sched_queue_depth(struct request_queue *q); /* * Internal helpers for allocating/freeing the request map diff --git a/block/blk-settings.c b/block/blk-settings.c index be1f115b538b..94a349601545 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -877,6 +877,8 @@ void blk_set_queue_depth(struct request_queue *q, unsigned int depth) { q->queue_depth = depth; wbt_set_queue_depth(q->rq_wb, depth); + + WARN_ON(blk_mq_update_sched_queue_depth(q)); } EXPORT_SYMBOL(blk_set_queue_depth); -- 2.9.4
[PATCH 06/14] blk-mq-sched: introduce blk_mq_sched_queue_depth()
The following patch will propose some hints for figuring out the default queue depth for the scheduler queue, so introduce the helper blk_mq_sched_queue_depth() for this purpose. Reviewed-by: Christoph Hellwig Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 8 +--- block/blk-mq-sched.h | 11 +++ 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index eb638063673f..3eb524ccb7aa 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -531,13 +531,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e) return 0; } - /* -* Default to double of smaller one between hw queue_depth and 128, -* since we don't split into sync/async like the old code did. -* Additionally, this is a per-hw queue depth. -*/ - q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth, - BLKDEV_MAX_RQ); + q->nr_requests = blk_mq_sched_queue_depth(q); queue_for_each_hw_ctx(q, hctx, i) { ret = blk_mq_sched_alloc_tags(q, hctx, i); diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 9267d0b7c197..1d47f3fda1d0 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -96,4 +96,15 @@ static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx) return test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state); } +static inline unsigned blk_mq_sched_queue_depth(struct request_queue *q) +{ + /* +* Default to double of smaller one between hw queue_depth and 128, +* since we don't split into sync/async like the old code did. +* Additionally, this is a per-hw queue depth. +*/ + return 2 * min_t(unsigned int, q->tag_set->queue_depth, + BLKDEV_MAX_RQ); +} + #endif -- 2.9.4
[PATCH 05/14] blk-mq-sched: don't dequeue request until all in ->dispatch are flushed
During dispatch, we move all requests from hctx->dispatch to a temporary list, then dispatch them one by one from this list. Unfortunately, during this period, run-queue calls from other contexts may think the queue is idle and start to dequeue from the sw/scheduler queue and try to dispatch, because ->dispatch is empty. This hurts sequential I/O performance because requests are dequeued while the queue is busy. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 24 ++-- include/linux/blk-mq.h | 1 + 2 files changed, 19 insertions(+), 6 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 3510c01cb17b..eb638063673f 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -112,8 +112,15 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) */ if (!list_empty_careful(&hctx->dispatch)) { spin_lock(&hctx->lock); - if (!list_empty(&hctx->dispatch)) + if (!list_empty(&hctx->dispatch)) { list_splice_init(&hctx->dispatch, &rq_list); + + /* +* BUSY won't be cleared until all requests +* in hctx->dispatch are dispatched successfully +*/ + set_bit(BLK_MQ_S_BUSY, &hctx->state); + } spin_unlock(&hctx->lock); } @@ -129,15 +136,20 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) if (!list_empty(&rq_list)) { blk_mq_sched_mark_restart_hctx(hctx); can_go = blk_mq_dispatch_rq_list(q, &rq_list); - } else if (!has_sched_dispatch && !q->queue_depth) { - blk_mq_flush_busy_ctxs(hctx, &rq_list); - blk_mq_dispatch_rq_list(q, &rq_list); - can_go = false; + if (can_go) + clear_bit(BLK_MQ_S_BUSY, &hctx->state); } - if (!can_go) + /* can't go until ->dispatch is flushed */ + if (!can_go || test_bit(BLK_MQ_S_BUSY, &hctx->state)) return; + if (!has_sched_dispatch && !q->queue_depth) { + blk_mq_flush_busy_ctxs(hctx, &rq_list); + blk_mq_dispatch_rq_list(q, &rq_list); + return; + } + /* * We want to dispatch from the scheduler if we had no work left * on the dispatch list, OR if we did have work but weren't able diff --git a/include/linux/blk-mq.h
b/include/linux/blk-mq.h index 14542308d25b..6d44b242b495 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -172,6 +172,7 @@ enum { BLK_MQ_S_SCHED_RESTART = 2, BLK_MQ_S_TAG_WAITING= 3, BLK_MQ_S_START_ON_RUN = 4, + BLK_MQ_S_BUSY = 5, BLK_MQ_MAX_DEPTH= 10240, -- 2.9.4
[PATCH 04/14] blk-mq-sched: improve dispatching from sw queue
SCSI devices use a host-wide tagset, and the shared driver tag space is often quite big. Meanwhile there is also a queue depth for each lun (.cmd_per_lun), which is often small. So lots of requests may stay in the sw queue, and we always flush all those belonging to the same hw queue and dispatch them all to the driver; unfortunately this easily makes the queue busy because of the small per-lun queue depth. Once these requests are flushed out, they have to stay in hctx->dispatch, no bio merge can be done against them, and sequential IO performance is hurt. This patch improves dispatching from the sw queue when there is a per-request-queue queue depth, by taking requests one by one from the sw queue, just as an IO scheduler does. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 25 +++-- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 47a25333a136..3510c01cb17b 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -96,6 +96,9 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request; bool can_go = true; LIST_HEAD(rq_list); + struct request *(*dispatch_fn)(struct blk_mq_hw_ctx *) = + has_sched_dispatch ?
e->type->ops.mq.dispatch_request : + blk_mq_dispatch_rq_from_ctxs; /* RCU or SRCU read lock is needed before checking quiesced flag */ if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) @@ -126,26 +129,28 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) if (!list_empty(&rq_list)) { blk_mq_sched_mark_restart_hctx(hctx); can_go = blk_mq_dispatch_rq_list(q, &rq_list); - } else if (!has_sched_dispatch) { + } else if (!has_sched_dispatch && !q->queue_depth) { blk_mq_flush_busy_ctxs(hctx, &rq_list); blk_mq_dispatch_rq_list(q, &rq_list); + can_go = false; } + if (!can_go) + return; + /* * We want to dispatch from the scheduler if we had no work left * on the dispatch list, OR if we did have work but weren't able * to make progress. */ - if (can_go && has_sched_dispatch) { - do { - struct request *rq; + do { + struct request *rq; - rq = e->type->ops.mq.dispatch_request(hctx); - if (!rq) - break; - list_add(&rq->queuelist, &rq_list); - } while (blk_mq_dispatch_rq_list(q, &rq_list)); - } + rq = dispatch_fn(hctx); + if (!rq) + break; + list_add(&rq->queuelist, &rq_list); + } while (blk_mq_dispatch_rq_list(q, &rq_list)); } bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio, -- 2.9.4
[PATCH 03/14] blk-mq: introduce blk_mq_dispatch_rq_from_ctxs()
This function is introduced for picking up a request from the sw queue so that we can dispatch it in the scheduler's way. More importantly, for some SCSI devices, driver tags are host wide and quite numerous, but each lun has a very limited queue depth. This function helps avoid taking too many requests from the sw queue when the queue is busy, and only tries to dispatch a request when the queue isn't busy. Signed-off-by: Ming Lei --- block/blk-mq.c | 38 +- block/blk-mq.h | 1 + 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 94818f78c099..86b8fdcb8434 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -810,7 +810,11 @@ static void blk_mq_timeout_work(struct work_struct *work) struct ctx_iter_data { struct blk_mq_hw_ctx *hctx; - struct list_head *list; + + union { + struct list_head *list; + struct request *rq; + }; }; static bool flush_busy_ctx(struct sbitmap *sb, unsigned int bitnr, void *data) @@ -826,6 +830,26 @@ static bool flush_busy_ctx(struct sbitmap *sb, unsigned int bitnr, void *data) return true; } +static bool dispatch_rq_from_ctx(struct sbitmap *sb, unsigned int bitnr, void *data) +{ + struct ctx_iter_data *dispatch_data = data; + struct blk_mq_hw_ctx *hctx = dispatch_data->hctx; + struct blk_mq_ctx *ctx = hctx->ctxs[bitnr]; + bool empty = true; + + spin_lock(&ctx->lock); + if (unlikely(!list_empty(&ctx->rq_list))) { + dispatch_data->rq = list_entry_rq(ctx->rq_list.next); + list_del_init(&dispatch_data->rq->queuelist); + empty = list_empty(&ctx->rq_list); + } + spin_unlock(&ctx->lock); + if (empty) + sbitmap_clear_bit(sb, bitnr); + + return !dispatch_data->rq; +} + /* * Process software queues that have been marked busy, splicing them * to the for-dispatch @@ -841,6 +865,18 @@ void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list) } EXPORT_SYMBOL_GPL(blk_mq_flush_busy_ctxs); +struct request *blk_mq_dispatch_rq_from_ctxs(struct blk_mq_hw_ctx *hctx) +{ + struct ctx_iter_data data = {
+ .hctx = hctx, + .rq = NULL, + }; + + sbitmap_for_each_set(&hctx->ctx_map, dispatch_rq_from_ctx, &data); + + return data.rq; +} + static inline unsigned int queued_to_index(unsigned int queued) { if (!queued) diff --git a/block/blk-mq.h b/block/blk-mq.h index 60b01c0309bc..0c398f29dc4b 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -35,6 +35,7 @@ void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list); bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx); bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx, bool wait); +struct request *blk_mq_dispatch_rq_from_ctxs(struct blk_mq_hw_ctx *hctx); /* * Internal helpers for allocating/freeing the request map -- 2.9.4
[PATCH 02/14] blk-mq: rename flush_busy_ctx_data as ctx_iter_data
The following patch needs to reuse this data structure, so rename it to a generic name. Signed-off-by: Ming Lei --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index b70a4ad78b63..94818f78c099 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -808,14 +808,14 @@ static void blk_mq_timeout_work(struct work_struct *work) blk_queue_exit(q); } -struct flush_busy_ctx_data { +struct ctx_iter_data { struct blk_mq_hw_ctx *hctx; struct list_head *list; }; static bool flush_busy_ctx(struct sbitmap *sb, unsigned int bitnr, void *data) { - struct flush_busy_ctx_data *flush_data = data; + struct ctx_iter_data *flush_data = data; struct blk_mq_hw_ctx *hctx = flush_data->hctx; struct blk_mq_ctx *ctx = hctx->ctxs[bitnr]; @@ -832,7 +832,7 @@ static bool flush_busy_ctx(struct sbitmap *sb, unsigned int bitnr, void *data) */ void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list) { - struct flush_busy_ctx_data data = { + struct ctx_iter_data data = { .hctx = hctx, .list = list, }; -- 2.9.4
[PATCH 01/14] blk-mq-sched: fix scheduler bad performance
When the hw queue is busy, we shouldn't take requests from the scheduler queue any more, otherwise IO merging becomes difficult. This patch fixes the awful IO performance on some SCSI devices (lpfc, qla2xxx, ...) when mq-deadline/kyber is used, by not taking requests if the hw queue is busy. Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 4ab69435708c..47a25333a136 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -94,7 +94,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) struct request_queue *q = hctx->queue; struct elevator_queue *e = q->elevator; const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request; - bool did_work = false; + bool can_go = true; LIST_HEAD(rq_list); /* RCU or SRCU read lock is needed before checking quiesced flag */ @@ -125,7 +125,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) */ if (!list_empty(&rq_list)) { blk_mq_sched_mark_restart_hctx(hctx); - did_work = blk_mq_dispatch_rq_list(q, &rq_list); + can_go = blk_mq_dispatch_rq_list(q, &rq_list); } else if (!has_sched_dispatch) { blk_mq_flush_busy_ctxs(hctx, &rq_list); blk_mq_dispatch_rq_list(q, &rq_list); @@ -136,7 +136,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) * on the dispatch list, OR if we did have work but weren't able * to make progress. */ - if (!did_work && has_sched_dispatch) { + if (can_go && has_sched_dispatch) { do { struct request *rq; -- 2.9.4
[PATCH 00/14] blk-mq-sched: fix SCSI-MQ performance regression
In Red Hat internal storage tests wrt. the blk-mq scheduler, we found that its performance is quite bad, especially for sequential I/O on some multi-queue SCSI devices. Turns out one big issue causes the performance regression: requests are still dequeued from the sw queue/scheduler queue even when the lld's queue is busy, so I/O merging becomes quite difficult to do, and sequential IO degrades a lot. The 1st five patches improve this situation and bring back some of the performance loss, but they are still not enough. The remaining loss is caused by the queue depth shared among all hw queues: for SCSI devices, .cmd_per_lun defines the max number of pending I/O on one request queue, which is a per-request_queue depth. So during dispatch, if one hctx is too busy to move on, no hctx can dispatch either, because of the per-request_queue depth. Patches 6 ~ 14 use a per-request_queue dispatch list to avoid dequeuing requests from the sw/scheduler queue when the lld queue is busy. With these changes, SCSI-MQ performance is brought back to that of the block legacy path; here are the test results on lpfc:

- fio (libaio, bs:4k, dio, queue_depth:64, 20 jobs), "iops":

                | v4.13-rc3       | v4.13-rc3   | patched v4.13-rc3
                | legacy deadline | mq-none     | mq-none
  --------------+-----------------+-------------+------------------
  read          | 401749.4001     | 346237.5025 | 387536.4427
  randread      | 25175.07121     | 21688.64067 | 25578.50374
  write         | 376168.7578     | 335262.0475 | 370132.4735
  randwrite     | 25235.46163     | 24982.63819 | 23934.95610

                | v4.13-rc3       | v4.13-rc3   | patched v4.13-rc3
                | legacy deadline | mq-deadline | mq-deadline
  --------------+-----------------+-------------+------------------
  read          | 401749.4001     | 35592.48901 | 401681.1137
  randread      | 25175.07121     | 30029.52618 | 21446.68731
  write         | 376168.7578     | 27340.56777 | 377356.7286
  randwrite     | 25235.46163     | 24395.02969 | 24885.66152

Ming Lei (14):
  blk-mq-sched: fix scheduler bad performance
  blk-mq: rename flush_busy_ctx_data as ctx_iter_data
  blk-mq: introduce blk_mq_dispatch_rq_from_ctxs()
  blk-mq-sched: improve dispatching from sw queue
  blk-mq-sched: don't dequeue request until all in ->dispatch are flushed
  blk-mq-sched: introduce blk_mq_sched_queue_depth()
  blk-mq-sched: use q->queue_depth as hint for q->nr_requests
  blk-mq: introduce BLK_MQ_F_SHARED_DEPTH
  blk-mq-sched: cleanup blk_mq_sched_dispatch_requests()
  blk-mq-sched: introduce helpers for query, change busy state
  blk-mq: introduce helpers for operating ->dispatch list
  blk-mq: introduce pointers to dispatch lock & list
  blk-mq: pass 'request_queue *' to several helpers of operating BUSY
  blk-mq-sched: improve IO scheduling on SCSI device

 block/blk-mq-debugfs.c | 11 ++---
 block/blk-mq-sched.c | 70 +++--
 block/blk-mq-sched.h | 23 ++
 block/blk-mq.c | 117 +++-
 block/blk-mq.h | 72 ++
 block/blk-settings.c | 2 +
 include/linux/blk-mq.h | 5 +++
 include/linux/blkdev.h | 5 +++
 8 files changed, 255 insertions(+), 50 deletions(-)

-- 2.9.4
[PATCH] scsi: csiostor: fail probe if fw does not support FCoE
Fail probe if FCoE capability is not enabled in the firmware. Signed-off-by: Varun Prakash --- drivers/scsi/csiostor/csio_hw.c | 4 +++- drivers/scsi/csiostor/csio_init.c | 12 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/csiostor/csio_hw.c b/drivers/scsi/csiostor/csio_hw.c index 2029ad2..5be0086 100644 --- a/drivers/scsi/csiostor/csio_hw.c +++ b/drivers/scsi/csiostor/csio_hw.c @@ -3845,8 +3845,10 @@ csio_hw_start(struct csio_hw *hw) if (csio_is_hw_ready(hw)) return 0; - else + else if (csio_match_state(hw, csio_hws_uninit)) return -EINVAL; + else + return -ENODEV; } int diff --git a/drivers/scsi/csiostor/csio_init.c b/drivers/scsi/csiostor/csio_init.c index ea0c310..dcd0741 100644 --- a/drivers/scsi/csiostor/csio_init.c +++ b/drivers/scsi/csiostor/csio_init.c @@ -969,10 +969,14 @@ static int csio_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) pci_set_drvdata(pdev, hw); - if (csio_hw_start(hw) != 0) { - dev_err(&pdev->dev, - "Failed to start FW, continuing in debug mode.\n"); - return 0; + rv = csio_hw_start(hw); + if (rv) { + if (rv == -EINVAL) { + dev_err(&pdev->dev, + "Failed to start FW, continuing in debug mode.\n"); + return 0; + } + goto err_lnode_exit; } sprintf(hw->fwrev_str, "%u.%u.%u.%u\n", -- 2.0.2
Re: [PATCH 00/29] constify scsi pci_device_id.
On Mon, Jul 31, 2017 at 02:23:11PM +0530, Arvind Yadav wrote: > Yes, We can add all of them in single patch. But other maintainer wants > single single patch. thats why I have send 29 patch. :( Ultimately it's up to Martin and James but I don't see a huge benefit in having it all in a separate patch. Thanks, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH 00/29] constify scsi pci_device_id.
On Monday 31 July 2017 01:26 PM, Johannes Thumshirn wrote: On Sun, Jul 30, 2017 at 02:07:09PM +0530, Arvind Yadav wrote: pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. Can't this go all in one patch instead of replicating the same patch 29 times? Yes, we can add all of them in a single patch, but other maintainers want individual patches; that's why I have sent 29 patches. :( Thanks, Johannes ~arvind
Re: [PATCH 04/29] scsi: pm8001: constify pci_device_id.
On Sun, Jul 30, 2017 at 10:37 AM, Arvind Yadav wrote: > pci_device_id are not supposed to change at runtime. All functions > working with pci_device_id provided by work with > const pci_device_id. So mark the non-const structs as const. > > Signed-off-by: Arvind Yadav > --- > drivers/scsi/pm8001/pm8001_init.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/pm8001/pm8001_init.c > b/drivers/scsi/pm8001/pm8001_init.c > index 034b2f7..f2757cc 100644 > --- a/drivers/scsi/pm8001/pm8001_init.c > +++ b/drivers/scsi/pm8001/pm8001_init.c > @@ -1270,7 +1270,7 @@ static int pm8001_pci_resume(struct pci_dev *pdev) > /* update of pci device, vendor id and driver data with > * unique value for each of the controller > */ > -static struct pci_device_id pm8001_pci_table[] = { > +static const struct pci_device_id pm8001_pci_table[] = { > { PCI_VDEVICE(PMC_Sierra, 0x8001), chip_8001 }, > { PCI_VDEVICE(PMC_Sierra, 0x8006), chip_8006 }, > { PCI_VDEVICE(ADAPTEC2, 0x8006), chip_8006 }, > -- > 2.7.4 > Thanks, Acked-by: Jack Wang -- Jack Wang Linux Kernel Developer
Re: [PATCH 00/29] constify scsi pci_device_id.
On Sun, Jul 30, 2017 at 02:07:09PM +0530, Arvind Yadav wrote: > pci_device_id are not supposed to change at runtime. All functions > working with pci_device_id provided by work with > const pci_device_id. So mark the non-const structs as const. Can't this go all in one patch instead of replicating the same patch 29 times? Thanks, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850