Re: [PATCH] blk-mq: order getting budget and driver tag

2018-04-04 Thread Jens Axboe
On 4/4/18 10:35 AM, Ming Lei wrote:
> This patch orders getting budget and driver tag by making sure to acquire
> driver tag after budget is got, this way can help to avoid the following
> race:
> 
> 1) before dispatch request from scheduler queue, get one budget first, then
> dequeue a request, call it request A.
> 
> 2) in another IO path for dispatching request B which is from hctx->dispatch,
> driver tag is got, then try to get budget in blk_mq_dispatch_rq_list(),
> unfortunately the budget is held by request A.
> 
> 3) meantime blk_mq_dispatch_rq_list() is called for dispatching request
> A, and try to get driver tag first, unfortunately no driver tag is
> available because the driver tag is held by request B
> 
> 4) both two IO pathes can't move on, and IO stall is caused.
> 
> This issue can be observed when running dbench on USB storage.

Good catch, this can trigger on anything potentially, but of course more
likely with limited budget and/or tag space. Classic ABBA deadlock.

-- 
Jens Axboe



[PATCH] blk-mq: order getting budget and driver tag

2018-04-04 Thread Ming Lei
This patch orders getting budget and driver tag by making sure to acquire
driver tag after budget is got, this way can help to avoid the following
race:

1) before dispatch request from scheduler queue, get one budget first, then
dequeue a request, call it request A.

2) in another IO path for dispatching request B which is from hctx->dispatch,
driver tag is got, then try to get budget in blk_mq_dispatch_rq_list(),
unfortunately the budget is held by request A.

3) meantime blk_mq_dispatch_rq_list() is called for dispatching request
A, and try to get driver tag first, unfortunately no driver tag is
available because the driver tag is held by request B

4) both two IO pathes can't move on, and IO stall is caused.

This issue can be observed when running dbench on USB storage.

This patch fixes this issue by always getting budget before getting
driver tag.

Cc: sta...@vger.kernel.org
Fixes: de1482974080ec9e ("blk-mq: introduce .get_budget and .put_budget in 
blk_mq_ops")
Cc: Christoph Hellwig 
Cc: Bart Van Assche 
Cc: Omar Sandoval 
Signed-off-by: Ming Lei 
---
 block/blk-mq.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 16e83e6df404..90838e998f66 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1188,7 +1188,12 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, 
struct list_head *list,
struct blk_mq_queue_data bd;
 
rq = list_first_entry(list, struct request, queuelist);
-   if (!blk_mq_get_driver_tag(rq, , false)) {
+
+   hctx = blk_mq_map_queue(rq->q, rq->mq_ctx->cpu);
+   if (!got_budget && !blk_mq_get_dispatch_budget(hctx))
+   break;
+
+   if (!blk_mq_get_driver_tag(rq, NULL, false)) {
/*
 * The initial allocation attempt failed, so we need to
 * rerun the hardware queue when a tag is freed. The
@@ -1197,8 +1202,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, 
struct list_head *list,
 * we'll re-run it below.
 */
if (!blk_mq_mark_tag_wait(, rq)) {
-   if (got_budget)
-   blk_mq_put_dispatch_budget(hctx);
+   blk_mq_put_dispatch_budget(hctx);
/*
 * For non-shared tags, the RESTART check
 * will suffice.
@@ -1209,11 +1213,6 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, 
struct list_head *list,
}
}
 
-   if (!got_budget && !blk_mq_get_dispatch_budget(hctx)) {
-   blk_mq_put_driver_tag(rq);
-   break;
-   }
-
list_del_init(>queuelist);
 
bd.rq = rq;
@@ -1812,11 +1811,11 @@ static blk_status_t __blk_mq_try_issue_directly(struct 
blk_mq_hw_ctx *hctx,
if (q->elevator && !bypass_insert)
goto insert;
 
-   if (!blk_mq_get_driver_tag(rq, NULL, false))
+   if (!blk_mq_get_dispatch_budget(hctx))
goto insert;
 
-   if (!blk_mq_get_dispatch_budget(hctx)) {
-   blk_mq_put_driver_tag(rq);
+   if (!blk_mq_get_driver_tag(rq, NULL, false)) {
+   blk_mq_put_dispatch_budget(hctx);
goto insert;
}
 
-- 
2.9.5