Shaohua Li <[email protected]> writes:

> plug is still helpful for workload with IO merge, but it can be harmful
> otherwise especially with multiple hardware queues, as there is
> (supposed) no lock contention in this case and plug can introduce
> latency. For multiple queues, we do limited plug, eg plug only if there
> is request merge. If a request doesn't have merge with following
> request, the request will be dispatched immediately.
>
> This also fixes a bug. If we directly issue a request and it fails, we
> use blk_mq_merge_queue_io(). But we already assigned bio to a request in
> blk_mq_bio_to_request. blk_mq_merge_queue_io shouldn't run
> blk_mq_bio_to_request again.

Good catch.  Might've been better to split that out first for easy
backport to stable kernels, but I won't hold you to that.

> @@ -1243,6 +1277,10 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
>               return;
>       }
>  
> +     if (likely(!is_flush_fua) && !blk_queue_nomerges(q) &&
> +         blk_attempt_plug_merge(q, bio, &request_count))
> +             return;
> +
>       rq = blk_mq_map_request(q, bio, &data);
>       if (unlikely(!rq))
>               return;

After this patch, everything up to this point in blk_mq_make_request and
blk_sq_make_request is the same.  This can be factored out (in another
patch) to a common function.

> @@ -1253,38 +1291,38 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
>               goto run_queue;
>       }
>  
> +     plug = current->plug;
>       /*
>        * If the driver supports defer issued based on 'last', then
>        * queue it up like normal since we can potentially save some
>        * CPU this way.
>        */
> -     if (is_sync && !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) {
> -             struct blk_mq_queue_data bd = {
> -                     .rq = rq,
> -                     .list = NULL,
> -                     .last = 1
> -             };
> -             int ret;
> +     if ((plug || is_sync) && !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) {
> +             struct request *old_rq = NULL;

I would add a !blk_queue_nomerges(q) to that conditional.  There's no
point holding back an I/O when we won't merge it anyway.
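
To make the suggestion concrete, here is a small user-space model of
that conditional.  It is only a sketch: the struct and function names
are hypothetical, the kernel state is reduced to booleans, and attaching
the nomerges check to the plug term alone (so that sync requests are
still issued directly) is an assumption about where the check belongs.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the state the real conditional reads. */
struct issue_inputs {
	bool has_plug;     /* current->plug != NULL */
	bool is_sync;      /* request is synchronous */
	bool defer_issue;  /* data.hctx->flags & BLK_MQ_F_DEFER_ISSUE */
	bool nomerges;     /* blk_queue_nomerges(q) */
};

/* The conditional as posted in the patch. */
static bool enters_block_posted(const struct issue_inputs *in)
{
	return (in->has_plug || in->is_sync) && !in->defer_issue;
}

/*
 * One reading of the suggestion (an assumption): a plug only counts
 * when merging is allowed, because holding a request back is pointless
 * if nothing can be merged into it.
 */
static bool enters_block_suggested(const struct issue_inputs *in)
{
	return ((in->has_plug && !in->nomerges) || in->is_sync) &&
	       !in->defer_issue;
}
```

With a plug present, an async request and nomerges set, the posted
condition is true and the suggested one is false, so the request would
take the normal path instead of waiting in the plug list.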

That brings up another quirk of the current implementation (not your
patches) that bugs me.

BLK_MQ_F_SHOULD_MERGE
QUEUE_FLAG_NOMERGES

Those two flags are set independently, one via the driver and the other
via a sysfs file.  So the user could set the nomerges flag to 1 or 2,
and still potentially get merges (see blk_mq_merge_queue_io).  That
should be fixed, though it can wait.
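
The mismatch can be illustrated with a toy model.  This is a sketch
under assumptions, not kernel code: the two flags are reduced to
booleans, all names are hypothetical, and the "driver only" gate assumes
the merge path consults just the driver flag, which is how the note
above reads.

```c
#include <stdbool.h>

/* Hypothetical model of the two independently set knobs. */
struct model_queue {
	bool driver_should_merge;  /* stands in for BLK_MQ_F_SHOULD_MERGE */
	bool user_nomerges;        /* stands in for QUEUE_FLAG_NOMERGES */
};

/* A merge gate that looks at the driver flag and nothing else. */
static bool merge_gate_driver_only(const struct model_queue *q)
{
	return q->driver_should_merge;
}

/* A gate that also honours the user's nomerges setting. */
static bool merge_gate_both(const struct model_queue *q)
{
	return q->driver_should_merge && !q->user_nomerges;
}
```

When the driver asks for merging and the user has set nomerges, the
first gate still allows a merge while the second does not.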

>               blk_mq_bio_to_request(rq, bio);
>  
>               /*
> -              * For OK queue, we are done. For error, kill it. Any other
> -              * error (busy), just add it to our list as we previously
> -              * would have done
> +              * we do limited pluging. If bio can be merged, do merge.
> +              * Otherwise the existing request in the plug list will be
> +              * issued. So the plug list will have one request at most
>                */
> -             ret = q->mq_ops->queue_rq(data.hctx, &bd);
> -             if (ret == BLK_MQ_RQ_QUEUE_OK)
> -                     goto done;
> -             else {
> -                     __blk_mq_requeue_request(rq);
> -
> -                     if (ret == BLK_MQ_RQ_QUEUE_ERROR) {
> -                             rq->errors = -EIO;
> -                             blk_mq_end_request(rq, rq->errors);
> -                             goto done;
> +             if (plug) {
> +                     if (!list_empty(&plug->mq_list)) {
> +                             old_rq = list_first_entry(&plug->mq_list,
> +                                     struct request, queuelist);
> +                             list_del_init(&old_rq->queuelist);
>                       }
> -             }
> +                     list_add_tail(&rq->queuelist, &plug->mq_list);
> +             } else /* is_sync */
> +                     old_rq = rq;
> +             blk_mq_put_ctx(data.ctx);
> +             if (!old_rq)
> +                     return;
> +             if (!blk_mq_direct_issue_request(old_rq))
> +                     return;
> +             blk_mq_insert_request(old_rq, false, true, true);
> +             return;
>       }

Now there is no way to fall out of that if block; we always return.  It
may be worth considering moving that block to its own function, if you
can think of a good name for it.
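
As a rough illustration of what such a helper might look like, here is
a user-space model of the "which request do we issue now" decision.
Everything in it is hypothetical (model_rq, model_plug,
limited_plug_pick), and the plug list is reduced to a single slot on the
strength of the patch comment saying it holds one request at most.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct request. */
struct model_rq {
	int id;
};

/* Hypothetical stand-in for the plug: one held request at most. */
struct model_plug {
	struct model_rq *held;
};

/*
 * Decide which request to issue right now.
 *
 * Precondition (mirrors the enclosing conditional): plug != NULL or the
 * request is sync.  With a plug, the new request takes the slot and any
 * previously held request is returned for issue.  Without a plug, the
 * sync request itself is returned.  NULL means nothing to issue yet.
 */
static struct model_rq *limited_plug_pick(struct model_plug *plug,
					  struct model_rq *rq)
{
	struct model_rq *old_rq;

	if (plug) {
		old_rq = plug->held;  /* NULL if the slot was empty */
		plug->held = rq;
	} else {                      /* is_sync */
		old_rq = rq;
	}
	return old_rq;
}
```

The caller would then issue the returned request directly and fall back
to inserting it if direct issue fails, as the block in the patch does.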

Other than those minor issues, this looks good to me.

Cheers,
Jeff