On 2/11/19 4:15 PM, Jens Axboe wrote:
> On 2/11/19 8:59 AM, Jens Axboe wrote:
>> On 2/10/19 10:41 PM, Jianchao Wang wrote:
>>> When a request is requeued with RQF_DONTPREP set, it already
>>> contains driver-specific data, so insert it into the hctx dispatch
>>> list to avoid any merge. Taking scsi as an example, here is the trace
>>> event log (no io scheduler, because RQF_STARTED would prevent merging):
>>>
>>>    kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
>>> scsi_inert_test-1987  [000] ....  2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
>>> scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
>>>    kworker/0:1H-339   [000] ....  2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
>>> scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
>>> scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
>>>    kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>>>    kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>>> scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]
>>>
>>> (32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP set.
>>> It was then merged with (32776 + 8) and issued. Because of RQF_DONTPREP,
>>> the sdb only covered the (32768 + 8) part, so only that part was
>>> completed. Luckily, scsi_io_completion detected this and requeued the
>>> remaining part, so we didn't get corrupted data. However, the requeue
>>> of (32776 + 8) is not expected.
>>
>> Good catch, but how about something like this? Makes it more integrated,
>> I think that's cleaner.
> 
> This is probably better (and safer):

Here's the one I wanted to send, not a half-done one. Maybe I'll be
luckier this time around?


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8f5b533764ca..35e6aba52808 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -737,12 +737,21 @@ static void blk_mq_requeue_work(struct work_struct *work)
        spin_unlock_irq(&q->requeue_lock);
 
        list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
-               if (!(rq->rq_flags & RQF_SOFTBARRIER))
+               if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
                        continue;
 
                rq->rq_flags &= ~RQF_SOFTBARRIER;
                list_del_init(&rq->queuelist);
-               blk_mq_sched_insert_request(rq, true, false, false);
+
+               /*
+                * If RQF_DONTPREP is set, rq may contain some driver
+                * specific data. Insert it to hctx dispatch list to avoid
+                * any merge.
+                */
+               if (rq->rq_flags & RQF_DONTPREP)
+                       blk_mq_request_bypass_insert(rq, false);
+               else
+                       blk_mq_sched_insert_request(rq, true, false, false);
        }
 
        while (!list_empty(&rq_list)) {
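
For anyone reading along, the routing in the hunk above can be sketched
as a small userspace model. This is only an illustration, not kernel
code: the flag bit values, the requeue_path enum and the
requeue_path_for() helper are made up for the example. Only the flag
names and the two insert functions mentioned in the comments come from
the patch.

```c
/*
 * Userspace model of the per-request decision made in the first loop of
 * blk_mq_requeue_work() after the patch. Flag values are illustrative
 * assumptions; the real definitions live in the kernel headers.
 */
#define RQF_SOFTBARRIER (1u << 0)
#define RQF_DONTPREP    (1u << 1)

/* Hypothetical names for the three outcomes visible in the hunk. */
enum requeue_path {
	REQUEUE_DEFERRED,        /* "continue": left on rq_list for the later loop */
	REQUEUE_SCHED_INSERT,    /* blk_mq_sched_insert_request(rq, true, false, false) */
	REQUEUE_BYPASS_DISPATCH, /* blk_mq_request_bypass_insert(rq, false) */
};

static enum requeue_path requeue_path_for(unsigned int rq_flags)
{
	/* Neither flag set: the first loop skips the request. */
	if (!(rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
		return REQUEUE_DEFERRED;

	/*
	 * The request already holds driver-specific data, so it goes
	 * straight to the hctx dispatch list and cannot be merged.
	 */
	if (rq_flags & RQF_DONTPREP)
		return REQUEUE_BYPASS_DISPATCH;

	/* Only RQF_SOFTBARRIER set: same insert call as before the patch. */
	return REQUEUE_SCHED_INSERT;
}
```

In this model a request carrying RQF_DONTPREP always takes the bypass
path, whether or not RQF_SOFTBARRIER is also set, which is what keeps
it away from any merge.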

-- 
Jens Axboe
