On 3/17/2019 9:09 PM, Bart Van Assche wrote:
On 3/17/19 8:29 PM, Ming Lei wrote:
NVMe's error handler follows the typical steps for tearing down
hardware:

1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via
    blk_mq_tagset_busy_iter(tags, cancel_request, ...)
cancel_request():
    mark the request as aborted
    blk_mq_complete_request(req);
4) destroy real hw queues

However, there may be a race between #3 and #4, because blk_mq_complete_request()
actually completes the request asynchronously.

This patch introduces blk_mq_complete_request_sync() to fix the
above race.

Other block drivers wait until outstanding requests have completed by calling blk_cleanup_queue() before hardware queues are destroyed. Why can't the NVMe driver follow that approach?


speaking for the fabrics, not necessarily pci:

The intent of this looping, which happens immediately after an error is detected, is to terminate the outstanding requests. Otherwise, the only recourse is to wait for the I/Os to finish, which they may never do, or for their upper-level timeouts to expire and terminate them, which can mean a very long delay. And one of the commands on the admin queue (a different tag set, but handled the same way), the Async Event Reporting command, doesn't have a timeout at all, so it wouldn't necessarily clear without this looping.

-- james
