On Mon, 2019-03-18 at 15:38 +0800, Ming Lei wrote:
> On Sun, Mar 17, 2019 at 09:09:09PM -0700, Bart Van Assche wrote:
> > On 3/17/19 8:29 PM, Ming Lei wrote:
> > > NVMe's error handler follows the typical steps for tearing down
> > > hardware:
> > >
> > > 1) stop blk_mq hw queues
> > > 2) stop the real hw queues
> > > 3) cancel in-flight requests via
> > >    blk_mq_tagset_busy_iter(tags, cancel_request, ...)
> > >    cancel_request():
> > >        mark the request as aborted
> > >        blk_mq_complete_request(req);
> > > 4) destroy real hw queues
> > >
> > > However, there may be a race between #3 and #4, because
> > > blk_mq_complete_request() actually completes the request
> > > asynchronously.
> > >
> > > This patch introduces blk_mq_complete_request_sync() to fix the
> > > above race.
> >
> > Other block drivers wait until outstanding requests have completed by
> > calling blk_cleanup_queue() before hardware queues are destroyed. Why
> > can't the NVMe driver follow that approach?
>
> The tearing down of the controller can be done in the error handler, in
> which the request queues may not have been cleaned up. Almost all NVMe
> controllers' error handling follows the above steps, for example:
>
> nvme_rdma_error_recovery_work()
>   ->nvme_rdma_teardown_io_queues()
>
> nvme_timeout()
>   ->nvme_dev_disable()
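
For readers following along, here is a minimal sketch of steps #1-#4 and
of how the proposed blk_mq_complete_request_sync() would be called from
the cancel callback. The helper names (cancel_request,
teardown_io_queues), the use of nvme_stop_queues(), and the callback
signature are illustrative assumptions pieced together from the
description above, not the exact patch under discussion:

	/* Assumed context: NVMe core driver, roughly the v5.1 time frame. */
	#include <linux/blk-mq.h>
	#include "nvme.h"

	/*
	 * Step #3: cancel one in-flight request. Marking it aborted and
	 * then completing it mirrors the steps described above; the bool
	 * return follows the busy_tag_iter_fn convention (true = keep
	 * iterating over the busy tags).
	 */
	static bool cancel_request(struct request *req, void *data, bool reserved)
	{
		struct nvme_ctrl *ctrl = data;

		dev_dbg_ratelimited(ctrl->device, "cancelling I/O %d", req->tag);

		nvme_req(req)->status = NVME_SC_ABORT_REQ;

		/*
		 * blk_mq_complete_request() may defer the completion to
		 * another CPU (IPI/softirq), so it can still be running when
		 * step #4 destroys the hardware queues. The proposed
		 * blk_mq_complete_request_sync() completes the request in
		 * the caller's context, closing that window.
		 */
		blk_mq_complete_request_sync(req);
		return true;
	}

	/* Steps #1-#4 strung together, as in the teardown paths named above. */
	static void teardown_io_queues(struct nvme_ctrl *ctrl)
	{
		nvme_stop_queues(ctrl);		/* #1: quiesce blk-mq hw queues */
		/* #2: stop/disable the real hardware queues (transport-specific) */
		blk_mq_tagset_busy_iter(ctrl->tagset, cancel_request, ctrl); /* #3 */
		/* #4: only now destroy the real hardware queues */
	}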
Hi Ming,

This makes me wonder whether the current design of the NVMe core is the
best design we can come up with. The structure of e.g. the SRP initiator
and target drivers is similar to that of the NVMeOF drivers, yet there is
no need in the SRP initiator driver to terminate requests synchronously.
Is this due to differences in the error handling approaches of the SCSI
and NVMe core drivers?

Thanks,

Bart.
