Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing
Ming Lei - 27.09.17, 16:27:
> On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote:
> > Hi Ming.
> >
> > Ming Lei - 27.09.17, 13:48:
> > > Hi,
> > >
> > > The current SCSI quiesce isn't safe, and it is easy to trigger an
> > > I/O deadlock.
> > >
> > > Once SCSI device is put into QUIESCE, no new request except for
> > > RQF_PREEMPT can be dispatched to SCSI successfully, and
> > > scsi_device_quiesce() just simply waits for completion of I/Os
> > > dispatched to SCSI stack. It isn't enough at all.
> > >
> > > Because new request still can be coming, but all the allocated
> > > requests can't be dispatched successfully, so request pool can be
> > > consumed up easily.
> > >
> > > Then request with RQF_PREEMPT can't be allocated and wait forever,
> > > meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> > > then system hangs forever, such as during system suspend or
> > > sending SCSI domain validation.
> > >
> > > Both IO hang inside system suspend[1] or SCSI domain validation
> > > were reported before.
> > >
> > > This patch introduces preempt only mode, and solves the issue
> > > by allowing RQF_PREEMPT only during SCSI quiesce.
> > >
> > > Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> > > them all.
> > >
> > > V6:
> > >     - borrow Bart's idea of preempt only, with clean
> > >       implementation(patch 5/patch 6)
> > >     - needn't any external driver's dependency, such as MD's
> > >       change
> >
> > Do you want me to test with v6 of the patch set? If so, it would be nice
> > if you'd make a v6 branch in your git repo.
>
> Hi Martin,
>
> I appreciate much if you may run V6 and provide your test result,
> follows the branch:
>
> https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V6
>
> https://github.com/ming1/linux.git #blk_safe_scsi_quiesce_V6
>
> > After an uptime of almost 6 days I am pretty confident that the V5 one
> > fixes the issue for me. So
> >
> > Tested-by: Martin Steigerwald
> >
> > for V5.
>
> Thanks for your test!

Two days and almost 6 hours, no hang yet. I bet the whole thing works.
(3e45474d7df3bfdabe4801b5638d197df9810a79)

Tested-By: Martin Steigerwald

(It could still hang after three days, but usually I got the first hang
within the first two days.)

Thanks,
--
Martin
Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing
Hey.

I can confirm that v6 of your patchset still works well for me. Tested on
v4.13 kernel.

Thanks.

On Wednesday, 27 September 2017 10:52:41 CEST Ming Lei wrote:
> On Wed, Sep 27, 2017 at 04:27:51PM +0800, Ming Lei wrote:
> > On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote:
> > > Hi Ming.
> > >
> > > Ming Lei - 27.09.17, 13:48:
> > > > Hi,
> > > >
> > > > The current SCSI quiesce isn't safe, and it is easy to trigger an
> > > > I/O deadlock.
> > > >
> > > > Once SCSI device is put into QUIESCE, no new request except for
> > > > RQF_PREEMPT can be dispatched to SCSI successfully, and
> > > > scsi_device_quiesce() just simply waits for completion of I/Os
> > > > dispatched to SCSI stack. It isn't enough at all.
> > > >
> > > > Because new request still can be coming, but all the allocated
> > > > requests can't be dispatched successfully, so request pool can be
> > > > consumed up easily.
> > > >
> > > > Then request with RQF_PREEMPT can't be allocated and wait forever,
> > > > meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> > > > then system hangs forever, such as during system suspend or
> > > > sending SCSI domain validation.
> > > >
> > > > Both IO hang inside system suspend[1] or SCSI domain validation
> > > > were reported before.
> > > >
> > > > This patch introduces preempt only mode, and solves the issue
> > > > by allowing RQF_PREEMPT only during SCSI quiesce.
> > > >
> > > > Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> > > > them all.
> > > >
> > > > V6:
> > > >     - borrow Bart's idea of preempt only, with clean
> > > >       implementation(patch 5/patch 6)
> > > >     - needn't any external driver's dependency, such as MD's
> > > >       change
> > >
> > > Do you want me to test with v6 of the patch set? If so, it would be nice
> > > if you'd make a v6 branch in your git repo.
> >
> > Hi Martin,
> >
> > I appreciate much if you may run V6 and provide your test result,
> > follows the branch:
> >
> > https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V6
> >
> > https://github.com/ming1/linux.git #blk_safe_scsi_quiesce_V6
>
> Also follows the branch against V4.13:
>
> https://github.com/ming1/linux/tree/v4.13-safe-scsi-quiesce_V6_for_test
>
> https://github.com/ming1/linux.git #v4.13-safe-scsi-quiesce_V6_for_test
Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing
On Wed, Sep 27, 2017 at 04:27:51PM +0800, Ming Lei wrote:
> On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote:
> > Hi Ming.
> >
> > Ming Lei - 27.09.17, 13:48:
> > > Hi,
> > >
> > > The current SCSI quiesce isn't safe, and it is easy to trigger an
> > > I/O deadlock.
> > >
> > > Once SCSI device is put into QUIESCE, no new request except for
> > > RQF_PREEMPT can be dispatched to SCSI successfully, and
> > > scsi_device_quiesce() just simply waits for completion of I/Os
> > > dispatched to SCSI stack. It isn't enough at all.
> > >
> > > Because new request still can be coming, but all the allocated
> > > requests can't be dispatched successfully, so request pool can be
> > > consumed up easily.
> > >
> > > Then request with RQF_PREEMPT can't be allocated and wait forever,
> > > meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> > > then system hangs forever, such as during system suspend or
> > > sending SCSI domain validation.
> > >
> > > Both IO hang inside system suspend[1] or SCSI domain validation
> > > were reported before.
> > >
> > > This patch introduces preempt only mode, and solves the issue
> > > by allowing RQF_PREEMPT only during SCSI quiesce.
> > >
> > > Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> > > them all.
> > >
> > > V6:
> > >     - borrow Bart's idea of preempt only, with clean
> > >       implementation(patch 5/patch 6)
> > >     - needn't any external driver's dependency, such as MD's
> > >       change
> >
> > Do you want me to test with v6 of the patch set? If so, it would be nice if
> > you'd make a v6 branch in your git repo.
>
> Hi Martin,
>
> I appreciate much if you may run V6 and provide your test result,
> follows the branch:
>
> https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V6
>
> https://github.com/ming1/linux.git #blk_safe_scsi_quiesce_V6
>

Also follows the branch against V4.13:

https://github.com/ming1/linux/tree/v4.13-safe-scsi-quiesce_V6_for_test

https://github.com/ming1/linux.git #v4.13-safe-scsi-quiesce_V6_for_test

--
Ming
Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing
On Wed, Sep 27, 2017 at 09:57:37AM +0200, Martin Steigerwald wrote:
> Hi Ming.
>
> Ming Lei - 27.09.17, 13:48:
> > Hi,
> >
> > The current SCSI quiesce isn't safe, and it is easy to trigger an
> > I/O deadlock.
> >
> > Once SCSI device is put into QUIESCE, no new request except for
> > RQF_PREEMPT can be dispatched to SCSI successfully, and
> > scsi_device_quiesce() just simply waits for completion of I/Os
> > dispatched to SCSI stack. It isn't enough at all.
> >
> > Because new request still can be coming, but all the allocated
> > requests can't be dispatched successfully, so request pool can be
> > consumed up easily.
> >
> > Then request with RQF_PREEMPT can't be allocated and wait forever,
> > meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> > then system hangs forever, such as during system suspend or
> > sending SCSI domain validation.
> >
> > Both IO hang inside system suspend[1] or SCSI domain validation
> > were reported before.
> >
> > This patch introduces preempt only mode, and solves the issue
> > by allowing RQF_PREEMPT only during SCSI quiesce.
> >
> > Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> > them all.
> >
> > V6:
> >     - borrow Bart's idea of preempt only, with clean
> >       implementation(patch 5/patch 6)
> >     - needn't any external driver's dependency, such as MD's
> >       change
>
> Do you want me to test with v6 of the patch set? If so, it would be nice if
> you'd make a v6 branch in your git repo.

Hi Martin,

I appreciate much if you may run V6 and provide your test result,
follows the branch:

https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V6

https://github.com/ming1/linux.git #blk_safe_scsi_quiesce_V6

>
> After an uptime of almost 6 days I am pretty confident that the V5 one fixes
> the issue for me. So
>
> Tested-by: Martin Steigerwald
>
> for V5.

Thanks for your test!

--
Ming
Re: [PATCH V6 0/6] block/scsi: safe SCSI quiescing
Hi Ming.

Ming Lei - 27.09.17, 13:48:
> Hi,
>
> The current SCSI quiesce isn't safe, and it is easy to trigger an
> I/O deadlock.
>
> Once SCSI device is put into QUIESCE, no new request except for
> RQF_PREEMPT can be dispatched to SCSI successfully, and
> scsi_device_quiesce() just simply waits for completion of I/Os
> dispatched to SCSI stack. It isn't enough at all.
>
> Because new request still can be coming, but all the allocated
> requests can't be dispatched successfully, so request pool can be
> consumed up easily.
>
> Then request with RQF_PREEMPT can't be allocated and wait forever,
> meantime scsi_device_resume() waits for completion of RQF_PREEMPT,
> then system hangs forever, such as during system suspend or
> sending SCSI domain validation.
>
> Both IO hang inside system suspend[1] or SCSI domain validation
> were reported before.
>
> This patch introduces preempt only mode, and solves the issue
> by allowing RQF_PREEMPT only during SCSI quiesce.
>
> Both SCSI and SCSI_MQ have this IO deadlock issue, this patch fixes
> them all.
>
> V6:
>     - borrow Bart's idea of preempt only, with clean
>       implementation(patch 5/patch 6)
>     - needn't any external driver's dependency, such as MD's
>       change

Do you want me to test with v6 of the patch set? If so, it would be nice if
you'd make a v6 branch in your git repo.

After an uptime of almost 6 days I am pretty confident that the V5 one fixes
the issue for me. So

Tested-by: Martin Steigerwald

for V5.

Thanks,
Martin

> V5:
>     - fix one tiny race by introducing blk_queue_enter_preempt_freeze()
>       given this change is small enough compared with V4, I added
>       tested-by directly
>
> V4:
>     - reorganize patch order to make it more reasonable
>     - support nested preempt freeze, as required by SCSI transport spi
>     - check preempt freezing in slow path of blk_queue_enter()
>     - add "SCSI: transport_spi: resume a quiesced device"
>     - wake up freeze queue in setting dying for both blk-mq and legacy
>     - rename blk_mq_[freeze|unfreeze]_queue() in one patch
>     - rename .mq_freeze_wq and .mq_freeze_depth
>     - improve comment
>
> V3:
>     - introduce q->preempt_unfreezing to fix one bug of preempt freeze
>     - call blk_queue_enter_live() only when queue is preempt frozen
>     - cleanup a bit on the implementation of preempt freeze
>     - only patch 6 and 7 are changed
>
> V2:
>     - drop the 1st patch in V1 because percpu_ref_is_dying() is
>       enough as pointed by Tejun
>     - introduce preempt version of blk_[freeze|unfreeze]_queue
>     - sync between preempt freeze and normal freeze
>     - fix warning from percpu-refcount as reported by Oleksandr
>
>
> [1] https://marc.info/?t=150340250100013&r=3&w=2
>
>
> Thanks,
> Ming
>
> Ming Lei (6):
>   blk-mq: only run hw queues for blk-mq
>   block: tracking request allocation with q_usage_counter
>   block: pass flags to blk_queue_enter()
>   block: prepare for passing RQF_PREEMPT to request allocation
>   block: support PREEMPT_ONLY
>   SCSI: set block queue at preempt only when SCSI device is put into
>     quiesce
>
>  block/blk-core.c        | 62 ++---
>  block/blk-mq.c          | 14 ---
>  block/blk-timeout.c     |  2 +-
>  drivers/scsi/scsi_lib.c | 25 +---
>  fs/block_dev.c          |  4 ++--
>  include/linux/blk-mq.h  |  7 +++--
>  include/linux/blkdev.h  | 27 ++---
>  7 files changed, 106 insertions(+), 35 deletions(-)

--
Martin
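
For readers following the thread, the deadlock described in the cover letter
can be illustrated with a toy model. This is only a sketch under stated
assumptions, not kernel code: ToyQueue, alloc(), dispatch() and
quiesce_then_preempt() are hypothetical names made up for this illustration,
and the fixed-size pool stands in for the request pool the cover letter
mentions. The real patches work in the block layer's queue-enter and request
allocation paths.

```python
from dataclasses import dataclass


@dataclass
class ToyQueue:
    """Toy stand-in for a request queue with a fixed-size request pool."""
    pool_size: int
    quiesced: bool = False       # models the SCSI device being in QUIESCE
    preempt_only: bool = False   # hypothetical flag modelling preempt-only mode
    allocated: int = 0

    def alloc(self, preempt: bool = False) -> bool:
        """Try to take a pool slot; False means the caller would block."""
        if self.preempt_only and not preempt:
            return False  # normal I/O is held back before it takes a slot
        if self.allocated >= self.pool_size:
            return False  # pool exhausted
        self.allocated += 1
        return True

    def dispatch(self, preempt: bool = False) -> bool:
        """Dispatch one allocated request; on success its slot is freed."""
        if self.quiesced and not preempt:
            return False  # cannot be dispatched, so it keeps holding its slot
        self.allocated -= 1
        return True


def quiesce_then_preempt(pool_size: int, incoming: int, preempt_only: bool) -> bool:
    """Return True if a preempt request gets through after `incoming`
    normal requests arrive while the device is quiesced."""
    q = ToyQueue(pool_size=pool_size, quiesced=True, preempt_only=preempt_only)
    for _ in range(incoming):
        if q.alloc():
            q.dispatch()  # fails while quiesced; the slot stays in use
    if not q.alloc(preempt=True):
        return False  # in the real system this wait never ends
    return q.dispatch(preempt=True)
```

In this model, enough normal requests arriving during quiesce use up every
slot, so the preempt request can never be allocated. With the preempt-only
flag set, normal requests are stopped before they take a slot and the preempt
request goes through.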