On Tue, Sep 18, 2018 at 01:59:01PM -0700, Bart Van Assche wrote:
> Instead of scheduling runtime resume of a request queue after a
> request has been queued, schedule asynchronous resume during request
> allocation. The new pm_request_resume() calls occur after
> blk_queue_enter() has increased the q_usage_counter request queue
> member. This change is needed for a later patch that will make request
> allocation block while the queue status is not RPM_ACTIVE.
> 
> Signed-off-by: Bart Van Assche <bvanass...@acm.org>
> Cc: Christoph Hellwig <h...@lst.de>
> Cc: Ming Lei <ming....@redhat.com>
> Cc: Jianchao Wang <jianchao.w.w...@oracle.com>
> Cc: Hannes Reinecke <h...@suse.com>
> Cc: Johannes Thumshirn <jthumsh...@suse.de>
> Cc: Alan Stern <st...@rowland.harvard.edu>
> ---
>  block/blk-core.c | 2 ++
>  block/elevator.c | 1 -
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index fd91e9bf2893..18b874d5c9c9 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -942,6 +942,8 @@ int blk_queue_enter(struct request_queue *q, 
> blk_mq_req_flags_t flags)
>               if (success)
>                       return 0;
>  
> +             blk_pm_request_resume(q);


Looks this patch may introduce the following race between queue freeze and
runtime suspend:

--------------------------------------------------------------------------
CPU0                            CPU1                                            
CPU2
--------------------------------------------------------------------------

blk_freeze_queue()

                                        blk_mq_alloc_request()
                                                blk_queue_enter()
                                                        blk_pm_request_resume()
                                                        wait_event()

                                                                                
                blk_pre_runtime_suspend()
                                                                                
                        ->blk_set_pm_only
                                                                                
                ...
                                                                                
                blk_post_runtime_suspend()

...
blk_mq_unfreeze_queue()
--------------------------------------------------------------------------
- CPU0: queue is frozen

- CPU1: one new request comes, and see queue is frozen, but queue isn't
runtime-suspended yet, then blk_pm_request_resume() does nothing. So this
allocation is blocked in wait_event().

- CPU2: runtime suspend comes, and queue becomes runtime-suspended now

- CPU0: queue is unfreeze, but the new request allocation in CPU1 may never
be done because the queue is runtime suspended, and wait_event() won't return.
And the expected result is that the queue becomes active and the allocation on
CPU1 is done immediately.

Thanks,
Ming

Reply via email to