Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Tejun Heo
Hello, Bart.

On Thu, Sep 24, 2015 at 03:54:18PM -0700, Bart Van Assche wrote:
> Sorry that I had not yet made this clear but I agreed with the analysis in
> your two most recent e-mails. I think I have found the cause of the loop:
> for one reason or another the scsi_dh_alua driver was not loaded
> automatically. I think that caused the SCSI core to return a retryable error
> code for reads and writes sent over paths in the SCSI ALUA state "standby"
> instead of a non-retryable error code, and that this caused the dm-mpath
> driver to enter an infinite loop. Loading the scsi_dh_alua driver resolved
> the infinite loop. Anyway, thank you for the feedback.

Great!  I think we still probably wanna fix the kill/reinit race tho.

Thanks.

-- 
tejun


Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche

On 09/24/2015 11:14 AM, Tejun Heo wrote:

On Thu, Sep 24, 2015 at 11:09:33AM -0700, Bart Van Assche wrote:

On 09/24/2015 10:49 AM, Tejun Heo wrote:

Again, that doesn't happen.


In case anyone would be interested, the backtraces for the lockup I had
observed are as follows:


If this is happening and it's not caused by a hung in-flight request,
it's either percpu_ref being buggy or the aforementioned kill/reinit
race screwing it up.  percpu_ref_kill() is expected to disable
tryget_live() in a finite amount of time regardless of concurrent
tryget tries.


Hello Tejun,

Sorry that I had not yet made this clear but I agreed with the analysis 
in your two most recent e-mails. I think I have found the cause of the 
loop: for one reason or another the scsi_dh_alua driver was not loaded 
automatically. I think that caused the SCSI core to return a retryable 
error code for reads and writes sent over paths in the SCSI ALUA state 
"standby" instead of a non-retryable error code, and that this caused the 
dm-mpath driver to enter an infinite loop. Loading the scsi_dh_alua 
driver resolved the infinite loop. Anyway, thank you for the feedback.


Bart.



Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche
On 09/23/2015 08:23 PM, Ming Lei wrote:
> One solution I thought of is the following patch, which depends on
> Akinobu's patch (blk-mq: fix freeze queue race
> http://marc.info/?l=linux-kernel&m=143723697010781&w=2).

Has that patch been tested against a debug kernel? The following call
trace is triggered by that patch:

WARNING: CPU: 3 PID: 200 at kernel/sched/core.c:7485 __might_sleep+0x77/0x80()
do not call blocking ops when !TASK_RUNNING; state=1 set at 
[] prepare_to_wait_event+0x63/0xf0
Modules linked in: ib_srp(O) scsi_transport_srp(O) netconsole configfs 
af_packet msr sg coretemp x86_pkg_temp_thermal crct10dif_pclmul crc32c_intel 
aesni_intel aes_x86_64 glue_helper lrw gf128mul ipmi_devintf ablk_helper cryptd 
tg3 lpc_ich microcode pcspkr libphy mfd_core ipmi_si ipmi_msghandler ib_ipoib 
rdma_ucm ib_ucm ib_uverbs ib_umad dm_multipath dm_mod rdma_cm ib_cm 
acpi_power_meter hwmon iw_cm processor button wmi ext4 crc16 mbcache jbd2 
mlx4_ib mlx4_en ib_sa ptp pps_core ib_mad ib_core ib_addr sd_mod sr_mod cdrom 
hid_generic usbhid hid ahci libahci libata ehci_pci ehci_hcd mlx4_core usbcore 
usb_common scsi_mod autofs4 [last unloaded: brd]
CPU: 3 PID: 200 Comm: kworker/u16:9 Tainted: G O 4.3.0-rc2-debug+ #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
Workqueue: events_unbound async_run_entry_fn
Call Trace:
 [] dump_stack+0x4b/0x68
 [] warn_slowpath_common+0x86/0xc0
 [] warn_slowpath_fmt+0x4c/0x50
 [] ? prepare_to_wait_event+0x63/0xf0
 [] ? prepare_to_wait_event+0x63/0xf0
 [] __might_sleep+0x77/0x80
 [] mutex_lock_nested+0x33/0x3b0
 [] ? trace_hardirqs_on+0xd/0x10
 [] ? prepare_to_wait_event+0x95/0xf0
 [] blk_mq_queue_enter+0x17a/0x2d0
 [] ? blk_mq_queue_enter+0x33/0x2d0
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] blk_mq_map_request+0x2e/0x330
 [] ? __lock_is_held+0x49/0x70
 [] blk_sq_make_request+0x91/0x430
[ ... ]
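
For context, a minimal sketch of the pattern that trips this warning, assuming
the condition expression passed to wait_event_interruptible() takes a sleeping
lock (the mutex_lock_nested() frame under blk_mq_queue_enter() above points in
that direction). wait_event_interruptible() re-evaluates its condition after
prepare_to_wait_event() has set the task state to TASK_INTERRUPTIBLE, so any
blocking call inside the condition triggers __might_sleep():

/* Hypothetical helper, for illustration only. */
static bool freeze_depth_is_zero(struct request_queue *q)
{
	bool idle;

	mutex_lock(&q->mq_freeze_lock);		/* may sleep */
	idle = q->mq_freeze_depth == 0;
	mutex_unlock(&q->mq_freeze_lock);

	return idle;
}

/*
 * The task state is already TASK_INTERRUPTIBLE when the condition below is
 * re-evaluated, so the mutex_lock() inside it triggers
 * "do not call blocking ops when !TASK_RUNNING".
 */
ret = wait_event_interruptible(q->mq_freeze_wq,
			       freeze_depth_is_zero(q) ||
			       blk_queue_dying(q));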


Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Tejun Heo
Hello,

On Thu, Sep 24, 2015 at 11:09:33AM -0700, Bart Van Assche wrote:
> On 09/24/2015 10:49 AM, Tejun Heo wrote:
> > Again, that doesn't happen.
> 
> Hello Tejun,
> 
> In case anyone would be interested, the backtraces for the lockup I had
> observed are as follows:

If this is happening and it's not caused by a hung in-flight request,
it's either percpu_ref being buggy or the aforementioned kill/reinit
race screwing it up.  percpu_ref_kill() is expected to disable
tryget_live() in a finite amount of time regardless of concurrent
tryget tries.
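
For reference, a paraphrase of the freeze path in block/blk-mq.c around this
kernel version (details elided) that makes the above expectation concrete: the
kill happens before the wait, so the wait only has to drain references taken
before the kill:

void blk_mq_freeze_queue_start(struct request_queue *q)
{
	if (atomic_inc_return(&q->mq_freeze_depth) == 1) {
		/* new percpu_ref_tryget_live() callers start failing */
		percpu_ref_kill(&q->mq_usage_counter);
		blk_mq_run_hw_queues(q, false);
	}
}

static void blk_mq_freeze_queue_wait(struct request_queue *q)
{
	/* only waits for references taken before the kill above */
	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->mq_usage_counter));
}

void blk_mq_freeze_queue(struct request_queue *q)
{
	blk_mq_freeze_queue_start(q);
	blk_mq_freeze_queue_wait(q);
}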

Thanks.

-- 
tejun


Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche
On 09/24/2015 10:49 AM, Tejun Heo wrote:
> Again, that doesn't happen.

Hello Tejun,

In case anyone would be interested, the backtraces for the lockup I had
observed are as follows:

sysrq: SysRq : Show Blocked State
  task                        PC stack   pid father
kworker/4:0 D 88045c5d5a00 029  2 0x
Workqueue: srp_remove srp_remove_work [ib_srp]
 88045c767c08 0086 810ba11d 88047fd15ad8
 88045c5d5a00 88045c768000 88045c768000 88041c737ab8
 880036f7bcf8 8804158b55e8 0100 88045c767c20
Call Trace:
 [] ? trace_hardirqs_on+0xd/0x10
 [] schedule+0x3a/0x90
 [] blk_mq_freeze_queue_wait+0x56/0xb0
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] blk_mq_update_tag_set_depth+0x44/0xb0
 [] blk_mq_free_queue+0x50/0x110
 [] blk_cleanup_queue+0x148/0x240
 [] __scsi_remove_device+0x65/0xd0 [scsi_mod]
 [] scsi_forget_host+0x69/0x70 [scsi_mod]
 [] scsi_remove_host+0x82/0x130 [scsi_mod]
 [] srp_remove_work+0x90/0x1f0 [ib_srp]
 [] process_one_work+0x1d8/0x610
 [] ? process_one_work+0x14b/0x610
 [] worker_thread+0x114/0x460
 [] ? process_one_work+0x610/0x610
 [] kthread+0xf8/0x110
 [] ? kthread_create_on_node+0x200/0x200
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_create_on_node+0x200/0x200
kworker/1:2 D 88045c648000 0   264  2 0x
Workqueue: srp_remove srp_remove_work [ib_srp]
 880444d47c08 0086 810ba11d 88047fc55ad8
 88045c648000 880450a9da00 880444d48000 88042826c8d8
 880428969790 880413b27b50 0040 880444d47c20
Call Trace:
 [] ? trace_hardirqs_on+0xd/0x10
 [] schedule+0x3a/0x90
 [] blk_mq_freeze_queue_wait+0x56/0xb0
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] blk_mq_update_tag_set_depth+0x44/0xb0
 [] blk_mq_free_queue+0x50/0x110
 [] blk_cleanup_queue+0x148/0x240
 [] __scsi_remove_device+0x65/0xd0 [scsi_mod]
 [] scsi_forget_host+0x69/0x70 [scsi_mod]
 [] scsi_remove_host+0x82/0x130 [scsi_mod]
 [] srp_remove_work+0x90/0x1f0 [ib_srp]
 [] process_one_work+0x1d8/0x610
 [] ? process_one_work+0x14b/0x610
 [] worker_thread+0x114/0x460
 [] ? process_one_work+0x610/0x610
 [] kthread+0xf8/0x110
 [] ? kthread_create_on_node+0x200/0x200
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_create_on_node+0x200/0x200
kworker/7:0 D 88045c675a00 0 27179  2 0x
Workqueue: srp_remove srp_remove_work [ib_srp]
 8803f7333c08 0086 810ba11d 88047fdd5ad8
 88045c675a00 8803f7139680 8803f7334000 880403d76e40
 88040c412408 8803fe493cf8 01c0 8803f7333c20
Call Trace:
 [] ? trace_hardirqs_on+0xd/0x10
 [] schedule+0x3a/0x90
 [] blk_mq_freeze_queue_wait+0x56/0xb0
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] blk_mq_update_tag_set_depth+0x44/0xb0
 [] blk_mq_free_queue+0x50/0x110
 [] blk_cleanup_queue+0x148/0x240
 [] __scsi_remove_device+0x65/0xd0 [scsi_mod]
 [] scsi_forget_host+0x69/0x70 [scsi_mod]
 [] scsi_remove_host+0x82/0x130 [scsi_mod]
 [] srp_remove_work+0x90/0x1f0 [ib_srp]
 [] process_one_work+0x1d8/0x610
 [] ? process_one_work+0x14b/0x610
 [] worker_thread+0x114/0x460
 [] ? process_one_work+0x610/0x610
 [] kthread+0xf8/0x110
 [] ? kthread_create_on_node+0x200/0x200
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_create_on_node+0x200/0x200
kworker/3:0 D 88045c649680 0 21529  2 0x
Workqueue: srp_remove srp_remove_work [ib_srp]
 880392f2fc08 0086 810ba11d 88047fcd5ad8
 88045c649680 88039a72c380 880392f3 880420f2fab8
 880425896ed8 880416043cf8 00c0 880392f2fc20
Call Trace:
 [] ? trace_hardirqs_on+0xd/0x10
 [] schedule+0x3a/0x90
 [] blk_mq_freeze_queue_wait+0x56/0xb0
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] blk_mq_update_tag_set_depth+0x44/0xb0
 [] blk_mq_free_queue+0x50/0x110
 [] blk_cleanup_queue+0x148/0x240
 [] __scsi_remove_device+0x65/0xd0 [scsi_mod]
 [] scsi_forget_host+0x69/0x70 [scsi_mod]
 [] scsi_remove_host+0x82/0x130 [scsi_mod]
 [] srp_remove_work+0x90/0x1f0 [ib_srp]
 [] process_one_work+0x1d8/0x610
 [] ? process_one_work+0x14b/0x610
 [] worker_thread+0x114/0x460
 [] ? process_one_work+0x610/0x610
 [] kthread+0xf8/0x110
 [] ? kthread_create_on_node+0x200/0x200
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_create_on_node+0x200/0x200
kworker/5:0 D 88045c64ad00 0 10950  2 0x
Workqueue: srp_remove srp_remove_work [ib_srp]
 88037111bc08 0086 810ba11d 88047fd55ad8
 88045c64ad00 880055991680 88037111c000 88041ff348d8
 880424463080 8804145155e8 0140 88037111bc20
Call Trace:
 [] ? trace_hardirqs_on+0xd/0x10
 [] schedule+0x3a/0x90
 [] blk_mq_freeze_queue_wait+0x56/0xb0
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] blk_mq_update_tag_set_depth+0x44/0xb0
 [] blk_mq_free_queue+0x50/0x110
 [] blk_cleanup_queue+0x148/0x240
 [] __scsi_remove_device+0x65/0xd0 [scsi_mod]
 [] scsi_forget_host+0x69/0x70 [scsi_mod]
 [] scsi_remove_host+0x82/0x

Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Tejun Heo
Hello, Bart.

On Thu, Sep 24, 2015 at 10:35:41AM -0700, Bart Van Assche wrote:
> My interpretation of the percpu_ref_tryget_live() implementation in
> <linux/percpu-refcount.h> is that the tryget operation will only fail if the
> refcount is in atomic mode and additionally the __PERCPU_REF_DEAD flag has
> been set.

Yeah and percpu_ref_kill() does both.

> >Also, what do the barriers in your patch do?
> 
> My intention was to guarantee that on architectures that do not provide the
> same ordering guarantees as x86 (e.g. PPC or ARM) the store and load
> operations on mq_freeze_depth and mq_usage_counter would not be reordered.
> However, it is probably safe to leave out the barrier I proposed to
> introduce in blk_mq_queue_enter() since it is acceptable that there is some
> delay in communicating mq_freeze_depth updates from the CPU that modified
> that counter to the CPU that reads that counter.

Hmmm... please don't use barriers this way.  Use them only when there's
a clear requirement for an interlocking writer and reader pair.  There
isn't one here.  All they do is confuse people trying to read the
code.

> >The only race condition that I can see there is if unfreeze and freeze
> >race each other and freeze tries to kill the ref which hasn't finished
> >reinit yet.  We prolly want to put mutexes around freeze/unfreeze so
> >that they're serialized if something like that can happen (it isn't a
> >hot path to begin with).
> 
> My concern is that the following could happen if mq_freeze_depth is not
> checked in the hot path of blk_mq_queue_enter():
> * mq_usage_counter >= 1 before blk_mq_freeze_queue() is called.
> * blk_mq_freeze_queue() keeps waiting forever if new requests are queued
>   faster than these requests complete.

Again, that doesn't happen.

Thanks.

-- 
tejun


Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche

On 09/24/2015 09:54 AM, Tejun Heo wrote:

On Thu, Sep 24, 2015 at 09:43:48AM -0700, Bart Van Assche wrote:

On 09/23/2015 08:23 PM, Ming Lei wrote:

IMO, mq_freeze_depth should only be accessed in the slow path, and it looks
like the race just happens during the small window between increasing
'mq_freeze_depth' and killing the percpu counter.


Hello Ming,

My concern is that *not* checking mq_freeze_depth in the hot path can cause
a livelock. If there is a software layer, e.g. multipathd, that periodically
submits new commands, and if these commands take time to process, e.g. because
the transport layer is unavailable, how can we guarantee that freezing ever
succeeds without checking mq_freeze_depth in the hot path?


I couldn't tell what the patch was trying to do from the patch
description, so including the above prolly is a good idea.  Isn't the
above guaranteed by percpu_ref_kill() preventing new tryget_live()'s?


My interpretation of the percpu_ref_tryget_live() implementation in 
<linux/percpu-refcount.h> is that the tryget operation will only fail if 
the refcount is in atomic mode and additionally the __PERCPU_REF_DEAD 
flag has been set.
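
A simplified sketch of that logic, paraphrased from <linux/percpu-refcount.h>
of that era (RCU and barrier details elided), matching the interpretation
above:

static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
{
	unsigned long __percpu *percpu_count;
	bool ret = false;

	rcu_read_lock_sched();

	if (__ref_is_percpu(ref, &percpu_count)) {
		/* percpu mode: the tryget always succeeds */
		this_cpu_inc(*percpu_count);
		ret = true;
	} else if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD)) {
		/* atomic mode, but percpu_ref_kill() has not been called yet */
		ret = atomic_long_inc_not_zero(&ref->count);
	}

	rcu_read_unlock_sched();

	return ret;
}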



Also, what do the barriers in your patch do?


My intention was to guarantee that on architectures that do not provide 
the same ordering guarantees as x86 (e.g. PPC or ARM) the store and 
load operations on mq_freeze_depth and mq_usage_counter would not be 
reordered. However, it is probably safe to leave out the barrier I 
proposed to introduce in blk_mq_queue_enter() since it is acceptable 
that there is some delay in communicating mq_freeze_depth updates from 
the CPU that modified that counter to the CPU that reads that counter.



The only race condition that I can see there is if unfreeze and freeze
race each other and freeze tries to kill the ref which hasn't finished
reinit yet.  We prolly want to put mutexes around freeze/unfreeze so
that they're serialized if something like that can happen (it isn't a
hot path to begin with).


My concern is that the following could happen if mq_freeze_depth is not 
checked in the hot path of blk_mq_queue_enter():

* mq_usage_counter >= 1 before blk_mq_freeze_queue() is called.
* blk_mq_freeze_queue() keeps waiting forever if new requests are queued
  faster than these requests complete.

Bart.



Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Tejun Heo
On Thu, Sep 24, 2015 at 09:43:48AM -0700, Bart Van Assche wrote:
> On 09/23/2015 08:23 PM, Ming Lei wrote:
> >IMO, mq_freeze_depth should only be accessed in the slow path, and it looks
> >like the race just happens during the small window between increasing
> >'mq_freeze_depth' and killing the percpu counter.
> 
> Hello Ming,
> 
> My concern is that *not* checking mq_freeze_depth in the hot path can cause
> a livelock. If there is a software layer, e.g. multipathd, that periodically
> submits new commands, and if these commands take time to process, e.g. because
> the transport layer is unavailable, how can we guarantee that freezing ever
> succeeds without checking mq_freeze_depth in the hot path?

I couldn't tell what the patch was trying to do from the patch
description, so including the above prolly is a good idea.  Isn't the
above guaranteed by percpu_ref_kill() preventing new tryget_live()'s?
Also, what do the barriers in your patch do?

The only race condition that I can see there is if unfreeze and freeze
race each other and freeze tries to kill the ref which hasn't finished
reinit yet.  We prolly want to put mutexes around freeze/unfreeze so
that they're serialized if something like that can happen (it isn't a
hot path to begin with).
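
A sketch of what that serialization could look like, assuming a per-queue
mutex such as the mq_freeze_lock introduced by Akinobu's patch (Ming's patch
further down in this thread follows essentially this shape), so that kill and
reinit can never overlap:

void blk_mq_freeze_queue_start(struct request_queue *q)
{
	mutex_lock(&q->mq_freeze_lock);
	if (!q->mq_freeze_depth++) {
		percpu_ref_kill(&q->mq_usage_counter);
		blk_mq_run_hw_queues(q, false);
	}
	mutex_unlock(&q->mq_freeze_lock);
}

void blk_mq_unfreeze_queue(struct request_queue *q)
{
	mutex_lock(&q->mq_freeze_lock);
	WARN_ON_ONCE(q->mq_freeze_depth <= 0);
	if (!--q->mq_freeze_depth) {
		/* reinit can no longer race with a concurrent kill */
		percpu_ref_reinit(&q->mq_usage_counter);
		wake_up_all(&q->mq_freeze_wq);
	}
	mutex_unlock(&q->mq_freeze_lock);
}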

Thanks.

-- 
tejun


Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-24 Thread Bart Van Assche

On 09/23/2015 08:23 PM, Ming Lei wrote:

IMO, mq_freeze_depth should only be accessed in the slow path, and it looks
like the race just happens during the small window between increasing
'mq_freeze_depth' and killing the percpu counter.


Hello Ming,

My concern is that *not* checking mq_freeze_depth in the hot path can 
cause a livelock. If there is a software layer, e.g. multipathd, that 
periodically submits new commands, and if these commands take time to 
process, e.g. because the transport layer is unavailable, how can we 
guarantee that freezing ever succeeds without checking mq_freeze_depth 
in the hot path?


Thanks,

Bart.


Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

2015-09-23 Thread Ming Lei
On Wed, 23 Sep 2015 15:14:10 -0700
Bart Van Assche  wrote:

> Ensure that blk_mq_queue_enter() waits if mq_freeze_depth is not
> zero. Ensure that the update of mq_freeze_depth by blk_mq_freeze_queue()
> is visible by all CPU cores before that function waits on
> mq_usage_counter.
> 
> It is unfortunate that this patch introduces an smp_mb() in the
> hot path (blk_mq_queue_enter()) but I have not yet found a way to
> avoid this.
> 
> I came across this code while analyzing a lockup triggered by
> deleting a SCSI host created by the SRP initiator immediately
> followed by a relogin.
> 
> Signed-off-by: Bart Van Assche 
> Cc: Christoph Hellwig 
> Cc: Tejun Heo 
> Cc: 
> ---
>  block/blk-mq.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 2077f0d..e3ad411 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -83,8 +83,13 @@ static int blk_mq_queue_enter(struct request_queue *q, gfp_t gfp)
>   while (true) {
>   int ret;
>  
> - if (percpu_ref_tryget_live(&q->mq_usage_counter))
> - return 0;
> + if (percpu_ref_tryget_live(&q->mq_usage_counter)) {
> + /* Order mq_usage_counter and mq_freeze_depth accesses */
> + smp_mb();
> + if (!atomic_read(&q->mq_freeze_depth))
> + return 0;
> + percpu_ref_put(&q->mq_usage_counter);
> + }

IMO, mq_freeze_depth should only be accessed in slow path, and looks
the race just happens during the small window between increasing
'mq_freeze_depth' and killing the percpu counter.

One solution I thought of is the following patch, which depends on
Akinobu's patch (blk-mq: fix freeze queue race
http://marc.info/?l=linux-kernel&m=143723697010781&w=2).

---
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f774f67..1c71c04 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -77,6 +77,17 @@ static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
clear_bit(CTX_TO_BIT(hctx, ctx), &bm->word);
 }
 
+static inline int blk_mq_read_freeze_depth(struct request_queue *q)
+{
+   int  depth;
+
+   mutex_lock(&q->mq_freeze_lock);
+   depth = q->mq_freeze_depth;
+   mutex_unlock(&q->mq_freeze_lock);
+
+   return depth;
+}
+
 static int blk_mq_queue_enter(struct request_queue *q, gfp_t gfp)
 {
while (true) {
@@ -89,7 +100,7 @@ static int blk_mq_queue_enter(struct request_queue *q, gfp_t gfp)
return -EBUSY;
 
ret = wait_event_interruptible(q->mq_freeze_wq,
-   !atomic_read(&q->mq_freeze_depth) ||
+   !blk_mq_read_freeze_depth(q) ||
blk_queue_dying(q));
if (blk_queue_dying(q))
return -ENODEV;
@@ -113,12 +124,9 @@ static void blk_mq_usage_counter_release(struct percpu_ref *ref)
 
 void blk_mq_freeze_queue_start(struct request_queue *q)
 {
-   int freeze_depth;
-
mutex_lock(&q->mq_freeze_lock);
 
-   freeze_depth = atomic_inc_return(&q->mq_freeze_depth);
-   if (freeze_depth == 1) {
+   if (!q->mq_freeze_depth++) {
percpu_ref_kill(&q->mq_usage_counter);
blk_mq_run_hw_queues(q, false);
}
@@ -149,7 +157,7 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
 
mutex_lock(&q->mq_freeze_lock);
 
-   freeze_depth = atomic_dec_return(&q->mq_freeze_depth);
+   freeze_depth = --q->mq_freeze_depth;
WARN_ON_ONCE(freeze_depth < 0);
if (!freeze_depth) {
percpu_ref_reinit(&q->mq_usage_counter);
@@ -2084,7 +2092,7 @@ void blk_mq_free_queue(struct request_queue *q)
 /* Basically redo blk_mq_init_queue with queue frozen */
 static void blk_mq_queue_reinit(struct request_queue *q)
 {
-   WARN_ON_ONCE(!atomic_read(&q->mq_freeze_depth));
+   WARN_ON_ONCE(!ACCESS_ONCE(q->mq_freeze_depth));
 
blk_mq_sysfs_unregister(q);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6cdf2b7..86fedcc 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -436,7 +436,7 @@ struct request_queue {
	struct mutex		sysfs_lock;
 
int bypass_depth;
-	atomic_t		mq_freeze_depth;
+   int mq_freeze_depth;
 
 #if defined(CONFIG_BLK_DEV_BSG)
bsg_job_fn  *bsg_job_fn;



>  
>   if (!(gfp & __GFP_WAIT))
>   return -EBUSY;
> @@ -136,6 +141,11 @@ static void blk_mq_freeze_queue_wait(struct request_queue *q)
>  void blk_mq_freeze_queue(struct request_queue *q)
>  {
>   blk_mq_freeze_queue_start(q);
> + /*
> +  * Ensure that the mq_freeze_depth update is visible before
> +  * mq_usage_counter is read.
> +  */
> + smp_mb();
>   blk_mq_freeze_queue_wait(q);
>  }