Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-30 Thread Joe Thornber
On Fri, Jun 27, 2014 at 02:44:41PM -0400, Mikulas Patocka wrote:
> I suggest - split it to two patches, a minimal patch that fixes the bug by 
> changing sleeper to completion and the second patch with remaining changes 
> - so that only the first patch can be backported to stable kernels.

Agreed.  Thanks for reviewing.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-30 Thread Joe Thornber
On Fri, Jun 27, 2014 at 02:44:41PM -0400, Mikulas Patocka wrote:
 I suggest - split it to two patches, a minimal patch that fixes the bug by 
 changing sleeper to completion and the second patch with remaining changes 
 - so that only the first patch can be backported to stable kernels.

Agreed.  Thanks for reviewing.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-27 Thread Mikulas Patocka


On Fri, 27 Jun 2014, Mikulas Patocka wrote:

> 
> 
> On Fri, 27 Jun 2014, Minfei Huang wrote:
> 
> > BUG: unable to handle kernel NULL pointer dereference at 0046
> > IP: [] dec_count+0x5f/0x80 [dm_mod]
> > PGD 0
> > Oops:  [#1] SMP
> > last sysfs file: 
> > /sys/devices/pci:00/:00:02.2/:02:00.0/host0/scsi_host/host0/proc_name
> > 
> > Pid: 2708, comm: kcopyd Tainted: GW  --- H  
> > 2.6.32-279.19.5.el6.ucloud.x86_64 #1 Dell Inc. PowerEdge R720xd/0DCWD1
> > RIP: 0010:[]  [] dec_count+0x5f/0x80 
> > [dm_mod]
> > RSP: 0018:880100603c30  EFLAGS: 00010246
> > RAX: 0046 RBX: 8817968a5c30 RCX: 
> > RDX:  RSI:  RDI: 8817968a5c00
> > RBP: 880100603c50 R08:  R09: 
> > R10: 880caa594cc0 R11:  R12: 8817968a5c80
> > R13: 81013963 R14: 1000 R15: 
> > FS:  () GS:88010060() knlGS:
> > CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
> > CR2: 0046 CR3: 00020c309000 CR4: 001426e0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: 0ff0 DR7: 0400
> > Process kcopyd (pid: 2708, threadinfo 88180cd26000, task 
> > 881841c9aa80)
> > Stack:
> >  880100603c40 880aa8b32300  8817968a5c00
> >  880100603c80 a000a12a  880aa8b32300
> >   880caa594cc0 880100603c90 811bcf6d
> > Call Trace:
> >  
> >  [] endio+0x4a/0x70 [dm_mod]
> >  [] bio_endio+0x1d/0x40
> >  [] req_bio_endio+0x9b/0xe0
> >  [] blk_update_request+0x104/0x500
> >  [] ? blk_update_request+0x321/0x500
> >  [] blk_update_bidi_request+0x27/0xa0
> >  [] blk_end_bidi_request+0x2f/0x80
> >  [] blk_end_request+0x10/0x20
> >  [] scsi_io_completion+0xaf/0x6c0
> >  [] scsi_finish_command+0xc2/0x130
> >  [] scsi_softirq_done+0x145/0x170
> >  [] blk_done_softirq+0x8d/0xa0
> >  [] __do_softirq+0xdf/0x210
> >  [] call_softirq+0x1c/0x30
> >  [] do_softirq+0xad/0xe0
> >  [] irq_exit+0x95/0xa0
> >  [] do_IRQ+0x75/0xf0
> >  [] ret_from_intr+0x0/0x16
> > 
> > The value of rdi register(0x8817968a5c00) is the io pointer,
> > If the sync io, the address of io point must be alloc from stack.
> > SO
> > crash> struct thread_info 8817968a4000
> > struct thread_info {
> >   task = 0x88180cd9a580,
> >   exec_domain = 0x81a98ac0,
> >  ...
> > }
> > 
> > crash> struct task_struct 0x88180cd9a580
> > struct task_struct {
> >   state = 2,
> >   stack = 0x8817968a4000,
> >  ...
> > }
> > 
> > It shows value exactly when use the value of io address.
> > 
> > The io address in callback function will become the danging point,
> > cause by the thread of sync io wakes up by other threads
> > and return to relieve the io address,
> > 
> > Signed-off-by: Minfei Huang 
> > ---
> >  drivers/md/dm-io.c |   19 +++
> >  1 files changed, 15 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
> > index 3842ac7..f992913 100644
> > --- a/drivers/md/dm-io.c
> > +++ b/drivers/md/dm-io.c
> > @@ -38,6 +38,7 @@ struct io {
> > void *context;
> > void *vma_invalidate_address;
> > unsigned long vma_invalidate_size;
> > +   atomic_t wakeup;
> >  } __attribute__((aligned(DM_IO_MAX_REGIONS)));
> >  
> >  static struct kmem_cache *_dm_io_cache;
> > @@ -121,10 +122,16 @@ static void dec_count(struct io *io, unsigned int 
> > region, int error)
> > invalidate_kernel_vmap_range(io->vma_invalidate_address,
> >  io->vma_invalidate_size);
> >  
> > -   if (io->sleeper)
> > -   wake_up_process(io->sleeper);
> > +   if (io->sleeper) {
> > +   struct task_struct *sleeper = io->sleeper;
> >  
> > -   else {
> > +   atomic_set(>wakeup, 1);
> 
> The problem here is that the processor may reorder the read of io->sleeper 
> with atomic_set(>wakeup, 1); (performing atomic_set first and "sleeper 
> = io->sleeper" afterwards) exposing the same race condition.
> 
> You need to use memory barriers to avoid reordering, but I think the 
> solution with the completion is better (the completion takes care of 
> barriers automatically).
> 
> Mikulas

Also - there is another race - after atomic_set(>wakeup, 1), the 
target process may terminate, so wake_up_process(sleeper) operates on 
non-existing process. You need to declare a special wait queue or use 
wait_on_atomic_t+wake_up_atomic_t (that uses uses pre-initialized hash of 
wait queues) to avoid that race.

But as I said - the completion approach is better. It doesn't suffer from 
these problems.

Mikulas

> > +/*
> > + * The thread may be waked up by other threads,
> > + * if then the sync io point will become 

Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-27 Thread Mikulas Patocka


On Fri, 27 Jun 2014, Joe Thornber wrote:

> On Fri, Jun 27, 2014 at 12:01:30PM +0800, Minfei Huang wrote:
> > The io address in callback function will become the danging point,
> > cause by the thread of sync io wakes up by other threads
> > and return to relieve the io address,
> 
> Yes, well found.  I prefer the following fix however.
> 
> - Joe

It seems ok.

The patch is too big, I think the only change that needs to be done to fix 
the bug is to replace "struct task_struct *sleeper;" with "struct 
completion *completion;", replace "if (io->sleeper) 
wake_up_process(io->sleeper);" with "if (io->completion) 
complete(io->completion);" and declare the completion in sync_io() and 
wait on it instead of "while (1)" loop there.

I suggest - split it to two patches, a minimal patch that fixes the bug by 
changing sleeper to completion and the second patch with remaining changes 
- so that only the first patch can be backported to stable kernels.

The smaller patch is less likely to produce conflicts (or bugs introduced 
by incorrectly resolved conflicts) when being backported.

Mikulas


> Author: Joe Thornber 
> Date:   Fri Jun 27 15:49:29 2014 +0100
> 
> [dm-io] Fix a race condition in the wake up code for sync_io
> 
> There's a race condition between the atomic_dec_and_test(>count)
> in dec_count() and the waking of the sync_io() thread.  If the thread
> is spuriously woken immediately after the decrement it may exit,
> making the on the stack io struct invalid, yet the dec_count could
> still be using it.
> 
> There are smaller fixes than the one here (eg, just take the io object
> off the stack).  But I feel this code could use a clean up.
> 
> - simplify dec_count().
> 
>   - It always calls a callback fn now.
>   - It always frees the io object back to the pool.
> 
> - sync_io()
> 
>   - Take the io object off the stack and allocate it from the pool the
> same as async_io.
>   - Use a completion object rather than an explicit io_schedule()
> loop.  The callback triggers the completion.
> 
> Reported by: Minfei Huang
> 
> diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
> index 3842ac7..a0982e81 100644
> --- a/drivers/md/dm-io.c
> +++ b/drivers/md/dm-io.c
> @@ -10,6 +10,7 @@
>  #include 
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -32,7 +33,6 @@ struct dm_io_client {
>  struct io {
>   unsigned long error_bits;
>   atomic_t count;
> - struct task_struct *sleeper;
>   struct dm_io_client *client;
>   io_notify_fn callback;
>   void *context;
> @@ -111,28 +111,27 @@ static void retrieve_io_and_region_from_bio(struct bio 
> *bio, struct io **io,
>   * We need an io object to keep track of the number of bios that
>   * have been dispatched for a particular io.
>   *---*/
> -static void dec_count(struct io *io, unsigned int region, int error)
> +static void complete_io(struct io *io)
>  {
> - if (error)
> - set_bit(region, >error_bits);
> + unsigned long error_bits = io->error_bits;
> + io_notify_fn fn = io->callback;
> + void *context = io->context;
>  
> - if (atomic_dec_and_test(>count)) {
> - if (io->vma_invalidate_size)
> - invalidate_kernel_vmap_range(io->vma_invalidate_address,
> -  io->vma_invalidate_size);
> + if (io->vma_invalidate_size)
> + invalidate_kernel_vmap_range(io->vma_invalidate_address,
> +  io->vma_invalidate_size);
>  
> - if (io->sleeper)
> - wake_up_process(io->sleeper);
> + mempool_free(io, io->client->pool);
> + fn(error_bits, context);
> +}
>  
> - else {
> - unsigned long r = io->error_bits;
> - io_notify_fn fn = io->callback;
> - void *context = io->context;
> +static void dec_count(struct io *io, unsigned int region, int error)
> +{
> + if (error)
> + set_bit(region, >error_bits);
>  
> - mempool_free(io, io->client->pool);
> - fn(r, context);
> - }
> - }
> + if (atomic_dec_and_test(>count))
> + complete_io(io);
>  }
>  
>  static void endio(struct bio *bio, int error)
> @@ -375,48 +374,49 @@ static void dispatch_io(int rw, unsigned int 
> num_regions,
>   dec_count(io, 0, 0);
>  }
>  
> +struct sync_io {
> + unsigned long error_bits;
> + struct completion complete;
> +};
> +
> +static void sync_complete(unsigned long error, void *context)
> +{
> + struct sync_io *sio = context;
> + sio->error_bits = error;
> + complete(>complete);
> +}
> +
>  static int sync_io(struct dm_io_client *client, unsigned int num_regions,
>  struct dm_io_region *where, int rw, 

Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-27 Thread Mikulas Patocka


On Fri, 27 Jun 2014, Minfei Huang wrote:

> BUG: unable to handle kernel NULL pointer dereference at 0046
> IP: [] dec_count+0x5f/0x80 [dm_mod]
> PGD 0
> Oops:  [#1] SMP
> last sysfs file: 
> /sys/devices/pci:00/:00:02.2/:02:00.0/host0/scsi_host/host0/proc_name
> 
> Pid: 2708, comm: kcopyd Tainted: GW  --- H  
> 2.6.32-279.19.5.el6.ucloud.x86_64 #1 Dell Inc. PowerEdge R720xd/0DCWD1
> RIP: 0010:[]  [] dec_count+0x5f/0x80 
> [dm_mod]
> RSP: 0018:880100603c30  EFLAGS: 00010246
> RAX: 0046 RBX: 8817968a5c30 RCX: 
> RDX:  RSI:  RDI: 8817968a5c00
> RBP: 880100603c50 R08:  R09: 
> R10: 880caa594cc0 R11:  R12: 8817968a5c80
> R13: 81013963 R14: 1000 R15: 
> FS:  () GS:88010060() knlGS:
> CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
> CR2: 0046 CR3: 00020c309000 CR4: 001426e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process kcopyd (pid: 2708, threadinfo 88180cd26000, task 881841c9aa80)
> Stack:
>  880100603c40 880aa8b32300  8817968a5c00
>  880100603c80 a000a12a  880aa8b32300
>   880caa594cc0 880100603c90 811bcf6d
> Call Trace:
>  
>  [] endio+0x4a/0x70 [dm_mod]
>  [] bio_endio+0x1d/0x40
>  [] req_bio_endio+0x9b/0xe0
>  [] blk_update_request+0x104/0x500
>  [] ? blk_update_request+0x321/0x500
>  [] blk_update_bidi_request+0x27/0xa0
>  [] blk_end_bidi_request+0x2f/0x80
>  [] blk_end_request+0x10/0x20
>  [] scsi_io_completion+0xaf/0x6c0
>  [] scsi_finish_command+0xc2/0x130
>  [] scsi_softirq_done+0x145/0x170
>  [] blk_done_softirq+0x8d/0xa0
>  [] __do_softirq+0xdf/0x210
>  [] call_softirq+0x1c/0x30
>  [] do_softirq+0xad/0xe0
>  [] irq_exit+0x95/0xa0
>  [] do_IRQ+0x75/0xf0
>  [] ret_from_intr+0x0/0x16
> 
> The value of rdi register(0x8817968a5c00) is the io pointer,
> If the sync io, the address of io point must be alloc from stack.
> SO
> crash> struct thread_info 8817968a4000
> struct thread_info {
>   task = 0x88180cd9a580,
>   exec_domain = 0x81a98ac0,
>  ...
> }
> 
> crash> struct task_struct 0x88180cd9a580
> struct task_struct {
>   state = 2,
>   stack = 0x8817968a4000,
>  ...
> }
> 
> It shows value exactly when use the value of io address.
> 
> The io address in callback function will become the danging point,
> cause by the thread of sync io wakes up by other threads
> and return to relieve the io address,
> 
> Signed-off-by: Minfei Huang 
> ---
>  drivers/md/dm-io.c |   19 +++
>  1 files changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
> index 3842ac7..f992913 100644
> --- a/drivers/md/dm-io.c
> +++ b/drivers/md/dm-io.c
> @@ -38,6 +38,7 @@ struct io {
>   void *context;
>   void *vma_invalidate_address;
>   unsigned long vma_invalidate_size;
> + atomic_t wakeup;
>  } __attribute__((aligned(DM_IO_MAX_REGIONS)));
>  
>  static struct kmem_cache *_dm_io_cache;
> @@ -121,10 +122,16 @@ static void dec_count(struct io *io, unsigned int 
> region, int error)
>   invalidate_kernel_vmap_range(io->vma_invalidate_address,
>io->vma_invalidate_size);
>  
> - if (io->sleeper)
> - wake_up_process(io->sleeper);
> + if (io->sleeper) {
> + struct task_struct *sleeper = io->sleeper;
>  
> - else {
> + atomic_set(>wakeup, 1);

The problem here is that the processor may reorder the read of io->sleeper 
with atomic_set(>wakeup, 1); (performing atomic_set first and "sleeper 
= io->sleeper" afterwards) exposing the same race condition.

You need to use memory barriers to avoid reordering, but I think the 
solution with the completion is better (the completion takes care of 
barriers automatically).

Mikulas


> +/*
> + * The thread may be waked up by other threads,
> + * if then the sync io point will become the dangling pointer
> + */
> + wake_up_process(sleeper);
> + } else {
>   unsigned long r = io->error_bits;
>   io_notify_fn fn = io->callback;
>   void *context = io->context;
> @@ -401,12 +408,14 @@ static int sync_io(struct dm_io_client *client, 
> unsigned int num_regions,
>   io->vma_invalidate_address = dp->vma_invalidate_address;
>   io->vma_invalidate_size = dp->vma_invalidate_size;
>  
> + atomic_set(>wakeup, 0);
> +
>   dispatch_io(rw, num_regions, where, dp, io, 1);
>  
>   while (1) {
>   set_current_state(TASK_UNINTERRUPTIBLE);
>  
> -   

Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-27 Thread Joe Thornber
On Fri, Jun 27, 2014 at 12:01:30PM +0800, Minfei Huang wrote:
> The io address in callback function will become the danging point,
> cause by the thread of sync io wakes up by other threads
> and return to relieve the io address,

Yes, well found.  I prefer the following fix however.

- Joe



Author: Joe Thornber 
Date:   Fri Jun 27 15:49:29 2014 +0100

[dm-io] Fix a race condition in the wake up code for sync_io

There's a race condition between the atomic_dec_and_test(>count)
in dec_count() and the waking of the sync_io() thread.  If the thread
is spuriously woken immediately after the decrement it may exit,
making the on the stack io struct invalid, yet the dec_count could
still be using it.

There are smaller fixes than the one here (eg, just take the io object
off the stack).  But I feel this code could use a clean up.

- simplify dec_count().

  - It always calls a callback fn now.
  - It always frees the io object back to the pool.

- sync_io()

  - Take the io object off the stack and allocate it from the pool the
same as async_io.
  - Use a completion object rather than an explicit io_schedule()
loop.  The callback triggers the completion.

Reported by: Minfei Huang

diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 3842ac7..a0982e81 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -10,6 +10,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -32,7 +33,6 @@ struct dm_io_client {
 struct io {
unsigned long error_bits;
atomic_t count;
-   struct task_struct *sleeper;
struct dm_io_client *client;
io_notify_fn callback;
void *context;
@@ -111,28 +111,27 @@ static void retrieve_io_and_region_from_bio(struct bio 
*bio, struct io **io,
  * We need an io object to keep track of the number of bios that
  * have been dispatched for a particular io.
  *---*/
-static void dec_count(struct io *io, unsigned int region, int error)
+static void complete_io(struct io *io)
 {
-   if (error)
-   set_bit(region, >error_bits);
+   unsigned long error_bits = io->error_bits;
+   io_notify_fn fn = io->callback;
+   void *context = io->context;
 
-   if (atomic_dec_and_test(>count)) {
-   if (io->vma_invalidate_size)
-   invalidate_kernel_vmap_range(io->vma_invalidate_address,
-io->vma_invalidate_size);
+   if (io->vma_invalidate_size)
+   invalidate_kernel_vmap_range(io->vma_invalidate_address,
+io->vma_invalidate_size);
 
-   if (io->sleeper)
-   wake_up_process(io->sleeper);
+   mempool_free(io, io->client->pool);
+   fn(error_bits, context);
+}
 
-   else {
-   unsigned long r = io->error_bits;
-   io_notify_fn fn = io->callback;
-   void *context = io->context;
+static void dec_count(struct io *io, unsigned int region, int error)
+{
+   if (error)
+   set_bit(region, >error_bits);
 
-   mempool_free(io, io->client->pool);
-   fn(r, context);
-   }
-   }
+   if (atomic_dec_and_test(>count))
+   complete_io(io);
 }
 
 static void endio(struct bio *bio, int error)
@@ -375,48 +374,49 @@ static void dispatch_io(int rw, unsigned int num_regions,
dec_count(io, 0, 0);
 }
 
+struct sync_io {
+   unsigned long error_bits;
+   struct completion complete;
+};
+
+static void sync_complete(unsigned long error, void *context)
+{
+   struct sync_io *sio = context;
+   sio->error_bits = error;
+   complete(>complete);
+}
+
 static int sync_io(struct dm_io_client *client, unsigned int num_regions,
   struct dm_io_region *where, int rw, struct dpages *dp,
   unsigned long *error_bits)
 {
-   /*
-* gcc <= 4.3 can't do the alignment for stack variables, so we must
-* align it on our own.
-* volatile prevents the optimizer from removing or reusing
-* "io_" field from the stack frame (allowed in ANSI C).
-*/
-   volatile char io_[sizeof(struct io) + __alignof__(struct io) - 1];
-   struct io *io = (struct io *)PTR_ALIGN(_, __alignof__(struct io));
+   struct io *io;
+   struct sync_io sio;
 
if (num_regions > 1 && (rw & RW_MASK) != WRITE) {
WARN_ON(1);
return -EIO;
}
 
+   init_completion();
+
+   io = mempool_alloc(client->pool, GFP_NOIO);
io->error_bits = 0;
atomic_set(>count, 1); /* see dispatch_io() */
-   io->sleeper = current;
+   io->callback = sync_complete;
+   io->context = 
io->client = client;
 
   

Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-27 Thread Joe Thornber
On Fri, Jun 27, 2014 at 12:01:30PM +0800, Minfei Huang wrote:
 The io address in callback function will become the danging point,
 cause by the thread of sync io wakes up by other threads
 and return to relieve the io address,

Yes, well found.  I prefer the following fix however.

- Joe



Author: Joe Thornber e...@redhat.com
Date:   Fri Jun 27 15:49:29 2014 +0100

[dm-io] Fix a race condition in the wake up code for sync_io

There's a race condition between the atomic_dec_and_test(io-count)
in dec_count() and the waking of the sync_io() thread.  If the thread
is spuriously woken immediately after the decrement it may exit,
making the on the stack io struct invalid, yet the dec_count could
still be using it.

There are smaller fixes than the one here (eg, just take the io object
off the stack).  But I feel this code could use a clean up.

- simplify dec_count().

  - It always calls a callback fn now.
  - It always frees the io object back to the pool.

- sync_io()

  - Take the io object off the stack and allocate it from the pool the
same as async_io.
  - Use a completion object rather than an explicit io_schedule()
loop.  The callback triggers the completion.

Reported by: Minfei Huang

diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 3842ac7..a0982e81 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -10,6 +10,7 @@
 #include linux/device-mapper.h
 
 #include linux/bio.h
+#include linux/completion.h
 #include linux/mempool.h
 #include linux/module.h
 #include linux/sched.h
@@ -32,7 +33,6 @@ struct dm_io_client {
 struct io {
unsigned long error_bits;
atomic_t count;
-   struct task_struct *sleeper;
struct dm_io_client *client;
io_notify_fn callback;
void *context;
@@ -111,28 +111,27 @@ static void retrieve_io_and_region_from_bio(struct bio 
*bio, struct io **io,
  * We need an io object to keep track of the number of bios that
  * have been dispatched for a particular io.
  *---*/
-static void dec_count(struct io *io, unsigned int region, int error)
+static void complete_io(struct io *io)
 {
-   if (error)
-   set_bit(region, io-error_bits);
+   unsigned long error_bits = io-error_bits;
+   io_notify_fn fn = io-callback;
+   void *context = io-context;
 
-   if (atomic_dec_and_test(io-count)) {
-   if (io-vma_invalidate_size)
-   invalidate_kernel_vmap_range(io-vma_invalidate_address,
-io-vma_invalidate_size);
+   if (io-vma_invalidate_size)
+   invalidate_kernel_vmap_range(io-vma_invalidate_address,
+io-vma_invalidate_size);
 
-   if (io-sleeper)
-   wake_up_process(io-sleeper);
+   mempool_free(io, io-client-pool);
+   fn(error_bits, context);
+}
 
-   else {
-   unsigned long r = io-error_bits;
-   io_notify_fn fn = io-callback;
-   void *context = io-context;
+static void dec_count(struct io *io, unsigned int region, int error)
+{
+   if (error)
+   set_bit(region, io-error_bits);
 
-   mempool_free(io, io-client-pool);
-   fn(r, context);
-   }
-   }
+   if (atomic_dec_and_test(io-count))
+   complete_io(io);
 }
 
 static void endio(struct bio *bio, int error)
@@ -375,48 +374,49 @@ static void dispatch_io(int rw, unsigned int num_regions,
dec_count(io, 0, 0);
 }
 
+struct sync_io {
+   unsigned long error_bits;
+   struct completion complete;
+};
+
+static void sync_complete(unsigned long error, void *context)
+{
+   struct sync_io *sio = context;
+   sio-error_bits = error;
+   complete(sio-complete);
+}
+
 static int sync_io(struct dm_io_client *client, unsigned int num_regions,
   struct dm_io_region *where, int rw, struct dpages *dp,
   unsigned long *error_bits)
 {
-   /*
-* gcc = 4.3 can't do the alignment for stack variables, so we must
-* align it on our own.
-* volatile prevents the optimizer from removing or reusing
-* io_ field from the stack frame (allowed in ANSI C).
-*/
-   volatile char io_[sizeof(struct io) + __alignof__(struct io) - 1];
-   struct io *io = (struct io *)PTR_ALIGN(io_, __alignof__(struct io));
+   struct io *io;
+   struct sync_io sio;
 
if (num_regions  1  (rw  RW_MASK) != WRITE) {
WARN_ON(1);
return -EIO;
}
 
+   init_completion(sio.complete);
+
+   io = mempool_alloc(client-pool, GFP_NOIO);
io-error_bits = 0;
atomic_set(io-count, 1); /* see dispatch_io() */
-   io-sleeper = 

Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-27 Thread Mikulas Patocka


On Fri, 27 Jun 2014, Minfei Huang wrote:

 BUG: unable to handle kernel NULL pointer dereference at 0046
 IP: [a0009cef] dec_count+0x5f/0x80 [dm_mod]
 PGD 0
 Oops:  [#1] SMP
 last sysfs file: 
 /sys/devices/pci:00/:00:02.2/:02:00.0/host0/scsi_host/host0/proc_name
 
 Pid: 2708, comm: kcopyd Tainted: GW  --- H  
 2.6.32-279.19.5.el6.ucloud.x86_64 #1 Dell Inc. PowerEdge R720xd/0DCWD1
 RIP: 0010:[a0009cef]  [a0009cef] dec_count+0x5f/0x80 
 [dm_mod]
 RSP: 0018:880100603c30  EFLAGS: 00010246
 RAX: 0046 RBX: 8817968a5c30 RCX: 
 RDX:  RSI:  RDI: 8817968a5c00
 RBP: 880100603c50 R08:  R09: 
 R10: 880caa594cc0 R11:  R12: 8817968a5c80
 R13: 81013963 R14: 1000 R15: 
 FS:  () GS:88010060() knlGS:
 CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
 CR2: 0046 CR3: 00020c309000 CR4: 001426e0
 DR0:  DR1:  DR2: 
 DR3:  DR6: 0ff0 DR7: 0400
 Process kcopyd (pid: 2708, threadinfo 88180cd26000, task 881841c9aa80)
 Stack:
  880100603c40 880aa8b32300  8817968a5c00
 d 880100603c80 a000a12a  880aa8b32300
 d  880caa594cc0 880100603c90 811bcf6d
 Call Trace:
  IRQ
  [a000a12a] endio+0x4a/0x70 [dm_mod]
  [811bcf6d] bio_endio+0x1d/0x40
  [81260beb] req_bio_endio+0x9b/0xe0
  [81263114] blk_update_request+0x104/0x500
  [81263331] ? blk_update_request+0x321/0x500
  [81263537] blk_update_bidi_request+0x27/0xa0
  [8126419f] blk_end_bidi_request+0x2f/0x80
  [81264240] blk_end_request+0x10/0x20
  [81375c6f] scsi_io_completion+0xaf/0x6c0
  [8136cb92] scsi_finish_command+0xc2/0x130
  [813763e5] scsi_softirq_done+0x145/0x170
  [812698ed] blk_done_softirq+0x8d/0xa0
  [81074c5f] __do_softirq+0xdf/0x210
  [8100c2cc] call_softirq+0x1c/0x30
  [8100df9d] do_softirq+0xad/0xe0
  [81074995] irq_exit+0x95/0xa0
  [81510515] do_IRQ+0x75/0xf0
  [8100ba53] ret_from_intr+0x0/0x16
 
 The value of rdi register(0x8817968a5c00) is the io pointer,
 If the sync io, the address of io point must be alloc from stack.
 SO
 crash struct thread_info 8817968a4000
 struct thread_info {
   task = 0x88180cd9a580,
   exec_domain = 0x81a98ac0,
  ...
 }
 
 crash struct task_struct 0x88180cd9a580
 struct task_struct {
   state = 2,
   stack = 0x8817968a4000,
  ...
 }
 
 It shows value exactly when use the value of io address.
 
 The io address in callback function will become the danging point,
 cause by the thread of sync io wakes up by other threads
 and return to relieve the io address,
 
 Signed-off-by: Minfei Huang huangmin...@ucloud.cn
 ---
  drivers/md/dm-io.c |   19 +++
  1 files changed, 15 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
 index 3842ac7..f992913 100644
 --- a/drivers/md/dm-io.c
 +++ b/drivers/md/dm-io.c
 @@ -38,6 +38,7 @@ struct io {
   void *context;
   void *vma_invalidate_address;
   unsigned long vma_invalidate_size;
 + atomic_t wakeup;
  } __attribute__((aligned(DM_IO_MAX_REGIONS)));
  
  static struct kmem_cache *_dm_io_cache;
 @@ -121,10 +122,16 @@ static void dec_count(struct io *io, unsigned int 
 region, int error)
   invalidate_kernel_vmap_range(io-vma_invalidate_address,
io-vma_invalidate_size);
  
 - if (io-sleeper)
 - wake_up_process(io-sleeper);
 + if (io-sleeper) {
 + struct task_struct *sleeper = io-sleeper;
  
 - else {
 + atomic_set(io-wakeup, 1);

The problem here is that the processor may reorder the read of io-sleeper 
with atomic_set(io-wakeup, 1); (performing atomic_set first and sleeper 
= io-sleeper afterwards) exposing the same race condition.

You need to use memory barriers to avoid reordering, but I think the 
solution with the completion is better (the completion takes care of 
barriers automatically).

Mikulas


 +/*
 + * The thread may be waked up by other threads,
 + * if then the sync io point will become the dangling pointer
 + */
 + wake_up_process(sleeper);
 + } else {
   unsigned long r = io-error_bits;
   io_notify_fn fn = io-callback;
   void *context = io-context;
 @@ -401,12 +408,14 @@ static int sync_io(struct dm_io_client *client, 
 unsigned int num_regions,
   io-vma_invalidate_address = dp-vma_invalidate_address;
   

Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-27 Thread Mikulas Patocka


On Fri, 27 Jun 2014, Joe Thornber wrote:

 On Fri, Jun 27, 2014 at 12:01:30PM +0800, Minfei Huang wrote:
  The io address in callback function will become the danging point,
  cause by the thread of sync io wakes up by other threads
  and return to relieve the io address,
 
 Yes, well found.  I prefer the following fix however.
 
 - Joe

It seems ok.

The patch is too big, I think the only change that needs to be done to fix 
the bug is to replace struct task_struct *sleeper; with struct 
completion *completion;, replace if (io-sleeper) 
wake_up_process(io-sleeper); with if (io-completion) 
complete(io-completion); and declare the completion in sync_io() and 
wait on it instead of while (1) loop there.

I suggest - split it to two patches, a minimal patch that fixes the bug by 
changing sleeper to completion and the second patch with remaining changes 
- so that only the first patch can be backported to stable kernels.

The smaller patch is less likely to produce conflicts (or bugs introduced 
by incorrectly resolved conflicts) when being backported.

Mikulas


 Author: Joe Thornber e...@redhat.com
 Date:   Fri Jun 27 15:49:29 2014 +0100
 
 [dm-io] Fix a race condition in the wake up code for sync_io
 
 There's a race condition between the atomic_dec_and_test(io-count)
 in dec_count() and the waking of the sync_io() thread.  If the thread
 is spuriously woken immediately after the decrement it may exit,
 making the on the stack io struct invalid, yet the dec_count could
 still be using it.
 
 There are smaller fixes than the one here (eg, just take the io object
 off the stack).  But I feel this code could use a clean up.
 
 - simplify dec_count().
 
   - It always calls a callback fn now.
   - It always frees the io object back to the pool.
 
 - sync_io()
 
   - Take the io object off the stack and allocate it from the pool the
 same as async_io.
   - Use a completion object rather than an explicit io_schedule()
 loop.  The callback triggers the completion.
 
 Reported by: Minfei Huang
 
 diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
 index 3842ac7..a0982e81 100644
 --- a/drivers/md/dm-io.c
 +++ b/drivers/md/dm-io.c
 @@ -10,6 +10,7 @@
  #include linux/device-mapper.h
  
  #include linux/bio.h
 +#include linux/completion.h
  #include linux/mempool.h
  #include linux/module.h
  #include linux/sched.h
 @@ -32,7 +33,6 @@ struct dm_io_client {
  struct io {
   unsigned long error_bits;
   atomic_t count;
 - struct task_struct *sleeper;
   struct dm_io_client *client;
   io_notify_fn callback;
   void *context;
 @@ -111,28 +111,27 @@ static void retrieve_io_and_region_from_bio(struct bio 
 *bio, struct io **io,
   * We need an io object to keep track of the number of bios that
   * have been dispatched for a particular io.
   *---*/
 -static void dec_count(struct io *io, unsigned int region, int error)
 +static void complete_io(struct io *io)
  {
 - if (error)
 - set_bit(region, io-error_bits);
 + unsigned long error_bits = io-error_bits;
 + io_notify_fn fn = io-callback;
 + void *context = io-context;
  
 - if (atomic_dec_and_test(io-count)) {
 - if (io-vma_invalidate_size)
 - invalidate_kernel_vmap_range(io-vma_invalidate_address,
 -  io-vma_invalidate_size);
 + if (io-vma_invalidate_size)
 + invalidate_kernel_vmap_range(io-vma_invalidate_address,
 +  io-vma_invalidate_size);
  
 - if (io-sleeper)
 - wake_up_process(io-sleeper);
 + mempool_free(io, io-client-pool);
 + fn(error_bits, context);
 +}
  
 - else {
 - unsigned long r = io-error_bits;
 - io_notify_fn fn = io-callback;
 - void *context = io-context;
 +static void dec_count(struct io *io, unsigned int region, int error)
 +{
 + if (error)
 + set_bit(region, io-error_bits);
  
 - mempool_free(io, io-client-pool);
 - fn(r, context);
 - }
 - }
 + if (atomic_dec_and_test(io-count))
 + complete_io(io);
  }
  
  static void endio(struct bio *bio, int error)
 @@ -375,48 +374,49 @@ static void dispatch_io(int rw, unsigned int 
 num_regions,
   dec_count(io, 0, 0);
  }
  
 +struct sync_io {
 + unsigned long error_bits;
 + struct completion complete;
 +};
 +
 +static void sync_complete(unsigned long error, void *context)
 +{
 + struct sync_io *sio = context;
 + sio-error_bits = error;
 + complete(sio-complete);
 +}
 +
  static int sync_io(struct dm_io_client *client, unsigned int num_regions,
  struct dm_io_region *where, int rw, struct dpages *dp,
  

Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-27 Thread Mikulas Patocka


On Fri, 27 Jun 2014, Mikulas Patocka wrote:

 
 
 On Fri, 27 Jun 2014, Minfei Huang wrote:
 
  BUG: unable to handle kernel NULL pointer dereference at 0046
  IP: [a0009cef] dec_count+0x5f/0x80 [dm_mod]
  PGD 0
  Oops:  [#1] SMP
  last sysfs file: 
  /sys/devices/pci:00/:00:02.2/:02:00.0/host0/scsi_host/host0/proc_name
  
  Pid: 2708, comm: kcopyd Tainted: GW  --- H  
  2.6.32-279.19.5.el6.ucloud.x86_64 #1 Dell Inc. PowerEdge R720xd/0DCWD1
  RIP: 0010:[a0009cef]  [a0009cef] dec_count+0x5f/0x80 
  [dm_mod]
  RSP: 0018:880100603c30  EFLAGS: 00010246
  RAX: 0046 RBX: 8817968a5c30 RCX: 
  RDX:  RSI:  RDI: 8817968a5c00
  RBP: 880100603c50 R08:  R09: 
  R10: 880caa594cc0 R11:  R12: 8817968a5c80
  R13: 81013963 R14: 1000 R15: 
  FS:  () GS:88010060() knlGS:
  CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
  CR2: 0046 CR3: 00020c309000 CR4: 001426e0
  DR0:  DR1:  DR2: 
  DR3:  DR6: 0ff0 DR7: 0400
  Process kcopyd (pid: 2708, threadinfo 88180cd26000, task 
  881841c9aa80)
  Stack:
   880100603c40 880aa8b32300  8817968a5c00
  d 880100603c80 a000a12a  880aa8b32300
  d  880caa594cc0 880100603c90 811bcf6d
  Call Trace:
   IRQ
   [a000a12a] endio+0x4a/0x70 [dm_mod]
   [811bcf6d] bio_endio+0x1d/0x40
   [81260beb] req_bio_endio+0x9b/0xe0
   [81263114] blk_update_request+0x104/0x500
   [81263331] ? blk_update_request+0x321/0x500
   [81263537] blk_update_bidi_request+0x27/0xa0
   [8126419f] blk_end_bidi_request+0x2f/0x80
   [81264240] blk_end_request+0x10/0x20
   [81375c6f] scsi_io_completion+0xaf/0x6c0
   [8136cb92] scsi_finish_command+0xc2/0x130
   [813763e5] scsi_softirq_done+0x145/0x170
   [812698ed] blk_done_softirq+0x8d/0xa0
   [81074c5f] __do_softirq+0xdf/0x210
   [8100c2cc] call_softirq+0x1c/0x30
   [8100df9d] do_softirq+0xad/0xe0
   [81074995] irq_exit+0x95/0xa0
   [81510515] do_IRQ+0x75/0xf0
   [8100ba53] ret_from_intr+0x0/0x16
  
  The value of rdi register(0x8817968a5c00) is the io pointer,
  If the sync io, the address of io point must be alloc from stack.
  SO
  crash struct thread_info 8817968a4000
  struct thread_info {
task = 0x88180cd9a580,
exec_domain = 0x81a98ac0,
   ...
  }
  
  crash struct task_struct 0x88180cd9a580
  struct task_struct {
state = 2,
stack = 0x8817968a4000,
   ...
  }
  
  It shows value exactly when use the value of io address.
  
  The io address in callback function will become the danging point,
  cause by the thread of sync io wakes up by other threads
  and return to relieve the io address,
  
  Signed-off-by: Minfei Huang huangmin...@ucloud.cn
  ---
   drivers/md/dm-io.c |   19 +++
   1 files changed, 15 insertions(+), 4 deletions(-)
  
  diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
  index 3842ac7..f992913 100644
  --- a/drivers/md/dm-io.c
  +++ b/drivers/md/dm-io.c
  @@ -38,6 +38,7 @@ struct io {
  void *context;
  void *vma_invalidate_address;
  unsigned long vma_invalidate_size;
  +   atomic_t wakeup;
   } __attribute__((aligned(DM_IO_MAX_REGIONS)));
   
   static struct kmem_cache *_dm_io_cache;
  @@ -121,10 +122,16 @@ static void dec_count(struct io *io, unsigned int 
  region, int error)
  invalidate_kernel_vmap_range(io-vma_invalidate_address,
   io-vma_invalidate_size);
   
  -   if (io-sleeper)
  -   wake_up_process(io-sleeper);
  +   if (io-sleeper) {
  +   struct task_struct *sleeper = io-sleeper;
   
  -   else {
  +   atomic_set(io-wakeup, 1);
 
 The problem here is that the processor may reorder the read of io-sleeper 
 with atomic_set(io-wakeup, 1); (performing atomic_set first and sleeper 
 = io-sleeper afterwards) exposing the same race condition.
 
 You need to use memory barriers to avoid reordering, but I think the 
 solution with the completion is better (the completion takes care of 
 barriers automatically).
 
 Mikulas

Also - there is another race - after atomic_set(io-wakeup, 1), the 
target process may terminate, so wake_up_process(sleeper) operates on 
non-existing process. You need to declare a special wait queue or use 
wait_on_atomic_t+wake_up_atomic_t (that uses uses pre-initialized hash of 
wait queues) to avoid that race.

But as I said - the completion approach is better. It doesn't suffer 

[PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-26 Thread Minfei Huang
BUG: unable to handle kernel NULL pointer dereference at 0046
IP: [] dec_count+0x5f/0x80 [dm_mod]
PGD 0
Oops:  [#1] SMP
last sysfs file: 
/sys/devices/pci:00/:00:02.2/:02:00.0/host0/scsi_host/host0/proc_name

Pid: 2708, comm: kcopyd Tainted: GW  --- H  
2.6.32-279.19.5.el6.ucloud.x86_64 #1 Dell Inc. PowerEdge R720xd/0DCWD1
RIP: 0010:[]  [] dec_count+0x5f/0x80 
[dm_mod]
RSP: 0018:880100603c30  EFLAGS: 00010246
RAX: 0046 RBX: 8817968a5c30 RCX: 
RDX:  RSI:  RDI: 8817968a5c00
RBP: 880100603c50 R08:  R09: 
R10: 880caa594cc0 R11:  R12: 8817968a5c80
R13: 81013963 R14: 1000 R15: 
FS:  () GS:88010060() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0046 CR3: 00020c309000 CR4: 001426e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process kcopyd (pid: 2708, threadinfo 88180cd26000, task 881841c9aa80)
Stack:
 880100603c40 880aa8b32300  8817968a5c00
 880100603c80 a000a12a  880aa8b32300
  880caa594cc0 880100603c90 811bcf6d
Call Trace:
 
 [] endio+0x4a/0x70 [dm_mod]
 [] bio_endio+0x1d/0x40
 [] req_bio_endio+0x9b/0xe0
 [] blk_update_request+0x104/0x500
 [] ? blk_update_request+0x321/0x500
 [] blk_update_bidi_request+0x27/0xa0
 [] blk_end_bidi_request+0x2f/0x80
 [] blk_end_request+0x10/0x20
 [] scsi_io_completion+0xaf/0x6c0
 [] scsi_finish_command+0xc2/0x130
 [] scsi_softirq_done+0x145/0x170
 [] blk_done_softirq+0x8d/0xa0
 [] __do_softirq+0xdf/0x210
 [] call_softirq+0x1c/0x30
 [] do_softirq+0xad/0xe0
 [] irq_exit+0x95/0xa0
 [] do_IRQ+0x75/0xf0
 [] ret_from_intr+0x0/0x16

The value of rdi register(0x8817968a5c00) is the io pointer,
If the sync io, the address of io point must be alloc from stack.
SO
crash> struct thread_info 8817968a4000
struct thread_info {
  task = 0x88180cd9a580,
  exec_domain = 0x81a98ac0,
 ...
}

crash> struct task_struct 0x88180cd9a580
struct task_struct {
  state = 2,
  stack = 0x8817968a4000,
 ...
}

It shows value exactly when use the value of io address.

The io address in callback function will become the danging point,
cause by the thread of sync io wakes up by other threads
and return to relieve the io address,

Signed-off-by: Minfei Huang 
---
 drivers/md/dm-io.c |   19 +++
 1 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 3842ac7..f992913 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -38,6 +38,7 @@ struct io {
void *context;
void *vma_invalidate_address;
unsigned long vma_invalidate_size;
+   atomic_t wakeup;
 } __attribute__((aligned(DM_IO_MAX_REGIONS)));
 
 static struct kmem_cache *_dm_io_cache;
@@ -121,10 +122,16 @@ static void dec_count(struct io *io, unsigned int region, 
int error)
invalidate_kernel_vmap_range(io->vma_invalidate_address,
 io->vma_invalidate_size);
 
-   if (io->sleeper)
-   wake_up_process(io->sleeper);
+   if (io->sleeper) {
+   struct task_struct *sleeper = io->sleeper;
 
-   else {
+   atomic_set(>wakeup, 1);
+/*
+ * The thread may be waked up by other threads,
+ * if then the sync io point will become the dangling pointer
+ */
+   wake_up_process(sleeper);
+   } else {
unsigned long r = io->error_bits;
io_notify_fn fn = io->callback;
void *context = io->context;
@@ -401,12 +408,14 @@ static int sync_io(struct dm_io_client *client, unsigned 
int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;
 
+   atomic_set(>wakeup, 0);
+
dispatch_io(rw, num_regions, where, dp, io, 1);
 
while (1) {
set_current_state(TASK_UNINTERRUPTIBLE);
 
-   if (!atomic_read(>count))
+   if (atomic_read(>wakeup))
break;
 
io_schedule();
@@ -442,6 +451,8 @@ static int async_io(struct dm_io_client *client, unsigned 
int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;
 
+   atomic_set(>wakeup, 0);
+
dispatch_io(rw, num_regions, where, dp, io, 0);
return 0;
 }
-- 
1.7.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More 

[PATCH] dm-io: Prevent the danging point of the sync io callback function

2014-06-26 Thread Minfei Huang
BUG: unable to handle kernel NULL pointer dereference at 0046
IP: [a0009cef] dec_count+0x5f/0x80 [dm_mod]
PGD 0
Oops:  [#1] SMP
last sysfs file: 
/sys/devices/pci:00/:00:02.2/:02:00.0/host0/scsi_host/host0/proc_name

Pid: 2708, comm: kcopyd Tainted: GW  --- H  
2.6.32-279.19.5.el6.ucloud.x86_64 #1 Dell Inc. PowerEdge R720xd/0DCWD1
RIP: 0010:[a0009cef]  [a0009cef] dec_count+0x5f/0x80 
[dm_mod]
RSP: 0018:880100603c30  EFLAGS: 00010246
RAX: 0046 RBX: 8817968a5c30 RCX: 
RDX:  RSI:  RDI: 8817968a5c00
RBP: 880100603c50 R08:  R09: 
R10: 880caa594cc0 R11:  R12: 8817968a5c80
R13: 81013963 R14: 1000 R15: 
FS:  () GS:88010060() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0046 CR3: 00020c309000 CR4: 001426e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process kcopyd (pid: 2708, threadinfo 88180cd26000, task 881841c9aa80)
Stack:
 880100603c40 880aa8b32300  8817968a5c00
d 880100603c80 a000a12a  880aa8b32300
d  880caa594cc0 880100603c90 811bcf6d
Call Trace:
 IRQ
 [a000a12a] endio+0x4a/0x70 [dm_mod]
 [811bcf6d] bio_endio+0x1d/0x40
 [81260beb] req_bio_endio+0x9b/0xe0
 [81263114] blk_update_request+0x104/0x500
 [81263331] ? blk_update_request+0x321/0x500
 [81263537] blk_update_bidi_request+0x27/0xa0
 [8126419f] blk_end_bidi_request+0x2f/0x80
 [81264240] blk_end_request+0x10/0x20
 [81375c6f] scsi_io_completion+0xaf/0x6c0
 [8136cb92] scsi_finish_command+0xc2/0x130
 [813763e5] scsi_softirq_done+0x145/0x170
 [812698ed] blk_done_softirq+0x8d/0xa0
 [81074c5f] __do_softirq+0xdf/0x210
 [8100c2cc] call_softirq+0x1c/0x30
 [8100df9d] do_softirq+0xad/0xe0
 [81074995] irq_exit+0x95/0xa0
 [81510515] do_IRQ+0x75/0xf0
 [8100ba53] ret_from_intr+0x0/0x16

The value of rdi register(0x8817968a5c00) is the io pointer,
If the sync io, the address of io point must be alloc from stack.
SO
crash struct thread_info 8817968a4000
struct thread_info {
  task = 0x88180cd9a580,
  exec_domain = 0x81a98ac0,
 ...
}

crash struct task_struct 0x88180cd9a580
struct task_struct {
  state = 2,
  stack = 0x8817968a4000,
 ...
}

It shows value exactly when use the value of io address.

The io address in callback function will become the danging point,
cause by the thread of sync io wakes up by other threads
and return to relieve the io address,

Signed-off-by: Minfei Huang huangmin...@ucloud.cn
---
 drivers/md/dm-io.c |   19 +++
 1 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 3842ac7..f992913 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -38,6 +38,7 @@ struct io {
void *context;
void *vma_invalidate_address;
unsigned long vma_invalidate_size;
+   atomic_t wakeup;
 } __attribute__((aligned(DM_IO_MAX_REGIONS)));
 
 static struct kmem_cache *_dm_io_cache;
@@ -121,10 +122,16 @@ static void dec_count(struct io *io, unsigned int region, 
int error)
invalidate_kernel_vmap_range(io-vma_invalidate_address,
 io-vma_invalidate_size);
 
-   if (io-sleeper)
-   wake_up_process(io-sleeper);
+   if (io-sleeper) {
+   struct task_struct *sleeper = io-sleeper;
 
-   else {
+   atomic_set(io-wakeup, 1);
+/*
+ * The thread may be waked up by other threads,
+ * if then the sync io point will become the dangling pointer
+ */
+   wake_up_process(sleeper);
+   } else {
unsigned long r = io-error_bits;
io_notify_fn fn = io-callback;
void *context = io-context;
@@ -401,12 +408,14 @@ static int sync_io(struct dm_io_client *client, unsigned 
int num_regions,
io-vma_invalidate_address = dp-vma_invalidate_address;
io-vma_invalidate_size = dp-vma_invalidate_size;
 
+   atomic_set(io-wakeup, 0);
+
dispatch_io(rw, num_regions, where, dp, io, 1);
 
while (1) {
set_current_state(TASK_UNINTERRUPTIBLE);
 
-   if (!atomic_read(io-count))
+   if (atomic_read(io-wakeup))
break;
 
io_schedule();
@@ -442,6 +451,8 @@ static int async_io(struct dm_io_client *client, unsigned 
int num_regions,
io-vma_invalidate_address