Re: [PATCH 1/2] md/raid5: split wait_for_stripe and introduce wait_for_quiesce

2015-04-26 Thread NeilBrown
On Mon, 27 Apr 2015 10:12:49 +0800 Yuanhan Liu wrote:

> On Mon, Apr 27, 2015 at 10:10:24AM +1000, NeilBrown wrote:
> > On Fri, 24 Apr 2015 21:39:03 +0800 Yuanhan Liu wrote:
> > 
> > > If I read the code correctly, the current wait_for_stripe actually has two usages:
> > > 
> > > - waiting until there are enough free stripes in the stripe cache,
> > >   triggered when get_free_stripe() fails. This is what wait_for_stripe
> > >   is literally intended for.
> > > 
> > > - waiting for quiesce == 0, or for
> > >    active_aligned_reads == 0 && active_stripes == 0.
> > > 
> > >   This has nothing to do with wait_for_stripe literally, and releasing
> > >   an active stripe won't actually wake these waiters up. Conversely, a
> > >   wake_up from this case won't actually wake up a process waiting for
> > >   a free stripe to become available.
> > 
> > I disagree.  Releasing an active stripe *will* (or *can*) wake up that third
> > case, as it decrements "active_stripes" which will eventually reach zero.
> > 
> > I don't think your new code will properly wake up a process which is waiting
> > for "active_stripes == 0".
> 
> Right, and thanks for pointing it out. So, is this enough?
> 
> ---
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 2d8fcc1..3f23035 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -383,6 +383,9 @@ static void release_inactive_stripe_list(struct r5conf *conf,
> }
> }
> }
> +
> +   if (!atomic_read(&conf->active_stripes))
> +   wake_up(&conf->wait_for_quiesce);
>  }
> 
>  /* should hold conf->device_lock already */
> 
> 
> Or, should I put it a bit earlier, invoking
> wake_up(&conf->wait_for_quiesce) after each
> atomic_dec(&conf->active_stripes)?
> 
>   if (atomic_dec_return(&conf->active_stripes) == 0)
>   wake_up(&conf->wait_for_quiesce);

I think the first version is fine.  While waiting for active_stripes to be
zero, ->quiesce is set to 2, and so no new stripes should get used.

> 
> > 
> > > 
> > > Hence, we'd better split wait_for_stripe, and here I introduce
> > > wait_for_quiesce for the second usage. The name may not be well chosen,
> > > or may even be wrong; feel free to correct me.
> > > 
> > > This is also a preparatory patch for the next patch: making
> > > wait_for_stripe exclusive.
> > 
> > I think you have this commit description upside down :-)
> > 
> > The real motivation is that you are seeing contention on some spinlock and so
> > you want to split 'wait_for_stripe' up into multiple wait_queues so that you
> > can use exclusive wakeup.  As this is the main motivation, it should be
> > stated first.
> > 
> > Then explain that 'wait_for_stripe' is used to wait for the array to enter or
> > leave the quiescent state, and also to wait for an available stripe in each
> > of the hash lists.
> > 
> > So this patch splits the first usage off into a separate wait_queue, and the
> > next patch will split the second usage into one waitqueue for each hash value.
> > 
> > Then explain just what is needed for that first step.
> > 
> > When you put it that way around, the patch makes lots of sense.
> 
> It does, and thanks!
> 
> > 
> > So: could you please resubmit with the description the right way around, and
> 
> To make sure I followed you correctly: my patch order is correct (I mean,
> split the lock first, then make wait_for_stripe per lock hash and exclusive),
> and what I need to do is rewrite the commit log as you suggested and fix
> all the issues you pointed out. Right?

Correct.

Thanks,
NeilBrown


> 
>   --yliu
> 
> > with an appropriate wakeup call to ensure raid5_quiesce is woken up when
> > active_stripes reaches zero?
> > 
> > Thanks,
> > NeilBrown
> > 
> > 
> > > 
> > > Signed-off-by: Yuanhan Liu 
> > > ---
> > >  drivers/md/raid5.c | 13 +++--
> > >  drivers/md/raid5.h |  1 +
> > >  2 files changed, 8 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > > index 9716319..b7e385f 100644
> > > --- a/drivers/md/raid5.c
> > > +++ b/drivers/md/raid5.c
> > > @@ -667,7 +667,7 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
> > >   spin_lock_irq(conf->hash_locks + hash);
> > >  
> > >   do {
> > > - wait_event_lock_irq(conf->wait_for_stripe,
> > > + wait_event_lock_irq(conf->wait_for_quiesce,
> > >   conf->quiesce == 0 || noquiesce,
> > >   *(conf->hash_locks + hash));
> > >   sh = __find_stripe(conf, sector, conf->generation - previous);
> > > @@ -4725,7 +4725,7 @@ static void raid5_align_endio(struct bio *bi, int error)
> > >raid_bi, 0);
> > >   bio_endio(raid_bi, 0);
> > >   if (atomic_dec_and_test(&conf->active_aligned_reads))
> > > - wake_up(&conf->wait_for_stripe);
> > > + wake_up(&conf->wait_for_quiesce);
> > >   return;

Re: [PATCH 1/2] md/raid5: split wait_for_stripe and introduce wait_for_quiesce

2015-04-26 Thread Yuanhan Liu
On Mon, Apr 27, 2015 at 10:10:24AM +1000, NeilBrown wrote:
> On Fri, 24 Apr 2015 21:39:03 +0800 Yuanhan Liu wrote:
> 
> > If I read the code correctly, the current wait_for_stripe actually has two usages:
> > 
> > - waiting until there are enough free stripes in the stripe cache,
> >   triggered when get_free_stripe() fails. This is what wait_for_stripe
> >   is literally intended for.
> > 
> > - waiting for quiesce == 0, or for
> >    active_aligned_reads == 0 && active_stripes == 0.
> > 
> >   This has nothing to do with wait_for_stripe literally, and releasing
> >   an active stripe won't actually wake these waiters up. Conversely, a
> >   wake_up from this case won't actually wake up a process waiting for
> >   a free stripe to become available.
> 
> I disagree.  Releasing an active stripe *will* (or *can*) wake up that third
> case, as it decrements "active_stripes" which will eventually reach zero.
> 
> I don't think your new code will properly wake up a process which is waiting
> for "active_stripes == 0".

Right, and thanks for pointing it out. So, is this enough?

---
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2d8fcc1..3f23035 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -383,6 +383,9 @@ static void release_inactive_stripe_list(struct r5conf *conf,
}
}
}
+
+   if (!atomic_read(&conf->active_stripes))
+   wake_up(&conf->wait_for_quiesce);
 }

 /* should hold conf->device_lock already */


Or, should I put it a bit earlier, invoking wake_up(&conf->wait_for_quiesce)
after each atomic_dec(&conf->active_stripes)?

if (atomic_dec_return(&conf->active_stripes) == 0)
wake_up(&conf->wait_for_quiesce);

> 
> > 
> > Hence, we'd better split wait_for_stripe, and here I introduce
> > wait_for_quiesce for the second usage. The name may not be well chosen,
> > or may even be wrong; feel free to correct me.
> > 
> > This is also a preparatory patch for the next patch: making
> > wait_for_stripe exclusive.
> 
> I think you have this commit description upside down :-)
> 
> The real motivation is that you are seeing contention on some spinlock and so
> you want to split 'wait_for_stripe' up into multiple wait_queues so that you
> can use exclusive wakeup.  As this is the main motivation, it should be
> stated first.
> 
> Then explain that 'wait_for_stripe' is used to wait for the array to enter or
> leave the quiescent state, and also to wait for an available stripe in each
> of the hash lists.
> 
> So this patch splits the first usage off into a separate wait_queue, and the
> next patch will split the second usage into one waitqueue for each hash value.
> 
> Then explain just what is needed for that first step.
> 
> When you put it that way around, the patch makes lots of sense.

It does, and thanks!

> 
> So: could you please resubmit with the description the right way around, and

To make sure I followed you correctly: my patch order is correct (I mean,
split the lock first, then make wait_for_stripe per lock hash and exclusive),
and what I need to do is rewrite the commit log as you suggested and fix
all the issues you pointed out. Right?

--yliu

> with an appropriate wakeup call to ensure raid5_quiesce is woken up when
> active_stripes reaches zero?
> 
> Thanks,
> NeilBrown
> 
> 
> > 
> > Signed-off-by: Yuanhan Liu 
> > ---
> >  drivers/md/raid5.c | 13 +++--
> >  drivers/md/raid5.h |  1 +
> >  2 files changed, 8 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 9716319..b7e385f 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -667,7 +667,7 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
> > spin_lock_irq(conf->hash_locks + hash);
> >  
> > do {
> > -   wait_event_lock_irq(conf->wait_for_stripe,
> > +   wait_event_lock_irq(conf->wait_for_quiesce,
> > conf->quiesce == 0 || noquiesce,
> > *(conf->hash_locks + hash));
> > sh = __find_stripe(conf, sector, conf->generation - previous);
> > @@ -4725,7 +4725,7 @@ static void raid5_align_endio(struct bio *bi, int error)
> >  raid_bi, 0);
> > bio_endio(raid_bi, 0);
> > if (atomic_dec_and_test(&conf->active_aligned_reads))
> > -   wake_up(&conf->wait_for_stripe);
> > +   wake_up(&conf->wait_for_quiesce);
> > return;
> > }
> >  
> > @@ -4820,7 +4820,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
> > align_bi->bi_iter.bi_sector += rdev->data_offset;
> >  
> > spin_lock_irq(&conf->device_lock);
> > -   wait_event_lock_irq(conf->wait_for_stripe,
> > +   wait_event_lock_irq(conf->wait_for_quiesce,
> > conf->quiesce == 0,
> > conf->device_lock);

Re: [PATCH 1/2] md/raid5: split wait_for_stripe and introduce wait_for_quiesce

2015-04-26 Thread NeilBrown
On Fri, 24 Apr 2015 21:39:03 +0800 Yuanhan Liu wrote:

> If I read the code correctly, the current wait_for_stripe actually has two usages:
> 
> - waiting until there are enough free stripes in the stripe cache,
>   triggered when get_free_stripe() fails. This is what wait_for_stripe
>   is literally intended for.
> 
> - waiting for quiesce == 0, or for
>    active_aligned_reads == 0 && active_stripes == 0.
> 
>   This has nothing to do with wait_for_stripe literally, and releasing
>   an active stripe won't actually wake these waiters up. Conversely, a
>   wake_up from this case won't actually wake up a process waiting for
>   a free stripe to become available.

I disagree.  Releasing an active stripe *will* (or *can*) wake up that third
case, as it decrements "active_stripes" which will eventually reach zero.

I don't think your new code will properly wake up a process which is waiting
for "active_stripes == 0".

> 
> Hence, we'd better split wait_for_stripe, and here I introduce
> wait_for_quiesce for the second usage. The name may not be well chosen,
> or may even be wrong; feel free to correct me.
> 
> This is also a preparatory patch for the next patch: making
> wait_for_stripe exclusive.

I think you have this commit description upside down :-)

The real motivation is that you are seeing contention on some spinlock and so
you want to split 'wait_for_stripe' up into multiple wait_queues so that you
can use exclusive wakeup.  As this is the main motivation, it should be
stated first.

Then explain that 'wait_for_stripe' is used to wait for the array to enter or
leave the quiescent state, and also to wait for an available stripe in each
of the hash lists.

So this patch splits the first usage off into a separate wait_queue, and the
next patch will split the second usage into one waitqueue for each hash value.

Then explain just what is needed for that first step.

When you put it that way around, the patch makes lots of sense.

So: could you please resubmit with the description the right way around, and
with an appropriate wakeup call to ensure raid5_quiesce is woken up when
active_stripes reaches zero?

Thanks,
NeilBrown


> 
> Signed-off-by: Yuanhan Liu 
> ---
>  drivers/md/raid5.c | 13 +++--
>  drivers/md/raid5.h |  1 +
>  2 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 9716319..b7e385f 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -667,7 +667,7 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
>   spin_lock_irq(conf->hash_locks + hash);
>  
>   do {
> - wait_event_lock_irq(conf->wait_for_stripe,
> + wait_event_lock_irq(conf->wait_for_quiesce,
>   conf->quiesce == 0 || noquiesce,
>   *(conf->hash_locks + hash));
>   sh = __find_stripe(conf, sector, conf->generation - previous);
> @@ -4725,7 +4725,7 @@ static void raid5_align_endio(struct bio *bi, int error)
>raid_bi, 0);
>   bio_endio(raid_bi, 0);
>   if (atomic_dec_and_test(&conf->active_aligned_reads))
> - wake_up(&conf->wait_for_stripe);
> + wake_up(&conf->wait_for_quiesce);
>   return;
>   }
>  
> @@ -4820,7 +4820,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
>   align_bi->bi_iter.bi_sector += rdev->data_offset;
>  
>   spin_lock_irq(&conf->device_lock);
> - wait_event_lock_irq(conf->wait_for_stripe,
> + wait_event_lock_irq(conf->wait_for_quiesce,
>   conf->quiesce == 0,
>   conf->device_lock);
>   atomic_inc(&conf->active_aligned_reads);
> @@ -5659,7 +5659,7 @@ static int  retry_aligned_read(struct r5conf *conf, struct bio *raid_bio)
>   bio_endio(raid_bio, 0);
>   }
>   if (atomic_dec_and_test(&conf->active_aligned_reads))
> - wake_up(&conf->wait_for_stripe);
> + wake_up(&conf->wait_for_quiesce);
>   return handled;
>  }
>  
> @@ -6390,6 +6390,7 @@ static struct r5conf *setup_conf(struct mddev *mddev)
>   goto abort;
>   spin_lock_init(&conf->device_lock);
>   seqcount_init(&conf->gen_lock);
> + init_waitqueue_head(&conf->wait_for_quiesce);
>   init_waitqueue_head(&conf->wait_for_stripe);
>   init_waitqueue_head(&conf->wait_for_overlap);
>   INIT_LIST_HEAD(&conf->handle_list);
> @@ -7413,7 +7414,7 @@ static void raid5_quiesce(struct mddev *mddev, int state)
>* active stripes can drain
>*/
>   conf->quiesce = 2;
> - wait_event_cmd(conf->wait_for_stripe,
> + wait_event_cmd(conf->wait_for_quiesce,
>   atomic_read(&conf->active_stripes) == 0 &&
> atomic_read(&conf->active_aligned_reads) == 0,

[PATCH 1/2] md/raid5: split wait_for_stripe and introduce wait_for_quiesce

2015-04-24 Thread Yuanhan Liu
If I read the code correctly, the current wait_for_stripe actually has two usages:

- waiting until there are enough free stripes in the stripe cache,
  triggered when get_free_stripe() fails. This is what wait_for_stripe
  is literally intended for.

- waiting for quiesce == 0, or for
   active_aligned_reads == 0 && active_stripes == 0.

  This has nothing to do with wait_for_stripe literally, and releasing
  an active stripe won't actually wake these waiters up. Conversely, a
  wake_up from this case won't actually wake up a process waiting for
  a free stripe to become available.

Hence, we'd better split wait_for_stripe, and here I introduce
wait_for_quiesce for the second usage. The name may not be well chosen,
or may even be wrong; feel free to correct me.

This is also a preparatory patch for the next patch: making
wait_for_stripe exclusive.

Signed-off-by: Yuanhan Liu 
---
 drivers/md/raid5.c | 13 +++--
 drivers/md/raid5.h |  1 +
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9716319..b7e385f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -667,7 +667,7 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
spin_lock_irq(conf->hash_locks + hash);
 
do {
-   wait_event_lock_irq(conf->wait_for_stripe,
+   wait_event_lock_irq(conf->wait_for_quiesce,
conf->quiesce == 0 || noquiesce,
*(conf->hash_locks + hash));
sh = __find_stripe(conf, sector, conf->generation - previous);
@@ -4725,7 +4725,7 @@ static void raid5_align_endio(struct bio *bi, int error)
 raid_bi, 0);
bio_endio(raid_bi, 0);
if (atomic_dec_and_test(&conf->active_aligned_reads))
-   wake_up(&conf->wait_for_stripe);
+   wake_up(&conf->wait_for_quiesce);
return;
}
 
@@ -4820,7 +4820,7 @@ static int chunk_aligned_read(struct mddev *mddev, struct bio * raid_bio)
align_bi->bi_iter.bi_sector += rdev->data_offset;
 
spin_lock_irq(&conf->device_lock);
-   wait_event_lock_irq(conf->wait_for_stripe,
+   wait_event_lock_irq(conf->wait_for_quiesce,
conf->quiesce == 0,
conf->device_lock);
atomic_inc(&conf->active_aligned_reads);
@@ -5659,7 +5659,7 @@ static int  retry_aligned_read(struct r5conf *conf, struct bio *raid_bio)
bio_endio(raid_bio, 0);
}
if (atomic_dec_and_test(&conf->active_aligned_reads))
-   wake_up(&conf->wait_for_stripe);
+   wake_up(&conf->wait_for_quiesce);
return handled;
 }
 
@@ -6390,6 +6390,7 @@ static struct r5conf *setup_conf(struct mddev *mddev)
goto abort;
spin_lock_init(&conf->device_lock);
seqcount_init(&conf->gen_lock);
+   init_waitqueue_head(&conf->wait_for_quiesce);
init_waitqueue_head(&conf->wait_for_stripe);
init_waitqueue_head(&conf->wait_for_overlap);
INIT_LIST_HEAD(&conf->handle_list);
@@ -7413,7 +7414,7 @@ static void raid5_quiesce(struct mddev *mddev, int state)
 * active stripes can drain
 */
conf->quiesce = 2;
-   wait_event_cmd(conf->wait_for_stripe,
+   wait_event_cmd(conf->wait_for_quiesce,
atomic_read(&conf->active_stripes) == 0 &&
atomic_read(&conf->active_aligned_reads) == 0,
unlock_all_device_hash_locks_irq(conf),
@@ -7427,7 +7428,7 @@ static void raid5_quiesce(struct mddev *mddev, int state)
case 0: /* re-enable writes */
lock_all_device_hash_locks_irq(conf);
conf->quiesce = 0;
-   wake_up(&conf->wait_for_stripe);
+   wake_up(&conf->wait_for_quiesce);
wake_up(&conf->wait_for_overlap);
unlock_all_device_hash_locks_irq(conf);
break;
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 7dc0dd8..fab53a3 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -508,6 +508,7 @@ struct r5conf {
struct list_headinactive_list[NR_STRIPE_HASH_LOCKS];
atomic_tempty_inactive_list_nr;
struct llist_head   released_stripes;
+   wait_queue_head_t   wait_for_quiesce;
wait_queue_head_t   wait_for_stripe;
wait_queue_head_t   wait_for_overlap;
unsigned long   cache_state;
-- 
1.9.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/