On 09/01, Peter Zijlstra wrote:
>
> On Fri, Aug 26, 2016 at 02:45:52PM +0200, Oleg Nesterov wrote:
>
> > We do not need anything tricky to avoid the race,
>
> The race being:
>
> CPU0                  CPU1                    CPU2
>
>                       __wait_on_bit_lock()
>                         bit_wait_io()
>                           io_schedule()
>
> clear_bit_unlock()
> __wake_up_common(.nr_exclusive=1)
>   list_for_each_entry()
>     if (curr->func() && !--nr_exclusive)
>       break
>
>                                               signal()
>
>                           if (signal_pending_state()) == TRUE
>                             return -EINTR
>
> And no progress because CPU1 exits without acquiring the lock and CPU0
> thinks its done because it woke someone.

Yes.

> > we can just call finish_wait() if action() fails.
>
> That would be bit_wait*() returning -EINTR because sigpending.

Hmm. Not sure I understand... Let me reply just in case, even if
I am sure you get it right.

Yes, in the likely case we are going to fail with -EINTR, but only
if the test-and-set after that fails.

> Sure, you can always call that, first thing through the loop does
> prepare again, so no harm. That however does not connect to your
> condition,.. /me puzzled

If ->action() fails we will abort the loop in any case, so prepare
won't be called again. So in this case finish_wait() does the right thing.

> > test_and_set_bit() implies mb() so
> > the lockless list_empty_careful() case is fine, we can not miss the
> > condition if we race with unlock_page().
>
> You're talking about this ordering?:
>
>       finish_wait()                   clear_bit_unlock();
>         list_empty_careful()
>
>       /* MB implied */                smp_mb__after_atomic();
>       test_and_set_bit()              wake_up_page()
>                                         ...
>                                           autoremove_wake_function()
>                                             list_del_init();
>
>
> That could do with spelling out I feel.. :-)

Yes, yes.

> >  __wait_on_bit_lock(wait_queue_head_t *wq, struct wait_bit_queue *q,
> >                     wait_bit_action_f *action, unsigned mode)
> >  {
> > +   int ret = 0;
> >
> > +   for (;;) {
> >             prepare_to_wait_exclusive(wq, &q->wait, mode);
> > +           if (test_bit(q->key.bit_nr, q->key.flags)) {
> > +                   ret = action(&q->key, mode);
> > +                   /*
> > +                    * Ensure that clear_bit() + wake_up() right after
> > +                    * test_and_set_bit() below can't see us; it should
> > +                    * wake up another exclusive waiter if we fail.
> > +                    */
> > +                   if (ret)
> > +                           finish_wait(wq, &q->wait);
> > +           }
> > +           if (!test_and_set_bit(q->key.bit_nr, q->key.flags)) {
>
> So this is the actual difference, instead of failing the lock and
> aborting on signal, we acquire the lock if possible. If its not
> possible, someone else has it, which guarantees that someone else will
> do an unlock which implies another wakeup and life goes on.

Yes. This way we eliminate the need for the additional wake_up.
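
To spell out the control flow, here is a rough single-threaded userspace
model of the loop. It is only a sketch, not the kernel code: struct model
and the model_*() helpers are made-up stand-ins for the wait queue, the
lock bit and ->action(). The quoted hunk stops at test_and_set_bit(), so
the tail of the loop below is an assumption that follows the description
above (take the lock if possible, otherwise return the error).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, NOT the kernel API. */
struct model {
	bool bit;			/* the lock bit */
	bool queued;			/* are we on the wait queue? */
	bool unlock_during_action;	/* owner unlocks while we sleep */
	int action_ret;			/* what the fake action() returns */
};

/* Stand-in for ->action(): optionally lets the owner drop the lock. */
static int model_action(struct model *m)
{
	if (m->unlock_during_action)
		m->bit = false;
	return m->action_ret;
}

/* Stand-in for test_and_set_bit(): returns the old value of the bit. */
static bool model_test_and_set(struct model *m)
{
	bool old = m->bit;

	m->bit = true;
	return old;
}

/*
 * Same shape as the proposed loop. Note: with action_ret == 0 and a bit
 * that is never cleared this model spins forever, so only call it with
 * inputs that terminate.
 */
static int model_wait_on_bit_lock(struct model *m)
{
	int ret = 0;

	for (;;) {
		m->queued = true;		/* prepare_to_wait_exclusive() */
		if (m->bit) {
			ret = model_action(m);
			if (ret)
				m->queued = false;	/* finish_wait() */
		}
		if (!model_test_and_set(m)) {
			/* Assumed tail: we own the lock even if action() failed. */
			m->queued = false;	/* finish_wait(), if still queued */
			return 0;
		}
		if (ret)
			return ret;	/* failed, and already off the queue */
	}
}

/* The signal case from this thread: action() fails, bit stays set. */
static void model_selfcheck(void)
{
	struct model m = { .bit = true, .action_ret = -EINTR };

	assert(model_wait_on_bit_lock(&m) == -EINTR);
	assert(!m.queued);
}
```

In the failing case the waiter is off the queue before it returns -EINTR,
so a later wakeup has to pick another exclusive waiter. If the bit was
cleared while action() was failing, the same call returns 0 and owns the
lock instead. Calling model_selfcheck() from a test main() exercises the
first case.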

Oleg.
