Re: [PATCH -v6 12/13] futex: futex_unlock_pi() determinism

2017-04-07 Thread Darren Hart
On Wed, Mar 22, 2017 at 11:35:59AM +0100, Peter Zijlstra wrote:
> The problem with returning -EAGAIN when the waiter state mismatches is
> that it becomes very hard to proof a bounded execution time on the

prove

> operation. And seeing that this is a RT operation, this is somewhat

an RT

> important.
> 
> While in practise; given the previous patch; it will be very unlikely

Heh, that's not what semicolons are for ;-) Commas here, or a parenthetical.

> to ever really take more than one or two rounds, proving so becomes
> rather hard.
> 
> However, now that modifying wait_list is done while holding both
> hb->lock and wait_lock, we can avoid the scenario entirely if we
> acquire wait_lock while still holding hb-lock. Doing a hand-over,
> without leaving a hole.

Nice :)

-- 
Darren Hart
VMware Open Source Technology Center


Re: [PATCH -v6 12/13] futex: futex_unlock_pi() determinism

2017-04-07 Thread Darren Hart
On Wed, Mar 22, 2017 at 11:35:59AM +0100, Peter Zijlstra wrote:
> The problem with returning -EAGAIN when the waiter state mismatches is
> that it becomes very hard to proof a bounded execution time on the

prove

> operation. And seeing that this is a RT operation, this is somewhat

an RT

> important.
> 
> While in practise; given the previous patch; it will be very unlikely

Heh, that's not what semicolons are for ;-) Commas here, or a parenthetical.

> to ever really take more than one or two rounds, proving so becomes
> rather hard.
> 
> However, now that modifying wait_list is done while holding both
> hb->lock and wait_lock, we can avoid the scenario entirely if we
> acquire wait_lock while still holding hb-lock. Doing a hand-over,
> without leaving a hole.

Nice :)

-- 
Darren Hart
VMware Open Source Technology Center


[PATCH -v6 12/13] futex: futex_unlock_pi() determinism

2017-03-22 Thread Peter Zijlstra
The problem with returning -EAGAIN when the waiter state mismatches is
that it becomes very hard to proof a bounded execution time on the
operation. And seeing that this is a RT operation, this is somewhat
important.

While in practise; given the previous patch; it will be very unlikely
to ever really take more than one or two rounds, proving so becomes
rather hard.

However, now that modifying wait_list is done while holding both
hb->lock and wait_lock, we can avoid the scenario entirely if we
acquire wait_lock while still holding hb-lock. Doing a hand-over,
without leaving a hole.

Signed-off-by: Peter Zijlstra (Intel) 
---
 kernel/futex.c |   26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

Index: linux-2.6/kernel/futex.c
===
--- linux-2.6.orig/kernel/futex.c
+++ linux-2.6/kernel/futex.c
@@ -1398,15 +1398,10 @@ static int wake_futex_pi(u32 __user *uad
DEFINE_WAKE_Q(wake_q);
int ret = 0;
 
-   raw_spin_lock_irq(_state->pi_mutex.wait_lock);
new_owner = rt_mutex_next_owner(_state->pi_mutex);
-   if (!new_owner) {
+   if (WARN_ON_ONCE(!new_owner)) {
/*
-* Since we held neither hb->lock nor wait_lock when coming
-* into this function, we could have raced with futex_lock_pi()
-* such that we might observe @this futex_q waiter, but the
-* rt_mutex's wait_list can be empty (either still, or again,
-* depending on which side we land).
+* As per the comment in futex_unlock_pi() this should not 
happen.
 *
 * When this happens, give up our locks and try again, giving
 * the futex_lock_pi() instance time to complete, either by
@@ -2787,15 +2782,18 @@ retry:
if (pi_state->owner != current)
goto out_unlock;
 
+   get_pi_state(pi_state);
/*
-* Grab a reference on the pi_state and drop hb->lock.
+* Since modifying the wait_list is done while holding both
+* hb->lock and wait_lock, holding either is sufficient to
+* observe it.
 *
-* The reference ensures pi_state lives, dropping the hb->lock
-* is tricky.. wake_futex_pi() will take rt_mutex::wait_lock to
-* close the races against futex_lock_pi(), but in case of
-* _any_ fail we'll abort and retry the whole deal.
+* By taking wait_lock while still holding hb->lock, we ensure
+* there is no point where we hold neither; and therefore
+* wake_futex_pi() must observe a state consistent with what we
+* observed.
 */
-   get_pi_state(pi_state);
+   raw_spin_lock_irq(_state->pi_mutex.wait_lock);
spin_unlock(>lock);
 
ret = wake_futex_pi(uaddr, uval, pi_state);




[PATCH -v6 12/13] futex: futex_unlock_pi() determinism

2017-03-22 Thread Peter Zijlstra
The problem with returning -EAGAIN when the waiter state mismatches is
that it becomes very hard to proof a bounded execution time on the
operation. And seeing that this is a RT operation, this is somewhat
important.

While in practise; given the previous patch; it will be very unlikely
to ever really take more than one or two rounds, proving so becomes
rather hard.

However, now that modifying wait_list is done while holding both
hb->lock and wait_lock, we can avoid the scenario entirely if we
acquire wait_lock while still holding hb-lock. Doing a hand-over,
without leaving a hole.

Signed-off-by: Peter Zijlstra (Intel) 
---
 kernel/futex.c |   26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

Index: linux-2.6/kernel/futex.c
===
--- linux-2.6.orig/kernel/futex.c
+++ linux-2.6/kernel/futex.c
@@ -1398,15 +1398,10 @@ static int wake_futex_pi(u32 __user *uad
DEFINE_WAKE_Q(wake_q);
int ret = 0;
 
-   raw_spin_lock_irq(_state->pi_mutex.wait_lock);
new_owner = rt_mutex_next_owner(_state->pi_mutex);
-   if (!new_owner) {
+   if (WARN_ON_ONCE(!new_owner)) {
/*
-* Since we held neither hb->lock nor wait_lock when coming
-* into this function, we could have raced with futex_lock_pi()
-* such that we might observe @this futex_q waiter, but the
-* rt_mutex's wait_list can be empty (either still, or again,
-* depending on which side we land).
+* As per the comment in futex_unlock_pi() this should not 
happen.
 *
 * When this happens, give up our locks and try again, giving
 * the futex_lock_pi() instance time to complete, either by
@@ -2787,15 +2782,18 @@ retry:
if (pi_state->owner != current)
goto out_unlock;
 
+   get_pi_state(pi_state);
/*
-* Grab a reference on the pi_state and drop hb->lock.
+* Since modifying the wait_list is done while holding both
+* hb->lock and wait_lock, holding either is sufficient to
+* observe it.
 *
-* The reference ensures pi_state lives, dropping the hb->lock
-* is tricky.. wake_futex_pi() will take rt_mutex::wait_lock to
-* close the races against futex_lock_pi(), but in case of
-* _any_ fail we'll abort and retry the whole deal.
+* By taking wait_lock while still holding hb->lock, we ensure
+* there is no point where we hold neither; and therefore
+* wake_futex_pi() must observe a state consistent with what we
+* observed.
 */
-   get_pi_state(pi_state);
+   raw_spin_lock_irq(_state->pi_mutex.wait_lock);
spin_unlock(>lock);
 
ret = wake_futex_pi(uaddr, uval, pi_state);