Re: [PATCH 5/5] sched,futex: Provide delayed wakeup list

2013-11-23 Thread Davidlohr Bueso
On Sat, 2013-11-23 at 13:01 +0100, Peter Zijlstra wrote:
> > I used to have a patch to schedule() that would always immediately fall
> > through and only actually block on the second call; it illustrated the
> > problem really well; in fact, so well that the kernel fails to boot most
> > of the time.
> 
> I found the below on my filesystem -- making it apply shouldn't be hard.
> Making it work is the same effort as that patch you sent, we need to
> guarantee all schedule() callers can deal with not actually sleeping --
> aka. spurious wakeups.

Thanks, I'll definitely try the patch and see what comes up.

> 
> I don't think anybody ever got that thing to run reliably enough to see
> if the idea proposed in the patch made any difference to actual
> workloads though.

Since your idea can also be applied to SysV semaphores (patch 3/3 back then),
I can definitely do some Oracle runs, which, IIRC, also like doing
multiple wakeups at once. In any case, this patch deals very nicely with
our customer workload, which is why I believe it's particularly good
here.

Thanks,
Davidlohr

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/5] sched,futex: Provide delayed wakeup list

2013-11-23 Thread Peter Zijlstra
> I used to have a patch to schedule() that would always immediately fall
> through and only actually block on the second call; it illustrated the
> problem really well; in fact, so well that the kernel fails to boot most
> of the time.

I found the below on my filesystem -- making it apply shouldn't be hard.
Making it work is the same effort as that patch you sent, we need to
guarantee all schedule() callers can deal with not actually sleeping --
aka. spurious wakeups.

I don't think anybody ever got that thing to run reliably enough to see
if the idea proposed in the patch made any difference to actual
workloads though.

---
Subject: 
From: Peter Zijlstra 
Date: Thu Dec 09 17:51:09 CET 2010


Signed-off-by: Peter Zijlstra 
Link: http://lkml.kernel.org/n/tip-v17vshx6uasjguuwd67fe...@git.kernel.org
---
 include/linux/sched.h |5 +++--
 kernel/sched/core.c   |   18 ++
 2 files changed, 21 insertions(+), 2 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -153,9 +153,10 @@ print_cfs_rq(struct seq_file *m, int cpu
 #define TASK_DEAD  64
 #define TASK_WAKEKILL  128
 #define TASK_WAKING256
-#define TASK_STATE_MAX 512
+#define TASK_YIELD 512
+#define TASK_STATE_MAX 1024
 
-#define TASK_STATE_TO_CHAR_STR "RSDTtZXxKW"
+#define TASK_STATE_TO_CHAR_STR "RSDTtZXxKWY"
 
 extern char ___assert_task_state[1 - 2*!!(
sizeof(TASK_STATE_TO_CHAR_STR)-1 != ilog2(TASK_STATE_MAX)+1)];
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -931,6 +931,7 @@ void set_task_cpu(struct task_struct *p,
 * ttwu() will sort out the placement.
 */
WARN_ON_ONCE(p->state != TASK_RUNNING && p->state != TASK_WAKING &&
+   !(p->state & TASK_YIELD) &&
!(task_thread_info(p)->preempt_count & PREEMPT_ACTIVE));
 
 #ifdef CONFIG_LOCKDEP
@@ -2864,6 +2865,22 @@ static void __sched __schedule(void)
if (unlikely(signal_pending_state(prev->state, prev))) {
prev->state = TASK_RUNNING;
} else {
+   /*
+* Provide an auto-yield feature on schedule().
+*
+* The thought is to avoid a sleep+wakeup cycle
+* if simply yielding the cpu will suffice to
+* satisfy the required condition.
+*
+* Assumes the calling schedule() site can deal
+* with spurious wakeups.
+*/
+   if (prev->state & TASK_YIELD) {
+   prev->state &= ~TASK_YIELD;
+   if (rq->nr_running > 1)
+   goto no_deactivate;
+   }
+
deactivate_task(rq, prev, DEQUEUE_SLEEP);
prev->on_rq = 0;
 
@@ -2880,6 +2897,7 @@ static void __sched __schedule(void)
try_to_wake_up_local(to_wakeup);
}
}
+no_deactivate:
switch_count = &prev->nvcsw;
}
 


Re: [PATCH 5/5] sched,futex: Provide delayed wakeup list

2013-11-23 Thread Peter Zijlstra
On Fri, Nov 22, 2013 at 04:56:37PM -0800, Davidlohr Bueso wrote:
> From: Peter Zijlstra 
> 
> Original patchset: https://lkml.org/lkml/2011/9/14/118
> 
> This is useful for locking primitives that can effect multiple
> wakeups per operation and want to avoid lock-internal contention
> by delaying the wakeups until we've released the lock-internal locks.
> 
> Alternatively it can be used to avoid issuing multiple wakeups, and
> thus save a few cycles, in packet processing. Queue all target tasks
> and wake up once you've processed all packets. That way you avoid
> waking the target task multiple times if there were multiple packets
> for the same task.
> 
> This patch adds the needed infrastructure into the scheduler code
> and uses the new wake_list to delay the futex wakeups until
> after we've released the hash bucket locks. This prevents the newly
> woken tasks from immediately getting stuck on the hb->lock.
> 
> Cc: Ingo Molnar 
> Cc: Darren Hart 
> Cc: Peter Zijlstra 
> Cc: Thomas Gleixner 
> Cc: Mike Galbraith 
> Cc: Jeff Mahoney 
> Cc: Linus Torvalds 
> Cc: Scott Norton 
> Cc: Tom Vaden 
> Cc: Aswin Chandramouleeswaran 
> Cc: Waiman Long 
> Tested-by: Jason Low 
> [forward ported]
> Signed-off-by: Davidlohr Bueso 
> ---
> Please note that in the original thread there was some debate
> about spurious wakeups (https://lkml.org/lkml/2011/9/17/31), so
> you can consider this more of an RFC patch if folks believe that
> this functionality is incomplete/buggy.

Right, from what I remember, this patch can cause spurious wakeups, and
while all our regular sleeping lock / wait thingies can deal with this,
not all creative schedule() usage in the tree can deal with this.

There are ~1400 schedule() calls (or there were that many two years ago;
there may be more by now), many of which are open-coded wait constructs,
most of which are buggy in one way or another.

So we first need to audit / fix all those before we can do this one.

I used to have a patch to schedule() that would always immediately fall
through and only actually block on the second call; it illustrated the
problem really well; in fact, so well that the kernel fails to boot most
of the time.


[PATCH 5/5] sched,futex: Provide delayed wakeup list

2013-11-22 Thread Davidlohr Bueso
From: Peter Zijlstra 

Original patchset: https://lkml.org/lkml/2011/9/14/118

This is useful for locking primitives that can effect multiple
wakeups per operation and want to avoid lock-internal contention
by delaying the wakeups until we've released the lock-internal locks.

Alternatively it can be used to avoid issuing multiple wakeups, and
thus save a few cycles, in packet processing. Queue all target tasks
and wake up once you've processed all packets. That way you avoid
waking the target task multiple times if there were multiple packets
for the same task.

This patch adds the needed infrastructure into the scheduler code
and uses the new wake_list to delay the futex wakeups until
after we've released the hash bucket locks. This prevents the newly
woken tasks from immediately getting stuck on the hb->lock.

Cc: Ingo Molnar 
Cc: Darren Hart 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Mike Galbraith 
Cc: Jeff Mahoney 
Cc: Linus Torvalds 
Cc: Scott Norton 
Cc: Tom Vaden 
Cc: Aswin Chandramouleeswaran 
Cc: Waiman Long 
Tested-by: Jason Low 
[forward ported]
Signed-off-by: Davidlohr Bueso 
---
Please note that in the original thread there was some debate
about spurious wakeups (https://lkml.org/lkml/2011/9/17/31), so
you can consider this more of an RFC patch if folks believe that
this functionality is incomplete/buggy.

 include/linux/sched.h | 41 +
 kernel/futex.c| 31 +++
 kernel/sched/core.c   | 19 +++
 3 files changed, 75 insertions(+), 16 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7e35d4b..679aabb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -793,6 +793,20 @@ enum cpu_idle_type {
 
 extern int __weak arch_sd_sibiling_asym_packing(void);
 
+struct wake_list_head {
+   struct wake_list_node *first;
+};
+
+struct wake_list_node {
+   struct wake_list_node *next;
+};
+
+#define WAKE_LIST_TAIL ((struct wake_list_node *) 0x01)
+
+#define WAKE_LIST(name)\
+   struct wake_list_head name = { WAKE_LIST_TAIL }
+
+
 struct sched_domain_attr {
int relax_domain_level;
 };
@@ -1078,6 +1092,8 @@ struct task_struct {
unsigned int btrace_seq;
 #endif
 
+   struct wake_list_node wake_list;
+
unsigned int policy;
int nr_cpus_allowed;
cpumask_t cpus_allowed;
@@ -2044,6 +2060,31 @@ extern void wake_up_new_task(struct task_struct *tsk);
 extern void sched_fork(unsigned long clone_flags, struct task_struct *p);
 extern void sched_dead(struct task_struct *p);
 
+static inline void
+wake_list_add(struct wake_list_head *head, struct task_struct *p)
+{
+   struct wake_list_node *n = &p->wake_list;
+
+   /*
+* Atomically grab the task, if ->wake_list is !0 already it means
+* it's already queued (either by us or someone else) and will get the
+* wakeup due to that.
+*
+* This cmpxchg() implies a full barrier, which pairs with the write
+* barrier implied by the wakeup in wake_up_list().
+*/
+   if (cmpxchg(&n->next, NULL, head->first))
+   return;
+
+   /*
+* The head is context local, there can be no concurrency.
+*/
+   get_task_struct(p);
+   head->first = n;
+}
+
+extern void wake_up_list(struct wake_list_head *head, unsigned int state);
+
 extern void proc_caches_init(void);
 extern void flush_signals(struct task_struct *);
 extern void __flush_signals(struct task_struct *);
diff --git a/kernel/futex.c b/kernel/futex.c
index 2f9dd5d..3d60a3d 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -851,20 +851,17 @@ static void __unqueue_futex(struct futex_q *q)
  * The hash bucket lock must be held when this is called.
  * Afterwards, the futex_q must not be accessed.
  */
-static void wake_futex(struct futex_q *q)
+static void wake_futex(struct wake_list_head *wake_list, struct futex_q *q)
 {
struct task_struct *p = q->task;
 
/*
-* We set q->lock_ptr = NULL _before_ we wake up the task. If
-* a non-futex wake up happens on another CPU then the task
-* might exit and p would dereference a non-existing task
-* struct. Prevent this by holding a reference on p across the
-* wake up.
+* Queue up and delay the futex wakeups until the hb lock
+* has been released.
 */
-   get_task_struct(p);
-
+   wake_list_add(wake_list, p);
__unqueue_futex(q);
+
/*
 * The waiting task can free the futex_q as soon as
 * q->lock_ptr = NULL is written, without taking any locks. A
@@ -873,9 +870,6 @@ static void wake_futex(struct futex_q *q)
 */
smp_wmb();
q->lock_ptr = NULL;
-
-   wake_up_state(p, TASK_NORMAL);
-   put_task_struct(p);
 }
 
 static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this)
@@ -992,6 +986,7 @@ futex_wake(u32 __user 
