Re: [PATCH 1/6] wake_q: Cleanup + Documentation update.
On Sat, Oct 12, 2019 at 07:49:53AM +0200, Manfred Spraul wrote:
> 1) wake_q_add() contains a memory barrier, and callers such as
>    ipc/mqueue.c rely on this barrier.
>    Unfortunately, this is documented in ipc/mqueue.c, and not in the
>    description of wake_q_add().
>    Therefore: Update the documentation.
>    Removing/updating ipc/mqueue.c will happen with the next patch in the
>    series.
>
> 2) wake_q_add() ends with get_task_struct(), which is an
>    unordered refcount increase. Add a clear comment that the callers
>    are responsible for a barrier: most likely spin_unlock() or
>    smp_store_release().
>
> 3) wake_up_q() relies on the memory barrier in try_to_wake_up().
>    Add a comment, to simplify searching.
>
> 4) wake_q.next is accessed without synchronization by wake_q_add(),
>    using cmpxchg_relaxed(), and by wake_up_q().
>    Therefore: Use WRITE_ONCE in wake_up_q(), to ensure that the
>    compiler doesn't perform any tricks.
>
> Signed-off-by: Manfred Spraul
> Cc: Davidlohr Bueso
> ---
>  kernel/sched/core.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index dd05a378631a..60ae574317fd 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -440,8 +440,16 @@ static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
>   * @task: the task to queue for 'later' wakeup
>   *
>   * Queue a task for later wakeup, most likely by the wake_up_q() call in the
> - * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
> - * instantly.
> + * same context, _HOWEVER_ this is not guaranteed. In particular, the wakeup
> + * may happen before the function returns.
> + *
> + * What is guaranteed is that there is a memory barrier before the wakeup;
> + * callers may rely on this barrier.

I would like to stress that this (wake_q_add) provides the same ordering
guarantees as a 'normal' wakeup.

> + *
> + * On the other hand, the caller must guarantee that @task does not disappear
> + * before wake_q_add() completes. wake_q_add() does not contain any memory
> + * barrier to ensure ordering, thus the caller may need to use
> + * smp_store_release().

Like Davidlohr, I'm slightly puzzled by this last paragraph.

>   *
>   * This function must be used as-if it were wake_up_process(); IOW the task
>   * must be ready to be woken at this location.
> @@ -486,11 +494,14 @@ void wake_up_q(struct wake_q_head *head)
>           BUG_ON(!task);
>           /* Task can safely be re-inserted now: */
>           node = node->next;
> -         task->wake_q.next = NULL;
> +
> +         WRITE_ONCE(task->wake_q.next, NULL);
>
>           /*
>            * wake_up_process() executes a full barrier, which pairs with
>            * the queueing in wake_q_add() so as not to miss wakeups.
> +          * The barrier is the smp_mb__after_spinlock() in
> +          * try_to_wake_up().

We already have wake_up_process() documented as implying this barrier;
why do we want to mention the specifics here?

That is, have a look at the comments before try_to_wake_up().

>            */
>           wake_up_process(task);
>           put_task_struct(task);
> --
> 2.21.0
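
For readers following the barrier argument, the pairing that the wake_up_q()
comment refers to is the usual sleep/wakeup pattern. A minimal sketch,
assuming a made-up 'done' flag and 'task' pointer (these are placeholders,
not code from the patch):

        /* waiter: set_current_state() implies a full barrier, so the
         * condition check cannot be hoisted before the state change.
         */
        for (;;) {
                set_current_state(TASK_INTERRUPTIBLE);
                if (READ_ONCE(done))
                        break;
                schedule();
        }
        __set_current_state(TASK_RUNNING);

        /* waker: wake_up_process() executes a full barrier after the
         * condition store, so either the waiter sees 'done' set and skips
         * schedule(), or try_to_wake_up() sees the waiter sleeping and
         * wakes it -- a wakeup cannot be lost.
         */
        WRITE_ONCE(done, 1);
        wake_up_process(task);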
Re: [PATCH 1/6] wake_q: Cleanup + Documentation update.
On Sat, 12 Oct 2019, Manfred Spraul wrote:
> 1) wake_q_add() contains a memory barrier, and callers such as
>    ipc/mqueue.c rely on this barrier.
>    Unfortunately, this is documented in ipc/mqueue.c, and not in the
>    description of wake_q_add().
>    Therefore: Update the documentation.
>    Removing/updating ipc/mqueue.c will happen with the next patch in the
>    series.
>
> 2) wake_q_add() ends with get_task_struct(), which is an
>    unordered refcount increase. Add a clear comment that the callers
>    are responsible for a barrier: most likely spin_unlock() or
>    smp_store_release().
>
> 3) wake_up_q() relies on the memory barrier in try_to_wake_up().
>    Add a comment, to simplify searching.
>
> 4) wake_q.next is accessed without synchronization by wake_q_add(),
>    using cmpxchg_relaxed(), and by wake_up_q().
>    Therefore: Use WRITE_ONCE in wake_up_q(), to ensure that the
>    compiler doesn't perform any tricks.
>
> Signed-off-by: Manfred Spraul
> Cc: Davidlohr Bueso
> ---
>  kernel/sched/core.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index dd05a378631a..60ae574317fd 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -440,8 +440,16 @@ static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
>   * @task: the task to queue for 'later' wakeup
>   *
>   * Queue a task for later wakeup, most likely by the wake_up_q() call in the
> - * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
> - * instantly.
> + * same context, _HOWEVER_ this is not guaranteed. In particular, the wakeup
> + * may happen before the function returns.
> + *
> + * What is guaranteed is that there is a memory barrier before the wakeup;
> + * callers may rely on this barrier.
> + *
> + * On the other hand, the caller must guarantee that @task does not disappear
> + * before wake_q_add() completes. wake_q_add() does not contain any memory
> + * barrier to ensure ordering, thus the caller may need to use
> + * smp_store_release().

This is why we have wake_q_add_safe(). I think this last paragraph is
unnecessary and confusing.

Thanks,
Davidlohr

>   *
>   * This function must be used as-if it were wake_up_process(); IOW the task
>   * must be ready to be woken at this location.
> @@ -486,11 +494,14 @@ void wake_up_q(struct wake_q_head *head)
>           BUG_ON(!task);
>           /* Task can safely be re-inserted now: */
>           node = node->next;
> -         task->wake_q.next = NULL;
> +
> +         WRITE_ONCE(task->wake_q.next, NULL);
>
>           /*
>            * wake_up_process() executes a full barrier, which pairs with
>            * the queueing in wake_q_add() so as not to miss wakeups.
> +          * The barrier is the smp_mb__after_spinlock() in
> +          * try_to_wake_up().
>            */
>           wake_up_process(task);
>           put_task_struct(task);
> --
> 2.21.0
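
For context, a rough sketch of the lifetime rules of the two variants; the
'obj' and 'waiter' structures and their lock are invented for illustration
and are not code from this series. The two snippets are alternatives, each
with its own wake_q:

        /* Variant 1: wake_q_add() takes its own reference via
         * get_task_struct(); the caller only has to keep the task alive
         * across the call itself -- here it is pinned by being on the
         * lock-protected list.
         */
        DEFINE_WAKE_Q(wake_q);

        spin_lock(&obj->lock);
        list_del(&waiter->list);
        wake_q_add(&wake_q, waiter->task);
        spin_unlock(&obj->lock);
        wake_up_q(&wake_q);

        /* Variant 2: wake_q_add_safe() consumes a reference the caller
         * already holds, so nothing has to pin the task once the lock is
         * dropped; if the task was already queued elsewhere, the
         * reference is simply dropped.
         */
        DEFINE_WAKE_Q(wake_q);
        struct task_struct *p;

        spin_lock(&obj->lock);
        p = waiter->task;
        get_task_struct(p);
        list_del(&waiter->list);
        spin_unlock(&obj->lock);
        wake_q_add_safe(&wake_q, p);
        wake_up_q(&wake_q);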
[PATCH 1/6] wake_q: Cleanup + Documentation update.
1) wake_q_add() contains a memory barrier, and callers such as
   ipc/mqueue.c rely on this barrier.
   Unfortunately, this is documented in ipc/mqueue.c, and not in the
   description of wake_q_add().
   Therefore: Update the documentation.
   Removing/updating ipc/mqueue.c will happen with the next patch in the
   series.

2) wake_q_add() ends with get_task_struct(), which is an
   unordered refcount increase. Add a clear comment that the callers
   are responsible for a barrier: most likely spin_unlock() or
   smp_store_release().

3) wake_up_q() relies on the memory barrier in try_to_wake_up().
   Add a comment, to simplify searching.

4) wake_q.next is accessed without synchronization by wake_q_add(),
   using cmpxchg_relaxed(), and by wake_up_q().
   Therefore: Use WRITE_ONCE in wake_up_q(), to ensure that the
   compiler doesn't perform any tricks.

Signed-off-by: Manfred Spraul
Cc: Davidlohr Bueso
---
 kernel/sched/core.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dd05a378631a..60ae574317fd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -440,8 +440,16 @@ static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
  * @task: the task to queue for 'later' wakeup
  *
  * Queue a task for later wakeup, most likely by the wake_up_q() call in the
- * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
- * instantly.
+ * same context, _HOWEVER_ this is not guaranteed. In particular, the wakeup
+ * may happen before the function returns.
+ *
+ * What is guaranteed is that there is a memory barrier before the wakeup;
+ * callers may rely on this barrier.
+ *
+ * On the other hand, the caller must guarantee that @task does not disappear
+ * before wake_q_add() completes. wake_q_add() does not contain any memory
+ * barrier to ensure ordering, thus the caller may need to use
+ * smp_store_release().
  *
  * This function must be used as-if it were wake_up_process(); IOW the task
  * must be ready to be woken at this location.
@@ -486,11 +494,14 @@ void wake_up_q(struct wake_q_head *head)
          BUG_ON(!task);
          /* Task can safely be re-inserted now: */
          node = node->next;
-         task->wake_q.next = NULL;
+
+         WRITE_ONCE(task->wake_q.next, NULL);

          /*
           * wake_up_process() executes a full barrier, which pairs with
           * the queueing in wake_q_add() so as not to miss wakeups.
+          * The barrier is the smp_mb__after_spinlock() in
+          * try_to_wake_up().
           */
          wake_up_process(task);
          put_task_struct(task);
--
2.21.0
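
For reference, a condensed sketch of the ipc/mqueue.c dependency that
point 1) describes; the field and constant names are abbreviated and this
is not the literal mqueue code, only the shape of the reliance:

        /* sender side, with the message queue lock held; wake_q is a
         * struct wake_q_head * passed in by the caller
         */
        list_del(&r->list);
        wake_q_add(wake_q, r->task);
        /*
         * The barrier implied by wake_q_add() keeps the get_task_struct()
         * and the list removal ordered before the store below.  Once the
         * receiver observes STATE_READY it may return and even exit, so
         * without that ordering the later wake_up_q()/put_task_struct()
         * could touch a freed task.
         */
        r->state = STATE_READY;

        /* receiver side, after schedule() returns */
        if (READ_ONCE(r->state) == STATE_READY)
                return 0;       /* message already delivered */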