Re: wake_q memory ordering
On Fri, 11 Oct 2019, Manfred Spraul wrote:

>I don't know. The new documentation would not have answered my question
>(is it ok to combine smp_mb__before_atomic() with atomic_relaxed()?).
>And it copies content already present in atomic_t.txt.

Well, the _relaxed (and release/acquire) extensions refer to a
_successful_ operation (LL/SC), and whether it has barriers on each of
the sides before and after. I thought you were mainly worried about the
failed CAS scenario, not the relaxed itself. I don't know how this
copies content from atomic_t.txt; at no point does it talk about failed
CAS.

>Thus: I would prefer if the first sentence of the paragraph is replaced:
>The list of operations should end with "...", and it should match what
>is in atomic_t.txt

I'll see about combining some of your changes in patch 5/5 of your new
series, but have to say I still prefer my documentation change.

Thanks,
Davidlohr
Re: wake_q memory ordering
Hi Davidlohr,

On 10/10/19 9:25 PM, Davidlohr Bueso wrote:
>On Thu, 10 Oct 2019, Peter Zijlstra wrote:
>>On Thu, Oct 10, 2019 at 02:13:47PM +0200, Manfred Spraul wrote:
>>>Therefore smp_mb__{before,after}_atomic() may be combined with
>>>cmpxchg_relaxed, to form a full memory barrier, on all archs.
>>
>>Just so.
>
>We might want something like this?
>
>8<-
>From: Davidlohr Bueso
>Subject: [PATCH] Documentation/memory-barriers.txt: Mention
> smp_mb__{before,after}_atomic() and CAS
>
>Explicitly mention possible usages to guarantee serialization even upon
>failed cmpxchg (or similar) calls along with
>smp_mb__{before,after}_atomic().
>
>Signed-off-by: Davidlohr Bueso
>---
> Documentation/memory-barriers.txt | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
>diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>index 1adbb8a371c7..5d2873d4b442 100644
>--- a/Documentation/memory-barriers.txt
>+++ b/Documentation/memory-barriers.txt
>@@ -1890,6 +1890,18 @@ There are some more advanced barrier functions:
>      This makes sure that the death mark on the object is perceived to be set
>      *before* the reference counter is decremented.
>
>+     Similarly, these barriers can be used to guarantee serialization for atomic
>+     RMW calls on architectures which may not imply memory barriers upon failure.
>+
>+	obj->next = NULL;
>+	smp_mb__before_atomic();
>+	if (cmpxchg(&obj->ptr, NULL, val))
>+		return;
>+
>+     This makes sure that the store to the next pointer always has smp_store_mb()
>+     semantics. As such, smp_mb__{before,after}_atomic() calls allow optimizing
>+     the barrier usage by finer grained serialization.
>+
>      See Documentation/atomic_{t,bitops}.txt for more information.

I don't know. The new documentation would not have answered my question
(is it ok to combine smp_mb__before_atomic() with atomic_relaxed()?).
And it copies content already present in atomic_t.txt.

Thus: I would prefer if the first sentence of the paragraph is replaced:
The list of operations should end with "...", and it should match what
is in atomic_t.txt.

Ok?
--
	Manfred

From 8df60211228042672ba0cd89c3566c5145e8b203 Mon Sep 17 00:00:00 2001
From: Manfred Spraul
Date: Fri, 11 Oct 2019 10:33:26 +0200
Subject: [PATCH 4/4] Documentation/memory-barriers.txt: Clarify cmpxchg()

The documentation in memory-barriers.txt claims that
smp_mb__{before,after}_atomic() are for atomic ops that do not return a
value. This is misleading and doesn't match the example in atomic_t.txt,
and e.g. smp_mb__before_atomic() may be, and is, used together with
cmpxchg_relaxed() in the wake_q code.

The purpose of e.g. smp_mb__before_atomic() is to "upgrade" a following
RMW atomic operation to a full memory barrier. The return code of the
atomic operation has no impact, so all of the following examples are
valid:

1)
	smp_mb__before_atomic();
	atomic_add();

2)
	smp_mb__before_atomic();
	atomic_xchg_relaxed();

3)
	smp_mb__before_atomic();
	atomic_fetch_add_relaxed();

Invalid would be:
	smp_mb__before_atomic();
	atomic_set();

Signed-off-by: Manfred Spraul
Cc: Waiman Long
Cc: Davidlohr Bueso
Cc: Peter Zijlstra
---
 Documentation/memory-barriers.txt | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 1adbb8a371c7..52076b057400 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1873,12 +1873,13 @@ There are some more advanced barrier functions:
  (*) smp_mb__before_atomic();
  (*) smp_mb__after_atomic();

-     These are for use with atomic (such as add, subtract, increment and
-     decrement) functions that don't return a value, especially when used for
-     reference counting.  These functions do not imply memory barriers.
+     These are for use with atomic RMW functions (such as add, subtract,
+     increment, decrement, failed conditional operations, ...) that do
+     not imply memory barriers, but where the code needs a memory barrier,
+     for example when used for reference counting.
-     These are also used for atomic bitop functions that do not return a
-     value (such as set_bit and clear_bit).
+     These are also used for atomic RMW bitop functions that do not imply a
+     full memory barrier (such as set_bit and clear_bit).

      As an example, consider a piece of code that marks an object as being dead
      and then decrements the object's reference count:
--
2.21.0
Re: wake_q memory ordering
On Thu, 10 Oct 2019, Peter Zijlstra wrote:

>On Thu, Oct 10, 2019 at 02:13:47PM +0200, Manfred Spraul wrote:
>>Hi Peter,
>>
>>On 10/10/19 1:42 PM, Peter Zijlstra wrote:
>>>On Thu, Oct 10, 2019 at 12:41:11PM +0200, Manfred Spraul wrote:
>>>>Hi,
>>>>
>>>>Waiman Long noticed that the memory barriers in sem_lock() are not really
>>>>documented, and while adding documentation, I ended up with one case where
>>>>I'm not certain about the wake_q code:
>>>>
>>>>Questions:
>>>>- Does smp_mb__before_atomic() + a (failed) cmpxchg_relaxed provide an
>>>>  ordering guarantee?
>>>
>>>Yep. Either the atomic instruction implies ordering (eg. x86 LOCK
>>>prefix) or it doesn't (most RISC LL/SC), if it does,
>>>smp_mb__{before,after}_atomic() are a NO-OP and the ordering is
>>>unconditional, if it does not, then smp_mb__{before,after}_atomic() are
>>>unconditional barriers.
>>
>>And _relaxed() differs from "normal" cmpxchg only for LL/SC
>>architectures, correct?
>
>Indeed.
>
>>Therefore smp_mb__{before,after}_atomic() may be combined with
>>cmpxchg_relaxed, to form a full memory barrier, on all archs.
>
>Just so.

We might want something like this?

8<-
From: Davidlohr Bueso
Subject: [PATCH] Documentation/memory-barriers.txt: Mention
 smp_mb__{before,after}_atomic() and CAS

Explicitly mention possible usages to guarantee serialization even upon
failed cmpxchg (or similar) calls along with
smp_mb__{before,after}_atomic().

Signed-off-by: Davidlohr Bueso
---
 Documentation/memory-barriers.txt | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 1adbb8a371c7..5d2873d4b442 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1890,6 +1890,18 @@ There are some more advanced barrier functions:
      This makes sure that the death mark on the object is perceived to be set
      *before* the reference counter is decremented.
+     Similarly, these barriers can be used to guarantee serialization for atomic
+     RMW calls on architectures which may not imply memory barriers upon failure.
+
+	obj->next = NULL;
+	smp_mb__before_atomic();
+	if (cmpxchg(&obj->ptr, NULL, val))
+		return;
+
+     This makes sure that the store to the next pointer always has smp_store_mb()
+     semantics. As such, smp_mb__{before,after}_atomic() calls allow optimizing
+     the barrier usage by finer grained serialization.
+
      See Documentation/atomic_{t,bitops}.txt for more information.
--
2.16.4
Re: wake_q memory ordering
On Thu, Oct 10, 2019 at 02:13:47PM +0200, Manfred Spraul wrote:
> Hi Peter,
>
> On 10/10/19 1:42 PM, Peter Zijlstra wrote:
> > On Thu, Oct 10, 2019 at 12:41:11PM +0200, Manfred Spraul wrote:
> > > Hi,
> > >
> > > Waiman Long noticed that the memory barriers in sem_lock() are not really
> > > documented, and while adding documentation, I ended up with one case where
> > > I'm not certain about the wake_q code:
> > >
> > > Questions:
> > > - Does smp_mb__before_atomic() + a (failed) cmpxchg_relaxed provide an
> > >   ordering guarantee?
> > Yep. Either the atomic instruction implies ordering (eg. x86 LOCK
> > prefix) or it doesn't (most RISC LL/SC), if it does,
> > smp_mb__{before,after}_atomic() are a NO-OP and the ordering is
> > unconditional, if it does not, then smp_mb__{before,after}_atomic() are
> > unconditional barriers.
>
> And _relaxed() differs from "normal" cmpxchg only for LL/SC architectures,
> correct?

Indeed.

> Therefore smp_mb__{before,after}_atomic() may be combined with
> cmpxchg_relaxed, to form a full memory barrier, on all archs.

Just so.

> > > - Is it ok that wake_up_q just writes wake_q->next, shouldn't
> > >   smp_store_acquire() be used? I.e.: guarantee that wake_up_process()
> > >   happens after cmpxchg_relaxed(), assuming that a failed cmpxchg_relaxed
> > >   provides any ordering.
> > There is no such thing as store_acquire, it is either load_acquire or
> > store_release. But just like how we can write load-acquire like
> > load+smp_mb(), so too I suppose we could write store-acquire like
> > store+smp_mb(), and that is exactly what is there (through the implied
> > barrier of wake_up_process()).
>
> Thanks for confirming my assumption:
> The code is correct, due to the implied barrier inside wake_up_process().

It has a comment there, trying to state this:

	task->wake_q.next = NULL;
	/*
	 * wake_up_process() executes a full barrier, which pairs with
	 * the queueing in wake_q_add() so as not to miss wakeups.
	 */
	wake_up_process(task);
Re: wake_q memory ordering
Hi Peter,

On 10/10/19 1:42 PM, Peter Zijlstra wrote:
>On Thu, Oct 10, 2019 at 12:41:11PM +0200, Manfred Spraul wrote:
>>Hi,
>>
>>Waiman Long noticed that the memory barriers in sem_lock() are not really
>>documented, and while adding documentation, I ended up with one case where
>>I'm not certain about the wake_q code:
>>
>>Questions:
>>- Does smp_mb__before_atomic() + a (failed) cmpxchg_relaxed provide an
>>  ordering guarantee?
>Yep. Either the atomic instruction implies ordering (eg. x86 LOCK
>prefix) or it doesn't (most RISC LL/SC), if it does,
>smp_mb__{before,after}_atomic() are a NO-OP and the ordering is
>unconditional, if it does not, then smp_mb__{before,after}_atomic() are
>unconditional barriers.

And _relaxed() differs from "normal" cmpxchg only for LL/SC
architectures, correct?

Therefore smp_mb__{before,after}_atomic() may be combined with
cmpxchg_relaxed, to form a full memory barrier, on all archs.

[...]

>>- Is it ok that wake_up_q just writes wake_q->next, shouldn't
>>  smp_store_acquire() be used? I.e.: guarantee that wake_up_process()
>>  happens after cmpxchg_relaxed(), assuming that a failed cmpxchg_relaxed
>>  provides any ordering.
>There is no such thing as store_acquire, it is either load_acquire or
>store_release. But just like how we can write load-acquire like
>load+smp_mb(), so too I suppose we could write store-acquire like
>store+smp_mb(), and that is exactly what is there (through the implied
>barrier of wake_up_process()).

Thanks for confirming my assumption:
The code is correct, due to the implied barrier inside wake_up_process().

[...]

>>rewritten:
>>
>>start condition: A = 1; B = 0;
>>
>>CPU1:
>>	B = 1;
>>	RELEASE, unlock LockX;
>>
>>CPU2:
>>	lock LockX, ACQUIRE
>>	if (LOAD A == 1) return; /* using cmpxchg_relaxed */
>>
>>CPU2:
>>	A = 0;
>>	ACQUIRE, lock LockY
>>	smp_mb__after_spinlock();
>>	READ B
>>
>>Question: is A = 1, B = 0 possible?
>Your example is incomplete (there is no A=1 assignment for example), but
>I'm thinking I can guess where that should go given the earlier text.

A=1 is listed as a start condition.
Way before, someone did wake_q_add().

>I don't think this is broken.

Thanks.

--
	Manfred
Re: wake_q memory ordering
On Thu, Oct 10, 2019 at 12:41:11PM +0200, Manfred Spraul wrote:
> Hi,
>
> Waiman Long noticed that the memory barriers in sem_lock() are not really
> documented, and while adding documentation, I ended up with one case where
> I'm not certain about the wake_q code:
>
> Questions:
> - Does smp_mb__before_atomic() + a (failed) cmpxchg_relaxed provide an
>   ordering guarantee?

Yep. Either the atomic instruction implies ordering (eg. x86 LOCK
prefix) or it doesn't (most RISC LL/SC), if it does,
smp_mb__{before,after}_atomic() are a NO-OP and the ordering is
unconditional, if it does not, then smp_mb__{before,after}_atomic() are
unconditional barriers.

IOW, the only way to get a cmpxchg without barriers on failure, is with
LL/SC, and in that case smp_mb__{before,after}_atomic() are
unconditional.

For instance, the way ARM64 does cmpxchg() is:

	cmpxchg(p, o, n)
		do {
			v = LL(p);
			if (v != o)
				return v;
		} while (!SC_RELEASE(p, n))
		smp_mb();
		return v;

And you'll note how on success the store-release constrains all prior
memory operations, and the smp_mb() constrains all later memops. But on
failure there's not a barrier to be found.

> - Is it ok that wake_up_q just writes wake_q->next, shouldn't
>   smp_store_acquire() be used? I.e.: guarantee that wake_up_process()
>   happens after cmpxchg_relaxed(), assuming that a failed cmpxchg_relaxed
>   provides any ordering.

There is no such thing as store_acquire, it is either load_acquire or
store_release. But just like how we can write load-acquire like
load+smp_mb(), so too I suppose we could write store-acquire like
store+smp_mb(), and that is exactly what is there (through the implied
barrier of wake_up_process()).

(arguably it should've been WRITE_ONCE() I suppose)

> Example:
> - CPU2 never touches lock a. It is just an unrelated wake_q user that also
>   wants to wake up task 1234.
> - I've noticed already that smp_store_acquire() doesn't exist.
>   So smp_store_mb() is required.
>   But from a semantical point of view, we would
>   need an ACQUIRE: the wake_up_process() must happen after cmpxchg().
> - May wake_up_q() rely on the spinlocks/memory barriers in try_to_wake_up,
>   or should the function be safe by itself?
>
> CPU1: /* current=1234, inside do_semtimedop() */
>	g_wakee = current;
>	current->state = TASK_INTERRUPTIBLE;
>	spin_unlock(a);
>
> CPU2: /* arbitrary kernel thread that uses wake_q */
>	wake_q_add(&wake_q, 1234);
>	wake_up_q(&wake_q);
>	<...ongoing>
>
> CPU3: /* do_semtimedop() + wake_up_sem_queue_prepare() */
>	spin_lock(a);
>	wake_q_add(&wake_q, g_wakee);
>	<within wake_q_add()>:
>	smp_mb__before_atomic();
>	if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
>		return false; /* -> this happens */
>
> CPU2:
>	1234->wake_q.next = NULL; <<< Ok? Is store_acquire() missing?
	/* smp_mb(); implied by the following wake_up_process() */
>	wake_up_process(1234);
>	<within wake_up_process/try_to_wake_up()>:
>		raw_spin_lock_irqsave();
>		smp_mb__after_spinlock();
>		if (1234->state == TASK_RUNNING)
>			return;
>
> rewritten:
>
> start condition: A = 1; B = 0;
>
> CPU1:
>	B = 1;
>	RELEASE, unlock LockX;
>
> CPU2:
>	lock LockX, ACQUIRE
>	if (LOAD A == 1) return; /* using cmpxchg_relaxed */
>
> CPU2:
>	A = 0;
>	ACQUIRE, lock LockY
>	smp_mb__after_spinlock();
>	READ B
>
> Question: is A = 1, B = 0 possible?

Your example is incomplete (there is no A=1 assignment for example), but
I'm thinking I can guess where that should go given the earlier text.

I don't think this is broken.
wake_q memory ordering
Hi,

Waiman Long noticed that the memory barriers in sem_lock() are not really
documented, and while adding documentation, I ended up with one case where
I'm not certain about the wake_q code:

Questions:
- Does smp_mb__before_atomic() + a (failed) cmpxchg_relaxed provide an
  ordering guarantee?
- Is it ok that wake_up_q just writes wake_q->next, shouldn't
  smp_store_acquire() be used? I.e.: guarantee that wake_up_process()
  happens after cmpxchg_relaxed(), assuming that a failed cmpxchg_relaxed
  provides any ordering.

Example:
- CPU2 never touches lock a. It is just an unrelated wake_q user that also
  wants to wake up task 1234.
- I've noticed already that smp_store_acquire() doesn't exist.
  So smp_store_mb() is required. But from a semantical point of view, we
  would need an ACQUIRE: the wake_up_process() must happen after cmpxchg().
- May wake_up_q() rely on the spinlocks/memory barriers in try_to_wake_up,
  or should the function be safe by itself?

CPU1: /* current=1234, inside do_semtimedop() */
	g_wakee = current;
	current->state = TASK_INTERRUPTIBLE;
	spin_unlock(a);

CPU2: /* arbitrary kernel thread that uses wake_q */
	wake_q_add(&wake_q, 1234);
	wake_up_q(&wake_q);
	<...ongoing>

CPU3: /* do_semtimedop() + wake_up_sem_queue_prepare() */
	spin_lock(a);
	wake_q_add(&wake_q, g_wakee);
	<within wake_q_add()>:
	smp_mb__before_atomic();
	if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
		return false; /* -> this happens */

CPU2:
	1234->wake_q.next = NULL; <<< Ok? Is store_acquire() missing?
	wake_up_process(1234);
	<within wake_up_process/try_to_wake_up()>:
		raw_spin_lock_irqsave();
		smp_mb__after_spinlock();
		if (1234->state == TASK_RUNNING)
			return;

rewritten:

start condition: A = 1; B = 0;

CPU1:
	B = 1;
	RELEASE, unlock LockX;

CPU2:
	lock LockX, ACQUIRE
	if (LOAD A == 1) return; /* using cmpxchg_relaxed */

CPU2:
	A = 0;
	ACQUIRE, lock LockY
	smp_mb__after_spinlock();
	READ B

Question: is A = 1, B = 0 possible?

--
	Manfred