On Thu, May 21, 2020 at 11:39:39AM +0200, Peter Zijlstra wrote:
> On Thu, May 21, 2020 at 02:40:36AM +0200, Frederic Weisbecker wrote:

> This:
> 
> >         smp_call_function_single_async() {             
> > smp_call_function_single_async() {
> >             // verified csd->flags != CSD_LOCK             // verified 
> > csd->flags != CSD_LOCK
> >             csd->flags = CSD_LOCK                          csd->flags = 
> > CSD_LOCK
> 
> concurrent smp_call_function_single_async() using the same csd is what
> I'm looking at as well.

So something like this ought to cure the fundamental problem and make
smp_call_function_single_async() more user friendly, but also more
expensive.

The problem is that while the ILB case is easy to fix, I can't seem to
find an equally nice solution for the ttwu_remote_queue() case; that
would basically require sticking the wake_csd in task_struct, I'll also
post that.

So it's either this:

---
 kernel/smp.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 84303197caf9..d1ca2a2d1cc7 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -109,6 +109,12 @@ static __always_inline void 
csd_lock_wait(call_single_data_t *csd)
        smp_cond_load_acquire(&csd->flags, !(VAL & CSD_FLAG_LOCK));
 }
 
+/*
+ * csd_lock() can use non-atomic operations to set CSD_FLAG_LOCK because it's
+ * users are careful to only use CPU-local data. IOW, there is no cross-cpu
+ * lock usage. Also, you're not allowed to use smp_call_function*() from IRQs,
+ * and must be extra careful from SoftIRQ.
+ */
 static __always_inline void csd_lock(call_single_data_t *csd)
 {
        csd_lock_wait(csd);
@@ -318,7 +324,7 @@ EXPORT_SYMBOL(smp_call_function_single);
 
 /**
  * smp_call_function_single_async(): Run an asynchronous function on a
- *                              specific CPU.
+ *                                  specific CPU.
  * @cpu: The CPU to run on.
  * @csd: Pre-allocated and setup data structure
  *
@@ -339,18 +345,23 @@ EXPORT_SYMBOL(smp_call_function_single);
  */
 int smp_call_function_single_async(int cpu, call_single_data_t *csd)
 {
+       unsigned int csd_flags;
        int err = 0;
 
        preempt_disable();
 
-       if (csd->flags & CSD_FLAG_LOCK) {
+       /*
+        * Unlike the regular smp_call_function*() APIs, this one is actually
+        * usable from IRQ context, also the -EBUSY return value suggests
+        * it is safe to share csd's.
+        */
+       csd_flags = READ_ONCE(csd->flags);
+       if (csd_flags & CSD_FLAG_LOCK ||
+           cmpxchg(&csd->flags, csd_flags, csd_flags | CSD_FLAG_LOCK) != 
csd_flags) {
                err = -EBUSY;
                goto out;
        }
 
-       csd->flags = CSD_FLAG_LOCK;
-       smp_wmb();
-
        err = generic_exec_single(cpu, csd, csd->func, csd->info);
 
 out:

Reply via email to