>> Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
>> that the spinlock used at __schedule() should be RCsc to ensure the
>> visibility of writes prior to __schedule() when the task is to be migrated
>> to another CPU.
>>
>> And this is emphasized in the comment of the newly introduced
>> smp_mb__after_spinlock():
>>
>> * This barrier must provide two things:
>> *
>> *   - it must guarantee a STORE before the spin_lock() is ordered against a
>> *     LOAD after it, see the comments at its two usage sites.
>> *
>> *   - it must ensure the critical section is RCsc.
>> *
>> * The latter is important for cases where we observe values written by other
>> * CPUs in spin-loops, without barriers, while being subject to scheduling.
>> *
>> * CPU0             CPU1                    CPU2
>> *
>> *                  for (;;) {
>> *                    if (READ_ONCE(X))
>> *                      break;
>> *                  }
>> * X=1
>> *                  <sched-out>
>> *                                          <sched-in>
>> *                                          r = X;
>> *
>> * without transitivity it could be that CPU1 observes X!=0, breaks the loop,
>> * we get migrated and CPU2 sees X==0.
>>
>> which is used at:
>>
>> __schedule(bool preempt) {
>>     ...
>>     rq_lock(rq, &rf);
>>     smp_mb__after_spinlock();
>>     ...
>> }
>>
>> If I am not missing something, I find that this kind of visibility does
>> __not__ necessarily depend on the spinlock at __schedule() being RCsc.
>>
>> In fact, for a runnable task A, the migration would be:
>>
>> CPU0                            CPU1                    CPU2
>>
>> <ACCESS before schedule out A>
>>
>> lock(rq0)
>> schedule out A
>> unlock(rq0)
>>
>>                                 lock(rq0)
>>                                 remove A from rq0
>>                                 unlock(rq0)
>>
>>                                 lock(rq2)
>>                                 add A into rq2
>>                                 unlock(rq2)
>>                                                         lock(rq2)
>>                                                         schedule in A
>>                                                         unlock(rq2)
>>
>>                                                         <ACCESS after schedule in A>
>>
>> <ACCESS before schedule out A> happens-before
>> unlock(rq0) happens-before
>> lock(rq0) happens-before
>> unlock(rq2) happens-before
>> lock(rq2) happens-before
>> <ACCESS after schedule in A>
>>
>
> But without an RCsc lock, you cannot guarantee that a write propagates to
> CPU0 and CPU2 at the same time, so the same write may propagate to
> CPU0 before <ACCESS before schedule out A> but propagate to CPU2 after
> <ACCESS after schedule in A>. So..
>
> Regards,
> Boqun
Thank you for pointing out this case, Boqun. But this is just one special
case of what acquire-release chains promise us.

A = B = 0 initially

CPU0            CPU1            CPU2            CPU3

write A=1
                read A=1
                write B=1
                release X
                                acquire X
                                read A=?
                                release Y
                                                acquire Y
                                                read B=?

assurance 1: CPU3 will surely see B=1 written by CPU1, and
assurance 2: CPU2 will also see A=1 written by CPU0, as a special case.

The second assurance holds both in theory and on real hardware.

As for theory, the C++11 memory model, which is a potential formal model
for the kernel memory model as
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html
describes, states that:

    If a value computation A of an atomic object M happens before a value
    computation B of M, and A takes its value from a side effect X on M,
    then the value computed by B shall either be the value stored by X or
    the value stored by a side effect Y on M, where Y follows X in the
    modification order of M.

at §1.10 rule 18, on page 14 of
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf

As for real hardware, Luc provided detailed tests and explanations for
ARM and POWER in section 5.1 (Cumulative Barriers for WRC), on page 19 of
this paper:

A Tutorial Introduction to the ARM and POWER Relaxed Memory Models
https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf

So, I think we may remove RCsc from smp_mb__after_spinlock(), which is
really confusing.
Best Regards,
Trol

>
>> And for stopped tasks:
>>
>> CPU0                            CPU1                    CPU2
>>
>> <ACCESS before schedule out A>
>>
>> lock(rq0)
>> schedule out A
>> remove A from rq0
>> store-release(A->on_cpu)
>> unlock(rq0)
>>
>>                                 load_acquire(A->on_cpu)
>>                                 set_task_cpu(A, 2)
>>
>>                                 lock(rq2)
>>                                 add A into rq2
>>                                 unlock(rq2)
>>
>>                                                         lock(rq2)
>>                                                         schedule in A
>>                                                         unlock(rq2)
>>
>>                                                         <ACCESS after schedule in A>
>>
>> <ACCESS before schedule out A> happens-before
>> store-release(A->on_cpu) happens-before
>> load_acquire(A->on_cpu) happens-before
>> unlock(rq2) happens-before
>> lock(rq2) happens-before
>> <ACCESS after schedule in A>
>>
>> So, I think the only requirement on smp_mb__after_spinlock() is
>> to guarantee that a STORE before the spin_lock() is ordered
>> against a LOAD after it, so we could remove the RCsc requirement
>> to allow a more efficient implementation.
>>
>> Did I miss something, or does this RCsc requirement not really matter?