On Tue, Jul 17, 2018 at 11:33:41AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 17, 2018 at 09:19:15AM -0700, Linus Torvalds wrote:
> >
> > In particular, I find:
> >
> > "isync is not a memory barrier instruction, but the
> > load-compare-conditional branch-isync sequence can provide this
> > ordering property"
> >
> > so why are you doing "sync/lwsync", when it sounds like "isync/lwsync"
> > (for lock/unlock) is the right thing and would already give memory
> > barrier semantics?
>
> The PowerPC guys will correct me if I miss something here...
>
> The isync provides ordering roughly similar to lwsync, but nowhere near
> as strong as sync, and it is sync that would be needed to cause lock
> acquisition to provide full ordering. The reason for using lwsync instead
> of isync is that the former proved to be faster on recent hardware.
> The reason that the kernel still has the ability to instead generate
> isync instructions is that some older PowerPC hardware does not provide
> the lwsync instruction. If the hardware does support lwsync, the isync
> instructions are overwritten with lwsync at boot time.

Isn't ISYNC the instruction-sync pipeline flush instruction? That is
used as an smp_rmb() here to form, together with the control dependency
from the LL/SC, a LOAD->{LOAD,STORE} (aka LOAD-ACQUIRE) ordering?

Where LWSYNC provides a TSO-like ordering and SYNC provides a full
transitive barrier, aka smp_mb() (although I think it is strictly
stronger than smp_mb() since it also implies completion, which smp_mb()
does not).

And since both LL/SC-CTRL + ISYNC and LWSYNC are strictly CPU local,
they cannot be used to create RCsc ordering.
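
To make the lock/unlock orderings above concrete, here is a rough
userspace sketch in C11 atomics. It is not the kernel's PowerPC
spinlock code, and every name in it (sketch_lock, sketch_trylock, ...)
is made up for illustration. As far as I know, compilers targeting
PowerPC commonly lower the acquire CAS to an LL/SC loop followed by
isync (or lwsync) and the release store to lwsync + store, which is the
pairing discussed above; treat that mapping as an assumption and check
the generated code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type and function names, for illustration only. */
struct sketch_lock {
	atomic_int locked;	/* 0 = free, 1 = held */
};

/*
 * One acquisition attempt. On success, the acquire ordering keeps later
 * loads and stores in this thread from being reordered before the load
 * that saw the lock free: LOAD->{LOAD,STORE}.
 */
static bool sketch_trylock(struct sketch_lock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked,
						       &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Spin until the lock is taken. */
static void sketch_lock_acquire(struct sketch_lock *l)
{
	while (!sketch_trylock(l))
		;	/* busy-wait; a real lock would relax the CPU here */
}

/*
 * The release ordering keeps earlier loads and stores in this thread
 * from being reordered after the unlocking store: {LOAD,STORE}->STORE.
 */
static void sketch_unlock(struct sketch_lock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

Nothing in this sketch is a full barrier: an unlock followed by a lock
still leaves the earlier stores unordered against the later loads. You
would need something like atomic_thread_fence(memory_order_seq_cst),
i.e. a SYNC on PowerPC, somewhere in there to get RCsc.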