Hi all-

x86's sync_core_before_usermode() was bogus.  Without the other
patches applied, it would never be called in a problematic context,
but that's about to change.  In any event, sync_core_before_usermode()
should be correct.

The second patch fixes a minor issue, but it also makes the third patch
nicer.

The third patch is the biggie.  The old code looped over all CPUs
without disabling migration, and it skipped the current CPU.  There
were comments about how the scheduler's barriers made this okay.  This
may well be true, but it was a mess, and it's considerably simpler to
treat the current CPU just like all other CPUs.

The messy skip-the-current-CPU code masked what seems to be a couple
of much bigger issues: if the membarrier() syscall ran all the way
through _without_ being preempted, it completely failed to operate on
the calling thread.  The smp_mb() calls sprinkled through the function
would mask this problem for the normal barrier mode, but they wouldn't
help for the core-serializing mode or rseq_preempt mode.  In other
words, modifying some code, calling
membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0), and running
that modified code was not actually safe.  This seems to rather defeat
the purpose.

Some or all of this might be -stable material.

The global_expedited code looks similarly nasty.  Any volunteers to
clean it up?

Andy Lutomirski (3):
  x86/membarrier: Get rid of a dubious optimization
  membarrier: Add an actual barrier before rseq_preempt()
  membarrier: Propagate SYNC_CORE and RSEQ actions more carefully

 arch/x86/include/asm/sync_core.h |   9 +--
 arch/x86/mm/tlb.c                |   6 +-
 kernel/sched/membarrier.c        | 102 ++++++++++++++++++++++---------
 3 files changed, 83 insertions(+), 34 deletions(-)

-- 
2.28.0
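
For reference, the userspace pattern the cover letter says was not
actually safe (modify code, issue the SYNC_CORE membarrier, run the
modified code) might look roughly like the sketch below.  This is an
illustration only, not part of the series.  The helper names
jit_membarrier(), jit_register_sync_core() and jit_publish_code() are
made up for this example, and the sketch assumes a Linux system whose
uapi headers provide <linux/membarrier.h> with the SYNC_CORE commands.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: glibc does not export membarrier(), so the
 * call goes through syscall(2).  The third argument (cpu_id) only
 * exists on newer kernels and is ignored for these commands, so 0 is
 * passed unconditionally.
 */
int jit_membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * Hypothetical helper: a process must register once before it may use
 * MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.  Returns 0 on success,
 * -1 with errno set if the kernel or architecture lacks support.
 */
int jit_register_sync_core(void)
{
	return jit_membarrier(
		MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}

/*
 * Hypothetical helper: call after the new instruction bytes have been
 * written and before any thread of the process jumps to them.  The
 * intent is that every thread, including the caller, executes a
 * core-serializing instruction before it next runs user code.
 */
int jit_publish_code(void)
{
	return jit_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}
```

A JIT would call jit_register_sync_core() once at startup, then call
jit_publish_code() after each batch of code writes and only jump to the
new code if it returned 0.  The "including the caller" part is the
guarantee the third patch is concerned with.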