x86 is missing a core serializing instruction in migration scenarios.

Given that x86-32 can return to user-space through sysexit, and x86-64
through sysretq and sysexit, none of which are core serializing, the
following user-space self-modifying code (JIT) scenario can occur:

     CPU 0                      CPU 1

User-space self-modify code
Preempted
 migrated             ->
                                scheduler selects task
                                Return to user-space (iret or sysexit)
                                User-space issues sync_core()
                      <-        migrated
scheduler selects task
Return to user-space (sysexit)
jump to modified code
Run modified code without sync_core() -> bug.

In this migration pattern, the return to user-space can go through
sysexit or sysretq, neither of which is core serializing, and therefore
breaks the sequential consistency expectations of a single-threaded
process.
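
For illustration, here is a minimal, hypothetical user-space sketch of
the JIT pattern described above. The cpuid-based jit_sync_core() is
only one way user-space can implement a core serializing operation, and
the helper names are made up for this example:

  #include <stdint.h>
  #include <string.h>

  /* One possible user-space sync_core(): cpuid is core serializing on x86. */
  static void jit_sync_core(void)
  {
          unsigned int eax = 0, ebx, ecx, edx;

          asm volatile("cpuid"
                       : "+a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
                       : : "memory");
  }

  typedef int (*jit_fn)(void);

  /* buf is assumed to be an executable mapping (e.g. mmap with PROT_EXEC). */
  static int jit_update_and_run(void *buf, const uint8_t *code, size_t len)
  {
          memcpy(buf, code, len);   /* user-space self-modifies code       */
          jit_sync_core();          /* serializes the CPU it runs on *now* */
          return ((jit_fn)buf)();   /* jump to the freshly written code    */
  }

In the scenario above, the thread migrates back to CPU 0 after
jit_sync_core() has run on CPU 1, so the CPU that finally jumps to the
modified code never executed a core serializing instruction.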

Fix this issue by invoking sync_core_before_usermode() the first
time a runqueue finishes a task switch after receiving a migrated
thread.
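
For reference, a minimal sketch of what sync_core_before_usermode()
could look like on x86 is shown below. This is an assumption for
illustration only, not necessarily the implementation this patch
ultimately relies on:

  /*
   * Illustrative sketch (assumption): architectures whose return to
   * user-space may go through a non core serializing path (sysexit,
   * sysretq) serialize explicitly here; architectures that always
   * return through a serializing instruction (e.g. iret) can turn
   * this into a no-op.
   */
  static inline void sync_core_before_usermode(void)
  {
          sync_core();    /* e.g. cpuid on x86, which is core serializing */
  }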

Signed-off-by: Mathieu Desnoyers <[email protected]>
CC: Peter Zijlstra <[email protected]>
CC: Andy Lutomirski <[email protected]>
CC: Paul E. McKenney <[email protected]>
CC: Boqun Feng <[email protected]>
CC: Andrew Hunter <[email protected]>
CC: Maged Michael <[email protected]>
CC: Avi Kivity <[email protected]>
CC: Benjamin Herrenschmidt <[email protected]>
CC: Paul Mackerras <[email protected]>
CC: Michael Ellerman <[email protected]>
CC: Dave Watson <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: Andrea Parri <[email protected]>
CC: Russell King <[email protected]>
CC: Greg Hackmann <[email protected]>
CC: Will Deacon <[email protected]>
CC: David Sehr <[email protected]>
CC: Linus Torvalds <[email protected]>
CC: [email protected]
CC: [email protected]
---
 kernel/sched/core.c  | 7 +++++++
 kernel/sched/sched.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c79e94278613..4a1c9782267a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -927,6 +927,7 @@ static struct rq *move_queued_task(struct rq *rq, struct rq_flags *rf,
 
        rq_lock(rq, rf);
        BUG_ON(task_cpu(p) != new_cpu);
+       rq->need_sync_core = 1;
        enqueue_task(rq, p, 0);
        p->on_rq = TASK_ON_RQ_QUEUED;
        check_preempt_curr(rq, p, 0);
@@ -2684,6 +2685,12 @@ static struct rq *finish_task_switch(struct task_struct *prev)
        prev_state = prev->state;
        vtime_task_switch(prev);
        perf_event_task_sched_in(prev, current);
+#ifdef CONFIG_SMP
+       if (unlikely(rq->need_sync_core)) {
+               sync_core_before_usermode();
+               rq->need_sync_core = 0;
+       }
+#endif
        finish_lock_switch(rq, prev);
        finish_arch_post_lock_switch();
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cab256c1720a..33e617bc491c 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -734,6 +734,7 @@ struct rq {
        /* For active balancing */
        int active_balance;
        int push_cpu;
+       int need_sync_core;
        struct cpu_stop_work active_balance_work;
        /* cpu of this runqueue: */
        int cpu;
-- 
2.11.0
