On Fri, Aug 5, 2016 at 6:37 AM, Sebastian Andrzej Siewior
<[email protected]> wrote:
> Usually current->mm (and therefore mm->pgd) stays the same during the
> lifetime of a task so it does not matter if a task gets preempted during
> the read and write of the CR3.
>
> But then, there is this scenario on x86-UP:
> TaskA is in do_exit() and exit_mm() sets current->mm = NULL followed by
> mmput() -> exit_mmap() -> tlb_finish_mmu() -> tlb_flush_mmu() ->
> tlb_flush_mmu_tlbonly() -> tlb_flush() -> flush_tlb_mm_range() ->
> __flush_tlb_up() -> __flush_tlb() -> __native_flush_tlb().
>
> At this point current->mm is NULL but current->active_mm still points to
> the "old" mm.
> Let's preempt taskA _after_ native_read_cr3() by taskB. TaskB has its
> own mm so CR3 has changed.
> Now preempt back to taskA. TaskA has no ->mm set so it borrows taskB's
> mm and so CR3 remains unchanged. Once taskA gets active it continues
> where it was interrupted and that means it writes its old CR3 value
> back. Everything is fine because userland won't need its memory
> anymore.

This should affect kernel threads too, right?

Acked-by: Andy Lutomirski <[email protected]>
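
For illustration only, the interleaving described in the quoted text can be modeled in plain user-space C. This is a rough sketch, not kernel code: fake_cr3, sim_read_cr3(), sim_write_cr3(), PGD_A, PGD_B and simulate_flush() are all hypothetical stand-ins invented for the example. The "preemption" is just an inline write that mimics the switch to taskB loading its own pgd. The switch back to taskA is assumed to leave CR3 untouched because taskA has no ->mm and borrows taskB's.

```c
/* Hypothetical model of CR3: a plain variable, not the real register. */
static unsigned long fake_cr3;

/* Hypothetical stand-ins for the CR3 read/write accessors. */
static unsigned long sim_read_cr3(void)
{
	return fake_cr3;
}

static void sim_write_cr3(unsigned long val)
{
	fake_cr3 = val;
}

#define PGD_A 0x1000UL	/* made-up pgd value of taskA's old mm */
#define PGD_B 0x2000UL	/* made-up pgd value of taskB's mm */

/*
 * Model the two-step flush (read CR3, write the same value back) as
 * executed by taskA.  If preempt_in_middle is non-zero, taskB runs
 * between the read and the write.
 *
 * Returns 1 if CR3 ends up holding a value that does not match the mm
 * that is active at the end of the flush, 0 otherwise.
 */
static int simulate_flush(int preempt_in_middle)
{
	unsigned long saved, expected;

	fake_cr3 = PGD_A;		/* taskA runs on its old mm */
	saved = sim_read_cr3();		/* first half of the flush */

	if (preempt_in_middle) {
		/* Switch to taskB: its own pgd is loaded. */
		sim_write_cr3(PGD_B);
		/*
		 * Switch back to taskA: ->mm is NULL, so it borrows
		 * taskB's mm and CR3 is left alone.
		 */
	}

	sim_write_cr3(saved);		/* second half: writes the old value */

	/* After the preemption the borrowed (active) mm is taskB's. */
	expected = preempt_in_middle ? PGD_B : PGD_A;
	return fake_cr3 != expected;
}
```

In this model simulate_flush(0) returns 0 and simulate_flush(1) returns 1: only the preempted case leaves the old pgd value in CR3 while the active mm is the borrowed one.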

