These two prctls adjust the current task's flags so that page table
isolation can be disabled either for the current process
(ARCH_DISABLE_PTI_NOW) or for the process resulting from the next
execve() (ARCH_DISABLE_PTI_NEXT).

Both settings depend on CONFIG_PER_PROCESS_PTI. The flags cannot be set
if the pti_adjust sysctl is lower than 1 or if the task lacks
CAP_SYS_RAWIO, but they can still be cleared in either case.

Once the task has created new threads, setting the flags is no longer
allowed, but it is still possible to clear them.
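
For illustration, a wrapper could request the setting from userspace
roughly as sketched below. This is only a sketch: the option values are
copied from the uapi hunk in this patch, the helper name pti_request()
is hypothetical, and on a kernel without this patch the same numbers may
be unassigned or mean something else, so the call should be expected to
fail there.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Assumption: values taken from this patch; not in mainline headers. */
#ifndef ARCH_DISABLE_PTI_NOW
#define ARCH_DISABLE_PTI_NOW  0x1021
#endif
#ifndef ARCH_DISABLE_PTI_NEXT
#define ARCH_DISABLE_PTI_NEXT 0x1022
#endif

/*
 * Hypothetical helper: issue the arch_prctl() for the given option.
 * A non-zero 'enable' asks to set the flag, zero asks to clear it.
 * Returns 0 on success or a negative errno value on failure.
 */
static int pti_request(int option, unsigned long enable)
{
#ifdef SYS_arch_prctl
	if (syscall(SYS_arch_prctl, option, enable) == -1)
		return -errno;
	return 0;
#else
	/* arch_prctl() is x86-specific; report it as unsupported elsewhere. */
	(void)option;
	(void)enable;
	return -ENOSYS;
#endif
}
```

A wrapper would call pti_request(ARCH_DISABLE_PTI_NEXT, 1) and then
execve() the target program. With this patch, -EPERM means that the
caller lacks CAP_SYS_RAWIO, that pti_adjust is lower than 1, or that
the task has already created threads.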

Signed-off-by: Willy Tarreau <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Kees Cook <[email protected]>

v3:
  - depend on CONFIG_PER_PROCESS_PTI
  - switch back to task flags
  - use one task flag for the immediate task (config-based setting) and
    one task flag for the task resulting from the next execve (wrapper-based
    setting)
  - check the pti_adjust sysctl

v2:
  - use {set,clear}_thread_flag() as recommended by Peter
  - use task->mm->context.pti_disable instead of task flag
  - check for mm_users == 1
  - check for CAP_SYS_RAWIO only when setting, not clearing
  - make the code depend on CONFIG_PAGE_TABLE_ISOLATION
---
 arch/x86/include/uapi/asm/prctl.h |  3 +++
 arch/x86/kernel/process_64.c      | 30 ++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 5a6aac9..1564f98 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -10,6 +10,9 @@
 #define ARCH_GET_CPUID         0x1011
 #define ARCH_SET_CPUID         0x1012
 
+#define ARCH_DISABLE_PTI_NOW   0x1021
+#define ARCH_DISABLE_PTI_NEXT  0x1022
+
 #define ARCH_MAP_VDSO_X32      0x2001
 #define ARCH_MAP_VDSO_32       0x2002
 #define ARCH_MAP_VDSO_64       0x2003
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index c754662..b4de8aa 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -654,7 +654,37 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
                ret = put_user(base, (unsigned long __user *)arg2);
                break;
        }
+#ifdef CONFIG_PER_PROCESS_PTI
+       case ARCH_DISABLE_PTI_NOW:
+               if (!task->mm || atomic_read(&task->mm->mm_users) > 1)
+                       return -EPERM;
+
+               if (arg2 && (!capable(CAP_SYS_RAWIO) || pti_adjust < 1))
+                       return -EPERM;
+
+               if (doit) {
+                       if (arg2)
+                               set_thread_flag(TIF_DISABLE_PTI_NOW);
+                       else
+                               clear_thread_flag(TIF_DISABLE_PTI_NOW);
+               }
+               break;
 
+       case ARCH_DISABLE_PTI_NEXT:
+               if (!task->mm || atomic_read(&task->mm->mm_users) > 1)
+                       return -EPERM;
+
+               if (arg2 && (!capable(CAP_SYS_RAWIO) || pti_adjust < 1))
+                       return -EPERM;
+
+               if (doit) {
+                       if (arg2)
+                               set_thread_flag(TIF_DISABLE_PTI_NEXT);
+                       else
+                               clear_thread_flag(TIF_DISABLE_PTI_NEXT);
+               }
+               break;
+#endif
 #ifdef CONFIG_CHECKPOINT_RESTORE
 # ifdef CONFIG_X86_X32_ABI
        case ARCH_MAP_VDSO_X32:
-- 
1.7.12.1
