[tip:x86/fpu] x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK
Commit-ID:  3c1d32300920a446c67d697cd6b80f012ad06028
Gitweb:     http://git.kernel.org/tip/3c1d32300920a446c67d697cd6b80f012ad06028
Author:     Qiaowei Ren
AuthorDate: Sun, 7 Jun 2015 11:37:02 -0700
Committer:  Ingo Molnar
CommitDate: Tue, 9 Jun 2015 12:24:30 +0200

x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK

MPX_BNDCFG_ADDR_MASK is defined twice, so this patch removes the redundant definition.

Signed-off-by: Qiaowei Ren
Signed-off-by: Dave Hansen
Reviewed-by: Thomas Gleixner
Cc: Andrew Morton
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20150607183702.5f129...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar
---
 arch/x86/include/asm/mpx.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 0cdd16a..871e5e5 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -45,7 +45,6 @@
 #define MPX_BNDSTA_TAIL 2
 #define MPX_BNDCFG_TAIL 12
 #define MPX_BNDSTA_ADDR_MASK (~((1UL<
[tip:x86/mpx] x86, mpx: Add documentation on Intel MPX
Commit-ID: 5776563648f6437ede91c91cbad85862ca682b0b Gitweb: http://git.kernel.org/tip/5776563648f6437ede91c91cbad85862ca682b0b Author: Qiaowei Ren AuthorDate: Fri, 14 Nov 2014 07:18:32 -0800 Committer: Thomas Gleixner CommitDate: Tue, 18 Nov 2014 00:58:54 +0100 x86, mpx: Add documentation on Intel MPX This patch adds the Documentation/x86/intel_mpx.txt file with some information about Intel MPX. Signed-off-by: Qiaowei Ren Signed-off-by: Dave Hansen Cc: linux...@kvack.org Cc: linux-m...@linux-mips.org Cc: Dave Hansen Link: http://lkml.kernel.org/r/20141114151832.7fdb1...@viggo.jf.intel.com Signed-off-by: Thomas Gleixner --- Documentation/x86/intel_mpx.txt | 234 1 file changed, 234 insertions(+) diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt new file mode 100644 index 000..4472ed2 --- /dev/null +++ b/Documentation/x86/intel_mpx.txt @@ -0,0 +1,234 @@ +1. Intel(R) MPX Overview + + +Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability +introduced into Intel Architecture. Intel MPX provides hardware features +that can be used in conjunction with compiler changes to check memory +references, for those references whose compile-time normal intentions are +usurped at runtime due to buffer overflow or underflow. + +For more information, please refer to Intel(R) Architecture Instruction +Set Extensions Programming Reference, Chapter 9: Intel(R) Memory Protection +Extensions. + +Note: Currently no hardware with MPX ISA is available but it is always +possible to use SDE (Intel(R) Software Development Emulator) instead, which +can be downloaded from +http://software.intel.com/en-us/articles/intel-software-development-emulator + + +2. How to get the advantage of MPX +== + +For MPX to work, changes are required in the kernel, binutils and compiler. +No source changes are required for applications, just a recompile. + +There are a lot of moving parts of this to all work right. 
The following
+is how we expect the compiler, application and kernel to work together.
+
+1) Application developer compiles with -fmpx. The compiler will add the
+   instrumentation as well as some setup code called early after the app
+   starts. New instruction prefixes are noops for old CPUs.
+2) That setup code allocates (virtual) space for the "bounds directory",
+   points the "bndcfgu" register to the directory and notifies the kernel
+   (via the new prctl(PR_MPX_ENABLE_MANAGEMENT)) that the app will be using
+   MPX.
+3) The kernel detects that the CPU has MPX, allows the new prctl() to
+   succeed, and notes the location of the bounds directory. Userspace is
+   expected to keep the bounds directory at that location. We note it
+   instead of reading it each time because the 'xsave' operation needed
+   to access the bounds directory register is an expensive operation.
+4) If the application needs to spill bounds out of the 4 registers, it
+   issues a bndstx instruction. Since the bounds directory is empty at
+   this point, a bounds fault (#BR) is raised, the kernel allocates a
+   bounds table (in the user address space) and makes the relevant entry
+   in the bounds directory point to the new table.
+5) If the application violates the bounds specified in the bounds registers,
+   a separate kind of #BR is raised which will deliver a signal with
+   information about the violation in the 'struct siginfo'.
+6) Whenever memory is freed, we know that it can no longer contain valid
+   pointers, and we attempt to free the associated space in the bounds
+   tables. If an entire table becomes unused, we will attempt to free
+   the table and remove the entry in the directory.
+ +To summarize, there are essentially three things interacting here: + +GCC with -fmpx: + * enables annotation of code with MPX instructions and prefixes + * inserts code early in the application to call in to the "gcc runtime" +GCC MPX Runtime: + * Checks for hardware MPX support in cpuid leaf + * allocates virtual space for the bounds directory (malloc() essentially) + * points the hardware BNDCFGU register at the directory + * calls a new prctl(PR_MPX_ENABLE_MANAGEMENT) to notify the kernel to + start managing the bounds directories +Kernel MPX Code: + * Checks for hardware MPX support in cpuid leaf + * Handles #BR exceptions and sends SIGSEGV to the app when it violates + bounds, like during a buffer overflow. + * When bounds are spilled in to an unallocated bounds table, the kernel + notices in the #BR exception, allocates the virtual space, then + updates the bounds directory to point to the new table. It keeps + special track of the memory with a VM_MPX flag. + * Frees unused bounds tables at the time that the memory they described + is unmapped. + + +3. How does MPX kernel code work +===
[tip:x86/mpx] x86, mpx: Add MPX-specific mmap interface
Commit-ID:  57319d80e1d328e34cb24868a4f4405661485e30
Gitweb:     http://git.kernel.org/tip/57319d80e1d328e34cb24868a4f4405661485e30
Author:     Qiaowei Ren
AuthorDate: Fri, 14 Nov 2014 07:18:27 -0800
Committer:  Thomas Gleixner
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

x86, mpx: Add MPX-specific mmap interface

We have chosen to perform the allocation of bounds tables in the kernel (see the patch "on-demand kernel allocation of bounds tables") and to mark these VMAs with VM_MPX.

However, there is currently no suitable interface to actually do this. Existing interfaces, like do_mmap_pgoff(), have no way to set a modified ->vm_ops or ->vm_flags and don't hold mmap_sem long enough to let a caller do it.

This patch wraps mmap_region() and holds mmap_sem long enough to make the modifications to the VMA which we need.

Also note the 32/64-bit #ifdef in the header. We actually need to do this at runtime eventually. But, for now, we don't support running 32-bit binaries on 64-bit kernels. Support for this will come in later patches.

Signed-off-by: Qiaowei Ren Signed-off-by: Dave Hansen Cc: linux...@kvack.org Cc: linux-m...@linux-mips.org Cc: Dave Hansen Link: http://lkml.kernel.org/r/20141114151827.ce440...@viggo.jf.intel.com Signed-off-by: Thomas Gleixner --- arch/x86/Kconfig | 4 +++ arch/x86/include/asm/mpx.h | 36 +++ arch/x86/mm/Makefile | 2 ++ arch/x86/mm/mpx.c | 86 ++ 4 files changed, 128 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index ded8a67..967dfe0 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -248,6 +248,10 @@ config HAVE_INTEL_TXT def_bool y depends on INTEL_IOMMU && ACPI +config X86_INTEL_MPX + def_bool y + depends on CPU_SUP_INTEL + config X86_32_SMP def_bool y depends on X86_32 && SMP diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h new file mode 100644 index 000..7d7c5f5 --- /dev/null +++ b/arch/x86/include/asm/mpx.h @@ -0,0 +1,36 @@ +#ifndef _ASM_X86_MPX_H +#define _ASM_X86_MPX_H + +#include +#include + +#ifdef CONFIG_X86_64 + +/* upper 28 bits [47:20] of the virtual address in 64-bit used to + * index into bounds directory (BD). + */ +#define MPX_BD_ENTRY_OFFSET28 +#define MPX_BD_ENTRY_SHIFT 3 +/* bits [19:3] of the virtual address in 64-bit used to index into + * bounds table (BT). 
+ */ +#define MPX_BT_ENTRY_OFFSET17 +#define MPX_BT_ENTRY_SHIFT 5 +#define MPX_IGN_BITS 3 + +#else + +#define MPX_BD_ENTRY_OFFSET20 +#define MPX_BD_ENTRY_SHIFT 2 +#define MPX_BT_ENTRY_OFFSET10 +#define MPX_BT_ENTRY_SHIFT 4 +#define MPX_IGN_BITS 2 + +#endif + +#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) +#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) + +#define MPX_BNDSTA_ERROR_CODE 0x3 + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 6a19ad9..ecfdc46 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA) += srat.o obj-$(CONFIG_NUMA_EMU) += numa_emulation.o obj-$(CONFIG_MEMTEST) += memtest.o + +obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c new file mode 100644 index 000..72d13b0 --- /dev/null +++ b/arch/x86/mm/mpx.c @@ -0,0 +1,86 @@ +/* + * mpx.c - Memory Protection eXtensions + * + * Copyright (c) 2014, Intel Corporation. + * Qiaowei Ren + * Dave Hansen + */ +#include +#include +#include + +#include +#include + +static const char *mpx_mapping_name(struct vm_area_struct *vma) +{ + return "[mpx]"; +} + +static struct vm_operations_struct mpx_vma_ops = { + .name = mpx_mapping_name, +}; + +/* + * This is really a simplified "vm_mmap". it only handles MPX + * bounds tables (the bounds directory is user-allocated). + * + * Later on, we use the vma->vm_ops to uniquely identify these + * VMAs. + */ +static unsigned long mpx_mmap(unsigned long len) +{ + unsigned long ret; + unsigned long addr, pgoff; + struct mm_struct *mm = current->mm; + vm_flags_t vm_flags; + struct vm_area_struct *vma; + + /* Only bounds table and bounds directory can be allocated here */ + if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES) + return -EINVAL; + + down_write(&mm->mmap_sem); + + /* Too many mappings? 
*/ + if (mm->map_count > sysctl_max_map_count) { + ret = -ENOMEM; + goto out; + } + + /* Obtain the address to map to. we verify (or select) it and ensure +* that it represents a valid section of the address space. +*/ + addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS
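The header comments in this patch say that, on 64-bit, bits [47:20] of a virtual address index the bounds directory and bits [19:3] index a bounds table. A minimal user-space sketch of that split follows. It reuses the 64-bit constants from the patch; the helper names bd_index() and bt_index() are hypothetical and are not functions from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit constants as given in the patch's asm/mpx.h. */
#define MPX_BD_ENTRY_OFFSET 28  /* address bits [47:20] select a BD entry */
#define MPX_BT_ENTRY_OFFSET 17  /* address bits [19:3] select a BT entry */
#define MPX_IGN_BITS        3   /* low address bits that are ignored */

/* Hypothetical helper: bounds-directory index for a virtual address. */
static inline uint64_t bd_index(uint64_t addr)
{
    /* This sketch only models 48-bit virtual addresses. */
    assert(addr < (UINT64_C(1) << 48));
    return (addr >> (MPX_BT_ENTRY_OFFSET + MPX_IGN_BITS)) &
           ((UINT64_C(1) << MPX_BD_ENTRY_OFFSET) - 1);
}

/* Hypothetical helper: bounds-table index for a virtual address. */
static inline uint64_t bt_index(uint64_t addr)
{
    assert(addr < (UINT64_C(1) << 48));
    return (addr >> MPX_IGN_BITS) &
           ((UINT64_C(1) << MPX_BT_ENTRY_OFFSET) - 1);
}
```

Under these assumptions, two addresses that differ only in their low three bits map to the same bounds-table entry, and each bounds-directory entry covers 1MB of address space.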
[tip:x86/mpx] x86, mpx: Introduce VM_MPX to indicate that a VMA is MPX specific
Commit-ID: 4aae7e436fa51faf4bf5d11b175aea82cfe8224a Gitweb: http://git.kernel.org/tip/4aae7e436fa51faf4bf5d11b175aea82cfe8224a Author: Qiaowei Ren AuthorDate: Fri, 14 Nov 2014 07:18:25 -0800 Committer: Thomas Gleixner CommitDate: Tue, 18 Nov 2014 00:58:53 +0100 x86, mpx: Introduce VM_MPX to indicate that a VMA is MPX specific MPX-enabled applications using large swaths of memory can potentially have large numbers of bounds tables in process address space to save bounds information. These tables can take up huge swaths of memory (as much as 80% of the memory on the system) even if we clean them up aggressively. In the worst-case scenario, the tables can be 4x the size of the data structure being tracked. IOW, a 1-page structure can require 4 bounds-table pages. Being this huge, our expectation is that folks using MPX are going to be keen on figuring out how much memory is being dedicated to it. So we need a way to track memory use for MPX. If we want to specifically track MPX VMAs we need to be able to distinguish them from normal VMAs, and keep them from getting merged with normal VMAs. A new VM_ flag set only on MPX VMAs does both of those things. With this flag, MPX bounds-table VMAs can be distinguished from other VMAs, and userspace can also walk /proc/$pid/smaps to get memory usage for MPX. In addition to this flag, we also introduce a special ->vm_ops specific to MPX VMAs (see the patch "add MPX specific mmap interface"), but currently different ->vm_ops do not by themselves prevent VMA merging, so we still need this flag. We understand that VM_ flags are scarce and are open to other options. 
Signed-off-by: Qiaowei Ren Signed-off-by: Dave Hansen Cc: linux...@kvack.org Cc: linux-m...@linux-mips.org Cc: Dave Hansen Link: http://lkml.kernel.org/r/20141114151825.56562...@viggo.jf.intel.com Signed-off-by: Thomas Gleixner --- fs/proc/task_mmu.c | 3 +++ include/linux/mm.h | 6 ++ 2 files changed, 9 insertions(+) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 4e0388c..f6734c6 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -552,6 +552,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_GROWSDOWN)] = "gd", [ilog2(VM_PFNMAP)] = "pf", [ilog2(VM_DENYWRITE)] = "dw", +#ifdef CONFIG_X86_INTEL_MPX + [ilog2(VM_MPX)] = "mp", +#endif [ilog2(VM_LOCKED)] = "lo", [ilog2(VM_IO)] = "io", [ilog2(VM_SEQ_READ)]= "sr", diff --git a/include/linux/mm.h b/include/linux/mm.h index b464611..f7606d3 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -128,6 +128,7 @@ extern unsigned int kobjsize(const void *objp); #define VM_HUGETLB 0x0040 /* Huge TLB Page VM */ #define VM_NONLINEAR 0x0080 /* Is non-linear (remap_file_pages) */ #define VM_ARCH_1 0x0100 /* Architecture-specific flag */ +#define VM_ARCH_2 0x0200 #define VM_DONTDUMP0x0400 /* Do not include in the core dump */ #ifdef CONFIG_MEM_SOFT_DIRTY @@ -155,6 +156,11 @@ extern unsigned int kobjsize(const void *objp); # define VM_MAPPED_COPYVM_ARCH_1 /* T if mapped copy of data (nommu mmap) */ #endif +#if defined(CONFIG_X86) +/* MPX specific bounds table or bounds directory */ +# define VM_MPXVM_ARCH_2 +#endif + #ifndef VM_GROWSUP # define VM_GROWSUPVM_NONE #endif -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[tip:x86/mpx] mips: Sync struct siginfo with general version
Commit-ID:  232b5fff5bad78ad00b94153fa90ca53bef6a444
Gitweb:     http://git.kernel.org/tip/232b5fff5bad78ad00b94153fa90ca53bef6a444
Author:     Qiaowei Ren
AuthorDate: Fri, 14 Nov 2014 07:18:20 -0800
Committer:  Thomas Gleixner
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

mips: Sync struct siginfo with general version

New fields about bound violation are added into general struct siginfo. This will impact MIPS and IA64, which extend general struct siginfo. This patch syncs this struct for MIPS with general version.

Signed-off-by: Qiaowei Ren
Signed-off-by: Dave Hansen
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen
Link: http://lkml.kernel.org/r/20141114151820.f7edc...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner
---
 arch/mips/include/uapi/asm/siginfo.h | 4
 1 file changed, 4 insertions(+)

diff --git a/arch/mips/include/uapi/asm/siginfo.h b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
             int _trapno;    /* TRAP # which caused the signal */
 #endif
             short _addr_lsb;
+            struct {
+                void __user *_lower;
+                void __user *_upper;
+            } _addr_bnd;
         } _sigfault;
 
         /* SIGPOLL, SIGXFSZ (To do ...) */
[tip:x86/mpx] ia64: Sync struct siginfo with general version
Commit-ID:  53f037b08b5bebf47aa2b574a984e2f9fc7926f2
Gitweb:     http://git.kernel.org/tip/53f037b08b5bebf47aa2b574a984e2f9fc7926f2
Author:     Qiaowei Ren
AuthorDate: Fri, 14 Nov 2014 07:18:22 -0800
Committer:  Thomas Gleixner
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

ia64: Sync struct siginfo with general version

New fields about bound violation are added into general struct siginfo. This will impact MIPS and IA64, which extend general struct siginfo. This patch syncs this struct for IA64 with general version.

Signed-off-by: Qiaowei Ren
Signed-off-by: Dave Hansen
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen
Link: http://lkml.kernel.org/r/20141114151822.82b3b...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner
---
 arch/ia64/include/uapi/asm/siginfo.h | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/siginfo.h b/arch/ia64/include/uapi/asm/siginfo.h
index 4ea6225..bce9bc1 100644
--- a/arch/ia64/include/uapi/asm/siginfo.h
+++ b/arch/ia64/include/uapi/asm/siginfo.h
@@ -63,6 +63,10 @@ typedef struct siginfo {
             unsigned int _flags;    /* see below */
             unsigned long _isr;     /* isr */
             short _addr_lsb;        /* lsb of faulting address */
+            struct {
+                void __user *_lower;
+                void __user *_upper;
+            } _addr_bnd;
         } _sigfault;
 
         /* SIGPOLL */
@@ -110,9 +114,9 @@ typedef struct siginfo {
 /*
  * SIGSEGV si_codes
  */
-#define __SEGV_PSTKOVF (__SI_FAULT|3) /* paragraph stack overflow */
+#define __SEGV_PSTKOVF (__SI_FAULT|4) /* paragraph stack overflow */
 #undef NSIGSEGV
-#define NSIGSEGV 3
+#define NSIGSEGV 4
 
 #undef NSIGTRAP
 #define NSIGTRAP 4
[tip:x86/mpx] mpx: Extend siginfo structure to include bound violation information
Commit-ID:  ee1b58d36aa1b5a79eaba11f5c3633c88231da83
Gitweb:     http://git.kernel.org/tip/ee1b58d36aa1b5a79eaba11f5c3633c88231da83
Author:     Qiaowei Ren
AuthorDate: Fri, 14 Nov 2014 07:18:19 -0800
Committer:  Thomas Gleixner
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

mpx: Extend siginfo structure to include bound violation information

This patch adds new fields about bound violation into siginfo structure. si_lower and si_upper are respectively lower bound and upper bound when bound violation is caused.

Signed-off-by: Qiaowei Ren
Signed-off-by: Dave Hansen
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen
Link: http://lkml.kernel.org/r/20141114151819.1908c...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner
---
 include/uapi/asm-generic/siginfo.h | 9 -
 kernel/signal.c                    | 4
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
             int _trapno;    /* TRAP # which caused the signal */
 #endif
             short _addr_lsb; /* LSB of the reported address */
+            struct {
+                void __user *_lower;
+                void __user *_upper;
+            } _addr_bnd;
         } _sigfault;
 
         /* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb _sifields._sigfault._addr_lsb
+#define si_lower _sifields._sigfault._addr_bnd._lower
+#define si_upper _sifields._sigfault._addr_bnd._upper
 #define si_band _sifields._sigpoll._band
 #define si_fd _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR (__SI_FAULT|1) /* address not mapped to object */
 #define SEGV_ACCERR (__SI_FAULT|2) /* invalid permissions for mapped object */
-#define NSIGSEGV 2
+#define SEGV_BNDERR (__SI_FAULT|3) /* failed address bound checks */
+#define NSIGSEGV 3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from)
         if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
             err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+        err |= __put_user(from->si_lower, &to->si_lower);
+        err |= __put_user(from->si_upper, &to->si_upper);
+#endif
         break;
     case __SI_CHLD:
         err |= __put_user(from->si_pid, &to->si_pid);
[PATCH v9 06/12] mpx: extend siginfo structure to include bound violation information
This patch adds new fields about bound violation into siginfo structure. si_lower and si_upper are respectively lower bound and upper bound when bound violation is caused. Signed-off-by: Qiaowei Ren --- include/uapi/asm-generic/siginfo.h |9 - kernel/signal.c|4 2 files changed, 12 insertions(+), 1 deletions(-) diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h index ba5be7f..1e35520 100644 --- a/include/uapi/asm-generic/siginfo.h +++ b/include/uapi/asm-generic/siginfo.h @@ -91,6 +91,10 @@ typedef struct siginfo { int _trapno;/* TRAP # which caused the signal */ #endif short _addr_lsb; /* LSB of the reported address */ + struct { + void __user *_lower; + void __user *_upper; + } _addr_bnd; } _sigfault; /* SIGPOLL */ @@ -131,6 +135,8 @@ typedef struct siginfo { #define si_trapno _sifields._sigfault._trapno #endif #define si_addr_lsb_sifields._sigfault._addr_lsb +#define si_lower _sifields._sigfault._addr_bnd._lower +#define si_upper _sifields._sigfault._addr_bnd._upper #define si_band_sifields._sigpoll._band #define si_fd _sifields._sigpoll._fd #ifdef __ARCH_SIGSYS @@ -199,7 +205,8 @@ typedef struct siginfo { */ #define SEGV_MAPERR(__SI_FAULT|1) /* address not mapped to object */ #define SEGV_ACCERR(__SI_FAULT|2) /* invalid permissions for mapped object */ -#define NSIGSEGV 2 +#define SEGV_BNDERR(__SI_FAULT|3) /* failed address bound checks */ +#define NSIGSEGV 3 /* * SIGBUS si_codes diff --git a/kernel/signal.c b/kernel/signal.c index 8f0876f..2c403a4 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from) if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO) err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb); #endif +#ifdef SEGV_BNDERR + err |= __put_user(from->si_lower, &to->si_lower); + err |= __put_user(from->si_upper, &to->si_upper); +#endif break; case __SI_CHLD: err |= __put_user(from->si_pid, &to->si_pid); -- 1.7.1 -- 
[PATCH v9 04/12] x86, mpx: add MPX to disabled features
This allows us to use cpu_feature_enabled(X86_FEATURE_MPX) as both a runtime and compile-time check.

When CONFIG_X86_INTEL_MPX is disabled, cpu_feature_enabled(X86_FEATURE_MPX) will evaluate at compile-time to 0. If CONFIG_X86_INTEL_MPX=y, then the cpuid flag will be checked at runtime.

This patch must be applied after another commit from Dave: 381aa07a9b4e1f82969203e9e4863da2a157781d

Signed-off-by: Dave Hansen
Signed-off-by: Qiaowei Ren
---
 arch/x86/include/asm/disabled-features.h |    8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 97534a7..f226df0 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -10,6 +10,12 @@
  * cpu_feature_enabled().
  */
 
+#ifdef CONFIG_X86_INTEL_MPX
+# define DISABLE_MPX 0
+#else
+# define DISABLE_MPX (1<<(X86_FEATURE_MPX & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME (1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR (1<<(X86_FEATURE_K6_MTRR & 31))
@@ -34,6 +40,6 @@
 #define DISABLED_MASK6 0
 #define DISABLED_MASK7 0
 #define DISABLED_MASK8 0
-#define DISABLED_MASK9 0
+#define DISABLED_MASK9 (DISABLE_MPX)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
-- 
1.7.1
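To make the mask arithmetic concrete, here is a simplified user-space model of the compile-time half of the check. It is only a sketch: MODEL_CONFIG_X86_INTEL_MPX and feature_compiled_out() are hypothetical stand-ins for Kconfig and for the real cpu_feature_enabled() machinery, and the value of X86_FEATURE_MPX (word 9, bit 14) is taken from the kernel's cpufeature definitions rather than from this patch.

```c
#include <assert.h>

/* Bit 14 of capability word 9 (CPUID leaf 7, EBX). */
#define X86_FEATURE_MPX (9 * 32 + 14)

/* Hypothetical stand-in for Kconfig: set to 1 to model CONFIG_X86_INTEL_MPX=y. */
#ifndef MODEL_CONFIG_X86_INTEL_MPX
#define MODEL_CONFIG_X86_INTEL_MPX 0
#endif

#if MODEL_CONFIG_X86_INTEL_MPX
# define DISABLE_MPX 0
#else
# define DISABLE_MPX (1 << (X86_FEATURE_MPX & 31))
#endif
#define DISABLED_MASK9 (DISABLE_MPX)

/*
 * Simplified model: returns 1 if the feature bit is set in the
 * disabled mask for word 9, i.e. the feature is compiled out.
 */
static inline int feature_compiled_out(int feature)
{
    assert(feature / 32 == 9);  /* this sketch only models word 9 */
    return (DISABLED_MASK9 & (1u << (feature & 31))) != 0;
}
```

With the model config left at 0, DISABLE_MPX is 0x4000 and the MPX bit is reported as compiled out, while other bits in word 9 are not affected.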
[PATCH v9 07/12] mips: sync struct siginfo with general version
New fields about bound violation are added into general struct siginfo. This will impact MIPS and IA64, which extend general struct siginfo. This patch syncs this struct for MIPS with general version.

Signed-off-by: Qiaowei Ren
---
 arch/mips/include/uapi/asm/siginfo.h |    4
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
             int _trapno;    /* TRAP # which caused the signal */
 #endif
             short _addr_lsb;
+            struct {
+                void __user *_lower;
+                void __user *_upper;
+            } _addr_bnd;
         } _sigfault;
 
         /* SIGPOLL, SIGXFSZ (To do ...) */
-- 
1.7.1
[PATCH v9 05/12] x86, mpx: on-demand kernel allocation of bounds tables
MPX only has 4 hardware registers for storing bounds information. If MPX-enabled code needs more than these 4 registers, it needs to spill them somewhere. It has two special instructions for this which allow the bounds to be moved between the bounds registers and some new "bounds tables".

#BR exceptions are similar conceptually to a page fault and will be raised by the MPX hardware both during bounds violations and when the tables are not present. This patch handles those #BR exceptions for not-present tables by carving the space out of the normal process's address space (essentially calling the new mmap() interface introduced earlier in this patch set) and then pointing the bounds directory over to it.

The tables *need* to be accessed and controlled by userspace because the instructions for moving bounds in and out of them are extremely frequent. They potentially happen every time a register pointing to memory is dereferenced. Any direct kernel involvement (like a syscall) to access the tables would obviously destroy performance.

Why not do this in userspace?

This patch is obviously doing this allocation in the kernel. However, MPX does not strictly *require* anything in the kernel. It can theoretically be done completely from userspace. Here are a few ways this *could* be done. I don't think any of them are practical in the real world, but here they are.

Q: Can virtual space simply be reserved for the bounds tables so that we never have to allocate them?
A: As noted earlier, these tables are *HUGE*. An X-GB virtual area needs 4*X GB of virtual space, plus 2GB for the bounds directory. If we were to preallocate them for the 128TB of user virtual address space, we would need to reserve 512TB+2GB, which is larger than the entire virtual address space today. This means they can not be reserved ahead of time. Also, a single process's pre-populated bounds directory consumes 2GB of virtual *AND* physical memory. IOW, it's completely infeasible to prepopulate bounds directories.

Q: Can we preallocate bounds table space at the same time memory is allocated which might contain pointers that might eventually need bounds tables? A: This would work if we could hook the site of each and every memory allocation syscall. This can be done for small, constrained applications. But, it isn't practical at a larger scale since a given app has no way of controlling how all the parts of the app might allocate memory (think libraries). The kernel is really the only place to intercept these calls. Q: Could a bounds fault be handed to userspace and the tables allocated there in a signal handler instead of in the kernel? A: (thanks to tglx) mmap() is not on the list of safe async handler functions and even if mmap() would work it still requires locking or nasty tricks to keep track of the allocation state there. Having ruled out all of the userspace-only approaches for managing bounds tables that we could think of, we create them on demand in the kernel. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 20 + arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 101 arch/x86/kernel/traps.c| 52 ++- 4 files changed, 173 insertions(+), 1 deletions(-) create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 5725ac4..b7598ac 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -18,6 +18,8 @@ #define MPX_BT_ENTRY_SHIFT 5 #define MPX_IGN_BITS 3 +#define MPX_BD_ENTRY_TAIL 3 + #else #define MPX_BD_ENTRY_OFFSET20 @@ -26,13 +28,31 @@ #define MPX_BT_ENTRY_SHIFT 4 #define MPX_IGN_BITS 2 +#define MPX_BD_ENTRY_TAIL 2 + #endif +#define MPX_BNDSTA_TAIL2 +#define MPX_BNDCFG_TAIL12 +#define MPX_BNDSTA_ADDR_MASK (~((1UL< + * Dave Hansen + */ + +#include +#include +#include + +/* + * With 32-bit mode, MPX_BT_SIZE_BYTES is 4MB, and the size of each + * bounds table is 16KB. With 64-bit mode, MPX_BT_SIZE_BYTES is 2GB, + * and the size of each bounds table is 4MB. 
+ */ +static int allocate_bt(long __user *bd_entry) +{ + unsigned long bt_addr; + unsigned long expected_old_val = 0; + unsigned long actual_old_val = 0; + int ret = 0; + + /* +* Carve the virtual space out of userspace for the new +* bounds table: +*/ + bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES); + if (IS_ERR((void *)bt_addr)) + return PTR_ERR((void *)bt_addr); + /* +* Set the valid flag (kinda like _PAGE_PRESENT in a pte) +*/ + bt_addr = bt_addr | MPX_BD_ENTRY_VALID_FLAG; + + /* +* Go poke the address of the new bounds table in to the +* bounds directory entry out in userspace memory.
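The sizes quoted in this message can be checked against the header constants. The sketch below recomputes them in user space; mpx_size_bytes() and worst_case_reservation() are hypothetical helpers, not kernel functions. Under the constants in asm/mpx.h, the 4MB (32-bit) and 2GB (64-bit) figures fall out of the bounds-directory offsets and shifts, while 16KB and 4MB are the per-table sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring 1UL << (ENTRY_OFFSET + ENTRY_SHIFT). */
static inline uint64_t mpx_size_bytes(unsigned entry_offset, unsigned entry_shift)
{
    assert(entry_offset + entry_shift < 64);
    return UINT64_C(1) << (entry_offset + entry_shift);
}

/*
 * Hypothetical helper: virtual space needed to pre-reserve bounds tables
 * for data_bytes of pointer-filled memory on 64-bit, using the 4x
 * worst-case ratio from the message plus one bounds directory.
 */
static inline uint64_t worst_case_reservation(uint64_t data_bytes)
{
    return 4 * data_bytes + mpx_size_bytes(28, 3);
}
```

For the 128TB user address space mentioned above, worst_case_reservation() yields 512TB plus 2GB, matching the figure in the first Q&A.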
[PATCH v9 03/12] x86, mpx: add MPX specific mmap interface
We have to do the allocation of bounds tables in the kernel (see the patch "on-demand kernel allocation of bounds tables"). Moreover, if we want to track MPX VMAs we need to be able to set the new VM_MPX flag and MPX-specific vm_ops in the vm_area_struct.

But the current kernel has no suitable interface for this. Existing interfaces, like do_mmap_pgoff(), cannot set a specific ->vm_ops in the vm_area_struct when a VMA is created. So, this patch adds an MPX-specific mmap interface to do the allocation of bounds tables.

Signed-off-by: Qiaowei Ren
---
 arch/x86/Kconfig           |    4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile       |    2 +
 arch/x86/mm/mpx.c          |   79
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4b663e1..e5bcc70 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -243,6 +243,10 @@ config HAVE_INTEL_TXT
     def_bool y
     depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+    def_bool y
+    depends on CPU_SUP_INTEL
+
 config X86_32_SMP
     def_bool y
     depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include
+#include
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET 28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */ +#define MPX_BT_ENTRY_OFFSET17 +#define MPX_BT_ENTRY_SHIFT 5 +#define MPX_IGN_BITS 3 + +#else + +#define MPX_BD_ENTRY_OFFSET20 +#define MPX_BD_ENTRY_SHIFT 2 +#define MPX_BT_ENTRY_OFFSET10 +#define MPX_BT_ENTRY_SHIFT 4 +#define MPX_IGN_BITS 2 + +#endif + +#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) +#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) + +#define MPX_BNDSTA_ERROR_CODE 0x3 + +unsigned long mpx_mmap(unsigned long len); + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 6a19ad9..ecfdc46 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA) += srat.o obj-$(CONFIG_NUMA_EMU) += numa_emulation.o obj-$(CONFIG_MEMTEST) += memtest.o + +obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c new file mode 100644 index 000..e1b28e6 --- /dev/null +++ b/arch/x86/mm/mpx.c @@ -0,0 +1,79 @@ +#include +#include +#include +#include +#include + +static const char *mpx_mapping_name(struct vm_area_struct *vma) +{ + return "[mpx]"; +} + +static struct vm_operations_struct mpx_vma_ops = { + .name = mpx_mapping_name, +}; + +/* + * this is really a simplified "vm_mmap". it only handles mpx + * related maps, including bounds table and bounds directory. + * + * here we can stick new vm_flag VM_MPX in the vma_area_struct + * when create a bounds table or bounds directory, in order to + * track MPX specific memory. + */ +unsigned long mpx_mmap(unsigned long len) +{ + unsigned long ret; + unsigned long addr, pgoff; + struct mm_struct *mm = current->mm; + vm_flags_t vm_flags; + struct vm_area_struct *vma; + + /* Only bounds table and bounds directory can be allocated here */ + if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES) + return -EINVAL; + + down_write(&mm->mmap_sem); + + /* Too many mappings? 
*/ + if (mm->map_count > sysctl_max_map_count) { + ret = -ENOMEM; + goto out; + } + + /* Obtain the address to map to. we verify (or select) it and ensure +* that it represents a valid section of the address space. +*/ + addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE); + if (addr & ~PAGE_MASK) { + ret = addr; + goto out; + } + + vm_flags = VM_READ | VM_WRITE | VM_MPX | + mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; + + /* Set pgoff according to addr for anon_vma */ + pgoff = addr >> PAGE_SHIFT; + + ret = mmap_region(NULL, addr, len, vm_flags, pgoff); + if (IS_ERR_VALUE(ret)) + goto out; + + vma = find_vma(mm, ret); + if (!vma) { + ret = -ENOMEM; + goto out; + } + vma->vm_ops = &mpx_vma_ops; + +
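As a worked check of the size macros in the mpx.h hunk above, the following sketch recomputes MPX_BD_SIZE_BYTES and MPX_BT_SIZE_BYTES for both the 64-bit and 32-bit layouts. The struct and function names are hypothetical, chosen only for this illustration. The overhead ratio assumes that one bounds-table entry covers one pointer-sized slot of 1 << MPX_IGN_BITS bytes, which is our reading of the macros rather than something the patch states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout parameters copied from the mpx.h hunk (names are hypothetical). */
struct mpx_layout {
	unsigned bd_entry_offset; /* address bits indexing the bounds directory */
	unsigned bd_entry_shift;  /* log2(size of one directory entry) */
	unsigned bt_entry_offset; /* address bits indexing a bounds table */
	unsigned bt_entry_shift;  /* log2(size of one table entry) */
	unsigned ign_bits;        /* low address bits ignored by the lookup */
};

static const struct mpx_layout mpx_layout_64 = { 28, 3, 17, 5, 3 };
static const struct mpx_layout mpx_layout_32 = { 20, 2, 10, 4, 2 };

/* Mirrors MPX_BD_SIZE_BYTES: 1 << (BD_ENTRY_OFFSET + BD_ENTRY_SHIFT). */
static uint64_t mpx_bd_size_bytes(const struct mpx_layout *l)
{
	assert(l != NULL);
	return UINT64_C(1) << (l->bd_entry_offset + l->bd_entry_shift);
}

/* Mirrors MPX_BT_SIZE_BYTES: 1 << (BT_ENTRY_OFFSET + BT_ENTRY_SHIFT). */
static uint64_t mpx_bt_size_bytes(const struct mpx_layout *l)
{
	assert(l != NULL);
	return UINT64_C(1) << (l->bt_entry_offset + l->bt_entry_shift);
}

/*
 * Bytes of bounds table per byte of pointer storage, ASSUMING one table
 * entry per (1 << ign_bits)-byte pointer slot.
 */
static uint64_t mpx_bt_overhead_ratio(const struct mpx_layout *l)
{
	assert(l != NULL);
	return (UINT64_C(1) << l->bt_entry_shift) /
	       (UINT64_C(1) << l->ign_bits);
}
```

Under these assumptions the 64-bit layout gives a 2 GiB bounds directory and 4 MiB bounds tables, the 32-bit layout gives 4 MiB and 16 KiB, and both layouts give a table-to-data ratio of 4. These are the only two lengths that mpx_mmap() accepts.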
[PATCH v9 01/12] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
MPX-enabled applications using large swaths of memory can potentially have large numbers of bounds tables in process address space to save bounds information. These tables can take up huge swaths of memory (as much as 80% of the memory on the system) even if we clean them up aggressively. In the worst-case scenario, the tables can be 4x the size of the data structure being tracked. IOW, a 1-page structure can require 4 bounds-table pages.

Because the tables are this large, we expect that folks using MPX are going to be keen on figuring out how much memory is being dedicated to it. So we need a way to track memory use for MPX.

If we want to specifically track MPX VMAs we need to be able to distinguish them from normal VMAs, and keep them from getting merged with normal VMAs. A new VM_ flag set only on MPX VMAs does both of those things. With this flag, MPX bounds-table VMAs can be distinguished from other VMAs, and userspace can also walk /proc/$pid/smaps to get memory usage for MPX.

Besides this flag, we also introduce a specific ->vm_ops for MPX VMAs (see the patch "add MPX specific mmap interface"), but currently VMAs with different ->vm_ops are not prevented from merging.

We understand that VM_ flags are scarce and are open to other options.
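To illustrate the /proc/$pid/smaps walk mentioned above, here is a minimal userspace sketch. It relies on the "mp" VmFlags code that this patch adds to show_smap_vma_flags(); the function name mpx_smaps_total_kb() is hypothetical, and the parsing assumes the usual smaps layout in which each entry's "Size:" line precedes its "VmFlags:" line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Sum the "Size:" values (in kB) of every smaps entry whose "VmFlags:"
 * line carries the two-letter code "mp".
 */
static unsigned long mpx_smaps_total_kb(FILE *smaps)
{
	char line[512];
	unsigned long cur_kb = 0, total_kb = 0, kb;

	assert(smaps != NULL);

	while (fgets(line, sizeof(line), smaps)) {
		if (sscanf(line, "Size: %lu kB", &kb) == 1) {
			/* Remember the size of the VMA being described. */
			cur_kb = kb;
		} else if (strncmp(line, "VmFlags:", 8) == 0) {
			/* Flags are space-separated two-letter codes. */
			char *tok = strtok(line + 8, " \n");

			for (; tok; tok = strtok(NULL, " \n")) {
				if (strcmp(tok, "mp") == 0) {
					total_kb += cur_kb;
					break;
				}
			}
			cur_kb = 0;
		}
	}
	return total_kb;
}
```

A caller would typically pass the stream returned by fopen("/proc/self/smaps", "r"), or the smaps file of another process it is allowed to read.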
Signed-off-by: Qiaowei Ren --- fs/proc/task_mmu.c |1 + include/linux/mm.h |6 ++ 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index dfc791c..cc31520 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_GROWSDOWN)] = "gd", [ilog2(VM_PFNMAP)] = "pf", [ilog2(VM_DENYWRITE)] = "dw", + [ilog2(VM_MPX)] = "mp", [ilog2(VM_LOCKED)] = "lo", [ilog2(VM_IO)] = "io", [ilog2(VM_SEQ_READ)]= "sr", diff --git a/include/linux/mm.h b/include/linux/mm.h index 8981cc8..942be8a 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -127,6 +127,7 @@ extern unsigned int kobjsize(const void *objp); #define VM_HUGETLB 0x0040 /* Huge TLB Page VM */ #define VM_NONLINEAR 0x0080 /* Is non-linear (remap_file_pages) */ #define VM_ARCH_1 0x0100 /* Architecture-specific flag */ +#define VM_ARCH_2 0x0200 #define VM_DONTDUMP0x0400 /* Do not include in the core dump */ #ifdef CONFIG_MEM_SOFT_DIRTY @@ -154,6 +155,11 @@ extern unsigned int kobjsize(const void *objp); # define VM_MAPPED_COPYVM_ARCH_1 /* T if mapped copy of data (nommu mmap) */ #endif +#if defined(CONFIG_X86) +/* MPX specific bounds table or bounds directory */ +# define VM_MPXVM_ARCH_2 +#endif + #ifndef VM_GROWSUP # define VM_GROWSUPVM_NONE #endif -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v9 08/12] ia64: sync struct siginfo with general version
New fields about bound violation are added into general struct siginfo. This will impact MIPS and IA64, which extend general struct siginfo. This patch syncs this struct for IA64 with general version.

Signed-off-by: Qiaowei Ren
---
 arch/ia64/include/uapi/asm/siginfo.h |    8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/siginfo.h b/arch/ia64/include/uapi/asm/siginfo.h
index 4ea6225..bce9bc1 100644
--- a/arch/ia64/include/uapi/asm/siginfo.h
+++ b/arch/ia64/include/uapi/asm/siginfo.h
@@ -63,6 +63,10 @@ typedef struct siginfo {
 			unsigned int _flags;	/* see below */
 			unsigned long _isr;	/* isr */
 			short _addr_lsb;	/* lsb of faulting address */
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
 		} _sigfault;
 
 		/* SIGPOLL */
@@ -110,9 +114,9 @@ typedef struct siginfo {
 /*
  * SIGSEGV si_codes
  */
-#define __SEGV_PSTKOVF	(__SI_FAULT|3)	/* paragraph stack overflow */
+#define __SEGV_PSTKOVF	(__SI_FAULT|4)	/* paragraph stack overflow */
 #undef NSIGSEGV
-#define NSIGSEGV	3
+#define NSIGSEGV	4
 
 #undef NSIGTRAP
 #define NSIGTRAP	4
-- 
1.7.1
[PATCH v9 12/12] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some information about Intel MPX. Signed-off-by: Qiaowei Ren --- Documentation/x86/intel_mpx.txt | 245 +++ 1 files changed, 245 insertions(+), 0 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt new file mode 100644 index 000..3c20a17 --- /dev/null +++ b/Documentation/x86/intel_mpx.txt @@ -0,0 +1,245 @@ +1. Intel(R) MPX Overview + + +Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability +introduced into Intel Architecture. Intel MPX provides hardware features +that can be used in conjunction with compiler changes to check memory +references, for those references whose compile-time normal intentions are +usurped at runtime due to buffer overflow or underflow. + +For more information, please refer to Intel(R) Architecture Instruction +Set Extensions Programming Reference, Chapter 9: Intel(R) Memory Protection +Extensions. + +Note: Currently no hardware with MPX ISA is available but it is always +possible to use SDE (Intel(R) Software Development Emulator) instead, which +can be downloaded from +http://software.intel.com/en-us/articles/intel-software-development-emulator + + +2. How to get the advantage of MPX +== + +For MPX to work, changes are required in the kernel, binutils and compiler. +No source changes are required for applications, just a recompile. + +There are a lot of moving parts of this to all work right. The following +is how we expect the compiler, application and kernel to work together. + +1) Application developer compiles with -fmpx. The compiler will add the + instrumentation as well as some setup code called early after the app + starts. New instruction prefixes are noops for old CPUs. 
+2) That setup code allocates (virtual) space for the "bounds directory",
+   points the "bndcfgu" register to the directory and notifies the kernel
+   (via the new prctl(PR_MPX_ENABLE_MANAGEMENT)) that the app will be using
+   MPX.
+3) The kernel detects that the CPU has MPX, allows the new prctl() to
+   succeed, and notes the location of the bounds directory. Userspace is
+   expected to keep the bounds directory at that location. We note it
+   instead of reading it each time because the 'xsave' operation needed
+   to access the bounds directory register is an expensive operation.
+4) If the application needs to spill bounds out of the 4 registers, it
+   issues a bndstx instruction. Since the bounds directory is empty at
+   this point, a bounds fault (#BR) is raised, the kernel allocates a
+   bounds table (in the user address space) and makes the relevant entry
+   in the bounds directory point to the new table.
+5) If the application violates the bounds specified in the bounds registers,
+   a separate kind of #BR is raised which will deliver a signal with
+   information about the violation in the 'struct siginfo'.
+6) Whenever memory is freed, we know that it can no longer contain valid
+   pointers, and we attempt to free the associated space in the bounds
+   tables. If an entire table becomes unused, we will attempt to free
+   the table and remove the entry in the directory.
+ +To summarize, there are essentially three things interacting here: + +GCC with -fmpx: + * enables annotation of code with MPX instructions and prefixes + * inserts code early in the application to call in to the "gcc runtime" +GCC MPX Runtime: + * Checks for hardware MPX support in cpuid leaf + * allocates virtual space for the bounds directory (malloc() essentially) + * points the hardware BNDCFGU register at the directory + * calls a new prctl(PR_MPX_ENABLE_MANAGEMENT) to notify the kernel to + start managing the bounds directories +Kernel MPX Code: + * Checks for hardware MPX support in cpuid leaf + * Handles #BR exceptions and sends SIGSEGV to the app when it violates + bounds, like during a buffer overflow. + * When bounds are spilled in to an unallocated bounds table, the kernel + notices in the #BR exception, allocates the virtual space, then + updates the bounds directory to point to the new table. It keeps + special track of the memory with a VM_MPX flag. + * Frees unused bounds tables at the time that the memory they described + is unmapped. + + +3. How does MPX kernel code work + + +Handling #BR faults caused by MPX +- + +When MPX is enabled, there are 2 new situations that can generate +#BR faults. + * new bounds tables (BT) need to be allocated to save bounds. + * bounds violation caused by MPX instructions. + +We hook #BR handler to handle these two new situations. + +On-demand kernel allocation of bounds tables + + +MPX only has 4 hardware registers for stor
[PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information
This patch sets the bound violation fields of struct siginfo in the #BR exception handler by decoding the user instruction and constructing the faulting pointer. This patch doesn't use the generic decoder. Instead it implements a limited special-purpose decoder for MPX instructions, simply because the generic decoder is very heavyweight, not just in terms of performance but also in terms of interface -- because it has to be.

Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 23 arch/x86/kernel/mpx.c | 299 arch/x86/kernel/traps.c|6 + 3 files changed, 328 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index b7598ac..780af63 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_X86_64 @@ -44,15 +45,37 @@ #define MPX_BNDSTA_ERROR_CODE 0x3 #define MPX_BD_ENTRY_VALID_FLAG0x1 +struct mpx_insn { + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + unsigned long mpx_mmap(unsigned long len); #ifdef CONFIG_X86_INTEL_MPX int do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #else static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf) { return -EINVAL; } +static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf) +{ +} #endif /* CONFIG_X86_INTEL_MPX */ #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index 2103b5e..b7e4c0e 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -10,6 +10,275 @@ #include #include +enum reg_type { + 
REG_TYPE_RM = 0, + REG_TYPE_INDEX, + REG_TYPE_BASE, +}; + +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +enum reg_type type) +{ + int regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + static const int regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + if (X86_REX_X(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced be instruction + * for rm=3 returning the content of the rm reg + * for rm!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long indx; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + if (X86_MODRM_MOD(modrm) == 3) { + addr = get_reg(insn, regs, REG_TYPE_RM); + } else { + if (insn->sib.nbytes) { + base = get_reg(insn, regs, REG_TYPE_BASE); + indx = get_reg(insn, regs, REG_TYPE_INDEX); + addr = base + indx 
* (1 << X86_SIB_SCALE(sib)); + } else { +
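The base + index * scale computation in get_addr_ref() above can be tried out in userspace with a small standalone sketch. This is not the patch's decoder: the helper names are hypothetical, the register array simply follows the same hardware encoding order as regoff[] (ax, cx, dx, bx, sp, bp, si, di, r8..r15), and the two special encodings handled here (index field 4 without REX.X meaning "no index", and base field 5 with mod == 0 meaning "no base") come from the x86 encoding rules rather than from the truncated hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field extractors, in the spirit of the kernel's X86_SIB_* macros. */
static unsigned sib_scale(uint8_t sib) { return (sib >> 6) & 0x3; }
static unsigned sib_index(uint8_t sib) { return (sib >> 3) & 0x7; }
static unsigned sib_base(uint8_t sib)  { return sib & 0x7; }

/*
 * Compute the effective address of a SIB-encoded memory operand.
 * regs[] holds the 16 general purpose registers in encoding order;
 * mod is the ModRM.mod field (0..2, since mod == 3 has no SIB byte);
 * rex_x / rex_b are the REX.X / REX.B bits; disp is the already
 * sign-extended displacement.
 */
static uint64_t sib_effective_address(const uint64_t regs[16], unsigned mod,
				      uint8_t sib, int rex_x, int rex_b,
				      int64_t disp)
{
	uint64_t addr = (uint64_t)disp;

	assert(regs != NULL && mod < 3);

	/* base field 5 with mod == 0 means "no base register" */
	if (!(mod == 0 && sib_base(sib) == 5))
		addr += regs[sib_base(sib) + (rex_b ? 8 : 0)];

	/* index field 4 without REX.X means "no index register" */
	if (!(sib_index(sib) == 4 && !rex_x))
		addr += regs[sib_index(sib) + (rex_x ? 8 : 0)] << sib_scale(sib);

	return addr;
}
```

For example, with bx = 0x1000 and si = 0x10, the SIB byte 0xB3 (scale 4, index si, base bx) and a displacement of 8 resolve to 0x1000 + 0x10 * 4 + 8 = 0x1048.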
[PATCH v9 11/12] x86, mpx: cleanup unused bound tables
There are two mappings in play:

1. The mapping with the actual data, which userspace is munmap()ing or brk()ing away, etc...
2. The mapping for the bounds table *backing* the data (it is tagged with mpx_vma_ops, see the patch "add MPX specific mmap interface").

If userspace uses the prctl() introduced earlier in this patchset to enable the management of bounds tables in kernel, then when it unmaps the first kind of mapping (the one with the actual data), the kernel needs to free the mapping for the bounds table backing the data. This patch calls arch_unmap() at the very end of do_unmap() to do so. This walks the directory to find the entries covered by the data VMA, unmaps the bounds table referenced from each such entry, and then clears the directory entry.

Unmapping of bounds tables is called under vm_munmap() of the data VMA. So we have to check ->vm_ops to prevent recursion. This recursion represents having bounds tables for bounds tables, which should not occur normally. Being strict about it here helps ensure that we do not have an exploitable stack overflow.

Once we unmap the bounds table, we would have a bounds directory entry pointing at empty address space. That address space could now be allocated for some other (random) use, and the MPX hardware would then try to walk it as if it were a bounds table. That would be bad. So any unmapping of a bounds table has to be accompanied by a corresponding write to the bounds directory entry to mark it invalid. That write to the bounds directory can fault.

Since we are doing the freeing from munmap() (and other paths like it), we hold mmap_sem for write. If we fault, the page fault handler will attempt to acquire mmap_sem for read and we will deadlock. For now, to avoid deadlock, we disable page faults while touching the bounds directory entry. This keeps us from being able to free the tables in this case. This deficiency will be addressed in later patches.
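The directory walk described above can be modelled in a few lines of userspace C. This is a deliberately simplified sketch and not the code in this patch: it uses the 64-bit constants from mpx.h, treats the bounds directory as a plain array, clears only entries whose whole 1 MiB window lies inside the unmapped range, and leaves partially covered edge tables alone. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit layout constants, as defined in arch/x86/include/asm/mpx.h. */
#define MPX_BD_ENTRY_OFFSET	28
#define MPX_BT_ENTRY_OFFSET	17
#define MPX_IGN_BITS		3
#define MPX_BD_ENTRY_VALID_FLAG	0x1

#define MPX_BD_ENTRY_MASK	((UINT64_C(1) << MPX_BD_ENTRY_OFFSET) - 1)

/* Index of the bounds directory entry that covers a data address. */
static uint64_t mpx_bd_index(uint64_t addr)
{
	return (addr >> (MPX_BT_ENTRY_OFFSET + MPX_IGN_BITS)) &
	       MPX_BD_ENTRY_MASK;
}

/*
 * Model of the cleanup step for an unmapped data range [start, end):
 * invalidate every valid directory entry whose window is fully covered.
 * Returns the number of entries cleared.
 */
static unsigned mpx_clear_covered_entries(uint64_t *bd, uint64_t nr_entries,
					  uint64_t start, uint64_t end)
{
	/* Each directory entry covers 1 << 20 bytes (1 MiB) of data. */
	const uint64_t span = UINT64_C(1) << (MPX_BT_ENTRY_OFFSET + MPX_IGN_BITS);
	uint64_t first, last, i;
	unsigned cleared = 0;

	assert(bd != NULL && start <= end);

	first = mpx_bd_index(start + span - 1);	/* round start up */
	last = mpx_bd_index(end);		/* round end down, exclusive */

	for (i = first; i < last && i < nr_entries; i++) {
		if (bd[i] & MPX_BD_ENTRY_VALID_FLAG) {
			/* The real code would also unmap the table here. */
			bd[i] = 0;
			cleared++;
		}
	}
	return cleared;
}
```

The model ignores locking entirely; in the kernel the write to the directory entry is the step that can fault, as the commit message explains.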
Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mmu_context.h | 16 ++ arch/x86/include/asm/mpx.h |9 + arch/x86/mm/mpx.c | 317 include/asm-generic/mmu_context.h |6 + mm/mmap.c |2 + 5 files changed, 350 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index e33ddb7..2b52d1b 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -111,4 +111,20 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm, #endif } +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ +#ifdef CONFIG_X86_INTEL_MPX + /* +* Userspace never asked us to manage the bounds tables, +* so refuse to help. +*/ + if (!kernel_managing_mpx_tables(current->mm)) + return; + + mpx_notify_unmap(mm, vma, start, end); +#endif +} + #endif /* _ASM_X86_MMU_CONTEXT_H */ diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 32f13f5..a1a0155 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -48,6 +48,13 @@ #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) +#define MPX_BD_ENTRY_MASK ((1<>(MPX_BT_ENTRY_OFFSET+ \ + MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT) +#define MPX_GET_BT_ENTRY_OFFSET(addr) addr)>>MPX_IGN_BITS) & \ + MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT) + #define MPX_BNDSTA_ERROR_CODE 0x3 #define MPX_BNDCFG_ENABLE_FLAG 0x1 #define MPX_BD_ENTRY_VALID_FLAG0x1 @@ -73,6 +80,8 @@ static inline int kernel_managing_mpx_tables(struct mm_struct *mm) return (mm->bd_addr != MPX_INVALID_BOUNDS_DIR); } unsigned long mpx_mmap(unsigned long len); +void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long start, unsigned long end); #ifdef CONFIG_X86_INTEL_MPX int do_mpx_bt_fault(struct xsave_struct *xsave_buf); diff --git a/arch/x86/mm/mpx.c 
b/arch/x86/mm/mpx.c index 376f2ee..dcc6621 100644 --- a/arch/x86/mm/mpx.c +++ b/arch/x86/mm/mpx.c @@ -1,7 +1,16 @@ +/* + * mpx.c - Memory Protection eXtensions + * + * Copyright (c) 2014, Intel Corporation. + * Qiaowei Ren + * Dave Hansen + */ + #include #include #include #include +#include #include static const char *mpx_mapping_name(struct vm_area_struct *vma) @@ -13,6 +22,11 @@ static struct vm_operations_struct mpx_vma_ops = { .name = mpx_mapping_name, }; +int is_mpx_vma(struct vm_area_struct *vma) +{ + return (vma->vm_ops == &mpx_vma_ops); +} + /* * this is really a simplified "vm_mmap". it only handles mpx * related maps, including bounds table an
[PATCH v9 10/12] x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT
This patch adds two prctl() commands to provide an explicit interaction mechanism to enable or disable the management of bounds tables in kernel, including on-demand kernel allocation (See the patch "on-demand kernel allocation of bounds tables") and cleanup (See the patch "cleanup unused bound tables"). Applications do not strictly need the kernel to manage bounds tables and we expect some applications to use MPX without taking advantage of the kernel support. This means the kernel cannot simply infer whether an application needs bounds table management from the MPX registers. prctl() is an explicit signal from userspace.

PR_MPX_ENABLE_MANAGEMENT is meant to be a signal from userspace to request the kernel's help in managing bounds tables. PR_MPX_DISABLE_MANAGEMENT is the opposite, meaning that userspace doesn't want the kernel's help any more. With PR_MPX_DISABLE_MANAGEMENT, the kernel won't allocate or free bounds tables, even if the CPU supports the MPX feature.

PR_MPX_ENABLE_MANAGEMENT will do an xsave, fetch the base address of the bounds directory from the xsave buffer and then cache it in the new field "bd_addr" of struct mm_struct. PR_MPX_DISABLE_MANAGEMENT will set "bd_addr" to an invalid address. Then we can check "bd_addr" to judge whether the management of bounds tables in kernel is enabled.

xsaves are expensive, so "bd_addr" is kept as a cache to reduce the number of xsaves we have to do at munmap() time. But we still have to do an xsave to get the value of BNDSTATUS at #BR fault time. In addition, with this caching, userspace can't just move the bounds directory around willy-nilly. For sane applications, the base address of the bounds directory won't change, otherwise we would be in a world of hurt. But we will still check whether it has been changed by the user at #BR fault time.
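The "bd_addr" caching scheme described above can be sketched as a small userspace model. Everything named model_* is hypothetical. MPX_INVALID_BOUNDS_DIR and MPX_BNDCFG_ENABLE_FLAG mirror the definitions added to mpx.h by this patch; masking off the low 12 bits of BNDCFGU to obtain the directory base, and refusing to enable management when the enable bit is clear, are assumptions made for the illustration and are not taken from the hunks shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MPX_INVALID_BOUNDS_DIR	((void *)-1)
#define MPX_BNDCFG_ENABLE_FLAG	0x1
/* ASSUMPTION: the low 12 bits of BNDCFGU hold flags, not address bits. */
#define MODEL_BNDCFG_ADDR_MASK	(~UINT64_C(0xfff))

/* Stand-in for the one field this patch adds to struct mm_struct. */
struct model_mm {
	void *bd_addr;
};

/* Same test as kernel_managing_mpx_tables() in the patch. */
static int model_kernel_managing(const struct model_mm *mm)
{
	assert(mm != NULL);
	return mm->bd_addr != MPX_INVALID_BOUNDS_DIR;
}

/*
 * Model of PR_MPX_ENABLE_MANAGEMENT: bndcfgu stands for the register
 * value the kernel would read back through xsave. Returns 0 on success
 * and -1 if MPX is not enabled in the register.
 */
static int model_enable_management(struct model_mm *mm, uint64_t bndcfgu)
{
	assert(mm != NULL);
	if (!(bndcfgu & MPX_BNDCFG_ENABLE_FLAG))
		return -1;
	mm->bd_addr = (void *)(uintptr_t)(bndcfgu & MODEL_BNDCFG_ADDR_MASK);
	return 0;
}

/* Model of PR_MPX_DISABLE_MANAGEMENT: forget the cached address. */
static void model_disable_management(struct model_mm *mm)
{
	assert(mm != NULL);
	mm->bd_addr = MPX_INVALID_BOUNDS_DIR;
}
```

The point of the model is that later paths, such as the munmap() hook, only compare the cached pointer against the invalid marker and never need another xsave. From userspace the real interface is just prctl(PR_MPX_ENABLE_MANAGEMENT) and prctl(PR_MPX_DISABLE_MANAGEMENT).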
Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mmu_context.h |9 arch/x86/include/asm/mpx.h | 11 + arch/x86/include/asm/processor.h | 18 +++ arch/x86/kernel/mpx.c | 88 arch/x86/kernel/setup.c|8 +++ arch/x86/kernel/traps.c| 30 - arch/x86/mm/mpx.c | 25 +++--- fs/exec.c |2 + include/asm-generic/mmu_context.h |5 ++ include/linux/mm_types.h |3 + include/uapi/linux/prctl.h |6 +++ kernel/sys.c | 12 + 12 files changed, 198 insertions(+), 19 deletions(-) diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 166af2a..e33ddb7 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -10,6 +10,7 @@ #include #include #include +#include #ifndef CONFIG_PARAVIRT #include @@ -102,4 +103,12 @@ do { \ } while (0) #endif +static inline void arch_bprm_mm_init(struct mm_struct *mm, + struct vm_area_struct *vma) +{ +#ifdef CONFIG_X86_INTEL_MPX + mm->bd_addr = MPX_INVALID_BOUNDS_DIR; +#endif +} + #endif /* _ASM_X86_MMU_CONTEXT_H */ diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 780af63..32f13f5 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -5,6 +5,12 @@ #include #include +/* + * NULL is theoretically a valid place to put the bounds + * directory, so point this at an invalid address. 
+ */ +#define MPX_INVALID_BOUNDS_DIR ((void __user *)-1) + #ifdef CONFIG_X86_64 /* upper 28 bits [47:20] of the virtual address in 64-bit used to @@ -43,6 +49,7 @@ #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) #define MPX_BNDSTA_ERROR_CODE 0x3 +#define MPX_BNDCFG_ENABLE_FLAG 0x1 #define MPX_BD_ENTRY_VALID_FLAG0x1 struct mpx_insn { @@ -61,6 +68,10 @@ struct mpx_insn { #define MAX_MPX_INSN_SIZE 15 +static inline int kernel_managing_mpx_tables(struct mm_struct *mm) +{ + return (mm->bd_addr != MPX_INVALID_BOUNDS_DIR); +} unsigned long mpx_mmap(unsigned long len); #ifdef CONFIG_X86_INTEL_MPX diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 020142f..b35aefa 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -953,6 +953,24 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip, extern int get_tsc_mode(unsigned long adr); extern int set_tsc_mode(unsigned int val); +/* Register/unregister a process' MPX related resource */ +#define MPX_ENABLE_MANAGEMENT(tsk) mpx_enable_management((tsk)) +#define MPX_DISABLE_MANAGEMENT(tsk)mpx_disable_management((tsk)) + +#ifdef CONFIG_X86_INTEL_MPX +extern int mpx_enable_management(struct task_struct *tsk); +extern int mpx_disable_management(struct task_struct *tsk); +#else +static inline int mpx_enable_managem
[PATCH v9 00/12] Intel MPX support
uild issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset has to set MPX specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag.
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Changes since v8:
  * add new patch to rename cfg_reg_u and status_reg.
  * add new patch to use disabled features from Dave's patches.
  * add new patch to sync struct siginfo for IA64.
  * rename two new prctl() commands to PR_MPX_ENABLE_MANAGEMENT and PR_MPX_DISABLE_MANAGEMENT, check whether the management of bounds tables in kernel is enabled at #BR fault time, and add locking to protect the access to 'bd_addr'.
  * update the documentation file to add more content about on-demand allocation of bounds tables, etc.

Qiaowei Ren (12):
  mm: distinguish VMAs with different vm_ops
  x86, mpx: rename cfg_reg_u and status_reg
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add MPX to disaabled features
  x86, mpx: on-demand kernel allocation of bounds tables
  mpx: extend siginfo structure to include bound violation information
  mips: sync struct siginfo with general version
  ia64: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

Qiaowei Ren (12):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: rename cfg_reg_u and status_reg
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add MPX to disaabled features
  x86, mpx: on-demand kernel allocation of bounds tables
  mpx: extend siginfo structure to include bound violation information
  mips: sync struct siginfo with general version
  ia64: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt          |  245 +++
 arch/ia64/include/uapi/asm/siginfo.h     |    8 +-
 arch/mips/include/uapi/asm/siginfo.h     |    4 +
 arch/x86/Kconfig                         |    4 +
 arch/x86/include/asm/disabled-features.h |    8 +-
 arch/x86/include/asm/mmu_context.h       |   25 ++
 arch/x86/include/asm/mpx.h               |  101 ++
 arch/x86/include/asm/processor.h         |   22 ++-
 arch/x86/kernel/Makefile                 |    1 +
 arch/x86/kernel/mpx.c                    |  488 ++
 arch/x86/kernel/setup.c                  |    8 +
 arch/x86/kernel/traps.c                  |   86 ++-
 arch/x86/mm/Makefile                     |    2 +
 arch/x86/mm/mpx.c                        |  385 +++
 fs/exec.c                                |    2 +
 fs/proc/task_mmu.c                       |    1 +
 include/asm-generic/mmu_context.h        |   11 +
 include/linux/mm.h                       |    6 +
 include/linux/mm_types.h                 |    3 +
 include/uapi/asm-generic/siginfo.h       |    9 +-
 include/uapi/linux/prctl.h               |    6 +
 kernel/signal.c                          |    4 +
 kernel/sys.c                             |   12 +
 mm/mmap.c                                |    2 +
 24 files changed, 1436 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c
 create mode 100644 arch/x86/mm/mpx.c
[PATCH v9 02/12] x86, mpx: rename cfg_reg_u and status_reg
According to Intel SDM extension, MPX configuration and status registers should be BNDCFGU and BNDSTATUS. This patch renames cfg_reg_u and status_reg to bndcfgu and bndstatus.

Signed-off-by: Qiaowei Ren
---
 arch/x86/include/asm/processor.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index eb71ec7..020142f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -379,8 +379,8 @@ struct bndregs_struct {
 } __packed;
 
 struct bndcsr_struct {
-	u64 cfg_reg_u;
-	u64 status_reg;
+	u64 bndcfgu;
+	u64 bndstatus;
 } __packed;
 
 struct xsave_hdr_struct {
-- 
1.7.1
[PATCH v8 00/10] Intel MPX support
This patchset adds support for the Memory Protection Extensions (MPX) feature found in future Intel processors. MPX can be used in conjunction with compiler changes to check memory references, for those references whose compile-time normal intentions are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for newly compiled code, and provides compatibility mechanisms with legacy software components. The MPX architecture is designed to allow a machine to run both MPX enabled software and legacy software that is MPX unaware. In such a case, the legacy software does not benefit from MPX, but it also does not experience any change in functionality or reduction in performance.

More information about Intel MPX can be found in "Intel(R) Architecture Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel, binutils, compiler, and system libraries.

A new GCC option -fmpx is introduced to utilize MPX instructions. Currently GCC compiler sources with MPX support are available in a separate branch in the common GCC SVN repository. See the GCC SVN page (http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all the necessary Glibc routines (e.g. memcpy) written in assembler, and compile Glibc with the MPX enabled GCC compiler. Currently MPX enabled Glibc source can be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source code updates, but some runtime code, which is responsible for configuring and enabling MPX, is needed in order to make use of MPX. For most applications this runtime support will be available by linking to a library supplied by the compiler, or possibly it will come directly from the OS once OS versions that support MPX are available.

MPX kernel code, namely this patchset, has mainly two responsibilities: provide handlers for bounds faults (#BR), and manage bounds memory.

The high-level areas modified in the patchset are as follows:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always possible to use SDE (Intel(R) Software Development Emulator) instead, which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

This patchset has been tested on real internal hardware platform at Intel. We have some simple unit tests in user space, which directly call MPX instructions to produce #BR to let kernel allocate bounds tables and cause bounds violations. We also compiled several benchmarks with an MPX-enabled GCC/Glibc and ICC, and ran them with this patch set. We found a number of bugs in this code in these tests.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structure and macro as much as possible when decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors in the documentation, and document extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use new prctl() command to register bounds directory address to struct mm_struct to check whether one process is MPX enabled during unmap().
  * in order to track MPX memory usage precisely, add MPX specific mmap interface and one VM_MPX flag to check whether a VMA is MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct siginfo for mips with general version to avoid build issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset has to set MPX specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag.
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
[PATCH v8 02/10] x86, mpx: add MPX specific mmap interface
This patch adds one MPX specific mmap interface, which only handles mpx related maps, including bounds table and bounds directory. In order to track MPX specific memory usage, this interface is added to stick new vm_flag VM_MPX in the vma_area_struct when create a bounds table or bounds directory. These bounds tables can take huge amounts of memory. In the worst-case scenario, the tables can be 4x the size of the data structure being tracked. IOW, a 1-page structure can require 4 bounds-table pages. My expectation is that folks using MPX are going to be keen on figuring out how much memory is being dedicated to it. With this feature, plus some grepping in /proc/$pid/smaps one could take a pretty good stab at it. Signed-off-by: Qiaowei Ren --- arch/x86/Kconfig |4 ++ arch/x86/include/asm/mpx.h | 38 + arch/x86/mm/Makefile |2 + arch/x86/mm/mpx.c | 79 4 files changed, 123 insertions(+), 0 deletions(-) create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/mm/mpx.c diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 778178f..935aa69 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -243,6 +243,10 @@ config HAVE_INTEL_TXT def_bool y depends on INTEL_IOMMU && ACPI +config X86_INTEL_MPX + def_bool y + depends on CPU_SUP_INTEL + config X86_32_SMP def_bool y depends on X86_32 && SMP diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h new file mode 100644 index 000..5725ac4 --- /dev/null +++ b/arch/x86/include/asm/mpx.h @@ -0,0 +1,38 @@ +#ifndef _ASM_X86_MPX_H +#define _ASM_X86_MPX_H + +#include +#include + +#ifdef CONFIG_X86_64 + +/* upper 28 bits [47:20] of the virtual address in 64-bit used to + * index into bounds directory (BD). + */ +#define MPX_BD_ENTRY_OFFSET28 +#define MPX_BD_ENTRY_SHIFT 3 +/* bits [19:3] of the virtual address in 64-bit used to index into + * bounds table (BT). 
+ */ +#define MPX_BT_ENTRY_OFFSET17 +#define MPX_BT_ENTRY_SHIFT 5 +#define MPX_IGN_BITS 3 + +#else + +#define MPX_BD_ENTRY_OFFSET20 +#define MPX_BD_ENTRY_SHIFT 2 +#define MPX_BT_ENTRY_OFFSET10 +#define MPX_BT_ENTRY_SHIFT 4 +#define MPX_IGN_BITS 2 + +#endif + +#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) +#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) + +#define MPX_BNDSTA_ERROR_CODE 0x3 + +unsigned long mpx_mmap(unsigned long len); + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 6a19ad9..ecfdc46 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA) += srat.o obj-$(CONFIG_NUMA_EMU) += numa_emulation.o obj-$(CONFIG_MEMTEST) += memtest.o + +obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c new file mode 100644 index 000..e1b28e6 --- /dev/null +++ b/arch/x86/mm/mpx.c @@ -0,0 +1,79 @@ +#include +#include +#include +#include +#include + +static const char *mpx_mapping_name(struct vm_area_struct *vma) +{ + return "[mpx]"; +} + +static struct vm_operations_struct mpx_vma_ops = { + .name = mpx_mapping_name, +}; + +/* + * this is really a simplified "vm_mmap". it only handles mpx + * related maps, including bounds table and bounds directory. + * + * here we can stick new vm_flag VM_MPX in the vma_area_struct + * when create a bounds table or bounds directory, in order to + * track MPX specific memory. + */ +unsigned long mpx_mmap(unsigned long len) +{ + unsigned long ret; + unsigned long addr, pgoff; + struct mm_struct *mm = current->mm; + vm_flags_t vm_flags; + struct vm_area_struct *vma; + + /* Only bounds table and bounds directory can be allocated here */ + if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES) + return -EINVAL; + + down_write(&mm->mmap_sem); + + /* Too many mappings? 
*/ + if (mm->map_count > sysctl_max_map_count) { + ret = -ENOMEM; + goto out; + } + + /* Obtain the address to map to. we verify (or select) it and ensure +* that it represents a valid section of the address space. +*/ + addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE); + if (addr & ~PAGE_MASK) { + ret = addr; + goto out; + } + + vm_flags = VM_READ | VM_WRITE | VM_MPX | + mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; + + /* Set pgoff according to addr for anon_vma */ + pgoff = addr >> PAGE_SHIFT; + + ret = mmap_region(NULL, addr, len, vm_flags, pgoff); + if (IS_ERR_VALUE(ret)) + goto out;
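As a quick check of the size macros in this patch's asm/mpx.h, the arithmetic can be redone in userspace. This is only a sketch: struct mpx_layout and the two helper functions are made-up names for illustration, and only the four field widths per mode are taken from the header.

```c
/* Field widths copied from the patch's asm/mpx.h. The struct and the
 * helpers below are illustrative only and do not exist in the kernel. */
struct mpx_layout {
	unsigned bd_entry_offset; /* MPX_BD_ENTRY_OFFSET: address bits that index the BD */
	unsigned bd_entry_shift;  /* MPX_BD_ENTRY_SHIFT: log2(bytes per BD entry) */
	unsigned bt_entry_offset; /* MPX_BT_ENTRY_OFFSET: address bits that index a BT */
	unsigned bt_entry_shift;  /* MPX_BT_ENTRY_SHIFT: log2(bytes per BT entry) */
};

const struct mpx_layout mpx_layout_64 = { 28, 3, 17, 5 };
const struct mpx_layout mpx_layout_32 = { 20, 2, 10, 4 };

/* Same formula as MPX_BD_SIZE_BYTES, widened so it also works on 32-bit hosts. */
unsigned long long mpx_bd_size_bytes(const struct mpx_layout *l)
{
	return 1ULL << (l->bd_entry_offset + l->bd_entry_shift);
}

/* Same formula as MPX_BT_SIZE_BYTES. */
unsigned long long mpx_bt_size_bytes(const struct mpx_layout *l)
{
	return 1ULL << (l->bt_entry_offset + l->bt_entry_shift);
}
```

With these widths the 64-bit layout gives a 2 GB bounds directory and 4 MB bounds tables, and the 32-bit layout gives a 4 MB directory and 16 KB tables, which are the only two lengths mpx_mmap() accepts per mode.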
[PATCH v8 03/10] x86, mpx: add macro cpu_has_mpx
In order to do performance optimization, this patch adds macro
cpu_has_mpx which will directly return 0 when MPX is not supported
by the kernel.

Community gave a lot of comments on this macro cpu_has_mpx in previous
version. Dave will introduce a patchset about disabled features to fix
it later.

In this code:

	if (cpu_has_mpx)
		do_some_mpx_thing();

The patch series from Dave will introduce a new macro
cpu_feature_enabled() (if merged after this patchset) to replace
cpu_has_mpx:

	if (cpu_feature_enabled(X86_FEATURE_MPX))
		do_some_mpx_thing();

Signed-off-by: Qiaowei Ren
---
 arch/x86/include/asm/cpufeature.h |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index bb9b258..82ec7ed 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -353,6 +353,12 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define cpu_has_eager_fpu	boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoext		boot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef cpu_has_vme
-- 
1.7.1
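The effect of defining cpu_has_mpx as a literal 0 can be shown outside the kernel. The following is a minimal userspace sketch, not kernel code: fake_boot_cpu_has_mpx(), mpx_calls and handle_fault() are hypothetical stand-ins, and CONFIG_X86_INTEL_MPX is simply a macro you may pass with -D when compiling.

```c
#ifdef CONFIG_X86_INTEL_MPX
/* Hypothetical runtime probe standing in for boot_cpu_has(X86_FEATURE_MPX). */
int fake_boot_cpu_has_mpx(void)
{
	return 1;
}
#define cpu_has_mpx fake_boot_cpu_has_mpx()
#else
/* Same shape as the patch: a compile-time constant when MPX is configured out. */
#define cpu_has_mpx 0
#endif

int mpx_calls; /* counts how often the MPX-only path ran */

void do_some_mpx_thing(void)
{
	mpx_calls++;
}

int handle_fault(void)
{
	/* With the constant-0 definition this is "if (0)", so the compiler
	 * can drop the branch and the call entirely. */
	if (cpu_has_mpx)
		do_some_mpx_thing();
	return mpx_calls;
}
```

Built without -DCONFIG_X86_INTEL_MPX, handle_fault() never reaches do_some_mpx_thing(), which is the optimization the commit message refers to.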
[PATCH v8 05/10] x86, mpx: extend siginfo structure to include bound violation information
This patch adds new bound violation fields to the siginfo structure.
si_lower and si_upper hold the lower and upper bounds in effect when
a bound violation occurs.

Signed-off-by: Qiaowei Ren
---
 include/uapi/asm-generic/siginfo.h |    9 ++++++++-
 kernel/signal.c                    |    4 ++++
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
 			int _trapno;	/* TRAP # which caused the signal */
 #endif
 			short _addr_lsb; /* LSB of the reported address */
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
 		} _sigfault;
 
 		/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno	_sifields._sigfault._trapno
 #endif
 #define si_addr_lsb	_sifields._sigfault._addr_lsb
+#define si_lower	_sifields._sigfault._addr_bnd._lower
+#define si_upper	_sifields._sigfault._addr_bnd._upper
 #define si_band		_sifields._sigpoll._band
 #define si_fd		_sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR	(__SI_FAULT|1)	/* address not mapped to object */
 #define SEGV_ACCERR	(__SI_FAULT|2)	/* invalid permissions for mapped object */
-#define NSIGSEGV	2
+#define SEGV_BNDERR	(__SI_FAULT|3)	/* failed address bound checks */
+#define NSIGSEGV	3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from)
 		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
 			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+		err |= __put_user(from->si_lower, &to->si_lower);
+		err |= __put_user(from->si_upper, &to->si_upper);
+#endif
 		break;
 	case __SI_CHLD:
 		err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1
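To make the new fields concrete, here is a small userspace model of how a handler might interpret them. It is a sketch under stated assumptions: struct mock_siginfo is a simplified stand-in for the real siginfo_t (so it compiles without a libc that knows si_lower/si_upper), MOCK_SEGV_BNDERR assumes userspace sees the code as 3 once the kernel-internal __SI_FAULT bits are dropped, and treating the upper bound as inclusive is also an assumption of this example.

```c
#include <stdint.h>

/* Simplified stand-in for the _sigfault part of siginfo_t after this patch. */
struct mock_siginfo {
	int si_code;
	void *si_addr;  /* faulting address */
	void *si_lower; /* lower bound, from _addr_bnd._lower */
	void *si_upper; /* upper bound, from _addr_bnd._upper */
};

/* Assumed userspace value of SEGV_BNDERR (low bits of __SI_FAULT|3). */
#define MOCK_SEGV_BNDERR 3

/*
 * Returns -1 if the signal is not a bound error, 1 if si_addr lies
 * outside [si_lower, si_upper], and 0 if it lies inside.
 */
int mock_bound_violation(const struct mock_siginfo *info)
{
	uintptr_t addr, lower, upper;

	if (info->si_code != MOCK_SEGV_BNDERR)
		return -1;

	addr = (uintptr_t)info->si_addr;
	lower = (uintptr_t)info->si_lower;
	upper = (uintptr_t)info->si_upper;

	return (addr < lower || addr > upper) ? 1 : 0;
}
```

A real SIGSEGV handler installed with SA_SIGINFO would read the same three values from its siginfo_t argument once the C library exposes them.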
[PATCH v8 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
An MPX-enabled application may create a lot of bounds tables in its
process address space to save bounds information. These tables can take
up huge swaths of memory (as much as 80% of the memory on the system)
even if we clean them up aggressively. Because they are this large, we
need a way to track their memory use. If we want to track them, we
essentially have two options:

1. walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them
2. find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag. We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.

Signed-off-by: Qiaowei Ren
---
 fs/proc/task_mmu.c |    1 +
 include/linux/mm.h |    6 ++++++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfc791c..cc31520 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_GROWSDOWN)]	= "gd",
 		[ilog2(VM_PFNMAP)]	= "pf",
 		[ilog2(VM_DENYWRITE)]	= "dw",
+		[ilog2(VM_MPX)]		= "mp",
 		[ilog2(VM_LOCKED)]	= "lo",
 		[ilog2(VM_IO)]		= "io",
 		[ilog2(VM_SEQ_READ)]	= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8981cc8..942be8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB	0x00400000	/* Huge TLB Page VM */
 #define VM_NONLINEAR	0x00800000	/* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
+#define VM_ARCH_2	0x02000000
 #define VM_DONTDUMP	0x04000000	/* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -154,6 +155,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPY	VM_ARCH_1	/* T if mapped copy of data (nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPX		VM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUP	VM_NONE
 #endif
-- 
1.7.1
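Since this patch makes MPX VMAs show up with the "mp" mnemonic in the VmFlags line of /proc/$pid/smaps, the memory accounting mentioned in the series could be approximated from userspace. The function below is a rough sketch, not part of the patchset: mpx_kb_from_smaps() is a made-up name, it works on smaps text already read into a string, and it assumes each entry lists "Size:" before "VmFlags:".

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum the "Size:" values (in kB) of every smaps entry whose VmFlags
 * line contains the "mp" token. 'text' is the full smaps contents.
 */
unsigned long mpx_kb_from_smaps(const char *text)
{
	unsigned long total_kb = 0, size_kb = 0;
	const char *line = text;

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);
		char buf[256];

		/* Copy one line into a scratch buffer we are allowed to modify. */
		if (len >= sizeof(buf))
			len = sizeof(buf) - 1;
		memcpy(buf, line, len);
		buf[len] = '\0';

		if (strncmp(buf, "Size:", 5) == 0) {
			/* Remember the size of the entry currently being read. */
			if (sscanf(buf + 5, "%lu", &size_kb) != 1)
				size_kb = 0;
		} else if (strncmp(buf, "VmFlags:", 8) == 0) {
			/* Compare whole tokens so "mp" cannot match inside another flag. */
			char *tok = strtok(buf + 8, " ");

			while (tok) {
				if (strcmp(tok, "mp") == 0) {
					total_kb += size_kb;
					break;
				}
				tok = strtok(NULL, " ");
			}
			size_kb = 0;
		}

		line = eol ? eol + 1 : line + strlen(line);
	}
	return total_kb;
}
```

Feeding it the contents of /proc/self/smaps from an MPX-enabled process would give the virtual size of all bounds tables and the bounds directory; summing "Rss:" instead would give the resident part.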
[PATCH v8 06/10] mips: sync struct siginfo with general version
Because new bound violation fields were added to struct siginfo,
this patch syncs the MIPS copy with the general version to avoid
a build issue.

Signed-off-by: Qiaowei Ren
---
 arch/mips/include/uapi/asm/siginfo.h |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
 			int _trapno;	/* TRAP # which caused the signal */
 #endif
 			short _addr_lsb;
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
 		} _sigfault;
 
 		/* SIGPOLL, SIGXFSZ (To do ...) */
-- 
1.7.1
[PATCH v8 04/10] x86, mpx: hook #BR exception handler to allocate bound tables
This patch handles a #BR exception for non-existent tables by carving the space out of the normal processes address space (essentially calling mmap() from inside the kernel) and then pointing the bounds-directory over to it. The tables need to be accessed and controlled by userspace because the compiler generates instructions for MPX-enabled code which frequently store and retrieve entries from the bounds tables. Any direct kernel involvement (like a syscall) to access the tables would destroy performance since these are so frequent. The tables are carved out of userspace because we have no better spot to put them. For each pointer which is being tracked by MPX, the bounds tables contain 4 longs worth of data, and the tables are indexed virtually. If we were to preallocate the tables, we would theoretically need to allocate 4x the virtual space that we have available for userspace somewhere else. We don't have that room in the kernel address space. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 20 +++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 58 arch/x86/kernel/traps.c| 55 - 4 files changed, 133 insertions(+), 1 deletions(-) create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 5725ac4..b7598ac 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -18,6 +18,8 @@ #define MPX_BT_ENTRY_SHIFT 5 #define MPX_IGN_BITS 3 +#define MPX_BD_ENTRY_TAIL 3 + #else #define MPX_BD_ENTRY_OFFSET20 @@ -26,13 +28,31 @@ #define MPX_BT_ENTRY_SHIFT 4 #define MPX_IGN_BITS 2 +#define MPX_BD_ENTRY_TAIL 2 + #endif +#define MPX_BNDSTA_TAIL2 +#define MPX_BNDCFG_TAIL12 +#define MPX_BNDSTA_ADDR_MASK (~((1UL< +#include +#include + +static int allocate_bt(long __user *bd_entry) +{ + unsigned long bt_addr, old_val = 0; + int ret = 0; + + bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES); + if (IS_ERR((void *)bt_addr)) + return bt_addr; + bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG; + 
+ ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr); + if (ret) + goto out; + + /* +* there is a existing bounds table pointed at this bounds +* directory entry, and so we need to free the bounds table +* allocated just now. +*/ + if (old_val) + goto out; + + return 0; + +out: + vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES); + return ret; +} + +/* + * When a BNDSTX instruction attempts to save bounds to a BD entry + * with the lack of the valid bit being set, a #BR is generated. + * This is an indication that no BT exists for this entry. In this + * case the fault handler will allocate a new BT. + * + * With 32-bit mode, the size of BD is 4MB, and the size of each + * bound table is 16KB. With 64-bit mode, the size of BD is 2GB, + * and the size of each bound table is 4MB. + */ +int do_mpx_bt_fault(struct xsave_struct *xsave_buf) +{ + unsigned long status; + unsigned long bd_entry, bd_base; + + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + status = xsave_buf->bndcsr.status_reg; + + bd_entry = status & MPX_BNDSTA_ADDR_MASK; + if ((bd_entry < bd_base) || + (bd_entry >= bd_base + MPX_BD_SIZE_BYTES)) + return -EINVAL; + + return allocate_bt((long __user *)bd_entry); +} diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 0d0e922..396a88b 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -60,6 +60,7 @@ #include #include #include +#include #ifdef CONFIG_X86_64 #include @@ -228,7 +229,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ DO_ERROR(X86_TRAP_DE, SIGFPE, "divide error", divide_error) DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow) -DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds", bounds) DO_ERROR(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op) DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun",coprocessor_segment_overrun) DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS", invalid_TSS) @@ -278,6 +278,59 @@ dotraplinkage void do_double_fault(struct 
pt_regs *regs, long error_code) } #endif +dotraplinkage void do_bounds(struct pt_regs *regs, long error_code) +{ + enum ctx_state prev_state; + unsigned long status; + struct xsave_struct *xsave_buf; + struct task_struct *tsk = current; + + prev_state = exception_enter(); + if (notify_die(DIE_TRAP, "boun
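The allocate-then-install pattern in allocate_bt() can be modelled in userspace with C11 atomics. This is only an analogy, written under stated assumptions: calloc() stands in for mpx_mmap(), atomic_compare_exchange_strong() stands in for user_atomic_cmpxchg_inatomic(), and install_table() with its return convention is invented for this sketch.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define BD_ENTRY_VALID ((uintptr_t)0x1) /* mirrors MPX_BD_ENTRY_VALID_FLAG */

/*
 * Allocate a table and publish it in *bd_entry only if the entry is
 * still empty. Returns 1 if our table was installed, 0 if another
 * table was already there (ours is freed again), -1 if allocation failed.
 */
int install_table(_Atomic uintptr_t *bd_entry, size_t table_bytes)
{
	uintptr_t expected = 0;
	uintptr_t desired;
	void *table = calloc(1, table_bytes);

	if (!table)
		return -1;

	/* calloc() results are at least 2-byte aligned, so bit 0 is free for the flag. */
	desired = (uintptr_t)table | BD_ENTRY_VALID;

	if (atomic_compare_exchange_strong(bd_entry, &expected, desired))
		return 1;

	/* Somebody else populated the entry first: drop the table we just made. */
	free(table);
	return 0;
}
```

As in the patch, losing the race is not treated as an error; the directory entry already points at a usable table, so the freshly allocated one is simply released.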
[PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl() commands. These commands can be used to register and unregister MPX related resource on the x86 platform. The base of the bounds directory is set into mm_struct during PR_MPX_REGISTER command execution. This member can be used to check whether one application is mpx enabled. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h |1 + arch/x86/include/asm/processor.h | 18 arch/x86/kernel/mpx.c| 55 ++ include/linux/mm_types.h |3 ++ include/uapi/linux/prctl.h |6 kernel/sys.c | 12 6 files changed, 95 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 780af63..6cb0853 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -43,6 +43,7 @@ #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) #define MPX_BNDSTA_ERROR_CODE 0x3 +#define MPX_BNDCFG_ENABLE_FLAG 0x1 #define MPX_BD_ENTRY_VALID_FLAG0x1 struct mpx_insn { diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index eb71ec7..b801fea 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -953,6 +953,24 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip, extern int get_tsc_mode(unsigned long adr); extern int set_tsc_mode(unsigned int val); +/* Register/unregister a process' MPX related resource */ +#define MPX_REGISTER(tsk) mpx_register((tsk)) +#define MPX_UNREGISTER(tsk)mpx_unregister((tsk)) + +#ifdef CONFIG_X86_INTEL_MPX +extern int mpx_register(struct task_struct *tsk); +extern int mpx_unregister(struct task_struct *tsk); +#else +static inline int mpx_register(struct task_struct *tsk) +{ + return -EINVAL; +} +static inline int mpx_unregister(struct task_struct *tsk) +{ + return -EINVAL; +} +#endif /* CONFIG_X86_INTEL_MPX */ + extern u16 amd_get_nb_id(int cpu); static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves) diff --git a/arch/x86/kernel/mpx.c 
b/arch/x86/kernel/mpx.c index 7ef6e39..b86873a 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -1,6 +1,61 @@ #include #include +#include #include +#include +#include + +/* + * This should only be called when cpuid has been checked + * and we are sure that MPX is available. + */ +static __user void *task_get_bounds_dir(struct task_struct *tsk) +{ + struct xsave_struct *xsave_buf; + + fpu_xsave(&tsk->thread.fpu); + xsave_buf = &(tsk->thread.fpu.state->xsave); + if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG)) + return NULL; + + return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u & + MPX_BNDCFG_ADDR_MASK); +} + +int mpx_register(struct task_struct *tsk) +{ + struct mm_struct *mm = tsk->mm; + + if (!cpu_has_mpx) + return -EINVAL; + + /* +* runtime in the userspace will be responsible for allocation of +* the bounds directory. Then, it will save the base of the bounds +* directory into XSAVE/XRSTOR Save Area and enable MPX through +* XRSTOR instruction. +* +* fpu_xsave() is expected to be very expensive. In order to do +* performance optimization, here we get the base of the bounds +* directory and then save it into mm_struct to be used in future. 
+*/ + mm->bd_addr = task_get_bounds_dir(tsk); + if (!mm->bd_addr) + return -EINVAL; + + return 0; +} + +int mpx_unregister(struct task_struct *tsk) +{ + struct mm_struct *mm = current->mm; + + if (!cpu_has_mpx) + return -EINVAL; + + mm->bd_addr = NULL; + return 0; +} enum reg_type { REG_TYPE_RM = 0, diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 6e0b286..760aee3 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -454,6 +454,9 @@ struct mm_struct { bool tlb_flush_pending; #endif struct uprobes_state uprobes_state; +#ifdef CONFIG_X86_INTEL_MPX + void __user *bd_addr; /* address of the bounds directory */ +#endif }; static inline void mm_init_cpumask(struct mm_struct *mm) diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 58afc04..ce86fa9 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -152,4 +152,10 @@ #define PR_SET_THP_DISABLE 41 #define PR_GET_THP_DISABLE 42 +/* + * Register/unregister MPX related resource. + */ +#define PR_MPX_REGISTER43 +#define PR_MPX_UNREGISTER 44 + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index ce81291..9a43587 100644 --- a/kernel/sys.c +++ b/kerne
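From the runtime's side, registration would presumably be a single prctl() call made after the bounds directory has been set up and MPX enabled. The sketch below assumes the numbers 43 and 44 from this version of the patch and passes zeros for the unused arguments; both are assumptions, the SKETCH_* names and helper functions are hypothetical, and a kernel without this patch is expected to reject the calls.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Values taken from this patch version only; they are not guaranteed to
 * match the prctl numbers of any released kernel. */
#define SKETCH_PR_MPX_REGISTER   43
#define SKETCH_PR_MPX_UNREGISTER 44

/* Returns 0 on success or a negative errno value on failure. */
int mpx_register_self(void)
{
	if (prctl(SKETCH_PR_MPX_REGISTER, 0, 0, 0, 0) == -1)
		return -errno;
	return 0;
}

/* Returns 0 on success or a negative errno value on failure. */
int mpx_unregister_self(void)
{
	if (prctl(SKETCH_PR_MPX_UNREGISTER, 0, 0, 0, 0) == -1)
		return -errno;
	return 0;
}
```

Per mpx_register() above, the call fails with -EINVAL when the CPU lacks MPX or when BNDCFG does not have the enable bit set, so a runtime would call it only after enabling MPX through XRSTOR.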
[PATCH v8 10/10] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 0000000..ccffeee
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+========================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+================================
+
+Handling #BR faults caused by MPX
+---------------------------------
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-------------------------
+
+If a #BR is generated due to a bounds violation caused by MPX,
+we need to decode MPX instructions to get violation address and
+set this address into extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+		/* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+		struct {
+			void __user *_addr; /* faulting insn/memory ref. */
+#ifdef __ARCH_SI_TRAPNO
+			int _trapno;	/* TRAP # which caused the signal */
+#endif
+			short _addr_lsb; /* LSB of the reported address */
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
+		} _sigfault;
+
+The '_addr' field refers to violation address, and new '_addr_bnd'
+field refers to the upper/lower bounds when a #BR is caused.
+
+Glibc will also be updated to support this new siginfo. So user
+can get violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+----------------------------
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+Here, the solution for this issue is to hook do_munmap() to check
+whether one process is MPX enabled. If yes, those bounds tables covered
+in the virtual address region which is being unmapped will be freed also.
+
+Adding new prctl commands
+-------------------------
+
+Runtime library in userspace is responsible for allocation of bounds
+directory. So the kernel has to use XSAVE instruction to get the base
+of bounds directory from BNDCFG register.
+
+But XSAVE is expected to be very expensive. In order to do performance
+optimization, we have to add new prctl command to get the base of
+bounds directory to be used in future.
+
+Two new prctl commands are added to register and unregister MPX related
+resource.
+
+#define PR_MPX_REGISTER         43
+#define PR_MPX_UNREGISTER       44
+
+The base of the bounds directory is set into mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether one application is mpx enabled.
+
+
+3. Tips
+=======
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them in the userspace. In fact, it is also not necessary
+for users to create bounds tables in the userspace.
+
+When #BR fault is produced due to invalid entry, bounds table will be
+created in kernel on demand and kernel will not transfer this fault to
+userspace. So userspace can't receive #BR fault for invalid entry, and
+it is also not necessary for users to create bounds tables by themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set v
[PATCH v8 09/10] x86, mpx: cleanup unused bound tables
Since the kernel allocated those tables on-demand without userspace knowledge, it is also responsible for freeing them when the associated mappings go away. Here, the solution for this issue is to hook do_munmap() to check whether one process is MPX enabled. If yes, those bounds tables covered in the virtual address region which is being unmapped will be freed also. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mmu_context.h | 16 +++ arch/x86/include/asm/mpx.h |9 ++ arch/x86/mm/mpx.c | 252 include/asm-generic/mmu_context.h |6 + mm/mmap.c |2 + 5 files changed, 285 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 166af2a..d13e01c 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -10,6 +10,7 @@ #include #include #include +#include #ifndef CONFIG_PARAVIRT #include @@ -102,4 +103,19 @@ do { \ } while (0) #endif +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ +#ifdef CONFIG_X86_INTEL_MPX + /* +* Check whether this vma comes from MPX-enabled application. +* If so, release this vma related bound tables. 
+*/ + if (mm->bd_addr && !(vma->vm_flags & VM_MPX)) + mpx_unmap(mm, start, end); + +#endif +} + #endif /* _ASM_X86_MMU_CONTEXT_H */ diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 6cb0853..e848a74 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -42,6 +42,13 @@ #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) +#define MPX_BD_ENTRY_MASK ((1<>(MPX_BT_ENTRY_OFFSET+ \ + MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT) +#define MPX_GET_BT_ENTRY_OFFSET(addr) addr)>>MPX_IGN_BITS) & \ + MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT) + #define MPX_BNDSTA_ERROR_CODE 0x3 #define MPX_BNDCFG_ENABLE_FLAG 0x1 #define MPX_BD_ENTRY_VALID_FLAG0x1 @@ -63,6 +70,8 @@ struct mpx_insn { #define MAX_MPX_INSN_SIZE 15 unsigned long mpx_mmap(unsigned long len); +void mpx_unmap(struct mm_struct *mm, + unsigned long start, unsigned long end); #ifdef CONFIG_X86_INTEL_MPX int do_mpx_bt_fault(struct xsave_struct *xsave_buf); diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c index e1b28e6..feb1f01 100644 --- a/arch/x86/mm/mpx.c +++ b/arch/x86/mm/mpx.c @@ -1,7 +1,16 @@ +/* + * mpx.c - Memory Protection eXtensions + * + * Copyright (c) 2014, Intel Corporation. + * Qiaowei Ren + * Dave Hansen + */ + #include #include #include #include +#include #include static const char *mpx_mapping_name(struct vm_area_struct *vma) @@ -77,3 +86,246 @@ out: up_write(&mm->mmap_sem); return ret; } + +/* + * Get the base of bounds tables pointed by specific bounds + * directory entry. 
+ */ +static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr) +{ + int valid; + + if (!access_ok(VERIFY_READ, (bd_entry), sizeof(*(bd_entry + return -EFAULT; + + pagefault_disable(); + if (get_user(*bt_addr, bd_entry)) + goto out; + pagefault_enable(); + + valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG; + *bt_addr &= MPX_BT_ADDR_MASK; + + /* +* If this bounds directory entry is nonzero, and meanwhile +* the valid bit is zero, one SIGSEGV will be produced due to +* this unexpected situation. +*/ + if (!valid && *bt_addr) + return -EINVAL; + if (!valid) + return -ENOENT; + + return 0; + +out: + pagefault_enable(); + return -EFAULT; +} + +/* + * Free the backing physical pages of bounds table 'bt_addr'. + * Assume start...end is within that bounds table. + */ +static int __must_check zap_bt_entries(struct mm_struct *mm, + unsigned long bt_addr, + unsigned long start, unsigned long end) +{ + struct vm_area_struct *vma; + + /* Find the vma which overlaps this bounds table */ + vma = find_vma(mm, bt_addr); + /* +* The table entry comes from userspace and could be +* pointing anywhere, so make sure it is at least +* pointing to valid memory. +*/ + if (!vma || !(vma->vm_flags & VM_MPX) || + vma->vm_start > bt_addr || + vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES) + return -EINVAL; + + zap_page_range(vma, start, end - start, NULL); + r
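For the 64-bit layout, the way an address selects a bounds directory entry and a bounds table entry can be worked through in plain C. This is a hedged sketch: the shifts follow the header comments ("bits [47:20]" index the directory, "bits [19:3]" index a table), while the mask definitions are garbled in the archived header and are assumed here to be (1 << OFFSET) - 1; the lower-case function names are invented.

```c
#include <stdint.h>

/* 64-bit values from the patch's asm/mpx.h. */
#define BD_ENTRY_OFFSET 28
#define BD_ENTRY_SHIFT  3
#define BT_ENTRY_OFFSET 17
#define BT_ENTRY_SHIFT  5
#define IGN_BITS        3

/* Assumed mask definitions: one bit per index bit. */
#define BD_ENTRY_MASK ((UINT64_C(1) << BD_ENTRY_OFFSET) - 1)
#define BT_ENTRY_MASK ((UINT64_C(1) << BT_ENTRY_OFFSET) - 1)

/* Byte offset of the directory entry covering 'addr' (address bits 47:20). */
uint64_t bd_entry_offset(uint64_t addr)
{
	return ((addr >> (BT_ENTRY_OFFSET + IGN_BITS)) & BD_ENTRY_MASK)
		<< BD_ENTRY_SHIFT;
}

/* Byte offset of the table entry covering 'addr' (address bits 19:3). */
uint64_t bt_entry_offset(uint64_t addr)
{
	return ((addr >> IGN_BITS) & BT_ENTRY_MASK) << BT_ENTRY_SHIFT;
}
```

For example, a pointer stored at address 0x100008 would use directory entry 1 (byte offset 8) and table entry 1 (byte offset 32), because each directory entry is 8 bytes and each table entry is 32 bytes.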
[PATCH v8 07/10] x86, mpx: decode MPX instruction to get bound violation information
This patch sets bound violation fields of siginfo struct in #BR exception handler by decoding the user instruction and constructing the faulting pointer. This patch doesn't use the generic decoder, and implements a limited special-purpose decoder to decode MPX instructions, simply because the generic decoder is very heavyweight not just in terms of performance but in terms of interface -- because it has to. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 23 arch/x86/kernel/mpx.c | 299 arch/x86/kernel/traps.c|6 + 3 files changed, 328 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index b7598ac..780af63 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_X86_64 @@ -44,15 +45,37 @@ #define MPX_BNDSTA_ERROR_CODE 0x3 #define MPX_BD_ENTRY_VALID_FLAG0x1 +struct mpx_insn { + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + unsigned long mpx_mmap(unsigned long len); #ifdef CONFIG_X86_INTEL_MPX int do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #else static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf) { return -EINVAL; } +static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf) +{ +} #endif /* CONFIG_X86_INTEL_MPX */ #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index 88d660f..7ef6e39 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -2,6 +2,275 @@ #include #include +enum reg_type { + REG_TYPE_RM 
= 0, + REG_TYPE_INDEX, + REG_TYPE_BASE, +}; + +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +enum reg_type type) +{ + int regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + static const int regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + if (X86_REX_X(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced be instruction + * for rm=3 returning the content of the rm reg + * for rm!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long indx; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + if (X86_MODRM_MOD(modrm) == 3) { + addr = get_reg(insn, regs, REG_TYPE_RM); + } else { + if (insn->sib.nbytes) { + base = get_reg(insn, regs, REG_TYPE_BASE); + indx = get_reg(insn, regs, REG_TYPE_INDEX); + addr = base + indx * (1 << 
X86_SIB_SCALE(sib)); + } else { +
[PATCH v7 02/10] x86, mpx: add MPX specific mmap interface
This patch adds an MPX-specific mmap interface, which only handles MPX-related maps, namely bounds tables and the bounds directory. In order to track MPX-specific memory usage, this interface sets the new vm_flag VM_MPX in the vm_area_struct when creating a bounds table or bounds directory. These bounds tables can take huge amounts of memory. In the worst-case scenario, the tables can be 4x the size of the data structure being tracked. IOW, a 1-page structure can require 4 bounds-table pages. My expectation is that folks using MPX are going to be keen on figuring out how much memory is being dedicated to it. With this feature, plus some grepping in /proc/$pid/smaps, one could take a pretty good stab at it. Signed-off-by: Qiaowei Ren --- arch/x86/Kconfig | 4 ++ arch/x86/include/asm/mpx.h | 38 + arch/x86/mm/Makefile | 2 + arch/x86/mm/mpx.c | 79 4 files changed, 123 insertions(+), 0 deletions(-) create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/mm/mpx.c diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index a8f749e..020db35 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -238,6 +238,10 @@ config HAVE_INTEL_TXT def_bool y depends on INTEL_IOMMU && ACPI +config X86_INTEL_MPX + def_bool y + depends on CPU_SUP_INTEL + config X86_32_SMP def_bool y depends on X86_32 && SMP diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h new file mode 100644 index 0000000..5725ac4 --- /dev/null +++ b/arch/x86/include/asm/mpx.h @@ -0,0 +1,38 @@ +#ifndef _ASM_X86_MPX_H +#define _ASM_X86_MPX_H + +#include +#include + +#ifdef CONFIG_X86_64 + +/* upper 28 bits [47:20] of the virtual address in 64-bit used to + * index into bounds directory (BD). + */ +#define MPX_BD_ENTRY_OFFSET 28 +#define MPX_BD_ENTRY_SHIFT 3 +/* bits [19:3] of the virtual address in 64-bit used to index into + * bounds table (BT). 
+ */ +#define MPX_BT_ENTRY_OFFSET 17 +#define MPX_BT_ENTRY_SHIFT 5 +#define MPX_IGN_BITS 3 + +#else + +#define MPX_BD_ENTRY_OFFSET 20 +#define MPX_BD_ENTRY_SHIFT 2 +#define MPX_BT_ENTRY_OFFSET 10 +#define MPX_BT_ENTRY_SHIFT 4 +#define MPX_IGN_BITS 2 + +#endif + +#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) +#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) + +#define MPX_BNDSTA_ERROR_CODE 0x3 + +unsigned long mpx_mmap(unsigned long len); + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 6a19ad9..ecfdc46 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA) += srat.o obj-$(CONFIG_NUMA_EMU) += numa_emulation.o obj-$(CONFIG_MEMTEST) += memtest.o + +obj-$(CONFIG_X86_INTEL_MPX) += mpx.o diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c new file mode 100644 index 0000000..e1b28e6 --- /dev/null +++ b/arch/x86/mm/mpx.c @@ -0,0 +1,79 @@ +#include +#include +#include +#include +#include + +static const char *mpx_mapping_name(struct vm_area_struct *vma) +{ + return "[mpx]"; +} + +static struct vm_operations_struct mpx_vma_ops = { + .name = mpx_mapping_name, +}; + +/* + * This is really a simplified "vm_mmap". It only handles MPX + * related maps, including bounds tables and the bounds directory. + * + * Here we can stick the new vm_flag VM_MPX in the vm_area_struct + * when creating a bounds table or bounds directory, in order to + * track MPX specific memory. + */ +unsigned long mpx_mmap(unsigned long len) +{ + unsigned long ret; + unsigned long addr, pgoff; + struct mm_struct *mm = current->mm; + vm_flags_t vm_flags; + struct vm_area_struct *vma; + + /* Only bounds table and bounds directory can be allocated here */ + if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES) + return -EINVAL; + + down_write(&mm->mmap_sem); + + /* Too many mappings? 
*/ + if (mm->map_count > sysctl_max_map_count) { + ret = -ENOMEM; + goto out; + } + + /* Obtain the address to map to. we verify (or select) it and ensure +* that it represents a valid section of the address space. +*/ + addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE); + if (addr & ~PAGE_MASK) { + ret = addr; + goto out; + } + + vm_flags = VM_READ | VM_WRITE | VM_MPX | + mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; + + /* Set pgoff according to addr for anon_vma */ + pgoff = addr >> PAGE_SHIFT; + + ret = mmap_region(NULL, addr, len, vm_flags, pgoff); + if (IS_ERR_VALUE(ret)) + goto out;
[PATCH v7 05/10] x86, mpx: extend siginfo structure to include bound violation information
This patch adds new fields describing a bound violation to the siginfo structure: si_lower and si_upper hold the lower and upper bounds, respectively, that were in effect when the violation occurred. Signed-off-by: Qiaowei Ren --- include/uapi/asm-generic/siginfo.h | 9 - kernel/signal.c | 4 2 files changed, 12 insertions(+), 1 deletions(-) diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h index ba5be7f..1e35520 100644 --- a/include/uapi/asm-generic/siginfo.h +++ b/include/uapi/asm-generic/siginfo.h @@ -91,6 +91,10 @@ typedef struct siginfo { int _trapno; /* TRAP # which caused the signal */ #endif short _addr_lsb; /* LSB of the reported address */ + struct { + void __user *_lower; + void __user *_upper; + } _addr_bnd; } _sigfault; /* SIGPOLL */ @@ -131,6 +135,8 @@ typedef struct siginfo { #define si_trapno _sifields._sigfault._trapno #endif #define si_addr_lsb _sifields._sigfault._addr_lsb +#define si_lower _sifields._sigfault._addr_bnd._lower +#define si_upper _sifields._sigfault._addr_bnd._upper #define si_band _sifields._sigpoll._band #define si_fd _sifields._sigpoll._fd #ifdef __ARCH_SIGSYS @@ -199,7 +205,8 @@ typedef struct siginfo { */ #define SEGV_MAPERR (__SI_FAULT|1) /* address not mapped to object */ #define SEGV_ACCERR (__SI_FAULT|2) /* invalid permissions for mapped object */ -#define NSIGSEGV 2 +#define SEGV_BNDERR (__SI_FAULT|3) /* failed address bound checks */ +#define NSIGSEGV 3 /* * SIGBUS si_codes diff --git a/kernel/signal.c b/kernel/signal.c index a4077e9..2131636 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from) if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO) err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb); #endif +#ifdef SEGV_BNDERR + err |= __put_user(from->si_lower, &to->si_lower); + err |= __put_user(from->si_upper, &to->si_upper); +#endif break; case __SI_CHLD: err |= __put_user(from->si_pid, &to->si_pid); -- 1.7.1
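To make the new siginfo fields concrete, here is a minimal sketch of how a userspace SIGSEGV handler might interpret them. It deliberately uses a hypothetical stand-in struct (`bnd_fault`) instead of the real `siginfo_t`, so it builds without MPX-aware headers; only the names si_lower, si_upper and SEGV_BNDERR come from the patch. The constant `DEMO_SEGV_BNDERR`, the helper name, and the treatment of both bounds as inclusive are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the si_code value SEGV_BNDERR (assumption). */
#define DEMO_SEGV_BNDERR 3

/* Hypothetical mirror of the fields a handler would read from siginfo. */
struct bnd_fault {
	int code;        /* would be si_code  */
	uintptr_t addr;  /* would be si_addr  */
	uintptr_t lower; /* would be si_lower */
	uintptr_t upper; /* would be si_upper */
};

/*
 * Classify a fault: -1 if the address is below the lower bound,
 * 1 if it is above the upper bound, 0 if it is inside the bounds
 * or the signal is not a bounds violation at all.
 */
int bnd_fault_direction(const struct bnd_fault *f)
{
	assert(f != NULL);

	if (f->code != DEMO_SEGV_BNDERR)
		return 0;
	if (f->addr < f->lower)
		return -1;
	if (f->addr > f->upper)
		return 1;
	return 0;
}
```

A real handler would be installed with SA_SIGINFO and would read the same three values from the siginfo_t it is passed, once the C library exposes the new fields.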
[PATCH v7 04/10] x86, mpx: hook #BR exception handler to allocate bound tables
This patch handles a #BR exception for non-existent tables by carving the space out of the normal processes address space (essentially calling mmap() from inside the kernel) and then pointing the bounds-directory over to it. The tables need to be accessed and controlled by userspace because the compiler generates instructions for MPX-enabled code which frequently store and retrieve entries from the bounds tables. Any direct kernel involvement (like a syscall) to access the tables would destroy performance since these are so frequent. The tables are carved out of userspace because we have no better spot to put them. For each pointer which is being tracked by MPX, the bounds tables contain 4 longs worth of data, and the tables are indexed virtually. If we were to preallocate the tables, we would theoretically need to allocate 4x the virtual space that we have available for userspace somewhere else. We don't have that room in the kernel address space. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 20 ++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 60 arch/x86/kernel/traps.c| 55 +++- 4 files changed, 135 insertions(+), 1 deletions(-) create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 5725ac4..b7598ac 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -18,6 +18,8 @@ #define MPX_BT_ENTRY_SHIFT 5 #define MPX_IGN_BITS 3 +#define MPX_BD_ENTRY_TAIL 3 + #else #define MPX_BD_ENTRY_OFFSET20 @@ -26,13 +28,31 @@ #define MPX_BT_ENTRY_SHIFT 4 #define MPX_IGN_BITS 2 +#define MPX_BD_ENTRY_TAIL 2 + #endif +#define MPX_BNDSTA_TAIL2 +#define MPX_BNDCFG_TAIL12 +#define MPX_BNDSTA_ADDR_MASK (~((1UL< +#include +#include + +static int allocate_bt(long __user *bd_entry) +{ + unsigned long bt_addr, old_val = 0; + int ret = 0; + + bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES); + if (IS_ERR((void *)bt_addr)) + return bt_addr; + bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG; 
+ + ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr); + if (ret) + goto out; + + /* +* there is a existing bounds table pointed at this bounds +* directory entry, and so we need to free the bounds table +* allocated just now. +*/ + if (old_val) + goto out; + + pr_debug("Allocate bounds table %lx at entry %p\n", + bt_addr, bd_entry); + return 0; + +out: + vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES); + return ret; +} + +/* + * When a BNDSTX instruction attempts to save bounds to a BD entry + * with the lack of the valid bit being set, a #BR is generated. + * This is an indication that no BT exists for this entry. In this + * case the fault handler will allocate a new BT. + * + * With 32-bit mode, the size of BD is 4MB, and the size of each + * bound table is 16KB. With 64-bit mode, the size of BD is 2GB, + * and the size of each bound table is 4MB. + */ +int do_mpx_bt_fault(struct xsave_struct *xsave_buf) +{ + unsigned long status; + unsigned long bd_entry, bd_base; + + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + status = xsave_buf->bndcsr.status_reg; + + bd_entry = status & MPX_BNDSTA_ADDR_MASK; + if ((bd_entry < bd_base) || + (bd_entry >= bd_base + MPX_BD_SIZE_BYTES)) + return -EINVAL; + + return allocate_bt((long __user *)bd_entry); +} diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 0d0e922..396a88b 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -60,6 +60,7 @@ #include #include #include +#include #ifdef CONFIG_X86_64 #include @@ -228,7 +229,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ DO_ERROR(X86_TRAP_DE, SIGFPE, "divide error", divide_error) DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow) -DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds", bounds) DO_ERROR(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op) DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun",coprocessor_segment_overrun) DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS", 
invalid_TSS) @@ -278,6 +278,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) } #endif +dotraplinkage void do_bounds(struct pt_regs *regs, long error_code) +{ + enum ctx_state prev_state; + unsigned long status; + struct xsave_struct *xsave_buf; + struct
[PATCH v7 00/10] Intel MPX support
This patchset adds support for the Memory Protection Extensions (MPX) feature found in future Intel processors. MPX can be used in conjunction with compiler changes to check memory references, for those references whose compile-time normal intentions are usurped at runtime due to buffer overflow or underflow. MPX provides this capability at very low performance overhead for newly compiled code, and provides compatibility mechanisms with legacy software components. The MPX architecture is designed to allow a machine to run both MPX-enabled software and legacy software that is MPX-unaware. In such a case, the legacy software does not benefit from MPX, but it also does not experience any change in functionality or reduction in performance. More information about Intel MPX can be found in "Intel(R) Architecture Instruction Set Extensions Programming Reference". To get the advantage of MPX, changes are required in the OS kernel, binutils, compiler, and system libraries. A new GCC option, -fmpx, is introduced to utilize MPX instructions. Currently the GCC compiler sources with MPX support are available in a separate branch in the common GCC SVN repository. See the GCC SVN page (http://gcc.gnu.org/svn.html) for details. To have full protection, we had to add MPX instrumentation to all the necessary Glibc routines (e.g. memcpy) written in assembler, and compile Glibc with the MPX-enabled GCC compiler. Currently the MPX-enabled Glibc source can be found in the Glibc git repository. Enabling an application to use MPX will generally not require source code updates, but some runtime code, which is responsible for configuring and enabling MPX, is needed in order to make use of MPX. For most applications this runtime support will be available by linking to a library supplied by the compiler, or possibly it will come directly from the OS once OS versions that support MPX are available. 
The MPX kernel code, namely this patchset, has two main responsibilities: provide handlers for bounds faults (#BR), and manage bounds memory.

The high-level areas modified in the patchset are as follows:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always possible to use SDE (Intel(R) Software Development Emulator) instead, which can be downloaded from http://software.intel.com/en-us/articles/intel-software-development-emulator

In addition, this patchset has been tested on an Intel internal hardware platform for MPX testing.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
 * check to see if #BR occurred in userspace or kernel space.
 * use generic structures and macros as much as possible when decoding MPX instructions.

Changes since v2:
 * fix some compile warnings.
 * update documentation.

Changes since v3:
 * correct some syntax errors in the documentation, and document the extended struct siginfo.
 * kill the process when the error code of BNDSTATUS is 3.
 * add some comments.
 * remove new prctl() commands.
 * fix some compile warnings for 32-bit.

Changes since v4:
 * raise SIGBUS if the allocation of the bounds tables fails.

Changes since v5:
 * hook the unmap() path to clean up unused bounds tables, and use a new prctl() command to register the bounds directory address in struct mm_struct, so that unmap() can check whether a process is MPX enabled.
 * in order to track MPX memory usage precisely, add an MPX-specific mmap interface and a VM_MPX flag to check whether a VMA is an MPX bounds table.
 * add macro cpu_has_mpx to do performance optimization.
 * sync struct siginfo for mips with the general version to avoid a build issue.

Changes since v6:
 * because arch_vma_name is removed, this patchset has to set an MPX-specific ->vm_ops to do the same thing.
 * fix warnings for 32-bit arch.
 * add more description to these patches.

Qiaowei Ren (10): x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific x86, mpx: add MPX specific mmap interface x86, mpx: add macro cpu_has_mpx x86, mpx: hook #BR exception handler to allocate bound tables x86, mpx: extend siginfo structure to include bound violation information mips: sync struct siginfo with general version x86, mpx: decode MPX instruction to get bound violation information x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER x86, mpx: cleanup unused bound tables x86, mpx: add documentation on Intel MPX Documentation/x86/intel_mpx.txt | 127 +++ arch/mips/include/uapi/asm/siginfo.h |4 + arch/x86/Kconfig |4 + arch/x86/include/asm/cpufeature.h|6 + arch/x86/include/asm/mmu_context.h | 16 ++ arch/x86/include/asm/mpx.h | 91 arch/x86/include/asm/processor.h | 18 ++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c
[PATCH v7 09/10] x86, mpx: cleanup unused bound tables
Since the kernel allocated those tables on-demand without userspace knowledge, it is also responsible for freeing them when the associated mappings go away. Here, the solution for this issue is to hook do_munmap() to check whether one process is MPX enabled. If yes, those bounds tables covered in the virtual address region which is being unmapped will be freed also. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mmu_context.h | 16 +++ arch/x86/include/asm/mpx.h |9 ++ arch/x86/mm/mpx.c | 181 include/asm-generic/mmu_context.h |6 + mm/mmap.c |2 + 5 files changed, 214 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index be12c53..af70d4f 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -6,6 +6,7 @@ #include #include #include +#include #ifndef CONFIG_PARAVIRT #include @@ -96,4 +97,19 @@ do { \ } while (0) #endif +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ +#ifdef CONFIG_X86_INTEL_MPX + /* +* Check whether this vma comes from MPX-enabled application. +* If so, release this vma related bound tables. 
+*/ + if (mm->bd_addr && !(vma->vm_flags & VM_MPX)) + mpx_unmap(mm, start, end); + +#endif +} + #endif /* _ASM_X86_MMU_CONTEXT_H */ diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 6cb0853..e848a74 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -42,6 +42,13 @@ #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) +#define MPX_BD_ENTRY_MASK ((1<<MPX_BD_ENTRY_OFFSET)-1) +#define MPX_BT_ENTRY_MASK ((1<<MPX_BT_ENTRY_OFFSET)-1) +#define MPX_GET_BD_ENTRY_OFFSET(addr) ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \ + MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT) +#define MPX_GET_BT_ENTRY_OFFSET(addr) ((((addr)>>MPX_IGN_BITS) & \ + MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT) + #define MPX_BNDSTA_ERROR_CODE 0x3 #define MPX_BNDCFG_ENABLE_FLAG 0x1 #define MPX_BD_ENTRY_VALID_FLAG 0x1 @@ -63,6 +70,8 @@ struct mpx_insn { #define MAX_MPX_INSN_SIZE 15 unsigned long mpx_mmap(unsigned long len); +void mpx_unmap(struct mm_struct *mm, + unsigned long start, unsigned long end); #ifdef CONFIG_X86_INTEL_MPX int do_mpx_bt_fault(struct xsave_struct *xsave_buf); diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c index e1b28e6..d29ec9c 100644 --- a/arch/x86/mm/mpx.c +++ b/arch/x86/mm/mpx.c @@ -2,6 +2,7 @@ #include #include #include +#include #include static const char *mpx_mapping_name(struct vm_area_struct *vma) @@ -77,3 +78,183 @@ out: up_write(&mm->mmap_sem); return ret; } + +/* + * Get the base of bounds tables pointed by specific bounds + * directory entry. + */ +static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr, + unsigned int *valid) +{ + if (get_user(*bt_addr, bd_entry)) + return -EFAULT; + + *valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG; + *bt_addr &= MPX_BT_ADDR_MASK; + + /* +* If this bounds directory entry is nonzero, and meanwhile +* the valid bit is zero, one SIGSEGV will be produced due to +* this unexpected situation. 
+*/ + if (!(*valid) && *bt_addr) + force_sig(SIGSEGV, current); + + return 0; +} + +/* + * Free the backing physical pages of bounds table 'bt_addr'. + * Assume start...end is within that bounds table. + */ +static void zap_bt_entries(struct mm_struct *mm, unsigned long bt_addr, + unsigned long start, unsigned long end) +{ + struct vm_area_struct *vma; + + /* Find the vma which overlaps this bounds table */ + vma = find_vma(mm, bt_addr); + if (!vma || vma->vm_start > bt_addr || + vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES) + return; + + zap_page_range(vma, start, end, NULL); +} + +static void unmap_single_bt(struct mm_struct *mm, long __user *bd_entry, + unsigned long bt_addr) +{ + if (user_atomic_cmpxchg_inatomic(&bt_addr, bd_entry, + bt_addr | MPX_BD_ENTRY_VALID_FLAG, 0)) + return; + + /* +* to avoid recursion, do_munmap() will check whether it comes +* from one bounds table through VM_MPX flag. +*/ + do_munmap(mm, bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES); +} + +/* + * If the bounds table pointed by bounds directory 'bd_entry' is + * not shared, unmap this whole bounds table. Otherwise, only free + * those backing physical pages of bo
[PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx
In order to do performance optimization, this patch adds the macro cpu_has_mpx, which directly evaluates to 0 when MPX is not supported by the kernel. The community gave a lot of comments on the cpu_has_mpx macro in the previous version. Dave will introduce a patchset about disabled features to fix it later. In this code: if (cpu_has_mpx) do_some_mpx_thing(); the patch series from Dave will introduce a new macro, cpu_feature_enabled() (if merged after this patchset), to replace cpu_has_mpx: if (cpu_feature_enabled(X86_FEATURE_MPX)) do_some_mpx_thing(); Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/cpufeature.h | 6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index e265ff9..f302d08 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -339,6 +339,12 @@ extern const char * const x86_power_flags[32]; #define cpu_has_eager_fpu boot_cpu_has(X86_FEATURE_EAGER_FPU) #define cpu_has_topoext boot_cpu_has(X86_FEATURE_TOPOEXT) +#ifdef CONFIG_X86_INTEL_MPX +#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX) +#else +#define cpu_has_mpx 0 +#endif /* CONFIG_X86_INTEL_MPX */ + #ifdef CONFIG_X86_64 #undef cpu_has_vme -- 1.7.1
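The optimization described above relies on the macro collapsing to the constant 0 when the config option is off, so the compiler can discard the guarded branch. A minimal standalone sketch of that pattern follows; every `demo_` name and the `DEMO_CONFIG_MPX` switch are hypothetical stand-ins, not kernel symbols (the kernel uses boot_cpu_has() and CONFIG_X86_INTEL_MPX).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical runtime answer, standing in for a CPUID-based check. */
bool demo_cpu_reports_mpx = true;

#ifdef DEMO_CONFIG_MPX
/* Feature compiled in: ask the (stand-in) runtime check. */
#define demo_cpu_has_mpx (demo_cpu_reports_mpx)
#else
/* Feature compiled out: a literal 0, so guarded code is dead. */
#define demo_cpu_has_mpx 0
#endif

int demo_mpx_calls;

static void do_some_mpx_thing(void)
{
	/* Only reachable when the feature is compiled in and reported. */
	assert(demo_cpu_has_mpx);
	demo_mpx_calls++;
}

/* Returns how many times the MPX path has run so far. */
int demo_handle_fault(void)
{
	if (demo_cpu_has_mpx)
		do_some_mpx_thing();
	return demo_mpx_calls;
}
```

Built without `-DDEMO_CONFIG_MPX`, the condition is a compile-time 0 and the call is never made, regardless of what the runtime flag says; built with it, the call depends on the runtime flag.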
[PATCH v7 07/10] x86, mpx: decode MPX instruction to get bound violation information
This patch sets the bound violation fields of struct siginfo in the #BR exception handler by decoding the user instruction and constructing the faulting pointer. This patch doesn't use the generic decoder; it implements a limited special-purpose decoder for MPX instructions, simply because the generic decoder is very heavyweight, not just in terms of performance but in terms of interface -- because it has to. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 23 arch/x86/kernel/mpx.c | 299 arch/x86/kernel/traps.c | 6 + 3 files changed, 328 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index b7598ac..780af63 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_X86_64 @@ -44,15 +45,37 @@ #define MPX_BNDSTA_ERROR_CODE 0x3 #define MPX_BD_ENTRY_VALID_FLAG 0x1 +struct mpx_insn { + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + unsigned long mpx_mmap(unsigned long len); #ifdef CONFIG_X86_INTEL_MPX int do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #else static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf) { return -EINVAL; } +static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf) +{ +} #endif /* CONFIG_X86_INTEL_MPX */ #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index f02dcea..c1957a8 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -2,6 +2,275 @@ #include #include +enum reg_type { + REG_TYPE_RM 
= 0, + REG_TYPE_INDEX, + REG_TYPE_BASE, +}; + +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +enum reg_type type) +{ + int regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + static const int regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + if (X86_REX_X(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced be instruction + * for rm=3 returning the content of the rm reg + * for rm!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long indx; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + if (X86_MODRM_MOD(modrm) == 3) { + addr = get_reg(insn, regs, REG_TYPE_RM); + } else { + if (insn->sib.nbytes) { + base = get_reg(insn, regs, REG_TYPE_BASE); + indx = get_reg(insn, regs, REG_TYPE_INDEX); + addr = base + indx * (1 << 
X86_SIB_SCALE(sib)); + } else { +
[PATCH v7 10/10] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some information about Intel MPX. Signed-off-by: Qiaowei Ren --- Documentation/x86/intel_mpx.txt | 127 +++ 1 files changed, 127 insertions(+), 0 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt new file mode 100644 index 0000000..1af9809 --- /dev/null +++ b/Documentation/x86/intel_mpx.txt @@ -0,0 +1,127 @@ +1. Intel(R) MPX Overview + + +Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new +capability introduced into Intel Architecture. Intel MPX provides +hardware features that can be used in conjunction with compiler +changes to check memory references, for those references whose +compile-time normal intentions are usurped at runtime due to +buffer overflow or underflow. + +For more information, please refer to Intel(R) Architecture +Instruction Set Extensions Programming Reference, Chapter 9: +Intel(R) Memory Protection Extensions. + +Note: Currently no hardware with MPX ISA is available but it is always +possible to use SDE (Intel(R) Software Development Emulator) instead, +which can be downloaded from +http://software.intel.com/en-us/articles/intel-software-development-emulator + + +2. How does MPX kernel code work + + +Handling #BR faults caused by MPX +- + +When MPX is enabled, there are 2 new situations that can generate +#BR faults. + * bounds violation caused by MPX instructions. + * new bounds tables (BT) need to be allocated to save bounds. + +We hook the #BR handler to handle these two new situations. + +Decoding MPX instructions +- + +If a #BR is generated due to a bounds violation caused by MPX, +we need to decode MPX instructions to get the violation address and +set this address into the extended struct siginfo. + +The _sigfault field of struct siginfo is extended as follows: + +87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */ +88 struct { +89 void __user *_addr; /* faulting insn/memory ref. 
*/ +90 #ifdef __ARCH_SI_TRAPNO +91 int _trapno; /* TRAP # which caused the signal */ +92 #endif +93 short _addr_lsb; /* LSB of the reported address */ +94 struct { +95 void __user *_lower; +96 void __user *_upper; +97 } _addr_bnd; +98 } _sigfault; + +The '_addr' field refers to the violation address, and the new '_addr_bnd' +field refers to the upper/lower bounds when a #BR is caused. + +Glibc will also be updated to support this new siginfo, so users +can get the violation address and bounds when bounds violations occur. + +Freeing unused bounds tables + + +When a BNDSTX instruction attempts to save bounds to a bounds directory +entry marked as invalid, a #BR is generated. This is an indication that +no bounds table exists for this entry. In this case the fault handler +will allocate a new bounds table on demand. + +Since the kernel allocated those tables on-demand without userspace +knowledge, it is also responsible for freeing them when the associated +mappings go away. + +Here, the solution for this issue is to hook do_munmap() to check +whether one process is MPX enabled. If yes, those bounds tables covered +in the virtual address region which is being unmapped will be freed also. + +Adding new prctl commands +- + +The runtime library in userspace is responsible for allocating the bounds +directory, so the kernel has to use the XSAVE instruction to get the base +of the bounds directory from the BNDCFG register. + +But XSAVE is expected to be very expensive. In order to do performance +optimization, we have to add a new prctl command to get the base of the +bounds directory to be used in future. + +Two new prctl commands are added to register and unregister MPX related +resources. + +155 #define PR_MPX_REGISTER 41 +156 #define PR_MPX_UNREGISTER 42 + +The base of the bounds directory is set into mm_struct during +PR_MPX_REGISTER command execution. This member can be used to +check whether one application is MPX enabled. + + +3. 
Tips +=== + +1) Users are not allowed to create bounds tables and point the bounds +directory at them in userspace. In fact, it is also not necessary +for users to create bounds tables in userspace. + +When a #BR fault is produced due to an invalid entry, a bounds table will be +created in the kernel on demand and the kernel will not transfer this fault to +userspace. So userspace can't receive a #BR fault for an invalid entry, and +it is also not necessary for users to create bounds tables by themselves. + +Certainly users can allocate bounds tables and forcibly point the bounds +directory at them through XSAVE instruction, and then set v
[PATCH v7 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
An MPX-enabled application may create a lot of bounds tables in its process address space to save bounds information. These tables can take up huge swaths of memory (as much as 80% of the memory on the system) even if we clean them up aggressively. Because they are this large, we need a way to track their memory use.

If we want to track them, we essentially have two options:
1. walk the multi-GB (in virtual space) bounds directory to locate all the VMAs and walk them
2. find a way to distinguish MPX bounds-table VMAs from normal anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only need a single bit, and we've chosen to use a VM_ flag. We understand that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory entry for any anonymous VMA that could possibly contain a bounds table. This is less expensive than (1), but still requires reading a pointer out of userspace for every VMA that we iterate over.
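For a sense of scale, the table sizes follow directly from the MPX_BD_ENTRY_OFFSET/SHIFT and MPX_BT_ENTRY_OFFSET/SHIFT constants that asm/mpx.h defines elsewhere in this series: a table with 2^offset entries of 2^shift bytes each occupies 2^(offset+shift) bytes. The sketch below just redoes that arithmetic; the enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from asm/mpx.h in this series; the names are illustrative. */
enum {
	BD_OFFSET_64 = 28, BD_SHIFT_64 = 3,
	BT_OFFSET_64 = 17, BT_SHIFT_64 = 5,
	BD_OFFSET_32 = 20, BD_SHIFT_32 = 2,
	BT_OFFSET_32 = 10, BT_SHIFT_32 = 4,
};

/* Bytes occupied by a table of 2^offset entries, each 2^shift bytes. */
uint64_t mpx_table_bytes(unsigned int offset, unsigned int shift)
{
	assert(offset + shift < 64);
	return UINT64_C(1) << (offset + shift);
}
```

With these inputs the 64-bit bounds directory comes to 2 GB and each bounds table to 4 MB; in 32-bit mode the figures are 4 MB and 16 KB, which agrees with the comment above do_mpx_bt_fault() in patch 04/10.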
Signed-off-by: Qiaowei Ren --- fs/proc/task_mmu.c | 1 + include/linux/mm.h | 2 ++ 2 files changed, 3 insertions(+), 0 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index cfa63ee..b2bc755 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_GROWSDOWN)] = "gd", [ilog2(VM_PFNMAP)] = "pf", [ilog2(VM_DENYWRITE)] = "dw", + [ilog2(VM_MPX)] = "mp", [ilog2(VM_LOCKED)] = "lo", [ilog2(VM_IO)] = "io", [ilog2(VM_SEQ_READ)] = "sr", diff --git a/include/linux/mm.h b/include/linux/mm.h index e03dd29..44c75d7 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -127,6 +127,8 @@ extern unsigned int kobjsize(const void *objp); #define VM_HUGETLB 0x00400000 /* Huge TLB Page VM */ #define VM_NONLINEAR 0x00800000 /* Is non-linear (remap_file_pages) */ #define VM_ARCH_1 0x01000000 /* Architecture-specific flag */ +/* MPX specific bounds table or bounds directory (x86) */ +#define VM_MPX 0x02000000 #define VM_DONTDUMP 0x04000000 /* Do not include in the core dump */ #ifdef CONFIG_MEM_SOFT_DIRTY -- 1.7.1
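With the "mp" flag exported through show_smap_vma_flags(), userspace can total up MPX memory by scanning /proc/$pid/smaps. The following is a rough sketch of that scan over text already read into memory; the function name is hypothetical, and it assumes the usual smaps layout in which each VMA's "Size:" line precedes its "VmFlags:" line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Sum the "Size:" values (in kB) of every VMA whose "VmFlags:" line
 * carries the two-letter "mp" flag. 'text' is smaps content in memory.
 */
unsigned long mpx_kb_from_smaps(const char *text)
{
	unsigned long total = 0, size_kb = 0;
	const char *line = text;

	assert(text != NULL);

	while (line && *line) {
		const char *end = strchr(line, '\n');
		size_t len = end ? (size_t)(end - line) : strlen(line);

		if (len >= 5 && strncmp(line, "Size:", 5) == 0) {
			/* Remember the size of the VMA currently being described. */
			if (sscanf(line + 5, "%lu", &size_kb) != 1)
				size_kb = 0;
		} else if (len >= 8 && strncmp(line, "VmFlags:", 8) == 0) {
			/* Flags are two-letter tokens separated by spaces. */
			const char *p = line + 8;
			const char *stop = line + len;

			while (p < stop) {
				const char *tok;

				while (p < stop && *p == ' ')
					p++;
				tok = p;
				while (p < stop && *p != ' ')
					p++;
				if (p - tok == 2 && tok[0] == 'm' && tok[1] == 'p')
					total += size_kb;
			}
		}
		line = end ? end + 1 : NULL;
	}
	return total;
}
```

Matching whole tokens rather than using a plain substring search avoids counting a VMA just because some other field happens to contain the letters "mp".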
[PATCH v7 06/10] mips: sync struct siginfo with general version
New fields describing bound violations were added to struct siginfo, so this patch syncs the MIPS definition with the generic version to avoid build issues.

Signed-off-by: Qiaowei Ren
---
 arch/mips/include/uapi/asm/siginfo.h |4
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
 			int _trapno;	/* TRAP # which caused the signal */
 #endif
 			short _addr_lsb;
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
 		} _sigfault;
 
 		/* SIGPOLL, SIGXFSZ (To do ...) */
-- 
1.7.1
[PATCH v7 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl() commands. These commands can be used to register and unregister MPX related resource on the x86 platform. The base of the bounds directory is set into mm_struct during PR_MPX_REGISTER command execution. This member can be used to check whether one application is mpx enabled. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h |1 + arch/x86/include/asm/processor.h | 18 arch/x86/kernel/mpx.c| 56 ++ include/linux/mm_types.h |3 ++ include/uapi/linux/prctl.h |6 kernel/sys.c | 12 6 files changed, 96 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 780af63..6cb0853 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -43,6 +43,7 @@ #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) #define MPX_BNDSTA_ERROR_CODE 0x3 +#define MPX_BNDCFG_ENABLE_FLAG 0x1 #define MPX_BD_ENTRY_VALID_FLAG0x1 struct mpx_insn { diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index a4ea023..6e0966e 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -952,6 +952,24 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip, extern int get_tsc_mode(unsigned long adr); extern int set_tsc_mode(unsigned int val); +/* Register/unregister a process' MPX related resource */ +#define MPX_REGISTER(tsk) mpx_register((tsk)) +#define MPX_UNREGISTER(tsk)mpx_unregister((tsk)) + +#ifdef CONFIG_X86_INTEL_MPX +extern int mpx_register(struct task_struct *tsk); +extern int mpx_unregister(struct task_struct *tsk); +#else +static inline int mpx_register(struct task_struct *tsk) +{ + return -EINVAL; +} +static inline int mpx_unregister(struct task_struct *tsk) +{ + return -EINVAL; +} +#endif /* CONFIG_X86_INTEL_MPX */ + extern u16 amd_get_nb_id(int cpu); static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves) diff --git a/arch/x86/kernel/mpx.c 
b/arch/x86/kernel/mpx.c index c1957a8..6b7e526 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -1,6 +1,62 @@ #include #include +#include #include +#include +#include + +/* + * This should only be called when cpuid has been checked + * and we are sure that MPX is available. + */ +static __user void *task_get_bounds_dir(struct task_struct *tsk) +{ + struct xsave_struct *xsave_buf; + + fpu_xsave(&tsk->thread.fpu); + xsave_buf = &(tsk->thread.fpu.state->xsave); + if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG)) + return NULL; + + return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u & + MPX_BNDCFG_ADDR_MASK); +} + +int mpx_register(struct task_struct *tsk) +{ + struct mm_struct *mm = tsk->mm; + + if (!cpu_has_mpx) + return -EINVAL; + + /* +* runtime in the userspace will be responsible for allocation of +* the bounds directory. Then, it will save the base of the bounds +* directory into XSAVE/XRSTOR Save Area and enable MPX through +* XRSTOR instruction. +* +* fpu_xsave() is expected to be very expensive. In order to do +* performance optimization, here we get the base of the bounds +* directory and then save it into mm_struct to be used in future. 
+*/ + mm->bd_addr = task_get_bounds_dir(tsk); + if (!mm->bd_addr) + return -EINVAL; + + pr_debug("MPX BD base address %p\n", mm->bd_addr); + return 0; +} + +int mpx_unregister(struct task_struct *tsk) +{ + struct mm_struct *mm = current->mm; + + if (!cpu_has_mpx) + return -EINVAL; + + mm->bd_addr = NULL; + return 0; +} enum reg_type { REG_TYPE_RM = 0, diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 96c5750..131b5b3 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -454,6 +454,9 @@ struct mm_struct { bool tlb_flush_pending; #endif struct uprobes_state uprobes_state; +#ifdef CONFIG_X86_INTEL_MPX + void __user *bd_addr; /* address of the bounds directory */ +#endif }; static inline void mm_init_cpumask(struct mm_struct *mm) diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 58afc04..ce86fa9 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -152,4 +152,10 @@ #define PR_SET_THP_DISABLE 41 #define PR_GET_THP_DISABLE 42 +/* + * Register/unregister MPX related resource. + */ +#define PR_MPX_REGISTER43 +#define PR_MPX_UNREGISTER 44 + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kern
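From the runtime's point of view, the registration described above amounts to a single prctl() call made after the bounds directory has been allocated and MPX has been enabled through XRSTOR. The sketch below only illustrates the calling convention. The numeric values 43 and 44 are taken from the prctl.h hunk of this patch, the _SKETCH names and mpx_prctl_sketch() are hypothetical, and on a kernel that does not carry this patch the call is expected to fail (typically with EINVAL).

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/*
 * Option values as proposed in this patch. They are assumptions here:
 * a given kernel's <linux/prctl.h> may not define them at all.
 */
#define PR_MPX_REGISTER_SKETCH		43
#define PR_MPX_UNREGISTER_SKETCH	44

/*
 * Hypothetical wrapper: issue the prctl() and fold the result into
 * the kernel-style "0 or negative errno" convention.
 */
int mpx_prctl_sketch(int option)
{
	assert(option == PR_MPX_REGISTER_SKETCH ||
	       option == PR_MPX_UNREGISTER_SKETCH);

	if (prctl(option, 0, 0, 0, 0) == -1)
		return -errno;
	return 0;
}
```

A runtime would call mpx_prctl_sketch(PR_MPX_REGISTER_SKETCH) once at startup and treat a negative return as "the kernel does not manage bounds tables for this process".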
[PATCH v6 00/10] Intel MPX support
This patchset adds support for the Memory Protection Extensions (MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory references, for those references whose compile-time normal intentions are usurped at runtime due to buffer overflow or underflow. MPX provides this capability at very low performance overhead for newly compiled code, and provides compatibility mechanisms with legacy software components.

The MPX architecture is designed to allow a machine to run both MPX-enabled software and legacy software that is MPX-unaware. In such a case, the legacy software does not benefit from MPX, but it also does not experience any change in functionality or reduction in performance.

More information about Intel MPX can be found in "Intel(R) Architecture Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel, binutils, the compiler, and system libraries.

A new GCC option, -fmpx, is introduced to utilize MPX instructions. Currently the GCC compiler sources with MPX support are available in a separate branch in the common GCC SVN repository. See the GCC SVN page (http://gcc.gnu.org/svn.html) for details.

To have full protection, we had to add MPX instrumentation to all the necessary Glibc routines (e.g. memcpy) written in assembler, and compile Glibc with the MPX-enabled GCC compiler. Currently the MPX-enabled Glibc source can be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source code updates, but some runtime code, which is responsible for configuring and enabling MPX, is needed in order to make use of MPX. For most applications this runtime support will be available by linking to a library supplied by the compiler, or possibly it will come directly from the OS once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has mainly two responsibilities: provide handlers for bounds faults (#BR), and manage bounds memory.

The high-level areas modified in the patchset are as follows:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always possible to use SDE (Intel(R) Software Development Emulator) instead, which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
 * check to see if #BR occurred in userspace or kernel space.
 * use generic structure and macro as much as possible when decoding mpx instructions.

Changes since v2:
 * fix some compile warnings.
 * update documentation.

Changes since v3:
 * correct some syntax errors in the documentation, and document the extended struct siginfo.
 * kill the process when the error code of BNDSTATUS is 3.
 * add some comments.
 * remove new prctl() commands.
 * fix some compile warnings for 32-bit.

Changes since v4:
 * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
 * hook the unmap() path to clean up unused bounds tables, and use a new prctl() command to register the bounds directory address in struct mm_struct, so that unmap() can check whether a process is MPX enabled.
 * in order to track MPX memory usage precisely, add an MPX specific mmap interface and a VM_MPX flag to check whether a VMA is an MPX bounds table.
 * add macro cpu_has_mpx to do performance optimization.
 * sync struct siginfo for mips with the general version to avoid a build issue.
Qiaowei Ren (10): x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific x86, mpx: add MPX specific mmap interface x86, mpx: add macro cpu_has_mpx x86, mpx: hook #BR exception handler to allocate bound tables x86, mpx: extend siginfo structure to include bound violation information mips: sync struct siginfo with general version x86, mpx: decode MPX instruction to get bound violation information x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER x86, mpx: cleanup unused bound tables x86, mpx: add documentation on Intel MPX Documentation/x86/intel_mpx.txt | 127 +++ arch/mips/include/uapi/asm/siginfo.h |4 + arch/x86/Kconfig |4 + arch/x86/include/asm/cpufeature.h|6 + arch/x86/include/asm/mmu_context.h | 16 ++ arch/x86/include/asm/mpx.h | 91 arch/x86/include/asm/processor.h | 18 ++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c| 413 ++ arch/x86/kernel/traps.c | 62 +- arch/x86/mm/Makefile |2 + arch/x86/mm/init_64.c|2 + arch/x86/mm/mpx.c| 247 fs/proc/task_mmu.c |1 + include/a
[PATCH v6 03/10] x86, mpx: add macro cpu_has_mpx
In order to do performance optimization, this patch adds the macro cpu_has_mpx, which evaluates directly to 0 when MPX is not supported by the kernel.

Signed-off-by: Qiaowei Ren
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index e265ff9..f302d08 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -339,6 +339,12 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_eager_fpu	boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoext	boot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef cpu_has_vme
-- 
1.7.1
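The point of defining cpu_has_mpx as a literal 0 is that every `if (!cpu_has_mpx)` test becomes a compile-time constant, so the compiler can drop the MPX paths when the feature is configured out. The standalone sketch below mimics the pattern outside the kernel. HAVE_MPX_SKETCH stands in for CONFIG_X86_INTEL_MPX and runtime_probe() stands in for boot_cpu_has(X86_FEATURE_MPX); both names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Counts how often the (stand-in) runtime feature probe is executed. */
int probe_calls;

/* Stand-in for boot_cpu_has(X86_FEATURE_MPX); always reports "present". */
int runtime_probe(void)
{
	probe_calls++;
	return 1;
}

/* Same shape as the cpufeature.h hunk: real probe or constant 0. */
#ifdef HAVE_MPX_SKETCH
#define has_mpx_sketch	runtime_probe()
#else
#define has_mpx_sketch	0
#endif

/* Mirrors the early-exit style used by mpx_register() in this series. */
int register_sketch(void)
{
	if (!has_mpx_sketch)
		return -EINVAL;	/* the only reachable path when stubbed out */
	return 0;
}
```

Built without -DHAVE_MPX_SKETCH, register_sketch() returns -EINVAL and never touches runtime_probe(); built with it, the probe runs on every call.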
[PATCH v6 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
An MPX-enabled application may create a lot of bounds tables in its process address space to save bounds information. These tables can take up huge swaths of memory (as much as 80% of the memory on the system) even if we clean them up aggressively. Because they are this large, we need a way to track their memory use.

If we want to track them, we essentially have two options:

1. Walk the multi-GB (in virtual space) bounds directory to locate all the VMAs and walk them.
2. Find a way to distinguish MPX bounds-table VMAs from normal anonymous VMAs and use some existing mechanism to walk them.

We expect (1) will be prohibitively expensive. For (2), we only need a single bit, and we've chosen to use a VM_ flag. We understand that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory entry for any anonymous VMA that could possibly contain a bounds table. This is less expensive than (1), but still requires reading a pointer out of userspace for every VMA that we iterate over.

Signed-off-by: Qiaowei Ren
---
 arch/x86/mm/init_64.c |2 ++
 fs/proc/task_mmu.c|1 +
 include/linux/mm.h|2 ++
 3 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index f35c66c..2d41679 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1223,6 +1223,8 @@ int in_gate_area_no_mm(unsigned long addr)
 
 const char *arch_vma_name(struct vm_area_struct *vma)
 {
+	if (vma->vm_flags & VM_MPX)
+		return "[mpx]";
 	if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)
 		return "[vdso]";
 	if (vma == &gate_vma)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 442177b..09266bd 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -543,6 +543,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_GROWSDOWN)]	= "gd",
 		[ilog2(VM_PFNMAP)]	= "pf",
 		[ilog2(VM_DENYWRITE)]	= "dw",
+		[ilog2(VM_MPX)]		= "mp",
 		[ilog2(VM_LOCKED)]	= "lo",
 		[ilog2(VM_IO)]		= "io",
 		[ilog2(VM_SEQ_READ)]	= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d677706..029c716 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,8 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB	0x0040	/* Huge TLB Page VM */
 #define VM_NONLINEAR	0x0080	/* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1	0x0100	/* Architecture-specific flag */
+/* MPX specific bounds table or bounds directory (x86) */
+#define VM_MPX		0x0200
 #define VM_DONTDUMP	0x0400	/* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
-- 
1.7.1
[PATCH v6 05/10] x86, mpx: extend siginfo structure to include bound violation information
This patch adds new fields describing a bound violation to the siginfo structure. si_lower and si_upper are, respectively, the lower bound and the upper bound in effect when the bound violation occurred.

Signed-off-by: Qiaowei Ren
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
 			int _trapno;	/* TRAP # which caused the signal */
 #endif
 			short _addr_lsb; /* LSB of the reported address */
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
 		} _sigfault;
 
 		/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno	_sifields._sigfault._trapno
 #endif
 #define si_addr_lsb	_sifields._sigfault._addr_lsb
+#define si_lower	_sifields._sigfault._addr_bnd._lower
+#define si_upper	_sifields._sigfault._addr_bnd._upper
 #define si_band		_sifields._sigpoll._band
 #define si_fd		_sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR	(__SI_FAULT|1)	/* address not mapped to object */
 #define SEGV_ACCERR	(__SI_FAULT|2)	/* invalid permissions for mapped object */
-#define NSIGSEGV	2
+#define SEGV_BNDERR	(__SI_FAULT|3)	/* failed address bound checks */
+#define NSIGSEGV	3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 6ea13c0..0fcf749 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2773,6 +2773,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from)
 		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
 			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+		err |= __put_user(from->si_lower, &to->si_lower);
+		err |= __put_user(from->si_upper, &to->si_upper);
+#endif
 		break;
 	case __SI_CHLD:
 		err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1
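On the receiving side, a SIGSEGV handler installed with SA_SIGINFO could tell a bounds violation apart from an ordinary fault by checking si_code against SEGV_BNDERR and then reading si_lower and si_upper. The sketch below shows only the decoding step. describe_segv() is a hypothetical helper, the fallback value 3 for SEGV_BNDERR is taken from this patch (where __SI_FAULT is 0 as seen from userspace), and the si_lower/si_upper accessors are used only if the installed libc headers provide them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Older headers may lack the new si_code; 3 is the value from this patch. */
#ifndef SEGV_BNDERR
#define SEGV_BNDERR 3
#endif

/*
 * Hypothetical helper: format a one-line description of a SIGSEGV.
 * Returns the snprintf() result. Note that snprintf() is not
 * async-signal-safe, so a real handler would copy the fields out and
 * format them later, outside signal context.
 */
int describe_segv(const siginfo_t *info, char *buf, size_t len)
{
	assert(info != NULL && buf != NULL);

	if (info->si_code != SEGV_BNDERR)
		return snprintf(buf, len, "SIGSEGV si_code=%d", info->si_code);

#if defined(si_lower) && defined(si_upper)
	return snprintf(buf, len,
			"bounds violation: addr=%p lower=%p upper=%p",
			info->si_addr, info->si_lower, info->si_upper);
#else
	/* Headers without the new fields: report the address only. */
	return snprintf(buf, len, "bounds violation: addr=%p", info->si_addr);
#endif
}
```

A handler registered through sigaction() with SA_SIGINFO set would pass its siginfo_t argument straight to such a helper.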
[PATCH v6 02/10] x86, mpx: add MPX specific mmap interface
This patch adds an MPX-specific mmap interface, which only handles MPX-related maps, including bounds tables and the bounds directory.

In order to track MPX-specific memory usage, this interface sets the new vm_flag VM_MPX in the vm_area_struct when it creates a bounds table or bounds directory.

Signed-off-by: Qiaowei Ren
---
 arch/x86/Kconfig |4 +++
 arch/x86/include/asm/mpx.h | 38
 arch/x86/mm/Makefile |2 +
 arch/x86/mm/mpx.c | 58
 4 files changed, 102 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 25d2c6f..0194790 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -237,6 +237,10 @@ config HAVE_INTEL_TXT
 	def_bool y
 	depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+	def_bool y
+	depends on CPU_SUP_INTEL
+
 config X86_32_SMP
 	def_bool y
 	depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include
+#include
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET	28
+#define MPX_BD_ENTRY_SHIFT	3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET	17
+#define MPX_BT_ENTRY_SHIFT	5
+#define MPX_IGN_BITS		3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET	20
+#define MPX_BD_ENTRY_SHIFT	2
+#define MPX_BT_ENTRY_OFFSET	10
+#define MPX_BT_ENTRY_SHIFT	4
+#define MPX_IGN_BITS		2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE	0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA) += srat.o
 obj-$(CONFIG_NUMA_EMU)		+= numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)		+= memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)	+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..546c5d1
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,58 @@
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * this is really a simplified "vm_mmap". it only handles mpx
+ * related maps, including bounds table and bounds directory.
+ *
+ * here we can stick new vm_flag VM_MPX in the vm_area_struct
+ * when creating a bounds table or bounds directory, in order to
+ * track MPX specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+	unsigned long ret;
+	unsigned long addr, pgoff;
+	struct mm_struct *mm = current->mm;
+	vm_flags_t vm_flags;
+
+	/* Only bounds table and bounds directory can be allocated here */
+	if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+		return -EINVAL;
+
+	down_write(&mm->mmap_sem);
+
+	/* Too many mappings? */
+	if (mm->map_count > sysctl_max_map_count) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	/* Obtain the address to map to. we verify (or select) it and ensure
+	 * that it represents a valid section of the address space.
+	 */
+	addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+	if (addr & ~PAGE_MASK) {
+		ret = addr;
+		goto out;
+	}
+
+	vm_flags = VM_READ | VM_WRITE | VM_MPX |
+			mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+	/* Make bounds tables and bounds directory unlocked. */
+	if (vm_flags & VM_LOCKED)
+		vm_flags &= ~VM_LOCKED;
+
+	/* Set pgoff according to addr for anon_vma */
+	pgoff = addr >> PAGE_SHIFT;
+
+	ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+
+out:
+	up_write(&mm->mmap_sem);
+	return ret;
+}
-- 
1.7.1
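To make the 64-bit constants from the mpx.h hunk concrete: the bounds directory is 1 << (28 + 3) bytes (2 GB of virtual space) and each bounds table is 1 << (17 + 5) bytes (4 MB), which are the only two lengths mpx_mmap() accepts. The userspace sketch below reproduces that arithmetic. The two offset helpers are an assumption derived from the comments in the hunk (address bits [47:20] index the directory, bits [19:3] index a table) and are not code from this patch; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit layout constants, copied from the mpx.h hunk of this patch. */
#define BD_ENTRY_OFFSET	28
#define BD_ENTRY_SHIFT	3
#define BT_ENTRY_OFFSET	17
#define BT_ENTRY_SHIFT	5
#define IGN_BITS	3

#define BD_SIZE_BYTES	(UINT64_C(1) << (BD_ENTRY_OFFSET + BD_ENTRY_SHIFT))
#define BT_SIZE_BYTES	(UINT64_C(1) << (BT_ENTRY_OFFSET + BT_ENTRY_SHIFT))

/* Same length check as at the top of mpx_mmap(). */
int mpx_len_ok(uint64_t len)
{
	return len == BD_SIZE_BYTES || len == BT_SIZE_BYTES;
}

/* Byte offset of the directory entry covering addr (bits [47:20]). */
uint64_t bd_entry_offset(uint64_t addr)
{
	uint64_t index = (addr >> (BT_ENTRY_OFFSET + IGN_BITS)) &
			 ((UINT64_C(1) << BD_ENTRY_OFFSET) - 1);

	return index << BD_ENTRY_SHIFT;	/* 8 bytes per directory entry */
}

/* Byte offset of the table entry covering addr (bits [19:3]). */
uint64_t bt_entry_offset(uint64_t addr)
{
	uint64_t index = (addr >> IGN_BITS) &
			 ((UINT64_C(1) << BT_ENTRY_OFFSET) - 1);

	return index << BT_ENTRY_SHIFT;	/* 32 bytes per table entry */
}
```

For the address 0x7f0012345678, this gives a directory offset of 0x3f800918 and a table offset of 0x1159e0, both of which fall inside the 2 GB and 4 MB regions respectively.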
[PATCH v6 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl() commands. These commands can be used to register and unregister MPX related resource on the x86 platform. The base of the bounds directory is set into mm_struct during PR_MPX_REGISTER command execution. This member can be used to check whether one application is mpx enabled. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h |1 + arch/x86/include/asm/processor.h | 18 arch/x86/kernel/mpx.c| 56 ++ include/linux/mm_types.h |3 ++ include/uapi/linux/prctl.h |6 kernel/sys.c | 12 6 files changed, 96 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 780af63..6cb0853 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -43,6 +43,7 @@ #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) #define MPX_BNDSTA_ERROR_CODE 0x3 +#define MPX_BNDCFG_ENABLE_FLAG 0x1 #define MPX_BD_ENTRY_VALID_FLAG0x1 struct mpx_insn { diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index a4ea023..6e0966e 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -952,6 +952,24 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip, extern int get_tsc_mode(unsigned long adr); extern int set_tsc_mode(unsigned int val); +/* Register/unregister a process' MPX related resource */ +#define MPX_REGISTER(tsk) mpx_register((tsk)) +#define MPX_UNREGISTER(tsk)mpx_unregister((tsk)) + +#ifdef CONFIG_X86_INTEL_MPX +extern int mpx_register(struct task_struct *tsk); +extern int mpx_unregister(struct task_struct *tsk); +#else +static inline int mpx_register(struct task_struct *tsk) +{ + return -EINVAL; +} +static inline int mpx_unregister(struct task_struct *tsk) +{ + return -EINVAL; +} +#endif /* CONFIG_X86_INTEL_MPX */ + extern u16 amd_get_nb_id(int cpu); static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves) diff --git a/arch/x86/kernel/mpx.c 
b/arch/x86/kernel/mpx.c index 650b282..d8a2a09 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -1,6 +1,62 @@ #include #include +#include #include +#include +#include + +/* + * This should only be called when cpuid has been checked + * and we are sure that MPX is available. + */ +static __user void *task_get_bounds_dir(struct task_struct *tsk) +{ + struct xsave_struct *xsave_buf; + + fpu_xsave(&tsk->thread.fpu); + xsave_buf = &(tsk->thread.fpu.state->xsave); + if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG)) + return NULL; + + return (void __user *)(xsave_buf->bndcsr.cfg_reg_u & + MPX_BNDCFG_ADDR_MASK); +} + +int mpx_register(struct task_struct *tsk) +{ + struct mm_struct *mm = tsk->mm; + + if (!cpu_has_mpx) + return -EINVAL; + + /* +* runtime in the userspace will be responsible for allocation of +* the bounds directory. Then, it will save the base of the bounds +* directory into XSAVE/XRSTOR Save Area and enable MPX through +* XRSTOR instruction. +* +* fpu_xsave() is expected to be very expensive. In order to do +* performance optimization, here we get the base of the bounds +* directory and then save it into mm_struct to be used in future. 
+*/ + mm->bd_addr = task_get_bounds_dir(tsk); + if (!mm->bd_addr) + return -EINVAL; + + pr_debug("MPX BD base address %p\n", mm->bd_addr); + return 0; +} + +int mpx_unregister(struct task_struct *tsk) +{ + struct mm_struct *mm = current->mm; + + if (!cpu_has_mpx) + return -EINVAL; + + mm->bd_addr = NULL; + return 0; +} typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t; static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 8967e20..54b8011 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -454,6 +454,9 @@ struct mm_struct { bool tlb_flush_pending; #endif struct uprobes_state uprobes_state; +#ifdef CONFIG_X86_INTEL_MPX + void __user *bd_addr; /* address of the bounds directory */ +#endif }; static inline void mm_init_cpumask(struct mm_struct *mm) diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 58afc04..ce86fa9 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -152,4 +152,10 @@ #define PR_SET_THP_DISABLE 41 #define PR_GET_THP_DISABLE 42 +/* + * Register/unregister MPX related resource. + */ +#define PR_MPX_REGISTER43 +#define
[PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information
This patch sets bound violation fields of siginfo struct in #BR exception handler by decoding the user instruction and constructing the faulting pointer. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 23 arch/x86/kernel/mpx.c | 294 arch/x86/kernel/traps.c|6 + 3 files changed, 323 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index b7598ac..780af63 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_X86_64 @@ -44,15 +45,37 @@ #define MPX_BNDSTA_ERROR_CODE 0x3 #define MPX_BD_ENTRY_VALID_FLAG0x1 +struct mpx_insn { + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + unsigned long mpx_mmap(unsigned long len); #ifdef CONFIG_X86_INTEL_MPX int do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #else static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf) { return -EINVAL; } +static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf) +{ +} #endif /* CONFIG_X86_INTEL_MPX */ #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index 4230c7b..650b282 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -2,6 +2,270 @@ #include #include +typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t; +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +reg_type_t type) +{ + int regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned 
char)insn->sib.value; + + static const int regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + if (X86_REX_X(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced be instruction + * for rm=3 returning the content of the rm reg + * for rm!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long indx; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + if (X86_MODRM_MOD(modrm) == 3) { + addr = get_reg(insn, regs, REG_TYPE_RM); + } else { + if (insn->sib.nbytes) { + base = get_reg(insn, regs, REG_TYPE_BASE); + indx = get_reg(insn, regs, REG_TYPE_INDEX); + addr = base + indx * (1 << X86_SIB_SCALE(sib)); + } else { + addr = get_reg(insn, regs, REG_TYPE_RM); + } + addr += insn->displacement.value; + } + + return addr; +} + +/* Verify next sizeof(t) bytes can be on the same instruction */ +#define validate_next(t, insn, n) \ + ((insn)-&
[PATCH v6 09/10] x86, mpx: cleanup unused bound tables
When user memory region is unmapped, related bound tables become unused and need to be released also. This patch cleanups these unused bound tables through hooking unmap path. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mmu_context.h | 16 +++ arch/x86/include/asm/mpx.h |9 ++ arch/x86/mm/mpx.c | 189 include/asm-generic/mmu_context.h |6 + mm/mmap.c |2 + 5 files changed, 222 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index be12c53..af70d4f 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -6,6 +6,7 @@ #include #include #include +#include #ifndef CONFIG_PARAVIRT #include @@ -96,4 +97,19 @@ do { \ } while (0) #endif +static inline void arch_unmap(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ +#ifdef CONFIG_X86_INTEL_MPX + /* +* Check whether this vma comes from MPX-enabled application. +* If so, release this vma related bound tables. 
+*/ + if (mm->bd_addr && !(vma->vm_flags & VM_MPX)) + mpx_unmap(mm, start, end); + +#endif +} + #endif /* _ASM_X86_MMU_CONTEXT_H */ diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 6cb0853..e848a74 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -42,6 +42,13 @@ #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT)) #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT)) +#define MPX_BD_ENTRY_MASK ((1<>(MPX_BT_ENTRY_OFFSET+ \ + MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT) +#define MPX_GET_BT_ENTRY_OFFSET(addr) addr)>>MPX_IGN_BITS) & \ + MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT) + #define MPX_BNDSTA_ERROR_CODE 0x3 #define MPX_BNDCFG_ENABLE_FLAG 0x1 #define MPX_BD_ENTRY_VALID_FLAG0x1 @@ -63,6 +70,8 @@ struct mpx_insn { #define MAX_MPX_INSN_SIZE 15 unsigned long mpx_mmap(unsigned long len); +void mpx_unmap(struct mm_struct *mm, + unsigned long start, unsigned long end); #ifdef CONFIG_X86_INTEL_MPX int do_mpx_bt_fault(struct xsave_struct *xsave_buf); diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c index 546c5d1..fd05cd4 100644 --- a/arch/x86/mm/mpx.c +++ b/arch/x86/mm/mpx.c @@ -2,6 +2,7 @@ #include #include #include +#include #include /* @@ -56,3 +57,191 @@ out: up_write(&mm->mmap_sem); return ret; } + +/* + * Get the base of bounds tables pointed by specific bounds + * directory entry. + */ +static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr, + unsigned int *valid) +{ + if (get_user(*bt_addr, bd_entry)) + return -EFAULT; + + *valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG; + *bt_addr &= MPX_BT_ADDR_MASK; + + /* +* If this bounds directory entry is nonzero, and meanwhile +* the valid bit is zero, one SIGSEGV will be produced due to +* this unexpected situation. 
+*/ + if (!(*valid) && *bt_addr) + force_sig(SIGSEGV, current); + + pr_debug("get_bt: BD Entry (%p) - Table (%lx,%d)\n", + bd_entry, *bt_addr, *valid); + return 0; +} + +/* + * Free the backing physical pages of bounds table 'bt_addr'. + * Assume start...end is within that bounds table. + */ +static void zap_bt_entries(struct mm_struct *mm, unsigned long bt_addr, + unsigned long start, unsigned long end) +{ + struct vm_area_struct *vma; + + /* Find the vma which overlaps this bounds table */ + vma = find_vma(mm, bt_addr); + if (!vma || vma->vm_start > bt_addr || + vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES) + return; + + zap_page_range(vma, start, end, NULL); + pr_debug("Bound table de-allocation %lx (%lx, %lx)\n", + bt_addr, start, end); +} + +static void unmap_single_bt(struct mm_struct *mm, long __user *bd_entry, + unsigned long bt_addr) +{ + if (user_atomic_cmpxchg_inatomic(&bt_addr, bd_entry, + bt_addr | MPX_BD_ENTRY_VALID_FLAG, 0)) + return; + + pr_debug("Bound table de-allocation %lx at entry addr %p\n", + bt_addr, bd_entry); + /* +* to avoid recursion, do_munmap() will check whether it comes +* from one bounds table through VM_MPX flag. +*/ + do_munmap(mm, bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES); +} + +/* + * If the bounds table pointed by bounds directory 'bd_entr
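For readers following the address arithmetic in this patch, here is a minimal user-space sketch of how an address is split into a bounds-directory offset and a bounds-table offset. The constants are the 64-bit values quoted elsewhere in this thread (28/3 for the directory, 17/5 for the table, 3 ignored bits). The macro bodies in the archived copy of the patch are damaged, so the shift-and-mask form below is an assumption reconstructed from the surviving fragments, and the function names are illustrative, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit layout constants as quoted in this thread. */
#define MPX_BD_ENTRY_OFFSET 28	/* address bits that select a directory entry */
#define MPX_BD_ENTRY_SHIFT   3	/* log2 of a directory entry size (8 bytes) */
#define MPX_BT_ENTRY_OFFSET 17	/* address bits that select a table entry */
#define MPX_BT_ENTRY_SHIFT   5	/* log2 of a table entry size (32 bytes) */
#define MPX_IGN_BITS         3	/* low address bits ignored by the lookup */

/* Assumed mask definitions; the archived macro text is damaged. */
#define MPX_BD_ENTRY_MASK ((UINT64_C(1) << MPX_BD_ENTRY_OFFSET) - 1)
#define MPX_BT_ENTRY_MASK ((UINT64_C(1) << MPX_BT_ENTRY_OFFSET) - 1)

/* The three fields together cover a 48-bit virtual address. */
static_assert(MPX_BD_ENTRY_OFFSET + MPX_BT_ENTRY_OFFSET + MPX_IGN_BITS == 48,
	      "directory index + table index + ignored bits must be 48");

/* Byte offset of the directory entry that covers addr. */
static uint64_t bd_entry_offset(uint64_t addr)
{
	return ((addr >> (MPX_BT_ENTRY_OFFSET + MPX_IGN_BITS)) &
		MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT;
}

/* Byte offset, inside its bounds table, of the entry that covers addr. */
static uint64_t bt_entry_offset(uint64_t addr)
{
	return ((addr >> MPX_IGN_BITS) & MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT;
}

/* Total size of the bounds directory in bytes. */
static uint64_t bd_size_bytes(void)
{
	return UINT64_C(1) << (MPX_BD_ENTRY_OFFSET + MPX_BD_ENTRY_SHIFT);
}

/* Total size of one bounds table in bytes. */
static uint64_t bt_size_bytes(void)
{
	return UINT64_C(1) << (MPX_BT_ENTRY_OFFSET + MPX_BT_ENTRY_SHIFT);
}
```

With these values the directory is 2GB and each table is 4MB, which agrees with the sizes stated in the comment above do_mpx_bt_fault() in patch 04/10.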
[PATCH v6 10/10] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..1af9809
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+========================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How the MPX kernel code works
+================================
+
+Handling #BR faults caused by MPX
+---------------------------------
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook the #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-------------------------
+
+If a #BR is generated due to a bounds violation caused by MPX, we
+need to decode the MPX instruction to get the violation address and
+set this address into the extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+	/* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+	struct {
+		void __user *_addr; /* faulting insn/memory ref. */
+#ifdef __ARCH_SI_TRAPNO
+		int _trapno;	/* TRAP # which caused the signal */
+#endif
+		short _addr_lsb; /* LSB of the reported address */
+		struct {
+			void __user *_lower;
+			void __user *_upper;
+		} _addr_bnd;
+	} _sigfault;
+
+The '_addr' field refers to the violation address, and the new
+'_addr_bnd' field refers to the upper/lower bounds in effect when
+the #BR was raised.
+
+Glibc will also be updated to support this new siginfo, so users
+can get the violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+----------------------------
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+The solution is to hook do_munmap() and check whether the process is
+MPX enabled. If so, the bounds tables covering the virtual address
+region that is being unmapped are freed as well.
+
+Adding new prctl commands
+-------------------------
+
+The runtime library in userspace is responsible for allocating the
+bounds directory, so the kernel has to use the XSAVE instruction to
+get the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. As a performance
+optimization, we add a new prctl command that hands the base of the
+bounds directory to the kernel for later use.
+
+Two new prctl commands are added to register and unregister MPX related
+resources.
+
+#define PR_MPX_REGISTER	41
+#define PR_MPX_UNREGISTER	42
+
+The base of the bounds directory is stored in mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether an application is MPX enabled.
+
+
+3. Tips
+=======
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them in userspace. In fact, it is also not necessary
+for users to create bounds tables in userspace.
+
+When a #BR fault is produced due to an invalid entry, the bounds table
+will be created in the kernel on demand and the kernel will not pass
+this fault on to userspace. So userspace can't receive a #BR fault for
+an invalid entry, and users do not need to create bounds tables
+themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set v
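To make the siginfo extension described in this patch more concrete, here is a minimal sketch of how a consumer could interpret the three reported values. It does not use the real siginfo_t; struct bnd_fault is a hypothetical mirror of the _addr, _addr_bnd._lower and _addr_bnd._upper fields, and the assumption that both bounds are inclusive is mine, not something this thread states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the extended _sigfault fields (not the uapi type). */
struct bnd_fault {
	uintptr_t addr;		/* _addr: address that violated the bounds */
	uintptr_t lower;	/* _addr_bnd._lower */
	uintptr_t upper;	/* _addr_bnd._upper */
};

enum bnd_kind {
	BND_IN_RANGE,		/* addr is inside [lower, upper] */
	BND_BELOW_LOWER,	/* addr < lower: underflow */
	BND_ABOVE_UPPER,	/* addr > upper: overflow */
};

/* Classify a reported fault, assuming both bounds are inclusive. */
static enum bnd_kind classify_bnd_fault(const struct bnd_fault *f)
{
	assert(f != NULL);

	if (f->addr < f->lower)
		return BND_BELOW_LOWER;
	if (f->addr > f->upper)
		return BND_ABOVE_UPPER;
	return BND_IN_RANGE;
}
```

A signal handler installed with SA_SIGINFO could copy the three values out of its siginfo argument into such a structure and log which side of the buffer was overrun.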
[PATCH v6 06/10] mips: sync struct siginfo with general version
New fields describing bound violations were added to the generic
struct siginfo. This patch syncs the MIPS copy of struct siginfo with
the generic version to avoid build failures.

Signed-off-by: Qiaowei Ren
---
 arch/mips/include/uapi/asm/siginfo.h |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
 			int _trapno;	/* TRAP # which caused the signal */
 #endif
 			short _addr_lsb;
+			struct {
+				void __user *_lower;
+				void __user *_upper;
+			} _addr_bnd;
 		} _sigfault;
 
 		/* SIGPOLL, SIGXFSZ (To do ...) */
-- 
1.7.1
[PATCH v6 04/10] x86, mpx: hook #BR exception handler to allocate bound tables
This patch handles a #BR exception for non-existent tables by carving the space out of the normal processes address space (essentially calling mmap() from inside the kernel) and then pointing the bounds-directory over to it. The tables need to be accessed and controlled by userspace because the compiler generates instructions for MPX-enabled code which frequently store and retrieve entries from the bounds tables. Any direct kernel involvement (like a syscall) to access the tables would destroy performance since these are so frequent. The tables are carved out of userspace because we have no better spot to put them. For each pointer which is being tracked by MPX, the bounds tables contain 4 longs worth of data, and the tables are indexed virtually. If we were to preallocate the tables, we would theoretically need to allocate 4x the virtual space that we have available for userspace somewhere else. We don't have that room in the kernel address space. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 20 ++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 63 arch/x86/kernel/traps.c| 56 ++- 4 files changed, 139 insertions(+), 1 deletions(-) create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 5725ac4..b7598ac 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -18,6 +18,8 @@ #define MPX_BT_ENTRY_SHIFT 5 #define MPX_IGN_BITS 3 +#define MPX_BD_ENTRY_TAIL 3 + #else #define MPX_BD_ENTRY_OFFSET20 @@ -26,13 +28,31 @@ #define MPX_BT_ENTRY_SHIFT 4 #define MPX_IGN_BITS 2 +#define MPX_BD_ENTRY_TAIL 2 + #endif +#define MPX_BNDSTA_TAIL2 +#define MPX_BNDCFG_TAIL12 +#define MPX_BNDSTA_ADDR_MASK (~((1UL< +#include +#include + +static int allocate_bt(long __user *bd_entry) +{ + unsigned long bt_addr, old_val = 0; + int ret = 0; + + bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES); + if (IS_ERR((void *)bt_addr)) { + pr_err("Bounds table allocation failed at entry addr %p\n", + bd_entry); + return 
bt_addr; + } + bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG; + + ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr); + if (ret) + goto out; + + /* +* there is a existing bounds table pointed at this bounds +* directory entry, and so we need to free the bounds table +* allocated just now. +*/ + if (old_val) + goto out; + + pr_debug("Allocate bounds table %lx at entry %p\n", + bt_addr, bd_entry); + return 0; + +out: + vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES); + return ret; +} + +/* + * When a BNDSTX instruction attempts to save bounds to a BD entry + * with the lack of the valid bit being set, a #BR is generated. + * This is an indication that no BT exists for this entry. In this + * case the fault handler will allocate a new BT. + * + * With 32-bit mode, the size of BD is 4MB, and the size of each + * bound table is 16KB. With 64-bit mode, the size of BD is 2GB, + * and the size of each bound table is 4MB. + */ +int do_mpx_bt_fault(struct xsave_struct *xsave_buf) +{ + unsigned long status; + unsigned long bd_entry, bd_base; + + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + status = xsave_buf->bndcsr.status_reg; + + bd_entry = status & MPX_BNDSTA_ADDR_MASK; + if ((bd_entry < bd_base) || + (bd_entry >= bd_base + MPX_BD_SIZE_BYTES)) + return -EINVAL; + + return allocate_bt((long __user *)bd_entry); +} diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index f73b5d4..35b9b29 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -59,6 +59,7 @@ #include #include #include +#include #ifdef CONFIG_X86_64 #include @@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error,FPE_INTDIV, regs->ip ) DO_ERROR (X86_TRAP_OF, SIGSEGV, "overflow", overflow ) -DO_ERROR (X86_TRAP_BR, SIGSEGV, "bounds", bounds ) DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip ) 
DO_ERROR (X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun", coprocessor_segment_
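The race handling in allocate_bt() above (allocate first, publish with a compare-and-exchange, and free the new table if another thread already published one) can be sketched in user space as follows. This is only an illustration under stated assumptions: C11 atomics stand in for user_atomic_cmpxchg_inatomic(), calloc()/free() stand in for mpx_mmap()/vm_munmap(), the table size is a toy value, and install_bt() with its 1/0/-1 return convention is a hypothetical helper, not the patch's interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define BD_ENTRY_VALID_FLAG	((uintptr_t)0x1)
#define TOY_BT_SIZE_BYTES	4096	/* toy size, not the real 4MB/16KB */

/*
 * Try to install a freshly allocated table into an empty directory entry.
 * Returns 1 if this call installed its table, 0 if the entry was already
 * populated (the new table is freed again), and -1 if allocation failed.
 */
static int install_bt(_Atomic uintptr_t *bd_entry)
{
	uintptr_t expected = 0;
	uintptr_t desired;
	void *bt;

	/* Zeroed memory, like a fresh anonymous mapping. */
	bt = calloc(1, TOY_BT_SIZE_BYTES);
	if (!bt)
		return -1;

	/* Allocator alignment keeps bit 0 clear, so it can carry the flag. */
	desired = (uintptr_t)bt | BD_ENTRY_VALID_FLAG;

	/* Publish only if the entry is still empty. */
	if (atomic_compare_exchange_strong(bd_entry, &expected, desired))
		return 1;

	/* Somebody else won the race: drop the table we just allocated. */
	free(bt);
	return 0;
}
```

Allocating before the compare-and-exchange means the entry is never observed half-initialized; the cost is one wasted allocation for the loser of a race, which is the same trade-off the patch makes.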
[PATCH v5 2/3] x86, mpx: hook #BR exception handler to allocate bound tables
An access to an invalid bound directory entry will cause a #BR exception. This patch hook #BR exception handler to allocate one bound table and bind it with that buond directory entry. This will avoid the need of forwarding the #BR exception to the user space when bound directory has invalid entry. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 35 +++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 46 arch/x86/kernel/traps.c| 56 +++- 4 files changed, 137 insertions(+), 1 deletions(-) create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h new file mode 100644 index 000..d9a61a8 --- /dev/null +++ b/arch/x86/include/asm/mpx.h @@ -0,0 +1,35 @@ +#ifndef _ASM_X86_MPX_H +#define _ASM_X86_MPX_H + +#include +#include + +#ifdef CONFIG_X86_64 + +#define MPX_L1_BITS28 +#define MPX_L1_SHIFT 3 +#define MPX_L2_BITS17 +#define MPX_L2_SHIFT 5 +#define MPX_IGN_BITS 3 +#define MPX_L2_NODE_ADDR_MASK 0xfff8UL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#else + +#define MPX_L1_BITS20 +#define MPX_L1_SHIFT 2 +#define MPX_L2_BITS10 +#define MPX_L2_SHIFT 4 +#define MPX_IGN_BITS 2 +#define MPX_L2_NODE_ADDR_MASK 0xfffcUL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#endif + +bool do_mpx_bt_fault(struct xsave_struct *xsave_buf); + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index cb648c8..becb970 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o obj-y += process.o obj-y += i387.o xsave.o +obj-y += mpx.o obj-y += ptrace.o obj-$(CONFIG_X86_32) += tls.o obj-$(CONFIG_IA32_EMULATION) += tls.o diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c new file mode 100644 index 000..f1a16c0 --- /dev/null +++ b/arch/x86/kernel/mpx.c @@ -0,0 +1,46 @@ +#include +#include +#include 
+#include +#include +#include +#include +#include + +static bool allocate_bt(unsigned long bd_entry) +{ + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); + unsigned long bt_addr, old_val = 0; + + bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0); + if (bt_addr == -1) { + pr_err("L2 Node Allocation Failed at L1 addr %lx\n", + bd_entry); + return false; + } + bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01; + + user_atomic_cmpxchg_inatomic(&old_val, + (long __user *)bd_entry, 0, bt_addr); + if (old_val) + vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size); + + return true; +} + +bool do_mpx_bt_fault(struct xsave_struct *xsave_buf) +{ + unsigned long status; + unsigned long bd_entry, bd_base; + unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT); + + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + status = xsave_buf->bndcsr.status_reg; + + bd_entry = status & MPX_BNDSTA_ADDR_MASK; + if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size)) + return allocate_bt(bd_entry); + + return false; +} diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 57409f6..b894f09 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -59,6 +59,7 @@ #include #include #include +#include #ifdef CONFIG_X86_64 #include @@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error,FPE_INTDIV, regs->ip ) DO_ERROR (X86_TRAP_OF, SIGSEGV, "overflow", overflow ) -DO_ERROR (X86_TRAP_BR, SIGSEGV, "bounds", bounds ) DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip ) DO_ERROR (X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun ) DO_ERROR (X86_TRAP_TS, SIGSEGV, "invalid TSS", invalid_TSS ) @@ -263,6 +263,60 @@ dotraplinkage void do_double_fault(struct pt_regs *reg
[PATCH v5 3/3] x86, mpx: extend siginfo structure to include bound violation information
This patch adds new fields about bound violation into siginfo structure. si_lower and si_upper are respectively lower bound and upper bound when bound violation is caused. These fields will be set in #BR exception handler by decoding the user instruction and constructing the faulting pointer. A userspace application can get violation address, lower bound and upper bound for bound violation from this new siginfo structure. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 19 +++ arch/x86/kernel/mpx.c | 289 arch/x86/kernel/traps.c|6 + include/uapi/asm-generic/siginfo.h |9 +- kernel/signal.c|4 + 5 files changed, 326 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index d9a61a8..3296052 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_X86_64 @@ -30,6 +31,24 @@ #endif +struct mpx_insn { + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + bool do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index f1a16c0..7c0e36c 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -7,6 +7,270 @@ #include #include +typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t; +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +reg_type_t type) +{ + int regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + static const int 
regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + if (X86_REX_X(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced be instruction + * for rm=3 returning the content of the rm reg + * for rm!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long indx; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + if (X86_MODRM_MOD(modrm) == 3) { + addr = get_reg(insn, regs, REG_TYPE_RM); + } else { + if (insn->sib.nbytes) { + base = get_reg(insn, regs, REG_TYPE_BASE); + indx = get_reg(insn, regs, REG_TYPE_INDEX); + addr = base + indx * (1 << X86_SIB_SCALE(sib)); + } else { + addr = get_reg(insn, regs, REG_TYPE_RM); + } + addr += insn->displacement.value; + } + + return addr; +} + +/* Verify next sizeof(t) bytes can be on the same instruction */ +#define validate_next(t, insn, n) \ + ((insn)->next_byte + sizeof(t)
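As a companion to get_reg() and get_addr_ref() above, here is a small stand-alone sketch of the same ModRM/SIB arithmetic. It follows the simplified logic of the patch (it does not handle special encodings such as a missing index register or RIP-relative addressing). The field extraction matches the standard x86 encoding, while struct toy_operand, effective_addr() and the plain register array are hypothetical stand-ins for struct mpx_insn and struct pt_regs.

```c
#include <assert.h>
#include <stdint.h>

/* Standard x86 ModRM and SIB bit fields. */
static unsigned modrm_mod(uint8_t m) { return (m >> 6) & 0x3; }
static unsigned modrm_rm(uint8_t m)  { return m & 0x7; }
static unsigned sib_scale(uint8_t s) { return (s >> 6) & 0x3; }
static unsigned sib_index(uint8_t s) { return (s >> 3) & 0x7; }
static unsigned sib_base(uint8_t s)  { return s & 0x7; }

/* Hypothetical, reduced view of a decoded memory operand. */
struct toy_operand {
	uint8_t modrm;
	uint8_t sib;
	int has_sib;	/* non-zero if a SIB byte is present */
	int rex_b;	/* REX.B: extends the rm or base register */
	int rex_x;	/* REX.X: extends the index register */
	int32_t disp;	/* sign-extended displacement, 0 if none */
};

/*
 * regs[] is indexed in encoding order: ax, cx, dx, bx, sp, bp, si, di,
 * r8..r15, the same order as the regoff[] table in the patch.
 */
static uint64_t effective_addr(const struct toy_operand *op,
			       const uint64_t regs[16])
{
	uint64_t addr;

	/* mod == 3 means a register operand: return its content. */
	if (modrm_mod(op->modrm) == 3)
		return regs[modrm_rm(op->modrm) + (op->rex_b ? 8 : 0)];

	if (op->has_sib) {
		uint64_t base = regs[sib_base(op->sib) + (op->rex_b ? 8 : 0)];
		uint64_t indx = regs[sib_index(op->sib) + (op->rex_x ? 8 : 0)];

		addr = base + indx * (UINT64_C(1) << sib_scale(op->sib));
	} else {
		addr = regs[modrm_rm(op->modrm) + (op->rex_b ? 8 : 0)];
	}

	/* The displacement is signed, so sign-extend before adding. */
	return addr + (uint64_t)(int64_t)op->disp;
}
```

For example, ModRM 0x44 with SIB 0x88 and an 8-byte displacement encodes 8(%rax,%rcx,4), so the result is ax + cx * 4 + 8.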
[PATCH v5 1/3] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren
---
 Documentation/x86/intel_mpx.txt |  239 +++
 1 files changed, 239 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..9af8636
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,239 @@
+1. Intel(R) MPX Overview
+========================
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. MPX architecture is designed to
+allow a machine (i.e., the processor(s) and the OS software)
+to run both MPX enabled software and legacy software that
+is MPX unaware. In such a case, the legacy software does not
+benefit from MPX, but it also does not experience any change
+in functionality or reduction in performance.
+
+Intel(R) MPX Programming Model
+------------------------------
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+--------------------------------------------
+
+Intel MPX defines two instructions for load/store of the linear
+address of a pointer to a buffer, along with the bounds of the
+buffer into a paging structure of extended bounds. Specifically
+when storing extended bounds, the processor will perform address
+translation of the address where the pointer is stored to an
+address in the Bound Table (BT) to determine the store location
+of extended bounds. Loading an extended bound performs the
+reverse sequence.
+
+The structure in memory to load/store an extended bound is a
+4-tuple consisting of lower bound, upper bound, pointer value
+and a reserved field. Bound loads and stores access 32-bit or
+64-bit operand size according to the operation mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. The linear address of the bound
+directory is derived from either BNDCFGU or BNDCFGS registers.
+Bounds in memory are stored in Bound Tables (BT) as extended
+bounds, which are accessed via the Bound Directory (BD) and the
+address translation performed by BNDLDX/BNDSTX instructions.
+
+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.
+
+XSAVE/XRSTOR Support of Intel MPX State
+---------------------------------------
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+(BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 are
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+
+2. How to get the advantage of MPX
+==================================
+
+
+To get the advantage of MPX, changes are required in
+the OS kernel, binutils, compiler, and system libraries.
+
+MPX support in the GNU toolchain
+--------------------------------
+
+This section describes changes in GNU Binutils, GCC and Glibc
+to support MPX.
+
+The first step of MPX support is to implement support for new
+hardware features in binutils and GCC.
+
+The second step is implementation of an MPX instrumentation pass
+in the GCC compiler which is responsible for instrumenting all
+memory accesses with pointer checks. Compiler changes for runtime
+bound checks include:
+
+  * Bounds creation for statically alloca
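The bound table entry layout and the two XCR0 bits described in the documentation above can be written down as a short sketch. The struct and function names are illustrative only. The field order follows the 4-tuple as listed in the text, and the XCR0 bit positions (3 for BNDREGS, 4 for BNDCSR) come from the Intel architecture manuals rather than from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bound table entry in 64-bit mode: a 4-tuple of 64-bit fields. */
struct bt_entry64 {
	uint64_t lower;		/* lower bound */
	uint64_t upper;		/* upper bound */
	uint64_t ptr;		/* pointer value the bounds belong to */
	uint64_t reserved;
};

/* The same entry in 32-bit mode: a 4-tuple of 32-bit fields. */
struct bt_entry32 {
	uint32_t lower;
	uint32_t upper;
	uint32_t ptr;
	uint32_t reserved;
};

/* 4*64 bits and 4*32 bits, as stated in the documentation. */
static_assert(sizeof(struct bt_entry64) == 32, "64-bit entry is 32 bytes");
static_assert(sizeof(struct bt_entry32) == 16, "32-bit entry is 16 bytes");

/* XCR0 state-component bits for MPX (per the Intel manuals). */
#define XCR0_BNDREGS	(UINT64_C(1) << 3)	/* BND0-BND3 */
#define XCR0_BNDCSR	(UINT64_C(1) << 4)	/* BNDCFGU and BNDSTATUS */

/* MPX state is fully managed only if both bits are set in XCR0. */
static bool xcr0_has_mpx(uint64_t xcr0)
{
	const uint64_t mask = XCR0_BNDREGS | XCR0_BNDCSR;

	return (xcr0 & mask) == mask;
}
```

The entry sizes line up with the shift values used elsewhere in this thread: 32 bytes is 1 << 5 and 16 bytes is 1 << 4.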
[PATCH v5 0/3] Intel MPX support
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. The MPX architecture is designed to allow a
machine to run both MPX-enabled software and legacy software that is
MPX unaware. In such a case, the legacy software does not benefit from
MPX, but it also does not experience any change in functionality or
reduction in performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
binutils, compiler, and system libraries.

A new GCC option, -fmpx, is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support are available in a
separate branch in the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have full protection, we had to add MPX instrumentation to all the
necessary Glibc routines (e.g. memcpy) written in assembler, and
compile Glibc with the MPX-enabled GCC compiler. Currently the
MPX-enabled Glibc source can be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates, but some runtime code, which is responsible for
configuring and enabling MPX, is needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler, or possibly it will come
directly from the OS once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main
responsibilities: providing handlers for bounds faults (#BR), and
managing bounds memory.

Currently no hardware with the MPX ISA is available, but it is always
possible to use SDE (Intel(R) Software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structure and macro as much as possible when
    decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors in the documentation, and document
    the extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocation of a bounds table fails.

Qiaowei Ren (3):
  x86, mpx: add documentation on Intel MPX
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
    information

 Documentation/x86/intel_mpx.txt    |  239 +
 arch/x86/include/asm/mpx.h         |   54 ++
 arch/x86/kernel/Makefile           |    1 +
 arch/x86/kernel/mpx.c              |  335
 arch/x86/kernel/traps.c            |   62 +++-
 include/uapi/asm-generic/siginfo.h |    9 +-
 kernel/signal.c                    |    4 +
 7 files changed, 702 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c
[PATCH v4 3/3] x86, mpx: extend siginfo structure to include bound violation information
This patch adds new fields about bound violation into siginfo structure. si_lower and si_upper are respectively lower bound and upper bound when bound violation is caused. These fields will be set in #BR exception handler by decoding the user instruction and constructing the faulting pointer. A userspace application can get violation address, lower bound and upper bound for bound violation from this new siginfo structure. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 19 +++ arch/x86/kernel/mpx.c | 289 arch/x86/kernel/traps.c|6 + include/uapi/asm-generic/siginfo.h |9 +- kernel/signal.c|4 + 5 files changed, 326 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index d074153..3129b1e 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_X86_64 @@ -30,6 +31,24 @@ #endif +struct mpx_insn { + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + void do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index e055e0e..f95abc2 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -7,6 +7,270 @@ #include #include +typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t; +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +reg_type_t type) +{ + int regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + static const int 
regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + if (X86_REX_X(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced be instruction + * for rm=3 returning the content of the rm reg + * for rm!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long indx; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + if (X86_MODRM_MOD(modrm) == 3) { + addr = get_reg(insn, regs, REG_TYPE_RM); + } else { + if (insn->sib.nbytes) { + base = get_reg(insn, regs, REG_TYPE_BASE); + indx = get_reg(insn, regs, REG_TYPE_INDEX); + addr = base + indx * (1 << X86_SIB_SCALE(sib)); + } else { + addr = get_reg(insn, regs, REG_TYPE_RM); + } + addr += insn->displacement.value; + } + + return addr; +} + +/* Verify next sizeof(t) bytes can be on the same instruction */ +#define validate_next(t, insn, n) \ + ((insn)->next_byte + sizeof(t)
[PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables
An access to an invalid bound directory entry will cause a #BR exception. This patch hooks the #BR exception handler to allocate one bound table and bind it to that bound directory entry. This avoids the need to forward the #BR exception to user space when the bound directory has an invalid entry. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 35 arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 44 +++ arch/x86/kernel/traps.c| 55 +++- 4 files changed, 134 insertions(+), 1 deletions(-) create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h new file mode 100644 index 000..d074153 --- /dev/null +++ b/arch/x86/include/asm/mpx.h @@ -0,0 +1,35 @@ +#ifndef _ASM_X86_MPX_H +#define _ASM_X86_MPX_H + +#include +#include + +#ifdef CONFIG_X86_64 + +#define MPX_L1_BITS 28 +#define MPX_L1_SHIFT 3 +#define MPX_L2_BITS 17 +#define MPX_L2_SHIFT 5 +#define MPX_IGN_BITS 3 +#define MPX_L2_NODE_ADDR_MASK 0xfff8UL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#else + +#define MPX_L1_BITS 20 +#define MPX_L1_SHIFT 2 +#define MPX_L2_BITS 10 +#define MPX_L2_SHIFT 4 +#define MPX_IGN_BITS 2 +#define MPX_L2_NODE_ADDR_MASK 0xfffcUL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#endif + +void do_mpx_bt_fault(struct xsave_struct *xsave_buf); + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index cb648c8..becb970 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o obj-y += process.o obj-y += i387.o xsave.o +obj-y += mpx.o obj-y += ptrace.o obj-$(CONFIG_X86_32) += tls.o obj-$(CONFIG_IA32_EMULATION) += tls.o diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c new file mode 100644 index 000..e055e0e --- /dev/null +++ b/arch/x86/kernel/mpx.c @@ -0,0 +1,44 @@ +#include +#include +#include 
+#include +#include +#include +#include +#include + +static bool allocate_bt(unsigned long bd_entry) +{ + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); + unsigned long bt_addr, old_val = 0; + + bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0); + if (bt_addr == -1) { + pr_err("L2 Node Allocation Failed at L1 addr %lx\n", + bd_entry); + return false; + } + bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01; + + user_atomic_cmpxchg_inatomic(&old_val, + (long __user *)bd_entry, 0, bt_addr); + if (old_val) + vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size); + + return true; +} + +void do_mpx_bt_fault(struct xsave_struct *xsave_buf) +{ + unsigned long status; + unsigned long bd_entry, bd_base; + unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT); + + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + status = xsave_buf->bndcsr.status_reg; + + bd_entry = status & MPX_BNDSTA_ADDR_MASK; + if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size)) + allocate_bt(bd_entry); +} diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 57409f6..fe09b3d 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -59,6 +59,7 @@ #include #include #include +#include #ifdef CONFIG_X86_64 #include @@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error,FPE_INTDIV, regs->ip ) DO_ERROR (X86_TRAP_OF, SIGSEGV, "overflow", overflow ) -DO_ERROR (X86_TRAP_BR, SIGSEGV, "bounds", bounds ) DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip ) DO_ERROR (X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun ) DO_ERROR (X86_TRAP_TS, SIGSEGV, "invalid TSS", invalid_TSS ) @@ -263,6 +263,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) } #endif
[PATCH v4 0/3] Intel MPX support
This patchset adds support for the Memory Protection Extensions (MPX) feature found in future Intel processors. MPX can be used in conjunction with compiler changes to check memory references whose compile-time intentions are usurped at runtime by buffer overflow or underflow. MPX provides this capability at very low performance overhead for newly compiled code, and provides compatibility mechanisms with legacy software components. The MPX architecture is designed to allow a machine to run both MPX-enabled software and legacy software that is MPX-unaware. In such a case, the legacy software does not benefit from MPX, but it also does not experience any change in functionality or reduction in performance. More information about Intel MPX can be found in "Intel(R) Architecture Instruction Set Extensions Programming Reference". To take advantage of MPX, changes are required in the OS kernel, binutils, compiler, and system libraries. A new GCC option, -fmpx, is introduced to utilize MPX instructions. Currently the GCC compiler sources with MPX support are available in a separate branch in the common GCC SVN repository. See the GCC SVN page (http://gcc.gnu.org/svn.html) for details. To have full protection, we had to add MPX instrumentation to all the necessary Glibc routines (e.g. memcpy) written in assembler, and compile Glibc with the MPX-enabled GCC compiler. Currently the MPX-enabled Glibc source can be found in the Glibc git repository. Enabling an application to use MPX will generally not require source code updates, but some runtime code, which is responsible for configuring and enabling MPX, is needed in order to make use of MPX. For most applications this runtime support will be available by linking to a library supplied by the compiler, or possibly it will come directly from the OS once OS versions that support MPX are available. 
The MPX kernel code, namely this patchset, has two main responsibilities: provide handlers for bounds faults (#BR), and manage bounds memory. Currently no hardware with the MPX ISA is available, but it is always possible to use SDE (Intel(R) Software Development Emulator) instead, which can be downloaded from http://software.intel.com/en-us/articles/intel-software-development-emulator Changes since v1: * check to see if #BR occurred in userspace or kernel space. * use generic structures and macros as much as possible when decoding MPX instructions. Changes since v2: * fix some compile warnings. * update documentation. Changes since v3: * correct some syntax errors in the documentation, and document the extended struct siginfo. * kill the process when the error code of BNDSTATUS is 3. * add some comments. * remove new prctl() commands. * fix some compile warnings for 32-bit. Qiaowei Ren (3): x86, mpx: add documentation on Intel MPX x86, mpx: hook #BR exception handler to allocate bound tables x86, mpx: extend siginfo structure to include bound violation information Documentation/x86/intel_mpx.txt| 239 ++ arch/x86/include/asm/mpx.h | 54 ++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 333 arch/x86/kernel/traps.c| 61 +++- include/uapi/asm-generic/siginfo.h |9 +- kernel/signal.c|4 + 7 files changed, 699 insertions(+), 2 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/kernel/mpx.c -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] kernel/trace: fix compiler warning
The patch fixes the following compiler warning: CC kernel/trace/trace_events.o kernel/trace/trace_events.c: In function 'event_enable_read' kernel/trace/trace_events.c:693: warning: 'flags' may be used \ uninitialized in this function Signed-off-by: Qiaowei Ren --- kernel/trace/trace_events.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index e71ffd4..b7915f2 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -690,7 +690,7 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) { struct ftrace_event_file *file; - unsigned long flags; + unsigned long flags = 0; char buf[4] = "0"; mutex_lock(&event_mutex); -- 1.7.1
[PATCH v3 4/4] x86, mpx: extend siginfo structure to include bound violation information
This patch adds new fields about bound violation into siginfo structure. si_lower and si_upper are respectively lower bound and upper bound when bound violation is caused. These fields will be set in #BR exception handler by decoding the user instruction and constructing the faulting pointer. A userspace application can get violation address, lower bound and upper bound for bound violation from this new siginfo structure. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 19 +++ arch/x86/kernel/mpx.c | 287 arch/x86/kernel/traps.c|6 + include/uapi/asm-generic/siginfo.h |9 +- kernel/signal.c|4 + 5 files changed, 324 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 9652e9e..e099573 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_X86_64 @@ -30,6 +31,22 @@ #endif +struct mpx_insn { + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + typedef union { struct { unsigned long ignored:MPX_IGN_BITS; @@ -40,5 +57,7 @@ typedef union { } mpx_addr; void do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index 9e91178..983abf7 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -91,6 +91,269 @@ int mpx_release(struct task_struct *tsk) return 0; } +typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t; +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +reg_type_t type) +{ + int 
regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + static const int regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + if (X86_REX_X(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced be instruction + * for rm=3 returning the content of the rm reg + * for rm!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long indx; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + if (X86_MODRM_MOD(modrm) == 3) { + addr = get_reg(insn, regs, REG_TYPE_RM); + } else { + if (insn->sib.nbytes) { + base = get_reg(insn, regs, REG_TYPE_BASE); + indx = get_reg(insn, regs, REG_TYPE_INDEX); + addr = base + indx * (1 << X86_SIB_SCALE(sib)); + } else { + addr = get_reg(insn, regs, REG_TYPE_RM); + } + addr += insn->displacement.value; + } + +
[PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl() commands on the x86 platform. These commands can be used to init and release MPX related resource. A MMU notifier will be registered during PR_MPX_INIT command execution. So the bound tables can be automatically deallocated when one memory area is unmapped. Signed-off-by: Qiaowei Ren --- arch/x86/Kconfig |4 ++ arch/x86/include/asm/mpx.h |9 arch/x86/include/asm/processor.h | 16 +++ arch/x86/kernel/mpx.c| 84 ++ include/uapi/linux/prctl.h |6 +++ kernel/sys.c | 12 + 6 files changed, 131 insertions(+), 0 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index cd18b83..28916e1 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -236,6 +236,10 @@ config HAVE_INTEL_TXT def_bool y depends on INTEL_IOMMU && ACPI +config HAVE_INTEL_MPX + def_bool y + select MMU_NOTIFIER + config X86_32_SMP def_bool y depends on X86_32 && SMP diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index d074153..9652e9e 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -30,6 +30,15 @@ #endif +typedef union { + struct { + unsigned long ignored:MPX_IGN_BITS; + unsigned long l2index:MPX_L2_BITS; + unsigned long l1index:MPX_L1_BITS; + }; + unsigned long addr; +} mpx_addr; + void do_mpx_bt_fault(struct xsave_struct *xsave_buf); #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index fdedd38..5962413 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -943,6 +943,22 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip, extern int get_tsc_mode(unsigned long adr); extern int set_tsc_mode(unsigned int val); +#ifdef CONFIG_HAVE_INTEL_MPX + +/* Init/release a process' MPX related resource */ +#define MPX_INIT(tsk) mpx_init((tsk)) +#define MPX_RELEASE(tsk) mpx_release((tsk)) + +extern int mpx_init(struct task_struct *tsk); +extern int mpx_release(struct task_struct *tsk); + +#else 
/* CONFIG_HAVE_INTEL_MPX */ + +#define MPX_INIT(tsk) (-EINVAL) +#define MPX_RELEASE(tsk) (-EINVAL) + +#endif /* CONFIG_HAVE_INTEL_MPX */ + extern u16 amd_get_nb_id(int cpu); static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves) diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index e055e0e..9e91178 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -1,5 +1,7 @@ #include #include +#include +#include #include #include #include @@ -7,6 +9,88 @@ #include #include +static struct mmu_notifier mpx_mn; + +static void mpx_invl_range_end(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long start, unsigned long end) +{ + struct xsave_struct *xsave_buf; + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); + unsigned long bt_addr; + unsigned long bd_base; + unsigned long bd_entry, bde_start, bde_end; + mpx_addr lap; + + pgd_t *pgd; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + /* ignore swap notifications */ + pgd = pgd_offset(mm, start); + pud = pud_offset(pgd, start); + pmd = pmd_offset(pud, start); + pte = pte_offset_kernel(pmd, start); + if (!pte_present(*pte) && !pte_none(*pte) && !pte_file(*pte)) + return; + + /* get bound directory base address */ + fpu_xsave(&current->thread.fpu); + xsave_buf = &(current->thread.fpu.state->xsave); + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + + /* get related bde range */ + lap.addr = start; + bde_start = bd_base + (lap.l1index << MPX_L1_SHIFT); + + lap.addr = end; + if (lap.ignored || lap.l2index) + bde_end = bd_base + (lap.l1index<mm); + + return 0; +} + +int mpx_release(struct task_struct *tsk) +{ + if (!boot_cpu_has(X86_FEATURE_MPX)) + return -EINVAL; + + /* unregister mmu_notifier */ + mmu_notifier_unregister(&mpx_mn, current->mm); + + return 0; +} + static bool allocate_bt(unsigned long bd_entry) { unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 289760f..19ab881 
100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -149,4 +149,10 @@ #define PR_GET_TID_ADDRESS 40 +/* + * Init/release MPX related resource. + */ +#define PR_MPX_INIT41 +#define PR_MPX_RELEASE 42 + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index c723113..0334d03 100644 --
[PATCH v3 0/4] Intel MPX support
This patchset adds support for the Memory Protection Extensions (MPX) feature found in future Intel processors. MPX can be used in conjunction with compiler changes to check memory references whose compile-time intentions are usurped at runtime by buffer overflow or underflow. MPX provides this capability at very low performance overhead for newly compiled code, and provides compatibility mechanisms with legacy software components. The MPX architecture is designed to allow a machine to run both MPX-enabled software and legacy software that is MPX-unaware. In such a case, the legacy software does not benefit from MPX, but it also does not experience any change in functionality or reduction in performance. More information about Intel MPX can be found in "Intel(R) Architecture Instruction Set Extensions Programming Reference". To take advantage of MPX, changes are required in the OS kernel, binutils, compiler, and system libraries. A new GCC option, -fmpx, is introduced to utilize MPX instructions. Currently the GCC compiler sources with MPX support are available in a separate branch in the common GCC SVN repository. See the GCC SVN page (http://gcc.gnu.org/svn.html) for details. To have full protection, we had to add MPX instrumentation to all the necessary Glibc routines (e.g. memcpy) written in assembler, and compile Glibc with the MPX-enabled GCC compiler. Currently the MPX-enabled Glibc source can be found in the Glibc git repository. Enabling an application to use MPX will generally not require source code updates, but some runtime code, which is responsible for configuring and enabling MPX, is needed in order to make use of MPX. For most applications this runtime support will be available by linking to a library supplied by the compiler, or possibly it will come directly from the OS once OS versions that support MPX are available. 
The MPX kernel code, namely this patchset, has two main responsibilities: provide handlers for bounds faults (#BR), and manage bounds memory. Currently no hardware with the MPX ISA is available, but it is always possible to use SDE (Intel(R) Software Development Emulator) instead, which can be downloaded from http://software.intel.com/en-us/articles/intel-software-development-emulator Changes since v1: * check to see if #BR occurred in userspace or kernel space. * use generic structures and macros as much as possible when decoding MPX instructions. Changes since v2: * fix some compile warnings. * update documentation. Qiaowei Ren (4): x86, mpx: add documentation on Intel MPX x86, mpx: hook #BR exception handler to allocate bound tables x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE x86, mpx: extend siginfo structure to include bound violation information Documentation/x86/intel_mpx.txt| 226 arch/x86/Kconfig |4 + arch/x86/include/asm/mpx.h | 63 ++ arch/x86/include/asm/processor.h | 16 ++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 415 arch/x86/kernel/traps.c| 61 +- include/uapi/asm-generic/siginfo.h |9 +- include/uapi/linux/prctl.h |6 + kernel/signal.c|4 + kernel/sys.c | 12 + 11 files changed, 815 insertions(+), 2 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/kernel/mpx.c
[PATCH v3 2/4] x86, mpx: hook #BR exception handler to allocate bound tables
An access to an invalid bound directory entry will cause a #BR exception. This patch hooks the #BR exception handler to allocate one bound table and bind it to that bound directory entry. This avoids the need to forward the #BR exception to user space when the bound directory has an invalid entry. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 35 arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 44 +++ arch/x86/kernel/traps.c| 55 +++- 4 files changed, 134 insertions(+), 1 deletions(-) create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h new file mode 100644 index 000..d074153 --- /dev/null +++ b/arch/x86/include/asm/mpx.h @@ -0,0 +1,35 @@ +#ifndef _ASM_X86_MPX_H +#define _ASM_X86_MPX_H + +#include +#include + +#ifdef CONFIG_X86_64 + +#define MPX_L1_BITS 28 +#define MPX_L1_SHIFT 3 +#define MPX_L2_BITS 17 +#define MPX_L2_SHIFT 5 +#define MPX_IGN_BITS 3 +#define MPX_L2_NODE_ADDR_MASK 0xfff8UL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#else + +#define MPX_L1_BITS 20 +#define MPX_L1_SHIFT 2 +#define MPX_L2_BITS 10 +#define MPX_L2_SHIFT 4 +#define MPX_IGN_BITS 2 +#define MPX_L2_NODE_ADDR_MASK 0xfffcUL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#endif + +void do_mpx_bt_fault(struct xsave_struct *xsave_buf); + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index cb648c8..becb970 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o obj-y += process.o obj-y += i387.o xsave.o +obj-y += mpx.o obj-y += ptrace.o obj-$(CONFIG_X86_32) += tls.o obj-$(CONFIG_IA32_EMULATION) += tls.o diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c new file mode 100644 index 000..e055e0e --- /dev/null +++ b/arch/x86/kernel/mpx.c @@ -0,0 +1,44 @@ +#include +#include +#include 
+#include +#include +#include +#include +#include + +static bool allocate_bt(unsigned long bd_entry) +{ + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); + unsigned long bt_addr, old_val = 0; + + bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0); + if (bt_addr == -1) { + pr_err("L2 Node Allocation Failed at L1 addr %lx\n", + bd_entry); + return false; + } + bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01; + + user_atomic_cmpxchg_inatomic(&old_val, + (long __user *)bd_entry, 0, bt_addr); + if (old_val) + vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size); + + return true; +} + +void do_mpx_bt_fault(struct xsave_struct *xsave_buf) +{ + unsigned long status; + unsigned long bd_entry, bd_base; + unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT); + + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + status = xsave_buf->bndcsr.status_reg; + + bd_entry = status & MPX_BNDSTA_ADDR_MASK; + if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size)) + allocate_bt(bd_entry); +} diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 57409f6..6b284a4 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -59,6 +59,7 @@ #include #include #include +#include #ifdef CONFIG_X86_64 #include @@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error,FPE_INTDIV, regs->ip ) DO_ERROR (X86_TRAP_OF, SIGSEGV, "overflow", overflow ) -DO_ERROR (X86_TRAP_BR, SIGSEGV, "bounds", bounds ) DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip ) DO_ERROR (X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun ) DO_ERROR (X86_TRAP_TS, SIGSEGV, "invalid TSS", invalid_TSS ) @@ -263,6 +263,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) } #endif
[PATCH v3 1/4] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some information about Intel MPX. Signed-off-by: Qiaowei Ren --- Documentation/x86/intel_mpx.txt | 226 +++ 1 files changed, 226 insertions(+), 0 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt new file mode 100644 index 000..052001c --- /dev/null +++ b/Documentation/x86/intel_mpx.txt @@ -0,0 +1,226 @@ +1. Intel(R) MPX Overview + + + +Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new +capability introduced into Intel Architecture. Intel MPX provides +hardware features that can be used in conjunction with compiler +changes to check memory references, for those references whose +compile-time normal intentions are usurped at runtime due to +buffer overflow or underflow. + +Two of the most important goals of Intel MPX are to provide +this capability at very low performance overhead for newly +compiled code, and to provide compatibility mechanisms with +legacy software components. MPX architecture is designed +allow a machine (i.e., the processor(s) and the OS software) +to run both MPX enabled software and legacy software that +is MPX unaware. In such a case, the legacy software does not +benefit from MPX, but it also does not experience any change +in functionality or reduction in performance. + +Intel(R) MPX Programming Model +-- + +Intel MPX introduces new registers and new instructions that +operate on these registers. Some of the registers added are +bounds registers which store a pointer's lower bound and upper +bound limits. Whenever the pointer is used, the requested +reference is checked against the pointer's associated bounds, +thereby preventing out-of-bound memory access (such as buffer +overflows and overruns). Out-of-bounds memory references +initiate a #BR exception which can then be handled in an +appropriate manner. 
+ +Loading and Storing Bounds using Translation + + +Intel MPX defines two instructions for load/store of the linear +address of a pointer to a buffer, along with the bounds of the +buffer into a paging structure of extended bounds. Specifically, +when storing extended bounds, the processor will perform address +translation of the address where the pointer is stored to an +address in the Bound Table (BT) to determine the store location +of extended bounds. Loading an extended bound performs the +reverse sequence. + +The structure in memory to load/store an extended bound is a +4-tuple consisting of lower bound, upper bound, pointer value +and a reserved field. Bound loads and stores access 32-bit or +64-bit operand size according to the operation mode. Thus, +a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits +in 64-bit mode. + +The linear address of a bound table is stored in a Bound +Directory (BD) entry. The linear address of the bound +directory is derived from either BNDCFGU or BNDCFGS registers. +Bounds in memory are stored in Bound Tables (BT) as an extended +bound, which are accessed via Bound Directory (BD) and address +translation performed by BNDLDX/BNDSTX instructions. + +Bounds Directory (BD) and Bounds Tables (BT) are stored in +application memory and are allocated by the application (in case +of kernel use, the structures will be in kernel memory). The +bound directory and each instance of bound table are in contiguous +linear memory. + +XSAVE/XRSTOR Support of Intel MPX State + + +Enabling Intel MPX requires an OS to manage two bits in XCR0: + - BNDREGS for saving and restoring registers BND0-BND3, + - BNDCSR for saving and restoring the user-mode configuration +(BNDCFGU) and the status register (BNDSTATUS). + +The reason for having two separate bits is that BND0-BND3 are +likely to be volatile state, while BNDCFGU and BNDSTATUS are not. 
+Therefore, an OS has flexibility in handling these two states +differently in saving or restoring them. + +For details about the Intel MPX instructions, see "Intel(R) +Architecture Instruction Set Extensions Programming Reference". + + +2. How to get the advantage of MPX +== + + +To get the advantage of MPX, changes are required in +the OS kernel, binutils, the compiler and system libraries. + +MPX support in the GNU toolchain + + +This section describes changes in GNU Binutils, GCC and Glibc +to support MPX. + +The first step of MPX support is to implement support for new +hardware features in binutils and GCC. + +The second step is the implementation of an MPX instrumentation pass +in GCC, which is responsible for instrumenting all +memory accesses with pointer checks. Compiler changes for runtime +bound checks include: + + * Bounds creation for statically alloca
[PATCH v2 0/4] Intel MPX support
Changes since v1: * check to see if #BR occurred in userspace or kernel space. * use generic structures and macros as much as possible when decoding MPX instructions. Qiaowei Ren (4): x86, mpx: add documentation on Intel MPX x86, mpx: hook #BR exception handler to allocate bound tables x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE x86, mpx: extend siginfo structure to include bound violation information Documentation/x86/intel_mpx.txt| 76 +++ arch/x86/Kconfig |4 + arch/x86/include/asm/mpx.h | 63 ++ arch/x86/include/asm/processor.h | 16 ++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 417 arch/x86/kernel/traps.c| 61 +- include/uapi/asm-generic/siginfo.h |9 +- include/uapi/linux/prctl.h |6 + kernel/signal.c|4 + kernel/sys.c | 12 + 11 files changed, 667 insertions(+), 2 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/kernel/mpx.c
[PATCH v2 1/4] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some information about Intel MPX. Signed-off-by: Qiaowei Ren --- Documentation/x86/intel_mpx.txt | 76 +++ 1 files changed, 76 insertions(+), 0 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt new file mode 100644 index 000..778d06e --- /dev/null +++ b/Documentation/x86/intel_mpx.txt @@ -0,0 +1,76 @@ +Intel(R) MPX Overview: += + +Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new +capability introduced into Intel Architecture. Intel MPX can +increase the robustness of software when it is used in conjunction +with compiler changes to check that memory references intended +at compile time do not become unsafe at runtime. + +Two of the most important goals of Intel MPX are to provide +this capability at very low performance overhead for newly +compiled code, and to provide compatibility mechanisms with +legacy software components. A direct benefit Intel MPX provides +is hardening software against malicious attacks designed to +cause or exploit buffer overruns. + +For details about the Intel MPX instructions, see "Intel(R) +Architecture Instruction Set Extensions Programming Reference". + +Intel(R) MPX Programming Model +-- + +Intel MPX introduces new registers and new instructions that +operate on these registers. Some of the registers added are +bounds registers which store a pointer's lower bound and upper +bound limits. Whenever the pointer is used, the requested +reference is checked against the pointer's associated bounds, +thereby preventing out-of-bound memory access (such as buffer +overflows and overruns). Out-of-bounds memory references +initiate a #BR exception which can then be handled in an +appropriate manner. 
+ +Loading and Storing Bounds using Translation + + +Intel MPX defines two instructions for load/store of the linear +address of a pointer to a buffer, along with the bounds of the +buffer into a paging structure of extended bounds. Specifically +when storing extended bounds, the processor will perform address +translation of the address where the pointer is stored to an +address in the Bound Table (BT) to determine the store location +of extended bounds. Loading extended bounds performs the +reverse sequence. + +The structure in memory to load/store an extended bound is a +4-tuple consisting of lower bound, upper bound, pointer value +and a reserved field. Bound loads and stores access 32-bit or +64-bit operand size according to the operation mode. Thus, +a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits +in 64-bit mode. + +The linear address of a bound table is stored in a Bound +Directory (BD) entry. And the linear address of the bound +directory is derived from either BNDCFGU or BNDCFGS registers. +Bounds in memory are stored in Bound Tables (BT) as an extended +bound, which are accessed via Bound Directory (BD) and address +translation performed by BNDLDX/BNDSTX instructions. + +Bounds Directory (BD) and Bounds Tables (BT) are stored in +application memory and are allocated by the application (in case +of kernel use, the structures will be in kernel memory). The +bound directory and each instance of bound table are in contiguous +linear memory. + +XSAVE/XRSTOR Support of Intel MPX State + + +Enabling Intel MPX requires an OS to manage two bits in XCR0: + - BNDREGS for saving and restoring registers BND0-BND3, + - BNDCSR for saving and restoring the user-mode configuration +(BNDCFGU) and the status register (BNDSTATUS). + +The reason for having two separate bits is that BND0-BND3 is +likely to be volatile state, while BNDCFGU and BNDSTATUS are not. 
+Therefore, an OS has flexibility in handling these two states +differently in saving or restoring them. -- 1.7.1
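As an illustration of the bound table entry layout described in the documentation above, the 4-tuple could be written down in C roughly as follows. This is only a sketch: the struct and field names are made up for this example and do not come from the patch; only the field order (lower bound, upper bound, pointer value, reserved) and the entry sizes (4*64 bits and 4*32 bits) follow the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: one bound table entry in 64-bit mode (4 * 64 bits). */
struct demo_bt_entry64 {
    uint64_t lower_bound;
    uint64_t upper_bound;
    uint64_t pointer_value;
    uint64_t reserved;
};

/* Hypothetical name: one bound table entry in 32-bit mode (4 * 32 bits). */
struct demo_bt_entry32 {
    uint32_t lower_bound;
    uint32_t upper_bound;
    uint32_t pointer_value;
    uint32_t reserved;
};

/* Compile-time checks that the sizes match the documented entry sizes. */
static_assert(sizeof(struct demo_bt_entry64) == 4 * 64 / 8,
              "64-bit bound table entry should be 32 bytes");
static_assert(sizeof(struct demo_bt_entry32) == 4 * 32 / 8,
              "32-bit bound table entry should be 16 bytes");
```

The 32-byte and 16-byte entry sizes are consistent with the MPX_L2_SHIFT values (5 and 4) used in the 2/4 patch of this series.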
[PATCH v2 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl() commands on the x86 platform. These commands can be used to init and release MPX related resource. A MMU notifier will be registered during PR_MPX_INIT command execution. So the bound tables can be automatically deallocated when one memory area is unmapped. Signed-off-by: Qiaowei Ren --- arch/x86/Kconfig |4 ++ arch/x86/include/asm/mpx.h |9 arch/x86/include/asm/processor.h | 16 +++ arch/x86/kernel/mpx.c| 84 ++ include/uapi/linux/prctl.h |6 +++ kernel/sys.c | 12 + 6 files changed, 131 insertions(+), 0 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index ee2fb9d..695101a 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -233,6 +233,10 @@ config HAVE_INTEL_TXT def_bool y depends on INTEL_IOMMU && ACPI +config HAVE_INTEL_MPX + def_bool y + select MMU_NOTIFIER + config X86_32_SMP def_bool y depends on X86_32 && SMP diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index d074153..9652e9e 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -30,6 +30,15 @@ #endif +typedef union { + struct { + unsigned long ignored:MPX_IGN_BITS; + unsigned long l2index:MPX_L2_BITS; + unsigned long l1index:MPX_L1_BITS; + }; + unsigned long addr; +} mpx_addr; + void do_mpx_bt_fault(struct xsave_struct *xsave_buf); #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 43be6f6..ea4e72d 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -963,6 +963,22 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip, extern int get_tsc_mode(unsigned long adr); extern int set_tsc_mode(unsigned int val); +#ifdef CONFIG_HAVE_INTEL_MPX + +/* Init/release a process' MPX related resource */ +#define MPX_INIT(tsk) mpx_init((tsk)) +#define MPX_RELEASE(tsk) mpx_release((tsk)) + +extern int mpx_init(struct task_struct *tsk); +extern int mpx_release(struct task_struct *tsk); + +#else 
/* CONFIG_HAVE_INTEL_MPX */ + +#define MPX_INIT(tsk) (-EINVAL) +#define MPX_RELEASE(tsk) (-EINVAL) + +#endif /* CONFIG_HAVE_INTEL_MPX */ + extern u16 amd_get_nb_id(int cpu); static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves) diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index 767b3bf..ffe5aee 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -1,5 +1,7 @@ #include #include +#include +#include #include #include #include @@ -7,6 +9,88 @@ #include #include +static struct mmu_notifier mpx_mn; + +static void mpx_invl_range_end(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long start, unsigned long end) +{ + struct xsave_struct *xsave_buf; + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); + unsigned long bt_addr; + unsigned long bd_base; + unsigned long bd_entry, bde_start, bde_end; + mpx_addr lap; + + pgd_t *pgd; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + /* ignore swap notifications */ + pgd = pgd_offset(mm, start); + pud = pud_offset(pgd, start); + pmd = pmd_offset(pud, start); + pte = pte_offset_kernel(pmd, start); + if (!pte_present(*pte) && !pte_none(*pte) && !pte_file(*pte)) + return; + + /* get bound directory base address */ + fpu_xsave(&current->thread.fpu); + xsave_buf = &(current->thread.fpu.state->xsave); + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + + /* get related bde range */ + lap.addr = start; + bde_start = bd_base + (lap.l1index << MPX_L1_SHIFT); + + lap.addr = end; + if (lap.ignored || lap.l2index) + bde_end = bd_base + (lap.l1index<mm); + + return 0; +} + +int mpx_release(struct task_struct *tsk) +{ + if (!boot_cpu_has(X86_FEATURE_MPX)) + return -EINVAL; + + /* unregister mmu_notifier */ + mmu_notifier_unregister(&mpx_mn, current->mm); + + return 0; +} + static bool allocate_bt(unsigned long bd_entry) { unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 289760f..19ab881 
100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -149,4 +149,10 @@ #define PR_GET_TID_ADDRESS 40 +/* + * Init/release MPX related resource. + */ +#define PR_MPX_INIT41 +#define PR_MPX_RELEASE 42 + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index c18ecca..bbaf573 100644 --
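To make the l1index/l2index split used by mpx_invl_range_end() in this patch more concrete, here is a small userspace sketch of the same arithmetic. It uses explicit shifts instead of the mpx_addr bitfield union and assumes the 64-bit constants from asm/mpx.h in the 2/4 patch of this series; the helper names (demo_l1index and so on) are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit values as defined in arch/x86/include/asm/mpx.h in this series. */
#define MPX_L1_BITS   28
#define MPX_L1_SHIFT  3
#define MPX_L2_BITS   17
#define MPX_L2_SHIFT  5
#define MPX_IGN_BITS  3

/* Bound table index: the MPX_L2_BITS bits just above the ignored low bits. */
static uint64_t demo_l2index(uint64_t addr)
{
    return (addr >> MPX_IGN_BITS) & ((UINT64_C(1) << MPX_L2_BITS) - 1);
}

/* Bound directory index: the MPX_L1_BITS bits above the l2index field. */
static uint64_t demo_l1index(uint64_t addr)
{
    return (addr >> (MPX_IGN_BITS + MPX_L2_BITS)) &
           ((UINT64_C(1) << MPX_L1_BITS) - 1);
}

/* Address of the bound directory entry that covers addr. */
static uint64_t demo_bde_addr(uint64_t bd_base, uint64_t addr)
{
    /* Assumption: the directory base is page aligned, as the BNDCFG mask implies. */
    assert((bd_base & 0xfff) == 0);
    return bd_base + (demo_l1index(addr) << MPX_L1_SHIFT);
}

/* Size in bytes of one bound table and of the bound directory. */
static uint64_t demo_bt_size(void)
{
    return UINT64_C(1) << (MPX_L2_BITS + MPX_L2_SHIFT);
}

static uint64_t demo_bd_size(void)
{
    return UINT64_C(1) << (MPX_L1_BITS + MPX_L1_SHIFT);
}
```

With these constants a bound table is 4 MiB and the bound directory is 2 GiB of linear address space.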
[PATCH v2 2/4] x86, mpx: hook #BR exception handler to allocate bound tables
An access to an invalid bound directory entry will cause a #BR exception. This patch hooks the #BR exception handler to allocate one bound table and bind it to that bound directory entry. This avoids the need to forward the #BR exception to user space when the bound directory has an invalid entry. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 35 arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 44 +++ arch/x86/kernel/traps.c| 55 +++- 4 files changed, 134 insertions(+), 1 deletions(-) create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h new file mode 100644 index 000..d074153 --- /dev/null +++ b/arch/x86/include/asm/mpx.h @@ -0,0 +1,35 @@ +#ifndef _ASM_X86_MPX_H +#define _ASM_X86_MPX_H + +#include +#include + +#ifdef CONFIG_X86_64 + +#define MPX_L1_BITS 28 +#define MPX_L1_SHIFT 3 +#define MPX_L2_BITS 17 +#define MPX_L2_SHIFT 5 +#define MPX_IGN_BITS 3 +#define MPX_L2_NODE_ADDR_MASK 0xfff8UL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#else + +#define MPX_L1_BITS 20 +#define MPX_L1_SHIFT 2 +#define MPX_L2_BITS 10 +#define MPX_L2_SHIFT 4 +#define MPX_IGN_BITS 2 +#define MPX_L2_NODE_ADDR_MASK 0xfffcUL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#endif + +void do_mpx_bt_fault(struct xsave_struct *xsave_buf); + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index a5408b9..bba7a71 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -38,6 +38,7 @@ obj-y += resource.o obj-y += process.o obj-y += i387.o xsave.o +obj-y += mpx.o obj-y += ptrace.o obj-$(CONFIG_X86_32) += tls.o obj-$(CONFIG_IA32_EMULATION) += tls.o diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c new file mode 100644 index 000..767b3bf --- /dev/null +++ b/arch/x86/kernel/mpx.c @@ -0,0 +1,44 @@ +#include +#include +#include +#include +#include 
+#include +#include +#include + +static bool allocate_bt(unsigned long bd_entry) +{ + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); + unsigned long bt_addr, old_val; + + bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0); + if (bt_addr == -1) { + pr_err("L2 Node Allocation Failed at L1 addr %lx\n", + bd_entry); + return false; + } + bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01; + + user_atomic_cmpxchg_inatomic(&old_val, + (long __user *)bd_entry, 0, bt_addr); + if (old_val) + vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size); + + return true; +} + +void do_mpx_bt_fault(struct xsave_struct *xsave_buf) +{ + unsigned long status; + unsigned long bd_entry, bd_base; + unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT); + + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + status = xsave_buf->bndcsr.status_reg; + + bd_entry = status & MPX_BNDSTA_ADDR_MASK; + if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size)) + allocate_bt(bd_entry); +} diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 8c8093b..78e9c16 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -59,6 +59,7 @@ #include #include #include +#include #ifdef CONFIG_X86_64 #include @@ -214,7 +215,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip) DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow) -DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds", bounds) DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip) DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun", @@ -267,6 +267,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) } #endif +dotraplinkage void do_bounds(struct pt_regs *regs, long error_code) +{ + enum ctx_state prev_state; + unsigned long status; + struct xsave_struct *xsave_buf; + 
struct task_struct *tsk = current; + + prev_state = exception_enter(); + if (notify_die(DIE_TRAP, "bounds", regs, error_code, + X86_TRAP_BR, SIGSEGV) == NOTIFY_STOP)
[PATCH v2 4/4] x86, mpx: extend siginfo structure to include bound violation information
This patch adds new fields about bound violation into siginfo structure. si_lower and si_upper are respectively lower bound and upper bound when bound violation is caused. These fields will be set in #BR exception handler by decoding the user instruction and constructing the faulting pointer. A userspace application can get violation address, lower bound and upper bound for bound violation from this new siginfo structure. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 19 +++ arch/x86/kernel/mpx.c | 289 arch/x86/kernel/traps.c|6 + include/uapi/asm-generic/siginfo.h |9 +- kernel/signal.c|4 + 5 files changed, 326 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 9652e9e..e099573 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_X86_64 @@ -30,6 +31,22 @@ #endif +struct mpx_insn { + struct insn_field rex_prefix; /* REX prefix */ + struct insn_field modrm; + struct insn_field sib; + struct insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + typedef union { struct { unsigned long ignored:MPX_IGN_BITS; @@ -40,5 +57,7 @@ typedef union { } mpx_addr; void do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index ffe5aee..3770991 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -91,6 +91,269 @@ int mpx_release(struct task_struct *tsk) return 0; } +typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t; +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +reg_type_t type) +{ + int 
regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + static const int regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value)) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + /* X86_REX_X() yields 0 or 2, so test for non-zero rather than == 1 */ + if (X86_REX_X(insn->rex_prefix.value)) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value)) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced by the instruction + * for mod=3 returning the content of the rm reg + * for mod!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long indx; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + if (X86_MODRM_MOD(modrm) == 3) { + addr = get_reg(insn, regs, REG_TYPE_RM); + } else { + if (insn->sib.nbytes) { + base = get_reg(insn, regs, REG_TYPE_BASE); + indx = get_reg(insn, regs, REG_TYPE_INDEX); + addr = base + indx * (1 << X86_SIB_SCALE(sib)); + } else { + addr = get_reg(insn, regs, REG_TYPE_RM); + } + addr += insn->displacement.value; + } + +
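The address computation in get_addr_ref() can be tried out in userspace with a sketch like the one below. It is an illustration under stated assumptions, not the patch's code: demo_regs and demo_insn are hypothetical stand-ins for pt_regs and mpx_insn, registers are indexed by their hardware number (0 = ax, 1 = cx, 2 = dx, 3 = bx, ...), and the REX bits are tested for non-zero. Like the function above, it ignores special encodings such as "no index" or RIP-relative addressing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
#define X86_MODRM_RM(modrm)  ((modrm) & 0x07)
#define X86_SIB_SCALE(sib)   (((sib) & 0xc0) >> 6)
#define X86_SIB_INDEX(sib)   (((sib) & 0x38) >> 3)
#define X86_SIB_BASE(sib)    ((sib) & 0x07)
#define X86_REX_X(rex)       ((rex) & 2)
#define X86_REX_B(rex)       ((rex) & 1)

/* Hypothetical stand-in for pt_regs: 16 general-purpose registers. */
struct demo_regs {
    uint64_t gpr[16];
};

/* Hypothetical stand-in for the decoded fields of struct mpx_insn. */
struct demo_insn {
    uint8_t rex;      /* low four bits of the REX prefix, 0 if absent */
    uint8_t modrm;
    uint8_t sib;
    int     has_sib;  /* non-zero if a SIB byte was decoded */
    int32_t disp;     /* sign-extended displacement, 0 if absent */
};

/* Compute the operand address the same way get_addr_ref() does. */
static uint64_t demo_addr_ref(const struct demo_insn *insn,
                              const struct demo_regs *regs)
{
    unsigned int rex_b, rex_x;
    uint64_t addr;

    assert(insn != NULL && regs != NULL);
    rex_b = X86_REX_B(insn->rex) ? 8 : 0;
    rex_x = X86_REX_X(insn->rex) ? 8 : 0;

    /* mod == 3: the operand is the register itself. */
    if (X86_MODRM_MOD(insn->modrm) == 3)
        return regs->gpr[X86_MODRM_RM(insn->modrm) + rex_b];

    if (insn->has_sib) {
        uint64_t base = regs->gpr[X86_SIB_BASE(insn->sib) + rex_b];
        uint64_t indx = regs->gpr[X86_SIB_INDEX(insn->sib) + rex_x];

        addr = base + indx * (UINT64_C(1) << X86_SIB_SCALE(insn->sib));
    } else {
        addr = regs->gpr[X86_MODRM_RM(insn->modrm) + rex_b];
    }

    /* The displacement is signed; widen it before adding. */
    return addr + (uint64_t)(int64_t)insn->disp;
}
```

For example, with bx = 0x1000, cx = 0x10, a SIB byte selecting base bx, index cx and scale 4, and a displacement of 8, the sketch yields 0x1000 + 0x10 * 4 + 8.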
[tip:x86/mpx] x86, mpx: Add MPX related opcodes to the x86 opcode map
Commit-ID: fb09b78151361f5001ad462e4b242b10845830e2 Gitweb: http://git.kernel.org/tip/fb09b78151361f5001ad462e4b242b10845830e2 Author: Qiaowei Ren AuthorDate: Sun, 12 Jan 2014 17:20:02 +0800 Committer: H. Peter Anvin CommitDate: Fri, 17 Jan 2014 11:04:09 -0800 x86, mpx: Add MPX related opcodes to the x86 opcode map This patch adds all the MPX instructions to x86 opcode map, so the x86 instruction decoder can decode MPX instructions. Signed-off-by: Qiaowei Ren Link: http://lkml.kernel.org/r/1389518403-7715-4-git-send-email-qiaowei@intel.com Cc: Masami Hiramatsu Signed-off-by: H. Peter Anvin --- arch/x86/lib/x86-opcode-map.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 533a85e..1a2be7c 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -346,8 +346,8 @@ AVXcode: 1 17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1) 18: Grp16 (1A) 19: -1a: -1b: +1a: BNDCL Ev,Gv | BNDCU Ev,Gv | BNDMOV Gv,Ev | BNDLDX Gv,Ev,Gv +1b: BNDCN Ev,Gv | BNDMOV Ev,Gv | BNDMK Gv,Ev | BNDSTX Ev,GV,Gv 1c: 1d: 1e:
[PATCH 4/5] x86, mpx: add MPX related opcodes to the x86 opcode map
This patch adds all the MPX instructions to x86 opcode map, and then the x86 instruction decoder can decode MPX instructions used in kernel. Signed-off-by: Qiaowei Ren --- arch/x86/lib/x86-opcode-map.txt |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 533a85e..1a2be7c 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -346,8 +346,8 @@ AVXcode: 1 17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1) 18: Grp16 (1A) 19: -1a: -1b: +1a: BNDCL Ev,Gv | BNDCU Ev,Gv | BNDMOV Gv,Ev | BNDLDX Gv,Ev,Gv +1b: BNDCN Ev,Gv | BNDMOV Ev,Gv | BNDMK Gv,Ev | BNDSTX Ev,GV,Gv 1c: 1d: 1e: -- 1.7.1
[PATCH 3/5] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl() commands on the x86 platform. These commands can be used to init and release MPX related resource. A MMU notifier will be registered during PR_MPX_INIT command execution. So the bound tables can be automatically deallocated when one memory area is unmapped. Signed-off-by: Qiaowei Ren --- arch/x86/Kconfig |4 ++ arch/x86/include/asm/mpx.h |9 arch/x86/include/asm/processor.h | 16 +++ arch/x86/kernel/mpx.c| 84 ++ include/uapi/linux/prctl.h |6 +++ kernel/sys.c | 12 + 6 files changed, 131 insertions(+), 0 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index ee2fb9d..695101a 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -233,6 +233,10 @@ config HAVE_INTEL_TXT def_bool y depends on INTEL_IOMMU && ACPI +config HAVE_INTEL_MPX + def_bool y + select MMU_NOTIFIER + config X86_32_SMP def_bool y depends on X86_32 && SMP diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index d074153..9652e9e 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -30,6 +30,15 @@ #endif +typedef union { + struct { + unsigned long ignored:MPX_IGN_BITS; + unsigned long l2index:MPX_L2_BITS; + unsigned long l1index:MPX_L1_BITS; + }; + unsigned long addr; +} mpx_addr; + void do_mpx_bt_fault(struct xsave_struct *xsave_buf); #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 43be6f6..ea4e72d 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -963,6 +963,22 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip, extern int get_tsc_mode(unsigned long adr); extern int set_tsc_mode(unsigned int val); +#ifdef CONFIG_HAVE_INTEL_MPX + +/* Init/release a process' MPX related resource */ +#define MPX_INIT(tsk) mpx_init((tsk)) +#define MPX_RELEASE(tsk) mpx_release((tsk)) + +extern int mpx_init(struct task_struct *tsk); +extern int mpx_release(struct task_struct *tsk); + +#else 
/* CONFIG_HAVE_INTEL_MPX */ + +#define MPX_INIT(tsk) (-EINVAL) +#define MPX_RELEASE(tsk) (-EINVAL) + +#endif /* CONFIG_HAVE_INTEL_MPX */ + extern u16 amd_get_nb_id(int cpu); static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves) diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index 767b3bf..ffe5aee 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -1,5 +1,7 @@ #include #include +#include +#include #include #include #include @@ -7,6 +9,88 @@ #include #include +static struct mmu_notifier mpx_mn; + +static void mpx_invl_range_end(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long start, unsigned long end) +{ + struct xsave_struct *xsave_buf; + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); + unsigned long bt_addr; + unsigned long bd_base; + unsigned long bd_entry, bde_start, bde_end; + mpx_addr lap; + + pgd_t *pgd; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + /* ignore swap notifications */ + pgd = pgd_offset(mm, start); + pud = pud_offset(pgd, start); + pmd = pmd_offset(pud, start); + pte = pte_offset_kernel(pmd, start); + if (!pte_present(*pte) && !pte_none(*pte) && !pte_file(*pte)) + return; + + /* get bound directory base address */ + fpu_xsave(&current->thread.fpu); + xsave_buf = &(current->thread.fpu.state->xsave); + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + + /* get related bde range */ + lap.addr = start; + bde_start = bd_base + (lap.l1index << MPX_L1_SHIFT); + + lap.addr = end; + if (lap.ignored || lap.l2index) + bde_end = bd_base + (lap.l1index<mm); + + return 0; +} + +int mpx_release(struct task_struct *tsk) +{ + if (!boot_cpu_has(X86_FEATURE_MPX)) + return -EINVAL; + + /* unregister mmu_notifier */ + mmu_notifier_unregister(&mpx_mn, current->mm); + + return 0; +} + static bool allocate_bt(unsigned long bd_entry) { unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 289760f..19ab881 
100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -149,4 +149,10 @@ #define PR_GET_TID_ADDRESS 40 +/* + * Init/release MPX related resource. + */ +#define PR_MPX_INIT41 +#define PR_MPX_RELEASE 42 + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index c18ecca..bbaf573 100644 --
[PATCH 1/5] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some information about Intel MPX. Signed-off-by: Qiaowei Ren --- Documentation/x86/intel_mpx.txt | 76 +++ 1 files changed, 76 insertions(+), 0 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt new file mode 100644 index 000..778d06e --- /dev/null +++ b/Documentation/x86/intel_mpx.txt @@ -0,0 +1,76 @@ +Intel(R) MPX Overview: += + +Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new +capability introduced into Intel Architecture. Intel MPX can +increase the robustness of software when it is used in conjunction +with compiler changes to check that memory references intended +at compile time do not become unsafe at runtime. + +Two of the most important goals of Intel MPX are to provide +this capability at very low performance overhead for newly +compiled code, and to provide compatibility mechanisms with +legacy software components. A direct benefit Intel MPX provides +is hardening software against malicious attacks designed to +cause or exploit buffer overruns. + +For details about the Intel MPX instructions, see "Intel(R) +Architecture Instruction Set Extensions Programming Reference". + +Intel(R) MPX Programming Model +-- + +Intel MPX introduces new registers and new instructions that +operate on these registers. Some of the registers added are +bounds registers which store a pointer's lower bound and upper +bound limits. Whenever the pointer is used, the requested +reference is checked against the pointer's associated bounds, +thereby preventing out-of-bound memory access (such as buffer +overflows and overruns). Out-of-bounds memory references +initiate a #BR exception which can then be handled in an +appropriate manner. 
+ +Loading and Storing Bounds using Translation + + +Intel MPX defines two instructions for load/store of the linear +address of a pointer to a buffer, along with the bounds of the +buffer into a paging structure of extended bounds. Specifically +when storing extended bounds, the processor will perform address +translation of the address where the pointer is stored to an +address in the Bound Table (BT) to determine the store location +of extended bounds. Loading extended bounds performs the +reverse sequence. + +The structure in memory to load/store an extended bound is a +4-tuple consisting of lower bound, upper bound, pointer value +and a reserved field. Bound loads and stores access 32-bit or +64-bit operand size according to the operation mode. Thus, +a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits +in 64-bit mode. + +The linear address of a bound table is stored in a Bound +Directory (BD) entry. And the linear address of the bound +directory is derived from either BNDCFGU or BNDCFGS registers. +Bounds in memory are stored in Bound Tables (BT) as an extended +bound, which are accessed via Bound Directory (BD) and address +translation performed by BNDLDX/BNDSTX instructions. + +Bounds Directory (BD) and Bounds Tables (BT) are stored in +application memory and are allocated by the application (in case +of kernel use, the structures will be in kernel memory). The +bound directory and each instance of bound table are in contiguous +linear memory. + +XSAVE/XRSTOR Support of Intel MPX State + + +Enabling Intel MPX requires an OS to manage two bits in XCR0: + - BNDREGS for saving and restoring registers BND0-BND3, + - BNDCSR for saving and restoring the user-mode configuration +(BNDCFGU) and the status register (BNDSTATUS). + +The reason for having two separate bits is that BND0-BND3 is +likely to be volatile state, while BNDCFGU and BNDSTATUS are not. 
+Therefore, an OS has flexibility in handling these two states +differently in saving or restoring them. -- 1.7.1
[PATCH 2/5] x86, mpx: hook #BR exception handler to allocate bound tables
An access to an invalid bound directory entry will cause a #BR exception. This patch hooks the #BR exception handler to allocate one bound table and bind it to that bound directory entry. This avoids the need to forward the #BR exception to user space when the bound directory has an invalid entry. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 35 + arch/x86/kernel/Makefile |1 + arch/x86/kernel/mpx.c | 44 ++ arch/x86/kernel/traps.c| 46 +++- 4 files changed, 125 insertions(+), 1 deletions(-) create mode 100644 arch/x86/include/asm/mpx.h create mode 100644 arch/x86/kernel/mpx.c diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h new file mode 100644 index 000..d074153 --- /dev/null +++ b/arch/x86/include/asm/mpx.h @@ -0,0 +1,35 @@ +#ifndef _ASM_X86_MPX_H +#define _ASM_X86_MPX_H + +#include +#include + +#ifdef CONFIG_X86_64 + +#define MPX_L1_BITS 28 +#define MPX_L1_SHIFT 3 +#define MPX_L2_BITS 17 +#define MPX_L2_SHIFT 5 +#define MPX_IGN_BITS 3 +#define MPX_L2_NODE_ADDR_MASK 0xfff8UL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#else + +#define MPX_L1_BITS 20 +#define MPX_L1_SHIFT 2 +#define MPX_L2_BITS 10 +#define MPX_L2_SHIFT 4 +#define MPX_IGN_BITS 2 +#define MPX_L2_NODE_ADDR_MASK 0xfffcUL + +#define MPX_BNDSTA_ADDR_MASK 0xfffcUL +#define MPX_BNDCFG_ADDR_MASK 0xf000UL + +#endif + +void do_mpx_bt_fault(struct xsave_struct *xsave_buf); + +#endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index a5408b9..bba7a71 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -38,6 +38,7 @@ obj-y += resource.o obj-y += process.o obj-y += i387.o xsave.o +obj-y += mpx.o obj-y += ptrace.o obj-$(CONFIG_X86_32) += tls.o obj-$(CONFIG_IA32_EMULATION) += tls.o diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c new file mode 100644 index 000..767b3bf --- /dev/null +++ b/arch/x86/kernel/mpx.c @@ -0,0 +1,44 @@ +#include +#include +#include +#include +#include 
+#include +#include +#include + +static bool allocate_bt(unsigned long bd_entry) +{ + unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT); + unsigned long bt_addr, old_val; + + bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0); + if (bt_addr == -1) { + pr_err("L2 Node Allocation Failed at L1 addr %lx\n", + bd_entry); + return false; + } + bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01; + + user_atomic_cmpxchg_inatomic(&old_val, + (long __user *)bd_entry, 0, bt_addr); + if (old_val) + vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size); + + return true; +} + +void do_mpx_bt_fault(struct xsave_struct *xsave_buf) +{ + unsigned long status; + unsigned long bd_entry, bd_base; + unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT); + + bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK; + status = xsave_buf->bndcsr.status_reg; + + bd_entry = status & MPX_BNDSTA_ADDR_MASK; + if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size)) + allocate_bt(bd_entry); +} diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 8c8093b..eb04039 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -59,6 +59,7 @@ #include #include #include +#include #ifdef CONFIG_X86_64 #include @@ -214,7 +215,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip) DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow) -DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds", bounds) DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip) DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun", @@ -267,6 +267,50 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) } #endif +dotraplinkage void do_bounds(struct pt_regs *regs, long error_code) +{ + enum ctx_state prev_state; + unsigned long status; + struct xsave_struct *xsave_buf; + 
struct task_struct *tsk = current; + + prev_state = exception_enter(); + if (notify_die(DIE_TRAP, "bounds", regs, error_code, + X86_TRAP_BR
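The size arithmetic behind allocate_bt() and do_mpx_bt_fault() can be sketched as follows. This is a standalone illustration that copies the MPX_L1_*/MPX_L2_* values from the mpx.h hunk above; the struct mpx_layout type and the helper names are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical container for the constants in the patch's mpx.h:
 * index bits and log2(entry size) for the bound directory (L1)
 * and for a bound table (L2).
 */
struct mpx_layout {
	unsigned l1_bits, l1_shift;
	unsigned l2_bits, l2_shift;
};

/* Values taken from the CONFIG_X86_64 branch and the #else branch. */
static const struct mpx_layout mpx_layout_64 = { 28, 3, 17, 5 };
static const struct mpx_layout mpx_layout_32 = { 20, 2, 10, 4 };

/* Bound directory size in bytes: 1 << (MPX_L1_BITS + MPX_L1_SHIFT). */
static uint64_t mpx_bd_size(const struct mpx_layout *l)
{
	assert(l->l1_bits + l->l1_shift < 64);
	return UINT64_C(1) << (l->l1_bits + l->l1_shift);
}

/* Bound table size in bytes: 1 << (MPX_L2_BITS + MPX_L2_SHIFT). */
static uint64_t mpx_bt_size(const struct mpx_layout *l)
{
	assert(l->l2_bits + l->l2_shift < 64);
	return UINT64_C(1) << (l->l2_bits + l->l2_shift);
}

/* Same range check as do_mpx_bt_fault(): base <= entry < base + size. */
static int mpx_entry_in_directory(const struct mpx_layout *l,
				  uint64_t bd_base, uint64_t bd_entry)
{
	return bd_entry >= bd_base && bd_entry < bd_base + mpx_bd_size(l);
}
```

With these values the 64-bit bound directory spans 2 GiB and each bound table 4 MiB; the 32-bit figures are 4 MiB and 16 KiB.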
[PATCH 5/5] x86, mpx: extend siginfo structure to include bound violation information
This patch adds new fields describing a bound violation to the siginfo structure. si_lower and si_upper are the lower and upper bounds, respectively, in effect when the bound violation occurred. These fields are set in the #BR exception handler by decoding the user instruction and constructing the faulting pointer. A userspace application can read the violation address, lower bound and upper bound from this new siginfo structure. Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/mpx.h | 39 + arch/x86/kernel/mpx.c | 289 arch/x86/kernel/traps.c|6 + include/uapi/asm-generic/siginfo.h |9 +- kernel/signal.c|4 + 5 files changed, 346 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h index 9652e9e..8c1c914 100644 --- a/arch/x86/include/asm/mpx.h +++ b/arch/x86/include/asm/mpx.h @@ -30,6 +30,43 @@ #endif +struct mpx_insn_field { + union { + signed int value; + unsigned char bytes[4]; + }; + unsigned char nbytes; +}; + +struct mpx_insn { + struct mpx_insn_field rex_prefix; /* REX prefix */ + struct mpx_insn_field modrm; + struct mpx_insn_field sib; + struct mpx_insn_field displacement; + + unsigned char addr_bytes; /* effective address size */ + unsigned char limit; + unsigned char x86_64; + + const unsigned char *kaddr; /* kernel address of insn to analyze */ + const unsigned char *next_byte; +}; + +#define MAX_MPX_INSN_SIZE 15 + +#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6) +#define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3) +#define X86_MODRM_RM(modrm) ((modrm) & 0x07) + +#define X86_SIB_SCALE(sib) (((sib) & 0xc0) >> 6) +#define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3) +#define X86_SIB_BASE(sib) ((sib) & 0x07) + +#define X86_REX_W(rex) ((rex) & 8) +#define X86_REX_R(rex) ((rex) & 4) +#define X86_REX_X(rex) ((rex) & 2) +#define X86_REX_B(rex) ((rex) & 1) + typedef union { struct { unsigned long ignored:MPX_IGN_BITS; @@ -40,5 +77,7 @@ typedef union { } mpx_addr; void do_mpx_bt_fault(struct xsave_struct *xsave_buf); +void 
do_mpx_bounds(struct pt_regs *regs, siginfo_t *info, + struct xsave_struct *xsave_buf); #endif /* _ASM_X86_MPX_H */ diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c index ffe5aee..3770991 100644 --- a/arch/x86/kernel/mpx.c +++ b/arch/x86/kernel/mpx.c @@ -91,6 +91,269 @@ int mpx_release(struct task_struct *tsk) return 0; } +typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t; +static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs, +reg_type_t type) +{ + int regno = 0; + unsigned char modrm = (unsigned char)insn->modrm.value; + unsigned char sib = (unsigned char)insn->sib.value; + + static const int regoff[] = { + offsetof(struct pt_regs, ax), + offsetof(struct pt_regs, cx), + offsetof(struct pt_regs, dx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, sp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), +#ifdef CONFIG_X86_64 + offsetof(struct pt_regs, r8), + offsetof(struct pt_regs, r9), + offsetof(struct pt_regs, r10), + offsetof(struct pt_regs, r11), + offsetof(struct pt_regs, r12), + offsetof(struct pt_regs, r13), + offsetof(struct pt_regs, r14), + offsetof(struct pt_regs, r15), +#endif + }; + + switch (type) { + case REG_TYPE_RM: + regno = X86_MODRM_RM(modrm); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + case REG_TYPE_INDEX: + regno = X86_SIB_INDEX(sib); + if (X86_REX_X(insn->rex_prefix.value)) + regno += 8; + break; + + case REG_TYPE_BASE: + regno = X86_SIB_BASE(sib); + if (X86_REX_B(insn->rex_prefix.value) == 1) + regno += 8; + break; + + default: + break; + } + + return regs_get_register(regs, regoff[regno]); +} + +/* + * return the address being referenced by the instruction + * for mod=3 returning the content of the rm reg + * for mod!=3 calculates the address using SIB and Disp + */ +static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs) +{ + unsigned long addr; + unsigned long base; + unsigned long 
indx; + unsigned char modrm = (unsigned char)insn->mod
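The field extraction that get_reg() relies on can be tried outside the kernel with a small sketch. The macros are copied from the mpx.h hunk of this patch; the helper functions rm_regno() and index_regno() are hypothetical names introduced here for illustration. Note that a masked flag such as `(rex) & 2` evaluates to 0 or 2, so the sketch tests it for non-zero rather than comparing it with 1.

```c
#include <assert.h>

/* Field-extraction macros as defined in the patch's mpx.h. */
#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
#define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3)
#define X86_MODRM_RM(modrm)  ((modrm) & 0x07)

#define X86_SIB_SCALE(sib) (((sib) & 0xc0) >> 6)
#define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3)
#define X86_SIB_BASE(sib)  ((sib) & 0x07)

#define X86_REX_X(rex) ((rex) & 2)
#define X86_REX_B(rex) ((rex) & 1)

/* Hypothetical helper: register number selected by ModRM.rm plus REX.B. */
static unsigned rm_regno(unsigned char rex, unsigned char modrm)
{
	unsigned regno = X86_MODRM_RM(modrm);

	assert(regno < 8);
	if (X86_REX_B(rex))
		regno += 8;	/* REX.B selects r8..r15 */
	return regno;
}

/* Hypothetical helper: register number selected by SIB.index plus REX.X. */
static unsigned index_regno(unsigned char rex, unsigned char sib)
{
	unsigned regno = X86_SIB_INDEX(sib);

	assert(regno < 8);
	if (X86_REX_X(rex))	/* non-zero test: the macro yields 0 or 2 */
		regno += 8;
	return regno;
}
```

For example, ModRM byte 0x44 decodes to mod=1, reg=0, rm=4, and SIB byte 0x88 decodes to scale=2, index=1, base=0.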
[tip:x86/mpx] x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic
Commit-ID: 0ee3b6f87d4d748d5362cb47ff33fa1553805cb4 Gitweb: http://git.kernel.org/tip/0ee3b6f87d4d748d5362cb47ff33fa1553805cb4 Author: Qiaowei Ren AuthorDate: Sat, 14 Dec 2013 14:25:03 +0800 Committer: H. Peter Anvin CommitDate: Mon, 16 Dec 2013 09:08:13 -0800 x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic futex_atomic_cmpxchg_inatomic() is simply the 32-bit implementation of user_atomic_cmpxchg_inatomic(), which in turn is simply a generalization of the original code in futex_atomic_cmpxchg_inatomic(). Use the newly generalized user_atomic_cmpxchg_inatomic() as the futex implementation, too. [ hpa: retain the inline in futex.h rather than changing it to a macro ] Signed-off-by: Qiaowei Ren Link: http://lkml.kernel.org/r/1387002303-6620-2-git-send-email-qiaowei@intel.com Signed-off-by: H. Peter Anvin Cc: Peter Zijlstra --- arch/x86/include/asm/futex.h | 21 + 1 file changed, 1 insertion(+), 20 deletions(-) diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h index be27ba1..b4c1f54 100644 --- a/arch/x86/include/asm/futex.h +++ b/arch/x86/include/asm/futex.h @@ -110,26 +110,7 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr) static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newval) { - int ret = 0; - - if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))) - return -EFAULT; - - asm volatile("\t" ASM_STAC "\n" -"1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n" -"2:\t" ASM_CLAC "\n" -"\t.section .fixup, \"ax\"\n" -"3:\tmov %3, %0\n" -"\tjmp 2b\n" -"\t.previous\n" -_ASM_EXTABLE(1b, 3b) -: "+r" (ret), "=a" (oldval), "+m" (*uaddr) -: "i" (-EFAULT), "r" (newval), "1" (oldval) -: "memory" - ); - - *uval = oldval; - return ret; + return user_atomic_cmpxchg_inatomic(uval, uaddr, oldval, newval); } #endif -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at 
http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
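The compare-and-exchange contract shared by both functions (store the new value only if the location still holds the expected old value, and report what was actually found) can be illustrated in userspace. The following is a rough analogue built on C11 <stdatomic.h>, not the kernel macro; the function name cmpxchg_report_old() is made up, and it has no counterpart to the -EFAULT path that the kernel version takes on a faulting user address.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue of a cmpxchg that reports the old value.
 * If *ptr == old, store new_val. Either way, *uval receives the value that
 * was observed in *ptr. Always returns 0 (there is no fault to report here).
 */
static int cmpxchg_report_old(_Atomic uint64_t *ptr, uint64_t *uval,
			      uint64_t old, uint64_t new_val)
{
	uint64_t expected = old;

	assert(ptr && uval);
	/* On failure, 'expected' is overwritten with the current value. */
	atomic_compare_exchange_strong(ptr, &expected, new_val);
	*uval = expected;
	return 0;
}
```

A caller that wants to install a value only into an empty slot passes old = 0 and checks whether the reported value is non-zero, which is the pattern allocate_bt() uses to detect that another thread has already filled the bound directory entry.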
[tip:x86/mpx] x86: add user_atomic_cmpxchg_inatomic at uaccess.h
Commit-ID: f09174c501f8bb259788cc36d5a7aa5b2831fb5e Gitweb: http://git.kernel.org/tip/f09174c501f8bb259788cc36d5a7aa5b2831fb5e Author: Qiaowei Ren AuthorDate: Sat, 14 Dec 2013 14:25:02 +0800 Committer: H. Peter Anvin CommitDate: Mon, 16 Dec 2013 09:07:57 -0800 x86: add user_atomic_cmpxchg_inatomic at uaccess.h This patch adds user_atomic_cmpxchg_inatomic() to use CMPXCHG instruction against a user space address. This generalizes the already existing futex_atomic_cmpxchg_inatomic() so it can be used in other contexts. This will be used in the upcoming support for Intel MPX (Memory Protection Extensions.) [ hpa: replaced #ifdef inside a macro with IS_ENABLED() ] Signed-off-by: Qiaowei Ren Link: http://lkml.kernel.org/r/1387002303-6620-1-git-send-email-qiaowei@intel.com Signed-off-by: H. Peter Anvin Cc: Peter Zijlstra --- arch/x86/include/asm/uaccess.h | 92 ++ 1 file changed, 92 insertions(+) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 8ec57c0..48ff838 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -525,6 +525,98 @@ extern __must_check long strnlen_user(const char __user *str, long n); unsigned long __must_check clear_user(void __user *mem, unsigned long len); unsigned long __must_check __clear_user(void __user *mem, unsigned long len); +extern void __cmpxchg_wrong_size(void) + __compiletime_error("Bad argument size for cmpxchg"); + +#define __user_atomic_cmpxchg_inatomic(uval, ptr, old, new, size) \ +({ \ + int __ret = 0; \ + __typeof__(ptr) __uval = (uval);\ + __typeof__(*(ptr)) __old = (old); \ + __typeof__(*(ptr)) __new = (new); \ + switch (size) { \ + case 1: \ + { \ + asm volatile("\t" ASM_STAC "\n" \ + "1:\t" LOCK_PREFIX "cmpxchgb %4, %2\n" \ + "2:\t" ASM_CLAC "\n"\ + "\t.section .fixup, \"ax\"\n" \ + "3:\tmov %3, %0\n" \ + "\tjmp 2b\n"\ + "\t.previous\n" \ + _ASM_EXTABLE(1b, 3b)\ + : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \ + : "i" (-EFAULT), "q" (__new), "1" (__old) \ + : "memory" 
\ + ); \ + break; \ + } \ + case 2: \ + { \ + asm volatile("\t" ASM_STAC "\n" \ + "1:\t" LOCK_PREFIX "cmpxchgw %4, %2\n" \ + "2:\t" ASM_CLAC "\n"\ + "\t.section .fixup, \"ax\"\n" \ + "3:\tmov %3, %0\n" \ + "\tjmp 2b\n"\ + "\t.previous\n" \ + _ASM_EXTABLE(1b, 3b)\ + : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \ + : "i" (-EFAULT), "r" (__new), "1" (__old) \ + : "memory" \ + ); \ + break; \ + } \ + case 4: \ + { \ + asm volatile("\t&q
[PATCH 1/2] x86: add user_atomic_cmpxchg_inatomic at uaccess.h
This patch adds user_atomic_cmpxchg_inatomic() to use CMPXCHG instruction against a user space address. This generalizes the already existing futex_atomic_cmpxchg_inatomic() so it can be used in other contexts. This will be used in the upcoming support for Intel MPX (Memory Protection Extensions.) Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/uaccess.h | 91 1 files changed, 91 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 5838fa9..894d8bf 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -525,6 +525,97 @@ extern __must_check long strnlen_user(const char __user *str, long n); unsigned long __must_check clear_user(void __user *mem, unsigned long len); unsigned long __must_check __clear_user(void __user *mem, unsigned long len); +extern void __cmpxchg_wrong_size(void) + __compiletime_error("Bad argument size for cmpxchg"); + +#define __user_atomic_cmpxchg_inatomic(uval, ptr, old, new, size) \ +({ \ + int __ret = 0; \ + __typeof__(ptr) __uval = (uval);\ + __typeof__(*(ptr)) __old = (old); \ + __typeof__(*(ptr)) __new = (new); \ + switch (size) { \ + case 1: \ + { \ + asm volatile("\t" ASM_STAC "\n" \ + "1:\t" LOCK_PREFIX "cmpxchgb %4, %2\n" \ + "2:\t" ASM_CLAC "\n"\ + "\t.section .fixup, \"ax\"\n" \ + "3:\tmov %3, %0\n" \ + "\tjmp 2b\n"\ + "\t.previous\n" \ + _ASM_EXTABLE(1b, 3b)\ + : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \ + : "i" (-EFAULT), "q" (__new), "1" (__old) \ + : "memory" \ + ); \ + break; \ + } \ + case 2: \ + { \ + asm volatile("\t" ASM_STAC "\n" \ + "1:\t" LOCK_PREFIX "cmpxchgw %4, %2\n" \ + "2:\t" ASM_CLAC "\n"\ + "\t.section .fixup, \"ax\"\n" \ + "3:\tmov %3, %0\n" \ + "\tjmp 2b\n"\ + "\t.previous\n" \ + _ASM_EXTABLE(1b, 3b)\ + : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \ + : "i" (-EFAULT), "r" (__new), "1" (__old) \ + : "memory" \ + ); \ + break; \ + } \ + case 4: \ + { \ + asm volatile("\t" ASM_STAC "\n" \ + "1:\t" LOCK_PREFIX "cmpxchgl %4, 
%2\n" \ + "2:\t" ASM_CLAC "\n"\ + "\t.section .fixup, \"ax\"\n" \ + "3:\tmov %3, %0\n" \ + "\tjmp 2b\n"
[PATCH 2/2] x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic
futex_atomic_cmpxchg_inatomic() is only the 32bit implementation of user_atomic_cmpxchg_inatomic(). This patch replaces it with user_atomic_cmpxchg_inatomic(). Signed-off-by: Qiaowei Ren --- arch/x86/include/asm/futex.h | 27 ++- 1 files changed, 2 insertions(+), 25 deletions(-) diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h index be27ba1..a9f7de4 100644 --- a/arch/x86/include/asm/futex.h +++ b/arch/x86/include/asm/futex.h @@ -41,6 +41,8 @@ "+m" (*uaddr), "=&r" (tem) \ : "r" (oparg), "i" (-EFAULT), "1" (0)) +#define futex_atomic_cmpxchg_inatomic user_atomic_cmpxchg_inatomic + static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr) { int op = (encoded_op >> 28) & 7; @@ -107,30 +109,5 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr) return ret; } -static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, - u32 oldval, u32 newval) -{ - int ret = 0; - - if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))) - return -EFAULT; - - asm volatile("\t" ASM_STAC "\n" -"1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n" -"2:\t" ASM_CLAC "\n" -"\t.section .fixup, \"ax\"\n" -"3:\tmov %3, %0\n" -"\tjmp 2b\n" -"\t.previous\n" -_ASM_EXTABLE(1b, 3b) -: "+r" (ret), "=a" (oldval), "+m" (*uaddr) -: "i" (-EFAULT), "r" (newval), "1" (oldval) -: "memory" - ); - - *uval = oldval; - return ret; -} - #endif #endif /* _ASM_X86_FUTEX_H */ -- 1.7.1
[PATCH] Documentation: move intel_txt.txt to Documentation/x86
Documentation/x86 is a more fitting place for intel_txt.txt. Signed-off-by: Qiaowei Ren --- Documentation/intel_txt.txt | 210 --- Documentation/x86/intel_txt.txt | 210 +++ 2 files changed, 210 insertions(+), 210 deletions(-) delete mode 100644 Documentation/intel_txt.txt create mode 100644 Documentation/x86/intel_txt.txt diff --git a/Documentation/intel_txt.txt b/Documentation/intel_txt.txt deleted file mode 100644 index 91d89c5..000 --- a/Documentation/intel_txt.txt +++ /dev/null @@ -1,210 +0,0 @@ -Intel(R) TXT Overview: -= - -Intel's technology for safer computing, Intel(R) Trusted Execution -Technology (Intel(R) TXT), defines platform-level enhancements that -provide the building blocks for creating trusted platforms. - -Intel TXT was formerly known by the code name LaGrande Technology (LT). - -Intel TXT in Brief: -o Provides dynamic root of trust for measurement (DRTM) -o Data protection in case of improper shutdown -o Measurement and verification of launched environment - -Intel TXT is part of the vPro(TM) brand and is also available some -non-vPro systems. It is currently available on desktop systems -based on the Q35, X38, Q45, and Q43 Express chipsets (e.g. Dell -Optiplex 755, HP dc7800, etc.) and mobile systems based on the GM45, -PM45, and GS45 Express chipsets. - -For more information, see http://www.intel.com/technology/security/. -This site also has a link to the Intel TXT MLE Developers Manual, -which has been updated for the new released platforms. 
- -Intel TXT has been presented at various events over the past few -years, some of which are: - LinuxTAG 2008: - http://www.linuxtag.org/2008/en/conf/events/vp-donnerstag.html - TRUST2008: - http://www.trust-conference.eu/downloads/Keynote-Speakers/ - 3_David-Grawrock_The-Front-Door-of-Trusted-Computing.pdf - IDF, Shanghai: - http://www.prcidf.com.cn/index_en.html - IDFs 2006, 2007 (I'm not sure if/where they are online) - -Trusted Boot Project Overview: -= - -Trusted Boot (tboot) is an open source, pre-kernel/VMM module that -uses Intel TXT to perform a measured and verified launch of an OS -kernel/VMM. - -It is hosted on SourceForge at http://sourceforge.net/projects/tboot. -The mercurial source repo is available at http://www.bughost.org/ -repos.hg/tboot.hg. - -Tboot currently supports launching Xen (open source VMM/hypervisor -w/ TXT support since v3.2), and now Linux kernels. - - -Value Proposition for Linux or "Why should you care?" -= - -While there are many products and technologies that attempt to -measure or protect the integrity of a running kernel, they all -assume the kernel is "good" to begin with. The Integrity -Measurement Architecture (IMA) and Linux Integrity Module interface -are examples of such solutions. - -To get trust in the initial kernel without using Intel TXT, a -static root of trust must be used. This bases trust in BIOS -starting at system reset and requires measurement of all code -executed between system reset through the completion of the kernel -boot as well as data objects used by that code. In the case of a -Linux kernel, this means all of BIOS, any option ROMs, the -bootloader and the boot config. In practice, this is a lot of -code/data, much of which is subject to change from boot to boot -(e.g. changing NICs may change option ROMs). Without reference -hashes, these measurement changes are difficult to assess or -confirm as benign. 
This process also does not provide DMA -protection, memory configuration/alias checks and locks, crash -protection, or policy support. - -By using the hardware-based root of trust that Intel TXT provides, -many of these issues can be mitigated. Specifically: many -pre-launch components can be removed from the trust chain, DMA -protection is provided to all launched components, a large number -of platform configuration checks are performed and values locked, -protection is provided for any data in the event of an improper -shutdown, and there is support for policy-based execution/verification. -This provides a more stable measurement and a higher assurance of -system configuration and initial state than would be otherwise -possible. Since the tboot project is open source, source code for -almost all parts of the trust chain is available (excepting SMM and -Intel-provided firmware). - -How Does it Work? -= - -o Tboot is an executable that is launched by the bootloader as - the "kernel" (the binary the bootloader executes). -o It performs all of the work necessary to determine if the - platform supports Intel TXT and, if so, executes the GETSEC[SENTER] - processor instruction that initiates the dynamic root of trust. - - If tboot determines that the system doe
[tip:x86/cpufeature] x86, xsave: Support eager-only xsave features, add MPX support
Commit-ID: e7d820a5e549b3eb6c3f9467507566565646a669 Gitweb: http://git.kernel.org/tip/e7d820a5e549b3eb6c3f9467507566565646a669 Author: Qiaowei Ren AuthorDate: Thu, 5 Dec 2013 17:15:34 +0800 Committer: H. Peter Anvin CommitDate: Fri, 6 Dec 2013 17:17:42 -0800 x86, xsave: Support eager-only xsave features, add MPX support Some features, like Intel MPX, work only if the kernel uses eagerfpu model. So we should force eagerfpu on unless the user has explicitly disabled it. Add definitions for Intel MPX and add it to the supported list. [ hpa: renamed XSTATE_FLEXIBLE to XSTATE_LAZY and added comments ] Signed-off-by: Qiaowei Ren Link: http://lkml.kernel.org/r/9e0be1322f2f2246bd820da9fc397ade014a6...@shsmsx102.ccr.corp.intel.com Signed-off-by: H. Peter Anvin --- arch/x86/include/asm/processor.h | 23 +++ arch/x86/include/asm/xsave.h | 14 ++ arch/x86/kernel/xsave.c | 10 ++ 3 files changed, 43 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 7b034a4..b7845a1 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -370,6 +370,26 @@ struct ymmh_struct { u32 ymmh_space[64]; }; +struct lwp_struct { + u64 lwpcb_addr; + u32 flags; + u32 buf_head_offset; + u64 buf_base; + u32 buf_size; + u32 filters; + u64 saved_event_record[4]; + u32 event_counter[16]; +}; + +struct bndregs_struct { + u64 bndregs[8]; +} __packed; + +struct bndcsr_struct { + u64 cfg_reg_u; + u64 status_reg; +} __packed; + struct xsave_hdr_struct { u64 xstate_bv; u64 reserved1[2]; @@ -380,6 +400,9 @@ struct xsave_struct { struct i387_fxsave_struct i387; struct xsave_hdr_struct xsave_hdr; struct ymmh_struct ymmh; + struct lwp_struct lwp; + struct bndregs_struct bndregs; + struct bndcsr_struct bndcsr; /* new processor state extensions will go here */ } __attribute__ ((packed, aligned (64))); diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h index 0415cda..5547389 100644 --- 
a/arch/x86/include/asm/xsave.h +++ b/arch/x86/include/asm/xsave.h @@ -9,6 +9,8 @@ #define XSTATE_FP 0x1 #define XSTATE_SSE 0x2 #define XSTATE_YMM 0x4 +#define XSTATE_BNDREGS 0x8 +#define XSTATE_BNDCSR 0x10 #define XSTATE_FPSSE (XSTATE_FP | XSTATE_SSE) @@ -20,10 +22,14 @@ #define XSAVE_YMM_SIZE 256 #define XSAVE_YMM_OFFSET(XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET) -/* - * These are the features that the OS can handle currently. - */ -#define XCNTXT_MASK(XSTATE_FP | XSTATE_SSE | XSTATE_YMM) +/* Supported features which support lazy state saving */ +#define XSTATE_LAZY(XSTATE_FP | XSTATE_SSE | XSTATE_YMM) + +/* Supported features which require eager state saving */ +#define XSTATE_EAGER (XSTATE_BNDREGS | XSTATE_BNDCSR) + +/* All currently supported features */ +#define XCNTXT_MASK(XSTATE_LAZY | XSTATE_EAGER) #ifdef CONFIG_X86_64 #define REX_PREFIX "0x48, " diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c index 422fd82..a4b451c 100644 --- a/arch/x86/kernel/xsave.c +++ b/arch/x86/kernel/xsave.c @@ -562,6 +562,16 @@ static void __init xstate_enable_boot_cpu(void) if (cpu_has_xsaveopt && eagerfpu != DISABLE) eagerfpu = ENABLE; + if (pcntxt_mask & XSTATE_EAGER) { + if (eagerfpu == DISABLE) { + pr_err("eagerfpu not present, disabling some xstate features: 0x%llx\n", + pcntxt_mask & XSTATE_EAGER); + pcntxt_mask &= ~XSTATE_EAGER; + } else { + eagerfpu = ENABLE; + } + } + pr_info("enabled xstate_bv 0x%llx, cntxt size 0x%x\n", pcntxt_mask, xstate_size); }
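The decision added to xstate_enable_boot_cpu() can be modelled as a small pure function. The XSTATE_* values below are the ones from this patch; the enum eagerfpu_mode type, its EAGERFPU_* names and the function resolve_xstate_mask() are assumptions made for this sketch and are not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Feature bits and masks as defined in the patch's xsave.h. */
#define XSTATE_FP	0x1
#define XSTATE_SSE	0x2
#define XSTATE_YMM	0x4
#define XSTATE_BNDREGS	0x8
#define XSTATE_BNDCSR	0x10

#define XSTATE_LAZY	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
#define XSTATE_EAGER	(XSTATE_BNDREGS | XSTATE_BNDCSR)
#define XCNTXT_MASK	(XSTATE_LAZY | XSTATE_EAGER)

/* Hypothetical stand-in for the kernel's eagerfpu setting. */
enum eagerfpu_mode { EAGERFPU_AUTO, EAGERFPU_ENABLE, EAGERFPU_DISABLE };

/*
 * Mirrors the added hunk: if any eager-only feature is present, either
 * drop those features (user disabled eagerfpu) or force eagerfpu on.
 */
static uint64_t resolve_xstate_mask(uint64_t pcntxt_mask,
				    enum eagerfpu_mode *eagerfpu)
{
	assert(eagerfpu);
	if (pcntxt_mask & XSTATE_EAGER) {
		if (*eagerfpu == EAGERFPU_DISABLE)
			pcntxt_mask &= ~(uint64_t)XSTATE_EAGER;
		else
			*eagerfpu = EAGERFPU_ENABLE;
	}
	return pcntxt_mask;
}
```

With all five feature bits present (mask 0x1f), disabling eagerfpu leaves only the lazy features (0x7); otherwise the mask is unchanged and eagerfpu is switched on.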
[PATCH v3 3/3] X86, mpx: Intel MPX xstate feature definition
This patch defines the MPX xstate features and extends struct xsave_struct to support Intel MPX. Signed-off-by: Qiaowei Ren Signed-off-by: Xudong Hao Signed-off-by: Liu Jinsong --- arch/x86/include/asm/processor.h | 12 arch/x86/include/asm/xsave.h |6 +- 2 files changed, 17 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 987c75e..2fe2e75 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -370,6 +370,15 @@ struct ymmh_struct { u32 ymmh_space[64]; }; +struct bndregs_struct { + u64 bndregs[8]; +} __packed; + +struct bndcsr_struct { + u64 cfg_reg_u; + u64 status_reg; +} __packed; + struct xsave_hdr_struct { u64 xstate_bv; u64 reserved1[2]; @@ -380,6 +389,9 @@ struct xsave_struct { struct i387_fxsave_struct i387; struct xsave_hdr_struct xsave_hdr; struct ymmh_struct ymmh; + u8 lwp_area[128]; + struct bndregs_struct bndregs; + struct bndcsr_struct bndcsr; /* new processor state extensions will go here */ } __attribute__ ((packed, aligned (64))); diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h index 0415cda..5cd9de3 100644 --- a/arch/x86/include/asm/xsave.h +++ b/arch/x86/include/asm/xsave.h @@ -9,6 +9,8 @@ #define XSTATE_FP 0x1 #define XSTATE_SSE 0x2 #define XSTATE_YMM 0x4 +#define XSTATE_BNDREGS 0x8 +#define XSTATE_BNDCSR 0x10 #define XSTATE_FPSSE (XSTATE_FP | XSTATE_SSE) @@ -20,10 +22,12 @@ #define XSAVE_YMM_SIZE 256 #define XSAVE_YMM_OFFSET(XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET) +#define XSTATE_FLEXIBLE (XSTATE_FP | XSTATE_SSE | XSTATE_YMM) +#define XSTATE_EAGER (XSTATE_BNDREGS | XSTATE_BNDCSR) /* * These are the features that the OS can handle currently. 
*/ -#define XCNTXT_MASK(XSTATE_FP | XSTATE_SSE | XSTATE_YMM) +#define XCNTXT_MASK(XSTATE_FLEXIBLE | XSTATE_EAGER) #ifdef CONFIG_X86_64 #define REX_PREFIX "0x48, " -- 1.7.1
[tip:x86/cpufeature] x86, cpufeature: Define the Intel MPX feature flag
Commit-ID: 191f57c137bcce0e3e9313acb77b2f114d15afbb Gitweb: http://git.kernel.org/tip/191f57c137bcce0e3e9313acb77b2f114d15afbb Author: Qiaowei Ren AuthorDate: Sat, 7 Dec 2013 08:20:57 +0800 Committer: H. Peter Anvin CommitDate: Fri, 6 Dec 2013 10:21:44 -0800 x86, cpufeature: Define the Intel MPX feature flag Define the Intel MPX (Memory Protection Extensions) CPU feature flag in the cpufeature list. Signed-off-by: Qiaowei Ren Link: http://lkml.kernel.org/r/1386375658-2191-2-git-send-email-qiaowei@intel.com Signed-off-by: Xudong Hao Signed-off-by: Liu Jinsong Signed-off-by: H. Peter Anvin --- arch/x86/include/asm/cpufeature.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 89270b4..e099f95 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -216,6 +216,7 @@ #define X86_FEATURE_ERMS (9*32+ 9) /* Enhanced REP MOVSB/STOSB */ #define X86_FEATURE_INVPCID(9*32+10) /* Invalidate Processor Context ID */ #define X86_FEATURE_RTM(9*32+11) /* Restricted Transactional Memory */ +#define X86_FEATURE_MPX(9*32+14) /* Memory Protection Extension */ #define X86_FEATURE_RDSEED (9*32+18) /* The RDSEED instruction */ #define X86_FEATURE_ADX(9*32+19) /* The ADCX and ADOX instructions */ #define X86_FEATURE_SMAP (9*32+20) /* Supervisor Mode Access Prevention */
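The (9*32+14) encoding packs a capability word index and a bit position into one integer. The sketch below shows how such a value splits back into its parts; the helper names and the caps[] array are hypothetical and only model the idea, they are not the kernel's cpu_has() machinery. In the kernel's cpufeature layout, word 9 holds the flags reported in EBX by CPUID leaf 7 (sub-leaf 0), so MPX corresponds to bit 14 of that register.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as they appear in the cpufeature.h hunk above. */
#define X86_FEATURE_RTM	(9*32+11)
#define X86_FEATURE_MPX	(9*32+14)

/* Hypothetical helpers: split a flag into capability word and bit. */
static unsigned feature_word(unsigned feature)
{
	return feature / 32;
}

static unsigned feature_bit(unsigned feature)
{
	return feature % 32;
}

/* Test a flag against an array of 32-bit capability words. */
static int feature_present(const uint32_t *caps, unsigned nwords,
			   unsigned feature)
{
	assert(feature_word(feature) < nwords);
	return (caps[feature_word(feature)] >> feature_bit(feature)) & 1;
}
```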
[PATCH v2 1/3] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/x86/intel_mpx.txt file with some information about Intel MPX. Signed-off-by: Qiaowei Ren Signed-off-by: Xudong Hao Signed-off-by: Liu Jinsong --- Documentation/x86/intel_mpx.txt | 76 +++ 1 files changed, 76 insertions(+), 0 deletions(-) create mode 100644 Documentation/x86/intel_mpx.txt diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt new file mode 100644 index 000..778d06e --- /dev/null +++ b/Documentation/x86/intel_mpx.txt @@ -0,0 +1,76 @@ +Intel(R) MPX Overview: += + +Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new +capability introduced into Intel Architecture. Intel MPX can +increase the robustness of software when it is used in conjunction +with compiler changes to check that memory references intended +at compile time do not become unsafe at runtime. + +Two of the most important goals of Intel MPX are to provide +this capability at very low performance overhead for newly +compiled code, and to provide compatibility mechanisms with +legacy software components. A direct benefit Intel MPX provides +is hardening software against malicious attacks designed to +cause or exploit buffer overruns. + +For details about the Intel MPX instructions, see "Intel(R) +Architecture Instruction Set Extensions Programming Reference". + +Intel(R) MPX Programming Model +-- + +Intel MPX introduces new registers and new instructions that +operate on these registers. Some of the registers added are +bounds registers which store a pointer's lower bound and upper +bound limits. Whenever the pointer is used, the requested +reference is checked against the pointer's associated bounds, +thereby preventing out-of-bound memory access (such as buffer +overflows and overruns). Out-of-bounds memory references +initiate a #BR exception which can then be handled in an +appropriate manner. 
+ +Loading and Storing Bounds using Translation + + +Intel MPX defines two instructions for load/store of the linear +address of a pointer to a buffer, along with the bounds of the +buffer into a paging structure of extended bounds. Specifically +when storing extended bounds, the processor will perform address +translation of the address where the pointer is stored to an +address in the Bound Table (BT) to determine the store location +of extended bounds. Loading of an extended bounds performs the +reverse sequence. + +The structure in memory to load/store an extended bound is a +4-tuple consisting of lower bound, upper bound, pointer value +and a reserved field. Bound loads and stores access 32-bit or +64-bit operand size according to the operation mode. Thus, +a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits +in 64-bit mode. + +The linear address of a bound table is stored in a Bound +Directory (BD) entry. And the linear address of the bound +directory is derived from either BNDCFGU or BNDCFGS registers. +Bounds in memory are stored in Bound Tables (BT) as an extended +bound, which are accessed via Bound Directory (BD) and address +translation performed by BNDLDX/BNDSTX instructions. + +Bounds Directory (BD) and Bounds Tables (BT) are stored in +application memory and are allocated by the application (in case +of kernel use, the structures will be in kernel memory). The +bound directory and each instance of bound table are in contiguous +linear memory. + +XSAVE/XRSTOR Support of Intel MPX State + + +Enabling Intel MPX requires an OS to manage two bits in XCR0: + - BNDREGS for saving and restoring registers BND0-BND3, + - BNDCSR for saving and restoring the user-mode configuration +(BNDCFGU) and the status register (BNDSTATUS). + +The reason for having two separate bits is that BND0-BND3 is +likely to be volatile state, while BNDCFGU and BNDSTATUS are not. 
+Therefore, an OS has flexibility in handling these two states +differently in saving or restoring them. -- 1.7.1
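The 4-tuple described in the documentation above can be written down as a pair of C structures to check the stated entry sizes. This is only a sketch: the struct and field names are hypothetical, and the field order simply follows the order in which the text lists the members (lower bound, upper bound, pointer value, reserved).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of a bound table entry in 32-bit mode: 4 * 32 bits. */
struct mpx_bt_entry32 {
	uint32_t lower_bound;
	uint32_t upper_bound;
	uint32_t pointer_value;
	uint32_t reserved;
};

/* Hypothetical layout of a bound table entry in 64-bit mode: 4 * 64 bits. */
struct mpx_bt_entry64 {
	uint64_t lower_bound;
	uint64_t upper_bound;
	uint64_t pointer_value;
	uint64_t reserved;
};

/* The sizes quoted in the text, expressed in bytes. */
_Static_assert(sizeof(struct mpx_bt_entry32) == 4 * 32 / 8, "16-byte entry");
_Static_assert(sizeof(struct mpx_bt_entry64) == 4 * 64 / 8, "32-byte entry");
_Static_assert(offsetof(struct mpx_bt_entry64, pointer_value) == 16,
	       "pointer value follows the two bounds");

/* An address is within bounds when lower <= addr <= upper (inclusive). */
static int mpx_in_bounds64(const struct mpx_bt_entry64 *e, uint64_t addr)
{
	assert(e);
	return addr >= e->lower_bound && addr <= e->upper_bound;
}
```

The inclusive upper bound in mpx_in_bounds64() is an assumption of this sketch; the text above does not say how the hardware encodes or compares the upper bound.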
[PATCH v2 3/3] X86, mpx: Intel MPX xstate feature definition
This patch defines the MPX xstate features and extends struct xsave_struct to support Intel MPX. Signed-off-by: Qiaowei Ren Signed-off-by: Xudong Hao Signed-off-by: Liu Jinsong --- arch/x86/include/asm/processor.h | 12 arch/x86/include/asm/xsave.h |5 - 2 files changed, 16 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 987c75e..2fe2e75 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -370,6 +370,15 @@ struct ymmh_struct { u32 ymmh_space[64]; }; +struct bndregs_struct { + u64 bndregs[8]; +} __packed; + +struct bndcsr_struct { + u64 cfg_reg_u; + u64 status_reg; +} __packed; + struct xsave_hdr_struct { u64 xstate_bv; u64 reserved1[2]; @@ -380,6 +389,9 @@ struct xsave_struct { struct i387_fxsave_struct i387; struct xsave_hdr_struct xsave_hdr; struct ymmh_struct ymmh; + u8 lwp_area[128]; + struct bndregs_struct bndregs; + struct bndcsr_struct bndcsr; /* new processor state extensions will go here */ } __attribute__ ((packed, aligned (64))); diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h index 0415cda..7fa8855 100644 --- a/arch/x86/include/asm/xsave.h +++ b/arch/x86/include/asm/xsave.h @@ -9,6 +9,8 @@ #define XSTATE_FP 0x1 #define XSTATE_SSE 0x2 #define XSTATE_YMM 0x4 +#define XSTATE_BNDREGS 0x8 +#define XSTATE_BNDCSR 0x10 #define XSTATE_FPSSE (XSTATE_FP | XSTATE_SSE) @@ -20,10 +22,11 @@ #define XSAVE_YMM_SIZE 256 #define XSAVE_YMM_OFFSET(XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET) +#define XSTATE_EAGER (XSTATE_BNDREGS | XSTATE_BNDCSR) /* * These are the features that the OS can handle currently. 
*/ -#define XCNTXT_MASK(XSTATE_FP | XSTATE_SSE | XSTATE_YMM) +#define XCNTXT_MASK(XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_EAGER) #ifdef CONFIG_X86_64 #define REX_PREFIX "0x48, " -- 1.7.1
[PATCH v2 2/3] X86, mpx: Intel MPX CPU feature definition
This patch defines Intel MPX CPU feature. Signed-off-by: Qiaowei Ren Signed-off-by: Xudong Hao Signed-off-by: Liu Jinsong --- arch/x86/include/asm/cpufeature.h |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index d3f5c63..ef9f9c2 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -216,6 +216,7 @@ #define X86_FEATURE_ERMS (9*32+ 9) /* Enhanced REP MOVSB/STOSB */ #define X86_FEATURE_INVPCID(9*32+10) /* Invalidate Processor Context ID */ #define X86_FEATURE_RTM(9*32+11) /* Restricted Transactional Memory */ +#define X86_FEATURE_MPX(9*32+14) /* Memory Protection Extension */ #define X86_FEATURE_RDSEED (9*32+18) /* The RDSEED instruction */ #define X86_FEATURE_ADX(9*32+19) /* The ADCX and ADOX instructions */ #define X86_FEATURE_SMAP (9*32+20) /* Supervisor Mode Access Prevention */ -- 1.7.1
[PATCH 3/3] X86, mpx: Intel MPX xstate feature definition
Signed-off-by: Qiaowei Ren
Signed-off-by: Xudong Hao
Signed-off-by: Liu Jinsong
---
 arch/x86/include/asm/processor.h | 23 +++
 arch/x86/include/asm/xsave.h | 6 +-
 2 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 987c75e..43be6f6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -370,6 +370,26 @@ struct ymmh_struct {
 	u32 ymmh_space[64];
 };
 
+struct lwp_struct {
+	u64 lwpcb_addr;
+	u32 flags;
+	u32 buf_head_offset;
+	u64 buf_base;
+	u32 buf_size;
+	u32 filters;
+	u64 saved_event_record[4];
+	u32 event_counter[16];
+};
+
+struct bndregs_struct {
+	u64 bndregs[8];
+} __packed;
+
+struct bndcsr_struct {
+	u64 cfg_reg_u;
+	u64 status_reg;
+} __packed;
+
 struct xsave_hdr_struct {
 	u64 xstate_bv;
 	u64 reserved1[2];
@@ -380,6 +400,9 @@ struct xsave_struct {
 	struct i387_fxsave_struct i387;
 	struct xsave_hdr_struct xsave_hdr;
 	struct ymmh_struct ymmh;
+	struct lwp_struct lwp;
+	struct bndregs_struct bndregs;
+	struct bndcsr_struct bndcsr;
 	/* new processor state extensions will go here */
 } __attribute__ ((packed, aligned (64)));
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 0415cda..5cd9de3 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -9,6 +9,8 @@
 #define XSTATE_FP	0x1
 #define XSTATE_SSE	0x2
 #define XSTATE_YMM	0x4
+#define XSTATE_BNDREGS	0x8
+#define XSTATE_BNDCSR	0x10
 
 #define XSTATE_FPSSE	(XSTATE_FP | XSTATE_SSE)
 
@@ -20,10 +22,12 @@
 #define XSAVE_YMM_SIZE	256
 #define XSAVE_YMM_OFFSET	(XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)
 
+#define XSTATE_FLEXIBLE	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XSTATE_EAGER	(XSTATE_BNDREGS | XSTATE_BNDCSR)
 /*
  * These are the features that the OS can handle currently.
  */
-#define XCNTXT_MASK	(XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XCNTXT_MASK	(XSTATE_FLEXIBLE | XSTATE_EAGER)
 
 #ifdef CONFIG_X86_64
 #define REX_PREFIX	"0x48, "
--
1.7.1
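To see where the new members end up in the save area, the following userspace sketch mirrors the structures from this patch with fixed-width types. The 512-byte legacy region and 64-byte header are stand-ins for struct i387_fxsave_struct and struct xsave_hdr_struct and are assumptions about their sizes; struct xsave_sketch and the three accessor functions are names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct lwp_struct from the patch. */
struct lwp_struct {
	uint64_t lwpcb_addr;
	uint32_t flags;
	uint32_t buf_head_offset;
	uint64_t buf_base;
	uint32_t buf_size;
	uint32_t filters;
	uint64_t saved_event_record[4];
	uint32_t event_counter[16];
};

/* Mirrors the __packed MPX structures from the patch. */
struct bndregs_struct {
	uint64_t bndregs[8];
} __attribute__((packed));

struct bndcsr_struct {
	uint64_t cfg_reg_u;
	uint64_t status_reg;
} __attribute__((packed));

/* Sketch of the extended save area; the first two sizes are assumed. */
struct xsave_sketch {
	uint8_t legacy_area[512];	/* assumed size of i387_fxsave_struct */
	uint8_t header[64];		/* assumed size of xsave_hdr_struct */
	uint32_t ymmh_space[64];	/* struct ymmh_struct */
	struct lwp_struct lwp;
	struct bndregs_struct bndregs;
	struct bndcsr_struct bndcsr;
} __attribute__((packed, aligned(64)));

size_t lwp_size(void)
{
	return sizeof(struct lwp_struct);
}

size_t bndregs_offset(void)
{
	return offsetof(struct xsave_sketch, bndregs);
}

size_t bndcsr_offset(void)
{
	return offsetof(struct xsave_sketch, bndcsr);
}
```

Under these assumptions struct lwp_struct occupies 128 bytes, so it fills the same space as a 128-byte placeholder array would, and the bounds registers start at offset 960 with BNDCSR following at 1024.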
[PATCH 1/3] x86, mpx: add documentation on Intel MPX
This patch adds the Documentation/intel_mpx.txt file with some information about Intel MPX. Signed-off-by: Qiaowei Ren Signed-off-by: Xudong Hao Signed-off-by: Liu Jinsong --- Documentation/intel_mpx.txt | 77 +++ 1 files changed, 77 insertions(+), 0 deletions(-) create mode 100644 Documentation/intel_mpx.txt diff --git a/Documentation/intel_mpx.txt b/Documentation/intel_mpx.txt new file mode 100644 index 000..3d947d0 --- /dev/null +++ b/Documentation/intel_mpx.txt @@ -0,0 +1,77 @@ +Intel(R) MPX Overview: += + +Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new +capability introduced into Intel Architecture. Intel MPX can +increase the robustness of software when it is used in conjunction +with compiler changes to check memory references, for those +references whose compile-time normal intentions are usurped +at runtime due to buffer overflow or underflow. + +Two of the most important goals of Intel MPX are to provide +this capability at very low performance overhead for newly +compiled code, and to provide compatibility mechanisms with +legacy software components. A direct benefit Intel MPX provides +is hardening software against malicious attacks designed to +cause or exploit buffer overruns. + +For details about the Intel MPX instructions, see "Intel(R) +Architecture Instruction Set Extensions Programming Reference". + +Intel(R) MPX Programming Model +-- + +Intel MPX introduces new registers and new instructions that +operate on these registers. Some of the registers added are +bounds registers which store a pointer's lower bound and upper +bound limits. Whenever the pointer is used, the requested +reference is checked against the pointer's associated bounds, +thereby preventing out-of-bound memory access (such as buffer +overflows and overruns). Out-of-bounds memory references +initiate a #BR exception which can then be handled in an +appropriate manner. 
+ +Loading and Storing Bounds using Translation + + +Intel MPX defines two instructions for load/store of the linear +address of a pointer to a buffer, along with the bounds of the +buffer into a paging structure of extended bounds. Specifically +when storing extended bounds, the processor will perform address +translation of the address where the pointer is stored to an +address in the Bound Table (BT) to determine the store location +of extended bounds. Loading of an extended bounds performs the +reverse sequence. + +The structure in memory to load/store an extended bound is a +4-tuple consisting of lower bound, upper bound, pointer value +and a reserved field. Bound loads and stores access 32-bit or +64-bit operand size according to the operation mode. Thus, +a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits +in 64-bit mode. + +The linear address of a bound table is stored in a Bound +Directory (BD) entry. And the linear address of the bound +directory is derived from either BNDCFGU or BNDCFGS registers. +Bounds in memory are stored in Bound Tables (BT) as an extended +bound, which are accessed via Bound Directory (BD) and address +translation performed by BNDLDX/BNDSTX instructions. + +Bounds Directory (BD) and Bounds Tables (BT) are stored in +application memory and are allocated by the application (in case +of kernel use, the structures will be in kernel memory). The +bound directory and each instance of bound table are in contiguous +linear memory. + +XSAVE/XRESTOR Support of Intel MPX State + + +Enabling Intel MPX requires an OS to manage two bits in XCR0: + - BNDREGS for saving and restoring registers BND0-BND3, + - BNDCSR for saving and restoring the user-mode configuration +(BNDCFGU) and the status register (BNDSTATUS). + +The reason for having two separate bits is that BND0-BND3 is +likely to be volatile state, while BNDCFGU and BNDSTATUS are not. 
+Therefore, an OS has flexibility in handling these two states +differently in saving or restoring them. -- 1.7.1
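The bound table entry described in this document can be pictured as a plain structure of four fields. The sketch below follows the field order given in the text (lower bound, upper bound, pointer value, reserved); the type names bt_entry32 and bt_entry64, the helper names, and the inclusive range check are illustrative assumptions and say nothing about how the hardware encodes bounds internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Bound table entry in 32-bit mode: four 32-bit fields. */
struct bt_entry32 {
	uint32_t lower_bound;
	uint32_t upper_bound;
	uint32_t pointer_value;
	uint32_t reserved;
};

/* Bound table entry in 64-bit mode: four 64-bit fields. */
struct bt_entry64 {
	uint64_t lower_bound;
	uint64_t upper_bound;
	uint64_t pointer_value;
	uint64_t reserved;
};

/* Entry size in bits for the given operating mode. */
size_t bt_entry_bits(int is_64bit)
{
	return 8 * (is_64bit ? sizeof(struct bt_entry64)
			     : sizeof(struct bt_entry32));
}

/*
 * Illustrative check: does an access of 'size' bytes at 'addr' stay inside
 * [lower_bound, upper_bound]? Both bounds are treated as inclusive, and a
 * zero-sized access is rejected; both choices are assumptions of this sketch.
 */
int access_in_bounds(const struct bt_entry64 *e, uint64_t addr, uint64_t size)
{
	if (size == 0 || addr < e->lower_bound || addr > e->upper_bound)
		return 0;
	/* Written this way so that addr + size cannot wrap around. */
	return (size - 1) <= (e->upper_bound - addr);
}
```

An access that fails a check like this is the kind of reference that, per the text above, would raise a #BR exception on real hardware.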
[PATCH 2/3] X86, mpx: Intel MPX definition
Signed-off-by: Qiaowei Ren
Signed-off-by: Xudong Hao
Signed-off-by: Liu Jinsong
---
 arch/x86/include/asm/cpufeature.h | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index d3f5c63..6c2738d 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -216,6 +216,7 @@
 #define X86_FEATURE_ERMS	(9*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID	(9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM	(9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_MPX	(9*32+14) /* Memory Protection Extension */
 #define X86_FEATURE_RDSEED	(9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX	(9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	(9*32+20) /* Supervisor Mode Access Prevention */
@@ -330,6 +331,7 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_perfctr_l2	boot_cpu_has(X86_FEATURE_PERFCTR_L2)
 #define cpu_has_cx8	boot_cpu_has(X86_FEATURE_CX8)
 #define cpu_has_cx16	boot_cpu_has(X86_FEATURE_CX16)
+#define cpu_has_mpx	boot_cpu_has(X86_FEATURE_MPX)
 #define cpu_has_eager_fpu	boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoext	boot_cpu_has(X86_FEATURE_TOPOEXT)
 
--
1.7.1
[PATCH v5] x86, tboot: iomem fixes
The current code accesses ioremap()ed memory through plain pointer
dereferences instead of the dedicated I/O accessors, which can lead to
bugs. Fix this by annotating the pointers with __iomem and using the
accessor API (readl(), readq()).

Signed-off-by: Qiaowei Ren
---
 arch/x86/kernel/tboot.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..4e149c7 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -468,7 +468,8 @@ struct sinit_mle_data {
 
 struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-	void *heap_base, *heap_ptr, *config;
+	void __iomem *heap_base, *heap_ptr, *config;
+	u32 dmar_tbl_off;
 
 	if (!tboot_enabled())
 		return dmar_tbl;
@@ -485,25 +486,26 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
 		return NULL;
 
 	/* now map TXT heap */
-	heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-			    *(u64 *)(config + TXTCR_HEAP_SIZE));
+	heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+			    readl(config + TXTCR_HEAP_SIZE));
 	iounmap(config);
 	if (!heap_base)
 		return NULL;
 
 	/* walk heap to SinitMleData */
 	/* skip BiosData */
-	heap_ptr = heap_base + *(u64 *)heap_base;
+	heap_ptr = heap_base + readq(heap_base);
 	/* skip OsMleData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* skip OsSinitData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* now points to SinitMleDataSize; set to SinitMleData */
 	heap_ptr += sizeof(u64);
 	/* get addr of DMAR table */
+	dmar_tbl_off = readl(heap_ptr +
+			offsetof(struct sinit_mle_data, vtd_dmars_off));
 	dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-		   ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-		   sizeof(u64));
+		   dmar_tbl_off - sizeof(u64));
 
 	/* don't unmap heap because dmar.c needs access to this */
 
--
1.7.9.5
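The heap walk in this patch can be tried out in userspace against an ordinary byte buffer. In the sketch below, read64() and read32() are memcpy-based stand-ins for readq() and readl() (real MMIO accessors are not available outside the kernel), put64() and put32() exist only to build test data, and the constant 140 is the field offset that the patch obtains via offsetof(). All of these helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of vtd_dmars_off inside SinitMleData, as used by the patch. */
#define VTD_DMARS_OFF_FIELD 140

/* Userspace stand-ins for readq()/readl(); not real MMIO accessors. */
static uint64_t read64(const uint8_t *p)
{
	uint64_t v;
	memcpy(&v, p, sizeof(v));
	return v;
}

static uint32_t read32(const uint8_t *p)
{
	uint32_t v;
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Test-data helpers for filling a fake heap image. */
void put64(uint8_t *p, uint64_t v)
{
	memcpy(p, &v, sizeof(v));
}

void put32(uint8_t *p, uint32_t v)
{
	memcpy(p, &v, sizeof(v));
}

/*
 * Walk a heap image the same way the patch does and return the offset of
 * the DMAR table from the start of the heap. Each of the first three
 * sections starts with a 64-bit size that is used to skip the section.
 */
size_t dmar_offset_in_heap(const uint8_t *heap)
{
	const uint8_t *p = heap;
	uint32_t dmar_tbl_off;

	p += read64(p);			/* skip BiosData */
	p += read64(p);			/* skip OsMleData */
	p += read64(p);			/* skip OsSinitData */
	p += sizeof(uint64_t);		/* skip SinitMleDataSize */

	dmar_tbl_off = read32(p + VTD_DMARS_OFF_FIELD);
	return (size_t)(p - heap) + dmar_tbl_off - sizeof(uint64_t);
}
```

The sketch does no bounds checking on the section sizes, mirroring the original function; a hardened version would validate each size against the mapped heap length.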
[PATCH] x86, tboot: iomem fixes
The current code accesses ioremap()ed memory through plain pointer
dereferences instead of the dedicated I/O accessors, which can lead to
bugs. Fix this by annotating the pointers with __iomem and using the
accessor API (readl(), readq()).

Signed-off-by: Qiaowei Ren
---
 arch/x86/kernel/tboot.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..c902237 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -466,9 +466,12 @@ struct sinit_mle_data {
 	u32 vtd_dmars_off;
 } __packed;
 
+#define SINIT_MLE_DATA_VTD_DMAR_OFF	140
+
 struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-	void *heap_base, *heap_ptr, *config;
+	void __iomem *heap_base, *heap_ptr, *config;
+	u32 dmar_tbl_off;
 
 	if (!tboot_enabled())
 		return dmar_tbl;
@@ -485,25 +488,25 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
 		return NULL;
 
 	/* now map TXT heap */
-	heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-			    *(u64 *)(config + TXTCR_HEAP_SIZE));
+	heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+			    readl(config + TXTCR_HEAP_SIZE));
 	iounmap(config);
 	if (!heap_base)
 		return NULL;
 
 	/* walk heap to SinitMleData */
 	/* skip BiosData */
-	heap_ptr = heap_base + *(u64 *)heap_base;
+	heap_ptr = heap_base + readq(heap_base);
 	/* skip OsMleData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* skip OsSinitData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* now points to SinitMleDataSize; set to SinitMleData */
 	heap_ptr += sizeof(u64);
 	/* get addr of DMAR table */
+	dmar_tbl_off = readl(heap_ptr + SINIT_MLE_DATA_VTD_DMAR_OFF);
 	dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-		   ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-		   sizeof(u64));
+		   dmar_tbl_off - sizeof(u64));
 
 	/* don't unmap heap because dmar.c needs access to this */
 
--
1.7.9.5
[PATCH v3] x86, tboot: iomem fixes
The current code accesses ioremap()ed memory through plain pointer
dereferences instead of the dedicated I/O accessors, which can lead to
bugs. Fix this by annotating the pointers with __iomem and using the
accessor API (readl(), readq(), memcpy_fromio()).

Signed-off-by: Qiaowei Ren
---
 arch/x86/kernel/tboot.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..afe8cf8 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -466,9 +466,12 @@ struct sinit_mle_data {
 	u32 vtd_dmars_off;
 } __packed;
 
+#define SINIT_MLE_DATA_VTD_DMAR_OFF	140
+
 struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-	void *heap_base, *heap_ptr, *config;
+	void __iomem *heap_base, *heap_ptr, *config;
+	u32 dmar_tbl_off;
 
 	if (!tboot_enabled())
 		return dmar_tbl;
@@ -485,25 +488,25 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
 		return NULL;
 
 	/* now map TXT heap */
-	heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-			    *(u64 *)(config + TXTCR_HEAP_SIZE));
+	heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+			    readl(config + TXTCR_HEAP_SIZE));
 	iounmap(config);
 	if (!heap_base)
 		return NULL;
 
 	/* walk heap to SinitMleData */
 	/* skip BiosData */
-	heap_ptr = heap_base + *(u64 *)heap_base;
+	heap_ptr = heap_base + readq(heap_base);
 	/* skip OsMleData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* skip OsSinitData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* now points to SinitMleDataSize; set to SinitMleData */
 	heap_ptr += sizeof(u64);
 	/* get addr of DMAR table */
-	dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-		   ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-		   sizeof(u64));
+	dmar_tbl_off = readl(heap_ptr + SINIT_MLE_DATA_VTD_DMAR_OFF);
+	memcpy_fromio(dmar_tbl, heap_ptr + dmar_tbl_off - sizeof(u64),
+		      sizeof(struct acpi_table_header));
 
 	/* don't unmap heap because dmar.c needs access to this */
 
--
1.7.9.5
[PATCH v2] x86, tboot: iomem fixes
Fixes for iomem annotations in arch/x86/kernel/tboot.c

Signed-off-by: Qiaowei Ren
---
 arch/x86/kernel/tboot.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..afe8cf8 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -466,9 +466,12 @@ struct sinit_mle_data {
 	u32 vtd_dmars_off;
 } __packed;
 
+#define SINIT_MLE_DATA_VTD_DMAR_OFF	140
+
 struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-	void *heap_base, *heap_ptr, *config;
+	void __iomem *heap_base, *heap_ptr, *config;
+	u32 dmar_tbl_off;
 
 	if (!tboot_enabled())
 		return dmar_tbl;
@@ -485,25 +488,25 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
 		return NULL;
 
 	/* now map TXT heap */
-	heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-			    *(u64 *)(config + TXTCR_HEAP_SIZE));
+	heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+			    readl(config + TXTCR_HEAP_SIZE));
 	iounmap(config);
 	if (!heap_base)
 		return NULL;
 
 	/* walk heap to SinitMleData */
 	/* skip BiosData */
-	heap_ptr = heap_base + *(u64 *)heap_base;
+	heap_ptr = heap_base + readq(heap_base);
 	/* skip OsMleData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* skip OsSinitData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* now points to SinitMleDataSize; set to SinitMleData */
 	heap_ptr += sizeof(u64);
 	/* get addr of DMAR table */
-	dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-		   ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-		   sizeof(u64));
+	dmar_tbl_off = readl(heap_ptr + SINIT_MLE_DATA_VTD_DMAR_OFF);
+	memcpy_fromio(dmar_tbl, heap_ptr + dmar_tbl_off - sizeof(u64),
+		      sizeof(struct acpi_table_header));
 
 	/* don't unmap heap because dmar.c needs access to this */
 
--
1.7.9.5
[PATCH] x86, tboot: iomem fixes
Fixes for iomem annotations in arch/x86/kernel/tboot.c

Signed-off-by: Qiaowei Ren
---
 arch/x86/kernel/tboot.c | 43 +++
 1 file changed, 11 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..d06574c 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -442,33 +442,12 @@ late_initcall(tboot_late_init);
 #define TXTCR_HEAP_BASE	0x0300
 #define TXTCR_HEAP_SIZE	0x0308
 
-#define SHA1_SIZE	20
-
-struct sha1_hash {
-	u8 hash[SHA1_SIZE];
-};
-
-struct sinit_mle_data {
-	u32 version; /* currently 6 */
-	struct sha1_hash bios_acm_id;
-	u32 edx_senter_flags;
-	u64 mseg_valid;
-	struct sha1_hash sinit_hash;
-	struct sha1_hash mle_hash;
-	struct sha1_hash stm_hash;
-	struct sha1_hash lcp_policy_hash;
-	u32 lcp_policy_control;
-	u32 rlp_wakeup_addr;
-	u32 reserved;
-	u32 num_mdrs;
-	u32 mdrs_off;
-	u32 num_vtd_dmars;
-	u32 vtd_dmars_off;
-} __packed;
+#define SINIT_MLE_DATA_VTD_DMAR_OFF	140
 
 struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-	void *heap_base, *heap_ptr, *config;
+	void __iomem *heap_base, *heap_ptr, *config;
+	u32 dmar_tbl_off;
 
 	if (!tboot_enabled())
 		return dmar_tbl;
@@ -485,25 +464,25 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
 		return NULL;
 
 	/* now map TXT heap */
-	heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-			    *(u64 *)(config + TXTCR_HEAP_SIZE));
+	heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+			    readl(config + TXTCR_HEAP_SIZE));
 	iounmap(config);
 	if (!heap_base)
 		return NULL;
 
 	/* walk heap to SinitMleData */
 	/* skip BiosData */
-	heap_ptr = heap_base + *(u64 *)heap_base;
+	heap_ptr = heap_base + readq(heap_base);
 	/* skip OsMleData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* skip OsSinitData */
-	heap_ptr += *(u64 *)heap_ptr;
+	heap_ptr += readq(heap_ptr);
 	/* now points to SinitMleDataSize; set to SinitMleData */
 	heap_ptr += sizeof(u64);
 	/* get addr of DMAR table */
-	dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-		   ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-		   sizeof(u64));
+	dmar_tbl_off = readl(heap_ptr + SINIT_MLE_DATA_VTD_DMAR_OFF);
+	memcpy_fromio(dmar_tbl, heap_ptr + dmar_tbl_off - sizeof(u64),
+		      sizeof(struct acpi_table_header));
 
 	/* don't unmap heap because dmar.c needs access to this */
 
--
1.7.9.5
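This version of the patch replaces the structure definition with the bare constant 140. A quick way to convince oneself that the constant matches the removed structure is to rebuild that structure in userspace and ask the compiler for the field offset. The sketch below does so with fixed-width types in place of the kernel's u8/u32/u64; the function name vtd_dmars_off_offset() is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define SHA1_SIZE 20

struct sha1_hash {
	uint8_t hash[SHA1_SIZE];
};

/* Same field order as the struct sinit_mle_data removed by the patch. */
struct sinit_mle_data {
	uint32_t version;
	struct sha1_hash bios_acm_id;
	uint32_t edx_senter_flags;
	uint64_t mseg_valid;
	struct sha1_hash sinit_hash;
	struct sha1_hash mle_hash;
	struct sha1_hash stm_hash;
	struct sha1_hash lcp_policy_hash;
	uint32_t lcp_policy_control;
	uint32_t rlp_wakeup_addr;
	uint32_t reserved;
	uint32_t num_mdrs;
	uint32_t mdrs_off;
	uint32_t num_vtd_dmars;
	uint32_t vtd_dmars_off;
} __attribute__((packed));

/* Byte offset of vtd_dmars_off within the packed structure. */
size_t vtd_dmars_off_offset(void)
{
	return offsetof(struct sinit_mle_data, vtd_dmars_off);
}
```

With the packed layout the field sits at byte 140, which agrees with SINIT_MLE_DATA_VTD_DMAR_OFF. Keeping the structure and using offsetof() avoids having to maintain that number by hand.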
[tip:x86/debug] x86/tboot: Provide debugfs interfaces to access TXT log
Commit-ID:  13bfd47a0ef68fc8b21e67873dbdf269c7db6b59
Gitweb:     http://git.kernel.org/tip/13bfd47a0ef68fc8b21e67873dbdf269c7db6b59
Author:     Qiaowei Ren
AuthorDate: Mon, 24 Jun 2013 13:55:33 +0800
Committer:  Ingo Molnar
CommitDate: Fri, 28 Jun 2013 11:05:16 +0200

x86/tboot: Provide debugfs interfaces to access TXT log

These logs come from tboot (Trusted Boot, an open source,
pre-kernel/VMM module that uses Intel TXT to perform a
measured and verified launch of an OS kernel/VMM.).

Signed-off-by: Qiaowei Ren
Acked-by: H. Peter Anvin
Cc: Gang Wei
Link: http://lkml.kernel.org/r/137205-21788-1-git-send-email-qiaowei@intel.com
[ Beautified the code a bit. ]
Signed-off-by: Ingo Molnar
---
 arch/x86/kernel/tboot.c | 73 +
 1 file changed, 73 insertions(+)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index f84fe00..3ff42d2 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -31,6 +31,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -338,6 +339,73 @@ static struct notifier_block tboot_cpu_notifier __cpuinitdata =
 	.notifier_call = tboot_cpu_callback,
 };
 
+#ifdef CONFIG_DEBUG_FS
+
+#define TBOOT_LOG_UUID	{ 0x26, 0x25, 0x19, 0xc0, 0x30, 0x6b, 0xb4, 0x4d, \
+			  0x4c, 0x84, 0xa3, 0xe9, 0x53, 0xb8, 0x81, 0x74 }
+
+#define TBOOT_SERIAL_LOG_ADDR	0x6
+#define TBOOT_SERIAL_LOG_SIZE	0x08000
+#define LOG_MAX_SIZE_OFF	16
+#define LOG_BUF_OFF	24
+
+static uint8_t tboot_log_uuid[16] = TBOOT_LOG_UUID;
+
+static ssize_t tboot_log_read(struct file *file, char __user *user_buf, size_t count, loff_t *ppos)
+{
+	void __iomem *log_base;
+	u8 log_uuid[16];
+	u32 max_size;
+	void *kbuf;
+	int ret = -EFAULT;
+
+	log_base = ioremap_nocache(TBOOT_SERIAL_LOG_ADDR, TBOOT_SERIAL_LOG_SIZE);
+	if (!log_base)
+		return ret;
+
+	memcpy_fromio(log_uuid, log_base, sizeof(log_uuid));
+	if (memcmp(&tboot_log_uuid, log_uuid, sizeof(log_uuid)))
+		goto err_iounmap;
+
+	max_size = readl(log_base + LOG_MAX_SIZE_OFF);
+	if (*ppos >= max_size) {
+		ret = 0;
+		goto err_iounmap;
+	}
+
+	if (*ppos + count > max_size)
+		count = max_size - *ppos;
+
+	kbuf = kmalloc(count, GFP_KERNEL);
+	if (!kbuf) {
+		ret = -ENOMEM;
+		goto err_iounmap;
+	}
+
+	memcpy_fromio(kbuf, log_base + LOG_BUF_OFF + *ppos, count);
+	if (copy_to_user(user_buf, kbuf, count))
+		goto err_kfree;
+
+	*ppos += count;
+
+	ret = count;
+
+err_kfree:
+	kfree(kbuf);
+
+err_iounmap:
+	iounmap(log_base);
+
+	return ret;
+}
+
+static const struct file_operations tboot_log_fops = {
+	.read = tboot_log_read,
+	.llseek = default_llseek,
+};
+
+#endif /* CONFIG_DEBUG_FS */
+
 static __init int tboot_late_init(void)
 {
 	if (!tboot_enabled())
@@ -348,6 +416,11 @@ static __init int tboot_late_init(void)
 	atomic_set(&ap_wfs_count, 0);
 	register_hotcpu_notifier(&tboot_cpu_notifier);
 
+#ifdef CONFIG_DEBUG_FS
+	debugfs_create_file("tboot_log", S_IRUSR,
+			arch_debugfs_dir, NULL, &tboot_log_fops);
+#endif
+
 	acpi_os_set_prepare_sleep(&tboot_sleep);
 	return 0;
 }
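The position and length handling in tboot_log_read() boils down to clamping the requested range to the log size. The sketch below isolates that step in a standalone function; log_read_len() is a hypothetical name, and the comparison is written as count > max_size - pos rather than pos + count > max_size so that the sum cannot overflow, which is a small deviation from the code in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return how many bytes a read of 'count' bytes starting at 'pos' may copy
 * from a log that holds 'max_size' bytes. A position at or past the end
 * yields 0, which the read handler reports as end of file.
 */
size_t log_read_len(uint64_t pos, size_t count, uint32_t max_size)
{
	if (pos >= max_size)
		return 0;

	/* pos < max_size here, so the subtraction cannot underflow. */
	if (count > max_size - pos)
		count = (size_t)(max_size - pos);

	return count;
}
```

For a 4096-byte log, a 200-byte request at position 4000 is trimmed to the remaining 96 bytes, while a request that fits entirely is left unchanged.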
[PATCH v2] x86, tboot: provide debugfs interfaces to access TXT log
These logs come from tboot (Trusted Boot, an open source, pre-kernel/VMM module that uses Intel TXT to perform a measured and verified launch of an OS kernel/VMM.). Signed-off-by: Qiaowei Ren --- arch/x86/kernel/tboot.c | 72 +++ 1 file changed, 72 insertions(+) diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c index f84fe00..2dec186 100644 --- a/arch/x86/kernel/tboot.c +++ b/arch/x86/kernel/tboot.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include @@ -338,6 +339,72 @@ static struct notifier_block tboot_cpu_notifier __cpuinitdata = .notifier_call = tboot_cpu_callback, }; +#if defined(CONFIG_DEBUG_FS) + +#define TBOOT_LOG_UUID {0x26, 0x25, 0x19, 0xc0, 0x30, 0x6b, 0xb4, 0x4d, \ +0x4c, 0x84, 0xa3, 0xe9, 0x53, 0xb8, 0x81, 0x74} +#define TBOOT_SERIAL_LOG_ADDR 0x6 +#define TBOOT_SERIAL_LOG_SIZE 0x08000 +#define LOG_MAX_SIZE_OFF 16 +#define LOG_BUF_OFF24 + +static uint8_t tboot_log_uuid[16] = TBOOT_LOG_UUID; + +static ssize_t tboot_log_read(struct file *file, char __user *user_buf, + size_t count, loff_t *ppos) +{ + void __iomem *log_base; + u8 log_uuid[16]; + u32 max_size; + void *kbuf; + int ret = -EFAULT; + + log_base = ioremap_nocache(TBOOT_SERIAL_LOG_ADDR, + TBOOT_SERIAL_LOG_SIZE); + if (!log_base) + return ret; + + memcpy_fromio(log_uuid, log_base, sizeof(log_uuid)); + if (memcmp(&tboot_log_uuid, log_uuid, sizeof(log_uuid))) + goto err_iounmap; + + max_size = readl(log_base + LOG_MAX_SIZE_OFF); + if (*ppos >= max_size) { + ret = 0; + goto err_iounmap; + } + + if (*ppos + count > max_size) + count = max_size - *ppos; + + kbuf = kmalloc(count, GFP_KERNEL); + if (!kbuf) { + ret = -ENOMEM; + goto err_iounmap; + } + + memcpy_fromio(kbuf, log_base + LOG_BUF_OFF + *ppos, count); + if (copy_to_user(user_buf, kbuf, count)) + goto err_kfree; + + *ppos += count; + + ret = count; + +err_kfree: + kfree(kbuf); +err_iounmap: + iounmap(log_base); + return ret; +} + +static const struct file_operations tboot_log_fops = { + .read = tboot_log_read, + 
.llseek = default_llseek, +}; + +#endif /* CONFIG_DEBUG_FS */ + static __init int tboot_late_init(void) { if (!tboot_enabled()) @@ -348,6 +415,11 @@ static __init int tboot_late_init(void) atomic_set(&ap_wfs_count, 0); register_hotcpu_notifier(&tboot_cpu_notifier); +#if defined(CONFIG_DEBUG_FS) + debugfs_create_file("tboot_log", S_IRUSR, + arch_debugfs_dir, NULL, &tboot_log_fops); +#endif + acpi_os_set_prepare_sleep(&tboot_sleep); return 0; } -- 1.7.9.5
[PATCH] x86, tboot: provide debugfs interfaces to access TXT log
These logs come from tboot (Trusted Boot, an open source, pre-kernel/VMM module that uses Intel TXT to perform a measured and verified launch of an OS kernel/VMM.). Signed-off-by: Qiaowei Ren --- arch/x86/kernel/tboot.c | 70 +++ 1 file changed, 70 insertions(+) diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c index f84fe00..dd6f198 100644 --- a/arch/x86/kernel/tboot.c +++ b/arch/x86/kernel/tboot.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include @@ -338,6 +339,70 @@ static struct notifier_block tboot_cpu_notifier __cpuinitdata = .notifier_call = tboot_cpu_callback, }; +#if defined(CONFIG_DEBUG_FS) + +#define TBOOT_LOG_UUID {0x26, 0x25, 0x19, 0xc0, 0x30, 0x6b, 0xb4, 0x4d, \ +0x4c, 0x84, 0xa3, 0xe9, 0x53, 0xb8, 0x81, 0x74} +#define TBOOT_SERIAL_LOG_ADDR 0x6 +#define TBOOT_SERIAL_LOG_SIZE 0x08000 + +static uint8_t tboot_log_uuid[16] = TBOOT_LOG_UUID; + +struct tboot_log { + uint8_t uuid[16]; + uint32_tmax_size; + uint32_tcurr_pos; + charbuf[]; +}; + +static struct tboot_log *get_log(void) +{ + struct tboot_log *log; + + log = (struct tboot_log *)ioremap_nocache(TBOOT_SERIAL_LOG_ADDR, + TBOOT_SERIAL_LOG_SIZE); + if (!log) + return NULL; + + if (memcmp(&tboot_log_uuid, &log->uuid, sizeof(log->uuid))) { + iounmap(log); + return NULL; + } + + return log; +} + +static ssize_t tboot_log_read(struct file *file, char __user *user_buf, + size_t count, loff_t *ppos) +{ + struct tboot_log *log; + + log = get_log(); + if (!log) + return -EFAULT; + + if (*ppos >= log->max_size) + return 0; + + if (*ppos + count > log->max_size) + count = log->max_size - *ppos; + + if (copy_to_user(user_buf, log->buf + *ppos, count)) + return -EFAULT; + + *ppos += count; + + iounmap(log); + return count; +} + +static const struct file_operations tboot_log_fops = { + .read = tboot_log_read, + .llseek = default_llseek, +}; + +#endif /* CONFIG_DEBUG_FS */ + static __init int tboot_late_init(void) { if (!tboot_enabled()) @@ -348,6 +413,11 @@ static __init int 
tboot_late_init(void) atomic_set(&ap_wfs_count, 0); register_hotcpu_notifier(&tboot_cpu_notifier); +#if defined(CONFIG_DEBUG_FS) + debugfs_create_file("tboot_log", S_IRUSR, + arch_debugfs_dir, NULL, &tboot_log_fops); +#endif + acpi_os_set_prepare_sleep(&tboot_sleep); return 0; } -- 1.7.9.5