[tip:x86/fpu] x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK

2015-06-09 Thread tip-bot for Qiaowei Ren
Commit-ID:  3c1d32300920a446c67d697cd6b80f012ad06028
Gitweb: http://git.kernel.org/tip/3c1d32300920a446c67d697cd6b80f012ad06028
Author: Qiaowei Ren 
AuthorDate: Sun, 7 Jun 2015 11:37:02 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 9 Jun 2015 12:24:30 +0200

x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK

MPX_BNDCFG_ADDR_MASK is defined twice, so this patch removes the
redundant definition.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
Cc: Andrew Morton 
Cc: Dave Hansen 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20150607183702.5f129...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/mpx.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 0cdd16a..871e5e5 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -45,7 +45,6 @@
 #define MPX_BNDSTA_TAIL		2
 #define MPX_BNDCFG_TAIL		12
 #define MPX_BNDSTA_ADDR_MASK	(~((1UL<<MPX_BNDSTA_TAIL)-1))
-#define MPX_BNDCFG_ADDR_MASK	(~((1UL<<MPX_BNDCFG_TAIL)-1))
 #define MPX_BNDCFG_ADDR_MASK	(~((1UL<<MPX_BNDCFG_TAIL)-1))
 #define MPX_BNDSTA_ERROR_CODE	0x3
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[tip:x86/mpx] x86, mpx: Add documentation on Intel MPX

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  5776563648f6437ede91c91cbad85862ca682b0b
Gitweb: http://git.kernel.org/tip/5776563648f6437ede91c91cbad85862ca682b0b
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:32 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:54 +0100

x86, mpx: Add documentation on Intel MPX

This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151832.7fdb1...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 Documentation/x86/intel_mpx.txt | 234 
 1 file changed, 234 insertions(+)

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..4472ed2
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,234 @@
+1. Intel(R) MPX Overview
+========================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability
+introduced into Intel Architecture. Intel MPX provides hardware features
+that can be used in conjunction with compiler changes to check memory
+references, for those references whose compile-time normal intentions are
+usurped at runtime due to buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture Instruction
+Set Extensions Programming Reference, Chapter 9: Intel(R) Memory Protection
+Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead, which
+can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How to get the advantage of MPX
+==================================
+
+For MPX to work, changes are required in the kernel, binutils and compiler.
+No source changes are required for applications, just a recompile.
+
+There are a lot of moving parts of this to all work right. The following
+is how we expect the compiler, application and kernel to work together.
+
+1) Application developer compiles with -fmpx. The compiler will add the
+   instrumentation as well as some setup code called early after the app
+   starts. New instruction prefixes are noops for old CPUs.
+2) That setup code allocates (virtual) space for the "bounds directory",
+   points the "bndcfgu" register to the directory and notifies the kernel
+   (via the new prctl(PR_MPX_ENABLE_MANAGEMENT)) that the app will be using
+   MPX.
+3) The kernel detects that the CPU has MPX, allows the new prctl() to
+   succeed, and notes the location of the bounds directory. Userspace is
+   expected to keep the bounds directory at that location. We note it
+   instead of reading it each time because the 'xsave' operation needed
+   to access the bounds directory register is an expensive operation.
+4) If the application needs to spill bounds out of the 4 registers, it
+   issues a bndstx instruction. Since the bounds directory is empty at
+   this point, a bounds fault (#BR) is raised, the kernel allocates a
+   bounds table (in the user address space) and makes the relevant entry
+   in the bounds directory point to the new table.
+5) If the application violates the bounds specified in the bounds registers,
+   a separate kind of #BR is raised which will deliver a signal with
+   information about the violation in the 'struct siginfo'.
+6) Whenever memory is freed, we know that it can no longer contain valid
+   pointers, and we attempt to free the associated space in the bounds
+   tables. If an entire table becomes unused, we will attempt to free
+   the table and remove the entry in the directory.
+
+To summarize, there are essentially three things interacting here:
+
+GCC with -fmpx:
+ * enables annotation of code with MPX instructions and prefixes
+ * inserts code early in the application to call in to the "gcc runtime"
+GCC MPX Runtime:
+ * Checks for hardware MPX support in cpuid leaf
+ * allocates virtual space for the bounds directory (malloc() essentially)
+ * points the hardware BNDCFGU register at the directory
+ * calls a new prctl(PR_MPX_ENABLE_MANAGEMENT) to notify the kernel to
+   start managing the bounds directories
+Kernel MPX Code:
+ * Checks for hardware MPX support in cpuid leaf
+ * Handles #BR exceptions and sends SIGSEGV to the app when it violates
+   bounds, like during a buffer overflow.
+ * When bounds are spilled in to an unallocated bounds table, the kernel
+   notices in the #BR exception, allocates the virtual space, then
+   updates the bounds directory to point to the new table. It keeps
+   special track of the memory with a VM_MPX flag.
+ * Frees unused bounds tables at the time that the memory they described
+   is unmapped.
+
+
+3. How does MPX kernel code work
+================================

[tip:x86/mpx] x86, mpx: Add MPX-specific mmap interface

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  57319d80e1d328e34cb24868a4f4405661485e30
Gitweb: http://git.kernel.org/tip/57319d80e1d328e34cb24868a4f4405661485e30
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:27 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

x86, mpx: Add MPX-specific mmap interface

We have chosen to perform the allocation of bounds tables in
kernel (See the patch "on-demand kernel allocation of bounds
tables") and to mark these VMAs with VM_MPX.

However, there is currently no suitable interface to actually do
this.  Existing interfaces, like do_mmap_pgoff(), have no way to
set a modified ->vm_ops or ->vm_flags and don't hold mmap_sem
long enough to let a caller do it.

This patch wraps mmap_region() and holds mmap_sem long enough to
make the modifications to the VMA which we need.

Also note the 32/64-bit #ifdef in the header.  We actually need
to do this at runtime eventually.  But, for now, we don't support
running 32-bit binaries on 64-bit kernels.  Support for this will
come in later patches.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151827.ce440...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 arch/x86/Kconfig   |  4 +++
 arch/x86/include/asm/mpx.h | 36 +++
 arch/x86/mm/Makefile   |  2 ++
 arch/x86/mm/mpx.c  | 86 ++
 4 files changed, 128 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ded8a67..967dfe0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -248,6 +248,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..7d7c5f5
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,36 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET	28
+#define MPX_BD_ENTRY_SHIFT	3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET	17
+#define MPX_BT_ENTRY_SHIFT	5
+#define MPX_IGN_BITS	3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET	20
+#define MPX_BD_ENTRY_SHIFT	2
+#define MPX_BT_ENTRY_OFFSET	10
+#define MPX_BT_ENTRY_SHIFT	4
+#define MPX_IGN_BITS	2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..72d13b0
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,86 @@
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren 
+ * Dave Hansen 
+ */
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * This is really a simplified "vm_mmap". It only handles MPX
+ * bounds tables (the bounds directory is user-allocated).
+ *
+ * Later on, we use the vma->vm_ops to uniquely identify these
+ * VMAs.
+ */
+static unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. We verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS 

[tip:x86/mpx] x86, mpx: Introduce VM_MPX to indicate that a VMA is MPX specific

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  4aae7e436fa51faf4bf5d11b175aea82cfe8224a
Gitweb: http://git.kernel.org/tip/4aae7e436fa51faf4bf5d11b175aea82cfe8224a
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:25 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

x86, mpx: Introduce VM_MPX to indicate that a VMA is MPX specific

MPX-enabled applications using large swaths of memory can
potentially have large numbers of bounds tables in process
address space to save bounds information. These tables can take
up huge swaths of memory (as much as 80% of the memory on the
system) even if we clean them up aggressively. In the worst-case
scenario, the tables can be 4x the size of the data structure
being tracked. IOW, a 1-page structure can require 4 bounds-table
pages.

Being this huge, our expectation is that folks using MPX are
going to be keen on figuring out how much memory is being
dedicated to it. So we need a way to track memory use for MPX.

If we want to specifically track MPX VMAs we need to be able to
distinguish them from normal VMAs, and keep them from getting
merged with normal VMAs. A new VM_ flag set only on MPX VMAs does
both of those things. With this flag, MPX bounds-table VMAs can
be distinguished from other VMAs, and userspace can also walk
/proc/$pid/smaps to get memory usage for MPX.

In addition to this flag, we also introduce a special ->vm_ops
specific to MPX VMAs (see the patch "add MPX specific mmap
interface"), but currently different ->vm_ops do not by
themselves prevent VMA merging, so we still need this flag.

We understand that VM_ flags are scarce and are open to other
options.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151825.56562...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 fs/proc/task_mmu.c | 3 +++
 include/linux/mm.h | 6 ++
 2 files changed, 9 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4e0388c..f6734c6 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -552,6 +552,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+#ifdef CONFIG_X86_INTEL_MPX
+   [ilog2(VM_MPX)] = "mp",
+#endif
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b464611..f7606d3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -128,6 +128,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+#define VM_ARCH_2  0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -155,6 +156,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPY	VM_ARCH_1	/* T if mapped copy of data (nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPXVM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
--


[tip:x86/mpx] mips: Sync struct siginfo with general version

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  232b5fff5bad78ad00b94153fa90ca53bef6a444
Gitweb: http://git.kernel.org/tip/232b5fff5bad78ad00b94153fa90ca53bef6a444
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:20 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

mips: Sync struct siginfo with general version

New fields about bound violation are added into general struct
siginfo. This will impact MIPS and IA64, which extend general
struct siginfo. This patch syncs this struct for MIPS with
general version.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151820.f7edc...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 arch/mips/include/uapi/asm/siginfo.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/mips/include/uapi/asm/siginfo.h b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
--


[tip:x86/mpx] ia64: Sync struct siginfo with general version

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  53f037b08b5bebf47aa2b574a984e2f9fc7926f2
Gitweb: http://git.kernel.org/tip/53f037b08b5bebf47aa2b574a984e2f9fc7926f2
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:22 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

ia64: Sync struct siginfo with general version

New fields about bound violation are added into general struct
siginfo. This will impact MIPS and IA64, which extend general
struct siginfo. This patch syncs this struct for IA64 with
general version.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151822.82b3b...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 arch/ia64/include/uapi/asm/siginfo.h | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/siginfo.h b/arch/ia64/include/uapi/asm/siginfo.h
index 4ea6225..bce9bc1 100644
--- a/arch/ia64/include/uapi/asm/siginfo.h
+++ b/arch/ia64/include/uapi/asm/siginfo.h
@@ -63,6 +63,10 @@ typedef struct siginfo {
unsigned int _flags;/* see below */
unsigned long _isr; /* isr */
short _addr_lsb;/* lsb of faulting address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -110,9 +114,9 @@ typedef struct siginfo {
 /*
  * SIGSEGV si_codes
  */
-#define __SEGV_PSTKOVF (__SI_FAULT|3)  /* paragraph stack overflow */
+#define __SEGV_PSTKOVF (__SI_FAULT|4)  /* paragraph stack overflow */
 #undef NSIGSEGV
-#define NSIGSEGV   3
+#define NSIGSEGV   4
 
 #undef NSIGTRAP
 #define NSIGTRAP   4
--


[tip:x86/mpx] mpx: Extend siginfo structure to include bound violation information

2014-11-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  ee1b58d36aa1b5a79eaba11f5c3633c88231da83
Gitweb: http://git.kernel.org/tip/ee1b58d36aa1b5a79eaba11f5c3633c88231da83
Author: Qiaowei Ren 
AuthorDate: Fri, 14 Nov 2014 07:18:19 -0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 18 Nov 2014 00:58:53 +0100

mpx: Extend siginfo structure to include bound violation information

This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Dave Hansen 
Cc: linux...@kvack.org
Cc: linux-m...@linux-mips.org
Cc: Dave Hansen 
Link: http://lkml.kernel.org/r/20141114151819.1908c...@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner 
---
 include/uapi/asm-generic/siginfo.h | 9 -
 kernel/signal.c| 4 
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR	(__SI_FAULT|2)	/* invalid permissions for mapped object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from)
	if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
--


[PATCH v9 06/12] mpx: extend siginfo structure to include bound violation information

2014-10-11 Thread Qiaowei Ren
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.

Signed-off-by: Qiaowei Ren 
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR	(__SI_FAULT|2)	/* invalid permissions for mapped object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from)
	if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1

--


[PATCH v9 04/12] x86, mpx: add MPX to disabled features

2014-10-11 Thread Qiaowei Ren
This allows us to use cpu_feature_enabled(X86_FEATURE_MPX) as
both a runtime and compile-time check.

When CONFIG_X86_INTEL_MPX is disabled,
cpu_feature_enabled(X86_FEATURE_MPX) will evaluate at
compile-time to 0. If CONFIG_X86_INTEL_MPX=y, then the cpuid
flag will be checked at runtime.

This patch must be applied after another of Dave's commits:
  381aa07a9b4e1f82969203e9e4863da2a157781d

Signed-off-by: Dave Hansen 
Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/disabled-features.h |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 97534a7..f226df0 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -10,6 +10,12 @@
  * cpu_feature_enabled().
  */
 
+#ifdef CONFIG_X86_INTEL_MPX
+# define DISABLE_MPX   0
+#else
+# define DISABLE_MPX   (1<<(X86_FEATURE_MPX & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME   (1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR   (1<<(X86_FEATURE_K6_MTRR & 31))
@@ -34,6 +40,6 @@
 #define DISABLED_MASK6 0
 #define DISABLED_MASK7 0
 #define DISABLED_MASK8 0
-#define DISABLED_MASK9 0
+#define DISABLED_MASK9 (DISABLE_MPX)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
-- 
1.7.1

--


[PATCH v9 07/12] mips: sync struct siginfo with general version

2014-10-11 Thread Qiaowei Ren
New fields about bound violation are added into general struct
siginfo. This will impact MIPS and IA64, which extend general
struct siginfo. This patch syncs this struct for MIPS with
general version.

Signed-off-by: Qiaowei Ren 
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1

--


[PATCH v9 05/12] x86, mpx: on-demand kernel allocation of bounds tables

2014-10-11 Thread Qiaowei Ren
MPX only has 4 hardware registers for storing bounds information.
If MPX-enabled code needs more than these 4 registers, it needs to
spill them somewhere. It has two special instructions for this
which allow the bounds to be moved between the bounds registers
and some new "bounds tables".

They are conceptually similar to a page fault and will be raised by
the MPX hardware both during bounds violations and when the tables
are not present. This patch handles those #BR exceptions for
not-present tables by carving the space out of the normal process's
address space (essentially calling the new mmap() interface introduced
earlier in this patch set) and then pointing the bounds directory
over to it.

The tables *need* to be accessed and controlled by userspace because
the instructions for moving bounds in and out of them are extremely
frequent. They potentially happen every time a register pointing to
memory is dereferenced. Any direct kernel involvement (like a syscall)
to access the tables would obviously destroy performance.

 Why not do this in userspace? 

This patch is obviously doing this allocation in the kernel.
However, MPX does not strictly *require* anything in the kernel.
It can theoretically be done completely from userspace. Here are
a few ways this *could* be done. I don't think any of them are
practical in the real-world, but here they are.

Q: Can virtual space simply be reserved for the bounds tables so
   that we never have to allocate them?
A: As noted earlier, these tables are *HUGE*. An X-GB virtual
   area needs 4*X GB of virtual space, plus 2GB for the bounds
   directory. If we were to preallocate them for the 128TB of
   user virtual address space, we would need to reserve 512TB+2GB,
   which is larger than the entire virtual address space today.
   This means they can not be reserved ahead of time. Also, a
   single process's pre-populated bounds directory consumes 2GB
   of virtual *AND* physical memory. IOW, it's completely
   infeasible to prepopulate bounds directories.

Q: Can we preallocate bounds table space at the same time memory
   is allocated which might contain pointers that might eventually
   need bounds tables?
A: This would work if we could hook the site of each and every
   memory allocation syscall. This can be done for small,
   constrained applications. But, it isn't practical at a larger
   scale since a given app has no way of controlling how all the
   parts of the app might allocate memory (think libraries). The
   kernel is really the only place to intercept these calls.

Q: Could a bounds fault be handed to userspace and the tables
   allocated there in a signal handler instead of in the kernel?
A: (thanks to tglx) mmap() is not on the list of safe async
   handler functions and even if mmap() would work it still
   requires locking or nasty tricks to keep track of the
   allocation state there.

Having ruled out all of the userspace-only approaches for managing
bounds tables that we could think of, we create them on demand in
the kernel.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   20 +
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |  101 
 arch/x86/kernel/traps.c|   52 ++-
 4 files changed, 173 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET	20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL		2
+#define MPX_BNDCFG_TAIL		12
+#define MPX_BNDSTA_ADDR_MASK	(~((1UL<<MPX_BNDSTA_TAIL)-1))
+#define MPX_BNDCFG_ADDR_MASK	(~((1UL<<MPX_BNDCFG_TAIL)-1))
+#define MPX_BD_ENTRY_VALID_FLAG	0x1
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren 
+ * Dave Hansen 
+ */
+
+#include 
+#include 
+#include 
+
+/*
+ * With 32-bit mode, MPX_BD_SIZE_BYTES is 4MB, and the size of each
+ * bounds table is 16KB. With 64-bit mode, MPX_BD_SIZE_BYTES is 2GB,
+ * and the size of each bounds table is 4MB.
+ */
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr;
+   unsigned long expected_old_val = 0;
+   unsigned long actual_old_val = 0;
+   int ret = 0;
+
+   /*
+* Carve the virtual space out of userspace for the new
+* bounds table:
+*/
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr))
+   return PTR_ERR((void *)bt_addr);
+   /*
+* Set the valid flag (kinda like _PAGE_PRESENT in a pte)
+*/
+   bt_addr = bt_addr | MPX_BD_ENTRY_VALID_FLAG;
+
+   /*
+* Go poke the address of the new bounds table in to the
+* bounds directory entry out in userspace memory.  

[PATCH v9 03/12] x86, mpx: add MPX specific mmap interface

2014-10-11 Thread Qiaowei Ren
We have to do the allocation of bounds tables in kernel (See the patch
"on-demand kernel allocation of bounds tables"). Moreover, if we want
to track MPX VMAs we need to be able to stick the new VM_MPX flag and a
specific vm_ops for MPX in the vm_area_struct.

But there is no suitable interface to do this in the current kernel.
Existing interfaces, like do_mmap_pgoff(), cannot stick a specific
->vm_ops in the vm_area_struct when a VMA is created. So, this patch
adds an MPX specific mmap interface to do the allocation of bounds tables.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4b663e1..e5bcc70 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -243,6 +243,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * This is really a simplified "vm_mmap". It only handles MPX-related
+ * maps: the bounds tables and the bounds directory.
+ *
+ * Here we can set the new VM_MPX vm_flag in the vm_area_struct when
+ * creating a bounds table or bounds directory, in order to track
+ * MPX-specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. We verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out;
+
+   vma = find_vma(mm, ret);
+   if (!vma) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   vma->vm_ops = &mpx_vma_ops;
+
+   

[PATCH v9 01/12] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-10-11 Thread Qiaowei Ren
MPX-enabled applications using large swaths of memory can potentially
have large numbers of bounds tables in process address space to save
bounds information. These tables can take up huge swaths of memory
(as much as 80% of the memory on the system) even if we clean them
up aggressively. In the worst-case scenario, the tables can be 4x the
size of the data structure being tracked. IOW, a 1-page structure can
require 4 bounds-table pages.

Being this huge, our expectation is that folks using MPX are going to
be keen on figuring out how much memory is being dedicated to it. So
we need a way to track memory use for MPX.

If we want to specifically track MPX VMAs we need to be able to
distinguish them from normal VMAs, and keep them from getting merged
with normal VMAs. A new VM_ flag set only on MPX VMAs does both of
those things. With this flag, MPX bounds-table VMAs can be distinguished
from other VMAs, and userspace can also walk /proc/$pid/smaps to get
memory usage for MPX.

Besides this flag, we also introduce a specific ->vm_ops for MPX VMAs
(see the patch "add MPX specific mmap interface"), but currently VMAs
with different ->vm_ops are not prevented from merging. We understand
that VM_ flags are scarce and are open to other options.

Signed-off-by: Qiaowei Ren 
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfc791c..cc31520 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8981cc8..942be8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB	0x00400000	/* Huge TLB Page VM */
 #define VM_NONLINEAR	0x00800000	/* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
+#define VM_ARCH_2	0x02000000
 #define VM_DONTDUMP	0x04000000	/* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -154,6 +155,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPYVM_ARCH_1   /* T if mapped copy of data 
(nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPXVM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v9 08/12] ia64: sync struct siginfo with general version

2014-10-11 Thread Qiaowei Ren
New fields describing bound violations are added to the generic struct
siginfo. This impacts MIPS and IA64, which extend the generic struct
siginfo. This patch syncs the IA64 struct with the generic version.

Signed-off-by: Qiaowei Ren 
---
 arch/ia64/include/uapi/asm/siginfo.h |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/siginfo.h 
b/arch/ia64/include/uapi/asm/siginfo.h
index 4ea6225..bce9bc1 100644
--- a/arch/ia64/include/uapi/asm/siginfo.h
+++ b/arch/ia64/include/uapi/asm/siginfo.h
@@ -63,6 +63,10 @@ typedef struct siginfo {
unsigned int _flags;/* see below */
unsigned long _isr; /* isr */
short _addr_lsb;/* lsb of faulting address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -110,9 +114,9 @@ typedef struct siginfo {
 /*
  * SIGSEGV si_codes
  */
-#define __SEGV_PSTKOVF (__SI_FAULT|3)  /* paragraph stack overflow */
+#define __SEGV_PSTKOVF (__SI_FAULT|4)  /* paragraph stack overflow */
 #undef NSIGSEGV
-#define NSIGSEGV   3
+#define NSIGSEGV   4
 
 #undef NSIGTRAP
 #define NSIGTRAP   4
-- 
1.7.1



[PATCH v9 12/12] x86, mpx: add documentation on Intel MPX

2014-10-11 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  245 +++
 1 files changed, 245 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..3c20a17
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,245 @@
+1. Intel(R) MPX Overview
+========================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability
+introduced into Intel Architecture. Intel MPX provides hardware features
+that can be used in conjunction with compiler changes to check memory
+references, for those references whose compile-time normal intentions are
+usurped at runtime due to buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture Instruction
+Set Extensions Programming Reference, Chapter 9: Intel(R) Memory Protection
+Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead, which
+can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How to get the advantage of MPX
+==================================
+
+For MPX to work, changes are required in the kernel, binutils and compiler.
+No source changes are required for applications, just a recompile.
+
+There are a lot of moving parts that all have to work right. The following
+is how we expect the compiler, application and kernel to work together.
+
+1) Application developer compiles with -fmpx. The compiler will add the
+   instrumentation as well as some setup code called early after the app
+   starts. New instruction prefixes are noops for old CPUs.
+2) That setup code allocates (virtual) space for the "bounds directory",
+   points the "bndcfgu" register to the directory and notifies the kernel
+   (via the new prctl(PR_MPX_ENABLE_MANAGEMENT)) that the app will be using
+   MPX.
+3) The kernel detects that the CPU has MPX, allows the new prctl() to
+   succeed, and notes the location of the bounds directory. Userspace is
+   expected to keep the bounds directory at that location. We note it
+   instead of reading it each time because the 'xsave' operation needed
+   to access the bounds directory register is an expensive operation.
+4) If the application needs to spill bounds out of the 4 registers, it
+   issues a bndstx instruction. Since the bounds directory is empty at
+   this point, a bounds fault (#BR) is raised, the kernel allocates a
+   bounds table (in the user address space) and makes the relevant entry
+   in the bounds directory point to the new table.
+5) If the application violates the bounds specified in the bounds registers,
+   a separate kind of #BR is raised which will deliver a signal with
+   information about the violation in the 'struct siginfo'.
+6) Whenever memory is freed, we know that it can no longer contain valid
+   pointers, and we attempt to free the associated space in the bounds
+   tables. If an entire table becomes unused, we will attempt to free
+   the table and remove the entry in the directory.
+
+To summarize, there are essentially three things interacting here:
+
+GCC with -fmpx:
+ * enables annotation of code with MPX instructions and prefixes
+ * inserts code early in the application to call in to the "gcc runtime"
+GCC MPX Runtime:
+ * Checks for hardware MPX support in cpuid leaf
+ * allocates virtual space for the bounds directory (malloc() essentially)
+ * points the hardware BNDCFGU register at the directory
+ * calls a new prctl(PR_MPX_ENABLE_MANAGEMENT) to notify the kernel to
+   start managing the bounds directories
+Kernel MPX Code:
+ * Checks for hardware MPX support in cpuid leaf
+ * Handles #BR exceptions and sends SIGSEGV to the app when it violates
+   bounds, like during a buffer overflow.
+ * When bounds are spilled in to an unallocated bounds table, the kernel
+   notices in the #BR exception, allocates the virtual space, then
+   updates the bounds directory to point to the new table. It keeps
+   special track of the memory with a VM_MPX flag.
+ * Frees unused bounds tables at the time that the memory they described
+   is unmapped.
+
+
+3. How does MPX kernel code work
+================================
+
+Handling #BR faults caused by MPX
+---------------------------------
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * new bounds tables (BT) need to be allocated to save bounds.
+  * bounds violation caused by MPX instructions.
+
+We hook the #BR handler to handle these two new situations.
+
+On-demand kernel allocation of bounds tables
+--------------------------------------------
+
+MPX only has 4 hardware registers for stor

[PATCH v9 09/12] x86, mpx: decode MPX instruction to get bound violation information

2014-10-11 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch doesn't use the generic decoder, and implements a limited
special-purpose decoder to decode MPX instructions, simply because the
generic decoder is very heavyweight, not just in terms of performance
but in terms of interface -- because it has to be.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 2103b5e..b7e4c0e 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -10,6 +10,275 @@
 #include 
 #include 
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address being referenced by the instruction.
+ * For rm==3, return the content of the rm reg.
+ * For rm!=3, calculate the address using the SIB byte and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   

[PATCH v9 11/12] x86, mpx: cleanup unused bound tables

2014-10-11 Thread Qiaowei Ren
There are two mappings in play: 1. The mapping with the actual data,
which userspace is munmap()ing or brk()ing away, etc... 2. The mapping
for the bounds table *backing* the data (tagged with mpx_vma_ops,
see the patch "add MPX specific mmap interface").

If userspace uses the prctl() introduced earlier in this patchset to
enable kernel management of bounds tables, then when it unmaps the
first kind of mapping (the actual data), the kernel needs to free the
mapping for the bounds table backing that data. This patch calls
arch_unmap() at the very end of do_munmap() to do so. This walks the
directory to look at the entries covered by the data VMA, unmaps the
bounds table referenced from the directory, and then clears the
directory entry.

Unmapping of bounds tables is called under vm_munmap() of the data VMA.
So we have to check ->vm_ops to prevent recursion. This recursion represents
having bounds tables for bounds tables, which should not occur normally.
Being strict about it here helps ensure that we do not have an exploitable
stack overflow.

Once we unmap the bounds table, we would have a bounds directory entry
pointing at empty address space. That address space could now be allocated
for some other (random) use, and the MPX hardware is now going to go
trying to walk it as if it were a bounds table. That would be bad. So
any unmapping of a bounds table has to be accompanied by a corresponding
write to the bounds directory entry to mark it invalid. That write to
the bounds directory can fault.

Since we are doing the freeing from munmap() (and other paths like it),
we hold mmap_sem for write. If we fault, the page fault handler will
attempt to acquire mmap_sem for read and we will deadlock. For now, to
avoid deadlock, we disable page faults while touching the bounds directory
entry. This keeps us from being able to free the tables in this case.
This deficiency will be addressed in later patches.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |   16 ++
 arch/x86/include/asm/mpx.h |9 +
 arch/x86/mm/mpx.c  |  317 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 350 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index e33ddb7..2b52d1b 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -111,4 +111,20 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 #endif
 }
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Userspace never asked us to manage the bounds tables,
+* so refuse to help.
+*/
+   if (!kernel_managing_mpx_tables(current->mm))
+   return;
+
+   mpx_notify_unmap(mm, vma, start, end);
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 32f13f5..a1a0155 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -48,6 +48,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK	((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK	((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)	((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+		MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)	((((addr)>>MPX_IGN_BITS) & \
+		MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -73,6 +80,8 @@ static inline int kernel_managing_mpx_tables(struct mm_struct 
*mm)
return (mm->bd_addr != MPX_INVALID_BOUNDS_DIR);
 }
 unsigned long mpx_mmap(unsigned long len);
+void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 376f2ee..dcc6621 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -1,7 +1,16 @@
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren 
+ * Dave Hansen 
+ */
+
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -13,6 +22,11 @@ static struct vm_operations_struct mpx_vma_ops = {
.name = mpx_mapping_name,
 };
 
+int is_mpx_vma(struct vm_area_struct *vma)
+{
+   return (vma->vm_ops == &mpx_vma_ops);
+}
+
 /*
  * this is really a simplified "vm_mmap". it only handles mpx
  * related maps, including bounds table an

[PATCH v9 10/12] x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT

2014-10-11 Thread Qiaowei Ren
This patch adds two prctl() commands to provide an explicit interaction
mechanism for enabling or disabling the management of bounds tables in
the kernel, including on-demand kernel allocation (see the patch
"on-demand kernel allocation of bounds tables") and cleanup (see the
patch "cleanup unused bound tables"). Applications do not strictly need
the kernel to manage bounds tables and we expect some applications to
use MPX without taking advantage of the kernel support. This means the
kernel cannot simply infer whether an application needs bounds table
management from the MPX registers. prctl() is an explicit signal from
userspace.

PR_MPX_ENABLE_MANAGEMENT is meant to be a signal from userspace to
request the kernel's help in managing bounds tables. And
PR_MPX_DISABLE_MANAGEMENT is the opposite, meaning that userspace
doesn't want the kernel's help any more. With PR_MPX_DISABLE_MANAGEMENT,
the kernel won't allocate or free bounds tables, even if the CPU
supports the MPX feature.

PR_MPX_ENABLE_MANAGEMENT will do an xsave, fetch the base address of
the bounds directory from the xsave buffer and then cache it in a new
field "bd_addr" in struct mm_struct. PR_MPX_DISABLE_MANAGEMENT will
set "bd_addr" to an invalid address. We can then check "bd_addr" to
judge whether kernel management of bounds tables is enabled.

xsaves are expensive, so "bd_addr" is cached to reduce the number of
xsaves we have to do at munmap() time. But we still have to do an
xsave to get the value of BNDSTATUS at #BR fault time. In addition,
with this caching, userspace can't just move the bounds directory
around willy-nilly. For sane applications the base address of the
bounds directory won't change; otherwise we would be in a world of
hurt. But we will still check whether it has changed at #BR fault
time.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |9 
 arch/x86/include/asm/mpx.h |   11 +
 arch/x86/include/asm/processor.h   |   18 +++
 arch/x86/kernel/mpx.c  |   88 
 arch/x86/kernel/setup.c|8 +++
 arch/x86/kernel/traps.c|   30 -
 arch/x86/mm/mpx.c  |   25 +++---
 fs/exec.c  |2 +
 include/asm-generic/mmu_context.h  |5 ++
 include/linux/mm_types.h   |3 +
 include/uapi/linux/prctl.h |6 +++
 kernel/sys.c   |   12 +
 12 files changed, 198 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index 166af2a..e33ddb7 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifndef CONFIG_PARAVIRT
 #include 
 
@@ -102,4 +103,12 @@ do {   \
 } while (0)
 #endif
 
+static inline void arch_bprm_mm_init(struct mm_struct *mm,
+   struct vm_area_struct *vma)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   mm->bd_addr = MPX_INVALID_BOUNDS_DIR;
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..32f13f5 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -5,6 +5,12 @@
 #include 
 #include 
 
+/*
+ * NULL is theoretically a valid place to put the bounds
+ * directory, so point this at an invalid address.
+ */
+#define MPX_INVALID_BOUNDS_DIR ((void __user *)-1)
+
 #ifdef CONFIG_X86_64
 
 /* upper 28 bits [47:20] of the virtual address in 64-bit used to
@@ -43,6 +49,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
@@ -61,6 +68,10 @@ struct mpx_insn {
 
 #define MAX_MPX_INSN_SIZE  15
 
+static inline int kernel_managing_mpx_tables(struct mm_struct *mm)
+{
+   return (mm->bd_addr != MPX_INVALID_BOUNDS_DIR);
+}
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 020142f..b35aefa 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -953,6 +953,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_ENABLE_MANAGEMENT(tsk) mpx_enable_management((tsk))
+#define MPX_DISABLE_MANAGEMENT(tsk)mpx_disable_management((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_enable_management(struct task_struct *tsk);
+extern int mpx_disable_management(struct task_struct *tsk);
+#else
+static inline int mpx_enable_managem

[PATCH v9 00/12] Intel MPX support

2014-10-11 Thread Qiaowei Ren
uild issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset has to set an
MPX-specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag. 
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Changes since v8:
  * add new patch to rename cfg_reg_u and status_reg.
  * add new patch to use disabled features from Dave's patches.
  * add new patch to sync struct siginfo for IA64.
  * rename two new prctl() commands to PR_MPX_ENABLE_MANAGEMENT and
PR_MPX_DISABLE_MANAGEMENT, check whether the management of bounds
tables in kernel is enabled at #BR fault time, and add locking to
protect the access to 'bd_addr'.
  * update the documentation file to add more content about on-demand
allocation of bounds tables, etc..

Qiaowei Ren (12):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: rename cfg_reg_u and status_reg
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add MPX to disabled features
  x86, mpx: on-demand kernel allocation of bounds tables
  mpx: extend siginfo structure to include bound violation information
  mips: sync struct siginfo with general version
  ia64: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_ENABLE_MANAGEMENT,
PR_MPX_DISABLE_MANAGEMENT
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt  |  245 +++
 arch/ia64/include/uapi/asm/siginfo.h |8 +-
 arch/mips/include/uapi/asm/siginfo.h |4 +
 arch/x86/Kconfig |4 +
 arch/x86/include/asm/disabled-features.h |8 +-
 arch/x86/include/asm/mmu_context.h   |   25 ++
 arch/x86/include/asm/mpx.h   |  101 ++
 arch/x86/include/asm/processor.h |   22 ++-
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/mpx.c|  488 ++
 arch/x86/kernel/setup.c  |8 +
 arch/x86/kernel/traps.c  |   86 ++-
 arch/x86/mm/Makefile |2 +
 arch/x86/mm/mpx.c|  385 +++
 fs/exec.c|2 +
 fs/proc/task_mmu.c   |1 +
 include/asm-generic/mmu_context.h|   11 +
 include/linux/mm.h   |6 +
 include/linux/mm_types.h |3 +
 include/uapi/asm-generic/siginfo.h   |9 +-
 include/uapi/linux/prctl.h   |6 +
 kernel/signal.c  |4 +
 kernel/sys.c |   12 +
 mm/mmap.c|2 +
 24 files changed, 1436 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c
 create mode 100644 arch/x86/mm/mpx.c



[PATCH v9 02/12] x86, mpx: rename cfg_reg_u and status_reg

2014-10-11 Thread Qiaowei Ren
According to the Intel SDM extensions, the MPX configuration and status
registers are named BNDCFGU and BNDSTATUS. This patch renames cfg_reg_u
and status_reg to bndcfgu and bndstatus accordingly.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/processor.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index eb71ec7..020142f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -379,8 +379,8 @@ struct bndregs_struct {
 } __packed;
 
 struct bndcsr_struct {
-   u64 cfg_reg_u;
-   u64 status_reg;
+   u64 bndcfgu;
+   u64 bndstatus;
 } __packed;
 
 struct xsave_hdr_struct {
-- 
1.7.1



[PATCH v8 00/10] Intel MPX support

2014-09-11 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. The MPX architecture is designed to allow a
machine to run both MPX-enabled software and legacy software that is
MPX unaware. In such a case, the legacy software does not benefit from
MPX, but it also does not experience any change in functionality or
reduction in performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To take advantage of MPX, changes are required in the OS kernel,
binutils, the compiler, and system libraries.

A new GCC option, -fmpx, is introduced to emit MPX instructions.
Currently, GCC sources with MPX support are available in a
separate branch of the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To get full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written in assembly, and
compile Glibc with the MPX-enabled GCC compiler. MPX-enabled
Glibc sources can currently be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates, but some runtime code is needed to configure and enable
MPX. For most applications this runtime support will come from a
library supplied with the compiler, or possibly directly from the OS
once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main responsibilities:
providing handlers for bounds faults (#BR), and managing bounds memory.

The high-level areas modified in the patchset are as follows:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

This patchset has been tested on real internal hardware platform at Intel.
We have some simple unit tests in user space, which directly call MPX
instructions to make the kernel allocate bounds tables and to cause
bounds violations. We also compiled several benchmarks with an
MPX-enabled GCC/Glibc and ICC, and ran them with this patch set.
These tests uncovered a number of bugs in this code.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check whether a #BR occurred in user space or kernel space.
  * use generic structures and macros as much as possible when
decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors in the documentation, and document
the extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook unmap() path to cleanup unused bounds tables, and use
new prctl() command to register bounds directory address to
struct mm_struct to check whether one process is MPX enabled
during unmap().
  * in order to track MPX memory usage precisely, add an
MPX-specific mmap interface and one VM_MPX flag to check whether
a VMA is an MPX bounds table.
  * add the macro cpu_has_mpx as a performance optimization.
  * sync struct siginfo for MIPS with the general version to
avoid a build issue.

Changes since v6:
  * because arch_vma_name was removed, this patchset has to set
MPX-specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Changes since v7:
  * introduce VM_ARCH_2 flag. 
  * remove all of the pr_debug()s.
  * fix prctl numbers in documentation.
  * fix some bugs on bounds tables freeing.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

[PATCH v8 02/10] x86, mpx: add MPX specific mmap interface

2014-09-11 Thread Qiaowei Ren
This patch adds one MPX-specific mmap interface, which only handles
MPX-related maps, including bounds tables and the bounds directory.

In order to track MPX-specific memory usage, this interface sets the
new vm_flag VM_MPX in the vm_area_struct when creating a bounds
table or bounds directory.

These bounds tables can take huge amounts of memory.  In the
worst-case scenario, the tables can be 4x the size of the data
structure being tracked. IOW, a 1-page structure can require 4
bounds-table pages.

My expectation is that folks using MPX are going to be keen on
figuring out how much memory is being dedicated to it. With this
feature, plus some grepping in /proc/$pid/smaps one could take a
pretty good stab at it.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 778178f..935aa69 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -243,6 +243,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * This is really a simplified "vm_mmap". It only handles
+ * MPX-related maps, including bounds tables and the bounds directory.
+ *
+ * Here we stick the new vm_flag VM_MPX in the vm_area_struct
+ * when creating a bounds table or bounds directory, in order to
+ * track MPX-specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. We verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out;

[PATCH v8 03/10] x86, mpx: add macro cpu_has_mpx

2014-09-11 Thread Qiaowei Ren
In order to do performance optimization, this patch adds the macro
cpu_has_mpx, which directly returns 0 when MPX is not supported by
the kernel.

The community gave a lot of comments on this macro in the previous
version. Dave will introduce a patchset about disabled features to
fix it later.

In this code:
if (cpu_has_mpx)
do_some_mpx_thing();

The patch series from Dave will introduce a new macro,
cpu_feature_enabled() (if merged after this patchset), to replace
cpu_has_mpx:
if (cpu_feature_enabled(X86_FEATURE_MPX))
do_some_mpx_thing();

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index bb9b258..82ec7ed 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -353,6 +353,12 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef  cpu_has_vme
-- 
1.7.1



[PATCH v8 05/10] x86, mpx: extend siginfo structure to include bound violation information

2014-09-11 Thread Qiaowei Ren
This patch adds new fields describing bound violations to the
siginfo structure: si_lower and si_upper are, respectively, the
lower and upper bound in effect when a bound violation occurs.

Signed-off-by: Qiaowei Ren 
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..2c403a4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const siginfo_t *from)
 	if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1



[PATCH v8 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-09-11 Thread Qiaowei Ren
An MPX-enabled application may create a large number of bounds tables
in its process address space to store bounds information. These
tables can take up huge swaths of memory (as much as 80% of the
memory on the system) even if we clean them up aggressively. Being
this huge, we need a way to track their memory use. If we want to
track them, we essentially have two options:

1. Walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them.
2. Find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them.

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag.  We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.

Signed-off-by: Qiaowei Ren 
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfc791c..cc31520 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8981cc8..942be8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+#define VM_ARCH_2  0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
@@ -154,6 +155,11 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MAPPED_COPY	VM_ARCH_1	/* T if mapped copy of data (nommu mmap) */
 #endif
 
+#if defined(CONFIG_X86)
+/* MPX specific bounds table or bounds directory */
+# define VM_MPXVM_ARCH_2
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUPVM_NONE
 #endif
-- 
1.7.1



[PATCH v8 06/10] mips: sync struct siginfo with general version

2014-09-11 Thread Qiaowei Ren
Due to the new fields about bound violations added to struct siginfo,
this patch syncs the MIPS version with the general one to avoid a
build issue.

Signed-off-by: Qiaowei Ren 
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1



[PATCH v8 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-09-11 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal process's address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds directory entry over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code which frequently store and retrieve entries from the bounds
tables. Any direct kernel involvement (like a syscall) to access
the tables would destroy performance since these are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   20 +++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   58 
 arch/x86/kernel/traps.c|   55 -
 4 files changed, 133 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL2
+#define MPX_BNDCFG_TAIL12
+#define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
+#define MPX_BNDCFG_ADDR_MASK   (~((1UL<<MPX_BNDCFG_TAIL)-1))
+#define MPX_BT_ADDR_MASK       (~((1UL<<MPX_IGN_BITS)-1))
+#include 
+#include 
+
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr))
+   return bt_addr;
+   bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* There is an existing bounds table pointed at by this bounds
+* directory entry, so we need to free the bounds table we
+* allocated just now.
+*/
+   if (old_val)
+   goto out;
+
+   return 0;
+
+out:
+   vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * whose valid bit is not set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * With 32-bit mode, the size of BD is 4MB, and the size of each
+ * bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
+ * and the size of each bound table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry < bd_base) ||
+   (bd_entry >= bd_base + MPX_BD_SIZE_BYTES))
+   return -EINVAL;
+
+   return allocate_bt((long __user *)bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 0d0e922..396a88b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -228,7 +229,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
 
 DO_ERROR(X86_TRAP_DE, SIGFPE,  "divide error", divide_error)
 DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow)
-DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds",   bounds)
 DO_ERROR(X86_TRAP_UD, SIGILL,  "invalid opcode",   invalid_op)
 DO_ERROR(X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment overrun",coprocessor_segment_overrun)
 DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS",  invalid_TSS)
@@ -278,6 +278,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 }
 #endif
 
+dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
+{
+   enum ctx_state prev_state;
+   unsigned long status;
+   struct xsave_struct *xsave_buf;
+   struct task_struct *tsk = current;
+
+   prev_state = exception_enter();
+   if (notify_die(DIE_TRAP, "boun

[PATCH v8 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-09-11 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister
MPX-related resources on the x86 platform.

The base of the bounds directory is set in mm_struct during
PR_MPX_REGISTER command execution. This member can be used to
check whether an application is MPX-enabled.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   55 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index eb71ec7..b801fea 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -953,6 +953,24 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 7ef6e39..b86873a 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,61 @@
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* The runtime in userspace is responsible for allocating the
+* bounds directory. It then saves the base of the bounds
+* directory into the XSAVE/XRSTOR save area and enables MPX
+* through the XRSTOR instruction.
+*
+* fpu_xsave() is expected to be very expensive. As a performance
+* optimization, we get the base of the bounds directory here and
+* save it into mm_struct for future use.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 enum reg_type {
REG_TYPE_RM = 0,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e0b286..760aee3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER43
+#define PR_MPX_UNREGISTER  44
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index ce81291..9a43587 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c

[PATCH v8 10/10] x86, mpx: add documentation on Intel MPX

2014-09-11 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..ccffeee
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time intentions are subverted at runtime due to buffer
+overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook the #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-
+
+If a #BR is generated due to a bounds violation caused by MPX, we
+need to decode the MPX instruction to get the violation address and
+store this address in the extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field refers to the violation address, and the new
+'_addr_bnd' field holds the lower/upper bounds when a #BR is caused.
+
+Glibc will also be updated to support this new siginfo, so users
+can get the violation address and bounds when bounds violations occur.
+
+Freeing unused bounds tables
+
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+Here, the solution is to hook do_munmap() to check whether the
+process is MPX-enabled. If so, the bounds tables covering the
+virtual address region being unmapped will be freed as well.
+
+Adding new prctl commands
+-
+
+The runtime library in userspace is responsible for allocating the
+bounds directory, so the kernel has to use the XSAVE instruction to
+get the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. As a performance
+optimization, we add a new prctl() command to fetch the base of the
+bounds directory once so it can be reused later.
+
+Two new prctl commands are added to register and unregister MPX related
+resource.
+
+155#define PR_MPX_REGISTER 43
+156#define PR_MPX_UNREGISTER   44
+
+The base of the bounds directory is set in mm_struct during
+PR_MPX_REGISTER command execution. This member can be used to
+check whether an application is MPX-enabled.
+
+
+3. Tips
+===
+
+1) Users are not allowed to create bounds tables and point the
+bounds directory at them from userspace. In fact, it is not even
+necessary for users to create bounds tables in userspace.
+
+When a #BR fault is produced due to an invalid entry, a bounds table
+will be created in the kernel on demand; the kernel will not forward
+this fault to userspace. So userspace cannot receive a #BR fault for
+an invalid entry, and need not create bounds tables itself.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set v

[PATCH v8 09/10] x86, mpx: cleanup unused bound tables

2014-09-11 Thread Qiaowei Ren
Since the kernel allocated those tables on-demand without userspace
knowledge, it is also responsible for freeing them when the associated
mappings go away.

Here, the solution is to hook do_munmap() to check whether the
process is MPX-enabled. If so, the bounds tables covering the
virtual address region being unmapped will be freed as well.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  252 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 285 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 166af2a..d13e01c 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifndef CONFIG_PARAVIRT
 #include 
 
@@ -102,4 +103,19 @@ do {   \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from an MPX-enabled application.
+* If so, release the bounds tables related to this vma.
+*/
+   if (mm->bd_addr && !(vma->vm_flags & VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index e1b28e6..feb1f01 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -1,7 +1,16 @@
+/*
+ * mpx.c - Memory Protection eXtensions
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Qiaowei Ren 
+ * Dave Hansen 
+ */
+
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -77,3 +86,246 @@ out:
up_write(&mm->mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of bounds tables pointed by specific bounds
+ * directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr)
+{
+   int valid;
+
+   if (!access_ok(VERIFY_READ, (bd_entry), sizeof(*(bd_entry))))
+   return -EFAULT;
+
+   pagefault_disable();
+   if (get_user(*bt_addr, bd_entry))
+   goto out;
+   pagefault_enable();
+
+   valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr &= MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero while its valid
+* bit is zero, a SIGSEGV will be raised for this unexpected
+* situation.
+*/
+   if (!valid && *bt_addr)
+   return -EINVAL;
+   if (!valid)
+   return -ENOENT;
+
+   return 0;
+
+out:
+   pagefault_enable();
+   return -EFAULT;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static int __must_check zap_bt_entries(struct mm_struct *mm,
+   unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   /*
+* The table entry comes from userspace and could be
+* pointing anywhere, so make sure it is at least
+* pointing to valid memory.
+*/
+   if (!vma || !(vma->vm_flags & VM_MPX) ||
+   vma->vm_start > bt_addr ||
+   vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   zap_page_range(vma, start, end - start, NULL);
+   r

[PATCH v8 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-09-11 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch doesn't use the generic decoder; it implements a limited,
special-purpose decoder for MPX instructions, simply because the
generic decoder is very heavyweight, not just in terms of performance
but in terms of interface -- because it has to be.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 88d660f..7ef6e39 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -2,6 +2,275 @@
 #include 
 #include 
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address being referenced by the instruction.
+ * For rm==3, return the content of the rm register.
+ * For rm!=3, calculate the address using SIB and Disp.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   

[PATCH v7 02/10] x86, mpx: add MPX specific mmap interface

2014-07-20 Thread Qiaowei Ren
This patch adds one MPX specific mmap interface, which only handles
mpx related maps, including bounds table and bounds directory.

In order to track MPX-specific memory usage, this interface sets the
new vm_flag VM_MPX in the vm_area_struct when creating a bounds table
or bounds directory.

These bounds tables can take huge amounts of memory.  In the
worst-case scenario, the tables can be 4x the size of the data
structure being tracked. IOW, a 1-page structure can require 4
bounds-table pages.

My expectation is that folks using MPX are going to be keen on
figuring out how much memory is being dedicated to it. With this
feature, plus some grepping in /proc/$pid/smaps one could take a
pretty good stab at it.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 ++
 arch/x86/include/asm/mpx.h |   38 +
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   79 
 4 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a8f749e..020db35 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -238,6 +238,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..e1b28e6
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,79 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static const char *mpx_mapping_name(struct vm_area_struct *vma)
+{
+   return "[mpx]";
+}
+
+static struct vm_operations_struct mpx_vma_ops = {
+   .name = mpx_mapping_name,
+};
+
+/*
+ * This is really a simplified "vm_mmap". It only handles
+ * MPX-related maps, including the bounds table and bounds directory.
+ *
+ * Here we can stick the new vm_flag VM_MPX in the vm_area_struct
+ * when creating a bounds table or bounds directory, in order to
+ * track MPX-specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+   struct vm_area_struct *vma;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. We verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+   if (IS_ERR_VALUE(ret))
+   goto out;

[PATCH v7 05/10] x86, mpx: extend siginfo structure to include bound violation information

2014-07-20 Thread Qiaowei Ren
This patch adds new fields about bound violation into siginfo
structure. si_lower and si_upper are respectively lower bound
and upper bound when bound violation is caused.

Signed-off-by: Qiaowei Ren 
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index a4077e9..2131636 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2748,6 +2748,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
if (from->si_code == BUS_MCEERR_AR || from->si_code == 
BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-07-20 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal process's address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds-directory over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code which frequently store and retrieve entries from the bounds
tables. Any direct kernel involvement (like a syscall) to access
the tables would destroy performance since these are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   20 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   60 
 arch/x86/kernel/traps.c|   55 +++-
 4 files changed, 135 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
+#define MPX_BNDSTA_TAIL        2
+#define MPX_BNDCFG_TAIL        12
+#define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
+#define MPX_BNDCFG_ADDR_MASK   (~((1UL<<MPX_BNDCFG_TAIL)-1))
+#define MPX_BT_ADDR_MASK       (~((1UL<<MPX_IGN_BITS)-1))
+#include 
+#include 
+
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr))
+   return bt_addr;
+   bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* There is an existing bounds table pointed at by this bounds
+* directory entry, so we need to free the bounds table we
+* just allocated.
+*/
+   if (old_val)
+   goto out;
+
+   pr_debug("Allocate bounds table %lx at entry %p\n",
+   bt_addr, bd_entry);
+   return 0;
+
+out:
+   vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * whose valid bit is not set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * With 32-bit mode, the size of BD is 4MB, and the size of each
+ * bound table is 16KB. With 64-bit mode, the size of BD is 2GB,
+ * and the size of each bound table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry < bd_base) ||
+   (bd_entry >= bd_base + MPX_BD_SIZE_BYTES))
+   return -EINVAL;
+
+   return allocate_bt((long __user *)bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 0d0e922..396a88b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -228,7 +229,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long 
error_code) \
 
 DO_ERROR(X86_TRAP_DE, SIGFPE,  "divide error", divide_error)
 DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow)
-DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds",   bounds)
 DO_ERROR(X86_TRAP_UD, SIGILL,  "invalid opcode",   invalid_op)
 DO_ERROR(X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment 
overrun",coprocessor_segment_overrun)
 DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS",  invalid_TSS)
@@ -278,6 +278,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, 
long error_code)
 }
 #endif
 
+dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
+{
+   enum ctx_state prev_state;
+   unsigned long status;
+   struct xsave_struct *xsave_buf;
+   struct 

[PATCH v7 00/10] Intel MPX support

2014-07-20 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. The MPX architecture is designed to allow a machine
to run both MPX-enabled software and legacy software that is MPX-unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
binutils, the compiler, and system library support.

New GCC option -fmpx is introduced to utilize MPX instructions.
Currently, GCC compiler sources with MPX support are available in a
separate branch of the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written in assembler, and
compile Glibc with the MPX-enabled GCC compiler. Currently, the
MPX-enabled Glibc source can be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates, but some runtime code, responsible for configuring and
enabling MPX, is needed in order to make use of it. For most
applications this runtime support will be available by linking to a
library supplied by the compiler, or it may come directly from the OS
once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main
responsibilities: providing handlers for bounds faults (#BR), and
managing bounds memory.

The high-level areas modified in the patchset are as follows:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with the MPX ISA is available, but it is always
possible to use the SDE (Intel(R) Software Development Emulator)
instead, which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

In addition, this patchset has been tested on an internal Intel
hardware platform for MPX testing.

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when
decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors in the documentation, and document
the extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook the unmap() path to clean up unused bounds tables, and use
a new prctl() command to register the bounds directory address in
struct mm_struct, to check whether a process is MPX-enabled
during unmap().
  * in order to track MPX memory usage precisely, add an MPX-specific
mmap interface and a VM_MPX flag to check whether a VMA
is an MPX bounds table.
  * add macro cpu_has_mpx to do performance optimization.
  * sync struct siginfo for mips with the general version to avoid
a build issue.

Changes since v6:
  * because arch_vma_name is removed, this patchset has to set
MPX-specific ->vm_ops to do the same thing.
  * fix warnings for 32 bit arch.
  * add more description into these patches.

Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt  |  127 +++
 arch/mips/include/uapi/asm/siginfo.h |4 +
 arch/x86/Kconfig |4 +
 arch/x86/include/asm/cpufeature.h|6 +
 arch/x86/include/asm/mmu_context.h   |   16 ++
 arch/x86/include/asm/mpx.h   |   91 
 arch/x86/include/asm/processor.h |   18 ++
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/mpx.c 

[PATCH v7 09/10] x86, mpx: cleanup unused bound tables

2014-07-20 Thread Qiaowei Ren
Since the kernel allocated those tables on-demand without userspace
knowledge, it is also responsible for freeing them when the associated
mappings go away.

The solution here is to hook do_munmap() and check whether the process
is MPX-enabled. If so, the bounds tables covered by the virtual address
region being unmapped are freed as well.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  181 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 214 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index be12c53..af70d4f 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifndef CONFIG_PARAVIRT
 #include 
 
@@ -96,4 +97,19 @@ do { \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from an MPX-enabled application.
+* If so, release the bounds tables related to this vma.
+*/
+   if (mm->bd_addr && !(vma->vm_flags & VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index e1b28e6..d29ec9c 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -2,6 +2,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static const char *mpx_mapping_name(struct vm_area_struct *vma)
@@ -77,3 +78,183 @@ out:
up_write(&mm->mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of bounds tables pointed by specific bounds
+ * directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr,
+   unsigned int *valid)
+{
+   if (get_user(*bt_addr, bd_entry))
+   return -EFAULT;
+
+   *valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr &= MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero while its valid
+* bit is zero, a SIGSEGV will be raised for this unexpected
+* situation.
+*/
+   if (!(*valid) && *bt_addr)
+   force_sig(SIGSEGV, current);
+
+   return 0;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static void zap_bt_entries(struct mm_struct *mm, unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   if (!vma || vma->vm_start > bt_addr ||
+   vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
+   return;
+
+   zap_page_range(vma, start, end, NULL);
+}
+
+static void unmap_single_bt(struct mm_struct *mm, long __user *bd_entry,
+   unsigned long bt_addr)
+{
+   if (user_atomic_cmpxchg_inatomic(&bt_addr, bd_entry,
+   bt_addr | MPX_BD_ENTRY_VALID_FLAG, 0))
+   return;
+
+   /*
+* to avoid recursion, do_munmap() will check whether it comes
+* from one bounds table through VM_MPX flag.
+*/
+   do_munmap(mm, bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+}
+
+/*
+ * If the bounds table pointed by bounds directory 'bd_entry' is
+ * not shared, unmap this whole bounds table. Otherwise, only free
+ * those backing physical pages of bo

[PATCH v7 03/10] x86, mpx: add macro cpu_has_mpx

2014-07-20 Thread Qiaowei Ren
As a performance optimization, this patch adds the macro cpu_has_mpx,
which directly returns 0 when MPX is not supported by the kernel.

The community gave a lot of comments on the cpu_has_mpx macro in the
previous version. Dave will introduce a patch set about disabled
features to address it later.

In this code:
if (cpu_has_mpx)
do_some_mpx_thing();

The patch series from Dave will introduce a new macro cpu_feature_enabled()
(if merged after this patchset) to replace cpu_has_mpx.
if (cpu_feature_enabled(X86_FEATURE_MPX))
do_some_mpx_thing();

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index e265ff9..f302d08 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -339,6 +339,12 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef  cpu_has_vme
-- 
1.7.1



[PATCH v7 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-07-20 Thread Qiaowei Ren
This patch sets bound violation fields of siginfo struct in #BR
exception handler by decoding the user instruction and constructing
the faulting pointer.

This patch doesn't use the generic decoder; it implements a limited,
special-purpose decoder for MPX instructions, simply because the
generic decoder is very heavyweight, not just in terms of performance
but in terms of interface -- because it has to be.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  299 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 328 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index f02dcea..c1957a8 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -2,6 +2,275 @@
 #include 
 #include 
 
+enum reg_type {
+   REG_TYPE_RM = 0,
+   REG_TYPE_INDEX,
+   REG_TYPE_BASE,
+};
+
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+enum reg_type type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address referenced by the instruction:
+ * for rm==3, return the content of the rm register;
+ * for rm!=3, calculate the address using the SIB byte and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   

[PATCH v7 10/10] x86, mpx: add documentation on Intel MPX

2014-07-20 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..1af9809
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into the Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references whose compile-time intentions
+are usurped at runtime due to buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+
+
+Handling #BR faults caused by MPX
+-
+
+When MPX is enabled, there are two new situations that can generate
+#BR faults:
+  * a bounds violation caused by an MPX instruction.
+  * a new bounds table (BT) needs to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-
+
+If a #BR is generated due to a bounds violation caused by an MPX
+instruction, we need to decode that instruction to get the violation
+address and store it in the extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field holds the violation address, and the new '_addr_bnd'
+field holds the lower/upper bounds in effect when the #BR was raised.
+
+Glibc will also be updated to support this new siginfo, so users
+can retrieve the violation address and bounds when a violation occurs.
+
+Freeing unused bounds tables
+
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+The solution is to hook do_munmap() and check whether the process is
+MPX enabled. If it is, any bounds tables covering the virtual address
+region being unmapped are freed as well.
+
+Adding new prctl commands
+-
+
+The runtime library in userspace is responsible for allocating the
+bounds directory, so the kernel has to use the XSAVE instruction to
+read the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. As a performance
+optimization, a new prctl() command is added to fetch the base of
+the bounds directory once and cache it for future use.
+
+Two new prctl commands are added to register and unregister MPX related
+resource.
+
+155#define PR_MPX_REGISTER 43
+156#define PR_MPX_UNREGISTER   44
+
+The base of the bounds directory is stored in mm_struct during
+PR_MPX_REGISTER command execution. This member can then be used to
+check whether an application is MPX enabled.
+
+
+3. Tips
+===
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them from userspace; nor is it necessary for them to do
+so.
+
+When a #BR fault is produced due to an invalid entry, a bounds table
+is created in the kernel on demand, and the kernel does not forward
+this fault to userspace. So userspace never receives a #BR fault for
+an invalid entry, and there is no need for users to create bounds
+tables themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set v

[PATCH v7 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-07-20 Thread Qiaowei Ren
An MPX-enabled application will possibly create a lot of bounds tables
in its process address space to save bounds information. These tables
can take up huge swaths of memory (as much as 80% of the memory on
the system) even if we clean them up aggressively. Being this huge,
we need a way to track their memory use. If we want to track them,
we essentially have two options:

1. walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them
2. Find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag.  We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.

Signed-off-by: Qiaowei Ren 
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |2 ++
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cfa63ee..b2bc755 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -549,6 +549,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e03dd29..44c75d7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,8 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+/* MPX specific bounds table or bounds directory (x86) */
+#define VM_MPX 0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v7 06/10] mips: sync struct siginfo with general version

2014-07-20 Thread Qiaowei Ren
Due to the new bound violation fields added to struct siginfo, this
patch syncs the MIPS version with the generic one to avoid a build issue.

Signed-off-by: Qiaowei Ren 
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1



[PATCH v7 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-07-20 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister MPX
related resources on the x86 platform.

The base of the bounds directory is stored in mm_struct during
PR_MPX_REGISTER command execution. This member can be used to
check whether an application is MPX enabled.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   56 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..6e0966e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -952,6 +952,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index c1957a8..6b7e526 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,62 @@
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(unsigned long)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* The runtime library in userspace is responsible for allocating
+* the bounds directory. It then saves the base of the bounds
+* directory into the XSAVE/XRSTOR save area and enables MPX
+* through the XRSTOR instruction.
+*
+* fpu_xsave() is expected to be very expensive. As a performance
+* optimization, we read the base of the bounds directory here
+* and cache it in mm_struct for future use.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   pr_debug("MPX BD base address %p\n", mm->bd_addr);
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 enum reg_type {
REG_TYPE_RM = 0,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 96c5750..131b5b3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER43
+#define PR_MPX_UNREGISTER  44
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kern

[PATCH v6 00/10] Intel MPX support

2014-06-18 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references whose compile-time intentions are usurped at runtime due
to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms for legacy
software components. The MPX architecture is designed to allow a
machine to run both MPX-enabled software and legacy software that is
MPX unaware. In such a case, the legacy software does not benefit from
MPX, but it also does not experience any change in functionality or
reduction in performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To take advantage of MPX, changes are required in the OS kernel,
binutils, the compiler, and system library support.

A new GCC option, -fmpx, is introduced to utilize MPX instructions.
Currently, GCC compiler sources with MPX support are available in a
separate branch of the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written in assembler, and
compile Glibc with the MPX-enabled GCC compiler. Currently, MPX-enabled
Glibc sources can be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates but there is some runtime code, which is responsible for
configuring and enabling MPX, needed in order to make use of MPX.
For most applications this runtime support will be available by linking
to a library supplied by the compiler or possibly it will come directly
from the OS once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main responsibilities:
providing handlers for bounds faults (#BR), and managing bounds memory.

The high-level areas modified in the patchset are as follow:
1) struct siginfo is extended to include bound violation information.
2) two prctl() commands are added to do performance optimization.

Currently no hardware with MPX ISA is available but it is always
possible to use SDE (Intel(R) software Development Emulator) instead,
which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator

Future TODO items:
1) support 32-bit binaries on 64-bit kernels.

Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when
decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors in the documentation, and document
the extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Changes since v5:
  * hook the unmap() path to clean up unused bounds tables, and use a
new prctl() command to register the bounds directory address in
struct mm_struct, to check whether a process is MPX enabled
during unmap().
  * in order to track MPX memory usage precisely, add an MPX-specific
mmap interface and a VM_MPX flag to check whether a VMA
is an MPX bounds table.
  * add the macro cpu_has_mpx as a performance optimization.
  * sync struct siginfo for MIPS with the generic version to avoid a
build issue.


Qiaowei Ren (10):
  x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific
  x86, mpx: add MPX specific mmap interface
  x86, mpx: add macro cpu_has_mpx
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information
  mips: sync struct siginfo with general version
  x86, mpx: decode MPX instruction to get bound violation information
  x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER
  x86, mpx: cleanup unused bound tables
  x86, mpx: add documentation on Intel MPX

 Documentation/x86/intel_mpx.txt  |  127 +++
 arch/mips/include/uapi/asm/siginfo.h |4 +
 arch/x86/Kconfig |4 +
 arch/x86/include/asm/cpufeature.h|6 +
 arch/x86/include/asm/mmu_context.h   |   16 ++
 arch/x86/include/asm/mpx.h   |   91 
 arch/x86/include/asm/processor.h |   18 ++
 arch/x86/kernel/Makefile |1 +
 arch/x86/kernel/mpx.c|  413 ++
 arch/x86/kernel/traps.c  |   62 +-
 arch/x86/mm/Makefile |2 +
 arch/x86/mm/init_64.c|2 +
 arch/x86/mm/mpx.c|  247 
 fs/proc/task_mmu.c   |1 +
 include/a

[PATCH v6 03/10] x86, mpx: add macro cpu_has_mpx

2014-06-18 Thread Qiaowei Ren
As a performance optimization, this patch adds the macro
cpu_has_mpx, which directly evaluates to 0 when MPX is not supported
by the kernel.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/cpufeature.h |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index e265ff9..f302d08 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -339,6 +339,12 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
 
+#ifdef CONFIG_X86_INTEL_MPX
+#define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX)
+#else
+#define cpu_has_mpx 0
+#endif /* CONFIG_X86_INTEL_MPX */
+
 #ifdef CONFIG_X86_64
 
 #undef  cpu_has_vme
-- 
1.7.1



[PATCH v6 01/10] x86, mpx: introduce VM_MPX to indicate that a VMA is MPX specific

2014-06-18 Thread Qiaowei Ren
An MPX-enabled application will possibly create a lot of bounds tables
in its process address space to save bounds information. These tables
can take up huge swaths of memory (as much as 80% of the memory on
the system) even if we clean them up aggressively. Being this huge,
we need a way to track their memory use. If we want to track them,
we essentially have two options:

1. walk the multi-GB (in virtual space) bounds directory to locate
   all the VMAs and walk them
2. Find a way to distinguish MPX bounds-table VMAs from normal
   anonymous VMAs and use some existing mechanism to walk them

We expect (1) will be prohibitively expensive. For (2), we only
need a single bit, and we've chosen to use a VM_ flag.  We understand
that they are scarce and are open to other options.

There is one potential hybrid approach: check the bounds directory
entry for any anonymous VMA that could possibly contain a bounds table.
This is less expensive than (1), but still requires reading a pointer
out of userspace for every VMA that we iterate over.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/mm/init_64.c |2 ++
 fs/proc/task_mmu.c|1 +
 include/linux/mm.h|2 ++
 3 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index f35c66c..2d41679 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1223,6 +1223,8 @@ int in_gate_area_no_mm(unsigned long addr)
 
 const char *arch_vma_name(struct vm_area_struct *vma)
 {
+   if (vma->vm_flags & VM_MPX)
+   return "[mpx]";
if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)
return "[vdso]";
if (vma == &gate_vma)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 442177b..09266bd 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -543,6 +543,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_GROWSDOWN)]   = "gd",
[ilog2(VM_PFNMAP)]  = "pf",
[ilog2(VM_DENYWRITE)]   = "dw",
+   [ilog2(VM_MPX)] = "mp",
[ilog2(VM_LOCKED)]  = "lo",
[ilog2(VM_IO)]  = "io",
[ilog2(VM_SEQ_READ)]= "sr",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d677706..029c716 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -127,6 +127,8 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB 0x0040  /* Huge TLB Page VM */
 #define VM_NONLINEAR   0x0080  /* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1  0x0100  /* Architecture-specific flag */
+/* MPX specific bounds table or bounds directory (x86) */
+#define VM_MPX 0x0200
 #define VM_DONTDUMP0x0400  /* Do not include in the core dump */
 
 #ifdef CONFIG_MEM_SOFT_DIRTY
-- 
1.7.1



[PATCH v6 05/10] x86, mpx: extend siginfo structure to include bound violation information

2014-06-18 Thread Qiaowei Ren
This patch adds new bound violation fields to the siginfo
structure. si_lower and si_upper are, respectively, the lower and
upper bounds in effect when a bound violation occurs.

Signed-off-by: Qiaowei Ren 
---
 include/uapi/asm-generic/siginfo.h |9 -
 kernel/signal.c|4 
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h 
b/include/uapi/asm-generic/siginfo.h
index ba5be7f..1e35520 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL */
@@ -131,6 +135,8 @@ typedef struct siginfo {
 #define si_trapno  _sifields._sigfault._trapno
 #endif
 #define si_addr_lsb_sifields._sigfault._addr_lsb
+#define si_lower   _sifields._sigfault._addr_bnd._lower
+#define si_upper   _sifields._sigfault._addr_bnd._upper
 #define si_band_sifields._sigpoll._band
 #define si_fd  _sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -199,7 +205,8 @@ typedef struct siginfo {
  */
 #define SEGV_MAPERR(__SI_FAULT|1)  /* address not mapped to object */
 #define SEGV_ACCERR(__SI_FAULT|2)  /* invalid permissions for mapped 
object */
-#define NSIGSEGV   2
+#define SEGV_BNDERR(__SI_FAULT|3)  /* failed address bound checks */
+#define NSIGSEGV   3
 
 /*
  * SIGBUS si_codes
diff --git a/kernel/signal.c b/kernel/signal.c
index 6ea13c0..0fcf749 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2773,6 +2773,10 @@ int copy_siginfo_to_user(siginfo_t __user *to, const 
siginfo_t *from)
if (from->si_code == BUS_MCEERR_AR || from->si_code == 
BUS_MCEERR_AO)
err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
 #endif
+#ifdef SEGV_BNDERR
+   err |= __put_user(from->si_lower, &to->si_lower);
+   err |= __put_user(from->si_upper, &to->si_upper);
+#endif
break;
case __SI_CHLD:
err |= __put_user(from->si_pid, &to->si_pid);
-- 
1.7.1



[PATCH v6 02/10] x86, mpx: add MPX specific mmap interface

2014-06-18 Thread Qiaowei Ren
This patch adds an MPX-specific mmap interface, which only handles
MPX related maps: bounds tables and the bounds directory.

In order to track MPX-specific memory usage, this interface sets the
new vm_flag VM_MPX in the vm_area_struct when creating a
bounds table or bounds directory.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig   |4 +++
 arch/x86/include/asm/mpx.h |   38 
 arch/x86/mm/Makefile   |2 +
 arch/x86/mm/mpx.c  |   58 
 4 files changed, 102 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/mm/mpx.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 25d2c6f..0194790 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -237,6 +237,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config X86_INTEL_MPX
+   def_bool y
+   depends on CPU_SUP_INTEL
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..5725ac4
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,38 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+/* upper 28 bits [47:20] of the virtual address in 64-bit used to
+ * index into bounds directory (BD).
+ */
+#define MPX_BD_ENTRY_OFFSET28
+#define MPX_BD_ENTRY_SHIFT 3
+/* bits [19:3] of the virtual address in 64-bit used to index into
+ * bounds table (BT).
+ */
+#define MPX_BT_ENTRY_OFFSET17
+#define MPX_BT_ENTRY_SHIFT 5
+#define MPX_IGN_BITS   3
+
+#else
+
+#define MPX_BD_ENTRY_OFFSET20
+#define MPX_BD_ENTRY_SHIFT 2
+#define MPX_BT_ENTRY_OFFSET10
+#define MPX_BT_ENTRY_SHIFT 4
+#define MPX_IGN_BITS   2
+
+#endif
+
+#define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
+#define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
+
+#define MPX_BNDSTA_ERROR_CODE  0x3
+
+unsigned long mpx_mmap(unsigned long len);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 6a19ad9..ecfdc46 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -30,3 +30,5 @@ obj-$(CONFIG_ACPI_NUMA)   += srat.o
 obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
 
 obj-$(CONFIG_MEMTEST)  += memtest.o
+
+obj-$(CONFIG_X86_INTEL_MPX)+= mpx.o
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
new file mode 100644
index 000..546c5d1
--- /dev/null
+++ b/arch/x86/mm/mpx.c
@@ -0,0 +1,58 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * This is really a simplified "vm_mmap". It only handles MPX
+ * related maps: bounds tables and the bounds directory.
+ *
+ * Here we stick the new vm_flag VM_MPX in the vm_area_struct
+ * when creating a bounds table or bounds directory, in order to
+ * track MPX-specific memory.
+ */
+unsigned long mpx_mmap(unsigned long len)
+{
+   unsigned long ret;
+   unsigned long addr, pgoff;
+   struct mm_struct *mm = current->mm;
+   vm_flags_t vm_flags;
+
+   /* Only bounds table and bounds directory can be allocated here */
+   if (len != MPX_BD_SIZE_BYTES && len != MPX_BT_SIZE_BYTES)
+   return -EINVAL;
+
+   down_write(&mm->mmap_sem);
+
+   /* Too many mappings? */
+   if (mm->map_count > sysctl_max_map_count) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   /* Obtain the address to map to. We verify (or select) it and ensure
+* that it represents a valid section of the address space.
+*/
+   addr = get_unmapped_area(NULL, 0, len, 0, MAP_ANONYMOUS | MAP_PRIVATE);
+   if (addr & ~PAGE_MASK) {
+   ret = addr;
+   goto out;
+   }
+
+   vm_flags = VM_READ | VM_WRITE | VM_MPX |
+   mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
+
+   /* Make bounds tables and the bounds directory unlocked. */
+   if (vm_flags & VM_LOCKED)
+   vm_flags &= ~VM_LOCKED;
+
+   /* Set pgoff according to addr for anon_vma */
+   pgoff = addr >> PAGE_SHIFT;
+
+   ret = mmap_region(NULL, addr, len, vm_flags, pgoff);
+
+out:
+   up_write(&mm->mmap_sem);
+   return ret;
+}
-- 
1.7.1



[PATCH v6 08/10] x86, mpx: add prctl commands PR_MPX_REGISTER, PR_MPX_UNREGISTER

2014-06-18 Thread Qiaowei Ren
This patch adds the PR_MPX_REGISTER and PR_MPX_UNREGISTER prctl()
commands. These commands can be used to register and unregister MPX
related resources on the x86 platform.

The base of the bounds directory is stored in mm_struct during
PR_MPX_REGISTER command execution. This member can be used to
check whether an application is MPX enabled.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h   |1 +
 arch/x86/include/asm/processor.h |   18 
 arch/x86/kernel/mpx.c|   56 ++
 include/linux/mm_types.h |3 ++
 include/uapi/linux/prctl.h   |6 
 kernel/sys.c |   12 
 6 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 780af63..6cb0853 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -43,6 +43,7 @@
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
 #define MPX_BNDSTA_ERROR_CODE  0x3
+#define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
 struct mpx_insn {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..6e0966e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -952,6 +952,24 @@ extern void start_thread(struct pt_regs *regs, unsigned 
long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+/* Register/unregister a process' MPX related resource */
+#define MPX_REGISTER(tsk)  mpx_register((tsk))
+#define MPX_UNREGISTER(tsk)mpx_unregister((tsk))
+
+#ifdef CONFIG_X86_INTEL_MPX
+extern int mpx_register(struct task_struct *tsk);
+extern int mpx_unregister(struct task_struct *tsk);
+#else
+static inline int mpx_register(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+static inline int mpx_unregister(struct task_struct *tsk)
+{
+   return -EINVAL;
+}
+#endif /* CONFIG_X86_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 650b282..d8a2a09 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,6 +1,62 @@
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
+
+/*
+ * This should only be called when cpuid has been checked
+ * and we are sure that MPX is available.
+ */
+static __user void *task_get_bounds_dir(struct task_struct *tsk)
+{
+   struct xsave_struct *xsave_buf;
+
+   fpu_xsave(&tsk->thread.fpu);
+   xsave_buf = &(tsk->thread.fpu.state->xsave);
+   if (!(xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ENABLE_FLAG))
+   return NULL;
+
+   return (void __user *)(xsave_buf->bndcsr.cfg_reg_u &
+   MPX_BNDCFG_ADDR_MASK);
+}
+
+int mpx_register(struct task_struct *tsk)
+{
+   struct mm_struct *mm = tsk->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   /*
+* The userspace runtime is responsible for allocating the bounds
+* directory. It saves the base of the bounds directory into the
+* XSAVE/XRSTOR save area and enables MPX through the XRSTOR
+* instruction.
+*
+* fpu_xsave() is expected to be very expensive, so as a
+* performance optimization we read the base of the bounds
+* directory here and cache it in mm_struct for future use.
+*/
+   mm->bd_addr = task_get_bounds_dir(tsk);
+   if (!mm->bd_addr)
+   return -EINVAL;
+
+   pr_debug("MPX BD base address %p\n", mm->bd_addr);
+   return 0;
+}
+
+int mpx_unregister(struct task_struct *tsk)
+{
+   struct mm_struct *mm = current->mm;
+
+   if (!cpu_has_mpx)
+   return -EINVAL;
+
+   mm->bd_addr = NULL;
+   return 0;
+}
 
 typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
 static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8967e20..54b8011 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -454,6 +454,9 @@ struct mm_struct {
bool tlb_flush_pending;
 #endif
struct uprobes_state uprobes_state;
+#ifdef CONFIG_X86_INTEL_MPX
+   void __user *bd_addr;   /* address of the bounds directory */
+#endif
 };
 
 static inline void mm_init_cpumask(struct mm_struct *mm)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 58afc04..ce86fa9 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -152,4 +152,10 @@
 #define PR_SET_THP_DISABLE 41
 #define PR_GET_THP_DISABLE 42
 
+/*
+ * Register/unregister MPX related resource.
+ */
+#define PR_MPX_REGISTER43
+#define

[PATCH v6 07/10] x86, mpx: decode MPX instruction to get bound violation information

2014-06-18 Thread Qiaowei Ren
This patch sets the bound-violation fields of the siginfo struct in
the #BR exception handler by decoding the user instruction and
reconstructing the faulting pointer.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   23 
 arch/x86/kernel/mpx.c  |  294 
 arch/x86/kernel/traps.c|6 +
 3 files changed, 323 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index b7598ac..780af63 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -44,15 +45,37 @@
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BD_ENTRY_VALID_FLAG0x1
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 unsigned long mpx_mmap(unsigned long len);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 #else
 static inline int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
 {
return -EINVAL;
 }
+static inline void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf)
+{
+}
 #endif /* CONFIG_X86_INTEL_MPX */
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 4230c7b..650b282 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -2,6 +2,270 @@
 #include 
 #include 
 
+typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+reg_type_t type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address referenced by the instruction:
+ * for rm=3, return the content of the rm register;
+ * for rm!=3, calculate the address using the SIB byte and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   }
+   addr += insn->displacement.value;
+   }
+
+   return addr;
+}
+
+/* Verify next sizeof(t) bytes can be on the same instruction */
+#define validate_next(t, insn, n)  \
+   ((insn)-&

[PATCH v6 09/10] x86, mpx: cleanup unused bound tables

2014-06-18 Thread Qiaowei Ren
When a user memory region is unmapped, the associated bounds tables
become unused and need to be released as well. This patch cleans up
these unused bounds tables by hooking the unmap path.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mmu_context.h |   16 +++
 arch/x86/include/asm/mpx.h |9 ++
 arch/x86/mm/mpx.c  |  189 
 include/asm-generic/mmu_context.h  |6 +
 mm/mmap.c  |2 +
 5 files changed, 222 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index be12c53..af70d4f 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifndef CONFIG_PARAVIRT
 #include 
 
@@ -96,4 +97,19 @@ do { \
 } while (0)
 #endif
 
+static inline void arch_unmap(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long start, unsigned long end)
+{
+#ifdef CONFIG_X86_INTEL_MPX
+   /*
+* Check whether this vma comes from an MPX-enabled application.
+* If so, release the bounds tables related to this vma.
+*/
+   if (mm->bd_addr && !(vma->vm_flags & VM_MPX))
+   mpx_unmap(mm, start, end);
+
+#endif
+}
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 6cb0853..e848a74 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -42,6 +42,13 @@
 #define MPX_BD_SIZE_BYTES (1UL<<(MPX_BD_ENTRY_OFFSET+MPX_BD_ENTRY_SHIFT))
 #define MPX_BT_SIZE_BYTES (1UL<<(MPX_BT_ENTRY_OFFSET+MPX_BT_ENTRY_SHIFT))
 
+#define MPX_BD_ENTRY_MASK  ((1<<MPX_BD_ENTRY_OFFSET)-1)
+#define MPX_BT_ENTRY_MASK  ((1<<MPX_BT_ENTRY_OFFSET)-1)
+#define MPX_GET_BD_ENTRY_OFFSET(addr)  ((((addr)>>(MPX_BT_ENTRY_OFFSET+ \
+   MPX_IGN_BITS)) & MPX_BD_ENTRY_MASK) << MPX_BD_ENTRY_SHIFT)
+#define MPX_GET_BT_ENTRY_OFFSET(addr)  ((((addr)>>MPX_IGN_BITS) & \
+   MPX_BT_ENTRY_MASK) << MPX_BT_ENTRY_SHIFT)
+
 #define MPX_BNDSTA_ERROR_CODE  0x3
 #define MPX_BNDCFG_ENABLE_FLAG 0x1
 #define MPX_BD_ENTRY_VALID_FLAG0x1
@@ -63,6 +70,8 @@ struct mpx_insn {
 #define MAX_MPX_INSN_SIZE  15
 
 unsigned long mpx_mmap(unsigned long len);
+void mpx_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end);
 
 #ifdef CONFIG_X86_INTEL_MPX
 int do_mpx_bt_fault(struct xsave_struct *xsave_buf);
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 546c5d1..fd05cd4 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -2,6 +2,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -56,3 +57,191 @@ out:
up_write(&mm->mmap_sem);
return ret;
 }
+
+/*
+ * Get the base of the bounds table pointed to by a specific
+ * bounds directory entry.
+ */
+static int get_bt_addr(long __user *bd_entry, unsigned long *bt_addr,
+   unsigned int *valid)
+{
+   if (get_user(*bt_addr, bd_entry))
+   return -EFAULT;
+
+   *valid = *bt_addr & MPX_BD_ENTRY_VALID_FLAG;
+   *bt_addr &= MPX_BT_ADDR_MASK;
+
+   /*
+* If this bounds directory entry is nonzero while its valid
+* bit is clear, something is wrong: produce a SIGSEGV for
+* this unexpected situation.
+*/
+   if (!(*valid) && *bt_addr)
+   force_sig(SIGSEGV, current);
+
+   pr_debug("get_bt: BD Entry (%p) - Table (%lx,%d)\n",
+   bd_entry, *bt_addr, *valid);
+   return 0;
+}
+
+/*
+ * Free the backing physical pages of bounds table 'bt_addr'.
+ * Assume start...end is within that bounds table.
+ */
+static void zap_bt_entries(struct mm_struct *mm, unsigned long bt_addr,
+   unsigned long start, unsigned long end)
+{
+   struct vm_area_struct *vma;
+
+   /* Find the vma which overlaps this bounds table */
+   vma = find_vma(mm, bt_addr);
+   if (!vma || vma->vm_start > bt_addr ||
+   vma->vm_end < bt_addr+MPX_BT_SIZE_BYTES)
+   return;
+
+   zap_page_range(vma, start, end, NULL);
+   pr_debug("Bound table de-allocation %lx (%lx, %lx)\n",
+   bt_addr, start, end);
+}
+
+static void unmap_single_bt(struct mm_struct *mm, long __user *bd_entry,
+   unsigned long bt_addr)
+{
+   if (user_atomic_cmpxchg_inatomic(&bt_addr, bd_entry,
+   bt_addr | MPX_BD_ENTRY_VALID_FLAG, 0))
+   return;
+
+   pr_debug("Bound table de-allocation %lx at entry addr %p\n",
+   bt_addr, bd_entry);
+   /*
+* To avoid recursion, do_munmap() uses the VM_MPX flag to check
+* whether the region being unmapped is itself a bounds table.
+*/
+   do_munmap(mm, bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+}
+
+/*
+ * If the bounds table pointed by bounds directory 'bd_entr

[PATCH v6 10/10] x86, mpx: add documentation on Intel MPX

2014-06-18 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  127 +++
 1 files changed, 127 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..1af9809
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,127 @@
+1. Intel(R) MPX Overview
+========================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+For more information, please refer to Intel(R) Architecture
+Instruction Set Extensions Programming Reference, Chapter 9:
+Intel(R) Memory Protection Extensions.
+
+Note: Currently no hardware with MPX ISA is available but it is always
+possible to use SDE (Intel(R) Software Development Emulator) instead,
+which can be downloaded from
+http://software.intel.com/en-us/articles/intel-software-development-emulator
+
+
+2. How does MPX kernel code work
+================================
+
+Handling #BR faults caused by MPX
+---------------------------------
+
+When MPX is enabled, there are 2 new situations that can generate
+#BR faults.
+  * bounds violation caused by MPX instructions.
+  * new bounds tables (BT) need to be allocated to save bounds.
+
+We hook #BR handler to handle these two new situations.
+
+Decoding MPX instructions
+-------------------------
+
+If a #BR is generated due to a bounds violation caused by an MPX
+instruction, we need to decode that instruction to get the violation
+address and store it in the extended struct siginfo.
+
+The _sigfault field of struct siginfo is extended as follows:
+
+87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+88 struct {
+89 void __user *_addr; /* faulting insn/memory ref. */
+90 #ifdef __ARCH_SI_TRAPNO
+91 int _trapno;/* TRAP # which caused the signal */
+92 #endif
+93 short _addr_lsb; /* LSB of the reported address */
+94 struct {
+95 void __user *_lower;
+96 void __user *_upper;
+97 } _addr_bnd;
+98 } _sigfault;
+
+The '_addr' field holds the violation address, and the new '_addr_bnd'
+field holds the lower/upper bounds in effect when the #BR was raised.
+
+Glibc will also be updated to support this new siginfo, so userspace
+can retrieve the violation address and bounds when a bounds violation
+occurs.
+
+Freeing unused bounds tables
+----------------------------
+
+When a BNDSTX instruction attempts to save bounds to a bounds directory
+entry marked as invalid, a #BR is generated. This is an indication that
+no bounds table exists for this entry. In this case the fault handler
+will allocate a new bounds table on demand.
+
+Since the kernel allocated those tables on-demand without userspace
+knowledge, it is also responsible for freeing them when the associated
+mappings go away.
+
+The solution is to hook do_munmap() and check whether the process is
+MPX-enabled. If it is, any bounds tables covering the virtual address
+region being unmapped are freed as well.
+
+Adding new prctl commands
+-------------------------
+
+The runtime library in userspace is responsible for allocating the
+bounds directory, so the kernel has to use the XSAVE instruction to
+read the base of the bounds directory from the BNDCFG register.
+
+But XSAVE is expected to be very expensive. As a performance
+optimization, a new prctl command is added that reads the base of the
+bounds directory once and caches it for future use.
+
+Two new prctl commands are added to register and unregister MPX-related
+resources.
+
+155#define PR_MPX_REGISTER 41
+156#define PR_MPX_UNREGISTER   42
+
+The base of the bounds directory is stored in mm_struct during
+execution of the PR_MPX_REGISTER command. This member can then be
+used to check whether an application is MPX-enabled.
+
+
+3. Tips
+=======
+
+1) Users are not allowed to create bounds tables and point the bounds
+directory at them from userspace. In fact, it is not even necessary
+for users to create bounds tables in userspace.
+
+When a #BR fault is produced due to an invalid entry, the bounds table
+is created in the kernel on demand and the kernel does not forward this
+fault to userspace. So userspace never sees a #BR fault for an invalid
+entry, and there is no need for users to create bounds tables themselves.
+
+Certainly users can allocate bounds tables and forcibly point the bounds
+directory at them through XSAVE instruction, and then set v

[PATCH v6 06/10] mips: sync struct siginfo with general version

2014-06-18 Thread Qiaowei Ren
Due to the new bound-violation fields added to struct siginfo,
this patch syncs the MIPS version with the generic one to avoid a
build issue.

Signed-off-by: Qiaowei Ren 
---
 arch/mips/include/uapi/asm/siginfo.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/uapi/asm/siginfo.h 
b/arch/mips/include/uapi/asm/siginfo.h
index e811744..d08f83f 100644
--- a/arch/mips/include/uapi/asm/siginfo.h
+++ b/arch/mips/include/uapi/asm/siginfo.h
@@ -92,6 +92,10 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb;
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
} _sigfault;
 
/* SIGPOLL, SIGXFSZ (To do ...)  */
-- 
1.7.1



[PATCH v6 04/10] x86, mpx: hook #BR exception handler to allocate bound tables

2014-06-18 Thread Qiaowei Ren
This patch handles a #BR exception for non-existent tables by
carving the space out of the normal process's address space
(essentially calling mmap() from inside the kernel) and then
pointing the bounds-directory over to it.

The tables need to be accessed and controlled by userspace
because the compiler generates instructions for MPX-enabled
code that frequently stores and retrieves entries from the bounds
tables. Any direct kernel involvement (like a syscall) in accessing
the tables would destroy performance, since these accesses are so frequent.

The tables are carved out of userspace because we have no better
spot to put them. For each pointer which is being tracked by MPX,
the bounds tables contain 4 longs worth of data, and the tables
are indexed virtually. If we were to preallocate the tables, we
would theoretically need to allocate 4x the virtual space that
we have available for userspace somewhere else. We don't have
that room in the kernel address space.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   20 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   63 
 arch/x86/kernel/traps.c|   56 ++-
 4 files changed, 139 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 5725ac4..b7598ac 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -18,6 +18,8 @@
 #define MPX_BT_ENTRY_SHIFT 5
 #define MPX_IGN_BITS   3
 
+#define MPX_BD_ENTRY_TAIL  3
+
 #else
 
 #define MPX_BD_ENTRY_OFFSET20
@@ -26,13 +28,31 @@
 #define MPX_BT_ENTRY_SHIFT 4
 #define MPX_IGN_BITS   2
 
+#define MPX_BD_ENTRY_TAIL  2
+
 #endif
 
 #define MPX_BNDSTA_TAIL    2
 #define MPX_BNDCFG_TAIL    12
 #define MPX_BNDSTA_ADDR_MASK   (~((1UL<<MPX_BNDSTA_TAIL)-1))
+#include 
+#include 
+
+static int allocate_bt(long __user *bd_entry)
+{
+   unsigned long bt_addr, old_val = 0;
+   int ret = 0;
+
+   bt_addr = mpx_mmap(MPX_BT_SIZE_BYTES);
+   if (IS_ERR((void *)bt_addr)) {
+   pr_err("Bounds table allocation failed at entry addr %p\n",
+   bd_entry);
+   return bt_addr;
+   }
+   bt_addr = (bt_addr & MPX_BT_ADDR_MASK) | MPX_BD_ENTRY_VALID_FLAG;
+
+   ret = user_atomic_cmpxchg_inatomic(&old_val, bd_entry, 0, bt_addr);
+   if (ret)
+   goto out;
+
+   /*
+* There is an existing bounds table pointed to by this bounds
+* directory entry, so we need to free the bounds table we
+* just allocated.
+*/
+   if (old_val)
+   goto out;
+
+   pr_debug("Allocate bounds table %lx at entry %p\n",
+   bt_addr, bd_entry);
+   return 0;
+
+out:
+   vm_munmap(bt_addr & MPX_BT_ADDR_MASK, MPX_BT_SIZE_BYTES);
+   return ret;
+}
+
+/*
+ * When a BNDSTX instruction attempts to save bounds to a BD entry
+ * with the lack of the valid bit being set, a #BR is generated.
+ * This is an indication that no BT exists for this entry. In this
+ * case the fault handler will allocate a new BT.
+ *
+ * In 32-bit mode, the size of the BD is 4MB and the size of each
+ * bounds table is 16KB. In 64-bit mode, the size of the BD is 2GB
+ * and the size of each bounds table is 4MB.
+ */
+int do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry < bd_base) ||
+   (bd_entry >= bd_base + MPX_BD_SIZE_BYTES))
+   return -EINVAL;
+
+   return allocate_bt((long __user *)bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index f73b5d4..35b9b29 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long 
error_code) \
 
 DO_ERROR_INFO(X86_TRAP_DE, SIGFPE,  "divide error",
divide_error,FPE_INTDIV, regs->ip )
 DO_ERROR (X86_TRAP_OF, SIGSEGV, "overflow",
overflow  )
-DO_ERROR (X86_TRAP_BR, SIGSEGV, "bounds",  bounds  
  )
 DO_ERROR_INFO(X86_TRAP_UD, SIGILL,  "invalid opcode",  
invalid_op,  ILL_ILLOPN, regs->ip )
 DO_ERROR (X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment overrun", 
coprocessor_segment_

[PATCH v5 2/3] x86, mpx: hook #BR exception handler to allocate bound tables

2014-02-22 Thread Qiaowei Ren
An access to an invalid bounds directory entry will cause a #BR
exception. This patch hooks the #BR exception handler to allocate
a bounds table and bind it to that bounds directory entry.

This avoids the need to forward the #BR exception to userspace
when the bounds directory has an invalid entry.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   35 +++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   46 
 arch/x86/kernel/traps.c|   56 +++-
 4 files changed, 137 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..d9a61a8
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,35 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+#define MPX_L1_BITS28
+#define MPX_L1_SHIFT   3
+#define MPX_L2_BITS17
+#define MPX_L2_SHIFT   5
+#define MPX_IGN_BITS   3
+#define MPX_L2_NODE_ADDR_MASK  0xfff8UL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xf000UL
+
+#else
+
+#define MPX_L1_BITS20
+#define MPX_L1_SHIFT   2
+#define MPX_L2_BITS10
+#define MPX_L2_SHIFT   4
+#define MPX_IGN_BITS   2
+#define MPX_L2_NODE_ADDR_MASK  0xfffcUL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xf000UL
+
+#endif
+
+bool do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index cb648c8..becb970 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-y  += mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..f1a16c0
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,46 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
+   if (bt_addr == -1) {
+   pr_err("L2 Node Allocation Failed at L1 addr %lx\n",
+   bd_entry);
+   return false;
+   }
+   bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01;
+
+   user_atomic_cmpxchg_inatomic(&old_val,
+   (long __user *)bd_entry, 0, bt_addr);
+   if (old_val)
+   vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size);
+
+   return true;
+}
+
+bool do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+   unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+   return allocate_bt(bd_entry);
+
+   return false;
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 57409f6..b894f09 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long 
error_code) \
 
 DO_ERROR_INFO(X86_TRAP_DE, SIGFPE,  "divide error",
divide_error,FPE_INTDIV, regs->ip )
 DO_ERROR (X86_TRAP_OF, SIGSEGV, "overflow",
overflow  )
-DO_ERROR (X86_TRAP_BR, SIGSEGV, "bounds",  bounds  
  )
 DO_ERROR_INFO(X86_TRAP_UD, SIGILL,  "invalid opcode",  
invalid_op,  ILL_ILLOPN, regs->ip )
 DO_ERROR (X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment overrun", 
coprocessor_segment_overrun   )
 DO_ERROR (X86_TRAP_TS, SIGSEGV, "invalid TSS", 
invalid_TSS   )
@@ -263,6 +263,60 @@ dotraplinkage void do_double_fault(struct pt_regs *reg

[PATCH v5 3/3] x86, mpx: extend siginfo structure to include bound violation information

2014-02-22 Thread Qiaowei Ren
This patch adds new bound-violation fields to the siginfo
structure. si_lower and si_upper are, respectively, the lower
and upper bounds in effect when the violation occurred.

These fields are set in the #BR exception handler by decoding
the user instruction and reconstructing the faulting pointer.
A userspace application can retrieve the violation address and
the lower and upper bounds from this new siginfo structure.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   19 +++
 arch/x86/kernel/mpx.c  |  289 
 arch/x86/kernel/traps.c|6 +
 include/uapi/asm-generic/siginfo.h |9 +-
 kernel/signal.c|4 +
 5 files changed, 326 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index d9a61a8..3296052 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -30,6 +31,24 @@
 
 #endif
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 bool do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index f1a16c0..7c0e36c 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -7,6 +7,270 @@
 #include 
 #include 
 
+typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+reg_type_t type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address referenced by the instruction:
+ * for rm=3, return the content of the rm register;
+ * for rm!=3, calculate the address using the SIB byte and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   }
+   addr += insn->displacement.value;
+   }
+
+   return addr;
+}
+
+/* Verify next sizeof(t) bytes can be on the same instruction */
+#define validate_next(t, insn, n)  \
+   ((insn)->next_byte + sizeof(t) 

[PATCH v5 1/3] x86, mpx: add documentation on Intel MPX

2014-02-22 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  239 +++
 1 files changed, 239 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..9af8636
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,239 @@
+1. Intel(R) MPX Overview
+========================
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+compile-time normal intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. MPX architecture is designed to
+allow a machine (i.e., the processor(s) and the OS software)
+to run both MPX enabled software and legacy software that
+is MPX unaware. In such a case, the legacy software does not
+benefit from MPX, but it also does not experience any change
+in functionality or reduction in performance.
+
+Intel(R) MPX Programming Model
+------------------------------
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+--------------------------------------------
+
+Intel MPX defines two instructions for load/store of the linear
+address of a pointer to a buffer, along with the bounds of the
+buffer, into a paging structure of extended bounds. Specifically,
+when storing extended bounds, the processor will perform address
+translation of the address where the pointer is stored to an
+address in the Bound Table (BT) to determine the store location
+of the extended bounds. Loading of extended bounds performs the
+reverse sequence.
+
+The structure in memory to load/store an extended bound is a
+4-tuple consisting of lower bound, upper bound, pointer value
+and a reserved field. Bound loads and stores access 32-bit or
+64-bit operand size according to the operation mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. The linear address of the bound
+directory is derived from either BNDCFGU or BNDCFGS registers.
+Bounds in memory are stored in Bound Tables (BT) as an extended
+bound, which are accessed via Bound Directory (BD) and address
+translation performed by BNDLDX/BNDSTX instructions.
+
+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.
+
+XSAVE/XRSTOR Support of Intel MPX State
+----------------------------------------
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+(BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 are
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+
+2. How to get the advantage of MPX
+==================================
+
+
+To get the advantage of MPX, changes are required in
+the OS kernel, binutils, the compiler, and the system libraries.
+
+MPX support in the GNU toolchain
+--------------------------------
+
+This section describes changes in GNU Binutils, GCC and Glibc
+to support MPX.
+
+The first step of MPX support is to implement support for the new
+hardware features in binutils and GCC.
+
+The second step is the implementation of an MPX instrumentation
+pass in the GCC compiler, which is responsible for instrumenting
+all memory accesses with pointer checks. Compiler changes for
+runtime bound checks include:
+
+  * Bounds creation for statically alloca

[PATCH v5 0/3] Intel MPX support

2014-02-22 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. The MPX architecture is designed to allow a
machine to run both MPX-enabled software and legacy software that
is MPX-unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
binutils, the compiler, and the system libraries.

A new GCC option, -fmpx, is introduced to utilize MPX instructions.
GCC compiler sources with MPX support are currently available in a
separate branch of the common GCC SVN repository. See the GCC SVN
page (http://gcc.gnu.org/svn.html) for details.

For full protection, we had to add MPX instrumentation to all the
necessary Glibc routines (e.g. memcpy) written in assembler, and
compile Glibc with the MPX-enabled GCC compiler. MPX-enabled Glibc
sources can currently be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates, but some runtime code is needed to configure and enable
MPX. For most applications this runtime support will be available by
linking to a library supplied by the compiler, or it may come directly
from the OS once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main
responsibilities: providing handlers for bounds faults (#BR), and
managing bounds memory.

Currently no hardware with the MPX ISA is available, but it is always
possible to use SDE (the Intel(R) Software Development Emulator)
instead, which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator


Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when
    decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors in the documentation, and document
    the extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Changes since v4:
  * raise SIGBUS if the allocations of the bound tables fail.

Qiaowei Ren (3):
  x86, mpx: add documentation on Intel MPX
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information

 Documentation/x86/intel_mpx.txt|  239 +
 arch/x86/include/asm/mpx.h |   54 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |  335 
 arch/x86/kernel/traps.c|   62 +++-
 include/uapi/asm-generic/siginfo.h |9 +-
 kernel/signal.c|4 +
 7 files changed, 702 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 1/3] x86, mpx: add documentation on Intel MPX

2014-02-12 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  239 +++
 1 files changed, 239 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..9af8636
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,239 @@
+1. Intel(R) MPX Overview
+========================
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references, for those references whose
+normal compile-time intentions are usurped at runtime due to
+buffer overflow or underflow.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. MPX architecture is designed to
+allow a machine (i.e., the processor(s) and the OS software)
+to run both MPX enabled software and legacy software that
+is MPX unaware. In such a case, the legacy software does not
+benefit from MPX, but it also does not experience any change
+in functionality or reduction in performance.
+
+Intel(R) MPX Programming Model
+------------------------------
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+--------------------------------------------
+
+Intel MPX defines two instructions for load/store of the linear
+address of a pointer to a buffer, along with the bounds of the
+buffer, into a paging structure of extended bounds. Specifically,
+when storing extended bounds, the processor will perform address
+translation of the address where the pointer is stored to an
+address in the Bound Table (BT) to determine the store location
+of the extended bounds. Loading of extended bounds performs the
+reverse sequence.
+
+The structure in memory to load/store an extended bound is a
+4-tuple consisting of lower bound, upper bound, pointer value
+and a reserved field. Bound loads and stores access 32-bit or
+64-bit operand size according to the operation mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. The linear address of the bound
+directory is derived from either BNDCFGU or BNDCFGS registers.
+Bounds in memory are stored in Bound Tables (BT) as an extended
+bound, which are accessed via Bound Directory (BD) and address
+translation performed by BNDLDX/BNDSTX instructions.
+
+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.
+
+XSAVE/XRSTOR Support of Intel MPX State
+----------------------------------------
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+(BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 are
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+
+2. How to get the advantage of MPX
+==================================
+
+
+To get the advantage of MPX, changes are required in
+the OS kernel, binutils, the compiler, and the system libraries.
+
+MPX support in the GNU toolchain
+--------------------------------
+
+This section describes changes in GNU Binutils, GCC and Glibc
+to support MPX.
+
+The first step of MPX support is to implement support for the new
+hardware features in binutils and GCC.
+
+The second step is the implementation of an MPX instrumentation
+pass in the GCC compiler, which is responsible for instrumenting
+all memory accesses with pointer checks. Compiler changes for
+runtime bound checks include:
+
+  * Bounds creation for statically alloca

[PATCH v4 3/3] x86, mpx: extend siginfo structure to include bound violation information

2014-02-12 Thread Qiaowei Ren
This patch adds new fields describing bound violations to the
siginfo structure. si_lower and si_upper are, respectively, the
lower and upper bounds in effect when a bound violation occurs.

These fields will be set in the #BR exception handler by decoding
the user instruction and reconstructing the faulting pointer.
A userspace application can obtain the violation address and the
lower and upper bounds from this new siginfo structure.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   19 +++
 arch/x86/kernel/mpx.c  |  289 
 arch/x86/kernel/traps.c|6 +
 include/uapi/asm-generic/siginfo.h |9 +-
 kernel/signal.c|4 +
 5 files changed, 326 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index d074153..3129b1e 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -30,6 +31,24 @@
 
 #endif
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index e055e0e..f95abc2 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -7,6 +7,270 @@
 #include 
 #include 
 
+typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+reg_type_t type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address referenced by the instruction:
+ * for mod == 3, return the contents of the rm register;
+ * for mod != 3, compute the address using SIB and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   }
+   addr += insn->displacement.value;
+   }
+
+   return addr;
+}
+
+/* Verify next sizeof(t) bytes can be on the same instruction */
+#define validate_next(t, insn, n)  \
+   ((insn)->next_byte + sizeof(t) 

[PATCH v4 2/3] x86, mpx: hook #BR exception handler to allocate bound tables

2014-02-12 Thread Qiaowei Ren
An access to an invalid bound directory entry will cause a #BR
exception. This patch hooks the #BR exception handler to allocate
one bound table and bind it to that bound directory entry.

This avoids the need to forward the #BR exception to user space
when the bound directory has an invalid entry.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   35 
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   44 +++
 arch/x86/kernel/traps.c|   55 +++-
 4 files changed, 134 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..d074153
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,35 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+#define MPX_L1_BITS28
+#define MPX_L1_SHIFT   3
+#define MPX_L2_BITS17
+#define MPX_L2_SHIFT   5
+#define MPX_IGN_BITS   3
+#define MPX_L2_NODE_ADDR_MASK  0xfff8UL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xf000UL
+
+#else
+
+#define MPX_L1_BITS20
+#define MPX_L1_SHIFT   2
+#define MPX_L2_BITS10
+#define MPX_L2_SHIFT   4
+#define MPX_IGN_BITS   2
+#define MPX_L2_NODE_ADDR_MASK  0xfffcUL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xf000UL
+
+#endif
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index cb648c8..becb970 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-y  += mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..e055e0e
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,44 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
+   if (bt_addr == -1) {
+   pr_err("L2 Node Allocation Failed at L1 addr %lx\n",
+   bd_entry);
+   return false;
+   }
+   bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01;
+
+   user_atomic_cmpxchg_inatomic(&old_val,
+   (long __user *)bd_entry, 0, bt_addr);
+   if (old_val)
+   vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size);
+
+   return true;
+}
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+   unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+   allocate_bt(bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 57409f6..fe09b3d 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
 
 DO_ERROR_INFO(X86_TRAP_DE, SIGFPE,  "divide error", divide_error, FPE_INTDIV, regs->ip )
 DO_ERROR (X86_TRAP_OF, SIGSEGV, "overflow", overflow )
-DO_ERROR (X86_TRAP_BR, SIGSEGV, "bounds", bounds )
 DO_ERROR_INFO(X86_TRAP_UD, SIGILL,  "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip )
 DO_ERROR (X86_TRAP_OLD_MF, SIGFPE,  "coprocessor segment overrun", coprocessor_segment_overrun )
 DO_ERROR (X86_TRAP_TS, SIGSEGV, "invalid TSS", invalid_TSS )
@@ -263,6 +263,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 }
 #endif
 

[PATCH v4 0/3] Intel MPX support

2014-02-12 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check memory
references, for those references whose compile-time normal intentions
are usurped at runtime due to buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. The MPX architecture is designed to allow a
machine to run both MPX-enabled software and legacy software that
is MPX-unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
binutils, the compiler, and the system libraries.

A new GCC option, -fmpx, is introduced to utilize MPX instructions.
GCC compiler sources with MPX support are currently available in a
separate branch of the common GCC SVN repository. See the GCC SVN
page (http://gcc.gnu.org/svn.html) for details.

For full protection, we had to add MPX instrumentation to all the
necessary Glibc routines (e.g. memcpy) written in assembler, and
compile Glibc with the MPX-enabled GCC compiler. MPX-enabled Glibc
sources can currently be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates, but some runtime code is needed to configure and enable
MPX. For most applications this runtime support will be available by
linking to a library supplied by the compiler, or it may come directly
from the OS once OS versions that support MPX are available.

The MPX kernel code, namely this patchset, has two main
responsibilities: providing handlers for bounds faults (#BR), and
managing bounds memory.

Currently no hardware with the MPX ISA is available, but it is always
possible to use SDE (the Intel(R) Software Development Emulator)
instead, which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator


Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when
    decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Changes since v3:
  * correct some syntax errors in the documentation, and document
    the extended struct siginfo.
  * kill the process when the error code of BNDSTATUS is 3.
  * add some comments.
  * remove new prctl() commands.
  * fix some compile warnings for 32-bit.

Qiaowei Ren (3):
  x86, mpx: add documentation on Intel MPX
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: extend siginfo structure to include bound violation
information

 Documentation/x86/intel_mpx.txt|  239 ++
 arch/x86/include/asm/mpx.h |   54 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |  333 
 arch/x86/kernel/traps.c|   61 +++-
 include/uapi/asm-generic/siginfo.h |9 +-
 kernel/signal.c|4 +
 7 files changed, 699 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c



[PATCH] kernel/trace: fix compiler warning

2014-02-11 Thread Qiaowei Ren
The patch fixes the following compiler warning:
CC  kernel/trace/trace_events.o
  kernel/trace/trace_events.c: In function 'event_enable_read'
  kernel/trace/trace_events.c:693: warning: 'flags' may be used \
  uninitialized in this function

Signed-off-by: Qiaowei Ren 
---
 kernel/trace/trace_events.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index e71ffd4..b7915f2 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -690,7 +690,7 @@ event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
 			  loff_t *ppos)
 {
struct ftrace_event_file *file;
-   unsigned long flags;
+   unsigned long flags = 0;
char buf[4] = "0";
 
mutex_lock(&event_mutex);
-- 
1.7.1



[PATCH v3 4/4] x86, mpx: extend siginfo structure to include bound violation information

2014-01-25 Thread Qiaowei Ren
This patch adds new fields describing bound violations to the
siginfo structure. si_lower and si_upper are, respectively, the
lower and upper bounds in effect when a bound violation occurs.

These fields will be set in the #BR exception handler by decoding
the user instruction and reconstructing the faulting pointer.
A userspace application can obtain the violation address and the
lower and upper bounds from this new siginfo structure.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   19 +++
 arch/x86/kernel/mpx.c  |  287 
 arch/x86/kernel/traps.c|6 +
 include/uapi/asm-generic/siginfo.h |9 +-
 kernel/signal.c|4 +
 5 files changed, 324 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 9652e9e..e099573 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -30,6 +31,22 @@
 
 #endif
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 typedef union {
struct {
unsigned long ignored:MPX_IGN_BITS;
@@ -40,5 +57,7 @@ typedef union {
 } mpx_addr;
 
 void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 9e91178..983abf7 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -91,6 +91,269 @@ int mpx_release(struct task_struct *tsk)
return 0;
 }
 
+typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+reg_type_t type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address referenced by the instruction:
+ * for mod == 3, return the contents of the rm register;
+ * for mod != 3, compute the address using SIB and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   }
+   addr += insn->displacement.value;
+   }
+
+   

[PATCH v3 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-25 Thread Qiaowei Ren
This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
commands on the x86 platform. These commands can be used to
initialize and release MPX-related resources.

An MMU notifier is registered while executing the PR_MPX_INIT
command, so that bound tables can be deallocated automatically
when a memory area is unmapped.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig |4 ++
 arch/x86/include/asm/mpx.h   |9 
 arch/x86/include/asm/processor.h |   16 +++
 arch/x86/kernel/mpx.c|   84 ++
 include/uapi/linux/prctl.h   |6 +++
 kernel/sys.c |   12 +
 6 files changed, 131 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cd18b83..28916e1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -236,6 +236,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config HAVE_INTEL_MPX
+   def_bool y
+   select MMU_NOTIFIER
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index d074153..9652e9e 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -30,6 +30,15 @@
 
 #endif
 
+typedef union {
+   struct {
+   unsigned long ignored:MPX_IGN_BITS;
+   unsigned long l2index:MPX_L2_BITS;
+   unsigned long l1index:MPX_L1_BITS;
+   };
+   unsigned long addr;
+} mpx_addr;
+
 void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index fdedd38..5962413 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -943,6 +943,22 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+#ifdef CONFIG_HAVE_INTEL_MPX
+
+/* Init/release a process' MPX related resource */
+#define MPX_INIT(tsk)  mpx_init((tsk))
+#define MPX_RELEASE(tsk)   mpx_release((tsk))
+
+extern int mpx_init(struct task_struct *tsk);
+extern int mpx_release(struct task_struct *tsk);
+
+#else /* CONFIG_HAVE_INTEL_MPX */
+
+#define MPX_INIT(tsk)  (-EINVAL)
+#define MPX_RELEASE(tsk)   (-EINVAL)
+
+#endif /* CONFIG_HAVE_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index e055e0e..9e91178 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,5 +1,7 @@
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -7,6 +9,88 @@
 #include 
 #include 
 
+static struct mmu_notifier mpx_mn;
+
+static void mpx_invl_range_end(struct mmu_notifier *mn,
+   struct mm_struct *mm,
+   unsigned long start, unsigned long end)
+{
+   struct xsave_struct *xsave_buf;
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr;
+   unsigned long bd_base;
+   unsigned long bd_entry, bde_start, bde_end;
+   mpx_addr lap;
+
+   pgd_t *pgd;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+
+   /* ignore swap notifications */
+   pgd = pgd_offset(mm, start);
+   pud = pud_offset(pgd, start);
+   pmd = pmd_offset(pud, start);
+   pte = pte_offset_kernel(pmd, start);
+   if (!pte_present(*pte) && !pte_none(*pte) && !pte_file(*pte))
+   return;
+
+   /* get bound directory base address */
+   fpu_xsave(&current->thread.fpu);
+   xsave_buf = &(current->thread.fpu.state->xsave);
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+
+   /* get related bde range */
+   lap.addr = start;
+   bde_start = bd_base + (lap.l1index << MPX_L1_SHIFT);
+
+   lap.addr = end;
+   if (lap.ignored || lap.l2index)
+   bde_end = bd_base + (lap.l1index<mm);
+
+   return 0;
+}
+
+int mpx_release(struct task_struct *tsk)
+{
+   if (!boot_cpu_has(X86_FEATURE_MPX))
+   return -EINVAL;
+
+   /* unregister mmu_notifier */
+   mmu_notifier_unregister(&mpx_mn, current->mm);
+
+   return 0;
+}
+
 static bool allocate_bt(unsigned long bd_entry)
 {
unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 289760f..19ab881 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -149,4 +149,10 @@
 
 #define PR_GET_TID_ADDRESS 40
 
+/*
+ * Init/release MPX related resource.
+ */
+#define PR_MPX_INIT41
+#define PR_MPX_RELEASE 42
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index c723113..0334d03 100644
--

[PATCH v3 0/4] Intel MPX support

2014-01-25 Thread Qiaowei Ren
This patchset adds support for the Memory Protection Extensions
(MPX) feature found in future Intel processors.

MPX can be used in conjunction with compiler changes to check
memory references whose compile-time intentions are usurped at
runtime by buffer overflow or underflow.

MPX provides this capability at very low performance overhead for
newly compiled code, and provides compatibility mechanisms with legacy
software components. The MPX architecture is designed to allow a machine
to run both MPX-enabled software and legacy software that is MPX unaware.
In such a case, the legacy software does not benefit from MPX, but it
also does not experience any change in functionality or reduction in
performance.

More information about Intel MPX can be found in "Intel(R) Architecture
Instruction Set Extensions Programming Reference".

To get the advantage of MPX, changes are required in the OS kernel,
the OS kernel, binutils, the compiler, and system library support.

A new GCC option, -fmpx, is introduced to utilize MPX instructions.
Currently GCC compiler sources with MPX support are available in a
separate branch of the common GCC SVN repository. See the GCC SVN page
(http://gcc.gnu.org/svn.html) for details.

To have the full protection, we had to add MPX instrumentation to all
the necessary Glibc routines (e.g. memcpy) written in assembler, and
compile Glibc with the MPX-enabled GCC compiler. Currently MPX-enabled
Glibc sources can be found in the Glibc git repository.

Enabling an application to use MPX will generally not require source
code updates, but some runtime code, responsible for configuring and
enabling MPX, is needed to make use of it. For most applications this
runtime support will be available by linking to a library supplied by
the compiler, or it may come directly from the OS once OS versions
that support MPX are available.

The MPX kernel code, namely this patchset, has mainly two responsibilities:
providing handlers for bounds faults (#BR) and managing bounds memory.

Currently no hardware with the MPX ISA is available, but it is always
possible to use the Intel(R) Software Development Emulator (SDE)
instead, which can be downloaded from
http://software.intel.com/en-us/articles/intel-software-development-emulator


Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when
decoding MPX instructions.

Changes since v2:
  * fix some compile warnings.
  * update documentation.

Qiaowei Ren (4):
  x86, mpx: add documentation on Intel MPX
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
  x86, mpx: extend siginfo structure to include bound violation
information

 Documentation/x86/intel_mpx.txt|  226 
 arch/x86/Kconfig   |4 +
 arch/x86/include/asm/mpx.h |   63 ++
 arch/x86/include/asm/processor.h   |   16 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |  415 
 arch/x86/kernel/traps.c|   61 +-
 include/uapi/asm-generic/siginfo.h |9 +-
 include/uapi/linux/prctl.h |6 +
 kernel/signal.c|4 +
 kernel/sys.c   |   12 +
 11 files changed, 815 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 2/4] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-25 Thread Qiaowei Ren
An access to an invalid bound directory entry will cause a #BR
exception. This patch hooks the #BR exception handler to allocate
one bound table and bind it to that bound directory entry.

This avoids the need to forward the #BR exception to user space
when the bound directory has an invalid entry.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   35 
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   44 +++
 arch/x86/kernel/traps.c|   55 +++-
 4 files changed, 134 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..d074153
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,35 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+#define MPX_L1_BITS28
+#define MPX_L1_SHIFT   3
+#define MPX_L2_BITS17
+#define MPX_L2_SHIFT   5
+#define MPX_IGN_BITS   3
+#define MPX_L2_NODE_ADDR_MASK  0xfff8UL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xf000UL
+
+#else
+
+#define MPX_L1_BITS20
+#define MPX_L1_SHIFT   2
+#define MPX_L2_BITS10
+#define MPX_L2_SHIFT   4
+#define MPX_IGN_BITS   2
+#define MPX_L2_NODE_ADDR_MASK  0xfffcUL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xf000UL
+
+#endif
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index cb648c8..becb970 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_PREEMPT) += preempt.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-y  += mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..e055e0e
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,44 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val = 0;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
+   if (bt_addr == -1) {
+   pr_err("L2 Node Allocation Failed at L1 addr %lx\n",
+   bd_entry);
+   return false;
+   }
+   bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01;
+
+   user_atomic_cmpxchg_inatomic(&old_val,
+   (long __user *)bd_entry, 0, bt_addr);
+   if (old_val)
+   vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size);
+
+   return true;
+}
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+   unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+   allocate_bt(bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 57409f6..6b284a4 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -213,7 +214,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
 
 DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip)
 DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow)
-DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds", bounds)
 DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip)
 DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun)
 DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS", invalid_TSS)
@@ -263,6 +263,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 }
 #endif
 

[PATCH v3 1/4] x86, mpx: add documentation on Intel MPX

2014-01-25 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |  226 +++
 1 files changed, 226 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..052001c
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,226 @@
+1. Intel(R) MPX Overview
+
+
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX provides
+hardware features that can be used in conjunction with compiler
+changes to check memory references whose compile-time intentions
+are usurped at runtime by buffer overflow or underflow.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. The MPX architecture is designed
+to allow a machine (i.e., the processor(s) and the OS software)
+to run both MPX enabled software and legacy software that
+is MPX unaware. In such a case, the legacy software does not
+benefit from MPX, but it also does not experience any change
+in functionality or reduction in performance.
+
+Intel(R) MPX Programming Model
+--
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+
+
+Intel MPX defines two instructions for load/store of the linear
+address of a pointer to a buffer, along with the bounds of the
+buffer into a paging structure of extended bounds. Specifically
+when storing extended bounds, the processor will perform address
+translation of the address where the pointer is stored to an
+address in the Bound Table (BT) to determine the store location
+of extended bounds. Loading extended bounds performs the
+reverse sequence.
+
+The structure in memory to load/store an extended bound is a
+4-tuple consisting of lower bound, upper bound, pointer value
+and a reserved field. Bound loads and stores access 32-bit or
+64-bit operand size according to the operation mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. And the linear address of the bound
+directory is derived from either BNDCFGU or BNDCFGS registers.
+Bounds in memory are stored in Bound Tables (BT) as an extended
+bound, which are accessed via Bound Directory (BD) and address
+translation performed by BNDLDX/BNDSTX instructions.
+
+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.
+
+XSAVE/XRESTOR Support of Intel MPX State
+
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+(BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 is
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+
+2. How to get the advantage of MPX 
+==
+
+
+To get the advantage of MPX, changes are required in
+the OS kernel, binutils, the compiler, and system library support.
+
+MPX support in the GNU toolchain
+
+
+This section describes changes in GNU Binutils, GCC and Glibc
+to support MPX.
+
+The first step of MPX support is to implement support for new
+hardware features in binutils and GCC.
+
+The second step is implementation of MPX instrumentation pass
+in the GCC compiler which is responsible for instrumenting all
+memory accesses with pointer checks. Compiler changes for runtime
+bound checks include:
+
+  * Bounds creation for statically alloca

[PATCH v2 0/4] Intel MPX support

2014-01-21 Thread Qiaowei Ren
Changes since v1:
  * check to see if #BR occurred in userspace or kernel space.
  * use generic structures and macros as much as possible when
decoding MPX instructions.

Qiaowei Ren (4):
  x86, mpx: add documentation on Intel MPX
  x86, mpx: hook #BR exception handler to allocate bound tables
  x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE
  x86, mpx: extend siginfo structure to include bound violation
information

 Documentation/x86/intel_mpx.txt|   76 +++
 arch/x86/Kconfig   |4 +
 arch/x86/include/asm/mpx.h |   63 ++
 arch/x86/include/asm/processor.h   |   16 ++
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |  417 
 arch/x86/kernel/traps.c|   61 +-
 include/uapi/asm-generic/siginfo.h |9 +-
 include/uapi/linux/prctl.h |6 +
 kernel/signal.c|4 +
 kernel/sys.c   |   12 +
 11 files changed, 667 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c



[PATCH v2 1/4] x86, mpx: add documentation on Intel MPX

2014-01-21 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |   76 +++
 1 files changed, 76 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..778d06e
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,76 @@
+Intel(R) MPX Overview:
+=
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX can
+increase the robustness of software when it is used in conjunction
+with compiler changes to check that memory references intended
+at compile time do not become unsafe at runtime.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. A direct benefit Intel MPX provides
+is hardening software against malicious attacks designed to
+cause or exploit buffer overruns.
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+Intel(R) MPX Programming Model
+--
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+
+
+Intel MPX defines two instructions for load/store of the linear
+address of a pointer to a buffer, along with the bounds of the
+buffer into a paging structure of extended bounds. Specifically
+when storing extended bounds, the processor will perform address
+translation of the address where the pointer is stored to an
+address in the Bound Table (BT) to determine the store location
+of extended bounds. Loading extended bounds performs the
+reverse sequence.
+
+The structure in memory to load/store an extended bound is a
+4-tuple consisting of lower bound, upper bound, pointer value
+and a reserved field. Bound loads and stores access 32-bit or
+64-bit operand size according to the operation mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. And the linear address of the bound
+directory is derived from either BNDCFGU or BNDCFGS registers.
+Bounds in memory are stored in Bound Tables (BT) as an extended
+bound, which are accessed via Bound Directory (BD) and address
+translation performed by BNDLDX/BNDSTX instructions.
+
+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.
+
+XSAVE/XRESTOR Support of Intel MPX State
+
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+(BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 is
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
-- 
1.7.1



[PATCH v2 3/4] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-21 Thread Qiaowei Ren
This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
commands on the x86 platform. These commands can be used to
initialize and release MPX-related resources.

An MMU notifier is registered during PR_MPX_INIT execution,
so that bound tables can be deallocated automatically when a
memory area is unmapped.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig |4 ++
 arch/x86/include/asm/mpx.h   |9 
 arch/x86/include/asm/processor.h |   16 +++
 arch/x86/kernel/mpx.c|   84 ++
 include/uapi/linux/prctl.h   |6 +++
 kernel/sys.c |   12 +
 6 files changed, 131 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ee2fb9d..695101a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -233,6 +233,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config HAVE_INTEL_MPX
+   def_bool y
+   select MMU_NOTIFIER
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index d074153..9652e9e 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -30,6 +30,15 @@
 
 #endif
 
+typedef union {
+   struct {
+   unsigned long ignored:MPX_IGN_BITS;
+   unsigned long l2index:MPX_L2_BITS;
+   unsigned long l1index:MPX_L1_BITS;
+   };
+   unsigned long addr;
+} mpx_addr;
+
 void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 43be6f6..ea4e72d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -963,6 +963,22 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+#ifdef CONFIG_HAVE_INTEL_MPX
+
+/* Init/release a process' MPX related resource */
+#define MPX_INIT(tsk)  mpx_init((tsk))
+#define MPX_RELEASE(tsk)   mpx_release((tsk))
+
+extern int mpx_init(struct task_struct *tsk);
+extern int mpx_release(struct task_struct *tsk);
+
+#else /* CONFIG_HAVE_INTEL_MPX */
+
+#define MPX_INIT(tsk)  (-EINVAL)
+#define MPX_RELEASE(tsk)   (-EINVAL)
+
+#endif /* CONFIG_HAVE_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 767b3bf..ffe5aee 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,5 +1,7 @@
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -7,6 +9,88 @@
 #include 
 #include 
 
+static struct mmu_notifier mpx_mn;
+
+static void mpx_invl_range_end(struct mmu_notifier *mn,
+   struct mm_struct *mm,
+   unsigned long start, unsigned long end)
+{
+   struct xsave_struct *xsave_buf;
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr;
+   unsigned long bd_base;
+   unsigned long bd_entry, bde_start, bde_end;
+   mpx_addr lap;
+
+   pgd_t *pgd;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+
+   /* ignore swap notifications */
+   pgd = pgd_offset(mm, start);
+   pud = pud_offset(pgd, start);
+   pmd = pmd_offset(pud, start);
+   pte = pte_offset_kernel(pmd, start);
+   if (!pte_present(*pte) && !pte_none(*pte) && !pte_file(*pte))
+   return;
+
+   /* get bound directory base address */
+   fpu_xsave(&current->thread.fpu);
+   xsave_buf = &(current->thread.fpu.state->xsave);
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+
+   /* get related bde range */
+   lap.addr = start;
+   bde_start = bd_base + (lap.l1index << MPX_L1_SHIFT);
+
+   lap.addr = end;
+   if (lap.ignored || lap.l2index)
+   bde_end = bd_base + (lap.l1index<mm);
+
+   return 0;
+}
+
+int mpx_release(struct task_struct *tsk)
+{
+   if (!boot_cpu_has(X86_FEATURE_MPX))
+   return -EINVAL;
+
+   /* unregister mmu_notifier */
+   mmu_notifier_unregister(&mpx_mn, current->mm);
+
+   return 0;
+}
+
 static bool allocate_bt(unsigned long bd_entry)
 {
unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 289760f..19ab881 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -149,4 +149,10 @@
 
 #define PR_GET_TID_ADDRESS 40
 
+/*
+ * Init/release MPX related resource.
+ */
+#define PR_MPX_INIT41
+#define PR_MPX_RELEASE 42
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index c18ecca..bbaf573 100644
--

[PATCH v2 2/4] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-21 Thread Qiaowei Ren
An access to an invalid bound directory entry will cause a #BR
exception. This patch hooks the #BR exception handler to allocate
one bound table and bind it to that bound directory entry.

This avoids the need to forward the #BR exception to user space
when the bound directory has an invalid entry.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   35 
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   44 +++
 arch/x86/kernel/traps.c|   55 +++-
 4 files changed, 134 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..d074153
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,35 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+#define MPX_L1_BITS28
+#define MPX_L1_SHIFT   3
+#define MPX_L2_BITS17
+#define MPX_L2_SHIFT   5
+#define MPX_IGN_BITS   3
+#define MPX_L2_NODE_ADDR_MASK  0xfff8UL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xf000UL
+
+#else
+
+#define MPX_L1_BITS20
+#define MPX_L1_SHIFT   2
+#define MPX_L2_BITS10
+#define MPX_L2_SHIFT   4
+#define MPX_IGN_BITS   2
+#define MPX_L2_NODE_ADDR_MASK  0xfffcUL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xf000UL
+
+#endif
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index a5408b9..bba7a71 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -38,6 +38,7 @@ obj-y += resource.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-y  += mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..767b3bf
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,44 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
+   if (bt_addr == -1) {
+   pr_err("L2 Node Allocation Failed at L1 addr %lx\n",
+   bd_entry);
+   return false;
+   }
+   bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01;
+
+   user_atomic_cmpxchg_inatomic(&old_val,
+   (long __user *)bd_entry, 0, bt_addr);
+   if (old_val)
+   vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size);
+
+   return true;
+}
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+   unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+   allocate_bt(bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 8c8093b..78e9c16 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -214,7 +215,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
 DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip)
 DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow)
-DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds", bounds)
 DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip)
 DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun",
@@ -267,6 +267,59 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 }
 #endif
 
+dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
+{
+   enum ctx_state prev_state;
+   unsigned long status;
+   struct xsave_struct *xsave_buf;
+   struct task_struct *tsk = current;
+
+   prev_state = exception_enter();
+   if (notify_die(DIE_TRAP, "bounds", regs, error_code,
+   X86_TRAP_BR, SIGSEGV) == NOTIFY_STOP)

[PATCH v2 4/4] x86, mpx: extend siginfo structure to include bound violation information

2014-01-21 Thread Qiaowei Ren
This patch adds new fields describing a bound violation to the
siginfo structure: si_lower and si_upper are, respectively, the
lower and upper bounds in effect when the violation occurred.

These fields are set in the #BR exception handler by decoding
the user instruction and constructing the faulting pointer.
A userspace application can obtain the violation address and
the lower and upper bounds from this new siginfo structure.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   19 +++
 arch/x86/kernel/mpx.c  |  289 
 arch/x86/kernel/traps.c|6 +
 include/uapi/asm-generic/siginfo.h |9 +-
 kernel/signal.c|4 +
 5 files changed, 326 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 9652e9e..e099573 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 
@@ -30,6 +31,22 @@
 
 #endif
 
+struct mpx_insn {
+   struct insn_field rex_prefix;   /* REX prefix */
+   struct insn_field modrm;
+   struct insn_field sib;
+   struct insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
 typedef union {
struct {
unsigned long ignored:MPX_IGN_BITS;
@@ -40,5 +57,7 @@ typedef union {
 } mpx_addr;
 
 void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index ffe5aee..3770991 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -91,6 +91,269 @@ int mpx_release(struct task_struct *tsk)
return 0;
 }
 
+typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+reg_type_t type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address referenced by the instruction:
+ * for rm=3, return the content of the rm register;
+ * for rm!=3, compute the address from SIB and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   if (X86_MODRM_MOD(modrm) == 3) {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   } else {
+   if (insn->sib.nbytes) {
+   base = get_reg(insn, regs, REG_TYPE_BASE);
+   indx = get_reg(insn, regs, REG_TYPE_INDEX);
+   addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   } else {
+   addr = get_reg(insn, regs, REG_TYPE_RM);
+   }
+   addr += insn->displacement.value;
+   }
+
+   return addr;
+}

[tip:x86/mpx] x86, mpx: Add MPX related opcodes to the x86 opcode map

2014-01-17 Thread tip-bot for Qiaowei Ren
Commit-ID:  fb09b78151361f5001ad462e4b242b10845830e2
Gitweb: http://git.kernel.org/tip/fb09b78151361f5001ad462e4b242b10845830e2
Author: Qiaowei Ren 
AuthorDate: Sun, 12 Jan 2014 17:20:02 +0800
Committer:  H. Peter Anvin 
CommitDate: Fri, 17 Jan 2014 11:04:09 -0800

x86, mpx: Add MPX related opcodes to the x86 opcode map

This patch adds all the MPX instructions to the x86 opcode map, so the
x86 instruction decoder can decode MPX instructions.

Signed-off-by: Qiaowei Ren 
Link: 
http://lkml.kernel.org/r/1389518403-7715-4-git-send-email-qiaowei@intel.com
Cc: Masami Hiramatsu 
Signed-off-by: H. Peter Anvin 
---
 arch/x86/lib/x86-opcode-map.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index 533a85e..1a2be7c 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -346,8 +346,8 @@ AVXcode: 1
 17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1)
 18: Grp16 (1A)
 19:
-1a:
-1b:
+1a: BNDCL Ev,Gv | BNDCU Ev,Gv | BNDMOV Gv,Ev | BNDLDX Gv,Ev,Gv
+1b: BNDCN Ev,Gv | BNDMOV Ev,Gv | BNDMK Gv,Ev | BNDSTX Ev,GV,Gv
 1c:
 1d:
 1e:
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 4/5] x86, mpx: add MPX related opcodes to the x86 opcode map

2014-01-11 Thread Qiaowei Ren
This patch adds all the MPX instructions to the x86 opcode map, so that
the x86 instruction decoder can decode MPX instructions used in the kernel.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/lib/x86-opcode-map.txt |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index 533a85e..1a2be7c 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -346,8 +346,8 @@ AVXcode: 1
 17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1)
 18: Grp16 (1A)
 19:
-1a:
-1b:
+1a: BNDCL Ev,Gv | BNDCU Ev,Gv | BNDMOV Gv,Ev | BNDLDX Gv,Ev,Gv
+1b: BNDCN Ev,Gv | BNDMOV Ev,Gv | BNDMK Gv,Ev | BNDSTX Ev,GV,Gv
 1c:
 1d:
 1e:
-- 
1.7.1



[PATCH 3/5] x86, mpx: add prctl commands PR_MPX_INIT, PR_MPX_RELEASE

2014-01-11 Thread Qiaowei Ren
This patch adds the PR_MPX_INIT and PR_MPX_RELEASE prctl()
commands on the x86 platform. These commands can be used to
init and release MPX related resources.

An MMU notifier is registered during PR_MPX_INIT command
execution, so that bound tables can be automatically
deallocated when a memory area is unmapped.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/Kconfig |4 ++
 arch/x86/include/asm/mpx.h   |9 
 arch/x86/include/asm/processor.h |   16 +++
 arch/x86/kernel/mpx.c|   84 ++
 include/uapi/linux/prctl.h   |6 +++
 kernel/sys.c |   12 +
 6 files changed, 131 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ee2fb9d..695101a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -233,6 +233,10 @@ config HAVE_INTEL_TXT
def_bool y
depends on INTEL_IOMMU && ACPI
 
+config HAVE_INTEL_MPX
+   def_bool y
+   select MMU_NOTIFIER
+
 config X86_32_SMP
def_bool y
depends on X86_32 && SMP
diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index d074153..9652e9e 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -30,6 +30,15 @@
 
 #endif
 
+typedef union {
+   struct {
+   unsigned long ignored:MPX_IGN_BITS;
+   unsigned long l2index:MPX_L2_BITS;
+   unsigned long l1index:MPX_L1_BITS;
+   };
+   unsigned long addr;
+} mpx_addr;
+
 void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 43be6f6..ea4e72d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -963,6 +963,22 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
 extern int get_tsc_mode(unsigned long adr);
 extern int set_tsc_mode(unsigned int val);
 
+#ifdef CONFIG_HAVE_INTEL_MPX
+
+/* Init/release a process' MPX related resource */
+#define MPX_INIT(tsk)  mpx_init((tsk))
+#define MPX_RELEASE(tsk)   mpx_release((tsk))
+
+extern int mpx_init(struct task_struct *tsk);
+extern int mpx_release(struct task_struct *tsk);
+
+#else /* CONFIG_HAVE_INTEL_MPX */
+
+#define MPX_INIT(tsk)  (-EINVAL)
+#define MPX_RELEASE(tsk)   (-EINVAL)
+
+#endif /* CONFIG_HAVE_INTEL_MPX */
+
 extern u16 amd_get_nb_id(int cpu);
 
 static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index 767b3bf..ffe5aee 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -1,5 +1,7 @@
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -7,6 +9,88 @@
 #include 
 #include 
 
+static struct mmu_notifier mpx_mn;
+
+static void mpx_invl_range_end(struct mmu_notifier *mn,
+   struct mm_struct *mm,
+   unsigned long start, unsigned long end)
+{
+   struct xsave_struct *xsave_buf;
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr;
+   unsigned long bd_base;
+   unsigned long bd_entry, bde_start, bde_end;
+   mpx_addr lap;
+
+   pgd_t *pgd;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+
+   /* ignore swap notifications */
+   pgd = pgd_offset(mm, start);
+   pud = pud_offset(pgd, start);
+   pmd = pmd_offset(pud, start);
+   pte = pte_offset_kernel(pmd, start);
+   if (!pte_present(*pte) && !pte_none(*pte) && !pte_file(*pte))
+   return;
+
+   /* get bound directory base address */
+   fpu_xsave(¤t->thread.fpu);
+   xsave_buf = &(current->thread.fpu.state->xsave);
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+
+   /* get related bde range */
+   lap.addr = start;
+   bde_start = bd_base + (lap.l1index << MPX_L1_SHIFT);
+
+   lap.addr = end;
+   if (lap.ignored || lap.l2index)
+   bde_end = bd_base + ((lap.l1index + 1)<<MPX_L1_SHIFT);
+   else
+   bde_end = bd_base + (lap.l1index<<MPX_L1_SHIFT);
+}
+
+static struct mmu_notifier_ops mpx_mmuops = {
+   .invalidate_range_end = mpx_invl_range_end,
+};
+
+int mpx_init(struct task_struct *tsk)
+{
+   if (!boot_cpu_has(X86_FEATURE_MPX))
+   return -EINVAL;
+
+   /* register mmu_notifier */
+   mpx_mn.ops = &mpx_mmuops;
+   mmu_notifier_register(&mpx_mn, current->mm);
+
+   return 0;
+}
+
+int mpx_release(struct task_struct *tsk)
+{
+   if (!boot_cpu_has(X86_FEATURE_MPX))
+   return -EINVAL;
+
+   /* unregister mmu_notifier */
+   mmu_notifier_unregister(&mpx_mn, current->mm);
+
+   return 0;
+}
+
 static bool allocate_bt(unsigned long bd_entry)
 {
unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 289760f..19ab881 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -149,4 +149,10 @@
 
 #define PR_GET_TID_ADDRESS 40
 
+/*
+ * Init/release MPX related resource.
+ */
+#define PR_MPX_INIT41
+#define PR_MPX_RELEASE 42
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index c18ecca..bbaf573 100644
--

[PATCH 1/5] x86, mpx: add documentation on Intel MPX

2014-01-11 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
---
 Documentation/x86/intel_mpx.txt |   76 +++
 1 files changed, 76 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..778d06e
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,76 @@
+Intel(R) MPX Overview:
+======================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX can
+increase the robustness of software when it is used in conjunction
+with compiler changes to check that memory references intended
+at compile time do not become unsafe at runtime.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. A direct benefit Intel MPX provides
+is hardening software against malicious attacks designed to
+cause or exploit buffer overruns.
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+Intel(R) MPX Programming Model
+------------------------------
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+--------------------------------------------
+
+Intel MPX defines two instructions for load/store of the linear
+address of a pointer to a buffer, along with the bounds of the
+buffer into a paging structure of extended bounds. Specifically
+when storing extended bounds, the processor will perform address
+translation of the address where the pointer is stored to an
+address in the Bound Table (BT) to determine the store location
+of extended bounds. Loading of an extended bound performs the
+reverse sequence.
+
+The structure in memory to load/store an extended bound is a
+4-tuple consisting of lower bound, upper bound, pointer value
+and a reserved field. Bound loads and stores access 32-bit or
+64-bit operand size according to the operation mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. And the linear address of the bound
+directory is derived from either BNDCFGU or BNDCFGS registers.
+Bounds in memory are stored in Bound Tables (BT) as an extended
+bound, which are accessed via Bound Directory (BD) and address
+translation performed by BNDLDX/BNDSTX instructions.
+
+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.
+
+XSAVE/XRESTOR Support of Intel MPX State
+----------------------------------------
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+(BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 is
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
-- 
1.7.1



[PATCH 2/5] x86, mpx: hook #BR exception handler to allocate bound tables

2014-01-11 Thread Qiaowei Ren
An access to an invalid bound directory entry will cause a #BR
exception. This patch hooks the #BR exception handler to allocate
one bound table and bind it to that bound directory entry.

This avoids the need to forward the #BR exception to user space
when the bound directory has an invalid entry.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   35 +
 arch/x86/kernel/Makefile   |1 +
 arch/x86/kernel/mpx.c  |   44 ++
 arch/x86/kernel/traps.c|   46 +++-
 4 files changed, 125 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/include/asm/mpx.h
 create mode 100644 arch/x86/kernel/mpx.c

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
new file mode 100644
index 000..d074153
--- /dev/null
+++ b/arch/x86/include/asm/mpx.h
@@ -0,0 +1,35 @@
+#ifndef _ASM_X86_MPX_H
+#define _ASM_X86_MPX_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_64
+
+#define MPX_L1_BITS28
+#define MPX_L1_SHIFT   3
+#define MPX_L2_BITS17
+#define MPX_L2_SHIFT   5
+#define MPX_IGN_BITS   3
+#define MPX_L2_NODE_ADDR_MASK  0xfffffffffffffff8UL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffffffffffffffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xfffffffffffff000UL
+
+#else
+
+#define MPX_L1_BITS20
+#define MPX_L1_SHIFT   2
+#define MPX_L2_BITS10
+#define MPX_L2_SHIFT   4
+#define MPX_IGN_BITS   2
+#define MPX_L2_NODE_ADDR_MASK  0xfffffffcUL
+
+#define MPX_BNDSTA_ADDR_MASK   0xfffffffcUL
+#define MPX_BNDCFG_ADDR_MASK   0xfffff000UL
+
+#endif
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+
+#endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index a5408b9..bba7a71 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -38,6 +38,7 @@ obj-y += resource.o
 
 obj-y  += process.o
 obj-y  += i387.o xsave.o
+obj-y  += mpx.o
 obj-y  += ptrace.o
 obj-$(CONFIG_X86_32)   += tls.o
 obj-$(CONFIG_IA32_EMULATION)   += tls.o
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
new file mode 100644
index 000..767b3bf
--- /dev/null
+++ b/arch/x86/kernel/mpx.c
@@ -0,0 +1,44 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static bool allocate_bt(unsigned long bd_entry)
+{
+   unsigned long bt_size = 1UL << (MPX_L2_BITS+MPX_L2_SHIFT);
+   unsigned long bt_addr, old_val;
+
+   bt_addr = sys_mmap_pgoff(0, bt_size, PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
+   if (bt_addr == -1) {
+   pr_err("L2 Node Allocation Failed at L1 addr %lx\n",
+   bd_entry);
+   return false;
+   }
+   bt_addr = (bt_addr & MPX_L2_NODE_ADDR_MASK) | 0x01;
+
+   user_atomic_cmpxchg_inatomic(&old_val,
+   (long __user *)bd_entry, 0, bt_addr);
+   if (old_val)
+   vm_munmap(bt_addr & MPX_L2_NODE_ADDR_MASK, bt_size);
+
+   return true;
+}
+
+void do_mpx_bt_fault(struct xsave_struct *xsave_buf)
+{
+   unsigned long status;
+   unsigned long bd_entry, bd_base;
+   unsigned long bd_size = 1UL << (MPX_L1_BITS+MPX_L1_SHIFT);
+
+   bd_base = xsave_buf->bndcsr.cfg_reg_u & MPX_BNDCFG_ADDR_MASK;
+   status = xsave_buf->bndcsr.status_reg;
+
+   bd_entry = status & MPX_BNDSTA_ADDR_MASK;
+   if ((bd_entry >= bd_base) && (bd_entry < bd_base + bd_size))
+   allocate_bt(bd_entry);
+}
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 8c8093b..eb04039 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -214,7 +215,6 @@ dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
 DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip)
 DO_ERROR(X86_TRAP_OF, SIGSEGV, "overflow", overflow)
-DO_ERROR(X86_TRAP_BR, SIGSEGV, "bounds", bounds)
 DO_ERROR_INFO(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip)
 DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun",
   coprocessor_segment_overrun)
@@ -267,6 +267,50 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
 }
 #endif
 
+dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
+{
+   enum ctx_state prev_state;
+   unsigned long status;
+   struct xsave_struct *xsave_buf;
+   struct task_struct *tsk = current;
+
+   prev_state = exception_enter();
+   if (notify_die(DIE_TRAP, "bounds", regs, error_code,
+   X86_TRAP_BR, SIGSEGV) == NOTIFY_STOP)
+   goto exit;

[PATCH 5/5] x86, mpx: extend siginfo structure to include bound violation information

2014-01-11 Thread Qiaowei Ren
This patch adds new fields for bound violations to the siginfo
structure: si_lower and si_upper are, respectively, the lower and
upper bounds in effect when the violation occurred.

These fields are set in the #BR exception handler by decoding the
user instruction and constructing the faulting pointer. A userspace
application can then obtain the violation address, lower bound and
upper bound from this new siginfo structure.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/mpx.h |   39 +
 arch/x86/kernel/mpx.c  |  289 
 arch/x86/kernel/traps.c|6 +
 include/uapi/asm-generic/siginfo.h |9 +-
 kernel/signal.c|4 +
 5 files changed, 346 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mpx.h b/arch/x86/include/asm/mpx.h
index 9652e9e..8c1c914 100644
--- a/arch/x86/include/asm/mpx.h
+++ b/arch/x86/include/asm/mpx.h
@@ -30,6 +30,43 @@
 
 #endif
 
+struct mpx_insn_field {
+   union {
+   signed int value;
+   unsigned char bytes[4];
+   };
+   unsigned char nbytes;
+};
+
+struct mpx_insn {
+   struct mpx_insn_field rex_prefix;   /* REX prefix */
+   struct mpx_insn_field modrm;
+   struct mpx_insn_field sib;
+   struct mpx_insn_field displacement;
+
+   unsigned char addr_bytes;   /* effective address size */
+   unsigned char limit;
+   unsigned char x86_64;
+
+   const unsigned char *kaddr; /* kernel address of insn to analyze */
+   const unsigned char *next_byte;
+};
+
+#define MAX_MPX_INSN_SIZE  15
+
+#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
+#define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3)
+#define X86_MODRM_RM(modrm) ((modrm) & 0x07)
+
+#define X86_SIB_SCALE(sib) (((sib) & 0xc0) >> 6)
+#define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3)
+#define X86_SIB_BASE(sib) ((sib) & 0x07)
+
+#define X86_REX_W(rex) ((rex) & 8)
+#define X86_REX_R(rex) ((rex) & 4)
+#define X86_REX_X(rex) ((rex) & 2)
+#define X86_REX_B(rex) ((rex) & 1)
+
 typedef union {
struct {
unsigned long ignored:MPX_IGN_BITS;
@@ -40,5 +77,7 @@ typedef union {
 } mpx_addr;
 
 void do_mpx_bt_fault(struct xsave_struct *xsave_buf);
+void do_mpx_bounds(struct pt_regs *regs, siginfo_t *info,
+   struct xsave_struct *xsave_buf);
 
 #endif /* _ASM_X86_MPX_H */
diff --git a/arch/x86/kernel/mpx.c b/arch/x86/kernel/mpx.c
index ffe5aee..3770991 100644
--- a/arch/x86/kernel/mpx.c
+++ b/arch/x86/kernel/mpx.c
@@ -91,6 +91,269 @@ int mpx_release(struct task_struct *tsk)
return 0;
 }
 
+typedef enum {REG_TYPE_RM, REG_TYPE_INDEX, REG_TYPE_BASE} reg_type_t;
+static unsigned long get_reg(struct mpx_insn *insn, struct pt_regs *regs,
+reg_type_t type)
+{
+   int regno = 0;
+   unsigned char modrm = (unsigned char)insn->modrm.value;
+   unsigned char sib = (unsigned char)insn->sib.value;
+
+   static const int regoff[] = {
+   offsetof(struct pt_regs, ax),
+   offsetof(struct pt_regs, cx),
+   offsetof(struct pt_regs, dx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, sp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+   offsetof(struct pt_regs, r8),
+   offsetof(struct pt_regs, r9),
+   offsetof(struct pt_regs, r10),
+   offsetof(struct pt_regs, r11),
+   offsetof(struct pt_regs, r12),
+   offsetof(struct pt_regs, r13),
+   offsetof(struct pt_regs, r14),
+   offsetof(struct pt_regs, r15),
+#endif
+   };
+
+   switch (type) {
+   case REG_TYPE_RM:
+   regno = X86_MODRM_RM(modrm);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_INDEX:
+   regno = X86_SIB_INDEX(sib);
+   if (X86_REX_X(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   case REG_TYPE_BASE:
+   regno = X86_SIB_BASE(sib);
+   if (X86_REX_B(insn->rex_prefix.value) == 1)
+   regno += 8;
+   break;
+
+   default:
+   break;
+   }
+
+   return regs_get_register(regs, regoff[regno]);
+}
+
+/*
+ * Return the address referenced by the instruction:
+ * for rm=3, return the content of the rm register;
+ * for rm!=3, compute the address from SIB and displacement.
+ */
+static unsigned long get_addr_ref(struct mpx_insn *insn, struct pt_regs *regs)
+{
+   unsigned long addr;
+   unsigned long base;
+   unsigned long indx;
+   unsigned char modrm = (unsigned char)insn->modrm.value;

[tip:x86/mpx] x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic

2013-12-16 Thread tip-bot for Qiaowei Ren
Commit-ID:  0ee3b6f87d4d748d5362cb47ff33fa1553805cb4
Gitweb: http://git.kernel.org/tip/0ee3b6f87d4d748d5362cb47ff33fa1553805cb4
Author: Qiaowei Ren 
AuthorDate: Sat, 14 Dec 2013 14:25:03 +0800
Committer:  H. Peter Anvin 
CommitDate: Mon, 16 Dec 2013 09:08:13 -0800

x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic

futex_atomic_cmpxchg_inatomic() is simply the 32-bit implementation of
user_atomic_cmpxchg_inatomic(), which in turn is simply a
generalization of the original code in
futex_atomic_cmpxchg_inatomic().

Use the newly generalized user_atomic_cmpxchg_inatomic() as the futex
implementation, too.

[ hpa: retain the inline in futex.h rather than changing it to a macro ]

Signed-off-by: Qiaowei Ren 
Link: 
http://lkml.kernel.org/r/1387002303-6620-2-git-send-email-qiaowei@intel.com
Signed-off-by: H. Peter Anvin 
Cc: Peter Zijlstra 
---
 arch/x86/include/asm/futex.h | 21 +
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h
index be27ba1..b4c1f54 100644
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -110,26 +110,7 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
 static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
u32 oldval, u32 newval)
 {
-   int ret = 0;
-
-   if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
-   return -EFAULT;
-
-   asm volatile("\t" ASM_STAC "\n"
-"1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
-"2:\t" ASM_CLAC "\n"
-"\t.section .fixup, \"ax\"\n"
-"3:\tmov %3, %0\n"
-"\tjmp 2b\n"
-"\t.previous\n"
-_ASM_EXTABLE(1b, 3b)
-: "+r" (ret), "=a" (oldval), "+m" (*uaddr)
-: "i" (-EFAULT), "r" (newval), "1" (oldval)
-: "memory"
-   );
-
-   *uval = oldval;
-   return ret;
+   return user_atomic_cmpxchg_inatomic(uval, uaddr, oldval, newval);
 }
 
 #endif


[tip:x86/mpx] x86: add user_atomic_cmpxchg_inatomic at uaccess.h

2013-12-16 Thread tip-bot for Qiaowei Ren
Commit-ID:  f09174c501f8bb259788cc36d5a7aa5b2831fb5e
Gitweb: http://git.kernel.org/tip/f09174c501f8bb259788cc36d5a7aa5b2831fb5e
Author: Qiaowei Ren 
AuthorDate: Sat, 14 Dec 2013 14:25:02 +0800
Committer:  H. Peter Anvin 
CommitDate: Mon, 16 Dec 2013 09:07:57 -0800

x86: add user_atomic_cmpxchg_inatomic at uaccess.h

This patch adds user_atomic_cmpxchg_inatomic() to use CMPXCHG
instruction against a user space address.

This generalizes the already existing futex_atomic_cmpxchg_inatomic()
so it can be used in other contexts.  This will be used in the
upcoming support for Intel MPX (Memory Protection Extensions).

[ hpa: replaced #ifdef inside a macro with IS_ENABLED() ]

Signed-off-by: Qiaowei Ren 
Link: 
http://lkml.kernel.org/r/1387002303-6620-1-git-send-email-qiaowei@intel.com
Signed-off-by: H. Peter Anvin 
Cc: Peter Zijlstra 
---
 arch/x86/include/asm/uaccess.h | 92 ++
 1 file changed, 92 insertions(+)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 8ec57c0..48ff838 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -525,6 +525,98 @@ extern __must_check long strnlen_user(const char __user *str, long n);
 unsigned long __must_check clear_user(void __user *mem, unsigned long len);
 unsigned long __must_check __clear_user(void __user *mem, unsigned long len);
 
+extern void __cmpxchg_wrong_size(void)
+   __compiletime_error("Bad argument size for cmpxchg");
+
+#define __user_atomic_cmpxchg_inatomic(uval, ptr, old, new, size)  \
+({ \
+   int __ret = 0;  \
+   __typeof__(ptr) __uval = (uval);\
+   __typeof__(*(ptr)) __old = (old);   \
+   __typeof__(*(ptr)) __new = (new);   \
+   switch (size) { \
+   case 1: \
+   {   \
+   asm volatile("\t" ASM_STAC "\n" \
+   "1:\t" LOCK_PREFIX "cmpxchgb %4, %2\n"  \
+   "2:\t" ASM_CLAC "\n"\
+   "\t.section .fixup, \"ax\"\n"   \
+   "3:\tmov %3, %0\n"  \
+   "\tjmp 2b\n"\
+   "\t.previous\n" \
+   _ASM_EXTABLE(1b, 3b)\
+   : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \
+   : "i" (-EFAULT), "q" (__new), "1" (__old)   \
+   : "memory"  \
+   );  \
+   break;  \
+   }   \
+   case 2: \
+   {   \
+   asm volatile("\t" ASM_STAC "\n" \
+   "1:\t" LOCK_PREFIX "cmpxchgw %4, %2\n"  \
+   "2:\t" ASM_CLAC "\n"\
+   "\t.section .fixup, \"ax\"\n"   \
+   "3:\tmov %3, %0\n"  \
+   "\tjmp 2b\n"\
+   "\t.previous\n" \
+   _ASM_EXTABLE(1b, 3b)\
+   : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \
+   : "i" (-EFAULT), "r" (__new), "1" (__old)   \
+   : "memory"  \
+   );  \
+   break;  \
+   }   \
+   case 4: \
+   {   \
+   asm volatile("\t" ASM_STAC "\n" \
+   "1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"  \
+   "2:\t" ASM_CLAC "\n"\
+   "\t.section .fixup, \"ax\"\n"   \
+   "3:\tmov %3, %0\n"  \
+   "\tjmp 2b\n"\
+   "\t.previous\n" \
+   _ASM_EXTABLE(1b, 3b)\
+   : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \
+   : "i" (-EFAULT), "r" (__new), "1" (__old)   \
+   : "memory"  \
+   );  \
+   break;  \
+   }   \

[PATCH 1/2] x86: add user_atomic_cmpxchg_inatomic at uaccess.h

2013-12-13 Thread Qiaowei Ren
This patch adds user_atomic_cmpxchg_inatomic() to use CMPXCHG
instruction against a user space address.

This generalizes the already existing futex_atomic_cmpxchg_inatomic()
so it can be used in other contexts.  This will be used in the
upcoming support for Intel MPX (Memory Protection Extensions).

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/uaccess.h |   91 
 1 files changed, 91 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 5838fa9..894d8bf 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -525,6 +525,97 @@ extern __must_check long strnlen_user(const char __user *str, long n);
 unsigned long __must_check clear_user(void __user *mem, unsigned long len);
 unsigned long __must_check __clear_user(void __user *mem, unsigned long len);
 
+extern void __cmpxchg_wrong_size(void)
+   __compiletime_error("Bad argument size for cmpxchg");
+
+#define __user_atomic_cmpxchg_inatomic(uval, ptr, old, new, size)  \
+({ \
+   int __ret = 0;  \
+   __typeof__(ptr) __uval = (uval);\
+   __typeof__(*(ptr)) __old = (old);   \
+   __typeof__(*(ptr)) __new = (new);   \
+   switch (size) { \
+   case 1: \
+   {   \
+   asm volatile("\t" ASM_STAC "\n" \
+   "1:\t" LOCK_PREFIX "cmpxchgb %4, %2\n"  \
+   "2:\t" ASM_CLAC "\n"\
+   "\t.section .fixup, \"ax\"\n"   \
+   "3:\tmov %3, %0\n"  \
+   "\tjmp 2b\n"\
+   "\t.previous\n" \
+   _ASM_EXTABLE(1b, 3b)\
+   : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \
+   : "i" (-EFAULT), "q" (__new), "1" (__old)   \
+   : "memory"  \
+   );  \
+   break;  \
+   }   \
+   case 2: \
+   {   \
+   asm volatile("\t" ASM_STAC "\n" \
+   "1:\t" LOCK_PREFIX "cmpxchgw %4, %2\n"  \
+   "2:\t" ASM_CLAC "\n"\
+   "\t.section .fixup, \"ax\"\n"   \
+   "3:\tmov %3, %0\n"  \
+   "\tjmp 2b\n"\
+   "\t.previous\n" \
+   _ASM_EXTABLE(1b, 3b)\
+   : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \
+   : "i" (-EFAULT), "r" (__new), "1" (__old)   \
+   : "memory"  \
+   );  \
+   break;  \
+   }   \
+   case 4: \
+   {   \
+   asm volatile("\t" ASM_STAC "\n" \
+   "1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"  \
+   "2:\t" ASM_CLAC "\n"\
+   "\t.section .fixup, \"ax\"\n"   \
+   "3:\tmov %3, %0\n"  \
+   "\tjmp 2b\n"\
+   "\t.previous\n" \
+   _ASM_EXTABLE(1b, 3b)\
+   : "+r" (__ret), "=a" (__old), "+m" (*(ptr)) \
+   : "i" (-EFAULT), "r" (__new), "1" (__old)   \
+   : "memory"  \
+   );  \
+   break;  \
+   }   \

[PATCH 2/2] x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic

2013-12-13 Thread Qiaowei Ren
futex_atomic_cmpxchg_inatomic() is simply the 32-bit implementation
of user_atomic_cmpxchg_inatomic(). This patch replaces it with
user_atomic_cmpxchg_inatomic().

Signed-off-by: Qiaowei Ren 
---
 arch/x86/include/asm/futex.h |   27 ++-
 1 files changed, 2 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h
index be27ba1..a9f7de4 100644
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -41,6 +41,8 @@
   "+m" (*uaddr), "=&r" (tem)   \
 : "r" (oparg), "i" (-EFAULT), "1" (0))
 
+#define futex_atomic_cmpxchg_inatomic user_atomic_cmpxchg_inatomic
+
 static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
 {
int op = (encoded_op >> 28) & 7;
@@ -107,30 +109,5 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
return ret;
 }
 
-static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
-   u32 oldval, u32 newval)
-{
-   int ret = 0;
-
-   if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
-   return -EFAULT;
-
-   asm volatile("\t" ASM_STAC "\n"
-"1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
-"2:\t" ASM_CLAC "\n"
-"\t.section .fixup, \"ax\"\n"
-"3:\tmov %3, %0\n"
-"\tjmp 2b\n"
-"\t.previous\n"
-_ASM_EXTABLE(1b, 3b)
-: "+r" (ret), "=a" (oldval), "+m" (*uaddr)
-: "i" (-EFAULT), "r" (newval), "1" (oldval)
-: "memory"
-   );
-
-   *uval = oldval;
-   return ret;
-}
-
 #endif
 #endif /* _ASM_X86_FUTEX_H */
-- 
1.7.1



[PATCH] Documentation: move intel_txt.txt to Documentation/x86

2013-12-09 Thread Qiaowei Ren
Documentation/x86 is a more fitting place for intel_txt.txt.

Signed-off-by: Qiaowei Ren 
---
 Documentation/intel_txt.txt |  210 ---
 Documentation/x86/intel_txt.txt |  210 +++
 2 files changed, 210 insertions(+), 210 deletions(-)
 delete mode 100644 Documentation/intel_txt.txt
 create mode 100644 Documentation/x86/intel_txt.txt

diff --git a/Documentation/intel_txt.txt b/Documentation/intel_txt.txt
deleted file mode 100644
index 91d89c5..000
--- a/Documentation/intel_txt.txt
+++ /dev/null
@@ -1,210 +0,0 @@
-Intel(R) TXT Overview:
-=
-
-Intel's technology for safer computing, Intel(R) Trusted Execution
-Technology (Intel(R) TXT), defines platform-level enhancements that
-provide the building blocks for creating trusted platforms.
-
-Intel TXT was formerly known by the code name LaGrande Technology (LT).
-
-Intel TXT in Brief:
-o  Provides dynamic root of trust for measurement (DRTM)
-o  Data protection in case of improper shutdown
-o  Measurement and verification of launched environment
-
-Intel TXT is part of the vPro(TM) brand and is also available some
-non-vPro systems.  It is currently available on desktop systems
-based on the Q35, X38, Q45, and Q43 Express chipsets (e.g. Dell
-Optiplex 755, HP dc7800, etc.) and mobile systems based on the GM45,
-PM45, and GS45 Express chipsets.
-
-For more information, see http://www.intel.com/technology/security/.
-This site also has a link to the Intel TXT MLE Developers Manual,
-which has been updated for the new released platforms.
-
-Intel TXT has been presented at various events over the past few
-years, some of which are:
-  LinuxTAG 2008:
-  http://www.linuxtag.org/2008/en/conf/events/vp-donnerstag.html
-  TRUST2008:
-  http://www.trust-conference.eu/downloads/Keynote-Speakers/
-  3_David-Grawrock_The-Front-Door-of-Trusted-Computing.pdf
-  IDF, Shanghai:
-  http://www.prcidf.com.cn/index_en.html
-  IDFs 2006, 2007 (I'm not sure if/where they are online)
-
-Trusted Boot Project Overview:
-==============================
-
-Trusted Boot (tboot) is an open source, pre-kernel/VMM module that
-uses Intel TXT to perform a measured and verified launch of an OS
-kernel/VMM.
-
-It is hosted on SourceForge at http://sourceforge.net/projects/tboot.
-The mercurial source repo is available at http://www.bughost.org/
-repos.hg/tboot.hg.
-
-Tboot currently supports launching Xen (open source VMM/hypervisor
-w/ TXT support since v3.2), and now Linux kernels.
-
-
-Value Proposition for Linux or "Why should you care?"
-=====================================================
-
-While there are many products and technologies that attempt to
-measure or protect the integrity of a running kernel, they all
-assume the kernel is "good" to begin with.  The Integrity
-Measurement Architecture (IMA) and Linux Integrity Module interface
-are examples of such solutions.
-
-To get trust in the initial kernel without using Intel TXT, a
-static root of trust must be used.  This bases trust in BIOS
-starting at system reset and requires measurement of all code
-executed between system reset through the completion of the kernel
-boot as well as data objects used by that code.  In the case of a
-Linux kernel, this means all of BIOS, any option ROMs, the
-bootloader and the boot config.  In practice, this is a lot of
-code/data, much of which is subject to change from boot to boot
-(e.g. changing NICs may change option ROMs).  Without reference
-hashes, these measurement changes are difficult to assess or
-confirm as benign.  This process also does not provide DMA
-protection, memory configuration/alias checks and locks, crash
-protection, or policy support.
-
-By using the hardware-based root of trust that Intel TXT provides,
-many of these issues can be mitigated.  Specifically: many
-pre-launch components can be removed from the trust chain, DMA
-protection is provided to all launched components, a large number
-of platform configuration checks are performed and values locked,
-protection is provided for any data in the event of an improper
-shutdown, and there is support for policy-based execution/verification.
-This provides a more stable measurement and a higher assurance of
-system configuration and initial state than would be otherwise
-possible.  Since the tboot project is open source, source code for
-almost all parts of the trust chain is available (excepting SMM and
-Intel-provided firmware).
-
-How Does it Work?
-=================
-
-o  Tboot is an executable that is launched by the bootloader as
-   the "kernel" (the binary the bootloader executes).
-o  It performs all of the work necessary to determine if the
-   platform supports Intel TXT and, if so, executes the GETSEC[SENTER]
-   processor instruction that initiates the dynamic root of trust.
-   -  If tboot determines that the system doe

[tip:x86/cpufeature] x86, xsave: Support eager-only xsave features, add MPX support

2013-12-06 Thread tip-bot for Qiaowei Ren
Commit-ID:  e7d820a5e549b3eb6c3f9467507566565646a669
Gitweb: http://git.kernel.org/tip/e7d820a5e549b3eb6c3f9467507566565646a669
Author: Qiaowei Ren 
AuthorDate: Thu, 5 Dec 2013 17:15:34 +0800
Committer:  H. Peter Anvin 
CommitDate: Fri, 6 Dec 2013 17:17:42 -0800

x86, xsave: Support eager-only xsave features, add MPX support

Some features, like Intel MPX, work only if the kernel uses the eagerfpu
model.  So we should force eagerfpu on unless the user has explicitly
disabled it.

Add definitions for Intel MPX and add it to the supported list.

[ hpa: renamed XSTATE_FLEXIBLE to XSTATE_LAZY and added comments ]

Signed-off-by: Qiaowei Ren 
Link: http://lkml.kernel.org/r/9e0be1322f2f2246bd820da9fc397ade014a6...@shsmsx102.ccr.corp.intel.com
Signed-off-by: H. Peter Anvin 
---
 arch/x86/include/asm/processor.h | 23 +++
 arch/x86/include/asm/xsave.h | 14 ++
 arch/x86/kernel/xsave.c  | 10 ++
 3 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 7b034a4..b7845a1 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -370,6 +370,26 @@ struct ymmh_struct {
u32 ymmh_space[64];
 };
 
+struct lwp_struct {
+   u64 lwpcb_addr;
+   u32 flags;
+   u32 buf_head_offset;
+   u64 buf_base;
+   u32 buf_size;
+   u32 filters;
+   u64 saved_event_record[4];
+   u32 event_counter[16];
+};
+
+struct bndregs_struct {
+   u64 bndregs[8];
+} __packed;
+
+struct bndcsr_struct {
+   u64 cfg_reg_u;
+   u64 status_reg;
+} __packed;
+
 struct xsave_hdr_struct {
u64 xstate_bv;
u64 reserved1[2];
@@ -380,6 +400,9 @@ struct xsave_struct {
struct i387_fxsave_struct i387;
struct xsave_hdr_struct xsave_hdr;
struct ymmh_struct ymmh;
+   struct lwp_struct lwp;
+   struct bndregs_struct bndregs;
+   struct bndcsr_struct bndcsr;
/* new processor state extensions will go here */
 } __attribute__ ((packed, aligned (64)));
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 0415cda..5547389 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -9,6 +9,8 @@
 #define XSTATE_FP  0x1
 #define XSTATE_SSE 0x2
 #define XSTATE_YMM 0x4
+#define XSTATE_BNDREGS 0x8
+#define XSTATE_BNDCSR  0x10
 
 #define XSTATE_FPSSE   (XSTATE_FP | XSTATE_SSE)
 
@@ -20,10 +22,14 @@
 #define XSAVE_YMM_SIZE 256
 #define XSAVE_YMM_OFFSET    (XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)
 
-/*
- * These are the features that the OS can handle currently.
- */
-#define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+/* Supported features which support lazy state saving */
+#define XSTATE_LAZY    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+
+/* Supported features which require eager state saving */
+#define XSTATE_EAGER   (XSTATE_BNDREGS | XSTATE_BNDCSR)
+
+/* All currently supported features */
+#define XCNTXT_MASK    (XSTATE_LAZY | XSTATE_EAGER)
 
 #ifdef CONFIG_X86_64
 #define REX_PREFIX "0x48, "
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 422fd82..a4b451c 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -562,6 +562,16 @@ static void __init xstate_enable_boot_cpu(void)
if (cpu_has_xsaveopt && eagerfpu != DISABLE)
eagerfpu = ENABLE;
 
+   if (pcntxt_mask & XSTATE_EAGER) {
+   if (eagerfpu == DISABLE) {
+   pr_err("eagerfpu not present, disabling some xstate features: 0x%llx\n",
+   pcntxt_mask & XSTATE_EAGER);
+   pcntxt_mask &= ~XSTATE_EAGER;
+   } else {
+   eagerfpu = ENABLE;
+   }
+   }
+
pr_info("enabled xstate_bv 0x%llx, cntxt size 0x%x\n",
pcntxt_mask, xstate_size);
 }


[PATCH v3 3/3] X86, mpx: Intel MPX xstate feature definition

2013-12-06 Thread Qiaowei Ren
This patch defines the MPX xstate feature bits and extends struct
xsave_struct to support Intel MPX.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Xudong Hao 
Signed-off-by: Liu Jinsong 
---
 arch/x86/include/asm/processor.h |   12 
 arch/x86/include/asm/xsave.h |6 +-
 2 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 987c75e..2fe2e75 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -370,6 +370,15 @@ struct ymmh_struct {
u32 ymmh_space[64];
 };
 
+struct bndregs_struct {
+   u64 bndregs[8];
+} __packed;
+
+struct bndcsr_struct {
+   u64 cfg_reg_u;
+   u64 status_reg;
+} __packed;
+
 struct xsave_hdr_struct {
u64 xstate_bv;
u64 reserved1[2];
@@ -380,6 +389,9 @@ struct xsave_struct {
struct i387_fxsave_struct i387;
struct xsave_hdr_struct xsave_hdr;
struct ymmh_struct ymmh;
+   u8 lwp_area[128];
+   struct bndregs_struct bndregs;
+   struct bndcsr_struct bndcsr;
/* new processor state extensions will go here */
 } __attribute__ ((packed, aligned (64)));
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 0415cda..5cd9de3 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -9,6 +9,8 @@
 #define XSTATE_FP  0x1
 #define XSTATE_SSE 0x2
 #define XSTATE_YMM 0x4
+#define XSTATE_BNDREGS 0x8
+#define XSTATE_BNDCSR  0x10
 
 #define XSTATE_FPSSE   (XSTATE_FP | XSTATE_SSE)
 
@@ -20,10 +22,12 @@
 #define XSAVE_YMM_SIZE 256
 #define XSAVE_YMM_OFFSET    (XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)
 
+#define XSTATE_FLEXIBLE (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XSTATE_EAGER   (XSTATE_BNDREGS | XSTATE_BNDCSR)
 /*
  * These are the features that the OS can handle currently.
  */
-#define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XCNTXT_MASK    (XSTATE_FLEXIBLE | XSTATE_EAGER)
 
 #ifdef CONFIG_X86_64
 #define REX_PREFIX "0x48, "
-- 
1.7.1



[tip:x86/cpufeature] x86, cpufeature: Define the Intel MPX feature flag

2013-12-06 Thread tip-bot for Qiaowei Ren
Commit-ID:  191f57c137bcce0e3e9313acb77b2f114d15afbb
Gitweb: http://git.kernel.org/tip/191f57c137bcce0e3e9313acb77b2f114d15afbb
Author: Qiaowei Ren 
AuthorDate: Sat, 7 Dec 2013 08:20:57 +0800
Committer:  H. Peter Anvin 
CommitDate: Fri, 6 Dec 2013 10:21:44 -0800

x86, cpufeature: Define the Intel MPX feature flag

Define the Intel MPX (Memory Protection Extensions) CPU feature flag
in the cpufeature list.

Signed-off-by: Qiaowei Ren 
Link: http://lkml.kernel.org/r/1386375658-2191-2-git-send-email-qiaowei@intel.com
Signed-off-by: Xudong Hao 
Signed-off-by: Liu Jinsong 
Signed-off-by: H. Peter Anvin 
---
 arch/x86/include/asm/cpufeature.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 89270b4..e099f95 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -216,6 +216,7 @@
 #define X86_FEATURE_ERMS   (9*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID(9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM    (9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_MPX    (9*32+14) /* Memory Protection Extension */
 #define X86_FEATURE_RDSEED (9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX    (9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP   (9*32+20) /* Supervisor Mode Access Prevention */


[PATCH v2 1/3] x86, mpx: add documentation on Intel MPX

2013-12-06 Thread Qiaowei Ren
This patch adds the Documentation/x86/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Xudong Hao 
Signed-off-by: Liu Jinsong 
---
 Documentation/x86/intel_mpx.txt |   76 +++
 1 files changed, 76 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/x86/intel_mpx.txt

diff --git a/Documentation/x86/intel_mpx.txt b/Documentation/x86/intel_mpx.txt
new file mode 100644
index 000..778d06e
--- /dev/null
+++ b/Documentation/x86/intel_mpx.txt
@@ -0,0 +1,76 @@
+Intel(R) MPX Overview:
+======================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX can
+increase the robustness of software when it is used in conjunction
+with compiler changes to check that memory references intended
+at compile time do not become unsafe at runtime.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. A direct benefit Intel MPX provides
+is hardening software against malicious attacks designed to
+cause or exploit buffer overruns.
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+Intel(R) MPX Programming Model
+------------------------------
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+--------------------------------------------
+
+Intel MPX defines two instructions for load/store of the linear
+address of a pointer to a buffer, along with the bounds of the
+buffer into a paging structure of extended bounds. Specifically
+when storing extended bounds, the processor will perform address
+translation of the address where the pointer is stored to an
+address in the Bound Table (BT) to determine the store location
+of extended bounds. Loading of extended bounds performs the
+reverse sequence.
+
+The structure in memory to load/store an extended bound is a
+4-tuple consisting of lower bound, upper bound, pointer value
+and a reserved field. Bound loads and stores access 32-bit or
+64-bit operand size according to the operation mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. And the linear address of the bound
+directory is derived from either BNDCFGU or BNDCFGS registers.
+Bounds in memory are stored in Bound Tables (BT) as an extended
+bound, which are accessed via Bound Directory (BD) and address
+translation performed by BNDLDX/BNDSTX instructions.
+
+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.
+
+XSAVE/XRSTOR Support of Intel MPX State
+---------------------------------------
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+(BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 is
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
-- 
1.7.1



[PATCH v2 3/3] X86, mpx: Intel MPX xstate feature definition

2013-12-06 Thread Qiaowei Ren
This patch defines the MPX xstate feature bits and extends struct
xsave_struct to support Intel MPX.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Xudong Hao 
Signed-off-by: Liu Jinsong 
---
 arch/x86/include/asm/processor.h |   12 
 arch/x86/include/asm/xsave.h |5 -
 2 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 987c75e..2fe2e75 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -370,6 +370,15 @@ struct ymmh_struct {
u32 ymmh_space[64];
 };
 
+struct bndregs_struct {
+   u64 bndregs[8];
+} __packed;
+
+struct bndcsr_struct {
+   u64 cfg_reg_u;
+   u64 status_reg;
+} __packed;
+
 struct xsave_hdr_struct {
u64 xstate_bv;
u64 reserved1[2];
@@ -380,6 +389,9 @@ struct xsave_struct {
struct i387_fxsave_struct i387;
struct xsave_hdr_struct xsave_hdr;
struct ymmh_struct ymmh;
+   u8 lwp_area[128];
+   struct bndregs_struct bndregs;
+   struct bndcsr_struct bndcsr;
/* new processor state extensions will go here */
 } __attribute__ ((packed, aligned (64)));
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 0415cda..7fa8855 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -9,6 +9,8 @@
 #define XSTATE_FP  0x1
 #define XSTATE_SSE 0x2
 #define XSTATE_YMM 0x4
+#define XSTATE_BNDREGS 0x8
+#define XSTATE_BNDCSR  0x10
 
 #define XSTATE_FPSSE   (XSTATE_FP | XSTATE_SSE)
 
@@ -20,10 +22,11 @@
 #define XSAVE_YMM_SIZE 256
 #define XSAVE_YMM_OFFSET    (XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)
 
+#define XSTATE_EAGER   (XSTATE_BNDREGS | XSTATE_BNDCSR)
 /*
  * These are the features that the OS can handle currently.
  */
-#define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_EAGER)
 
 #ifdef CONFIG_X86_64
 #define REX_PREFIX "0x48, "
-- 
1.7.1



[PATCH v2 2/3] X86, mpx: Intel MPX CPU feature definition

2013-12-06 Thread Qiaowei Ren
This patch defines the Intel MPX CPU feature flag.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Xudong Hao 
Signed-off-by: Liu Jinsong 
---
 arch/x86/include/asm/cpufeature.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index d3f5c63..ef9f9c2 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -216,6 +216,7 @@
 #define X86_FEATURE_ERMS   (9*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID(9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM    (9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_MPX    (9*32+14) /* Memory Protection Extension */
 #define X86_FEATURE_RDSEED (9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX    (9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP   (9*32+20) /* Supervisor Mode Access Prevention */
-- 
1.7.1



[PATCH 3/3] X86, mpx: Intel MPX xstate feature definition

2013-12-06 Thread Qiaowei Ren

Signed-off-by: Qiaowei Ren 
Signed-off-by: Xudong Hao 
Signed-off-by: Liu Jinsong 
---
 arch/x86/include/asm/processor.h |   23 +++
 arch/x86/include/asm/xsave.h |6 +-
 2 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 987c75e..43be6f6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -370,6 +370,26 @@ struct ymmh_struct {
u32 ymmh_space[64];
 };
 
+struct lwp_struct {
+   u64 lwpcb_addr;
+   u32 flags;
+   u32 buf_head_offset;
+   u64 buf_base;
+   u32 buf_size;
+   u32 filters;
+   u64 saved_event_record[4];
+   u32 event_counter[16];
+};
+
+struct bndregs_struct {
+   u64 bndregs[8];
+} __packed;
+
+struct bndcsr_struct {
+   u64 cfg_reg_u;
+   u64 status_reg;
+} __packed;
+
 struct xsave_hdr_struct {
u64 xstate_bv;
u64 reserved1[2];
@@ -380,6 +400,9 @@ struct xsave_struct {
struct i387_fxsave_struct i387;
struct xsave_hdr_struct xsave_hdr;
struct ymmh_struct ymmh;
+   struct lwp_struct lwp;
+   struct bndregs_struct bndregs;
+   struct bndcsr_struct bndcsr;
/* new processor state extensions will go here */
 } __attribute__ ((packed, aligned (64)));
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 0415cda..5cd9de3 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -9,6 +9,8 @@
 #define XSTATE_FP  0x1
 #define XSTATE_SSE 0x2
 #define XSTATE_YMM 0x4
+#define XSTATE_BNDREGS 0x8
+#define XSTATE_BNDCSR  0x10
 
 #define XSTATE_FPSSE   (XSTATE_FP | XSTATE_SSE)
 
@@ -20,10 +22,12 @@
 #define XSAVE_YMM_SIZE 256
 #define XSAVE_YMM_OFFSET    (XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)
 
+#define XSTATE_FLEXIBLE (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XSTATE_EAGER   (XSTATE_BNDREGS | XSTATE_BNDCSR)
 /*
  * These are the features that the OS can handle currently.
  */
-#define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM)
+#define XCNTXT_MASK    (XSTATE_FLEXIBLE | XSTATE_EAGER)
 
 #ifdef CONFIG_X86_64
 #define REX_PREFIX "0x48, "
-- 
1.7.1



[PATCH 1/3] x86, mpx: add documentation on Intel MPX

2013-12-06 Thread Qiaowei Ren
This patch adds the Documentation/intel_mpx.txt file with some
information about Intel MPX.

Signed-off-by: Qiaowei Ren 
Signed-off-by: Xudong Hao 
Signed-off-by: Liu Jinsong 
---
 Documentation/intel_mpx.txt |   77 +++
 1 files changed, 77 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/intel_mpx.txt

diff --git a/Documentation/intel_mpx.txt b/Documentation/intel_mpx.txt
new file mode 100644
index 000..3d947d0
--- /dev/null
+++ b/Documentation/intel_mpx.txt
@@ -0,0 +1,77 @@
+Intel(R) MPX Overview:
+======================
+
+Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new
+capability introduced into Intel Architecture. Intel MPX can
+increase the robustness of software when it is used in conjunction
+with compiler changes to check memory references whose
+compile-time intentions are usurped at runtime due to buffer
+overflow or underflow.
+
+Two of the most important goals of Intel MPX are to provide
+this capability at very low performance overhead for newly
+compiled code, and to provide compatibility mechanisms with
+legacy software components. A direct benefit Intel MPX provides
+is hardening software against malicious attacks designed to
+cause or exploit buffer overruns.
+
+For details about the Intel MPX instructions, see "Intel(R)
+Architecture Instruction Set Extensions Programming Reference".
+
+Intel(R) MPX Programming Model
+------------------------------
+
+Intel MPX introduces new registers and new instructions that
+operate on these registers. Some of the registers added are
+bounds registers which store a pointer's lower bound and upper
+bound limits. Whenever the pointer is used, the requested
+reference is checked against the pointer's associated bounds,
+thereby preventing out-of-bound memory access (such as buffer
+overflows and overruns). Out-of-bounds memory references
+initiate a #BR exception which can then be handled in an
+appropriate manner.
+
+Loading and Storing Bounds using Translation
+--------------------------------------------
+
+Intel MPX defines two instructions for load/store of the linear
+address of a pointer to a buffer, along with the bounds of the
+buffer into a paging structure of extended bounds. Specifically
+when storing extended bounds, the processor will perform address
+translation of the address where the pointer is stored to an
+address in the Bound Table (BT) to determine the store location
+of extended bounds. Loading of extended bounds performs the
+reverse sequence.
+
+The structure in memory to load/store an extended bound is a
+4-tuple consisting of lower bound, upper bound, pointer value
+and a reserved field. Bound loads and stores access 32-bit or
+64-bit operand size according to the operation mode. Thus,
+a bound table entry is 4*32 bits in 32-bit mode and 4*64 bits
+in 64-bit mode.
+
+The linear address of a bound table is stored in a Bound
+Directory (BD) entry. And the linear address of the bound
+directory is derived from either BNDCFGU or BNDCFGS registers.
+Bounds in memory are stored in Bound Tables (BT) as an extended
+bound, which are accessed via Bound Directory (BD) and address
+translation performed by BNDLDX/BNDSTX instructions.
+
+Bounds Directory (BD) and Bounds Tables (BT) are stored in
+application memory and are allocated by the application (in case
+of kernel use, the structures will be in kernel memory). The
+bound directory and each instance of bound table are in contiguous
+linear memory.
+
+XSAVE/XRSTOR Support of Intel MPX State
+---------------------------------------
+
+Enabling Intel MPX requires an OS to manage two bits in XCR0:
+  - BNDREGS for saving and restoring registers BND0-BND3,
+  - BNDCSR for saving and restoring the user-mode configuration
+(BNDCFGU) and the status register (BNDSTATUS).
+
+The reason for having two separate bits is that BND0-BND3 is
+likely to be volatile state, while BNDCFGU and BNDSTATUS are not.
+Therefore, an OS has flexibility in handling these two states
+differently in saving or restoring them.
-- 
1.7.1



[PATCH 2/3] X86, mpx: Intel MPX definition

2013-12-06 Thread Qiaowei Ren

Signed-off-by: Qiaowei Ren 
Signed-off-by: Xudong Hao 
Signed-off-by: Liu Jinsong 
---
 arch/x86/include/asm/cpufeature.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index d3f5c63..6c2738d 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -216,6 +216,7 @@
 #define X86_FEATURE_ERMS   (9*32+ 9) /* Enhanced REP MOVSB/STOSB */
 #define X86_FEATURE_INVPCID(9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM    (9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_MPX    (9*32+14) /* Memory Protection Extension */
 #define X86_FEATURE_RDSEED (9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX    (9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP   (9*32+20) /* Supervisor Mode Access Prevention */
@@ -330,6 +331,7 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_perfctr_l2 boot_cpu_has(X86_FEATURE_PERFCTR_L2)
 #define cpu_has_cx8        boot_cpu_has(X86_FEATURE_CX8)
 #define cpu_has_cx16       boot_cpu_has(X86_FEATURE_CX16)
+#define cpu_has_mpx        boot_cpu_has(X86_FEATURE_MPX)
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoext    boot_cpu_has(X86_FEATURE_TOPOEXT)
 
-- 
1.7.1



[PATCH v5] x86, tboot: iomem fixes

2013-07-20 Thread Qiaowei Ren
The current code does not use the dedicated accessor interface when
touching I/O memory, which can lead to subtle bugs. Fix this by
using the proper I/O accessor APIs.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/kernel/tboot.c |   18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..4e149c7 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -468,7 +468,8 @@ struct sinit_mle_data {
 
struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-   void *heap_base, *heap_ptr, *config;
+   void __iomem *heap_base, *heap_ptr, *config;
+   u32 dmar_tbl_off;
 
if (!tboot_enabled())
return dmar_tbl;
@@ -485,25 +486,26 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
return NULL;
 
/* now map TXT heap */
-   heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-   *(u64 *)(config + TXTCR_HEAP_SIZE));
+   heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+   readl(config + TXTCR_HEAP_SIZE));
iounmap(config);
if (!heap_base)
return NULL;
 
/* walk heap to SinitMleData */
/* skip BiosData */
-   heap_ptr = heap_base + *(u64 *)heap_base;
+   heap_ptr = heap_base + readq(heap_base);
/* skip OsMleData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* skip OsSinitData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* now points to SinitMleDataSize; set to SinitMleData */
heap_ptr += sizeof(u64);
/* get addr of DMAR table */
+   dmar_tbl_off = readl(heap_ptr +
+   offsetof(struct sinit_mle_data, vtd_dmars_off));
dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-  ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-  sizeof(u64));
+   dmar_tbl_off - sizeof(u64));
 
/* don't unmap heap because dmar.c needs access to this */
 
-- 
1.7.9.5



[PATCH] x86, tboot: iomem fixes

2013-07-19 Thread Qiaowei Ren
The current code does not use the dedicated accessor interface when
touching I/O memory, which can lead to subtle bugs. Fix this by
using the proper I/O accessor APIs.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/kernel/tboot.c |   19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..c902237 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -466,9 +466,12 @@ struct sinit_mle_data {
u32   vtd_dmars_off;
 } __packed;
 
+#define SINIT_MLE_DATA_VTD_DMAR_OFF    140
+
struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-   void *heap_base, *heap_ptr, *config;
+   void __iomem *heap_base, *heap_ptr, *config;
+   u32 dmar_tbl_off;
 
if (!tboot_enabled())
return dmar_tbl;
@@ -485,25 +488,25 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
return NULL;
 
/* now map TXT heap */
-   heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-   *(u64 *)(config + TXTCR_HEAP_SIZE));
+   heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+   readl(config + TXTCR_HEAP_SIZE));
iounmap(config);
if (!heap_base)
return NULL;
 
/* walk heap to SinitMleData */
/* skip BiosData */
-   heap_ptr = heap_base + *(u64 *)heap_base;
+   heap_ptr = heap_base + readq(heap_base);
/* skip OsMleData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* skip OsSinitData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* now points to SinitMleDataSize; set to SinitMleData */
heap_ptr += sizeof(u64);
/* get addr of DMAR table */
+   dmar_tbl_off = readl(heap_ptr + SINIT_MLE_DATA_VTD_DMAR_OFF);
dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-  ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-  sizeof(u64));
+   dmar_tbl_off - sizeof(u64));
 
/* don't unmap heap because dmar.c needs access to this */
 
-- 
1.7.9.5



[PATCH v3] x86, tboot: iomem fixes

2013-07-18 Thread Qiaowei Ren
The current code does not use the dedicated accessor interface when
touching I/O memory, which can lead to subtle bugs. Fix this by
using the proper I/O accessor APIs.

Signed-off-by: Qiaowei Ren 
---
 arch/x86/kernel/tboot.c |   21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..afe8cf8 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -466,9 +466,12 @@ struct sinit_mle_data {
u32   vtd_dmars_off;
 } __packed;
 
+#define SINIT_MLE_DATA_VTD_DMAR_OFF 140
+
 struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-   void *heap_base, *heap_ptr, *config;
+   void __iomem *heap_base, *heap_ptr, *config;
+   u32 dmar_tbl_off;
 
if (!tboot_enabled())
return dmar_tbl;
@@ -485,25 +488,25 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
return NULL;
 
/* now map TXT heap */
-   heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-   *(u64 *)(config + TXTCR_HEAP_SIZE));
+   heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+   readl(config + TXTCR_HEAP_SIZE));
iounmap(config);
if (!heap_base)
return NULL;
 
/* walk heap to SinitMleData */
/* skip BiosData */
-   heap_ptr = heap_base + *(u64 *)heap_base;
+   heap_ptr = heap_base + readq(heap_base);
/* skip OsMleData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* skip OsSinitData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* now points to SinitMleDataSize; set to SinitMleData */
heap_ptr += sizeof(u64);
/* get addr of DMAR table */
-   dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-  ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-  sizeof(u64));
+   dmar_tbl_off = readl(heap_ptr + SINIT_MLE_DATA_VTD_DMAR_OFF);
+   memcpy_fromio(dmar_tbl, heap_ptr + dmar_tbl_off - sizeof(u64),
+   sizeof(struct acpi_table_header));
 
/* don't unmap heap because dmar.c needs access to this */
 
-- 
1.7.9.5



[PATCH v2] x86, tboot: iomem fixes

2013-07-17 Thread Qiaowei Ren
Fixes for iomem annotations in arch/x86/kernel/tboot.c

Signed-off-by: Qiaowei Ren 
---
 arch/x86/kernel/tboot.c |   21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..afe8cf8 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -466,9 +466,12 @@ struct sinit_mle_data {
u32   vtd_dmars_off;
 } __packed;
 
+#define SINIT_MLE_DATA_VTD_DMAR_OFF 140
+
 struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-   void *heap_base, *heap_ptr, *config;
+   void __iomem *heap_base, *heap_ptr, *config;
+   u32 dmar_tbl_off;
 
if (!tboot_enabled())
return dmar_tbl;
@@ -485,25 +488,25 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
return NULL;
 
/* now map TXT heap */
-   heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-   *(u64 *)(config + TXTCR_HEAP_SIZE));
+   heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+   readl(config + TXTCR_HEAP_SIZE));
iounmap(config);
if (!heap_base)
return NULL;
 
/* walk heap to SinitMleData */
/* skip BiosData */
-   heap_ptr = heap_base + *(u64 *)heap_base;
+   heap_ptr = heap_base + readq(heap_base);
/* skip OsMleData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* skip OsSinitData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* now points to SinitMleDataSize; set to SinitMleData */
heap_ptr += sizeof(u64);
/* get addr of DMAR table */
-   dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-  ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-  sizeof(u64));
+   dmar_tbl_off = readl(heap_ptr + SINIT_MLE_DATA_VTD_DMAR_OFF);
+   memcpy_fromio(dmar_tbl, heap_ptr + dmar_tbl_off - sizeof(u64),
+   sizeof(struct acpi_table_header));
 
/* don't unmap heap because dmar.c needs access to this */
 
-- 
1.7.9.5



[PATCH] x86, tboot: iomem fixes

2013-07-04 Thread Qiaowei Ren
Fixes for iomem annotations in arch/x86/kernel/tboot.c

Signed-off-by: Qiaowei Ren 
---
 arch/x86/kernel/tboot.c |   43 +++
 1 file changed, 11 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 3ff42d2..d06574c 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -442,33 +442,12 @@ late_initcall(tboot_late_init);
 #define TXTCR_HEAP_BASE 0x0300
 #define TXTCR_HEAP_SIZE 0x0308
 
-#define SHA1_SIZE  20
-
-struct sha1_hash {
-   u8 hash[SHA1_SIZE];
-};
-
-struct sinit_mle_data {
-   u32   version; /* currently 6 */
-   struct sha1_hash  bios_acm_id;
-   u32   edx_senter_flags;
-   u64   mseg_valid;
-   struct sha1_hash  sinit_hash;
-   struct sha1_hash  mle_hash;
-   struct sha1_hash  stm_hash;
-   struct sha1_hash  lcp_policy_hash;
-   u32   lcp_policy_control;
-   u32   rlp_wakeup_addr;
-   u32   reserved;
-   u32   num_mdrs;
-   u32   mdrs_off;
-   u32   num_vtd_dmars;
-   u32   vtd_dmars_off;
-} __packed;
+#define SINIT_MLE_DATA_VTD_DMAR_OFF 140
 
 struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tbl)
 {
-   void *heap_base, *heap_ptr, *config;
+   void __iomem *heap_base, *heap_ptr, *config;
+   u32 dmar_tbl_off;
 
if (!tboot_enabled())
return dmar_tbl;
@@ -485,25 +464,25 @@ struct acpi_table_header *tboot_get_dmar_table(struct acpi_table_header *dmar_tb
return NULL;
 
/* now map TXT heap */
-   heap_base = ioremap(*(u64 *)(config + TXTCR_HEAP_BASE),
-   *(u64 *)(config + TXTCR_HEAP_SIZE));
+   heap_base = ioremap(readl(config + TXTCR_HEAP_BASE),
+   readl(config + TXTCR_HEAP_SIZE));
iounmap(config);
if (!heap_base)
return NULL;
 
/* walk heap to SinitMleData */
/* skip BiosData */
-   heap_ptr = heap_base + *(u64 *)heap_base;
+   heap_ptr = heap_base + readq(heap_base);
/* skip OsMleData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* skip OsSinitData */
-   heap_ptr += *(u64 *)heap_ptr;
+   heap_ptr += readq(heap_ptr);
/* now points to SinitMleDataSize; set to SinitMleData */
heap_ptr += sizeof(u64);
/* get addr of DMAR table */
-   dmar_tbl = (struct acpi_table_header *)(heap_ptr +
-  ((struct sinit_mle_data *)heap_ptr)->vtd_dmars_off -
-  sizeof(u64));
+   dmar_tbl_off = readl(heap_ptr + SINIT_MLE_DATA_VTD_DMAR_OFF);
+   memcpy_fromio(dmar_tbl, heap_ptr + dmar_tbl_off - sizeof(u64),
+   sizeof(struct acpi_table_header));
 
/* don't unmap heap because dmar.c needs access to this */
 
-- 
1.7.9.5



[tip:x86/debug] x86/tboot: Provide debugfs interfaces to access TXT log

2013-06-28 Thread tip-bot for Qiaowei Ren
Commit-ID:  13bfd47a0ef68fc8b21e67873dbdf269c7db6b59
Gitweb: http://git.kernel.org/tip/13bfd47a0ef68fc8b21e67873dbdf269c7db6b59
Author: Qiaowei Ren 
AuthorDate: Mon, 24 Jun 2013 13:55:33 +0800
Committer:  Ingo Molnar 
CommitDate: Fri, 28 Jun 2013 11:05:16 +0200

x86/tboot: Provide debugfs interfaces to access TXT log

These logs come from tboot (Trusted Boot, an open source,
pre-kernel/VMM module that uses Intel TXT to perform a
measured and verified launch of an OS kernel/VMM).

Signed-off-by: Qiaowei Ren 
Acked-by: H. Peter Anvin 
Cc: Gang Wei 
Link: http://lkml.kernel.org/r/137205-21788-1-git-send-email-qiaowei@intel.com
[ Beautified the code a bit. ]
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/tboot.c | 73 +
 1 file changed, 73 insertions(+)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index f84fe00..3ff42d2 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -338,6 +339,73 @@ static struct notifier_block tboot_cpu_notifier __cpuinitdata =
.notifier_call = tboot_cpu_callback,
 };
 
+#ifdef CONFIG_DEBUG_FS
+
+#define TBOOT_LOG_UUID { 0x26, 0x25, 0x19, 0xc0, 0x30, 0x6b, 0xb4, 0x4d, \
+ 0x4c, 0x84, 0xa3, 0xe9, 0x53, 0xb8, 0x81, 0x74 }
+
+#define TBOOT_SERIAL_LOG_ADDR  0x60000
+#define TBOOT_SERIAL_LOG_SIZE  0x08000
+#define LOG_MAX_SIZE_OFF   16
+#define LOG_BUF_OFF24
+
+static uint8_t tboot_log_uuid[16] = TBOOT_LOG_UUID;
+
+static ssize_t tboot_log_read(struct file *file, char __user *user_buf, size_t count, loff_t *ppos)
+{
+   void __iomem *log_base;
+   u8 log_uuid[16];
+   u32 max_size;
+   void *kbuf;
+   int ret = -EFAULT;
+
+   log_base = ioremap_nocache(TBOOT_SERIAL_LOG_ADDR, TBOOT_SERIAL_LOG_SIZE);
+   if (!log_base)
+   return ret;
+
+   memcpy_fromio(log_uuid, log_base, sizeof(log_uuid));
+   if (memcmp(&tboot_log_uuid, log_uuid, sizeof(log_uuid)))
+   goto err_iounmap;
+
+   max_size = readl(log_base + LOG_MAX_SIZE_OFF);
+   if (*ppos >= max_size) {
+   ret = 0;
+   goto err_iounmap;
+   }
+
+   if (*ppos + count > max_size)
+   count = max_size - *ppos;
+
+   kbuf = kmalloc(count, GFP_KERNEL);
+   if (!kbuf) {
+   ret = -ENOMEM;
+   goto err_iounmap;
+   }
+
+   memcpy_fromio(kbuf, log_base + LOG_BUF_OFF + *ppos, count);
+   if (copy_to_user(user_buf, kbuf, count))
+   goto err_kfree;
+
+   *ppos += count;
+
+   ret = count;
+
+err_kfree:
+   kfree(kbuf);
+
+err_iounmap:
+   iounmap(log_base);
+
+   return ret;
+}
+
+static const struct file_operations tboot_log_fops = {
+   .read   = tboot_log_read,
+   .llseek = default_llseek,
+};
+
+#endif /* CONFIG_DEBUG_FS */
+
 static __init int tboot_late_init(void)
 {
if (!tboot_enabled())
@@ -348,6 +416,11 @@ static __init int tboot_late_init(void)
atomic_set(&ap_wfs_count, 0);
register_hotcpu_notifier(&tboot_cpu_notifier);
 
+#ifdef CONFIG_DEBUG_FS
+   debugfs_create_file("tboot_log", S_IRUSR,
+   arch_debugfs_dir, NULL, &tboot_log_fops);
+#endif
+
acpi_os_set_prepare_sleep(&tboot_sleep);
return 0;
 }


[PATCH v2] x86, tboot: provide debugfs interfaces to access TXT log

2013-06-23 Thread Qiaowei Ren
These logs come from tboot (Trusted Boot, an open source,
pre-kernel/VMM module that uses Intel TXT to perform a
measured and verified launch of an OS kernel/VMM).

Signed-off-by: Qiaowei Ren 
---
 arch/x86/kernel/tboot.c |   72 +++
 1 file changed, 72 insertions(+)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index f84fe00..2dec186 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -338,6 +339,72 @@ static struct notifier_block tboot_cpu_notifier __cpuinitdata =
.notifier_call = tboot_cpu_callback,
 };
 
+#if defined(CONFIG_DEBUG_FS)
+
+#define TBOOT_LOG_UUID {0x26, 0x25, 0x19, 0xc0, 0x30, 0x6b, 0xb4, 0x4d, \
+0x4c, 0x84, 0xa3, 0xe9, 0x53, 0xb8, 0x81, 0x74}
+#define TBOOT_SERIAL_LOG_ADDR  0x60000
+#define TBOOT_SERIAL_LOG_SIZE  0x08000
+#define LOG_MAX_SIZE_OFF   16
+#define LOG_BUF_OFF24
+
+static uint8_t tboot_log_uuid[16] = TBOOT_LOG_UUID;
+
+static ssize_t tboot_log_read(struct file *file, char __user *user_buf,
+ size_t count, loff_t *ppos)
+{
+   void __iomem *log_base;
+   u8 log_uuid[16];
+   u32 max_size;
+   void *kbuf;
+   int ret = -EFAULT;
+
+   log_base = ioremap_nocache(TBOOT_SERIAL_LOG_ADDR,
+   TBOOT_SERIAL_LOG_SIZE);
+   if (!log_base)
+   return ret;
+
+   memcpy_fromio(log_uuid, log_base, sizeof(log_uuid));
+   if (memcmp(&tboot_log_uuid, log_uuid, sizeof(log_uuid)))
+   goto err_iounmap;
+
+   max_size = readl(log_base + LOG_MAX_SIZE_OFF);
+   if (*ppos >= max_size) {
+   ret = 0;
+   goto err_iounmap;
+   }
+
+   if (*ppos + count > max_size)
+   count = max_size - *ppos;
+
+   kbuf = kmalloc(count, GFP_KERNEL);
+   if (!kbuf) {
+   ret = -ENOMEM;
+   goto err_iounmap;
+   }
+
+   memcpy_fromio(kbuf, log_base + LOG_BUF_OFF + *ppos, count);
+   if (copy_to_user(user_buf, kbuf, count))
+   goto err_kfree;
+
+   *ppos += count;
+
+   ret = count;
+
+err_kfree:
+   kfree(kbuf);
+err_iounmap:
+   iounmap(log_base);
+   return ret;
+}
+
+static const struct file_operations tboot_log_fops = {
+   .read = tboot_log_read,
+   .llseek = default_llseek,
+};
+
+#endif /* CONFIG_DEBUG_FS */
+
 static __init int tboot_late_init(void)
 {
if (!tboot_enabled())
@@ -348,6 +415,11 @@ static __init int tboot_late_init(void)
atomic_set(&ap_wfs_count, 0);
register_hotcpu_notifier(&tboot_cpu_notifier);
 
+#if defined(CONFIG_DEBUG_FS)
+   debugfs_create_file("tboot_log", S_IRUSR,
+   arch_debugfs_dir, NULL, &tboot_log_fops);
+#endif
+
acpi_os_set_prepare_sleep(&tboot_sleep);
return 0;
 }
-- 
1.7.9.5



[PATCH] x86, tboot: provide debugfs interfaces to access TXT log

2013-06-23 Thread Qiaowei Ren
These logs come from tboot (Trusted Boot, an open source,
pre-kernel/VMM module that uses Intel TXT to perform a
measured and verified launch of an OS kernel/VMM).

Signed-off-by: Qiaowei Ren 
---
 arch/x86/kernel/tboot.c |   70 +++
 1 file changed, 70 insertions(+)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index f84fe00..dd6f198 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -338,6 +339,70 @@ static struct notifier_block tboot_cpu_notifier __cpuinitdata =
.notifier_call = tboot_cpu_callback,
 };
 
+#if defined(CONFIG_DEBUG_FS)
+
+#define TBOOT_LOG_UUID {0x26, 0x25, 0x19, 0xc0, 0x30, 0x6b, 0xb4, 0x4d, \
+0x4c, 0x84, 0xa3, 0xe9, 0x53, 0xb8, 0x81, 0x74}
+#define TBOOT_SERIAL_LOG_ADDR  0x60000
+#define TBOOT_SERIAL_LOG_SIZE  0x08000
+
+static uint8_t tboot_log_uuid[16] = TBOOT_LOG_UUID;
+
+struct tboot_log {
+   uint8_t uuid[16];
+   uint32_tmax_size;
+   uint32_tcurr_pos;
+   charbuf[];
+};
+
+static struct tboot_log *get_log(void)
+{
+   struct tboot_log *log;
+
+   log = (struct tboot_log *)ioremap_nocache(TBOOT_SERIAL_LOG_ADDR,
+ TBOOT_SERIAL_LOG_SIZE);
+   if (!log)
+   return NULL;
+
+   if (memcmp(&tboot_log_uuid, &log->uuid, sizeof(log->uuid))) {
+   iounmap(log);
+   return NULL;
+   }
+
+   return log;
+}
+
+static ssize_t tboot_log_read(struct file *file, char __user *user_buf,
+ size_t count, loff_t *ppos)
+{
+   struct tboot_log *log;
+
+   log = get_log();
+   if (!log)
+   return -EFAULT;
+
+   if (*ppos >= log->max_size)
+   return 0;
+
+   if (*ppos + count > log->max_size)
+   count = log->max_size - *ppos;
+
+   if (copy_to_user(user_buf, log->buf + *ppos, count))
+   return -EFAULT;
+
+   *ppos += count;
+
+   iounmap(log);
+   return count;
+}
+
+static const struct file_operations tboot_log_fops = {
+   .read = tboot_log_read,
+   .llseek = default_llseek,
+};
+
+#endif /* CONFIG_DEBUG_FS */
+
 static __init int tboot_late_init(void)
 {
if (!tboot_enabled())
@@ -348,6 +413,11 @@ static __init int tboot_late_init(void)
atomic_set(&ap_wfs_count, 0);
register_hotcpu_notifier(&tboot_cpu_notifier);
 
+#if defined(CONFIG_DEBUG_FS)
+   debugfs_create_file("tboot_log", S_IRUSR,
+   arch_debugfs_dir, NULL, &tboot_log_fops);
+#endif
+
acpi_os_set_prepare_sleep(&tboot_sleep);
return 0;
 }
-- 
1.7.9.5



  1   2   >