[v6 PATCH 19/21] x86/traps: Fixup general protection faults caused by UMIP

2017-03-07 Thread Ricardo Neri
If the User-Mode Instruction Prevention CPU feature is available and
enabled, a general protection fault will be issued if the instructions
sgdt, sldt, sidt, str or smsw are executed from user-mode context
(CPL > 0). If the fault was caused by any of the instructions protected
by UMIP, fixup_umip_exception will emulate dummy results for these
instructions. If emulation is successful, the result is passed to the
user space program and no SIGSEGV signal is emitted.

Please note that fixup_umip_exception also caters for the case when
the fault originated while running in virtual-8086 mode.

Cc: Andy Lutomirski 
Cc: Andrew Morton 
Cc: H. Peter Anvin 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Chen Yucong 
Cc: Chris Metcalf 
Cc: Dave Hansen 
Cc: Fenghua Yu 
Cc: Huang Rui 
Cc: Jiri Slaby 
Cc: Jonathan Corbet 
Cc: Michael S. Tsirkin 
Cc: Paul Gortmaker 
Cc: Peter Zijlstra 
Cc: Ravi V. Shankar 
Cc: Shuah Khan 
Cc: Vlastimil Babka 
Cc: Tony Luck 
Cc: Paolo Bonzini 
Cc: Liang Z. Li 
Cc: Alexandre Julliard 
Cc: Stas Sergeev 
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/kernel/traps.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 948443e..86efbcb 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -492,6 +493,9 @@ do_general_protection(struct pt_regs *regs, long error_code)
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
cond_local_irq_enable(regs);
 
+   if (user_mode(regs) && fixup_umip_exception(regs))
+   return;
+
if (v8086_mode(regs)) {
local_irq_enable();
handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code);
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[v6 PATCH 02/21] x86/mpx: Do not use SIB index if index points to R/ESP

2017-03-07 Thread Ricardo Neri
Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when memory addressing is used
(i.e., the mod part of ModR/M is not 3), a SIB byte is used and the index of
the SIB byte points to R/ESP (i.e., index = 4), the index should not be
used in the computation of the memory address.

In these cases the address is simply the value present in the register
pointed to by the base part of the SIB byte plus the displacement.

An example of such an instruction could be

insn -0x80(%rbx)

This is represented as:

 [opcode] 4c 23 80

  ModR/M=0x4c: mod: 0x1, reg: 0x1, r/m: 0x4(R/ESP)
  SIB=0x23: sc: 0, index: 0x4(R/ESP), base: 0x3(R/EBX)
  Displacement -0x80

The correct address is (base) + displacement; no index is used.
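
The field split in the example above can be checked mechanically. The
following is a minimal user-space sketch, not kernel code; the helper names
are made up for illustration and only mirror what the X86_MODRM_* and
X86_SIB_* macros used by the patch extract.

```c
#include <stdint.h>

/* Hypothetical helpers: ModR/M and SIB bytes are both laid out as 2:3:3 bits. */
static inline unsigned int modrm_mod(uint8_t b) { return (b >> 6) & 0x3; }
static inline unsigned int modrm_reg(uint8_t b) { return (b >> 3) & 0x7; }
static inline unsigned int modrm_rm(uint8_t b)  { return b & 0x7; }

static inline unsigned int sib_scale(uint8_t b) { return (b >> 6) & 0x3; }
static inline unsigned int sib_index(uint8_t b) { return (b >> 3) & 0x7; }
static inline unsigned int sib_base(uint8_t b)  { return b & 0x7; }
```

For the bytes in the example, modrm_rm(0x4c) is 4 (a SIB byte follows),
sib_index(0x23) is 4 (no index) and sib_base(0x23) is 3.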

We can achieve the desired effect of not using the index by making
get_reg_offset return -EDOM in this particular case. This value indicates
to callers that they should not use the index to calculate the address.
-EINVAL continues to indicate an error when decoding the SIB byte.

Care is taken to allow R12 to be used as index, which is a valid scenario.
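
As an illustration of the -EDOM convention, here is a simplified user-space
sketch. It is an assumption-laden model, not the mpx.c code: the function
names and the regs[] array are hypothetical, REX.B and the mod=0/base=5
special case are ignored, and register numbers stand in for pt_regs offsets.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the index lookup: returns the index register
 * number (0-15), or -EDOM when the encoding means "no index".
 */
static int sib_index_regno(uint8_t modrm, uint8_t sib, int rex_x)
{
	int regno = (sib >> 3) & 0x7;

	if (rex_x)
		regno += 8;	/* REX.X selects R8-R15 */

	/* Check after applying REX.X so that R12 (regno 12) stays usable. */
	if (regno == 4 && ((modrm >> 6) & 0x3) != 3)
		return -EDOM;

	return regno;
}

/* regs[] holds the sixteen general-purpose register values, indexed by regno. */
static int64_t sib_effective_address(const int64_t regs[16], uint8_t modrm,
				     uint8_t sib, int rex_x, int64_t disp)
{
	int base_regno = sib & 0x7;
	int idx_regno = sib_index_regno(modrm, sib, rex_x);
	/* -EDOM is not an error here: it simply contributes no index. */
	int64_t indx = (idx_regno == -EDOM) ? 0 : regs[idx_regno];

	return regs[base_regno] + indx * (1 << ((sib >> 6) & 0x3)) + disp;
}
```

With the example bytes (ModR/M 0x4c, SIB 0x23) and no REX.X, the index is
dropped; with REX.X set, the same SIB byte selects R12 as a regular index.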

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Peter Zijlstra 
Cc: Nathan Howard 
Cc: Adan Hawthorn 
Cc: Joe Perches 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/mm/mpx.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index ff112e3..d9e92d6 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -110,6 +110,13 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
regno = X86_SIB_INDEX(insn->sib.value);
if (X86_REX_X(insn->rex_prefix.value))
regno += 8;
+   /*
+* If mod !=3, register R/ESP (regno=4) is not used as index in
+* the address computation. The check is done after looking at REX.X
+* because R12 (regno=12) can be used as an index.
+*/
+   if (regno == 4 && X86_MODRM_MOD(insn->modrm.value) != 3)
+   return -EDOM;
break;
 
case REG_TYPE_BASE:
@@ -159,11 +166,19 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
goto out_err;
 
indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-   if (indx_offset < 0)
+   /*
+* A negative offset generally means an error, except
+* -EDOM, which means that the contents of the register
+* should not be used as index.
+*/
+   if (unlikely(indx_offset == -EDOM))
+   indx = 0;
+   else if (unlikely(indx_offset < 0))
goto out_err;
+   else
+   indx = regs_get_register(regs, indx_offset);
 
base = regs_get_register(regs, base_offset);
-   indx = regs_get_register(regs, indx_offset);
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-- 
2.9.3



[v6 PATCH 20/21] x86: Enable User-Mode Instruction Prevention

2017-03-07 Thread Ricardo Neri
User-Mode Instruction Prevention (UMIP) is enabled or disabled by setting or
clearing a bit in %cr4.

It makes sense to enable UMIP at some point while booting, before user
space comes up. As with SMAP and SMEP, it is not critical to have it enabled
very early during boot, because UMIP is relevant only when there is
a user space to be protected from. Given the similarities in relevance, it
makes sense to enable UMIP along with SMAP and SMEP.

UMIP is enabled by default. It can be disabled by adding clearcpuid=514
to the kernel parameters.
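
The number 514 is the flat index of the feature flag. As a worked example,
assuming X86_FEATURE_UMIP is bit 2 of capability word 16 (the word holding
CPUID.(EAX=7,ECX=0):ECX) and that CR4.UMIP is bit 11, the arithmetic can be
sketched as follows. The macro and function names are local to this sketch.

```c
/* Assumed values; names are local to this sketch. */
#define FEATURE_WORD_BITS	32
#define UMIP_WORD		16	/* word for CPUID.(EAX=7,ECX=0):ECX */
#define UMIP_BIT		2
#define CR4_UMIP_BIT		11

/* Flat feature number as used by clearcpuid=: word * 32 + bit. */
static inline unsigned int feature_number(unsigned int word, unsigned int bit)
{
	return word * FEATURE_WORD_BITS + bit;
}

/* Mask that would be set or cleared in %cr4. */
static inline unsigned long cr4_umip_mask(void)
{
	return 1UL << CR4_UMIP_BIT;
}
```

Under these assumptions, 16 * 32 + 2 = 514, which matches the clearcpuid
value given above.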

Cc: Andy Lutomirski 
Cc: Andrew Morton 
Cc: H. Peter Anvin 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Chen Yucong 
Cc: Chris Metcalf 
Cc: Dave Hansen 
Cc: Fenghua Yu 
Cc: Huang Rui 
Cc: Jiri Slaby 
Cc: Jonathan Corbet 
Cc: Michael S. Tsirkin 
Cc: Paul Gortmaker 
Cc: Peter Zijlstra 
Cc: Ravi V. Shankar 
Cc: Shuah Khan 
Cc: Vlastimil Babka 
Cc: Tony Luck 
Cc: Paolo Bonzini 
Cc: Liang Z. Li 
Cc: Alexandre Julliard 
Cc: Stas Sergeev 
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/Kconfig | 10 ++
 arch/x86/kernel/cpu/common.c | 16 +++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc98d5a..b7f1226 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1735,6 +1735,16 @@ config X86_SMAP
 
  If unsure, say Y.
 
+config X86_INTEL_UMIP
+   def_bool y
+   depends on CPU_SUP_INTEL
+   prompt "Intel User Mode Instruction Prevention" if EXPERT
+   ---help---
+ The User Mode Instruction Prevention (UMIP) is a security
+ feature in newer Intel processors. If enabled, a general
+ protection fault is issued if the instructions SGDT, SLDT,
+ SIDT, SMSW and STR are executed in user mode.
+
 config X86_INTEL_MPX
prompt "Intel MPX (Memory Protection Extensions)"
def_bool n
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 58094a1..9f59eb5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -311,6 +311,19 @@ static __always_inline void setup_smap(struct cpuinfo_x86 *c)
}
 }
 
+static __always_inline void setup_umip(struct cpuinfo_x86 *c)
+{
+   if (cpu_feature_enabled(X86_FEATURE_UMIP) &&
+   cpu_has(c, X86_FEATURE_UMIP))
+   cr4_set_bits(X86_CR4_UMIP);
+   else
+   /*
+* Make sure UMIP is disabled in case it was enabled in a
+* previous boot (e.g., via kexec).
+*/
+   cr4_clear_bits(X86_CR4_UMIP);
+}
+
 /*
  * Protection Keys are not available in 32-bit mode.
  */
@@ -1080,9 +1093,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
 
-   /* Set up SMEP/SMAP */
+   /* Set up SMEP/SMAP/UMIP */
setup_smep(c);
setup_smap(c);
+   setup_umip(c);
 
/*
 * The vendor-specific functions might have changed features.
-- 
2.9.3



[v6 PATCH 12/21] x86/insn: Support both signed 32-bit and 64-bit effective addresses

2017-03-07 Thread Ricardo Neri
The 32-bit and 64-bit address encodings are identical. This means that we
can use the same function in both cases. In order to reuse the function for
32-bit address encodings, we must sign-extend our 32-bit signed operands to
64-bit signed variables (only for 64-bit builds). To decide on whether sign
extension is needed, we rely on the address size as given by the
instruction structure.

Lastly, before computing the linear address, we must truncate our signed
64-bit effective address if the address size is 32-bit.
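
The two steps described above (sign-extend on the way in, truncate on the
way out) can be sketched in user space as follows. This is an illustration
using fixed-width types, not the kernel helper; the function names are
hypothetical.

```c
#include <stdint.h>

/*
 * Interpret a register value as signed, honouring the address size.
 * The narrowing cast relies on the usual two's-complement behaviour
 * of gcc and clang.
 */
static int64_t to_signed_addr(uint64_t val, int addr_bytes)
{
	if (addr_bytes == 4)
		return (int64_t)(int32_t)(uint32_t)val;	/* sign-extend bit 31 */
	return (int64_t)val;
}

/* Truncate the effective address back to the address size. */
static uint64_t truncate_addr(int64_t eff_addr, int addr_bytes)
{
	if (addr_bytes == 4)
		return (uint64_t)eff_addr & 0xffffffffULL;
	return (uint64_t)eff_addr;
}
```

For instance, a 32-bit register holding 0xfffffff0 is treated as -16, so
adding a displacement of 0x20 and truncating yields 0x10 rather than
0x100000010.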

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/lib/insn-eval.c | 44 
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index edb360f..a9a1704 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -559,6 +559,15 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
return get_reg_offset(insn, regs, REG_TYPE_INDEX);
 }
 
+static inline long __to_signed_long(unsigned long val, int long_bytes)
+{
+#ifdef CONFIG_X86_64
+   return long_bytes == 4 ? (long)((int)((val) & 0xffffffff)) : (long)val;
+#else
+   return (long)val;
+#endif
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
@@ -567,19 +576,21 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
unsigned long linear_addr, seg_base_addr;
-   long eff_addr, base, indx;
-   int addr_offset, base_offset, indx_offset;
+   long eff_addr, base, indx, tmp;
+   int addr_offset, base_offset, indx_offset, addr_bytes;
insn_byte_t sib;
 
insn_get_modrm(insn);
insn_get_sib(insn);
sib = insn->sib.value;
+   addr_bytes = insn->addr_bytes;
 
if (X86_MODRM_MOD(insn->modrm.value) == 3) {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
if (addr_offset < 0)
goto out_err;
-   eff_addr = regs_get_register(regs, addr_offset);
+   tmp = regs_get_register(regs, addr_offset);
+   eff_addr = __to_signed_long(tmp, addr_bytes);
seg_base_addr = insn_get_seg_base(regs, insn, addr_offset,
  false);
} else {
@@ -591,20 +602,24 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 * in the address computation.
 */
base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-   if (unlikely(base_offset == -EDOM))
+   if (unlikely(base_offset == -EDOM)) {
base = 0;
-   else if (unlikely(base_offset < 0))
+   } else if (unlikely(base_offset < 0)) {
goto out_err;
-   else
-   base = regs_get_register(regs, base_offset);
+   } else {
+   tmp = regs_get_register(regs, base_offset);
+   base = __to_signed_long(tmp, addr_bytes);
+   }
 
indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-   if (unlikely(indx_offset == -EDOM))
+   if (unlikely(indx_offset == -EDOM)) {
indx = 0;
-   else if (unlikely(indx_offset < 0))
+   } else if (unlikely(indx_offset < 0)) {
goto out_err;
-   else
-   indx = regs_get_register(regs, indx_offset);
+   } else {
+   tmp = regs_get_register(regs, indx_offset);
+   indx = __to_signed_long(tmp, addr_bytes);
+   }
 
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
seg_base_addr = insn_get_seg_base(regs, insn,
@@ -625,13 +640,18 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
  

[v6 PATCH 13/21] x86/insn-eval: Add support to resolve 16-bit addressing encodings

2017-03-07 Thread Ricardo Neri
Tasks running in virtual-8086 mode or in protected mode with code
segment descriptors that specify 16-bit default address sizes via the
D bit will use 16-bit addressing form encodings as described in the Intel
64 and IA-32 Architecture Software Developer's Manual Volume 2A Section
2.1.5. 16-bit addressing encodings differ in several ways from the
32-bit/64-bit addressing form encodings: the r/m part of the ModRM byte
points to different registers and, in some cases, addresses can be
indicated by the addition of the value of two registers. Also, there is
no support for SiB bytes. Thus, a separate function is needed to parse
this form of addressing.

A couple of functions are introduced. get_reg_offset_16 obtains the
offset from the base of pt_regs of the registers indicated by the ModRM
byte of the address encoding. insn_get_addr_ref_16 computes the linear
address indicated by the instructions using the value of the registers
given by ModRM as well as the base address of the segment.

Lastly, the original function insn_get_addr_ref is renamed as
insn_get_addr_ref_32_64. A new insn_get_addr_ref function decides what
type of address decoding must be done based on the number of address bytes
given by the instruction. Documentation for insn_get_addr_ref_32_64 is
also improved.
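
To illustrate the 16-bit addressing forms described above, here is a small
user-space sketch. It assumes the register pairs listed in the 16-bit ModRM
addressing table of the manual and ignores segment bases; struct regs16 and
eff_addr_16 are hypothetical names, not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

struct regs16 {
	uint16_t bx, bp, si, di;
};

/*
 * Effective address for a 16-bit ModRM memory operand (mod != 3).
 * disp is the already-decoded displacement (0 if none).
 */
static uint16_t eff_addr_16(const struct regs16 *r, unsigned int mod,
			    unsigned int rm, int16_t disp)
{
	/* First and second register for each r/m value; 0 means "unused". */
	uint16_t first[8]  = { r->bx, r->bx, r->bp, r->bp,
			       r->si, r->di, r->bp, r->bx };
	uint16_t second[8] = { r->si, r->di, r->si, r->di, 0, 0, 0, 0 };
	uint16_t sum = 0;

	assert(mod < 3 && rm < 8);

	/* mod == 0 with r/m == 6 encodes a bare 16-bit displacement. */
	if (!(mod == 0 && rm == 6))
		sum = (uint16_t)(first[rm] + second[rm]);

	/* The result wraps around at 16 bits. */
	return (uint16_t)(sum + (uint16_t)disp);
}
```

With bx = 0x1000 and si = 0x0010, an r/m value of 0 and no displacement
gives 0x1010; with mod = 0 and r/m = 6, only the displacement is used.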

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/lib/insn-eval.c | 137 +++
 1 file changed, 137 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index a9a1704..cb1076d 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -306,6 +306,73 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_reg_offset_16 - Obtain offset of register indicated by instruction
+ * @insn:  Instruction structure containing ModRM and SiB bytes
+ * @regs:  Set of registers referred by the instruction
+ * @offs1: Offset of the first operand register
+ * @offs2: Offset of the second operand register, if applicable.
+ *
+ * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
+ * within insn. This function is to be used with 16-bit address encodings. The
+ * offs1 and offs2 will be written with the offset of the two registers
+ * indicated by the instruction. In cases where any of the registers is not
+ * referenced by the instruction, the value will be set to -EDOM.
+ *
+ * Return: 0 on success, -EINVAL on failure.
+ */
+static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
+int *offs1, int *offs2)
+{
+   /* 16-bit addressing can use one or two registers */
+   static const int regoff1[] = {
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, bx),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   offsetof(struct pt_regs, bp),
+   offsetof(struct pt_regs, bx),
+   };
+
+   static const int regoff2[] = {
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   offsetof(struct pt_regs, si),
+   offsetof(struct pt_regs, di),
+   -EDOM,
+   -EDOM,
+   -EDOM,
+   -EDOM,
+   };
+
+   if (!offs1 || !offs2)
+   return -EINVAL;
+
+   /* operand is a register, use the generic function */
+   if (X86_MODRM_MOD(insn->modrm.value) == 3) {
+   *offs1 = insn_get_reg_offset_modrm_rm(insn, regs);
+   *offs2 = -EDOM;
+   return 0;
+   }
+
+   *offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
+   *offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
+
+   /*
+* If no displacement is indicated in the mod part of the ModRM byte,
+* (mod part is 0) and the r/m part of the same byte is 6, no register
+* is used to calculate the operand address. An r/m part of 6 means that
+* the second register offset is already invalid.
+*/
+   if ((X86_MODRM_MOD(insn->modrm.value) == 0) &&
+   (X86_MODRM_RM(insn->modrm.value) == 6))
+   *offs1 = -EDOM;
+
+   return 0;
+}
+
+/**
  * get_desc() - Obtain 

[v6 PATCH 14/21] x86/insn-eval: Add wrapper function for 16-bit and 32-bit address encodings

2017-03-07 Thread Ricardo Neri
Convert the function insn_get_addr_ref into a wrapper function that calls
the correct static address-decoding function depending on the size of the
address. This way, callers do not need to worry about calling the
correct function, and the number of functions that need to be exposed
decreases.

To this end, the original 32/64-bit insn_get_addr_ref is renamed as
insn_get_addr_ref_32_64 to reflect the type of address encodings that it
handles.

Documentation is added to the new wrapper function and the documentation
for the 32/64-bit address decoding function is improved.

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/lib/insn-eval.c | 45 -
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index cb1076d..e633588 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -705,12 +705,21 @@ static inline long __to_signed_long(unsigned long val, int long_bytes)
 #endif
 }
 
-/*
- * return the address being referenced be instruction
- * for rm=3 returning the content of the rm reg
- * for rm!=3 calculates the address using SIB and Disp
+/**
+ * insn_get_addr_ref_32_64 - Obtain a 32/64-bit address referred by instruction
+ * @insn:  Instruction struct with ModRM and SiB bytes and displacement
+ * @regs:  Set of registers referred by the instruction
+ *
+ * This function is to be used with 32-bit and 64-bit address encodings. Obtain
+ * the memory address referred by the instruction's ModRM bytes and
+ * displacement. Also, the segment used as base is determined by either any
+ * segment override prefixes in insn or the default segment of the registers
+ * involved in the linear address computation.
+ *
+ * Return: linear address referenced by instruction and registers
  */
-void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+static void __user *insn_get_addr_ref_32_64(struct insn *insn,
+   struct pt_regs *regs)
 {
unsigned long linear_addr, seg_base_addr;
long eff_addr, base, indx, tmp;
@@ -795,3 +804,29 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 out_err:
return (void __user *)-1;
 }
+
+/**
+ * insn_get_addr_ref - Obtain the linear address referred by instruction
+ * @insn:  Instruction structure containing ModRM byte and displacement
+ * @regs:  Set of registers referred by the instruction
+ *
+ * Obtain the memory address referred by the instruction's ModRM bytes and
+ * displacement. Also, the segment used as base is determined by either any
+ * segment override prefixes in insn or the default segment of the registers
+ * involved in the address computation.
+ *
+ * Return: linear address referenced by instruction and registers
+ */
+void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
+{
+   switch (insn->addr_bytes) {
+   case 2:
+   return insn_get_addr_ref_16(insn, regs);
+   case 4:
+   /* fall through */
+   case 8:
+   return insn_get_addr_ref_32_64(insn, regs);
+   default:
+   return (void __user *)-1;
+   }
+}
-- 
2.9.3



[v6 PATCH 10/21] x86/insn-eval: Do not use R/EBP as base if mod in ModRM is zero

2017-03-07 Thread Ricardo Neri
Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when the mod part of the ModRM
byte is zero and R/EBP is specified in the R/M part of such byte, the value
of the aforementioned register should not be used in the address
computation. Instead, a 32-bit displacement is expected. The instruction
decoder takes care of setting the displacement to the expected value.
Returning -EDOM signals to callers that they should ignore the value of such
register when computing the address encoded in the instruction operands.

Also, callers should exercise care to correctly interpret this particular
case. In IA-32e 64-bit mode, the address is given by the displacement plus
the value of the RIP. In IA-32e compatibility mode, the value of EIP is
ignored. This correction is done for our insn_get_addr_ref.
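
The distinction between the two modes can be sketched in user space as
follows. This is only a model with a hypothetical function name.
Architecturally, a RIP-relative displacement is added to the address of the
next instruction, so the sketch takes that address as an explicit parameter
instead of deriving it from saved registers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Address for mod == 0, r/m == 5: the register is ignored and only the
 * 32-bit displacement counts. In 64-bit mode the displacement is relative
 * to the address of the next instruction; otherwise it is absolute.
 */
static int64_t disp32_only_address(bool is_64bit_mode, uint64_t next_ip,
				   int32_t disp)
{
	int64_t base = is_64bit_mode ? (int64_t)next_ip : 0;

	return base + disp;
}
```

For example, with next_ip = 0x400000 and a displacement of -0x10, the
64-bit result is 0x3ffff0, while in compatibility mode the result is the
displacement alone.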

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/lib/insn-eval.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index cda6c71..ea10b03 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -250,6 +250,14 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
switch (type) {
case REG_TYPE_RM:
regno = X86_MODRM_RM(insn->modrm.value);
+   /* if mod=0, register R/EBP is not used in the address
+* computation. Instead, a 32-bit displacement is expected;
+* the instruction decoder takes care of reading such
+* displacement. This is true for both R/EBP and R13, as the
+* REX.B bit is not decoded.
+*/
+   if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
+   return -EDOM;
if (X86_REX_B(insn->rex_prefix.value))
regno += 8;
break;
@@ -599,9 +607,22 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-   if (addr_offset < 0)
+   /* -EDOM means that we must ignore the address_offset.
+* The only case in which we see this value is when
+* R/M points to R/EBP. In such a case, in 64-bit mode
+* the effective address is relative to the RIP.
+*/
+   if (addr_offset == -EDOM) {
+   eff_addr = 0;
+#ifdef CONFIG_X86_64
+   if (user_64bit_mode(regs))
+   eff_addr = (long)regs->ip;
+#endif
+   } else if (addr_offset < 0) {
goto out_err;
-   eff_addr = regs_get_register(regs, addr_offset);
+   } else {
+   eff_addr = regs_get_register(regs, addr_offset);
+   }
}
eff_addr += insn->displacement.value;
}
-- 
2.9.3



[v6 PATCH 17/21] x86: Add emulation code for UMIP instructions

2017-03-07 Thread Ricardo Neri
The User-Mode Instruction Prevention (UMIP) feature present in recent Intel
processors prevents a group of instructions from being executed with
CPL > 0. If one of them is executed at CPL > 0, a general protection fault
is issued.

Rather than relaying this fault to user space (in the form of a SIGSEGV
signal), the instructions protected by UMIP can be emulated to provide
dummy results. This preserves the current kernel behavior without
revealing the system resources that UMIP intends to protect (the global
descriptor and interrupt descriptor tables, the segment selectors of the
local descriptor table and the task state, and the machine status word).

This emulation is needed because certain applications (e.g., WineHQ) rely
on this subset of instructions to function.

The instructions protected by UMIP can be split in two groups: those that
return a kernel memory address (sgdt and sidt) and those that return a
value (sldt, str and smsw).

For the instructions that return a kernel memory address, applications
such as WineHQ rely on the result being located in the kernel memory space.
The result is emulated as a hard-coded value that lies close to the top
of the kernel memory. The limits for the GDT and the IDT are set to zero.

The instructions sldt and str return a segment selector relative to the
base address of the global descriptor table. Since the actual address of
such table is not revealed, it makes sense to emulate the result as zero.

The instruction smsw is emulated to return the value that the register CR0
has at boot time as set in head_32.

Care is taken to appropriately emulate the results when segmentation is
used. That is, rather than relying on USER_DS and USER_CS, the function
insn_get_addr_ref inspects the segment descriptor pointed to by the registers
in pt_regs. This ensures that we correctly obtain the segment base address
and the address and operand sizes even if the user space application uses
a local descriptor table.
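
As a rough illustration of the dummy results described above, the sketch
below builds the 6-byte operand that a 32-bit sgdt or sidt stores (a 16-bit
limit followed by a 32-bit base). The base constant is a placeholder chosen
for this example, not necessarily the value used in umip.c, and the function
names are hypothetical.

```c
#include <stdint.h>

#define DUMMY_TABLE_BASE	0xfffe0000u	/* placeholder, see above */
#define DUMMY_TABLE_LIMIT	0u

/* Fill buf with the 6-byte result of a 32-bit sgdt/sidt: limit, then base. */
static void emulate_sxdt32(uint8_t buf[6])
{
	uint16_t limit = DUMMY_TABLE_LIMIT;
	uint32_t base = DUMMY_TABLE_BASE;

	/* x86 is little-endian; store bytes explicitly to stay portable. */
	buf[0] = limit & 0xff;
	buf[1] = (limit >> 8) & 0xff;
	buf[2] = base & 0xff;
	buf[3] = (base >> 8) & 0xff;
	buf[4] = (base >> 16) & 0xff;
	buf[5] = (base >> 24) & 0xff;
}

/* sldt and str are emulated as a null selector. */
static uint16_t emulate_sldt_str(void)
{
	return 0;
}
```

In the real emulation the result still has to be copied to the register or
user-space address named by the instruction operand, which is where
insn_get_addr_ref comes in.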

Cc: Andy Lutomirski 
Cc: Andrew Morton 
Cc: H. Peter Anvin 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Chen Yucong 
Cc: Chris Metcalf 
Cc: Dave Hansen 
Cc: Fenghua Yu 
Cc: Huang Rui 
Cc: Jiri Slaby 
Cc: Jonathan Corbet 
Cc: Michael S. Tsirkin 
Cc: Paul Gortmaker 
Cc: Peter Zijlstra 
Cc: Ravi V. Shankar 
Cc: Shuah Khan 
Cc: Vlastimil Babka 
Cc: Tony Luck 
Cc: Paolo Bonzini 
Cc: Liang Z. Li 
Cc: Alexandre Julliard 
Cc: Stas Sergeev 
Cc: x...@kernel.org
Cc: linux-msdos@vger.kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/include/asm/umip.h |  15 +++
 arch/x86/kernel/Makefile|   1 +
 arch/x86/kernel/umip.c  | 257 
 3 files changed, 273 insertions(+)
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c

diff --git a/arch/x86/include/asm/umip.h b/arch/x86/include/asm/umip.h
new file mode 100644
index 000..077b236
--- /dev/null
+++ b/arch/x86/include/asm/umip.h
@@ -0,0 +1,15 @@
+#ifndef _ASM_X86_UMIP_H
+#define _ASM_X86_UMIP_H
+
+#include 
+#include 
+
+#ifdef CONFIG_X86_INTEL_UMIP
+bool fixup_umip_exception(struct pt_regs *regs);
+#else
+static inline bool fixup_umip_exception(struct pt_regs *regs)
+{
+   return false;
+}
+#endif  /* CONFIG_X86_INTEL_UMIP */
+#endif  /* _ASM_X86_UMIP_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 84c0059..0ded7b1 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -122,6 +122,7 @@ obj-$(CONFIG_EFI)   += sysfb_efi.o
 obj-$(CONFIG_PERF_EVENTS)  += perf_regs.o
 obj-$(CONFIG_TRACING)  += tracepoint.o
 obj-$(CONFIG_SCHED_MC_PRIO)+= itmt.o
+obj-$(CONFIG_X86_INTEL_UMIP)   += umip.o
 
 ifdef CONFIG_FRAME_POINTER
 obj-y  += unwind_frame.o
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
new file mode 100644
index 000..e64d8e5
--- /dev/null
+++ b/arch/x86/kernel/umip.c
@@ -0,0 +1,257 @@
+/*
+ * umip.c Emulation for instructions protected by the Intel User-Mode
+ * Instruction Prevention. The instructions are:
+ *sgdt
+ *sldt
+ *sidt
+ *str
+ *smsw
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ * Ricardo Neri 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * == Base addresses of GDT and IDT
+ * Some applications rely on finding the global descriptor table (GDT)
+ * and the interrupt descriptor table (IDT) in kernel memory in order to function.
+ * For 

[v6 PATCH 08/21] x86/insn-eval: Add utility function to get segment descriptor base address

2017-03-07 Thread Ricardo Neri
With segmentation, the base address of the segment descriptor is needed
to compute a linear address. The segment descriptor used in the address
computation depends on either any segment override prefixes in the
instruction or the default segment determined by the registers involved
in the address computation. Thus, both the instruction as well as the
register (specified as the offset from the base of pt_regs) are given as
inputs, along with a boolean variable to select between override and
default.

The segment selector is determined by get_seg_selector with the inputs
described above. Once the selector is known, the base address is
determined. In protected mode, the selector is used to obtain the segment
descriptor and then its base address. In 64-bit user mode, the segment
base address is zero except when FS or GS are used. In virtual-8086 mode,
the base address is computed as the value of the segment selector shifted 4
positions to the left.
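
The three cases can be summarized with a small user-space model. It is a
sketch under simplifying assumptions: the descriptor lookup and the MSR
reads are replaced by plain inputs, and all type and function names are
hypothetical.

```c
#include <stdint.h>

enum cpu_mode { MODE_V8086, MODE_PROTECTED, MODE_LONG_64 };
enum seg_reg { SEG_CS, SEG_SS, SEG_DS, SEG_ES, SEG_FS, SEG_GS };

struct seg_inputs {
	uint16_t selector;	/* value of the segment register */
	uint64_t desc_base;	/* base taken from the descriptor */
	uint64_t fs_base;	/* FS base in 64-bit mode */
	uint64_t gs_base;	/* user-space GS base in 64-bit mode */
};

static uint64_t segment_base(enum cpu_mode mode, enum seg_reg seg,
			     const struct seg_inputs *in)
{
	switch (mode) {
	case MODE_V8086:
		/* Selector shifted 4 positions to the left. */
		return (uint64_t)in->selector << 4;
	case MODE_LONG_64:
		/* Only FS and GS carry a base; the rest are forced to 0. */
		if (seg == SEG_FS)
			return in->fs_base;
		if (seg == SEG_GS)
			return in->gs_base;
		return 0;
	case MODE_PROTECTED:
	default:
		return in->desc_base;
	}
}
```

For example, a virtual-8086 selector of 0x1234 gives a base of 0x12340.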

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/include/asm/insn-eval.h |  2 ++
 arch/x86/lib/insn-eval.c | 66 
 2 files changed, 68 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 754211b..b201742 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -15,5 +15,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs);
+unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
+   int regoff, bool use_default_seg);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 8608adf..383ca83 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -355,6 +355,72 @@ static int get_desc(unsigned short seg, struct desc_struct **desc)
 }
 
 /**
+ * insn_get_seg_base() - Obtain base address contained in descriptor
+ * @regs:  Set of registers containing the segment selector
+ * @insn:  Instruction structure with selector override prefixes
+ * @regoff:Operand offset, in pt_regs, of which the selector is needed
+ * @use_default_seg: Use the default segment instead of prefix overrides
+ *
+ * Obtain the base address of the segment descriptor as indicated by either
+ * any segment override prefixes contained in insn or the default segment
+ * applicable to the register indicated by regoff. regoff is specified as the
+ * offset in bytes from the base of pt_regs.
+ *
+ * Return: In protected mode, base address of the segment. It may be zero in
+ * certain cases for 64-bit builds and/or 64-bit applications. In virtual-8086
+ * mode, the segment selector shifted 4 positions to the left. -1L in case of
+ * error.
+ */
+unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
+   int regoff, bool use_default_seg)
+{
+   struct desc_struct *desc;
+   unsigned short seg;
+   enum segment seg_type;
+   int ret;
+
+   seg_type = resolve_seg_selector(insn, regoff, use_default_seg);
+
+   seg = get_segment_selector(regs, seg_type);
+   if (seg < 0)
+   return -1L;
+
+   if (v8086_mode(regs))
+   /*
+* Base is simply the segment selector shifted 4
+* positions to the left.
+*/
+   return (unsigned long)(seg << 4);
+
+#ifdef CONFIG_X86_64
+   if (user_64bit_mode(regs)) {
+   /*
+* Only FS or GS will have a base address, the rest of
+* the segments' bases are forced to 0.
+*/
+   unsigned long base;
+
+   if (seg_type == SEG_FS)
+   rdmsrl(MSR_FS_BASE, base);
+   else if (seg_type == SEG_GS)
+   /*
+* swapgs was called at the kernel entry point. Thus,
+* MSR_KERNEL_GS_BASE will have the user-space GS base.
+*/
+   rdmsrl(MSR_KERNEL_GS_BASE, base);
+ 

[v6 PATCH 05/21] x86/insn-eval: Add utility functions to get register offsets

2017-03-07 Thread Ricardo Neri
The function insn_get_reg_offset takes as argument an enumeration that
indicates the type of offset that is returned: the R/M part of the ModRM
byte, the index of the SIB byte or the base of the SIB byte. Callers of
this function would need the definition of such enumeration. This is not
needed. Instead, helper functions can be defined for this purpose. These
functions are useful in cases when, for instance, the caller
needs to decide whether the operand is a register or a memory location by
looking at the mod part of the ModRM byte.
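
To illustrate the kind of check described above, here is a minimal
stand-alone sketch that splits a ModRM byte into its fields and tests
whether the r/m operand is a register (mod == 3). The EX_* macros and
ex_modrm_rm_is_register() are hypothetical names used only for this
example; they are not part of this patch, and the kernel has its own
ModRM helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the kernel's own macros. */
#define EX_MODRM_MOD(b)  (((b) >> 6) & 0x3)  /* bits 7:6 */
#define EX_MODRM_REG(b)  (((b) >> 3) & 0x7)  /* bits 5:3 */
#define EX_MODRM_RM(b)   ((b) & 0x7)         /* bits 2:0 */

/* mod == 3 means the r/m field names a register, not a memory operand. */
static bool ex_modrm_rm_is_register(uint8_t modrm)
{
	return EX_MODRM_MOD(modrm) == 3;
}
```

A caller would likely test the mod field first and only then ask for the
register offset of the r/m part.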

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/include/asm/insn-eval.h |  3 +++
 arch/x86/lib/insn-eval.c | 51 
 2 files changed, 54 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index 5cab1b1..754211b 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -12,5 +12,8 @@
 #include 
 
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
+int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
+int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
+int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 23cf010..78df1c9 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -98,6 +98,57 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
return regoff[regno];
 }
 
+/**
+ * insn_get_reg_offset_modrm_rm - Obtain register in r/m part of ModRM byte
+ * @insn:  Instruction structure containing the ModRM byte
+ * @regs:  Set of registers indicated by the ModRM byte
+ *
+ * Obtain the register indicated by the r/m part of the ModRM byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of ModRM does not refer to a register.
+ *
+ * Return: Register indicated by r/m, as an offset within struct pt_regs
+ */
+int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs)
+{
+   return get_reg_offset(insn, regs, REG_TYPE_RM);
+}
+
+/**
+ * insn_get_reg_offset_sib_base - Obtain register in base part of SiB byte
+ * @insn:  Instruction structure containing the SiB byte
+ * @regs:  Set of registers indicated by the SiB byte
+ *
+ * Obtain the register indicated by the base part of the SiB byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of SiB does not refer to a register.
+ *
+ * Return: Register indicated by SiB's base, as an offset within struct pt_regs
+ */
+int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs)
+{
+   return get_reg_offset(insn, regs, REG_TYPE_BASE);
+}
+
+/**
+ * insn_get_reg_offset_sib_index - Obtain register in index part of SiB byte
+ * @insn:  Instruction structure containing the SiB byte
+ * @regs:  Set of registers indicated by the SiB byte
+ *
+ * Obtain the register indicated by the index part of the SiB byte. The
+ * register is obtained as an offset from the base of pt_regs. In specific
+ * cases, the returned value can be -EDOM to indicate that the particular value
+ * of SiB does not refer to a register.
+ *
+ * Return: Register indicated by SiB's index, as an offset within struct pt_regs
+ */
+int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
+{
+   return get_reg_offset(insn, regs, REG_TYPE_INDEX);
+}
+
 /*
  * return the address being referenced be instruction
  * for rm=3 returning the content of the rm reg
-- 
2.9.3



[v6 PATCH 18/21] x86/umip: Force a page fault when unable to copy emulated result to user

2017-03-07 Thread Ricardo Neri
fixup_umip_exception will be called from do_general_protection. If the
former returns false, the latter will issue a SIGSEGV with SEND_SIG_PRIV.
However, when emulation is successful but the emulated result cannot be
copied to user space memory, it is more accurate to issue a SIGSEGV with
SEGV_MAPERR with the offending address. A new function, inspired by
force_sig_info_fault, is introduced to model the page fault.
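
As a rough user-space illustration of the signal information described
above, the sketch below fills a siginfo_t the way a fault on an unmapped
address is reported: SIGSEGV with si_code set to SEGV_MAPERR and si_addr
set to the offending address. ex_fill_maperr() is a hypothetical helper
for this example only; the kernel function added by this patch also
updates the thread's fault bookkeeping, which is not modeled here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/*
 * Hypothetical user-space helper: describe a fault on an unmapped
 * address (SIGSEGV + SEGV_MAPERR) in a siginfo_t.
 */
static void ex_fill_maperr(siginfo_t *info, void *addr)
{
	memset(info, 0, sizeof(*info));
	info->si_signo = SIGSEGV;
	info->si_errno = 0;
	info->si_code  = SEGV_MAPERR;
	info->si_addr  = addr;
}
```

A SIGSEGV handler installed with SA_SIGINFO would see these same fields.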

Signed-off-by: Ricardo Neri 
---
 arch/x86/kernel/umip.c | 45 +++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index e64d8e5..bd06e26 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -163,6 +163,41 @@ static int __emulate_umip_insn(struct insn *insn, enum umip_insn umip_inst,
 }
 
 /**
+ * __force_sig_info_umip_fault - Force a SIGSEGV with SEGV_MAPERR
+ * @address:   Address that caused the signal
+ * @regs:  Register set containing the instruction pointer
+ *
+ * Force a SIGSEGV signal with SEGV_MAPERR as the error code. This function is
+ * intended to be used to provide a segmentation fault when the result of the
+ * UMIP emulation could not be copied to the user space memory.
+ *
+ * Return: none
+ */
+static void __force_sig_info_umip_fault(void __user *address,
+   struct pt_regs *regs)
+{
+   siginfo_t info;
+   struct task_struct *tsk = current;
+
+   if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)) {
+   printk_ratelimited("%s[%d] umip emulation segfault ip:%lx sp:%lx error:%x in %lx\n",
+  tsk->comm, task_pid_nr(tsk), regs->ip,
+  regs->sp, X86_PF_USER | X86_PF_WRITE,
+  regs->ip);
+   }
+
+   tsk->thread.cr2 = (unsigned long)address;
+   tsk->thread.error_code  = X86_PF_USER | X86_PF_WRITE;
+   tsk->thread.trap_nr = X86_TRAP_PF;
+
+   info.si_signo   = SIGSEGV;
+   info.si_errno   = 0;
+   info.si_code= SEGV_MAPERR;
+   info.si_addr= address;
+   force_sig_info(SIGSEGV, &info, tsk);
+}
+
+/**
  * fixup_umip_exception - Fixup #GP faults caused by UMIP
  * @regs:  Registers as saved when entering the #GP trap
  *
@@ -247,8 +282,14 @@ bool fixup_umip_exception(struct pt_regs *regs)
} else {
uaddr = insn_get_addr_ref(&insn, regs);
nr_copied = copy_to_user(uaddr, dummy_data, dummy_data_size);
-   if (nr_copied  > 0)
-   return false;
+   if (nr_copied  > 0) {
+   /*
+* If copy fails, send a signal and tell caller that
+* fault was fixed up
+*/
+   __force_sig_info_umip_fault(uaddr, regs);
+   return true;
+   }
}
 
/* increase IP to let the program keep going */
-- 
2.9.3



[v6 PATCH 15/21] x86/mm: Relocate page fault error codes to traps.h

2017-03-07 Thread Ricardo Neri
Up to this point, only fault.c used the definitions of the page fault error
codes. Thus, it made sense to keep them within that file. Other portions of
the code might need those definitions too. For instance, the User-Mode
Instruction Prevention emulation code will use them to emulate a page fault
when it is unable to copy the results of the emulated instructions to user
space.

While relocating the error code enumeration, the prefix X86_ is used to
make it consistent with the rest of the definitions in traps.h. Of course,
code using the enumeration had to be updated as well. No functional changes
were performed.
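
For readers unfamiliar with the error code layout, the following
stand-alone sketch decodes the three low bits of a page fault error code
into text. The EX_PF_* names mirror the bit positions of the relocated
enumeration but are hypothetical and local to this example, as is
ex_pf_describe().

```c
#include <stdio.h>

/* Same bit positions as the relocated enumeration; names are local. */
enum ex_pf_error_code {
	EX_PF_PROT  = 1 << 0,
	EX_PF_WRITE = 1 << 1,
	EX_PF_USER  = 1 << 2,
	EX_PF_RSVD  = 1 << 3,
	EX_PF_INSTR = 1 << 4,
	EX_PF_PK    = 1 << 5,
};

/* Describe bits 0..2 of the error code; returns snprintf()'s result. */
static int ex_pf_describe(unsigned long code, char *buf, size_t len)
{
	return snprintf(buf, len, "%s %s %s",
			(code & EX_PF_PROT)  ? "protection" : "not-present",
			(code & EX_PF_WRITE) ? "write" : "read",
			(code & EX_PF_USER)  ? "user" : "kernel");
}
```

With this layout, the value the UMIP emulation code reports for a failed
copy to user space (user-mode write, page not present) is
EX_PF_USER | EX_PF_WRITE, that is, 6.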

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Andy Lutomirski 
Cc: "Kirill A. Shutemov" 
Cc: Josh Poimboeuf 
Cc: Dave Hansen 
Cc: Paul Gortmaker 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/include/asm/traps.h | 18 +
 arch/x86/mm/fault.c  | 88 +---
 2 files changed, 52 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 01fd0a7..4a2e585 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -148,4 +148,22 @@ enum {
X86_TRAP_IRET = 32, /* 32, IRET Exception */
 };
 
+/*
+ * Page fault error code bits:
+ *
+ *   bit 0 ==   0: no page found   1: protection fault
+ *   bit 1 ==   0: read access 1: write access
+ *   bit 2 ==   0: kernel-mode access  1: user-mode access
+ *   bit 3 ==  1: use of reserved bit detected
+ *   bit 4 ==  1: fault was an instruction fetch
+ *   bit 5 ==  1: protection keys block access
+ */
+enum x86_pf_error_code {
+   X86_PF_PROT =   1 << 0,
+   X86_PF_WRITE=   1 << 1,
+   X86_PF_USER =   1 << 2,
+   X86_PF_RSVD =   1 << 3,
+   X86_PF_INSTR=   1 << 4,
+   X86_PF_PK   =   1 << 5,
+};
 #endif /* _ASM_X86_TRAPS_H */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 428e3176..e859a9c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -29,26 +29,6 @@
 #include 
 
 /*
- * Page fault error code bits:
- *
- *   bit 0 ==   0: no page found   1: protection fault
- *   bit 1 ==   0: read access 1: write access
- *   bit 2 ==   0: kernel-mode access  1: user-mode access
- *   bit 3 ==  1: use of reserved bit detected
- *   bit 4 ==  1: fault was an instruction fetch
- *   bit 5 ==  1: protection keys block access
- */
-enum x86_pf_error_code {
-
-   PF_PROT =   1 << 0,
-   PF_WRITE=   1 << 1,
-   PF_USER =   1 << 2,
-   PF_RSVD =   1 << 3,
-   PF_INSTR=   1 << 4,
-   PF_PK   =   1 << 5,
-};
-
-/*
  * Returns 0 if mmiotrace is disabled, or if the fault is not
  * handled by mmiotrace:
  */
@@ -149,7 +129,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
 * If it was a exec (instruction fetch) fault on NX page, then
 * do not ignore the fault:
 */
-   if (error_code & PF_INSTR)
+   if (error_code & X86_PF_INSTR)
return 0;
 
instr = (void *)convert_ip_to_linear(current, regs);
@@ -179,7 +159,7 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
  * siginfo so userspace can discover which protection key was set
  * on the PTE.
  *
- * If we get here, we know that the hardware signaled a PF_PK
+ * If we get here, we know that the hardware signaled a X86_PF_PK
  * fault and that there was a VMA once we got in the fault
  * handler.  It does *not* guarantee that the VMA we find here
  * was the one that we faulted on.
@@ -205,7 +185,7 @@ static void fill_sig_info_pkey(int si_code, siginfo_t *info,
/*
 * force_sig_info_fault() is called from a number of
 * contexts, some of which have a VMA and some of which
-* do not.  The PF_PK handing happens after we have a
+* do not.  The X86_PF_PK handing happens after we have a
 * valid VMA, so we should never reach this without a
 * valid VMA.
 */
@@ -655,7 +635,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
if (!oops_may_print())
return;
 
-   if (error_code & PF_INSTR) {
+   if (error_code & X86_PF_INSTR) {
unsigned int level;
pgd_t *pgd;
pte_t *pte;
@@ -739,7 +719,7 @@ 

[v6 PATCH 03/21] x86/mpx: Do not use R/EBP as base in the SIB byte with Mod = 0

2017-03-07 Thread Ricardo Neri
Section 2.2.1.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when a SIB byte is used and the
base of the SIB byte points to R/EBP (i.e., base = 5) and the mod part
of the ModRM byte is zero, the value of such register will not be used
as part of the address computation. To signal this, -EDOM is returned to
indicate to callers that they should ignore the value of the register.

Also, for this particular case, a 32-bit displacement should follow the SIB
byte if the mod part of ModRM is equal to zero. The instruction
decoder ensures that this is the case.
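
The rule above can be sketched in plain C as follows. This is only an
illustration under simplifying assumptions: REX prefixes are ignored,
regs[] is a hypothetical array of eight registers indexed by their
encoding, and all EX_* names and ex_sib_eff_addr() are made up for this
example rather than taken from the kernel.

```c
#include <stdint.h>

#define EX_MODRM_MOD(b)  (((b) >> 6) & 0x3)
#define EX_SIB_SCALE(b)  (((b) >> 6) & 0x3)
#define EX_SIB_INDEX(b)  (((b) >> 3) & 0x7)
#define EX_SIB_BASE(b)   ((b) & 0x7)

/*
 * regs[] holds eight general purpose registers indexed by their
 * encoding (0 = AX ... 5 = BP ...). REX prefixes are not handled.
 */
static long ex_sib_eff_addr(uint8_t modrm, uint8_t sib,
			    const long regs[8], long disp)
{
	long base, indx;

	/* mod == 0 with base == 5: no base register, displacement only. */
	if (EX_MODRM_MOD(modrm) == 0 && EX_SIB_BASE(sib) == 5)
		base = 0;
	else
		base = regs[EX_SIB_BASE(sib)];

	/* index == 4 means that no index register is used. */
	if (EX_SIB_INDEX(sib) == 4)
		indx = 0;
	else
		indx = regs[EX_SIB_INDEX(sib)];

	return base + indx * (1L << EX_SIB_SCALE(sib)) + disp;
}
```

With the same SIB byte, a non-zero mod makes the BP-encoded register
count as the base again.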

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Peter Zijlstra 
Cc: Nathan Howard 
Cc: Adan Hawthorn 
Cc: Joe Perches 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/mm/mpx.c | 29 ++---
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index d9e92d6..ef7eb67 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -121,6 +121,17 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 
case REG_TYPE_BASE:
regno = X86_SIB_BASE(insn->sib.value);
+   /*
+* If mod is 0 and register R/EBP (regno=5) is indicated in the
+* base part of the SIB byte, the value of such register should
+* not be used in the address computation. Also, a 32-bit
+* displacement is expected in this case; the instruction
+* decoder takes care of it. This is true for both R13 and
+* R/EBP as REX.B will not be decoded.
+*/
+   if (regno == 5 && X86_MODRM_MOD(insn->modrm.value) == 0)
+   return -EDOM;
+
if (X86_REX_B(insn->rex_prefix.value))
regno += 8;
break;
@@ -161,16 +172,21 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
eff_addr = regs_get_register(regs, addr_offset);
} else {
if (insn->sib.nbytes) {
+   /*
+* Negative values in the base and index offsets mean
+* an error when decoding the SIB byte, except -EDOM,
+* which means that the register should not be used
+* in the address computation.
+*/
base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
-   if (base_offset < 0)
+   if (unlikely(base_offset == -EDOM))
+   base = 0;
+   else if (unlikely(base_offset < 0))
goto out_err;
+   else
+   base = regs_get_register(regs, base_offset);
 
indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
-   /*
-* A negative offset generally means a error, except
-* -EDOM, which means that the contents of the register
-* should not be used as index.
-*/
if (unlikely(indx_offset == -EDOM))
indx = 0;
else if (unlikely(indx_offset < 0))
@@ -178,7 +194,6 @@ static void __user *mpx_get_addr_ref(struct insn *insn, struct pt_regs *regs)
else
indx = regs_get_register(regs, indx_offset);
 
-   base = regs_get_register(regs, base_offset);
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
-- 
2.9.3



[v6 PATCH 21/21] selftests/x86: Add tests for User-Mode Instruction Prevention

2017-03-07 Thread Ricardo Neri
Certain user space programs that run in virtual-8086 mode may utilize
instructions protected by the User-Mode Instruction Prevention (UMIP)
security feature present in new Intel processors: SGDT, SIDT and SMSW. In
such a case, a general protection fault is issued if UMIP is enabled. When
such a fault happens, the kernel catches it and emulates the results of
these instructions with dummy values. The purpose of this new test is to
verify whether the impacted instructions can be executed without causing
such a #GP. If no #GP exceptions occur, we expect to exit virtual-8086 mode
from INT 0x80.

The instructions protected by UMIP are executed in representative use
cases:
 a) the memory address of the result is given in the form of a displacement
from the base of the data segment
 b) the memory address of the result is given in a general purpose register
 c) the result is stored directly in a general purpose register.

Unfortunately, it is not possible to check the results against a set of
expected values because no emulation will occur in systems that do not have
the UMIP feature. Instead, results are printed for verification.
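
When inspecting the printed results, it may help to recall how the test
reads them back: SIDT and SGDT store a 2-byte limit followed by the base,
so the limit sits at the given offset and the base two bytes later. The
sketch below parses such an image. It assumes a little-endian host and a
4-byte base; struct ex_table_reg and ex_parse_table_reg() are
hypothetical names for this example and are not part of the selftest.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical view of the memory image stored by SIDT/SGDT. */
struct ex_table_reg {
	uint16_t limit;
	uint32_t base;
};

/* Parse limit (bytes 0..1) and base (bytes 2..5); little-endian host. */
static struct ex_table_reg ex_parse_table_reg(const unsigned char *mem)
{
	struct ex_table_reg r;

	memcpy(&r.limit, mem, sizeof(r.limit));
	memcpy(&r.base, mem + 2, sizeof(r.base));
	return r;
}
```

Using memcpy() avoids the unaligned loads that a direct pointer cast
would perform at offsets such as 2062.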

Cc: Andy Lutomirski 
Cc: Andrew Morton 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Chen Yucong 
Cc: Chris Metcalf 
Cc: Dave Hansen 
Cc: Fenghua Yu 
Cc: Huang Rui 
Cc: Jiri Slaby 
Cc: Jonathan Corbet 
Cc: Michael S. Tsirkin 
Cc: Paul Gortmaker 
Cc: Peter Zijlstra 
Cc: Ravi V. Shankar 
Cc: Shuah Khan 
Cc: Vlastimil Babka 
Signed-off-by: Ricardo Neri 
---
 tools/testing/selftests/x86/entry_from_vm86.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index d075ea0..377b773 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -95,6 +95,22 @@ asm (
"int3\n\t"
"vmcode_int80:\n\t"
"int $0x80\n\t"
+   "umip:\n\t"
+   /* addressing via displacements */
+   "smsw (2052)\n\t"
+   "sidt (2054)\n\t"
+   "sgdt (2060)\n\t"
+   /* addressing via registers */
+   "mov $2066, %bx\n\t"
+   "smsw (%bx)\n\t"
+   "mov $2068, %bx\n\t"
+   "sidt (%bx)\n\t"
+   "mov $2074, %bx\n\t"
+   "sgdt (%bx)\n\t"
+   /* register operands, only for smsw */
+   "smsw %ax\n\t"
+   "mov %ax, (2080)\n\t"
+   "int $0x80\n\t"
".size vmcode, . - vmcode\n\t"
"end_vmcode:\n\t"
".code32\n\t"
@@ -103,7 +119,7 @@ asm (
 
 extern unsigned char vmcode[], end_vmcode[];
 extern unsigned char vmcode_bound[], vmcode_sysenter[], vmcode_syscall[],
-   vmcode_sti[], vmcode_int3[], vmcode_int80[];
+   vmcode_sti[], vmcode_int3[], vmcode_int80[], umip[];
 
 /* Returns false if the test was skipped. */
 static bool do_test(struct vm86plus_struct *v86, unsigned long eip,
@@ -218,6 +234,27 @@ int main(void)
v86.regs.eax = (unsigned int)-1;
do_test(&v86, vmcode_int80 - vmcode, VM86_INTx, 0x80, "int80");
 
+   /* UMIP -- should exit with INTx 0x80 unless UMIP was not disabled */
+   do_test(&v86, umip - vmcode, VM86_INTx, 0x80, "UMIP tests");
+   printf("[INFO]\tResults of UMIP-protected instructions via displacements:\n");
+   printf("[INFO]\tSMSW:[0x%04x]\n", *(unsigned short *)(addr + 2052));
+   printf("[INFO]\tSIDT: limit[0x%04x]base[0x%08lx]\n",
+  *(unsigned short *)(addr + 2054),
+  *(unsigned long  *)(addr + 2056));
+   printf("[INFO]\tSGDT: limit[0x%04x]base[0x%08lx]\n",
+  *(unsigned short *)(addr + 2060),
+  *(unsigned long  *)(addr + 2062));
+   printf("[INFO]\tResults of UMIP-protected instructions via addressing in registers:\n");
+   printf("[INFO]\tSMSW:[0x%04x]\n", *(unsigned short *)(addr + 2066));
+   printf("[INFO]\tSIDT: limit[0x%04x]base[0x%08lx]\n",
+  *(unsigned short *)(addr + 2068),
+  *(unsigned long  *)(addr + 2070));
+   printf("[INFO]\tSGDT: limit[0x%04x]base[0x%08lx]\n",
+  *(unsigned short *)(addr + 2074),
+  *(unsigned long  *)(addr + 2076));
+   printf("[INFO]\tResults of SMSW via register operands:\n");
+   printf("[INFO]\tSMSW:[0x%04x]\n", *(unsigned short *)(addr + 2080));
+
/* Execute a null pointer */
v86.regs.cs = 0;
v86.regs.ss = 0;
-- 
2.9.3



[v6 PATCH 11/21] insn/eval: Incorporate segment base in address computation

2017-03-07 Thread Ricardo Neri
insn_get_addr_ref returns the effective address as defined by the
section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software
Developer's Manual. In order to compute the linear address, we must add
to the effective address the segment base address as set in the segment
descriptor. Furthermore, the segment descriptor to use depends on the
register that is used as the base of the effective address. The effective
base address varies depending on whether the operand is a register or a
memory address and on whether a SiB byte is used.

In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
segment is used or if segmentation is not used. However, the base address
is not necessarily zero if a user program defines its own segments. This
is possible by using a local descriptor table.

Since the effective address is a signed quantity, the unsigned segment
base address is saved in a separate variable and added to the final
effective address.
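
The arithmetic described above can be sketched as follows. Both helpers
are hypothetical and exist only for illustration: ex_vm86_seg_base()
models the virtual-8086 case, where the base is the selector shifted left
by 4 bits, and ex_linear_addr() adds an unsigned base to a signed
effective address.

```c
#include <stdint.h>

/* Virtual-8086 mode: the base is the selector shifted left by 4 bits. */
static unsigned long ex_vm86_seg_base(uint16_t selector)
{
	return (unsigned long)selector << 4;
}

/* Linear address = unsigned segment base + signed effective address. */
static unsigned long ex_linear_addr(unsigned long seg_base, long eff_addr)
{
	return seg_base + (unsigned long)eff_addr;
}
```

Converting the effective address to unsigned before the addition keeps
the arithmetic well defined when the effective address is negative.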

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/lib/insn-eval.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index ea10b03..edb360f 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -566,7 +566,7 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
  */
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 {
-   unsigned long linear_addr;
+   unsigned long linear_addr, seg_base_addr;
long eff_addr, base, indx;
int addr_offset, base_offset, indx_offset;
insn_byte_t sib;
@@ -580,6 +580,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
if (addr_offset < 0)
goto out_err;
eff_addr = regs_get_register(regs, addr_offset);
+   seg_base_addr = insn_get_seg_base(regs, insn, addr_offset,
+ false);
} else {
if (insn->sib.nbytes) {
/*
@@ -605,6 +607,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
indx = regs_get_register(regs, indx_offset);
 
eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
+   seg_base_addr = insn_get_seg_base(regs, insn,
+ base_offset, false);
} else {
addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
/* -EDOM means that we must ignore the address_offset.
@@ -623,10 +627,12 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
} else {
eff_addr = regs_get_register(regs, addr_offset);
}
+   seg_base_addr = insn_get_seg_base(regs, insn,
+ addr_offset, false);
}
eff_addr += insn->displacement.value;
}
-   linear_addr = (unsigned long)eff_addr;
+   linear_addr = (unsigned long)eff_addr + seg_base_addr;
 
return (void __user *)linear_addr;
 out_err:
-- 
2.9.3



[v6 PATCH 09/21] x86/insn-eval: Add functions to get default operand and address sizes

2017-03-07 Thread Ricardo Neri
These functions read the default values of the address and operand sizes
as specified in the segment descriptor. This information is determined
from the D and L bits. Hence, it can be used for both IA-32e 64-bit and
32-bit legacy modes. For virtual-8086 mode, the default address and
operand sizes are always 2 bytes.

The D bit is only meaningful for code segments. Thus, these functions
always use the code segment selector contained in regs.

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/include/asm/insn-eval.h |  2 +
 arch/x86/lib/insn-eval.c | 80 
 2 files changed, 82 insertions(+)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index b201742..a0d81fc 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -15,6 +15,8 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_modrm_rm(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_sib_base(struct insn *insn, struct pt_regs *regs);
 int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs);
+unsigned char insn_get_seg_default_address_bytes(struct pt_regs *regs);
+unsigned char insn_get_seg_default_operand_bytes(struct pt_regs *regs);
 unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
int regoff, bool use_default_seg);
 
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 383ca83..cda6c71 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -421,6 +421,86 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn,
 }
 
 /**
+ * insn_get_seg_default_address_bytes - Obtain default address size of segment
+ * @regs:  Set of registers containing the segment selector
+ *
+ * Obtain the default address size as indicated in the segment descriptor
+ * selected in regs' code segment selector. In protected mode, the default
+ * address size is determined by inspecting the L and D bits of the segment
+ * descriptor. In virtual-8086 mode, the default is always two bytes.
+ *
+ * Return: Default address size of segment
+ */
+unsigned char insn_get_seg_default_address_bytes(struct pt_regs *regs)
+{
+   struct desc_struct *desc;
+   unsigned short seg;
+   int ret;
+
+   if (v8086_mode(regs))
+   return 2;
+
+   seg = (unsigned short)regs->cs;
+
+   ret = get_desc(seg, &desc);
+   if (ret)
+   return 0;
+
+   switch ((desc->l << 1) | desc->d) {
+   case 0: /* Legacy mode. 16-bit addresses. CS.L=0, CS.D=0 */
+   return 2;
+   case 1: /* Legacy mode. 32-bit addresses. CS.L=0, CS.D=1 */
+   return 4;
+   case 2: /* IA-32e 64-bit mode. 64-bit addresses. CS.L=1, CS.D=0 */
+   return 8;
+   case 3: /* Invalid setting. CS.L=1, CS.D=1 */
+   /* fall through */
+   default:
+   return 0;
+   }
+}
+
+/**
+ * insn_get_seg_default_operand_bytes - Obtain default operand size of segment
+ * @regs:  Set of registers containing the segment selector
+ *
+ * Obtain the default operand size as indicated in the segment descriptor
+ * selected in regs' code segment selector. In protected mode, the default
+ * operand size is determined by inspecting the L and D bits of the segment
+ * descriptor. In virtual-8086 mode, the default is always two bytes.
+ *
+ * Return: Default operand size of segment
+ */
+unsigned char insn_get_seg_default_operand_bytes(struct pt_regs *regs)
+{
+   struct desc_struct *desc;
+   unsigned short seg;
+   int ret;
+
+   if (v8086_mode(regs))
+   return 2;
+
+   seg = (unsigned short)regs->cs;
+
+   ret = get_desc(seg, &desc);
+   if (ret)
+   return 0;
+
+   switch ((desc->l << 1) | desc->d) {
+   case 0: /* Legacy mode. 16-bit or 8-bit operands CS.L=0, CS.D=0 */
+   return 2;
+   case 1: /* Legacy mode. 32- or 8 bit operands CS.L=0, CS.D=1 */
+   /* fall through */
+   case 2: /* IA-32e 64-bit mode. 32- or 8-bit opnds. CS.L=1, CS.D=0 */
+   return 4;
+   case 3: /* Invalid setting. CS.L=1, CS.D=1 */
+   /* fall through */
+   default:
+   return 0;
+   }
+}

[v6 PATCH 07/21] x86/insn-eval: Add utility function to get segment descriptor

2017-03-07 Thread Ricardo Neri
The segment descriptor contains information that is relevant to how linear
addresses need to be computed. It contains the default size of addresses as
well as the base address of the segment. Thus, given a segment selector,
we ought to look at the segment descriptor to correctly calculate the linear
address.

In protected mode, the segment selector might indicate a segment
descriptor from either the global descriptor table or a local descriptor
table. Both cases are considered in this function.

This function is the initial implementation for subsequent functions that
will obtain the aforementioned attributes of the segment descriptor.
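
As background for the lookup, the sketch below splits a segment selector
into its architectural fields: the requested privilege level in bits 1:0,
the table indicator in bit 2 (0 for the GDT, 1 for the LDT) and the
descriptor index in bits 15:3. Since descriptors are 8 bytes long,
clearing the three low bits gives the byte offset into the table. The
EX_* macros and ex_sel_*() helpers are hypothetical names for this
example; the patch itself uses the kernel's SEGMENT_* masks.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_SEL_RPL_MASK  0x3  /* bits 1:0, requested privilege level */
#define EX_SEL_TI_MASK   0x4  /* bit 2, table: 0 = GDT, 1 = LDT */

/* Descriptor index, bits 15:3 of the selector. */
static unsigned int ex_sel_index(uint16_t sel)
{
	return sel >> 3;
}

/* True if the selector refers to the local descriptor table. */
static bool ex_sel_is_ldt(uint16_t sel)
{
	return (sel & EX_SEL_TI_MASK) != 0;
}

/* Requested privilege level. */
static unsigned int ex_sel_rpl(uint16_t sel)
{
	return sel & EX_SEL_RPL_MASK;
}

/* Byte offset of the 8-byte descriptor within its table. */
static unsigned int ex_sel_byte_offset(uint16_t sel)
{
	return sel & ~(EX_SEL_RPL_MASK | EX_SEL_TI_MASK);
}
```

For instance, selector 0x2b decodes to index 5 in the GDT with RPL 3, at
byte offset 40.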

Cc: Dave Hansen 
Cc: Adam Buchbinder 
Cc: Colin Ian King 
Cc: Lorenzo Stoakes 
Cc: Qiaowei Ren 
Cc: Arnaldo Carvalho de Melo 
Cc: Masami Hiramatsu 
Cc: Adrian Hunter 
Cc: Kees Cook 
Cc: Thomas Garnier 
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Dmitry Vyukov 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/lib/insn-eval.c | 61 
 1 file changed, 61 insertions(+)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 8d45df8..8608adf 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -5,9 +5,13 @@
  */
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 enum reg_type {
@@ -294,6 +298,63 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
 }
 
 /**
+ * get_desc() - Obtain address of segment descriptor
+ * @seg:   Segment selector
+ * @desc:  Pointer to the selected segment descriptor
+ *
+ * Given a segment selector, obtain a memory pointer to the segment
+ * descriptor. Both global and local descriptor tables are supported.
+ * desc will contain the address of the descriptor.
+ *
+ * Return: 0 if success, -EINVAL if failure
+ */
+static int get_desc(unsigned short seg, struct desc_struct **desc)
+{
+   struct desc_ptr gdt_desc = {0, 0};
+   unsigned long desc_base;
+
+   if (!desc)
+   return -EINVAL;
+
+   desc_base = seg & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
+
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+   if ((seg & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+   seg >>= 3;
+
+   mutex_lock(&current->active_mm->context.lock);
+   if (unlikely(!current->active_mm->context.ldt ||
+seg >= current->active_mm->context.ldt->size)) {
+   *desc = NULL;
+   mutex_unlock(&current->active_mm->context.lock);
+   return -EINVAL;
+   }
+
+   *desc = &current->active_mm->context.ldt->entries[seg];
+   mutex_unlock(&current->active_mm->context.lock);
+   return 0;
+   }
+#endif
+   native_store_gdt(&gdt_desc);
+
+   /*
+* Bits [15:3] of the segment selector contain the index. Such
+* index needs to be multiplied by 8. However, as the index
+* least significant bit is already in bit 3, we don't have
+* to perform the multiplication.
+*/
+   desc_base = seg & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
+
+   if (desc_base > gdt_desc.size) {
+   *desc = NULL;
+   return -EINVAL;
+   }
+
+   *desc = (struct desc_struct *)(gdt_desc.address + desc_base);
+   return 0;
+}
+
+/**
  * insn_get_reg_offset_modrm_rm - Obtain register in r/m part of ModRM byte
  * @insn:  Instruction structure containing the ModRM byte
  * @regs:  Set of registers indicated by the ModRM byte
-- 
2.9.3
