Re: [PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings
On Wed, 2017-06-07 at 18:28 +0200, Borislav Petkov wrote: > On Fri, May 05, 2017 at 11:17:16AM -0700, Ricardo Neri wrote: > > Tasks running in virtual-8086 mode or in protected mode with code > > segment descriptors that specify 16-bit default address sizes via the > > D bit will use 16-bit addressing form encodings as described in the Intel > > 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section > > 2.1.5. 16-bit addressing encodings differ in several ways from the > > 32-bit/64-bit addressing form encodings: ModRM.rm points to different > > registers and, in some cases, effective addresses are indicated by the > > addition of the value of two registers. Also, there is no support for SIB > > bytes. Thus, a separate function is needed to parse this form of > > addressing. > > > > A couple of functions are introduced. get_reg_offset_16() obtains the > > offset from the base of pt_regs of the registers indicated by the ModRM > > byte of the address encoding. get_addr_ref_16() computes the linear > > address indicated by the instructions using the value of the registers > > given by ModRM as well as the base address of the segment. > > > > Cc: Dave Hansen> > Cc: Adam Buchbinder > > Cc: Colin Ian King > > Cc: Lorenzo Stoakes > > Cc: Qiaowei Ren > > Cc: Arnaldo Carvalho de Melo > > Cc: Masami Hiramatsu > > Cc: Adrian Hunter > > Cc: Kees Cook > > Cc: Thomas Garnier > > Cc: Peter Zijlstra > > Cc: Borislav Petkov > > Cc: Dmitry Vyukov > > Cc: Ravi V. Shankar > > Cc: x...@kernel.org > > Signed-off-by: Ricardo Neri > > --- > > arch/x86/lib/insn-eval.c | 155 > > +++ > > 1 file changed, 155 insertions(+) > > > > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c > > index 9822061..928a662 100644 > > --- a/arch/x86/lib/insn-eval.c > > +++ b/arch/x86/lib/insn-eval.c > > @@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct > > pt_regs *regs, > > } > > > > /** > > + * get_reg_offset_16 - Obtain offset of register indicated by instruction > > Please end function names with parentheses. I will correct. > > > + * @insn: Instruction structure containing ModRM and SiB bytes > > s/SiB/SIB/g I will correct. > > > + * @regs: Structure with register values as seen when entering kernel mode > > + * @offs1: Offset of the first operand register > > + * @offs2: Offset of the second opeand register, if applicable. > > + * > > + * Obtain the offset, in pt_regs, of the registers indicated by the ModRM > > byte > > + * within insn. This function is to be used with 16-bit address encodings. > > The > > + * offs1 and offs2 will be written with the offset of the two registers > > + * indicated by the instruction. In cases where any of the registers is not > > + * referenced by the instruction, the value will be set to -EDOM. > > + * > > + * Return: 0 on success, -EINVAL on failure. > > + */ > > +static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs, > > +int *offs1, int *offs2) > > +{ > > + /* 16-bit addressing can use one or two registers */ > > + static const int regoff1[] = { > > + offsetof(struct pt_regs, bx), > > + offsetof(struct pt_regs, bx), > > + offsetof(struct pt_regs, bp), > > + offsetof(struct pt_regs, bp), > > + offsetof(struct pt_regs, si), > > + offsetof(struct pt_regs, di), > > + offsetof(struct pt_regs, bp), > > + offsetof(struct pt_regs, bx), > > + }; > > + > > + static const int regoff2[] = { > > + offsetof(struct pt_regs, si), > > + offsetof(struct pt_regs, di), > > + offsetof(struct pt_regs, si), > > + offsetof(struct pt_regs, di), > > + -EDOM, > > + -EDOM, > > + -EDOM, > > + -EDOM, > > + }; > > You mean "Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte" in > the SDM, right? Yes. > > Please add a comment pointing to it here because it is not trivial to > map that code to the documentation. Sure, I will add a comment pointing to this table. > > > + > > + if (!offs1 || !offs2) > > + return -EINVAL; > > + > > + /* operand is a register, use the generic function */ > > + if (X86_MODRM_MOD(insn->modrm.value) == 3) { > > + *offs1 = insn_get_modrm_rm_off(insn, regs); > > + *offs2 = -EDOM; > > + return 0; > > + } > > + > > + *offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)]; > > + *offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)]; > > + > > + /* > > +* If no displacement is
Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses
On Wed, 2017-06-07 at 17:49 +0200, Borislav Petkov wrote: > On Fri, May 05, 2017 at 11:17:14AM -0700, Ricardo Neri wrote: > > @@ -697,18 +753,21 @@ void __user *insn_get_addr_ref(struct insn *insn, > > struct pt_regs *regs) > > { > > unsigned long linear_addr, seg_base_addr, seg_limit; > > long eff_addr, base, indx; > > - int addr_offset, base_offset, indx_offset; > > + int addr_offset, base_offset, indx_offset, addr_bytes; > > insn_byte_t sib; > > > > insn_get_modrm(insn); > > insn_get_sib(insn); > > sib = insn->sib.value; > > + addr_bytes = insn->addr_bytes; > > > > if (X86_MODRM_MOD(insn->modrm.value) == 3) { > > addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM); > > if (addr_offset < 0) > > goto out_err; > > - eff_addr = regs_get_register(regs, addr_offset); > > + eff_addr = get_mem_offset(regs, addr_offset, addr_bytes); > > + if (eff_addr == -1L) > > + goto out_err; > > seg_base_addr = insn_get_seg_base(regs, insn, addr_offset); > > if (seg_base_addr == -1L) > > goto out_err; > > This code here is too dense, it needs spacing for better readability. I have spaced out in my upcoming version. Thanks and BR, Ricardo -- To unsubscribe from this list: send the line "unsubscribe linux-msdos" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 14/26] x86/insn-eval: Indicate a 32-bit displacement if ModRM.mod is 0 and ModRM.rm is 5
On Wed, 2017-06-07 at 15:15 +0200, Borislav Petkov wrote: > On Fri, May 05, 2017 at 11:17:12AM -0700, Ricardo Neri wrote: > > Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software > > Developer's Manual volume 2A states that when ModRM.mod is zero and > > ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means > > that none of the registers are used in the computation of the effective > > address. A return value of -EDOM signals callers that they should not use > > the value of registers when computing the effective address for the > > instruction. > > > > In IA-32e 64-bit mode (long mode), the effective address is given by the > > 32-bit displacement plus the value of RIP of the next instruction. > > In IA-32e compatibility mode (protected mode), only the displacement is > > used. > > > > The instruction decoder takes care of obtaining the displacement. > > ... > > > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c > > index 693e5a8..4f600de 100644 > > --- a/arch/x86/lib/insn-eval.c > > +++ b/arch/x86/lib/insn-eval.c > > @@ -379,6 +379,12 @@ static int get_reg_offset(struct insn *insn, struct > > pt_regs *regs, > > switch (type) { > > case REG_TYPE_RM: > > regno = X86_MODRM_RM(insn->modrm.value); > > > < newline here. Will add the new line. > > > + /* > > +* ModRM.mod == 0 and ModRM.rm == 5 means a 32-bit displacement > > +* follows the ModRM byte. > > +*/ > > + if (!X86_MODRM_MOD(insn->modrm.value) && regno == 5) > > + return -EDOM; > > if (X86_REX_B(insn->rex_prefix.value)) > > regno += 8; > > break; > > @@ -730,9 +736,21 @@ void __user *insn_get_addr_ref(struct insn *insn, > > struct pt_regs *regs) > > eff_addr = base + indx * (1 << X86_SIB_SCALE(sib)); > > } else { > > addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM); > > - if (addr_offset < 0) > > ditto. Will add the new line. > > > + /* > > +* -EDOM means that we must ignore the address_offset. > > +* In such a case, in 64-bit mode the effective address > > +* relative to the RIP of the following instruction. > > +*/ > > + if (addr_offset == -EDOM) { > > + eff_addr = 0; > > + if (user_64bit_mode(regs)) > > + eff_addr = (long)regs->ip + > > + insn->length; > > Let that line stick out and write it balanced: > > if (addr_offset == -EDOM) { > if (user_64bit_mode(regs)) > eff_addr = (long)regs->ip + > insn->length; > else > eff_addr = 0; > > should be easier parseable this way. Will rewrite as you suggest. Thanks and BR, Ricardo -- To unsubscribe from this list: send the line "unsubscribe linux-msdos" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 13/26] x86/insn-eval: Add function to get default params of code segment
On Wed, 2017-06-07 at 14:59 +0200, Borislav Petkov wrote: > On Fri, May 05, 2017 at 11:17:11AM -0700, Ricardo Neri wrote: > > This function returns the default values of the address and operand sizes > > as specified in the segment descriptor. This information is determined > > from the D and L bits. Hence, it can be used for both IA-32e 64-bit and > > 32-bit legacy modes. For virtual-8086 mode, the default address and > > operand sizes are always 2 bytes. > > > > The D bit is only meaningful for code segments. Thus, these functions > > always use the code segment selector contained in regs. > > > > Cc: Dave Hansen> > Cc: Adam Buchbinder > > Cc: Colin Ian King > > Cc: Lorenzo Stoakes > > Cc: Qiaowei Ren > > Cc: Arnaldo Carvalho de Melo > > Cc: Masami Hiramatsu > > Cc: Adrian Hunter > > Cc: Kees Cook > > Cc: Thomas Garnier > > Cc: Peter Zijlstra > > Cc: Borislav Petkov > > Cc: Dmitry Vyukov > > Cc: Ravi V. Shankar > > Cc: x...@kernel.org > > Signed-off-by: Ricardo Neri > > --- > > arch/x86/include/asm/insn-eval.h | 6 > > arch/x86/lib/insn-eval.c | 65 > > > > 2 files changed, 71 insertions(+) > > > > diff --git a/arch/x86/include/asm/insn-eval.h > > b/arch/x86/include/asm/insn-eval.h > > index 7f3c7fe..9ed1c88 100644 > > --- a/arch/x86/include/asm/insn-eval.h > > +++ b/arch/x86/include/asm/insn-eval.h > > @@ -11,9 +11,15 @@ > > #include > > #include > > > > +struct insn_code_seg_defaults { > > A whole struct for a function which gets called only once? > > Bah, that's a bit too much, if you ask me. > > So you're returning two small unsigned integers - i.e., you can just as > well return a single u8 and put address and operand sizes in there: > > ret = oper_sz | addr_sz << 4; > > No need for special structs for that. OK. This makes sense. Perhaps I can use a couple of #define's to set and get the the address and operand sizes in a single u8. This would make the code more readable. > > > + unsigned char address_bytes; > > + unsigned char operand_bytes; > > +}; > > + > > void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs); > > int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs); > > unsigned long insn_get_seg_base(struct pt_regs *regs, struct insn *insn, > > int regoff); > > +struct insn_code_seg_defaults insn_get_code_seg_defaults(struct pt_regs > > *regs); > > > > #endif /* _ASM_X86_INSN_EVAL_H */ > > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c > > index c77ed80..693e5a8 100644 > > --- a/arch/x86/lib/insn-eval.c > > +++ b/arch/x86/lib/insn-eval.c > > @@ -603,6 +603,71 @@ static unsigned long get_seg_limit(struct pt_regs > > *regs, struct insn *insn, > > } > > > > /** > > + * insn_get_code_seg_defaults() - Obtain code segment default parameters > > + * @regs: Structure with register values as seen when entering kernel mode > > + * > > + * Obtain the default parameters of the code segment: address and operand > > sizes. > > + * The code segment is obtained from the selector contained in the CS > > register > > + * in regs. In protected mode, the default address is determined by > > inspecting > > + * the L and D bits of the segment descriptor. In virtual-8086 mode, the > > default > > + * is always two bytes for both address and operand sizes. > > + * > > + * Return: A populated insn_code_seg_defaults structure on success. The > > + * structure contains only zeros on failure. > > s/failure/error/ Will correct. > > > + */ > > +struct insn_code_seg_defaults insn_get_code_seg_defaults(struct pt_regs > > *regs) > > +{ > > + struct desc_struct *desc; > > + struct insn_code_seg_defaults defs; > > + unsigned short sel; > > + /* > > +* The most significant byte of AR_TYPE_MASK determines whether a > > +* segment contains data or code. > > +*/ > > + unsigned int type_mask = AR_TYPE_MASK & (1 << 11); > > + > > + memset(, 0, sizeof(defs)); > > + > > + if (v8086_mode(regs)) { > > + defs.address_bytes = 2; > > + defs.operand_bytes = 2; > > + return defs; > > + } > > + > > + sel = (unsigned short)regs->cs; > > + > > + desc = get_desc(sel); > > + if (!desc) > > + return defs; > > + > > + /* if data segment, return */ > > + if (!(desc->b & type_mask)) > > + return defs; > > So you can simplify that into: > > /* A code segment? */ > if (!(desc->b & BIT(11))) > return defs; > > and remove that type_mask thing. Alternatively, I can do desc->type & BIT(3) to avoid
Re: [PATCH v7 10/26] x86/insn-eval: Add utility functions to get segment selector
On Thu, 2017-06-15 at 11:37 -0700, Ricardo Neri wrote: > > Yuck, didn't we talk about this already? > > I am sorry Borislav. I thought you agreed that I could use the values > of > the segment override prefixes to identify the segment registers [1]. This time with the reference: [1]. https://lkml.org/lkml/2017/5/5/377 -- To unsubscribe from this list: send the line "unsubscribe linux-msdos" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 10/26] x86/insn-eval: Add utility functions to get segment selector
On Tue, 2017-05-30 at 12:35 +0200, Borislav Petkov wrote: > On Fri, May 05, 2017 at 11:17:08AM -0700, Ricardo Neri wrote: > > When computing a linear address and segmentation is used, we need to know > > the base address of the segment involved in the computation. In most of > > the cases, the segment base address will be zero as in USER_DS/USER32_DS. > > However, it may be possible that a user space program defines its own > > segments via a local descriptor table. In such a case, the segment base > > address may not be zero .Thus, the segment base address is needed to > > calculate correctly the linear address. > > > > The segment selector to be used when computing a linear address is > > determined by either any of segment override prefixes in the > > instruction or inferred from the registers involved in the computation of > > the effective address; in that order. Also, there are cases when the > > overrides shall be ignored (code segments are always selected by the CS > > segment register; string instructions always use the ES segment register > > along with the EDI register). > > > > For clarity, this process can be split into two steps: resolving the > > relevant segment register to use and, once known, read its value to > > obtain the segment selector. > > > > The method to obtain the segment selector depends on several factors. In > > 32-bit builds, segment selectors are saved into the pt_regs structure > > when switching to kernel mode. The same is also true for virtual-8086 > > mode. In 64-bit builds, segmentation is mostly ignored, except when > > running a program in 32-bit legacy mode. In this case, CS and SS can be > > obtained from pt_regs. DS, ES, FS and GS can be read directly from > > the respective segment registers. > > > > Lastly, the only two segment registers that are not ignored in long mode > > are FS and GS. In these two cases, base addresses are obtained from the > > respective MSRs. > > > > Cc: Dave Hansen> > Cc: Adam Buchbinder > > Cc: Colin Ian King > > Cc: Lorenzo Stoakes > > Cc: Qiaowei Ren > > Cc: Arnaldo Carvalho de Melo > > Cc: Masami Hiramatsu > > Cc: Adrian Hunter > > Cc: Kees Cook > > Cc: Thomas Garnier > > Cc: Peter Zijlstra > > Cc: Borislav Petkov > > Cc: Dmitry Vyukov > > Cc: Ravi V. Shankar > > Cc: x...@kernel.org > > Signed-off-by: Ricardo Neri > > --- > > arch/x86/lib/insn-eval.c | 256 > > +++ > > 1 file changed, 256 insertions(+) > > > > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c > > index 1634762..0a496f4 100644 > > --- a/arch/x86/lib/insn-eval.c > > +++ b/arch/x86/lib/insn-eval.c > > @@ -9,6 +9,7 @@ > > #include > > #include > > #include > > +#include > > > > enum reg_type { > > REG_TYPE_RM = 0, > > @@ -33,6 +34,17 @@ enum string_instruction { > > SCASW_SCASD = 0xaf, > > }; > > > > +enum segment_register { > > + SEG_REG_INVAL = -1, > > + SEG_REG_IGNORE = 0, > > + SEG_REG_CS = 0x23, > > + SEG_REG_SS = 0x36, > > + SEG_REG_DS = 0x3e, > > + SEG_REG_ES = 0x26, > > + SEG_REG_FS = 0x64, > > + SEG_REG_GS = 0x65, > > +}; > > Yuck, didn't we talk about this already? I am sorry Borislav. I thought you agreed that I could use the values of the segment override prefixes to identify the segment registers [1]. > > Those are segment override prefixes so call them as such. > > #define SEG_OVR_PFX_CS0x23 > #define SEG_OVR_PFX_SS0x36 > ... > > and we already have those! > > arch/x86/include/asm/inat.h: > ... > #define INAT_PFX_CS 5 /* 0x2E */ > #define INAT_PFX_DS 6 /* 0x3E */ > #define INAT_PFX_ES 7 /* 0x26 */ > #define INAT_PFX_FS 8 /* 0x64 */ > #define INAT_PFX_GS 9 /* 0x65 */ > #define INAT_PFX_SS 10 /* 0x36 */ > > well, kinda, they're numbers there and not the actual prefix values. These numbers can 'translated' to the actual value of the prefixes via inat_get_opcode_attribute(). In my next version I am planning to use these function and reuse the aforementioned definitions. > > And then there's: > > arch/x86/kernel/uprobes.c::is_prefix_bad() which looks at some of those. > > Please add your defines to inat.h Will do. > and make that function is_prefix_bad() > use them instead of naked numbers. We need to pay attention to all those > different things needing to look at insn opcodes and not let them go > unwieldy by each defining and duplicating stuff. I have implemented this change and will be part of my next version. > > > /** > > * is_string_instruction - Determine if instruction is a string