insn-eval: Add support to resolve 16-bit addressing encodings

Ricardo Neri Thu, 15 Jun 2017 14:51:07 -0700

On Wed, 2017-06-07 at 18:28 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:16AM -0700, Ricardo Neri wrote:
> > Tasks running in virtual-8086 mode or in protected mode with code
> > segment descriptors that specify 16-bit default address sizes via the
> > D bit will use 16-bit addressing form encodings as described in the Intel
> > 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section
> > 2.1.5. 16-bit addressing encodings differ in several ways from the
> > 32-bit/64-bit addressing form encodings: ModRM.rm points to different
> > registers and, in some cases, effective addresses are indicated by the
> > addition of the value of two registers. Also, there is no support for SIB
> > bytes. Thus, a separate function is needed to parse this form of
> > addressing.
> > 
> > A couple of functions are introduced. get_reg_offset_16() obtains the
> > offset from the base of pt_regs of the registers indicated by the ModRM
> > byte of the address encoding. get_addr_ref_16() computes the linear
> > address indicated by the instructions using the value of the registers
> > given by ModRM as well as the base address of the segment.
> > 
> > Cc: Dave Hansen <[email protected]>
> > Cc: Adam Buchbinder <[email protected]>
> > Cc: Colin Ian King <[email protected]>
> > Cc: Lorenzo Stoakes <[email protected]>
> > Cc: Qiaowei Ren <[email protected]>
> > Cc: Arnaldo Carvalho de Melo <[email protected]>
> > Cc: Masami Hiramatsu <[email protected]>
> > Cc: Adrian Hunter <[email protected]>
> > Cc: Kees Cook <[email protected]>
> > Cc: Thomas Garnier <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > Cc: Dmitry Vyukov <[email protected]>
> > Cc: Ravi V. Shankar <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Ricardo Neri <[email protected]>
> > ---
> >  arch/x86/lib/insn-eval.c | 155 
> > +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 155 insertions(+)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 9822061..928a662 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct 
> > pt_regs *regs,
> >  }
> >  
> >  /**
> > + * get_reg_offset_16 - Obtain offset of register indicated by instruction
> 
> Please end function names with parentheses.


I will correct.
> 
> > + * @insn:  Instruction structure containing ModRM and SiB bytes
> 
> s/SiB/SIB/g

I will correct.
> 
> > + * @regs:  Structure with register values as seen when entering kernel mode
> > + * @offs1: Offset of the first operand register
> > + * @offs2: Offset of the second opeand register, if applicable.
> > + *
> > + * Obtain the offset, in pt_regs, of the registers indicated by the ModRM 
> > byte
> > + * within insn. This function is to be used with 16-bit address encodings. 
> > The
> > + * offs1 and offs2 will be written with the offset of the two registers
> > + * indicated by the instruction. In cases where any of the registers is not
> > + * referenced by the instruction, the value will be set to -EDOM.
> > + *
> > + * Return: 0 on success, -EINVAL on failure.
> > + */
> > +static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
> > +                        int *offs1, int *offs2)
> > +{
> > +   /* 16-bit addressing can use one or two registers */
> > +   static const int regoff1[] = {
> > +           offsetof(struct pt_regs, bx),
> > +           offsetof(struct pt_regs, bx),
> > +           offsetof(struct pt_regs, bp),
> > +           offsetof(struct pt_regs, bp),
> > +           offsetof(struct pt_regs, si),
> > +           offsetof(struct pt_regs, di),
> > +           offsetof(struct pt_regs, bp),
> > +           offsetof(struct pt_regs, bx),
> > +   };
> > +
> > +   static const int regoff2[] = {
> > +           offsetof(struct pt_regs, si),
> > +           offsetof(struct pt_regs, di),
> > +           offsetof(struct pt_regs, si),
> > +           offsetof(struct pt_regs, di),
> > +           -EDOM,
> > +           -EDOM,
> > +           -EDOM,
> > +           -EDOM,
> > +   };
> 
> You mean "Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte" in
> the SDM, right?

Yes.
> 
> Please add a comment pointing to it here because it is not trivial to
> map that code to the documentation.

Sure, I will add a comment pointing to this table.

> 
> > +
> > +   if (!offs1 || !offs2)
> > +           return -EINVAL;
> > +
> > +   /* operand is a register, use the generic function */
> > +   if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> > +           *offs1 = insn_get_modrm_rm_off(insn, regs);
> > +           *offs2 = -EDOM;
> > +           return 0;
> > +   }
> > +
> > +   *offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
> > +   *offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
> > +
> > +   /*
> > +    * If no displacement is indicated in the mod part of the ModRM byte,
> 
> s/"no "//
> 
> > +    * (mod part is 0) and the r/m part of the same byte is 6, no register
> > +    * is used caculate the operand address. An r/m part of 6 means that
> > +    * the second register offset is already invalid.

Perhaps my comment was misleading. When ModRM.mod is 0, no displacement
is used except for ModRM.mod = 0 and ModRM.rm 110b. In this case we have
displacement-only addressing. I will reword the comment to reflect this
fact.

> > +    */
> > +   if ((X86_MODRM_MOD(insn->modrm.value) == 0) &&
> > +       (X86_MODRM_RM(insn->modrm.value) == 6))
> > +           *offs1 = -EDOM;
> > +
> > +   return 0;
> > +}
> > +
> > +/**
> >   * get_desc() - Obtain address of segment descriptor
> >   * @sel:   Segment selector
> >   *
> > @@ -689,6 +756,94 @@ int insn_get_modrm_rm_off(struct insn *insn, struct 
> > pt_regs *regs)
> >  }
> >  
> >  /**
> > + * get_addr_ref_16() - Obtain the 16-bit address referred by instruction
> > + * @insn:  Instruction structure containing ModRM byte and displacement
> > + * @regs:  Structure with register values as seen when entering kernel mode
> > + *
> > + * This function is to be used with 16-bit address encodings. Obtain the 
> > memory
> > + * address referred by the instruction's ModRM bytes and displacement. 
> > Also, the
> > + * segment used as base is determined by either any segment override 
> > prefixes in
> > + * insn or the default segment of the registers involved in the address
> > + * computation. In protected mode, segment limits are enforced.
> > + *
> > + * Return: linear address referenced by instruction and registers on 
> > success.
> > + * -1L on failure.
> > + */
> > +static void __user *get_addr_ref_16(struct insn *insn, struct pt_regs 
> > *regs)
> > +{
> > +   unsigned long linear_addr, seg_base_addr, seg_limit;
> > +   short eff_addr, addr1 = 0, addr2 = 0;
> > +   int addr_offset1, addr_offset2;
> > +   int ret;
> > +
> > +   insn_get_modrm(insn);
> > +   insn_get_displacement(insn);
> > +
> > +   /*
> > +    * If operand is a register, the layout is the same as in
> > +    * 32-bit and 64-bit addressing.
> > +    */
> > +   if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> > +           addr_offset1 = get_reg_offset(insn, regs, REG_TYPE_RM);
> > +           if (addr_offset1 < 0)
> > +                   goto out_err;
> 
> <---- newline here.

Will add newline.

> 
> > +           eff_addr = regs_get_register(regs, addr_offset1);
> > +           seg_base_addr = insn_get_seg_base(regs, insn, addr_offset1);
> > +           if (seg_base_addr == -1L)
> > +                   goto out_err;
> 
> ditto.

Will add newline.
> 
> > +           seg_limit = get_seg_limit(regs, insn, addr_offset1);
> > +   } else {
> > +           ret = get_reg_offset_16(insn, regs, &addr_offset1,
> > +                                   &addr_offset2);
> > +           if (ret < 0)
> > +                   goto out_err;
> 
> ditto.

Will add newline.
> 
> > +           /*
> > +            * Don't fail on invalid offset values. They might be invalid
> > +            * because they cannot be used for this particular value of
> > +            * the ModRM. Instead, use them in the computation only if
> > +            * they contain a valid value.
> > +            */
> > +           if (addr_offset1 != -EDOM)
> > +                   addr1 = 0xffff & regs_get_register(regs, addr_offset1);
> > +           if (addr_offset2 != -EDOM)
> > +                   addr2 = 0xffff & regs_get_register(regs, addr_offset2);
> > +           eff_addr = addr1 + addr2;
> 
> ditto.

Will add newline.
> 
> Space those codelines out, we want to be able to read that code again at
> some point :-)))

Sure! I have gone through all this code adding newlines as necessary.

> 
> > +           /*
> > +            * The first register is in the operand implies the SS or DS
> > +            * segment selectors, the second register in the operand can
> > +            * only imply DS. Thus, use the first register to obtain
> > +            * the segment selector.
> > +            */
> > +           seg_base_addr = insn_get_seg_base(regs, insn, addr_offset1);
> > +           if (seg_base_addr == -1L)
> > +                   goto out_err;
> > +           seg_limit = get_seg_limit(regs, insn, addr_offset1);
> > +
> > +           eff_addr += (insn->displacement.value & 0xffff);
> > +   }
> > +
> > +   linear_addr = (unsigned long)(eff_addr & 0xffff);
> > +
> > +   /*
> > +    * Make sure the effective address is within the limits of the
> > +    * segment. In long mode, the limit is -1L. Thus, the second part
> 
> Long mode in a 16-bit handling function?

Yes, this is not correct. However, it is true for virtual-8086 mode. I
will update the comment accordingly.

Thanks and BR,
Ricardo

--
To unsubscribe from this list: send the line "unsubscribe linux-msdos" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings

Reply via email to