Re: [PATCH] uprobes: fix scratch register selection for rip-relative fixups
On Fri, 2014-05-02 at 17:04 +0200, Denys Vlasenko wrote: > On 05/02/2014 02:48 AM, Jim Keniston wrote: > > On Thu, 2014-05-01 at 19:09 +0200, Denys Vlasenko wrote: > >> +#define VEX2_(insn) X86_VEX_V((insn)->vex_prefix.bytes[1]) > >> +#define VEX3_(insn) X86_VEX_V((insn)->vex_prefix.bytes[2]) > > > > I disclaim any knowledge about VEX* stuff. > ... > > skipped this next part... > > > >> + /* Similar treatment for VEX3 prefix */ > >> + /* TODO: add XOP/EVEX treatment when insn decoder supports them */ > >> + if (insn->vex_prefix.nbytes == 3) { > >> + /* > >> + * vex2: c5rLpp (has no b bit) > >> + * vex3/xop: c4/8f rxbm wLpp > >> + * evex: 62rxbR00mm.w1pp.zllBVaaa > >> + * (evex will need setting of both b and x since > >> + * in non-sib encoding evex.x is 4th bit of MODRM.rm) > >> + * Setting VEX3.b (setting because it has inverted meaning): > >> + */ > >> + cursor = auprobe->insn + insn_offset_vex_prefix(insn) + 1; > >> + *cursor |= 0x20; > >>} > > > > ... resumed reviewing here. > > I do realize that most people reading this code won't be > aficionados of memorizing x86 insn encoding quirks. > > To help them, I added the comment which spells out bit layout > of vex/evex/xop prefixes (see above). Does it help? > ... Well, yes. I read your comments in conjunction with the "VEX Prefix" wikipedia page and figured out what you're doing, and it looks right. I would definitely leave the comments in; otherwise there's a chance that even you won't remember all this stuff 6 months from now. Jim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] uprobes: fix scratch register selection for rip-relative fixups
On Fri, 2014-05-02 at 17:04 +0200, Denys Vlasenko wrote: On 05/02/2014 02:48 AM, Jim Keniston wrote: On Thu, 2014-05-01 at 19:09 +0200, Denys Vlasenko wrote: +#define VEX2_(insn) X86_VEX_V((insn)-vex_prefix.bytes[1]) +#define VEX3_(insn) X86_VEX_V((insn)-vex_prefix.bytes[2]) I disclaim any knowledge about VEX* stuff. ... skipped this next part... + /* Similar treatment for VEX3 prefix */ + /* TODO: add XOP/EVEX treatment when insn decoder supports them */ + if (insn-vex_prefix.nbytes == 3) { + /* + * vex2: c5rLpp (has no b bit) + * vex3/xop: c4/8f rxbm wLpp + * evex: 62rxbR00mm.w1pp.zllBVaaa + * (evex will need setting of both b and x since + * in non-sib encoding evex.x is 4th bit of MODRM.rm) + * Setting VEX3.b (setting because it has inverted meaning): + */ + cursor = auprobe-insn + insn_offset_vex_prefix(insn) + 1; + *cursor |= 0x20; } ... resumed reviewing here. I do realize that most people reading this code won't be aficionados of memorizing x86 insn encoding quirks. To help them, I added the comment which spells out bit layout of vex/evex/xop prefixes (see above). Does it help? ... Well, yes. I read your comments in conjunction with the VEX Prefix wikipedia page and figured out what you're doing, and it looks right. I would definitely leave the comments in; otherwise there's a chance that even you won't remember all this stuff 6 months from now. Jim -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] uprobes: fix scratch register selection for rip-relative fixups
On 05/02/2014 02:48 AM, Jim Keniston wrote: > On Thu, 2014-05-01 at 19:09 +0200, Denys Vlasenko wrote: >> +#define VEX2_(insn) X86_VEX_V((insn)->vex_prefix.bytes[1]) >> +#define VEX3_(insn) X86_VEX_V((insn)->vex_prefix.bytes[2]) > > I disclaim any knowledge about VEX* stuff. ... > skipped this next part... > >> +/* Similar treatment for VEX3 prefix */ >> +/* TODO: add XOP/EVEX treatment when insn decoder supports them */ >> +if (insn->vex_prefix.nbytes == 3) { >> +/* >> + * vex2: c5rLpp (has no b bit) >> + * vex3/xop: c4/8f rxbm wLpp >> + * evex: 62rxbR00mm.w1pp.zllBVaaa >> + * (evex will need setting of both b and x since >> + * in non-sib encoding evex.x is 4th bit of MODRM.rm) >> + * Setting VEX3.b (setting because it has inverted meaning): >> + */ >> +cursor = auprobe->insn + insn_offset_vex_prefix(insn) + 1; >> +*cursor |= 0x20; >> } > > ... resumed reviewing here. I do realize that most people reading this code won't be aficionados of memorizing x86 insn encoding quirks. To help them, I added the comment which spells out bit layout of vex/evex/xop prefixes (see above). Does it help? > VEX again. Covering my ears and humming... > >> +if (insn->vex_prefix.nbytes == 2) { >> +reg2 = VEX2_(insn) ^ 0xf; >> +} >> +if (insn->vex_prefix.nbytes == 3) { >> +reg2 = VEX3_(insn) ^ 0xf; >> +} >> +/* TODO: add XOP, EXEV reading */ >> + In fact there is room for improvement in this part: it's better to ignore most-significant bit of this field (32-bit mode strikes back). >> +reg2 = 3; /* BX */ >> +if (can_use_regs & UPROBE_FIX_RIP_DI) >> +reg2 = 7; >> +if (can_use_regs & UPROBE_FIX_RIP_SI) >> +reg2 = 6; > > It seems more natural to code this as: > if (can_use_regs & UPROBE_FIX_RIP_SI) > reg2 = 6; > else if (can_use_regs & UPROBE_FIX_RIP_DI) > reg2 = 7; > else > reg2 = 3; /* BX */ > ... which is pretty much how you do it in scratch_reg(). Okay. I can also get rid of "can_use_regs" variable altogether then! Sending new version in a jiffy. -- vda -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] uprobes: fix scratch register selection for rip-relative fixups
On 05/02/2014 02:48 AM, Jim Keniston wrote: On Thu, 2014-05-01 at 19:09 +0200, Denys Vlasenko wrote: +#define VEX2_(insn) X86_VEX_V((insn)-vex_prefix.bytes[1]) +#define VEX3_(insn) X86_VEX_V((insn)-vex_prefix.bytes[2]) I disclaim any knowledge about VEX* stuff. ... skipped this next part... +/* Similar treatment for VEX3 prefix */ +/* TODO: add XOP/EVEX treatment when insn decoder supports them */ +if (insn-vex_prefix.nbytes == 3) { +/* + * vex2: c5rLpp (has no b bit) + * vex3/xop: c4/8f rxbm wLpp + * evex: 62rxbR00mm.w1pp.zllBVaaa + * (evex will need setting of both b and x since + * in non-sib encoding evex.x is 4th bit of MODRM.rm) + * Setting VEX3.b (setting because it has inverted meaning): + */ +cursor = auprobe-insn + insn_offset_vex_prefix(insn) + 1; +*cursor |= 0x20; } ... resumed reviewing here. I do realize that most people reading this code won't be aficionados of memorizing x86 insn encoding quirks. To help them, I added the comment which spells out bit layout of vex/evex/xop prefixes (see above). Does it help? VEX again. Covering my ears and humming... +if (insn-vex_prefix.nbytes == 2) { +reg2 = VEX2_(insn) ^ 0xf; +} +if (insn-vex_prefix.nbytes == 3) { +reg2 = VEX3_(insn) ^ 0xf; +} +/* TODO: add XOP, EXEV reading */ + In fact there is room for improvement in this part: it's better to ignore most-significant bit of this field (32-bit mode strikes back). +reg2 = 3; /* BX */ +if (can_use_regs UPROBE_FIX_RIP_DI) +reg2 = 7; +if (can_use_regs UPROBE_FIX_RIP_SI) +reg2 = 6; It seems more natural to code this as: if (can_use_regs UPROBE_FIX_RIP_SI) reg2 = 6; else if (can_use_regs UPROBE_FIX_RIP_DI) reg2 = 7; else reg2 = 3; /* BX */ ... which is pretty much how you do it in scratch_reg(). Okay. I can also get rid of can_use_regs variable altogether then! Sending new version in a jiffy. -- vda -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] uprobes: fix scratch register selection for rip-relative fixups
On Thu, 2014-05-01 at 19:09 +0200, Denys Vlasenko wrote: > Before this patch, instructions such as div, mul, > shifts with count in CL, cmpxchg are mishandled. > > This patch adds vex prefix handling. In particular, > it avoids colliding with register operand encoded > in vex. field. > > Since we need to avoid two possible register operands, > the selection of scratch register needs to be from at least > three registers. > > After looking through a lot of CPU docs, it looks like > the safest choice is SI,DI,BX. Selecting BX needs care > to not collide with implicit use of BX by cmpxchg8b. I skipped reviewing some of this because I don't know about those parts of the x86 architecture. What I saw and understood looks like it'll work, although I have a couple of suggestions there. See below. Thanks for your diligence and creativity on this. Jim > > Signed-off-by: Denys Vlasenko > CC: Jim Keniston > CC: Masami Hiramatsu > CC: Srikar Dronamraju > CC: Ingo Molnar > CC: Oleg Nesterov > --- > arch/x86/kernel/uprobes.c | 154 > ++ > 1 file changed, 116 insertions(+), 38 deletions(-) > > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c > index c41bb4b..6cb53d1 100644 > --- a/arch/x86/kernel/uprobes.c > +++ b/arch/x86/kernel/uprobes.c > @@ -41,8 +41,10 @@ > /* Instruction will modify TF, don't change it */ > #define UPROBE_FIX_SETF 0x04 > > -#define UPROBE_FIX_RIP_AX0x08 > -#define UPROBE_FIX_RIP_CX0x10 > +#define UPROBE_FIX_RIP_SI0x08 > +#define UPROBE_FIX_RIP_DI0x10 > +#define UPROBE_FIX_RIP_BX0x20 > +#define UPROBE_FIX_RIP_ALL 0x38 I agree with Oleg's suggestion about how to define UPROBE_FIX_RIP_ALL. > > #define UPROBE_TRAP_NR UINT_MAX > > @@ -51,6 +53,8 @@ > #define OPCODE2(insn)((insn)->opcode.bytes[1]) > #define OPCODE3(insn)((insn)->opcode.bytes[2]) > #define MODRM_REG(insn) X86_MODRM_REG((insn)->modrm.value) > +#define VEX2_(insn) X86_VEX_V((insn)->vex_prefix.bytes[1]) > +#define VEX3_(insn) X86_VEX_V((insn)->vex_prefix.bytes[2]) I disclaim any knowledge about VEX* stuff. > > #define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, > bf)\ > (((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) | \ > @@ -274,63 +278,137 @@ static inline bool is_64bit_mm(struct mm_struct *mm) > static void riprel_analyze(struct arch_uprobe *auprobe, struct insn *insn) > { > u8 *cursor; > + u8 can_use_regs; > u8 reg; > + int reg2; > > if (!insn_rip_relative(insn)) > return; > > /* > - * insn_rip_relative() would have decoded rex_prefix, modrm. > + * insn_rip_relative() would have decoded rex_prefix, > + * vex_prefix, modrm. >* Clear REX.b bit (extension of MODRM.rm field): > - * we want to encode rax/rcx, not r8/r9. > + * we want to encode low numbered reg, not r8+. >*/ > if (insn->rex_prefix.nbytes) { > cursor = auprobe->insn + insn_offset_rex_prefix(insn); > - *cursor &= 0xfe;/* Clearing REX.B bit */ > + /* REX byte has 0100wrxb layout, clearing REX.b bit */ > + *cursor &= 0xfe; > + } skipped this next part... > + /* Similar treatment for VEX3 prefix */ > + /* TODO: add XOP/EVEX treatment when insn decoder supports them */ > + if (insn->vex_prefix.nbytes == 3) { > + /* > + * vex2: c5rLpp (has no b bit) > + * vex3/xop: c4/8f rxbm wLpp > + * evex: 62rxbR00mm.w1pp.zllBVaaa > + * (evex will need setting of both b and x since > + * in non-sib encoding evex.x is 4th bit of MODRM.rm) > + * Setting VEX3.b (setting because it has inverted meaning): > + */ > + cursor = auprobe->insn + insn_offset_vex_prefix(insn) + 1; > + *cursor |= 0x20; > } > ... resumed reviewing here. > /* > + * Convert from rip-relative addressing > + * to register-relative addressing via a scratch register. > + * > + * This is tricky since there are insns with modrm byte > + * which also use registers not encoded in modrm byte: > + * [i]div/[i]mul: implicitly use dx:ax > + * shift ops: implicitly use cx > + * cmpxchg: implicitly uses ax > + * cmpxchg8/16b: implicitly uses dx:ax and bx:cx > + * Encoding: 0f c7/1 modrm > + * The code below thinks that reg=1 (cx), chooses si as scratch. I verified the above in the AMD manual, but cannot comment on this next section... > + * mulx: implicitly uses dx: mulx r/m,r1,r2: r1:r2 = dx * r/m > + * First appeared in Haswell (BMI2 insn). It is vex-encoded. > + * Example where none of bx,cx,dx can be used as scratch reg: > + * c4 e2
Re: [PATCH] uprobes: fix scratch register selection for rip-relative fixups
On 05/01, Denys Vlasenko wrote: > > Before this patch, instructions such as div, mul, > shifts with count in CL, cmpxchg are mishandled. Thanks. I'll try to read this patch tomorrow, but you do know that I can't review (or even understand ;) the change in riprel_analyze(). As for other changes, perhaps we can cleanup them later somehow, but I am not going to discuss this now. I agree that the fix should be as simple as possible. Only on nit, > -#define UPROBE_FIX_RIP_AX0x08 > -#define UPROBE_FIX_RIP_CX0x10 > +#define UPROBE_FIX_RIP_SI0x08 > +#define UPROBE_FIX_RIP_DI0x10 > +#define UPROBE_FIX_RIP_BX0x20 > +#define UPROBE_FIX_RIP_ALL 0x38 OK, but please define UPROBE_FIX_RIP_ALL as #define UPROBE_FIX_RIP_ALL \ (UPROBE_FIX_RIP_SI | UPROBE_FIX_RIP_DI | UPROBE_FIX_RIP_BX) Please do not bother to do this right now, lets wait for review. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] uprobes: fix scratch register selection for rip-relative fixups
Before this patch, instructions such as div, mul, shifts with count in CL, cmpxchg are mishandled. This patch adds vex prefix handling. In particular, it avoids colliding with register operand encoded in vex. field. Since we need to avoid two possible register operands, the selection of scratch register needs to be from at least three registers. After looking through a lot of CPU docs, it looks like the safest choice is SI,DI,BX. Selecting BX needs care to not collide with implicit use of BX by cmpxchg8b. Signed-off-by: Denys Vlasenko CC: Jim Keniston CC: Masami Hiramatsu CC: Srikar Dronamraju CC: Ingo Molnar CC: Oleg Nesterov --- arch/x86/kernel/uprobes.c | 154 ++ 1 file changed, 116 insertions(+), 38 deletions(-) diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index c41bb4b..6cb53d1 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -41,8 +41,10 @@ /* Instruction will modify TF, don't change it */ #define UPROBE_FIX_SETF0x04 -#define UPROBE_FIX_RIP_AX 0x08 -#define UPROBE_FIX_RIP_CX 0x10 +#define UPROBE_FIX_RIP_SI 0x08 +#define UPROBE_FIX_RIP_DI 0x10 +#define UPROBE_FIX_RIP_BX 0x20 +#define UPROBE_FIX_RIP_ALL 0x38 #defineUPROBE_TRAP_NR UINT_MAX @@ -51,6 +53,8 @@ #define OPCODE2(insn) ((insn)->opcode.bytes[1]) #define OPCODE3(insn) ((insn)->opcode.bytes[2]) #define MODRM_REG(insn)X86_MODRM_REG((insn)->modrm.value) +#define VEX2_(insn)X86_VEX_V((insn)->vex_prefix.bytes[1]) +#define VEX3_(insn)X86_VEX_V((insn)->vex_prefix.bytes[2]) #define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\ (((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) | \ @@ -274,63 +278,137 @@ static inline bool is_64bit_mm(struct mm_struct *mm) static void riprel_analyze(struct arch_uprobe *auprobe, struct insn *insn) { u8 *cursor; + u8 can_use_regs; u8 reg; + int reg2; if (!insn_rip_relative(insn)) return; /* -* insn_rip_relative() would have decoded rex_prefix, modrm. +* insn_rip_relative() would have decoded rex_prefix, +* vex_prefix, modrm. * Clear REX.b bit (extension of MODRM.rm field): -* we want to encode rax/rcx, not r8/r9. +* we want to encode low numbered reg, not r8+. */ if (insn->rex_prefix.nbytes) { cursor = auprobe->insn + insn_offset_rex_prefix(insn); - *cursor &= 0xfe;/* Clearing REX.B bit */ + /* REX byte has 0100wrxb layout, clearing REX.b bit */ + *cursor &= 0xfe; + } + /* Similar treatment for VEX3 prefix */ + /* TODO: add XOP/EVEX treatment when insn decoder supports them */ + if (insn->vex_prefix.nbytes == 3) { + /* +* vex2: c5rLpp (has no b bit) +* vex3/xop: c4/8f rxbm wLpp +* evex: 62rxbR00mm.w1pp.zllBVaaa +* (evex will need setting of both b and x since +* in non-sib encoding evex.x is 4th bit of MODRM.rm) +* Setting VEX3.b (setting because it has inverted meaning): +*/ + cursor = auprobe->insn + insn_offset_vex_prefix(insn) + 1; + *cursor |= 0x20; } /* +* Convert from rip-relative addressing +* to register-relative addressing via a scratch register. +* +* This is tricky since there are insns with modrm byte +* which also use registers not encoded in modrm byte: +* [i]div/[i]mul: implicitly use dx:ax +* shift ops: implicitly use cx +* cmpxchg: implicitly uses ax +* cmpxchg8/16b: implicitly uses dx:ax and bx:cx +* Encoding: 0f c7/1 modrm +* The code below thinks that reg=1 (cx), chooses si as scratch. +* mulx: implicitly uses dx: mulx r/m,r1,r2: r1:r2 = dx * r/m +* First appeared in Haswell (BMI2 insn). It is vex-encoded. +* Example where none of bx,cx,dx can be used as scratch reg: +* c4 e2 63 f6 0d disp32 mulx disp32(%rip),%ebx,%ecx +* [v]pcmpistri: implicitly uses cx, xmm0 +* [v]pcmpistrm: implicitly uses xmm0 +* [v]pcmpestri: implicitly uses ax, dx, cx, xmm0 +* [v]pcmpestrm: implicitly uses ax, dx, xmm0 +* Evil SSE4.2 string comparison ops from hell. +* maskmovq/[v]maskmovdqu: implicitly uses (ds:rdi) as destination. +* Encoding: 0f f7 modrm, 66 0f f7 modrm, vex-encoded: c5 f9 f7 modrm. +* Store op1, byte-masked by op2 msb's in each byte, to (ds:rdi). +* AMD says it has no 3-operand form (vex. must be ) +* and that it can have only register operands, not mem +
[PATCH] uprobes: fix scratch register selection for rip-relative fixups
Before this patch, instructions such as div, mul, shifts with count in CL, cmpxchg are mishandled. This patch adds vex prefix handling. In particular, it avoids colliding with register operand encoded in vex. field. Since we need to avoid two possible register operands, the selection of scratch register needs to be from at least three registers. After looking through a lot of CPU docs, it looks like the safest choice is SI,DI,BX. Selecting BX needs care to not collide with implicit use of BX by cmpxchg8b. Signed-off-by: Denys Vlasenko dvlas...@redhat.com CC: Jim Keniston jkeni...@us.ibm.com CC: Masami Hiramatsu masami.hiramatsu...@hitachi.com CC: Srikar Dronamraju sri...@linux.vnet.ibm.com CC: Ingo Molnar mi...@kernel.org CC: Oleg Nesterov o...@redhat.com --- arch/x86/kernel/uprobes.c | 154 ++ 1 file changed, 116 insertions(+), 38 deletions(-) diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index c41bb4b..6cb53d1 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -41,8 +41,10 @@ /* Instruction will modify TF, don't change it */ #define UPROBE_FIX_SETF0x04 -#define UPROBE_FIX_RIP_AX 0x08 -#define UPROBE_FIX_RIP_CX 0x10 +#define UPROBE_FIX_RIP_SI 0x08 +#define UPROBE_FIX_RIP_DI 0x10 +#define UPROBE_FIX_RIP_BX 0x20 +#define UPROBE_FIX_RIP_ALL 0x38 #defineUPROBE_TRAP_NR UINT_MAX @@ -51,6 +53,8 @@ #define OPCODE2(insn) ((insn)-opcode.bytes[1]) #define OPCODE3(insn) ((insn)-opcode.bytes[2]) #define MODRM_REG(insn)X86_MODRM_REG((insn)-modrm.value) +#define VEX2_(insn)X86_VEX_V((insn)-vex_prefix.bytes[1]) +#define VEX3_(insn)X86_VEX_V((insn)-vex_prefix.bytes[2]) #define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\ (((b0##UL 0x0)|(b1##UL 0x1)|(b2##UL 0x2)|(b3##UL 0x3) | \ @@ -274,63 +278,137 @@ static inline bool is_64bit_mm(struct mm_struct *mm) static void riprel_analyze(struct arch_uprobe *auprobe, struct insn *insn) { u8 *cursor; + u8 can_use_regs; u8 reg; + int reg2; if (!insn_rip_relative(insn)) return; /* -* insn_rip_relative() would have decoded rex_prefix, modrm. +* insn_rip_relative() would have decoded rex_prefix, +* vex_prefix, modrm. * Clear REX.b bit (extension of MODRM.rm field): -* we want to encode rax/rcx, not r8/r9. +* we want to encode low numbered reg, not r8+. */ if (insn-rex_prefix.nbytes) { cursor = auprobe-insn + insn_offset_rex_prefix(insn); - *cursor = 0xfe;/* Clearing REX.B bit */ + /* REX byte has 0100wrxb layout, clearing REX.b bit */ + *cursor = 0xfe; + } + /* Similar treatment for VEX3 prefix */ + /* TODO: add XOP/EVEX treatment when insn decoder supports them */ + if (insn-vex_prefix.nbytes == 3) { + /* +* vex2: c5rLpp (has no b bit) +* vex3/xop: c4/8f rxbm wLpp +* evex: 62rxbR00mm.w1pp.zllBVaaa +* (evex will need setting of both b and x since +* in non-sib encoding evex.x is 4th bit of MODRM.rm) +* Setting VEX3.b (setting because it has inverted meaning): +*/ + cursor = auprobe-insn + insn_offset_vex_prefix(insn) + 1; + *cursor |= 0x20; } /* +* Convert from rip-relative addressing +* to register-relative addressing via a scratch register. +* +* This is tricky since there are insns with modrm byte +* which also use registers not encoded in modrm byte: +* [i]div/[i]mul: implicitly use dx:ax +* shift ops: implicitly use cx +* cmpxchg: implicitly uses ax +* cmpxchg8/16b: implicitly uses dx:ax and bx:cx +* Encoding: 0f c7/1 modrm +* The code below thinks that reg=1 (cx), chooses si as scratch. +* mulx: implicitly uses dx: mulx r/m,r1,r2: r1:r2 = dx * r/m +* First appeared in Haswell (BMI2 insn). It is vex-encoded. +* Example where none of bx,cx,dx can be used as scratch reg: +* c4 e2 63 f6 0d disp32 mulx disp32(%rip),%ebx,%ecx +* [v]pcmpistri: implicitly uses cx, xmm0 +* [v]pcmpistrm: implicitly uses xmm0 +* [v]pcmpestri: implicitly uses ax, dx, cx, xmm0 +* [v]pcmpestrm: implicitly uses ax, dx, xmm0 +* Evil SSE4.2 string comparison ops from hell. +* maskmovq/[v]maskmovdqu: implicitly uses (ds:rdi) as destination. +* Encoding: 0f f7 modrm, 66 0f f7 modrm, vex-encoded: c5 f9 f7 modrm. +* Store op1, byte-masked by op2 msb's in each byte, to (ds:rdi). +* AMD says it has no
Re: [PATCH] uprobes: fix scratch register selection for rip-relative fixups
On 05/01, Denys Vlasenko wrote: Before this patch, instructions such as div, mul, shifts with count in CL, cmpxchg are mishandled. Thanks. I'll try to read this patch tomorrow, but you do know that I can't review (or even understand ;) the change in riprel_analyze(). As for other changes, perhaps we can cleanup them later somehow, but I am not going to discuss this now. I agree that the fix should be as simple as possible. Only on nit, -#define UPROBE_FIX_RIP_AX0x08 -#define UPROBE_FIX_RIP_CX0x10 +#define UPROBE_FIX_RIP_SI0x08 +#define UPROBE_FIX_RIP_DI0x10 +#define UPROBE_FIX_RIP_BX0x20 +#define UPROBE_FIX_RIP_ALL 0x38 OK, but please define UPROBE_FIX_RIP_ALL as #define UPROBE_FIX_RIP_ALL \ (UPROBE_FIX_RIP_SI | UPROBE_FIX_RIP_DI | UPROBE_FIX_RIP_BX) Please do not bother to do this right now, lets wait for review. Oleg. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] uprobes: fix scratch register selection for rip-relative fixups
On Thu, 2014-05-01 at 19:09 +0200, Denys Vlasenko wrote: Before this patch, instructions such as div, mul, shifts with count in CL, cmpxchg are mishandled. This patch adds vex prefix handling. In particular, it avoids colliding with register operand encoded in vex. field. Since we need to avoid two possible register operands, the selection of scratch register needs to be from at least three registers. After looking through a lot of CPU docs, it looks like the safest choice is SI,DI,BX. Selecting BX needs care to not collide with implicit use of BX by cmpxchg8b. I skipped reviewing some of this because I don't know about those parts of the x86 architecture. What I saw and understood looks like it'll work, although I have a couple of suggestions there. See below. Thanks for your diligence and creativity on this. Jim Signed-off-by: Denys Vlasenko dvlas...@redhat.com CC: Jim Keniston jkeni...@us.ibm.com CC: Masami Hiramatsu masami.hiramatsu...@hitachi.com CC: Srikar Dronamraju sri...@linux.vnet.ibm.com CC: Ingo Molnar mi...@kernel.org CC: Oleg Nesterov o...@redhat.com --- arch/x86/kernel/uprobes.c | 154 ++ 1 file changed, 116 insertions(+), 38 deletions(-) diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index c41bb4b..6cb53d1 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -41,8 +41,10 @@ /* Instruction will modify TF, don't change it */ #define UPROBE_FIX_SETF 0x04 -#define UPROBE_FIX_RIP_AX0x08 -#define UPROBE_FIX_RIP_CX0x10 +#define UPROBE_FIX_RIP_SI0x08 +#define UPROBE_FIX_RIP_DI0x10 +#define UPROBE_FIX_RIP_BX0x20 +#define UPROBE_FIX_RIP_ALL 0x38 I agree with Oleg's suggestion about how to define UPROBE_FIX_RIP_ALL. #define UPROBE_TRAP_NR UINT_MAX @@ -51,6 +53,8 @@ #define OPCODE2(insn)((insn)-opcode.bytes[1]) #define OPCODE3(insn)((insn)-opcode.bytes[2]) #define MODRM_REG(insn) X86_MODRM_REG((insn)-modrm.value) +#define VEX2_(insn) X86_VEX_V((insn)-vex_prefix.bytes[1]) +#define VEX3_(insn) X86_VEX_V((insn)-vex_prefix.bytes[2]) I disclaim any knowledge about VEX* stuff. #define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\ (((b0##UL 0x0)|(b1##UL 0x1)|(b2##UL 0x2)|(b3##UL 0x3) | \ @@ -274,63 +278,137 @@ static inline bool is_64bit_mm(struct mm_struct *mm) static void riprel_analyze(struct arch_uprobe *auprobe, struct insn *insn) { u8 *cursor; + u8 can_use_regs; u8 reg; + int reg2; if (!insn_rip_relative(insn)) return; /* - * insn_rip_relative() would have decoded rex_prefix, modrm. + * insn_rip_relative() would have decoded rex_prefix, + * vex_prefix, modrm. * Clear REX.b bit (extension of MODRM.rm field): - * we want to encode rax/rcx, not r8/r9. + * we want to encode low numbered reg, not r8+. */ if (insn-rex_prefix.nbytes) { cursor = auprobe-insn + insn_offset_rex_prefix(insn); - *cursor = 0xfe;/* Clearing REX.B bit */ + /* REX byte has 0100wrxb layout, clearing REX.b bit */ + *cursor = 0xfe; + } skipped this next part... + /* Similar treatment for VEX3 prefix */ + /* TODO: add XOP/EVEX treatment when insn decoder supports them */ + if (insn-vex_prefix.nbytes == 3) { + /* + * vex2: c5rLpp (has no b bit) + * vex3/xop: c4/8f rxbm wLpp + * evex: 62rxbR00mm.w1pp.zllBVaaa + * (evex will need setting of both b and x since + * in non-sib encoding evex.x is 4th bit of MODRM.rm) + * Setting VEX3.b (setting because it has inverted meaning): + */ + cursor = auprobe-insn + insn_offset_vex_prefix(insn) + 1; + *cursor |= 0x20; } ... resumed reviewing here. /* + * Convert from rip-relative addressing + * to register-relative addressing via a scratch register. + * + * This is tricky since there are insns with modrm byte + * which also use registers not encoded in modrm byte: + * [i]div/[i]mul: implicitly use dx:ax + * shift ops: implicitly use cx + * cmpxchg: implicitly uses ax + * cmpxchg8/16b: implicitly uses dx:ax and bx:cx + * Encoding: 0f c7/1 modrm + * The code below thinks that reg=1 (cx), chooses si as scratch. I verified the above in the AMD manual, but cannot comment on this next section... + * mulx: implicitly uses dx: mulx r/m,r1,r2: r1:r2 = dx * r/m + * First appeared in Haswell (BMI2 insn). It is vex-encoded. + * Example where none of bx,cx,dx can be used as scratch reg: + * c4 e2 63