On Tue, Oct 08, 2019 at 04:29:24PM +0200, Borislav Petkov wrote:
> On Mon, Oct 07, 2019 at 10:17:17AM +0200, Peter Zijlstra wrote:

> > @@ -63,8 +66,17 @@ static inline void int3_emulate_jmp(stru
> >     regs->ip = ip;
> >  }
> >  
> > -#define INT3_INSN_SIZE 1
> > -#define CALL_INSN_SIZE 5
> > +#define INT3_INSN_SIZE             1
> > +#define INT3_INSN_OPCODE   0xCC
> > +
> > +#define CALL_INSN_SIZE             5
> > +#define CALL_INSN_OPCODE   0xE8
> > +
> > +#define JMP32_INSN_SIZE            5
> > +#define JMP32_INSN_OPCODE  0xE9
> > +
> > +#define JMP8_INSN_SIZE             2
> > +#define JMP8_INSN_OPCODE   0xEB
> 
> You probably should switch those to have the name prefix come first and
> make them even shorter:
> 
> OPCODE_CALL
> INSN_SIZE_CALL
> OPCODE_JMP32
> INSN_SIZE_JMP32
> OPCODE_JMP8
> ...
> 
> This way you have the opcodes prefixed with OPCODE_ and the insn sizes
> with INSN_SIZE_. I.e., what they actually are.

I really don't like that; the important part is which instruction and
that really should come first. Also, your variant is horribly
inconsistent.

> > --- a/arch/x86/kernel/alternative.c
> > +++ b/arch/x86/kernel/alternative.c
> 
> ...
> 
> > @@ -1027,9 +1046,9 @@ NOKPROBE_SYMBOL(poke_int3_handler);
> >   */
> >  void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries)
> >  {
> > -   int patched_all_but_first = 0;
> > -   unsigned char int3 = 0xcc;
> > +   unsigned char int3 = INT3_INSN_OPCODE;
> >     unsigned int i;
> > +   int do_sync;
> >  
> >     lockdep_assert_held(&text_mutex);
> >  
> > @@ -1053,16 +1072,16 @@ void text_poke_bp_batch(struct text_poke
> >     /*
> >      * Second step: update all but the first byte of the patched range.
> >      */
> > -   for (i = 0; i < nr_entries; i++) {
> > +   for (do_sync = 0, i = 0; i < nr_entries; i++) {
> >             if (tp[i].len - sizeof(int3) > 0) {
> >                     text_poke((char *)tp[i].addr + sizeof(int3),
> > -                             (const char *)tp[i].opcode + sizeof(int3),
> > +                             (const char *)tp[i].text + sizeof(int3),
> >                               tp[i].len - sizeof(int3));
> > -                   patched_all_but_first++;
> > +                   do_sync++;
> >             }
> >     }
> >  
> > -   if (patched_all_but_first) {
> > +   if (do_sync) {
> >             /*
> >              * According to Intel, this core syncing is very likely
> >              * not necessary and we'd be safe even without it. But
> > @@ -1075,10 +1094,17 @@ void text_poke_bp_batch(struct text_poke
> >      * Third step: replace the first byte (int3) by the first byte of
> >      * replacing opcode.
> >      */
> > -   for (i = 0; i < nr_entries; i++)
> > -           text_poke(tp[i].addr, tp[i].opcode, sizeof(int3));
> > +   for (do_sync = 0, i = 0; i < nr_entries; i++) {
> 
> Can we have the do_sync reset outside of the loop?

Can, but why? That's more lines for no raisin ;-)

> > +           if (tp[i].text[0] == INT3_INSN_OPCODE)
> > +                   continue;
> 
> I'm guessing we preset the 0th byte to 0xcc somewhere.... I just can't
> seem to find it...

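The very first pass writes INT3s to the first byte of every entry, so if
the new text itself starts with INT3, that byte is already in place and
there is nothing left to write here.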
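
For anyone following along, here is a rough user-space sketch of the three
passes. It is not the kernel code: struct poke_loc, POKE_MAX_LEN,
sim_text_poke() and sim_poke_batch() are made-up names, text_poke() is
replaced by a plain memcpy() into an ordinary buffer, and the core syncing
between passes is left out entirely. It only illustrates why the third
pass can skip entries whose new text begins with INT3.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INT3_INSN_SIZE   1
#define INT3_INSN_OPCODE 0xCC
#define POKE_MAX_LEN     5	/* hypothetical: longest insn handled by this sketch */

/* Hypothetical, simplified stand-in for struct text_poke_loc. */
struct poke_loc {
	unsigned char *addr;			/* location to patch */
	size_t len;				/* length of the new instruction */
	unsigned char text[POKE_MAX_LEN];	/* new instruction bytes */
};

/* Stand-in for text_poke(): just a memcpy() into a normal buffer. */
static void sim_text_poke(unsigned char *addr, const unsigned char *src,
			  size_t len)
{
	memcpy(addr, src, len);
}

/*
 * Simulate the three passes. Returns how many first bytes the third
 * pass actually rewrote (the sketch's equivalent of do_sync).
 */
static int sim_poke_batch(struct poke_loc *tp, size_t nr_entries)
{
	const unsigned char int3 = INT3_INSN_OPCODE;
	size_t i;
	int do_sync;

	/* First step: put an INT3 on the first byte of every entry. */
	for (i = 0; i < nr_entries; i++) {
		assert(tp[i].len >= INT3_INSN_SIZE && tp[i].len <= POKE_MAX_LEN);
		sim_text_poke(tp[i].addr, &int3, INT3_INSN_SIZE);
	}

	/* Second step: update all but the first byte of each entry. */
	for (i = 0; i < nr_entries; i++) {
		if (tp[i].len > INT3_INSN_SIZE)
			sim_text_poke(tp[i].addr + INT3_INSN_SIZE,
				      tp[i].text + INT3_INSN_SIZE,
				      tp[i].len - INT3_INSN_SIZE);
	}

	/* Third step: replace the INT3 with the real first byte. */
	for (do_sync = 0, i = 0; i < nr_entries; i++) {
		/* The first step already left an INT3 here; nothing to do. */
		if (tp[i].text[0] == INT3_INSN_OPCODE)
			continue;

		sim_text_poke(tp[i].addr, tp[i].text, INT3_INSN_SIZE);
		do_sync++;
	}

	return do_sync;
}
```

With one 5-byte CALL entry and one 1-byte INT3 entry, only the CALL gets
its first byte rewritten in the third pass; the INT3 entry is complete
after the first pass.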
