On Thu, May 14, 2026 at 06:54:37PM +0200, Jakub Sitnicki wrote:
> On Thu, May 14, 2026 at 03:53:36PM +0200, Jiri Olsa wrote:
> > Andrii reported an issue with optimized uprobes [1] that can clobber
> > redzone area with call instruction storing return address on stack
> > where user code may keep temporary data without adjusting rsp.
> >
> > Fixing this by moving the optimized uprobes on top of 10-bytes nop
> > instruction, so we can squeeze another instruction to escape the
> > redzone area before doing the call, like:
> >
> >   lea -0x80(%rsp), %rsp
> >   call tramp
> >
> > Note the lea instruction is used to adjust the rsp register without
> > changing the flags.
> >
> > The optimized uprobe performance stays the same:
> >
> >         uprobe-nop     :    3.129 ± 0.013M/s
> >         uprobe-push    :    3.045 ± 0.006M/s
> >         uprobe-ret     :    1.095 ± 0.004M/s
> >   -->   uprobe-nop10   :    7.170 ± 0.020M/s
> >         uretprobe-nop  :    2.143 ± 0.021M/s
> >         uretprobe-push :    2.090 ± 0.000M/s
> >         uretprobe-ret  :    0.942 ± 0.000M/s
> >   -->   uretprobe-nop10:    3.381 ± 0.003M/s
> >         usdt-nop       :    3.245 ± 0.004M/s
> >   -->   usdt-nop10     :    7.256 ± 0.023M/s
> >
> > [1] https://lore.kernel.org/bpf/[email protected]/
> > Reported-by: Andrii Nakryiko <[email protected]>
> > Closes: https://lore.kernel.org/bpf/[email protected]/
> > Fixes: ba2bfc97b462 ("uprobes/x86: Add support to optimize uprobes")
> > Signed-off-by: Jiri Olsa <[email protected]>
> > ---
> >  arch/x86/kernel/uprobes.c | 121 +++++++++++++++++++++++++++-----------
> >  1 file changed, 86 insertions(+), 35 deletions(-)
> >
> > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> > index ebb1baf1eb1d..f7c4101a4039 100644
> > --- a/arch/x86/kernel/uprobes.c
> > +++ b/arch/x86/kernel/uprobes.c
> > @@ -636,9 +636,21 @@ struct uprobe_trampoline {
> >  	unsigned long vaddr;
> >  };
> >
> > +#define LEA_INSN_SIZE 5
> > +#define OPT_INSN_SIZE (LEA_INSN_SIZE + CALL_INSN_SIZE)
> > +#define OPT_JMP8_OFFSET (OPT_INSN_SIZE - JMP8_INSN_SIZE)
> > +#define REDZONE_SIZE 0x80
> > +
> > +static const u8 lea_rsp[] = { 0x48, 0x8d, 0x64, 0x24, 0x80 };
> > +
> > +static bool is_lea_insn(const uprobe_opcode_t *insn)
> > +{
> > +	return !memcmp(insn, lea_rsp, LEA_INSN_SIZE);
> > +}
> > +
>
> Just a thought. See if below maybe reads better when plugged in.
> is_call_insn can then be removed, I think.
>
> static bool is_call_past_redzone_insns(const uprobe_opcode_t *insn)
> {
> 	static const u8 lea_rsp_call[] = {
> 		0x48, 0x8d, 0x64, 0x24, REDZONE_SIZE, /* lea -0x80(%rsp), %rsp */
> 		CALL_INSN_OPCODE
> 	};
>
> 	return !memcmp(insn, lea_rsp_call, ARRAY_SIZE(lea_rsp_call));
> }

yep, might be easier to unify that, thanks

jirka
