On Thu, May 14, 2026 at 03:53:36PM +0200, Jiri Olsa wrote:
> Andrii reported an issue with optimized uprobes [1] that can clobber
> redzone area with call instruction storing return address on stack
> where user code may keep temporary data without adjusting rsp.
> 
> Fixing this by moving the optimized uprobes on top of 10-bytes nop
> instruction, so we can squeeze another instruction to escape the
> redzone area before doing the call, like:
> 
>   lea -0x80(%rsp), %rsp
>   call tramp
> 
> Note the lea instruction is used to adjust the rsp register without
> changing the flags.
> 
> The optimized uprobe performance stays the same:
> 
>         uprobe-nop     :    3.129 ± 0.013M/s
>         uprobe-push    :    3.045 ± 0.006M/s
>         uprobe-ret     :    1.095 ± 0.004M/s
>   -->   uprobe-nop10   :    7.170 ± 0.020M/s
>         uretprobe-nop  :    2.143 ± 0.021M/s
>         uretprobe-push :    2.090 ± 0.000M/s
>         uretprobe-ret  :    0.942 ± 0.000M/s
>   -->   uretprobe-nop10:    3.381 ± 0.003M/s
>         usdt-nop       :    3.245 ± 0.004M/s
>   -->   usdt-nop10     :    7.256 ± 0.023M/s
> 
> [1] https://lore.kernel.org/bpf/[email protected]/
> Reported-by: Andrii Nakryiko <[email protected]>
> Closes: https://lore.kernel.org/bpf/[email protected]/
> Fixes: ba2bfc97b462 ("uprobes/x86: Add support to optimize uprobes")
> Signed-off-by: Jiri Olsa <[email protected]>
> ---
>  arch/x86/kernel/uprobes.c | 121 +++++++++++++++++++++++++++-----------
>  1 file changed, 86 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index ebb1baf1eb1d..f7c4101a4039 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -636,9 +636,21 @@ struct uprobe_trampoline {
>       unsigned long           vaddr;
>  };
>  
> +#define LEA_INSN_SIZE                5
> +#define OPT_INSN_SIZE                (LEA_INSN_SIZE + CALL_INSN_SIZE)
> +#define OPT_JMP8_OFFSET              (OPT_INSN_SIZE - JMP8_INSN_SIZE)
> +#define REDZONE_SIZE         0x80
> +
> +static const u8 lea_rsp[] = { 0x48, 0x8d, 0x64, 0x24, 0x80 };
> +
> +static bool is_lea_insn(const uprobe_opcode_t *insn)
> +{
> +     return !memcmp(insn, lea_rsp, LEA_INSN_SIZE);
> +}
> +

Just a thought: see whether the helper below reads better when plugged in.
I think is_call_insn can then be removed.

static bool is_call_past_redzone_insns(const uprobe_opcode_t *insn)
{
        static const u8 lea_rsp_call[] = {
                0x48, 0x8d, 0x64, 0x24, REDZONE_SIZE, /* lea -0x80(%rsp), %rsp */
                CALL_INSN_OPCODE
        };

        return !memcmp(insn, lea_rsp_call, ARRAY_SIZE(lea_rsp_call));
}
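
For anyone who wants to poke at the byte pattern outside the kernel, here is a
rough user-space sketch of the same check. It is not part of the patch and is
untested against the series. The typedef, the CALL_INSN_OPCODE and
CALL_INSN_SIZE values (0xE8 and 5), and the build_opt_insns() helper are
assumptions added for illustration only; the lea bytes and REDZONE_SIZE are
taken from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-ins for the kernel definitions (not copied from the tree). */
typedef uint8_t uprobe_opcode_t;
#define CALL_INSN_OPCODE 0xE8
#define CALL_INSN_SIZE   5

/* Values taken from the quoted patch. */
#define LEA_INSN_SIZE    5
#define OPT_INSN_SIZE    (LEA_INSN_SIZE + CALL_INSN_SIZE)
#define REDZONE_SIZE     0x80

/* lea + call together must exactly fill the 10-byte nop. */
static_assert(OPT_INSN_SIZE == 10, "lea + call must be 10 bytes");

/* Matches "lea -0x80(%rsp), %rsp" followed directly by a call opcode. */
static bool is_call_past_redzone_insns(const uprobe_opcode_t *insn)
{
        static const uint8_t lea_rsp_call[] = {
                0x48, 0x8d, 0x64, 0x24, REDZONE_SIZE, /* lea -0x80(%rsp), %rsp */
                CALL_INSN_OPCODE
        };

        return !memcmp(insn, lea_rsp_call, sizeof(lea_rsp_call));
}

/*
 * Hypothetical helper: emit the 10-byte sequence with a given rel32.
 * Copies rel32 in host byte order, so this assumes a little-endian host.
 */
static void build_opt_insns(uint8_t buf[OPT_INSN_SIZE], int32_t rel32)
{
        static const uint8_t lea_rsp[LEA_INSN_SIZE] = {
                0x48, 0x8d, 0x64, 0x24, REDZONE_SIZE
        };

        memcpy(buf, lea_rsp, LEA_INSN_SIZE);
        buf[LEA_INSN_SIZE] = CALL_INSN_OPCODE;
        memcpy(buf + LEA_INSN_SIZE + 1, &rel32, sizeof(rel32));
}
```

The check only looks at the first six bytes, so the call displacement is
ignored, which I assume is what the callers want. sizeof() is used here only
because ARRAY_SIZE is not available in user space; for a u8 array the two are
equivalent.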

[...]
