Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
On Fri, May 03, 2024 at 08:35:24PM +, Edgecombe, Rick P wrote: > On Fri, 2024-05-03 at 22:17 +0200, Jiri Olsa wrote: > > when uretprobe is created, kernel overwrites the return address on user > > stack to point to user space trampoline, so the setup is in kernel hands > > I mean for uprobes in general. I'm didn't have any specific ideas in mind, but > in general when we give the kernel more abilities around shadow stack we have > to > think if attackers could use it to work around shadow stack protections. > > > > > with the hack below on top of this patchset I'm no longer seeing shadow > > stack app crash on uretprobe.. I'll try to polish it and send out next > > week, any suggestions are welcome ;-) > > Thanks. Some comments below. > > > > > thanks, > > jirka > > > > > > --- > > diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h > > index 42fee8959df7..d374305a6851 100644 > > --- a/arch/x86/include/asm/shstk.h > > +++ b/arch/x86/include/asm/shstk.h > > @@ -21,6 +21,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct > > *p, unsigned long clon > > void shstk_free(struct task_struct *p); > > int setup_signal_shadow_stack(struct ksignal *ksig); > > int restore_signal_shadow_stack(void); > > +void uprobe_change_stack(unsigned long addr); > > +void uprobe_push_stack(unsigned long addr); > > Maybe name them: > shstk_update_last_frame(); > shstk_push_frame(); ok > > > > #else > > static inline long shstk_prctl(struct task_struct *task, int option, > > unsigned long arg2) { return -EINVAL; } > > diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c > > index 59e15dd8d0f8..804c446231d9 100644 > > --- a/arch/x86/kernel/shstk.c > > +++ b/arch/x86/kernel/shstk.c > > @@ -577,3 +577,24 @@ long shstk_prctl(struct task_struct *task, int option, > > unsigned long arg2) > > return wrss_control(true); > > return -EINVAL; > > } > > + > > +void uprobe_change_stack(unsigned long addr) > > +{ > > + unsigned long ssp; > > Probably want something like: > > if (!features_enabled(ARCH_SHSTK_SHSTK)) > return; ok > > So this doesn't try the below if shadow stack is disabled. > > > + > > + ssp = get_user_shstk_addr(); > > + write_user_shstk_64((u64 __user *)ssp, (u64)addr); > > +} > > Can we know that there was a valid return address just before this point on > the > stack? Or could it be a sigframe or something? when uprobe hijack the return address it assumes it's on the top of the stack, so it's saved and replaced with address of the user space trampoline > > > + > > +void uprobe_push_stack(unsigned long addr) > > +{ > > + unsigned long ssp; > > if (!features_enabled(ARCH_SHSTK_SHSTK)) > return; > > > + > > + ssp = get_user_shstk_addr(); > > + ssp -= SS_FRAME_SIZE; > > + write_user_shstk_64((u64 __user *)ssp, (u64)addr); > > + > > + fpregs_lock_and_load(); > > + wrmsrl(MSR_IA32_PL3_SSP, ssp); > > + fpregs_unlock(); > > +} > > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c > > index 81e6ee95784d..259457838020 100644 > > --- a/arch/x86/kernel/uprobes.c > > +++ b/arch/x86/kernel/uprobes.c > > @@ -416,6 +416,7 @@ SYSCALL_DEFINE0(uretprobe) > > regs->r11 = regs->flags; > > regs->cx = regs->ip; > > > > + uprobe_push_stack(r11_cx_ax[2]); > > I'm concerned this could be used to push arbitrary frames to the shadow stack. > Couldn't an attacker do a jump to the point that calls this syscall? Maybe > this > is what peterz was raising. of course never say never, but here's my reasoning why I think it's ok the page with the syscall trampoline is mapped in user space and can be found in procfs maps file under '[uprobes]' name the syscall can be called only from this trampoline, if it's called from anywhere else the calling process receives SIGILL now if you run the uretprobe syscall without any pending uretprobe for the task it will receive SIGILL before it gets to the point of pushing address on the shadow stack and to configure the uretprobe you need to have CAP_PERFMON or CAP_SYS_ADMIN if you'd actually managed to get the pending uretprobe instance, the shadow stack entry is going to be used/pop-ed right away in the trampoline with the ret instruction and as I mentioned above it's ensured that the syscall is returning to the trampoline and it can't be called from any other place > > > return regs->ax; > > > > sigill: > > @@ -1191,8 +1192,10 @@ arch_uretprobe_hijack_return_addr(unsigned long > > trampoline_vaddr, struct pt_regs > > return orig_ret_vaddr; > > > > nleft = copy_to_user((void __user *)regs->sp, _vaddr, > > rasize); > > - if (likely(!nleft)) > > + if (likely(!nleft)) { > > + uprobe_change_stack(trampoline_vaddr); > > return orig_ret_vaddr; > > + } > > > >
Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
On Fri, May 03, 2024 at 07:38:18PM +, Edgecombe, Rick P wrote: +Some more shadow stack folks from other archs. We are discussing how uretprobes work with shadow stack. Context: https://lore.kernel.org/lkml/ZjU4ganRF1Cbiug6@krava/ Thanks Rick. Yeah I didn't give enough attention to uprobes either. Although now that I think for RISC-V shadow stack, it shouldn't be an issue. On RISC-V return addresses don't get pushed as part of call instruction. There is a distinct instruction "shadow stack push of return address" in prolog. Similarly in epilog there is distinct instruction "shadow stack pop and check with link register". On RISC-V, uretprobe would install a uprobe on function start and when it's hit. It'll replace pt_regs->ra = trampoline_handler. As function will resume, trampoline addr will get pushed and popped. Although trampoline_handler would have to be enlightened to eventually return to original return site. On Fri, 2024-05-03 at 21:18 +0200, Jiri Olsa wrote: hack below seems to fix it for the current uprobe setup, we need similar fix for the uretprobe syscall trampoline setup It seems like a reasonable direction. Security-wise, applications cannot do this on themselves, or it is an otherwise privileged thing right?
Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
On Fri, 2024-05-03 at 22:17 +0200, Jiri Olsa wrote: > when uretprobe is created, kernel overwrites the return address on user > stack to point to user space trampoline, so the setup is in kernel hands I mean for uprobes in general. I'm didn't have any specific ideas in mind, but in general when we give the kernel more abilities around shadow stack we have to think if attackers could use it to work around shadow stack protections. > > with the hack below on top of this patchset I'm no longer seeing shadow > stack app crash on uretprobe.. I'll try to polish it and send out next > week, any suggestions are welcome ;-) Thanks. Some comments below. > > thanks, > jirka > > > --- > diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h > index 42fee8959df7..d374305a6851 100644 > --- a/arch/x86/include/asm/shstk.h > +++ b/arch/x86/include/asm/shstk.h > @@ -21,6 +21,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct > *p, unsigned long clon > void shstk_free(struct task_struct *p); > int setup_signal_shadow_stack(struct ksignal *ksig); > int restore_signal_shadow_stack(void); > +void uprobe_change_stack(unsigned long addr); > +void uprobe_push_stack(unsigned long addr); Maybe name them: shstk_update_last_frame(); shstk_push_frame(); > #else > static inline long shstk_prctl(struct task_struct *task, int option, > unsigned long arg2) { return -EINVAL; } > diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c > index 59e15dd8d0f8..804c446231d9 100644 > --- a/arch/x86/kernel/shstk.c > +++ b/arch/x86/kernel/shstk.c > @@ -577,3 +577,24 @@ long shstk_prctl(struct task_struct *task, int option, > unsigned long arg2) > return wrss_control(true); > return -EINVAL; > } > + > +void uprobe_change_stack(unsigned long addr) > +{ > + unsigned long ssp; Probably want something like: if (!features_enabled(ARCH_SHSTK_SHSTK)) return; So this doesn't try the below if shadow stack is disabled. > + > + ssp = get_user_shstk_addr(); > + write_user_shstk_64((u64 __user *)ssp, (u64)addr); > +} Can we know that there was a valid return address just before this point on the stack? Or could it be a sigframe or something? > + > +void uprobe_push_stack(unsigned long addr) > +{ > + unsigned long ssp; if (!features_enabled(ARCH_SHSTK_SHSTK)) return; > + > + ssp = get_user_shstk_addr(); > + ssp -= SS_FRAME_SIZE; > + write_user_shstk_64((u64 __user *)ssp, (u64)addr); > + > + fpregs_lock_and_load(); > + wrmsrl(MSR_IA32_PL3_SSP, ssp); > + fpregs_unlock(); > +} > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c > index 81e6ee95784d..259457838020 100644 > --- a/arch/x86/kernel/uprobes.c > +++ b/arch/x86/kernel/uprobes.c > @@ -416,6 +416,7 @@ SYSCALL_DEFINE0(uretprobe) > regs->r11 = regs->flags; > regs->cx = regs->ip; > > + uprobe_push_stack(r11_cx_ax[2]); I'm concerned this could be used to push arbitrary frames to the shadow stack. Couldn't an attacker do a jump to the point that calls this syscall? Maybe this is what peterz was raising. > return regs->ax; > > sigill: > @@ -1191,8 +1192,10 @@ arch_uretprobe_hijack_return_addr(unsigned long > trampoline_vaddr, struct pt_regs > return orig_ret_vaddr; > > nleft = copy_to_user((void __user *)regs->sp, _vaddr, > rasize); > - if (likely(!nleft)) > + if (likely(!nleft)) { > + uprobe_change_stack(trampoline_vaddr); > return orig_ret_vaddr; > + } > > if (nleft != rasize) { > pr_err("return address clobbered: pid=%d, %%sp=%#lx, > %%ip=%#lx\n",
Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
On Fri, May 03, 2024 at 07:38:18PM +, Edgecombe, Rick P wrote: > +Some more shadow stack folks from other archs. We are discussing how > uretprobes > work with shadow stack. > > Context: > https://lore.kernel.org/lkml/ZjU4ganRF1Cbiug6@krava/ > > On Fri, 2024-05-03 at 21:18 +0200, Jiri Olsa wrote: > > > > hack below seems to fix it for the current uprobe setup, > > we need similar fix for the uretprobe syscall trampoline setup > > It seems like a reasonable direction. > > Security-wise, applications cannot do this on themselves, or it is an > otherwise > privileged thing right? when uretprobe is created, kernel overwrites the return address on user stack to point to user space trampoline, so the setup is in kernel hands with the hack below on top of this patchset I'm no longer seeing shadow stack app crash on uretprobe.. I'll try to polish it and send out next week, any suggestions are welcome ;-) thanks, jirka --- diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h index 42fee8959df7..d374305a6851 100644 --- a/arch/x86/include/asm/shstk.h +++ b/arch/x86/include/asm/shstk.h @@ -21,6 +21,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clon void shstk_free(struct task_struct *p); int setup_signal_shadow_stack(struct ksignal *ksig); int restore_signal_shadow_stack(void); +void uprobe_change_stack(unsigned long addr); +void uprobe_push_stack(unsigned long addr); #else static inline long shstk_prctl(struct task_struct *task, int option, unsigned long arg2) { return -EINVAL; } diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c index 59e15dd8d0f8..804c446231d9 100644 --- a/arch/x86/kernel/shstk.c +++ b/arch/x86/kernel/shstk.c @@ -577,3 +577,24 @@ long shstk_prctl(struct task_struct *task, int option, unsigned long arg2) return wrss_control(true); return -EINVAL; } + +void uprobe_change_stack(unsigned long addr) +{ + unsigned long ssp; + + ssp = get_user_shstk_addr(); + write_user_shstk_64((u64 __user *)ssp, (u64)addr); +} + +void uprobe_push_stack(unsigned long addr) +{ + unsigned long ssp; + + ssp = get_user_shstk_addr(); + ssp -= SS_FRAME_SIZE; + write_user_shstk_64((u64 __user *)ssp, (u64)addr); + + fpregs_lock_and_load(); + wrmsrl(MSR_IA32_PL3_SSP, ssp); + fpregs_unlock(); +} diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index 81e6ee95784d..259457838020 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -416,6 +416,7 @@ SYSCALL_DEFINE0(uretprobe) regs->r11 = regs->flags; regs->cx = regs->ip; + uprobe_push_stack(r11_cx_ax[2]); return regs->ax; sigill: @@ -1191,8 +1192,10 @@ arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs return orig_ret_vaddr; nleft = copy_to_user((void __user *)regs->sp, _vaddr, rasize); - if (likely(!nleft)) + if (likely(!nleft)) { + uprobe_change_stack(trampoline_vaddr); return orig_ret_vaddr; + } if (nleft != rasize) { pr_err("return address clobbered: pid=%d, %%sp=%#lx, %%ip=%#lx\n",
Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
+Some more shadow stack folks from other archs. We are discussing how uretprobes work with shadow stack. Context: https://lore.kernel.org/lkml/ZjU4ganRF1Cbiug6@krava/ On Fri, 2024-05-03 at 21:18 +0200, Jiri Olsa wrote: > > hack below seems to fix it for the current uprobe setup, > we need similar fix for the uretprobe syscall trampoline setup It seems like a reasonable direction. Security-wise, applications cannot do this on themselves, or it is an otherwise privileged thing right?
Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
On Fri, May 03, 2024 at 03:53:15PM +, Edgecombe, Rick P wrote: > On Fri, 2024-05-03 at 15:04 +0200, Jiri Olsa wrote: > > On Fri, May 03, 2024 at 01:34:53PM +0200, Peter Zijlstra wrote: > > > On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote: > > > > Adding uretprobe syscall instead of trap to speed up return probe. > > > > > > > > At the moment the uretprobe setup/path is: > > > > > > > > - install entry uprobe > > > > > > > > - when the uprobe is hit, it overwrites probed function's return > > > > address > > > > on stack with address of the trampoline that contains breakpoint > > > > instruction > > > > > > > > - the breakpoint trap code handles the uretprobe consumers execution > > > > and > > > > jumps back to original return address > > Hi, > > I worked on the x86 shadow stack support. > > I didn't know uprobes did anything like this. In hindsight I should have > looked > more closely. The current upstream behavior is to overwrite the return address > on the stack? > > Stupid uprobes question - what is actually overwriting the return address on > the > stack? Is it the kernel? If so perhaps the kernel could just update the shadow > stack at the same time. yes, it's in kernel - arch_uretprobe_hijack_return_addr .. so I guess we need to update the shadow stack with the new return value as well > > > > > > > > > This patch replaces the above trampoline's breakpoint instruction with > > > > new > > > > ureprobe syscall call. This syscall does exactly the same job as the > > > > trap > > > > with some more extra work: > > > > > > > > - syscall trampoline must save original value for rax/r11/rcx > > > > registers > > > > on stack - rax is set to syscall number and r11/rcx are changed and > > > > used by syscall instruction > > > > > > > > - the syscall code reads the original values of those registers and > > > > restore those values in task's pt_regs area > > > > > > > > - only caller from trampoline exposed in '[uprobes]' is allowed, > > > > the process will receive SIGILL signal otherwise > > > > > > > > > > Did you consider shadow stacks? IIRC we currently have userspace shadow > > > stack support available, and that will utterly break all of this. > > > > nope.. I guess it's the extra ret instruction in the trampoline that would > > make it crash? > > The original behavior seems problematic for shadow stack IIUC. I'm not sure of > the additional breakage with the new behavior. I can see it's broken also for current uprobes > > Roughly, how shadow stack works is there is an additional protected stack for > the app thread. The HW pushes to from the shadow stack with CALL, and pops > from > it with RET. But it also continues to push and pop from the normal stack. On > pop, if the values don't match between the two stacks, an exception is > generated. The whole point is to prevent the app from overwriting its stack > return address to return to random places. > > Userspace cannot (normally) write to the shadow stack, but the kernel can do > this or adust the SSP (shadow stack pointer). So in the kernel (for things > like > sigreturn) there is an ability to do what is needed. Ptracers also can do > things > like this. hack below seems to fix it for the current uprobe setup, we need similar fix for the uretprobe syscall trampoline setup jirka --- diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h index 42fee8959df7..99a0948a3b79 100644 --- a/arch/x86/include/asm/shstk.h +++ b/arch/x86/include/asm/shstk.h @@ -21,6 +21,7 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clon void shstk_free(struct task_struct *p); int setup_signal_shadow_stack(struct ksignal *ksig); int restore_signal_shadow_stack(void); +void uprobe_change_stack(unsigned long addr); #else static inline long shstk_prctl(struct task_struct *task, int option, unsigned long arg2) { return -EINVAL; } diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c index 59e15dd8d0f8..d2c4dbe5843c 100644 --- a/arch/x86/kernel/shstk.c +++ b/arch/x86/kernel/shstk.c @@ -577,3 +577,11 @@ long shstk_prctl(struct task_struct *task, int option, unsigned long arg2) return wrss_control(true); return -EINVAL; } + +void uprobe_change_stack(unsigned long addr) +{ + unsigned long ssp; + + ssp = get_user_shstk_addr(); + write_user_shstk_64((u64 __user *)ssp, (u64)addr); +} diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c index 81e6ee95784d..88afbeaacb8f 100644 --- a/arch/x86/kernel/uprobes.c +++ b/arch/x86/kernel/uprobes.c @@ -348,7 +348,7 @@ void *arch_uprobe_trampoline(unsigned long *psize) * only for native 64-bit process, the compat process still uses * standard breakpoint. */ - if (user_64bit_mode(regs)) { + if (0 && user_64bit_mode(regs)) { *psize =
Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
On Fri, 2024-05-03 at 15:04 +0200, Jiri Olsa wrote: > On Fri, May 03, 2024 at 01:34:53PM +0200, Peter Zijlstra wrote: > > On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote: > > > Adding uretprobe syscall instead of trap to speed up return probe. > > > > > > At the moment the uretprobe setup/path is: > > > > > > - install entry uprobe > > > > > > - when the uprobe is hit, it overwrites probed function's return > > > address > > > on stack with address of the trampoline that contains breakpoint > > > instruction > > > > > > - the breakpoint trap code handles the uretprobe consumers execution > > > and > > > jumps back to original return address Hi, I worked on the x86 shadow stack support. I didn't know uprobes did anything like this. In hindsight I should have looked more closely. The current upstream behavior is to overwrite the return address on the stack? Stupid uprobes question - what is actually overwriting the return address on the stack? Is it the kernel? If so perhaps the kernel could just update the shadow stack at the same time. > > > > > > This patch replaces the above trampoline's breakpoint instruction with new > > > ureprobe syscall call. This syscall does exactly the same job as the trap > > > with some more extra work: > > > > > > - syscall trampoline must save original value for rax/r11/rcx registers > > > on stack - rax is set to syscall number and r11/rcx are changed and > > > used by syscall instruction > > > > > > - the syscall code reads the original values of those registers and > > > restore those values in task's pt_regs area > > > > > > - only caller from trampoline exposed in '[uprobes]' is allowed, > > > the process will receive SIGILL signal otherwise > > > > > > > Did you consider shadow stacks? IIRC we currently have userspace shadow > > stack support available, and that will utterly break all of this. > > nope.. I guess it's the extra ret instruction in the trampoline that would > make it crash? The original behavior seems problematic for shadow stack IIUC. I'm not sure of the additional breakage with the new behavior. Roughly, how shadow stack works is there is an additional protected stack for the app thread. The HW pushes to from the shadow stack with CALL, and pops from it with RET. But it also continues to push and pop from the normal stack. On pop, if the values don't match between the two stacks, an exception is generated. The whole point is to prevent the app from overwriting its stack return address to return to random places. Userspace cannot (normally) write to the shadow stack, but the kernel can do this or adust the SSP (shadow stack pointer). So in the kernel (for things like sigreturn) there is an ability to do what is needed. Ptracers also can do things like this.
Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
On Fri, May 03, 2024 at 01:34:53PM +0200, Peter Zijlstra wrote: > On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote: > > Adding uretprobe syscall instead of trap to speed up return probe. > > > > At the moment the uretprobe setup/path is: > > > > - install entry uprobe > > > > - when the uprobe is hit, it overwrites probed function's return address > > on stack with address of the trampoline that contains breakpoint > > instruction > > > > - the breakpoint trap code handles the uretprobe consumers execution and > > jumps back to original return address > > > > This patch replaces the above trampoline's breakpoint instruction with new > > ureprobe syscall call. This syscall does exactly the same job as the trap > > with some more extra work: > > > > - syscall trampoline must save original value for rax/r11/rcx registers > > on stack - rax is set to syscall number and r11/rcx are changed and > > used by syscall instruction > > > > - the syscall code reads the original values of those registers and > > restore those values in task's pt_regs area > > > > - only caller from trampoline exposed in '[uprobes]' is allowed, > > the process will receive SIGILL signal otherwise > > > > Did you consider shadow stacks? IIRC we currently have userspace shadow > stack support available, and that will utterly break all of this. nope.. I guess it's the extra ret instruction in the trampoline that would make it crash? > > It would be really nice if the new scheme would consider shadow stacks. I seem to have the hw with support for user_shstk, let me test that thanks, jirka
Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote: > Adding uretprobe syscall instead of trap to speed up return probe. > > At the moment the uretprobe setup/path is: > > - install entry uprobe > > - when the uprobe is hit, it overwrites probed function's return address > on stack with address of the trampoline that contains breakpoint > instruction > > - the breakpoint trap code handles the uretprobe consumers execution and > jumps back to original return address > > This patch replaces the above trampoline's breakpoint instruction with new > ureprobe syscall call. This syscall does exactly the same job as the trap > with some more extra work: > > - syscall trampoline must save original value for rax/r11/rcx registers > on stack - rax is set to syscall number and r11/rcx are changed and > used by syscall instruction > > - the syscall code reads the original values of those registers and > restore those values in task's pt_regs area > > - only caller from trampoline exposed in '[uprobes]' is allowed, > the process will receive SIGILL signal otherwise > Did you consider shadow stacks? IIRC we currently have userspace shadow stack support available, and that will utterly break all of this. It would be really nice if the new scheme would consider shadow stacks.