Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On 2/8/2021 11:48 AM, Borislav Petkov wrote: On Mon, Feb 08, 2021 at 11:23:18AM -0800, Yu, Yu-cheng wrote: exc_general_protection() and do_trap() both call show_signal(), which then calls printk_ratelimit(). You could've done some git archeology and could've found abd4f7505baf ("x86: i386-show-unhandled-signals-v3") which explains why that ratelimiting is needed. For example, if a shell script, in a loop re-starts an app when it exits, and the app is causing control-protection fault. The log messages should be rate limited. I think you should be able to get where I'm going with this, by now: please put a comment over the ratelimiting to explain why it is there, just like the above commit explains. I will add that. Thx.
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On Mon, Feb 08, 2021 at 11:23:18AM -0800, Yu, Yu-cheng wrote: > exc_general_protection() and do_trap() both call show_signal(), which > then calls printk_ratelimit(). You could've done some git archeology and could've found abd4f7505baf ("x86: i386-show-unhandled-signals-v3") which explains why that ratelimiting is needed. > For example, if a shell script, in a loop re-starts an app when it > exits, and the app is causing control-protection fault. The log > messages should be rate limited. I think you should be able to get where I'm going with this, by now: please put a comment over the ratelimiting to explain why it is there, just like the above commit explains. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On 2/8/2021 10:53 AM, Borislav Petkov wrote: On Mon, Feb 08, 2021 at 10:50:07AM -0800, Yu, Yu-cheng wrote: I have not run into the situation. Initially it was there because other faults have it. Which other faults? exc_general_protection() and do_trap() both call show_signal(), which then calls printk_ratelimit(). When you asked, I went through it and put out my reasoning. What does that mean? I went through my patch and check if ratelimit is necessary, and then describe the finding. I think it still makes sense to keep it. Because you have a hunch or you actually have an objective reason why? For example, if a shell script, in a loop re-starts an app when it exits, and the app is causing control-protection fault. The log messages should be rate limited.
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On Mon, Feb 08, 2021 at 10:50:07AM -0800, Yu, Yu-cheng wrote: > I have not run into the situation. Initially it was there because other > faults have it. Which other faults? > When you asked, I went through it and put out my reasoning. What does that mean? > I think it still makes sense to keep it. Because you have a hunch or you actually have an objective reason why? -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On 2/8/2021 10:20 AM, Borislav Petkov wrote: On Fri, Feb 05, 2021 at 10:00:21AM -0800, Yu, Yu-cheng wrote: The ratelimit here is only for #CP, and its rate is not counted together with other types of faults. If a task gets here, it will exit. The only condition the ratelimit will trigger is when multiple tasks hit #CP at once, which is unlikely. Are you suggesting that we do not need the ratelimit here? I'm trying to first find out why is it there. Is this something you've hit during testing and thought, oh well, this needs a ratelimit or was it added just because? I have not run into the situation. Initially it was there because other faults have it. When you asked, I went through it and put out my reasoning. I think it still makes sense to keep it. -- Yu-cheng
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On Fri, Feb 05, 2021 at 10:00:21AM -0800, Yu, Yu-cheng wrote: > The ratelimit here is only for #CP, and its rate is not counted together > with other types of faults. If a task gets here, it will exit. The only > condition the ratelimit will trigger is when multiple tasks hit #CP at once, > which is unlikely. Are you suggesting that we do not need the ratelimit > here? I'm trying to first find out why is it there. Is this something you've hit during testing and thought, oh well, this needs a ratelimit or was it added just because? -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On Fri, Feb 05, 2021 at 10:00:21AM -0800, Yu, Yu-cheng wrote: > On 2/5/2021 5:59 AM, Borislav Petkov wrote: > > On Wed, Feb 03, 2021 at 02:55:28PM -0800, Yu-cheng Yu wrote: > > > +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection) > > > +{ > > > + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, > > > + DEFAULT_RATELIMIT_BURST); > > > + struct task_struct *tsk; > > > + > > > + if (!user_mode(regs)) { > > > + pr_emerg("PANIC: unexpected kernel control protection fault\n"); > > > + die("kernel control protection fault", regs, error_code); > > > + panic("Machine halted."); > > > + } > > > + > > > + cond_local_irq_enable(regs); > > > + > > > + if (!boot_cpu_has(X86_FEATURE_CET)) > > > + WARN_ONCE(1, "Control protection fault with CET support > > > disabled\n"); > > > + > > > + tsk = current; > > > + tsk->thread.error_code = error_code; > > > + tsk->thread.trap_nr = X86_TRAP_CP; > > > + > > > + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && > > > + __ratelimit(&rs)) { > > > > I can't find it written down anywhere why the ratelimiting is needed at > > all? > > > > The ratelimit here is only for #CP, and its rate is not counted together > with other types of faults. If a task gets here, it will exit. The only > condition the ratelimit will trigger is when multiple tasks hit #CP at once, > which is unlikely. Are you suggesting that we do not need the ratelimit > here? Since this is a potentially unprivileged-userspace-triggerable condition, I tend to prefer having a ratelimit. I don't feel _strongly_ about it, but I find it better to be defensive against log spamming (whether malicious or accidental). -- Kees Cook
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On 2/5/2021 5:59 AM, Borislav Petkov wrote: On Wed, Feb 03, 2021 at 02:55:28PM -0800, Yu-cheng Yu wrote: +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection) +{ + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, + DEFAULT_RATELIMIT_BURST); + struct task_struct *tsk; + + if (!user_mode(regs)) { + pr_emerg("PANIC: unexpected kernel control protection fault\n"); + die("kernel control protection fault", regs, error_code); + panic("Machine halted."); + } + + cond_local_irq_enable(regs); + + if (!boot_cpu_has(X86_FEATURE_CET)) + WARN_ONCE(1, "Control protection fault with CET support disabled\n"); + + tsk = current; + tsk->thread.error_code = error_code; + tsk->thread.trap_nr = X86_TRAP_CP; + + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && + __ratelimit(&rs)) { I can't find it written down anywhere why the ratelimiting is needed at all? The ratelimit here is only for #CP, and its rate is not counted together with other types of faults. If a task gets here, it will exit. The only condition the ratelimit will trigger is when multiple tasks hit #CP at once, which is unlikely. Are you suggesting that we do not need the ratelimit here? Thanks! -- Yu-cheng
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On Wed, Feb 03, 2021 at 02:55:28PM -0800, Yu-cheng Yu wrote: > +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection) > +{ > + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, > + DEFAULT_RATELIMIT_BURST); > + struct task_struct *tsk; > + > + if (!user_mode(regs)) { > + pr_emerg("PANIC: unexpected kernel control protection fault\n"); > + die("kernel control protection fault", regs, error_code); > + panic("Machine halted."); > + } > + > + cond_local_irq_enable(regs); > + > + if (!boot_cpu_has(X86_FEATURE_CET)) > + WARN_ONCE(1, "Control protection fault with CET support > disabled\n"); > + > + tsk = current; > + tsk->thread.error_code = error_code; > + tsk->thread.trap_nr = X86_TRAP_CP; > + > + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && > + __ratelimit(&rs)) { I can't find it written down anywhere why the ratelimiting is needed at all? -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On 2/4/2021 12:09 PM, Kees Cook wrote: On Wed, Feb 03, 2021 at 02:55:28PM -0800, Yu-cheng Yu wrote: [...] diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 7f5aec758f0e..f5354c35df32 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -606,6 +606,66 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) cond_local_irq_disable(regs); } +#ifdef CONFIG_X86_CET +static const char * const control_protection_err[] = { + "unknown", + "near-ret", + "far-ret/iret", + "endbranch", + "rstorssp", + "setssbsy", +}; + +/* + * When a control protection exception occurs, send a signal to the responsible + * application. Currently, control protection is only enabled for user mode. + * This exception should not come from kernel mode. + */ +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection) +{ + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, + DEFAULT_RATELIMIT_BURST); + struct task_struct *tsk; + + if (!user_mode(regs)) { + pr_emerg("PANIC: unexpected kernel control protection fault\n"); + die("kernel control protection fault", regs, error_code); + panic("Machine halted."); + } + + cond_local_irq_enable(regs); + + if (!boot_cpu_has(X86_FEATURE_CET)) + WARN_ONCE(1, "Control protection fault with CET support disabled\n"); + + tsk = current; + tsk->thread.error_code = error_code; + tsk->thread.trap_nr = X86_TRAP_CP; + + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && + __ratelimit(&rs)) { + unsigned int max_err; + unsigned long ssp; + + max_err = ARRAY_SIZE(control_protection_err) - 1; + if (error_code < 0 || error_code > max_err) + error_code = 0; Do you want to mask the error_code here before printing its value? + + rdmsrl(MSR_IA32_PL3_SSP, ssp); + pr_emerg("%s[%d] control protection ip:%lx sp:%lx ssp:%lx error:%lx(%s)", +tsk->comm, task_pid_nr(tsk), +regs->ip, regs->sp, ssp, error_code, +control_protection_err[error_code]); Instead, you could clamp error_code to ARRAY_SIZE(control_protection_err), and add another "unknown" to the end of the strings: control_protection_err[ array_index_nospec(error_code, ARRAY_SIZE(control_protection_err))] Everything else looks good. I will update it. Thanks! [...]
Re: [PATCH v19 06/25] x86/cet: Add control-protection fault handler
On Wed, Feb 03, 2021 at 02:55:28PM -0800, Yu-cheng Yu wrote: > A control-protection fault is triggered when a control-flow transfer > attempt violates Shadow Stack or Indirect Branch Tracking constraints. > For example, the return address for a RET instruction differs from the copy > on the shadow stack; or an indirect JMP instruction, without the NOTRACK > prefix, arrives at a non-ENDBR opcode. > > The control-protection fault handler works in a similar way as the general > protection fault handler. It provides the si_code SEGV_CPERR to the signal > handler. > > Signed-off-by: Yu-cheng Yu > Cc: Michael Kerrisk > --- > arch/x86/include/asm/idtentry.h| 4 ++ > arch/x86/kernel/idt.c | 4 ++ > arch/x86/kernel/signal_compat.c| 2 +- > arch/x86/kernel/traps.c| 60 ++ > include/uapi/asm-generic/siginfo.h | 3 +- > 5 files changed, 71 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h > index f656aabd1545..ff4b3bf634da 100644 > --- a/arch/x86/include/asm/idtentry.h > +++ b/arch/x86/include/asm/idtentry.h > @@ -574,6 +574,10 @@ DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_SS, > exc_stack_segment); > DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_GP, exc_general_protection); > DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_AC, exc_alignment_check); > > +#ifdef CONFIG_X86_CET > +DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP, exc_control_protection); > +#endif > + > /* Raw exception entries which need extra work */ > DECLARE_IDTENTRY_RAW(X86_TRAP_UD,exc_invalid_op); > DECLARE_IDTENTRY_RAW(X86_TRAP_BP,exc_int3); > diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c > index ee1a283f8e96..e8166d9bbb10 100644 > --- a/arch/x86/kernel/idt.c > +++ b/arch/x86/kernel/idt.c > @@ -105,6 +105,10 @@ static const __initconst struct idt_data def_idts[] = { > #elif defined(CONFIG_X86_32) > SYSG(IA32_SYSCALL_VECTOR, entry_INT80_32), > #endif > + > +#ifdef CONFIG_X86_CET > + INTG(X86_TRAP_CP, asm_exc_control_protection), > +#endif > }; > > /* > diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c > index a5330ff498f0..dd92490b1e7f 100644 > --- a/arch/x86/kernel/signal_compat.c > +++ b/arch/x86/kernel/signal_compat.c > @@ -27,7 +27,7 @@ static inline void signal_compat_build_tests(void) >*/ > BUILD_BUG_ON(NSIGILL != 11); > BUILD_BUG_ON(NSIGFPE != 15); > - BUILD_BUG_ON(NSIGSEGV != 9); > + BUILD_BUG_ON(NSIGSEGV != 10); > BUILD_BUG_ON(NSIGBUS != 5); > BUILD_BUG_ON(NSIGTRAP != 5); > BUILD_BUG_ON(NSIGCHLD != 6); > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c > index 7f5aec758f0e..f5354c35df32 100644 > --- a/arch/x86/kernel/traps.c > +++ b/arch/x86/kernel/traps.c > @@ -606,6 +606,66 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) > cond_local_irq_disable(regs); > } > > +#ifdef CONFIG_X86_CET > +static const char * const control_protection_err[] = { > + "unknown", > + "near-ret", > + "far-ret/iret", > + "endbranch", > + "rstorssp", > + "setssbsy", > +}; > + > +/* > + * When a control protection exception occurs, send a signal to the > responsible > + * application. Currently, control protection is only enabled for user mode. > + * This exception should not come from kernel mode. > + */ > +DEFINE_IDTENTRY_ERRORCODE(exc_control_protection) > +{ > + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL, > + DEFAULT_RATELIMIT_BURST); > + struct task_struct *tsk; > + > + if (!user_mode(regs)) { > + pr_emerg("PANIC: unexpected kernel control protection fault\n"); > + die("kernel control protection fault", regs, error_code); > + panic("Machine halted."); > + } > + > + cond_local_irq_enable(regs); > + > + if (!boot_cpu_has(X86_FEATURE_CET)) > + WARN_ONCE(1, "Control protection fault with CET support > disabled\n"); > + > + tsk = current; > + tsk->thread.error_code = error_code; > + tsk->thread.trap_nr = X86_TRAP_CP; > + > + if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && > + __ratelimit(&rs)) { > + unsigned int max_err; > + unsigned long ssp; > + > + max_err = ARRAY_SIZE(control_protection_err) - 1; > + if (error_code < 0 || error_code > max_err) > + error_code = 0; Do you want to mask the error_code here before printing its value? > + > + rdmsrl(MSR_IA32_PL3_SSP, ssp); > + pr_emerg("%s[%d] control protection ip:%lx sp:%lx ssp:%lx > error:%lx(%s)", > + tsk->comm, task_pid_nr(tsk), > + regs->ip, regs->sp, ssp, error_code, > + control_protection_err[error_code]); Instead, you could clamp error_code to ARRAY_SIZE(control_protection_err),