On Mon, Jun 15, 2020 at 02:08:16PM -0700, Andy Lutomirski wrote:

> > All !user exceptions really should be NMI-like. If you want to go
> > overboard, I suppose you can look at IF and have them behave interrupt
> > like when set, but why make things complicated.
> 
> This entire rabbit hole opened because of #PF. So we at least need the
> set of exceptions that are permitted to schedule if they came from
> kernel mode to remain schedulable.

What exception, other than #PF, actually needs to schedule from kernel?

> Prior to the giant changes, all the non-IST *exceptions*, but not the
> interrupts, were schedulable from kernel mode, assuming the original
> context could schedule. Right now, interrupts can schedule, too, which
> is nice if we ever want to fully clean up the Xen abomination. I
> suppose we could make it so #PF opts in to special treatment again,
> but we should decide that the result is simpler or otherwise better
> before we do this.
> 
> One possible justification would be that the schedulable entry variant
> is more complicated, and most kernel exceptions except the ones with
> fixups are bad news, and we want the oopses to succeed. But page
> faults are probably the most common source of oopses, so this is a bit
> weak, and we really want page faults to work even from nasty contexts.

I think I'd prefer the argument of consistent failure.

Do we ever want #UD to schedule? If not, then why allow it to sometimes
schedule and sometimes fail, better to always fail.

#DB is still a giant trainwreck in this regard as well.

Something like this...

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -216,10 +216,25 @@ static inline void handle_invalid_op(str
                      ILL_ILLOPN, error_get_trap_addr(regs));
 }
 
-DEFINE_IDTENTRY_RAW(exc_invalid_op)
+static void handle_invalid_op_kernel(struct pt_regs *regs)
+{
+       if (is_valid_bugaddr(regs->ip) &&
+           report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN) {
+               /* Skip the ud2. */
+               regs->ip += LEN_UD2;
+               return;
+       }
+
+       handle_invalid_op(regs);
+}
+
+static void handle_invalid_op_user(struct pt_regs *regs)
 {
-       bool rcu_exit;
+       handle_invalid_op(regs);
+}
 
+DEFINE_IDTENTRY_RAW(exc_invalid_op)
+{
        /*
         * Handle BUG/WARN like NMIs instead of like normal idtentries:
         * if we bugged/warned in a bad RCU context, for example, the last
@@ -227,38 +242,25 @@ DEFINE_IDTENTRY_RAW(exc_invalid_op)
         * infinitum.
         */
        if (!user_mode(regs)) {
-               enum bug_trap_type type = BUG_TRAP_TYPE_NONE;
-
                nmi_enter();
                instrumentation_begin();
                trace_hardirqs_off_finish();
 
-               if (is_valid_bugaddr(regs->ip))
-                       type = report_bug(regs->ip, regs);
+               handle_invalid_op_kernel(regs);
 
                if (regs->flags & X86_EFLAGS_IF)
                        trace_hardirqs_on_prepare();
                instrumentation_end();
                nmi_exit();
+       } else {
+               bool rcu_exit;
 
-               if (type == BUG_TRAP_TYPE_WARN) {
-                       /* Skip the ud2. */
-                       regs->ip += LEN_UD2;
-                       return;
-               }
-
-               /*
-                * Else, if this was a BUG and report_bug returns or if this
-                * was just a normal #UD, we want to continue onward and
-                * crash.
-                */
+               rcu_exit = idtentry_enter_cond_rcu(regs);
+               instrumentation_begin();
+               handle_invalid_op_user(regs);
+               instrumentation_end();
+               idtentry_exit_cond_rcu(regs, rcu_exit);
        }
-
-       rcu_exit = idtentry_enter_cond_rcu(regs);
-       instrumentation_begin();
-       handle_invalid_op(regs);
-       instrumentation_end();
-       idtentry_exit_cond_rcu(regs, rcu_exit);
 }
 
 DEFINE_IDTENTRY(exc_coproc_segment_overrun)

Reply via email to