On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <t...@linutronix.de> wrote: > > Provide functions which handle the low level entry and exit similiar to > enter/exit from user mode. >
> + > +/** > + * idtentry_exit - Common code to handle return from exceptions > + * @regs: Pointer to pt_regs (exception entry regs) > + * > + * Depending on the return target (kernel/user) this runs the necessary > + * preemption and work checks if possible and reguired and returns to > + * the caller with interrupts disabled and no further work pending. > + * > + * This is the last action before returning to the low level ASM code which > + * just needs to return to the appropriate context. > + * > + * Invoked by all exception/interrupt IDTENTRY handlers which are not > + * returning through the paranoid exit path (all except NMI, #DF and the IST > + * variants of #MC and #DB). The paranoid-exit bit is not really relevant. The important part is which stack we're on. See below. > + */ > +void noinstr idtentry_exit(struct pt_regs *regs) > +{ > + lockdep_assert_irqs_disabled(); How about: #ifdef CONFIG_DEBUG_ENTRY WARN_ON_ONCE(!on_thread_stack()); #endif > + > + /* Check whether this returns to user mode */ > + if (user_mode(regs)) { > + prepare_exit_to_usermode(regs); > + } else if (regs->flags & X86_EFLAGS_IF) { > + /* Check kernel preemption, if enabled */ > + if (IS_ENABLED(CONFIG_PREEMPTION)) { > + /* > + * This needs to be done very carefully. > + * idtentry_enter() invoked rcu_irq_enter(). This > + * needs to undone before scheduling. > + * > + * Preemption is disabled inside of RCU idle > + * sections. When the task returns from > + * preempt_schedule_irq(), RCU is still watching. > + * > + * rcu_irq_exit_preempt() has additional state > + * checking if CONFIG_PROVE_RCU=y > + */ > + if (!preempt_count()) { > + instr_begin(); > + rcu_irq_exit_preempt(); > + if (need_resched()) > + preempt_schedule_irq(); This is an excellent improvement. Thanks! > + /* Covers both tracing and lockdep */ > + trace_hardirqs_on(); > + instr_end(); > + return; > + } > + } > + instr_begin(); > + /* Tell the tracer that IRET will enable interrupts */ > + trace_hardirqs_on_prepare(); Why is trace_hardirqs_on() okay above but not here? Is it that we know we weren't RCU-quiescent if we had preemption and IF on? But even this code path came from an IF-on context. I'm confused. Maybe some comments as to why this case seems to be ordered so differently from the !preempt_count() case would be helpful. > + lockdep_hardirqs_on_prepare(CALLER_ADDR0); > + instr_end(); > + rcu_irq_exit(); > + lockdep_hardirqs_on(CALLER_ADDR0); > + } else { > + /* IRQ flags state is correct already. Just tell RCU */ > + rcu_irq_exit(); > + } > +} > --- a/arch/x86/include/asm/idtentry.h > +++ b/arch/x86/include/asm/idtentry.h > @@ -7,6 +7,9 @@ > > #ifndef __ASSEMBLY__ > > +void idtentry_enter(struct pt_regs *regs); > +void idtentry_exit(struct pt_regs *regs); > + > /** > * DECLARE_IDTENTRY - Declare functions for simple IDT entry points > * No error code pushed by hardware >