Hi Marc,

(CC: +Jon)

On 01/09/2020 15:43, Marc Zyngier wrote:
> Change the way we deal with GIC SGIs by turning them into proper
> IRQs, and calling into the arch code to register the interrupt range
> instead of a callback.

Your comment "This only works because we don't nest SGIs..." on this thread
tripped some bad memories from adding the irq-stack. Softirq causes us to nest
irqs, but only once.


(I've messed with the below diff to remove the added stuff:)

> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index 4ffd62af888f..4be2b62f816f 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -335,31 +335,22 @@ static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
>               irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
>               irqnr = irqstat & GICC_IAR_INT_ID_MASK;
>  
> -             if (likely(irqnr > 15 && irqnr < 1020)) {
> -                     if (static_branch_likely(&supports_deactivate_key))
> -                             writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
> -                     isb();
> -                     handle_domain_irq(gic->domain, irqnr, regs);
> -                     continue;
> -             }
> -             if (irqnr < 16) {
>                       writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
> -                     if (static_branch_likely(&supports_deactivate_key))
> -                             writel_relaxed(irqstat, cpu_base + GIC_CPU_DEACTIVATE);
> -#ifdef CONFIG_SMP
> -                     /*
> -                      * Ensure any shared data written by the CPU sending
> -                      * the IPI is read after we've read the ACK register
> -                      * on the GIC.
> -                      *
> -                      * Pairs with the write barrier in gic_raise_softirq
> -                      */
>                       smp_rmb();
> -                     handle_IPI(irqnr, regs);

If I read this right, previously we would EOI the interrupt before calling
handle_IPI(). Whereas now, with the version of this series in your tree, we
stuff the to-be-EOId value in a percpu variable, which is only safe if these
don't nest.

Hidden in irq_exit(), kernel/softirq.c::__irq_exit_rcu() has this:
|       preempt_count_sub(HARDIRQ_OFFSET);
|       if (!in_interrupt() && local_softirq_pending())
|               invoke_softirq();

The arch code doesn't raise the preempt counter by HARDIRQ, so once
__irq_exit_rcu() has dropped it, in_interrupt() returns false, and we
invoke_softirq().

invoke_softirq() -> __do_softirq() -> local_irq_enable()!

Fortunately, __do_softirq() raises the softirq count first using
__local_bh_disable_ip(), which in_interrupt() checks too, so this can only
happen once per IRQ.
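
As a rough userspace sketch of the counting argument above (not kernel code):
the toy_* names, the offsets and the single injected IRQ are all assumptions
made for illustration, and the model ignores that the real __do_softirq()
loops while work is pending. It only shows why the softirq count caps the
nesting at two IRQ levels.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy offsets (assumed): separate bit fields for softirq and hardirq counts. */
#define TOY_SOFTIRQ_OFFSET 0x100u
#define TOY_HARDIRQ_OFFSET 0x10000u

static unsigned int toy_preempt_count;
static bool toy_softirq_pending;
static int toy_irq_depth, toy_max_irq_depth;
static int toy_irqs_to_inject;

/* Stand-in for in_interrupt(): true if any hardirq/softirq count is held. */
static bool toy_in_interrupt(void)
{
	return toy_preempt_count != 0;
}

static void toy_handle_irq(void);

/* Stand-in for __do_softirq(): raise the softirq count, then unmask IRQs. */
static void toy_do_softirq(void)
{
	toy_preempt_count += TOY_SOFTIRQ_OFFSET;
	toy_softirq_pending = false;
	/* Models local_irq_enable(): a further IRQ can be taken here. */
	if (toy_irqs_to_inject > 0) {
		toy_irqs_to_inject--;
		toy_handle_irq();
	}
	toy_preempt_count -= TOY_SOFTIRQ_OFFSET;
}

/* Stand-in for irq_enter(), the handler, then irq_exit(). */
static void toy_handle_irq(void)
{
	toy_irq_depth++;
	if (toy_irq_depth > toy_max_irq_depth)
		toy_max_irq_depth = toy_irq_depth;

	toy_preempt_count += TOY_HARDIRQ_OFFSET;
	toy_softirq_pending = true;	/* the handler raises a softirq */
	toy_preempt_count -= TOY_HARDIRQ_OFFSET;

	/* Same shape as the check quoted from __irq_exit_rcu(). */
	if (!toy_in_interrupt() && toy_softirq_pending)
		toy_do_softirq();

	toy_irq_depth--;
}

/* Take one IRQ with extra_irqs waiting; return the deepest IRQ nesting seen. */
int toy_run(int extra_irqs)
{
	toy_preempt_count = 0;
	toy_softirq_pending = false;
	toy_irq_depth = 0;
	toy_max_irq_depth = 0;
	toy_irqs_to_inject = extra_irqs;

	toy_handle_irq();
	assert(toy_preempt_count == 0);	/* all counts are balanced again */
	return toy_max_irq_depth;
}
```

In this model toy_run(0) gives a depth of 1, and toy_run(5) gives 2 however
many IRQs are waiting, because the nested irq_exit() still sees the softirq
count.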

Now the irq_exit() has moved from handle_IPI(), which ran after EOI, into
handle_domain_irq(), which runs before. I think it's possible SGIs nest, and
the new percpu variable becomes corrupted.
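
To make the suspected failure concrete, here is a minimal userspace sketch. It
assumes the ordering described above (the value is saved at ack time and only
consumed by an EOI that happens after the handler, and hence after any nested
SGI). The toy_* names are hypothetical and this is not the driver's code; the
only hardware detail borrowed is that a GICv2 IAR value for an SGI carries the
source CPU above the interrupt ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stands in for the per-CPU "value to EOI later" slot (one CPU modelled). */
static uint32_t toy_saved_intack;

/* Log of the values written to the pretend EOI register. */
static uint32_t toy_eoi_log[4];
static int toy_eoi_count;

/* Build an IAR-style value: source CPU in bits [12:10], SGI number below. */
uint32_t toy_make_iar(uint32_t src_cpu, uint32_t sgi)
{
	return (src_cpu << 10) | sgi;
}

/* EOI whatever the shared slot currently holds. */
static void toy_eoi(void)
{
	toy_eoi_log[toy_eoi_count++] = toy_saved_intack;
}

/*
 * Ack an SGI, run its "handler", then EOI. If nested is non-NULL, a second
 * SGI is taken before the EOI, as if softirq processing had unmasked IRQs.
 */
static void toy_handle_sgi(uint32_t irqstat, const uint32_t *nested)
{
	toy_saved_intack = irqstat;		/* save for the later EOI */
	if (nested)
		toy_handle_sgi(*nested, NULL);	/* overwrites the slot */
	toy_eoi();
}

/* Return the value the outermost SGI ends up writing as its EOI. */
uint32_t toy_outer_eoi_value(uint32_t outer, const uint32_t *nested)
{
	toy_eoi_count = 0;
	toy_handle_sgi(outer, nested);
	return toy_eoi_log[toy_eoi_count - 1];
}
```

Without nesting the outer SGI EOIs its own value. With a nested SGI from a
different CPU, the outer EOI is issued with the inner SGI's value, which is
the corruption being worried about here.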

Presumably this isn't a problem for regular IRQs, as they don't need the
sending-CPU in order to EOI, which is why it wasn't a problem before.

Adding anything to preempt-count around the whole thing upsets RCU, and softirq
seems to expect this nesting, but evidently the GIC does not. I'm not sure what
the right thing to do would be. A dirty hack like [0] would confirm the theory.

/me runs

Thanks,

James



[0] A dirty hack
-----------%<-----------
diff --git a/kernel/softirq.c b/kernel/softirq.c
index bf88d7f62433..50e14d8cbec3 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -376,7 +376,7 @@ static inline void invoke_softirq(void)
        if (ksoftirqd_running(local_softirq_pending()))
                return;

-       if (!force_irqthreads) {
+       if (false) {
 #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
                /*
                 * We can safely execute softirq on the current stack if
@@ -393,6 +393,7 @@ static inline void invoke_softirq(void)
                do_softirq_own_stack();
 #endif
        } else {
+               /* hack: force this */
                wakeup_softirqd();
        }
 }
-----------%<-----------
