On 09/27/2012 05:15 AM, Chuansheng Liu wrote:
>
> When one CPU is going offline, and fixup_irqs() will re-set the
> irq affinity in some cases, we should clean the offlining CPU from
> the irq affinity.
>
> The reason is setting offlining CPU as of the affinity is useless.
> Moreover, the smp_affinity value will be confusing when the
> offlining CPU come back again.
>
> Example:
> For irq 93 with 4 CPUS, the default affinity f(1111),
> normal cases: 4 CPUS will receive the irq93 interrupts.
>
> When echo 0 > /sys/devices/system/cpu/cpu3/online, just CPU0,1,2 will
> receive the interrupts.
>
> But after the CPU3 is online again, we will not set affinity,the result
> will be:
> the smp_affinity is f, but still just CPU0,1,2 can receive the interrupts.
>
> So we should clean the offlining CPU from irq affinity mask
> in fixup_irqs().
>
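
The mismatch in the example above can be sketched in user space. This is only a toy model, not kernel code: struct irq_model, model_cpu_offline() and model_is_consistent() are hypothetical names, and the model assumes that the hardware is always retargeted away from the offlining CPU while the stored mask is trimmed only if the patch's behaviour is selected. It does not model the case where every CPU in the mask goes offline.

```c
#include <stdbool.h>

/* Hypothetical model of one interrupt line; one bit per CPU. */
struct irq_model {
	unsigned int stored;     /* what smp_affinity would report */
	unsigned int programmed; /* CPUs the hardware actually targets */
};

/*
 * Model of a CPU going offline. The hardware target always loses the
 * offlining CPU; the stored mask loses it only when clean_stored is
 * true (the behaviour the patch proposes).
 */
static void model_cpu_offline(struct irq_model *m, unsigned int cpu,
			      bool clean_stored)
{
	m->programmed &= ~(1u << cpu);
	if (clean_stored)
		m->stored &= ~(1u << cpu);
}

/*
 * Nothing is reprogrammed when the CPU comes back online, so the two
 * masks can simply be compared afterwards.
 */
static bool model_is_consistent(const struct irq_model *m)
{
	return m->stored == m->programmed;
}
```

Starting from stored = programmed = 0xf and taking CPU3 offline with clean_stored = false leaves stored at 0xf and programmed at 0x7, which is the confusing state described in the example. With clean_stored = true both end up as 0x7.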

I have some fundamental questions here:

1. Why was the CPU never removed from the affinity masks in the original
code? I find it hard to believe that it was just an oversight, because the
whole point of fixup_irqs() is to affine the interrupts to other CPUs, IIUC.
So, is that really a bug or is the existing code correct for some reason
which I don't know of?

2. In case this is indeed a bug, why are the warnings ratelimited when the
interrupts can't be affined to other CPUs? Are they not serious enough to
report? Put more strongly, why do we even silently return with a warning
instead of reporting that the CPU offline operation failed? Is that because
we have come way too far in the hotplug sequence and we can't easily roll
back? Or are we still actually OK in that situation?

Suresh, I'd be grateful if you could kindly throw some light on these
issues. I'm actually debugging an issue where an offline CPU gets APIC
timer interrupts (and in one case, I even saw a device interrupt), which I
have reported in another thread at: https://lkml.org/lkml/2012/9/26/119

But this issue in fixup_irqs() that Liu brought to light looks even more
surprising to me.

Regards,
Srivatsa S. Bhat

> ---
>  arch/x86/kernel/irq.c |   21 +++++++++++++++++----
>  1 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index d44f782..ead0807 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -239,10 +239,13 @@ void fixup_irqs(void)
>  	struct irq_desc *desc;
>  	struct irq_data *data;
>  	struct irq_chip *chip;
> +	int cpu = smp_processor_id();
>
>  	for_each_irq_desc(irq, desc) {
>  		int break_affinity = 0;
>  		int set_affinity = 1;
> +		bool set_ret = false;
> +
>  		const struct cpumask *affinity;
>
>  		if (!desc)
> @@ -256,7 +259,8 @@ void fixup_irqs(void)
>  		data = irq_desc_get_irq_data(desc);
>  		affinity = data->affinity;
>  		if (!irq_has_action(irq) || irqd_is_per_cpu(data) ||
> -		    cpumask_subset(affinity, cpu_online_mask)) {
> +		    cpumask_subset(affinity, cpu_online_mask) ||
> +		    !cpumask_test_cpu(cpu, data->affinity)) {
>  			raw_spin_unlock(&desc->lock);
>  			continue;
>  		}
> @@ -277,9 +281,18 @@ void fixup_irqs(void)
>  		if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
>  			chip->irq_mask(data);
>
> -		if (chip->irq_set_affinity)
> -			chip->irq_set_affinity(data, affinity, true);
> -		else if (!(warned++))
> +		if (chip->irq_set_affinity) {
> +			struct cpumask mask;
> +			cpumask_copy(&mask, affinity);
> +			cpumask_clear_cpu(cpu, &mask);
> +			switch (chip->irq_set_affinity(data, &mask, true)) {
> +			case IRQ_SET_MASK_OK:
> +				cpumask_copy(data->affinity, &mask);
> +			case IRQ_SET_MASK_OK_NOCOPY:
> +				set_ret = true;
> +			}
> +		}
> +		if ((!set_ret) && !(warned++))
>  			set_affinity = 0;
>
>  		/*
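
For readers following the last hunk, the return-code handling it introduces relies on a switch fall-through, which is easy to misread in the quoted form. Below is a minimal user-space sketch of that control flow only. The enum model_ret, its numeric values and model_apply_result() are hypothetical stand-ins chosen for this illustration; they are not the kernel's definitions, and a plain unsigned int takes the place of struct cpumask.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the chip callback's return codes. */
enum model_ret {
	MODEL_SET_MASK_ERR = -1,      /* any failure */
	MODEL_SET_MASK_OK = 0,        /* caller must copy the new mask */
	MODEL_SET_MASK_OK_NOCOPY = 1, /* callee already updated the mask */
};

/*
 * Mirrors the shape of the switch in the quoted hunk: on OK the stored
 * mask is updated and control falls through to mark success; on
 * OK_NOCOPY only success is marked; anything else leaves both alone.
 * Returns the value the hunk calls set_ret.
 */
static bool model_apply_result(enum model_ret ret, unsigned int *stored,
			       unsigned int new_mask)
{
	bool set_ret = false;

	switch (ret) {
	case MODEL_SET_MASK_OK:
		*stored = new_mask;
		/* fall through */
	case MODEL_SET_MASK_OK_NOCOPY:
		set_ret = true;
		break;
	default:
		break;
	}
	return set_ret;
}
```

In this model a failing return leaves set_ret false, which in the quoted hunk is the path that reaches the warned++ check.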