On Wed, 9 May 2018 06:24:16 +0000 Dexuan Cui <de...@microsoft.com> wrote:

> In include/linux/cpumask.h, for_each_cpu is defined like this for UP kernel 
> (CONFIG_NR_CPUS=1):
> 
> #define for_each_cpu(cpu, mask)                 \
>         for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
> 
> Here 'mask' is ignored, but what if 'mask' contains 0 CPU? -- in this case, 
> the for loop should not
> run at all, but with the current code, we run the loop once with cpu==0.
> 
> I think I'm seeing a bug in my UP kernel that is caused by the buggy 
> for_each_cpu():
> 
> in kernel/time/tick-broadcast.c: tick_handle_oneshot_broadcast(), 
> tick_broadcast_oneshot_mask
> contains 0 CPU, but due to the buggy for_each_cpu(), the variable 
> 'next_event' is changed from
> its default value KTIME_MAX to "next_event = td->evtdev->next_event"; as a 
> result,
> tick_handle_oneshot_broadcast () -> tick_broadcast_set_event() -> 
> clockevents_program_event()
> -> pit_next_event() is programming the PIT timer by accident, causing an 
> interrupt storm of PIT
> interrupts in some way: I'm seeing that the kernel is receiving ~8000 PIT 
> interrupts per second for
> 1~5 minutes when the UP kernel boots, and it looks the kernel hangs, but in 
> 1~5 minutes, finally
> somehow the kernel can recover and boot up fine. But, occasionally, the 
> kernel just hangs there
> forever, receiving ~8000 PIT timers per second. 
> 
> With the below change in kernel/time/tick-broadcast.c, the interrupt storm 
> will go away:
> 
> +#undef for_each_cpu
> +#define for_each_cpu(cpu, mask)                        \
> +       for ((cpu) = 0; (((cpu) < 1) && ((mask)[0].bits[0] & 1)); (cpu)++, 
> (void)mask)
> 
> Should we fix the for_each_cpu() in include/linux/cpumask.h for UP?

I think so, yes.  That might reveal new peculiarities, but such is life.

I guess we should use bitmap_empty() rather than open-coding it.

Reply via email to