On 22.01.2026 09:50, Roger Pau Monné wrote:
> On Mon, Nov 17, 2025 at 03:37:45PM +0100, Jan Beulich wrote:
>> If already we play with the IRQ count, we should do so only if we actually
>> "consume" the interrupt; normal timer IRQs should not have any adjustment
>> done.
>>
>> Fixes: 353533232730 ("cpuidle: fix the menu governor to enhance IO
>> performance")
>> Signed-off-by: Jan Beulich <[email protected]>
>> ---
>> _Why_ we do these adjustments (also elsewhere) I don't really know.
>
> I think I have an idea of what's going on here. This accounting is
> used by the idle governor to decide when to go idle. On Linux (where
> the code is imported from) the governor took into account the inflight
> IO request state. However that's not available to Xen and instead
> they decided to mimic the tracking of the IO activity by counting
> interrupts. I bet then realized the timer interrupt would "skew"
> those results and make it look like there's IO activity when the
> system is otherwise mostly idle.
Hmm, yes, that sounds pretty plausible. Except for one aspect: Why would
it be I/O that the governor would care about? It wants to judge by the
system being busy, and timer interrupts generally are an indication of
busyness. Just not broadcast ones. Hence ...
>> --- a/xen/arch/x86/hpet.c
>> +++ b/xen/arch/x86/hpet.c
>> @@ -808,13 +808,13 @@ int hpet_broadcast_is_available(void)
>>
>> int hpet_legacy_irq_tick(void)
>> {
>> - this_cpu(irq_count)--;
>
> I think you want to pull this decrease into timer_interrupt() itself,
> so it does the decrease unconditionally of whether the interrupt is a
> legacy HPET one or from the PIT?
... I think moving to timer_interrupt() would actually be wrong.
> By gating the decrease on the interrupt having been originated from
> the HPET you completely avoid the decrease in the PIT case AFAICT.
>
>> -
>> if ( !hpet_events ||
>> (hpet_events->flags & (HPET_EVT_DISABLE|HPET_EVT_LEGACY)) !=
>> HPET_EVT_LEGACY )
>> return 0;
>>
>> + this_cpu(irq_count)--;
>
> Also in hpet_interrupt_handler() we might consider only doing the
> decrease after we ensure it's not a spurious interrupt? We don't seem
> to decrease irq_count for spurious interrupts elsewhere.
Even a spurious interrupt is only an idle management auxiliary one (i.e.
really an artifact thereof). It doesn't hint at the system being busy.
Jan