On 22.01.2026 09:50, Roger Pau Monné wrote:
> On Mon, Nov 17, 2025 at 03:37:45PM +0100, Jan Beulich wrote:
>> If already we play with the IRQ count, we should do so only if we actually
>> "consume" the interrupt; normal timer IRQs should not have any adjustment
>> done.
>>
>> Fixes: 353533232730 ("cpuidle: fix the menu governor to enhance IO 
>> performance")
>> Signed-off-by: Jan Beulich <[email protected]>
>> ---
>> _Why_ we do these adjustments (also elsewhere) I don't really know.
> 
> I think I have an idea of what's going on here.  This accounting is
> used by the idle governor to decide when to go idle.  On Linux (where
> the code is imported from) the governor took into account the inflight
> IO request state.  However that's not available to Xen and instead
> they decided to mimic the tracking of the IO activity by counting
> interrupts.  I bet then realized the timer interrupt would "skew"
> those results and make it look like there's IO activity when the
> system is otherwise mostly idle.

Hmm, yes, that sounds pretty plausible. Except for one aspect: Why would
it be I/O that the governor would care about? It wants to judge by the
system being busy, and timer interrupts generally are an indication of
busyness. Just not broadcast ones. Hence ...

>> --- a/xen/arch/x86/hpet.c
>> +++ b/xen/arch/x86/hpet.c
>> @@ -808,13 +808,13 @@ int hpet_broadcast_is_available(void)
>>  
>>  int hpet_legacy_irq_tick(void)
>>  {
>> -    this_cpu(irq_count)--;
> 
> I think you want to pull this decrease into timer_interrupt() itself,
> so it does the decrease unconditionally of whether the interrupt is a
> legacy HPET one or from the PIT?

... I think moving to timer_interrupt() would actually be wrong.

> By gating the decrease on the interrupt having been originated from
> the HPET you completely avoid the decrease in the PIT case AFAICT.
> 
>> -
>>      if ( !hpet_events ||
>>           (hpet_events->flags & (HPET_EVT_DISABLE|HPET_EVT_LEGACY)) !=
>>           HPET_EVT_LEGACY )
>>          return 0;
>>  
>> +    this_cpu(irq_count)--;
> 
> Also in hpet_interrupt_handler() we might consider only doing the
> decrease after we ensure it's not a spurious interrupt?  We don't seem
> to decrease irq_count for spurious interrupts elsewhere.

Even a spurious interrupt is only an idle management auxiliary one (i.e.
really an artifact thereof). It doesn't hint at the system being busy.

Jan

Reply via email to