On 20.01.2022 16:52, Roger Pau Monné wrote:
> On Thu, Jan 20, 2022 at 03:04:39PM +0100, Jan Beulich wrote:
>> From: Artem Bityutskiy <artem.bityuts...@linux.intel.com>
>>
>> Enable local interrupts before requesting C1 on the last two generations
>> of Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake.
>> This decreases average C1 interrupt latency by about 5-10%, as measured
>> with the 'wult' tool.
>>
>> The '->enter()' function of the driver enters C-states with local
>> interrupts disabled by executing the 'monitor' and 'mwait' pair of
>> instructions. If an interrupt happens, the CPU exits the C-state and
>> continues executing instructions after 'mwait'. It does not jump to
>> the interrupt handler, because local interrupts are disabled. The
>> cpuidle subsystem enables interrupts a bit later, after doing some
>> housekeeping.
>>
>> With this patch, we enable local interrupts before requesting C1. In
>> this case, if the CPU wakes up because of an interrupt, it will jump
>> to the interrupt handler right away. The cpuidle housekeeping will be
>> done after the pending interrupt(s) are handled.
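
(For readers following along: the pattern being described boils down to
roughly the sketch below. The helper names are illustrative only, not
the actual Linux or Xen code, and the final local_irq_disable() reflects
the Xen-specific requirement mentioned further down.)

/* Illustrative sketch only -- not the actual driver code. */
static void idle_enter_c1(const void *monitor_addr, unsigned int mwait_hint)
{
    /* MONITOR: rax = monitored address, rcx/rdx = extensions/hints. */
    asm volatile ( "monitor" :: "a" (monitor_addr), "c" (0), "d" (0)
                   : "memory" );

    local_irq_enable();   /* new for C1: a wakeup IRQ is serviced right away */

    /* MWAIT: rax = C-state hint, rcx = extensions. */
    asm volatile ( "mwait" :: "a" (mwait_hint), "c" (0) );

    local_irq_disable();  /* Xen variant: callers expect IRQs off on return */
}
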
>>
>> Enabling interrupts before entering a C-state has measurable impact
>> for faster C-states, like C1. Deeper, but slower C-states like C6 do
>> not really benefit from this sort of change, because their latency is
>> a lot higher compared to the delay added by cpuidle housekeeping.
>>
>> This change was also tested with cyclictest and dbench. In case of Ice
>> Lake, the average cyclictest latency decreased by 5.1%, and the average
>> 'dbench' throughput increased by about 0.8%. Both tests were run for 4
>> hours with only C1 enabled (all other idle states, including 'POLL',
>> were disabled). CPU frequency was pinned to HFM, and uncore frequency
>> was pinned to the maximum value. The other platforms had similar
>> single-digit percentage improvements.
>>
>> It is worth noting that this patch affects 'cpuidle' statistics a tiny
>> bit.  Before this patch, C1 residency did not include the interrupt
>> handling time, but with this patch, it will include it. This is similar
>> to what happens in case of the 'POLL' state, which also runs with
>> interrupts enabled.
>>
>> Suggested-by: Len Brown <len.br...@intel.com>
>> Signed-off-by: Artem Bityutskiy <artem.bityuts...@linux.intel.com>
>> [Linux commit: c227233ad64c77e57db738ab0e46439db71822a3]
>>
>> We don't have a pointer into cpuidle_state_table[] readily available.
>> To compensate, rely on the fact that the new flag appears only for
>> C1 and hence only in the first table entry.
> 
> We could likely use the 8 padding bits in acpi_processor_cx between
> entry_method and method to store a flags field?

When looking there initially, I thought it wouldn't be good to add
a field there. But looking again, I no longer see what my concern
was. (We actually have an 8-bit padding slot there and a 32-bit
one.)
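
Something like the following is what I understand the suggestion to
be (sketch only -- the flag name is made up and the member types
shown are illustrative, not the actual layout):

/* Hypothetical flag: enter this C-state with IRQs already enabled. */
#define CX_FLAG_IRQ_ENABLE  (1u << 0)

struct acpi_processor_cx
{
    /* ... */
    uint8_t entry_method;
    uint8_t flags;          /* proposed: fills the existing 8-bit hole */
    uint32_t method;
    /* ... */
};
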

> It would seem more resilient, as I guess at some point we could
> enable interrupts for further states?

C1E maybe; I don't think anything beyond that, as the deeper states
have a higher wakeup latency anyway.

>> Unlike Linux we want to disable IRQs again after MWAITing, as
>> subsequently invoked functions assume so.
> 
> I'm also wondering whether there could be interrupts that rely on some
> of the housekeeping that's done when returning from mwait. I guess
> it's unlikely for an interrupt handler to have anything to do with the
> TSC not having been restored.

Actually this is a good point you make: we don't want to enable
IRQs when cstate_restore_tsc() is not a no-op, or else we might
confuse the time rendezvous. (I seem to remember the TSC stopping
only in deeper C-states, but maybe I'm mixing this up with the
APIC timer.)
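
I.e. the new path would want to be gated along these lines (sketch
only, using the hypothetical flag from above; iirc
cstate_restore_tsc() is a no-op when the TSC doesn't stop in deep
C-states):

/* Only enter with IRQs enabled when an early interrupt can't observe a
 * stale TSC, i.e. when cstate_restore_tsc() would do nothing anyway. */
static bool cx_irq_enable_ok(const struct acpi_processor_cx *cx)
{
    return (cx->flags & CX_FLAG_IRQ_ENABLE) &&
           boot_cpu_has(X86_FEATURE_NONSTOP_TSC);
}
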

>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> 
> Acked-by: Roger Pau Monné <roger....@citrix.com>

Thanks, but as per above - not just yet.

Jan

