On Thu, Jan 20, 2022 at 03:04:39PM +0100, Jan Beulich wrote: > From: Artem Bityutskiy <artem.bityuts...@linux.intel.com> > > Enable local interrupts before requesting C1 on the last two generations > of Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake. > This decreases average C1 interrupt latency by about 5-10%, as measured > with the 'wult' tool. > > The '->enter()' function of the driver enters C-states with local > interrupts disabled by executing the 'monitor' and 'mwait' pair of > instructions. If an interrupt happens, the CPU exits the C-state and > continues executing instructions after 'mwait'. It does not jump to > the interrupt handler, because local interrupts are disabled. The > cpuidle subsystem enables interrupts a bit later, after doing some > housekeeping. > > With this patch, we enable local interrupts before requesting C1. In > this case, if the CPU wakes up because of an interrupt, it will jump > to the interrupt handler right away. The cpuidle housekeeping will be > done after the pending interrupt(s) are handled. > > Enabling interrupts before entering a C-state has measurable impact > for faster C-states, like C1. Deeper, but slower C-states like C6 do > not really benefit from this sort of change, because their latency is > a lot higher comparing to the delay added by cpuidle housekeeping. > > This change was also tested with cyclictest and dbench. In case of Ice > Lake, the average cyclictest latency decreased by 5.1%, and the average > 'dbench' throughput increased by about 0.8%. Both tests were run for 4 > hours with only C1 enabled (all other idle states, including 'POLL', > were disabled). CPU frequency was pinned to HFM, and uncore frequency > was pinned to the maximum value. The other platforms had similar > single-digit percentage improvements. > > It is worth noting that this patch affects 'cpuidle' statistics a tiny > bit. Before this patch, C1 residency did not include the interrupt > handling time, but with this patch, it will include it. This is similar > to what happens in case of the 'POLL' state, which also runs with > interrupts enabled. > > Suggested-by: Len Brown <len.br...@intel.com> > Signed-off-by: Artem Bityutskiy <artem.bityuts...@linux.intel.com> > [Linux commit: c227233ad64c77e57db738ab0e46439db71822a3] > > We don't have a pointer into cpuidle_state_table[] readily available. > To compensate, make use of the new flag only appearing for C1 and hence > only in the first table entry.
We could likely use the 8 padding bits in acpi_processor_cx between entry_method and method to store a flags field? It would seems more resilient, as I guess at some point we could enable interrupts for further states? > > Unlike Linux we want to disable IRQs again after MWAITing, as > subsequently invoked functions assume so. I'm also wondering whether there could be interrupts that rely on some of the housekeeping that's done when returning from mwait. I guess it's unlikely for an interrupt handler to have anything to do with the TSC not having been restored. > > Signed-off-by: Jan Beulich <jbeul...@suse.com> Acked-by: Roger Pau Monné <roger....@citrix.com> I think it's important to keep in sync with our upstream that's Linux there. Albeit I would prefer if we could carry the flags into acpi_processor_cx. Thanks, Roger.