Sorry, forgot to add the for-4.19 tag and Cc Oleksii.

Since we have taken the start of the series, we might as well take the
remaining patches (if other x86 maintainers agree) and try to fix all
the interrupt issues with CPU hotplug/unplug.

FTR: there are further issues when doing CPU hotplug/unplug from a PVH
dom0, but those are out of scope for 4.19, as I haven't even started
to diagnose what's going on.

Thanks, Roger.

On Thu, Jun 13, 2024 at 06:56:14PM +0200, Roger Pau Monne wrote:
> Hello,
> 
> The following series aims to fix interrupt handling when doing CPU
> plug/unplug operations.  Without this series, running:
> 
> cpus=`xl info max_cpu_id`
> while [ 1 ]; do
>     for i in `seq 1 $cpus`; do
>         xen-hptool cpu-offline $i;
>         xen-hptool cpu-online $i;
>     done
> done
> 
> Quite quickly results in interrupts getting lost and "No irq handler for
> vector" messages on the Xen console.  Drivers in dom0 also start getting
> interrupt timeouts and the system becomes unusable.
> 
> After applying the series, running the loop overnight still results in a
> fully usable system: no "No irq handler for vector" messages at all and no
> interrupt losses reported by dom0.  Tested with x2apic-mode={mixed,cluster}.
> 
> I've attempted to document all code as well as I could; interrupt
> handling has some unexpected corner cases that are hard to diagnose and
> reason about.
> 
> Some XenRT testing is underway to ensure no breakages.
> 
> Thanks, Roger.
> 
> Roger Pau Monne (3):
>   x86/irq: deal with old_cpu_mask for interrupts in movement in
>     fixup_irqs()
>   x86/irq: handle moving interrupts in _assign_irq_vector()
>   x86/irq: forward pending interrupts to new destination in fixup_irqs()
> 
>  xen/arch/x86/include/asm/apic.h |   5 +
>  xen/arch/x86/irq.c              | 163 +++++++++++++++++++++++++-------
>  2 files changed, 132 insertions(+), 36 deletions(-)
> 
> -- 
> 2.45.2
> 
