On 10/1/24 12:26, Jan Beulich wrote:
On 10.01.2024 10:53, Roger Pau Monne wrote:
The HVM pirq feature allows routing interrupts from both physical and emulated
devices over event channels; this was done as a performance improvement.  However
its usage is fully undocumented, and the only reference implementation is in
Linux.  It defeats the purpose of local APIC hardware virtualization, because
when using it interrupts avoid the usage of the local APIC altogether.

So without sufficient APIC acceleration, isn't this arranging for degraded
performance then? IOW should the new default perhaps be dependent on the
degree of APIC acceleration?

It has also been reported to not work properly with certain devices, at least
when using some AMD GPUs Linux attempts to route interrupts over event
channels, but Xen doesn't correctly detect such routing, which leads to the
hypervisor complaining with:

(XEN) d15v0: Unsupported MSI delivery mode 7 for Dom15

When MSIs are attempted to be routed over event channels the entry delivery
mode is set to ExtINT, but Xen doesn't detect such routing and attempts to
inject the interrupt following the native MSI path, and the ExtINT delivery
mode is not supported.
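
For reference, the "delivery mode 7" in the log message above can be read straight out of the MSI data value. The following is only a minimal sketch of that decoding, assuming the standard x86 MSI data layout (vector in bits 7:0, delivery mode in bits 10:8); the macro and function names are illustrative and are not taken from the Xen or Linux sources:

```c
#include <stdint.h>

/* Assumed x86 MSI data layout: vector in bits 7:0, delivery mode in bits 10:8. */
#define EX_MSI_DATA_VECTOR_MASK     0x00ffu
#define EX_MSI_DATA_DELIVERY_SHIFT  8
#define EX_MSI_DATA_DELIVERY_MASK   0x0700u

/* Delivery mode encodings, as defined for the local APIC. */
enum ex_delivery_mode {
    EX_DELIVERY_FIXED     = 0,
    EX_DELIVERY_LOWESTPRI = 1,
    EX_DELIVERY_SMI       = 2,
    EX_DELIVERY_NMI       = 4,
    EX_DELIVERY_INIT      = 5,
    EX_DELIVERY_EXTINT    = 7,
};

/* Extract the delivery mode field from an MSI data value. */
static inline unsigned int ex_msi_delivery_mode(uint32_t data)
{
    return (data & EX_MSI_DATA_DELIVERY_MASK) >> EX_MSI_DATA_DELIVERY_SHIFT;
}

/* Extract the vector field from an MSI data value. */
static inline unsigned int ex_msi_vector(uint32_t data)
{
    return data & EX_MSI_DATA_VECTOR_MASK;
}

/* Hypothetical check: does this MSI data value request ExtINT delivery? */
static inline int ex_msi_is_extint(uint32_t data)
{
    return ex_msi_delivery_mode(data) == EX_DELIVERY_EXTINT;
}
```

A data value whose bits 10:8 are all set decodes to mode 7 (ExtINT), which is the value the native MSI injection path rejects in the message quoted above.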

Shouldn't this be properly addressed nevertheless? The way it's described
it sounds as if MSI wouldn't work at all this way; I can't spot why the
issue would only be "with certain devices". Yet that in turn doesn't look
to be very likely - pass-through use cases, in particular SR-IOV ones,
would certainly have noticed.

The issue gets triggered when the guest performs save/restore of MSIs, because PHYSDEVOP_map_pirq is not implemented for MSIs, and thus QEMU cannot remap the MSI to the event channel once it has been unmapped. So, to fix this issue, we would need either to change QEMU so that it does not unmap pirq-emulated MSIs, or to implement PHYSDEVOP_map_pirq for MSIs.

But still, even when no device has been passed through, scheduling latencies of hundreds of ms were observed in the guest, even when running a simple loop application, and they disappear once the flag is disabled. We have not had the chance to root-cause this further.


Jan

