On Tue, 2023-06-20 at 13:24 -0400, Joel Upham wrote:
> The primary difference in PCI device IRQ management between Xen HVM and
> QEMU is that Xen PCI IRQs are "device-centric" while QEMU PCI IRQs are
> "chipset-centric". Namely, Xen uses PCI device BDF and INTx as coordinates
> to assert IRQ while QEMU finds out to which chipset PIRQ the IRQ is routed
> through the hierarchy of PCI buses and manages IRQ assertion on chipset
> side (as PIRQ inputs).

I don't think that's an accurate way of describing it.

Let's take the ICH9 as the basic case, and look at how the PIIX3 and
Xen both differ from it. As far as I understand it...

• ICH9

The four INTx pins from each PCI slot (32*4) are multiplexed down to a
smaller number of PIRQ lines; 8 of them on the ICH9. The mapping for
each slot is quite complex and depends on chipset registers.

Those 8 PIRQ lines (PIRQ[A-H]) are mapped directly and unconditionally
to IRQ16-23 on the I/O APIC.

There is also a set of mapping registers in the chipset which allows
each PIRQ line to be mapped to the i8259 PIC (as e.g. IRQ5, 10, etc.).

(I think QEMU has a bug here. It should be able to deliver to *both*
the I/O APIC and the i8259, but it seems not to deliver to the I/O APIC
when the i8259 routing is enabled.)
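
To make the two ICH9 routes concrete, here is a rough sketch in C. The
function names are made up for illustration and are not QEMU or Xen
functions. The register layout assumed here (bit 7 set means "not
routed to the i8259", low four bits select the IRQ) is how I read the
PIRQ[n]_ROUT description, and matches the 80h "disable" value mentioned
further down.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

#define PIRQ_ROUT_DISABLED 0x80u  /* bit 7 set: not routed to the i8259 */
#define PIRQ_ROUT_IRQ_MASK 0x0fu  /* bits 3:0: i8259 IRQ number */

/* pirq is 0..7 for PIRQA..PIRQH; the I/O APIC pin is fixed. */
int ich9_pirq_to_ioapic_pin(int pirq)
{
    return 16 + pirq;
}

/* rout is the PIRQ[n]_ROUT register value; returns -1 if disabled. */
int ich9_pirq_to_pic_irq(uint8_t rout)
{
    if (rout & PIRQ_ROUT_DISABLED)
        return -1;
    return rout & PIRQ_ROUT_IRQ_MASK;
}
```

The point of having two separate functions is that the two routes are
independent: the I/O APIC pin never depends on the routing register.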


• PIIX3

The PIIX3 only has four PIRQ lines, and the mapping from slot/pin to
PIRQ line is a *lot* more deterministic; it's basically just a simple
mask and shift of the slot/pin numbers. And since the PIIX3 also didn't
have an internal I/O APIC, the chipset registers mapping PIRQ# to IRQ#
*do* (at least in QEMU's emulation) affect the routing to the I/O APIC
as well as the i8259.

(I think this is probably a QEMU bug, or at least lack of fidelity in
its PC platform emulation. Real hardware with a PIIX3 and external I/O
APIC would have routed PIRQ[A-D] to I/O APIC IRQ16-19, wouldn't it?)
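
For reference, the "mask and shift" I mean is roughly the following.
This is a sketch from memory of what QEMU's PIIX3 .map_irq callback
does, so treat the exact offset as an assumption and check it against
the tree. The function name here is hypothetical.

```c
/* A PCI devfn packs the slot number in bits 7:3, function in bits 2:0. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)

/*
 * intx is 0..3 for INTA..INTD.
 * Returns 0..3 for PIRQA..PIRQD.
 * Assumed rotation: slot 1 INTA lands on PIRQA, slot 2 INTA on PIRQB, ...
 */
int piix3_slot_to_pirq(int devfn, int intx)
{
    int slot_addend = PCI_SLOT(devfn) - 1;

    return (intx + slot_addend) & 3;
}
```

With this rotation, the INTA pins of adjacent slots end up on different
PIRQ lines, which spreads single-function devices across all four.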


• Xen

Xen has two *separate* hard-coded rotations from slot/pin down to
PIRQs. For the I/O APIC it multiplexes down to 32 I/O APIC pins (IRQ16-
47). But for the i8259 it hard-codes the PIIX3 rotation down to 4 PIRQs
and expects the device model to provide the i8259 IRQ# for each of them
(from the chipset registers).
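
The two Xen mappings look roughly like this. I'm writing these from
memory of the hvm_pci_intx_gsi() and hvm_pci_intx_link() macros in the
Xen tree, so the exact arithmetic is an assumption to be verified; the
function names below are hypothetical.

```c
/* dev is the PCI device (slot) number 0..31, intx is 0..3 for INTA..INTD. */

/* I/O APIC path: fold 32 slots * 4 pins onto GSI 16..47. */
int xen_intx_to_gsi(int dev, int intx)
{
    return (((dev << 2) + (dev >> 3) + intx) & 31) + 16;
}

/* i8259 path: fold onto 4 PCI links (0..3 for PIRQA..PIRQD). */
int xen_intx_to_link(int dev, int intx)
{
    return (dev + intx) & 3;
}
```

Neither function consults any chipset register, which is the
hard-coding I'm talking about: only the link-to-IRQ# step on the i8259
side comes from the device model.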

When you say Xen is "device-centric" I think you're saying it's hard-
coded the pin mappings and that's why it expects to take the actual PCI
bus/device/function/pin in order to do the mapping for itself, while
QEMU would normally expect to have done that part "properly" to get a
faithful emulation of the hardware in question.

(Note the extra fun part I mentioned earlier: Xen can route I/O APIC
interrupts as PIRQs, and needs the *I/O APIC* IRQ# for that, which might
differ from the i8259 IRQ#. So running with 'noapic' and
XENFEAT_hvm_pirqs is probably going to *really* confuse your guests
because the ACPI _PRT table can only tell them one number.)

> Two callback functions are used for this purpose: .map_irq and .set_irq
> (named after corresponding structure fields). Corresponding Xen-specific
> callback functions are piix3_set_irq() and pci_slot_get_pirq(). In Xen
> case these functions do not operate on pirq pin numbers. Instead, they use
> a specific value to pass BDF/INTx information between .map_irq and
> .set_irq -- PCI device devfn and INTx pin number are combined into
> pseudo-PIRQ in pci_slot_get_pirq, which piix3_set_irq later decodes back
> into devfn and INTx number for passing to *set_pci_intx_level() call.
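
As a sketch of that round trip (function names hypothetical; as far as
I recall the existing QEMU code packs the slot number rather than the
full devfn, so that is what I assume here):

```c
/* Encode: what the Xen .map_irq callback returns as a pseudo-PIRQ. */
int xen_encode_pseudo_pirq(int slot, int intx)
{
    return intx + (slot << 2);
}

/* Decode: what the Xen .set_irq callback does before calling into Xen. */
void xen_decode_pseudo_pirq(int irq_num, int *slot, int *intx)
{
    *slot = irq_num >> 2;  /* upper bits: PCI slot number */
    *intx = irq_num & 3;   /* low two bits: INTA..INTD as 0..3 */
}
```
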
> 
> For Xen on Q35 this scheme is still applicable, with the exception that
> function names are non-descriptive now and need to be renamed to show
> their common i440/Q35 nature. Proposed new names are:
> 
> xen_pci_slot_get_pirq --> xen_cmn_pci_slot_get_pirq
> xen_piix3_set_irq     --> xen_cmn_set_irq
> 
> Another IRQ-related difference between i440 and Q35 is the number of PIRQ
> inputs and PIRQ routers (PCI IRQ links in terms of ACPI) available. i440
> has 4 PCI interrupt links, while Q35 has 8 (PIRQA...PIRQH).
> Currently Xen have support for only 4 PCI links, so we describe only 4 of
> 8 PCI links in ACPI tables. Also, hvmloader disables PIRQ routing for
> PIRQE..PIRQH by writing 80h into corresponding PIRQ[n]_ROUT registers.
>
> All this PCI interrupt routing stuff is largely an ancient legacy from PIC
> era. It's hardly worth to extend number of PCI links supported as we
> normally deal with APIC mode and/or MSI interrupts.
> 
> The only useful thing to do with PIRQE..PIRQH routing currently is to
> check if guest actually attempts to use it for some reason (despite ACPI
> PCI routing information provided). In this case, a warning is logged.

I don't quite understand how this works. PIRQA-H are supposed to map
unconditionally to IRQ16-23 on the ICH9 I/O APIC. But you can't do that
without fixing Xen. So doesn't the ACPI _PRT table have to reflect
Xen's hard-coded I/O APIC mapping of slots to IRQ16-47? I don't see
where you did that?

And there are some devices which are defined to use PIRQ[E-H] by the
ICH9 datasheet and which *can't* route to PIRQ[A-D], aren't there?
Those devices just can't be used in i8259 mode unless Xen is fixed to
handle more PIRQ routings? In fact, Xen doesn't even get the mappings
to PIRQ[A-D] right for the ICH9, does it? It's just applying its hard-
coded PIIX3 mappings to PIRQ[A-D] and then the ICH9's mapping to IRQ#
on top of that?
