On Wed, Jul 12, 2023 at 12:07:43PM +0200, Jan Beulich wrote:
> On 11.07.2023 15:02, Roger Pau Monné wrote:
> > On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote:
> >> On 10.07.2023 16:43, Roger Pau Monné wrote:
> >>> On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote:
> >>>> On 07.07.2023 11:53, Roger Pau Monne wrote:
> >>>>> The current logic to init the local APIC and the IO-APIC does init the
> >>>>> former first before doing any kind of sanitation on the IO-APIC pin
> >>>>> configuration.  It's already noted on enable_IO_APIC() that Xen
> >>>>> shouldn't trust the IO-APIC being empty at bootup.
> >>>>>
> >>>>> At XenServer we have a system where the IO-APIC 0 is handed to Xen
> >>>>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and
> >>>>> with a vector of 0 (all fields of the RTE are zeroed).  Once the local
> >>>>> APIC is enabled periodic injections from such pin cause a storm of
> >>>>> errors:
> >>>>>
> >>>>> APIC error on CPU0: 00(40), Received illegal vector
> >>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>>
> >>>>> That prevents Xen from booting.
> >>>>
> >>>> And I expect no RTE is in ExtInt mode, so one might conclude that
> >>>> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit
> >>>> of course there's then the question whether to change the RTE
> >>>> rather than masking it. What do ACPI tables say?
> >>>
> >>> There's one relevant override:
> >>>
> >>> [668h 1640   1]                Subtable Type : 02 [Interrupt Source 
> >>> Override]
> >>> [669h 1641   1]                       Length : 0A
> >>> [66Ah 1642   1]                          Bus : 00
> >>> [66Bh 1643   1]                       Source : 00
> >>> [66Ch 1644   4]                    Interrupt : 00000002
> >>> [670h 1648   2]        Flags (decoded below) : 0000
> >>>                                     Polarity : 0
> >>>                                 Trigger Mode : 0
> >>>
> >>> So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is
> >>> connected to.
> >>
> >> Then wouldn't we be better off converting that RTE to ExtInt? That
> >> would allow PIC IRQs (not likely to exist, but still) to work
> >> (without undue other side effects afaics), whereas masking this RTE
> >> would not.
> > 
> > I'm kind of worry of trying to automate this logic.  Should we always
> > convert pin 0 to ExtInt if it's found unmasked, with and invalid
> > vector and there's a source override entry for the IRQ?
> > 
> > It seems weird to infer this just from the fact that pin 0 is all
> > zeroed out.
> 
> As you say in the earlier paragraph, it wouldn't be just that. It's
> unmasked + bad vector + present source override (indicating that
> nothing except perhaps a PIC is connected to the pin).

I can do this as a separate patch, but not really here IMO.  The
purpose of this patch is strictly to perform the IO-APIC sanitation
ahead of enabling the local APIC.  I will add a followup patch to the
series, albeit I'm unconvinced we want to infer IO-APIC pin
configuration based on firmware miss configurations.

Thanks, Roger.

Reply via email to