On 21.12.2023 21:41, Sébastien Chaumat wrote:
> On Thu, 21 Dec 2023 at 14:29, Juergen Gross <jgr...@suse.com> wrote:
> 
>> On 21.12.23 13:40, Jan Beulich wrote:
>>> On 20.12.2023 17:34, Sébastien Chaumat wrote:
>>>> Here are the patches I made to xen and linux kernel
>>>> Plus dmesg (bare metal,xen) and "xl dmesg"
>>>
>>> So the problem looks to be that pci_xen_initial_domain() results in
>>> permanent setup of IRQ7, when there only "static" ACPI tables (in
>>> particular source overrides in MADT) are consulted. The necessary
>>> settings, however, are known only after _CRS for the device was
>>> evaluated (and possibly _PRS followed by invocation of _SRS). All of
>>> this happens before xen_register_gsi() is called. But that function's
>>> call to xen_register_pirq() is short-circuited by the very first if()
>>> in xen_register_pirq() when there was an earlier invocation. Hence
>>> the (wrong) "edge" binding remains in place, as was established by
>>> the earlier call here.
>>>
>>> Jürgen, there's an interesting comment in xen_bind_pirq_gsi_to_irq(),
>>> right before invoking irq_set_chip_and_handler_name(). Despite what
>>> the comment says (according to my reading), the fasteoi one is _not_
>>> used in all cases. Assuming there's a reason for this, it's not clear
>>> to me whether updating the handler later on is a valid thing to do.
>>> __irq_set_handler() being even an exported symbol suggests that might
>>> be an option to use here. Then again merely updating the handler may
>>> not be sufficient, seeing there are also e.g. IRQD_TRIGGER_MASK and
>>> IRQD_LEVEL.
>>
>> I understand the last paragraph of that comment as reasoning that, even
>> if pirq_needs_eoi() returns true for an edge-triggered interrupt, the
>> outcome is still okay.
>>
>> I don't think updating the handler later is valid.
>>
>>> Sébastien, to prove the (still pretty weak) theory that the change in
>>> handler is all that's needed to make things work in your case, could
>>> you fiddle with pci_xen_initial_domain() to have it skip IRQ7? (That
>>> of course won't be a proper solution, but ought to be okay for your
>>> system.) The main weakness of the theory is that IRQ7 really isn't
>>> very special in this regard - other PCI IRQs routed to the low 16
>>> IO-APIC pins ought to have similar issues (from the log, on your
>>> system this would be at least IRQ6 and IRQ10, except that they happen
>>> to be edge/low, so fitting the ISA defaults); only IRQ16 and up would
>>> work okay.
>>>
>>
> 
> Doing just that: IRQ7 is now of type level
>   xen-pirq     -ioapic-level  pinctrl_amd
> 
> 
> (But is ioapic-level there totally equivalent to the fasteoi of bare metal?)
> Still, the touchpad does not work.
> 
> And we now have:
> Dec 21 20:13:57 fedora kernel: i2c_hid_acpi i2c-PIXA3854:00: failed to reset device: -61
> Dec 21 20:14:17 fedora kernel: i2c_hid_acpi: probe of i2c-PIXA3854:00 failed with error -61
> 
> in addition to
> Dec 21 20:13:57 fedora kernel: i2c_hid_acpi i2c-FRMW0004:00: failed to reset device: -61
> Dec 21 20:13:57 fedora kernel: i2c_hid_acpi i2c-FRMW0005:00: failed to reset device: -61
> Dec 21 20:14:17 fedora kernel: i2c_hid_acpi: probe of i2c-FRMW0004:00 failed with error -61
> Dec 21 20:14:17 fedora kernel: i2c_hid_acpi: probe of i2c-FRMW0005:00 failed with error -61

So there's more to this, which I'm afraid will (first) need looking into
by a person familiar with the involved drivers.

> I noticed that on bare metal:
> 
>   53:          0          0          0          0          0       1268
>      0          0          0          0          0          0          0
>        0          0          0  amd_gpio    5  FRMW0005:00
>   54:          0          0          0          0          0          1
>      0          0          0          0          0          0          0
>        0          0          0  amd_gpio   84  FRMW0004:00
>   55:          0          0          0          0          0       1403
>      0          0          0          0          0          0          0
>        0          0          0  amd_gpio    8  PIXA3854:00
> 
> With Xen, with IRQ7 set up only once, there's only (the touchpad is
> PIXA3854:00):
> 
>  176:          0          0          0          0          0          0
>      1          0          0          0          0          0          0
>        0          0          0  amd_gpio    8
> 
> Interestingly, when IRQ7 is set up twice (normal Xen):
>  176:          0          0          0          0          0          0
>      1          0          0          0          0          0          0
>        0          0          0  amd_gpio    8  PIXA3854:00

That's odd, as with IRQ7 (wrongly) set up as edge, it should also be marked
as non-sharable. Otoh, given the "i2c-PIXA3854:00:" error above, it's no
surprise that no interrupt is set up there.

>>> Furthermore it might be interesting to know whether ELCR would give us
>>> any hint in this case. Sadly depending on where you look,
>>> applicability of this pair of registers (I/O ports 0x4d0 and 0x4d1)
>>> to other than EISA systems is claimed true or false. Could you perhaps
>>> make Xen simply log the values read from those two ports, by e.g.
>>> inserting
>>>
>>>      printk("ELCR: %02x, %02x\n", inb(0x4d0), inb(0x4d1));
>>>
>>> in, say, setup_dump_irqs()?
>>>
>>
> Did that, but I don't know how to trigger the dump.

There's no need to trigger the dump. The message will be logged during
boot, and hence ought to be visible in "xl dmesg" output.
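
For reading the two logged values, here is a minimal sketch. It assumes the
chipset implements ELCR with the conventional layout at all, which is exactly
the open question: port 0x4d0 covers IRQ0-7 and port 0x4d1 covers IRQ8-15,
with a set bit meaning level-triggered. The helper names are made up for
illustration; they are not existing Xen or Linux functions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Combine the two ELCR bytes into one 16-bit mask.
 * elcr1 is the value read from port 0x4d0 (IRQ0-7),
 * elcr2 is the value read from port 0x4d1 (IRQ8-15).
 */
static inline uint16_t elcr_combine(uint8_t elcr1, uint8_t elcr2)
{
    return (uint16_t)(elcr1 | ((uint16_t)elcr2 << 8));
}

/*
 * Return true if the given legacy IRQ (0-15) is marked level-triggered
 * in the combined mask; a clear bit means edge-triggered.
 * IRQs outside the 0-15 range are not covered by ELCR and yield false.
 */
static inline bool elcr_irq_is_level(uint16_t elcr, unsigned int irq)
{
    return irq < 16 && ((elcr >> irq) & 1);
}
```

With a hypothetical log line of "ELCR: 80, 04" (not taken from any actual
log), elcr_irq_is_level(elcr_combine(0x80, 0x04), 7) would be true, i.e.
IRQ7 would be flagged as level-triggered, as would IRQ10.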

Jan
