On Fri, Mar 11, 2022 at 03:19:22PM +0000, Julien Grall wrote:
> Hi Roger,
> 
> On 11/03/2022 15:04, Roger Pau Monné wrote:
> > On Fri, Mar 11, 2022 at 11:15:13AM +0000, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 11/03/2022 10:52, Marek Marczykowski-Górecki wrote:
> > > > On Fri, Mar 11, 2022 at 10:23:03AM +0000, Julien Grall wrote:
> > > > > Hi Marek,
> > > > > 
> > > > > On 10/03/2022 16:37, Marek Marczykowski-Górecki wrote:
> > > > > > On Thu, Mar 10, 2022 at 04:21:50PM +0000, Julien Grall wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On 10/03/2022 16:12, Roger Pau Monné wrote:
> > > > > > > > On Thu, Mar 10, 2022 at 05:08:07PM +0100, Jan Beulich wrote:
> > > > > > > > > On 10.03.2022 16:47, Roger Pau Monné wrote:
> > > > > > > > > > On Thu, Mar 10, 2022 at 04:23:00PM +0100, Jan Beulich wrote:
> > > > > > > > > > > On 10.03.2022 15:34, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > --- a/xen/drivers/char/ns16550.c
> > > > > > > > > > > > +++ b/xen/drivers/char/ns16550.c
> > > > > > > > > > > > @@ -1221,6 +1221,9 @@ pci_uart_config(struct ns16550 
> > > > > > > > > > > > *uart, bool_t skip_amt, unsigned int idx)
> > > > > > > > > > > >                                  
> > > > > > > > > > > > pci_conf_read8(PCI_SBDF(0, b, d, f),
> > > > > > > > > > > >                                                 
> > > > > > > > > > > > PCI_INTERRUPT_LINE) : 0;
> > > > > > > > > > > > +                if (uart->irq >= nr_irqs)
> > > > > > > > > > > > +                    uart->irq = 0;
> > > > > > > > > > > 
> > > > > > > > > > > Don't you mean nr_irqs_gsi here? Also (nit) please add 
> > > > > > > > > > > the missing blanks
> > > > > > > > > > > immediately inside the parentheses.
> > > > > > > > > > 
> > > > > > > > > > If we use nr_irqs_gsi we will need to make the check x86 
> > > > > > > > > > only AFAICT.
> > > > > > > > > 
> > > > > > > > > Down the road (when Arm wants to select HAS_PCI) - yes. Not 
> > > > > > > > > necessarily
> > > > > > > > > right away. After all Arm wants to have an equivalent check 
> > > > > > > > > here then,
> > > > > > > > > not merely checking against nr_irqs instead. So putting a 
> > > > > > > > > conditional
> > > > > > > > > here right away would hide the need for putting in place an 
> > > > > > > > > Arm-specific
> > > > > > > > > alternative.
> > > > > > > > 
> > > > > > > > Oh, I always forget Arm doesn't have CONFIG_HAS_PCI enabled 
> > > > > > > > just yet.
> > > > > > > The PCI code in ns16550.c is gated by CONFIG_HAS_PCI and 
> > > > > > > CONFIG_X86. I am
> > > > > > > not sure we will ever see a support for PCI UART card in Xen on 
> > > > > > > Arm.
> > > > > > > 
> > > > > > > > However, if it ever happens then neither nr_irqs nor nr_irqs_gsi would help
> > > > > > > > here because from the interrupt controller PoV 0xff may be a valid
> > > > > > > > interrupt (GICv2 supports up to 1024 interrupts).
> > > > > > > > 
> > > > > > > > Is there any reason we can't explicitly check for 0xff?
> > > > > > 
> > > > > > That's what my v0.1 did, but Roger suggested nr_irqs. And I agree,
> > > > > > because the value is later used (on x86) to access irq_desc array 
> > > > > > (via
> > > > > > irq_to_desc), which has nr_irqs size.
> > > > > 
> > > > > I think it would be better if that check is closer to who access the
> > > > > irq_desc. This would be helpful for other users (I am sure this is 
> > > > > not the
> > > > > only potential place where the IRQ may be wrong). So how about moving 
> > > > > it in
> > > > > setup_irq()?
> > > > 
> > > > I don't like it, it's rather fragile approach (at least in the current
> > > > code base, without some refactor). There are a bunch of places using
> > > > uart->irq (even if just checking if its -1 or 0) before setup_irq()
> > > > call. This includes smp_intr_init(), which is what was the first thing
> > > > crashing with 0xff set there.
> > > 
> > > > Even if the code is gated with !CONFIG_X86, it sounds wrong to me to have
> > > > such a check in a UART driver. It only prevents us from doing an
> > > > out-of-bounds access. There is no guarantee the interrupt will be usable
> > > > (on Arm 256 is a valid interrupt).
> > 
> > It's a sanity check of a value we get from the hardware, I don't think
> > it's that strange.
> 
> I think it is strange because the behavior would be different between the
> architectures. On x86, we would reject the interrupt and poll. On Arm, we
> would accept the interrupt and the UART would be unusable.
> 
> > It's mostly similar to doing sanity checks of input
> > values we get from users.
> I am a bit concerned that we are using an unrelated check (see above
> why) to catch the "misconfiguration".
> 
> I think it would be good to understand why the interrupt line is 0xff and
> properly fix it. Is it a misconfiguration?  Is it intended to indicate "no
> IRQ"? Can we actually trust the value for the Intel LPSS?

Sorry, maybe this wasn't clear. My suggestion was not to just do this
fix and call it done, but rather to add this check for sanity and then
figure out how to properly handle this specific device.

So adding the check here is not a workaround in order to support Intel
LPSS, but rather a generic fix to ns16550 for an issue which happens
to be triggered by Intel LPSS. We would still need to figure out how to
handle that specific Interrupt Line value. I haven't looked at the docs, will do
on Monday hopefully.
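
To make the intent concrete, here is a minimal standalone sketch of the
kind of sanity check I mean. It is not the actual patch: the helper name
sanitize_uart_irq() is hypothetical, and nr_irqs is a local stand-in with
an arbitrary value (in Xen it is set at boot). The only assumption taken
from this thread is that an IRQ of 0 makes the driver fall back to
polling.

```c
/*
 * Stand-in for Xen's nr_irqs (the size of the irq_desc array on x86).
 * The value 48 is arbitrary and only used for illustration.
 */
static unsigned int nr_irqs = 48;

/*
 * Hypothetical helper: sanitize the value read from PCI_INTERRUPT_LINE.
 * Anything that cannot index an nr_irqs-sized array is replaced with 0,
 * which is assumed here to mean "no IRQ, use polling".
 */
static unsigned int sanitize_uart_irq(unsigned int line)
{
    if ( line >= nr_irqs )
        return 0;

    return line;
}
```

With the stand-in value above, a line of 0xff is rejected and the UART
would poll, while a line of 4 is kept as is. This only guards against the
out-of-bounds access; it says nothing about whether the remaining values
are actually usable on a given device.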

Thanks, Roger.
