On Mon, Dec 22, 2025 at 09:39:38AM +0100, Jan Beulich wrote:
> On 19.12.2025 09:39, Roger Pau Monné wrote:
> > On Thu, Dec 18, 2025 at 08:56:24AM +0100, Jan Beulich wrote:
> >> Legacy PCI devices don't have any extended config space. Reading any part
> >> thereof may very well return all ones. That then necessarily means we
> >> would think we found a "loop", when there simply is nothing.
> >>
> >> Fixes: a845b50c12f3 ("vpci/header: Emulate extended capability list for
> >> dom0")
> >> Signed-off-by: Jan Beulich <[email protected]>
> >
> > Acked-by: Roger Pau Monné <[email protected]>
> >
> > With the U suffix added to the constant, as noted by Stewart.
>
> Thanks, albeit I'm not quite convinced I actually should put it in. Imo ...
What about using ~0U instead of the longish 0xfff... hex literal?
Am I misremembering that we had a coding style rule to write hex
numbers all in uppercase letters? I don't seem to find it anywhere
> >> ---
> >> This is the minimalistic change to get rid of "overlap in extended cap
> >> list" warnings I'm observing. We may want to avoid any attempt to access
> >> extended config space when there is none - see Linux'es
> >> pci_cfg_space_size() and it helper pci_cfg_space_size_ext(). This would
> >> then also avoid us interpreting as an extended cap list what isn't one at
> >> all (some legacy PCI devices don't decode register address bits 9-11, some
> >> return other non-0, non-all-ones data). Including the risk of reading a
> >> register with read side effects. Thoughts?
> >
> > I think that's likely too much - for the hardware domain we want to
> > allow the domain to access all the PCI config space, regardless of
> > Xen's thinking there's nothing there.
>
> ... we really need to do better here, irrespective of this intended behavior
> for hwdom. Us accessing the supposed extended capabilities list is already a
> mistake when there's no extended config space. Us then calling
> vpci_add_register() to "pin down" the value read is wrong too in that case.
Hm, yes, it would be better for Xen to use a logic similar to Linux's
helpers to find the config space size. This would need extending to
pci_find_next_ext_capability(), as Xen does read the extended space
without any checks there also.
> Question here is whether even with that fixed the check being added here
> would make sense to keep. In that case putting it in now and then doing the
> other re-work would likely be the right thing to do.
Yes, it probably wants to be there in addition to the extended checks
for extended space presence. If we have a pre-check that ensures the
extended space is available, Xen should return an error also if
reading from PCI_CFG_SPACE_SIZE returns ~0, as in that case the device
must handle at least that access and return 0 to signal no extended
capabilities.
> >> The DomU part of the function worries me as well. Rather than making it
> >> "read 0, write ignore" for just the first 32 bits, shouldn't we make it so
> >> for the entire extended config space, and shouldn't we also make it "read
> >> all ones, write ignore" when there is no extended config space in the
> >> first place (then in particular also for the first 32 bits)?
> >
> > If there's no explicitly handler added, the behavior for domU will
> > already be to drop writes, and return reads as all 1s, which is fine
> > for the rest of the extended config space? We just need to return 0
> > for the first 32bits to avoid seeming to have extended capability
> > support.
> >
> > Maybe we want to keep the same behavior as expected from native for
> > legacy devices and just return all 1s consistency for the extended
> > space?
> >
> > Hence we don't need to special case this region, as it's already
> > covered by how unhandled accesses are resolved for domUs.
> >
> > Or is there something else I'm missing?
>
> Imo correct behavior would be to return 0 for the first 32 bits when there
> is extended config space, and ~0 when there isn't.
I see, I was missing your point that there are older devices that
might not implement the extended space at all, and for those the
natural value to return would be ~0 instead of 0.
Xen could read the value the device returns for PCI_CFG_SPACE_SIZE
accesses, and whether it's ~0 or something else in order to device
whether the dummy handler added in vpci_init_ext_capability_list()
should return 0 or ~0, but that also risks possible side-effects if
the device aliases accesses >= PCI_CFG_SPACE_SIZE to registers in the
legacy config space region.
Thanks,