> On 30-Jun-2023, at 8:13 AM, Akihiko Odaki <akihiko.od...@daynix.com> wrote:
>
> On 2023/06/29 23:18, Ani Sinha wrote:
>>> On 29-Jun-2023, at 2:19 PM, Akihiko Odaki <akihiko.od...@daynix.com> wrote:
>>>
>>> On 2023/06/29 17:05, Ani Sinha wrote:
>>>> On Thu, 29 Jun, 2023, 12:17 pm Akihiko Odaki, <akihiko.od...@daynix.com
>>>> <mailto:akihiko.od...@daynix.com>> wrote:
>>>> On 2023/06/29 13:07, Ani Sinha wrote:
>>>> > PCI Express ports only have one slot, so PCI Express devices can
>>>> only be
>>>> > plugged into slot 0 on a PCIE port. Enforce it.
>>>> >
>>>> > The change has been tested to not break ARI by instantiating
>>>> seven vfs on an
>>>> > emulated igb device (the maximum number of vfs the linux igb
>>>> driver supports).
>>>> > The vfs are seen to have non-zero device/slot numbers in the
>>>> conventional
>>>> > PCI BDF representation.
>>>> >
>>>> > CC: jus...@redhat.com <mailto:jus...@redhat.com>
>>>> > CC: imamm...@redhat.com <mailto:imamm...@redhat.com>
>>>> > CC: akihiko.od...@daynix.com <mailto:akihiko.od...@daynix.com>
>>>> >
>>>> > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
>>>> <https://bugzilla.redhat.com/show_bug.cgi?id=2128929>
>>>> > Signed-off-by: Ani Sinha <anisi...@redhat.com
>>>> <mailto:anisi...@redhat.com>>
>>>> > Reviewed-by: Julia Suvorova <jus...@redhat.com
>>>> <mailto:jus...@redhat.com>>
>>>> > ---
>>>> > hw/pci/pci.c | 15 +++++++++++++++
>>>> > 1 file changed, 15 insertions(+)
>>>> >
>>>> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> > index e2eb4c3b4a..0320ac2bb3 100644
>>>> > --- a/hw/pci/pci.c
>>>> > +++ b/hw/pci/pci.c
>>>> > @@ -65,6 +65,7 @@ bool pci_available = true;
>>>> > static char *pcibus_get_dev_path(DeviceState *dev);
>>>> > static char *pcibus_get_fw_dev_path(DeviceState *dev);
>>>> > static void pcibus_reset(BusState *qbus);
>>>> > +static bool pcie_has_upstream_port(PCIDevice *dev);
>>>> >
>>>> > static Property pci_props[] = {
>>>> > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
>>>> > @@ -1190,6 +1191,20 @@ static PCIDevice
>>>> *do_pci_register_device(PCIDevice *pci_dev,
>>>> > name);
>>>> >
>>>> > return NULL;
>>>> > + } /*
>>>> > + * With SRIOV and ARI, vfs can have non-zero slot in the
>>>> conventional
>>>> > + * PCI interpretation as all five bits reserved for slot
>>>> addresses are
>>>> > + * also used for function bits for the various vfs. Ignore
>>>> that case.
>>>> > + * It is too early here to check for ARI capabilities in
>>>> the PCI config
>>>> > + * space. Hence, we check for a vf device instead.
>>>> > + */
>>>> Why don't just perform this check after the capabilities are set?
>>>> We don't want to allocate resources for wrong device parameters. We want
>>>> to error out early. Other checks also are performed at the same place .
>>>
>>> It is indeed better to raise an error as early as possible so that we can
>>> avoid allocation and other operations that will be reverted and may go
>>> wrong due to the invalid condition. That should be the reason why other
>>> checks for the address are performed here.
>>>
>>> However, in this particular case, we cannot confidently perform the check
>>> here because it is unknown if the ARI capability will be advertised until
>>> the device realization code runs. This can justify delaying the check after
>>> the device realization, unlike the other checks.
>> Ok so are you proposing that the check we have right before (the check for
>> unoccupied function 0) be also moved? It also uses the same vf approximation
>> for seemingly to support ARI.
>
> No, I don't think the check for function 0 is required to be disabled because
> of the change of addressing caused by ARI, but it is required because SR-IOV
> VF can be added and removed while the PF (function 0) remains. I think this
> check should be performed also when SR-IOV is disabled and ARI is enabled.
OK I am a little confused with your explanation here. I understand the non-ARI
case - to detect the PF we need to have function 0 available. Looking at the
comment in pci_init_multifunction() it seems in ARI case, we can simply have a
vf in function 0 (conventional interpretation) as well. Hence we need to ignore
vfs both in ARI and non-ARI cases. This is what I understood.
>
> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of
> checking ARI capability, and that can happen in do_pci_register_device().
>
>> Also where do you propose we move the check?
>
> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM
> loading.
Hmm, I tried this. The issue here is something like this would be now allowed
since the PF has ARI capability:
-device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0
The above should not be allowed and when used, we do not see the igb ethernet
device from the guest OS.
> See the check for failover pair as an example. I guess it's also placed in
> this region because it needs capability information.
>
>>>
>>>> Show quoted text
>>>> Regards,
>>>> Akihiko Odaki
>>>> > + else if (!pci_is_vf(pci_dev) &&
>>>> > + pcie_has_upstream_port(pci_dev) &&
>>>> > + PCI_SLOT(devfn)) {
>>>> > + error_setg(errp, "PCI: slot %d is not valid for %s,"
>>>> > + " parent device only allows plugging into
>>>> slot 0.",
>>>> > + PCI_SLOT(devfn), name);
>>>> > + return NULL;
>>>> > }
>>>> >
>>>> > pci_dev->devfn = devfn;