On Mon, Dec 11, 2023 at 4:29 PM Akihiko Odaki <akihiko.od...@daynix.com> wrote:
>
> On 2023/12/11 16:26, Jason Wang wrote:
> > On Mon, Dec 11, 2023 at 1:30 PM Akihiko Odaki <akihiko.od...@daynix.com> wrote:
> >>
> >> On 2023/12/11 11:52, Jason Wang wrote:
> >>> On Sun, Dec 10, 2023 at 12:06 PM Akihiko Odaki <akihiko.od...@daynix.com> wrote:
> >>>>
> >>>> Introduction
> >>>> ------------
> >>>>
> >>>> This series is based on the RFC series submitted by Yui Washizu[1].
> >>>> See also [2] for the context.
> >>>>
> >>>> This series enables SR-IOV emulation for virtio-net. It is useful
> >>>> for testing SR-IOV support in the guest, or for exposing several vDPA
> >>>> devices in a VM. vDPA devices can also provide an L2 switching feature
> >>>> for offloading, though allowing the guest to configure such a feature
> >>>> is out of scope.
> >>>>
> >>>> The PF side code resides in virtio-pci. The VF side code resides in
> >>>> the PCI common infrastructure, but it is restricted to work only for
> >>>> virtio-net-pci because of lack of validation.
> >>>>
> >>>> User Interface
> >>>> --------------
> >>>>
> >>>> A user can configure an SR-IOV capable virtio-net device by adding
> >>>> virtio-net-pci functions to a bus. Below is a command line example:
> >>>>     -netdev user,id=n -netdev user,id=o
> >>>>     -netdev user,id=p -netdev user,id=q
> >>>>     -device pcie-root-port,id=b
> >>>>     -device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f
> >>>>     -device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f
> >>>>     -device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f
> >>>>     -device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f
> >>>>
> >>>> The VFs specify the paired PF with the "sriov-pf" property. The PF must
> >>>> be added after all VFs. It is the user's responsibility to ensure that
> >>>> the VFs have function numbers larger than that of the PF, and that the
> >>>> function numbers have a consistent stride.
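
To make the function-number rule above concrete, here is a minimal sketch of the check the user currently has to do by hand. It is not taken from the series; `vf_stride` and its arguments are hypothetical names, and treating a single VF as having stride 1 is an assumption made only for this illustration.

```python
def vf_stride(pf_fn, vf_fns):
    """Return the common stride of the VF function numbers.

    Hypothetical helper, not QEMU code. Raises ValueError if a VF does
    not come after the PF or if the spacing between VFs is not uniform.
    """
    if not vf_fns:
        raise ValueError("no VF function numbers given")

    fns = sorted(vf_fns)
    if fns[0] <= pf_fn:
        raise ValueError("VF function numbers must be larger than the PF's")

    if len(fns) == 1:
        # Assumption: with a single VF any stride is consistent; report 1.
        return 1

    stride = fns[1] - fns[0]
    if stride == 0:
        raise ValueError("duplicate VF function number")

    # Every pair of neighbouring VFs must be separated by the same stride.
    for prev, cur in zip(fns, fns[1:]):
        if cur - prev != stride:
            raise ValueError("inconsistent stride between VFs")
    return stride
```

For the command line quoted above (PF at function 0, VFs at functions 1 to 3), `vf_stride(0, [3, 2, 1])` would accept the layout with a stride of 1.
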
> >>>
> >>> This does not seem user-friendly. Is there any reason we can't just
> >>> allow the user to specify the stride here?
> >>
> >> It should be possible to assign addr automatically without requiring
> >> the user to specify the stride. I'll try that in the next version.
> >>
> >>>
> >>> Btw, I vaguely remember that QEMU allows params to be accepted as a
> >>> list. If this is true, could we accept a list of netdevs here?
> >>
> >> Yes, rocker does that. But the problem is not just about getting
> >> parameters needed for VFs, which I forgot to mention in the cover letter
> >> and will explain below.
> >>
> >>>
> >>>>
> >>>> Keeping VF instances
> >>>> --------------------
> >>>>
> >>>> A problem with SR-IOV emulation is that it needs to hotplug the VFs as
> >>>> the guest requests. Previously, this behavior was implemented by
> >>>> realizing and unrealizing VFs at runtime. However, this strategy does
> >>>> not work well for the proposed virtio-net emulation; in this proposal,
> >>>> device options passed in the command line must be maintained as VFs
> >>>> are hotplugged, but they are consumed when the machine starts and not
> >>>> available after that, which makes realizing VFs at runtime impossible.
> >>>
> >>> Could we store the device options in the PF?
> >>
> >> I wrote that the problem is about storing the device options, but it is
> >> actually more about realizing VFs at runtime instead of at
> >> initialization time.
> >>
> >> Realizing VFs at runtime has two major problems. One is that it delays
> >> the validation of options; invalid options will be noticed only when
> >> the guest requests to realize VFs.
> >
> > If the PCI spec allows VF creation to fail, then it should not be
> > a problem.
>
> I doubt the spec cares about such a failure at all. VF enablement should
> always work on real hardware. It's also not user-friendly to report
> configuration errors at runtime.

I'm not sure which options we should care about. Did you mean the netdev
options or the virtio-net specific ones?

If the VFs stick to the same options as the PF (except for SR-IOV), they
should be validated during the PF initialization.

>
> >
> >> netdevs also warn that they are not used
> >> at initialization time, not knowing that they will be used by VFs later.
> >
> > We could invent something to suppress this false positive.
> >
> >> References to other QEMU objects in the option may also die before VFs
> >> are realized.
> >
> > Is there anything other than netdev that we need to consider?
>
> You will also want to set a distinct MAC address for each VF. Other
> properties do not matter much in my opinion.

QEMU doesn't check for MAC address duplication now, so it's up to the
management layer.

>
> >
> >>
> >> The other problem is that QEMU cannot interact with the unrealized VFs.
> >> For example, if you type "device_add virtio-net-pci,id=vf,sriov-pf=pf"
> >> in HMP, you would expect "device_del vf" to work, but it's hard to
> >> implement such behavior with unrealized VFs.
> >
> > I think hotplug can only be done at PF level if we do that.
>
> Assuming you mean to let netdev and mac accept arrays, yes.
>
> >
> >>
> >> I was first going to compromise and allow such quirky behaviors, but I
> >> realized such a compromise is unnecessary if we reuse the PCI power down
> >> logic so I wrote v2.
> >
> > I haven't checked the code, but is anything here related to PM?
>
> You mean power management? We don't have to care about PCI power down
> for VFs because powering down an SR-IOV PCI device will reset it and thus
> disable its VFs.

Ok.

Thanks

>
> Regards,
> Akihiko Odaki
>

