On Mon, Dec 11, 2023 at 4:29 PM Akihiko Odaki <akihiko.od...@daynix.com> wrote:
>
> On 2023/12/11 16:26, Jason Wang wrote:
> > On Mon, Dec 11, 2023 at 1:30 PM Akihiko Odaki <akihiko.od...@daynix.com>
> > wrote:
> >>
> >> On 2023/12/11 11:52, Jason Wang wrote:
> >>> On Sun, Dec 10, 2023 at 12:06 PM Akihiko Odaki <akihiko.od...@daynix.com>
> >>> wrote:
> >>>>
> >>>> Introduction
> >>>> ------------
> >>>>
> >>>> This series is based on the RFC series submitted by Yui Washizu[1].
> >>>> See also [2] for the context.
> >>>>
> >>>> This series enables SR-IOV emulation for virtio-net. It is useful
> >>>> to test SR-IOV support on the guest, or to expose several vDPA devices
> >>>> in a VM. vDPA devices can also provide L2 switching feature for
> >>>> offloading though it is out of scope to allow the guest to configure
> >>>> such a feature.
> >>>>
> >>>> The PF side code resides in virtio-pci. The VF side code resides in
> >>>> the PCI common infrastructure, but it is restricted to work only for
> >>>> virtio-net-pci because of lack of validation.
> >>>>
> >>>> User Interface
> >>>> --------------
> >>>>
> >>>> A user can configure a SR-IOV capable virtio-net device by adding
> >>>> virtio-net-pci functions to a bus. Below is a command line example:
> >>>> -netdev user,id=n -netdev user,id=o
> >>>> -netdev user,id=p -netdev user,id=q
> >>>> -device pcie-root-port,id=b
> >>>> -device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f
> >>>> -device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f
> >>>> -device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f
> >>>> -device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f
> >>>>
> >>>> The VFs specify the paired PF with "sriov-pf" property. The PF must be
> >>>> added after all VFs. It is user's responsibility to ensure that VFs have
> >>>> function numbers larger than one of the PF, and the function numbers
> >>>> have a consistent stride.
> >>>
> >>> This seems not user friendly. Any reason we can't just allow user to
> >>> specify the stride here?
> >>
> >> It should be possible to assign addr automatically without requiring
> >> user to specify the stride. I'll try that in the next version.
> >>
> >>>
> >>> Btw, I vaguely remember qemu allows the params to be accepted as a
> >>> list. If this is true, we can accept a list of netdev here?
> >>
> >> Yes, rocker does that. But the problem is not just about getting
> >> parameters needed for VFs, which I forgot to mention in the cover letter
> >> and will explain below.
> >>
> >>>
> >>>>
> >>>> Keeping VF instances
> >>>> --------------------
> >>>>
> >>>> A problem with SR-IOV emulation is that it needs to hotplug the VFs as
> >>>> the guest requests. Previously, this behavior was implemented by
> >>>> realizing and unrealizing VFs at runtime. However, this strategy does
> >>>> not work well for the proposed virtio-net emulation; in this proposal,
> >>>> device options passed in the command line must be maintained as VFs
> >>>> are hotplugged, but they are consumed when the machine starts and not
> >>>> available after that, which makes realizing VFs at runtime impossible.
> >>>
> >>> Could we store the device options in the PF?
> >>
> >> I wrote it's to store the device options, but the problem is actually
> >> more about realizing VFs at runtime instead of at the initialization time.
> >>
> >> Realizing VFs at runtime have two major problems. One is that it delays
> >> the validations of options; invalid options will be noticed when the
> >> guest requests to realize VFs.
> >
> > If PCI spec allows the failure when creating VF, then it should not be
> > a problem.
>
> I doubt the spec cares such a failure at all. VF enablement should
> always work for a real hardware. It's neither user-friendly to tell
> configuration errors at runtime.

I'm not sure which options we should care about? Did you mean netdev
options or the virtio-net specific ones? If VF stick to the same
options as PF (except for the SRIOV), it should be validated during
the PF initialization.

>
> >
> >> netdevs also warn that they are not used
> >> at initialization time, not knowing that they will be used by VFs later.
> >
> > We could invent things to calm down this false positive.
> >
> >> References to other QEMU objects in the option may also die before VFs
> >> are realized.
> >
> > Is there any other thing than netdev we need to consider?
>
> You will also want to set a distinct mac for each VF. Other properties
> does not matter much in my opinion.

Qemu doesn't check mac duplication now. So it's up to the mgmt layer.

>
> >
> >>
> >> The other problem is that QEMU cannot interact with the unrealized VFs.
> >> For example, if you type "device_add virtio-net-pci,id=vf,sriov-pf=pf"
> >> in HMP, you will expect "device_del vf" works, but it's hard to
> >> implement such behaviors with unrealized VFs.
> >
> > I think hotplug can only be done at PF level if we do that.
>
> Assuming you mean to let netdev and mac accept arrays, yes.
>
> >
> >>
> >> I was first going to compromise and allow such quirky behaviors, but I
> >> realized such a compromise is unnecessary if we reuse the PCI power down
> >> logic so I wrote v2.
> >
> > Haven't checked the code, but anything related to the PM here?
>
> You mean power management? We don't have to care about PCI power down
> for VFs because powering down a SR-IOV PCI device will reset it and thus
> disable its VFs.

Ok.

Thanks

>
> Regards,
> Akihiko Odaki
>
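
For readers following the stride discussion: the cover letter requires that VF function numbers be larger than the PF's and keep a consistent stride. A minimal Python sketch of that check is below. The helper name vf_offset_and_stride is hypothetical and is not part of QEMU or of the series; it only restates the rule quoted above and derives the offset of the first VF and the stride from the function numbers a user picked.

```python
def vf_offset_and_stride(pf_fn, vf_fns):
    """Return (first_vf_offset, vf_stride) for the given function numbers.

    Hypothetical helper, not QEMU code. Raises ValueError if a VF does not
    sit above the PF or if the spacing between VFs is not constant.
    """
    if not vf_fns:
        raise ValueError("at least one VF is required")

    ordered = sorted(vf_fns)
    if ordered[0] <= pf_fn:
        raise ValueError("VF function numbers must be larger than the PF's")

    # Distance from the PF to the first VF.
    offset = ordered[0] - pf_fn

    if len(ordered) == 1:
        # With a single VF the stride is unconstrained; this sketch reports 0.
        return offset, 0

    # Every gap between consecutive VFs must be the same positive value.
    strides = {b - a for a, b in zip(ordered, ordered[1:])}
    if len(strides) != 1:
        raise ValueError("VF function numbers do not have a consistent stride")
    stride = strides.pop()
    if stride <= 0:
        raise ValueError("duplicate VF function numbers")

    return offset, stride


# Function numbers from the command line example: PF at 0x0, VFs at 0x1..0x3.
print(vf_offset_and_stride(0, [3, 2, 1]))  # (1, 1)
```

The order in which the VFs are listed does not matter here, which matches the example where the VFs are added before the PF in descending function order.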