On Thu, Sep 24, 2020 at 09:21:32AM +0100, Stefan Hajnoczi wrote:
> On Tue, Sep 15, 2020 at 07:29:17AM -0700, Thanos Makatos wrote:
> > This patch introduces the vfio-user protocol specification (formerly
> > known as VFIO-over-socket), which is designed to allow devices to be
> > emulated outside QEMU, in a separate process. vfio-user reuses the
> > existing VFIO defines, structs and concepts.
> >
> > It has been earlier discussed as an RFC in:
> > "RFC: use VFIO over a UNIX domain socket to implement device offloading"
> >
> > Signed-off-by: John G Johnson <john.g.john...@oracle.com>
> > Signed-off-by: Thanos Makatos <thanos.maka...@nutanix.com>
>
> The approach looks promising. It's hard to know what changes will be
> required when this is implemented, so let's not worry about getting
> every detail of the spec right.
>
> Now that there is a spec to start from, the next step is patches
> implementing --device vfio-user-pci,chardev=<chardev> in
> hw/vfio-user/pci.c (mirroring hw/vfio/).
>
> It should be accompanied by a test in tests/. PCI-level testing APIS for
> BARs, configuration space, interrupts, etc are available in
> tests/qtest/libqos/pci.h. The test case needs to include a vfio-user
> device backend interact with QEMU's vfio-user-pci implementation.
>
> I think this spec can be merged in docs/devel/ now and marked as
> "subject to change (not a stable public interface)".
>
> After the details have been proven and any necessary changes have been
> made the spec can be promoted to docs/interop/ as a stable public
> interface. This gives the freedom to make changes discovered when
> figuring out issues like disconnect/reconnect, live migration, etc that
> can be hard to get right without a working implementation.
>
> Does this approach sound good?
>
> Also please let us know who is working on what so additional people can
> get involved in areas that need work!
>
> Stefan

The problem we discovered with e.g. vhost is that once you ship a
management interface, people start using it immediately, and it does not
matter that you never promised stability.

So I feel a good first step would be to limit this to known in-tree
devices only, started/destroyed automatically by QEMU when the device is
created. This way lots of reconnect etc. issues go away, and we don't
commit to a stable protocol until we have a decent handle on how things
will work in production.

--
MST