On Fri, May 01, 2020 at 04:28:25PM +0100, Daniel P. Berrangé wrote:
> On Fri, May 01, 2020 at 03:01:01PM +0000, Felipe Franciosi wrote:
> > Hi,
> > 
> > > On Apr 30, 2020, at 4:20 PM, Thanos Makatos <thanos.maka...@nutanix.com>
> > > wrote:
> > > 
> > >>>> More importantly, considering:
> > >>>> a) Marc-André's comments about data alignment etc., and
> > >>>> b) the possibility to run the server on another guest or host,
> > >>>> we won't be able to use native VFIO types. If we do want to support
> > >>>> that then we'll have to redefine all data formats, similar to
> > >>>> https://github.com/qemu/qemu/blob/master/docs/interop/vhost-user.rst .
> > >>>> 
> > >>>> So the protocol will be more like an enhanced version of the
> > >>>> Vhost-user protocol than VFIO. I'm fine with either direction
> > >>>> (VFIO vs. enhanced Vhost-user), so we need to decide before
> > >>>> proceeding as the request format is substantially different.
> > >>> 
> > >>> Regarding the ability to use the protocol on non-AF_UNIX sockets, we
> > >>> can support this future use case without unnecessarily complicating
> > >>> the protocol by defining the C structs and stating that data
> > >>> alignment and endianness for the non AF_UNIX case must be the one
> > >>> used by GCC on a x86_64 bit machine, or can be overridden as
> > >>> required.
> > >> 
> > >> Defining it to be x86_64 semantics is effectively saying "we're not
> > >> going to do anything and it is up to other arch maintainers to fix
> > >> the inevitable portability problems that arise".
> > > 
> > > Pretty much.
> > > 
> > >> Since this is a new protocol should we take the opportunity to model
> > >> it explicitly in some common standard RPC protocol language. This
> > >> would have the benefit of allowing implementors to use off the shelf
> > >> APIs for their wire protocol marshalling, and eliminate questions
> > >> about endianness and alignment across architectures.
> > > 
> > > The problem is that we haven't defined the scope very well. My initial
> > > impression was that we should use the existing VFIO structs and
> > > constants, however that's impossible if we're to support non AF_UNIX.
> > > We need consensus on this, we're open to ideas how to do this.
> > 
> > Thanos has a point.
> > 
> > From https://wiki.qemu.org/Features/MultiProcessQEMU, which I believe
> > was written by Stefan, I read:
> > 
> > > Inventing a new device emulation protocol from scratch has many
> > > disadvantages. VFIO could be used as the protocol to avoid reinventing
> > > the wheel ...
> > 
> > At the same time, this appears to be incompatible with the (new?)
> > requirement of supporting device emulation which may run in non-VFIO
> > compliant OSs or even across OSs (ie. via TCP or similar).
> 
> To be clear, I don't have any opinion on whether we need to support
> cross-OS/TCP or not.
> 
> I'm merely saying that if we do decide to support cross-OS/TCP, then
> I think we need a more explicitly modelled protocol, instead of relying
> on serialization of C structs.
> 
> There could be benefits to an explicitly modelled protocol, even for
> local only usage, if we want to more easily support non-C languages
> doing serialization, but again I don't have a strong opinion on whether
> that's necessary to worry about or not.
> 
> So I guess largely the question boils down to setting the scope of
> what we want to be able to achieve in terms of RPC endpoints.
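
For illustration only, here is a rough sketch of the "C structs with a
defined wire layout" option under discussion. The struct name, fields
and helper are hypothetical; the point is that widths, packing and
endianness are stated explicitly and converted in code, rather than
being whatever GCC happens to produce on x86_64. An IDL-based approach
would instead generate equivalent marshalling from a schema.

/* Hypothetical fixed-layout message header: packed, fixed-width fields,
 * always little-endian on the wire regardless of host architecture. */
#include <stdint.h>
#include <endian.h>     /* htole16()/htole32(), glibc */

struct vfio_user_msg_hdr {      /* name is illustrative only */
    uint16_t msg_id;            /* matches replies to requests */
    uint16_t command;           /* e.g. region access, DMA map/unmap */
    uint32_t size;              /* total message size, header included */
    uint32_t flags;             /* request/reply, error, ... */
    uint32_t error;             /* errno-style value on failure */
} __attribute__((packed));

static inline void msg_hdr_to_wire(struct vfio_user_msg_hdr *h)
{
    h->msg_id  = htole16(h->msg_id);
    h->command = htole16(h->command);
    h->size    = htole32(h->size);
    h->flags   = htole32(h->flags);
    h->error   = htole32(h->error);
}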
The protocol relies on both file descriptor passing and memory mapping.
These are hard to achieve with networking. I think the closest would be
using RDMA to accelerate memory access and switching to a network
notification mechanism instead of eventfd. Sooner or later someone will
probably try this. I don't think it makes sense to define this
transport in detail now if there are no users, but we should try to
make it possible to add it in the future, if necessary.

Another use case that is interesting and not yet directly addressed is:
how can another VM play the role of the device? This is important in
compute cloud environments where everything is a VM and running a
process on the host is not possible. The virtio-vhost-user prototype
showed that it's possible to add this on top of an existing vhost-user
style protocol by terminating the connection in the device VMM and then
communicating with the device using a new VIRTIO device. Maybe that's
the way to do it here too and we don't need to worry about explicitly
designing that into the vfio-user protocol, but if anyone has other
approaches in mind then let's discuss them now.

Finally, I think the goal of integrating this new protocol into the
existing vfio component of VMMs is a good idea. Sticking closely to the
<linux/vfio.h> interface will help in this regard. The further away we
get, the harder it will be to fit it into the vfio code in existing
VMMs and the harder it will be for users to configure the VMM along the
lines of how vfio works today.

Stefan
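
For concreteness, a rough sketch of the file descriptor passing plus
memory mapping mentioned above, i.e. the part that has no direct TCP
equivalent. The message layout and function name are made up for the
example; only the AF_UNIX/SCM_RIGHTS and mmap() mechanics are real.

#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/mman.h>

struct dma_map_msg {            /* hypothetical payload */
    uint64_t guest_addr;        /* guest physical address */
    uint64_t size;              /* length of the region */
    uint64_t offset;            /* offset into the passed fd */
};

/* Send a memory region descriptor plus the backing fd in one message. */
static int send_region_fd(int sock, const struct dma_map_msg *msg, int mem_fd)
{
    char cbuf[CMSG_SPACE(sizeof(int))] = { 0 };
    struct iovec iov = {
        .iov_base = (void *)msg,
        .iov_len  = sizeof(*msg),
    };
    struct msghdr mh = {
        .msg_iov        = &iov,
        .msg_iovlen     = 1,
        .msg_control    = cbuf,
        .msg_controllen = sizeof(cbuf),
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&mh);

    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type  = SCM_RIGHTS;
    cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &mem_fd, sizeof(int));

    return sendmsg(sock, &mh, 0) == sizeof(*msg) ? 0 : -1;
}

/* The device process recvmsg()s the fd and maps the region directly:
 *
 *     void *ram = mmap(NULL, msg.size, PROT_READ | PROT_WRITE,
 *                      MAP_SHARED, received_fd, msg.offset);
 */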
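
And a rough sketch, under the same caveats, of what sticking closely to
the <linux/vfio.h> interface could look like on the wire: the wrapper
struct and command are hypothetical, while the payload is the unmodified
kernel definition.

#include <stdint.h>
#include <linux/vfio.h>

struct vfio_user_msg_hdr {              /* hypothetical, as in the earlier sketch */
    uint16_t msg_id;
    uint16_t command;                   /* e.g. a GET_REGION_INFO command number */
    uint32_t size;
    uint32_t flags;
    uint32_t error;
} __attribute__((packed));

struct vfio_user_region_info_msg {      /* hypothetical wrapper */
    struct vfio_user_msg_hdr hdr;
    struct vfio_region_info  info;      /* taken as-is from <linux/vfio.h> */
};

/* The client fills in info.argsz and info.index just as it would for the
 * VFIO_DEVICE_GET_REGION_INFO ioctl; the server returns size, offset and
 * flags. The further the wire format drifts from the kernel structs, the
 * more translation glue each VMM needs between its existing vfio code and
 * the socket protocol. */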