On Fri, 6 Sept 2024 at 00:19, David Stevens <steve...@chromium.org> wrote:
>
> On Fri, Sep 6, 2024 at 12:56 AM Stefan Hajnoczi <stefa...@redhat.com> wrote:
> >
> > On Tue, Jul 16, 2024 at 10:21:35AM +0900, David Stevens wrote:
> > > On Fri, Jul 12, 2024 at 2:47 PM Michael S. Tsirkin <m...@redhat.com> 
> > > wrote:
> > > >
> > > > On Fri, Jul 12, 2024 at 11:06:49AM +0900, David Stevens wrote:
> > > > > On Thu, Jul 11, 2024 at 7:56 PM Alyssa Ross <h...@alyssa.is> wrote:
> > > > > >
> > > > > > Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in
> > > > > > crosvm a couple of years ago.
> > > > > >
> > > > > > David, I'd be particularly interested for your thoughts on the 
> > > > > > MEM_READ
> > > > > > and MEM_WRITE commands, since as far as I know crosvm doesn't 
> > > > > > implement
> > > > > > anything like that.  The discussion leading to those being added 
> > > > > > starts
> > > > > > here:
> > > > > >
> > > > > > https://lore.kernel.org/qemu-devel/20240604185416.gb90...@fedora.redhat.com/
> > > > > >
> > > > > > It would be great if this could be standardised between QEMU and 
> > > > > > crosvm
> > > > > > (and therefore have a clearer path toward being implemented in 
> > > > > > other VMMs)!
> > > > >
> > > > > Setting aside vhost-user for a moment, the DAX example given by Stefan
> > > > > won't work in crosvm today.
> > > > >
> > > > > Is universal access to virtio shared memory regions actually mandated
> > > > > by the virtio spec? Copying from virtiofs DAX to virtiofs sharing
> > > > > seems reasonable enough, but what about virtio-pmem to virtio-blk?
> > > > > What about screenshotting a framebuffer in virtio-gpu shared memory to
> > > > > virtio-scsi? I guess with some plumbing in the VMM, it's solvable in a
> > > > > virtualized environment. But what about when you have real hardware
> > > > > that speaks virtio involved? That's outside my wheelhouse, but it
> > > > > doesn't seem like that would be easy to solve.
> > > >
> > > > Yes, it can work for physical devices if allowed by host configuration.
> > > > E.g. VFIO supports that I think. Don't think VDPA does.
> > >
> > > I'm sure it can work, but that sounds more like a SHOULD (MAY?),
> > > rather than a MUST.
> > >
> > > > > For what it's worth, my interpretation of the target scenario:
> > > > >
> > > > > > Other backends don't see these mappings. If the guest submits a 
> > > > > > vring
> > > > > > descriptor referencing a mapping to another backend, then that 
> > > > > > backend
> > > > > > won't be able to access this memory
> > > > >
> > > > > is that it's omitting how the implementation is reconciled with
> > > > > section 2.10.1 of v1.3 of the virtio spec, which states that:
> > > > >
> > > > > > References into shared memory regions are represented as offsets 
> > > > > > from
> > > > > > the beginning of the region instead of absolute memory addresses. 
> > > > > > Offsets
> > > > > > are used both for references between structures stored within shared
> > > > > > memory and for requests placed in virtqueues that refer to shared 
> > > > > > memory.
> > > > >
> > > > > My interpretation of that statement is that putting raw guest physical
> > > > > addresses corresponding to virtio shared memory regions into a vring
> > > > > is a driver spec violation.
> > > > >
> > > > > -David
> > > >
> > > > This really applies within device I think. Should be clarified ...
> > >
> > > You mean that a virtio device can use absolute memory addresses for
> > > other devices' shared memory regions, but it can't use absolute memory
> > > addresses for its own shared memory regions? That's a rather strange
> > > requirement. Or is the statement simply giving an addressing strategy
> > > that device type specifications are free to ignore?
> >
> > My recollection of the intent behind the quoted section is:
> >
> > 1. Structures in shared memory that point to shared memory must use
> >    relative offsets instead of absolute physical addresses.
> > 2. Virtqueue requests that refer to shared memory (e.g. map this page
> >    from virtiofs file to this location in shared memory) must use
> >    relative offsets instead of absolute physical addresses.
> >
> > In other words, shared memory must be relocatable. Don't assume Shared
> > Memory Regions have an absolute guest physical address. This makes
> > device implementations independent of the guest physical memory layout
> > and might also help when Shared Memory Regions are exposed to guest
> > user-space where the guest physical memory layout isn't known.
>
> Doesn't this discussion contradict the necessity of point 1? If I'm
> understanding things correctly, it is valid for virtio device A to
> refer to a structure in virtio device B's shared memory region by
> absolute guest physical address. At least there is nothing in the spec
> about resolving shmid values among different virtio devices, so
> absolute guest physical addresses are the only way this sharing can be
> done. And if it's valid for a pointer to a structure in a shared
> memory region to exist, it's not clear to me why you can't have
> pointers within a shared memory region.

The reason is that VIRTIO has a layered design where the transport and
vring layout deal with bus addresses but device types generally do not
(except for specific exceptions like memory ballooning, etc).

A device's virtqueue requests do not contain addresses (e.g. struct
virtio_net_hdr). The virtqueue interface hides the details of memory
organization and access. In theory a transport could be implemented
over a medium that doesn't even offer shared memory (I think people
have played with remote VIRTIO over TCP) and this is possible because
device types don't know how virtqueue elements are represented.
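
For instance, the virtio-net request header consists entirely of flags,
lengths, and offsets relative to the packet itself; no field carries a
bus or guest physical address (this sketch follows the VIRTIO 1.x
layout, shown here just for illustration):

```c
#include <stdint.h>

/* virtio-net per-packet header (VIRTIO 1.x layout). The buffers
 * themselves are described by virtqueue descriptors at the transport
 * layer; the device-type struct only contains packet-relative fields. */
struct virtio_net_hdr {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;     /* length of the packet headers */
    uint16_t gso_size;
    uint16_t csum_start;  /* offset within the packet, not an address */
    uint16_t csum_offset; /* offset from csum_start to store checksum */
    uint16_t num_buffers; /* merged rx buffer count (VIRTIO 1.0+) */
};
```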

This same design constraint extends to VIRTIO Shared Memory Regions
because a VIRTIO Shared Memory Region's contents are defined by the
device type, just like virtqueue requests. As I mentioned, this avoids
address translation in device type implementations and also makes it
easy to expose VIRTIO Shared Memory Regions to guest userspace.
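
virtiofs DAX is a concrete example: the FUSE_SETUPMAPPING request that
establishes a mapping in the DAX window describes both the source and
the destination purely in terms of offsets (struct reproduced from the
FUSE protocol headers for illustration):

```c
#include <stdint.h>

/* virtiofs DAX mapping request (FUSE_SETUPMAPPING). The destination
 * is an offset into the Shared Memory Region (the DAX window), not a
 * guest physical address, so the device implementation never needs to
 * know where the region lives in guest memory. */
struct fuse_setupmapping_in {
    uint64_t fh;      /* already open file handle */
    uint64_t foffset; /* offset into the file */
    uint64_t len;     /* length of the mapping */
    uint64_t flags;   /* e.g. want write access */
    uint64_t moffset; /* offset into the DAX window */
};
```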

(Similarly, putting addresses into the VIRTIO Configuration Space is
also problematic because it exposes details of memory to the device
type. They should be hidden by the VIRTIO transport.)

That explains the intention within the VIRTIO world. The question you
raised was why you're allowed to then pass the address of a VIRTIO
Shared Memory Region to another device instead of passing a <shmid,
offset> pair. The answer is because DMA is beyond the scope of the
VIRTIO spec. If the architecture allows you to expose a buffer that
happens to be located in a VIRTIO Shared Memory Region to another
device, then it's possible to pass that address. The other device may
not even be a VIRTIO device. It just performs a DMA transaction to
read/write that memory. This is happening at another layer and it's a
valid thing to do.

So the answer is that in terms of designing VIRTIO device types,
VIRTIO Shared Memory Region structure layouts or virtqueue request
structs referring to VIRTIO Shared Memory Regions must not use
addresses. But you may be able to pass the address of a VIRTIO Shared
Memory Region to another device for DMA. They don't conflict because
they are at different levels.
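
To sketch the two levels side by side (these structs are hypothetical,
purely to illustrate the distinction):

```c
#include <stdint.h>

/* Device-type level: a (hypothetical) virtqueue request field that
 * refers into a VIRTIO Shared Memory Region uses <shmid, offset>,
 * never an address. */
struct vq_shmem_ref {
    uint8_t  shmid;  /* which Shared Memory Region of this device */
    uint64_t offset; /* offset from the start of that region */
    uint64_t len;
};

/* DMA level: a (hypothetical) descriptor of some other device, VIRTIO
 * or not, may carry a bus address that happens to fall inside a Shared
 * Memory Region. That is outside the scope of the VIRTIO spec and is
 * resolved by the platform/IOMMU, so the two don't conflict. */
struct dma_desc {
    uint64_t bus_addr;
    uint32_t len;
};
```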

> It definitely makes sense that setting up a mapping should be done
> with offsets. But unless a shared memory region can be dynamically
> reallocated at runtime, then it doesn't seem necessary to ban pointers
> within a shared memory region.

On a PCI device the BAR containing the VIRTIO Shared Memory Region can
be remapped at runtime, so the shared memory region can move.

The reason is the same as for why device types don't use addresses
inside virtqueue request structures: the details of memory are hidden
by the VIRTIO transport and the device type doesn't deal in addresses.

Also, it makes mmapping a VIRTIO Shared Memory Region difficult in
guest userspace because some kind of address translation mechanism
(i.e. an IOMMU) then becomes necessary so that the userspace
application (which doesn't know about physical memory addresses) and
the device implementation can agree on what a reference means. Just
using offsets avoids this problem.
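
In other words, an offset-based reference is resolvable by a userspace
process with nothing more than the base pointer returned by mmap
(hypothetical helper, just to make the point concrete):

```c
#include <stdint.h>

/* A guest userspace process that mmaps a Shared Memory Region only
 * knows the mapping's base pointer, not the region's guest physical
 * address. An <offset> reference resolves with pointer arithmetic;
 * an absolute address would require a translation mechanism. */
static inline void *shm_ref_to_ptr(void *mmap_base, uint64_t offset)
{
    return (uint8_t *)mmap_base + offset;
}
```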

Stefan
