Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Fri, 6 Sept 2024 at 03:06, Albert Esteve wrote:
> On Thu, Sep 5, 2024 at 6:39 PM Stefan Hajnoczi wrote:
>> If READ_/WRITE_MEM is a pain to implement (I think it is in the
>> vhost-user back-end, even though I've been a proponent of it), then
>> another way to deal with this issue is to specify that upon receiving
>> MAP/UNMAP messages, the vhost-user front-end must update the
>> vhost-user memory tables of all other vhost-user devices. [...]
>>
>> I'd be happy with that. If someone wants to make virtiofs DAX faster,
>> they can implement READ/WRITE_MEM or another solution later, but
>> let's at least make things correct from the start.
>
> I agree. I want it to be correct first. If you agree on splitting the
> spec bits from this patch, I'm already happy. I suggested splitting
> the READ_/WRITE_MEM messages because I thought it was a
> virtiofs-specific issue.
>
> The alternative you proposed is interesting, and I'll take it into
> account. But I would prefer to go for the better solution first and,
> if I get too entangled, switch to the easier implementation.

Great. The difficult part of implementing READ_/WRITE_MEM messages is modifying libvhost-user and rust-vmm's vhost crate to send the new messages when address translation fails. This needs to cover all memory accesses (including vring struct accesses). That code may be a few levels down in the call stack and may assume it can always load/store directly from mmapped memory.

> I think we could do this in 2 patches:
> 1. Split the documentation bits for SHMEM_MAP/_UNMAP. The
>    implementation for these messages will go into the second patch.
> 2. The implementation patch: keep going for the time being with
>    READ_/WRITE_MEM support, with its documentation kept within this
>    patch. This way, if we switch to the front-end updating the
>    vhost-user memory tables, we are not committed to any specific
>    solution even if patch 1 has already been merged.

I'm happy as long as the vhost-user spec patch that introduces MAP/UNMAP also covers a solution for the memory access problem (either READ_/WRITE_MEM or propagating mappings to all vhost-user back-ends).

Stefan

> BR,
> Albert.
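The fallback described above can be sketched as a toy model. This is not the libvhost-user or rust-vmm API; the names (`MemoryTable`, `mem_read_request`, `backend_read`) are invented for illustration of the idea that a back-end tries direct access through its mmapped memory table first and falls back to a MEM_READ message to the front-end when translation fails:

```python
# Toy model of a vhost-user back-end memory access with a MEM_READ
# fallback. Real back-ends translate guest physical addresses through
# mmapped memory regions; the idea discussed in the thread is to fall
# back to asking the front-end when translation fails.

class MemoryTable:
    def __init__(self):
        # (guest_phys_base, size) -> bytearray standing in for the mmap
        self.regions = {}

    def add_region(self, gpa, data):
        self.regions[(gpa, len(data))] = bytearray(data)

    def translate(self, gpa, size):
        for (base, length), backing in self.regions.items():
            if base <= gpa and gpa + size <= base + length:
                return backing, gpa - base
        return None  # address not covered by any mmapped region

def mem_read_request(gpa, size):
    # Stand-in for sending a MEM_READ message to the front-end and
    # waiting for the reply; here it just returns placeholder bytes.
    return b"\x00" * size

def backend_read(table, gpa, size):
    hit = table.translate(gpa, size)
    if hit is not None:
        backing, off = hit
        return bytes(backing[off:off + size])  # fast path: direct load
    return mem_read_request(gpa, size)         # slow path: ask front-end

table = MemoryTable()
table.add_region(0x1000, b"hello world")
print(backend_read(table, 0x1000, 5))  # resolved through the memory table
print(backend_read(table, 0x9000, 4))  # falls back to MEM_READ
```

The hard part noted above is exactly that every load/store site in a real back-end, including vring accesses buried deep in the call stack, would need to go through something like `backend_read` instead of dereferencing mmapped memory directly.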
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Thu, Sep 5, 2024 at 6:39 PM Stefan Hajnoczi wrote:
> If READ_/WRITE_MEM is a pain to implement (I think it is in the
> vhost-user back-end, even though I've been a proponent of it), then
> another way to deal with this issue is to specify that upon receiving
> MAP/UNMAP messages, the vhost-user front-end must update the
> vhost-user memory tables of all other vhost-user devices. That way
> vhost-user devices will be able to access VIRTIO Shared Memory
> Regions mapped by other devices.
>
> Implementing this in QEMU should be much easier than implementing
> READ_/WRITE_MEM support in device back-ends.
>
> This will be slow and scale poorly, but performance is only a problem
> for devices that frequently MAP/UNMAP, like virtiofs. Will virtio-gpu
> and virtio-media use MAP/UNMAP often at runtime? They might be able
> to get away with this simple solution.
>
> I'd be happy with that. If someone wants to make virtiofs DAX faster,
> they can implement READ/WRITE_MEM or another solution later, but
> let's at least make things correct from the start.

I agree. I want it to be correct first. If you agree on splitting the spec bits from this patch, I'm already happy. I suggested splitting the READ_/WRITE_MEM messages because I thought it was a virtiofs-specific issue.

The alternative you proposed is interesting, and I'll take it into account. But I would prefer to go for the better solution first and, if I get too entangled, switch to the easier implementation.

I think we could do this in 2 patches:
1. Split the documentation bits for SHMEM_MAP/_UNMAP. The implementation for these messages will go into the second patch.
2. The implementation patch: keep going for the time being with READ_/WRITE_MEM support, with its documentation kept within this patch. This way, if we switch to the front-end updating the vhost-user memory tables, we are not committed to any specific solution even if patch 1 has already been merged.

BR,
Albert.
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Fri, Sep 6, 2024 at 12:56 AM Stefan Hajnoczi wrote:
> On Tue, Jul 16, 2024 at 10:21:35AM +0900, David Stevens wrote:
>> You mean that a virtio device can use absolute memory addresses for
>> other devices' shared memory regions, but it can't use absolute
>> memory addresses for its own shared memory regions? That's a rather
>> strange requirement. Or is the statement simply giving an addressing
>> strategy that device type specifications are free to ignore?
>
> My recollection of the intent behind the quoted section is:
>
> 1. Structures in shared memory that point to shared memory must use
>    relative offsets instead of absolute physical addresses.
> 2. Virtqueue requests that refer to shared memory (e.g. map this page
>    from a virtiofs file to this location in shared memory) must use
>    relative offsets instead of absolute physical addresses.
>
> In other words, shared memory must be relocatable. Don't assume
> Shared Memory Regions have an absolute guest physical address. This
> makes device implementations independent of the guest physical memory
> layout and might also help when Shared Memory Regions are exposed to
> guest user-space, where the guest physical memory layout isn't known.

Doesn't this discussion contradict the necessity of point 1? If I'm understanding things correctly, it is valid for virtio device A to refer to a structure in virtio device B's shared memory region by absolute guest physical address. At least, there is nothing in the spec about resolving shmid values among different virtio devices, so absolute guest physical addresses are the only way this sharing can be done. And if it's valid for a pointer to a structure in a shared memory region to exist, it's not clear to me why you can't have pointers within a shared memory region. It definitely makes sense that setting up a mapping should be done with offsets. But unless a shared memory region can be dynamically reallocated at runtime, it doesn't seem necessary to ban pointers.
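David's point that shmids are per-device can be illustrated with a toy sketch. This is not spec or QEMU code; `Device` and `resolve` are invented names. Within a device, a reference can be expressed as (shmid, offset), but another device has no way to resolve a foreign shmid, so the only cross-device form of the reference is the absolute guest physical address:

```python
# Toy model: shmids are scoped to one device, so (shmid, offset)
# references only make sense within that device. Cross-device sharing
# has no shmid resolution mechanism in the spec, leaving absolute
# guest physical addresses as the only common currency.

class Device:
    def __init__(self, shm_regions):
        # shmid -> (guest_phys_base, size); the layout is per-device
        self.shm_regions = shm_regions

    def resolve(self, shmid, offset):
        base, size = self.shm_regions[shmid]
        assert offset < size
        return base + offset  # absolute GPA of the referenced byte

dev_b = Device({0: (0x8000_0000, 0x1000)})

# Device B can turn its own (shmid=0, offset=0x40) into a GPA...
gpa = dev_b.resolve(0, 0x40)

# ...but device A has no notion of B's shmid 0, so the only way a
# driver can point device A at B's structure is this GPA itself.
print(hex(gpa))
```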
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Tue, Sep 03, 2024 at 10:42:34AM +0200, Albert Esteve wrote:
> Hello all,
>
> Sorry, I have been a bit disconnected from this thread, as I was on
> vacation and then had to switch tasks for a while.
>
> I will try to go through all comments and address them for the first
> non-RFC drop of this patch series.
>
> But I was discussing with some colleagues on this. It turns out
> rust-vmm's vhost-user-gpu will potentially use this soon, and a
> rust-vmm/vhost patch has already been posted:
> https://github.com/rust-vmm/vhost/pull/251.
> So I think it may make sense to:
> 1. Split the vhost-user documentation patch once settled. Since it is
>    taken as the official spec, having it upstreamed independently of
>    the implementation will help other projects work on and integrate
>    their own code.
> 2. Split the READ_/WRITE_MEM messages from the SHMEM_MAP/_UNMAP
>    patches. If I remember correctly, this addresses a
>    virtiofs-specific issue that will not impact virtio-gpu,
>    virtio-media, or any other device.

This is an architectural issue that arises from exposing VIRTIO Shared Memory Regions in vhost-user. It was first seen with Linux virtiofs, but it could happen with other devices and/or guest operating systems.

Any VIRTIO Shared Memory Region that can be mmapped into Linux userspace may trigger this issue. Userspace may write(2) to an O_DIRECT file with the mmap as the source. The vhost-user-blk device will not be able to access the source device's VIRTIO Shared Memory Region and will fail.

> So it may make sense to separate them so that one does not stall the
> other. I will try to have both integrated in the mid term.

If READ_/WRITE_MEM is a pain to implement (I think it is in the vhost-user back-end, even though I've been a proponent of it), then another way to deal with this issue is to specify that, upon receiving MAP/UNMAP messages, the vhost-user front-end must update the vhost-user memory tables of all other vhost-user devices. That way, vhost-user devices will be able to access VIRTIO Shared Memory Regions mapped by other devices.

Implementing this in QEMU should be much easier than implementing READ_/WRITE_MEM support in device back-ends.

This will be slow and scale poorly, but performance is only a problem for devices that frequently MAP/UNMAP, like virtiofs. Will virtio-gpu and virtio-media use MAP/UNMAP often at runtime? They might be able to get away with this simple solution.

I'd be happy with that. If someone wants to make virtiofs DAX faster, they can implement READ/WRITE_MEM or another solution later, but let's at least make things correct from the start.

Stefan

> WDYT?
>
> BR,
> Albert.
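The alternative Stefan proposes, the front-end propagating each new mapping to every back-end, can be sketched as a toy model. This is not QEMU code; `Backend`, `FrontEnd`, and `handle_shmem_map` are invented names standing in for SET_MEM_TABLE-style updates:

```python
# Toy model of the alternative: when one back-end sends SHMEM_MAP,
# the front-end adds the new mapping to the memory tables of *all*
# back-ends, so any device can access another device's VIRTIO Shared
# Memory Region.

class Backend:
    def __init__(self, name):
        self.name = name
        self.mem_table = set()  # (gpa, size) regions this device can access

    def update_mem_table(self, region):
        # Stand-in for the front-end re-sending the memory table.
        self.mem_table.add(region)

class FrontEnd:
    def __init__(self, backends):
        self.backends = backends

    def handle_shmem_map(self, gpa, size):
        region = (gpa, size)
        # Propagate to every back-end, not just the requester; this is
        # what makes cross-device access work, at the cost of one
        # update per device on every MAP/UNMAP.
        for be in self.backends:
            be.update_mem_table(region)

virtiofs = Backend("virtiofs")
blk = Backend("vhost-user-blk")
fe = FrontEnd([virtiofs, blk])

# virtiofs maps a DAX window; vhost-user-blk can now see it too, so an
# O_DIRECT write(2) sourced from the window can be served.
fe.handle_shmem_map(0x8000_0000, 0x1000)
print((0x8000_0000, 0x1000) in blk.mem_table)
```

The per-device fan-out on every MAP/UNMAP is also where the "slow and scale poorly" caveat comes from: a device like virtiofs DAX that maps and unmaps constantly would trigger a full round of memory-table updates each time.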
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Tue, Jul 16, 2024 at 10:21:35AM +0900, David Stevens wrote:
> On Fri, Jul 12, 2024 at 2:47 PM Michael S. Tsirkin wrote:
>> On Fri, Jul 12, 2024 at 11:06:49AM +0900, David Stevens wrote:
>>> On Thu, Jul 11, 2024 at 7:56 PM Alyssa Ross wrote:
>>>> Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in
>>>> crosvm a couple of years ago.
>>>>
>>>> David, I'd be particularly interested in your thoughts on the
>>>> MEM_READ and MEM_WRITE commands, since as far as I know crosvm
>>>> doesn't implement anything like that. The discussion leading to
>>>> those being added starts here:
>>>>
>>>> https://lore.kernel.org/qemu-devel/20240604185416.gb90...@fedora.redhat.com/
>>>>
>>>> It would be great if this could be standardised between QEMU and
>>>> crosvm (and therefore have a clearer path toward being implemented
>>>> in other VMMs)!
>>>
>>> Setting aside vhost-user for a moment, the DAX example given by
>>> Stefan won't work in crosvm today.
>>>
>>> Is universal access to virtio shared memory regions actually
>>> mandated by the virtio spec? Copying from virtiofs DAX to virtiofs
>>> sharing seems reasonable enough, but what about virtio-pmem to
>>> virtio-blk? What about screenshotting a framebuffer in virtio-gpu
>>> shared memory to virtio-scsi? I guess with some plumbing in the
>>> VMM, it's solvable in a virtualized environment. But what about
>>> when you have real hardware that speaks virtio involved? That's
>>> outside my wheelhouse, but it doesn't seem like it would be easy to
>>> solve.
>>
>> Yes, it can work for physical devices if allowed by host
>> configuration. E.g. VFIO supports that, I think. I don't think VDPA
>> does.
>
> I'm sure it can work, but that sounds more like a SHOULD (MAY?),
> rather than a MUST.
>
>>> For what it's worth, my interpretation of the target scenario:
>>>
>>>> Other backends don't see these mappings. If the guest submits a
>>>> vring descriptor referencing a mapping to another backend, then
>>>> that backend won't be able to access this memory
>>>
>>> is that it's omitting how the implementation is reconciled with
>>> section 2.10.1 of v1.3 of the virtio spec, which states that:
>>>
>>>> References into shared memory regions are represented as offsets
>>>> from the beginning of the region instead of absolute memory
>>>> addresses. Offsets are used both for references between structures
>>>> stored within shared memory and for requests placed in virtqueues
>>>> that refer to shared memory.
>>>
>>> My interpretation of that statement is that putting raw guest
>>> physical addresses corresponding to virtio shared memory regions
>>> into a vring is a driver spec violation.
>>>
>>> -David
>>
>> This really applies within device I think. Should be clarified ...
>
> You mean that a virtio device can use absolute memory addresses for
> other devices' shared memory regions, but it can't use absolute
> memory addresses for its own shared memory regions? That's a rather
> strange requirement. Or is the statement simply giving an addressing
> strategy that device type specifications are free to ignore?

My recollection of the intent behind the quoted section is:

1. Structures in shared memory that point to shared memory must use relative offsets instead of absolute physical addresses.
2. Virtqueue requests that refer to shared memory (e.g. map this page from a virtiofs file to this location in shared memory) must use relative offsets instead of absolute physical addresses.

In other words, shared memory must be relocatable. Don't assume Shared Memory Regions have an absolute guest physical address. This makes device implementations independent of the guest physical memory layout and might also help when Shared Memory Regions are exposed to guest user-space, where the guest physical memory layout isn't known.

Stefan
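The relocatability rule Stefan describes can be illustrated with a small sketch (a toy illustration, not spec or device code): structures inside a shared memory region reference each other by offset from the region's start, so the same region contents remain valid no matter which guest physical address the region is mapped at.

```python
# Toy illustration of section 2.10.1's rule: references into a shared
# memory region are offsets from the region's beginning, never
# absolute guest physical addresses, so the region is relocatable.

def build_region():
    # A tiny "structure" at offset 0 that references data at offset
    # 64, storing the *offset* rather than an absolute address.
    region = bytearray(128)
    region[0:8] = (64).to_bytes(8, "little")  # offset-based reference
    region[64:69] = b"hello"
    return region

def deref(region, struct_off):
    # Follow the offset-based reference; no guest physical address is
    # needed at any point.
    target = int.from_bytes(region[struct_off:struct_off + 8], "little")
    return bytes(region[target:target + 5])

region = build_region()
# The identical region contents work whether the region is placed at
# GPA 0x4000_0000 or 0x9000_0000, because dereferencing uses offsets.
print(deref(region, 0))
```

Had the structure stored an absolute guest physical address instead of the offset, the region contents would break as soon as the region moved, which is exactly the dependency on guest memory layout the rule avoids.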
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
Hello all,

Sorry, I have been a bit disconnected from this thread, as I was on vacation and then had to switch tasks for a while.

I will try to go through all comments and address them for the first non-RFC drop of this patch series.

But I was discussing with some colleagues on this. It turns out rust-vmm's vhost-user-gpu will potentially use this soon, and a rust-vmm/vhost patch has already been posted: https://github.com/rust-vmm/vhost/pull/251. So I think it may make sense to:

1. Split the vhost-user documentation patch once settled. Since it is taken as the official spec, having it upstreamed independently of the implementation will help other projects work on and integrate their own code.
2. Split the READ_/WRITE_MEM messages from the SHMEM_MAP/_UNMAP patches. If I remember correctly, this addresses a virtiofs-specific issue that will not impact virtio-gpu, virtio-media, or any other device.

So it may make sense to separate them so that one does not stall the other. I will try to have both integrated in the mid term.

WDYT?

BR,
Albert.
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Fri, Jul 12, 2024 at 2:47 PM Michael S. Tsirkin wrote: > > On Fri, Jul 12, 2024 at 11:06:49AM +0900, David Stevens wrote: > > On Thu, Jul 11, 2024 at 7:56 PM Alyssa Ross wrote: > > > > > > Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in > > > crosvm a couple of years ago. > > > > > > David, I'd be particularly interested for your thoughts on the MEM_READ > > > and MEM_WRITE commands, since as far as I know crosvm doesn't implement > > > anything like that. The discussion leading to those being added starts > > > here: > > > > > > https://lore.kernel.org/qemu-devel/20240604185416.gb90...@fedora.redhat.com/ > > > > > > It would be great if this could be standardised between QEMU and crosvm > > > (and therefore have a clearer path toward being implemented in other > > > VMMs)! > > > > Setting aside vhost-user for a moment, the DAX example given by Stefan > > won't work in crosvm today. > > > > Is universal access to virtio shared memory regions actually mandated > > by the virtio spec? Copying from virtiofs DAX to virtiofs sharing > > seems reasonable enough, but what about virtio-pmem to virtio-blk? > > What about screenshotting a framebuffer in virtio-gpu shared memory to > > virtio-scsi? I guess with some plumbing in the VMM, it's solvable in a > > virtualized environment. But what about when you have real hardware > > that speaks virtio involved? That's outside my wheelhouse, but it > > doesn't seem like that would be easy to solve. > > Yes, it can work for physical devices if allowed by host configuration. > E.g. VFIO supports that I think. Don't think VDPA does. I'm sure it can work, but that sounds more like a SHOULD (MAY?), rather than a MUST. > > For what it's worth, my interpretation of the target scenario: > > > > > Other backends don't see these mappings. 
If the guest submits a vring > > > descriptor referencing a mapping to another backend, then that backend > > > won't be able to access this memory > > > > is that it's omitting how the implementation is reconciled with > > section 2.10.1 of v1.3 of the virtio spec, which states that: > > > > > References into shared memory regions are represented as offsets from > > > the beginning of the region instead of absolute memory addresses. Offsets > > > are used both for references between structures stored within shared > > > memory and for requests placed in virtqueues that refer to shared memory. > > > > My interpretation of that statement is that putting raw guest physical > > addresses corresponding to virtio shared memory regions into a vring > > is a driver spec violation. > > > > -David > > This really applies within device I think. Should be clarified ... You mean that a virtio device can use absolute memory addresses for other devices' shared memory regions, but it can't use absolute memory addresses for its own shared memory regions? That's a rather strange requirement. Or is the statement simply giving an addressing strategy that device type specifications are free to ignore? -David
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Fri, Jul 12, 2024 at 1:48 PM Michael S. Tsirkin wrote: > > On Fri, Jul 12, 2024 at 11:06:49AM +0900, David Stevens wrote: > > On Thu, Jul 11, 2024 at 7:56 PM Alyssa Ross wrote: > > > > > > Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in > > > crosvm a couple of years ago. > > > > > > David, I'd be particularly interested for your thoughts on the MEM_READ > > > and MEM_WRITE commands, since as far as I know crosvm doesn't implement > > > anything like that. The discussion leading to those being added starts > > > here: > > > > > > https://lore.kernel.org/qemu-devel/20240604185416.gb90...@fedora.redhat.com/ > > > > > > It would be great if this could be standardised between QEMU and crosvm > > > (and therefore have a clearer path toward being implemented in other > > > VMMs)! > > > > Setting aside vhost-user for a moment, the DAX example given by Stefan > > won't work in crosvm today. > > > > Is universal access to virtio shared memory regions actually mandated > > by the virtio spec? Copying from virtiofs DAX to virtiofs sharing > > seems reasonable enough, but what about virtio-pmem to virtio-blk? > > What about screenshotting a framebuffer in virtio-gpu shared memory to > > virtio-scsi? I guess with some plumbing in the VMM, it's solvable in a > > virtualized environment. But what about when you have real hardware > > that speaks virtio involved? That's outside my wheelhouse, but it > > doesn't seem like that would be easy to solve. > > Yes, it can work for physical devices if allowed by host configuration. > E.g. VFIO supports that I think. Don't think VDPA does. > I guess you meant iommufd support here? Thanks
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Fri, Jul 12, 2024 at 11:06:49AM +0900, David Stevens wrote: > On Thu, Jul 11, 2024 at 7:56 PM Alyssa Ross wrote: > > > > Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in > > crosvm a couple of years ago. > > > > David, I'd be particularly interested for your thoughts on the MEM_READ > > and MEM_WRITE commands, since as far as I know crosvm doesn't implement > > anything like that. The discussion leading to those being added starts > > here: > > > > https://lore.kernel.org/qemu-devel/20240604185416.gb90...@fedora.redhat.com/ > > > > It would be great if this could be standardised between QEMU and crosvm > > (and therefore have a clearer path toward being implemented in other VMMs)! > > Setting aside vhost-user for a moment, the DAX example given by Stefan > won't work in crosvm today. > > Is universal access to virtio shared memory regions actually mandated > by the virtio spec? Copying from virtiofs DAX to virtiofs sharing > seems reasonable enough, but what about virtio-pmem to virtio-blk? > What about screenshotting a framebuffer in virtio-gpu shared memory to > virtio-scsi? I guess with some plumbing in the VMM, it's solvable in a > virtualized environment. But what about when you have real hardware > that speaks virtio involved? That's outside my wheelhouse, but it > doesn't seem like that would be easy to solve. Yes, it can work for physical devices if allowed by host configuration. E.g. VFIO supports that I think. Don't think VDPA does. > For what it's worth, my interpretation of the target scenario: > > > Other backends don't see these mappings. 
If the guest submits a vring > > descriptor referencing a mapping to another backend, then that backend > > won't be able to access this memory > > is that it's omitting how the implementation is reconciled with > section 2.10.1 of v1.3 of the virtio spec, which states that: > > > References into shared memory regions are represented as offsets from > > the beginning of the region instead of absolute memory addresses. Offsets > > are used both for references between structures stored within shared > > memory and for requests placed in virtqueues that refer to shared memory. > > My interpretation of that statement is that putting raw guest physical > addresses corresponding to virtio shared memory regions into a vring > is a driver spec violation. > > -David This really applies within device I think. Should be clarified ... -- MST
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Thu, Jul 11, 2024 at 7:56 PM Alyssa Ross wrote: > > Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in > crosvm a couple of years ago. > > David, I'd be particularly interested for your thoughts on the MEM_READ > and MEM_WRITE commands, since as far as I know crosvm doesn't implement > anything like that. The discussion leading to those being added starts > here: > > https://lore.kernel.org/qemu-devel/20240604185416.gb90...@fedora.redhat.com/ > > It would be great if this could be standardised between QEMU and crosvm > (and therefore have a clearer path toward being implemented in other VMMs)! Setting aside vhost-user for a moment, the DAX example given by Stefan won't work in crosvm today. Is universal access to virtio shared memory regions actually mandated by the virtio spec? Copying from virtiofs DAX to virtiofs sharing seems reasonable enough, but what about virtio-pmem to virtio-blk? What about screenshotting a framebuffer in virtio-gpu shared memory to virtio-scsi? I guess with some plumbing in the VMM, it's solvable in a virtualized environment. But what about when you have real hardware that speaks virtio involved? That's outside my wheelhouse, but it doesn't seem like that would be easy to solve. For what it's worth, my interpretation of the target scenario: > Other backends don't see these mappings. If the guest submits a vring > descriptor referencing a mapping to another backend, then that backend > won't be able to access this memory is that it's omitting how the implementation is reconciled with section 2.10.1 of v1.3 of the virtio spec, which states that: > References into shared memory regions are represented as offsets from > the beginning of the region instead of absolute memory addresses. Offsets > are used both for references between structures stored within shared > memory and for requests placed in virtqueues that refer to shared memory. 
My interpretation of that statement is that putting raw guest physical addresses corresponding to virtio shared memory regions into a vring is a driver spec violation. -David
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in crosvm a couple of years ago.

David, I'd be particularly interested in your thoughts on the MEM_READ and MEM_WRITE commands, since as far as I know crosvm doesn't implement anything like that. The discussion leading to those being added starts here:

https://lore.kernel.org/qemu-devel/20240604185416.gb90...@fedora.redhat.com/

It would be great if this could be standardised between QEMU and crosvm (and therefore have a clearer path toward being implemented in other VMMs)!

Albert Esteve writes:
> Hi all,
>
> v1->v2:
> - Corrected typos and clarifications from first review
> - Added SHMEM_CONFIG frontend request to query VIRTIO shared memory regions from backends
> - vhost-user-device to use SHMEM_CONFIG to request and initialise regions
> - Added MEM_READ/WRITE backend requests in case address translation fails accessing VIRTIO Shared Memory Regions with MMAPs
>
> This is an update of my attempt to have backends support dynamic fd mapping into VIRTIO Shared Memory Regions. After the first review I have added more commits and new messages to the vhost-user protocol. However, I still have some doubts as to how this will work, especially regarding the MEM_READ and MEM_WRITE commands. Thus, I am still looking for feedback, to ensure that I am going in the right direction with the implementation.
>
> The use case for this patch is, e.g., to support vhost-user-gpu RESOURCE_BLOB operations, or DAX Window requests for virtio-fs. In general, any operation where a backend needs to request the frontend to mmap an fd into a VIRTIO Shared Memory Region, so that the guest can then access it.
>
> After receiving the SHMEM_MAP/UNMAP request, the frontend will perform the mmap with the instructed parameters (i.e., shmid, shm_offset, fd_offset, fd, length).
> As there are already a couple of devices that could benefit from such a feature, and more could require it in the future, the goal is to make the implementation generic.
>
> To that end, the VIRTIO Shared Memory Region list is declared in the `VirtIODevice` struct.
>
> This patch also includes a SHMEM_CONFIG frontend request that is specifically meant to allow the generic vhost-user-device frontend to query VIRTIO Shared Memory settings from the backend (as this device is generic and agnostic of the actual backend configuration).
>
> Finally, MEM_READ/WRITE backend requests are added to deal with a potential issue when a backend shares a descriptor that references a mapping made by another backend. The first backend will not be able to see these mappings. So these requests are a fallback for when vhost-user memory translation fails.
>
> Albert Esteve (5):
>   vhost-user: Add VIRTIO Shared Memory map request
>   vhost_user: Add frontend command for shmem config
>   vhost-user-dev: Add cache BAR
>   vhost_user: Add MEM_READ/WRITE backend requests
>   vhost_user: Implement mem_read/mem_write handlers
>
>  docs/interop/vhost-user.rst               |  58 ++
>  hw/virtio/vhost-user-base.c               |  39 +++-
>  hw/virtio/vhost-user-device-pci.c         |  37 +++-
>  hw/virtio/vhost-user.c                    | 221 ++
>  hw/virtio/virtio.c                        |  12 ++
>  include/hw/virtio/vhost-backend.h         |   6 +
>  include/hw/virtio/vhost-user.h            |   1 +
>  include/hw/virtio/virtio.h                |   5 +
>  subprojects/libvhost-user/libvhost-user.c | 149 +++
>  subprojects/libvhost-user/libvhost-user.h |  91 +
>  10 files changed, 614 insertions(+), 5 deletions(-)
>
> --
> 2.45.2
Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
On Fri, Jun 28, 2024 at 04:57:05PM +0200, Albert Esteve wrote:
> Hi all,
>
> v1->v2:
> - Corrected typos and clarifications from first review
> - Added SHMEM_CONFIG frontend request to query VIRTIO shared memory regions from backends
> - vhost-user-device to use SHMEM_CONFIG to request and initialise regions
> - Added MEM_READ/WRITE backend requests in case address translation fails accessing VIRTIO Shared Memory Regions with MMAPs

Hi Albert, I will be offline next week. I've posted comments.

I think the hard part will be adjusting vhost-user backend code to make use of MEM_READ/MEM_WRITE when address translation fails. Ideally every guest memory access (including vring accesses) should fall back to MEM_READ/MEM_WRITE.

A good test for MEM_READ/MEM_WRITE is to completely skip setting the memory table from the frontend and fall back for every guest memory access. If the vhost-user backend still works without a memory table then you know MEM_READ/MEM_WRITE is working too.

The vhost-user spec should probably contain a comment explaining that MEM_READ/MEM_WRITE may be necessary when other device backends use SHMEM_MAP due to the incomplete memory table that prevents translating those memory addresses. In other words, if the guest has a device that uses SHMEM_MAP, then all other vhost-user devices should support MEM_READ/MEM_WRITE in order to ensure that DMA works with Shared Memory Regions.

Stefan

> This is an update of my attempt to have backends support dynamic fd mapping into VIRTIO Shared Memory Regions. After the first review I have added more commits and new messages to the vhost-user protocol. However, I still have some doubts as to how this will work, especially regarding the MEM_READ and MEM_WRITE commands. Thus, I am still looking for feedback, to ensure that I am going in the right direction with the implementation.
> The use case for this patch is, e.g., to support vhost-user-gpu RESOURCE_BLOB operations, or DAX Window requests for virtio-fs. In general, any operation where a backend needs to request the frontend to mmap an fd into a VIRTIO Shared Memory Region, so that the guest can then access it.
>
> After receiving the SHMEM_MAP/UNMAP request, the frontend will perform the mmap with the instructed parameters (i.e., shmid, shm_offset, fd_offset, fd, length).
>
> As there are already a couple of devices that could benefit from such a feature, and more could require it in the future, the goal is to make the implementation generic.
>
> To that end, the VIRTIO Shared Memory Region list is declared in the `VirtIODevice` struct.
>
> This patch also includes a SHMEM_CONFIG frontend request that is specifically meant to allow the generic vhost-user-device frontend to query VIRTIO Shared Memory settings from the backend (as this device is generic and agnostic of the actual backend configuration).
>
> Finally, MEM_READ/WRITE backend requests are added to deal with a potential issue when a backend shares a descriptor that references a mapping made by another backend. The first backend will not be able to see these mappings. So these requests are a fallback for when vhost-user memory translation fails.
> Albert Esteve (5):
>   vhost-user: Add VIRTIO Shared Memory map request
>   vhost_user: Add frontend command for shmem config
>   vhost-user-dev: Add cache BAR
>   vhost_user: Add MEM_READ/WRITE backend requests
>   vhost_user: Implement mem_read/mem_write handlers
>
>  docs/interop/vhost-user.rst               |  58 ++
>  hw/virtio/vhost-user-base.c               |  39 +++-
>  hw/virtio/vhost-user-device-pci.c         |  37 +++-
>  hw/virtio/vhost-user.c                    | 221 ++
>  hw/virtio/virtio.c                        |  12 ++
>  include/hw/virtio/vhost-backend.h         |   6 +
>  include/hw/virtio/vhost-user.h            |   1 +
>  include/hw/virtio/virtio.h                |   5 +
>  subprojects/libvhost-user/libvhost-user.c | 149 +++
>  subprojects/libvhost-user/libvhost-user.h |  91 +
>  10 files changed, 614 insertions(+), 5 deletions(-)
>
> --
> 2.45.2
[RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
Hi all,

v1->v2:
- Corrected typos and clarifications from first review
- Added SHMEM_CONFIG frontend request to query VIRTIO shared memory regions from backends
- vhost-user-device to use SHMEM_CONFIG to request and initialise regions
- Added MEM_READ/WRITE backend requests in case address translation fails accessing VIRTIO Shared Memory Regions with MMAPs

This is an update of my attempt to have backends support dynamic fd mapping into VIRTIO Shared Memory Regions. After the first review I have added more commits and new messages to the vhost-user protocol. However, I still have some doubts as to how this will work, especially regarding the MEM_READ and MEM_WRITE commands. Thus, I am still looking for feedback, to ensure that I am going in the right direction with the implementation.

The use case for this patch is, e.g., to support vhost-user-gpu RESOURCE_BLOB operations, or DAX Window requests for virtio-fs. In general, any operation where a backend needs to request the frontend to mmap an fd into a VIRTIO Shared Memory Region, so that the guest can then access it.

After receiving the SHMEM_MAP/UNMAP request, the frontend will perform the mmap with the instructed parameters (i.e., shmid, shm_offset, fd_offset, fd, length).

As there are already a couple of devices that could benefit from such a feature, and more could require it in the future, the goal is to make the implementation generic. To that end, the VIRTIO Shared Memory Region list is declared in the `VirtIODevice` struct.

This patch also includes a SHMEM_CONFIG frontend request that is specifically meant to allow the generic vhost-user-device frontend to query VIRTIO Shared Memory settings from the backend (as this device is generic and agnostic of the actual backend configuration).

Finally, MEM_READ/WRITE backend requests are added to deal with a potential issue when a backend shares a descriptor that references a mapping made by another backend. The first backend will not be able to see these mappings.
So these requests are a fallback for when vhost-user memory translation fails.

Albert Esteve (5):
  vhost-user: Add VIRTIO Shared Memory map request
  vhost_user: Add frontend command for shmem config
  vhost-user-dev: Add cache BAR
  vhost_user: Add MEM_READ/WRITE backend requests
  vhost_user: Implement mem_read/mem_write handlers

 docs/interop/vhost-user.rst               |  58 ++
 hw/virtio/vhost-user-base.c               |  39 +++-
 hw/virtio/vhost-user-device-pci.c         |  37 +++-
 hw/virtio/vhost-user.c                    | 221 ++
 hw/virtio/virtio.c                        |  12 ++
 include/hw/virtio/vhost-backend.h         |   6 +
 include/hw/virtio/vhost-user.h            |   1 +
 include/hw/virtio/virtio.h                |   5 +
 subprojects/libvhost-user/libvhost-user.c | 149 +++
 subprojects/libvhost-user/libvhost-user.h |  91 +
 10 files changed, 614 insertions(+), 5 deletions(-)

--
2.45.2