On Fri, Oct 17, 2025 at 2:13 PM Stefano Garzarella <[email protected]> wrote: > > On Fri, Oct 17, 2025 at 01:24:52PM +0200, Albert Esteve wrote: > >On Fri, Oct 17, 2025 at 11:23 AM Stefano Garzarella <[email protected]> > >wrote: > >> > >> On Thu, Oct 16, 2025 at 04:38:21PM +0200, Albert Esteve wrote: > >> >Add SHMEM_MAP/UNMAP requests to vhost-user for dynamic management of > >> >VIRTIO Shared Memory mappings. > >> > > >> >This implementation introduces VirtioSharedMemoryMapping as a unified > >> >QOM object that manages both the mapping metadata and MemoryRegion > >> >lifecycle. This object provides reference-counted lifecycle management > >> >with automatic cleanup of file descriptors and memory regions > >> >through QOM finalization. > >> > > >> >This request allows backends to dynamically map file descriptors into a > >> >VIRTIO Shared Memory Region identified by their shmid. Maps are created > >> >using memory_region_init_ram_from_fd() with configurable read/write > >> >permissions, and the resulting MemoryRegions are added as subregions to > >> >the shmem container region. The mapped memory is then advertised to the > >> >guest VIRTIO drivers as a base address plus offset for reading and > >> >writting according to the requested mmap flags. > >> > > >> >The backend can unmap memory ranges within a given VIRTIO Shared Memory > >> >Region to free resources. Upon receiving this message, the frontend > >> >removes the MemoryRegion as a subregion and automatically unreferences > >> >the VirtioSharedMemoryMapping object, triggering cleanup if no other > >> >references exist. > >> > > >> >Error handling has been improved to ensure consistent behavior across > >> >handlers that manage their own vhost_user_send_resp() calls. Since > >> >these handlers clear the VHOST_USER_NEED_REPLY_MASK flag, explicit > >> >error checking ensures proper connection closure on failures, > >> >maintaining the expected error flow. > >> > > >> >Note the memory region commit for these operations needs to be delayed > >> >until after we reply to the backend to avoid deadlocks. Otherwise, > >> >the MemoryListener would send a VHOST_USER_SET_MEM_TABLE message > >> >before the reply. > >> > > >> >Reviewed-by: Stefan Hajnoczi <[email protected]> > >> >Signed-off-by: Albert Esteve <[email protected]> > >> >--- > >> > hw/virtio/vhost-user.c | 267 ++++++++++++++++++++++ > >> > hw/virtio/virtio.c | 199 ++++++++++++++++ > >> > include/hw/virtio/virtio.h | 135 +++++++++++ > >> > subprojects/libvhost-user/libvhost-user.c | 70 ++++++ > >> > subprojects/libvhost-user/libvhost-user.h | 54 +++++ > >> > 5 files changed, 725 insertions(+) > >> > > >> >diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > >> >index 36c9c2e04d..890be55937 100644 > >> >--- a/hw/virtio/vhost-user.c > >> >+++ b/hw/virtio/vhost-user.c > >> >@@ -104,6 +104,7 @@ typedef enum VhostUserRequest { > >> > VHOST_USER_GET_SHARED_OBJECT = 41, > >> > VHOST_USER_SET_DEVICE_STATE_FD = 42, > >> > VHOST_USER_CHECK_DEVICE_STATE = 43, > >> >+ VHOST_USER_GET_SHMEM_CONFIG = 44, > >> > VHOST_USER_MAX > >> > } VhostUserRequest; > >> > > >> >@@ -115,6 +116,8 @@ typedef enum VhostUserBackendRequest { > >> > VHOST_USER_BACKEND_SHARED_OBJECT_ADD = 6, > >> > VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE = 7, > >> > VHOST_USER_BACKEND_SHARED_OBJECT_LOOKUP = 8, > >> >+ VHOST_USER_BACKEND_SHMEM_MAP = 9, > >> >+ VHOST_USER_BACKEND_SHMEM_UNMAP = 10, > >> > VHOST_USER_BACKEND_MAX > >> > } VhostUserBackendRequest; > >> > > >> >@@ -136,6 +139,12 @@ typedef struct VhostUserMemRegMsg { > >> > VhostUserMemoryRegion region; > >> > } VhostUserMemRegMsg; > >> > > >> >+typedef struct VhostUserShMemConfig { > >> >+ uint32_t nregions; > >> >+ uint32_t padding; > >> >+ uint64_t memory_sizes[VIRTIO_MAX_SHMEM_REGIONS]; > >> >+} VhostUserShMemConfig; > >> >+ > >> > typedef struct VhostUserLog { > >> > uint64_t mmap_size; > >> > uint64_t mmap_offset; > >> >@@ -192,6 +201,23 @@ typedef struct VhostUserShared { > >> > unsigned char uuid[16]; > >> > } VhostUserShared; > >> > > >> >+/* For the flags field of VhostUserMMap */ > >> >+#define VHOST_USER_FLAG_MAP_RW (1u << 0) > >> >+ > >> >+typedef struct { > >> >+ /* VIRTIO Shared Memory Region ID */ > >> >+ uint8_t shmid; > >> >+ uint8_t padding[7]; > >> >+ /* File offset */ > >> >+ uint64_t fd_offset; > >> >+ /* Offset within the VIRTIO Shared Memory Region */ > >> >+ uint64_t shm_offset; > >> >+ /* Size of the mapping */ > >> >+ uint64_t len; > >> >+ /* Flags for the mmap operation, from VHOST_USER_FLAG_MAP_* */ > >> >+ uint64_t flags; > >> >+} VhostUserMMap; > >> >+ > >> > typedef struct { > >> > VhostUserRequest request; > >> > > >> >@@ -224,6 +250,8 @@ typedef union { > >> > VhostUserInflight inflight; > >> > VhostUserShared object; > >> > VhostUserTransferDeviceState transfer_state; > >> >+ VhostUserMMap mmap; > >> >+ VhostUserShMemConfig shmem; > >> > } VhostUserPayload; > >> > > >> > typedef struct VhostUserMsg { > >> >@@ -1768,6 +1796,196 @@ > >> >vhost_user_backend_handle_shared_object_lookup(struct vhost_user *u, > >> > return 0; > >> > } > >> > > >> >+/** > >> >+ * vhost_user_backend_handle_shmem_map() - Handle SHMEM_MAP backend > >> >request > >> >+ * @dev: vhost device > >> >+ * @ioc: QIOChannel for communication > >> >+ * @hdr: vhost-user message header > >> >+ * @payload: message payload containing mapping details > >> >+ * @fd: file descriptor for the shared memory region > >> >+ * > >> >+ * Handles VHOST_USER_BACKEND_SHMEM_MAP requests from the backend. > >> >Creates > >> >+ * a VhostUserShmemObject to manage the shared memory mapping and adds it > >> >+ * to the appropriate VirtIO shared memory region. The > >> >VhostUserShmemObject > >> >+ * serves as an intermediate parent for the MemoryRegion, ensuring proper > >> >+ * lifecycle management with reference counting. > >> >+ * > >> >+ * Returns: 0 on success, negative errno on failure > >> >+ */ > >> >+static int > >> >+vhost_user_backend_handle_shmem_map(struct vhost_dev *dev, > >> >+ QIOChannel *ioc, > >> >+ VhostUserHeader *hdr, > >> >+ VhostUserPayload *payload, > >> >+ int fd) > >> >+{ > >> >+ VirtioSharedMemory *shmem; > >> >+ VhostUserMMap *vu_mmap = &payload->mmap; > >> >+ VirtioSharedMemoryMapping *existing; > >> >+ Error *local_err = NULL; > >> >+ int ret = 0; > >> >+ > >> >+ if (fd < 0) { > >> >+ error_report("Bad fd for map"); > >> >+ ret = -EBADF; > >> >+ goto send_reply; > >> >+ } > >> >+ > >> >+ if (QSIMPLEQ_EMPTY(&dev->vdev->shmem_list)) { > >> >+ error_report("Device has no VIRTIO Shared Memory Regions. " > >> >+ "Requested ID: %d", vu_mmap->shmid); > >> >+ ret = -EFAULT; > >> >+ goto send_reply; > >> >+ } > >> >+ > >> >+ shmem = virtio_find_shmem_region(dev->vdev, vu_mmap->shmid); > >> >+ if (!shmem) { > >> >+ error_report("VIRTIO Shared Memory Region at " > >> >+ "ID %d not found or uninitialized", vu_mmap->shmid); > >> >+ ret = -EFAULT; > >> >+ goto send_reply; > >> >+ } > >> >+ > >> >+ if ((vu_mmap->shm_offset + vu_mmap->len) < vu_mmap->len || > >> >+ (vu_mmap->shm_offset + vu_mmap->len) > shmem->mr.size) { > >> >+ error_report("Bad offset/len for mmap %" PRIx64 "+%" PRIx64, > >> >+ vu_mmap->shm_offset, vu_mmap->len); > >> >+ ret = -EFAULT; > >> >+ goto send_reply; > >> >+ } > >> >+ > >> >+ QTAILQ_FOREACH(existing, &shmem->mmaps, link) { > >> >+ if (ranges_overlap(existing->offset, existing->len, > >> >+ vu_mmap->shm_offset, vu_mmap->len)) { > >> >+ error_report("VIRTIO Shared Memory mapping overlap"); > >> >+ ret = -EFAULT; > >> >+ goto send_reply; > >> >+ } > >> >+ } > >> >+ > >> >+ memory_region_transaction_begin(); > >> >+ > >> >+ /* Create VirtioSharedMemoryMapping object */ > >> >+ VirtioSharedMemoryMapping *mapping = > >> >virtio_shared_memory_mapping_new( > >> >+ vu_mmap->shmid, fd, vu_mmap->fd_offset, vu_mmap->shm_offset, > >> >+ vu_mmap->len, vu_mmap->flags & VHOST_USER_FLAG_MAP_RW); > >> >+ > >> >+ if (!mapping) { > >> >+ ret = -EFAULT; > >> >+ goto send_reply_commit; > >> >+ } > >> >+ > >> >+ /* Add the mapping to the shared memory region */ > >> >+ if (virtio_add_shmem_map(shmem, mapping) != 0) { > >> >+ error_report("Failed to add shared memory mapping"); > >> >+ object_unref(OBJECT(mapping)); > >> >+ ret = -EFAULT; > >> >+ goto send_reply_commit; > >> >+ } > >> >+ > >> >+send_reply_commit: > >> >+ /* Send reply and commit after transaction started */ > >> >+ if (hdr->flags & VHOST_USER_NEED_REPLY_MASK) { > >> >+ payload->u64 = !!ret; > >> >+ hdr->size = sizeof(payload->u64); > >> >+ if (!vhost_user_send_resp(ioc, hdr, payload, &local_err)) { > >> >+ error_report_err(local_err); > >> >+ memory_region_transaction_commit(); > >> >+ return -EFAULT; > >> >+ } > >> >+ } > >> >+ memory_region_transaction_commit(); > >> > >> Sorry to be late, I did a quick review, my only doubts is here, maybe it > >> was already discussed, but why do we commit after responding to the > >> backend? > >> > >> Should we do it first to prevent the backend from “seeing” something > >> that hasn't been committed yet? > > > >There is a race that leads to a deadlock. hw/virtio/vhost.c has a > >MemoryListener that sends VHOST_USER_SET_MEM_TABLE messages in its > >.commit() callback. If this happens before the reply, the backend will > >not process it as it is stuck waiting for the SHMEM reply, and the > >handler in qemu will not send it as it is waiting for the reply to the > >SET_MEM_TABLE. So we have to delay the transaction commit to > >immediately after the reply. > > Okay, I see now that you mentioned that in the commit description, > great, I should have read it more carefully! > IMO it would be worth adding a comment here, but I definitely won't ask > you to send a v11 for this! (maybe a followup patch later). > > > > >> > >> Also, if vhost_user_send_resp() fails, should we call > >> virtio_del_shmem_map()? > > > >If vhost_user_send_resp() fails, the connection with the backend is > >closed, so the mappings will indeed never be removed unless we reset. > > > >Maybe better than removing the single mapping, would be to loop > >through mappings in the shared memory and clean them all (same we do : > > > >``` > > while (!QTAILQ_EMPTY(&shmem->mmaps)) { > > VirtioSharedMemoryMapping *mapping = QTAILQ_FIRST(&shmem->mmaps); > > virtio_del_shmem_map(shmem, mapping->offset, mapping->mr->size); > > } > >``` > > > >But since a backend may utilize more than one shared memory region, > >and we do not know the mapping between a given backend and its shared > >memories, whatever we do will be incomplete (?). > > I don't know if this is the right place to do this kind of cleanup, > maybe further up in the stack. > > > >I think the only > >solution after this happens is to reset (virtio_reset) to remove all > >mappings from the all shared regions, and re-establish the backend > >channel (is it possible?). Even if the channel cannot be restablished, > >I wouldn't bother just removing one mapping, I would assume it needs a > >reset. > > So, in conclusion, we are saying that if we can no longer communicate > with the backend, there is no point in maintaining a consistent state, > because we have to reset the device anyway.
I guess what I'm saying after looking at the issue you raised (which is reasonable and founded) is that the is no good option to ensure we go back to a consistent state unless we reset. > Are we already doing this, or should we be doing it? I don't think we are? What I do not know is if we should. Probably yes. I feel we should at least set the broken flag to true in case of an error: dev->vdev->broken = true; Looking at virtio/virtio.h: `bool broken; /* device in invalid state, needs reset */` I can send a separate patch. > > BTW, I don't want to stop this series, I just found this error path > strange. On the contrary, thanks for taking a look! It is better to clear any potential issues before merging. Even if it needs a new version. > > Thanks, > Stefano >
