On 7/30/2020 11:14 AM, Steve Sistare wrote:
> Anonymous memory segments used by the guest are preserved across a re-exec
> of qemu, mapped at the same VA, via a proposed madvise(MADV_DOEXEC) option
> in the Linux kernel. For the madvise patches, see:
>
> https://lore.kernel.org/lkml/1595869887-23307-1-git-send-email-anthony.yzn...@oracle.com/
>
> Signed-off-by: Steve Sistare <steven.sist...@oracle.com>
> ---
>  include/qemu/osdep.h | 7 +++++++
>  1 file changed, 7 insertions(+)

Hi Alex,

The MADV_DOEXEC functionality, which is a prerequisite for the entire
qemu live update series, is getting a chilly reception on lkml.

We could instead create guest memory using memfd_create and preserve
the fd across exec. However, the subsequent mmap(fd) will return a
different VA than was used previously, which is a problem for memory
that was registered with vfio, as the original VA is remembered in the
kernel struct vfio_dma and used in various kernel functions, such as
vfio_iommu_replay.

To fix this, we could provide a VFIO_IOMMU_REMAP_DMA ioctl taking iova,
size, and new_vaddr. The implementation finds an exact match for
(iova, size) and replaces vaddr with new_vaddr. Flags cannot be
changed.

memfd_create plus VFIO_IOMMU_REMAP_DMA would replace MADV_DOEXEC.
vfio on any form of shared memory (shm, dax, etc) could also be
preserved across exec with shmat/mmap plus VFIO_IOMMU_REMAP_DMA.

What do you think?

- Steve
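
For illustration only, here is a rough userspace sketch of the idea above. It is not a kernel patch. The struct remap_dma_sketch layout, the toy struct dma_entry table, and the helper names (guest_mem_create, guest_mem_map, remap_dma) are all assumptions made up for this example; VFIO_IOMMU_REMAP_DMA is only a proposal and no such ioctl is assumed to exist. The sketch shows guest memory backed by a memfd, which can be mapped again later (possibly at a different VA), and a lookup that requires an exact (iova, size) match before replacing vaddr and leaving flags alone.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical argument block for the proposed ioctl (not a real kernel ABI). */
struct remap_dma_sketch {
    uint64_t iova;
    uint64_t size;
    uint64_t new_vaddr;
};

/* Toy stand-in for the per-mapping record the kernel keeps (cf. struct vfio_dma). */
struct dma_entry {
    uint64_t iova;
    uint64_t size;
    uint64_t vaddr;
    uint32_t flags;
};

/*
 * Find the entry whose (iova, size) matches exactly and replace its vaddr.
 * Flags are never touched. Returns 0 on success, -1 if no exact match exists.
 */
int remap_dma(struct dma_entry *tab, size_t n, const struct remap_dma_sketch *req)
{
    assert(tab != NULL && req != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tab[i].iova == req->iova && tab[i].size == req->size) {
            tab[i].vaddr = req->new_vaddr;
            return 0;
        }
    }
    return -1;
}

/*
 * Create memfd-backed guest memory. MFD_CLOEXEC is deliberately not set,
 * so the descriptor stays open across exec. Returns the fd, or -1 on error.
 */
int guest_mem_create(size_t len)
{
    int fd = memfd_create("guest-ram", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the memfd shared; the kernel picks the VA, so it may differ per call. */
void *guest_mem_map(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

After exec, the new qemu would call guest_mem_map() on the inherited fd and pass the returned address as new_vaddr for each (iova, size) it had registered before. The contents survive because they live in the memfd, not in the old mapping.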