On Tue, May 28, 2024 at 11:10:16AM -0400, Steven Sistare wrote: > On 5/27/2024 2:31 PM, Peter Xu wrote: > > On Mon, Apr 29, 2024 at 08:55:17AM -0700, Steve Sistare wrote: > > > Define VMSTATE_VOID_PTR so the value of a pointer (but not its target) > > > can be saved in the migration stream. This will be needed for CPR. > > > > > > Signed-off-by: Steve Sistare <steven.sist...@oracle.com> > > > > This is really tricky. > > > > From a first glance, I don't think migrating a VA is valid at all for > > migration even if with exec.. and looks insane to me for a cross-process > > migration, which seems to be allowed to use as a generic VMSD helper.. as > > VA is the address space barrier for different processes and I think it > > normally even apply to generic execve(), and we're trying to jailbreak for > > some reason.. > > > > It definitely won't work for any generic migration as sizeof(void*) can be > > different afaict between hosts, e.g. 32bit -> 64bit migrations. > > > > Some description would be really helpful in this commit message, > > e.g. explain the users and why. Do we need to poison that for generic VMSD > > use (perhaps with prefixed underscores)? I think I'll need to read on the > > rest to tell.. > > Short answer: we never dereference the void* in the new process. And must > not. > > Longer answer: > > During CPR for vfio, each mapped DMA region is re-registered in the new > process using the new VA. The ioctl to re-register identifies the mapping > by IOVA and length. > > The same requirement holds for CPR of iommufd devices. However, in the > iommufd framework, IOVA does not uniquely identify a dma mapping, and we > need to use the old VA as the unique identifier. The new process > re-registers each mapping, passing the old VA and new VA to the kernel. > The old VA is never dereferenced in the new process, we just need its value. > > I suspected that the void* which must not be dereferenced might make people > uncomfortable. I have an older version of my code which adds a uint64_t > field to RAMBlock for recording and migrating the old VA. The saving and > loading code is slightly less elegant, but no big deal. Would you prefer > that?
I see, thanks for explaining. Yes that sounds better to me. Re the ugliness: is that about a pre_save() plus one extra uint64_t field? In that case it looks better comparing to migrating "void*". I'm trying to read some context on the vaddr remap thing from you, and I found this: https://lore.kernel.org/all/y90bvbnrvraceq%2f...@nvidia.com/ So it will work with iommufd now? Meanwhile, what's the status for mdev? Looks like it isn't supported yet for both. Thanks, -- Peter Xu