On Tue, May 28, 2024 at 11:10:16AM -0400, Steven Sistare wrote:
> On 5/27/2024 2:31 PM, Peter Xu wrote:
> > On Mon, Apr 29, 2024 at 08:55:17AM -0700, Steve Sistare wrote:
> > > Define VMSTATE_VOID_PTR so the value of a pointer (but not its target)
> > > can be saved in the migration stream.  This will be needed for CPR.
> > > 
> > > Signed-off-by: Steve Sistare <steven.sist...@oracle.com>
> > 
> > This is really tricky.
> > 
> >  From a first glance, I don't think migrating a VA is valid at all for
> > migration even if with exec.. and looks insane to me for a cross-process
> > migration, which seems to be allowed to use as a generic VMSD helper.. as
> > VA is the address space barrier for different processes and I think it
> > normally even apply to generic execve(), and we're trying to jailbreak for
> > some reason..
> > 
> > It definitely won't work for any generic migration as sizeof(void*) can be
> > different afaict between hosts, e.g. 32bit -> 64bit migrations.
> > 
> > Some description would be really helpful in this commit message,
> > e.g. explain the users and why.  Do we need to poison that for generic VMSD
> > use (perhaps with prefixed underscores)?  I think I'll need to read on the
> > rest to tell..
> 
> Short answer: we never dereference the void* in the new process.  And must 
> not.
> 
> Longer answer:
> 
> During CPR for vfio, each mapped DMA region is re-registered in the new
> process using the new VA.  The ioctl to re-register identifies the mapping
> by IOVA and length.
> 
> The same requirement holds for CPR of iommufd devices.  However, in the
> iommufd framework, IOVA does not uniquely identify a dma mapping, and we
> need to use the old VA as the unique identifier.  The new process
> re-registers each mapping, passing the old VA and new VA to the kernel.
> The old VA is never dereferenced in the new process, we just need its value.
> 
> I suspected that the void* which must not be dereferenced might make people
> uncomfortable.  I have an older version of my code which adds a uint64_t
> field to RAMBlock for recording and migrating the old VA.  The saving and
> loading code is slightly less elegant, but no big deal.  Would you prefer
> that?

I see, thanks for explaining.  Yes that sounds better to me.  Re the
ugliness: is that about a pre_save() plus one extra uint64_t field?  In
that case it looks better comparing to migrating "void*".

I'm trying to read some context on the vaddr remap thing from you, and I
found this:

https://lore.kernel.org/all/y90bvbnrvraceq%2f...@nvidia.com/

So it will work with iommufd now?  Meanwhile, what's the status for mdev?
Looks like it isn't supported yet for both.

Thanks,

-- 
Peter Xu


Reply via email to