On Tue, Feb 25, 2020 at 08:44:56AM +0100, David Hildenbrand wrote:
> On 24.02.20 23:49, Peter Xu wrote:
> > On Fri, Feb 21, 2020 at 05:42:04PM +0100, David Hildenbrand wrote:
> >> When we partially change mappings (esp., mmap over parts of an existing
> >> mmap like qemu_ram_remap() does) where we have a userfaultfd handler
> >> registered, the handler will implicitly be unregistered from the parts that
> >> changed.
> >>
> >> Trying to place pages onto mappings where there is no longer a handler
> >> registered will fail. Let's make sure that any waiter is woken up - we
> >> have to do that manually.
> >>
> >> Let's also document how UFFDIO_UNREGISTER will handle this scenario.
> >>
> >> This is mainly a preparation for RAM blocks with resizable allocations,
> >> where the mapping of the invalid RAM range will change. The source will
> >> keep sending pages that are outside of the new (shrunk) RAM size. We have
> >> to treat these pages like they would have been migrated, but can
> >> essentially simply drop the content (ignore the placement error).
> >>
> >> Keep printing a warning on EINVAL, to avoid hiding other (programming)
> >> issues. ENOENT is unique.
> >>
> >> Cc: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> >> Cc: Juan Quintela <quint...@redhat.com>
> >> Cc: Peter Xu <pet...@redhat.com>
> >> Cc: Andrea Arcangeli <aarca...@redhat.com>
> >> Signed-off-by: David Hildenbrand <da...@redhat.com>
> >> ---
> >>  migration/postcopy-ram.c | 37 +++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 37 insertions(+)
> >>
> >> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> >> index c68caf4e42..f023830b9a 100644
> >> --- a/migration/postcopy-ram.c
> >> +++ b/migration/postcopy-ram.c
> >> @@ -506,6 +506,12 @@ static int cleanup_range(RAMBlock *rb, void *opaque)
> >>      range_struct.start = (uintptr_t)host_addr;
> >>      range_struct.len = length;
> >>  
> >> +    /*
> >> +     * In case the mapping was partially changed since we enabled userfault
> >> +     * (e.g., via qemu_ram_remap()), the userfaultfd handler was already removed
> >> +     * for the mappings that changed. Unregistering will, however, still work
> >> +     * and ignore mappings without a registered handler.
> >> +     */
> > 
> > Ideally we should still only unregister what we have registered.
> > After all we do have this information because we know what we
> > registered, we know what has unmapped (in your new resize() hook, when
> > postcopy_state==RUNNING).
> 
> Not in the case of qemu_ram_remap(). And whatever you propose will
> require synchronization (see my other mail) and more complicated
> handling than this. uffd allows you to handle races with mmap changes in
> a very elegant way (e.g., -ENOENT, or unregister ignoring changed mappings).
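
To make the -ENOENT handling discussed above concrete, here is a minimal
sketch of how a placement error could be classified. The enum and the
helper name are hypothetical and not taken from the patch, and treating
EINVAL as a hard failure after the warning is an assumption:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical outcome of one page placement attempt. */
enum place_result {
    PLACE_OK,       /* page was placed */
    PLACE_DROPPED,  /* mapping changed, no handler left: drop the page */
    PLACE_FAILED    /* anything else: report it to the caller */
};

/*
 * Hypothetical helper: 'ret' is the return value of the placement ioctl
 * (e.g. UFFDIO_COPY) and 'err' is the errno captured right after it.
 */
static enum place_result classify_place_error(int ret, int err)
{
    if (ret == 0) {
        return PLACE_OK;
    }
    assert(err != 0); /* a failed ioctl is expected to have set errno */
    if (err == ENOENT) {
        /* Handler is gone for this range; waiters must be woken manually. */
        return PLACE_DROPPED;
    }
    if (err == EINVAL) {
        /* Stay loud here so that programming errors are not hidden. */
        fprintf(stderr, "warning: page placement failed with EINVAL\n");
    }
    return PLACE_FAILED;
}
```

A caller would treat PLACE_DROPPED like a successfully migrated page and
ignore the content, which matches the behaviour the commit message
describes for pages outside the shrunk RAM size.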

All writers to the new postcopy_min_length should already hold the BQL.
The only one left is the last cleanup_range(), where we can take the BQL
for a while.  However...

> 
> > 
> > An extreme example is when we register with pages in range [A, B),
> > then shrink it to [A, C), then map something else within [C, B)
> > (note, with virtio-mem logically B can be very big and C can be very
> > small, which means [C, B) can cover quite some address space). Then if:
> > 
> >   - [C, B) memory type is not compatible with uffd, or
> 
> That will never happen in the near future. Without resizable allocations:
> - All memory is either anonymous or from a single fd
> 
> In addition, right now, only anonymous memory can be used for resizable
> RAM. However, with resizable allocations we could have:
> - All used_length memory is either anonymous or from a single fd
> - All remaining memory is either anonymous or from a single fd
> 
> Everything else does not make any sense IMHO and I don't think this is
> relevant long term. You cannot arbitrarily map things into the
> used_length part of a RAMBlock. That would contradict its page_size
> and its fd. E.g., you would break qemu_ram_remap().

... I think this persuaded me. :) You are right, they can still be
protected until max_length with PROT_NONE.  Would you mind adding some of
the above to the comment above the uffd unregister?
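
For reference, the PROT_NONE idea could be sketched roughly as below.
This is only an illustration under stated assumptions, not how QEMU
implements it: block_reserve() and block_resize() are hypothetical names,
the memory is anonymous, and all lengths are assumed to be multiples of
the page size.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical: reserve max_len bytes of address space as PROT_NONE and
 * make only the first used_len bytes accessible. Returns NULL on failure.
 */
static void *block_reserve(size_t max_len, size_t used_len)
{
    void *base;

    assert(used_len <= max_len);
    base = mmap(NULL, max_len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }
    if (used_len && mprotect(base, used_len, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, max_len);
        return NULL;
    }
    return base;
}

/*
 * Hypothetical: move the accessible boundary from old_used to new_used.
 * Everything in [new_used, max_len) stays reserved but inaccessible, so
 * nothing else can be mapped into that part of the block by accident.
 */
static int block_resize(void *base, size_t old_used, size_t new_used,
                        size_t max_len)
{
    char *p = base;

    assert(new_used <= max_len);
    if (new_used < old_used) {
        return mprotect(p + new_used, old_used - new_used, PROT_NONE);
    }
    if (new_used > old_used) {
        return mprotect(p + old_used, new_used - old_used,
                        PROT_READ | PROT_WRITE);
    }
    return 0;
}
```

The point of the reservation is that the whole [0, max_len) range keeps
belonging to the block even while only used_length of it is accessible.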

Thanks,

-- 
Peter Xu