* Juan Quintela (quint...@redhat.com) wrote:
> "Dr. David Alan Gilbert (git)" <dgilb...@redhat.com> wrote:
> > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> >
> > postcopy_place_page (etc) provide a way for postcopy to place a page
> > into guests memory atomically (using the copy ioctl on the ufd).
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> 
> > --- a/include/migration/postcopy-ram.h
> > +++ b/include/migration/postcopy-ram.h
> > @@ -69,4 +69,20 @@ void postcopy_discard_send_range(MigrationState *ms, PostcopyDiscardState *pds,
> >  void postcopy_discard_send_finish(MigrationState *ms,
> >                                    PostcopyDiscardState *pds);
> >  
> > +/*
> > + * Place a page (from) at (host) efficiently
> > + *    There are restrictions on how 'from' must be mapped, in general best
> > + *    to use other postcopy_ routines to allocate.
> > + * returns 0 on success
> > + */
> > +int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
> > +                        bool all_zero);
> > +
> > +/*
> > + * Allocate a page of memory that can be mapped at a later point in time
> > + * using postcopy_place_page
> > + * Returns: Pointer to allocated page
> > + */
> > +void *postcopy_get_tmp_page(MigrationIncomingState *mis);
> > +
> 
> This may not make sense, but wouldn't it have been a good idea to ask
> for the address that we want as a hint?  That could help with
> fragmentation, no?

I think we may be able to do something if we were to transmit huge
pages (which is a separate problem); but at the moment all get_tmp_page
does is an mmap with the right set of flags, and that mmap only happens
once for all the pages; it's only the backing page that gets moved, so
the one mapping is reused for the whole run.

> > +/*
> > + * Place a host page (from) at (host) atomically
> > + * all_zero: Hint that the page being placed is 0 throughout
> > + * returns 0 on success
> > + */
> > +int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
> > +                        bool all_zero)
> 
> postcopy_place_page() and postcopy_place_zero_page()?  They just share a
> trace point :p

Done.
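
For reference, the split ends up looking roughly like this (sketch only;
the final function and trace-point names in the respin may differ):

/*
 * Sketch of the zero-page half after the split: the all_zero branch of
 * postcopy_place_page() moves here and each variant gets its own trace
 * point.  Names are illustrative.
 */
int postcopy_place_page_zero(MigrationIncomingState *mis, void *host)
{
    struct uffdio_zeropage zero_struct;

    zero_struct.range.start = (uint64_t)(uintptr_t)host;
    zero_struct.range.len = getpagesize();
    zero_struct.mode = 0;

    if (ioctl(mis->userfault_fd, UFFDIO_ZEROPAGE, &zero_struct)) {
        int e = errno;
        error_report("%s: %s zero host: %p", __func__, strerror(e), host);
        return -e;
    }

    trace_postcopy_place_page_zero(host);
    return 0;
}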

> > +{
> > +    if (!all_zero) {
> > +        struct uffdio_copy copy_struct;
> > +
> > +        copy_struct.dst = (uint64_t)(uintptr_t)host;
> > +        copy_struct.src = (uint64_t)(uintptr_t)from;
> > +        copy_struct.len = getpagesize();
> > +        copy_struct.mode = 0;
> > +
> > +        /* copy also acks to the kernel waking the stalled thread up
> > +         * TODO: We can inhibit that ack and only do it if it was requested
> > +         * which would be slightly cheaper, but we'd have to be careful
> > +         * of the order of updating our page state.
> > +         */
> > +        if (ioctl(mis->userfault_fd, UFFDIO_COPY, &copy_struct)) {
> > +            int e = errno;
> > +            error_report("%s: %s copy host: %p from: %p",
> > +                         __func__, strerror(e), host, from);
> > +
> > +            return -e;
> > +        }
> > +    } else {
> > +        struct uffdio_zeropage zero_struct;
> > +
> > +        zero_struct.range.start = (uint64_t)(uintptr_t)host;
> > +        zero_struct.range.len = getpagesize();
> > +        zero_struct.mode = 0;
> > +
> > +        if (ioctl(mis->userfault_fd, UFFDIO_ZEROPAGE, &zero_struct)) {
> > +            int e = errno;
> > +            error_report("%s: %s zero host: %p from: %p",
> > +                         __func__, strerror(e), host, from);
> > +
> > +            return -e;
> > +        }
> > +    }
> > +
> > +    trace_postcopy_place_page(host, all_zero);
> > +    return 0;
> > +}
> 
> I really think that the userfault code should be in a Linux-specific
> file, but that can be done later, so I will not insist O:-)

I think it will make sense once we have another OS's view of what the
interface should look like; then we can come up with an abstraction that
works for both and move the implementations out.
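
Once that happens I'd imagine the non-Linux side being little more than
stubs behind the same prototypes, something like this (purely
illustrative, not part of this series):

#include <errno.h>

/*
 * Purely illustrative: a no-userfaultfd fallback kept behind the same
 * prototypes so that callers need no #ifdefs of their own.
 */
#ifndef __linux__
int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
                        bool all_zero)
{
    error_report("%s: no userfaultfd support on this host", __func__);
    return -ENOSYS;
}

void *postcopy_get_tmp_page(MigrationIncomingState *mis)
{
    return NULL;
}
#endif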

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
