On Thu, Jul 27, 2017 at 10:27:41AM +0300, Alexey Perevalov wrote: > On 07/27/2017 05:35 AM, Peter Xu wrote: > >On Wed, Jul 26, 2017 at 06:24:11PM +0300, Alexey Perevalov wrote: > >>On 07/26/2017 11:43 AM, Peter Xu wrote: > >>>On Wed, Jul 26, 2017 at 11:07:17AM +0300, Alexey Perevalov wrote: > >>>>On 07/26/2017 04:49 AM, Peter Xu wrote: > >>>>>On Thu, Jul 20, 2017 at 09:52:34AM +0300, Alexey Perevalov wrote: > >>>>>>This patch adds ability to track down already received > >>>>>>pages, it's necessary for calculation vCPU block time in > >>>>>>postcopy migration feature, maybe for restore after > >>>>>>postcopy migration failure. > >>>>>>Also it's necessary to solve shared memory issue in > >>>>>>postcopy livemigration. Information about received pages > >>>>>>will be transferred to the software virtual bridge > >>>>>>(e.g. OVS-VSWITCHD), to avoid fallocate (unmap) for > >>>>>>already received pages. fallocate syscall is required for > >>>>>>remmaped shared memory, due to remmaping itself blocks > >>>>>>ioctl(UFFDIO_COPY, ioctl in this case will end with EEXIT > >>>>>>error (struct page is exists after remmap). > >>>>>> > >>>>>>Bitmap is placed into RAMBlock as another postcopy/precopy > >>>>>>related bitmaps. > >>>>>> > >>>>>>Reviewed-by: Peter Xu <pet...@redhat.com> > >>>>>>Signed-off-by: Alexey Perevalov <a.pereva...@samsung.com> > >>>>>>--- > >>>>>[...] > >>>>> > >>>>>> static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr, > >>>>>>- void *from_addr, uint64_t pagesize) > >>>>>>+ void *from_addr, uint64_t pagesize, > >>>>>>RAMBlock *rb) > >>>>>> { > >>>>>>+ int ret; > >>>>>> if (from_addr) { > >>>>>> struct uffdio_copy copy_struct; > >>>>>> copy_struct.dst = (uint64_t)(uintptr_t)host_addr; > >>>>>> copy_struct.src = (uint64_t)(uintptr_t)from_addr; > >>>>>> copy_struct.len = pagesize; > >>>>>> copy_struct.mode = 0; > >>>>>>- return ioctl(userfault_fd, UFFDIO_COPY, ©_struct); > >>>>>>+ ret = ioctl(userfault_fd, UFFDIO_COPY, ©_struct); > >>>>>> } else { > >>>>>> struct uffdio_zeropage zero_struct; > >>>>>> zero_struct.range.start = (uint64_t)(uintptr_t)host_addr; > >>>>>> zero_struct.range.len = pagesize; > >>>>>> zero_struct.mode = 0; > >>>>>>- return ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct); > >>>>>>+ ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct); > >>>>>>+ } > >>>>>>+ if (!ret) { > >>>>>>+ ramblock_recv_bitmap_set(host_addr, rb); > >>>>>Wait... > >>>>> > >>>>>Now we are using 4k-page/bit bitmap, do we need to take care of the > >>>>>huge pages here? Looks like we are only setting the first bit of it > >>>>>if it is a huge page? > >>>>First version was per ramblock page size, IOW bitmap was smaller in > >>>>case of hugepages. > >>>Yes, but this is not the first version any more. :) > >>> > >>>This patch is using: > >>> > >>> bitmap_new(rb->max_length >> TARGET_PAGE_BITS); > >>> > >>>to allocate bitmap, so it is using small pages always for bitmap, > >>>right? (I should not really say "4k" pages, here I think the size is > >>>host page size, which is the thing returned from getpagesize()). > >>> > >>>>You mentioned that TARGET_PAGE_SIZE is reasonable for precopy case, > >>>>in "Re: [Qemu-devel] [PATCH v1 2/2] migration: add bitmap for copied page" > >>>>I though TARGET_PAGE_SIZE as transmition unit, is using in precopy even > >>>>hugepage case. > >>>>But it's not so logically, page being marked as dirty, should be sent as a > >>>>whole page. > >>>Sorry if I misunderstood, but I didn't see anything wrong - we are > >>>sending pages in small pages, but when postcopy is there, we do > >>>UFFDIO_COPY in huge page, so everything is fine? > >>I think yes, we chose TARGET_PAGE_SIZE because of wider > >>use case ranges. > >So... are you going to post another version? IIUC we just need to use > >a bitmap_set() to replace the ramblock_recv_bitmap_set(), while set > >the size with "pagesize / TARGET_PAGE_SIZE"? > From my point of view TARGET_PAGE_SIZE/TARGET_PAGE_BITS it's a platform > specific > > and it used in ram_load to copy to buffer so it's more preferred for bitmap > size > and I'm not going to replace ramblock_recv_bitmap_set helper - it calculates > offset. > > > > >(I think I was wrong when saying getpagesize() above: the small page > > should be target page size, while the huge page should be the host's) > I think we should forget about huge page case in "received bitmap" > concept, maybe in "uffd_copied bitmap" it was reasonable ;)
Again, I am not sure I got the whole idea of the reply... However, I do think when we UFFDIO_COPY a huge page, then we should do bitmap_set() on the received bitmap for the whole range that the huge page covers. IMHO, the bitmap is defined as "one bit per small page", and the small page size is TARGET_PAGE_SIZE. We cannot just assume that "as long as the first bit of the huge page is set, all the small pages in the huge page are set". Thanks, -- Peter Xu