> On 23.12.2016 at 03:50, Li, Liang Z wrote:
> >> While measuring live migration performance for a qemu/kvm guest, it
> >> was observed that QEMU does not maintain any intelligence for the
> >> guest RAM pages released by the guest balloon driver and treats such
> >> pages as any other normal guest RAM pages. This has a direct impact
> >> on the overall migration time for a guest which has released
> >> (ballooned out) memory to the host.
> >>
> >> In the case of large systems, where we can configure large guests
> >> with 1TB and with a considerable amount of memory released by the
> >> balloon driver to the host, the migration time gets worse.
> >>
> >> The solution proposed below is local to QEMU (and does not require
> >> any modification to the Linux kernel or any guest driver). We have
> >> verified the fix for large guests >= 1TB on HPE Superdome X (which
> >> can support up to 240 cores and 12TB of memory).
> >>
> >> During live migration, as part of the first iteration,
> >> ram_save_iterate() -> ram_find_and_save_block() will try to migrate
> >> RAM pages even if they have been released by the virtio-balloon
> >> driver (balloon inflate). Although these pages, which are returned
> >> to the host by the virtio-balloon driver, are zero pages, the
> >> migration algorithm will still end up scanning each entire page via
> >> ram_find_and_save_block() ->
> >> ram_save_page()/ram_save_compressed_page() ->
> >> save_zero_page() -> is_zero_range(). We also end up sending header
> >> information over the network for these pages during migration. This
> >> adds to the total migration time.
> >>
> >> The solution creates a balloon bitmap ramblock as part of
> >> virtio-balloon device initialization. The bits in the balloon bitmap
> >> each represent a guest RAM page of size 1UL << VIRTIO_BALLOON_PFN_SHIFT,
> >> or 4K.
> >> If TARGET_PAGE_BITS <= VIRTIO_BALLOON_PFN_SHIFT, the ram_addr
> >> offset of the dirty page, which is used by the dirty page bitmap
> >> during migration, is checked against the balloon bitmap as is; if
> >> the bit is set in the balloon bitmap, the corresponding RAM page
> >> will be excluded from scanning and from sending header information
> >> during migration. In case TARGET_PAGE_BITS >
> >> VIRTIO_BALLOON_PFN_SHIFT, for a given dirty page ram_addr, all
> >> sub-pages of size 1UL << VIRTIO_BALLOON_PFN_SHIFT must be ballooned
> >> out to avoid the zero page scan and the sending of header
> >> information.
> >>
> >> The bitmap represents the entire guest RAM memory up to the maximum
> >> configured memory. Guest RAM pages claimed by the virtio-balloon
> >> driver are represented by 1 in the bitmap. Since the bitmap is
> >> maintained as a ramblock, it is migrated to the target as part of
> >> the migration's RAM iterative and RAM complete phases, so that
> >> subsequent migrations from the target can continue to use the
> >> optimization.
> >>
> >> A new migration capability called skip-balloon is introduced. The
> >> user can disable the capability in cases where they do not expect
> >> much benefit, or in case the migration is from an older version.
> >>
> >> During live migration setup, the optimization is set to a disabled
> >> state if
> >> . no virtio-balloon device is initialized, or
> >> . the skip-balloon migration capability is disabled, or
> >> . the guest virtio-balloon driver has not set the
> >>   VIRTIO_BALLOON_F_MUST_TELL_HOST flag, which means the guest may
> >>   start using RAM pages freed by the guest balloon driver even
> >>   before the host/QEMU is aware of it. In that case, the
> >>   optimization is disabled so that the RAM pages that are being used
> >>   by the guest will continue to be scanned and migrated.
> >>
> >> The balloon bitmap ramblock size is set to zero if the optimization
> >> is disabled, to avoid the overhead of migrating the bitmap.
> >> If the bitmap is not migrated to the target, the destination starts
> >> with a fresh bitmap and tracks the ballooning operations thereafter.
> >>
> >
> > I have a better way to get rid of the bitmap.
> > We should not maintain the inflated pages in the bitmap; instead, we
> > can get them from the guest when needed, just like what we did for
> > the guest's unused pages. Then we can combine the inflated page info
> > with the unused page info and skip them both during live migration.
> >
>
> If we want to actually host enforce (disallow) access to inflated
> pages, having such a bitmap in QEMU is required. Getting them from the
> guest doesn't make sense in that context anymore.

Is that a new feature?

Even if a bitmap is required, we should avoid sending it in the stop &
copy stage.

Thanks!
Liang

>
> Thanks,
> David