* David Hildenbrand (da...@redhat.com) wrote:
> This series is fully reviewed by Peter and I hope we can get either more
> review feedback or get it merged via the migration tree soonish. Thanks.

Yep, I think that's a full set now; we should take this via migration.

Dave

> ---
>
> virtio-mem exposes a dynamic amount of memory within RAMBlocks by
> coordinating with the VM. Memory within a RAMBlock can either get
> plugged and consequently used by the VM, or unplugged and consequently no
> longer used by the VM. Logical unplug is realized by discarding the
> physical memory backing for virtual memory ranges, similar to memory
> ballooning.
>
> However, important differences to virtio-balloon are:
>
> a) A virtio-mem device only operates on its assigned memory region /
>    RAMBlock ("device memory")
> b) Initially, all device memory is logically unplugged
> c) Virtual machines will never accidentally reuse memory that is currently
>    logically unplugged. The spec defines most accesses to unplugged memory
>    as "undefined behavior" -- except reading unplugged memory, which is
>    currently expected to work, but that will change in the future.
> d) The (un)plug granularity is in the range of megabytes -- "memory blocks"
> e) The state (plugged/unplugged) of a memory block is always known and
>    properly tracked.
>
> Whenever memory blocks within the RAMBlock get (un)plugged, changes are
> communicated via the RamDiscardManager to other QEMU subsystems, most
> prominently vfio which updates the DMA mapping accordingly. "Unplugging"
> corresponds to "discarding" and "plugging" corresponds to "populating".
>
> While migrating (precopy/postcopy), the state of such memory blocks cannot
> change, as virtio-mem will reject any guest requests that would change
> the state of blocks with "busy". We don't want to migrate such logically
> unplugged memory, because it can result in unintended memory consumption
> both on the source (when reading memory from some memory backends) and on
> the destination (when writing memory). Further, migration time can be
> heavily reduced when skipping logically unplugged blocks and we avoid
> populating unnecessary page tables in Linux.
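
To make the plugged/unplugged tracking and the "replay discarded ranges"
idea above concrete, here is a minimal standalone sketch. It is not the
QEMU implementation: DemoRegion, DemoReplayFn, demo_replay_discarded() and
demo_sum_discarded() are hypothetical names invented for illustration, and
the per-block state is a plain bool array rather than whatever virtio-mem
uses internally. It only shows the shape of the technique: walk the block
states and hand each maximal run of unplugged blocks to a callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device memory of one virtio-mem device. */
typedef struct {
    uint64_t block_size;  /* (un)plug granularity in bytes */
    size_t nr_blocks;     /* number of memory blocks in the region */
    const bool *plugged;  /* plugged[i] is true if block i is populated */
} DemoRegion;

/* Called once per contiguous discarded (unplugged) range. */
typedef void (*DemoReplayFn)(uint64_t offset, uint64_t size, void *opaque);

/*
 * Report every maximal run of unplugged blocks to fn.
 * Returns the number of ranges reported.
 */
static size_t demo_replay_discarded(const DemoRegion *r, DemoReplayFn fn,
                                    void *opaque)
{
    size_t ranges = 0;
    size_t i = 0;

    assert(r && r->plugged && r->block_size != 0);

    while (i < r->nr_blocks) {
        if (r->plugged[i]) {
            i++;
            continue;
        }
        /* Found the start of a discarded run; extend it as far as possible. */
        size_t start = i;
        while (i < r->nr_blocks && !r->plugged[i]) {
            i++;
        }
        if (fn) {
            fn(start * r->block_size, (i - start) * r->block_size, opaque);
        }
        ranges++;
    }
    return ranges;
}

/* Example callback: accumulate the total number of discarded bytes. */
static void demo_sum_discarded(uint64_t offset, uint64_t size, void *opaque)
{
    (void)offset;
    *(uint64_t *)opaque += size;
}
```

A consumer such as migration code would pass its own callback here, for
example one that excludes each reported range from the migration stream.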
>
> Right now, virtio-mem reuses the free page hinting infrastructure during
> precopy to exclude all logically unplugged ("discarded") parts from the
> migration stream. However, there are some scenarios that are not handled
> properly and need fixing. Further, there are some ugly corner cases in
> postcopy code and background snapshotting code that similarly have to
> handle such special RAMBlocks.
>
> Let's reuse the RamDiscardManager infrastructure to essentially handle
> precopy, postcopy and background snapshots cleanly, which means:
>
> a) In precopy code, fixing up the initial dirty bitmaps (in the RAMBlock
>    and e.g., KVM) to exclude discarded ranges.
> b) In postcopy code, placing a zeropage when requested to handle a page
>    falling into a discarded range -- because the source will never send it.
>    Further, fix up the dirty bitmap when overwriting it in recovery mode.
> c) In background snapshot code, never populating discarded ranges, not even
>    with the shared zeropage, to avoid unintended memory consumption,
>    especially in the future with hugetlb and shmem.
>
> Detail: When realizing a virtio-mem device, it will register the RAM
>         for migration via vmstate_register_ram(). Further, it will
>         set itself as the RamDiscardManager for the corresponding memory
>         region of the RAMBlock via memory_region_set_ram_discard_manager().
>         Last but not least, memory device code will actually map the
>         memory region into guest physical address space. So migration
>         code can always properly identify such RAMBlocks.
>
> Tested with precopy/postcopy on shmem, where even reading unpopulated
> memory ranges will populate actual memory and not the shared zeropage.
> Tested with background snapshots on anonymous memory, because other
> backends are not supported yet with upstream Linux.
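
As a rough illustration of points a) and b) above, the following sketch
clears the dirty flags of one discarded range and rounds a faulting address
down to its host page. It is only a toy model under stated assumptions:
demo_exclude_discarded(), demo_round_down() and DEMO_PAGE_SIZE are
hypothetical names, the dirty "bitmap" is a bool array instead of a real
bitmap, and ranges are assumed to be page aligned. The actual series works
on the RAMBlock and KVM bitmaps through QEMU's own helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed target page size for this sketch only. */
#define DEMO_PAGE_SIZE 4096u

/* Round addr down to a multiple of align, which must be a power of two. */
static uint64_t demo_round_down(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/*
 * Precopy-style fix-up: clear the dirty flag of every page that lies in the
 * discarded range [offset, offset + size), so that it is never sent.
 * Returns how many pages were actually cleared.
 */
static size_t demo_exclude_discarded(bool *dirty, size_t nr_pages,
                                     uint64_t offset, uint64_t size)
{
    size_t first = offset / DEMO_PAGE_SIZE;
    size_t last = (offset + size) / DEMO_PAGE_SIZE; /* exclusive */
    size_t cleared = 0;

    if (last > nr_pages) {
        last = nr_pages;
    }
    for (size_t p = first; p < last; p++) {
        if (dirty[p]) {
            dirty[p] = false;
            cleared++;
        }
    }
    return cleared;
}
```

On the destination, a request for a page inside a discarded range would
first be aligned with something like demo_round_down(addr, host_page_size)
and then be satisfied with a zero page, since the source will never send it.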
>
>
> v5 -> v6:
> - Rebased and added ACKs
>
> v4 -> v5:
> - "migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the
>   destination"
> -- Use ROUND_DOWN and fix compile warning on 32 bit
> -- Use int128_make64() instead of wrongly int128_get64()
> - "migration: Simplify alignment and alignment checks"
> -- Use ROUND_DOWN where possible instead of QEMU_ALIGN_DOWN and fix
>    compilation warning on 32 bit
> - "migration/ram: Factor out populating pages readable in
>   ram_block_populate_pages()"
> -- Rename functions, add a comment.
> - "migration/ram: Handle RAMBlocks with a RamDiscardManager on background
>   snapshots"
> -- Adjust to changed function names
>
> v3 -> v4:
> - Added ACKs
> - "migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the
>   destination"
> -- Use QEMU_ALIGN_DOWN() to align to ram pagesize
> - "migration: Simplify alignment and alignment checks"
> -- Added
> - "migration/ram: Factor out populating pages readable in
>   ram_block_populate_pages()"
> -- Added
> - "migration/ram: Handle RAMBlocks with a RamDiscardManager on background
>   snapshots"
> -- Simplified due to factored out code
>
> v2 -> v3:
> - "migration/ram: Don't passs RAMState to
>   migration_clear_memory_region_dirty_bitmap_*()"
> -- Added to make the next patch easier to implement
> - "migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration
>   source"
> -- Fixup the dirty bitmaps only initially and during postcopy recovery,
>    not after every bitmap sync. Also properly clear the dirty bitmaps e.g.,
>    in KVM. [Peter]
> - "migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the
>   destination"
> -- Take care of proper host-page alignment [Peter]
>
> v1 -> v2:
> - "migration/ram: Handle RAMBlocks with a RamDiscardManager on the
>   migration source"
> -- Added a note how it interacts with the clear_bmap and what we might want
>    to further optimize in the future when synchronizing bitmaps.
>
> Cc: "Michael S. Tsirkin" <m...@redhat.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Juan Quintela <quint...@redhat.com>
> Cc: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> Cc: Eduardo Habkost <ehabk...@redhat.com>
> Cc: Peter Xu <pet...@redhat.com>
> Cc: Andrey Gruzdev <andrey.gruz...@virtuozzo.com>
> Cc: Marek Kedzierski <mkedz...@redhat.com>
> Cc: Wei Yang <richard.weiy...@linux.alibaba.com>
> Cc: teawater <teawat...@linux.alibaba.com>
> Cc: Alex Williamson <alex.william...@redhat.com>
> Cc: Pankaj Gupta <pankaj.gu...@cloud.ionos.com>
> Cc: Philippe Mathieu-Daudé <phi...@redhat.com>
>
> David Hildenbrand (9):
>   memory: Introduce replay_discarded callback for RamDiscardManager
>   virtio-mem: Implement replay_discarded RamDiscardManager callback
>   migration/ram: Don't passs RAMState to
>     migration_clear_memory_region_dirty_bitmap_*()
>   migration/ram: Handle RAMBlocks with a RamDiscardManager on the
>     migration source
>   virtio-mem: Drop precopy notifier
>   migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the
>     destination
>   migration: Simplify alignment and alignment checks
>   migration/ram: Factor out populating pages readable in
>     ram_block_populate_pages()
>   migration/ram: Handle RAMBlocks with a RamDiscardManager on background
>     snapshots
>
>  hw/virtio/virtio-mem.c         |  92 ++++++++++-------
>  include/exec/memory.h          |  21 ++++
>  include/hw/virtio/virtio-mem.h |   3 -
>  migration/migration.c          |   6 +-
>  migration/postcopy-ram.c       |  40 ++++++--
>  migration/ram.c                | 180 +++++++++++++++++++++++++++++----
>  migration/ram.h                |   1 +
>  softmmu/memory.c               |  11 ++
>  8 files changed, 284 insertions(+), 70 deletions(-)
>
> --
> 2.31.1
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK