snapshots

David Hildenbrand Fri, 19 Feb 2021 13:21:03 -0800


> On 19.02.2021 at 22:14, David Hildenbrand <dhild...@redhat.com> wrote:
> 
> 
>>> On 19.02.2021 at 22:10, Peter Xu <pet...@redhat.com> wrote:
>>> 
>>> On Fri, Feb 19, 2021 at 03:50:52PM -0500, Peter Xu wrote:
>>> Andrey,
>>> 
>>>> On Fri, Feb 19, 2021 at 09:57:37AM +0300, Andrey Gruzdev wrote:
>>>> For the discards that happen before the snapshot is started, I need to
>>>> dig deeper into the Linux and QEMU virtio-balloon code to get a clear
>>>> picture.
>>> 
>>> Yes, it's very tricky to see how the error could trigger.
>>> 
>>> Consider the sequence below:
>>> 
>>> - Start a guest with init_on_free=1 set and also a virtio-balloon device
>>> 
>>> - Guest frees a page P and zeroes it (since init_on_free=1). Now P contains
>>>   all zeros.
>>> 
>>> - Virtio-balloon reports this page to the host, MADV_DONTNEED is issued,
>>>   and the page is dropped on the host.
>>> 
>>> - Start live snapshot, wr-protect all pages (but not page P, because it's
>>>   currently missing).  Let's call it $SNAPSHOT1.
>>> 
>>> - Guest does alloc_page(__GFP_ZERO), happens to get this page P, and
>>>   returns it to the caller.
>>> 
>>> - So far, page P is still all zero (which is good!), then guest uses page P
>>>   and writes data to it (say, now P has data P1 rather than all zeros).
>>> 
>>> - Live snapshot saves page P, whose content is now P1 rather than all zeros.
>>> 
>>> - Live snapshot completed.  Saved as $SNAPSHOT1.
>>> 
>>> Then, when loading snapshot $SNAPSHOT1, P will contain data P1.  After the
>>> snapshot is loaded, when the guest allocates page P again with
>>> alloc_page(__GFP_ZERO), the guest kernel "thinks" this page is already
>>> all-zero, so memzero() is skipped even though __GFP_ZERO is set.  Page P
>>> (with content P1) is then returned for the alloc_page(__GFP_ZERO) call,
>>> which could break the caller of alloc_page().
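
The sequence above can be replayed with a small toy model. This is only a sketch in Python, not QEMU or kernel code: the class `ToyVM`, its methods, and the 8-byte "pages" are hypothetical names made up for illustration, and the guest's "this page is already zeroed" bookkeeping is reduced to a plain set.

```python
ZERO = bytes(8)  # stand-in for an all-zero page (assumption: 8 bytes is enough here)


class ToyVM:
    """Hypothetical model of guest pages, balloon discards and a uffd-wp snapshot."""

    def __init__(self, names):
        self.data = {n: ZERO for n in names}     # page contents
        self.present = {n: True for n in names}  # False after MADV_DONTNEED
        self.known_zero = set()                  # pages the guest believes are zeroed
        self.wp = set()                          # pages with write protection armed
        self.saved = {}                          # snapshot image being built
        self.saved_known_zero = set()            # guest bookkeeping at the snapshot point

    def guest_write(self, name, payload):
        if name in self.wp:
            # Write fault on a protected page: save the old content first.
            self.saved[name] = self.data[name]
            self.wp.discard(name)
        self.present[name] = True
        self.data[name] = payload
        self.known_zero.discard(name)

    def guest_free(self, name):
        # init_on_free=1: the page is zeroed when it is freed.
        self.data[name] = ZERO
        self.known_zero.add(name)

    def balloon_report(self, name):
        # MADV_DONTNEED: the page is dropped on the host.
        self.present[name] = False

    def snapshot_start(self):
        self.saved = {}
        self.saved_known_zero = set(self.known_zero)
        # Only present pages get write-protected; missing pages are skipped.
        self.wp = {n for n in self.data if self.present[n]}

    def snapshot_finish(self):
        # Save every page that was not already saved by a write fault.
        for name, content in self.data.items():
            self.saved.setdefault(name, content)
        return dict(self.saved), set(self.saved_known_zero)

    def guest_alloc_zeroed(self, name):
        # alloc_page(__GFP_ZERO): zeroing is skipped for pages believed to be zero.
        if name not in self.known_zero:
            self.data[name] = ZERO
        self.known_zero.discard(name)
        return self.data[name]


def run_scenario():
    vm = ToyVM(["P", "Q"])
    vm.guest_write("Q", b"QQQQQQQQ")
    vm.guest_free("P")
    vm.balloon_report("P")
    vm.snapshot_start()               # P is missing, so it is not protected
    vm.guest_alloc_zeroed("P")
    vm.guest_write("P", b"P1P1P1P1")  # no write fault, nothing saved
    image, known_zero = vm.snapshot_finish()

    restored = ToyVM(list(image))     # load $SNAPSHOT1
    restored.data = dict(image)
    restored.known_zero = set(known_zero)
    return image["P"], restored.guest_alloc_zeroed("P")
```

In this model `run_scenario()` returns the content saved for P in the image and the content handed out by the zeroed allocation after restore. Both are P1 instead of zeros, which is the inconsistency described above.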
>>> 
>>>> Anyhow, I'm quite sure that adding a global MISSING handler for
>>>> snapshotting is too heavy and not really needed.
>>> 
>>> UFFDIO_ZEROPAGE installs a zero pfn and that should be all of it.  There'll
>>> definitely be overhead, but it may not be as huge as imagined.  Live
>>> snapshot is great in that we get a point-in-time image of the guest without
>>> stopping the guest, so taking slightly longer won't be a huge loss for us
>>> either.
>>> 
>>> Actually we can also think of other ways to work around it.  One way is to
>>> pre-fault all guest pages before wr-protect.  Note that we don't need to
>>> write to the guest pages because a read would suffice, since uffd-wp also
>>> works with the zero pfn.  It's just that this workaround won't help with
>>> saving snapshot disk space, but it seems to work.  It would be great if you
>>> have other workarounds; as you said, maybe UFFDIO_ZEROPAGE is not the only
>>> route.
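
The effect of the pre-fault workaround can be sketched with the same kind of toy model. Again this is only an illustration in Python under simplifying assumptions: `take_snapshot` and its parameters are hypothetical names, and a read fault on a missing page is modeled as simply marking the page present with zero content.

```python
ZERO = bytes(8)  # stand-in for an all-zero page (assumption for this sketch)


def take_snapshot(pages, present, writes, prefault):
    """Toy snapshot: returns the saved image as a dict of name -> content.

    pages:    name -> content at the snapshot point (missing pages read as zero)
    present:  names of pages currently backed on the host
    writes:   (name, payload) guest writes that race with the snapshot
    prefault: read-touch every page before arming write protection
    """
    pages = dict(pages)
    present = set(present)
    if prefault:
        # A read is enough: the page becomes present (zero pfn) and can
        # then be write-protected like any other page.
        present = set(pages)

    wp = set(present)  # write protection only covers present pages
    image = {}
    for name, payload in writes:
        if name in wp:
            # Write fault: save the point-in-time content before the write.
            image[name] = pages[name]
            wp.discard(name)
        pages[name] = payload

    # Save whatever was not captured by a write fault.
    for name, content in pages.items():
        image.setdefault(name, content)
    return image
```

With `pages = {"P": ZERO, "Q": b"QQQQQQQQ"}`, only Q present, and a guest write of P1 to P during the snapshot, the image contains P1 for P when `prefault=False` and zeros for P when `prefault=True`.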
>> 
>> Wait.. it actually seems to also solve the disk usage issue.. :)
>> 
>> We should just need to make sure to prohibit ballooning before starting to
>> pre-fault read on all guest RAM.  Seems awkward, but also seems to work..
>> Hmm..
> 
> A shiver just went down my spine. Please don't, just for the sake of
> creating a snapshot.
> 
> (Just imagine you don't have a shared zeropage...)


... and I just remembered we read all memory either way. Gah.

I have some patches to make snapshots fly with virtio-mem so exactly that won't
happen. But they depend on vfio support, so it might take a while.

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots
