On 22.02.21 18:54, Peter Xu wrote:
On Mon, Feb 22, 2021 at 06:33:27PM +0100, David Hildenbrand wrote:
On 22.02.21 18:29, Peter Xu wrote:
On Sat, Feb 20, 2021 at 02:59:42AM -0500, David Hildenbrand wrote:
Live snapshotting ends up reading all guest memory (dirty bitmap starts with
all 1s), which is not what we want for virtio-mem - we don’t want to read and
migrate memory that has been discarded and has no stable content.
For ordinary migration we use the guest page hint API to clear bits in the
dirty bitmap after dirty bitmap sync. Well, if we don’t do bitmap syncs we’ll
never clear any dirty bits. That’s the problem.
Using dirty bitmap for that information is less efficient, because it's
definitely a larger granularity information than PAGE_SIZE. If the discarded
ranges are always contiguous and at the end of a memory region, we should have
some parameter in the ramblock showing where we got shrunk, so that we don't
check the dirty bitmap at all, rather than always assuming used_length is the one.
They are randomly scattered across the whole RAMBlock. Shrinking/growing
will be done to some degree in the future (but it won't get rid of the
general sparse layout we can produce).
OK. Btw I think currently live snapshot should still be reading the dirty bitmap,
so maybe it's still fine. It's just that it's still not very clear to hide
virtio-mem information in the dirty bitmap, imho, since that's not how we
interpret the dirty bitmap - which is only for the sake of tracking page changes.
Well, currently it is "what do we have to migrate".
What's the granule of virtio-mem for this discard behavior? Maybe we could
virtio-mem granularity is at least 1MB. This corresponds to 256 bits (32
bytes) in the dirty bitmap I think.
decouple it from the dirty bitmap some day; if the unit is big enough it's also a
gain in efficiency, so we skip in chunks rather than looping over tons of pages
knowing that they're discarded.
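
For reference, the "1MB corresponds to 256 bits (32 bytes)" figure can be
checked with a small sketch. It assumes 4 KiB target pages and one dirty bit
per page; PAGE_SHIFT_ASSUMED, bits_for_range() and sketch_clear_range() are
hypothetical names made up for this illustration and are not QEMU functions.
The second helper shows roughly what "crossing off" a discarded chunk in a
bitmap of unsigned longs could look like.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, one dirty bit per page. */
#define PAGE_SHIFT_ASSUMED 12
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Number of dirty-bitmap bits covering `bytes` of guest RAM. */
static uint64_t bits_for_range(uint64_t bytes)
{
    /* The range must be a whole number of pages. */
    assert((bytes & ((UINT64_C(1) << PAGE_SHIFT_ASSUMED) - 1)) == 0);
    return bytes >> PAGE_SHIFT_ASSUMED;
}

/* Clear `nbits` bits starting at bit `start`, one word (or part of one) per step. */
static void sketch_clear_range(unsigned long *map, size_t start, size_t nbits)
{
    size_t end = start + nbits;

    while (start < end) {
        size_t word = start / BITS_PER_WORD;
        size_t off = start % BITS_PER_WORD;
        size_t n = BITS_PER_WORD - off;   /* bits left in this word */
        unsigned long mask;

        if (n > end - start) {
            n = end - start;
        }
        /* A full-word mask is handled separately to avoid shifting by the word width. */
        mask = (n == BITS_PER_WORD) ? ~0ul : (((1ul << n) - 1) << off);
        map[word] &= ~mask;
        start += n;
    }
}
```

With these assumptions, bits_for_range(1024 * 1024) is 256, i.e. 32 bytes of
bitmap, so discarding one aligned 1 MiB chunk touches only four 64-bit words.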
Yeah, it's not optimal having to go over the dirty bitmap to cross off
"discarded" parts and later having to find bits to migrate.
At least find_next_bit() can skip whole longs (8 bytes) and is fairly
efficient. There is certainly room for improvement (the current guest
free page hinting API is certainly a hack).
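
To illustrate the point about skipping whole longs, here is a minimal sketch
of a word-at-a-time search for the next set bit. It is not the actual
find_next_bit() implementation; sketch_find_next_bit() is a hypothetical name,
and the code assumes a GCC/Clang-style compiler for __builtin_ctzl(). On a
typical 64-bit host an unsigned long is 8 bytes, so each zero word skips 64
pages at once.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first set bit at or after `offset`,
 * or `size` if there is none. `size` is the bitmap length in bits.
 */
static size_t sketch_find_next_bit(const unsigned long *map, size_t size,
                                   size_t offset)
{
    size_t word, bit;
    unsigned long cur;

    assert(map != NULL);
    if (offset >= size) {
        return size;
    }

    word = offset / BITS_PER_WORD;
    /* Ignore bits below `offset` in the first word. */
    cur = map[word] & (~0ul << (offset % BITS_PER_WORD));

    /* Skip all-zero words in one step each instead of testing every bit. */
    while (cur == 0) {
        word++;
        if (word * BITS_PER_WORD >= size) {
            return size;
        }
        cur = map[word];
    }

    /* Assumption: __builtin_ctzl() is available (GCC/Clang builtin). */
    bit = word * BITS_PER_WORD + (size_t)__builtin_ctzl(cur);
    return bit < size ? bit : size;
}
```

A bitmap in which discarded chunks have already been cleared lets such a loop
pass over each discarded 1 MiB chunk in a handful of word comparisons, which
is why the approach is workable even if it is not optimal.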
--
Thanks,
David / dhildenb