* Igor Mammedov (imamm...@redhat.com) wrote:
> On Thu, 26 Apr 2018 03:37:51 -0400 (EDT)
> Pankaj Gupta <pagu...@redhat.com> wrote:
>
> trimming CC list to keep people that might be interested in the topic
> and renaming thread to reflect it.
>
> > > > > > > > >> +
> > > > > > > > >> +    memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
> > > > > > > > > missing vmstate registration?
> > > > > > > >
> > > > > > > > Missed this one: To be called by the caller. Important because
> > > > > > > > e.g. for virtio-pmem we don't want this (I assume :) ).
> > > > > > > if pmem isn't on shared storage, then We'd probably want to
> > > > > > > migrate it as well, otherwise target would experience data loss.
> > > > > > > Anyways, I'd just reat it as normal RAM in migration case
> > > > > >
> > > > > > Main difference between RAM and pmem it acts like combination of
> > > > > > RAM and disk.
> > > > > > Saying this, in normal use-case size would be 100 GB's - few TB's
> > > > > > range.
> > > > > > I am not sure we really want to migrate it for non-shared storage
> > > > > > use-case.
> > > > > with non shared storage you'd have to migrate it target host but
> > > > > with shared storage it might be possible to flush it and use directly
> > > > > from target host. That probably won't work right out of box and would
> > > > > need some sort of synchronization between src/dst hosts.
> > > >
> > > > Shared storage should work out of the box.
> > > > Only thing is data in destination host will be cache cold and
> > > > existing pages in cache should be invalidated first.
> > > > But if we migrate entire fake DAX RAMstate it will populate
> > > > destination host page cache including pages while were idle in
> > > > source host. This would unnecessarily create entropy in destination
> > > > host.
> > > >
> > > > To me this feature don't make much sense. Problem which we are
> > > > solving is:
> > > > Efficiently use guest RAM.
> > > What would live migration handover flow look like in case of
> > > guest constantly dirting memory provided by virtio-pmem and
> > > and sometimes issuing async flush req along with it?
> >
> > Dirty entire pmem (disk) at once not a usual scenario. Some part of
> > disk/pmem would get dirty and we need to handle that. I just want to
> > say moving entire pmem (disk) is not efficient solution because we are
> > using this solution to manage guest memory efficiently. Otherwise it
> > will be like any block device copy with non-shared storage.
> not sure if we can use block layer analogy here.
> > >
> > > > > The same applies to nv/pc-dimm as well, as backend file easily
> > > > > could be on pmem storage as well.
> > > >
> > > > Are you saying backing file is in actual actual nvdimm hardware? we
> > > > don't need emulation at all.
> > > depends on if file is on DAX filesystem, but your argument about
> > > migrating huge 100Gb- TB's range applies in this case as well.
> > >
> > > >
> > > > >
> > > > > Maybe for now we should migrate everything so it would work in case
> > > > > of non shared NVDIMM on host. And then later add migration-less
> > > > > capability to all of them.
> > > >
> > > > not sure I agree.
> > > So would you inhibit migration in case of non shared backend storage,
> > > to avoid loosing data since they aren't migrated?
> >
> > I am just thinking what features we want to support with pmem. And live
> > migration with shared storage is the one which comes to my mind.
> >
> > If live migration with non-shared storage is what we want to support
> > (I don't know yet) we can add this? Even with shared storage it would
> > copy entire pmem state?
> Perhaps we should register vmstate like for normal ram and use
> something similar to
> http://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html this
> to skip shared memory on migration.
> In this case we could use this for pc-dimms as well.
>
> David,
> what's your take on it?

My feel is that something is going to have to migrate it,
I'm just not sure how.
So let me just check I understand:

  a) It's potentially huge
  b) It's a RAMBlock
  c) It's backed by ????
  c1) Something machine local - i.e. a physical lump of flash in
      a socket rather than something sharable by machines?
  d) It can potentially be rapidly changing as the guest writes to it?

Dave

> > Thanks,
> > Pankaj
> >
> > > > > >
> > > > > > One reason why nvdimm added vmstate info could be: still there
> > > > > > would be transient writes in memory with fake DAX and there is
> > > > > > no way(till now) to flush the guest writes. But with virtio-pmem
> > > > > > we can flush such writes before migration and automatically
> > > > > > at destination host with shared disk we will have updated data.
> > > > > nvdimm has concept of flush address hint (may be not implemented in
> > > > > qemu yet) but it can flush. The only reason I'm buying into
> > > > > virtio-mem idea is that would allow async flush queues which would
> > > > > reduce number of vmexits.
> > > >
> > > > Thats correct.
> > > >
> > > > Thanks,
> > > > Pankaj
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
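
For illustration only, the "register it like normal RAM but skip shared memory on migration" idea discussed in this thread could be modelled roughly as below. This is a minimal standalone sketch, not QEMU code: struct demo_ram_block, demo_must_migrate() and demo_bytes_to_migrate() are hypothetical names invented here, and the rule that a block on shared backing storage can be skipped is an assumption taken from the discussion above, not a description of any existing migration capability.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a migratable RAM region; NOT QEMU's RAMBlock. */
struct demo_ram_block {
    const char *name;
    uint64_t    size_bytes;
    bool        backing_shared;  /* backing store reachable from both hosts */
};

/*
 * Decide whether a block's contents have to travel in the migration stream.
 * Assumption: with skip_shared enabled, a block whose backing store is
 * shared between source and destination is flushed and reopened on the
 * destination instead of being copied. Everything else is copied.
 */
static bool demo_must_migrate(const struct demo_ram_block *rb, bool skip_shared)
{
    if (skip_shared && rb->backing_shared) {
        return false;
    }
    return true;
}

/* Sum the sizes of all blocks that would actually be copied. */
static uint64_t demo_bytes_to_migrate(const struct demo_ram_block *blocks,
                                      size_t n, bool skip_shared)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (demo_must_migrate(&blocks[i], skip_shared)) {
            total += blocks[i].size_bytes;
        }
    }
    return total;
}
```

Under these assumptions a machine-local backend (case c1 above) is always copied regardless of its size, while a shared backend is left out of the stream when skipping is enabled. The sketch says nothing about how rapidly changing guest writes (case d) would be flushed or synchronised before handover.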