On Wed, Jun 7, 2023 at 12:43 AM Si-Wei Liu <si-wei....@oracle.com> wrote:
>
> Sorry for reviving this old thread, I missed the best timing to follow
> up on this while I was on vacation. I have been working on this and
> found some discrepancies, please see below.
>
> On 4/5/23 04:37, Eugenio Perez Martin wrote:
> > Hi!
> >
> > As mentioned in the last upstream virtio-networking meeting, one of
> > the factors that adds more downtime to migration is the handling of
> > the guest memory (pin, map, etc). At this moment this handling is
> > bound to the virtio life cycle (DRIVER_OK, RESET). In that sense, the
> > destination device waits until all the guest memory / state is
> > migrated to start pinning all the memory.
> >
> > The proposal is to bind it to the char device life cycle (open vs
> > close),
>
> Hmmm, really? If it's tied to the char device life cycle, the next
> guest / qemu launch on the same vhost-vdpa device node won't make it
> work.
>
Maybe my sentence was not accurate, but I think we're on the same page
here. Two qemu instances opening the same char device at the same time
are not allowed, and vhost_vdpa_release cleans all the maps. So the next
qemu that opens the char device should see a clean device anyway.

> > so all the guest memory can be pinned for all the guest / qemu
> > lifecycle.
>
> I think tying pinning to the guest / qemu process life cycle makes more
> sense. Essentially this pinning part needs to be decoupled from the
> iotlb mapping abstraction layer, and can / should work as a standalone
> uAPI, such that QEMU at the destination may launch and pin all of the
> guest's memory as needed without having to start the device, while
> awaiting any incoming migration request. Though the problem is, there's
> no existing vhost uAPI that could properly serve as the vehicle for
> that. SET_OWNER / SET_MEM_TABLE / RESET_OWNER seem a remote fit. Any
> objection against introducing a new but clean vhost uAPI for pinning
> guest pages, subject to the guest's life cycle?
>

I think that whether to pin memory maps should be a kernel decision, not
one driven by qemu. I'm not against it if needed, but let me know if the
current "clean at close" behavior addresses your concerns.

> Another concern is the use_va stuff: originally it is tied to the
> device level and made static at the time of device instantiation, which
> is fine. But newer additions find a home in per-group or per-vq level
> structs. It is hard to tell whether or not pinning is actually needed
> for the latter use_va friends, as they are essentially tied to the
> virtio life cycle or feature negotiation, while guest / Qemu starts way
> earlier than that. Perhaps just ignore those sub-device level use_va
> usages? Presumably !use_va at the device level is sufficient to infer
> the need of pinning for the device?
>

I don't follow this. But I have the feeling that the subject of my
original mail would have been more accurate if I had said just "memory
maps". I still think the way to fix it is to delegate that decision to
the kernel vdpa driver, so it can choose whether a particular ASID needs
the pinning or not.

But let me know if I missed something.

Thanks!

> Regards,
> -Siwei
>
>
> >
> > This has two main problems:
> > * At this moment the reset semantics force the vdpa device to unmap
> >   all the memory. So this change needs a vhost vdpa feature flag.
> > * This may increase the initialization time. Maybe we can delay it if
> >   qemu is not the destination of a LM. Anyway I think this should be
> >   done as an optimization on top.
> >
> > Any ideas or comments in this regard?
> >
> > Thanks!
>
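To make the "clean at close" flow concrete, below is a minimal user-space
sketch (not QEMU code and not the in-kernel driver) of what binding the
maps to the char device life cycle looks like from the caller's side:
memory is mapped (and pinned by the kernel) right after open(),
independently of DRIVER_OK / RESET, and everything is dropped by
vhost_vdpa_release() when the fd is closed, so the next open() sees a
clean device. The device node path, the single 1 MiB anonymous buffer
and the fixed IOVA are made-up example values; it assumes kernel headers
that provide VHOST_IOTLB_MSG_V2.

/*
 * Minimal sketch of the open() -> map -> close() flow discussed above.
 * Not QEMU code and not the in-kernel driver; just the uAPI sequence a
 * caller would go through. Device path, IOVA and buffer size are
 * made-up example values.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/vhost.h>
#include <linux/vhost_types.h>

int main(void)
{
	int fd = open("/dev/vhost-vdpa-0", O_RDWR);	/* hypothetical node */
	if (fd < 0) {
		perror("open");
		return 1;
	}

	if (ioctl(fd, VHOST_SET_OWNER) < 0) {
		perror("VHOST_SET_OWNER");
		return 1;
	}

	size_t len = 1 << 20;				/* stand-in for guest RAM */
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/*
	 * IOTLB update written to the char device: this is where the kernel
	 * pins the pages and installs the map, well before any DRIVER_OK.
	 * asid is left at 0 (the default address space) via the memset.
	 */
	struct vhost_msg_v2 msg;
	memset(&msg, 0, sizeof(msg));
	msg.type = VHOST_IOTLB_MSG_V2;
	msg.iotlb.iova = 0;
	msg.iotlb.size = len;
	msg.iotlb.uaddr = (uintptr_t)buf;
	msg.iotlb.perm = VHOST_ACCESS_RW;
	msg.iotlb.type = VHOST_IOTLB_UPDATE;
	if (write(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
		perror("VHOST_IOTLB_UPDATE");

	/*
	 * Feature negotiation, vring setup and DRIVER_OK would happen here.
	 * With the proposal, a virtio reset would no longer drop the map
	 * above; only closing the fd does.
	 */
	close(fd);	/* vhost_vdpa_release(): all maps / pins cleaned up */
	return 0;
}

With the current semantics the map above is also torn down on virtio
reset; keeping it across reset is exactly the part the proposed feature
flag would relax.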