Avihai Horon <avih...@nvidia.com> wrote:
> Implement the basic mandatory part of VFIO migration protocol v2.
> This includes all functionality that is necessary to support
> VFIO_MIGRATION_STOP_COPY part of the v2 protocol.
>
> The two protocols, v1 and v2, will co-exist and in the following patches
> v1 protocol code will be removed.
>
> There are several main differences between v1 and v2 protocols:
> - VFIO device state is now represented as a finite state machine instead
>   of a bitmap.
>
> - Migration interface with kernel is now done using VFIO_DEVICE_FEATURE
>   ioctl and normal read() and write() instead of the migration region.
>
> - Pre-copy is made optional in v2 protocol. Support for pre-copy will be
>   added later on.
>
> Detailed information about VFIO migration protocol v2 and its difference
> compared to v1 protocol can be found here [1].
>
> [1]
> https://lore.kernel.org/all/20220224142024.147653-10-yish...@nvidia.com/
>
> Signed-off-by: Avihai Horon <avih...@nvidia.com>

> +/*
> + * Migration size of VFIO devices can be as little as a few KBs or as big as
> + * many GBs. This value should be big enough to cover the worst case.
> + */
> +#define VFIO_MIG_STOP_COPY_SIZE (100 * GiB)

Wow O:-)

> +
> +/*
> + * Only exact function is implemented and not estimate function. The reason is
> + * that during pre-copy phase of migration the estimate function is called
> + * repeatedly while pending RAM size is over the threshold, thus migration
> + * can't converge and querying the VFIO device pending data size is useless.
> + */

You can do it after this is merged, but I think you can do better than
this. Something along the lines of:

// I put it in a global variable, but it really needs to be in VFIODevice to be
// able to support several devices. You get the idea O:-)

/* UINT64_MAX means "no exact query has been done yet". */
static uint64_t cached_size = UINT64_MAX;

static void vfio_state_pending_exact(void *opaque, uint64_t *res_precopy_only,
                                     uint64_t *res_compatible,
                                     uint64_t *res_postcopy_only)
{
    VFIODevice *vbasedev = opaque;
    uint64_t stop_copy_size = VFIO_MIG_STOP_COPY_SIZE;

    /*
     * If getting pending migration size fails, VFIO_MIG_STOP_COPY_SIZE is
     * reported so downtime limit won't be violated.
     */
    vfio_query_stop_copy_size(vbasedev, &stop_copy_size);
    *res_precopy_only += stop_copy_size;
    /* Remember the last exact answer so the estimate can reuse it. */
    cached_size = stop_copy_size;

    trace_vfio_state_pending_exact(vbasedev->name, *res_precopy_only,
                                   *res_postcopy_only, *res_compatible,
                                   stop_copy_size);
}

static void vfio_state_pending_estimate(void *opaque, uint64_t *res_precopy_only,
                                        uint64_t *res_compatible,
                                        uint64_t *res_postcopy_only)
{
    if (cached_size == UINT64_MAX) {
        /*
         * Scratch accumulators, zeroed because the exact function adds to
         * and traces them; this call is only made to fill cached_size.
         */
        uint64_t tmp_precopy = 0;
        uint64_t tmp_compatible = 0;
        uint64_t tmp_postcopy = 0;

        vfio_state_pending_exact(opaque, &tmp_precopy, &tmp_compatible,
                                 &tmp_postcopy);
    }
    *res_precopy_only += cached_size;
}
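
To make the "it really needs to be in VFIODevice" remark concrete, here is a minimal standalone sketch of the per-device variant. Everything named demo_* or DEMO_* is hypothetical and only stands in for the real QEMU types and helpers (DemoDevice for VFIODevice, demo_query_stop_copy_size() for vfio_query_stop_copy_size()). I am also assuming that the query helper leaves its output untouched on failure, which is what the comment in the patch suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Worst-case fallback, mirroring VFIO_MIG_STOP_COPY_SIZE (100 GiB). */
#define DEMO_STOP_COPY_SIZE (100ULL << 30)
/* Sentinel: no exact query has been done for this device yet. */
#define DEMO_SIZE_UNKNOWN UINT64_MAX

/* Hypothetical stand-in for VFIODevice; only what the sketch needs. */
typedef struct DemoDevice {
    uint64_t hw_pending;   /* size the simulated device would report */
    bool query_fails;      /* simulate a failing size query */
    uint64_t cached_size;  /* last exact answer, per device */
    unsigned query_count;  /* how often the device was actually queried */
} DemoDevice;

static void demo_device_init(DemoDevice *dev, uint64_t hw_pending)
{
    dev->hw_pending = hw_pending;
    dev->query_fails = false;
    dev->cached_size = DEMO_SIZE_UNKNOWN;
    dev->query_count = 0;
}

/* Assumed behaviour: on failure, *size is left as the caller set it. */
static int demo_query_stop_copy_size(DemoDevice *dev, uint64_t *size)
{
    dev->query_count++;
    if (dev->query_fails) {
        return -1;
    }
    *size = dev->hw_pending;
    return 0;
}

/* Exact: always asks the device and refreshes the per-device cache. */
static void demo_state_pending_exact(DemoDevice *dev,
                                     uint64_t *res_precopy_only)
{
    uint64_t size = DEMO_STOP_COPY_SIZE;

    demo_query_stop_copy_size(dev, &size);
    dev->cached_size = size;
    *res_precopy_only += size;
}

/* Estimate: reuses the cache, querying only if nothing is cached yet. */
static void demo_state_pending_estimate(DemoDevice *dev,
                                        uint64_t *res_precopy_only)
{
    if (dev->cached_size == DEMO_SIZE_UNKNOWN) {
        uint64_t scratch = 0; /* result discarded, we only want the cache */

        demo_state_pending_exact(dev, &scratch);
    }
    *res_precopy_only += dev->cached_size;
}
```

With the cache inside the device struct, each device is queried at most once by the estimate path, and a later exact call refreshes that device's value without affecting any other device.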