Re: [PATCH QEMU v23 09/18] vfio: Add save state functions to SaveVMHandlers

2020-05-21 Thread Peter Xu
On Wed, May 20, 2020 at 11:54:39PM +0530, Kirti Wankhede wrote:
> In _SAVING|_RUNNING device state or pre-copy phase:
> - read pending_bytes. If pending_bytes > 0, go through below steps.
> - read data_offset - indicates kernel driver to write data to staging
>   buffer.
> - read data_size - amount of data in bytes written by vendor driver in
>   migration region.
> - read data_size bytes of data from data_offset in the migration region.
> - Write data packet to file stream as below:
> {VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data,
> VFIO_MIG_FLAG_END_OF_STATE }

Hi, Kirti,

(Sorry if this question is silly, since I didn't follow up with all these
 threads...)

Can I understand this commit as it only tracks dirty device BAR memory but not
guest system memory?  How did you track device writes to system memory when
without the vIOMMU?

Thanks,

-- 
Peter Xu




Re: [PATCH QEMU v23 09/18] vfio: Add save state functions to SaveVMHandlers

2020-05-21 Thread Dr. David Alan Gilbert
* Kirti Wankhede (kwankh...@nvidia.com) wrote:
> Added .save_live_pending, .save_live_iterate and .save_live_complete_precopy
> functions. These functions handles pre-copy and stop-and-copy phase.
> 
> In _SAVING|_RUNNING device state or pre-copy phase:
> - read pending_bytes. If pending_bytes > 0, go through below steps.
> - read data_offset - indicates kernel driver to write data to staging
>   buffer.
> - read data_size - amount of data in bytes written by vendor driver in
>   migration region.
> - read data_size bytes of data from data_offset in the migration region.
> - Write data packet to file stream as below:
> {VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data,
> VFIO_MIG_FLAG_END_OF_STATE }
> 
> In _SAVING device state or stop-and-copy phase
> a. read config space of device and save to migration file stream. This
>doesn't need to be from vendor driver. Any other special config state
>from driver can be saved as data in following iteration.
> b. read pending_bytes. If pending_bytes > 0, go through below steps.
> c. read data_offset - indicates kernel driver to write data to staging
>buffer.
> d. read data_size - amount of data in bytes written by vendor driver in
>migration region.
> e. read data_size bytes of data from data_offset in the migration region.
> f. Write data packet as below:
>{VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data}
> g. iterate through steps b to f while (pending_bytes > 0)
> h. Write {VFIO_MIG_FLAG_END_OF_STATE}
> 
> When data region is mapped, its user's responsibility to read data from
> data_offset of data_size before moving to next steps.
> 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/migration.c   | 254 
> +-
>  hw/vfio/trace-events  |   6 +
>  include/hw/vfio/vfio-common.h |   1 +
>  3 files changed, 260 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 773c8d16b1c1..d90bd2296afd 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -139,6 +139,137 @@ static int vfio_migration_set_state(VFIODevice 
> *vbasedev, uint32_t mask,
>  return 0;
>  }
>  
> +static void *find_data_region(VFIORegion *region,
> +  uint64_t data_offset,
> +  uint64_t data_size)
> +{
> +void *ptr = NULL;
> +int i;
> +
> +for (i = 0; i < region->nr_mmaps; i++) {
> +if ((data_offset >= region->mmaps[i].offset) &&

that checks the bit we want to map starts at or after this region

> +(data_offset < region->mmaps[i].offset + region->mmaps[i].size) 
> &&

and the bit we want to map starts before the end of this region

> +(data_size <= region->mmaps[i].size)) {

Shouldn't that check take into account the offset?  If 
data_offset-mmaps[i].offset is almost mmaps[i].size then we would be
accessing data_size after the end of the region?

> +ptr = region->mmaps[i].mmap + (data_offset -
> +   region->mmaps[i].offset);
> +break;
> +}
> +}
> +return ptr;
> +}
> +
> +static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev)

  ^^^ do you need to make this int64_t or ssize_t - internally
to this function it's uint64_t but you're also returning errors.
(and then fix up the trace_ and callers below)

> +{
> +VFIOMigration *migration = vbasedev->migration;
> +VFIORegion *region = &migration->region;
> +uint64_t data_offset = 0, data_size = 0;
> +int ret;
> +
> +ret = pread(vbasedev->fd, &data_offset, sizeof(data_offset),
> +region->fd_offset + offsetof(struct 
> vfio_device_migration_info,
> + data_offset));
> +if (ret != sizeof(data_offset)) {
> +error_report("%s: Failed to get migration buffer data offset %d",
> + vbasedev->name, ret);
> +return -EINVAL;
> +}
> +
> +ret = pread(vbasedev->fd, &data_size, sizeof(data_size),
> +region->fd_offset + offsetof(struct 
> vfio_device_migration_info,
> + data_size));
> +if (ret != sizeof(data_size)) {
> +error_report("%s: Failed to get migration buffer data size %d",
> + vbasedev->name, ret);
> +return -EINVAL;
> +}
> +
> +if (data_size > 0) {
> +void *buf = NULL;
> +bool buffer_mmaped;
> +
> +if (region->mmaps) {
> +buf = find_data_region(region, data_offset, data_size);
> +}
> +
> +buffer_mmaped = (buf != NULL);
> +
> +if (!buffer_mmaped) {
> +buf = g_try_malloc(data_size);
> +if (!buf) {
> +error_report("%s: Error allocating buffer ", __func__);
> +return -ENOMEM;
> +}
> +
> +ret = pread(vbasedev->fd, buf, data_size,
> +