<snip>

>>>>>>>>>> +static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev)
>>>>>>>>>> +{
>>>>>>>>>> +    VFIOMigration *migration = vbasedev->migration;
>>>>>>>>>> +    VFIORegion *region = &migration->region.buffer;
>>>>>>>>>> +    uint64_t data_offset = 0, data_size = 0;
>>>>>>>>>> +    int ret;
>>>>>>>>>> +
>>>>>>>>>> +    ret = pread(vbasedev->fd, &data_offset, sizeof(data_offset),
>>>>>>>>>> +                region->fd_offset + offsetof(struct vfio_device_migration_info,
>>>>>>>>>> +                                             data_offset));
>>>>>>>>>> +    if (ret != sizeof(data_offset)) {
>>>>>>>>>> +        error_report("Failed to get migration buffer data offset %d",
>>>>>>>>>> +                     ret);
>>>>>>>>>> +        return -EINVAL;
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>> +    ret = pread(vbasedev->fd, &data_size, sizeof(data_size),
>>>>>>>>>> +                region->fd_offset + offsetof(struct vfio_device_migration_info,
>>>>>>>>>> +                                             data_size));
>>>>>>>>>> +    if (ret != sizeof(data_size)) {
>>>>>>>>>> +        error_report("Failed to get migration buffer data size %d",
>>>>>>>>>> +                     ret);
>>>>>>>>>> +        return -EINVAL;
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>> +    if (data_size > 0) {
>>>>>>>>>> +        void *buf = NULL;
>>>>>>>>>> +        bool buffer_mmaped = false;
>>>>>>>>>> +
>>>>>>>>>> +        if (region->mmaps) {
>>>>>>>>>> +            int i;
>>>>>>>>>> +
>>>>>>>>>> +            for (i = 0; i < region->nr_mmaps; i++) {
>>>>>>>>>> +                if ((data_offset >= region->mmaps[i].offset) &&
>>>>>>>>>> +                    (data_offset < region->mmaps[i].offset +
>>>>>>>>>> +                                   region->mmaps[i].size)) {
>>>>>>>>>> +                    buf = region->mmaps[i].mmap + (data_offset -
>>>>>>>>>> +                                                   region->mmaps[i].offset);
>>>>>>>>>
>>>>>>>>> So you're expecting that data_offset is somewhere within the data
>>>>>>>>> area.  Why doesn't the data always simply start at the beginning of
>>>>>>>>> the data area?  ie. data_offset would coincide with the beginning of
>>>>>>>>> the mmap'able area (if supported) and be static.  Does this enable
>>>>>>>>> some functionality in the vendor driver?
>>>>>>>>
>>>>>>>> Do you want to enforce that on the vendor driver?
>>>>>>>> From the feedback on the previous version I thought the vendor driver
>>>>>>>> should define data_offset within the region:
>>>>>>>> "I'd suggest that the vendor driver expose a read-only
>>>>>>>> data_offset that matches a sparse mmap capability entry should the
>>>>>>>> driver support mmap. The use should always read or write data from the
>>>>>>>> vendor defined data_offset"
>>>>>>>>
>>>>>>>> This also adds flexibility for the vendor driver, such that it can
>>>>>>>> define different data_offsets for device data and the dirty page
>>>>>>>> bitmap within the same mmaped region.
>>>>>>>
>>>>>>> I agree, it adds flexibility, the protocol was not evident to me until
>>>>>>> I got here though.
>>>>>>>
>>>>>>>>> Does resume data need to be
>>>>>>>>> written from the same offset where it's read?
>>>>>>>>
>>>>>>>> No, resume data should be written from the data_offset that the
>>>>>>>> vendor driver provided during resume.
>>>>>
>>>>> A)
>>>>>
>>>>>>> s/resume/save/?
>>>>>
>>>>> B)
>>>>>
>>>>>>> Or is this saying that on resume the vendor driver is requesting a
>>>>>>> specific block of data via data_offset?
>>>>>>
>>>>>> Correct.
>>>>>
>>>>> Which one is correct?  Thanks,
>>>>>
>>>>
>>>> B is correct.
>>>
>>> Shouldn't data_offset be stored in the migration stream then so we can
>>> at least verify that source and target are in sync?
>>
>> Why? data_offset is an offset within the migration region, nothing to do
>> with the data stream. While resuming, the vendor driver can ask for data
>> at a different offset in the migration region.
>
> So the data is opaque and the sequencing is opaque, the user should
> have no expectation that there's any relationship between where the
> data was read from while saving versus where the target device is
> requesting the next block be written while resuming.  We have a data
> blob and a size and we do what we're told.
>
That's correct.

>>> I'm not getting a
>>> sense that this protocol involves any sort of sanity or integrity
>>> testing on the vendor driver end, the user can just feed garbage into
>>> the device on resume and watch the results :-\  Thanks,
>>>
>>
>> The vendor driver should be able to do sanity and integrity checks
>> within its opaque data. If that sanity check fails, it returns failure
>> for the access on the field in the migration region structure.
>
> Would that be a synchronous failure on the write of data_size, which
> should result in the device_state moving to invalid?  Thanks,
>

If the data section of the migration region is mapped, then on the write to
data_size the vendor driver should read the staging buffer, validate the
data, and return sizeof(data_size) on success or an error (< 0) otherwise.
If the data section is trapped, then writes to the data section should
return accordingly on receiving the data. On error, migration/restore would
fail.

Thanks,
Kirti