On Thu, Nov 17, 2022 at 07:07:10PM +0200, Avihai Horon wrote:

> > > +    }
> > > +
> > > +    if (mig_state->data_fd != -1) {
> > > +        if (migration->data_fd != -1) {
> > > +            /*
> > > +             * This can happen if the device is asynchronously reset and
> > > +             * terminates a data transfer.
> > > +             */
> > > +            error_report("%s: data_fd out of sync", vbasedev->name);
> > > +            close(mig_state->data_fd);
> > > +
> > > +            return -1;
> > Should we go to recover_state here? Is migration->device_state
> > invalid? -EBADF?
>
> Yes, we should.
> Although VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE ioctl above succeeded, setting
> the device state didn't *really* succeed, as the data_fd went out of sync.
> So we should go to recover_state and return -EBADF.

The state did succeed and it is now "new_state". Getting an unexpected
data_fd means it did something like RUNNING->PRE_COPY_P2P when the
code was expecting PRE_COPY->PRE_COPY_P2P.

It is actually in PRE_COPY_P2P, but the in-progress migration must be
stopped, and the kernel would have made the migration->data_fd
permanently return some error when it went async to RUNNING.

The recovery is to restart the migration (of this device?) from the
start.

Jason
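
For illustration only, here is a minimal sketch in C of the handling
described above. It is not the QEMU code: the toy_* types and
toy_adopt_state() are hypothetical stand-ins, and file descriptors are
treated as plain tokens rather than being opened or closed. It only
shows the idea that the new state is recorded even when the data_fd is
out of sync, and that the caller is told to restart the migration
instead of treating the state change as failed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified device states (not the kernel's values). */
enum toy_state { TOY_RUNNING, TOY_PRE_COPY, TOY_PRE_COPY_P2P };

/* Hypothetical result codes for the caller. */
enum toy_result { TOY_OK, TOY_RESTART_MIGRATION };

/* Hypothetical stand-in for the per-device migration bookkeeping. */
struct toy_migration {
    enum toy_state device_state; /* state the device is known to be in */
    int data_fd;                 /* -1 when no transfer is in progress */
};

/*
 * Call after a set-state request has succeeded. new_fd is the data fd
 * returned with it, or -1 if none was returned.
 */
static enum toy_result toy_adopt_state(struct toy_migration *m,
                                       enum toy_state new_state, int new_fd)
{
    assert(m != NULL);

    /* The request succeeded, so the device really is in new_state. */
    m->device_state = new_state;

    if (new_fd == -1) {
        return TOY_OK;
    }

    if (m->data_fd != -1) {
        /*
         * A new fd arrived while an old transfer was still tracked: the
         * device was reset behind our back and the old transfer is dead.
         * Assumption: a real caller would release both descriptors here;
         * this sketch only forgets the old one and ignores the new one.
         */
        m->data_fd = -1;
        return TOY_RESTART_MIGRATION;
    }

    /* Normal case: start tracking the new transfer. */
    m->data_fd = new_fd;
    return TOY_OK;
}
```

The point of returning a distinct TOY_RESTART_MIGRATION value rather
than a generic error is that the caller can keep trusting
device_state while still abandoning the in-progress transfer.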