If vfio_migration_set_state() fails to put the device in the requested
state, it tries to put it in a recover state. If setting the recover state
fails as well, hw_error() is triggered and the VM is aborted.
To improve user experience and avoid VM data loss, reset the device with
VFIO_DEVICE_RESET instead of aborting the VM. hw_error() is now triggered
only if the reset itself fails.

Signed-off-by: Avihai Horon <avih...@nvidia.com>
---
 hw/vfio/migration.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 852759e6ca..6c34502611 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -89,8 +89,16 @@ static int vfio_migration_set_state(VFIODevice *vbasedev,
         /* Try to put the device in some good state */
         mig_state->device_state = recover_state;
         if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
-            hw_error("%s: Device in error state, can't recover",
-                     vbasedev->name);
+            if (ioctl(vbasedev->fd, VFIO_DEVICE_RESET)) {
+                hw_error("%s: Device in error state, can't recover",
+                         vbasedev->name);
+            }
+
+            error_report(
+                "%s: Device was reset due to failure in changing device state to recover state %s",
+                vbasedev->name, mig_state_to_str(recover_state));
+
+            return -1;
         }
 
         error_report("%s: Failed changing device state to %s", vbasedev->name,
-- 
2.21.3
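
For illustration only, the fallback chain this patch introduces (requested
state, then recover state, then device reset, then abort) can be sketched as
a standalone function. This is not QEMU code: struct fake_device, enum
set_state_result and set_state_with_fallback() are hypothetical names, and
the three flags stand in for the outcome of the corresponding ioctl() calls.
Note also that the real function reports failure to its caller whenever the
requested state was not reached; the distinct return values here only make
the branches visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes of the fallback chain (not QEMU identifiers). */
enum set_state_result {
    SET_STATE_OK,        /* requested state was reached */
    SET_STATE_RECOVERED, /* request failed, recover state was reached */
    SET_STATE_RESET,     /* recover state failed too, device was reset */
    SET_STATE_FATAL      /* reset failed as well: the hw_error() case */
};

/* Hypothetical device model: each flag says whether that step succeeds. */
struct fake_device {
    int set_state_ok; /* stands in for the first VFIO_DEVICE_FEATURE ioctl */
    int recover_ok;   /* stands in for the recover-state ioctl */
    int reset_ok;     /* stands in for the VFIO_DEVICE_RESET ioctl */
};

/* Walk the chain in order and stop at the first step that succeeds. */
static enum set_state_result
set_state_with_fallback(const struct fake_device *dev)
{
    assert(dev != NULL);

    if (dev->set_state_ok) {
        return SET_STATE_OK;
    }
    if (dev->recover_ok) {
        return SET_STATE_RECOVERED;
    }
    if (dev->reset_ok) {
        /* New behaviour in this patch: reset instead of aborting. */
        return SET_STATE_RESET;
    }
    /* Nothing worked; this is the only path that would still abort. */
    return SET_STATE_FATAL;
}
```

In this sketch, a device for which only the reset succeeds yields
SET_STATE_RESET, which corresponds to the new error_report() and
"return -1" path in the diff above. Before the patch, the same situation
would have ended in hw_error().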