On Tue, Apr 30, 2024 at 02:30:19PM +0200, Fiona Ebner wrote:
> On 12.03.24 at 15:02, marcandre.lur...@redhat.com wrote:
> > From: Marc-André Lureau <marcandre.lur...@redhat.com>
> >
> > The current post-loading code for scanout has a FIXME: it doesn't take
> > the resource region/rect into account. But there is more, when adding
> > blob migration support in commit f66767f75c9, I didn't realize that blob
> > resources could be used for scanouts. This situation leads to a crash
> > during post-load, as they don't have an associated res->image.
> >
> > virtio_gpu_do_set_scanout() handles all cases, but requires the
> > associated virtio_gpu_framebuffer, which is currently not saved during
> > migration.
> >
> > Add a v2 of "virtio-gpu-one-scanout" with the framebuffer fields, so we
> > can restore blob scanouts, as well as fixing the existing FIXME.
> >
> > Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com>
> > Reviewed-by: Sebastian Ott <seb...@redhat.com>
>
> Hi,
> unfortunately, this broke migration from pre-9.0 to 9.0:
>
> > vmstate_load_state_field virtio-gpu:virtio-gpu
> > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.enable
> > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.conf.max_outputs
> > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.scanout
> > vmstate_load_state_field virtio-gpu-one-scanout:resource_id
> > vmstate_load_state_field virtio-gpu-one-scanout:width
> > vmstate_load_state_field virtio-gpu-one-scanout:height
> > vmstate_load_state_field virtio-gpu-one-scanout:x
> > vmstate_load_state_field virtio-gpu-one-scanout:y
> > vmstate_load_state_field virtio-gpu-one-scanout:cursor.resource_id
> > vmstate_load_state_field virtio-gpu-one-scanout:cursor.hot_x
> > vmstate_load_state_field virtio-gpu-one-scanout:cursor.hot_y
> > vmstate_load_state_field virtio-gpu-one-scanout:cursor.pos.x
> > vmstate_load_state_field virtio-gpu-one-scanout:cursor.pos.y
> > vmstate_load_state_field virtio-gpu-one-scanout:fb.format
> > vmstate_load_state_field virtio-gpu-one-scanout:fb.bytes_pp
> > vmstate_load_state_field virtio-gpu-one-scanout:fb.width
> > vmstate_load_state_field virtio-gpu-one-scanout:fb.height
> > vmstate_load_state_field virtio-gpu-one-scanout:fb.stride
> > vmstate_load_state_field virtio-gpu-one-scanout:fb.offset
> > qemu-system-x86_64: Missing section footer for 0000:00:02.0/virtio-gpu
> > qemu-system-x86_64: Error -22 while loading VM state
>
> It wrongly tries to load the fb fields even though they should be
> guarded by version 2.
>
> Looking at it with GDB, in vmstate_load_state(), when we come to
> field->name == parent_obj.scanout, the
>
> > } else if (field->flags & VMS_STRUCT) {
> >     ret = vmstate_load_state(f, field->vmsd, curr_elem,
> >                              field->vmsd->version_id);
>
> branch will be taken and suddenly we'll have a call to
> vmstate_load_state() for vmsd==vmstate_virtio_gpu_scanout with
> version_id==2 rather than version_id==1, because that is
> field->vmsd->version_id (i.e. the .version_id in VMStateDescription
> vmstate_virtio_gpu_scanout).
>
> Would it have been necessary to version the VMStateDescription
> vmstate_virtio_gpu_scanouts too using VMS_VSTRUCT (or am I
> misinterpreting the use case for that)?

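To make the failure mode described in the quoted text concrete, here is a
minimal toy model of it. This is only a sketch, not QEMU code: the Toy*
types and toy_* helpers are hypothetical stand-ins, and the
"VSTRUCT-like" variant reflects my understanding that the parent field
names the child version explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT QEMU's real types. A field is loaded only if the
 * version being loaded is >= the version that introduced the field. */
typedef struct ToyField {
    const char *name;
    int version_id;        /* first version that carries this field */
} ToyField;

typedef struct ToyVMSD {
    const char *name;
    int version_id;        /* newest version of this description */
    const ToyField *fields;
    size_t n_fields;
} ToyVMSD;

/* Count how many fields a loader would try to read for `version_id`. */
static size_t toy_fields_loaded(const ToyVMSD *vmsd, int version_id)
{
    size_t n = 0;
    for (size_t i = 0; i < vmsd->n_fields; i++) {
        if (vmsd->fields[i].version_id <= version_id) {
            n++;
        }
    }
    return n;
}

/* Shortened stand-in for "virtio-gpu-one-scanout": fb.* added in v2. */
static const ToyField toy_scanout_fields[] = {
    { "resource_id", 0 }, { "width", 0 }, { "height", 0 },
    { "fb.format", 2 },   { "fb.stride", 2 },
};

static const ToyVMSD toy_scanout = {
    "toy-one-scanout", 2, toy_scanout_fields,
    sizeof(toy_scanout_fields) / sizeof(toy_scanout_fields[0]),
};

/* VMS_STRUCT-like: the nested load ignores what the stream contains
 * and always uses the child's own (newest) version_id. */
static size_t toy_load_nested_struct(const ToyVMSD *child, int stream_version)
{
    (void)stream_version;  /* deliberately unused: this is the problem */
    return toy_fields_loaded(child, child->version_id);
}

/* VSTRUCT-like (assumption): the parent passes the child version. */
static size_t toy_load_nested_vstruct(const ToyVMSD *child, int struct_version)
{
    assert(struct_version <= child->version_id);
    return toy_fields_loaded(child, struct_version);
}
```

With a v1 stream, toy_load_nested_struct(&toy_scanout, 1) still tries to
read all five fields, including the two fb.* ones the source never sent,
while toy_load_nested_vstruct(&toy_scanout, 1) reads only the three v1
fields.
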
Looks right. There is only one such user, and it is the one that was
added when VMS_VSTRUCT was introduced in 2018.

It's a pity we can't simply use vmsd subsections here, even though they
were there before this VSTRUCT thing and should work with internal
versioning. Maybe VSTRUCT was introduced because a VMS_STRUCT can't be
replaced with subsections?

https://lore.kernel.org/qemu-devel/1524670052-28373-1-git-send-email-miny...@acm.org/#t

OTOH, I don't think vmsd versioning would work for ping-pong migrations.
With vmsd versioning, migrating backwards should fail with 'not
supported'.

Depending on the requirement (in this virtio-gpu case, it looks like it
could be used in a cluster, with VMs moved back and forth?), we may want
to support bi-directional migrations, which should be superior. That
means sticking with a machine type check (compat fields in hw_compat_*,
then conditionally saving/loading the fields) to guarantee that migration
works in both directions.

Thanks,

-- 
Peter Xu
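
For illustration, the machine-type approach mentioned above could be
modelled roughly as follows. This is a toy sketch under stated
assumptions, not QEMU code: ToyGpu, save_fb and the toy_* helpers are
hypothetical. In real QEMU the flag would be a device property that is
turned off for older machine types through hw_compat_*, and the fb
fields would be guarded by a test callback on the vmstate fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy device state, NOT QEMU's. `save_fb` stands in for a hypothetical
 * compat property: false on old machine types, true on new ones. */
typedef struct ToyGpu {
    bool save_fb;
    uint32_t resource_id;
    uint32_t fb_format;
} ToyGpu;

/* Plays the role of a "does this field exist?" test callback. */
static bool toy_fb_needed(const ToyGpu *g)
{
    return g->save_fb;
}

/* Write the state into `out` (room for 2 words); return words written. */
static size_t toy_save(const ToyGpu *g, uint32_t *out)
{
    size_t n = 0;
    assert(out != NULL);
    out[n++] = g->resource_id;
    if (toy_fb_needed(g)) {
        out[n++] = g->fb_format;   /* only sent when the property is on */
    }
    return n;
}

/* Load the state; fail if the stream layout does not match what this
 * side expects for its machine type. */
static bool toy_load(ToyGpu *g, const uint32_t *in, size_t len)
{
    size_t want = toy_fb_needed(g) ? 2 : 1;
    if (len != want) {
        return false;
    }
    g->resource_id = in[0];
    if (toy_fb_needed(g)) {
        g->fb_format = in[1];
    }
    return true;
}
```

The point of the design is that the stream layout depends only on the
machine type, not on the QEMU version: as long as both sides run the
same machine type, both agree on save_fb, so the same stream loads in
either direction.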