On 10/1/23 18:07, Marc-André Lureau wrote: > Hi Laszlo > > On Sun, Oct 1, 2023 at 4:20 AM Laszlo Ersek <ler...@redhat.com> wrote: >> >> On 10/1/23 00:14, Laszlo Ersek wrote: >>> On 9/29/23 13:17, Marc-André Lureau wrote: > [..] >>>> fwiw, my migration support patch is still unreviewed: >>>> https://patchew.org/QEMU/20230920082651.3349712-1-marcandre.lur...@redhat.com/ >>>> >>> >>> I don't have a copy of that patch (not subscribed, sorry...), but how >>> simply you did it surprises me. I did expect to simulate an fw_cfg write >>> in post_load, but I thought we'd approach the device (for the sake of >>> including it in the migration stream) from the higher level device's >>> vmstate descriptors (dc->vmsd) that set up / depend on ramfb in the >>> first place. I didn't know that raw vmstate_register() was still accepted. >>> >>> If it is, then, for that patch (with Gerd's comment addressed): >>> >>> Acked-by: Laszlo Ersek <ler...@redhat.com> >> >> I think I may have found one issue with that patch. >> >> The fields that we save into the migration stream are integer members of >> the RAMFBCfg structure that lives directly in the fw_cfg file. The ramfb >> device specifies those fields for the guest as big endian. This means >> that when ramfb_fw_cfg_write() runs, triggered by the guest, then on big >> endian hosts, be32_to_cpu() and be64_to_cpu() will be no-ops, as the >> integer representation inside the fw_cfg file will match the host CPU at >> once. And on little endian hosts, these functions will byte swap. In >> both cases, ramfb_create_display_surface() will have to be called with >> identical host-side integers. This means that *before* the be32_to_cpu() >> and be64_to_cpu() calls, the host side *values* read out from the fw_cfg >> file members *differ* between big endian and little endian hosts. >> >> And the problem is that we write precisely those values into the >> migration stream, via "vmstate_ramfb_cfg". The migration stream >> represents integers in big endian regardless of host endianness, but the >> question is the *values* that we encode in BE for the stream. And the >> values (from fw_cg) will differ between little endian and big endian hosts. >> >> Thus, I think we should just use >> >> VMSTATE_BUFFER(cfg, RAMFBState) >> >> in "vmstate_ramfb", and remove "vmstate_ramfb_cfg". For migration >> purposes, we should treat the fw_cfg file as an opaque blob. > > I think I see your point. Using VMSTATE_BUFFER like that doesn't work > though.
Why not? (I mean -- does it compile but misbehave, or it doesn't even compile (an invalid use of the VMSTATE_BUFFER macro)?) > It's also more error-prone if fields are added in the struct, > imho. The structure is effectively the guest-visible register block for the device. We probably can't add any fields, and even if we do, the new fields are going to be part of the fw_cfg blob (writeable file), so they should be covered by VMSTATE_BUFFER just the same. > > However, we could simply have a post-load to convert the values to BE. post_load itself is not enough; if we want to go this route, then we need pre_save too. Without a pre_save, the host endianness influences the data serialized to the migration stream, and there's no way to know how to recover (the source host's endianness is unknown at load time). pre_save could work though, if it performed the same BE to host conversions (to a separate buffer I guess!) as the fw_cfg write callback does. > I wonder if new macros couldn't be introduced too. > >>> >>> BTW: can you please assign >>> <https://bugzilla.redhat.com/show_bug.cgi?id=1859424> to yourself and >>> link your patch into it? The reason we ended up duplicating work here is >>> that noone took RHBZ 1859424 before. > > I thought I did that. > > https://bugzilla.redhat.com/show_bug.cgi?id=1859424#c17 Ouch, sorry. That must have happened since I last looked, and I was foolish enough not to CC myself on the BZ early on. My mistake! > >>> >>> ... Well, the ticket is RHEL-7478 in JIRA now, in fact. :/ > > :/ > > Laszlo