Cédric Le Goater <c...@kaod.org> writes:

> On 1/10/24 14:19, Fabiano Rosas wrote:
>> Markus Armbruster <arm...@redhat.com> writes:
>> 
>>> Peter Xu <pet...@redhat.com> writes:
>>>
>>>> On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote:
>>>>> Hi Fabiano,
>>>>>
>>>>> On 9/1/24 21:21, Fabiano Rosas wrote:
>>>>>> Cédric Le Goater <c...@kaod.org> writes:
>>>>>>
>>>>>>> On 1/9/24 18:40, Fabiano Rosas wrote:
>>>>>>>> Cédric Le Goater <c...@kaod.org> writes:
>>>>>>>>
>>>>>>>>> On 1/3/24 20:53, Fabiano Rosas wrote:
>>>>>>>>>> Philippe Mathieu-Daudé <phi...@linaro.org> writes:
>>>>>>>>>>
>>>>>>>>>>> +Peter/Fabiano
>>>>>>>>>>>
>>>>>>>>>>> On 2/1/24 17:41, Cédric Le Goater wrote:
>>>>>>>>>>>> On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:
>>>>>>>>>>>>> Hi Cédric,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2/1/24 15:55, Cédric Le Goater wrote:
>>>>>>>>>>>>>> On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When an MPCore cluster is used, the Cortex-A cores belong to the
>>>>>>>>>>>>>>> cluster container, not to the board/soc layer. This series moves
>>>>>>>>>>>>>>> the creation of vCPUs to the MPCore private container.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Doing so we consolidate the QOM model, moving common code in a
>>>>>>>>>>>>>>> central place (abstract MPCore parent).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Changing the QOM hierarchy has an impact on the state of the 
>>>>>>>>>>>>>> machine
>>>>>>>>>>>>>> and some fixups are then required to maintain migration 
>>>>>>>>>>>>>> compatibility.
>>>>>>>>>>>>>> This can become a real headache for KVM machines like virt for 
>>>>>>>>>>>>>> which
>>>>>>>>>>>>>> migration compatibility is a feature, less for emulated ones.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All changes are either moving properties (which are not migrated)
>>>>>>>>>>>>> or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
>>>>>>>>>>>>> is still migrated elsewhere). So I don't see any obvious migration
>>>>>>>>>>>>> problem, but I might be missing something, so I Cc'ed Juan :>
>>>>>>>>>>
>>>>>>>>>> FWIW, I didn't spot anything problematic either.
>>>>>>>>>>
>>>>>>>>>> I've run this through my migration compatibility series [1] and it
>>>>>>>>>> doesn't regress aarch64 migration from/to 8.2. The tests use '-M
>>>>>>>>>> virt -cpu max', so the cortex-a7 and cortex-a15 are not covered.
>>>>>>>>>> I don't think we even support migration of anything non-KVM on arm.
>>>>>>>>>
>>>>>>>>> it happens we do.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Oh, sorry, I didn't mean TCG here. Probably meant to say something like
>>>>>>>> non-KVM-capable cpus, as in 32-bit. Nevermind.
>>>>>>>
>>>>>>> Theoretically, we should be able to migrate to a TCG guest. Well, this
>>>>>>> worked in the past for PPC. When I was doing more KVM related changes,
>>>>>>> this was very useful for dev. Also, some machines are partially 
>>>>>>> emulated.
>>>>>>> Anyhow I agree this is not a strong requirement and we often break it.
>>>>>>> Let's focus on KVM only.
>>>>>>>
>>>>>>>>>> 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533
>>>>>>>>>
>>>>>>>>> yes it depends on the QOM hierarchy and virt seems immune to the 
>>>>>>>>> changes.
>>>>>>>>> Good.
>>>>>>>>>
>>>>>>>>> However, changing the QOM topology clearly breaks migration compat,
>>>>>>>>
>>>>>>>> Well, "clearly" is relative =) You've mentioned pseries and aspeed
>>>>>>>> already, do you have a pointer to one of those cases were we broke
>>>>>>>> migration
>>>>>>>
>>>>>>> Regarding pseries, migration compat broke because of 5bc8d26de20c
>>>>>>> ("spapr: allocate the ICPState object from under sPAPRCPUCore") which
>>>>>>> is similar to the changes proposed by this series, it impacts the QOM
>>>>>>> hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
>>>>>>> ("spapr: fix migration of ICPState objects from/to older QEMU") which
>>>>>>> is quite a headache, and it turned out to raise another problem some
>>>>>>> months ago ... :/ That's why I sent [1] to prepare removal of old
>>>>>>> machines and workarounds becoming a burden.
>>>>>>
>>>>>> This feels like something that could be handled by the vmstate code
>>>>>> somehow. The state is there, just under a different path.
>>>>>
>>>>> What, the QOM path is used in migration? ...
>>>>
>>>> Hopefully not..
>> 
>> Unfortunately the original fix doesn't mention _what_ actually broke
>> with migration. I assumed the QOM path was needed because otherwise I
>> don't think the fix makes sense. The thread discussing that patch also
>> directly mentions the QOM path:
>> 
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg450912.html
>> 
>> But I probably misunderstood something while reading that thread.
>> 
>>>>
>>>>>
>>>>> See recent discussions on "QOM path stability":
>>>>> https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/
>>>>> https://lore.kernel.org/qemu-devel/87jzojbxt7....@pond.sub.org/
>>>>> https://lore.kernel.org/qemu-devel/87v883by34....@pond.sub.org/
>>>>
>>>> If I read it right, the commit 46f7afa37096 example is pretty special:
>>>> more than the QOM hierarchy itself, it changed the existence of the
>>>> objects.
>>>
>>> Let's see whether I got this...
>>>
>>> We removed some useless objects, moved the useful ones to another home.
>>> The move changed their QOM path.
>>>
>>> The problem was the removal of useless objects, because this also
>>> removed their vmstate.
>> 
>> If you checkout at the removal commit (5bc8d26de20c), the vmstate has
>> been kept untouched.
>> 
>>>
>>> The fix was adding the vmstate back as a dummy.
>> 
>> Since the vmstate was kept I don't see why would we need a dummy. The
>> incoming migration stream would still have the state, only at a
>> different point in the stream. It's surprising to me that that would
>> cause an issue, but I'm not well versed in that code.
>> 
>>>
>>> The QOM patch changes are *not* part of the problem.
>> 
>> The only explanation I can come up with is that after the patch,
>> migration would break after a hotplug or similar operation. In such a
>> situation, the preallocated state would always be present before the
>> patch, but possibly absent after the patch if, say, a hot-unplug had
>> taken away a cpu + ICPState.
>
> May be. It has been a while.
>
> For a better understanding, I tried to reproduce. QEMU 2.9 still
> compiles on a f38 (memfd_create needs a fix). However, a
> QEMU-2.9/pseries-2.9 machine won't start any recent OS because
> something broke the OS/platform negotiation process (CAS) ...

Thanks for letting us know it still builds. That motivated me to try as
well. Although it took me a while because of python's brokenness.

The issue was indeed the difference in preallocation vs. no
preallocation. I don't remember how to do hotplug in ppc anymore, but
maxcpus was sufficient to reproduce:

src 2.9.0:
$ ./ppc64-softmmu/qemu-system-ppc64 -serial mon:stdio -nographic -vga none \
    -display none -accel tcg -machine pseries -smp 4,maxcpus=8 -m 256M \
    -nodefaults

dst 2.9.0 prior to 46f7afa:
$ ./ppc64-softmmu/qemu-system-ppc64 -serial mon:stdio -display none \
    -accel tcg -machine pseries-2.9 -smp 4,maxcpus=8 -m 256M -nodefaults \
    -incoming defer

(qemu) migrate_incoming tcp:localhost:1234
(qemu) qemu-system-ppc64: Unknown savevm section or instance 'icp/server' 4
qemu-system-ppc64: load of migration failed: Invalid argument

As you can see, both use 4 cpus, but the src will try to migrate all 8
icp/server instances. After the patch, the same migration works.

So this was related neither to the QOM path nor to the QOM hierarchy; it
was just about having state created at machine init vs. cpu realize
time. It might be wise to keep an eye on the QOM hierarchy anyway, as it
could make these kinds of changes more evident during review.

> I removed the pre_2_10_has_unused_icps hack and worked around
> another migration compat issue in IOMMU ... Here are the last
> trace-events for migration :
>
>      QEMU-mainline/pseries-2.9 -> QEMU-2.9/pseries-2.9
>
> 1016464@1705053752.106417:vmstate_load spapr, spapr
> 1016464@1705053752.106419:vmstate_load_state spapr v3
> 1016464@1705053752.106421:vmstate_load_state_field spapr:unused
> 1016464@1705053752.106424:vmstate_load_state_field spapr:rtc_offset
> 1016464@1705053752.106425:vmstate_load_state_field spapr:tb
> 1016464@1705053752.106427:vmstate_n_elems tb: 1
> 1016464@1705053752.106429:vmstate_load_state timebase v1
> 1016464@1705053752.106432:vmstate_load_state_field timebase:guest_timebase
> 1016464@1705053752.106433:vmstate_n_elems guest_timebase: 1
> 1016464@1705053752.106435:vmstate_load_state_field timebase:time_of_the_day_ns
> 1016464@1705053752.106436:vmstate_n_elems time_of_the_day_ns: 1
> 1016464@1705053752.106438:vmstate_subsection_load timebase
> 1016464@1705053752.106440:vmstate_subsection_load_bad timebase: spapr_option_vector_ov5_cas/(prefix)
> 1016464@1705053752.106442:vmstate_load_state_end timebase end/0
> 1016464@1705053752.106443:vmstate_subsection_load spapr
> 1016464@1705053752.106445:vmstate_load_state spapr_option_vector_ov5_cas v1
> 1016464@1705053752.106447:vmstate_load_state_field spapr_option_vector_ov5_cas:ov5_cas
> 1016464@1705053752.106448:vmstate_n_elems ov5_cas: 1
> 1016464@1705053752.106450:vmstate_load_state spapr_option_vector v1
> 1016464@1705053752.106452:vmstate_load_state_field spapr_option_vector:bitmap
> 1016464@1705053752.106454:vmstate_n_elems bitmap: 1
> 1016464@1705053752.106456:vmstate_subsection_load spapr_option_vector
> 1016464@1705053752.106457:vmstate_subsection_load_bad spapr_option_vector: (short)/
> 1016464@1705053752.106459:vmstate_load_state_end spapr_option_vector end/0
> 1016464@1705053752.106461:vmstate_subsection_load spapr_option_vector_ov5_cas
> 1016464@1705053752.106462:vmstate_subsection_load_bad spapr_option_vector_ov5_cas: (short)/
> 1016464@1705053752.106465:vmstate_load_state_end spapr_option_vector_ov5_cas end/0
> 1016464@1705053752.106466:vmstate_subsection_load_bad spapr: spapr/cap/htm/(lookup)
> qemu-system-ppc64: error while loading state for instance 0x0 of device 'spapr'

This trace seems to be something else entirely. I'd be amazed if
migrating mainline vs. something that old worked at all, even on x86.
