Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 1/12/24 20:54, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/10/24 14:19, Fabiano Rosas wrote: Markus Armbruster writes: Peter Xu writes: On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: Hi Fabiano, On 9/1/24 21:21, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. This feels like something that could be handled by the vmstate code somehow. The state is there, just under a different path. What, the QOM path is used in migration? ... Hopefully not.. Unfortunately the original fix doesn't mention _what_ actually broke with migration. I assumed the QOM path was needed because otherwise I don't think the fix makes sense. The thread discussing that patch also directly mentions the QOM path: https://www.mail-archive.com/qemu-devel@nongnu.org/msg450912.html But I probably misunderstood something while reading that thread. See recent discussions on "QOM path stability": https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/ If I read it right, the commit 46f7afa37096 example is pretty special that the QOM path more or less decided more than the hierachy itself but changes the existances of objects. Let's see whether I got this... We removed some useless objects, moved the useful ones to another home. The move changed their QOM path. The problem was the removal of useless objects, because this also removed their vmstate. If you checkout at the removal commit (5bc8d26de20c), the vmstate has been kept untouched. The fix was adding the vmstate back as a dummy. Since the vmstate was kept I don't see why would we need a dummy. The incoming migration stream would still have the state, only at a different point in the stream. It's surprising to me that that would cause an issue, but I'm not well versed in that code. The QOM patch changes are *not* part of the problem. The only explanation I can come up with is that after the patch migration has broken after a hotplug or similar operation. In such situation, the preallocated state would
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Cédric Le Goater writes: > On 1/10/24 14:19, Fabiano Rosas wrote: >> Markus Armbruster writes: >> >>> Peter Xu writes: >>> On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: > Hi Fabiano, > > On 9/1/24 21:21, Fabiano Rosas wrote: >> Cédric Le Goater writes: >> >>> On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: > On 1/3/24 20:53, Fabiano Rosas wrote: >> Philippe Mathieu-Daudé writes: >> >>> +Peter/Fabiano >>> >>> On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: > Hi Cédric, > > On 2/1/24 15:55, Cédric Le Goater wrote: >> On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: >>> Hi, >>> >>> When a MPCore cluster is used, the Cortex-A cores belong the the >>> cluster container, not to the board/soc layer. This series move >>> the creation of vCPUs to the MPCore private container. >>> >>> Doing so we consolidate the QOM model, moving common code in a >>> central place (abstract MPCore parent). >> >> Changing the QOM hierarchy has an impact on the state of the >> machine >> and some fixups are then required to maintain migration >> compatibility. >> This can become a real headache for KVM machines like virt for >> which >> migration compatibility is a feature, less for emulated ones. > > All changes are either moving properties (which are not migrated) > or moving non-migrated QOM members (i.e. pointers of ARMCPU, which > is still migrated elsewhere). So I don't see any obvious migration > problem, but I might be missing something, so I Cc'ed Juan :> >> >> FWIW, I didn't spot anything problematic either. >> >> I've ran this through my migration compatibility series [1] and it >> doesn't regress aarch64 migration from/to 8.2. The tests use '-M >> virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I >> don't >> think we even support migration of anything non-KVM on arm. > > it happens we do. > Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. >>> >>> Theoretically, we should be able to migrate to a TCG guest. Well, this >>> worked in the past for PPC. When I was doing more KVM related changes, >>> this was very useful for dev. Also, some machines are partially >>> emulated. >>> Anyhow I agree this is not a strong requirement and we often break it. >>> Let's focus on KVM only. >>> >> 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 > > yes it depends on the QOM hierarchy and virt seems immune to the > changes. > Good. > > However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration >>> >>> Regarding pseries, migration compat broke because of 5bc8d26de20c >>> ("spapr: allocate the ICPState object from under sPAPRCPUCore") which >>> is similar to the changes proposed by this series, it impacts the QOM >>> hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 >>> ("spapr: fix migration of ICPState objects from/to older QEMU") which >>> is quite an headache and this turned out to raise another problem some >>> months ago ... :/ That's why I sent [1] to prepare removal of old >>> machines and workarounds becoming a burden. >> >> This feels like something that could be handled by the vmstate code >> somehow. The state is there, just under a different path. > > What, the QOM path is used in migration? ... Hopefully not.. >> >> Unfortunately the original fix doesn't mention _what_ actually broke >> with migration. I assumed the QOM path was needed because otherwise I >> don't think the fix makes sense. The thread discussing that patch also >> directly mentions the QOM path: >> >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg450912.html >> >> But I probably misunderstood something while reading that thread. >> > > See recent discussions on "QOM path stability": > https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ > https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ > https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/ If I read it right, the commit 46f7afa37096 example is pretty special that the QOM path
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 1/10/24 14:19, Fabiano Rosas wrote: Markus Armbruster writes: Peter Xu writes: On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: Hi Fabiano, On 9/1/24 21:21, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. This feels like something that could be handled by the vmstate code somehow. The state is there, just under a different path. What, the QOM path is used in migration? ... Hopefully not.. Unfortunately the original fix doesn't mention _what_ actually broke with migration. I assumed the QOM path was needed because otherwise I don't think the fix makes sense. The thread discussing that patch also directly mentions the QOM path: https://www.mail-archive.com/qemu-devel@nongnu.org/msg450912.html But I probably misunderstood something while reading that thread. See recent discussions on "QOM path stability": https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/ If I read it right, the commit 46f7afa37096 example is pretty special that the QOM path more or less decided more than the hierachy itself but changes the existances of objects. Let's see whether I got this... We removed some useless objects, moved the useful ones to another home. The move changed their QOM path. The problem was the removal of useless objects, because this also removed their vmstate. If you checkout at the removal commit (5bc8d26de20c), the vmstate has been kept untouched. The fix was adding the vmstate back as a dummy. Since the vmstate was kept I don't see why would we need a dummy. The incoming migration stream would still have the state, only at a different point in the stream. It's surprising to me that that would cause an issue, but I'm not well versed in that code. The QOM patch changes are *not* part of the problem. The only explanation I can come up with is that after the patch migration has broken after a hotplug or similar operation. In such situation, the preallocated state would always be present before the patch, but sometimes not present after
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 1/10/24 07:26, Peter Xu wrote: On Wed, Jan 10, 2024 at 07:03:06AM +0100, Markus Armbruster wrote: Peter Xu writes: On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: Hi Fabiano, On 9/1/24 21:21, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. This feels like something that could be handled by the vmstate code somehow. The state is there, just under a different path. What, the QOM path is used in migration? ... Hopefully not.. See recent discussions on "QOM path stability": https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/ If I read it right, the commit 46f7afa37096 example is pretty special that the QOM path more or less decided more than the hierachy itself but changes the existances of objects. Let's see whether I got this... We removed some useless objects, moved the useful ones to another home. The move changed their QOM path. The problem was the removal of useless objects, because this also removed their vmstate. The fix was adding the vmstate back as a dummy. The QOM patch changes are *not* part of the problem. Correct? [I'd leave this to Cedric] No one wants to be policing QOM hierarchy changes in every single series that shows up on the list. Anyway, thanks for the pointers. I'll study that code a bit more, maybe I can come up with some way to handle these cases. Hopefully between the analyze-migration test and the compat tests we'll catch the next bug of this kind before it gets merged. Things like that might be able to be detected via vmstate-static-checker.py. But I'm not 100% sure, also its coverage is limited. For example, I don't think it can detect changes to objects that will only be created dynamically, e.g., I think sometimes we create objects after some guest behaviors (consider guest enables the device, then QEMU emulation creates some objects on demand of device setup?), Feels nuts to me. In real hardware, software enabling a device that is disabled by default doesn't create the device. The device is always
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 1/10/24 07:03, Markus Armbruster wrote: Peter Xu writes: On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: Hi Fabiano, On 9/1/24 21:21, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. This feels like something that could be handled by the vmstate code somehow. The state is there, just under a different path. What, the QOM path is used in migration? ... Hopefully not.. See recent discussions on "QOM path stability": https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/ If I read it right, the commit 46f7afa37096 example is pretty special that the QOM path more or less decided more than the hierachy itself but changes the existances of objects. Let's see whether I got this... We removed some useless objects, moved the useful ones to another home. The move changed their QOM path. They interrupt controller presenter objects were quite useful :) From what I recall, we moved them from an array under the machine to the CPU object, so the interrupt controller presenter states previously under the machine were not there anymore and this broke migration compatibility. Sorry for the noise if this is not a problem anymore. It was at the time and we found a way to work around it; I should probably say we hacked our way around it. Nevertheless, this was kind of a trauma too because since I never dared touch the QOM hierarchy of a migratable machine again. Migration is complicated. The problem was the removal of useless objects, because this also removed their vmstate. The fix was adding the vmstate back as a dummy. The QOM patch changes are *not* part of the problem. Correct? No one wants to be policing QOM hierarchy changes in every single series that shows up on the list. Anyway, thanks for the pointers. I'll study that code a bit more, maybe I can come up with some way to handle these cases. Hopefully between the analyze-migration test and the compat tests we'll catch the next bug of this kind before it gets merged. Things like that might be able
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 1/9/24 22:09, Philippe Mathieu-Daudé wrote: On 9/1/24 22:07, Philippe Mathieu-Daudé wrote: Hi Cédric, On 9/1/24 19:06, Cédric Le Goater wrote: On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. No no, we want the same for TCG. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. Regarding aspeed, this series breaks compat. Can you write down the steps to reproduce please? I'll debug it. Also, have you figured (bisecting) which patch start to break? Sorry I didn't. I wished I had more time for that. Thanks, C.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. Regarding aspeed, this series breaks compat. Can you write down the steps to reproduce please? I'll debug it. We need to understand this. Nothing complex, $ wget https://github.com/legoater/qemu-aspeed-boot/raw/master/images/ast2600-evb/buildroot-2023.11/flash.img $ qemu-system-arm -M ast2600-evb -net user -drive file=./flash.img,format=raw,if=mtd -nographic -snapshot -serial mon:stdio -trace vmstate* -trace save* -trace load* $ qemu-system-arm-patch -M ast2600-evb -net user -drive file=./flash.img,format=raw,if=mtd -nographic -snapshot -serial mon:stdio -trace vmstate* -trace save* -trace load* -incoming tcp::1234 stop the VM in U-boot before loading the kernel because QEMU does not support migrating CPU when in secure mode. That's how I understood what Peter told me. (qemu) migrate tcp:localhost:1234 ... vmstate_load_state_field cpu:cpreg_vmstate_array_len vmstate_n_elems cpreg_vmstate_array_len: 1 qemu-system-arm: Invalid value 266 expecting positive value <= 223 qemu-system-arm: Failed to load cpu:cpreg_vmstate_array_len vmstate_load_field_error field "cpreg_vmstate_array_len" load failed, ret = -22 qemu-system-arm: error while loading state for instance 0x0 of device 'cpu' Thanks, C.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Fabiano Rosas writes: > Markus Armbruster writes: > >> Peter Xu writes: >> >>> On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: Hi Fabiano, On 9/1/24 21:21, Fabiano Rosas wrote: > Cédric Le Goater writes: > > > On 1/9/24 18:40, Fabiano Rosas wrote: > > > Cédric Le Goater writes: > > > > > > > On 1/3/24 20:53, Fabiano Rosas wrote: > > > > > Philippe Mathieu-Daudé writes: > > > > > > > > > > > +Peter/Fabiano > > > > > > > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote: > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: > > > > > > > > Hi Cédric, > > > > > > > > > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote: > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores > > > > > > > > > > belong the the > > > > > > > > > > cluster container, not to the board/soc layer. This > > > > > > > > > > series move > > > > > > > > > > the creation of vCPUs to the MPCore private container. > > > > > > > > > > > > > > > > > > > > Doing so we consolidate the QOM model, moving common > > > > > > > > > > code in a > > > > > > > > > > central place (abstract MPCore parent). > > > > > > > > > > > > > > > > > > Changing the QOM hierarchy has an impact on the state of > > > > > > > > > the machine > > > > > > > > > and some fixups are then required to maintain migration > > > > > > > > > compatibility. > > > > > > > > > This can become a real headache for KVM machines like > > > > > > > > > virt for which > > > > > > > > > migration compatibility is a feature, less for emulated > > > > > > > > > ones. > > > > > > > > > > > > > > > > All changes are either moving properties (which are not > > > > > > > > migrated) > > > > > > > > or moving non-migrated QOM members (i.e. pointers of > > > > > > > > ARMCPU, which > > > > > > > > is still migrated elsewhere). So I don't see any obvious > > > > > > > > migration > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan > > > > > > > > :> > > > > > > > > > > FWIW, I didn't spot anything problematic either. > > > > > > > > > > I've ran this through my migration compatibility series [1] and > > > > > it > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. > > > > > I don't > > > > > think we even support migration of anything non-KVM on arm. > > > > > > > > it happens we do. > > > > > > > > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something > > > like > > > non-KVM-capable cpus, as in 32-bit. Nevermind. > > > > Theoretically, we should be able to migrate to a TCG guest. Well, this > > worked in the past for PPC. When I was doing more KVM related changes, > > this was very useful for dev. Also, some machines are partially > > emulated. > > Anyhow I agree this is not a strong requirement and we often break it. > > Let's focus on KVM only. > > > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 > > > > > > > > yes it depends on the QOM hierarchy and virt seems immune to the > > > > changes. > > > > Good. > > > > > > > > However, changing the QOM topology clearly breaks migration compat, > > > > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed > > > already, do you have a pointer to one of those cases were we broke > > > migration > > > > Regarding pseries, migration compat broke because of 5bc8d26de20c > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which > > is similar to the changes proposed by this series, it impacts the QOM > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 > > ("spapr: fix migration of ICPState objects from/to older QEMU") which > > is quite an headache and this turned out to raise another problem some > > months ago ... :/ That's why I sent [1] to prepare removal of old > > machines and workarounds becoming a burden. > > This feels like something that could be handled by the vmstate code > somehow. The state is there, just under a different path. What, the QOM path is used in migration? ... >>> >>> Hopefully not.. > > Unfortunately the original fix doesn't mention _what_ actually broke > with migration. I assumed the QOM path was needed because otherwise I > don't think the fix makes sense. The thread discussing that patch also > directly mentions the QOM path: > >
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Markus Armbruster writes: > Peter Xu writes: > >> On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: >>> Hi Fabiano, >>> >>> On 9/1/24 21:21, Fabiano Rosas wrote: >>> > Cédric Le Goater writes: >>> > >>> > > On 1/9/24 18:40, Fabiano Rosas wrote: >>> > > > Cédric Le Goater writes: >>> > > > >>> > > > > On 1/3/24 20:53, Fabiano Rosas wrote: >>> > > > > > Philippe Mathieu-Daudé writes: >>> > > > > > >>> > > > > > > +Peter/Fabiano >>> > > > > > > >>> > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote: >>> > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: >>> > > > > > > > > Hi Cédric, >>> > > > > > > > > >>> > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote: >>> > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: >>> > > > > > > > > > > Hi, >>> > > > > > > > > > > >>> > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores >>> > > > > > > > > > > belong the the >>> > > > > > > > > > > cluster container, not to the board/soc layer. This >>> > > > > > > > > > > series move >>> > > > > > > > > > > the creation of vCPUs to the MPCore private container. >>> > > > > > > > > > > >>> > > > > > > > > > > Doing so we consolidate the QOM model, moving common >>> > > > > > > > > > > code in a >>> > > > > > > > > > > central place (abstract MPCore parent). >>> > > > > > > > > > >>> > > > > > > > > > Changing the QOM hierarchy has an impact on the state of >>> > > > > > > > > > the machine >>> > > > > > > > > > and some fixups are then required to maintain migration >>> > > > > > > > > > compatibility. >>> > > > > > > > > > This can become a real headache for KVM machines like >>> > > > > > > > > > virt for which >>> > > > > > > > > > migration compatibility is a feature, less for emulated >>> > > > > > > > > > ones. >>> > > > > > > > > >>> > > > > > > > > All changes are either moving properties (which are not >>> > > > > > > > > migrated) >>> > > > > > > > > or moving non-migrated QOM members (i.e. pointers of >>> > > > > > > > > ARMCPU, which >>> > > > > > > > > is still migrated elsewhere). So I don't see any obvious >>> > > > > > > > > migration >>> > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan >>> > > > > > > > > :> >>> > > > > > >>> > > > > > FWIW, I didn't spot anything problematic either. >>> > > > > > >>> > > > > > I've ran this through my migration compatibility series [1] and it >>> > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M >>> > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. >>> > > > > > I don't >>> > > > > > think we even support migration of anything non-KVM on arm. >>> > > > > >>> > > > > it happens we do. >>> > > > > >>> > > > >>> > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something >>> > > > like >>> > > > non-KVM-capable cpus, as in 32-bit. Nevermind. >>> > > >>> > > Theoretically, we should be able to migrate to a TCG guest. Well, this >>> > > worked in the past for PPC. When I was doing more KVM related changes, >>> > > this was very useful for dev. Also, some machines are partially >>> > > emulated. >>> > > Anyhow I agree this is not a strong requirement and we often break it. >>> > > Let's focus on KVM only. >>> > > >>> > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 >>> > > > > >>> > > > > yes it depends on the QOM hierarchy and virt seems immune to the >>> > > > > changes. >>> > > > > Good. >>> > > > > >>> > > > > However, changing the QOM topology clearly breaks migration compat, >>> > > > >>> > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed >>> > > > already, do you have a pointer to one of those cases were we broke >>> > > > migration >>> > > >>> > > Regarding pseries, migration compat broke because of 5bc8d26de20c >>> > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which >>> > > is similar to the changes proposed by this series, it impacts the QOM >>> > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 >>> > > ("spapr: fix migration of ICPState objects from/to older QEMU") which >>> > > is quite an headache and this turned out to raise another problem some >>> > > months ago ... :/ That's why I sent [1] to prepare removal of old >>> > > machines and workarounds becoming a burden. >>> > >>> > This feels like something that could be handled by the vmstate code >>> > somehow. The state is there, just under a different path. >>> >>> What, the QOM path is used in migration? ... >> >> Hopefully not.. Unfortunately the original fix doesn't mention _what_ actually broke with migration. I assumed the QOM path was needed because otherwise I don't think the fix makes sense. The thread discussing that patch also directly mentions the QOM path: https://www.mail-archive.com/qemu-devel@nongnu.org/msg450912.html But I probably misunderstood something while reading that thread. >> >>> >>> See recent discussions on
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On Wed, Jan 10, 2024 at 09:09:51AM +0100, Markus Armbruster wrote: > If an object has state that needs to be migrated only sometimes, and > that part of the state is large enough to bother, we can put it in an > optional subsection, can't we? > > Destination: if present, take it. If absent, initialize to default. > > Source: send unless (known to be) in default state. Hmm.. correct. I think I messed up VMSD's needed() hook with field_exists() of the fields; my apologies. The trick should be that VMSD's subsections is more flexible, due to the fact that vmstate_subsection_load() has: while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) { ... sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr); ... } So it tolerates the case where the subsection doesn't exist, or out of order subsections. While field_exists() hook seems not that flexible, as it's implemented as this in vmstate_load_state(): while (field->name) { ... if (vmstate_field_exists(vmsd, field, opaque, version_id)) { ... } ... field++; } So that vmstate_field_exists() needs to be known even on dest before the main vmsd got loaded, and it should always return the same value as the source. Also, the field has ordering requirements. Then yes, subsection should work for dynamic objects. Thanks, -- Peter Xu
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Peter Xu writes: > On Wed, Jan 10, 2024 at 07:03:06AM +0100, Markus Armbruster wrote: >> Peter Xu writes: >> >> > On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: >> >> Hi Fabiano, >> >> >> >> On 9/1/24 21:21, Fabiano Rosas wrote: [...] >> >> > No one wants >> >> > to be policing QOM hierarchy changes in every single series that shows >> >> > up on the list. >> >> > >> >> > Anyway, thanks for the pointers. I'll study that code a bit more, maybe >> >> > I can come up with some way to handle these cases. >> >> > >> >> > Hopefully between the analyze-migration test and the compat tests we'll >> >> > catch the next bug of this kind before it gets merged. >> > >> > Things like that might be able to be detected via >> > vmstate-static-checker.py. >> > But I'm not 100% sure, also its coverage is limited. >> > >> > For example, I don't think it can detect changes to objects that will only >> > be created dynamically, e.g., I think sometimes we create objects after >> > some guest behaviors (consider guest enables the device, then QEMU >> > emulation creates some objects on demand of device setup?), >> >> Feels nuts to me. >> >> In real hardware, software enabling a device that is disabled by default >> doesn't create the device. The device is always there, it just happens >> to be inactive unless enabled. We should model the device just like >> that. > > It doesn't need to be the device itself to be dynamically created, but some > other sub-objects that do not require to exist until the device is enabled, > or some specific function of that device is enabled. It is logically doable. > > Is the example Cedric provided looks like some case like this? I am not > sure, that's also why I'm not sure the static checker would work here. But > logically it seems possible, e.g. with migration VMSD needed() facilities. > Consider a device has a sub-function that requires a sub-object. It may > not need to migrate that object if that sub-feature is not even enabled. > If that object is very large, it might be wise to do so if possible to not > send chunks of junk during the VM downtime. > > But then after a 2nd thought I do agree it's probably not sensible, because > even if the src may know whether the sub-object will be needed, there's > probably no good way for the dest QEMU to know. It can only know in > something like a post_load() hook, but logically that can happen only after > a full load of that device state, so might already be too late. > > Thanks, If an object has state that needs to be migrated only sometimes, and that part of the state is large enough to bother, we can put it in an optional subsection, can't we? Destination: if present, take it. If absent, initialize to default. Source: send unless (known to be) in default state.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On Wed, Jan 10, 2024 at 07:03:06AM +0100, Markus Armbruster wrote: > Peter Xu writes: > > > On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: > >> Hi Fabiano, > >> > >> On 9/1/24 21:21, Fabiano Rosas wrote: > >> > Cédric Le Goater writes: > >> > > >> > > On 1/9/24 18:40, Fabiano Rosas wrote: > >> > > > Cédric Le Goater writes: > >> > > > > >> > > > > On 1/3/24 20:53, Fabiano Rosas wrote: > >> > > > > > Philippe Mathieu-Daudé writes: > >> > > > > > > >> > > > > > > +Peter/Fabiano > >> > > > > > > > >> > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote: > >> > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: > >> > > > > > > > > Hi Cédric, > >> > > > > > > > > > >> > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote: > >> > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: > >> > > > > > > > > > > Hi, > >> > > > > > > > > > > > >> > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores > >> > > > > > > > > > > belong the the > >> > > > > > > > > > > cluster container, not to the board/soc layer. This > >> > > > > > > > > > > series move > >> > > > > > > > > > > the creation of vCPUs to the MPCore private container. > >> > > > > > > > > > > > >> > > > > > > > > > > Doing so we consolidate the QOM model, moving common > >> > > > > > > > > > > code in a > >> > > > > > > > > > > central place (abstract MPCore parent). > >> > > > > > > > > > > >> > > > > > > > > > Changing the QOM hierarchy has an impact on the state of > >> > > > > > > > > > the machine > >> > > > > > > > > > and some fixups are then required to maintain migration > >> > > > > > > > > > compatibility. > >> > > > > > > > > > This can become a real headache for KVM machines like > >> > > > > > > > > > virt for which > >> > > > > > > > > > migration compatibility is a feature, less for emulated > >> > > > > > > > > > ones. > >> > > > > > > > > > >> > > > > > > > > All changes are either moving properties (which are not > >> > > > > > > > > migrated) > >> > > > > > > > > or moving non-migrated QOM members (i.e. pointers of > >> > > > > > > > > ARMCPU, which > >> > > > > > > > > is still migrated elsewhere). So I don't see any obvious > >> > > > > > > > > migration > >> > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan > >> > > > > > > > > :> > >> > > > > > > >> > > > > > FWIW, I didn't spot anything problematic either. > >> > > > > > > >> > > > > > I've ran this through my migration compatibility series [1] and > >> > > > > > it > >> > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M > >> > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. > >> > > > > > I don't > >> > > > > > think we even support migration of anything non-KVM on arm. > >> > > > > > >> > > > > it happens we do. > >> > > > > > >> > > > > >> > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something > >> > > > like > >> > > > non-KVM-capable cpus, as in 32-bit. Nevermind. > >> > > > >> > > Theoretically, we should be able to migrate to a TCG guest. Well, this > >> > > worked in the past for PPC. When I was doing more KVM related changes, > >> > > this was very useful for dev. Also, some machines are partially > >> > > emulated. > >> > > Anyhow I agree this is not a strong requirement and we often break it. > >> > > Let's focus on KVM only. > >> > > > >> > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 > >> > > > > > >> > > > > yes it depends on the QOM hierarchy and virt seems immune to the > >> > > > > changes. > >> > > > > Good. > >> > > > > > >> > > > > However, changing the QOM topology clearly breaks migration compat, > >> > > > > >> > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed > >> > > > already, do you have a pointer to one of those cases were we broke > >> > > > migration > >> > > > >> > > Regarding pseries, migration compat broke because of 5bc8d26de20c > >> > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which > >> > > is similar to the changes proposed by this series, it impacts the QOM > >> > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 > >> > > ("spapr: fix migration of ICPState objects from/to older QEMU") which > >> > > is quite an headache and this turned out to raise another problem some > >> > > months ago ... :/ That's why I sent [1] to prepare removal of old > >> > > machines and workarounds becoming a burden. > >> > > >> > This feels like something that could be handled by the vmstate code > >> > somehow. The state is there, just under a different path. > >> > >> What, the QOM path is used in migration? ... > > > > Hopefully not.. > > > >> > >> See recent discussions on "QOM path stability": > >> https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ > >> https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ > >>
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Peter Xu writes: > On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: >> Hi Fabiano, >> >> On 9/1/24 21:21, Fabiano Rosas wrote: >> > Cédric Le Goater writes: >> > >> > > On 1/9/24 18:40, Fabiano Rosas wrote: >> > > > Cédric Le Goater writes: >> > > > >> > > > > On 1/3/24 20:53, Fabiano Rosas wrote: >> > > > > > Philippe Mathieu-Daudé writes: >> > > > > > >> > > > > > > +Peter/Fabiano >> > > > > > > >> > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote: >> > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: >> > > > > > > > > Hi Cédric, >> > > > > > > > > >> > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote: >> > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: >> > > > > > > > > > > Hi, >> > > > > > > > > > > >> > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores belong >> > > > > > > > > > > the the >> > > > > > > > > > > cluster container, not to the board/soc layer. This >> > > > > > > > > > > series move >> > > > > > > > > > > the creation of vCPUs to the MPCore private container. >> > > > > > > > > > > >> > > > > > > > > > > Doing so we consolidate the QOM model, moving common >> > > > > > > > > > > code in a >> > > > > > > > > > > central place (abstract MPCore parent). >> > > > > > > > > > >> > > > > > > > > > Changing the QOM hierarchy has an impact on the state of >> > > > > > > > > > the machine >> > > > > > > > > > and some fixups are then required to maintain migration >> > > > > > > > > > compatibility. >> > > > > > > > > > This can become a real headache for KVM machines like virt >> > > > > > > > > > for which >> > > > > > > > > > migration compatibility is a feature, less for emulated >> > > > > > > > > > ones. >> > > > > > > > > >> > > > > > > > > All changes are either moving properties (which are not >> > > > > > > > > migrated) >> > > > > > > > > or moving non-migrated QOM members (i.e. pointers of ARMCPU, >> > > > > > > > > which >> > > > > > > > > is still migrated elsewhere). So I don't see any obvious >> > > > > > > > > migration >> > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan :> >> > > > > > >> > > > > > FWIW, I didn't spot anything problematic either. >> > > > > > >> > > > > > I've ran this through my migration compatibility series [1] and it >> > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M >> > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I >> > > > > > don't >> > > > > > think we even support migration of anything non-KVM on arm. >> > > > > >> > > > > it happens we do. >> > > > > >> > > > >> > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something like >> > > > non-KVM-capable cpus, as in 32-bit. Nevermind. >> > > >> > > Theoretically, we should be able to migrate to a TCG guest. Well, this >> > > worked in the past for PPC. When I was doing more KVM related changes, >> > > this was very useful for dev. Also, some machines are partially emulated. >> > > Anyhow I agree this is not a strong requirement and we often break it. >> > > Let's focus on KVM only. >> > > >> > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 >> > > > > >> > > > > yes it depends on the QOM hierarchy and virt seems immune to the >> > > > > changes. >> > > > > Good. >> > > > > >> > > > > However, changing the QOM topology clearly breaks migration compat, >> > > > >> > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed >> > > > already, do you have a pointer to one of those cases were we broke >> > > > migration >> > > >> > > Regarding pseries, migration compat broke because of 5bc8d26de20c >> > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which >> > > is similar to the changes proposed by this series, it impacts the QOM >> > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 >> > > ("spapr: fix migration of ICPState objects from/to older QEMU") which >> > > is quite an headache and this turned out to raise another problem some >> > > months ago ... :/ That's why I sent [1] to prepare removal of old >> > > machines and workarounds becoming a burden. >> > >> > This feels like something that could be handled by the vmstate code >> > somehow. The state is there, just under a different path. >> >> What, the QOM path is used in migration? ... > > Hopefully not.. > >> >> See recent discussions on "QOM path stability": >> https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ >> https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ >> https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/ > > If I read it right, the commit 46f7afa37096 example is pretty special that > the QOM path more or less decided more than the hierachy itself but changes > the existances of objects. Let's see whether I got this... We removed some useless objects, moved the useful ones to another home. The move changed their QOM path.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote: > Hi Fabiano, > > On 9/1/24 21:21, Fabiano Rosas wrote: > > Cédric Le Goater writes: > > > > > On 1/9/24 18:40, Fabiano Rosas wrote: > > > > Cédric Le Goater writes: > > > > > > > > > On 1/3/24 20:53, Fabiano Rosas wrote: > > > > > > Philippe Mathieu-Daudé writes: > > > > > > > > > > > > > +Peter/Fabiano > > > > > > > > > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote: > > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: > > > > > > > > > Hi Cédric, > > > > > > > > > > > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote: > > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores belong > > > > > > > > > > > the the > > > > > > > > > > > cluster container, not to the board/soc layer. This > > > > > > > > > > > series move > > > > > > > > > > > the creation of vCPUs to the MPCore private container. > > > > > > > > > > > > > > > > > > > > > > Doing so we consolidate the QOM model, moving common code > > > > > > > > > > > in a > > > > > > > > > > > central place (abstract MPCore parent). > > > > > > > > > > > > > > > > > > > > Changing the QOM hierarchy has an impact on the state of > > > > > > > > > > the machine > > > > > > > > > > and some fixups are then required to maintain migration > > > > > > > > > > compatibility. > > > > > > > > > > This can become a real headache for KVM machines like virt > > > > > > > > > > for which > > > > > > > > > > migration compatibility is a feature, less for emulated > > > > > > > > > > ones. > > > > > > > > > > > > > > > > > > All changes are either moving properties (which are not > > > > > > > > > migrated) > > > > > > > > > or moving non-migrated QOM members (i.e. pointers of ARMCPU, > > > > > > > > > which > > > > > > > > > is still migrated elsewhere). So I don't see any obvious > > > > > > > > > migration > > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan :> > > > > > > > > > > > > FWIW, I didn't spot anything problematic either. > > > > > > > > > > > > I've ran this through my migration compatibility series [1] and it > > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M > > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I > > > > > > don't > > > > > > think we even support migration of anything non-KVM on arm. > > > > > > > > > > it happens we do. > > > > > > > > > > > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something like > > > > non-KVM-capable cpus, as in 32-bit. Nevermind. > > > > > > Theoretically, we should be able to migrate to a TCG guest. Well, this > > > worked in the past for PPC. When I was doing more KVM related changes, > > > this was very useful for dev. Also, some machines are partially emulated. > > > Anyhow I agree this is not a strong requirement and we often break it. > > > Let's focus on KVM only. > > > > > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 > > > > > > > > > > yes it depends on the QOM hierarchy and virt seems immune to the > > > > > changes. > > > > > Good. > > > > > > > > > > However, changing the QOM topology clearly breaks migration compat, > > > > > > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed > > > > already, do you have a pointer to one of those cases were we broke > > > > migration > > > > > > Regarding pseries, migration compat broke because of 5bc8d26de20c > > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which > > > is similar to the changes proposed by this series, it impacts the QOM > > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 > > > ("spapr: fix migration of ICPState objects from/to older QEMU") which > > > is quite an headache and this turned out to raise another problem some > > > months ago ... :/ That's why I sent [1] to prepare removal of old > > > machines and workarounds becoming a burden. > > > > This feels like something that could be handled by the vmstate code > > somehow. The state is there, just under a different path. > > What, the QOM path is used in migration? ... Hopefully not.. > > See recent discussions on "QOM path stability": > https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ > https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ > https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/ If I read it right, the commit 46f7afa37096 example is pretty special that the QOM path more or less decided more than the hierachy itself but changes the existances of objects. > > > No one wants > > to be policing QOM hierarchy changes in every single series that shows > > up on the list. > > > > Anyway, thanks for the pointers. I'll study that code a bit more, maybe > > I can come up with some way to handle these cases. > > > > Hopefully
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Hi Fabiano, On 9/1/24 21:21, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. This feels like something that could be handled by the vmstate code somehow. The state is there, just under a different path. What, the QOM path is used in migration? ... See recent discussions on "QOM path stability": https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/ https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/ https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/ No one wants to be policing QOM hierarchy changes in every single series that shows up on the list. Anyway, thanks for the pointers. I'll study that code a bit more, maybe I can come up with some way to handle these cases. Hopefully between the analyze-migration test and the compat tests we'll catch the next bug of this kind before it gets merged.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 9/1/24 22:07, Philippe Mathieu-Daudé wrote: Hi Cédric, On 9/1/24 19:06, Cédric Le Goater wrote: On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. No no, we want the same for TCG. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. Regarding aspeed, this series breaks compat. Can you write down the steps to reproduce please? I'll debug it. Also, have you figured (bisecting) which patch start to break? We need to understand this. Not that we care much but this caught my attention because of my past experience on pseries. Same kind of QOM change which could impact other machines, like virt. Since you checked that migration compat is preserved on virt, we should be fine. Thanks, C. [1] https://lore.kernel.org/qemu-devel/20231214181723.1520854-1-...@kaod.org/
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Hi Cédric, On 9/1/24 19:06, Cédric Le Goater wrote: On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. No no, we want the same for TCG. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. Regarding aspeed, this series breaks compat. Can you write down the steps to reproduce please? I'll debug it. We need to understand this. Not that we care much but this caught my attention because of my past experience on pseries. Same kind of QOM change which could impact other machines, like virt. Since you checked that migration compat is preserved on virt, we should be fine. Thanks, C. [1] https://lore.kernel.org/qemu-devel/20231214181723.1520854-1-...@kaod.org/
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Cédric Le Goater writes: > On 1/9/24 18:40, Fabiano Rosas wrote: >> Cédric Le Goater writes: >> >>> On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: > +Peter/Fabiano > > On 2/1/24 17:41, Cédric Le Goater wrote: >> On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: >>> Hi Cédric, >>> >>> On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: > Hi, > > When a MPCore cluster is used, the Cortex-A cores belong the the > cluster container, not to the board/soc layer. This series move > the creation of vCPUs to the MPCore private container. > > Doing so we consolidate the QOM model, moving common code in a > central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. >>> >>> All changes are either moving properties (which are not migrated) >>> or moving non-migrated QOM members (i.e. pointers of ARMCPU, which >>> is still migrated elsewhere). So I don't see any obvious migration >>> problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. >>> >>> it happens we do. >>> >> >> Oh, sorry, I didn't mean TCG here. Probably meant to say something like >> non-KVM-capable cpus, as in 32-bit. Nevermind. > > Theoretically, we should be able to migrate to a TCG guest. Well, this > worked in the past for PPC. When I was doing more KVM related changes, > this was very useful for dev. Also, some machines are partially emulated. > Anyhow I agree this is not a strong requirement and we often break it. > Let's focus on KVM only. > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 >>> >>> yes it depends on the QOM hierarchy and virt seems immune to the changes. >>> Good. >>> >>> However, changing the QOM topology clearly breaks migration compat, >> >> Well, "clearly" is relative =) You've mentioned pseries and aspeed >> already, do you have a pointer to one of those cases were we broke >> migration > > Regarding pseries, migration compat broke because of 5bc8d26de20c > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which > is similar to the changes proposed by this series, it impacts the QOM > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 > ("spapr: fix migration of ICPState objects from/to older QEMU") which > is quite an headache and this turned out to raise another problem some > months ago ... :/ That's why I sent [1] to prepare removal of old > machines and workarounds becoming a burden. This feels like something that could be handled by the vmstate code somehow. The state is there, just under a different path. No one wants to be policing QOM hierarchy changes in every single series that shows up on the list. Anyway, thanks for the pointers. I'll study that code a bit more, maybe I can come up with some way to handle these cases. Hopefully between the analyze-migration test and the compat tests we'll catch the next bug of this kind before it gets merged.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 1/9/24 18:40, Fabiano Rosas wrote: Cédric Le Goater writes: On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. Theoretically, we should be able to migrate to a TCG guest. Well, this worked in the past for PPC. When I was doing more KVM related changes, this was very useful for dev. Also, some machines are partially emulated. Anyhow I agree this is not a strong requirement and we often break it. Let's focus on KVM only. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration Regarding pseries, migration compat broke because of 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") which is similar to the changes proposed by this series, it impacts the QOM hierarchy. Here is the workaround/fix from Greg : 46f7afa37096 ("spapr: fix migration of ICPState objects from/to older QEMU") which is quite an headache and this turned out to raise another problem some months ago ... :/ That's why I sent [1] to prepare removal of old machines and workarounds becoming a burden. Regarding aspeed, this series breaks compat. Not that we care much but this caught my attention because of my past experience on pseries. Same kind of QOM change which could impact other machines, like virt. Since you checked that migration compat is preserved on virt, we should be fine. Thanks, C. [1] https://lore.kernel.org/qemu-devel/20231214181723.1520854-1-...@kaod.org/
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Cédric Le Goater writes: > On 1/3/24 20:53, Fabiano Rosas wrote: >> Philippe Mathieu-Daudé writes: >> >>> +Peter/Fabiano >>> >>> On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: > Hi Cédric, > > On 2/1/24 15:55, Cédric Le Goater wrote: >> On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: >>> Hi, >>> >>> When a MPCore cluster is used, the Cortex-A cores belong the the >>> cluster container, not to the board/soc layer. This series move >>> the creation of vCPUs to the MPCore private container. >>> >>> Doing so we consolidate the QOM model, moving common code in a >>> central place (abstract MPCore parent). >> >> Changing the QOM hierarchy has an impact on the state of the machine >> and some fixups are then required to maintain migration compatibility. >> This can become a real headache for KVM machines like virt for which >> migration compatibility is a feature, less for emulated ones. > > All changes are either moving properties (which are not migrated) > or moving non-migrated QOM members (i.e. pointers of ARMCPU, which > is still migrated elsewhere). So I don't see any obvious migration > problem, but I might be missing something, so I Cc'ed Juan :> >> >> FWIW, I didn't spot anything problematic either. >> >> I've ran this through my migration compatibility series [1] and it >> doesn't regress aarch64 migration from/to 8.2. The tests use '-M >> virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't >> think we even support migration of anything non-KVM on arm. > > it happens we do. > Oh, sorry, I didn't mean TCG here. Probably meant to say something like non-KVM-capable cpus, as in 32-bit. Nevermind. >> >> 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 > > yes it depends on the QOM hierarchy and virt seems immune to the changes. > Good. > > However, changing the QOM topology clearly breaks migration compat, Well, "clearly" is relative =) You've mentioned pseries and aspeed already, do you have a pointer to one of those cases were we broke migration so I could take a look?
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 1/3/24 20:53, Fabiano Rosas wrote: Philippe Mathieu-Daudé writes: +Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. it happens we do. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 yes it depends on the QOM hierarchy and virt seems immune to the changes. Good. However, changing the QOM topology clearly breaks migration compat, at least on the Aspeed SoC. The question is : do we care ? For Aspeed probably not, but the series should say so. Thanks, C. We broke migration compatibility by moving the IC object in the QOM hierarchy of the pseries machines in the past. These changes might be different. Here is the QOM tree of the ast2600 SoC. before : /soc (ast2600-a3) /a7mpcore (a15mpcore_priv) /a15mp-priv-container[0] (memory-region) /gic (arm_gic) /gic_cpu[0] (memory-region) /gic_cpu[1] (memory-region) /gic_cpu[2] (memory-region) /gic_dist[0] (memory-region) /gic_vcpu[0] (memory-region) /gic_viface[0] (memory-region) /gic_viface[1] (memory-region) /gic_viface[2] (memory-region) /cpu[0] (cortex-a7-arm-cpu) /cpu[1] (cortex-a7-arm-cpu) after : /soc (ast2600-a3) /a7mpcore (a7mpcore_priv) /cpu[0] (cortex-a7-arm-cpu) /cpu[1] (cortex-a7-arm-cpu) /gic (arm_gic) /gic_cpu[0] (memory-region) /gic_cpu[1] (memory-region) /gic_cpu[2] (memory-region) /gic_dist[0] (memory-region) /mpcore-priv-container[0] (memory-region) Thanks, C.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Philippe Mathieu-Daudé writes: > +Peter/Fabiano > > On 2/1/24 17:41, Cédric Le Goater wrote: >> On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: >>> Hi Cédric, >>> >>> On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: > Hi, > > When a MPCore cluster is used, the Cortex-A cores belong the the > cluster container, not to the board/soc layer. This series move > the creation of vCPUs to the MPCore private container. > > Doing so we consolidate the QOM model, moving common code in a > central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. >>> >>> All changes are either moving properties (which are not migrated) >>> or moving non-migrated QOM members (i.e. pointers of ARMCPU, which >>> is still migrated elsewhere). So I don't see any obvious migration >>> problem, but I might be missing something, so I Cc'ed Juan :> FWIW, I didn't spot anything problematic either. I've ran this through my migration compatibility series [1] and it doesn't regress aarch64 migration from/to 8.2. The tests use '-M virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't think we even support migration of anything non-KVM on arm. 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533 >> We broke migration compatibility by moving the IC object in the QOM >> hierarchy of the pseries machines in the past. These changes might >> be different. Here is the QOM tree of the ast2600 SoC. >> >> before : >> >> /soc (ast2600-a3) >> /a7mpcore (a15mpcore_priv) >> /a15mp-priv-container[0] (memory-region) >> /gic (arm_gic) >> /gic_cpu[0] (memory-region) >> /gic_cpu[1] (memory-region) >> /gic_cpu[2] (memory-region) >> /gic_dist[0] (memory-region) >> /gic_vcpu[0] (memory-region) >> /gic_viface[0] (memory-region) >> /gic_viface[1] (memory-region) >> /gic_viface[2] (memory-region) >> /cpu[0] (cortex-a7-arm-cpu) >> /cpu[1] (cortex-a7-arm-cpu) >> >> after : >> >> /soc (ast2600-a3) >> /a7mpcore (a7mpcore_priv) >> /cpu[0] (cortex-a7-arm-cpu) >> /cpu[1] (cortex-a7-arm-cpu) >> /gic (arm_gic) >> /gic_cpu[0] (memory-region) >> /gic_cpu[1] (memory-region) >> /gic_cpu[2] (memory-region) >> /gic_dist[0] (memory-region) >> /mpcore-priv-container[0] (memory-region) >> >> >> Thanks, >> >> C. >> >> >>
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
+Peter/Fabiano On 2/1/24 17:41, Cédric Le Goater wrote: On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> We broke migration compatibility by moving the IC object in the QOM hierarchy of the pseries machines in the past. These changes might be different. Here is the QOM tree of the ast2600 SoC. before : /soc (ast2600-a3) /a7mpcore (a15mpcore_priv) /a15mp-priv-container[0] (memory-region) /gic (arm_gic) /gic_cpu[0] (memory-region) /gic_cpu[1] (memory-region) /gic_cpu[2] (memory-region) /gic_dist[0] (memory-region) /gic_vcpu[0] (memory-region) /gic_viface[0] (memory-region) /gic_viface[1] (memory-region) /gic_viface[2] (memory-region) /cpu[0] (cortex-a7-arm-cpu) /cpu[1] (cortex-a7-arm-cpu) after : /soc (ast2600-a3) /a7mpcore (a7mpcore_priv) /cpu[0] (cortex-a7-arm-cpu) /cpu[1] (cortex-a7-arm-cpu) /gic (arm_gic) /gic_cpu[0] (memory-region) /gic_cpu[1] (memory-region) /gic_cpu[2] (memory-region) /gic_dist[0] (memory-region) /mpcore-priv-container[0] (memory-region) Thanks, C.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 1/2/24 17:15, Philippe Mathieu-Daudé wrote: Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> We broke migration compatibility by moving the IC object in the QOM hierarchy of the pseries machines in the past. These changes might be different. Here is the QOM tree of the ast2600 SoC. before : /soc (ast2600-a3) /a7mpcore (a15mpcore_priv) /a15mp-priv-container[0] (memory-region) /gic (arm_gic) /gic_cpu[0] (memory-region) /gic_cpu[1] (memory-region) /gic_cpu[2] (memory-region) /gic_dist[0] (memory-region) /gic_vcpu[0] (memory-region) /gic_viface[0] (memory-region) /gic_viface[1] (memory-region) /gic_viface[2] (memory-region) /cpu[0] (cortex-a7-arm-cpu) /cpu[1] (cortex-a7-arm-cpu) after : /soc (ast2600-a3) /a7mpcore (a7mpcore_priv) /cpu[0] (cortex-a7-arm-cpu) /cpu[1] (cortex-a7-arm-cpu) /gic (arm_gic) /gic_cpu[0] (memory-region) /gic_cpu[1] (memory-region) /gic_cpu[2] (memory-region) /gic_dist[0] (memory-region) /mpcore-priv-container[0] (memory-region) Thanks, C.
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Hi Cédric, On 2/1/24 15:55, Cédric Le Goater wrote: On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. All changes are either moving properties (which are not migrated) or moving non-migrated QOM members (i.e. pointers of ARMCPU, which is still migrated elsewhere). So I don't see any obvious migration problem, but I might be missing something, so I Cc'ed Juan :> I don't have a good solution to propose to overcome this problem :/ C. This eventually allow removing one qemu_get_cpu() use, which we want to remove in heterogeneous machines (machines using MPCore are candidate for heterogeneous emulation). include/hw/arm/aspeed_soc.h | 5 +- include/hw/arm/boot.h | 4 +- include/hw/arm/exynos4210.h | 6 +- include/hw/arm/fsl-imx6.h | 6 +- include/hw/arm/fsl-imx6ul.h | 8 +- include/hw/arm/fsl-imx7.h | 8 +- include/hw/arm/npcm7xx.h | 3 +- include/hw/cpu/a15mpcore.h | 44 --- include/hw/cpu/a9mpcore.h | 39 --- include/hw/cpu/cortex_mpcore.h | 135 ++
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). Changing the QOM hierarchy has an impact on the state of the machine and some fixups are then required to maintain migration compatibility. This can become a real headache for KVM machines like virt for which migration compatibility is a feature, less for emulated ones. I don't have a good solution to propose to overcome this problem :/ C. This eventually allow removing one qemu_get_cpu() use, which we want to remove in heterogeneous machines (machines using MPCore are candidate for heterogeneous emulation). Maybe these hw/cpu/arm/ files belong to hw/arm/... Regards, Phil. Philippe Mathieu-Daudé (33): hw/arm/boot: Propagate vCPU to arm_load_dtb() hw/arm/fsl-imx6: Add a local 'gic' variable hw/arm/fsl-imx6ul: Add a local 'gic' variable hw/arm/fsl-imx7: Add a local 'gic' variable hw/cpu: Remove dead Kconfig hw/cpu/arm: Rename 'busdev' -> 'gicsbd' in a15mp_priv_realize() hw/cpu/arm: Alias 'num-cpu' property on TYPE_REALVIEW_MPCORE hw/cpu/arm: Declare CPU QOM types using DEFINE_TYPES() macro hw/cpu/arm: Merge {a9mpcore.h, a15mpcore.h} as cortex_mpcore.h hw/cpu/arm: Introduce abstract CORTEX_MPCORE_PRIV QOM type hw/cpu/arm: Have A9MPCORE/A15MPCORE inheritate common CORTEX_MPCORE_PRIV hw/cpu/arm: Create MPCore container in QOM parent hw/cpu/arm: Handle 'num_cores' property once in MPCore parent hw/cpu/arm: Handle 'has_el2/3' properties once in MPCore parent hw/cpu/arm: Handle 'gic-irq' property once in MPCore parent hw/cpu/arm: Handle GIC once in MPCore parent hw/cpu/arm: Document more properties of CORTEX_MPCORE_PRIV QOM type hw/cpu/arm: Replace A15MPPrivState by CortexMPPrivState hw/cpu/arm: Introduce TYPE_A7MPCORE_PRIV for Cortex-A7 MPCore hw/cpu/arm: Consolidate check on max GIC spi supported hw/cpu/arm: Create CPUs once in MPCore parent hw/arm/aspeed_ast2600: Let the A7MPcore create/wire the CPU cores hw/arm/exynos4210: Let the A9MPcore create/wire the CPU cores hw/arm/fsl-imx6: Let the A9MPcore create/wire the CPU cores hw/arm/fsl-imx6ul: Let the A7MPcore create/wire the CPU cores hw/arm/fsl-imx7: Let the A7MPcore create/wire the CPU cores hw/arm/highbank: Let the A9/A15MPcore create/wire the CPU cores hw/arm/vexpress: Let the A9/A15MPcore create/wire the CPU cores hw/arm/xilinx_zynq: Let the A9MPcore create/wire the CPU cores hw/arm/npcm7xx: Let the A9MPcore create/wire the CPU cores hw/cpu/a9mpcore: Remove legacy code hw/cpu/arm: Remove 'num-cpu' property alias hw/cpu/arm: Remove use of qemu_get_cpu() in A7/A15 realize() MAINTAINERS| 3 +- include/hw/arm/aspeed_soc.h| 5 +- include/hw/arm/boot.h | 4 +- include/hw/arm/exynos4210.h| 6 +- include/hw/arm/fsl-imx6.h | 6 +- include/hw/arm/fsl-imx6ul.h| 8 +- include/hw/arm/fsl-imx7.h | 8 +- include/hw/arm/npcm7xx.h | 3 +- include/hw/cpu/a15mpcore.h | 44 --- include/hw/cpu/a9mpcore.h | 39 --- include/hw/cpu/cortex_mpcore.h | 135 ++ hw/arm/aspeed_ast2600.c| 61 -- hw/arm/boot.c | 11 +- hw/arm/exynos4210.c| 60 -- hw/arm/exynos4_boards.c| 6 +- hw/arm/fsl-imx6.c | 84 -- hw/arm/fsl-imx6ul.c| 65 --- hw/arm/fsl-imx7.c | 103 + hw/arm/highbank.c | 56 ++--- hw/arm/mcimx6ul-evk.c | 3 +- hw/arm/mcimx7d-sabre.c | 3 +- hw/arm/npcm7xx.c | 48 ++-- hw/arm/realview.c | 4 +- hw/arm/sabrelite.c | 4 +- hw/arm/vexpress.c | 60 +++--- hw/arm/virt.c | 2 +- hw/arm/xilinx_zynq.c | 30 ++--- hw/cpu/a15mpcore.c | 179 + hw/cpu/a9mpcore.c | 138 +- hw/cpu/arm11mpcore.c | 23 ++-- hw/cpu/cortex_mpcore.c | 202 + hw/cpu/realview_mpcore.c | 30 ++--- hw/arm/Kconfig | 8 +- hw/cpu/Kconfig | 8 -- hw/cpu/meson.build | 1 + 35 files changed, 689 insertions(+), 761 deletions(-) delete mode 100644 include/hw/cpu/a15mpcore.h delete mode 100644 include/hw/cpu/a9mpcore.h create mode 100644 include/hw/cpu/cortex_mpcore.h create mode 100644 hw/cpu/cortex_mpcore.c delete mode 100644 hw/cpu/Kconfig
Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Hi, ping for review? On 12/12/23 17:29, Philippe Mathieu-Daudé wrote: Hi, When a MPCore cluster is used, the Cortex-A cores belong the the cluster container, not to the board/soc layer. This series move the creation of vCPUs to the MPCore private container. Doing so we consolidate the QOM model, moving common code in a central place (abstract MPCore parent). This eventually allow removing one qemu_get_cpu() use, which we want to remove in heterogeneous machines (machines using MPCore are candidate for heterogeneous emulation). Maybe these hw/cpu/arm/ files belong to hw/arm/... Regards, Phil. Philippe Mathieu-Daudé (33): hw/arm/boot: Propagate vCPU to arm_load_dtb() hw/arm/fsl-imx6: Add a local 'gic' variable hw/arm/fsl-imx6ul: Add a local 'gic' variable hw/arm/fsl-imx7: Add a local 'gic' variable hw/cpu: Remove dead Kconfig hw/cpu/arm: Rename 'busdev' -> 'gicsbd' in a15mp_priv_realize() hw/cpu/arm: Alias 'num-cpu' property on TYPE_REALVIEW_MPCORE hw/cpu/arm: Declare CPU QOM types using DEFINE_TYPES() macro hw/cpu/arm: Merge {a9mpcore.h, a15mpcore.h} as cortex_mpcore.h hw/cpu/arm: Introduce abstract CORTEX_MPCORE_PRIV QOM type hw/cpu/arm: Have A9MPCORE/A15MPCORE inheritate common CORTEX_MPCORE_PRIV hw/cpu/arm: Create MPCore container in QOM parent hw/cpu/arm: Handle 'num_cores' property once in MPCore parent hw/cpu/arm: Handle 'has_el2/3' properties once in MPCore parent hw/cpu/arm: Handle 'gic-irq' property once in MPCore parent hw/cpu/arm: Handle GIC once in MPCore parent hw/cpu/arm: Document more properties of CORTEX_MPCORE_PRIV QOM type hw/cpu/arm: Replace A15MPPrivState by CortexMPPrivState hw/cpu/arm: Introduce TYPE_A7MPCORE_PRIV for Cortex-A7 MPCore hw/cpu/arm: Consolidate check on max GIC spi supported hw/cpu/arm: Create CPUs once in MPCore parent hw/arm/aspeed_ast2600: Let the A7MPcore create/wire the CPU cores hw/arm/exynos4210: Let the A9MPcore create/wire the CPU cores hw/arm/fsl-imx6: Let the A9MPcore create/wire the CPU cores hw/arm/fsl-imx6ul: Let the A7MPcore create/wire the CPU cores hw/arm/fsl-imx7: Let the A7MPcore create/wire the CPU cores hw/arm/highbank: Let the A9/A15MPcore create/wire the CPU cores hw/arm/vexpress: Let the A9/A15MPcore create/wire the CPU cores hw/arm/xilinx_zynq: Let the A9MPcore create/wire the CPU cores hw/arm/npcm7xx: Let the A9MPcore create/wire the CPU cores hw/cpu/a9mpcore: Remove legacy code hw/cpu/arm: Remove 'num-cpu' property alias hw/cpu/arm: Remove use of qemu_get_cpu() in A7/A15 realize()