On Fri Oct 20, 2023 at 6:33 PM AEST, Greg Kurz wrote: > On Fri, 20 Oct 2023 17:49:38 +1000 > "Nicholas Piggin" <npig...@gmail.com> wrote: > > > On Fri Oct 20, 2023 at 7:39 AM AEST, Greg Kurz wrote: > > > On Thu, 19 Oct 2023 21:08:25 +0200 > > > Juan Quintela <quint...@redhat.com> wrote: > > > > > > > Current code does: > > > > - register pre_2_10_vmstate_dummy_icp with "icp/server" and instance > > > > dependinfg on cpu number > > > > - for newer machines, it register vmstate_icp with "icp/server" name > > > > and instance 0 > > > > - now it unregisters "icp/server" for the 1st instance. > > > > > > > > This is wrong at many levels: > > > > - we shouldn't have two VMSTATEDescriptions with the same name > > > > - In case this is the only solution that we can came with, it needs to > > > > be: > > > > * register pre_2_10_vmstate_dummy_icp > > > > * unregister pre_2_10_vmstate_dummy_icp > > > > * register real vmstate_icp > > > > > > > > As the initialization of this machine is already complex enough, I > > > > need help from PPC maintainers to fix this. > > > > > > > > Volunteers? > > > > > > > > CC: Cedric Le Goater <c...@kaod.org> > > > > CC: Daniel Henrique Barboza <danielhb...@gmail.com> > > > > CC: David Gibson <da...@gibson.dropbear.id.au> > > > > CC: Greg Kurz <gr...@kaod.org> > > > > > > > > Signed-off-by: Juan Quintela <quint...@redhat.com> > > > > --- > > > > hw/ppc/spapr.c | 7 ++++++- > > > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > > > index cb840676d3..8531d13492 100644 > > > > --- a/hw/ppc/spapr.c > > > > +++ b/hw/ppc/spapr.c > > > > @@ -143,7 +143,12 @@ static bool pre_2_10_vmstate_dummy_icp_needed(void > > > > *opaque) > > > > } > > > > > > > > static const VMStateDescription pre_2_10_vmstate_dummy_icp = { > > > > - .name = "icp/server", > > > > + /* > > > > + * Hack ahead. We can't have two devices with the same name and > > > > + * instance id. So I rename this to pass make check. > > > > + * Real help from people who knows the hardware is needed. > > > > + */ > > > > + .name = "pre-2.10-icp/server", > > > > .version_id = 1, > > > > .minimum_version_id = 1, > > > > .needed = pre_2_10_vmstate_dummy_icp_needed, > > > > > > I guess this fix is acceptable as well and a lot simpler than > > > reverting the hack actually. Outcome is the same : drop > > > compat with pseries-2.9 and older. > > > > > > Reviewed-by: Greg Kurz <gr...@kaod.org> > > > > So the reason we can't have duplicate names registered, aside from it > > surely going bad if we actually send or receive a stream at the point > > they are registered, is the duplcate check introduced in patch 9? But > > before that, this hack does seem to actually work because the duplicate > > is unregistered right away. > > > > Correct. > > > If I understand the workaround, there is an asymmetry in the migration > > sequence in that receiving an unexpected object would cause a failure, > > but going from newer to older would just skip some "expected" objects > > and that didn't cause a problem. So you only have to deal with ignoring > > the unexpected ones going form older to newer. > > > > Correct. > > > Side question, is it possible to flag the problem of *not* receiving > > an object that you did expect? That might be a source of bugs too. > > > > AFAICR we try to only migrate state that differs from reset : the > destination cannot really assume it will receive anything for a > given device.
That's true, I guess you could always add some flag yourself if you certainly need something. > > > Anyway, I wonder if we could fix this spapr problem by adding a special > > case wild card instance matcher to ignore it? It's still a bit hacky > > but maybe a bit nicer. I don't mind deprecating the machine soon if > > you want to clear the wildcard hack away soon, but it would be nice to > > separate the deprecation and removal from the fix, if possible. > > > > This patch is not tested but hopefully helps illustrate the idea. > > > > I'm not sure this will fly with older QEMUs that don't know about > VMSTATE_INSTANCE_ID_WILD... but I'll let Juan comment on that. You could be right about that. He gave a simpler solution now anyway. Thanks, Nick