On Fri, Apr 20, 2018 at 05:39:42PM +0200, Greg Kurz wrote: > On Fri, 20 Apr 2018 11:15:01 +0200 > Greg Kurz <gr...@kaod.org> wrote: > > > On Fri, 20 Apr 2018 16:34:37 +1000 > > David Gibson <da...@gibson.dropbear.id.au> wrote: > > > > > On Thu, Apr 19, 2018 at 03:48:23PM +0200, Greg Kurz wrote: > > > > On Tue, 17 Apr 2018 17:17:13 +1000 > > > > David Gibson <da...@gibson.dropbear.id.au> wrote: > > > > > > > > > af81cf323c1 "spapr: CPU hotplug support" added a direct call to > > > > > spapr_cpu_reset() in spapr_cpu_init(), as well as registering it as a > > > > > reset callback. That was in order to make sure that the reset > > > > > function > > > > > got called for a newly hotplugged cpu, which would miss the global > > > > > machine > > > > > reset. > > > > > > > > > > However, this change means that spapr_cpu_reset() gets called twice > > > > > for > > > > > normal cold-plugged cpus: once from spapr_cpu_init(), then again > > > > > during > > > > > the system reset. As well as being ugly in its redundancy, the first > > > > > call > > > > > happens before the machine reset calls have happened, which will cause > > > > > problems for some things we're going to want to add. > > > > > > > > > > So, we remove the reset call from spapr_cpu_init(). We instead put an > > > > > explicit reset call in the hotplug specific path. > > > > > > > > > > Signed-off-by: David Gibson <da...@gibson.dropbear.id.au> > > > > > --- > > > > > > > > I had sent a tentative patch to do something similar earlier this year: > > > > > > > > https://patchwork.ozlabs.org/patch/862116/ > > > > > > > > but it got nacked for several reasons, one of them being you were > > > > "always wary of using the hotplugged parameter, because what qemu > > > > means by it often doesn't line up with what PAPR means by it." > > > > > > Yeah, I was and am wary of that, but convinced myself it was correct > > > in this case (which doesn't really interact with the PAPR meaning of > > > hotplug). > > > > > > > > hw/ppc/spapr.c | 6 ++++-- > > > > > hw/ppc/spapr_cpu_core.c | 13 ++++++++++++- > > > > > include/hw/ppc/spapr_cpu_core.h | 2 ++ > > > > > 3 files changed, 18 insertions(+), 3 deletions(-) > > > > > > > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > > > > index 7b2bc4e25d..81b50af3b5 100644 > > > > > --- a/hw/ppc/spapr.c > > > > > +++ b/hw/ppc/spapr.c > > > > > @@ -3370,9 +3370,11 @@ static void spapr_core_plug(HotplugHandler > > > > > *hotplug_dev, DeviceState *dev, > > > > > > > > > > if (hotplugged) { > > > > > > > > ... but you rely on it here. Can you explain why it is > > > > okay now ? > > > > > > So the value I actually need here is "wasn't present at the last > > > system reset" (with false positives being mostly harmless, but not > > > false negatives). > > > > > > > Hmm... It is rather the other way around, sth like "will be caught > > by the initial machine reset". > > > > > > Also, if QEMU is started with -incoming and the CPU core > > > > is hotplugged before migration begins, the following will > > > > return false: > > > > > > > > static inline bool spapr_drc_hotplugged(DeviceState *dev) > > > > { > > > > return dev->hotplugged && !runstate_check(RUN_STATE_INMIGRATE); > > > > } > > > > > > > > and the CPU core won't be reset. > > > > > > Uh... spapr_dtc_hotplugged() would definitely be wrong here, which is > > > why I'm not using it. > > > > > > > This is how hotplugged is set in spapr_core_plug(): > > > > bool hotplugged = spapr_drc_hotplugged(dev); > > > > but to detect the "will be caught by the initial machine reset" condition, > > we only need to check dev->hotplugged actually. > > > > > > > > > > > /* > > > > > - * Send hotplug notification interrupt to the guest only > > > > > - * in case of hotplugged CPUs. > > > > > + * For hotplugged CPUs, we need to reset them (they > > > > > missed > > > > > + * out on the system reset), and send the guest a > > > > > + * notification > > > > > */ > > > > > + spapr_cpu_core_reset(core); > > > > > > > > spapr_cpu_reset() also sets the compat mode, which is used > > > > to set some properties in the DT, ie, this should be called > > > > before spapr_populate_hotplug_cpu_dt(). > > > > > > Good point. I've moved the reset to fix that. > > > > > Thinking of it again: since cold-plugged devices reach this before machine > reset, we would then attach to the DRC a DT blob based on a non-reset CPU :-\ > > Instead of registering a reset handler for each individual CPUs, maybe > we should rather register it a the CPU core level. The handler would > first reset all CPUs in the core and then setup the DRC for new cores only, > like it is done currently in spapr_core_plug(). > > spapr_core_plug() would then just need to register the reset handler, > and call it only for hotplugged cores.
Handling the resets via the core level might be a good idea anyway, but I don't think it can address the problem we're hitting here. I've investigated further and I'm pretty sure we can't fix this without generic code changes. cpu_common_realizefn() (which is called by our ppc cpu realize hook via the parent_realize chain) contains this: if (dev->hotplugged) { cpu_synchronize_post_init(cpu); cpu_resume(cpu); } So, as soon as the hotplugged cpu is realized, it's running, which means by the time we call the plug() hotplug handler we're too late to do any reset initialization. I think there are two ways to look at this: 1) The reset handlers are specifically about *system* reset, not device reset, and so we shoudln't really expect them to be called for hotplugged devices. If we want to share reset initialization with "initial" initialization, we should explicitly call the reset handler from the (realize time) init code.. which is what we do now. 2) Common core realize should _not_ activate the cpu. Instead that should be the plug() handler's job. This would require changing the x86 cpu plug handler (and whatever else) to kick off the cpu after realization. For now I'm inclined to just let it stay at (1). The problem I had which I thought required this, doesn't after all. I came up with a different solution that involves moving the spapr caps initialization earlier, instead of the cpu reset later. That turned out to be substantially easier than I first thought, and regardless of what we do above long term, I think it's a better way to handle the caps. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature