On Fri, Jun 15, 2018 at 10:02:25AM +1000, David Gibson wrote:
> On Thu, Jun 14, 2018 at 11:50:42PM +0200, Greg Kurz wrote:
> > The spapr_realize_vcpu() function doesn't roll back in case of error.
> > This isn't a problem with coldplugged CPUs because the machine won't
> > start and QEMU will exit. Hotplug is a different story though: the
> > CPU thread is started under object_property_set_bool() and it assumes
> > it can access the CPU object.
> > 
> > If icp_create() fails, we return an error without unregistering the
> > reset handler for this CPU, and we leave the underlying QEMU thread for
> > this CPU alive. Since spapr_cpu_core_realize() doesn't unrealize the
> > already realized CPUs either, but happily frees all of them anyway, the
> > CPU thread crashes instantly:
> > 
> > (qemu) device_add host-spapr-cpu-core,core-id=1,id=gku
> > GKU: failing icp_create (cpu 0x11497fd0)
> >                              ^^^^^^^^^^
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fffee3feaa0 (LWP 24725)]
> > 0x00000000104c8374 in object_dynamic_cast_assert (obj=0x11497fd0,
> >                                                   ^^^^^^^^^^^^^^
> >                                              pointer to the CPU object
> > 623         trace_object_dynamic_cast_assert(obj ? obj->class->type->name
> > (gdb) p obj->class->type
> > $1 = (Type) 0x0
> > (gdb) p * obj
> > $2 = {class = 0x10ea9c10, free = 0x11244620,
> >                                  ^^^^^^^^^^
> >                               should be g_free
> > (gdb) p g_free
> > $3 = {<text variable, no debug info>} 0x7ffff282bef0 <g_free>
> > 
> > obj is a dangling pointer to the CPU that was just destroyed in
> > spapr_cpu_core_realize().
> > 
> > This patch adds proper rollback to both spapr_realize_vcpu() and
> > spapr_cpu_core_realize().
> > 
> > Signed-off-by: Greg Kurz <gr...@kaod.org>
> 
> Applied to ppc-for-3.0, since it definitely looks to fix some
> problems.

Uh.. actually it has a definite bug - the first exit point will call
g_free() on an uninitialized spapr_cpu.  I fixed it up with a NULL
initialization in my tree.
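
For anyone following along, here is a minimal standalone sketch of the
pattern at issue, not the actual QEMU code. The struct, the function name
and the fail_early/fail_late knobs are hypothetical, and plain free()
stands in for g_free() (both accept NULL as a no-op). The point is only
that a pointer released at a shared error label has to start out NULL if
any exit can reach that label before the allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-CPU state; not the QEMU type. */
struct vcpu_state {
    int icp;
};

/*
 * Returns 0 on success, -1 on failure. fail_early and fail_late are
 * hypothetical knobs that simulate a failure before and after the
 * allocation respectively.
 */
static int realize_sketch(bool fail_early, bool fail_late,
                          struct vcpu_state **out)
{
    /* NULL so that the early exit can reach the shared label safely. */
    struct vcpu_state *st = NULL;

    assert(out != NULL);

    if (fail_early) {
        goto error;             /* st is still NULL here */
    }

    st = malloc(sizeof(*st));
    if (!st) {
        goto error;
    }

    if (fail_late) {
        goto error;             /* st is allocated and must be released */
    }

    st->icp = 1;
    *out = st;
    return 0;

error:
    free(st);                   /* free(NULL) is a no-op */
    *out = NULL;
    return -1;
}
```

Without the "= NULL", the fail_early path would hand an indeterminate
pointer to free().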

> 
> > ---
> >  hw/ppc/spapr_cpu_core.c |   12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> > index 003c4c5a79d2..04c818a6ecac 100644
> > --- a/hw/ppc/spapr_cpu_core.c
> > +++ b/hw/ppc/spapr_cpu_core.c
> > @@ -159,12 +159,16 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> >      spapr_cpu->icp = icp_create(OBJECT(cpu), spapr->icp_type,
> >                                  XICS_FABRIC(spapr), &local_err);
> >      if (local_err) {
> > -        goto error;
> > +        goto error_unregister;
> >      }
> >  
> >      return;
> >  
> > +error_unregister:
> > +    qemu_unregister_reset(spapr_cpu_reset, cpu);
> > +    cpu_remove_sync(CPU(cpu));
> 
> I'm a little unclear on exactly what init the cpu_remove_sync() is
> mirroring, though.
> 
> >  error:
> > +    g_free(spapr_cpu);
> >      error_propagate(errp, local_err);
> >  }
> >  
> > @@ -222,11 +226,15 @@ static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
> >      for (j = 0; j < cc->nr_threads; j++) {
> >          spapr_realize_vcpu(sc->threads[j], spapr, &local_err);
> >          if (local_err) {
> > -            goto err;
> > +            goto err_unrealize;
> >          }
> >      }
> >      return;
> >  
> > +err_unrealize:
> > +    while (--j >= 0) {
> > +        spapr_unrealize_vcpu(sc->threads[j]);
> > +    }
> >  err:
> >      while (--i >= 0) {
> >          obj = OBJECT(sc->threads[i]);
> > 
> 
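
As a rough illustration of the two-stage unwind in the second hunk, here
is a standalone sketch with made-up types. thread_slot, NR_THREADS and
the fail_at knob are all hypothetical and nothing here is QEMU API. Each
unwind loop indexes with its own counter: j walks back over the slots
that were realized, then i walks back over everything that was created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_THREADS 4

/* Hypothetical per-thread bookkeeping, standing in for sc->threads[]. */
struct thread_slot {
    bool created;
    bool realized;
};

/*
 * Create all slots, then realize them one by one. fail_at is a
 * hypothetical knob: realizing slot fail_at fails (negative = never).
 * Returns 0 on success, -1 after a full rollback.
 */
static int core_realize_sketch(struct thread_slot *slots, int fail_at)
{
    int i, j;

    assert(slots != NULL);

    for (i = 0; i < NR_THREADS; i++) {
        slots[i].created = true;
    }

    for (j = 0; j < NR_THREADS; j++) {
        if (j == fail_at) {
            goto err_unrealize;
        }
        slots[j].realized = true;
    }
    return 0;

err_unrealize:
    /* Undo only the slots realized so far: indices j-1 down to 0. */
    while (--j >= 0) {
        slots[j].realized = false;
    }
    /* Then tear down every created slot: indices i-1 down to 0. */
    while (--i >= 0) {
        slots[i].created = false;
    }
    return -1;
}
```

After the first loop completes, i equals NR_THREADS, so it is only usable
as an index once the second unwind loop has decremented it.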



-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
