Re: [PATCH] spapr: Don't request to unplug the same core twice

2019-10-29 Thread David Gibson
On Thu, Oct 24, 2019 at 08:28:54AM +0200, Greg Kurz wrote:
> On Thu, 24 Oct 2019 09:38:17 +1100
> David Gibson  wrote:
> 
> > On Wed, Oct 23, 2019 at 09:17:40PM +0200, Greg Kurz wrote:
> > > We must not call spapr_drc_detach() on a detached DRC otherwise bad things
> > > can happen, ie. QEMU hangs or crashes. This is easily demonstrated with
> > > a CPU hotplug/unplug loop using QMP.
> > > 
> > > Signed-off-by: Greg Kurz 
> > 
> > Ouch, good catch.  Applied.
> > 
> > I wonder if we have the same problem with other DRC types.
> > 
> 
> We don't have it with PHB and PCI types, through the same use of
> spapr_drc_unplug_requested().
> 
> LMBs see to avoid it by failing device_del early if an unplug
> request is already in progress:
> 
> /*
>  * An existing pending dimm state for this DIMM means that there is an
>  * unplug operation in progress, waiting for the spapr_lmb_release
>  * callback to complete the job (BQL can't cover that far). In this case,
>  * bail out to avoid detaching DRCs that were already released.
>  */
> if (spapr_pending_dimm_unplugs_find(spapr, dimm)) {
> error_setg(_err,
>"Memory unplug already in progress for device %s",
>dev->id);
> goto out;
> }
> 
> Not sure why we error out in this case instead of ignoring the unplug 
> request.

I suspect no particularly good reason, just history.  Everything's a
bit  different with LMBs because a single device_del will usually
remove a whole bunch of LMB DRCs.  In general the interfacing between
the qemu user side DIMM handling and the PAPR side LMB/DRC handling
is... pretty clunky.

> 
> > > ---
> > >  hw/ppc/spapr.c |7 ---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index f9410d390a07..94f9d27096af 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -3741,9 +3741,10 @@ void spapr_core_unplug_request(HotplugHandler 
> > > *hotplug_dev, DeviceState *dev,
> > >spapr_vcpu_id(spapr, cc->core_id));
> > >  g_assert(drc);
> > >  
> > > -spapr_drc_detach(drc);
> > > -
> > > -spapr_hotplug_req_remove_by_index(drc);
> > > +if (!spapr_drc_unplug_requested(drc)) {
> > > +spapr_drc_detach(drc);
> > > +spapr_hotplug_req_remove_by_index(drc);
> > > +}
> > >  }
> > >  
> > >  int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
> > > 
> > 
> 



-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH] spapr: Don't request to unplug the same core twice

2019-10-24 Thread Greg Kurz
On Thu, 24 Oct 2019 09:38:17 +1100
David Gibson  wrote:

> On Wed, Oct 23, 2019 at 09:17:40PM +0200, Greg Kurz wrote:
> > We must not call spapr_drc_detach() on a detached DRC otherwise bad things
> > can happen, ie. QEMU hangs or crashes. This is easily demonstrated with
> > a CPU hotplug/unplug loop using QMP.
> > 
> > Signed-off-by: Greg Kurz 
> 
> Ouch, good catch.  Applied.
> 
> I wonder if we have the same problem with other DRC types.
> 

We don't have it with PHB and PCI types, through the same use of
spapr_drc_unplug_requested().

LMBs see to avoid it by failing device_del early if an unplug
request is already in progress:

/*
 * An existing pending dimm state for this DIMM means that there is an
 * unplug operation in progress, waiting for the spapr_lmb_release
 * callback to complete the job (BQL can't cover that far). In this case,
 * bail out to avoid detaching DRCs that were already released.
 */
if (spapr_pending_dimm_unplugs_find(spapr, dimm)) {
error_setg(_err,
   "Memory unplug already in progress for device %s",
   dev->id);
goto out;
}

Not sure why we error out in this case instead of ignoring the unplug 
request.

> > ---
> >  hw/ppc/spapr.c |7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index f9410d390a07..94f9d27096af 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -3741,9 +3741,10 @@ void spapr_core_unplug_request(HotplugHandler 
> > *hotplug_dev, DeviceState *dev,
> >spapr_vcpu_id(spapr, cc->core_id));
> >  g_assert(drc);
> >  
> > -spapr_drc_detach(drc);
> > -
> > -spapr_hotplug_req_remove_by_index(drc);
> > +if (!spapr_drc_unplug_requested(drc)) {
> > +spapr_drc_detach(drc);
> > +spapr_hotplug_req_remove_by_index(drc);
> > +}
> >  }
> >  
> >  int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
> > 
> 



pgpSydqiYcuTR.pgp
Description: OpenPGP digital signature


Re: [PATCH] spapr: Don't request to unplug the same core twice

2019-10-23 Thread David Gibson
On Wed, Oct 23, 2019 at 09:17:40PM +0200, Greg Kurz wrote:
> We must not call spapr_drc_detach() on a detached DRC otherwise bad things
> can happen, ie. QEMU hangs or crashes. This is easily demonstrated with
> a CPU hotplug/unplug loop using QMP.
> 
> Signed-off-by: Greg Kurz 

Ouch, good catch.  Applied.

I wonder if we have the same problem with other DRC types.

> ---
>  hw/ppc/spapr.c |7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f9410d390a07..94f9d27096af 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3741,9 +3741,10 @@ void spapr_core_unplug_request(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>spapr_vcpu_id(spapr, cc->core_id));
>  g_assert(drc);
>  
> -spapr_drc_detach(drc);
> -
> -spapr_hotplug_req_remove_by_index(drc);
> +if (!spapr_drc_unplug_requested(drc)) {
> +spapr_drc_detach(drc);
> +spapr_hotplug_req_remove_by_index(drc);
> +}
>  }
>  
>  int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[PATCH] spapr: Don't request to unplug the same core twice

2019-10-23 Thread Greg Kurz
We must not call spapr_drc_detach() on a detached DRC otherwise bad things
can happen, ie. QEMU hangs or crashes. This is easily demonstrated with
a CPU hotplug/unplug loop using QMP.

Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f9410d390a07..94f9d27096af 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3741,9 +3741,10 @@ void spapr_core_unplug_request(HotplugHandler 
*hotplug_dev, DeviceState *dev,
   spapr_vcpu_id(spapr, cc->core_id));
 g_assert(drc);
 
-spapr_drc_detach(drc);
-
-spapr_hotplug_req_remove_by_index(drc);
+if (!spapr_drc_unplug_requested(drc)) {
+spapr_drc_detach(drc);
+spapr_hotplug_req_remove_by_index(drc);
+}
 }
 
 int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,