On Wed, Jan 20, 2021 at 08:23:05PM -0300, Daniel Henrique Barboza wrote:
> In the CPU hotunplug bug [1] the guest kernel throws a scary
> message in dmesg:
> 
> pseries-hotplug-cpu: Failed to offline CPU <NULL>, rc: -16
> 
> The reason isn't related to the bug though. This happens because the
> kernel file arch/powerpc/platforms/pseries/hotplug-cpu.c, function
> dlpar_cpu_remove(), is not finding the device_node.name of the offending
> CPU.
> 
> We're not populating the 'name' property for hotplugged CPUs. Since the
> kernel relies on device_node.name for identifying CPU nodes, and the
> CPUs that are coldplugged have the 'name' property filled by SLOF, this
> is creating an unneeded inconsistency between hotplug and coldplug CPUs
> in the kernel.
> 
> Let's fill the 'name' property for hotplugged CPUs as well. This will
> make the guest dmesg throw a less intimidating message when we try to
> unplug the last online CPU:
> 
> pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9@1, rc: -16
> 
> [1] https://bugzilla.redhat.com/1911414
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb...@gmail.com>

Nice catch.  Because the PAPR code has an odd mix of flattened-tree
pieces (where 'name' is implicit) and real OF pieces (where we
definitely need it), getting this right is kind of fiddly.  Since this
bit of flat tree gets encoded into PAPR's CAS update format, which
does require the 'name' property, this is correct.

You could argue that it's more technically correct for the flattening
code to add the name property from the FDT node name, but this is
simpler and gets the job done.
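
For illustration only, here is a minimal sketch of what "add the name
property from the FDT node name" could look like as a standalone helper.
The function name_from_nodename() is hypothetical and is not part of QEMU
or libfdt. It assumes the Open Firmware convention that 'name' is the node
name with the unit address ("@...") dropped, whereas the patch below
passes nodename through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not QEMU or libfdt API): derive a 'name'
 * property value from a flat-tree node name by dropping the unit
 * address, i.e. everything from the first '@' onwards.
 *
 * Returns the length written (excluding the trailing NUL), or -1 if
 * the output buffer is too small.
 */
static int name_from_nodename(const char *nodename, char *out, size_t outlen)
{
    const char *at;
    size_t len;

    assert(nodename != NULL && out != NULL);

    /* A node without a unit address keeps its full name. */
    at = strchr(nodename, '@');
    len = at ? (size_t)(at - nodename) : strlen(nodename);

    if (len + 1 > outlen) {
        return -1;
    }

    memcpy(out, nodename, len);
    out[len] = '\0';
    return (int)len;
}
```

A flattening pass could call such a helper once per node and emit the
result as the 'name' property, so callers like spapr_core_dt_populate()
would not need to set it by hand. Whether that is worth the extra code is
the trade-off mentioned above.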

Applied, thanks.

> ---
>  hw/ppc/spapr.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cc1b709615..6ab27ea269 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3750,6 +3750,19 @@ int spapr_core_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
>  
>      spapr_dt_cpu(cs, fdt, offset, spapr);
>  
> +    /*
> +     * spapr_dt_cpu() does not fill the 'name' property in the
> +     * CPU node. The function is called during boot process, before
> +     * and after CAS, and overwriting the 'name' property written
> +     * by SLOF is not allowed.
> +     *
> +     * Write it manually after spapr_dt_cpu(). This makes the hotplug
> +     * CPUs more compatible with the coldplugged ones, which have
> +     * the 'name' property. Linux Kernel also relies on this
> +     * property to identify CPU nodes.
> +     */
> +    _FDT((fdt_setprop_string(fdt, offset, "name", nodename)));
> +
>      *fdt_start_offset = offset;
>      return 0;
>  }

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
