One of the reasons dlpar_offline_cpu() can fail is an attempt to
offline the last online CPU in the system. This can be observed in a
pseries QEMU guest that has hotplugged CPUs: if the user offlines all
other CPUs of the guest, so that a hotplugged CPU is now the last
online CPU, trying to reclaim it will fail. See [1] for an example.

In this situation the current error message only reports rc as -EBUSY
with a generic explanation, e.g.:

pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16

EBUSY can be caused by other conditions, such as cpu_hotplug_disabled
being set. Printing a more specific error message for this case,
instead of just "Failed to offline CPU", makes it clear that this is a
known error condition rather than some other generic or unknown cause.

This patch adds a 'last online' check in dlpar_offline_cpu() to catch
the 'last online CPU' offline error, returning a more informative error
message:

pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9
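
For illustration only, the control flow of the new check can be modeled
in userspace roughly as below. This is a sketch, not kernel code: every
name in it (online[], lock_depth, maps_update_begin(),
maps_update_done(), count_online(), model_device_offline(),
model_offline_cpu()) is a hypothetical stand-in, and the lock is
reduced to a depth counter so that the balance on each exit path can be
asserted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for kernel state; not the real kernel API. */
static bool online[NR_CPUS] = { true, true, false, false };
static int lock_depth;	/* models holding cpu_add_remove_lock */

static void maps_update_begin(void) { lock_depth++; }
static void maps_update_done(void)  { lock_depth--; }

static int count_online(void)
{
	int n = 0;

	for (int i = 0; i < NR_CPUS; i++)
		n += online[i];
	return n;
}

/* Models device_offline(): generic -EBUSY if this is the last CPU. */
static int model_device_offline(int cpu)
{
	if (count_online() == 1)
		return -EBUSY;
	online[cpu] = false;
	return 0;
}

/* Models the patched flow for a single CPU thread. */
static int model_offline_cpu(int cpu)
{
	int rc = 0;

	maps_update_begin();
	if (!online[cpu])
		goto out_unlock;

	/* Early check, done while the lock is still held. */
	if (count_online() == 1) {
		fprintf(stderr, "Unable to remove last online CPU %d\n", cpu);
		rc = -EBUSY;
		goto out_unlock;
	}

	/* The lock is dropped around the actual offline operation. */
	maps_update_done();
	rc = model_device_offline(cpu);
	if (rc)
		goto out;
	maps_update_begin();

out_unlock:
	maps_update_done();
out:
	/* Every exit path must leave the lock released. */
	assert(lock_depth == 0);
	return rc;
}
```

In this model, with two CPUs online, offlining one of them succeeds,
and a second request for the remaining CPU takes the early-check path
and returns -EBUSY with the specific message. The out_unlock label is
needed because the early exit happens with the lock held, unlike the
existing 'goto out' path, which runs after the lock has been dropped.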

[1] https://bugzilla.redhat.com/1911414

Signed-off-by: Daniel Henrique Barboza <danielhb...@gmail.com>
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 12cbffd3c2e3..3ac7e904385c 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -271,6 +271,18 @@ static int dlpar_offline_cpu(struct device_node *dn)
                        if (!cpu_online(cpu))
                                break;
 
+                       /* device_offline() will return -EBUSY (via cpu_down())
+                        * if there is only one CPU left. Check it here to fail
+                        * earlier and with a more informative error message,
+                        * while also retaining the cpu_add_remove_lock to be sure
+                        * that no CPUs are being online/offlined during this
+                        * check. */
+                       if (num_online_cpus() == 1) {
+                               pr_warn("Unable to remove last online CPU %pOFn\n", dn);
+                               rc = -EBUSY;
+                               goto out_unlock;
+                       }
+
                        cpu_maps_update_done();
                        rc = device_offline(get_cpu_device(cpu));
                        if (rc)
@@ -283,6 +295,7 @@ static int dlpar_offline_cpu(struct device_node *dn)
                                thread);
                }
        }
+out_unlock:
        cpu_maps_update_done();
 
 out:
-- 
2.30.2
