libcfs CPU partitions do not support CPU hotplug, but it is safe to
plug in a new CPU or to enable/disable hyper-threading.
The only potential risk is plugging out a CPU, because that may break
the CPU affinity of Lustre threads.

Currently libcfs prints a warning for every CPU notification. This
patch changes that behavior and only prints a warning when all HTs in
a CPU core are lost, which may have broken the affinity of Lustre
threads. All other notifications are logged at debug level only.
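
The rule described above can be modelled in a few lines of userspace C.
This is only an illustrative sketch, not the libcfs code: cpu_mask_t,
core_fully_offline() and should_warn() are hypothetical names, and a
plain 64-bit bitmask stands in for the kernel cpumask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model: bit N of a mask represents CPU N. */
typedef uint64_t cpu_mask_t;

/*
 * Return true when no hyper-thread sibling of a core remains online,
 * i.e. the whole core is gone and thread affinity may be broken.
 */
static bool core_fully_offline(cpu_mask_t siblings, cpu_mask_t online)
{
	return (siblings & online) == 0;
}

/* Warn only for a "dead" notification that takes out the whole core. */
static bool should_warn(bool cpu_dead, cpu_mask_t siblings, cpu_mask_t online)
{
	return cpu_dead && core_fully_offline(siblings, online);
}
```

With siblings 0x3 (CPUs 0 and 1), an online mask of 0x2 still leaves
CPU 1 running, so no warning is needed; an online mask of 0xC means
both siblings are gone and the warning applies.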

Signed-off-by: Liang Zhen <liang.z...@intel.com>
Reviewed-on: http://review.whamcloud.com/8770
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4454
Reviewed-by: Bobi Jam <bobi...@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dil...@intel.com>
Signed-off-by: Oleg Drokin <oleg.dro...@intel.com>
---
 .../staging/lustre/lustre/libcfs/linux/linux-cpu.c    | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-cpu.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-cpu.c
index 58bb256..77b1ef6 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-cpu.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-cpu.c
@@ -952,6 +952,7 @@ static int
 cfs_cpu_notify(struct notifier_block *self, unsigned long action, void *hcpu)
 {
        unsigned int  cpu = (unsigned long)hcpu;
+       bool         warn;
 
        switch (action) {
        case CPU_DEAD:
@@ -962,9 +963,21 @@ cfs_cpu_notify(struct notifier_block *self, unsigned long action, void *hcpu)
                cpt_data.cpt_version++;
                spin_unlock(&cpt_data.cpt_lock);
        default:
-               CWARN("Lustre: can't support CPU hotplug well now, "
-                     "performance and stability could be impacted"
-                     "[CPU %u notify: %lx]\n", cpu, action);
+               if (action != CPU_DEAD && action != CPU_DEAD_FROZEN) {
+                       CDEBUG(D_INFO, "CPU changed [cpu %u action %lx]\n",
+                              cpu, action);
+                       break;
+               }
+
+               down(&cpt_data.cpt_mutex);
+               /* if all HTs in a core are offline, it may break affinity */
+               cfs_cpu_ht_siblings(cpu, cpt_data.cpt_cpumask);
+               warn = any_online_cpu(*cpt_data.cpt_cpumask) >= nr_cpu_ids;
+               up(&cpt_data.cpt_mutex);
+               CDEBUG(warn ? D_WARNING : D_INFO,
+                      "Lustre: can't support CPU plug-out well now, "
+                      "performance and stability could be impacted "
+                      "[CPU %u action: %lx]\n", cpu, action);
        }
 
        return NOTIFY_OK;
-- 
1.8.5.3
