[PATCH] xen/smp: Speed up on_selected_cpus()

Andrew Cooper Fri, 04 Feb 2022 12:31:48 -0800

cpumask_weight() is a horribly expensive way to find if no bits are set, made
worse by the fact that the calculation is performed with the global call_lock
held.


Switch to using cpumask_empty() instead, which will short circuit as soon as
it find any set bit in the cpumask.

Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
CC: Jan Beulich <jbeul...@suse.com>
CC: Roger Pau Monné <roger....@citrix.com>
CC: Wei Liu <w...@xen.org>
CC: Juergen Gross <jgr...@suse.com>
CC: Stefano Stabellini <sstabell...@kernel.org>
CC: Julien Grall <jul...@xen.org>
CC: Volodymyr Babchuk <volodymyr_babc...@epam.com>
CC: Bertrand Marquis <bertrand.marq...@arm.com>

I have not done performance testing, but I would be surprised if this cannot
be measured on a busy or large box.
---
 xen/common/smp.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/xen/common/smp.c b/xen/common/smp.c
index 781bcf2c246c..a011f541f1ea 100644
--- a/xen/common/smp.c
+++ b/xen/common/smp.c
@@ -50,8 +50,6 @@ void on_selected_cpus(
     void *info,
     int wait)
 {
-    unsigned int nr_cpus;
-
     ASSERT(local_irq_is_enabled());
     ASSERT(cpumask_subset(selected, &cpu_online_map));
 
@@ -59,8 +57,7 @@ void on_selected_cpus(
 
     cpumask_copy(&call_data.selected, selected);
 
-    nr_cpus = cpumask_weight(&call_data.selected);
-    if ( nr_cpus == 0 )
+    if ( cpumask_empty(&call_data.selected) )
         goto out;
 
     call_data.func = func;
-- 
2.11.0

[PATCH] xen/smp: Speed up on_selected_cpus()

Reply via email to