cpumask_weight() is a horribly expensive way to find if no bits are set, made worse by the fact that the calculation is performed with the global call_lock held.
Switch to using cpumask_empty() instead, which will short-circuit as soon as it
finds any set bit in the cpumask.

Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
CC: Jan Beulich <jbeul...@suse.com>
CC: Roger Pau Monné <roger....@citrix.com>
CC: Wei Liu <w...@xen.org>
CC: Juergen Gross <jgr...@suse.com>
CC: Stefano Stabellini <sstabell...@kernel.org>
CC: Julien Grall <jul...@xen.org>
CC: Volodymyr Babchuk <volodymyr_babc...@epam.com>
CC: Bertrand Marquis <bertrand.marq...@arm.com>

I have not done performance testing, but I would be surprised if this cannot be
measured on a busy or large box.
---
 xen/common/smp.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/xen/common/smp.c b/xen/common/smp.c
index 781bcf2c246c..a011f541f1ea 100644
--- a/xen/common/smp.c
+++ b/xen/common/smp.c
@@ -50,8 +50,6 @@ void on_selected_cpus(
     void *info,
     int wait)
 {
-    unsigned int nr_cpus;
-
     ASSERT(local_irq_is_enabled());
     ASSERT(cpumask_subset(selected, &cpu_online_map));
 
@@ -59,8 +57,7 @@ void on_selected_cpus(
 
     cpumask_copy(&call_data.selected, selected);
 
-    nr_cpus = cpumask_weight(&call_data.selected);
-    if ( nr_cpus == 0 )
+    if ( cpumask_empty(&call_data.selected) )
         goto out;
 
     call_data.func = func;
-- 
2.11.0
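
For illustration only, the cost difference described in the commit message can
be sketched with a toy bitmap. This is not Xen's implementation: struct
toy_mask, toy_weight(), toy_empty(), TOY_NR_CPUS and the words_visited counter
are hypothetical names introduced here, and the sketch assumes a mask stored as
an array of unsigned long words.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define TOY_NR_CPUS       4096
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_NR_LONGS \
    ((TOY_NR_CPUS + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* Toy stand-in for a cpumask: one bit per CPU. */
struct toy_mask {
    unsigned long bits[TOY_NR_LONGS];
};

/*
 * Population count: every word must be examined, whatever the mask
 * contains.  *words_visited reports how many words were read.
 */
static inline unsigned int toy_weight(const struct toy_mask *m,
                                      size_t *words_visited)
{
    unsigned int count = 0;
    size_t i;

    assert(m != NULL && words_visited != NULL);

    *words_visited = 0;
    for ( i = 0; i < TOY_NR_LONGS; i++ )
    {
        unsigned long w = m->bits[i];

        (*words_visited)++;

        /* Clear the lowest set bit until the word is zero. */
        while ( w )
        {
            w &= w - 1;
            count++;
        }
    }

    return count;
}

/*
 * Emptiness test: stops at the first non-zero word, so a mask with a
 * low bit set is rejected after reading a single word.
 */
static inline bool toy_empty(const struct toy_mask *m,
                             size_t *words_visited)
{
    size_t i;

    assert(m != NULL && words_visited != NULL);

    *words_visited = 0;
    for ( i = 0; i < TOY_NR_LONGS; i++ )
    {
        (*words_visited)++;

        if ( m->bits[i] )
            return false;
    }

    return true;
}
```

With only bit 3 set, toy_empty() returns after reading one word, while
toy_weight() still reads all TOY_NR_LONGS words. Both read the whole mask when
it is genuinely empty, so the saving applies to the non-empty case.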