cpu_caps_cleared and cpu_caps_set are declared as __u32 arrays, so they may not be aligned to unsigned long. Atomic operations (i.e. set_bit() and clear_bit()) on these bitmaps may then access two cache lines (a.k.a. split lock), which locks the bus and blocks all memory accesses from other processors to ensure atomicity.
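To make the split-lock condition concrete, here is a minimal userspace sketch of the address arithmetic. It assumes a 64-byte cache line and an 8-byte unsigned long (typical for x86-64); CACHE_LINE_SIZE and crosses_cache_line() are hypothetical names used only for illustration and are not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on typical x86 processors. */
#define CACHE_LINE_SIZE 64u

/*
 * Return 1 if an access of `size` bytes starting at `addr` touches two
 * cache lines, 0 if it stays within one. Hypothetical helper.
 */
int crosses_cache_line(uintptr_t addr, size_t size)
{
	uintptr_t first_line = addr / CACHE_LINE_SIZE;
	uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;

	return first_line != last_line;
}
```

With these assumptions, an 8-byte access at 0x103c (4-byte aligned only) covers the last four bytes of one line and the first four of the next, so it crosses. The same access at 0x1038 (8-byte aligned) stays inside one line, and so does any other 8-byte-aligned address, because 64 is a multiple of 8.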
To avoid the overall performance degradation from bus locking, align the two variables to unsigned long. Defining the variables as unsigned long would also fix the issue, because they would then be naturally aligned to unsigned long, but that needs additional code changes. Adding __aligned(sizeof(unsigned long)) is a simpler fix.

Signed-off-by: Fenghua Yu <fenghua...@intel.com>
---
 arch/x86/kernel/cpu/common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cb28e98a0659..51ab37ba5f64 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -488,8 +488,9 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
 	return NULL;		/* Not found */
 }
 
-__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS];
-__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
+/* Unsigned long alignment to avoid split lock in atomic bitmap ops */
+__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
+__u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
 
 void load_percpu_segment(int cpu)
 {
-- 
2.7.4
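For readers who want to try the alignment attribute outside the kernel, the following is a rough userspace sketch of the same declaration pattern, assuming GCC or Clang. demo_aligned(), DEMO_NWORDS, the demo_caps_* arrays and is_long_aligned() are hypothetical stand-ins; in particular, DEMO_NWORDS is an arbitrary value and not the real NCAPINTS + NBUGINTS.

```c
#include <stdint.h>

/* Hypothetical stand-in for NCAPINTS + NBUGINTS; the value is arbitrary. */
#define DEMO_NWORDS 20

/* Userspace equivalent of the kernel's __aligned(x) (GCC/Clang only). */
#define demo_aligned(x) __attribute__((aligned(x)))

/* 32-bit element type, but the array base is aligned to unsigned long. */
uint32_t demo_caps_cleared[DEMO_NWORDS] demo_aligned(sizeof(unsigned long));
uint32_t demo_caps_set[DEMO_NWORDS] demo_aligned(sizeof(unsigned long));

/* Return 1 if p is aligned to sizeof(unsigned long), 0 otherwise. */
int is_long_aligned(const void *p)
{
	return ((uintptr_t)p % sizeof(unsigned long)) == 0;
}
```

Because the array base is aligned, every unsigned long sized word within the array is aligned as well, which is what atomic bitmap operations on these arrays rely on.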