On 02/03/19 03:44, Fenghua Yu wrote:
> cpu_caps_cleared and cpu_caps_set may not be aligned to unsigned long.
> Atomic operations (e.g. set_bit() and clear_bit()) on the bitmaps may
> access two cache lines (a.k.a. a split lock) and lock the bus to block all
> memory accesses from other processors to ensure atomicity.
> 
> To avoid the overall performance degradation from the bus locking, align
> the two variables to unsigned long.
> 
> Defining the variables as unsigned long would also fix the issue, since
> unsigned long variables are naturally aligned, but that needs additional
> code changes. Adding __aligned(sizeof(unsigned long)) is the simpler fix.
> 
> Signed-off-by: Fenghua Yu <fenghua...@intel.com>
> ---
>  arch/x86/kernel/cpu/common.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index cb28e98a0659..51ab37ba5f64 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -488,8 +488,9 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
>       return NULL;            /* Not found */
>  }
>  
> -__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS];
> -__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
> +/* Unsigned long alignment to avoid split lock in atomic bitmap ops */
> +__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
> +__u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
>  
>  void load_percpu_segment(int cpu)
>  {
> 
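
For reference, a sketch of the access pattern the commit message is
describing. The caller below is illustrative only (the real users are the
setup_force_cpu_cap()/setup_clear_cpu_cap() paths); the point is that
set_bit()/clear_bit() on x86-64 issue an 8-byte locked read-modify-write,
while the arrays are only guaranteed 4-byte alignment as plain __u32:

#include <linux/bitops.h>
#include <asm/cpufeature.h>

extern __u32 cpu_caps_set[NCAPINTS + NBUGINTS];

/* Illustrative only; not an actual kernel helper. */
static void example_force_cap(unsigned int feature)
{
	/*
	 * set_bit() treats the array as unsigned long words.  If
	 * cpu_caps_set happens to start 4 bytes before a 64-byte
	 * cache-line boundary, the 8-byte locked access can straddle
	 * two cache lines -- a split lock, which escalates the
	 * cache-line lock to a full bus lock.
	 */
	set_bit(feature, (unsigned long *)cpu_caps_set);
}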

(resending including the list)

Why not instead change set_bit/clear_bit to use btsl/btrl instead of
btsq/btrq?

Thanks,

Paolo
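
For illustration, a minimal sketch of the kind of 32-bit locked bit
operation Paolo is alluding to. set_bit32() here is a hypothetical helper,
not an existing kernel API: because btsl performs a 4-byte-wide locked
access on a naturally aligned u32, it can never cross a cache-line
boundary, so no split lock occurs regardless of whether the array is
8-byte aligned.

#include <linux/types.h>

/* Hypothetical 32-bit atomic set_bit; not an existing kernel helper. */
static inline void set_bit32(int nr, volatile u32 *addr)
{
	/*
	 * "btsl" issues a 32-bit locked read-modify-write on
	 * addr[nr / 32], a naturally aligned u32 that always fits
	 * within a single cache line.
	 */
	asm volatile("lock; btsl %1, %0"
		     : "+m" (addr[nr / 32])
		     : "Ir" (nr & 31)
		     : "memory", "cc");
}

The trade-off is that switching to 32-bit bitops would touch the common
set_bit()/clear_bit() implementation or require dedicated helpers, whereas
the alignment annotation is confined to the two array definitions.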
