On Tue, May 10, 2016 at 06:53:18PM +0200, Borislav Petkov wrote:
>  static __always_inline unsigned int __arch_hweight32(unsigned int w)
>  {
> -     unsigned int res = 0;
> +     unsigned int res;
>  
> -     asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
> -                  : "="REG_OUT (res)
> -                  : REG_IN (w));
> +     if (likely(static_cpu_has(X86_FEATURE_POPCNT))) {
> +             /* popcnt %eax, %eax */
> +             asm volatile(POPCNT32
> +                             : "="REG_OUT (res)
> +                             : REG_IN (w));
>  
> -     return res;
> +             return res;
> +     }
> +     return __sw_hweight32(w);
>  }

So what was wrong with using the normal thunk_*.S wrappers for the
calls? That would allow you to keep using the alternative() stuff,
which does generate smaller code.
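
Roughly what I have in mind, purely as a sketch: the __sw_hweight32_thunk
name below is made up, it would be a thunk_*.S-style wrapper that
pushes/pops the registers the C fallback clobbers, so the call site
stays a single alternative insn like in the pre-patch code:

	static __always_inline unsigned int __arch_hweight32(unsigned int w)
	{
		unsigned int res;

		/*
		 * Hypothetical thunk_*.S-style wrapper around
		 * __sw_hweight32: it saves/restores the call-clobbered
		 * registers itself, so the only register the compiler
		 * has to give up here is the single in/out one.
		 */
		asm (ALTERNATIVE("call __sw_hweight32_thunk", POPCNT32,
				 X86_FEATURE_POPCNT)
			     : "="REG_OUT (res)
			     : REG_IN (w));

		return res;
	}

That keeps the inline part small instead of growing a static_cpu_has()
branch at every call site.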
