On May 10, 2016 10:23:13 AM PDT, Peter Zijlstra <pet...@infradead.org> wrote:
>On Tue, May 10, 2016 at 06:53:18PM +0200, Borislav Petkov wrote:
>>  static __always_inline unsigned int __arch_hweight32(unsigned int w)
>>  {
>> -    unsigned int res = 0;
>> +    unsigned int res;
>>  
>> -    asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
>> -                 : "="REG_OUT (res)
>> -                 : REG_IN (w));
>> +    if (likely(static_cpu_has(X86_FEATURE_POPCNT))) {
>> +            /* popcnt %eax, %eax */
>> +            asm volatile(POPCNT32
>> +                            : "="REG_OUT (res)
>> +                            : REG_IN (w));
>>  
>> -    return res;
>> +            return res;
>> +    }
>> +    return __sw_hweight32(w);
>>  }
>
>So what was wrong with using the normal thunk_*.S wrappers for the
>calls? That would allow you to use the alternative() stuff which does
>generate smaller code.

Also, to be fair... if the problem is with these being in C, then we could just
do it in assembly easily enough.
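
For anyone following along, the control flow of the quoted patch can be sketched in userspace C. This is only an illustration, not the kernel code: `have_popcnt` is a hypothetical stand-in for static_cpu_has(X86_FEATURE_POPCNT), `sw_hweight32()` and `hweight32()` are illustrative names, and the GCC/Clang builtin `__builtin_popcount()` stands in for the POPCNT32 inline asm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for static_cpu_has(X86_FEATURE_POPCNT). */
static bool have_popcnt = false;

/* Portable bit-parallel population count, playing the software fallback. */
static unsigned int sw_hweight32(uint32_t w)
{
    w -= (w >> 1) & 0x55555555u;                      /* 2-bit partial sums */
    w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit partial sums */
    w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit partial sums */
    return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Same shape as the patched __arch_hweight32(): fast path, else fallback. */
static unsigned int hweight32(uint32_t w)
{
    if (have_popcnt)
        return (unsigned int)__builtin_popcount(w);   /* assumed GCC/Clang */
    return sw_hweight32(w);
}
```

Both paths have to return the same count for every input; the branch only picks how it gets computed. The sketch says nothing about the code-size question raised above, which depends on what the compiler emits around the call.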
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
