On Thu, May 12, 2016 at 03:09:32PM +0200, Borislav Petkov wrote:
> I wanted to have gcc use %[w] and this way not hardcode the reg but the
> ABI kinda hardcodes it to rAX. And you're right about tracing funkiness
> adding glue so we're probably better off doing the .S thing directly and
> making
On Thu, May 12, 2016 at 02:14:52PM +0200, Peter Zijlstra wrote:
> But this is a C function, with C calling convention. You're now assuming
> GCC doesn't clobber anything with its prologue/epilogue.
>
> I think hpa meant to put it in an .S file and avoid all that.
I wanted to have gcc use %[w]
On Thu, May 12, 2016 at 01:57:38PM +0200, Borislav Petkov wrote:
> #ifdef CONFIG_X86_32
> # define PUSH_DX "pushl %%edx\n\t"
> # define POP_DX "popl %%edx\n\t"
> #else
> # define PUSH_DX "pushq %%rdx\n\t"
> # define POP_DX "popq %%rdx\n\t"
> #endif
>
> unsigned int
On Wed, May 11, 2016 at 09:54:50PM -0700, H. Peter Anvin wrote:
> I was thinking it isn't really very complex code even in assembly as
> it is super-regular; you can even crib the gcc-generated code if you
> wish.
Do I wanna do experiments in asm? Always! :-)
Ok, so I did steal gcc -m32 -O3
On May 11, 2016 4:24:09 AM PDT, Peter Zijlstra wrote:
>On Wed, May 11, 2016 at 07:15:19AM -0400, Brian Gerst wrote:
>
>> I think he meant the out of line version would be asm, so you could
>> control what registers were clobbered.
>
>Yeah, it might save a few cycles on the call, but given that
On Wed, May 11, 2016 at 01:24:09PM +0200, Peter Zijlstra wrote:
> Yeah, it might save a few cycles on the call, but given that most
> machines should have popcnt these days is it worth the hassle/cost of
> duplicating the lib/hweight.c magic in asm (and remember, twice, once
> for 32bit and once
On Wed, May 11, 2016 at 07:15:19AM -0400, Brian Gerst wrote:
> I think he meant the out of line version would be asm, so you could
> control what registers were clobbered.
Yeah, it might save a few cycles on the call, but given that most
machines should have popcnt these days is it worth the
On Wed, May 11, 2016 at 12:11 AM, Borislav Petkov wrote:
> On Tue, May 10, 2016 at 03:30:48PM -0700, H. Peter Anvin wrote:
>> I didn't mean inline assembly.
>
> How does that matter?
>
> The problem is having as few insn bytes as possible and the minimal
> size we can do is issuing POPCNT
On Tue, May 10, 2016 at 03:30:48PM -0700, H. Peter Anvin wrote:
> I didn't mean inline assembly.
How does that matter?
The problem is having as few insn bytes as possible and the minimal
size we can do is issuing POPCNT everywhere which is 4 or 5 bytes. The
alternatives then replace that with a
On 05/10/16 12:10, Borislav Petkov wrote:
> On Tue, May 10, 2016 at 12:03:48PM -0700, H. Peter Anvin wrote:
>> Also, to be fair... if the problem is with these being in C then we
>> could just do it in assembly easily enough.
>
> I thought about converting the __sw_hweight* variants to asm but
>
On Tue, May 10, 2016 at 12:03:48PM -0700, H. Peter Anvin wrote:
> Also, to be fair... if the problem is with these being in C then we
> could just do it in assembly easily enough.
I thought about converting the __sw_hweight* variants to asm but
__sw_hweight32, for example, is 55 bytes here and
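For readers following along: the software fallback under discussion is a bit-parallel population count. Below is a minimal userspace sketch of that technique, not the kernel's actual code. The names sketch_hweight32 and sketch_hweight64 are made up for illustration, and building the 64-bit count from two 32-bit halves is just one possible choice.

```c
#include <stdint.h>

/* Hypothetical helper: count set bits in a 32-bit word without POPCNT. */
static unsigned int sketch_hweight32(uint32_t w)
{
	/* Each 2-bit field now holds the number of set bits it had. */
	uint32_t res = w - ((w >> 1) & 0x55555555u);

	/* Sum adjacent 2-bit fields into 4-bit fields. */
	res = (res & 0x33333333u) + ((res >> 2) & 0x33333333u);
	/* Sum adjacent 4-bit fields into bytes (max 8 per byte, no overflow). */
	res = (res + (res >> 4)) & 0x0F0F0F0Fu;
	/* Fold the four byte counts into the low byte. */
	res = res + (res >> 8);
	return (res + (res >> 16)) & 0xFFu;
}

/* Hypothetical helper: 64-bit count as the sum of the two 32-bit halves. */
static unsigned int sketch_hweight64(uint64_t w)
{
	return sketch_hweight32((uint32_t)(w >> 32)) +
	       sketch_hweight32((uint32_t)w);
}
```

A hand-written asm version would follow the same shift/mask/add steps; the point raised in the thread is that doing it in asm lets you control which registers get clobbered.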
On May 10, 2016 10:23:13 AM PDT, Peter Zijlstra wrote:
>On Tue, May 10, 2016 at 06:53:18PM +0200, Borislav Petkov wrote:
>> static __always_inline unsigned int __arch_hweight32(unsigned int w)
>> {
>> -unsigned int res = 0;
>> +unsigned int res;
>>
>> -asm (ALTERNATIVE("call
On Tue, May 10, 2016 at 07:23:13PM +0200, Peter Zijlstra wrote:
> So what was wrong with using the normal thunk_*.S wrappers for the
> calls? That would allow you to use the alternative() stuff which does
> generate smaller code.
Yeah, so a full allyesconfig vmlinux gives ~22K .text size
On Tue, May 10, 2016 at 06:53:18PM +0200, Borislav Petkov wrote:
> static __always_inline unsigned int __arch_hweight32(unsigned int w)
> {
> - unsigned int res = 0;
> + unsigned int res;
>
> - asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
> -
From: Borislav Petkov <b...@suse.de>
Date: Wed, 4 May 2016 18:52:09 +0200
Subject: [PATCH -v2] x86/hweight: Get rid of the special calling convention
People complained about ARCH_HWEIGHT_CFLAGS and how it throws a wrench
into kcov, lto, etc, experimentation.
And it's not like we absolutely need it so let's get