
Re: Strange instruction sequence with DMD while calling functions with float parameters

2020-02-14 Thread Basile B. via Digitalmars-d-learn


On Friday, 14 February 2020 at 22:36:20 UTC, PatateVerte wrote:

Hello
I noticed a strange behaviour of the DMD compiler when it has 
to call a function with float arguments.


I build with the flags "-mcpu=avx2 -O -m64" on 64-bit Windows 
using "DMD32 D Compiler v2.090.1-dirty".


I have the following function:
   float mul_add(float a, float b, float c); //Return a * b + c

When I try to call it:
   float f = mul_add(1.0, 2.0, 3.0);

I tested other functions with float parameters, and the same 
problem occurs.


The following instructions are then generated:
//Loads the values, as expected
vmovss xmm2,dword [rel 0x64830]
vmovss xmm1,dword [rel 0x64834]
vmovss xmm0,dword [rel 0x64838]
//Why ?
movq r8,xmm2
movq rdx,xmm1
movq rcx,xmm0
//
call 0x400   //0x400 is where the mul_add function is located

My questions are:
 - Is there a reason why the registers xmm0/1/2 are copied into 
rcx/rdx/r8 before the call? The calling convention specifies 
that floating-point parameters are passed in xmm 
registers, not GPRs, unless you are using your own calling 
convention.
 - Why is it done using non-AVX instructions? Mixing AVX and 
non-AVX instructions may impact speed greatly.


Any ideas? Thank you in advance.


It's simply bad codegen (or rather a missed opportunity to 
optimize) from DMD. Its backend doesn't see that the parameters 
are already in the right order and in the right registers, so it 
copies them and puts them in the regs for the inner func call.


I have observed this in the past too, i.e. unexplained round 
tripping between GP and SSE regs. For good FP codegen use LDC2 or 
GDC, or write iasm (but lose inlining).


For other people who'd like to observe the problem: 
https://godbolt.org/z/gvqEqz.
By the way, I had to deactivate AVX2 targeting because otherwise 
the result is even weirder (https://godbolt.org/z/T9NwMc).
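
For anyone who wants to reproduce this locally rather than on godbolt, a minimal sketch could look like the following. The body of mul_add, the pragma(inline, false) attribute and the main function are assumptions made for illustration; the original post only shows the declaration and the call site.

```d
import std.stdio : writeln;

// Assumed body: the original post only gives the declaration
// and the comment "Return a * b + c".
// pragma(inline, false) asks the compiler not to inline the function,
// so that a real call stays visible in the disassembly.
pragma(inline, false)
float mul_add(float a, float b, float c)
{
    return a * b + c;
}

void main()
{
    // Call site to look at in the generated code.
    float f = mul_add(1.0f, 2.0f, 3.0f);
    // Use the result so the call is not discarded as dead code.
    writeln(f);
}
```

Building this with the flags from the original post (dmd -O -m64 -mcpu=avx2) and disassembling the call site in main should show whether the extra moves from xmm registers to GPRs appear with a given compiler version.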

Strange instruction sequence with DMD while calling functions with float parameters

2020-02-14 Thread PatateVerte via Digitalmars-d-learn


Hello
I noticed a strange behaviour of the DMD compiler when it has to 
call a function with float arguments.


I build with the flags "-mcpu=avx2 -O -m64" on 64-bit Windows 
using "DMD32 D Compiler v2.090.1-dirty".


I have the following function:
   float mul_add(float a, float b, float c); //Return a * b + c

When I try to call it:
   float f = mul_add(1.0, 2.0, 3.0);

I tested other functions with float parameters, and the same 
problem occurs.


The following instructions are then generated:
//Loads the values, as expected
vmovss xmm2,dword [rel 0x64830]
vmovss xmm1,dword [rel 0x64834]
vmovss xmm0,dword [rel 0x64838]
//Why ?
movq r8,xmm2
movq rdx,xmm1
movq rcx,xmm0
//
call 0x400   //0x400 is where the mul_add function is located

My questions are:
 - Is there a reason why the registers xmm0/1/2 are copied into 
rcx/rdx/r8 before the call? The calling convention specifies that 
floating-point parameters are passed in xmm registers, 
not GPRs, unless you are using your own calling convention.
 - Why is it done using non-AVX instructions? Mixing AVX and 
non-AVX instructions may impact speed greatly.


Any ideas? Thank you in advance.
