Matthew Knepley <[email protected]> writes:

> On Tue, Apr 4, 2017 at 3:40 PM, Filippo Leonardi <[email protected]>
> wrote:
>
>> I had weird issues where gcc (which I am using for my tests right now)
>> wasn't vectorising properly (even with all the relevant flags enabled,
>> from -ftree-vectorize to -mavx). In my tests, the Intel compiler was a
>> bit better at that.
>>
>
> We are definitely at the mercy of the compiler for this. Maybe Jed has an
> idea why it's not vectorizing.

Is this so bad?

000000000024080e <VecMAXPY_Seq+0x2fe> mov    rax,QWORD PTR [rbp-0xb0]
0000000000240815 <VecMAXPY_Seq+0x305> add    ebx,0x1
0000000000240818 <VecMAXPY_Seq+0x308> vmulpd ymm0,ymm7,YMMWORD PTR [rax+r9*1]
000000000024081e <VecMAXPY_Seq+0x30e> mov    rax,QWORD PTR [rbp-0xa8]
0000000000240825 <VecMAXPY_Seq+0x315> vfmadd231pd ymm0,ymm8,YMMWORD PTR [rax+r9*1]
000000000024082b <VecMAXPY_Seq+0x31b> mov    rax,QWORD PTR [rbp-0xb8]
0000000000240832 <VecMAXPY_Seq+0x322> vfmadd231pd ymm0,ymm6,YMMWORD PTR [rax+r9*1]
0000000000240838 <VecMAXPY_Seq+0x328> vfmadd231pd ymm0,ymm5,YMMWORD PTR [r10+r9*1]
000000000024083e <VecMAXPY_Seq+0x32e> vaddpd ymm0,ymm0,YMMWORD PTR [r11+r9*1]
0000000000240844 <VecMAXPY_Seq+0x334> vmovapd YMMWORD PTR [r11+r9*1],ymm0
000000000024084a <VecMAXPY_Seq+0x33a> add    r9,0x20
000000000024084e <VecMAXPY_Seq+0x33e> cmp    DWORD PTR [rbp-0xa0],ebx
0000000000240854 <VecMAXPY_Seq+0x344> ja     000000000024080e <VecMAXPY_Seq+0x2fe>
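
For anyone reading the listing cold: each trip through that loop does one
packed multiply and three packed FMAs on 256-bit ymm registers (four doubles
each), adds the result to the destination, writes it back with an aligned
store, and advances the offset by 0x20 bytes. That is the shape you would
expect from a four-vector update y += a0*x0 + a1*x1 + a2*x2 + a3*x3. A
minimal C sketch of a loop of that form is below. It is an illustration
only, not the PETSc source; the name maxpy4 and its signature are
hypothetical, and whether a given compiler vectorizes it depends on the
flags and target.

```c
#include <stddef.h>

/* Hypothetical stand-in for the four-vector case of a MAXPY-style kernel:
 *   y[i] += a0*x0[i] + a1*x1[i] + a2*x2[i] + a3*x3[i]
 * The name and signature are illustrative, not PETSc's.
 * restrict tells the compiler the arrays do not overlap, which is one of
 * the things it needs to know before it can vectorize the loop. */
void maxpy4(size_t n, double *restrict y,
            double a0, double a1, double a2, double a3,
            const double *restrict x0, const double *restrict x1,
            const double *restrict x2, const double *restrict x3)
{
  for (size_t i = 0; i < n; i++) {
    y[i] += a0 * x0[i] + a1 * x1[i] + a2 * x2[i] + a3 * x3[i];
  }
}
```

Building something like this with optimization and AVX2/FMA enabled and
disassembling the object file is a quick way to compare against the listing
above.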
