https://bugs.llvm.org/show_bug.cgi?id=50598
Bug ID: 50598
Summary: suboptimal vectorization on function call with >10
parameters with -march=znver2
Product: new-bugs
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: new bugs
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected], [email protected]
Example code can be found on [GitHub](https://github.com/Apache-HB/bench).
When benchmarking this code on a 2700X (znver2), the code LLVM generates at the
highest optimization level runs on average 10% slower than GCC-generated code.
LLVM places the arguments on the stack with
```
vbroadcastsd r256, m64
vmovups m256, r256
```
which is slower on znver2 than the scalar equivalent
```
push r64
push r64
push r64
push r64
```
Currently LLVM generates
```
mov m64, imm64
mov m64, imm64
mov m64, imm64
mov m64, imm64
```
when vectorization is disabled, which is 20% slower than GCC's output and 10%
slower than the vectorized equivalent on znver2.
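For readers without access to the linked repository, here is a minimal sketch of the kind of call that seems to be involved. It is not the benchmark code: `sum12`, `call_sum12`, and the argument values are hypothetical, and whether this exact snippet produces the instruction sequences above is an assumption that has not been verified. The idea is only that with 12 integer parameters some arguments must go on the stack under both the Windows x64 and System V calling conventions, and repeated constant values give the vectorizer something to broadcast.

```c
#include <stdint.h>

/* Hypothetical callee with 12 integer parameters. Only the first 4
   (Windows x64) or 6 (System V) fit in registers, so the rest are
   passed on the stack. noinline keeps the call from being folded away. */
__attribute__((noinline))
int64_t sum12(int64_t a, int64_t b, int64_t c, int64_t d,
              int64_t e, int64_t f, int64_t g, int64_t h,
              int64_t i, int64_t j, int64_t k, int64_t l)
{
    return a + b + c + d + e + f + g + h + i + j + k + l;
}

/* Hypothetical caller: the trailing arguments share one constant, so
   adjacent stack slots receive identical 64-bit values. */
int64_t call_sum12(void)
{
    return sum12(1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7, 7);
}
```

Compiling this with something like `clang -O3 -march=znver2 -S` and comparing against the same file built with `-fno-slp-vectorize -fno-vectorize`, or with GCC, should show how each compiler sets up the stack arguments.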
--
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs