https://bugs.llvm.org/show_bug.cgi?id=42633
Bug ID: 42633
Summary: [X86] Avoid scalar/vector transfers for scalar arithmetic
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected], [email protected],
[email protected], [email protected]
https://godbolt.org/z/LvQiCP
#include <x86intrin.h>
auto add(__v4si x) {
  return _mm_set1_epi32(x[1] + x[3]);
}
Current codegen:
_Z3addDv4_i:
  vextractps $1, %xmm0, %eax
  vextractps $3, %xmm0, %ecx
  addl %eax, %ecx
  vmovd %ecx, %xmm0
  vpshufd $0, %xmm0, %xmm0 # xmm0 = xmm0[0,0,0,0]
  retq
A better sequence would stay in the vector domain and avoid the transfers through scalar registers, something like:
add(int __vector(4)):
  vpshufd xmm1, xmm0, 78 # xmm1 = xmm0[2,3,0,1]
  vpaddd xmm0, xmm1, xmm0
  vpshufd xmm0, xmm0, 85 # xmm0 = xmm0[1,1,1,1]
  ret
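
For reference, the shuffle/add/shuffle sequence above can be spelled out with SSE2 intrinsics. This is only a rough sketch to show what the desired lowering computes; the helper name add_lanes_1_3_splat is made up for illustration and is not part of the report or of LLVM.

```cpp
#include <emmintrin.h>  // SSE2: _mm_shuffle_epi32, _mm_add_epi32

// Hypothetical helper (name is an assumption, not from the report):
// returns a vector with x[1] + x[3] in every lane, using only vector ops.
static inline __m128i add_lanes_1_3_splat(__m128i x) {
  // Immediate 78 (0b01001110) selects x[2,3,0,1], i.e. swaps the 64-bit halves.
  __m128i swapped = _mm_shuffle_epi32(x, 78);
  // Lane 1 of the sum now holds x[3] + x[1].
  __m128i sum = _mm_add_epi32(swapped, x);
  // Immediate 85 (0b01010101) broadcasts lane 1 to all four lanes.
  return _mm_shuffle_epi32(sum, 85);
}
```

With AVX enabled a compiler would be expected to emit the VEX forms (vpshufd/vpaddd) for these intrinsics, which would match the three-instruction sequence shown above.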