https://bugs.llvm.org/show_bug.cgi?id=50183
Bug ID: 50183
Summary: Preferred canonicalization - select-of-idx vs select-of-gep?
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Scalar Optimizations
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected], [email protected]
If we're selecting between the base address and base+idx, which is the
preferred canonicalization?
; select-of-gep: compute the address, then select between the two pointers.
define <4 x i32> @select0(<4 x i32>* %a0, i64 %a1, i1 %a2) {
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
  %sel = select i1 %a2, <4 x i32>* %a0, <4 x i32>* %gep
  %res = load <4 x i32>, <4 x i32>* %sel
  ret <4 x i32> %res
}
select0:
  shlq $4, %rsi            # scale the index by 16 bytes
  addq %rdi, %rsi          # base + scaled index
  testb $1, %dl
  cmovneq %rdi, %rsi       # select between the two pointers
  vmovaps (%rsi), %xmm0
  retq
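; select-of-idx: select the index first, then compute a single address.
define <4 x i32> @select1(<4 x i32>* %a0, i64 %a1, i1 %a2) {
  %sel = select i1 %a2, i64 %a1, i64 0
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
  %res = load <4 x i32>, <4 x i32>* %gep
  ret <4 x i32> %res
}
select1:
  xorl %eax, %eax
  testb $1, %dl
  cmovneq %rsi, %rax       # select between the index and zero
  shlq $4, %rax            # scale the index by 16 bytes
  vmovaps (%rdi,%rax), %xmm0   # base + index folded into the load
  retq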
https://godbolt.org/z/Yxs3fjbWo
opt -O3 doesn't seem to have any effect: neither form is rewritten into the
other.
X86 could have a slight preference for select1 (select-of-idx), as it folds
more of the address math into the load's addressing mode. That is useful if
the base address has additional uses (the use case this was pulled from was
clamping out-of-bounds indices to zero for several such loads in an unrolled
loop).
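For reference, the two shapes could be written at the source level roughly as
follows. This is only a sketch in C: the vec4i type, the function names, and
the in_bounds parameter are hypothetical and not taken from the original use
case. Note that the two IR snippets above use opposite polarity for %a2
(select0 picks the base pointer when it is true, select1 picks the index when
it is true); the sketch uses one polarity for both so the results can be
compared directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte element standing in for the IR type <4 x i32>. */
typedef struct { int32_t lane[4]; } vec4i;

/* select-of-gep: compute base + idx, then select between the two pointers. */
static vec4i load_select_of_gep(const vec4i *base, size_t idx, bool in_bounds) {
    assert(base != NULL);
    const vec4i *p = in_bounds ? base + idx : base;
    return *p;
}

/* select-of-idx: clamp the index to zero first, leaving a single
   base + index address computation that a backend may fold into the load. */
static vec4i load_select_of_idx(const vec4i *base, size_t idx, bool in_bounds) {
    assert(base != NULL);
    size_t clamped = in_bounds ? idx : 0;
    return base[clamped];
}
```

Both functions return the same element for any input; the difference is only
in whether the select operates on pointers or on the index.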