| Field | Value |
|---|---|
| Issue | 169437 |
| Summary | Missed optimization: redundant calls to `test` around jumps/moves |
| Labels | new issue |
| Assignees | |
| Reporter | tgross35 |

Input:
```llvm
define { i1, i8 } @demo(ptr noalias noundef nonnull readonly align 1 captures(none) %s.0, i64 noundef %s.1) unnamed_addr {
start:
  %_23 = icmp samesign ne i64 %s.1, 0
  br i1 %_23, label %bb3, label %bb1

bb3:
  %v = load i8, ptr %s.0, align 1
  br label %bb1

bb1:
  %_0.sroa.3.0 = phi i8 [ %v, %bb3 ], [ undef, %start ]
  %0 = insertvalue { i1, i8 } poison, i1 %_23, 0
  %1 = insertvalue { i1, i8 } %0, i8 %_0.sroa.3.0, 1
  ret { i1, i8 } %1
}
```
This creates the following output on x86:
```asm
demo2:
        test    rsi, rsi
        je      .LBB0_1
        movzx   edx, byte ptr [rdi]
        test    rsi, rsi
        setne   al
        ret
.LBB0_1:
        test    rsi, rsi
        setne   al
        ret
```
Note the two possible control flows:
- `test rsi, rsi -> je .LBB0_1 (taken) -> test rsi, rsi`, and
- `test rsi, rsi -> je .LBB0_1 (not taken) -> movzx edx, byte ptr [rdi] -> test rsi, rsi`

Since neither `je` nor `movzx` modifies flags, the second `test` in each flow is redundant. The `setne` could also become unconditional: the branch has already determined whether `rsi` is zero, so the value written to `al` is known on each path.
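
The reasoning above can be modeled with a rough Rust sketch. Everything here is an assumption made for illustration: the function names are hypothetical, a slice stands in for the `(%s.0, %s.1)` pointer/length pair, and `0` stands in for the `undef` phi input. It only models the control flow and is not the source that produced the IR.

```rust
/// Models the current codegen: the flag is recomputed after the branch,
/// like the second `test rsi, rsi` + `setne al` on each path.
fn demo_recomputed(s: &[u8]) -> (bool, u8) {
    let nonempty = !s.is_empty(); // first `test` + `je`
    // 0 is a stand-in for the `undef` phi input on the empty path.
    let v = if nonempty { s[0] } else { 0 };
    // Recomputing the same condition here is the redundancy in question.
    (!s.is_empty(), v)
}

/// Models the suggested codegen: each path already knows the flag,
/// so it can be materialized as a constant without a second comparison.
fn demo_constant_flag(s: &[u8]) -> (bool, u8) {
    if !s.is_empty() {
        (true, s[0]) // fall-through path: length is known to be nonzero
    } else {
        (false, 0) // taken path: length is known to be zero
    }
}
```

Both functions return the same pair for every input. For example, `demo_recomputed(&[7])` and `demo_constant_flag(&[7])` both give `(true, 7)`.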
AArch64 shows something similar: both `cmp x1, #0` instructions could be removed and each `cset w0, ne` made unconditional, since the `cbz x1` already establishes whether `x1` is zero on that branch.
```asm
demo2:
        cbz     x1, .LBB0_2
        ldrb    w8, [x0]
        cmp     x1, #0
        mov     w1, w8
        cset    w0, ne
        ret
.LBB0_2:
        cmp     x1, #0
        mov     w1, w8
        cset    w0, ne
        ret
```
(Also, is the `mov w1, w8` in `.LBB0_2` actually copying an undefined value, given that `w8` is never written on that path?)
https://llvm.godbolt.org/z/cdxK7qfGz