Issue 169437
Summary Missed optimization: redundant calls to `test` around jumps/moves
Labels new issue
Assignees
Reporter tgross35
Input:

```llvm
define { i1, i8 } @demo(ptr noalias noundef nonnull readonly align 1 captures(none) %s.0, i64 noundef %s.1) unnamed_addr {
start:
  %_23 = icmp samesign ne i64 %s.1, 0
  br i1 %_23, label %bb3, label %bb1

bb3:
  %v = load i8, ptr %s.0, align 1
  br label %bb1

bb1:
  %_0.sroa.3.0 = phi i8 [ %v, %bb3 ], [ undef, %start ]
  %0 = insertvalue { i1, i8 } poison, i1 %_23, 0
  %1 = insertvalue { i1, i8 } %0, i8 %_0.sroa.3.0, 1
  ret { i1, i8 } %1
}
```

This creates the following output on x86:

```asm
demo2:
        test    rsi, rsi
        je      .LBB0_1
        movzx   edx, byte ptr [rdi]
        test    rsi, rsi
        setne   al
        ret
.LBB0_1:
        test    rsi, rsi
        setne   al
        ret
```

Note the two possible control flows:

- `test rsi, rsi -> je .LBB0_1 (taken) -> test rsi, rsi`, and
- `test rsi, rsi -> je .LBB0_1 (not taken) -> movzx edx, byte ptr [rdi] -> test rsi, rsi`

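Since neither `je` nor `movzx` modifies the flags, the second `test` in each flow is redundant. The branch also already determines whether `rsi` is zero on each path, so each `setne` could become an unconditional constant move.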
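As a rough illustration of the above, the x86 output could look something like the following. This is a hand-written sketch, not verified compiler output; it keeps the label names from the listing above and assumes the `{ i1, i8 }` result is returned in `al` and `dl`, as in that listing.

```asm
demo2:
        test    rsi, rsi
        je      .LBB0_1
        movzx   edx, byte ptr [rdi]     # load the i8 field; flags are untouched
        mov     al, 1                   # not-taken path: rsi != 0, so the i1 field is 1
        ret
.LBB0_1:
        xor     eax, eax                # taken path: rsi == 0, so the i1 field is 0
        ret
```

Another option would be to place a single `setne al` directly after the first `test`, since `setne` itself leaves the flags intact for the following `je`.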

AArch64 has something similar: both `cmp x1, #0` instructions could be removed and each `cset w0, ne` replaced with an unconditional constant move, since `cbz x1` already establishes whether `x1` is zero on each path.

```asm
demo2:
        cbz     x1, .LBB0_2
        ldrb    w8, [x0]
        cmp     x1, #0
        mov     w1, w8
        cset    w0, ne
        ret
.LBB0_2:
        cmp     x1, #0
        mov     w1, w8
        cset    w0, ne
        ret
```
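For comparison, a hand-written sketch of what the AArch64 output might look like without the redundant compares. This is an assumption about a possible result, not verified compiler output; it reuses the labels from the listing above and relies on `x1` no longer being needed as an input once `cbz` has executed.

```asm
demo2:
        cbz     x1, .LBB0_2
        ldrb    w1, [x0]                // load the i8 field directly into the return register
        mov     w0, #1                  // x1 was nonzero on this path, so the i1 field is 1
        ret
.LBB0_2:
        mov     w0, wzr                 // x1 was zero on this path, so the i1 field is 0
        ret                             // w1 is left as-is; the i8 field is undef here
```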

(Also, is that `mov w1, w8` in `.LBB0_2` actually copying an undefined value, given that `w8` is never written on that path?)

https://llvm.godbolt.org/z/cdxK7qfGz