| Field | Value |
| --- | --- |
| Issue | 180639 |
| Summary | MIPS1 Load delay slot not respected when load is in branch delay slot |
| Labels | new issue |
| Assignees | |
| Reporter | cody-code-wy |
Hello,
I am reporting this issue together with @JaberwockySeamonstah. We encountered a strange error in a Rust project for the PlayStation 1 and reported it as [issue 150676 on rust-lang](https://github.com/rust-lang/rust/issues/150676#issuecomment-3747279300). After digging into the issue further, we now believe that it is an LLVM issue, and we reasonably believe there is no undefined behavior in the source code.
I have a [fairly minimal example in godbolt](https://godbolt.org/z/5zTz1ej3v), and the issue is particularly visible here:
```asm
9 beqz $1, $BB0_5
10 lbu $2, %lo(app[3dfc78e43c546c95]::CONTROLLERS_A)($2)
11 andi $1, $2, 1
#...
24 $BB0_5: # %bb3.i.i.i
25 andi $1, $2, 1
```
In line 10 (inside the delay slot of the branch on line 9) we load into `$2`, and regardless of whether we branch or not, the next instruction executed (line 11 on fall-through, line 25 if the branch is taken) uses the `$2` register without having waited for the load delay slot to pass. I don't believe this is correct on MIPS-I processors.
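To make the hazard concrete, here is a minimal sketch of a checker for it, written in Python purely for illustration. It assumes a simplified model in which the instruction executed immediately after a load must not read the loaded register, and in which a load sitting in a branch delay slot can be followed by either the fall-through instruction or the branch target. The `Insn` type, the opcode sets, and `load_delay_hazards` are hypothetical names invented for this sketch; none of this is LLVM code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Simplified opcode sets (assumption: only what this example needs).
LOADS = {"lb", "lbu", "lh", "lhu", "lw"}
BRANCHES = {"beq", "bne", "beqz", "bnez"}


@dataclass
class Insn:
    op: str
    dest: Optional[str] = None     # register written, if any
    srcs: Tuple[str, ...] = ()     # registers read
    target: Optional[str] = None   # branch target label, if a branch
    label: Optional[str] = None    # label attached to this instruction


def load_delay_hazards(prog):
    """Return (load_index, user_index) pairs where the instruction that can
    execute right after a load reads the register being loaded."""
    hazards = []
    for i, ins in enumerate(prog):
        if ins.op not in LOADS:
            continue
        if i > 0 and prog[i - 1].op in BRANCHES:
            # The load is in a branch delay slot: the next instruction is
            # either the fall-through or the branch target.
            branch = prog[i - 1]
            taken = next(j for j, x in enumerate(prog) if x.label == branch.target)
            followers = [i + 1, taken]
        else:
            followers = [i + 1]
        for j in followers:
            if j < len(prog) and ins.dest in prog[j].srcs:
                hazards.append((i, j))
    return hazards


# The sequence from the listing above (elided instructions replaced by a nop).
prog = [
    Insn("beqz", srcs=("$1",), target="$BB0_5"),
    Insn("lbu", dest="$2", srcs=("$2",)),             # in the branch delay slot
    Insn("andi", dest="$1", srcs=("$2",)),            # fall-through path
    Insn("nop"),
    Insn("andi", dest="$1", srcs=("$2",), label="$BB0_5"),  # taken path
]

print(load_delay_hazards(prog))
```

Under this model both paths are flagged: the `lbu` at index 1 is followed directly by a reader of `$2` whether or not the branch is taken.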
It's fairly easy to see in the opt pipeline how this code gets set up in this state, but I don't fully understand why this happens.
If we follow it through, this same code is represented by the raw LLVM IR:
```llvm
bb1.i.i: ; preds = %start
%1 = trunc nuw i8 %0 to i1
%2 = load i8, ptr @_RNvCs5jWYnRsDZoD_3app13CONTROLLERS_A, align 4, !range !3, !noundef !2
%3 = trunc nuw i8 %2 to i1
br i1 %1, label %bb2.i.i2.i, label %bb3.i.i.i
bb2.i.i2.i: ; preds = %bb1.i.i
br i1 %3, label %bb8.i.i.i, label %bb11.i.i.i
bb3.i.i.i: ; preds = %bb1.i.i
br i1 %3, label %bb5.i.i.i, label %bb7.i.i.i
```
Here the `load` instruction will become the `lbu`, and the `trunc` will eventually become the `andi`.
First, in the `CodeGen Prepare` opt pass, we can see the code change to doing the `trunc` on either side of the branch instead of before the branch:
```diff llvm
bb1.i.i: ; preds = %start
%1 = trunc nuw i8 %0 to i1
%2 = load i8, ptr @_RNvCs5jWYnRsDZoD_3app13CONTROLLERS_A, align 4, !range !3, !noundef !2
- %3 = trunc nuw i8 %2 to i1
br i1 %1, label %bb2.i.i2.i, label %bb3.i.i.i
bb2.i.i2.i: ; preds = %bb1.i.i
+ %3 = trunc nuw i8 %2 to i1
br i1 %3, label %bb8.i.i.i, label %bb11.i.i.i
bb3.i.i.i: ; preds = %bb1.i.i
- br i1 %3, label %bb5.i.i.i, label %bb7.i.i.i
+ %4 = trunc nuw i8 %2 to i1
+ br i1 %4, label %bb5.i.i.i, label %bb7.i.i.i
```
I don't know what this optimization is actually trying to do here, but in this case it seems to create the exact setup needed to confuse the later load delay slot filler optimization.
From here I'm really not sure what to do next. I wanted to try adjusting which opt passes are run, but I can't seem to find any useful information about adjusting them. The issue does not appear with `-O0`, but anything higher reliably causes the issue with the godbolt example.
Since I know that MIPS-I is a fairly niche target, both @JaberwockySeamonstah and I are willing to get our hands dirty to try to fix this.