Issue 181605
Summary [x86] asm parser allows/expects incorrect syntax for JCC causing UB and LLVM crash
Labels new issue
Assignees
Reporter meowette
### Description:

The `X86AsmParser` accepts `jcc [label]` and `jcc label`, and for offsets expects `jcc [<relative offset>]`. The GNU Assembler only accepts `jcc label` and `jcc <relative offset>`, and an offset can also be applied to labels on both. This LLVM behavior also leads to a crash on Windows with `jcc [<relative offset>]`; on Linux (and maybe other platforms) the crash is prevented by an unrelated nullptr check, as detailed in this PR (https://github.com/llvm/llvm-project/pull/181459). The crash is likely caused by fixups that should not have been created for relative offsets in the first place.

I'm not sure how to test this for GCC/GNUAS, but it appears that while LLVM does not crash on Linux with `jcc [<relative offset>]`, that does not mean it is actually working: it just produces UB and jumps to `[image_base + offset]`. This does not seem to be unique to Intel syntax, as shown in the examples below.

### Repro:
Here's an example (Intel syntax) on the Rust playground showing how to trigger undefined behavior with `jcc [<rel>]`:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=d7624cb45e2438ddcb6a1ca663d3cbab
When I compiled this locally and loaded it into Binary Ninja, I saw that it jumped into the ELF header signature.

I don't know much about C++, but here is a clang AT&T syntax repro that also causes a crash:
https://godbolt.org/z/5qG4zd11P

Both examples seem to interpret the offset as `image_base + offset` rather than `rip + offset`.
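
To make the difference between the two interpretations concrete, here is a minimal arithmetic sketch. The image base, instruction address, instruction size and offset are made-up values chosen for illustration, not values taken from the repros above, and the function names are hypothetical.

```python
def rip_relative_target(insn_addr: int, insn_size: int, offset: int) -> int:
    """Target if the offset is applied to RIP, i.e. the address of the
    instruction that follows the jump (jump address + jump size)."""
    return insn_addr + insn_size + offset


def image_base_relative_target(image_base: int, offset: int) -> int:
    """Target if the offset is instead applied to the image base, which is
    how both repros appear to behave."""
    return image_base + offset


if __name__ == "__main__":
    # Hypothetical layout: image loaded at 0x400000, 2-byte jcc at 0x401000.
    image_base = 0x400000
    insn_addr = 0x401000
    insn_size = 2
    offset = 4

    print(hex(rip_relative_target(insn_addr, insn_size, offset)))
    print(hex(image_base_relative_target(image_base, offset)))
```

With a small offset, the second result lands a few bytes past the image base, which would be consistent with a jump ending up inside the ELF header.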

One way to work around LLVM not supporting RIP-relative addressing with JCCs (at least in Intel syntax; I could not figure it out with AT&T syntax) is to use a label plus an offset, like this:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=68c10324d4a09dea3152a052b2f88404

clang AT&T:
https://godbolt.org/z/szbfzrMsP

These examples produce the correct code and do not crash. But from my comparison between GNUAS and LLVM, I'm not sure GNUAS is handling this right either.

The only way I have found so far to make RIP-relative addressing for JCCs "work" with GAS and LLVMAS without labels is this syntax:
```asm
.att_syntax
jz .+4
ud2
ret
```

But this uses the address of the jump plus the offset, not the address of the jump plus its size plus the offset, so it's not *really* RIP-relative either.
LLVMAS: https://godbolt.org/z/qx8bKM9Ka
GAS: https://godbolt.org/z/4o8x5oYYq

Note that the `jz .+4` example above works with both Intel and AT&T syntax on GNUAS, but only with AT&T syntax on LLVMAS.
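
The `jz .+4` arithmetic can be sketched as follows. This is a minimal illustration, assuming the assembler picks the 2-byte short form of `jz` (opcode `0x74` followed by a signed 8-bit displacement); the helper names are hypothetical and are not part of either assembler.

```python
JZ_SHORT_OPCODE = 0x74  # short-form jz/je: opcode byte followed by rel8
JZ_SHORT_SIZE = 2       # opcode byte + rel8 byte


def encode_jz_dot_plus(dot_offset: int) -> bytes:
    """Encode `jz .+dot_offset`, assuming the short form is chosen.

    `.` is the address of the jump itself, while the CPU adds rel8 to the
    address of the *next* instruction, so the instruction size has to be
    subtracted from the `.`-based offset.
    """
    rel8 = dot_offset - JZ_SHORT_SIZE
    if not -128 <= rel8 <= 127:
        raise ValueError("offset does not fit the short form")
    return bytes([JZ_SHORT_OPCODE, rel8 & 0xFF])


def jump_target(jump_addr: int, encoded: bytes) -> int:
    """Address the CPU jumps to when the condition holds."""
    rel8 = int.from_bytes(encoded[1:2], "little", signed=True)
    return jump_addr + len(encoded) + rel8


if __name__ == "__main__":
    code = encode_jz_dot_plus(4)
    print(code.hex())
    print(hex(jump_target(0x1000, code)))
```

Under that assumption, `jz .+4` carries a displacement of 2, which skips the 2-byte `ud2` and lands on `ret`. A truly RIP-relative reading of `+4` would instead land 4 bytes after the end of the jump.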

### Expected Result:

As far as I am aware, LLVM wants to be compatible with GNUAS. I am not sure whether this means it also cannot go beyond GNUAS and accept syntax that GNUAS does not allow (`[]`) in AT&T and Intel syntax. Either way, it should not interpret the offset in `jcc [<offset>]` (AT&T: `jcc <offset>`) as image-base relative, and ideally it should error (if proper RIP-relative addressing is not going to be supported). While it would be preferable for LLVM to match GNUAS, it appears (at least according to the godbolt example below) that GNUAS also does not support actual RIP-relative addressing in JCCs without labels and will completely ignore the offset, or most likely do something similar to LLVM, so I am not sure what the ideal solution here would be.

Here is a comparison between GNUAS and LLVMAS on how they handle assembling JCC.
GNUAS: https://godbolt.org/z/xqjq8a4dM
LLVMAS: https://godbolt.org/z/9vzr86nEx

##### This is my first issue here, so please feel free to let me know if I did something wrong or did not properly explain something ^o^!!

_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
