| Issue | 170010 |
| Summary | [AArch64] Fusion of floating-point round+convert to integer is not always performed |
| Labels | new issue |
| Assignees | |
| Reporter | valadaptive |
AArch64 has instructions for converting a float to an int using a variety of different rounding modes (`fcvtn`, `fcvta`, `fcvtp`, `fcvtm`, etc.). When a call to a floating-point rounding intrinsic (`floor`, `ceil`, `round`, `rint`, `trunc`) is followed by `fptoui` or `fptosi`, that sequence should be combined into a single rounding conversion instruction.
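For concreteness, here is a minimal Rust sketch of the scalar round-then-convert pattern. The function names follow the ones used in the demo, but the bodies are my assumption and not the demo's actual source. The per-function comments give the rounding direction each fused conversion family uses; they do not claim what any particular compiler version emits. Note that Rust's `as` cast saturates, so the IR it produces may differ from a plain `fptoui`, and `round_ties_even` needs a reasonably recent toolchain.

```rust
// Sketch only: bodies are assumed, not copied from the linked demo.

/// Round toward minus infinity, then convert (rounding direction of `fcvtm*`).
pub fn floor_to_int(x: f32) -> u32 {
    x.floor() as u32
}

/// Round toward plus infinity, then convert (rounding direction of `fcvtp*`).
pub fn ceil_to_int(x: f32) -> u32 {
    x.ceil() as u32
}

/// Round to nearest, ties away from zero, then convert (`fcvta*`).
pub fn round_to_int(x: f32) -> u32 {
    x.round() as u32
}

/// Round toward zero, then convert (`fcvtz*`).
pub fn trunc_to_int(x: f32) -> u32 {
    x.trunc() as u32
}

/// Round to nearest, ties to even, then convert (`fcvtn*`).
pub fn round_to_int_ties_even(x: f32) -> u32 {
    x.round_ties_even() as u32
}
```

The tie case is where the modes differ: `round_to_int(2.5)` gives 3, while `round_to_int_ties_even(2.5)` gives 2.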
From my testing, LLVM currently performs this optimization for scalar `floor`, `ceil`, `round`, and `trunc`. The optimization is missing for `rint`, and for vector operations (including autovectorized ones).
[Here's a Compiler Explorer demo.](https://godbolt.org/z/nKjjEsqGh) The `round_to_int`, `floor_to_int`, `ceil_to_int`, and `trunc_to_int` functions compile down to a single `fcvt[mode]u`. However, the `round_to_int_ties_even` function compiles down to a `frintx`+`fcvtzu`. All the four-at-a-time functions are autovectorized, and likewise compile down to a vector `frint[mode]`+`fcvtzu` instead of a vector `fcvt[mode]u`.
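A four-at-a-time function of the kind described might look like the sketch below. The names `floor_to_int_x4` and `round_to_int_ties_even_x4` are hypothetical, and the bodies are an assumption about the shape of the demo code. Whether this exact code is autovectorized, and to which instructions, has not been checked here.

```rust
// Sketch only: hypothetical names, not the demo's actual source.

/// Apply floor-then-convert to each of four lanes.
pub fn floor_to_int_x4(v: [f32; 4]) -> [u32; 4] {
    v.map(|x| x.floor() as u32)
}

/// Apply round-to-nearest-even-then-convert to each of four lanes.
pub fn round_to_int_ties_even_x4(v: [f32; 4]) -> [u32; 4] {
    v.map(|x| x.round_ties_even() as u32)
}
```

Each lane is rounded independently, so a fused vector conversion would have to produce the same per-lane results as the two-step sequence.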
_______________________________________________
llvm-bugs mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs