| Field | Value |
| --- | --- |
| Issue | 76448 |
| Summary | If `vmaxss` already handles NaN like `fmax`, why is `fmax` so complicated? |
| Labels | new issue |
| Assignees | |
| Reporter | Eisenwave |

This is a possible missed optimization in clang++. It's not a libc++ issue because `fmax` is just a thin wrapper around `__builtin_fmax`:
https://github.com/llvm/llvm-project/blob/1150e8ef7765f43a730575bd224eda18e916ac1e/libcxx/include/__math/min_max.h#L28-L30
## Code to reproduce
```cpp
#include <cmath>
float fmax_(float x, float y) {
    return std::fmax(x, y);
}
```
## Possibly suboptimal output
With `clang++ -O3 -stdlib=libc++`, this compiles to the following (https://godbolt.org/z/fn4v99sv8):
```asm
fmax_(float, float): # @fmax_(float, float)
        vmaxss      xmm2, xmm1, xmm0
        vcmpunordss xmm0, xmm0, xmm0
        vblendvps   xmm0, xmm2, xmm1, xmm0
        ret
```
Is this a missed optimization? The latter two instructions are dedicated solely to NaN handling: `vcmpunordss` computes a mask for `isnan(x)`, and `vblendvps` uses that mask to select either `y` or the `vmaxss` result. However, I don't believe this is necessary, given the documented behavior of `maxss`:
```c
MAX(SRC1, SRC2)
{
    IF ((SRC1 = 0.0) and (SRC2 = 0.0)) THEN DEST := SRC2;
        ELSE IF (SRC1 = NaN) THEN DEST := SRC2; FI;
        ELSE IF (SRC2 = NaN) THEN DEST := SRC2; FI;
        ELSE IF (SRC1 > SRC2) THEN DEST := SRC1;
        ELSE DEST := SRC2;
    FI;
}
```
\- https://www.felixcloutier.com/x86/maxss
To me it seems like the NaN-handling behavior of `vcmpunordss` and `vblendvps` is already covered by the first two `ELSE IF` branches. Am I missing something obvious, or is this a missed optimization?
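
One way to check this is to transcribe the pseudocode into plain C++ and compare it against `std::fmax` for NaN inputs. The following is only a sketch. `maxss_model` and `emitted_model` are hypothetical names made up for illustration, not real intrinsics, and the operand mapping (`xmm0` = `x`, `xmm1` = `y`, so `vmaxss xmm2, xmm1, xmm0` computes `MAX(y, x)`) is my reading of the assembly above.

```cpp
#include <cmath>

// Hypothetical model of the MAX(SRC1, SRC2) pseudocode quoted above.
// This is a literal transcription, not a real intrinsic.
float maxss_model(float src1, float src2) {
    if (src1 == 0.0f && src2 == 0.0f) return src2;  // both zero (any signs)
    if (std::isnan(src1)) return src2;              // SRC1 is NaN -> SRC2
    if (std::isnan(src2)) return src2;              // SRC2 is NaN -> SRC2 (NaN)
    if (src1 > src2) return src1;
    return src2;
}

// Hypothetical model of the three-instruction sequence clang emits,
// assuming x is in xmm0 and y is in xmm1.
float emitted_model(float x, float y) {
    float m = maxss_model(y, x);   // vmaxss xmm2, xmm1, xmm0
    bool x_is_nan = std::isnan(x); // vcmpunordss xmm0, xmm0, xmm0
    return x_is_nan ? y : m;       // vblendvps xmm0, xmm2, xmm1, xmm0
}
```

In this model the result is `SRC2` whenever either operand is NaN, so the outcome depends on operand order: `maxss_model(NAN, 1.0f)` gives `1.0f`, whereas `maxss_model(1.0f, NAN)` gives NaN. `std::fmax(1.0f, NAN)` gives `1.0f`. Comparing the two functions for both NaN positions should show whether the extra instructions change the result.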