https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124624
Bug ID: 124624
Summary: Missed optimization on arrangement of code blocks/branches (-Os mode)
Product: gcc
Version: 15.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: Explorer09 at gmail dot com
Target Milestone: ---
This example code is from my personal project.
```c
/* <https://gitlab.com/Explorer09/utf_convert> */
#include <stdint.h>

uint32_t utf16_decode_code_unit(unsigned int *prev_surrogate, unsigned int code_unit) {
    if (*prev_surrogate == 0) {
        if ((code_unit >> 10) == (0xD800 >> 10)) {
            *prev_surrogate = code_unit;
            return (uint32_t)-2;
        }
        if ((code_unit >> 10) == (0xDC00 >> 10)) {
            return (uint32_t)-1;
        }
        if ((code_unit >> 16) != 0) {
            return (uint32_t)-1;
        }
        return code_unit;
    }
    *prev_surrogate = 0;
    if ((code_unit >> 10) != (0xDC00 >> 10)) {
        return (uint32_t)-1;
    }
    return (((uint32_t)*prev_surrogate << 10) + code_unit -
            ((uint32_t)0xD800 << 10) + 0x10000 - 0xDC00);
}
```
Compiler Explorer link: https://godbolt.org/z/KMeTanerd
Note the two conditionals `(code_unit >> 10) == (0xDC00 >> 10)` and
`(code_unit >> 16) != 0`. Both return `(uint32_t)-1`.
x86-64 GCC 15.2 with the -Os option generates the following for these two
conditionals:
```assembly
.L3:
        cmpl    $55, %edx
        jne     .L5
.L6:
        movl    $-1, %eax
        ret
.L5:
        movl    %esi, %ecx
        shrl    $16, %ecx
        je      .L4
        jmp     .L6
        # ... skipped ...
.L4:
        ret
```
The code could be one JMP instruction smaller if the .L5 and .L6 blocks were
swapped. The first branch is then inverted (JNE .L5 becomes JE .L6) and the
second check falls through into .L6, resulting in the following:
```assembly
.L3:
        cmpl    $55, %edx
        je      .L6
        movl    %esi, %ecx
        shrl    $16, %ecx
        je      .L4
.L6:
        movl    $-1, %eax
        ret
        # ... skipped ...
.L4:
        ret
```
If I swap the two conditionals in the example code (as a workaround), GCC
generates code of the size I wanted.
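For reference, a minimal sketch of that workaround, assuming "swap" means moving the `(code_unit >> 16) != 0` check ahead of the `0xDC00` check. The `_swapped` suffix in the function name is only for illustration and is not part of the project. Both conditionals return the same value, so the reordering should not change the result for any input. The rest of the function is kept as posted above, and the resulting assembly is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical name; same logic as the example above with the two
   error checks in the opposite order. */
uint32_t utf16_decode_code_unit_swapped(unsigned int *prev_surrogate,
                                        unsigned int code_unit) {
    if (*prev_surrogate == 0) {
        if ((code_unit >> 10) == (0xD800 >> 10)) {
            /* High surrogate: remember it and ask for the next unit. */
            *prev_surrogate = code_unit;
            return (uint32_t)-2;
        }
        /* Swapped: the range check now comes first... */
        if ((code_unit >> 16) != 0) {
            return (uint32_t)-1;
        }
        /* ...and the lone low surrogate check comes second. */
        if ((code_unit >> 10) == (0xDC00 >> 10)) {
            return (uint32_t)-1;
        }
        return code_unit;
    }
    /* Tail copied unchanged from the example above. */
    *prev_surrogate = 0;
    if ((code_unit >> 10) != (0xDC00 >> 10)) {
        return (uint32_t)-1;
    }
    return (((uint32_t)*prev_surrogate << 10) + code_unit -
            ((uint32_t)0xD800 << 10) + 0x10000 - 0xDC00);
}
```

Since only the order of two equivalent error returns differs, the choice between the two source orderings should ideally not affect the generated code size.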