https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115506
Bug ID: 115506
Summary: Possible but missed "cmp" instruction merging (x86 & ARM, optimization)
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: Explorer09 at gmail dot com
Target Milestone: ---

I'm not sure if the GCC developers are interested in this optimization or if this issue has been reported before.

----

(Compiler Explorer link: https://godbolt.org/z/vqqf865cb)

```c
#include <stdint.h>

// Modified from <https://gitlab.com/-/snippets/3718423>
uint32_t utf8_to_code_point(const uint8_t **sequence) {
    if (**sequence <= 0x7F) {
        return *(*sequence)++;
    }
    unsigned int max_length;
    uint32_t min_code_point;
    if ((**sequence & 0xF0) == 0xE0) { /* NOTE 0 */
        max_length = 3;
        min_code_point = 0x0800;
    } else if ((**sequence & 0xF0) < 0xE0) { /* NOTE 1 */
        max_length = 2;
        min_code_point = 0x80;
    } else {
        max_length = 4;
        min_code_point = 0x10000;
    }
    uint32_t code_point = (uint32_t)**sequence - (0xFF & ~(0xFF >> max_length));
    while (1) {
        (*sequence)++;
        if (--max_length == 0) {
            break;
        }
        unsigned int code_offset = (unsigned int)**sequence - 0x80;
        if (code_offset > 0x3F) {
            return (uint32_t)-1;
        }
        code_point = (code_point << 6) + code_offset;
    }
    if (code_point < min_code_point) {
        return (uint32_t)-1;
    }
    if (code_point >= 0xD800 && code_point <= 0xDFFF) {
        return (uint32_t)-1;
    }
    return code_point;
}
```

I wrote this code expecting the compiler to merge the two "cmp" instructions, so that the flags set by a single compare could be used by both branches. But instead I get this:

```x86asm
# ...
movl %eax, %ecx
andl $-16, %ecx
cmpb $-32, %cl
je .L8
cmpb $-33, %cl
ja .L9
# ...
```

The immediate constant of the second "cmp" instruction gets adjusted by one (I can guess the reason for that), so GCC misses this particular chance to merge the comparisons. A workaround is to modify the line marked with the `/* NOTE 1 */` comment to use `<=` instead of `<`.
But I wish GCC could detect this optimization opportunity better.
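
The workaround is safe only because the `== 0xE0` case is already handled by the branch marked `/* NOTE 0 */`, so `<` and `<=` select the same lead bytes in the `/* NOTE 1 */` branch. The sketch below checks that equivalence over all 256 byte values. The helper names `max_length_lt`, `max_length_le` and `count_mismatches` are hypothetical and only isolate the classification step from the report. As a side note, the immediates `-32` and `-33` in the listing are `0xE0` and `0xDF` when read as unsigned bytes, which is consistent with `x < 0xE0` having been rewritten as `x <= 0xDF`; that reading is an assumption, not something the report confirms.

```c
#include <stdint.h>

/* Hypothetical helper: lead-byte classification as written in the report,
   with `<` on the NOTE 1 line. Returns the expected sequence length. */
unsigned int max_length_lt(uint8_t lead) {
    if ((lead & 0xF0) == 0xE0) {        /* NOTE 0 */
        return 3;
    } else if ((lead & 0xF0) < 0xE0) {  /* NOTE 1, original form */
        return 2;
    }
    return 4;
}

/* Hypothetical helper: same classification with the workaround applied,
   so both comparisons use the same constant 0xE0. */
unsigned int max_length_le(uint8_t lead) {
    if ((lead & 0xF0) == 0xE0) {        /* NOTE 0 */
        return 3;
    } else if ((lead & 0xF0) <= 0xE0) { /* NOTE 1, workaround form */
        return 2;
    }
    return 4;
}

/* Counts the byte values on which the two variants disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (unsigned int b = 0; b <= 0xFF; b++) {
        if (max_length_lt((uint8_t)b) != max_length_le((uint8_t)b)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Calling `count_mismatches()` from a small `main` should report no disagreement, which is why swapping `<` for `<=` changes only the generated code and not the decoder's behavior.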