https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115506

            Bug ID: 115506
           Summary: Possible but missed "cmp" instruction merging (x86 &
                    ARM, optimization)
           Product: gcc
           Version: 14.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

I'm not sure if the GCC developers are interested in this optimization or if
this issue has been reported before.

----

(Compiler Explorer link: https://godbolt.org/z/vqqf865cb)

```c
#include <stdint.h>

// Modified from <https://gitlab.com/-/snippets/3718423>
uint32_t utf8_to_code_point(const uint8_t **sequence) {
    if (**sequence <= 0x7F) {
        return *(*sequence)++;
    }

    unsigned int max_length;
    uint32_t min_code_point;
    if ((**sequence & 0xF0) == 0xE0) { /* NOTE 0 */
        max_length = 3;
        min_code_point = 0x0800;
    } else if ((**sequence & 0xF0) < 0xE0) { /* NOTE 1 */
        max_length = 2;
        min_code_point = 0x80;
    } else {
        max_length = 4;
        min_code_point = 0x10000;
    }

    uint32_t code_point = (uint32_t)**sequence - (0xFF & ~(0xFF >> max_length));
    while (1) {
        (*sequence)++;
        if (--max_length == 0) {
            break;
        }
        unsigned int code_offset = (unsigned int)**sequence - 0x80;
        if (code_offset > 0x3F) {
            return (uint32_t)-1;
        }
        code_point = (code_point << 6) + code_offset;
    }

    if (code_point < min_code_point) {
        return (uint32_t)-1;
    }
    if (code_point >= 0xD800 && code_point <= 0xDFFF) {
        return (uint32_t)-1;
    }
    return code_point;
}
```

I wrote this code expecting the compiler to merge the two "cmp" instructions,
so that the flags set by a single comparison are used by both branches.

But instead I get this:

```x86asm
#   ...
    movl    %eax, %ecx
    andl    $-16, %ecx
    cmpb    $-32, %cl
    je      .L8
    cmpb    $-33, %cl
    ja      .L9
#   ...
```

The immediate constant of the second "cmp" instruction gets adjusted by one
(I can guess the reason for that), so GCC misses this particular chance to
merge the comparisons.
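
To illustrate the adjusted constant, here is a minimal sketch. It assumes the second comparison is `< 0xE0` rewritten as `<= 0xDF`, which is my reading of the assembly and not something confirmed in this report. The helper names are hypothetical. The 8-bit immediates `$-32` and `$-33` are `0xE0` and `0xDF` when viewed as unsigned bytes, and the two forms of the test agree for every input byte.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names come from
   the report. */

/* The test as written at NOTE 1: masked byte strictly below 0xE0. */
static int below_e0(uint8_t byte) {
    return (byte & 0xF0) < 0xE0;
}

/* The same test with the constant lowered by one. This matches what
   "cmpb $-33" followed by "ja" appears to check (assumption: "ja"
   is the branch taken when the test fails). */
static int at_most_df(uint8_t byte) {
    return (byte & 0xF0) <= 0xDF;
}

/* An 8-bit immediate as printed in the assembly, viewed as unsigned. */
static unsigned int imm8_as_unsigned(int imm) {
    return (uint8_t)imm;
}
```

Since both forms give the same answer, the rewrite is harmless on its own. It only matters here because the first comparison still uses `0xE0`, so the two "cmp" instructions no longer share an operand.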

A workaround is to change the line marked with the `/* NOTE 1 */` comment to
use `<=` instead of `<`. The result is the same, because the equal case is
already taken by the `/* NOTE 0 */` branch. But I wish GCC could detect this
optimization opportunity better.
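
A minimal sketch of the workaround, reduced to the branch that selects `max_length`. The function names are hypothetical, and only the branch structure is taken from the code above. Both forms return the same length for every lead byte, so switching to `<=` should not change behavior.

```c
#include <stdint.h>

/* Hypothetical helper names; only the if/else structure is from the
   report. Both functions expect a non-ASCII lead byte (0x80..0xFF). */

/* Original form: "==" followed by "<". */
static unsigned int max_length_lt(uint8_t lead) {
    if ((lead & 0xF0) == 0xE0) {        /* NOTE 0 */
        return 3;
    } else if ((lead & 0xF0) < 0xE0) {  /* NOTE 1, as reported */
        return 2;
    } else {
        return 4;
    }
}

/* Workaround form: "==" followed by "<=" against the same constant.
   The equal case never reaches the second test, so "<=" and "<"
   select the same branch. */
static unsigned int max_length_le(uint8_t lead) {
    if ((lead & 0xF0) == 0xE0) {        /* NOTE 0 */
        return 3;
    } else if ((lead & 0xF0) <= 0xE0) { /* NOTE 1, workaround */
        return 2;
    } else {
        return 4;
    }
}
```

With both tests written against `0xE0`, the source already has the shape that lets one comparison feed both branches.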
