https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088

            Bug ID: 114088
           Summary: Please provide __builtin_c16slen and __builtin_c32slen
                    to complement __builtin_wcslen
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thiago at kde dot org
  Target Milestone: ---

Actually, GCC doesn't have __builtin_wcslen, but Clang does. Providing these
extra two builtins would allow implementing __builtin_wcslen too. The names are
not part of the C standard, but follow its existing naming pattern, in which
"c16" and "c32" take the place of "wc" (compare "mbrtowc" with "mbrtoc16" and
"mbrtoc32").
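
To make the request concrete, here is a minimal sketch of the semantics the two
builtins would presumably have, written as ordinary constexpr functions. The
names c16slen_sketch and c32slen_sketch are hypothetical stand-ins chosen for
this illustration; they are not existing GCC or C library functions, and the
real builtins might differ in signature.

```cpp
#include <cstddef>

// Hypothetical stand-in for the requested __builtin_c16slen: counts
// char16_t code units up to, but not including, the terminating zero.
constexpr std::size_t c16slen_sketch(const char16_t *s)
{
    std::size_t n = 0;
    while (s[n] != u'\0')
        ++n;
    return n;
}

// Hypothetical stand-in for the requested __builtin_c32slen.
constexpr std::size_t c32slen_sketch(const char32_t *s)
{
    std::size_t n = 0;
    while (s[n] != U'\0')
        ++n;
    return n;
}

// In a constant expression the loop is evaluated by the compiler.
static_assert(c16slen_sketch(u"Hello") == 5, "terminator is not counted");
static_assert(c32slen_sketch(U"Hello") == 5, "terminator is not counted");
```

A builtin would let the compiler apply the same folding outside constant
expressions, as it already does for strlen.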

My specific need is actually to implement char16_t string containers in C++.
I'm particularly interested in QString/QStringView, but this applies to
std::basic_string{_view} too.

For example:

#include <string_view>

std::string_view f1() { return "Hello"; }
std::wstring_view fw() { return L"Hello"; }
std::u16string_view f16() { return u"Hello"; }
std::u32string_view f32() { return U"Hello"; }

With GCC and libstdc++, the first function produces optimal code:
        movl    $5, %eax
        leaq    .LC0(%rip), %rdx
        ret

For the wchar_t case, GCC emits an out-of-line call to wcslen:
        pushq   %rbx
        leaq    .LC2(%rip), %rbx
        movq    %rbx, %rdi
        call    wcslen@PLT
        movq    %rbx, %rdx
        popq    %rbx
        ret

The next two, because of the absence of a C library function, emit a loop
(the char16_t version is shown):
        xorl    %eax, %eax
        leaq    .LC1(%rip), %rcx
.L4:
        incq    %rax
        cmpw    $0, (%rcx,%rax,2)
        jne     .L4
        movq    %rcx, %rdx
        ret

Clang, meanwhile, emits optimal code for all four, and so did the pre-Clang
Intel compiler. See https://gcc.godbolt.org/z/qvj7qnYbz. MSVC emits optimal
code for the char and wchar_t versions, but loops for the other two.

Clang gives up when the string gets longer, though. See
https://gcc.godbolt.org/z/54j3zr6e6. That indicates that it gave up on
evaluating the loop at compile time and would do better if the intrinsic were
present.
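
For comparison, a sketch of a source-level workaround that is available today
for literals, assuming C++17: declaring the view constexpr makes the length
computation a required constant evaluation. The function names f16_constant
and f32_constant are made up for this illustration. This does not help when
the view is built outside a constant expression, which is the case the
builtins are meant to cover.

```cpp
#include <string_view>

std::u16string_view f16_constant()
{
    // constexpr requires the constructor, and therefore the length
    // computation, to be evaluated during compilation.
    constexpr std::u16string_view sv = u"Hello";
    return sv;
}

std::u32string_view f32_constant()
{
    constexpr std::u32string_view sv = U"Hello";
    return sv;
}
```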
