https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114088
Bug ID: 114088
Summary: Please provide __builtin_c16slen and __builtin_c32slen to complement __builtin_wcslen
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: thiago at kde dot org
Target Milestone: ---

Actually, GCC doesn't have __builtin_wcslen, but Clang does. Providing these extra two builtins would allow implementing __builtin_wcslen too.

The names are not part of the C standard, but follow the current naming construction rules for it, similar to how "mbrtowc" and "wcslen" parallel.

My specific need is actually to implement char16_t string containers in C++. I'm particularly interested in QString/QStringView, but this applies to std::basic_string{_view} too. For example:

    std::string_view f1() { return "Hello"; }
    std::wstring_view fw() { return L"Hello"; }
    std::u16string_view f16() { return u"Hello"; }
    std::u32string_view f32() { return U"Hello"; }

With GCC and libstdc++, the first function produces optimal code:

        movl    $5, %eax
        leaq    .LC0(%rip), %rdx
        ret

For the wchar_t case, GCC emits an out-of-line call to wcslen:

        pushq   %rbx
        leaq    .LC2(%rip), %rbx
        movq    %rbx, %rdi
        call    wcslen@PLT
        movq    %rbx, %rdx
        popq    %rbx
        ret

The next two, because of the absence of a C library function, emit a loop:

        xorl    %eax, %eax
        leaq    .LC1(%rip), %rcx
    .L4:
        incq    %rax
        cmpw    $0, (%rcx,%rax,2)
        jne     .L4
        movq    %rcx, %rdx
        ret

Clang, meanwhile, emits optimal code for all four, and so did the pre-Clang Intel compiler. See https://gcc.godbolt.org/z/qvj7qnYbz. MSVC emits optimal code for the char and wchar_t versions, but loops for the other two.

Clang gives up when the string gets longer, though. See https://gcc.godbolt.org/z/54j3zr6e6. That indicates that it gave up on guessing the loop run and would do better if the intrinsic were present.
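To illustrate the kind of length computation the requested builtins would cover, here is a minimal sketch of a user-side workaround. It assumes C++17; `c16slen_sketch` and `f16_workaround` are hypothetical names chosen for this example, not GCC builtins or libstdc++ functions. The length is forced through a constexpr variable so it becomes a compile-time constant rather than depending on the optimizer evaluating the loop.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not a GCC builtin): counts char16_t code units up to,
// but not including, the terminating null.
constexpr std::size_t c16slen_sketch(const char16_t* s) {
    std::size_t n = 0;
    while (s[n] != u'\0')
        ++n;
    return n;
}

// Hypothetical variant of f16() from the report. Initializing a constexpr
// variable requires constant evaluation, so no run-time loop is needed to
// obtain the length.
std::u16string_view f16_workaround() {
    constexpr std::size_t len = c16slen_sketch(u"Hello");
    return std::u16string_view(u"Hello", len);
}
```

This only helps when the caller can spell out the literal at the point of construction. A generic container constructor taking `const char16_t*` cannot do this, which is the case the builtins are requested for.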