On Mon, Aug 18, 2025 at 06:40:04PM +0200, Walter Alejandro Iglesias wrote:
> Question for the experts. Let's take the following example:
>
> ----->8------------->8--------------------
> #include <stdio.h>
> #include <string.h>
> #include <wchar.h>
>
> #define period 0x2e
> #define question 0x3f
> #define exclam 0x21
> #define ellipsis L'\u2026'
>
> const wchar_t p[] = { period, question, exclam, ellipsis };
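
This is not a string, as it is not NUL terminated. wcsspn() reads past
the end of the array and treats whatever garbage follows as part of the
accept set until it hits a NUL, so the result depends on what the
compiler happens to place after p[].
Declaring it as const wchar_t p[5] = { period, question, exclam, ellipsis }
(the fifth element is then zero-initialized) and/or initializing it as
{ period, question, exclam, ellipsis, L'\0' } should work.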
-Otto
>
> int
> main()
> {
> const wchar_t s[] = L". Hello.";
>
> printf("%ls\n", s);
> printf("%zu\n", wcsspn(s, p));
>
> return 0;
> }
> -------------8<-----------8<----------------
>
>
> Now run:
>
> $ cc -Wall example.c -o example && ./example
> . Hello.
> 8
> $ egcc -Wall example.c -o example && ./example
> . Hello.
> 1
>
> As you see, compiled with GCC the program does what is expected. To get
> the desired result with CLANG you have to write the string literally.
> Change the declaration of p[] above to:
>
> const wchar_t p[] = L".?!…";
>                          ^ This is a UTF-8 ellipsis.
>
> And now:
>
> $ cc -Wall example.c -o example && ./example
> . Hello.
> 1
>
> Using only ASCII or only UTF-8 in the array also works.
>
> Is this a bug in clang's wcsspn() or I'm wrong in assuming that the
> array can be declared in the way I did?
>
>
> --
> Walter
>