On 08/02/18 22:34, Martin Sebor wrote: > On 08/02/2018 12:56 PM, Bernd Edlinger wrote: >> On 08/02/18 15:26, Bernd Edlinger wrote: >>>> >>>> /* If the length can be computed at compile-time, return it. */ >>>> - len = c_strlen (src, 0); >>>> + tree array; >>>> + tree len = c_strlen (src, 0, &array); >>> >>> You know the c_strlen tries to compute wide character sizes, >>> but strlen does not do that, strlen (L"abc") should give 1 >>> (or 0 on a BE machine) >>> I wonder if that is correct. >>> >> [snip] >>>> >>>> static tree >>>> -fold_builtin_strlen (location_t loc, tree type, tree arg) >>>> +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg) >>>> { >>>> if (!validate_arg (arg, POINTER_TYPE)) >>>> return NULL_TREE; >>>> else >>>> { >>>> - tree len = c_strlen (arg, 0); >>>> - >>>> + tree arr = NULL_TREE; >>>> + tree len = c_strlen (arg, 0, &arr); >>> >>> Is it possible to write a test case where strlen(L"test") reaches this >>> point? >>> what will c_strlen return then? >>> >> >> Yes, of course it is: >> >> $ cat y.c >> int f(char *x) >> { >> return __builtin_strlen(x); >> } >> >> int main () >> { >> return f((char*)&L"abcdef"[0]); >> } >> $ gcc -O3 -S y.c >> $ cat y.s >> main: >> .LFB1: >> .cfi_startproc >> movl $6, %eax >> ret >> .cfi_endproc >> >> The reason is that c_strlen tries to fold wide chars at all. >> I do not know when that was introduced, was that already before your last >> patches? > > The function itself was introduced in 1992 if not earlier, > before wide strings even existed. AFAICS, it has always > accepted strings of all widths. Until r241489 (in GCC 7) > it computed their length in bytes, not characters. I don't > know if that was on purpose or if it was just never changed > to compute the length in characters when wide strings were > first introduced. From the name I assume it's the latter. > The difference wasn't detected until sprintf tests were added > for wide string directives. The ChangeLog description for > the change reads: Correctly handle wide strings. I didn't > consider pathological cases like strlen (L"abc"). It > shouldn't be difficult to change to fix this case. >
Oh, oh, oh.... $ cat y3.c int main () { char c[100]; int x = __builtin_sprintf (c, "%S", L"\uFFFF"); __builtin_printf("%d %ld\n", x,__builtin_strlen(c)); } $ gcc-4.8 -O3 -std=c99 y3.c $ ./a.out -1 0 $ gcc -O3 y3.c $ ./a.out 1 0 $ echo $LANG de_DE.UTF-8 I would have expected L"\uFFFF" to converted to UTF-8 or another encoding, so the return value if sprintf is far from obvious, and probably language dependent. Why do you think it is a good idea to use really every opportunity to optimize totally unnecessary things like using the return value from the sprintf function as it is? Did you never think this adds a significant maintenance burden on the rest of us as well? Bernd.