On 08/03/2018 07:00 AM, Bernd Edlinger wrote: > On 08/02/18 22:34, Martin Sebor wrote: >> On 08/02/2018 12:56 PM, Bernd Edlinger wrote: >>> On 08/02/18 15:26, Bernd Edlinger wrote: >>>>> >>>>> /* If the length can be computed at compile-time, return it. */ >>>>> - len = c_strlen (src, 0); >>>>> + tree array; >>>>> + tree len = c_strlen (src, 0, &array); >>>> >>>> You know the c_strlen tries to compute wide character sizes, >>>> but strlen does not do that, strlen (L"abc") should give 1 >>>> (or 0 on a BE machine) >>>> I wonder if that is correct. >>>> >>> [snip] >>>>> >>>>> static tree >>>>> -fold_builtin_strlen (location_t loc, tree type, tree arg) >>>>> +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg) >>>>> { >>>>> if (!validate_arg (arg, POINTER_TYPE)) >>>>> return NULL_TREE; >>>>> else >>>>> { >>>>> - tree len = c_strlen (arg, 0); >>>>> - >>>>> + tree arr = NULL_TREE; >>>>> + tree len = c_strlen (arg, 0, &arr); >>>> >>>> Is it possible to write a test case where strlen(L"test") reaches this >>>> point? >>>> what will c_strlen return then? >>>> >>> >>> Yes, of course it is: >>> >>> $ cat y.c >>> int f(char *x) >>> { >>> return __builtin_strlen(x); >>> } >>> >>> int main () >>> { >>> return f((char*)&L"abcdef"[0]); >>> } >>> $ gcc -O3 -S y.c >>> $ cat y.s >>> main: >>> .LFB1: >>> .cfi_startproc >>> movl $6, %eax >>> ret >>> .cfi_endproc >>> >>> The reason is that c_strlen tries to fold wide chars at all. >>> I do not know when that was introduced, was that already before your last >>> patches? >> >> The function itself was introduced in 1992 if not earlier, >> before wide strings even existed. AFAICS, it has always >> accepted strings of all widths. Until r241489 (in GCC 7) >> it computed their length in bytes, not characters. I don't >> know if that was on purpose or if it was just never changed >> to compute the length in characters when wide strings were >> first introduced. From the name I assume it's the latter. >> The difference wasn't detected until sprintf tests were added >> for wide string directives. The ChangeLog description for >> the change reads: Correctly handle wide strings. I didn't >> consider pathological cases like strlen (L"abc"). It >> shouldn't be difficult to change to fix this case. >> > > Oh, oh, oh.... > > $ cat y3.c > int main () > { > char c[100]; > int x = __builtin_sprintf (c, "%S", L"\uFFFF"); > > __builtin_printf("%d %ld\n", x,__builtin_strlen(c)); > } > > $ gcc-4.8 -O3 -std=c99 y3.c > $ ./a.out > -1 0 > $ gcc -O3 y3.c > $ ./a.out > 1 0 > $ echo $LANG > de_DE.UTF-8 > > I would have expected L"\uFFFF" to converted to UTF-8 > or another encoding, so the return value if sprintf is > far from obvious, and probably language dependent. FWIW, Martin has a patch (under review) that I think will fix this and includes a testcase that is likely inspired by the code above.
> > Why do you think it is a good idea to use really every > opportunity to optimize totally unnecessary things like > using the return value from the sprintf function as it is? > > Did you never think this adds a significant maintenance > burden on the rest of us as well? It largely came along for free during the implementation of the sprintf warnings. That's changed a bit over time, but it's still the case that the sprintf warnings do all the analysis necessary to optimize the sprintf return value. As both Martin and I have stated before the real goal is getting good warnings from sprintf. Optimization is a distant second. jeff