Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )

Bernd Edlinger Fri, 03 Aug 2018 06:01:39 -0700

On 08/02/18 22:34, Martin Sebor wrote:
> On 08/02/2018 12:56 PM, Bernd Edlinger wrote:
>> On 08/02/18 15:26, Bernd Edlinger wrote:
>>>>
>>>>    /* If the length can be computed at compile-time, return it.  */
>>>> -  len = c_strlen (src, 0);
>>>> +  tree array;
>>>> +  tree len = c_strlen (src, 0, &array);
>>>
>>> You know the c_strlen tries to compute wide character sizes,
>>> but strlen does not do that, strlen (L"abc") should give 1
>>> (or 0 on a BE machine)
>>> I wonder if that is correct.
>>>
>> [snip]
>>>>
>>>>  static tree
>>>> -fold_builtin_strlen (location_t loc, tree type, tree arg)
>>>> +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg)
>>>>  {
>>>>    if (!validate_arg (arg, POINTER_TYPE))
>>>>      return NULL_TREE;
>>>>    else
>>>>      {
>>>> -      tree len = c_strlen (arg, 0);
>>>> -
>>>> +      tree arr = NULL_TREE;
>>>> +      tree len = c_strlen (arg, 0, &arr);
>>>
>>> Is it possible to write a test case where strlen(L"test") reaches this 
>>> point?
>>> what will c_strlen return then?
>>>
>>
>> Yes, of course it is:
>>
>> $ cat y.c
>> int f(char *x)
>> {
>>    return __builtin_strlen(x);
>> }
>>
>> int main ()
>> {
>>    return f((char*)&L"abcdef"[0]);
>> }
>> $ gcc -O3 -S y.c
>> $ cat y.s
>> main:
>> .LFB1:
>>     .cfi_startproc
>>     movl    $6, %eax
>>     ret
>>     .cfi_endproc
>>
>> The reason is that c_strlen tries to fold wide chars at all.
>> I do not know when that was introduced, was that already before your last 
>> patches?
> 
> The function itself was introduced in 1992 if not earlier,
> before wide strings even existed.  AFAICS, it has always
> accepted strings of all widths.  Until r241489 (in GCC 7)
> it computed their length in bytes, not characters.  I don't
> know if that was on purpose or if it was just never changed
> to compute the length in characters when wide strings were
> first introduced.  From the name I assume it's the latter.
> The difference wasn't detected until sprintf tests were added
> for wide string directives.  The ChangeLog description for
> the change reads: Correctly handle wide strings.  I didn't
> consider pathological cases like strlen (L"abc").  It
> shouldn't be difficult to change to fix this case.
>


Oh, oh, oh....

$ cat y3.c
int main ()
{
   char c[100];
   int x = __builtin_sprintf (c, "%S", L"\uFFFF");

   __builtin_printf("%d %ld\n", x,__builtin_strlen(c));
}

$ gcc-4.8 -O3 -std=c99 y3.c
$ ./a.out
-1 0
$ gcc -O3 y3.c
$ ./a.out
1 0
$ echo $LANG
de_DE.UTF-8

I would have expected L"\uFFFF" to converted to UTF-8
or another encoding, so the return value if sprintf is
far from obvious, and probably language dependent.

Why do you think it is a good idea to use really every
opportunity to optimize totally unnecessary things like
using the return value from the sprintf function as it is?

Did you never think this adds a significant maintenance
burden on the rest of us as well?


Bernd.

Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )

Reply via email to