On Sat, Jan 05, 2019 at 12:20:24PM +0000, Bernd Edlinger wrote:
> co-incidentally my "Make strlen range computations more conservative" patch
> contained a fix on the same spot, I just did not have a test case for it:
> 
> @@ -3184,7 +3146,10 @@ get_min_string_length (tree rhs, bool *f
>        && TREE_READONLY (rhs))
>      rhs = DECL_INITIAL (rhs);
> 
> -  if (rhs && TREE_CODE (rhs) == STRING_CST)
> +  if (rhs && TREE_CODE (rhs) == STRING_CST
> +      && tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (rhs)))) == 1
> +      && TREE_STRING_LENGTH (rhs) > 0
> +      && TREE_STRING_POINTER (rhs) [TREE_STRING_LENGTH (rhs) - 1] == '\0')
>      {
>        *full_string_p = true;
>        return strlen (TREE_STRING_POINTER (rhs));
> 
> 
> additionally to your patch this tests the string is in fact a single-byte 
> string.
> since strlen returns garbage otherwise.

Multi-byte STRING_CSTs seem to be stored in the target byte order, e.g.
L"abcde" in an x86_64-linux -> powerpc64-linux cross is still
"\000\000\000a\000\000\000b\000\000\000c\000\000\000d\000\000\000\000"
so I think strlen is exactly what we want.  The tree-ssa-strlen.c pass
doesn't track string lengths in units of the character type; it tracks the
number of non-zero bytes followed by a zero byte known at a certain address,
or, if no zero byte is known, the minimum number of non-zero bytes known at
that address.
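
To illustrate the byte-wise view, here is a small standalone sketch (not GCC
code; store_wide_big_endian and wide_abcde_byte_strlen are hypothetical names,
and a 4-byte wide character type is assumed).  It lays out the wide
characters of L"abcde" in big-endian order and runs plain strlen over the
bytes:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store N 32-bit wide characters from SRC into OUT
   in big-endian byte order, as a big-endian target would lay them out.  */
static void
store_wide_big_endian (const unsigned int *src, size_t n, unsigned char *out)
{
  for (size_t i = 0; i < n; i++)
    {
      out[4 * i + 0] = (src[i] >> 24) & 0xff;
      out[4 * i + 1] = (src[i] >> 16) & 0xff;
      out[4 * i + 2] = (src[i] >> 8) & 0xff;
      out[4 * i + 3] = src[i] & 0xff;
    }
}

/* Byte-wise strlen of the big-endian image of L"abcde" (with terminator).  */
static size_t
wide_abcde_byte_strlen (void)
{
  static const unsigned int w[] = { 'a', 'b', 'c', 'd', 'e', 0 };
  unsigned char bytes[sizeof w / sizeof w[0] * 4];

  store_wide_big_endian (w, sizeof w / sizeof w[0], bytes);
  /* Counts non-zero bytes up to the first zero byte, nothing more.  */
  return strlen ((const char *) bytes);
}
```

With this layout the very first byte is zero, so the byte-wise length is 0,
which is the number of non-zero bytes followed by a zero byte at that address.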

And, your patch also has the bad effect that it won't return any length for
strings that aren't zero terminated.  We do want to return 9 for the
"abcdefghi" string without a zero terminator at the end; we just need to set
*full_string_p to false.  And, for the "abcd\000fghi" string without zero
termination at the end, we do want to return 4 and set *full_string_p to
true.
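
A minimal standalone sketch of those semantics (again not the actual
get_min_string_length; min_string_length and its SIZE parameter are
hypothetical, with SIZE standing in for the number of bytes stored in the
constant):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return the number of non-zero bytes at the start of the SIZE-byte
   buffer PTR.  Set *FULL_STRING_P to true if a zero byte follows them
   within the buffer, false if the buffer ends first.  */
static size_t
min_string_length (const char *ptr, size_t size, bool *full_string_p)
{
  const char *nul = memchr (ptr, '\0', size);

  *full_string_p = nul != NULL;
  return nul ? (size_t) (nul - ptr) : size;
}
```

For the 9 bytes of "abcdefghi" this gives 9 with *full_string_p false, and
for the 9 bytes of "abcd\000fghi" it gives 4 with *full_string_p true.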

        Jakub
