Ugh.  I should have caught this earlier.

My Fedora tester failed recently on the "flatbuffers" package.  It
worked on Oct 6th GCC snapshot, but was failing by Oct 13th snapshot.

Bisection ultimately landed on:

> commit d9d534895b775a453b8d8d291ef72d6dfa5f9e52 (HEAD, refs/bisect/bad)
> Author: msebor <msebor@138bc75d-0d04-0410-961f-82ee72b054a4>
> Date:   Wed Oct 9 21:35:11 2019 +0000
> 
>     PR tree-optimization/90879 - fold zero-equality of strcmp between a 
> longer string and a smaller array
>     
>     gcc/c-family/ChangeLog:
>     
>             PR tree-optimization/90879
>             * c.opt (-Wstring-compare): New option.
>     
>     gcc/testsuite/ChangeLog:
>     
>             PR tree-optimization/90879
>             * gcc.dg/Wstring-compare-2.c: New test.
>             * gcc.dg/Wstring-compare.c: New test.
>             * gcc.dg/strcmpopt_3.c: Scan the optmized dump instead of strlen.
>             * gcc.dg/strcmpopt_6.c: New test.
>             * gcc.dg/strlenopt-65.c: Remove uinnecessary declarations, add
>             test cases.
>             * gcc.dg/strlenopt-66.c: Run it.
>             * gcc.dg/strlenopt-68.c: New test.
>     
>     gcc/ChangeLog:
>     
>             PR tree-optimization/90879
>             * builtins.c (check_access): Avoid using maxbound when null.
>             * calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen 
> change.
>             * doc/invoke.texi (-Wstring-compare): Document new warning option.
>             * gimple-fold.c (get_range_strlen_tree): Make setting maxbound
>             conditional.
>             (get_range_strlen): Overwrite initial maxbound when non-null.
>             * gimple-ssa-sprintf.c (get_string_length): Adjust to 
> get_range_strlen
>             changes.
>             * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
>             (used_only_for_zero_equality): New function.
>             (handle_builtin_memcmp): Call it.
>             (determine_min_objsize): Return an integer instead of tree.
>             (get_len_or_size, strxcmp_eqz_result): New functions.
>             (maybe_warn_pointless_strcmp): New function.
>             (handle_builtin_string_cmp): Call it.  Fold zero-equality of 
> strcmp
>             between a longer string and a smaller array.
>             (get_range_strlen_dynamic): Overwrite initial maxbound when 
> non-null.


But I think the root of the problem is actually here:

> commit 72dbc21dbbdd08cd4047d68bce3154dc485a4255
> Author: qinzhao <qinzhao@138bc75d-0d04-0410-961f-82ee72b054a4>
> Date:   Thu May 31 20:01:42 2018 +0000
> 
>     2nd Patch for PR78009
>     Patch for PR83026
>     
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
>     Inline strcmp with small constant strings
>     
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
>     missing strlen optimization for strcmp of unequal strings
>     
>     The design doc for PR78809 is at:
>     https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
>     
>     this patch is for the second part of change of PR78809 and PR83026:
>     
>     B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
>     
>        B.1. (PR83026) When the lengths of both arguments are constant and
>             it's a strcmp:
>           * if the lengths are NOT equal, we can safely fold the call
>             to a non-zero value.
>           * otherwise, do nothing now.
>     
>        B.2. (PR78809) When the length of one argument is constant, try to 
> replace
>        the call with a __builtin_str(n)cmp_eq call where possible, i.e:
>     
>        strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR 
> is a
>        string with constant length, C is a constant.
>          if (C <= strlen(STR) && sizeof_array(s) > C)
>            {
>              replace this call with
>              __builtin_strncmp_eq (s, STR, C) (!)= 0
>            }
>          if (C > strlen(STR)
>            {
>              it can be safely treated as a call to strcmp (s, STR) (!)= 0
>              can handled by the following strcmp.
>            }
>     
>        strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
>        string with constant length.
>          if  (sizeof_array(s) > strlen(STR))
>            {
>              replace this call with
>              __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
>            }
>     
>        later when expanding the new __builtin_str(n)cmp_eq calls, first 
> expand them
>        as __builtin_memcmp_eq, if the expansion does not succeed, change them 
> back
>        to call to __builtin_str(n)cmp.
>     
>     adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
>     PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026

Qing's patch introduced determine_min_objsize, which has this chunk of code:

>  /* Try to determine the size of the object from its type.  */
>   if (TREE_CODE (dest) != ADDR_EXPR)
>     return HOST_WIDE_INT_M1U;
> 
>   tree type = TREE_TYPE (dest);
>   if (TREE_CODE (type) == POINTER_TYPE)
>     type = TREE_TYPE (type);
> 
>   type = TYPE_MAIN_VARIANT (type);
> 
>   /* The size of a flexible array cannot be determined.  Otherwise,
>      for arrays with more than one element, return the size of its
>      type.  GCC itself misuses arrays of both zero and one elements
>      as flexible array members so they are excluded as well.  */
>   if (TREE_CODE (type) != ARRAY_TYPE
>       || !array_at_struct_end_p (dest))
>     {
>       tree type_size = TYPE_SIZE_UNIT (type);
>       if (type_size && TREE_CODE (type_size) == INTEGER_CST
>           && !integer_onep (type_size)
>           && !integer_zerop (type_size))
>         return tree_to_uhwi (type_size);
>     }

Alarm bells should be going off at this point.  We're using the size of
a type to determine an object's size which is in turn used for code
generation/optimization purposes.  That's not safe.  We can use a DECL's
size when we can map an SSA_NAME back to a suitable DECL, but we can not
use a type size like that.

WIthin flatbuffers we have this fragment:

> ;;   basic block 52, loop depth 2, count 11840041856 (estimated locally), 
> maybe hot
> ;;    prev block 51, next block 398, flags: (NEW, REACHABLE, VISITED)
> ;;    pred:       51 [always]  count:4203214856 (estimated locally) 
> (FALLTHRU,EXECUTABLE)
> ;;                49 [50.0% (guessed)]  count:5920020928 (estimated locally) 
> (FALSE_VALUE,EXECUTABLE)
> ;;                50 [29.0% (guessed)]  count:1716806064 (estimated locally) 
> (FALSE_VALUE,EXECUTABLE)
>   # iftmp.78_557 = PHI <iftmp.78_556(51), 0B(49), 0B(50)>
>   _541 = &MEM <const uoffset_t> [(void *)iftmp.78_557 + 4B];
>   _542 = strcmp (_541, "priority");
>   if (_542 > 0)
>     goto <bb 398>; [41.00%]
>   else
>     goto <bb 53>; [59.00%]
> ;;    succ:       398 [41.0% (guessed)]  count:4854417119 (estimated locally) 
> (TRUE_VALUE,EXECUTABLE)
> ;;                53 [59.0% (guessed)]  count:6985624737 (estimated locally) 
> (FALSE_VALUE,EXECUTABLE)

We then assume that the size the memory object pointed to by _541 is 4
bytes (uoffset_t is a uint32_t).

That in turn allows Martin's new code to optimize away the strcmp
because "priority" can never be equal to any 4 byte object (the call
chain eventually gets down into determine_min_objsize).

Of course the object in question is actually more than 4 bytes and
sometimes actually has the value "priority", so removing the test is, of
course, wrong.

Anyway, I'm not going to be able to finish chasing this down today, but
wanted to get the info out there ASAP so that others don't start chasing
the same underlying issue.


jeff

ps.  I can confirm this is the same issue affecting acpica-tools and
hopefully others.

Reply via email to