[Bug sanitizer/70683] New: [6 Regression] -fcompare-debug bug with -fsanitize=address

jakub at gcc dot gnu.org Fri, 15 Apr 2016 07:32:51 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70683


            Bug ID: 70683
           Summary: [6 Regression] -fcompare-debug bug with
                    -fsanitize=address
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: sanitizer
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
                CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
                    jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
  Target Milestone: ---

On a testcase that has been provided privately to me and I can't share (and
that can't be easily reduced), I'm seeing differences in sanopt optimization,
where an ASAN_CHECK is optimized away in only one of the two compilations
(without -gtoggle vs. with -gtoggle).
maybe_optimize_asan_check_ifn works with a hash_map using tree_operand_hash.
What I'm seeing is that in both cases we first process ASAN_CHECK with ptr
&m_nullId.m_id
which is ADDR_EXPR of COMPONENT_REF of VAR_DECL, and then later on there is
ASAN_CHECK with ptr
&m_nullId.m_id
which is actually ADDR_EXPR of COMPONENT_REF of MEM_REF of ADDR_EXPR of
VAR_DECL.
The FIELD_DECL and VAR_DECL in both cases are the same.
Now, because of the VAR_DECL vs. MEM_REF[&VAR_DECL, 0] difference, the two
expressions hash differently under iterative_hash_expr.
Unlike the hash, the comparison is done using operand_equal_p (, , 0), which
has:
      else if (flags & OEP_ADDRESS_OF)
        {
          /* If we are interested in comparing addresses ignore
             MEM_REF wrappings of the base that can appear just for
             TBAA reasons.  */
          if (TREE_CODE (arg0) == MEM_REF
              && DECL_P (arg1)
              && TREE_CODE (TREE_OPERAND (arg0, 0)) == ADDR_EXPR
              && TREE_OPERAND (TREE_OPERAND (arg0, 0), 0) == arg1
              && integer_zerop (TREE_OPERAND (arg0, 1)))
            return 1;
          else if (TREE_CODE (arg1) == MEM_REF
                   && DECL_P (arg0)
                   && TREE_CODE (TREE_OPERAND (arg1, 0)) == ADDR_EXPR
                   && TREE_OPERAND (TREE_OPERAND (arg1, 0), 0) == arg0
                   && integer_zerop (TREE_OPERAND (arg1, 1)))
            return 1;
          return 0;
        }
and thus reports the two as equal.  Both of the decls in here (VAR_DECL and
FIELD_DECL) have the same DECL_UID without -gtoggle and with -gtoggle, so I
presume the problem is that other tree expressions pushed into the hash_map as
keys have some DECL_UID differences somewhere.  In any case, both hash maps
have the same number of elements but report a different number of collisions
(note debug stmts never query anything, so that is not the issue).

So, I believe the bug is that we have a hash function that can return different
hashes even for objects that compare equal by the comparison function, so it is
then by pure luck if we find a match or not.
IMHO the right solution would be to have next to operand_equal_p a hashing
function that guarantees that if operand_equal_p returns true on two tree
expressions, then they have the same hash.
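
To make the contract concrete, here is a minimal standalone sketch.  It uses a
toy node type, not GCC trees: Node, strip_trivial_mem_ref, address_equal,
naive_hash and consistent_hash are all hypothetical names invented for this
illustration.  The equality function ignores a MEM_REF[&decl, 0] wrapper in
the same spirit as the OEP_ADDRESS_OF case quoted above; the naive hash does
not, while the consistent hash strips the wrapper before hashing.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for tree nodes; this is NOT the GCC tree API.
enum class Code { VarDecl, AddrExpr, MemRef };

struct Node {
  Code code;
  int uid;          // meaningful for VarDecl only
  const Node *op0;  // operand of AddrExpr, base of MemRef
  long offset;      // meaningful for MemRef only
};

// Return the decl hidden behind MemRef[&decl, 0], otherwise the node itself.
const Node *strip_trivial_mem_ref(const Node *n) {
  assert(n != nullptr);
  if (n->code == Code::MemRef && n->offset == 0 && n->op0
      && n->op0->code == Code::AddrExpr && n->op0->op0
      && n->op0->op0->code == Code::VarDecl)
    return n->op0->op0;
  return n;
}

// Equality in the "address of" sense: trivial MemRef wrappers are ignored.
bool address_equal(const Node *a, const Node *b) {
  a = strip_trivial_mem_ref(a);
  b = strip_trivial_mem_ref(b);
  if (a->code != b->code) return false;
  if (a->code == Code::VarDecl) return a->uid == b->uid;
  if (a->offset != b->offset) return false;
  if (!a->op0 || !b->op0) return a->op0 == b->op0;
  return address_equal(a->op0, b->op0);
}

std::size_t mix(std::size_t h, std::size_t v) { return h * 31 + v; }

// Hashes the structure as written, so a wrapped and an unwrapped decl differ.
std::size_t naive_hash(const Node *n) {
  std::size_t h = mix(17, static_cast<std::size_t>(n->code));
  if (n->code == Code::VarDecl) return mix(h, static_cast<std::size_t>(n->uid));
  h = mix(h, static_cast<std::size_t>(n->offset));
  return n->op0 ? mix(h, naive_hash(n->op0)) : h;
}

// Applies the same normalization as address_equal before hashing, so nodes
// that compare equal always receive the same hash.
std::size_t consistent_hash(const Node *n) {
  n = strip_trivial_mem_ref(n);
  std::size_t h = mix(17, static_cast<std::size_t>(n->code));
  if (n->code == Code::VarDecl) return mix(h, static_cast<std::size_t>(n->uid));
  h = mix(h, static_cast<std::size_t>(n->offset));
  return n->op0 ? mix(h, consistent_hash(n->op0)) : h;
}
```

With a VarDecl node and a MemRef[&VarDecl, 0] node built on top of it,
address_equal reports the two as equal, naive_hash gives them different
values, and consistent_hash gives them the same value.  A real fix would need
to mirror every normalization operand_equal_p performs, not just this one.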

As a short-term measure, perhaps (maybe just for sanopt, as that is where the problem is
reported), we could use a comparison function that compares both the hash
values and operand_equal_p, i.e. tree expressions that hash differently would
never compare equal.  This can be done either by remembering also the hash
value, or just computing iterative_hash_expr each time.
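
The short-term idea can be sketched outside of GCC as follows.  This is only
an illustration with std::unordered_map and a toy key type; Key, loose_equal,
KeyHash, HashAwareEqual and seen_before are hypothetical names, and the
"wrapped" flag merely stands in for the VAR_DECL vs. MEM_REF[&VAR_DECL, 0]
difference.  The comparison recomputes the hash each time, which corresponds
to the second of the two options mentioned above.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>

// Toy key: the same decl may appear bare or wrapped.
struct Key {
  int decl_uid;
  bool wrapped;
};

// Stand-in for a comparison that ignores the wrapper.
bool loose_equal(const Key &a, const Key &b) {
  return a.decl_uid == b.decl_uid;
}

// Stand-in for a hash that does see the wrapper.
struct KeyHash {
  std::size_t operator()(const Key &k) const {
    return std::hash<int>()(k.decl_uid) * 2 + (k.wrapped ? 1 : 0);
  }
};

// Workaround: keys that hash differently never compare equal, so the
// equality predicate is consistent with the hash by construction.
struct HashAwareEqual {
  bool operator()(const Key &a, const Key &b) const {
    return KeyHash()(a) == KeyHash()(b) && loose_equal(a, b);
  }
};

using CheckMap = std::unordered_map<Key, int, KeyHash, HashAwareEqual>;

// Returns true if an equal key was already recorded, recording it otherwise.
bool seen_before(CheckMap &m, const Key &k) {
  return !m.emplace(k, 0).second;
}
```

With this predicate a bare key and a wrapped key for the same decl are always
treated as distinct, so the lookup result no longer depends on which bucket
the keys happen to land in.  The cost is that some redundant checks that could
have been removed are kept, which is why this is only a stopgap.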
