Hi!

While this isn't really a regression, the -fstrict-flex-arrays*
option is new in GCC 13 and so I think we should make -fsanitize=bounds
play with it well from the beginning.

The current behavior is that -fsanitize=bounds considers all trailing
arrays as flexible member-like arrays and both -fsanitize=bounds and
-fsanitize=bounds-strict because of a bug don't even instrument
[0] arrays at all, not as trailing nor when followed by other members.

I think -fstrict-flex-arrays* options can be considered as language
mode changing options, by default flexible member-like arrays are
handled like flexible arrays, but that option can change the set of
the arrays which are treated like that.  So, -fsanitize=bounds should
change with that on what is considered acceptable and what isn't.
While -fsanitize=bounds-strict should reject them all always to
continue previous behavior.

The following patch implements that.  To support [0] array instrumentation,
I had to change the meaning of the bounds argument to .UBSAN_BOUNDS,
previously it was the TYPE_MAX_VALUE of the domain unless ignore_off_by_one
(used for taking address of the array element rather than accessing it;
in that case 1 is added to the bound argument) and the later lowered checks
were if (index > bound) report_failure ().
The problem with that is that for [0] arrays where (at least for C++)
the max value is all ones, for accesses that condition will be never true;
for addresses of elements it was working (in C++) correctly before.
This patch changes it to add 1 + ignore_off_by_one, so -1 becomes 0 or
1 for &array_ref and changing the lowering to be if (index >= bound)
report_failure ().  Furthermore, as C represents the [0] arrays with
NULL TYPE_MAX_VALUE, I treated those like the C++ ones.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-02-28  Jakub Jelinek  <ja...@redhat.com>

        PR sanitizer/108894
gcc/
        * ubsan.cc (ubsan_expand_bounds_ifn): Emit index >= bound
        comparison rather than index > bound.
        * gimple-fold.cc (gimple_fold_call): Use tree_int_cst_lt
        rather than tree_int_cst_le for IFN_UBSAN_BOUND comparison.
        * doc/invoke.texi (-fsanitize=bounds): Document that whether
        flexible array member-like arrays are instrumented or not depends
        on -fstrict-flex-arrays* options of strict_flex_array attributes.
        (-fsanitize=bounds-strict): Document that flexible array members
        are not instrumented.
gcc/c-family/
        * c-common.h (c_strict_flex_array_level_of): Declare.
        * c-common.cc (c_strict_flex_array_level_of): New function,
        moved and renamed from c-decl.cc's strict_flex_array_level_of.
        * c-ubsan.cc (ubsan_instrument_bounds): Fix comment typo.  For
        C check c_strict_flex_array_level_of whether a trailing array
        should be treated as flexible member like.  Handle C [0] arrays.
        Add 1 + index_off_by_one rather than index_off_by_one to bounds
        and use tree_int_cst_lt rather than tree_int_cst_le for idx vs.
        bounds comparison.
gcc/c/
        * c-decl.cc (strict_flex_array_level_of): Move to c-common.cc
        and rename to c_strict_flex_array_level_of.
        (is_flexible_array_member_p): Adjust caller.
gcc/testsuite/
        * gcc.dg/ubsan/bounds-4.c: New test.
        * gcc.dg/ubsan/bounds-4a.c: New test.
        * gcc.dg/ubsan/bounds-4b.c: New test.
        * gcc.dg/ubsan/bounds-4c.c: New test.
        * gcc.dg/ubsan/bounds-4d.c: New test.
        * g++.dg/ubsan/bounds-1.C: New test.

--- gcc/ubsan.cc.jj     2023-02-06 09:05:49.001628881 +0100
+++ gcc/ubsan.cc        2023-02-27 16:46:13.452122423 +0100
@@ -721,7 +721,7 @@ ubsan_expand_bounds_ifn (gimple_stmt_ite
 
   gimple_stmt_iterator gsi_orig = *gsi;
 
-  /* Create condition "if (index > bound)".  */
+  /* Create condition "if (index >= bound)".  */
   basic_block then_bb, fallthru_bb;
   gimple_stmt_iterator cond_insert_point
     = create_cond_insert_point (gsi, false, false, true,
@@ -730,7 +730,7 @@ ubsan_expand_bounds_ifn (gimple_stmt_ite
   index = force_gimple_operand_gsi (&cond_insert_point, index,
                                    true, NULL_TREE,
                                    false, GSI_NEW_STMT);
-  gimple *g = gimple_build_cond (GT_EXPR, index, bound, NULL_TREE, NULL_TREE);
+  gimple *g = gimple_build_cond (GE_EXPR, index, bound, NULL_TREE, NULL_TREE);
   gimple_set_location (g, loc);
   gsi_insert_after (&cond_insert_point, g, GSI_NEW_STMT);
 
--- gcc/gimple-fold.cc.jj       2023-02-08 11:57:29.726582589 +0100
+++ gcc/gimple-fold.cc  2023-02-27 16:45:00.895160752 +0100
@@ -5624,7 +5624,7 @@ gimple_fold_call (gimple_stmt_iterator *
              {
                index = fold_convert (TREE_TYPE (bound), index);
                if (TREE_CODE (index) == INTEGER_CST
-                   && tree_int_cst_le (index, bound))
+                   && tree_int_cst_lt (index, bound))
                  {
                    replace_call_with_value (gsi, NULL_TREE);
                    return true;
--- gcc/doc/invoke.texi.jj      2023-02-24 09:58:39.888509771 +0100
+++ gcc/doc/invoke.texi 2023-02-28 08:32:27.313775696 +0100
@@ -16831,14 +16831,17 @@ a++;
 @item -fsanitize=bounds
 This option enables instrumentation of array bounds.  Various out of bounds
 accesses are detected.  Flexible array members, flexible array member-like
-arrays, and initializers of variables with static storage are not instrumented.
+arrays, and initializers of variables with static storage are not
+instrumented, with the exception of flexible array member-like arrays
+for which @code{-fstrict-flex-arrays} or @code{-fstrict-flex-arrays=}
+options or @code{strict_flex_array} attributes say they shouldn't be treated
+like flexible array member-like arrays.
 
 @opindex fsanitize=bounds-strict
 @item -fsanitize=bounds-strict
 This option enables strict instrumentation of array bounds.  Most out of bounds
-accesses are detected, including flexible array members and flexible array
-member-like arrays.  Initializers of variables with static storage are not
-instrumented.
+accesses are detected, including flexible array member-like arrays.
+Initializers of variables with static storage are not instrumented.
 
 @opindex fsanitize=alignment
 @item -fsanitize=alignment
--- gcc/c-family/c-common.h.jj  2023-02-17 12:45:07.524646367 +0100
+++ gcc/c-family/c-common.h     2023-02-27 15:11:16.737259968 +0100
@@ -907,6 +907,7 @@ extern tree fold_for_warn (tree);
 extern tree c_common_get_narrower (tree, int *);
 extern bool get_attribute_operand (tree, unsigned HOST_WIDE_INT *);
 extern void c_common_finalize_early_debug (void);
+extern unsigned int c_strict_flex_array_level_of (tree);
 extern bool c_option_is_from_cpp_diagnostics (int);
 
 /* Used by convert_and_check; in front ends.  */
--- gcc/c-family/c-common.cc.jj 2023-01-19 09:58:50.875229150 +0100
+++ gcc/c-family/c-common.cc    2023-02-27 15:10:56.232547326 +0100
@@ -9501,4 +9501,33 @@ c_common_finalize_early_debug (void)
       (*debug_hooks->early_global_decl) (cnode->decl);
 }
 
+/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
+   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
+unsigned int
+c_strict_flex_array_level_of (tree array_field)
+{
+  gcc_assert (TREE_CODE (array_field) == FIELD_DECL);
+  unsigned int strict_flex_array_level = flag_strict_flex_arrays;
+
+  tree attr_strict_flex_array
+    = lookup_attribute ("strict_flex_array", DECL_ATTRIBUTES (array_field));
+  /* If there is a strict_flex_array attribute attached to the field,
+     override the flag_strict_flex_arrays.  */
+  if (attr_strict_flex_array)
+    {
+      /* Get the value of the level first from the attribute.  */
+      unsigned HOST_WIDE_INT attr_strict_flex_array_level = 0;
+      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
+      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
+      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
+      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
+      gcc_assert (tree_fits_uhwi_p (attr_strict_flex_array));
+      attr_strict_flex_array_level = tree_to_uhwi (attr_strict_flex_array);
+
+      /* The attribute has higher priority than flag_struct_flex_array.  */
+      strict_flex_array_level = attr_strict_flex_array_level;
+    }
+  return strict_flex_array_level;
+}
+
 #include "gt-c-family-c-common.h"
--- gcc/c-family/c-ubsan.cc.jj  2023-01-16 11:52:15.899736776 +0100
+++ gcc/c-family/c-ubsan.cc     2023-02-28 08:15:57.579124095 +0100
@@ -354,7 +354,7 @@ ubsan_instrument_return (location_t loc)
    that gets expanded in the sanopt pass, and make an array dimension
    of it.  ARRAY is the array, *INDEX is an index to the array.
    Return NULL_TREE if no instrumentation is emitted.
-   IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside a ADDR_EXPR.  */
+   IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside an ADDR_EXPR.  */
 
 tree
 ubsan_instrument_bounds (location_t loc, tree array, tree *index,
@@ -363,13 +363,25 @@ ubsan_instrument_bounds (location_t loc,
   tree type = TREE_TYPE (array);
   tree domain = TYPE_DOMAIN (type);
 
-  if (domain == NULL_TREE || TYPE_MAX_VALUE (domain) == NULL_TREE)
+  if (domain == NULL_TREE)
     return NULL_TREE;
 
   tree bound = TYPE_MAX_VALUE (domain);
-  if (ignore_off_by_one)
-    bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
-                        build_int_cst (TREE_TYPE (bound), 1));
+  if (!bound)
+    {
+      /* Handle C [0] arrays, which have TYPE_MAX_VALUE NULL, like
+        C++ [0] arrays which have TYPE_MIN_VALUE 0 TYPE_MAX_VALUE -1.  */
+      if (!c_dialect_cxx ()
+         && COMPLETE_TYPE_P (type)
+         && integer_zerop (TYPE_SIZE (type)))
+       bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+      else
+       return NULL_TREE;
+    }
+
+  bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
+                      build_int_cst (TREE_TYPE (bound),
+                      1 + ignore_off_by_one));
 
   /* Detect flexible array members and suchlike, unless
      -fsanitize=bounds-strict.  */
@@ -392,6 +404,45 @@ ubsan_instrument_bounds (location_t loc,
          if (next)
            /* Not a last element.  Instrument it.  */
            break;
+         if (TREE_CODE (TREE_TYPE (TREE_OPERAND (cref, 1))) == ARRAY_TYPE
+             && !c_dialect_cxx ())
+           {
+             unsigned l
+               = c_strict_flex_array_level_of (TREE_OPERAND (cref, 1));
+             tree type2 = TREE_TYPE (TREE_OPERAND (cref, 1));
+             if (TYPE_DOMAIN (type2) != NULL_TREE)
+               {
+                 tree max = TYPE_MAX_VALUE (TYPE_DOMAIN (type2));
+                 if (max == NULL_TREE)
+                   {
+                     /* C [0] */
+                     if (COMPLETE_TYPE_P (type2)
+                         && integer_zerop (TYPE_SIZE (type2))
+                         && l == 3)
+                       next = TREE_OPERAND (cref, 1);
+                   }
+                 else if (TREE_CODE (max) == INTEGER_CST)
+                   {
+                     if (c_dialect_cxx ()
+                         && integer_all_onesp (max))
+                       {
+                         /* C++ [0] */
+                         if (l == 3)
+                           next = TREE_OPERAND (cref, 1);
+                       }
+                     else if (integer_zerop (max))
+                       {
+                         /* C/C++ [1] */
+                         if (l >= 2)
+                           next = TREE_OPERAND (cref, 1);
+                       }
+                     else if (l >= 1)
+                       next = TREE_OPERAND (cref, 1);
+                   }
+               }
+             if (next)
+               break;
+           }
          /* Ok, this is the last field of the structure/union.  But the
             aggregate containing the field must be the last field too,
             recursively.  */
@@ -413,7 +464,7 @@ ubsan_instrument_bounds (location_t loc,
   if (idx
       && TREE_CODE (bound) == INTEGER_CST
       && tree_int_cst_sgn (idx) >= 0
-      && tree_int_cst_le (idx, bound))
+      && tree_int_cst_lt (idx, bound))
     return NULL_TREE;
 
   *index = save_expr (*index);
--- gcc/c/c-decl.cc.jj  2023-02-18 12:38:30.000000000 +0100
+++ gcc/c/c-decl.cc     2023-02-27 15:09:57.143375989 +0100
@@ -9050,35 +9050,6 @@ finish_incomplete_vars (tree incomplete_
     }
 }
 
-/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
-   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
-static unsigned int
-strict_flex_array_level_of (tree array_field)
-{
-  gcc_assert (TREE_CODE (array_field) == FIELD_DECL);
-  unsigned int strict_flex_array_level = flag_strict_flex_arrays;
-
-  tree attr_strict_flex_array
-    = lookup_attribute ("strict_flex_array", DECL_ATTRIBUTES (array_field));
-  /* If there is a strict_flex_array attribute attached to the field,
-     override the flag_strict_flex_arrays.  */
-  if (attr_strict_flex_array)
-    {
-      /* Get the value of the level first from the attribute.  */
-      unsigned HOST_WIDE_INT attr_strict_flex_array_level = 0;
-      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
-      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
-      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
-      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
-      gcc_assert (tree_fits_uhwi_p (attr_strict_flex_array));
-      attr_strict_flex_array_level = tree_to_uhwi (attr_strict_flex_array);
-
-      /* The attribute has higher priority than flag_struct_flex_array.  */
-      strict_flex_array_level = attr_strict_flex_array_level;
-    }
-  return strict_flex_array_level;
-}
-
 /* Determine whether the FIELD_DECL X is a flexible array member according to
    the following info:
   A. whether the FIELD_DECL X is the last field of the DECL_CONTEXT;
@@ -9105,7 +9076,7 @@ is_flexible_array_member_p (bool is_last
   bool is_one_element_array = one_element_array_type_p (TREE_TYPE (x));
   bool is_flexible_array = flexible_array_member_type_p (TREE_TYPE (x));
 
-  unsigned int strict_flex_array_level = strict_flex_array_level_of (x);
+  unsigned int strict_flex_array_level = c_strict_flex_array_level_of (x);
 
   switch (strict_flex_array_level)
     {
--- gcc/testsuite/gcc.dg/ubsan/bounds-4.c.jj    2023-02-27 17:08:46.193766320 
+0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4.c       2023-02-27 17:21:40.489668882 
+0100
@@ -0,0 +1,79 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds" } */
+/* { dg-output "index 15 out of bounds for type 'int 
\\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+struct A { int a; int b[]; };
+struct B { int a; int b[0]; };
+struct C { int a; int b[1]; };
+struct D { int a; int b[2]; };
+struct E { int a; int b[42]; };
+struct F { int a; int b[0]; int c[2]; };
+struct G { int a; int b[15]; int c[2]; };
+
+__attribute__((noipa)) int
+foo (struct A *a)
+{
+  return a->b[14];
+}
+
+__attribute__((noipa)) int
+bar (struct B *a)
+{
+  return a->b[0];
+}
+
+__attribute__((noipa)) int
+baz (struct C *a)
+{
+  return a->b[1];
+}
+
+__attribute__((noipa)) int
+qux (struct D *a)
+{
+  return a->b[2];
+}
+
+__attribute__((noipa)) int
+corge (struct E *a)
+{
+  return a->b[14];
+}
+
+__attribute__((noipa)) int
+freddy (struct F *a)
+{
+  return a->b[0];
+}
+
+__attribute__((noipa)) int
+garply (struct G *a)
+{
+  return a->b[15];
+}
+
+__attribute__((noipa)) int
+waldo (struct G *a)
+{
+  return a->b[16];
+}
+
+int
+main ()
+{
+  union { struct A a; struct B b; struct C c;
+         struct D d; struct E e; struct F f; } u;
+  struct G g;
+  u.e.a = 42;
+  __builtin_memset (u.e.b, 0, sizeof (u.e.b));
+  __builtin_memset (&g, 0, sizeof (g));
+  int r = garply (&g);
+  r += foo (&u.a) + bar (&u.b) + baz (&u.c);
+  r += qux (&u.d) + corge (&u.e) + freddy (&u.f);
+  r += waldo (&g);
+  if (r != 0)
+    __builtin_abort ();
+}
--- gcc/testsuite/gcc.dg/ubsan/bounds-4a.c.jj   2023-02-27 17:08:52.797671833 
+0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4a.c      2023-02-27 17:25:08.463687314 
+0100
@@ -0,0 +1,8 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds 
-fstrict-flex-arrays=0" } */
+/* { dg-output "index 15 out of bounds for type 'int 
\\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+#include "bounds-4.c"
--- gcc/testsuite/gcc.dg/ubsan/bounds-4b.c.jj   2023-02-27 17:08:52.797671833 
+0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4b.c      2023-02-27 17:29:57.042545420 
+0100
@@ -0,0 +1,9 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds 
-fstrict-flex-arrays=1" } */
+/* { dg-output "index 15 out of bounds for type 'int 
\\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int 
\\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+#include "bounds-4.c"
--- gcc/testsuite/gcc.dg/ubsan/bounds-4c.c.jj   2023-02-27 17:08:52.797671833 
+0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4c.c      2023-02-27 17:34:29.094634537 
+0100
@@ -0,0 +1,10 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds 
-fstrict-flex-arrays=2" } */
+/* { dg-output "index 15 out of bounds for type 'int 
\\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'int 
\\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int 
\\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+#include "bounds-4.c"
--- gcc/testsuite/gcc.dg/ubsan/bounds-4d.c.jj   2023-02-27 17:08:52.797671833 
+0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4d.c      2023-02-27 17:35:46.556520992 
+0100
@@ -0,0 +1,11 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds 
-fstrict-flex-arrays=3" } */
+/* { dg-output "index 15 out of bounds for type 'int 
\\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'int 
\\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int 
\\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+#include "bounds-4.c"
--- gcc/testsuite/g++.dg/ubsan/bounds-1.C.jj    2023-02-27 17:37:03.076420982 
+0100
+++ gcc/testsuite/g++.dg/ubsan/bounds-1.C       2023-02-27 17:38:23.717261726 
+0100
@@ -0,0 +1,8 @@
+// PR sanitizer/108894
+// { dg-do run }
+// { dg-options "-fsanitize=bounds -fsanitize-recover=bounds" }
+// { dg-output "index 15 out of bounds for type 'int 
\\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" }
+// { dg-output "\[^\n\r]*index 0 out of bounds for type 'int 
\\\[\[0-9x]*\\\]'\[^\n\r]*(\n|\r\n|\r)" }
+// { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" }
+
+#include "../../gcc.dg/ubsan/bounds-4.c"

        Jakub

Reply via email to