Re: [PATCH] ubsan: missed -fsanitize=bounds for compound ops [PR108060]

2023-03-09 Thread Jakub Jelinek via Gcc-patches
On Thu, Mar 09, 2023 at 08:12:47AM +, Richard Biener wrote:
> I think this is a reasonable way to address the regression, so OK.

It is true that both C and C++ (including c++14_down and c++17 and later
where the latter have different ordering rules) evaluate the lhs of
MODIFY_EXPR after rhs, so conceptually this patch makes sense.
But I wonder why we do in ubsan_maybe_instrument_array_ref:
  if (e != NULL_TREE)
    {
      tree t = copy_node (*expr_p);
      TREE_OPERAND (t, 1) = build2 (COMPOUND_EXPR, TREE_TYPE (op1),
                                    e, op1);
      *expr_p = t;
    }
rather than modification of the ARRAY_REF's operand in place.  If we
did that, we wouldn't really care about the order, shared tree would
be instrumented once, with SAVE_EXPR in there making sure we don't
compute that multiple times.  Is that because the 2 copies could
have side-effects and we do want to evaluate those multiple times?
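
To make the evaluate-once point concrete, here is a toy model in plain C.
It is only a sketch and not GCC code: struct save_expr, save_expr_eval and
next_index are hypothetical names, and the real SAVE_EXPR is a GENERIC tree
node rather than a struct with a cached value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an evaluate-once wrapper; NOT GCC's SAVE_EXPR.  */
struct save_expr
{
  int (*compute) (void *ctx);  /* the wrapped expression */
  void *ctx;                   /* state the expression reads or writes */
  bool evaluated;              /* true once VALUE has been computed */
  int value;                   /* cached result of the first evaluation */
};

/* Evaluate the wrapped expression at most once and return the cached
   value on every later use.  */
static int
save_expr_eval (struct save_expr *s)
{
  assert (s != NULL && s->compute != NULL);
  if (!s->evaluated)
    {
      s->value = s->compute (s->ctx);
      s->evaluated = true;
    }
  return s->value;
}

/* Example expression with a side effect: returns the counter and then
   increments it.  */
static int
next_index (void *ctx)
{
  int *counter = ctx;
  return (*counter)++;
}
```

With a counter starting at 5, two calls to save_expr_eval on the same
wrapper both return 5 and the counter ends at 6, so the side effect happens
once even though the value is used twice.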

Jakub



Re: [PATCH] ubsan: missed -fsanitize=bounds for compound ops [PR108060]

2023-03-09 Thread Richard Biener via Gcc-patches
On Wed, 8 Mar 2023, Marek Polacek wrote:

> In this PR we are dealing with a missing .UBSAN_BOUNDS, so the
> out-of-bounds access in the test makes the program crash before
> a UBSan diagnostic is emitted.  In C and C++, c_genericize gets
> 
>   a[b] = a[b] | c;
> 
> but in C, both a[b] are one identical shared tree (not in C++ because
> cp_fold/ARRAY_REF created two same but not identical trees).  Since
> ubsan_walk_array_refs_r keeps a pset, in C we produce
> 
>   a[.UBSAN_BOUNDS (0B, SAVE_EXPR <b>, 8);, SAVE_EXPR <b>;] = a[b] | c;
> 
> because the LHS is walked before the RHS.
> 
> Since r7-1900, we gimplify the RHS before the LHS.  So the statement above
> gets gimplified into
> 
> _1 = a[b];
> c.0_2 = c;
> b.1 = b;
> .UBSAN_BOUNDS (0B, b.1, 8);
> 
> With this patch we produce:
> 
>   a[b] = a[.UBSAN_BOUNDS (0B, SAVE_EXPR <b>, 8);, SAVE_EXPR <b>;] | c;
> 
> which gets gimplified into:
> 
> b.0 = b;
> .UBSAN_BOUNDS (0B, b.0, 8);
> _1 = a[b.0];
> 
> therefore we emit a runtime error before making the bad array access.
> 
> I think it's OK that only the RHS gets a .UBSAN_BOUNDS, as in the example
> a few lines above: the instrumented array access dominates the array access
> on the LHS, and I've verified that
> 
>   b = 0;
>   a[b] = (a[b], b = -32768, a[0] | c);
> 
> works as expected: the inner a[b] is OK but we do emit an error for the
> a[b] on the LHS.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/12?

I think this is a reasonable way to address the regression, so OK.

Thanks,
Richard.


[PATCH] ubsan: missed -fsanitize=bounds for compound ops [PR108060]

2023-03-08 Thread Marek Polacek via Gcc-patches
In this PR we are dealing with a missing .UBSAN_BOUNDS, so the
out-of-bounds access in the test makes the program crash before
a UBSan diagnostic is emitted.  In C and C++, c_genericize gets

  a[b] = a[b] | c;

but in C, both a[b] are one identical shared tree (not in C++ because
cp_fold/ARRAY_REF created two same but not identical trees).  Since
ubsan_walk_array_refs_r keeps a pset, in C we produce

  a[.UBSAN_BOUNDS (0B, SAVE_EXPR <b>, 8);, SAVE_EXPR <b>;] = a[b] | c;

because the LHS is walked before the RHS.
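
The effect of the visited set on a shared operand can be sketched with a toy
model in plain C.  This is an illustration under stated assumptions, not the
c-gimplify.cc code: struct node, struct assign, walk_slot and walk_assign are
hypothetical names, and a per-node flag stands in for the pset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a tree node; NOT GCC's tree type.  */
struct node
{
  bool is_array_ref;   /* models TREE_CODE == ARRAY_REF */
  bool visited;        /* models membership in the pset */
};

/* Toy assignment whose two operand slots may point to one shared node.  */
struct assign
{
  struct node *op[2];  /* op[0] = LHS, op[1] = RHS */
  bool checked[2];     /* which operand slot received the bounds check */
};

/* Visit one operand slot.  A node that was already visited is skipped,
   so a shared node is instrumented only in the slot walked first.  */
static void
walk_slot (struct assign *a, int slot)
{
  assert (slot == 0 || slot == 1);
  struct node *n = a->op[slot];
  if (n == NULL || n->visited)
    return;
  n->visited = true;
  if (n->is_array_ref)
    a->checked[slot] = true;
}

/* Walk both operands in the requested order.  */
static void
walk_assign (struct assign *a, bool rhs_first)
{
  walk_slot (a, rhs_first ? 1 : 0);
  walk_slot (a, rhs_first ? 0 : 1);
}
```

With both slots pointing at one node, walking the LHS first marks only
checked[0]; walking the RHS first marks only checked[1], which is the slot
that is gimplified first.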

Since r7-1900, we gimplify the RHS before the LHS.  So the statement above
gets gimplified into

_1 = a[b];
c.0_2 = c;
b.1 = b;
.UBSAN_BOUNDS (0B, b.1, 8);

With this patch we produce:

  a[b] = a[.UBSAN_BOUNDS (0B, SAVE_EXPR <b>, 8);, SAVE_EXPR <b>;] | c;

which gets gimplified into:

b.0 = b;
.UBSAN_BOUNDS (0B, b.0, 8);
_1 = a[b.0];

therefore we emit a runtime error before making the bad array access.
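
As a hand-written analogue of that ordering, the sketch below checks the
index before the first load.  It is only an illustration: index_in_bounds and
checked_or_assign are hypothetical helpers, and unlike the real UBSan runtime
(which prints a diagnostic) this version simply returns false and skips the
access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime bounds check: true if IDX is a
   valid index into an array of N elements.  */
static bool
index_in_bounds (long idx, size_t n)
{
  return idx >= 0 && (size_t) idx < n;
}

/* Perform a[b] |= c on an N-element array, checking the index before the
   first access.  Returns false and leaves A untouched when the index is
   out of bounds.  */
static bool
checked_or_assign (int *a, size_t n, long b, int c)
{
  assert (a != NULL);
  long b0 = b;                    /* like b.0 = b: evaluate the index once */
  if (!index_in_bounds (b0, n))   /* the check comes before any access */
    return false;
  int t = a[b0];                  /* like _1 = a[b.0] */
  a[b0] = t | c;
  return true;
}
```

For an 8-element array, an index of 3 updates the element and returns true,
while an index of -32768 returns false without touching the array.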

I think it's OK that only the RHS gets a .UBSAN_BOUNDS, as in the example
a few lines above: the instrumented array access dominates the array access
on the LHS, and I've verified that

  b = 0;
  a[b] = (a[b], b = -32768, a[0] | c);

works as expected: the inner a[b] is OK but we do emit an error for the
a[b] on the LHS.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/12?

PR sanitizer/108060
PR sanitizer/109050

gcc/c-family/ChangeLog:

* c-gimplify.cc (ubsan_walk_array_refs_r): For a MODIFY_EXPR, instrument
the RHS before the LHS.

gcc/testsuite/ChangeLog:

* c-c++-common/ubsan/bounds-17.c: New test.
* c-c++-common/ubsan/bounds-18.c: New test.
* c-c++-common/ubsan/bounds-19.c: New test.
* c-c++-common/ubsan/bounds-20.c: New test.
---
 gcc/c-family/c-gimplify.cc   | 12 
 gcc/testsuite/c-c++-common/ubsan/bounds-17.c | 17 +
 gcc/testsuite/c-c++-common/ubsan/bounds-18.c | 17 +
 gcc/testsuite/c-c++-common/ubsan/bounds-19.c | 20 
 gcc/testsuite/c-c++-common/ubsan/bounds-20.c | 16 
 5 files changed, 82 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-17.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-18.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-19.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-20.c

diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
index 74b276b2b26..ef5c7d919fc 100644
--- a/gcc/c-family/c-gimplify.cc
+++ b/gcc/c-family/c-gimplify.cc
@@ -106,6 +106,18 @@ ubsan_walk_array_refs_r (tree *tp, int *walk_subtrees, void *data)
     }
   else if (TREE_CODE (*tp) == ARRAY_REF)
     ubsan_maybe_instrument_array_ref (tp, false);
+  else if (TREE_CODE (*tp) == MODIFY_EXPR)
+    {
+      /* Since r7-1900, we gimplify RHS before LHS.  Consider
+           a[b] |= c;
+         wherein we can have a single shared tree a[b] in both LHS and RHS.
+         If we only instrument the LHS and the access is invalid, the program
+         could crash before emitting a UBSan error.  So instrument the RHS
+         first.  */
+      *walk_subtrees = 0;
+      walk_tree (&TREE_OPERAND (*tp, 1), ubsan_walk_array_refs_r, pset, pset);
+      walk_tree (&TREE_OPERAND (*tp, 0), ubsan_walk_array_refs_r, pset, pset);
+    }
   return NULL_TREE;
 }
 
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-17.c b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
new file mode 100644
index 000..b727e3235b8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
@@ -0,0 +1,17 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-shouldfail "ubsan" } */
+
+int a[8];
+int c;
+
+int
+main ()
+{
+  int b = -32768;
+  a[b] |= c;
+}
+
+/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-18.c b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
new file mode 100644
index 000..556abc0e1c0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
@@ -0,0 +1,17 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-shouldfail "ubsan" } */
+
+int a[8];
+int c;
+
+int
+main ()
+{
+  int b = -32768;
+  a[b] = a[b] | c;
+}
+
+/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-19.c b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
new file mode 100644
index 000..54217ae399f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
@@ -0,0 +1,20 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-shouldfail "ubsan" } */
+
+int a[8];
+int a2[18];
+int c;
+
+int
+main ()
+{
+  int b = 0;
+  a[0] = (a2[b], b = -32768, a[0] | c);
+  b = 0;