Re: [PATCH] tree-object-size: Support strndup and strdup

2022-11-02 Thread Siddhesh Poyarekar

On 2022-09-23 09:02, Jakub Jelinek wrote:

Oh, so can addr_object_size be simplified to use get_base_address too?


You can try.  As you can see in get_base_address, that function
handles something that the above doesn't (looking through some MEM_REFs too).



I went down this rabbithole and it actually simplifies some cases but 
got sucked into flex array related issues that I need more time to 
figure out.  I'll stick to using get_base_address for now since I want 
to make sure this makes the stage 1 deadline.


Thanks,
Sid


Re: [PATCH 2/2] tree-object-size: More consistent behaviour with flex arrays

2023-01-31 Thread Siddhesh Poyarekar

On 2023-01-31 07:46, Jakub Jelinek wrote:

On Wed, Dec 21, 2022 at 05:25:54PM -0500, Siddhesh Poyarekar wrote:

The tree object size pass tries to fail when it detects a flex array in
the struct, but it ends up doing the right thing only when the flex
array is in the outermost struct.  For nested cases (such as arrays
nested in a union or an inner struct), it ends up taking whatever value
the flex array is declared with, using zero for the standard flex array,
i.e. [].

Rework subobject size computation to make it more consistent across the
board, honoring -fstrict-flex-arrays.  With this change, any array at
the end of the struct will end up causing __bos to use the allocated
value of the outer object, bailing out in the maximum case when it can't
find it.  In the minimum case, it will return the subscript value or the
allocated value of the outer object, whichever is larger.


I think it is way too late in the GCC 13 cycle to change behavior here
except for when -fstrict-flex-arrays is used.


I agree.


Plus, am not really convinced it is a good idea to change the behavior
here except for the new options, programs in the wild took 2+ years
to acommodate for what we GCC requiring and am not sure they'd be willing
to be adjusted again.


The behaviour change basically does two things: better minimum object 
size estimates and more conservative object size estimates for trailing 
arrays.  ISTM that this should in fact reduce breakages due to flex 
arrays, although one could argue that protection gets reduced for 
trailing arrays without -fstrict-flex-arrays.  It wouldn't cause 
non-mitigation behaviour changes though, would it?


We don't need to rush this of course, we could consider this for gcc 14 
given that we're well into stage 4.


Thanks,
Sid


Re: [PATCH 2/2] Documentation Update.

2023-02-01 Thread Siddhesh Poyarekar

On 2023-02-01 13:24, Qing Zhao wrote:




On Feb 1, 2023, at 11:55 AM, Siddhesh Poyarekar  wrote:

On 2023-01-31 09:11, Qing Zhao wrote:

Update documentation to clarify a GCC extension on structure with
flexible array member being nested in another structure.
gcc/ChangeLog:
* doc/extend.texi: Document GCC extension on a structure containing
a flexible array member to be a member of another structure.


Should this resolve pr#77650 since the proposed action there appears to be to 
document these semantics?


My understanding of pr77650 is specifically for documentation on the following 
case:

The structure with a flexible array member is the middle field of another 
structure.

Which I added in the documentation as the 2nd situation.
However, I am still not very comfortable on my current clarification on this 
situation: how should we document on
the expected gcc behavior to handle such situation?


I reckon wording that dissuades programmers from using this might be 
appropriate, i.e. don't rely on this and if you already have such nested 
flex arrays, change code to remove them.



+In the above, @code{flex_data.data[]} is allowed to be extended flexibly to
+the padding. E.g, up to 4 elements.


"""
... Relying on space in struct padding is bad programming practice and 
any code relying on this behaviour should be modified to ensure that 
flexible array members only end up at the ends of arrays.  The 
`-pedantic` flag should help identify such uses.

"""

Although -pedantic will also flag on flex arrays nested in structs even 
if they're at the end of the parent struct, so my suggestion on the 
warning is not really perfect.


Sid


[committed v2] testsuite: Run __bos tests to completion

2023-02-01 Thread Siddhesh Poyarekar
Instead of failing on first error, run all __builtin_object_size and
__builtin_dynamic_object_size tests to completion and then provide a
summary of which tests failed.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-dynamic-object-size-0.c: Move FAIL and nfail
into...
* gcc.dg/builtin-object-size-common.h: ... new file.
* g++.dg/ext/builtin-object-size1.C: Include
builtin-object-size-common.h.  Replace all abort with FAIL.
(main): Call DONE.
* g++.dg/ext/builtin-object-size2.C: Likewise.
* gcc.dg/builtin-object-size-1.c: Likewise.
* gcc.dg/builtin-object-size-12.c: Likewise.
* gcc.dg/builtin-object-size-13.c: Likewise.
* gcc.dg/builtin-object-size-15.c: Likewise.
* gcc.dg/builtin-object-size-2.c: Likewise.
* gcc.dg/builtin-object-size-3.c: Likewise.
* gcc.dg/builtin-object-size-4.c: Likewise.
* gcc.dg/builtin-object-size-6.c: Likewise.
* gcc.dg/builtin-object-size-7.c: Likewise.
* gcc.dg/builtin-object-size-8.c: Likewise.
* gcc.dg/pr101836.c: Likewise.
* gcc.dg/strict-flex-array-3.c: Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
 .../g++.dg/ext/builtin-object-size1.C | 260 ---
 .../g++.dg/ext/builtin-object-size2.C | 260 ---
 .../gcc.dg/builtin-dynamic-object-size-0.c|  14 +-
 gcc/testsuite/gcc.dg/builtin-object-size-1.c  | 297 -
 gcc/testsuite/gcc.dg/builtin-object-size-12.c |  12 +-
 gcc/testsuite/gcc.dg/builtin-object-size-13.c |  15 +-
 gcc/testsuite/gcc.dg/builtin-object-size-15.c |  11 +-
 gcc/testsuite/gcc.dg/builtin-object-size-2.c  | 305 +-
 gcc/testsuite/gcc.dg/builtin-object-size-3.c  | 275 
 gcc/testsuite/gcc.dg/builtin-object-size-4.c  | 285 
 gcc/testsuite/gcc.dg/builtin-object-size-6.c  | 260 ---
 gcc/testsuite/gcc.dg/builtin-object-size-7.c  |  54 ++--
 gcc/testsuite/gcc.dg/builtin-object-size-8.c  |  15 +-
 .../gcc.dg/builtin-object-size-common.h   |  32 ++
 gcc/testsuite/gcc.dg/pr101836.c   |  10 +-
 gcc/testsuite/gcc.dg/strict-flex-array-3.c|  10 +-
 16 files changed, 1039 insertions(+), 1076 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-common.h

diff --git a/gcc/testsuite/g++.dg/ext/builtin-object-size1.C 
b/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
index 8590a0bbebd..ebbeced1942 100644
--- a/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
+++ b/gcc/testsuite/g++.dg/ext/builtin-object-size1.C
@@ -1,11 +1,7 @@
 // { dg-do run }
 // { dg-options "-O2" }
 
-typedef __SIZE_TYPE__ size_t;
-extern "C" void abort ();
-extern "C" void exit (int);
-extern "C" void *malloc (size_t);
-extern "C" void free (void *);
+#include "../../gcc.dg/builtin-object-size-common.h"
 
 struct A
 {
@@ -20,105 +16,105 @@ test1 (A *p)
 {
   char *c;
   if (__builtin_object_size (>a, 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (>a[0], 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (>a[3], 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (>b, 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (>c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = p->a;
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = >a[0];
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = >a[3];
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = (char *) >b;
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   c = (char *) >c;
   if (__builtin_object_size (c, 0) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (>a, 1) != sizeof (p->a))
-abort ();
+FAIL ();
   if (__builtin_object_size (>a[0], 1) != sizeof (p->a))
-abort ();
+FAIL ();
   if (__builtin_object_size (>a[3], 1) != sizeof (p->a) - 3)
-abort ();
+FAIL ();
   if (__builtin_object_size (>b, 1) != sizeof (p->b))
-abort ();
+FAIL ();
   if (__builtin_object_size (>c, 1) != (size_t) -1)
-abort ();
+FAIL ();
   c = p->a;
   if (__builtin_object_size (c, 1) != sizeof (p->a))
-abort ();
+FAIL ();
   c = >a[0];
   if (__builtin_object_size (c, 1) != sizeof (p->a))
-abort ();
+FAIL ();
   c = >a[3];
   if (__builtin_object_size (c, 1) != sizeof (p->a) - 3)
-abort ();
+FAIL ();
   c = (char *) >b;
   if (__builtin_object_size (c, 1) != sizeof (p->b))
-abort ();
+FAIL ();
   c = (char *) >c;
   if (__builtin_object_size (c, 1) != (size_t) -1)
-abort ();
+FAIL ();
   if (__builtin_object_size (>a, 2) != 0)
-abort ();
+FAIL ();
   if (__builtin_object_size (>a

Re: [PATCH 2/2] Documentation Update.

2023-02-01 Thread Siddhesh Poyarekar

On 2023-01-31 09:11, Qing Zhao wrote:

Update documentation to clarify a GCC extension on structure with
flexible array member being nested in another structure.

gcc/ChangeLog:

* doc/extend.texi: Document GCC extension on a structure containing
a flexible array member to be a member of another structure.


Should this resolve pr#77650 since the proposed action there appears to 
be to document these semantics?


Thanks,
Sid


---
  gcc/doc/extend.texi | 35 ++-
  1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 4a89a3eae7c..54e4baf49a9 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1748,7 +1748,40 @@ Flexible array members may only appear as the last 
member of a
  A structure containing a flexible array member, or a union containing
  such a structure (possibly recursively), may not be a member of a
  structure or an element of an array.  (However, these uses are
-permitted by GCC as extensions.)
+permitted by GCC as extensions, see details below.)
+@end itemize
+
+GCC extension accepts a structure containing a flexible array member, or
+a union containing such a structure (possibly recursively) to be a member
+of a structure.
+
+There are two situations:
+
+@itemize @bullet
+@item
+The structure with a flexible array member is the last field of another
+structure, for example:
+
+@smallexample
+struct flex  @{ int length; char data[]; @};
+
+struct out_flex @{ int m; struct flex flex_data; @};
+@end smallexample
+
+In the above, @code{flex_data.data[]} is considered as a flexible array too.
+
+@item
+The structure with a flexible array member is the middle field of another
+structure, for example:
+
+@smallexample
+struct flex  @{ int length; char data[]; @};
+
+struct mid_flex @{ int m; struct flex flex_data; int n; @};
+@end smallexample
+
+In the above, @code{flex_data.data[]} is allowed to be extended flexibly to
+the padding. E.g, up to 4 elements.
  @end itemize
  
  Non-empty initialization of zero-length


Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-01 Thread Siddhesh Poyarekar

On 2023-01-31 09:11, Qing Zhao wrote:

GCC extension accepts the case when a struct with a flexible array member
is embedded into another struct (possibly recursively).
__builtin_object_size should treat such struct as flexible size per
-fstrict-flex-arrays.

PR tree-optimization/101832

gcc/ChangeLog:

PR tree-optimization/101832
* tree-object-size.cc (flexible_size_type_p): New function.
(addr_object_size): Handle structure/union type when it has
flexible size.

gcc/testsuite/ChangeLog:

PR tree-optimization/101832
* gcc.dg/builtin-object-size-pr101832-2.c: New test.
* gcc.dg/builtin-object-size-pr101832-3.c: New test.
* gcc.dg/builtin-object-size-pr101832-4.c: New test.
* gcc.dg/builtin-object-size-pr101832.c: New test.
---
  .../gcc.dg/builtin-object-size-pr101832-2.c   | 135 ++
  .../gcc.dg/builtin-object-size-pr101832-3.c   | 135 ++
  .../gcc.dg/builtin-object-size-pr101832-4.c   | 135 ++
  .../gcc.dg/builtin-object-size-pr101832.c | 119 +++
  gcc/tree-object-size.cc   | 115 +++
  5 files changed, 611 insertions(+), 28 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832-2.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832-3.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832-4.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-2.c 
b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-2.c
new file mode 100644
index 000..f38babc5415
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-2.c
@@ -0,0 +1,135 @@
+/* PR 101832:
+   GCC extension accepts the case when a struct with a flexible array member
+   is embedded into another struct (possibly recursively).
+   __builtin_object_size will treat such struct as flexible size per
+   -fstrict-flex-arrays.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fstrict-flex-arrays=1" } */
+
+#include 
+
+unsigned n_fails = 0;
+
+#define expect(p, _v) do { \
+  size_t v = _v; \
+  if (p == v) \
+printf("ok:  %s == %zd\n", #p, p); \
+  else {\
+printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
+n_fails++; \


I just pushed my testsuite fix, so you could use the macros in 
gcc.dg/builtin-object-size-common.h instead of accounting this yourself.


Also if you use __builtin_printf, you won't have to include stdio.h.

Thanks,
Sid


+  } \
+} while (0);
+
+struct A {
+  int n;
+  char data[];/* Content following header */
+};
+
+struct B {
+  int m;
+  struct A a;
+};
+
+struct C {
+  int q;
+  struct B b;
+};
+
+struct A0 {
+  int n;
+  char data[0];/* Content following header */
+};
+
+struct B0 {
+  int m;
+  struct A0 a;
+};
+
+struct C0 {
+  int q;
+  struct B0 b;
+};
+
+struct A1 {
+  int n;
+  char data[1];/* Content following header */
+};
+
+struct B1 {
+  int m;
+  struct A1 a;
+};
+
+struct C1 {
+  int q;
+  struct B1 b;
+};
+
+struct An {
+  int n;
+  char data[8];/* Content following header */
+};
+
+struct Bn {
+  int m;
+  struct An a;
+};
+
+struct Cn {
+  int q;
+  struct Bn b;
+};
+
+volatile void *magic1, *magic2;
+
+int main(int argc, char *argv[])
+{
+struct B *outer;
+struct C *outest;
+
+/* Make sure optimization can't find some other object size. */
+outer = (void *)magic1;
+outest = (void *)magic2;
+
+expect(__builtin_object_size(>a, 1), -1);
+expect(__builtin_object_size(>b, 1), -1);
+expect(__builtin_object_size(>b.a, 1), -1);
+
+struct B0 *outer0;
+struct C0 *outest0;
+
+/* Make sure optimization can't find some other object size. */
+outer0 = (void *)magic1;
+outest0 = (void *)magic2;
+
+expect(__builtin_object_size(>a, 1), -1);
+expect(__builtin_object_size(>b, 1), -1);
+expect(__builtin_object_size(>b.a, 1), -1);
+
+struct B1 *outer1;
+struct C1 *outest1;
+
+/* Make sure optimization can't find some other object size. */
+outer1 = (void *)magic1;
+outest1 = (void *)magic2;
+
+expect(__builtin_object_size(>a, 1), -1);
+expect(__builtin_object_size(>b, 1), -1);
+expect(__builtin_object_size(>b.a, 1), -1);
+
+struct Bn *outern;
+struct Cn *outestn;
+
+/* Make sure optimization can't find some other object size. */
+outern = (void *)magic1;
+outestn = (void *)magic2;
+
+expect(__builtin_object_size(>a, 1), sizeof(outern->a));
+expect(__builtin_object_size(>b, 1), sizeof(outestn->b));
+expect(__builtin_object_size(>b.a, 1), sizeof(outestn->b.a));
+
+if (n_fails > 0)
+  __builtin_abort ();
+
+return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-3.c 
b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-3.c
new file mode 100644
index 000..aaae99b8d67
--- /dev/null
+++ 

[PATCH] doc: Fix typo in -Wall description

2023-02-17 Thread Siddhesh Poyarekar
-Wall enables -Wuse-after-free=2 and not -Wuse-after-free=3.

gcc/ChangeLog:

* gcc/doc/invoke.texi (@item -Wall): Fix typo in
-Wuse-after-free.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/doc/invoke.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 51447a78584..20d41e19b3c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6083,7 +6083,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect 
Options}.
 -Wunused-label @gol
 -Wunused-value @gol
 -Wunused-variable  @gol
--Wuse-after-free=3  @gol
+-Wuse-after-free=2  @gol
 -Wvla-parameter @r{(C and Objective-C only)} @gol
 -Wvolatile-register-var  @gol
 -Wzero-length-bounds}
-- 
2.38.1



Re: [PATCH] doc: Fix typo in -Wall description

2023-02-17 Thread Siddhesh Poyarekar

On 2023-02-17 14:43, Jeff Law wrote:



On 2/17/23 06:41, Siddhesh Poyarekar wrote:

-Wall enables -Wuse-after-free=2 and not -Wuse-after-free=3.

gcc/ChangeLog:

* gcc/doc/invoke.texi (@item -Wall): Fix typo in
-Wuse-after-free.

Looks obvious to me.  If you haven't committed it already, go ahead.


Pushed, thanks.

Sid


Re: [PATCH] Implement range-op entry for sin/cos.

2023-04-20 Thread Siddhesh Poyarekar

On 2023-04-20 08:59, Jakub Jelinek via Gcc-patches wrote:

+r.set (type, dconstm1, dconst1);


See above, are we sure we can use [-1., 1.] range safely, or should that be
[-1.-Nulps, 1.+Nulps] for some kind of expected worse error margin of the
implementation?  And ditto for -frounding-math, shall we increase that
interval in that case, or is [-1., 1.] going to be ok?


Do any math implementations generate results outside of [-1., 1.]?  If 
yes, then it's a bug in those implementations IMO, not in the range 
assumption.  It feels wrong to cater for what ought to be trivially 
fixable in libraries if they ever happen to generate such results.


Thanks,
Sid


Re: [PATCH] Implement range-op entry for sin/cos.

2023-04-20 Thread Siddhesh Poyarekar

On 2023-04-20 10:02, Jakub Jelinek wrote:

On x86_64-linux with glibc 2.35, I see
for i in FLOAT DOUBLE LDOUBLE FLOAT128; do for j in TONEAREST UPWARD DOWNWARD 
TOWARDZERO; do gcc -D$i -DROUND=FE_$j -g -O1 -o /tmp/sincos{,.c} -lm; 
/tmp/sincos || echo $i $j; done; done
Aborted (core dumped)
FLOAT UPWARD
Aborted (core dumped)
FLOAT DOWNWARD
On sparc-sun-solaris2.11 I see
for i in FLOAT DOUBLE LDOUBLE; do for j in TONEAREST UPWARD DOWNWARD 
TOWARDZERO; do gcc -D$i -DROUND=FE_$j -g -O1 -o sincos{,.c} -lm; ./sincos || 
echo $i $j; done; done
Abort (core dumped)
DOUBLE UPWARD
Abort (core dumped)
DOUBLE DOWNWARD
Haven't tried anything else.  So that shows (but doesn't prove) that
maybe [-1., 1.] interval is fine for -fno-rounding-math on those, but not
for -frounding-math.


Would there be a reason to not consider these as bugs?  I feel like 
these should be fixed in glibc, or any math implementation that ends up 
doing this.


I suppose one reason could be the overhead of an additional branch to 
check for result bounds, but is that serious enough to allow this 
imprecision?  The alternative of output range being defined as 
[-1.0-ulp, 1.0+ulp] avoids that conversation I guess.


Thanks,
Sid


Re: [PATCH] Implement range-op entry for sin/cos.

2023-04-20 Thread Siddhesh Poyarekar

On 2023-04-20 11:52, Jakub Jelinek wrote:

Why?  Unless an implementation guarantees <= 0.5ulps errors, it can be one
or more ulps off, why is an error at or near 1.0 or -1.0 error any worse
than similar errors for other values?


In a general sense, maybe not, but in the sense of breaching the bounds 
of admissible values, especially when it can be reasonably corrected, it 
seems worse IMO to let the error slide.



Similarly for other functions which have other ranges, perhaps not with so
nice round numbers.  Say asin has [-pi/2, pi/2] range, those numbers aren't
exactly representable, but is it any worse to round those values to -inf or
+inf or worse give something 1-5 ulps further from that interval comparing
to other 1-5ulps errors?


I agree the argument in favour of allowing errors breaching the 
mathematical bounds is certainly stronger for bounds that are not 
exactly representable.  I just feel like the implementation ought to 
take the additional effort when the bounds are representable and make it 
easier for the compiler.


For bounds that aren't representable, one could get error bounds from 
libm-test-ulps data in glibc, although I reckon those won't be 
exhaustive.  From a quick peek at the sin/cos data, the arc target seems 
to be among the worst performers at about 7ulps, although if you include 
the complex routines we get close to 13 ulps.  The very worst 
imprecision among all math routines (that's gamma) is at 16 ulps for 
power in glibc tests, so maybe allowing about 25-30 ulps error in bounds 
might work across the board.


But yeah, it's probably going to be guesswork.

Thanks,
Sid


Re: [PATCH] Implement range-op entry for sin/cos.

2023-04-20 Thread Siddhesh Poyarekar

On 2023-04-20 13:57, Siddhesh Poyarekar wrote:
For bounds that aren't representable, one could get error bounds from 
libm-test-ulps data in glibc, although I reckon those won't be 
exhaustive.  From a quick peek at the sin/cos data, the arc target seems 
to be among the worst performers at about 7ulps, although if you include 
the complex routines we get close to 13 ulps.  The very worst 
imprecision among all math routines (that's gamma) is at 16 ulps for 
power in glibc tests, so maybe allowing about 25-30 ulps error in bounds 
might work across the board.


I was thinking about this a bit more and it seems like limiting ranges 
to targets that can generate sane results (i.e. error bounds within, 
say, 5-6 ulps) and for the rest, avoid emitting the ranges altogether. 
Emitting a bad range for all architectures seems like a net worse 
solution again.


Thanks,
Sid


Re: [PATCH] Implement range-op entry for sin/cos.

2023-04-24 Thread Siddhesh Poyarekar

On 2023-04-24 12:03, Jakub Jelinek wrote:

On Thu, Apr 20, 2023 at 01:57:55PM -0400, Siddhesh Poyarekar wrote:

Similarly for other functions which have other ranges, perhaps not with so
nice round numbers.  Say asin has [-pi/2, pi/2] range, those numbers aren't
exactly representable, but is it any worse to round those values to -inf or
+inf or worse give something 1-5 ulps further from that interval comparing
to other 1-5ulps errors?


I've extended my hack^^^test to include sqrt and this time it seems
that the boundary in that case holds even for non-standard rounding modes
for glibc:


IIRC the standard _requires_ sqrt to be correctly rounded.

Sid


Re: [PATCH] Implement range-op entry for sin/cos.

2023-04-21 Thread Siddhesh Poyarekar

On 2023-04-21 02:52, Jakub Jelinek wrote:

On Thu, Apr 20, 2023 at 09:14:10PM -0400, Siddhesh Poyarekar wrote:

On 2023-04-20 13:57, Siddhesh Poyarekar wrote:

For bounds that aren't representable, one could get error bounds from
libm-test-ulps data in glibc, although I reckon those won't be
exhaustive.  From a quick peek at the sin/cos data, the arc target seems
to be among the worst performers at about 7ulps, although if you include
the complex routines we get close to 13 ulps.  The very worst
imprecision among all math routines (that's gamma) is at 16 ulps for
power in glibc tests, so maybe allowing about 25-30 ulps error in bounds
might work across the board.


I was thinking about this a bit more and it seems like limiting ranges to
targets that can generate sane results (i.e. error bounds within, say, 5-6
ulps) and for the rest, avoid emitting the ranges altogether. Emitting a bad
range for all architectures seems like a net worse solution again.


Well, at least for basic arithmetics when libm functions aren't involved,
there is no point in disabling ranges altogether.


Oh yeah, I did mean only franges for math function call results.


And, for libm functions, my plan was to introduce a target hook, which
would have combined_fn argument to tell which function is queried,
machine_mode to say which floating point format and perhaps a bool whether
it is ulps for these basic math boundaries or results somewhere in between,
and would return in unsigned int ulps, 0 for 0.5ulps precision.
So, we could say for CASE_CFN_SIN: CASE_CFN_COS: in the glibc handler
say that ulps is say 3 inside of the ranges and 0 on the boundaries if
!flag_rounding_math and 6 and 2 with flag_rounding_math or whatever.
And in the generic code don't assume anything if ulps is say 100 or more.
The hooks would need to be a union of precision of supported versions of
the library through the history, including say libmvec because function
calls could be vectorized.
And default could be that infinite precision.
Back in November I've posted a proglet that can generate some ulps from
random number testing, plus on glibc we could pick maximums from ulps files.
And if needed, say powerpc*-linux could override the generic glibc
version for some subset of functions and call default otherwise (say at
least for __ibm128).


Ack, that sounds like a plan.

Thanks,
Sid


Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-08 Thread Siddhesh Poyarekar

On 2023-02-08 14:09, Joseph Myers wrote:

What must be avoided is -pedantic diagnostics for

struct flex1 { int n; int data[1]; };
struct out_flex_end1 { int m; struct flex1 flex_data; };

regardless of whether considered flexible or not, since that's clearly
valid in standard C.



Are you sure about "regardless of whether considered flexible or not", 
since ISTM the validity of the above in standard C is limited to when 
it's considered a non-flexible array.  So with -pedantic it shouldn't 
warn, but it also then shouldn't consider it a flexible array.


In other words, perhaps it makes sense to imply -fstrict-flex-arrays 
with -pedantic?


Sid


Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]

2023-02-07 Thread Siddhesh Poyarekar

On 2023-02-06 18:14, Joseph Myers wrote:

On Mon, 6 Feb 2023, Qing Zhao via Gcc-patches wrote:


In GCC14:

1. Include this new warning -Wgnu-varaible-sized-type-not-at-end to -Wall
2. Deprecate this extension from GCC. (Or delay this to next release?).


Any deprecation, or inclusion in -Wall, would best come with evidence
about the prevalance of use (possibly unintentional, probably undesirable)
of these extensions.  For example, maybe someone could do a distribution
rebuild with a patch to enable these warnings and report the results?


FWIW, Fred from our team has been working on a mass prebuilder that can 
be used for this kind of distribution-wide validation.  I've used it for 
_FORTIFY_SOURCE validation as well as coverage analysis.


I can help you with this Qing, once you have a patch ready.

Sid

[1] https://gitlab.com/fedora/packager-tools/mass-prebuild/


Re: [PATCH] tree-optimization/108522 Use component_ref_field_offset

2023-02-07 Thread Siddhesh Poyarekar




On 2023-01-25 22:32, Siddhesh Poyarekar wrote:

Instead of using TREE_OPERAND (expr, 2) directly, use
component_ref_field_offset instead, which does scaling for us.  The
function also substitutes PLACEHOLDER_EXPRs, which is probably what we
want anyway but I'm not sure if it's relevant for tree-object-size.

gcc/ChangeLog:

PR tree-optimization/108522
* tree-object-size.cc (compute_object_offset): Make EXPR
argument non-const.  Call component_ref_field_offset.

gcc/testsuite/ChangeLog:

PR tree-optimization/108522
* gcc.dg/builtin-dynamic-object-size-0.c (DEFSTRUCT): New
macro.
(test_dynarray_struct_member_b, test_dynarray_struct_member_c,
test_dynarray_struct_member_d,
test_dynarray_struct_member_subobj_b,
test_dynarray_struct_member_subobj_c,
test_dynarray_struct_member_subobj_d): New tests.
(main): Call them.

Signed-off-by: Siddhesh Poyarekar 


... and now pushed (this and the earlier commit) to gcc-12 branch.

Sid


Re: [PATCH 2/2] Documentation Update.

2023-02-02 Thread Siddhesh Poyarekar

On 2023-02-02 03:33, Richard Biener wrote:

looking at PR77650 what seems missing there is the semantics of this
extension as expected/required by the glibc use.  comment#5 seems
to suggest that for my example above its expected that
Y.x.data[0] aliases Y.end?!  There must be a better way to write
the glibc code and IMHO it would be best to deprecate this extension.
Definitely the middle-end wouldn't consider this aliasing for
my example - maybe it "works" when wrapped inside a union but
then for sure only when the union is visible in all accesses ...

typedef union
{
   struct __gconv_info __cd;
   struct
   {
 struct __gconv_info __cd;
 struct __gconv_step_data __data;
   } __combined;
} _G_iconv_t;

could be written as

typedef union
{
   struct __gconv_info __cd;
   char __dummy[sizeof(struct __gconv_info) + sizeof(struct
__gconv_step_data)];
} _G_iconv_t;

in case the intent is to provide a complete type with space for
a single __gconv_step_data.


I dug into this on the glibc end and it looks like this commit:

commit 63fb8f9aa9d19f85599afe4b849b567aefd70a36
Author: Zack Weinberg 
Date:   Mon Feb 5 14:13:41 2018 -0500

Post-cleanup 2: minimize _G_config.h.

ripped all of that gunk out.  AFAICT there's no use of struct 
__gconv_info anywhere else in the code.


I reckon it is safe to say now that glibc no longer needs this misfeature.

Sid


Re: One question on the source code of tree-object-size.cc

2023-07-31 Thread Siddhesh Poyarekar

On 2023-07-31 13:03, Siddhesh Poyarekar wrote:

On 2023-07-31 12:47, Qing Zhao wrote:

Hi, Sid and Jakub,

I have a question in the following source portion of the routine 
“addr_object_size” of gcc/tree-object-size.cc:


  743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
  744   if (bytes != error_mark_node)
  745 {
  746   bytes = size_for_offset (var_size, bytes);
  747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) 
== MEM_REF)

  748 {
  749   tree bytes2 = compute_object_offset (TREE_OPERAND 
(ptr, 0),

  750    pt_var);
  751   if (bytes2 != error_mark_node)
  752 {
  753   bytes2 = size_for_offset (pt_var_size, bytes2);
  754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
  755 }
  756 }
  757 }

At line 754, why we always use “MIN_EXPR” whenever it’s for 
OST_MINIMUM or not?

Shall we use

(object_size_type & OST_MINIMUM
 ? MIN_EXPR : MAX_EXPR)



That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations 
like this:


typedef struct
{
   int a;
} A;

size_t f()
{
   A *p = malloc (1);

   return __builtin_object_size (p, 0);


Correction, that should be __builtin_object_size (>a, 0)


}

where the returned size should be 1 and not sizeof (int).  The mode 
doesn't really matter in this case.


HTH.

Sid



Re: One question on the source code of tree-object-size.cc

2023-07-31 Thread Siddhesh Poyarekar

On 2023-07-31 14:13, Qing Zhao wrote:

Okay. I see.

Then if the size info from the TYPE is smaller than the size info from the 
malloc,
  then based on the current code, we use the smaller one between these two,
  i.e, the size info from the TYPE.  (Even for the OST_MAXIMUM).

Is such behavior correct?


Yes, it's correct even for OST_MAXIMUM.  The smaller one between the two 
is the more precise estimate, which is why the mode doesn't matter.




This is for the new “counted_by” attribute and how to use it in 
__builtin_dynamic_object_size.
for example:

===

struct annotated {
 size_t foo;
 int array[] __attribute__((counted_by (foo)));
};

#define noinline __attribute__((__noinline__))
#define SIZE_BUMP 2

/* in the following function, malloc allocated more space than the value
of counted_by attribute.  Then what's the correct behavior we expect
the __builtin_dynamic_object_size should have?  */

static struct annotated * noinline alloc_buf (int index)
{
   struct annotated *p;
   p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int));
   p->foo = index;

   /*when checking the observed access p->array, we have info on both
 observered allocation and observed access,
 A. from observed allocation: (index + SIZE_BUMP) * sizeof (int)
 B. from observed access: p->foo * sizeof (int)

 in the above, p->foo = index.
*/

   /* for MAXIMUM size, based on the current code, we will use the size info 
from the TYPE,
  i.e, the “counted_by” attribute, which is the smaller one.   */
   expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int));


If the counted_by is less than what is allocated, it is the more correct 
value to return because that's what the application asked for through 
the attribute.  If the allocated size is less, we return the allocated 
size because in that case, despite what the application said, the actual 
allocated size is less and hence that's the safer value.


In fact in the latter case it may even make sense to emit a warning 
because it is more likely than not to be a bug.


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-07-31 Thread Siddhesh Poyarekar

On 2023-07-31 12:47, Qing Zhao wrote:

Hi, Sid and Jakub,

I have a question in the following source portion of the routine 
“addr_object_size” of gcc/tree-object-size.cc:

  743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
  744   if (bytes != error_mark_node)
  745 {
  746   bytes = size_for_offset (var_size, bytes);
  747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
MEM_REF)
  748 {
  749   tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0),
  750pt_var);
  751   if (bytes2 != error_mark_node)
  752 {
  753   bytes2 = size_for_offset (pt_var_size, bytes2);
  754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
  755 }
  756 }
  757 }

At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or not?
Shall we use

(object_size_type & OST_MINIMUM
 ? MIN_EXPR : MAX_EXPR)



That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations 
like this:


typedef struct
{
  int a;
} A;

size_t f()
{
  A *p = malloc (1);

  return __builtin_object_size (p, 0);
}

where the returned size should be 1 and not sizeof (int).  The mode 
doesn't really matter in this case.


HTH.

Sid


Re: [C PATCH]: Add Walloc-type to warn about insufficient size in allocations

2023-07-31 Thread Siddhesh Poyarekar

On 2023-07-21 07:21, Martin Uecker via Gcc-patches wrote:



This patch adds a warning for allocations with insufficient size
based on the "alloc_size" attribute and the type of the pointer
the result is assigned to. While it is theoretically legal to
assign to the wrong pointer type and cast it to the right type
later, this almost always indicates an error. Since this catches
common mistakes and is simple to diagnose, it is suggested to
add this warning.
  


Bootstrapped and regression tested on x86.


Martin



Add option Walloc-type that warns about allocations that have
insufficient storage for the target type of the pointer the
storage is assigned to.

gcc:
* doc/invoke.texi: Document -Wstrict-flex-arrays option.

gcc/c-family:

* c.opt (Walloc-type): New option.

gcc/c:
* c-typeck.cc (convert_for_assignment): Add Walloc-type warning.

gcc/testsuite:

* gcc.dg/Walloc-type-1.c: New test.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 4abdc8d0e77..8b9d148582b 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -319,6 +319,10 @@ Walloca
  C ObjC C++ ObjC++ Var(warn_alloca) Warning
  Warn on any use of alloca.
  
+Walloc-type

+C ObjC Var(warn_alloc_type) Warning
+Warn when allocating insufficient storage for the target type of the
assigned pointer.
+
  Walloc-size-larger-than=
  C ObjC C++ LTO ObjC++ Var(warn_alloc_size_limit) Joined Host_Wide_Int
ByteSize Warning Init(HOST_WIDE_INT_MAX)
  -Walloc-size-larger-than=Warn for calls to allocation
functions that
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7cf411155c6..2e392f9c952 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -7343,6 +7343,32 @@ convert_for_assignment (location_t location,
location_t expr_loc, tree type,
"request for implicit conversion "
"from %qT to %qT not permitted in C++", rhstype,
type);
  
+  /* Warn of new allocations are not big enough for the target

type.  */
+  tree fndecl;
+  if (warn_alloc_type
+ && TREE_CODE (rhs) == CALL_EXPR
+ && (fndecl = get_callee_fndecl (rhs)) != NULL_TREE
+ && DECL_IS_MALLOC (fndecl))
+   {
+ tree fntype = TREE_TYPE (fndecl);
+ tree fntypeattrs = TYPE_ATTRIBUTES (fntype);
+ tree alloc_size = lookup_attribute ("alloc_size",
fntypeattrs);
+ if (alloc_size)
+   {
+ tree args = TREE_VALUE (alloc_size);
+ int idx = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1;
+ /* For calloc only use the second argument.  */
+ if (TREE_CHAIN (args))
+   idx = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN
(args))) - 1;
+ tree arg = CALL_EXPR_ARG (rhs, idx);
+ if (TREE_CODE (arg) == INTEGER_CST
+ && tree_int_cst_lt (arg, TYPE_SIZE_UNIT (ttl)))
+warning_at (location, OPT_Walloc_type, "allocation of
"
+"insufficient size %qE for type %qT with
"
+"size %qE", arg, ttl, TYPE_SIZE_UNIT
(ttl));
+   }
+   }
+


Wouldn't this be much more useful in later phases with ranger feedback 
like with the warn_access warnings?  That way the comparison won't be 
limited to constant sizes.


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Siddhesh Poyarekar

On 2023-08-02 10:02, Qing Zhao wrote:

   /*when checking the observed access p->array, we only have info on the
 observed access, i.e, the TYPE_SIZE info from the access. We don't have
 info on the whole object.  */
   expect(__builtin_dynamic_object_size(q->array, 1), q->foo * sizeof(int));
   expect(__builtin_dynamic_object_size(q->array, 0), -1);
   expect(__builtin_dynamic_object_size(q->array, 3), q->foo * sizeof(int));
   expect(__builtin_dynamic_object_size(q->array, 2), 0);
   /*when checking the pointer p, we have no observed allocation nor observed 
access.
 therefore, we cannot determine the size info here.  */
   expect(__builtin_dynamic_object_size(q, 1), -1);
   expect(__builtin_dynamic_object_size(q, 0), -1);
   expect(__builtin_dynamic_object_size(q, 3), 0);
   expect(__builtin_dynamic_object_size(q, 2), 0);


I'm wondering if we could sizeof (*q) + q->foo for __bdos(q, 0), but I 
suppose it could mean generating code that potentially dereferences an 
invalid pointer.  Surely we could emit that for __bdos(q->array, 0) 
though, couldn't we?


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Siddhesh Poyarekar

On 2023-08-03 12:43, Qing Zhao wrote:

  Surely we could emit that for __bdos(q->array, 0) though, couldn't we?


For __bdos(q->array, 0), we only have the access info for the sub-object q->array, 
we can surely decide the size of the sub-object q->array, but we still cannot
decide the whole object that is pointed by q (the same reason as above), right?


It's tricky, I mean we could assume p to be a valid object due to the 
dereference and hence assume that q->foo is also valid and that there's 
at least sizeof(*q) + q->foo * sizeof (q->array) bytes available.  The 
question then is whether q could be pointing to an element of an array 
of `struct annotated`.  Could we ever have a valid array of such structs 
that have a flex array at the end?  Wouldn't it always be a single object?


In fact for all pointers to such structs with a flex array at the end, 
could we always assume that it is a single object and never part of an 
array, and hence return sizeof()?


Thanks,
Sid


[PATCH] testsuite/110763: Ensure zero return from test

2023-07-21 Thread Siddhesh Poyarekar
The test deliberately reads beyond bounds to exersize ubsan and the
return value may be anything, based on previous allocations.  The OFF
test caters for it by ANDing the return with 0, do the same for the DYN
test.

gcc/testsuite/ChangeLog:

PR testsuite/110763
* gcc.dg/ubsan/object-size-dyn.c (dyn): New parameter RET.
(main): Use it.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c 
b/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c
index 0159f5b9820..49c3abe2e72 100644
--- a/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c
+++ b/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c
@@ -5,12 +5,12 @@
 
 int
 __attribute__ ((noinline))
-dyn (int size, int i)
+dyn (int size, int i, int ret)
 {
   __builtin_printf ("dyn\n");
   fflush (stdout);
   int *alloc = __builtin_calloc (size, sizeof (int));
-  int ret = alloc[i];
+  ret = ret & alloc[i];
   __builtin_free (alloc);
   return ret;
 }
@@ -28,7 +28,7 @@ off (int size, int i, int ret)
 int
 main (void)
 {
-  int ret = dyn (2, 2);
+  int ret = dyn (2, 2, 0);
 
   ret |= off (4, 4, 0);
 
-- 
2.41.0



Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-10 Thread Siddhesh Poyarekar

On 2023-08-10 12:39, Jakub Jelinek wrote:

On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:

The definition of __bos/__bdos allows us the freedom to *estimate* rather
than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
to give the more conservative answer of the two.


To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
__bos/__bdos from actual allocation size or size of an stack object or
size of data section object but something else (say counted_by), perhaps
0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
then user code can continue testing if both modes are equal to have
exact number.


Ack, that's fair.

Thanks,
Sid


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-10 Thread Siddhesh Poyarekar

On 2023-08-10 10:47, Martin Uecker wrote:

Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:

On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:

Am Donnerstag, dem 10.08.2023 um 13:59 + schrieb Qing Zhao:



On Aug 10, 2023, at 2:58 AM, Martin Uecker  wrote:

Am Mittwoch, dem 09.08.2023 um 20:10 + schrieb Qing Zhao:



On Aug 9, 2023, at 12:21 PM, Michael Matz  wrote:





I am not sure for the reason given above. The following
code would not work:

struct foo_flex { int a; short b; char t[]; } x;
x.a = 1;
struct foo_flex *p = malloc(sizeof(x) + x.a);
if (!p) abort();
memcpy(p, , sizeof(x)); // initialize struct


Okay.
Then, the user still should use the sizeof(struct foo_flex) + N * 
sizeof(foo->t) for the allocation, even though this might allocate more bytes 
than necessary. (But this is safe)

Let me know if I still miss anything.


The question is not only what the user should use to
allocate, but also what BDOS should return.  In my
example the user uses the sizeof() + N * sizeof
formula and the memcpy is safe, but it would be flagged
as a buffer overrun if BDOS uses the offsetof formula.


BDOS/BOS (at least the 0 level) should return what is actually
allocated for the var, what size was passed to malloc and if it
is a var with flex array member with initialization what is actually the
size on the stack or in .data/.rodata etc.


Agreed.

But what about a struct with FAM with the new "counted_by" attribute
if the original allocation is not visible?


There's precedent for this through the __access__ attribute; __bos 
trusts what the attribute says about the allocation.


Sid


Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-10 Thread Siddhesh Poyarekar

On 2023-08-10 11:18, Martin Uecker wrote:

Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar:

On 2023-08-10 10:47, Martin Uecker wrote:

Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:

On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:

Am Donnerstag, dem 10.08.2023 um 13:59 + schrieb Qing Zhao:



On Aug 10, 2023, at 2:58 AM, Martin Uecker  wrote:

Am Mittwoch, dem 09.08.2023 um 20:10 + schrieb Qing Zhao:



On Aug 9, 2023, at 12:21 PM, Michael Matz  wrote:





I am not sure for the reason given above. The following
code would not work:

struct foo_flex { int a; short b; char t[]; } x;
x.a = 1;
struct foo_flex *p = malloc(sizeof(x) + x.a);
if (!p) abort();
memcpy(p, , sizeof(x)); // initialize struct


Okay.
Then, the user still should use the sizeof(struct foo_flex) + N * 
sizeof(foo->t) for the allocation, even though this might allocate more bytes 
than necessary. (But this is safe)

Let me know if I still miss anything.


The question is not only what the user should use to
allocate, but also what BDOS should return.  In my
example the user uses the sizeof() + N * sizeof
formula and the memcpy is safe, but it would be flagged
as a buffer overrun if BDOS uses the offsetof formula.


BDOS/BOS (at least the 0 level) should return what is actually
allocated for the var, what size was passed to malloc and if it
is a var with flex array member with initialization what is actually the
size on the stack or in .data/.rodata etc.


Agreed.

But what about a struct with FAM with the new "counted_by" attribute
if the original allocation is not visible?


There's precedent for this through the __access__ attribute; __bos
trusts what the attribute says about the allocation.


The access attribute gives the size directly. The counted_by gives
a length for the array which needs to be translated into a size
via a formula. There are different formulas in use. The question
is which formula should bdos trust?

Whatever you pick, if this is not consistent with the actual
allocation or use, then it will cause problems either by
breaking code or not detecting buffer overruns.

So it needs to be consistent with what GCC allocates for a
var with FAM and initialization and also the user needs to
be told what the right choice is so that he can use the right
size for allocation and argument to memcpy / memset etc.


We'd rather miss overflow to the extent of padding than to try and be 
overly aggressive; I doubt if we're missing much protection in practice 
by trying to account for the padding.


The definition of __bos/__bdos allows us the freedom to *estimate* 
rather than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since 
it's bound to give the more conservative answer of the two.


Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-10 Thread Siddhesh Poyarekar

On 2023-08-10 14:28, Richard Sandiford wrote:

Siddhesh Poyarekar  writes:

On 2023-08-08 10:30, Siddhesh Poyarekar wrote:

Do you have a suggestion for the language to address libgcc,
libstdc++, etc. and libiberty, libbacktrace, etc.?


I'll work on this a bit and share a draft.


Hi David,

Here's what I came up with for different parts of GCC, including the
runtime libraries.  Over time we may find that specific parts of runtime
libraries simply cannot be used safely in some contexts and flag that.

Sid

"""
What is a GCC security bug?
===

  A security bug is one that threatens the security of a system or
  network, or might compromise the security of data stored on it.
  In the context of GCC there are multiple ways in which this might
  happen and they're detailed below.

Compiler drivers, programs, libgccjit and support libraries
---

  The compiler driver processes source code, invokes other programs
  such as the assembler and linker and generates the output result,
  which may be assembly code or machine code.  It is necessary that
  all source code inputs to the compiler are trusted, since it is
  impossible for the driver to validate input source code beyond
  conformance to a programming language standard.

  The GCC JIT implementation, libgccjit, is intended to be plugged
  into applications to translate input source code in the application
  context.  Limitations that apply to the compiler
  driver, apply here too in terms of sanitizing inputs, so it is
  recommended that inputs are either sanitized by an external program
  to allow only trusted, safe execution in the context of the
  application or the JIT execution context is appropriately sandboxed
  to contain the effects of any bugs in the JIT or its generated code
  to the sandboxed environment.

  Support libraries such as libiberty, libcc1 libvtv and libcpp have
  been developed separately to share code with other tools such as
  binutils and gdb.  These libraries again have similar challenges to
  compiler drivers.  While they are expected to be robust against
  arbitrary input, they should only be used with trusted inputs.

  Libraries such as zlib and libffi that bundled into GCC to build it
  will be treated the same as the compiler drivers and programs as far
  as security coverage is concerned.

  As a result, the only case for a potential security issue in all
  these cases is when it ends up generating vulnerable output for
  valid input source code.


I think this leaves open the interpretation "every wrong code bug
is potentially a security bug".  I suppose that's true in a trite sense,
but not in a useful sense.  As others said earlier in the thread,
whether a wrong code bug in GCC leads to a security bug in the object
code is too application-dependent to be a useful classification for GCC.

I think we should explicitly say that we don't generally consider wrong
code bugs to be security bugs.  Leaving it implicit is bound to lead
to misunderstanding.


I see what you mean, but the context-dependence of a bug is something 
GCC will have to deal with, similar to how libraries have to deal with 
bugs.  But I agree this probably needs some more expansion.  Let me try 
and come up with something more detailed for that last paragraph.



There's another case that I think should be highlighted explicitly:
GCC provides various security-hardening features.  I think any failure
of those feature to act as documented is poentially a security bug.
Failure to follow reasonable expectations (even if not documented)
might sometimes be a security bug too.


Missed hardening in general does not put systems at immediate risk, so 
they're not considered CVE-worthy.  In fact when bugs are evaluated for 
security risk at a source level (e.g. when NIST does it), hardening does 
not come into the picture at all.  It's only at product levels that 
hardening features are accounted for, e.g. where -fstack-protector would 
reduce the seriousness of a stack buffer overflow and even there one 
must do an analysis to see if the generated code actually mitigated the 
overflow using the stack protector canary.


Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-11 Thread Siddhesh Poyarekar

On 2023-08-11 11:12, David Edelsohn wrote:
The text above states "bugs in these libraries may be evaluated for 
security impact", but there is no comment about the criteria for a 
security impact, unlike the GLIBC SECURITY.md document.  The text seems 
to imply the "What is a security bug?" definitions from GLIBC, but the 
definitions are not explicitly stated in the GCC Security policy.


Should this "Language runtime libraries" section include some of the 
GLIBC "What is a security bug?" text or should the GCC "What is a 
security bug?" section earlier in this document include the text with a 
qualification that issues like buffer overflow, memory leaks, 
information disclosure, etc. specifically apply to "Language runtime 
libraries" and not all components of GCC?


Yes, that makes sense.  This part will likely evolve though, much like 
the glibc one did, based on reports we get over time.  I'll work it in 
and post an updated draft.


Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-11 Thread Siddhesh Poyarekar

On 2023-08-10 14:50, Siddhesh Poyarekar wrote:

  As a result, the only case for a potential security issue in all
  these cases is when it ends up generating vulnerable output for
  valid input source code.


I think this leaves open the interpretation "every wrong code bug
is potentially a security bug".  I suppose that's true in a trite sense,
but not in a useful sense.  As others said earlier in the thread,
whether a wrong code bug in GCC leads to a security bug in the object
code is too application-dependent to be a useful classification for GCC.

I think we should explicitly say that we don't generally consider wrong
code bugs to be security bugs.  Leaving it implicit is bound to lead
to misunderstanding.


I see what you mean, but the context-dependence of a bug is something 
GCC will have to deal with, similar to how libraries have to deal with 
bugs.  But I agree this probably needs some more expansion.  Let me try 
and come up with something more detailed for that last paragraph.


How's this:

As a result, the only case for a potential security issue in the 
compiler is when it generates vulnerable application code for valid, 
trusted input source code.  The output application code could be 
considered vulnerable if it produces an actual vulnerability in the 
target application, specifically in the following cases:


- The application dereferences an invalid memory location despite the 
application sources being valid.


- The application reads from or writes to a valid but incorrect memory 
location, resulting in an information integrity issue or an information 
leak.


- The application ends up running in an infinite loop or with severe 
degradation in performance despite the input sources having no such 
issue, resulting in a Denial of Service.  Note that correct but 
non-performant code is not a security issue candidate, this only applies 
to incorrect code that may result in performance degradation.


- The application crashes due to the generated incorrect code, resulting 
in a Denial of Service.




Re: [RFC] GCC Security policy

2023-08-11 Thread Siddhesh Poyarekar

On 2023-08-11 11:09, Paul Koning wrote:




On Aug 11, 2023, at 10:36 AM, Siddhesh Poyarekar  wrote:

On 2023-08-10 14:50, Siddhesh Poyarekar wrote:

   As a result, the only case for a potential security issue in all
   these cases is when it ends up generating vulnerable output for
   valid input source code.


I think this leaves open the interpretation "every wrong code bug
is potentially a security bug".  I suppose that's true in a trite sense,
but not in a useful sense.  As others said earlier in the thread,
whether a wrong code bug in GCC leads to a security bug in the object
code is too application-dependent to be a useful classification for GCC.

I think we should explicitly say that we don't generally consider wrong
code bugs to be security bugs.  Leaving it implicit is bound to lead
to misunderstanding.

I see what you mean, but the context-dependence of a bug is something GCC will 
have to deal with, similar to how libraries have to deal with bugs.  But I 
agree this probably needs some more expansion.  Let me try and come up with 
something more detailed for that last paragraph.


How's this:

As a result, the only case for a potential security issue in the compiler is 
when it generates vulnerable application code for valid, trusted input source 
code.  The output application code could be considered vulnerable if it 
produces an actual vulnerability in the target application, specifically in the 
following cases:


You might make it explicit that we're talking about wrong code errors here -- 
in other words, the source code is correct (conforms to the standard) and the 
algorithm expressed in the source code does not have a vulnerability, but the 
generated code has semantics that differ from those of the source code such 
that it does have a vulnerability.


Ack, thanks for the suggestion.




- The application dereferences an invalid memory location despite the 
application sources being valid.

- The application reads from or writes to a valid but incorrect memory 
location, resulting in an information integrity issue or an information leak.

- The application ends up running in an infinite loop or with severe 
degradation in performance despite the input sources having no such issue, 
resulting in a Denial of Service.  Note that correct but non-performant code is 
not a security issue candidate, this only applies to incorrect code that may 
result in performance degradation.


The last sentence somewhat contradicts the preceding one.  Perhaps "...may result in 
performance degradation severe enough to amount to a denial of service".


Ack, will fix that up, thanks.

Sid


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Siddhesh Poyarekar

On 2023-08-03 13:34, Qing Zhao wrote:

One thing I need to point out first is, currently, even for regular fixed size 
array in the structure,
We have this same issue, for example:

#define LENGTH 10

struct fix {
   size_t foo;
   int array[LENGTH];
};

…
int main ()
{
   struct fix *p;
   p = alloc_buf_more ();

   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
   expect(__builtin_object_size(p->array, 0), -1);
}

Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for it.
This is not a special issue for flexible array member.


That's fine for fixed arrays at the end of a struct because the "whole 
object" size could be anything; `p` could be pointing to the beginning 
of an array for all we know.  If however `array` is strictly a flex 
array, i.e.:


```
struct A
{
  size_t foo;
  int array[];
};
```

then there's no way in valid C to have an array of `struct fix`, so `q` 
must be pointing to a single element.  So you could deduce:


1. the minimum size of the whole object that q points to.

and

2. if you're able to determine the size of the flex array (through 
__element_count__(foo) for example), you could even determine the 
maximum size of the whole object.


For (2) though, you'd break applications that overallocate and then 
expect to be able to use that overallocation despite the space not being 
reflected in the __element_count__.  I think it's a bug in the 
application and I can't see a way for an application to be able to do 
this in a valid way so I'm inclined towards breaking it.


Of course, the fact that gcc allows flex arrays to be in the middle of 
structs breaks the base assumption but that's something we need to get 
rid of anyway since there's no way for valid C programs to use that safely.


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Siddhesh Poyarekar

On 2023-08-04 10:40, Siddhesh Poyarekar wrote:

On 2023-08-03 13:34, Qing Zhao wrote:
One thing I need to point out first is, currently, even for regular 
fixed size array in the structure,

We have this same issue, for example:

#define LENGTH 10

struct fix {
   size_t foo;
   int array[LENGTH];
};

…
int main ()
{
   struct fix *p;
   p = alloc_buf_more ();

   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
   expect(__builtin_object_size(p->array, 0), -1);
}

Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN 
for it.

This is not a special issue for flexible array member.


That's fine for fixed arrays at the end of a struct because the "whole 
object" size could be anything; `p` could be pointing to the beginning 
of an array for all we know.  If however `array` is strictly a flex 
array, i.e.:


```
struct A
{
   size_t foo;
   int array[];
};
```

then there's no way in valid C to have an array of `struct fix`, so `q` 
must be pointing to a single element.  So you could deduce:


1. the minimum size of the whole object that q points to.


Actually for minimum size we'd also need a guarantee that 
`alloc_buf_more` returns a valid allocated object.


Sid


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Siddhesh Poyarekar

On 2023-08-04 11:27, Qing Zhao wrote:




On Aug 4, 2023, at 10:40 AM, Siddhesh Poyarekar  wrote:

On 2023-08-03 13:34, Qing Zhao wrote:

One thing I need to point out first is, currently, even for regular fixed size 
array in the structure,
We have this same issue, for example:
#define LENGTH 10
struct fix {
   size_t foo;
   int array[LENGTH];
};
…
int main ()
{
   struct fix *p;
   p = alloc_buf_more ();
   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
   expect(__builtin_object_size(p->array, 0), -1);
}
Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for it.
This is not a special issue for flexible array member.


That's fine for fixed arrays at the end of a struct because the "whole object" 
size could be anything; `p` could be pointing to the beginning of an array for all we 
know.  If however `array` is strictly a flex array, i.e.:

```
struct A
{
  size_t foo;
  int array[];
};
```

then there's no way in valid C to have an array of `struct fix`,


Yes!!   this is exactly the place that makes difference between structures with 
fixed arrays and the ones with flexible arrays.

With such difference, I guess that using the type of the structure with flexible 
array member for p->array to get the size of the whole object p point to might 
be reasonable?


Yes, that's what I'm thinking.


so `q` must be pointing to a single element.  So you could deduce:

1. the minimum size of the whole object that q points to.


You mean that the TYPE will determine the minimum size of the whole object?  
(Does this include the size of the flexible array member, or only the other 
part of the structure except the flexible array member?)


Only the constant sized part of the structure.


Actually for minimum size we'd also need a guarantee that `alloc_buf_more` 
returns a valid allocated object.


Why? Please explain a little bit here.


So `alloc_buf_more` could return NULL, a valid pointer or an invalid 
pointer.  So, we could end up returning a non-zero minimum size for an 
invalid or NULL pointer, which is incorrect, we don't know that.


We won't need the object validity guarantee for (2) beyond, e.g. 
guarding against a new NULL pointer dereference because it's a *maximum* 
estimate; an invalid or NULL pointer would have 0 size.  So for such 
cases, __bos(q, 0) could return


sizeof(*q) + (q ? q->foo:0)

and __bos(q->array, 0) could be

sizeof(*q) + q->foo - offsetof(q, array)

There's no need to guard against a dereference in the second case 
because the q->array dereference already assumes that q is valid.




and

2. if you're able to determine the size of the flex array (through 
__element_count__(foo) for example), you could even determine the maximum size 
of the whole object.

For (2) though, you'd break applications that overallocate and then expect to 
be able to use that overallocation despite the space not being reflected in the 
__element_count__.  I think it's a bug in the application and I can't see a way 
for an application to be able to do this in a valid way so I'm inclined towards 
breaking it.


Currently, we allow the situation when the allocation size for the whole object 
is larger than the value reflected in the “counted_by” attribute (the old name 
is __element_count__). But don’t allow the other way around (i.e, when the 
allocation size for the whole object is smaller than the value reflected in the 
“counted_by” attribute.


Right, that's going to be the "break".  For underallocation __bos will 
only end up overestimating the space available, which is not ideal, but 
won't end up breaking compatibility.




Of course, the fact that gcc allows flex arrays to be in the middle of structs 
breaks the base assumption but that's something we need to get rid of anyway 
since there's no way for valid C programs to use that safely.


Since GCC14, we started to deprecate this extension (allow flex array to be in 
the middle of structs).
https://gcc.gnu.org/pipermail/gcc-cvs/2023-June/385730.html


Yes, that's what I'm banking on.

Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Siddhesh Poyarekar

On 2023-08-04 15:06, Qing Zhao wrote:

Yes, that's what I'm thinking.


so `q` must be pointing to a single element.  So you could deduce:

1. the minimum size of the whole object that q points to.

You mean that the TYPE will determine the minimum size of the whole object?  
(Does this include the size of the flexible array member, or only the other 
part of the structure except the flexible array member?)


Only the constant sized part of the structure.

Okay. I see.
But if the “counted_by” info is available, then from p->array, we can deduce the 
minimum size too, as sizeof(struct A) + q->foo * sizeof(int), right?


Yes.




Actually for minimum size we'd also need a guarantee that `alloc_buf_more` 
returns a valid allocated object.

Why? Please explain a little bit here.


So `alloc_buf_more` could return NULL, a valid pointer or an invalid pointer.  
So, we could end up returning a non-zero minimum size for an invalid or NULL 
pointer, which is incorrect, we don't know that.


I see what’ s you mean now.

However, if we already see p->array, then the p is guaranteed a valid pointer and not 
a NULL, right?  (We are discussing on __builtin_dynamic_object_size (q->array, 2), we 
see q->array already)


Yes, you could argue that for p->array, I agree, but not for p.



We won't need the object validity guarantee for (2) beyond, e.g. guarding 
against a new NULL pointer dereference because it's a *maximum* estimate; an 
invalid or NULL pointer would have 0 size.  So for such cases, __bos(q, 0) 
could return

sizeof(*q) + (q ? q->foo:0)

and __bos(q->array, 0) could be

sizeof(*q) + q->foo - offsetof(q, array)

There's no need to guard against a dereference in the second case because the 
q->array dereference already assumes that q is valid.


q->array should also guarantee that q is a valid pointer for minimum size, 
right? Or do I miss anything here?


Yes.

Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-08 Thread Siddhesh Poyarekar

On 2023-08-08 10:14, David Edelsohn wrote:
On Tue, Aug 8, 2023 at 10:07 AM Siddhesh Poyarekar <mailto:siddh...@gotplt.org>> wrote:


On 2023-08-08 10:04, Richard Biener wrote:
 > On Tue, Aug 8, 2023 at 3:35 PM Ian Lance Taylor mailto:i...@google.com>> wrote:
 >>
 >> On Tue, Aug 8, 2023 at 6:02 AM Jakub Jelinek via Gcc-patches
 >> mailto:gcc-patches@gcc.gnu.org>> wrote:
 >>>
 >>> On Tue, Aug 08, 2023 at 02:52:57PM +0200, Richard Biener via
Gcc-patches wrote:
 >>>> There's probably external tools to do this, not sure if we
should replicate
 >>>> things in the driver for this.
 >>>>
 >>>> But sure, I think the driver is the proper point to address
any of such
 >>>> issues - iff we want to address them at all.  Maybe a nice little
 >>>> google summer-of-code project ;)
 >>>
 >>> What I'd really like to avoid is having all compiler bugs
(primarily ICEs)
 >>> considered to be security bugs (e.g. DoS category), it would be
terrible to
 >>> release every week a new compiler because of the "security" issues.
 >>> Running compiler on untrusted sources can trigger ICEs (which
we want to fix
 >>> but there will always be some), or run into some compile time
and/or compile
 >>> memory issue (we have various quadratic or worse spots),
compiler stack
 >>> limits (deeply nested stuff e.g. during parsing but other areas
as well).
 >>> So, people running fuzzers and reporting issues is great, but
if they'd get
 >>> a CVE assigned for each ice-on-invalid-code, ice-on-valid-code,
 >>> each compile-time-hog and each memory-hog, that wouldn't be useful.
 >>> Runtime libraries or security issues in the code we generate
for valid
 >>> sources are of course a different thing.
 >>
 >>
 >> I wonder if a security policy should say something about the
-fplugin
 >> option.  I agree that an ICE is not a security issue, but I
wonder how
 >> many people are aware that a poorly chosen command line option can
 >> direct the compiler to run arbitrary code.  For that matter the same
 >> is true of setting the GCC_EXEC_PREFIX environment variable, and no
 >> doubt several other environment variables.  My point is not that we
 >> should change these, but that a security policy should draw
attention
 >> to the fact that there are cases in which the compiler will
 >> unexpectedly run other programs.
 >
 > Well, if you run an arbitrary commandline from the internet you get
 > what you deserve, running "echo "Hello World" | gcc -xc - -o
/dev/sda"
 > as root doesn't need plugins to shoot yourself in the foot.  You
need to
 > know what you're doing, otherwise you are basically executing an
 > arbitrary shell script with whatever privileges you have.

I think it would be useful to mention caveats with plugins though, just
like it would be useful to mention exceptions for libiberty and similar
libraries that gcc builds.  It only helps makes things clearer in terms
of what security coverage the project provides.


I have added a line to the Note section in the proposed text:

     GCC and its tools provide features and options that can run 
arbitrary user code (e.g., -fplugin).


How about the following to make it clearer that arbitrary code in 
plugins is not considered secure:


GCC and its tools provide features and options that can run arbitrary 
user code, e.g. using the -fplugin options.  Such custom code should be 
vetted by the user for safety as bugs exposed through such code will not 
be considered security issues.


I believe that the security implication already is addressed because the 
program is not tricked into a direct compromise of security.


Do you have a suggestion for the language to address libgcc, libstdc++, 
etc. and libiberty, libbacktrace, etc.?


I'll work on this a bit and share a draft.

Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-08 Thread Siddhesh Poyarekar

On 2023-08-08 04:16, Richard Biener wrote:

On Mon, Aug 7, 2023 at 7:30 PM David Edelsohn via Gcc-patches
 wrote:


FOSS Best Practices recommends that projects have an official Security
policy stated in a SECURITY.md or SECURITY.txt file at the root of the
repository.  GLIBC and Binutils have added such documents.

Appended is a prototype for a Security policy file for GCC based on the
Binutils document because GCC seems to have more affinity with Binutils as
a tool. Do the runtime libraries distributed with GCC, especially libgcc,
require additional security policies?

[ ] Is it appropriate to use the Binutils SECURITY.txt as the starting
point or should GCC use GLIBC SECURITY.md as the starting point for the GCC
Security policy?

[ ] Does GCC, or some components of GCC, require additional care because of
runtime libraries like libgcc and libstdc++, and because of gcov and
profile-directed feedback?


I do think that the runtime libraries should at least be explicitly mentioned
because they fall into the "generated output" category and bugs in the
runtime are usually more severe as affecting a wider class of inputs.


Ack, I'd expect libstdc++ and libgcc to be aligned with glibc's 
policies.  libiberty and others on the other hand, would probably be 
more suitably aligned with binutils libbfd, where we assume trusted input.



Thoughts?

Thanks, David

GCC Security Process


What is a GCC security bug?
===

 A security bug is one that threatens the security of a system or
 network, or might compromise the security of data stored on it.
 In the context of GCC there are two ways in which such
 bugs might occur.  In the first, the programs themselves might be
 tricked into a direct compromise of security.  In the second, the
 tools might introduce a vulnerability in the generated output that
 was not already present in the files used as input.

 Other than that, all other bugs will be treated as non-security
 issues.  This does not mean that they will be ignored, just that
 they will not be given the priority that is given to security bugs.

 This stance applies to the creation tools in the GCC (e.g.,
 gcc, g++, gfortran, gccgo, gccrs, gnat, cpp, gcov, etc.) and the
 libraries that they use.

Notes:
==

 None of the programs in GCC need elevated privileges to operate and
 it is recommended that users do not use them from accounts where such
 privileges are automatically available.


I'll note that we could ourselves mitigate some of that by handling privileged
invocation of the driver specially, dropping privs on exec of the sibling tools
and possibly using temporary files or pipes to do the parts of the I/O that
need to be privileged.


It's not a bad idea, but it ends up giving legitimizing running the 
compiler as root, pushing the responsibility of privilege management to 
the driver.  How about rejecting invocation as root altogether by 
default, bypassed with a --run-as-root flag instead?


I've also been thinking about a --sandbox flag that isolates the build 
process (for gcc as well as binutils) into a separate namespace so that 
it's usable in a restricted mode on untrusted sources without exposing 
the rest of the system to it.


Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-08 Thread Siddhesh Poyarekar

On 2023-08-08 10:04, Richard Biener wrote:

On Tue, Aug 8, 2023 at 3:35 PM Ian Lance Taylor  wrote:


On Tue, Aug 8, 2023 at 6:02 AM Jakub Jelinek via Gcc-patches
 wrote:


On Tue, Aug 08, 2023 at 02:52:57PM +0200, Richard Biener via Gcc-patches wrote:

There's probably external tools to do this, not sure if we should replicate
things in the driver for this.

But sure, I think the driver is the proper point to address any of such
issues - iff we want to address them at all.  Maybe a nice little
google summer-of-code project ;)


What I'd really like to avoid is having all compiler bugs (primarily ICEs)
considered to be security bugs (e.g. DoS category), it would be terrible to
release every week a new compiler because of the "security" issues.
Running compiler on untrusted sources can trigger ICEs (which we want to fix
but there will always be some), or run into some compile time and/or compile
memory issue (we have various quadratic or worse spots), compiler stack
limits (deeply nested stuff e.g. during parsing but other areas as well).
So, people running fuzzers and reporting issues is great, but if they'd get
a CVE assigned for each ice-on-invalid-code, ice-on-valid-code,
each compile-time-hog and each memory-hog, that wouldn't be useful.
Runtime libraries or security issues in the code we generate for valid
sources are of course a different thing.



I wonder if a security policy should say something about the -fplugin
option.  I agree that an ICE is not a security issue, but I wonder how
many people are aware that a poorly chosen command line option can
direct the compiler to run arbitrary code.  For that matter the same
is true of setting the GCC_EXEC_PREFIX environment variable, and no
doubt several other environment variables.  My point is not that we
should change these, but that a security policy should draw attention
to the fact that there are cases in which the compiler will
unexpectedly run other programs.


Well, if you run an arbitrary commandline from the internet you get
what you deserve, running "echo "Hello World" | gcc -xc - -o /dev/sda"
as root doesn't need plugins to shoot yourself in the foot.  You need to
know what you're doing, otherwise you are basically executing an
arbitrary shell script with whatever privileges you have.


I think it would be useful to mention caveats with plugins though, just 
like it would be useful to mention exceptions for libiberty and similar 
libraries that gcc builds.  It only helps makes things clearer in terms 
of what security coverage the project provides.


Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-08 Thread Siddhesh Poyarekar

On 2023-08-08 10:37, Jakub Jelinek wrote:

On Tue, Aug 08, 2023 at 10:30:10AM -0400, Siddhesh Poyarekar wrote:

Do you have a suggestion for the language to address libgcc, libstdc++,
etc. and libiberty, libbacktrace, etc.?


I'll work on this a bit and share a draft.


BTW, I think we should perhaps differentiate between production ready
libraries (e.g. libgcc, libstdc++, libgomp, libatomic, libgfortran, libquadmath,
libssp) vs. e.g. the sanitizer libraries which are meant for debugging and


Agreed, that's why I need some time to sort all of the libraries gcc 
builds to categorize them into various levels of support in terms of 
safety re. untrusted input.


Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-08 Thread Siddhesh Poyarekar

On 2023-08-08 11:48, David Malcolm wrote:

On Tue, 2023-08-08 at 09:33 -0400, Paul Koning via Gcc-patches wrote:




On Aug 8, 2023, at 9:01 AM, Jakub Jelinek via Gcc-patches
 wrote:

On Tue, Aug 08, 2023 at 02:52:57PM +0200, Richard Biener via Gcc-
patches wrote:

There's probably external tools to do this, not sure if we should
replicate
things in the driver for this.

But sure, I think the driver is the proper point to address any
of such
issues - iff we want to address them at all.  Maybe a nice little
google summer-of-code project ;)


What I'd really like to avoid is having all compiler bugs
(primarily ICEs)
considered to be security bugs (e.g. DoS category), it would be
terrible to
release every week a new compiler because of the "security" issues.


Indeed.  But my answer would be that such things are not DoS issues.
DoS means that an external input, over which you have little control,
is impairing service.  In the case of a compiler, if feeding it bad
source code X.c causes it to crash, the answer is "well, then don't
do that".


Agreed.

I'm not sure how to "wordsmith" this, but it seems like the sources and
options on the *host* are assumed to be trusted, and that the act of
*compiling* source on the host requires trusting them, just like the
act of executing the compiled code on the target does.  Though users
may be more familiar with sandboxing the target than the host.

We should spell this out further for libgccjit: libgccjit allows for
ahead-of-time and JIT compilation of sources - but it assumes that
those sources (and the compilation options) are trusted.

[Adding Andrea Corallo to the addressees]

For example, Emacs is using libgccjit to do ahead-of-time compilation
of Emacs bytecode.  I'm assuming that Emacs is assuming that its
bytecode is trusted, and that there isn't any attempt by Emacs to
sandbox the Emacs Lisp being processed.

However, consider a situation in which someone attempted to, say, embed
libgccjit inside a web browser to generate machine code from
JavaScript, where the JavaScript is potentially controlled by an
attacker.  I think we want to explicitly say that that if you're going
to do that, you need to put some other layer of defense in, so that
you're not blithely accepting the inputs to the compilation (sources
and options) from a potentially hostile source, where a crafted input
sources could potentially hit an ICE in the compiler and thus crash the
web browser.


+1, this is precisely the kind of thing the security policy should warn 
against and suggest using sandboxing for.  The compiler (or libgccjit) 
isn't really in a position to defend such uses, ICE or otherwise.


Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-09 Thread Siddhesh Poyarekar

On 2023-08-09 14:17, David Edelsohn wrote:
On Wed, Aug 9, 2023 at 1:33 PM Siddhesh Poyarekar <mailto:siddh...@gotplt.org>> wrote:


On 2023-08-08 10:30, Siddhesh Poyarekar wrote:
 >> Do you have a suggestion for the language to address libgcc,
 >> libstdc++, etc. and libiberty, libbacktrace, etc.?
 >
 > I'll work on this a bit and share a draft.

Hi David,

Here's what I came up with for different parts of GCC, including the
runtime libraries.  Over time we may find that specific parts of
runtime
libraries simply cannot be used safely in some contexts and flag that.

Sid


Hi, Sid

Thanks for iterating on this.


"""
What is a GCC security bug?
===

      A security bug is one that threatens the security of a system or
      network, or might compromise the security of data stored on it.
      In the context of GCC there are multiple ways in which this might
      happen and they're detailed below.

Compiler drivers, programs, libgccjit and support libraries
---

      The compiler driver processes source code, invokes other programs
      such as the assembler and linker and generates the output result,
      which may be assembly code or machine code.  It is necessary that
      all source code inputs to the compiler are trusted, since it is
      impossible for the driver to validate input source code beyond
      conformance to a programming language standard.

      The GCC JIT implementation, libgccjit, is intended to be plugged
      into applications to translate input source code in the
application
      context.  Limitations that apply to the compiler
      driver, apply here too in terms of sanitizing inputs, so it is
      recommended that inputs are either sanitized by an external
program
      to allow only trusted, safe execution in the context of the
      application or the JIT execution context is appropriately
sandboxed
      to contain the effects of any bugs in the JIT or its generated
code
      to the sandboxed environment.

      Support libraries such as libiberty, libcc1 libvtv and libcpp have
      been developed separately to share code with other tools such as
      binutils and gdb.  These libraries again have similar
challenges to
      compiler drivers.  While they are expected to be robust against
      arbitrary input, they should only be used with trusted inputs.

      Libraries such as zlib and libffi that bundled into GCC to
build it
      will be treated the same as the compiler drivers and programs
as far
      as security coverage is concerned.


Should we direct people to the upstream projects for their security 
policies?


We bundle zlib and libffi so regardless of whether it's a security issue 
in those libraries (because security impact of memory safety bugs in 
general use libraries will be context dependent and hence get assigned 
CVEs more often than not), the context in gcc is well defined as a local 
unprivileged executable and hence not security-relevant.


That said, we could add something like:

However if you find a issue in these libraries independent of their
use in GCC you should reach out to their upstream projects to report
them.




      As a result, the only case for a potential security issue in all
      these cases is when it ends up generating vulnerable output for
      valid input source code.


Language runtime libraries
--

      GCC also builds and distributes libraries that are intended to be
      used widely to implement runtime support for various programming
      languages.  These include the following:

      * libada
      * libatomic
      * libbacktrace
      * libcc1
      * libcody
      * libcpp
      * libdecnumber
      * libgcc
      * libgfortran
      * libgm2
      * libgo
      * libgomp
      * libiberty
      * libitm
      * libobjc
      * libphobos
      * libquadmath
      * libssp
      * libstdc++

      These libraries are intended to be used in arbitrary contexts
and as
      a result, bugs in these libraries may be evaluated for security
      impact.  However, some of these libraries, e.g. libgo, libphobos,
      etc.  are not maintained in the GCC project, due to which the GCC
      project may not be the correct point of contact for them.  You are
      encouraged to look at README files within those library
directories
      to locate the canonical security contact point for those projects.


As Richard mentioned, should GCC make a specific statement about the 
security policy / response for issues that

Re: [RFC] GCC Security policy

2023-08-09 Thread Siddhesh Poyarekar

On 2023-08-08 10:30, Siddhesh Poyarekar wrote:
Do you have a suggestion for the language to address libgcc, 
libstdc++, etc. and libiberty, libbacktrace, etc.?


I'll work on this a bit and share a draft.


Hi David,

Here's what I came up with for different parts of GCC, including the 
runtime libraries.  Over time we may find that specific parts of runtime 
libraries simply cannot be used safely in some contexts and flag that.


Sid

"""
What is a GCC security bug?
===

A security bug is one that threatens the security of a system or
network, or might compromise the security of data stored on it.
In the context of GCC there are multiple ways in which this might
happen and they're detailed below.

Compiler drivers, programs, libgccjit and support libraries
---

The compiler driver processes source code, invokes other programs
such as the assembler and linker and generates the output result,
which may be assembly code or machine code.  It is necessary that
all source code inputs to the compiler are trusted, since it is
impossible for the driver to validate input source code beyond
conformance to a programming language standard.

The GCC JIT implementation, libgccjit, is intended to be plugged
into applications to translate input source code in the application
context.  Limitations that apply to the compiler
driver, apply here too in terms of sanitizing inputs, so it is
recommended that inputs are either sanitized by an external program
to allow only trusted, safe execution in the context of the
application or the JIT execution context is appropriately sandboxed
to contain the effects of any bugs in the JIT or its generated code
to the sandboxed environment.

Support libraries such as libiberty, libcc1 libvtv and libcpp have
been developed separately to share code with other tools such as
binutils and gdb.  These libraries again have similar challenges to
compiler drivers.  While they are expected to be robust against
arbitrary input, they should only be used with trusted inputs.

Libraries such as zlib and libffi that bundled into GCC to build it
will be treated the same as the compiler drivers and programs as far
as security coverage is concerned.

As a result, the only case for a potential security issue in all
these cases is when it ends up generating vulnerable output for
valid input source code.

Language runtime libraries
--

GCC also builds and distributes libraries that are intended to be
used widely to implement runtime support for various programming
languages.  These include the following:

* libada
* libatomic
* libbacktrace
* libcc1
* libcody
* libcpp
* libdecnumber
* libgcc
* libgfortran
* libgm2
* libgo
* libgomp
* libiberty
* libitm
* libobjc
* libphobos
* libquadmath
* libssp
* libstdc++

These libraries are intended to be used in arbitrary contexts and as
a result, bugs in these libraries may be evaluated for security
impact.  However, some of these libraries, e.g. libgo, libphobos,
etc.  are not maintained in the GCC project, due to which the GCC
project may not be the correct point of contact for them.  You are
encouraged to look at README files within those library directories
to locate the canonical security contact point for those projects.

Diagnostic libraries


The sanitizer library bundled in GCC is intended to be used in
diagnostic cases and not intended for use in sensitive environments.
As a result, bugs in the sanitizer will not be considered security
sensitive.

GCC plugins
---

It should be noted that GCC may execute arbitrary code loaded by a
user through the GCC plugin mechanism or through system preloading
mechanism.  Such custom code should be vetted by the user for safety
as bugs exposed through such code will not be considered security
issues.


Re: One question on the source code of tree-object-size.cc

2023-08-01 Thread Siddhesh Poyarekar

On 2023-08-01 18:57, Kees Cook wrote:


   return p;
}

/* in the following function, malloc allocated less space than size of the
struct fix.  Then what's the correct behavior we expect
the __builtin_object_size should have for the following?
  */

static struct fix * noinline alloc_buf_less ()
{
   struct fix *p;
   p = malloc(sizeof (struct fix) - SIZE_BUMP * sizeof (int));

   /*when checking the observed access p->array, we have info on both
 observered allocation and observed access,
 A. from observed allocation (alloc_size): (LENGTH - SIZE_BUMP) * sizeof 
(int)
 B. from observed access (TYPE): LENGTH * sizeof (int)
*/

   /* for MAXIMUM size in the whole object: currently, GCC always used the A.  */

   expect(__builtin_object_size(p->array, 0), (LENGTH - SIZE_BUMP) * 
sizeof(int));


ok:  __builtin_object_size(p->array, 0) == 20

My brain just melted a little, as this is now an under-sized instance of
"p", so we have an incomplete allocation. (I would expect -Warray-bounds
to yell very loudly for this.) But, technically, yes, this looks like
the right calculation.


AFAIK, -Warray-bounds will only yell in case of a dereference that the 
compiler may potentially see as being beyond that 20 byte bound; it 
won't actually see the undersized allocation.  An analyzer warning would 
be useful for just the undersized allocation regardless of whether the 
code actually ends up accessing the object beyond the allocation bounds.


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-01 Thread Siddhesh Poyarekar

On 2023-08-01 17:35, Qing Zhao wrote:

typedef struct
{
   int a;
} A;
size_t f()
{
   A *p = malloc (1);
   return __builtin_object_size (p, 0);


Correction, that should be __builtin_object_size (p->a, 0).


Actually, it should be __builtin_object_size(p->a, 1).
For __builtin_object_size(p->a,0), gcc always uses the allocation size for the 
whole object.


Right, sorry, I mistyped, twice in fact; it should have been 
__bos(>a, 1) :)




GCC’s current behavior is:

For the size of the whole object, GCC currently always uses the allocation size.
And for the size in the sub-object, GCC chose the smaller one among the 
allocation size and the TYPE_SIZE.

Is this correct behavior?


Yes, it's deliberate; it specifically checks on var != pt_var, which can 
only be true for subobjects.


Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-14 Thread Siddhesh Poyarekar

Hi,

Here's the updated draft of the top part of the security policy with all 
of the recommendations incorporated.


Thanks,
Sid


What is a GCC security bug?
===

A security bug is one that threatens the security of a system or
network, or might compromise the security of data stored on it.
In the context of GCC there are multiple ways in which this might
happen and they're detailed below.

Compiler drivers, programs, libgccjit and support libraries
---

The compiler driver processes source code, invokes other programs
such as the assembler and linker and generates the output result,
which may be assembly code or machine code.  It is necessary that
all source code inputs to the compiler are trusted, since it is
impossible for the driver to validate input source code beyond
conformance to a programming language standard.

The GCC JIT implementation, libgccjit, is intended to be plugged
into applications to translate input source code in the application
context.  Limitations that apply to the compiler
driver, apply here too in terms of sanitizing inputs, so it is
recommended that inputs are either sanitized by an external program
to allow only trusted, safe execution in the context of the
application or the JIT execution context is appropriately sandboxed
to contain the effects of any bugs in the JIT or its generated code
to the sandboxed environment.

Support libraries such as libiberty, libcc1 libvtv and libcpp have
been developed separately to share code with other tools such as
binutils and gdb.  These libraries again have similar challenges to
compiler drivers.  While they are expected to be robust against
arbitrary input, they should only be used with trusted inputs.

Libraries such as zlib that bundled into GCC to build it will be
treated the same as the compiler drivers and programs as far as
security coverage is concerned.  However if you find an issue in
these libraries independent of their use in GCC, you should reach
out to their upstream projects to report them.

As a result, the only case for a potential security issue in all
these cases is when it ends up generating vulnerable output for
valid input source code.

As a result, the only case for a potential security issue in the
compiler is when it generates vulnerable application code for
trusted input source code that is conforming to the relevant
programming standard or extensions documented as supported by GCC
and the algorithm expressed in the source code does not have the
vulnerability.  The output application code could be considered
vulnerable if it produces an actual vulnerability in the target
application, specifically in the following cases:

- The application dereferences an invalid memory location despite
  the application sources being valid.
- The application reads from or writes to a valid but incorrect
  memory location, resulting in an information integrity issue or an
  information leak.
- The application ends up running in an infinite loop or with
  severe degradation in performance despite the input sources having
  no such issue, resulting in a Denial of Service.  Note that
  correct but non-performant code is not a security issue candidate,
  this only applies to incorrect code that may result in performance
  degradation severe enough to amount to a denial of service.
- The application crashes due to the generated incorrect code,
  resulting in a Denial of Service.

Language runtime libraries
--

GCC also builds and distributes libraries that are intended to be
used widely to implement runtime support for various programming
languages.  These include the following:

* libada
* libatomic
* libbacktrace
* libcc1
* libcody
* libcpp
* libdecnumber
* libffi
* libgcc
* libgfortran
* libgm2
* libgo
* libgomp
* libiberty
* libitm
* libobjc
* libphobos
* libquadmath
* libsanitizer
* libssp
* libstdc++

These libraries are intended to be used in arbitrary contexts and as
a result, bugs in these libraries may be evaluated for security
impact.  However, some of these libraries, e.g. libgo, libphobos,
etc.  are not maintained in the GCC project, due to which the GCC
project may not be the correct point of contact for them.  You are
encouraged to look at README files within those library directories
to locate the canonical security contact point for those projects
and include them in the report.  Once the issue is fixed in the
upstream project, the fix will be synced into GCC in a future
release.

Most security vulnerabilities in these runtime libraries arise when
an application 

Re: [RFC] GCC Security policy

2023-08-14 Thread Siddhesh Poyarekar

On 2023-08-14 17:16, Alexander Monakov wrote:


On Mon, 14 Aug 2023, Siddhesh Poyarekar wrote:


1. It makes it clear to users of the project the scope in which the project
could be used and what safety it could reasonably expect from the project.  In
the context of GCC for example, it cannot expect the compiler to do a safety
check of untrusted sources; the compiler will consider #include "/etc/passwd"
just as valid code as #include  and as a result, the onus is on the
user environment to validate the input sources for safety.


Whoa, no. We shouldn't make such statements unless we are prepared to explain
to users how such validation can be practically implemented, which I'm sure
we cannot in this case, due to future extensions such as the #embed directive,
and ability to obfuscate filenames using the preprocessor.


There's no practical (programmatic) way to do such validation; it has to 
be a manual audit, which is why source code passed to the compiler has 
to be *trusted*.



I think it would be more honest to say that crafted sources can result in
arbitrary code execution with the privileges of the user invoking the compiler,
and hence the operator may want to ensure that no sensitive data is available
to that user (via measures ranging from plain UNIX permissions, to chroots,
to virtual machines, to air-gapped computers, depending on threat model).


Right, that's what we're essentially trying to convey in the security 
policy text.  It doesn't go into mechanisms for securing execution 
(because that's really beyond the scope of the *project's* policy IMO) 
but it states unambiguously that input to the compiler must be trusted:


"""
  ... It is necessary that
all source code inputs to the compiler are trusted, since it is
impossible for the driver to validate input source code beyond
conformance to a programming language standard...
"""


Resource consumption is another good reason to sandbox compilers.


Agreed, we make that specific recommendation in the context of libgccjit.

Thanks,
Sid


Re: [RFC] GCC Security policy

2023-08-14 Thread Siddhesh Poyarekar

On 2023-08-14 14:51, Richard Sandiford wrote:

I think it would help to clarify what the aim of the security policy is.
Specifically:

(1) What service do we want to provide to users by classifying one thing
 as a security bug and another thing as not a security bug?

(2) What service do we want to provide to the GNU community by the same
 classification?

I think it will be easier to agree on the classification if we first
agree on that.


I actually wanted to do a talk on this at the Cauldron this year and 
*then* propose this for the gcc community, but I guess we could do this 
early :)


So the core intent of a security policy for a project is to make clear 
the security stance of the project, specifying to the extent possible 
what kind of uses are considered safe and what kinds of bugs would be 
considered security issues in the context of those uses.


There are a few advantages of doing this:

1. It makes it clear to users of the project the scope in which the 
project could be used and what safety it could reasonably expect from 
the project.  In the context of GCC for example, it cannot expect the 
compiler to do a safety check of untrusted sources; the compiler will 
consider #include "/etc/passwd" just as valid code as #include  
and as a result, the onus is on the user environment to validate the 
input sources for safety.


2. It helps the security community (Mitre and other CNAs and security 
researchers) set correct expectations of the project so that they don't 
cry wolf for every segfault or ICE under the pretext that code could 
presumably be run as a service somehow and hence result in a "DoS".


3. This in turn helps stave off spurious CVE submissions that cause 
needless churn in downstream distributions.  LLVM is already starting to 
see this[1] and it's only a matter of time before people start doing 
this for GCC.


4. It helps make a distinction between important bugs and security bugs; 
they're often conflated as one and the same thing.  Security bugs are 
special because they require different handling from those that do not 
have a security impact, regardless of their actual importance. 
Unfortunately one of the reasons they're special is because there's a 
bunch of (pretty dumb) automation out there that rings alarm bells on 
every single CVE.  Without a clear understanding of the context under 
which a project can be used, these alarm bells can be made unreasonably 
loud (due to incorrect scoring, see the LLVM CVE for instance; just one 
element in that vector changes the score from 0.0 to 5.5), causing 
needless churn in not just the code base but in downstream releases and 
end user environments.


5. This exercise is also a great start in developing an understanding of 
which parts in GCC are security sensitive and in what sense.  Runtime 
libraries for example have a direct impact on application security. 
Compiler impact is a little less direct.  Hardening features have 
another effect, but it's more mitigation-oriented than direct safety. 
This also informs us about the impact of various project actions such as 
bundling third-party libraries and development and maintenance of 
tooling within GCC and will hopefully guide policies around those practices.


I hope this is a sufficient start.  We don't necessarily want to get 
into the business of acknowledging or rejecting security issues as 
upstream at the moment (but see also the CNA discussion[2] of what we 
intend to do in that space for glibc) but having uniform upstream 
guidelines would be helpful to researchers as well as downstream 
consumers to help decide what constitutes a security issue.


Thanks,
Sid

[1] https://nvd.nist.gov/vuln/detail/CVE-2023-29932
[2] 
https://inbox.sourceware.org/libc-alpha/1a44f25a-5aa3-28b7-1ecb-b3991d44c...@gotplt.org/T/


Re: [RFC] GCC Security policy

2024-02-13 Thread Siddhesh Poyarekar

On 2024-02-12 10:00, Richard Biener wrote:

GCC driver support is then to the extent identifying the inputs and the outputs.


We already have -MM to generate a list of non-system dependencies, so 
gcc should be able to pass on inputs to the tool, which could then map 
those files (and the system headers directory) into the sandbox before 
invocation.  The output file could perhaps be enforced as having to be a 
new one, i.e. fail if the target is already found.



I'm not sure a generic utility can achieve this unless the outputs need to be
retrieved from somewhere else (not "usual" place when invoking un-sandboxed).

Even the driver doesn't necessarily know all files read/written.

So I suppose better defining of the actual goal is in order.


gcc -sandboxed -O2 -c t.ii -fdump-tree-all


what should this do?  IMO invoked tools (gas, cc1plus) should be restricted
to access the input files.  Ideally the dumps would appear where they
appear when not sandboxed but clearly overwriting existing files would be
problematic, writing new files not so much, but only to the standard (or
specified) auxiliary output file paths.


Couldn't we get away with not having to handle dump files?  They don't 
seem to be sensitive targets.


Thanks,
Sid


Re: [RFC] GCC Security policy

2024-02-09 Thread Siddhesh Poyarekar

On 2024-02-09 10:38, Martin Jambor wrote:

If anyone is interested in scoping this and then mentoring this as a
Google Summer of Code project this year then now is the right time to
speak up!


I can help with mentoring and reviews, although I'll need someone to 
assist with actual approvals.


There are two distinct sets of ideas to explore, one is privilege 
management and the other sandboxing.


For privilege management we could add a --allow-root driver flag that 
allows gcc to run as root.  Without the flag one could either outright 
refuse to run or drop privileges and run.  Dropping privileges will be a 
bit tricky to implement because it would need a user to drop privileges 
to and then there would be the question of how to manage file access to 
read the compiler input and write out the compiler output.  If there's 
no such user, gcc could refuse to run as root by default.  I wonder 
though if from a security posture perspective it makes sense to simply 
discourage running as root all the time and not bother trying to make it 
work with dropped privileges and all that.  Of course it would mean that 
this would be less of a "project"; it'll be a simple enough patch to 
refuse to run until --allow-root is specified.


This probably ties in somewhat with an idea David Malcolm had riffed on 
with me earlier, of caching files for diagnostics.  If we could unify 
file accesses somehow, we could make this happen, i.e. open/read files 
as root and then do all execution as non-root.


Sandboxing will have similar requirements, i.e. map in input files and 
an output file handle upfront and then unshare() into a sandbox to do 
the actual compilation.  This will make sure that at least the 
processing of inputs does not affect the system on which the compilation 
is being run.


Sid


Re: [RFC] GCC Security policy

2024-02-09 Thread Siddhesh Poyarekar

On 2024-02-09 12:14, Joseph Myers wrote:

On Fri, 9 Feb 2024, Siddhesh Poyarekar wrote:


For privilege management we could add a --allow-root driver flag that allows
gcc to run as root.  Without the flag one could either outright refuse to run
or drop privileges and run.  Dropping privileges will be a bit tricky to
implement because it would need a user to drop privileges to and then there
would be the question of how to manage file access to read the compiler input
and write out the compiler output.  If there's no such user, gcc could refuse
to run as root by default.  I wonder though if from a security posture
perspective it makes sense to simply discourage running as root all the time
and not bother trying to make it work with dropped privileges and all that.
Of course it would mean that this would be less of a "project"; it'll be a
simple enough patch to refuse to run until --allow-root is specified.


I think disallowing running as root would be a big problem in practice -
the typical problem case is when people build software as non-root and run
"make install" as root, and for some reason "make install" wants to
(re)build or (re)link something.


Isn't that a problematic practice though?  Or maybe have those 
invocations be separated out as CC_ROOT?


Thanks,
Sid


Re: [RFC] GCC Security policy

2024-02-12 Thread Siddhesh Poyarekar

On 2024-02-09 15:06, Joseph Myers wrote:

Ideally dependencies would be properly set up so that everything is built
in the original build, and ideally there would be no need to relink at
install time (I'm not sure of the exact circumstances in which it might be
needed, or on what OSes to e.g. encode the right library paths in final
installed executables).  In practice I think it's common for some building
to take place at install time.

There is a more general principle here of composability: it's not helpful
for being able to write scripts or makefiles combining invocations of
different utilities and have them behave predictably if some of those
utilities start making judgements about whether it's a good idea to run
them in a particular environment rather than just doing their job
independent of irrelevant aspects of the environment.  The semantics of
invoking "gcc" have nothing to do with whether it's run as root; it should
never need to look up what user it's running as at all.  (And it's
probably also a bad idea for lots of separate utilities to gain their own
ways to run in a restricted environment, for similar reasons; rather than
teaching "gcc" a way to create a restricted environment itself, ensure
there are easy-to-use more general utilities for running arbitrary
programs on untrusted input in a contained environment.)


I see your point.  The way you put it, there's no GCC project here at 
all then.


Sid


Re: [RFC] GCC Security policy

2024-02-12 Thread Siddhesh Poyarekar

On 2024-02-12 08:16, Martin Jambor wrote:

This probably ties in somewhat with an idea David Malcolm had riffed on
with me earlier, of caching files for diagnostics.  If we could unify
file accesses somehow, we could make this happen, i.e. open/read files
as root and then do all execution as non-root.

Sandboxing will have similar requirements, i.e. map in input files and
an output file handle upfront and then unshare() into a sandbox to do
the actual compilation.  This will make sure that at least the
processing of inputs does not affect the system on which the compilation
is being run.


Right.  As we often just download some (sometimes large) pre-processed
source from Bugzilla and then happily run GCC on it on our computers,
this feature might be actually useful for us (still, we'd probably need
a more concrete description of what we want, would e.g. using "-wrapper
gdb,--args" work in such a sandbox?).  I agree that for some even
semi-complex builds, a more general sandboxing solution is probably
better.


Joseph seems to be leaning towards nudging people to a general 
sandboxing solution too.  The question then is whether this takes the 
shape of a utility in, e.g. contrib that builds such a sandbox or simply 
a wiki page.


Thanks,
Sid


Re: [C PATCH, v2] Add Walloc-size to warn about insufficient size in allocations [PR71219]

2023-12-06 Thread Siddhesh Poyarekar

On 2023-12-06 09:21, Jakub Jelinek wrote:

On Wed, Dec 06, 2023 at 02:34:10PM +0100, Martin Uecker wrote:

Further I think
"size less than or equal to the size requested"
is quite ambiguous in the calloc case, isn't the size requested in the
calloc case actually nmemb * size rather than just size?


This is unclear but it can be understood this way.
This was also Joseph's point.

I am happy to submit a patch that changes the code so
that the swapped arguments to calloc do not cause a warning
anymore.


That would be my preference because then the allocation size is
correct and it is purely a style warning.
It doesn't follow how the warning is described:
"Warn about calls to allocation functions decorated with attribute
@code{alloc_size} that specify insufficient size for the target type of
the pointer the result is assigned to"
when the size is certainly sufficient.

But wonder what others think about it.


+1, from a libc perspective, the transposed arguments don't make a 
difference, a typical allocator will produce sufficiently sized 
allocation for the calloc call.



BTW, shouldn't the warning be for C++ as well?  Sure, I know,
people use operator new more often, but still, the 
allocators are used in there as well.

We have the -Wmemset-transposed-args warning, couldn't we
have a similar one for calloc, and perhaps do it solely in
the case where one uses sizeof of the type used in the cast
pointer?
So warn for
(struct S *) calloc (sizeof (struct S), 1)
or
(struct S *) calloc (sizeof (struct S), n)
but not for
(struct S *) calloc (4, 15)
or
(struct S *) calloc (sizeof (struct T), 1)
or similar?  Of course check for compatible types of TYPE_MAIN_VARIANTs.


+1, this could be an analyzer warning, since in practice it is just a 
code cleanliness issue.


Thanks,
Sid


Re: [C PATCH, v2] Add Walloc-size to warn about insufficient size in allocations [PR71219]

2023-12-06 Thread Siddhesh Poyarekar

On 2023-12-06 09:41, Jakub Jelinek wrote:

On Wed, Dec 06, 2023 at 09:30:32AM -0500, Siddhesh Poyarekar wrote:

We have the -Wmemset-transposed-args warning, couldn't we
have a similar one for calloc, and perhaps do it solely in
the case where one uses sizeof of the type used in the cast
pointer?
So warn for
(struct S *) calloc (sizeof (struct S), 1)
or
(struct S *) calloc (sizeof (struct S), n)
but not for
(struct S *) calloc (4, 15)
or
(struct S *) calloc (sizeof (struct T), 1)
or similar?  Of course check for compatible types of TYPE_MAIN_VARIANTs.


+1, this could be an analyzer warning, since in practice it is just a code
cleanliness issue.


We don't do such things in the analyzer, nor it is possible, by the time
analyzer sees the IL all the sizeofs etc. are folded.  Analyzer is for
expensive to compute warnings, code style warnings are normally implemented
in the FEs.


Thanks, understood.  A separate FE warning is fine as well.

Thanks,
Sid


[PATCH] SECURITY.txt: Drop "exploitable" in reference to hardening issues

2023-12-18 Thread Siddhesh Poyarekar
The "exploitable vulnerability" may lead to a misunderstanding that 
missed hardening issues are considered vulnerabilities, just that 
they're not exploitable.  This is not true, since while hardening bugs 
may be security-relevant, the absence of hardening does not make a 
program any more vulnerable to exploits than without.


Drop the "exploitable" word to make it clear that missed hardening is 
not considered a vulnerability.


diff --git a/SECURITY.txt b/SECURITY.txt
index b3e2bbfda90..126603d4c22 100644
--- a/SECURITY.txt
+++ b/SECURITY.txt
@@ -155,10 +155,10 @@ Security features implemented in GCC
 GCC implements a number of security features that reduce the impact
 of security issues in applications, such as -fstack-protector,
 -fstack-clash-protection, _FORTIFY_SOURCE and so on.  A failure of
-these features to function perfectly in all situations is not an
-exploitable vulnerability in itself since it does not affect the
-correctness of programs.  Further, they're dependent on heuristics
-and may not always have full coverage for protection.
+these features to function perfectly in all situations is not a
+vulnerability in itself since it does not affect the correctness of
+programs.  Further, they're dependent on heuristics and may not
+always have full coverage for protection.

 Similarly, GCC may transform code in a way that the correctness of
 the expressed algorithm is preserved, but supplementary properties


[PATCH] tree-object-size: Always set computed bit for bdos [PR113012]

2023-12-18 Thread Siddhesh Poyarekar
It is always safe to set the computed bit for dynamic object sizes at
the end of collect_object_sizes_for because even in case of a dependency
loop encountered in nested calls, we have an SSA temporary to actually
finish the object size expression.  The reexamine pass for dynamic
object sizes is only for propagation of unknowns and gimplification of
the size expressions, not for loop resolution as in the case of static
object sizes.

gcc/ChangeLog:

PR tree-optimization/113012
* gcc.dg/ubsan/pr113012.c: New test case.

gcc/testsuite/ChangeLog:

PR tree-optimization/113012
* tree-object-size.cc (compute_builtin_object_size): Expand
comment for dynamic object sizes.
(collect_object_sizes_for): Always set COMPUTED bitmap for
dynamic object sizes.

Signed-off-by: Siddhesh Poyarekar 
---
Testing:

- Bootstrapped x86_64 and config=ubsan
- Tested i686

OK for trunk and backport to gcc 13 branch?

 gcc/testsuite/gcc.dg/ubsan/pr113012.c | 17 +
 gcc/tree-object-size.cc   | 17 -
 2 files changed, 29 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pr113012.c

diff --git a/gcc/testsuite/gcc.dg/ubsan/pr113012.c 
b/gcc/testsuite/gcc.dg/ubsan/pr113012.c
new file mode 100644
index 000..4fc38cd1171
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/pr113012.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=undefined" } */
+
+int *
+foo (int x, int y, int z, int w)
+{
+  int *p = __builtin_malloc (z * sizeof (int));
+  int *q = p - 1;
+  while (--x > 0)
+{
+  if (w + 1 > y)
+   q = p - 1;
+  ++*q;
+  ++q;
+}
+  return p;
+}
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 28f27adf9ca..434b2fc0bf5 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -1185,10 +1185,12 @@ compute_builtin_object_size (tree ptr, int 
object_size_type,
  osi.tos = NULL;
}
 
-  /* First pass: walk UD chains, compute object sizes that
-can be computed.  osi.reexamine bitmap at the end will
-contain what variables were found in dependency cycles
-and therefore need to be reexamined.  */
+  /* First pass: walk UD chains, compute object sizes that can be computed.
+osi.reexamine bitmap at the end will contain what variables that need
+to be reexamined.  For both static and dynamic size computation,
+reexamination is for propagation across dependency loops. The dynamic
+case has the additional use case where the computed expression needs
+to be gimplified.  */
   osi.pass = 0;
   osi.changed = false;
   collect_object_sizes_for (, ptr);
@@ -1823,11 +1825,16 @@ collect_object_sizes_for (struct object_size_info *osi, 
tree var)
   gcc_unreachable ();
 }
 
-  if (! reexamine || object_sizes_unknown_p (object_size_type, varno))
+  /* Dynamic sizes use placeholder temps to return an answer, so it is always
+   * safe to set COMPUTED for them.  */
+  if ((object_size_type & OST_DYNAMIC)
+  || !reexamine || object_sizes_unknown_p (object_size_type, varno))
 {
   bitmap_set_bit (computed[object_size_type], varno);
   if (!(object_size_type & OST_DYNAMIC))
bitmap_clear_bit (osi->reexamine, varno);
+  else if (reexamine)
+   bitmap_set_bit (osi->reexamine, varno);
 }
   else
 {
-- 
2.43.0



[PATCH] tree-object-size: Clean up unknown propagation

2023-12-19 Thread Siddhesh Poyarekar
Narrow down scope of the unknowns bitmap so that it is only accessible
within the reexamination process.  This also removes any role of unknown
propagation from object_sizes_set, thus simplifying that code path a
bit.

gcc/ChangeLog:

* tree-object-size.cc (object_size_info): Remove UNKNOWNS.
Drop all references to it.
(object_sizes_set): Move unknowns propagation code to...
(gimplify_size_expressions): ... here.  Also free reexamine
bitmap.
(propagate_unknowns): New parameter UNKNOWNS.  Update callers.

Signed-off-by: Siddhesh Poyarekar 
---

This is a follow-up cleanup to pr#113012, but not required to fix that
bug.  Bootstrapped on x86_64 and with config=ubsan.

 gcc/tree-object-size.cc | 65 +
 1 file changed, 34 insertions(+), 31 deletions(-)

diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 434b2fc0bf5..08a3b7f5d94 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -43,7 +43,7 @@ struct object_size_info
   int object_size_type;
   unsigned char pass;
   bool changed;
-  bitmap visited, reexamine, unknowns;
+  bitmap visited, reexamine;
   unsigned int *depths;
   unsigned int *stack, *tos;
 };
@@ -264,19 +264,8 @@ object_sizes_set (struct object_size_info *osi, unsigned 
varno, tree val,
 {
   if (bitmap_bit_p (osi->reexamine, varno))
{
- if (size_unknown_p (val, object_size_type))
-   {
- oldval = object_sizes_get (osi, varno);
- old_wholeval = object_sizes_get (osi, varno, true);
- bitmap_set_bit (osi->unknowns, SSA_NAME_VERSION (oldval));
- bitmap_set_bit (osi->unknowns, SSA_NAME_VERSION (old_wholeval));
- bitmap_clear_bit (osi->reexamine, varno);
-   }
- else
-   {
- val = bundle_sizes (oldval, val);
- wholeval = bundle_sizes (old_wholeval, wholeval);
-   }
+ val = bundle_sizes (oldval, val);
+ wholeval = bundle_sizes (old_wholeval, wholeval);
}
   else
{
@@ -958,25 +947,26 @@ emit_phi_nodes (gimple *stmt, tree size, tree wholesize)
size_unknown, as noted in UNKNOWNS.  */
 
 static tree
-propagate_unknowns (object_size_info *osi, tree expr)
+propagate_unknowns (object_size_info *osi, tree expr, bitmap unknowns)
 {
   int object_size_type = osi->object_size_type;
 
   switch (TREE_CODE (expr))
 {
 case SSA_NAME:
-  if (bitmap_bit_p (osi->unknowns, SSA_NAME_VERSION (expr)))
+  if (bitmap_bit_p (unknowns, SSA_NAME_VERSION (expr)))
return size_unknown (object_size_type);
   return expr;
 
 case MIN_EXPR:
 case MAX_EXPR:
{
- tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 0));
+ tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 0),
+unknowns);
  if (size_unknown_p (res, object_size_type))
return res;
 
- res = propagate_unknowns (osi, TREE_OPERAND (expr, 1));
+ res = propagate_unknowns (osi, TREE_OPERAND (expr, 1), unknowns);
  if (size_unknown_p (res, object_size_type))
return res;
 
@@ -984,7 +974,8 @@ propagate_unknowns (object_size_info *osi, tree expr)
}
 case MODIFY_EXPR:
{
- tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 1));
+ tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 1),
+unknowns);
  if (size_unknown_p (res, object_size_type))
return res;
  return expr;
@@ -992,7 +983,8 @@ propagate_unknowns (object_size_info *osi, tree expr)
 case TREE_VEC:
   for (int i = 0; i < TREE_VEC_LENGTH (expr); i++)
{
- tree res = propagate_unknowns (osi, TREE_VEC_ELT (expr, i));
+ tree res = propagate_unknowns (osi, TREE_VEC_ELT (expr, i),
+unknowns);
  if (size_unknown_p (res, object_size_type))
return res;
}
@@ -1000,7 +992,8 @@ propagate_unknowns (object_size_info *osi, tree expr)
 case PLUS_EXPR:
 case MINUS_EXPR:
{
- tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 0));
+ tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 0),
+unknowns);
  if (size_unknown_p (res, object_size_type))
return res;
 
@@ -1025,6 +1018,7 @@ gimplify_size_expressions (object_size_info *osi)
   /* Step 1: Propagate unknowns into expressions.  */
   bitmap reexamine = BITMAP_ALLOC (NULL);
   bitmap_copy (reexamine, osi->reexamine);
+  bitmap unknowns = BITMAP_ALLOC (NULL);
   do
 {
   changed = false;
@@ -1032,14 +1026,23 @@ gimplify_size_expressions (object_size_info *osi)
{
  object_size cur = object_sizes_get_raw (osi, i);
 
- if (size_unknown_p (propagate

Re: [PATCH] tree-object-size: Always set computed bit for bdos [PR113012]

2023-12-19 Thread Siddhesh Poyarekar

On 2023-12-19 17:57, Jakub Jelinek wrote:

On Mon, Dec 18, 2023 at 11:42:52AM -0500, Siddhesh Poyarekar wrote:

It is always safe to set the computed bit for dynamic object sizes at
the end of collect_object_sizes_for because even in case of a dependency
loop encountered in nested calls, we have an SSA temporary to actually
finish the object size expression.  The reexamine pass for dynamic
object sizes is only for propagation of unknowns and gimplification of
the size expressions, not for loop resolution as in the case of static
object sizes.

gcc/ChangeLog:

PR tree-optimization/113012
* gcc.dg/ubsan/pr113012.c: New test case.

gcc/testsuite/ChangeLog:

PR tree-optimization/113012
* tree-object-size.cc (compute_builtin_object_size): Expand
comment for dynamic object sizes.
(collect_object_sizes_for): Always set COMPUTED bitmap for
dynamic object sizes.


You have the gcc/ChangeLog and gcc/testsuite/ChangeLog hunks swapped,
I think this wouldn't get through pre-commit hook.


Oops, fixed.


--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -1185,10 +1185,12 @@ compute_builtin_object_size (tree ptr, int 
object_size_type,
  osi.tos = NULL;
}
  
-  /* First pass: walk UD chains, compute object sizes that

-can be computed.  osi.reexamine bitmap at the end will
-contain what variables were found in dependency cycles
-and therefore need to be reexamined.  */
+  /* First pass: walk UD chains, compute object sizes that can be computed.
+osi.reexamine bitmap at the end will contain what variables that need


This wording is weird.
Perhaps ... will contain versions of SSA_NAMEs that need
to be reexamined. ?
Because varno seems to be SSA_NAME_VERSION.


Ack, it's unrelated to my change, but fixed since I touched the comment 
block.



@@ -1823,11 +1825,16 @@ collect_object_sizes_for (struct object_size_info *osi, 
tree var)
gcc_unreachable ();
  }
  
-  if (! reexamine || object_sizes_unknown_p (object_size_type, varno))

+  /* Dynamic sizes use placeholder temps to return an answer, so it is always
+   * safe to set COMPUTED for them.  */


We don't use this style of comments, please replace the * at the start of
second line with a space.


Oops, fixed.


+  if ((object_size_type & OST_DYNAMIC)
+  || !reexamine || object_sizes_unknown_p (object_size_type, varno))
  {
bitmap_set_bit (computed[object_size_type], varno);
if (!(object_size_type & OST_DYNAMIC))
bitmap_clear_bit (osi->reexamine, varno);
+  else if (reexamine)
+   bitmap_set_bit (osi->reexamine, varno);
  }
else
  {


Otherwise LGTM, but please wait at least a few weeks before backporting to
13.


OK, I'll push with fixes to trunk in a bit and then push to 13 next year.

Thanks,
Sid


Re: [PATCH] tree-object-size: Clean up unknown propagation

2023-12-20 Thread Siddhesh Poyarekar

On 2023-12-20 00:23, Jeff Law wrote:



On 12/19/23 10:21, Siddhesh Poyarekar wrote:

Narrow down scope of the unknowns bitmap so that it is only accessible
within the reexamination process.  This also removes any role of unknown
propagation from object_sizes_set, thus simplifying that code path a
bit.

gcc/ChangeLog:

* tree-object-size.cc (object_size_info): Remove UNKNOWNS.
Drop all references to it.
(object_sizes_set): Move unknowns propagation code to...
(gimplify_size_expressions): ... here.  Also free reexamine
bitmap.
(propagate_unknowns): New parameter UNKNOWNS.  Update callers.

Signed-off-by: Siddhesh Poyarekar 
---

This is a follow-up cleanup to pr#113012, but not required to fix that
bug.  Bootstrapped on x86_64 and with config=ubsan.
OK, assuming it also went through a regression test or does so before 
committing.


Thanks, yes, I also compared test results during the x86_64 bootstrap, 
likewise for i686 after I had posted this.


Sid


Re: [PATCH] SECURITY.txt: Drop "exploitable" in reference to hardening issues

2024-01-09 Thread Siddhesh Poyarekar

On 2023-12-18 09:35, Siddhesh Poyarekar wrote:
The "exploitable vulnerability" may lead to a misunderstanding that 
missed hardening issues are considered vulnerabilities, just that 
they're not exploitable.  This is not true, since while hardening bugs 
may be security-relevant, the absence of hardening does not make a 
program any more vulnerable to exploits than without.


Drop the "exploitable" word to make it clear that missed hardening is 
not considered a vulnerability.


Ping, may I commit this if there are no objections?

Thanks,
Sid



diff --git a/SECURITY.txt b/SECURITY.txt
index b3e2bbfda90..126603d4c22 100644
--- a/SECURITY.txt
+++ b/SECURITY.txt
@@ -155,10 +155,10 @@ Security features implemented in GCC
  GCC implements a number of security features that reduce the impact
  of security issues in applications, such as -fstack-protector,
  -fstack-clash-protection, _FORTIFY_SOURCE and so on.  A failure of
-    these features to function perfectly in all situations is not an
-    exploitable vulnerability in itself since it does not affect the
-    correctness of programs.  Further, they're dependent on heuristics
-    and may not always have full coverage for protection.
+    these features to function perfectly in all situations is not a
+    vulnerability in itself since it does not affect the correctness of
+    programs.  Further, they're dependent on heuristics and may not
+    always have full coverage for protection.

  Similarly, GCC may transform code in a way that the correctness of
  the expressed algorithm is preserved, but supplementary properties



Re: [gcc15] nested functions in C

2023-12-05 Thread Siddhesh Poyarekar

On 2023-12-04 16:31, Martin Uecker wrote:

If (assuming from them being called lambdas) they are primarily for
small functions without side-effects then it's already a significantly
stronger specification than what we have right now with C nested
functions.  That would end up enforcing what you demonstrate as the good
way to use nested functions.


The proposal we have seen for C23 (which was not accepted into
C23 mostly due to timing and lack of implementation experience)
were similar to C++'s lambdas and did not have any such restriction.


Oh well :/


If nested functions are eventually going to make it into the C standard
then effort is probably better spent in porting the C nested functions
to use descriptors instead of executable stacks or heaps.


I submitted a patch for this a long time ago which was based
on the code for Ada that uses a bit in the pointer to differentiate
between conventional pointers and descriptors.

I would now prefer an approach that uses a qualifier on the
function type to indicate that the static chain has to be
set. A pointer to such a qualified function would a descriptor
that consists of the address and the value for the static chain.

This would be useful for many things.


Ack, this probably becomes a gcc15 thing then, given that stage 1 has 
closed.  Are you planning to revive your work and repost?


Thanks,
Sid


Re: [PATCH] gcc: Disallow trampolines when -fhardened

2023-12-04 Thread Siddhesh Poyarekar

On 2023-12-02 04:42, Martin Uecker wrote:



Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
It came up that a good hardening strategy is to disable trampolines
which may require executable stack.  Therefore the following patch
adds -Werror=trampolines to -fhardened.


This would add a warning about specific code (where it is then
unclear whether rewriting it is feasible or even an improvement),
which seems different to all the other flags -fhardening has
now.


It's actually -Werror=trampolines, not just -Wtrampolines; the aim is to 
hard fail on producing trampolines and consequently, an executable 
stack.  In general the goal of -fhardened is to produce hardened code 
and the nested function trampolines do the exact reverse of that, so 
-Werror=trampolines seems to align perfectly with that goal, doesn't it?



GCC now has an option to allocate trampolines on the heap,
which would seem to be a better fit.  On the other hand,
it does not work with longjmp which may be a limitation.


For hardened code in C, I think we really should look to step away from 
nested functions instead of adding ways to continue supporting it. 
There's probably a larger conversation to be had about the utility of 
nested functions in general for C (and whether this GCC extension should 
be deprecated altogether in future), but I feel like the -fhardened 
subset gives us the opportunity to enforce at least a safe subset for 
now, possibly extending it in future.


Thanks,
Sid


Re: [PATCH] gcc: Disallow trampolines when -fhardened

2023-12-04 Thread Siddhesh Poyarekar

On 2023-12-04 11:39, Andreas Schwab wrote:

On Dez 04 2023, Siddhesh Poyarekar wrote:


For hardened code in C, I think we really should look to step away from
nested functions instead of adding ways to continue supporting it. There's
probably a larger conversation to be had about the utility of nested
functions in general for C (and whether this GCC extension should be
deprecated altogether in future), but I feel like the -fhardened subset
gives us the opportunity to enforce at least a safe subset for now,
possibly extending it in future.


Nested functions by itself don't need a trampoline, only if the address
of it is passed outside the containing function's scope (as a callback,
for example).


Yes, that's why I said that the conversation about deprecating the C 
nested functions extension is a broader one (and hence for gcc 15) that 
will likely involve the question of whether dropping the extension 
altogether gives any benefit or if dropping support for on-stack 
trampolines is sufficient.  On-heap trampolines are maybe slightly 
better in that they don't need an executable stack, but defaulting to 
on-heap trampolines for -fhardened seems like a lost opportunity to 
enforce better user code.


Thanks,
Sid


[gcc15] nested functions in C

2023-12-04 Thread Siddhesh Poyarekar
[Branching this into a separate conversation to avoid derailing the 
patch, which isn't directly related]


On 2023-12-04 12:21, Martin Uecker wrote:

I do not really agree with that.  Nested functions can substantially
improve code quality and in C can avoid type unsafe use of
void* pointers in callbacks. The code is often much better with
nested functions than without.  Nested functions and lambdas
(i.e. anonymous nested functions) are used in many languages
because they make code better and GNU's nested function are no
exception.

So I disagree with the idea that discouraging nested functions leads
to better code - I think the exact opposite is true.


I would argue that GNU's nested functions *are* an exception because 
they're like feathers stuck on a pig to try and make it fly; I think a 
significant specification effort is required to actually make it a 
cleanly usable feature.  It *may* be possible to implement patterns that 
use C nested functions well enough *and* result in readable code, but 
IMO it is easier to write clunky and unmaintainable code with it.


I empathize with Jakub's stated use case though of keeping the C 
frontend support for testing purposes, but that could easily be done 
behind a flag, or by putting nested C func deprecation behind a flag.



I am generally wary of mitigations that may make exploitation of
buffer overflows a bit harder  while increasing the likelihood
of buffer overflows by reducing type safety and/or code quality.

But I would agree that trampolines are generally problematic. A
better strategy would be wide function pointer type (as in Apple'
Blocks extension). Alternatively, an explicit way to obtain the
static chain for a nested function which could be used with
__builtin_call_with_static_chain  could also work.

But in any case, I think it diminishes the value of -fhardening
it if requires source code changes, because then it is not as easy
to simply turn it on in larger projects / distributitions.


I suppose you mean source code changes even in correct code just to 
comply with the flag?  I don't disagree for cases like -Warray-bounds, 
but for warnings/errors that are more deterministic in nature (like 
-Werror=trampolines), they're going to point at actual problems and 
larger projects and distributions will usually prefer to at least track 
them, if not actually fix them.  For Fedora we tend to provide macro 
overrides for packages that need to explicitly disable a security 
related flag.


Thanks,
Sid


Re: [gcc15] nested functions in C

2023-12-04 Thread Siddhesh Poyarekar

On 2023-12-04 13:48, Martin Uecker wrote:

I empathize with Jakub's stated use case though of keeping the C
frontend support for testing purposes, but that could easily be done
behind a flag, or by putting nested C func deprecation behind a flag.


I am relatively sure C will get some form of nested functions.
Maybe as anonymous nested functions, i.e. lambdas, but I do
not see a fundamental difference here (I personally like naming
things for clarity, so i prefer named nested functions)


If (assuming from them being called lambdas) they are primarily for 
small functions without side-effects then it's already a significantly 
stronger specification than what we have right now with C nested 
functions.  That would end up enforcing what you demonstrate as the good 
way to use nested functions.


I suppose minimal, contained side-effects (such as atomically updating a 
structure) may also constitute sound design, but that should be made 
explicit in the language.



I don't disagree for cases like -Warray-bounds,
but for warnings/errors that are more deterministic in nature (like
-Werror=trampolines), they're going to point at actual problems and
larger projects and distributions will usually prefer to at least track
them, if not actually fix them.  For Fedora we tend to provide macro
overrides for packages that need to explicitly disable a security
related flag.


In projects such as mine, this will lead to a lot of code
transformations as indicated above, i.e. much worse code.

One could get away with it, since nested functions are rarely
used, but I think this is bad, because a lot of code would
improve if it used them.


If nested functions are eventually going to make it into the C standard 
then effort is probably better spent in porting the C nested functions 
to use descriptors instead of executable stacks or heaps.


Thanks,
Sid


Re: [gcc15] nested functions in C

2023-12-04 Thread Siddhesh Poyarekar

On 2023-12-04 13:51, Jakub Jelinek wrote:

Why?  The syntax doesn't seem to be something unexpected, and as C doesn't
have lambdas, one can use the nested functions instead.
The only problem is if you need to pass function pointers somewhere else
(and target doesn't have function descriptors or something similar), if it
is only done to make code more readable compared to say use of macros, I
think the nested functions are better, one doesn't have to worry about
multiple evaluations of argument side-effects etc.  And if everything is
inlined and SRA optimized, there is no extra cost.
The problem of passing it as a function pointer to other functions is
common with C++, only lambdas which don't capture anything actually can be
convertible to function pointer, for anything else you need a template and
instantiate it for a particular lambda (which is something you can't do in
C).


I think from a language standpoint, the general idea that nested 
functions are just any functions inside functions (which is how the C 
nested functions essentially behave) is too broad and they should be 
restricted to minimal implementations that, e.g. don't have side-effects 
or if they do, there's explicit syntactic sugar to make it clearer.


If (like Martin stated earlier), nested functions are in fact going to 
enter the C standard in future then maybe this discussion is moot and we 
probably are better off implementing descriptor support for C to replace 
the on-stack trampolines instead of adding -Werror=trampolines in a hurry.


Thanks,
Sid


Re: [gcc15] nested functions in C

2023-12-07 Thread Siddhesh Poyarekar

On 2023-12-07 10:42, Eric Botcazou wrote:

I think from a language standpoint, the general idea that nested
functions are just any functions inside functions (which is how the C
nested functions essentially behave) is too broad and they should be
restricted to minimal implementations that, e.g. don't have side-effects
or if they do, there's explicit syntactic sugar to make it clearer.


That sounds totally arbitrary though.  Algol-derived languages have had nested
subprograms for ages, e.g. Pascal or Ada, and they can be very useful.


I'll admit that it is a subjective preference and is probably not in the 
spirit of traditional C.


Sid


Re: [PATCH v8 5/5] Add the 6th argument to .ACCESS_WITH_SIZE

2024-04-10 Thread Siddhesh Poyarekar

On 2024-03-29 12:07, Qing Zhao wrote:

to carry the TYPE of the flexible array.

Such information is needed during tree-object-size.cc.

We cannot use the result type or the type of the 1st argument
of the routine .ACCESS_WITH_SIZE to decide the element type
of the original array due to possible type casting in the
source code.

gcc/c/ChangeLog:

* c-typeck.cc (build_access_with_size_for_counted_by): Add the 6th
argument to .ACCESS_WITH_SIZE.

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): Use the type
of the 6th argument for the type of the element.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-6.c: New test.


This version looks fine to me for stage 1, but I'm not a maintainer so 
you'll need an ack from one to commit.


Thanks,
Sid


---
  gcc/c/c-typeck.cc | 11 +++--
  gcc/internal-fn.cc|  2 +
  .../gcc.dg/flex-array-counted-by-6.c  | 46 +++
  gcc/tree-object-size.cc   | 16 ---
  4 files changed, 66 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-6.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index f7b0e08459b0..05948f76039e 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2608,7 +2608,8 @@ build_counted_by_ref (tree datum, tree subdatum, tree 
*counted_by_type)
  
 to:
  
-   (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1))

+   (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1,
+   (TYPE_OF_ARRAY *)0))
  
 NOTE: The return type of this function is the POINTER type pointing

 to the original flexible array type.
@@ -2620,6 +2621,9 @@ build_counted_by_ref (tree datum, tree subdatum, tree 
*counted_by_type)
 The 4th argument of the call is a constant 0 with the TYPE of the
 object pointed by COUNTED_BY_REF.
  
+   The 6th argument of the call is a constant 0 with the pointer TYPE

+   to the original flexible array type.
+
*/
  static tree
  build_access_with_size_for_counted_by (location_t loc, tree ref,
@@ -2632,12 +2636,13 @@ build_access_with_size_for_counted_by (location_t loc, 
tree ref,
  
tree call

  = build_call_expr_internal_loc (loc, IFN_ACCESS_WITH_SIZE,
-   result_type, 5,
+   result_type, 6,
array_to_pointer_conversion (loc, ref),
counted_by_ref,
build_int_cst (integer_type_node, 1),
build_int_cst (counted_by_type, 0),
-   build_int_cst (integer_type_node, -1));
+   build_int_cst (integer_type_node, -1),
+   build_int_cst (result_type, 0));
/* Wrap the call with an INDIRECT_REF with the flexible array type.  */
call = build1 (INDIRECT_REF, TREE_TYPE (ref), call);
SET_EXPR_LOCATION (call, loc);
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index e744080ee670..34e4a4aea534 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3411,6 +3411,8 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
   1: read_only
   2: write_only
   3: read_write
+   6th argument: A constant 0 with the pointer TYPE to the original flexible
+ array type.
  
 Both the return type and the type of the first argument of this

 function have been converted from the incomplete array type to
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c
new file mode 100644
index ..65fa01443d95
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c
@@ -0,0 +1,46 @@
+/* Test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size: when the type of the flexible array member
+ * is casting to another type.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+typedef unsigned short u16;
+
+struct info {
+   u16 data_len;
+   char data[] __attribute__((counted_by(data_len)));
+};
+
+struct foo {
+   int a;
+   int b;
+};
+
+static __attribute__((__noinline__))
+struct info *setup ()
+{
+ struct info *p;
+ size_t bytes = 3 * sizeof(struct foo);
+
+ p = (struct info *)malloc (sizeof (struct info) + bytes);
+ p->data_len = bytes;
+
+ return p;
+}
+
+static void
+__attribute__((__noinline__)) report (struct info *p)
+{
+ struct foo *bar = (struct foo *)p->data;
+ EXPECT(__builtin_dynamic_object_size((char *)(bar + 1), 1), 16);
+ EXPECT(__builtin_dynamic_object_size((char *)(bar + 2), 1), 8);
+}
+
+int main(int argc, char *argv[])
+{
+ struct info *p = setup();
+ report(p);
+ return 0;
+}
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 8de264d1dee2..4c1fa9b555fa 100644
--- 

Re: [PATCH v8 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-04-10 Thread Siddhesh Poyarekar

On 2024-03-29 12:07, Qing Zhao wrote:

gcc/c-family/ChangeLog:

* c-ubsan.cc (get_bound_from_access_with_size): New function.
(ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-4.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
---


This version looks fine to me for stage 1, but I'm not a maintainer so 
you'll need an ack from one to commit.


Thanks,
Sid


  gcc/c-family/c-ubsan.cc   | 42 +
  .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++
  .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++
  .../ubsan/flex-array-counted-by-bounds-4.c| 34 ++
  .../ubsan/flex-array-counted-by-bounds.c  | 46 +++
  5 files changed, 201 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-4.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819ddf..7cd3c6aa5b88 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc)
return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, 
data));
  }
  
+/* Get the tree that represented the number of counted_by, i.e, the maximum

+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int class_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));
+  tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3));
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+  build_int_cst (ptr_type_node, 0));
+  /* If size is negative value, treat it as zero.  */
+  if (!TYPE_UNSIGNED (type))
+  {
+tree cond = fold_build2 (LT_EXPR, boolean_type_node,
+unshare_expr (size), build_zero_cst (type));
+size = fold_build3 (COND_EXPR, type, cond,
+   build_zero_cst (type), size);
+  }
+
+  /* Only when class_of_size is 1, i.e, the number of the elements of
+ the object type, return the size.  */
+  if (class_of_size != 1)
+return NULL_TREE;
+  else
+size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+
  /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
 that gets expanded in the sanopt pass, and make an array dimension
 of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
  && COMPLETE_TYPE_P (type)
  && integer_zerop (TYPE_SIZE (type)))
bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+  else if (INDIRECT_REF_P (array)
+  && is_access_with_size_p ((TREE_OPERAND (array, 0
+   {
+ bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
+ bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
+  bound,
+  build_int_cst (TREE_TYPE (bound), 1));
+   }
else
return NULL_TREE;
  }
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index ..b503320628d2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* Test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include 
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+   int n;
+   int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+void 

Re: [PATCH v8 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-04-10 Thread Siddhesh Poyarekar

On 2024-03-29 12:07, Qing Zhao wrote:

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): New function.
(call_object_size): Call the new function.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
* gcc.dg/flex-array-counted-by-3.c: New test.
* gcc.dg/flex-array-counted-by-4.c: New test.
* gcc.dg/flex-array-counted-by-5.c: New test.


This version looks fine to me for stage 1, but I'm not a maintainer so 
you'll need an ack from one to commit.


Thanks,
Sid


---
  .../gcc.dg/builtin-object-size-common.h   |  11 ++
  .../gcc.dg/flex-array-counted-by-3.c  |  63 +++
  .../gcc.dg/flex-array-counted-by-4.c  | 178 ++
  .../gcc.dg/flex-array-counted-by-5.c  |  48 +
  gcc/tree-object-size.cc   |  60 ++
  5 files changed, 360 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h 
b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953a..b677067c6e6b 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
__builtin_abort ();   \
  return 0;   \
} while (0)
+
+#define EXPECT(p, _v) do {   \
+  size_t v = _v; \
+  if (p == v)\
+__builtin_printf ("ok:  %s == %zd\n", #p, p);  \
+  else   \
+{\
+  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);  
\
+  FAIL ();   \
+}\
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index ..78f50230e891
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* Test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+union {
+  int b;
+  float f; 
+};
+int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+= (struct flex *)malloc (sizeof (struct flex)
++ normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+= (struct annotated *)malloc (sizeof (struct annotated)
+ + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+= (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
++ attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+  array_annotated->b * sizeof (int));
+EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+  array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index ..20103d58ef51
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* Test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+We should always use the latest value that is hold by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline 

Re: [PATCH v2 3/3] Add testing cases for flexible array members in unions and alone in structures.

2024-04-25 Thread Siddhesh Poyarekar

On 2024-04-25 10:06, Qing Zhao wrote:

gcc/testsuite/ChangeLog:

* c-c++-common/fam-in-union-alone-in-struct-1.c: New testcase.
* c-c++-common/fam-in-union-alone-in-struct-2.c: New testcase.
* c-c++-common/fam-in-union-alone-in-struct-3.c: New testcase.
---


Sorry I should have commented sooner; could you please also add tests 
for __bos/__bdos for such unions and structs?


Thanks,
Sid


  .../fam-in-union-alone-in-struct-1.c  | 52 +++
  .../fam-in-union-alone-in-struct-2.c  | 51 ++
  .../fam-in-union-alone-in-struct-3.c  | 36 +
  3 files changed, 139 insertions(+)
  create mode 100644 gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c
  create mode 100644 gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
  create mode 100644 gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-3.c

diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c 
b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c
new file mode 100644
index ..7d4721aa95ac
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c
@@ -0,0 +1,52 @@
+/* testing the correct usage of flexible array members in unions
+   and alone in structures.  */
+/* { dg-do run} */
+/* { dg-options "-Wpedantic" } */
+
+union with_fam_1 {
+  int a;
+  int b[];  /* { dg-warning "flexible array member in union is a GCC 
extension" } */
+};
+
+union with_fam_2 {
+  char a;
+  int b[];  /* { dg-warning "flexible array member in union is a GCC 
extension" } */
+};
+
+union with_fam_3 {
+  char a[];  /* { dg-warning "flexible array member in union is a GCC 
extension" } */
+  /* { dg-warning "in an otherwise empty" "" { target c++ } .-1 } */
+  int b[];  /* { dg-warning "flexible array member in union is a GCC 
extension" } */
+};
+
+struct only_fam {
+  int b[];
+  /* { dg-warning "in a struct with no named members" "" { target c } .-1 } */
+  /* { dg-warning "in an otherwise empty" "" { target c++ } .-2 } */
+  /* { dg-warning "forbids flexible array member" "" { target c++ } .-3 } */
+};
+
+struct only_fam_2 {
+  unsigned int : 2;
+  unsigned int : 3;
+  int b[];
+  /* { dg-warning "in a struct with no named members" "" { target c } .-1 } */
+  /* { dg-warning "in an otherwise empty" "" { target c++ } .-2 } */
+  /* { dg-warning "forbids flexible array member" "" { target c++ } .-3 } */
+};
+
+int main ()
+{
+  if (sizeof (union with_fam_1) != sizeof (int))
+__builtin_abort ();
+  if (sizeof (union with_fam_2) != __alignof__ (int))
+__builtin_abort ();
+  if (sizeof (union with_fam_3) != 0)
+__builtin_abort ();
+  if (sizeof (struct only_fam) != 0)
+__builtin_abort ();
+  if (sizeof (struct only_fam_2) != sizeof (int))
+__builtin_abort ();
+  return 0;
+}
+
diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c 
b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
new file mode 100644
index ..3743f9e7dac5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
@@ -0,0 +1,51 @@
+/* testing the correct usage of flexible array members in unions
+   and alone in structures: initialization  */
+/* { dg-do run} */
+/* { dg-options "-O2" } */
+
+union with_fam_1 {
+  int a;
+  int b[];
+} with_fam_1_v = {.b = {1, 2, 3, 4}};
+
+union with_fam_2 {
+  int a;
+  char b[];
+} with_fam_2_v = {.a = 0x1f2f3f4f};
+
+union with_fam_3 {
+  char a[];
+  int b[];
+} with_fam_3_v = {.b = {0x1f2f3f4f, 0x5f6f7f7f}};
+
+struct only_fam {
+  int b[];
+} only_fam_v = {{7, 11}};
+
+struct only_fam_2 {
+  unsigned int : 2;
+  unsigned int : 3;
+  int b[];
+} only_fam_2_v = {{7, 11}};
+
+int main ()
+{
+  if (with_fam_1_v.b[3] != 4
+  || with_fam_1_v.b[0] != 1)
+__builtin_abort ();
+  if (with_fam_2_v.b[3] != 0x1f
+  || with_fam_2_v.b[0] != 0x4f)
+__builtin_abort ();
+  if (with_fam_3_v.a[0] != 0x4f
+  || with_fam_3_v.a[7] != 0x5f)
+__builtin_abort ();
+  if (only_fam_v.b[0] != 7
+  || only_fam_v.b[1] != 11)
+__builtin_abort ();
+  if (only_fam_2_v.b[0] != 7
+  || only_fam_2_v.b[1] != 11)
+__builtin_abort ();
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-3.c 
b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-3.c
new file mode 100644
index ..dd36fa01306d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-3.c
@@ -0,0 +1,36 @@
+/* testing the correct usage of flexible array members in unions
+   and alone in structures.  */
+/* { dg-do compile } */
+/* { dg-options "-pedantic-errors" } */
+
+union with_fam_1 {
+  int a;
+  int b[];  /* { dg-error "flexible array member in union is a GCC extension" 
} */
+};
+
+union with_fam_2 {
+  char a;
+  int b[];  /* { dg-error "flexible array member in union is a GCC extension" 
} */
+};
+
+union with_fam_3 {
+  char a[];  /* { dg-error "flexible array member in union is a GCC 

Re: [PATCH v6 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-03-18 Thread Siddhesh Poyarekar

On 2024-03-18 12:28, Qing Zhao wrote:

This should probably bail out if object_size_type & OST_DYNAMIC == 0.

Okay. Will add this.


When add this into access_with_size_object_size, I found some old bugs 
in early_object_sizes_execute_one, and fixed them as well.




Would you be able to isolate this fix and post them separately?  I 
reckon they would be relevant for gcc 14 too.


Thanks,
Sid


Re: [PATCH v6 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-03-11 Thread Siddhesh Poyarekar

On 2024-02-16 14:47, Qing Zhao wrote:

'counted_by (COUNT)'
  The 'counted_by' attribute may be attached to the C99 flexible
  array member of a structure.  It indicates that the number of the
  elements of the array is given by the field named "COUNT" in the
  same structure as the flexible array member.  GCC uses this
  information to improve the results of the array bound sanitizer and
  the '__builtin_dynamic_object_size'.

  For instance, the following code:

   struct P {
 size_t count;
 char other;
 char array[] __attribute__ ((counted_by (count)));
   } *p;

  specifies that the 'array' is a flexible array member whose number
  of elements is given by the field 'count' in the same structure.

  The field that represents the number of the elements should have an
  integer type.  Otherwise, the compiler will report a warning and
  ignore the attribute.

  When the field that represents the number of the elements is assigned a
  negative integer value, the compiler will treat the value as zero.

  An explicit 'counted_by' annotation defines a relationship between
  two objects, 'p->array' and 'p->count', and there are the following
  requirementthat on the relationship between this pair:

 * 'p->count' should be initialized before the first reference to
   'p->array';

 * 'p->array' has _at least_ 'p->count' number of elements
   available all the time.  This relationship must hold even
   after any of these related objects are updated during the
   program.

  It's the user's responsibility to make sure the above requirements
  to be kept all the time.  Otherwise the compiler will report
  warnings, at the same time, the results of the array bound
  sanitizer and the '__builtin_dynamic_object_size' is undefined.

  One important feature of the attribute is, a reference to the
  flexible array member field will use the latest value assigned to
  the field that represents the number of the elements before that
  reference.  For example,

 p->count = val1;
 p->array[20] = 0;  // ref1 to p->array
 p->count = val2;
 p->array[30] = 0;  // ref2 to p->array

  in the above, 'ref1' will use 'val1' as the number of the elements
  in 'p->array', and 'ref2' will use 'val2' as the number of elements
  in 'p->array'.


I can't approve of course, but here's a review of the code that should 
hopefully make it easier for the C frontend maintainers.




gcc/c-family/ChangeLog:

PR C/108896
* c-attribs.cc (handle_counted_by_attribute): New function.
(attribute_takes_identifier_p): Add counted_by attribute to the list.
* c-common.cc (c_flexible_array_member_type_p): ...To this.
* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

PR C/108896
* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
(add_flexible_array_elts_to_size): Use renamed function.
(is_flexible_array_member_p): Use renamed function.
(verify_counted_by_attribute): New function.
(finish_struct): Use renamed function and verify counted_by
attribute.
* c-tree.h (lookup_field): New prototype.
* c-typeck.cc (lookup_field): Expose as extern function.

gcc/ChangeLog:

PR C/108896
* doc/extend.texi: Document attribute counted_by.

gcc/testsuite/ChangeLog:

PR C/108896
* gcc.dg/flex-array-counted-by.c: New test.
---
  gcc/c-family/c-attribs.cc| 54 -
  gcc/c-family/c-common.cc | 13 +++
  gcc/c-family/c-common.h  |  1 +
  gcc/c/c-decl.cc  | 85 
  gcc/c/c-tree.h   |  1 +
  gcc/c/c-typeck.cc|  3 +-
  gcc/doc/extend.texi  | 64 +++
  gcc/testsuite/gcc.dg/flex-array-counted-by.c | 40 +
  8 files changed, 241 insertions(+), 20 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 40a0cf90295d..4395c0656b14 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -105,6 +105,8 @@ static tree handle_warn_if_not_aligned_attribute (tree *, 
tree, tree,
  int, bool *);
  static tree handle_strict_flex_array_attribute (tree *, tree, tree,
 int, bool *);
+static tree handle_counted_by_attribute (tree *, tree, tree,
+  int, bool *);
  static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
  static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;

Re: [PATCH v6 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-03-11 Thread Siddhesh Poyarekar




On 2024-02-16 14:47, Qing Zhao wrote:

Including the following changes:
* The definition of the new internal function .ACCESS_WITH_SIZE
   in internal-fn.def.
* C FE converts every reference to a FAM with a "counted_by" attribute
   to a call to the internal function .ACCESS_WITH_SIZE.
   (build_component_ref in c_typeck.cc)

   This includes the case when the object is statically allocated and
   initialized.
   In order to make this working, the routines initializer_constant_valid_p_1
   and output_constant in varasm.cc are updated to handle calls to
   .ACCESS_WITH_SIZE.
   (initializer_constant_valid_p_1 and output_constant in varasm.c)

   However, for the reference inside "offsetof", the "counted_by" attribute is
   ignored since it's not useful at all.
   (c_parser_postfix_expression in c/c-parser.cc)

   In addtion to "offsetof", for the reference inside operator "typeof" and
   "alignof", we ignore counted_by attribute too.

   When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE,
   replace the call with its first argument.

* Convert every call to .ACCESS_WITH_SIZE to its first argument.
   (expand_ACCESS_WITH_SIZE in internal-fn.cc)
* Adjust alias analysis to exclude the new internal from clobbering anything.
   (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in 
tree-ssa-alias.cc)
* Adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE when
   it's LHS is eliminated as dead code.
   (eliminate_unnecessary_stmts in tree-ssa-dce.cc)
* Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
   get the reference from the call to .ACCESS_WITH_SIZE.
   (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)

gcc/c/ChangeLog:

* c-parser.cc (c_parser_postfix_expression): Ignore the counted-by
attribute when build_component_ref inside offsetof operator.
* c-tree.h (build_component_ref): Add one more parameter.
* c-typeck.cc (build_counted_by_ref): New function.
(build_access_with_size_for_counted_by): New function.
(build_component_ref): Check the counted-by attribute and build
call to .ACCESS_WITH_SIZE.
(build_unary_op): When building ADDR_EXPR for
 .ACCESS_WITH_SIZE, use its first argument.
 (lvalue_p): Accept call to .ACCESS_WITH_SIZE.

gcc/ChangeLog:

* internal-fn.cc (expand_ACCESS_WITH_SIZE): New function.
* internal-fn.def (ACCESS_WITH_SIZE): New internal function.
* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Special case
IFN_ACCESS_WITH_SIZE.
(call_may_clobber_ref_p_1): Special case IFN_ACCESS_WITH_SIZE.
* tree-ssa-dce.cc (eliminate_unnecessary_stmts): Eliminate the call
to .ACCESS_WITH_SIZE when its LHS is dead.
* tree.cc (process_call_operands): Adjust side effect for function
.ACCESS_WITH_SIZE.
(is_access_with_size_p): New function.
(get_ref_from_access_with_size): New function.
* tree.h (is_access_with_size_p): New prototype.
(get_ref_from_access_with_size): New prototype.
* varasm.cc (initializer_constant_valid_p_1): Handle call to
.ACCESS_WITH_SIZE.
(output_constant): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-2.c: New test.
---
  gcc/c/c-parser.cc |  10 +-
  gcc/c/c-tree.h|   2 +-
  gcc/c/c-typeck.cc | 128 +-
  gcc/internal-fn.cc|  36 +
  gcc/internal-fn.def   |   4 +
  .../gcc.dg/flex-array-counted-by-2.c  | 112 +++
  gcc/tree-ssa-alias.cc |   2 +
  gcc/tree-ssa-dce.cc   |   5 +-
  gcc/tree.cc   |  25 +++-
  gcc/tree.h|   8 ++
  gcc/varasm.cc |  10 ++
  11 files changed, 332 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index c31349dae2ff..a6ed5ac43bb1 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -10850,9 +10850,12 @@ c_parser_postfix_expression (c_parser *parser)
if (c_parser_next_token_is (parser, CPP_NAME))
  {
c_token *comp_tok = c_parser_peek_token (parser);
+   /* Ignore the counted_by attribute for reference inside
+  offsetof since the information is not useful at all.  */
offsetof_ref
  = build_component_ref (loc, offsetof_ref, comp_tok->value,
-comp_tok->location, UNKNOWN_LOCATION);
+comp_tok->location, UNKNOWN_LOCATION,
+false);
c_parser_consume_token (parser);

Re: [PATCH v6 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-03-11 Thread Siddhesh Poyarekar




On 2024-02-16 14:47, Qing Zhao wrote:

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): New function.
(call_object_size): Call the new function.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
* gcc.dg/flex-array-counted-by-3.c: New test.
* gcc.dg/flex-array-counted-by-4.c: New test.
* gcc.dg/flex-array-counted-by-5.c: New test.
---
  .../gcc.dg/builtin-object-size-common.h   |  11 ++
  .../gcc.dg/flex-array-counted-by-3.c  |  63 +++
  .../gcc.dg/flex-array-counted-by-4.c  | 178 ++
  .../gcc.dg/flex-array-counted-by-5.c  |  48 +
  gcc/tree-object-size.cc   |  59 ++
  5 files changed, 359 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h 
b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953a..b677067c6e6b 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
__builtin_abort ();   \
  return 0;   \
} while (0)
+
+#define EXPECT(p, _v) do {   \
+  size_t v = _v; \
+  if (p == v)\
+__builtin_printf ("ok:  %s == %zd\n", #p, p);  \
+  else   \
+{\
+  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);  
\
+  FAIL ();   \
+}\
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index ..0066c32ca808
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+union {
+  int b;
+  float f; 
+};
+int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+= (struct flex *)malloc (sizeof (struct flex)
++ normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+= (struct annotated *)malloc (sizeof (struct annotated)
+ + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+= (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
++ attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+  array_annotated->b * sizeof (int));
+EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+  array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index ..3ce7f3545549
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+we should always use the latest value that is hold by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline __attribute__((__noinline__))
+#define SIZE_BUMP 10
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+
+/* In general, Due to type casting, the 

Re: [PATCH v6 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-03-11 Thread Siddhesh Poyarekar




On 2024-02-16 14:47, Qing Zhao wrote:

gcc/c-family/ChangeLog:

* c-ubsan.cc (get_bound_from_access_with_size): New function.
(ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
---
  gcc/c-family/c-ubsan.cc   | 42 +
  .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++
  .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++
  .../ubsan/flex-array-counted-by-bounds.c  | 46 +++
  4 files changed, 167 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819ddf..164b29845b3a 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc)
return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, 
data));
  }
  
+/* Get the tree that represented the number of counted_by, i.e, the maximum

+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int type_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));


Again for consistency, this should probably be class_of_size.


+  tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3));
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+  build_int_cst (ptr_type_node, 0));
+  /* If size is negative value, treat it as zero.  */
+  if (!TYPE_UNSIGNED (type))
+  {
+tree cond = fold_build2 (LT_EXPR, boolean_type_node,
+unshare_expr (size), build_zero_cst (type));
+size = fold_build3 (COND_EXPR, type, cond,
+   build_zero_cst (type), size);
+  }
+
+  /* Only when type_of_size is 1,i.e, the number of the elements of
+ the object type, return the size.  */
+  if (type_of_size != 1)
+return NULL_TREE;
+  else
+size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+
  /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
 that gets expanded in the sanopt pass, and make an array dimension
 of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
  && COMPLETE_TYPE_P (type)
  && integer_zerop (TYPE_SIZE (type)))
bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+  else if (INDIRECT_REF_P (array)
+  && is_access_with_size_p ((TREE_OPERAND (array, 0
+   {
+ bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
+ bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
+  bound,
+  build_int_cst (TREE_TYPE (bound), 1));
+   }


This will wrap if bound == 0, maybe that needs to be special-cased.  And 
maybe also add a test for it below.



else
return NULL_TREE;
  }
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index ..148934975ee5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include 
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+   int n;
+   int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+void __attribute__((__noinline__)) setup_and_test_vla_1 (int n1, int n2, int m)
+{
+  struct foo {
+int n;
+int p[][n2][n1] __attribute__((counted_by(n)));
+  } *f;
+
+  f 

<    1   2   3   4