Re: [PATCH] tree-object-size: Support strndup and strdup
On 2022-09-23 09:02, Jakub Jelinek wrote: Oh, so can addr_object_size be simplified to use get_base_address too? You can try. As you can see in get_base_address, that function handles something that the above doesn't (looking through some MEM_REFs too). I went down this rabbithole and it actually simplifies some cases but got sucked into flex array related issues that I need more time to figure out. I'll stick to using get_base_address for now since I want to make sure this makes the stage 1 deadline. Thanks, Sid
Re: [PATCH 2/2] tree-object-size: More consistent behaviour with flex arrays
On 2023-01-31 07:46, Jakub Jelinek wrote: On Wed, Dec 21, 2022 at 05:25:54PM -0500, Siddhesh Poyarekar wrote: The tree object size pass tries to fail when it detects a flex array in the struct, but it ends up doing the right thing only when the flex array is in the outermost struct. For nested cases (such as arrays nested in a union or an inner struct), it ends up taking whatever value the flex array is declared with, using zero for the standard flex array, i.e. []. Rework subobject size computation to make it more consistent across the board, honoring -fstrict-flex-arrays. With this change, any array at the end of the struct will end up causing __bos to use the allocated value of the outer object, bailing out in the maximum case when it can't find it. In the minimum case, it will return the subscript value or the allocated value of the outer object, whichever is larger. I think it is way too late in the GCC 13 cycle to change behavior here except for when -fstrict-flex-arrays is used. I agree. Plus, am not really convinced it is a good idea to change the behavior here except for the new options, programs in the wild took 2+ years to acommodate for what we GCC requiring and am not sure they'd be willing to be adjusted again. The behaviour change basically does two things: better minimum object size estimates and more conservative object size estimates for trailing arrays. ISTM that this should in fact reduce breakages due to flex arrays, although one could argue that protection gets reduced for trailing arrays without -fstrict-flex-arrays. It wouldn't cause non-mitigation behaviour changes though, would it? We don't need to rush this of course, we could consider this for gcc 14 given that we're well into stage 4. Thanks, Sid
Re: [PATCH 2/2] Documentation Update.
On 2023-02-01 13:24, Qing Zhao wrote: On Feb 1, 2023, at 11:55 AM, Siddhesh Poyarekar wrote: On 2023-01-31 09:11, Qing Zhao wrote: Update documentation to clarify a GCC extension on structure with flexible array member being nested in another structure. gcc/ChangeLog: * doc/extend.texi: Document GCC extension on a structure containing a flexible array member to be a member of another structure. Should this resolve pr#77650 since the proposed action there appears to be to document these semantics? My understanding of pr77650 is specifically for documentation on the following case: The structure with a flexible array member is the middle field of another structure. Which I added in the documentation as the 2nd situation. However, I am still not very comfortable on my current clarification on this situation: how should we document on the expected gcc behavior to handle such situation? I reckon wording that dissuades programmers from using this might be appropriate, i.e. don't rely on this and if you already have such nested flex arrays, change code to remove them. +In the above, @code{flex_data.data[]} is allowed to be extended flexibly to +the padding. E.g, up to 4 elements. """ ... Relying on space in struct padding is bad programming practice and any code relying on this behaviour should be modified to ensure that flexible array members only end up at the ends of arrays. The `-pedantic` flag should help identify such uses. """ Although -pedantic will also flag on flex arrays nested in structs even if they're at the end of the parent struct, so my suggestion on the warning is not really perfect. Sid
[committed v2] testsuite: Run __bos tests to completion
Instead of failing on first error, run all __builtin_object_size and __builtin_dynamic_object_size tests to completion and then provide a summary of which tests failed. gcc/testsuite/ChangeLog: * gcc.dg/builtin-dynamic-object-size-0.c: Move FAIL and nfail into... * gcc.dg/builtin-object-size-common.h: ... new file. * g++.dg/ext/builtin-object-size1.C: Include builtin-object-size-common.h. Replace all abort with FAIL. (main): Call DONE. * g++.dg/ext/builtin-object-size2.C: Likewise. * gcc.dg/builtin-object-size-1.c: Likewise. * gcc.dg/builtin-object-size-12.c: Likewise. * gcc.dg/builtin-object-size-13.c: Likewise. * gcc.dg/builtin-object-size-15.c: Likewise. * gcc.dg/builtin-object-size-2.c: Likewise. * gcc.dg/builtin-object-size-3.c: Likewise. * gcc.dg/builtin-object-size-4.c: Likewise. * gcc.dg/builtin-object-size-6.c: Likewise. * gcc.dg/builtin-object-size-7.c: Likewise. * gcc.dg/builtin-object-size-8.c: Likewise. * gcc.dg/pr101836.c: Likewise. * gcc.dg/strict-flex-array-3.c: Likewise. Signed-off-by: Siddhesh Poyarekar --- .../g++.dg/ext/builtin-object-size1.C | 260 --- .../g++.dg/ext/builtin-object-size2.C | 260 --- .../gcc.dg/builtin-dynamic-object-size-0.c| 14 +- gcc/testsuite/gcc.dg/builtin-object-size-1.c | 297 - gcc/testsuite/gcc.dg/builtin-object-size-12.c | 12 +- gcc/testsuite/gcc.dg/builtin-object-size-13.c | 15 +- gcc/testsuite/gcc.dg/builtin-object-size-15.c | 11 +- gcc/testsuite/gcc.dg/builtin-object-size-2.c | 305 +- gcc/testsuite/gcc.dg/builtin-object-size-3.c | 275 gcc/testsuite/gcc.dg/builtin-object-size-4.c | 285 gcc/testsuite/gcc.dg/builtin-object-size-6.c | 260 --- gcc/testsuite/gcc.dg/builtin-object-size-7.c | 54 ++-- gcc/testsuite/gcc.dg/builtin-object-size-8.c | 15 +- .../gcc.dg/builtin-object-size-common.h | 32 ++ gcc/testsuite/gcc.dg/pr101836.c | 10 +- gcc/testsuite/gcc.dg/strict-flex-array-3.c| 10 +- 16 files changed, 1039 insertions(+), 1076 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-common.h diff --git a/gcc/testsuite/g++.dg/ext/builtin-object-size1.C b/gcc/testsuite/g++.dg/ext/builtin-object-size1.C index 8590a0bbebd..ebbeced1942 100644 --- a/gcc/testsuite/g++.dg/ext/builtin-object-size1.C +++ b/gcc/testsuite/g++.dg/ext/builtin-object-size1.C @@ -1,11 +1,7 @@ // { dg-do run } // { dg-options "-O2" } -typedef __SIZE_TYPE__ size_t; -extern "C" void abort (); -extern "C" void exit (int); -extern "C" void *malloc (size_t); -extern "C" void free (void *); +#include "../../gcc.dg/builtin-object-size-common.h" struct A { @@ -20,105 +16,105 @@ test1 (A *p) { char *c; if (__builtin_object_size (>a, 0) != (size_t) -1) -abort (); +FAIL (); if (__builtin_object_size (>a[0], 0) != (size_t) -1) -abort (); +FAIL (); if (__builtin_object_size (>a[3], 0) != (size_t) -1) -abort (); +FAIL (); if (__builtin_object_size (>b, 0) != (size_t) -1) -abort (); +FAIL (); if (__builtin_object_size (>c, 0) != (size_t) -1) -abort (); +FAIL (); c = p->a; if (__builtin_object_size (c, 0) != (size_t) -1) -abort (); +FAIL (); c = >a[0]; if (__builtin_object_size (c, 0) != (size_t) -1) -abort (); +FAIL (); c = >a[3]; if (__builtin_object_size (c, 0) != (size_t) -1) -abort (); +FAIL (); c = (char *) >b; if (__builtin_object_size (c, 0) != (size_t) -1) -abort (); +FAIL (); c = (char *) >c; if (__builtin_object_size (c, 0) != (size_t) -1) -abort (); +FAIL (); if (__builtin_object_size (>a, 1) != sizeof (p->a)) -abort (); +FAIL (); if (__builtin_object_size (>a[0], 1) != sizeof (p->a)) -abort (); +FAIL (); if (__builtin_object_size (>a[3], 1) != sizeof (p->a) - 3) -abort (); +FAIL (); if (__builtin_object_size (>b, 1) != sizeof (p->b)) -abort (); +FAIL (); if (__builtin_object_size (>c, 1) != (size_t) -1) -abort (); +FAIL (); c = p->a; if (__builtin_object_size (c, 1) != sizeof (p->a)) -abort (); +FAIL (); c = >a[0]; if (__builtin_object_size (c, 1) != sizeof (p->a)) -abort (); +FAIL (); c = >a[3]; if (__builtin_object_size (c, 1) != sizeof (p->a) - 3) -abort (); +FAIL (); c = (char *) >b; if (__builtin_object_size (c, 1) != sizeof (p->b)) -abort (); +FAIL (); c = (char *) >c; if (__builtin_object_size (c, 1) != (size_t) -1) -abort (); +FAIL (); if (__builtin_object_size (>a, 2) != 0) -abort (); +FAIL (); if (__builtin_object_size (>a
Re: [PATCH 2/2] Documentation Update.
On 2023-01-31 09:11, Qing Zhao wrote: Update documentation to clarify a GCC extension on structure with flexible array member being nested in another structure. gcc/ChangeLog: * doc/extend.texi: Document GCC extension on a structure containing a flexible array member to be a member of another structure. Should this resolve pr#77650 since the proposed action there appears to be to document these semantics? Thanks, Sid --- gcc/doc/extend.texi | 35 ++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 4a89a3eae7c..54e4baf49a9 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -1748,7 +1748,40 @@ Flexible array members may only appear as the last member of a A structure containing a flexible array member, or a union containing such a structure (possibly recursively), may not be a member of a structure or an element of an array. (However, these uses are -permitted by GCC as extensions.) +permitted by GCC as extensions, see details below.) +@end itemize + +GCC extension accepts a structure containing a flexible array member, or +a union containing such a structure (possibly recursively) to be a member +of a structure. + +There are two situations: + +@itemize @bullet +@item +The structure with a flexible array member is the last field of another +structure, for example: + +@smallexample +struct flex @{ int length; char data[]; @}; + +struct out_flex @{ int m; struct flex flex_data; @}; +@end smallexample + +In the above, @code{flex_data.data[]} is considered as a flexible array too. + +@item +The structure with a flexible array member is the middle field of another +structure, for example: + +@smallexample +struct flex @{ int length; char data[]; @}; + +struct mid_flex @{ int m; struct flex flex_data; int n; @}; +@end smallexample + +In the above, @code{flex_data.data[]} is allowed to be extended flexibly to +the padding. E.g, up to 4 elements. @end itemize Non-empty initialization of zero-length
Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]
On 2023-01-31 09:11, Qing Zhao wrote: GCC extension accepts the case when a struct with a flexible array member is embedded into another struct (possibly recursively). __builtin_object_size should treat such struct as flexible size per -fstrict-flex-arrays. PR tree-optimization/101832 gcc/ChangeLog: PR tree-optimization/101832 * tree-object-size.cc (flexible_size_type_p): New function. (addr_object_size): Handle structure/union type when it has flexible size. gcc/testsuite/ChangeLog: PR tree-optimization/101832 * gcc.dg/builtin-object-size-pr101832-2.c: New test. * gcc.dg/builtin-object-size-pr101832-3.c: New test. * gcc.dg/builtin-object-size-pr101832-4.c: New test. * gcc.dg/builtin-object-size-pr101832.c: New test. --- .../gcc.dg/builtin-object-size-pr101832-2.c | 135 ++ .../gcc.dg/builtin-object-size-pr101832-3.c | 135 ++ .../gcc.dg/builtin-object-size-pr101832-4.c | 135 ++ .../gcc.dg/builtin-object-size-pr101832.c | 119 +++ gcc/tree-object-size.cc | 115 +++ 5 files changed, 611 insertions(+), 28 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832-2.c create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832-3.c create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832-4.c create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-2.c b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-2.c new file mode 100644 index 000..f38babc5415 --- /dev/null +++ b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-2.c @@ -0,0 +1,135 @@ +/* PR 101832: + GCC extension accepts the case when a struct with a flexible array member + is embedded into another struct (possibly recursively). + __builtin_object_size will treat such struct as flexible size per + -fstrict-flex-arrays. */ +/* { dg-do run } */ +/* { dg-options "-O2 -fstrict-flex-arrays=1" } */ + +#include + +unsigned n_fails = 0; + +#define expect(p, _v) do { \ + size_t v = _v; \ + if (p == v) \ +printf("ok: %s == %zd\n", #p, p); \ + else {\ +printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \ +n_fails++; \ I just pushed my testsuite fix, so you could use the macros in gcc.dg/builtin-object-size-common.h instead of accounting this yourself. Also if you use __builtin_printf, you won't have to include stdio.h. Thanks, Sid + } \ +} while (0); + +struct A { + int n; + char data[];/* Content following header */ +}; + +struct B { + int m; + struct A a; +}; + +struct C { + int q; + struct B b; +}; + +struct A0 { + int n; + char data[0];/* Content following header */ +}; + +struct B0 { + int m; + struct A0 a; +}; + +struct C0 { + int q; + struct B0 b; +}; + +struct A1 { + int n; + char data[1];/* Content following header */ +}; + +struct B1 { + int m; + struct A1 a; +}; + +struct C1 { + int q; + struct B1 b; +}; + +struct An { + int n; + char data[8];/* Content following header */ +}; + +struct Bn { + int m; + struct An a; +}; + +struct Cn { + int q; + struct Bn b; +}; + +volatile void *magic1, *magic2; + +int main(int argc, char *argv[]) +{ +struct B *outer; +struct C *outest; + +/* Make sure optimization can't find some other object size. */ +outer = (void *)magic1; +outest = (void *)magic2; + +expect(__builtin_object_size(>a, 1), -1); +expect(__builtin_object_size(>b, 1), -1); +expect(__builtin_object_size(>b.a, 1), -1); + +struct B0 *outer0; +struct C0 *outest0; + +/* Make sure optimization can't find some other object size. */ +outer0 = (void *)magic1; +outest0 = (void *)magic2; + +expect(__builtin_object_size(>a, 1), -1); +expect(__builtin_object_size(>b, 1), -1); +expect(__builtin_object_size(>b.a, 1), -1); + +struct B1 *outer1; +struct C1 *outest1; + +/* Make sure optimization can't find some other object size. */ +outer1 = (void *)magic1; +outest1 = (void *)magic2; + +expect(__builtin_object_size(>a, 1), -1); +expect(__builtin_object_size(>b, 1), -1); +expect(__builtin_object_size(>b.a, 1), -1); + +struct Bn *outern; +struct Cn *outestn; + +/* Make sure optimization can't find some other object size. */ +outern = (void *)magic1; +outestn = (void *)magic2; + +expect(__builtin_object_size(>a, 1), sizeof(outern->a)); +expect(__builtin_object_size(>b, 1), sizeof(outestn->b)); +expect(__builtin_object_size(>b.a, 1), sizeof(outestn->b.a)); + +if (n_fails > 0) + __builtin_abort (); + +return 0; +} diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-3.c b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832-3.c new file mode 100644 index 000..aaae99b8d67 --- /dev/null +++
[PATCH] doc: Fix typo in -Wall description
-Wall enables -Wuse-after-free=2 and not -Wuse-after-free=3. gcc/ChangeLog: * gcc/doc/invoke.texi (@item -Wall): Fix typo in -Wuse-after-free. Signed-off-by: Siddhesh Poyarekar --- gcc/doc/invoke.texi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 51447a78584..20d41e19b3c 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -6083,7 +6083,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}. -Wunused-label @gol -Wunused-value @gol -Wunused-variable @gol --Wuse-after-free=3 @gol +-Wuse-after-free=2 @gol -Wvla-parameter @r{(C and Objective-C only)} @gol -Wvolatile-register-var @gol -Wzero-length-bounds} -- 2.38.1
Re: [PATCH] doc: Fix typo in -Wall description
On 2023-02-17 14:43, Jeff Law wrote: On 2/17/23 06:41, Siddhesh Poyarekar wrote: -Wall enables -Wuse-after-free=2 and not -Wuse-after-free=3. gcc/ChangeLog: * gcc/doc/invoke.texi (@item -Wall): Fix typo in -Wuse-after-free. Looks obvious to me. If you haven't committed it already, go ahead. Pushed, thanks. Sid
Re: [PATCH] Implement range-op entry for sin/cos.
On 2023-04-20 08:59, Jakub Jelinek via Gcc-patches wrote: +r.set (type, dconstm1, dconst1); See above, are we sure we can use [-1., 1.] range safely, or should that be [-1.-Nulps, 1.+Nulps] for some kind of expected worse error margin of the implementation? And ditto for -frounding-math, shall we increase that interval in that case, or is [-1., 1.] going to be ok? Do any math implementations generate results outside of [-1., 1.]? If yes, then it's a bug in those implementations IMO, not in the range assumption. It feels wrong to cater for what ought to be trivially fixable in libraries if they ever happen to generate such results. Thanks, Sid
Re: [PATCH] Implement range-op entry for sin/cos.
On 2023-04-20 10:02, Jakub Jelinek wrote: On x86_64-linux with glibc 2.35, I see for i in FLOAT DOUBLE LDOUBLE FLOAT128; do for j in TONEAREST UPWARD DOWNWARD TOWARDZERO; do gcc -D$i -DROUND=FE_$j -g -O1 -o /tmp/sincos{,.c} -lm; /tmp/sincos || echo $i $j; done; done Aborted (core dumped) FLOAT UPWARD Aborted (core dumped) FLOAT DOWNWARD On sparc-sun-solaris2.11 I see for i in FLOAT DOUBLE LDOUBLE; do for j in TONEAREST UPWARD DOWNWARD TOWARDZERO; do gcc -D$i -DROUND=FE_$j -g -O1 -o sincos{,.c} -lm; ./sincos || echo $i $j; done; done Abort (core dumped) DOUBLE UPWARD Abort (core dumped) DOUBLE DOWNWARD Haven't tried anything else. So that shows (but doesn't prove) that maybe [-1., 1.] interval is fine for -fno-rounding-math on those, but not for -frounding-math. Would there be a reason to not consider these as bugs? I feel like these should be fixed in glibc, or any math implementation that ends up doing this. I suppose one reason could be the overhead of an additional branch to check for result bounds, but is that serious enough to allow this imprecision? The alternative of output range being defined as [-1.0-ulp, 1.0+ulp] avoids that conversation I guess. Thanks, Sid
Re: [PATCH] Implement range-op entry for sin/cos.
On 2023-04-20 11:52, Jakub Jelinek wrote: Why? Unless an implementation guarantees <= 0.5ulps errors, it can be one or more ulps off, why is an error at or near 1.0 or -1.0 error any worse than similar errors for other values? In a general sense, maybe not, but in the sense of breaching the bounds of admissible values, especially when it can be reasonably corrected, it seems worse IMO to let the error slide. Similarly for other functions which have other ranges, perhaps not with so nice round numbers. Say asin has [-pi/2, pi/2] range, those numbers aren't exactly representable, but is it any worse to round those values to -inf or +inf or worse give something 1-5 ulps further from that interval comparing to other 1-5ulps errors? I agree the argument in favour of allowing errors breaching the mathematical bounds is certainly stronger for bounds that are not exactly representable. I just feel like the implementation ought to take the additional effort when the bounds are representable and make it easier for the compiler. For bounds that aren't representable, one could get error bounds from libm-test-ulps data in glibc, although I reckon those won't be exhaustive. From a quick peek at the sin/cos data, the arc target seems to be among the worst performers at about 7ulps, although if you include the complex routines we get close to 13 ulps. The very worst imprecision among all math routines (that's gamma) is at 16 ulps for power in glibc tests, so maybe allowing about 25-30 ulps error in bounds might work across the board. But yeah, it's probably going to be guesswork. Thanks, Sid
Re: [PATCH] Implement range-op entry for sin/cos.
On 2023-04-20 13:57, Siddhesh Poyarekar wrote: For bounds that aren't representable, one could get error bounds from libm-test-ulps data in glibc, although I reckon those won't be exhaustive. From a quick peek at the sin/cos data, the arc target seems to be among the worst performers at about 7ulps, although if you include the complex routines we get close to 13 ulps. The very worst imprecision among all math routines (that's gamma) is at 16 ulps for power in glibc tests, so maybe allowing about 25-30 ulps error in bounds might work across the board. I was thinking about this a bit more and it seems like limiting ranges to targets that can generate sane results (i.e. error bounds within, say, 5-6 ulps) and for the rest, avoid emitting the ranges altogether. Emitting a bad range for all architectures seems like a net worse solution again. Thanks, Sid
Re: [PATCH] Implement range-op entry for sin/cos.
On 2023-04-24 12:03, Jakub Jelinek wrote: On Thu, Apr 20, 2023 at 01:57:55PM -0400, Siddhesh Poyarekar wrote: Similarly for other functions which have other ranges, perhaps not with so nice round numbers. Say asin has [-pi/2, pi/2] range, those numbers aren't exactly representable, but is it any worse to round those values to -inf or +inf or worse give something 1-5 ulps further from that interval comparing to other 1-5ulps errors? I've extended my hack^^^test to include sqrt and this time it seems that the boundary in that case holds even for non-standard rounding modes for glibc: IIRC the standard _requires_ sqrt to be correctly rounded. Sid
Re: [PATCH] Implement range-op entry for sin/cos.
On 2023-04-21 02:52, Jakub Jelinek wrote: On Thu, Apr 20, 2023 at 09:14:10PM -0400, Siddhesh Poyarekar wrote: On 2023-04-20 13:57, Siddhesh Poyarekar wrote: For bounds that aren't representable, one could get error bounds from libm-test-ulps data in glibc, although I reckon those won't be exhaustive. From a quick peek at the sin/cos data, the arc target seems to be among the worst performers at about 7ulps, although if you include the complex routines we get close to 13 ulps. The very worst imprecision among all math routines (that's gamma) is at 16 ulps for power in glibc tests, so maybe allowing about 25-30 ulps error in bounds might work across the board. I was thinking about this a bit more and it seems like limiting ranges to targets that can generate sane results (i.e. error bounds within, say, 5-6 ulps) and for the rest, avoid emitting the ranges altogether. Emitting a bad range for all architectures seems like a net worse solution again. Well, at least for basic arithmetics when libm functions aren't involved, there is no point in disabling ranges altogether. Oh yeah, I did mean only franges for math function call results. And, for libm functions, my plan was to introduce a target hook, which would have combined_fn argument to tell which function is queried, machine_mode to say which floating point format and perhaps a bool whether it is ulps for these basic math boundaries or results somewhere in between, and would return in unsigned int ulps, 0 for 0.5ulps precision. So, we could say for CASE_CFN_SIN: CASE_CFN_COS: in the glibc handler say that ulps is say 3 inside of the ranges and 0 on the boundaries if !flag_rounding_math and 6 and 2 with flag_rounding_math or whatever. And in the generic code don't assume anything if ulps is say 100 or more. The hooks would need to be a union of precision of supported versions of the library through the history, including say libmvec because function calls could be vectorized. And default could be that infinite precision. Back in November I've posted a proglet that can generate some ulps from random number testing, plus on glibc we could pick maximums from ulps files. And if needed, say powerpc*-linux could override the generic glibc version for some subset of functions and call default otherwise (say at least for __ibm128). Ack, that sounds like a plan. Thanks, Sid
Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]
On 2023-02-08 14:09, Joseph Myers wrote: What must be avoided is -pedantic diagnostics for struct flex1 { int n; int data[1]; }; struct out_flex_end1 { int m; struct flex1 flex_data; }; regardless of whether considered flexible or not, since that's clearly valid in standard C. Are you sure about "regardless of whether considered flexible or not", since ISTM the validity of the above in standard C is limited to when it's considered a non-flexible array. So with -pedantic it shouldn't warn, but it also then shouldn't consider it a flexible array. In other words, perhaps it makes sense to imply -fstrict-flex-arrays with -pedantic? Sid
Re: [PATCH 1/2] Handle component_ref to a structre/union field including flexible array member [PR101832]
On 2023-02-06 18:14, Joseph Myers wrote: On Mon, 6 Feb 2023, Qing Zhao via Gcc-patches wrote: In GCC14: 1. Include this new warning -Wgnu-varaible-sized-type-not-at-end to -Wall 2. Deprecate this extension from GCC. (Or delay this to next release?). Any deprecation, or inclusion in -Wall, would best come with evidence about the prevalance of use (possibly unintentional, probably undesirable) of these extensions. For example, maybe someone could do a distribution rebuild with a patch to enable these warnings and report the results? FWIW, Fred from our team has been working on a mass prebuilder that can be used for this kind of distribution-wide validation. I've used it for _FORTIFY_SOURCE validation as well as coverage analysis. I can help you with this Qing, once you have a patch ready. Sid [1] https://gitlab.com/fedora/packager-tools/mass-prebuild/
Re: [PATCH] tree-optimization/108522 Use component_ref_field_offset
On 2023-01-25 22:32, Siddhesh Poyarekar wrote: Instead of using TREE_OPERAND (expr, 2) directly, use component_ref_field_offset instead, which does scaling for us. The function also substitutes PLACEHOLDER_EXPRs, which is probably what we want anyway but I'm not sure if it's relevant for tree-object-size. gcc/ChangeLog: PR tree-optimization/108522 * tree-object-size.cc (compute_object_offset): Make EXPR argument non-const. Call component_ref_field_offset. gcc/testsuite/ChangeLog: PR tree-optimization/108522 * gcc.dg/builtin-dynamic-object-size-0.c (DEFSTRUCT): New macro. (test_dynarray_struct_member_b, test_dynarray_struct_member_c, test_dynarray_struct_member_d, test_dynarray_struct_member_subobj_b, test_dynarray_struct_member_subobj_c, test_dynarray_struct_member_subobj_d): New tests. (main): Call them. Signed-off-by: Siddhesh Poyarekar ... and now pushed (this and the earlier commit) to gcc-12 branch. Sid
Re: [PATCH 2/2] Documentation Update.
On 2023-02-02 03:33, Richard Biener wrote: looking at PR77650 what seems missing there is the semantics of this extension as expected/required by the glibc use. comment#5 seems to suggest that for my example above its expected that Y.x.data[0] aliases Y.end?! There must be a better way to write the glibc code and IMHO it would be best to deprecate this extension. Definitely the middle-end wouldn't consider this aliasing for my example - maybe it "works" when wrapped inside a union but then for sure only when the union is visible in all accesses ... typedef union { struct __gconv_info __cd; struct { struct __gconv_info __cd; struct __gconv_step_data __data; } __combined; } _G_iconv_t; could be written as typedef union { struct __gconv_info __cd; char __dummy[sizeof(struct __gconv_info) + sizeof(struct __gconv_step_data)]; } _G_iconv_t; in case the intent is to provide a complete type with space for a single __gconv_step_data. I dug into this on the glibc end and it looks like this commit: commit 63fb8f9aa9d19f85599afe4b849b567aefd70a36 Author: Zack Weinberg Date: Mon Feb 5 14:13:41 2018 -0500 Post-cleanup 2: minimize _G_config.h. ripped all of that gunk out. AFAICT there's no use of struct __gconv_info anywhere else in the code. I reckon it is safe to say now that glibc no longer needs this misfeature. Sid
Re: One question on the source code of tree-object-size.cc
On 2023-07-31 13:03, Siddhesh Poyarekar wrote: On 2023-07-31 12:47, Qing Zhao wrote: Hi, Sid and Jakub, I have a question in the following source portion of the routine “addr_object_size” of gcc/tree-object-size.cc: 743 bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var); 744 if (bytes != error_mark_node) 745 { 746 bytes = size_for_offset (var_size, bytes); 747 if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == MEM_REF) 748 { 749 tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0), 750 pt_var); 751 if (bytes2 != error_mark_node) 752 { 753 bytes2 = size_for_offset (pt_var_size, bytes2); 754 bytes = size_binop (MIN_EXPR, bytes, bytes2); 755 } 756 } 757 } At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or not? Shall we use (object_size_type & OST_MINIMUM ? MIN_EXPR : MAX_EXPR) That MIN_EXPR is not for OST_MINIMUM. It is to cater for allocations like this: typedef struct { int a; } A; size_t f() { A *p = malloc (1); return __builtin_object_size (p, 0); Correction, that should be __builtin_object_size (>a, 0) } where the returned size should be 1 and not sizeof (int). The mode doesn't really matter in this case. HTH. Sid
Re: One question on the source code of tree-object-size.cc
On 2023-07-31 14:13, Qing Zhao wrote: Okay. I see. Then if the size info from the TYPE is smaller than the size info from the malloc, then based on the current code, we use the smaller one between these two, i.e, the size info from the TYPE. (Even for the OST_MAXIMUM). Is such behavior correct? Yes, it's correct even for OST_MAXIMUM. The smaller one between the two is the more precise estimate, which is why the mode doesn't matter. This is for the new “counted_by” attribute and how to use it in __builtin_dynamic_object_size. for example: === struct annotated { size_t foo; int array[] __attribute__((counted_by (foo))); }; #define noinline __attribute__((__noinline__)) #define SIZE_BUMP 2 /* in the following function, malloc allocated more space than the value of counted_by attribute. Then what's the correct behavior we expect the __builtin_dynamic_object_size should have? */ static struct annotated * noinline alloc_buf (int index) { struct annotated *p; p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int)); p->foo = index; /*when checking the observed access p->array, we have info on both observered allocation and observed access, A. from observed allocation: (index + SIZE_BUMP) * sizeof (int) B. from observed access: p->foo * sizeof (int) in the above, p->foo = index. */ /* for MAXIMUM size, based on the current code, we will use the size info from the TYPE, i.e, the “counted_by” attribute, which is the smaller one. */ expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int)); If the counted_by is less than what is allocated, it is the more correct value to return because that's what the application asked for through the attribute. If the allocated size is less, we return the allocated size because in that case, despite what the application said, the actual allocated size is less and hence that's the safer value. In fact in the latter case it may even make sense to emit a warning because it is more likely than not to be a bug. Thanks, Sid
Re: One question on the source code of tree-object-size.cc
On 2023-07-31 12:47, Qing Zhao wrote: Hi, Sid and Jakub, I have a question in the following source portion of the routine “addr_object_size” of gcc/tree-object-size.cc: 743 bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var); 744 if (bytes != error_mark_node) 745 { 746 bytes = size_for_offset (var_size, bytes); 747 if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == MEM_REF) 748 { 749 tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0), 750pt_var); 751 if (bytes2 != error_mark_node) 752 { 753 bytes2 = size_for_offset (pt_var_size, bytes2); 754 bytes = size_binop (MIN_EXPR, bytes, bytes2); 755 } 756 } 757 } At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or not? Shall we use (object_size_type & OST_MINIMUM ? MIN_EXPR : MAX_EXPR) That MIN_EXPR is not for OST_MINIMUM. It is to cater for allocations like this: typedef struct { int a; } A; size_t f() { A *p = malloc (1); return __builtin_object_size (p, 0); } where the returned size should be 1 and not sizeof (int). The mode doesn't really matter in this case. HTH. Sid
Re: [C PATCH]: Add Walloc-type to warn about insufficient size in allocations
On 2023-07-21 07:21, Martin Uecker via Gcc-patches wrote: This patch adds a warning for allocations with insufficient size based on the "alloc_size" attribute and the type of the pointer the result is assigned to. While it is theoretically legal to assign to the wrong pointer type and cast it to the right type later, this almost always indicates an error. Since this catches common mistakes and is simple to diagnose, it is suggested to add this warning. Bootstrapped and regression tested on x86. Martin Add option Walloc-type that warns about allocations that have insufficient storage for the target type of the pointer the storage is assigned to. gcc: * doc/invoke.texi: Document -Wstrict-flex-arrays option. gcc/c-family: * c.opt (Walloc-type): New option. gcc/c: * c-typeck.cc (convert_for_assignment): Add Walloc-type warning. gcc/testsuite: * gcc.dg/Walloc-type-1.c: New test. diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 4abdc8d0e77..8b9d148582b 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -319,6 +319,10 @@ Walloca C ObjC C++ ObjC++ Var(warn_alloca) Warning Warn on any use of alloca. +Walloc-type +C ObjC Var(warn_alloc_type) Warning +Warn when allocating insufficient storage for the target type of the assigned pointer. + Walloc-size-larger-than= C ObjC C++ LTO ObjC++ Var(warn_alloc_size_limit) Joined Host_Wide_Int ByteSize Warning Init(HOST_WIDE_INT_MAX) -Walloc-size-larger-than=Warn for calls to allocation functions that diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index 7cf411155c6..2e392f9c952 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -7343,6 +7343,32 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type, "request for implicit conversion " "from %qT to %qT not permitted in C++", rhstype, type); + /* Warn of new allocations are not big enough for the target type. */ + tree fndecl; + if (warn_alloc_type + && TREE_CODE (rhs) == CALL_EXPR + && (fndecl = get_callee_fndecl (rhs)) != NULL_TREE + && DECL_IS_MALLOC (fndecl)) + { + tree fntype = TREE_TYPE (fndecl); + tree fntypeattrs = TYPE_ATTRIBUTES (fntype); + tree alloc_size = lookup_attribute ("alloc_size", fntypeattrs); + if (alloc_size) + { + tree args = TREE_VALUE (alloc_size); + int idx = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1; + /* For calloc only use the second argument. */ + if (TREE_CHAIN (args)) + idx = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN (args))) - 1; + tree arg = CALL_EXPR_ARG (rhs, idx); + if (TREE_CODE (arg) == INTEGER_CST + && tree_int_cst_lt (arg, TYPE_SIZE_UNIT (ttl))) +warning_at (location, OPT_Walloc_type, "allocation of " +"insufficient size %qE for type %qT with " +"size %qE", arg, ttl, TYPE_SIZE_UNIT (ttl)); + } + } + Wouldn't this be much more useful in later phases with ranger feedback like with the warn_access warnings? That way the comparison won't be limited to constant sizes. Thanks, Sid
Re: One question on the source code of tree-object-size.cc
On 2023-08-02 10:02, Qing Zhao wrote: /*when checking the observed access p->array, we only have info on the observed access, i.e, the TYPE_SIZE info from the access. We don't have info on the whole object. */ expect(__builtin_dynamic_object_size(q->array, 1), q->foo * sizeof(int)); expect(__builtin_dynamic_object_size(q->array, 0), -1); expect(__builtin_dynamic_object_size(q->array, 3), q->foo * sizeof(int)); expect(__builtin_dynamic_object_size(q->array, 2), 0); /*when checking the pointer p, we have no observed allocation nor observed access. therefore, we cannot determine the size info here. */ expect(__builtin_dynamic_object_size(q, 1), -1); expect(__builtin_dynamic_object_size(q, 0), -1); expect(__builtin_dynamic_object_size(q, 3), 0); expect(__builtin_dynamic_object_size(q, 2), 0); I'm wondering if we could sizeof (*q) + q->foo for __bdos(q, 0), but I suppose it could mean generating code that potentially dereferences an invalid pointer. Surely we could emit that for __bdos(q->array, 0) though, couldn't we? Thanks, Sid
Re: One question on the source code of tree-object-size.cc
On 2023-08-03 12:43, Qing Zhao wrote: Surely we could emit that for __bdos(q->array, 0) though, couldn't we? For __bdos(q->array, 0), we only have the access info for the sub-object q->array, we can surely decide the size of the sub-object q->array, but we still cannot decide the whole object that is pointed by q (the same reason as above), right? It's tricky, I mean we could assume p to be a valid object due to the dereference and hence assume that q->foo is also valid and that there's at least sizeof(*q) + q->foo * sizeof (q->array) bytes available. The question then is whether q could be pointing to an element of an array of `struct annotated`. Could we ever have a valid array of such structs that have a flex array at the end? Wouldn't it always be a single object? In fact for all pointers to such structs with a flex array at the end, could we always assume that it is a single object and never part of an array, and hence return sizeof()? Thanks, Sid
[PATCH] testsuite/110763: Ensure zero return from test
The test deliberately reads beyond bounds to exersize ubsan and the return value may be anything, based on previous allocations. The OFF test caters for it by ANDing the return with 0, do the same for the DYN test. gcc/testsuite/ChangeLog: PR testsuite/110763 * gcc.dg/ubsan/object-size-dyn.c (dyn): New parameter RET. (main): Use it. Signed-off-by: Siddhesh Poyarekar --- gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c b/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c index 0159f5b9820..49c3abe2e72 100644 --- a/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c +++ b/gcc/testsuite/gcc.dg/ubsan/object-size-dyn.c @@ -5,12 +5,12 @@ int __attribute__ ((noinline)) -dyn (int size, int i) +dyn (int size, int i, int ret) { __builtin_printf ("dyn\n"); fflush (stdout); int *alloc = __builtin_calloc (size, sizeof (int)); - int ret = alloc[i]; + ret = ret & alloc[i]; __builtin_free (alloc); return ret; } @@ -28,7 +28,7 @@ off (int size, int i, int ret) int main (void) { - int ret = dyn (2, 2); + int ret = dyn (2, 2, 0); ret |= off (4, 4, 0); -- 2.41.0
Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
On 2023-08-10 12:39, Jakub Jelinek wrote: On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote: The definition of __bos/__bdos allows us the freedom to *estimate* rather than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound to give the more conservative answer of the two. To be precise, we have the 0/1 modes vs. 2/3. So, when not determining __bos/__bdos from actual allocation size or size of an stack object or size of data section object but something else (say counted_by), perhaps 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt) and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt), then user code can continue testing if both modes are equal to have exact number. Ack, that's fair. Thanks, Sid
Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
On 2023-08-10 10:47, Martin Uecker wrote: Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek: On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote: Am Donnerstag, dem 10.08.2023 um 13:59 + schrieb Qing Zhao: On Aug 10, 2023, at 2:58 AM, Martin Uecker wrote: Am Mittwoch, dem 09.08.2023 um 20:10 + schrieb Qing Zhao: On Aug 9, 2023, at 12:21 PM, Michael Matz wrote: I am not sure for the reason given above. The following code would not work: struct foo_flex { int a; short b; char t[]; } x; x.a = 1; struct foo_flex *p = malloc(sizeof(x) + x.a); if (!p) abort(); memcpy(p, , sizeof(x)); // initialize struct Okay. Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) Let me know if I still miss anything. The question is not only what the user should use to allocate, but also what BDOS should return. In my example the user uses the sizeof() + N * sizeof formula and the memcpy is safe, but it would be flagged as a buffer overrun if BDOS uses the offsetof formula. BDOS/BOS (at least the 0 level) should return what is actually allocated for the var, what size was passed to malloc and if it is a var with flex array member with initialization what is actually the size on the stack or in .data/.rodata etc. Agreed. But what about a struct with FAM with the new "counted_by" attribute if the original allocation is not visible? There's precedent for this through the __access__ attribute; __bos trusts what the attribute says about the allocation. Sid
Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
On 2023-08-10 11:18, Martin Uecker wrote: Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar: On 2023-08-10 10:47, Martin Uecker wrote: Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek: On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote: Am Donnerstag, dem 10.08.2023 um 13:59 + schrieb Qing Zhao: On Aug 10, 2023, at 2:58 AM, Martin Uecker wrote: Am Mittwoch, dem 09.08.2023 um 20:10 + schrieb Qing Zhao: On Aug 9, 2023, at 12:21 PM, Michael Matz wrote: I am not sure for the reason given above. The following code would not work: struct foo_flex { int a; short b; char t[]; } x; x.a = 1; struct foo_flex *p = malloc(sizeof(x) + x.a); if (!p) abort(); memcpy(p, , sizeof(x)); // initialize struct Okay. Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe) Let me know if I still miss anything. The question is not only what the user should use to allocate, but also what BDOS should return. In my example the user uses the sizeof() + N * sizeof formula and the memcpy is safe, but it would be flagged as a buffer overrun if BDOS uses the offsetof formula. BDOS/BOS (at least the 0 level) should return what is actually allocated for the var, what size was passed to malloc and if it is a var with flex array member with initialization what is actually the size on the stack or in .data/.rodata etc. Agreed. But what about a struct with FAM with the new "counted_by" attribute if the original allocation is not visible? There's precedent for this through the __access__ attribute; __bos trusts what the attribute says about the allocation. The access attribute gives the size directly. The counted_by gives a length for the array which needs to be translated into a size via a formula. There are different formulas in use. The question is which formula should bdos trust? Whatever you pick, if this is not consistent with the actual allocation or use, then it will cause problems either by breaking code or not detecting buffer overruns. So it needs to be consistent with what GCC allocates for a var with FAM and initialization and also the user needs to be told what the right choice is so that he can use the right size for allocation and argument to memcpy / memset etc. We'd rather miss overflow to the extent of padding than to try and be overly aggressive; I doubt if we're missing much protection in practice by trying to account for the padding. The definition of __bos/__bdos allows us the freedom to *estimate* rather than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound to give the more conservative answer of the two. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-10 14:28, Richard Sandiford wrote: Siddhesh Poyarekar writes: On 2023-08-08 10:30, Siddhesh Poyarekar wrote: Do you have a suggestion for the language to address libgcc, libstdc++, etc. and libiberty, libbacktrace, etc.? I'll work on this a bit and share a draft. Hi David, Here's what I came up with for different parts of GCC, including the runtime libraries. Over time we may find that specific parts of runtime libraries simply cannot be used safely in some contexts and flag that. Sid """ What is a GCC security bug? === A security bug is one that threatens the security of a system or network, or might compromise the security of data stored on it. In the context of GCC there are multiple ways in which this might happen and they're detailed below. Compiler drivers, programs, libgccjit and support libraries --- The compiler driver processes source code, invokes other programs such as the assembler and linker and generates the output result, which may be assembly code or machine code. It is necessary that all source code inputs to the compiler are trusted, since it is impossible for the driver to validate input source code beyond conformance to a programming language standard. The GCC JIT implementation, libgccjit, is intended to be plugged into applications to translate input source code in the application context. Limitations that apply to the compiler driver, apply here too in terms of sanitizing inputs, so it is recommended that inputs are either sanitized by an external program to allow only trusted, safe execution in the context of the application or the JIT execution context is appropriately sandboxed to contain the effects of any bugs in the JIT or its generated code to the sandboxed environment. Support libraries such as libiberty, libcc1 libvtv and libcpp have been developed separately to share code with other tools such as binutils and gdb. These libraries again have similar challenges to compiler drivers. While they are expected to be robust against arbitrary input, they should only be used with trusted inputs. Libraries such as zlib and libffi that bundled into GCC to build it will be treated the same as the compiler drivers and programs as far as security coverage is concerned. As a result, the only case for a potential security issue in all these cases is when it ends up generating vulnerable output for valid input source code. I think this leaves open the interpretation "every wrong code bug is potentially a security bug". I suppose that's true in a trite sense, but not in a useful sense. As others said earlier in the thread, whether a wrong code bug in GCC leads to a security bug in the object code is too application-dependent to be a useful classification for GCC. I think we should explicitly say that we don't generally consider wrong code bugs to be security bugs. Leaving it implicit is bound to lead to misunderstanding. I see what you mean, but the context-dependence of a bug is something GCC will have to deal with, similar to how libraries have to deal with bugs. But I agree this probably needs some more expansion. Let me try and come up with something more detailed for that last paragraph. There's another case that I think should be highlighted explicitly: GCC provides various security-hardening features. I think any failure of those feature to act as documented is poentially a security bug. Failure to follow reasonable expectations (even if not documented) might sometimes be a security bug too. Missed hardening in general does not put systems at immediate risk, so they're not considered CVE-worthy. In fact when bugs are evaluated for security risk at a source level (e.g. when NIST does it), hardening does not come into the picture at all. It's only at product levels that hardening features are accounted for, e.g. where -fstack-protector would reduce the seriousness of a stack buffer overflow and even there one must do an analysis to see if the generated code actually mitigated the overflow using the stack protector canary. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-11 11:12, David Edelsohn wrote: The text above states "bugs in these libraries may be evaluated for security impact", but there is no comment about the criteria for a security impact, unlike the GLIBC SECURITY.md document. The text seems to imply the "What is a security bug?" definitions from GLIBC, but the definitions are not explicitly stated in the GCC Security policy. Should this "Language runtime libraries" section include some of the GLIBC "What is a security bug?" text or should the GCC "What is a security bug?" section earlier in this document include the text with a qualification that issues like buffer overflow, memory leaks, information disclosure, etc. specifically apply to "Language runtime libraries" and not all components of GCC? Yes, that makes sense. This part will likely evolve though, much like the glibc one did, based on reports we get over time. I'll work it in and post an updated draft. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-10 14:50, Siddhesh Poyarekar wrote: As a result, the only case for a potential security issue in all these cases is when it ends up generating vulnerable output for valid input source code. I think this leaves open the interpretation "every wrong code bug is potentially a security bug". I suppose that's true in a trite sense, but not in a useful sense. As others said earlier in the thread, whether a wrong code bug in GCC leads to a security bug in the object code is too application-dependent to be a useful classification for GCC. I think we should explicitly say that we don't generally consider wrong code bugs to be security bugs. Leaving it implicit is bound to lead to misunderstanding. I see what you mean, but the context-dependence of a bug is something GCC will have to deal with, similar to how libraries have to deal with bugs. But I agree this probably needs some more expansion. Let me try and come up with something more detailed for that last paragraph. How's this: As a result, the only case for a potential security issue in the compiler is when it generates vulnerable application code for valid, trusted input source code. The output application code could be considered vulnerable if it produces an actual vulnerability in the target application, specifically in the following cases: - The application dereferences an invalid memory location despite the application sources being valid. - The application reads from or writes to a valid but incorrect memory location, resulting in an information integrity issue or an information leak. - The application ends up running in an infinite loop or with severe degradation in performance despite the input sources having no such issue, resulting in a Denial of Service. Note that correct but non-performant code is not a security issue candidate, this only applies to incorrect code that may result in performance degradation. - The application crashes due to the generated incorrect code, resulting in a Denial of Service.
Re: [RFC] GCC Security policy
On 2023-08-11 11:09, Paul Koning wrote: On Aug 11, 2023, at 10:36 AM, Siddhesh Poyarekar wrote: On 2023-08-10 14:50, Siddhesh Poyarekar wrote: As a result, the only case for a potential security issue in all these cases is when it ends up generating vulnerable output for valid input source code. I think this leaves open the interpretation "every wrong code bug is potentially a security bug". I suppose that's true in a trite sense, but not in a useful sense. As others said earlier in the thread, whether a wrong code bug in GCC leads to a security bug in the object code is too application-dependent to be a useful classification for GCC. I think we should explicitly say that we don't generally consider wrong code bugs to be security bugs. Leaving it implicit is bound to lead to misunderstanding. I see what you mean, but the context-dependence of a bug is something GCC will have to deal with, similar to how libraries have to deal with bugs. But I agree this probably needs some more expansion. Let me try and come up with something more detailed for that last paragraph. How's this: As a result, the only case for a potential security issue in the compiler is when it generates vulnerable application code for valid, trusted input source code. The output application code could be considered vulnerable if it produces an actual vulnerability in the target application, specifically in the following cases: You might make it explicit that we're talking about wrong code errors here -- in other words, the source code is correct (conforms to the standard) and the algorithm expressed in the source code does not have a vulnerability, but the generated code has semantics that differ from those of the source code such that it does have a vulnerability. Ack, thanks for the suggestion. - The application dereferences an invalid memory location despite the application sources being valid. - The application reads from or writes to a valid but incorrect memory location, resulting in an information integrity issue or an information leak. - The application ends up running in an infinite loop or with severe degradation in performance despite the input sources having no such issue, resulting in a Denial of Service. Note that correct but non-performant code is not a security issue candidate, this only applies to incorrect code that may result in performance degradation. The last sentence somewhat contradicts the preceding one. Perhaps "...may result in performance degradation severe enough to amount to a denial of service". Ack, will fix that up, thanks. Sid
Re: One question on the source code of tree-object-size.cc
On 2023-08-03 13:34, Qing Zhao wrote: One thing I need to point out first is, currently, even for regular fixed size array in the structure, We have this same issue, for example: #define LENGTH 10 struct fix { size_t foo; int array[LENGTH]; }; … int main () { struct fix *p; p = alloc_buf_more (); expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int)); expect(__builtin_object_size(p->array, 0), -1); } Currently, for __builtin_object_size(p->array, 0), GCC return UNKNOWN for it. This is not a special issue for flexible array member. That's fine for fixed arrays at the end of a struct because the "whole object" size could be anything; `p` could be pointing to the beginning of an array for all we know. If however `array` is strictly a flex array, i.e.: ``` struct A { size_t foo; int array[]; }; ``` then there's no way in valid C to have an array of `struct fix`, so `q` must be pointing to a single element. So you could deduce: 1. the minimum size of the whole object that q points to. and 2. if you're able to determine the size of the flex array (through __element_count__(foo) for example), you could even determine the maximum size of the whole object. For (2) though, you'd break applications that overallocate and then expect to be able to use that overallocation despite the space not being reflected in the __element_count__. I think it's a bug in the application and I can't see a way for an application to be able to do this in a valid way so I'm inclined towards breaking it. Of course, the fact that gcc allows flex arrays to be in the middle of structs breaks the base assumption but that's something we need to get rid of anyway since there's no way for valid C programs to use that safely. Thanks, Sid
Re: One question on the source code of tree-object-size.cc
On 2023-08-04 10:40, Siddhesh Poyarekar wrote: On 2023-08-03 13:34, Qing Zhao wrote: One thing I need to point out first is, currently, even for regular fixed size array in the structure, We have this same issue, for example: #define LENGTH 10 struct fix { size_t foo; int array[LENGTH]; }; … int main () { struct fix *p; p = alloc_buf_more (); expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int)); expect(__builtin_object_size(p->array, 0), -1); } Currently, for __builtin_object_size(p->array, 0), GCC return UNKNOWN for it. This is not a special issue for flexible array member. That's fine for fixed arrays at the end of a struct because the "whole object" size could be anything; `p` could be pointing to the beginning of an array for all we know. If however `array` is strictly a flex array, i.e.: ``` struct A { size_t foo; int array[]; }; ``` then there's no way in valid C to have an array of `struct fix`, so `q` must be pointing to a single element. So you could deduce: 1. the minimum size of the whole object that q points to. Actually for minimum size we'd also need a guarantee that `alloc_buf_more` returns a valid allocated object. Sid
Re: One question on the source code of tree-object-size.cc
On 2023-08-04 11:27, Qing Zhao wrote: On Aug 4, 2023, at 10:40 AM, Siddhesh Poyarekar wrote: On 2023-08-03 13:34, Qing Zhao wrote: One thing I need to point out first is, currently, even for regular fixed size array in the structure, We have this same issue, for example: #define LENGTH 10 struct fix { size_t foo; int array[LENGTH]; }; … int main () { struct fix *p; p = alloc_buf_more (); expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int)); expect(__builtin_object_size(p->array, 0), -1); } Currently, for __builtin_object_size(p->array, 0), GCC return UNKNOWN for it. This is not a special issue for flexible array member. That's fine for fixed arrays at the end of a struct because the "whole object" size could be anything; `p` could be pointing to the beginning of an array for all we know. If however `array` is strictly a flex array, i.e.: ``` struct A { size_t foo; int array[]; }; ``` then there's no way in valid C to have an array of `struct fix`, Yes!! this is exactly the place that makes difference between structures with fixed arrays and the ones with flexible arrays. With such difference, I guess that using the type of the structure with flexible array member for p->array to get the size of the whole object p point to might be reasonable? Yes, that's what I'm thinking. so `q` must be pointing to a single element. So you could deduce: 1. the minimum size of the whole object that q points to. You mean that the TYPE will determine the minimum size of the whole object? (Does this include the size of the flexible array member, or only the other part of the structure except the flexible array member?) Only the constant sized part of the structure. Actually for minimum size we'd also need a guarantee that `alloc_buf_more` returns a valid allocated object. Why? Please explain a little bit here. So `alloc_buf_more` could return NULL, a valid pointer or an invalid pointer. So, we could end up returning a non-zero minimum size for an invalid or NULL pointer, which is incorrect, we don't know that. We won't need the object validity guarantee for (2) beyond, e.g. guarding against a new NULL pointer dereference because it's a *maximum* estimate; an invalid or NULL pointer would have 0 size. So for such cases, __bos(q, 0) could return sizeof(*q) + (q ? q->foo:0) and __bos(q->array, 0) could be sizeof(*q) + q->foo - offsetof(q, array) There's no need to guard against a dereference in the second case because the q->array dereference already assumes that q is valid. and 2. if you're able to determine the size of the flex array (through __element_count__(foo) for example), you could even determine the maximum size of the whole object. For (2) though, you'd break applications that overallocate and then expect to be able to use that overallocation despite the space not being reflected in the __element_count__. I think it's a bug in the application and I can't see a way for an application to be able to do this in a valid way so I'm inclined towards breaking it. Currently, we allow the situation when the allocation size for the whole object is larger than the value reflected in the “counted_by” attribute (the old name is __element_count__). But don’t allow the other way around (i.e, when the allocation size for the whole object is smaller than the value reflected in the “counted_by” attribute. Right, that's going to be the "break". For underallocation __bos will only end up overestimating the space available, which is not ideal, but won't end up breaking compatibility. Of course, the fact that gcc allows flex arrays to be in the middle of structs breaks the base assumption but that's something we need to get rid of anyway since there's no way for valid C programs to use that safely. Since GCC14, we started to deprecate this extension (allow flex array to be in the middle of structs). https://gcc.gnu.org/pipermail/gcc-cvs/2023-June/385730.html Yes, that's what I'm banking on. Thanks, Sid
Re: One question on the source code of tree-object-size.cc
On 2023-08-04 15:06, Qing Zhao wrote: Yes, that's what I'm thinking. so `q` must be pointing to a single element. So you could deduce: 1. the minimum size of the whole object that q points to. You mean that the TYPE will determine the minimum size of the whole object? (Does this include the size of the flexible array member, or only the other part of the structure except the flexible array member?) Only the constant sized part of the structure. Okay. I see. But if the “counted_by” info is available, then from p->array, we can deduce the minimum size too, as sizeof(struct A) + q->foo * sizeof(int), right? Yes. Actually for minimum size we'd also need a guarantee that `alloc_buf_more` returns a valid allocated object. Why? Please explain a little bit here. So `alloc_buf_more` could return NULL, a valid pointer or an invalid pointer. So, we could end up returning a non-zero minimum size for an invalid or NULL pointer, which is incorrect, we don't know that. I see what’ s you mean now. However, if we already see p->array, then the p is guaranteed a valid pointer and not a NULL, right? (We are discussing on __builtin_dynamic_object_size (q->array, 2), we see q->array already) Yes, you could argue that for p->array, I agree, but not for p. We won't need the object validity guarantee for (2) beyond, e.g. guarding against a new NULL pointer dereference because it's a *maximum* estimate; an invalid or NULL pointer would have 0 size. So for such cases, __bos(q, 0) could return sizeof(*q) + (q ? q->foo:0) and __bos(q->array, 0) could be sizeof(*q) + q->foo - offsetof(q, array) There's no need to guard against a dereference in the second case because the q->array dereference already assumes that q is valid. q->array should also guarantee that q is a valid pointer for minimum size, right? Or do I miss anything here? Yes. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-08 10:14, David Edelsohn wrote: On Tue, Aug 8, 2023 at 10:07 AM Siddhesh Poyarekar <mailto:siddh...@gotplt.org>> wrote: On 2023-08-08 10:04, Richard Biener wrote: > On Tue, Aug 8, 2023 at 3:35 PM Ian Lance Taylor mailto:i...@google.com>> wrote: >> >> On Tue, Aug 8, 2023 at 6:02 AM Jakub Jelinek via Gcc-patches >> mailto:gcc-patches@gcc.gnu.org>> wrote: >>> >>> On Tue, Aug 08, 2023 at 02:52:57PM +0200, Richard Biener via Gcc-patches wrote: >>>> There's probably external tools to do this, not sure if we should replicate >>>> things in the driver for this. >>>> >>>> But sure, I think the driver is the proper point to address any of such >>>> issues - iff we want to address them at all. Maybe a nice little >>>> google summer-of-code project ;) >>> >>> What I'd really like to avoid is having all compiler bugs (primarily ICEs) >>> considered to be security bugs (e.g. DoS category), it would be terrible to >>> release every week a new compiler because of the "security" issues. >>> Running compiler on untrusted sources can trigger ICEs (which we want to fix >>> but there will always be some), or run into some compile time and/or compile >>> memory issue (we have various quadratic or worse spots), compiler stack >>> limits (deeply nested stuff e.g. during parsing but other areas as well). >>> So, people running fuzzers and reporting issues is great, but if they'd get >>> a CVE assigned for each ice-on-invalid-code, ice-on-valid-code, >>> each compile-time-hog and each memory-hog, that wouldn't be useful. >>> Runtime libraries or security issues in the code we generate for valid >>> sources are of course a different thing. >> >> >> I wonder if a security policy should say something about the -fplugin >> option. I agree that an ICE is not a security issue, but I wonder how >> many people are aware that a poorly chosen command line option can >> direct the compiler to run arbitrary code. For that matter the same >> is true of setting the GCC_EXEC_PREFIX environment variable, and no >> doubt several other environment variables. My point is not that we >> should change these, but that a security policy should draw attention >> to the fact that there are cases in which the compiler will >> unexpectedly run other programs. > > Well, if you run an arbitrary commandline from the internet you get > what you deserve, running "echo "Hello World" | gcc -xc - -o /dev/sda" > as root doesn't need plugins to shoot yourself in the foot. You need to > know what you're doing, otherwise you are basically executing an > arbitrary shell script with whatever privileges you have. I think it would be useful to mention caveats with plugins though, just like it would be useful to mention exceptions for libiberty and similar libraries that gcc builds. It only helps makes things clearer in terms of what security coverage the project provides. I have added a line to the Note section in the proposed text: GCC and its tools provide features and options that can run arbitrary user code (e.g., -fplugin). How about the following to make it clearer that arbitrary code in plugins is not considered secure: GCC and its tools provide features and options that can run arbitrary user code, e.g. using the -fplugin options. Such custom code should be vetted by the user for safety as bugs exposed through such code will not be considered security issues. I believe that the security implication already is addressed because the program is not tricked into a direct compromise of security. Do you have a suggestion for the language to address libgcc, libstdc++, etc. and libiberty, libbacktrace, etc.? I'll work on this a bit and share a draft. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-08 04:16, Richard Biener wrote: On Mon, Aug 7, 2023 at 7:30 PM David Edelsohn via Gcc-patches wrote: FOSS Best Practices recommends that projects have an official Security policy stated in a SECURITY.md or SECURITY.txt file at the root of the repository. GLIBC and Binutils have added such documents. Appended is a prototype for a Security policy file for GCC based on the Binutils document because GCC seems to have more affinity with Binutils as a tool. Do the runtime libraries distributed with GCC, especially libgcc, require additional security policies? [ ] Is it appropriate to use the Binutils SECURITY.txt as the starting point or should GCC use GLIBC SECURITY.md as the starting point for the GCC Security policy? [ ] Does GCC, or some components of GCC, require additional care because of runtime libraries like libgcc and libstdc++, and because of gcov and profile-directed feedback? I do think that the runtime libraries should at least be explicitly mentioned because they fall into the "generated output" category and bugs in the runtime are usually more severe as affecting a wider class of inputs. Ack, I'd expect libstdc++ and libgcc to be aligned with glibc's policies. libiberty and others on the other hand, would probably be more suitably aligned with binutils libbfd, where we assume trusted input. Thoughts? Thanks, David GCC Security Process What is a GCC security bug? === A security bug is one that threatens the security of a system or network, or might compromise the security of data stored on it. In the context of GCC there are two ways in which such bugs might occur. In the first, the programs themselves might be tricked into a direct compromise of security. In the second, the tools might introduce a vulnerability in the generated output that was not already present in the files used as input. Other than that, all other bugs will be treated as non-security issues. This does not mean that they will be ignored, just that they will not be given the priority that is given to security bugs. This stance applies to the creation tools in the GCC (e.g., gcc, g++, gfortran, gccgo, gccrs, gnat, cpp, gcov, etc.) and the libraries that they use. Notes: == None of the programs in GCC need elevated privileges to operate and it is recommended that users do not use them from accounts where such privileges are automatically available. I'll note that we could ourselves mitigate some of that by handling privileged invocation of the driver specially, dropping privs on exec of the sibling tools and possibly using temporary files or pipes to do the parts of the I/O that need to be privileged. It's not a bad idea, but it ends up giving legitimizing running the compiler as root, pushing the responsibility of privilege management to the driver. How about rejecting invocation as root altogether by default, bypassed with a --run-as-root flag instead? I've also been thinking about a --sandbox flag that isolates the build process (for gcc as well as binutils) into a separate namespace so that it's usable in a restricted mode on untrusted sources without exposing the rest of the system to it. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-08 10:04, Richard Biener wrote: On Tue, Aug 8, 2023 at 3:35 PM Ian Lance Taylor wrote: On Tue, Aug 8, 2023 at 6:02 AM Jakub Jelinek via Gcc-patches wrote: On Tue, Aug 08, 2023 at 02:52:57PM +0200, Richard Biener via Gcc-patches wrote: There's probably external tools to do this, not sure if we should replicate things in the driver for this. But sure, I think the driver is the proper point to address any of such issues - iff we want to address them at all. Maybe a nice little google summer-of-code project ;) What I'd really like to avoid is having all compiler bugs (primarily ICEs) considered to be security bugs (e.g. DoS category), it would be terrible to release every week a new compiler because of the "security" issues. Running compiler on untrusted sources can trigger ICEs (which we want to fix but there will always be some), or run into some compile time and/or compile memory issue (we have various quadratic or worse spots), compiler stack limits (deeply nested stuff e.g. during parsing but other areas as well). So, people running fuzzers and reporting issues is great, but if they'd get a CVE assigned for each ice-on-invalid-code, ice-on-valid-code, each compile-time-hog and each memory-hog, that wouldn't be useful. Runtime libraries or security issues in the code we generate for valid sources are of course a different thing. I wonder if a security policy should say something about the -fplugin option. I agree that an ICE is not a security issue, but I wonder how many people are aware that a poorly chosen command line option can direct the compiler to run arbitrary code. For that matter the same is true of setting the GCC_EXEC_PREFIX environment variable, and no doubt several other environment variables. My point is not that we should change these, but that a security policy should draw attention to the fact that there are cases in which the compiler will unexpectedly run other programs. Well, if you run an arbitrary commandline from the internet you get what you deserve, running "echo "Hello World" | gcc -xc - -o /dev/sda" as root doesn't need plugins to shoot yourself in the foot. You need to know what you're doing, otherwise you are basically executing an arbitrary shell script with whatever privileges you have. I think it would be useful to mention caveats with plugins though, just like it would be useful to mention exceptions for libiberty and similar libraries that gcc builds. It only helps makes things clearer in terms of what security coverage the project provides. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-08 10:37, Jakub Jelinek wrote: On Tue, Aug 08, 2023 at 10:30:10AM -0400, Siddhesh Poyarekar wrote: Do you have a suggestion for the language to address libgcc, libstdc++, etc. and libiberty, libbacktrace, etc.? I'll work on this a bit and share a draft. BTW, I think we should perhaps differentiate between production ready libraries (e.g. libgcc, libstdc++, libgomp, libatomic, libgfortran, libquadmath, libssp) vs. e.g. the sanitizer libraries which are meant for debugging and Agreed, that's why I need some time to sort all of the libraries gcc builds to categorize them into various levels of support in terms of safety re. untrusted input. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-08 11:48, David Malcolm wrote: On Tue, 2023-08-08 at 09:33 -0400, Paul Koning via Gcc-patches wrote: On Aug 8, 2023, at 9:01 AM, Jakub Jelinek via Gcc-patches wrote: On Tue, Aug 08, 2023 at 02:52:57PM +0200, Richard Biener via Gcc- patches wrote: There's probably external tools to do this, not sure if we should replicate things in the driver for this. But sure, I think the driver is the proper point to address any of such issues - iff we want to address them at all. Maybe a nice little google summer-of-code project ;) What I'd really like to avoid is having all compiler bugs (primarily ICEs) considered to be security bugs (e.g. DoS category), it would be terrible to release every week a new compiler because of the "security" issues. Indeed. But my answer would be that such things are not DoS issues. DoS means that an external input, over which you have little control, is impairing service. In the case of a compiler, if feeding it bad source code X.c causes it to crash, the answer is "well, then don't do that". Agreed. I'm not sure how to "wordsmith" this, but it seems like the sources and options on the *host* are assumed to be trusted, and that the act of *compiling* source on the host requires trusting them, just like the act of executing the compiled code on the target does. Though users may be more familiar with sandboxing the target than the host. We should spell this out further for libgccjit: libgccjit allows for ahead-of-time and JIT compilation of sources - but it assumes that those sources (and the compilation options) are trusted. [Adding Andrea Corallo to the addressees] For example, Emacs is using libgccjit to do ahead-of-time compilation of Emacs bytecode. I'm assuming that Emacs is assuming that its bytecode is trusted, and that there isn't any attempt by Emacs to sandbox the Emacs Lisp being processed. However, consider a situation in which someone attempted to, say, embed libgccjit inside a web browser to generate machine code from JavaScript, where the JavaScript is potentially controlled by an attacker. I think we want to explicitly say that that if you're going to do that, you need to put some other layer of defense in, so that you're not blithely accepting the inputs to the compilation (sources and options) from a potentially hostile source, where a crafted input sources could potentially hit an ICE in the compiler and thus crash the web browser. +1, this is precisely the kind of thing the security policy should warn against and suggest using sandboxing for. The compiler (or libgccjit) isn't really in a position to defend such uses, ICE or otherwise. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-09 14:17, David Edelsohn wrote: On Wed, Aug 9, 2023 at 1:33 PM Siddhesh Poyarekar <mailto:siddh...@gotplt.org>> wrote: On 2023-08-08 10:30, Siddhesh Poyarekar wrote: >> Do you have a suggestion for the language to address libgcc, >> libstdc++, etc. and libiberty, libbacktrace, etc.? > > I'll work on this a bit and share a draft. Hi David, Here's what I came up with for different parts of GCC, including the runtime libraries. Over time we may find that specific parts of runtime libraries simply cannot be used safely in some contexts and flag that. Sid Hi, Sid Thanks for iterating on this. """ What is a GCC security bug? === A security bug is one that threatens the security of a system or network, or might compromise the security of data stored on it. In the context of GCC there are multiple ways in which this might happen and they're detailed below. Compiler drivers, programs, libgccjit and support libraries --- The compiler driver processes source code, invokes other programs such as the assembler and linker and generates the output result, which may be assembly code or machine code. It is necessary that all source code inputs to the compiler are trusted, since it is impossible for the driver to validate input source code beyond conformance to a programming language standard. The GCC JIT implementation, libgccjit, is intended to be plugged into applications to translate input source code in the application context. Limitations that apply to the compiler driver, apply here too in terms of sanitizing inputs, so it is recommended that inputs are either sanitized by an external program to allow only trusted, safe execution in the context of the application or the JIT execution context is appropriately sandboxed to contain the effects of any bugs in the JIT or its generated code to the sandboxed environment. Support libraries such as libiberty, libcc1 libvtv and libcpp have been developed separately to share code with other tools such as binutils and gdb. These libraries again have similar challenges to compiler drivers. While they are expected to be robust against arbitrary input, they should only be used with trusted inputs. Libraries such as zlib and libffi that bundled into GCC to build it will be treated the same as the compiler drivers and programs as far as security coverage is concerned. Should we direct people to the upstream projects for their security policies? We bundle zlib and libffi so regardless of whether it's a security issue in those libraries (because security impact of memory safety bugs in general use libraries will be context dependent and hence get assigned CVEs more often than not), the context in gcc is well defined as a local unprivileged executable and hence not security-relevant. That said, we could add something like: However if you find a issue in these libraries independent of their use in GCC you should reach out to their upstream projects to report them. As a result, the only case for a potential security issue in all these cases is when it ends up generating vulnerable output for valid input source code. Language runtime libraries -- GCC also builds and distributes libraries that are intended to be used widely to implement runtime support for various programming languages. These include the following: * libada * libatomic * libbacktrace * libcc1 * libcody * libcpp * libdecnumber * libgcc * libgfortran * libgm2 * libgo * libgomp * libiberty * libitm * libobjc * libphobos * libquadmath * libssp * libstdc++ These libraries are intended to be used in arbitrary contexts and as a result, bugs in these libraries may be evaluated for security impact. However, some of these libraries, e.g. libgo, libphobos, etc. are not maintained in the GCC project, due to which the GCC project may not be the correct point of contact for them. You are encouraged to look at README files within those library directories to locate the canonical security contact point for those projects. As Richard mentioned, should GCC make a specific statement about the security policy / response for issues that
Re: [RFC] GCC Security policy
On 2023-08-08 10:30, Siddhesh Poyarekar wrote: Do you have a suggestion for the language to address libgcc, libstdc++, etc. and libiberty, libbacktrace, etc.? I'll work on this a bit and share a draft. Hi David, Here's what I came up with for different parts of GCC, including the runtime libraries. Over time we may find that specific parts of runtime libraries simply cannot be used safely in some contexts and flag that. Sid """ What is a GCC security bug? === A security bug is one that threatens the security of a system or network, or might compromise the security of data stored on it. In the context of GCC there are multiple ways in which this might happen and they're detailed below. Compiler drivers, programs, libgccjit and support libraries --- The compiler driver processes source code, invokes other programs such as the assembler and linker and generates the output result, which may be assembly code or machine code. It is necessary that all source code inputs to the compiler are trusted, since it is impossible for the driver to validate input source code beyond conformance to a programming language standard. The GCC JIT implementation, libgccjit, is intended to be plugged into applications to translate input source code in the application context. Limitations that apply to the compiler driver, apply here too in terms of sanitizing inputs, so it is recommended that inputs are either sanitized by an external program to allow only trusted, safe execution in the context of the application or the JIT execution context is appropriately sandboxed to contain the effects of any bugs in the JIT or its generated code to the sandboxed environment. Support libraries such as libiberty, libcc1 libvtv and libcpp have been developed separately to share code with other tools such as binutils and gdb. These libraries again have similar challenges to compiler drivers. While they are expected to be robust against arbitrary input, they should only be used with trusted inputs. Libraries such as zlib and libffi that bundled into GCC to build it will be treated the same as the compiler drivers and programs as far as security coverage is concerned. As a result, the only case for a potential security issue in all these cases is when it ends up generating vulnerable output for valid input source code. Language runtime libraries -- GCC also builds and distributes libraries that are intended to be used widely to implement runtime support for various programming languages. These include the following: * libada * libatomic * libbacktrace * libcc1 * libcody * libcpp * libdecnumber * libgcc * libgfortran * libgm2 * libgo * libgomp * libiberty * libitm * libobjc * libphobos * libquadmath * libssp * libstdc++ These libraries are intended to be used in arbitrary contexts and as a result, bugs in these libraries may be evaluated for security impact. However, some of these libraries, e.g. libgo, libphobos, etc. are not maintained in the GCC project, due to which the GCC project may not be the correct point of contact for them. You are encouraged to look at README files within those library directories to locate the canonical security contact point for those projects. Diagnostic libraries The sanitizer library bundled in GCC is intended to be used in diagnostic cases and not intended for use in sensitive environments. As a result, bugs in the sanitizer will not be considered security sensitive. GCC plugins --- It should be noted that GCC may execute arbitrary code loaded by a user through the GCC plugin mechanism or through system preloading mechanism. Such custom code should be vetted by the user for safety as bugs exposed through such code will not be considered security issues.
Re: One question on the source code of tree-object-size.cc
On 2023-08-01 18:57, Kees Cook wrote: return p; } /* in the following function, malloc allocated less space than size of the struct fix. Then what's the correct behavior we expect the __builtin_object_size should have for the following? */ static struct fix * noinline alloc_buf_less () { struct fix *p; p = malloc(sizeof (struct fix) - SIZE_BUMP * sizeof (int)); /*when checking the observed access p->array, we have info on both observered allocation and observed access, A. from observed allocation (alloc_size): (LENGTH - SIZE_BUMP) * sizeof (int) B. from observed access (TYPE): LENGTH * sizeof (int) */ /* for MAXIMUM size in the whole object: currently, GCC always used the A. */ expect(__builtin_object_size(p->array, 0), (LENGTH - SIZE_BUMP) * sizeof(int)); ok: __builtin_object_size(p->array, 0) == 20 My brain just melted a little, as this is now an under-sized instance of "p", so we have an incomplete allocation. (I would expect -Warray-bounds to yell very loudly for this.) But, technically, yes, this looks like the right calculation. AFAIK, -Warray-bounds will only yell in case of a dereference that the compiler may potentially see as being beyond that 20 byte bound; it won't actually see the undersized allocation. An analyzer warning would be useful for just the undersized allocation regardless of whether the code actually ends up accessing the object beyond the allocation bounds. Thanks, Sid
Re: One question on the source code of tree-object-size.cc
On 2023-08-01 17:35, Qing Zhao wrote: typedef struct { int a; } A; size_t f() { A *p = malloc (1); return __builtin_object_size (p, 0); Correction, that should be __builtin_object_size (p->a, 0). Actually, it should be __builtin_object_size(p->a, 1). For __builtin_object_size(p->a,0), gcc always uses the allocation size for the whole object. Right, sorry, I mistyped, twice in fact; it should have been __bos(>a, 1) :) GCC’s current behavior is: For the size of the whole object, GCC currently always uses the allocation size. And for the size in the sub-object, GCC chose the smaller one among the allocation size and the TYPE_SIZE. Is this correct behavior? Yes, it's deliberate; it specifically checks on var != pt_var, which can only be true for subobjects. Thanks, Sid
Re: [RFC] GCC Security policy
Hi, Here's the updated draft of the top part of the security policy with all of the recommendations incorporated. Thanks, Sid What is a GCC security bug? === A security bug is one that threatens the security of a system or network, or might compromise the security of data stored on it. In the context of GCC there are multiple ways in which this might happen and they're detailed below. Compiler drivers, programs, libgccjit and support libraries --- The compiler driver processes source code, invokes other programs such as the assembler and linker and generates the output result, which may be assembly code or machine code. It is necessary that all source code inputs to the compiler are trusted, since it is impossible for the driver to validate input source code beyond conformance to a programming language standard. The GCC JIT implementation, libgccjit, is intended to be plugged into applications to translate input source code in the application context. Limitations that apply to the compiler driver, apply here too in terms of sanitizing inputs, so it is recommended that inputs are either sanitized by an external program to allow only trusted, safe execution in the context of the application or the JIT execution context is appropriately sandboxed to contain the effects of any bugs in the JIT or its generated code to the sandboxed environment. Support libraries such as libiberty, libcc1 libvtv and libcpp have been developed separately to share code with other tools such as binutils and gdb. These libraries again have similar challenges to compiler drivers. While they are expected to be robust against arbitrary input, they should only be used with trusted inputs. Libraries such as zlib that bundled into GCC to build it will be treated the same as the compiler drivers and programs as far as security coverage is concerned. However if you find an issue in these libraries independent of their use in GCC, you should reach out to their upstream projects to report them. As a result, the only case for a potential security issue in all these cases is when it ends up generating vulnerable output for valid input source code. As a result, the only case for a potential security issue in the compiler is when it generates vulnerable application code for trusted input source code that is conforming to the relevant programming standard or extensions documented as supported by GCC and the algorithm expressed in the source code does not have the vulnerability. The output application code could be considered vulnerable if it produces an actual vulnerability in the target application, specifically in the following cases: - The application dereferences an invalid memory location despite the application sources being valid. - The application reads from or writes to a valid but incorrect memory location, resulting in an information integrity issue or an information leak. - The application ends up running in an infinite loop or with severe degradation in performance despite the input sources having no such issue, resulting in a Denial of Service. Note that correct but non-performant code is not a security issue candidate, this only applies to incorrect code that may result in performance degradation severe enough to amount to a denial of service. - The application crashes due to the generated incorrect code, resulting in a Denial of Service. Language runtime libraries -- GCC also builds and distributes libraries that are intended to be used widely to implement runtime support for various programming languages. These include the following: * libada * libatomic * libbacktrace * libcc1 * libcody * libcpp * libdecnumber * libffi * libgcc * libgfortran * libgm2 * libgo * libgomp * libiberty * libitm * libobjc * libphobos * libquadmath * libsanitizer * libssp * libstdc++ These libraries are intended to be used in arbitrary contexts and as a result, bugs in these libraries may be evaluated for security impact. However, some of these libraries, e.g. libgo, libphobos, etc. are not maintained in the GCC project, due to which the GCC project may not be the correct point of contact for them. You are encouraged to look at README files within those library directories to locate the canonical security contact point for those projects and include them in the report. Once the issue is fixed in the upstream project, the fix will be synced into GCC in a future release. Most security vulnerabilities in these runtime libraries arise when an application
Re: [RFC] GCC Security policy
On 2023-08-14 17:16, Alexander Monakov wrote: On Mon, 14 Aug 2023, Siddhesh Poyarekar wrote: 1. It makes it clear to users of the project the scope in which the project could be used and what safety it could reasonably expect from the project. In the context of GCC for example, it cannot expect the compiler to do a safety check of untrusted sources; the compiler will consider #include "/etc/passwd" just as valid code as #include and as a result, the onus is on the user environment to validate the input sources for safety. Whoa, no. We shouldn't make such statements unless we are prepared to explain to users how such validation can be practically implemented, which I'm sure we cannot in this case, due to future extensions such as the #embed directive, and ability to obfuscate filenames using the preprocessor. There's no practical (programmatic) way to do such validation; it has to be a manual audit, which is why source code passed to the compiler has to be *trusted*. I think it would be more honest to say that crafted sources can result in arbitrary code execution with the privileges of the user invoking the compiler, and hence the operator may want to ensure that no sensitive data is available to that user (via measures ranging from plain UNIX permissions, to chroots, to virtual machines, to air-gapped computers, depending on threat model). Right, that's what we're essentially trying to convey in the security policy text. It doesn't go into mechanisms for securing execution (because that's really beyond the scope of the *project's* policy IMO) but it states unambiguously that input to the compiler must be trusted: """ ... It is necessary that all source code inputs to the compiler are trusted, since it is impossible for the driver to validate input source code beyond conformance to a programming language standard... """ Resource consumption is another good reason to sandbox compilers. Agreed, we make that specific recommendation in the context of libgccjit. Thanks, Sid
Re: [RFC] GCC Security policy
On 2023-08-14 14:51, Richard Sandiford wrote: I think it would help to clarify what the aim of the security policy is. Specifically: (1) What service do we want to provide to users by classifying one thing as a security bug and another thing as not a security bug? (2) What service do we want to provide to the GNU community by the same classification? I think it will be easier to agree on the classification if we first agree on that. I actually wanted to do a talk on this at the Cauldron this year and *then* propose this for the gcc community, but I guess we could do this early :) So the core intent of a security policy for a project is to make clear the security stance of the project, specifying to the extent possible what kind of uses are considered safe and what kinds of bugs would be considered security issues in the context of those uses. There are a few advantages of doing this: 1. It makes it clear to users of the project the scope in which the project could be used and what safety it could reasonably expect from the project. In the context of GCC for example, it cannot expect the compiler to do a safety check of untrusted sources; the compiler will consider #include "/etc/passwd" just as valid code as #include and as a result, the onus is on the user environment to validate the input sources for safety. 2. It helps the security community (Mitre and other CNAs and security researchers) set correct expectations of the project so that they don't cry wolf for every segfault or ICE under the pretext that code could presumably be run as a service somehow and hence result in a "DoS". 3. This in turn helps stave off spurious CVE submissions that cause needless churn in downstream distributions. LLVM is already starting to see this[1] and it's only a matter of time before people start doing this for GCC. 4. It helps make a distinction between important bugs and security bugs; they're often conflated as one and the same thing. Security bugs are special because they require different handling from those that do not have a security impact, regardless of their actual importance. Unfortunately one of the reasons they're special is because there's a bunch of (pretty dumb) automation out there that rings alarm bells on every single CVE. Without a clear understanding of the context under which a project can be used, these alarm bells can be made unreasonably loud (due to incorrect scoring, see the LLVM CVE for instance; just one element in that vector changes the score from 0.0 to 5.5), causing needless churn in not just the code base but in downstream releases and end user environments. 5. This exercise is also a great start in developing an understanding of which parts in GCC are security sensitive and in what sense. Runtime libraries for example have a direct impact on application security. Compiler impact is a little less direct. Hardening features have another effect, but it's more mitigation-oriented than direct safety. This also informs us about the impact of various project actions such as bundling third-party libraries and development and maintenance of tooling within GCC and will hopefully guide policies around those practices. I hope this is a sufficient start. We don't necessarily want to get into the business of acknowledging or rejecting security issues as upstream at the moment (but see also the CNA discussion[2] of what we intend to do in that space for glibc) but having uniform upstream guidelines would be helpful to researchers as well as downstream consumers to help decide what constitutes a security issue. Thanks, Sid [1] https://nvd.nist.gov/vuln/detail/CVE-2023-29932 [2] https://inbox.sourceware.org/libc-alpha/1a44f25a-5aa3-28b7-1ecb-b3991d44c...@gotplt.org/T/
Re: [RFC] GCC Security policy
On 2024-02-12 10:00, Richard Biener wrote: GCC driver support is then to the extent identifying the inputs and the outputs. We already have -MM to generate a list of non-system dependencies, so gcc should be able to pass on inputs to the tool, which could then map those files (and the system headers directory) into the sandbox before invocation. The output file could perhaps be enforced as having to be a new one, i.e. fail if the target is already found. I'm not sure a generic utility can achieve this unless the outputs need to be retrieved from somewhere else (not "usual" place when invoking un-sandboxed). Even the driver doesn't necessarily know all files read/written. So I suppose better defining of the actual goal is in order. gcc -sandboxed -O2 -c t.ii -fdump-tree-all what should this do? IMO invoked tools (gas, cc1plus) should be restricted to access the input files. Ideally the dumps would appear where they appear when not sandboxed but clearly overwriting existing files would be problematic, writing new files not so much, but only to the standard (or specified) auxiliary output file paths. Couldn't we get away with not having to handle dump files? They don't seem to be sensitive targets. Thanks, Sid
Re: [RFC] GCC Security policy
On 2024-02-09 10:38, Martin Jambor wrote: If anyone is interested in scoping this and then mentoring this as a Google Summer of Code project this year then now is the right time to speak up! I can help with mentoring and reviews, although I'll need someone to assist with actual approvals. There are two distinct sets of ideas to explore, one is privilege management and the other sandboxing. For privilege management we could add a --allow-root driver flag that allows gcc to run as root. Without the flag one could either outright refuse to run or drop privileges and run. Dropping privileges will be a bit tricky to implement because it would need a user to drop privileges to and then there would be the question of how to manage file access to read the compiler input and write out the compiler output. If there's no such user, gcc could refuse to run as root by default. I wonder though if from a security posture perspective it makes sense to simply discourage running as root all the time and not bother trying to make it work with dropped privileges and all that. Of course it would mean that this would be less of a "project"; it'll be a simple enough patch to refuse to run until --allow-root is specified. This probably ties in somewhat with an idea David Malcolm had riffed on with me earlier, of caching files for diagnostics. If we could unify file accesses somehow, we could make this happen, i.e. open/read files as root and then do all execution as non-root. Sandboxing will have similar requirements, i.e. map in input files and an output file handle upfront and then unshare() into a sandbox to do the actual compilation. This will make sure that at least the processing of inputs does not affect the system on which the compilation is being run. Sid
Re: [RFC] GCC Security policy
On 2024-02-09 12:14, Joseph Myers wrote: On Fri, 9 Feb 2024, Siddhesh Poyarekar wrote: For privilege management we could add a --allow-root driver flag that allows gcc to run as root. Without the flag one could either outright refuse to run or drop privileges and run. Dropping privileges will be a bit tricky to implement because it would need a user to drop privileges to and then there would be the question of how to manage file access to read the compiler input and write out the compiler output. If there's no such user, gcc could refuse to run as root by default. I wonder though if from a security posture perspective it makes sense to simply discourage running as root all the time and not bother trying to make it work with dropped privileges and all that. Of course it would mean that this would be less of a "project"; it'll be a simple enough patch to refuse to run until --allow-root is specified. I think disallowing running as root would be a big problem in practice - the typical problem case is when people build software as non-root and run "make install" as root, and for some reason "make install" wants to (re)build or (re)link something. Isn't that a problematic practice though? Or maybe have those invocations be separated out as CC_ROOT? Thanks, Sid
Re: [RFC] GCC Security policy
On 2024-02-09 15:06, Joseph Myers wrote: Ideally dependencies would be properly set up so that everything is built in the original build, and ideally there would be no need to relink at install time (I'm not sure of the exact circumstances in which it might be needed, or on what OSes to e.g. encode the right library paths in final installed executables). In practice I think it's common for some building to take place at install time. There is a more general principle here of composability: it's not helpful for being able to write scripts or makefiles combining invocations of different utilities and have them behave predictably if some of those utilities start making judgements about whether it's a good idea to run them in a particular environment rather than just doing their job independent of irrelevant aspects of the environment. The semantics of invoking "gcc" have nothing to do with whether it's run as root; it should never need to look up what user it's running as at all. (And it's probably also a bad idea for lots of separate utilities to gain their own ways to run in a restricted environment, for similar reasons; rather than teaching "gcc" a way to create a restricted environment itself, ensure there are easy-to-use more general utilities for running arbitrary programs on untrusted input in a contained environment.) I see your point. The way you put it, there's no GCC project here at all then. Sid
Re: [RFC] GCC Security policy
On 2024-02-12 08:16, Martin Jambor wrote: This probably ties in somewhat with an idea David Malcolm had riffed on with me earlier, of caching files for diagnostics. If we could unify file accesses somehow, we could make this happen, i.e. open/read files as root and then do all execution as non-root. Sandboxing will have similar requirements, i.e. map in input files and an output file handle upfront and then unshare() into a sandbox to do the actual compilation. This will make sure that at least the processing of inputs does not affect the system on which the compilation is being run. Right. As we often just download some (sometimes large) pre-processed source from Bugzilla and then happily run GCC on it on our computers, this feature might be actually useful for us (still, we'd probably need a more concrete description of what we want, would e.g. using "-wrapper gdb,--args" work in such a sandbox?). I agree that for some even semi-complex builds, a more general sandboxing solution is probably better. Joseph seems to be leaning towards nudging people to a general sandboxing solution too. The question then is whether this takes the shape of a utility in, e.g. contrib that builds such a sandbox or simply a wiki page. Thanks, Sid
Re: [C PATCH, v2] Add Walloc-size to warn about insufficient size in allocations [PR71219]
On 2023-12-06 09:21, Jakub Jelinek wrote: On Wed, Dec 06, 2023 at 02:34:10PM +0100, Martin Uecker wrote: Further I think "size less than or equal to the size requested" is quite ambiguous in the calloc case, isn't the size requested in the calloc case actually nmemb * size rather than just size? This is unclear but it can be understood this way. This was also Joseph's point. I am happy to submit a patch that changes the code so that the swapped arguments to calloc do not cause a warning anymore. That would be my preference because then the allocation size is correct and it is purely a style warning. It doesn't follow how the warning is described: "Warn about calls to allocation functions decorated with attribute @code{alloc_size} that specify insufficient size for the target type of the pointer the result is assigned to" when the size is certainly sufficient. But wonder what others think about it. +1, from a libc perspective, the transposed arguments don't make a difference, a typical allocator will produce sufficiently sized allocation for the calloc call. BTW, shouldn't the warning be for C++ as well? Sure, I know, people use operator new more often, but still, the allocators are used in there as well. We have the -Wmemset-transposed-args warning, couldn't we have a similar one for calloc, and perhaps do it solely in the case where one uses sizeof of the type used in the cast pointer? So warn for (struct S *) calloc (sizeof (struct S), 1) or (struct S *) calloc (sizeof (struct S), n) but not for (struct S *) calloc (4, 15) or (struct S *) calloc (sizeof (struct T), 1) or similar? Of course check for compatible types of TYPE_MAIN_VARIANTs. +1, this could be an analyzer warning, since in practice it is just a code cleanliness issue. Thanks, Sid
Re: [C PATCH, v2] Add Walloc-size to warn about insufficient size in allocations [PR71219]
On 2023-12-06 09:41, Jakub Jelinek wrote: On Wed, Dec 06, 2023 at 09:30:32AM -0500, Siddhesh Poyarekar wrote: We have the -Wmemset-transposed-args warning, couldn't we have a similar one for calloc, and perhaps do it solely in the case where one uses sizeof of the type used in the cast pointer? So warn for (struct S *) calloc (sizeof (struct S), 1) or (struct S *) calloc (sizeof (struct S), n) but not for (struct S *) calloc (4, 15) or (struct S *) calloc (sizeof (struct T), 1) or similar? Of course check for compatible types of TYPE_MAIN_VARIANTs. +1, this could be an analyzer warning, since in practice it is just a code cleanliness issue. We don't do such things in the analyzer, nor it is possible, by the time analyzer sees the IL all the sizeofs etc. are folded. Analyzer is for expensive to compute warnings, code style warnings are normally implemented in the FEs. Thanks, understood. A separate FE warning is fine as well. Thanks, Sid
[PATCH] SECURITY.txt: Drop "exploitable" in reference to hardening issues
The "exploitable vulnerability" may lead to a misunderstanding that missed hardening issues are considered vulnerabilities, just that they're not exploitable. This is not true, since while hardening bugs may be security-relevant, the absence of hardening does not make a program any more vulnerable to exploits than without. Drop the "exploitable" word to make it clear that missed hardening is not considered a vulnerability. diff --git a/SECURITY.txt b/SECURITY.txt index b3e2bbfda90..126603d4c22 100644 --- a/SECURITY.txt +++ b/SECURITY.txt @@ -155,10 +155,10 @@ Security features implemented in GCC GCC implements a number of security features that reduce the impact of security issues in applications, such as -fstack-protector, -fstack-clash-protection, _FORTIFY_SOURCE and so on. A failure of -these features to function perfectly in all situations is not an -exploitable vulnerability in itself since it does not affect the -correctness of programs. Further, they're dependent on heuristics -and may not always have full coverage for protection. +these features to function perfectly in all situations is not a +vulnerability in itself since it does not affect the correctness of +programs. Further, they're dependent on heuristics and may not +always have full coverage for protection. Similarly, GCC may transform code in a way that the correctness of the expressed algorithm is preserved, but supplementary properties
[PATCH] tree-object-size: Always set computed bit for bdos [PR113012]
It is always safe to set the computed bit for dynamic object sizes at the end of collect_object_sizes_for because even in case of a dependency loop encountered in nested calls, we have an SSA temporary to actually finish the object size expression. The reexamine pass for dynamic object sizes is only for propagation of unknowns and gimplification of the size expressions, not for loop resolution as in the case of static object sizes. gcc/ChangeLog: PR tree-optimization/113012 * gcc.dg/ubsan/pr113012.c: New test case. gcc/testsuite/ChangeLog: PR tree-optimization/113012 * tree-object-size.cc (compute_builtin_object_size): Expand comment for dynamic object sizes. (collect_object_sizes_for): Always set COMPUTED bitmap for dynamic object sizes. Signed-off-by: Siddhesh Poyarekar --- Testing: - Bootstrapped x86_64 and config=ubsan - Tested i686 OK for trunk and backport to gcc 13 branch? gcc/testsuite/gcc.dg/ubsan/pr113012.c | 17 + gcc/tree-object-size.cc | 17 - 2 files changed, 29 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/ubsan/pr113012.c diff --git a/gcc/testsuite/gcc.dg/ubsan/pr113012.c b/gcc/testsuite/gcc.dg/ubsan/pr113012.c new file mode 100644 index 000..4fc38cd1171 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ubsan/pr113012.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-fsanitize=undefined" } */ + +int * +foo (int x, int y, int z, int w) +{ + int *p = __builtin_malloc (z * sizeof (int)); + int *q = p - 1; + while (--x > 0) +{ + if (w + 1 > y) + q = p - 1; + ++*q; + ++q; +} + return p; +} diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc index 28f27adf9ca..434b2fc0bf5 100644 --- a/gcc/tree-object-size.cc +++ b/gcc/tree-object-size.cc @@ -1185,10 +1185,12 @@ compute_builtin_object_size (tree ptr, int object_size_type, osi.tos = NULL; } - /* First pass: walk UD chains, compute object sizes that -can be computed. osi.reexamine bitmap at the end will -contain what variables were found in dependency cycles -and therefore need to be reexamined. */ + /* First pass: walk UD chains, compute object sizes that can be computed. +osi.reexamine bitmap at the end will contain what variables that need +to be reexamined. For both static and dynamic size computation, +reexamination is for propagation across dependency loops. The dynamic +case has the additional use case where the computed expression needs +to be gimplified. */ osi.pass = 0; osi.changed = false; collect_object_sizes_for (, ptr); @@ -1823,11 +1825,16 @@ collect_object_sizes_for (struct object_size_info *osi, tree var) gcc_unreachable (); } - if (! reexamine || object_sizes_unknown_p (object_size_type, varno)) + /* Dynamic sizes use placeholder temps to return an answer, so it is always + * safe to set COMPUTED for them. */ + if ((object_size_type & OST_DYNAMIC) + || !reexamine || object_sizes_unknown_p (object_size_type, varno)) { bitmap_set_bit (computed[object_size_type], varno); if (!(object_size_type & OST_DYNAMIC)) bitmap_clear_bit (osi->reexamine, varno); + else if (reexamine) + bitmap_set_bit (osi->reexamine, varno); } else { -- 2.43.0
[PATCH] tree-object-size: Clean up unknown propagation
Narrow down scope of the unknowns bitmap so that it is only accessible within the reexamination process. This also removes any role of unknown propagation from object_sizes_set, thus simplifying that code path a bit. gcc/ChangeLog: * tree-object-size.cc (object_size_info): Remove UNKNOWNS. Drop all references to it. (object_sizes_set): Move unknowns propagation code to... (gimplify_size_expressions): ... here. Also free reexamine bitmap. (propagate_unknowns): New parameter UNKNOWNS. Update callers. Signed-off-by: Siddhesh Poyarekar --- This is a follow-up cleanup to pr#113012, but not required to fix that bug. Bootstrapped on x86_64 and with config=ubsan. gcc/tree-object-size.cc | 65 + 1 file changed, 34 insertions(+), 31 deletions(-) diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc index 434b2fc0bf5..08a3b7f5d94 100644 --- a/gcc/tree-object-size.cc +++ b/gcc/tree-object-size.cc @@ -43,7 +43,7 @@ struct object_size_info int object_size_type; unsigned char pass; bool changed; - bitmap visited, reexamine, unknowns; + bitmap visited, reexamine; unsigned int *depths; unsigned int *stack, *tos; }; @@ -264,19 +264,8 @@ object_sizes_set (struct object_size_info *osi, unsigned varno, tree val, { if (bitmap_bit_p (osi->reexamine, varno)) { - if (size_unknown_p (val, object_size_type)) - { - oldval = object_sizes_get (osi, varno); - old_wholeval = object_sizes_get (osi, varno, true); - bitmap_set_bit (osi->unknowns, SSA_NAME_VERSION (oldval)); - bitmap_set_bit (osi->unknowns, SSA_NAME_VERSION (old_wholeval)); - bitmap_clear_bit (osi->reexamine, varno); - } - else - { - val = bundle_sizes (oldval, val); - wholeval = bundle_sizes (old_wholeval, wholeval); - } + val = bundle_sizes (oldval, val); + wholeval = bundle_sizes (old_wholeval, wholeval); } else { @@ -958,25 +947,26 @@ emit_phi_nodes (gimple *stmt, tree size, tree wholesize) size_unknown, as noted in UNKNOWNS. */ static tree -propagate_unknowns (object_size_info *osi, tree expr) +propagate_unknowns (object_size_info *osi, tree expr, bitmap unknowns) { int object_size_type = osi->object_size_type; switch (TREE_CODE (expr)) { case SSA_NAME: - if (bitmap_bit_p (osi->unknowns, SSA_NAME_VERSION (expr))) + if (bitmap_bit_p (unknowns, SSA_NAME_VERSION (expr))) return size_unknown (object_size_type); return expr; case MIN_EXPR: case MAX_EXPR: { - tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 0)); + tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 0), +unknowns); if (size_unknown_p (res, object_size_type)) return res; - res = propagate_unknowns (osi, TREE_OPERAND (expr, 1)); + res = propagate_unknowns (osi, TREE_OPERAND (expr, 1), unknowns); if (size_unknown_p (res, object_size_type)) return res; @@ -984,7 +974,8 @@ propagate_unknowns (object_size_info *osi, tree expr) } case MODIFY_EXPR: { - tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 1)); + tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 1), +unknowns); if (size_unknown_p (res, object_size_type)) return res; return expr; @@ -992,7 +983,8 @@ propagate_unknowns (object_size_info *osi, tree expr) case TREE_VEC: for (int i = 0; i < TREE_VEC_LENGTH (expr); i++) { - tree res = propagate_unknowns (osi, TREE_VEC_ELT (expr, i)); + tree res = propagate_unknowns (osi, TREE_VEC_ELT (expr, i), +unknowns); if (size_unknown_p (res, object_size_type)) return res; } @@ -1000,7 +992,8 @@ propagate_unknowns (object_size_info *osi, tree expr) case PLUS_EXPR: case MINUS_EXPR: { - tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 0)); + tree res = propagate_unknowns (osi, TREE_OPERAND (expr, 0), +unknowns); if (size_unknown_p (res, object_size_type)) return res; @@ -1025,6 +1018,7 @@ gimplify_size_expressions (object_size_info *osi) /* Step 1: Propagate unknowns into expressions. */ bitmap reexamine = BITMAP_ALLOC (NULL); bitmap_copy (reexamine, osi->reexamine); + bitmap unknowns = BITMAP_ALLOC (NULL); do { changed = false; @@ -1032,14 +1026,23 @@ gimplify_size_expressions (object_size_info *osi) { object_size cur = object_sizes_get_raw (osi, i); - if (size_unknown_p (propagate
Re: [PATCH] tree-object-size: Always set computed bit for bdos [PR113012]
On 2023-12-19 17:57, Jakub Jelinek wrote: On Mon, Dec 18, 2023 at 11:42:52AM -0500, Siddhesh Poyarekar wrote: It is always safe to set the computed bit for dynamic object sizes at the end of collect_object_sizes_for because even in case of a dependency loop encountered in nested calls, we have an SSA temporary to actually finish the object size expression. The reexamine pass for dynamic object sizes is only for propagation of unknowns and gimplification of the size expressions, not for loop resolution as in the case of static object sizes. gcc/ChangeLog: PR tree-optimization/113012 * gcc.dg/ubsan/pr113012.c: New test case. gcc/testsuite/ChangeLog: PR tree-optimization/113012 * tree-object-size.cc (compute_builtin_object_size): Expand comment for dynamic object sizes. (collect_object_sizes_for): Always set COMPUTED bitmap for dynamic object sizes. You have the gcc/ChangeLog and gcc/testsuite/ChangeLog hunks swapped, I think this wouldn't get through pre-commit hook. Oops, fixed. --- a/gcc/tree-object-size.cc +++ b/gcc/tree-object-size.cc @@ -1185,10 +1185,12 @@ compute_builtin_object_size (tree ptr, int object_size_type, osi.tos = NULL; } - /* First pass: walk UD chains, compute object sizes that -can be computed. osi.reexamine bitmap at the end will -contain what variables were found in dependency cycles -and therefore need to be reexamined. */ + /* First pass: walk UD chains, compute object sizes that can be computed. +osi.reexamine bitmap at the end will contain what variables that need This wording is weird. Perhaps ... will contain versions of SSA_NAMEs that need to be reexamined. ? Because varno seems to be SSA_NAME_VERSION. Ack, it's unrelated to my change, but fixed since I touched the comment block. @@ -1823,11 +1825,16 @@ collect_object_sizes_for (struct object_size_info *osi, tree var) gcc_unreachable (); } - if (! reexamine || object_sizes_unknown_p (object_size_type, varno)) + /* Dynamic sizes use placeholder temps to return an answer, so it is always + * safe to set COMPUTED for them. */ We don't use this style of comments, please replace the * at the start of second line with a space. Oops, fixed. + if ((object_size_type & OST_DYNAMIC) + || !reexamine || object_sizes_unknown_p (object_size_type, varno)) { bitmap_set_bit (computed[object_size_type], varno); if (!(object_size_type & OST_DYNAMIC)) bitmap_clear_bit (osi->reexamine, varno); + else if (reexamine) + bitmap_set_bit (osi->reexamine, varno); } else { Otherwise LGTM, but please wait at least a few weeks before backporting to 13. OK, I'll push with fixes to trunk in a bit and then push to 13 next year. Thanks, Sid
Re: [PATCH] tree-object-size: Clean up unknown propagation
On 2023-12-20 00:23, Jeff Law wrote: On 12/19/23 10:21, Siddhesh Poyarekar wrote: Narrow down scope of the unknowns bitmap so that it is only accessible within the reexamination process. This also removes any role of unknown propagation from object_sizes_set, thus simplifying that code path a bit. gcc/ChangeLog: * tree-object-size.cc (object_size_info): Remove UNKNOWNS. Drop all references to it. (object_sizes_set): Move unknowns propagation code to... (gimplify_size_expressions): ... here. Also free reexamine bitmap. (propagate_unknowns): New parameter UNKNOWNS. Update callers. Signed-off-by: Siddhesh Poyarekar --- This is a follow-up cleanup to pr#113012, but not required to fix that bug. Bootstrapped on x86_64 and with config=ubsan. OK, assuming it also went through a regression test or does so before committing. Thanks, yes, I also compared test results during the x86_64 bootstrap, likewise for i686 after I had posted this. Sid
Re: [PATCH] SECURITY.txt: Drop "exploitable" in reference to hardening issues
On 2023-12-18 09:35, Siddhesh Poyarekar wrote: The "exploitable vulnerability" may lead to a misunderstanding that missed hardening issues are considered vulnerabilities, just that they're not exploitable. This is not true, since while hardening bugs may be security-relevant, the absence of hardening does not make a program any more vulnerable to exploits than without. Drop the "exploitable" word to make it clear that missed hardening is not considered a vulnerability. Ping, may I commit this if there are no objections? Thanks, Sid diff --git a/SECURITY.txt b/SECURITY.txt index b3e2bbfda90..126603d4c22 100644 --- a/SECURITY.txt +++ b/SECURITY.txt @@ -155,10 +155,10 @@ Security features implemented in GCC GCC implements a number of security features that reduce the impact of security issues in applications, such as -fstack-protector, -fstack-clash-protection, _FORTIFY_SOURCE and so on. A failure of - these features to function perfectly in all situations is not an - exploitable vulnerability in itself since it does not affect the - correctness of programs. Further, they're dependent on heuristics - and may not always have full coverage for protection. + these features to function perfectly in all situations is not a + vulnerability in itself since it does not affect the correctness of + programs. Further, they're dependent on heuristics and may not + always have full coverage for protection. Similarly, GCC may transform code in a way that the correctness of the expressed algorithm is preserved, but supplementary properties
Re: [gcc15] nested functions in C
On 2023-12-04 16:31, Martin Uecker wrote: If (assuming from them being called lambdas) they are primarily for small functions without side-effects then it's already a significantly stronger specification than what we have right now with C nested functions. That would end up enforcing what you demonstrate as the good way to use nested functions. The proposal we have seen for C23 (which was not accepted into C23 mostly due to timing and lack of implementation experience) were similar to C++'s lambdas and did not have any such restriction. Oh well :/ If nested functions are eventually going to make it into the C standard then effort is probably better spent in porting the C nested functions to use descriptors instead of executable stacks or heaps. I submitted a patch for this a long time ago which was based on the code for Ada that uses a bit in the pointer to differentiate between conventional pointers and descriptors. I would now prefer an approach that uses a qualifier on the function type to indicate that the static chain has to be set. A pointer to such a qualified function would a descriptor that consists of the address and the value for the static chain. This would be useful for many things. Ack, this probably becomes a gcc15 thing then, given that stage 1 has closed. Are you planning to revive your work and repost? Thanks, Sid
Re: [PATCH] gcc: Disallow trampolines when -fhardened
On 2023-12-02 04:42, Martin Uecker wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- It came up that a good hardening strategy is to disable trampolines which may require executable stack. Therefore the following patch adds -Werror=trampolines to -fhardened. This would add a warning about specific code (where it is then unclear whether rewriting it is feasible or even an improvement), which seems different to all the other flags -fhardening has now. It's actually -Werror=trampolines, not just -Wtrampolines; the aim is to hard fail on producing trampolines and consequently, an executable stack. In general the goal of -fhardened is to produce hardened code and the nested function trampolines do the exact reverse of that, so -Werror=trampolines seems to align perfectly with that goal, doesn't it? GCC now has an option to allocate trampolines on the heap, which would seem to be a better fit. On the other hand, it does not work with longjmp which may be a limitation. For hardened code in C, I think we really should look to step away from nested functions instead of adding ways to continue supporting it. There's probably a larger conversation to be had about the utility of nested functions in general for C (and whether this GCC extension should be deprecated altogether in future), but I feel like the -fhardened subset gives us the opportunity to enforce at least a safe subset for now, possibly extending it in future. Thanks, Sid
Re: [PATCH] gcc: Disallow trampolines when -fhardened
On 2023-12-04 11:39, Andreas Schwab wrote: On Dez 04 2023, Siddhesh Poyarekar wrote: For hardened code in C, I think we really should look to step away from nested functions instead of adding ways to continue supporting it. There's probably a larger conversation to be had about the utility of nested functions in general for C (and whether this GCC extension should be deprecated altogether in future), but I feel like the -fhardened subset gives us the opportunity to enforce at least a safe subset for now, possibly extending it in future. Nested functions by itself don't need a trampoline, only if the address of it is passed outside the containing function's scope (as a callback, for example). Yes, that's why I said that the conversation about deprecating the C nested functions extension is a broader one (and hence for gcc 15) that will likely involve the question of whether dropping the extension altogether gives any benefit or if dropping support for on-stack trampolines is sufficient. On-heap trampolines are maybe slightly better in that they don't need an executable stack, but defaulting to on-heap trampolines for -fhardened seems like a lost opportunity to enforce better user code. Thanks, Sid
[gcc15] nested functions in C
[Branching this into a separate conversation to avoid derailing the patch, which isn't directly related] On 2023-12-04 12:21, Martin Uecker wrote: I do not really agree with that. Nested functions can substantially improve code quality and in C can avoid type unsafe use of void* pointers in callbacks. The code is often much better with nested functions than without. Nested functions and lambdas (i.e. anonymous nested functions) are used in many languages because they make code better and GNU's nested function are no exception. So I disagree with the idea that discouraging nested functions leads to better code - I think the exact opposite is true. I would argue that GNU's nested functions *are* an exception because they're like feathers stuck on a pig to try and make it fly; I think a significant specification effort is required to actually make it a cleanly usable feature. It *may* be possible to implement patterns that use C nested functions well enough *and* result in readable code, but IMO it is easier to write clunky and unmaintainable code with it. I empathize with Jakub's stated use case though of keeping the C frontend support for testing purposes, but that could easily be done behind a flag, or by putting nested C func deprecation behind a flag. I am generally wary of mitigations that may make exploitation of buffer overflows a bit harder while increasing the likelihood of buffer overflows by reducing type safety and/or code quality. But I would agree that trampolines are generally problematic. A better strategy would be wide function pointer type (as in Apple' Blocks extension). Alternatively, an explicit way to obtain the static chain for a nested function which could be used with __builtin_call_with_static_chain could also work. But in any case, I think it diminishes the value of -fhardening it if requires source code changes, because then it is not as easy to simply turn it on in larger projects / distributitions. I suppose you mean source code changes even in correct code just to comply with the flag? I don't disagree for cases like -Warray-bounds, but for warnings/errors that are more deterministic in nature (like -Werror=trampolines), they're going to point at actual problems and larger projects and distributions will usually prefer to at least track them, if not actually fix them. For Fedora we tend to provide macro overrides for packages that need to explicitly disable a security related flag. Thanks, Sid
Re: [gcc15] nested functions in C
On 2023-12-04 13:48, Martin Uecker wrote: I empathize with Jakub's stated use case though of keeping the C frontend support for testing purposes, but that could easily be done behind a flag, or by putting nested C func deprecation behind a flag. I am relatively sure C will get some form of nested functions. Maybe as anonymous nested functions, i.e. lambdas, but I do not see a fundamental difference here (I personally like naming things for clarity, so i prefer named nested functions) If (assuming from them being called lambdas) they are primarily for small functions without side-effects then it's already a significantly stronger specification than what we have right now with C nested functions. That would end up enforcing what you demonstrate as the good way to use nested functions. I suppose minimal, contained side-effects (such as atomically updating a structure) may also constitute sound design, but that should be made explicit in the language. I don't disagree for cases like -Warray-bounds, but for warnings/errors that are more deterministic in nature (like -Werror=trampolines), they're going to point at actual problems and larger projects and distributions will usually prefer to at least track them, if not actually fix them. For Fedora we tend to provide macro overrides for packages that need to explicitly disable a security related flag. In projects such as mine, this will lead to a lot of code transformations as indicated above, i.e. much worse code. One could get away with it, since nested functions are rarely used, but I think this is bad, because a lot of code would improve if it used them. If nested functions are eventually going to make it into the C standard then effort is probably better spent in porting the C nested functions to use descriptors instead of executable stacks or heaps. Thanks, Sid
Re: [gcc15] nested functions in C
On 2023-12-04 13:51, Jakub Jelinek wrote: Why? The syntax doesn't seem to be something unexpected, and as C doesn't have lambdas, one can use the nested functions instead. The only problem is if you need to pass function pointers somewhere else (and target doesn't have function descriptors or something similar), if it is only done to make code more readable compared to say use of macros, I think the nested functions are better, one doesn't have to worry about multiple evaluations of argument side-effects etc. And if everything is inlined and SRA optimized, there is no extra cost. The problem of passing it as a function pointer to other functions is common with C++, only lambdas which don't capture anything actually can be convertible to function pointer, for anything else you need a template and instantiate it for a particular lambda (which is something you can't do in C). I think from a language standpoint, the general idea that nested functions are just any functions inside functions (which is how the C nested functions essentially behave) is too broad and they should be restricted to minimal implementations that, e.g. don't have side-effects or if they do, there's explicit syntactic sugar to make it clearer. If (like Martin stated earlier), nested functions are in fact going to enter the C standard in future then maybe this discussion is moot and we probably are better off implementing descriptor support for C to replace the on-stack trampolines instead of adding -Werror=trampolines in a hurry. Thanks, Sid
Re: [gcc15] nested functions in C
On 2023-12-07 10:42, Eric Botcazou wrote: I think from a language standpoint, the general idea that nested functions are just any functions inside functions (which is how the C nested functions essentially behave) is too broad and they should be restricted to minimal implementations that, e.g. don't have side-effects or if they do, there's explicit syntactic sugar to make it clearer. That sounds totally arbitrary though. Algol-derived languages have had nested subprograms for ages, e.g. Pascal or Ada, and they can be very useful. I'll admit that it is a subjective preference and is probably not in the spirit of traditional C. Sid
Re: [PATCH v8 5/5] Add the 6th argument to .ACCESS_WITH_SIZE
On 2024-03-29 12:07, Qing Zhao wrote: to carry the TYPE of the flexible array. Such information is needed during tree-object-size.cc. We cannot use the result type or the type of the 1st argument of the routine .ACCESS_WITH_SIZE to decide the element type of the original array due to possible type casting in the source code. gcc/c/ChangeLog: * c-typeck.cc (build_access_with_size_for_counted_by): Add the 6th argument to .ACCESS_WITH_SIZE. gcc/ChangeLog: * tree-object-size.cc (access_with_size_object_size): Use the type of the 6th argument for the type of the element. gcc/testsuite/ChangeLog: * gcc.dg/flex-array-counted-by-6.c: New test. This version looks fine to me for stage 1, but I'm not a maintainer so you'll need an ack from one to commit. Thanks, Sid --- gcc/c/c-typeck.cc | 11 +++-- gcc/internal-fn.cc| 2 + .../gcc.dg/flex-array-counted-by-6.c | 46 +++ gcc/tree-object-size.cc | 16 --- 4 files changed, 66 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-6.c diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index f7b0e08459b0..05948f76039e 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -2608,7 +2608,8 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type) to: - (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1)) + (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1, + (TYPE_OF_ARRAY *)0)) NOTE: The return type of this function is the POINTER type pointing to the original flexible array type. @@ -2620,6 +2621,9 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type) The 4th argument of the call is a constant 0 with the TYPE of the object pointed by COUNTED_BY_REF. + The 6th argument of the call is a constant 0 with the pointer TYPE + to the original flexible array type. + */ static tree build_access_with_size_for_counted_by (location_t loc, tree ref, @@ -2632,12 +2636,13 @@ build_access_with_size_for_counted_by (location_t loc, tree ref, tree call = build_call_expr_internal_loc (loc, IFN_ACCESS_WITH_SIZE, - result_type, 5, + result_type, 6, array_to_pointer_conversion (loc, ref), counted_by_ref, build_int_cst (integer_type_node, 1), build_int_cst (counted_by_type, 0), - build_int_cst (integer_type_node, -1)); + build_int_cst (integer_type_node, -1), + build_int_cst (result_type, 0)); /* Wrap the call with an INDIRECT_REF with the flexible array type. */ call = build1 (INDIRECT_REF, TREE_TYPE (ref), call); SET_EXPR_LOCATION (call, loc); diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index e744080ee670..34e4a4aea534 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -3411,6 +3411,8 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) 1: read_only 2: write_only 3: read_write + 6th argument: A constant 0 with the pointer TYPE to the original flexible + array type. Both the return type and the type of the first argument of this function have been converted from the incomplete array type to diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c new file mode 100644 index ..65fa01443d95 --- /dev/null +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c @@ -0,0 +1,46 @@ +/* Test the attribute counted_by and its usage in + * __builtin_dynamic_object_size: when the type of the flexible array member + * is casting to another type. */ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include "builtin-object-size-common.h" + +typedef unsigned short u16; + +struct info { + u16 data_len; + char data[] __attribute__((counted_by(data_len))); +}; + +struct foo { + int a; + int b; +}; + +static __attribute__((__noinline__)) +struct info *setup () +{ + struct info *p; + size_t bytes = 3 * sizeof(struct foo); + + p = (struct info *)malloc (sizeof (struct info) + bytes); + p->data_len = bytes; + + return p; +} + +static void +__attribute__((__noinline__)) report (struct info *p) +{ + struct foo *bar = (struct foo *)p->data; + EXPECT(__builtin_dynamic_object_size((char *)(bar + 1), 1), 16); + EXPECT(__builtin_dynamic_object_size((char *)(bar + 2), 1), 8); +} + +int main(int argc, char *argv[]) +{ + struct info *p = setup(); + report(p); + return 0; +} diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc index 8de264d1dee2..4c1fa9b555fa 100644 ---
Re: [PATCH v8 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.
On 2024-03-29 12:07, Qing Zhao wrote: gcc/c-family/ChangeLog: * c-ubsan.cc (get_bound_from_access_with_size): New function. (ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE. gcc/testsuite/ChangeLog: * gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test. * gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test. * gcc.dg/ubsan/flex-array-counted-by-bounds-4.c: New test. * gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test. --- This version looks fine to me for stage 1, but I'm not a maintainer so you'll need an ack from one to commit. Thanks, Sid gcc/c-family/c-ubsan.cc | 42 + .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++ .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++ .../ubsan/flex-array-counted-by-bounds-4.c| 34 ++ .../ubsan/flex-array-counted-by-bounds.c | 46 +++ 5 files changed, 201 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-4.c create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc index 940982819ddf..7cd3c6aa5b88 100644 --- a/gcc/c-family/c-ubsan.cc +++ b/gcc/c-family/c-ubsan.cc @@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc) return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, data)); } +/* Get the tree that represented the number of counted_by, i.e, the maximum + number of the elements of the object that the call to .ACCESS_WITH_SIZE + points to, this number will be the bound of the corresponding array. */ +static tree +get_bound_from_access_with_size (tree call) +{ + if (!is_access_with_size_p (call)) +return NULL_TREE; + + tree ref_to_size = CALL_EXPR_ARG (call, 1); + unsigned int class_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2)); + tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3)); + tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size), + build_int_cst (ptr_type_node, 0)); + /* If size is negative value, treat it as zero. */ + if (!TYPE_UNSIGNED (type)) + { +tree cond = fold_build2 (LT_EXPR, boolean_type_node, +unshare_expr (size), build_zero_cst (type)); +size = fold_build3 (COND_EXPR, type, cond, + build_zero_cst (type), size); + } + + /* Only when class_of_size is 1, i.e, the number of the elements of + the object type, return the size. */ + if (class_of_size != 1) +return NULL_TREE; + else +size = fold_convert (sizetype, size); + + return size; +} + + /* Instrument array bounds for ARRAY_REFs. We create special builtin, that gets expanded in the sanopt pass, and make an array dimension of it. ARRAY is the array, *INDEX is an index to the array. @@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index, && COMPLETE_TYPE_P (type) && integer_zerop (TYPE_SIZE (type))) bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1); + else if (INDIRECT_REF_P (array) + && is_access_with_size_p ((TREE_OPERAND (array, 0 + { + bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0))); + bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound), + bound, + build_int_cst (TREE_TYPE (bound), 1)); + } else return NULL_TREE; } diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c new file mode 100644 index ..b503320628d2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c @@ -0,0 +1,45 @@ +/* Test the attribute counted_by and its usage in + bounds sanitizer combined with VLA. */ +/* { dg-do run } */ +/* { dg-options "-fsanitize=bounds" } */ +/* { dg-output "index 11 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */ + + +#include + +void __attribute__((__noinline__)) setup_and_test_vla (int n, int m) +{ + struct foo { + int n; + int p[][n] __attribute__((counted_by(n))); + } *f; + + f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n])); + f->n = m; + f->p[m][n-1]=1; + return; +} + +void
Re: [PATCH v8 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.
On 2024-03-29 12:07, Qing Zhao wrote: gcc/ChangeLog: * tree-object-size.cc (access_with_size_object_size): New function. (call_object_size): Call the new function. gcc/testsuite/ChangeLog: * gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT. * gcc.dg/flex-array-counted-by-3.c: New test. * gcc.dg/flex-array-counted-by-4.c: New test. * gcc.dg/flex-array-counted-by-5.c: New test. This version looks fine to me for stage 1, but I'm not a maintainer so you'll need an ack from one to commit. Thanks, Sid --- .../gcc.dg/builtin-object-size-common.h | 11 ++ .../gcc.dg/flex-array-counted-by-3.c | 63 +++ .../gcc.dg/flex-array-counted-by-4.c | 178 ++ .../gcc.dg/flex-array-counted-by-5.c | 48 + gcc/tree-object-size.cc | 60 ++ 5 files changed, 360 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h b/gcc/testsuite/gcc.dg/builtin-object-size-common.h index 66ff7cdd953a..b677067c6e6b 100644 --- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h +++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h @@ -30,3 +30,14 @@ unsigned nfails = 0; __builtin_abort (); \ return 0; \ } while (0) + +#define EXPECT(p, _v) do { \ + size_t v = _v; \ + if (p == v)\ +__builtin_printf ("ok: %s == %zd\n", #p, p); \ + else \ +{\ + __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \ + FAIL (); \ +}\ +} while (0); diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c new file mode 100644 index ..78f50230e891 --- /dev/null +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c @@ -0,0 +1,63 @@ +/* Test the attribute counted_by and its usage in + * __builtin_dynamic_object_size. */ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include "builtin-object-size-common.h" + +struct flex { + int b; + int c[]; +} *array_flex; + +struct annotated { + int b; + int c[] __attribute__ ((counted_by (b))); +} *array_annotated; + +struct nested_annotated { + struct { +union { + int b; + float f; +}; +int n; + }; + int c[] __attribute__ ((counted_by (b))); +} *array_nested_annotated; + +void __attribute__((__noinline__)) setup (int normal_count, int attr_count) +{ + array_flex += (struct flex *)malloc (sizeof (struct flex) ++ normal_count * sizeof (int)); + array_flex->b = normal_count; + + array_annotated += (struct annotated *)malloc (sizeof (struct annotated) + + attr_count * sizeof (int)); + array_annotated->b = attr_count; + + array_nested_annotated += (struct nested_annotated *)malloc (sizeof (struct nested_annotated) ++ attr_count * sizeof (int)); + array_nested_annotated->b = attr_count; + + return; +} + +void __attribute__((__noinline__)) test () +{ +EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1); +EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1), + array_annotated->b * sizeof (int)); +EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1), + array_nested_annotated->b * sizeof (int)); +} + +int main(int argc, char *argv[]) +{ + setup (10,10); + test (); + DONE (); +} diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c new file mode 100644 index ..20103d58ef51 --- /dev/null +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c @@ -0,0 +1,178 @@ +/* Test the attribute counted_by and its usage in +__builtin_dynamic_object_size: what's the correct behavior when the +allocation size mismatched with the value of counted_by attribute? +We should always use the latest value that is hold by the counted_by +field. */ +/* { dg-do run } */ +/* { dg-options "-O -fstrict-flex-arrays=3" } */ + +#include "builtin-object-size-common.h" + +struct annotated { + size_t foo; + char others; + char array[] __attribute__((counted_by (foo))); +}; + +#define noinline
Re: [PATCH v2 3/3] Add testing cases for flexible array members in unions and alone in structures.
On 2024-04-25 10:06, Qing Zhao wrote: gcc/testsuite/ChangeLog: * c-c++-common/fam-in-union-alone-in-struct-1.c: New testcase. * c-c++-common/fam-in-union-alone-in-struct-2.c: New testcase. * c-c++-common/fam-in-union-alone-in-struct-3.c: New testcase. --- Sorry I should have commented sooner; could you please also add tests for __bos/__bdos for such unions and structs? Thanks, Sid .../fam-in-union-alone-in-struct-1.c | 52 +++ .../fam-in-union-alone-in-struct-2.c | 51 ++ .../fam-in-union-alone-in-struct-3.c | 36 + 3 files changed, 139 insertions(+) create mode 100644 gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c create mode 100644 gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c create mode 100644 gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-3.c diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c new file mode 100644 index ..7d4721aa95ac --- /dev/null +++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c @@ -0,0 +1,52 @@ +/* testing the correct usage of flexible array members in unions + and alone in structures. */ +/* { dg-do run} */ +/* { dg-options "-Wpedantic" } */ + +union with_fam_1 { + int a; + int b[]; /* { dg-warning "flexible array member in union is a GCC extension" } */ +}; + +union with_fam_2 { + char a; + int b[]; /* { dg-warning "flexible array member in union is a GCC extension" } */ +}; + +union with_fam_3 { + char a[]; /* { dg-warning "flexible array member in union is a GCC extension" } */ + /* { dg-warning "in an otherwise empty" "" { target c++ } .-1 } */ + int b[]; /* { dg-warning "flexible array member in union is a GCC extension" } */ +}; + +struct only_fam { + int b[]; + /* { dg-warning "in a struct with no named members" "" { target c } .-1 } */ + /* { dg-warning "in an otherwise empty" "" { target c++ } .-2 } */ + /* { dg-warning "forbids flexible array member" "" { target c++ } .-3 } */ +}; + +struct only_fam_2 { + unsigned int : 2; + unsigned int : 3; + int b[]; + /* { dg-warning "in a struct with no named members" "" { target c } .-1 } */ + /* { dg-warning "in an otherwise empty" "" { target c++ } .-2 } */ + /* { dg-warning "forbids flexible array member" "" { target c++ } .-3 } */ +}; + +int main () +{ + if (sizeof (union with_fam_1) != sizeof (int)) +__builtin_abort (); + if (sizeof (union with_fam_2) != __alignof__ (int)) +__builtin_abort (); + if (sizeof (union with_fam_3) != 0) +__builtin_abort (); + if (sizeof (struct only_fam) != 0) +__builtin_abort (); + if (sizeof (struct only_fam_2) != sizeof (int)) +__builtin_abort (); + return 0; +} + diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c new file mode 100644 index ..3743f9e7dac5 --- /dev/null +++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c @@ -0,0 +1,51 @@ +/* testing the correct usage of flexible array members in unions + and alone in structures: initialization */ +/* { dg-do run} */ +/* { dg-options "-O2" } */ + +union with_fam_1 { + int a; + int b[]; +} with_fam_1_v = {.b = {1, 2, 3, 4}}; + +union with_fam_2 { + int a; + char b[]; +} with_fam_2_v = {.a = 0x1f2f3f4f}; + +union with_fam_3 { + char a[]; + int b[]; +} with_fam_3_v = {.b = {0x1f2f3f4f, 0x5f6f7f7f}}; + +struct only_fam { + int b[]; +} only_fam_v = {{7, 11}}; + +struct only_fam_2 { + unsigned int : 2; + unsigned int : 3; + int b[]; +} only_fam_2_v = {{7, 11}}; + +int main () +{ + if (with_fam_1_v.b[3] != 4 + || with_fam_1_v.b[0] != 1) +__builtin_abort (); + if (with_fam_2_v.b[3] != 0x1f + || with_fam_2_v.b[0] != 0x4f) +__builtin_abort (); + if (with_fam_3_v.a[0] != 0x4f + || with_fam_3_v.a[7] != 0x5f) +__builtin_abort (); + if (only_fam_v.b[0] != 7 + || only_fam_v.b[1] != 11) +__builtin_abort (); + if (only_fam_2_v.b[0] != 7 + || only_fam_2_v.b[1] != 11) +__builtin_abort (); + + return 0; +} + diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-3.c b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-3.c new file mode 100644 index ..dd36fa01306d --- /dev/null +++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-3.c @@ -0,0 +1,36 @@ +/* testing the correct usage of flexible array members in unions + and alone in structures. */ +/* { dg-do compile } */ +/* { dg-options "-pedantic-errors" } */ + +union with_fam_1 { + int a; + int b[]; /* { dg-error "flexible array member in union is a GCC extension" } */ +}; + +union with_fam_2 { + char a; + int b[]; /* { dg-error "flexible array member in union is a GCC extension" } */ +}; + +union with_fam_3 { + char a[]; /* { dg-error "flexible array member in union is a GCC
Re: [PATCH v6 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.
On 2024-03-18 12:28, Qing Zhao wrote: This should probably bail out if object_size_type & OST_DYNAMIC == 0. Okay. Will add this. When add this into access_with_size_object_size, I found some old bugs in early_object_sizes_execute_one, and fixed them as well. Would you be able to isolate this fix and post them separately? I reckon they would be relevant for gcc 14 too. Thanks, Sid
Re: [PATCH v6 1/5] Provide counted_by attribute to flexible array member field (PR108896)
On 2024-02-16 14:47, Qing Zhao wrote: 'counted_by (COUNT)' The 'counted_by' attribute may be attached to the C99 flexible array member of a structure. It indicates that the number of the elements of the array is given by the field named "COUNT" in the same structure as the flexible array member. GCC uses this information to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size'. For instance, the following code: struct P { size_t count; char other; char array[] __attribute__ ((counted_by (count))); } *p; specifies that the 'array' is a flexible array member whose number of elements is given by the field 'count' in the same structure. The field that represents the number of the elements should have an integer type. Otherwise, the compiler will report a warning and ignore the attribute. When the field that represents the number of the elements is assigned a negative integer value, the compiler will treat the value as zero. An explicit 'counted_by' annotation defines a relationship between two objects, 'p->array' and 'p->count', and there are the following requirementthat on the relationship between this pair: * 'p->count' should be initialized before the first reference to 'p->array'; * 'p->array' has _at least_ 'p->count' number of elements available all the time. This relationship must hold even after any of these related objects are updated during the program. It's the user's responsibility to make sure the above requirements to be kept all the time. Otherwise the compiler will report warnings, at the same time, the results of the array bound sanitizer and the '__builtin_dynamic_object_size' is undefined. One important feature of the attribute is, a reference to the flexible array member field will use the latest value assigned to the field that represents the number of the elements before that reference. For example, p->count = val1; p->array[20] = 0; // ref1 to p->array p->count = val2; p->array[30] = 0; // ref2 to p->array in the above, 'ref1' will use 'val1' as the number of the elements in 'p->array', and 'ref2' will use 'val2' as the number of elements in 'p->array'. I can't approve of course, but here's a review of the code that should hopefully make it easier for the C frontend maintainers. gcc/c-family/ChangeLog: PR C/108896 * c-attribs.cc (handle_counted_by_attribute): New function. (attribute_takes_identifier_p): Add counted_by attribute to the list. * c-common.cc (c_flexible_array_member_type_p): ...To this. * c-common.h (c_flexible_array_member_type_p): New prototype. gcc/c/ChangeLog: PR C/108896 * c-decl.cc (flexible_array_member_type_p): Renamed and moved to... (add_flexible_array_elts_to_size): Use renamed function. (is_flexible_array_member_p): Use renamed function. (verify_counted_by_attribute): New function. (finish_struct): Use renamed function and verify counted_by attribute. * c-tree.h (lookup_field): New prototype. * c-typeck.cc (lookup_field): Expose as extern function. gcc/ChangeLog: PR C/108896 * doc/extend.texi: Document attribute counted_by. gcc/testsuite/ChangeLog: PR C/108896 * gcc.dg/flex-array-counted-by.c: New test. --- gcc/c-family/c-attribs.cc| 54 - gcc/c-family/c-common.cc | 13 +++ gcc/c-family/c-common.h | 1 + gcc/c/c-decl.cc | 85 gcc/c/c-tree.h | 1 + gcc/c/c-typeck.cc| 3 +- gcc/doc/extend.texi | 64 +++ gcc/testsuite/gcc.dg/flex-array-counted-by.c | 40 + 8 files changed, 241 insertions(+), 20 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc index 40a0cf90295d..4395c0656b14 100644 --- a/gcc/c-family/c-attribs.cc +++ b/gcc/c-family/c-attribs.cc @@ -105,6 +105,8 @@ static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree, int, bool *); static tree handle_strict_flex_array_attribute (tree *, tree, tree, int, bool *); +static tree handle_counted_by_attribute (tree *, tree, tree, + int, bool *); static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ; static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;
Re: [PATCH v6 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.
On 2024-02-16 14:47, Qing Zhao wrote: Including the following changes: * The definition of the new internal function .ACCESS_WITH_SIZE in internal-fn.def. * C FE converts every reference to a FAM with a "counted_by" attribute to a call to the internal function .ACCESS_WITH_SIZE. (build_component_ref in c_typeck.cc) This includes the case when the object is statically allocated and initialized. In order to make this working, the routines initializer_constant_valid_p_1 and output_constant in varasm.cc are updated to handle calls to .ACCESS_WITH_SIZE. (initializer_constant_valid_p_1 and output_constant in varasm.c) However, for the reference inside "offsetof", the "counted_by" attribute is ignored since it's not useful at all. (c_parser_postfix_expression in c/c-parser.cc) In addtion to "offsetof", for the reference inside operator "typeof" and "alignof", we ignore counted_by attribute too. When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE, replace the call with its first argument. * Convert every call to .ACCESS_WITH_SIZE to its first argument. (expand_ACCESS_WITH_SIZE in internal-fn.cc) * Adjust alias analysis to exclude the new internal from clobbering anything. (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in tree-ssa-alias.cc) * Adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE when it's LHS is eliminated as dead code. (eliminate_unnecessary_stmts in tree-ssa-dce.cc) * Provide the utility routines to check the call is .ACCESS_WITH_SIZE and get the reference from the call to .ACCESS_WITH_SIZE. (is_access_with_size_p and get_ref_from_access_with_size in tree.cc) gcc/c/ChangeLog: * c-parser.cc (c_parser_postfix_expression): Ignore the counted-by attribute when build_component_ref inside offsetof operator. * c-tree.h (build_component_ref): Add one more parameter. * c-typeck.cc (build_counted_by_ref): New function. (build_access_with_size_for_counted_by): New function. (build_component_ref): Check the counted-by attribute and build call to .ACCESS_WITH_SIZE. (build_unary_op): When building ADDR_EXPR for .ACCESS_WITH_SIZE, use its first argument. (lvalue_p): Accept call to .ACCESS_WITH_SIZE. gcc/ChangeLog: * internal-fn.cc (expand_ACCESS_WITH_SIZE): New function. * internal-fn.def (ACCESS_WITH_SIZE): New internal function. * tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Special case IFN_ACCESS_WITH_SIZE. (call_may_clobber_ref_p_1): Special case IFN_ACCESS_WITH_SIZE. * tree-ssa-dce.cc (eliminate_unnecessary_stmts): Eliminate the call to .ACCESS_WITH_SIZE when its LHS is dead. * tree.cc (process_call_operands): Adjust side effect for function .ACCESS_WITH_SIZE. (is_access_with_size_p): New function. (get_ref_from_access_with_size): New function. * tree.h (is_access_with_size_p): New prototype. (get_ref_from_access_with_size): New prototype. * varasm.cc (initializer_constant_valid_p_1): Handle call to .ACCESS_WITH_SIZE. (output_constant): Handle call to .ACCESS_WITH_SIZE. gcc/testsuite/ChangeLog: * gcc.dg/flex-array-counted-by-2.c: New test. --- gcc/c/c-parser.cc | 10 +- gcc/c/c-tree.h| 2 +- gcc/c/c-typeck.cc | 128 +- gcc/internal-fn.cc| 36 + gcc/internal-fn.def | 4 + .../gcc.dg/flex-array-counted-by-2.c | 112 +++ gcc/tree-ssa-alias.cc | 2 + gcc/tree-ssa-dce.cc | 5 +- gcc/tree.cc | 25 +++- gcc/tree.h| 8 ++ gcc/varasm.cc | 10 ++ 11 files changed, 332 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index c31349dae2ff..a6ed5ac43bb1 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -10850,9 +10850,12 @@ c_parser_postfix_expression (c_parser *parser) if (c_parser_next_token_is (parser, CPP_NAME)) { c_token *comp_tok = c_parser_peek_token (parser); + /* Ignore the counted_by attribute for reference inside + offsetof since the information is not useful at all. */ offsetof_ref = build_component_ref (loc, offsetof_ref, comp_tok->value, -comp_tok->location, UNKNOWN_LOCATION); +comp_tok->location, UNKNOWN_LOCATION, +false); c_parser_consume_token (parser);
Re: [PATCH v6 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.
On 2024-02-16 14:47, Qing Zhao wrote: gcc/ChangeLog: * tree-object-size.cc (access_with_size_object_size): New function. (call_object_size): Call the new function. gcc/testsuite/ChangeLog: * gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT. * gcc.dg/flex-array-counted-by-3.c: New test. * gcc.dg/flex-array-counted-by-4.c: New test. * gcc.dg/flex-array-counted-by-5.c: New test. --- .../gcc.dg/builtin-object-size-common.h | 11 ++ .../gcc.dg/flex-array-counted-by-3.c | 63 +++ .../gcc.dg/flex-array-counted-by-4.c | 178 ++ .../gcc.dg/flex-array-counted-by-5.c | 48 + gcc/tree-object-size.cc | 59 ++ 5 files changed, 359 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h b/gcc/testsuite/gcc.dg/builtin-object-size-common.h index 66ff7cdd953a..b677067c6e6b 100644 --- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h +++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h @@ -30,3 +30,14 @@ unsigned nfails = 0; __builtin_abort (); \ return 0; \ } while (0) + +#define EXPECT(p, _v) do { \ + size_t v = _v; \ + if (p == v)\ +__builtin_printf ("ok: %s == %zd\n", #p, p); \ + else \ +{\ + __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \ + FAIL (); \ +}\ +} while (0); diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c new file mode 100644 index ..0066c32ca808 --- /dev/null +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c @@ -0,0 +1,63 @@ +/* test the attribute counted_by and its usage in + * __builtin_dynamic_object_size. */ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include "builtin-object-size-common.h" + +struct flex { + int b; + int c[]; +} *array_flex; + +struct annotated { + int b; + int c[] __attribute__ ((counted_by (b))); +} *array_annotated; + +struct nested_annotated { + struct { +union { + int b; + float f; +}; +int n; + }; + int c[] __attribute__ ((counted_by (b))); +} *array_nested_annotated; + +void __attribute__((__noinline__)) setup (int normal_count, int attr_count) +{ + array_flex += (struct flex *)malloc (sizeof (struct flex) ++ normal_count * sizeof (int)); + array_flex->b = normal_count; + + array_annotated += (struct annotated *)malloc (sizeof (struct annotated) + + attr_count * sizeof (int)); + array_annotated->b = attr_count; + + array_nested_annotated += (struct nested_annotated *)malloc (sizeof (struct nested_annotated) ++ attr_count * sizeof (int)); + array_nested_annotated->b = attr_count; + + return; +} + +void __attribute__((__noinline__)) test () +{ +EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1); +EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1), + array_annotated->b * sizeof (int)); +EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1), + array_nested_annotated->b * sizeof (int)); +} + +int main(int argc, char *argv[]) +{ + setup (10,10); + test (); + DONE (); +} diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c new file mode 100644 index ..3ce7f3545549 --- /dev/null +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c @@ -0,0 +1,178 @@ +/* test the attribute counted_by and its usage in +__builtin_dynamic_object_size: what's the correct behavior when the +allocation size mismatched with the value of counted_by attribute? +we should always use the latest value that is hold by the counted_by +field. */ +/* { dg-do run } */ +/* { dg-options "-O -fstrict-flex-arrays=3" } */ + +#include "builtin-object-size-common.h" + +struct annotated { + size_t foo; + char others; + char array[] __attribute__((counted_by (foo))); +}; + +#define noinline __attribute__((__noinline__)) +#define SIZE_BUMP 10 +#define MAX(a, b) ((a) > (b) ? (a) : (b)) + +/* In general, Due to type casting, the
Re: [PATCH v6 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.
On 2024-02-16 14:47, Qing Zhao wrote: gcc/c-family/ChangeLog: * c-ubsan.cc (get_bound_from_access_with_size): New function. (ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE. gcc/testsuite/ChangeLog: * gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test. * gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test. * gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test. --- gcc/c-family/c-ubsan.cc | 42 + .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++ .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++ .../ubsan/flex-array-counted-by-bounds.c | 46 +++ 4 files changed, 167 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc index 940982819ddf..164b29845b3a 100644 --- a/gcc/c-family/c-ubsan.cc +++ b/gcc/c-family/c-ubsan.cc @@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc) return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, data)); } +/* Get the tree that represented the number of counted_by, i.e, the maximum + number of the elements of the object that the call to .ACCESS_WITH_SIZE + points to, this number will be the bound of the corresponding array. */ +static tree +get_bound_from_access_with_size (tree call) +{ + if (!is_access_with_size_p (call)) +return NULL_TREE; + + tree ref_to_size = CALL_EXPR_ARG (call, 1); + unsigned int type_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2)); Again for consistency, this should probably be class_of_size. + tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3)); + tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size), + build_int_cst (ptr_type_node, 0)); + /* If size is negative value, treat it as zero. */ + if (!TYPE_UNSIGNED (type)) + { +tree cond = fold_build2 (LT_EXPR, boolean_type_node, +unshare_expr (size), build_zero_cst (type)); +size = fold_build3 (COND_EXPR, type, cond, + build_zero_cst (type), size); + } + + /* Only when type_of_size is 1,i.e, the number of the elements of + the object type, return the size. */ + if (type_of_size != 1) +return NULL_TREE; + else +size = fold_convert (sizetype, size); + + return size; +} + + /* Instrument array bounds for ARRAY_REFs. We create special builtin, that gets expanded in the sanopt pass, and make an array dimension of it. ARRAY is the array, *INDEX is an index to the array. @@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index, && COMPLETE_TYPE_P (type) && integer_zerop (TYPE_SIZE (type))) bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1); + else if (INDIRECT_REF_P (array) + && is_access_with_size_p ((TREE_OPERAND (array, 0 + { + bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0))); + bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound), + bound, + build_int_cst (TREE_TYPE (bound), 1)); + } This will wrap if bound == 0, maybe that needs to be special-cased. And maybe also add a test for it below. else return NULL_TREE; } diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c new file mode 100644 index ..148934975ee5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c @@ -0,0 +1,45 @@ +/* test the attribute counted_by and its usage in + bounds sanitizer combined with VLA. */ +/* { dg-do run } */ +/* { dg-options "-fsanitize=bounds" } */ +/* { dg-output "index 11 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */ +/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */ + + +#include + +void __attribute__((__noinline__)) setup_and_test_vla (int n, int m) +{ + struct foo { + int n; + int p[][n] __attribute__((counted_by(n))); + } *f; + + f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n])); + f->n = m; + f->p[m][n-1]=1; + return; +} + +void __attribute__((__noinline__)) setup_and_test_vla_1 (int n1, int n2, int m) +{ + struct foo { +int n; +int p[][n2][n1] __attribute__((counted_by(n))); + } *f; + + f