Re: [PATCH] ubsan: remove bogus check for flexible array members
On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote: > this patch removes a bogus check for flexible array members > which prevents array references to be instrumented in some > interesting cases. Arrays accessed through pointers are now > instrumented correctly. > > The check was unnecessary because flexible arrays are not > instrumented anyway because of > TYPE_MAX_VALUE (domain) == NULL_TREE. No, it is not bogus nor unnecessary. This isn't about just real flexible arrays, but similar constructs, C++ doesn't have flexible array members, nor C89, so people use the GNU extension of struct S { ... ; char a[0]; } instead, or use char a[1]; as the last member and still expect to be able to access s->a[i] for i > 0 say on heap allocations etc. I think we were talking at some point about having a strict-bounds or something similar that would accept only real flexible array members and nothing else and be more pedantic, at the disadvantage of diagnosing tons of real-world code that is supported by GCC. Perhaps if the reference is just an array, not a struct containing flexible-array-member-like array, we could consider it strict always, but that is certainly not what your patch does. > * c-family/c-ubsan.c (ubsan_instrument_bounds): Remove c-family/ shouldn't be in the ChangeLog entry, instead that part should go into c-family/ChangeLog. > --- a/gcc/c-family/c-ubsan.c > +++ b/gcc/c-family/c-ubsan.c > @@ -293,6 +293,7 @@ ubsan_instrument_bounds (location_t loc, tree array, tree > *index, >tree type = TREE_TYPE (array); >tree domain = TYPE_DOMAIN (type); > > + /* This also takes care of flexilbe array members */ Typo. > +#ifdef __cplusplus > +extern "C" void* malloc(unsigned long x); > +#else > +extern void* malloc(unsigned long x); > +#endif Why are you declaring malloc (wrongly), when it isn't used? Malloc argument is size_t aka __SIZE_TYPE__, not unsigned long. Jakub
Re: [PATCH] ubsan: remove bogus check for flexible array members
On Thu, Feb 26, 2015 at 07:36:54AM +0100, Jakub Jelinek wrote: > On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote: > > this patch removes a bogus check for flexible array members > > which prevents array references to be instrumented in some > > interesting cases. Arrays accessed through pointers are now > > instrumented correctly. > > > > The check was unnecessary because flexible arrays are not > > instrumented anyway because of > > TYPE_MAX_VALUE (domain) == NULL_TREE. > > No, it is not bogus nor unnecessary. > This isn't about just real flexible arrays, but similar constructs, > C++ doesn't have flexible array members, nor C89, so people use the > GNU extension of struct S { ... ; char a[0]; } instead, or > use char a[1]; as the last member and still expect to be able to access > s->a[i] for i > 0 say on heap allocations etc. Yes, in other words, your patch would make a difference here: int main (void) { struct S { int i; char a[1]; }; struct S *t = (struct S *) __builtin_malloc (sizeof (struct S) + 10); t->a[2] = 1; } (bounds-1.c test case should test this, too.) > I think we were talking at some point about having a strict-bounds or > something similar that would accept only real flexible array members and > nothing else and be more pedantic, at the disadvantage of diagnosing tons > of real-world code that is supported by GCC. > > Perhaps if the reference is just an array, not a struct containing > flexible-array-member-like array, we could consider it strict always, > but that is certainly not what your patch does. Martin, can you try whether this (untested) does what you're after? --- gcc/c-family/c-ubsan.c +++ gcc/c-family/c-ubsan.c @@ -303,8 +303,9 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index, /* Detect flexible array members and suchlike. */ tree base = get_base_address (array); - if (base && (TREE_CODE (base) == INDIRECT_REF - || TREE_CODE (base) == MEM_REF)) + if (TREE_CODE (array) == COMPONENT_REF + && base && (TREE_CODE (base) == INDIRECT_REF + || TREE_CODE (base) == MEM_REF)) { tree next = NULL_TREE; tree cref = array; I think it is a bug that we're playing games on something that is not an element of a structure. Marek
Re: [PATCH] ubsan: remove bogus check for flexible array members
Thu, 26 Feb 2015 07:36:54 +0100 Jakub Jelinek : > On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote: > > this patch removes a bogus check for flexible array members > > which prevents array references to be instrumented in some > > interesting cases. Arrays accessed through pointers are now > > instrumented correctly. > > > > The check was unnecessary because flexible arrays are not > > instrumented anyway because of > > TYPE_MAX_VALUE (domain) == NULL_TREE. > > No, it is not bogus nor unnecessary. > This isn't about just real flexible arrays, but similar constructs, > C++ doesn't have flexible array members, nor C89, so people use the > GNU extension of struct S { ... ; char a[0]; } instead, or The GNU extension is still allowed, i.e. not instrumented with the patch. > use char a[1]; as the last member and still expect to be able to access > s->a[i] for i > 0 say on heap allocations etc. And this is broken code. I would argue that a user who uses the ubsan *expects* this to be diagnosed. Atleast I was surprised that it didn't catch more out-of-bounds accesses. If this is indeed intentional, we should: - document these exceptions - remove the misleading comment - have test cases also for sizes > 0 - fix it not to prevent instrumentation of all array accesses through pointers (i.e. when the array is not a member of a struct) - have a configuration option (e.g. -fsanitize=strict-bounds) > I think we were talking at some point about having a strict-bounds or > something similar that would accept only real flexible array members and > nothing else and be more pedantic, at the disadvantage of diagnosing tons > of real-world code that is supported by GCC. > > Perhaps if the reference is just an array, not a struct containing > flexible-array-member-like array, we could consider it strict always, > but that is certainly not what your patch does. Indeed, currently the patch tries to make -fsanitize=bounds detect what the standard considers undefined behaviour. At the moment, I do not quite see why ubsan should be so permissive as you say. But we can also add a new option -fsanitize=strict-bounds. Then I could use this for my (real-world) projects. > > * c-family/c-ubsan.c (ubsan_instrument_bounds): Remove > > c-family/ shouldn't be in the ChangeLog entry, instead that part should go > into c-family/ChangeLog. > > --- a/gcc/c-family/c-ubsan.c > > +++ b/gcc/c-family/c-ubsan.c > > @@ -293,6 +293,7 @@ ubsan_instrument_bounds (location_t loc, tree array, > > tree *index, > >tree type = TREE_TYPE (array); > >tree domain = TYPE_DOMAIN (type); > > > > + /* This also takes care of flexilbe array members */ > > Typo. > > > +#ifdef __cplusplus > > +extern "C" void* malloc(unsigned long x); > > +#else > > +extern void* malloc(unsigned long x); > > +#endif > > Why are you declaring malloc (wrongly), when it isn't used? > Malloc argument is size_t aka __SIZE_TYPE__, not unsigned long. Left over from removing tests already done elsewhere. Thank you for your comments, Martin
Re: [PATCH] ubsan: remove bogus check for flexible array members
On Thu, Feb 26, 2015 at 12:59:08AM -0800, Martin Uecker wrote: > > No, it is not bogus nor unnecessary. > > This isn't about just real flexible arrays, but similar constructs, > > C++ doesn't have flexible array members, nor C89, so people use the > > GNU extension of struct S { ... ; char a[0]; } instead, or > > The GNU extension is still allowed, i.e. not instrumented with > the patch. > > > use char a[1]; as the last member and still expect to be able to access > > s->a[i] for i > 0 say on heap allocations etc. > > And this is broken code. I would argue that a user who uses the > ubsan *expects* this to be diagnosed. Atleast I was surprised > that it didn't catch more out-of-bounds accesses. So can you explain what a C++ programmer can do portably? It has neither flexible array members, nor without GNU extensions zero sized arrays. If the array size is constant, perhaps turn the struct into a template, but if it is variable? Ditto for C89 code. The amount of code that uses this idiom in the wild is huge. Jakub
Re: [PATCH] ubsan: remove bogus check for flexible array members
On Thu, Feb 26, 2015 at 12:59:08AM -0800, Martin Uecker wrote: > > Thu, 26 Feb 2015 07:36:54 +0100 > Jakub Jelinek : > > On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote: > > > this patch removes a bogus check for flexible array members > > > which prevents array references to be instrumented in some > > > interesting cases. Arrays accessed through pointers are now > > > instrumented correctly. > > > > > > The check was unnecessary because flexible arrays are not > > > instrumented anyway because of > > > TYPE_MAX_VALUE (domain) == NULL_TREE. > > > > No, it is not bogus nor unnecessary. > > This isn't about just real flexible arrays, but similar constructs, > > C++ doesn't have flexible array members, nor C89, so people use the > > GNU extension of struct S { ... ; char a[0]; } instead, or > > The GNU extension is still allowed, i.e. not instrumented with > the patch. > > > use char a[1]; as the last member and still expect to be able to access > > s->a[i] for i > 0 say on heap allocations etc. > > And this is broken code. I would argue that a user who uses the > ubsan *expects* this to be diagnosed. Atleast I was surprised > that it didn't catch more out-of-bounds accesses. I wouldn't say broken, it's being used pretty often. E.g. https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html "In ISO C90, you would have to give contents a length of 1" > If this is indeed intentional, we should: > > - document these exceptions See the link above + docs say for -fsanitize=bounds that "Flexible array members are not instrumented". > - remove the misleading comment Which one? The /* Detect flexible array members and suchlike. */ IMHO says exactly what's meant to be done. > - have test cases also for sizes > 0 Yes, we should have that. > - fix it not to prevent instrumentation of all array accesses > through pointers (i.e. when the array is not a member of a struct) Yes, see the patch in my previous mail. > - have a configuration option (e.g. -fsanitize=strict-bounds) Yeah, something like that would be useful to have. As Jakub mentioned, we discussed this in the past. Probably GCC 6 stuff though. Marek
Re: [PATCH] ubsan: remove bogus check for flexible array members
On Thu, Feb 26, 2015 at 8:32 AM, Marek Polacek wrote: > On Thu, Feb 26, 2015 at 07:36:54AM +0100, Jakub Jelinek wrote: >> On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote: >> > this patch removes a bogus check for flexible array members >> > which prevents array references to be instrumented in some >> > interesting cases. Arrays accessed through pointers are now >> > instrumented correctly. >> > >> > The check was unnecessary because flexible arrays are not >> > instrumented anyway because of >> > TYPE_MAX_VALUE (domain) == NULL_TREE. >> >> No, it is not bogus nor unnecessary. >> This isn't about just real flexible arrays, but similar constructs, >> C++ doesn't have flexible array members, nor C89, so people use the >> GNU extension of struct S { ... ; char a[0]; } instead, or >> use char a[1]; as the last member and still expect to be able to access >> s->a[i] for i > 0 say on heap allocations etc. > > Yes, in other words, your patch would make a difference here: > > int > main (void) > { > struct S { int i; char a[1]; }; > struct S *t = (struct S *) __builtin_malloc (sizeof (struct S) + 10); > t->a[2] = 1; > } > > (bounds-1.c test case should test this, too.) Btw, I've seen struct S { int i; char a[4]; }; as well. >> I think we were talking at some point about having a strict-bounds or >> something similar that would accept only real flexible array members and >> nothing else and be more pedantic, at the disadvantage of diagnosing tons >> of real-world code that is supported by GCC. >> >> Perhaps if the reference is just an array, not a struct containing >> flexible-array-member-like array, we could consider it strict always, >> but that is certainly not what your patch does. > > Martin, can you try whether this (untested) does what you're after? > > --- gcc/c-family/c-ubsan.c > +++ gcc/c-family/c-ubsan.c > @@ -303,8 +303,9 @@ ubsan_instrument_bounds (location_t loc, tree array, tree > *index, > >/* Detect flexible array members and suchlike. */ >tree base = get_base_address (array); > - if (base && (TREE_CODE (base) == INDIRECT_REF > - || TREE_CODE (base) == MEM_REF)) > + if (TREE_CODE (array) == COMPONENT_REF Err - this doesn't detect int main (void) { int *t = (int *) __builtin_malloc (sizeof (int) * 10); int (*a)[1] = (int (*)[1])t; (*a)[2] = 1; } that is a trailing array VLA. What I've definitely seen is int main (void) { int *t = (int *) __builtin_malloc (sizeof (int) * 9); int (*a)[3][3] = (int (*)[3][3])t; (*a)[0][9] = 1; } that is, assume that the array dimension with the fast running index "wraps" over into the next (hello SPEC CPU 2006!). > + && base && (TREE_CODE (base) == INDIRECT_REF > + || TREE_CODE (base) == MEM_REF)) > { >tree next = NULL_TREE; >tree cref = array; > > I think it is a bug that we're playing games on something that is not > an element of a structure. Not sure about this. Richard. > Marek
Re: [PATCH] ubsan: remove bogus check for flexible array members
On Thu, Feb 26, 2015 at 11:08:04AM +0100, Richard Biener wrote: > > --- gcc/c-family/c-ubsan.c > > +++ gcc/c-family/c-ubsan.c > > @@ -303,8 +303,9 @@ ubsan_instrument_bounds (location_t loc, tree array, > > tree *index, > > > >/* Detect flexible array members and suchlike. */ > >tree base = get_base_address (array); > > - if (base && (TREE_CODE (base) == INDIRECT_REF > > - || TREE_CODE (base) == MEM_REF)) > > + if (TREE_CODE (array) == COMPONENT_REF > > Err - this doesn't detect > > int > main (void) > { > int *t = (int *) __builtin_malloc (sizeof (int) * 10); > int (*a)[1] = (int (*)[1])t; > (*a)[2] = 1; > } > > that is a trailing array VLA. > > What I've definitely seen is > > int > main (void) > { > int *t = (int *) __builtin_malloc (sizeof (int) * 9); > int (*a)[3][3] = (int (*)[3][3])t; > (*a)[0][9] = 1; > } I think we should error on those. With my patch we'd emit the same -fsanitize=bounds runtime errors as clang does. > that is, assume that the array dimension with the fast running > index "wraps" over into the next (hello SPEC CPU 2006!). I think they're invoking UB then. > > + && base && (TREE_CODE (base) == INDIRECT_REF > > + || TREE_CODE (base) == MEM_REF)) > > { > >tree next = NULL_TREE; > >tree cref = array; > > > > I think it is a bug that we're playing games on something that is not > > an element of a structure. > > Not sure about this. The comment says that we're trying to detect a flexible array member there - and those can't be outside struct. I certainly hadn't anything else in my mind when I was writing that ;). Marek
Re: [PATCH] ubsan: remove bogus check for flexible array members
Marek Polacek : (Hi Marek and Jakub, I really appreciate your work on GCC and ubsan. I just want to add the minor enhancements necessary to make it really useful for me). > > And this is broken code. I would argue that a user who uses the > > ubsan *expects* this to be diagnosed. Atleast I was surprised > > that it didn't catch more out-of-bounds accesses. > > I wouldn't say broken, it's being used pretty often. E.g. > https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html > "In ISO C90, you would have to give contents a length of 1" Ok, let's add an option then. But that it is common doesn't mean it is not broken (especially since there is a legal way to that with GCC and C99) and I think there should be a way to diagnose it with ubsan. > > If this is indeed intentional, we should: > > > > - document these exceptions > > See the link above + docs say for -fsanitize=bounds that "Flexible array > members are not instrumented". Flexible array members should not be intrumented. But what is suprising to the user and should be documented is that other arrays (e.g. which hjust happen to be the last in a struct) are not instrumented. > > - remove the misleading comment > > Which one? > The /* Detect flexible array members and suchlike. */ IMHO says > exactly what's meant to be done. The reason it confused me was because flexible array members never see this code because the are catched by the 'if' condition directly before. > > - have test cases also for sizes > 0 > > Yes, we should have that. > > > - fix it not to prevent instrumentation of all array accesses > > through pointers (i.e. when the array is not a member of a struct) > > Yes, see the patch in my previous mail. Thank you! I will test this and report back > > - have a configuration option (e.g. -fsanitize=strict-bounds) > > Yeah, something like that would be useful to have. As Jakub > mentioned, we discussed this in the past. Probably GCC 6 stuff > though. I will look into this. Martin
Re: [PATCH] ubsan: remove bogus check for flexible array members
Am Thu, 26 Feb 2015 10:05:14 +0100 Jakub Jelinek : > On Thu, Feb 26, 2015 at 12:59:08AM -0800, Martin Uecker wrote: > > > No, it is not bogus nor unnecessary. > > > This isn't about just real flexible arrays, but similar constructs, > > > C++ doesn't have flexible array members, nor C89, so people use the > > > GNU extension of struct S { ... ; char a[0]; } instead, or > > > > The GNU extension is still allowed, i.e. not instrumented with > > the patch. > > > > > use char a[1]; as the last member and still expect to be able to access > > > s->a[i] for i > 0 say on heap allocations etc. > > > > And this is broken code. I would argue that a user who uses the > > ubsan *expects* this to be diagnosed. Atleast I was surprised > > that it didn't catch more out-of-bounds accesses. > > So can you explain what a C++ programmer can do portably? Using broken code is not really portable either, because other compilers diagnose this. Also, we are not talking about breaking that code - they can simply continue to use that code. They could just not expect ubsan to not diagnose it. This would make them aware of the fact that their code is problematic - which is exactly what I would expect from ubsan. > It has neither > flexible array members, nor without GNU extensions zero sized arrays. > If the array size is constant, perhaps turn the struct into a template, > but if it is variable? Ditto for C89 code. > The amount of code that uses this idiom in the wild is huge. Ok, I will add an option then. Or should this be language dependent? Maritn