Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-26 Thread Martin Uecker
Am Thu, 26 Feb 2015 10:05:14 +0100
Jakub Jelinek ja...@redhat.com:

 On Thu, Feb 26, 2015 at 12:59:08AM -0800, Martin Uecker wrote:
   No, it is not bogus nor unnecessary.
   This isn't about just real flexible arrays, but similar constructs,
   C++ doesn't have flexible array members, nor C89, so people use the
   GNU extension of struct S { ... ; char a[0]; } instead, or
  
  The GNU extension is still allowed, i.e. not instrumented with
  the patch.
  
   use char a[1]; as the last member and still expect to be able to access
   s-a[i] for i  0 say on heap allocations etc.
  
  And this is broken code. I would argue that a user who uses the
  ubsan *expects* this to be diagnosed. Atleast I was surprised
  that it didn't catch more out-of-bounds accesses.
 
 So can you explain what a C++ programmer can do portably? 

Using broken code is not really portable either, because other
compilers diagnose this. Also, we are not talking about breaking
that code - they can simply continue to use that code. They could
just not expect ubsan to not diagnose it. This would make
them aware of the fact that their code is problematic - which is
exactly what I would expect from ubsan.

 It has neither
 flexible array members, nor without GNU extensions zero sized arrays.
 If the array size is constant, perhaps turn the struct into a template,
 but if it is variable?  Ditto for C89 code.
 The amount of code that uses this idiom in the wild is huge.

Ok, I will add an option then. Or should this be language dependent?

Maritn


Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-26 Thread Martin Uecker
Marek Polacek pola...@redhat.com:

(Hi Marek and Jakub, I really appreciate your work on GCC and 
ubsan. I just want to add the minor enhancements necessary 
to make it really useful for me).

  And this is broken code. I would argue that a user who uses the
  ubsan *expects* this to be diagnosed. Atleast I was surprised
  that it didn't catch more out-of-bounds accesses.
  
 I wouldn't say broken, it's being used pretty often.  E.g.
 https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
 In ISO C90, you would have to give contents a length of 1

Ok, let's add an option then. But that it is common doesn't
mean it is not broken (especially since there is a legal
way to that with GCC and C99) and I think there should be a 
way to diagnose it with ubsan.

  If this is indeed intentional, we should:
  
  - document these exceptions
 
 See the link above + docs say for -fsanitize=bounds that Flexible array
 members are not instrumented.

Flexible array members should not be intrumented. But
what is suprising to the user and should be documented
is that other arrays (e.g. which hjust happen to be the
last in a struct) are not instrumented.

  - remove the misleading comment
 
 Which one?
 The /* Detect flexible array members and suchlike.  */ IMHO says
 exactly what's meant to be done.

The reason it confused me was because flexible array members never
see this code because the are catched by the 'if' condition
directly before. 

  - have test cases also for sizes  0
 
 Yes, we should have that.
 
  - fix it not to prevent instrumentation of all array accesses
through pointers (i.e. when the array is not a member of a struct)
 
 Yes, see the patch in my previous mail.

Thank you! I will test this and report back

  - have a configuration option (e.g. -fsanitize=strict-bounds)
 
 Yeah, something like that would be useful to have.  As Jakub
 mentioned, we discussed this in the past.  Probably GCC 6 stuff
 though.

I will look into this.

Martin



Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-26 Thread Marek Polacek
On Thu, Feb 26, 2015 at 11:08:04AM +0100, Richard Biener wrote:
  --- gcc/c-family/c-ubsan.c
  +++ gcc/c-family/c-ubsan.c
  @@ -303,8 +303,9 @@ ubsan_instrument_bounds (location_t loc, tree array, 
  tree *index,
 
 /* Detect flexible array members and suchlike.  */
 tree base = get_base_address (array);
  -  if (base  (TREE_CODE (base) == INDIRECT_REF
  -  || TREE_CODE (base) == MEM_REF))
  +  if (TREE_CODE (array) == COMPONENT_REF
 
 Err - this doesn't detect
 
 int
 main (void)
 {
   int *t = (int *) __builtin_malloc (sizeof (int) * 10);
   int (*a)[1] = (int (*)[1])t;
   (*a)[2] = 1;
 }
 
 that is a trailing array VLA.
 
 What I've definitely seen is
 
 int
 main (void)
 {
   int *t = (int *) __builtin_malloc (sizeof (int) * 9);
   int (*a)[3][3] = (int (*)[3][3])t;
   (*a)[0][9] = 1;
 }
 
I think we should error on those.
With my patch we'd emit the same -fsanitize=bounds runtime errors as clang
does.

 that is, assume that the array dimension with the fast running
 index wraps over into the next (hello SPEC CPU 2006!).
 
I think they're invoking UB then.

  +   base  (TREE_CODE (base) == INDIRECT_REF
  + || TREE_CODE (base) == MEM_REF))
   {
 tree next = NULL_TREE;
 tree cref = array;
 
  I think it is a bug that we're playing games on something that is not
  an element of a structure.
 
 Not sure about this.

The comment says that we're trying to detect a flexible array member
there - and those can't be outside struct.  I certainly hadn't anything
else in my mind when I was writing that ;).

Marek


Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-26 Thread Marek Polacek
On Thu, Feb 26, 2015 at 12:59:08AM -0800, Martin Uecker wrote:
 
 Thu, 26 Feb 2015 07:36:54 +0100
 Jakub Jelinek ja...@redhat.com:
  On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote:
   this patch removes a bogus check for flexible array members
   which prevents array references to be instrumented in some
   interesting cases. Arrays accessed through pointers are now
   instrumented correctly.
   
   The check was unnecessary because flexible arrays are not 
   instrumented anyway because of
   TYPE_MAX_VALUE (domain) == NULL_TREE. 
  
  No, it is not bogus nor unnecessary.
  This isn't about just real flexible arrays, but similar constructs,
  C++ doesn't have flexible array members, nor C89, so people use the
  GNU extension of struct S { ... ; char a[0]; } instead, or
 
 The GNU extension is still allowed, i.e. not instrumented with
 the patch.
 
  use char a[1]; as the last member and still expect to be able to access
  s-a[i] for i  0 say on heap allocations etc.
 
 And this is broken code. I would argue that a user who uses the
 ubsan *expects* this to be diagnosed. Atleast I was surprised
 that it didn't catch more out-of-bounds accesses.
 
I wouldn't say broken, it's being used pretty often.  E.g.
https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
In ISO C90, you would have to give contents a length of 1

 If this is indeed intentional, we should:
 
 - document these exceptions

See the link above + docs say for -fsanitize=bounds that Flexible array
members are not instrumented.

 - remove the misleading comment

Which one?
The /* Detect flexible array members and suchlike.  */ IMHO says
exactly what's meant to be done.

 - have test cases also for sizes  0

Yes, we should have that.

 - fix it not to prevent instrumentation of all array accesses
   through pointers (i.e. when the array is not a member of a struct)

Yes, see the patch in my previous mail.

 - have a configuration option (e.g. -fsanitize=strict-bounds)

Yeah, something like that would be useful to have.  As Jakub
mentioned, we discussed this in the past.  Probably GCC 6 stuff
though.

Marek


Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-26 Thread Richard Biener
On Thu, Feb 26, 2015 at 8:32 AM, Marek Polacek pola...@redhat.com wrote:
 On Thu, Feb 26, 2015 at 07:36:54AM +0100, Jakub Jelinek wrote:
 On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote:
  this patch removes a bogus check for flexible array members
  which prevents array references to be instrumented in some
  interesting cases. Arrays accessed through pointers are now
  instrumented correctly.
 
  The check was unnecessary because flexible arrays are not
  instrumented anyway because of
  TYPE_MAX_VALUE (domain) == NULL_TREE.

 No, it is not bogus nor unnecessary.
 This isn't about just real flexible arrays, but similar constructs,
 C++ doesn't have flexible array members, nor C89, so people use the
 GNU extension of struct S { ... ; char a[0]; } instead, or
 use char a[1]; as the last member and still expect to be able to access
 s-a[i] for i  0 say on heap allocations etc.

 Yes, in other words, your patch would make a difference here:

 int
 main (void)
 {
   struct S { int i; char a[1]; };
   struct S *t = (struct S *) __builtin_malloc (sizeof (struct S) + 10);
   t-a[2] = 1;
 }

 (bounds-1.c test case should test this, too.)

Btw, I've seen struct S { int i; char a[4]; }; as well.

 I think we were talking at some point about having a strict-bounds or
 something similar that would accept only real flexible array members and
 nothing else and be more pedantic, at the disadvantage of diagnosing tons
 of real-world code that is supported by GCC.

 Perhaps if the reference is just an array, not a struct containing
 flexible-array-member-like array, we could consider it strict always,
 but that is certainly not what your patch does.

 Martin, can you try whether this (untested) does what you're after?

 --- gcc/c-family/c-ubsan.c
 +++ gcc/c-family/c-ubsan.c
 @@ -303,8 +303,9 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
 *index,

/* Detect flexible array members and suchlike.  */
tree base = get_base_address (array);
 -  if (base  (TREE_CODE (base) == INDIRECT_REF
 -  || TREE_CODE (base) == MEM_REF))
 +  if (TREE_CODE (array) == COMPONENT_REF

Err - this doesn't detect

int
main (void)
{
  int *t = (int *) __builtin_malloc (sizeof (int) * 10);
  int (*a)[1] = (int (*)[1])t;
  (*a)[2] = 1;
}

that is a trailing array VLA.

What I've definitely seen is

int
main (void)
{
  int *t = (int *) __builtin_malloc (sizeof (int) * 9);
  int (*a)[3][3] = (int (*)[3][3])t;
  (*a)[0][9] = 1;
}

that is, assume that the array dimension with the fast running
index wraps over into the next (hello SPEC CPU 2006!).

 +   base  (TREE_CODE (base) == INDIRECT_REF
 + || TREE_CODE (base) == MEM_REF))
  {
tree next = NULL_TREE;
tree cref = array;

 I think it is a bug that we're playing games on something that is not
 an element of a structure.

Not sure about this.

Richard.

 Marek


Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-26 Thread Martin Uecker

Thu, 26 Feb 2015 07:36:54 +0100
Jakub Jelinek ja...@redhat.com:
 On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote:
  this patch removes a bogus check for flexible array members
  which prevents array references to be instrumented in some
  interesting cases. Arrays accessed through pointers are now
  instrumented correctly.
  
  The check was unnecessary because flexible arrays are not 
  instrumented anyway because of
  TYPE_MAX_VALUE (domain) == NULL_TREE. 
 
 No, it is not bogus nor unnecessary.
 This isn't about just real flexible arrays, but similar constructs,
 C++ doesn't have flexible array members, nor C89, so people use the
 GNU extension of struct S { ... ; char a[0]; } instead, or

The GNU extension is still allowed, i.e. not instrumented with
the patch.

 use char a[1]; as the last member and still expect to be able to access
 s-a[i] for i  0 say on heap allocations etc.

And this is broken code. I would argue that a user who uses the
ubsan *expects* this to be diagnosed. Atleast I was surprised
that it didn't catch more out-of-bounds accesses.

If this is indeed intentional, we should:

- document these exceptions
- remove the misleading comment
- have test cases also for sizes  0
- fix it not to prevent instrumentation of all array accesses
  through pointers (i.e. when the array is not a member of a struct)
- have a configuration option (e.g. -fsanitize=strict-bounds)


 I think we were talking at some point about having a strict-bounds or
 something similar that would accept only real flexible array members and
 nothing else and be more pedantic, at the disadvantage of diagnosing tons
 of real-world code that is supported by GCC.

 Perhaps if the reference is just an array, not a struct containing
 flexible-array-member-like array, we could consider it strict always,
 but that is certainly not what your patch does.

Indeed, currently the patch tries to make -fsanitize=bounds detect what
the  standard considers undefined behaviour. At the moment, I do not
quite see why ubsan should be so permissive as you say.

But we can also add a new option -fsanitize=strict-bounds. Then I could
use this for my (real-world) projects.



  * c-family/c-ubsan.c (ubsan_instrument_bounds): Remove
 
 c-family/ shouldn't be in the ChangeLog entry, instead that part should go
 into c-family/ChangeLog.

  --- a/gcc/c-family/c-ubsan.c
  +++ b/gcc/c-family/c-ubsan.c
  @@ -293,6 +293,7 @@ ubsan_instrument_bounds (location_t loc, tree array, 
  tree *index,
 tree type = TREE_TYPE (array);
 tree domain = TYPE_DOMAIN (type);
   
  +  /* This also takes care of flexilbe array members */
 
 Typo.
 
  +#ifdef __cplusplus
  +extern C void* malloc(unsigned long x);
  +#else
  +extern void* malloc(unsigned long x);
  +#endif
 
 Why are you declaring malloc (wrongly), when it isn't used?
 Malloc argument is size_t aka __SIZE_TYPE__, not unsigned long.

Left over from removing tests already done elsewhere.

Thank you for your comments,
Martin




Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-26 Thread Jakub Jelinek
On Thu, Feb 26, 2015 at 12:59:08AM -0800, Martin Uecker wrote:
  No, it is not bogus nor unnecessary.
  This isn't about just real flexible arrays, but similar constructs,
  C++ doesn't have flexible array members, nor C89, so people use the
  GNU extension of struct S { ... ; char a[0]; } instead, or
 
 The GNU extension is still allowed, i.e. not instrumented with
 the patch.
 
  use char a[1]; as the last member and still expect to be able to access
  s-a[i] for i  0 say on heap allocations etc.
 
 And this is broken code. I would argue that a user who uses the
 ubsan *expects* this to be diagnosed. Atleast I was surprised
 that it didn't catch more out-of-bounds accesses.

So can you explain what a C++ programmer can do portably?  It has neither
flexible array members, nor without GNU extensions zero sized arrays.
If the array size is constant, perhaps turn the struct into a template,
but if it is variable?  Ditto for C89 code.
The amount of code that uses this idiom in the wild is huge.

Jakub


Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-25 Thread Jakub Jelinek
On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote:
 this patch removes a bogus check for flexible array members
 which prevents array references to be instrumented in some
 interesting cases. Arrays accessed through pointers are now
 instrumented correctly.
 
 The check was unnecessary because flexible arrays are not 
 instrumented anyway because of
 TYPE_MAX_VALUE (domain) == NULL_TREE. 

No, it is not bogus nor unnecessary.
This isn't about just real flexible arrays, but similar constructs,
C++ doesn't have flexible array members, nor C89, so people use the
GNU extension of struct S { ... ; char a[0]; } instead, or
use char a[1]; as the last member and still expect to be able to access
s-a[i] for i  0 say on heap allocations etc.

I think we were talking at some point about having a strict-bounds or
something similar that would accept only real flexible array members and
nothing else and be more pedantic, at the disadvantage of diagnosing tons
of real-world code that is supported by GCC.

Perhaps if the reference is just an array, not a struct containing
flexible-array-member-like array, we could consider it strict always,
but that is certainly not what your patch does.

 * c-family/c-ubsan.c (ubsan_instrument_bounds): Remove

c-family/ shouldn't be in the ChangeLog entry, instead that part should go
into c-family/ChangeLog.
 --- a/gcc/c-family/c-ubsan.c
 +++ b/gcc/c-family/c-ubsan.c
 @@ -293,6 +293,7 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
 *index,
tree type = TREE_TYPE (array);
tree domain = TYPE_DOMAIN (type);
  
 +  /* This also takes care of flexilbe array members */

Typo.

 +#ifdef __cplusplus
 +extern C void* malloc(unsigned long x);
 +#else
 +extern void* malloc(unsigned long x);
 +#endif

Why are you declaring malloc (wrongly), when it isn't used?
Malloc argument is size_t aka __SIZE_TYPE__, not unsigned long.

Jakub


Re: [PATCH] ubsan: remove bogus check for flexible array members

2015-02-25 Thread Marek Polacek
On Thu, Feb 26, 2015 at 07:36:54AM +0100, Jakub Jelinek wrote:
 On Wed, Feb 25, 2015 at 10:01:07PM -0800, Martin Uecker wrote:
  this patch removes a bogus check for flexible array members
  which prevents array references to be instrumented in some
  interesting cases. Arrays accessed through pointers are now
  instrumented correctly.
  
  The check was unnecessary because flexible arrays are not 
  instrumented anyway because of
  TYPE_MAX_VALUE (domain) == NULL_TREE. 
 
 No, it is not bogus nor unnecessary.
 This isn't about just real flexible arrays, but similar constructs,
 C++ doesn't have flexible array members, nor C89, so people use the
 GNU extension of struct S { ... ; char a[0]; } instead, or
 use char a[1]; as the last member and still expect to be able to access
 s-a[i] for i  0 say on heap allocations etc.
 
Yes, in other words, your patch would make a difference here:

int
main (void)
{
  struct S { int i; char a[1]; };
  struct S *t = (struct S *) __builtin_malloc (sizeof (struct S) + 10);
  t-a[2] = 1;
}

(bounds-1.c test case should test this, too.)

 I think we were talking at some point about having a strict-bounds or
 something similar that would accept only real flexible array members and
 nothing else and be more pedantic, at the disadvantage of diagnosing tons
 of real-world code that is supported by GCC.
 
 Perhaps if the reference is just an array, not a struct containing
 flexible-array-member-like array, we could consider it strict always,
 but that is certainly not what your patch does.

Martin, can you try whether this (untested) does what you're after?

--- gcc/c-family/c-ubsan.c
+++ gcc/c-family/c-ubsan.c
@@ -303,8 +303,9 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
 
   /* Detect flexible array members and suchlike.  */
   tree base = get_base_address (array);
-  if (base  (TREE_CODE (base) == INDIRECT_REF
-  || TREE_CODE (base) == MEM_REF))
+  if (TREE_CODE (array) == COMPONENT_REF
+   base  (TREE_CODE (base) == INDIRECT_REF
+ || TREE_CODE (base) == MEM_REF))
 {
   tree next = NULL_TREE;
   tree cref = array;

I think it is a bug that we're playing games on something that is not
an element of a structure.

Marek