Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-12-04 Thread Jeff Law via Gcc-patches



On 11/24/20 11:39 AM, Martin Sebor wrote:
> On 11/24/20 10:44 AM, Andrew MacLeod wrote:
>> On 11/24/20 12:42 PM, Andrew MacLeod wrote:
>>> On 11/23/20 4:38 PM, Martin Sebor wrote:
 On 11/21/20 6:26 AM, Andrew MacLeod wrote:
> On 11/21/20 12:07 AM, Jeff Law wrote:
>>
>> On 11/9/20 9:00 AM, Martin Sebor wrote:
>>> Ping:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html
>>>
>>>
>>> Jeff, I don't expect to have the cycles to reimplement this patch
>>> using the Ranger APIs before stage 1 closes.  I'm open to giving
>>> it a try in stage 3 if it's still in scope for GCC 11. Otherwise,
>>> is this patch okay to commit?
>> So all we're going to get from the ranger is ranges of operands,
>> right?
>> Meaning that we still need to either roll our own evaluator
>> (eval_size_vflow) or overload range_for_stmt with our own, which
>> likely
>> looks like eval_size_vflow anyway, right?
>>
>> My hope was to avoid the roll our own evaluator, but that doesn't
>> look
>> like it's in the cards in the reasonably near future.
>
> Is there a PR open showing what exactly you are looking for?
> I'm using open PRs to track enhancement requests, and they will
> all feed back into the development roadmap  I am working on.

 Not that I know of.  The background is upthread, in particular in
 Aldy's response here:
 https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html

 I like the suggestion and if/when I have the time I'd like to give
 it a try.  Until then, I think the patch is useful on its own so
 I'll go with it for now.

 Longer term, I do hope we can revisit the idea of computing either
 mathematically correct ranges alongside those required by the language
 semantics, or tracking signed overflow or unsigned wraparound. E.g.,
 in:

   void* f (int n)
   {
     if (n < INT_MAX / 3)
   n = INT_MAX / 3;

     n *= sizeof (int);
     // n is [(INT_MAX / 3) * 4, INF] mathematically
     // undefined due to overflow in C
     // but [INT_MIN, INT_MAX] according to VRP
>>>
>>> but sizeof returns a size_t.. which is an unsigned. thus the
>>> multiply is promoted to an unsigned multiply  which means there is
>>> lots of wrapping and I don't see how you can conclude those ranges?
>>> [INT_MIN, INT_MAX] are all possible outcomes based on the code that
>>> is generated.
>>>
>>> If I change that to
>>>   n *= (int) sizeof (int) to keep it as signed arithmetic, I see:
>>>
>>>
>>> Folding statement: n_4 = n_1 * 4;
>>> EVRP:hybrid: RVRP found singleton 2147483647
>>> Queued stmt for removal.  Folds to: 2147483647
>>> evrp visiting stmt _7 = malloc (n_4);
>>>
>>> extract_range_from_stmt visiting:
>>> _7 = foo (n_4);
>>> Folding statement: _7 = foo (n_4);
>>> EVRP:hybrid: RVRP found singleton 2147483647
>>> Folded into: _7 = malloc (2147483647);
>>>
>>> So I'm not sure what exactly you want to do?  We are calculating
>>> what the program can produce?
>>>
>>> Why do we care about alternative calculations?
>>>
>> Or rather, why do we want to do this?
>
> When computing the sizes of things, programmers commonly forget
> to consider unsigned wrapping (or signed overflow).  We simply
> assume it can't happen and that (for instance) N * sizeof (X)
> is necessarily big enough for N elements of type X.  (Grepping
> any code base for the pattern '\* sizeof' and looking for code
> that tests that the result doesn't wrap is revealing.)
>
> When overflow or wrapping does happen (typically because of poor
> precondition checking) it often leads to bugs when we end up
> allocating less space than we need and use.  A simple example
> to help illustrate what I mean:
>
>   void* g (int *a, int n)
>   {
>     a = realloc (a, n * sizeof (int) + 32);
>     for (int i = n; i != n + 32; ++i)
>   a[i] = f ();
>   }
>
> In ILP32, if (n > INT_MAX / 4 - 32) holds, n * sizeof(int) will
> wrap around zero.  The realloc call will end up allocating less
> space than expected, and the loop will write past the end of
> the allocated block.
>
> (The bug above can only be detected if we know n's range.
> I left that part out.)
>
> Historically, bugs caused by integer overflow and wrapping have
> been among the most serious security weaknesses.  Detecting these
> mistakes will help prevent some of these.
>
> The problem is that according to C/C++, nothing in the function
> above is undefined except for the buffer overflow in the loop,
> and the buffer overflow only happens because of the well-defined
> integer wrapping.  To detect the wrapping, we either need to do
> the computation in as-if infinite math and compare the final result
> to the result we get under C's truncating rules, or we need to set
> and propagate the "wraparound" bit throughout the computation.
Just to add a bit to Martin's note.  Yes, an overflow of the size passed
to

Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-11-24 Thread Martin Sebor via Gcc-patches

On 11/24/20 10:44 AM, Andrew MacLeod wrote:

On 11/24/20 12:42 PM, Andrew MacLeod wrote:

On 11/23/20 4:38 PM, Martin Sebor wrote:

On 11/21/20 6:26 AM, Andrew MacLeod wrote:

On 11/21/20 12:07 AM, Jeff Law wrote:


On 11/9/20 9:00 AM, Martin Sebor wrote:

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html

Jeff, I don't expect to have the cycles to reimplement this patch
using the Ranger APIs before stage 1 closes.  I'm open to giving
it a try in stage 3 if it's still in scope for GCC 11. Otherwise,
is this patch okay to commit?
So all we're going to get from the ranger is ranges of operands, 
right?

Meaning that we still need to either roll our own evaluator
(eval_size_vflow) or overload range_for_stmt with our own, which 
likely

looks like eval_size_vflow anyway, right?

My hope was to avoid the roll our own evaluator, but that doesn't look
like it's in the cards in the reasonably near future.


Is there a PR open showing what exactly you are looking for?
I'm using open PRs to track enhancement requests, and they will all 
feed back into the development roadmap  I am working on.


Not that I know of.  The background is upthread, in particular in
Aldy's response here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html

I like the suggestion and if/when I have the time I'd like to give
it a try.  Until then, I think the patch is useful on its own so
I'll go with it for now.

Longer term, I do hope we can revisit the idea of computing either
mathematically correct ranges alongside those required by the language
semantics, or tracking signed overflow or unsigned wraparound. E.g.,
in:

  void* f (int n)
  {
    if (n < INT_MAX / 3)
  n = INT_MAX / 3;

    n *= sizeof (int);
    // n is [(INT_MAX / 3) * 4, INF] mathematically
    // undefined due to overflow in C
    // but [INT_MIN, INT_MAX] according to VRP


but sizeof returns a size_t.. which is an unsigned. thus the multiply 
is promoted to an unsigned multiply  which means there is lots of 
wrapping and I don't see how you can conclude those ranges? [INT_MIN, 
INT_MAX] are all possible outcomes based on the code that is generated.


If I change that to
  n *= (int) sizeof (int) to keep it as signed arithmetic, I see:


Folding statement: n_4 = n_1 * 4;
EVRP:hybrid: RVRP found singleton 2147483647
Queued stmt for removal.  Folds to: 2147483647
evrp visiting stmt _7 = malloc (n_4);

extract_range_from_stmt visiting:
_7 = foo (n_4);
Folding statement: _7 = foo (n_4);
EVRP:hybrid: RVRP found singleton 2147483647
Folded into: _7 = malloc (2147483647);

So I'm not sure what exactly you want to do?  We are calculating what 
the program can produce?


Why do we care about alternative calculations?


Or rather, why do we want to do this?


When computing the sizes of things, programmers commonly forget
to consider unsigned wrapping (or signed overflow).  We simply
assume it can't happen and that (for instance) N * sizeof (X)
is necessarily big enough for N elements of type X.  (Grepping
any code base for the pattern '\* sizeof' and looking for code
that tests that the result doesn't wrap is revealing.)

When overflow or wrapping does happen (typically because of poor
precondition checking) it often leads to bugs when we end up
allocating less space than we need and use.  A simple example
to help illustrate what I mean:

  void* g (int *a, int n)
  {
a = realloc (a, n * sizeof (int) + 32);
for (int i = n; i != n + 32; ++i)
  a[i] = f ();
  }

In ILP32, if (n > INT_MAX / 4 - 32) holds, n * sizeof(int) will
wrap around zero.  The realloc call will end up allocating less
space than expected, and the loop will write past the end of
the allocated block.

(The bug above can only be detected if we know n's range.
I left that part out.)

Historically, bugs caused by integer overflow and wrapping have
been among the most serious security weaknesses.  Detecting these
mistakes will help prevent some of these.

The problem is that according to C/C++, nothing in the function
above is undefined except for the buffer overflow in the loop,
and the buffer overflow only happens because of the well-defined
integer wrapping.  To detect the wrapping, we either need to do
the computation in as-if infinite math and compare the final result
to the result we get under C's truncating rules, or we need to set
and propagate the "wraparound" bit throughout the computation.

Martin


Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-11-24 Thread Andrew MacLeod via Gcc-patches

On 11/24/20 12:42 PM, Andrew MacLeod wrote:

On 11/23/20 4:38 PM, Martin Sebor wrote:

On 11/21/20 6:26 AM, Andrew MacLeod wrote:

On 11/21/20 12:07 AM, Jeff Law wrote:


On 11/9/20 9:00 AM, Martin Sebor wrote:

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html

Jeff, I don't expect to have the cycles to reimplement this patch
using the Ranger APIs before stage 1 closes.  I'm open to giving
it a try in stage 3 if it's still in scope for GCC 11. Otherwise,
is this patch okay to commit?
So all we're going to get from the ranger is ranges of operands, 
right?

Meaning that we still need to either roll our own evaluator
(eval_size_vflow) or overload range_for_stmt with our own, which 
likely

looks like eval_size_vflow anyway, right?

My hope was to avoid the roll our own evaluator, but that doesn't look
like it's in the cards in the reasonably near future.


Is there a PR open showing what exactly you are looking for?
I'm using open PRs to track enhancement requests, and they will all 
feed back into the development roadmap  I am working on.


Not that I know of.  The background is upthread, in particular in
Aldy's response here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html

I like the suggestion and if/when I have the time I'd like to give
it a try.  Until then, I think the patch is useful on its own so
I'll go with it for now.

Longer term, I do hope we can revisit the idea of computing either
mathematically correct ranges alongside those required by the language
semantics, or tracking signed overflow or unsigned wraparound. E.g.,
in:

  void* f (int n)
  {
    if (n < INT_MAX / 3)
  n = INT_MAX / 3;

    n *= sizeof (int);
    // n is [(INT_MAX / 3) * 4, INF] mathematically
    // undefined due to overflow in C
    // but [INT_MIN, INT_MAX] according to VRP


but sizeof returns a size_t.. which is an unsigned. thus the multiply 
is promoted to an unsigned multiply  which means there is lots of 
wrapping and I don't see how you can conclude those ranges?    
[INT_MIN, INT_MAX] are all possible outcomes based on the code that is 
generated.


If I change that to
  n *= (int) sizeof (int) to keep it as signed arithmetic, I see:


Folding statement: n_4 = n_1 * 4;
EVRP:hybrid: RVRP found singleton 2147483647
Queued stmt for removal.  Folds to: 2147483647
evrp visiting stmt _7 = malloc (n_4);

extract_range_from_stmt visiting:
_7 = foo (n_4);
Folding statement: _7 = foo (n_4);
EVRP:hybrid: RVRP found singleton 2147483647
Folded into: _7 = malloc (2147483647);

So I'm not sure what exactly you want to do?  We are calculating what 
the program can produce?


Why do we care about alternative calculations?


Or rather, why do we want to do this?



Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-11-24 Thread Andrew MacLeod via Gcc-patches

On 11/23/20 4:38 PM, Martin Sebor wrote:

On 11/21/20 6:26 AM, Andrew MacLeod wrote:

On 11/21/20 12:07 AM, Jeff Law wrote:


On 11/9/20 9:00 AM, Martin Sebor wrote:

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html

Jeff, I don't expect to have the cycles to reimplement this patch
using the Ranger APIs before stage 1 closes.  I'm open to giving
it a try in stage 3 if it's still in scope for GCC 11. Otherwise,
is this patch okay to commit?

So all we're going to get from the ranger is ranges of operands, right?
Meaning that we still need to either roll our own evaluator
(eval_size_vflow) or overload range_for_stmt with our own, which likely
looks like eval_size_vflow anyway, right?

My hope was to avoid the roll our own evaluator, but that doesn't look
like it's in the cards in the reasonably near future.


Is there a PR open showing what exactly you are looking for?
I'm using open PRs to track enhancement requests, and they will all 
feed back into the development roadmap  I am working on.


Not that I know of.  The background is upthread, in particular in
Aldy's response here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html

I like the suggestion and if/when I have the time I'd like to give
it a try.  Until then, I think the patch is useful on its own so
I'll go with it for now.

Longer term, I do hope we can revisit the idea of computing either
mathematically correct ranges alongside those required by the language
semantics, or tracking signed overflow or unsigned wraparound. E.g.,
in:

  void* f (int n)
  {
    if (n < INT_MAX / 3)
  n = INT_MAX / 3;

    n *= sizeof (int);
    // n is [(INT_MAX / 3) * 4, INF] mathematically
    // undefined due to overflow in C
    // but [INT_MIN, INT_MAX] according to VRP


but sizeof returns a size_t.. which is an unsigned. thus the multiply is 
promoted to an unsigned multiply  which means there is lots of wrapping 
and I don't see how you can conclude those ranges?    [INT_MIN, INT_MAX] 
are all possible outcomes based on the code that is generated.


If I change that to
  n *= (int) sizeof (int) to keep it as signed arithmetic, I see:


Folding statement: n_4 = n_1 * 4;
EVRP:hybrid: RVRP found singleton 2147483647
Queued stmt for removal.  Folds to: 2147483647
evrp visiting stmt _7 = malloc (n_4);

extract_range_from_stmt visiting:
_7 = foo (n_4);
Folding statement: _7 = foo (n_4);
EVRP:hybrid: RVRP found singleton 2147483647
Folded into: _7 = malloc (2147483647);

So I'm not sure what exactly you want to do?  We are calculating what 
the program can produce?


Why do we care about alternative calculations?

Andrew



Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-11-23 Thread Martin Sebor via Gcc-patches

On 11/21/20 6:26 AM, Andrew MacLeod wrote:

On 11/21/20 12:07 AM, Jeff Law wrote:


On 11/9/20 9:00 AM, Martin Sebor wrote:

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html

Jeff, I don't expect to have the cycles to reimplement this patch
using the Ranger APIs before stage 1 closes.  I'm open to giving
it a try in stage 3 if it's still in scope for GCC 11.  Otherwise,
is this patch okay to commit?

So all we're going to get from the ranger is ranges of operands, right?
Meaning that we still need to either roll our own evaluator
(eval_size_vflow) or overload range_for_stmt with our own, which likely
looks like eval_size_vflow anyway, right?

My hope was to avoid the roll our own evaluator, but that doesn't look
like it's in the cards in the reasonably near future.


Is there a PR open showing what exactly you are looking for?
I'm using open PRs to track enhancement requests, and they will all feed 
back into the development roadmap  I am working on.


Not that I know of.  The background is upthread, in particular in
Aldy's response here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html

I like the suggestion and if/when I have the time I'd like to give
it a try.  Until then, I think the patch is useful on its own so
I'll go with it for now.

Longer term, I do hope we can revisit the idea of computing either
mathematically correct ranges alongside those required by the language
semantics, or tracking signed overflow or unsigned wraparound.  E.g.,
in:

  void* f (int n)
  {
if (n < INT_MAX / 3)
  n = INT_MAX / 3;

n *= sizeof (int);
// n is [(INT_MAX / 3) * 4, INF] mathematically
// undefined due to overflow in C
// but [INT_MIN, INT_MAX] according to VRP

return malloc (n);
  }

and in

  void* g (unsigned n)
  {
if (n < UINT_MAX / 3)
  n = UINT_MAX / 3;

n *= sizeof (int);
// n is [(UINT_MAX / 3) * 4, INF] mathematically
// but [0, UINT_MAX] in C

return malloc (n);
  }

In neither case is the overflow (or wraparound) easy to detect
without the Ranger's help.

Martin


If my summary of the current state is correct, then I'd suggest we go
with the patch as-is.  If we want to override eval_size_vflow in the
future, we can still do that.  And if we want to replace the call to
determine_value_range with a ranger API, that seems like a fairly
straightforward change to make in gcc-12.



Jeff







Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-11-21 Thread Andrew MacLeod via Gcc-patches

On 11/21/20 12:07 AM, Jeff Law wrote:


On 11/9/20 9:00 AM, Martin Sebor wrote:

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html

Jeff, I don't expect to have the cycles to reimplement this patch
using the Ranger APIs before stage 1 closes.  I'm open to giving
it a try in stage 3 if it's still in scope for GCC 11.  Otherwise,
is this patch okay to commit?

So all we're going to get from the ranger is ranges of operands, right?
Meaning that we still need to either roll our own evaluator
(eval_size_vflow) or overload range_for_stmt with our own, which likely
looks like eval_size_vflow anyway, right?

My hope was to avoid the roll our own evaluator, but that doesn't look
like it's in the cards in the reasonably near future.


Is there a PR open showing what exactly you are looking for?
I'm using open PRs to track enhancement requests, and they will all feed 
back into the development roadmap  I am working on.





If my summary of the current state is correct, then I'd suggest we go
with the patch as-is.  If we want to override eval_size_vflow in the
future, we can still do that.  And if we want to replace the call to
determine_value_range with a ranger API, that seems like a fairly
straightforward change to make in gcc-12.



Jeff





Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-11-20 Thread Jeff Law via Gcc-patches



On 11/9/20 9:00 AM, Martin Sebor wrote:
> Ping:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html
>
> Jeff, I don't expect to have the cycles to reimplement this patch
> using the Ranger APIs before stage 1 closes.  I'm open to giving
> it a try in stage 3 if it's still in scope for GCC 11.  Otherwise,
> is this patch okay to commit?
So all we're going to get from the ranger is ranges of operands, right? 
Meaning that we still need to either roll our own evaluator
(eval_size_vflow) or overload range_for_stmt with our own, which likely
looks like eval_size_vflow anyway, right?

My hope was to avoid the roll our own evaluator, but that doesn't look
like it's in the cards in the reasonably near future.

If my summary of the current state is correct, then I'd suggest we go
with the patch as-is.  If we want to override eval_size_vflow in the
future, we can still do that.  And if we want to replace the call to
determine_value_range with a ranger API, that seems like a fairly
straightforward change to make in gcc-12.



Jeff



Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-11-09 Thread Martin Sebor via Gcc-patches

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html

Jeff, I don't expect to have the cycles to reimplement this patch
using the Ranger APIs before stage 1 closes.  I'm open to giving
it a try in stage 3 if it's still in scope for GCC 11.  Otherwise,
is this patch okay to commit?

On 9/21/20 9:13 AM, Martin Sebor wrote:

On 9/20/20 12:39 AM, Aldy Hernandez wrote:



On 9/19/20 11:22 PM, Martin Sebor wrote:

On 9/18/20 12:29 AM, Aldy Hernandez wrote:



On 9/17/20 10:18 PM, Martin Sebor wrote:

On 9/17/20 12:39 PM, Andrew MacLeod wrote:

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in 
calls to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose 
overflow/wraparound.

* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, 
when my son was asking me attack vectors, this is one I said I'd 
look at if I were a bad guy.



I'm a bit surprised you can't just query the range of the 
argument and get the overflow status out of that range, but I 
don't see that in the APIs.  How painful would it be to make 
that part of the API? The conceptual model would be to just ask 
for the range of the argument to malloc which would include the 
range and a status bit indicating the computation might have 
overflowed.


  Do we know if it did/would have wrapped? sure.  since we have to 
do the math.    so you are correct in that the information is 
there. but is it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF.  which means you get an overflow for unsigned when you 
subtract one.   From what I have seen of unsigned math, we would 
be flagging very many operations as overflows, so you would still 
have the difficulty of figuring out whether its a "real" overflow 
or a fake one because of the way we do unsigned math


You and me both :)



At the very start, I did have an overflow flag in the range 
class... but it was turning out to be fairly useless so it was 
removed.

.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


SO again, we get back to adding 0x0f when we are trying to 
subtract one...  now, with infinite precision you are going to see


  [2,10]  - 1  we end up with [2,10]+0xFF, which will now 
give you  [0x10001, 0x10009]    so its going to look like 
it overflowed?





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
  n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 
0x3fffc] is too large to represent in ‘long unsigned 
int’ [-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of 
cases.
It could be generalized and exposed by a more generic API but it 
does
seem like the ranger must already have all the logic built into 
it so

if it isn't exposed now it should be a matter of opening it up.


everything is exposed in range-ops.  well, mostly.
if we have _1 = _6 * 4

if one wanted to do that infinite precision, you query the range 
for _6, and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

you could take their current types, and cast those ranges to 
whatever the next higher precsion is,

range_cast  (op1r, highertype)
range_cast

Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-21 Thread Martin Sebor via Gcc-patches

On 9/20/20 12:39 AM, Aldy Hernandez wrote:



On 9/19/20 11:22 PM, Martin Sebor wrote:

On 9/18/20 12:29 AM, Aldy Hernandez wrote:



On 9/17/20 10:18 PM, Martin Sebor wrote:

On 9/17/20 12:39 PM, Andrew MacLeod wrote:

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in 
calls to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, 
when my son was asking me attack vectors, this is one I said I'd 
look at if I were a bad guy.



I'm a bit surprised you can't just query the range of the 
argument and get the overflow status out of that range, but I 
don't see that in the APIs.  How painful would it be to make that 
part of the API? The conceptual model would be to just ask for 
the range of the argument to malloc which would include the range 
and a status bit indicating the computation might have overflowed.


  Do we know if it did/would have wrapped? sure.  since we have to 
do the math.    so you are correct in that the information is 
there. but is it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF.  which means you get an overflow for unsigned when you 
subtract one.   From what I have seen of unsigned math, we would be 
flagging very many operations as overflows, so you would still have 
the difficulty of figuring out whether its a "real" overflow or a 
fake one because of the way we do unsigned math


You and me both :)



At the very start, I did have an overflow flag in the range 
class... but it was turning out to be fairly useless so it was 
removed.

.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


SO again, we get back to adding 0x0f when we are trying to 
subtract one...  now, with infinite precision you are going to see


  [2,10]  - 1  we end up with [2,10]+0xFF, which will now 
give you  [0x10001, 0x10009]    so its going to look like 
it overflowed?





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
  n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 
0x3fffc] is too large to represent in ‘long unsigned 
int’ [-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.


everything is exposed in range-ops.  well, mostly.
if we have _1 = _6 * 4

if one wanted to do that infinite precision, you query the range 
for _6, and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

you could take their current types, and cast those ranges to 
whatever the next higher precsion is,

range_cast  (op1r, highertype)
range_cast (op2r, highertype)
then invoke the operation on those parameters

gimple_range_fold (r, stmt,  op1r, op2r)

and that will do your operation in the higher precision.  you could 
compare that to the value in regular precision too i suppose.


The patch does pretty much exactly what you described, except in
offset_int, and only for a limited set of arit

Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-19 Thread Aldy Hernandez via Gcc-patches




On 9/19/20 11:22 PM, Martin Sebor wrote:

On 9/18/20 12:29 AM, Aldy Hernandez wrote:



On 9/17/20 10:18 PM, Martin Sebor wrote:

On 9/17/20 12:39 PM, Andrew MacLeod wrote:

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in 
calls to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, 
when my son was asking me attack vectors, this is one I said I'd 
look at if I were a bad guy.



I'm a bit surprised you can't just query the range of the argument 
and get the overflow status out of that range, but I don't see 
that in the APIs.  How painful would it be to make that part of 
the API? The conceptual model would be to just ask for the range 
of the argument to malloc which would include the range and a 
status bit indicating the computation might have overflowed.


  Do we know if it did/would have wrapped? sure.  since we have to 
do the math.    so you are correct in that the information is there. 
but is it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF.  which means you get an overflow for unsigned when you 
subtract one.   From what I have seen of unsigned math, we would be 
flagging very many operations as overflows, so you would still have 
the difficulty of figuring out whether its a "real" overflow or a 
fake one because of the way we do unsigned math


You and me both :)



At the very start, I did have an overflow flag in the range class... 
but it was turning out to be fairly useless so it was removed.

.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


SO again, we get back to adding 0x0f when we are trying to 
subtract one...  now, with infinite precision you are going to see


  [2,10]  - 1  we end up with [2,10]+0xFF, which will now 
give you  [0x10001, 0x10009]    so its going to look like it 
overflowed?





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
  n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 
0x3fffc] is too large to represent in ‘long unsigned 
int’ [-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.


everything is exposed in range-ops.  well, mostly.
if we have _1 = _6 * 4

if one wanted to do that infinite precision, you query the range for 
_6, and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

you could take their current types, and cast those ranges to 
whatever the next higher precsion is,

range_cast  (op1r, highertype)
range_cast (op2r, highertype)
then invoke the operation on those parameters

gimple_range_fold (r, stmt,  op1r, op2r)

and that will do your operation in the higher precision.  you could 
compare that to the value in regular precision too i suppose.


The patch does pretty much exactly what you described, except in
offset_int, and only for a limited set of arithmetic operations.
It figures out if an unsi

Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-19 Thread Martin Sebor via Gcc-patches

On 9/18/20 12:29 AM, Aldy Hernandez wrote:



On 9/17/20 10:18 PM, Martin Sebor wrote:

On 9/17/20 12:39 PM, Andrew MacLeod wrote:

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in calls 
to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, 
when my son was asking me attack vectors, this is one I said I'd 
look at if I were a bad guy.



I'm a bit surprised you can't just query the range of the argument 
and get the overflow status out of that range, but I don't see that 
in the APIs.  How painful would it be to make that part of the API? 
The conceptual model would be to just ask for the range of the 
argument to malloc which would include the range and a status bit 
indicating the computation might have overflowed.


  Do we know if it did/would have wrapped? sure.  since we have to do 
the math.    so you are correct in that the information is there. but 
is it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF.  which means you get an overflow for unsigned when you 
subtract one.   From what I have seen of unsigned math, we would be 
flagging very many operations as overflows, so you would still have 
the difficulty of figuring out whether its a "real" overflow or a 
fake one because of the way we do unsigned math


You and me both :)



At the very start, I did have an overflow flag in the range class... 
but it was turning out to be fairly useless so it was removed.

.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


SO again, we get back to adding 0x0f when we are trying to 
subtract one...  now, with infinite precision you are going to see


  [2,10]  - 1  we end up with [2,10]+0xFF, which will now 
give you  [0x10001, 0x10009]    so its going to look like it 
overflowed?





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
  n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 
0x3fffc] is too large to represent in ‘long unsigned 
int’ [-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.


everything is exposed in range-ops.  well, mostly.
if we have _1 = _6 * 4

if one wanted to do that infinite precision, you query the range for 
_6, and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

you could take their current types, and cast those ranges to whatever 
the next higher precsion is,

range_cast  (op1r, highertype)
range_cast (op2r, highertype)
then invoke the operation on those parameters

gimple_range_fold (r, stmt,  op1r, op2r)

and that will do your operation in the higher precision.  you could 
compare that to the value in regular precision too i suppose.


The patch does pretty much exactly what you described, except in
offset_int, and only for a limited set of arithmetic operations.
It figures out if an unsigned expression wrapped simply by checking
to

Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-17 Thread Aldy Hernandez via Gcc-patches




On 9/17/20 10:18 PM, Martin Sebor wrote:

On 9/17/20 12:39 PM, Andrew MacLeod wrote:

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in calls 
to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, when 
my son was asking me attack vectors, this is one I said I'd look at 
if I were a bad guy.



I'm a bit surprised you can't just query the range of the argument 
and get the overflow status out of that range, but I don't see that 
in the APIs.  How painful would it be to make that part of the API? 
The conceptual model would be to just ask for the range of the 
argument to malloc which would include the range and a status bit 
indicating the computation might have overflowed.


  Do we know if it did/would have wrapped? sure.  since we have to do 
the math.    so you are correct in that the information is there. but 
is it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF.  which means you get an overflow for unsigned when you 
subtract one.   From what I have seen of unsigned math, we would be 
flagging very many operations as overflows, so you would still have 
the difficulty of figuring out whether its a "real" overflow or a fake 
one because of the way we do unsigned math


You and me both :)



At the very start, I did have an overflow flag in the range class... 
but it was turning out to be fairly useless so it was removed.

.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


SO again, we get back to adding 0x0f when we are trying to 
subtract one...  now, with infinite precision you are going to see


  [2,10]  - 1  we end up with [2,10]+0xFF, which will now give 
you  [0x10001, 0x10009]    so its going to look like it 
overflowed?





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
  n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 0x3fffc] 
is too large to represent in ‘long unsigned int’ 
[-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.


everything is exposed in range-ops.  well, mostly.
if we have _1 = _6 * 4

if one wanted to do that infinite precision, you query the range for 
_6, and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

you could take their current types, and cast those ranges to whatever 
the next higher precsion is,

range_cast  (op1r, highertype)
range_cast (op2r, highertype)
then invoke the operation on those parameters

gimple_range_fold (r, stmt,  op1r, op2r)

and that will do your operation in the higher precision.  you could 
compare that to the value in regular precision too i suppose.


The patch does pretty much exactly what you described, except in
offset_int, and only for a limited set of arithmetic operations.
It figures out if an unsigned expression wrapped simply by checking
to see if the mathematically correct result f

Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-17 Thread Martin Sebor via Gcc-patches

On 9/17/20 12:39 PM, Andrew MacLeod wrote:

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in calls 
to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, when 
my son was asking me attack vectors, this is one I said I'd look at 
if I were a bad guy.



I'm a bit surprised you can't just query the range of the argument 
and get the overflow status out of that range, but I don't see that 
in the APIs.  How painful would it be to make that part of the API? 
The conceptual model would be to just ask for the range of the 
argument to malloc which would include the range and a status bit 
indicating the computation might have overflowed.


  Do we know if it did/would have wrapped? sure.  since we have to do 
the math.    so you are correct in that the information is there. but is 
it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF.  which means you get an overflow for unsigned when you 
subtract one.   From what I have seen of unsigned math, we would be 
flagging very many operations as overflows, so you would still have the 
difficulty of figuring out whether its a "real" overflow or a fake one 
because of the way we do unsigned math


You and me both :)



At the very start, I did have an overflow flag in the range class... but 
it was turning out to be fairly useless so it was removed.

.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


SO again, we get back to adding 0x0f when we are trying to subtract 
one...  now, with infinite precision you are going to see


  [2,10]  - 1  we end up with [2,10]+0xFF, which will now give 
you  [0x10001, 0x10009]    so its going to look like it overflowed?





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
  n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 0x3fffc] 
is too large to represent in ‘long unsigned int’ 
[-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.


everything is exposed in range-ops.  well, mostly.
if we have _1 = _6 * 4

if one wanted to do that infinite precision, you query the range for _6, 
and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

you could take their current types, and cast those ranges to whatever 
the next higher precsion is,

range_cast  (op1r, highertype)
range_cast (op2r, highertype)
then invoke the operation on those parameters

gimple_range_fold (r, stmt,  op1r, op2r)

and that will do your operation in the higher precision.  you could 
compare that to the value in regular precision too i suppose.


The patch does pretty much exactly what you described, except in
offset_int, and only for a limited set of arithmetic operations.
It figures out if an unsigned expression wrapped simply by checking
to see if the mathematically correct result fits in its type.

It sounds like I should be a

Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-17 Thread Andrew MacLeod via Gcc-patches

On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote:

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in calls 
to allocation functions


gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an 
allocation, then they can do all kinds of bad things.  In fact, when 
my son was asking me attack vectors, this is one I said I'd look at 
if I were a bad guy.



I'm a bit surprised you can't just query the range of the argument 
and get the overflow status out of that range, but I don't see that 
in the APIs.  How painful would it be to make that part of the API?  
The conceptual model would be to just ask for the range of the 
argument to malloc which would include the range and a status bit 
indicating the computation might have overflowed.


 Do we know if it did/would have wrapped? sure.  since we have to do 
the math.    so you are correct in that the information is there. but is 
it useful?


We are in the very annoying habit of subtracting one by adding 
0xFFF.  which means you get an overflow for unsigned when you 
subtract one.   From what I have seen of unsigned math, we would be 
flagging very many operations as overflows, so you would still have the 
difficulty of figuring out whether its a "real" overflow or a fake one 
because of the way we do unsigned math


At the very start, I did have an overflow flag in the range class...  
but it was turning out to be fairly useless so it was removed.

.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.


SO again, we get back to adding 0x0f when we are trying to subtract 
one...  now, with infinite precision you are going to see


 [2,10]  - 1  we end up with [2,10]+0xFF, which will now give 
you  [0x10001, 0x10009]    so its going to look like it overflowed?





But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
    if (n < PTRDIFF_MAX / 2)
  n = PTRDIFF_MAX / 2;

    return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 0x3fffc] 
is too large to represent in ‘long unsigned int’ 
[-Walloc-size-larger-than=]

    6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.


everything is exposed in range-ops.  well, mostly.
if we have _1 = _6 * 4

if one wanted to do that infinite precision, you query the range for _6, 
and the range for 4 (which would be [4,4] :-)

range_of_expr (op1r, _6, stmt)
range_of_expr (op2r, 4, stmt)

you could take their current types, and cast those ranges to whatever 
the next higher precsion is,

range_cast  (op1r, highertype)
range_cast (op2r, highertype)
then invoke the operation on those parameters

gimple_range_fold (r, stmt,  op1r, op2r)

and that will do your operation in the higher precision.  you could 
compare that to the value in regular precision too i suppose.


The ranger is designed to track ranges as they are represented in the 
IL.  You are asking for us to interpret the IL as something other than 
what is there, and increase the precision.  Thats a different task.


could that be done?    maybe. It might involve parallel tracking of 
ssa-Name ranges in a higher precision... and recalculating every 
express

Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-17 Thread Martin Sebor via Gcc-patches

On 9/16/20 9:23 PM, Jeff Law wrote:


On 9/15/20 1:47 PM, Martin Sebor wrote:

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin

gcc-96838.diff

PR middle-end/96838 - missing warning on integer overflow in calls to 
allocation functions

gcc/ChangeLog:

PR middle-end/96838
* calls.c (eval_size_vflow): New function.
(get_size_range): Call it.  Add argument.
(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

PR middle-end/96838
* gcc.dg/Walloc-size-larger-than-19.c: New test.
* gcc.dg/Walloc-size-larger-than-20.c: New test.


If an attacker can control an integer overflow that feeds an allocation, 
then they can do all kinds of bad things.  In fact, when my son was 
asking me attack vectors, this is one I said I'd look at if I were a bad 
guy.



I'm a bit surprised you can't just query the range of the argument and 
get the overflow status out of that range, but I don't see that in the 
APIs.  How painful would it be to make that part of the API?  The 
conceptual model would be to just ask for the range of the argument to 
malloc which would include the range and a status bit indicating the 
computation might have overflowed.


I agree that being able to evaluate an expression in an as-if
infinite precision (in addition to its type) would be helpful.
I mentioned it to Aldy in response to the irange best practices
document.  He says there's no way to do that and no plans to
make it possible.

But just to make sure I understood correctly, let me ask again
using an example:

  void* f (size_t n)
  {
if (n < PTRDIFF_MAX / 2)
  n = PTRDIFF_MAX / 2;

return malloc (n * sizeof (int));
  }

Can the unsigned wraparound in the argument be readily detected?

On trunk, this ends up with the following:

  # RANGE [4611686018427387903, 18446744073709551615]
  _6 = MAX_EXPR ;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _1 = _6 * 4;
  ...
  p_5 = mallocD.1206 (_1); [tail call]
  ...
  return p_5;

so _1's range reflects the wraparound in size_t, but _6's range
has enough information to uncover it.  So detecting it is possible
and is done in the patch so we get a warning:

warning: argument 1 range [18446744073709551612, 0x3fffc] is 
too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=]

6 |   return malloc (n * sizeof (int));
  |  ^

The code is very simplistic and only handles a small subset of cases.
It could be generalized and exposed by a more generic API but it does
seem like the ranger must already have all the logic built into it so
if it isn't exposed now it should be a matter of opening it up.

Martin


Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-16 Thread Jeff Law via Gcc-patches

On 9/15/20 1:47 PM, Martin Sebor wrote:
> Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
> can lead to a subsequent buffer overflow corrupting the heap or
> stack.  The attached patch diagnoses a subset of these cases where
> the overflow/wraparound is still detectable.
>
> Besides regtesting GCC on x86_64-linux I also verified the warning
> doesn't introduce any false positives into Glibc or Binutils/GDB
> builds on the same target.
>
> Martin
>
> gcc-96838.diff
>
> PR middle-end/96838 - missing warning on integer overflow in calls to 
> allocation functions
>
> gcc/ChangeLog:
>
>   PR middle-end/96838
>   * calls.c (eval_size_vflow): New function.
>   (get_size_range): Call it.  Add argument.
>   (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
>   * calls.h (get_size_range): Add argument.
>
> gcc/testsuite/ChangeLog:
>
>   PR middle-end/96838
>   * gcc.dg/Walloc-size-larger-than-19.c: New test.
>   * gcc.dg/Walloc-size-larger-than-20.c: New test.

If an attacker can control an integer overflow that feeds an allocation,
then they can do all kinds of bad things.  In fact, when my son was
asking me attack vectors, this is one I said I'd look at if I were a bad
guy.


I'm a bit surprised you can't just query the range of the argument and
get the overflow status out of that range, but I don't see that in the
APIs.  How painful would it be to make that part of the API?  The
conceptual model would be to just ask for the range of the argument to
malloc which would include the range and a status bit indicating the
computation might have overflowed.


jeff




pEpkey.asc
Description: application/pgp-keys


Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-16 Thread Bernhard Reutner-Fischer via Gcc-patches
On 15 September 2020 21:47:46 CEST, Martin Sebor via Gcc-patches 
 wrote:
>Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
>can lead to a subsequent buffer overflow corrupting the heap or
>stack.  The attached patch diagnoses a subset of these cases where
>the overflow/wraparound is still detectable.
>
>Besides regtesting GCC on x86_64-linux I also verified the warning
>doesn't introduce any false positives into Glibc or Binutils/GDB
>builds on the same target.

+/* Try to evaluate the artithmetic EXPresssion representing the size of

s/EXPresssion/expression EXP/

You had a bit more s than strictly necessary..
thanks,

>
>Martin



[PATCH] warn for integer overflow in allocation calls (PR 96838)

2020-09-15 Thread Martin Sebor via Gcc-patches

Overflowing the size of a dynamic allocation (e.g., malloc or VLA)
can lead to a subsequent buffer overflow corrupting the heap or
stack.  The attached patch diagnoses a subset of these cases where
the overflow/wraparound is still detectable.

Besides regtesting GCC on x86_64-linux I also verified the warning
doesn't introduce any false positives into Glibc or Binutils/GDB
builds on the same target.

Martin
PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions

gcc/ChangeLog:

	PR middle-end/96838
	* calls.c (eval_size_vflow): New function.
	(get_size_range): Call it.  Add argument.
	(maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound.
	* calls.h (get_size_range): Add argument.

gcc/testsuite/ChangeLog:

	PR middle-end/96838
	* gcc.dg/Walloc-size-larger-than-19.c: New test.
	* gcc.dg/Walloc-size-larger-than-20.c: New test.

diff --git a/gcc/calls.c b/gcc/calls.c
index 8ac94db6817..a5acff301e0 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1237,6 +1237,139 @@ alloc_max_size (void)
   return alloc_object_size_limit;
 }
 
+/* Try to evaluate the artithmetic EXPresssion representing the size of
+   an object in the widest possible type and set RANGE[] to the result.
+   Return the overflow status for the type of the expression (which may
+   be OVF_UNKNOWN if it cannot be determined from the ranges of its
+   operands).  Used to detect calls to allocation functions with
+   an argument that either overflows or wraps around zero.  */
+
+static wi::overflow_type
+eval_size_vflow (tree exp, wide_int range[2])
+{
+  const int prec = WIDE_INT_MAX_PRECISION;
+
+  if (TREE_CODE (exp) == INTEGER_CST)
+{
+  range[0] = range[1] = wi::to_wide (exp, prec);
+  return wi::OVF_NONE;
+}
+
+  if (TREE_CODE (exp) != SSA_NAME)
+return wi::OVF_UNKNOWN;
+
+  gimple *def = SSA_NAME_DEF_STMT (exp);
+  if (!is_gimple_assign (def))
+return wi::OVF_UNKNOWN;
+
+  tree_code code = gimple_assign_rhs_code (def);
+  tree optype = NULL_TREE;
+  wide_int op1r[2], op2r[2];
+  if (code == MULT_EXPR
+  || code == MINUS_EXPR
+  || code == PLUS_EXPR)
+{
+  /* Ignore the overflow on the operands.  */
+  tree rhs1 = gimple_assign_rhs1 (def);
+  wi::overflow_type ovf = eval_size_vflow (rhs1, op1r);
+  if (ovf == wi::OVF_UNKNOWN)
+	return ovf;
+
+  optype = TREE_TYPE (rhs1);
+  tree rhs2 = gimple_assign_rhs2 (def);
+  ovf = eval_size_vflow (rhs2, op2r);
+  if (ovf == wi::OVF_UNKNOWN)
+	return ovf;
+
+  if (code == PLUS_EXPR
+	  && TYPE_UNSIGNED (optype)
+	  && TREE_CODE (rhs2) == INTEGER_CST)
+	{
+	  /* A - CST is transformed into A + (-CST).  Undo that to avoid
+	 false reports of overflow (this means overflow due to very
+	 large constants in the source code isn't detected.)  */
+	  tree sgn_type = signed_type_for (optype);
+	  tree max = TYPE_MAX_VALUE (sgn_type);
+	  wide_int smax = wi::to_wide (max, prec);
+	  if (wi::ltu_p (smax, op2r[0]))
+	{
+	  op2r[0] = wi::neg (wi::sub (op2r[0], smax, UNSIGNED, &ovf));
+	  op2r[1] = op2r[0];
+	}
+	}
+}
+  else if (code == NOP_EXPR)
+{
+  /* Strip (implicit) conversions.  Explicit conversions are stripped
+	 as well which may result in reporting overflow despite a cast.
+	 Those cases should be rare.  */
+  tree rhs1 = gimple_assign_rhs1 (def);
+  wi::overflow_type ovf = eval_size_vflow (rhs1, op1r);
+  if (ovf == wi::OVF_UNKNOWN)
+	return ovf;
+  optype = TREE_TYPE (rhs1);
+}
+  else
+{
+  wide_int min, max;
+  if (determine_value_range (exp, &min, &max) != VR_RANGE)
+	return wi::OVF_UNKNOWN;
+  optype = TREE_TYPE (exp);
+  op1r[0] = wide_int::from (min, prec, TYPE_SIGN (optype));
+  op1r[1] = wide_int::from (max, prec, TYPE_SIGN (optype));
+}
+
+  wide_int umax = wi::to_wide (TYPE_MAX_VALUE (optype), prec);
+  tree sgn_type = signed_type_for (optype);
+  wide_int smax = wi::to_wide (TYPE_MAX_VALUE (sgn_type), prec);
+
+  wi::overflow_type ovf = wi::OVF_NONE;
+  if (code == MULT_EXPR)
+{
+  /* Compute the upper bound of the result first, discarding any
+	 overflow.  Only the overflow in the lower bound matters.  */
+  range[1] = wi::mul (op1r[1], op2r[1], UNSIGNED, &ovf);
+  range[0] = wi::mul (op1r[0], op2r[0], UNSIGNED, &ovf);
+}
+  else if (code == MINUS_EXPR)
+{
+  range[1] = wi::sub (op1r[1], op2r[1], UNSIGNED, &ovf);
+  range[0] = wi::sub (op1r[0], op2r[0], UNSIGNED, &ovf);
+}
+  else if (code == PLUS_EXPR)
+{
+  signop sgn = UNSIGNED;
+  if (op2r[0] == op2r[1] && wi::ltu_p (smax, op2r[0]))
+	sgn = SIGNED;
+
+  range[1] = wi::add (op1r[1], op2r[1], sgn, &ovf);
+  range[0] = wi::add (op1r[0], op2r[0], sgn, &ovf);
+}
+  else
+{
+  range[0] = op1r[0];
+  range[1] = op1r[1];
+}
+
+  if (ovf != wi::OVF_NONE)
+return ovf;
+
+  /* Nothing can be determined from a range that spans zero.
+ TO DO: Assume a range with a negative