Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 11/24/20 11:39 AM, Martin Sebor wrote: > On 11/24/20 10:44 AM, Andrew MacLeod wrote: >> On 11/24/20 12:42 PM, Andrew MacLeod wrote: >>> On 11/23/20 4:38 PM, Martin Sebor wrote: On 11/21/20 6:26 AM, Andrew MacLeod wrote: > On 11/21/20 12:07 AM, Jeff Law wrote: >> >> On 11/9/20 9:00 AM, Martin Sebor wrote: >>> Ping: >>> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html >>> >>> >>> Jeff, I don't expect to have the cycles to reimplement this patch >>> using the Ranger APIs before stage 1 closes. I'm open to giving >>> it a try in stage 3 if it's still in scope for GCC 11. Otherwise, >>> is this patch okay to commit? >> So all we're going to get from the ranger is ranges of operands, >> right? >> Meaning that we still need to either roll our own evaluator >> (eval_size_vflow) or overload range_for_stmt with our own, which >> likely >> looks like eval_size_vflow anyway, right? >> >> My hope was to avoid the roll our own evaluator, but that doesn't >> look >> like it's in the cards in the reasonably near future. > > Is there a PR open showing what exactly you are looking for? > I'm using open PRs to track enhancement requests, and they will > all feed back into the development roadmap I am working on. Not that I know of. The background is upthread, in particular in Aldy's response here: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html I like the suggestion and if/when I have the time I'd like to give it a try. Until then, I think the patch is useful on its own so I'll go with it for now. Longer term, I do hope we can revisit the idea of computing either mathematically correct ranges alongside those required by the language semantics, or tracking signed overflow or unsigned wraparound. E.g., in: void* f (int n) { if (n < INT_MAX / 3) n = INT_MAX / 3; n *= sizeof (int); // n is [(INT_MAX / 3) * 4, INF] mathematically // undefined due to overflow in C // but [INT_MIN, INT_MAX] according to VRP >>> >>> but sizeof returns a size_t.. which is an unsigned. thus the >>> multiply is promoted to an unsigned multiply which means there is >>> lots of wrapping and I don't see how you can conclude those ranges? >>> [INT_MIN, INT_MAX] are all possible outcomes based on the code that >>> is generated. >>> >>> If I change that to >>> n *= (int) sizeof (int) to keep it as signed arithmetic, I see: >>> >>> >>> Folding statement: n_4 = n_1 * 4; >>> EVRP:hybrid: RVRP found singleton 2147483647 >>> Queued stmt for removal. Folds to: 2147483647 >>> evrp visiting stmt _7 = malloc (n_4); >>> >>> extract_range_from_stmt visiting: >>> _7 = foo (n_4); >>> Folding statement: _7 = foo (n_4); >>> EVRP:hybrid: RVRP found singleton 2147483647 >>> Folded into: _7 = malloc (2147483647); >>> >>> So I'm not sure what exactly you want to do? We are calculating >>> what the program can produce? >>> >>> Why do we care about alternative calculations? >>> >> Or rather, why do we want to do this? > > When computing the sizes of things, programmers commonly forget > to consider unsigned wrapping (or signed overflow). We simply > assume it can't happen and that (for instance) N * sizeof (X) > is necessarily big enough for N elements of type X. (Grepping > any code base for the pattern '\* sizeof' and looking for code > that tests that the result doesn't wrap is revealing.) > > When overflow or wrapping does happen (typically because of poor > precondition checking) it often leads to bugs when we end up > allocating less space than we need and use. A simple example > to help illustrate what I mean: > > void* g (int *a, int n) > { > a = realloc (a, n * sizeof (int) + 32); > for (int i = n; i != n + 32; ++i) > a[i] = f (); > } > > In ILP32, if (n > INT_MAX / 4 - 32) holds, n * sizeof(int) will > wrap around zero. The realloc call will end up allocating less > space than expected, and the loop will write past the end of > the allocated block. > > (The bug above can only be detected if we know n's range. > I left that part out.) > > Historically, bugs caused by integer overflow and wrapping have > been among the most serious security weaknesses. Detecting these > mistakes will help prevent some of these. > > The problem is that according to C/C++, nothing in the function > above is undefined except for the buffer overflow in the loop, > and the buffer overflow only happens because of the well-defined > integer wrapping. To detect the wrapping, we either need to do > the computation in as-if infinite math and compare the final result > to the result we get under C's truncating rules, or we need to set > and propagate the "wraparound" bit throughout the computation. Just to add a bit to Martin's note. Yes, an overflow of the size passed to
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 11/24/20 10:44 AM, Andrew MacLeod wrote: On 11/24/20 12:42 PM, Andrew MacLeod wrote: On 11/23/20 4:38 PM, Martin Sebor wrote: On 11/21/20 6:26 AM, Andrew MacLeod wrote: On 11/21/20 12:07 AM, Jeff Law wrote: On 11/9/20 9:00 AM, Martin Sebor wrote: Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html Jeff, I don't expect to have the cycles to reimplement this patch using the Ranger APIs before stage 1 closes. I'm open to giving it a try in stage 3 if it's still in scope for GCC 11. Otherwise, is this patch okay to commit? So all we're going to get from the ranger is ranges of operands, right? Meaning that we still need to either roll our own evaluator (eval_size_vflow) or overload range_for_stmt with our own, which likely looks like eval_size_vflow anyway, right? My hope was to avoid the roll our own evaluator, but that doesn't look like it's in the cards in the reasonably near future. Is there a PR open showing what exactly you are looking for? I'm using open PRs to track enhancement requests, and they will all feed back into the development roadmap I am working on. Not that I know of. The background is upthread, in particular in Aldy's response here: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html I like the suggestion and if/when I have the time I'd like to give it a try. Until then, I think the patch is useful on its own so I'll go with it for now. Longer term, I do hope we can revisit the idea of computing either mathematically correct ranges alongside those required by the language semantics, or tracking signed overflow or unsigned wraparound. E.g., in: void* f (int n) { if (n < INT_MAX / 3) n = INT_MAX / 3; n *= sizeof (int); // n is [(INT_MAX / 3) * 4, INF] mathematically // undefined due to overflow in C // but [INT_MIN, INT_MAX] according to VRP but sizeof returns a size_t.. which is an unsigned. thus the multiply is promoted to an unsigned multiply which means there is lots of wrapping and I don't see how you can conclude those ranges? [INT_MIN, INT_MAX] are all possible outcomes based on the code that is generated. If I change that to n *= (int) sizeof (int) to keep it as signed arithmetic, I see: Folding statement: n_4 = n_1 * 4; EVRP:hybrid: RVRP found singleton 2147483647 Queued stmt for removal. Folds to: 2147483647 evrp visiting stmt _7 = malloc (n_4); extract_range_from_stmt visiting: _7 = foo (n_4); Folding statement: _7 = foo (n_4); EVRP:hybrid: RVRP found singleton 2147483647 Folded into: _7 = malloc (2147483647); So I'm not sure what exactly you want to do? We are calculating what the program can produce? Why do we care about alternative calculations? Or rather, why do we want to do this? When computing the sizes of things, programmers commonly forget to consider unsigned wrapping (or signed overflow). We simply assume it can't happen and that (for instance) N * sizeof (X) is necessarily big enough for N elements of type X. (Grepping any code base for the pattern '\* sizeof' and looking for code that tests that the result doesn't wrap is revealing.) When overflow or wrapping does happen (typically because of poor precondition checking) it often leads to bugs when we end up allocating less space than we need and use. A simple example to help illustrate what I mean: void* g (int *a, int n) { a = realloc (a, n * sizeof (int) + 32); for (int i = n; i != n + 32; ++i) a[i] = f (); } In ILP32, if (n > INT_MAX / 4 - 32) holds, n * sizeof(int) will wrap around zero. The realloc call will end up allocating less space than expected, and the loop will write past the end of the allocated block. (The bug above can only be detected if we know n's range. I left that part out.) Historically, bugs caused by integer overflow and wrapping have been among the most serious security weaknesses. Detecting these mistakes will help prevent some of these. The problem is that according to C/C++, nothing in the function above is undefined except for the buffer overflow in the loop, and the buffer overflow only happens because of the well-defined integer wrapping. To detect the wrapping, we either need to do the computation in as-if infinite math and compare the final result to the result we get under C's truncating rules, or we need to set and propagate the "wraparound" bit throughout the computation. Martin
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 11/24/20 12:42 PM, Andrew MacLeod wrote: On 11/23/20 4:38 PM, Martin Sebor wrote: On 11/21/20 6:26 AM, Andrew MacLeod wrote: On 11/21/20 12:07 AM, Jeff Law wrote: On 11/9/20 9:00 AM, Martin Sebor wrote: Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html Jeff, I don't expect to have the cycles to reimplement this patch using the Ranger APIs before stage 1 closes. I'm open to giving it a try in stage 3 if it's still in scope for GCC 11. Otherwise, is this patch okay to commit? So all we're going to get from the ranger is ranges of operands, right? Meaning that we still need to either roll our own evaluator (eval_size_vflow) or overload range_for_stmt with our own, which likely looks like eval_size_vflow anyway, right? My hope was to avoid the roll our own evaluator, but that doesn't look like it's in the cards in the reasonably near future. Is there a PR open showing what exactly you are looking for? I'm using open PRs to track enhancement requests, and they will all feed back into the development roadmap I am working on. Not that I know of. The background is upthread, in particular in Aldy's response here: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html I like the suggestion and if/when I have the time I'd like to give it a try. Until then, I think the patch is useful on its own so I'll go with it for now. Longer term, I do hope we can revisit the idea of computing either mathematically correct ranges alongside those required by the language semantics, or tracking signed overflow or unsigned wraparound. E.g., in: void* f (int n) { if (n < INT_MAX / 3) n = INT_MAX / 3; n *= sizeof (int); // n is [(INT_MAX / 3) * 4, INF] mathematically // undefined due to overflow in C // but [INT_MIN, INT_MAX] according to VRP but sizeof returns a size_t.. which is an unsigned. thus the multiply is promoted to an unsigned multiply which means there is lots of wrapping and I don't see how you can conclude those ranges? [INT_MIN, INT_MAX] are all possible outcomes based on the code that is generated. If I change that to n *= (int) sizeof (int) to keep it as signed arithmetic, I see: Folding statement: n_4 = n_1 * 4; EVRP:hybrid: RVRP found singleton 2147483647 Queued stmt for removal. Folds to: 2147483647 evrp visiting stmt _7 = malloc (n_4); extract_range_from_stmt visiting: _7 = foo (n_4); Folding statement: _7 = foo (n_4); EVRP:hybrid: RVRP found singleton 2147483647 Folded into: _7 = malloc (2147483647); So I'm not sure what exactly you want to do? We are calculating what the program can produce? Why do we care about alternative calculations? Or rather, why do we want to do this?
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 11/23/20 4:38 PM, Martin Sebor wrote: On 11/21/20 6:26 AM, Andrew MacLeod wrote: On 11/21/20 12:07 AM, Jeff Law wrote: On 11/9/20 9:00 AM, Martin Sebor wrote: Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html Jeff, I don't expect to have the cycles to reimplement this patch using the Ranger APIs before stage 1 closes. I'm open to giving it a try in stage 3 if it's still in scope for GCC 11. Otherwise, is this patch okay to commit? So all we're going to get from the ranger is ranges of operands, right? Meaning that we still need to either roll our own evaluator (eval_size_vflow) or overload range_for_stmt with our own, which likely looks like eval_size_vflow anyway, right? My hope was to avoid the roll our own evaluator, but that doesn't look like it's in the cards in the reasonably near future. Is there a PR open showing what exactly you are looking for? I'm using open PRs to track enhancement requests, and they will all feed back into the development roadmap I am working on. Not that I know of. The background is upthread, in particular in Aldy's response here: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html I like the suggestion and if/when I have the time I'd like to give it a try. Until then, I think the patch is useful on its own so I'll go with it for now. Longer term, I do hope we can revisit the idea of computing either mathematically correct ranges alongside those required by the language semantics, or tracking signed overflow or unsigned wraparound. E.g., in: void* f (int n) { if (n < INT_MAX / 3) n = INT_MAX / 3; n *= sizeof (int); // n is [(INT_MAX / 3) * 4, INF] mathematically // undefined due to overflow in C // but [INT_MIN, INT_MAX] according to VRP but sizeof returns a size_t.. which is an unsigned. thus the multiply is promoted to an unsigned multiply which means there is lots of wrapping and I don't see how you can conclude those ranges? [INT_MIN, INT_MAX] are all possible outcomes based on the code that is generated. If I change that to n *= (int) sizeof (int) to keep it as signed arithmetic, I see: Folding statement: n_4 = n_1 * 4; EVRP:hybrid: RVRP found singleton 2147483647 Queued stmt for removal. Folds to: 2147483647 evrp visiting stmt _7 = malloc (n_4); extract_range_from_stmt visiting: _7 = foo (n_4); Folding statement: _7 = foo (n_4); EVRP:hybrid: RVRP found singleton 2147483647 Folded into: _7 = malloc (2147483647); So I'm not sure what exactly you want to do? We are calculating what the program can produce? Why do we care about alternative calculations? Andrew
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 11/21/20 6:26 AM, Andrew MacLeod wrote: On 11/21/20 12:07 AM, Jeff Law wrote: On 11/9/20 9:00 AM, Martin Sebor wrote: Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html Jeff, I don't expect to have the cycles to reimplement this patch using the Ranger APIs before stage 1 closes. I'm open to giving it a try in stage 3 if it's still in scope for GCC 11. Otherwise, is this patch okay to commit? So all we're going to get from the ranger is ranges of operands, right? Meaning that we still need to either roll our own evaluator (eval_size_vflow) or overload range_for_stmt with our own, which likely looks like eval_size_vflow anyway, right? My hope was to avoid the roll our own evaluator, but that doesn't look like it's in the cards in the reasonably near future. Is there a PR open showing what exactly you are looking for? I'm using open PRs to track enhancement requests, and they will all feed back into the development roadmap I am working on. Not that I know of. The background is upthread, in particular in Aldy's response here: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html I like the suggestion and if/when I have the time I'd like to give it a try. Until then, I think the patch is useful on its own so I'll go with it for now. Longer term, I do hope we can revisit the idea of computing either mathematically correct ranges alongside those required by the language semantics, or tracking signed overflow or unsigned wraparound. E.g., in: void* f (int n) { if (n < INT_MAX / 3) n = INT_MAX / 3; n *= sizeof (int); // n is [(INT_MAX / 3) * 4, INF] mathematically // undefined due to overflow in C // but [INT_MIN, INT_MAX] according to VRP return malloc (n); } and in void* g (unsigned n) { if (n < UINT_MAX / 3) n = UINT_MAX / 3; n *= sizeof (int); // n is [(UINT_MAX / 3) * 4, INF] mathematically // but [0, UINT_MAX] in C return malloc (n); } In neither case is the overflow (or wraparound) easy to detect without the Ranger's help. Martin If my summary of the current state is correct, then I'd suggest we go with the patch as-is. If we want to override eval_size_vflow in the future, we can still do that. And if we want to replace the call to determine_value_range with a ranger API, that seems like a fairly straightforward change to make in gcc-12. Jeff
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 11/21/20 12:07 AM, Jeff Law wrote: On 11/9/20 9:00 AM, Martin Sebor wrote: Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html Jeff, I don't expect to have the cycles to reimplement this patch using the Ranger APIs before stage 1 closes. I'm open to giving it a try in stage 3 if it's still in scope for GCC 11. Otherwise, is this patch okay to commit? So all we're going to get from the ranger is ranges of operands, right? Meaning that we still need to either roll our own evaluator (eval_size_vflow) or overload range_for_stmt with our own, which likely looks like eval_size_vflow anyway, right? My hope was to avoid the roll our own evaluator, but that doesn't look like it's in the cards in the reasonably near future. Is there a PR open showing what exactly you are looking for? I'm using open PRs to track enhancement requests, and they will all feed back into the development roadmap I am working on. If my summary of the current state is correct, then I'd suggest we go with the patch as-is. If we want to override eval_size_vflow in the future, we can still do that. And if we want to replace the call to determine_value_range with a ranger API, that seems like a fairly straightforward change to make in gcc-12. Jeff
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 11/9/20 9:00 AM, Martin Sebor wrote: > Ping: > https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html > > Jeff, I don't expect to have the cycles to reimplement this patch > using the Ranger APIs before stage 1 closes. I'm open to giving > it a try in stage 3 if it's still in scope for GCC 11. Otherwise, > is this patch okay to commit? So all we're going to get from the ranger is ranges of operands, right? Meaning that we still need to either roll our own evaluator (eval_size_vflow) or overload range_for_stmt with our own, which likely looks like eval_size_vflow anyway, right? My hope was to avoid the roll our own evaluator, but that doesn't look like it's in the cards in the reasonably near future. If my summary of the current state is correct, then I'd suggest we go with the patch as-is. If we want to override eval_size_vflow in the future, we can still do that. And if we want to replace the call to determine_value_range with a ranger API, that seems like a fairly straightforward change to make in gcc-12. Jeff
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html Jeff, I don't expect to have the cycles to reimplement this patch using the Ranger APIs before stage 1 closes. I'm open to giving it a try in stage 3 if it's still in scope for GCC 11. Otherwise, is this patch okay to commit? On 9/21/20 9:13 AM, Martin Sebor wrote: On 9/20/20 12:39 AM, Aldy Hernandez wrote: On 9/19/20 11:22 PM, Martin Sebor wrote: On 9/18/20 12:29 AM, Aldy Hernandez wrote: On 9/17/20 10:18 PM, Martin Sebor wrote: On 9/17/20 12:39 PM, Andrew MacLeod wrote: On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote: On 9/16/20 9:23 PM, Jeff Law wrote: On 9/15/20 1:47 PM, Martin Sebor wrote: Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin gcc-96838.diff PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. Do we know if it did/would have wrapped? sure. since we have to do the math. so you are correct in that the information is there. but is it useful? We are in the very annoying habit of subtracting one by adding 0xFFF. which means you get an overflow for unsigned when you subtract one. From what I have seen of unsigned math, we would be flagging very many operations as overflows, so you would still have the difficulty of figuring out whether its a "real" overflow or a fake one because of the way we do unsigned math You and me both :) At the very start, I did have an overflow flag in the range class... but it was turning out to be fairly useless so it was removed. . I agree that being able to evaluate an expression in an as-if infinite precision (in addition to its type) would be helpful. SO again, we get back to adding 0x0f when we are trying to subtract one... now, with infinite precision you are going to see [2,10] - 1 we end up with [2,10]+0xFF, which will now give you [0x10001, 0x10009] so its going to look like it overflowed? But just to make sure I understood correctly, let me ask again using an example: void* f (size_t n) { if (n < PTRDIFF_MAX / 2) n = PTRDIFF_MAX / 2; return malloc (n * sizeof (int)); } Can the unsigned wraparound in the argument be readily detected? On trunk, this ends up with the following: # RANGE [4611686018427387903, 18446744073709551615] _6 = MAX_EXPR ; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _1 = _6 * 4; ... p_5 = mallocD.1206 (_1); [tail call] ... return p_5; so _1's range reflects the wraparound in size_t, but _6's range has enough information to uncover it. So detecting it is possible and is done in the patch so we get a warning: warning: argument 1 range [18446744073709551612, 0x3fffc] is too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=] 6 | return malloc (n * sizeof (int)); | ^ The code is very simplistic and only handles a small subset of cases. It could be generalized and exposed by a more generic API but it does seem like the ranger must already have all the logic built into it so if it isn't exposed now it should be a matter of opening it up. everything is exposed in range-ops. well, mostly. if we have _1 = _6 * 4 if one wanted to do that infinite precision, you query the range for _6, and the range for 4 (which would be [4,4] :-) range_of_expr (op1r, _6, stmt) range_of_expr (op2r, 4, stmt) you could take their current types, and cast those ranges to whatever the next higher precsion is, range_cast (op1r, highertype) range_cast
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 9/20/20 12:39 AM, Aldy Hernandez wrote: On 9/19/20 11:22 PM, Martin Sebor wrote: On 9/18/20 12:29 AM, Aldy Hernandez wrote: On 9/17/20 10:18 PM, Martin Sebor wrote: On 9/17/20 12:39 PM, Andrew MacLeod wrote: On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote: On 9/16/20 9:23 PM, Jeff Law wrote: On 9/15/20 1:47 PM, Martin Sebor wrote: Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin gcc-96838.diff PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. Do we know if it did/would have wrapped? sure. since we have to do the math. so you are correct in that the information is there. but is it useful? We are in the very annoying habit of subtracting one by adding 0xFFF. which means you get an overflow for unsigned when you subtract one. From what I have seen of unsigned math, we would be flagging very many operations as overflows, so you would still have the difficulty of figuring out whether its a "real" overflow or a fake one because of the way we do unsigned math You and me both :) At the very start, I did have an overflow flag in the range class... but it was turning out to be fairly useless so it was removed. . I agree that being able to evaluate an expression in an as-if infinite precision (in addition to its type) would be helpful. SO again, we get back to adding 0x0f when we are trying to subtract one... now, with infinite precision you are going to see [2,10] - 1 we end up with [2,10]+0xFF, which will now give you [0x10001, 0x10009] so its going to look like it overflowed? But just to make sure I understood correctly, let me ask again using an example: void* f (size_t n) { if (n < PTRDIFF_MAX / 2) n = PTRDIFF_MAX / 2; return malloc (n * sizeof (int)); } Can the unsigned wraparound in the argument be readily detected? On trunk, this ends up with the following: # RANGE [4611686018427387903, 18446744073709551615] _6 = MAX_EXPR ; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _1 = _6 * 4; ... p_5 = mallocD.1206 (_1); [tail call] ... return p_5; so _1's range reflects the wraparound in size_t, but _6's range has enough information to uncover it. So detecting it is possible and is done in the patch so we get a warning: warning: argument 1 range [18446744073709551612, 0x3fffc] is too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=] 6 | return malloc (n * sizeof (int)); | ^ The code is very simplistic and only handles a small subset of cases. It could be generalized and exposed by a more generic API but it does seem like the ranger must already have all the logic built into it so if it isn't exposed now it should be a matter of opening it up. everything is exposed in range-ops. well, mostly. if we have _1 = _6 * 4 if one wanted to do that infinite precision, you query the range for _6, and the range for 4 (which would be [4,4] :-) range_of_expr (op1r, _6, stmt) range_of_expr (op2r, 4, stmt) you could take their current types, and cast those ranges to whatever the next higher precsion is, range_cast (op1r, highertype) range_cast (op2r, highertype) then invoke the operation on those parameters gimple_range_fold (r, stmt, op1r, op2r) and that will do your operation in the higher precision. you could compare that to the value in regular precision too i suppose. The patch does pretty much exactly what you described, except in offset_int, and only for a limited set of arit
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 9/19/20 11:22 PM, Martin Sebor wrote: On 9/18/20 12:29 AM, Aldy Hernandez wrote: On 9/17/20 10:18 PM, Martin Sebor wrote: On 9/17/20 12:39 PM, Andrew MacLeod wrote: On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote: On 9/16/20 9:23 PM, Jeff Law wrote: On 9/15/20 1:47 PM, Martin Sebor wrote: Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin gcc-96838.diff PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. Do we know if it did/would have wrapped? sure. since we have to do the math. so you are correct in that the information is there. but is it useful? We are in the very annoying habit of subtracting one by adding 0xFFF. which means you get an overflow for unsigned when you subtract one. From what I have seen of unsigned math, we would be flagging very many operations as overflows, so you would still have the difficulty of figuring out whether its a "real" overflow or a fake one because of the way we do unsigned math You and me both :) At the very start, I did have an overflow flag in the range class... but it was turning out to be fairly useless so it was removed. . I agree that being able to evaluate an expression in an as-if infinite precision (in addition to its type) would be helpful. SO again, we get back to adding 0x0f when we are trying to subtract one... now, with infinite precision you are going to see [2,10] - 1 we end up with [2,10]+0xFF, which will now give you [0x10001, 0x10009] so its going to look like it overflowed? But just to make sure I understood correctly, let me ask again using an example: void* f (size_t n) { if (n < PTRDIFF_MAX / 2) n = PTRDIFF_MAX / 2; return malloc (n * sizeof (int)); } Can the unsigned wraparound in the argument be readily detected? On trunk, this ends up with the following: # RANGE [4611686018427387903, 18446744073709551615] _6 = MAX_EXPR ; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _1 = _6 * 4; ... p_5 = mallocD.1206 (_1); [tail call] ... return p_5; so _1's range reflects the wraparound in size_t, but _6's range has enough information to uncover it. So detecting it is possible and is done in the patch so we get a warning: warning: argument 1 range [18446744073709551612, 0x3fffc] is too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=] 6 | return malloc (n * sizeof (int)); | ^ The code is very simplistic and only handles a small subset of cases. It could be generalized and exposed by a more generic API but it does seem like the ranger must already have all the logic built into it so if it isn't exposed now it should be a matter of opening it up. everything is exposed in range-ops. well, mostly. if we have _1 = _6 * 4 if one wanted to do that infinite precision, you query the range for _6, and the range for 4 (which would be [4,4] :-) range_of_expr (op1r, _6, stmt) range_of_expr (op2r, 4, stmt) you could take their current types, and cast those ranges to whatever the next higher precsion is, range_cast (op1r, highertype) range_cast (op2r, highertype) then invoke the operation on those parameters gimple_range_fold (r, stmt, op1r, op2r) and that will do your operation in the higher precision. you could compare that to the value in regular precision too i suppose. The patch does pretty much exactly what you described, except in offset_int, and only for a limited set of arithmetic operations. It figures out if an unsi
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 9/18/20 12:29 AM, Aldy Hernandez wrote: On 9/17/20 10:18 PM, Martin Sebor wrote: On 9/17/20 12:39 PM, Andrew MacLeod wrote: On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote: On 9/16/20 9:23 PM, Jeff Law wrote: On 9/15/20 1:47 PM, Martin Sebor wrote: Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin gcc-96838.diff PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. Do we know if it did/would have wrapped? sure. since we have to do the math. so you are correct in that the information is there. but is it useful? We are in the very annoying habit of subtracting one by adding 0xFFF. which means you get an overflow for unsigned when you subtract one. From what I have seen of unsigned math, we would be flagging very many operations as overflows, so you would still have the difficulty of figuring out whether its a "real" overflow or a fake one because of the way we do unsigned math You and me both :) At the very start, I did have an overflow flag in the range class... but it was turning out to be fairly useless so it was removed. . I agree that being able to evaluate an expression in an as-if infinite precision (in addition to its type) would be helpful. SO again, we get back to adding 0x0f when we are trying to subtract one... now, with infinite precision you are going to see [2,10] - 1 we end up with [2,10]+0xFF, which will now give you [0x10001, 0x10009] so its going to look like it overflowed? But just to make sure I understood correctly, let me ask again using an example: void* f (size_t n) { if (n < PTRDIFF_MAX / 2) n = PTRDIFF_MAX / 2; return malloc (n * sizeof (int)); } Can the unsigned wraparound in the argument be readily detected? On trunk, this ends up with the following: # RANGE [4611686018427387903, 18446744073709551615] _6 = MAX_EXPR ; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _1 = _6 * 4; ... p_5 = mallocD.1206 (_1); [tail call] ... return p_5; so _1's range reflects the wraparound in size_t, but _6's range has enough information to uncover it. So detecting it is possible and is done in the patch so we get a warning: warning: argument 1 range [18446744073709551612, 0x3fffc] is too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=] 6 | return malloc (n * sizeof (int)); | ^ The code is very simplistic and only handles a small subset of cases. It could be generalized and exposed by a more generic API but it does seem like the ranger must already have all the logic built into it so if it isn't exposed now it should be a matter of opening it up. everything is exposed in range-ops. well, mostly. if we have _1 = _6 * 4 if one wanted to do that infinite precision, you query the range for _6, and the range for 4 (which would be [4,4] :-) range_of_expr (op1r, _6, stmt) range_of_expr (op2r, 4, stmt) you could take their current types, and cast those ranges to whatever the next higher precsion is, range_cast (op1r, highertype) range_cast (op2r, highertype) then invoke the operation on those parameters gimple_range_fold (r, stmt, op1r, op2r) and that will do your operation in the higher precision. you could compare that to the value in regular precision too i suppose. The patch does pretty much exactly what you described, except in offset_int, and only for a limited set of arithmetic operations. It figures out if an unsigned expression wrapped simply by checking to
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 9/17/20 10:18 PM, Martin Sebor wrote: On 9/17/20 12:39 PM, Andrew MacLeod wrote: On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote: On 9/16/20 9:23 PM, Jeff Law wrote: On 9/15/20 1:47 PM, Martin Sebor wrote: Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin gcc-96838.diff PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. Do we know if it did/would have wrapped? sure. since we have to do the math. so you are correct in that the information is there. but is it useful? We are in the very annoying habit of subtracting one by adding 0xFFF. which means you get an overflow for unsigned when you subtract one. From what I have seen of unsigned math, we would be flagging very many operations as overflows, so you would still have the difficulty of figuring out whether its a "real" overflow or a fake one because of the way we do unsigned math You and me both :) At the very start, I did have an overflow flag in the range class... but it was turning out to be fairly useless so it was removed. . I agree that being able to evaluate an expression in an as-if infinite precision (in addition to its type) would be helpful. SO again, we get back to adding 0x0f when we are trying to subtract one... now, with infinite precision you are going to see [2,10] - 1 we end up with [2,10]+0xFF, which will now give you [0x10001, 0x10009] so its going to look like it overflowed? But just to make sure I understood correctly, let me ask again using an example: void* f (size_t n) { if (n < PTRDIFF_MAX / 2) n = PTRDIFF_MAX / 2; return malloc (n * sizeof (int)); } Can the unsigned wraparound in the argument be readily detected? On trunk, this ends up with the following: # RANGE [4611686018427387903, 18446744073709551615] _6 = MAX_EXPR ; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _1 = _6 * 4; ... p_5 = mallocD.1206 (_1); [tail call] ... return p_5; so _1's range reflects the wraparound in size_t, but _6's range has enough information to uncover it. So detecting it is possible and is done in the patch so we get a warning: warning: argument 1 range [18446744073709551612, 0x3fffc] is too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=] 6 | return malloc (n * sizeof (int)); | ^ The code is very simplistic and only handles a small subset of cases. It could be generalized and exposed by a more generic API but it does seem like the ranger must already have all the logic built into it so if it isn't exposed now it should be a matter of opening it up. everything is exposed in range-ops. well, mostly. if we have _1 = _6 * 4 if one wanted to do that infinite precision, you query the range for _6, and the range for 4 (which would be [4,4] :-) range_of_expr (op1r, _6, stmt) range_of_expr (op2r, 4, stmt) you could take their current types, and cast those ranges to whatever the next higher precsion is, range_cast (op1r, highertype) range_cast (op2r, highertype) then invoke the operation on those parameters gimple_range_fold (r, stmt, op1r, op2r) and that will do your operation in the higher precision. you could compare that to the value in regular precision too i suppose. The patch does pretty much exactly what you described, except in offset_int, and only for a limited set of arithmetic operations. It figures out if an unsigned expression wrapped simply by checking to see if the mathematically correct result f
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 9/17/20 12:39 PM, Andrew MacLeod wrote: On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote: On 9/16/20 9:23 PM, Jeff Law wrote: On 9/15/20 1:47 PM, Martin Sebor wrote: Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin gcc-96838.diff PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. Do we know if it did/would have wrapped? sure. since we have to do the math. so you are correct in that the information is there. but is it useful? We are in the very annoying habit of subtracting one by adding 0xFFF. which means you get an overflow for unsigned when you subtract one. From what I have seen of unsigned math, we would be flagging very many operations as overflows, so you would still have the difficulty of figuring out whether its a "real" overflow or a fake one because of the way we do unsigned math You and me both :) At the very start, I did have an overflow flag in the range class... but it was turning out to be fairly useless so it was removed. . I agree that being able to evaluate an expression in an as-if infinite precision (in addition to its type) would be helpful. SO again, we get back to adding 0x0f when we are trying to subtract one... now, with infinite precision you are going to see [2,10] - 1 we end up with [2,10]+0xFF, which will now give you [0x10001, 0x10009] so its going to look like it overflowed? But just to make sure I understood correctly, let me ask again using an example: void* f (size_t n) { if (n < PTRDIFF_MAX / 2) n = PTRDIFF_MAX / 2; return malloc (n * sizeof (int)); } Can the unsigned wraparound in the argument be readily detected? On trunk, this ends up with the following: # RANGE [4611686018427387903, 18446744073709551615] _6 = MAX_EXPR ; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _1 = _6 * 4; ... p_5 = mallocD.1206 (_1); [tail call] ... return p_5; so _1's range reflects the wraparound in size_t, but _6's range has enough information to uncover it. So detecting it is possible and is done in the patch so we get a warning: warning: argument 1 range [18446744073709551612, 0x3fffc] is too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=] 6 | return malloc (n * sizeof (int)); | ^ The code is very simplistic and only handles a small subset of cases. It could be generalized and exposed by a more generic API but it does seem like the ranger must already have all the logic built into it so if it isn't exposed now it should be a matter of opening it up. everything is exposed in range-ops. well, mostly. if we have _1 = _6 * 4 if one wanted to do that infinite precision, you query the range for _6, and the range for 4 (which would be [4,4] :-) range_of_expr (op1r, _6, stmt) range_of_expr (op2r, 4, stmt) you could take their current types, and cast those ranges to whatever the next higher precsion is, range_cast (op1r, highertype) range_cast (op2r, highertype) then invoke the operation on those parameters gimple_range_fold (r, stmt, op1r, op2r) and that will do your operation in the higher precision. you could compare that to the value in regular precision too i suppose. The patch does pretty much exactly what you described, except in offset_int, and only for a limited set of arithmetic operations. It figures out if an unsigned expression wrapped simply by checking to see if the mathematically correct result fits in its type. It sounds like I should be a
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 9/17/20 12:08 PM, Martin Sebor via Gcc-patches wrote: On 9/16/20 9:23 PM, Jeff Law wrote: On 9/15/20 1:47 PM, Martin Sebor wrote: Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin gcc-96838.diff PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. Do we know if it did/would have wrapped? sure. since we have to do the math. so you are correct in that the information is there. but is it useful? We are in the very annoying habit of subtracting one by adding 0xFFF. which means you get an overflow for unsigned when you subtract one. From what I have seen of unsigned math, we would be flagging very many operations as overflows, so you would still have the difficulty of figuring out whether its a "real" overflow or a fake one because of the way we do unsigned math At the very start, I did have an overflow flag in the range class... but it was turning out to be fairly useless so it was removed. . I agree that being able to evaluate an expression in an as-if infinite precision (in addition to its type) would be helpful. SO again, we get back to adding 0x0f when we are trying to subtract one... now, with infinite precision you are going to see [2,10] - 1 we end up with [2,10]+0xFF, which will now give you [0x10001, 0x10009] so its going to look like it overflowed? But just to make sure I understood correctly, let me ask again using an example: void* f (size_t n) { if (n < PTRDIFF_MAX / 2) n = PTRDIFF_MAX / 2; return malloc (n * sizeof (int)); } Can the unsigned wraparound in the argument be readily detected? On trunk, this ends up with the following: # RANGE [4611686018427387903, 18446744073709551615] _6 = MAX_EXPR ; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _1 = _6 * 4; ... p_5 = mallocD.1206 (_1); [tail call] ... return p_5; so _1's range reflects the wraparound in size_t, but _6's range has enough information to uncover it. So detecting it is possible and is done in the patch so we get a warning: warning: argument 1 range [18446744073709551612, 0x3fffc] is too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=] 6 | return malloc (n * sizeof (int)); | ^ The code is very simplistic and only handles a small subset of cases. It could be generalized and exposed by a more generic API but it does seem like the ranger must already have all the logic built into it so if it isn't exposed now it should be a matter of opening it up. everything is exposed in range-ops. well, mostly. if we have _1 = _6 * 4 if one wanted to do that infinite precision, you query the range for _6, and the range for 4 (which would be [4,4] :-) range_of_expr (op1r, _6, stmt) range_of_expr (op2r, 4, stmt) you could take their current types, and cast those ranges to whatever the next higher precsion is, range_cast (op1r, highertype) range_cast (op2r, highertype) then invoke the operation on those parameters gimple_range_fold (r, stmt, op1r, op2r) and that will do your operation in the higher precision. you could compare that to the value in regular precision too i suppose. The ranger is designed to track ranges as they are represented in the IL. You are asking for us to interpret the IL as something other than what is there, and increase the precision. Thats a different task. could that be done? maybe. It might involve parallel tracking of ssa-Name ranges in a higher precision... and recalculating every express
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 9/16/20 9:23 PM, Jeff Law wrote: On 9/15/20 1:47 PM, Martin Sebor wrote: Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin gcc-96838.diff PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. I agree that being able to evaluate an expression in an as-if infinite precision (in addition to its type) would be helpful. I mentioned it to Aldy in response to the irange best practices document. He says there's no way to do that and no plans to make it possible. But just to make sure I understood correctly, let me ask again using an example: void* f (size_t n) { if (n < PTRDIFF_MAX / 2) n = PTRDIFF_MAX / 2; return malloc (n * sizeof (int)); } Can the unsigned wraparound in the argument be readily detected? On trunk, this ends up with the following: # RANGE [4611686018427387903, 18446744073709551615] _6 = MAX_EXPR ; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _1 = _6 * 4; ... p_5 = mallocD.1206 (_1); [tail call] ... return p_5; so _1's range reflects the wraparound in size_t, but _6's range has enough information to uncover it. So detecting it is possible and is done in the patch so we get a warning: warning: argument 1 range [18446744073709551612, 0x3fffc] is too large to represent in ‘long unsigned int’ [-Walloc-size-larger-than=] 6 | return malloc (n * sizeof (int)); | ^ The code is very simplistic and only handles a small subset of cases. It could be generalized and exposed by a more generic API but it does seem like the ranger must already have all the logic built into it so if it isn't exposed now it should be a matter of opening it up. Martin
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 9/15/20 1:47 PM, Martin Sebor wrote: > Overflowing the size of a dynamic allocation (e.g., malloc or VLA) > can lead to a subsequent buffer overflow corrupting the heap or > stack. The attached patch diagnoses a subset of these cases where > the overflow/wraparound is still detectable. > > Besides regtesting GCC on x86_64-linux I also verified the warning > doesn't introduce any false positives into Glibc or Binutils/GDB > builds on the same target. > > Martin > > gcc-96838.diff > > PR middle-end/96838 - missing warning on integer overflow in calls to > allocation functions > > gcc/ChangeLog: > > PR middle-end/96838 > * calls.c (eval_size_vflow): New function. > (get_size_range): Call it. Add argument. > (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. > * calls.h (get_size_range): Add argument. > > gcc/testsuite/ChangeLog: > > PR middle-end/96838 > * gcc.dg/Walloc-size-larger-than-19.c: New test. > * gcc.dg/Walloc-size-larger-than-20.c: New test. If an attacker can control an integer overflow that feeds an allocation, then they can do all kinds of bad things. In fact, when my son was asking me attack vectors, this is one I said I'd look at if I were a bad guy. I'm a bit surprised you can't just query the range of the argument and get the overflow status out of that range, but I don't see that in the APIs. How painful would it be to make that part of the API? The conceptual model would be to just ask for the range of the argument to malloc which would include the range and a status bit indicating the computation might have overflowed. jeff pEpkey.asc Description: application/pgp-keys
Re: [PATCH] warn for integer overflow in allocation calls (PR 96838)
On 15 September 2020 21:47:46 CEST, Martin Sebor via Gcc-patches wrote: >Overflowing the size of a dynamic allocation (e.g., malloc or VLA) >can lead to a subsequent buffer overflow corrupting the heap or >stack. The attached patch diagnoses a subset of these cases where >the overflow/wraparound is still detectable. > >Besides regtesting GCC on x86_64-linux I also verified the warning >doesn't introduce any false positives into Glibc or Binutils/GDB >builds on the same target. +/* Try to evaluate the artithmetic EXPresssion representing the size of s/EXPresssion/expression EXP/ You had a bit more s than strictly necessary.. thanks, > >Martin
[PATCH] warn for integer overflow in allocation calls (PR 96838)
Overflowing the size of a dynamic allocation (e.g., malloc or VLA) can lead to a subsequent buffer overflow corrupting the heap or stack. The attached patch diagnoses a subset of these cases where the overflow/wraparound is still detectable. Besides regtesting GCC on x86_64-linux I also verified the warning doesn't introduce any false positives into Glibc or Binutils/GDB builds on the same target. Martin PR middle-end/96838 - missing warning on integer overflow in calls to allocation functions gcc/ChangeLog: PR middle-end/96838 * calls.c (eval_size_vflow): New function. (get_size_range): Call it. Add argument. (maybe_warn_alloc_args_overflow): Diagnose overflow/wraparound. * calls.h (get_size_range): Add argument. gcc/testsuite/ChangeLog: PR middle-end/96838 * gcc.dg/Walloc-size-larger-than-19.c: New test. * gcc.dg/Walloc-size-larger-than-20.c: New test. diff --git a/gcc/calls.c b/gcc/calls.c index 8ac94db6817..a5acff301e0 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1237,6 +1237,139 @@ alloc_max_size (void) return alloc_object_size_limit; } +/* Try to evaluate the artithmetic EXPresssion representing the size of + an object in the widest possible type and set RANGE[] to the result. + Return the overflow status for the type of the expression (which may + be OVF_UNKNOWN if it cannot be determined from the ranges of its + operands). Used to detect calls to allocation functions with + an argument that either overflows or wraps around zero. */ + +static wi::overflow_type +eval_size_vflow (tree exp, wide_int range[2]) +{ + const int prec = WIDE_INT_MAX_PRECISION; + + if (TREE_CODE (exp) == INTEGER_CST) +{ + range[0] = range[1] = wi::to_wide (exp, prec); + return wi::OVF_NONE; +} + + if (TREE_CODE (exp) != SSA_NAME) +return wi::OVF_UNKNOWN; + + gimple *def = SSA_NAME_DEF_STMT (exp); + if (!is_gimple_assign (def)) +return wi::OVF_UNKNOWN; + + tree_code code = gimple_assign_rhs_code (def); + tree optype = NULL_TREE; + wide_int op1r[2], op2r[2]; + if (code == MULT_EXPR + || code == MINUS_EXPR + || code == PLUS_EXPR) +{ + /* Ignore the overflow on the operands. */ + tree rhs1 = gimple_assign_rhs1 (def); + wi::overflow_type ovf = eval_size_vflow (rhs1, op1r); + if (ovf == wi::OVF_UNKNOWN) + return ovf; + + optype = TREE_TYPE (rhs1); + tree rhs2 = gimple_assign_rhs2 (def); + ovf = eval_size_vflow (rhs2, op2r); + if (ovf == wi::OVF_UNKNOWN) + return ovf; + + if (code == PLUS_EXPR + && TYPE_UNSIGNED (optype) + && TREE_CODE (rhs2) == INTEGER_CST) + { + /* A - CST is transformed into A + (-CST). Undo that to avoid + false reports of overflow (this means overflow due to very + large constants in the source code isn't detected.) */ + tree sgn_type = signed_type_for (optype); + tree max = TYPE_MAX_VALUE (sgn_type); + wide_int smax = wi::to_wide (max, prec); + if (wi::ltu_p (smax, op2r[0])) + { + op2r[0] = wi::neg (wi::sub (op2r[0], smax, UNSIGNED, &ovf)); + op2r[1] = op2r[0]; + } + } +} + else if (code == NOP_EXPR) +{ + /* Strip (implicit) conversions. Explicit conversions are stripped + as well which may result in reporting overflow despite a cast. + Those cases should be rare. */ + tree rhs1 = gimple_assign_rhs1 (def); + wi::overflow_type ovf = eval_size_vflow (rhs1, op1r); + if (ovf == wi::OVF_UNKNOWN) + return ovf; + optype = TREE_TYPE (rhs1); +} + else +{ + wide_int min, max; + if (determine_value_range (exp, &min, &max) != VR_RANGE) + return wi::OVF_UNKNOWN; + optype = TREE_TYPE (exp); + op1r[0] = wide_int::from (min, prec, TYPE_SIGN (optype)); + op1r[1] = wide_int::from (max, prec, TYPE_SIGN (optype)); +} + + wide_int umax = wi::to_wide (TYPE_MAX_VALUE (optype), prec); + tree sgn_type = signed_type_for (optype); + wide_int smax = wi::to_wide (TYPE_MAX_VALUE (sgn_type), prec); + + wi::overflow_type ovf = wi::OVF_NONE; + if (code == MULT_EXPR) +{ + /* Compute the upper bound of the result first, discarding any + overflow. Only the overflow in the lower bound matters. */ + range[1] = wi::mul (op1r[1], op2r[1], UNSIGNED, &ovf); + range[0] = wi::mul (op1r[0], op2r[0], UNSIGNED, &ovf); +} + else if (code == MINUS_EXPR) +{ + range[1] = wi::sub (op1r[1], op2r[1], UNSIGNED, &ovf); + range[0] = wi::sub (op1r[0], op2r[0], UNSIGNED, &ovf); +} + else if (code == PLUS_EXPR) +{ + signop sgn = UNSIGNED; + if (op2r[0] == op2r[1] && wi::ltu_p (smax, op2r[0])) + sgn = SIGNED; + + range[1] = wi::add (op1r[1], op2r[1], sgn, &ovf); + range[0] = wi::add (op1r[0], op2r[0], sgn, &ovf); +} + else +{ + range[0] = op1r[0]; + range[1] = op1r[1]; +} + + if (ovf != wi::OVF_NONE) +return ovf; + + /* Nothing can be determined from a range that spans zero. + TO DO: Assume a range with a negative