On 11/24/20 11:39 AM, Martin Sebor wrote:
> On 11/24/20 10:44 AM, Andrew MacLeod wrote:
>> On 11/24/20 12:42 PM, Andrew MacLeod wrote:
>>> On 11/23/20 4:38 PM, Martin Sebor wrote:
>>>> On 11/21/20 6:26 AM, Andrew MacLeod wrote:
>>>>> On 11/21/20 12:07 AM, Jeff Law wrote:
>>>>>>
>>>>>> On 11/9/20 9:00 AM, Martin Sebor wrote:
>>>>>>> Ping:
>>>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554000.html
>>>>>>>
>>>>>>>
>>>>>>> Jeff, I don't expect to have the cycles to reimplement this patch
>>>>>>> using the Ranger APIs before stage 1 closes.  I'm open to giving
>>>>>>> it a try in stage 3 if it's still in scope for GCC 11. Otherwise,
>>>>>>> is this patch okay to commit?
>>>>>> So all we're going to get from the ranger is ranges of operands,
>>>>>> right?
>>>>>> Meaning that we still need to either roll our own evaluator
>>>>>> (eval_size_vflow) or overload range_for_stmt with our own, which
>>>>>> likely
>>>>>> looks like eval_size_vflow anyway, right?
>>>>>>
>>>>>> My hope was to avoid the roll our own evaluator, but that doesn't
>>>>>> look
>>>>>> like it's in the cards in the reasonably near future.
>>>>>
>>>>> Is there a PR open showing what exactly you are looking for?
>>>>> I'm using open PRs to track enhancement requests, and they will
>>>>> all feed back into the development roadmap  I am working on.
>>>>
>>>> Not that I know of.  The background is upthread, in particular in
>>>> Aldy's response here:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554242.html
>>>>
>>>> I like the suggestion and if/when I have the time I'd like to give
>>>> it a try.  Until then, I think the patch is useful on its own so
>>>> I'll go with it for now.
>>>>
>>>> Longer term, I do hope we can revisit the idea of computing either
>>>> mathematically correct ranges alongside those required by the language
>>>> semantics, or tracking signed overflow or unsigned wraparound. E.g.,
>>>> in:
>>>>
>>>>   void* f (int n)
>>>>   {
>>>>     if (n < INT_MAX / 3)
>>>>       n = INT_MAX / 3;
>>>>
>>>>     n *= sizeof (int);
>>>>     // n is [(INT_MAX / 3) * 4, INF] mathematically
>>>>     // undefined due to overflow in C
>>>>     // but [INT_MIN, INT_MAX] according to VRP
>>>
>>> but sizeof returns a size_t.. which is an unsigned. thus the
>>> multiply is promoted to an unsigned multiply  which means there is
>>> lots of wrapping and I don't see how you can conclude those ranges?
>>> [INT_MIN, INT_MAX] are all possible outcomes based on the code that
>>> is generated.
>>>
>>> If I change that to
>>>   n *= (int) sizeof (int) to keep it as signed arithmetic, I see:
>>>
>>>
>>> Folding statement: n_4 = n_1 * 4;
>>> EVRP:hybrid: RVRP found singleton 2147483647
>>> Queued stmt for removal.  Folds to: 2147483647
>>> evrp visiting stmt _7 = malloc (n_4);
>>>
>>> extract_range_from_stmt visiting:
>>> _7 = foo (n_4);
>>> Folding statement: _7 = foo (n_4);
>>> EVRP:hybrid: RVRP found singleton 2147483647
>>> Folded into: _7 = malloc (2147483647);
>>>
>>> So I'm not sure what exactly you want to do?  We are calculating
>>> what the program can produce?
>>>
>>> Why do we care about alternative calculations?
>>>
>> Or rather, why do we want to do this?
>
> When computing the sizes of things, programmers commonly forget
> to consider unsigned wrapping (or signed overflow).  We simply
> assume it can't happen and that (for instance) N * sizeof (X)
> is necessarily big enough for N elements of type X.  (Grepping
> any code base for the pattern '\* sizeof' and looking for code
> that tests that the result doesn't wrap is revealing.)
>
> When overflow or wrapping does happen (typically because of poor
> precondition checking) it often leads to bugs when we end up
> allocating less space than we need and use.  A simple example
> to help illustrate what I mean:
>
>   void* g (int *a, int n)
>   {
>     a = realloc (a, n * sizeof (int) + 32);
>     for (int i = n; i != n + 32; ++i)
>       a[i] = f ();
>   }
>
> In ILP32, if (n > INT_MAX / 4 - 32) holds, n * sizeof(int) will
> wrap around zero.  The realloc call will end up allocating less
> space than expected, and the loop will write past the end of
> the allocated block.
>
> (The bug above can only be detected if we know n's range.
> I left that part out.)
>
> Historically, bugs caused by integer overflow and wrapping have
> been among the most serious security weaknesses.  Detecting these
> mistakes will help prevent some of these.
>
> The problem is that according to C/C++, nothing in the function
> above is undefined except for the buffer overflow in the loop,
> and the buffer overflow only happens because of the well-defined
> integer wrapping.  To detect the wrapping, we either need to do
> the computation in as-if infinite math and compare the final result
> to the result we get under C's truncating rules, or we need to set
> and propagate the "wraparound" bit throughout the computation.
Just to add a bit to Martin's note.  Yes, an overflow of the size passed
to an allocation routine is a significant security risk and these have
been actively exploited through the years.  In simpest terms of an
attacker can control the argument to malloc, then they can make
subsequent writes scribble into fairly arbitrary chunks of memory.

There was a great example a while back where there was an overflow in an
allocation call from the low level glibc printf support.  By
manipulating the exact overflow result the attacker was able to then
control a write of 0 into what should have been in the allocated buffer
to instead write the 0 into a private field of a FILE* pointer that
controlled whether or not the output string fortification hardening was
applied to the file stream.  Turning off the fortification on the stream
in turn allowed them to use a positional printf argument attack to
perform arbitrary writes and ultimately control an application (cups
daemon IIRC).


Jeff

Reply via email to