On 19/09/2021 20:17, Allin Cottrell via Gcc wrote:
> Should this perhaps be considered a bug? Below is a minimal test case
> for a type of calculation that occurs in my real code. It works as
> expected when compiled without optimization, but produces what seems
> like a wrong result when compiled with -O2, using both gcc 10.3.1
> 20210422 on Fedora and gcc 11.1.0-1 on Arch Linux. I realize there's a
> newer gcc release but it's not yet available for Arch, and looking at
> https://gcc.gnu.org/gcc-11/changes.html I didn't see anything to suggest
> that something relevant has changed.
> 
> <test-case>
> #include <stdio.h>
> #include <limits.h>
> 
> int test (int *pk, int n)
> {
>     int err = 0;
> 
>     if (*pk > n) {
>         err = 1;
>         if (*pk > 2e9) {
>             int k = *pk + n - INT_MAX;
> 
>             *pk = k;
>             if (k > 0) {
>                 printf("Got positive revised k = %d\n", k);
>                 err = 0;
>             } else {
>                 printf("k = %d tests as non-positive?!\n", k);
>             }
>         }
>     }
> 
>     return err;
> }
> 
> int main (void)
> {
>     int k = INT_MAX - 10;
>     int err;
> 
>     err = test(&k, 20);
>     printf("main: err = %d\n", err);
> 
>     return 0;
> }
> </test-case>
> 
> What strikes me as "seems wrong" is that the "(k > 0)" branch in test()
> is not taken, although in the alternative branch it turns out that k =
> 10. This can be fixed by using the "volatile" keyword in front of the
> statement "int k = *pk + n - INT_MAX;" or by parenthesizing (n -
> INT_MAX) in that statement.
> 
> I can see the case for assuming that k can't be positive if one thinks
> of the expression as (*pk + n) - INT_MAX, since (*pk + n) can't be
> greater than INT_MAX in context, being the sum of two ints. All the
> same, since gcc does in fact end up assigning the value 10 to k the
> optimization seems a risky one.
> 

Your code is broken.  Signed integer overflow is undefined behaviour in
C - it has no meaning, and the compiler can assume it does not happen or
that you don't care what results you get if it /does/ happen.

This gives the compiler a lot of small but useful optimisation
opportunities - it can assume a number of basic mathematical identities
apply, and can use these to simplify code.

In this particular case, it knows that "*pk + n" will result in a valid
int value between INT_MIN and INT_MAX inclusive (or that you don't care
what happens if you try to overflow), with that int value being the
mathematically correct result as if the types had unlimited sizes.  It
also knows that when it takes an int "x" and subtracts INT_MAX, it again
has an int result between INT_MIN and INT_MAX that has the correct
mathematical value.  It is thus a simple reasoning that, regardless of
the value of "x", the result of "k = x - INT_MAX;" must lie between
INT_MIN and 0 inclusive.  The "k > 0" branch cannot be taken, and code
generation for the conditional and the branch can be skipped.

You appear to be assuming that signed integer arithmetic has two's
complement wrapping, which is not the case in C.  (It's a common
misunderstanding.  Another common misunderstanding is that it /should/
be wrapping, and C is a silly language for not doing so - more careful
thought and research will show that it is a /better/ language because it
makes this undefined behaviour.)

Now that you know the problem in your code (and that it is not a bug in
gcc), you should be able to find plenty of information about signed
integer arithmetic in C in whatever format you prefer (C standards,
blogs, stackoverflow, youtube, whatever suits you).  There is also the
Usenet group comp.lang.c.  This list is not appropriate, however, now
that you know it is nothing specific to gcc.

<https://en.cppreference.com/w/c/language/operator_arithmetic>

mvh.,

David

Reply via email to