http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52705

             Bug #: 52705
           Summary: Loop optimization failure with -O2 versus -O1
    Classification: Unclassified
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: veio...@gmail.com


Created attachment 26976
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26976
Intermediate of bug.c

When I compile with -O1 and then with -O2, I get different output. As far as
I can tell, this isn't confusion over floats or uninitialized variables. It
appears to relate to memory accesses made through cast pointers.

First of all, this relates to
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49938, which _might_ solve the
problem (but I don't know, because I'm unable to upgrade from 4.6.1 under
MinGW). So please try the latest GCC before you try to debug this.

Here's the command line:

gcc -save-temps -O2 -obug.exe bug.c

This bug is very easy to reproduce. Here's the entire source of bug.c:

----------------------------------------------------
#include <stdint.h>
#include <stdio.h>

void
func(uint32_t a[8],uint32_t b[8]){
  uint32_t i;
  uint32_t c;
  int64_t d;

  for(i=0;i<=1;i++){
    ((uint64_t *)(b))[0]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[1]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[2]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[3]=((uint64_t *)(a))[0];
    c=1;
    d=b[0];
    d-=c;
    b[0]=d;
    c=b[0];
    d=b[1];
    d-=c<<1;
    b[1]=d;
  }
  return;
}

int
main(int argc, char *argv[]){
  uint32_t a[8]={1,0,0,0,0,0,0,0};
  uint32_t b[8];

  func(a,b);
 
  printf("%08X%08X%08X%08X%08X%08X%08X%08X\n",b[0],b[1],b[2],b[3],b[4],b[5],b[6],b[7]);
  return 0;
}
----------------------------------------------------

As you will see, you get different outputs depending on whether you use -O1 or
-O2.

The relation to Bug 49938 is this:

Look at the above code. If you change:

----------------------------------------------------
    d=b[1];
    d-=c<<1;
    b[1]=d;
----------------------------------------------------

to:

----------------------------------------------------
    d=b[0];
    d-=c<<1;
    b[0]=d;
----------------------------------------------------

Then you will see bug 49938.

Note that b[] appears to be only half-initialized because I only write to
subscripts 0 through 3. But that's not the case: those writes go through a
(uint64_t *) cast, so the 4 64-bit stores cover all 8 32-bit integers.

I notice that when I change the lines involving (uint64_t *) casts to normal
(uint32_t *) memory accesses, i.e. when I get rid of the casts, it seems to
work better (but I didn't investigate at length). I don't want to do that, for
performance reasons. (bug.c is just an adaptation distilled from a "real"
function where performance matters.)
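
For comparison, here is a minimal sketch of the same loop with the 64-bit
copies done through memcpy() instead of pointer casts. It rests on an
assumption that is not established anywhere in this report: that the -O1/-O2
difference is tied to type-based alias analysis, which GCC enables with
-fstrict-aliasing at -O2. The name func_memcpy is hypothetical. Whether the
8-byte memcpy() calls compile down to single 64-bit moves would need to be
checked in the -save-temps output.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of func(): same arithmetic as bug.c, but the 64-bit
   copies use memcpy() rather than (uint64_t *) casts, so every access to
   a[] and b[] is made either bytewise or as uint32_t. */
void
func_memcpy(uint32_t a[8],uint32_t b[8]){
  uint32_t i;
  uint32_t c;
  int64_t d;

  for(i=0;i<=1;i++){
    /* Copy the first 8 bytes of a[] into each 8-byte slot of b[]. */
    memcpy(&b[0],&a[0],sizeof(uint64_t));
    memcpy(&b[2],&a[0],sizeof(uint64_t));
    memcpy(&b[4],&a[0],sizeof(uint64_t));
    memcpy(&b[6],&a[0],sizeof(uint64_t));
    c=1;
    d=b[0];
    d-=c;
    b[0]=d;
    c=b[0];
    d=b[1];
    d-=c<<1;
    b[1]=d;
  }
  return;
}
```

Called from the main() above in place of func(), with a[] = {1,0,0,0,0,0,0,0},
this leaves b[] = {0,0,1,0,1,0,1,0}. If this version prints the same line at
-O1 and -O2 while the cast version does not, that would point at the casts
rather than at the loop itself.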
