http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52705
--- Comment #1 from pinskia at gmail dot com <pinskia at gmail dot com> 2012-03-25 05:12:44 UTC ---
You are violating C/C++ aliasing rules. Use memcpy or -fno-strict-aliasing.

Sent from my Palm Pre on AT&T

On Mar 24, 2012 21:27, veiokej at gmail dot com <gcc-bugzi...@gcc.gnu.org> wrote:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52705

Bug #: 52705
Summary: Loop optimization failure with -O2 versus -O1
Classification: Unclassified
Product: gcc
Version: 4.6.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: veio...@gmail.com

Created attachment 26976
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26976
Intermediate of bug.c

When I compile with these different optimization levels, I get different output. This isn't confusion over floats or uninitialized variables, as far as I can tell. It appears to relate to memory accesses through cast pointers.

First of all, this relates to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49938, which _might_ solve the problem (but I don't know, because I'm unable to upgrade from 4.6.1 under MinGW). So please try the latest GCC before you try to debug this.

Here's the command line:

gcc -save-temps -O2 -obug.exe bug.c

This bug is very easy to reproduce.
Here's the entire source of bug.c:

----------------------------------------------------
#include <stdint.h>
#include <stdio.h>

void func(uint32_t a[8],uint32_t b[8]){
  uint32_t i;
  uint32_t c;
  int64_t d;

  for(i=0;i<=1;i++){
    ((uint64_t *)(b))[0]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[1]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[2]=((uint64_t *)(a))[0];
    ((uint64_t *)(b))[3]=((uint64_t *)(a))[0];
    c=1;
    d=b[0];
    d-=c;
    b[0]=d;
    c=b[0];
    d=b[1];
    d-=c<<1;
    b[1]=d;
  }
  return;
}

int main(int argc, char *argv[]){
  uint32_t a[8]={1,0,0,0,0,0,0,0};
  uint32_t b[8];

  func(a,b);
  printf("%08X%08X%08X%08X%08X%08X%08X%08X\n",b[0],b[1],b[2],b[3],b[4],b[5],b[6],b[7]);
  return 0;
}
----------------------------------------------------

As you will see, you get different outputs depending on whether you use -O1 or -O2.

The relation to Bug 49930 is this: Look at the above code. If you change:

----------------------------------------------------
d=b[1];
d-=c<<1;
b[1]=d;
----------------------------------------------------

to:

----------------------------------------------------
d=b[0];
d-=c<<1;
b[0]=d;
----------------------------------------------------

Then you will see bug 49930.

Note that b[] appears to be only half-initialized because I only write to subscripts 0 through 3. But that's not the case, because I've cast eight 32-bit integers to four 64-bit integers.

I notice that when I change the lines involving (uint64_t *) casts to normal (uint32_t *) memory accesses, i.e. when I get rid of the casts, it seems to work better (but I didn't investigate at length). But I don't want to do that for performance reasons. (bug.c is just an adaptation that's filtered from a "real" function where performance matters.)
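
For illustration, the memcpy suggestion from the reply at the top could look roughly like the sketch below. It is not taken from the bug report: the name func_memcpy, the const qualifier on a[], the inner loop over k, and the assert are all assumptions made here. It only shows one way to copy 64 bits at a time without reading or writing uint32_t storage through a uint64_t pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical rewrite of func() from bug.c; the name func_memcpy is
   illustrative and does not appear in the report. */
void func_memcpy(const uint32_t a[8], uint32_t b[8])
{
    uint32_t i, k, c;
    int64_t d;

    assert(a != NULL && b != NULL);

    for (i = 0; i <= 1; i++) {
        /* Copy the first 8 bytes of a[] into each of the four 64-bit
           slots of b[]. memcpy copies bytes, so no uint64_t lvalue is
           ever used to access uint32_t objects. */
        for (k = 0; k < 4; k++)
            memcpy(&b[2 * k], &a[0], sizeof(uint64_t));

        /* Same arithmetic as the loop body in bug.c. */
        c = 1;
        d = b[0];
        d -= c;
        b[0] = (uint32_t)d;
        c = b[0];
        d = b[1];
        d -= c << 1;
        b[1] = (uint32_t)d;
    }
}
```

Called with the a[] from the report ({1,0,0,0,0,0,0,0}), this leaves b[] as {0,0,1,0,1,0,1,0}, and because the behavior is well defined the result should not depend on the optimization level. Compilers commonly lower a small fixed-size memcpy like this to plain loads and stores, so the performance concern may not apply, but that is worth confirming in the generated assembly (for example with the -save-temps output already used above).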