http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172
           Summary: incorrect vectorization of loop in GCC 4.5.* with -O3
           Product: gcc
           Version: 4.5.2
            Status: UNCONFIRMED
          Severity: critical
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: g...@cray.com


When the following test case is compiled with -O3, the program fails to
compute correct array values.  Compiling with -O2 allows the program to
execute correctly.  Looking at the disassembly of the program, the compiler
appears to be vectorizing a loop (lines 25-27 in this example) incorrectly.
Turning off loop vectorization allows correct code generation
("cc -O3 -fno-tree-vectorize" works).

The problem was reproduced with gcc 4.5.0, 4.5.1, and 4.5.2.  It does not
occur with gcc 4.4.4 or 4.3.1.

The CPU being used is a 12-core Magny-Cours Opteron.

test case:

$ cat vec.c
// Compile with gcc -O3 to see failures
// Compile with gcc -O3 -fno-tree-vectorize to get correct results
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

#define ASIZE 1028
#define HALF (ASIZE/2)

int main() {
  uint32_t array[ASIZE];
  int failures;
  int i;

  memset(array, 0, sizeof(array));

  // initialize first half of the array
  for (i = 0; i < HALF; i++) {
    array[i] = i;
  }

  // fill second half of array in by summing earlier elements of the array
  // gcc 4.5.1 and 4.5.2 incorrectly vectorize this loop!  array[1025] is left
  // at 0 for ASIZE=1028
  for (i = 0; i < HALF-1; i++) {
    array[HALF+i] = array[2*i] + array[2*i + 1];
  }

  // see if we have any failures
  failures = 0;
  for (i = 0; i < HALF - 1; i++) {
    if (array[HALF+i] != array[2*i] + array[2*i + 1]) {
      printf("COMPILER BUG: array[%d] should be %d but is %d\n",
             HALF+i, array[2*i] + array[2*i + 1], array[HALF+i]);
      ++failures;
    }
  }
  if (failures == 0) {
    printf("pass\n");
  }

  return 0;
}

$ gcc --version
gcc (GCC) 4.5.2 20101216 (Cray Inc.)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -O3 vec.c ; ./a.out
COMPILER BUG: array[1025] should be 98177 but is 0
$ gcc -O3 -fno-tree-vectorize vec.c ; ./a.out
pass
$ gcc -O2 vec.c ; ./a.out
pass

GCC 4.4.4 works:

$ gcc --version
gcc (GCC) 4.4.4 20100429 (Cray Inc.)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -O3 vec.c ; ./a.out
pass
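
One possible explanation for why only array[1025] is wrong, offered as a sketch and not as an analysis of the actual generated code: near the end of the loop, iteration i reads elements that were written only a few iterations earlier (iteration 511 reads array[1022] and array[1023], which iterations 508 and 509 write).  The model below ASSUMES a vectorization factor of 4 (four 32-bit lanes), no alignment peeling, and that each group of 4 iterations performs all of its loads before any of its stores.  None of these is stated in the report.  The names VF, init_array, fill_scalar, fill_grouped and count_failures are made up for this illustration.

```c
#include <inttypes.h>
#include <string.h>

#define ASIZE 1028
#define HALF (ASIZE/2)
#define VF 4  /* ASSUMPTION: four 32-bit lanes per vector; not from the report */

/* Same initialization as vec.c: zero the array, then array[i] = i
   for the first half. */
void init_array(uint32_t *a) {
  int i;
  memset(a, 0, ASIZE * sizeof(a[0]));
  for (i = 0; i < HALF; i++) {
    a[i] = (uint32_t)i;
  }
}

/* The loop as written in vec.c, executed one iteration at a time. */
void fill_scalar(uint32_t *a) {
  int i;
  for (i = 0; i < HALF - 1; i++) {
    a[HALF + i] = a[2*i] + a[2*i + 1];
  }
}

/* Hypothetical model of an unsafe vectorization: each group of VF
   iterations does all of its loads first and all of its stores afterwards,
   and the leftover iterations run in a scalar epilogue. */
void fill_grouped(uint32_t *a) {
  uint32_t tmp[VF];
  int i, k;
  for (i = 0; i + VF <= HALF - 1; i += VF) {
    for (k = 0; k < VF; k++) {
      tmp[k] = a[2*(i + k)] + a[2*(i + k) + 1];   /* loads of the group */
    }
    for (k = 0; k < VF; k++) {
      a[HALF + i + k] = tmp[k];                   /* stores of the group */
    }
  }
  for (; i < HALF - 1; i++) {                     /* scalar epilogue */
    a[HALF + i] = a[2*i] + a[2*i + 1];
  }
}

/* The consistency check from vec.c.  Returns the number of failures and
   stores the index of the first failing element in *first_bad. */
int count_failures(const uint32_t *a, int *first_bad) {
  int i, failures = 0;
  for (i = 0; i < HALF - 1; i++) {
    if (a[HALF + i] != a[2*i] + a[2*i + 1]) {
      if (failures == 0) {
        *first_bad = HALF + i;
      }
      ++failures;
    }
  }
  return failures;
}
```

Under this model, init_array followed by fill_grouped leaves exactly one failing element, array[1025], with the value 0, which is the same symptom as in the -O3 output above.  The same check after fill_scalar reports no failures.  The last group of 4 iterations (508-511) is the only one whose reads overlap its own writes, which would explain why a single element is affected.  Whether GCC 4.5 really emits code of this shape would have to be confirmed from the disassembly or from the vectorizer dump.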