http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49965
--- Comment #6 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot Uni-Bielefeld.DE> 2011-08-09 15:10:25 UTC ---
> --- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-08-03
> 16:37:36 UTC ---
> So what values it printed?  Did it print -2.0 and 9.0 in some iterations?
> The final merging is done in a critical section between GOMP_atomic_start and
> GOMP_atomic_end, perhaps you can put a breakpoint in there and watch how the
> values are merged using the MIN_EXPR.
> Can you reproduce the failure with OMP_NUM_THREADS=1 BTW?

I think there's a code generation bug here: I could reduce the example
to a considerably smaller testcase (red-4.C, attached), which still
fails with OMP_NUM_THREADS=1.

Between GOMP_atomic_start and GOMP_atomic_end, there's a single call to
_Q_fle (-2, 1024), which correctly returns 1 (i.e. first arg <= second).

With g++ -mcpu=v9 -fopenmp -O3 red-4.c -lm, I get this code sequence:

        ldd     [%i0], %f8              %i1
        ldd     [%i0+8], %f10
        ldd     [%fp-48], %f12          %i0
        ldd     [%fp-40], %f14
        std     %f12, [%fp-16]          %i0
        std     %f14, [%fp-8]
        std     %f8, [%fp-32]           %i1
        std     %f10, [%fp-24]
        std     %f8, [%fp-56]           %i1
        std     %f10, [%fp-64]
        add     %fp, -16, %o0           %i0 <= %i1
        call    _Q_fle, 0
        add     %fp, -32, %o1
        cmp     %o0, 0                  1 on true, 0 on false
        ldd     [%fp-56], %f8           %i1
        ldd     [%fp-64], %f10
        ldd     [%fp-48], %f12          %i0
        ldd     [%fp-40], %f14
        fmovdle %icc, %f12, %f8         on <= 0 (false): %i0 -> %i1
        fmovdle %icc, %f14, %f10
        std     %f8, [%i0]              return %i1
        std     %f10, [%i0+8]

I think the fmovdle is wrong: if _Q_fle returns 1 (<=), the first arg
should be returned, but it's the other way round.

        Rainer
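
The suspected inversion can be modelled in a few lines of C++. This is only a sketch under assumptions: q_fle is a hypothetical stand-in for the soft-float helper _Q_fle (taken to return 1 when its first argument is <= the second, as observed in the debugger), and merge_as_emitted is a hand-written reading of the cmp/fmovdle sequence above, not compiler output.

```cpp
// Hypothetical stand-in for _Q_fle: 1 if a <= b, 0 otherwise.
static int q_fle(long double a, long double b) {
    return a <= b ? 1 : 0;
}

// What a MIN_EXPR merge should produce: keep the first value when
// the helper reports first <= second, otherwise keep the second.
static long double merge_expected(long double first, long double second) {
    return q_fle(first, second) != 0 ? first : second;
}

// Reading of the emitted sequence (an assumption, not generated code):
// the result register starts out holding the second value, and after
// "cmp %o0, 0" the fmovdle overwrites it with the first value only
// when the helper's return value is <= 0, i.e. when the comparison
// was false.
static long double merge_as_emitted(long double first, long double second) {
    int r = q_fle(first, second);
    long double result = second;
    if (r <= 0) {
        result = first;
    }
    return result;
}
```

For the values seen in the reduced testcase, merge_expected(-2, 1024) yields -2, while merge_as_emitted(-2, 1024) yields 1024, so under this reading the sequence selects the larger value instead of the smaller one.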