https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812
--- Comment #11 from jun zhang <zhangjungcc at gmail dot com> --- Hello, Hubicka and Artem I try to reproduce this issue in Raptor Lake, I use -fopenmp -O3 -flto, meet the following error, but if use -fopenmp -O3, no -flto, build ok. Could you help me? libtool: link: /home/sdp/jun/gcc0/install/bin/gcc -fopenmp -O3 -flto -march=native -Wall -o utilities/gm utilities/gm.o -L/home/sdp/jun/omp/Ofast/pts_g_gomp/install/.phoronix-test-suite/installed-tests/pts/graphics-magick-2.1.0/gm_/lib magick/.libs/libGraphicsMagick.a -lfreetype -ljbig -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lzstd -lm -lpthread -fopenmp /home/sdp/jun/btl0/install/bin/ld: /tmp/ccnX75zI.ltrans0.ltrans.o: in function `main': <artificial>:(.text.startup+0x1): undefined reference to `GMCommand' collect2: error: ld returned 1 exit status make[1]: *** [Makefile:6411: utilities/gm] Error 1 make[1]: Leaving directory hubicka at gcc dot gnu.org <gcc-bugzi...@gcc.gnu.org> 于2023年5月29日周一 02:50写道: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812 > > --- Comment #10 from Jan Hubicka <hubicka at gcc dot gnu.org> --- > This is benchmarkeable version of the simplified testcase: > > jan@localhost:/tmp> cat t.c > #define N 10000000 > struct rgb {unsigned char r,g,b;} rgbs[N]; > int *addr; > struct drgb {double r,g,b; > #ifdef OPACITY > double o; > #endif > }; > > struct drgb sum(double w) > { > struct drgb r; > for (int i = 0; i < N; i++) > { > r.r += rgbs[i].r * w; > r.g += rgbs[i].g * w; > r.b += rgbs[i].b * w; > } > return r; > } > jan@localhost:/tmp> cat q.c > struct drgb {double r,g,b; > #ifdef OPACITY > double o; > #endif > }; > struct drgb sum(double w); > int > main() > { > for (int i = 0; i < 1000; i++) > sum(i); > } > > > jan@localhost:/tmp> gcc t.c q.c -march=native -O3 -g ; objdump -d a.out | grep > vfmadd231pd ; perf stat ./a.out > 40119d: c4 e2 d9 b8 d1 vfmadd231pd %xmm1,%xmm4,%xmm2 > > Performance counter stats for './a.out': > > 12,148.04 msec task-clock:u # 1.000 CPUs > utilized > 0 context-switches:u # 0.000 /sec > 0 cpu-migrations:u # 0.000 /sec > 736 page-faults:u # 60.586 /sec > 50,018,421,148 cycles:u # 4.117 GHz > 220,502 stalled-cycles-frontend:u # 0.00% frontend > cycles idle > 39,950,154,369 stalled-cycles-backend:u # 79.87% backend > cycles idle > 120,000,191,713 instructions:u # 2.40 insn per > cycle > # 0.33 stalled cycles > per > insn > 10,000,048,918 branches:u # 823.182 M/sec > 7,959 branch-misses:u # 0.00% of all > branches > > 12.149466078 seconds time elapsed > > 12.149084000 seconds user > 0.000000000 seconds sys > > > jan@localhost:/tmp> gcc t.c q.c -march=native -O3 -g -DOPACITY ; objdump -d > a.out | grep vfmadd231pd ; perf stat ./a.out > > Performance counter stats for './a.out': > > 12,141.11 msec task-clock:u # 1.000 CPUs > utilized > 0 context-switches:u # 0.000 /sec > 0 cpu-migrations:u # 0.000 /sec > 735 page-faults:u # 60.538 /sec > 50,018,839,129 cycles:u # 4.120 GHz > 185,034 stalled-cycles-frontend:u # 0.00% frontend > cycles idle > 29,963,999,798 stalled-cycles-backend:u # 59.91% backend > cycles idle > 120,000,191,729 instructions:u # 2.40 insn per > cycle > # 0.25 stalled cycles > per > insn > 10,000,048,913 branches:u # 823.652 M/sec > 7,311 branch-misses:u # 0.00% of all > branches > > 12.142252354 seconds time elapsed > > 12.138237000 seconds user > 0.004000000 seconds sys > > > So on zen2 hardware I get same performance on both. It may be interesting to > test it on Raptor Lake. > > -- > You are receiving this mail because: > You are on the CC list for the bug.