[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #21 from Ferruh YIGIT --- (In reply to Jakub Jelinek from comment #19)
> Upstream 2.31 and 2.31.1 are affected too, but 2.31 branch starting with
> August 2018 is not affected. As the fix has been backported also to 2.30
> branch, I guess 2.30 is affected too, 2.32 is not affected. Dunno about
> older binutils, you'll need to try.

Thanks. We already have another problem with 2.30 and are disabling avx512 in that case; it seems we will need to extend that to 2.31 and 2.31.1 too. Thanks.
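The version-based protection described above could be sketched roughly as follows. This is a minimal Python illustration, not the check any project actually uses: the function names and the SUSPECT_RELEASES set are hypothetical, and the version list is taken only from comment #19. A version string alone cannot tell whether a distribution has backported the fix (Fedora's 2.31.1-25 build is patched but still reports 2.31.1), so the check is deliberately conservative.

```python
import re

# Assumption: release numbers named in comment #19 as affected or probably
# affected. Older binutils were not checked in this thread.
SUSPECT_RELEASES = {(2, 30), (2, 31), (2, 31, 1)}


def parse_binutils_version(version_line):
    """Extract the dotted version from the first line of `as --version`."""
    match = re.search(r"\d+(?:\.\d+)+", version_line)
    if match is None:
        raise ValueError(f"no version number found in {version_line!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def may_need_avx512_workaround(version_line):
    """True when the version alone cannot rule out the gather encoding bug."""
    return parse_binutils_version(version_line) in SUSPECT_RELEASES


# Version strings quoted elsewhere in this thread.
for line in (
    "GNU assembler version 2.31.1-24.fc29",
    "GNU assembler (GNU Binutils; openSUSE Tumbleweed) 2.32",
    "GNU assembler (GNU Binutils) 2.32.51.20190410",
):
    print(line, "->", may_need_avx512_workaround(line))
```

A build system following this approach would add -mno-avx512f only when the function returns True; probing the assembler's actual output is more reliable where that is practical.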
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #20 from Ferruh YIGIT --- Confirmed that issue is fixed with the latest assembler [1].

[1]
as --version
GNU assembler (GNU Binutils) 2.32.51.20190410?
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #19 from Jakub Jelinek --- Upstream 2.31 and 2.31.1 are affected too, but the 2.31 branch starting with August 2018 is not affected. As the fix has also been backported to the 2.30 branch, I guess 2.30 is affected too; 2.32 is not affected. Dunno about older binutils, you'll need to try.
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #18 from Ferruh YIGIT --- (In reply to Nick Clifton from comment #17)
> (In reply to Jakub Jelinek from comment #8)
> > Fedora binutils-2.31.1-24.fc29.x86_64 has the bug, haven't checked upstream
> > 2.31.1 nor which exact patch fixed it.
>
> FYI - binutils-2.31.1-25.fc29.x86_64 now contains the patch.
>
> Cheers
> Nick

Thanks Nick.

Is it possible to get the range of affected versions, so that we can provide protection for them?

By the way, are all binutils-2.31 releases affected, or only the Fedora package?

Thanks,
ferruh
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

Nick Clifton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nickc at gcc dot gnu.org

--- Comment #17 from Nick Clifton --- (In reply to Jakub Jelinek from comment #8)
> Fedora binutils-2.31.1-24.fc29.x86_64 has the bug, haven't checked upstream
> 2.31.1 nor which exact patch fixed it.

FYI - binutils-2.31.1-25.fc29.x86_64 now contains the patch.

Cheers
Nick
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
           See Also|                            |https://sourceware.org/bugzilla/show_bug.cgi?id=24434
         Resolution|---                         |MOVED

--- Comment #16 from Martin Liška --- Moved.
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #15 from Martin Liška --- Fixed in binutils with:

commit 629cfaf1b0fbb32a985607c774bd8e7870b9fa94 (HEAD, refs/bisect/bad)
Author: Jan Beulich
Date: Mon Jul 30 17:25:05 2018 +0200

    x86: don't mistakenly scale non-8-bit displacements

    In commit b5014f7af2 I've removed (instead of replaced) a conditional,
    resulting in addressing forms not allowing 8-bit displacements to now
    get their displacements scaled under certain circumstances. Re-add the
    missing conditional.

Minimal reproducer:

$ cat min.s
        .text
foo:
        vpgatherqq 8(,%ymm1,1), %ymm0{%k2}

$ ./gas/as-new --64 min.s -o avx512.o && ./binutils/objdump -S avx512.o

avx512.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <foo>:
   0:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x1(,%ymm1,1),%ymm0{%k2}
   7:   01 00 00 00
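The effect of the mistaken scaling can be worked through by hand. The sketch below is a simplified Python decoder written for illustration only (decode_displacement is a hypothetical helper, not binutils code). It assumes the EVEX compressed-displacement factor N is 8 for vpgatherqq (64-bit elements), and it only handles memory operands that use a SIB byte. With a base register the operand uses an 8-bit displacement, which EVEX scales by N, so the byte 01 correctly means 8. Without a base register the operand must use an unscaled 32-bit displacement, so 01 00 00 00 means 1 rather than the intended 8.

```python
def decode_displacement(modrm: int, sib: int, disp: bytes, n: int) -> int:
    """Recover the displacement of an EVEX memory operand that has a SIB byte.

    n is the disp8*N scaling factor (assumed to be 8 for vpgatherqq).
    """
    mod = modrm >> 6
    rm = modrm & 0b111
    if mod == 0b11 or rm != 0b100:
        raise ValueError("this sketch only handles memory operands with a SIB byte")
    base = sib & 0b111
    if mod == 0b01:
        # 8-bit displacement: EVEX multiplies it by N.
        return int.from_bytes(disp[:1], "little", signed=True) * n
    if mod == 0b10 or (mod == 0b00 and base == 0b101):
        # 32-bit displacement: never scaled. mod=00 with base=101 means
        # "no base register, disp32 follows".
        return int.from_bytes(disp[:4], "little", signed=True)
    return 0  # mod=00 with a real base register: no displacement


N = 8  # assumption: element size of vpgatherqq in bytes

# Bytes after the opcode 0x91, taken from the objdump listings in this thread.
buggy = decode_displacement(0x04, 0x0D, bytes.fromhex("01000000"), N)
fixed = decode_displacement(0x04, 0x0D, bytes.fromhex("08000000"), N)
with_base = decode_displacement(0x44, 0x09, bytes.fromhex("01"), N)

print(buggy, fixed, with_base)  # 1 8 8
```

This matches the disassembly: the affected assembler divided 8 by N as if it were emitting a disp8, then stored the result in a disp32 field.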
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #14 from Martin Liška ---
$ as --version
GNU assembler (GNU Binutils; openSUSE Tumbleweed) 2.32

is fine:

$ as --64 avx512.s -o avx512.o && objdump -S avx512.o | grep gather
    234b:       62 f2 fd 2a 91 04 0d    vpgatherqq 0x8(,%ymm1,1),%ymm0{%k2}
    235e:       62 f2 fd 2b 91 14 0d    vpgatherqq 0x0(,%ymm1,1),%ymm2{%k3}
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #13 from Ferruh YIGIT --- (In reply to Hongtao.liu from comment #6)
> (In reply to Ferruh YIGIT from comment #1)
> > Created attachment 46115 [details]
> > 19.05-rc1 -mno-avx512f gcc build on skylake
> >
> > The build is done with changing the lib/librte_kni/Makefile as following:
> >
> > + CFLAGS += -mno-avx512f
>
> (In reply to Ferruh YIGIT from comment #5)
> > Tested with latest gcc [1], same output.
> >
> > [1] Compiled from source:
> > gcc (GCC) 9.0.1 20190409 (experimental)
>
> I built rte_kni.i with latest gcc and got
>
> ...
> vmovdqu64 (%rsi,%rax), %zmm1
> kmovw %k1, %k2
> vpgatherqq 8(,%zmm1,1), %zmm0{%k2}
> vpaddq %zmm1, %zmm0, %zmm0
> kmovw %k1, %k3
> vpgatherqq 0(,%zmm1,1), %zmm2{%k3}
> vpsubq %zmm2, %zmm0, %zmm0
> vmovdqu64 %zmm0, (%rcx,%rax)
> ...
>
> Can't reproduce the issue you mentioned.
>
> Could you please upload *.s and *.o with both version(with and without
> -mno-avx512f).

Attached:
gcc_avx512_rte_kni.o
gcc_avx512_rte_kni.s
gcc_NO_avx512_rte_kni.o
gcc_NO_avx512_rte_kni.s
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #12 from Ferruh YIGIT --- Created attachment 46128 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46128&action=edit 19.05-rc1 -mno-avx512f gcc build on skylake .s file via --save-temps
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #11 from Ferruh YIGIT --- Created attachment 46127 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46127&action=edit 19.05-rc1 -mno-avx512f gcc build on skylake .o file
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #10 from Ferruh YIGIT --- Created attachment 46126 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46126&action=edit 19.05-rc1 default gcc build (avx512 enabled) on skylake .s file via --save-temps
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #9 from Ferruh YIGIT --- Created attachment 46125 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46125&action=edit 19.05-rc1 default gcc build (avx512 enabled) on skylake .o file
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #8 from Jakub Jelinek --- Fedora binutils-2.31.1-24.fc29.x86_64 has the bug, haven't checked upstream 2.31.1 nor which exact patch fixed it. But as I said, there is no testcase coverage for this, so it might break any time again.
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #7 from Jakub Jelinek --- Looks like gas bug to me:

        vpgatherqq 8(,%ymm1,1), %ymm0{%k2}
        vpgatherqq 8(%rcx,%ymm1,1), %ymm0{%k2}
        vpgatherqq %ymm2, 8(,%ymm1,1), %ymm0
        vpgatherqq %ymm2, 8(%rcx,%ymm1,1), %ymm0

when assembled with gas and objdump -d:

   0:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x1(,%ymm1,1),%ymm0{%k2}
   7:   01 00 00 00
   b:   62 f2 fd 2a 91 44 09    vpgatherqq 0x8(%rcx,%ymm1,1),%ymm0{%k2}
  12:   01
  13:   c4 e2 ed 91 04 0d 08    vpgatherqq %ymm2,0x8(,%ymm1,1),%ymm0
  1a:   00 00 00
  1d:   c4 e2 ed 91 44 09 08    vpgatherqq %ymm2,0x8(%rcx,%ymm1,1),%ymm0

while when assembled with clang and objdump -d:

   0:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x8(,%ymm1,1),%ymm0{%k2}
   7:   08 00 00 00
   b:   62 f2 fd 2a 91 44 09    vpgatherqq 0x8(%rcx,%ymm1,1),%ymm0{%k2}
  12:   01
  13:   c4 e2 ed 91 04 0d 08    vpgatherqq %ymm2,0x8(,%ymm1,1),%ymm0
  1a:   00 00 00
  1d:   c4 e2 ed 91 44 09 08    vpgatherqq %ymm2,0x8(%rcx,%ymm1,1),%ymm0

But trying current binutils trunk assembles it correctly too:

   0:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x8(,%ymm1,1),%ymm0{%k2}
   7:   08 00 00 00
   b:   62 f2 fd 2a 91 44 09    vpgatherqq 0x8(%rcx,%ymm1,1),%ymm0{%k2}
  12:   01
  13:   c4 e2 ed 91 04 0d 08    vpgatherqq %ymm2,0x8(,%ymm1,1),%ymm0
  1a:   00 00 00
  1d:   c4 e2 ed 91 44 09 08    vpgatherqq %ymm2,0x8(%rcx,%ymm1,1),%ymm0

That said, strangely even current binutils trunk doesn't have any test coverage for the EVEX encoded v*gather* instructions with no base register (i.e. disp(,%[xyz]mm*,*) ) while it has coverage for such AVX2 gathers.
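Since the version number alone does not say whether a given assembler carries the fix, one option is to probe the installed assembler directly. The following is a rough Python sketch under stated assumptions, not an existing binutils test: the names SOURCE, EXPECTED, first_mismatch and assemble_text_section are hypothetical, the expected bytes are copied from the clang and trunk listings above, and the probe assumes GNU `as` and `objcopy` are on PATH and that the .text section carries no extra padding.

```python
import pathlib
import subprocess
import tempfile

# The four instructions from the comment above.
SOURCE = """\
        .text
        vpgatherqq 8(,%ymm1,1), %ymm0{%k2}
        vpgatherqq 8(%rcx,%ymm1,1), %ymm0{%k2}
        vpgatherqq %ymm2, 8(,%ymm1,1), %ymm0
        vpgatherqq %ymm2, 8(%rcx,%ymm1,1), %ymm0
"""

# Encoding reported above for clang and for binutils trunk.
EXPECTED = bytes.fromhex(
    "62f2fd2a91040d08000000"  # EVEX gather, no base register, disp32 = 8
    "62f2fd2a91440901"        # EVEX gather, base %rcx, disp8 = 1 (scaled by 8)
    "c4e2ed91040d08000000"    # AVX2 gather, no base register, disp32 = 8
    "c4e2ed91440908"          # AVX2 gather, base %rcx, disp8 = 8
)


def first_mismatch(code, expected=EXPECTED):
    """Return the offset of the first differing byte, or None if identical."""
    if code == expected:
        return None
    for offset, (got, want) in enumerate(zip(code, expected)):
        if got != want:
            return offset
    return min(len(code), len(expected))  # one sequence is a prefix of the other


def assemble_text_section(assembler="as", objcopy="objcopy"):
    """Assemble SOURCE and return the raw bytes of its .text section."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = pathlib.Path(tmp)
        src, obj, raw = tmp_dir / "probe.s", tmp_dir / "probe.o", tmp_dir / "probe.bin"
        src.write_text(SOURCE)
        subprocess.run([assembler, "--64", str(src), "-o", str(obj)], check=True)
        subprocess.run(
            [objcopy, "-O", "binary", "--only-section=.text", str(obj), str(raw)],
            check=True,
        )
        return raw.read_bytes()


# Usage on a machine with binutils installed:
#     offset = first_mismatch(assemble_text_section())
#     print("assembler looks fine" if offset is None else f"mismatch at byte {offset}")
```

For the output of the affected gas shown above, the first differing byte is at offset 7, the low byte of the 32-bit displacement.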
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #6 from Hongtao.liu --- (In reply to Ferruh YIGIT from comment #1)
> Created attachment 46115 [details]
> 19.05-rc1 -mno-avx512f gcc build on skylake
>
> The build is done with changing the lib/librte_kni/Makefile as following:
>
> + CFLAGS += -mno-avx512f

(In reply to Ferruh YIGIT from comment #5)
> Tested with latest gcc [1], same output.
>
> [1] Compiled from source:
> gcc (GCC) 9.0.1 20190409 (experimental)

I built rte_kni.i with the latest gcc and got

...
        vmovdqu64 (%rsi,%rax), %zmm1
        kmovw %k1, %k2
        vpgatherqq 8(,%zmm1,1), %zmm0{%k2}
        vpaddq %zmm1, %zmm0, %zmm0
        kmovw %k1, %k3
        vpgatherqq 0(,%zmm1,1), %zmm2{%k3}
        vpsubq %zmm2, %zmm0, %zmm0
        vmovdqu64 %zmm0, (%rcx,%rax)
...

Can't reproduce the issue you mentioned.

Could you please upload the *.s and *.o files for both versions (with and without -mno-avx512f)?
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #5 from Ferruh YIGIT --- Tested with latest gcc [1], same output.

[1] Compiled from source:
gcc (GCC) 9.0.1 20190409 (experimental)
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #4 from Ferruh YIGIT --- Created attachment 46117 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46117&action=edit .s file generated by "--save-temps" param
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #3 from Ferruh YIGIT --- Created attachment 46116 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46116&action=edit .i file generated by "--save-temps" param
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #2 from Ferruh YIGIT --- While preparing the supporting files for this report via --save-temps, I noticed that the generated .s file is slightly different from the disassembled object, and correct, assuming my suspicion about the source of the failure was right:

        movl    $-1, %edx
        salq    $5, %rax
        xorl    %ecx, %ecx
        kmovb   %edx, %k1
        .p2align 4,,10
        .p2align 3
.L540:
        vmovdqu64       (%rsi,%rcx), %ymm1
        kmovb   %k1, %k2
        vpgatherqq      8(,%ymm1,1), %ymm0{%k2}
        kmovb   %k1, %k3
        vpaddq  %ymm1, %ymm0, %ymm0
        vpgatherqq      0(,%ymm1,1), %ymm2{%k3}
        vpsubq  %ymm2, %ymm0, %ymm0
        vmovdqu64       %ymm0, (%r8,%rcx)

It has "vpgatherqq 8 ...".

Attaching the .s and .i files.

Does this mean the problem is in the assembler?

/usr/bin/as --version
GNU assembler version 2.31.1-24.fc29
[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 --- Comment #1 from Ferruh YIGIT --- Created attachment 46115 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46115&action=edit 19.05-rc1 -mno-avx512f gcc build on skylake

The build is done with the following change to lib/librte_kni/Makefile:

+ CFLAGS += -mno-avx512f