[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #21 from Ferruh YIGIT  ---
(In reply to Jakub Jelinek from comment #19)
> Upstream 2.31 and 2.31.1 are affected too, but 2.31 branch starting with
> August 2018 is not affected.  As the fix has been backported also to 2.30
> branch, I guess 2.30 is affected too, 2.32 is not affected.  Dunno about
> older binutils, you'll need to try.

Thanks.

We already have another problem with 2.30 and are disabling avx512 for that
case; it seems we will need to extend that to 2.31 & 2.31.1 too, thanks.
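
A version gate of the kind described above could look roughly like the following sketch. It only encodes what this thread states about upstream releases (2.30, 2.31 and 2.31.1 affected, with 2.30 being a guess; 2.32 not affected). The function names are hypothetical, and a version number cannot see distribution backports (Fedora's 2.31.1-25 carries the fix but still reports 2.31.1), so treat a True result as "possibly affected".

```python
import re

# Upstream releases named in this thread as affected (2.30 is only a guess there).
AFFECTED_UPSTREAM = {(2, 30), (2, 31), (2, 31, 1)}


def parse_as_version(banner):
    """Extract the first dotted version number from an `as --version` banner."""
    match = re.search(r"(\d+(?:\.\d+)+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def may_need_avx512_workaround(banner):
    """True: listed as affected; False: 2.32 or newer; None: not covered here."""
    version = parse_as_version(banner)
    if version is None:
        return None
    if version >= (2, 32):
        return False
    if version in AFFECTED_UPSTREAM:
        return True
    return None  # older or in-between builds: the thread says to try them


print(may_need_avx512_workaround("GNU assembler version 2.31.1-24.fc29"))
```

Returning None for anything the thread does not cover keeps the gate from claiming more than is known; a build system would have to decide what to do in that case.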

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #20 from Ferruh YIGIT  ---
Confirmed that the issue is fixed with the latest assembler [1].

[1]
as --version
GNU assembler (GNU Binutils) 2.32.51.20190410

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #19 from Jakub Jelinek  ---
Upstream 2.31 and 2.31.1 are affected too, but 2.31 branch starting with August
2018 is not affected.  As the fix has been backported also to 2.30 branch, I
guess 2.30 is affected too, 2.32 is not affected.  Dunno about older binutils,
you'll need to try.

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #18 from Ferruh YIGIT  ---
(In reply to Nick Clifton from comment #17)
> (In reply to Jakub Jelinek from comment #8)
> > Fedora binutils-2.31.1-24.fc29.x86_64 has the bug, haven't checked upstream
> > 2.31.1 nor which exact patch fixed it.
> 
> FYI - binutils-2.31.1-25.fc29.x86_64 now contains the patch.
> 
> Cheers
>   Nick

Thanks Nick.

Is it possible to get the range of affected versions, so that we can provide
protection for them?

Btw, are all binutils-2.31 releases affected, or only the Fedora package?

Thanks,
ferruh
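
Since packaged versions can carry the fix without changing their upstream version number, one alternative to a version range is to probe the assembler directly. The sketch below assembles the one-instruction reproducer from this thread and compares the raw bytes with the two encodings shown here (correct: displacement bytes 08 00 00 00; buggy: 01 00 00 00). The function names and the use of `objcopy -O binary -j .text` to extract the bytes are my assumptions, not something proposed in the thread.

```python
import os
import subprocess
import tempfile

# One-instruction reproducer taken from this thread.
REPRODUCER = ".text\nfoo:\n\tvpgatherqq 8(,%ymm1,1), %ymm0{%k2}\n"

# Encodings as shown in the objdump listings in this thread.
GOOD = bytes.fromhex("62f2fd2a91040d08000000")
SCALED = bytes.fromhex("62f2fd2a91040d01000000")


def classify_encoding(code):
    """Map the assembled bytes of the reproducer to a verdict string."""
    if code == GOOD:
        return "ok"
    if code == SCALED:
        return "scaled-displacement-bug"
    return "unexpected"


def probe_assembler(as_cmd="as", objcopy_cmd="objcopy"):
    """Assemble the reproducer with as_cmd and classify the resulting bytes.

    Raises subprocess.CalledProcessError if a tool fails, for example when
    the assembler does not know AVX512 at all.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "min.s")
        obj = os.path.join(tmp, "min.o")
        raw = os.path.join(tmp, "min.bin")
        with open(src, "w") as handle:
            handle.write(REPRODUCER)
        subprocess.run([as_cmd, "--64", src, "-o", obj], check=True)
        subprocess.run([objcopy_cmd, "-O", "binary", "-j", ".text", obj, raw],
                       check=True)
        with open(raw, "rb") as handle:
            return classify_encoding(handle.read())
```

A configure-time check could call probe_assembler() and disable avx512 only when the result is not "ok".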

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread nickc at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

Nick Clifton  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nickc at gcc dot gnu.org

--- Comment #17 from Nick Clifton  ---
(In reply to Jakub Jelinek from comment #8)
> Fedora binutils-2.31.1-24.fc29.x86_64 has the bug, haven't checked upstream
> 2.31.1 nor which exact patch fixed it.

FYI - binutils-2.31.1-25.fc29.x86_64 now contains the patch.

Cheers
  Nick

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
           See Also|                            |https://sourceware.org/bugzilla/show_bug.cgi?id=24434
         Resolution|---                         |MOVED

--- Comment #16 from Martin Liška  ---
Moved.

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #15 from Martin Liška  ---
Fixed in binutils with:

commit 629cfaf1b0fbb32a985607c774bd8e7870b9fa94 (HEAD, refs/bisect/bad)
Author: Jan Beulich 
Date:   Mon Jul 30 17:25:05 2018 +0200

x86: don't mistakenly scale non-8-bit displacements

In commit b5014f7af2 I've removed (instead of replaced) a conditional,
resulting in addressing forms not allowing 8-bit displacements to now
get their displacements scaled under certain circumstances. Re-add the
missing conditional.

Minimal reproducer:

$ cat min.s
.text
foo:
vpgatherqq  8(,%ymm1,1), %ymm0{%k2}

$ ./gas/as-new --64 min.s -o avx512.o && ./binutils/objdump -S avx512.o

avx512.o: file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <foo>:
   0:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x1(,%ymm1,1),%ymm0{%k2}
   7:   01 00 00 00
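
For readers unfamiliar with the scaling the commit message refers to: EVEX instructions store an 8-bit displacement divided by a factor N (8 for the 64-bit elements of vpgatherqq), while a 32-bit displacement is stored unscaled. With no base register only the 32-bit form is available. The following is a simplified model of that choice, not gas's actual code; the `emulate_bug` switch is a hypothetical stand-in that merely reproduces the bytes seen above, and the model assumes a displacement is always emitted.

```python
def encode_displacement(disp, has_base, n=8, emulate_bug=False):
    """Return the little-endian displacement bytes for an EVEX memory operand.

    Simplified model: a displacement is always emitted, and n is the EVEX
    disp8*N scaling factor (8 for vpgatherqq).
    """
    # Without a base register the addressing form requires a 32-bit
    # displacement, so the compressed 8-bit form is not available.
    if has_base and disp % n == 0 and -128 <= disp // n <= 127:
        return (disp // n).to_bytes(1, "little", signed=True)
    if emulate_bug and disp % n == 0:
        disp //= n  # erroneous: a 32-bit displacement must not be scaled
    return disp.to_bytes(4, "little", signed=True)


print(encode_displacement(8, has_base=True).hex())                     # 01
print(encode_displacement(8, has_base=False).hex())                    # 08000000
print(encode_displacement(8, has_base=False, emulate_bug=True).hex())  # 01000000
```

The first two results match the correct listings in this thread for 8(%rcx,%ymm1,1) and 8(,%ymm1,1); the third matches the wrong encoding that disassembles as 0x1(,%ymm1,1).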

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #14 from Martin Liška  ---
$ as --version
GNU assembler (GNU Binutils; openSUSE Tumbleweed) 2.32

is fine:

$ as --64 avx512.s -o avx512.o && objdump -S avx512.o | grep gather
234b:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x8(,%ymm1,1),%ymm0{%k2}
235e:   62 f2 fd 2b 91 14 0d    vpgatherqq 0x0(,%ymm1,1),%ymm2{%k3}

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #13 from Ferruh YIGIT  ---
(In reply to Hongtao.liu from comment #6)
> (In reply to Ferruh YIGIT from comment #1)
> > Created attachment 46115 [details]
> > 19.05-rc1 -mno-avx512f gcc build on skylake
> > 
> > The build was done by changing lib/librte_kni/Makefile as follows:
> > 
> > + CFLAGS += -mno-avx512f
> 
> (In reply to Ferruh YIGIT from comment #5)
> > Tested with latest gcc [1], same output.
> > 
> > [1] Compiled from source:
> > gcc (GCC) 9.0.1 20190409 (experimental)
> 
> I built rte_kni.i with latest gcc and got
> 
> ...
>   vmovdqu64   (%rsi,%rax), %zmm1
>   kmovw   %k1, %k2
>   vpgatherqq  8(,%zmm1,1), %zmm0{%k2}
>   vpaddq  %zmm1, %zmm0, %zmm0
>   kmovw   %k1, %k3
>   vpgatherqq  0(,%zmm1,1), %zmm2{%k3}
>   vpsubq  %zmm2, %zmm0, %zmm0
>   vmovdqu64   %zmm0, (%rcx,%rax)
> ...
> 
> Can't reproduce the issue you mentioned.
> 
> Could you please upload *.s and *.o for both versions (with and without
> -mno-avx512f)?

Attached:
gcc_avx512_rte_kni.o
gcc_avx512_rte_kni.s
gcc_NO_avx512_rte_kni.o
gcc_NO_avx512_rte_kni.s

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #12 from Ferruh YIGIT  ---
Created attachment 46128
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46128&action=edit
19.05-rc1 -mno-avx512f gcc build on skylake .s file via --save-temps

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #11 from Ferruh YIGIT  ---
Created attachment 46127
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46127&action=edit
19.05-rc1 -mno-avx512f gcc build on skylake .o file

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #10 from Ferruh YIGIT  ---
Created attachment 46126
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46126&action=edit
19.05-rc1 default gcc build (avx512 enabled) on skylake .s file via
--save-temps

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #9 from Ferruh YIGIT  ---
Created attachment 46125
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46125&action=edit
19.05-rc1 default gcc build (avx512 enabled) on skylake .o file

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #8 from Jakub Jelinek  ---
Fedora binutils-2.31.1-24.fc29.x86_64 has the bug, haven't checked upstream
2.31.1 nor which exact patch fixed it.  But as I said, there is no testcase
coverage for this, so it might break any time again.

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #7 from Jakub Jelinek  ---
Looks like a gas bug to me:
vpgatherqq  8(,%ymm1,1), %ymm0{%k2}
vpgatherqq  8(%rcx,%ymm1,1), %ymm0{%k2}
vpgatherqq  %ymm2, 8(,%ymm1,1), %ymm0
vpgatherqq  %ymm2, 8(%rcx,%ymm1,1), %ymm0
when assembled with gas and objdump -d:
   0:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x1(,%ymm1,1),%ymm0{%k2}
   7:   01 00 00 00
   b:   62 f2 fd 2a 91 44 09    vpgatherqq 0x8(%rcx,%ymm1,1),%ymm0{%k2}
  12:   01
  13:   c4 e2 ed 91 04 0d 08    vpgatherqq %ymm2,0x8(,%ymm1,1),%ymm0
  1a:   00 00 00
  1d:   c4 e2 ed 91 44 09 08    vpgatherqq %ymm2,0x8(%rcx,%ymm1,1),%ymm0
while when assembled with clang and objdump -d:
   0:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x8(,%ymm1,1),%ymm0{%k2}
   7:   08 00 00 00
   b:   62 f2 fd 2a 91 44 09    vpgatherqq 0x8(%rcx,%ymm1,1),%ymm0{%k2}
  12:   01
  13:   c4 e2 ed 91 04 0d 08    vpgatherqq %ymm2,0x8(,%ymm1,1),%ymm0
  1a:   00 00 00
  1d:   c4 e2 ed 91 44 09 08    vpgatherqq %ymm2,0x8(%rcx,%ymm1,1),%ymm0
But trying current binutils trunk assembles it correctly too:
   0:   62 f2 fd 2a 91 04 0d    vpgatherqq 0x8(,%ymm1,1),%ymm0{%k2}
   7:   08 00 00 00
   b:   62 f2 fd 2a 91 44 09    vpgatherqq 0x8(%rcx,%ymm1,1),%ymm0{%k2}
  12:   01
  13:   c4 e2 ed 91 04 0d 08    vpgatherqq %ymm2,0x8(,%ymm1,1),%ymm0
  1a:   00 00 00
  1d:   c4 e2 ed 91 44 09 08    vpgatherqq %ymm2,0x8(%rcx,%ymm1,1),%ymm0
That said, strangely even current binutils trunk doesn't have any test coverage
for the EVEX encoded v*gather* instructions with no base register (i.e.
disp(,%[xyz]mm*,*) ) while it has coverage for such AVX2 gathers.
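
As an illustration of the missing coverage, test input for the no-base forms could be generated mechanically. The sketch below is only an assumption about what such input might look like and is not taken from the binutils testsuite. It restricts itself to vpgatherdd and vpgatherqq, where index and destination registers have the same width, and it produces assembly lines only; the expected encodings would still have to be written or checked separately. The xmm and ymm forms need AVX512VL.

```python
from itertools import product

GATHERS = ("vpgatherdd", "vpgatherqq")  # index and destination share a width
WIDTHS = ("x", "y", "z")                # %xmm, %ymm, %zmm
SCALES = (1, 2, 4, 8)
DISPLACEMENTS = (0, 8, 256, -8)         # arbitrary sample values


def no_base_gather_cases():
    """Return assembly lines of the form disp(,%[xyz]mm1,scale), dest{%k2}."""
    lines = []
    for mnemonic, width, scale, disp in product(
            GATHERS, WIDTHS, SCALES, DISPLACEMENTS):
        lines.append(
            f"\t{mnemonic}\t{disp}(,%{width}mm1,{scale}), %{width}mm0{{%k2}}")
    return lines


if __name__ == "__main__":
    print("\t.text")
    print("\n".join(no_base_gather_cases()))
```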

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-10 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #6 from Hongtao.liu  ---
(In reply to Ferruh YIGIT from comment #1)
> Created attachment 46115 [details]
> 19.05-rc1 -mno-avx512f gcc build on skylake
> 
> The build was done by changing lib/librte_kni/Makefile as follows:
> 
> + CFLAGS += -mno-avx512f

(In reply to Ferruh YIGIT from comment #5)
> Tested with latest gcc [1], same output.
> 
> [1] Compiled from source:
> gcc (GCC) 9.0.1 20190409 (experimental)

I built rte_kni.i with latest gcc and got

...
vmovdqu64   (%rsi,%rax), %zmm1
kmovw   %k1, %k2
vpgatherqq  8(,%zmm1,1), %zmm0{%k2}
vpaddq  %zmm1, %zmm0, %zmm0
kmovw   %k1, %k3
vpgatherqq  0(,%zmm1,1), %zmm2{%k3}
vpsubq  %zmm2, %zmm0, %zmm0
vmovdqu64   %zmm0, (%rcx,%rax)
...

Can't reproduce the issue you mentioned.

Could you please upload *.s and *.o for both versions (with and without
-mno-avx512f)?

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-09 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #5 from Ferruh YIGIT  ---
Tested with latest gcc [1], same output.

[1] Compiled from source:
gcc (GCC) 9.0.1 20190409 (experimental)

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-09 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #4 from Ferruh YIGIT  ---
Created attachment 46117
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46117&action=edit
.s file generated by "--save-temps" param

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-09 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #3 from Ferruh YIGIT  ---
Created attachment 46116
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46116&action=edit
.i file generated by "--save-temps" param

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-09 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #2 from Ferruh YIGIT  ---
While preparing the supporting files for this report with --save-temps, I
noticed that the generated .s file is a little different, and correct, assuming
my suspicion about the source of the failure was right:

movl    $-1, %edx
salq    $5, %rax
xorl    %ecx, %ecx
kmovb   %edx, %k1
.p2align 4,,10
.p2align 3
.L540:
vmovdqu64   (%rsi,%rcx), %ymm1
kmovb   %k1, %k2
vpgatherqq  8(,%ymm1,1), %ymm0{%k2}
kmovb   %k1, %k3
vpaddq  %ymm1, %ymm0, %ymm0
vpgatherqq  0(,%ymm1,1), %ymm2{%k3}
vpsubq  %ymm2, %ymm0, %ymm0
vmovdqu64   %ymm0, (%r8,%rcx)


It has "vpgatherqq  8 ..."

Attaching .s and .i files.


Does this mean the problem is in the assembler?

/usr/bin/as --version
GNU assembler version 2.31.1-24.fc29
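
One way to narrow this down is to compare the gather displacements in the compiler's .s output with those in the disassembly of the object file: if the .s is right and the object is wrong, the assembler is the likely culprit. The helper below is a rough sketch with hypothetical names. It only understands the EVEX vpgatherqq form with a literal numeric displacement as used in this thread, and it ignores any difference in the number of gathers found.

```python
import re

# Matches only the form seen in this thread: a literal numeric displacement
# directly after the mnemonic.
ASM_GATHER = re.compile(r"vpgatherqq\s+(-?\d+)\(")
DIS_GATHER = re.compile(r"vpgatherqq\s+(-?)0x([0-9a-f]+)\(")


def asm_displacements(asm_text):
    """Displacements of vpgatherqq operands in compiler-generated assembly."""
    return [int(m.group(1)) for m in ASM_GATHER.finditer(asm_text)]


def disasm_displacements(objdump_text):
    """Displacements of vpgatherqq operands in objdump disassembly."""
    return [int(m.group(1) + m.group(2), 16)
            for m in DIS_GATHER.finditer(objdump_text)]


def first_mismatch(asm_text, objdump_text):
    """Index of the first gather whose displacement differs, else None."""
    expected = asm_displacements(asm_text)
    actual = disasm_displacements(objdump_text)
    for index, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return index
    return None
```

A result of None means the displacements agree; an index points at the first gather the assembler encoded differently from what the compiler emitted.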

[Bug target/90028] On Intel Skylake (-march=native) generated avx512 instruction can be wrong

2019-04-09 Thread ferruh.yigit at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028

--- Comment #1 from Ferruh YIGIT  ---
Created attachment 46115
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46115&action=edit
19.05-rc1 -mno-avx512f gcc build on skylake

The build was done by changing lib/librte_kni/Makefile as follows:

+ CFLAGS += -mno-avx512f