[Bug target/98196] [11 Regression] aarch64: Wrong code at -O3 -march=armv8.2-a+sve -msve-vector-bits=256 -fvect-cost-model=unlimited

2021-02-18 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98196

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #7 from ktkachov at gcc dot gnu.org ---
Closing as dup then.

*** This bug has been marked as a duplicate of bug 98214 ***

2021-02-18 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98196

--- Comment #6 from Alex Coplan ---
g:0411210fddbd3ec27c8dc1183f40f662712a2232

2021-02-18 Thread joelh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98196

--- Comment #5 from Joel Hutton ---
This appears to have been fixed on trunk by

0411210fddbd3ec27c8dc1183f40f662712a2232
Author: Richard Sandiford 
Date:   Thu Dec 31 16:10:47 2020 +

2021-01-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98196

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1

2020-12-09 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98196

--- Comment #4 from Martin Liška ---
(In reply to Alex Coplan from comment #3)
> @Martin: I originally saw the issue with a testcase generated by YARPGen
> (https://github.com/intel/yarpgen), but this only hit the bug with LTO.

Oh, cool. I haven't heard about the new codegen. I only knew csmith.
I'll experiment with that a bit.

> 
> I reduced that with cvise and then manually tweaked the testcase to simplify
> it such that it also hits the bug without LTO.

Nice!

2020-12-08 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98196

--- Comment #3 from Alex Coplan ---
@Martin: I originally saw the issue with a testcase generated by YARPGen
(https://github.com/intel/yarpgen), but this only hit the bug with LTO.

I reduced that with cvise and then manually tweaked the testcase to simplify it
such that it also hits the bug without LTO.

2020-12-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98196

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0
          Component|tree-optimization           |target

--- Comment #2 from Richard Biener ---
The only thing we "sink" is

Sinking _37 = (long int) _36;
 from bb 4 to bb 5

that makes the loop look like

   [local count: 536870913]:
  # g_23 = PHI 
  _32 = (long unsigned int) g_23;
  _33 = _32 * 11;
  _34 =  + _33;
  _36 = (*_34)[0];
  g_22 = g_23 + 4;
  if (g_22 != 16)
    goto ; [75.00%]
  else
    goto ; [25.00%]

   [local count: 402653181]:
  goto ; [100.00%]

   [local count: 134217731]:
  _37 = (long int) _36;
  b = _37;

we use a gather load for vectorization:

  vect__36.12_1 = .GATHER_LOAD (vectp_c.10_6, { 0, 44, 88, 132 }, 1, { 0, 0, 0, 0 });
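
To make the offsets concrete, here is a minimal C sketch of what the scalar loop and the gather load each read. It assumes a hypothetical `signed char c[16][11]` array (the row stride of 11 bytes comes from the `* 11` in the dump; the real testcase's declaration is not shown in this thread), and the helper names `scalar_last`, `gather_bytes` and `gather_last` are made up for illustration. The gather is emulated lane by lane in plain C; it is not what the compiler emits.

```c
#include <stddef.h>

/* Hypothetical array: 16 rows of 11 bytes, matching the "* 11" stride
   in the GIMPLE dump.  Not taken from the actual testcase.  */
static signed char c[16][11];

/* Scalar form of the loop: g = 0, 4, 8, 12.  Only the value loaded in
   the last iteration survives, and the conversion to long happens after
   the loop (the statement that was sunk).  */
static long scalar_last(void)
{
    signed char v = 0;
    for (int g = 0; g != 16; g += 4)
        v = c[g][0];
    return (long) v;
}

/* Emulated gather: each lane loads one byte from base + offs[lane] * scale.  */
static void gather_bytes(const signed char *base, const size_t offs[4],
                         size_t scale, signed char out[4])
{
    for (int lane = 0; lane < 4; lane++)
        out[lane] = base[offs[lane] * scale];
}

/* Same result via the gather: byte offsets { 0, 44, 88, 132 } are
   g * 11 for g = 0, 4, 8, 12, and the last lane is the live value.  */
static long gather_last(void)
{
    static const size_t offs[4] = { 0, 44, 88, 132 };
    signed char lanes[4];
    gather_bytes((const signed char *) c, offs, 1, lanes);
    return (long) lanes[3];
}
```

Under these assumptions both functions return `c[12][0]`, so any store to `c` that precedes the loop in program order has to be visible to the gather as well.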

I suspect that is somehow expanded badly:

(insn 78 77 0 (set (reg:VNx2QI 92 [ vect__36.12 ])
        (unspec:VNx2QI [
                (subreg:VNx2BI (reg:VNx16BI 154) 0)
                (reg/f:DI 150)
                (reg:VNx2DI 152)
                (const_int 1 [0x1]) repeated x2
                (mem:BLK (scratch) [0  A8])
            ] UNSPEC_LD1_GATHER)) "t.c":12:15 -1
     (nil))

not aliasing with the stores which look like

(insn 33 32 34 (set (reg:DI 121)
        (high:DI (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]))) "t.c":9:15 -1
     (nil))

(insn 34 33 35 (set (reg/f:DI 120)
        (lo_sum:DI (reg:DI 121)
            (symbol_ref:DI ("*.LANCHOR0") [flags 0x182]))) "t.c":9:15 -1
     (expr_list:REG_EQUAL (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
        (nil)))

(insn 36 35 0 (set (mem/c:QI (plus:DI (reg/f:DI 120)
                (const_int 77 [0x4d])) [0 MEM[(signed char *) + 77B]+0 S1 A8])
        (reg:QI 122)) "t.c":9:15 -1
     (nil))

maybe due to the section anchor.