[Bug c++/81410] [5/6/7/8 Regression] -O3 breaks code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81410

--- Comment #8 from Richard Biener ---
Author: rguenth
Date: Tue Jul 18 13:55:47 2017
New Revision: 250312

URL: https://gcc.gnu.org/viewcvs?rev=250312=gcc=rev
Log:
2017-06-18  Richard Biener

        PR tree-optimization/81410
        * tree-vect-stmts.c (vectorizable_load): Properly adjust for
        the gap in the ! slp_perm SLP case after each group.

        * gcc.dg/vect/pr81410.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr81410.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-stmts.c
[Bug c++/81410] [5/6/7/8 Regression] -O3 breaks code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81410

--- Comment #7 from Richard Biener ---
t.ii:25:19: note: === vect_analyze_data_ref_accesses ===
t.ii:25:19: note: Detected interleaving store _10->x and _10->y
t.ii:25:19: note: Detected interleaving load MEM[(const struct Foo &)_8].x and MEM[(const struct Foo &)_8].y
t.ii:25:19: note: Detected interleaving store of size 2 starting with _10->x = _37;
t.ii:25:19: note: Detected interleaving load of size 3 starting with _37 = MEM[(const struct Foo &)_8].x;
t.ii:25:19: note: There is a gap of 1 elements after the group
...
t.ii:25:19: note: Final SLP tree for instance:
t.ii:25:19: note: node
t.ii:25:19: note:       stmt 0 _10->x = _37;
t.ii:25:19: note:       stmt 1 _10->y = _38;
t.ii:25:19: note: node
t.ii:25:19: note:       stmt 0 _37 = MEM[(const struct Foo &)_8].x;
t.ii:25:19: note:       stmt 1 _38 = MEM[(const struct Foo &)_8].y;

(note no load permutation)

t.ii:25:19: note: Loop contains SLP and non-SLP stmts
t.ii:25:19: note: Updating vectorization factor to 4
t.ii:25:19: note: vectorization_factor = 4, niters = 5

  _37 = MEM[(const struct Foo &)_8].x;
  vect__37.14_78 = MEM[(long int *)vectp.12_80];
  vectp.12_73 = vectp.12_80 + 16;
  vect__37.15_72 = MEM[(long int *)vectp.12_73];
  vectp.12_71 = vectp.12_73 + 16;
  vect__37.16_70 = MEM[(long int *)vectp.12_71];
  vectp.12_69 = vectp.12_71 + 16;
  vect__37.17_68 = MEM[(long int *)vectp.12_69];
  vectp.12_67 = vectp.12_69 + 32;
  _38 = MEM[(const struct Foo &)_8].y;

so the gap is accounted for in the wrong place once instead of twice as
required.

C testcase:

typedef __UINT64_TYPE__ uint64_t;
uint64_t x[24];
uint64_t y[16];
uint64_t z[8];

void __attribute__((noinline)) foo()
{
  for (int i = 0; i < 8; ++i)
    {
      y[2*i] = x[3*i];
      y[2*i + 1] = x[3*i + 1];
      z[i] = 1;
    }
}

int main()
{
  for (int i = 0; i < 24; ++i)
    x[i] = i;
  foo ();
  for (int i = 0; i < 8; ++i)
    if (y[2*i] != 3*i || y[2*i+1] != 3*i + 1)
      __builtin_abort ();
  return 0;
}
[Bug c++/81410] [5/6/7/8 Regression] -O3 breaks code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81410

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Keywords|                              |wrong-code
           Priority|P3                            |P2
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
   Target Milestone|---                           |5.5

--- Comment #6 from Richard Biener ---
I'll have a looksee.
[Bug c++/81410] [5/6/7/8 Regression] -O3 breaks code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81410

--- Comment #5 from Marc Glisse ---
Seems related to vectorization. These lines look suspicious:

  vect__37.14_78 = MEM[(long int *)_30];
  vect__37.15_72 = MEM[(long int *)_30 + 16B];
  vect__37.16_70 = MEM[(long int *)_30 + 32B];
  vect__37.17_68 = MEM[(long int *)_30 + 48B];
  MEM[(long int *)_28] = vect__37.14_78;
  MEM[(long int *)_28 + 16B] = vect__37.15_72;
  MEM[(long int *)_28 + 32B] = vect__37.16_70;
  MEM[(long int *)_28 + 48B] = vect__37.17_68;

where _30 is for b, _28 is for a, and I would expect to see gaps in the reads
from b (+24, +48, +72 instead of +16, +32 and +48). But I haven't checked, this
is only a first guess.
[Bug c++/81410] [5/6/7/8 Regression] -O3 breaks code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81410

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-07-12
                 CC|                            |jason at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org
            Summary|O3 breaks code              |[5/6/7/8 Regression] -O3
                   |                            |breaks code
     Ever confirmed|0                           |1

--- Comment #4 from Martin Liška ---
Confirmed, started with r209313.