https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100162
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |12.0
                 CC|                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-04-21
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Optimized by DOM3 which sees the following difference:

-  <bb 2> [local count: 118111601]:
+  <bb 2> [local count: 955630225]:
   b.1_1 = b;
-  c[0][b.1_1] = 2;
-  c[1][b.1_1] = 2;
-  c[2][b.1_1] = 2;
-  c[3][b.1_1] = 2;
+  _27 = (sizetype) b.1_1;
+  _28 = _27 * 4;
+  vectp_c.13_26 = &c + _28;
+  MEM <vector(4) int> [(int *)vectp_c.13_26] = { 2, 2, 2, 2 };
+  vectp_c.12_30 = vectp_c.13_26 + 16;
   c[4][b.1_1] = 2;
   a = 5;
   _5 = b.1_1 != 0;
   _6 = (int) _5;
-  _8 = _6 / 2;
+  _7 = c[0][0];
+  _8 = _6 / _7;
   if (_8 != 0)

here c[0][b.1_1] takes advantage of get_ref_base_and_extent honoring the
known array size of [1] while the pointer based access is not constrained
this way which makes matching c[0][0] to *(&c + _28) = { 2, 2, 2, 2 }
difficult.

The realistic chance is to catch this by improving value-numbering done on
the not unrolled loop earlier:

  <bb 3> [local count: 955630225]:
  # a.3_19 = PHI <_2(3), 0(2)>
  c[a.3_19][b.1_1] = 2;
  _2 = a.3_19 + 1;
  if (_2 <= 4)
    goto <bb 3>; [89.00%]
  else
    goto <bb 4>; [11.00%]

  <bb 4> [local count: 118111600]:
  a = _2;
  _5 = b.1_1 != 0;
  _6 = (int) _5;
  _7 = c[0][0];

where we could use SCEV & friends to lookup c[0][0] at the
c[a.3_19][b.1_1] definition in vn_reference_lookup_3.  That might also
help to look through loop abstraction earlier.
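
For orientation, a C source along the following lines would be consistent with the GIMPLE above. This is a hypothetical reconstruction and not the testcase attached to the PR: the function name run and the exact form of the condition are assumptions, and c is assumed to be declared as int c[5][1], based on the "known array size of [1]" remark and the * 4 byte stride in the vectorized address computation.

```c
/* Hypothetical reconstruction; the declarations and the function name
   are assumptions inferred from the GIMPLE dumps, not from the PR.  */
int a, b, c[5][1];

int
run (void)
{
  /* Corresponds to the loop in <bb 3>: c[a.3_19][b.1_1] = 2.
     With the inner dimension being [1], only b == 0 is a valid index,
     so each store can only hit c[a][0].  */
  for (a = 0; a < 5; a++)
    c[a][b] = 2;

  /* Corresponds to _8 = _6 / _7 with _7 = c[0][0].  If the load of
     c[0][0] is value-numbered to the stored constant, the division
     becomes _6 / 2 as in the '-' side of the diff.  */
  return (b != 0) / c[0][0];
}
```

With b left at its zero initializer, the loop leaves a at 5 and every c[i][0] at 2, so the returned quotient is 0 / 2. The open question in the comment is only whether the optimizers can prove c[0][0] == 2 at compile time, not what the program computes.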