Richard Biener <rguent...@suse.de> writes:
> On Fri, 22 Jun 2018, David Malcolm wrote:
>
>> NightStrike and I were chatting on IRC last week about
>> issues with trying to vectorize the following code:
>>
>> #include <vector>
>> std::size_t f(std::vector<std::vector<float>> const & v) {
>>   std::size_t ret = 0;
>>   for (auto const & w: v)
>>     ret += w.size();
>>   return ret;
>> }
>>
>> icc could vectorize it, but gcc couldn't, but neither of us could
>> immediately figure out what the problem was.
>>
>> Using -fopt-info leads to a wall of text.
>>
>> I tried using my patch here:
>>
>> "[PATCH] v3 of optinfo, remarks and optimization records"
>> https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01267.html
>>
>> It improved things somewhat, by showing:
>> (a) the nesting structure via indentation, and
>> (b) the GCC line at which each message is emitted (by using the
>> "remark" output)
>>
>> but it's still a wall of text:
>>
>> https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.remarks.html
>>
>> https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.d/..%7C..%7Csrc%7Ctest.cc.html#line-4
>>
>> It doesn't yet provide a simple high-level message to a
>> tech-savvy user on what they need to do to get GCC to
>> vectorize their loop.
>
> Yeah, in particular the vectorizer is way too noisy in its low-level
> functions. IIRC -fopt-info-vec-missed is "somewhat" better:
>
> t.C:4:26: note: step unknown.
> t.C:4:26: note: vector alignment may not be reachable
> t.C:4:26: note: not ssa-name.
> t.C:4:26: note: use not simple.
> t.C:4:26: note: not ssa-name.
> t.C:4:26: note: use not simple.
> t.C:4:26: note: no array mode for V2DI[3]
> t.C:4:26: note: Data access with gaps requires scalar epilogue loop
> t.C:4:26: note: can't use a fully-masked loop because the target doesn't
> have the appropriate masked load or store.
> t.C:4:26: note: not ssa-name.
> t.C:4:26: note: use not simple.
> t.C:4:26: note: not ssa-name.
> t.C:4:26: note: use not simple.
> t.C:4:26: note: no array mode for V2DI[3]
> t.C:4:26: note: Data access with gaps requires scalar epilogue loop
> t.C:4:26: note: op not supported by target.
> t.C:4:26: note: not vectorized: relevant stmt not supported: _15 = _14
> /[ex] 4;
> t.C:4:26: note: bad operation or unsupported loop bound.
> t.C:4:26: note: not vectorized: no grouped stores in basic block.
> t.C:4:26: note: not vectorized: no grouped stores in basic block.
> t.C:6:12: note: not vectorized: not enough data-refs in basic block.
>
>
>> The pertinent dump messages are:
>>
>> test.cc:4:23: remark: === try_vectorize_loop_1 ===
>> [../../src/gcc/tree-vectorizer.c:674:try_vectorize_loop_1]
>> cc1plus: remark:
>> Analyzing loop at test.cc:4
>> [../../src/gcc/dumpfile.c:735:ensure_pending_optinfo]
>> test.cc:4:23: remark: === analyze_loop_nest ===
>> [../../src/gcc/tree-vect-loop.c:2299:vect_analyze_loop]
>> [...snip...]
>> test.cc:4:23: remark: === vect_analyze_loop_operations ===
>> [../../src/gcc/tree-vect-loop.c:1520:vect_analyze_loop_operations]
>> [...snip...]
>> test.cc:4:23: remark: ==> examining statement: ‘_15 = _14 /[ex] 4;’
>> [../../src/gcc/tree-vect-stmts.c:9382:vect_analyze_stmt]
>> test.cc:4:23: remark: vect_is_simple_use: operand ‘_14’
>> [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
>> test.cc:4:23: remark: def_stmt: ‘_14 = _8 - _7;’
>> [../../src/gcc/tree-vect-stmts.c:10098:vect_is_simple_use]
>> test.cc:4:23: remark: type of def: internal
>> [../../src/gcc/tree-vect-stmts.c:10112:vect_is_simple_use]
>> test.cc:4:23: remark: vect_is_simple_use: operand ‘4’
>> [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
>> test.cc:4:23: remark: op not supported by target.
>> [../../src/gcc/tree-vect-stmts.c:5932:vectorizable_operation]
>> test.cc:4:23: remark: not vectorized: relevant stmt not supported: ‘_15 =
>> _14 /[ex] 4;’ [../../src/gcc/tree-vect-stmts.c:9565:vect_analyze_stmt]
>> test.cc:4:23: remark: bad operation or unsupported loop bound.
>> [../../src/gcc/tree-vect-loop.c:2043:vect_analyze_loop_2]
>> cc1plus: remark: vectorized 0 loops in function.
>> [../../src/gcc/tree-vectorizer.c:904:vectorize_loops]
>>
>> In particular, that complaint from
>> [../../src/gcc/tree-vect-stmts.c:9565:vect_analyze_stmt]
>> is coming from:
>>
>>   if (!ok)
>>     {
>>       if (dump_enabled_p ())
>>         {
>>           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>>                            "not vectorized: relevant stmt not ");
>>           dump_printf (MSG_MISSED_OPTIMIZATION, "supported: ");
>>           dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt, 0);
>>         }
>>
>>       return false;
>>     }
>>
>> This got me thinking: the user presumably wants to know several
>> things:
>>
>> * the location of the loop that can't be vectorized (vect_location
>>   captures this)
>> * location of the problematic statement
>> * why it's problematic
>> * the problematic statement itself.
>>
>> The following is an experiment at capturing that information, by
>> recording an "opt_problem" instance describing what the optimization
>> problem is, created deep in the callstack when it occurs, for use
>> later on back at the top of the vectorization callstack.
>
> Nice idea. Of course the issue is then for which issues to
> exactly create those. Like all of the MSG_MISSED_OPTIMIZATION
> dumpings?
>
> I guess the vectorizer needs some axing of useless messages
> and/or we need a MSG_DEBUG to have an additional level below
> MSG_NOTE.

Sounds good. One of the worst sources of excess noise seems to be
vect_is_simple_use: just looking at an operand usually produces
three lines of text. I did wonder about suggesting we axe that,
but I suppose there are times when it might be useful.

Thanks,
Richard
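
For readers following the thread, here is a rough sketch of the "opt_problem" idea quoted above: record the problem where it is detected, pass it back up the call stack, and emit one high-level message at the top. This is not the code from David's patch. Every type and function name below (source_loc, toy_stmt, toy_analyze_stmt and so on) is a hypothetical stand-in, and the "supported by target" flag is a toy substitute for the real target query.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a source location; not GCC's location_t.
struct source_loc {
  std::string file;
  int line;
  int column;
};

// One record of why an optimization was rejected (assumed layout).
struct opt_problem {
  source_loc stmt_loc;    // location of the problematic statement
  std::string reason;     // why it is problematic
  std::string stmt_text;  // the problematic statement itself
};

// Toy statement: its text, its location, and a made-up target-support flag.
struct toy_stmt {
  std::string text;
  source_loc loc;
  bool supported_by_target;
};

// Deep in the call stack: instead of printing a message, describe the
// problem and hand it back to the caller. A null pointer means "no problem".
std::unique_ptr<opt_problem> toy_analyze_stmt(const toy_stmt &stmt) {
  if (!stmt.supported_by_target)
    return std::unique_ptr<opt_problem>(
        new opt_problem{stmt.loc, "op not supported by target", stmt.text});
  return nullptr;
}

// Middle of the call stack: stop at the first problem and propagate it.
std::unique_ptr<opt_problem>
toy_analyze_loop_operations(const std::vector<toy_stmt> &body) {
  for (const toy_stmt &stmt : body)
    if (auto problem = toy_analyze_stmt(stmt))
      return problem;
  return nullptr;
}

// Format a location as file:line:column.
std::string format_loc(const source_loc &loc) {
  return loc.file + ":" + std::to_string(loc.line) + ":" +
         std::to_string(loc.column);
}

// Top of the call stack: the loop location is known here, so a single
// message can combine it with the recorded problem.
std::string toy_try_vectorize_loop(const source_loc &loop_loc,
                                   const std::vector<toy_stmt> &body) {
  if (auto problem = toy_analyze_loop_operations(body))
    return format_loc(loop_loc) + ": missed: loop not vectorized: " +
           problem->reason + ": '" + problem->stmt_text + "' at " +
           format_loc(problem->stmt_loc);
  return format_loc(loop_loc) + ": loop vectorized";
}
```

Calling toy_try_vectorize_loop with a body that contains an unsupported statement yields one line that names the loop, the reason, the statement and its location. Those are the four items in the quoted list, and the intermediate analysis functions stay silent.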