On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <bin.ch...@arm.com> wrote: > Hi, > This patch fixes PR81832. Root cause for the ICE is: > 1) Loop has distributed inner loop. > 2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's > header. > 3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus > not eliminated. > > Given pass_ch_vect copies loop header to enable more vectorization, we should > skip loop in this case because distributed inner loop means this loop can not > be vectorized anyway. One point to mention is name inner_loop_distributed_p > is a little misleading. The name indicates that each basic block is checked, > but the patch only checks loop's header for simplicity/efficiency's purpose. > Any comment?
My comment would be to question pass_ch_vect placement -- what was the reason to place it so late? I also see GRAPHITE runs inbetween loop distribution and vectorization -- what prevents GRAPHITE from messing up things here? Or autopar? The patch itself shows should_duplicate_loop_header_p should handle this IFN specially (somehow all IFNs are considered "inexpensive"). So can you please adjust should_duplicate_loop_header_p instead and/or gimple_inexpensive_call_p? Since we have IFNs for stuff like EXP10 I'm not sure if that default is so good. Thanks, Richard. > Bootstrap and test on x86_64. > > Thanks, > bin > 2017-08-15 Bin Cheng <bin.ch...@arm.com> > > PR tree-optimization/81832 > * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function. > (pass_ch_vect::process_loop_p): Call above function. > > gcc/testsuite > 2017-08-15 Bin Cheng <bin.ch...@arm.com> > > PR tree-optimization/81832 > * gcc.dg/tree-ssa/pr81832.c: New test.