On Wed, Aug 16, 2017 at 10:31 AM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > "Bin.Cheng" <amker.ch...@gmail.com> writes: >> On Tue, Aug 15, 2017 at 6:33 PM, Richard Sandiford >> <richard.sandif...@linaro.org> wrote: >>> Richard Biener <richard.guent...@gmail.com> writes: >>>> On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <bin.ch...@arm.com> wrote: >>>>> Hi, >>>>> This patch fixes PR81832. Root cause for the ICE is: >>>>> 1) Loop has distributed inner loop. >>>>> 2) The guarding function call IFN_LOOP_DIST_CALL happens to be in >>>>> loop's header. >>>>> 3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect >>>>> thus >>>>> not eliminated. >>>>> >>>>> Given pass_ch_vect copies loop header to enable more vectorization, we >>>>> should >>>>> skip loop in this case because distributed inner loop means this loop can >>>>> not >>>>> be vectorized anyway. One point to mention is name >>>>> inner_loop_distributed_p >>>>> is a little misleading. The name indicates that each basic block is >>>>> checked, >>>>> but the patch only checks loop's header for simplicity/efficiency's >>>>> purpose. >>>>> Any comment? >>>> >>>> My comment would be to question pass_ch_vect placement -- what was the >>>> reason to place it so late? >>>> >>>> I also see GRAPHITE runs inbetween loop distribution and vectorization -- >>>> what prevents GRAPHITE from messing up things here? Or autopar? >>>> >>>> The patch itself shows should_duplicate_loop_header_p should >>>> handle this IFN specially (somehow all IFNs are considered "inexpensive"). >>>> >>>> So can you please adjust should_duplicate_loop_header_p instead and/or >>>> gimple_inexpensive_call_p? Since we have IFNs for stuff like EXP10 I'm not >>>> sure if that default is so good. >>> >>> I think the default itself is OK: we only use IFNs for libm functions >>> like exp10 if an underlying optab exists (unlike __builtin_exp10, which >>> is useful as the non-errno setting version of exp10), so the target must >>> have something that it thinks is worth open-coding. Also, we currently >>> treat all MD built-ins as "simple" and thus "inexpensive", so the IFN >>> handling is consistent with that. >>> >>> Maybe there are some IFNs that are worth special-casing as expensive, >>> but IMO doing that to solve this problem would be a bit hacky. It seems >>> like "inexpensive" should be more of a cost thing than a correctness thing. >> Hi all, > >> This is updated patch. It only adds a single check on IFN_LOOP_DIST_ALIAS >> in function should_duplicate_loop_header_p. > > Thanks. > >> As for gimple_inexpensive_call_p, I think it's natural to consider >> functions like IFN_LOOP_VECTORIZED and IFN_LOOP_DIST_ALIAS as >> expensive because they are only meant to indicate temporary >> arrangement of optimizations and are never used in code generation. >> I will send a standalone patch for that. > > Is that enough to consider them expensive though? To me, "expensive" Not sure. Or the other hand, "Expensive" is a measurement of the cost of generated code. For internal function calls in discussion, maybe we should not ask the question in the first. Even these function calls are expanded to constant, IMHO, we can't simply consider it's cheap either because there is high level side-effects along with expanding, i.e, undoing loop versioning. such high level transformation is not (and should not be) covered by gimple_inexpensive_call_p.
Thanks, bin > should mean that they cost a lot in terms of size or speed (whichever > is most important in context). Both functions are really cheap in > that sense, since they eventually expand to constants. > > Thanks, > Richard > >> Another thing is this patch doesn't check IFN_LOOP_VECTORIZED because >> it's impossible to have it with current order of passes. Bootstrap >> and test ongoing. Further comments? >> >> Thanks, >> bin >> 2017-08-15 Bin Cheng <bin.ch...@arm.com> >> >> PR tree-optimization/81832 >> * tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Don't >> copy loop header which has IFN_LOOP_DIST_ALIAS call. >> >> gcc/testsuite >> 2017-08-15 Bin Cheng <bin.ch...@arm.com> >> >> PR tree-optimization/81832 >> * gcc.dg/tree-ssa/pr81832.c: New test. >> >>> >>> Thanks, >>> Richard >>> >>>> Thanks, >>>> Richard. >>>> >>>>> Bootstrap and test on x86_64. >>>>> >>>>> Thanks, >>>>> bin >>>>> 2017-08-15 Bin Cheng <bin.ch...@arm.com> >>>>> >>>>> PR tree-optimization/81832 >>>>> * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function. >>>>> (pass_ch_vect::process_loop_p): Call above function. >>>>> >>>>> gcc/testsuite >>>>> 2017-08-15 Bin Cheng <bin.ch...@arm.com> >>>>> >>>>> PR tree-optimization/81832 >>>>> * gcc.dg/tree-ssa/pr81832.c: New test. >> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c >> b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c >> new file mode 100644 >> index 0000000..893124e >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c >> @@ -0,0 +1,22 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O3" } */ >> + >> +int a, b, *c; >> +void d(void) >> +{ >> + int **e; >> + for(;;) >> + for(int f = 1; f <= 6; f++) >> + { >> + b = 0; >> + if(a) >> +g: >> + while(a++); >> + if (**e); >> + else >> + { >> + *c = a; >> + goto g; >> + } >> + } >> +} >> diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c >> index 14cc6d8d..6bb0220 100644 >> --- a/gcc/tree-ssa-loop-ch.c >> +++ b/gcc/tree-ssa-loop-ch.c >> @@ -119,7 +119,10 @@ should_duplicate_loop_header_p (basic_block header, >> struct loop *loop, >> continue; >> >> if (gimple_code (last) == GIMPLE_CALL >> - && !gimple_inexpensive_call_p (as_a <gcall *> (last))) >> + && (!gimple_inexpensive_call_p (as_a <gcall *> (last)) >> + /* IFN_LOOP_DIST_ALIAS means that inner loop is distributed >> + at current loop's header. Don't copy in this case. */ >> + || gimple_call_internal_p (last, IFN_LOOP_DIST_ALIAS))) >> { >> if (dump_file && (dump_flags & TDF_DETAILS)) >> fprintf (dump_file,