On Thu, Nov 14, 2019 at 4:41 PM Sudakshina Das <sudi....@arm.com> wrote: > > Hi > > This patch is trying to fix PR88398 for AArch64. As discussed in the PR, > loop unrolling is probably what we can do here. As an easy fix, the > existing unroll_stupid is unrolling the given example better than the > unroll_runtime_iterations since the the loop contains a break inside it.
Hm, the bug reference doesn't help me at all in reviewing this - the bug is about vectorization. So why is unroll_stupid better than unroll_runtime_iterations for a loop with a break (or as your implementation, with multiple exists)? I don't like this target hook, it seems like general heuristics can be improved here, but it seems unroll-stupid doesn't consider loops with multiple exits at all? Richard. > So all I have done here is: > 1) Add a target hook so that this is AArch64 specific. > 2) We are not unrolling the loops that decide_unroll_runtime_iterations > would reject. > 3) Out of the ones that decide_unroll_runtime_iterations would accept, > check if the loop has more than 1 exit (this is done in the new target > hook) and if it does, try to unroll using unroll_stupid. > > Regression tested on AArch64 and added the test from the PR. This gives > an overall code size reduction of 2.35% and performance gain of 0.498% > on Spec2017 Intrate. > > Is this ok for trunk? > > Thanks > Sudi > > gcc/ChangeLog: > > 2019-xx-xx Sudakshina Das <sudi....@arm.com> > > PR88398 > * cfgloop.h: Include target.h. > (lpt_dec): Move to... > * target.h (lpt_dec): ... Here. > * target.def: Define TARGET_LOOP_DECISION_ADJUST. > * loop-unroll.c (decide_unroll_runtime_iterations): Use new target > hook. > (decide_unroll_stupid): Likewise. > * config/aarch64/aarch64.c (aarch64_loop_decision_adjust): New > function. > (TARGET_LOOP_DECISION_ADJUST): Define for AArch64. > * doc/tm.texi: Regenerated. > * doc/tm.texi.in: Document TARGET_LOOP_DECISION_ADJUST. > > gcc/testsuite/ChangeLog: > > 2019-xx-xx Sudakshina Das <sudi....@arm.com> > > PR88398 > * gcc.target/aarch64/pr88398.c: New test.