On 12/04/17 13:58, Wilco Dijkstra wrote:
> With -mcpu=generic the loop alignment is currently 4.  All but one of the
> supported cores use 8 or higher.  Since using 8 provides performance gains
> on several cores, it is best to use that by default.  As discussed in [1],
> the jump alignment has no effect on performance, yet has a relatively high
> codesize cost [2], so setting it to 4 is best.  This gives a 0.2% overall
> codesize improvement as well as performance gains in several benchmarks.
> Any objections?
>
> Bootstrap OK on AArch64, OK for stage 1?
>
> ChangeLog:
> 2017-04-12  Wilco Dijkstra  <wdijk...@arm.com>
>
>     * config/aarch64/aarch64.c (generic_tunings): Set jump alignment to 4.
>     Set loop alignment to 8.
>

OK.  It looks to me as though these two values were back-to-front.  Having
loop align lower than jump align sounds just perverse.

R.

> [1] https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00574.html
> [2] https://gcc.gnu.org/ml/gcc-patches/2016-06/msg02075.html
>
> ---
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index c8cf7169a5d387de336920b50c83761dc0c96f3a..8b729b1b1f87316e940d7fc657f235a935ffa93e 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -538,8 +538,8 @@ static const struct tune_params generic_tunings =
>    2, /* issue_rate  */
>    (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
>    8, /* function_align.  */
> -  8, /* jump_align.  */
> -  4, /* loop_align.  */
> +  4, /* jump_align.  */
> +  8, /* loop_align.  */
>    2, /* int_reassoc_width.  */
>    4, /* fp_reassoc_width.  */
>    1, /* vec_reassoc_width.  */
>
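For illustration only, here is a rough sketch of why an alignment of 4 costs no code size on AArch64 while 8 can.  A64 instructions are 4 bytes wide, so every label already sits on a 4-byte boundary; aligning to 8 inserts 4 bytes of padding whenever the label would otherwise land on an odd multiple of 4.  The helper names below are hypothetical and are not part of GCC, and the model deliberately ignores any max-skip limits and the real distribution of label offsets.

```c
#include <stddef.h>

/* Hypothetical helper, not a GCC function: bytes of padding needed to
   move OFFSET up to the next multiple of ALIGN.  ALIGN must be a power
   of two.  */
static size_t
padding_for (size_t offset, size_t align)
{
  return (align - (offset & (align - 1))) & (align - 1);
}

/* Hypothetical helper: sum of the padding over every 4-byte-aligned
   offset in one ALIGN-sized window.  Dividing the result by ALIGN / 4
   gives the average padding per label, assuming (as a simplification)
   that labels are equally likely to land on any 4-byte slot.  */
static size_t
total_padding_over_window (size_t align)
{
  size_t total = 0;
  for (size_t off = 0; off < align; off += 4)
    total += padding_for (off, align);
  return total;
}
```

Under these assumptions an alignment of 4 never pads, and an alignment of 8 pads 4 bytes for one of the two possible slots, i.e. 2 bytes per aligned label on average.  Jump targets are presumably far more numerous than loop heads, which would be consistent with the codesize cost reported in [2].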