Hi,

I've seen a couple of large performance issues caused by expanding the high-precision reciprocal square root for Cortex-A57, so I'd like to turn it off by default.
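For readers unfamiliar with the expansion being discussed: as I understand it, the sequence replaces a square root and divide with a hardware estimate followed by Newton-Raphson refinement steps. The sketch below is only an illustration of that shape in portable C, not the code GCC emits. The function names (`rsqrt_estimate`, `rsqrt_step_factor`, `approx_rsqrt`) and the `steps` parameter are hypothetical, and the integer bit trick is an assumed stand-in for the AArch64 FRSQRTE estimate instruction so that the example runs on any host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the hardware estimate (FRSQRTE on AArch64).  This is the
   well-known integer bit trick, NOT what the hardware computes; it is
   only valid for positive, finite, normal inputs.  */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);      /* reinterpret float as integer */
    bits = 0x5f3759dfu - (bits >> 1);    /* crude first guess at 1/sqrt(x) */
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* Models the refinement step instruction (FRSQRTS): (3 - a * b) / 2.  */
static float rsqrt_step_factor(float a, float b)
{
    return (3.0f - a * b) / 2.0f;
}

/* Approximate 1/sqrt(x) with `steps` Newton-Raphson iterations:
   y' = y * (3 - x * y * y) / 2.  Each iteration is a chain of
   dependent floating-point operations.  */
float approx_rsqrt(float x, int steps)
{
    float y;
    int i;

    assert(x > 0.0f);
    y = rsqrt_estimate(x);
    for (i = 0; i < steps; i++) {
        float y2 = y * y;                    /* multiply */
        y = y * rsqrt_step_factor(x, y2);    /* step, then multiply */
    }
    return y;
}
```

Calling `approx_rsqrt(x, n)` with a larger `n` gives a more accurate result at the cost of a longer dependent chain. Each refinement step adds further multiplies, which is one plausible reason (an assumption on my part, not something measured here) why code that already stresses the multiply units could lose out when the expansion is enabled.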

Turning it off is good for art (~2%) from Spec2000, bad (~3.5%) for fma3d from Spec2000, good (~5.5%) for gromacs from Spec2006, and very good (>10%) for some private microbenchmark kernels which stress the divide/sqrt/multiply units. It therefore seems to me to be the correct choice across a number of workloads.

Bootstrapped and tested on aarch64-none-linux-gnu with no issues.

OK?

Thanks,
James

---
2015-12-11  James Greenhalgh  <james.greenha...@arm.com>

	* config/aarch64/aarch64.c (cortexa57_tunings): Remove
	AARCH64_EXTRA_TUNE_RECIP_SQRT.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1d5d898..999c9fc 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -484,8 +484,7 @@ static const struct tune_params cortexa57_tunings =
   0, /* max_case_values. */
   0, /* cache_line_size. */
   tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */
-  (AARCH64_EXTRA_TUNE_RENAME_FMA_REGS
-   | AARCH64_EXTRA_TUNE_RECIP_SQRT) /* tune_flags. */
+  (AARCH64_EXTRA_TUNE_RENAME_FMA_REGS) /* tune_flags. */
 };
 
 static const struct tune_params cortexa72_tunings =