> -----Original Message-----
> From: Kyrylo Tkachov
> Sent: 18 March 2021 09:37
> To: 'qia...@fujitsu.com' <qia...@fujitsu.com>; gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford <richard.sandif...@arm.com>
> Subject: RE: [PATCH] aarch64: Improve generic SVE tuning defaults
> 
> 
> 
> > -----Original Message-----
> > From: qia...@fujitsu.com <qia...@fujitsu.com>
> > Sent: 18 March 2021 01:52
> > To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>; gcc-patc...@gcc.gnu.org
> > Cc: Richard Sandiford <richard.sandif...@arm.com>
> > Subject: RE: [PATCH] aarch64: Improve generic SVE tuning defaults
> >
> > Hello Kyrill,
> >
> > Sorry for the slow response.
> > Performance on A64FX is not impacted by this patch.
> 
> Thank you very much for testing, Qian.
> Glad to see there is no impact on A64FX. I will push the patch to master then.

I should say, I intend to backport this to GCC 10 as well, as it has the same
effect on that branch (helps Neoverse V1, no effect on anything else).
I will do so after a bit more testing; the patch applies cleanly.
Thanks,
Kyrill

> Kyrill
> 
> >
> > Regards,
> > Qian
> >
> > > -----Original Message-----
> > > From: Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> > > Sent: Wednesday, March 10, 2021 10:56 PM
> > > To: gcc-patches@gcc.gnu.org
> > > > Cc: Richard Sandiford <richard.sandif...@arm.com>; Qian, Jianhua/钱建华
> > > > <qia...@fujitsu.com>
> > > Subject: [PATCH] aarch64: Improve generic SVE tuning defaults
> > >
> > > Hi all,
> > >
> > > > This patch adds the recently-added tweak to split some SVE VL-based
> > > > scalar operations [1] to the generic tuning used for SVE, as enabled
> > > > by adding +sve to the -march flag, for example -march=armv8.2-a+sve.
> > >
> > > > The recommendation for best performance on a particular CPU remains
> > > > unchanged: use the -mcpu option for that CPU, where possible.
> > > > -mcpu=native makes this straightforward for native compilation.
> > >
> > > > The tweak to split out SVE VL-based scalar operations is a consistent
> > > > win for the Neoverse V1 CPU and should be neutral for the Fujitsu
> > > > A64FX. A run of SPEC2017 on A64FX with this tweak on didn't show any
> > > > non-noise differences. It is also expected to be neutral on SVE2
> > > > implementations.
> > >
> > > > Therefore, the patch enables the tweak for generic +sve tuning e.g.
> > > > -march=armv8.2-a+sve. No SVE2 CPUs are expected to benefit from it,
> > > > therefore the tweak is disabled for generic tuning when +sve2 is in
> > > > -march e.g. -march=armv8.2-a+sve2.
> > >
> > > > The implementation of this approach requires a bit of custom logic in
> > > > aarch64_override_options_internal to handle these kinds of
> > > > architecture-dependent decisions, but we do believe the user-facing
> > > > principle here is important to implement.
> > >
> > > > Qian, as you've contributed the A64FX support to GCC, I would be
> > > > grateful for your feedback on this approach and in particular on the
> > > > performance evaluation of this change.
> > >
> > > > In general, for the generic target we're using a decision framework
> > > > that looks like:
> > >
> > > > * If all cores that are known to benefit from an optimization are of
> > > >   architecture X, and all other cores that implement X or above are not
> > > >   impacted, or have a very slight impact, we will consider it for
> > > >   generic tuning for architecture X.
> > > > * We will not enable that optimisation for generic tuning for
> > > >   architecture X+1 if no known cores of architecture X+1 or above will
> > > >   benefit.
> > >
> > > > This framework allows us to improve generic tuning for CPUs of
> > > > generation X while avoiding accumulating tweaks for future CPUs of
> > > > generation X+1, X+2... that do not need them, and thus avoid even the
> > > > slight negative effects of these optimisations if the user is willing
> > > > to tell us the desired architecture accurately.
> > >
> > > > X above can mean either annual architecture updates (Armv8.2-a,
> > > > Armv8.3-a etc) or optional architecture extensions (like SVE, SVE2).
> > >
> > > > We think that this patch fits that framework, so would like to
> > > > propose it for the trunk default tunings for SVE.
> > >
> > > Bootstrapped and tested on aarch64-none-linux-gnu.
> > >
> > > Thanks,
> > > Kyrill
> > >
> > > [1] http://gcc.gnu.org/g:a65b9ad863c5fc0aea12db58557f4d286a1974d7
> > >
> > > gcc/ChangeLog:
> > >
> > > >   * config/aarch64/aarch64.c (aarch64_adjust_generic_arch_tuning):
> > > >   Define.
> > > >   (aarch64_override_options_internal): Use it.
> > > >   (generic_tunings): Add AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS to
> > > >   tune_flags.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > >   * g++.target/aarch64/sve/aarch64-sve.exp: Add -moverride=tune=none
> > > >   to sve_flags.
> > >   * g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
> > >   * g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
> > >   * gcc.target/aarch64/sve/aarch64-sve.exp: Likewise.
> > >   * gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
> > >   * gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
> > >
