On Fri, Nov 24, 2023 at 04:22:54PM +0000, Richard Sandiford wrote:
> Andrew Carlotti <andrew.carlo...@arm.com> writes:
> > This adds initial support for function multiversioning on aarch64 using
> > the target_version and target_clones attributes.  This loosely follows
> > the Beta specification in the ACLE [1], although with some differences
> > that still need to be resolved (possibly as follow-up patches).
> >
> > Existing function multiversioning implementations are broken in various
> > ways when used across translation units.  This includes placing
> > resolvers in the wrong translation units, and using symbol mangling that
> > causes callers to unintentionally bypass the resolver in some circumstances.
> > Fixing these issues for aarch64 will require modifications to our ACLE
> > specification.  It will also require further adjustments to existing
> > middle end code, to facilitate different mangling and resolver
> > placement while preserving existing target behaviours.
> >
> > The list of function multiversioning features specified in the ACLE is
> > also inconsistent with the list of features supported in target option
> > extensions.  I intend to resolve some or all of these inconsistencies at
> > a later stage.
> >
> > The target_version attribute is currently only supported in C++, since
> > this is the only frontend with existing support for multiversioning
> > using the target attribute.  On the other hand, this patch happens to
> > enable multiversioning with the target_clones attribute in Ada and D, as
> > well as the entire C family, using their existing frontend support.
> >
> > This patch also does not support the following aspects of the Beta
> > specification:
> >
> > - The target_clones attribute should allow an implicit unlisted
> >   "default" version.
> > - There should be an option to disable function multiversioning at
> >   compile time.
> > - Unrecognised target names in a target_clones attribute should be
> >   ignored (with an optional warning).  This current patch raises an
> >   error instead.
> >
> > [1] https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning
> >
> > ---
> >
> > I believe the support present in this patch correctly handles function
> > multiversioning within a single translation unit for all features in the
> > ACLE specification with option extension support.
> >
> > Is it ok to push this patch in its current state? I'd then continue working
> > on incremental improvements to the supported feature extensions and the ABI
> > issues in followup patches, along with corresponding changes and
> > improvements to the ACLE specification.
> >
> >
> > gcc/ChangeLog:
> >
> >     * config/aarch64/aarch64-feature-deps.h (fmv_deps_<FEAT_NAME>):
> >     Define aarch64_feature_flags mask for each FMV feature.
> >     * config/aarch64/aarch64-option-extensions.def: Use new macros
> >     to define FMV feature extensions.
> >     * config/aarch64/aarch64.cc (aarch64_option_valid_attribute_p):
> >     Check for target_version attribute after processing target
> >     attribute.
> >     (aarch64_fmv_feature_data): New.
> >     (aarch64_parse_fmv_features): New.
> >     (aarch64_process_target_version_attr): New.
> >     (aarch64_option_valid_version_attribute_p): New.
> >     (get_feature_mask_for_version): New.
> >     (compare_feature_masks): New.
> >     (aarch64_compare_version_priority): New.
> >     (build_ifunc_arg_type): New.
> >     (make_resolver_func): New.
> >     (add_condition_to_bb): New.
> >     (compare_feature_version_info): New.
> >     (dispatch_function_versions): New.
> >     (aarch64_generate_version_dispatcher_body): New.
> >     (aarch64_get_function_versions_dispatcher): New.
> >     (aarch64_common_function_versions): New.
> >     (aarch64_mangle_decl_assembler_name): New.
> >     (TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): New implementation.
> >     (TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE): New implementation.
> >     (TARGET_OPTION_FUNCTION_VERSIONS): New implementation.
> >     (TARGET_COMPARE_VERSION_PRIORITY): New implementation.
> >     (TARGET_GENERATE_VERSION_DISPATCHER_BODY): New implementation.
> >     (TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New implementation.
> >     (TARGET_MANGLE_DECL_ASSEMBLER_NAME): New implementation.
> >     * config/arm/aarch-common.h (enum aarch_parse_opt_result): Add
> >       new value to report duplicate FMV feature.
> >     * common/config/aarch64/cpuinfo.h: New file.
> >
> > libgcc/ChangeLog:
> >
> >     * config/aarch64/cpuinfo.c (enum CPUFeatures): Move to shared
> >       copy in gcc/common
> >
> > gcc/testsuite/ChangeLog:
> >
> >     * gcc.target/aarch64/options_set_17.c: Reorder expected flags.
> >     * gcc.target/aarch64/cpunative/native_cpu_0.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_13.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_16.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_17.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_18.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_19.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_20.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_21.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_6.c: Ditto.
> >     * gcc.target/aarch64/cpunative/native_cpu_7.c: Ditto.
> 
> Thanks, mostly looks good, but some comments below:
> 
> > diff --git a/gcc/common/config/aarch64/cpuinfo.h b/gcc/common/config/aarch64/cpuinfo.h
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..1690b6eee48e960d0ae675f8e8b05e6f182b56a3
> > --- /dev/null
> > +++ b/gcc/common/config/aarch64/cpuinfo.h
> > @@ -0,0 +1,94 @@
> > +/* CPU feature detection for AArch64 architecture.
> > +   Copyright (C) 2023 Free Software Foundation, Inc.
> > +
> > +   This file is part of GCC.
> > +
> > +   This file is free software; you can redistribute it and/or modify it
> > +   under the terms of the GNU General Public License as published by the
> > +   Free Software Foundation; either version 3, or (at your option) any
> > +   later version.
> > +
> > +   This file is distributed in the hope that it will be useful, but
> > +   WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   General Public License for more details.
> > +
> > +   Under Section 7 of GPL version 3, you are granted additional
> > +   permissions described in the GCC Runtime Library Exception, version
> > +   3.1, as published by the Free Software Foundation.
> > +
> > +   You should have received a copy of the GNU General Public License and
> > +   a copy of the GCC Runtime Library Exception along with this program;
> > +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +/* This enum is used in libgcc feature detection, and in the function
> > +   multiversioning implementation in aarch64.cc.  The enum should use the
> > +   same values as the corresponding enum in LLVM's compiler-rt, to facilitate
> > +   compatibility between compilers.  */
> > +
> > +enum CPUFeatures {
> > +  FEAT_RNG,
> > +  FEAT_FLAGM,
> > +  FEAT_FLAGM2,
> > +  FEAT_FP16FML,
> > +  FEAT_DOTPROD,
> > +  FEAT_SM4,
> > +  FEAT_RDM,
> > +  FEAT_LSE,
> > +  FEAT_FP,
> > +  FEAT_SIMD,
> > +  FEAT_CRC,
> > +  FEAT_SHA1,
> > +  FEAT_SHA2,
> > +  FEAT_SHA3,
> > +  FEAT_AES,
> > +  FEAT_PMULL,
> > +  FEAT_FP16,
> > +  FEAT_DIT,
> > +  FEAT_DPB,
> > +  FEAT_DPB2,
> > +  FEAT_JSCVT,
> > +  FEAT_FCMA,
> > +  FEAT_RCPC,
> > +  FEAT_RCPC2,
> > +  FEAT_FRINTTS,
> > +  FEAT_DGH,
> > +  FEAT_I8MM,
> > +  FEAT_BF16,
> > +  FEAT_EBF16,
> > +  FEAT_RPRES,
> > +  FEAT_SVE,
> > +  FEAT_SVE_BF16,
> > +  FEAT_SVE_EBF16,
> > +  FEAT_SVE_I8MM,
> > +  FEAT_SVE_F32MM,
> > +  FEAT_SVE_F64MM,
> > +  FEAT_SVE2,
> > +  FEAT_SVE_AES,
> > +  FEAT_SVE_PMULL128,
> > +  FEAT_SVE_BITPERM,
> > +  FEAT_SVE_SHA3,
> > +  FEAT_SVE_SM4,
> > +  FEAT_SME,
> > +  FEAT_MEMTAG,
> > +  FEAT_MEMTAG2,
> > +  FEAT_MEMTAG3,
> > +  FEAT_SB,
> > +  FEAT_PREDRES,
> > +  FEAT_SSBS,
> > +  FEAT_SSBS2,
> > +  FEAT_BTI,
> > +  FEAT_LS64,
> > +  FEAT_LS64_V,
> > +  FEAT_LS64_ACCDATA,
> > +  FEAT_WFXT,
> > +  FEAT_SME_F64,
> > +  FEAT_SME_I64,
> > +  FEAT_SME2,
> > +  FEAT_RCPC3,
> > +  FEAT_MAX,
> > +  FEAT_EXT = 62, /* Reserved to indicate presence of additional features field
> > +               in __aarch64_cpu_features.  */
> > +  FEAT_INIT      /* Used as flag of features initialization completion.  */
> > +};
> > diff --git a/gcc/config/aarch64/aarch64-feature-deps.h 
> > b/gcc/config/aarch64/aarch64-feature-deps.h
> > index 
> > 7b85a8860de57f6727644c03296cef192ad0990c..8f20582e1efdd4817138480bee8cdb27fa7f3dfe
> >  100644
> > --- a/gcc/config/aarch64/aarch64-feature-deps.h
> > +++ b/gcc/config/aarch64/aarch64-feature-deps.h
> > @@ -115,6 +115,13 @@ get_flags_off (aarch64_feature_flags mask)
> >    constexpr auto cpu_##CORE_IDENT = ARCH_IDENT ().enable | get_enable FEATURES;
> >  #include "config/aarch64/aarch64-cores.def"
> >  
> > +/* Define fmv_deps_<NAME> variables for each FMV feature, giving the transitive
> > +   closure of all the features that the FMV feature enables.  */
> > +#define AARCH64_FMV_FEATURE(A, FEAT_NAME, OPT_FLAGS) \
> > +  constexpr auto fmv_deps_##FEAT_NAME = get_enable OPT_FLAGS;
> > +#include "config/aarch64/aarch64-option-extensions.def"
> > +
> > +
> >  }
> >  }
> >  
> > diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
> > b/gcc/config/aarch64/aarch64-option-extensions.def
> > index 
> > 825f3bf775899e2e5cffb1867b82766d632c8708..07df403491494d6dfe19095872ab32b9d60e9690
> >  100644
> > --- a/gcc/config/aarch64/aarch64-option-extensions.def
> > +++ b/gcc/config/aarch64/aarch64-option-extensions.def
> > @@ -17,17 +17,22 @@
> >     along with GCC; see the file COPYING3.  If not see
> >     <http://www.gnu.org/licenses/>.  */
> >  
> > -/* This is a list of ISA extentsions in AArch64.
> > +/* This is a list of ISA extensions in AArch64.
> >  
> > -   Before using #include to read this file, define a macro:
> > +   Before using #include to read this file, define one of the following
> > +   macros:
> >  
> >        AARCH64_OPT_EXTENSION(NAME, IDENT, REQUIRES, EXPLICIT_ON,
> >                         EXPLICIT_OFF, FEATURE_STRING)
> >  
> > +      AARCH64_FMV_FEATURE(NAME, FEAT_NAME, IDENT)
> > +
> >     - NAME is the name of the extension, represented as a string constant.
> >  
> >     - IDENT is the canonical internal name for this flag.
> >  
> > +   - FEAT_NAME is the unprefixed name used in the CPUFeatures enum.
> > +
> >     - REQUIRES is a list of features that must be enabled whenever this
> >       feature is enabled.  The relationship is implicitly transitive:
> >       if A appears in B's REQUIRES and B appears in C's REQUIRES then
> > @@ -58,45 +63,96 @@
> >       that are required.  Their order is not important.  An empty string 
> > means
> >       do not detect this feature during auto detection.
> >  
> > -   The list of features must follow topological order wrt REQUIRES
> > -   and EXPLICIT_ON.  For example, if A is in B's REQUIRES list, A must
> > -   come before B.  This is enforced by aarch64-feature-deps.h.
> > +   - OPT_FLAGS is a list of feature IDENTS that should be enabled (along with
> > +     their transitive dependencies) when the specified FMV feature is present.
> > +
> > +   Where a feature is present as both an extension and a function
> > +   multiversioning feature, and IDENT matches the FEAT_NAME suffix, then these
> > +   can be listed here simultaneously using the macro:
> > +
> > +      AARCH64_OPT_FMV_EXTENSION(NAME, IDENT, REQUIRES, EXPLICIT_ON,
> > +                           EXPLICIT_OFF, FEATURE_STRING)
> > +
> > +   The list of feature extensions must follow topological order wrt REQUIRES
> > +   and EXPLICIT_ON.  For example, if A is in B's REQUIRES list, A must come
> > +   before B.  This is enforced by aarch64-feature-deps.h.
> > +
> > +   The list of multiversioning features must be ordered by increasing priority,
> > +   as defined in https://github.com/ARM-software/acle/blob/main/main/acle.md
> >  
> >     NOTE: Any changes to the AARCH64_OPT_EXTENSION macro need to be 
> > mirrored in
> >     config.gcc.  */
> >  
> > +#ifndef AARCH64_OPT_EXTENSION
> > +#define AARCH64_OPT_EXTENSION(NAME, IDENT, REQUIRES, EXPLICIT_ON, \
> > +                         EXPLICIT_OFF, FEATURE_STRING)
> > +#endif
> > +
> > +#ifndef AARCH64_FMV_FEATURE
> > +#define AARCH64_FMV_FEATURE(NAME, FEAT_NAME, OPT_FLAGS)
> > +#endif
> > +
> > +#define AARCH64_OPT_FMV_EXTENSION(NAME, IDENT, REQUIRES, EXPLICIT_ON,   \
> > +                             EXPLICIT_OFF, FEATURE_STRING)         \
> > +AARCH64_OPT_EXTENSION(NAME, IDENT, REQUIRES, EXPLICIT_ON, EXPLICIT_OFF,    \
> > +                 FEATURE_STRING)                                   \
> > +AARCH64_FMV_FEATURE(NAME, IDENT, (IDENT))
> > +
> > +
> >  AARCH64_OPT_EXTENSION("fp", FP, (), (), (), "fp")
> >  
> >  AARCH64_OPT_EXTENSION("simd", SIMD, (FP), (), (), "asimd")
> >  
> > -AARCH64_OPT_EXTENSION("crc", CRC, (), (), (), "crc32")
> > +AARCH64_OPT_FMV_EXTENSION("rng", RNG, (), (), (), "rng")
> >  
> > -AARCH64_OPT_EXTENSION("lse", LSE, (), (), (), "atomics")
> > +AARCH64_OPT_FMV_EXTENSION("flagm", FLAGM, (), (), (), "flagm")
> >  
> > -/* +nofp16 disables an implicit F16FML, even though an implicit F16FML
> > -   does not imply F16.  See F16FML for more details.  */
> > -AARCH64_OPT_EXTENSION("fp16", F16, (FP), (), (F16FML), "fphp asimdhp")
> > +AARCH64_FMV_FEATURE("flagm2", FLAGM2, (FLAGM))
> > +
> > +AARCH64_FMV_FEATURE("fp16fml", FP16FML, (F16FML))
> > +
> > +AARCH64_OPT_FMV_EXTENSION("dotprod", DOTPROD, (SIMD), (), (), "asimddp")
> >  
> > -AARCH64_OPT_EXTENSION("rcpc", RCPC, (), (), (), "lrcpc")
> > +AARCH64_OPT_FMV_EXTENSION("sm4", SM4, (SIMD), (), (), "sm3 sm4")
> >  
> >  /* An explicit +rdma implies +simd, but +rdma+nosimd still enables scalar
> >     RDMA instructions.  */
> >  AARCH64_OPT_EXTENSION("rdma", RDMA, (), (SIMD), (), "asimdrdm")
> >  
> > -AARCH64_OPT_EXTENSION("dotprod", DOTPROD, (SIMD), (), (), "asimddp")
> > +AARCH64_FMV_FEATURE("rdm", RDM, (RDMA))
> > +
> > +AARCH64_OPT_FMV_EXTENSION("lse", LSE, (), (), (), "atomics")
> > +
> > +AARCH64_FMV_FEATURE("fp", FP, (FP))
> > +
> > +AARCH64_FMV_FEATURE("simd", SIMD, (SIMD))
> > +
> > +AARCH64_OPT_FMV_EXTENSION("crc", CRC, (), (), (), "crc32")
> >  
> > -AARCH64_OPT_EXTENSION("aes", AES, (SIMD), (), (), "aes")
> > +AARCH64_FMV_FEATURE("sha1", SHA1, ())
> >  
> > -AARCH64_OPT_EXTENSION("sha2", SHA2, (SIMD), (), (), "sha1 sha2")
> > +AARCH64_OPT_FMV_EXTENSION("sha2", SHA2, (SIMD), (), (), "sha1 sha2")
> > +
> > +AARCH64_FMV_FEATURE("sha3", SHA3, (SHA3))
> > +
> > +AARCH64_OPT_FMV_EXTENSION("aes", AES, (SIMD), (), (), "aes")
> > +
> > +AARCH64_FMV_FEATURE("pmull", PMULL, ())
> >  
> >  /* +nocrypto disables AES, SHA2 and SM4, and anything that depends on them
> >     (such as SHA3 and the SVE2 crypto extensions).  */
> >  AARCH64_OPT_EXTENSION("crypto", CRYPTO, (AES, SHA2), (), (AES, SHA2, SM4),
> >                   "aes pmull sha1 sha2")
> >  
> > +/* Listing sha3 after crypto means we pass "+aes+sha3" to the assembler
> > +   instead of "+sha3+crypto".  */
> >  AARCH64_OPT_EXTENSION("sha3", SHA3, (SHA2), (), (), "sha3 sha512")
> >  
> > -AARCH64_OPT_EXTENSION("sm4", SM4, (SIMD), (), (), "sm3 sm4")
> > +/* +nofp16 disables an implicit F16FML, even though an implicit F16FML
> > +   does not imply F16.  See F16FML for more details.  */
> > +AARCH64_OPT_EXTENSION("fp16", F16, (FP), (), (F16FML), "fphp asimdhp")
> > +
> > +AARCH64_FMV_FEATURE("fp16", FP16, (F16))
> >  
> >  /* An explicit +fp16fml implies +fp16, but a dependence on it does not.
> >     Thus -march=armv8.4-a implies F16FML but not F16.  -march=armv8.4-a+fp16
> > @@ -104,51 +160,117 @@ AARCH64_OPT_EXTENSION("sm4", SM4, (SIMD), (), (), 
> > "sm3 sm4")
> >     -march=armv8.4-a+nofp16+fp16 enables F16 but not F16FML.  */
> >  AARCH64_OPT_EXTENSION("fp16fml", F16FML, (), (F16), (), "asimdfhm")
> >  
> > -AARCH64_OPT_EXTENSION("sve", SVE, (SIMD, F16), (), (), "sve")
> > +AARCH64_FMV_FEATURE("dit", DIT, ())
> >  
> > -AARCH64_OPT_EXTENSION("profile", PROFILE, (), (), (), "")
> > +AARCH64_FMV_FEATURE("dpb", DPB, ())
> >  
> > -AARCH64_OPT_EXTENSION("rng", RNG, (), (), (), "rng")
> > +AARCH64_FMV_FEATURE("dpb2", DPB2, ())
> >  
> > -AARCH64_OPT_EXTENSION("memtag", MEMTAG, (), (), (), "")
> > +AARCH64_FMV_FEATURE("jscvt", JSCVT, ())
> >  
> > -AARCH64_OPT_EXTENSION("sb", SB, (), (), (), "sb")
> > +AARCH64_FMV_FEATURE("fcma", FCMA, (SIMD))
> >  
> > -AARCH64_OPT_EXTENSION("ssbs", SSBS, (), (), (), "ssbs")
> > +AARCH64_OPT_FMV_EXTENSION("rcpc", RCPC, (), (), (), "lrcpc")
> >  
> > -AARCH64_OPT_EXTENSION("predres", PREDRES, (), (), (), "")
> > +AARCH64_FMV_FEATURE("rcpc2", RCPC2, (RCPC))
> >  
> > -AARCH64_OPT_EXTENSION("sve2", SVE2, (SVE), (), (), "sve2")
> > +AARCH64_FMV_FEATURE("rcpc3", RCPC3, (RCPC))
> >  
> > -AARCH64_OPT_EXTENSION("sve2-sm4", SVE2_SM4, (SVE2, SM4), (), (), "svesm4")
> > +AARCH64_FMV_FEATURE("frintts", FRINTTS, ())
> > +
> > +AARCH64_FMV_FEATURE("dgh", DGH, ())
> > +
> > +AARCH64_OPT_FMV_EXTENSION("i8mm", I8MM, (SIMD), (), (), "i8mm")
> > +
> > +/* An explicit +bf16 implies +simd, but +bf16+nosimd still enables scalar 
> > BF16
> > +   instructions.  */
> > +AARCH64_OPT_FMV_EXTENSION("bf16", BF16, (FP), (SIMD), (), "bf16")
> > +
> > +AARCH64_FMV_FEATURE("ebf16", EBF16, (BF16))
> > +
> > +AARCH64_FMV_FEATURE("rpres", RPRES, ())
> > +
> > +AARCH64_OPT_FMV_EXTENSION("sve", SVE, (SIMD, F16), (), (), "sve")
> > +
> > +AARCH64_FMV_FEATURE("sve-bf16", SVE_BF16, (SVE, BF16))
> > +
> > +AARCH64_FMV_FEATURE("sve-ebf16", SVE_EBF16, (SVE, BF16))
> > +
> > +AARCH64_FMV_FEATURE("sve-i8mm", SVE_I8MM, (SVE, I8MM))
> > +
> > +AARCH64_OPT_EXTENSION("f32mm", F32MM, (SVE), (), (), "f32mm")
> > +
> > +AARCH64_FMV_FEATURE("f32mm", SVE_F32MM, (F32MM))
> > +
> > +AARCH64_OPT_EXTENSION("f64mm", F64MM, (SVE), (), (), "f64mm")
> > +
> > +AARCH64_FMV_FEATURE("f64mm", SVE_F64MM, (F64MM))
> > +
> > +AARCH64_OPT_FMV_EXTENSION("sve2", SVE2, (SVE), (), (), "sve2")
> >  
> >  AARCH64_OPT_EXTENSION("sve2-aes", SVE2_AES, (SVE2, AES), (), (), "sveaes")
> >  
> > -AARCH64_OPT_EXTENSION("sve2-sha3", SVE2_SHA3, (SVE2, SHA3), (), (), 
> > "svesha3")
> > +AARCH64_FMV_FEATURE("sve2-aes", SVE_AES, (SVE2, AES))
> > +
> > +AARCH64_FMV_FEATURE("sve2-pmull128", SVE_PMULL128, (SVE2))
> >  
> >  AARCH64_OPT_EXTENSION("sve2-bitperm", SVE2_BITPERM, (SVE2), (), (),
> >                   "svebitperm")
> >  
> > -AARCH64_OPT_EXTENSION("tme", TME, (), (), (), "")
> > +AARCH64_FMV_FEATURE("sve2-bitperm", SVE_BITPERM, (SVE2_BITPERM))
> >  
> > -AARCH64_OPT_EXTENSION("i8mm", I8MM, (SIMD), (), (), "i8mm")
> > +AARCH64_OPT_EXTENSION("sve2-sha3", SVE2_SHA3, (SVE2, SHA3), (), (), 
> > "svesha3")
> >  
> > -AARCH64_OPT_EXTENSION("f32mm", F32MM, (SVE), (), (), "f32mm")
> > +AARCH64_FMV_FEATURE("sve2-sha3", SVE_SHA3, (SVE2_SHA3))
> >  
> > -AARCH64_OPT_EXTENSION("f64mm", F64MM, (SVE), (), (), "f64mm")
> > +AARCH64_OPT_EXTENSION("sve2-sm4", SVE2_SM4, (SVE2, SM4), (), (), "svesm4")
> >  
> > -/* An explicit +bf16 implies +simd, but +bf16+nosimd still enables scalar 
> > BF16
> > -   instructions.  */
> > -AARCH64_OPT_EXTENSION("bf16", BF16, (FP), (SIMD), (), "bf16")
> > +AARCH64_FMV_FEATURE("sve2-sm4", SVE_SM4, (SVE2_SM4))
> > +
> > +AARCH64_FMV_FEATURE("sme", SME, ())
> >  
> > -AARCH64_OPT_EXTENSION("flagm", FLAGM, (), (), (), "flagm")
> > +AARCH64_OPT_FMV_EXTENSION("memtag", MEMTAG, (), (), (), "")
> > +
> > +AARCH64_FMV_FEATURE("memtag2", MEMTAG2, (MEMTAG))
> > +
> > +AARCH64_FMV_FEATURE("memtag3", MEMTAG3, (MEMTAG))
> > +
> > +AARCH64_OPT_FMV_EXTENSION("sb", SB, (), (), (), "sb")
> > +
> > +AARCH64_OPT_FMV_EXTENSION("predres", PREDRES, (), (), (), "")
> > +
> > +AARCH64_OPT_FMV_EXTENSION("ssbs", SSBS, (), (), (), "ssbs")
> > +
> > +AARCH64_FMV_FEATURE("ssbs2", SSBS2, (SSBS))
> > +
> > +AARCH64_FMV_FEATURE("bti", BTI, ())
> > +
> > +AARCH64_OPT_EXTENSION("profile", PROFILE, (), (), (), "")
> > +
> > +AARCH64_OPT_EXTENSION("tme", TME, (), (), (), "")
> >  
> >  AARCH64_OPT_EXTENSION("pauth", PAUTH, (), (), (), "paca pacg")
> >  
> >  AARCH64_OPT_EXTENSION("ls64", LS64, (), (), (), "")
> >  
> > +AARCH64_FMV_FEATURE("ls64", LS64, ())
> > +
> > +AARCH64_FMV_FEATURE("ls64_v", LS64_V, ())
> > +
> > +AARCH64_FMV_FEATURE("ls64_accdata", LS64_ACCDATA, (LS64))
> > +
> > +AARCH64_FMV_FEATURE("wfxt", WFXT, ())
> > +
> > +AARCH64_FMV_FEATURE("sme-f64f64", SME_F64, ())
> > +
> > +AARCH64_FMV_FEATURE("sme-i64i64", SME_I64, ())
> > +
> > +AARCH64_FMV_FEATURE("sme2", SME2, ())
> > +
> >  AARCH64_OPT_EXTENSION("mops", MOPS, (), (), (), "")
> >  
> >  AARCH64_OPT_EXTENSION("cssc", CSSC, (), (), (), "cssc")
> >  
> > +#undef AARCH64_OPT_FMV_EXTENSION
> >  #undef AARCH64_OPT_EXTENSION
> > +#undef AARCH64_FMV_FEATURE
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index 
> > 800a8b0e11005416fb4e4b1222717629b16f3745..8721c0a923c53af2c2413ed90ccb05fa698c1f85
> >  100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -84,6 +84,7 @@
> >  #include "aarch64-feature-deps.h"
> >  #include "config/arm/aarch-common.h"
> >  #include "config/arm/aarch-common-protos.h"
> > +#include "common/config/aarch64/cpuinfo.h"
> >  #include "ssa.h"
> >  
> >  /* This file should be included last.  */
> > @@ -19525,6 +19526,8 @@ aarch64_process_target_attr (tree args)
> >    return true;
> >  }
> >  
> > +static bool aarch64_process_target_version_attr (tree args);
> > +
> >  /* Implement TARGET_OPTION_VALID_ATTRIBUTE_P.  This is used to
> >     process attribute ((target ("..."))).  */
> >  
> > @@ -19580,6 +19583,19 @@ aarch64_option_valid_attribute_p (tree fndecl, 
> > tree, tree args, int)
> >                           TREE_TARGET_OPTION (target_option_current_node));
> >  
> >    ret = aarch64_process_target_attr (args);
> > +  if (ret)
> > +    {
> > +      tree version_attr = lookup_attribute ("target_version",
> > +                                       DECL_ATTRIBUTES (fndecl));
> > +      if (version_attr != NULL_TREE)
> > +   {
> > +     /* Reapply any target_version attribute after target attribute.
> > +        This should be equivalent to applying the target_version once
> > +        after processing all target attributes.  */
> > +     tree version_args = TREE_VALUE (version_attr);
> > +     ret = aarch64_process_target_version_attr (version_args);
> > +   }
> > +    }
> >  
> >    /* Set up any additional state.  */
> >    if (ret)
> > @@ -19610,6 +19626,821 @@ aarch64_option_valid_attribute_p (tree fndecl, 
> > tree, tree args, int)
> >    return ret;
> >  }
> >  
> > +typedef unsigned long long aarch64_fmv_feature_mask;
> > +
> > +typedef struct
> > +{
> > +  const char *name;
> > +  aarch64_fmv_feature_mask feature_mask;
> > +  aarch64_feature_flags opt_flags;
> > +} aarch64_fmv_feature_datum;
> > +
> > +#define AARCH64_FMV_FEATURE(NAME, FEAT_NAME, C) \
> > +  {NAME, 1ULL << FEAT_##FEAT_NAME, ::feature_deps::fmv_deps_##FEAT_NAME},
> > +
> > +/* FMV features are listed in priority order, to make it easier to sort target
> > +   strings.  */
> > +static aarch64_fmv_feature_datum aarch64_fmv_feature_data[] = {
> > +#include "config/aarch64/aarch64-option-extensions.def"
> > +};
> > +
> > +
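
For anyone following along who hasn't met the pattern: the table above is
built by including the .def file with AARCH64_FMV_FEATURE defined to emit one
initializer per entry. A minimal standalone sketch of the same idea follows.
It uses an inline list macro instead of a separate .def file, and the names
SKETCH_FEATURES, sketch_feature, feature_datum and mask_for are made up for
illustration; they are not part of the patch.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in for the .def file: one entry per feature (illustrative entries).
#define SKETCH_FEATURES(X) \
  X ("rng", RNG)           \
  X ("flagm", FLAGM)       \
  X ("dotprod", DOTPROD)

// First expansion: an enum giving each feature a bit position.
#define AS_ENUM(NAME, IDENT) FEAT_##IDENT,
enum sketch_feature { SKETCH_FEATURES (AS_ENUM) FEAT_MAX };
#undef AS_ENUM

struct feature_datum
{
  const char *name;
  std::uint64_t mask;
};

// Second expansion: a table mapping each name to a single-bit mask.
#define AS_DATUM(NAME, IDENT) {NAME, std::uint64_t (1) << FEAT_##IDENT},
static const feature_datum feature_data[] = { SKETCH_FEATURES (AS_DATUM) };
#undef AS_DATUM

// Look up NAME and return its mask, or 0 if it is not in the table.
static std::uint64_t
mask_for (const char *name)
{
  for (const auto &d : feature_data)
    if (std::strcmp (d.name, name) == 0)
      return d.mask;
  return 0;
}
```

The point of the pattern is that the list of features is written once, and
the enum and the table cannot drift out of step with each other.
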
> > +/* Parse a non-default fmv feature string, as found in a target_version or
> > +   target_clones attribute.  */
> 
> The comment says non-default, but the function does handle "default".
> 
> It would be good to describe the arguments too.  E.g. something like:
> 
> /* Parse function multi-versioning feature string STR, as found in a
>    target_version or target_clones attribute.  Add the selected FMV
>    features to *FEATURE_MASK and the associated -march ISA extensions
>    to *ISA_FLAGS.  If parsing fails due to an invalid or duplicate
>    feature name, store that feature name in *INVALID_EXTENSION.  */

Updated (with slightly different wording).

> > +
> > +static enum aarch_parse_opt_result
> > +aarch64_parse_fmv_features (const char *str, aarch64_feature_flags *isa_flags,
> > +                       aarch64_fmv_feature_mask *feature_mask,
> > +                       std::string *invalid_extension)
> > +{
> > +  if (strcmp (str, "default") == 0)
> > +    return AARCH_PARSE_OK;
> > +
> > +  while (str != NULL && *str != 0)
> > +    {
> > +      const char *ext;
> > +      size_t len;
> > +
> > +      ext = strchr (str, '+');
> > +
> > +      if (ext != NULL)
> > +   len = ext - str;
> > +      else
> > +   len = strlen (str);
> > +
> > +      if (len == 0)
> > +   return AARCH_PARSE_MISSING_ARG;
> > +
> > +      static const int num_features = ARRAY_SIZE (aarch64_fmv_feature_data);
> > +      int i;
> > +      for (i = 0; i < num_features; i++)
> > +   {
> > +     if (strlen (aarch64_fmv_feature_data[i].name) == len
> > +         && strncmp (aarch64_fmv_feature_data[i].name, str, len) == 0)
> > +       {
> > +         if (isa_flags)
> > +           *isa_flags |= aarch64_fmv_feature_data[i].opt_flags;
> > +         if (feature_mask)
> > +           {
> > +             auto old_feature_mask = *feature_mask;
> > +             *feature_mask |= aarch64_fmv_feature_data[i].feature_mask;
> > +             if (*feature_mask == old_feature_mask)
> > +               {
> > +                 /* Duplicate feature.  */
> > +                 if (invalid_extension)
> > +                   *invalid_extension = std::string (str, len);
> > +                 return AARCH_PARSE_DUPLICATE_FEATURE;
> > +               }
> > +           }
> > +         break;
> > +       }
> > +   }
> > +
> > +      if (i == num_features)
> > +   {
> > +     /* Feature not found in list.  */
> > +     if (invalid_extension)
> > +       *invalid_extension = std::string (str, len);
> > +     return AARCH_PARSE_INVALID_FEATURE;
> > +   }
> > +
> > +      str = ext;
> > +    }
> 
> Does this work for "feat1+feat2"?  It looks like str would be set to
> "+feat2" for the second iteration, and then the strchr would likewise
> return "+feat2", giving an empty string.

This was broken - thanks for spotting.  Fixed in the next version.
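
To make the failure mode concrete, here is a standalone sketch of the loop
with the separator skipped before the next iteration. It is only an
illustration of the idea, not the code from the next version: the feature
table, the parse_result enum and the parse_features name are all stand-ins
invented for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the feature table; the names are illustrative.
static const char *const known_features[] = {"rng", "flagm", "dotprod", "sve"};

enum parse_result { PARSE_OK, PARSE_MISSING_ARG, PARSE_INVALID_FEATURE };

// Split STR on '+' and check each name against known_features, appending
// each recognised name to *OUT if OUT is non-null.
static parse_result
parse_features (const char *str, std::vector<std::string> *out)
{
  if (std::strcmp (str, "default") == 0)
    return PARSE_OK;

  while (str != nullptr && *str != 0)
    {
      const char *ext = std::strchr (str, '+');
      std::size_t len = ext ? (std::size_t) (ext - str) : std::strlen (str);
      if (len == 0)
        return PARSE_MISSING_ARG;

      bool found = false;
      for (const char *name : known_features)
        if (std::strlen (name) == len && std::strncmp (name, str, len) == 0)
          {
            found = true;
            break;
          }
      if (!found)
        return PARSE_INVALID_FEATURE;
      if (out)
        out->emplace_back (str, len);

      // Step past the '+' so the next name starts at its first character.
      str = ext ? ext + 1 : nullptr;
    }
  return PARSE_OK;
}
```

With the quoted loop, "rng+sve" would hit the empty-name check on the second
iteration; in the sketch the second iteration starts at "sve".
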
 
> > +
> > +  return AARCH_PARSE_OK;
> > +}
> > +
> > +/* Parse the tree in ARGS that contains the target_version attribute
> > +   information and update the global target options space.  */
> > +
> > +static bool
> > +aarch64_process_target_version_attr (tree args)
> > +{
> > +  if (TREE_CODE (args) == TREE_LIST)
> > +    {
> > +      if (TREE_CHAIN (args))
> > +   {
> > +     error ("attribute %<target_version%> has multiple values");
> > +     return false;
> > +   }
> > +      args = TREE_VALUE (args);
> > +    }
> > +
> > +  if (!args || TREE_CODE (args) != STRING_CST)
> > +    {
> > +      error ("attribute %<target_version%> argument not a string");
> > +      return false;
> > +    }
> > +
> > +  const char *str = TREE_STRING_POINTER (args);
> > +
> > +  enum aarch_parse_opt_result parse_res;
> > +  auto isa_flags = aarch64_asm_isa_flags;
> > +
> > +
> > +  std::string invalid_extension;
> > +  parse_res = aarch64_parse_fmv_features (str, &isa_flags, NULL,
> > +                                     &invalid_extension);
> > +
> > +  if (parse_res == AARCH_PARSE_OK)
> > +    {
> > +      aarch64_set_asm_isa_flags (isa_flags);
> > +      return true;
> > +    }
> > +
> > +  switch (parse_res)
> > +    {
> > +      case AARCH_PARSE_MISSING_ARG:
> > +   error ("missing value in %<target_version%> attribute");
> > +   break;
> > +
> > +      case AARCH_PARSE_INVALID_FEATURE:
> > +   error ("invalid feature modifier %qs of value %qs in "
> > +          "%<target_version%> attribute", invalid_extension.c_str (),
> > +          str);
> > +   break;
> > +
> > +      case AARCH_PARSE_DUPLICATE_FEATURE:
> > +   error ("duplicate feature modifier %qs of value %qs in "
> > +          "%<target_version%> attribute", invalid_extension.c_str (),
> > +          str);
> > +   break;
> > +
> > +      default:
> > +   gcc_unreachable ();
> > +    }
> 
> Formatting nit: the convention is for cases to line up with the "{"
> of the switch, so the switch body between { and } above should be
> indented by 2 fewer columns.

Fixed.
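
For reference, the layout being asked for looks like the following. This is
a standalone illustration with made-up names (parse_result, describe), not
the updated patch text.

```cpp
#include <string>

enum parse_result { PARSE_OK, PARSE_MISSING_ARG, PARSE_INVALID_FEATURE };

// Map a result code to a short message.  The case labels sit in the same
// column as the braces of the switch body, and the statements under each
// label are indented by two further columns.
static std::string
describe (parse_result res)
{
  switch (res)
    {
    case PARSE_OK:
      return "ok";

    case PARSE_MISSING_ARG:
      return "missing value";

    case PARSE_INVALID_FEATURE:
      return "invalid feature modifier";
    }
  return "unknown";
}
```
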

> > +
> > +  return false;
> > +}
> > +
> > +/* Implement TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P.  This is used to
> > +   process attribute ((target ("..."))).  */
> 
> attribute ((target_version ("...")))  ?

Fixed.

> > +
> > +static bool
> > +aarch64_option_valid_version_attribute_p (tree fndecl, tree, tree args, 
> > int)
> > +{
> > +  struct cl_target_option cur_target;
> > +  bool ret;
> > +  tree new_target;
> > +  tree existing_target = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
> > +
> > +  /* Save the current target options to restore at the end.  */
> > +  cl_target_option_save (&cur_target, &global_options, 
> > &global_options_set);
> > +
> > +  /* If fndecl already has some target attributes applied to it, unpack
> > +     them so that we add this attribute on top of them, rather than
> > +     overwriting them.  */
> > +  if (existing_target)
> > +    {
> > +      struct cl_target_option *existing_options
> > +   = TREE_TARGET_OPTION (existing_target);
> > +
> > +      if (existing_options)
> > +   cl_target_option_restore (&global_options, &global_options_set,
> > +                             existing_options);
> > +    }
> > +  else
> > +    cl_target_option_restore (&global_options, &global_options_set,
> > +                         TREE_TARGET_OPTION (target_option_current_node));
> > +
> > +  ret = aarch64_process_target_version_attr (args);
> > +
> > +  /* Set up any additional state.  */
> > +  if (ret)
> > +    {
> > +      aarch64_override_options_internal (&global_options);
> > +      new_target = build_target_option_node (&global_options,
> > +                                        &global_options_set);
> > +    }
> > +  else
> > +    new_target = NULL;
> > +
> > +  if (fndecl && ret)
> > +    {
> > +      DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_target;
> > +    }
> > +
> > +  cl_target_option_restore (&global_options, &global_options_set, 
> > &cur_target);
> > +
> > +  return ret;
> > +}
> > +
> > +/* This parses the attribute arguments to target_version in DECL and returns
> > +   the feature mask required to select those targets.  No adjustments are made
> > +   to add or remove redundant feature requirements.  */
> > +
> > +static aarch64_fmv_feature_mask
> > +get_feature_mask_for_version (tree decl)
> > +{
> > +  tree version_attr = lookup_attribute ("target_version",
> > +                                   DECL_ATTRIBUTES (decl));
> > +  if (version_attr == NULL)
> > +    return 0;
> > +
> > +  const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE
> > +                                               (version_attr)));
> > +  enum aarch_parse_opt_result parse_res;
> > +  aarch64_fmv_feature_mask feature_mask = 0ULL;
> > +
> > +  parse_res = aarch64_parse_fmv_features (version_string, NULL, &feature_mask,
> > +                                     NULL);
> > +
> > +  /* We should have detected any errors before getting here.  */
> > +  gcc_assert (parse_res == AARCH_PARSE_OK);
> > +
> > +  return feature_mask;
> > +}
> > +
> > +/* Compare priorities of two feature masks. Return:
> > +     1: mask1 is higher priority
> > +    -1: mask2 is higher priority
> > +     0: masks are equal.  */
> > +
> > +static int
> > +compare_feature_masks (aarch64_fmv_feature_mask mask1,
> > +                  aarch64_fmv_feature_mask mask2)
> > +{
> > +  int pop1 = popcount_hwi(mask1);
> > +  int pop2 = popcount_hwi(mask2);
> 
> Nit: should be a space before "(mask1" and "(mask2".

Fixed.
 
> > +  if (pop1 > pop2)
> > +    return 1;
> > +  if (pop2 > pop1)
> > +    return -1;
> > +
> > +  auto diff_mask = mask1 ^ mask2;
> > +  if (diff_mask == 0ULL)
> > +    return 0;
> > +  for (int i = FEAT_MAX - 1; i > 0; i--)
> > +    {
> > +      auto bit_mask = aarch64_fmv_feature_data[i].feature_mask;
> > +      if (diff_mask & bit_mask)
> > +   return (mask1 & bit_mask) ? 1 : -1;
> > +    }
> > +  gcc_unreachable();
> > +}
> 
> Still not sure that this is the right criteria to use, but I suppose
> we can adjust it post-commit to match any changes in the spec.
> 
> > +
> > +int
> > +aarch64_compare_version_priority (tree decl1, tree decl2)
> > +{
> > +  auto mask1 = get_feature_mask_for_version (decl1);
> > +  auto mask2 = get_feature_mask_for_version (decl2);
> > +
> > +  return compare_feature_masks (mask1, mask2);
> > +}
> > +
> > +/* Build the struct __ifunc_arg_t type:
> > +
> > +   struct __ifunc_arg_t
> > +   {
> > +     unsigned long _size; // Size of the struct, so it can grow.
> > +     unsigned long _hwcap;
> > +     unsigned long _hwcap2;
> > +   }
> > + */
> 
> This isn't ILP32-friendly, but I agree we need to stick to the types
> that glibc uses.
> 
> > +
> > +static tree
> > +build_ifunc_arg_type ()
> > +{
> > +  tree ifunc_arg_type = lang_hooks.types.make_type (RECORD_TYPE);
> > +  tree field1 = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
> > +                       get_identifier ("_size"),
> > +                       long_unsigned_type_node);
> > +  tree field2 = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
> > +                       get_identifier ("_hwcap"),
> > +                       long_unsigned_type_node);
> > +  tree field3 = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
> > +                       get_identifier ("_hwcap2"),
> > +                       long_unsigned_type_node);
> > +
> > +  DECL_FIELD_CONTEXT (field1) = ifunc_arg_type;
> > +  DECL_FIELD_CONTEXT (field2) = ifunc_arg_type;
> > +  DECL_FIELD_CONTEXT (field3) = ifunc_arg_type;
> > +
> > +  TYPE_FIELDS (ifunc_arg_type) = field1;
> > +  DECL_CHAIN (field1) = field2;
> > +  DECL_CHAIN (field2) = field3;
> > +
> > +  layout_type (ifunc_arg_type);
> > +
> > +  tree const_type = build_qualified_type (ifunc_arg_type, TYPE_QUAL_CONST);
> > +  tree pointer_type = build_pointer_type (const_type);
> > +
> > +  return pointer_type;
> > +}
> > +
> > +/* Make the resolver function decl to dispatch the versions of
> > +   a multi-versioned function,  DEFAULT_DECL.  IFUNC_ALIAS_DECL is
> > +   ifunc alias that will point to the created resolver.  Create an
> > +   empty basic block in the resolver and store the pointer in
> > +   EMPTY_BB.  Return the decl of the resolver function.  */
> > +
> > +static tree
> > +make_resolver_func (const tree default_decl,
> > +               const tree ifunc_alias_decl,
> > +               basic_block *empty_bb)
> > +{
> > +  tree decl, type, t;
> > +
> > +  /* Create resolver function name based on default_decl.  */
> > +  tree decl_name = clone_function_name (default_decl, "resolver");
> > +  const char *resolver_name = IDENTIFIER_POINTER (decl_name);
> > +
> > +  /* The resolver function should have signature
> > +     (void *) resolver (uint64_t, const __ifunc_arg_t *) */
> > +  type = build_function_type_list (ptr_type_node,
> > +                              uint64_type_node,
> > +                              build_ifunc_arg_type(),
> > +                              NULL_TREE);
> > +
> > +  decl = build_fn_decl (resolver_name, type);
> > +  SET_DECL_ASSEMBLER_NAME (decl, decl_name);
> > +
> > +  DECL_NAME (decl) = decl_name;
> > +  TREE_USED (decl) = 1;
> > +  DECL_ARTIFICIAL (decl) = 1;
> > +  DECL_IGNORED_P (decl) = 1;
> > +  TREE_PUBLIC (decl) = 0;
> > +  DECL_UNINLINABLE (decl) = 1;
> > +
> > +  /* Resolver is not external, body is generated.  */
> > +  DECL_EXTERNAL (decl) = 0;
> > +  DECL_EXTERNAL (ifunc_alias_decl) = 0;
> > +
> > +  DECL_CONTEXT (decl) = NULL_TREE;
> > +  DECL_INITIAL (decl) = make_node (BLOCK);
> > +  DECL_STATIC_CONSTRUCTOR (decl) = 0;
> > +
> > +  if (DECL_COMDAT_GROUP (default_decl)
> > +      || TREE_PUBLIC (default_decl))
> > +    {
> > +      /* In this case, each translation unit with a call to this
> > +    versioned function will put out a resolver.  Ensure it
> > +    is comdat to keep just one copy.  */
> > +      DECL_COMDAT (decl) = 1;
> > +      make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl));
> > +    }
> > +  else
> > +    TREE_PUBLIC (ifunc_alias_decl) = 0;
> > +
> > +  /* Build result decl and add to function_decl. */
> > +  t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node);
> > +  DECL_CONTEXT (t) = decl;
> > +  DECL_ARTIFICIAL (t) = 1;
> > +  DECL_IGNORED_P (t) = 1;
> > +  DECL_RESULT (decl) = t;
> > +
> > +  /* Build parameter decls and add to function_decl. */
> > +  tree arg1 = build_decl (UNKNOWN_LOCATION, PARM_DECL,
> > +                     get_identifier ("hwcap"),
> > +                     uint64_type_node);
> > +  tree arg2 = build_decl (UNKNOWN_LOCATION, PARM_DECL,
> > +                     get_identifier ("arg"),
> > +                     build_ifunc_arg_type());
> > +  DECL_CONTEXT (arg1) = decl;
> > +  DECL_CONTEXT (arg2) = decl;
> > +  DECL_ARTIFICIAL (arg1) = 1;
> > +  DECL_ARTIFICIAL (arg2) = 1;
> > +  DECL_IGNORED_P (arg1) = 1;
> > +  DECL_IGNORED_P (arg2) = 1;
> > +  DECL_ARG_TYPE (arg1) = uint64_type_node;
> > +  DECL_ARG_TYPE (arg2) = build_ifunc_arg_type();
> 
> Nit: space before second "(".

Fixed, along with the earlier instance of this mistake.

> > +  DECL_ARGUMENTS (decl) = arg1;
> > +  TREE_CHAIN (arg1) = arg2;
> > +
> > +  gimplify_function_tree (decl);
> > +  push_cfun (DECL_STRUCT_FUNCTION (decl));
> > +  *empty_bb = init_lowered_empty_function (decl, false,
> > +                                      profile_count::uninitialized ());
> > +
> > +  cgraph_node::add_new_function (decl, true);
> > +  symtab->call_cgraph_insertion_hooks (cgraph_node::get_create (decl));
> > +
> > +  pop_cfun ();
> > +
> > +  gcc_assert (ifunc_alias_decl != NULL);
> > +  /* Mark ifunc_alias_decl as "ifunc" with resolver as resolver_name.  */
> > +  DECL_ATTRIBUTES (ifunc_alias_decl)
> > +    = make_attribute ("ifunc", resolver_name,
> > +                 DECL_ATTRIBUTES (ifunc_alias_decl));
> > +
> > +  /* Create the alias for dispatch to resolver here.  */
> > +  cgraph_node::create_same_body_alias (ifunc_alias_decl, decl);
> > +  return decl;
> > +}
> > +
> > +/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL
> > +   to return a pointer to VERSION_DECL if all feature bits specified in
> > +   FEATURE_MASK are not set in MASK_VAR.  This function will be called during
> > +   version dispatch to decide which function version to execute.  It returns
> > +   the basic block at the end, to which more conditions can be added.  */
> > +static basic_block
> > +add_condition_to_bb (tree function_decl, tree version_decl,
> > +                aarch64_fmv_feature_mask feature_mask,
> > +                tree mask_var, basic_block new_bb)
> > +{
> > +  gimple *return_stmt;
> > +  tree convert_expr, result_var;
> > +  gimple *convert_stmt;
> > +  gimple *if_else_stmt;
> > +
> > +  basic_block bb1, bb2, bb3;
> > +  edge e12, e23;
> > +
> > +  gimple_seq gseq;
> > +
> > +  push_cfun (DECL_STRUCT_FUNCTION (function_decl));
> > +
> > +  gcc_assert (new_bb != NULL);
> > +  gseq = bb_seq (new_bb);
> > +
> > +
> > +  convert_expr = build1 (CONVERT_EXPR, ptr_type_node,
> > +                    build_fold_addr_expr (version_decl));
> > +  result_var = create_tmp_var (ptr_type_node);
> > +  convert_stmt = gimple_build_assign (result_var, convert_expr);
> > +  return_stmt = gimple_build_return (result_var);
> > +
> > +
> 
> Nit: just one blank line (before and after the block).  Some other instances
> in the patch too.

Fixed all new occurrences of "\n\n\n".
 
> > +  if (feature_mask == 0ULL)
> > +    {
> > +      /* Default version.  */
> > +      gimple_seq_add_stmt (&gseq, convert_stmt);
> > +      gimple_seq_add_stmt (&gseq, return_stmt);
> > +      set_bb_seq (new_bb, gseq);
> > +      gimple_set_bb (convert_stmt, new_bb);
> > +      gimple_set_bb (return_stmt, new_bb);
> > +      pop_cfun ();
> > +      return new_bb;
> > +    }
> > +
> > +  tree and_expr_var = create_tmp_var (long_long_unsigned_type_node);
> > +  tree and_expr = build2 (BIT_AND_EXPR,
> > +                     long_long_unsigned_type_node,
> > +                     mask_var,
> > +                     build_int_cst (long_long_unsigned_type_node,
> > +                                    feature_mask));
> > +  gimple *and_stmt = gimple_build_assign (and_expr_var, and_expr);
> > +  gimple_set_block (and_stmt, DECL_INITIAL (function_decl));
> > +  gimple_set_bb (and_stmt, new_bb);
> > +  gimple_seq_add_stmt (&gseq, and_stmt);
> > +
> > +  tree zero_llu = build_int_cst (long_long_unsigned_type_node, 0);
> > +  if_else_stmt = gimple_build_cond (EQ_EXPR, and_expr_var, zero_llu,
> > +                               NULL_TREE, NULL_TREE);
> > +  gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl));
> > +  gimple_set_bb (if_else_stmt, new_bb);
> > +  gimple_seq_add_stmt (&gseq, if_else_stmt);
> > +
> > +  gimple_seq_add_stmt (&gseq, convert_stmt);
> > +  gimple_seq_add_stmt (&gseq, return_stmt);
> > +  set_bb_seq (new_bb, gseq);
> > +
> > +  bb1 = new_bb;
> > +  e12 = split_block (bb1, if_else_stmt);
> > +  bb2 = e12->dest;
> > +  e12->flags &= ~EDGE_FALLTHRU;
> > +  e12->flags |= EDGE_TRUE_VALUE;
> > +
> > +  e23 = split_block (bb2, return_stmt);
> > +
> > +  gimple_set_bb (convert_stmt, bb2);
> > +  gimple_set_bb (return_stmt, bb2);
> > +
> > +  bb3 = e23->dest;
> > +  make_edge (bb1, bb3, EDGE_FALSE_VALUE);
> > +
> > +  remove_edge (e23);
> > +  make_edge (bb2, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
> > +
> > +  pop_cfun ();
> > +
> > +  return bb3;
> > +}
> > +
> > +/* Used when sorting the decls into dispatch order.  */
> > +static int compare_feature_version_info (const void *p1, const void *p2)
> 
> Formatting nit: new line after "static int".
> 
> > +{
> > +  struct _function_version_info
> > +    {
> > +      tree version_decl;
> > +      aarch64_fmv_feature_mask feature_mask;
> > +    };
> 
> Think we should move this struct out of the function so that it can
> be shared by dispatch_function_versions.  Alternatively, the comparison
> function could be a lambda within dispatch_function_versions.

Rewritten as a lambda, and reordered within dispatch_function_versions so that
processing the list of function versions happens after all the preliminary
codegen.
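
As a rough illustration of the shape of that change, here is a minimal
standalone sketch.  It is not the code from the next revision: the record
type holds an int instead of a tree, the priority rule is simplified to
"more feature bits wins, then the highest differing bit position" (the patch
walks aarch64_fmv_feature_data instead of raw bit positions), std::sort is
used in place of qsort for brevity, and the name sort_versions_for_dispatch
is hypothetical.

```cpp
#include <algorithm>
#include <bitset>
#include <cstdint>
#include <vector>

// Stand-in for aarch64_fmv_feature_mask (assumption: a 64-bit mask).
using feature_mask_t = std::uint64_t;

// Simplified record; the real one stores a tree decl, not an int.
struct function_version_info
{
  int version_id;
  feature_mask_t feature_mask;
};

// Returns 1 if mask1 has higher priority, -1 if mask2 does, 0 if equal.
// Simplification: ties on popcount are broken by the highest differing
// bit position rather than by a feature table.
static int
compare_feature_masks (feature_mask_t mask1, feature_mask_t mask2)
{
  auto pop1 = std::bitset<64> (mask1).count ();
  auto pop2 = std::bitset<64> (mask2).count ();
  if (pop1 != pop2)
    return pop1 > pop2 ? 1 : -1;

  feature_mask_t diff_mask = mask1 ^ mask2;
  if (diff_mask == 0)
    return 0;
  for (int i = 63; i >= 0; i--)
    {
      feature_mask_t bit_mask = feature_mask_t (1) << i;
      if (diff_mask & bit_mask)
        return (mask1 & bit_mask) ? 1 : -1;
    }
  return 0;  // Not reached: diff_mask is nonzero here.
}

// Sort into dispatch order (highest priority first) using a local lambda,
// so no file-scope comparison function or shared struct is needed.
static void
sort_versions_for_dispatch (std::vector<function_version_info> &versions)
{
  auto higher_priority_first
    = [] (const function_version_info &a, const function_version_info &b)
      {
        return compare_feature_masks (a.feature_mask, b.feature_mask) > 0;
      };
  std::sort (versions.begin (), versions.end (), higher_priority_first);
}
```

With this ordering the default version (mask 0) always ends up last, which
is what the dispatch loop relies on.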

> It's best to avoid names starting with "_", since those are reserved
> for the implementation.
> 
> > +  const _function_version_info v1 = *(const _function_version_info *)p1;
> > +  const _function_version_info v2 = *(const _function_version_info *)p2;
> > +  return - compare_feature_masks (v1.feature_mask, v2.feature_mask);
> > +}
> > +
> > +static int
> > +dispatch_function_versions (tree dispatch_decl,
> > +                       void *fndecls_p,
> > +                       basic_block *empty_bb)
> 
> Missing function comment.

Added (same as i386).
 
> > +{
> > +  gimple *ifunc_cpu_init_stmt;
> > +  gimple_seq gseq;
> > +  vec<tree> *fndecls;
> > +  unsigned int num_versions = 0;
> > +  unsigned int actual_versions = 0;
> > +  unsigned int i;
> > +
> > +  struct _function_version_info
> > +    {
> > +      tree version_decl;
> > +      aarch64_fmv_feature_mask feature_mask;
> > +    } *function_version_info;
> > +
> > +  gcc_assert (dispatch_decl != NULL
> > +         && fndecls_p != NULL
> > +         && empty_bb != NULL);
> > +
> > +  /* fndecls_p is actually a vector.  */
> > +  fndecls = static_cast<vec<tree> *> (fndecls_p);
> > +
> > +  /* At least one more version other than the default.  */
> > +  num_versions = fndecls->length ();
> > +  gcc_assert (num_versions >= 2);
> > +
> > +  function_version_info = (struct _function_version_info *)
> > +    XNEWVEC (struct _function_version_info, (num_versions));
> > +
> > +  push_cfun (DECL_STRUCT_FUNCTION (dispatch_decl));
> > +
> > +  gseq = bb_seq (*empty_bb);
> > +  /* Function version dispatch is via IFUNC.  IFUNC resolvers fire before
> > +     constructors, so explicitly call __init_cpu_features_resolver here.  */
> > +  tree init_fn_type = build_function_type_list (void_type_node,
> > +                                           long_unsigned_type_node,
> > +                                           build_ifunc_arg_type(),
> > +                                           NULL);
> > +  tree init_fn_id = get_identifier ("__init_cpu_features_resolver");
> > +  tree init_fn_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL,
> > +                             init_fn_id, init_fn_type);
> > +  tree arg1 = DECL_ARGUMENTS (dispatch_decl);
> > +  tree arg2 = TREE_CHAIN (arg1);
> > +  ifunc_cpu_init_stmt = gimple_build_call (init_fn_decl, 2, arg1, arg2);
> > +  gimple_seq_add_stmt (&gseq, ifunc_cpu_init_stmt);
> > +  gimple_set_bb (ifunc_cpu_init_stmt, *empty_bb);
> > +
> > +  /* Build the struct type for __aarch64_cpu_features.  */
> > +  tree global_type = lang_hooks.types.make_type (RECORD_TYPE);
> > +  tree field1 = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
> > +                       get_identifier ("features"),
> > +                       long_long_unsigned_type_node);
> > +  DECL_FIELD_CONTEXT (field1) = global_type;
> > +  TYPE_FIELDS (global_type) = field1;
> > +  layout_type (global_type);
> > +
> > +  tree global_var = build_decl (UNKNOWN_LOCATION, VAR_DECL,
> > +                           get_identifier ("__aarch64_cpu_features"),
> > +                           global_type);
> > +  DECL_EXTERNAL (global_var) = 1;
> > +  tree mask_var = create_tmp_var (long_long_unsigned_type_node);
> > +
> > +  tree component_expr = build3 (COMPONENT_REF, long_long_unsigned_type_node,
> > +                           global_var, field1, NULL_TREE);
> > +  gimple *component_stmt = gimple_build_assign (mask_var, component_expr);
> > +  gimple_set_block (component_stmt, DECL_INITIAL (dispatch_decl));
> > +  gimple_set_bb (component_stmt, *empty_bb);
> > +  gimple_seq_add_stmt (&gseq, component_stmt);
> > +
> > +  tree not_expr = build1 (BIT_NOT_EXPR, long_long_unsigned_type_node, mask_var);
> > +  gimple *not_stmt = gimple_build_assign (mask_var, not_expr);
> > +  gimple_set_block (not_stmt, DECL_INITIAL (dispatch_decl));
> > +  gimple_set_bb (not_stmt, *empty_bb);
> > +  gimple_seq_add_stmt (&gseq, not_stmt);
> > +
> > +  set_bb_seq (*empty_bb, gseq);
> > +
> > +  pop_cfun ();
> > +
> > +  for (tree version_decl : *fndecls)
> > +    {
> > +      aarch64_fmv_feature_mask feature_mask;
> > +      /* Get attribute string, parse it and find the right features.  */
> > +      feature_mask = get_feature_mask_for_version (version_decl);
> > +      function_version_info [actual_versions].version_decl = version_decl;
> > +      function_version_info [actual_versions].feature_mask = feature_mask;
> > +      actual_versions++;
> > +    }
> > +
> > +  /* Sort the versions according to descending order of dispatch priority.  */
> > +  qsort (function_version_info, actual_versions,
> > +    sizeof (struct _function_version_info), compare_feature_version_info);
> > +
> > +  for (i = 0; i < actual_versions; ++i)
> > +    *empty_bb = add_condition_to_bb (dispatch_decl,
> > +                                function_version_info[i].version_decl,
> > +                                function_version_info[i].feature_mask,
> > +                                mask_var,
> > +                                *empty_bb);
> > +
> > +  free (function_version_info);
> > +  return 0;
> > +}
> > +
> > +
> > +tree
> > +aarch64_generate_version_dispatcher_body (void *node_p)
> 
> Missing function comment.  Since the function implements a defined interface,
> the comment can just be:
> 
> /* Implement TARGET_GENERATE_VERSION_DISPATCHER_BODY.  */

Done.
 
> > +{
> > +  tree resolver_decl;
> > +  basic_block empty_bb;
> > +  tree default_ver_decl;
> > +  struct cgraph_node *versn;
> > +  struct cgraph_node *node;
> > +
> > +  struct cgraph_function_version_info *node_version_info = NULL;
> > +  struct cgraph_function_version_info *versn_info = NULL;
> > +
> > +  node = (cgraph_node *)node_p;
> > +
> > +  node_version_info = node->function_version ();
> > +  gcc_assert (node->dispatcher_function
> > +         && node_version_info != NULL);
> > +
> > +  if (node_version_info->dispatcher_resolver)
> > +    return node_version_info->dispatcher_resolver;
> > +
> > +  /* The first version in the chain corresponds to the default version.  */
> > +  default_ver_decl = node_version_info->next->this_node->decl;
> > +
> > +  /* node is going to be an alias, so remove the finalized bit.  */
> > +  node->definition = false;
> > +
> > +  resolver_decl = make_resolver_func (default_ver_decl,
> > +                                 node->decl, &empty_bb);
> > +
> > +  node_version_info->dispatcher_resolver = resolver_decl;
> > +
> > +  push_cfun (DECL_STRUCT_FUNCTION (resolver_decl));
> > +
> > +  auto_vec<tree, 2> fn_ver_vec;
> > +
> > +  for (versn_info = node_version_info->next; versn_info;
> > +       versn_info = versn_info->next)
> > +    {
> > +      versn = versn_info->this_node;
> > +      /* Check for virtual functions here again, as by this time it should
> > +    have been determined if this function needs a vtable index or
> > +    not.  This happens for methods in derived classes that override
> > +    virtual methods in base classes but are not explicitly marked as
> > +    virtual.  */
> > +      if (DECL_VINDEX (versn->decl))
> > +   sorry ("virtual function multiversioning not supported");
> > +
> > +      fn_ver_vec.safe_push (versn->decl);
> > +    }
> > +
> > +  dispatch_function_versions (resolver_decl, &fn_ver_vec, &empty_bb);
> > +  cgraph_edge::rebuild_edges ();
> > +  pop_cfun ();
> > +  return resolver_decl;
> > +}
> > +
> > +/* Make a dispatcher declaration for the multi-versioned function DECL.
> > +   Calls to DECL function will be replaced with calls to the dispatcher
> > +   by the front-end.  Returns the decl of the dispatcher function.  */
> > +
> > +tree
> > +aarch64_get_function_versions_dispatcher (void *decl)
> > +{
> > +  tree fn = (tree) decl;
> > +  struct cgraph_node *node = NULL;
> > +  struct cgraph_node *default_node = NULL;
> > +  struct cgraph_function_version_info *node_v = NULL;
> > +  struct cgraph_function_version_info *first_v = NULL;
> > +
> > +  tree dispatch_decl = NULL;
> > +
> > +  struct cgraph_function_version_info *default_version_info = NULL;
> > +
> > +  gcc_assert (fn != NULL && DECL_FUNCTION_VERSIONED (fn));
> > +
> > +  node = cgraph_node::get (fn);
> > +  gcc_assert (node != NULL);
> > +
> > +  node_v = node->function_version ();
> > +  gcc_assert (node_v != NULL);
> > +
> > +  if (node_v->dispatcher_resolver != NULL)
> > +    return node_v->dispatcher_resolver;
> > +
> > +  /* Find the default version and make it the first node.  */
> > +  first_v = node_v;
> > +  /* Go to the beginning of the chain.  */
> > +  while (first_v->prev != NULL)
> > +    first_v = first_v->prev;
> > +  default_version_info = first_v;
> > +  while (default_version_info != NULL)
> > +    {
> > +      if (get_feature_mask_for_version
> > +       (default_version_info->this_node->decl) == 0ULL)
> > +   break;
> > +      default_version_info = default_version_info->next;
> > +    }
> > +
> > +  /* If there is no default node, just return NULL.  */
> > +  if (default_version_info == NULL)
> > +    return NULL;
> > +
> > +  /* Make default info the first node.  */
> > +  if (first_v != default_version_info)
> > +    {
> > +      default_version_info->prev->next = default_version_info->next;
> > +      if (default_version_info->next)
> > +   default_version_info->next->prev = default_version_info->prev;
> > +      first_v->prev = default_version_info;
> > +      default_version_info->next = first_v;
> > +      default_version_info->prev = NULL;
> > +    }
> > +
> > +  default_node = default_version_info->this_node;
> > +
> > +  if (targetm.has_ifunc_p ())
> > +    {
> > +      struct cgraph_function_version_info *it_v = NULL;
> > +      struct cgraph_node *dispatcher_node = NULL;
> > +      struct cgraph_function_version_info *dispatcher_version_info = NULL;
> > +
> > +      /* Right now, the dispatching is done via ifunc.  */
> > +      dispatch_decl = make_dispatcher_decl (default_node->decl);
> > +      TREE_NOTHROW (dispatch_decl) = TREE_NOTHROW (fn);
> > +
> > +      dispatcher_node = cgraph_node::get_create (dispatch_decl);
> > +      gcc_assert (dispatcher_node != NULL);
> > +      dispatcher_node->dispatcher_function = 1;
> > +      dispatcher_version_info
> > +   = dispatcher_node->insert_new_function_version ();
> > +      dispatcher_version_info->next = default_version_info;
> > +      dispatcher_node->definition = 1;
> > +
> > +      /* Set the dispatcher for all the versions.  */
> > +      it_v = default_version_info;
> > +      while (it_v != NULL)
> > +   {
> > +     it_v->dispatcher_resolver = dispatch_decl;
> > +     it_v = it_v->next;
> > +   }
> > +    }
> > +  else
> > +    {
> > +      error_at (DECL_SOURCE_LOCATION (default_node->decl),
> > +           "multiversioning needs %<ifunc%> which is not supported "
> > +           "on this target");
> > +    }
> > +
> > +  return dispatch_decl;
> > +}
> > +
> > +bool
> > +aarch64_common_function_versions (tree fn1, tree fn2)
> 
> Missing comment here too.  Same for other functions later.

Added.
 
> > +{
> > +  if (TREE_CODE (fn1) != FUNCTION_DECL
> > +      || TREE_CODE (fn2) != FUNCTION_DECL)
> > +    return false;
> > +
> > +  return (aarch64_compare_version_priority (fn1, fn2) != 0);
> > +}
> > +
> > +
> > +tree
> > +aarch64_mangle_decl_assembler_name (tree decl, tree id)
> > +{
> > +  /* For function version, add the target suffix to the assembler name.  */
> > +  if (TREE_CODE (decl) == FUNCTION_DECL
> > +      && DECL_FUNCTION_VERSIONED (decl))
> > +    {
> > +      aarch64_fmv_feature_mask feature_mask = get_feature_mask_for_version (decl);
> > +
> > +      /* No suffix for the default version.  */
> > +      if (feature_mask == 0ULL)
> > +   return id;
> > +
> > +      char suffix[2048];
> > +      int pos = 0;
> > +      const char *base = IDENTIFIER_POINTER (id);
> > +
> > +      for (int i = 1; i < FEAT_MAX; i++)
> 
> Why does this start at 1 rather than 0?  Think it deserves a comment.

It starts at 1 because that array used to have a "default" entry at the start.
Now it's just a bug - thanks for spotting.  Fixed in the next version.

> > +   {
> > +     if (feature_mask & aarch64_fmv_feature_data[i].feature_mask)
> > +       {
> > +         suffix[pos] = 'M';
> > +         strcpy (&suffix[pos+1], aarch64_fmv_feature_data[i].name);
> > +         pos += strlen(aarch64_fmv_feature_data[i].name) + 1;
> > +       }
> > +   }
> > +      suffix[pos] = '\0';
> > +
> > +      char *ret = XNEWVEC (char, strlen (base) + strlen (suffix) + 3);
> > +      sprintf (ret, "%s._%s", base, suffix);
> 
> It isn't obvious that the limit of 2048 is or will stay safe.  Probably
> best to build the suffix using a std::string instead.

It would be safe for now, because we have <64 features, each of which
contributes <32 characters.  But regardless, it's ugly, confusing code that I
have now significantly improved by using std::string instead.

(The only reason I wrote it this way in the first place was because that's how
x86 did it, and I hadn't yet encountered usage of std::string elsewhere in
gcc.)
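
To give an idea of the std::string approach, here is a minimal standalone
sketch.  It is not the code from the next revision: fmv_feature,
feature_table and mangle_version_name are hypothetical names, and the
three-entry table is made up (the real aarch64_fmv_feature_data is larger
and its masks differ).  The "._" separator and the 'M' prefix per feature
follow the quoted patch.

```cpp
#include <cstdint>
#include <string>

// Stand-in for aarch64_fmv_feature_mask (assumption: a 64-bit mask).
using feature_mask_t = std::uint64_t;

struct fmv_feature
{
  const char *name;
  feature_mask_t feature_mask;
};

// Hypothetical table used only for this illustration.
static const fmv_feature feature_table[] = {
  { "rng", feature_mask_t (1) << 0 },
  { "dotprod", feature_mask_t (1) << 1 },
  { "sve2", feature_mask_t (1) << 2 },
};

// Build the assembler name for a function version.  The default version
// (mask 0) keeps the base name; otherwise "._" is appended, followed by
// 'M' plus the feature name for every feature bit that is set.
static std::string
mangle_version_name (const std::string &base, feature_mask_t feature_mask)
{
  if (feature_mask == 0)
    return base;

  std::string name = base + "._";
  // The range-for visits every entry, including index 0, so the
  // off-by-one from the quoted loop cannot recur.
  for (const fmv_feature &feature : feature_table)
    if (feature_mask & feature.feature_mask)
      {
        name += 'M';
        name += feature.name;
      }
  return name;
}
```

Since the string grows as needed, there is no fixed buffer size to keep in
sync with the number or length of feature names.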
 
> Thanks,
> Richard
> 
> > +
> > +      if (DECL_ASSEMBLER_NAME_SET_P (decl))
> > +   SET_DECL_RTL (decl, NULL);
> > +
> > +      id = get_identifier (ret);
> > +    }
> > +  return id;
> > +}
> > +
> > +
> >  /* Helper for aarch64_can_inline_p.  In the case where CALLER and CALLEE are
> >     tri-bool options (yes, no, don't care) and the default value is
> >     DEF, determine whether to reject inlining.  */
> > @@ -28457,6 +29288,13 @@ aarch64_libgcc_floating_mode_supported_p
> >  #undef TARGET_OPTION_VALID_ATTRIBUTE_P
> >  #define TARGET_OPTION_VALID_ATTRIBUTE_P aarch64_option_valid_attribute_p
> >  
> > +#undef TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P
> > +#define TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P \
> > +  aarch64_option_valid_version_attribute_p
> > +
> > +#undef TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE
> > +#define TARGET_OPTION_EXPANDED_CLONES_ATTRIBUTE "target_version"
> > +
> >  #undef TARGET_SET_CURRENT_FUNCTION
> >  #define TARGET_SET_CURRENT_FUNCTION aarch64_set_current_function
> >  
> > @@ -28787,6 +29625,24 @@ aarch64_libgcc_floating_mode_supported_p
> >  #undef TARGET_CONST_ANCHOR
> >  #define TARGET_CONST_ANCHOR 0x1000000
> >  
> > +#undef TARGET_OPTION_FUNCTION_VERSIONS
> > +#define TARGET_OPTION_FUNCTION_VERSIONS aarch64_common_function_versions
> > +
> > +#undef TARGET_COMPARE_VERSION_PRIORITY
> > +#define TARGET_COMPARE_VERSION_PRIORITY aarch64_compare_version_priority
> > +
> > +#undef TARGET_GENERATE_VERSION_DISPATCHER_BODY
> > +#define TARGET_GENERATE_VERSION_DISPATCHER_BODY \
> > +  aarch64_generate_version_dispatcher_body
> > +
> > +#undef TARGET_GET_FUNCTION_VERSIONS_DISPATCHER
> > +#define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \
> > +  aarch64_get_function_versions_dispatcher
> > +
> > +#undef TARGET_MANGLE_DECL_ASSEMBLER_NAME
> > +#define TARGET_MANGLE_DECL_ASSEMBLER_NAME aarch64_mangle_decl_assembler_name
> > +
> > +
> >  struct gcc_target targetm = TARGET_INITIALIZER;
> >  
> >  #include "gt-aarch64.h"
> > diff --git a/gcc/config/arm/aarch-common.h b/gcc/config/arm/aarch-common.h
> > index 
> > c6a67f0d05cc75d85d019e1cc910c37173884c03..70f01fd3da6919dd98cfe92bfc4c54b7d2cba72c
> >  100644
> > --- a/gcc/config/arm/aarch-common.h
> > +++ b/gcc/config/arm/aarch-common.h
> > @@ -23,7 +23,7 @@
> >  #define GCC_AARCH_COMMON_H
> >  
> >  /* Enum describing the various ways that the
> > -   aarch*_parse_{arch,tune,cpu,extension} functions can fail.
> > +   aarch*_parse_{arch,tune,cpu,extension,fmv_extension} functions can fail.
> >     This way their callers can choose what kind of error to give.  */
> >  
> >  enum aarch_parse_opt_result
> > @@ -31,7 +31,8 @@ enum aarch_parse_opt_result
> >    AARCH_PARSE_OK,                  /* Parsing was successful.  */
> >    AARCH_PARSE_MISSING_ARG,         /* Missing argument.  */
> >    AARCH_PARSE_INVALID_FEATURE,             /* Invalid feature modifier.  */
> > -  AARCH_PARSE_INVALID_ARG          /* Invalid arch, tune, cpu arg.  */
> > +  AARCH_PARSE_INVALID_ARG,         /* Invalid arch, tune, cpu arg.  */
> > +  AARCH_PARSE_DUPLICATE_FEATURE            /* Duplicate feature modifier.  */
> >  };
> >  
> >  /* Function types -msign-return-address should sign.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_0.c 
> > b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_0.c
> > index 
> > 8499f87c39b173491a89626af56f4e193b1d12b5..8b7d7d2d8a00f6d5a6a35ffca28be7f1ff4cb9c7
> >  100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_0.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_0.c
> > @@ -7,6 +7,6 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+crc\+dotprod\+crypto} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+dotprod\+crc\+crypto} } } */
> >  
> >  /* Test a normal looking procinfo.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_13.c 
> > b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_13.c
> > index 
> > 551669091c7010379a4c5247a27c517c4e67ef98..234a1ce1d7b4714e64c95c15488784d73c0552f2
> >  100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_13.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_13.c
> > @@ -7,6 +7,6 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+crc\+dotprod\+crypto} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+dotprod\+crc\+crypto} } } */
> >  
> >  /* Test one with mixed order of feature bits.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_16.c 
> > b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_16.c
> > index 
> > 2f963bb2312711691f6f1c5989a100b88671ad52..bd3ea96a785de507578729a621ec4ae7bad8a516
> >  100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_16.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_16.c
> > @@ -7,6 +7,6 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+crc\+dotprod\+crypto\+sve2} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+dotprod\+crc\+crypto\+sve2} } } */
> >  
> >  /* Test a normal looking procinfo.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_17.c 
> > b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_17.c
> > index 
> > c68a697aa3e97ef52fd7e90233c5bb4ac8dbddd9..33e6319b46dcebc717e8a415484093e980660fb5
> >  100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_17.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_17.c
> > @@ -7,6 +7,6 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+crc\+dotprod\+crypto\+sve2} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+dotprod\+crc\+crypto\+sve2} } } */
> >  
> >  /* Test a normal looking procinfo.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_18.c 
> > b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_18.c
> > index 
> > b5f0a3005f50cbf01edbcb8aefcc3c34aa11207f..abae7a7d1453f79f879ff5e24f7c67e819db1dbb
> >  100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_18.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_18.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8.6-a\+crc\+fp16\+aes\+sha3\+rng} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8.6-a\+rng\+crc\+aes\+sha3\+fp16} } } */
> >  
> >  /* Test one where the boundary of buffer size would overwrite the last
> >     character read when stitching the fgets-calls together.  With the
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_19.c 
> > b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_19.c
> > index 
> > 980d3f79dfb03b0d8eb68f691bf2dedf80aed87d..a5b4b4d3442c6522a8cdadf4eebd3b5460e37213
> >  100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_19.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_19.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch 
> > armv9-a\+crc\+profile\+memtag\+sve2-sm4\+sve2-aes\+sve2-sha3\+sve2-bitperm\+i8mm\+bf16\+nopauth\n}
> >  } } */
> > +/* { dg-final { scan-assembler {\.arch 
> > armv9-a\+crc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+memtag\+profile\+nopauth\n}
> >  } } */
> >  
> >  /* Test one that if the kernel doesn't report the availability of a 
> > mandatory
> >     feature that it has turned it off for whatever reason.  As such 
> > compilers
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_20.c 
> > b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_20.c
> > index 
> > 117df2b0b6cd5751d9f5175b4343aad9825a6c43..e12aa543d02924f268729f96fe1f17181287f097
> >  100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_20.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_20.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv9-a\+crc\+profile\+memtag\+sve2-sm4\+sve2-aes\+sve2-sha3\+sve2-bitperm\+i8mm\+bf16\n} } } */
> > +/* { dg-final { scan-assembler {\.arch armv9-a\+crc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+memtag\+profile\n} } } */
> >  
> >  /* Check whether features that don't have a midr name during detection are
> >     correctly ignored.  These features shouldn't affect the native detection.
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> > index efbd02cbdc0638db85e776f1e79043709c11df21..920e1d65711cbcb77b07441597180c0159ccabf9 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+crc\+lse\+rcpc\+rdma\+dotprod\+fp16fml\+sb\+ssbs\+sve2-sm4\+sve2-aes\+sve2-sha3\+sve2-bitperm\+i8mm\+bf16\+flagm\n} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+dotprod\+rdma\+lse\+crc\+fp16fml\+rcpc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
> >  
> >  /* Check that an Armv8-A core doesn't fall apart on extensions without midr
> >     values.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> > index d431d4938265d024891b464ac3d069607b21d8e7..416a29b514ab7599a7092e26e3716ec8a50cc895 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+crc\+lse\+rcpc\+rdma\+dotprod\+fp16fml\+sb\+ssbs\+sve2-sm4\+sve2-aes\+sve2-sha3\+sve2-bitperm\+i8mm\+bf16\+flagm\+pauth\n} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+dotprod\+rdma\+lse\+crc\+fp16fml\+rcpc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
> >  
> >  /* Check that an Armv8-A core doesn't fall apart on extensions without midr
> >     values and that it enables optional features.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_6.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_6.c
> > index 7608e8845a662219488effcdb8277006dcf457a9..907249c5c1e6a440731533407df0ff7caadcbf74 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_6.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_6.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+fp16\+crypto} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+crypto\+fp16} } } */
> >  
> > -/* Test one where the feature bits for crypto and fp16 are given in
> > -   same order as declared in options file.  */
> > +/* Test one where the crypto and fp16 options are specified in different
> > +   order from what is in the options file.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_7.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_7.c
> > index 72b14b4f6ed0d50a4fc8a35931fbd232b09d2b61..b68a07a7c16b7a3cc9a896cca152d78e5cf9ea2f 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_7.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_7.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+fp16\+crypto} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+crypto\+fp16} } } */
> >  
> > -/* Test one where the crypto and fp16 options are specified in different
> > -   order from what is in the options file.  */
> > +/* Test one where the feature bits for crypto and fp16 are given in
> > +   same order as declared in options file.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/options_set_17.c b/gcc/testsuite/gcc.target/aarch64/options_set_17.c
> > index c490e1f47a0a7a3adcbb7e96a3974d5651a023e8..4c53edd5cb92f83b3d34454c85062ff3f67b50ee 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/options_set_17.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/options_set_17.c
> > @@ -6,6 +6,6 @@ int main ()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8\.2-a\+crc\+dotprod} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8\.2-a\+dotprod\+crc} } } */
> >  
> >   /* dotprod needs to be emitted pre armv8.4.  */
> > diff --git a/libgcc/config/aarch64/cpuinfo.c b/libgcc/config/aarch64/cpuinfo.c
> > index 0888ca4ed058430f524b99cb0e204bd996fa0e55..78664d5a4287be0369a4b02e1b8ab4a885869352 100644
> > --- a/libgcc/config/aarch64/cpuinfo.c
> > +++ b/libgcc/config/aarch64/cpuinfo.c
> > @@ -22,6 +22,8 @@
> >     see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> >     <http://www.gnu.org/licenses/>.  */
> >  
> > +#include "common/config/aarch64/cpuinfo.h"
> > +
> >  #if defined(__has_include)
> >  #if __has_include(<sys/auxv.h>)
> >  #include <sys/auxv.h>
> > @@ -39,73 +41,6 @@ typedef struct __ifunc_arg_t {
> >  #if __has_include(<asm/hwcap.h>)
> >  #include <asm/hwcap.h>
> >  
> > -/* CPUFeatures must correspond to the same AArch64 features in aarch64.cc  */
> > -enum CPUFeatures {
> > -  FEAT_RNG,
> > -  FEAT_FLAGM,
> > -  FEAT_FLAGM2,
> > -  FEAT_FP16FML,
> > -  FEAT_DOTPROD,
> > -  FEAT_SM4,
> > -  FEAT_RDM,
> > -  FEAT_LSE,
> > -  FEAT_FP,
> > -  FEAT_SIMD,
> > -  FEAT_CRC,
> > -  FEAT_SHA1,
> > -  FEAT_SHA2,
> > -  FEAT_SHA3,
> > -  FEAT_AES,
> > -  FEAT_PMULL,
> > -  FEAT_FP16,
> > -  FEAT_DIT,
> > -  FEAT_DPB,
> > -  FEAT_DPB2,
> > -  FEAT_JSCVT,
> > -  FEAT_FCMA,
> > -  FEAT_RCPC,
> > -  FEAT_RCPC2,
> > -  FEAT_FRINTTS,
> > -  FEAT_DGH,
> > -  FEAT_I8MM,
> > -  FEAT_BF16,
> > -  FEAT_EBF16,
> > -  FEAT_RPRES,
> > -  FEAT_SVE,
> > -  FEAT_SVE_BF16,
> > -  FEAT_SVE_EBF16,
> > -  FEAT_SVE_I8MM,
> > -  FEAT_SVE_F32MM,
> > -  FEAT_SVE_F64MM,
> > -  FEAT_SVE2,
> > -  FEAT_SVE_AES,
> > -  FEAT_SVE_PMULL128,
> > -  FEAT_SVE_BITPERM,
> > -  FEAT_SVE_SHA3,
> > -  FEAT_SVE_SM4,
> > -  FEAT_SME,
> > -  FEAT_MEMTAG,
> > -  FEAT_MEMTAG2,
> > -  FEAT_MEMTAG3,
> > -  FEAT_SB,
> > -  FEAT_PREDRES,
> > -  FEAT_SSBS,
> > -  FEAT_SSBS2,
> > -  FEAT_BTI,
> > -  FEAT_LS64,
> > -  FEAT_LS64_V,
> > -  FEAT_LS64_ACCDATA,
> > -  FEAT_WFXT,
> > -  FEAT_SME_F64,
> > -  FEAT_SME_I64,
> > -  FEAT_SME2,
> > -  FEAT_RCPC3,
> > -  FEAT_MAX,
> > -  FEAT_EXT = 62, /* Reserved to indicate presence of additional features field
> > -               in __aarch64_cpu_features.  */
> > -  FEAT_INIT      /* Used as flag of features initialization completion.  */
> > -};
> > -
> >  /* Architecture features used in Function Multi Versioning.  */
> >  struct {
> >    unsigned long long features;