On Fri, Apr 12, 2024 at 05:41:11PM +0100, Richard Sandiford wrote: > Hi Andrew, > > Thanks for doing this. I think it improves the organisation of the > FMV documentation and adds some details that were previously missing. > > I've made some suggestions below, but documentation is subjective > and I realise that not everyone will agree with them. > > I've also added Sandra to cc: in case she has time to help with this. > [original patch: > https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649071.html] > > Andrew Carlotti <andrew.carlo...@arm.com> writes: > > Add target_version attribute to Common Function Attributes and update > > target and target_clones documentation. Move shared detail and examples > > to the Function Multiversioning page. Add target-specific details to > > target-specific pages. > > > > --- > > > > I've built and checked the info and dvi outputs. Ok for master? > > > > gcc/ChangeLog: > > > > * doc/extend.texi (Common Function Attributes): Update target > > and target_clones documentation, and add target_version. > > (AArch64 Function Attributes): Add ACLE reference and list > > supported features. > > (PowerPC Function Attributes): List supported features. > > (x86 Function Attributes): Mention function multiversioning. > > (Function Multiversioning): Update, and move shared detail here. > > > > > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > > index > > 7b54a241a7bfde03ce86571be9486b30bcea6200..78cc7ad2903b61a06b618b82ba7ad52ed42d944a > > 100644 > > --- a/gcc/doc/extend.texi > > +++ b/gcc/doc/extend.texi > > @@ -4178,18 +4178,27 @@ and @option{-Wanalyzer-tainted-size}. > > Multiple target back ends implement the @code{target} attribute > > to specify that a function is to > > be compiled with different target options than specified on the > > -command line. The original target command-line options are ignored. > > -One or more strings can be provided as arguments. 
> > -Each string consists of one or more comma-separated suffixes to > > -the @code{-m} prefix jointly forming the name of a machine-dependent > > -option. @xref{Submodel Options,,Machine-Dependent Options}. > > - > > +command line. One or more strings can be provided as arguments. > > +The attribute may override the original target command-line options, or it > > may > > +be combined with them in a target-specific manner. > > It's hard to tell from this what the conditions for "may" are, > e.g. whether it depends on the arguments, on the back end, or both. > Could you add a bit more text to clarify (even if it's just a forward > reference)?
I think it's better just to drop this sentence and leave it to the target-specific documentation to cover this. > With that extra text, and perhaps without, I think it's clearer to > say this after... > > > The @code{target} attribute can be used for instance to have a function > > compiled with a different ISA (instruction set architecture) than the > > -default. @samp{#pragma GCC target} can be used to specify target-specific > > +default. > > ...this. I.e.: > > Multiple target back ends implement [...] command-line. > The @code{target} attribute can be used [...] the default. > > <paragraph explaining the arguments and how they're interpreted> > > > + > > +@samp{#pragma GCC target} can be used to specify target-specific > > options for more than one function. @xref{Function Specific Option > > Pragmas}, > > for details about the pragma. > > > > +On x86, the @code{target} attribute can also be used to create multiple > > +versions of a function, compiled with different target-specific options. > > +@xref{Function Multiversioning} for more details. > > It might be clearer to put this at the end, since the rest of the section > goes back to talking about the non-FMV usage. Perhaps the same goes for > the pragma part. Agreed - I've reordered this. > > Also, how about saying that, on AArch64, the equivalent functionality > is provided by the target_version attribute? After reordering, this paragraph immediately precedes the short descriptions of target_clones and target_version, with the latter explicitly referring to AArch64. I don't think another mention of target_version is necessary. > > + > > +The options supported by the @code{target} attribute are specific to each > > +target; refer to @ref{x86 Function Attributes}, @ref{PowerPC Function > > +Attributes}, @ref{ARM Function Attributes}, @ref{AArch64 Function > > Attributes}, > > +@ref{Nios II Function Attributes}, and @ref{S/390 Function Attributes} > > +for details.
> > + > > For instance, on an x86, you could declare one function with the > > @code{target("sse4.1,arch=core2")} attribute and another with > > @code{target("sse4a,arch=amdfam10")}. This is equivalent to > > @@ -4211,39 +4220,18 @@ multiple options is equivalent to separating the > > option suffixes with > > a comma (@samp{,}) within a single string. Spaces are not permitted > > within the strings. > > > > -The options supported are specific to each target; refer to @ref{x86 > > -Function Attributes}, @ref{PowerPC Function Attributes}, > > -@ref{ARM Function Attributes}, @ref{AArch64 Function Attributes}, > > -@ref{Nios II Function Attributes}, and @ref{S/390 Function Attributes} > > -for details. > > - > > @cindex @code{target_clones} function attribute > > @item target_clones (@var{options}) > > The @code{target_clones} attribute is used to specify that a function > > -be cloned into multiple versions compiled with different target options > > -than specified on the command line. The supported options and restrictions > > -are the same as for @code{target} attribute. > > - > > -For instance, on an x86, you could compile a function with > > -@code{target_clones("sse4.1,avx")}. GCC creates two function clones, > > -one compiled with @option{-msse4.1} and another with @option{-mavx}. > > - > > -On a PowerPC, you can compile a function with > > -@code{target_clones("cpu=power9,default")}. GCC will create two > > -function clones, one compiled with @option{-mcpu=power9} and another > > -with the default options. GCC must be configured to use GLIBC 2.23 or > > -newer in order to use the @code{target_clones} attribute. > > - > > -It also creates a resolver function (see > > -the @code{ifunc} attribute above) that dynamically selects a clone > > -suitable for current architecture. The resolver is created only if there > > -is a usage of a function with @code{target_clones} attribute. 
> > - > > -Note that any subsequent call of a function without @code{target_clone} > > -from a @code{target_clone} caller will not lead to copying > > -(target clone) of the called function. > > -If you want to enforce such behaviour, > > -we recommend declaring the calling function with the @code{flatten} > > attribute? > > +should be cloned into multiple versions compiled with different target > > options > > +than specified on the command line. @xref{Function Multiversioning} for > > more > > +details. > > + > > +@cindex @code{target_version} function attribute > > +@item target_version (@var{options}) > > +The @code{target_version} attribute is used on AArch64 to create multiple > > +versions of a function, compiled with different target-specific options. > > +@xref{Function Multiversioning} for more details. > > > > @cindex @code{unavailable} function attribute > > @item unavailable > > @@ -4734,6 +4722,26 @@ Note that CPU tuning options and attributes such as > > the @option{-mcpu=}, > > @option{-mcpu=} option or the @code{cpu=} attribute conflicts with the > > architectural feature rules specified above. > > > > +@subsubsection Function multiversioning > > +The @code{target_version} and @code{target_clones} attributes can be used > > to > > +specify multiple versions of a function. Each version enables the > > specified > > +set of architecture extensions, in addition to any extensions that were > > already > > +enabled at the command line or using @code{target} attributes. For general > > +details, @pxref{Function Multiversioning}. There are further > > AArch64-specific > > +details available in the > > +@uref{https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning, > > +Arm C Language Extensions (ACLE) specification}. > > + > > +Some aspects of the ACLE specification are not yet supported. 
In > > particular, > > +the currently supported feature names are @code{rng}, @code{flagm}, > > @code{lse}, > > +@code{fp}, @code{simd}, @code{dotprod}, @code{sm4}, @code{rdma}, @code{rdm} > > +(alias of @code{rdma}), @code{crc}, @code{sha2}, @code{sha3}, @code{aes}, > > +@code{fp16}, @code{fp16fml}, @code{rcpc}, @code{rcpc3}, @code{i8mm}, > > +@code{bf16}, @code{rpres}, @code{sve}, @code{f32mm}, @code{f64mm}, > > @code{sve2}, > > +@code{sve2-aes}, @code{sve2-bitperm}, @code{sve2-sha3}, @code{sve2-sm4}, > > +@code{sme}, @code{memtag}, @code{sb}, @code{predres}, @code{ssbs}, > > @code{ls64}, > > +@code{sme-f64f64}, @code{sme-i16i64} and @code{sme2}. > > + > > @node AMD GCN Function Attributes > > @subsection AMD GCN Function Attributes > > > > @@ -6278,6 +6286,15 @@ default tuning specified on the command line. > > On the PowerPC, the inliner does not inline a > > function that has different target options than the caller, unless the > > callee has a subset of the target options of the caller. > > + > > +@cindex @code{target_clones} function attribute > > +@item target_clones (@var{options}) > > +The @code{target_clones} attribute can be used to create multiple versions > > of a > > +function for different supported architectures, with one version for each > > +specifier in the options list. One of these version specifiers must be the > > +@code{default} version. The other supported target specifiers are > > +@code{cpu=power6}, @code{cpu=power7}, @code{cpu=power8}, @code{cpu=power9} > > and > > +@code{cpu=power10}. For more details, @pxref{Function Multiversioning}. > > @end table > > > > @node RISC-V Function Attributes > > @@ -6872,7 +6889,9 @@ will crash if the wrong kind of handler is used. > > @cindex @code{target} function attribute > > @item target (@var{options}) > > As discussed in @ref{Common Function Attributes}, this attribute > > -allows specification of target-specific compilation options. 
> > +allows specification of target-specific compilation options. It can also > > be > > +used to create multiple versions of a single function > > +(@pxref{Function Multiversioning}). > > > > On the x86, the following options are allowed: > > @table @samp > > @@ -29430,11 +29449,62 @@ For the effects of the @code{hot} attribute on > > functions, see > > @section Function Multiversioning > > @cindex function versions > > > > -With the GNU C++ front end, for x86 targets, you may specify multiple > > -versions of a function, where each function is specialized for a > > -specific target feature. At runtime, the appropriate version of the > > -function is automatically executed depending on the characteristics of > > -the execution platform. Here is an example. > > +On some targets it is possible to specify multiple versions of a function, > > +where each version of the function is specialized for a different set of > > target > > +features. At runtime, characteristics of the execution platform are > > checked, > > +and the most appropriate version of the function is chosen to be executed > > +depending on the available architecture features. One of the versions > > will be > > +a "default" version, which will be chosen if none of the criteria for the > > other > > +versions are met. > > + > > +Function multiversioning is implemented using the STT_GNU_IFUNC symbol type > > +extension to the ELF standard. This is same mechanism used by the > > @code{ifunc} > > +attribute (@pxref{Common Function Attributes}). However, the compiler > > +automatically generates a resolver function that checks which features are > > +available at runtime. This resolver uses GLIBC's hardware capability > > bits, and > > +therefore requires GCC to be configured to use GLIBC 2.23 or newer. The > > +resolver is run once at startup, and the resulting function pointer is then > > +stored in the dynamic symbol table. 
> > This is good implementation information, but perhaps it'd be better to > put it later in the section, after describing the more user-facing aspects. I've moved this to the end of the section. > > + > > +Function multiversioning is enabled by annoting the function versions with > > one > > annotating Fixed. > > +of three function attributes. > > + > > +The @code{target} attribute can be used on x86 targets. Multiversioning > > with > > +the @code{target} attribute is supported only in the C++ frontend. One > > version > > +must be explicitly labelled as the "default" version; this version retains > > the > > +original mangled name, and will therefore be called directly by any callers > > +from translation units compiled without the target version attributes. > > + > > +The @code{target_version} attribute can be used on AArch64 targets. > > +Multiversioning with the @code{target_version} attribute is supported only > > in > > +the C++ frontend. This attribute behaves similarly to the @code{target} > > +attribute, with two differences. Firstly, the @code{target_version} > > attribute > > +is optional on the default version; the use of multiversioning can be > > inferred > > +by the presence of other non-default versions of the function. Secondly, > > the > > +original mangled name is used for the dispatched version of the function; > > this > > +means that the specialized function versions can be accessed from other > > +translation units without needing to include the additional versions and > > +function attributes in header files. > > Here too I think it would be better to avoid talking about the > implementation details at this level (mangling) and instead concentrate > on the user-visible behaviour (the end part of the sentence). I've rewritten these two paragraphs without the mangling details, and added a separate explanation of the mangling difference at the end of the section.
> > + > > +The @code{target_clones} attribute can be used on AArch64, PowerPC and x86 > > +targets. It behaves similarly to the @code{target_version} attribute, > > except > > +that only one copy of the function is included in the source file. The > > +attribute takes a list of version specifiers and produces one copy of the > > +function for each specifier. This is useful in cases where the compiler is > > +capable of generating optimized code (with autovectorization, for example) > > +using architecture features enabled only in the more specialized function > > +versions. For example, on PowerPC, compiling a function with > > +@code{target_clones("default,cpu=power9")} will create two function clones > > - > > +one compiled with @option{-mcpu=power9}, and another with the default > > options. > > +The @code{target_clones} attribute is available in the C, C++, D and Ada > > +frontends. > > + > > +Function multiversioning attributes do not propogate from a versioned > > +function to its callees, although a callee can still be optimised using the > > +caller's extra target features if it has been inlined directly into the > > caller. > > It might be better to avoid the passive here. "Has been" sounds like > something that has already happened when the compiler is run, rather than > something that the compiler decides for itself. Changed to "is". > > + > > +Here is an example of function multiversioning on x86 using the > > @code{target} > > +attribute. > > > > @smallexample > > __attribute__ ((target ("default"))) > > @@ -29474,15 +29544,19 @@ int main () > > @end smallexample > > > > In the above example, four versions of function foo are created. The > > -first version of foo with the target attribute "default" is the default > > +first version of foo, with the target attribute "default", is the default > > Pre-existing, but this should probably be @samp{"default"} > > > version. 
This version gets executed when no other target specific > > -version qualifies for execution on a particular platform. A new version > > -of foo is created by using the same function signature but with a > > -different target string. Function foo is called or a pointer to it is > > -taken just like a regular function. GCC takes care of doing the > > +version qualifies for execution on a particular platform. Other versions > > +of foo are created by using the same function signature but with a > > Also pre-existing, but: @samp{foo}. Same below. Fixed. New version to follow. > Thanks, > Richard > > > +different target string. The function foo can be called or a pointer to it > > +can be taken just like a for regular function. GCC takes care of doing the > > dispatching to call the right version at runtime. Refer to the > > @uref{https://gcc.gnu.org/wiki/FunctionMultiVersioning, GCC wiki on > > -Function Multiversioning} for more details. > > +Function Multiversioning} for more details of the implementation. > > + > > +For details of the target options supported on each target, refer to > > +@ref{AArch64 Function Attributes}, @ref{PowerPC Function Attributes}, > > +and @ref{x86 Function Attributes}. > > > > @node Type Traits > > @section Type Traits
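
For anyone following the thread, the target_clones usage described in the patch can be sketched roughly as below. This is an illustration only and not text from the patch: the function name sum_array, the SUM_CLONES macro and the choice of "avx2" as the non-default specifier are all made up for the example. The guard is an assumption about where the attribute is safe to use (x86-64 GNU/Linux with glibc); elsewhere it expands to nothing and a single ordinary function is built.

```cpp
// Illustrative sketch only; sum_array and SUM_CLONES are hypothetical names.
#include <cstddef>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

// Assumption: only request clones where the attribute and ifunc support are
// expected to be present; otherwise fall back to one ordinary function.
#if defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__) \
    && defined(__GNUC__) && __has_attribute(target_clones)
#define SUM_CLONES __attribute__((target_clones("default", "avx2")))
#else
#define SUM_CLONES
#endif

// One source definition; with the attribute active the compiler emits a
// "default" clone, an "avx2" clone, and a resolver that picks between them.
SUM_CLONES
long sum_array(const int *values, std::size_t count)
{
    long total = 0;
    // A simple reduction loop that the vectorizer may treat differently
    // in each clone.
    for (std::size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}
```

Callers use sum_array like any other function, and the result is the same whichever clone is selected; only the generated code differs.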