On Fri, Apr 12, 2024 at 05:41:11PM +0100, Richard Sandiford wrote:
> Hi Andrew,
> 
> Thanks for doing this.  I think it improves the organisation of the
> FMV documentation and adds some details that were previously missing.
> 
> I've made some suggestions below, but documentation is subjective
> and I realise that not everyone will agree with them.
> 
> I've also added Sandra to cc: in case she has time to help with this.
> [original patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649071.html]
> 
> Andrew Carlotti <andrew.carlo...@arm.com> writes:
> > Add target_version attribute to Common Function Attributes and update
> > target and target_clones documentation.  Move shared detail and examples
> > to the Function Multiversioning page.  Add target-specific details to
> > target-specific pages.
> >
> > ---
> >
> > I've built and checked the info and dvi outputs.  Ok for master?
> >
> > gcc/ChangeLog:
> >
> >     * doc/extend.texi (Common Function Attributes): Update target
> >     and target_clones documentation, and add target_version.
> >     (AArch64 Function Attributes): Add ACLE reference and list
> >     supported features.
> >     (PowerPC Function Attributes): List supported features.
> >     (x86 Function Attributes): Mention function multiversioning.
> >     (Function Multiversioning): Update, and move shared detail here.
> >
> >
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > > index 7b54a241a7bfde03ce86571be9486b30bcea6200..78cc7ad2903b61a06b618b82ba7ad52ed42d944a 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -4178,18 +4178,27 @@ and @option{-Wanalyzer-tainted-size}.
> >  Multiple target back ends implement the @code{target} attribute
> >  to specify that a function is to
> >  be compiled with different target options than specified on the
> > -command line.  The original target command-line options are ignored.
> > -One or more strings can be provided as arguments.
> > -Each string consists of one or more comma-separated suffixes to
> > -the @code{-m} prefix jointly forming the name of a machine-dependent
> > -option.  @xref{Submodel Options,,Machine-Dependent Options}.
> > -
> > +command line.  One or more strings can be provided as arguments.
> > > +The attribute may override the original target command-line options, or it may
> > +be combined with them in a target-specific manner.
> 
> It's hard to tell from this what the conditions for "may" are,
> e.g. whether it depends on the arguments, on the back end, or both.
> Could you add a bit more text to clarify (even if it's just a forward
> reference)?

I think it's better just to drop this sentence and leave it to the
target-specific documentation to cover this.

> With that extra text, and perhaps without, I think it's clearer to
> say this after...
> 
> >  The @code{target} attribute can be used for instance to have a function
> >  compiled with a different ISA (instruction set architecture) than the
> > -default.  @samp{#pragma GCC target} can be used to specify target-specific
> > +default.
> 
> ...this.  I.e.:
> 
>   Multiple target back ends implement [...] command-line.  
>   The @code{target} attribute can be used [...] the default.
> 
>   <paragraph explaining the arguments and how they're interpreted>
> 
> > +
> > +@samp{#pragma GCC target} can be used to specify target-specific
> > >  options for more than one function.  @xref{Function Specific Option Pragmas},
> >  for details about the pragma.
> >  
> > +On x86, the @code{target} attribute can also be used to create multiple
> > +versions of a function, compiled with different target-specific options.
> > +@xref{Function Multiversioning} for more details.
> 
> It might be clearer to put this at the end, since the rest of the section
> goes back to talking about the non-FMV usage.  Perhaps the same goes for
> the pragma part.

Agreed - I've reordered this.

> 
> Also, how about saying that, on AArch64, the equivalent functionality
> is provided by the target_version attribute?

After reordering, this paragraph immediately precedes the short descriptions of
target_clones and target_version, with the latter explicitly referring to
AArch64.  I don't think another mention of target_version is necessary.

> > +
> > +The options supported by the @code{target} attribute are specific to each
> > +target; refer to @ref{x86 Function Attributes}, @ref{PowerPC Function
> > > +Attributes}, @ref{ARM Function Attributes}, @ref{AArch64 Function Attributes},
> > +@ref{Nios II Function Attributes}, and @ref{S/390 Function Attributes}
> > +for details.
> > +
> >  For instance, on an x86, you could declare one function with the
> >  @code{target("sse4.1,arch=core2")} attribute and another with
> >  @code{target("sse4a,arch=amdfam10")}.  This is equivalent to
> > @@ -4211,39 +4220,18 @@ multiple options is equivalent to separating the 
> > option suffixes with
> >  a comma (@samp{,}) within a single string.  Spaces are not permitted
> >  within the strings.
> >  
> > -The options supported are specific to each target; refer to @ref{x86
> > -Function Attributes}, @ref{PowerPC Function Attributes},
> > -@ref{ARM Function Attributes}, @ref{AArch64 Function Attributes},
> > -@ref{Nios II Function Attributes}, and @ref{S/390 Function Attributes}
> > -for details.
> > -
> >  @cindex @code{target_clones} function attribute
> >  @item target_clones (@var{options})
> >  The @code{target_clones} attribute is used to specify that a function
> > -be cloned into multiple versions compiled with different target options
> > -than specified on the command line.  The supported options and restrictions
> > -are the same as for @code{target} attribute.
> > -
> > -For instance, on an x86, you could compile a function with
> > -@code{target_clones("sse4.1,avx")}.  GCC creates two function clones,
> > -one compiled with @option{-msse4.1} and another with @option{-mavx}.
> > -
> > -On a PowerPC, you can compile a function with
> > -@code{target_clones("cpu=power9,default")}.  GCC will create two
> > -function clones, one compiled with @option{-mcpu=power9} and another
> > -with the default options.  GCC must be configured to use GLIBC 2.23 or
> > -newer in order to use the @code{target_clones} attribute.
> > -
> > -It also creates a resolver function (see
> > -the @code{ifunc} attribute above) that dynamically selects a clone
> > -suitable for current architecture.  The resolver is created only if there
> > -is a usage of a function with @code{target_clones} attribute.
> > -
> > -Note that any subsequent call of a function without @code{target_clone}
> > -from a @code{target_clone} caller will not lead to copying
> > -(target clone) of the called function.
> > -If you want to enforce such behaviour,
> > > -we recommend declaring the calling function with the @code{flatten} attribute?
> > > +should be cloned into multiple versions compiled with different target options
> > > +than specified on the command line.  @xref{Function Multiversioning} for more
> > > +details.
> > +
> > +@cindex @code{target_version} function attribute
> > +@item target_version (@var{options})
> > +The @code{target_version} attribute is used on AArch64 to create multiple
> > +versions of a function, compiled with different target-specific options.
> > +@xref{Function Multiversioning} for more details.
> >  
> >  @cindex @code{unavailable} function attribute
> >  @item unavailable
> > @@ -4734,6 +4722,26 @@ Note that CPU tuning options and attributes such as 
> > the @option{-mcpu=},
> >  @option{-mcpu=} option or the @code{cpu=} attribute conflicts with the
> >  architectural feature rules specified above.
> >  
> > +@subsubsection Function multiversioning
> > > +The @code{target_version} and @code{target_clones} attributes can be used to
> > > +specify multiple versions of a function.  Each version enables the specified
> > > +set of architecture extensions, in addition to any extensions that were already
> > > +enabled at the command line or using @code{target} attributes. For general
> > > +details, @pxref{Function Multiversioning}.  There are further AArch64-specific
> > > +details available in the
> > +@uref{https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning,
> > +Arm C Language Extensions (ACLE) specification}.
> > +
> > > +Some aspects of the ACLE specification are not yet supported.  In particular,
> > > +the currently supported feature names are @code{rng}, @code{flagm}, @code{lse},
> > > +@code{fp}, @code{simd}, @code{dotprod}, @code{sm4}, @code{rdma}, @code{rdm}
> > > +(alias of @code{rdma}), @code{crc}, @code{sha2}, @code{sha3}, @code{aes},
> > > +@code{fp16}, @code{fp16fml}, @code{rcpc}, @code{rcpc3}, @code{i8mm},
> > > +@code{bf16}, @code{rpres}, @code{sve}, @code{f32mm}, @code{f64mm}, @code{sve2},
> > > +@code{sve2-aes}, @code{sve2-bitperm}, @code{sve2-sha3}, @code{sve2-sm4},
> > > +@code{sme}, @code{memtag}, @code{sb}, @code{predres}, @code{ssbs}, @code{ls64},
> > > +@code{sme-f64f64}, @code{sme-i16i64} and @code{sme2}.
> > +
> >  @node AMD GCN Function Attributes
> >  @subsection AMD GCN Function Attributes
> >  
> > @@ -6278,6 +6286,15 @@ default tuning specified on the command line.
> >  On the PowerPC, the inliner does not inline a
> >  function that has different target options than the caller, unless the
> >  callee has a subset of the target options of the caller.
> > +
> > +@cindex @code{target_clones} function attribute
> > +@item target_clones (@var{options})
> > > +The @code{target_clones} attribute can be used to create multiple versions of a
> > > +function for different supported architectures, with one version for each
> > > +specifier in the options list.  One of these version specifiers must be the
> > > +@code{default} version. The other supported target specifiers are
> > > +@code{cpu=power6}, @code{cpu=power7}, @code{cpu=power8}, @code{cpu=power9} and
> > +@code{cpu=power10}.  For more details, @pxref{Function Multiversioning}.
> >  @end table
> >  
> >  @node RISC-V Function Attributes
> > @@ -6872,7 +6889,9 @@ will crash if the wrong kind of handler is used.
> >  @cindex @code{target} function attribute
> >  @item target (@var{options})
> >  As discussed in @ref{Common Function Attributes}, this attribute 
> > -allows specification of target-specific compilation options.
> > > +allows specification of target-specific compilation options.  It can also be
> > +used to create multiple versions of a single function
> > +(@pxref{Function Multiversioning}).
> >  
> >  On the x86, the following options are allowed:
> >  @table @samp
> > @@ -29430,11 +29449,62 @@ For the effects of the @code{hot} attribute on 
> > functions, see
> >  @section Function Multiversioning
> >  @cindex function versions
> >  
> > -With the GNU C++ front end, for x86 targets, you may specify multiple
> > -versions of a function, where each function is specialized for a
> > -specific target feature.  At runtime, the appropriate version of the
> > -function is automatically executed depending on the characteristics of
> > -the execution platform.  Here is an example.
> > +On some targets it is possible to specify multiple versions of a function,
> > > +where each version of the function is specialized for a different set of target
> > > +features.  At runtime, characteristics of the execution platform are checked,
> > > +and the most appropriate version of the function is chosen to be executed
> > > +depending on the available architecture features.  One of the versions will be
> > > +a "default" version, which will be chosen if none of the criteria for the other
> > > +versions are met.
> > +
> > +Function multiversioning is implemented using the STT_GNU_IFUNC symbol type
> > > +extension to the ELF standard.  This is same mechanism used by the @code{ifunc}
> > > +attribute (@pxref{Common Function Attributes}).  However, the compiler
> > > +automatically generates a resolver function that checks which features are
> > > +available at runtime.  This resolver uses GLIBC's hardware capability bits, and
> > +therefore requires GCC to be configured to use GLIBC 2.23 or newer.  The
> > +resolver is run once at startup, and the resulting function pointer is then
> > +stored in the dynamic symbol table.
> 
> This is good implementation information, but perhaps it'd be better to
> put it later in the section, after describing the more user-facing aspects.

I've moved this to the end of the section.
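
For readers following the thread, the "resolve once, then call through a
pointer" idea can be sketched by hand as below.  This is only a rough
analogy under stated assumptions, not what GCC emits: the real mechanism
uses an STT_GNU_IFUNC symbol resolved by the dynamic loader, whereas this
sketch uses an ordinary function pointer.  The names sum_generic,
sum_avx2_stub, resolve_sum and sum are hypothetical; the only compiler
facilities used are the x86 builtins __builtin_cpu_init and
__builtin_cpu_supports.

```cpp
#include <cstddef>

// Hypothetical baseline kernel.
long sum_generic(const int *v, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += v[i];
    return total;
}

// Hypothetical stand-in for a kernel that would be compiled for AVX2.
// It simply forwards to the baseline so the sketch stays portable.
long sum_avx2_stub(const int *v, std::size_t n) {
    return sum_generic(v, n);
}

using sum_fn = long (*)(const int *, std::size_t);

// Hand-written "resolver": inspects the CPU and returns one of the kernels.
sum_fn resolve_sum() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2_stub;
#endif
    return sum_generic;  // plays the role of the "default" version
}

// Public entry point: the resolver runs once, on the first call, and the
// chosen pointer is reused afterwards.
long sum(const int *v, std::size_t n) {
    static const sum_fn chosen = resolve_sum();
    return chosen(v, n);
}
```

Callers only ever see sum(); which kernel runs is decided once per
process, which is the user-visible property the moved paragraph describes.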
 
> > +
> > > +Function multiversioning is enabled by annoting the function versions with one
> 
> annotating

Fixed.

> > +of three function attributes.
> > +
> > > +The @code{target} attribute can be used on x86 targets.  Multiversioning with
> > > +the @code{target} attribute is supported only in the C++ frontend.  One version
> > > +must be explicitly labelled as the "default" version; this version retains the
> > +original mangled name, and will therefore be called directly by any callers
> > +from translation units compiled without the target version attributes.
> > +
> > +The @code{target_version} attribute can be used on AArch64 targets.
> > > +Multiversioning with the @code{target_version} attribute is supported only in
> > > +the C++ frontend.  This attribute behaves similarly to the @code{target}
> > > +attribute, with two differences.  Firstly, the @code{target_version} attribute
> > > +is optional on the default version; the use of multiversioning can be inferred
> > > +by the presence of other non-default versions of the function.  Secondly, the
> > > +original mangled name is used for the dispatched version of the function; this
> > +means that the specialized function versions can be accessed from other
> > +translation units without needing to include the additional versions and
> > +function attributes in header files.
> 
> Here too I think it would be better to avoid talking about the
> implementation details at this level (mangling) and instead concentrate
> on the user-visible behaviour (the end part of the sentence).

I've rewritten these two paragraphs without the mangling details, and added a
separate explanation of the mangling difference at the end of the section.
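
As a concrete illustration of the x86 behaviour described in the quoted
paragraph (one version explicitly labelled "default", the others selected
at runtime), here is a minimal sketch.  The function name isa_level, the
HAVE_TARGET_FMV macro and the preprocessor guard are my own assumptions;
the guard is only there so that the snippet still builds on targets or
C libraries without ifunc support.

```cpp
#include <cstdlib>  // on glibc systems this also makes __GLIBC__ visible

// Assumption: only enable multiversioning on x86-64 GNU/Linux with glibc.
#if defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__) && \
    defined(__GNUC__) && (!defined(__clang__) || __clang_major__ >= 7)
#define HAVE_TARGET_FMV 1
#else
#define HAVE_TARGET_FMV 0
#endif

#if HAVE_TARGET_FMV
// The explicitly labelled default version: used when nothing else matches.
__attribute__((target("default")))
int isa_level() { return 0; }

// Version selected on CPUs that support SSE4.2.
__attribute__((target("sse4.2")))
int isa_level() { return 1; }

// Version selected on CPUs that support AVX2.
__attribute__((target("avx2")))
int isa_level() { return 2; }
#else
// Fallback: a single ordinary function on other platforms.
int isa_level() { return 0; }
#endif
```

A caller just writes isa_level(); the value returned depends on the
machine running the program, so no particular result should be assumed.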

> > +
> > +The @code{target_clones} attribute can be used on AArch64, PowerPC and x86
> > > +targets.  It behaves similarly to the @code{target_version} attribute, except
> > > +that only one copy of the function is included in the source file.  The
> > > +attribute takes a list of version specifiers and produces one copy of the
> > > +function for each specifier.  This is useful in cases where the compiler is
> > > +capable of generating optimized code (with autovectorization, for example)
> > > +using architecture features enabled only in the more specialized function
> > > +versions.  For example, on PowerPC, compiling a function with
> > > +@code{target_clones("default,cpu=power9")} will create two function clones -
> > > +one compiled with @option{-mcpu=power9}, and another with the default options.
> > +The @code{target_clones} attribute is available in the C, C++, D and Ada
> > +frontends.
> > +
> > +Function multiversioning attributes do not propogate from a versioned
> > +function to its callees, although a callee can still be optimised using the
> > > +caller's extra target features if it has been inlined directly into the caller.
> 
> It might be better to avoid the passive here.  "Has been" sounds like
> something that has already happened when the compiler is run, rather than
> something that the compiler decides for itself.

Changed to "is".
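
To make the callee point concrete, here is a small sketch of a cloned
function calling an ordinary helper.  The names scale, scaled_sum and the
CLONE_ATTR macro are hypothetical, and the guard around target_clones is
an assumption of mine so the code still compiles where the attribute or
ifunc support is unavailable.

```cpp
#include <cstddef>
#include <cstdlib>  // on glibc systems this also makes __GLIBC__ visible

// Assumption: only request clones on x86-64 GNU/Linux with glibc and a
// compiler that reports support for the attribute.
#if defined(__has_attribute)
#  if __has_attribute(target_clones) && defined(__x86_64__) && \
      defined(__linux__) && defined(__GLIBC__)
#    define CLONE_ATTR __attribute__((target_clones("avx2", "default")))
#  endif
#endif
#ifndef CLONE_ATTR
#  define CLONE_ATTR  // expands to nothing: a single ordinary function
#endif

// Ordinary callee with no multiversioning attribute.  It is not cloned;
// it can only benefit from the caller's extra target features if the
// compiler decides to inline it into a given clone.
inline int scale(int x) { return 3 * x; }

// Versioned caller: one source copy, one compiled copy per specifier.
CLONE_ATTR
long scaled_sum(const int *v, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += scale(v[i]);
    return total;
}
```

Whether scale() is actually inlined into each clone is up to the
optimizer, which is why the wording now says "is" rather than "has been".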

> > +
> > > +Here is an example of function multiversioning on x86 using the @code{target}
> > +attribute.
> >  
> >  @smallexample
> >  __attribute__ ((target ("default")))
> > @@ -29474,15 +29544,19 @@ int main ()
> >  @end smallexample
> >  
> >  In the above example, four versions of function foo are created. The
> > -first version of foo with the target attribute "default" is the default
> > +first version of foo, with the target attribute "default", is the default
> 
> Pre-existing, but this should probably be @samp{"default"}
> 
> >  version.  This version gets executed when no other target specific
> > -version qualifies for execution on a particular platform. A new version
> > -of foo is created by using the same function signature but with a
> > -different target string.  Function foo is called or a pointer to it is
> > -taken just like a regular function.  GCC takes care of doing the
> > +version qualifies for execution on a particular platform.  Other versions
> > +of foo are created by using the same function signature but with a
> 
> Also pre-existing, but: @samp{foo}.  Same below.

Fixed.  New version to follow.
 
> Thanks,
> Richard
> 
> > +different target string.  The function foo can be called or a pointer to it
> > +can be taken just like a for regular function.  GCC takes care of doing the
> >  dispatching to call the right version at runtime.  Refer to the
> >  @uref{https://gcc.gnu.org/wiki/FunctionMultiVersioning, GCC wiki on
> > -Function Multiversioning} for more details.
> > +Function Multiversioning} for more details of the implementation.
> > +
> > +For details of the target options supported on each target, refer to
> > +@ref{AArch64 Function Attributes}, @ref{PowerPC Function Attributes},
> > +and @ref{x86 Function Attributes}.
> >  
> >  @node Type Traits
> >  @section Type Traits
