Hi Richard,

The purpose of this patch is to give more of the target code access to
the cumulative_args_t structure in order to enable certain (otherwise
currently impossible) stack layout constraints, specifically for
parameters that are passed over the stack. For example, there are some
targets (not yet in GCC trunk) which explicitly require the
distinction between named and variadic parameters in order to enforce
different alignment rules (when passing over the stack.) Such a
constraint would be accommodated by this patch.

The patch itself is very straightforward and simply adds the parameter
to the two functions which we'd like to extend. The motivation of
defining new target hooks was to minimize the patch size.

A concrete user of this change that we have in mind is the AArch64
Darwin parameter passing abi, which mandates that when passing on the
stack, named parameters are to be naturally-aligned, however variadic
ones are to be word-aligned. However this patch is completely generic
in nature and should enable all targets to have more control over
their parameter layout process.

Best Regards,
Maxim

On Mon, 15 Nov 2021 at 07:11, Richard Biener <richard.guent...@gmail.com> wrote:
>
> On Sat, Nov 13, 2021 at 10:43 AM Maxim Blinov <maxim.bli...@embecosm.com> 
> wrote:
> >
> > The two target hooks responsible for informing GCC about stack
> > parameter alignment are `TARGET_FUNCTION_ARG_BOUNDARY` and
> > `TARGET_FUNCTION_ARG_ROUND_BOUNDARY`, which currently only consider
> > the tree and machine_mode of a specific given argument.
> >
> > Create two new target hooks suffixed with '_CA', and pass in a third
> > `cumulative_args_t` parameter. This enables the backend to make
> > alignment decisions based on the context of the whole function rather
> > than individual parameters.
> >
> > The orignal machine_mode/tree type macros are not removed - they are
> > called by the default implementations of `TARGET_...BOUNDARY_CA` and
> > `TARGET_...ROUND_BOUNDARY_CA`. This is done with the intetnion of
> > avoiding large mechanical modifications of nearly every backend in
> > GCC. There is also a new flag, -fstack-use-cumulative-args, which
> > provides a way to completely bypass the new `..._CA` macros. This
> > feature is intended for debugging GCC itself.
>
> Just two quick comments without looking at the patch.
>
> Please do not introduce options in the user namespace -f... which are
> for debugging only.  I think you should go without this part instead.
>
> Second, you fail to motivate the change.  I cannot make sense of
> "This enables the backend to make alignment decisions based on the
> context of the whole function rather than individual parameters."
>
> Richard.
>
> >
> > gcc/ChangeLog:
> >
> >         * calls.c (initialize_argument_information): Pass `args_so_far`.
> >         * common.opt: New flag `-fstack-use-cumulative-args`.
> >         * config.gcc: No platforms currently use ..._CA-hooks: Set
> >         -fstack-use-cumulative-args to be off by default.
> >         * target.h (cumulative_args_t): Move declaration from here, to...
> >         * cumulative-args.h (cumulative_args_t): ...this new file. This is
> >         to permit backends to include the declaration of cumulative_args_t
> >         without dragging in circular dependencies.
> >         * function.c (assign_parm_find_entry_rtl): Provide
> >         cumulative_args_t to locate_and_pad_parm.
> >         (gimplify_parameters): Ditto.
> >         (locate_and_pad_parm): Conditionally call new hooks if we're
> >         invoked with -fstack-use-cumulative-args.
> >         * function.h: Include cumulative-args.h.
> >         (locate_and_pad_parm): Add cumulative_args_t parameter.
> >         * target.def (function_arg_boundary_ca): Add.
> >         (function_arg_round_boundary_ca): Ditto.
> >         * targhooks.c (default_function_arg_boundary_ca): Implement.
> >         (default_function_arg_round_boundary_ca): Ditto.
> >         * targhooks.h (default_function_arg_boundary_ca): Declare.
> >         (default_function_arg_round_boundary_ca): Ditto.
> >         * doc/invoke.texi (-fstack-use-cumulative-args): Document.
> >         * doc/tm.texi: Regenerate.
> >         * doc/tm.texi.in: Ditto.
> > ---
> >  gcc/calls.c           |  3 +++
> >  gcc/common.opt        |  4 ++++
> >  gcc/config.gcc        |  7 +++++++
> >  gcc/cumulative-args.h | 20 ++++++++++++++++++++
> >  gcc/doc/invoke.texi   | 12 ++++++++++++
> >  gcc/doc/tm.texi       | 20 ++++++++++++++++++++
> >  gcc/doc/tm.texi.in    |  4 ++++
> >  gcc/function.c        | 25 +++++++++++++++++++++----
> >  gcc/function.h        |  2 ++
> >  gcc/target.def        | 24 ++++++++++++++++++++++++
> >  gcc/target.h          | 17 +----------------
> >  gcc/targhooks.c       | 16 ++++++++++++++++
> >  gcc/targhooks.h       |  6 ++++++
> >  13 files changed, 140 insertions(+), 20 deletions(-)
> >  create mode 100644 gcc/cumulative-args.h
> >
> > diff --git a/gcc/calls.c b/gcc/calls.c
> > index 27b59f26ad3..cef612a6ef4 100644
> > --- a/gcc/calls.c
> > +++ b/gcc/calls.c
> > @@ -1527,6 +1527,7 @@ initialize_argument_information (int num_actuals 
> > ATTRIBUTE_UNUSED,
> >  #endif
> >                              reg_parm_stack_space,
> >                              args[i].pass_on_stack ? 0 : args[i].partial,
> > +                            args_so_far,
> >                              fndecl, args_size, &args[i].locate);
> >  #ifdef BLOCK_REG_PADDING
> >        else
> > @@ -4205,6 +4206,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, 
> > rtx value,
> >                            argvec[count].reg != 0,
> >  #endif
> >                            reg_parm_stack_space, 0,
> > +                          args_so_far,
> >                            NULL_TREE, &args_size, &argvec[count].locate);
> >
> >        if (argvec[count].reg == 0 || argvec[count].partial != 0
> > @@ -4296,6 +4298,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, 
> > rtx value,
> >                                argvec[count].reg != 0,
> >  #endif
> >                                reg_parm_stack_space, argvec[count].partial,
> > +                              args_so_far,
> >                                NULL_TREE, &args_size, 
> > &argvec[count].locate);
> >           args_size.constant += argvec[count].locate.size.constant;
> >           gcc_assert (!argvec[count].locate.size.var);
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index de9b848eda5..982417c1e39 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -2705,6 +2705,10 @@ fstack-usage
> >  Common RejectNegative Var(flag_stack_usage)
> >  Output stack usage information on a per-function basis.
> >
> > +fstack-use-cumulative-args
> > +Common RejectNegative Var(flag_stack_use_cumulative_args) 
> > Init(STACK_USE_CUMULATIVE_ARGS_INIT)
> > +Use cumulative args-based stack layout hooks.
> > +
> >  fstrength-reduce
> >  Common Ignore
> >  Does nothing.  Preserved for backward compatibility.
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index edd12655c4a..046d691af56 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -1070,6 +1070,13 @@ case ${target} in
> >    ;;
> >  esac
> >
> > +# Figure out if we need to enable -foff-stack-trampolines by default.
> > +case ${target} in
> > +*)
> > +  tm_defines="$tm_defines STACK_USE_CUMULATIVE_ARGS_INIT=0"
> > +  ;;
> > +esac
> > +
> >  case ${target} in
> >  aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
> >         tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
> > diff --git a/gcc/cumulative-args.h b/gcc/cumulative-args.h
> > new file mode 100644
> > index 00000000000..b60928e37f9
> > --- /dev/null
> > +++ b/gcc/cumulative-args.h
> > @@ -0,0 +1,20 @@
> > +#ifndef GCC_CUMULATIVE_ARGS_H
> > +#define GCC_CUMULATIVE_ARGS_H
> > +
> > +#if CHECKING_P
> > +
> > +struct cumulative_args_t { void *magic; void *p; };
> > +
> > +#else /* !CHECKING_P */
> > +
> > +/* When using a GCC build compiler, we could use
> > +   __attribute__((transparent_union)) to get cumulative_args_t function
> > +   arguments passed like scalars where the ABI would mandate a less
> > +   efficient way of argument passing otherwise.  However, that would come
> > +   at the cost of less type-safe !CHECKING_P compilation.  */
> > +
> > +union cumulative_args_t { void *p; };
> > +
> > +#endif /* !CHECKING_P */
> > +
> > +#endif /* GCC_CUMULATIVE_ARGS_H */
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 2aba4c70b44..7ffb2997c69 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -670,6 +670,7 @@ Objective-C and Objective-C++ Dialects}.
> >  -fverbose-asm  -fpack-struct[=@var{n}]  @gol
> >  -fleading-underscore  -ftls-model=@var{model} @gol
> >  -fstack-reuse=@var{reuse_level} @gol
> > +-fstack-use-cumulative-args @gol
> >  -ftrampolines  -ftrapv  -fwrapv @gol
> >  -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} @gol
> >  -fstrict-volatile-bitfields  -fsync-libcalls}
> > @@ -16625,6 +16626,17 @@ the behavior of older compilers in which 
> > temporaries' stack space is
> >  not reused, the aggressive stack reuse can lead to runtime errors. This
> >  option is used to control the temporary stack reuse optimization.
> >
> > +@item -fstack-use-cumulative-args
> > +@opindex fstack_use_cumulative_args
> > +This option instructs the compiler to use the
> > +@code{cumulative_args_t}-based stack layout target hooks,
> > +@code{TARGET_FUNCTION_ARG_BOUNDARY_CA} and
> > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA}. If a given target does
> > +not define these hooks, the default behaviour is to fallback to using
> > +the standard non-@code{_CA} variants instead. Certain targets (such as
> > +AArch64 Darwin) require using the more advanced @code{_CA}-based
> > +hooks: For these targets this option should be enabled by default.
> > +
> >  @item -ftrapv
> >  @opindex ftrapv
> >  This option generates traps for signed overflow on addition, subtraction,
> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > index 78a1af1ad4d..261eaf46ef5 100644
> > --- a/gcc/doc/tm.texi
> > +++ b/gcc/doc/tm.texi
> > @@ -4322,6 +4322,16 @@ with the specified mode and type.  The default hook 
> > returns
> >  @code{PARM_BOUNDARY} for all arguments.
> >  @end deftypefn
> >
> > +@deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_BOUNDARY_CA 
> > (machine_mode @var{mode}, const_tree @var{type}, cumulative_args_t @var{ca})
> > +This is the @code{cumulative_args_t}-based version of
> > +@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more
> > +fine-grained control over argument alignment, e.g. depending on whether
> > +it is a named argument or not, or any other criteria that you choose to
> > +place in the @var{ca} structure.
> > +
> > +The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.
> > +@end deftypefn
> > +
> >  @deftypefn {Target Hook} {unsigned int} TARGET_FUNCTION_ARG_ROUND_BOUNDARY 
> > (machine_mode @var{mode}, const_tree @var{type})
> >  Normally, the size of an argument is rounded up to @code{PARM_BOUNDARY},
> >  which is the default value for this hook.  You can define this hook to
> > @@ -4329,6 +4339,16 @@ return a different value if an argument size must be 
> > rounded to a larger
> >  value.
> >  @end deftypefn
> >
> > +@deftypefn {Target Hook} {unsigned int} 
> > TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA (machine_mode @var{mode}, const_tree 
> > @var{type}, cumulative_args_t @var{ca})
> > +This is the @code{cumulative_args_t}-based version of
> > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need 
> > more
> > +fine-grained control over argument size rounding, e.g. depending on whether
> > +it is a named argument or not, or any other criteria that you choose to
> > +place in the @var{ca} structure.
> > +
> > +The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.
> > +@end deftypefn
> > +
> >  @defmac FUNCTION_ARG_REGNO_P (@var{regno})
> >  A C expression that is nonzero if @var{regno} is the number of a hard
> >  register in which function arguments are sometimes passed.  This does
> > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> > index 4401550989e..933644dfa86 100644
> > --- a/gcc/doc/tm.texi.in
> > +++ b/gcc/doc/tm.texi.in
> > @@ -3330,8 +3330,12 @@ required.
> >
> >  @hook TARGET_FUNCTION_ARG_BOUNDARY
> >
> > +@hook TARGET_FUNCTION_ARG_BOUNDARY_CA
> > +
> >  @hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY
> >
> > +@hook TARGET_FUNCTION_ARG_ROUND_BOUNDARY_CA
> > +
> >  @defmac FUNCTION_ARG_REGNO_P (@var{regno})
> >  A C expression that is nonzero if @var{regno} is the number of a hard
> >  register in which function arguments are sometimes passed.  This does
> > diff --git a/gcc/function.c b/gcc/function.c
> > index 61b3bd036b8..1ebf3e0ffe5 100644
> > --- a/gcc/function.c
> > +++ b/gcc/function.c
> > @@ -2610,7 +2610,9 @@ assign_parm_find_entry_rtl (struct 
> > assign_parm_data_all *all,
> >
> >    locate_and_pad_parm (data->arg.mode, data->arg.type, in_regs,
> >                        all->reg_parm_stack_space,
> > -                      entry_parm ? data->partial : 0, 
> > current_function_decl,
> > +                      entry_parm ? data->partial : 0,
> > +                      all->args_so_far,
> > +                      current_function_decl,
> >                        &all->stack_args_size, &data->locate);
> >
> >    /* Update parm_stack_boundary if this parameter is passed in the
> > @@ -4027,6 +4029,7 @@ gimplify_parameters (gimple_seq *cleanup)
> >  void
> >  locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
> >                      int reg_parm_stack_space, int partial,
> > +                    cumulative_args_t ca,
> >                      tree fndecl ATTRIBUTE_UNUSED,
> >                      struct args_size *initial_offset_ptr,
> >                      struct locate_and_pad_arg_data *locate)
> > @@ -4064,9 +4067,23 @@ locate_and_pad_parm (machine_mode passed_mode, tree 
> > type, int in_regs,
> >               ? arg_size_in_bytes (type)
> >               : size_int (GET_MODE_SIZE (passed_mode)));
> >    where_pad = targetm.calls.function_arg_padding (passed_mode, type);
> > -  boundary = targetm.calls.function_arg_boundary (passed_mode, type);
> > -  round_boundary = targetm.calls.function_arg_round_boundary (passed_mode,
> > -                                                             type);
> > +
> > +  if (flag_stack_use_cumulative_args)
> > +    {
> > +      boundary = targetm.calls.function_arg_boundary_ca (passed_mode,
> > +                                                        type,
> > +                                                        ca);
> > +      round_boundary = targetm.calls.function_arg_round_boundary_ca
> > +       (passed_mode, type, ca);
> > +    }
> > +  else
> > +    {
> > +      boundary = targetm.calls.function_arg_boundary (passed_mode,
> > +                                                     type);
> > +      round_boundary = targetm.calls.function_arg_round_boundary
> > +       (passed_mode, type);
> > +    }
> > +
> >    locate->where_pad = where_pad;
> >
> >    /* Alignment can't exceed MAX_SUPPORTED_STACK_ALIGNMENT.  */
> > diff --git a/gcc/function.h b/gcc/function.h
> > index 899430833ce..26f342caba3 100644
> > --- a/gcc/function.h
> > +++ b/gcc/function.h
> > @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #ifndef GCC_FUNCTION_H
> >  #define GCC_FUNCTION_H
> >
> > +#include "cumulative-args.h"
> >
> >  /* Stack of pending (incomplete) sequences saved by `start_sequence'.
> >     Each element describes one pending sequence.
> > @@ -661,6 +662,7 @@ extern int aggregate_value_p (const_tree, const_tree);
> >  extern bool use_register_for_decl (const_tree);
> >  extern gimple_seq gimplify_parameters (gimple_seq *);
> >  extern void locate_and_pad_parm (machine_mode, tree, int, int, int,
> > +                                cumulative_args_t,
> >                                  tree, struct args_size *,
> >                                  struct locate_and_pad_arg_data *);
> >  extern void generate_setjmp_warnings (void);
> > diff --git a/gcc/target.def b/gcc/target.def
> > index 51ea167172b..bb56afab539 100644
> > --- a/gcc/target.def
> > +++ b/gcc/target.def
> > @@ -4959,6 +4959,18 @@ with the specified mode and type.  The default hook 
> > returns\n\
> >   unsigned int, (machine_mode mode, const_tree type),
> >   default_function_arg_boundary)
> >
> > +DEFHOOK
> > +(function_arg_boundary_ca,
> > + "This is the @code{cumulative_args_t}-based version of\n\
> > +@code{TARGET_FUNCTION_ARG_BOUNDARY}. Define this hook if you need more\n\
> > +fine-grained control over argument alignment, e.g. depending on whether\n\
> > +it is a named argument or not, or any other criteria that you choose to\n\
> > +place in the @var{ca} structure.\n\
> > +\n\
> > +The default hook will call @code{TARGET_FUNCTION_ARG_BOUNDARY}.",
> > + unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
> > + default_function_arg_boundary_ca)
> > +
> >  DEFHOOK
> >  (function_arg_round_boundary,
> >   "Normally, the size of an argument is rounded up to 
> > @code{PARM_BOUNDARY},\n\
> > @@ -4968,6 +4980,18 @@ value.",
> >   unsigned int, (machine_mode mode, const_tree type),
> >   default_function_arg_round_boundary)
> >
> > +DEFHOOK
> > +(function_arg_round_boundary_ca,
> > + "This is the @code{cumulative_args_t}-based version of\n\
> > +@code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}. Define this hook if you need 
> > more\n\
> > +fine-grained control over argument size rounding, e.g. depending on 
> > whether\n\
> > +it is a named argument or not, or any other criteria that you choose to\n\
> > +place in the @var{ca} structure.\n\
> > +\n\
> > +The default hook will call @code{TARGET_FUNCTION_ARG_ROUND_BOUNDARY}.",
> > + unsigned int, (machine_mode mode, const_tree type, cumulative_args_t ca),
> > + default_function_arg_round_boundary_ca)
> > +
> >  /* Return the diagnostic message string if function without a prototype
> >     is not allowed for this 'val' argument; NULL otherwise. */
> >  DEFHOOK
> > diff --git a/gcc/target.h b/gcc/target.h
> > index d8f45fb99c3..5b28ece7e1c 100644
> > --- a/gcc/target.h
> > +++ b/gcc/target.h
> > @@ -51,22 +51,7 @@
> >  #include "insn-codes.h"
> >  #include "tm.h"
> >  #include "hard-reg-set.h"
> > -
> > -#if CHECKING_P
> > -
> > -struct cumulative_args_t { void *magic; void *p; };
> > -
> > -#else /* !CHECKING_P */
> > -
> > -/* When using a GCC build compiler, we could use
> > -   __attribute__((transparent_union)) to get cumulative_args_t function
> > -   arguments passed like scalars where the ABI would mandate a less
> > -   efficient way of argument passing otherwise.  However, that would come
> > -   at the cost of less type-safe !CHECKING_P compilation.  */
> > -
> > -union cumulative_args_t { void *p; };
> > -
> > -#endif /* !CHECKING_P */
> > +#include "cumulative-args.h"
> >
> >  /* Types of memory operation understood by the "by_pieces" infrastructure.
> >     Used by the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P target hook and
> > diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> > index 6f071f80231..d83f362a5ae 100644
> > --- a/gcc/targhooks.c
> > +++ b/gcc/targhooks.c
> > @@ -850,6 +850,14 @@ default_function_arg_boundary (machine_mode mode 
> > ATTRIBUTE_UNUSED,
> >    return PARM_BOUNDARY;
> >  }
> >
> > +unsigned int
> > +default_function_arg_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
> > +                                 const_tree type ATTRIBUTE_UNUSED,
> > +                                 cumulative_args_t ca ATTRIBUTE_UNUSED)
> > +{
> > +  return default_function_arg_boundary (mode, type);
> > +}
> > +
> >  unsigned int
> >  default_function_arg_round_boundary (machine_mode mode ATTRIBUTE_UNUSED,
> >                                      const_tree type ATTRIBUTE_UNUSED)
> > @@ -857,6 +865,14 @@ default_function_arg_round_boundary (machine_mode mode 
> > ATTRIBUTE_UNUSED,
> >    return PARM_BOUNDARY;
> >  }
> >
> > +unsigned int
> > +default_function_arg_round_boundary_ca (machine_mode mode ATTRIBUTE_UNUSED,
> > +                                       const_tree type ATTRIBUTE_UNUSED,
> > +                                       cumulative_args_t ca 
> > ATTRIBUTE_UNUSED)
> > +{
> > +  return default_function_arg_round_boundary (mode, type);
> > +}
> > +
> >  void
> >  hook_void_bitmap (bitmap regs ATTRIBUTE_UNUSED)
> >  {
> > diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> > index 11e9d7dd1a8..f33374ef073 100644
> > --- a/gcc/targhooks.h
> > +++ b/gcc/targhooks.h
> > @@ -154,6 +154,12 @@ extern unsigned int default_function_arg_boundary 
> > (machine_mode,
> >                                                    const_tree);
> >  extern unsigned int default_function_arg_round_boundary (machine_mode,
> >                                                          const_tree);
> > +extern unsigned int default_function_arg_boundary_ca (machine_mode,
> > +                                                     const_tree,
> > +                                                     cumulative_args_t ca);
> > +extern unsigned int default_function_arg_round_boundary_ca (machine_mode,
> > +                                                           const_tree,
> > +                                                           
> > cumulative_args_t ca);
> >  extern bool hook_bool_const_rtx_commutative_p (const_rtx, int);
> >  extern rtx default_function_value (const_tree, const_tree, bool);
> >  extern HARD_REG_SET default_zero_call_used_regs (HARD_REG_SET);
> > --
> > 2.30.1 (Apple Git-130)
> >

Reply via email to