On Wed, 17 Jan 2024, Jan Hubicka wrote: > > > +falign-all-functions > > > +Common Var(flag_align_all_functions) Optimization > > > +Align the start of functions. > > > > all functions > > > > or maybe "of every function."? > > Fixed, thanks! > > > +@opindex falign-all-functions=@var{n} > > > +@item -falign-all-functions > > > +Specify minimal alignment for function entry. Unlike > > > @option{-falign-functions} > > > +this alignment is applied also to all functions (even those considered > > > cold). > > > +The alignment is also not affected by @option{-flimit-function-alignment} > > > + > > > > For functions with two entries (like on powerpc), which entry does this > > apply to? I suppose the external ABI entry, not the local one? But > > how does this then help to align the patchable entry (the common > > local entry should be aligned?). Should we align _both_ entries? > > To be honest I did not really know we actually would like to patch > alternative entry points. > The function alignent is always produced before the start of function, > so the first entry point wins and the other entry point is not aligned. > > Aligning later labels needs to go using the label align code, since > theoretically some targets need to do relaxation over it. > > In final.cc we do no function alignments on labels those labels. > I guess this makes sense because if we align for performance, we > probably do not want the altenrate entry point to be aligned since it > appears close to original one. I can add that to compute_alignment: > test if label is alternative entry point and add alignment. > I wonder if that is a desired behaviour though and is this code > path even used? > > I know this was originally added to support i386 register passing > conventions and stack alignment via alternative entry point, but it was > never really used that way. Also there was plan to support Fortran > alternative entry point. > > Looking at what rs6000 does, it seems to not use the RTL representation > of alternative entry points. it seems that: > 1) call assemble_start_functions which > a) outputs function alignment > b) outputs start label > c) calls print_patchable_function_entry > 2) call final_start_functions which calls output_function_prologue. > In rs6000 there is second call to > rs6000_print_patchable_function_entry > So there is no target-independent place where alignment can be added, > so I would say it is up to rs6000 maintainers to decide what is right > here :)
Fair enough ... > > > > > @opindex falign-labels > > > @item -falign-labels > > > @itemx -falign-labels=@var{n} > > > @@ -14240,6 +14250,8 @@ Enabled at levels @option{-O2}, @option{-O3}. > > > Align loops to a power-of-two boundary. If the loops are executed > > > many times, this makes up for any execution of the dummy padding > > > instructions. > > > +This is an optimization of code performance and alignment is ignored for > > > +loops considered cold. > > > > > > If @option{-falign-labels} is greater than this value, then its value > > > is used instead. > > > @@ -14262,6 +14274,8 @@ Enabled at levels @option{-O2}, @option{-O3}. > > > Align branch targets to a power-of-two boundary, for branch targets > > > where the targets can only be reached by jumping. In this case, > > > no dummy operations need be executed. > > > +This is an optimization of code performance and alignment is ignored for > > > +jumps considered cold. > > > > > > If @option{-falign-labels} is greater than this value, then its value > > > is used instead. > > > @@ -14371,7 +14385,7 @@ To use the link-time optimizer, @option{-flto} > > > and optimization > > > options should be specified at compile time and during the final link. > > > It is recommended that you compile all the files participating in the > > > same link with the same options and also specify those options at > > > -link time. > > > +link time. > > > For example: > > > > > > @smallexample > > > diff --git a/gcc/flags.h b/gcc/flags.h > > > index e4bafa310d6..ecf4fb9e846 100644 > > > --- a/gcc/flags.h > > > +++ b/gcc/flags.h > > > @@ -89,6 +89,7 @@ public: > > > align_flags x_align_jumps; > > > align_flags x_align_labels; > > > align_flags x_align_functions; > > > + align_flags x_align_all_functions; > > > }; > > > > > > extern class target_flag_state default_target_flag_state; > > > @@ -98,10 +99,11 @@ extern class target_flag_state > > > *this_target_flag_state; > > > #define this_target_flag_state (&default_target_flag_state) > > > #endif > > > > > > -#define align_loops (this_target_flag_state->x_align_loops) > > > -#define align_jumps (this_target_flag_state->x_align_jumps) > > > -#define align_labels (this_target_flag_state->x_align_labels) > > > -#define align_functions (this_target_flag_state->x_align_functions) > > > +#define align_loops (this_target_flag_state->x_align_loops) > > > +#define align_jumps (this_target_flag_state->x_align_jumps) > > > +#define align_labels (this_target_flag_state->x_align_labels) > > > +#define align_functions (this_target_flag_state->x_align_functions) > > > +#define align_all_functions > > > (this_target_flag_state->x_align_all_functions) > > > > > > /* Returns TRUE if generated code should match ABI version N or > > > greater is in use. */ > > > diff --git a/gcc/opts.cc b/gcc/opts.cc > > > index 7a3830caaa3..3fa521501ff 100644 > > > --- a/gcc/opts.cc > > > +++ b/gcc/opts.cc > > > @@ -3342,6 +3342,12 @@ common_handle_option (struct gcc_options *opts, > > > &opts->x_str_align_functions); > > > break; > > > > > > + case OPT_falign_all_functions_: > > > + check_alignment_argument (loc, arg, "all-functions", > > > + &opts->x_flag_align_all_functions, > > > + &opts->x_str_align_all_functions); > > > + break; > > > + > > > case OPT_ftabstop_: > > > /* It is documented that we silently ignore silly values. */ > > > if (value >= 1 && value <= 100) > > > diff --git a/gcc/toplev.cc b/gcc/toplev.cc > > > index 85450d97a1a..3dd6f4e1ef7 100644 > > > --- a/gcc/toplev.cc > > > +++ b/gcc/toplev.cc > > > @@ -1219,6 +1219,7 @@ parse_alignment_opts (void) > > > parse_N_M (str_align_jumps, align_jumps); > > > parse_N_M (str_align_labels, align_labels); > > > parse_N_M (str_align_functions, align_functions); > > > + parse_N_M (str_align_all_functions, align_all_functions); > > > } > > > > > > /* Process the options that have been parsed. */ > > > diff --git a/gcc/varasm.cc b/gcc/varasm.cc > > > index 69f8f8ee018..ddb8536a337 100644 > > > --- a/gcc/varasm.cc > > > +++ b/gcc/varasm.cc > > > @@ -1919,6 +1919,37 @@ assemble_start_function (tree decl, const char > > > *fnname) > > > ASM_OUTPUT_ALIGN (asm_out_file, align); > > > } > > > > > > + /* Handle forced alignment. This really ought to apply to all > > > functions, > > > + since it is used by patchable entries. */ > > > + if (align_all_functions.levels[0].log > align) > > > + { > > > +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > > > + int align_log = align_all_functions.levels[0].log; > > > +#endif > > > + int max_skip = align_all_functions.levels[0].maxskip; > > > + if (flag_limit_function_alignment && crtl->max_insn_address > 0 > > > + && max_skip >= crtl->max_insn_address) > > > + max_skip = crtl->max_insn_address - 1; > > > + > > > +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > > > + ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip); > > > + if (max_skip >= (1 << align_log) - 1) > > > + align = align_functions.levels[0].log; > > > + if (max_skip == align_all_functions.levels[0].maxskip) > > > + { > > > + ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, > > > + align_all_functions.levels[1].log, > > > + align_all_functions.levels[1].maxskip); > > > + if (align_all_functions.levels[1].maxskip > > > + >= (1 << align_all_functions.levels[1].log) - 1) > > > + align = align_all_functions.levels[1].log; > > > + } > > > +#else > > > + ASM_OUTPUT_ALIGN (asm_out_file, align_all_functions.levels[0].log); > > > + align = align_all_functions.levels[0].log; > > > +#endif > > > > This looks close to the handling of user-specified alignment > > (factor sth out?), but I wonder if for -falign-all-functions > > we should only allow a hard alignment (no max_skip) and also > > not allow (but diagnose?) conflicts with limit-function-alignment? > > Well, I do not see much use of the max-skip feature, but it seemed to me > that perhaps it is better to keep the command line options symmetric. > > Actually my first iteration of the patch supported only one value, but > then I kind of convinced myself that symmetricity is better. > I am definitely fine with supporting only one alignment with no > max-skip. > > > > The interaction with the other flags also doesn't seem to be > > well documented? The main docs suggest it should be > > -fmin-function-alignment= which to me then suggests > Hmm, I do not see any mention of min-function-algnment I meant the new option might be named -fmin-function-alignment= rather than -falign-all-functions because of how it should override all other options. Otherwise is there an updated patch to look at? Richard. > > -flimit-function-alignment should not have an effect on it > > and even very small functions should be aligned. > > I write that it is not affected by limit-function-alignment > @opindex falign-all-functions=@var{n} > @item -falign-all-functions > Specify minimal alignment for function entry. Unlike > @option{-falign-functions} > this alignment is applied also to all functions (even those considered cold). > The alignment is also not affected by @option{-flimit-function-alignment} > > Because indeed that would break the atomicity of updates. > Honza > > > > Richard. > > > > > + } > > > + > > > /* Handle a user-specified function alignment. > > > Note that we still need to align to DECL_ALIGN, as above, > > > because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at > > > all. */ > > > > > > > -- > > Richard Biener <rguent...@suse.de> > > SUSE Software Solutions Germany GmbH, > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)