Hi,

Thanks for all the feedback. Hopefully it's all incorporated now. I will reply to you individually on the specific topics, but here is the new v6 for you to rip apart ;-)

Changes since v5:

* ChangeLogs split, reshuffled, reformatted.
* cmdline option parsing again with integral_argument ()
* Documentation has fewer "pad"s
* completely reworked default_print_prolog_pad ()
  -- never liked the old version either.

Torsten

gcc/c-family/ChangeLog
2017-02-17  Torsten Duwe  <d...@suse.de>

	* c-attribs.c (c_common_attribute_table): Add entry for "prolog_pad".

gcc/lto/ChangeLog
2017-02-17  Torsten Duwe  <d...@suse.de>

	* lto-lang.c (lto_attribute_table): Add entry for "prolog_pad".

gcc/ChangeLog
2017-02-17  Torsten Duwe  <d...@suse.de>

	* common.opt: Introduce -fprolog-pad= command line option, and
	its variables prolog_nop_pad_size and prolog_nop_pad_entry.
	* opts.c (common_handle_option): Add OPT_fprolog_pad_ case,
	including a two-value parser.
	* target.def (print_prolog_pad): New target hook.
	* targhooks.h (default_print_prolog_pad): New function.
	* targhooks.c (default_print_prolog_pad): Likewise.
	* toplev.c (process_options): Switch off IPA-RA if prolog pads
	are being generated.
	* varasm.c (assemble_start_function): Look at the prolog-pad
	command line switch and current function attributes and maybe
	generate NOP instructions by calling the print_prolog_pad hook.
	* doc/extend.texi: Document prolog_pad attribute.
	* doc/invoke.texi: Document -fprolog-pad= command line option.
	* doc/tm.texi.in (TARGET_ASM_PRINT_PROLOG_PAD): New target hook.
	* doc/tm.texi: Likewise.

gcc/testsuite/ChangeLog
2017-02-17  Torsten Duwe  <d...@suse.de>

	* c-c++-common/attribute-prolog_pad-1.c: New test.

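For anyone who wants to poke at the N[,M] syntax without building the compiler, here is a rough stand-alone sketch of the validation the option handler is meant to perform. It is not the code from the patch: `struct pad_spec`, `parse_count` and `parse_pad_spec` are names made up for this illustration, and `parse_count` is only a strtol-based stand-in for integral_argument (), whose exact behaviour may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two values the option handler stores.  */
struct pad_spec
{
  long size;   /* N: total number of NOPs in the pad */
  long entry;  /* M: how many of them precede the entry point */
};

/* Parse a non-negative decimal number that occupies all of S.
   Return -1 on any error (assumed to resemble integral_argument).  */
static long
parse_count (const char *s)
{
  char *end;
  long v;

  if (*s < '0' || *s > '9')
    return -1;
  errno = 0;
  v = strtol (s, &end, 10);
  if (errno != 0 || *end != '\0')
    return -1;
  return v;
}

/* Parse "N" or "N,M" into OUT.  Return false and leave OUT untouched
   if either number is malformed or if M exceeds N.  */
static bool
parse_pad_spec (const char *arg, struct pad_spec *out)
{
  char buf[64];
  char *comma;
  long size, entry = 0;

  assert (arg != NULL && out != NULL);
  if (strlen (arg) >= sizeof buf)
    return false;
  strcpy (buf, arg);

  /* Split at the first comma, if there is one.  */
  comma = strchr (buf, ',');
  if (comma)
    {
      *comma = '\0';
      entry = parse_count (comma + 1);
    }
  size = parse_count (buf);

  if (size < 0 || entry < 0 || size < entry)
    return false;
  out->size = size;
  out->entry = entry;
  return true;
}
```

In this sketch "3,1" and "5" are accepted (the latter with an entry offset of 0), while "2,3" is rejected because the entry point would lie beyond the pad.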
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index ce7fcaa..9f0f580 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -139,6 +139,7 @@ static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool *)
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
 static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
+static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *);
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -345,6 +346,8 @@ const struct attribute_spec c_common_attribute_table[] =
                               handle_bnd_instrument, false },
   { "fallthrough",            0, 0, false, false, false,
                               handle_fallthrough_attribute, false },
+  { "prolog_pad",             1, 2, true, false, false,
+                              handle_prolog_pad_attribute, false },
   { NULL,                     0, 0, false, false, false, NULL, false }
 };
 
@@ -3173,3 +3176,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, int,
   *no_add_attrs = true;
   return NULL_TREE;
 }
+
+static tree
+handle_prolog_pad_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
diff --git a/gcc/common.opt b/gcc/common.opt
index ad6baa3..02993b1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
 Variable
 int flag_debug_asm
 
+; How many NOP insns to place before each function prologue by default
+Variable
+HOST_WIDE_INT prolog_nop_pad_size
+
+; And how far the asm entry point is into this pad
+Variable
+HOST_WIDE_INT prolog_nop_pad_entry
 
 ; Balance between GNAT encodings and standard DWARF to emit.
 Variable
@@ -2022,6 +2029,10 @@ fprofile-reorder-functions
 Common Report Var(flag_profile_reorder_functions)
 Enable function reordering that improves code placement.
 
+fprolog-pad=
+Common Joined Optimization
+Insert NOP instructions before each function prologue.
+
 frandom-seed
 Common Var(common_deferred_options) Defer
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3d1546a..ef7e985 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3076,6 +3076,23 @@ that affect more than one function.
 This attribute should be used for debugging purposes only.  It is not
 suitable in production code.
 
+@item prolog_pad
+@cindex @code{prolog_pad} function attribute
+@cindex extra NOP instructions at the function entry point
+In case the target's text segment can be made writable at run time
+by any means, padding the function entry with a number of NOPs can
+be used to provide a universal tool for instrumentation.  Usually,
+prolog padding is enabled globally using the @option{-fprolog-pad=N,M}
+command-line switch, and disabled with attribute @code{prolog_pad (0)}
+for functions that are part of the actual instrumentation framework.
+This conveniently avoids an endless recursion.
+The @code{prolog_pad} function attribute can be used to
+change the pad size to any desired value.  The two-value syntax is
+the same as for the command-line switch @option{-fprolog-pad=N,M},
+generating a NOP pad of size @var{N}, with the function entry point
+@var{M} NOP instructions into the pad.  @var{M} defaults to 0
+if omitted, i.e. the function entry point is before the first NOP.
+
 @item pure
 @cindex @code{pure} function attribute
 @cindex functions that have no side effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 56ca53f..75a7e2c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11370,6 +11370,31 @@ of the function name, it is considered to be a match.  For C99 and C++
 extended identifiers, the function name must be given in UTF-8, not
 using universal character names.
 
+@item -fprolog-pad=@var{N}[,@var{M}]
+@opindex fprolog-pad
+Generate a pad of @var{N} NOPs right at the beginning
+of each function, with the function entry point @var{M} NOPs into
+the pad.  If @var{M} is omitted, it defaults to @code{0} so the
+function entry point is at the address of the first NOP.
+The NOP instructions reserve extra space which can be used to patch in
+any desired instrumentation at run time, provided that the code segment
+is writable.  The amount of space is only controllable indirectly via
+the number of NOPs, so implementers are advised to use the smallest
+NOP instruction available for the current CPU mode should there be a
+choice, in order to achieve the finest granularity.
+For run-time identification, the starting addresses
+of these pads, which correspond to their respective function entries
+minus @var{M}, are additionally collected in the @code{__prolog_pads_loc}
+section of the resulting binary.
+
+Note that the value of @code{__attribute__ ((prolog_pad (N,M)))} takes
+precedence over command-line option @option{-fprolog-pad=N,M}.
+This can be used to increase the pad size or to remove it completely
+on a single function.  If @code{N=0}, no pad location is recorded.
+
+The NOP instructions are inserted at (and maybe before) the function entry
+address, even before the prologue.
+
 @end table
 
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 348fd68..5155d10 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4566,6 +4566,10 @@ will select the smallest suitable mode.
 This section describes the macros that output function entry
 (@dfn{prologue}) and exit (@dfn{epilogue}) code.
 
+@deftypefn {Target Hook} void TARGET_ASM_PRINT_PROLOG_PAD (FILE *@var{file}, unsigned HOST_WIDE_INT @var{pad_size}, bool @var{record_p})
+Generate prologue pad
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_ASM_FUNCTION_PROLOGUE (FILE *@var{file}, HOST_WIDE_INT @var{size})
 If defined, a function that outputs the assembler code for entry to a
 function.  The prologue is responsible for setting up the stack frame,
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 6cde83c..b1d9d99 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3650,6 +3650,8 @@ will select the smallest suitable mode.
 This section describes the macros that output function entry
 (@dfn{prologue}) and exit (@dfn{epilogue}) code.
 
+@hook TARGET_ASM_PRINT_PROLOG_PAD
+
 @hook TARGET_ASM_FUNCTION_PROLOGUE
 
 @hook TARGET_ASM_FUNCTION_END_PROLOGUE
diff --git a/gcc/lto/lto-lang.c b/gcc/lto/lto-lang.c
index ca8945e..9143328 100644
--- a/gcc/lto/lto-lang.c
+++ b/gcc/lto/lto-lang.c
@@ -48,6 +48,7 @@ static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
 static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *);
 static tree handle_transaction_pure_attribute (tree *, tree, tree, int, bool *);
 static tree handle_returns_twice_attribute (tree *, tree, tree, int, bool *);
+static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *);
 static tree ignore_attribute (tree *, tree, tree, int, bool *);
 
 static tree handle_format_attribute (tree *, tree, tree, int, bool *);
@@ -76,6 +77,8 @@ const struct attribute_spec lto_attribute_table[] =
                               handle_nonnull_attribute, false },
   { "nothrow",                0, 0, true, false, false,
                               handle_nothrow_attribute, false },
+  { "prolog_pad",             1, 2, true, false, false,
+                              handle_prolog_pad_attribute, false },
   { "returns_twice",          0, 0, true, false, false,
                               handle_returns_twice_attribute, false },
   { "sentinel",               0, 1, false, true, true,
@@ -473,6 +476,13 @@ handle_returns_twice_attribute (tree *node, tree ARG_UNUSED (name),
   return NULL_TREE;
 }
 
+static tree
+handle_prolog_pad_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
+
 /* Ignore the given attribute.  Used when this attribute may be usefully
    overridden by the target, but is not used generically.  */
 
diff --git a/gcc/opts.c b/gcc/opts.c
index b38e9b4..10f751f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2159,6 +2159,29 @@ common_handle_option (struct gcc_options *opts,
         opts->x_flag_ipa_reference = false;
       break;
 
+    case OPT_fprolog_pad_:
+      {
+        char *pad_arg = xstrdup (arg);
+        char *comma = strchr (pad_arg, ',');
+        if (comma)
+          {
+            *comma = '\0';
+            prolog_nop_pad_size = integral_argument (pad_arg);
+            prolog_nop_pad_entry = integral_argument (comma + 1);
+          }
+        else
+          {
+            prolog_nop_pad_size = integral_argument (pad_arg);
+            prolog_nop_pad_entry = 0;
+          }
+        if (prolog_nop_pad_size < 0
+            || prolog_nop_pad_entry < 0
+            || prolog_nop_pad_size < prolog_nop_pad_entry)
+          error ("invalid arguments for %<-fprolog-pad=%>");
+        free (pad_arg);
+      }
+      break;
+
     case OPT_ftree_vectorize:
       if (!opts_set->x_flag_tree_loop_vectorize)
         opts->x_flag_tree_loop_vectorize = value;
diff --git a/gcc/target.def b/gcc/target.def
index 43600ae..bdc47b4 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -288,6 +288,12 @@ hidden, protected or internal visibility as specified by @var{visibility}.",
  void, (tree decl, int visibility),
  default_assemble_visibility)
 
+DEFHOOK
+(print_prolog_pad,
+ "Generate prologue pad",
+ void, (FILE *file, unsigned HOST_WIDE_INT pad_size, bool record_p),
+ default_print_prolog_pad)
+
 /* Output the assembler code for entry to a function.  */
 DEFHOOK
 (function_prologue,
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 1cdec06..6729e6c 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1609,6 +1609,52 @@ default_compare_by_pieces_branch_ratio (machine_mode)
   return 1;
 }
 
+/* Write PAD_SIZE NOPs into the asm outfile FILE before a function
+   prologue.  If RECORD_P is true, the location of the pad will be
+   recorded in a special object section called "__prolog_pads_loc".
+   This routine may be called twice per function to put NOPs before
+   and after the function entry.  */
+
+void
+default_print_prolog_pad (FILE *file, unsigned HOST_WIDE_INT pad_size,
+                          bool record_p)
+{
+  static const char *nop_templ = 0;
+
+  /* We use the template alone, relying on the (currently sane) assumption
+     that the NOP template does not have variable operands.  */
+  if (!nop_templ)
+    {
+      int code_num;
+      rtx_insn *my_nop = make_insn_raw (gen_nop ());
+
+      code_num = recog_memoized (my_nop);
+      nop_templ = get_insn_template (code_num, my_nop);
+    }
+
+  if (record_p)
+    {
+      char buf[256];
+      static int pad_number;
+      section *previous_section = in_section;
+
+      pad_number++;
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LPPAD", pad_number);
+
+      switch_to_section (get_section ("__prolog_pads_loc", 0, NULL));
+      fputs (integer_asm_op (POINTER_SIZE_UNITS, false), file);
+      assemble_name_raw (file, buf);
+      fputc ('\n', file);
+
+      switch_to_section (previous_section);
+      ASM_OUTPUT_LABEL (file, buf);
+    }
+
+  unsigned i;
+  for (i = 0; i < pad_size; ++i)
+    fprintf (file, "\t%s\n", nop_templ);
+}
+
 bool
 default_profile_before_prologue (void)
 {
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index a5565f5..e302e8d 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -203,6 +203,7 @@ extern bool default_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT,
                                                     bool);
 extern int default_compare_by_pieces_branch_ratio (machine_mode);
 
+extern void default_print_prolog_pad (FILE *, unsigned HOST_WIDE_INT, bool);
 extern bool default_profile_before_prologue (void);
 extern reg_class_t default_preferred_reload_class (rtx, reg_class_t);
 extern reg_class_t default_preferred_output_reload_class (rtx, reg_class_t);
diff --git a/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c b/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c
new file mode 100644
index 0000000..2236aa8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attribute-prolog_pad-1.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-fprolog-pad=3,1" } */
+
+void f1 (void) __attribute__((prolog_pad(2,1)));
+void f2 (void) __attribute__((prolog_pad(3)));
+int f3 (void);
+
+void
+f1 (void)
+{
+  f2 ();
+}
+
+void
+f2 (void)
+{
+  f1 ();
+}
+
+/* F3 should never have a NOP pad.  */
+int
+__attribute__((prolog_pad(0)))
+__attribute__((noinline))
+f3 (void)
+{
+  return 5;
+}
+
+/* F4 should receive the command line default setting.  */
+int
+f4 (void)
+{
+  return 3*f3 ()+1;
+}
diff --git a/gcc/toplev.c b/gcc/toplev.c
index beb581a..3afda4a 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1596,8 +1596,10 @@ process_options (void)
     }
 
   /* Do not use IPA optimizations for register allocation if profiler is active
+     or prolog pads are inserted for run-time instrumentation
      or port does not emit prologue and epilogue as RTL.  */
-  if (profile_flag || !targetm.have_prologue () || !targetm.have_epilogue ())
+  if (profile_flag || prolog_nop_pad_size
+      || !targetm.have_prologue () || !targetm.have_epilogue ())
     flag_ipa_ra = 0;
 
   /* Enable -Werror=coverage-mismatch when -Werror and -Wno-error
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 11a8ac4..84c739a 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -1830,6 +1830,44 @@ assemble_start_function (tree decl, const char *fnname)
   if (DECL_PRESERVE_P (decl))
     targetm.asm_out.mark_decl_preserved (fnname);
 
+  unsigned HOST_WIDE_INT pad_size = prolog_nop_pad_size;
+  unsigned HOST_WIDE_INT pad_entry = prolog_nop_pad_entry;
+
+  tree prolog_pad_attr
+    = lookup_attribute ("prolog_pad", DECL_ATTRIBUTES (decl));
+  if (prolog_pad_attr)
+    {
+      tree pp_val = TREE_VALUE (prolog_pad_attr);
+      tree prolog_pad_value1 = TREE_VALUE (pp_val);
+
+      if (tree_fits_uhwi_p (prolog_pad_value1))
+        pad_size = tree_to_uhwi (prolog_pad_value1);
+      else
+        gcc_unreachable ();
+
+      pad_entry = 0;
+      if (list_length (pp_val) > 1)
+        {
+          tree prolog_pad_value2 = TREE_VALUE (TREE_CHAIN (pp_val));
+
+          if (tree_fits_uhwi_p (prolog_pad_value2))
+            pad_entry = tree_to_uhwi (prolog_pad_value2);
+          else
+            gcc_unreachable ();
+        }
+    }
+
+  if (pad_entry > pad_size)
+    {
+      if (pad_size > 0)
+        warning (OPT_Wattributes, "Prolog nop pad entry > size");
+      pad_entry = 0;
+    }
+
+  /* Emit the prolog padding before the entry label, if any.  */
+  if (pad_entry > 0)
+    targetm.asm_out.print_prolog_pad (asm_out_file, pad_entry, true);
+
   /* Do any machine/system dependent processing of the function name.  */
 #ifdef ASM_DECLARE_FUNCTION_NAME
   ASM_DECLARE_FUNCTION_NAME (asm_out_file, fnname, current_function_decl);
@@ -1838,6 +1876,11 @@ assemble_start_function (tree decl, const char *fnname)
   ASM_OUTPUT_FUNCTION_LABEL (asm_out_file, fnname, current_function_decl);
 #endif /* ASM_DECLARE_FUNCTION_NAME */
 
+  /* And the padding after the label.  Record it if we haven't done so yet.  */
+  if (pad_size > pad_entry)
+    targetm.asm_out.print_prolog_pad (asm_out_file, pad_size - pad_entry,
+                                      pad_entry == 0);
+
   if (lookup_attribute ("no_split_stack", DECL_ATTRIBUTES (decl)))
     saw_no_split_stack = true;
 }