On Fri, Nov 16, 2018 at 4:57 AM Andi Kleen <a...@firstfloor.org> wrote: > > From: Andi Kleen <a...@linux.intel.com> > > Add a new pass to automatically instrument changes to variables > with the new PTWRITE instruction on x86. PTWRITE writes a 4 or 8 byte > field into an Processor Trace log, which allows low over head > logging of information. Essentially it's a hardware accelerated > printf. > > This allows to reconstruct how values later from the log, > which can be useful for debugging or other analysis of the program > behavior. With the compiler support this can be done with without > having to manually add instrumentation to the code. > > Using dwarf information this can be later mapped back to the variables. > The decoder decodes the PTWRITE instructions using IP information > in the log, and then looks up the argument in the debug information. > Then this can be used to reconstruct the original variable > name to display a value history for the variable. > > There are new options to enable instrumentation for different types, > and also a new attribute to control analysis fine grained per > function or variable level. The attributes can be set on both > the variable and the type level, and also on structure fields. > This allows to enable tracing only for specific code in large > programs in a flexible matter. > > The pass is generic, but only the x86 backend enables the necessary > hooks. When the backend enables the necessary hooks (with -mptwrite) > there is an additional pass that looks through the code for > attribute vartrace enabled functions or variables. > > The -fvartrace=locals option is experimental: it works, but it > generates redundant ptwrites because the pass doesn't use > the SSA information to minimize instrumentation. > This could be optimized later. > > Currently the code can be tested with SDE, or on a Intel > Gemini Lake system with a new enough Linux kernel (v4.10+) > that supports PTWRITE for PT. Gemini Lake is used in low > end laptops ("Intel Pentium Silver J5...... / Celeron N4... / > Celeron J4...") > > Linux perf can be used to record the values > > perf record -e intel_pt/ptw=1,branch=0/ program > perf script --itrace=crw -F +synth ... > > I have an experimential version of perf that can also use > dwarf information to symbolize many[1] values back to their variable > names. So far it is not in standard perf, but available at > > https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=perf/var-resolve-4 > > It is currently not able to decode all variable locations to names, > but a large subset. > > Longer term hopefully gdb will support this information too. > > The CPU can potentially generate very data high bandwidths when > code doing a lot of computation is heavily instrumented. > This can cause some data loss in both the CPU and also in perf > logging the data when the disk cannot keep up. > > Running some larger workloads most workloads do not cause > CPU level overflows, but I've seen it with -fvartrace > with crafty, and with more workloads with -fvartrace-locals. > > Recommendation is to not fully instrument programs, > but only areas of interest either at the file level or using > the attributes. > > The other thing is that perf and the disk often cannot keep up > with the data bandwidth for longer computations. In this case > it's possible to use perf snapshot mode (add --snapshot > to the command line above). The data will be only logged to > a memory ring buffer then, and only dump the buffers on events > of interest by sending SIGUSR2 to the perf binrary. > > In the future this will be hopefully better supported with > core files and gdb. > > Passes bootstrap and test suite on x86_64-linux, also > bootstrapped and tested gcc itself with full -fvartrace > and -fvartrace=locals instrumentation.
In the cover mail you mentioned you didn't get rid of SSA update. That is because your instrumentation does not set the calls virtual operands. Since your builtin clobbers memory and you instrument non-memory ops that's only possible if you'd track the active virtual operand during the walk over the function. I suppose using SSA update is OK for now. More comments inline > gcc/: > > 2018-11-15 Andi Kleen <a...@linux.intel.com> > > * Makefile.in: Add tree-vartrace.o. > * common.opt: Add -fvartrace > * opts.c (parse_vartrace_options): Add. > (common_handle_option): Call parse_vartrace_options. > * config/i386/i386.c (ix86_vartrace_func): Add. > (TARGET_VARTRACE_FUNC): Add. > * doc/extend.texi: Document vartrace/no_vartrace > attributes. > * doc/invoke.texi: Document -fvartrace. > * doc/tm.texi (TARGET_VARTRACE_FUNC): Add. > * passes.def: Add vartrace pass. > * target.def (vartrace_func): Add. > * tree-pass.h (make_pass_vartrace): Add. > * tree-vartrace.c: New file to implement vartrace pass. > > gcc/c-family/: > > 2018-11-15 Andi Kleen <a...@linux.intel.com> > > * c-attribs.c (handle_vartrace_attribute, > handle_no_vartrace_attribute): New functions. > (attr_vartrace_exclusions): Add. > > config/: > > 2018-11-03 Andi Kleen <a...@linux.intel.com> > > * bootstrap-vartrace.mk: New. > * bootstrap-vartrace-locals.mk: New. > --- > config/bootstrap-vartrace-locals.mk | 3 + > config/bootstrap-vartrace.mk | 3 + > gcc/Makefile.in | 1 + > gcc/c-family/c-attribs.c | 77 +++++ > gcc/common.opt | 8 + > gcc/config/i386/i386.c | 32 ++ > gcc/doc/extend.texi | 32 ++ > gcc/doc/invoke.texi | 27 ++ > gcc/doc/tm.texi | 6 + > gcc/doc/tm.texi.in | 2 + > gcc/flag-types.h | 9 + > gcc/opts.c | 63 ++++ > gcc/passes.def | 1 + > gcc/target.def | 9 + > gcc/tree-pass.h | 1 + > gcc/tree-vartrace.c | 491 ++++++++++++++++++++++++++++ > 16 files changed, 765 insertions(+) > create mode 100644 config/bootstrap-vartrace-locals.mk > create mode 100644 config/bootstrap-vartrace.mk > create mode 100644 gcc/tree-vartrace.c > > diff --git a/config/bootstrap-vartrace-locals.mk > b/config/bootstrap-vartrace-locals.mk > new file mode 100644 > index 00000000000..dd16640df74 > --- /dev/null > +++ b/config/bootstrap-vartrace-locals.mk > @@ -0,0 +1,3 @@ > +STAGE2_CFLAGS += -mptwrite -fvartrace=all > +STAGE3_CFLAGS += -mptwrite -fvartrace=all > +STAGE4_CFLAGS += -mptwrite -fvartrace=all > diff --git a/config/bootstrap-vartrace.mk b/config/bootstrap-vartrace.mk > new file mode 100644 > index 00000000000..e29824d799b > --- /dev/null > +++ b/config/bootstrap-vartrace.mk > @@ -0,0 +1,3 @@ > +STAGE2_CFLAGS += -mptwrite -fvartrace > +STAGE3_CFLAGS += -mptwrite -fvartrace > +STAGE4_CFLAGS += -mptwrite -fvartrace > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index ec793175c3b..64a99a1ec8a 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1594,6 +1594,7 @@ OBJS = \ > tree-vectorizer.o \ > tree-vector-builder.o \ > tree-vrp.o \ > + tree-vartrace.o \ > tree.o \ > typed-splay-tree.o \ > unique-ptr-tests.o \ > diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c > index 1657df7f9df..d50e78de830 100644 > --- a/gcc/c-family/c-attribs.c > +++ b/gcc/c-family/c-attribs.c > @@ -104,6 +104,10 @@ static tree handle_tls_model_attribute (tree *, tree, > tree, int, > bool *); > static tree handle_no_instrument_function_attribute (tree *, tree, > tree, int, bool *); > +static tree handle_vartrace_attribute (tree *, tree, > + tree, int, bool *); > +static tree handle_no_vartrace_attribute (tree *, tree, > + tree, int, bool *); > static tree handle_no_profile_instrument_function_attribute (tree *, tree, > tree, int, bool > *); > static tree handle_malloc_attribute (tree *, tree, tree, int, bool *); > @@ -235,6 +239,13 @@ static const struct attribute_spec::exclusions > attr_const_pure_exclusions[] = > ATTR_EXCL (NULL, false, false, false) > }; > > +static const struct attribute_spec::exclusions attr_vartrace_exclusions[] = > +{ > + ATTR_EXCL ("vartrace", true, true, true), > + ATTR_EXCL ("no_vartrace", true, true, true), > + ATTR_EXCL (NULL, false, false, false) > +}; > + > /* Table of machine-independent attributes common to all C-like languages. > > Current list of processed common attributes: nonnull. */ > @@ -326,6 +337,12 @@ const struct attribute_spec c_common_attribute_table[] = > { "no_instrument_function", 0, 0, true, false, false, false, > handle_no_instrument_function_attribute, > NULL }, > + { "vartrace", 0, 0, false, false, false, false, > + handle_vartrace_attribute, > + attr_vartrace_exclusions }, > + { "no_vartrace", 0, 0, false, false, false, false, > + handle_no_vartrace_attribute, > + attr_vartrace_exclusions }, > { "no_profile_instrument_function", 0, 0, true, false, false, false, > handle_no_profile_instrument_function_attribute, > NULL }, > @@ -770,6 +787,66 @@ handle_no_sanitize_undefined_attribute (tree *node, tree > name, tree, int, > return NULL_TREE; > } > > +/* Handle "vartrace" attribute; arguments as in struct > + attribute_spec.handler. */ > + > +static tree > +handle_vartrace_attribute (tree *node, tree name, tree, int flags, > + bool *no_add_attrs) > +{ > + if (!VAR_OR_FUNCTION_DECL_P (*node) && !TYPE_P (*node) && > + TREE_CODE (*node) != FIELD_DECL) > + { > + warning (OPT_Wattributes, "%qE attribute ignored for object", name); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + if (!targetm.vartrace_func) > + { > + warning (OPT_Wattributes, "%qE attribute not supported for target", > name); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + if (TREE_TYPE (*node) > + && TREE_CODE (*node) != FUNCTION_DECL > + && targetm.vartrace_func (TREE_TYPE (*node), true) == NULL_TREE) > + { > + warning (OPT_Wattributes, "%qE attribute not supported for type", > name); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + if (TYPE_P (*node) && !(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) > + *node = build_variant_type_copy (*node); > + > + /* We look it up later with lookup_attribute. */ > + return NULL_TREE; > +} > + > +/* Handle "no_vartrace" attribute; arguments as in struct > + attribute_spec.handler. */ > + > +static tree > +handle_no_vartrace_attribute (tree *node, tree name, tree, int flags, > + bool *no_add_attrs) > +{ > + if (!VAR_OR_FUNCTION_DECL_P (*node) && !TYPE_P (*node) > + && TREE_CODE (*node) != FIELD_DECL) > + { > + warning (OPT_Wattributes, "%qE attribute ignored", name); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + if (TYPE_P (*node) && !(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) > + *node = build_variant_type_copy (*node); > + > + /* We look it up later with lookup_attribute. */ > + return NULL_TREE; > +} > + > /* Handle an "asan odr indicator" attribute; arguments as in > struct attribute_spec.handler. */ > > diff --git a/gcc/common.opt b/gcc/common.opt > index 72a713593c3..f27a57b1177 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -221,6 +221,10 @@ unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED > | SANITIZE_UNDEFINED_NO > Variable > unsigned int flag_sanitize_coverage > > +; What to instrument with vartrace > +Variable > +unsigned int flag_vartrace > + > ; Flag whether a prefix has been added to dump_base_name > Variable > bool dump_base_name_prefixed = false > @@ -2856,6 +2860,10 @@ ftree-scev-cprop > Common Report Var(flag_tree_scev_cprop) Init(1) Optimization > Enable copy propagation of scalar-evolution information. > > +fvartrace > +Common JoinedOrMissing Report Driver > +-fvartrace=default|all|locals|returns|args|reads|writes|off Enable > variable tracing instrumentation. > + > ; -fverbose-asm causes extra commentary information to be produced in > ; the generated assembly code (to make it more readable). This option > ; is generally only of use to those who actually need to read the > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index c18c60a1d19..6d130bf0804 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -31900,6 +31900,35 @@ ix86_mangle_function_version_assembler_name (tree > decl, tree id) > return ret; > } > > +/* Hook to determine that TYPE can be traced. Ignore target flags > + if FORCE is true. Returns the tracing builtin if tracing is possible, > + or otherwise NULL. */ > + > +static tree > +ix86_vartrace_func (tree type, bool force) > +{ > + if (!(ix86_isa_flags2 & OPTION_MASK_ISA_PTWRITE)) > + { > + /* With force, as in checking for the attribute, ignore > + the current target settings. Otherwise it's not > + possible to declare vartrace variables outside > + an __attribute__((target("ptwrite"))) function > + if -mptwrite is not specified. */ > + if (!force) > + return NULL; > + /* Initialize the builtins if missing, so that we have > + something to return. */ > + if (!ix86_builtins[(int)IX86_BUILTIN_PTWRITE32]) > + ix86_add_new_builtins (0, OPTION_MASK_ISA_PTWRITE); > + } > + > + if (TYPE_PRECISION (type) == 32) > + return ix86_builtins[(int) IX86_BUILTIN_PTWRITE32]; > + else if (TYPE_PRECISION (type) == 64) > + return ix86_builtins[(int) IX86_BUILTIN_PTWRITE64]; I think it makes more sense to pass the target hook a mode instead of a type. Thus above you'd check for SImode vs. DImode. > + else > + return NULL; > +} > > static tree > ix86_mangle_decl_assembler_name (tree decl, tree id) > @@ -50894,6 +50923,9 @@ ix86_run_selftests (void) > #undef TARGET_ASAN_SHADOW_OFFSET > #define TARGET_ASAN_SHADOW_OFFSET ix86_asan_shadow_offset > > +#undef TARGET_VARTRACE_FUNC > +#define TARGET_VARTRACE_FUNC ix86_vartrace_func > + > #undef TARGET_GIMPLIFY_VA_ARG_EXPR > #define TARGET_GIMPLIFY_VA_ARG_EXPR ix86_gimplify_va_arg > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index d4b1046b6ae..f0cde5dcf0f 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -3228,6 +3228,13 @@ the standard C library can be guaranteed not to throw > an exception > with the notable exceptions of @code{qsort} and @code{bsearch} that > take function pointer arguments. > > +@item no_vartrace > +The @code{no_vartrace} attribute disables data tracing for > +the function [or variable or structure field] declared with > +the attribute. See @pxref{Common Variable Attributes} and > +@pxref{Common Type Attributes}. When specified for a function > +nothing in the function is traced. > + > @item optimize (@var{level}, @dots{}) > @item optimize (@var{string}, @dots{}) > @cindex @code{optimize} function attribute > @@ -3489,6 +3496,21 @@ When applied to a member function of a C++ class > template, the > attribute also means that the function is instantiated if the > class itself is instantiated. > > +@item vartrace > +@cindex @code{vartrace} function or variable attribute > +Enable data tracing for the function or variable or structure field > +marked with this attribute. When applied to a type all instances of the type > +will be traced. When applied to a structure or union all fields will be > traced. > +When applied to a structure field that field will be traced. > +For functions will not trace locals (unless enabled on the command line or > +on as an attribute the local itself) but arguments, returns, globals, pointer > +references. For variables or types or structure members any reads or writes > will be traced. > +Only integer or pointer types can be tracked. > + > +Currently implemented for x86 when the @option{ptwrite} target option > +is enabled for systems that support the @code{PTWRITE} instruction, > +supporting 4 and 8 byte integers. > + > @item visibility ("@var{visibility_type}") > @cindex @code{visibility} function attribute > This attribute affects the linkage of the declaration to which it is > attached. > @@ -7196,6 +7218,16 @@ A @{ /* @r{@dots{}} */ @}; > struct __attribute__ ((copy ( (struct A *)0)) B @{ /* @r{@dots{}} */ @}; > @end smallexample > > +@cindex @code{vartrace} type attribute > +@cindex @code{no_vartrace} type attribute > +@item vartrace > +@itemx no_vartrace > +Specify that all instances of type should be variable traced > +or not variable traced. Can be also also applied to function > +types to disable tracing for all instances of that function type. > +Can be also applied to structure fields. See the description in > +@pxref{Variable Attributes} for more details. > + > @item deprecated > @itemx deprecated (@var{msg}) > @cindex @code{deprecated} type attribute > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 535b258d22b..8b9d1d71669 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -2754,6 +2754,33 @@ Don't use the @code{__cxa_get_exception_ptr} runtime > routine. This > causes @code{std::uncaught_exception} to be incorrect, but is necessary > if the runtime routine is not available. > > +@item -fvartrace > +@opindex -fvartrace=... > +Insert trace instructions to trace variables at runtime with a tracer > +such as Intel Processor Trace. > + > +Requires enabling a backend specific option, like @option{-mptwrite} to > enable > +@code{PTWRITE} instruction generation on x86. > + > +Additional qualifiers can be specified after the =: @option{args} > +for arguments, @option{off} to disable, @option{returns} for tracing > +return values, @option{reads} to trace reads, @option{writes} to trace > +writes, @option{locals} to trace locals. > +Default when enabled is tracing reads, writes, arguments, returns, objects > +with a static or thread duration but no locals. > +Multiple options can be separated by comma. > + > +When -fvartrace enabled, the compiler will add trace instructions. By > +default these trace options act like nops, unless tracing is > +enabled. For example to enable tracing on x86 Linux with Linux perf > +use: > + > +@smallexample > +gcc -fvartrace -o program -g program.c > +perf record -e intel_pt/pt=1,ptw=1,branch=0/ program > +perf script > +@end smallexample > + > @item -fvisibility-inlines-hidden > @opindex fvisibility-inlines-hidden > This switch declares that the user does not attempt to compare > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index 0a2ad9a745e..861914ac646 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -11936,6 +11936,12 @@ Address Sanitizer shadow memory address. NULL if > Address Sanitizer is not > supported by the target. > @end deftypefn > > +@deftypefn {Target Hook} tree TARGET_VARTRACE_FUNC (tree @var{type}, bool > @var{force}) > +Return a builtin to call to trace variables of type TYPE or NULL if not > supported > +by the target. Ignore target configuration if FORCE is true. The builtin > gets called with a > +single argument of TYPE. So make this take a enum machine_mode @var{mode} specified to be an integer mode. The argument type of the function would then be an unsigned integer type with this mode. That should allow more flexible instrumentation, including of floats or doubles or small aggregates. Maybe even instead pass it a number of bytes so it models how atomics work. > +@end deftypefn > + > @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK > (unsigned HOST_WIDE_INT @var{val}) > Validate target specific memory model mask bits. When NULL no target specific > memory model bits are allowed. > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in > index f1ad80da467..756e84d0e07 100644 > --- a/gcc/doc/tm.texi.in > +++ b/gcc/doc/tm.texi.in > @@ -8104,6 +8104,8 @@ and the associated definitions of those functions. > > @hook TARGET_ASAN_SHADOW_OFFSET > > +@hook TARGET_VARTRACE_FUNC > + > @hook TARGET_MEMMODEL_CHECK > > @hook TARGET_ATOMIC_TEST_AND_SET_TRUEVAL > diff --git a/gcc/flag-types.h b/gcc/flag-types.h > index 500f6638f36..6e5add2da9d 100644 > --- a/gcc/flag-types.h > +++ b/gcc/flag-types.h > @@ -261,6 +261,15 @@ enum sanitize_code { > | SANITIZE_BOUNDS_STRICT > }; > > +/* Settings for flag_vartrace */ > +enum flag_vartrace { > + VARTRACE_LOCALS = 1 << 0, > + VARTRACE_ARGS = 1 << 1, > + VARTRACE_RETURNS = 1 << 2, > + VARTRACE_READS = 1 << 3, > + VARTRACE_WRITES = 1 << 4 > +}; > + > /* Settings of flag_incremental_link. */ > enum incremental_link { > INCREMENTAL_LINK_NONE, > diff --git a/gcc/opts.c b/gcc/opts.c > index 318ed442057..c6c47b55a39 100644 > --- a/gcc/opts.c > +++ b/gcc/opts.c > @@ -1872,6 +1872,65 @@ check_alignment_argument (location_t loc, const char > *flag, const char *name) > parse_and_check_align_values (flag, name, align_result, true, loc); > } > > +/* Parse vartrace options in P, updating flags OPTS at LOC and return > + updated flags. */ > + > +static int > +parse_vartrace_options (const char *p, int opts, location_t loc) > +{ > + static struct { > + const char *name; > + int opt; > + } vopts[] = > + { > + { "default", > + VARTRACE_ARGS | VARTRACE_RETURNS | VARTRACE_READS > + | VARTRACE_WRITES }, /* Keep as first entry. */ > + { "all", > + VARTRACE_ARGS | VARTRACE_RETURNS | VARTRACE_READS > + | VARTRACE_WRITES | VARTRACE_LOCALS }, > + { "args", VARTRACE_ARGS }, > + { "returns", VARTRACE_RETURNS }, > + { "reads", VARTRACE_READS }, > + { "writes", VARTRACE_WRITES }, > + { "locals", VARTRACE_LOCALS }, > + { NULL, 0 } > + }; > + > + if (*p == '=') > + p++; > + if (*p == 0) > + return opts | vopts[0].opt; > + > + if (!strcmp (p, "off")) > + return 0; > + > + while (*p) > + { > + unsigned len = strcspn (p, ","); > + int i; > + > + for (i = 0; vopts[i].name; i++) > + { > + if (len == strlen (vopts[i].name) && !strncmp (p, vopts[i].name, > len)) > + { > + opts |= vopts[i].opt; > + break; > + } > + } > + if (vopts[i].name == NULL) > + { > + error_at (loc, "invalid argument to %qs", "-fvartrace"); > + break; > + } > + > + p += len; > + if (*p == ',') > + p++; > + } > + return opts; > +} > + > /* Handle target- and language-independent options. Return zero to > generate an "unknown option" message. Only options that need > extra handling need to be listed here; if you simply want > @@ -2078,6 +2137,10 @@ common_handle_option (struct gcc_options *opts, > case OPT__completion_: > break; > > + case OPT_fvartrace: > + opts->x_flag_vartrace = parse_vartrace_options (arg, > opts->x_flag_vartrace, loc); > + break; > + > case OPT_fsanitize_: > opts->x_flag_sanitize > = parse_sanitizer_options (arg, loc, code, > diff --git a/gcc/passes.def b/gcc/passes.def > index 24f212c8e31..36d14b0386a 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -394,6 +394,7 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_asan_O0); > NEXT_PASS (pass_tsan_O0); > NEXT_PASS (pass_sanopt); > + NEXT_PASS (pass_vartrace); I'd move it one lower, after pass_cleanup_eh. Further enhancement would make it a RTL pass ... > NEXT_PASS (pass_cleanup_eh); > NEXT_PASS (pass_lower_resx); > NEXT_PASS (pass_nrv); > diff --git a/gcc/target.def b/gcc/target.def > index f9469d69cb0..b5d970870a4 100644 > --- a/gcc/target.def > +++ b/gcc/target.def > @@ -4300,6 +4300,15 @@ supported by the target.", > unsigned HOST_WIDE_INT, (void), > NULL) > > +/* Defines the builtin to trace variables, or NULL. */ > +DEFHOOK > +(vartrace_func, > + "Return a builtin to call to trace variables of type TYPE or NULL if not > supported\n\ > +by the target. Ignore target configuration if FORCE is true. The builtin > gets called with a\n\ > +single argument of TYPE.", > + tree, (tree type, bool force), > + NULL) > + > /* Functions relating to calls - argument passing, returns, etc. */ > /* Members of struct call have no special macro prefix. */ > HOOK_VECTOR (TARGET_CALLS, calls) > diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h > index af15adc8e0c..2cf31785a6f 100644 > --- a/gcc/tree-pass.h > +++ b/gcc/tree-pass.h > @@ -423,6 +423,7 @@ extern gimple_opt_pass *make_pass_strlen (gcc::context > *ctxt); > extern gimple_opt_pass *make_pass_fold_builtins (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_post_ipa_warn (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_stdarg (gcc::context *ctxt); > +extern gimple_opt_pass *make_pass_vartrace (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_early_warn_uninitialized (gcc::context > *ctxt); > extern gimple_opt_pass *make_pass_late_warn_uninitialized (gcc::context > *ctxt); > extern gimple_opt_pass *make_pass_cse_reciprocals (gcc::context *ctxt); > diff --git a/gcc/tree-vartrace.c b/gcc/tree-vartrace.c > new file mode 100644 > index 00000000000..1ef81b743fc > --- /dev/null > +++ b/gcc/tree-vartrace.c > @@ -0,0 +1,491 @@ > +/* Insert instructions for data value tracing. > + Copyright (C) 2017, 2018 Free Software Foundation, Inc. > + Contributed by Andi Kleen. > + > +This file is part of GCC. > + > +GCC is free software; you can redistribute it and/or modify > +it under the terms of the GNU General Public License as published by > +the Free Software Foundation; either version 3, or (at your option) > +any later version. > + > +GCC is distributed in the hope that it will be useful, > +but WITHOUT ANY WARRANTY; without even the implied warranty of > +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +GNU General Public License for more details. > + > +You should have received a copy of the GNU General Public License > +along with GCC; see the file COPYING3. If not see > +<http://www.gnu.org/licenses/>. */ > + > +#include "config.h" > +#include "system.h" > +#include "coretypes.h" > +#include "backend.h" > +#include "target.h" > +#include "tree.h" > +#include "tree-iterator.h" > +#include "tree-pass.h" > +#include "basic-block.h" > +#include "gimple.h" > +#include "gimple-iterator.h" > +#include "gimplify.h" > +#include "gimplify-me.h" > +#include "gimple-ssa.h" > +#include "gimple-pretty-print.h" > +#include "cfghooks.h" > +#include "fold-const.h" > +#include "ssa.h" > +#include "tree-dfa.h" > +#include "attribs.h" > + > +namespace { > + > +enum attrstate { force_off, force_on, neutral }; > + > +/* Can we trace with attributes ATTR. */ > + > +attrstate > +supported_attr (tree attr) > +{ > + if (lookup_attribute ("no_vartrace", attr)) > + return force_off; > + if (lookup_attribute ("vartrace", attr)) > + return force_on; > + return neutral; > +} > + > +/* Is tracing enabled for ARG considering S. */ > + > +attrstate > +supported_op (tree arg, attrstate s) > +{ > + if (s != neutral) > + return s; > + if (DECL_P (arg)) > + { > + s = supported_attr (DECL_ATTRIBUTES (arg)); > + if (s != neutral) > + return s; > + } > + return supported_attr (TYPE_ATTRIBUTES (TREE_TYPE (arg))); > +} > + > +/* Can we trace DECL. */ > + > +bool > +supported_type (tree decl) > +{ > + tree type = TREE_TYPE (decl); > + > + return POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type); So in reality the restriction is on the size of the object, correct? Given you only use this in supported_mem I'd elide this function (also see below) The supported_ names are a bit odd since some of them are about whehter tracing is enabled and some are whether a specific value can be logged at all. I suggest to differentiate appropriately here, like enabled_op vs. supported_type. > +} > + > +/* Return true if DECL is refering to a local variable. */ > + > +bool > +is_local (tree decl) > +{ > + if (!(flag_vartrace & VARTRACE_LOCALS) && supported_op (decl, neutral) != > force_on) > + return false; > + return auto_var_in_fn_p (decl, cfun->decl); > +} Likewise. > +/* Is T something we can log, FORCEing the type if needed. */ > + > +bool > +supported_mem (tree t, bool force) This looks like a bad name since 't' isn't necessarily a memory location? > +{ > + if (!supported_type (t)) You handle some nested cases below via recursion, like a.b.c (but not a[i][j]). But then this check will fire. I think it would be better to restructure the function to look at the outermost level for whether the op is of supported type, thus we can log it at all and then get all the way down to the base via sth like if (!supported_type (t)) return false; enum attrstate s = <some default>; do { s = supported_op (t, s); if (s == force_off) return false; } while (handled_component_p (t) && (t = TREE_OPERAND (t, 0))) Now t is either an SSA_NAME, a DECL (you fail to handle PARM_DECL and RESULT_DECL below) or a [TARGET_]MEM_REF. To get rid of non-pointer indirections do then t = get_base_address (t); if (DECL_P (t) && is_local (t)) .... because... > + return false; > + > + enum attrstate s = supported_op (t, neutral); > + if (s == force_off) > + return false; > + if (s == force_on) > + force = true; > + > + switch (TREE_CODE (t)) > + { > + case VAR_DECL: > + if (DECL_ARTIFICIAL (t)) > + return false; > + if (is_local (t)) > + return true; > + return s == force_on || force; > + > + case ARRAY_REF: > + t = TREE_OPERAND (t, 0); > + s = supported_op (t, s); > + if (s == force_off) > + return false; > + return supported_type (TREE_TYPE (t)); Your supported_type is said to take a DECL. And you already asked for this type - it's the type of the original t (well, the type of this type given TREE_TYPE (t) is an array type). But you'll reject a[i][j] where the type of this type is an array type as well. > + > + case COMPONENT_REF: > + s = supported_op (TREE_OPERAND (t, 1), s); > + t = TREE_OPERAND (t, 0); > + if (s == neutral && is_local (t)) > + return true; > + s = supported_op (t, s); > + if (s != neutral) > + return s == force_on ? true : false; > + return supported_mem (t, force); > + > + // support BIT_FIELD_REF? > + > + case VIEW_CONVERT_EXPR: > + case TARGET_MEM_REF: > + case MEM_REF: > + return supported_mem (TREE_OPERAND (t, 0), force); > + > + case SSA_NAME: > + if ((flag_vartrace & VARTRACE_LOCALS) > + && SSA_NAME_VAR (t) > + && !DECL_IGNORED_P (SSA_NAME_VAR (t))) > + return true; > + return force; > + > + default: > + break; > + } > + > + return false; > +} > + > +/* Print debugging for inserting CODE at ORIG_STMT with type of VAL for WHY. > */ > + > +void > +log_trace_code (gimple *orig_stmt, gimple *code, tree val, const char *why) > +{ > + if (!dump_file) > + return; > + if (orig_stmt) > + fprintf (dump_file, "BB%d ", gimple_bb (orig_stmt)->index); > + fprintf (dump_file, "%s inserting ", why); > + print_gimple_stmt (dump_file, code, 0, TDF_VOPS|TDF_MEMSYMS); > + if (orig_stmt) > + { > + fprintf (dump_file, "orig "); > + print_gimple_stmt (dump_file, orig_stmt, 2, > + TDF_VOPS|TDF_MEMSYMS); TDF_MEMSYMS is dead, or rather it seems to be a leftover now equivalent to just TDF_VOPS. > + } > + fprintf (dump_file, "type "); > + print_generic_expr (dump_file, TREE_TYPE (val), TDF_SLIM); > + fputc ('\n', dump_file); > + fputc ('\n', dump_file); > +} > + > +/* Insert variable tracing code for VAL before iterator GI, originally > + for ORIG_STMT and optionally at LOC. Normally before ORIG_STMT, but > + AFTER if true. Reason is WHY. Return trace var if successfull, > + or NULL_TREE. */ > + > +tree > +insert_trace (gimple_stmt_iterator *gi, tree val, gimple *orig_stmt, > + const char *why, location_t loc = -1, bool after = false) > +{ > + if (loc == (location_t)-1) > + loc = gimple_location (orig_stmt); > + > + tree func = targetm.vartrace_func (TREE_TYPE (val), false); > + if (!func) > + return NULL_TREE; > + > + tree tvar = val; > + if (!is_gimple_reg (val)) > + { > + tvar = make_ssa_name (TREE_TYPE (val)); > + gassign *assign = gimple_build_assign (tvar, unshare_expr (val)); > + log_trace_code (orig_stmt, assign, val, "copy"); > + gimple_set_location (assign, loc); > + if (after) > + gsi_insert_after (gi, assign, GSI_CONTINUE_LINKING); > + else > + gsi_insert_before (gi, assign, GSI_SAME_STMT); > + update_stmt (assign); gsi_insert_* does update_stmt already. Btw, if you allow any SImode or DImode size value you can use a VIEW_CONVERT_EXPR to view the value as an integer type. You can of course also do simple promotion of char or short to int. I wonder why you do not get IL verification errors when passing the builtin signed ints/longs btw... ah, because we do not actually verify call arguments :/ > + } > + > + gcall *call = gimple_build_call (func, 1, tvar); > + log_trace_code (NULL, call, tvar, why); > + gimple_set_location (call, loc); > + if (after) > + gsi_insert_after (gi, call, GSI_CONTINUE_LINKING); > + else > + gsi_insert_before (gi, call, GSI_SAME_STMT); > + update_stmt (call); > + return tvar; > +} > + > +/* Insert trace at GI for T in FUN if suitable memory or variable > + reference. Always if FORCE. Originally on ORIG_STMT. Reason is > + WHY. Insert after GI if AFTER. Returns trace variable or NULL_TREE. */ > + > +tree > +instrument_mem (gimple_stmt_iterator *gi, tree t, bool force, > + gimple *orig_stmt, const char *why, bool after = false) > +{ > + if (supported_mem (t, force)) > + return insert_trace (gi, t, orig_stmt, why, -1, after); > + return NULL_TREE; > +} > + > +/* Instrument arguments for FUN. Return true if changed. */ > + > +bool > +instrument_args (function *fun) > +{ > + gimple_stmt_iterator gi; > + bool changed = false; > + > + /* Local tracing usually takes care of the argument too, when > + they are read. This avoids redundant trace instructions. */ But only when instrumenting reads? > + if (flag_vartrace & VARTRACE_LOCALS) > + return false; > + > + for (tree arg = DECL_ARGUMENTS (current_function_decl); > + arg != NULL_TREE; > + arg = DECL_CHAIN (arg)) > + { > + gi = gsi_start_bb (BASIC_BLOCK_FOR_FN (fun, NUM_FIXED_BLOCKS)); leftover? It seems to be unused (and is bougs). > + tree type = TREE_TYPE (arg); > + if (POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type)) You had supported_type for this... I think you want to bail out for DECL_BY_REFERENCE (arg) // arg will be a pointer to the actual argument || DECL_IGNORED (arg) // no dbeug info > + { > + tree func = targetm.vartrace_func (TREE_TYPE (arg), false); > + if (!func) > + continue; > + > + if (!is_gimple_reg (arg)) TREE_CODE (arg) != SSA_NAME (since you are calling ssa_default_def) > + continue; > + tree sarg = ssa_default_def (fun, arg); > + if (!sarg) > + continue; > + > + gimple_stmt_iterator egi = gsi_after_labels (single_succ > (ENTRY_BLOCK_PTR_FOR_FN (fun))); > + changed |= !!insert_trace (&egi, sarg, NULL, "arg", > fun->function_start_locus); > + } > + } > + return changed; > +} > + > +/* Generate trace call for store GAS at GI, force if FORCE. Return true > + if successfull. Return true if successfully inserted. */ > + > +bool > +instrument_store (gimple_stmt_iterator *gi, gassign *gas, bool force) > +{ > + tree orig = gimple_assign_lhs (gas); > + > + if (!supported_mem (orig, force)) > + return false; > + > + tree func = targetm.vartrace_func (TREE_TYPE (orig), false); > + if (!func) > + return false; > + > + /* Generate another reference to target. That can be racy, but is > + guaranteed to have the debug location of the target. Better > + would be to use the original value to avoid any races, but we > + would need to somehow force the target location of the > + builtin. */ Hmm, but then this requires the target instruction to have a memory operand? That's going to be unlikely for RISCy cases? On x86 does it work if combine later does not syntesize a ptwrite with memory operand? I also wonder how this survives RTL CSE since you are basically doing mem = val; // orig stmt val' = mem; ptwrite (val'); that probably means when CSE removes the load there ends up a debug-insn reflecting what you want? > + tree tvar = make_ssa_name (TREE_TYPE (orig)); > + gassign *assign = gimple_build_assign (tvar, unshare_expr (orig)); > + log_trace_code (gas, assign, orig, "store copy"); > + gimple_set_location (assign, gimple_location (gas)); > + gsi_insert_after (gi, assign, GSI_CONTINUE_LINKING); > + update_stmt (assign); > + > + gcall *tcall = gimple_build_call (func, 1, tvar); > + log_trace_code (gas, tcall, tvar, "store"); > + gimple_set_location (tcall, gimple_location (gas)); > + gsi_insert_after (gi, tcall, GSI_CONTINUE_LINKING); > + update_stmt (tcall); > + return true; > +} > + > +/* Instrument STMT at GI. Force if FORCE. Return true if changed. */ > + > +bool > +instrument_assign (gimple_stmt_iterator *gi, gassign *gas, bool force) > +{ > + if (gimple_clobber_p (gas)) > + return false; > + bool changed = false; > + tree tvar = instrument_mem (gi, gimple_assign_rhs1 (gas), > + (flag_vartrace & VARTRACE_READS) || force, > + gas, "assign load1"); > + if (tvar) > + { > + gimple_assign_set_rhs1 (gas, tvar); > + changed = true; > + } > + /* Handle operators in case they read locals. */ Does it make sense at all to instrument SSA "reads"? You know all SSA vars have a single definition and all locals not in SSA form are represented with memory reads/writes, so ... > + if (gimple_num_ops (gas) > 2) > + { ... this case looks suprious. And as said you probably do not want to isnstrument SSA "reads"? Also not above when instrumenting gimple_assign_rhs1 (gas), so better guard that with gimple_assing_load_p (gas)? > + tvar = instrument_mem (gi, gimple_assign_rhs2 (gas), > + (flag_vartrace & VARTRACE_READS) || force, > + gas, "assign load2"); > + if (tvar) > + { > + gimple_assign_set_rhs2 (gas, tvar); you still have this stmt adjusting here, why? > + changed = true; > + } > + } > + // handle more ops? > + > + if (gimple_store_p (gas)) > + changed |= instrument_store (gi, gas, > + (flag_vartrace & VARTRACE_WRITES) || force); > + > + if (changed) > + update_stmt (gas); > + return changed; > +} > + > +/* Instrument return at statement STMT at GI with FORCE. Return true > + if changed. */ > + > +bool > +instrument_return (gimple_stmt_iterator *gi, greturn *gret, bool force) > +{ > + tree rval = gimple_return_retval (gret); > + > + if (!rval) > + return false; > + if (DECL_P (rval) && DECL_BY_REFERENCE (rval)) > + rval = build_simple_mem_ref (ssa_default_def (cfun, rval)); This looks bogus. If DECL_BY_REFERENCE is true then rval is of pointer type and thus a register, you shouldn't ever see that returned plainly in gimple_return_retval. > + if (supported_mem (rval, force)) > + return !!insert_trace (gi, rval, gret, "return"); > + return false; > +} > + > +/* Instrument asm at GI in statement STMT with FORCE if needed. Return > + true if changed. */ > + > +bool > +instrument_asm (gimple_stmt_iterator *gi, gasm *stmt, bool force) > +{ > + bool changed = false; > + > + for (unsigned i = 0; i < gimple_asm_ninputs (stmt); i++) > + changed |= !!instrument_mem (gi, TREE_VALUE (gimple_asm_input_op (stmt, > i)), > + force || (flag_vartrace & VARTRACE_READS), > stmt, > + "asm input"); > + for (unsigned i = 0; i < gimple_asm_noutputs (stmt); i++) > + { > + tree o = TREE_VALUE (gimple_asm_output_op (stmt, i)); > + if (supported_mem (o, force | (flag_vartrace & VARTRACE_WRITES))) > + changed |= !!insert_trace (gi, o, stmt, "asm output", -1, true); > + } I would guess instrumenting asms() has the chance of disturbing reg-alloc quite a bit... > + return changed; > +} > + > +/* Insert vartrace calls for FUN. */ > + > +unsigned int > +vartrace_execute (function *fun) > +{ > + basic_block bb; > + gimple_stmt_iterator gi; > + bool force = 0; > + > + if (lookup_attribute ("vartrace", TYPE_ATTRIBUTES (TREE_TYPE (fun->decl))) > + || lookup_attribute ("vartrace", DECL_ATTRIBUTES (fun->decl))) btw, I wonder whether the vartrace attribute should have an argument like vartrace(locals,reads,...)? > + force = true; > + > + bool changed = false; > + > + if ((flag_vartrace & VARTRACE_ARGS) || force) > + changed |= instrument_args (fun); > + > + FOR_EACH_BB_FN (bb, fun) > + for (gi = gsi_start_bb (bb); !gsi_end_p (gi); gsi_next (&gi)) > + { > + gimple *stmt = gsi_stmt (gi); Not sure if I suggested it during the first review but there's walk_stmt_load_store_ops () which lets you walk (via callbacks) all memory loads and stores in a stmt (also loads of non-SSA registers). Then there's FOR_EACH_SSA_DEF_OPERAND () or alternatively FOR_EACH_SSA_TREE_OPERAND () in case you are also interested in uses. For SSA uses you are currently missing indexes in ARRAY_REFs and friends. But as said I think you really want to avoid instrumenting SSA uses. > + switch (gimple_code (stmt)) > + { > + case GIMPLE_ASSIGN: > + changed |= instrument_assign (&gi, as_a <gassign *> (stmt), > force); > + break; > + case GIMPLE_RETURN: > + changed |= instrument_return (&gi, as_a <greturn *> (stmt), > + force || (flag_vartrace & > VARTRACE_RETURNS)); > + break; > + > + // for GIMPLE_CALL we use the argument logging in the callee > + // we could optionally log in the caller too to handle all > possible > + // reads of a local/global when the callee is not instrumented > + // possibly later we could also instrument copy and clear calls. > + > + case GIMPLE_SWITCH: > + changed |= !!instrument_mem (&gi, gimple_switch_index (as_a > <gswitch *> (stmt)), > + force, stmt, "switch"); > + break; > + case GIMPLE_COND: > + changed |= !!instrument_mem (&gi, gimple_cond_lhs (stmt), force, > stmt, "if lhs"); > + changed |= !!instrument_mem (&gi, gimple_cond_rhs (stmt), force, > stmt, "if rhs"); switch and cond are always SSA uses (or constants). > + break; > + > + case GIMPLE_ASM: > + changed |= instrument_asm (&gi, as_a<gasm *> (stmt), force); > + break; > + default: > + // everything else that reads/writes variables should be lowered > already > + break; > + } > + } > + > + // for now, until we fix all cases that destroy ssa > + return changed ? TODO_update_ssa : 0;; > +} > + > +const pass_data pass_data_vartrace = > +{ > + GIMPLE_PASS, /* type */ > + "vartrace", /* name */ > + OPTGROUP_NONE, /* optinfo_flags */ > + TV_NONE, /* tv_id */ > + 0, /* properties_required */ > + 0, /* properties_provided */ > + 0, /* properties_destroyed */ > + 0, /* todo_flags_start */ > + 0, /* todo_flags_finish */ > +}; > + > +class pass_vartrace : public gimple_opt_pass > +{ > +public: > + pass_vartrace (gcc::context *ctxt) > + : gimple_opt_pass (pass_data_vartrace, ctxt) > + {} > + > + virtual opt_pass * clone () > + { > + return new pass_vartrace (m_ctxt); > + } > + > + virtual bool gate (function *fun) > + { > + // check if vartrace is supported in backend > + if (!targetm.vartrace_func > + || targetm.vartrace_func (integer_type_node, false) == NULL) > + return false; > + > + if (lookup_attribute ("no_vartrace", TYPE_ATTRIBUTES (TREE_TYPE > (fun->decl))) > + || lookup_attribute ("no_vartrace", DECL_ATTRIBUTES (fun->decl))) > + return false; > + > + // need to run pass always to check for variable attributes > + return true; > + } > + > + virtual unsigned int execute (function *f) { return vartrace_execute (f); } > +}; > + > +} // anon namespace > + > +gimple_opt_pass * > +make_pass_vartrace (gcc::context *ctxt) > +{ > + return new pass_vartrace (ctxt); > +} > -- > 2.19.1 >