On Wed, Aug 28, 2013 at 5:20 PM, Xinliang David Li <davi...@google.com> wrote: > On Wed, Aug 28, 2013 at 7:09 AM, Teresa Johnson <tejohn...@google.com> wrote: >> On Wed, Aug 28, 2013 at 4:01 AM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Wed, Aug 7, 2013 at 7:23 AM, Teresa Johnson <tejohn...@google.com> wrote: >>>> On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson <tejohn...@google.com> >>>> wrote: >>>>> On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor <mjam...@suse.cz> wrote: >>>>>> Hi, >>>>>> >>>>>> On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote: >>>>>>> On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor <mjam...@suse.cz> wrote: >>>>>>> > On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote: >>>>>>> >> This patch ports messages to the new dump framework, >>>>>>> > >>>>>>> > It would be great this new framework was documented somewhere. I lost >>>>>>> > track of what was agreed it would be and from the uses in the >>>>>>> > vectorizer I was never quite sure how to utilize it in other passes. >>>>>>> >>>>>>> Cc'ing Sharad who implemented this - Sharad, is this documented on a >>>>>>> wiki or elsewhere? >>>>>> >>>>>> Thanks >>>>>> >>>>>>> >>>>>>> > >>>>>>> > I'd also like to point out two other minor things inline: >>>>>>> > >>>>>>> > [...] >>>>>>> > >>>>>>> >> 2013-08-06 Teresa Johnson <tejohn...@google.com> >>>>>>> >> Dehao Chen <de...@google.com> >>>>>>> >> >>>>>>> >> * dumpfile.c (dump_loc): Add column number to output, make >>>>>>> >> newlines >>>>>>> >> consistent. >>>>>>> >> * dumpfile.h (OPTGROUP_OTHER): Add and enable under >>>>>>> >> OPTGROUP_ALL. >>>>>>> >> * ipa-inline-transform.c (clone_inlined_nodes): >>>>>>> >> (cgraph_node_opt_info): New function. >>>>>>> >> (cgraph_node_call_chain): Ditto. >>>>>>> >> (dump_inline_decision): Ditto. >>>>>>> >> (inline_call): Invoke dump_inline_decision. >>>>>>> >> * doc/invoke.texi: Document optall -fopt-info flag. 
>>>>>>> >> * profile.c (read_profile_edge_counts): Use new dump >>>>>>> >> framework. >>>>>>> >> (compute_branch_probabilities): Ditto. >>>>>>> >> * passes.c (pass_manager::register_one_dump_file): Use >>>>>>> >> OPTGROUP_OTHER >>>>>>> >> when pass not in any opt group. >>>>>>> >> * value-prof.c (check_counter): Use new dump framework. >>>>>>> >> (find_func_by_funcdef_no): Ditto. >>>>>>> >> (check_ic_target): Ditto. >>>>>>> >> * coverage.c (get_coverage_counts): Ditto. >>>>>>> >> (coverage_init): Setup new dump framework. >>>>>>> >> * ipa-inline.c (inline_small_functions): Set >>>>>>> >> is_in_ipa_inline. >>>>>>> >> * ipa-inline.h (is_in_ipa_inline): Declare. >>>>>>> >> >>>>>>> >> * testsuite/gcc.dg/pr40209.c: Use -fopt-info. >>>>>>> >> * testsuite/gcc.dg/pr26570.c: Ditto. >>>>>>> >> * testsuite/gcc.dg/pr32773.c: Ditto. >>>>>>> >> * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto. >>>>>>> >> >>>>>>> > >>>>>>> > [...] >>>>>>> > >>>>>>> >> Index: ipa-inline-transform.c >>>>>>> >> =================================================================== >>>>>>> >> --- ipa-inline-transform.c (revision 201461) >>>>>>> >> +++ ipa-inline-transform.c (working copy) >>>>>>> >> @@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, >>>>>>> >> bool d >>>>>>> >> } >>>>>>> >> >>>>>>> >> >>>>>>> >> +#define MAX_INT_LENGTH 20 >>>>>>> >> + >>>>>>> >> +/* Return NODE's name and profile count, if available. 
*/ >>>>>>> >> + >>>>>>> >> +static const char * >>>>>>> >> +cgraph_node_opt_info (struct cgraph_node *node) >>>>>>> >> +{ >>>>>>> >> + char *buf; >>>>>>> >> + size_t buf_size; >>>>>>> >> + const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, >>>>>>> >> 0); >>>>>>> >> + >>>>>>> >> + if (!bfd_name) >>>>>>> >> + bfd_name = "unknown"; >>>>>>> >> + >>>>>>> >> + buf_size = strlen (bfd_name) + 1; >>>>>>> >> + if (profile_info) >>>>>>> >> + buf_size += (MAX_INT_LENGTH + 3); >>>>>>> >> + >>>>>>> >> + buf = (char *) xmalloc (buf_size); >>>>>>> >> + >>>>>>> >> + strcpy (buf, bfd_name); >>>>>>> >> + >>>>>>> >> + if (profile_info) >>>>>>> >> + sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, >>>>>>> >> node->count); >>>>>>> >> + return buf; >>>>>>> >> +} >>>>>>> > >>>>>>> > I'm not sure if output of this function is aimed only at the user or >>>>>>> > if it is supposed to be used by gcc developers as well. If the >>>>>>> > latter, an incredibly useful thing is to also dump node->symbol.order >>>>>>> > too. We usually dump it after "/" sign separating it from node name. >>>>>>> > It is invaluable when examining decisions in C++ code where you can >>>>>>> > have lots of clones of a node (and also because existing dumps print >>>>>>> > it, it is easy to combine them). >>>>>>> >>>>>>> The output is useful for both power users doing performance tuning of >>>>>>> their application, and by gcc developers. Adding the id is not so >>>>>>> useful for the former, but I agree that it is very useful for compiler >>>>>>> developers. In fact, in the google branch version we emit more verbose >>>>>>> information (the lipo module id and the funcdef_no) to help uniquely >>>>>>> identify the routines and to aid in post-processing by humans and >>>>>>> tools. So it is probably useful to add something similar here too. Is >>>>>>> the node->symbol.order more or less unique than the funcdef_no? 
I see >>>>>>> that you added a patch a few months ago to print the >>>>>>> node->symbol.order in the function header, and it also has the >>>>>>> advantage as you note of matching up with existing ipa dumps. >>>>>> >>>>>> node->symbol.order is unique and if I remember correctly, it is not >>>>>> even recycled. Clones, inline clones, thunks, every symbol table node >>>>>> gets its own symbol order so it should be more unique than funcdef_no. >>>>>> On the other hand it may be a bit cryptic for users but at the same >>>>>> time it is only one number. >>>>> >>>>> Ok, I am going to go ahead and add this to the output. >>>>> >>>>>> >>>>>>> >>>>>>> > >>>>>>> > [...] >>>>>>> > >>>>>>> >> Index: ipa-inline.c >>>>>>> >> =================================================================== >>>>>>> >> --- ipa-inline.c (revision 201461) >>>>>>> >> +++ ipa-inline.c (working copy) >>>>>>> >> @@ -118,6 +118,9 @@ along with GCC; see the file COPYING3. If not >>>>>>> >> see >>>>>>> >> static int overall_size; >>>>>>> >> static gcov_type max_count; >>>>>>> >> >>>>>>> >> +/* Global variable to denote if it is in ipa-inline pass. */ >>>>>>> >> +bool is_in_ipa_inline = false; >>>>>>> >> + >>>>>>> >> /* Return false when inlining edge E would lead to violating >>>>>>> >> limits on function unit growth or stack usage growth. >>>>>>> >> >>>>>>> > >>>>>>> > In this age of removing global variables, are you sure you need this? >>>>>>> > The only user of this seems to be a function that is only being called >>>>>>> > from inline_call... can that ever happen when not inlining? If you >>>>>>> > plan to use this function also elsewhere, perhaps the callers will >>>>>>> > know whether we are inlining or not and can provide this in a >>>>>>> > parameter? >>>>>>> >>>>>>> This is to distinguish early inlining from ipa inlining. >>>>>> >>>>>> Oh, right, I did not realize that the IPA part was the important bit >>>>>> of the name. 
>>>>>> >>>>>>> The volume of >>>>>>> early inlining messages is too high to be on for the default setting >>>>>>> of -fopt-info, and are not as interesting usually for performance >>>>>>> tuning. The dumper will only emit the early inline messages under a >>>>>>> more verbose setting (MSG_NOTE): >>>>>>> dump_printf_loc (is_in_ipa_inline ? MSG_OPTIMIZED_LOCATIONS : >>>>>>> MSG_NOTE ... >>>>>>> The other way I can see to distinguish this would be to check the >>>>>>> always_inline_functions_inlined flag on the caller's function. It >>>>>>> could also be possible to pass down a flag from the callers of >>>>>>> inline_call, but at least one caller (flatten_functions) is shared >>>>>>> between early and late inlining, so the flag needs to be passed >>>>>>> through that as well. WDYT? >>>>>> >>>>>> Did you mean flatten_function? It already has a bool "early" >>>>>> parameter. But I can see that being able to quickly figure out >>>>>> whether we are in early inliner or ipa inliner without much hassle is >>>>>> useful enough to justify a global variable a month ago, however I >>>>>> suppose we should not be introducing them now and so you'd have to put >>>>>> such stuff into... well, you'd probably have to put into the universe >>>>>> object somewhere because it is basically shared between two passes. >>>>>> Another option, even though somewhat hackish, would be to look at >>>>>> current_pass and see which pass it is. I don't know, do what is >>>>>> easier or what you like more, just be aware of the problem. >>>>> >>>>> After thinking about this some more, I think passing down an early >>>>> flag from callers is the cleanest way to go. >>>>> >>>>> I'll fix these and post a new patch later today. >>>> >>>> New patch below that removes this global variable, and also outputs >>>> the node->symbol.order (in square brackets after the function name so >>>> as to not clutter it). 
>>>> Inline messages with profile data look like:
>>>>
>>>> test.c:8:3: note: foobar [0] (99999000) inlined into foo [2] (1000)
>>>> with call count 99999000 (via inline instance bar [3] (99999000))
>>>
>>> Ick. This looks both redundant and cluttered. This is supposed to be
>>> understandable by GCC users, not only GCC developers.
>>
>> The main part that is only useful/understandable to gcc developers is
>> the node->symbol.order in square brackets, requested by Martin. One
>> possibility is that I could put that part under a param, disabled by
>> default. We have something similar on the google branches that emits
>> LIPO module info in the message, enabled via a param.
>>
>> I'd argue that the other information (the profile counts, emitted only
>> when using -fprofile-use, and the inline call chains) are useful if
>> you want to understand whether and how critical inlines are occurring.
>> I think this is the type of information that users focused on
>> optimizations, as well as gcc developers, want when they use
>> -fopt-info. Otherwise it is difficult to make sense of the inline
>> information.
>>
>>>
>>>> (without FDO the counts in parentheses and the call count would not be
>>>> included).
>>>>
>>>> Ok for trunk?
>>>
>>> Let's split this patch.
>>
>> Ok.
>>
>>>
>>>> Thanks,
>>>> Teresa
>>>>
>>>> 2013-08-06 Teresa Johnson <tejohn...@google.com>
>>>>            Dehao Chen <de...@google.com>
>>>>
>>>> * dumpfile.c (dump_loc): Output column number, make newlines
>>>> consistent.
>>>
>>> I don't like column numbers, they are of not much use generally.
>>
>> I added these here to get consistency with other messages (notes
>> emitted via inform(), warnings, errors). Plus the dg-message testing
>> was failing for the test cases that parse this output, since it
>> expects the column to exist.
>>
>>> Does
>>> 'make newlines consistent' avoid all the spurious vertical spacing I see with
>>> -fopt-info?
>>
>> Well, it helps get us there.
>> The problem was that before, since
>> dump_loc was not consistently emitting newlines, the calls had to emit
>> their own newlines manually in the string to ensure there was a
>> newline at all. I was thinking that once this is fixed I could go back
>> and clean up all those calls by removing the newlines in the string. I
>> could split this part into a separate patch and do both at once.
>>
>> However, after thinking about this some more this morning, I am
>> wondering whether it is better to remove the newline emission
>> completely from dump_loc and rely on the caller to put the newline in
>> the string. The reason is that there are 2 high level interfaces to
>> the new dump infrastructure, dump_printf() and dump_printf_loc(). Only
>> the latter invokes dump_loc and gets the newline at the start of the
>> message. The typical usage seems to be to start a message via
>> dump_printf_loc, and then use dump_printf to emit parts of the message
>> (thus not requiring a newline), but I think it may lead to problems to
>> rely on this assumption.
>>
>> So if you agree, I will simply remove the newline altogether from
>> dump_loc, and ensure that all clients of dump_printf/dump_printf_loc
>> include a newline char as appropriate in the string they pass.
>
>
> As a helper function, dump_loc should not blindly emit new line as it
> has no context. I have tried to remove it, and push the newline to
> higher level helpers -- it mostly works, but the vectorizer verbose
> messages need serious clean up -- most of them assume that
> dump_printf_loc does not end with new line, so that the expression
> dump can follow in the same line (the message texts need clean up too
> -- i do not like the === === in info messages).
I know, but we should really do that cleanup.

Richard.

> David
>
>
>>
>>>
>>>> * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
>>>
>>> Good change - please split this out (with the related changes) and commit
>>> it.
>>
>> Ok, thanks. Will do.
>>
>>>
>>>> * ipa-inline-transform.c (cgraph_node_opt_info): New function.
>>>> (cgraph_node_call_chain): Ditto.
>>>> (dump_inline_decision): Ditto.
>>>> (inline_call): Invoke dump_inline_decision, new parameter.
>>>
>>> The inline stuff should be split and re-sent, it's non-obvious to me (extra
>>> function parameters are not documented for example). I'd rather have
>>> inline_and_report_call () for example instead of an extra bool parameter.
>>> But let's iterate over this once it's split out.
>>
>> Ok, I will send this separately. I guess we could have a separate
>> interface inline_and_report_call that is a wrapper around inline_call
>> and simply invokes the dumper. Note that flatten_function will need to
>> conditionally call one of the two interfaces based on the value of its
>> bool early parameter though.
>>
>>>
>>>> * doc/invoke.texi: Document optall -fopt-info flag.
>>>> * profile.c (read_profile_edge_counts): Use new dump framework.
>>>> (compute_branch_probabilities): Ditto.
>>>> * passes.c (pass_manager::register_one_dump_file): Use
>>>> OPTGROUP_OTHER
>>>> when pass not in any opt group.
>>>> * value-prof.c (check_counter): Use new dump framework.
>>>> (find_func_by_funcdef_no): Ditto.
>>>> (check_ic_target): Ditto.
>>>> * coverage.c (get_coverage_counts): Ditto.
>>>> (coverage_init): Setup new dump framework.
>>>
>>> These pieces look good to me.
>>>
>>>> * ipa-inline.c (recursive_inlining): New inline_call parameter.
>>>> (inline_small_functions): Ditto.
>>>> (flatten_function): Ditto.
>>>> (ipa_inline): Ditto.
>>>> (inline_always_inline_functions): Ditto.
>>>> (early_inline_small_functions): Ditto.
>>>> * ipa-inline.h: Ditto.
>>>>
>>>> * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
>>>> * testsuite/gcc.dg/pr26570.c: Ditto. >>>> * testsuite/gcc.dg/pr32773.c: Ditto. >>>> * testsuite/g++.dg/tree-ssa/dom-invalid.C: Ditto. >>> >>> Why? Just remove the stray dg- annotations that deal with the unwanted >>> output? >> >> Because there are dg-message annotations that want to confirm this output. >> >> Teresa >> >>> >>> Thanks, >>> Richard. >>> >>>> * testsuite/gcc.dg/inline-dump.c: New test. >>>> >>>> Index: dumpfile.c >>>> =================================================================== >>>> --- dumpfile.c (revision 201461) >>>> +++ dumpfile.c (working copy) >>>> @@ -257,16 +257,18 @@ dump_open_alternate_stream (struct dump_file_info >>>> void >>>> dump_loc (int dump_kind, FILE *dfile, source_location loc) >>>> { >>>> - /* Currently vectorization passes print location information. */ >>>> if (dump_kind) >>>> { >>>> + /* Ensure dump message starts on a new line. */ >>>> + fprintf (dfile, "\n"); >>>> if (LOCATION_LOCUS (loc) > BUILTINS_LOCATION) >>>> - fprintf (dfile, "\n%s:%d: note: ", LOCATION_FILE (loc), >>>> - LOCATION_LINE (loc)); >>>> + fprintf (dfile, "%s:%d:%d: note: ", LOCATION_FILE (loc), >>>> + LOCATION_LINE (loc), LOCATION_COLUMN (loc)); >>>> else if (current_function_decl) >>>> - fprintf (dfile, "\n%s:%d: note: ", >>>> + fprintf (dfile, "%s:%d:%d: note: ", >>>> DECL_SOURCE_FILE (current_function_decl), >>>> - DECL_SOURCE_LINE (current_function_decl)); >>>> + DECL_SOURCE_LINE (current_function_decl), >>>> + DECL_SOURCE_COLUMN (current_function_decl)); >>>> } >>>> } >>>> >>>> Index: dumpfile.h >>>> =================================================================== >>>> --- dumpfile.h (revision 201461) >>>> +++ dumpfile.h (working copy) >>>> @@ -97,8 +97,9 @@ enum tree_dump_index >>>> #define OPTGROUP_LOOP (1 << 2) /* Loop optimization passes */ >>>> #define OPTGROUP_INLINE (1 << 3) /* Inlining passes */ >>>> #define OPTGROUP_VEC (1 << 4) /* Vectorization passes */ >>>> +#define OPTGROUP_OTHER (1 << 5) /* All other passes */ >>>> #define 
OPTGROUP_ALL (OPTGROUP_IPA | OPTGROUP_LOOP | >>>> OPTGROUP_INLINE \ >>>> - | OPTGROUP_VEC) >>>> + | OPTGROUP_VEC | OPTGROUP_OTHER) >>>> >>>> /* Define a tree dump switch. */ >>>> struct dump_file_info >>>> Index: ipa-inline-transform.c >>>> =================================================================== >>>> --- ipa-inline-transform.c (revision 201461) >>>> +++ ipa-inline-transform.c (working copy) >>>> @@ -192,6 +192,111 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d >>>> } >>>> >>>> >>>> +#define MAX_INT_LENGTH 20 >>>> + >>>> +/* Return NODE's name and profile count, if available. */ >>>> + >>>> +static const char * >>>> +cgraph_node_opt_info (struct cgraph_node *node) >>>> +{ >>>> + char *buf; >>>> + size_t buf_size; >>>> + const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, 0); >>>> + >>>> + if (!bfd_name) >>>> + bfd_name = "unknown"; >>>> + >>>> + buf_size = strlen (bfd_name) + 1; >>>> + if (profile_info) >>>> + buf_size += (MAX_INT_LENGTH + 3); >>>> + buf_size += MAX_INT_LENGTH; >>>> + >>>> + buf = (char *) xmalloc (buf_size); >>>> + >>>> + strcpy (buf, bfd_name); >>>> + //sprintf (buf, "%s/%i", buf, node->symbol.order); >>>> + sprintf (buf, "%s [%i]", buf, node->symbol.order); >>>> + >>>> + if (profile_info) >>>> + sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, node->count); >>>> + return buf; >>>> +} >>>> + >>>> + >>>> +/* Return CALLER's inlined call chain. Save the cgraph_node of the >>>> ultimate >>>> + function that the caller is inlined to in FINAL_CALLER. 
*/ >>>> + >>>> +static const char * >>>> +cgraph_node_call_chain (struct cgraph_node *caller, >>>> + struct cgraph_node **final_caller) >>>> +{ >>>> + struct cgraph_node *node; >>>> + const char *via_str = " (via inline instance"; >>>> + size_t current_string_len = strlen (via_str) + 1; >>>> + size_t buf_size = current_string_len; >>>> + char *buf = (char *) xmalloc (buf_size); >>>> + >>>> + buf[0] = 0; >>>> + gcc_assert (caller->global.inlined_to != NULL); >>>> + strcat (buf, via_str); >>>> + for (node = caller; node->global.inlined_to != NULL; >>>> + node = node->callers->caller) >>>> + { >>>> + const char *name = cgraph_node_opt_info (node); >>>> + current_string_len += (strlen (name) + 1); >>>> + if (current_string_len >= buf_size) >>>> + { >>>> + buf_size = current_string_len * 2; >>>> + buf = (char *) xrealloc (buf, buf_size); >>>> + } >>>> + strcat (buf, " "); >>>> + strcat (buf, name); >>>> + } >>>> + strcat (buf, ")"); >>>> + *final_caller = node; >>>> + return buf; >>>> +} >>>> + >>>> + >>>> +/* Dump the inline decision of EDGE. */ >>>> + >>>> +static void >>>> +dump_inline_decision (struct cgraph_edge *edge, bool early) >>>> +{ >>>> + location_t locus; >>>> + const char *inline_chain_text; >>>> + const char *call_count_text; >>>> + struct cgraph_node *final_caller = edge->caller; >>>> + >>>> + if (final_caller->global.inlined_to != NULL) >>>> + inline_chain_text = cgraph_node_call_chain (final_caller, >>>> &final_caller); >>>> + else >>>> + inline_chain_text = ""; >>>> + >>>> + if (edge->count > 0) >>>> + { >>>> + const char *call_count_str = " with call count "; >>>> + char *buf = (char *) xmalloc (strlen (call_count_str) + >>>> MAX_INT_LENGTH); >>>> + sprintf (buf, "%s"HOST_WIDEST_INT_PRINT_DEC, call_count_str, >>>> + edge->count); >>>> + call_count_text = buf; >>>> + } >>>> + else >>>> + { >>>> + call_count_text = ""; >>>> + } >>>> + >>>> + locus = gimple_location (edge->call_stmt); >>>> + dump_printf_loc (early ? 
MSG_NOTE : MSG_OPTIMIZED_LOCATIONS, >>>> + locus, >>>> + "%s inlined into %s%s%s\n", >>>> + cgraph_node_opt_info (edge->callee), >>>> + cgraph_node_opt_info (final_caller), >>>> + call_count_text, >>>> + inline_chain_text); >>>> +} >>>> + >>>> + >>>> /* Mark edge E as inlined and update callgraph accordingly. >>>> UPDATE_ORIGINAL >>>> specify whether profile of original function should be updated. If >>>> any new >>>> indirect edges are discovered in the process, add them to NEW_EDGES, >>>> unless >>>> @@ -205,7 +310,8 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d >>>> bool >>>> inline_call (struct cgraph_edge *e, bool update_original, >>>> vec<cgraph_edge_p> *new_edges, >>>> - int *overall_size, bool update_overall_summary) >>>> + int *overall_size, bool update_overall_summary, >>>> + bool early) >>>> { >>>> int old_size = 0, new_size = 0; >>>> struct cgraph_node *to = NULL; >>>> @@ -218,6 +324,9 @@ inline_call (struct cgraph_edge *e, bool update_or >>>> bool predicated = inline_edge_summary (e)->predicate != NULL; >>>> #endif >>>> >>>> + if (dump_enabled_p ()) >>>> + dump_inline_decision (e, early); >>>> + >>>> /* Don't inline inlined edges. */ >>>> gcc_assert (e->inline_failed); >>>> /* Don't even think of inlining inline clone. */ >>>> Index: doc/invoke.texi >>>> =================================================================== >>>> --- doc/invoke.texi (revision 201461) >>>> +++ doc/invoke.texi (working copy) >>>> @@ -6234,6 +6234,9 @@ Enable dumps from all loop optimizations. >>>> Enable dumps from all inlining optimizations. >>>> @item vec >>>> Enable dumps from all vectorization optimizations. >>>> +@item optall >>>> +Enable dumps from all optimizations. This is a superset of >>>> +the optimization groups listed above. 
>>>> @end table >>>> >>>> For example, >>>> Index: profile.c >>>> =================================================================== >>>> --- profile.c (revision 201461) >>>> +++ profile.c (working copy) >>>> @@ -432,8 +432,8 @@ read_profile_edge_counts (gcov_type *exec_counts) >>>> if (flag_profile_correction) >>>> { >>>> static bool informed = 0; >>>> - if (!informed) >>>> - inform (input_location, >>>> + if (dump_enabled_p () && !informed) >>>> + dump_printf_loc (MSG_NOTE, input_location, >>>> "corrupted profile info: edge count >>>> exceeds maximal count"); >>>> informed = 1; >>>> } >>>> @@ -692,10 +692,11 @@ compute_branch_probabilities (unsigned cfg_checksu >>>> { >>>> /* Inconsistency detected. Make it flow-consistent. */ >>>> static int informed = 0; >>>> - if (informed == 0) >>>> + if (dump_enabled_p () && informed == 0) >>>> { >>>> informed = 1; >>>> - inform (input_location, "correcting inconsistent profile >>>> data"); >>>> + dump_printf_loc (MSG_NOTE, input_location, >>>> + "correcting inconsistent profile data"); >>>> } >>>> correct_negative_edge_counts (); >>>> /* Set bb counts to the sum of the outgoing edge counts */ >>>> Index: passes.c >>>> =================================================================== >>>> --- passes.c (revision 201461) >>>> +++ passes.c (working copy) >>>> @@ -524,6 +524,11 @@ pass_manager::register_one_dump_file (struct opt_p >>>> flag_name = concat (prefix, name, num, NULL); >>>> glob_name = concat (prefix, name, NULL); >>>> optgroup_flags |= pass->optinfo_flags; >>>> + /* For any passes that do not have an optgroup set, and which are not >>>> + IPA passes setup above, set the optgroup to OPTGROUP_OTHER so that >>>> + any dump messages are emitted properly under -fopt-info(-optall). 
*/ >>>> + if (optgroup_flags == OPTGROUP_NONE) >>>> + optgroup_flags = OPTGROUP_OTHER; >>>> id = dump_register (dot_name, flag_name, glob_name, flags, >>>> optgroup_flags); >>>> set_pass_for_id (id, pass); >>>> full_name = concat (prefix, pass->name, num, NULL); >>>> Index: value-prof.c >>>> =================================================================== >>>> --- value-prof.c (revision 201461) >>>> +++ value-prof.c (working copy) >>>> @@ -585,9 +585,11 @@ check_counter (gimple stmt, const char * name, >>>> : DECL_SOURCE_LOCATION (current_function_decl); >>>> if (flag_profile_correction) >>>> { >>>> - inform (locus, "correcting inconsistent value profile: " >>>> - "%s profiler overall count (%d) does not match BB count " >>>> - "(%d)", name, (int)*all, (int)bb_count); >>>> + if (dump_enabled_p ()) >>>> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, locus, >>>> + "correcting inconsistent value profile: %s " >>>> + "profiler overall count (%d) does not match >>>> BB " >>>> + "count (%d)", name, (int)*all, >>>> (int)bb_count); >>>> *all = bb_count; >>>> if (*count > *all) >>>> *count = *all; >>>> @@ -1209,9 +1211,11 @@ find_func_by_funcdef_no (int func_id) >>>> int max_id = get_last_funcdef_no (); >>>> if (func_id >= max_id || cgraph_node_map[func_id] == NULL) >>>> { >>>> - if (flag_profile_correction) >>>> - inform (DECL_SOURCE_LOCATION (current_function_decl), >>>> - "Inconsistent profile: indirect call target (%d) does >>>> not exist", func_id); >>>> + if (flag_profile_correction && dump_enabled_p ()) >>>> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, >>>> + DECL_SOURCE_LOCATION (current_function_decl), >>>> + "Inconsistent profile: indirect call target (%d) >>>> " >>>> + "does not exist", func_id); >>>> else >>>> error ("Inconsistent profile: indirect call target (%d) does >>>> not exist", func_id); >>>> >>>> @@ -1235,8 +1239,10 @@ check_ic_target (gimple call_stmt, struct cgraph_n >>>> return true; >>>> >>>> locus = gimple_location (call_stmt); >>>> - inform 
(locus, "Skipping target %s with mismatching types for icall ", >>>> - cgraph_node_name (target)); >>>> + if (dump_enabled_p ()) >>>> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, locus, >>>> + "Skipping target %s with mismatching types for >>>> icall ", >>>> + cgraph_node_name (target)); >>>> return false; >>>> } >>>> >>>> Index: coverage.c >>>> =================================================================== >>>> --- coverage.c (revision 201461) >>>> +++ coverage.c (working copy) >>>> @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. If not see >>>> #include "langhooks.h" >>>> #include "hash-table.h" >>>> #include "tree-iterator.h" >>>> +#include "tree-pass.h" >>>> #include "cgraph.h" >>>> #include "dumpfile.h" >>>> #include "diagnostic-core.h" >>>> @@ -341,11 +342,13 @@ get_coverage_counts (unsigned counter, unsigned ex >>>> { >>>> static int warned = 0; >>>> >>>> - if (!warned++) >>>> - inform (input_location, (flag_guess_branch_prob >>>> - ? "file %s not found, execution counts estimated" >>>> - : "file %s not found, execution counts assumed to be >>>> zero"), >>>> - da_file_name); >>>> + if (!warned++ && dump_enabled_p ()) >>>> + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location, >>>> + (flag_guess_branch_prob >>>> + ? 
"file %s not found, execution counts >>>> estimated" >>>> + : "file %s not found, execution counts assumed >>>> to " >>>> + "be zero"), >>>> + da_file_name); >>>> return NULL; >>>> } >>>> >>>> @@ -369,21 +372,25 @@ get_coverage_counts (unsigned counter, unsigned ex >>>> warning_at (input_location, OPT_Wcoverage_mismatch, >>>> "the control flow of function %qE does not match " >>>> "its profile data (counter %qs)", id, >>>> ctr_names[counter]); >>>> - if (warning_printed) >>>> + if (warning_printed && dump_enabled_p ()) >>>> { >>>> - inform (input_location, "use -Wno-error=coverage-mismatch to >>>> tolerate " >>>> - "the mismatch but performance may drop if the >>>> function is hot"); >>>> + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location, >>>> + "use -Wno-error=coverage-mismatch to tolerate " >>>> + "the mismatch but performance may drop if the " >>>> + "function is hot"); >>>> >>>> if (!seen_error () >>>> && !warned++) >>>> { >>>> - inform (input_location, "coverage mismatch ignored"); >>>> - inform (input_location, flag_guess_branch_prob >>>> - ? G_("execution counts estimated") >>>> - : G_("execution counts assumed to be zero")); >>>> + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location, >>>> + "coverage mismatch ignored"); >>>> + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location, >>>> + flag_guess_branch_prob >>>> + ? G_("execution counts estimated") >>>> + : G_("execution counts assumed to be >>>> zero")); >>>> if (!flag_guess_branch_prob) >>>> - inform (input_location, >>>> - "this can result in poorly optimized code"); >>>> + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location, >>>> + "this can result in poorly optimized >>>> code"); >>>> } >>>> } >>>> >>>> @@ -1103,6 +1110,11 @@ coverage_init (const char *filename) >>>> int len = strlen (filename); >>>> int prefix_len = 0; >>>> >>>> + /* Since coverage_init is invoked very early, before the pass >>>> + manager, we need to set up the dumping explicitly. 
This is >>>> + similar to the handling in finish_optimization_passes. */ >>>> + dump_start (pass_profile.pass.static_pass_number, NULL); >>>> + >>>> if (!profile_data_prefix && !IS_ABSOLUTE_PATH (filename)) >>>> profile_data_prefix = getpwd (); >>>> >>>> @@ -1145,6 +1157,8 @@ coverage_init (const char *filename) >>>> gcov_write_unsigned (bbg_file_stamp); >>>> } >>>> } >>>> + >>>> + dump_finish (pass_profile.pass.static_pass_number); >>>> } >>>> >>>> /* Performs file-level cleanup. Close notes file, generate coverage >>>> Index: ipa-inline.c >>>> =================================================================== >>>> --- ipa-inline.c (revision 201461) >>>> +++ ipa-inline.c (working copy) >>>> @@ -1322,7 +1322,7 @@ recursive_inlining (struct cgraph_edge *edge, >>>> reset_edge_growth_cache (curr); >>>> } >>>> >>>> - inline_call (curr, false, new_edges, &overall_size, true); >>>> + inline_call (curr, false, new_edges, &overall_size, true, false); >>>> lookup_recursive_calls (node, curr->callee, heap); >>>> n++; >>>> } >>>> @@ -1612,7 +1612,8 @@ inline_small_functions (void) >>>> fprintf (dump_file, " Peeling recursion with depth %i\n", >>>> depth); >>>> >>>> gcc_checking_assert (!callee->global.inlined_to); >>>> - inline_call (edge, true, &new_indirect_edges, &overall_size, >>>> true); >>>> + inline_call (edge, true, &new_indirect_edges, &overall_size, >>>> true, >>>> + false); >>>> if (flag_indirect_inlining) >>>> add_new_edges_to_heap (edge_heap, new_indirect_edges); >>>> >>>> @@ -1733,7 +1734,7 @@ flatten_function (struct cgraph_node *node, bool e >>>> xstrdup (cgraph_node_name (callee)), >>>> xstrdup (cgraph_node_name (e->caller))); >>>> orig_callee = callee; >>>> - inline_call (e, true, NULL, NULL, false); >>>> + inline_call (e, true, NULL, NULL, false, early); >>>> if (e->callee != orig_callee) >>>> orig_callee->symbol.aux = (void *) node; >>>> flatten_function (e->callee, early); >>>> @@ -1852,7 +1853,8 @@ ipa_inline (void) >>>> inline_summary >>>> 
(node->callers->caller)->size); >>>> } >>>> >>>> - inline_call (node->callers, true, NULL, NULL, true); >>>> + inline_call (node->callers, true, NULL, NULL, true, >>>> + false); >>>> if (dump_file) >>>> fprintf (dump_file, >>>> " Inlined into %s which now has %i >>>> size\n", >>>> @@ -1925,7 +1927,7 @@ inline_always_inline_functions (struct cgraph_node >>>> fprintf (dump_file, " Inlining %s into %s (always_inline).\n", >>>> xstrdup (cgraph_node_name (e->callee)), >>>> xstrdup (cgraph_node_name (e->caller))); >>>> - inline_call (e, true, NULL, NULL, false); >>>> + inline_call (e, true, NULL, NULL, false, true); >>>> inlined = true; >>>> } >>>> if (inlined) >>>> @@ -1977,7 +1979,7 @@ early_inline_small_functions (struct cgraph_node * >>>> fprintf (dump_file, " Inlining %s into %s.\n", >>>> xstrdup (cgraph_node_name (callee)), >>>> xstrdup (cgraph_node_name (e->caller))); >>>> - inline_call (e, true, NULL, NULL, true); >>>> + inline_call (e, true, NULL, NULL, true, true); >>>> inlined = true; >>>> } >>>> >>>> Index: ipa-inline.h >>>> =================================================================== >>>> --- ipa-inline.h (revision 201461) >>>> +++ ipa-inline.h (working copy) >>>> @@ -228,7 +228,8 @@ void free_growth_caches (void); >>>> void compute_inline_parameters (struct cgraph_node *, bool); >>>> >>>> /* In ipa-inline-transform.c */ >>>> -bool inline_call (struct cgraph_edge *, bool, vec<cgraph_edge_p> *, >>>> int *, bool); >>>> +bool inline_call (struct cgraph_edge *, bool, vec<cgraph_edge_p> *, int *, >>>> + bool, bool); >>>> unsigned int inline_transform (struct cgraph_node *); >>>> void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *); >>>> >>>> Index: testsuite/gcc.dg/pr40209.c >>>> =================================================================== >>>> --- testsuite/gcc.dg/pr40209.c (revision 201461) >>>> +++ testsuite/gcc.dg/pr40209.c (working copy) >>>> @@ -1,5 +1,5 @@ >>>> /* { dg-do compile } */ >>>> -/* { dg-options "-O2 -fprofile-use" } 
*/ >>>> +/* { dg-options "-O2 -fprofile-use -fopt-info" } */ >>>> >>>> void process(const char *s); >>>> >>>> Index: testsuite/gcc.dg/pr26570.c >>>> =================================================================== >>>> --- testsuite/gcc.dg/pr26570.c (revision 201461) >>>> +++ testsuite/gcc.dg/pr26570.c (working copy) >>>> @@ -1,5 +1,5 @@ >>>> /* { dg-do compile } */ >>>> -/* { dg-options "-O2 -fprofile-generate -fprofile-use" } */ >>>> +/* { dg-options "-O2 -fprofile-generate -fprofile-use -fopt-info" } */ >>>> >>>> unsigned test (unsigned a, unsigned b) >>>> { >>>> Index: testsuite/gcc.dg/pr32773.c >>>> =================================================================== >>>> --- testsuite/gcc.dg/pr32773.c (revision 201461) >>>> +++ testsuite/gcc.dg/pr32773.c (working copy) >>>> @@ -1,6 +1,6 @@ >>>> /* { dg-do compile } */ >>>> -/* { dg-options "-O -fprofile-use" } */ >>>> -/* { dg-options "-O -m4 -fprofile-use" { target sh-*-* } } */ >>>> +/* { dg-options "-O -fprofile-use -fopt-info" } */ >>>> +/* { dg-options "-O -m4 -fprofile-use -fopt-info" { target sh-*-* } } */ >>>> >>>> void foo (int *p) >>>> { >>>> Index: testsuite/g++.dg/tree-ssa/dom-invalid.C >>>> =================================================================== >>>> --- testsuite/g++.dg/tree-ssa/dom-invalid.C (revision 201461) >>>> +++ testsuite/g++.dg/tree-ssa/dom-invalid.C (working copy) >>>> @@ -1,7 +1,7 @@ >>>> // PR tree-optimization/39557 >>>> // invalid post-dom info leads to infinite loop >>>> // { dg-do run } >>>> -// { dg-options "-Wall -fno-exceptions -O2 -fprofile-use -fno-rtti" } >>>> +// { dg-options "-Wall -fno-exceptions -O2 -fprofile-use -fopt-info >>>> -fno-rtti" } >>>> >>>> struct C >>>> { >>>> Index: testsuite/gcc.dg/inline-dump.c >>>> =================================================================== >>>> --- testsuite/gcc.dg/inline-dump.c (revision 0) >>>> +++ testsuite/gcc.dg/inline-dump.c (revision 0) >>>> @@ -0,0 +1,11 @@ >>>> +/* Verify that -fopt-info can output correct 
inline info. */ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-Wall -fopt-info-inline=stderr -O2 -fno-early-inlining" >>>> } */ >>>> +static inline int leaf() { >>>> + int i, ret = 0; >>>> + for (i = 0; i < 10; i++) >>>> + ret += i; >>>> + return ret; >>>> +} >>>> +static inline int foo(void) { return leaf(); } /* { dg-message "note: >>>> leaf .*inlined into bar .*via inline instance foo.*\n" } */ >>>> +int bar(void) { return foo(); } >>>>> >>>>> Thanks, >>>>> Teresa >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Martin >>>>> >>>>> >>>>> >>>>> -- >>>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 >>>> >>>> >>>> >>>> -- >>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
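
A side note on cgraph_node_opt_info in the quoted patch: it calls sprintf with buf as both the destination and a "%s" argument, and the C standard leaves sprintf undefined when the objects it copies between overlap. Below is a minimal standalone sketch of one way to build the same "name [order] (count)" text without the overlap. It is only an illustration under stated assumptions, not code from the patch: format_node_info and its plain parameters are hypothetical stand-ins for the cgraph_node fields, long long with %lld stands in for HOST_WIDEST_INT_PRINT_DEC, and malloc stands in for xmalloc.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of GCC): return a malloc'd string
   "NAME [ORDER]", with " (COUNT)" appended when HAVE_PROFILE is
   nonzero.  Returns NULL on failure.  The caller frees the result.  */
static char *
format_node_info (const char *name, int order, int have_profile,
                  long long count)
{
  int len;
  char *buf;

  if (!name)
    name = "unknown";

  /* First pass: let snprintf report the length needed, so no
     hand-computed bound such as MAX_INT_LENGTH is required.  */
  if (have_profile)
    len = snprintf (NULL, 0, "%s [%d] (%lld)", name, order, count);
  else
    len = snprintf (NULL, 0, "%s [%d]", name, order);
  if (len < 0)
    return NULL;

  buf = (char *) malloc ((size_t) len + 1);
  if (!buf)
    return NULL;

  /* Second pass: format into BUF.  BUF is never passed as an
     argument, so source and destination cannot overlap.  */
  if (have_profile)
    snprintf (buf, (size_t) len + 1, "%s [%d] (%lld)", name, order, count);
  else
    snprintf (buf, (size_t) len + 1, "%s [%d]", name, order);

  return buf;
}
```

Calling format_node_info ("foobar", 0, 1, 99999000LL) yields "foobar [0] (99999000)", the same shape as the example message earlier in the thread. Returning an owned string also makes it clear that callers such as the call-chain builder would need to free each name after appending it.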