On Sat, Aug 31, 2013 at 12:26 AM, Bernhard Reutner-Fischer <rep.dot....@gmail.com> wrote:
> On 30 August 2013 23:23:16 Teresa Johnson <tejohn...@google.com> wrote:
>>
>> On Fri, Aug 30, 2013 at 1:30 PM, Xinliang David Li <davi...@google.com> wrote:
>> > On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson <tejohn...@google.com> wrote:
>> >> On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li <davi...@google.com> wrote:
>> >>> Except that in this form, the dump will be extremely large and not
>> >>> suitable for very large applications.
>> >>
>> >> Yes. I did some measurements for both a fairly large source file that
>> >> is heavily optimized with LIPO and for a simple toy example that has
>> >> some inlining. For the large source file, the output from
>> >> -fdump-ipa-inline=stderr was almost 100x the line count of the
>> >> -fopt-info output. For the toy source file it was 43x. The size of the
>> >> -details output was 250x and 100x, respectively. Which is untenable
>> >> for a large app.
>> >>
>> >> The issue I am having here is that I want a more verbose message, not
>> >> a more voluminous set of messages. Using either -fopt-info-all or
>> >> -fdump-ipa-inline to provoke the more verbose inline message will give
>> >> me a much greater volume of output.
>> >>
>> >> One compromise could be to emit the more verbose inliner message under
>> >> a param (and a more concise "foo inlined into bar" by default with
>> >> -fopt-info). Or we could do some variant of what David talks about
>> >> below.
>> >
>> > something like --param=verbose-opt-info=1
>>
>> Yes. Richard, would this be acceptable for now?
>>
>> i.e.
>> the inliner messages would be like:
>>
>> -fopt-info:
>> "test.c:8:3: note: foobar inlined into foo with call count 99999000"
>> (the "with call count X" only when there is profile feedback)
>>
>> -fopt-info --param=verbose-opt-info=1:
>> "test.c:8:3: note: foobar/0 (99999000) inlined into foo/2 (1000)
>> with call count 99999000 (via inline instance bar [3] (99999000))"
>> (again the call counts only emitted under profile feedback)
>
>
> Assuming the [3] is order, please change that to match what the inliner
> uses, i.e. /3

Agreed - I meant to switch that back to "/" in both places but missed
the last. It should read:

"test.c:8:3: note: foobar/0 (99999000) inlined into foo/2 (1000)
with call count 99999000 (via inline instance bar/3 (99999000))"

Thanks,
Teresa

>
> Thanks
>
>>
>> >
>> >
>> >>
>> >>> Besides, we might also want to
>> >>> use the same machinery (dump_printf_loc etc) for dump file dumping.
>> >>> The current behavior of using '-details' to turn on opt-info-all
>> >>> messages for dump files is not desirable.
>> >>
>> >> Interestingly, this doesn't even work. When I do
>> >> -fdump-ipa-inline-details=stderr (with my patch containing the inliner
>> >> messages) I am not getting those inliner messages emitted to stderr.
>> >> Even though in dumpfile.c "details" is set to (TDF_DETAILS |
>> >> MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
>> >> sure why, but will need to debug this.
>> >
>> > It works for vectorizer pass.
>>
>> Ok, let me see what is going on - I just confirmed that it is not
>> working for the loop unroller messages either.
>>
>> >
>> >>
>> >>> How about the following:
>> >>>
>> >>> 1) add a new dump_kind modifier so that when that modifier is
>> >>> specified, the messages won't go to the alt_dumpfile (controlled by
>> >>> -fopt-info), but only to primary dump file. With this, the inline
>> >>> messages can be dumped via:
>> >>>
>> >>> dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY,
>> >>> .....)
>> >>
>> >> (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )
>> >>
>> >
>> > Yes.
>> >
>> >> Typically OR-ing together flags like this indicates dump under any of
>> >> those conditions.
>> >> But we could implement special handling for
>> >> OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
>> >> the primary dump file, and only under the other conditions specified
>> >> in the flag (here under "-optimized")
>> >>
>> >>>
>> >>>
>> >>> 2) add more flags in -fdump- support:
>> >>>
>> >>> -fdump-ipa-inline-opt --> turn on opt-info messages only
>> >>> -fdump-ipa-inline-optall --> turn on opt-info-all messages
>> >>
>> >> According to the documentation (see the -fdump-tree- documentation on
>> >>
>> >> http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
>> >> the above are already supposed to be there (-optimized, -missed, -note
>> >> and -optall). However, specifying any of these gives a warning like:
>> >> cc1: warning: ignoring unknown option ‘optimized’ in
>> >> ‘-fdump-ipa-inline’ [enabled by default]
>> >> Probably because none is listed in the dump_options[] array in
>> >> dumpfile.c.
>> >>
>> >> However, I don't think there is currently a way to use -fdump- options
>> >> and *only* get one of these, as much of the current dump output is
>> >> emitted whenever there is a dump_file defined. Until everything is
>> >> migrated to the new framework it may be difficult to get this to work.
>> >>
>> >>> -fdump-tree-pre-ir --> turn on GIMPLE dump only
>> >>> -fdump-tree-pre-details --> turn on everything (ir, optall, trace)
>> >>>
>> >>> With this, developers can really just use
>> >>>
>> >>>
>> >>> -fdump-ipa-inline-opt=stderr for inline messages.
>> >>
>> >> Yes, if we can figure out a good way to get this to work (i.e. only
>> >> emit the optimized messages and not the rest of the dump messages).
>> >> And unfortunately to get them all you need to specify
>> >> "-fdump-ipa-all-optimized -fdump-tree-all-optimized
>> >> -fdump-rtl-all-optimized" instead of just -fopt-info. Unless we can
>> >> add -fdump-all-all-optimized.
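
A minimal sketch of how the OPT_DUMP_FILE_ONLY handling described in 1) above might behave, written as a standalone toy model rather than actual dumpfile.c code. The flag values, the TO_* constants and dump_destinations () are made up for illustration; only the MSG_* and OPT_DUMP_FILE_ONLY names come from this thread.

```c
/* Toy flag values for illustration; NOT the values in GCC's dumpfile.h.  */
enum
{
  MSG_OPTIMIZED_LOCATIONS = 1 << 0,
  MSG_MISSED_OPTIMIZATION = 1 << 1,
  MSG_NOTE                = 1 << 2,
  OPT_DUMP_FILE_ONLY      = 1 << 3   /* proposed modifier, hypothetical */
};

/* Hypothetical result bits: which streams receive the message.  */
enum { TO_NONE = 0, TO_DUMP_FILE = 1, TO_ALT_FILE = 2 };

/* Decide where a message of kind MSG_KIND goes.  DUMP_FILE_FLAGS and
   ALT_FILE_FLAGS are the MSG_* kinds enabled for the primary dump file
   (-fdump-*) and for the alternate stream (-fopt-info).  */
int
dump_destinations (int msg_kind, int dump_file_flags, int alt_file_flags)
{
  /* The modifier is not itself a message kind, so mask it out before
     testing against the enabled kinds.  */
  int kinds = msg_kind & ~OPT_DUMP_FILE_ONLY;
  int dest = TO_NONE;

  /* The usual "any of these kinds" test for the primary dump file.  */
  if (kinds & dump_file_flags)
    dest |= TO_DUMP_FILE;

  /* The alternate stream is skipped entirely when the modifier is set.  */
  if (!(msg_kind & OPT_DUMP_FILE_ONLY) && (kinds & alt_file_flags))
    dest |= TO_ALT_FILE;

  return dest;
}
```

In this model a message tagged (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) reaches the primary dump file only when "-optimized" is enabled there, and never reaches the -fopt-info stream.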
>> >
>> > Having general support requires cleanup of all the old style if
>> > (dump_file) fprintf (dump_file, ...) instances to be:
>> >
>> > if (dump_enabled_p ())
>> >   dump_printf (dump_kind ....);
>>
>> Right. But that is going to be a big longer-term effort - grepping for
>> dump_file in gcc/*.c gives about 6000 instances.
>>
>> >
>> >
>> > However, it might be easier to do this filtering for IR dump only (in
>> > execute_function_dump) -- do not dump IR if any of the MSG_xxxx is
>> > specified unless IR flag (a new flag) is also specified.
>>
>> Unfortunately there are a lot of messages that are not from
>> execute_function_dump.
>>
>> Thanks,
>> Teresa
>>
>> >
>> > David
>> >
>> >
>> >>
>> >> Teresa
>> >>
>> >>>
>> >>> thanks,
>> >>>
>> >>> David
>> >>>
>> >>> On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
>> >>> <richard.guent...@gmail.com> wrote:
>> >>>> On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson
>> >>>> <tejohn...@google.com> wrote:
>> >>>>> On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
>> >>>>> <richard.guent...@gmail.com> wrote:
>> >>>>>>>>> New patch below that removes this global variable, and also outputs
>> >>>>>>>>> the node->symbol.order (in square brackets after the function name so
>> >>>>>>>>> as to not clutter it). Inline messages with profile data look like:
>> >>>>>>>>>
>> >>>>>>>>> test.c:8:3: note: foobar [0] (99999000) inlined into foo [2] (1000)
>> >>>>>>>>> with call count 99999000 (via inline instance bar [3] (99999000))
>> >>>>>>>>
>> >>>>>>>> Ick. This looks both redundant and cluttered. This is supposed to be
>> >>>>>>>> understandable by GCC users, not only GCC developers.
>> >>>>>>>
>> >>>>>>> The main part that is only useful/understandable to gcc developers is
>> >>>>>>> the node->symbol.order in square brackets, requested by Martin.
>> >>>>>>> One possibility is that I could put that part under a param, disabled by
>> >>>>>>> default. We have something similar on the google branches that emits
>> >>>>>>> LIPO module info in the message, enabled via a param.
>> >>>>>>
>> >>>>>> But we have _dump files_ for that. That's the developer-consumed
>> >>>>>> form of opt-info. -fopt-info is purely user sugar and for usual translation
>> >>>>>> units it shouldn't exceed a single terminal full of output.
>> >>>>>
>> >>>>> But as a developer I don't want to have to parse lots of dump files
>> >>>>> for a summary of the major optimizations performed (e.g. inlining,
>> >>>>> unrolling) for an application, unless I am diving into the reasons for
>> >>>>> why or why not one of those optimizations occurred in a particular
>> >>>>> location. I really do want a summary emitted to stderr so that it is
>> >>>>> easily searchable/summarizable for the app as a whole.
>> >>>>>
>> >>>>> For example, some of the apps I am interested in have thousands of
>> >>>>> input files, and trying to collect and parse dump files for each and
>> >>>>> every one is overwhelming (it probably would be even if my input files
>> >>>>> numbered in the hundreds). What has been very useful is having these
>> >>>>> high level summary messages of inlines and unrolls emitted to stderr
>> >>>>> by -fopt-info. Then it is easy to search and sort by hotness to get a
>> >>>>> feel for things like what inlines are missing when moving to a new
>> >>>>> compiler, or compiling a new version of the source, for example. Then
>> >>>>> you know which files to focus on and collect dump files for.
>> >>>>
>> >>>> I thought we can direct dump files to stderr now? So, just use
>> >>>> -fdump-tree-all=stderr
>> >>>>
>> >>>> and grep its contents.
>> >>>>
>> >>>>>>
>> >>>>>>> I'd argue that the other information (the profile counts, emitted only
>> >>>>>>> when using -fprofile-use, and the inline call chains) are useful if
>> >>>>>>> you want to understand whether and how critical inlines are occurring.
>> >>>>>>> I think this is the type of information that users focused on
>> >>>>>>> optimizations, as well as gcc developers, want when they use
>> >>>>>>> -fopt-info. Otherwise it is difficult to make sense of the inline
>> >>>>>>> information.
>> >>>>>>
>> >>>>>> Well, I doubt that inline information is interesting to users unless we are
>> >>>>>> able to aggressively filter it to what users are interested in. Which IMHO
>> >>>>>> isn't possible - users are interested in "I have not inlined this even though
>> >>>>>> inlining would severely improve performance" which would indicate a bug
>> >>>>>> in the heuristics we can reliably detect and thus it wouldn't be there.
>> >>>>>
>> >>>>> I have interacted with users who are aware of optimizations such as
>> >>>>> inlining and unrolling and want to look at that information to
>> >>>>> diagnose performance differences when refactoring code or using a new
>> >>>>> compiler version. I also think inlining (especially cross-module) is
>> >>>>> one example of an optimization that is still being tuned, and user
>> >>>>> reports of performance issues related to that have been useful.
>> >>>>>
>> >>>>> I really think that the two groups of people who will find -fopt-info
>> >>>>> useful are gcc developers and savvy performance-hungry users. For the
>> >>>>> former group the additional info is extremely useful. For the latter
>> >>>>> group some of the extra information may not be required (although a
>> >>>>> call count is useful for those using profile feedback), but IMO is not
>> >>>>> unreasonable.
>> >>>>
>> >>>> well, your proposed output wrecks my 80x24 terminal already due to overly
>> >>>> long lines.
>> >>>>
>> >>>> In the end we may end up with a verbosity level for each sub-set of opt-info
>> >>>> messages. Ick.
>> >>>>
>> >>>> Richard.
>> >>>>
>> >>>>> Teresa
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>> >>
>> >>
>> >>
>> >> --
>> >> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>
>

--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
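
For reference, a minimal sketch of the two message forms discussed at the top of this mail: the concise -fopt-info form and the more verbose form with the "/order" suffix and counts. This is a standalone illustration, not code from the patch. The struct, format_inline_note () and its parameters are hypothetical names, and the verbose form without profile feedback (order suffix but no counts) is an assumption, since the thread only shows the profile-feedback case.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a call graph node: name, order, profile count.  */
struct inline_node { const char *name; int order; long long count; };

/* Append formatted text to BUF, never writing past SIZE bytes.  */
static void
append (char *buf, size_t size, size_t *len, const char *fmt, ...)
{
  va_list ap;
  if (*len >= size)
    return;
  va_start (ap, fmt);
  int n = vsnprintf (buf + *len, size - *len, fmt, ap);
  va_end (ap);
  if (n > 0)
    *len += (size_t) n;  /* may pass SIZE on truncation; guarded above */
}

/* Emit "name", or "name/order (count)" in the verbose form.  */
static void
append_node (char *buf, size_t size, size_t *len,
             const struct inline_node *n, int verbose, int have_profile)
{
  append (buf, size, len, "%s", n->name);
  if (verbose)
    {
      append (buf, size, len, "/%d", n->order);
      if (have_profile)
        append (buf, size, len, " (%lld)", n->count);
    }
}

/* Build the note.  VIA may be NULL when there is no inline instance.
   Counts appear only when HAVE_PROFILE is nonzero.  */
void
format_inline_note (char *buf, size_t size, const char *loc,
                    const struct inline_node *callee,
                    const struct inline_node *caller,
                    const struct inline_node *via,
                    long long call_count, int verbose, int have_profile)
{
  size_t len = 0;
  if (size > 0)
    buf[0] = '\0';
  append (buf, size, &len, "%s: note: ", loc);
  append_node (buf, size, &len, callee, verbose, have_profile);
  append (buf, size, &len, " inlined into ");
  append_node (buf, size, &len, caller, verbose, have_profile);
  if (have_profile)
    append (buf, size, &len, " with call count %lld", call_count);
  if (verbose && via)
    {
      append (buf, size, &len, " (via inline instance ");
      append_node (buf, size, &len, via, verbose, have_profile);
      append (buf, size, &len, ")");
    }
}
```

Called with the foobar, foo and bar values from the example, verbose set and profile feedback on, this produces the "bar/3" form quoted above. With verbose clear it produces the shorter "foobar inlined into foo with call count 99999000" form.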