Re: How to get GCC on par with ICC?
On Fri, 8 Jun 2018, Steve Ellcey wrote:

> On Thu, 2018-06-07 at 12:01 +0200, Richard Biener wrote:
> > When we do our own comparisons of GCC vs. ICC on benchmarks
> > like SPEC CPU 2006/2017 ICC doesn't have a big lead over GCC
> > (in fact it even trails in some benchmarks) unless you get to
> > "SPEC tricks" like data structure re-organization optimizations that
> > probably never apply in practice on real-world code (and people
> > should fix such things at the source level being pointed at them
> > via actually profiling their codes).
>
> Richard,
>
> I was wondering if you have any more details about these
> comparisons you have done that you can share? Compiler versions,
> options used, hardware, etc. Also, were there any tests that stood
> out in terms of icc outperforming GCC?
>
> I did a compare of SPEC 2017 rate using GCC 8.* (pre release) and
> a recent ICC (2018.0.128?) on my desktop (Xeon CPU E5-1650 v4).
> I used '-xHost -O3' for icc and '-march=native -mtune=native -O3'
> for gcc.

You should use -Ofast for gcc. As mentioned earlier in the discussion, ICC has some equivalent of -ffast-math by default.

> The int rate numbers (running 1 copy only) were not too bad, GCC was
> only about 2% slower and only 525.x264_r seemed way slower with GCC.
> The fp rate numbers (again only 1 copy) showed a larger difference,
> around 20%. 521.wrf_r was more than twice as slow when compiled with
> GCC instead of ICC and 503.bwaves_r and 510.parest_r also showed
> significant slowdowns when compiled with GCC vs. ICC.

--
Marc Glisse
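The -Ofast suggestion matters for the fp rate numbers because, in GCC, -Ofast turns on -ffast-math, which allows floating-point operations to be reassociated. Without it the compiler has to keep the evaluation order written in the source, and that restricts how it can vectorize floating-point reductions. The following is a minimal sketch of why the order matters under strict IEEE arithmetic; the function names are illustrative and do not come from the thread or from SPEC.

```c
#include <stddef.h>

/* Illustrative helpers (hypothetical names, not from the thread).  */

/* Sums left to right, exactly in the order written in the source.  */
double sum_in_order (const double *v, size_t n)
{
  double s = 0.0;
  for (size_t i = 0; i < n; i++)
    s += v[i];
  return s;
}

/* Sums with two independent accumulators, which is the kind of
   reordering a vectorizer would like to apply to a reduction.  */
double sum_two_lanes (const double *v, size_t n)
{
  double even = 0.0, odd = 0.0;
  size_t i = 0;
  for (; i + 1 < n; i += 2)
    {
      even += v[i];
      odd += v[i + 1];
    }
  if (i < n)
    even += v[i];
  return even + odd;
}
```

For the input {1e20, 1.0, -1e20, 1.0} the in-order sum is 1.0 while the two-accumulator sum is 2.0, so a compiler that must preserve IEEE results cannot silently turn the first loop into the second. This is one plausible reason flags that differ in their default floating-point model give non-comparable benchmark numbers.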
gcc-8-20180608 is now available
Snapshot gcc-8-20180608 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/8-20180608/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-8-branch revision 261348

You'll find:

 gcc-8-20180608.tar.xz    Complete GCC

  SHA256=3097e5eeaf5701b003696140772f6d6bd1a5748a57a30d03eaa916f942857c22
  SHA1=da2e4f143e52e812abcd1fc2ab87dc3db2cfdf57

Diffs from 8-20180601 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Re: decrement_and_branch_until_zero pattern
On Fri, Jun 8, 2018 at 1:12 PM, Paul Koning wrote: > Thanks. I saw those sections and interpreted them as support for signal > processor style fast hardware loops. If they can be adapted for dbra type > looping, great. I'll give that a try. The rs6000 port uses it for bdnz (branch decrement not zero) for instance, which is similar to the m68k dbra. > Meanwhile, yes, it looks like there is a documentation bug. I can clean that > up. It's more than a few lines, but does that qualify for an "obvious" > change? I think the obvious rule should only apply to trivial patches, and this will require some non-trivial changes to fix the looping pattern section. Just deleting the decrement_and_branch_until_zero named pattern section looks trivial. It looks like the REG_NONNEG section should mention the doloop_end pattern instead of decrement_and_branch_until_zero, since I think the same rule applies that they only get generated if the doloop_end pattern exists. Jim
Re: Aarch64 / simd builtin question
On Fri, 2018-06-08 at 22:34 +0100, James Greenhalgh wrote:
> Are you in an environment where you can use arm_neon.h ? If so, that
> would be the best approach:
>
>   float32x4_t in;
>   float64x2_t low = vcvt_f64_f32 (vget_low_f64 (in));
>   float64x2_t high = vcvt_high_f64_f32 (in);
>
> If you can't use arm_neon.h for some reason, you can look there for
> inspiration of how to write your own versions of these intrinsics.
>
> Thanks,
> James

Thanks, that is helpful, though I think you meant vget_low_f32 in the first line instead of vget_low_f64. With that change I get the code I want/expect. I hadn't seen the __GETLOW macro in the neon header file.

Steve Ellcey
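Putting the suggestion and the correction together, a self-contained sketch of the unpacking could look like the following. The wrapper name widen4 and the scalar fallback are assumptions added so that the example also builds on hosts other than AArch64; only vget_low_f32, vcvt_f64_f32 and vcvt_high_f64_f32 come from the thread, and the load/store intrinsics are standard arm_neon.h calls used here for convenience.

```c
#if defined(__aarch64__)
#include <arm_neon.h>

/* Widen four floats to four doubles with the arm_neon.h intrinsics
   discussed above.  widen4 is a hypothetical wrapper name.  */
void widen4 (const float in[4], double out[4])
{
  float32x4_t v = vld1q_f32 (in);
  /* Lower two lanes: take the low half first, then convert.  */
  float64x2_t low = vcvt_f64_f32 (vget_low_f32 (v));
  /* Upper two lanes: converted directly from the full vector.  */
  float64x2_t high = vcvt_high_f64_f32 (v);
  vst1q_f64 (out, low);
  vst1q_f64 (out + 2, high);
}
#else
/* Scalar fallback (an assumption of this sketch) so the example
   still compiles and runs on non-AArch64 hosts.  */
void widen4 (const float in[4], double out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (double) in[i];
}
#endif
```

Using the header intrinsics rather than a __builtin_aarch64_* name or inline asm leaves the endianness handling to arm_neon.h, which is the point of James's recommendation.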
Re: How to get GCC on par with ICC?
On Thu, 2018-06-07 at 12:01 +0200, Richard Biener wrote:
>
> When we do our own comparisons of GCC vs. ICC on benchmarks
> like SPEC CPU 2006/2017 ICC doesn't have a big lead over GCC
> (in fact it even trails in some benchmarks) unless you get to
> "SPEC tricks" like data structure re-organization optimizations that
> probably never apply in practice on real-world code (and people
> should fix such things at the source level being pointed at them
> via actually profiling their codes).

Richard,

I was wondering if you have any more details about these comparisons you have done that you can share? Compiler versions, options used, hardware, etc. Also, were there any tests that stood out in terms of icc outperforming GCC?

I did a compare of SPEC 2017 rate using GCC 8.* (pre release) and a recent ICC (2018.0.128?) on my desktop (Xeon CPU E5-1650 v4). I used '-xHost -O3' for icc and '-march=native -mtune=native -O3' for gcc.

The int rate numbers (running 1 copy only) were not too bad, GCC was only about 2% slower and only 525.x264_r seemed way slower with GCC. The fp rate numbers (again only 1 copy) showed a larger difference, around 20%. 521.wrf_r was more than twice as slow when compiled with GCC instead of ICC and 503.bwaves_r and 510.parest_r also showed significant slowdowns when compiled with GCC vs. ICC.

Steve Ellcey
sell...@cavium.com
Re: Aarch64 / simd builtin question
On Fri, Jun 08, 2018 at 04:01:14PM -0500, Steve Ellcey wrote: > I have a question about the Aarch64 simd instructions and builtins. > > I want to unpack a __Float32x4 (V4SF) variable into two __Float64x2 > variables. I can get the upper part with: > > __Float64x2_t a = __builtin_aarch64_vec_unpacks_hi_v4sf (x); > > But I can't seem to find a builtin that would get me the lower half. > I assume this is due to the issue in aarch64-simd.md around the > vec_unpacks_lo_ instruction: > > ;; ??? Note that the vectorizer usage of the vec_unpacks_[lo/hi] patterns > ;; is inconsistent with vector ordering elsewhere in the compiler, in that > ;; the meaning of HI and LO changes depending on the target endianness. > ;; While elsewhere we map the higher numbered elements of a vector to > ;; the lower architectural lanes of the vector, for these patterns we want > ;; to always treat "hi" as referring to the higher architectural lanes. > ;; Consequently, while the patterns below look inconsistent with our > ;; other big-endian patterns their behavior is as required. > > Does this mean we can't have a __builtin_aarch64_vec_unpacks_lo_v4sf > builtin that will work in big endian and little endian modes? > It seems like it should be possible but I don't really understand > the details of the implementation enough to follow the comment and > all its implications. > > Right now, as a workaround, I use: > > static inline __Float64x2_t __vec_unpacks_lo_v4sf (__Float32x4_t x) > { > __Float64x2_t result; > __asm__ ("fcvtl %0.2d,%1.2s" : "=w"(result) : "w"(x) : /* No clobbers */); > return result; > } > > But a builtin would be cleaner. Hi Steve, Are you in an environment where you can use arm_neon.h ? If so, that would be the best approach: float32x4_t in; float64x2_t low = vcvt_f64_f32 (vget_low_f64 (in)); float64x2_t high = vcvt_high_f64_f32 (in); If you can't use arm_neon.h for some reason, you can look there for inspiration of how to write your own versions of these intrinsics. 
Thanks, James
Aarch64 / simd builtin question
I have a question about the Aarch64 simd instructions and builtins.

I want to unpack a __Float32x4 (V4SF) variable into two __Float64x2 variables. I can get the upper part with:

  __Float64x2_t a = __builtin_aarch64_vec_unpacks_hi_v4sf (x);

But I can't seem to find a builtin that would get me the lower half. I assume this is due to the issue in aarch64-simd.md around the vec_unpacks_lo_ instruction:

;; ??? Note that the vectorizer usage of the vec_unpacks_[lo/hi] patterns
;; is inconsistent with vector ordering elsewhere in the compiler, in that
;; the meaning of HI and LO changes depending on the target endianness.
;; While elsewhere we map the higher numbered elements of a vector to
;; the lower architectural lanes of the vector, for these patterns we want
;; to always treat "hi" as referring to the higher architectural lanes.
;; Consequently, while the patterns below look inconsistent with our
;; other big-endian patterns their behavior is as required.

Does this mean we can't have a __builtin_aarch64_vec_unpacks_lo_v4sf builtin that will work in big endian and little endian modes? It seems like it should be possible but I don't really understand the details of the implementation enough to follow the comment and all its implications.

Right now, as a workaround, I use:

static inline __Float64x2_t __vec_unpacks_lo_v4sf (__Float32x4_t x)
{
  __Float64x2_t result;
  __asm__ ("fcvtl %0.2d,%1.2s" : "=w"(result) : "w"(x) : /* No clobbers */);
  return result;
}

But a builtin would be cleaner.

Steve Ellcey
sell...@cavium.com
Re: [GSOC] LTO dump tool project
On 8 June 2018 at 22:46, Hrishikesh Kulkarni wrote:
> Hi,
>
> -fdump-lto-body=foo
> will dump gimple body of the function foo
>
> foo (int a, int b)
> {
>   [local count: 1073741825]:
>   _3 = a_1(D) + b_2(D);
>   return _3;
>
> }
>
> Please find the diff file attached herewith.

@@ -53,10 +55,14 @@ dump_list ()
   fprintf (stderr, "\t\tName \t\tType \t\tVisibility\n");
   FOR_EACH_SYMBOL (node)
   {
-    fprintf (stderr, "\n%20s",(flag_lto_dump_demangle)
-      ? node->name (): node->dump_asm_name ());
+    const char *x = strchr (node->asm_name (), '/');
+    if (flag_lto_dump_demangle)
+      fprintf (stderr, "\n%20s", node->name ());
+    else
+      fprintf (stderr, "\n%20s", node->asm_name (),
+               node->asm_name ()-x);

Shouldn't this be:

  fprintf (stderr, "\n%20.*s", (int) (x - node->asm_name ()), node->asm_name ())

?

Also better to put strchr within else block since that's the only place you seem to be using it.

Thanks,
Prathamesh

>
> Regards,
> Hrishikesh
>
> On Fri, Jun 8, 2018 at 7:15 PM, Martin Liška wrote:
>> On 06/08/2018 03:40 PM, Martin Liška wrote:
>>> There's wrong declaration of the function in header file. I'll fix it soon
>>> on trunk. Please carry on with following patch:
>>
>> Fixed in r261327.
>>
>> Martin
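The "%20.*s" form suggested in the review takes the maximum number of characters to print as an int argument placed before the string, which is why the pointer difference is cast to int and computed as x minus the start of the name (the other way round would be negative). A minimal standalone sketch, using a hypothetical helper name and snprintf instead of the symtab_node accessors so it can be tried outside GCC:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format NAME right-aligned in a 20-column field,
   dropping everything from the first '/' onwards ("foo/14" -> "foo").
   Returns what snprintf returns.  */
int format_symbol_name (char *buf, size_t size, const char *name)
{
  const char *slash = strchr (name, '/');
  /* If there is no '/', print the whole name.  */
  int len = slash ? (int) (slash - name) : (int) strlen (name);
  /* 20 is the minimum field width, LEN the precision, i.e. the maximum
     number of characters taken from NAME.  */
  return snprintf (buf, size, "%20.*s", len, name);
}
```

The NULL check is an addition of this sketch: strchr returns NULL when the name contains no '/', and subtracting from a null pointer would be undefined.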
Re: decrement_and_branch_until_zero pattern
> On Jun 8, 2018, at 2:29 PM, Jim Wilson wrote: > > On 06/08/2018 06:21 AM, Paul Koning wrote: >> Interesting. The ChangeLog doesn't give any background. I suppose I should >> plan to approximate the effect of this pattern with a define-peephole2 ? > > The old RTL loop optimizer was replaced with a new RTL loop optimizer. When > the old one was written, m68k was a major target, and the dbra optimization > was written for it. When the new one was written, m68k was not a major > target, and this support was written differently. We now have doloop_begin > and doloop_end patterns that do almost the same thing, and can be created by > the loop-doloop.c code. > > There is a section in the internals docs that talks about this. > https://gcc.gnu.org/onlinedocs/gccint/Looping-Patterns.html > > The fact that we still have decrement_and_branch_until_zero references in > docs and target md files looks like a bug. The target md files should use > doloop patterns instead, and the doc references should be dropped. Thanks. I saw those sections and interpreted them as support for signal processor style fast hardware loops. If they can be adapted for dbra type looping, great. I'll give that a try. Meanwhile, yes, it looks like there is a documentation bug. I can clean that up. It's more than a few lines, but does that qualify for an "obvious" change? paul
Re: decrement_and_branch_until_zero pattern
On 06/08/2018 06:21 AM, Paul Koning wrote: Interesting. The ChangeLog doesn't give any background. I suppose I should plan to approximate the effect of this pattern with a define-peephole2 ? The old RTL loop optimizer was replaced with a new RTL loop optimizer. When the old one was written, m68k was a major target, and the dbra optimization was written for it. When the new one was written, m68k was not a major target, and this support was written differently. We now have doloop_begin and doloop_end patterns that do almost the same thing, and can be created by the loop-doloop.c code. There is a section in the internals docs that talks about this. https://gcc.gnu.org/onlinedocs/gccint/Looping-Patterns.html The fact that we still have decrement_and_branch_until_zero references in docs and target md files looks like a bug. The target md files should use doloop patterns instead, and the doc references should be dropped. Jim
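For readers unfamiliar with the optimization, the source shape that a dbra-style instruction fits is a counted loop whose counter is only decremented and tested against zero. The sketch below shows that shape in C; the function name is illustrative, and whether GCC actually emits a decrement-and-branch instruction for it depends on the target providing doloop patterns and on the optimizer's choices, so this is not a guarantee of any particular code generation.

```c
/* Hypothetical example: sum N elements using a counter that is only
   decremented and compared against zero.  */
long sum_counted (const int *v, unsigned n)
{
  long total = 0;
  if (n == 0)
    return 0;
  do
    total += *v++;
  while (--n != 0);   /* decrement, then branch while not zero */
  return total;
}
```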
Re: [GSOC] LTO dump tool project
Hi, -fdump-lto-body=foo will dump gimple body of the function foo foo (int a, int b) { [local count: 1073741825]: _3 = a_1(D) + b_2(D); return _3; } Please find the diff file attached herewith. Regards, Hrishikesh On Fri, Jun 8, 2018 at 7:15 PM, Martin Liška wrote: > On 06/08/2018 03:40 PM, Martin Liška wrote: >> There's wrong declaration of the function in header file. I'll fix it soon >> on trunk. Please carry on with following patch: > > Fixed in r261327. > > Martin diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c index 8529c82..8d20917 100644 --- a/gcc/lto-streamer-in.c +++ b/gcc/lto-streamer-in.c @@ -1320,7 +1320,6 @@ lto_read_body_or_constructor (struct lto_file_decl_data *file_data, struct symta /* Restore decl state */ file_data->current_decl_state = file_data->global_decl_state; } - lto_data_in_delete (data_in); } diff --git a/gcc/lto/lang.opt b/gcc/lto/lang.opt index a098797..c10c662 100644 --- a/gcc/lto/lang.opt +++ b/gcc/lto/lang.opt @@ -77,6 +77,9 @@ LTO Driver RejectNegative Joined Var(flag_lto_dump_symbol) demangle LTO Var(flag_lto_dump_demangle) +fdump-lto-body= +LTO Driver RejectNegative Joined Var(flag_lto_dump_body) + fresolution= LTO Joined The resolution file. diff --git a/gcc/lto/lto-dump.c b/gcc/lto/lto-dump.c index e0becd1..687c9c9 100644 --- a/gcc/lto/lto-dump.c +++ b/gcc/lto/lto-dump.c @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see #include "function.h" #include "basic-block.h" #include "tree.h" +#include "tree-cfg.h" #include "gimple.h" #include "cgraph.h" #include "lto-streamer.h" @@ -36,13 +37,14 @@ along with GCC; see the file COPYING3. If not see #include "stdio.h" #include "lto.h" + /* Dump everything. */ -void +void dump () { fprintf(stderr, "\nHello World!\n"); } - + /* Dump variables and functions used in IL. */ void dump_list () @@ -53,10 +55,14 @@ dump_list () fprintf (stderr, "\t\tName \t\tType \t\tVisibility\n"); FOR_EACH_SYMBOL (node) { - fprintf (stderr, "\n%20s",(flag_lto_dump_demangle) - ? 
node->name (): node->dump_asm_name ()); + const char *x = strchr (node->asm_name (), '/'); + if (flag_lto_dump_demangle) + fprintf (stderr, "\n%20s", node->name ()); + else + fprintf (stderr, "\n%20s", node->asm_name (), +node->asm_name ()-x); fprintf (stderr, "%20s", node->dump_type_name ()); - fprintf (stderr, "%20s\n", node->dump_visibility ()); + fprintf (stderr, "%20s", node->dump_visibility ()); } } @@ -67,13 +73,19 @@ dump_symbol () symtab_node *node; fprintf (stderr, "\t\tName \t\tType \t\tVisibility\n"); FOR_EACH_SYMBOL (node) - { - if (!strcmp(flag_lto_dump_symbol, node->name())) + if (!strcmp (flag_lto_dump_symbol, node->name ())) + node->debug (); +} + +/* Dump gimple body of specific function. */ +void +dump_body () +{ + cgraph_node *cnode; + FOR_EACH_FUNCTION (cnode) + if (!strcmp (cnode->name (), flag_lto_dump_body)) { - fprintf (stderr, "\n%20s",(flag_lto_dump_demangle) -? node->name (): node->dump_asm_name ()); - fprintf (stderr, "%20s", node->dump_type_name ()); - fprintf (stderr, "%20s\n", node->dump_visibility ()); + cnode->get_untransformed_body (); + debug_function (cnode->decl, 0); } - } } \ No newline at end of file diff --git a/gcc/lto/lto-dump.h b/gcc/lto/lto-dump.h index 352160c..3b6c9bc 100644 --- a/gcc/lto/lto-dump.h +++ b/gcc/lto/lto-dump.h @@ -29,4 +29,7 @@ void dump_list (); /*Dump specific variable or function used in IL. */ void dump_symbol (); +/*Dump gimple body of specific function. 
*/ +void dump_body (); + #endif \ No newline at end of file diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c index ab1eed3..88d1480 100644 --- a/gcc/lto/lto.c +++ b/gcc/lto/lto.c @@ -2170,7 +2170,7 @@ lto_file_read (lto_file *file, FILE *resolution_file, int *count) /* Finalize each lto file for each submodule in the merged object */ for (file_data = file_list.first; file_data != NULL; file_data = file_data->next) lto_create_files_from_ids (file, file_data, count); - + splay_tree_delete (file_ids); htab_delete (section_hash_table); @@ -3373,6 +3373,10 @@ lto_main (void) if (flag_lto_dump_symbol) dump_symbol (); + /* Dump gimple body of specific function. */ + if (flag_lto_dump_body) +dump_body (); + timevar_stop (TV_PHASE_STREAM_IN); if (!seen_error ()) diff --git a/gcc/symtab.c b/gcc/symtab.c index 1d2374f..0e08519 100644 --- a/gcc/symtab.c +++ b/gcc/symtab.c @@ -815,7 +815,7 @@ symtab_node::dump_visibility () const "default", "protected", "hidden", "internal" }; - return visibility_types [DECL_VISIBILITY (decl)]; + return visibility_types[DECL_VISIBILITY (decl)]; } const char * diff --git a/gcc/tree-cfg.h b/gcc/tree-cfg.h index 73237a6..3e10d15 100644 --- a/gcc/tree-cfg.h +++ b/gcc/tree-cfg.h @@ -81,7 +81,7 @@ extern void fold_loop_internal_call (gimple *, tree); extern basic_block move_sese_region_to_fn (struct function *,
Re: aarch64-none-elf build broken
On 8 June 2018 at 16:11, Joseph Myers wrote: > On Fri, 8 Jun 2018, Jonathan Wakely wrote: > >> > The root cause is PR66203 which I reported quite some time ago, which >> > points to a newlib problem: on aarch64 there is no default rom >> > monitor, one has to explicitly use a --specs flag for the link to >> > succeed. >> >> I have no idea why this causes the libstdc++ configuration problem >> though, I don't understand the issue. > > Generically libstdc++ should not be doing link tests for bare-metal > targets; rather, there is a hardcoded set of defines in configure.ac for > functions present on such targets. Thanks, I thought that might be how we need to fix this. > (For most other targets, link tests > *should* be run even when cross-compiling, there have been plenty of bugs > in the past where something tested in the $GLIBCXX_IS_NATIVE case wasn't > also tested in crossconfig.m4 for targets such as GNU/Linux where a target > libc is guaranteed to be linkable against for building libstdc++.) Yes, this requirement is very fragile and introduces differences (sometimes serious ones) between native and cross builds (e.g. r260678, r258468, r244169 ...)
Re: aarch64-none-elf build broken
On Fri, 8 Jun 2018, Jonathan Wakely wrote: > > The root cause is PR66203 which I reported quite some time ago, which > > points to a newlib problem: on aarch64 there is no default rom > > monitor, one has to explicitly use a --specs flag for the link to > > succeed. > > I have no idea why this causes the libstdc++ configuration problem > though, I don't understand the issue. Generically libstdc++ should not be doing link tests for bare-metal targets; rather, there is a hardcoded set of defines in configure.ac for functions present on such targets. (For most other targets, link tests *should* be run even when cross-compiling, there have been plenty of bugs in the past where something tested in the $GLIBCXX_IS_NATIVE case wasn't also tested in crossconfig.m4 for targets such as GNU/Linux where a target libc is guaranteed to be linkable against for building libstdc++.) -- Joseph S. Myers jos...@codesourcery.com
Re: aarch64-none-elf build broken
On 8 June 2018 at 16:41, Jonathan Wakely wrote: > On 8 June 2018 at 14:22, Christophe Lyon wrote: >> Hi, >> >> As I reported in >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78870#c16, the build of >> GCC for aarch64*-none-elf fails when configuring libstdc++ since >> r261034 (a week ago). > > Sorry for not trying to fix it, I'm travelling and not been able to > look into it (which is why I've only been doing trivial refactoring > patches all week). > I'm not blaming you in any way :) > >> The root cause is PR66203 which I reported quite some time ago, which >> points to a newlib problem: on aarch64 there is no default rom >> monitor, one has to explicitly use a --specs flag for the link to >> succeed. > > I have no idea why this causes the libstdc++ configuration problem > though, I don't understand the issue. That's because aarch64-elf-gcc conftest.c -o conftest fails to link if one does not provide --specs=rdimon.specs. So, every configure test that involves a link phase fails.
Re: aarch64-none-elf build broken
On 8 June 2018 at 14:22, Christophe Lyon wrote: > Hi, > > As I reported in > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78870#c16, the build of > GCC for aarch64*-none-elf fails when configuring libstdc++ since > r261034 (a week ago). Sorry for not trying to fix it, I'm travelling and not been able to look into it (which is why I've only been doing trivial refactoring patches all week). > The root cause is PR66203 which I reported quite some time ago, which > points to a newlib problem: on aarch64 there is no default rom > monitor, one has to explicitly use a --specs flag for the link to > succeed. I have no idea why this causes the libstdc++ configuration problem though, I don't understand the issue.
Re: [GSOC] LTO dump tool project
On 06/08/2018 03:40 PM, Martin Liška wrote: > There's wrong declaration of the function in header file. I'll fix it soon > on trunk. Please carry on with following patch: Fixed in r261327. Martin
Re: [GSOC] LTO dump tool project
On 06/08/2018 03:27 PM, Hrishikesh Kulkarni wrote: > Hi, > > Linking is not taking place as the debug_function() being used is in > tree-cfg.c. How should I go about on adding in make-file considering > the dependencies? Hi. There's wrong declaration of the function in header file. I'll fix it soon on trunk. Please carry on with following patch: diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c index 17634797c6e..363b59febd6 100644 --- a/gcc/lto/lto.c +++ b/gcc/lto/lto.c @@ -55,6 +55,7 @@ along with GCC; see the file COPYING3. If not see #include "fold-const.h" #include "attribs.h" #include "builtins.h" +#include "tree-cfg.h" /* Number of parallel tasks to run, -1 if we want to use GNU Make jobserver. */ @@ -3338,6 +3339,8 @@ offload_handle_link_vars (void) void lto_main (void) { + // only test if it builds + debug_function (cfun->decl, TDF_NONE); /* LTO is called as a front end, even though it is not a front end. Because it is called as a front end, TV_PHASE_PARSING and TV_PARSE_GLOBAL are active, and we need to turn them off while diff --git a/gcc/tree-cfg.h b/gcc/tree-cfg.h index 73237a604be..9491bb45feb 100644 --- a/gcc/tree-cfg.h +++ b/gcc/tree-cfg.h @@ -81,7 +81,7 @@ extern void fold_loop_internal_call (gimple *, tree); extern basic_block move_sese_region_to_fn (struct function *, basic_block, basic_block, tree); extern void dump_function_to_file (tree, FILE *, dump_flags_t); -extern void debug_function (tree, int) ; +extern void debug_function (tree, dump_flags_t); extern void print_loops_bb (FILE *, basic_block, int, int); extern void print_loops (FILE *, int); extern void debug (struct loop &ref); Martin > > Please find the diff file attached herewith. > > Regards, > Hrishikesh > > On Tue, Jun 5, 2018 at 12:38 AM, Martin Liška wrote: >> On 06/04/2018 08:13 PM, Hrishikesh Kulkarni wrote: >>> >>> Hi, >>> >>> -fdump-lto-list will dump all the symbol list >> >> >> I see extra new lines in the output: >> >> $ lto1 -fdump-lto-list main.o >> [..snip..] 
>> Symbol Table >> NameTypeVisibility >> >>fwrite/15function default >> >> foo/14function default >> >> mystring/12variable default >> >> pole/11variable default >> >> main/13function default >> >>> -fdump-lto-list -demangle will dump all the list with symbol names >>> demangled >> >> >> Good for now. Note that non-demagle version prints function names with order >> (/$number). >> I would not print that. >> >>> -fdump-lto-symbol=foo will dump details of foo >> >> >> I would really prefer to use symtab_node::debug for now. It presents all >> details about >> symbol instead of current implementation which does: '-fdump-lto-list | grep >> foo' >> >>> >>> The output(demangled) will be in tabular form like nm: >>> Symbol Table >>> Name Type Visibility >>>printffunction default >>> kvariable default >>> mainfunction default >>> barfunction default >>> foofunction default >>> >>> I have tried to format the changes according to gnu coding style and >>> added required methods in symtab_node. >> >> >> That's nice that you came up with new symbol_node methods. It's much better. >> About the GNU coding style, I still see trailing whitespace: >> >> === ERROR type #3: there should be no space before a left square bracket (1 >> error(s)) === >> gcc/symtab.c:818:26: return visibility_types [DECL_VISIBILITY (decl)]; >> >> === ERROR type #4: trailing whitespace (6 error(s)) === >> gcc/lto/lto-dump.c:40:4:void█ >> gcc/lto/lto-dump.c:45:0:█ >> gcc/lto/lto-dump.c:56:52: fprintf (stderr, >> "\n%20s",(flag_lto_dump_demangle)█ >> gcc/lto/lto-dump.c:73:53: fprintf (stderr, >> "\n%20s",(flag_lto_dump_demangle)█ >> gcc/lto/lto-dump.c:78:2:}█ >> gcc/lto/lto-dump.c:79:1:}█ >> >> Martin >> >> >>> >>> Please find the diff file attached. >>> >>> Regards, >>> Hrishikesh >>> >>> On Mon, Jun 4, 2018 at 2:06 PM, Martin Liška wrote: On 06/01/2018 08:59 PM, Hrishikesh Kulkarni wrote: > > Hi, > I have pushed the changes to github > (https://github.com/hrisearch/gcc). 
Added a command line option for > specific dumps of variables and functions used in IL e.g. > -fdump-lto-list=foo will dump: > Call Graph: > > foo/1 (foo) >Type: function > visibility: default Hi. Thanks for the next step. I've got some comments about it: - -fdump-lto-list=foo is wrong option name, I would use -fdump-lto-symbol or something similar.
Re: [GSOC] LTO dump tool project
Hi, Linking is not taking place as the debug_function() being used is in tree-cfg.c. How should I go about on adding in make-file considering the dependencies? Please find the diff file attached herewith. Regards, Hrishikesh On Tue, Jun 5, 2018 at 12:38 AM, Martin Liška wrote: > On 06/04/2018 08:13 PM, Hrishikesh Kulkarni wrote: >> >> Hi, >> >> -fdump-lto-list will dump all the symbol list > > > I see extra new lines in the output: > > $ lto1 -fdump-lto-list main.o > [..snip..] > Symbol Table > NameTypeVisibility > >fwrite/15function default > > foo/14function default > > mystring/12variable default > > pole/11variable default > > main/13function default > >> -fdump-lto-list -demangle will dump all the list with symbol names >> demangled > > > Good for now. Note that non-demagle version prints function names with order > (/$number). > I would not print that. > >> -fdump-lto-symbol=foo will dump details of foo > > > I would really prefer to use symtab_node::debug for now. It presents all > details about > symbol instead of current implementation which does: '-fdump-lto-list | grep > foo' > >> >> The output(demangled) will be in tabular form like nm: >> Symbol Table >> Name Type Visibility >>printffunction default >> kvariable default >> mainfunction default >> barfunction default >> foofunction default >> >> I have tried to format the changes according to gnu coding style and >> added required methods in symtab_node. > > > That's nice that you came up with new symbol_node methods. It's much better. 
> About the GNU coding style, I still see trailing whitespace: > > === ERROR type #3: there should be no space before a left square bracket (1 > error(s)) === > gcc/symtab.c:818:26: return visibility_types [DECL_VISIBILITY (decl)]; > > === ERROR type #4: trailing whitespace (6 error(s)) === > gcc/lto/lto-dump.c:40:4:void█ > gcc/lto/lto-dump.c:45:0:█ > gcc/lto/lto-dump.c:56:52: fprintf (stderr, > "\n%20s",(flag_lto_dump_demangle)█ > gcc/lto/lto-dump.c:73:53: fprintf (stderr, > "\n%20s",(flag_lto_dump_demangle)█ > gcc/lto/lto-dump.c:78:2:}█ > gcc/lto/lto-dump.c:79:1:}█ > > Martin > > >> >> Please find the diff file attached. >> >> Regards, >> Hrishikesh >> >> On Mon, Jun 4, 2018 at 2:06 PM, Martin Liška wrote: >>> >>> On 06/01/2018 08:59 PM, Hrishikesh Kulkarni wrote: Hi, I have pushed the changes to github (https://github.com/hrisearch/gcc). Added a command line option for specific dumps of variables and functions used in IL e.g. -fdump-lto-list=foo will dump: Call Graph: foo/1 (foo) Type: function visibility: default >>> >>> >>> Hi. >>> >>> Thanks for the next step. I've got some comments about it: >>> >>> - -fdump-lto-list=foo is wrong option name, I would use -fdump-lto-symbol >>>or something similar. >>> >>> - for -fdump-lto-list I would really prefer to use a format similar to >>> nm: >>>print a header with column description and then one line for a symbol >>> >>> - think about mangling/demangling of C++ symbols, you can take a look at >>> nm it also has --demangle, --no-demangle >>> >>> - please learn & try to use an autoformat for your editor in order to >>>fulfill GNU coding style. 
Following checker will help you: >>> >>> $ ./contrib/check_GNU_style.py /tmp/p >>> === ERROR type #1: dot, space, space, end of comment (6 error(s)) === >>> gcc/lto/lto-dump.c:38:17:/*Dump everything*/ >>> gcc/lto/lto-dump.c:44:41:/*Dump variables and functions used in IL*/ >>> gcc/lto/lto-dump.c:73:50:/*Dump specific variables and functions used in >>> IL*/ >>> gcc/lto/lto.c:3364:19: /*Dump everything*/ >>> gcc/lto/lto.c:3368:43: /*Dump variables and functions used in IL*/ >>> gcc/lto/lto.c:3372:52: /*Dump specific variables and functions used in >>> IL*/ >>> >>> === ERROR type #2: lines should not exceed 80 characters (11 error(s)) >>> === >>> gcc/lto/lto-dump.c:51:80:static const char * const >>> symtab_type_names[] = {"symbol", "function", "variable"}; >>> gcc/lto/lto-dump.c:56:80:fprintf (stderr, "\n%s (%s)", >>> cnode->dump_asm_name (), cnode->name ()); >>> gcc/lto/lto-dump.c:57:80:fprintf (stderr, "\n Type: %s", >>> symtab_type_names[cnode->type]); >>> gcc/lto/lto-dump.c:66:80:fprintf (stderr, "\n%s (%s)", >>> vnode->dump_asm_name (), vnode->name ()); >>> gcc/lto/lto-dump.c:67:80:fprintf (stderr, "\n Type: %s", >>> symtab_type_names[vnode->type]); >>> gcc/lto/lto-dump.c:80:80:
aarch64-none-elf build broken
Hi, As I reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78870#c16, the build of GCC for aarch64*-none-elf fails when configuring libstdc++ since r261034 (a week ago). The root cause is PR66203 which I reported quite some time ago, which points to a newlib problem: on aarch64 there is no default rom monitor, one has to explicitly use a --specs flag for the link to succeed. Maybe I missed a change about this in newlib, and I should upgrade the version I use for GCC automatic validations? If not, and if there is not much interest in these configurations, maybe I should just drop them from my list? Alternatively, I could try to use LDFLAGS_FOR_TARGET=--specs=rdimon.specs in my validation scripts. Or, better, are there plans to fix this? I ask, because I have no immediate plans to look at this. Thanks, Christophe
Re: decrement_and_branch_until_zero pattern
> On Jun 8, 2018, at 6:59 AM, Andreas Schwab wrote: > > On Jun 07 2018, Paul Koning wrote: > >> None of these seem to use that loop optimization, with -O2 or -Os. Did I >> miss some magic switch to turn on something else that isn't on by default? >> Or is this a feature that was broken long ago and not noticed? If so, any >> hints where I might look for a reason? > > commit 7d3c6452d7 > Author: rakdver > Date: Thu Mar 2 23:50:55 2006 + > >* loop.c: Removed. Interesting. The ChangeLog doesn't give any background. I suppose I should plan to approximate the effect of this pattern with a define-peephole2 ? paul
Re: current state of gcc-ia16?
On 08/06/2018 12:43, Dennis Luehring wrote:
>> is the patch already integrated into mainline?
>
> No, it's not.

> will that ever happen?

Hard to say. There's no reason in principle why it couldn't happen, but there's not a big demand for it, so it would require someone taking the time and trouble to do it. It's not trivial, though - the current implementation has some middle-end changes which would need thinking through and doing properly to avoid polluting that code with ia16-isms.

I might update it and have another try at upstreaming it at some point if nobody else does it first, but I have too much else going on at the moment so it would likely be a year or two (maybe more) before I get to it.

Andrew
Re: current state of gcc-ia16?
>> is the patch already integrated into mainline?
>
> No, it's not.

will that ever happen?

>> is this the most recent development place?
>> https://github.com/tkchia/gcc-ia16
>
> Yes, that's the right place.

thx

On 08.06.2018 at 12:59, Andrew Jenner wrote:
> Hi Dennis,
>
> On 08/06/2018 11:37, Dennis Luehring wrote:
> > is the patch already integrated into mainline?
>
> No, it's not.
>
> > is this the most recent development place?
> > https://github.com/tkchia/gcc-ia16
>
> Yes, that's the right place.
>
> Andrew
Re: current state of gcc-ia16?
Hi Dennis,

On 08/06/2018 11:37, Dennis Luehring wrote:
> is the patch already integrated into mainline?

No, it's not.

> is this the most recent development place?
> https://github.com/tkchia/gcc-ia16

Yes, that's the right place.

Andrew
Re: decrement_and_branch_until_zero pattern
On Jun 07 2018, Paul Koning wrote: > None of these seem to use that loop optimization, with -O2 or -Os. Did I > miss some magic switch to turn on something else that isn't on by default? > Or is this a feature that was broken long ago and not noticed? If so, any > hints where I might look for a reason? commit 7d3c6452d7 Author: rakdver Date: Thu Mar 2 23:50:55 2006 + * loop.c: Removed. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@111650 138bc75d-0d04-0410-961f-82ee72b054a4 Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
current state of gcc-ia16?
is the patch already integrated into mainline? is this the most recent development place? https://github.com/tkchia/gcc-ia16
overflow check in extract_range_from_binary_1 useless?
Howdy.

Am I missing something or are these two sets identical?

  /* Get the lower and upper bounds of the type.  */
  if (TYPE_OVERFLOW_WRAPS (expr_type))
    {
      type_min = wi::min_value (prec, sgn);
      type_max = wi::max_value (prec, sgn);
    }
  else
    {
      type_min = wi::to_wide (vrp_val_min (expr_type));
      type_max = wi::to_wide (vrp_val_max (expr_type));
    }

Isn't wi::to_wide(TYPE_MIN/MAX_VALUE) the same as wi::min/max_value, or is there some weird language (*cough ada*) subtlety I'm missing?

Confused.
Aldy
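For reference, the bounds that follow purely from a precision and a signedness, which is what wi::min_value (prec, sgn) and wi::max_value (prec, sgn) describe, can be sketched in plain C for precisions up to 64 bits. This is only an illustration of the arithmetic with hypothetical helper names, not GCC code; the open question in the message is whether a type's recorded TYPE_MIN_VALUE/TYPE_MAX_VALUE can ever differ from these values.

```c
#include <stdint.h>

/* Hypothetical helpers; PREC must be in the range 1..64.  */

/* Smallest value of a PREC-bit integer: 0 if unsigned,
   -2^(PREC-1) if signed.  */
int64_t prec_min (unsigned prec, int is_signed)
{
  if (!is_signed)
    return 0;
  if (prec == 64)
    return INT64_MIN;
  return -((int64_t) 1 << (prec - 1));
}

/* Largest value of a PREC-bit integer: 2^PREC - 1 if unsigned,
   2^(PREC-1) - 1 if signed.  */
uint64_t prec_max (unsigned prec, int is_signed)
{
  unsigned bits = is_signed ? prec - 1 : prec;
  if (bits == 64)
    return UINT64_MAX;   /* avoid shifting by the full width */
  return ((uint64_t) 1 << bits) - 1;
}
```

If a front end records a narrower range on a type than its precision implies, the two branches in the quoted code would produce different bounds; if it never does, they are equivalent.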