Re: How to get GCC on par with ICC?

2018-06-08 Thread Marc Glisse

On Fri, 8 Jun 2018, Steve Ellcey wrote:

> On Thu, 2018-06-07 at 12:01 +0200, Richard Biener wrote:
> > 
> > When we do our own comparisons of GCC vs. ICC on benchmarks
> > like SPEC CPU 2006/2017 ICC doesn't have a big lead over GCC
> > (in fact it even trails in some benchmarks) unless you get to
> > "SPEC tricks" like data structure re-organization optimizations that
> > probably never apply in practice on real-world code (and people
> > should fix such things at the source level being pointed at them
> > via actually profiling their codes).
>
> Richard,
>
> I was wondering if you have any more details about these comparisons
> you have done that you can share?  Compiler versions, options used,
> hardware, etc.  Also, were there any tests that stood out in terms of
> icc outperforming GCC?
>
> I did a compare of SPEC 2017 rate using GCC 8.* (pre release) and
> a recent ICC (2018.0.128?) on my desktop (Xeon CPU E5-1650 v4).
> I used '-xHost -O3' for icc and '-march=native -mtune=native -O3'
> for gcc.


You should use -Ofast for gcc. As mentioned earlier in the discussion, 
ICC has some equivalent of -ffast-math by default.
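
To illustrate why the flag choice can matter for the fp numbers: -Ofast
enables -ffast-math, which among other things lets the compiler reassociate
floating-point operations, and reassociation can change results.  A minimal
C sketch follows; the function names are made up for illustration and the
values are chosen so that 1.0 is lost when it is added to -1e16 first.

```c
#include <assert.h>

/* Hypothetical helpers, for illustration only.  Under strict IEEE
   semantics the two groupings below can give different results, so the
   compiler may not rewrite one into the other unless something like
   -ffast-math (implied by -Ofast) permits it.  */

double
sum_left (double a, double b, double c)
{
  volatile double t = a + b;	/* volatile: force rounding to double.  */
  return t + c;
}

double
sum_right (double a, double b, double c)
{
  volatile double t = b + c;	/* -1e16 + 1.0 rounds back to -1e16.  */
  return a + t;
}

/* Self-check: same three operands, different grouping, different sum.  */
void
demo (void)
{
  assert (sum_left (1e16, -1e16, 1.0) == 1.0);
  assert (sum_right (1e16, -1e16, 1.0) == 0.0);
}
```

With plain -O3 GCC has to keep the grouping written in the source, which
can block some vectorization of reductions; that is one plausible reason
for a gap against a compiler that relaxes this by default.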



> The int rate numbers (running 1 copy only) were not too bad, GCC was
> only about 2% slower and only 525.x264_r seemed way slower with GCC.
> The fp rate numbers (again only 1 copy) showed a larger difference, 
> around 20%.  521.wrf_r was more than twice as slow when compiled with
> GCC instead of ICC and 503.bwaves_r and 510.parest_r also showed
> significant slowdowns when compiled with GCC vs. ICC.


--
Marc Glisse


gcc-8-20180608 is now available

2018-06-08 Thread gccadmin
Snapshot gcc-8-20180608 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/8-20180608/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-8-branch 
revision 261348

You'll find:

 gcc-8-20180608.tar.xz           Complete GCC

  SHA256=3097e5eeaf5701b003696140772f6d6bd1a5748a57a30d03eaa916f942857c22
  SHA1=da2e4f143e52e812abcd1fc2ab87dc3db2cfdf57

Diffs from 8-20180601 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: decrement_and_branch_until_zero pattern

2018-06-08 Thread Jim Wilson
On Fri, Jun 8, 2018 at 1:12 PM, Paul Koning  wrote:
> Thanks.  I saw those sections and interpreted them as support for signal 
> processor style fast hardware loops.  If they can be adapted for dbra type 
> looping, great.  I'll give that a try.

The rs6000 port uses it for bdnz (branch decrement not zero) for
instance, which is similar to the m68k dbra.

> Meanwhile, yes, it looks like there is a documentation bug.  I can clean that 
> up.  It's more than a few lines, but does that qualify for an "obvious" 
> change?

I think the obvious rule should only apply to trivial patches, and
this will require some non-trivial changes to fix the looping pattern
section.  Just deleting the decrement_and_branch_until_zero named
pattern section looks trivial.  It looks like the REG_NONNEG section
should mention the doloop_end pattern instead of
decrement_and_branch_until_zero, since I think the same rule applies
that they only get generated if the doloop_end pattern exists.
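
For reference, the kind of source loop this is about can be sketched in C
as below.  This is a hypothetical example, not taken from GCC or its
testsuite; whether a bdnz/dbra-style instruction is actually emitted
depends on the target providing a doloop_end pattern and on the
optimization options.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: the loop counter is only decremented and tested
   against zero, which is the shape a "decrement and branch" instruction
   implements.  */
long
sum_countdown (const int *v, size_t n)
{
  long total = 0;

  assert (v != NULL || n == 0);
  if (n == 0)
    return 0;

  do
    total += *v++;
  while (--n != 0);

  return total;
}
```

Comparing the RTL dumps of such a function with and without the doloop
patterns in the md file should show whether the loop was converted.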

Jim


Re: Aarch64 / simd builtin question

2018-06-08 Thread Steve Ellcey
On Fri, 2018-06-08 at 22:34 +0100, James Greenhalgh wrote:
> 
> Are you in an environment where you can use arm_neon.h ? If so, that
> would
> be the best approach:
> 
>   float32x4_t in;
>   float64x2_t low = vcvt_f64_f32 (vget_low_f64 (in));
>   float64x2_t high = vcvt_high_f64_f32 (in);
> 
> If you can't use arm_neon.h for some reason, you can look there for
> inspiration of how to write your own versions of these intrinsics.
> 
> Thanks,
> James

Thanks, that is helpful though I think you meant vget_low_f32 in
the first line instead of vget_low_f64.  With that change I get the
code I want/expect.  I hadn't seen the __GETLOW macro in the neon
header file.
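
For the archive, a minimal sketch of the corrected sequence wrapped in a
function.  The wrapper name and the array-based interface are made up for
illustration, and the scalar branch is only there so the snippet also
builds on non-AArch64 hosts; the intrinsics are the ones from arm_neon.h
named above.

```c
#include <assert.h>
#if defined (__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper: widen four floats to four doubles, returning the
   low two lanes in LO and the high two lanes in HI.  */
void
widen_f32x4 (const float in[4], double lo[2], double hi[2])
{
  assert (in && lo && hi);
#if defined (__aarch64__)
  float32x4_t v = vld1q_f32 (in);
  /* vget_low_f32, not vget_low_f64: the input is a float32x4_t.  */
  float64x2_t low = vcvt_f64_f32 (vget_low_f32 (v));
  float64x2_t high = vcvt_high_f64_f32 (v);
  vst1q_f64 (lo, low);
  vst1q_f64 (hi, high);
#else
  /* Scalar fallback with the same lane mapping (assumption: only used
     when not compiling for AArch64).  */
  lo[0] = in[0];
  lo[1] = in[1];
  hi[0] = in[2];
  hi[1] = in[3];
#endif
}
```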

Steve Ellcey


Re: How to get GCC on par with ICC?

2018-06-08 Thread Steve Ellcey
On Thu, 2018-06-07 at 12:01 +0200, Richard Biener wrote:
> 
> When we do our own comparisons of GCC vs. ICC on benchmarks
> like SPEC CPU 2006/2017 ICC doesn't have a big lead over GCC
> (in fact it even trails in some benchmarks) unless you get to
> "SPEC tricks" like data structure re-organization optimizations that
> probably never apply in practice on real-world code (and people
> should fix such things at the source level being pointed at them
> via actually profiling their codes).

Richard,

I was wondering if you have any more details about these comparisons
you have done that you can share?  Compiler versions, options used,
hardware, etc.  Also, were there any tests that stood out in terms of
icc outperforming GCC?

I did a compare of SPEC 2017 rate using GCC 8.* (pre release) and
a recent ICC (2018.0.128?) on my desktop (Xeon CPU E5-1650 v4).
I used '-xHost -O3' for icc and '-march=native -mtune=native -O3'
for gcc.

The int rate numbers (running 1 copy only) were not too bad, GCC was
only about 2% slower and only 525.x264_r seemed way slower with GCC.
The fp rate numbers (again only 1 copy) showed a larger difference, 
around 20%.  521.wrf_r was more than twice as slow when compiled with
GCC instead of ICC and 503.bwaves_r and 510.parest_r also showed
significant slowdowns when compiled with GCC vs. ICC.

Steve Ellcey
sell...@cavium.com


Re: Aarch64 / simd builtin question

2018-06-08 Thread James Greenhalgh
On Fri, Jun 08, 2018 at 04:01:14PM -0500, Steve Ellcey wrote:
> I have a question about the Aarch64 simd instructions and builtins.
> 
> I want to unpack a __Float32x4 (V4SF) variable into two __Float64x2
> variables.  I can get the upper part with:
> 
> __Float64x2_t a = __builtin_aarch64_vec_unpacks_hi_v4sf (x);
> 
> But I can't seem to find a builtin that would get me the lower half.
> I assume this is due to the issue in aarch64-simd.md around the
> vec_unpacks_lo_ instruction:
> 
> ;; ??? Note that the vectorizer usage of the vec_unpacks_[lo/hi] patterns
> ;; is inconsistent with vector ordering elsewhere in the compiler, in that
> ;; the meaning of HI and LO changes depending on the target endianness.
> ;; While elsewhere we map the higher numbered elements of a vector to
> ;; the lower architectural lanes of the vector, for these patterns we want
> ;; to always treat "hi" as referring to the higher architectural lanes.
> ;; Consequently, while the patterns below look inconsistent with our
> ;; other big-endian patterns their behavior is as required.
> 
> Does this mean we can't have a __builtin_aarch64_vec_unpacks_lo_v4sf
> builtin that will work in big endian and little endian modes?
> It seems like it should be possible but I don't really understand 
> the details of the implementation enough to follow the comment and
> all its implications.
> 
> Right now, as a workaround, I use:
> 
> static inline __Float64x2_t __vec_unpacks_lo_v4sf (__Float32x4_t x)
> {
>   __Float64x2_t result;
>   __asm__ ("fcvtl %0.2d,%1.2s" : "=w"(result) : "w"(x) : /* No clobbers */);
>   return result;
> }
> 
> But a builtin would be cleaner.

Hi Steve,

Are you in an environment where you can use arm_neon.h ? If so, that would
be the best approach:

  float32x4_t in;
  float64x2_t low = vcvt_f64_f32 (vget_low_f64 (in));
  float64x2_t high = vcvt_high_f64_f32 (in);

If you can't use arm_neon.h for some reason, you can look there for
inspiration of how to write your own versions of these intrinsics.

Thanks,
James



Aarch64 / simd builtin question

2018-06-08 Thread Steve Ellcey
I have a question about the Aarch64 simd instructions and builtins.

I want to unpack a __Float32x4 (V4SF) variable into two __Float64x2
variables.  I can get the upper part with:

__Float64x2_t a = __builtin_aarch64_vec_unpacks_hi_v4sf (x);

But I can't seem to find a builtin that would get me the lower half.
I assume this is due to the issue in aarch64-simd.md around the
vec_unpacks_lo_ instruction:

;; ??? Note that the vectorizer usage of the vec_unpacks_[lo/hi] patterns
;; is inconsistent with vector ordering elsewhere in the compiler, in that
;; the meaning of HI and LO changes depending on the target endianness.
;; While elsewhere we map the higher numbered elements of a vector to
;; the lower architectural lanes of the vector, for these patterns we want
;; to always treat "hi" as referring to the higher architectural lanes.
;; Consequently, while the patterns below look inconsistent with our
;; other big-endian patterns their behavior is as required.

Does this mean we can't have a __builtin_aarch64_vec_unpacks_lo_v4sf
builtin that will work in big endian and little endian modes?
It seems like it should be possible but I don't really understand 
the details of the implementation enough to follow the comment and
all its implications.

Right now, as a workaround, I use:

static inline __Float64x2_t __vec_unpacks_lo_v4sf (__Float32x4_t x)
{
  __Float64x2_t result;
  __asm__ ("fcvtl %0.2d,%1.2s" : "=w"(result) : "w"(x) : /* No clobbers */);
  return result;
}

But a builtin would be cleaner.

Steve Ellcey
sell...@cavium.com


Re: [GSOC] LTO dump tool project

2018-06-08 Thread Prathamesh Kulkarni
On 8 June 2018 at 22:46, Hrishikesh Kulkarni  wrote:
> Hi,
>
> -fdump-lto-body=foo
> will dump gimple body of the function foo
>
> foo (int a, int b)
> {
>   <bb 2> [local count: 1073741825]:
>   _3 = a_1(D) + b_2(D);
>   return _3;
>
> }
>
> Please find the diff file attached herewith.
@@ -53,10 +55,14 @@ dump_list ()
 fprintf (stderr, "\t\tName \t\tType \t\tVisibility\n");
  FOR_EACH_SYMBOL (node)
  {
- fprintf (stderr, "\n%20s",(flag_lto_dump_demangle)
- ? node->name (): node->dump_asm_name ());
+const char *x = strchr (node->asm_name (), '/');
+if (flag_lto_dump_demangle)
+ fprintf (stderr, "\n%20s", node->name ());
+ else
+ fprintf (stderr, "\n%20s", node->asm_name (),
+ node->asm_name ()-x);

Shouldn't this be the following?

  fprintf (stderr, "\n%20.*s", (int) (x - node->asm_name ()),
	   node->asm_name ());

Also better to put strchr within else block since that's the only
place you seem to be using it.
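
A standalone sketch of how the precision form behaves, using snprintf so
the result can be inspected.  The helper name is made up and is not part of
the patch; falling back to the whole name when there is no '/' is an
assumption on my side, and the leading "\n" from the patch is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the part of ASM_NAME before the first '/'
   right-aligned in a 20-column field, as "%20.*s" does.  Returns what
   snprintf returns.  */
int
format_symbol_prefix (char *buf, size_t size, const char *asm_name)
{
  assert (buf != NULL && asm_name != NULL);

  const char *slash = strchr (asm_name, '/');
  /* Assumption: with no '/' in the name, print the whole name.  */
  int len = slash ? (int) (slash - asm_name) : (int) strlen (asm_name);

  /* The precision limits how many characters of asm_name are printed;
     the width still pads the field to 20 columns.  */
  return snprintf (buf, size, "%20.*s", len, asm_name);
}
```

Note the operand order: the precision must be x - asm_name (non-negative),
and it has to come before the string argument.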

Thanks,
Prathamesh
>
> Regards,
> Hrishikesh
>
> On Fri, Jun 8, 2018 at 7:15 PM, Martin Liška  wrote:
>> On 06/08/2018 03:40 PM, Martin Liška wrote:
>>> There's wrong declaration of the function in header file. I'll fix it soon
>>> on trunk. Please carry on with following patch:
>>
>> Fixed in r261327.
>>
>> Martin


Re: decrement_and_branch_until_zero pattern

2018-06-08 Thread Paul Koning



> On Jun 8, 2018, at 2:29 PM, Jim Wilson  wrote:
> 
> On 06/08/2018 06:21 AM, Paul Koning wrote:
>> Interesting.  The ChangeLog doesn't give any background.  I suppose I should 
>> plan to approximate the effect of this pattern with a define_peephole2?
> 
> The old RTL loop optimizer was replaced with a new RTL loop optimizer. When 
> the old one was written, m68k was a major target, and the dbra optimization 
> was written for it.  When the new one was written, m68k was not a major 
> target, and this support was written differently.  We now have doloop_begin 
> and doloop_end patterns that do almost the same thing, and can be created by 
> the loop-doloop.c code.
> 
> There is a section in the internals docs that talks about this.
> https://gcc.gnu.org/onlinedocs/gccint/Looping-Patterns.html
> 
> The fact that we still have decrement_and_branch_until_zero references in 
> docs and target md files looks like a bug.  The target md files should use 
> doloop patterns instead, and the doc references should be dropped.

Thanks.  I saw those sections and interpreted them as support for signal 
processor style fast hardware loops.  If they can be adapted for dbra type 
looping, great.  I'll give that a try.

Meanwhile, yes, it looks like there is a documentation bug.  I can clean that 
up.  It's more than a few lines, but does that qualify for an "obvious" change?

paul



Re: decrement_and_branch_until_zero pattern

2018-06-08 Thread Jim Wilson

On 06/08/2018 06:21 AM, Paul Koning wrote:
> Interesting.  The ChangeLog doesn't give any background.  I suppose I should 
> plan to approximate the effect of this pattern with a define_peephole2?


The old RTL loop optimizer was replaced with a new RTL loop optimizer. 
When the old one was written, m68k was a major target, and the dbra 
optimization was written for it.  When the new one was written, m68k was 
not a major target, and this support was written differently.  We now 
have doloop_begin and doloop_end patterns that do almost the same thing, 
and can be created by the loop-doloop.c code.


There is a section in the internals docs that talks about this.
https://gcc.gnu.org/onlinedocs/gccint/Looping-Patterns.html

The fact that we still have decrement_and_branch_until_zero references 
in docs and target md files looks like a bug.  The target md files 
should use doloop patterns instead, and the doc references should be 
dropped.


Jim


Re: [GSOC] LTO dump tool project

2018-06-08 Thread Hrishikesh Kulkarni
Hi,

-fdump-lto-body=foo
will dump gimple body of the function foo

foo (int a, int b)
{
  <bb 2> [local count: 1073741825]:
  _3 = a_1(D) + b_2(D);
  return _3;

}

Please find the diff file attached herewith.

Regards,
Hrishikesh

On Fri, Jun 8, 2018 at 7:15 PM, Martin Liška  wrote:
> On 06/08/2018 03:40 PM, Martin Liška wrote:
>> There's wrong declaration of the function in header file. I'll fix it soon
>> on trunk. Please carry on with following patch:
>
> Fixed in r261327.
>
> Martin
diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index 8529c82..8d20917 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -1320,7 +1320,6 @@ lto_read_body_or_constructor (struct lto_file_decl_data *file_data, struct symta
   /* Restore decl state */
   file_data->current_decl_state = file_data->global_decl_state;
 }
-
   lto_data_in_delete (data_in);
 }
 
diff --git a/gcc/lto/lang.opt b/gcc/lto/lang.opt
index a098797..c10c662 100644
--- a/gcc/lto/lang.opt
+++ b/gcc/lto/lang.opt
@@ -77,6 +77,9 @@ LTO Driver RejectNegative Joined Var(flag_lto_dump_symbol)
 demangle
 LTO Var(flag_lto_dump_demangle)
 
+fdump-lto-body=
+LTO Driver RejectNegative Joined Var(flag_lto_dump_body)
+
 fresolution=
 LTO Joined
 The resolution file.
diff --git a/gcc/lto/lto-dump.c b/gcc/lto/lto-dump.c
index e0becd1..687c9c9 100644
--- a/gcc/lto/lto-dump.c
+++ b/gcc/lto/lto-dump.c
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "function.h"
 #include "basic-block.h"
 #include "tree.h"
+#include "tree-cfg.h"
 #include "gimple.h"
 #include "cgraph.h"
 #include "lto-streamer.h"
@@ -36,13 +37,14 @@ along with GCC; see the file COPYING3.  If not see
 #include "stdio.h"
 #include "lto.h"
 
+
 /* Dump everything.  */
-void 
+void
 dump ()
 {
 	fprintf(stderr, "\nHello World!\n");
 }
-	
+
 /* Dump variables and functions used in IL.  */
 void
 dump_list ()
@@ -53,10 +55,14 @@ dump_list ()
 fprintf (stderr, "\t\tName \t\tType \t\tVisibility\n");
 	FOR_EACH_SYMBOL (node)
 	{
-		fprintf (stderr, "\n%20s",(flag_lto_dump_demangle) 
-			? node->name (): node->dump_asm_name ());
+	const char *x = strchr (node->asm_name (), '/');
+	if (flag_lto_dump_demangle)
+			fprintf (stderr, "\n%20s", node->name ());
+		else
+			fprintf (stderr, "\n%20s", node->asm_name (), 
+node->asm_name ()-x);
 		fprintf (stderr, "%20s", node->dump_type_name ());
-		fprintf (stderr, "%20s\n", node->dump_visibility ());
+		fprintf (stderr, "%20s", node->dump_visibility ());
 	}
 }
 
@@ -67,13 +73,19 @@ dump_symbol ()
 	symtab_node *node;
 fprintf (stderr, "\t\tName \t\tType \t\tVisibility\n");
 	FOR_EACH_SYMBOL (node)
-	{
-		if (!strcmp(flag_lto_dump_symbol, node->name()))
+		if (!strcmp (flag_lto_dump_symbol, node->name ()))
+			node->debug ();
+}
+
+/* Dump gimple body of specific function.  */
+void
+dump_body ()
+{
+	cgraph_node *cnode;
+	FOR_EACH_FUNCTION (cnode)
+		if (!strcmp (cnode->name (), flag_lto_dump_body))
 		{
-			fprintf (stderr, "\n%20s",(flag_lto_dump_demangle) 
-? node->name (): node->dump_asm_name ());
-		fprintf (stderr, "%20s", node->dump_type_name ());
-		fprintf (stderr, "%20s\n", node->dump_visibility ());
+			cnode->get_untransformed_body ();
+			debug_function (cnode->decl, 0);
 		}
-	}	
 }	
\ No newline at end of file
diff --git a/gcc/lto/lto-dump.h b/gcc/lto/lto-dump.h
index 352160c..3b6c9bc 100644
--- a/gcc/lto/lto-dump.h
+++ b/gcc/lto/lto-dump.h
@@ -29,4 +29,7 @@ void dump_list ();
 /*Dump specific variable or function used in IL.  */
 void dump_symbol ();
 
+/*Dump gimple body of specific function.  */
+void dump_body ();
+
 #endif
\ No newline at end of file
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index ab1eed3..88d1480 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -2170,7 +2170,7 @@ lto_file_read (lto_file *file, FILE *resolution_file, int *count)
   /* Finalize each lto file for each submodule in the merged object */
   for (file_data = file_list.first; file_data != NULL; file_data = file_data->next)
 lto_create_files_from_ids (file, file_data, count);
- 
+
   splay_tree_delete (file_ids);
   htab_delete (section_hash_table);
 
@@ -3373,6 +3373,10 @@ lto_main (void)
   if (flag_lto_dump_symbol)
 dump_symbol ();
 
+  /* Dump gimple body of specific function.  */
+  if (flag_lto_dump_body)
+dump_body ();
+
   timevar_stop (TV_PHASE_STREAM_IN);
 
   if (!seen_error ())
diff --git a/gcc/symtab.c b/gcc/symtab.c
index 1d2374f..0e08519 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -815,7 +815,7 @@ symtab_node::dump_visibility () const
 "default", "protected", "hidden", "internal"
   };
 
-  return visibility_types [DECL_VISIBILITY (decl)];
+  return visibility_types[DECL_VISIBILITY (decl)];
 }
 
 const char *
diff --git a/gcc/tree-cfg.h b/gcc/tree-cfg.h
index 73237a6..3e10d15 100644
--- a/gcc/tree-cfg.h
+++ b/gcc/tree-cfg.h
@@ -81,7 +81,7 @@ extern void fold_loop_internal_call (gimple *, tree);
 extern basic_block move_sese_region_to_fn (struct function *,

Re: aarch64-none-elf build broken

2018-06-08 Thread Jonathan Wakely
On 8 June 2018 at 16:11, Joseph Myers wrote:
> On Fri, 8 Jun 2018, Jonathan Wakely wrote:
>
>> > The root cause is PR66203 which I reported quite some time ago, which
>> > points to a newlib problem: on aarch64 there is no default rom
>> > monitor, one has to explicitly use a --specs flag for the link to
>> > succeed.
>>
>> I have no idea why this causes the libstdc++ configuration problem
>> though, I don't understand the issue.
>
> Generically libstdc++ should not be doing link tests for bare-metal
> targets; rather, there is a hardcoded set of defines in configure.ac for
> functions present on such targets.

Thanks, I thought that might be how we need to fix this.


>  (For most other targets, link tests
> *should* be run even when cross-compiling, there have been plenty of bugs
> in the past where something tested in the $GLIBCXX_IS_NATIVE case wasn't
> also tested in crossconfig.m4 for targets such as GNU/Linux where a target
> libc is guaranteed to be linkable against for building libstdc++.)

Yes, this requirement is very fragile and introduces differences
(sometimes serious ones) between native and cross builds (e.g.
r260678, r258468, r244169 ...)


Re: aarch64-none-elf build broken

2018-06-08 Thread Joseph Myers
On Fri, 8 Jun 2018, Jonathan Wakely wrote:

> > The root cause is PR66203 which I reported quite some time ago, which
> > points to a newlib problem: on aarch64 there is no default rom
> > monitor, one has to explicitly use a --specs flag for the link to
> > succeed.
> 
> I have no idea why this causes the libstdc++ configuration problem
> though, I don't understand the issue.

Generically libstdc++ should not be doing link tests for bare-metal 
targets; rather, there is a hardcoded set of defines in configure.ac for 
functions present on such targets.  (For most other targets, link tests 
*should* be run even when cross-compiling, there have been plenty of bugs 
in the past where something tested in the $GLIBCXX_IS_NATIVE case wasn't 
also tested in crossconfig.m4 for targets such as GNU/Linux where a target 
libc is guaranteed to be linkable against for building libstdc++.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: aarch64-none-elf build broken

2018-06-08 Thread Christophe Lyon
On 8 June 2018 at 16:41, Jonathan Wakely  wrote:
> On 8 June 2018 at 14:22, Christophe Lyon wrote:
>> Hi,
>>
>> As I reported in
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78870#c16, the build of
>> GCC for aarch64*-none-elf fails when configuring libstdc++ since
>> r261034 (a week ago).
>
> Sorry for not trying to fix it, I'm travelling and have not been able to
> look into it (which is why I've only been doing trivial refactoring
> patches all week).
>
I'm not blaming you in any way :)


>
>> The root cause is PR66203 which I reported quite some time ago, which
>> points to a newlib problem: on aarch64 there is no default rom
>> monitor, one has to explicitly use a --specs flag for the link to
>> succeed.
>
> I have no idea why this causes the libstdc++ configuration problem
> though, I don't understand the issue.

That's because aarch64-elf-gcc conftest.c -o conftest fails to link if
one does not provide --specs=rdimon.specs.

So, every configure test that involves a link phase fails.


Re: aarch64-none-elf build broken

2018-06-08 Thread Jonathan Wakely
On 8 June 2018 at 14:22, Christophe Lyon wrote:
> Hi,
>
> As I reported in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78870#c16, the build of
> GCC for aarch64*-none-elf fails when configuring libstdc++ since
> r261034 (a week ago).

Sorry for not trying to fix it, I'm travelling and have not been able to
look into it (which is why I've only been doing trivial refactoring
patches all week).


> The root cause is PR66203 which I reported quite some time ago, which
> points to a newlib problem: on aarch64 there is no default rom
> monitor, one has to explicitly use a --specs flag for the link to
> succeed.

I have no idea why this causes the libstdc++ configuration problem
though, I don't understand the issue.


Re: [GSOC] LTO dump tool project

2018-06-08 Thread Martin Liška
On 06/08/2018 03:40 PM, Martin Liška wrote:
> There's a wrong declaration of the function in the header file. I'll fix it
> soon on trunk. Please carry on with the following patch:

Fixed in r261327.

Martin


Re: [GSOC] LTO dump tool project

2018-06-08 Thread Martin Liška
On 06/08/2018 03:27 PM, Hrishikesh Kulkarni wrote:
> Hi,
> 
> Linking is not taking place as the debug_function() being used is in
> tree-cfg.c. How should I go about on adding in make-file considering
> the dependencies?

Hi.

There's a wrong declaration of the function in the header file. I'll fix it
soon on trunk. Please carry on with the following patch:

diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 17634797c6e..363b59febd6 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "fold-const.h"
 #include "attribs.h"
 #include "builtins.h"
+#include "tree-cfg.h"
 
 
 /* Number of parallel tasks to run, -1 if we want to use GNU Make jobserver.  */
@@ -3338,6 +3339,8 @@ offload_handle_link_vars (void)
 void
 lto_main (void)
 {
+  // only test if it builds
+  debug_function (cfun->decl, TDF_NONE);
   /* LTO is called as a front end, even though it is not a front end.
  Because it is called as a front end, TV_PHASE_PARSING and
  TV_PARSE_GLOBAL are active, and we need to turn them off while
diff --git a/gcc/tree-cfg.h b/gcc/tree-cfg.h
index 73237a604be..9491bb45feb 100644
--- a/gcc/tree-cfg.h
+++ b/gcc/tree-cfg.h
@@ -81,7 +81,7 @@ extern void fold_loop_internal_call (gimple *, tree);
 extern basic_block move_sese_region_to_fn (struct function *, basic_block,
   basic_block, tree);
 extern void dump_function_to_file (tree, FILE *, dump_flags_t);
-extern void debug_function (tree, int) ;
+extern void debug_function (tree, dump_flags_t);
 extern void print_loops_bb (FILE *, basic_block, int, int);
 extern void print_loops (FILE *, int);
 extern void debug (struct loop &ref);

Martin

> 
> Please find the diff file attached herewith.
> 
> Regards,
> Hrishikesh
> 
> On Tue, Jun 5, 2018 at 12:38 AM, Martin Liška  wrote:
>> On 06/04/2018 08:13 PM, Hrishikesh Kulkarni wrote:
>>>
>>> Hi,
>>>
>>> -fdump-lto-list will dump all the symbol list
>>
>>
>> I see extra new lines in the output:
>>
>> $ lto1 -fdump-lto-list main.o
>> [..snip..]
>> Symbol Table
>>                 Name                Type          Visibility
>>
>>            fwrite/15            function             default
>>
>>               foo/14            function             default
>>
>>          mystring/12            variable             default
>>
>>              pole/11            variable             default
>>
>>              main/13            function             default
>>
>>> -fdump-lto-list -demangle will dump all the list with symbol names
>>> demangled
>>
>>
>> Good for now. Note that the non-demangled version prints function names with order
>> (/$number).
>> I would not print that.
>>
>>> -fdump-lto-symbol=foo will dump details of foo
>>
>>
>> I would really prefer to use symtab_node::debug for now. It presents all
>> details about
>> symbol instead of current implementation which does: '-fdump-lto-list | grep
>> foo'
>>
>>>
>>> The output(demangled) will be in tabular form like nm:
>>> Symbol Table
>>>                 Name                Type          Visibility
>>>               printf            function             default
>>>                    k            variable             default
>>>                 main            function             default
>>>                  bar            function             default
>>>                  foo            function             default
>>>
>>> I have tried to format the changes according to gnu coding style and
>>> added required methods in symtab_node.
>>
>>
>> That's nice that you came up with new symbol_node methods. It's much better.
>> About the GNU coding style, I still see trailing whitespace:
>>
>> === ERROR type #3: there should be no space before a left square bracket (1
>> error(s)) ===
>> gcc/symtab.c:818:26:  return visibility_types [DECL_VISIBILITY (decl)];
>>
>> === ERROR type #4: trailing whitespace (6 error(s)) ===
>> gcc/lto/lto-dump.c:40:4:void█
>> gcc/lto/lto-dump.c:45:0:█
>> gcc/lto/lto-dump.c:56:52:   fprintf (stderr,
>> "\n%20s",(flag_lto_dump_demangle)█
>> gcc/lto/lto-dump.c:73:53:   fprintf (stderr,
>> "\n%20s",(flag_lto_dump_demangle)█
>> gcc/lto/lto-dump.c:78:2:}█
>> gcc/lto/lto-dump.c:79:1:}█
>>
>> Martin
>>
>>
>>>
>>> Please find the diff file attached.
>>>
>>> Regards,
>>> Hrishikesh
>>>
>>> On Mon, Jun 4, 2018 at 2:06 PM, Martin Liška  wrote:

>>>> On 06/01/2018 08:59 PM, Hrishikesh Kulkarni wrote:
>>>>>
>>>>> Hi,
>>>>> I have pushed the changes to github
>>>>> (https://github.com/hrisearch/gcc). Added a command line option for
>>>>> specific dumps of variables and functions used in IL e.g.
>>>>> -fdump-lto-list=foo will dump:
>>>>> Call Graph:
>>>>>
>>>>> foo/1 (foo)
>>>>>    Type: function
>>>>>   visibility: default
>>>>
>>>>
>>>> Hi.
>>>>
>>>> Thanks for the next step. I've got some comments about it:
>>>>
>>>> - -fdump-lto-list=foo is wrong option name, I would use -fdump-lto-symbol
>>>>    or something similar.


Re: [GSOC] LTO dump tool project

2018-06-08 Thread Hrishikesh Kulkarni
Hi,

Linking fails because the debug_function() I am using is defined in
tree-cfg.c. How should I go about adding it to the makefile, considering
the dependencies?

Please find the diff file attached herewith.

Regards,
Hrishikesh

On Tue, Jun 5, 2018 at 12:38 AM, Martin Liška  wrote:
> On 06/04/2018 08:13 PM, Hrishikesh Kulkarni wrote:
>>
>> Hi,
>>
>> -fdump-lto-list will dump all the symbol list
>
>
> I see extra new lines in the output:
>
> $ lto1 -fdump-lto-list main.o
> [..snip..]
> Symbol Table
>                 Name                Type          Visibility
>
>            fwrite/15            function             default
>
>               foo/14            function             default
>
>          mystring/12            variable             default
>
>              pole/11            variable             default
>
>              main/13            function             default
>
>> -fdump-lto-list -demangle will dump all the list with symbol names
>> demangled
>
>
> Good for now. Note that the non-demangled version prints function names with order
> (/$number).
> I would not print that.
>
>> -fdump-lto-symbol=foo will dump details of foo
>
>
> I would really prefer to use symtab_node::debug for now. It presents all
> details about
> symbol instead of current implementation which does: '-fdump-lto-list | grep
> foo'
>
>>
>> The output(demangled) will be in tabular form like nm:
>> Symbol Table
>>                 Name                Type          Visibility
>>               printf            function             default
>>                    k            variable             default
>>                 main            function             default
>>                  bar            function             default
>>                  foo            function             default
>>
>> I have tried to format the changes according to gnu coding style and
>> added required methods in symtab_node.
>
>
> That's nice that you came up with new symbol_node methods. It's much better.
> About the GNU coding style, I still see trailing whitespace:
>
> === ERROR type #3: there should be no space before a left square bracket (1
> error(s)) ===
> gcc/symtab.c:818:26:  return visibility_types [DECL_VISIBILITY (decl)];
>
> === ERROR type #4: trailing whitespace (6 error(s)) ===
> gcc/lto/lto-dump.c:40:4:void█
> gcc/lto/lto-dump.c:45:0:█
> gcc/lto/lto-dump.c:56:52:   fprintf (stderr,
> "\n%20s",(flag_lto_dump_demangle)█
> gcc/lto/lto-dump.c:73:53:   fprintf (stderr,
> "\n%20s",(flag_lto_dump_demangle)█
> gcc/lto/lto-dump.c:78:2:}█
> gcc/lto/lto-dump.c:79:1:}█
>
> Martin
>
>
>>
>> Please find the diff file attached.
>>
>> Regards,
>> Hrishikesh
>>
>> On Mon, Jun 4, 2018 at 2:06 PM, Martin Liška  wrote:
>>>
>>> On 06/01/2018 08:59 PM, Hrishikesh Kulkarni wrote:

>>>> Hi,
>>>> I have pushed the changes to github
>>>> (https://github.com/hrisearch/gcc). Added a command line option for
>>>> specific dumps of variables and functions used in IL e.g.
>>>> -fdump-lto-list=foo will dump:
>>>> Call Graph:
>>>>
>>>> foo/1 (foo)
>>>>    Type: function
>>>>   visibility: default
>>>
>>>
>>> Hi.
>>>
>>> Thanks for the next step. I've got some comments about it:
>>>
>>> - -fdump-lto-list=foo is wrong option name, I would use -fdump-lto-symbol
>>>or something similar.
>>>
>>> - for -fdump-lto-list I would really prefer to use a format similar to
>>> nm:
>>>print a header with column description and then one line for a symbol
>>>
>>> - think about mangling/demangling of C++ symbols, you can take a look at
>>> nm it also has --demangle, --no-demangle
>>>
>>> - please learn & try to use an autoformat for your editor in order to
>>>fulfill GNU coding style. Following checker will help you:
>>>
>>> $ ./contrib/check_GNU_style.py /tmp/p
>>> === ERROR type #1: dot, space, space, end of comment (6 error(s)) ===
>>> gcc/lto/lto-dump.c:38:17:/*Dump everything*/
>>> gcc/lto/lto-dump.c:44:41:/*Dump variables and functions used in IL*/
>>> gcc/lto/lto-dump.c:73:50:/*Dump specific variables and functions used in
>>> IL*/
>>> gcc/lto/lto.c:3364:19:  /*Dump everything*/
>>> gcc/lto/lto.c:3368:43:  /*Dump variables and functions used in IL*/
>>> gcc/lto/lto.c:3372:52:  /*Dump specific variables and functions used in
>>> IL*/
>>>
>>> === ERROR type #2: lines should not exceed 80 characters (11 error(s))
>>> ===
>>> gcc/lto/lto-dump.c:51:80:static const char * const
>>> symtab_type_names[] = {"symbol", "function", "variable"};
>>> gcc/lto/lto-dump.c:56:80:fprintf (stderr, "\n%s (%s)",
>>> cnode->dump_asm_name (), cnode->name ());
>>> gcc/lto/lto-dump.c:57:80:fprintf (stderr, "\n  Type: %s",
>>> symtab_type_names[cnode->type]);
>>> gcc/lto/lto-dump.c:66:80:fprintf (stderr, "\n%s (%s)",
>>> vnode->dump_asm_name (), vnode->name ());
>>> gcc/lto/lto-dump.c:67:80:fprintf (stderr, "\n  Type: %s",
>>> symtab_type_names[vnode->type]);
>>> gcc/lto/lto-dump.c:80:80:   

aarch64-none-elf build broken

2018-06-08 Thread Christophe Lyon
Hi,

As I reported in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78870#c16, the build of
GCC for aarch64*-none-elf fails when configuring libstdc++ since
r261034 (a week ago).

The root cause is PR66203 which I reported quite some time ago, which
points to a newlib problem: on aarch64 there is no default rom
monitor, one has to explicitly use a --specs flag for the link to
succeed.

Maybe I missed a change about this in newlib, and I should upgrade the
version I use for GCC automatic validations?

If not, and if there is not much interest in these configurations,
maybe I should just drop them from my list? Alternatively, I could try
to use LDFLAGS_FOR_TARGET=--specs=rdimon.specs in my validation
scripts.

Or, better, are there plans to fix this?

I ask, because I have no immediate plans to look at this.

Thanks,

Christophe


Re: decrement_and_branch_until_zero pattern

2018-06-08 Thread Paul Koning



> On Jun 8, 2018, at 6:59 AM, Andreas Schwab  wrote:
> 
> On Jun 07 2018, Paul Koning  wrote:
> 
>> None of these seem to use that loop optimization, with -O2 or -Os.  Did I
>> miss some magic switch to turn on something else that isn't on by default?
>> Or is this a feature that was broken long ago and not noticed?  If so, any
>> hints where I might look for a reason?
> 
> commit 7d3c6452d7
> Author: rakdver 
> Date:   Thu Mar 2 23:50:55 2006 +
> 
>* loop.c: Removed.

Interesting.  The ChangeLog doesn't give any background.  I suppose I should
plan to approximate the effect of this pattern with a define_peephole2?

paul
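
For context, the loop shape that a decrement-and-branch pattern targets is a counted loop whose counter only runs down to zero. The following is a minimal sketch, not GCC code: `sum_countdown` is a hypothetical name, and whether a given back end emits a single decrement-and-branch instruction for it depends on the target and the GCC version.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sums n elements using a counter that runs down to
// zero and is used for nothing else.  The loop tail "--n != 0" is the
// shape a decrement-and-branch-until-zero instruction implements.
int sum_countdown (const int *data, std::size_t n)
{
  assert (data != nullptr || n == 0);
  int sum = 0;
  if (n == 0)
    return sum;
  do
    {
      sum += *data++;
    }
  while (--n != 0);
  return sum;
}
```

Compiling a function like this at -O2 or -Os and inspecting the generated assembly is one way to check whether the pattern is being used on a given port.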



Re: current state of gcc-ia16?

2018-06-08 Thread Andrew Jenner

On 08/06/2018 12:43, Dennis Luehring wrote:

> is the patch already integrated into mainline?
>
> No, it's not.
>
> will that ever happen?

Hard to say. There's no reason in principle why it couldn't happen, but 
there's not a big demand for it, so it would require someone taking the 
time and trouble to do it. It's not trivial, though - the current 
implementation has some middle-end changes which would need thinking 
through and doing properly to avoid polluting that code with ia16-isms.


I might update it and have another try at upstreaming it at some point 
if nobody else does it first, but I have too much else going on at the 
moment so it would likely be a year or two (maybe more) before I get to it.


Andrew


Re: current state of gcc-ia16?

2018-06-08 Thread Dennis Luehring

> is the patch already integrated into mainline?
>
> No, it's not.

will that ever happen?

> is this the most recent development place?
> https://github.com/tkchia/gcc-ia16
>
> Yes, that's the right place.

thx


On 08.06.2018 at 12:59, Andrew Jenner wrote:

Hi Dennis,

On 08/06/2018 11:37, Dennis Luehring wrote:
> is the patch already integrated into mainline?

No, it's not.

> is this the most recent development place?
> https://github.com/tkchia/gcc-ia16

Yes, that's the right place.

Andrew





Re: current state of gcc-ia16?

2018-06-08 Thread Andrew Jenner

Hi Dennis,

On 08/06/2018 11:37, Dennis Luehring wrote:

> is the patch already integrated into mainline?

No, it's not.

> is this the most recent development place?
> https://github.com/tkchia/gcc-ia16

Yes, that's the right place.

Andrew


Re: decrement_and_branch_until_zero pattern

2018-06-08 Thread Andreas Schwab
On Jun 07 2018, Paul Koning  wrote:

> None of these seem to use that loop optimization, with -O2 or -Os.  Did I
> miss some magic switch to turn on something else that isn't on by default?
> Or is this a feature that was broken long ago and not noticed?  If so, any
> hints where I might look for a reason?

commit 7d3c6452d7
Author: rakdver 
Date:   Thu Mar 2 23:50:55 2006 +

* loop.c: Removed.



git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@111650 138bc75d-0d04-0410-961f-82ee72b054a4

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


current state of gcc-ia16?

2018-06-08 Thread Dennis Luehring

is the patch already integrated into mainline?

is this the most recent development place?
https://github.com/tkchia/gcc-ia16



overflow check in extract_range_from_binary_1 useless?

2018-06-08 Thread Aldy Hernandez

Howdy.

Am I missing something or are these two sets of bounds identical?


  /* Get the lower and upper bounds of the type.  */
  if (TYPE_OVERFLOW_WRAPS (expr_type))
    {
      type_min = wi::min_value (prec, sgn);
      type_max = wi::max_value (prec, sgn);
    }
  else
    {
      type_min = wi::to_wide (vrp_val_min (expr_type));
      type_max = wi::to_wide (vrp_val_max (expr_type));
    }


Isn't wi::to_wide (TYPE_MIN/MAX_VALUE) the same as wi::min/max_value, or
is there some weird language (*cough* Ada) subtlety I'm missing?


Confused.
Aldy
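
One way to frame the question: the two branches give the same answer whenever a type's declared bounds span everything its precision can represent. They differ only if a front end declares narrower bounds, as a range-restricted type would. Whether such types actually reach this code is the open question. The sketch below is a toy model under that assumption: `ToyType`, `precision_min`, `precision_max` and `bounds_agree` are hypothetical names, not GCC's tree or wide_int API.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an integer type: precision in bits, signedness, and the
// bounds the front end declared.  This is NOT GCC's tree or wide_int API.
struct ToyType
{
  unsigned prec;              // 1..32 in this sketch
  bool is_signed;
  std::int64_t declared_min;  // plays the role of TYPE_MIN_VALUE
  std::int64_t declared_max;  // plays the role of TYPE_MAX_VALUE
};

// Lower bound implied by precision and signedness alone.
std::int64_t precision_min (const ToyType &t)
{
  assert (t.prec >= 1 && t.prec <= 32);
  return t.is_signed ? -(std::int64_t (1) << (t.prec - 1)) : 0;
}

// Upper bound implied by precision and signedness alone.
std::int64_t precision_max (const ToyType &t)
{
  assert (t.prec >= 1 && t.prec <= 32);
  unsigned value_bits = t.is_signed ? t.prec - 1 : t.prec;
  return (std::int64_t (1) << value_bits) - 1;
}

// True when both ways of computing the bounds give the same answer.
bool bounds_agree (const ToyType &t)
{
  return (t.declared_min == precision_min (t)
          && t.declared_max == precision_max (t));
}
```

In this model an 8-bit signed type declared as -128..127 agrees with its precision bounds, while an 8-bit type declared as 1..10 does not. The second case is the only one where the else branch above would matter.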