Re: [C++ coroutines 7/7] libiberty demangler update.

2020-01-11 Thread Ian Lance Taylor
Iain Sandoe  writes:

> Ian Lance Taylor  wrote:
>
>> Iain Sandoe  writes:
>
>>> 2020-01-09  Iain Sandoe  
>>> 
>>> * cp-demangle.c (cplus_demangle_operators): Add the co_await
>>> operator.
>> 
>> Please add something to libiberty/testsuite/demangle-expected.  Thanks.
>
> done***,
> OK now?
> thanks
> Iain
>
> *** it seems that the demangle test is broken for x86_64-darwin & linux on 
> trunk
> anyway, (I ‘fixed’ it locally to test my patch below; will try to figure out 
> if that fix is
> correct and post it if so) - however, that’s unconnected with the coroutines 
> stuff.
>
> ==
>
> revised:
> --
> 
> The coroutines implementation introduces a new operator with a
> mangling of 'aw'.  This patch adds that to libiberty's demangler.
> 
> libiberty/ChangeLog:
> 
> 2020-01-11  Iain Sandoe  
> 
> * cp-demangle.c (cplus_demangle_operators): Add the co_await
> operator.
> * testsuite/demangle-expected: Append a test for the co_await
> operator mangling.

This is OK.

Thanks.

Ian


[PATCH] Decrease cortexa57_extra_costs's alu.shift_reg

2020-01-11 Thread apinski
From: Andrew Pinski 

Like I mentioned in https://gcc.gnu.org/ml/gcc/2020-01/msg00157.html,
The shift by a register should be just COSTS_N_INSNS (1) rather than
COSTS_N_INSNS (2).  This allows lshift_cheap_p to return true now
and converting switches to be using shift and other like
structures.  I noticed this difference when I was working
through PR 93131 and understanding what reassoc could handle.

OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
* config/arm/aarch-cost-tables.h (cortexa57_extra_costs): Change
alu.shift_reg to 0.
---
 gcc/config/arm/aarch-cost-tables.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/arm/aarch-cost-tables.h 
b/gcc/config/arm/aarch-cost-tables.h
index 6a30d92cde9..cf818659901 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -235,7 +235,7 @@ const struct cpu_cost_table cortexa57_extra_costs =
 0, /* arith.  */
 0, /* logical.  */
 0, /* shift.  */
-COSTS_N_INSNS (1), /* shift_reg.  */
+0, /* shift_reg.  */
 COSTS_N_INSNS (1), /* arith_shift.  */
 COSTS_N_INSNS (1), /* arith_shift_reg.  */
 COSTS_N_INSNS (1), /* log_shift.  */
-- 
2.17.1



[PATCHv2] Add initial octeontx2 support.

2020-01-11 Thread apinski
From: Andrew Pinski 

This adds octeontx2 naming.  It currently uses the cortexa57
cost model and schedule model until I submit this.  This is
more a place holder to get the naming of the cores in GCC 10.
I will submit the cost model in the next couple of days.

OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.

v2 changes: Add documentation and fix minor whitespace issues before the `,'.

Thanks,
Andrew Pinski

ChangeLog:
* config/aarch64/aarch64-cores.def (octeontx2): New define.
(octeontx2t98): New define.
(octeontx2t96): New define.
(octeontx2t93): New define.
(octeontx2f95): New define.
(octeontx2f95n): New define.
(octeontx2f95mm): New define.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (-mcpu=): Document the new cpu types.

Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64-cores.def | 10 ++
 gcc/config/aarch64/aarch64-tune.md   |  2 +-
 gcc/doc/invoke.texi  |  6 +-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 2dd2b86bd92..ea9b98b4b0a 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -109,6 +109,16 @@ AARCH64_CORE("ares",  ares, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_F
 AARCH64_CORE("neoverse-n1",  neoversen1, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD 
| AARCH64_FL_PROFILE, neoversen1, 0x41, 0xd0c, -1)
 AARCH64_CORE("neoverse-e1",  neoversee1, cortexa53, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD 
| AARCH64_FL_SSBS, cortexa73, 0x41, 0xd4a, -1)
 
+/* Cavium ('C') cores. */
+AARCH64_CORE("octeontx2",  octeontx2,  cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 
0x43, 0x0b0, -1)
+AARCH64_CORE("octeontx2t98",   octeontx2t98,   cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 
0x43, 0x0b1, -1)
+AARCH64_CORE("octeontx2t96",   octeontx2t96,   cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 
0x43, 0x0b2, -1)
+/* Note OcteonTX2 T93 is an alias to OcteonTX2 T96. */
+AARCH64_CORE("octeontx2t93",   octeontx2t93,   cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 
0x43, 0x0b2, -1)
+AARCH64_CORE("octeontx2f95",   octeontx2f95,   cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 
0x43, 0x0b3, -1)
+AARCH64_CORE("octeontx2f95n",  octeontx2f95n,  cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 
0x43, 0x0b4, -1)
+AARCH64_CORE("octeontx2f95mm", octeontx2f95mm, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 
0x43, 0x0b5, -1)
+
 /* HiSilicon ('H') cores. */
 AARCH64_CORE("tsv110",  tsv110, tsv110, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | 
AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,  
 0x48, 0xd01, -1)
 
diff --git a/gcc/config/aarch64/aarch64-tune.md 
b/gcc/config/aarch64/aarch64-tune.md
index a6a14b7fc77..3cc1c4d761f 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   
"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
+   
"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f2c805c0a64..279b97f51ba 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16323,7 +16323,11 @@ performance of the code.  Permissible values for this 
option are:
 @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor},
 @samp{neoverse-e1},@samp{neoverse-n1},@samp{qdf24xx}, @samp{saphira},
 @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{octeontx

Re: [PATCH] PR90838: Support ctz idioms

2020-01-11 Thread Segher Boessenkool
Hi!

On Fri, Jan 10, 2020 at 10:35:04PM +0100, Jakub Jelinek wrote:
> On Thu, Jan 09, 2020 at 02:26:10PM +0100, Richard Biener wrote:
> > > >> +  tree lhs = gimple_assign_lhs (stmt);
> > > >> +  bool zero_ok = CTZ_DEFINED_VALUE_AT_ZERO (TYPE_MODE (type), 
> > > >> val);
> > > >
> > > > since we're using the optab entry shouldn't you check for == 2 here?
> > >
> > > Yes, that looks more correct (it's not clear what 1 means exactly).
> 
> 1 is what e.g. x86 uses with -mbmi, and I think it is what most targets
> actually define if they have any defined value there, exception is
> aarch64/arm, powerpc, gcn, nvptx and mips.

1 vs. 2 only makes a difference when expanding ffs, it seems.

> Given:
> int foo (int x) { return __builtin_ctz (x); }
> int bar (int x) { return x ? __builtin_ctz (x) : 32; }
> on x86 -mbmi we with CTZ_DEFINED_VALUE_AT_ZERO being 1 and value type_size
> emit the same code, we'd need to find out if that is something done for all
> targets or just a few,

It doesn't do that for rs6000 (not for clz either, which is easier for us,
the expansion of ctz depends on multiple factors).  (Note that "foo" has
UB whenever x == 0, btw. -- it would be nice if something like "bar" would
generate the optimal code as well.  Is there some other code that will just
work?)

> and if for all that have CTZ_DEFINED_VALUE_AT_ZERO 1,
> we could perhaps just emit branchy code and wait for RTL to fix that up.

Where would RTL fix that?  In what pass, I mean.


Segher


Re: [committed] Unbreak bootstrap on most targets (was Re: [PATCH] PR90838: Support ctz idioms)

2020-01-11 Thread Jakub Jelinek
On Sat, Jan 11, 2020 at 05:24:19PM +0100, Andreas Schwab wrote:
> ../../gcc/tree-ssa-forwprop.c: In function 'bool 
> simplify_count_trailing_zeroes(gimple_stmt_iterator*)':
> ../../gcc/tree-ssa-forwprop.c:1925:23: error: variable 'mode' set but not 
> used [-Werror=unused-but-set-variable]
>  1925 |   scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type);
>   |   ^~~~

Oops, then I think we need following, but can't commit it until Monday.

2020-01-11  Jakub Jelinek  

PR tree-optimization/90838
* tree-ssa-forwprop.c (simplify_count_trailing_zeroes): Use
SCALAR_INT_TYPE_MODE directly in CTZ_DEFINED_VALUE_AT_ZERO macro
argument rather than to initialize temporary for targets that
don't use the mode argument at all.

--- gcc/tree-ssa-forwprop.c.jj  2020-01-11 16:31:56.278274863 +0100
+++ gcc/tree-ssa-forwprop.c 2020-01-11 17:27:55.336925656 +0100
@@ -1922,8 +1922,8 @@ simplify_count_trailing_zeroes (gimple_s
   tree type = TREE_TYPE (res_ops[0]);
   HOST_WIDE_INT ctzval;
   HOST_WIDE_INT type_size = tree_to_shwi (TYPE_SIZE (type));
-  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type);
-  bool zero_ok = CTZ_DEFINED_VALUE_AT_ZERO (mode, ctzval) == 2;
+  bool zero_ok
+   = CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type), ctzval) == 2;
 
   /* Skip if there is no value defined at zero, or if we can't easily
 return the correct value for zero.  */


Jakub



Re: [committed] Unbreak bootstrap on most targets (was Re: [PATCH] PR90838: Support ctz idioms)

2020-01-11 Thread Andreas Schwab
../../gcc/tree-ssa-forwprop.c: In function 'bool 
simplify_count_trailing_zeroes(gimple_stmt_iterator*)':
../../gcc/tree-ssa-forwprop.c:1925:23: error: variable 'mode' set but not used 
[-Werror=unused-but-set-variable]
 1925 |   scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type);
  |   ^~~~

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [wwwdocs] Git transition - how to access private user and vendor branches

2020-01-11 Thread Richard Earnshaw (lists)
On 11/01/2020 15:41, Segher Boessenkool wrote:
> Hi Richard,
> 
> On Thu, Jan 09, 2020 at 04:50:03PM +, Richard Earnshaw (lists) wrote:
>> Given the proposed intention to use non-standard refspecs for private 
>> and vendor branches I've written some notes on how to use these.
>> 
>> It would be helpful if someone could do some experiments to ensure that 
>> what I've written works properly for all versions of git, not just the 
>> one I have installed locally.
> 
> I haven't been able to test it yet, but it does look like it will work.
> 
>> +To fetch any of these you will need to add a fetch refspec to your
>> +configuration.  For example, to fetch all the IBM vendor branches add to
>> +your default upstream pull
>> +
>> +
>> +git config --add remote.origin.fetch 
>> "+refs/vendors/IBM/heads/*:refs/remotes/origin/IBM/*"
>> +git config --add remote.origin.fetch 
>> "+refs/vendors/IBM/tags/*:refs/tags/IBM/*"
>> +
> 
> It may help to show the resulting config file instead of (or in addition
> to) git-config commands to manipulate that.  The config file is (or can
> be) much more readable than those commands (shorter lines, less quoting).
> You can also organise it, put in comments, and even *debug it*!  And
> *understand it* in general!  Wow, what a concept :-)

I wanted to describe it in the 'official' git way.  Tweaking the
contents of your .git/config file is not really sanctioned, even though
we all do it. :-)

> 
>> +
>> +git checkout -b ARM/arm-9-branch origin/ARM/arm-9-branch
>> +
>> +
>> +then change the merge property for the branch to corectly identify the
> 
> (typo)

Already fixed, didn't seem like it was worth reposting just for that.

> 
>> +upstream source
>> +
>> +
>> +git config branch.ARM/arm-9-branch.merge refs/vendors/ARM/heads/arm-9-branch
>> +
>> +
>> +Pull operations will now correctly track the upstream branch.
> 
> Do you also need to set branch..remote?  Or is that done correctly
> already?  Or is that not needed if you have only one remote?

Yes.

However, on reflection, I'm not sure about that bit anyway.  I may
remote it entirely.


> 
>> +It is also possible to set up push operations so that local changes will be 
>> pushed to the private namespace.  For example, if you mirror your own 
>> private git information with
>> +
>> +
>> +git config --add remote.origin.fetch 
>> "+refs/users/my-userid/heads/*:refs/remotes/origin/me/*"
>> +
>> +
>> +then you can also add 
>> +
>> +git config --add remote.origin.push 
>> "+refs/heads/me/*:refs/users/my-userid/heads/*"
>> +
>> +and then any push from a branch that begins with me/ will be 
>> pushed to the private area.
> 
> Will be *force-pushed* to the server.  This may not be expected or
> wanted.
> 
> Why would people want to name their local branches "me/thing" instead
> of just "thing", btw?  You could just use
> 
>   push = thing:users//heads/thing

You could do that instead, but then you need a push line for every
branch.  The me/ trick means that anything named me/*
will automatically get sent to the right place if pushed upstream.

> 
> (no "+", I don't rebase the "thing" branch).  Hrm, and maybe make an
> alias to push a local branch to the server the first time...  Completely
> untested:
> 
> [alias]
>     new-user-branch = !git push $1:users/myname/heads/$1
> 
> (And yes, I know this doesn't work as written if you have tag names
> the same as branch names.  That *deserves* punishment :-) )
> 
> 
> Segher

Thanks for the feedback.  Also see my scripts for setting up user/vendor
stuff that I posted yesterday, it might simplify some of the above setup
steps and thus the documentation, but I wrote those scripts after I had
written this text.

R.


Re: [wwwdocs] Git transition - how to access private user and vendor branches

2020-01-11 Thread Segher Boessenkool
Hi Richard,

On Thu, Jan 09, 2020 at 04:50:03PM +, Richard Earnshaw (lists) wrote:
> Given the proposed intention to use non-standard refspecs for private 
> and vendor branches I've written some notes on how to use these.
> 
> It would be helpful if someone could do some experiments to ensure that 
> what I've written works properly for all versions of git, not just the 
> one I have installed locally.

I haven't been able to test it yet, but it does look like it will work.

> +To fetch any of these you will need to add a fetch refspec to your
> +configuration.  For example, to fetch all the IBM vendor branches add to
> +your default upstream pull
> +
> +
> +git config --add remote.origin.fetch 
> "+refs/vendors/IBM/heads/*:refs/remotes/origin/IBM/*"
> +git config --add remote.origin.fetch 
> "+refs/vendors/IBM/tags/*:refs/tags/IBM/*"
> +

It may help to show the resulting config file instead of (or in addition
to) git-config commands to manipulate that.  The config file is (or can
be) much more readable than those commands (shorter lines, less quoting).
You can also organise it, put in comments, and even *debug it*!  And
*understand it* in general!  Wow, what a concept :-)

> +
> +git checkout -b ARM/arm-9-branch origin/ARM/arm-9-branch
> +
> +
> +then change the merge property for the branch to corectly identify the

(typo)

> +upstream source
> +
> +
> +git config branch.ARM/arm-9-branch.merge refs/vendors/ARM/heads/arm-9-branch
> +
> +
> +Pull operations will now correctly track the upstream branch.

Do you also need to set branch..remote?  Or is that done correctly
already?  Or is that not needed if you have only one remote?

> +It is also possible to set up push operations so that local changes will be 
> pushed to the private namespace.  For example, if you mirror your own private 
> git information with
> +
> +
> +git config --add remote.origin.fetch 
> "+refs/users/my-userid/heads/*:refs/remotes/origin/me/*"
> +
> +
> +then you can also add 
> +
> +git config --add remote.origin.push 
> "+refs/heads/me/*:refs/users/my-userid/heads/*"
> +
> +and then any push from a branch that begins with me/ will be 
> pushed to the private area.

Will be *force-pushed* to the server.  This may not be expected or
wanted.

Why would people want to name their local branches "me/thing" instead
of just "thing", btw?  You could just use

  push = thing:users//heads/thing

(no "+", I don't rebase the "thing" branch).  Hrm, and maybe make an
alias to push a local branch to the server the first time...  Completely
untested:

[alias]
new-user-branch = !git push $1:users/myname/heads/$1

(And yes, I know this doesn't work as written if you have tag names
the same as branch names.  That *deserves* punishment :-) )


Segher


[C++ coroutines 7/7] libiberty demangler update.

2020-01-11 Thread Iain Sandoe
Ian Lance Taylor  wrote:

> Iain Sandoe  writes:

>> 2020-01-09  Iain Sandoe  
>> 
>>  * cp-demangle.c (cplus_demangle_operators): Add the co_await
>>  operator.
> 
> Please add something to libiberty/testsuite/demangle-expected.  Thanks.

done***,
OK now?
thanks
Iain

*** it seems that the demangle test is broken for x86_64-darwin & linux on trunk
anyway, (I ‘fixed’ it locally to test my patch below; will try to figure out if 
that fix is
correct and post it if so) - however, that’s unconnected with the coroutines 
stuff.

==

revised:
--

The coroutines implementation introduces a new operator with a
mangling of 'aw'.  This patch adds that to libiberty's demangler.

libiberty/ChangeLog:

2020-01-11  Iain Sandoe  

* cp-demangle.c (cplus_demangle_operators): Add the co_await
operator.
* testsuite/demangle-expected: Append a test for the co_await
operator mangling.

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 3639bfbfd4..fc55b7fae1 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1802,6 +1802,7 @@ const struct demangle_operator_info 
cplus_demangle_operators[] =
   { "ad", NL ("&"), 1 },
   { "an", NL ("&"), 2 },
   { "at", NL ("alignof "),   1 },
+  { "aw", NL ("co_await "), 1 },
   { "az", NL ("alignof "),   1 },
   { "cc", NL ("const_cast"), 2 },
   { "cl", NL ("()"),2 },
diff --git a/libiberty/testsuite/demangle-expected 
b/libiberty/testsuite/demangle-expected
index 5878d96dee..e6a75601e0 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -1450,3 +1450,7 @@ Foo()::{lambda(auto:1)#1}::operator()(char) 
const::X::fn
 
 _Z3fooILPv0EEvPN9enable_ifIXeqT_LDnEEvE4typeE
 void foo<(void*)0>(enable_if<((void*)0)==((decltype(nullptr))), void>::type*)
+
+_ZNK5coro15emptyawEv
+coro1::empty::operator co_await() const
+



[wwwdocs] lists.html - Avoid two references to a concrete version control system (SVN here).

2020-01-11 Thread Gerald Pfeifer
Already when we converted from CVS to SVN aeons ago did I work
to reduce references to a concrete system.

These are two further instances, now that we are moving to GIT.

Committed.

Gerald


- Log -
commit 370969e4d49ad3defb1fddaa239dbab955896a3a
Author: Gerald Pfeifer 
Date:   Sat Jan 11 14:02:34 2020 +0100

diff --git a/htdocs/lists.html b/htdocs/lists.html
index 8ca456a..b05e171 100644
--- a/htdocs/lists.html
+++ b/htdocs/lists.html
@@ -99,11 +99,11 @@ before subscribing and posting to 
these lists.
 
   https://gcc.gnu.org/ml/gcc-cvs/";>gcc-cvs
   is a read-only, relatively high volume list which tracks checkins to the
-  GCC SVN repository.
+  GCC source repository.
 
   https://gcc.gnu.org/ml/libstdc++-cvs/";>libstdc++-cvs
   is a read-only, relatively low volume list which tracks checkins to
-  the libstdc++-v3 part of the GCC SVN repository.  This is a subset
+  the libstdc++-v3 part of the GCC repository.  This is a subset
   of the messages to gcc-cvs.
 
   https://gcc.gnu.org/ml/gcc-cvs-wwwdocs/";>gcc-cvs-wwwdocs


RE: [PATCH] Add Optimization for various IPA parameters.

2020-01-11 Thread Tamar Christina
Hi Martin,

This change (r280099) is causing a major performance regression on exchange2 in 
SPEC2017 dropping the benchmark by more than 30%.

It seems the parameters no longer do anything. i.e. -flto --param 
ipa-cp-eval-threshold=1 --param ipa-cp-unit-growth=80 doesn't have any effect 
anymore.

Thanks,
Tamar

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org 
> On Behalf Of Martin Jambor
> Sent: Wednesday, January 8, 2020 4:14 PM
> To: Martin Liška ; gcc-patches@gcc.gnu.org
> Cc: hubi...@ucw.cz; Richard Biener 
> Subject: Re: [PATCH] Add Optimization for various IPA parameters.
> 
> Hi,
> 
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time, one needs to
> > use opt_for_fn in order to get the right parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_cp_unit_growth
> 
> So as we discussed, picking this one from one particular node is not what one
> would expect to happen, but inlining does it too and so anyway:
> 
> 
> IPA-CP: Always access param_ipcp_unit_growth through opt_for_fn
> 
> 2020-01-07  Martin Jambor  
> 
>   * params.opt (param_ipcp_unit_growth): Mark as Optimization.
>   * ipa-cp.c (max_new_size): Removed.
>   (orig_overall_size): New variable.
>   (get_max_overall_size): New function.
>   (estimate_local_effects): Use it.  Adjust dump.
>   (decide_about_value): Likewise.
>   (ipcp_propagate_stage): Do not calculate max_new_size, just store
>   orig_overall_size.  Adjust dump.
>   (ipa_cp_c_finalize): Clear orig_overall_size instead of max_new_size.
> ---
>  gcc/ipa-cp.c   | 39 ++-
>  gcc/params.opt |  2 +-
>  2 files changed, 27 insertions(+), 14 deletions(-)
> 
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 9e20e278eff..c2572e3e0e8
> 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -375,7 +375,7 @@ static profile_count max_count;
> 
>  /* Original overall size of the program.  */
> 
> -static long overall_size, max_new_size;
> +static long overall_size, orig_overall_size;
> 
>  /* Node name to unique clone suffix number map.  */  static
> hash_map *clone_num_suffixes; @@ -3395,6
> +3395,23 @@ perform_estimation_of_a_value (cgraph_node *node,
> vec known_csts,
>val->local_size_cost = size;
>  }
> 
> +/* Get the overall limit oof growth based on parameters extracted from
> growth.
> +   it does not really make sense to mix functions with different overall
> growth
> +   limits but it is possible and if it happens, we do not want to select one
> +   limit at random.  */
> +
> +static long
> +get_max_overall_size (cgraph_node *node) {
> +  long max_new_size = orig_overall_size;
> +  long large_unit = opt_for_fn (node->decl, param_large_unit_insns);
> +  if (max_new_size < large_unit)
> +max_new_size = large_unit;
> +  int unit_growth = opt_for_fn (node->decl, param_ipcp_unit_growth);
> +  max_new_size += max_new_size * unit_growth / 100 + 1;
> +  return max_new_size;
> +}
> +
>  /* Iterate over known values of parameters of NODE and estimate the local
> effects in terms of time and size they have.  */
> 
> @@ -3457,7 +3474,7 @@ estimate_local_effects (struct cgraph_node *node)
>  stats.freq_sum, stats.count_sum,
>  size))
>   {
> -   if (size + overall_size <= max_new_size)
> +   if (size + overall_size <= get_max_overall_size (node))
>   {
> info->do_clone_for_all_contexts = true;
> overall_size += size;
> @@ -3467,8 +3484,8 @@ estimate_local_effects (struct cgraph_node *node)
>"known contexts, growth deemed beneficial.\n");
>   }
> else if (dump_file && (dump_flags & TDF_DETAILS))
> - fprintf (dump_file, "   Not cloning for all contexts because "
> -  "max_new_size would be reached with %li.\n",
> + fprintf (dump_file, "  Not cloning for all contexts because "
> +  "maximum unit size would be reached with %li.\n",
>size + overall_size);
>   }
>else if (dump_file && (dump_flags & TDF_DETAILS)) @@ -3860,14
> +3877,10 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
>  max_count = max_count.max (node->count.ipa ());
>}
> 
> -  max_new_size = overall_size;
> -  if (max_new_size < param_large_unit_insns)
> -max_new_size = param_large_unit_insns;
> -  max_new_size += max_new_size * param_ipcp_unit_growth / 100 + 1;
> +  orig_overall_size = overall_size;
> 
>if (dump_file)
> -fprintf (dump_file, "\noverall_size: %li, max_new_size: %li\n",
> -  overall_size, max_new_size);
> +fprintf (dump_file, "\noverall_size: %li\n", overall_size);
> 
>propagate_constants_topo (topo);
>if (flag_checking)
> @@ -5380,11 +5393,11 @@ decide_about_value (struct cgraph_node *node,
> 

Re: [PATCH] Add initial octeontx2 support.

2020-01-11 Thread Richard Sandiford
 writes:
> From: Andrew Pinski 
>
> This adds octeontx2 naming.  It currently uses the cortexa57
> cost model and schedule model until I submit this.  This is
> more a place holder to get the naming of the cores in GCC 10.
> I will submit the cost model in the next couple of days.
>
> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>
> Thanks,
> Andrew Pinski
>
> ChangeLog:
> * config/aarch64/aarch64-cores.def (octeontx2): New define.
> (octeontx2t98): New define.
> (octeontx2t96): New define.
> (octeontx2t93): New define.
> (octeontx2f95): New define.
> (octeontx2f95n): New define.
> (octeontx2f95mm): New define.
> * config/aarch64/aarch64-tune.md: Regenerate.
>
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/config/aarch64/aarch64-cores.def | 10 ++
>  gcc/config/aarch64/aarch64-tune.md   |  2 +-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 2dd2b86bd92..057ed1ee131 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -109,6 +109,16 @@ AARCH64_CORE("ares",  ares, cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_F
>  AARCH64_CORE("neoverse-n1",  neoversen1, cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
> AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, neoversen1, 0x41, 0xd0c, -1)
>  AARCH64_CORE("neoverse-e1",  neoversee1, cortexa53, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
> AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd4a, -1)
>  
> +/* Cavium ('C') cores. */
> +AARCH64_CORE("octeontx2",  octeontx2,  cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, 
> 0x43, 0x0b0, -1)
> +AARCH64_CORE("octeontx2t98",   octeontx2t98,   cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, 
> 0x43, 0x0b1, -1)
> +AARCH64_CORE("octeontx2t96",   octeontx2t96,   cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, 
> 0x43, 0x0b2, -1)
> +/* Note OcteonTX2 T93 is an alias to OcteonTX2 T96. */
> +AARCH64_CORE("octeontx2t93",   octeontx2t93,   cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, 
> 0x43, 0x0b2, -1)
> +AARCH64_CORE("octeontx2f95",   octeontx2f95,   cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, 
> 0x43, 0x0b3, -1)
> +AARCH64_CORE("octeontx2f95n",  octeontx2f95n,  cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, 
> 0x43, 0x0b4, -1)
> +AARCH64_CORE("octeontx2f95mm", octeontx2f95mm, cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, 
> 0x43, 0x0b5, -1)

Sorry for the nit, but: should be no space before ",".

The new option values need documenting in invoke.texi too.

Thanks,
Richard



> +
>  /* HiSilicon ('H') cores. */
>  AARCH64_CORE("tsv110",  tsv110, tsv110, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | 
> AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, 
> tsv110,   0x48, 0xd01, -1)
>  
> diff --git a/gcc/config/aarch64/aarch64-tune.md 
> b/gcc/config/aarch64/aarch64-tune.md
> index a6a14b7fc77..3cc1c4d761f 100644
> --- a/gcc/config/aarch64/aarch64-tune.md
> +++ b/gcc/config/aarch64/aarch64-tune.md
> @@ -1,5 +1,5 @@
>  ;; -*- buffer-read-only: t -*-
>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>  (define_attr "tune"
> - 
> "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
> + 
> "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>   (const (symbol_ref "((enum attr_tune) aarch64_tune)")))