Re: [C++ coroutines 7/7] libiberty demangler update.
Iain Sandoe writes: > Ian Lance Taylor wrote: > >> Iain Sandoe writes: > >>> 2020-01-09 Iain Sandoe >>> >>> * cp-demangle.c (cplus_demangle_operators): Add the co_await >>> operator. >> >> Please add something to libiberty/testsuite/demangle-expected. Thanks. > > done***, > OK now? > thanks > Iain > > *** it seems that the demangle test is broken for x86_64-darwin & linux on > trunk > anyway, (I ‘fixed’ it locally to test my patch below; will try to figure out > if that fix is > correct and post it if so) - however, that’s unconnected with the coroutines > stuff. > > == > > revised: > -- > > The coroutines implementation introduces a new operator with a > mangling of 'aw'. This patch adds that to libiberty's demangler. > > libiberty/ChangeLog: > > 2020-01-11 Iain Sandoe > > * cp-demangle.c (cplus_demangle_operators): Add the co_await > operator. > * testsuite/demangle-expected: Append a test for the co_await > operator mangling. This is OK. Thanks. Ian
[PATCH] Decrease cortexa57_extra_costs's alu.shift_reg
From: Andrew Pinski Like I mentioned in https://gcc.gnu.org/ml/gcc/2020-01/msg00157.html, The shift by a register should be just COSTS_N_INSNS (1) rather than COSTS_N_INSNS (2). This allows lshift_cheap_p to return true now and converting switches to be using shift and other like structures. I noticed this difference when I was working through PR 93131 and understanding what reassoc could handle. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski ChangeLog: * config/arm/aarch-cost-tables.h (cortexa57_extra_costs): Change alu.shift_reg to 0. --- gcc/config/arm/aarch-cost-tables.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h index 6a30d92cde9..cf818659901 100644 --- a/gcc/config/arm/aarch-cost-tables.h +++ b/gcc/config/arm/aarch-cost-tables.h @@ -235,7 +235,7 @@ const struct cpu_cost_table cortexa57_extra_costs = 0, /* arith. */ 0, /* logical. */ 0, /* shift. */ -COSTS_N_INSNS (1), /* shift_reg. */ +0, /* shift_reg. */ COSTS_N_INSNS (1), /* arith_shift. */ COSTS_N_INSNS (1), /* arith_shift_reg. */ COSTS_N_INSNS (1), /* log_shift. */ -- 2.17.1
[PATCHv2] Add initial octeontx2 support.
From: Andrew Pinski This adds octeontx2 naming. It currently uses the cortexa57 cost model and schedule model until I submit this. This is more a place holder to get the naming of the cores in GCC 10. I will submit the cost model in the next couple of days. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. v2 changes: Add documentation and fix minor whitespace issues before the `,'. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64-cores.def (octeontx2): New define. (octeontx2t98): New define. (octeontx2t96): New define. (octeontx2t93): New define. (octeontx2f95): New define. (octeontx2f95n): New define. (octeontx2f95mm): New define. * config/aarch64/aarch64-tune.md: Regenerate. * doc/invoke.texi (-mcpu=): Document the new cpu types. Signed-off-by: Andrew Pinski --- gcc/config/aarch64/aarch64-cores.def | 10 ++ gcc/config/aarch64/aarch64-tune.md | 2 +- gcc/doc/invoke.texi | 6 +- 3 files changed, 16 insertions(+), 2 deletions(-) diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index 2dd2b86bd92..ea9b98b4b0a 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -109,6 +109,16 @@ AARCH64_CORE("ares", ares, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_F AARCH64_CORE("neoverse-n1", neoversen1, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, neoversen1, 0x41, 0xd0c, -1) AARCH64_CORE("neoverse-e1", neoversee1, cortexa53, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd4a, -1) +/* Cavium ('C') cores. */ +AARCH64_CORE("octeontx2", octeontx2, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 0x43, 0x0b0, -1) +AARCH64_CORE("octeontx2t98", octeontx2t98, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 0x43, 0x0b1, -1) +AARCH64_CORE("octeontx2t96", octeontx2t96, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 0x43, 0x0b2, -1) +/* Note OcteonTX2 T93 is an alias to OcteonTX2 T96. */ +AARCH64_CORE("octeontx2t93", octeontx2t93, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 0x43, 0x0b2, -1) +AARCH64_CORE("octeontx2f95", octeontx2f95, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 0x43, 0x0b3, -1) +AARCH64_CORE("octeontx2f95n", octeontx2f95n, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 0x43, 0x0b4, -1) +AARCH64_CORE("octeontx2f95mm", octeontx2f95mm, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE, cortexa57, 0x43, 0x0b5, -1) + /* HiSilicon ('H') cores. */ AARCH64_CORE("tsv110", tsv110, tsv110, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110, 0x48, 0xd01, -1) diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md index a6a14b7fc77..3cc1c4d761f 100644 --- a/gcc/config/aarch64/aarch64-tune.md +++ b/gcc/config/aarch64/aarch64-tune.md @@ -1,5 +1,5 @@ ;; -*- buffer-read-only: t -*- ;; Generated automatically by gentune.sh from aarch64-cores.def (define_attr "tune" - "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" + "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" (const (symbol_ref "((enum attr_tune) aarch64_tune)"))) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index f2c805c0a64..279b97f51ba 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -16323,7 +16323,11 @@ performance of the code. Permissible values for this option are: @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor}, @samp{neoverse-e1},@samp{neoverse-n1},@samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{octeontx
Re: [PATCH] PR90838: Support ctz idioms
Hi! On Fri, Jan 10, 2020 at 10:35:04PM +0100, Jakub Jelinek wrote: > On Thu, Jan 09, 2020 at 02:26:10PM +0100, Richard Biener wrote: > > > >> + tree lhs = gimple_assign_lhs (stmt); > > > >> + bool zero_ok = CTZ_DEFINED_VALUE_AT_ZERO (TYPE_MODE (type), > > > >> val); > > > > > > > > since we're using the optab entry shouldn't you check for == 2 here? > > > > > > Yes, that looks more correct (it's not clear what 1 means exactly). > > 1 is what e.g. x86 uses with -mbmi, and I think it is what most targets > actually define if they have any defined value there, exception is > aarch64/arm, powerpc, gcn, nvptx and mips. 1 vs. 2 only makes a difference when expanding ffs, it seems. > Given: > int foo (int x) { return __builtin_ctz (x); } > int bar (int x) { return x ? __builtin_ctz (x) : 32; } > on x86 -mbmi we with CTZ_DEFINED_VALUE_AT_ZERO being 1 and value type_size > emit the same code, we'd need to find out if that is something done for all > targets or just a few, It doesn't do that for rs6000 (not for clz either, which is easier for us, the expansion of ctz depends on multiple factors). (Note that "foo" has UB whenever x == 0, btw. -- it would be nice if something like "bar" would generate the optimal code as well. Is there some other code that will just work?) > and if for all that have CTZ_DEFINED_VALUE_AT_ZERO 1, > we could perhaps just emit branchy code and wait for RTL to fix that up. Where would RTL fix that? In what pass, I mean. Segher
Re: [committed] Unbreak bootstrap on most targets (was Re: [PATCH] PR90838: Support ctz idioms)
On Sat, Jan 11, 2020 at 05:24:19PM +0100, Andreas Schwab wrote: > ../../gcc/tree-ssa-forwprop.c: In function 'bool > simplify_count_trailing_zeroes(gimple_stmt_iterator*)': > ../../gcc/tree-ssa-forwprop.c:1925:23: error: variable 'mode' set but not > used [-Werror=unused-but-set-variable] > 1925 | scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type); > | ^~~~ Oops, then I think we need following, but can't commit it until Monday. 2020-01-11 Jakub Jelinek PR tree-optimization/90838 * tree-ssa-forwprop.c (simplify_count_trailing_zeroes): Use SCALAR_INT_TYPE_MODE directly in CTZ_DEFINED_VALUE_AT_ZERO macro argument rather than to initialize temporary for targets that don't use the mode argument at all. --- gcc/tree-ssa-forwprop.c.jj 2020-01-11 16:31:56.278274863 +0100 +++ gcc/tree-ssa-forwprop.c 2020-01-11 17:27:55.336925656 +0100 @@ -1922,8 +1922,8 @@ simplify_count_trailing_zeroes (gimple_s tree type = TREE_TYPE (res_ops[0]); HOST_WIDE_INT ctzval; HOST_WIDE_INT type_size = tree_to_shwi (TYPE_SIZE (type)); - scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type); - bool zero_ok = CTZ_DEFINED_VALUE_AT_ZERO (mode, ctzval) == 2; + bool zero_ok + = CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type), ctzval) == 2; /* Skip if there is no value defined at zero, or if we can't easily return the correct value for zero. */ Jakub
Re: [committed] Unbreak bootstrap on most targets (was Re: [PATCH] PR90838: Support ctz idioms)
../../gcc/tree-ssa-forwprop.c: In function 'bool simplify_count_trailing_zeroes(gimple_stmt_iterator*)': ../../gcc/tree-ssa-forwprop.c:1925:23: error: variable 'mode' set but not used [-Werror=unused-but-set-variable] 1925 | scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type); | ^~~~ Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
Re: [wwwdocs] Git transition - how to access private user and vendor branches
On 11/01/2020 15:41, Segher Boessenkool wrote: > Hi Richard, > > On Thu, Jan 09, 2020 at 04:50:03PM +, Richard Earnshaw (lists) wrote: >> Given the proposed intention to use non-standard refspecs for private >> and vendor branches I've written some notes on how to use these. >> >> It would be helpful if someone could do some experiments to ensure that >> what I've written works properly for all versions of git, not just the >> one I have installed locally. > > I haven't been able to test it yet, but it does look like it will work. > >> +To fetch any of these you will need to add a fetch refspec to your >> +configuration. For example, to fetch all the IBM vendor branches add to >> +your default upstream pull >> + >> + >> +git config --add remote.origin.fetch >> "+refs/vendors/IBM/heads/*:refs/remotes/origin/IBM/*" >> +git config --add remote.origin.fetch >> "+refs/vendors/IBM/tags/*:refs/tags/IBM/*" >> + > > It may help to show the resulting config file instead of (or in addition > to) git-config commands to manipulate that. The config file is (or can > be) much more readable than those commands (shorter lines, less quoting). > You can also organise it, put in comments, and even *debug it*! And > *understand it* in general! Wow, what a concept :-) I wanted to describe it in the 'official' git way. Tweaking the contents of your .git/config file is not really sanctioned, even though we all do it. :-) > >> + >> +git checkout -b ARM/arm-9-branch origin/ARM/arm-9-branch >> + >> + >> +then change the merge property for the branch to corectly identify the > > (typo) Already fixed, didn't seem like it was worth reposting just for that. > >> +upstream source >> + >> + >> +git config branch.ARM/arm-9-branch.merge refs/vendors/ARM/heads/arm-9-branch >> + >> + >> +Pull operations will now correctly track the upstream branch. > > Do you also need to set branch..remote? Or is that done correctly > already? Or is that not needed if you have only one remote? Yes. However, on reflection, I'm not sure about that bit anyway. I may remote it entirely. > >> +It is also possible to set up push operations so that local changes will be >> pushed to the private namespace. For example, if you mirror your own >> private git information with >> + >> + >> +git config --add remote.origin.fetch >> "+refs/users/my-userid/heads/*:refs/remotes/origin/me/*" >> + >> + >> +then you can also add >> + >> +git config --add remote.origin.push >> "+refs/heads/me/*:refs/users/my-userid/heads/*" >> + >> +and then any push from a branch that begins with me/ will be >> pushed to the private area. > > Will be *force-pushed* to the server. This may not be expected or > wanted. > > Why would people want to name their local branches "me/thing" instead > of just "thing", btw? You could just use > > push = thing:users//heads/thing You could do that instead, but then you need a push line for every branch. The me/ trick means that anything named me/* will automatically get sent to the right place if pushed upstream. > > (no "+", I don't rebase the "thing" branch). Hrm, and maybe make an > alias to push a local branch to the server the first time... Completely > untested: > > [alias] > new-user-branch = !git push $1:users/myname/heads/$1 > > (And yes, I know this doesn't work as written if you have tag names > the same as branch names. That *deserves* punishment :-) ) > > > Segher Thanks for the feedback. Also see my scripts for setting up user/vendor stuff that I posted yesterday, it might simplify some of the above setup steps and thus the documentation, but I wrote those scripts after I had written this text. R.
Re: [wwwdocs] Git transition - how to access private user and vendor branches
Hi Richard, On Thu, Jan 09, 2020 at 04:50:03PM +, Richard Earnshaw (lists) wrote: > Given the proposed intention to use non-standard refspecs for private > and vendor branches I've written some notes on how to use these. > > It would be helpful if someone could do some experiments to ensure that > what I've written works properly for all versions of git, not just the > one I have installed locally. I haven't been able to test it yet, but it does look like it will work. > +To fetch any of these you will need to add a fetch refspec to your > +configuration. For example, to fetch all the IBM vendor branches add to > +your default upstream pull > + > + > +git config --add remote.origin.fetch > "+refs/vendors/IBM/heads/*:refs/remotes/origin/IBM/*" > +git config --add remote.origin.fetch > "+refs/vendors/IBM/tags/*:refs/tags/IBM/*" > + It may help to show the resulting config file instead of (or in addition to) git-config commands to manipulate that. The config file is (or can be) much more readable than those commands (shorter lines, less quoting). You can also organise it, put in comments, and even *debug it*! And *understand it* in general! Wow, what a concept :-) > + > +git checkout -b ARM/arm-9-branch origin/ARM/arm-9-branch > + > + > +then change the merge property for the branch to corectly identify the (typo) > +upstream source > + > + > +git config branch.ARM/arm-9-branch.merge refs/vendors/ARM/heads/arm-9-branch > + > + > +Pull operations will now correctly track the upstream branch. Do you also need to set branch..remote? Or is that done correctly already? Or is that not needed if you have only one remote? > +It is also possible to set up push operations so that local changes will be > pushed to the private namespace. For example, if you mirror your own private > git information with > + > + > +git config --add remote.origin.fetch > "+refs/users/my-userid/heads/*:refs/remotes/origin/me/*" > + > + > +then you can also add > + > +git config --add remote.origin.push > "+refs/heads/me/*:refs/users/my-userid/heads/*" > + > +and then any push from a branch that begins with me/ will be > pushed to the private area. Will be *force-pushed* to the server. This may not be expected or wanted. Why would people want to name their local branches "me/thing" instead of just "thing", btw? You could just use push = thing:users//heads/thing (no "+", I don't rebase the "thing" branch). Hrm, and maybe make an alias to push a local branch to the server the first time... Completely untested: [alias] new-user-branch = !git push $1:users/myname/heads/$1 (And yes, I know this doesn't work as written if you have tag names the same as branch names. That *deserves* punishment :-) ) Segher
[C++ coroutines 7/7] libiberty demangler update.
Ian Lance Taylor wrote: > Iain Sandoe writes: >> 2020-01-09 Iain Sandoe >> >> * cp-demangle.c (cplus_demangle_operators): Add the co_await >> operator. > > Please add something to libiberty/testsuite/demangle-expected. Thanks. done***, OK now? thanks Iain *** it seems that the demangle test is broken for x86_64-darwin & linux on trunk anyway, (I ‘fixed’ it locally to test my patch below; will try to figure out if that fix is correct and post it if so) - however, that’s unconnected with the coroutines stuff. == revised: -- The coroutines implementation introduces a new operator with a mangling of 'aw'. This patch adds that to libiberty's demangler. libiberty/ChangeLog: 2020-01-11 Iain Sandoe * cp-demangle.c (cplus_demangle_operators): Add the co_await operator. * testsuite/demangle-expected: Append a test for the co_await operator mangling. diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c index 3639bfbfd4..fc55b7fae1 100644 --- a/libiberty/cp-demangle.c +++ b/libiberty/cp-demangle.c @@ -1802,6 +1802,7 @@ const struct demangle_operator_info cplus_demangle_operators[] = { "ad", NL ("&"), 1 }, { "an", NL ("&"), 2 }, { "at", NL ("alignof "), 1 }, + { "aw", NL ("co_await "), 1 }, { "az", NL ("alignof "), 1 }, { "cc", NL ("const_cast"), 2 }, { "cl", NL ("()"),2 }, diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected index 5878d96dee..e6a75601e0 100644 --- a/libiberty/testsuite/demangle-expected +++ b/libiberty/testsuite/demangle-expected @@ -1450,3 +1450,7 @@ Foo()::{lambda(auto:1)#1}::operator()(char) const::X::fn _Z3fooILPv0EEvPN9enable_ifIXeqT_LDnEEvE4typeE void foo<(void*)0>(enable_if<((void*)0)==((decltype(nullptr))), void>::type*) + +_ZNK5coro15emptyawEv +coro1::empty::operator co_await() const +
[wwwdocs] lists.html - Avoid two references to a concrete version control system (SVN here).
Already when we converted from CVS to SVN aeons ago did I work to reduce references to a concrete system. These are two further instances, now that we are moving to GIT. Committed. Gerald - Log - commit 370969e4d49ad3defb1fddaa239dbab955896a3a Author: Gerald Pfeifer Date: Sat Jan 11 14:02:34 2020 +0100 diff --git a/htdocs/lists.html b/htdocs/lists.html index 8ca456a..b05e171 100644 --- a/htdocs/lists.html +++ b/htdocs/lists.html @@ -99,11 +99,11 @@ before subscribing and posting to these lists. https://gcc.gnu.org/ml/gcc-cvs/";>gcc-cvs is a read-only, relatively high volume list which tracks checkins to the - GCC SVN repository. + GCC source repository. https://gcc.gnu.org/ml/libstdc++-cvs/";>libstdc++-cvs is a read-only, relatively low volume list which tracks checkins to - the libstdc++-v3 part of the GCC SVN repository. This is a subset + the libstdc++-v3 part of the GCC repository. This is a subset of the messages to gcc-cvs. https://gcc.gnu.org/ml/gcc-cvs-wwwdocs/";>gcc-cvs-wwwdocs
RE: [PATCH] Add Optimization for various IPA parameters.
Hi Martin, This change (r280099) is causing a major performance regression on exchange2 in SPEC2017 dropping the benchmark by more than 30%. It seems the parameters no longer do anything. i.e. -flto --param ipa-cp-eval-threshold=1 --param ipa-cp-unit-growth=80 doesn't have any effect anymore. Thanks, Tamar > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org > On Behalf Of Martin Jambor > Sent: Wednesday, January 8, 2020 4:14 PM > To: Martin Liška ; gcc-patches@gcc.gnu.org > Cc: hubi...@ucw.cz; Richard Biener > Subject: Re: [PATCH] Add Optimization for various IPA parameters. > > Hi, > > On Fri, Jan 03 2020, Martin Liška wrote: > > Hi. > > > > This is similar transformation for IPA passes. This time, one needs to > > use opt_for_fn in order to get the right parameter values. > > > > @Martin, Honza: > > There are last few remaining parameters which should use > > opt_for_fn: > > > > param_ipa_cp_unit_growth > > So as we discussed, picking this one from one particular node is not what one > would expect to happen, but inlining does it too and so anyway: > > > IPA-CP: Always access param_ipcp_unit_growth through opt_for_fn > > 2020-01-07 Martin Jambor > > * params.opt (param_ipcp_unit_growth): Mark as Optimization. > * ipa-cp.c (max_new_size): Removed. > (orig_overall_size): New variable. > (get_max_overall_size): New function. > (estimate_local_effects): Use it. Adjust dump. > (decide_about_value): Likewise. > (ipcp_propagate_stage): Do not calculate max_new_size, just store > orig_overall_size. Adjust dump. > (ipa_cp_c_finalize): Clear orig_overall_size instead of max_new_size. > --- > gcc/ipa-cp.c | 39 ++- > gcc/params.opt | 2 +- > 2 files changed, 27 insertions(+), 14 deletions(-) > > diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 9e20e278eff..c2572e3e0e8 > 100644 > --- a/gcc/ipa-cp.c > +++ b/gcc/ipa-cp.c > @@ -375,7 +375,7 @@ static profile_count max_count; > > /* Original overall size of the program. */ > > -static long overall_size, max_new_size; > +static long overall_size, orig_overall_size; > > /* Node name to unique clone suffix number map. */ static > hash_map *clone_num_suffixes; @@ -3395,6 > +3395,23 @@ perform_estimation_of_a_value (cgraph_node *node, > vec known_csts, >val->local_size_cost = size; > } > > +/* Get the overall limit oof growth based on parameters extracted from > growth. > + it does not really make sense to mix functions with different overall > growth > + limits but it is possible and if it happens, we do not want to select one > + limit at random. */ > + > +static long > +get_max_overall_size (cgraph_node *node) { > + long max_new_size = orig_overall_size; > + long large_unit = opt_for_fn (node->decl, param_large_unit_insns); > + if (max_new_size < large_unit) > +max_new_size = large_unit; > + int unit_growth = opt_for_fn (node->decl, param_ipcp_unit_growth); > + max_new_size += max_new_size * unit_growth / 100 + 1; > + return max_new_size; > +} > + > /* Iterate over known values of parameters of NODE and estimate the local > effects in terms of time and size they have. */ > > @@ -3457,7 +3474,7 @@ estimate_local_effects (struct cgraph_node *node) > stats.freq_sum, stats.count_sum, > size)) > { > - if (size + overall_size <= max_new_size) > + if (size + overall_size <= get_max_overall_size (node)) > { > info->do_clone_for_all_contexts = true; > overall_size += size; > @@ -3467,8 +3484,8 @@ estimate_local_effects (struct cgraph_node *node) >"known contexts, growth deemed beneficial.\n"); > } > else if (dump_file && (dump_flags & TDF_DETAILS)) > - fprintf (dump_file, " Not cloning for all contexts because " > - "max_new_size would be reached with %li.\n", > + fprintf (dump_file, " Not cloning for all contexts because " > + "maximum unit size would be reached with %li.\n", >size + overall_size); > } >else if (dump_file && (dump_flags & TDF_DETAILS)) @@ -3860,14 > +3877,10 @@ ipcp_propagate_stage (class ipa_topo_info *topo) > max_count = max_count.max (node->count.ipa ()); >} > > - max_new_size = overall_size; > - if (max_new_size < param_large_unit_insns) > -max_new_size = param_large_unit_insns; > - max_new_size += max_new_size * param_ipcp_unit_growth / 100 + 1; > + orig_overall_size = overall_size; > >if (dump_file) > -fprintf (dump_file, "\noverall_size: %li, max_new_size: %li\n", > - overall_size, max_new_size); > +fprintf (dump_file, "\noverall_size: %li\n", overall_size); > >propagate_constants_topo (topo); >if (flag_checking) > @@ -5380,11 +5393,11 @@ decide_about_value (struct cgraph_node *node, >
Re: [PATCH] Add initial octeontx2 support.
writes: > From: Andrew Pinski > > This adds octeontx2 naming. It currently uses the cortexa57 > cost model and schedule model until I submit this. This is > more a place holder to get the naming of the cores in GCC 10. > I will submit the cost model in the next couple of days. > > OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. > > Thanks, > Andrew Pinski > > ChangeLog: > * config/aarch64/aarch64-cores.def (octeontx2): New define. > (octeontx2t98): New define. > (octeontx2t96): New define. > (octeontx2t93): New define. > (octeontx2f95): New define. > (octeontx2f95n): New define. > (octeontx2f95mm): New define. > * config/aarch64/aarch64-tune.md: Regenerate. > > Signed-off-by: Andrew Pinski > --- > gcc/config/aarch64/aarch64-cores.def | 10 ++ > gcc/config/aarch64/aarch64-tune.md | 2 +- > 2 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/gcc/config/aarch64/aarch64-cores.def > b/gcc/config/aarch64/aarch64-cores.def > index 2dd2b86bd92..057ed1ee131 100644 > --- a/gcc/config/aarch64/aarch64-cores.def > +++ b/gcc/config/aarch64/aarch64-cores.def > @@ -109,6 +109,16 @@ AARCH64_CORE("ares", ares, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_F > AARCH64_CORE("neoverse-n1", neoversen1, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | > AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, neoversen1, 0x41, 0xd0c, -1) > AARCH64_CORE("neoverse-e1", neoversee1, cortexa53, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | > AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd4a, -1) > > +/* Cavium ('C') cores. */ > +AARCH64_CORE("octeontx2", octeontx2, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, > 0x43, 0x0b0, -1) > +AARCH64_CORE("octeontx2t98", octeontx2t98, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, > 0x43, 0x0b1, -1) > +AARCH64_CORE("octeontx2t96", octeontx2t96, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, > 0x43, 0x0b2, -1) > +/* Note OcteonTX2 T93 is an alias to OcteonTX2 T96. */ > +AARCH64_CORE("octeontx2t93", octeontx2t93, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, > 0x43, 0x0b2, -1) > +AARCH64_CORE("octeontx2f95", octeontx2f95, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, > 0x43, 0x0b3, -1) > +AARCH64_CORE("octeontx2f95n", octeontx2f95n, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, > 0x43, 0x0b4, -1) > +AARCH64_CORE("octeontx2f95mm", octeontx2f95mm, cortexa57, 8_2A, > AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_PROFILE , cortexa57, > 0x43, 0x0b5, -1) Sorry for the nit, but: should be no space before ",". The new option values need documenting in invoke.texi too. Thanks, Richard > + > /* HiSilicon ('H') cores. */ > AARCH64_CORE("tsv110", tsv110, tsv110, 8_2A, AARCH64_FL_FOR_ARCH8_2 | > AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, > tsv110, 0x48, 0xd01, -1) > > diff --git a/gcc/config/aarch64/aarch64-tune.md > b/gcc/config/aarch64/aarch64-tune.md > index a6a14b7fc77..3cc1c4d761f 100644 > --- a/gcc/config/aarch64/aarch64-tune.md > +++ b/gcc/config/aarch64/aarch64-tune.md > @@ -1,5 +1,5 @@ > ;; -*- buffer-read-only: t -*- > ;; Generated automatically by gentune.sh from aarch64-cores.def > (define_attr "tune" > - > "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" > + > "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" > (const (symbol_ref "((enum attr_tune) aarch64_tune)")))