Re: Ping^6: contribute Synopsys Designware ARC port
On Tue, Oct 01, 2013 at 04:22:38PM -0600, Jeff Law wrote: - The Copyright years should be 2013 in every new file. Or has this port been released before? The port has been available via git for quite a while: https://github.com/foss-for-synopsys-dwc-arc-processors/gcc Right. Was any of this code from Doug Evans's old ARC support? It doesn't hurt to have 2013 in the dates, and I suspect most files will get touched as a result of addressing Diego's comments. Because GCC has switched to Copyright year ranges, in fact all the Copyright lines should be either 2013, or firstyear-2013. Jakub
Re: [PATCH] Reducing number of alias checks in vectorization.
On Tue, Oct 01, 2013 at 07:12:54PM -0700, Cong Hou wrote:

--- gcc/tree-vect-loop-manip.c (revision 202662)
+++ gcc/tree-vect-loop-manip.c (working copy)

Your mailer ate all the tabs, so the formatting of the whole patch can't be checked.

@@ -19,6 +19,10 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3. If not see
 http://www.gnu.org/licenses/. */
+#include <vector>
+#include <utility>
+#include <algorithm>

Why? GCC has its own vec.h vectors, why don't you use those? There is even a qsort method for you in there. And for pairs, you can easily just use structs with two members as structure elements in the vector.

+struct dr_addr_with_seg_len
+{
+ dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
+: dr (d), basic_addr (addr), offset (off), seg_len (len) {}
+
+ data_reference* dr;

Space should be before *, not after it.

+ if (TREE_CODE (p11.offset) != INTEGER_CST
+ || TREE_CODE (p21.offset) != INTEGER_CST)
+return p11.offset < p21.offset;

If offset isn't INTEGER_CST, you are comparing the pointer values? That is never a good idea, because then compilation will depend on how, say, address space randomization randomizes the virtual address space. GCC needs to have reproducible compilations.

+ if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
+return int_cst_value (p11.offset) < int_cst_value (p21.offset);

This is going to ICE whenever the offsets wouldn't fit into a HOST_WIDE_INT. I'd say you just shouldn't put into the vector entries where offset isn't host_integerp, those would never be merged with other checks, or something similar.

Jakub
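The point about pointer comparisons can be illustrated with a minimal sketch. Everything here is hypothetical (the Node and Entry types, the uid field, and the function names are not GCC's); it only shows sorting by a stable key rather than by address, so the order does not change from run to run. The standard library is used purely to keep the example self-contained; inside GCC the review asks for vec.h instead.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for an object whose address may differ between runs.
struct Node {
  int uid;  // stable identifier assigned in creation order
};

// A plain struct with two members, used instead of std::pair.
struct Entry {
  const Node *node;
  long seg_len;
};

// Order by the stable uid first, then by seg_len; never by pointer value.
static bool entry_less(const Entry &a, const Entry &b) {
  if (a.node->uid != b.node->uid)
    return a.node->uid < b.node->uid;
  return a.seg_len < b.seg_len;
}

// Sort a copy of the entries and return the uids in the resulting order.
std::vector<int> sorted_uids(std::vector<Entry> v) {
  std::sort(v.begin(), v.end(), entry_less);
  std::vector<int> out;
  for (const Entry &e : v)
    out.push_back(e.node->uid);
  return out;
}
```

Because the comparator never looks at the addresses themselves, the result is the same no matter where the nodes happen to be allocated.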
Re: [C++ PATCH] Splice when giving an error (PR c++/58510)
Ping.

On Wed, Sep 25, 2013 at 01:08:38PM +0200, Marek Polacek wrote:

The following testcase ICEd because complete_ctor_at_level_p got a union with two initializers - and didn't like that. I think we can get away with splicing when sorting the initializers: we already gave an error and the program isn't accepted.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-09-25 Marek Polacek pola...@redhat.com

PR c++/58510
cp/
* init.c (sort_mem_initializers): Splice when giving an error.
testsuite/
* g++.dg/cpp0x/pr58510.C: New test.

--- gcc/cp/init.c.mp 2013-09-25 11:50:18.246432664 +0200
+++ gcc/cp/init.c 2013-09-25 11:50:18.262432728 +0200
@@ -980,9 +980,12 @@ sort_mem_initializers (tree t, tree mem_
     else if (TREE_VALUE (*last_p) && !TREE_VALUE (init))
       goto splice;
     else
-      error_at (DECL_SOURCE_LOCATION (current_function_decl),
-                "initializations for multiple members of %qT",
-                ctx);
+      {
+        error_at (DECL_SOURCE_LOCATION (current_function_decl),
+                  "initializations for multiple members of %qT",
+                  ctx);
+        goto splice;
+      }
   }
 
   last_p = p;
--- gcc/testsuite/g++.dg/cpp0x/pr58510.C.mp 2013-09-25 12:19:02.612137551 +0200
+++ gcc/testsuite/g++.dg/cpp0x/pr58510.C 2013-09-25 12:45:13.157119958 +0200
@@ -0,0 +1,11 @@
+// PR c++/58510
+// { dg-do compile { target c++11 } }
+
+void foo()
+{
+  union
+  { // { dg-error "multiple" }
+int i = 0;
+char c = 0;
+  };
+}

Marek

Marek
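The idea of splicing after the error has been reported can be sketched roughly as follows. This is not the actual sort_mem_initializers logic; the Init type and the function name are hypothetical, and the list is a plain singly linked list. The sketch only shows the pattern: once the diagnostic is recorded, the offending node is unlinked so later code never sees a union with two initializers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one member initializer in a singly linked list.
struct Init {
  std::string member;
  bool in_union;  // true if the member belongs to the (single) union
  Init *next;
};

// Walk the list. For the second and later initializers of union members,
// record an error and unlink (splice) the node. Returns the number removed.
int splice_extra_union_inits(Init **head, std::vector<std::string> *errors) {
  bool seen_union_init = false;
  int removed = 0;
  for (Init **p = head; *p;) {
    Init *cur = *p;
    if (cur->in_union && seen_union_init) {
      errors->push_back("initializations for multiple members of union");
      *p = cur->next;  // splice: drop the node and do not advance p
      ++removed;
      continue;
    }
    if (cur->in_union)
      seen_union_init = true;
    p = &cur->next;
  }
  return removed;
}
```

The error is still reported exactly once per offending initializer; splicing only keeps the already-rejected program from tripping later consistency checks.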
Re: operator new returns nonzero
It isn't a front-end patch, but it is still a C++ patch, maybe Jason will have comments? Anyone else? On Mon, 16 Sep 2013, Marc Glisse wrote: Nobody has expressed concern for a week, so it may be worth doing an official review ;-) http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00676.html On Mon, 9 Sep 2013, Marc Glisse wrote: I have now tested bootstrap+testsuite and there was no regression. 2013-09-07 Marc Glisse marc.gli...@inria.fr PR c++/19476 gcc/ * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. * g++.dg/tree-ssa/pr19476-3.C: Likewise. * g++.dg/tree-ssa/pr19476-4.C: Likewise. -- Marc Glisse
Re: operator new returns nonzero
On Mon, Sep 09, 2013 at 10:49:40PM +0200, Marc Glisse wrote:

--- fold-const.c (revision 202413)
+++ fold-const.c (working copy)
@@ -16171,21 +16171,31 @@ tree_expr_nonzero_warnv_p (tree t, bool
     case MODIFY_EXPR:
     case BIND_EXPR:
       return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1),
                                         strict_overflow_p);
     case SAVE_EXPR:
       return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0),
                                         strict_overflow_p);
     case CALL_EXPR:
-      return alloca_call_p (t);
+      {
+        tree fn = CALL_EXPR_FN (t);
+        if (TREE_CODE (fn) != ADDR_EXPR) return false;
+        tree fndecl = TREE_OPERAND (fn, 0);
+        if (TREE_CODE (fndecl) != FUNCTION_DECL) return false;
+        if (flag_delete_null_pointer_checks && !flag_check_new
+            && DECL_IS_OPERATOR_NEW (fndecl)
+            && !TREE_NOTHROW (fndecl))
+          return true;
+        return alloca_call_p (t);

Not commenting on what this patch does, but how: why don't you use

tree fndecl = get_callee_fndecl (t);
if (fndecl && ...) return true;

instead? Perhaps alloca_call_p should use it too.

Jakub
RE: [PING] 3 patches waiting for approval/review
-Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Andreas Krebbel Sent: 01 October 2013 10:18 To: gcc-patches@gcc.gnu.org Subject: [PING] 3 patches waiting for approval/review [RFC] Allow functions calling mcount before prologue to be leaf functions http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00993.html [PATCH] PR57377: Fix mnemonic attribute http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01364.html [PATCH] Doc: Add documentation for the mnemonic attribute http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01436.html Bye, -Andreas- Documentation patch has a typo: + specific checks in e.g. the pipleline description. ^ Cheers, Paulo Matos
Re: operator new returns nonzero
On Wed, 2 Oct 2013, Jakub Jelinek wrote:

On Mon, Sep 09, 2013 at 10:49:40PM +0200, Marc Glisse wrote:

--- fold-const.c (revision 202413)
+++ fold-const.c (working copy)
@@ -16171,21 +16171,31 @@ tree_expr_nonzero_warnv_p (tree t, bool
     case MODIFY_EXPR:
     case BIND_EXPR:
       return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1),
                                         strict_overflow_p);
     case SAVE_EXPR:
       return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0),
                                         strict_overflow_p);
     case CALL_EXPR:
-      return alloca_call_p (t);
+      {
+        tree fn = CALL_EXPR_FN (t);
+        if (TREE_CODE (fn) != ADDR_EXPR) return false;
+        tree fndecl = TREE_OPERAND (fn, 0);
+        if (TREE_CODE (fndecl) != FUNCTION_DECL) return false;
+        if (flag_delete_null_pointer_checks && !flag_check_new
+            && DECL_IS_OPERATOR_NEW (fndecl)
+            && !TREE_NOTHROW (fndecl))
+          return true;
+        return alloca_call_p (t);

Not commenting on what this patch does, but how: why don't you use

tree fndecl = get_callee_fndecl (t);
if (fndecl && ...) return true;

instead?

Because I copied the code from alloca_call_p ;-)
get_callee_fndecl does look better indeed.

Perhaps alloca_call_p should use it too.

Thanks, I'll prepare a new patch.

--
Marc Glisse
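The shape of the suggested simplification can be sketched with toy types. None of the names below are GCC's (Kind, Node, toy_callee_decl and toy_call_nonzero_p are all hypothetical), and the fallback to the alloca check is left out. The sketch only shows how a helper that returns either the callee declaration or null lets the caller replace two kind checks with a single null check.

```cpp
#include <cassert>

// Toy node kinds; hypothetical stand-ins, not GCC's tree codes.
enum class Kind { AddrExpr, FunctionDecl, Other };

struct Node {
  Kind kind;
  const Node *operand;   // for AddrExpr: the object whose address is taken
  bool is_operator_new;  // meaningful for FunctionDecl only
  bool nothrow;          // meaningful for FunctionDecl only
};

// Returns the declaration of the called function, or nullptr when the
// callee is not the address of a function declaration (an indirect call).
const Node *toy_callee_decl(const Node *callee) {
  if (!callee || callee->kind != Kind::AddrExpr)
    return nullptr;
  const Node *decl = callee->operand;
  if (!decl || decl->kind != Kind::FunctionDecl)
    return nullptr;
  return decl;
}

// With the helper, the caller needs one null check before the flag tests.
bool toy_call_nonzero_p(const Node *callee, bool delete_null_checks,
                        bool check_new) {
  const Node *decl = toy_callee_decl(callee);
  return decl && delete_null_checks && !check_new
         && decl->is_operator_new && !decl->nothrow;
}
```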
Re: [PATCH] Fix PR58554
On Tue, 1 Oct 2013, Bernhard Reutner-Fischer wrote: On 30 September 2013 14:19:01 Richard Biener rguent...@suse.de wrote: This fixes PR58554, pattern recognition in loop distribution now needs to check whether all stmts are unconditionally executed. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2013-09-30 Richard Biener rguent...@suse.de PR tree-optimization/58554 * tree-loop-distribution.c (classify_partition): Require unconditionally executed stores for memcpy and memset recognition. (tree_loop_distribution): Calculate dominance info. * gcc.dg/torture/pr58554.c: New testcase. Index: gcc/tree-loop-distribution.c *** out: *** 1719,1724 --- 1723,1729 { if (!cd) { + calculate_dominance_info (CDI_DOMINATORS); calculate_dominance_info (CDI_POST_DOMINATORS); cd = new control_dependences (create_edge_list ()); free_dominance_info (CDI_POST_DOMINATORS); Don't you have to free CDI_DOMINATORS too now, somewhere? Not unless they become invalid. We preserve dominators across passes (unlike post-dominators, because those are not kept up-to-date by CFG manipulation functions). Richard.
Re: [PATCH, doc]: Fix @anchor should not appear in @heading warning
On Sun, 29 Sep 2013, Uros Bizjak wrote: Rather trivial fix - put @anchor before @heading, as texi manual suggests. 2013-09-29 Uros Bizjak ubiz...@gmail.com * doc/install.texi (Host/target specific installation notes for GCC): Put @anchor before @heading. Tested by make doc with texinfo 5.1 on Fedora 19. Thanks. I assume (also based on their release notes) that newer versions of texinfo now issue more warnings? Anything else we should be aware of? Gerald
Re: [wwwdocs] Buildstat update for 4.4
On Tue, 1 Oct 2013, Tom G. Christensen wrote: Testresults for 4.4.7: i386-pc-solaris2.8 (2) i386-pc-solaris2.9 sparc-sun-solaris2.7 (2) sparc-sun-solaris2.8 sparc-sun-solaris2.9 Thanks, applied. Gerald
Re: [PATCH, doc]: Fix @anchor should not appear in @heading warning
On Wed, Oct 2, 2013 at 10:12 AM, Gerald Pfeifer ger...@pfeifer.com wrote:

On Sun, 29 Sep 2013, Uros Bizjak wrote:

Rather trivial fix - put @anchor before @heading, as texi manual suggests.

2013-09-29 Uros Bizjak ubiz...@gmail.com

* doc/install.texi (Host/target specific installation notes for GCC): Put @anchor before @heading.

Tested by make doc with texinfo 5.1 on Fedora 19.

Thanks. I assume (also based on their release notes) that newer versions of texinfo now issue more warnings? Anything else we should be aware of?

Correct. I have committed a pack of trivial warning fixes; the remaining warnings are as follows:

../../gcc-svn/trunk/gcc/doc/invoke.texi:1068: warning: node next `Overall Options' in menu `C Dialect Options' and in sectioning `Invoking G++' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:1068: warning: node up `Overall Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:1534: warning: node prev `C Dialect Options' in menu `Overall Options' and in sectioning `Invoking G++' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:1534: warning: node up `C Dialect Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:1944: warning: node up `C++ Dialect Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:2773: warning: node up `Objective-C and Objective-C++ Dialect Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:3005: warning: node up `Language Independent Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:3131: warning: node up `Warning Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:5017: warning: node up `Debugging Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:6625: warning: node up `Optimize Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:9913: warning: node up `Preprocessor Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:9966: warning: node up `Assembler Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:9989: warning: node up `Link Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:10271: warning: node up `Directory Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:10424: warning: node up `Spec Files' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/invoke.texi:10985: warning: node up `Target Options' in menu `Option Summary' and in sectioning `Invoking GCC' differ
../../gcc-svn/trunk/gcc/doc/extend.texi:7581: warning: node next `Object Size Checking' in menu `Cilk Plus Builtins' and in sectioning `Other Builtins' differ
../../gcc-svn/trunk/gcc/doc/extend.texi:7714: warning: node next `Other Builtins' in menu `Target Builtins' and in sectioning `Cilk Plus Builtins' differ
../../gcc-svn/trunk/gcc/doc/extend.texi:7714: warning: node prev `Other Builtins' in menu `Cilk Plus Builtins' and in sectioning `Object Size Checking' differ
../../gcc-svn/trunk/gcc/doc/extend.texi:8838: warning: node next `Cilk Plus Builtins' in menu `Other Builtins' and in sectioning `Target Builtins' differ
../../gcc-svn/trunk/gcc/doc/extend.texi:8838: warning: node prev `Cilk Plus Builtins' in menu `Object Size Checking' and in sectioning `Other Builtins' differ
../../gcc-svn/trunk/gcc/doc/extend.texi:8864: warning: node prev `Target Builtins' in menu `Other Builtins' and in sectioning `Cilk Plus Builtins' differ
../../gcc-svn/trunk/gcc/doc/trouble.texi:5: warning: node next `Trouble' in menu `Service' and in sectioning `Bugs' differ
../../gcc-svn/trunk/gcc/doc/trouble.texi:5: warning: node prev `Trouble' in menu `Bug Reporting' and in sectioning `Gcov' differ
../../gcc-svn/trunk/gcc/doc/trouble.texi:5: warning: node up `Trouble' in menu `Bugs' and in sectioning `Top' differ
../../gcc-svn/trunk/gcc/doc/service.texi:5: warning: node prev `Service' in menu `Trouble' and in sectioning `Bugs' differ
../../gcc-svn/trunk/gcc/doc/service.texi:5: warning: node up `Service' in menu `Bugs' and in sectioning `Top' differ

Uros.
Re: [patch] Move some prototypes out of tree-flow.h
On Tue, Oct 1, 2013 at 5:18 PM, Andrew MacLeod amacl...@redhat.com wrote:

This patch moves 5 sets of prototypes out of tree-flow.h and creates 4 new header files. I then #include the new header files from tree-flow.h as a temporary measure until all prototypes have been cleared up. Then, as previously discussed, I will revisit all the #includes *of* and *within* tree-flow.h to get it and the .c files down to the basics required. I suspect this will then expose more functions that should be shuffled. This time around I've only shuffled what really needed shuffling for compilation. All files now list all the exports within a file.

tree-cfgcleanup.h: New file. Pretty basic, just moved the prototypes.

tree-dfa.h: New file. Added the prototypes, and moved get_addr_base_and_unit_offset_1 from tree-flow-inline.h. Its related functions were in tree-dfa.c so I just left them all there for now.

tree-pretty-print.h: This file already existed, but some of the prototypes were in tree-flow.h for some reason. It also contained a prototype for a C front end debug routine which isn't used anywhere, so I deleted that.

tree-into-ssa.h: Another new file. Moved the prototypes out of tree-flow.h, and there were a bunch of debug prototypes in the .c file itself. I moved those to the header file for clarity.

gimple-low.h: The final new file. Moved the prototypes here.

gimple-low.c: I moved try_catch_may_fallthru() and block_may_fallthru() to tree.c. There are gimple versions in gimple-low.c already, and it turns out that block_may_fallthru() is called from the C++ front end, so it doesn't belong in gimple.c. The prototype is already in tree.h anyway.

A few .c files required adding tree.h to the include list to pick up bits that moved due to the reshuffling, mostly uses of enum tree_code in tree-pretty-print.h. Eventually, all the .c files ought to include tree.h directly. I'll take care of that to some degree when tree-flow.h is finally sorted out.
Bootstraps on x86_64-unknown-linux-gnu and no new regressions. OK? Ok. Thanks, Richard. Andrew
Re: [PING] 3 patches waiting for approval/review
On 02/10/13 09:10, Paulo Matos wrote: -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Andreas Krebbel Sent: 01 October 2013 10:18 To: gcc-patches@gcc.gnu.org Subject: [PING] 3 patches waiting for approval/review [RFC] Allow functions calling mcount before prologue to be leaf functions http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00993.html [PATCH] PR57377: Fix mnemonic attribute http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01364.html [PATCH] Doc: Add documentation for the mnemonic attribute http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01436.html Bye, -Andreas- Documentation patch has a typo: + specific checks in e.g. the pipleline description. Fixed. Thanks. -Andreas-
Re: [PATCH]Fix computation of offset in ivopt
On Tue, Oct 1, 2013 at 6:13 PM, Bin.Cheng amker.ch...@gmail.com wrote:

On Tue, Oct 1, 2013 at 6:50 PM, Richard Biener richard.guent...@gmail.com wrote:

On Mon, Sep 30, 2013 at 7:39 AM, bin.cheng bin.ch...@arm.com wrote:

I don't think you need

+ /* Sign extend off if expr is in type which has lower precision
+ than HOST_WIDE_INT. */
+ if (TYPE_PRECISION (TREE_TYPE (expr)) <= HOST_BITS_PER_WIDE_INT)
+off = sext_hwi (off, TYPE_PRECISION (TREE_TYPE (expr)));

at least it would be suspicious if you did ...

There is a problem, for example in the case from the first message. The iv base is like:

pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4

I am not sure why but it seems (-4/0xFFFC) is represented by (1073741823*4). For each operand strip_offset_1 returns exactly the positive number, and the result of the multiplication never gets a chance to be sign extended. That's why the sign extension is necessary to fix the problem. Or it should be fixed elsewhere by representing the iv base with:

pretmp_184 + ((sizetype) KeyIndex_180 + 4294967295) * 4

in the first place.

Yeah, that's why I said the whole issue with forcing all offsets to be unsigned is a mess ...

There is really no good answer besides not doing that I fear.

Yes, in the above case we could fold the whole thing differently (interpret the offset of a POINTER_PLUS_EXPR as signed). You can try tracking down the offender, but it'll get non-trivial easily as you have to consider the fact that GCC will treat signed operations as having undefined behavior on overflow.

So I see why you want to do the extension above (re-interpret the result), I suppose we can live with it but please make sure to add a big fat ??? comment before it explaining why it is necessary.

Richard.
The only case that I can think of points to a bug in strip_offset_1 again, namely if sizetype (the type of all offsets) is smaller than a HOST_WIDE_INT in which case

+boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
+*offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;

is wrong as boffset / BITS_PER_UNIT does not do a signed division then (for negative boffset which AFAIK does not happen - but it would be technically allowed). Thus, the predicates like

+ cst_and_fits_in_hwi (tmp)

would need to be amended with a check that the MSB is not set.

So I can handle it like:

+abs_boffset = abs_hwi (boffset);
+x = abs_boffset / BITS_PER_UNIT;
+if (boffset < 0)
+ x = -x;
+*offset = off0 + int_cst_value (tmp) + x;

Right?

Btw, the cst_and_fits_in_hwi implementation is odd:

bool
cst_and_fits_in_hwi (const_tree x)
{
  if (TREE_CODE (x) != INTEGER_CST)
    return false;

  if (TYPE_PRECISION (TREE_TYPE (x)) > HOST_BITS_PER_WIDE_INT)
    return false;

  return (TREE_INT_CST_HIGH (x) == 0
          || TREE_INT_CST_HIGH (x) == -1);
}

the precision check seems totally pointless and I wonder what's the point of this routine as there is host_integerp () already and tree_low_cst instead of int_cst_value - oh, I see, the latter forcefully sign-extends that should make the extension not necessary.

See above.

Thanks.
bin
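The arithmetic behind this exchange can be checked with a small sketch. It assumes a 32-bit sizetype held in a 64-bit host integer, which is what the numbers in the thread appear to imply; toy_sext and unsigned_offset are hypothetical names, not GCC functions. The product 1073741823 * 4 equals 4294967292 (0xFFFFFFFC), and sign-extending it from 32 bits gives the intended offset of -4.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `prec` bits of `value` to 64 bits (1 <= prec <= 64).
// Hypothetical helper in the spirit of the sign extension discussed above.
std::int64_t toy_sext(std::uint64_t value, unsigned prec) {
  if (prec >= 64)
    return static_cast<std::int64_t>(value);
  const std::uint64_t mask = (std::uint64_t{1} << prec) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (prec - 1);
  const std::uint64_t low = value & mask;
  // Flip the sign bit, then subtract it: values with the sign bit set
  // wrap around to the corresponding negative number.
  return static_cast<std::int64_t>((low ^ sign) - sign);
}

// The offset from the thread, computed without any sign extension.
std::uint64_t unsigned_offset() {
  return std::uint64_t{1073741823} * 4;  // 4294967292 == 0xFFFFFFFC
}
```

Without the extension the offset reads as a large positive number; with it, the same bit pattern is reinterpreted as -4, which is why the result has to be re-interpreted at the precision of the expression's type.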
Re: [patch] More tree-flow.h prototypes.
On Tue, Oct 1, 2013 at 11:01 PM, Andrew MacLeod amacl...@redhat.com wrote: This patch moves prototypes into gimple-fold.h (which already existed). There were a few in tree-flow.h and a bunch in gimple.h. The routines are used frequently enough that it makes sense to include gimple-fold.h from gimple.h instead of from within each .c file that needs it. (presumably why the prototypes were in gimple.h to begin with). I took gimple-fold.h out of whatever .c files it was included in. tree-ssa-copy.h was also created for the prototypes in that file and included from tree-ssa.h. These should probably be moved elsewhere (tree-ssa-copy.c is supposed to be the copy propagation pass file). But that can be done as followup. Bootstraps and no new regressions. OK? Ok. Thanks, Richard. Andrew
Re: [PATCH] Improving uniform_vector_p() function.
On Tue, Oct 1, 2013 at 7:31 PM, Cong Hou co...@google.com wrote:

The current uniform_vector_p() function only returns non-NULL when the vector is directly a uniform vector. For example, for the following gimple code:

vect_cst_.15_91 = {_9, _9, _9, _9, _9, _9, _9, _9};

The current implementation can only detect that {_9, _9, _9, _9, _9, _9, _9, _9} is a uniform vector, but fails to recognize vect_cst_.15_91 is also one. This simple patch searches through assignment chains to find more uniform vectors.

Changing uniform_vector_p looks wrong - it is a predicate on GENERIC and you are adding SSA specifics to it. I suggest you simply lookup the def of the SSA name for the example you give in later mail.

Richard.

thanks,
Cong

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 45c1667..b42f8a9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2013-10-01 Cong Hou co...@google.com
+
+ * tree.c: Improve the function uniform_vector_p() so that a
+ vector assigned with a uniform vector is also treated as a
+ uniform vector.
+
diff --git a/gcc/tree.c b/gcc/tree.c
index 1c881e4..1d6d894 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10297,6 +10297,17 @@ uniform_vector_p (const_tree vec)
       return first;
     }
 
+  if (TREE_CODE (vec) == SSA_NAME)
+    {
+      gimple def = SSA_NAME_DEF_STMT (vec);
+      if (gimple_code (def) == GIMPLE_ASSIGN)
+        {
+          tree rhs = gimple_op (def, 1);
+          if (VECTOR_TYPE_P (TREE_TYPE (rhs)))
+            return uniform_vector_p (rhs);
+        }
+    }
+
   return NULL_TREE;
 }
Re: [wwwdocs] Buildstat update for 4.8
On Tue, 1 Oct 2013, Tom G. Christensen wrote: Latest results for gcc 4.8.x. -tgc Testresults for 4.8.1 sparc64-sun-solaris2.9 Thanks! Gerald
[C++ Patch] PR 58565
Hi,

here, with -std=c++11 LABEL_EXPR is unhandled by potential_constant_expression_1 and we can't emit a meaningful diagnostic. It seems safe to just add it to the main switch, together with the existing LABEL_DECL.

Tested x86_64-linux.

Thanks,
Paolo.

//

/cp
2013-10-02 Paolo Carlini paolo.carl...@oracle.com

PR c++/58565
* semantics.c (potential_constant_expression_1): Handle LABEL_EXPR.

/testsuite
2013-10-02 Paolo Carlini paolo.carl...@oracle.com

PR c++/58565
* g++.dg/parse/crash64.C: New.

Index: cp/semantics.c
===
--- cp/semantics.c (revision 203101)
+++ cp/semantics.c (working copy)
@@ -8422,6 +8422,7 @@ potential_constant_expression_1 (tree t, bool want
     case OVERLOAD:
     case TEMPLATE_ID_EXPR:
     case LABEL_DECL:
+    case LABEL_EXPR:
     case CONST_DECL:
     case SIZEOF_EXPR:
     case ALIGNOF_EXPR:
Index: testsuite/g++.dg/parse/crash64.C
===
--- testsuite/g++.dg/parse/crash64.C (revision 0)
+++ testsuite/g++.dg/parse/crash64.C (working copy)
@@ -0,0 +1,7 @@
+// PR c++/58565
+// { dg-options "" }
+
+void foo()
+{
+  int i = ({ L: ; }); // { dg-error "void value not ignored" }
+}
Copyright years for new old ports (Re: Ping^6: contribute Synopsys Designware ARC port)
Quoting Jakub Jelinek ja...@redhat.com: On Tue, Oct 01, 2013 at 04:22:38PM -0600, Jeff Law wrote: - The Copyright years should be 2013 in every new file. Or has this port been released before? The port has been available via git for quite a while: https://github.com/foss-for-synopsys-dwc-arc-processors/gcc Right. Was any of this code from Doug Evans's old ARC support? I don't have version control information to confirm or deny this. At any rate, to my knowledge, the Copyright year of any predecessor files has been included. And the old port wouldn't fill in any gaps in the 2009-2012 time frame, as any copy - if it happened - would have been much earlier. I've filled in Copyright year gaps from internal ChangeLogs / revision control info (inasmuch as available to me) where indicated, but some files were just not much touched at all. It doesn't hurt to have 2013 in the dates, and I suspect most files will get touched as a result of addressing Diego's comments. I've added 2013 for the affected files that didn't already have that year in their list/range. From my understanding, the condition for adding the current Copyright year without a source code change is to have a release in that year. Are we sure 4.9.0 will be released this year? Because GCC has switched to Copyright year ranges, in fact all the Copyright lines should be either 2013, or firstyear-2013. The way I recall the argument is that the releases we make allow us to add a copyright year without a source code change, and because files on trunk are included in a release at least once a year, you can fill in a range for their stay within trunk. However, this port hasn't been in the FSF gcc trunk till now, so what we have at this moment are the lists of years when the code was prepared.
[PATCH][ARM][committed] Add comment in arm_legitimize_reload_address to discourage use of plus_constant
Hi all,

I've committed the attached patch to make it more clear why we use gen_rtx_PLUS instead of plus_constant in that particular instance.

Committed as obvious with r203106.

Kyrill

2013-10-02 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/arm/arm.c (arm_legitimize_reload_address): Explain why plus_constant is not used.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e23e29a..2f1e0b8 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7547,7 +7547,11 @@ arm_legitimize_reload_address (rtx *p,
         return false;
 
       /* Reload the high part into a base reg; leave the low part
-         in the mem.  */
+         in the mem.
+         Note that replacing this gen_rtx_PLUS with plus_constant is
+         wrong in this case because we rely on the
+         (plus (plus reg c1) c2) structure being preserved so that
+         XEXP (*p, 0) in push_reload below uses the correct term.  */
       *p = gen_rtx_PLUS (GET_MODE (*p),
                          gen_rtx_PLUS (GET_MODE (*p), XEXP (*p, 0),
                                        GEN_INT (high)),
Re: [C++ Patch] PR 58565
OK. Jason
Re: Copyright years for new old ports (Re: Ping^6: contribute Synopsys Designware ARC port)
On Wed, Oct 02, 2013 at 06:05:14AM -0400, Joern Rennecke wrote: From my understanding, the condition for adding the current Copyright year without a source code change is to have a release in that year. Are we sure 4.9.0 will be released this year? We are sure we don't want 4.9.0 to be released this year ;) Because GCC has switched to Copyright year ranges, in fact all the Copyright lines should be either 2013, or firstyear-2013. The way I recall the argument is that the releases we make allow us to add a copyright year without a source code change, and because files on trunk are included in a release at least once a year, you can fill in a range for their stay within trunk. However, this port hasn't been in the FSF gcc trunk till now, so what we have at this moment are the lists of years when the code was prepared. But, all the other files in gcc/ are now someyear-2013, new files added are also 2013, if you make your files someyear-2011 or similar, then I think the scripts won't easily adjust it to someyear-2014 when we run the script early in January 2014. Jakub
Re: [wwwdocs] Mention libstdc++-v3 regex in 4.9 changes.html
On Tue, 1 Oct 2013, Tim Shen wrote: Hi, libstdc++-v3 regex is ready for releasing. Nice! Is it Ok to apply? By the way, do we need a News entry for this improvement? Yes, and yes. :-) Just one question improved experimental support sounds a bit weak. I think I understand where it is coming from, but is there a way to market this a bit stronger? Gerald
Re: Copyright years for new old ports (Re: Ping^6: contribute Synopsys Designware ARC port)
Quoting Jakub Jelinek ja...@redhat.com: On Wed, Oct 02, 2013 at 06:05:14AM -0400, Joern Rennecke wrote: From my understanding, the condition for adding the current Copyright year without a source code change is to have a release in that year. Are we sure 4.9.0 will be released this year? We are sure we don't want 4.9.0 to be released this year ;) Because GCC has switched to Copyright year ranges, in fact all the Copyright lines should be either 2013, or firstyear-2013. The way I recall the argument is that the releases we make allow us to add a copyright year without a source code change, and because files on trunk are included in a release at least once a year, you can fill in a range for their stay within trunk. However, this port hasn't been in the FSF gcc trunk till now, so what we have at this moment are the lists of years when the code was prepared. But, all the other files in gcc/ are now someyear-2013, new files added are also 2013, if you make your files someyear-2011 or similar, then I think the scripts won't easily adjust it to someyear-2014 when we run the script early in January 2014. So, should I add 2014 now? That would be no more speculative than adding the current year at the start of the year in anticipation of a release that year. Or put something in my Calendar to do it in 2014? Or should I backport the port into the gcc 4.8 branch, so assuming we still make another 4.8.x release this year, there is justification to add the 2013 year?
Re: [PATCH][AARCH64]Replace gen_rtx_PLUS with plus_constant
On 01/10/13 12:32, Marcus Shawcroft wrote: On 30 September 2013 14:20, Renlin Li renlin...@arm.com wrote: gcc/ChangeLog: 2013-09-30 Renlin Li renlin...@arm.com * config/aarch64/aarch64.c (aarch64_expand_prologue): Use plus_constant. (aarch64_expand_epilogue): Likewise. OK /Marcus I've committed the patch as r203108. Renlin, for future reference: Changelog rules expect two spaces between your name and the email. So I've committed the patch with the Changelog: 2013-10-02 Renlin Li renlin...@arm.com * config/aarch64/aarch64.c (aarch64_expand_prologue): Use plus_constant. (aarch64_expand_epilogue): Likewise. Cheers, Kyrill
Re: Copyright years for new old ports (Re: Ping^6: contribute Synopsys Designware ARC port)
On Wed, 2 Oct 2013, Joern Rennecke wrote:

From my understanding, the condition for adding the current Copyright year without a source code change is to have a release in that year. Are we sure 4.9.0 will be released this year?

We are sure we don't want 4.9.0 to be released this year ;)

But(!) we'll be releasing another dozen 4.9.0 snapshots this year. That probably was something the FSF had not considered when creating the original policy (recall how even GCC did not have a publicly accessible source code repository in those days).

So, should I add 2014 now? That would be no more speculative than adding the current year at the start of the year in anticipation of a release that year.

I would add 2013. This is when the port has hit the tree and when we will be making snapshots available via our and many other servers.

Gerald
[build] Update t-sparc, t-sol2 etc. for automatic dependencies
Inspired by the t-i386 changes, the following patch moves SPARC and Solaris files over to automatic dependencies.

Bootstrapped without regression on sparc-sun-solaris2.11, verified that dependencies were generated for affected files.

Ok for mainline?

	Rainer


2013-10-01  Rainer Orth  r...@cebitec.uni-bielefeld.de

	* config/t-sol2 (sol2-c.o): Remove header dependencies.
	Use $(COMPILE) and $(POSTCOMPILE).
	(sol2-cxx.o): Likewise.
	(sol2-stubs.o): Likewise.
	(sol2.o): Likewise.
	* config/x-solaris (host-solaris.o): Likewise.
	* config/sparc/t-sparc (sparc.o): Remove.
	(sparc-c.o): Remove header dependencies.
	Use $(COMPILE) and $(POSTCOMPILE).
	* config/sparc/x-sparc: Likewise.

# HG changeset patch
# Parent ca3146ecd0cfe562bf98a495014bbb9f65346986
Update t-sparc, t-sol2 etc. for automatic dependencies

diff --git a/gcc/config/sparc/t-sparc b/gcc/config/sparc/t-sparc
--- a/gcc/config/sparc/t-sparc
+++ b/gcc/config/sparc/t-sparc
@@ -18,19 +18,6 @@
 # along with GCC; see the file COPYING3.  If not see
 # <http://www.gnu.org/licenses/>.
 
-sparc.o: $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
-  $(TREE_H) $(RTL_H) $(REGS_H) hard-reg-set.h insn-config.h \
-  insn-codes.h conditions.h output.h $(INSN_ATTR_H) $(FLAGS_H) \
-  $(FUNCTION_H) $(EXCEPT_H) $(EXPR_H) $(OPTABS_H) $(RECOG_H) \
-  $(DIAGNOSTIC_CORE_H) $(GGC_H) $(TM_P_H) debug.h $(TARGET_H) \
-  $(TARGET_DEF_H) $(COMMON_TARGET_H) $(GIMPLE_H) $(TREE_PASS_H) \
-  langhooks.h reload.h $(PARAMS_H) $(DF_H) $(OPTS_H) $(CONTEXT_H) \
-  gt-sparc.h
-
-sparc-c.o: $(srcdir)/config/sparc/sparc-c.c \
-$(srcdir)/config/sparc/sparc-protos.h \
-$(CONFIG_H) $(SYSTEM_H) $(CPPLIB_H) $(FLAGS_H) \
-$(TM_P_H) coretypes.h $(TM_H) $(TREE_H) \
-$(C_COMMON_H) $(C_PRAGMA_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-		$(srcdir)/config/sparc/sparc-c.c
+sparc-c.o: $(srcdir)/config/sparc/sparc-c.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
diff --git a/gcc/config/sparc/x-sparc b/gcc/config/sparc/x-sparc
--- a/gcc/config/sparc/x-sparc
+++ b/gcc/config/sparc/x-sparc
@@ -1,3 +1,4 @@
-driver-sparc.o: $(srcdir)/config/sparc/driver-sparc.c \
-  $(CONFIG_H) $(SYSTEM_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+driver-sparc.o: $(srcdir)/config/sparc/driver-sparc.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
diff --git a/gcc/config/t-sol2 b/gcc/config/t-sol2
--- a/gcc/config/t-sol2
+++ b/gcc/config/t-sol2
@@ -17,22 +17,21 @@
 # <http://www.gnu.org/licenses/>.
 
 # Solaris-specific format checking and pragmas
-sol2-c.o: $(srcdir)/config/sol2-c.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  tree.h c-family/c-format.h $(C_PRAGMA_H) $(C_COMMON_H) $(CPPLIB_H) \
-  intl.h $(TM_H) $(TM_P_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+sol2-c.o: $(srcdir)/config/sol2-c.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
 
 # Solaris-specific C++ mangling.
-sol2-cxx.o: $(srcdir)/config/sol2-cxx.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  tree.h cp/cp-tree.h $(TM_H) $(TM_P_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+sol2-cxx.o: $(srcdir)/config/sol2-cxx.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
 
 # Corresponding stub routines.
-sol2-stubs.o: $(srcdir)/config/sol2-stubs.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  tree.h $(TM_H) $(TM_P_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+sol2-stubs.o: $(srcdir)/config/sol2-stubs.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
 
 # Solaris-specific attributes
-sol2.o: $(srcdir)/config/sol2.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  tree.h output.h $(TM_H) $(TARGET_H) $(TM_P_H) $(GGC_H) $(HASH_TABLE_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+sol2.o: $(srcdir)/config/sol2.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
diff --git a/gcc/config/x-solaris b/gcc/config/x-solaris
--- a/gcc/config/x-solaris
+++ b/gcc/config/x-solaris
@@ -1,4 +1,3 @@
-host-solaris.o : $(srcdir)/config/host-solaris.c $(CONFIG_H) $(SYSTEM_H) \
-  coretypes.h hosthooks.h hosthooks-def.h $(HOOKS_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-		$(srcdir)/config/host-solaris.c
+host-solaris.o: $(srcdir)/config/host-solaris.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [wwwdocs] Mention libstdc++-v3 regex in 4.9 changes.html
Hi, On 10/02/2013 12:19 PM, Gerald Pfeifer wrote: On Tue, 1 Oct 2013, Tim Shen wrote: Hi, libstdc++-v3 regex is ready for releasing. Nice! Is it Ok to apply? By the way, do we need a News entry for this improvement? Yes, and yes. :-) Just one question: "improved experimental support" sounds a bit weak. I think I understand where it is coming from, but is there a way to market this a bit more strongly? Minimally, I would talk about "improved support": the evolution from -std=c++0x to -std=c++11 meant that we aren't in experimental mode anymore. Paolo.
Re: [wwwdocs] Mention libstdc++-v3 regex in 4.9 changes.html
On 2 October 2013 11:41, Paolo Carlini wrote: Minimally, I would talk about improved support: the evolution from -std=c++0x to -std=c++11 meant that we aren't in experimental mode anymore. From speaking to Jason he's pretty adamant it's still experimental for now :-) My understanding was that -std=c++11 changed just because the standard was published, not because our support was finished. I agree we don't need to say experimental on the release notes page though.
Re: [wwwdocs] Mention libstdc++-v3 regex in 4.9 changes.html
On 10/02/2013 12:45 PM, Jonathan Wakely wrote: On 2 October 2013 11:41, Paolo Carlini wrote: Minimally, I would talk about improved support: the evolution from -std=c++0x to -std=c++11 meant that we aren't in experimental mode anymore. From speaking to Jason he's pretty adamant it's still experimental for now :-) My understanding was that -std=c++11 changed just because the standard was published, not because our support was finished. Note however, that these days we are also triumphally announcing that we are feature complete ;) Morally that kind of announcement seems to me quite inconsistent with saying that the support is experimental. That we have bugs, that's for sure, but still talking about experimental when we have a reference released Standard, and we are feature complete, seems to me a good way to confuse the users... But I don't have a strong opinion here, really. Just say in the announcement what suits your taste, try to be minimally consistent and I'm Ok with anything. Paolo.
Re: [wwwdocs] Mention libstdc++-v3 regex in 4.9 changes.html
On Wed, Oct 02, 2013 at 12:55:48PM +0200, Paolo Carlini wrote: On 10/02/2013 12:45 PM, Jonathan Wakely wrote: On 2 October 2013 11:41, Paolo Carlini wrote: Minimally, I would talk about improved support: the evolution from -std=c++0x to -std=c++11 meant that we aren't in experimental mode anymore. From speaking to Jason he's pretty adamant it's still experimental for now :-) My understanding was that -std=c++11 changed just because the standard was published, not because our support was finished. Note however, that these days we are also triumphally announcing that we are feature complete ;) Morally that kind of announcement seems to me quite inconsistent with saying that the support is We have announced only core language feature completeness, the library was known to be incomplete. And, I think for 4.9 the library C++11 support is still meant to be experimental because of the ABI issues, where we know we'll need to change std::string, std::list etc., but are waiting for Dodji's ABI verification stuff before we are confident we can do all those changes safely. Jakub
Re: [wwwdocs] Mention libstdc++-v3 regex in 4.9 changes.html
On 10/02/2013 12:59 PM, Jakub Jelinek wrote: We have announced only core language feature completeness, the library was known to be incomplete. And, I think for 4.9 the library C++11 support is still meant to be experimental because of the ABI issues, where we know we'll need to change std::string, std::list etc., but are waiting for Dodji's ABI verification stuff before we are confident we can do all those changes safely. Jakub That's a good point. But then both the announcement should be *much* more explicit that it only talks about the C++ front-end, not library (IMHO certainly isn't, and that's a rather serious issue) and c++0x_warning.h should be much more explicit that it refers to the library, not the front-end, as regards the experimental bit. Paolo.
Re: [build] Update t-sparc, t-sol2 etc. for automatic dependencies
Il 02/10/2013 12:39, Rainer Orth ha scritto: Inspired by the t-i386 changes, the following patch moves SPARC and Solaris files over to automatic dependencies. Bootstrapped without regression on sparc-sun-solaris2.11, verified that dependencies were generated for affected files. Ok for mainline? Rainer 2013-10-01 Rainer Orth r...@cebitec.uni-bielefeld.de * config/t-sol2 (sol2-c.o): Remove header dependencies. Use $(COMPILE) and $(POSTCOMPILE). (sol2-cxx.o): Likewise. (sol2-stubs.o): Likewise. (sol2.o): Likewise. * config/x-solaris (host-solaris.o): Likewise. * config/sparc/t-sparc (sparc.o): Remove. (sparc-c.o): Remove header dependencies. Use $(COMPILE) and $(POSTCOMPILE). * config/sparc/x-sparc: Likewise. Sure. Paolo
Re: Copyright years for new old ports (Re: Ping^6: contribute Synopsys Designware ARC port)
Quoting Gerald Pfeifer ger...@pfeifer.com: On Wed, 2 Oct 2013, Joern Rennecke wrote: From my understanding, the condition for adding the current Copyright year without a source code change is to have a release in that year. Are we sure 4.9.0 will be released this year? We are sure we don't want 4.9.0 to be released this year ;) But(!) we'll be releasing another dozen 4.9.0 snapshots this year. That probably was something the FSF had not considered when creating the original policy (recall how even GCC did not have a publicly accessible source code repository in those days). So, should I add 2014 now? That would be no more speculative than adding the current year at the start of the year in anticipation of a release that year. I would add 2013. This is when the port has hit the tree and when we will be doing snapshots available via our and many other servers. Ok, committed as revision 203110.
Re: [PATCH] Reducing number of alias checks in vectorization.
On Tue, 1 Oct 2013, Cong Hou wrote:

When alias exists between data refs in a loop, to vectorize it GCC does loop versioning and adds runtime alias checks. Basically for each pair of data refs with possible data dependence, there will be two comparisons generated to make sure there is no aliasing between them in each iteration of the vectorized loop. If there are many such data refs pairs, the number of comparisons can be very large, which is a big overhead.

However, in some cases it is possible to reduce the number of those comparisons. For example, for the following loop, we can detect that b[0] and b[1] are two consecutive member accesses so that we can combine the alias check between a[0:100]&b[0] and a[0:100]&b[1] into checking a[0:100]&b[0:2]:

void foo(int*a, int* b)
{
  for (int i = 0; i < 100; ++i)
    a[i] = b[0] + b[1];
}

Actually, the requirement of consecutive memory accesses is too strict. For the following loop, we can still combine the alias checks between a[0:100]&b[0] and a[0:100]&b[100]:

void foo(int*a, int* b)
{
  for (int i = 0; i < 100; ++i)
    a[i] = b[0] + b[100];
}

This is because if b[0] is not in a[0:100] and b[100] is not in a[0:100] then a[0:100] cannot be between b[0] and b[100]. We only need to check a[0:100] and b[0:101] don't overlap.

More generally, consider two pairs of data refs (a, b1) and (a, b2). Suppose addr_b1 and addr_b2 are basic addresses of data ref b1 and b2; offset_b1 and offset_b2 (offset_b1 < offset_b2) are offsets of b1 and b2, and segment_length_a, segment_length_b1, and segment_length_b2 are segment length of a, b1, and b2. Then we can combine the two comparisons into one if the following condition is satisfied:

offset_b2 - offset_b1 - segment_length_b1 < segment_length_a

This patch detects those combination opportunities to reduce the number of alias checks. It is tested on an x86-64 machine.
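For illustration only, the merging condition described above can be sketched outside of GCC. This is not code from the patch: plain integers stand in for GCC's trees, and `Segment`, `can_merge` and `merge` are hypothetical names. Offsets and lengths are in arbitrary but consistent units.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one data ref b_k relative to a shared base
// address: it covers [offset, offset + seg_len).
struct Segment {
  std::int64_t offset;
  std::int64_t seg_len;
};

// Checks (a, b1) and (a, b2) with offset_b1 < offset_b2 can be merged when
// the gap between the end of b1 and the start of b2 is too small for a's
// segment to fit into it:
//   offset_b2 - offset_b1 - seg_len_b1 < seg_len_a
bool can_merge(const Segment& b1, const Segment& b2, std::int64_t seg_len_a) {
  assert(b1.offset < b2.offset);
  return b2.offset - b1.offset - b1.seg_len < seg_len_a;
}

// The merged segment starts at b1 and extends to whichever of b1 and b2
// ends last.
Segment merge(const Segment& b1, const Segment& b2) {
  assert(b1.offset < b2.offset);
  return Segment{b1.offset,
                 std::max(b1.seg_len, b2.offset + b2.seg_len - b1.offset)};
}
```

With the second loop from the mail, b[0] and b[100] become `Segment{0, 1}` and `Segment{100, 1}` and a[0:100] has length 100; `can_merge` accepts the pair and `merge` yields a segment of length 101, i.e. b[0:101].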
Apart from the other comments you got (with which I agree), the patch seems to do two things, namely also:

+  /* Extract load and store statements on pointers with zero-stride
+     accesses.  */
+  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
+    {

which I'd rather see in a separate patch (and done also when the loop doesn't require versioning for alias).

Also combining the alias checks in vect_create_cond_for_alias_checks is nice but doesn't properly fix the use of the vect-max-version-for-alias-checks param which currently inhibits vectorization of the HIMENO benchmark by default (and makes us look bad compared to LLVM). So I believe this merging should be done incrementally when we collect the DDRs we need to test in vect_mark_for_runtime_alias_test.

Thanks for working on this,
Richard.

thanks,
Cong

Index: gcc/tree-vect-loop-manip.c
===================================================================
--- gcc/tree-vect-loop-manip.c (revision 202662)
+++ gcc/tree-vect-loop-manip.c (working copy)
@@ -19,6 +19,10 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#include <vector>
+#include <utility>
+#include <algorithm>
+
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -2248,6 +2252,74 @@ vect_vfa_segment_size (struct data_refer
   return segment_length;
 }
 
+namespace
+{
+
+/* struct dr_addr_with_seg_len
+
+   A struct storing information of a data reference, including the data
+   ref itself, its basic address, the access offset and the segment length
+   for aliasing checks.  */
+
+struct dr_addr_with_seg_len
+{
+  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
+    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
+
+  data_reference* dr;
+  tree basic_addr;
+  tree offset;
+  tree seg_len;
+};
+
+/* Operator == between two dr_addr_with_seg_len objects.
+
+   This equality operator is used to make sure two data refs
+   are the same one so that we will consider to combine the
+   aliasing checks of those two pairs of data dependent data
+   refs.  */
+
+bool operator == (const dr_addr_with_seg_len& d1,
+                  const dr_addr_with_seg_len& d2)
+{
+  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
+         && operand_equal_p (d1.offset, d2.offset, 0)
+         && operand_equal_p (d1.seg_len, d2.seg_len, 0);
+}
+
+typedef std::pair <dr_addr_with_seg_len, dr_addr_with_seg_len>
+  dr_addr_with_seg_len_pair_t;
+
+
+/* Operator < between two dr_addr_with_seg_len_pair_t objects.
+
+   This operator is used to sort objects of dr_addr_with_seg_len_pair_t
+   so that we can combine aliasing checks during one scan.  */
+
+bool operator < (const dr_addr_with_seg_len_pair_t& p1,
+                 const dr_addr_with_seg_len_pair_t& p2)
+{
+  const dr_addr_with_seg_len& p11 = p1.first;
+  const dr_addr_with_seg_len& p12 = p1.second;
+  const dr_addr_with_seg_len& p21 = p2.first;
+  const dr_addr_with_seg_len& p22 = p2.second;
+
+  if (p11.basic_addr !=
Re: [patch] move htab_iterator
On 10/01/2013 05:04 PM, DJ Delorie wrote:

I'm typically against adding things to libiberty because there's no other place for them. The purpose of libiberty is to provide a portability layer, not a trash can. However, htab is already in there, and the argument for putting its accessors there is sound.

I think it is the place they belong if someone wants them. Jakub didn't want to lose them, and Tom expressed an interest in them.

However, most of the other functions in hashtab.h are of the form htab_*(). Could these be changed to match that pattern? If these functions are unused, it shouldn't matter to rename them. (although, if they're unused, it shouldn't matter to discard them, either)

I can easily rename them, like so. They are also here for the record now... If you don't want to approve this because it is unused (and I understand that), anyone can re-introduce them when they actually want to use them...

Andrew

gcc
	* tree-flow.h (htab_iterator, FOR_EACH_HTAB_ELEMENT): Move from here.
	* tree-flow-inline.h (first_htab_element, end_htab_p,
	next_htab_element): Also move from here.

include
	* hashtab.h (htab_iterator, HTAB_FOR_EACH_ELEMENT, htab_first_element,
	htab_next_element): Rename and move to here.
	(htab_end_p): Rename, move and change boolean to int and 0/1.

Index: gcc/tree-flow.h
===================================================================
*** gcc/tree-flow.h (revision 203068)
--- gcc/tree-flow.h (working copy)
*************** struct GTY(()) gimple_df {
*** 92,112 ****
    htab_t GTY ((param_is (struct tm_restart_node))) tm_restart;
  };
  
- 
- typedef struct
- {
-   htab_t htab;
-   PTR *slot;
-   PTR *limit;
- } htab_iterator;
- 
- /* Iterate through the elements of hashtable HTAB, using htab_iterator ITER,
-    storing each element in RESULT, which is of type TYPE.  */
- #define FOR_EACH_HTAB_ELEMENT(HTAB, RESULT, TYPE, ITER) \
-   for (RESULT = (TYPE) first_htab_element (&(ITER), (HTAB)); \
-        !end_htab_p (&(ITER)); \
-        RESULT = (TYPE) next_htab_element (&(ITER)))
- 
  static inline int get_lineno (const_gimple);
  
  /*---
--- 92,97 ----
Index: gcc/tree-flow-inline.h
===================================================================
*** gcc/tree-flow-inline.h (revision 203068)
--- gcc/tree-flow-inline.h (working copy)
*************** gimple_vop (const struct function *fun)
*** 42,93 ****
    return fun->gimple_df->vop;
  }
  
- /* Initialize the hashtable iterator HTI to point to hashtable TABLE */
- 
- static inline void *
- first_htab_element (htab_iterator *hti, htab_t table)
- {
-   hti->htab = table;
-   hti->slot = table->entries;
-   hti->limit = hti->slot + htab_size (table);
-   do
-     {
-       PTR x = *(hti->slot);
-       if (x != HTAB_EMPTY_ENTRY && x != HTAB_DELETED_ENTRY)
-         break;
-     } while (++(hti->slot) < hti->limit);
- 
-   if (hti->slot < hti->limit)
-     return *(hti->slot);
-   return NULL;
- }
- 
- /* Return current non-empty/deleted slot of the hashtable pointed to by HTI,
-    or NULL if we have reached the end.  */
- 
- static inline bool
- end_htab_p (const htab_iterator *hti)
- {
-   if (hti->slot >= hti->limit)
-     return true;
-   return false;
- }
- 
- /* Advance the hashtable iterator pointed to by HTI to the next element of the
-    hashtable.  */
- 
- static inline void *
- next_htab_element (htab_iterator *hti)
- {
-   while (++(hti->slot) < hti->limit)
-     {
-       PTR x = *(hti->slot);
-       if (x != HTAB_EMPTY_ENTRY && x != HTAB_DELETED_ENTRY)
-         return x;
-     };
-   return NULL;
- }
- 
  /* Get the number of the next statement uid to be allocated.  */
  static inline unsigned int
  gimple_stmt_max_uid (struct function *fn)
--- 42,47 ----
Index: include/hashtab.h
===================================================================
*** include/hashtab.h (revision 203067)
--- include/hashtab.h (working copy)
*************** extern hashval_t iterative_hash (const v
*** 202,207 ****
--- 202,269 ----
  /* Shorthand for hashing something with an intrinsic size.  */
  #define iterative_hash_object(OB,INIT) iterative_hash (&OB, sizeof (OB), INIT)
  
+ /* GCC style hash table iterator.  */
+ 
+ typedef struct
+ {
+   htab_t htab;
+   PTR *slot;
+   PTR *limit;
+ } htab_iterator;
+ 
+ /* Iterate through the elements of hashtable HTAB, using htab_iterator ITER,
+    storing each element in RESULT, which is of type TYPE.  */
+ #define HTAB_FOR_EACH_ELEMENT(HTAB, RESULT, TYPE, ITER) \
+   for (RESULT = (TYPE) htab_first_element (&(ITER), (HTAB)); \
+        !htab_end_p (&(ITER)); \
+        RESULT = (TYPE) htab_next_element (&(ITER)))
+ 
+ /* Initialize the hashtable iterator HTI to point to hashtable TABLE */
+ 
+ static inline void *
+ htab_first_element (htab_iterator *hti, htab_t table)
+ {
+   hti->htab = table;
+   hti->slot = table->entries;
+   hti->limit = hti->slot + htab_size (table);
+   do
+     {
+       PTR x = *(hti->slot);
+       if (x != HTAB_EMPTY_ENTRY && x !=
[PATCH, build]: Remove -Wno-warning from expmed.c compilation
Hello!

Compiling expmed.c has been warning free for some time now.

2013-10-02  Uros Bizjak  ubiz...@gmail.com

	* Makefile.in (expmed.o-warn): Remove.

Bootstrapped on x86_64-pc-linux-gnu.

OK for mainline?

Uros.

Index: Makefile.in
===================================================================
--- Makefile.in (revision 203101)
+++ Makefile.in (working copy)
@@ -193,7 +193,6 @@
 # flex output may yield harmless "no previous prototype" warnings
 build/gengtype-lex.o-warn = -Wno-error
 gengtype-lex.o-warn = -Wno-error
-expmed.o-warn = -Wno-error
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
Re: [patch] More tree-flow.h prototypes.
On 10/02/2013 04:37 AM, Richard Biener wrote: On Tue, Oct 1, 2013 at 11:01 PM, Andrew MacLeod amacl...@redhat.com wrote: This patch moves prototypes into gimple-fold.h (which already existed). There were a few in tree-flow.h and a bunch in gimple.h. The routines are used frequently enough that it makes sense to include gimple-fold.h from gimple.h instead of from within each .c file that needs it. (presumably why the prototypes were in gimple.h to begin with). I took gimple-fold.h out of whatever .c files it was included in. tree-ssa-copy.h was also created for the prototypes in that file and included from tree-ssa.h. These should probably be moved elsewhere (tree-ssa-copy.c is supposed to be the copy propagation pass file). But that can be done as followup. hmm, easy enough to move them *all* to tree-ssa-propagate.[ch] right now and check it in... That seems like the right place for all of them and then we don't even need to create tree-ssa-copy.h...? Andrew
Make the 2 versions of delete more similar
Hello, I don't understand why those 2 files differ by more than 1 extra argument, so I am changing that. Bootstrap and testsuite on x86_64.

2013-10-03  Marc Glisse  marc.gli...@inria.fr

	* libsupc++/del_op.cc (operator delete): Don't test for 0 before free.
	* libsupc++/del_opnt.cc (free): Only declare if freestanding.
	(operator delete): Qualify free with std::.

-- 
Marc Glisse
Index: libsupc++/del_op.cc
===================================================================
--- libsupc++/del_op.cc (revision 203101)
+++ libsupc++/del_op.cc (working copy)
@@ -36,13 +36,12 @@ _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 #else
 # include <cstdlib>
 #endif
 
 #include "new"
 
 _GLIBCXX_WEAK_DEFINITION void
 operator delete(void* ptr) _GLIBCXX_USE_NOEXCEPT
 {
-  if (ptr)
-    std::free(ptr);
+  std::free(ptr);
 }
Index: libsupc++/del_opnt.cc
===================================================================
--- libsupc++/del_opnt.cc (revision 203101)
+++ libsupc++/del_opnt.cc (working copy)
@@ -17,19 +17,31 @@
 // Under Section 7 of GPL version 3, you are granted additional
 // permissions described in the GCC Runtime Library Exception, version
 // 3.1, as published by the Free Software Foundation.
 
 // You should have received a copy of the GNU General Public License and
 // a copy of the GCC Runtime Library Exception along with this program;
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // <http://www.gnu.org/licenses/>.
 
 #include <bits/c++config.h>
-#include "new"
 
-extern "C" void free (void *);
+#if !_GLIBCXX_HOSTED
+// A freestanding C runtime may not provide "free" -- but there is no
+// other reasonable way to implement "operator delete".
+namespace std
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+  extern "C" void free(void*);
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace
+#else
+# include <cstdlib>
+#endif
+
+#include "new"
 
 _GLIBCXX_WEAK_DEFINITION void
 operator delete (void *ptr, const std::nothrow_t&) _GLIBCXX_USE_NOEXCEPT
 {
-  free (ptr);
+  std::free(ptr);
 }
Re: operator new returns nonzero
New version after Jakub's comment, bootstrap and testsuite on x86_64. 2013-10-03 Marc Glisse marc.gli...@inria.fr PR c++/19476 gcc/ * calls.c (alloca_call_p): Use get_callee_fndecl. * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. * g++.dg/tree-ssa/pr19476-3.C: Likewise. * g++.dg/tree-ssa/pr19476-4.C: Likewise. -- Marc GlisseIndex: calls.c === --- calls.c (revision 203101) +++ calls.c (working copy) @@ -628,25 +628,24 @@ gimple_alloca_call_p (const_gimple stmt) return true; return false; } /* Return true when exp contains alloca call. */ bool alloca_call_p (const_tree exp) { + tree fndecl; if (TREE_CODE (exp) == CALL_EXPR - TREE_CODE (CALL_EXPR_FN (exp)) == ADDR_EXPR - (TREE_CODE (TREE_OPERAND (CALL_EXPR_FN (exp), 0)) == FUNCTION_DECL) - (special_function_p (TREE_OPERAND (CALL_EXPR_FN (exp), 0), 0) - ECF_MAY_BE_ALLOCA)) + (fndecl = get_callee_fndecl (exp)) + (special_function_p (fndecl, 0) ECF_MAY_BE_ALLOCA)) return true; return false; } /* Return TRUE if FNDECL is either a TM builtin or a TM cloned function. Return FALSE otherwise. 
*/ static bool is_tm_builtin (const_tree fndecl) { Index: fold-const.c === --- fold-const.c(revision 203101) +++ fold-const.c(working copy) @@ -16215,21 +16215,29 @@ tree_expr_nonzero_warnv_p (tree t, bool case MODIFY_EXPR: case BIND_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1), strict_overflow_p); case SAVE_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0), strict_overflow_p); case CALL_EXPR: - return alloca_call_p (t); + { + tree fndecl = get_callee_fndecl (t); + if (!fndecl) return false; + if (flag_delete_null_pointer_checks !flag_check_new +DECL_IS_OPERATOR_NEW (fndecl) +!TREE_NOTHROW (fndecl)) + return true; + return alloca_call_p (t); + } default: break; } return false; } /* Return true when T is an address and is known to be nonzero. Handle warnings about undefined signed overflow. */ Index: testsuite/g++.dg/tree-ssa/pr19476-1.C === --- testsuite/g++.dg/tree-ssa/pr19476-1.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-1.C (working copy) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-ccp1 } */ + +#include new + +int f(){ + return 33 + (0 == new(std::nothrow) int); +} +int g(){ + return 42 + (0 == new int[50]); +} + +/* { dg-final { scan-tree-dump return 42 ccp1 } } */ +/* { dg-final { scan-tree-dump-not return 33 ccp1 } } */ +/* { dg-final { cleanup-tree-dump ccp1 } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C ___ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Index: testsuite/g++.dg/tree-ssa/pr19476-2.C === --- testsuite/g++.dg/tree-ssa/pr19476-2.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-2.C (working copy) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +#include new + +int f(){ + int *p = new(std::nothrow) int; + return 33 + (0 == p); +} +int g(){ + int *p = new int[50]; + return 42 + (0 == p); +} + +/* { 
dg-final { scan-tree-dump return 42 optimized } } */ +/* { dg-final { scan-tree-dump-not return 33 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C ___ Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: testsuite/g++.dg/tree-ssa/pr19476-3.C === --- testsuite/g++.dg/tree-ssa/pr19476-3.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-3.C (working copy) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O3 -fcheck-new -fdump-tree-optimized } */ + +#include new + +int g(){ + return 42 + (0 == new int); +} + +/* { dg-final { scan-tree-dump-not return 42 optimized } } */
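As a rough illustration of the language rule the testcases above rely on (a sketch, not part of the patch): the ordinary throwing form of operator new either returns a non-null pointer or throws std::bad_alloc, while the std::nothrow form signals failure by returning null. The functions below mirror f and g from the testcases but free the memory again; only the throwing form can have its null comparison folded away.

```cpp
#include <cassert>
#include <new>

// Throwing new: on failure std::bad_alloc is thrown, so p is never null
// here and the comparison is always false.
int g() {
  int* p = new int[50];
  int result = 42 + (p == nullptr);
  delete[] p;
  return result;
}

// Nothrow new: failure is reported by a null result, so the comparison
// has to stay.
int f() {
  int* p = new (std::nothrow) int;
  int result = 33 + (p == nullptr);
  delete p;  // deleting a null pointer is fine
  return result;
}
```

Whenever g returns at all, it returns exactly 42; f may return 33 or 34 depending on whether the allocation succeeded.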
Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64
On Tue, 2013-10-01 at 20:21 -0500, Bill Schmidt wrote: On Tue, 2013-10-01 at 23:57 +0100, Yufeng Zhang wrote: On 10/01/13 20:55, Bill Schmidt wrote: On Tue, 2013-10-01 at 11:56 -0500, Bill Schmidt wrote: OK, thanks. The problem that you've encountered is that you are attempting to do something illegal. ;) (Bin's original patch is actually to blame for that, as well as me for not catching it then.) As your new test shows, it is unsafe to do the transformation in backtrace_base_for_ref when widening from an unsigned type, because the unsigned type has wrap semantics by default. (The actual test must be done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or removed by compile option -- see the comments with legal_cast_p and legal_cast_p_1 later in the module.) You cannot in general prove that the transformation is allowable for a specific constant, because you don't know that what you're adding it to won't cause an overflow that's handled incorrectly. I believe the correct fix for the unsigned-overflow case is to fail backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns false, where in_type is the type of the new *PBASE, and out_type is the widening type that you're looking through. So you can't just STRIP_NOPS, you have to check the cast for legitimacy for this transformation. This does not explain why backtrace_base_for_ref does not find all the opportunities on slsr-39.c. I don't immediately see what's preventing that. Note that the transformation is legal in that case because you are widening from a signed int to an unsigned int, which won't cause problems. You guys need to dig deeper into why those opportunities are missed when sizetype is larger than int. Let me know if you need help figuring it out. Sorry, I had to leave before and wanted to get this response back to you in case I didn't get back soon. 
I've looked at this some more, and your general approach should work ok once you get the legal_cast_p check in place where you do the get_unwidened call now. Once you know you have a legal widening, you don't have to worry about the safe_to_multiply_p stuff. I.e., you don't need the last two chunks in the patch to backtrace_base_for_ref, and you don't need the unwidened_p variable. It should all fall out properly by just restricting your unwidening to legal casts. Many thanks for looking into the issue so promptly. I've updated the patch; I have to use legal_cast_p_1 instead as the gimple node is no longer available by then. Does the new patch look sane? Yes, much better. I'm happy with this approach. However, please restore the correct whitespace before the { at -786,7 +795,7. Thanks for fixing this up! Bill (Just a reminder that I can't approve your patch; you need a maintainer for that. But it looks good to me.) Sometime when I get a moment I'm probably going to change this to handle the casting when the candidates are added to the table. I think we should look through the casts and distribute the multiply at that time. But for now what you have here is good. Thanks, Bill The regtest on aarch64 and bootstrapping on x86-64 are still running. Thanks, Yufeng gcc/ * gimple-ssa-strength-reduction.c (legal_cast_p_1): Forward declaration. (backtrace_base_for_ref): Call get_unwidened with 'base_in' if 'base_in' represent a conversion and legal_cast_p_1 holds; set 'base_in' with the returned value from get_unwidened. gcc/testsuite/ * gcc.dg/tree-ssa/slsr-40.c: New test.
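The wrap-around concern raised above can be seen in a small standalone sketch. This is not GCC code; `add_then_widen` and `widen_then_add` are hypothetical names, and fixed-width integers stand in for the narrow unsigned type and the wider sizetype. Looking through the widening cast is only safe if both functions always agree, and for a wrapping unsigned type they do not.

```cpp
#include <cassert>
#include <cstdint>

// Add in the narrow unsigned type, then widen: the sum wraps modulo 2^32
// before the conversion takes place.
std::uint64_t add_then_widen(std::uint32_t i, std::uint32_t c) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(i + c));
}

// Widen first, then add in the wide type: no wrap-around for these inputs.
std::uint64_t widen_then_add(std::uint32_t i, std::uint32_t c) {
  return static_cast<std::uint64_t>(i) + c;
}
```

For small operands the two agree, but with i = 0xFFFFFFFF and c = 2 the first yields 1 and the second 0x100000001, which is why the constant cannot be pulled out of the cast in general.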
Re: Make the 2 versions of delete more similar
On 2 October 2013 13:28, Marc Glisse marc.gli...@inria.fr wrote:

Hello, I don't understand why those 2 files differ by more than 1 extra argument, so I am changing that. Bootstrap and testsuite on x86_64.

2013-10-03  Marc Glisse  marc.gli...@inria.fr

	* libsupc++/del_op.cc (operator delete): Don't test for 0 before free.

Just checking, for the nervous: Is the plan that this change will not affect any code behaviour (as correct implementations of free are happy to take a NULL pointer, and not do anything)?

Chris

 _GLIBCXX_WEAK_DEFINITION void
 operator delete(void* ptr) _GLIBCXX_USE_NOEXCEPT
 {
-  if (ptr)
-    std::free(ptr);
+  std::free(ptr);
 }
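To make the question concrete, here is a minimal sketch (the helper names are made up for illustration and are not libsupc++ code) of the old and new bodies side by side. The C standard specifies that free does nothing when handed a null pointer, so on a conforming C library the two helpers behave the same.

```cpp
#include <cassert>
#include <cstdlib>

// Old shape: test for null before calling free.
void release_with_check(void* ptr) {
  if (ptr)
    std::free(ptr);
}

// New shape: rely on free(NULL) being a no-op.
void release_without_check(void* ptr) {
  std::free(ptr);
}
```

Both helpers can be called with a null pointer or with a pointer obtained from std::malloc; the only difference is one branch before the call.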
Re: operator new returns nonzero
On 10/02/2013 08:33 AM, Marc Glisse wrote:

+	if (flag_delete_null_pointer_checks && !flag_check_new

You can't use flag_check_new in language-independent code without moving it from c.opt to common.opt.

Jason
Re: [patch] More tree-flow.h prototypes.
On 10/02/2013 07:58 AM, Andrew MacLeod wrote: On 10/02/2013 04:37 AM, Richard Biener wrote: On Tue, Oct 1, 2013 at 11:01 PM, Andrew MacLeod amacl...@redhat.com wrote: This patch moves prototypes into gimple-fold.h (which already existed). There were a few in tree-flow.h and a bunch in gimple.h. The routines are used frequently enough that it makes sense to include gimple-fold.h from gimple.h instead of from within each .c file that needs it. (presumably why the prototypes were in gimple.h to begin with). I took gimple-fold.h out of whatever .c files it was included in. tree-ssa-copy.h was also created for the prototypes in that file and included from tree-ssa.h. These should probably be moved elsewhere (tree-ssa-copy.c is supposed to be the copy propagation pass file). But that can be done as followup. hmm, easy enough to move them *all* to tree-ssa-propagate.[ch] right now and check it in... That seems like the right place for all of them and then we don't even need to create tree-ssa-copy.h...? Like so.. and directly include tree-ssa-propagate.h in the 3 .c files that need it now. bootstrapped on x86_64-unknown-linux-gnu.. regressions running. Prefer this? Andrew

	* tree-flow.h: Remove some prototypes.
	* gimple-fold.h: Add prototypes from gimple.h and tree-flow.h.
	* tree-ssa-propagate.h: Relocate prototypes from tree-flow.h.
	* tree-ssa-copy.c (may_propagate*, propagate_value, replace_exp,
	propagate_tree_value*): Move from here to...
	* tree-ssa-propagate.c (may_propagate*, propagate_value, replace_exp,
	propagate_tree_value*): Relocate here.
	* tree-ssa-propagate.h: Relocate prototypes from tree-flow.h.
	* gimple.h: Include gimple-fold.h, move prototypes into gimple-fold.h.
	* gimple-fold.c: Remove gimple-fold.h from include list.
	* tree-vrp.c: Remove gimple-fold.h from include list.
	* tree-ssa-sccvn.c: Remove gimple-fold.h from include list.
	* tree-ssa-ccp.c: Remove gimple-fold.h from include list.
	* tree-scalar-evolution.c: Add tree-ssa-propagate.h to include list.
* tree-ssa-pre.c: Add tree-ssa-propagate.h to include list. * sese.c: Add tree-ssa-propagate.h to include list. Index: tree-flow.h === *** tree-flow.h (revision 203085) --- tree-flow.h (working copy) *** void mark_virtual_operands_for_renaming *** 297,306 tree get_current_def (tree); void set_current_def (tree, tree); - /* In tree-ssa-ccp.c */ - tree fold_const_aggregate_ref (tree); - tree gimple_fold_stmt_to_constant (gimple, tree (*)(tree)); - /* In tree-ssa-dom.c */ extern void dump_dominator_optimization_stats (FILE *); extern void debug_dominator_optimization_stats (void); --- 297,302 *** int loop_depth_of_name (tree); *** 308,322 tree degenerate_phi_result (gimple); bool simple_iv_increment_p (gimple); - /* In tree-ssa-copy.c */ - extern void propagate_value (use_operand_p, tree); - extern void propagate_tree_value (tree *, tree); - extern void propagate_tree_value_into_stmt (gimple_stmt_iterator *, tree); - extern void replace_exp (use_operand_p, tree); - extern bool may_propagate_copy (tree, tree); - extern bool may_propagate_copy_into_stmt (gimple, tree); - extern bool may_propagate_copy_into_asm (tree); - /* In tree-ssa-loop-ch.c */ bool do_while_loop_p (struct loop *); --- 304,309 Index: gimple-fold.h === *** gimple-fold.h (revision 203085) --- gimple-fold.h (working copy) *** along with GCC; see the file COPYING3. *** 22,31 #ifndef GCC_GIMPLE_FOLD_H #define GCC_GIMPLE_FOLD_H ! tree fold_const_aggregate_ref_1 (tree, tree (*) (tree)); ! tree fold_const_aggregate_ref (tree); ! ! tree gimple_fold_stmt_to_constant_1 (gimple, tree (*) (tree)); ! tree gimple_fold_stmt_to_constant (gimple, tree (*) (tree)); #endif /* GCC_GIMPLE_FOLD_H */ --- 22,43 #ifndef GCC_GIMPLE_FOLD_H #define GCC_GIMPLE_FOLD_H ! extern tree canonicalize_constructor_val (tree, tree); ! extern tree get_symbol_constant_value (tree); ! extern void gimplify_and_update_call_from_tree (gimple_stmt_iterator *, tree); ! extern tree gimple_fold_builtin (gimple); ! 
extern tree gimple_extract_devirt_binfo_from_cst (tree, tree); ! extern bool fold_stmt (gimple_stmt_iterator *); ! extern bool fold_stmt_inplace (gimple_stmt_iterator *); ! extern tree maybe_fold_and_comparisons (enum tree_code, tree, tree, ! enum tree_code, tree, tree); ! extern tree maybe_fold_or_comparisons (enum tree_code, tree, tree, ! enum tree_code, tree, tree); ! extern tree gimple_fold_stmt_to_constant_1 (gimple, tree (*) (tree)); ! extern tree gimple_fold_stmt_to_constant (gimple, tree (*) (tree)); ! extern tree fold_const_aggregate_ref_1 (tree, tree (*) (tree)); ! extern tree fold_const_aggregate_ref (tree); ! extern tree gimple_get_virt_method_for_binfo (HOST_WIDE_INT, tree); ! extern bool
Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64
On 10/02/13 02:21, Bill Schmidt wrote:
> On Tue, 2013-10-01 at 23:57 +0100, Yufeng Zhang wrote:
>> On 10/01/13 20:55, Bill Schmidt wrote:
>>> On Tue, 2013-10-01 at 11:56 -0500, Bill Schmidt wrote:
>>>> OK, thanks. The problem that you've encountered is that you are attempting to do something illegal. ;) (Bin's original patch is actually to blame for that, as well as me for not catching it then.)
>>>>
>>>> As your new test shows, it is unsafe to do the transformation in backtrace_base_for_ref when widening from an unsigned type, because the unsigned type has wrap semantics by default. (The actual test must be done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or removed by compile option -- see the comments with legal_cast_p and legal_cast_p_1 later in the module.) You cannot in general prove that the transformation is allowable for a specific constant, because you don't know that what you're adding it to won't cause an overflow that's handled incorrectly.
>>>>
>>>> I believe the correct fix for the unsigned-overflow case is to fail backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns false, where in_type is the type of the new *PBASE, and out_type is the widening type that you're looking through. So you can't just STRIP_NOPS, you have to check the cast for legitimacy for this transformation.
>>>>
>>>> This does not explain why backtrace_base_for_ref does not find all the opportunities on slsr-39.c. I don't immediately see what's preventing that. Note that the transformation is legal in that case because you are widening from a signed int to an unsigned int, which won't cause problems. You guys need to dig deeper into why those opportunities are missed when sizetype is larger than int. Let me know if you need help figuring it out.
>>>
>>> Sorry, I had to leave before and wanted to get this response back to you in case I didn't get back soon. I've looked at this some more, and your general approach should work ok once you get the legal_cast_p check in place where you do the get_unwidened call now. Once you know you have a legal widening, you don't have to worry about the safe_to_multiply_p stuff. I.e., you don't need the last two chunks in the patch to backtrace_base_for_ref, and you don't need the unwidened_p variable. It should all fall out properly by just restricting your unwidening to legal casts.
>>
>> Many thanks for looking into the issue so promptly. I've updated the patch; I have to use legal_cast_p_1 instead as the gimple node is no longer available by then.
>>
>> Does the new patch look sane?
>
> Yes, much better. I'm happy with this approach.

Great! The regtest and bootstrap all passed so I've committed the patch.

> However, please restore the correct whitespace before the { at -786,7 +795,7.

This is actually a correction to the whitespace. I've split the patch and committed it separately.

Thanks again for helping out!

Regards,
Yufeng
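
To make the wrap-semantics point in this thread concrete, here is a small C sketch. It is not GCC code, and all function names are hypothetical. It compares adding a constant before and after widening a 32-bit unsigned value; because unsigned arithmetic wraps, the two orders can disagree, which is the reason a pass cannot simply look through such a widening cast.

```c
#include <assert.h>
#include <stdint.h>

/* Add C in the narrow unsigned type, then widen.  Unsigned arithmetic
   wraps modulo 2^32, so the sum may wrap before it is widened.  */
static uint64_t
add_then_widen (uint32_t x, uint32_t c)
{
  return (uint64_t) (uint32_t) (x + c);
}

/* Widen first, then add C in the wide type: nothing wraps at 32 bits.  */
static uint64_t
widen_then_add (uint32_t x, uint32_t c)
{
  return (uint64_t) x + (uint64_t) c;
}

/* Return 1 when moving the addition across the widening cast keeps the
   value unchanged for these particular inputs, 0 otherwise.  */
static int
reorder_is_safe (uint32_t x, uint32_t c)
{
  return add_then_widen (x, c) == widen_then_add (x, c);
}

/* Small self-check: harmless for small values, wrong at the boundary.  */
static void
check_examples (void)
{
  assert (reorder_is_safe (1u, 2u));
  assert (!reorder_is_safe (UINT32_MAX, 1u));
}
```

Since the compiler generally cannot know that the narrow value stays away from the boundary, checking the cast itself (as suggested above with legal_cast_p) is the conservative choice.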
[c++-concepts] constrained friends redux
This patch implements constrained friends and disallows declarations of constrained friend template specializations. There was a previous question about whether I was doing the right thing in determine_specialization. I'm looking at that issue separately.

2013-10-01  Andrew Sutton  andrew.n.sut...@gmail.com

* gcc/cp/parser.c (cp_parser_member_declaration): Check that a constrained friend definition is valid.
* gcc/cp/decl.c (grokfndecl): Disallow constrained friend template specializations.
* gcc/cp/constraints.cc (check_constrained_friend): New.
* gcc/cp/typeck.c (cp_build_function_call_vec): Diagnose constraints in the presence of the failure of a single candidate.
* gcc/cp/cp-tree.h (check_constrained_friend): New.
* gcc/cp/call.c (is_non_template_member_fn): Make inline.
(is_non_template_friend), (is_constrainable_non_template_fn): New.
(add_function_candidate): Predicate check on is_constrainable_non_template_fn.

Andrew Sutton

Attachment: friends-2.patch
Re: operator new returns nonzero
On Wed, 2 Oct 2013, Jason Merrill wrote:
> On 10/02/2013 08:33 AM, Marc Glisse wrote:
>> +  if (flag_delete_null_pointer_checks && !flag_check_new
> You can't use flag_check_new in language-independent code without moving it from c.opt to common.opt.

Thanks, that makes sense and I'll do that, but I am surprised that the fortran and java compilers built without complaining.

-- 
Marc Glisse
Re: operator new returns nonzero
On Wed, Oct 02, 2013 at 04:12:24PM +0300, Marc Glisse wrote:
> On Wed, 2 Oct 2013, Jason Merrill wrote:
>> On 10/02/2013 08:33 AM, Marc Glisse wrote:
>>> +  if (flag_delete_null_pointer_checks && !flag_check_new
>> You can't use flag_check_new in language-independent code without moving it from c.opt to common.opt.
> Thanks, that makes sense and I'll do that, but I am surprised that the fortran and java compilers built without complaining.

I think the macros are gathered for all the FEs configured, and as the C FE is mandatory I think you can't end up with it not being defined.

Jakub
[PATCH] More loop distribution TLC
I split out some TLC to loop distribution from a patch I'll post shortly.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2013-10-02  Richard Biener  rguent...@suse.de

* tree-loop-distribution.c: Include tree-vectorizer.h for find_loop_location.
(enum partition_kind): Remove PKIND_REDUCTION.
(struct partition_s): Remove has_writes member, add reduction_p member.
(partition_alloc): Adjust.
(partition_builtin_p): Likewise.
(partition_has_writes): Remove.
(partition_reduction_p): New function.
(partition_merge_into): Likewise.
(generate_code_for_partition): Commonize builtin partition handling tail.
(rdg_cannot_recompute_vertex_p): Remove.
(already_processed_vertex_p): Likewise.
(rdg_flag_vertex): Do not set has_writes.
(classify_partition): Adjust.
(rdg_build_partitions): Do not set has_writes, treat all partitions as useful.
(distribute_loop): Record number of library calls generated. Adjust.
(tree_loop_distribution): Report number of loops and library calls generated as opt-info.

* gcc.dg/tree-ssa/ldist-11.c: Adjust.
* gcc.dg/tree-ssa/ldist-17.c: Likewise.
* gcc.dg/tree-ssa/ldist-23.c: Likewise.
* gcc.dg/tree-ssa/ldist-pr45948.c: Likewise.
* gfortran.dg/ldist-pr45199.f: Likewise.

Index: gcc/tree-loop-distribution.c
===================================================================
*** gcc/tree-loop-distribution.c.orig 2013-10-02 14:42:48.0 +0200
--- gcc/tree-loop-distribution.c 2013-10-02 14:49:05.775900940 +0200
*************** along with GCC; see the file COPYING3.
*** 51,56 ****
--- 51,57 ----
  #include "tree-scalar-evolution.h"
  #include "tree-pass.h"
  #include "gimple-pretty-print.h"
+ #include "tree-vectorizer.h"
  /* A Reduced Dependence Graph (RDG) vertex representing a statement.  */
*************** build_rdg (vec<loop_p> loop_nest, contro
*** 557,570 ****
  enum partition_kind {
!     PKIND_NORMAL, PKIND_REDUCTION, PKIND_MEMSET, PKIND_MEMCPY
  };
  typedef struct partition_s
  {
    bitmap stmts;
    bitmap loops;
!   bool has_writes;
    enum partition_kind kind;
    /* data-references a kind != PKIND_NORMAL partition is about.  */
    data_reference_p main_dr;
--- 558,571 ----
  enum partition_kind {
!     PKIND_NORMAL, PKIND_MEMSET, PKIND_MEMCPY
  };
  typedef struct partition_s
  {
    bitmap stmts;
    bitmap loops;
!   bool reduction_p;
    enum partition_kind kind;
    /* data-references a kind != PKIND_NORMAL partition is about.  */
    data_reference_p main_dr;
*************** partition_alloc (bitmap stmts, bitmap lo
*** 581,587 ****
    partition_t partition = XCNEW (struct partition_s);
    partition->stmts = stmts ? stmts : BITMAP_ALLOC (NULL);
    partition->loops = loops ? loops : BITMAP_ALLOC (NULL);
!   partition->has_writes = false;
    partition->kind = PKIND_NORMAL;
    return partition;
  }
--- 582,588 ----
    partition_t partition = XCNEW (struct partition_s);
    partition->stmts = stmts ? stmts : BITMAP_ALLOC (NULL);
    partition->loops = loops ? loops : BITMAP_ALLOC (NULL);
!   partition->reduction_p = false;
    partition->kind = PKIND_NORMAL;
    return partition;
  }
*************** partition_free (partition_t partition)
*** 601,617 ****
  static bool
  partition_builtin_p (partition_t partition)
  {
!   return partition->kind > PKIND_REDUCTION;
  }
! /* Returns true if the partition has an writes.  */
  static bool
! partition_has_writes (partition_t partition)
  {
!   return partition->has_writes;
  }
  /* Returns true when DEF is an SSA_NAME defined in LOOP and used after
     the LOOP.  */
--- 602,630 ----
  static bool
  partition_builtin_p (partition_t partition)
  {
!   return partition->kind != PKIND_NORMAL;
  }
! /* Returns true if the partition contains a reduction.  */
  static bool
! partition_reduction_p (partition_t partition)
  {
!   return partition->reduction_p;
  }
+ /* Merge PARTITION into the partition DEST.  */
+
+ static void
+ partition_merge_into (partition_t dest, partition_t partition)
+ {
+   dest->kind = PKIND_NORMAL;
+   bitmap_ior_into (dest->stmts, partition->stmts);
+   if (partition_reduction_p (partition))
+     dest->reduction_p = true;
+ }
+
+
  /* Returns true when DEF is an SSA_NAME defined in LOOP and used after
     the LOOP.  */
*************** generate_code_for_partition (struct loop
*** 998,1055 ****
  {
    switch (partition->kind)
      {
      case PKIND_MEMSET:
        generate_memset_builtin (loop, partition);
-       /* If this is the last partition for which we generate code, we have
-          to destroy the loop.  */
-       if (!copy_p)
-         destroy_loop (loop);
        break;
      case PKIND_MEMCPY:
        generate_memcpy_builtin (loop, partition);
-       /* If this is the last partition for which we generate code, we
Re: Make the 2 versions of delete more similar
On 2 October 2013 13:28, Marc Glisse wrote:
> Hello,
>
> I don't understand why those 2 files differ by more than 1 extra argument, so I am changing that.
>
> Bootstrap and testsuite on x86_64.
>
> 2013-10-03  Marc Glisse  marc.gli...@inria.fr
>
> * libsupc++/del_op.cc (operator delete): Don't test for 0 before free.
> * libsupc++/del_opnt.cc (free): Only declare if freestanding.
> (operator delete): Qualify free with std::.

Looks good to me, thanks.
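
The first ChangeLog entry above is safe because the C standard defines free on a null pointer as a no-op. A minimal C sketch of the two shapes follows; the helper names are illustrative and this is not the libsupc++ code.

```c
#include <assert.h>
#include <stdlib.h>

/* Old shape: guard the call with an explicit null test.  */
static void
release_checked (void *ptr)
{
  if (ptr)
    free (ptr);
}

/* New shape: free (NULL) performs no action, so the test is redundant.  */
static void
release_unchecked (void *ptr)
{
  free (ptr);
}

/* Exercise both helpers with null and non-null pointers.  Returns 0.  */
static int
release_demo (void)
{
  void *p = malloc (16);
  void *q = malloc (16);

  release_checked (NULL);
  release_unchecked (NULL);
  /* Both helpers also accept whatever malloc returned, null or not.  */
  release_checked (p);
  release_unchecked (q);
  return 0;
}
```

Both helpers behave identically for every input, so dropping the test only removes a branch.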
Re: [patch] More tree-flow.h prototypes.
On Wed, Oct 2, 2013 at 1:58 PM, Andrew MacLeod amacl...@redhat.com wrote:
> On 10/02/2013 04:37 AM, Richard Biener wrote:
>> On Tue, Oct 1, 2013 at 11:01 PM, Andrew MacLeod amacl...@redhat.com wrote:
>>> This patch moves prototypes into gimple-fold.h (which already existed). There were a few in tree-flow.h and a bunch in gimple.h. The routines are used frequently enough that it makes sense to include gimple-fold.h from gimple.h instead of from within each .c file that needs it. (presumably why the prototypes were in gimple.h to begin with). I took gimple-fold.h out of whatever .c files it was included in.
>>>
>>> tree-ssa-copy.h was also created for the prototypes in that file and included from tree-ssa.h.
>> These should probably be moved elsewhere (tree-ssa-copy.c is supposed to be the copy propagation pass file). But that can be done as followup.
> hmm, easy enough to move them *all* to tree-ssa-propagate.[ch] right now and check it in... That seems like the right place for all of them and then we don't even need to create tree-ssa-copy.h...?

Yeah.

Richard.

> Andrew
[PATCH][RFC] Detect and use implementations of BLAS routines
This adds recognition of [sd]axpy and [sd]dot computing partitions to loop distribution (as an example for a moderately complex kernel and one kernel involving a reduction). To make this a reality we have to control this by an option (-fblas?) and we have to settle to an ABI we rely on (trailing underscore, argument passing conventions and what integer type to use). Official CBLAS uses f2c and f2c.h to define the ABI. I suppose other compiler vendors simply ship their own BLAS routines and thus have complete control over the ABI (and can also avoid passing scalar parameters by reference ...). Main use of this transformation is of course to get automagic access to vendor optimized BLAS kernels. Any comments? (yeah, pattern matching sucks) Thanks, Richard. 2013-10-02 Richard Biener rguent...@suse.de * tree-loop-distribution.c (enum partition_kind): Add PKIND_BLAS. (enum blas_kind): New enum. (struct partition_s): Add ops and subkind members. (partition_contains_stmt): New function. (build_elt_step): Likewise. (build_addr_arg_loc): Make nb_bytes parameter optional. (force_addr_of): New function. (force_addr_of_int): Likewise. (generate_blas_builtin): Likewise. (generate_code_for_partition): Handle PKIND_BLAS. (classify_partition): Detect [sd]axpy and [sd]dot like partitions. * gcc.dg/tree-ssa/ldist-24.c: New testcase. * gcc.dg/tree-ssa/ldist-25.c: Likewise. Index: gcc/tree-loop-distribution.c === *** gcc/tree-loop-distribution.c.orig 2013-10-02 15:47:08.492960908 +0200 --- gcc/tree-loop-distribution.c2013-10-02 15:47:21.519110583 +0200 *** build_rdg (vecloop_p loop_nest, contro *** 558,564 enum partition_kind { ! PKIND_NORMAL, PKIND_MEMSET, PKIND_MEMCPY }; typedef struct partition_s --- 558,570 enum partition_kind { ! PKIND_NORMAL, ! PKIND_MEMSET, PKIND_MEMCPY, ! PKIND_BLAS ! }; ! ! enum blas_kind { ! BLAS_DAXPY, BLAS_DDOT }; typedef struct partition_s *** typedef struct partition_s *** 567,576 bitmap loops; bool reduction_p; enum partition_kind kind; ! 
/* data-references a kind != PKIND_NORMAL partition is about. */ data_reference_p main_dr; data_reference_p secondary_dr; ! tree niter; } *partition_t; --- 573,587 bitmap loops; bool reduction_p; enum partition_kind kind; ! enum blas_kind subkind; ! /* kind != PKIND_NORMAL data follows. */ ! /* number of invocations of main_dr. */ ! tree niter; ! /* data-references participating. */ data_reference_p main_dr; data_reference_p secondary_dr; ! /* auxiliary operands. */ ! tree ops[2]; } *partition_t; *** partition_reduction_p (partition_t parti *** 613,618 --- 624,638 return partition-reduction_p; } + /* Returns true if PARTITION contains STMT. */ + + static bool + partition_contains_stmt (partition_t partition, gimple stmt) + { + int uid = gimple_uid (stmt); + return uid != -1 bitmap_bit_p (partition-stmts, uid); + } + /* Merge PARTITION into the partition DEST. */ static void *** build_size_arg_loc (location_t loc, data *** 803,808 --- 823,837 return fold_convert_loc (loc, size_type_node, size); } + /* Build the element step tree for DR. */ + + static tree + build_elt_step (location_t loc, data_reference_p dr) + { + tree sz = fold_convert (ssizetype, TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr; + return size_binop_loc (loc, EXACT_DIV_EXPR, DR_STEP (dr), sz); + } + /* Build an address argument for a memory operation call. */ static tree *** build_addr_arg_loc (location_t loc, data *** 814,820 addr_base = fold_convert_loc (loc, sizetype, addr_base); /* Test for a negative stride, iterating over every element. */ ! if (tree_int_cst_sgn (DR_STEP (dr)) == -1) { addr_base = size_binop_loc (loc, MINUS_EXPR, addr_base, fold_convert_loc (loc, sizetype, nb_bytes)); --- 843,850 addr_base = fold_convert_loc (loc, sizetype, addr_base); /* Test for a negative stride, iterating over every element. */ ! 
if (nb_bytes !tree_int_cst_sgn (DR_STEP (dr)) == -1) { addr_base = size_binop_loc (loc, MINUS_EXPR, addr_base, fold_convert_loc (loc, sizetype, nb_bytes)); *** generate_memcpy_builtin (struct loop *lo *** 955,960 --- 985,1117 } } + /* Force OP to a new temporary and return the address of that temporary +appending necessary statements at GSI. */ + + static tree + force_addr_of (gimple_stmt_iterator *gsi, tree op) + { + tree tem = create_tmp_var (TREE_TYPE (op), NULL); + TREE_ADDRESSABLE (tem) = 1; + gimple s =
[Patch] Fix incorrect behavior of [[=a=]] in regex
_BracketMatcher::_M_add_equivalence_class is misimplemented, so I tried `git blame regex_compiler.h`... that's me!

Booted and tested under -m32, -m64.

Thanks ;)

-- 
Tim Shen

Attachment: a.patch
Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64
On 10/02/13 13:40, Bill Schmidt wrote:
> On Tue, 2013-10-01 at 20:21 -0500, Bill Schmidt wrote:
>> On Tue, 2013-10-01 at 23:57 +0100, Yufeng Zhang wrote:
>>> On 10/01/13 20:55, Bill Schmidt wrote:
>>>> On Tue, 2013-10-01 at 11:56 -0500, Bill Schmidt wrote:
>>>>> OK, thanks. The problem that you've encountered is that you are attempting to do something illegal. ;) (Bin's original patch is actually to blame for that, as well as me for not catching it then.)
>>>>>
>>>>> As your new test shows, it is unsafe to do the transformation in backtrace_base_for_ref when widening from an unsigned type, because the unsigned type has wrap semantics by default. (The actual test must be done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or removed by compile option -- see the comments with legal_cast_p and legal_cast_p_1 later in the module.) You cannot in general prove that the transformation is allowable for a specific constant, because you don't know that what you're adding it to won't cause an overflow that's handled incorrectly.
>>>>>
>>>>> I believe the correct fix for the unsigned-overflow case is to fail backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns false, where in_type is the type of the new *PBASE, and out_type is the widening type that you're looking through. So you can't just STRIP_NOPS, you have to check the cast for legitimacy for this transformation.
>>>>>
>>>>> This does not explain why backtrace_base_for_ref does not find all the opportunities on slsr-39.c. I don't immediately see what's preventing that. Note that the transformation is legal in that case because you are widening from a signed int to an unsigned int, which won't cause problems. You guys need to dig deeper into why those opportunities are missed when sizetype is larger than int. Let me know if you need help figuring it out.
>>>>
>>>> Sorry, I had to leave before and wanted to get this response back to you in case I didn't get back soon. I've looked at this some more, and your general approach should work ok once you get the legal_cast_p check in place where you do the get_unwidened call now. Once you know you have a legal widening, you don't have to worry about the safe_to_multiply_p stuff. I.e., you don't need the last two chunks in the patch to backtrace_base_for_ref, and you don't need the unwidened_p variable. It should all fall out properly by just restricting your unwidening to legal casts.
>>>
>>> Many thanks for looking into the issue so promptly. I've updated the patch; I have to use legal_cast_p_1 instead as the gimple node is no longer available by then.
>>>
>>> Does the new patch look sane?
>>
>> Yes, much better. I'm happy with this approach. However, please restore the correct whitespace before the { at -786,7 +795,7.
>>
>> Thanks for fixing this up!
>>
>> Bill
>
> (Just a reminder that I can't approve your patch; you need a maintainer for that. But it looks good to me.)

Oops. I didn't realise that and I just saw your email. :( Sorry...

Can Richard please do a retro-approval?

> Sometime when I get a moment I'm probably going to change this to handle the casting when the candidates are added to the table.

Indeed, that will be a cleaner approach.

Thanks,
Yufeng
Re: libgo patch committed: Implement reflect.MakeFunc for 386
Ian Lance Taylor i...@google.com writes: On Mon, Sep 30, 2013 at 7:07 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@google.com writes: On Mon, Sep 30, 2013 at 6:07 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@google.com writes: Following up on my earlier patch, this patch implements the reflect.MakeFunc function for 386. Tom Tromey pointed out to me that the libffi closure support can probably be used for this. I was not aware of that support. It supports a lot more processors, and I should probably start using it. The approach I am using does have a couple of advantages: it's more efficient, and it doesn't require any type of writable executable memory. I can get away with that because indirect calls in Go always pass a closure value. So even when and if I do change to using libffi, I might still keep this code for amd64 and 386. Unfortunately, this patch (and undoubtedly the corresponding amd64 one) break Solaris/x86 libgo bootstrap with native as: Unfortunately I think I'll have to somehow disable this functionality on systems with assemblers that do not understand the .cfi directives, as otherwise calling panic in a function created with MakeFunc will not work. Alternatively, one could hand-craft the .eh_frame section for such systems along the lines of libffi/src/x86/sysv.S: ugly, but doable. Yeah. I'm not going to do that myself. But I would be happy to approve a patch for that if somebody else wants to write it. Here's what I came up with. As I said, it is inspired by the libffi code, but a bit simplified since e.g. stuff like no .ascii support aren't relevant on the Solaris versions supported on mainline and 4.8 branch. Bootstrapped on x86_64-unknown-linux-gnu and i386-pc-solaris2.10 with Sun as and gas. I've also compared the readelf --debug-dump=frames output for the 32 and 64-bit makefunc.o, both PIC and non-PIC. 
64-bit is completely unchanged, while for 32-bit there are FDE encoding changes as expected from the FDE_ENCODING/FDE_ENCODE macros. Rainer 2013-10-01 Rainer Orth r...@cebitec.uni-bielefeld.de * configure.ac (libgo_cv_ro_eh_frame): New test. (libgo_cv_as_comdat_gnu): Likewise. (libgo_cv_as_x86_pcrel): Likewise. (libgo_cv_as_x86_64_unwind_section_type): Likewise. * configure: Regenerate. * config.h.in: Regenerate. * go/reflect/makefunc_386.S: Replace CFI directives by hand-coded .eh_frame section. Restrict .note.* sections to Linux. * go/reflect/makefunc_amd64.S: Likewise. # HG changeset patch # Parent afdb60d4178e74141fa8d3ad8dfd756d009a209c Avoid CFI directives in makefunc_*.S diff --git a/libgo/configure.ac b/libgo/configure.ac --- a/libgo/configure.ac +++ b/libgo/configure.ac @@ -757,6 +757,68 @@ if test $libgo_cv_lib_setcontext_clobbe [Define if setcontext clobbers TLS variables]) fi +AC_CACHE_CHECK([whether .eh_frame section should be read-only], +libgo_cv_ro_eh_frame, [ +libgo_cv_ro_eh_frame=no +echo 'extern void foo (void); void bar (void) { foo (); foo (); }' conftest.c +if $CC $CFLAGS -S -fpic -fexceptions -o conftest.s conftest.c /dev/null 21; then + if grep '.section.*eh_frame.*a' conftest.s /dev/null; then +libgo_cv_ro_eh_frame=yes + elif grep '.section.*eh_frame.*#alloc' conftest.c \ + | grep -v '#write' /dev/null; then +libgo_cv_ro_eh_frame=yes + fi +fi +rm -f conftest.* +]) +if test x$libgo_cv_ro_eh_frame = xyes; then + AC_DEFINE(EH_FRAME_FLAGS, a, + [Define to the flags needed for the .section .eh_frame directive.]) +else + AC_DEFINE(EH_FRAME_FLAGS, aw, + [Define to the flags needed for the .section .eh_frame directive.]) +fi + +AC_CACHE_CHECK([if assembler supports GNU comdat group syntax], +libgo_cv_as_comdat_gnu, [ +echo '.section .text,axG,@progbits,.foo,comdat' conftest.s +if $CC $CFLAGS -c conftest.s /dev/null 21; then + libgo_cv_as_comdat_gnu=yes +else + libgo_cv_as_comdat_gnu=no +fi +]) +if test x$libgo_cv_as_comdat_gnu = xyes; then + 
AC_DEFINE(HAVE_AS_COMDAT_GAS, 1, + [Define if your assembler supports GNU comdat group syntax.]) +fi + +AC_CACHE_CHECK([assembler supports pc related relocs], +libgo_cv_as_x86_pcrel, [ +libgo_cv_as_x86_pcrel=yes +echo '.text; foo: nop; .data; .long foo-.; .text' conftest.s +if $CC $CFLAGS -c conftest.s 21 | $EGREP -i 'illegal|warning' /dev/null; then +libgo_cv_as_x86_pcrel=no +fi +]) +if test x$libgo_cv_as_x86_pcrel = xyes; then + AC_DEFINE(HAVE_AS_X86_PCREL, 1, + [Define if your assembler supports PC relative relocs.]) +fi + +AC_CACHE_CHECK([assembler supports unwind section type], +libgo_cv_as_x86_64_unwind_section_type, [ +libgo_cv_as_x86_64_unwind_section_type=yes +echo '.section .eh_frame,a,@unwind' conftest.s +if $CC $CFLAGS -c conftest.s 21 | grep -i warning /dev/null; then +libgo_cv_as_x86_64_unwind_section_type=no +fi +]) +if test
Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes
Ping 2013/9/17 Ilya Enkovich enkovich@gmail.com: Hi, Here is a patch introducing new type and mode for bounds. It is a part of MPX ISA support patch (http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html). Bootstrapped and tested on linux-x86_64. Is it OK for trunk? Thanks, Ilya -- gcc/ 2013-09-16 Ilya Enkovich ilya.enkov...@intel.com * mode-classes.def (MODE_BOUND): New. * tree.def (BOUND_TYPE): New. * genmodes.c (complete_mode): Support MODE_BOUND. (BOUND_MODE): New. (make_bound_mode): New. * machmode.h (BOUND_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_BOUND. (layout_type): Support BOUND_TYPE. * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE. * tree.c (build_int_cst_wide): Support BOUND_TYPE. (type_contains_placeholder_1): Likewise. * tree.h (BOUND_TYPE_P): New. * varasm.c (output_constant): Support BOUND_TYPE. * doc/rtl.texi (MODE_BOUND): New. diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi index 1d62223..02b1214 100644 --- a/gcc/doc/rtl.texi +++ b/gcc/doc/rtl.texi @@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the @file{@var{machine}-modes.def}. @xref{Jump Patterns}, also see @ref{Condition Code}. +@findex MODE_BOUND +@item MODE_BOUND +Bound modes class. Used to represent values of pointer bounds. 
+ @findex MODE_RANDOM @item MODE_RANDOM This is a catchall mode class for modes which don't fit into the above diff --git a/gcc/genmodes.c b/gcc/genmodes.c index dc38483..89174ec 100644 --- a/gcc/genmodes.c +++ b/gcc/genmodes.c @@ -333,6 +333,7 @@ complete_mode (struct mode_data *m) break; case MODE_INT: +case MODE_BOUND: case MODE_FLOAT: case MODE_DECIMAL_FLOAT: case MODE_FRACT: @@ -533,6 +534,18 @@ make_special_mode (enum mode_class cl, const char *name, new_mode (cl, name, file, line); } +#define BOUND_MODE(N, Y) make_bound_mode (#N, Y, __FILE__, __LINE__) + +static void ATTRIBUTE_UNUSED +make_bound_mode (const char *name, + unsigned int bytesize, + const char *file, unsigned int line) +{ + struct mode_data *m = new_mode (MODE_BOUND, name, file, line); + m-bytesize = bytesize; +} + + #define INT_MODE(N, Y) FRACTIONAL_INT_MODE (N, -1U, Y) #define FRACTIONAL_INT_MODE(N, B, Y) \ make_int_mode (#N, B, Y, __FILE__, __LINE__) diff --git a/gcc/machmode.h b/gcc/machmode.h index 981ee92..d4a20b2 100644 --- a/gcc/machmode.h +++ b/gcc/machmode.h @@ -174,6 +174,9 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES]; || CLASS == MODE_ACCUM \ || CLASS == MODE_UACCUM) +#define BOUND_MODE_P(MODE) \ + (GET_MODE_CLASS (MODE) == MODE_BOUND) + /* Get the size in bytes and bits of an object of mode MODE. */ extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES]; diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def index 7207ef7..c5ea215 100644 --- a/gcc/mode-classes.def +++ b/gcc/mode-classes.def @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3. 
If not see DEF_MODE_CLASS (MODE_RANDOM),/* other */ \ DEF_MODE_CLASS (MODE_CC),/* condition code in a register */ \ DEF_MODE_CLASS (MODE_INT), /* integer */ \ + DEF_MODE_CLASS (MODE_BOUND),/* bounds */ \ DEF_MODE_CLASS (MODE_PARTIAL_INT), /* integer with padding bits */\ DEF_MODE_CLASS (MODE_FRACT), /* signed fractional number */ \ DEF_MODE_CLASS (MODE_UFRACT),/* unsigned fractional number */ \ diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c index 6f6b310..82611c7 100644 --- a/gcc/stor-layout.c +++ b/gcc/stor-layout.c @@ -383,6 +383,7 @@ int_mode_for_mode (enum machine_mode mode) case MODE_VECTOR_ACCUM: case MODE_VECTOR_UFRACT: case MODE_VECTOR_UACCUM: +case MODE_BOUND: mode = mode_for_size (GET_MODE_BITSIZE (mode), MODE_INT, 0); break; @@ -2135,6 +2136,13 @@ layout_type (tree type) SET_TYPE_MODE (type, VOIDmode); break; +case BOUND_TYPE: + SET_TYPE_MODE (type, + mode_for_size (TYPE_PRECISION (type), MODE_BOUND, 0)); + TYPE_SIZE (type) = bitsize_int (GET_MODE_BITSIZE (TYPE_MODE (type))); + TYPE_SIZE_UNIT (type) = size_int (GET_MODE_SIZE (TYPE_MODE (type))); + break; + case OFFSET_TYPE: TYPE_SIZE (type) = bitsize_int (POINTER_SIZE); TYPE_SIZE_UNIT (type) = size_int (POINTER_SIZE / BITS_PER_UNIT); diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c index 69e4006..8b0825c 100644 --- a/gcc/tree-pretty-print.c +++ b/gcc/tree-pretty-print.c @@ -697,6 +697,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags, break; case
Re: [wwwdocs] Mention libstdc++-v3 regex in 4.9 changes.html
I feel a little bit uncomfortable with the phrase "new ISO C++ standard, C++11", since C++14 is already there, so I removed it.

Please check the wording, since English is not my first language.

Thanks!

Index: htdocs/index.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.891
diff -r1.891 index.html
55a56,59
> <dt><span><a href="projects/cxx0x.html">C++11</a> &lt;regex&gt; support</span>
>     <span class="date">[2013-10-02]</span></dt>
>     <dd>Regular expression support in <a href="libstdc++/">libstdc++-v3</a> is now available.</dd>
>
Index: htdocs/gcc-4.9/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.27
diff -r1.27 changes.html
136a137,139
>     <li><a href="http://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2011">
>       Improved support for C++11</a>, including support for &lt;regex&gt;.
>     </li>

-- 
Tim Shen
Re: [PATCH][RFC] Detect and use implementations of BLAS routines
Richard Biener wrote:
> This adds recognition of [sd]axpy and [sd]dot computing partitions to
> loop distribution (as an example for a moderately complex kernel and
> one kernel involving a reduction).
>
> To make this a reality we have to control this by an option (-fblas?)
> and we have to settle on an ABI we rely on (trailing underscore,
> argument passing conventions and what integer type to use).

Official CBLAS uses f2c and f2c.h to define the ABI.

gfortran (since GCC 4.3) has -fexternal-blas to convert MATMUL to BLAS calls. It additionally has -fblas-matmul-limit=n (default: 30) to avoid calls to the library for small arrays. (I don't recall what happens if the size is unknown.) I believe that gfortran's ABI is used - and I don't recall any bug reports related to that ABI choice. (On the other hand, the number of -fexternal-blas users is probably small.)

[In principle, dot_product also exists in Fortran 90+ and could be handled in the FE as well.]

For the ABI, one has essentially the choice between the platform ABI (i.e. what gfortran uses by default) and the f2c ABI (which several other Fortran compilers use by default). My impression is that there is not much of an ABI difference: most compilers generate lower-case procedure names, followed by a single underscore. Combining gfortran (with the default -fno-f2c) with vendor LAPACK/BLAS libraries usually works flawlessly. However, some libraries are compiled with gfortran (i.e. not using f2c semantics), while other vendor libraries are built with a different (e.g. the vendor's own) compiler, which might use the f2c ABI. Hence, there is no natural choice for the ABI.

Regarding the ABI (platform ABI vs f2c) one has to be careful with:

* Functions returning complex numbers.
* On PowerPC (?), complex numbers in general (namely whether struct{float re, im} and _Complex float are ABI compatible or not).
* Functions returning single precision (which f2c returns as double precision).
* Functions returning logical values (like LAPACK's lsame; other compilers might use different numbers than 1 for true, which is not compatible with GCC's negation [!f() / .not.f() might stay true for (-1) if handled as Boolean and not as int]).
* Second underscore (although my impression is that most compilers have a single one by default).

See also http://gcc.gnu.org/onlinedocs/gfortran/Code-Gen-Options.html#index-ff2c-238

For the BLAS routines in question, only complex numbers might be problematic - at least on a very small set of systems (PowerPC).

In general: I think it can be a useful feature. However, I am not sure that many users will use it. (Obstacle: they have to know/remember that this exists, they have to have code that can profit from it, they need a vendor lib, and they don't want to call BLAS directly themselves.)

Tobias
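To make the ABI points above concrete, here is a minimal C sketch. It assumes the common gfortran-style convention (lower-case name, one trailing underscore, every argument passed by reference, `int` for the default INTEGER). The `daxpy_` body is a hypothetical stand-in for a library routine, not the reference BLAS, and it only handles positive strides; `axpy_via_blas`, `negate_as_int` and `negate_as_bool` are made-up names for illustration. Flipping the low bit is just one way a compiler might negate a value it assumes to be 0 or 1.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Fortran BLAS routine under the assumed
   convention: trailing underscore, all arguments by reference.
   Computes y := alpha * x + y for positive strides only.  */
void
daxpy_ (const int *n, const double *alpha,
        const double *x, const int *incx,
        double *y, const int *incy)
{
  const ptrdiff_t sx = *incx, sy = *incy;
  for (ptrdiff_t i = 0; i < *n; i++)
    y[i * sy] += *alpha * x[i * sx];
}

/* What a call replacing the loop
     for (i = 0; i < n; i++) y[i] += a * x[i];
   might look like: scalars are spilled to memory and passed by address.  */
void
axpy_via_blas (int n, double a, const double *x, double *y)
{
  const int one = 1;
  daxpy_ (&n, &a, x, &one, y, &one);
}

/* Negation treating the value as an int: any non-zero value becomes 0.  */
int
negate_as_int (int v)
{
  return !v;
}

/* Negation assuming the value is exactly 0 or 1 (an assumption of this
   sketch): only the low bit is flipped.  */
int
negate_as_bool (int v)
{
  return v ^ 1;
}
```

If a callee encodes "true" as -1, `negate_as_int (-1)` yields 0, whereas `negate_as_bool (-1)` yields -2, which is still non-zero; that is the kind of mismatch the logical-value item warns about.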
[patch] move phiopt, ssa-dce and ssa-dom prototypes.
Handle a few more prototypes in tree-flow.h There were only 2 routines exported from tree-ssa-phiopts, and neither really belonged there. * I moved nonfreeing_call_p() to gimple.c since it is gimple dependent. blocks_in_phiopt_order() returns basic blocks in an order that guarantees any single predecessor is visited before its successor. After looking around, i think it really belongs in cfganal.c, so I moved it there... unless you can think of somewhere better. I also renamed it to be more generally descriptive: single_pred_before_succ_order... for lack of anything better. tree-ssa-dom.h needed to be created for a few prototypes. The only 2 exports from tree-ssa-dce.c were mark_virtual_operand_for_renaming and mark_virtual_phi_result_for_renaming. I moved them to tree-into-ssa.c which already has mark_virtual_operands_for_renaming (). Names are confusing a bit I find... and do we even need these routines? Aren't all virtual uses based off the one root variable? so isn't all that walking and SET_USE-ing a waste of time? Or is that still needed by the incremental bits? I never really paid attention to how that worked :-) Bootstrapped on x86_64-unknown-linux-gnu, and currently running regressions. Assuming it passes, OK? Andrew * tree-flow.h: Remove some prototypes. * tree-ssa-dce.c (mark_virtual_operand_for_renaming, mark_virtual_phi_result_for_renaming): Move to tree-into-ssa.c. * tree-into-ssa.c (mark_virtual_operand_for_renaming, mark_virtual_phi_result_for_renaming): Relocate here. * tree-into-ssa.h: Add prototypes. * tree-ssa-phiopt.c: (tree_ssa_phiopt_worker) Use single_pred_before_succ_order. (blocks_in_phiopt_order): Rename and move to cfganal.c. (nonfreeing_call_p) Move to gimple.c. * cfganal.c (single_pred_before_succ_order): Move and renamed from tree-ssa-phiopt.c. * basic-block.h (single_pred_before_succ_order): Add prototype. * gimple.c (nonfreeing_call_p): Relocate here. * gimple.h: Add prototype. * tree-ssa-ifcombine.c: Include tree-ssa-phiopt.h. 
* tree-ssa-dom.h: New file. Relocate prototypes here. * tree-ssa.h: Include tree-ssa-dom.h. Index: tree-flow.h === *** tree-flow.h (revision 203113) --- tree-flow.h (working copy) *** bool tree_node_can_be_shared (tree); *** 248,260 tree fold_const_aggregate_ref (tree); tree gimple_fold_stmt_to_constant (gimple, tree (*)(tree)); - /* In tree-ssa-dom.c */ - extern void dump_dominator_optimization_stats (FILE *); - extern void debug_dominator_optimization_stats (void); - int loop_depth_of_name (tree); - tree degenerate_phi_result (gimple); - bool simple_iv_increment_p (gimple); - /* In tree-ssa-copy.c */ extern void propagate_value (use_operand_p, tree); extern void propagate_tree_value (tree *, tree); --- 248,253 *** struct tree_niter_desc *** 309,318 enum tree_code cmp; }; - /* In tree-ssa-phiopt.c */ - bool empty_block_p (basic_block); - basic_block *blocks_in_phiopt_order (void); - bool nonfreeing_call_p (gimple); /* In tree-ssa-loop*.c */ --- 302,307 *** void tree_transform_and_unroll_loop (str *** 372,381 bool contains_abnormal_ssa_name_p (tree); bool stmt_dominates_stmt_p (gimple, gimple); - /* In tree-ssa-dce.c */ - void mark_virtual_operand_for_renaming (tree); - void mark_virtual_phi_result_for_renaming (gimple); - /* In tree-ssa-threadedge.c */ extern void threadedge_initialize_values (void); extern void threadedge_finalize_values (void); --- 361,366 Index: tree-ssa-dce.c === *** tree-ssa-dce.c (revision 203112) --- tree-ssa-dce.c (working copy) *** propagate_necessity (bool aggressive) *** 907,954 } } - /* Replace all uses of NAME by underlying variable and mark it -for renaming. This assumes the defining statement of NAME is -going to be removed. 
*/ - - void - mark_virtual_operand_for_renaming (tree name) - { - tree name_var = SSA_NAME_VAR (name); - bool used = false; - imm_use_iterator iter; - use_operand_p use_p; - gimple stmt; - - gcc_assert (VAR_DECL_IS_VIRTUAL_OPERAND (name_var)); - FOR_EACH_IMM_USE_STMT (stmt, iter, name) - { - FOR_EACH_IMM_USE_ON_STMT (use_p, iter) - SET_USE (use_p, name_var); - used = true; - } - if (used) - mark_virtual_operands_for_renaming (cfun); - } - - /* Replace all uses of the virtual PHI result by its underlying variable -and mark it for renaming. This assumes the PHI node is going to be -removed. */ - - void - mark_virtual_phi_result_for_renaming (gimple phi) - { - if (dump_file (dump_flags TDF_DETAILS)) - { - fprintf (dump_file, Marking result for renaming : ); - print_gimple_stmt (dump_file, phi, 0, TDF_SLIM); - fprintf (dump_file, \n); - } - - mark_virtual_operand_for_renaming
Re: [wwwdocs] Mention libstdc++-v3 regex in 4.9 changes.html
On 2 October 2013 15:52, Tim Shen wrote:
> I feel a little bit uncomfortable with "new ISO C++ standard, C++11", since
> C++14 is already there, so I removed it.

Good idea.

> Please check the words, since English is not my first language.

The English is fine. Please wait a few hours in case anyone else has comments, but if you don't hear anything else you can go ahead and commit it.
Re: [Patch] Fix incorrect behavior of [[=a=]] in regex
Committed. Thanks!

On Wed, Oct 2, 2013 at 11:10 AM, Jonathan Wakely jwakely@gmail.com wrote:
> On 2 October 2013 15:26, Tim Shen wrote:
>> _BracketMatcher::_M_add_equivalence_class is misimplemented, so I tried
>> `git blame regex_compiler.h`... that's me!
>>
>> Booted and tested under -m32, -m64.
>
> This is OK to commit, thanks

--
Tim Shen
Re: [Patch] Fix incorrect behavior of [[=a=]] in regex
On 2 October 2013 15:26, Tim Shen wrote:
> _BracketMatcher::_M_add_equivalence_class is misimplemented, so I tried
> `git blame regex_compiler.h`... that's me!
>
> Booted and tested under -m32, -m64.

This is OK to commit, thanks
Re: [PATCH][RFC] Detect and use implementations of BLAS routines
Hello, You probably want to disable this transformation when the number of iterations is predicted to be small, right? Shouldn't dot product transform be predicated on -fassociative-math? Do you have a vision of a generalized pattern matcher to allow adding other routines easily? I'm curious what gap is between GCC's vectorizer output and fine-tuned BLAS libraries. [*] Or is the intention here to enable use of accelerated BLAS on HSA-like architectures? Or using BLAS when the vectorizer can't possibly match it (matmult -- but then again it's not easy to pattern-match in the first place; or non-trivial strides -- but what can a BLAS lib do in that case)? [*] The gap is definitely huge on something like ia64 (IIRC vectorization is not important there, but you need to unroll and schedule carefully), but I presume you're mostly interested in x86-64. GCC currently has a somewhat similar in spirit feature for the vectorizer -- -mveclibabi. Is it known how it is used in practice? Thanks. Alexander
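The -fassociative-math question can be illustrated with a small C sketch: a library dot product is free to accumulate in a different order than the source loop, and floating-point addition is not associative. The two function names are hypothetical, and the behaviour described assumes IEEE 754 double arithmetic without excess precision.

```c
/* Dot product accumulated left to right, as the source loop specifies.  */
double
dot_forward (const double *x, const double *y, int n)
{
  double sum = 0.0;
  for (int i = 0; i < n; i++)
    sum += x[i] * y[i];
  return sum;
}

/* Same products accumulated right to left: one of many orders a tuned
   library routine might effectively use.  */
double
dot_backward (const double *x, const double *y, int n)
{
  double sum = 0.0;
  for (int i = n - 1; i >= 0; i--)
    sum += x[i] * y[i];
  return sum;
}
```

With x = {1e16, 1, 1} and y = {1, 1, 1}, the forward sum absorbs each 1.0 into 1e16 by rounding, while the backward sum first forms 2.0 and then reaches 1e16 + 2 exactly, so the two results differ by 2.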
Re: [PATCH] Improve probability/profile distribution in ORIF expansion
> 2013-10-01  Teresa Johnson  tejohn...@google.com
>
>         * dojump.c (do_jump_1): Divide probability between
>         both conditions of a TRUTH_ORIF_EXPR.

> +      {
> +        /* Spread the probability evenly between the two conditions. So
> +           the first condition has half the total probability of being true.
> +           The second condition has the other half of the total probability,
> +           so its jump has a probability of half the total, relative to
> +           the probability we reached it (i.e. the first condition was false).  */
> +        int op0_prob = prob / 2;
> +        int op1_prob = GCOV_COMPUTE_SCALE ((prob / 2), inv (op0_prob));

The documentation of the function says that PROB may be -1 when it is unknown. In that case you want to arrange op0_prob = op1_prob = -1.

What about the TRUTH_ANDIF_EXPR code above? I think it needs similar adjusting.

The patch is preapproved with these changes. Thanks!

Honza
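The arithmetic under review, with the requested -1 handling folded in, can be sketched as a standalone C model. The helpers `rdiv`, `compute_scale` and `inv` are written out here as assumptions modelled on GCC's RDIV, GCOV_COMPUTE_SCALE and inv; the base of 10000 and the name `split_orif_prob` are likewise assumptions of this sketch, not part of the patch.

```c
/* Assumed fixed-point base: 10000 represents a probability of 100%.  */
#define PROB_BASE 10000

/* Division rounded to nearest.  */
static int
rdiv (long long x, long long y)
{
  return (int) ((x + y / 2) / y);
}

/* NUM relative to DEN, expressed on PROB_BASE.  */
static int
compute_scale (int num, int den)
{
  return den ? rdiv ((long long) num * PROB_BASE, den) : PROB_BASE;
}

/* Probability of the complementary event.  */
static int
inv (int prob)
{
  return PROB_BASE - prob;
}

/* Split PROB, the probability that (a || b) is true, between the jump
   on a and the jump on b.  -1 means "unknown" and is passed through.  */
void
split_orif_prob (int prob, int *op0_prob, int *op1_prob)
{
  *op0_prob = -1;
  *op1_prob = -1;
  if (prob != -1)
    {
      /* The first condition takes half of the total.  */
      *op0_prob = prob / 2;
      /* The second takes the other half, rescaled to the probability of
         reaching it at all (the first condition was false).  */
      *op1_prob = compute_scale (prob / 2, inv (*op0_prob));
    }
}
```

For prob = 5000 (50%), the first jump gets 2500 and the second gets 2500 out of the remaining 7500, which is 3333 on the same base.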
Re: [PATCH] alternative hirate for builtin_expert
> Hi,
>
> Current default probability for builtin_expect is 0.9996. This makes
> the freq of unlikely bb very low (4), which suppresses the inlining of
> any calls within those bb.
>
> We used FDO data to measure the branch probability for the branches
> annotated with builtin_expect. For Google internal benchmarks, the
> weighted average (with the profile count value as the weight) is 0.9081.
>
> The Linux kernel is another program that is heavily annotated with
> builtin_expect. We measured its weighted average as 0.8717, using
> Google search as the workload.
>
> This patch sets the alternate hitrate probability for builtin_expect
> to 90%. With the alternate hitrate, we measured performance
> improvement for Google benchmarks and the Linux kernel.
>
> An earlier discussion is
> https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b
>
> This new patch is for the trunk and addresses Honza's comments.
>
> Honza: this new probability is off by default. When we backport to the
> google branch we will make it the default. Let me know if you want to
> do the same here.

I do not much like the binary parameter builtin-expect-probability-relaxed. I would just add builtin-expect-probability, taking a value in percent, and then make predict.c use it. Just use predict_edge instead of predict_edge_def and document the hitrate value as unused in predict.def.

OK with that change.

Honza
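For readers unfamiliar with the annotation under discussion, here is a minimal C sketch of how builtin_expect typically appears in source. The `likely`/`unlikely` macros and `checked_div` are illustrative names, not taken from the patch. The "(4)" quoted in the message is consistent with a 0.9996 hit rate leaving 0.0004 for the other branch on a base of 10000, which is an assumption about the scale used.

```c
/* Conventional wrappers around the GCC builtin; !! normalizes the
   condition to 0 or 1 before comparing with the expected value.  */
#define likely(x)   __builtin_expect (!!(x), 1)
#define unlikely(x) __builtin_expect (!!(x), 0)

/* Divide NUM by DEN, storing the quotient in *OUT.  Returns 0 on
   success and -1 on division by zero.  */
int
checked_div (int num, int den, int *out)
{
  if (unlikely (den == 0))
    {
      /* This block is predicted to be rarely executed.  The lower the
         assumed probability, the colder it looks to the inliner, which
         is the effect on calls in such blocks that the message
         describes.  */
      return -1;
    }
  *out = num / den;
  return 0;
}
```

The annotation only affects the compiler's static estimate; the function behaves the same whichever way the hint points.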
[PATCH][4.8] S/390: Transactional memory fixes
Hi, with the attached patch we support more operand types in the tabort and tbegin_retry builtins. The patch also removes the constraint letters in the expanders and fixes a builtin prototype in the documentation. The testcase is adjusted accordingly. Bootstrapped and regtested on s390 and s390x with --with-arch=zEC12. I'll apply the patch to mainline and 4.8 branch after waiting for comments. Bye, -Andreas- 2013-10-02 Andreas Krebbel andreas.kreb...@de.ibm.com * config/s390/s390.md (tbegin, tbegin_nofloat, tbegin_retry) (tbegin_retry_nofloat, tend, tabort, tx_assist): Remove constraint letters from expanders. (tbegin_retry, tbegin_retry_nofloat): Change predicate of the retry count to general_operand. (tabort): Give operand 0 a mode. (tabort_1): Add mode and constraint letter for operand 0. * doc/extend.texi: Fix protoype of __builtin_non_tx_store. 2013-10-02 Andreas Krebbel andreas.kreb...@de.ibm.com * gcc.target/s390/htm-1.c: Add more tests to cover different operand types. --- gcc/config/s390/s390.md | 28 !!! gcc/doc/extend.texi |2 ! gcc/testsuite/gcc.target/s390/htm-1.c | 48 +! 3 files changed, 25 insertions(+), 53 modifications(!) Index: gcc/config/s390/s390.md === *** gcc/config/s390/s390.md.orig --- gcc/config/s390/s390.md *** *** 9962,9969 ; Non-constrained transaction begin (define_expand tbegin ! [(match_operand:SI 0 register_operand =d) !(match_operand:BLK 1 memory_operand =Q)] TARGET_HTM { s390_expand_tbegin (operands[0], operands[1], NULL_RTX, true); --- 9962,9969 ; Non-constrained transaction begin (define_expand tbegin ! [(match_operand:SI 0 register_operand ) !(match_operand:BLK 1 memory_operand )] TARGET_HTM { s390_expand_tbegin (operands[0], operands[1], NULL_RTX, true); *** *** 9971,9978 }) (define_expand tbegin_nofloat ! [(match_operand:SI 0 register_operand =d) !(match_operand:BLK 1 memory_operand =Q)] TARGET_HTM { s390_expand_tbegin (operands[0], operands[1], NULL_RTX, false); --- 9971,9978 }) (define_expand tbegin_nofloat ! 
[(match_operand:SI 0 register_operand ) !(match_operand:BLK 1 memory_operand )] TARGET_HTM { s390_expand_tbegin (operands[0], operands[1], NULL_RTX, false); *** *** 9980,9988 }) (define_expand tbegin_retry ! [(match_operand:SI 0 register_operand =d) !(match_operand:BLK 1 memory_operand =Q) !(match_operand 2 const_int_operand)] TARGET_HTM { s390_expand_tbegin (operands[0], operands[1], operands[2], true); --- 9980,9988 }) (define_expand tbegin_retry ! [(match_operand:SI 0 register_operand ) !(match_operand:BLK 1 memory_operand ) !(match_operand:SI 2 general_operand )] TARGET_HTM { s390_expand_tbegin (operands[0], operands[1], operands[2], true); *** *** 9990,9998 }) (define_expand tbegin_retry_nofloat ! [(match_operand:SI 0 register_operand =d) !(match_operand:BLK 1 memory_operand =Q) !(match_operand 2 const_int_operand)] TARGET_HTM { s390_expand_tbegin (operands[0], operands[1], operands[2], false); --- 9990,9998 }) (define_expand tbegin_retry_nofloat ! [(match_operand:SI 0 register_operand ) !(match_operand:BLK 1 memory_operand ) !(match_operand:SI 2 general_operand )] TARGET_HTM { s390_expand_tbegin (operands[0], operands[1], operands[2], false); *** *** 10059,10065 (define_expand tend [(set (reg:CCRAW CC_REGNUM) (unspec_volatile:CCRAW [(const_int 0)] UNSPECV_TEND)) !(set (match_operand:SI 0 register_operand =d) (unspec:SI [(reg:CCRAW CC_REGNUM)] UNSPEC_CC_TO_INT))] TARGET_HTM ) --- 10059,10065 (define_expand tend [(set (reg:CCRAW CC_REGNUM) (unspec_volatile:CCRAW [(const_int 0)] UNSPECV_TEND)) !(set (match_operand:SI 0 register_operand ) (unspec:SI [(reg:CCRAW CC_REGNUM)] UNSPEC_CC_TO_INT))] TARGET_HTM ) *** *** 10074,10080 ; Transaction abort (define_expand tabort ! [(unspec_volatile [(match_operand 0 shift_count_or_setmem_operand )] UNSPECV_TABORT)] TARGET_HTM operands != NULL { --- 10074,10080 ; Transaction abort (define_expand tabort ! 
[(unspec_volatile [(match_operand:SI 0 shift_count_or_setmem_operand )] UNSPECV_TABORT)] TARGET_HTM operands != NULL { *** *** 10089,10095 }) (define_insn *tabort_1 ! [(unspec_volatile [(match_operand 0 shift_count_or_setmem_operand )] UNSPECV_TABORT)] TARGET_HTM operands != NULL tabort\t%Y0 --- 10089,10095 }) (define_insn *tabort_1 !
Re: [PATCH] fix size_estimation for builtin_expect
> Hi,
>
> builtin_expect should be a NOP in size_estimation. Indeed, the call
> stmt itself is 0 weight in size and time. But it may introduce
> an extra relation expr which has non-zero size/time. The end result
> is: with and without builtin_expect, we have different size/time
> estimation for inlining.
>
> This patch fixes this problem.
>
> An earlier discussion of this patch is
> https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315
>
> This new patch addresses Honza's comments.
> It passes the bootstrap and regression.
>
> Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I
> think that's overkill for this simple problem. Your code is mostly
> dealing with recursively walking the PHI nodes to find the real def
> stmts. Here the traversal is within one BB and I may need to continue
> on multiple real assignments. Calling walk_use_def_chains probably only
> uses the SSA_NAME_DEF_STMT() part of the code.
>
> Thanks,
>
> -Rong

This patch is OK. Add white space after

+  bool match = false;
+  bool done = false;

and fix

+      if (match && single_imm_use (var, &use_p, &use_stmt) &&
+          (gimple_code (use_stmt) == GIMPLE_COND))

&& should be at the beginning of the new line.

Thanks,
Honza
[C++ Patch] PR 58535
Hi,

in this [4.8/4.9] diagnostic regression the gcc_assert in check_member_templates trips:

    /* The parser rejects any use of virtual in a function template.  */
    gcc_assert (!(TREE_CODE (decl) == FUNCTION_DECL
                  && DECL_VIRTUAL_P (decl)));

the ultimate reason being that in r187587 we inadvertently (I suppose, because the change has no relation to the rest of the commit and isn't explained) changed cp_parser_function_specifier_opt to always set ds_virtual in decl_specs, upon error too. Thus the below first hunk simply reverts that change and is enough to fix the primary bug.

Then, in mainline only, we have a variant of the issue for our implicit function templates extension, which requires a little more work, because when we parse 'virtual' we don't know yet that all the function parameters will be auto. To handle that I'm adding a check in finish_fully_implicit_template.

Tested x86_64-linux.

Thanks,
Paolo.

/cp
2013-10-02  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/58565
        * semantics.c (potential_constant_expression_1): Handle LABEL_EXPR.

/testsuite
2013-10-02  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/58565
        * g++.dg/parse/crash64.C: New.

Index: cp/semantics.c
===
--- cp/semantics.c (revision 203101)
+++ cp/semantics.c (working copy)
@@ -8422,6 +8422,7 @@ potential_constant_expression_1 (tree t, bool want
     case OVERLOAD:
     case TEMPLATE_ID_EXPR:
     case LABEL_DECL:
+    case LABEL_EXPR:
     case CONST_DECL:
     case SIZEOF_EXPR:
     case ALIGNOF_EXPR:
Index: testsuite/g++.dg/parse/crash64.C
===
--- testsuite/g++.dg/parse/crash64.C (revision 0)
+++ testsuite/g++.dg/parse/crash64.C (working copy)
@@ -0,0 +1,7 @@
+// PR c++/58565
+// { dg-options "" }
+
+void foo()
+{
+  int i = ({ L: ; }); // { dg-error "void value not ignored" }
+}
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
> 2013-09-29  Teresa Johnson  tejohn...@google.com
>
>         * bb-reorder.c (find_rarely_executed_basic_blocks_and_crossing_edges):
>         Treat profile insanities conservatively.
>         * predict.c (probably_never_executed): New function. Treat profile
>         insanities conservatively.
>         (probably_never_executed_bb_p): Invoke probably_never_executed.
>         (probably_never_executed_edge_p): Invoke probably_never_executed.
>
> Index: bb-reorder.c
> ===
> --- bb-reorder.c (revision 202947)
> +++ bb-reorder.c (working copy)
> @@ -1564,8 +1564,25 @@ find_rarely_executed_basic_blocks_and_crossing_edg
>    /* Mark which partition (hot/cold) each basic block belongs in.  */
>    FOR_EACH_BB (bb)
>      {
> +      bool cold_bb = false;

Whitespace here.

>        if (probably_never_executed_bb_p (cfun, bb))
>          {
> +          /* Handle profile insanities created by upstream optimizations
> +             by also checking the incoming edge weights. If there is a non-cold
> +             incoming edge, conservatively prevent this block from being split
> +             into the cold section.  */
> +          cold_bb = true;
> +          FOR_EACH_EDGE (e, ei, bb->preds)
> +            {
> +              if (!probably_never_executed_edge_p (cfun, e))
> +                {
> +                  cold_bb = false;
> +                  break;
> +                }
> +            }

You can probably eliminate the extra braces. So we won't propagate deeper in the CFG, right? This change is OK.

> +        }
> +      if (cold_bb)
> +        {
>            BB_SET_PARTITION (bb, BB_COLD_PARTITION);
>            cold_bb_count++;
>          }
> Index: predict.c
> ===
> --- predict.c (revision 202947)
> +++ predict.c (working copy)
> @@ -226,26 +226,26 @@ maybe_hot_edge_p (edge e)
>  }
>
>
> -/* Return true in case BB is probably never executed.  */
> -bool
> -probably_never_executed_bb_p (struct function *fun, const_basic_block bb)
> +/* Return true if profile COUNT and FREQUENCY, or function FUN static
> +   node frequency reflects never being executed.  */
> +
> +static bool
> +probably_never_executed (struct function *fun,
> +                         gcov_type count, int frequency)
>  {
>    gcc_checking_assert (fun);
>    if (profile_status_for_function (fun) == PROFILE_READ)
>      {
> -      if ((bb->count * 4 + profile_info->runs / 2) / profile_info->runs > 0)
> +      if ((count * 4 + profile_info->runs / 2) / profile_info->runs > 0)
>          return false;
> -      if (!bb->frequency)
> -        return true;
> -      if (!ENTRY_BLOCK_PTR->frequency)
> -        return false;
> -      if (ENTRY_BLOCK_PTR->count && ENTRY_BLOCK_PTR->count < REG_BR_PROB_BASE)
> -        {
> -          return (RDIV (bb->frequency * ENTRY_BLOCK_PTR->count,
> -                        ENTRY_BLOCK_PTR->frequency)
> -                  < REG_BR_PROB_BASE / 4);
> -        }
> +      // If this is a profiled function (entry bb non-zero count), then base
> +      // the coldness decision on the frequency. This will handle cases where
> +      // counts are not updated properly during optimizations or expansion.
> +      if (ENTRY_BLOCK_PTR->count)
> +        return frequency == 0;
> +      // Unprofiled function, frequencies statically assigned. All bbs are
> +      // treated as cold.

I would avoid combining C and C++ comments in the function. Did you get some data on how many basic blocks we now consider hot? The previous implementation considered a block as never executed when the frequencies indicate that it is executed in at most 1/4th of invocations of the program. You essentially change this to 1/1. The first seems a bit too high given the way we distribute probabilities in dojump and friends; the second looks too low.

The change introducing probably_never_executed with the current logic is OK. We may want to fine tune the ratio.

Honza

>        return true;
>      }
>    if ((!profile_info || !flag_branch_probabilities)
> @@ -256,19 +256,21 @@ maybe_hot_edge_p (edge e)
>  }
>
>
> +/* Return true in case BB is probably never executed.  */
> +
> +bool
> +probably_never_executed_bb_p (struct function *fun, const_basic_block bb)
> +{
> +  return probably_never_executed (fun, bb->count, bb->frequency);
> +}
> +
> +
>  /* Return true in case edge E is probably never executed.  */
>
>  bool
>  probably_never_executed_edge_p (struct function *fun, edge e)
>  {
> -  gcc_checking_assert (fun);
> -  if (profile_info && flag_branch_probabilities)
> -    return ((e->count + profile_info->runs / 2) / profile_info->runs) == 0;
> -  if ((!profile_info || !flag_branch_probabilities)
> -      && (cgraph_get_node (fun->decl)->frequency
> -          == NODE_FREQUENCY_UNLIKELY_EXECUTED))
> -    return true;
> -  return false;
> +  return probably_never_executed (fun, e->count, EDGE_FREQUENCY (e));
>  }
>
>  /* Return true if NODE should be optimized for size.  */
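Two pieces of logic in this patch are easy to model in isolation: the rounded count test shared by blocks and edges, and the rule that a block stays out of the cold partition if any incoming edge is not cold. The following C sketch uses hypothetical names and plain integers instead of GCC's profile structures; it assumes the comparison in the count test is "> 0".

```c
#include <stdbool.h>

/* Model of the count test: COUNT * 4 / RUNS rounded to nearest.  A
   non-zero result means the block or edge runs often enough that it is
   not treated as "probably never executed".  */
bool
count_says_executed (long long count, long long runs)
{
  return (count * 4 + runs / 2) / runs > 0;
}

/* Model of the bb-reorder.c change: a block that looks never executed is
   still kept out of the cold section when at least one predecessor edge
   is not itself considered never executed.  */
bool
block_goes_cold (bool bb_never_executed,
                 const bool *pred_edge_never_executed, int n_preds)
{
  if (!bb_never_executed)
    return false;
  for (int i = 0; i < n_preds; i++)
    if (!pred_edge_never_executed[i])
      return false;
  return true;
}
```

With 100 training runs, a count of 12 rounds down to zero and a count of 13 rounds up to one, so the count test flips at roughly one execution per eight runs.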
Re: [C++ Patch] PR 58535
... oops, I attached the patch which I just committed. Sorry. The right attachments are below.

Thanks,
Paolo.

/cp
2013-10-02  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/58535
        * parser.c (cp_parser_function_specifier_opt): Upon error about
        virtual templates don't set ds_virtual.
        (finish_fully_implicit_template): Reject virtual implicit templates.

/testsuite
2013-10-02  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/58535
        * g++.dg/parse/crash65.C: New.
        * g++.dg/cpp1y/pr58535.C: Likewise.

Index: cp/parser.c
===
--- cp/parser.c (revision 203110)
+++ cp/parser.c (working copy)
@@ -11460,7 +11460,8 @@ cp_parser_function_specifier_opt (cp_parser* parse
          A member function template shall not be virtual.  */
       if (PROCESSING_REAL_TEMPLATE_DECL_P ())
         error_at (token->location, "templates may not be %<virtual%>");
-      set_and_check_decl_spec_loc (decl_specs, ds_virtual, token);
+      else
+        set_and_check_decl_spec_loc (decl_specs, ds_virtual, token);
       break;

     case RID_EXPLICIT:
@@ -29035,6 +29036,14 @@ finish_fully_implicit_template (cp_parser *parser,
 {
   gcc_assert (parser->fully_implicit_function_template_p);

+  if (member_decl_opt && member_decl_opt != error_mark_node
+      && DECL_VIRTUAL_P (member_decl_opt))
+    {
+      error_at (DECL_SOURCE_LOCATION (member_decl_opt),
+                "implicit templates may not be %<virtual%>");
+      DECL_VIRTUAL_P (member_decl_opt) = false;
+    }
+
   pop_deferring_access_checks ();
   if (member_decl_opt)
     member_decl_opt = finish_member_template_decl (member_decl_opt);
Index: testsuite/g++.dg/cpp1y/pr58535.C
===
--- testsuite/g++.dg/cpp1y/pr58535.C (revision 0)
+++ testsuite/g++.dg/cpp1y/pr58535.C (working copy)
@@ -0,0 +1,7 @@
+// PR c++/58535
+// { dg-options "-std=gnu++1y" }
+
+struct A
+{
+  virtual void foo(auto); // { dg-error "templates" }
+};
Index: testsuite/g++.dg/parse/crash65.C
===
--- testsuite/g++.dg/parse/crash65.C (revision 0)
+++ testsuite/g++.dg/parse/crash65.C (working copy)
@@ -0,0 +1,6 @@
+// PR c++/58535
+
+struct A
+{
+  template<int> virtual void foo(); // { dg-error "templates" }
+};
[PATCH, build]: Update x-i386 and x-alpha for automatic dependencies
Hello!

2013-10-02  Uros Bizjak  ubiz...@gmail.com

        * config/i386/x-i386 (driver-i386.o): Remove header dependencies.
        Use $(COMPILE) and $(POSTCOMPILE).
        * config/alpha/x-alpha (driver-alpha.o): Ditto.

Bootstrapped on x86_64-pc-linux-gnu and alphaev68-pc-linux-gnu, committed to mainline SVN.

Uros.

Index: alpha/x-alpha
===
--- alpha/x-alpha (revision 203117)
+++ alpha/x-alpha (working copy)
@@ -1,3 +1,3 @@
-driver-alpha.o: $(srcdir)/config/alpha/driver-alpha.c \
-  $(CONFIG_H) $(SYSTEM_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+driver-alpha.o: $(srcdir)/config/alpha/driver-alpha.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
Index: i386/x-i386
===
--- i386/x-i386 (revision 203117)
+++ i386/x-i386 (working copy)
@@ -1,4 +1,3 @@
-driver-i386.o : $(srcdir)/config/i386/driver-i386.c \
-  $(srcdir)/config/i386/cpuid.h \
-  $(CONFIG_H) $(SYSTEM_H) $(TM_H) coretypes.h
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+driver-i386.o : $(srcdir)/config/i386/driver-i386.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
Re: [c++-concepts] constrained friends redux
On 10/02/2013 09:05 AM, Andrew Sutton wrote:
> +      // Do not permit the declaration of constrained friend
> +      // function declarations. They cannot be instantiated since
> +      // the resulting declaration would never match the definition,
> +      // which must be a non-template and cannot be constrained.

You're in the template-id code here, so "must be a non-template" is confusing:

template <class T> void f();

struct A {
  friend void f<int>(); // matches a template
};

Perhaps you mean that it must match a fully-instantiated function, so any constraints on the templates were considered during determine_specialization.

> +      error("constrained friend does not depend on template parameters");

Space before (.

> +// Returns true if FN is a non-template member function.
> +static inline bool
>  is_non_template_member_fn (tree fn)
>  {
>    return DECL_FUNCTION_MEMBER_P (fn)
> @@ -1829,6 +1829,21 @@ is_non_template_member_fn (tree fn)
>        && !DECL_MEMBER_TEMPLATE_P (DECL_TI_TEMPLATE (fn));
>  }
>
> +// Returns true if FN is a non-template friend definition.
> +static inline bool
> +is_non_template_friend (tree fn)

These names/comments fail to make it clear that they return true only for non-template members/friends *of class template specializations*. So my preference would be to open-code them into is_constrainable_non_template_fn.

Jason
Re: [C++ Patch] PR 58535
OK. Jason
Re: [patch] move phiopt, ssa-dce and ssa-dom prototypes.
Andrew MacLeod amacl...@redhat.com wrote:
> Handle a few more prototypes in tree-flow.h
>
> There were only 2 routines exported from tree-ssa-phiopts, and neither
> really belonged there.
> * I moved nonfreeing_call_p() to gimple.c since it is gimple dependent.
>
> blocks_in_phiopt_order() returns basic blocks in an order that
> guarantees any single predecessor is visited before its successor.
> After looking around, I think it really belongs in cfganal.c, so I
> moved it there... unless you can think of somewhere better. I also
> renamed it to be more generally descriptive:
> single_pred_before_succ_order... for lack of anything better.
>
> tree-ssa-dom.h needed to be created for a few prototypes.
>
> The only 2 exports from tree-ssa-dce.c were
> mark_virtual_operand_for_renaming and
> mark_virtual_phi_result_for_renaming. I moved them to tree-into-ssa.c
> which already has mark_virtual_operands_for_renaming (). Names are
> confusing a bit I find... and do we even need these routines? Aren't
> all virtual uses based off the one root variable? so isn't all that
> walking and SET_USE-ing a waste of time? Or is that still needed by
> the incremental bits? I never really paid attention to how that
> worked :-)

It can be a bit tricky sometimes, but yes, another cleanup is on my long todo list.

> Bootstrapped on x86_64-unknown-linux-gnu, and currently running
> regressions. Assuming it passes, OK?

Ok.

Thanks,
Richard.

> Andrew
Re: [patch] More tree-flow.h prototypes.
Andrew MacLeod amacl...@redhat.com wrote:
> On 10/02/2013 07:58 AM, Andrew MacLeod wrote:
>> On 10/02/2013 04:37 AM, Richard Biener wrote:
>>> On Tue, Oct 1, 2013 at 11:01 PM, Andrew MacLeod amacl...@redhat.com wrote:
>>>> This patch moves prototypes into gimple-fold.h (which already
>>>> existed). There were a few in tree-flow.h and a bunch in gimple.h.
>>>> The routines are used frequently enough that it makes sense to
>>>> include gimple-fold.h from gimple.h instead of from within each .c
>>>> file that needs it. (presumably why the prototypes were in gimple.h
>>>> to begin with). I took gimple-fold.h out of whatever .c files it
>>>> was included in.
>>>>
>>>> tree-ssa-copy.h was also created for the prototypes in that file
>>>> and included from tree-ssa.h.
>>>
>>> These should probably be moved elsewhere (tree-ssa-copy.c is supposed
>>> to be the copy propagation pass file). But that can be done as
>>> followup.
>>
>> hmm, easy enough to move them *all* to tree-ssa-propagate.[ch] right
>> now and check it in... That seems like the right place for all of
>> them and then we don't even need to create tree-ssa-copy.h...?
>
> Like so.. and directly include tree-ssa-propagate.h in the 3 .c files
> that need it now.
> bootstrapped on x86_64-unknown-linux-gnu.. regressions running.
>
> Prefer this?

Yes.

Richard.

> Andrew
Re: [PATCH] Improve probability/profile distribution in ORIF expansion
On Wed, Oct 2, 2013 at 9:03 AM, Jan Hubicka hubi...@ucw.cz wrote:
>> 2013-10-01  Teresa Johnson  tejohn...@google.com
>>
>>         * dojump.c (do_jump_1): Divide probability between
>>         both conditions of a TRUTH_ORIF_EXPR.
>
>> +      {
>> +        /* Spread the probability evenly between the two conditions. So
>> +           the first condition has half the total probability of being true.
>> +           The second condition has the other half of the total probability,
>> +           so its jump has a probability of half the total, relative to
>> +           the probability we reached it (i.e. the first condition was false).  */
>> +        int op0_prob = prob / 2;
>> +        int op1_prob = GCOV_COMPUTE_SCALE ((prob / 2), inv (op0_prob));
>
> Documentation of the functions says that PROB may be -1 when it is unknown,
> In that case you want to arrange op0_prob=op1_prob = -1.

Fixed.

> What about TRUTH_ANDIF_EXPR code above? I think it needs similar adjusting

Yes. When I first looked at it yesterday I thought it was ok, but I see we need to do something similar. Essentially, the probability that either condition is false is half the probability the entire ANDIF expression is false. So basically we do the same computation, but on the expression's false probability, and invert the resulting false condition probabilities.

Here is the new patch I am testing. Will commit after testing completes.

Thanks,
Teresa

2013-10-02  Teresa Johnson  tejohn...@google.com

        * dojump.c (do_jump_1): Divide probability between
        both conditions of a TRUTH_ANDIF_EXPR/TRUTH_ORIF_EXPR.

Index: dojump.c
===
--- dojump.c (revision 203077)
+++ dojump.c (working copy)
@@ -311,32 +311,66 @@ do_jump_1 (enum tree_code code, tree op0, tree op1
       break;

     case TRUTH_ANDIF_EXPR:
-      if (if_false_label == NULL_RTX)
-        {
-          drop_through_label = gen_label_rtx ();
-          do_jump (op0, drop_through_label, NULL_RTX, prob);
-          do_jump (op1, NULL_RTX, if_true_label, prob);
-        }
-      else
-        {
-          do_jump (op0, if_false_label, NULL_RTX, prob);
-          do_jump (op1, if_false_label, if_true_label, prob);
-        }
-      break;
+      {
+        /* Spread the probability that the expression is false evenly between
+           the two conditions. So the first condition is false half the total
+           probability of being false. The second condition is false the other
+           half of the total probability of being false, so its jump has a false
+           probability of half the total, relative to the probability we
+           reached it (i.e. the first condition was true). */
+        int op0_prob = -1;
+        int op1_prob = -1;
+        if (prob != -1)
+          {
+            int false_prob = inv (prob);
+            int op0_false_prob = false_prob / 2;
+            int op1_false_prob = GCOV_COMPUTE_SCALE ((false_prob / 2),
+                                                     inv (op0_false_prob));
+            /* Get the probability that each jump below is true. */
+            op0_prob = inv (op0_false_prob);
+            op1_prob = inv (op1_false_prob);
+          }
+        if (if_false_label == NULL_RTX)
+          {
+            drop_through_label = gen_label_rtx ();
+            do_jump (op0, drop_through_label, NULL_RTX, op0_prob);
+            do_jump (op1, NULL_RTX, if_true_label, op1_prob);
+          }
+        else
+          {
+            do_jump (op0, if_false_label, NULL_RTX, op0_prob);
+            do_jump (op1, if_false_label, if_true_label, op1_prob);
+          }
+        break;
+      }

     case TRUTH_ORIF_EXPR:
-      if (if_true_label == NULL_RTX)
-        {
-          drop_through_label = gen_label_rtx ();
-          do_jump (op0, NULL_RTX, drop_through_label, prob);
-          do_jump (op1, if_false_label, NULL_RTX, prob);
-        }
-      else
-        {
-          do_jump (op0, NULL_RTX, if_true_label, prob);
-          do_jump (op1, if_false_label, if_true_label, prob);
-        }
-      break;
+      {
+        /* Spread the probability evenly between the two conditions. So
+           the first condition has half the total probability of being true.
+           The second condition has the other half of the total probability,
+           so its jump has a probability of half the total, relative to
+           the probability we reached it (i.e. the first condition was false). */
+        int op0_prob = -1;
+        int op1_prob = -1;
+        if (prob != -1)
+          {
+            op0_prob = prob / 2;
+            op1_prob = GCOV_COMPUTE_SCALE ((prob / 2), inv (op0_prob));
+          }
+        if (if_true_label == NULL_RTX)
+          {
+            drop_through_label = gen_label_rtx ();
+            do_jump (op0, NULL_RTX, drop_through_label, op0_prob);
+            do_jump (op1, if_false_label, NULL_RTX, op1_prob);
+          }
+        else
+          {
+            do_jump
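As a rough check of the arithmetic in the dojump.c patch above, the following standalone sketch recomputes the two per-condition probabilities outside of GCC. Everything here is a stand-in written for this example: PROB_BASE assumes a fixed-point base of 10000, and inv and compute_scale are assumed to behave like GCC's inv and GCOV_COMPUTE_SCALE (a rounded division scaled by the base). The real definitions may differ in detail.

```c
#include <assert.h>

/* Stand-ins for GCC internals.  The base value and the rounding are
   assumptions made for this sketch, not copied from the GCC sources.  */
#define PROB_BASE 10000

/* Complement of a probability expressed in 1/PROB_BASE units.  */
static int inv (int prob)
{
  return PROB_BASE - prob;
}

/* NUM / DEN as a fixed-point probability, rounded to nearest.  */
static int compute_scale (int num, int den)
{
  if (den == 0)
    return PROB_BASE;
  return (int) (((long long) num * PROB_BASE + den / 2) / den);
}

/* Split PROB, the probability that "op0 || op1" is true, between the
   two jumps.  -1 means "unknown" and is passed through unchanged.  */
static void split_orif (int prob, int *op0_prob, int *op1_prob)
{
  assert (prob >= -1 && prob <= PROB_BASE);
  *op0_prob = -1;
  *op1_prob = -1;
  if (prob != -1)
    {
      *op0_prob = prob / 2;
      /* Second jump: half the total, relative to op0 being false.  */
      *op1_prob = compute_scale (prob / 2, inv (*op0_prob));
    }
}

/* Same idea for "op0 && op1", applied to the false probability and
   then inverted back into "jump is true" probabilities.  */
static void split_andif (int prob, int *op0_prob, int *op1_prob)
{
  assert (prob >= -1 && prob <= PROB_BASE);
  *op0_prob = -1;
  *op1_prob = -1;
  if (prob != -1)
    {
      int false_prob = inv (prob);
      int op0_false_prob = false_prob / 2;
      int op1_false_prob = compute_scale (false_prob / 2,
                                          inv (op0_false_prob));
      *op0_prob = inv (op0_false_prob);
      *op1_prob = inv (op1_false_prob);
    }
}
```

Under these assumptions, an ORIF with prob = 8000 gives 4000 for the first jump and 6667 for the second, and 4000 + 0.6 * 6667 is again about 8000, so the overall probability is preserved.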
Honor ix86_accumulate_outgoing_args again
Hi,
currently ix86_accumulate_outgoing_args is ignored on all targets except Solaris (which sets USE_IX86_FRAME_POINTER to true). This seems to be an accidental effect of http://gcc.gnu.org/ml/gcc-patches/2010-08/txt00102.txt that enabled omit-frame-pointer for 32bit (I take it the 64bit change was purely accidental), probably based on the fact that non-accumulate-outgoing-args was not doing well with asynchronous unwind info. The reason for this seems to be gone with http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00995.html

So I think we ought to honor accumulate-outgoing-args again and in fact consider disabling it for generic. It is disabled for core (that may need re-benchmarking). For all AMD targets it is currently on. I tested disabling it on Bulldozer 32bit and it seems mostly SPEC neutral for specint2000 (I am waiting for more benchmarks), with very nice code size improvements in all benchmarks with the exception of MCF with LTO (not sure at all why), and an overall reduction of 5.2% (approximately the same gain as we get for -flto): http://gcc.opensuse.org/SPEC/CINT/sb-megrez-head-64-32o-32bit/size.html

There may be close-to-noise drops, as seen in http://gcc.opensuse.org/SPEC/CINT/sb-megrez-head-64-32o-32bit/recent.html I will see how other tests shape up and wait for multiple runs to show how much of this is actual noise. We may consider disabling it for size optimized functions and -O2 (and not for -O3) at least.

This patch, however, only removes the code that forcibly enables MASK_ACCUMULATE_OUTGOING_ARGS. If there are no complaints, I will commit it tomorrow.

Honza

* i386.c (ix86_option_override_internal): Do not force ACCUMULATE_OUTGOING_ARGS when unwind info is generated.
Index: config/i386/i386.c
===
--- config/i386/i386.c (revision 203117)
+++ config/i386/i386.c (working copy)
@@ -3793,28 +3793,11 @@ ix86_option_override_internal (bool main
     }

   ix86_tune_mask = 1u << ix86_tune;
-  if ((!USE_IX86_FRAME_POINTER
-       || (x86_accumulate_outgoing_args & ix86_tune_mask))
+  if ((x86_accumulate_outgoing_args & ix86_tune_mask)
       && !(target_flags_explicit & MASK_ACCUMULATE_OUTGOING_ARGS)
       && !optimize_size)
     target_flags |= MASK_ACCUMULATE_OUTGOING_ARGS;

-  /* ??? Unwind info is not correct around the CFG unless either a frame
-     pointer is present or M_A_O_A is set. Fixing this requires rewriting
-     unwind info generation to be aware of the CFG and propagating states
-     around edges. */
-  if ((flag_unwind_tables || flag_asynchronous_unwind_tables
-       || flag_exceptions || flag_non_call_exceptions)
-      && flag_omit_frame_pointer
-      && !(target_flags & MASK_ACCUMULATE_OUTGOING_ARGS))
-    {
-      if (target_flags_explicit & MASK_ACCUMULATE_OUTGOING_ARGS)
-        warning (0, "unwind tables currently require either a frame pointer "
-                 "or %saccumulate-outgoing-args%s for correctness",
-                 prefix, suffix);
-      target_flags |= MASK_ACCUMULATE_OUTGOING_ARGS;
-    }
-
   /* If stack probes are required, the space used for large function
      arguments on the stack must also be probed, so enable
      -maccumulate-outgoing-args so this happens in the prologue. */
Re: libgo patch committed: Implement reflect.MakeFunc for 386
On Wed, Oct 2, 2013 at 7:45 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:

Here's what I came up with. As I said, it is inspired by the libffi code, but a bit simplified, since e.g. things like the lack of .ascii support aren't relevant on the Solaris versions supported on mainline and the 4.8 branch. Bootstrapped on x86_64-unknown-linux-gnu and i386-pc-solaris2.10 with Sun as and gas. I've also compared the readelf --debug-dump=frames output for the 32 and 64-bit makefunc.o, both PIC and non-PIC. 64-bit is completely unchanged, while for 32-bit there are FDE encoding changes as expected from the FDE_ENCODING/FDE_ENCODE macros.

Rainer

2013-10-01 Rainer Orth r...@cebitec.uni-bielefeld.de

* configure.ac (libgo_cv_ro_eh_frame): New test.
(libgo_cv_as_comdat_gnu): Likewise.
(libgo_cv_as_x86_pcrel): Likewise.
(libgo_cv_as_x86_64_unwind_section_type): Likewise.
* configure: Regenerate.
* config.h.in: Regenerate.
* go/reflect/makefunc_386.S: Replace CFI directives by hand-coded .eh_frame section. Restrict .note.* sections to Linux.
* go/reflect/makefunc_amd64.S: Likewise.

Great, thanks for working on this. Committed to trunk and 4.8 branch.

Ian
Re: [PATCH] Reducing number of alias checks in vectorization.
On Tue, Oct 1, 2013 at 11:35 PM, Jakub Jelinek ja...@redhat.com wrote:

On Tue, Oct 01, 2013 at 07:12:54PM -0700, Cong Hou wrote:

--- gcc/tree-vect-loop-manip.c (revision 202662)
+++ gcc/tree-vect-loop-manip.c (working copy)

Your mailer ate all the tabs, so the formatting of the whole patch can't be checked.

I'll pay attention to this problem in my later patch submissions.

@@ -19,6 +19,10 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3. If not see
 http://www.gnu.org/licenses/. */
+#include <vector>
+#include <utility>
+#include <algorithm>

Why? GCC has its vec.h vectors, why don't you use those? There is even a qsort method for you in there. And for pairs, you can easily just use structs with two members as structure elements in the vector.

GCC is now being restructured using C++, and STL is one of the most important parts of C++. I am new to the GCC community and more familiar with STL (and I think allowing STL in GCC could attract more new developers to GCC). I agree that using GCC's vec maintains a uniform style, but STL is just so powerful and easy to use... I just did a search in the GCC source tree and found that vector is not used yet. I will change std::vector to GCC's vec for now (and also use qsort), but am still wondering if one day GCC would accept STL.

+struct dr_addr_with_seg_len
+{
+  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
+    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
+
+  data_reference* dr;

Space should be before *, not after it.

+  if (TREE_CODE (p11.offset) != INTEGER_CST
+      || TREE_CODE (p21.offset) != INTEGER_CST)
+    return p11.offset < p21.offset;

If offset isn't INTEGER_CST, you are comparing the pointer values? That is never a good idea, then compilation will depend on how say address space randomization randomizes virtual address space. GCC needs to have reproducible compilations.

In this scenario comparing pointers is safe. The sort is used to put together any two pairs of data refs which can be merged. For example, if we have (a, b), (a, c), (a, b+1), then after sorting them we should have either (a, b), (a, b+1), (a, c) or (a, c), (a, b), (a, b+1). We don't care about the relative order of non-mergeable dr pairs here. So although the sorting result may vary, the final result we get should not change.

+  if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
+    return int_cst_value (p11.offset) < int_cst_value (p21.offset);

This is going to ICE whenever the offsets wouldn't fit into a HOST_WIDE_INT. I'd say you just shouldn't put into the vector entries where offset isn't host_integerp, those would never be merged with other checks, or something similar.

Do you mean I should use widest_int_cst_value()? Then I will replace all int_cst_value() here with it. I also changed the type of the diff variable to HOST_WIDEST_INT.

Thank you very much for your comments!

Cong

Jakub
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
On Wed, Oct 2, 2013 at 9:19 AM, Jan Hubicka hubi...@ucw.cz wrote:

2013-09-29 Teresa Johnson tejohn...@google.com

* bb-reorder.c (find_rarely_executed_basic_blocks_and_crossing_edges): Treat profile insanities conservatively.
* predict.c (probably_never_executed): New function. Treat profile insanities conservatively.
(probably_never_executed_bb_p): Invoke probably_never_executed.
(probably_never_executed_edge_p): Invoke probably_never_executed.

Index: bb-reorder.c
===
--- bb-reorder.c (revision 202947)
+++ bb-reorder.c (working copy)
@@ -1564,8 +1564,25 @@ find_rarely_executed_basic_blocks_and_crossing_edg
   /* Mark which partition (hot/cold) each basic block belongs in. */
   FOR_EACH_BB (bb)
     {
+      bool cold_bb = false;

whitespace here

meaning add a line of whitespace? Ok, done.

       if (probably_never_executed_bb_p (cfun, bb))
         {
+          /* Handle profile insanities created by upstream optimizations
+             by also checking the incoming edge weights. If there is a non-cold
+             incoming edge, conservatively prevent this block from being split
+             into the cold section. */
+          cold_bb = true;
+          FOR_EACH_EDGE (e, ei, bb->preds)
+            {
+              if (!probably_never_executed_edge_p (cfun, e))
+                {
+                  cold_bb = false;
+                  break;
+                }
+            }

You can probably eliminate the extra braces. So we won't propagate deeper in the CFG, right?

Done.

This change is OK.

+        }
+      if (cold_bb)
+        {
           BB_SET_PARTITION (bb, BB_COLD_PARTITION);
           cold_bb_count++;
         }

Index: predict.c
===
--- predict.c (revision 202947)
+++ predict.c (working copy)
@@ -226,26 +226,26 @@ maybe_hot_edge_p (edge e)
 }

-/* Return true in case BB is probably never executed. */
-bool
-probably_never_executed_bb_p (struct function *fun, const_basic_block bb)
+/* Return true if profile COUNT and FREQUENCY, or function FUN static
+   node frequency reflects never being executed. */
+
+static bool
+probably_never_executed (struct function *fun,
+                         gcov_type count, int frequency)
 {
   gcc_checking_assert (fun);
   if (profile_status_for_function (fun) == PROFILE_READ)
     {
-      if ((bb->count * 4 + profile_info->runs / 2) / profile_info->runs > 0)
+      if ((count * 4 + profile_info->runs / 2) / profile_info->runs > 0)
         return false;
-      if (!bb->frequency)
-        return true;
-      if (!ENTRY_BLOCK_PTR->frequency)
-        return false;
-      if (ENTRY_BLOCK_PTR->count && ENTRY_BLOCK_PTR->count < REG_BR_PROB_BASE)
-        {
-          return (RDIV (bb->frequency * ENTRY_BLOCK_PTR->count,
-                        ENTRY_BLOCK_PTR->frequency)
-                  < REG_BR_PROB_BASE / 4);
-        }
+      // If this is a profiled function (entry bb non-zero count), then base
+      // the coldness decision on the frequency. This will handle cases where
+      // counts are not updated properly during optimizations or expansion.
+      if (ENTRY_BLOCK_PTR->count)
+        return frequency == 0;
+      // Unprofiled function, frequencies statically assigned. All bbs are
+      // treated as cold.

I would avoid combining C and C++ comments in the function.

Fixed.

Did you get some data on how many basic blocks we now consider hot?

No, I can do that.

The previous implementation considered a block as never executed when frequencies indicate that it is executed in at most 1/4th of invocations of the program. You essentially change that to 1/1. The first seems a bit too high given the way we distribute probabilities in dojump and friends, the second looks too low.

But why do we want to consider blocks as probably never executed when the frequency suggests they are sometimes executed? AFAICT, there are 2 main callers of this routine:
1) function splitting in bb-layout
2) function cgraph node weight
Where #2 will affect optimization of the function for size and also function layout by the linker.

I would argue that for function splitting, we really want to know when it is probably *never* executed - i.e. completely cold, since the cost of jumping back and forth to the cold section is likely to be high. I am not sure for #2 what the right ratio is. For function layout, we may also want to place only really cold *never* executed functions into the cold section, but I am less sure about optimization for size.

Perhaps we really need two different interfaces to test for different levels of coldness:

probably_never_executed()
- returns true when there is profile information for the function and the bb has 0 count and 0 frequency.
- invoked from bb-reorder.cc to drive
[patch] tree-eh.c prototypes
This patch moves the prototypes for tree-eh.c into a new file tree-eh.h. This file is in fact really gimple-eh; we'll rename it later with the other tree-gimple renaming that is needed.

However, using_eh_for_cleanups() is in fact a front end routine which is called when eh regions are used for cleanups. It sets a static flag in tree-eh.c which is only examined from one place in tree-eh.c. I think 4 or 5 of the front ends call this routine. Since this is really a front end interface routine, I kept the name and moved it and the static variable to tree.[ch] for now and added a query routine. This prevents the front ends from having to include any of this gimple stuff.

Bootstraps on x86_64-unknown-linux-gnu and has no new regressions. OK?

Andrew

PS. Do we want to put debug routines in the .h file? I ask because I see a few are, but in many other cases there are a number of them in the .c file which are not explicitly exported. Often their names aren't very useful either, and sometimes they utilize structs or types that are specific to that .c file. Mostly I think they are not static simply because the debugger needs them, so the compiler won't throw them away. For instance, tree-ssa-pre.c has 3 of them, including a very common form: debug_bitmap_sets_for_bb(basic_block bb)... This prints bitmaps based on internal meanings of the bits. I see numerous other files which have similar, if slightly different, names to do a similar function. And in fact, tree-ssa-pre.c will have no header file, unless we need a place to put these 3 debug routines. My personal preference is to simply leave them in the .c file, mostly because they can have internal types. Ideally, all the prototypes would be listed early in the .c file in one place so anyone trying to debug something can find them easily.

* tree-flow.h: Remove some prototypes.
* tree.h: Remove some prototypes, add a couple.
* tree.c (using_eh_for_cleanups_flag, using_eh_for_cleanups, using_eh_for_cleanups_p): Add interface routines for front ends.
* tree-eh.h: New file. Add prototypes.
* tree-eh.c (using_eh_for_cleanups_p, using_eh_for_cleanups): Delete.
(add_stmt_to_eh_lp_fn): Make static.
(lower_try_finally): Use new using_eh_for_cleanups_p.
* emit-rtl.c: Include tree-eh.h.
* gimple.h: Include tree-eh.h.

Index: tree-flow.h
===
*** tree-flow.h (revision 203118)
--- tree-flow.h (working copy)
*** enum move_pos
*** 390,427
  extern enum move_pos movement_possibility (gimple);
  char *get_lsm_tmp_name (tree, unsigned);

- /* In tree-flow-inline.h */
- static inline bool unmodifiable_var_p (const_tree);
- static inline bool ref_contains_array_ref (const_tree);
-
- /* In tree-eh.c */
- extern void make_eh_edges (gimple);
- extern bool make_eh_dispatch_edges (gimple);
- extern edge redirect_eh_edge (edge, basic_block);
- extern void redirect_eh_dispatch_edge (gimple, edge, basic_block);
- extern bool stmt_could_throw_p (gimple);
- extern bool stmt_can_throw_internal (gimple);
- extern bool stmt_can_throw_external (gimple);
- extern void add_stmt_to_eh_lp_fn (struct function *, gimple, int);
- extern void add_stmt_to_eh_lp (gimple, int);
- extern bool remove_stmt_from_eh_lp (gimple);
- extern bool remove_stmt_from_eh_lp_fn (struct function *, gimple);
- extern int lookup_stmt_eh_lp_fn (struct function *, gimple);
- extern int lookup_stmt_eh_lp (gimple);
- extern bool maybe_clean_eh_stmt_fn (struct function *, gimple);
- extern bool maybe_clean_eh_stmt (gimple);
- extern bool maybe_clean_or_replace_eh_stmt (gimple, gimple);
- extern bool maybe_duplicate_eh_stmt_fn (struct function *, gimple,
-                                         struct function *, gimple,
-                                         struct pointer_map_t *, int);
- extern bool maybe_duplicate_eh_stmt (gimple, gimple);
- extern bool verify_eh_edges (gimple);
- extern bool verify_eh_dispatch_edge (gimple);
- extern void maybe_remove_unreachable_handlers (void);
-
- /* In tree-ssa-pre.c */
- void debug_value_expressions (unsigned int);
-
  /* In tree-loop-linear.c */
  extern void linear_transform_loops (void);
  extern unsigned perfect_loop_nest_depth (struct loop *);
--- 390,395
Index: tree.h
===
*** tree.h (revision 203117)
--- tree.h (working copy)
*** extern rtx expand_stack_save (void);
*** 4216,4230
  extern void expand_stack_restore (tree);
  extern void expand_return (tree);

- /* In tree-eh.c */
- extern void using_eh_for_cleanups (void);
-
- extern bool tree_could_trap_p (tree);
- extern bool operation_could_trap_helper_p (enum tree_code, bool, bool, bool,
-                                            bool, tree, bool *);
- extern bool operation_could_trap_p (enum tree_code, bool, bool, tree);
- extern bool tree_could_throw_p (tree);
-
  /* Compare and hash for any structure which begins with a canonical
     pointer. Assumes all pointers are interchangeable, which is
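The front-end interface described in this message (a static flag, a routine the front ends call to set it, and a query routine for the one reader in tree-eh.c) could look roughly like the sketch below. The three names come from the ChangeLog entry for tree.c; the bodies are an assumption made for illustration and are not taken from the actual patch.

```c
#include <stdbool.h>

/* Set once a front end decides to use EH regions for cleanups.
   Assumed to start out false.  */
static bool using_eh_for_cleanups_flag = false;

/* Called by a front end to record that EH regions are used for
   cleanups.  */
void using_eh_for_cleanups (void)
{
  using_eh_for_cleanups_flag = true;
}

/* Query routine, so that the reader of the flag does not need access
   to the static variable itself.  */
bool using_eh_for_cleanups_p (void)
{
  return using_eh_for_cleanups_flag;
}
```

Keeping the variable static and exposing only these two functions means a front end needs just the two prototypes and no gimple headers.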
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
But why do we want to consider blocks as probably never executed when the frequency suggests they are sometimes executed?

Well, probably never executed is meant to refer to one run. If you have something like code handling fatal errors, you probably still want to have it in the cold section even if the user may have trained the program on a testsuite that triggers them once or twice per thousand runs. We may just make the predicate more strict, but let's do that incrementally so we know how much things change. I am somewhat concerned that we are not that effective at breaking out cold code, so -fprofile-use does not lead to as significant code size reductions as the theory would suggest, so perhaps I am just overly conservative about this. Getting the splitting to work reliably is definitely going to be a win.

Perhaps we really need two different interfaces to test for different levels of coldness:

probably_never_executed()
- returns true when there is profile information for the function and the bb has 0 count and 0 frequency.
- invoked from bb-reorder.cc to drive function splitting
- may want to consider invoking this as an additional check before putting function into unlikely text section in the future.

possibly_never_executed()
- essentially the existing logic in probably_never_executed_bb_p
- invoked when marking the cgraph node

Perhaps... The advantage of the hot/normal/cold split is that it is easy to understand, but if necessary (i.e., it becomes impossible to tune well) we may add more stages...

Honza
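To make the two proposed levels of coldness concrete, here is a minimal sketch using a made-up profile record. The struct and its fields are hypothetical stand-ins, not GCC data structures. The strict predicate follows the description quoted above (profile present, zero count, zero frequency). The lenient one is a simplification that keeps only the rounded count-per-run check from the existing code and leaves out its frequency-based checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the profile data attached to a
   basic block; GCC's real structures look different.  */
struct bb_profile
{
  bool has_profile;   /* profile data was read for the function */
  long long count;    /* execution count summed over all training runs */
  int frequency;      /* estimated frequency relative to function entry */
  long long runs;     /* number of training runs */
};

/* Strict test: cold only if the profile says the block never ran.  */
static bool probably_never_executed (const struct bb_profile *p)
{
  return p->has_profile && p->count == 0 && p->frequency == 0;
}

/* Lenient test (simplified): the block counts as cold when its count,
   scaled by 4 and divided by the number of runs with rounding, is
   still zero.  Blocks without profile data are not treated as cold
   here, which is an assumption of this sketch.  */
static bool possibly_never_executed (const struct bb_profile *p)
{
  assert (p->runs > 0);
  if (!p->has_profile)
    return false;
  return (p->count * 4 + p->runs / 2) / p->runs == 0;
}
```

With these definitions, a fatal-error handler hit twice in a thousand training runs is cold for the lenient predicate but not for the strict one, which is the distinction the discussion above turns on.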
Re: [PATCH] Relax the requirement of reduction pattern in GCC vectorizer.
Ping.. Any comment on this patch?

thanks,
Cong

On Sat, Sep 28, 2013 at 9:34 AM, Xinliang David Li davi...@google.com wrote:

You can also add a test case of this form:

int foo( int t, int n, int *dst)
{
  int j = 0;
  int s = 1;
  t++;
  for (j = 0; j < n; j++)
    {
      dst[j] = t;
      s *= t;
    }
  return s;
}

where without the fix the loop vectorization is missed.

David

On Fri, Sep 27, 2013 at 6:28 PM, Cong Hou co...@google.com wrote:

The current GCC vectorizer requires the following pattern as a simple reduction computation:

  loop_header:
    a1 = phi < a0, a2 >
    a3 = ...
    a2 = operation (a3, a1)

But a3 can also be defined outside of the loop. For example, the following loop can benefit from vectorization but the GCC vectorizer fails to vectorize it:

int foo(int v)
{
  int s = 1;
  ++v;
  for (int i = 0; i < 10; ++i)
    s *= v;
  return s;
}

This patch relaxes the original requirement by also considering the following pattern:

  a3 = ...
  loop_header:
    a1 = phi < a0, a2 >
    a2 = operation (a3, a1)

A test case is also added. The patch is tested on x86-64.

thanks,
Cong

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 39c786e..45c1667 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2013-09-27 Cong Hou co...@google.com
+
+ * tree-vect-loop.c: Relax the requirement of the reduction
+ pattern so that one operand of the reduction operation can
+ come from outside of the loop.
+
 2013-09-25 Tom Tromey tro...@redhat.com

  * Makefile.in (PARTITION_H, LTO_SYMTAB_H, COMMON_TARGET_DEF_H)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 09644d2..90496a2 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2013-09-27 Cong Hou co...@google.com
+
+ * gcc.dg/vect/vect-reduc-pattern-3.c: New test.
+
 2013-09-25 Marek Polacek pola...@redhat.com

  PR sanitizer/58413
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 2871ba1..3c51c3b 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2091,6 +2091,13 @@ vect_is_slp_reduction (loop_vec_info loop_info, gimple phi, gimple first_stmt)
      a3 = ...
      a2 = operation (a3, a1)

+   or
+
+   a3 = ...
+   loop_header:
+     a1 = phi < a0, a2 >
+     a2 = operation (a3, a1)
+
    such that:
    1. operation is commutative and associative and it is safe to
       change the order of the computation (if CHECK_REDUCTION is true)
@@ -2451,6 +2458,7 @@ vect_is_simple_reduction_1 (loop_vec_info loop_info, gimple phi,
   if (def2 && def2 == phi
       && (code == COND_EXPR
           || !def1 || gimple_nop_p (def1)
+          || !flow_bb_inside_loop_p (loop, gimple_bb (def1))
           || (def1 && flow_bb_inside_loop_p (loop, gimple_bb (def1))
               && (is_gimple_assign (def1)
                   || is_gimple_call (def1)
@@ -2469,6 +2477,7 @@ vect_is_simple_reduction_1 (loop_vec_info loop_info, gimple phi,
   if (def1 && def1 == phi
       && (code == COND_EXPR
           || !def2 || gimple_nop_p (def2)
+          || !flow_bb_inside_loop_p (loop, gimple_bb (def2))
           || (def2 && flow_bb_inside_loop_p (loop, gimple_bb (def2))
               && (is_gimple_assign (def2)
                   || is_gimple_call (def2)
diff --git gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c
new file mode 100644
index 000..06a9416
--- /dev/null
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c
@@ -0,0 +1,41 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 10
+#define RES 1024
+
+/* A reduction pattern in which there is no data ref in
+   the loop and one operand is defined outside of the loop. */
+
+__attribute__ ((noinline)) int
+foo (int v)
+{
+  int i;
+  int result = 1;
+
+  ++v;
+  for (i = 0; i < N; i++)
+    result *= v;
+
+  return result;
+}
+
+int
+main (void)
+{
+  int res;
+
+  check_vect ();
+
+  res = foo (1);
+  if (res != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
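For readers less familiar with reduction vectorization, the sketch below shows by hand what vectorizing the loop in the test case above amounts to: the loop-invariant operand is used in every lane, each lane keeps its own partial product, and the partial products are combined after the loop. The lane count of 4 and the function names foo_scalar and foo_lanes are assumptions made for this illustration. This is not the code GCC generates.

```c
#define N 10

/* Scalar form, as in the test case: one multiply chain of length N.  */
static int foo_scalar (int v)
{
  int result = 1;
  ++v;
  for (int i = 0; i < N; i++)
    result *= v;
  return result;
}

/* Hand-written picture of a 4-lane vectorization.  Because V is
   defined before the loop, every lane multiplies by the same value.
   This relies on the operation being commutative and associative.  */
static int foo_lanes (int v)
{
  int lane[4] = { 1, 1, 1, 1 };
  int result = 1;
  int i;

  ++v;
  /* Main loop: four iterations of the original loop per trip.  */
  for (i = 0; i + 4 <= N; i += 4)
    for (int k = 0; k < 4; k++)
      lane[k] *= v;
  /* Scalar epilogue for the leftover iterations.  */
  for (; i < N; i++)
    result *= v;
  /* Combine the partial products.  */
  for (int k = 0; k < 4; k++)
    result *= lane[k];
  return result;
}
```

Both functions return 1024 for an argument of 1, matching RES in the test case.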
Re: [PATCH] Reducing number of alias checks in vectorization.
On Wed, Oct 02, 2013 at 10:50:21AM -0700, Cong Hou wrote:

+  if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
+    return int_cst_value (p11.offset) < int_cst_value (p21.offset);

This is going to ICE whenever the offsets wouldn't fit into a HOST_WIDE_INT. I'd say you just shouldn't put into the vector entries where offset isn't host_integerp, those would never be merged with other checks, or something similar.

Do you mean I should use widest_int_cst_value()? Then I will replace all int_cst_value() here with it. I also changed the type of diff variable into HOST_WIDEST_INT.

Actually, best would be just to use tree_int_cst_compare (p11.offset, p21.offset) that will handle any INTEGER_CSTs, not just those that fit into HWI.

Jakub
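The two concerns raised in this thread (ordering must not depend on pointer values, and the comparison must not overflow or truncate) can be illustrated with a small standalone qsort comparator. The struct seg_entry and its fields are hypothetical stand-ins for the patch's data. Inside GCC the offsets would be trees compared with tree_int_cst_compare, as suggested above, rather than plain integers.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a sortable alias-check entry.  */
struct seg_entry
{
  unsigned base_id;    /* stable identifier of the base address */
  long long offset;    /* constant offset from the base */
};

/* qsort callback: orders by content only, never by pointer value, so
   the result is the same on every run.  The (a > b) - (a < b) form
   yields -1, 0 or 1 without the overflow that a plain subtraction
   could cause.  */
static int compare_entries (const void *pa, const void *pb)
{
  const struct seg_entry *a = (const struct seg_entry *) pa;
  const struct seg_entry *b = (const struct seg_entry *) pb;

  if (a->base_id != b->base_id)
    return (a->base_id > b->base_id) - (a->base_id < b->base_id);
  return (a->offset > b->offset) - (a->offset < b->offset);
}
```

Sorting with `qsort (v, n, sizeof v[0], compare_entries)` places entries with the same base next to each other in offset order, which is what merging adjacent checks needs.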
Re: [PATCH] Reducing number of alias checks in vectorization.
On Wed, Oct 2, 2013 at 10:50 AM, Cong Hou co...@google.com wrote:

On Tue, Oct 1, 2013 at 11:35 PM, Jakub Jelinek ja...@redhat.com wrote:

On Tue, Oct 01, 2013 at 07:12:54PM -0700, Cong Hou wrote:

--- gcc/tree-vect-loop-manip.c (revision 202662)
+++ gcc/tree-vect-loop-manip.c (working copy)

Your mailer ate all the tabs, so the formatting of the whole patch can't be checked.

I'll pay attention to this problem in my later patch submissions.

@@ -19,6 +19,10 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3. If not see
 http://www.gnu.org/licenses/. */
+#include <vector>
+#include <utility>
+#include <algorithm>

Why? GCC has its vec.h vectors, why don't you use those? There is even a qsort method for you in there. And for pairs, you can easily just use structs with two members as structure elements in the vector.

GCC is now being restructured using C++, and STL is one of the most important parts of C++. I am new to the GCC community and more familiar with STL (and I think allowing STL in GCC could attract more new developers to GCC). I agree that using GCC's vec maintains a uniform style, but STL is just so powerful and easy to use... I just did a search in the GCC source tree and found that vector is not used yet. I will change std::vector to GCC's vec for now (and also use qsort), but am still wondering if one day GCC would accept STL.

I talked with Ian and Diego before; they are both OK with having STL in GCC code as long as you just use it for local data structures that do not use the GCC garbage collector. STL can greatly simplify source code (e.g. in http://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=201615 STL helped reduce auto-profile.c from 1721 LOC to 1371 LOC). STL could also attract more C++ developers to the GCC community. Any comments?

Dehao

+struct dr_addr_with_seg_len
+{
+  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
+    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
+
+  data_reference* dr;

Space should be before *, not after it.

+  if (TREE_CODE (p11.offset) != INTEGER_CST
+      || TREE_CODE (p21.offset) != INTEGER_CST)
+    return p11.offset < p21.offset;

If offset isn't INTEGER_CST, you are comparing the pointer values? That is never a good idea, then compilation will depend on how say address space randomization randomizes virtual address space. GCC needs to have reproducible compilations.

In this scenario comparing pointers is safe. The sort is used to put together any two pairs of data refs which can be merged. For example, if we have (a, b), (a, c), (a, b+1), then after sorting them we should have either (a, b), (a, b+1), (a, c) or (a, c), (a, b), (a, b+1). We don't care about the relative order of non-mergeable dr pairs here. So although the sorting result may vary, the final result we get should not change.

+  if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
+    return int_cst_value (p11.offset) < int_cst_value (p21.offset);

This is going to ICE whenever the offsets wouldn't fit into a HOST_WIDE_INT. I'd say you just shouldn't put into the vector entries where offset isn't host_integerp, those would never be merged with other checks, or something similar.

Do you mean I should use widest_int_cst_value()? Then I will replace all int_cst_value() here with it. I also changed the type of the diff variable to HOST_WIDEST_INT.

Thank you very much for your comments!

Cong

Jakub
RFA: GNU make 3.80 compatibility
This patch makes the automatic dependency tracking code compatible with GNU make 3.80, which is documented as the oldest supported version. This was noticed by Eric Botcazou: http://gcc.gnu.org/ml/gcc/2013-09/msg00243.html

Ok?

Tom

2013-10-02 Tom Tromey tro...@redhat.com

* Makefile.in (DRIVER_DEFINES): Use $(if), not $(and).

Index: Makefile.in
===
--- Makefile.in (revision 203124)
+++ Makefile.in (working copy)
@@ -1925,7 +1925,7 @@
   -DTOOLDIR_BASE_PREFIX=\$(libsubdir_to_prefix)$(prefix_to_exec_prefix)\ \
   @TARGET_SYSTEM_ROOT_DEFINE@ \
   $(VALGRIND_DRIVER_DEFINES) \
-  $(and $(SHLIB),$(filter yes,@enable_shared@),-DENABLE_SHARED_LIBGCC) \
+  $(if $(SHLIB),$(if $(filter yes,@enable_shared@),-DENABLE_SHARED_LIBGCC)) \
   -DCONFIGURE_SPECS=\@CONFIGURE_SPECS@\

 CFLAGS-gcc.o += $(DRIVER_DEFINES)
Re: [PATCH] alternative hitrate for builtin_expect
On Wed, Oct 2, 2013 at 9:08 AM, Jan Hubicka hubi...@ucw.cz wrote:

Hi, the current default probability for builtin_expect is 0.9996. This makes the freq of unlikely bbs very low (4), which suppresses the inlining of any calls within those bbs. We used FDO data to measure the branch probability for the branches annotated with builtin_expect. For Google internal benchmarks, the weighted average (with the profile count value as the weight) is 0.9081. The Linux kernel is another program that is heavily annotated with builtin_expect. We measured its weighted average as 0.8717, using Google search as the workload. This patch sets the alternate hitrate probability for builtin_expect to 90%. With the alternate hitrate, we measured performance improvements for Google benchmarks and the Linux kernel. An earlier discussion is https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b This new patch is for the trunk and addresses Honza's comments. Honza: this new probability is off by default. When we backport to the google branch we will make it the default. Let me know if you want to do the same here.

I do not much like the binary parameter for builtin-expect-probability-relaxed. I would just add builtin-expect-probability taking a value in percent and then make predict.c use it. Just use predict_edge instead of predict_edge_def and document the hitrate value as unused in predict.def.

Thanks for the suggestion. This is much cleaner than using a binary parameter. I just want to make sure I understand it correctly regarding the original hitrate: you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use the one specified in the builtin-expect-probability parameter. Should I use 90% as the default? It's hard to fit the current value 0.9996 in percent form.

-Rong

OK with that change.

Honza
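The relation between the hit rate and the frequency of the unlikely block quoted above can be checked with a few lines of C. The sketch assumes a fixed-point probability base of 10000, which is consistent with 0.9996 leaving a frequency of 4. PROB_BASE, unlikely_prob and checked_div are names made up for this example. __builtin_expect itself is the GCC builtin under discussion.

```c
#include <assert.h>

/* Fixed-point base for branch probabilities; 10000 is an assumption
   that matches the "0.9996 gives a frequency of 4" figure above.  */
#define PROB_BASE 10000

/* Probability, in 1/PROB_BASE units, left for the unlikely side of a
   branch when the expected side has the given hit rate.  */
static int unlikely_prob (int hitrate)
{
  assert (hitrate >= 0 && hitrate <= PROB_BASE);
  return PROB_BASE - hitrate;
}

/* Typical annotation: the error path is marked as unexpected, so any
   call placed on it sits in a block with a very low frequency.  */
static int checked_div (int num, int den)
{
  if (__builtin_expect (den == 0, 0))
    return 0;   /* unlikely path */
  return num / den;
}
```

Under this assumption a 90% hit rate leaves 1000 out of 10000 for the unlikely side instead of 4, which is the difference the inlining heuristics would see.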
Re: RFA: GNU make 3.80 compatibility
On Wed, Oct 02, 2013 at 12:41:32PM -0600, Tom Tromey wrote:

This patch makes the automatic dependency tracking code compatible with GNU make 3.80, which is documented as the oldest supported version. This was noticed by Eric Botcazou: http://gcc.gnu.org/ml/gcc/2013-09/msg00243.html

Ok?

Tom

2013-10-02 Tom Tromey tro...@redhat.com

* Makefile.in (DRIVER_DEFINES): Use $(if), not $(and).

Ok, thanks.

--- Makefile.in (revision 203124)
+++ Makefile.in (working copy)
@@ -1925,7 +1925,7 @@
   -DTOOLDIR_BASE_PREFIX=\$(libsubdir_to_prefix)$(prefix_to_exec_prefix)\ \
   @TARGET_SYSTEM_ROOT_DEFINE@ \
   $(VALGRIND_DRIVER_DEFINES) \
-  $(and $(SHLIB),$(filter yes,@enable_shared@),-DENABLE_SHARED_LIBGCC) \
+  $(if $(SHLIB),$(if $(filter yes,@enable_shared@),-DENABLE_SHARED_LIBGCC)) \
   -DCONFIGURE_SPECS=\@CONFIGURE_SPECS@\

 CFLAGS-gcc.o += $(DRIVER_DEFINES)

Jakub
Re: [patch] More tree-flow.h prototypes.
This patch (rev. 203118) seems to break bootstrapping with Graphite:

g++ -c -g -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/bid -I../libdecnumber -I../../gcc/../libbacktrace -DCLOOG_INT_GMP -o ipa-devirt.o -MT ipa-devirt.o -MMD -MP -MF ./.deps/ipa-devirt.TPo ../../gcc/ipa-devirt.c
../../gcc/graphite-sese-to-poly.c: In function 'void rewrite_cross_bb_scalar_dependence(scop_p, tree, tree, gimple)':
../../gcc/graphite-sese-to-poly.c:2348:31: error: 'replace_exp' was not declared in this scope
   replace_exp (use_p, name);
   ^
../../gcc/graphite-scop-detection.c: In function 'void canonicalize_loop_closed_ssa(loop_p)':
../../gcc/graphite-scop-detection.c:1335:26: error: 'replace_exp' was not declared in this scope
   replace_exp (use_p, res);
   ^
make[3]: *** [graphite-sese-to-poly.o] Error 1

Tobias

* tree-flow.h: Remove some prototypes.
* gimple-fold.h: Add prototypes from gimple.h and tree-flow.h.
* tree-ssa-propagate.h: Relocate prototypes from tree-flow.h.
* tree-ssa-copy.c (may_propagate*, propagate_value, replace_exp, propagate_tree_value*): Move from here to...
* tree-ssa-propagate.c (may_propagate*, propagate_value, replace_exp, propagate_tree_value*): Relocate here.
* tree-ssa-propagate.h: Relocate prototypes from tree-flow.h.
* gimple.h: Include gimple-fold.h, move prototypes into gimple-fold.h.
* gimple-fold.c: Remove gimple-fold.h from include list.
* tree-vrp.c: Remove gimple-fold.h from include list.
* tree-ssa-sccvn.c: Remove gimple-fold.h from include list.
* tree-ssa-ccp.c: Remove gimple-fold.h from include list.
* tree-scalar-evolution.c: Add tree-ssa-propagate.h to include list.
* tree-ssa-pre.c: Add tree-ssa-propagate.h to include list.
* sese.c: Add tree-ssa-propagate.h to include list.

Andrew MacLeod wrote:

On 10/02/2013 07:58 AM, Andrew MacLeod wrote:

On 10/02/2013 04:37 AM, Richard Biener wrote:

On Tue, Oct 1, 2013 at 11:01 PM, Andrew MacLeod amacl...@redhat.com wrote:

This patch moves prototypes into gimple-fold.h (which already existed). There were a few in tree-flow.h and a bunch in gimple.h. The routines are used frequently enough that it makes sense to include gimple-fold.h from gimple.h instead of from within each .c file that needs it (presumably why the prototypes were in gimple.h to begin with). I took gimple-fold.h out of whatever .c files it was included in. tree-ssa-copy.h was also created for the prototypes in that file and included from tree-ssa.h.

These should probably be moved elsewhere (tree-ssa-copy.c is supposed to be the copy propagation pass file). But that can be done as followup.

hmm, easy enough to move them *all* to tree-ssa-propagate.[ch] right now and check it in... That seems like the right place for all of them and then we don't even need to create tree-ssa-copy.h...?

Like so.. and directly include tree-ssa-propagate.h in the 3 .c files that need it now. Bootstrapped on x86_64-unknown-linux-gnu.. regressions running. Prefer this?

Andrew
Go patch committed: Use backend interface for numeric constants
This patch from Chris Manghane changes the Go frontend to use the backend interface for numeric constants. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.8 branch. Ian 2013-10-02 Chris Manghane cm...@google.com * go-gcc.cc: Include real.h and realmpfr.h. (Backend::integer_constant_expression): New function. (Backend::float_constant_expression): New function. (Backend::complex_constant_expression): New function. Index: gcc/go/gofrontend/expressions.cc === --- gcc/go/gofrontend/expressions.cc (revision 203039) +++ gcc/go/gofrontend/expressions.cc (working copy) @@ -610,102 +610,57 @@ Expression::get_tree(Translate_context* return this-do_get_tree(context); } -// Return a tree for VAL in TYPE. - -tree -Expression::integer_constant_tree(mpz_t val, tree type) +// Return a backend expression for VAL. +Bexpression* +Expression::backend_numeric_constant_expression(Translate_context* context, +Numeric_constant* val) { - if (type == error_mark_node) -return error_mark_node; - else if (TREE_CODE(type) == INTEGER_TYPE) -return double_int_to_tree(type, - mpz_get_double_int(type, val, true)); - else if (TREE_CODE(type) == REAL_TYPE) + Gogo* gogo = context-gogo(); + Type* type = val-type(); + if (type == NULL) +return gogo-backend()-error_expression(); + + Btype* btype = type-get_backend(gogo); + Bexpression* ret; + if (type-integer_type() != NULL) { - mpfr_t fval; - mpfr_init_set_z(fval, val, GMP_RNDN); - tree ret = Expression::float_constant_tree(fval, type); - mpfr_clear(fval); - return ret; + mpz_t ival; + if (!val-to_int(ival)) +{ + go_assert(saw_errors()); + return gogo-backend()-error_expression(); +} + ret = gogo-backend()-integer_constant_expression(btype, ival); + mpz_clear(ival); } - else if (TREE_CODE(type) == COMPLEX_TYPE) + else if (type-float_type() != NULL) { mpfr_t fval; - mpfr_init_set_z(fval, val, GMP_RNDN); - tree real = Expression::float_constant_tree(fval, TREE_TYPE(type)); + if (!val-to_float(fval)) +{ + 
go_assert(saw_errors()); + return gogo-backend()-error_expression(); +} + ret = gogo-backend()-float_constant_expression(btype, fval); mpfr_clear(fval); - tree imag = build_real_from_int_cst(TREE_TYPE(type), - integer_zero_node); - return build_complex(type, real, imag); } - else -go_unreachable(); -} - -// Return a tree for VAL in TYPE. - -tree -Expression::float_constant_tree(mpfr_t val, tree type) -{ - if (type == error_mark_node) -return error_mark_node; - else if (TREE_CODE(type) == INTEGER_TYPE) -{ - mpz_t ival; - mpz_init(ival); - mpfr_get_z(ival, val, GMP_RNDN); - tree ret = Expression::integer_constant_tree(ival, type); - mpz_clear(ival); - return ret; -} - else if (TREE_CODE(type) == REAL_TYPE) + else if (type-complex_type() != NULL) { - REAL_VALUE_TYPE r1; - real_from_mpfr(r1, val, type, GMP_RNDN); - REAL_VALUE_TYPE r2; - real_convert(r2, TYPE_MODE(type), r1); - return build_real(type, r2); -} - else if (TREE_CODE(type) == COMPLEX_TYPE) -{ - REAL_VALUE_TYPE r1; - real_from_mpfr(r1, val, TREE_TYPE(type), GMP_RNDN); - REAL_VALUE_TYPE r2; - real_convert(r2, TYPE_MODE(TREE_TYPE(type)), r1); - tree imag = build_real_from_int_cst(TREE_TYPE(type), - integer_zero_node); - return build_complex(type, build_real(TREE_TYPE(type), r2), imag); + mpfr_t real; + mpfr_t imag; + if (!val-to_complex(real, imag)) +{ + go_assert(saw_errors()); + return gogo-backend()-error_expression(); +} + ret = gogo-backend()-complex_constant_expression(btype, real, imag); + mpfr_clear(real); + mpfr_clear(imag); } else go_unreachable(); -} - -// Return a tree for REAL/IMAG in TYPE. 
-tree -Expression::complex_constant_tree(mpfr_t real, mpfr_t imag, tree type) -{ - if (type == error_mark_node) -return error_mark_node; - else if (TREE_CODE(type) == INTEGER_TYPE || TREE_CODE(type) == REAL_TYPE) -return Expression::float_constant_tree(real, type); - else if (TREE_CODE(type) == COMPLEX_TYPE) -{ - REAL_VALUE_TYPE r1; - real_from_mpfr(r1, real, TREE_TYPE(type), GMP_RNDN); - REAL_VALUE_TYPE r2; - real_convert(r2, TYPE_MODE(TREE_TYPE(type)), r1); - - REAL_VALUE_TYPE r3; - real_from_mpfr(r3, imag, TREE_TYPE(type), GMP_RNDN); - REAL_VALUE_TYPE r4; - real_convert(r4, TYPE_MODE(TREE_TYPE(type)), r3); - - return build_complex(type, build_real(TREE_TYPE(type), r2), - build_real(TREE_TYPE(type), r4)); -} - else -go_unreachable(); + return ret; } //
Re: [patch] More tree-flow.h prototypes.
On 10/02/2013 03:09 PM, Tobias Burnus wrote: This patch (rev. 203118) seems to break bootstrapping with Graphite: g++ -c -g -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/bid -I../libdecnumber -I../../gcc/../libbacktrace -DCLOOG_INT_GMP -o ipa-devirt.o -MT ipa-devirt.o -MMD -MP -MF ./.deps/ipa-devirt.TPo ../../gcc/ipa-devirt.c ../../gcc/graphite-sese-to-poly.c: In function 'void rewrite_cross_bb_scalar_dependence(scop_p, tree, tree, gimple)': ../../gcc/graphite-sese-to-poly.c:2348:31: error: 'replace_exp' was not declared in this scope replace_exp (use_p, name); ^ ../../gcc/graphite-scop-detection.c: In function 'void canonicalize_loop_closed_ssa(loop_p)': ../../gcc/graphite-scop-detection.c:1335:26: error: 'replace_exp' was not declared in this scope replace_exp (use_p, res); ^ make[3]: *** [graphite-sese-to-poly.o] Error 1 Tobias * tree-flow.h: Remove some prototypes. * gimple-fold.h: Add prototypes from gimple.h and tree-flow.h. * tree-ssa-propagate.h: Relocate prototypes from tree-flow.h. * tree-ssa-copy.c (may_propagate*, propagate_value, replace_exp, propagate_tree_value*): Move from here to... * tree-ssa-propagate.c (may_propagate*, propagate_value, replace_exp, propagate_tree_value*): Relocate here. * tree-ssa-propagate:h: Relocate prototypes from tree-flow.h. * gimple.h: Include gimple-fold.h, move prototypes into gimple-fold.h. * gimple-fold.c: Remove gimple-fold.h from include list. * tree-vrp.c: Remove gimple-fold.h from include list. * tree-ssa-sccvn.c: Remove gimple-fold.h from include list. * tree-ssa-ccp.c: Remove gimple-fold.h from include list. * tree-scalar-evolution.c: Add tree-ssa-propagate.h to include list. 
* tree-ssa-pre.c: Add tree-ssa-propagate.h to include list. * sese.c: Add tree-ssa-propagate.h to include list. I did scratch rebuilds and didn't run into any problem, and it compiles right now for me... hmmm. Oh, I see, the entire file is wrapped by #ifdef HAVE_cloog ... #endif, so if I am compiling without cloog (which I am) then the entire file becomes basically nothing, and thus compiles fine. I'm guessing graphite-sese-to-poly.c has a similar problem since part of the file is wrapped like that... That seems like a somewhat hazardous situation :-P. Anyway, both files should have #include "tree-ssa-propagate.h" then, I guess. I don't have cloog so I can't test it; can you verify that works? Andrew
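The hazard described above can be reproduced in a few lines. This is a minimal sketch, not GCC code: `HAVE_FEATURE_SKETCH` is a hypothetical macro standing in for `HAVE_cloog`, and `undeclared_helper` is a made-up name playing the role of `replace_exp` after its prototype moved to another header. With the macro left undefined, the translation unit compiles cleanly, so the missing declaration goes unnoticed until someone builds with the feature enabled.

```cpp
// Hypothetical stand-in for HAVE_cloog; deliberately left undefined here.
// #define HAVE_FEATURE_SKETCH 1

#ifdef HAVE_FEATURE_SKETCH
// Only compiled when the feature is configured. The call below uses a
// function that was never declared, mirroring the missing #include.
int feature_entry() { return undeclared_helper(); }
#endif

// Always compiled, so a build without the feature still succeeds.
constexpr int always_built() { return 42; }
```

Defining the macro (for example with `-DHAVE_FEATURE_SKETCH`) should turn the hidden problem into a compile error, which is why a change like this is best tested in both configurations.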
Re: [PATCH, build]: Remove -Wno-warning from expmed.c compilation
On Wed, Oct 2, 2013 at 1:55 PM, Uros Bizjak ubiz...@gmail.com wrote: Compiling expmed.c has been warning free for some time now. As discussed briefly on IRC: let's try -Werror and see which target will break. Committed to mainline. Uros.
[v3 patch] fix libstdc++/58594
PR libstdc++/58594 * include/bits/shared_ptr_base.h (_Sp_counted_ptr_inplace::_M_get_deleter()): Cast away cv-quals. * testsuite/20_util/shared_ptr/creation/58594.cc: New. Tested x86_64-linux, committed to trunk commit a58a4bea9475af3e3c44959aeab4b3ac48dc1af0 Author: Jonathan Wakely jwakely@gmail.com Date: Wed Oct 2 19:34:01 2013 +0100 PR libstdc++/58594 * include/bits/shared_ptr_base.h (_Sp_counted_ptr_inplace::_M_get_deleter()): Cast away cv-quals. * testsuite/20_util/shared_ptr/creation/58594.cc: New. diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h index fb19d08..f4bff77 100644 --- a/libstdc++-v3/include/bits/shared_ptr_base.h +++ b/libstdc++-v3/include/bits/shared_ptr_base.h @@ -459,10 +459,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_get_deleter(const std::type_info& __ti) noexcept { #ifdef __GXX_RTTI - return __ti == typeid(_Sp_make_shared_tag) ? _M_ptr() : nullptr; -#else -return nullptr; + if (__ti == typeid(_Sp_make_shared_tag)) + return const_cast<typename remove_cv<_Tp>::type*>(_M_ptr()); #endif + return nullptr; } private: diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/creation/58594.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/creation/58594.cc new file mode 100644 index 000..d1e3a7c --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/creation/58594.cc @@ -0,0 +1,27 @@ +// { dg-options "-std=gnu++11" } +// { dg-do compile } + +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +#include <memory> + +// libstdc++/58594 +void test01() +{ + std::make_shared<const int>(); +}
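To illustrate why the cast is needed, here is a minimal sketch under stated assumptions: `InplaceStorage`, `ptr`, `as_void`, and `check_round_trip` are hypothetical names, and the struct is only a simplified stand-in for the real control block. The point it shows is that a `const int*` does not convert implicitly to `void*`, so a function that hands out the stored object as `void*` has to strip the cv-qualifiers first.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical, simplified stand-in for an in-place control block.
template <typename T>
struct InplaceStorage {
  T value{};

  T* ptr() noexcept { return &value; }

  // `return ptr();` would be ill-formed for T = const int, because
  // `const int*` does not convert implicitly to `void*`.
  void* as_void() noexcept {
    return const_cast<typename std::remove_cv<T>::type*>(ptr());
  }
};

// Checks that the type-erased pointer still refers to the stored object.
inline void check_round_trip() {
  InplaceStorage<const int> s;
  assert(s.as_void() == static_cast<const void*>(s.ptr()));
}
```

The cast only changes the static type of the pointer; whether writing through the result is valid still depends on how the underlying object was created.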
[v3 patch] fix libstdc++/58569
2013-10-02 Jonathan Wakely jwakely@gmail.com Daniel Krugler daniel.krueg...@gmail.com PR libstdc++/58569 * include/std/functional (function::_CheckResult): Move to namespace scope and rename to __check_func_return_type. * testsuite/20_util/function/58569.cc: New. Tested x86_64-linux, committed to trunk and 4.8 branch commit 2a14c82b439cdce90ac4008e03b69e3e734931c3 Author: Jonathan Wakely jwakely@gmail.com Date: Tue Oct 1 11:10:06 2013 +0100 2013-10-02 Jonathan Wakely jwakely@gmail.com Daniel Krugler daniel.krueg...@gmail.com PR libstdc++/58569 * include/std/functional (function::_CheckResult): Move to namespace scope and rename to __check_func_return_type. * testsuite/20_util/function/58569.cc: New. diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional index 73cddfe..eaa4509 100644 --- a/libstdc++-v3/include/std/functional +++ b/libstdc++-v3/include/std/functional @@ -2128,6 +2128,10 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type) } }; + template<typename _From, typename _To> +using __check_func_return_type + = __or_<is_void<_To>, is_convertible<_From, _To>>; + /** * @brief Primary class template for std::function. * @ingroup functors @@ -2145,16 +2149,8 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type) using _Invoke = decltype(__callable_functor(std::declval<_Functor&>()) (std::declval<_ArgTypes>()...) 
); - template<typename _CallRes, typename _Res1> - struct _CheckResult - : is_convertible<_CallRes, _Res1> { }; - - template<typename _CallRes> - struct _CheckResult<_CallRes, void> - : true_type { }; - template<typename _Functor> - using _Callable = _CheckResult<_Invoke<_Functor>, _Res>; + using _Callable = __check_func_return_type<_Invoke<_Functor>, _Res>; template<typename _Cond, typename _Tp> using _Requires = typename enable_if<_Cond::value, _Tp>::type; diff --git a/libstdc++-v3/testsuite/20_util/function/58569.cc b/libstdc++-v3/testsuite/20_util/function/58569.cc new file mode 100644 index 000..f1e67bc --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/function/58569.cc @@ -0,0 +1,29 @@ +// { dg-options "-std=gnu++11" } +// { dg-do compile } +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +// libstdc++/58569 + +#include <functional> + +struct foo { + std::function<foo (int)> x; + std::function<foo ()> y; +}; + +foo a;
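The return-type check that this patch moves to namespace scope can be approximated with standard traits only. The following is a rough sketch, not the libstdc++ implementation: `check_return_type` is a hypothetical name, and unlike the internal `__or_` helper it evaluates both traits eagerly instead of short-circuiting. It accepts a callable whose result type `From` is either discarded (declared return type `To` is `void`) or implicitly convertible to `To`.

```cpp
#include <type_traits>

// Hypothetical analogue of the check: the result type From is acceptable
// for a declared return type To if To is void (the result is discarded)
// or From converts implicitly to To.
template <typename From, typename To>
using check_return_type =
    std::integral_constant<bool, std::is_void<To>::value ||
                                     std::is_convertible<From, To>::value>;
```

For instance, `check_return_type<int, void>::value` and `check_return_type<int, long>::value` are both true, while `check_return_type<void*, int>::value` is false.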
Re: [Patch, Fortran, committed] PR58579 - fix allocation of string temporaries: Avoid overallocation
Tobias Burnus wrote: In gfc_conv_string_tmp, gfortran allocates temporary strings. However, using TYPE_SIZE (type) didn't yield one byte as intended but 64 - which means that gfortran allocated 64 times as much memory as needed. Committed (Rev. ) after building and regtesting on x86-64-gnu-linux. I didn't see a simple way to generate a test case - but the dump of the PR's test case looks fine both for kind=1 and kind=4 strings. It turned out - see PR58593 - that one sometimes doesn't have an array type but a simple single-character type. Fixed by the attached patch. Committed as Rev. 203135 after build+regtesting on x86-64-gnu-linux. Tobias Index: gcc/fortran/ChangeLog === --- gcc/fortran/ChangeLog (Revision 203134) +++ gcc/fortran/ChangeLog (Arbeitskopie) @@ -1,3 +1,9 @@ +2013-10-02 Tobias Burnus bur...@net-b.de + + PR fortran/58593 + * trans-expr.c (gfc_conv_string_tmp): Fix obtaining + the byte size of a single character. + 2013-10-01 Tobias Burnus bur...@net-b.de PR fortran/58579 Index: gcc/fortran/trans-expr.c === --- gcc/fortran/trans-expr.c (Revision 203134) +++ gcc/fortran/trans-expr.c (Arbeitskopie) @@ -2357,8 +2357,9 @@ gfc_conv_string_tmp (gfc_se * se, tree type, tree var = gfc_create_var (type, pstr); gcc_assert (POINTER_TYPE_P (type)); tmp = TREE_TYPE (type); - gcc_assert (TREE_CODE (tmp) == ARRAY_TYPE); - tmp = TYPE_SIZE_UNIT (TREE_TYPE (tmp)); + if (TREE_CODE (tmp) == ARRAY_TYPE) +tmp = TREE_TYPE (tmp); + tmp = TYPE_SIZE_UNIT (tmp); tmp = fold_build2_loc (input_location, MULT_EXPR, size_type_node, fold_convert (size_type_node, len), fold_convert (size_type_node, tmp)); Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (Revision 203134) +++ gcc/testsuite/ChangeLog (Arbeitskopie) @@ -1,3 +1,8 @@ +2013-10-02 Tobias Burnus bur...@net-b.de + + PR fortran/58593 + * gfortran.dg/char_length_19.f90: New. 
+ 2013-10-02 Paolo Carlini paolo.carl...@oracle.com PR c++/58535 Index: gcc/testsuite/gfortran.dg/char_length_19.f90 === --- gcc/testsuite/gfortran.dg/char_length_19.f90 (Revision 0) +++ gcc/testsuite/gfortran.dg/char_length_19.f90 (Arbeitskopie) @@ -0,0 +1,44 @@ +! { dg-do compile } +! +! PR fortran/58579 +! +! Contributed by Joost VandeVondele +! +! Was ICEing before due to the patch for PR 58593 +! + subroutine test +CHARACTER(len=20):: tmpStr +CHARACTER(len=20, kind=4):: tmpStr4 +INTEGER :: output_unit=6 + WRITE (UNIT=output_unit,FMT=(T2,A,T61,A20)) + DFT| Self-interaction correction (SIC),ADJUSTR(TRIM(tmpstr)) + WRITE (UNIT=output_unit,FMT=(T2,A,T61,A20)) + 4_DFT| Self-interaction correction (SIC),ADJUSTR(TRIM(tmpstr4)) + END + +! +! PR fortran/58593 +! Contributed by Albert Bartok +! +! The PR was overallocating memory. I placed it here to check for a +! variant of the test case above, which takes a slightly differnt code +! patch. Thus, its purpose is just to ensure that it won't ICE. +! +program test_char + + implicit none + integer :: i + + read*, i + print*, trim(test(i)) + + contains + + function test(i) + integer, intent(in) :: i + character(len=i) :: test + + test(1:1) = A + endfunction test + +endprogram test_char
Re: [PATCH] Reducing number of alias checks in vectorization.
On Wed, Oct 2, 2013 at 4:24 AM, Richard Biener rguent...@suse.de wrote: On Tue, 1 Oct 2013, Cong Hou wrote: When alias exists between data refs in a loop, to vectorize it GCC does loop versioning and adds runtime alias checks. Basically for each pair of data refs with possible data dependence, there will be two comparisons generated to make sure there is no aliasing between them in each iteration of the vectorized loop. If there are many such data refs pairs, the number of comparisons can be very large, which is a big overhead. However, in some cases it is possible to reduce the number of those comparisons. For example, for the following loop, we can detect that b[0] and b[1] are two consecutive member accesses so that we can combine the alias check between a[0:100]&b[0] and a[0:100]&b[1] into checking a[0:100]&b[0:2]: void foo(int*a, int* b) { for (int i = 0; i < 100; ++i) a[i] = b[0] + b[1]; } Actually, the requirement of consecutive memory accesses is too strict. For the following loop, we can still combine the alias checks between a[0:100]&b[0] and a[0:100]&b[100]: void foo(int*a, int* b) { for (int i = 0; i < 100; ++i) a[i] = b[0] + b[100]; } This is because if b[0] is not in a[0:100] and b[100] is not in a[0:100] then a[0:100] cannot be between b[0] and b[100]. We only need to check a[0:100] and b[0:101] don't overlap. More generally, consider two pairs of data refs (a, b1) and (a, b2). Suppose addr_b1 and addr_b2 are basic addresses of data ref b1 and b2; offset_b1 and offset_b2 (offset_b1 < offset_b2) are offsets of b1 and b2, and segment_length_a, segment_length_b1, and segment_length_b2 are segment length of a, b1, and b2. Then we can combine the two comparisons into one if the following condition is satisfied: offset_b2 - offset_b1 - segment_length_b1 < segment_length_a This patch detects those combination opportunities to reduce the number of alias checks. It is tested on an x86-64 machine. 
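The merge condition quoted above can be sketched independently of GCC's data structures. In this illustration, `Segment`, `can_merge`, and `merge` are hypothetical names rather than anything from the patch, offsets and lengths are plain integers in one consistent unit (array elements in the examples), and both references are assumed to share a base address with `b1.offset <= b2.offset`.

```cpp
#include <cstdint>

// Hypothetical description of one access: offset from a shared base
// address plus the length of the segment touched across the loop.
struct Segment {
  std::int64_t offset;
  std::int64_t length;
};

// Precondition: b1.offset <= b2.offset. True when the gap between the end
// of b1 and the start of b2 is too small to contain all of a's segment,
// so a single check against the merged range is enough.
constexpr bool can_merge(Segment b1, Segment b2, std::int64_t seg_len_a) {
  return b2.offset - b1.offset - b1.length < seg_len_a;
}

// Smallest single segment covering both b1 and b2 (same precondition).
constexpr Segment merge(Segment b1, Segment b2) {
  return Segment{b1.offset,
                 (b1.offset + b1.length > b2.offset + b2.length
                      ? b1.offset + b1.length
                      : b2.offset + b2.length) -
                     b1.offset};
}
```

With the second loop from the message, `b1 = {0, 1}`, `b2 = {100, 1}` and a segment length of 100 for `a`, the gap is 99, which is smaller than 100, so the two checks collapse into one against `b[0:101]`.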
Apart from the other comments you got (to which I agree) the patch seems to do two things, namely also: + /* Extract load and store statements on pointers with zero-stride + accesses. */ + if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo)) +{ which I'd rather see in a separate patch (and done also when the loop doesn't require versioning for alias). yes. Also combining the alias checks in vect_create_cond_for_alias_checks is nice but doesn't properly fix the use of the vect-max-version-for-alias-checks param Yes. The handling of this should be moved to 'vect_prune_runtime_alias_test_list' to avoid premature decisions. which currently inhibits vectorization of the HIMENO benchmark by default (and make us look bad compared to LLVM). Here is a small reproducible: struct A { int *base; int offset; int offset2; int offset3; int offset4; int offset5; int offset6; int offset7; int offset8; }; void foo (struct A * ar1, struct A* ar2) { int i; for (i = 0; i < 1; i++) { ar1->base[i] = 2*ar2->base[i] + ar2->offset + ar2->offset2 + ar2->offset3 + ar2->offset4 + ar2->offset5 + ar2->offset6; /* + ar2->offset7 + ar2->offset8;*/ } } GCC trunk won't vectorize it at O2 due to the limit. There is another problem we should be tracking: GCC no longer vectorizes the loop (with large --param=vect-max-version-for-alias-checks=40) when -fno-strict-aliasing is specified. However with additional runtime alias check, the loop should be vectorizable. David So I believe this merging should be done incrementally when we collect the DDRs we need to test in vect_mark_for_runtime_alias_test. Thanks for working on this, Richard. thanks, Cong Index: gcc/tree-vect-loop-manip.c === --- gcc/tree-vect-loop-manip.c (revision 202662) +++ gcc/tree-vect-loop-manip.c (working copy) @@ -19,6 +19,10 @@ You should have received a copy of the G along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. 
*/ +#include <vector> +#include <utility> +#include <algorithm> + #include "config.h" #include "system.h" #include "coretypes.h" @@ -2248,6 +2252,74 @@ vect_vfa_segment_size (struct data_refer return segment_length; } +namespace +{ + +/* struct dr_addr_with_seg_len + + A struct storing information of a data reference, including the data + ref itself, its basic address, the access offset and the segment length + for aliasing checks. */ + +struct dr_addr_with_seg_len +{ + dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len) +: dr (d), basic_addr (addr), offset (off), seg_len (len) {} + + data_reference* dr; + tree basic_addr; + tree offset; + tree seg_len; +}; + +/* Operator == between two dr_addr_with_seg_len objects. + + This equality operator is used to make sure two data refs + are the same one so that we will consider to combine the
[PATCH, build]: Update x-linux and t-linux-android for automatic dependencies
Hello! 2013-10-02 Uros Bizjak ubiz...@gmail.com * config/x-linux (host-linux.o): Remove header dependencies. Use $(COMPILE) and $(POSTCOMPILE). * config/t-linux-android (linux-android.o): Ditto. Bootstrapped on x86_64-pc-linux-gnu and committed to mainline. Uros. Index: config/t-linux-android === --- config/t-linux-android (revision 203130) +++ config/t-linux-android (working copy) @@ -1,5 +1,4 @@ -# Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2013 -# Free Software Foundation, Inc. +# Copyright (C) 2002-2013 Free Software Foundation, Inc. # # This file is part of GCC. # @@ -17,7 +16,6 @@ # along with GCC; see the file COPYING3. If not see # http://www.gnu.org/licenses/. -linux-android.o: $(srcdir)/config/linux-android.c $(CONFIG_H) $(SYSTEM_H) \ - coretypes.h $(TM_H) $(TM_P_H) - $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ - $(srcdir)/config/linux-android.c +linux-android.o: $(srcdir)/config/linux-android.c + $(COMPILE) $< + $(POSTCOMPILE) Index: config/x-linux === --- config/x-linux (revision 203130) +++ config/x-linux (working copy) @@ -1,4 +1,3 @@ -host-linux.o : $(srcdir)/config/host-linux.c $(CONFIG_H) $(SYSTEM_H) \ - coretypes.h hosthooks.h hosthooks-def.h $(HOOKS_H) - $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ - $(srcdir)/config/host-linux.c +host-linux.o : $(srcdir)/config/host-linux.c + $(COMPILE) $< + $(POSTCOMPILE)
Re: [PATCH] alternative hirate for builtin_expert
Thanks for the suggestion. This is much cleaner than using a binary parameter. Just want to make sure I understand it correctly about the original hitrate: you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use the one specified in the builtin-expect-probability parameter. Yes. Should I use 90% as the default? It's hard to fit the current value 0.9996 in percent form. Yes, 90% seems fine. The original value was set quite arbitrarily and no real performance study was made as far as I know except yours. I think users that are sure they use expect to guard completely cold edges may just use 100% instead of 0.9996, so I would not worry much about the precision. Honza -Rong OK with that change. Honza
Re: [PATCH] Reducing number of alias checks in vectorization.
On Wed, Oct 2, 2013 at 4:24 AM, Richard Biener rguent...@suse.de wrote: On Tue, 1 Oct 2013, Cong Hou wrote: When alias exists between data refs in a loop, to vectorize it GCC does loop versioning and adds runtime alias checks. Basically for each pair of data refs with possible data dependence, there will be two comparisons generated to make sure there is no aliasing between them in each iteration of the vectorized loop. If there are many such data refs pairs, the number of comparisons can be very large, which is a big overhead. However, in some cases it is possible to reduce the number of those comparisons. For example, for the following loop, we can detect that b[0] and b[1] are two consecutive member accesses so that we can combine the alias check between a[0:100]&b[0] and a[0:100]&b[1] into checking a[0:100]&b[0:2]: void foo(int*a, int* b) { for (int i = 0; i < 100; ++i) a[i] = b[0] + b[1]; } Actually, the requirement of consecutive memory accesses is too strict. For the following loop, we can still combine the alias checks between a[0:100]&b[0] and a[0:100]&b[100]: void foo(int*a, int* b) { for (int i = 0; i < 100; ++i) a[i] = b[0] + b[100]; } This is because if b[0] is not in a[0:100] and b[100] is not in a[0:100] then a[0:100] cannot be between b[0] and b[100]. We only need to check a[0:100] and b[0:101] don't overlap. More generally, consider two pairs of data refs (a, b1) and (a, b2). Suppose addr_b1 and addr_b2 are basic addresses of data ref b1 and b2; offset_b1 and offset_b2 (offset_b1 < offset_b2) are offsets of b1 and b2, and segment_length_a, segment_length_b1, and segment_length_b2 are segment length of a, b1, and b2. Then we can combine the two comparisons into one if the following condition is satisfied: offset_b2 - offset_b1 - segment_length_b1 < segment_length_a This patch detects those combination opportunities to reduce the number of alias checks. It is tested on an x86-64 machine. 
Apart from the other comments you got (to which I agree) the patch seems to do two things, namely also: + /* Extract load and store statements on pointers with zero-stride + accesses. */ + if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo)) +{ which I'd rather see in a separate patch (and done also when the loop doesn't require versioning for alias). My mistake.. I am working on those two patches at the same time and pasted that one also here by mistake. I will send another patch about the hoist topic. Also combining the alias checks in vect_create_cond_for_alias_checks is nice but doesn't properly fix the use of the vect-max-version-for-alias-checks param which currently inhibits vectorization of the HIMENO benchmark by default (and make us look bad compared to LLVM). So I believe this merging should be done incrementally when we collect the DDRs we need to test in vect_mark_for_runtime_alias_test. I agree that vect-max-version-for-alias-checks param should count the number of checks after the merge. However, the struct data_dependence_relation could not record the new information produced by the merge. The new information I mentioned contains the new segment length for comparisons. This length is calculated right in vect_create_cond_for_alias_checks() function. Since vect-max-version-for-alias-checks is used during analysis phase, shall we move all those (get segment length for each data ref and merge alias checks) from transformation to analysis phase? If we cannot store the result properly (data_dependence_relation is not enough), shall we do it twice in both phases? I also noticed a possible bug in the function vect_same_range_drs() called by vect_prune_runtime_alias_test_list(). For the following code I get two pairs of data refs after vect_prune_runtime_alias_test_list(), but in vect_create_cond_for_alias_checks() after detecting grouped accesses I got two identical pairs of data refs. The consequence is two identical alias checks are produced. 
void yuv2yuyv_ref (int *d, int *src, int n) { char *dest = (char *)d; int i; for(i=0;i<n/2;i++){ dest[i*4 + 0] = (src[i*2 + 0])>>16; dest[i*4 + 1] = (src[i*2 + 1])>>8; dest[i*4 + 2] = (src[i*2 + 0])>>16; dest[i*4 + 3] = (src[i*2 + 0])>>0; } } I think the solution to this problem is changing GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i)) == GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j)) into STMT_VINFO_DATA_REF (vinfo_for_stmt (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i)))) == STMT_VINFO_DATA_REF (vinfo_for_stmt (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j)))) in function vect_same_range_drs(). What do you think about it? thanks, Cong Thanks for working on this, Richard. thanks, Cong Index: gcc/tree-vect-loop-manip.c === --- gcc/tree-vect-loop-manip.c (revision 202662) +++ gcc/tree-vect-loop-manip.c (working copy) @@ -19,6 +19,10 @@ You should have received a copy of
Re: [PATCH GCC] Tweak gimple-ssa-strength-reduction.c:backtrace_base_for_ref () to cover different cases as seen on AArch64
On Wed, 2013-10-02 at 07:40 -0500, Bill Schmidt wrote: On Tue, 2013-10-01 at 20:21 -0500, Bill Schmidt wrote: On Tue, 2013-10-01 at 23:57 +0100, Yufeng Zhang wrote: On 10/01/13 20:55, Bill Schmidt wrote: On Tue, 2013-10-01 at 11:56 -0500, Bill Schmidt wrote: OK, thanks. The problem that you've encountered is that you are attempting to do something illegal. ;) (Bin's original patch is actually to blame for that, as well as me for not catching it then.) As your new test shows, it is unsafe to do the transformation in backtrace_base_for_ref when widening from an unsigned type, because the unsigned type has wrap semantics by default. (The actual test must be done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or removed by compile option -- see the comments with legal_cast_p and legal_cast_p_1 later in the module.) You cannot in general prove that the transformation is allowable for a specific constant, because you don't know that what you're adding it to won't cause an overflow that's handled incorrectly. I believe the correct fix for the unsigned-overflow case is to fail backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns false, where in_type is the type of the new *PBASE, and out_type is the widening type that you're looking through. So you can't just STRIP_NOPS, you have to check the cast for legitimacy for this transformation. This does not explain why backtrace_base_for_ref does not find all the opportunities on slsr-39.c. I don't immediately see what's preventing that. Note that the transformation is legal in that case because you are widening from a signed int to an unsigned int, which won't cause problems. You guys need to dig deeper into why those opportunities are missed when sizetype is larger than int. Let me know if you need help figuring it out. Sorry, I had to leave before and wanted to get this response back to you in case I didn't get back soon. 
I've looked at this some more, and your general approach should work ok once you get the legal_cast_p check in place where you do the get_unwidened call now. Once you know you have a legal widening, you don't have to worry about the safe_to_multiply_p stuff. I.e., you don't need the last two chunks in the patch to backtrace_base_for_ref, and you don't need the unwidened_p variable. It should all fall out properly by just restricting your unwidening to legal casts. Many thanks for looking into the issue so promptly. I've updated the patch; I have to use legal_cast_p_1 instead as the gimple node is no longer available by then. Does the new patch look sane? Yes, much better. I'm happy with this approach. However, please restore the correct whitespace before the { at -786,7 +795,7. Thanks for fixing this up! Bill (Just a reminder that I can't approve your patch; you need a maintainer for that. But it looks good to me.) Sometime when I get a moment I'm probably going to change this to handle the casting when the candidates are added to the table. I think we should look through the casts and distribute the multiply at that time. But for now what you have here is good. FYI, I looked at this a little more this afternoon, and convinced myself that your approach is the right one. This is already representing everything pertinent in the candidate table. Thanks again for adding these extensions. Bill Thanks, Bill The regtest on aarch64 and bootstrapping on x86-64 are still running. Thanks, Yufeng gcc/ * gimple-ssa-strength-reduction.c (legal_cast_p_1): Forward declaration. (backtrace_base_for_ref): Call get_unwidened with 'base_in' if 'base_in' represent a conversion and legal_cast_p_1 holds; set 'base_in' with the returned value from get_unwidened. gcc/testsuite/ * gcc.dg/tree-ssa/slsr-40.c: New test.
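The reason a widening cast from a wrapping unsigned type cannot simply be looked through can be shown with fixed-width integers. This is an illustrative sketch only, not code from gimple-ssa-strength-reduction.c; the function names are hypothetical. Adding in the narrow unsigned type and then widening is not the same as widening first and then adding, because the narrow addition may wrap.

```cpp
#include <cstdint>

// Add in the narrow unsigned type first, then widen: the 32-bit addition
// wraps modulo 2^32 before the value is extended.
constexpr std::uint64_t add_then_widen(std::uint32_t x, std::uint32_t c) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(x + c));
}

// Widen first, then add: no wrap-around at 32 bits.
constexpr std::uint64_t widen_then_add(std::uint32_t x, std::uint32_t c) {
  return static_cast<std::uint64_t>(x) + c;
}
```

The two functions agree for small inputs but differ once the 32-bit sum overflows (for example `x = 0xFFFFFFFF`, `c = 1`), which is the kind of case a legality check on the cast has to rule out before the constant is moved across the conversion.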