[patch] Fix libstdc++/55043 - issue with nesting unordered_map containing unique_ptr into vector
This fixes a regression caused by supporting the C++11 allocator requirements in std::vector, and the fact that the unordered containers don't have noexcept move constructors. Fixed by specializing is_copy_constructible for the unordered containers so vector doesn't try to copy them when their element type is not CopyInsertable into the container, and instead resorts to a move that might throw. PR libstdc++/55043 * include/std/unordered_map: Include alloc_traits.h * include/std/unordered_set: Likewise. * include/bits/alloc_traits.h: Define __is_copy_insertable. * include/bits/unordered_map.h: Use it. * include/bits/unordered_set.h: Likewise. * include/debug/unordered_map.h: Likewise. * include/debug/unordered_set.h: Likewise. * include/profile/unordered_map.h: Likewise. * include/profile/unordered_set.h: Likewise. * include/bits/hashtable.h: Fix comment typos. * testsuite/23_containers/unordered_map/55043.cc: New. * testsuite/23_containers/unordered_multimap/55043.cc: New. * testsuite/23_containers/unordered_multiset/55043.cc: New. * testsuite/23_containers/unordered_set/55043.cc: New. Tested x86_64-linux, in normal, debug and profile modes. Committed to trunk, to be committed to the 4.7 branch shortly. commit e9c94b31c67f26e52c9927a4eedb39095efcd8be Author: Jonathan Wakely jwakely@gmail.com Date: Wed Jan 16 09:19:11 2013 + PR libstdc++/55043 * include/std/unordered_map: Include alloc_traits.h * include/std/unordered_set: Likewise. * include/bits/alloc_traits.h: Define __is_copy_insertable. * include/bits/unordered_map.h: Use it. * include/bits/unordered_set.h: Likewise. * include/debug/unordered_map.h: Likewise. * include/debug/unordered_set.h: Likewise. * include/profile/unordered_map.h: Likewise. * include/profile/unordered_set.h: Likewise. * include/bits/hashtable.h: Fix comment typos. * testsuite/23_containers/unordered_map/55043.cc: New. * testsuite/23_containers/unordered_multimap/55043.cc: New. * testsuite/23_containers/unordered_multiset/55043.cc: New. * testsuite/23_containers/unordered_set/55043.cc: New. diff --git a/libstdc++-v3/include/bits/alloc_traits.h b/libstdc++-v3/include/bits/alloc_traits.h index 9abadbb..c6259a1 100644 --- a/libstdc++-v3/include/bits/alloc_traits.h +++ b/libstdc++-v3/include/bits/alloc_traits.h @@ -1,6 +1,6 @@ // Allocator traits -*- C++ -*- -// Copyright (C) 2011-2012 Free Software Foundation, Inc. +// Copyright (C) 2011-2013 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // software; you can redistribute it and/or modify it under the @@ -39,6 +39,9 @@ namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION + templatetypename _Tp +class allocator; + templatetypename _Alloc, typename _Tp class __alloctr_rebind_helper { @@ -506,6 +509,41 @@ _GLIBCXX_ALLOC_TR_NESTED_TYPE(propagate_on_container_swap, __do_alloc_on_swap(__one, __two, __pocs()); } + templatetypename _Alloc +class __is_copy_insertable_impl +{ + typedef allocator_traits_Alloc _Traits; + + templatetypename _Up, typename + = decltype(_Traits::construct(std::declval_Alloc(), +std::declval_Up*(), +std::declvalconst _Up())) + static true_type + _M_select(int); + + templatetypename _Up + static false_type + _M_select(...); + +public: + typedef decltype(_M_selecttypename _Alloc::value_type(0)) type; +}; + + templatetypename _Alloc +struct __is_copy_insertable +: __is_copy_insertable_impl_Alloc::type +{ }; + + // std::allocator_Tp just requires CopyConstructible + templatetypename _Tp +struct __is_copy_insertableallocator_Tp +: is_copy_constructible_Tp +{ }; + + templatetypename _Container +using __has_copy_insertable_val + = __is_copy_insertabletypename _Container::allocator_type; + _GLIBCXX_END_NAMESPACE_VERSION } // namespace std diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h index fab6c7c..49cb4db 100644 --- a/libstdc++-v3/include/bits/hashtable.h +++ b/libstdc++-v3/include/bits/hashtable.h @@ -370,7 +370,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Hashtable(_Hashtable); - // Use delegating construtors. + // Use delegating constructors. explicit _Hashtable(size_type __n = 10, const _H1 __hf = _H1(), @@ -914,7 +914,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_element_count(__ht._M_element_count), _M_rehash_policy(__ht._M_rehash_policy) { - // Update, if necessary, bucket pointing to before begin that hasn't move. + // Update,
Re: [PATCH] Fix PR55882
On Wed, 16 Jan 2013, Eric Botcazou wrote: Not necessarily. The following is a 4.7 variant of the patch (on trunk get_object_alignment_1 got one extra output which moved the align return value to by-reference). OK, I obviously didn't try very hard here... Can you give it testing on the branch and a strict-align target? Thanks, Richard. 2013-01-15 Richard Biener rguent...@suse.de PR middle-end/55882 * emit-rtl.c (set_mem_attributes_minus_bitpos): Correctly account for bitpos when computing alignment. * gcc.dg/torture/pr55882.c: New testcase. Bootstrapped/regtested on 4.7 branch for SPARC/Solaris, all clear. Thanks, committed on the branch as well. The situation with the MEM_REF block on trunk isn't all that bad, it can't result in bogus alignment as far as I can see. Thus I'll leave further cleanups to when stage1 opens again. Richard.
Re: [PR libmudflap/53359] don't register symbols not emitted
On Jan 7, 2013, Richard Biener richard.guent...@gmail.com wrote: On Sun, Jan 6, 2013 at 8:47 PM, Alexandre Oliva aol...@redhat.com wrote: On Jan 2, 2013, Richard Biener richard.guent...@gmail.com wrote: On Sun, Dec 30, 2012 at 1:22 AM, Alexandre Oliva aol...@redhat.com wrote: On Dec 21, 2012, Richard Biener richard.guent...@gmail.com wrote: On Fri, Dec 21, 2012 at 6:33 AM, Alexandre Oliva aol...@redhat.com wrote: libmudflap emits a global initializer that registers memory ranges for global data symbols. However, even if IPA decides not to emit a symbol because it's unused, we'd still emit registration sequences for them in some cases, which, in the PR testcase, would result in TOC references to the undefined symbols. Hmm, I think that at this point of the compilation you are looking for TREE_ASM_WRITTEN instead. That doesn't work, several mudflap regressions show up because accesses to global library symbols that are accessed by template methods compiled with mudflap (say cout) are then verified but not registered. We have to register symbols that are not emitted but that referenced. Ehm, how can something be not emitted but still referenced? You mean if it's external? Yeah. if (!TREE_ASM_WRITTEN (obj) || DECL_EXTERNAL (obj)) instead? Then we're back to the original bug. How does the test above tell whether we're actually referencing the object? Only when we are do we want to register it with mudflap. If it's referenced and we don't register it, we get mudflap runtime errors. If it's not referenced but we register it, we get link-time errors if the symbol is one we'd have emitted if it was referenced. Then the bug is that we register something but do not actually tell the middle-end that this is a reference. No, the bug is that we're registering something because at an earlier point (when we emitted checks) there were references to it. The references were all optimized away, we correctly decided (elsewhere) the object was not referenced, and we removed it from the symbol table, but mudflap still wants to register it because it pays no attention to that decision. This is the reason why I believe the patch I proposed initially is the correct fix. Now, can you please tell me why you believe this diagnosis is incorrect, or why the fix for the diagnosed problem is incorrect, to the point of proposing alternate (faulty) approaches? I suppose instead of asking for TREE_ASM_WRITTEN you may want to look at DECL_RTL (which should be set for all referenced decls, whether external or not). I'm pretty sure the optimized-away objects that we do NOT want to register had DECL_RTL set, but I'm not exactly excited about double checking it without at least a hint of an explanation on what seems to be wrong with the proposed patch. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer
Re: [PATCH] Allow new ISL/CLooG versions
On Tue, 15 Jan 2013, Jack Howarth wrote: On Tue, Jan 15, 2013 at 11:05:51AM +0100, Richard Biener wrote: On Tue, 15 Jan 2013, Richard Biener wrote: On Mon, 14 Jan 2013, Jack Howarth wrote: On Mon, Jan 14, 2013 at 08:27:12PM +0100, Dominique Dhumieres wrote: In order to bootstrap r195167 with the new ISL/CLooG versions, I had to apply the following patch: --- ../work/configure 2013-01-14 19:32:00.0 +0100 +++ configure 2013-01-14 19:42:15.0 +0100 @@ -5848,7 +5848,7 @@ else int main () { -if (strncmp (isl_version (), isl-0.10, strlen (isl-0.10)) != 0) +if (strncmp (isl_version (), isl-0.11, strlen (isl-0.11)) != 0) return 1; ; @@ -6033,7 +6033,7 @@ int main () { #if CLOOG_VERSION_MAJOR != 0 \ -|| CLOOG_VERSION_MINOR != 17 \ +|| CLOOG_VERSION_MINOR != 18 \ || CLOOG_VERSION_REVISION 0 choke me #endif (I didn't bother to update the messages: got checking for version 0.10 of ISL... yes checking for version 0.17.0 of CLooG... yes). Dominique Dominique, I believe that hack effectively changes... Index: configure.ac === --- configure.ac(revision 195174) +++ configure.ac(working copy) @@ -1607,7 +1607,7 @@ if test x$with_isl != xno dnl with user input. ISL_INIT_FLAGS dnl The versions of ISL that work for Graphite - ISL_CHECK_VERSION(0,10) + ISL_CHECK_VERSION(0,11) if test ${gcc_cv_isl} = no ; then ISL_CHECK_VERSION(0,11) fi Richard seems to be assuming that the second call to ISL_CHECK_VERSION(0,11) in configure.ac will rerun the isl checks on 0.11.x but I suspect this doesn't take in account the caching of the results from the first call to ISL_CHECK_VERSION(). Certainly from my config.log against isl 0.11.1 and cloog 0.18.0, it appears that the version tests from the ISL_CHECK_VERSION(0,11) call aren't run and the cached result from the first ISL_CHECK_VERSION(0,10) is used instead. True - I missed that. I re-tested allowing both versions only with in-tree. I'm going to fix this. Like with the following. Tested with both in-tree and out-of-tree ISL/CLooG. Ok? Richard, The committed change solves the build issues here. Any chance we can get the newer tarballs from... ftp://ftp.linux.student.kuleuven.be/pub/people/skimo/isl/isl-0.11.1.tar.bz2 and http://www.bastoul.net/cloog/pages/download/cloog-0.18.0.tar.gz added to gcc/infrastructure subdirectory on the gcc/gnu ftp sites? Done. It seems we never remove files from that place ... I'll update the recommended versions stated in install.texi and would eventually remove the isl 0.10 and cloog 0.17.0 versions from the infrastructure. Richard. Jack Thanks, Richard. 2013-01-15 Richard Biener rguent...@suse.de PR other/55973 * configure: Re-generate. config/ * isl.m4 (ISL_INIT_FLAGS): Warn about disabled version check for in-tree build. (ISL_CHECK_VERSION): Do not use AC_CACHE_CHECK. * cloog.m4 (CLOOG_INIT_FLAGS): Disable version check for in-tree build and warn about that. (CLOOG_CHECK_VERSION): Do not use AC_CACHE_CHECK. Index: config/isl.m4 === --- config/isl.m4 (revision 195190) +++ config/isl.m4 (working copy) @@ -66,6 +66,7 @@ AC_DEFUN([ISL_INIT_FLAGS], isllibs='-L$$r/$(HOST_SUBDIR)/isl/'$lt_cv_objdir' ' islinc='-I$$r/$(HOST_SUBDIR)/isl/include -I$$s/isl/include' ENABLE_ISL_CHECK=no +AC_MSG_WARN([using in-tree ISL, disabling version check]) fi ] ) @@ -115,12 +116,12 @@ AC_DEFUN([ISL_CHECK_VERSION], LIBS=${_isl_saved_LIBS} -lisl echo $CFLAGS -AC_CACHE_CHECK([for version $1.$2 of ISL], - [gcc_cv_isl], - [AC_RUN_IFELSE([_ISL_CHECK_CT_PROG($1,$2)], +AC_MSG_CHECKING([for version $1.$2 of ISL]) +AC_RUN_IFELSE([_ISL_CHECK_CT_PROG($1,$2)], [gcc_cv_isl=yes], [gcc_cv_isl=no], - [gcc_cv_isl=yes])]) + [gcc_cv_isl=yes]) +AC_MSG_RESULT([$gcc_cv_isl]) CFLAGS=$_isl_saved_CFLAGS LDFLAGS=$_isl_saved_LDFLAGS Index: config/cloog.m4 === --- config/cloog.m4 (revision 195190) +++ config/cloog.m4 (working copy) @@ -57,12 +57,15 @@ AC_DEFUN([CLOOG_INIT_FLAGS], if test x${with_cloog_lib} != x; then clooglibs=-L$with_cloog_lib fi - dnl If no --with-cloog flag was specified and there is in-tree ClooG - dnl source, set up flags to use that. + dnl If no --with-cloog flag was specified and there is in-tree
Re: [PATCH] Allow new ISL/CLooG versions
On Wed, 16 Jan 2013, Richard Biener wrote: On Tue, 15 Jan 2013, Jack Howarth wrote: On Tue, Jan 15, 2013 at 11:05:51AM +0100, Richard Biener wrote: On Tue, 15 Jan 2013, Richard Biener wrote: On Mon, 14 Jan 2013, Jack Howarth wrote: On Mon, Jan 14, 2013 at 08:27:12PM +0100, Dominique Dhumieres wrote: In order to bootstrap r195167 with the new ISL/CLooG versions, I had to apply the following patch: --- ../work/configure 2013-01-14 19:32:00.0 +0100 +++ configure 2013-01-14 19:42:15.0 +0100 @@ -5848,7 +5848,7 @@ else int main () { -if (strncmp (isl_version (), isl-0.10, strlen (isl-0.10)) != 0) +if (strncmp (isl_version (), isl-0.11, strlen (isl-0.11)) != 0) return 1; ; @@ -6033,7 +6033,7 @@ int main () { #if CLOOG_VERSION_MAJOR != 0 \ -|| CLOOG_VERSION_MINOR != 17 \ +|| CLOOG_VERSION_MINOR != 18 \ || CLOOG_VERSION_REVISION 0 choke me #endif (I didn't bother to update the messages: got checking for version 0.10 of ISL... yes checking for version 0.17.0 of CLooG... yes). Dominique Dominique, I believe that hack effectively changes... Index: configure.ac === --- configure.ac (revision 195174) +++ configure.ac (working copy) @@ -1607,7 +1607,7 @@ if test x$with_isl != xno dnl with user input. ISL_INIT_FLAGS dnl The versions of ISL that work for Graphite - ISL_CHECK_VERSION(0,10) + ISL_CHECK_VERSION(0,11) if test ${gcc_cv_isl} = no ; then ISL_CHECK_VERSION(0,11) fi Richard seems to be assuming that the second call to ISL_CHECK_VERSION(0,11) in configure.ac will rerun the isl checks on 0.11.x but I suspect this doesn't take in account the caching of the results from the first call to ISL_CHECK_VERSION(). Certainly from my config.log against isl 0.11.1 and cloog 0.18.0, it appears that the version tests from the ISL_CHECK_VERSION(0,11) call aren't run and the cached result from the first ISL_CHECK_VERSION(0,10) is used instead. True - I missed that. I re-tested allowing both versions only with in-tree. I'm going to fix this. Like with the following. Tested with both in-tree and out-of-tree ISL/CLooG. Ok? Richard, The committed change solves the build issues here. Any chance we can get the newer tarballs from... ftp://ftp.linux.student.kuleuven.be/pub/people/skimo/isl/isl-0.11.1.tar.bz2 and http://www.bastoul.net/cloog/pages/download/cloog-0.18.0.tar.gz added to gcc/infrastructure subdirectory on the gcc/gnu ftp sites? Done. It seems we never remove files from that place ... I'll update the recommended versions stated in install.texi and would eventually remove the isl 0.10 and cloog 0.17.0 versions from the infrastructure. Committed. 2013-01-16 Richard Biener rguent...@suse.de * doc/install.texi: Update CLooG and ISL requirements to 0.18.0 and 0.11.1. Index: gcc/doc/install.texi === --- gcc/doc/install.texi(revision 195204) +++ gcc/doc/install.texi(working copy) @@ -367,23 +367,24 @@ installed but it is not in your default @option{--with-mpc} configure option should be used. See also @option{--with-mpc-lib} and @option{--with-mpc-include}. -@item ISL Library version 0.10 +@item ISL Library version 0.11.1 Necessary to build GCC with the Graphite loop optimizations. -It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/}. +It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} +as @file{isl-0.11.1.tar.bz2}. The @option{--with-isl} configure option should be used if ISL is not installed in your default library search path. -@item CLooG 0.17.0 +@item CLooG 0.18.0 Necessary to build GCC with the Graphite loop optimizations. It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} as -@file{cloog-0.17.0.tar.gz}. The @option{--with-cloog} configure option should +@file{cloog-0.18.0.tar.gz}. The @option{--with-cloog} configure option should be used if CLooG is not installed in your default library search path. -CLooG needs to be built against ISL 0.10, not its included copy of ISL -which is too old. Use @option{--with-isl=system} to direct CLooG to pick -up an already installed ISL. CLooG needs to be configured to use GMP +CLooG needs to be built against ISL 0.11.1. Use @option{--with-isl=system} +to direct CLooG to pick up an already installed ISL, otherwise it will use +ISL 0.11.1 as bundled with CLooG. CLooG needs to be configured to use GMP
Re: [patch] Fix libstdc++/55043 - issue with nesting unordered_map containing unique_ptr into vector
On 16 January 2013 09:25, Jonathan Wakely wrote: This fixes a regression caused by supporting the C++11 allocator requirements in std::vector, and the fact that the unordered containers don't have noexcept move constructors. Fixed by specializing is_copy_constructible for the unordered containers so vector doesn't try to copy them when their element type is not CopyInsertable into the container, and instead resorts to a move that might throw. PR libstdc++/55043 * include/std/unordered_map: Include alloc_traits.h * include/std/unordered_set: Likewise. * include/bits/alloc_traits.h: Define __is_copy_insertable. * include/bits/unordered_map.h: Use it. * include/bits/unordered_set.h: Likewise. * include/debug/unordered_map.h: Likewise. * include/debug/unordered_set.h: Likewise. * include/profile/unordered_map.h: Likewise. * include/profile/unordered_set.h: Likewise. * include/bits/hashtable.h: Fix comment typos. * testsuite/23_containers/unordered_map/55043.cc: New. * testsuite/23_containers/unordered_multimap/55043.cc: New. * testsuite/23_containers/unordered_multiset/55043.cc: New. * testsuite/23_containers/unordered_set/55043.cc: New. Tested x86_64-linux, in normal, debug and profile modes. Committed to trunk, to be committed to the 4.7 branch shortly. Daniel K has pointed out a problem with this fix, so I'll try to fix it again, properly, this evening.
[PATCH, PR 55264] Do not remove as unreachable any virtual methods before inlining
Hi, PR 55264 is caused by cgraph machinery thinking it knows all calls to a virtual method when that is actually not true. As discussed with Honza, prior to inlining, we should not assume some virtual functions (namely those that are neither DECL_COMDAT nor DECL_EXTERNAL) are not reachable. DECL_EXTERNAL however still affects the cgraph_node-local.local flag and so in order to avoid some LTO failures, I had to adjust IPA-CP to consider such virtual functions non-local so that verification of lattice propagation does not complain. I'm a bit puzzled by the value of the flag in this situation but at least it does not seem to cause any other problems. Below is a trunk patch to do all this. It does not apply to released versions which also suffer from the same problem so I'll prepare a special version(s) for them once this gets approved. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2013-01-15 Martin Jambor mjam...@suse.cz PR tree-optimizations/55264 * ipa-inline-transform.c (can_remove_node_now_p_1): Never return true for virtual methods. * ipa.c (symtab_remove_unreachable_nodes): Never return true for virtual methods before inlining is over. * ipa-cp.c (initialize_node_lattices): Also consider all virtual methods non-local. testsuite/ * g++.dg/ipa/pr55264.C: New test. Index: src/gcc/ipa-inline-transform.c === --- src.orig/gcc/ipa-inline-transform.c +++ src/gcc/ipa-inline-transform.c @@ -92,9 +92,7 @@ can_remove_node_now_p_1 (struct cgraph_n those only after all devirtualizable virtual calls are processed. Lacking may edges in callgraph we just preserve them post inlining. */ - (!DECL_VIRTUAL_P (node-symbol.decl) - || (!DECL_COMDAT (node-symbol.decl) - !DECL_EXTERNAL (node-symbol.decl))) + !DECL_VIRTUAL_P (node-symbol.decl) /* During early inlining some unanalyzed cgraph nodes might be in the callgraph and they might reffer the function in question. */ !cgraph_new_nodes); Index: src/gcc/ipa.c === --- src.orig/gcc/ipa.c +++ src/gcc/ipa.c @@ -241,8 +241,7 @@ symtab_remove_unreachable_nodes (bool be (!cgraph_can_remove_if_no_direct_calls_and_refs_p (node) /* Keep around virtual functions for possible devirtualization. */ || (before_inlining_p -DECL_VIRTUAL_P (node-symbol.decl) -(DECL_COMDAT (node-symbol.decl) || DECL_EXTERNAL (node-symbol.decl) +DECL_VIRTUAL_P (node-symbol.decl { gcc_assert (!node-global.inlined_to); pointer_set_insert (reachable, node); Index: src/gcc/testsuite/g++.dg/ipa/pr55264.C === --- /dev/null +++ src/gcc/testsuite/g++.dg/ipa/pr55264.C @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fno-early-inlining -fno-weak } */ + +struct S +{ + S(); + virtual inline void foo () + { +foo(); + } +}; + +void +B () +{ + S().foo (); +} Index: src/gcc/ipa-cp.c === --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -699,7 +699,8 @@ initialize_node_lattices (struct cgraph_ int i; gcc_checking_assert (cgraph_function_with_gimple_body_p (node)); - if (!node-local.local) + if (!node-local.local + || DECL_VIRTUAL_P (node-symbol.decl)) { /* When cloning is allowed, we can assume that externally visible functions are not called. We will compensate this by cloning
Re: [PR libmudflap/53359] don't register symbols not emitted
No, the bug is that we're registering something because at an earlier point (when we emitted checks) there were references to it. The references were all optimized away, we correctly decided (elsewhere) the object was not referenced, and we removed it from the symbol table, but mudflap still wants to register it because it pays no attention to that decision. This is the reason why I believe the patch I proposed initially is the correct fix. Now, can you please tell me why you believe this diagnosis is incorrect, or why the fix for the diagnosed problem is incorrect, to the point of proposing alternate (faulty) approaches? I suppose instead of asking for TREE_ASM_WRITTEN you may want to look at DECL_RTL (which should be set for all referenced decls, whether external or not). I'm pretty sure the optimized-away objects that we do NOT want to register had DECL_RTL set, but I'm not exactly excited about double checking it without at least a hint of an explanation on what seems to be wrong with the proposed patch. Well, from symtab point of view, the early mudflap pass (that is collecting vars it wants to later reffer to) should - either build the references to them early and produce the constructor referencing them early. Then symtab has full information about what is going to be output and everything works well. This is what I slowly did to many parts of compiler, like C++, EH or OBJC interfaces that used to be output late. - of if there is good reason to delay this till end of the compilation it should 1) force them to be output when seeing early so they are not optimized away 2) check if they are optimized away or not. 2) has the obvious advantage that unused vars are not going to be output just for sake of checking code. For 2) the symtab_get_node test seems resonable to me. It is what dwarf2out does, too. We keep nodes for external vars that are refereed but we remove all optimized out nodes. At this point TREE_ASM_WRITTEN should have pretty much same info with two differences 1) for const_decls in constant pool varpool is still not representing things accurately 2) I would like to eventually get rid of TREE_ASM_WRITTEN in favor of test in symtab (in 4.9 I will unify the var/function flags in symtab to make this easier and I will add noes for labels since these are also needed for acurate partitioning. I would like to also eventually get rid of on-the-side constant pool but as explained in the HP PR it is not trivial, given how constant pool is tied to rtl codegen and machine reorg for some targets). So I am not really keen for new uses of this flag appearing. I believe CONST_DECLs are not a concern here. Otherwise I guess TREE_ASM_WRITTEN is good hack for 4.8 modulo the fact htat some FEs still trick with it. Honza
fix for PR49888 var-tracking compile-time regression
PR49888 introduced clobber_overlapping_mems to detach VALUEs (and variables bound to them) from MEMs as the MEMs are modified. This turned out to be quite expensive, particularly the computation of canonical addresses passed to alias dependency. This patch introduces caching of the canonical addresses to alleviate this expensive operation. We cache mappings that apply to the entire function, from equivalences recorded in the global cselib table, and mappings that apply only to the local basic block. This cut down 2% of a full regstrap cycle, and about 1% of the time it took to build target libraries for C, C++ and Java. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? Cache canonical addresses within VTA From: Alexandre Oliva aol...@redhat.com for gcc/ChangeLog PR debug/54114 PR debug/54402 PR debug/49888 * var-tracking.c (negative_power_of_two_p): New. (global_get_addr_cache, local_get_addr_cache): New. (get_addr_from_global_cache, get_addr_from_local_cache): New. (vt_canonicalize_addr): Rewrite using the above. (local_get_addr_clear_given_value): New. (val_reset): Clear local cached entries. (compute_bb_dataflow): Create and release the local cache. (vt_emit_notes): Likewise. (vt_initialize, vt_finalize): Create and release the global cache, respectively. --- gcc/var-tracking.c | 243 +++- 1 files changed, 200 insertions(+), 43 deletions(-) diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c index 8c76fe0..befde0a 100644 --- a/gcc/var-tracking.c +++ b/gcc/var-tracking.c @@ -1948,6 +1948,14 @@ var_regno_delete (dataflow_set *set, int regno) *reg = NULL; } +/* Return true if I is the negated value of a power of two. */ +static bool +negative_power_of_two_p (HOST_WIDE_INT i) +{ + unsigned HOST_WIDE_INT x = -(unsigned HOST_WIDE_INT)i; + return x == (x -x); +} + /* Strip constant offsets and alignments off of LOC. Return the base expression. */ @@ -1958,12 +1966,123 @@ vt_get_canonicalize_base (rtx loc) || GET_CODE (loc) == AND) GET_CODE (XEXP (loc, 1)) == CONST_INT (GET_CODE (loc) != AND - || INTVAL (XEXP (loc, 1)) 0)) + || negative_power_of_two_p (INTVAL (XEXP (loc, 1) loc = XEXP (loc, 0); return loc; } +/* This caches canonicalized addresses for VALUEs, computed using + information in the global cselib table. */ +static struct pointer_map_t *global_get_addr_cache; + +/* This caches canonicalized addresses for VALUEs, computed using + information from the global cache and information pertaining to a + basic block being analyzed. */ +static struct pointer_map_t *local_get_addr_cache; + +static rtx vt_canonicalize_addr (dataflow_set *, rtx); + +/* Return the canonical address for LOC, that must be a VALUE, using a + cached global equivalence or computing it and storing it in the + global cache. */ + +static rtx +get_addr_from_global_cache (rtx const loc) +{ + rtx x; + void **slot; + + gcc_checking_assert (GET_CODE (loc) == VALUE); + + slot = pointer_map_insert (global_get_addr_cache, loc); + if (*slot) +return (rtx)*slot; + + x = canon_rtx (get_addr (loc)); + + /* Tentative, avoiding infinite recursion. */ + *slot = x; + + if (x != loc) +{ + rtx nx = vt_canonicalize_addr (NULL, x); + if (nx != x) + { + /* The table may have moved during recursion, recompute + SLOT. */ + slot = pointer_map_contains (global_get_addr_cache, loc); + *slot = x = nx; + } +} + + return x; +} + +/* Return the canonical address for LOC, that must be a VALUE, using a + cached local equivalence or computing it and storing it in the + local cache. */ + +static rtx +get_addr_from_local_cache (dataflow_set *set, rtx const loc) +{ + rtx x; + void **slot; + decl_or_value dv; + variable var; + location_chain l; + + gcc_checking_assert (GET_CODE (loc) == VALUE); + + slot = pointer_map_insert (local_get_addr_cache, loc); + if (*slot) +return (rtx)*slot; + + x = get_addr_from_global_cache (loc); + + /* Tentative, avoiding infinite recursion. */ + *slot = x; + + /* Recurse to cache local expansion of X, or if we need to search + for a VALUE in the expansion. */ + if (x != loc) +{ + rtx nx = vt_canonicalize_addr (set, x); + if (nx != x) + { + slot = pointer_map_contains (local_get_addr_cache, loc); + *slot = x = nx; + } + return x; +} + + dv = dv_from_rtx (x); + var = (variable) htab_find_with_hash (shared_hash_htab (set-vars), + dv, dv_htab_hash (dv)); + if (!var) +return x; + + /* Look for an improved equivalent expression. */ + for (l = var-var_part[0].loc_chain; l; l = l-next) +{ + rtx base = vt_get_canonicalize_base (l-loc); + if (GET_CODE (base) == REG + || (GET_CODE (base) == VALUE + canon_value_cmp (base, loc))) + { + rtx nx = vt_canonicalize_addr (set, l-loc); + if (x != nx) + { + slot = pointer_map_contains
Re: [PATCH, PR 55264] Do not remove as unreachable any virtual methods before inlining
On Wed, Jan 16, 2013 at 11:01 AM, Martin Jambor mjam...@suse.cz wrote: Hi, PR 55264 is caused by cgraph machinery thinking it knows all calls to a virtual method when that is actually not true. As discussed with Honza, prior to inlining, we should not assume some virtual functions (namely those that are neither DECL_COMDAT nor DECL_EXTERNAL) are not reachable. Are they not reachable from the VTABLE which is referenced from all calls that eventually reach the virtual method? Thus, isn't the issue that the VTABLE is not correctly handled by ipa-references code? DECL_EXTERNAL however still affects the cgraph_node-local.local flag and so in order to avoid some LTO failures, I had to adjust IPA-CP to consider such virtual functions non-local so that verification of lattice propagation does not complain. I'm a bit puzzled by the value of the flag in this situation but at least it does not seem to cause any other problems. Below is a trunk patch to do all this. It does not apply to released versions which also suffer from the same problem so I'll prepare a special version(s) for them once this gets approved. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2013-01-15 Martin Jambor mjam...@suse.cz PR tree-optimizations/55264 * ipa-inline-transform.c (can_remove_node_now_p_1): Never return true for virtual methods. * ipa.c (symtab_remove_unreachable_nodes): Never return true for virtual methods before inlining is over. * ipa-cp.c (initialize_node_lattices): Also consider all virtual methods non-local. testsuite/ * g++.dg/ipa/pr55264.C: New test. Index: src/gcc/ipa-inline-transform.c === --- src.orig/gcc/ipa-inline-transform.c +++ src/gcc/ipa-inline-transform.c @@ -92,9 +92,7 @@ can_remove_node_now_p_1 (struct cgraph_n those only after all devirtualizable virtual calls are processed. Lacking may edges in callgraph we just preserve them post inlining. */ - (!DECL_VIRTUAL_P (node-symbol.decl) - || (!DECL_COMDAT (node-symbol.decl) - !DECL_EXTERNAL (node-symbol.decl))) + !DECL_VIRTUAL_P (node-symbol.decl) /* During early inlining some unanalyzed cgraph nodes might be in the callgraph and they might reffer the function in question. */ !cgraph_new_nodes); Index: src/gcc/ipa.c === --- src.orig/gcc/ipa.c +++ src/gcc/ipa.c @@ -241,8 +241,7 @@ symtab_remove_unreachable_nodes (bool be (!cgraph_can_remove_if_no_direct_calls_and_refs_p (node) /* Keep around virtual functions for possible devirtualization. */ || (before_inlining_p -DECL_VIRTUAL_P (node-symbol.decl) -(DECL_COMDAT (node-symbol.decl) || DECL_EXTERNAL (node-symbol.decl) +DECL_VIRTUAL_P (node-symbol.decl { gcc_assert (!node-global.inlined_to); pointer_set_insert (reachable, node); Index: src/gcc/testsuite/g++.dg/ipa/pr55264.C === --- /dev/null +++ src/gcc/testsuite/g++.dg/ipa/pr55264.C @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fno-early-inlining -fno-weak } */ + +struct S +{ + S(); + virtual inline void foo () + { +foo(); + } +}; + +void +B () +{ + S().foo (); +} Index: src/gcc/ipa-cp.c === --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -699,7 +699,8 @@ initialize_node_lattices (struct cgraph_ int i; gcc_checking_assert (cgraph_function_with_gimple_body_p (node)); - if (!node-local.local) + if (!node-local.local + || DECL_VIRTUAL_P (node-symbol.decl)) { /* When cloning is allowed, we can assume that externally visible functions are not called. We will compensate this by cloning
Re: fix for PR49888 var-tracking compile-time regression
On Wed, Jan 16, 2013 at 08:28:59AM -0200, Alexandre Oliva wrote: PR49888 introduced clobber_overlapping_mems to detach VALUEs (and variables bound to them) from MEMs as the MEMs are modified. This turned out to be quite expensive, particularly the computation of canonical addresses passed to alias dependency. This patch introduces caching of the canonical addresses to alleviate this expensive operation. We cache mappings that apply to the entire function, from equivalences recorded in the global cselib table, and mappings that apply only to the local basic block. This cut down 2% of a full regstrap cycle, and about 1% of the time it took to build target libraries for C, C++ and Java. Can you safely cache the canon addresses already during vt_initialize (when cselib_* is still processing new insns, cselib VALUEs contain REGs and MEMs that are flushed at the end of processing the current bb in vt_initialize)? In my earlier attempts to cache something in var-tracking, I've always started caching at the end of vt_initialize, when the VALUEs should be (with one small exception of variable_post_merge_new_vals created VALUEs) pretty much stable, everything should be flushed that needs to be flushed, etc. Also, what effects (if any) does the patch have on the .debug_info/.debug_loc size and coverage? Jakub
Re: [PATCH, PR 55264] Do not remove as unreachable any virtual methods before inlining
On Wed, Jan 16, 2013 at 11:01 AM, Martin Jambor mjam...@suse.cz wrote: Hi, PR 55264 is caused by cgraph machinery thinking it knows all calls to a virtual method when that is actually not true. As discussed with Honza, prior to inlining, we should not assume some virtual functions (namely those that are neither DECL_COMDAT nor DECL_EXTERNAL) are not reachable. Are they not reachable from the VTABLE which is referenced from all calls that eventually reach the virtual method? Thus, isn't the issue that the VTABLE is not correctly handled by ipa-references code? No, the problem is that VTABLE can be keyed to other unit and not present in current unit at all and devirtualization is possible only via BINFOs. We already handle COMDAT/EXTERNALs like this. In longer run, I think we should build may edges for virtual calls that will render the corresponding methods reachable. DECL_EXTERNAL however still affects the cgraph_node-local.local flag and so in order to avoid some LTO failures, I had to adjust IPA-CP to consider such virtual functions non-local so that verification of lattice propagation does not complain. I'm a bit puzzled by the value of the flag in this situation but at least it does not seem to cause any other problems. Hmm, perhaps we could simply set local flag to be false for external functions? The local flag is mostly used by codegen at a time the external functions are either inlined or removed, so it is never used in that context. Perhaps could you first change cgraph_non_local_node_p_1 and try to check some code if codegen differs significantly? It should not at all. ipa-cp is the sole user of this flag in IPA passes, so you should know what it does. Index: src/gcc/ipa-inline-transform.c === --- src.orig/gcc/ipa-inline-transform.c +++ src/gcc/ipa-inline-transform.c @@ -92,9 +92,7 @@ can_remove_node_now_p_1 (struct cgraph_n those only after all devirtualizable virtual calls are processed. Lacking may edges in callgraph we just preserve them post inlining. */ - (!DECL_VIRTUAL_P (node-symbol.decl) - || (!DECL_COMDAT (node-symbol.decl) - !DECL_EXTERNAL (node-symbol.decl))) + !DECL_VIRTUAL_P (node-symbol.decl) /* During early inlining some unanalyzed cgraph nodes might be in the callgraph and they might reffer the function in question. */ !cgraph_new_nodes); Index: src/gcc/ipa.c === --- src.orig/gcc/ipa.c +++ src/gcc/ipa.c @@ -241,8 +241,7 @@ symtab_remove_unreachable_nodes (bool be (!cgraph_can_remove_if_no_direct_calls_and_refs_p (node) /* Keep around virtual functions for possible devirtualization. */ || (before_inlining_p -DECL_VIRTUAL_P (node-symbol.decl) -(DECL_COMDAT (node-symbol.decl) || DECL_EXTERNAL (node-symbol.decl) +DECL_VIRTUAL_P (node-symbol.decl { gcc_assert (!node-global.inlined_to); pointer_set_insert (reachable, node); Index: src/gcc/testsuite/g++.dg/ipa/pr55264.C === --- /dev/null +++ src/gcc/testsuite/g++.dg/ipa/pr55264.C @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fno-early-inlining -fno-weak } */ + +struct S +{ + S(); + virtual inline void foo () + { +foo(); + } +}; + +void +B () +{ + S().foo (); +} These changes are OK. Index: src/gcc/ipa-cp.c === --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -699,7 +699,8 @@ initialize_node_lattices (struct cgraph_ int i; gcc_checking_assert (cgraph_function_with_gimple_body_p (node)); - if (!node-local.local) + if (!node-local.local + || DECL_VIRTUAL_P (node-symbol.decl)) { /* When cloning is allowed, we can assume that externally visible functions are not called. We will compensate this by cloning As mentioned above I would preffer the nonlocal_node_p change. If it passes testing and does not regress, consider the patch pre-approved. Honza
Re: [PATCH, PR 55264] Do not remove as unreachable any virtual methods before inlining
Perhaps could you first change cgraph_non_local_node_p_1 and try to check some code if codegen differs significantly? It should not at all. ipa-cp is the sole user of this flag in IPA passes, so you should know what it does. Thinking deeper of ipa-cp and local virtuals, I think this is all slipperly. Local means that all calls to the functions are explicit and known. Obviously if function is virutal and new calls may appear by devirtualization, the local flag is bogus. I guess the external functions are the only that may be local and virtual because somewhere there must be a vtable reference, but to play safe, I would suggest marking all virtuals non-local. Honza
Re: [PR libmudflap/53359] don't register symbols not emitted
On Jan 16, 2013, Jan Hubicka hubi...@ucw.cz wrote: 2) has the obvious advantage that unused vars are not going to be output just for sake of checking code. For 2) the symtab_get_node test seems resonable to me. That's what I first implemented, and I still firmly believe symtab_get_node is the correct test. TREE_ASM_WRITTEN doesn't carry the same information as far as external objects are concerned, so we ended up failing to register them when I tried it, and that caused regressions in the testsuite, with tests that were designed to catch precisely this sort of error. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer
Re: fix for PR49888 var-tracking compile-time regression
On Jan 16, 2013, Jakub Jelinek ja...@redhat.com wrote: On Wed, Jan 16, 2013 at 08:28:59AM -0200, Alexandre Oliva wrote: PR49888 introduced clobber_overlapping_mems to detach VALUEs (and variables bound to them) from MEMs as the MEMs are modified. This turned out to be quite expensive, particularly the computation of canonical addresses passed to alias dependency. This patch introduces caching of the canonical addresses to alleviate this expensive operation. We cache mappings that apply to the entire function, from equivalences recorded in the global cselib table, and mappings that apply only to the local basic block. This cut down 2% of a full regstrap cycle, and about 1% of the time it took to build target libraries for C, C++ and Java. Can you safely cache the canon addresses already during vt_initialize (when cselib_* is still processing new insns, cselib VALUEs contain REGs and MEMs that are flushed at the end of processing the current bb in vt_initialize)? No. AFAICT we only call the address canonicalization function after collecting all MOps, when the cselib table is fully constructed and cleaned up from any local information, retaining only the global equivalences. Also, what effects (if any) does the patch have on the .debug_info/.debug_loc size and coverage? It shouldn't have any, since it's just caching results that would have been recomputed over and over. However, there's a possibility of being “lucky” and recording an equivalence in the cache whose path would later be removed from a dynamic set (say, if an incoming VALUE is reset and re-bound within a block; I'm not sure this ever actually happens). In this case, these retained equivalences might enable alias analysis to figure out that two memory refs do not overlap, and so one can be retained in a dynamic equivalence list when we process a MOp that modifies the other. Or something ;-) It shouldn't really make any difference, just speed things up a bit. Paraphrasing Knuth, “I proved it, but I didn't test it” ;-) -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer
Re: PING: gcc.target/arm: skip 5 tests for flag conflicts
Hi Janis, Back in September I submitted a patch to fix five ARM tests in http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01515.html. You responded in http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00972.html and I answered your questions in a reply. I believe that Richard's main point was that the skips that you were adding to the tests meant that they would not be run for valid command line options. Eg: Index: gcc.target/arm/pr53187.c === --- gcc.target/arm/pr53187.c (revision 191502) +++ gcc.target/arm/pr53187.c (working copy) @@ -1,5 +1,6 @@ /* PR target/53187 */ /* { dg-do compile } */ + /* { dg-skip-if do not override -march { *-*-* } { -march=* } { -march=armv7-a } } */ /* { dg-options -march=armv7-a -mfloat-abi=hard -O2 } */ With your patch applied this test will be skipped when, eg, -march=armv7-r is specified as the multilib selector. Or in fact any -march that is not armv7-a, even though, for many of these, the test will successfully compile. Given that there are more compatible architectures than incompatible ones, I think that it would be better to allow the test by default and only exclude when necessary. Eg: /* PR target/53187 */ /* { dg-do compile } */ + /* { dg-skip-if incompatible -march { *-*-* } { -march=armv6s-m -march=armv6-m } { } } */ /* { dg-options -march=armv7-a -mfloat-abi=hard -O2 } */ Cheers Nick
Re: [testsuite] remove ARM big-endian from some vect effective targets
Hi Janis, 2013-01-15 Janis Johnson jani...@codesourcery.com PR testsuite/54622 * lib/target-supports.exp (check_effective_target_vect_perm_byte, check_effective_target_vect_perm_short, check_effective_target_vect_widen_mult_qi_to_hi_pattern, check_effective_target_vect64): Return 0 for big-endian ARM. (check_effective_target_vect_widen_sum_qi_to_hi): Return 1 for ARM. Approved - please apply. Cheers Nick
Re: [testsuite] add option to LTO flags for c-torture/execute/builtins tests
Hi Janis, 2013-01-15 Janis Johnson jani...@codesourcery.com PR testsuite/55994 * gcc.c-torture/execute/builtins/builtins.exp: Add -Wl,--allow-multiple-definition for eabi and elf targets. Approved - please apply. Cheers Nick
[PATCH] Change DO loop translation, avoid undefined overflow and repeated step sign tests
The following patch fixes a few things I noticed when looking at PR42108 again. First of all the current setup of /* Calculate the loop count. to-from can overflow, so we cast to unsigned. */ doesn't work as advertised (well, it does, for the result and -fwrapv) - it does not consider that signed overflow invokes undefined behavior. Second, originally for the innermost loop in PR42108 we create D.1910 = i; D.1911 = *nnd; D.1912 = *n; k = D.1910; if (D.1912 0) { if (D.1911 D.1910) goto L.6; } else { if (D.1911 D.1910) goto L.6; } countm1.6 = (unsigned int) ((D.1911 - D.1910) * (D.1912 0 ? -1 : 1)) / (unsigned int) ((D.1912 0 ? -1 : 1) * D.1912); which ends up emitting a D.1912 0 conditional block three times which persists until the first VRP / DOM pass. There isn't actually much value in computing countm1 in a single way - apart from folding for the case of a constant (or known sign) step. We can do even better than that by moving the countm1 computation inside the first step sign test: D.1910 = i; D.1911 = *nnd; D.1912 = *n; k = D.1910; if (D.1912 0) { if (D.1911 D.1910) { goto L.6; } else { countm1.6 = ((unsigned int) D.1910 - (unsigned int) D.1911) / -(unsigned int) D.1912; } } else { if (D.1911 D.1910) { goto L.6; } else { countm1.6 = ((unsigned int) D.1911 - (unsigned int) D.1910) / (unsigned int) D.1912; } } which avoids all the initial redundant control flow instructions and avoids multiplications / negations. I believe this is what we produced in 4.4 ... (apart from the signedness issue). Note that due to the way niter analysis works neither form helps us very much in that regard when the step sign is not known. This is a regression likely caused by 2009-12-01 Janne Blomqvist j...@gcc.gnu.org PR fortran/42131 * trans-stmt.c (gfc_trans_do): Sign test using ternary operator. 2009-11-30 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/42131 * trans-stmt.c (gfc_trans_do): Calculate loop count without if statements. which I see I triggered somehow. Only that the end-result isn't much better than the original. I believe we can't really make niter analysis work well with unknown step (or even unknown step sign). With the patch below it still arrives at Analyzing # of iterations of loop 3 exit condition [countm1.6_40, + , 4294967295] != 0 bounds on difference of bases: -4294967295 ... 0 result: # of iterations countm1.6_40, bounded by 4294967295 which is correct and as good as it can get. So in the end this patch fixes initial / unoptimized code generation quality regression and possible issues with undefined signed overflow. Bootstrapped on x86_64-unknown-linux-gnu, regtests running. Ok for trunk? Thanks, Richard. 2013-01-16 Richard Biener rguent...@suse.de fortran/ * trans-stmt.c (gfc_trans_do): Conditionally compute countm1 dependent on sign of step, avoids repeated evaluation of step sign test. Avoid undefined overflow issues by using unsigned arithmetic. Index: gcc/fortran/trans-stmt.c === *** gcc/fortran/trans-stmt.c(revision 195233) --- gcc/fortran/trans-stmt.c(working copy) *** gfc_trans_do (gfc_code * code, tree exit *** 1542,1548 tree cycle_label; tree exit_label; tree tmp; - tree pos_step; stmtblock_t block; stmtblock_t body; location_t loc; --- 1542,1547 *** gfc_trans_do (gfc_code * code, tree exit *** 1587,1594 || tree_int_cst_equal (step, integer_minus_one_node))) return gfc_trans_simple_do (code, block, dovar, from, to, step, exit_cond); - pos_step = fold_build2_loc (loc, GT_EXPR,
Re: [testsuite] fix ARM test gcc.target/arm/neon-vld1_dupQ.c
Hi Janis, 2013-01-14 Janis Johnson jani...@codesourcery.com * gcc.target/arm/neon-vld1_dupQ.c: Use types that match function prototypes. Approved - please apply. Cheers Nick
Re: [ARM] Turning off 64bits ops in Neon and gfortran/modulo-scheduling problem
Ping^2 ? On 8 January 2013 17:24, Christophe Lyon christophe.l...@linaro.org wrote: Ping? http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01197.html Thanks, Christophe On 19 December 2012 16:59, Christophe Lyon christophe.l...@linaro.org wrote: On 17 December 2012 16:12, Richard Earnshaw rearn...@arm.com wrote: On 29/11/12 17:16, Christophe Lyon wrote: On trunk I have noticed a regression in gfortran when using modulo scheduling: sms-1.f90 now fails, but I suspect it's not because of this patch since forcing compilation for armv5t makes the same test fail with and without my patch. Hmm, that's worrying. Could you please makesure this is recorded in bugzilla. If this is a regression, please mark it as such. I was about to do so, but after bisecting it turns out that the problem was introduced by http://gcc.gnu.org/viewcvs?root=gccview=revrev=192969 and is very likely to be another instance of PR55562, which has just been fixed by http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01137.html. Now that this optimization is disabled by default, the onlya8 code is completely redundant and should be purged, along with the insn alternatives that used it. R. Here is a new version of my patch, with the cleanup you requested. 2012-12-18 Christophe Lyon christophe.l...@linaro.org gcc/ * config/arm/arm-protos.h (tune_params): Add prefer_neon_for_64bits field. * config/arm/arm.c (prefer_neon_for_64bits): New variable. (arm_slowmul_tune): Default prefer_neon_for_64bits to false. (arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto. (arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto. (arm_cortex_a5_tune, arm_cortex_a15_tune): Ditto. (arm_cortex_a9_tune, arm_fa726te_tune): Ditto. (arm_option_override): Handle -mneon-for-64bits new option. * config/arm/arm.h (TARGET_PREFER_NEON_64BITS): New macro. (prefer_neon_for_64bits): Declare new variable. * config/arm/arm.md (arch): Rename neon_onlya8 and neon_nota8 to avoid_neon_for_64bits and neon_for_64bits. Remove onlya8 and nota8. (arch_enabled): Handle new arch types. Remove support for onlya8 and nota8. (one_cmpldi2): Use new arch names. * config/arm/arm.opt (mneon-for-64bits): Add option. * config/arm/neon.md (adddi3_neon, subdi3_neon, iordi3_neon) (anddi3_neon, xordi3_neon, ashldi3_neon, shiftdi3_neon): Use neon_for_64bits instead of nota8 and avoid_neon_for_64bits instead of onlya8. * doc/invoke.texi (-mneon-for-64bits): Document. gcc/testsuite/ * gcc.target/arm/neon-for-64bits-1.c: New tests. * gcc.target/arm/neon-for-64bits-2.c: Likewise.
Re: [PR libmudflap/53359] don't register symbols not emitted
On Wed, Jan 16, 2013 at 2:17 PM, Alexandre Oliva aol...@redhat.com wrote: On Jan 16, 2013, Jan Hubicka hubi...@ucw.cz wrote: 2) has the obvious advantage that unused vars are not going to be output just for sake of checking code. For 2) the symtab_get_node test seems resonable to me. That's what I first implemented, and I still firmly believe symtab_get_node is the correct test. TREE_ASM_WRITTEN doesn't carry the same information as far as external objects are concerned, so we ended up failing to register them when I tried it, and that caused regressions in the testsuite, with tests that were designed to catch precisely this sort of error. As it is mudflap we are talking about I don't care very much ... I'm only not sure that using symtab_get won't regress in any way. Richard. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer
Re: [testsuite] replace gcc.target/arm/ftest-*.c with compile-only tests
Hi Janis, The gcc.target/arm/ftest-*.c tests check various macros that are set for ARM targets by setting flags based on preprocessor directives that check those macros. The tests are skipped if the current test platform doesn't support executing programs for the architecture for which flags are being checked. There are several problems with these tests: I like most of this patch. The only part I am unhappy with is the new dg-skip-if statements to skip the test when -march or -mthumb is specified as part of the overall command line. I think that the current dg-require-effective-target statements are enough. Can you provide an example of a case where they do not work ? Cheers Nick
Re: [PATCH][RFC] Fix PR55964
On Tue, 15 Jan 2013, Jakub Jelinek wrote: On Mon, Jan 14, 2013 at 04:58:09PM +0100, Richard Biener wrote: I happen to have a patch for PR55964 around - preparatory work to make loop distribution (and vectorization) handle nested loops. It mostly kills the broken way loop distribution copies loops (which is where we ICE in this PR). So instead of trying to make that old logic slightly less broken I consider to simply apply this work now ... (I've posted this before in December). I'm re-bootstrapping and testing this on x86_64-unknown-linux-gnu. So ... ok at this stage? Ok, if nobody complains in the next 24 hours. For extra safety I throw bootstrap-O3 at it with success. Committed. Richard. 2013-01-14 Richard Biener rguent...@suse.de PR tree-optimization/55964 * tree-flow.h (rename_variables_in_loop): Remove. (rename_variables_in_bb): Likewise. * tree-loop-distribution.c (update_phis_for_loop_copy): Remove. (copy_loop_before): Adjust and delete update-ssa status. * tree-vect-loop-manip.c (rename_variables_in_bb): Make static. (rename_variables_in_bb): Likewise. Properly walk over predecessors. (rename_variables_in_loop): Remove. (slpeel_update_phis_for_duplicate_loop): Likewise. (slpeel_tree_duplicate_loop_to_edge_cfg): Handle nested loops, use available cfg machinery instead of duplicating it. Update PHI nodes and perform poor-mans SSA update here. (slpeel_tree_peel_loop_to_edge): Adjust. * gcc.dg/torture/pr55964.c: New testcase.
[Patch, Fortran] PR 55983: [4.7/4.8 Regression] ICE in find_typebound_proc_uop, at fortran/class.c:2711
Hi all, here is a close-to-obvious patch for an ICE-on-invalid regression. It removes as assert, which is reasonable for valid code but can fail under error conditions (as the PR shows), and replaces it with an equivalent IF clause. Regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.7? Cheers, Janus 2013-01-16 Janus Weil ja...@gcc.gnu.org PR fortran/55983 * class.c (find_typebound_proc_uop): Check for f2k_derived instead of asserting it. 2013-01-16 Janus Weil ja...@gcc.gnu.org PR fortran/55983 * gfortran.dg/class_55.f90: New. pr55983.diff Description: Binary data class_55.f90 Description: Binary data
Re: fix for PR49888 var-tracking compile-time regression
On Wed, Jan 16, 2013 at 11:25:46AM -0200, Alexandre Oliva wrote: Can you safely cache the canon addresses already during vt_initialize (when cselib_* is still processing new insns, cselib VALUEs contain REGs and MEMs that are flushed at the end of processing the current bb in vt_initialize)? No. AFAICT we only call the address canonicalization function after collecting all MOps, when the cselib table is fully constructed and cleaned up from any local information, retaining only the global equivalences. Weird, I thought I've seen significant time spent in get_addr etc. already during vt_initialize, e.g. when looking at PR54402, but I might be wrong. --- var-tracking.c.xx 2013-01-11 09:03:01.0 +0100 +++ var-tracking.c 2013-01-16 15:00:39.012478547 +0100 @@ -2172,11 +2172,14 @@ drop_overlapping_mem_locs (void **slot, /* Remove from SET all VALUE bindings to MEMs that overlap with LOC. */ +static bool vt_initialize_p; + static void clobber_overlapping_mems (dataflow_set *set, rtx loc) { struct overlapping_mems coms; +gcc_assert (!vt_initialize_p); coms.set = set; coms.loc = canon_rtx (loc); coms.addr = vt_canonicalize_addr (set, XEXP (loc, 0)); @@ -9604,6 +9607,8 @@ vt_initialize (void) VTI (bb)-permp = NULL; } +vt_initialize_p = true; + if (MAY_HAVE_DEBUG_INSNS) { cselib_init (CSELIB_RECORD_MEMORY | CSELIB_PRESERVE_CONSTANTS); @@ -9861,6 +9866,7 @@ vt_initialize (void) } } +vt_initialize_p = false; hard_frame_pointer_adjustment = -1; VTI (ENTRY_BLOCK_PTR)-flooded = true; cfa_base_rtx = NULL_RTX; doesn't ICE on the few testcases I've tried. If it only runs after vt_initialize, my complain is wrong of course. Also, what effects (if any) does the patch have on the .debug_info/.debug_loc size and coverage? It shouldn't have any, since it's just caching results that would have been recomputed over and over. However, there's a possibility of being “lucky” and recording an equivalence in the cache whose path would later be removed from a dynamic set (say, if an incoming VALUE is reset and re-bound within a block; I'm not sure this ever actually happens). In this case, these retained equivalences might enable alias analysis to figure out that two memory refs do not overlap, and so one can be retained in a dynamic equivalence list when we process a MOp that modifies the other. Or something ;-) It shouldn't really make any difference, just speed things up a bit. Paraphrasing Knuth, “I proved it, but I didn't test it” ;-) Let me do a bootstrap/regtest pair (first one almost finished) to see. Jakub
Re: [PATCH] Change DO loop translation, avoid undefined overflow and repeated step sign tests
Richard Biener wrote: if (D.1912 0) { if (D.1911 D.1910) { goto L.6; } else { countm1.6 = ((unsigned int) D.1910 - (unsigned int) D.1911) / -(unsigned int) D.1912; } } else { if (D.1911 D.1910) { goto L.6; } else { countm1.6 = ((unsigned int) D.1911 - (unsigned int) D.1910) / (unsigned int) D.1912; } } That look better. Bootstrapped on x86_64-unknown-linux-gnu, regtests running. Ok for trunk? 2013-01-16 Richard Biener rguent...@suse.de fortran/ I assume you mean 42108. (PR42131 caused the 'regression' and is a clone of 42108; PRs 52865/53957 are related.) The patch is OK. Thanks! Tobias
[PATCH] Remove dead code
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2013-01-16 Richard Biener rguent...@suse.de * tree-inline.c (tree_function_versioning): Remove set but never used variable. Index: gcc/tree-inline.c === --- gcc/tree-inline.c (revision 195232) +++ gcc/tree-inline.c (working copy) @@ -5190,7 +5190,6 @@ tree_function_versioning (tree old_decl, replace_info = (*tree_map)[i]; if (replace_info-replace_p) { - tree op = replace_info-new_tree; if (!replace_info-old_tree) { int i = replace_info-parm_num; @@ -5199,13 +5198,6 @@ tree_function_versioning (tree old_decl, i --; replace_info-old_tree = parm; } - - - STRIP_NOPS (op); - - if (TREE_CODE (op) == VIEW_CONVERT_EXPR) - op = TREE_OPERAND (op, 0); - gcc_assert (TREE_CODE (replace_info-old_tree) == PARM_DECL); init = setup_one_parameter (id, replace_info-old_tree, replace_info-new_tree, id.src_fn,
[PATCH] Reorder step != +-1 do code to allow empty latch (PR fortran/52865)
Hi! As discussed in the PR, this patch performs the decrement of countm1 before the condition, so that the loop can have empty latch block. The testcase from the PR then can be vectorized. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-01-16 Jakub Jelinek ja...@redhat.com PR fortran/52865 * trans-stmt.c (gfc_trans_do): Put countm1-- before conditional and use value of countm1 before the decrement in the condition. --- gcc/fortran/trans-stmt.c.jj 2013-01-11 09:02:50.0 +0100 +++ gcc/fortran/trans-stmt.c2013-01-16 12:47:44.191683738 +0100 @@ -1518,8 +1518,9 @@ gfc_trans_simple_do (gfc_code * code, st body; cycle_label: dovar += step - if (countm1 ==0) goto exit_label; + countm1t = countm1; countm1--; + if (countm1t == 0) goto exit_label; } exit_label: @@ -1739,19 +1740,23 @@ gfc_trans_do (gfc_code * code, tree exit if (gfc_option.rtcheck GFC_RTCHECK_DO) gfc_add_modify_loc (loc, body, saved_dovar, dovar); - /* End with the loop condition. Loop until countm1 == 0. */ - cond = fold_build2_loc (loc, EQ_EXPR, boolean_type_node, countm1, - build_int_cst (utype, 0)); - tmp = fold_build1_loc (loc, GOTO_EXPR, void_type_node, exit_label); - tmp = fold_build3_loc (loc, COND_EXPR, void_type_node, -cond, tmp, build_empty_stmt (loc)); - gfc_add_expr_to_block (body, tmp); + /* Initialize countm1t. */ + tree countm1t = gfc_create_var (utype, countm1t); + gfc_add_modify_loc (loc, body, countm1t, countm1); /* Decrement the loop count. */ tmp = fold_build2_loc (loc, MINUS_EXPR, utype, countm1, build_int_cst (utype, 1)); gfc_add_modify_loc (loc, body, countm1, tmp); + /* End with the loop condition. Loop until countm1t == 0. */ + cond = fold_build2_loc (loc, EQ_EXPR, boolean_type_node, countm1t, + build_int_cst (utype, 0)); + tmp = fold_build1_loc (loc, GOTO_EXPR, void_type_node, exit_label); + tmp = fold_build3_loc (loc, COND_EXPR, void_type_node, +cond, tmp, build_empty_stmt (loc)); + gfc_add_expr_to_block (body, tmp); + /* End of loop body. */ tmp = gfc_finish_block (body); Jakub
[patch] libmudflap: check for NULL path in dlopen wrapper
From the bug report: If mudflap is used to instrument a program using dlopen, and the program (assuming it is compiled with -rdynamic) loads itself by passing NULL for the path to dlopen, the program will crash unconditionally; that is, regardless of the options passed to mudflap, so long as instrumentation is enabled. This is because (at least with GNU/Linux) it is valid to pass a NULL pointer as the path argument to dlopen, and the instrumentation code unconditionally uses strlen on that pointer, without checking first if it is NULL. Ok for the trunk? Matthias PR mudflap/24619 * mf-hooks2.c (dlopen wrapper): Check for NULL path. Index: b/src/libmudflap/mf-hooks2.c === --- a/libmudflap/mf-hooks2.c +++ b/libmudflap/mf-hooks2.c @@ -1677,8 +1677,10 @@ size_t n; TRACE (%s\n, __PRETTY_FUNCTION__); n = strlen (path); - MF_VALIDATE_EXTENT (path, CLAMPADD(n, 1), __MF_CHECK_READ, dlopen path); - p = dlopen (path, flags); + if (NULL != path) { +MF_VALIDATE_EXTENT (path, CLAMPADD(n, 1), __MF_CHECK_READ, dlopen path); +p = dlopen (path, flags); + } if (NULL != p) { #ifdef MF_REGISTER_dlopen __mf_register (p, 0, MF_REGISTER_dlopen, dlopen result);
Re: [patch] libmudflap: check for NULL path in dlopen wrapper
On Wed, Jan 16, 2013 at 04:32:25PM +0100, Matthias Klose wrote: PR mudflap/24619 * mf-hooks2.c (dlopen wrapper): Check for NULL path. Index: b/src/libmudflap/mf-hooks2.c === --- a/libmudflap/mf-hooks2.c +++ b/libmudflap/mf-hooks2.c @@ -1677,8 +1677,10 @@ size_t n; TRACE (%s\n, __PRETTY_FUNCTION__); n = strlen (path); - MF_VALIDATE_EXTENT (path, CLAMPADD(n, 1), __MF_CHECK_READ, dlopen path); - p = dlopen (path, flags); + if (NULL != path) { +MF_VALIDATE_EXTENT (path, CLAMPADD(n, 1), __MF_CHECK_READ, dlopen path); +p = dlopen (path, flags); + } That can't be the right fix, given you still do strlen (path) unconditionally. Thus the compiler can assume path is non-NULL. Jakub
[PATCH] gcc/go/gospec.c: fix static linking of Go programs (issue 7130047)
LGTM https://codereview.appspot.com/7130047/diff/1/gcc/go/gospec.c File gcc/go/gospec.c (right): https://codereview.appspot.com/7130047/diff/1/gcc/go/gospec.c#newcode230 gcc/go/gospec.c:230: num_args = argc + need_math + shared_libgcc + (library 0) * 5 + 5; I wonder if we should change + 5 to + 10. https://codereview.appspot.com/7130047/diff/1/gcc/go/gospec.c#newcode388 gcc/go/gospec.c:388: the linker */ Missing period at end of sentence. Also, two spaces after period after non-weak. Comment should wrap around line 75 or so in GCC code. https://codereview.appspot.com/7130047/diff/1/gcc/go/gospec.c#newcode391 gcc/go/gospec.c:391: new_decoded_options[j++]); Put j++ on a separate line. https://codereview.appspot.com/7130047/
[PATCH] gcc/go/gospec.c: fix static linking of Go programs (issue 7130047)
LGTM https://codereview.appspot.com/7130047/diff/1/gcc/go/gospec.c File gcc/go/gospec.c (right): https://codereview.appspot.com/7130047/diff/1/gcc/go/gospec.c#newcode230 gcc/go/gospec.c:230: num_args = argc + need_math + shared_libgcc + (library 0) * 5 + 5; I wonder if we should change + 5 to + 10. https://codereview.appspot.com/7130047/diff/1/gcc/go/gospec.c#newcode388 gcc/go/gospec.c:388: the linker */ Missing period at end of sentence. Also, two spaces after period after non-weak. Comment should wrap around line 75 or so in GCC code. https://codereview.appspot.com/7130047/diff/1/gcc/go/gospec.c#newcode391 gcc/go/gospec.c:391: new_decoded_options[j++]); Put j++ on a separate line. https://codereview.appspot.com/7130047/
Re: [PATCH] Reorder step != +-1 do code to allow empty latch (PR fortran/52865)
Jakub Jelinek wrote: As discussed in the PR, this patch performs the decrement of countm1 before the condition, so that the loop can have empty latch block. The testcase from the PR then can be vectorized. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK from the FE side. [I leave it to you and Richard whether you prefer this version or the Bool version; quoting from this PR: Not sure if it is better this way, or with doing assignment of the condition result into a bool and using it later (as done in the patch for the other PR). (Other PR = PR 53957.)] Thanks for the patch! Tobias 2013-01-16 Jakub Jelinek ja...@redhat.com PR fortran/52865 * trans-stmt.c (gfc_trans_do): Put countm1-- before conditional and use value of countm1 before the decrement in the condition.
Re: [PATCH] Reorder step != +-1 do code to allow empty latch (PR fortran/52865)
On Wed, 16 Jan 2013, Tobias Burnus wrote: Jakub Jelinek wrote: As discussed in the PR, this patch performs the decrement of countm1 before the condition, so that the loop can have empty latch block. The testcase from the PR then can be vectorized. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK from the FE side. [I leave it to you and Richard whether you prefer this version or the Bool version; quoting from this PR: Not sure if it is better this way, or with doing assignment of the condition result into a bool and using it later (as done in the patch for the other PR). (Other PR = PR 53957.)] Jakubs version is better I suppose. Richard.
[PATCH] Fix PR3713
This fixes PR3713 by properly propagating -has_constants in SCCVN. With that we are able to simplify (unsigned) Bar 1 properly. Only copyprop later turns the call into a direct one though, so I'm testing the important fact - that Bar is inlined and eliminated by IPA inlining. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Unless this is somehow a regression (which I doubt) this has to wait for stage1 (even though it's pretty safe and at most exposes existing bugs in SCCVN). Richard. 2013-01-16 Richard Biener rguent...@suse.de PR tree-optimization/3713 * tree-ssa-sccvn.c (visit_copy): Simplify. Always propagate has_constants and expr. (stmt_has_constants): Properly valueize SSA names when deciding whether the stmt has constants. * g++.dg/ipa/devirt-11.C: New testcase. Index: gcc/tree-ssa-sccvn.c === *** gcc/tree-ssa-sccvn.c(revision 195240) --- gcc/tree-ssa-sccvn.c(working copy) *** static tree valueize_expr (tree expr); *** 2653,2670 static bool visit_copy (tree lhs, tree rhs) { - /* Follow chains of copies to their destination. */ - while (TREE_CODE (rhs) == SSA_NAME - SSA_VAL (rhs) != rhs) - rhs = SSA_VAL (rhs); - /* The copy may have a more interesting constant filled expression (we don't, since we know our RHS is just an SSA name). */ ! if (TREE_CODE (rhs) == SSA_NAME) ! { ! VN_INFO (lhs)-has_constants = VN_INFO (rhs)-has_constants; ! VN_INFO (lhs)-expr = VN_INFO (rhs)-expr; ! } return set_ssa_val_to (lhs, rhs); } --- 2653,2665 static bool visit_copy (tree lhs, tree rhs) { /* The copy may have a more interesting constant filled expression (we don't, since we know our RHS is just an SSA name). */ ! VN_INFO (lhs)-has_constants = VN_INFO (rhs)-has_constants; ! VN_INFO (lhs)-expr = VN_INFO (rhs)-expr; ! ! /* And finally valueize. */ ! rhs = SSA_VAL (rhs); return set_ssa_val_to (lhs, rhs); } *** expr_has_constants (tree expr) *** 3063,3087 static bool stmt_has_constants (gimple stmt) { if (gimple_code (stmt) != GIMPLE_ASSIGN) return false; switch (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))) { ! case GIMPLE_UNARY_RHS: ! return is_gimple_min_invariant (gimple_assign_rhs1 (stmt)); case GIMPLE_BINARY_RHS: ! return (is_gimple_min_invariant (gimple_assign_rhs1 (stmt)) ! || is_gimple_min_invariant (gimple_assign_rhs2 (stmt))); ! case GIMPLE_TERNARY_RHS: ! return (is_gimple_min_invariant (gimple_assign_rhs1 (stmt)) ! || is_gimple_min_invariant (gimple_assign_rhs2 (stmt)) ! || is_gimple_min_invariant (gimple_assign_rhs3 (stmt))); case GIMPLE_SINGLE_RHS: /* Constants inside reference ops are rarely interesting, but it can take a lot of looking to find them. */ ! return is_gimple_min_invariant (gimple_assign_rhs1 (stmt)); default: gcc_unreachable (); } --- 3058,3095 static bool stmt_has_constants (gimple stmt) { + tree tem; + if (gimple_code (stmt) != GIMPLE_ASSIGN) return false; switch (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))) { ! case GIMPLE_TERNARY_RHS: ! tem = gimple_assign_rhs3 (stmt); ! if (TREE_CODE (tem) == SSA_NAME) ! tem = SSA_VAL (tem); ! if (is_gimple_min_invariant (tem)) ! return true; ! /* Fallthru. */ case GIMPLE_BINARY_RHS: ! tem = gimple_assign_rhs2 (stmt); ! if (TREE_CODE (tem) == SSA_NAME) ! tem = SSA_VAL (tem); ! if (is_gimple_min_invariant (tem)) ! return true; ! /* Fallthru. */ ! case GIMPLE_SINGLE_RHS: /* Constants inside reference ops are rarely interesting, but it can take a lot of looking to find them. */ ! case GIMPLE_UNARY_RHS: ! tem = gimple_assign_rhs1 (stmt); ! if (TREE_CODE (tem) == SSA_NAME) ! tem = SSA_VAL (tem); ! return is_gimple_min_invariant (tem); ! default: gcc_unreachable (); } Index: gcc/testsuite/g++.dg/ipa/devirt-11.C === *** gcc/testsuite/g++.dg/ipa/devirt-11.C(revision 0) --- gcc/testsuite/g++.dg/ipa/devirt-11.C(working copy) *** *** 0 --- 1,22 + // { dg-do compile } + // { dg-options -std=c++11 -O -fdump-ipa-inline } + + class Foo + { + public: + void Bar() const + { + __builtin_puts (Howdy!); + } + }; + + int main() + { + Foo x; + auto y = Foo::Bar; + (x.*y)(); + return 0; + } + + // { dg-final { scan-ipa-dump Inlined 1 calls, eliminated 1 functions inline } } + // { dg-final { cleanup-ipa-dump inline } }
Re: [PATCH] gcc/go/gospec.c: fix static linking of Go programs (issue 7130047)
Reviewers: iant, Message: please take another look. Description: 2013-01-16 Shenghou Ma minux...@gmail.com * gospec.c: pass -u pthread_create to linker when static linking. Please review this at https://codereview.appspot.com/7130047/ Affected files: M gcc/go/gospec.c Index: gcc/go/gospec.c === --- gcc/go/gospec.c (revision 195240) +++ gcc/go/gospec.c (working copy) @@ -227,7 +227,7 @@ #endif /* Make sure to have room for the trailing NULL argument. */ - num_args = argc + need_math + shared_libgcc + (library 0) * 5 + 5; + num_args = argc + need_math + shared_libgcc + (library 0) * 5 + 10; new_decoded_options = XNEWVEC (struct cl_decoded_option, num_args); i = 0; @@ -381,6 +381,20 @@ generate_option (OPT_shared_libgcc, NULL, 1, CL_DRIVER, new_decoded_options[j++]); +#ifdef TARGET_CAN_SPLIT_STACK + /* libgcc wraps pthread_create to support split stack, however, due to + relative ordering of -lpthread and -lgcc, we can't just mark + __real_pthread_create in libgcc as non-weak. But we need to link in + pthread_create from pthread if we are statically linking, so we work- + around by passing -u pthread_create to to the linker. */ + if (static_link) +{ + generate_option (OPT_Wl_, -u,pthread_create, 1, CL_DRIVER, + new_decoded_options[j]); + j++; +} +#endif + *in_decoded_options_count = j; *in_decoded_options = new_decoded_options; *in_added_libraries = added_libraries;
Fix loop-iv.c ICE
Hi, this patch fixes ICE seen in PR51083 on PPC. Here the flags ^= 0x8000 expression is translated as PLUS. This makes us to consider flags to be IV and work out that the loop do not really iterate. It is a missed optimization that we do not work out this on all targets and do not unloop the loop at tree level. I opened PR for this. This patch fixes the ICE that we get confused on concluding that max number of iterations is 0. Bootstrapped/regtested x86_64-linux (where the code path do not really trigger obviously) and tested that it is fixing the testcase on PPC. OK? Honza PR tree-optimizatoin/51083 * gcc.c-torture/compile/pr51083.c: New testcase. * loop-iv.c (iv_number_of_iterations): Consider zero iteration case. Index: testsuite/gcc.c-torture/compile/pr51083.c === --- testsuite/gcc.c-torture/compile/pr51083.c (revision 0) +++ testsuite/gcc.c-torture/compile/pr51083.c (revision 0) @@ -0,0 +1,18 @@ +extern int debug_threads; +extern void sigsuspend (void); +void my_waitpid (int flags, int wnohang) +{ + while (1) +{ + if (flags 0x8000) +{ + if (wnohang) +break; + if (debug_threads) +__builtin_puts (blocking\n); + sigsuspend (); +} + flags ^= 0x8000; +} +} + Index: loop-iv.c === --- loop-iv.c (revision 195144) +++ loop-iv.c (working copy) @@ -2819,7 +2819,8 @@ iv_number_of_iterations (struct loop *lo else { max = determine_max_iter (loop, desc, old_niter); - gcc_assert (max); + if (!max) + goto zero_iter_simplify; if (!desc-infinite !desc-assumptions) record_niter_bound (loop, double_int::from_uhwi (max),
Re: [PATCH] Fix shrink-wrapping with vDRAP (PR target/55940)
On Tue, Jan 15, 2013 at 2:38 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! As the following testcase shows, even when stack_realign_drap we might need to prevent crtl-drap_reg accesses in the bbs considered for shrink-wrapping, even if reload decides stack realignment isn't needed, the vDRAP (in the testcase %edi) can be used by the first bbs and initialized only later on in the prologue. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-01-15 Jakub Jelinek ja...@redhat.com PR target/55940 * function.c (thread_prologue_and_epilogue_insns): Always add crtl-drap_reg to set_up_by_prologue.set, even if stack_realign_drap is false. * gcc.dg/pr55940.c: New test. --- gcc/function.c.jj 2013-01-11 09:02:55.0 +0100 +++ gcc/function.c 2013-01-15 19:23:20.648826011 +0100 @@ -6029,7 +6029,7 @@ thread_prologue_and_epilogue_insns (void if (pic_offset_table_rtx) add_to_hard_reg_set (set_up_by_prologue.set, Pmode, PIC_OFFSET_TABLE_REGNUM); - if (stack_realign_drap crtl-drap_reg) + if (crtl-drap_reg) add_to_hard_reg_set (set_up_by_prologue.set, GET_MODE (crtl-drap_reg), REGNO (crtl-drap_reg)); Does this cause http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56006 -- H.J.
Re: [PATCH] Fix shrink-wrapping with vDRAP (PR target/55940)
On Wed, Jan 16, 2013 at 08:28:54AM -0800, H.J. Lu wrote: --- gcc/function.c.jj 2013-01-11 09:02:55.0 +0100 +++ gcc/function.c 2013-01-15 19:23:20.648826011 +0100 @@ -6029,7 +6029,7 @@ thread_prologue_and_epilogue_insns (void if (pic_offset_table_rtx) add_to_hard_reg_set (set_up_by_prologue.set, Pmode, PIC_OFFSET_TABLE_REGNUM); - if (stack_realign_drap crtl-drap_reg) + if (crtl-drap_reg) add_to_hard_reg_set (set_up_by_prologue.set, GET_MODE (crtl-drap_reg), REGNO (crtl-drap_reg)); Does this cause http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56006 No, it is caused by http://gcc.gnu.org/viewcvs?root=gccview=revrev=195227 See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55547#c11 Jakub
Re: [PATCH] gcc/go/gospec.c: fix static linking of Go programs (issue 7130047)
Approved and applied. Thanks! https://codereview.appspot.com/7130047/
Re: fix for PR49888 var-tracking compile-time regression
On Wed, Jan 16, 2013 at 11:25:46AM -0200, Alexandre Oliva wrote: Also, what effects (if any) does the patch have on the .debug_info/.debug_loc size and coverage? It shouldn't have any, since it's just caching results that would have been recomputed over and over. However, there's a possibility of being “lucky” and recording an equivalence in the cache whose path would later be removed from a dynamic set (say, if an incoming VALUE is reset and re-bound within a block; I'm not sure this ever actually happens). In this case, these retained equivalences might enable alias analysis to figure out that two memory refs do not overlap, and so one can be retained in a dynamic equivalence list when we process a MOp that modifies the other. Or something ;-) It shouldn't really make any difference, just speed things up a bit. Paraphrasing Knuth, “I proved it, but I didn't test it” ;-) Ok, seems it is almost no change, but if I do between --enable-languages=all,obj-c++,ada,lto,go --enable-checking=yes,rtl x86_64-linux and i686-linux (the latter without ,ada) trees (once without your var-tracking.c patch, once with) readelf -WS comparison of all the gcc/*.o gcc/*/*.o files from stage3, I get differences beyond var-tracking.o (which is of course expected): for i686-linux tree-ssa-pre.o is different, and for x86_64-linux go/export.o is different. All other objects have the same readelf -WS output (not comparing .debug_* section data, as that could be slightly different as I haven't done the build with exactly the same object directory (different dirname, but same length thereof)). If I strip those two object files, then they are identical between the two trees. There are some differences in .debug_loc in both cases. Jakub
[PATCH, i386] Fix PR55981, atomic store is split in two smaller stores
Hello! Using plain movdi pattern is not guaranteed to be atomic for all operands. For x86_64, when storing DImode immediate outside SImode range, the compiler splits the move into two separate SImode moves, violating atomic assumptions. Attached patch generates atomic store for all supported input arguments. A related questions about volatile stores: - Does language standard guarantee atomic store in this case [wikipedia says No. [1]]? - Can a store to a volatile DImode location be implemented as two consecutive SImode stores to adjacent location (this breaks stores to MMIO 64bit registers)? 2012-01-16 Uros Bizjak ubiz...@gmail.com PR target/55981 * config/i386/sync.md (atomic_storemode): Always generate SWImode store through atomic_storemode_1. (atomic_storemode_1): Macroize insn using SWI mode iterator. testsuite/ChangeLog: 2012-01-16 Uros Bizjak ubiz...@gmail.com PR target/55981 * gcc.target/pr55981.c: New test. Tested on x86_64-pc-linux-gnu. I will wait a couple of days for possible comments. [1] http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B Uros. Index: config/i386/sync.md === --- config/i386/sync.md (revision 195240) +++ config/i386/sync.md (working copy) @@ -225,11 +225,8 @@ } /* Otherwise use a store. */ - if (INTVAL (operands[2]) IX86_HLE_RELEASE) - emit_insn (gen_atomic_storemode_1 (operands[0], operands[1], -operands[2])); - else - emit_move_insn (operands[0], operands[1]); + emit_insn (gen_atomic_storemode_1 (operands[0], operands[1], + operands[2])); } /* ... followed by an MFENCE, if required. */ if (model == MEMMODEL_SEQ_CST) @@ -238,10 +235,10 @@ }) (define_insn atomic_storemode_1 - [(set (match_operand:ATOMIC 0 memory_operand =m) - (unspec:ATOMIC [(match_operand:ATOMIC 1 nonmemory_operand ri) - (match_operand:SI 2 const_int_operand)] - UNSPEC_MOVA))] + [(set (match_operand:SWI 0 memory_operand =m) + (unspec:SWI [(match_operand:SWI 1 nonmemory_operand ri) +(match_operand:SWI 2 const_int_operand)] + UNSPEC_MOVA))] %K2mov{imodesuffix}\t{%1, %0|%0, %1}) Index: testsuite/gcc.target/i386/pr55981.c === --- testsuite/gcc.target/i386/pr55981.c (revision 0) +++ testsuite/gcc.target/i386/pr55981.c (working copy) @@ -0,0 +1,54 @@ +/* { dg-do compile { target { ! { ia32 } } } } */ +/* { dg-options -O2 } */ + +volatile int a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p; + +volatile long long y; + +void +test () +{ + int a_ = a; + int b_ = b; + int c_ = c; + int d_ = d; + int e_ = e; + int f_ = f; + int g_ = g; + int h_ = h; + int i_ = i; + int j_ = j; + int k_ = k; + int l_ = l; + int m_ = m; + int n_ = n; + int o_ = o; + int p_ = p; + + int z; + + for (z = 0; z 1000; z++) +{ + __atomic_store_n (y, 0x10002ll, __ATOMIC_SEQ_CST); + __atomic_store_n (y, 0x30004ll, __ATOMIC_SEQ_CST); +} + + a = a_; + b = b_; + c = c_; + d = d_; + e = e_; + f = f_; + g = g_; + h = h_; + i = i_; + j = j_; + k = k_; + l = l_; + m = m_; + n = n_; + o = o_; + p = p_; +} + +/* { dg-final { scan-assembler-times movabs 2 } } */
Re: [PATCH, i386] Fix PR55981, atomic store is split in two smaller stores
On 01/16/2013 09:26 AM, Uros Bizjak wrote: 2012-01-16 Uros Bizjakubiz...@gmail.com PR target/55981 * config/i386/sync.md (atomic_storemode): Always generate SWImode store through atomic_storemode_1. (atomic_storemode_1): Macroize insn using SWI mode iterator. testsuite/ChangeLog: 2012-01-16 Uros Bizjakubiz...@gmail.com PR target/55981 * gcc.target/pr55981.c: New test. This looks good to me. r~
Re: [testsuite] replace gcc.target/arm/ftest-*.c with compile-only tests
On 01/16/2013 05:53 AM, Nick Clifton wrote: Hi Janis, The gcc.target/arm/ftest-*.c tests check various macros that are set for ARM targets by setting flags based on preprocessor directives that check those macros. The tests are skipped if the current test platform doesn't support executing programs for the architecture for which flags are being checked. There are several problems with these tests: I like most of this patch. The only part I am unhappy with is the new dg-skip-if statements to skip the test when -march or -mthumb is specified as part of the overall command line. I think that the current dg-require-effective-target statements are enough. Can you provide an example of a case where they do not work ? Cheers Nick The dg-require-effective-target arm_arch_v4_multilib and friends don't do what they're meant to do and are the reason for rewriting the tests. The dg-require-effective-target arm_nothumb isn't necessary because thumb support is there by default it will be overridden by -marm. If thumb support is turned on by multilib options then it can't be overridden, so it's necessary to look for the flag. I've modified the tests a bit in this new version of the patch to skip the test if the multilib includes -march with a value other than the one being tested, rather than for _any_ use of -march. That's necessary because the multilib options come at the end of the command line and override the options specified by the test. A test looking for macros set for -march=armv4 is going to fail if the test is compiled with -march=armv5. The tests are meant to check that the macros are defined for either -mthumb or -marm and add those options explicitly. Values for -march that can be used with either have tests for both. Some -march values can be used only for -marm or -mthumb (or their defaults) and would fail to compile if the multilib flags use the wrong one of those. A multilib with no flags will run all of these tests. A multilib that uses -march=armv8-a -mthumb will run only the test for those options and skip all of the others. I don't know why some of the tests required arm_eabi, but I can't see any reason for it being necessary for this version of the tests so I've dropped it. OK? Janis 2013-01-16 Janis Johnson jani...@codesourcery.com * gcc.target/arm/ftest-support.h: Replace for compile-only tests. * gcc.target/arm/ftest-support-arm.h: Delete. * gcc.target/arm/ftest-support-thumb.h: Delete. * gcc.target/arm/ftest-armv4-arm.c: Replace with compile-only test. * gcc.target/arm/ftest-armv4t-arm.c: Likewise. * gcc.target/arm/ftest-armv4t-thumb.c: Likewise. * gcc.target/arm/ftest-armv5t-arm.c: Likewise. * gcc.target/arm/ftest-armv5t-thumb.c: Likewise. * gcc.target/arm/ftest-armv5te-arm.c: Likewise. * gcc.target/arm/ftest-armv5te-thumb.c: Likewise. * gcc.target/arm/ftest-armv6-arm.c: Likewise. * gcc.target/arm/ftest-armv6-thumb.c: Likewise. * gcc.target/arm/ftest-armv6k-arm.c: Likewise. * gcc.target/arm/ftest-armv6k-thumb.c: Likewise. * gcc.target/arm/ftest-armv6m-thumb.c: Likewise. * gcc.target/arm/ftest-armv6t2-arm.c: Likewise. * gcc.target/arm/ftest-armv6t2-thumb.c: Likewise. * gcc.target/arm/ftest-armv6z-arm.c: Likewise. * gcc.target/arm/ftest-armv6z-thumb.c: Likewise. * gcc.target/arm/ftest-armv7a-arm.c: Likewise. * gcc.target/arm/ftest-armv7a-thumb.c: Likewise. * gcc.target/arm/ftest-armv7em-thumb.c: Likewise. * gcc.target/arm/ftest-armv7m-thumb.c: Likewise. * gcc.target/arm/ftest-armv7r-arm.c: Likewise. * gcc.target/arm/ftest-armv7r-thumb.c: Likewise. * gcc.target/arm/ftest-armv8a-arm.c: Likewise. * gcc.target/arm/ftest-armv8a-thumb.c: Likewise. Index: gcc.target/arm/ftest-support.h === --- gcc.target/arm/ftest-support.h (revision 195216) +++ gcc.target/arm/ftest-support.h (working copy) @@ -1,84 +1,156 @@ -#if 0 -#define INTERNAL_DEBUG 1 +/* For each of several ARM architecture features, check that relevant + macros are defined or not, and that they have the expected values. */ + +#ifdef NEED_ARM_ARCH +# ifdef __ARM_ARCH +# if __ARM_ARCH != VALUE_ARM_ARCH +# error __ARM_ARCH has unexpected value +# endif +# else +# error __ARM_ARCH is not defined but should be +# endif +#else +# ifdef __ARM_ARCH +# error __ARM_ARCH is defined but should not be +# endif #endif -#ifdef INTERNAL_DEBUG -#include stdio.h +#ifdef NEED_ARM_ARCH_ISA_ARM +# ifdef __ARM_ARCH_ISA_ARM +# if __ARM_ARCH_ISA_ARM != VALUE_ARM_ARCH_ISA_ARM +# error __ARM_ARCH_ISA_ARM has unexpected value +# endif +# else +# error __ARM_ARCH_ISA_ARM is not defined but should be +# endif +#else +# ifdef __ARM_ARCH_ISA_ARM +# error __ARM_ARCH_ISA_ARM is defined but should not be +# endif #endif
Re: [google integration, gcc-4_7] Rename __google_stl_debug_string_dangling - __google_stl_debug_dangling_string
On Wed, Jan 16, 2013 at 12:14 PM, Paul Pluzhnikov ppluzhni...@google.com wrote: [Resending to correct patches address.] Diego, This harmonizes with __google_stl_debug_dangling_vector Ok for google/integration and google/gcc-4_7 branches? OK. Diego.
Re: [Patch, Fortran] PR 55983: [4.7/4.8 Regression] ICE in find_typebound_proc_uop, at fortran/class.c:2711
Dear Janus, As you say, this is close to being obvious - OK for trunk and for 4.7. Thanks for the patch. Cheers Paul On 16 January 2013 15:08, Janus Weil ja...@gcc.gnu.org wrote: Hi all, here is a close-to-obvious patch for an ICE-on-invalid regression. It removes as assert, which is reasonable for valid code but can fail under error conditions (as the PR shows), and replaces it with an equivalent IF clause. Regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.7? Cheers, Janus 2013-01-16 Janus Weil ja...@gcc.gnu.org PR fortran/55983 * class.c (find_typebound_proc_uop): Check for f2k_derived instead of asserting it. 2013-01-16 Janus Weil ja...@gcc.gnu.org PR fortran/55983 * gfortran.dg/class_55.f90: New. -- The knack of flying is learning how to throw yourself at the ground and miss. --Hitchhikers Guide to the Galaxy
patch to fix PR56005
The following patch fixes PR56005. The details are on http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56005 2013-01-16 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/56005 * sched-deps.c (sched_analyze_2): Check deps-readonly for adding pending reads for prefetch. The patch was successfully bootstrapped and tested on x86-64. Committed as rev 195247. Index: sched-deps.c === --- sched-deps.c(revision 195244) +++ sched-deps.c(working copy) @@ -2719,8 +2719,9 @@ sched_analyze_2 (struct deps_desc *deps, to generate accurate dependencies for prefetch insns as prefetch has only the start address but it is better to have something than nothing. */ - add_insn_mem_dependence (deps, true, insn, - gen_rtx_MEM (Pmode, XEXP (PATTERN (insn), 0))); + if (!deps-readonly) + add_insn_mem_dependence (deps, true, insn, +gen_rtx_MEM (Pmode, XEXP (PATTERN (insn), 0))); break; case UNSPEC_VOLATILE:
Re: [RFC, middlend] Fix for PR54218
On Mon, Jan 14, 2013 at 12:50 PM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Jan 11, 2013 at 6:37 PM, George Thomas georgethomas@gmail.com wrote: On Fri, Jan 11, 2013 at 9:53 PM, Andrew Pinski pins...@gmail.com wrote: On Fri, Jan 11, 2013 at 8:17 AM, George Thomas georgethomas@gmail.com wrote: Hi, I am sending a patch which solves the debugging issue (PR 54218). The fix is to allocate stack space only once for parameters in expand pass. The patch is attached. Could someone suggest if its right ? I have just a formatting issue: + if (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL) + { +if (!bitmap_bit_p (SA.partition_has_default_def, i)) I think it would have been better if you had done instead: if (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL !bitmap_bit_p (SA.partition_has_default_def, i)) I have attached the updated patch with the changes suggested. Also adding a dejagnu test case to reproduce the bug. So there are no other white space changes. Also missing a changelog entry too. I am adding the change logs below. 2013-01-11 George Thomas george.tho...@atmel.com Senthil Kumar Selvaraj senthil_kumar.selva...@atmel.com PR middle-end/54218 * gcc/cfgexpand.c (expand_used_vars ) :Added a step to not allocate stack space if its a parameter * gcc.dg/pr54218.c : New test Hoping that the changes are fine for trunk. Please state how you tested the patch (bootstrap and regtest on which target?) I initially tested my patch only on the avr target and ran the regressions on avr. When I tried building the default compiler, the build is failing in default optimisation -g -O2. build/genmddeps ../../gcc-trunk-new/gcc/config/i386/i386.md is throwing a segmentaion fault. I am trying to debug on why this could be happening. The build is passing when BOOT_CXXFLAGS is made -g3 -O0. The succesfully built compiler does not have the bug in it. Also tested functions with parameters and vectors as input. I am not sure how to debug if the issue is happening while bootstraping the compiler itself. Thanks, George
Re: [PATCH, PR 55264] Do not remove as unreachable any virtual methods before inlining
On Wed, Jan 16, 2013 at 01:44:20PM +0100, Jan Hubicka wrote: Perhaps could you first change cgraph_non_local_node_p_1 and try to check some code if codegen differs significantly? It should not at all. ipa-cp is the sole user of this flag in IPA passes, so you should know what it does. Thinking deeper of ipa-cp and local virtuals, I think this is all slipperly. Local means that all calls to the functions are explicit and known. Obviously if function is virutal and new calls may appear by devirtualization, the local flag is bogus. I guess the external functions are the only that may be local and virtual because somewhere there must be a vtable reference, but to play safe, I would suggest marking all virtuals non-local. Right, as discussed on IRC, the patch below therfore modifies cgraph_only_called_directly_or_aliased_p to return false for virtual functions (which translates into cleared local flag) and the cloning machinery to clear that flag. Bootstrapped and tested on x86_64-linux without any problems. OK for trunk? Thanks, Martin 2013-01-16 Martin Jambor mjam...@suse.cz PR tree-optimizations/55264 * ipa-inline-transform.c (can_remove_node_now_p_1): Never return true for virtual methods. * ipa.c (symtab_remove_unreachable_nodes): Never return true for virtual methods before inlining is over. * cgraph.h (cgraph_only_called_directly_or_aliased_p): Return false for virtual functions. * cgraphclones.c (cgraph_create_virtual_clone): Mark clones as non-virtual. testsuite/ * g++.dg/ipa/pr55264.C: New test. Index: src/gcc/ipa-inline-transform.c === --- src.orig/gcc/ipa-inline-transform.c +++ src/gcc/ipa-inline-transform.c @@ -92,9 +92,7 @@ can_remove_node_now_p_1 (struct cgraph_n those only after all devirtualizable virtual calls are processed. Lacking may edges in callgraph we just preserve them post inlining. */ - (!DECL_VIRTUAL_P (node-symbol.decl) - || (!DECL_COMDAT (node-symbol.decl) - !DECL_EXTERNAL (node-symbol.decl))) + !DECL_VIRTUAL_P (node-symbol.decl) /* During early inlining some unanalyzed cgraph nodes might be in the callgraph and they might reffer the function in question. */ !cgraph_new_nodes); Index: src/gcc/ipa.c === --- src.orig/gcc/ipa.c +++ src/gcc/ipa.c @@ -241,8 +241,7 @@ symtab_remove_unreachable_nodes (bool be (!cgraph_can_remove_if_no_direct_calls_and_refs_p (node) /* Keep around virtual functions for possible devirtualization. */ || (before_inlining_p -DECL_VIRTUAL_P (node-symbol.decl) -(DECL_COMDAT (node-symbol.decl) || DECL_EXTERNAL (node-symbol.decl) +DECL_VIRTUAL_P (node-symbol.decl { gcc_assert (!node-global.inlined_to); pointer_set_insert (reachable, node); Index: src/gcc/testsuite/g++.dg/ipa/pr55264.C === --- /dev/null +++ src/gcc/testsuite/g++.dg/ipa/pr55264.C @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fno-early-inlining -fno-weak } */ + +struct S +{ + S(); + virtual inline void foo () + { +foo(); + } +}; + +void +B () +{ + S().foo (); +} Index: src/gcc/cgraph.h === --- src.orig/gcc/cgraph.h +++ src/gcc/cgraph.h @@ -1164,6 +1164,7 @@ cgraph_only_called_directly_or_aliased_p gcc_assert (!node-global.inlined_to); return (!node-symbol.force_output !node-symbol.address_taken !node-symbol.used_from_other_partition + !DECL_VIRTUAL_P (node-symbol.decl) !DECL_STATIC_CONSTRUCTOR (node-symbol.decl) !DECL_STATIC_DESTRUCTOR (node-symbol.decl) !node-symbol.externally_visible); Index: src/gcc/cgraphclones.c === --- src.orig/gcc/cgraphclones.c +++ src/gcc/cgraphclones.c @@ -319,6 +319,7 @@ cgraph_create_virtual_clone (struct cgra TREE_PUBLIC (new_node-symbol.decl) = 0; DECL_COMDAT (new_node-symbol.decl) = 0; DECL_WEAK (new_node-symbol.decl) = 0; + DECL_VIRTUAL_P (new_node-symbol.decl) = 0; DECL_STATIC_CONSTRUCTOR (new_node-symbol.decl) = 0; DECL_STATIC_DESTRUCTOR (new_node-symbol.decl) = 0; new_node-clone.tree_map = tree_map;
Re: [PR55547] fix alias regression on alpha on misaligned symbols (was: Re: do you have time to review this alpha P1 patch?)
On Wed, Jan 16, 2013 at 8:33 AM, Uros Bizjak ubiz...@gmail.com wrote: On 01/15/2013 08:24 AM, Aldy Hernandez wrote: Ok, it's really an alias.c bug, but it is Alpha, and aoliva has already provided an unreviewed patch... http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55547 The patch in #C4 is ok. Thanks, I'm checking it in (first patch below), but reviewing the logic that uses negative sizes, I found a number of places that should use the absolute value, and others in which being conservative about negative sizes is unnecessary (e.g., when dealing with CONST_INT addresses). That was implemented and regstrapped on x86_64-linux-gnu. Uros, would you give the second patch a spin on alpha to make sure it doesn't regress? Ok to install it? Thanks, I started a bootstrap/regtest run. If everything goes as expected, the results will be available in ~10h from now... The results looks good [1], no regressions with patch. [1] http://gcc.gnu.org/ml/gcc-testresults/2013-01/msg01706.html Uros.
Re: [PR55547] fix alias regression on alpha on misaligned symbols
On 01/15/2013 08:29 PM, Alexandre Oliva wrote: if (rtx_equal_for_memref_p (x, y)) { - if (xsize = 0 || ysize = 0) + if (xsize == 0 || ysize == 0) return 1; - if (c = 0 xsize c) + if (c = 0 abs (xsize) - c 0) return 1; - if (c 0 ysize+c 0) + if (c 0 abs (ysize) + c 0) return 1; return 0; } @@ -2063,7 +2063,8 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) y0 = canon_rtx (XEXP (y, 0)); if (rtx_equal_for_memref_p (x0, y0)) return (xsize == 0 || ysize == 0 - || (c = 0 xsize c) || (c 0 ysize+c 0)); + || (c = 0 abs (xsize) - c 0) + || (c 0 abs (ysize) + c 0)); /* Can't properly adjust our sizes. */ if (!CONST_INT_P (x1)) @@ -2119,8 +2120,9 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) if (CONST_INT_P (x) CONST_INT_P (y)) { c += (INTVAL (y) - INTVAL (x)); - return (xsize = 0 || ysize = 0 - || (c = 0 xsize c) || (c 0 ysize+c 0)); + return (xsize == 0 || ysize == 0 + || (c = 0 abs (xsize) - c 0) + || (c 0 abs (ysize) + c 0)); } I notice that these expressions (including the first hunk that uses ifs) are now all the same. It would seem extremely prudent to pull this out to a function so that they stay the same. That said, I question the change of = to == 0. If negative, we don't know how much overlap there is as far as I can see. if (GET_CODE (x) == CONST) @@ -2139,7 +2141,8 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) if (CONSTANT_P (y)) return (xsize = 0 || ysize = 0 || (rtx_equal_for_memref_p (x, y) -((c = 0 xsize c) || (c 0 ysize+c 0; +((c = 0 abs (xsize) - c 0) + || (c 0 abs (ysize) + c 0; This hunk is not needed, as we begin by eliminating = 0. So the abs is certain to do nothing. r~
Re: [google gcc-4_7] Inlining and devirtualization tests
On 16/01/2013, at 6:41 PM, Xinliang David Li wrote: Looks fine. Why adding tests that are expected to fail? Are these tests passing with trunk? They are expected to fail only due to optimization analysis not being up to snuff yet, not because it's a bad idea to optimize a particular testcase. As inlining and devirtualization analysis improves, these tests will start passing. Thanks, -- Maxim Kuvyrkov
Re: [Patch, Fortran] PR 55983: [4.7/4.8 Regression] ICE in find_typebound_proc_uop, at fortran/class.c:2711
As you say, this is close to being obvious - OK for trunk and for 4.7. Thanks, guys. Committed to trunk as 195251. Will do 4.7 soon ... Cheers, Janus On 16 January 2013 15:08, Janus Weil ja...@gcc.gnu.org wrote: Hi all, here is a close-to-obvious patch for an ICE-on-invalid regression. It removes as assert, which is reasonable for valid code but can fail under error conditions (as the PR shows), and replaces it with an equivalent IF clause. Regtested on x86_64-unknown-linux-gnu. Ok for trunk and 4.7? Cheers, Janus 2013-01-16 Janus Weil ja...@gcc.gnu.org PR fortran/55983 * class.c (find_typebound_proc_uop): Check for f2k_derived instead of asserting it. 2013-01-16 Janus Weil ja...@gcc.gnu.org PR fortran/55983 * gfortran.dg/class_55.f90: New. -- The knack of flying is learning how to throw yourself at the ground and miss. --Hitchhikers Guide to the Galaxy
[wwwdocs,avr,committed]: Mention avr specific improvments
http://gcc.gnu.org/ml/gcc-cvs-wwwdocs/2013/msg00015.html http://gcc.gnu.org/gcc-4.8/changes.html#avr Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revision 1.85 diff -u -p -r1.85 changes.html --- changes.html 16 Jan 2013 09:38:30 - 1.85 +++ changes.html 16 Jan 2013 20:40:40 - @@ -390,6 +390,45 @@ B b(42); // OK h2 id=targetsNew Targets and Target Specific Improvements/h2 +h3 id=avrAVR/h3 +ul + li +Support for the quot;Embeddednbsp;Cquot; fixed-point has been +added. For details, see the +a href=http://gcc.gnu.org/wiki/avr-gcc#Fixed-Point_Support; + GCC wiki/a and the +a href=http://gcc.gnu.org/onlinedocs/gcc/Fixed_002dPoint.html; + user manual/a. The support is not complete. + /li + liA new print modifier code%r/code for register operands in inline + assembler is supported. It will print the raw register number without the +register prefixnbsp;'coder/code': +pre +/* Return the most significant byte of 'val', a 64-bit value. */ + +unsigned char msb (long long val) +{ + unsigned char c; + __asm__ (mov %0, %r1+7 : =r (c) : r (val)); + return c; +}/pre +The inline assembler in this example will generate code like +pre +mov r24, 8+7/pre +provided codec/code is allocated to codeR24/code and +codeval/code is allocated to +codeR8/codehellip;codeR15/code. This works because +the GNU assembler accepts plain register numbers without register prefix. + /li + li +Static initializers with 3-byte symbols are supported now: +pre +extern const __memx char foo; +const __memx void *pfoo = amp;foo;/pre +This requires at least Binutils 2.23. + /li +/ul + h3IA-32/x86-64/h3 ul liAllow code-mpreferred-stack-boundary=3/code for the x86-64
Re: [PATCH, PR 55264] Do not remove as unreachable any virtual methods before inlining
On Wed, Jan 16, 2013 at 01:44:20PM +0100, Jan Hubicka wrote: Perhaps could you first change cgraph_non_local_node_p_1 and try to check some code if codegen differs significantly? It should not at all. ipa-cp is the sole user of this flag in IPA passes, so you should know what it does. Thinking deeper of ipa-cp and local virtuals, I think this is all slipperly. Local means that all calls to the functions are explicit and known. Obviously if function is virutal and new calls may appear by devirtualization, the local flag is bogus. I guess the external functions are the only that may be local and virtual because somewhere there must be a vtable reference, but to play safe, I would suggest marking all virtuals non-local. Right, as discussed on IRC, the patch below therfore modifies cgraph_only_called_directly_or_aliased_p to return false for virtual functions (which translates into cleared local flag) and the cloning machinery to clear that flag. Bootstrapped and tested on x86_64-linux without any problems. OK for trunk? OK, thanks! Honza
Re: PING: gcc.target/arm: skip 5 tests for flag conflicts
On 01/16/2013 05:31 AM, Nick Clifton wrote: Hi Janis, Back in September I submitted a patch to fix five ARM tests in http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01515.html. You responded in http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00972.html and I answered your questions in a reply. I believe that Richard's main point was that the skips that you were adding to the tests meant that they would not be run for valid command line options. Now I get it. This version is more selective about which multilibs are skipped. I tested it by using multilib test flags for all valid values for -march, with and without -mthumb as appropriate for the arch. The ones that are now skipped are the ones that used to fail with complaints from the compiler. Is this OK? Janis 2013-01-16 Janis Johnson jani...@codesourcery.com * gcc.target/arm/pr40887.c: Require at least armv5. * gcc.target/arm/pr51835.c: Avoid conflicts with multilib flags. * gcc.target/arm/pr51915.c: Likewise. * gcc.target/arm/pr52006.c: Likewise. * gcc.target/arm/pr53187.c: Likewise. Index: src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr40887.c === --- src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr40887.c (revision 195216) +++ src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr40887.c (working copy) @@ -1,3 +1,4 @@ +/* { dg-skip-if need at least armv5 { *-*-* } { -march=armv[234]* } { } } */ /* { dg-options -O2 -march=armv5te } */ /* { dg-final { scan-assembler blx } } */ Index: src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr51835.c === --- src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr51835.c (revision 195216) +++ src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr51835.c (working copy) @@ -1,6 +1,8 @@ /* { dg-do compile } */ -/* { dg-options -O2 -mfloat-abi=hard -mfpu=fpv4-sp-d16 } */ -/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if no support for hard-float VFP ABI { arm_thumb1 } { -march=* } { } } */ +/* { dg-skip-if do not override -mfloat-abi { *-*-* } { -mfloat-abi=* } { -mfloat-abi=hard } } */ +/* { dg-skip-if avoid conflicting -mfpu { *-*-* } { -mfpu=* } { -mfpu=fpv4-sp-d16 -mfpu=vfpv3xd -mfpu=vfpv3xd-fp16 } } */ +/* { dg-options -O2 -march=armv7-a -mfloat-abi=hard -mfpu=fpv4-sp-d16 } */ int func1 (double d) { Index: src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr51915.c === --- src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr51915.c (revision 195216) +++ src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr51915.c (working copy) @@ -1,5 +1,7 @@ /* PR target/51915 */ /* { dg-do compile } */ +/* { dg-skip-if no support for hard-float VFP ABI { arm_thumb1 } { -march=* } { } } */ +/* { dg-skip-if do not override -mfloat-abi { *-*-* } { -mfloat-abi=* } { -mfloat-abi=hard } } */ /* { dg-options -march=armv7-a -mfloat-abi=hard -O2 } */ struct S { int s1; void *s2; }; Index: src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr52006.c === --- src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr52006.c (revision 195216) +++ src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr52006.c (working copy) @@ -1,5 +1,7 @@ /* PR target/52006 */ /* { dg-do compile } */ +/* { dg-skip-if avoid conflicts with multilib flags { *-*-* } { -mfloat-abi=* } { -mfloat-abi=hard } } */ +/* { dg-skip-if no support for hard-float VFP ABI { arm_thumb1 } { -march=* } { } } */ /* { dg-options -march=armv7-a -mfloat-abi=hard -O2 -fPIC } */ unsigned long a; Index: src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr53187.c === --- src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr53187.c (revision 195216) +++ src/gcc-mainline/gcc/testsuite/gcc.target/arm/pr53187.c (working copy) @@ -1,5 +1,7 @@ /* PR target/53187 */ /* { dg-do compile } */ +/* { dg-skip-if no support for hard-float VFP ABI { arm_thumb1 } { -march=* } { } } */ +/* { dg-skip-if do not override -mfloat-abi { *-*-* } { -mfloat-abi=* } { -mfloat-abi=hard } } */ /* { dg-options -march=armv7-a -mfloat-abi=hard -O2 } */ void bar (int);
Re: [patch] Fix libstdc++/55043 - issue with nesting unordered_map containing unique_ptr into vector
Here's another attempt to fix this regression, I hope this time it doesn't cause more problems than it solves. Instead of specializing is_copy_constructible for the unordered containers this causes their copy constructors to be deleted if the value_type is not CopyInsertable into the container. This makes is_copy_constructible naturally give the right result, and so __move_if_noexcept does the right thing and the testcase in the PR passes. Yay. As Daniel pointed out in the PR comments, the unfortunate side effect of this approach is that we can no longer support instantiating unordered containers with incomplete types. That's undefined behaviour, but was allowed as QoI. Conformance trumps QoI, I'm afraid. If someday we have noexcept move constructors for the unordered containers we could allow incomplete types again. PR libstdc++/55043 (again) * include/bits/alloc_traits.h (allocator_traits::construct): Disable unless construction would be well-formed. (__allow_copy_cons, __check_copy_constructible): Define. * include/bits/unordered_map.h (__check_copy_constructible): Use as base class so copy constructor will be deleted if appropriate. (is_copy_constructible): Remove specialization. * include/bits/unordered_set.h: Likewise. * include/debug/unordered_map.h: Undo previous commit. Default copy and move constructors. * include/debug/unordered_set.h: Likewise. * include/profile/unordered_map.h: Undo previous commit. * include/profile/unordered_set.h: Likewise. * testsuite/23_containers/unordered_map/55043.cc: Fix test. * testsuite/23_containers/unordered_multimap/55043.cc: Likewise. * testsuite/23_containers/unordered_multiset/55043.cc: Likewise. * testsuite/23_containers/unordered_set/55043.cc: Likewise. * testsuite/23_containers/unordered_map/requirements/53339.cc: XFAIL, cannot support incomplete types. * testsuite/23_containers/unordered_multimap/requirements/53339.cc: Likewise. Tested x86_86-linux, committed to trunk. commit 20ee8df23bc999c8cf8876b88a188c4f51fb7665 Author: Jonathan Wakely jwakely@gmail.com Date: Wed Jan 16 09:50:47 2013 + PR libstdc++/55043 (again) * include/bits/alloc_traits.h (allocator_traits::construct): Disable unless construction would be well-formed. (__allow_copy_cons, __check_copy_constructible): Define. * include/bits/unordered_map.h (__check_copy_constructible): Use as base class so copy constructor will be deleted if appropriate. (is_copy_constructible): Remove specialization. * include/bits/unordered_set.h: Likewise. * include/debug/unordered_map.h: Undo previous commit. Default copy and move constructors. * include/debug/unordered_set.h: Likewise. * include/profile/unordered_map.h: Undo previous commit. * include/profile/unordered_set.h: Likewise. * testsuite/23_containers/unordered_map/55043.cc: Fix test. * testsuite/23_containers/unordered_multimap/55043.cc: Likewise. * testsuite/23_containers/unordered_multiset/55043.cc: Likewise. * testsuite/23_containers/unordered_set/55043.cc: Likewise. * testsuite/23_containers/unordered_map/requirements/53339.cc: XFAIL, cannot support incomplete types. * testsuite/23_containers/unordered_multimap/requirements/53339.cc: Likewise. diff --git a/libstdc++-v3/include/bits/alloc_traits.h b/libstdc++-v3/include/bits/alloc_traits.h index c6259a1..26c64f2 100644 --- a/libstdc++-v3/include/bits/alloc_traits.h +++ b/libstdc++-v3/include/bits/alloc_traits.h @@ -257,7 +257,8 @@ _GLIBCXX_ALLOC_TR_NESTED_TYPE(propagate_on_container_swap, templatetypename _Tp, typename... _Args static typename - enable_if!__construct_helper_Tp, _Args...::value, void::type + enable_if__and___not___construct_helper_Tp, _Args..., +is_constructible_Tp, _Args...::value, void::type _S_construct(_Alloc, _Tp* __p, _Args... __args) { ::new((void*)__p) _Tp(std::forward_Args(__args)...); } @@ -389,7 +390,8 @@ _GLIBCXX_ALLOC_TR_NESTED_TYPE(propagate_on_container_swap, * arguments @a __args... */ templatetypename _Tp, typename... _Args - static void construct(_Alloc __a, _Tp* __p, _Args... __args) + static auto construct(_Alloc __a, _Tp* __p, _Args... __args) + - decltype(_S_construct(__a, __p, std::forward_Args(__args)...)) { _S_construct(__a, __p, std::forward_Args(__args)...); } /** @@ -526,9 +528,10 @@ _GLIBCXX_ALLOC_TR_NESTED_TYPE(propagate_on_container_swap, _M_select(...); public: - typedef decltype(_M_selecttypename _Alloc::value_type(0)) type; + typedef decltype(_M_selecttypename _Alloc::value_type(0)) type; }; + // true if _Alloc::value_type
[patch] fix libstdc++/56012 - narrowing conversion in std::atomic_flag
This fixes a regression since 4.6 when -Wsystem-headers is used. The initialization of the __atomic_flag_base base class has a narrowing conversion from int (the macro) to either bool or unsigned char. The patch fixes it by calling a constexpr function which implicitly converts the value to the return type instead of doing the conversion inside a braced-init-list. Doing that requires naming the return type, so I defined a new typedef for to avoid duplicating the preprocessor conditional. The patch also adds a missing assignment operator in atomicbool. PR libstdc++/56012 * include/bits/atomic_base.h (atomic_flag): Fix narrowing conversion. * testsuite/29_atomics/atomic/operators/56012.cc: New. PR libstdc++/56011 * include/std/atomic (atomicbool::operator=(bool) volatile): Add missing overload. * testsuite/29_atomics/atomic/operators/56011.cc: New. Tested x86_64-linux, it's a regression so I want to commit it to the trunk and 4.7 branch, any objections from the atomics experts? commit e32b21316534398d2b45d4b1b4c06bb12ec17e84 Author: Jonathan Wakely jwakely@gmail.com Date: Wed Jan 16 23:54:37 2013 + PR libstdc++/56012 * include/bits/atomic_base.h (atomic_flag): Fix narrowing conversion. * testsuite/29_atomics/atomic/operators/56012.cc: New. PR libstdc++/56011 * include/std/atomic (atomicbool::operator=(bool) volatile): Add missing overload. * testsuite/29_atomics/atomic/operators/56011.cc: New. diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h index 8ce5553..959f524 100644 --- a/libstdc++-v3/include/bits/atomic_base.h +++ b/libstdc++-v3/include/bits/atomic_base.h @@ -1,6 +1,6 @@ // -*- C++ -*- header. -// Copyright (C) 2008-2012 Free Software Foundation, Inc. +// Copyright (C) 2008-2013 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // software; you can redistribute it and/or modify it under the @@ -212,6 +212,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION templatetypename _Tp struct atomic_Tp*; +/* The target's set value for test-and-set may not be exactly 1. */ +#if __GCC_ATOMIC_TEST_AND_SET_TRUEVAL == 1 +typedef bool __atomic_flag_data_type; +#else +unsigned char __atomic_flag_data_type; +#endif /** * @brief Base type for atomic_flag. @@ -227,12 +233,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION struct __atomic_flag_base { -/* The target's set value for test-and-set may not be exactly 1. */ -#if __GCC_ATOMIC_TEST_AND_SET_TRUEVAL == 1 -bool _M_i; -#else -unsigned char _M_i; -#endif +__atomic_flag_base_type _M_i; }; _GLIBCXX_END_EXTERN_C @@ -250,7 +251,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Conversion to ATOMIC_FLAG_INIT. constexpr atomic_flag(bool __i) noexcept - : __atomic_flag_base({ __i ? __GCC_ATOMIC_TEST_AND_SET_TRUEVAL : 0 }) + : __atomic_flag_base{ _S_init(__i) } { } bool @@ -284,6 +285,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __atomic_clear (_M_i, __m); } + + private: +static constexpr __atomic_flag_data_type +_S_init(bool __i) +{ return __i ? __GCC_ATOMIC_TEST_AND_SET_TRUEVAL : 0; } }; diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic index 4012f7d..7cc5c89 100644 --- a/libstdc++-v3/include/std/atomic +++ b/libstdc++-v3/include/std/atomic @@ -69,6 +69,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION operator=(bool __i) noexcept { return _M_base.operator=(__i); } +bool +operator=(bool __i) volatile noexcept +{ return _M_base.operator=(__i); } + operator bool() const noexcept { return _M_base.load(); } diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc b/libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc new file mode 100644 index 000..1d77a55 --- /dev/null +++ b/libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc @@ -0,0 +1,29 @@ +// { dg-options -std=gnu++0x } +// { dg-do compile } + +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +#include atomic +void test01() +{ + using namespace std; + volatile atomicbool ab1
Re: [google 4.7] Fix for PR 8013197 (issue7140044)
2013-01-16 Sterling Augustine saugust...@google.com * gcc/dwarf2out.c (resolve_addr): Delete call to remove_addr_table_entry. OK for google/gcc-4_7. The commit log entry should say Google ref: b/8013197 instead of PR Thanks! -cary
libbacktrace patch committed: Handle missing line number entry
It turns out that it is possible to construct debug info in which a compilation unit has a low_pc and a high_pc, and a line number table, but the line number table does not cover PC values from low_pc up to some value. For example, this will happen in an assembler file if you use a .cfi_startproc with no preceding .loc, and then provide some instructions, and then use a .loc later in the function before the .cfi_endproc. In this admittedly obscure case the backtrace library was returning an error inconsistent DWARF line number info. Since the case can occur in real files, this is inappropriate. This patch fixes libbacktrace to return the main compilation file, if available, with no associated function or line number. I have not added a test case because I don't know how to create one without using assembler code. I did write a test case myself and verified that it fixed the problem on my system. Bootstrapped and ran libbacktrace testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian 2013-01-16 Ian Lance Taylor i...@google.com * dwarf.c (struct unit): Add filename and abs_filename fields. (build_address_map): Set new fields when reading unit. (dwarf_lookup_pc): If we don't find an entry in the line table, just return the main file name. Index: dwarf.c === --- dwarf.c (revision 195256) +++ dwarf.c (working copy) @@ -283,8 +283,12 @@ struct unit int addrsize; /* Offset into line number information. */ off_t lineoff; + /* Primary source file. */ + const char *filename; /* Compilation command working directory. */ const char *comp_dir; + /* Absolute file name, only set if needed. */ + const char *abs_filename; /* The abbreviations for this unit. */ struct abbrevs abbrevs; @@ -1288,6 +1292,7 @@ build_address_map (struct backtrace_stat int have_ranges; uint64_t lineoff; int have_lineoff; + const char *filename; const char *comp_dir; if (info.reported_underflow) @@ -1346,6 +1351,7 @@ build_address_map (struct backtrace_stat have_ranges = 0; lineoff = 0; have_lineoff = 0; + filename = NULL; comp_dir = NULL; for (i = 0; i abbrev-num_attrs; ++i) { @@ -1394,6 +1400,10 @@ build_address_map (struct backtrace_stat have_lineoff = 1; } break; + case DW_AT_name: + if (val.encoding == ATTR_VAL_STRING) + filename = val.u.string; + break; case DW_AT_comp_dir: if (val.encoding == ATTR_VAL_STRING) comp_dir = val.u.string; @@ -1421,7 +1431,9 @@ build_address_map (struct backtrace_stat u-version = version; u-is_dwarf64 = is_dwarf64; u-addrsize = addrsize; + u-filename = filename; u-comp_dir = comp_dir; + u-abs_filename = NULL; u-lineoff = lineoff; u-abbrevs = abbrevs; memset (abbrevs, 0, sizeof abbrevs); @@ -2701,8 +2713,45 @@ dwarf_lookup_pc (struct backtrace_state sizeof (struct line), line_search); if (ln == NULL) { - error_callback (data, inconsistent DWARF line number info, 0); - return 0; + /* The PC is between the low_pc and high_pc attributes of the + compilation unit, but no entry in the line table covers it. + This implies that the start of the compilation unit has no + line number information. */ + + if (entry-u-abs_filename == NULL) + { + const char *filename; + + filename = entry-u-filename; + if (filename != NULL + !IS_ABSOLUTE_PATH (filename) + entry-u-comp_dir != NULL) + { + size_t filename_len; + const char *dir; + size_t dir_len; + char *s; + + filename_len = strlen (filename); + dir = entry-u-comp_dir; + dir_len = strlen (dir); + s = (char *) backtrace_alloc (state, dir_len + filename_len + 2, + error_callback, data); + if (s == NULL) + { + *found = 0; + return 0; + } + memcpy (s, dir, dir_len); + /* FIXME: Should use backslash if DOS file system. */ + s[dir_len] = '/'; + memcpy (s + dir_len + 1, filename, filename_len + 1); + filename = s; + } + entry-u-abs_filename = filename; + } + + return callback (data, pc, entry-u-abs_filename, 0, NULL); } /* Search for function name within this unit. */
[patch] libstdc++/52887 - fix AIX bootstrap
Add required instantiations for AIX. PR libstdc++/52887 * src/c++11/regex.cc: Add instantiations for AIX. Committed to the 4.7 branch only. commit df31b423330bab88fee84c8f32376dce7ca9242b Author: Jonathan Wakely jwakely@gmail.com Date: Thu Jan 17 01:36:42 2013 + PR libstdc++/52887 * src/c++11/regex.cc: Add instantiations for AIX. diff --git a/libstdc++-v3/src/c++11/regex.cc b/libstdc++-v3/src/c++11/regex.cc index 8a47da3..d21f221 100644 --- a/libstdc++-v3/src/c++11/regex.cc +++ b/libstdc++-v3/src/c++11/regex.cc @@ -1,6 +1,6 @@ // regex -*- C++ -*- -// Copyright (C) 2011 Free Software Foundation, Inc. +// Copyright (C) 2011-2013 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // software; you can redistribute it and/or modify it under the @@ -34,5 +34,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION regex_error::~regex_error() throw() { } +#ifdef _AIX + // PR libstdc++/52887 + template class functionvoid (__regex::_PatternCursor const, + __regex::_Results); + template class functionbool (__regex::_PatternCursor const); +#endif + _GLIBCXX_END_NAMESPACE_VERSION } // namespace std
Go patch committed: Provide line number for init function
Related to the libbacktrace patch I submitted earlier (http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00889.html), this patch ensures that the initialization function, created by the Go frontend to handle package imports and run init functions, has line number information. Previously it did not. That was normally fine, but I ran across a case in which one init function was inlined and another one, called first, was not. The first init function called runtime.Callers to get backtrace information, and tried to get information from a function with a line number table but with no line number for the point of the call. The result was that the program crashed. This patch ensures that we do have line number information at that point, albeit line number information that is not very useful in a backtrace. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 244b695224a4 go/gogo-tree.cc --- a/go/gogo-tree.cc Fri Dec 21 17:05:18 2012 -0800 +++ b/go/gogo-tree.cc Wed Jan 16 17:45:59 2013 -0800 @@ -438,15 +438,15 @@ // The tedious details of building your own function. There doesn't // seem to be a helper function for this. std::string name = this-package_name() + .init; - tree fndecl = build_decl(BUILTINS_LOCATION, FUNCTION_DECL, - get_identifier_from_string(name), + tree fndecl = build_decl(this-package_-location().gcc_location(), + FUNCTION_DECL, get_identifier_from_string(name), build_function_type(void_type_node, void_list_node)); const std::string asm_name(this-get_init_fn_name()); SET_DECL_ASSEMBLER_NAME(fndecl, get_identifier_from_string(asm_name)); - tree resdecl = build_decl(BUILTINS_LOCATION, RESULT_DECL, NULL_TREE, - void_type_node); + tree resdecl = build_decl(this-package_-location().gcc_location(), + RESULT_DECL, NULL_TREE, void_type_node); DECL_ARTIFICIAL(resdecl) = 1; DECL_CONTEXT(resdecl) = fndecl; DECL_RESULT(fndecl) = resdecl; @@ -481,7 +481,8 @@ push_struct_function(fndecl); else push_cfun(DECL_STRUCT_FUNCTION(fndecl)); - cfun-function_end_locus = BUILTINS_LOCATION; + cfun-function_start_locus = this-package_-location().gcc_location(); + cfun-function_end_locus = cfun-function_start_locus; gimplify_function_tree(fndecl); @@ -1118,6 +1119,7 @@ else push_cfun(DECL_STRUCT_FUNCTION(decl)); + cfun-function_start_locus = func-location().gcc_location(); cfun-function_end_locus = func-block()-end_location().gcc_location();
Re: [PATCH] [testsuite] [arm] Test thumb1 far jump
On 01/16/2013 06:05 PM, Joey Ye wrote: Test cases for previous patch no lr save for non-far branches in leaf function. * gcc.target/arm/thumb1-far-jump-1.c: New. * gcc.target/arm/thumb1-far-jump-2.c: New. Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c === --- gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c (revision 0) +++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c (revision 0) @@ -0,0 +1,58 @@ +/* Check for thumb1 far jump. This is the extreme case that far jump + * will be used with minimum number of instructions. By passing this case + * it means the heuristic of saving lr for far jump meets the most extreme + * requirement. */ +/* { dg-require-effective-target arm_thumb1_ok } */ +/* { dg-options -Os } */ +/* { dg-skip-if { ! { arm_thumb1 } } } */ The effective target arm_thumb1_ok returns 1 if it's OK to add -mthumb to the current multilib flags and together they generate code for Thumb-1. arm_thumb1 says that the current multilib flags generate code for Thumb-1. If you want to add -mthumb to the test then use: /* { dg-require-effective-target arm_thumb1_ok } */ /* { dg-options -Os -mthumb } */ Otherwise use /* { dg-skip-if { ! arm_thumb1 } } */ /* { dg-options -Os } */ +/* { dg-final { scan-assembler push.*lr } } */ Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c === --- gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c (revision 0) +++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c (revision 0) @@ -0,0 +1,35 @@ +/* Check for thumb1 far jump. Shouldn't save lr for small leaf functions + * even with a branch in it. */ +/* { dg-require-effective-target arm_thumb1_ok } */ +/* { dg-options -Os } */ +/* { dg-skip-if { ! { arm_thumb1 } } } */ Same here. The tests are OK with those changes, but please wait a day or two for other comments or a clear OK from an ARM maintainer. Janis
Re: [PR55547] fix alias regression on alpha on misaligned symbols
On Jan 16, 2013, Richard Henderson r...@redhat.com wrote: I notice that these expressions (including the first hunk that uses ifs) are now all the same. *nod* It would seem extremely prudent to pull this out to a function so that they stay the same. ack, will do. That said, I question the change of = to == 0. If negative, we don't know how much overlap there is as far as I can see. Why not? Since the addresses are constants, and the negative sizes are just the adjusted sizes, negated to indicate they were conservatively lengthened by an alignment operation, we can determine that two references don't overlap if they're far enough from each other that, even with the alignment adjustment, they're still clearly non-overlapping. Say, if x is 0x0fff and y is 0x1234, both originally referenced with size 8 and x aligned to 0x20, it is obvious that the accesses won't overlap, in spite of the alignment on x. The test applied on constant addresses wouldn't realize that and would say they could overlap, because any alignment-adjusted size would be mistaken as “overlaps with anything”. if (GET_CODE (x) == CONST) @@ -2139,7 +2141,8 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) if (CONSTANT_P (y)) return (xsize = 0 || ysize = 0 || (rtx_equal_for_memref_p (x, y) - ((c = 0 xsize c) || (c 0 ysize+c 0; + ((c = 0 abs (xsize) - c 0) +|| (c 0 abs (ysize) + c 0; This hunk is not needed, as we begin by eliminating = 0. So the abs is certain to do nothing. The function I'll introduce to keep the expressions the same will have the abs and I'll use it here, but you're right that after testing for negative sizes they abses won't make much of a difference. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer
RE: [PATCH] [testsuite] [arm] Test thumb1 far jump
-Original Message- From: Janis Johnson [mailto:janis_john...@mentor.com] Sent: Thursday, January 17, 2013 10:41 To: Joey Ye Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] [testsuite] [arm] Test thumb1 far jump On 01/16/2013 06:05 PM, Joey Ye wrote: Test cases for previous patch no lr save for non-far branches in leaf function. * gcc.target/arm/thumb1-far-jump-1.c: New. * gcc.target/arm/thumb1-far-jump-2.c: New. Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c === --- gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c(revision 0) +++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c(revision 0) @@ -0,0 +1,58 @@ +/* Check for thumb1 far jump. This is the extreme case that far jump + * will be used with minimum number of instructions. By passing this case + * it means the heuristic of saving lr for far jump meets the most extreme + * requirement. */ +/* { dg-require-effective-target arm_thumb1_ok } */ +/* { dg-options -Os } */ +/* { dg-skip-if { ! { arm_thumb1 } } } */ The effective target arm_thumb1_ok returns 1 if it's OK to add -mthumb to the current multilib flags and together they generate code for Thumb-1. arm_thumb1 says that the current multilib flags generate code for Thumb-1. If you want to add -mthumb to the test then use: /* { dg-require-effective-target arm_thumb1_ok } */ /* { dg-options -Os -mthumb } */ Otherwise use /* { dg-skip-if { ! arm_thumb1 } } */ /* { dg-options -Os } */ This is what I intented +/* { dg-final { scan-assembler push.*lr } } */ Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c === --- gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c(revision 0) +++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c(revision 0) @@ -0,0 +1,35 @@ +/* Check for thumb1 far jump. Shouldn't save lr for small leaf functions + * even with a branch in it. */ +/* { dg-require-effective-target arm_thumb1_ok } */ +/* { dg-options -Os } */ +/* { dg-skip-if { ! { arm_thumb1 } } } */ Same here. The tests are OK with those changes, but please wait a day or two for other comments or a clear OK from an ARM maintainer. These test cases should be committed together with http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00221.html or otherwise one of them will fail. Janis Updated patch: Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c === --- gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c(revision 0) +++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-2.c(revision 0) @@ -0,0 +1,58 @@ +/* Check for thumb1 far jump. This is the extreme case that far jump + * will be used with minimum number of instructions. By passing this case + * it means the heuristic of saving lr for far jump meets the most extreme + * requirement. */ +/* { dg-options -Os } */ +/* { dg-skip-if { ! { arm_thumb1 } } } */ + +volatile register r4 asm(r4); +void f3(int i) +{ +#define GO(n) \ + extern volatile int g_##n; \ + r4=(int)g_##n; + +#define GO8(n) \ + GO(n##_0) \ + GO(n##_1) \ + GO(n##_2) \ + GO(n##_3) \ + GO(n##_4) \ + GO(n##_5) \ + GO(n##_6) \ + GO(n##_7) + +#define GO64(n) \ + GO8(n##_0) \ + GO8(n##_1) \ + GO8(n##_2) \ + GO8(n##_3) \ + GO8(n##_4) \ + GO8(n##_5) \ + GO8(n##_6) \ + GO8(n##_7) \ + +#define GO498(n) \ + GO64(n##_0) \ + GO64(n##_1) \ + GO64(n##_2) \ + GO64(n##_3) \ + GO64(n##_4) \ + GO64(n##_5) \ + GO64(n##_6) \ + GO8(n##_0) \ + GO8(n##_1) \ + GO8(n##_2) \ + GO8(n##_3) \ + GO8(n##_4) \ + GO8(n##_5) \ + GO(n##_0) \ + GO(n##_1) \ + + if (i) { +GO498(0); + } +} + +/* { dg-final { scan-assembler push.*lr } } */ Index: gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c === --- gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c(revision 0) +++ gcc/testsuite/gcc.target/arm/thumb1-far-jump-1.c(revision 0) @@ -0,0 +1,35 @@ +/* Check for thumb1 far jump. Shouldn't save lr for small leaf functions + * even with a branch in it. */ +/* { dg-options -Os } */ +/* { dg-skip-if { ! { arm_thumb1 } } } */ + +void f() +{ + for (;;); +} + +volatile int g; +void f2(int i) +{ + if (i) g=0; +} + +void f3(int i) +{ + if (i) { +g=0; +g=1; +g=2; +g=3; +g=4; +g=5; +g=6; +g=7; +g=8; +g=9; + } +} + +/* { dg-final { scan-assembler-not push.*lr } } */ +
Re: [PR55547] fix alias regression on alpha on misaligned symbols
On Jan 16, 2013, Richard Henderson r...@redhat.com wrote: I notice that these expressions (including the first hunk that uses ifs) are now all the same. It would seem extremely prudent to pull this out to a function so that they stay the same. Here's a revised patch that makes that change, making the overlap computation clearer (to me) while at that. The other fix was to avoid adjusting zero sizes at alignment expressions, lest they'd lose the special meaning. Regstrapping on x86_64-linux-gnu and i686-linux-gnu. Ok to install? Be conservative about negative sizes on symbols, use abs elsewhere From: Alexandre Oliva aol...@redhat.com for gcc/ChangeLog PR rtl-optimization/55547 PR rtl-optimization/53827 PR debug/53671 PR debug/49888 * alias.c (memrefs_conflict_p): Use abs of sizes all over, retaining the conservative special case for symbolic constants. --- gcc/alias.c | 46 -- 1 files changed, 28 insertions(+), 18 deletions(-) diff --git a/gcc/alias.c b/gcc/alias.c index 9a386dd..f3cd014 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -1904,6 +1904,20 @@ addr_side_effect_eval (rtx addr, int size, int n_refs) return addr; } +/* Return TRUE if an object X sized at XSIZE bytes and another object + Y sized at YSIZE bytes, starting C bytes after X, may overlap. If + any of the sizes is zero, assume an overlap, otherwise use the + absolute value of the sizes as the actual sizes. */ + +static inline bool +offset_overlap_p (HOST_WIDE_INT c, int xsize, int ysize) +{ + return (xsize == 0 || ysize == 0 + || (c = 0 + ? (abs (xsize) c) + : (abs (ysize) -c))); +} + /* Return one if X and Y (memory addresses) reference the same location in memory or if the references overlap. Return zero if they do not overlap, else return @@ -1976,23 +1990,17 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) else if (GET_CODE (x) == LO_SUM) x = XEXP (x, 1); else -x = addr_side_effect_eval (x, xsize, 0); +x = addr_side_effect_eval (x, abs (xsize), 0); if (GET_CODE (y) == HIGH) y = XEXP (y, 0); else if (GET_CODE (y) == LO_SUM) y = XEXP (y, 1); else -y = addr_side_effect_eval (y, ysize, 0); +y = addr_side_effect_eval (y, abs (ysize), 0); if (rtx_equal_for_memref_p (x, y)) { - if (xsize = 0 || ysize = 0) - return 1; - if (c = 0 xsize c) - return 1; - if (c 0 ysize+c 0) - return 1; - return 0; + return offset_overlap_p (c, xsize, ysize); } /* This code used to check for conflicts involving stack references and @@ -2062,8 +2070,7 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) x0 = canon_rtx (XEXP (x, 0)); y0 = canon_rtx (XEXP (y, 0)); if (rtx_equal_for_memref_p (x0, y0)) - return (xsize == 0 || ysize == 0 - || (c = 0 xsize c) || (c 0 ysize+c 0)); + return offset_overlap_p (c, xsize, ysize); /* Can't properly adjust our sizes. */ if (!CONST_INT_P (x1)) @@ -2093,7 +2100,8 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) { if (xsize 0) xsize = -xsize; - xsize += sc + 1; + if (xsize) + xsize += sc + 1; c -= sc + 1; return memrefs_conflict_p (xsize, canon_rtx (XEXP (x, 0)), ysize, y, c); @@ -2107,7 +2115,8 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) { if (ysize 0) ysize = -ysize; - ysize += sc + 1; + if (ysize) + ysize += sc + 1; c += sc + 1; return memrefs_conflict_p (xsize, x, ysize, canon_rtx (XEXP (y, 0)), c); @@ -2119,8 +2128,7 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) if (CONST_INT_P (x) CONST_INT_P (y)) { c += (INTVAL (y) - INTVAL (x)); - return (xsize = 0 || ysize = 0 - || (c = 0 xsize c) || (c 0 ysize+c 0)); + return offset_overlap_p (c, xsize, ysize); } if (GET_CODE (x) == CONST) @@ -2136,10 +2144,12 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) return memrefs_conflict_p (xsize, x, ysize, canon_rtx (XEXP (y, 0)), c); + /* Assume a potential overlap for symbolic addresses that went + through alignment adjustments (i.e., that have negative + sizes), because we can't know how far they are from each + other. */ if (CONSTANT_P (y)) - return (xsize = 0 || ysize = 0 - || (rtx_equal_for_memref_p (x, y) - ((c = 0 xsize c) || (c 0 ysize+c 0; + return (xsize 0 || ysize 0 || offset_overlap_p (c, xsize, ysize)); return -1; } -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer
Re: [Patch i386]: btver2 pipeline descriptions.
On Thu, Jan 17, 2013 at 8:17 AM, Kumar, Venkataramanan venkataramanan.ku...@amd.com wrote: Hi Maintainers, This patch adds pipeline descriptions for -march=btver2. Completed bootstrap and gcc regression test passes with r194705 Is this ok for trunk? Change log -- 2013-01-17 Venkataramanan Kumar venkataramanan.ku...@amd.com btver2 pipeline descriptions * config/i386/i386.c : Enable CPU_BTVER2 to use btver2 pipeline descriptions. * config/i386/i386.md (btver2_decode) : New type attributes. * config/i386/sse.md (btver2_decode, btver2_sse_attr): New type attributes. * config/i386/btver2.md: New file describing btver2 pipelines. As agreed previously, tuning patch looks safe even this late in the game. So, OK for mainline, unless RMs veto this decision. - [(set_attr type bitmanip) + [(set_attr type bitmanip,bitmanip) No need for this change (in a couple of places). Uros.
Re: [patch] fix libstdc++/56012 - narrowing conversion in std::atomic_flag
2013/1/17 Jonathan Wakely jwakely@gmail.com: This fixes a regression since 4.6 when -Wsystem-headers is used. The initialization of the __atomic_flag_base base class has a narrowing conversion from int (the macro) to either bool or unsigned char. The patch fixes it by calling a constexpr function which implicitly converts the value to the return type instead of doing the conversion inside a braced-init-list. Doing that requires naming the return type, so I defined a new typedef for to avoid duplicating the preprocessor conditional. The patch also adds a missing assignment operator in atomicbool. PR libstdc++/56012 * include/bits/atomic_base.h (atomic_flag): Fix narrowing conversion. * testsuite/29_atomics/atomic/operators/56012.cc: New. PR libstdc++/56011 * include/std/atomic (atomicbool::operator=(bool) volatile): Add missing overload. * testsuite/29_atomics/atomic/operators/56011.cc: New. Tested x86_64-linux, it's a regression so I want to commit it to the trunk and 4.7 branch, any objections from the atomics experts? Isn't here a typedef missing: +/* The target's set value for test-and-set may not be exactly 1. */ +#if __GCC_ATOMIC_TEST_AND_SET_TRUEVAL == 1 +typedef bool __atomic_flag_data_type; +#else +unsigned char __atomic_flag_data_type; +#endif I would expect that this looked like: +/* The target's set value for test-and-set may not be exactly 1. */ +#if __GCC_ATOMIC_TEST_AND_SET_TRUEVAL == 1 +typedef bool __atomic_flag_data_type; +#else +typedef unsigned char __atomic_flag_data_type; +#endif - Daniel