Re: [patch] change specific int128 - generic intN
I wasn't sure what to do with that array, since it was static and couldn't have empty slots in them like the arrays in tree.h. Also, do we need to have *every* type in that list? What's the rule for whether a type gets installed there or not? The comment says guaranteed to be in the runtime support but does that mean for this particular build (wrt multilibs) as not all intN types are guaranteed (even the int128 types were not guaranteed to be supported before my patch). In other parts of the patch, just taking out the special case for __int128 was sufficient to do the right thing for all __intN types. I can certainly put the intN types in there, but note that it would mean regenerating the fundamentals[] array at runtime to include those types which are supported at the time. Do the entries in the array need to be in a particular order?
Re: [PATCH] Add DW_AT_const_value as unsigned or int depending on type and value used.
On Mon, Apr 14, 2014 at 03:48:06PM -0700, Cary Coutant wrote:

Also note that size_of_die and value_format will still choose DW_FORM_data[1248] for dw_val_class_unsigned_const in most cases. Don't you really want to use DW_FORM_udata?

DW_FORM_data[1248] is in many cases smaller than DW_FORM_udata (though one has to take into account a possibly larger .debug_abbrev size).

Yes, but it's up to the consumer to deduce from context whether the value is signed or unsigned. If it's still true that GDB will interpret DW_FORM_data[1248] as signed (as the deleted comment said), and you output a value between 128 and 255 using DW_FORM_data1, this isn't going to work. Maybe that comment only applies to DW_FORM_data[48] (whichever matches HOST_WIDE_INT)?

If there is no agreement between producer and consumer on what is signed and what is unsigned for DW_FORM_data[1248], then of course that is a problem; I wasn't aware of such disagreements. Anyway, at least DW_FORM_data[1248] values which don't have the topmost bit set should always be fine, because they are not ambiguous even if producer and consumer disagree on whether the value is signed or unsigned.

Jakub
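To make the ambiguity concrete, here is a minimal sketch of how a single DW_FORM_data1 byte reads under the two interpretations. The helper names are hypothetical and are not taken from GCC or GDB; the code only illustrates the point about the topmost bit.

```cpp
#include <cstdint>

// Hypothetical helper: how a consumer that assumes "signed" reads one raw byte.
std::int64_t read_data1_as_signed(std::uint8_t raw) {
  // Two's complement interpretation, written out without relying on casts.
  return raw >= 0x80 ? static_cast<std::int64_t>(raw) - 256 : raw;
}

// Hypothetical helper: how a consumer that assumes "unsigned" reads the same byte.
std::uint64_t read_data1_as_unsigned(std::uint8_t raw) {
  return raw;
}

// A value is safe regardless of the consumer's assumption only when the
// topmost bit is clear, because both readings then agree.
bool data1_is_unambiguous(std::uint8_t raw) {
  return (raw & 0x80) == 0;
}
```

For the byte 0xC8 the unsigned reading is 200 and the signed reading is -56, which is the 128..255 case raised above; any byte up to 0x7F reads the same either way.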
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
On Fri, Apr 11, 2014 at 11:51:44PM +0200, Samuel Thibault wrote:

It's indeed:

/* This function is called at program startup time to make sure that
   mmap, munmap, and getpagesize are resolved if linking dynamically.
   We want to resolve them while we have enough stack for them, rather
   than calling into the dynamic linker while low on stack space.  */

void
__morestack_load_mmap (void)
{
  /* Call with bogus values to run faster.  We don't care if the call
     fails.  Pass __MORESTACK_CURRENT_SEGMENT to make sure that any
     TLS accessor function is resolved.  */
  mmap (__morestack_current_segment, 0, PROT_READ, MAP_ANONYMOUS, -1, 0);
  mprotect (NULL, 0, 0);
  munmap (0, getpagesize ());
}

Yes...

So, do we really want to let munmap poke a hole at address 0 and thus let further vm_map() return address 0?

We probably don't.

AIUI, the first page is always mapped, and always with PROT_NONE to make sure null pointers are caught.

Considering other systems have predefined ranges depending on the mapping type, instead of blindly starting at the beginning of the map like vm_map() does, it's perfectly valid to unmap the first page, which is normally the right way to catch null pointers.

So, since we do want to catch null pointers, we do want to keep that first page, but only the first page. Or rather, a range large enough to catch accesses through null pointers, e.g. it could even be 64 or 128 KiB. We could alter glibc so that the first mapping has this special size, and have munmap override its given range to skip that area.

--
Richard Braun
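The "have munmap override its given range" idea can be sketched as a pure range computation. This is only an illustration under assumptions: the type and function names are hypothetical, the guard size is a parameter (64 KiB in the usage below), and nothing here is actual glibc or Mach code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type: the sub-range that would really be unmapped.
struct Range {
  std::uintptr_t start;
  std::size_t length;
};

// Clamp an unmap request so it never touches the guard area [0, guard_size).
// Returns a zero-length range when the request lies entirely inside the guard.
// Assumes addr + len does not overflow; a real implementation would check.
Range skip_guard_area(std::uintptr_t addr, std::size_t len, std::size_t guard_size) {
  std::uintptr_t end = addr + len;
  if (end <= guard_size)
    return Range{guard_size, 0};   // nothing left to unmap
  if (addr < guard_size)
    addr = guard_size;             // trim the part overlapping the guard
  return Range{addr, static_cast<std::size_t>(end - addr)};
}
```

With a 64 KiB guard, the munmap (0, getpagesize ()) call from __morestack_load_mmap would become a no-op, while requests above the guard pass through unchanged.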
[PATCH] Fix sanitizer tests to work under QEMU
Hi all,

Some ASan and UBSan pattern match tests fail under QEMU, because libsanitizer adds escape sequences which confuse Dejagnu (because it thinks that it is actually running under a tty).

This bug was already fixed for some tests (see http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00235.html and http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00319.html).

This patch fixes the problem for the newly added tests in the same way as before.

Tested on x86_64-unknown-linux-gnu and arm-linux-gnueabi (both ssh and qemu).

Ok to commit?

-Maxim

2014-04-15  Max Ostapenko  m.ostape...@partner.samsung.com

        * c-c++-common/asan/null-deref-1.c: Change regexp to pass test
        under QEMU.
        * c-c++-common/ubsan/div-by-zero-1.c: Likewise.
        * c-c++-common/ubsan/div-by-zero-2.c: Likewise.
        * c-c++-common/ubsan/div-by-zero-3.c: Likewise.
        * c-c++-common/ubsan/load-bool-enum.c (foo): Likewise.
        * c-c++-common/ubsan/null-1.c: Likewise.
        * c-c++-common/ubsan/null-10.c: Likewise.
        * c-c++-common/ubsan/null-11.c: Likewise.
        * c-c++-common/ubsan/null-2.c: Likewise.
        * c-c++-common/ubsan/null-3.c: Likewise.
        * c-c++-common/ubsan/null-4.c: Likewise.
        * c-c++-common/ubsan/null-5.c: Likewise.
        * c-c++-common/ubsan/null-6.c: Likewise.
        * c-c++-common/ubsan/null-7.c: Likewise.
        * c-c++-common/ubsan/null-8.c: Likewise.
        * c-c++-common/ubsan/null-9.c: Likewise.
        * c-c++-common/ubsan/overflow-add-2.c: Likewise.
        * c-c++-common/ubsan/overflow-int128.c: Likewise.
        * c-c++-common/ubsan/overflow-mul-2.c: Likewise.
        * c-c++-common/ubsan/overflow-mul-4.c: Likewise.
        * c-c++-common/ubsan/overflow-negate-1.c: Likewise.
        * c-c++-common/ubsan/overflow-sub-2.c: Likewise.
        * c-c++-common/ubsan/pr59333.c: Likewise.
        * c-c++-common/ubsan/pr59667.c: Likewise.
        * c-c++-common/ubsan/pr60613-2.c: Likewise.
        * c-c++-common/ubsan/pr60636.c: Likewise.
        * c-c++-common/ubsan/shift-1.c: Likewise.
        * c-c++-common/ubsan/shift-2.c: Likewise.
        * c-c++-common/ubsan/vla-1.c: Likewise.
diff --git a/gcc/testsuite/c-c++-common/asan/null-deref-1.c b/gcc/testsuite/c-c++-common/asan/null-deref-1.c index 6aea9d2..87c34c4 100644 --- a/gcc/testsuite/c-c++-common/asan/null-deref-1.c +++ b/gcc/testsuite/c-c++-common/asan/null-deref-1.c @@ -18,5 +18,5 @@ int main() /* { dg-output ERROR: AddressSanitizer:? SEGV on unknown address\[^\n\r]* } */ /* { dg-output 0x\[0-9a-f\]+ \[^\n\r]*pc 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r) } */ -/* { dg-output #0 0x\[0-9a-f\]+ (in \[^\n\r]*NullDeref\[^\n\r]* (\[^\n\r]*null-deref-1.c:10|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*#0 0x\[0-9a-f\]+ (in \[^\n\r]*NullDeref\[^\n\r]* (\[^\n\r]*null-deref-1.c:10|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r) } */ /* { dg-output #1 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*null-deref-1.c:15|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r) } */ diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c index ec391e4..479ced03 100644 --- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c +++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c @@ -17,8 +17,8 @@ main (void) return 0; } -/* { dg-output division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero } */ +/* { dg-output division by zero\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero\[^\n\r]* } */ diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c index c8820fa..d1eb95f 100644 --- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c +++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c @@ -16,8 +16,8 @@ 
main (void) return 0; } -/* { dg-output division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero } */ +/* { dg-output division by zero\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero\[^\n\r]*(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero\[^\n\r]* } */ diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c index 399071e..266423a 100644 --- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c +++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c @@ -16,6 +16,6 @@ main (void) return 0; } -/* { dg-output division of -2147483648 by -1 cannot be represented in type 'int'(\n|\r\n|\r) } */
Re: [PATCH] Fix sanitizer tests to work under QEMU
On Tue, Apr 15, 2014 at 10:49:48AM +0400, Maxim Ostapenko wrote:

Some ASan and UBSan pattern match tests fail under QEMU, because libsanitizer adds escape sequences which confuse Dejagnu (because it thinks that it is actually running under a tty).

This bug was already fixed for some tests (see http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00235.html and http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00319.html).

This patch fixes the problem for the newly added tests in the same way as before.

Tested on x86_64-unknown-linux-gnu and arm-linux-gnueabi (both ssh and qemu).

It is unfortunate there is no way to forcefully disable the colorization, e.g. through some env var. It would also be nice if there was a way to tell the library to colorize more like GCC colorizes its output rather than clang.

Ok to commit?

Ok for trunk.

Jakub
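The effect of the regexp change can be reproduced outside the testsuite. The sketch below uses std::regex as a stand-in for DejaGnu's Tcl matching, and it assumes the runtime emits a color reset sequence right after the message; both the helper names and the sample output line are illustrative only.

```cpp
#include <regex>
#include <string>

// Old style of pattern: the message must be followed directly by a newline.
bool matches_strict(const std::string &output) {
  static const std::regex re("division by zero(\\n|\\r\\n|\\r)");
  return std::regex_search(output, re);
}

// New style of pattern: any non-newline characters (such as terminal escape
// sequences) may sit between the message and the end of the line.
bool matches_tolerant(const std::string &output) {
  static const std::regex re("division by zero[^\\n\\r]*(\\n|\\r\\n|\\r)");
  return std::regex_search(output, re);
}
```

A plain line such as "runtime error: division by zero\n" satisfies both patterns, while the same line with "\033[0m" inserted before the newline only satisfies the tolerant one.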
Re: [PATCH] Fix sanitizer tests to work under QEMU
It is unfortunate there is no way to forcefully disable the colorization, e.g. through some env var. I think they have ASAN_OPTIONS=color=0 (or something like that) but ssh targets (still) don't support passing environment variables. -Y
Re: Fix indirect call profiling for COMDAT symbols
On Mon, 14 Apr 2014, Jakub Jelinek wrote:

On Mon, Apr 14, 2014 at 07:22:37PM +0200, Jan Hubicka wrote:

Hi, while looking into firefox profiles, I noticed that we miss devirtualizations to comdat symbols, because we manage to get a different profile_id in each unit. This is easily fixed by the following patch that makes profile_id be the crc32 of the symbol name in this case. Bootstrapped/regtested x86_64-linux, tested with firefox, will commit it tomorrow.

This is the version I committed, with the minor change of using the coverage checksum, because I think some anonymous namespace functions do get exported with a random seed attached in a rare corner case.

To answer Martin's question, I am just removing the tp_first_run merging code because it is done twice, the second time in ipa_merge_profiles.

Jakub/Richi: I would like to see this in 4.9 (it is a missed optimization compared to 4.8). Let me know when it is OK to commit it (perhaps minus the lto-symtab.c change that is not necessary).

For a missed optimization at this point I'd prefer not to put it into 4.9.0 GA; for whether it is ok for 4.9.1 I'll defer to Richi.

If it doesn't cause issues on trunk it's ok for 4.9.1.

Richard.
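The reason a name-derived id helps is that every translation unit computes the same value for the same COMDAT symbol without any coordination. A minimal sketch of that idea follows; it uses the common reflected CRC-32 polynomial purely for illustration, the function name is hypothetical, and it is not the checksum routine GCC itself uses (the committed version uses the coverage checksum, as noted above).

```cpp
#include <cstdint>
#include <string>

// Illustrative only: derive a stable 32-bit id from a symbol name using the
// widely used reflected CRC-32 (polynomial 0xEDB88320). Identical names give
// identical ids in every translation unit.
std::uint32_t crc32_of_name(const std::string &name) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : name) {
    crc ^= byte;
    for (int bit = 0; bit < 8; ++bit) {
      // mask is all ones when the low bit is set, zero otherwise.
      std::uint32_t mask = 0u - (crc & 1u);
      crc = (crc >> 1) ^ (0xEDB88320u & mask);
    }
  }
  return ~crc;
}
```

Two units that both emit, say, the mangled name of the same inline member function would therefore agree on its id, which is what the indirect call profile needs in order to match targets across units.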
Re: [PATCH] Fix sanitizer tests to work under QEMU
Thanks! Done in r209402. -Maxim
[PATCH] Install some more headers for plugins (PR plugins/59335)
Hi! This patch installs some headers that were newly added in 4.9 (other than cilk/vtable-verify/omp/ubsan headers), which might be needed by some plugins. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk/4.9? 2014-04-15 Jakub Jelinek ja...@redhat.com PR plugins/59335 * Makefile.in (PLUGIN_HEADERS): Add various headers that have been added in 4.9. * Make-lang.h (CP_PLUGIN_HEADERS): Add type-utils.h. --- gcc/Makefile.in.jj 2014-04-08 08:59:45.0 +0200 +++ gcc/Makefile.in 2014-04-14 11:53:51.676369581 +0200 @@ -3125,7 +3125,14 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $ version.h stringpool.h gimplify.h gimple-iterator.h gimple-ssa.h \ fold-const.h tree-cfg.h tree-into-ssa.h tree-ssanames.h print-tree.h \ varasm.h context.h tree-phinodes.h stor-layout.h ssa-iterators.h \ - $(RESOURCE_H) tree-cfgcleanup.h + $(RESOURCE_H) tree-cfgcleanup.h attribs.h calls.h cfgexpand.h \ + diagnostic-color.h gcc-symtab.h gimple-builder.h gimple-low.h \ + gimple-walk.h gimplify-me.h pass_manager.h print-rtl.h stmt.h \ + tree-dfa.h tree-hasher.h tree-nested.h tree-object-size.h tree-outof-ssa.h \ + tree-parloops.h tree-ssa-address.h tree-ssa-coalesce.h tree-ssa-dom.h \ + tree-ssa-loop.h tree-ssa-loop-ivopts.h tree-ssa-loop-manip.h \ + tree-ssa-loop-niter.h tree-ssa-ter.h tree-ssa-threadedge.h \ + tree-ssa-threadupdate.h # generate the 'build fragment' b-header-vars s-header-vars: Makefile --- gcc/cp/Make-lang.in.jj 2014-03-10 10:50:14.0 +0100 +++ gcc/cp/Make-lang.in 2014-04-14 11:55:24.992880591 +0200 @@ -39,7 +39,7 @@ CXX_INSTALL_NAME := $(shell echo c++|sed GXX_INSTALL_NAME := $(shell echo g++|sed '$(program_transform_name)') CXX_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo c++|sed '$(program_transform_name)') GXX_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo g++|sed '$(program_transform_name)') -CP_PLUGIN_HEADERS := cp-tree.h cxx-pretty-print.h name-lookup.h +CP_PLUGIN_HEADERS := cp-tree.h cxx-pretty-print.h name-lookup.h type-utils.h # # 
Define the names for selecting c++ in LANGUAGES. Jakub
Re: Fix PR/60820
On Mon, 14 Apr 2014, Jan Hubicka wrote:

Hi, this patch fixes an ICE in ctor_for_folding where varpool_remove_node incorrectly clobbers DECL_INITIAL of a variable while removing cgraph during the early LTO merging. This case is special by allowing multiple symtab nodes for a given declaration, and we have a similar special case in the other removal hooks. I did not manage to get a reliable testcase for the testsuite, but it is an ICE on valid code, too.

Bootstrapped/regtested x86_64-linux, committed to mainline. Jakub/Richi, OK for trunk?

Ok for 4.9.1. Please work harder for the testcase - there is one in the PR after all that just needs proper reduction. Maybe you can ask Markus to reduce it.

Thanks, Richard.

        PR lto/60820
        * varpool.c (varpool_remove_node): Do not alter decls when streaming.

Index: varpool.c
===================================================================
--- varpool.c (revision 209386)
+++ varpool.c (working copy)
@@ -166,7 +166,9 @@ varpool_remove_node (varpool_node *node)
   /* Because we remove references from external functions before final compilation,
      we may end up removing useful constructors.
      FIXME: We probably want to trace boundaries better.  */
-  if ((init = ctor_for_folding (node->decl)) == error_mark_node)
+  if (cgraph_state == CGRAPH_LTO_STREAMING)
+    ;
+  else if ((init = ctor_for_folding (node->decl)) == error_mark_node)
     varpool_remove_initializer (node);
   else
     DECL_INITIAL (node->decl) = init;
Re: [PATCH] Install some more headers for plugins (PR plugins/59335)
On Tue, 15 Apr 2014, Jakub Jelinek wrote: Hi! This patch installs some headers that were newly added in 4.9 (other than cilk/vtable-verify/omp/ubsan headers), which might be needed by some plugins. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk/4.9? Ok. Thanks, Richard. 2014-04-15 Jakub Jelinek ja...@redhat.com PR plugins/59335 * Makefile.in (PLUGIN_HEADERS): Add various headers that have been added in 4.9. * Make-lang.h (CP_PLUGIN_HEADERS): Add type-utils.h. --- gcc/Makefile.in.jj2014-04-08 08:59:45.0 +0200 +++ gcc/Makefile.in 2014-04-14 11:53:51.676369581 +0200 @@ -3125,7 +3125,14 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $ version.h stringpool.h gimplify.h gimple-iterator.h gimple-ssa.h \ fold-const.h tree-cfg.h tree-into-ssa.h tree-ssanames.h print-tree.h \ varasm.h context.h tree-phinodes.h stor-layout.h ssa-iterators.h \ - $(RESOURCE_H) tree-cfgcleanup.h + $(RESOURCE_H) tree-cfgcleanup.h attribs.h calls.h cfgexpand.h \ + diagnostic-color.h gcc-symtab.h gimple-builder.h gimple-low.h \ + gimple-walk.h gimplify-me.h pass_manager.h print-rtl.h stmt.h \ + tree-dfa.h tree-hasher.h tree-nested.h tree-object-size.h tree-outof-ssa.h \ + tree-parloops.h tree-ssa-address.h tree-ssa-coalesce.h tree-ssa-dom.h \ + tree-ssa-loop.h tree-ssa-loop-ivopts.h tree-ssa-loop-manip.h \ + tree-ssa-loop-niter.h tree-ssa-ter.h tree-ssa-threadedge.h \ + tree-ssa-threadupdate.h # generate the 'build fragment' b-header-vars s-header-vars: Makefile --- gcc/cp/Make-lang.in.jj2014-03-10 10:50:14.0 +0100 +++ gcc/cp/Make-lang.in 2014-04-14 11:55:24.992880591 +0200 @@ -39,7 +39,7 @@ CXX_INSTALL_NAME := $(shell echo c++|sed GXX_INSTALL_NAME := $(shell echo g++|sed '$(program_transform_name)') CXX_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo c++|sed '$(program_transform_name)') GXX_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo g++|sed '$(program_transform_name)') -CP_PLUGIN_HEADERS := cp-tree.h cxx-pretty-print.h name-lookup.h +CP_PLUGIN_HEADERS := 
cp-tree.h cxx-pretty-print.h name-lookup.h type-utils.h # # Define the names for selecting c++ in LANGUAGES. Jakub
Re: [patch] Add support for pragma Loop_Optimize ([No_]Vector)
On Mon, Apr 14, 2014 at 4:45 PM, Eric Botcazou ebotca...@adacore.com wrote: Hi, this adds support for 2 optimization hints pertaining to loops in Ada, namely Loop_Optimize (No_Vector) and Loop_Optimize (Vector), by reusing the Ivdep approach in the middle-end (ANNOTATE_EXPR node) and directly setting the dont_vectorize and force_vectorize bits of the 'loop' structure. Tested on x86_64-suse-linux, OK for the mainline? The loop flags copying should go into copy_loop_info instead of only to copy_loops. Jakub - I see you remap simduid on copy - you have to do sth in copy_loop_info instead I suppose. See the other callers. Otherwise I'm fine with this patch. Thanks, Richard. 2014-04-14 Eric Botcazou ebotca...@adacore.com * cfgloop.h (struct loop): Move force_vectorize down. * gimplify.c (gimple_boolify) ANNOTATE_EXPR: Handle new kinds. (gimplify_expr) ANNOTATE_EXPR: Minor tweak. * lto-streamer-in.c (input_cfg): Read dont_vectorize field. * lto-streamer-out.c (output_cfg): Write dont_vectorize field. * tree-cfg.c (replace_loop_annotate): Revamp and handle new kinds. * tree-core.h (enum annot_expr_kind): Add new kind values. * tree-inline.c (copy_loops): Copy dont_vectorize field and reorder. * tree-pretty-print.c (dump_generic_node) ANNOTATE_EXPR: Handle new kinds. * tree.def (ANNOTATE_EXPR): Tweak comment. ada/ * gcc-interface/trans.c (gnat_gimplify_stmt): Propagate loop hints. 2014-04-14 Eric Botcazou ebotca...@adacore.com * gnat.dg/vect12.ad[sb]: New test. * gnat.dg/vect13.ad[sb]: Likewise. -- Eric Botcazou
Re: [PATCH] [CLEANUP] Wrap locally-used functions in anonymous namespaces
On Mon, Apr 14, 2014 at 4:51 PM, Patrick Palka patr...@parcs.ath.cx wrote: Hi everyone, This patch wraps a bunch of locally-used, non-debug functions in an anonymous namespace. These functions can't simply be marked as static because they are used as template arguments to hash_table::traverse, and the C++98 standard does not allow non-extern variables to be used as template arguments. The next best thing to marking them static is to define each of these functions inside an anonymous namespace. Hum, the formatting used looks super-ugly. I suppose a local visibility attribute would work as well? (well, what's the goal of the patch?) Thanks, Richard. I bootstrapped and regtested this change on x86_64-unknown-linux-gnu. 2014-04-11 Patrick Palka patr...@parcs.ath.cx * alloc-pool.c (print_alloc_pool_statistics): Wrap in an anonymous namespace. * bitmap.c (print_statistics): Likewise. * cselib.c (preserve_constants_and_equivs, discard_useless_locs, discard_useless_values, dump_cselib_val): Likewise. * dwarf2out.c (dwarf2_build_local_stub): Likewise. * ggc-common.c (ggc_call_count, ggc_call_alloc, ggc_prune_ptr, ggc_add_statistics): Likewise. * gimple-ssa-strength-reduction.c (ssa_base_cand_dump_callback): Likewise. * haifa-sched.c (haifa_htab_i{1,2}_traverse): Likewise. * passes.c (passes_pass_traverse): Likewise. * postreload-gcse.c (dump_expr_hash_table_entry, delete_redundant_insns_1): Likewise. * sese.c (debug_rename_map_1): Likewise. * statistics.c (statistics_fini_pass_{1,2,3}, statistics_fini_1): Likewise * tree-into-ssa.c (debug_var_infos_r): Likewise. * tree-parloops.c (initialize_reductions, add_field_for_reduction, add_field_for_name, create_phi_for_local_result, create_call_for_reduction_1, create_loads_for_reductions, create_stores_for_reduction, create_loads_and_stores_for_name, set_reduc_phi_uids): Likewise. * tree-ssa-threadupdate.c (ssa_create_duplicates, ssa_redirect_edges): Likewise. 
* var-tracking.c (drop_overlapping_mem_locs, canonicalize_loc_order_check, canonicalize_values_mark, canonicalize_values_star, canonicalize_vars_star, variable_post_merge_new_vals, variable_post_merge_perm_vals, dataflow_set_preserve_mem_locs, dataflow_set_remove_mem_locs, dump_var_tracking_slot, emit_not_insn_var_location, var_track_values_to_stack, emit_notes_for_differences_{1,2}): Likewise. --- gcc/alloc-pool.c| 2 ++ gcc/bitmap.c| 2 ++ gcc/cselib.c| 8 gcc/dwarf2out.c | 2 ++ gcc/ggc-common.c| 9 + gcc/gimple-ssa-strength-reduction.c | 2 ++ gcc/haifa-sched.c | 4 gcc/passes.c| 2 ++ gcc/postreload-gcse.c | 4 gcc/sese.c | 4 +++- gcc/statistics.c| 8 gcc/tree-into-ssa.c | 2 ++ gcc/tree-parloops.c | 18 ++ gcc/tree-ssa-threadupdate.c | 4 gcc/var-tracking.c | 29 + 15 files changed, 99 insertions(+), 1 deletion(-) diff --git a/gcc/alloc-pool.c b/gcc/alloc-pool.c index dfb13ce..999032d 100644 --- a/gcc/alloc-pool.c +++ b/gcc/alloc-pool.c @@ -376,6 +376,7 @@ struct output_info /* Called via hash_table.traverse. Output alloc_pool descriptor pointed out by SLOT and update statistics. */ +namespace { int print_alloc_pool_statistics (alloc_pool_descriptor **slot, struct output_info *i) @@ -394,6 +395,7 @@ print_alloc_pool_statistics (alloc_pool_descriptor **slot, } return 1; } +} /* Output per-alloc_pool memory usage statistics. */ void diff --git a/gcc/bitmap.c b/gcc/bitmap.c index 4855a66..62696b1 100644 --- a/gcc/bitmap.c +++ b/gcc/bitmap.c @@ -2150,6 +2150,7 @@ struct output_info /* Called via hash_table::traverse. Output bitmap descriptor pointed out by SLOT and update statistics. */ +namespace { int print_statistics (bitmap_descriptor_d **slot, output_info *i) { @@ -2177,6 +2178,7 @@ print_statistics (bitmap_descriptor_d **slot, output_info *i) } return 1; } +} /* Output per-bitmap memory usage statistics. 
*/ void diff --git a/gcc/cselib.c b/gcc/cselib.c index 7918b2b..0633965 100644 --- a/gcc/cselib.c +++ b/gcc/cselib.c @@ -491,6 +491,7 @@ invariant_or_equiv_p (cselib_val *v) /* Remove from hash table all VALUEs except constants, function invariants and VALUE equivalences. */ +namespace { int preserve_constants_and_equivs (cselib_val **x, void *info ATTRIBUTE_UNUSED) { @@ -509,6
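The constraint described in this thread, that a callback passed as a template argument cannot simply be marked static under C++98, can be shown with a small stand-alone sketch. The traverse template below is a hypothetical simplification and not GCC's hash_table::traverse; it only mirrors the shape of a callback supplied as a non-type template argument.

```cpp
#include <cstddef>

namespace {
// Callback that is private to this translation unit. Under C++98 a function
// in an anonymous namespace still has external linkage, so it remains usable
// as a template argument, unlike a function declared static.
int add_slot(int *slot, long *total) {
  *total += *slot;
  return 1;  // non-zero means "keep traversing"
}
}  // anonymous namespace

// Hypothetical stand-in for a traverse routine that takes its callback as a
// non-type template argument.
template <typename Arg, int (*Callback)(int *, Arg)>
void traverse(int *slots, std::size_t count, Arg arg) {
  for (std::size_t i = 0; i < count; ++i)
    if (!Callback(&slots[i], arg))
      break;
}

long sum_slots(int *slots, std::size_t count) {
  long total = 0;
  traverse<long *, add_slot>(slots, count, &total);
  return total;
}
```

Newer language standards relax the linkage rule for template arguments, so the same code also builds there; the anonymous namespace matters only for the C++98 case the patch description refers to.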
Re: [PATCH] [CLEANUP] Declare global functions before defining them
On Mon, Apr 14, 2014 at 4:52 PM, Patrick Palka patr...@parcs.ath.cx wrote: Hi everyone, Many source files currently define a global function that is not previously declared within that source file because the source file did not include the appropriate header file that declares said function. This patch fixes a number of these occurrences by making sure to include the appropriate header file within the offending source files. Bootstrapped and regtested on x86_64-unknown-linux-gnu. How did you find these? (in the C bootstrap times -Wstrict-prototypes did that) Thanks, Richard. 2014-04-11 Patrick Palka patr...@parcs.ath.cx gcc/ * builtins.c: Include targhooks.h. * calls.c: Include calls.h. * cfgexpand.c: Include cfgexpand.h. * cfgloop.c: Include tree-ssa-loop-niter.h. * gimple-builder.c: Include gimple-builder.h. * input.c: Include diagnostic.h. * print-tree.c: Include print-tree.h. * stringpool.c: Include stringpool.h. * tree-cfgcleanup.c: Include tree-cfgcleanup.h. * tree-nested.c: Include tree-nested.h. libiberty/ * setproctitle.c: Include libiberty.h. * stack-limit.c: Likewise. --- gcc/builtins.c | 1 + gcc/calls.c | 1 + gcc/cfgexpand.c | 1 + gcc/cfgloop.c| 1 + gcc/gimple-builder.c | 1 + gcc/input.c | 1 + gcc/print-tree.c | 1 + gcc/stmt.c | 1 + gcc/stringpool.c | 1 + gcc/tree-cfgcleanup.c| 1 + gcc/tree-nested.c| 1 + libiberty/setproctitle.c | 1 + libiberty/stack-limit.c | 1 + 13 files changed, 13 insertions(+) diff --git a/gcc/builtins.c b/gcc/builtins.c index dd57b1a..e057554 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -59,6 +59,7 @@ along with GCC; see the file COPYING3. If not see #include builtins.h #include ubsan.h #include cilk.h +#include targhooks.h static tree do_mpc_arg1 (tree, tree, int (*)(mpc_ptr, mpc_srcptr, mpc_rnd_t)); diff --git a/gcc/calls.c b/gcc/calls.c index f0c92dd..ecb5c00 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3. 
If not see #include cgraph.h #include except.h #include dbgcnt.h +#include calls.h /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits. */ #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT) diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index b7f6360..0c5fafb 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -73,6 +73,7 @@ along with GCC; see the file COPYING3. If not see #include tree-ssa-address.h #include recog.h #include output.h +#include cfgexpand.h /* Some systems use __main in a way incompatible with its use in gcc, in these cases use the macros NAME__MAIN to give a quoted symbol and SYMBOL__MAIN to diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c index 70744d8..63e680f 100644 --- a/gcc/cfgloop.c +++ b/gcc/cfgloop.c @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. If not see #include gimple-iterator.h #include gimple-ssa.h #include dumpfile.h +#include tree-ssa-loop-niter.h static void flow_loops_cfg_dump (FILE *); diff --git a/gcc/gimple-builder.c b/gcc/gimple-builder.c index ba4be26..2f66f0e 100644 --- a/gcc/gimple-builder.c +++ b/gcc/gimple-builder.c @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. If not see #include is-a.h #include gimple.h #include tree-ssanames.h +#include gimple-builder.h /* Return the expression type to use based on the CODE and type of diff --git a/gcc/input.c b/gcc/input.c index 63cd062..3aacb32 100644 --- a/gcc/input.c +++ b/gcc/input.c @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3. If not see #include intl.h #include input.h #include vec.h +#include diagnostic.h /* This is a cache used by get_next_line to store the content of a file to be searched for file lines. */ diff --git a/gcc/print-tree.c b/gcc/print-tree.c index 91b696c..6b9f2bd 100644 --- a/gcc/print-tree.c +++ b/gcc/print-tree.c @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. 
If not see #include tree-cfg.h #include tree-dump.h #include dumpfile.h +#include print-tree.h /* Define the hash table of nodes already seen. Such nodes are not repeated; brief cross-references are used. */ diff --git a/gcc/stmt.c b/gcc/stmt.c index 5d68edb..6bf5f06 100644 --- a/gcc/stmt.c +++ b/gcc/stmt.c @@ -59,6 +59,7 @@ along with GCC; see the file COPYING3. If not see #include pretty-print.h #include params.h #include dumpfile.h +#include stmt.h /* Functions and data structures for expanding case statements. */ diff --git a/gcc/stringpool.c b/gcc/stringpool.c index 4b6900c..e94d741 100644 --- a/gcc/stringpool.c +++ b/gcc/stringpool.c @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3. If not see #include tree.h #include symtab.h #include cpplib.h
Re: Optimize n?rotate(x,n):x
On Mon, Apr 14, 2014 at 6:40 PM, Marc Glisse marc.gli...@inria.fr wrote:

On Mon, 14 Apr 2014, Richard Biener wrote:

+ /* If the special case has a high probability, keep it. */
+ if (EDGE_PRED (middle_bb, 0)->probability < PROB_EVEN)

I suppose Honza has a comment on how to test this properly (not sure if ->probability or ->frequency is always initialized properly). For example single_likely_edge tests profile_status_for_fn != PROFILE_ABSENT (and uses a fixed probability value ...). Anyway, the comparison looks backwards to me, but maybe I'm missing sth - I'd use >= PROB_LIKELY ;)

Maybe the comment is confusing? middle_bb contains the expensive operation (say a/b) that the special case skips entirely. If the division happens in less than 50% of cases (that's the probability of the edge going from cond to middle_bb), then doing the comparison+jump may be cheaper and I abort the optimization. At least the testcase with __builtin_expect should prove that I didn't do it backwards.

Ah, indeed. My mistake.

value-prof seems to use 50% as the cut-off where it may become interesting to special case division, hence my choice of PROB_EVEN. I am not sure which way you want to use PROB_LIKELY (80%). If we have more than 80% chances of executing the division, always perform it? Or if we have more than 80% chances of skipping the division, keep the branch?

Ok, if it's from value-prof then that's fine. The patch is ok if Honza doesn't have any comments on whether it's ok to look at ->probability unconditionally.

Thanks, Richard.

Attached is the latest version (passed the testsuite).
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (working copy) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-phiopt1 } */ + +int f(int a, int b, int c) { + if (c 5) return c; + if (a == 0) return b; + return a + b; +} + +unsigned rot(unsigned x, int n) { + const int bits = __CHAR_BIT__ * __SIZEOF_INT__; + return (n == 0) ? x : ((x n) | (x (bits - n))); +} + +unsigned m(unsigned a, unsigned b) { + if (a == 0) +return 0; + else +return a b; +} + +/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */ +/* { dg-final { cleanup-tree-dump phiopt1 } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (working copy) @@ -0,0 +1,19 @@ +/* { dg-do compile { target x86_64-*-* } } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +int f(int a, int b) { + if (__builtin_expect(a == 0, 1)) return b; + return a + b; +} + +// optab_handler can handle if(b==1) but not a/b +// so we consider a/b too expensive. 
+unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) { + if (b == 1) +return a; + else +return a / b; +} + +/* { dg-final { scan-tree-dump-times goto 4 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 209353) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -140,20 +140,37 @@ static bool gate_hoist_loads (void); x = PHI (CONST, a) Gets replaced with: bb0: bb2: t1 = a == CONST; t2 = b c; t3 = t1 t2; x = a; + + It also replaces + + bb0: + if (a != 0) goto bb1; else goto bb2; + bb1: + c = a + b; + bb2: + x = PHI c (bb1), b (bb0), ...; + + with + + bb0: + c = a + b; + bb2: + x = PHI c (bb0), ...; + ABS Replacement --- This transformation, implemented in abs_replacement, replaces bb0: if (a = 0) goto bb2; else goto bb1; bb1: x = -a; bb2: @@ -809,20 +826,103 @@ operand_equal_for_value_replacement (con if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp)) return true; tmp = gimple_assign_rhs2 (def); if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp)) return true; return false; } +/* Returns true if ARG is a neutral element for operation CODE + on the RIGHT side. */ + +static bool +neutral_element_p (tree_code code, tree arg, bool right) +{ + switch (code) +{ +case PLUS_EXPR: +case BIT_IOR_EXPR: +case BIT_XOR_EXPR: + return integer_zerop (arg); + +case LROTATE_EXPR: +case RROTATE_EXPR: +case LSHIFT_EXPR: +case RSHIFT_EXPR: +case MINUS_EXPR: +case POINTER_PLUS_EXPR: + return right integer_zerop (arg); + +case MULT_EXPR: + return integer_onep (arg); + +case
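The transformation discussed in this thread rests on a simple observation: when the tested value is a neutral element of the operation, the guarded and unguarded forms compute the same result, so the branch can go. The sketch below shows that equivalence for addition and a toy version of the neutral-element test. The enum and function names are hypothetical and only loosely follow the structure of neutral_element_p in the patch.

```cpp
// Guarded form: skip the addition when a is zero.
int with_branch(int a, int b) {
  return a == 0 ? b : a + b;
}

// Unguarded form: since 0 is neutral for +, the result is identical.
int without_branch(int a, int b) {
  return a + b;
}

// Toy operation set for illustration; not GCC's tree codes.
enum class Op { Plus, Mult, Minus, ShiftLeft };

// Returns true if `value` leaves the other operand unchanged under `op`.
// For non-commutative operations this only holds on the right-hand side.
bool is_neutral(Op op, long value, bool on_right) {
  switch (op) {
    case Op::Plus:
      return value == 0;
    case Op::Mult:
      return value == 1;
    case Op::Minus:
    case Op::ShiftLeft:
      return on_right && value == 0;
  }
  return false;
}
```

Whether dropping the branch pays off is a separate question, which is what the probability check debated above decides: if the expensive operation is usually skipped, keeping the comparison and jump may be cheaper.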
Re: [PATCH] [CLEANUP] Mark locally-used functions static
On Mon, Apr 14, 2014 at 4:48 PM, Patrick Palka patr...@parcs.ath.cx wrote: Hi everyone, This patch marks static a bunch of locally-used, non-debug functions within the GCC sources. Doing so addresses a subset of the warnings emitted when compiling the GCC sources with -Wmissing-declarations. I bootstrapped and regtested this change on x86_64-unknown-linux-gnu. The gcc/ parts are ok. Thanks, Richard. 2014-04-13 Patrick Palka patr...@parcs.ath.cx gcc/c/ * c-array-notation.c (replace_invariant_exprs): Make static. gcc/cp/ * class.c (inherit_targ_abi_tags): Make static. * cp-array-notation.c (create_cmp_incr): Likewise. * mangle.c (tree_string_cmp): Likewise. * pt.c (fixed_parameter_pack_p): Likewise. * vtable-class-hierarchy.c (vtv_register_class_hierarchy_information): Likewise. gcc/fortran/ * class.c (gfc_intrinsic_hash_value): Make static. * trans-expr.c (gfc_conv_intrinsic_to_class): Likewise. gcc/ * asan.c (asan_mem_ref_get_end, replace_invariant_exprs): Make static. * cgraphclones.c (redirect_edge_duplicating_thunks): Likewise. * cgraphunit.c (decide_is_symbol_needed): Likewise. * config/i386/i386.c (make_pass_insert_vzeroupper, ix86_avx_emit_vzeroupper): Likewise. * dwarf2out.c (init_addr_table_entry): Likewise. * gengtype.c (get_string_option, already_seen_tag, mark_tag_as_seen): Likewise. * gimple-ssa-isolate-paths (isolate_path): Likewise. * graphite.c (graphite_transform_loops): Likewise. * internal-fn.c (ubsan_expand_si_overflow_{addsub,neg,mul}_check): Likewise. * ipa-devirt.c (hash_type_name, likely_target_p): Likewise. * ipa-inline-analysis.c (simple_edge_hints): Likewise. * ipa-profile.c (cmp_counts, contains_hot_call_p): Likewise. * ipa-prop.c (ipa_alloc_node_params, write_agg_replacement_chain): Likewise. * ipa.c (can_replace_by_local_alias): Likewise. * lto-streamer-out.c (output_spmbol_p): Likewise. * omp-low.c (simd_clone_vector_of_formal_parm_types): Likewise. * tree-inline.c (redirect_all_calls, freqs_to_counts): Likewise. 
* tree-predcom.c (tree_predictive_commoning): Likewise. * tree-sra.c (ipa_sra_modify_function_body): Likewise. * tree-ssa-loop-im.c (movement_possibilyt, tree_ssa_lim): Likewise. * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables, tree_unroll_loops_completely): Likewise. * tree-ssa-loop-prefetch.c (tree_ssa_prefetch_arrays): Likewise. * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Likewise. * tree-ssa-threadupdate.c (ssa_fix_duplicate_block_edges): Likewise. * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): Likewise. * ubsan.c (tree_type_map_hash): Likewise. * varpool.c (varpool_call_variable_insertion_hooks): Likewise. libiberty/ * make-temp-file.c (choose_tmpdir): Make static. --- gcc/asan.c | 4 ++-- gcc/c/c-array-notation.c| 2 +- gcc/cgraphclones.c | 2 +- gcc/cgraphunit.c| 2 +- gcc/config/i386/i386.c | 4 ++-- gcc/cp/class.c | 2 +- gcc/cp/cp-array-notation.c | 2 +- gcc/cp/mangle.c | 2 +- gcc/cp/pt.c | 2 +- gcc/cp/vtable-class-hierarchy.c | 2 +- gcc/dwarf2out.c | 2 +- gcc/fortran/class.c | 2 +- gcc/fortran/trans-expr.c| 2 +- gcc/gengtype.c | 6 +++--- gcc/gimple-ssa-isolate-paths.c | 2 +- gcc/graphite.c | 2 +- gcc/internal-fn.c | 6 +++--- gcc/ipa-devirt.c| 4 ++-- gcc/ipa-inline-analysis.c | 2 +- gcc/ipa-profile.c | 4 ++-- gcc/ipa-prop.c | 4 ++-- gcc/ipa.c | 2 +- gcc/lto-streamer-out.c | 2 +- gcc/omp-low.c | 2 +- gcc/tree-inline.c | 4 ++-- gcc/tree-predcom.c | 2 +- gcc/tree-sra.c | 2 +- gcc/tree-ssa-loop-im.c | 4 ++-- gcc/tree-ssa-loop-ivcanon.c | 4 ++-- gcc/tree-ssa-loop-prefetch.c| 2 +- gcc/tree-ssa-loop-unswitch.c| 2 +- gcc/tree-ssa-threadupdate.c | 2 +- gcc/tree-vect-loop-manip.c | 2 +- gcc/ubsan.c | 2 +- gcc/varpool.c | 2 +- libiberty/make-temp-file.c | 2 +- 36 files changed, 48 insertions(+), 48 deletions(-) diff --git a/gcc/asan.c b/gcc/asan.c index 53992a8..de4058a 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -299,7 +299,7 @@ asan_mem_ref_new (tree start, char access_size) /* This builds and returns a pointer to the end of the 
memory region that starts at START and of length LEN. */ -tree
[PATCH] Minor cleanups
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-04-15 Richard Biener rguent...@suse.de

    * tree.c (iterative_hash_expr): Use enum tree_code_class to store TREE_CODE_CLASS.
    (tree_block): Likewise.
    (tree_set_block): Likewise.
    * tree.h (fold_build_pointer_plus_loc): Use convert_to_ptrofftype_loc.

Index: gcc/tree.c
===================================================================
--- gcc/tree.c (revision 209374)
+++ gcc/tree.c (working copy)
@@ -7387,7 +7387,7 @@ iterative_hash_expr (const_tree t, hashv
 {
   int i;
   enum tree_code code;
-  char tclass;
+  enum tree_code_class tclass;
 
   if (t == NULL_TREE)
     return iterative_hash_hashval_t (0, val);
@@ -11235,7 +11235,7 @@ walk_tree_without_duplicates_1 (tree *tp
 tree
 tree_block (tree t)
 {
-  char const c = TREE_CODE_CLASS (TREE_CODE (t));
+  const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
 
   if (IS_EXPR_CODE_CLASS (c))
     return LOCATION_BLOCK (t->exp.locus);
@@ -11246,7 +11246,7 @@ tree_block (tree t)
 void
 tree_set_block (tree t, tree b)
 {
-  char const c = TREE_CODE_CLASS (TREE_CODE (t));
+  const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
 
   if (IS_EXPR_CODE_CLASS (c))
     {
Index: gcc/tree.h
===================================================================
--- gcc/tree.h (revision 209374)
+++ gcc/tree.h (working copy)
@@ -4187,7 +4187,7 @@ static inline tree
 fold_build_pointer_plus_loc (location_t loc, tree ptr, tree off)
 {
   return fold_build2_loc (loc, POINTER_PLUS_EXPR, TREE_TYPE (ptr),
-                          ptr, fold_convert_loc (loc, sizetype, off));
+                          ptr, convert_to_ptrofftype_loc (loc, off));
 }
 #define fold_build_pointer_plus(p,o) \
         fold_build_pointer_plus_loc (UNKNOWN_LOCATION, p, o)
Re: [patch] Add support for pragma Loop_Optimize ([No_]Vector)
> The loop flags copying should go into copy_loop_info instead of only to copy_loops. Jakub - I see you remap simduid on copy - you have to do something in copy_loop_info instead I suppose. See the other callers.

That also occurred to me, but IMO it's not crystal clear; for example, ivdep (aka safelen) is not in copy_loop_info either. So I think this needs to be further discussed.

> Otherwise I'm fine with this patch.

Thanks, I have applied it as-is for now.

-- 
Eric Botcazou
Re: [patch] Disable if_conversion2 for Og
On Tue, Apr 15, 2014 at 3:59 AM, Joey Ye joey...@arm.com wrote:
> If-conversion is harmful to optimized debugging as it generates conditional execution instructions with line number information, which gives developers the illusion that both the then and else branches are executed. For example:
>
> test.c:
> 1: unsigned oldest_sequence;
> 2:
> 3: unsigned foo(unsigned seq_number)
> 4: {
> 5:   if ((seq_number + 5) < 10)
> 6:     seq_number += 100;
> 7:   else
> 8:     seq_number = oldest_sequence;
>
>      if (seq_number oldest_sequence)
>        seq_number = oldest_sequence;
>
>      return seq_number;
>    }
>
> $ arm-none-eabi-gcc -mthumb -mcpu=cortex-m3 -Og -g3
>
> gets:
>
>         .loc 1 5 0
>         adds    r3, r0, #5
>         cmp     r3, #9
>         .loc 1 6 0              <- line 6, then branch
>         itee    ls
>         addls   r0, r0, #100
> .LVL1:
>         .loc 1 8 0              <- line 8, else branch. Both branches seem to be executed in GDB
>         ldrhi   r3, .L5
>         ldrhi   r0, [r3]
>
> The reason is that if_conversion2 is still enabled in Og. The patch simply disables it for Og.
>
> Tests:
> * -Og bootstrap passed.
> * Make check default (no additional option): No regression.
> * Make check with -Og: expected regressions. Cases relying on if-conversion2 failed.
> FAIL: gcc.target/arm/its.c scan-assembler-times \\tit 2
> FAIL: gcc.target/arm/pr40956.c scan-assembler-times mov[t ]*r., #0 1
> FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler asreq
> FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler lslne
> FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler asrne
> FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler lslne
>
> OK to trunk and 4.8/4.9 branch?

Ok for trunk and branches after a while.

Why does if-conversion not have the same problem? On the GIMPLE part we avoid all kinds of if-conversion with -Og.

Thanks,
Richard.

> ChangeLog:
> * opts.c (OPT_fif_conversion2): Disable for Og.
>
> diff --git a/gcc/opts.c b/gcc/opts.c
> index fdc903f..e076253 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -432,7 +432,7 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_1_PLUS, OPT_fcprop_registers, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fforward_propagate, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fif_conversion, NULL, 1 },
> -    { OPT_LEVELS_1_PLUS, OPT_fif_conversion2, NULL, 1 },
> +    { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion2, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_pure_const, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_reference, NULL, 1 },
>      { OPT_LEVELS_1_PLUS, OPT_fipa_profile, NULL, 1 },
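[Editorial sketch, not part of the thread.] The one-line change above moves -fif-conversion2 from the "level 1 and up" class to the "level 1 and up, but not -Og" class. The toy table below illustrates that idea only; the enum, struct and function names are made up for this example and are not GCC's real definitions, and the assumption that -Og behaves as level 1 plus a debug flag is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the OPT_LEVELS_* classes in the patch. */
enum level_class { LEVELS_1_PLUS, LEVELS_1_PLUS_NOT_DEBUG };

struct default_opt {
    enum level_class levels;
    const char *name;
};

/* Toy table mirroring the two if-conversion entries after the patch. */
static const struct default_opt toy_table[] = {
    { LEVELS_1_PLUS,           "fif-conversion"  },
    { LEVELS_1_PLUS_NOT_DEBUG, "fif-conversion2" },
};

/* LEVEL is the numeric -O level; DEBUG is true for -Og (assumed to be
   level 1 with a debug flag set). */
bool enabled_by_default(const char *name, int level, bool debug)
{
    size_t n = sizeof toy_table / sizeof toy_table[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(toy_table[i].name, name) != 0)
            continue;
        if (toy_table[i].levels == LEVELS_1_PLUS)
            return level >= 1;
        return level >= 1 && !debug;   /* LEVELS_1_PLUS_NOT_DEBUG */
    }
    return false;                      /* unknown option name */
}

/* Usage example; call from main() when trying this out. */
void demo(void)
{
    assert(!enabled_by_default("fif-conversion2", 1, true));  /* -Og */
    assert(enabled_by_default("fif-conversion2", 2, false));  /* -O2 */
}
```

Under these assumptions the first if-conversion pass stays on at -Og, which is what Richard's follow-up question is about.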
Re: [patch] Add support for pragma Loop_Optimize ([No_]Vector)
On Tue, Apr 15, 2014 at 10:01 AM, Eric Botcazou ebotca...@adacore.com wrote:
>> The loop flags copying should go into copy_loop_info instead of only to copy_loops. Jakub - I see you remap simduid on copy - you have to do something in copy_loop_info instead I suppose. See the other callers.
>
> That also occurred to me, but IMO it's not crystal clear; for example, ivdep (aka safelen) is not in copy_loop_info either. So I think this needs to be further discussed.

Well, there are passes that can end up duplicating loops and thus you lose no_vectorize on the copy for example. Clearly that's undesired, no? For safelen and simduid not copying them is erring on the safe side at least, likewise for force_vectorize. But we do have the copy_loop_info abstraction for a reason. Otherwise we should simply discard it.

Richard.

>> Otherwise I'm fine with this patch.
>
> Thanks, I have applied it as-is for now.
>
> --
> Eric Botcazou
[PATCH] Avoid push/pop_cfun in IPA PTA analysis
This avoids the push/pop_cfun calls, they are not necessary. Bootstrapped / tested on x86_64-unknown-linux-gnu, applied. Richard. 2014-04-15 Richard Biener rguent...@suse.de * tree-ssa-structalias.c (find_func_aliases_for_builtin_call): Add struct function argument and adjust. (find_func_aliases_for_call): Likewise. (find_func_aliases): Likewise. (find_func_clobbers): Likewise. (intra_create_variable_infos): Likewise. (compute_points_to_sets): Likewise. (ipa_pta_execute): Adjust. Do not push/pop cfun. Index: gcc/tree-ssa-structalias.c === *** gcc/tree-ssa-structalias.c (revision 207658) --- gcc/tree-ssa-structalias.c (working copy) *** get_fi_for_callee (gimple call) *** 4126,4132 was handled, otherwise false. */ static bool ! find_func_aliases_for_builtin_call (gimple t) { tree fndecl = gimple_call_fndecl (t); vecce_s lhsc = vNULL; --- 4126,4132 was handled, otherwise false. */ static bool ! find_func_aliases_for_builtin_call (struct function *fn, gimple t) { tree fndecl = gimple_call_fndecl (t); vecce_s lhsc = vNULL; *** find_func_aliases_for_builtin_call (gimp *** 4440,4446 and otherwise are just all nonlocal variables. */ if (in_ipa_mode) { ! fi = lookup_vi_for_tree (cfun-decl); rhs = get_function_part_constraint (fi, ~0); rhs.type = ADDRESSOF; } --- 4440,4446 and otherwise are just all nonlocal variables. */ if (in_ipa_mode) { ! fi = lookup_vi_for_tree (fn-decl); rhs = get_function_part_constraint (fi, ~0); rhs.type = ADDRESSOF; } *** find_func_aliases_for_builtin_call (gimp *** 4465,4471 { fi = NULL; if (!in_ipa_mode ! || !(fi = get_vi_for_tree (cfun-decl))) make_constraint_from (get_varinfo (escaped_id), anything_id); else if (in_ipa_mode fi != NULL) --- 4465,4471 { fi = NULL; if (!in_ipa_mode ! || !(fi = get_vi_for_tree (fn-decl))) make_constraint_from (get_varinfo (escaped_id), anything_id); else if (in_ipa_mode fi != NULL) *** find_func_aliases_for_builtin_call (gimp *** 4492,4498 /* Create constraints for the call T. */ static void ! 
find_func_aliases_for_call (gimple t) { tree fndecl = gimple_call_fndecl (t); vecce_s lhsc = vNULL; --- 4492,4498 /* Create constraints for the call T. */ static void ! find_func_aliases_for_call (struct function *fn, gimple t) { tree fndecl = gimple_call_fndecl (t); vecce_s lhsc = vNULL; *** find_func_aliases_for_call (gimple t) *** 4501,4507 if (fndecl != NULL_TREE DECL_BUILT_IN (fndecl) !find_func_aliases_for_builtin_call (t)) return; fi = get_fi_for_callee (t); --- 4501,4507 if (fndecl != NULL_TREE DECL_BUILT_IN (fndecl) !find_func_aliases_for_builtin_call (fn, t)) return; fi = get_fi_for_callee (t); *** find_func_aliases_for_call (gimple t) *** 4611,4617 when building alias sets and computing alias grouping heuristics. */ static void ! find_func_aliases (gimple origt) { gimple t = origt; vecce_s lhsc = vNULL; --- 4611,4617 when building alias sets and computing alias grouping heuristics. */ static void ! find_func_aliases (struct function *fn, gimple origt) { gimple t = origt; vecce_s lhsc = vNULL; *** find_func_aliases (gimple origt) *** 4655,4661 In non-ipa mode, we need to generate constraints for each pointer passed by address. */ else if (is_gimple_call (t)) ! find_func_aliases_for_call (t); /* Otherwise, just a regular assignment statement. Only care about operations with pointer result, others are dealt with as escape --- 4655,4661 In non-ipa mode, we need to generate constraints for each pointer passed by address. */ else if (is_gimple_call (t)) ! find_func_aliases_for_call (fn, t); /* Otherwise, just a regular assignment statement. Only care about operations with pointer result, others are dealt with as escape *** find_func_aliases (gimple origt) *** 4746,4752 { fi = NULL; if (!in_ipa_mode ! || !(fi = get_vi_for_tree (cfun-decl))) make_escape_constraint (gimple_return_retval (t)); else if (in_ipa_mode fi != NULL) --- 4746,4752 { fi = NULL; if (!in_ipa_mode ! || !(fi = get_vi_for_tree (fn-decl))) make_escape_constraint
Re: [PATCH] Prevent out of bound access for multilib_options
Hi Jakub,

Thanks for your review and approval. However, I don't have commit rights yet; could you commit it for me? :) The updated patch is attached, thanks!

On Mon, Apr 14, 2014 at 8:12 PM, Jakub Jelinek ja...@redhat.com wrote:
> On Wed, Apr 09, 2014 at 10:00:38PM +0800, Kito Cheng wrote:
>> `q` will be accessed out of bounds if `*q` has already reached the end of multilib_options, so check it before incrementing to keep the loop condition from reading out of bounds. By the way, this bug was detected by the address sanitizer.
>>
>> 2014-04-09 Kito Cheng k...@0xlab.org
>> * gcc.c (used_arg): Prevent out of bound access for multilib_options.
>
> There should be a newline between date/name/email line and * gcc.c ... and the * gcc.c line should be indented by tab.
>
>> diff --git a/gcc/gcc.c b/gcc/gcc.c
>> index 5cb485a..c8ab7d6 100644
>> --- a/gcc/gcc.c
>> +++ b/gcc/gcc.c
>> @@ -7490,7 +7490,7 @@ used_arg (const char *p, int len)
>>         {
>>           const char *r;
>>
>> -         for (q = multilib_options; *q != '\0'; q++)
>> +         for (q = multilib_options; *q != '\0'; *q && q++)
>>             {
>>               while (*q == ' ')
>>                 q++;
>
> Ok for trunk and 4.9.1.
>
> Jakub

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2fbdb01..f161c8a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2014-04-15 Kito Cheng k...@0xlab.org
+
+       * gcc.c (used_arg): Prevent out of bound access for multilib_options.
+
 2014-04-15 Jakub Jelinek ja...@redhat.com
 
        PR plugins/59335
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 5cb485a..c8ab7d6 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -7490,7 +7490,7 @@ used_arg (const char *p, int len)
        {
          const char *r;
 
-         for (q = multilib_options; *q != '\0'; q++)
+         for (q = multilib_options; *q != '\0'; *q && q++)
            {
              while (*q == ' ')
                q++;
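[Editorial sketch, not part of the thread.] The fix guards the loop increment so that a pointer already sitting on the terminating NUL is not advanced again before the loop condition is read. The standalone function below shows the same pattern on a simplified, made-up token counter (count_tokens is not GCC code; the real used_arg does considerably more work per token).

```c
#include <assert.h>

/* Count space-separated tokens in OPTS.  The inner scanning loop may
   leave q on the terminating '\0'; with a plain q++ the pointer would
   then step past the end and the condition *q != '\0' would read out
   of bounds.  The guarded increment `*q && q++` leaves q where it is
   in that case. */
int count_tokens(const char *opts)
{
    int n = 0;
    const char *q;

    for (q = opts; *q != '\0'; *q && q++) {
        while (*q == ' ')               /* skip leading blanks */
            q++;
        if (*q == '\0')                 /* only blanks were left */
            break;
        n++;
        while (*q != '\0' && *q != ' ') /* scan to end of token */
            q++;
    }
    return n;
}

/* Usage example; call from main() when trying this out. */
void demo(void)
{
    assert(count_tokens("mthumb marm") == 2);
    assert(count_tokens("") == 0);
}
```

Building a variant with a plain q++ under -fsanitize=address should reproduce the kind of report mentioned in the mail, assuming the string ends right at the edge of its allocation.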
Re: [patch] Add support for pragma Loop_Optimize ([No_]Vector)
> Well, there are passes that can end up duplicating loops and thus you lose no_vectorize on the copy for example. Clearly that's undesired, no? For safelen and simduid not copying them is erring on the safe side at least, likewise for force_vectorize. But we do have the copy_loop_info abstraction for a reason. Otherwise we should simply discard it.

That's not very clear, even for dont_vectorize, and I'm a bit uncomfortable special-casing it. For example, copy_loop_info is called for loop unswitching and loop versioning and one can wonder what should happen to loop hints for unswitched and versioned instances of loops. I'll further think about it.

-- 
Eric Botcazou
Re: [PATCH] Add DW_AT_const_value as unsigned or int depending on type and value used.
On Tue, 2014-04-15 at 08:21 +0200, Jakub Jelinek wrote: On Mon, Apr 14, 2014 at 03:48:06PM -0700, Cary Coutant wrote: Also note that size_of_die and value_format will still choose DW_FORM_data[1248] for dw_val_class_unsigned_const in most cases. Don't you really want to use DW_FORM_udata? DW_FORM_data[1248] is in many cases smaller than DW_FORM_udata (though, one has to take into account possibly larger .debug_abbrev size). Yes, but it's up to the consumer to deduce from context whether the value is signed or unsigned. If it's still true that GDB will interpret DW_FORM_data[1248] as signed (as the deleted comment said), and you output a value between 128 and 255 using DW_FORM_data1, this isn't going to work. Maybe that comment only applies to DW_FORM_data[48] (whichever matches HOST_WIDE_INT)? If there is no agreement between producer and consumer what is signed and what is unsigned for DW_FORM_data[1248], then of course that is a problem, I wasn't aware of such disagreements. Cary is right, I should have clarified/fixed the comment instead of just removing it completely. There used to be a very brief period where GDB (5.x timeframe) treated DW_FORM_data[1248] as signed. This hasn't been true for a very long time anymore. There must indeed be agreement between the producers and consumers how to interpret these forms. GCC always outputs DW_FORM_data[1248] as unsigned values and consumers (at least GDB and elfutils explicitly agreed on this) explicitly always zero-extend these forms. This is documented in other places in dwarf2out.c, in the GDB sources and elfutils comments, but it would not be a bad idea to have a comment here too to make sure this is kept consistent. The other issue is when HOST_WIDE_INT is smaller than 64 bits. I didn't want to fix that issue in this patch because I don't have any such setups. And as Cary also pointed out in the previous thread that does require some changes to how doubles are treated. 
It would need a new add_AT_unsigned_double function. And I think it would mean fixing the case were add_AT_double generates either a constant class or a block class form (add_AT_double is used for both those cases, but not in all places where it is used is a block class form allowed - here it is for a DW_AT_const, but it isn't in all cases were it is used in dwarf2out.c). But those issues/TODOs are out of scope for this patch. Added a clarifying comment to the code and reinstated the TODO for the double case. OK to push? Thanks, Mark commit f7c10a0ae5e99b680335b1a13e082fcad4ad0236 Author: Mark Wielaard m...@redhat.com Date: Fri Mar 7 22:27:15 2014 +0100 Add DW_AT_const_value as unsigned or int depending on type and value used. As the comment in the code already indicated DWARF2 does provide DW_FORM_sdata/DW_FORM_udata to represent signed/unsigned data. Enumeration constants wider than HOST_WIDE_INT are already handled separately. Those constant values that do fit a HOST_WIDE_INT can be encoded as signed or unsigned depending on type and value for more efficient encoding. * dwarf2out.c (gen_enumeration_type_die): Add DW_AT_const_value as unsigned or int depending on type and value used. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 802b587..e4d6669 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2014-03-21 Mark Wielaard m...@redhat.com + + * dwarf2out.c (gen_enumeration_type_die): Add DW_AT_const_value + as unsigned or int depending on type and value used. 
+ 2014-03-20 Mark Wielaard m...@redhat.com * dwarf2out.c (add_bound_info): If HOST_WIDE_INT is big enough, diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 7eef56c..70b0716 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -17369,22 +17369,23 @@ gen_enumeration_type_die (tree type, dw_die_ref context_die) if (simple_type_size_in_bits (TREE_TYPE (value)) = HOST_BITS_PER_WIDE_INT || tree_fits_shwi_p (value)) - /* DWARF2 does not provide a way of indicating whether or - not enumeration constants are signed or unsigned. GDB - always assumes the values are signed, so we output all - values as if they were signed. That means that - enumeration constants with very large unsigned values - will appear to have negative values in the debugger. - - TODO: the above comment is wrong, DWARF2 does provide - DW_FORM_sdata/DW_FORM_udata to represent signed/unsigned data. - This should be re-worked to use correct signed/unsigned - int/double tags for all cases, instead of always treating as - signed. */ - add_AT_int (enum_die, DW_AT_const_value, TREE_INT_CST_LOW (value)); + { + /* For constant forms created by add_AT_unsigned DWARF +consumers (GDB, elfutils, etc.) always zero extend +
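[Editorial sketch, not part of the thread.] The signedness concern discussed above can be illustrated with a one-byte payload such as DW_FORM_data1. The code below is not taken from GCC, GDB or elfutils; the function names are made up, and it only shows why a value with the topmost bit set reads differently under zero extension and sign extension, while a value with that bit clear does not.

```c
#include <assert.h>
#include <stdint.h>

/* A consumer that zero-extends the raw byte, which is the behaviour the
   thread says GDB and elfutils agreed on. */
uint64_t data1_as_unsigned(uint8_t raw)
{
    return raw;
}

/* A hypothetical consumer that sign-extends the same byte instead. */
int64_t data1_as_signed(uint8_t raw)
{
    return raw >= 128 ? (int64_t)raw - 256 : (int64_t)raw;
}

/* Non-zero when both readings agree, i.e. the topmost bit is clear. */
int data1_is_unambiguous(uint8_t raw)
{
    return data1_as_signed(raw) == (int64_t)data1_as_unsigned(raw);
}

/* Usage example; call from main() when trying this out. */
void demo(void)
{
    assert(data1_as_unsigned(200) == 200);
    assert(data1_as_signed(200) == -56);
    assert(data1_is_unambiguous(127));
    assert(!data1_is_unambiguous(128));
}
```

This matches Jakub's remark that values without the topmost bit set are safe regardless of which convention a consumer follows.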
[Ada] Use correct predicate for static expressions
Really a minor detail, but still.

Tested on x86_64-suse-linux, applied on the mainline.

2014-04-15 Eric Botcazou ebotca...@adacore.com

    * gcc-interface/decl.c (prepend_one_attribute_pragma): Call Is_OK_Static_Expression in lieu of Is_Static_Expression to detect valid arguments.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===================================================================
--- gcc-interface/decl.c (revision 209404)
+++ gcc-interface/decl.c (working copy)
@@ -6151,7 +6151,8 @@ prepend_one_attribute_pragma (struct att
       Node_Id gnat_arg0 = Next (First (gnat_arg));
       Node_Id gnat_arg1 = Empty;
 
-      if (Present (gnat_arg0) && Is_Static_Expression (Expression (gnat_arg0)))
+      if (Present (gnat_arg0)
+          && Is_OK_Static_Expression (Expression (gnat_arg0)))
         {
           gnu_arg0 = gnat_to_gnu (Expression (gnat_arg0));
 
@@ -6165,7 +6166,8 @@ prepend_one_attribute_pragma (struct att
             gnat_arg1 = Next (gnat_arg0);
         }
 
-      if (Present (gnat_arg1) && Is_Static_Expression (Expression (gnat_arg1)))
+      if (Present (gnat_arg1)
+          && Is_OK_Static_Expression (Expression (gnat_arg1)))
         {
           gnu_arg1 = gnat_to_gnu (Expression (gnat_arg1));
 
[Ada] Fix strange performance drop for code using SSE vector types
Gigi generates a VIEW_CONVERT_EXPR in the middle of a hot loop, which kills the performance at run time. Tested on x86_64-suse-linux, applied on the mainline. 2014-04-15 Eric Botcazou ebotca...@adacore.com * gcc-interface/utils.c (type_for_vector_element_p): New predicate. (build_vector_type_for_size): New function. (build_vector_type_for_array): Likewise. (unchecked_convert): Build an intermediate vector type to convert from a generic array type to a vector type. (handle_vector_size_attribute): Reimplement. (handle_vector_type_attribute): Likewise. 2014-04-15 Eric Botcazou ebotca...@adacore.com * gnat.dg/vect14.adb: New test. -- Eric BotcazouIndex: gcc-interface/utils.c === --- gcc-interface/utils.c (revision 209404) +++ gcc-interface/utils.c (working copy) @@ -3194,6 +3194,96 @@ build_template (tree template_type, tree return gnat_build_constructor (template_type, template_elts); } +/* Return true if TYPE is suitable for the element type of a vector. */ + +static bool +type_for_vector_element_p (tree type) +{ + enum machine_mode mode; + + if (!INTEGRAL_TYPE_P (type) + !SCALAR_FLOAT_TYPE_P (type) + !FIXED_POINT_TYPE_P (type)) +return false; + + mode = TYPE_MODE (type); + if (GET_MODE_CLASS (mode) != MODE_INT + !SCALAR_FLOAT_MODE_P (mode) + !ALL_SCALAR_FIXED_POINT_MODE_P (mode)) +return false; + + return true; +} + +/* Return a vector type given the SIZE and the INNER_TYPE, or NULL_TREE if + this is not possible. If ATTRIBUTE is non-zero, we are processing the + attribute declaration and want to issue error messages on failure. */ + +static tree +build_vector_type_for_size (tree inner_type, tree size, tree attribute) +{ + unsigned HOST_WIDE_INT size_int, inner_size_int; + int nunits; + + /* Silently punt on variable sizes. We can't make vector types for them, + need to ignore them on front-end generated subtypes of unconstrained + base types, and this attribute is for binding implementors, not end + users, so we should never get there from legitimate explicit uses. 
*/ + if (!tree_fits_uhwi_p (size)) +return NULL_TREE; + size_int = tree_to_uhwi (size); + + if (!type_for_vector_element_p (inner_type)) +{ + if (attribute) + error (invalid element type for attribute %qs, + IDENTIFIER_POINTER (attribute)); + return NULL_TREE; +} + inner_size_int = tree_to_uhwi (TYPE_SIZE_UNIT (inner_type)); + + if (size_int % inner_size_int) +{ + if (attribute) + error (vector size not an integral multiple of component size); + return NULL_TREE; +} + + if (size_int == 0) +{ + if (attribute) + error (zero vector size); + return NULL_TREE; +} + + nunits = size_int / inner_size_int; + if (nunits (nunits - 1)) +{ + if (attribute) + error (number of components of vector not a power of two); + return NULL_TREE; +} + + return build_vector_type (inner_type, nunits); +} + +/* Return a vector type whose representative array type is ARRAY_TYPE, or + NULL_TREE if this is not possible. If ATTRIBUTE is non-zero, we are + processing the attribute and want to issue error messages on failure. */ + +static tree +build_vector_type_for_array (tree array_type, tree attribute) +{ + tree vector_type = build_vector_type_for_size (TREE_TYPE (array_type), + TYPE_SIZE_UNIT (array_type), + attribute); + if (!vector_type) +return NULL_TREE; + + TYPE_REPRESENTATIVE_ARRAY (vector_type) = array_type; + return vector_type; +} + /* Helper routine to make a descriptor field. FIELD_LIST is the list of decls being built; the new decl is chained on to the front of the list. */ @@ -5268,6 +5358,7 @@ unchecked_convert (tree type, tree expr, tree etype = TREE_TYPE (expr); enum tree_code ecode = TREE_CODE (etype); enum tree_code code = TREE_CODE (type); + tree tem; int c; /* If the expression is already of the right type, we are done. 
*/ @@ -5414,6 +5505,18 @@ unchecked_convert (tree type, tree expr, etype)) expr = convert (type, expr); + /* And, if the array type is not the representative, we try to build an + intermediate vector type of which the array type is the representative + and to do the unchecked conversion between the vector types, in order + to enable further simplifications in the middle-end. */ + else if (code == VECTOR_TYPE + ecode == ARRAY_TYPE + (tem = build_vector_type_for_array (etype, NULL_TREE))) +{ + expr = convert (tem, expr); + return unchecked_convert (type, expr, notrunc_p); +} + /* If we are converting a CONSTRUCTOR to a more aligned RECORD_TYPE, bump the alignment of the CONSTRUCTOR to speed up the copy operation. */ else if (TREE_CODE (expr) == CONSTRUCTOR @@ -6310,27 +6413,13 @@ handle_type_generic_attribute (tree
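[Editorial sketch, not part of the thread.] The size checks in the quoted build_vector_type_for_size can be tried in isolation. The function below is a simplified stand-in with a made-up name: it works on plain integers instead of trees, assumes the garbled power-of-two test in the diff reads `nunits & (nunits - 1)`, and adds a guard against a zero element size that the quoted code does not need.

```c
#include <assert.h>

/* Return the number of vector units for a vector of SIZE bytes with
   elements of INNER_SIZE bytes, or 0 if no vector type can be built. */
unsigned long vector_nunits(unsigned long size, unsigned long inner_size)
{
    unsigned long nunits;

    if (inner_size == 0)        /* extra guard, not in the quoted code */
        return 0;
    if (size % inner_size)      /* not an integral multiple of the element */
        return 0;
    if (size == 0)              /* zero vector size */
        return 0;

    nunits = size / inner_size;
    if (nunits & (nunits - 1))  /* more than one bit set: not a power of two */
        return 0;

    return nunits;
}

/* Usage example; call from main() when trying this out. */
void demo(void)
{
    assert(vector_nunits(16, 4) == 4);  /* e.g. four 32-bit elements */
    assert(vector_nunits(12, 4) == 0);  /* three units: rejected */
}
```

The order of the checks follows the quoted diff, so a zero size passes the modulo test and is rejected by the explicit zero check.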
[C++ Patch] Minor TYPE_NAME clean-up
Hi, now that TYPE_IDENTIFIER includes an explicit check that TYPE_NAME is nonnull, we can remove a few redundant uses of the latter. Plus a few additional tweaks. Tested x86_64-linux. Thanks, Paolo. /// 2014-04-15 Paolo Carlini paolo.carl...@oracle.com * decl.c (duplicate_decls): Remove redundant TYPE_NAME use. * name-lookup.c (pushdecl_maybe_friend_1): Likewise. (do_class_using_decl): Likewise. (set_identifier_type_value): Use TYPE_LINKAGE_IDENTIFIER. * pt.c (resolve_typename_type): Likewise. * mangle.c (dump_substitution_candidates): Use TYPE_NAME_STRING. Index: decl.c === --- decl.c (revision 209408) +++ decl.c (working copy) @@ -1380,7 +1380,6 @@ duplicate_decls (tree newdecl, tree olddecl, bool tree t = TREE_VALUE (t1); if (TYPE_PTR_P (t) -TYPE_NAME (TREE_TYPE (t)) TYPE_IDENTIFIER (TREE_TYPE (t)) == get_identifier (FILE) compparms (TREE_CHAIN (t1), TREE_CHAIN (t2))) Index: mangle.c === --- mangle.c(revision 209408) +++ mangle.c(working copy) @@ -323,7 +323,7 @@ dump_substitution_candidates (void) else if (TREE_CODE (el) == TREE_LIST) name = IDENTIFIER_POINTER (DECL_NAME (TREE_VALUE (el))); else if (TYPE_NAME (el)) - name = IDENTIFIER_POINTER (TYPE_IDENTIFIER (el)); + name = TYPE_NAME_STRING (el); fprintf (stderr, S%d_ = , i - 1); if (TYPE_P (el) (CP_TYPE_RESTRICT_P (el) Index: name-lookup.c === --- name-lookup.c (revision 209408) +++ name-lookup.c (working copy) @@ -945,7 +945,6 @@ pushdecl_maybe_friend_1 (tree x, bool is_friend) set_underlying_type (x); if (type != error_mark_node - TYPE_NAME (type) TYPE_IDENTIFIER (type)) set_identifier_type_value (DECL_NAME (x), x); @@ -2017,7 +2016,7 @@ set_identifier_type_value (tree id, tree decl) static inline tree constructor_name_full (tree type) { - return TYPE_IDENTIFIER (TYPE_MAIN_VARIANT (type)); + return TYPE_LINKAGE_IDENTIFIER (type); } /* Return the name for the constructor (or destructor) for the @@ -,7 +3332,7 @@ do_class_using_decl (tree scope, tree name) } /* Using T::T declares inheriting ctors, even if T 
is a typedef. */ if (MAYBE_CLASS_TYPE_P (scope) - ((TYPE_NAME (scope) name == TYPE_IDENTIFIER (scope)) + (name == TYPE_IDENTIFIER (scope) || constructor_name_p (name, scope))) { maybe_warn_cpp0x (CPP0X_INHERITING_CTORS); Index: pt.c === --- pt.c(revision 209408) +++ pt.c(working copy) @@ -21335,13 +21335,13 @@ resolve_typename_type (tree type, bool only_curren scope = TYPE_CONTEXT (type); /* Usually the non-qualified identifier of a TYPENAME_TYPE is TYPE_IDENTIFIER (type). But when 'type' is a typedef variant of - a TYPENAME_TYPE node, then TYPE_NAME (type) is set to the TYPE_DECL representing - the typedef. In that case TYPE_IDENTIFIER (type) is not the non-qualified - identifier of the TYPENAME_TYPE anymore. + a TYPENAME_TYPE node, then TYPE_NAME (type) is set to the TYPE_DECL + representing the typedef. In that case TYPE_IDENTIFIER (type) is + not the non-qualified identifier of the TYPENAME_TYPE anymore. So by getting the TYPE_IDENTIFIER of the _main declaration_ of the TYPENAME_TYPE instead, we avoid messing up with a possible typedef variant case. */ - name = TYPE_IDENTIFIER (TYPE_MAIN_VARIANT (type)); + name = TYPE_LINKAGE_IDENTIFIER (type); /* If the SCOPE is itself a TYPENAME_TYPE, then we need to resolve it first before we can figure out what NAME refers to. */
Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store
On Fri, Apr 4, 2014 at 7:49 AM, Thomas Preud'homme thomas.preudho...@arm.com wrote: From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Rainer Orth Just omit the { target *-*-* } completely, also a few more times. Please find attached an updated patch. @@ -1733,6 +1743,51 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int limit) to initialize the symbolic number. */ if (!source_expr1) { + n-base_addr = n-offset = NULL_TREE; + if (is_gimple_assign (rhs1_stmt)) you want gimple_assign_load_p (rhs1_stmt) !gimple_has_volatile_ops (rhs1_stmt) here. + case ARRAY_RANGE_REF: For ARRAY_RANGE_REF this doesn't make much sense IMHO. + case ARRAY_REF: + n-base_addr = TREE_OPERAND (var, 0); + elt_size = array_ref_element_size (var); + if (TREE_CODE (elt_size) != INTEGER_CST) + return NULL_TREE; + index = TREE_OPERAND (var, 1); + if (TREE_THIS_VOLATILE (var) || TREE_THIS_VOLATILE (index)) + return NULL_TREE; + n-offset = fold_build2 (MULT_EXPR, sizetype, + index, elt_size); You fail to honor array_ref_low_bound. With handling only the outermost handled-component and then only a selected subset you'll catch many but not all cases. Why not simply use get_inner_reference () here (plus stripping the constant offset from an innermost MEM_REF) and get the best of both worlds (not duplicate parts of its logic and handle more cases)? Eventually using tree-affine.c and get_inner_reference_aff is even more appropriate so you can compute the address differences without decomposing them yourselves. Btw, I think for the recursion to work properly you need to handle loads for the toplevel stmt, not for rhs1_stmt. Eventually you need to split out a find_bswap_2 that handles the recursion case that allows loads from the case that doesn't called from find_bswap. + /* Compute address to load from and cast according to the size + of the load. 
*/ + load_ptr_type = build_pointer_type (load_type); + addr_expr = build1 (ADDR_EXPR, load_ptr_type, bswap_src); + addr_tmp = make_temp_ssa_name (load_ptr_type, NULL, load_src); + addr_stmt = gimple_build_assign_with_ops +(NOP_EXPR, addr_tmp, addr_expr, NULL); + gsi_insert_before (gsi, addr_stmt, GSI_SAME_STMT); + + /* Perform the load. */ + load_offset_ptr = build_int_cst (load_ptr_type, 0); + val_tmp = make_temp_ssa_name (load_type, NULL, load_dst); + val_expr = build2 (MEM_REF, load_type, addr_tmp, load_offset_ptr); + load_stmt = gimple_build_assign_with_ops +(MEM_REF, val_tmp, val_expr, NULL); this is unnecessarily complex and has TBAA issues. You don't need to create a correct pointer type, so doing addr_expr = fold_build_addr_expr (bswap_src); is enough. Now, to fix the TBAA issues you either need to remember and combine the reference_alias_ptr_type of each original load and use that for the load_offset_ptr value or decide that isn't worth it and use alias-set zero (use ptr_type_node). Can you also expand the comment about size vs. range? Is it that range can be bigger than size if you have (short)a[0] | ((short)a[3] 1) sofar where size == 2 but range == 3? Thus range can also be smaller than size for example for (short)a[0] | ((short)a[0] 1) where range would be 1 and size == 2? I suppose adding two examples like this to the comment, together with the expected value of 'n' would help here. Otherwise the patch looks good. Now we're only missing the addition of trying to match to a VEC_PERM_EXPR with a constant mask using can_vec_perm_p ;) Thanks, Richard. Best regards, Thomas
[PATCH] Do not walk BINFOs in record_component_aliases
As discussed last year.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2014-04-15 Richard Biener rguent...@suse.de

    * alias.c (record_component_aliases): Do not walk BINFOs.

Index: gcc/alias.c
===================================================================
--- gcc/alias.c (revision 209405)
+++ gcc/alias.c (working copy)
@@ -995,17 +995,6 @@ record_component_aliases (tree type)
     case RECORD_TYPE:
     case UNION_TYPE:
     case QUAL_UNION_TYPE:
-      /* Recursively record aliases for the base classes, if there are any. */
-      if (TYPE_BINFO (type))
-        {
-          int i;
-          tree binfo, base_binfo;
-
-          for (binfo = TYPE_BINFO (type), i = 0;
-               BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
-            record_alias_subset (superset,
-                                 get_alias_set (BINFO_TYPE (base_binfo)));
-        }
       for (field = TYPE_FIELDS (type); field != 0; field = DECL_CHAIN (field))
         if (TREE_CODE (field) == FIELD_DECL && !DECL_NONADDRESSABLE_P (field))
           record_alias_subset (superset, get_alias_set (TREE_TYPE (field)));
[Ada] Robustify renaming code in gigi
This makes the renaming code in gigi more robust in preparation for further changes related to renaming. Tested on x86_64-suse-linux, applied on the mainline. 2014-04-15 Eric Botcazou ebotca...@adacore.com Pierre-Marie de Rodat dero...@adacore.com * gcc-interface/decl.c (gnat_to_gnu_entity) object: Create a mere scalar constant instead of a reference for renaming of scalar literal. Do not create a new object for constant renaming except for a function call. Make sure a VAR_DECL is created for the renaming pointer. * gcc-interface/trans.c (constant_decl_with_initializer_p): New. (fold_constant_decl_in_expr): New function. (Identifier_to_gnu): Use constant_decl_with_initializer_p. For a constant renaming, try to fold a constant DECL in the result. (lvalue_required_p) N_Object_Renaming_Declaration: Always return 1. (Identifier_to_gnu): Reference the renamed object of constant renaming pointers directly. (Case_Statement_to_gnu): Do not re-fold the bounds of integer types. Assert that the case values are constant. * gcc-interface/utils.c (invalidate_global_renaming_pointers): Do not invalidate constant renaming pointers. -- Eric BotcazouIndex: gcc-interface/utils.c === --- gcc-interface/utils.c (revision 209410) +++ gcc-interface/utils.c (working copy) @@ -2514,7 +2514,10 @@ record_global_renaming_pointer (tree dec vec_safe_push (global_renaming_pointers, decl); } -/* Invalidate the global renaming pointers. */ +/* Invalidate the global renaming pointers that are not constant, lest their + renamed object contains SAVE_EXPRs tied to an elaboration routine. Note + that we should not blindly invalidate everything here because of the need + to propagate constant values through renaming. 
*/ void invalidate_global_renaming_pointers (void) @@ -2526,7 +2529,8 @@ invalidate_global_renaming_pointers (voi return; FOR_EACH_VEC_ELT (*global_renaming_pointers, i, iter) -SET_DECL_RENAMED_OBJECT (iter, NULL_TREE); +if (!TREE_CONSTANT (DECL_RENAMED_OBJECT (iter))) + SET_DECL_RENAMED_OBJECT (iter, NULL_TREE); vec_free (global_renaming_pointers); } Index: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 209409) +++ gcc-interface/decl.c (working copy) @@ -960,18 +960,20 @@ gnat_to_gnu_entity (Entity_Id gnat_entit gnu_type = TREE_TYPE (gnu_expr); /* Case 1: If this is a constant renaming stemming from a function - call, treat it as a normal object whose initial value is what - is being renamed. RM 3.3 says that the result of evaluating a - function call is a constant object. As a consequence, it can - be the inner object of a constant renaming. In this case, the - renaming must be fully instantiated, i.e. it cannot be a mere - reference to (part of) an existing object. */ + call, treat it as a normal object whose initial value is what is + being renamed. RM 3.3 says that the result of evaluating a + function call is a constant object. Treat constant literals + the same way. As a consequence, it can be the inner object of + a constant renaming. In this case, the renaming must be fully + instantiated, i.e. it cannot be a mere reference to (part of) an + existing object. */ if (const_flag) { tree inner_object = gnu_expr; while (handled_component_p (inner_object)) inner_object = TREE_OPERAND (inner_object, 0); - if (TREE_CODE (inner_object) == CALL_EXPR) + if (TREE_CODE (inner_object) == CALL_EXPR + || CONSTANT_CLASS_P (inner_object)) create_normal_object = true; } @@ -1030,15 +1032,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit about that failure. */ } - /* Case 3: If this is a constant renaming and creating a - new object is allowed and cheap, treat it as a normal - object whose initial value is what is being renamed. 
*/ - if (const_flag - !Is_Composite_Type - (Underlying_Type (Etype (gnat_entity - ; - - /* Case 4: Make this into a constant pointer to the object we + /* Case 3: Make this into a constant pointer to the object we are to rename and attach the object to the pointer if it is something we can stabilize. @@ -1050,68 +1044,59 @@ gnat_to_gnu_entity (Entity_Id gnat_entit The pointer is called a renaming pointer in this case. In the rare cases where we cannot stabilize the renamed - object, we just make a bare pointer, and the renamed - entity is always accessed indirectly through it. */ - else - { - /* We need to preserve the volatileness of the renamed - object through the indirection. */ - if (TREE_THIS_VOLATILE
[PATCH][RFC] Remove RTL loop unswitching
This removes RTL loop unswitching (see last year's discussion about compile-time issues of that pass). RTL loop unswitching is enabled together with GIMPLE loop unswitching at -O3 and by -funswitch-loops. It's clearly the wrong place to do high-level loop transforms these days, and the cost of maintenance doesn't outweigh the questionable benefit. Thus the following patch removes it. Bootstrap / regtest pending on x86_64-unknown-linux-gnu (I hope for testsuite fallout). Any objections? Thanks, Richard. 2014-04-15 Richard Biener rguent...@suse.de * Makefile.in (OBJS): Remove loop-unswitch.o. * loop-unswitch.c: Delete. * tree-pass.h (make_pass_rtl_unswitch): Remove. * passes.def (pass_rtl_unswitch): Likewise. * loop-init.c (gate_rtl_unswitch): Likewise. (rtl_unswitch): Likewise. (pass_data_rtl_unswitch): Likewise. (pass_rtl_unswitch): Likewise. (make_pass_rtl_unswitch): Likewise. * rtl.h (reversed_condition): Likewise. (compare_and_jump_seq): Likewise. * loop-iv.c (reversed_condition): Move here from loop-unswitch.c and make static. * loop-unroll.c (compare_and_jump_seq): Likewise. 
Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 209410) +++ gcc/Makefile.in (working copy) @@ -1294,7 +1294,6 @@ OBJS = \ loop-invariant.o \ loop-iv.o \ loop-unroll.o \ - loop-unswitch.o \ lower-subreg.o \ lra.o \ lra-assigns.o \ Index: gcc/tree-pass.h === --- gcc/tree-pass.h (revision 209410) +++ gcc/tree-pass.h (working copy) @@ -512,7 +512,6 @@ extern rtl_opt_pass *make_pass_outof_cfg extern rtl_opt_pass *make_pass_loop2 (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_loop_init (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_move_loop_invariants (gcc::context *ctxt); -extern rtl_opt_pass *make_pass_rtl_unswitch (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_unroll_and_peel_loops (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_doloop (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_loop_done (gcc::context *ctxt); Index: gcc/passes.def === --- gcc/passes.def (revision 209410) +++ gcc/passes.def (working copy) @@ -341,7 +341,6 @@ along with GCC; see the file COPYING3. PUSH_INSERT_PASSES_WITHIN (pass_loop2) NEXT_PASS (pass_rtl_loop_init); NEXT_PASS (pass_rtl_move_loop_invariants); - NEXT_PASS (pass_rtl_unswitch); NEXT_PASS (pass_rtl_unroll_and_peel_loops); NEXT_PASS (pass_rtl_doloop); NEXT_PASS (pass_rtl_loop_done); Index: gcc/loop-init.c === --- gcc/loop-init.c (revision 209410) +++ gcc/loop-init.c (working copy) @@ -518,61 +518,7 @@ make_pass_rtl_move_loop_invariants (gcc: } -/* Loop unswitching for RTL. 
*/ -static bool -gate_rtl_unswitch (void) -{ - return flag_unswitch_loops; -} - -static unsigned int -rtl_unswitch (void) -{ - if (number_of_loops (cfun) 1) -unswitch_loops (); - return 0; -} - -namespace { - -const pass_data pass_data_rtl_unswitch = -{ - RTL_PASS, /* type */ - loop2_unswitch, /* name */ - OPTGROUP_LOOP, /* optinfo_flags */ - true, /* has_gate */ - true, /* has_execute */ - TV_LOOP_UNSWITCH, /* tv_id */ - 0, /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - TODO_verify_rtl_sharing, /* todo_flags_finish */ -}; - -class pass_rtl_unswitch : public rtl_opt_pass -{ -public: - pass_rtl_unswitch (gcc::context *ctxt) -: rtl_opt_pass (pass_data_rtl_unswitch, ctxt) - {} - - /* opt_pass methods: */ - bool gate () { return gate_rtl_unswitch (); } - unsigned int execute () { return rtl_unswitch (); } - -}; // class pass_rtl_unswitch - -} // anon namespace - -rtl_opt_pass * -make_pass_rtl_unswitch (gcc::context *ctxt) -{ - return new pass_rtl_unswitch (ctxt); -} - - -/* Loop unswitching for RTL. */ +/* Loop unrolling and peeling for RTL. */ static bool gate_rtl_unroll_and_peel_loops (void) { Index: gcc/loop-iv.c === --- gcc/loop-iv.c (revision 209410) +++ gcc/loop-iv.c (working copy) @@ -1732,6 +1732,21 @@ canon_condition (rtx cond) return cond; } +/* Reverses CONDition; returns NULL if we cannot. */ + +static rtx +reversed_condition (rtx cond) +{ + enum rtx_code reversed; + reversed = reversed_comparison_code (cond, NULL); + if (reversed == UNKNOWN) +return NULL_RTX; + else +return gen_rtx_fmt_ee (reversed, + GET_MODE (cond), XEXP (cond, 0), + XEXP (cond, 1)); +} + /* Tries to use the fact
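For readers unfamiliar with the transform being discussed, here is a minimal hand-written sketch of what loop unswitching does at the source level. It is an illustration only, not code from the patch: the function names `apply` and `apply_unswitched` are made up, and the compiler performs the equivalent rewrite on its internal representation rather than on C++ source.

```cpp
#include <cstddef>
#include <vector>

// Original form: the loop-invariant test on `scale` is evaluated on
// every iteration of the loop.
void apply(std::vector<int> &v, bool scale) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (scale)
            v[i] *= 2;
        else
            v[i] += 1;
    }
}

// Unswitched form: the invariant test is hoisted out and the loop is
// duplicated, once per outcome. Each copy has a branch-free body, at
// the price of larger code.
void apply_unswitched(std::vector<int> &v, bool scale) {
    if (scale) {
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i] *= 2;
    } else {
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i] += 1;
    }
}
```

Both functions compute the same result; the second simply trades code size for fewer branches inside the loop, which is the kind of high-level decision the message argues belongs on GIMPLE rather than RTL.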
[PING PATCH] Extend mode-switching to support toggle
Hello, I guess this one is for the RTL maintainers. I would also be interested in comments from the most recent contributors to mode-switching.c (taken from past ChangeLog entries): http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00196.html Many thanks, Christian
Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
Maciej W. Rozycki ma...@codesourcery.com writes: On Sat, 12 Apr 2014, Richard Sandiford wrote: I went ahead and applied the adjusted version of the patch to trunk as below (because I wanted to add a testcase too). I believe you need to adjust constraints to ensure constant 0 is known to produce a 16-bit instruction encoding where possible. Otherwise you'll end up with suboptimal code when the instruction is in a branch delay slot. Yeah, it'd be good to do that too (although this is a preexisting problem). I'm relying on you guys to do the microMIPS stuff though -- I don't have a way of testing it. Thanks, Richard
[PATCH] Add support for -fno-sanitize-recover and -fsanitize-undefined-trap-on-error (PR sanitizer/60275)
Hi! This patch adds two new options (compatible with clang) which allow users to choose the behavior of undefined behavior sanitization. By default, as before, all undefined behaviors (except for __builtin_unreachable and missing return in C++) continue after reporting, which means that you can get lots of runtime errors from a single program run and the exit code will not reflect the failure in that case. With this patch, one can use -fsanitize=undefined -fno-sanitize-recover, which will report just the first undefined behavior and then exit with non-zero code. Or one can use -fsanitize-undefined-trap-on-error, which will just __builtin_trap () on undefined behavior, not report anything and not require linking of -lubsan (useful e.g. for the kernel or embedded apps). If -fsanitize-undefined-trap-on-error is used, then -f{,no-}sanitize-recover is ignored: since undefined behavior traps, only the first undefined behavior will of course be reported (through the SIGILL/abort). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-04-15 Jakub Jelinek ja...@redhat.com PR sanitizer/60275 * common.opt (fsanitize-recover, fsanitize-undefined-trap-on-error): New options. * gcc.c (sanitize_spec_function): Don't return for undefined if flag_sanitize_undefined_trap_on_error. * sanitizer.def (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW_ABORT, BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS_ABORT, BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE_ABORT, BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_ABORT, BUILT_IN_UBSAN_HANDLE_ADD_OVERFLOW_ABORT, BUILT_IN_UBSAN_HANDLE_SUB_OVERFLOW_ABORT, BUILT_IN_UBSAN_HANDLE_MUL_OVERFLOW_ABORT, BUILT_IN_UBSAN_HANDLE_NEGATE_OVERFLOW_ABORT, BUILT_IN_UBSAN_HANDLE_LOAD_INVALID_VALUE_ABORT): New builtins. * ubsan.c (ubsan_instrument_unreachable): Return __builtin_trap () if flag_sanitize_undefined_trap_on_error. (ubsan_expand_null_ifn): Emit __builtin_trap () if flag_sanitize_undefined_trap_on_error and __ubsan_handle_type_mismatch_abort if !flag_sanitize_recover. 
(ubsan_expand_null_ifn, ubsan_build_overflow_builtin, instrument_bool_enum_load): Emit __builtin_trap () if flag_sanitize_undefined_trap_on_error and __builtin_handle_*_abort () if !flag_sanitize_recover. * doc/invoke.texi (-fsanitize-recover, -fsanitize-undefined-trap-on-error): Document. c-family/ * c-ubsan.c (ubsan_instrument_return): Return __builtin_trap () if flag_sanitize_undefined_trap_on_error. (ubsan_instrument_division, ubsan_instrument_shift, ubsan_instrument_vla): Likewise. Use __ubsan_handle_*_abort () if !flag_sanitize_recover. testsuite/ * g++.dg/ubsan/return-2.C: Revert 2014-03-24 changes, add -fno-sanitize-recover to dg-options. * g++.dg/ubsan/cxx11-shift-1.C: Remove c++11 target restriction, add -std=c++11 to dg-options. * g++.dg/ubsan/cxx11-shift-2.C: Likewise. * g++.dg/ubsan/cxx1y-vla.C: Remove c++1y target restriction, add -std=c++1y to dg-options. * c-c++-common/ubsan/undefined-1.c: Revert 2014-03-24 changes, add -fno-sanitize-recover to dg-options. * c-c++-common/ubsan/overflow-sub-1.c: Likewise. * c-c++-common/ubsan/vla-4.c: Likewise. * c-c++-common/ubsan/pr59503.c: Likewise. * c-c++-common/ubsan/vla-3.c: Likewise. * c-c++-common/ubsan/save-expr-1.c: Likewise. * c-c++-common/ubsan/overflow-add-1.c: Likewise. * c-c++-common/ubsan/shift-3.c: Likewise. * c-c++-common/ubsan/overflow-1.c: Likewise. * c-c++-common/ubsan/overflow-negate-2.c: Likewise. * c-c++-common/ubsan/vla-2.c: Likewise. * c-c++-common/ubsan/overflow-mul-1.c: Likewise. * c-c++-common/ubsan/pr60613-1.c: Likewise. * c-c++-common/ubsan/shift-6.c: Likewise. * c-c++-common/ubsan/overflow-mul-3.c: Likewise. * c-c++-common/ubsan/overflow-add-3.c: New test. * c-c++-common/ubsan/overflow-add-4.c: New test. * c-c++-common/ubsan/div-by-zero-6.c: New test. * c-c++-common/ubsan/div-by-zero-7.c: New test. 
--- gcc/common.opt.jj 2014-04-15 09:57:33.400264838 +0200 +++ gcc/common.opt 2014-04-15 10:28:10.554519376 +0200 @@ -862,6 +862,14 @@ fsanitize= Common Driver Report Joined Select what to sanitize +fsanitize-recover +Common Report Var(flag_sanitize_recover) Init(1) +After diagnosing undefined behavior attempt to continue execution + +fsanitize-undefined-trap-on-error +Common Report Var(flag_sanitize_undefined_trap_on_error) Init(0) +Use trap instead of a library function for undefined behavior sanitization + fasynchronous-unwind-tables Common Report Var(flag_asynchronous_unwind_tables) Optimization Generate unwind tables that are exact at each instruction boundary --- gcc/gcc.c.jj2014-04-15 09:57:33.456264545
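The three behaviors described in this message (report and continue, report once and exit, trap without reporting) can be modelled by hand. The sketch below is only a conceptual model under stated assumptions: `UbMode` and `checked_add` are hypothetical names, the diagnostic text is made up, and the real instrumentation calls `__ubsan_handle_*` routines from libubsan rather than anything shown here. It relies on the GCC/clang builtins `__builtin_add_overflow` and `__builtin_trap`.

```cpp
#include <climits>
#include <cstdio>
#include <cstdlib>

// Hypothetical model of the three modes discussed in the message.
enum class UbMode {
    Recover,    // default: report, then keep running
    NoRecover,  // -fno-sanitize-recover: report once, exit non-zero
    Trap        // -fsanitize-undefined-trap-on-error: trap, no report
};

// Signed addition with an explicit overflow check, standing in for the
// check the sanitizer would insert around `a + b`.
int checked_add(int a, int b, UbMode mode) {
    int result;
    if (__builtin_add_overflow(a, b, &result)) {
        if (mode == UbMode::Trap)
            __builtin_trap();  // no message, no runtime library needed
        std::fprintf(stderr,
                     "runtime error: signed integer overflow: %d + %d\n",
                     a, b);
        if (mode == UbMode::NoRecover)
            std::exit(1);      // first report is also the last
        // Recover: fall through with the wrapped result.
    }
    return result;
}
```

In `Recover` mode a program can print many such reports and still exit with status 0, which is the situation the new options are meant to address.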
Re: [PATCH v2] libstdc++: Add hexfloat/defaultfloat io manipulators.
On 27 March 2014 23:56, Luke Allardyce wrote: It looks like the new standard also requires the precision to be ignored for hexfloat For conversion from a floating-point type, if floatfield != (ios_base::fixed | ios_base:: scientific), str.precision() is specified as precision in the conversion specification. Otherwise, no precision is specified. Thanks for pointing out that difference. We'll need a test for that. Also the old standard seems to require that ios_base::fixed | ios_base::scientific (or any other combination) falls through to the uppercase test; I was trying to use abi_tag for a solution as not only would two versions of _S_format_float be necessary, but also num_get due to the pre-instantiated templates for char and wchar, which led me to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60642. It might just be more trouble than it's worth. I don't think we need to worry about that, if I understand correctly the combination of fixed|scientific has unspecified behaviour in C++03, so we can make our implementation do exactly what it does in C++11.
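As a small illustration of the manipulator under discussion, the sketch below formats a `double` with `std::hexfloat` and parses it back with `std::strtod`. The helper names `to_hexfloat` and `from_hexfloat` are made up for this example. No exact output string is shown on purpose, because whether the stream precision influences the number of hex digits is precisely the point being debated in this thread.

```cpp
#include <cstdlib>
#include <ios>
#include <sstream>
#include <string>

// Format `value` in hexadecimal floating-point notation. A precision is
// set on the stream; per the wording quoted above it should not be
// passed on to the conversion when floatfield is fixed | scientific.
std::string to_hexfloat(double value, int precision) {
    std::ostringstream out;
    out.precision(precision);
    out << std::hexfloat << value;
    return out.str();
}

// Parse a hexadecimal floating-point string such as the one produced
// above; std::strtod accepts the 0x...p... form.
double from_hexfloat(const std::string &text) {
    return std::strtod(text.c_str(), nullptr);
}
```

A value such as 0.5 is exactly representable with a single hex digit, so it round-trips regardless of how the implementation treats the precision.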
RE: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
Richard Sandiford rdsandif...@googlemail.com writes: Maciej W. Rozycki ma...@codesourcery.com writes: On Sat, 12 Apr 2014, Richard Sandiford wrote: I went ahead and applied the adjusted version of the patch to trunk as below (because I wanted to add a testcase too). I believe you need to adjust constraints to ensure constant 0 is known to produce a 16-bit instruction encoding where possible. Otherwise you'll end up with suboptimal code when the instruction is in a branch delay slot. Yeah, it'd be good to do that too (although this is a preexisting problem). I'm relying on you guys to do the microMIPS stuff though -- I don't have a way of testing it. FYI, we have GNUSIM patches awaiting submission to add microMIPS support. We are waiting on copyright assignment for GDB, which is why they are not available yet. We were planning on getting them submitted as 'on behalf of', but it seems this may not be permitted any more by the FSF. Matthew
RE: [patch] Disable if_conversion2 for Og
-----Original Message----- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Tuesday, April 15, 2014 4:05 PM To: Joey Ye Cc: GCC Patches Subject: Re: [patch] Disable if_conversion2 for Og On Tue, Apr 15, 2014 at 3:59 AM, Joey Ye joey...@arm.com wrote: If-conversion is harmful to optimized debugging as it generates conditional execution instructions with line number information, which gives developers the illusion that both the then and else branches are executed. For example: test.c: 1: unsigned oldest_sequence; 2: 3: unsigned foo(unsigned seq_number) 4: { 5: if ((seq_number + 5) < 10) 6: seq_number += 100; 7: else 8: seq_number = oldest_sequence; if (seq_number oldest_sequence) seq_number = oldest_sequence; return seq_number; } $ arm-none-eabi-gcc -mthumb -mcpu=cortex-m3 -Og -g3 gets: .loc 1 5 0 adds r3, r0, #5 cmp r3, #9 .loc 1 6 0 <- line 6, then branch itee ls addls r0, r0, #100 .LVL1: .loc 1 8 0 <- line 8, else branch. Both branches seem to be executed in GDB ldrhi r3, .L5 ldrhi r0, [r3] The reason is that if_conversion2 is still enabled in Og. The patch simply disables it for Og. Tests: * -Og bootstrap passed. * Make check default (no additional option): No regression. * Make check with -Og: expected regressions. Cases relying on if-conversion2 failed. FAIL: gcc.target/arm/its.c scan-assembler-times \\tit 2 FAIL: gcc.target/arm/pr40956.c scan-assembler-times mov[t ]*r., #0 1 FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler asreq FAIL: gcc.target/arm/thumb-ifcvt-2.c scan-assembler lslne FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler asrne FAIL: gcc.target/arm/thumb-ifcvt.c scan-assembler lslne OK for trunk and 4.8/4.9 branch? Ok for trunk and branches after a while. Why does if-conversion not have the same problem? On the GIMPLE part we avoid all kinds of if-conversion with -Og. I think if-conversion should be disabled for Og too, but I don't have a case to show that it is harmful. If GIMPLE avoids all if-conversion, it is natural to do the same for RTL. 
I'll test and send another patch to also disable if-conversion.
[PATCH][ARM] PR60663: Improve RTX costs for asm statements
Hi all, This patch relates to PR60663 where cse got confused due to asm statements being given a cost of zero in the arm backend. Jakub already put in a fix to cse for 4.9.0 (http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00512.html) but we should still fix the costs in arm. This patch does that by estimating the number of instructions in the asm statement, adding the cost of the input operands and making sure that it's at least COSTS_N_INSNS (1). Tested and bootstrapped on arm-none-linux-gnueabihf. Ok for trunk? Thanks, Kyrill 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com PR rtl-optimization/60663 * config/arm/arm.c (arm_new_rtx_costs): Improve ASM_OPERANDS case, avoid 0 cost. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 91e4cd8..ce7ee82 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -10758,10 +10758,16 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, return true; case ASM_OPERANDS: - /* Just a guess. Cost one insn per input. */ - *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x)); - return true; + { + /* Just a guess. Guess number of instructions in the asm + plus one insn per input. Always a minimum of COSTS_N_INSNS (1) + though (see PR60663). */ +int asm_length = asm_str_count (ASM_OPERANDS_TEMPLATE (x)); +int num_operands = ASM_OPERANDS_INPUT_LENGTH (x); +*cost = COSTS_N_INSNS (MAX (1, asm_length + num_operands)); +return true; + } default: if (mode != VOIDmode) *cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
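The cost formula in the patch can be tried out in isolation. The following is a rough stand-alone model, not the backend code: `costs_n_insns` mirrors GCC's `COSTS_N_INSNS (N)` macro (N * 4 in rtl.h), while `count_asm_insns` is only an approximation of `asm_str_count` that counts newline and `;` separators. The names `old_asm_cost` and `new_asm_cost` are made up for the comparison.

```cpp
#include <algorithm>
#include <string>

// Mirrors GCC's COSTS_N_INSNS (N), which scales an instruction count
// by 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// Approximation of asm_str_count: one statement plus one more for each
// separator found in the template. The real function uses a
// target-specific notion of logical line separators.
int count_asm_insns(const std::string &tmpl) {
    int count = 1;
    for (char c : tmpl)
        if (c == '\n' || c == ';')
            ++count;
    return count;
}

// Old behavior: one insn per input, which is 0 for an asm without
// inputs.
int old_asm_cost(int num_inputs) { return costs_n_insns(num_inputs); }

// New behavior: estimated statements plus inputs, never below one insn.
int new_asm_cost(const std::string &tmpl, int num_inputs) {
    return costs_n_insns(
        std::max(1, count_asm_insns(tmpl) + num_inputs));
}
```

With no inputs the old formula yields a cost of 0, which is the case that confused cse in PR60663; the new formula always yields at least `costs_n_insns(1)`.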
[PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
Hi all (and wide-int maintainers in particular), I tried bootstrapping the wide-int branch on arm-none-linux-gnueabihf and encountered some syntax errors while building wide-int.h and wide-int.cc in expressions that tried to cast to HOST_WIDE_INT. This patch fixes those errors. Also, in c-ada-spec.c I think we intended to use the HOST_WIDE_INT_PRINT format rather than HOST_LONG_FORMAT, since on arm-linux HOST_WIDE_INT is a 'long long'. The attached patch allowed the build to proceed for me, but in stage 2 I encountered an ICE: $TOP/gcc/dwarf2out.c: In function 'long unsigned int _ZL11size_of_dieP10die_struct.isra.209(vecdw_attr_struct, va_gc**, long unsigned int)': $TOP/gcc/dwarf2out.c:7820:1: internal compiler error: in set_value_range, at tree-vrp.c:452 size_of_die (dw_die_ref die) ^ 0xa825c1 set_value_range $TOP/gcc/tree-vrp.c:452 0xa8a441 extract_range_basic $TOP/gcc/tree-vrp.c:3679 0xa92c13 vrp_visit_assignment_or_call $TOP/gcc/tree-vrp.c:6725 0xa947eb vrp_visit_stmt $TOP/gcc/tree-vrp.c:7538 0x9d4d47 simulate_stmt $TOP/gcc/tree-ssa-propagate.c:329 0x9d5047 simulate_block $TOP/gcc/tree-ssa-propagate.c:452 0x9d5e23 ssa_propagate(ssa_prop_result (*)(gimple_statement_base*, edge_def**, tree_node**), ssa_prop_result (*)(gimple_statement_base*)) $TOP/gcc/tree-ssa-propagate.c:859 0xa9a1e1 execute_vrp $TOP/gcc/tree-vrp.c:9781 0xa9a4a3 execute $TOP/gcc/tree-vrp.c:9872 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See http://gcc.gnu.org/bugs.html for instructions. Any ideas? The compiler was configured with: --enable-languages=c,c++,fortran --with-cpu=cortex-a15 --with-float=hard --with-mode=thumb Thanks, Kyrill gcc/ 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * wide-int.h (sign_mask): Fix syntax error. * wide-int.cc (wi::add_large): Likewise. (mul_internal): Likewise. (sub_large): Likewise. 
c-family/ 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * c-ada-spec.c (dump_generic_ada_node): Use HOST_WIDE_INT_PRINT instead of HOST_LONG_FORMAT. diff --git a/gcc/c-family/c-ada-spec.c b/gcc/c-family/c-ada-spec.c index 35b036c..0a28840 100644 --- a/gcc/c-family/c-ada-spec.c +++ b/gcc/c-family/c-ada-spec.c @@ -2205,7 +2205,7 @@ dump_generic_ada_node (pretty_printer *buffer, tree node, tree type, int spc, val = -val; } sprintf (pp_buffer (buffer)-digit_buffer, - 16#% HOST_LONG_FORMAT x, val.elt (val.get_len () - 1)); + 16#% HOST_WIDE_INT_PRINT x, val.elt (val.get_len () - 1)); for (i = val.get_len () - 2; i = 0; i--) sprintf (pp_buffer (buffer)-digit_buffer, HOST_WIDE_INT_PRINT_PADDED_HEX, val.elt (i)); diff --git a/gcc/wide-int.cc b/gcc/wide-int.cc index fbef721..a64ed88 100644 --- a/gcc/wide-int.cc +++ b/gcc/wide-int.cc @@ -1130,7 +1130,7 @@ wi::add_large (HOST_WIDE_INT *val, const HOST_WIDE_INT *op0, if (sgn == SIGNED) { unsigned HOST_WIDE_INT x = (val[len - 1] ^ o0) (val[len - 1] ^ o1); - *overflow = HOST_WIDE_INT (x shift) 0; + *overflow = (HOST_WIDE_INT) (x shift) 0; } else { @@ -1369,7 +1369,7 @@ wi::mul_internal (HOST_WIDE_INT *val, const HOST_WIDE_INT *op1val, { if (sgn == SIGNED) { - if (HOST_WIDE_INT (r) != sext_hwi (r, prec)) + if ((HOST_WIDE_INT) (r) != sext_hwi (r, prec)) *overflow = true; } else @@ -1549,7 +1549,7 @@ wi::sub_large (HOST_WIDE_INT *val, const HOST_WIDE_INT *op0, if (sgn == SIGNED) { unsigned HOST_WIDE_INT x = (o0 ^ o1) (val[len - 1] ^ o0); - *overflow = HOST_WIDE_INT (x shift) 0; + *overflow = (HOST_WIDE_INT) (x shift) 0; } else { diff --git a/gcc/wide-int.h b/gcc/wide-int.h index a0241f2..2163f3c 100644 --- a/gcc/wide-int.h +++ b/gcc/wide-int.h @@ -801,7 +801,7 @@ generic_wide_int storage::sign_mask () const if (excess 0) high = excess; } - return HOST_WIDE_INT (high) 0 ? -1 : 0; + return (HOST_WIDE_INT) (high) 0 ? -1 : 0; } /* Return the signed value of the least-significant explicitly-encoded
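The syntax errors fixed here come from a C++ rule that is easy to reproduce outside GCC: a function-style cast needs a single-word type name, so it breaks as soon as the macro expands to `long long`. The sketch below assumes a stand-in macro `HWI` playing the role of `HOST_WIDE_INT` on such a host; `sign_mask` is a simplified free function, not the member function from wide-int.h.

```cpp
// Stand-in for HOST_WIDE_INT on a host where it is 'long long'
// (as on arm-linux, per the message). On x86_64 the real macro is a
// single word, which is presumably why the problem went unnoticed.
#define HWI long long

// Writing `HWI (high) < 0` would expand to `long long (high) < 0`.
// That is not valid C++: the function-style cast syntax only accepts a
// simple (single-word) type name. The C-style cast below works for any
// spelling of the type.
int sign_mask(unsigned HWI high) {
    // Reinterpret the top block as signed and test its sign bit.
    return (HWI) (high) < 0 ? -1 : 0;
}
```

The same reasoning applies to the `*overflow = ...` assignments in wide-int.cc, where the cast is applied to a shifted unsigned value before the sign test.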
Re: [PATCH, ARM] Enable tail call optimization for long call
On Wed, Apr 2, 2014 at 12:04 PM, Jiong Wang jiong.w...@arm.com wrote: On 25/03/14 15:44, Richard Earnshaw wrote: On 24/03/14 11:26, Jiong Wang wrote: This patch enables tail call optimization for long calls on arm. Previously the check in arm_function_ok_for_sibcall was too strict and the sibcall/sibcall_value expanders lacked support for long calls, so tail-call opportunities for long calls were lost. OK for next stage 1? I think this is OK for EABI targets (since we can rely on the linker generating the right form of interworking veneer), but I'm less certain about other systems (do we still support COFF?). I think I'd prefer the patch to factor in TARGET_AAPCS_BASED and to assume that if that is true then arbitrary tail-calls are safe. Hi Richard, IMHO, this is actually a tail call optimization; we just need to make sure the register that holds the address is caller-saved, then it will be OK. Updated the change log to fix that aarch64 typo. No modification to the patch, but I enclose it in this reply for completeness. So, is it ok for next stage-1? This is OK for stage1. Ramana Thanks. -- Jiong gcc/ * config/arm/predicates.md (call_insn_operand): Add long_call check. * config/arm/arm.md (sibcall, sibcall_value): Force the address to reg for long_call. * config/arm/arm.c (arm_function_ok_for_sibcall): Remove long_call restriction. gcc/testsuite * gcc.target/arm/tail-long-call.c: New test.
[patch] fix libstdc++/60734 - invalid static_cast
Tested x86_64-linux, committed to trunk. I intend to fix this for 4.9.1 and 4.8.3 too. commit 1e13d4d7791a665bb254e0d53e96d3a5ab925023 Author: Jonathan Wakely a...@kayari.org Date: Tue Apr 15 00:36:35 2014 +0100 PR libstdc++/60734 * include/bits/stl_tree.h (_Rb_tree::_M_end): Fix invalid cast. diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h index 4bc3c60..288c9fa 100644 --- a/libstdc++-v3/include/bits/stl_tree.h +++ b/libstdc++-v3/include/bits/stl_tree.h @@ -526,11 +526,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Link_type _M_end() _GLIBCXX_NOEXCEPT - { return static_cast<_Link_type>(&this->_M_impl._M_header); } + { return reinterpret_cast<_Link_type>(&this->_M_impl._M_header); } _Const_Link_type _M_end() const _GLIBCXX_NOEXCEPT - { return static_cast<_Const_Link_type>(&this->_M_impl._M_header); } + { return reinterpret_cast<_Const_Link_type>(&this->_M_impl._M_header); } static const_reference _S_value(_Const_Link_type __x)
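To show why the cast matters, here is a reduced model of the situation, under the assumption that the tree header is a plain base-node object used as the end sentinel. `NodeBase`, `Node`, `Tree` and `end_node` are simplified stand-ins invented for this sketch, not the libstdc++ types.

```cpp
// Reduced model: link fields live in the base, the payload in the
// derived node type.
struct NodeBase {
    NodeBase *parent = nullptr;
    NodeBase *left = nullptr;
    NodeBase *right = nullptr;
};

template <typename T>
struct Node : NodeBase {
    T value;
};

template <typename T>
struct Tree {
    // The sentinel is only a NodeBase; no Node<T> object exists here.
    NodeBase header;

    Node<T> *end_node() {
        // static_cast<Node<T> *>(&header) would be a downcast of an
        // object whose dynamic type is not Node<T>, which is undefined
        // behavior. reinterpret_cast merely relabels the pointer, so
        // the result is usable as a sentinel address as long as `value`
        // is never accessed through it.
        return reinterpret_cast<Node<T> *>(&header);
    }
};
```

The returned pointer still compares equal to the address of `header`, which is all an end iterator needs.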
[PATCH][AArch64] Vectorise bswap[16,32,64]
Hi all, This patch enables aarch64 to vectorise bswap[16,32,64] operations by using the Advanced SIMD forms of the rev[16,32,64] instructions. The TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION hook is extended to return the vectorised forms of __builtin_bswap* where possible and vector bswap patterns are added. I've added the tests in vect.exp and a new effective target check (vect_bswap) that can be extended for other targets in the future if they can also vectorise these operations. Is that ok? Bootstrapped and tested aarch64-none-linux-gnu. Ok for trunk? Thanks, Kyrill 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/aarch64/aarch64-builtins.c (aarch64_builtin_vectorized_function): Handle BUILT_IN_BSWAP16, BUILT_IN_BSWAP32, BUILT_IN_BSWAP64. * config/aarch64/aarch64-simd.md (bswap<mode>): New pattern. * config/aarch64/aarch64-simd-builtins.def: Define vector bswap builtins. * config/aarch64/iterators.md (VDQHSD): New mode iterator. (Vrevsuff): New mode attribute. 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * lib/target-supports.exp (check_effective_target_vect_bswap): New. * gcc.dg/vect/vect-bswap16.c: New test. * gcc.dg/vect/vect-bswap32.c: Likewise. 
* gcc.dg/vect/vect-bswap64: Likewise.commit 0d6d820881443a7ce7f9bd51f35aff04866c5e57 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Thu Apr 3 09:22:14 2014 +0100 [AArch64] vectorise bswap diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 55cfe0a..d839a40 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -1086,7 +1086,29 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in) return aarch64_builtin_decls[builtin]; } - + case BUILT_IN_BSWAP16: +#undef AARCH64_CHECK_BUILTIN_MODE +#define AARCH64_CHECK_BUILTIN_MODE(C, N) \ + (out_mode == N##Imode out_n == C \ +in_mode == N##Imode in_n == C) + if (AARCH64_CHECK_BUILTIN_MODE (4, H)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv4hi]; + else if (AARCH64_CHECK_BUILTIN_MODE (8, H)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv8hi]; + else + return NULL_TREE; + case BUILT_IN_BSWAP32: + if (AARCH64_CHECK_BUILTIN_MODE (2, S)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv2si]; + else if (AARCH64_CHECK_BUILTIN_MODE (4, S)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv4si]; + else + return NULL_TREE; + case BUILT_IN_BSWAP64: + if (AARCH64_CHECK_BUILTIN_MODE (2, D)) + return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOPU_bswapv2di]; + else + return NULL_TREE; default: return NULL_TREE; } diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index c9b7570..e9736da 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -330,6 +330,8 @@ VAR1 (UNOP, floatunsv4si, 2, v4sf) VAR1 (UNOP, floatunsv2di, 2, v2df) + VAR5 (UNOPU, bswap, 10, v4hi, v8hi, v2si, v4si, v2di) + /* Implemented by aarch64_PERMUTE:perm_insnPERMUTE:perm_hilomode. 
*/ BUILTIN_VALL (BINOP, zip1, 0) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 73aee2c..75db3e8 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -286,6 +286,14 @@ [(set_attr type neon_mul_Vetypeq)] ) +(define_insn bswapmode + [(set (match_operand:VDQHSD 0 register_operand =w) +(bswap:VDQHSD (match_operand:VDQHSD 1 register_operand w)))] + TARGET_SIMD + revVrevsuff\\t%0.Vbtype, %1.Vbtype + [(set_attr type neon_revq)] +) + (define_insn *aarch64_mul3_eltmode [(set (match_operand:VMUL 0 register_operand =w) (mult:VMUL diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index f1339b8..2b5ebd1 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -150,6 +150,9 @@ ;; Vector modes for H and S types. (define_mode_iterator VDQHS [V4HI V8HI V2SI V4SI]) +;; Vector modes for H, S and D types. +(define_mode_iterator VDQHSD [V4HI V8HI V2SI V4SI V2DI]) + ;; Vector modes for Q, H and S types. (define_mode_iterator VDQQHS [V8QI V16QI V4HI V8HI V2SI V4SI]) @@ -352,6 +355,9 @@ (V2DI 2d) (V2SF 2s) (V4SF 4s) (V2DF 2d)]) +(define_mode_attr Vrevsuff [(V4HI 16) (V8HI 16) (V2SI 32) +(V4SI 32) (V2DI 64)]) + (define_mode_attr Vmtype [(V8QI .8b) (V16QI .16b) (V4HI .4h) (V8HI .8h) (V2SI .2s) (V4SI .4s) diff --git a/gcc/testsuite/gcc.dg/vect/vect-bswap16.c b/gcc/testsuite/gcc.dg/vect/vect-bswap16.c new file mode 100644 index 000..b452a29 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/vect-bswap16.c @@ -0,0 +1,44 @@ +/* { dg-require-effective-target vect_bswap } */ + +#include tree-vect.h +
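The kind of loop this patch targets looks like the following. This is a sketch of a candidate loop, not one of the new testcases (their bodies are not shown in this excerpt); the function name `bswap32_array` is made up. Whether the loop is actually vectorised depends on the target, the optimization level and the vectoriser's own checks.

```cpp
#include <cstddef>
#include <cstdint>

// Byte-swap every 32-bit element. With the patch applied, the
// vectoriser on aarch64 may map __builtin_bswap32 to the vector
// bswap pattern instead of swapping one element at a time.
void bswap32_array(std::uint32_t *dst, const std::uint32_t *src,
                   std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = __builtin_bswap32(src[i]);
}
```

The 16-bit and 64-bit cases follow the same shape with `__builtin_bswap16` and `__builtin_bswap64`.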
Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
On Tue, 15 Apr 2014, Richard Sandiford wrote: I believe you need to adjust constraints to ensure constant 0 is known to produce a 16-bit instruction encoding where possible. Otherwise you'll end up with suboptimal code when the instruction is in a branch delay slot. Yeah, it'd be good to do that too (although this is a preexisting problem). Well, it depends on how you look at the problem being solved here -- if it is "for SW16, SH16 and SB16 GCC produces broken code for the `s0' source register", then indeed it is a preexisting problem, whereas if it is "GCC does not handle the source register set for SW16, SH16 and SB16 correctly", then it is part of the same problem, which has not been completely corrected. I can live with that until 4.10/4.9.1 though if you prefer. I'm relying on you guys to do the microMIPS stuff though -- I don't have a way of testing it. An assembly/objdump test is enough to cover this, so you've got all the tools at hand, although I understand you may not be inclined to rush working on it. ;) Maciej
RE: [PATCH, FORTRAN] Fix PR fortran/60718
Hi Tobias, On Fri, 11 Apr 2014 16:04:51, Tobias Burnus wrote: Hi Tobias, On Fri, Apr 11, 2014 at 02:39:57PM +0200, Bernd Edlinger wrote: On Fri, 11 Apr 2014 13:37:46, Tobias Burnus wrote: Hmm, I was hoping somehow that only that test case is broken, and needs to be fixed. The target attribute is somehow simple, it implies intent(in) and the actual value will in most cases be a pointer, as in the example. I think that passing another nonpointer TARGET to a dummy argument which has a TARGET attribute is at least as common as passing a POINTER to a TARGET. TARGET is roughly the opposite to the restrict qualifier. By default any nonpointer variable does not alias with something else, unless it has the TARGET attribute; if it has, it (its address) can then be assigned to a pointer. POINTER intrinsically alias and cannot have the TARGET attribute. Pointer - Nonalloc Allocatable - Noalloc Nonallocatable*/Allocatable* - Pointer with intent(in) Well, this approach does not handle intent(inout) at all. Now I have created a test case for the different aliasing issues which may arise with scalar objects. As you pointed out, also conversions of allocatable - nonalloc, allocatable - pointer and nonalloc - pointer turn out to violate the strict aliasing rules. However, conversions of arrays of objects with different attributes seem to be safe. I have not been able to find an example where it would be necessary to write the modified class object back to the original location. But I am not really a Fortran expert. Unfortunately there are also conversions of optional allocatable - optional pointer, which complicate the whole thing quite a lot. I have found these in class_optional_2.f90. Boot-strapped and regression-tested on x86_64-linux-gnu. OK for trunk? Thanks Bernd. 
2014-04-15 Bernd Edlinger bernd.edlin...@hotmail.de PR fortran/60718 * trans-expr.c (gfc_conv_procedure_call): Fix a strict aliasing violation when passing a class object to a formal parameter which has different pointer or allocatable attributes. patch-pr60718.diff Description: Binary data
Re: [PING PATCH] Extend mode-switching to support toggle
On 04/15/2014 01:13 PM, Joern Rennecke wrote: On 15 April 2014 10:20, Christian Bruel christian.br...@st.com wrote: Hello, I guess it's for RTL maintainers. Also interested by mode-switching.c last contributors (from past ChangeLog entries) comments, http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00196.html This only helps if there are exactly two modes for an entity. An interface which extends EMIT_MODE_SET with a parameter for the known mode would be more versatile. I.e. if you need to manipulate a control register with an AND and OR to set a specific mode with no knowledge of the previous mode, having the known previous mode allows to use a single add or xor to make the desired switch. An unknown mode could be represented by no_mode or -1. yes, I didn't have a 3 state (or more) toggling in mind. My implementation only works for 2 states entities (flip on/off a bit). More than that would require not using a XOR and should expose the test/set_toggle_status machinery to a machine description part. This is a limitation of my proposal, If this will be a missing extension for a target, we will need to move the test/set toggle machinery out of the machine independent part. Maybe overkill as of today. I agree that extending the current EMIT_MODE_SET might be more flexible than a new EMIT_TOGGLE. I was balancing between the two interfaces... thanks for this point. While you are at it, you should also hookize the thing. OK, FWIW, I have noted down some weaknesses/improvement opportunities of the mode switching pass in: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29349 . This also touches the fpchg issue Thanks, I have also some example where this can be improved, hoping to resurrect this problem one day.
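To make the point about a known previous mode concrete, here is a minimal C sketch of the arithmetic only. The register layout, MODE_SHIFT, MODE_MASK and both helper names are made up for illustration; none of this is the mode-switching.c interface or any target's EMIT_MODE_SET.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical control register layout: the mode lives in bits 4..5.  */
#define MODE_SHIFT 4
#define MODE_MASK  (UINT32_C (3) << MODE_SHIFT)

/* Previous mode unknown: clear the field (AND), then set it (OR).  */
static uint32_t
set_mode_unknown_prev (uint32_t reg, unsigned mode)
{
  return (reg & ~MODE_MASK) | ((uint32_t) mode << MODE_SHIFT);
}

/* Previous mode known: one XOR flips exactly the bits that differ.  */
static uint32_t
set_mode_known_prev (uint32_t reg, unsigned prev, unsigned mode)
{
  return reg ^ ((uint32_t) (prev ^ mode) << MODE_SHIFT);
}
```

With only two modes the XOR operand is a constant, which is the toggle case; with three or more modes the operand depends on the (prev, mode) pair, which is why the known previous mode would have to be passed down.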
[PATCH] Fix PR56965, move nonoverlapping_component_refs_p
This moves nonoverlapping_component_refs_p (now that I committed the patch to make it O (n log n) instead of O (n^3)) to the tree alias oracle (which is dispatched to from the RTL oracle). This fixes the wrong-code part of the bug by moving the call to nonoverlapping_component_refs_p at a place where we verified that we are not presented with weird may_alias or type-punned (via VIEW_CONVERT_EXPR) accesses. I didn't yet relax some of its restrictions (we can safely skip real/imag-part and array[-range?]-refs - we don't have to stop at them. There is also the question whether aliasing_component_refs_p still does sth useful after this (if those restrictions are removed). Bootstrap / regtest pending on x86_64-unknown-linux-gnu. Eric, does this look reasonable? Thanks, Richard. 2014-04-15 Richard Biener rguent...@suse.de PR rtl-optimization/56965 * alias.c (ncr_compar, nonoverlapping_component_refs_p): Move ... * tree-ssa-alias.c (ncr_compar, nonoverlapping_component_refs_p): ... here. * alias.c (true_dependence_1): Do not call nonoverlapping_component_refs_p. * tree-ssa-alias.c (indirect_ref_may_alias_decl_p): Call nonoverlapping_component_refs_p. (indirect_refs_may_alias_p): Likewise. * gcc.dg/torture/pr56965-1.c: New testcase. * gcc.dg/torture/pr56965-2.c: Likewise. Index: gcc/alias.c === *** gcc/alias.c (revision 209415) --- gcc/alias.c (working copy) *** read_dependence (const_rtx mem, const_rt *** 2248,2373 return false; } - /* qsort compare function to sort FIELD_DECLs after their -DECL_FIELD_CONTEXT TYPE_UID. 
*/ - - static inline int - ncr_compar (const void *field1_, const void *field2_) - { - const_tree field1 = *(const_tree *) const_cast void *(field1_); - const_tree field2 = *(const_tree *) const_cast void *(field2_); - unsigned int uid1 - = TYPE_UID (TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field1))); - unsigned int uid2 - = TYPE_UID (TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field2))); - if (uid1 uid2) - return -1; - else if (uid1 uid2) - return 1; - return 0; - } - - /* Return true if we can determine that the fields referenced cannot -overlap for any pair of objects. */ - - static bool - nonoverlapping_component_refs_p (const_rtx rtlx, const_rtx rtly) - { - const_tree x = MEM_EXPR (rtlx), y = MEM_EXPR (rtly); - - if (!flag_strict_aliasing - || !x || !y - || TREE_CODE (x) != COMPONENT_REF - || TREE_CODE (y) != COMPONENT_REF) - return false; - - auto_vecconst_tree, 16 fieldsx; - while (TREE_CODE (x) == COMPONENT_REF) - { - tree field = TREE_OPERAND (x, 1); - tree type = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field)); - if (TREE_CODE (type) == RECORD_TYPE) - fieldsx.safe_push (field); - x = TREE_OPERAND (x, 0); - } - if (fieldsx.length () == 0) - return false; - auto_vecconst_tree, 16 fieldsy; - while (TREE_CODE (y) == COMPONENT_REF) - { - tree field = TREE_OPERAND (y, 1); - tree type = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field)); - if (TREE_CODE (type) == RECORD_TYPE) - fieldsy.safe_push (TREE_OPERAND (y, 1)); - y = TREE_OPERAND (y, 0); - } - if (fieldsy.length () == 0) - return false; - - /* Most common case first. 
*/ - if (fieldsx.length () == 1 -fieldsy.length () == 1) - return ((TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldsx[0])) -== TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldsy[0]))) -fieldsx[0] != fieldsy[0] -!(DECL_BIT_FIELD (fieldsx[0]) DECL_BIT_FIELD (fieldsy[0]))); - - if (fieldsx.length () == 2) - { - if (ncr_compar (fieldsx[0], fieldsx[1]) == 1) - { - const_tree tem = fieldsx[0]; - fieldsx[0] = fieldsx[1]; - fieldsx[1] = tem; - } - } - else - fieldsx.qsort (ncr_compar); - - if (fieldsy.length () == 2) - { - if (ncr_compar (fieldsy[0], fieldsy[1]) == 1) - { - const_tree tem = fieldsy[0]; - fieldsy[0] = fieldsy[1]; - fieldsy[1] = tem; - } - } - else - fieldsy.qsort (ncr_compar); - - unsigned i = 0, j = 0; - do - { - const_tree fieldx = fieldsx[i]; - const_tree fieldy = fieldsy[j]; - tree typex = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldx)); - tree typey = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldy)); - if (typex == typey) - { - /* We're left with accessing different fields of a structure, -no possible overlap, unless they are both bitfields. */ - if (fieldx != fieldy) - return !(DECL_BIT_FIELD (fieldx) DECL_BIT_FIELD (fieldy)); - } - if (TYPE_UID (typex) TYPE_UID (typey)) - { - i++; - if (i == fieldsx.length ()) - break; - }
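The sort-then-merge idea behind the O (n log n) version can be sketched in plain C. This is a simplified stand-in, not the GCC code: struct field_ref, field_ref_cmp and paths_nonoverlapping_p are hypothetical names, the UIDs are plain integers instead of TYPE_UIDs, and when both sides name the same field the sketch advances both cursors, which may differ from the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for a FIELD_DECL: the UID of the record type containing the
   field, an identity for the field itself, and a bit-field flag.  */
struct field_ref
{
  unsigned type_uid;
  int field_id;
  bool is_bitfield;
};

/* qsort comparison: order by the UID of the containing type.  */
static int
field_ref_cmp (const void *a_, const void *b_)
{
  const struct field_ref *a = a_, *b = b_;
  if (a->type_uid < b->type_uid)
    return -1;
  if (a->type_uid > b->type_uid)
    return 1;
  return 0;
}

/* Sort both access paths, then walk them in step.  At the first record
   type present in both paths with different members, the accesses cannot
   overlap unless both members are bit-fields.  Return false when no such
   type is found.  */
static bool
paths_nonoverlapping_p (struct field_ref *x, size_t nx,
                        struct field_ref *y, size_t ny)
{
  qsort (x, nx, sizeof *x, field_ref_cmp);
  qsort (y, ny, sizeof *y, field_ref_cmp);

  size_t i = 0, j = 0;
  while (i < nx && j < ny)
    {
      if (x[i].type_uid == y[j].type_uid)
        {
          if (x[i].field_id != y[j].field_id)
            return !(x[i].is_bitfield && y[j].is_bitfield);
          /* Same field on both sides: no information, keep walking.  */
          i++;
          j++;
        }
      else if (x[i].type_uid < y[j].type_uid)
        i++;
      else
        j++;
    }
  return false;
}
```

The two sorts dominate the cost; the walk itself is linear in the combined path length.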
Re: [PATCH] [libgomp] make it possible to use OMP on both sides of a fork
On Tue, Mar 4, 2014 at 11:37 PM, Nathaniel Smith n...@pobox.com wrote: On Tue, Feb 18, 2014 at 8:58 PM, Richard Henderson r...@redhat.com wrote: On 02/16/2014 03:59 PM, Nathaniel Smith wrote: Yes, but the problem is that depending on what the user intends to do after forking, our pthread_atfork handler might help or it might hurt, and we don't know which. Consider these two cases: - fork+exec - fork+continue to use OMP in child The former case is totally POSIX-legal, even when performed at arbitrary places, even when another thread is, say, in the middle of calling malloc(). Point well taken. Hi all, I guess this patch has gotten all the feedback that it's getting. Any interest in committing it? :-) I don't have commit access. 2014-02-12 Nathaniel J. Smith n...@pobox.com * team.c (gomp_free_pool_helper): Move per-thread cleanup to main thread. (gomp_free_thread): Delegate implementation to... (gomp_free_thread_pool): ...this new function. Like old gomp_free_thread, but does per-thread cleanup, and has option to skip everything that involves interacting with actual threads, which is useful when called after fork. (gomp_after_fork_callback): New function. (gomp_team_start): Register atfork handler, and check for fork on entry. Pinging this again now that trunk has re-opened. For compliant code this patch has essentially no impact (OMP-using code acquires a single-line post-fork callback which sets a flag; everything else works the same as now). For technically non-compliant mostly serial code that uses OMP in some places, and forks children in other places, it makes a best effort attempt to clean up the thread pool detritus left by a fork, instead of simply deadlocking as currently, so as to allow children to use OMP as well. This makes GOMP match the behaviour of all other OMP implementations I'm aware of. 
Previous discussion: http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00813.html Bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035 I don't have a commit bit -- please commit if acceptable. Cheers, -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
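For readers who have not opened the patch, the "post-fork callback which sets a flag" scheme can be sketched roughly as below. All names here (pool_stale_after_fork, pool_generation, after_fork_in_child, team_start) are hypothetical and are not the libgomp symbols; the sketch only shows the shape of the idea, assuming a POSIX system with pthread_atfork.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state; not the libgomp data structures.  */
static bool pool_stale_after_fork;
static bool atfork_registered;
static int pool_generation;

/* Runs in the child right after fork ().  It only records the fact,
   which keeps fork+exec users unaffected.  */
static void
after_fork_in_child (void)
{
  pool_stale_after_fork = true;
}

/* Called on entry to a parallel region.  */
static void
team_start (void)
{
  if (!atfork_registered)
    {
      int rc = pthread_atfork (NULL, NULL, after_fork_in_child);
      assert (rc == 0);
      atfork_registered = true;
    }
  if (pool_stale_after_fork)
    {
      /* The worker threads did not survive the fork: drop the old
         bookkeeping without touching them and start a fresh pool.  */
      pool_generation++;
      pool_stale_after_fork = false;
    }
}
```

The cleanup is deferred to the next OMP entry point in the child, so a child that never uses OMP pays only for the flag store.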
RE: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
-Original Message- From: Maciej W. Rozycki [mailto:ma...@codesourcery.com] Sent: Tuesday, April 15, 2014 7:28 AM To: Richard Sandiford Cc: Matthew Fortune; Moore, Catherine; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16 On Tue, 15 Apr 2014, Richard Sandiford wrote: I believe you need to adjust constraints to ensure constant 0 is known to produce a 16-bit instruction encoding where possible. Otherwise you'll end up with suboptimal code when the instruction is in a branch delay slot. Yeah, it'd be good to do that too (although this is a preexisting problem). Well, it depends on how you look at the problem being solved here -- if it is for SW16, SH16 and SB16 GCC produces broken code for the `s0' source register, then indeed it is, whereas if it is GCC does not handle the source register set for SW16, SH16 and SB16 correctly, then it is a part of the same problem, not completely corrected. I can live with that until 4.10/4.9.1 though if you prefer. I'm relying on you guys to do the microMIPS stuff though -- I don't have a way of testing it. An assembly/objdump test is enough to cover this, so you've got all tools at hand, although I understand you may not be inclined to rush working on it. ;) I'll take care of this bit.
Re: [PATCH][RFC] Remove RTL loop unswitching
On Tue, 15 Apr 2014, Richard Biener wrote: This removes RTL loop unswitching (see last year's discussion about compile-time issues of that pass). RTL loop unswitching is enabled together with GIMPLE loop unswitching at -O3 and by -funswitch-loops. It's clearly the wrong place to do high-level loop transforms these days, and the questionable benefit doesn't outweigh the cost of maintenance. Thus the following patch removes it. Bootstrap / regtest pending on x86_64-unknown-linux-gnu (I hope for testsuite fallout). No testsuite fallout, thus no testcases that test for a working RTL unswitching (on x86_64/i586 at least). Richard. Any objections? Thanks, Richard. 2014-04-15 Richard Biener rguent...@suse.de * Makefile.in (OBJS): Remove loop-unswitch.o. * loop-unswitch.c: Delete. * tree-pass.h (make_pass_rtl_unswitch): Remove. * passes.def (pass_rtl_unswitch): Likewise. * loop-init.c (gate_rtl_unswitch): Likewise. (rtl_unswitch): Likewise. (pass_data_rtl_unswitch): Likewise. (pass_rtl_unswitch): Likewise. (make_pass_rtl_unswitch): Likewise. * rtl.h (reversed_condition): Likewise. (compare_and_jump_seq): Likewise. * loop-iv.c (reversed_condition): Move here from loop-unswitch.c and make static. * loop-unroll.c (compare_and_jump_seq): Likewise. 
Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 209410) +++ gcc/Makefile.in (working copy) @@ -1294,7 +1294,6 @@ OBJS = \ loop-invariant.o \ loop-iv.o \ loop-unroll.o \ - loop-unswitch.o \ lower-subreg.o \ lra.o \ lra-assigns.o \ Index: gcc/tree-pass.h === --- gcc/tree-pass.h (revision 209410) +++ gcc/tree-pass.h (working copy) @@ -512,7 +512,6 @@ extern rtl_opt_pass *make_pass_outof_cfg extern rtl_opt_pass *make_pass_loop2 (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_loop_init (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_move_loop_invariants (gcc::context *ctxt); -extern rtl_opt_pass *make_pass_rtl_unswitch (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_unroll_and_peel_loops (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_doloop (gcc::context *ctxt); extern rtl_opt_pass *make_pass_rtl_loop_done (gcc::context *ctxt); Index: gcc/passes.def === --- gcc/passes.def(revision 209410) +++ gcc/passes.def(working copy) @@ -341,7 +341,6 @@ along with GCC; see the file COPYING3. PUSH_INSERT_PASSES_WITHIN (pass_loop2) NEXT_PASS (pass_rtl_loop_init); NEXT_PASS (pass_rtl_move_loop_invariants); - NEXT_PASS (pass_rtl_unswitch); NEXT_PASS (pass_rtl_unroll_and_peel_loops); NEXT_PASS (pass_rtl_doloop); NEXT_PASS (pass_rtl_loop_done); Index: gcc/loop-init.c === --- gcc/loop-init.c (revision 209410) +++ gcc/loop-init.c (working copy) @@ -518,61 +518,7 @@ make_pass_rtl_move_loop_invariants (gcc: } -/* Loop unswitching for RTL. 
*/ -static bool -gate_rtl_unswitch (void) -{ - return flag_unswitch_loops; -} - -static unsigned int -rtl_unswitch (void) -{ - if (number_of_loops (cfun) 1) -unswitch_loops (); - return 0; -} - -namespace { - -const pass_data pass_data_rtl_unswitch = -{ - RTL_PASS, /* type */ - loop2_unswitch, /* name */ - OPTGROUP_LOOP, /* optinfo_flags */ - true, /* has_gate */ - true, /* has_execute */ - TV_LOOP_UNSWITCH, /* tv_id */ - 0, /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - TODO_verify_rtl_sharing, /* todo_flags_finish */ -}; - -class pass_rtl_unswitch : public rtl_opt_pass -{ -public: - pass_rtl_unswitch (gcc::context *ctxt) -: rtl_opt_pass (pass_data_rtl_unswitch, ctxt) - {} - - /* opt_pass methods: */ - bool gate () { return gate_rtl_unswitch (); } - unsigned int execute () { return rtl_unswitch (); } - -}; // class pass_rtl_unswitch - -} // anon namespace - -rtl_opt_pass * -make_pass_rtl_unswitch (gcc::context *ctxt) -{ - return new pass_rtl_unswitch (ctxt); -} - - -/* Loop unswitching for RTL. */ +/* Loop unrolling and peeling for RTL. */ static bool gate_rtl_unroll_and_peel_loops (void) { Index: gcc/loop-iv.c === --- gcc/loop-iv.c (revision 209410) +++ gcc/loop-iv.c (working copy) @@ -1732,6 +1732,21 @@ canon_condition (rtx cond) return cond; } +/* Reverses CONDition; returns NULL if we cannot. */ + +static rtx +reversed_condition (rtx cond) +{ + enum rtx_code reversed; + reversed = reversed_comparison_code
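As a reminder of what the transformation under discussion does, here is a source-level C illustration of loop unswitching. The function names are made up, and this shows only the effect of the transform, not how either the GIMPLE or the removed RTL pass implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the loop-invariant test on FLAG is evaluated every iteration.  */
static long
sum_plain (const int *a, size_t n, int flag)
{
  long s = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (flag)
        s += a[i];
      else
        s -= a[i];
    }
  return s;
}

/* After unswitching: the test is hoisted and the loop is duplicated,
   one copy per outcome of the condition.  */
static long
sum_unswitched (const int *a, size_t n, int flag)
{
  long s = 0;
  if (flag)
    for (size_t i = 0; i < n; i++)
      s += a[i];
  else
    for (size_t i = 0; i < n; i++)
      s -= a[i];
  return s;
}
```

The duplication trades code size for a branch-free loop body, which is why it is a high-level decision better made before RTL.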
[PATCH] Bump LTO_major_version
This bumps LTO_major_version on trunk to avoid ICEing when accidentally mixing objects with 4.9 ones. We'll soon start to diverge (if we didn't already).

Committed.

Richard.

2014-04-15 Richard Biener rguent...@suse.de

* lto-streamer.h (LTO_major_version): Bump to 4.

Index: gcc/lto-streamer.h
===================================================================
--- gcc/lto-streamer.h (revision 209415)
+++ gcc/lto-streamer.h (working copy)
@@ -140,7 +140,7 @@ along with GCC; see the file COPYING3.
    sections a '.' and the section type are appended.  */
 #define LTO_SECTION_NAME_PREFIX ".gnu.lto_"
 
-#define LTO_major_version 3
+#define LTO_major_version 4
 #define LTO_minor_version 0
 
 typedef unsigned char lto_decl_flags_t;
Re: [PATCH] Change HONOR_REG_ALLOC_ORDER to a macro for C expression
Hi Vladimir: Although this patch is safe. I guess it could wait for stage 1 as right now we don't need this functionality. The patch is ok for the stage1 which is probably about a month away. ping is this patch ok now? thanks:) Thanks for the patch.
Re: Ignore cxa_pure_virtual in ipa-devirt's target lists
On 04/14/2014 04:06 PM, Jan Hubicka wrote: 1) write backend pass that walks vtables and for every non-hidden noreturn function it can produce hidden COMDAT wrapper with a fixed (GCC local) mangling. Perhaps with a command line option to do so for all virtual functions; while this would introduce extra call into most of virtual calls, this may not be too bad with LTO, where we can do that only for virtuals that are exported from unit. With LTO, we could do this for all virtuals that are not defined locally; nothing in the language allows you to compare the address value of a vtable element...but I suppose speculative devirtualization does, so maybe not. 2) make something similar in C++ FE (probably not so cool idea because of the comment above) 3) introduce hidden local_cxa_pure_virtual into bits that are linked into every DSO so we do not need to play with COMDAT. This makes sense to me. Jason
Re: [PATCH][ARM] PR60663: Improve RTX costs for asm statements
On 15/04/14 11:56, Kyrill Tkachov wrote:

Hi all, This patch relates to PR60663 where cse got confused due to asm statements being given a cost of zero in the arm backend. Jakub already put in a fix to cse for 4.9.0 (http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00512.html) but we should still fix the costs in arm. This patch does that by estimating the number of instructions in the asm statement, adding the cost of the input operands and making sure that it's at least COSTS_N_INSNS (1). Tested and bootstrapped on arm-none-linux-gnueabihf. Ok for trunk? Thanks, Kyrill

2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com

PR rtl-optimization/60663
* config/arm/arm.c (arm_new_rtx_costs): Improve ASM_OPERANDS case, avoid 0 cost.

I'd be inclined to use 1 + number of operands, rather than MAX. OK with that change.

R.

pr60663.patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 91e4cd8..ce7ee82 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -10758,10 +10758,16 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
       return true;
 
     case ASM_OPERANDS:
-      /* Just a guess.  Cost one insn per input.  */
-      *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x));
-      return true;
+      {
+      /* Just a guess.  Guess number of instructions in the asm
+         plus one insn per input.  Always a minimum of COSTS_N_INSNS (1)
+         though (see PR60663).  */
+        int asm_length = asm_str_count (ASM_OPERANDS_TEMPLATE (x));
+        int num_operands = ASM_OPERANDS_INPUT_LENGTH (x);
+        *cost = COSTS_N_INSNS (MAX (1, asm_length + num_operands));
+        return true;
+      }
 
     default:
       if (mode != VOIDmode)
         *cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On 04/14/2014 11:10 AM, Marek Polacek wrote: + else if (TREE_CODE (val) == IDENTIFIER_NODE) +{ + tree t = lookup_name (val); + if (t TREE_CODE (t) == CONST_DECL) + return DECL_INITIAL (t); +} I'm uncomfortable with this; we should have looked up any attributes in the parser. Does the testsuite hit this code? Jason
Re: C++ PATCH for c++/51747 (list-initialization from same type)
On 04/14/2014 06:02 PM, Marc Glisse wrote: shouldn't the same also apply if VECTOR_TYPE_P (type), not just for CLASS_TYPE_P (type)? Sure. Do you want to make that change? Jason
Re: [PATCH][ARM] PR60663: Improve RTX costs for asm statements
On 15/04/14 14:55, Richard Earnshaw wrote:

On 15/04/14 11:56, Kyrill Tkachov wrote:

Hi all, This patch relates to PR60663 where cse got confused due to asm statements being given a cost of zero in the arm backend. Jakub already put in a fix to cse for 4.9.0 (http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00512.html) but we should still fix the costs in arm. This patch does that by estimating the number of instructions in the asm statement, adding the cost of the input operands and making sure that it's at least COSTS_N_INSNS (1). Tested and bootstrapped on arm-none-linux-gnueabihf. Ok for trunk? Thanks, Kyrill

2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com

PR rtl-optimization/60663
* config/arm/arm.c (arm_new_rtx_costs): Improve ASM_OPERANDS case, avoid 0 cost.

I'd be inclined to use 1 + number of operands, rather than MAX. OK with that change.

Thanks Richard, I've committed this slightly modified version as r209419.

Kyrill

R.

pr60663.patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 91e4cd8..ce7ee82 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -10758,10 +10758,16 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
       return true;
 
     case ASM_OPERANDS:
-      /* Just a guess.  Cost one insn per input.  */
-      *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x));
-      return true;
+      {
+      /* Just a guess.  Guess number of instructions in the asm
+         plus one insn per input.  Always a minimum of COSTS_N_INSNS (1)
+         though (see PR60663).  */
+        int asm_length = asm_str_count (ASM_OPERANDS_TEMPLATE (x));
+        int num_operands = ASM_OPERANDS_INPUT_LENGTH (x);
+        *cost = COSTS_N_INSNS (MAX (1, asm_length + num_operands));
+        return true;
+      }
 
     default:
       if (mode != VOIDmode)
         *cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e5cf503..773c353 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -10670,10 +10670,16 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
       return true;
 
     case ASM_OPERANDS:
-      /* Just a guess.  Cost one insn per input.  */
-      *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x));
-      return true;
+      {
+      /* Just a guess.  Guess number of instructions in the asm
+         plus one insn per input.  Always a minimum of COSTS_N_INSNS (1)
+         though (see PR60663).  */
+        int asm_length = MAX (1, asm_str_count (ASM_OPERANDS_TEMPLATE (x)));
+        int num_operands = ASM_OPERANDS_INPUT_LENGTH (x);
+        *cost = COSTS_N_INSNS (asm_length + num_operands);
+        return true;
+      }
 
     default:
       if (mode != VOIDmode)
         *cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
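The difference between the posted and the committed formula is easiest to see with numbers. The standalone C sketch below takes the instruction estimate and the input count as plain integers rather than calling asm_str_count; the two function names are made up, and COSTS_N_INSNS is redefined locally with the same scaling GCC uses in rtl.h.

```c
#include <assert.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))
/* Same scaling as GCC's COSTS_N_INSNS in rtl.h.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* As first posted: clamp the sum of both estimates.  */
static int
asm_cost_posted (int asm_insns, int num_inputs)
{
  return COSTS_N_INSNS (MAX (1, asm_insns + num_inputs));
}

/* As committed: clamp only the instruction estimate, then add one
   insn per input on top.  */
static int
asm_cost_committed (int asm_insns, int num_inputs)
{
  return COSTS_N_INSNS (MAX (1, asm_insns) + num_inputs);
}
```

The two only disagree when the instruction estimate is zero and there is at least one input: the posted version lets the inputs absorb the minimum, the committed one always charges at least one instruction for the asm itself.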
Re: [PATCH] Fix PR56965, move nonoverlapping_component_refs_p
I didn't yet relax some of its restrictions (we can safely skip real/imag-part and array[-range?]-refs - we don't have to stop at them. There is also the question whether aliasing_component_refs_p still does sth useful after this (if those restrictions are removed). Bootstrap / regtest pending on x86_64-unknown-linux-gnu. Eric, does this look reasonable? Yes, although I wonder whether this machinery is still that useful these days, given that we already have aliasing_component_refs_p (for the indirect/direct case) and nonoverlapping_component_refs_of_decl_p (for the direct/direct case) but, on the other hand, your testcases don't look too far-fetched. We'll have 2 nonoverlapping_component_refs[_of_decls]_p predicates with a very similar implementation, I guess unifying them would be the next logical step. -- Eric Botcazou
Re: [C++ Patch] Minor TYPE_NAME clean-up
I'd rather not use TYPE_LINKAGE_IDENTIFIER in places that aren't dealing with linkage. The other changes are OK. Jason
Re: [PATCH] Fix PR56965, move nonoverlapping_component_refs_p
On Tue, 15 Apr 2014, Eric Botcazou wrote: I didn't yet relax some of its restrictions (we can safely skip real/imag-part and array[-range?]-refs - we don't have to stop at them. There is also the question whether aliasing_component_refs_p still does sth useful after this (if those restrictions are removed). Bootstrap / regtest pending on x86_64-unknown-linux-gnu. Eric, does this look reasonable? Yes, although I wonder whether this machinery is still that useful these days, given that we already have aliasing_component_refs_p (for the indirect/direct case) and nonoverlapping_component_refs_of_decl_p (for the direct/direct case) but, on the other hand, your testcases don't look too far-fetched. We'll have 2 nonoverlapping_component_refs[_of_decls]_p predicates with a very similar implementation, I guess unifying them would be the next logical step. Yeah, I'm definitely planning to do some cleanup here. Richard.
Re: C++ PATCH for c++/51747 (list-initialization from same type)
On Tue, 15 Apr 2014, Jason Merrill wrote: On 04/14/2014 06:02 PM, Marc Glisse wrote: shouldn't the same also apply if VECTOR_TYPE_P (type), not just for CLASS_TYPE_P (type)? Sure. Do you want to make that change? I can add || VECTOR_TYPE_P (type), yes, but I thought you might have ideas about other cases that might have been forgotten, maybe arrays or something (I didn't have time to test any further), and thus on what the right test should be. If it is just vectors I'll prepare a patch with a simple testcase. -- Marc Glisse
[RFC] Enable virtual operands at -O0
Hi, I recently stumbled on the ??? comment in tree-ssa-uninit.c and wondered why virtual operands are not enabled at -O0. As demonstrated by the attached patch, this would make it possible to unXFAIL a few uninit-*.c testcases. Tested on x86_64-suse-linux. * tree-ssa-operands.c (create_vop_var): Set DECL_IGNORED_P. (append_use): Run at -O0. (append_vdef): Likewise. * tree-ssa-uninit.c (warn_uninitialized_vars): Remove obsolete comment. testsuite/ * gcc.dg/uninit-B-O0.c: Remove XFAIL. * gcc.dg/uninit-I-O0.c: Likewise. * gcc.dg/uninit-pr19430-O0.c: Remove some XFAILs. -- Eric BotcazouIndex: tree-ssa-uninit.c === --- tree-ssa-uninit.c (revision 209411) +++ tree-ssa-uninit.c (working copy) @@ -210,7 +210,6 @@ warn_uninitialized_vars (bool warn_possi /* For memory the only cheap thing we can do is see if we have a use of the default def of the virtual operand. - ??? Note that at -O0 we do not have virtual operands. ??? Not so cheap would be to use the alias oracle via walk_aliased_vdefs, if we don't find any aliasing vdef warn as is-used-uninitialized, if we don't find an aliasing Index: testsuite/gcc.dg/uninit-I-O0.c === --- testsuite/gcc.dg/uninit-I-O0.c (revision 209411) +++ testsuite/gcc.dg/uninit-I-O0.c (working copy) @@ -3,6 +3,6 @@ int sys_msgctl (void) { - struct { int mode; } setbuf; /* { dg-warning 'setbuf\.mode' is used {} { xfail *-*-* } } */ - return setbuf.mode; + struct { int mode; } setbuf; + return setbuf.mode; /* { dg-warning 'setbuf\.mode' is used uninitialized in this function } */ } Index: testsuite/gcc.dg/uninit-B-O0.c === --- testsuite/gcc.dg/uninit-B-O0.c (revision 209411) +++ testsuite/gcc.dg/uninit-B-O0.c (working copy) @@ -9,7 +9,7 @@ void baz (void) { int i; - if (i) /* { dg-warning uninit uninit i warning { xfail *-*-* } } */ + if (i) /* { dg-warning 'i' is used uninitialized in this function } */ bar (i); foo (i); } Index: testsuite/gcc.dg/uninit-pr19430-O0.c === --- testsuite/gcc.dg/uninit-pr19430-O0.c (revision 209411) +++ 
testsuite/gcc.dg/uninit-pr19430-O0.c (working copy) @@ -16,10 +16,9 @@ foo (int i) return j; } - int foo2( void ) { - int rc; /* { dg-warning 'rc' is used uninitialized in this function uninitialized { xfail *-*-* } 21 } */ - return rc; + int rc; + return rc; /* { dg-warning 'rc' is used uninitialized in this function } */ *rc = 0; } @@ -29,7 +28,7 @@ void frob(int *pi); int main(void) { int i; - printf(i = %d\n, i); /* { dg-warning 'i' is used uninitialized in this function uninitialized { xfail *-*-* } 32 } */ + printf(i = %d\n, i); /* { dg-warning 'i' is used uninitialized in this function } */ frob(i); return 0; @@ -38,6 +37,6 @@ int main(void) void foo3(int*); void bar3(void) { int x; - if(x) /* { dg-warning 'x' is used uninitialized in this function uninitialized { xfail *-*-* } 41 } */ + if(x) /* { dg-warning 'x' is used uninitialized in this function } */ foo3(x); } Index: tree-ssa-operands.c === --- tree-ssa-operands.c (revision 209411) +++ tree-ssa-operands.c (working copy) @@ -166,6 +166,7 @@ create_vop_var (struct function *fn) get_identifier (.MEM), void_type_node); DECL_ARTIFICIAL (global_var) = 1; + DECL_IGNORED_P (global_var) = 1; TREE_READONLY (global_var) = 0; DECL_EXTERNAL (global_var) = 1; TREE_STATIC (global_var) = 1; @@ -477,9 +478,6 @@ append_use (tree *use_p) static inline void append_vdef (tree var) { - if (!optimize) -return; - gcc_assert ((build_vdef == NULL_TREE || build_vdef == var) (build_vuse == NULL_TREE @@ -495,9 +493,6 @@ append_vdef (tree var) static inline void append_vuse (tree var) { - if (!optimize) -return; - gcc_assert (build_vuse == NULL_TREE || build_vuse == var);
[patch] libstdc++/60695 - add static_assert to std::atomic
As I said in the PR comments, I see no useful reason to allow std::atomic to support zero-sized types.

Tested x86_64-linux, committed to trunk.

commit 6835d5ad1694f54d16c4a0d63273b12cbed78852
Author: Jonathan Wakely a...@kayari.org
Date:   Tue Apr 15 13:19:32 2014 +0100

    PR libstdc++/60695
    * include/std/atomic (atomic<_Tp>): Add static assertion.
    * testsuite/29_atomics/atomic/60695.cc: New.

diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index ece75a4..1b8e445 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -163,6 +163,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     private:
       _Tp _M_i;
 
+      // TODO: static_assert(is_trivially_copyable<_Tp>::value, "");
+
+      static_assert(sizeof(_Tp) > 0,
+                    "Incomplete or zero-sized types are not supported");
+
     public:
       atomic() noexcept = default;
       ~atomic() noexcept = default;
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/60695.cc b/libstdc++-v3/testsuite/29_atomics/atomic/60695.cc
new file mode 100644
index 000..27c0c8f
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/60695.cc
@@ -0,0 +1,30 @@
+// { dg-require-atomic-builtins "" }
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <atomic>
+
+// libstdc++/60695
+
+struct X {
+  char stuff[0]; // GNU extension, type has zero size
+};
+
+std::atomic<X> a; // { dg-error "not supported" "" { target *-*-* } 168 }
Re: C++ PATCH for c++/51747 (list-initialization from same type)
On 04/15/2014 10:13 AM, Marc Glisse wrote: I can add || VECTOR_TYPE_P (type), yes, but I thought you might have ideas about other cases that might have been forgotten, maybe arrays or something (I didn't have time to test any further), and thus on what the right test should be. If it is just vectors I'll prepare a patch with a simple testcase. It's just vectors, because they're an extension; the patch I checked in covered the standard language. Jason
Re: [patch] Fix PR59295 -- move redundant friend decl warning under -Wredundant-decls
On 03/21/2014 12:16 PM, Paul Pluzhnikov wrote: Ok for trunk once it opens in stage 1? OK. Jason
RE: [PATCH, PR60189, Cilk+] Fix for ICE with incorrect Cilk_sync usage
> -----Original Message-----
> From: Jason Merrill [mailto:ja...@redhat.com]
> Sent: Monday, April 14, 2014 9:49 PM
> To: Zamyatin, Igor; Jakub Jelinek
> Cc: GCC Patches (gcc-patches@gcc.gnu.org); Iyer, Balaji V
> Subject: Re: [PATCH, PR60189, Cilk+] Fix for ICE with incorrect Cilk_sync usage
>
> Oh, I see where the problem is coming from. Cilk_sync is a statement, but it's being parsed as an expression. Let's move it to cp_parser_statement.

Something like this (better to put new code in separate routine?)?

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index bb59e3b..3105d6c 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -5835,20 +5835,6 @@ cp_parser_postfix_expression (cp_parser *parser, bool address_p, bool cast_p,
           }
         break;
       }
-
-    case RID_CILK_SYNC:
-      if (flag_cilkplus)
-        {
-          tree sync_expr = build_cilk_sync ();
-          SET_EXPR_LOCATION (sync_expr,
-                             cp_lexer_peek_token (parser->lexer)->location);
-          finish_expr_stmt (sync_expr);
-        }
-      else
-        error_at (token->location, "-fcilkplus must be enabled to use"
-                  " %<_Cilk_sync%>");
-      cp_lexer_consume_token (parser->lexer);
-      break;

     case RID_BUILTIN_SHUFFLE:
       {
@@ -9404,6 +9390,24 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
           statement = cp_parser_jump_statement (parser);
           break;

+        case RID_CILK_SYNC:
+          cp_lexer_consume_token (parser->lexer);
+          if (flag_cilkplus)
+            {
+              tree sync_expr = build_cilk_sync ();
+              SET_EXPR_LOCATION (sync_expr,
+                                 token->location);
+              statement = finish_expr_stmt (sync_expr);
+            }
+          else
+            {
+              error_at (token->location, "-fcilkplus must be enabled to use"
+                        " %<_Cilk_sync%>");
+              statement = error_mark_node;
+            }
+          cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
+          break;
+
           /* Objective-C++ exception-handling constructs.  */
         case RID_AT_TRY:
         case RID_AT_CATCH:

Thanks,
Igor

> Jason
[build] Correctly detect native TLS support with 64-bit gas on Solaris/x86 (PR target/60817)
As reported in the PR, gcc/configure currently fails to detect native TLS support on x86_64-*-solaris2* with a 64-bit gas since it feeds it 32-bit TLS code. I hadn't noticed this so far since I've been using a 32-bit gas here (no idea why).

The following patch fixes this by making sure that 64-bit code is used for 64-bit-default configurations and that the necessary assembler flags are passed. I've chosen to merge the i?86 and x86_64 cases to avoid duplicating considerable amounts of code. When using the native Solaris assembler, the relocs need to be in lower case, as already done for 32-bit.

Tested by configuring for x86_64-pc-solaris2.11 with 32-bit gas, 64-bit gas, /bin/as, i386-pc-solaris2.11 with 32-bit gas and /bin/as, x86_64-unknown-linux-gnu, and i686-unknown-linux-gnu and checking that native TLS support is detected correctly.

Ok for mainline, or should I rather bootstrap the change on a couple of those configurations?

Thanks.
Rainer

2014-04-15  Rainer Orth  r...@cebitec.uni-bielefeld.de

        PR target/60817
        * configure.ac (set_have_as_tls): Merge i[34567]86-*-* and x86_64-*-* cases.
        Pass necessary as flags on 64-bit Solaris/x86.
        Use lowercase relocs for x86_64-*-*.
        * configure: Regenerate.
diff --git a/gcc/configure.ac b/gcc/configure.ac --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -2959,7 +2959,7 @@ foo: .long 25 tls_first_major=2 tls_first_minor=17 ;; - i[34567]86-*-* | x86_64-*-solaris2.1[0-9]*) + i[34567]86-*-* | x86_64-*-*) case $target in i[34567]86-*-solaris2.*) on_solaris=yes @@ -2991,6 +2991,8 @@ changequote(,)dnl tls_section_flag=T tls_as_opt=--fatal-warnings fi +case $target in + i[34567]86-*-*) conftest_s=$conftest_s foo: .long 25 .text @@ -3007,20 +3009,23 @@ foo: .long 25 leal foo@ntpoff(%ecx), %eax ;; x86_64-*-*) -conftest_s=' - .section .tdata,awT,@progbits + if test x$on_solaris = xyes; then + case $gas_flag in + yes) tls_as_opt=$tls_as_opt --64 ;; + no) tls_as_opt=$tls_as_opt -xarch=amd64 ;; + esac + fi + conftest_s=$conftest_s foo: .long 25 .text movq %fs:0, %rax - leaq foo@TLSGD(%rip), %rdi - leaq foo@TLSLD(%rip), %rdi - leaq foo@DTPOFF(%rax), %rdx - movq foo@GOTTPOFF(%rip), %rax - movq $foo@TPOFF, %rax' - tls_first_major=2 - tls_first_minor=14 - tls_section_flag=T - tls_as_opt=--fatal-warnings + leaq foo@tlsgd(%rip), %rdi + leaq foo@tlsld(%rip), %rdi + leaq foo@dtpoff(%rax), %rdx + movq foo@gottpoff(%rip), %rax + movq \$foo@tpoff, %rax +;; +esac ;; ia64-*-*) conftest_s=' -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[patch] libstdc++/60594 - std::function<incomplete()>
The bug in the PR is that is_copy_constructible<std::function<bar()>> performs overload resolution for function(const function&), which instantiates the SFINAE constraints on the function(F) constructor template, which instantiates is_convertible<bar, bar>, which is ill-formed when bar is incomplete. When bar is later completed you can't call the function(F) constructor with any function object returning bar, because the constraints have already been instantiated and so continue to produce the same result.

The fix is to exclude the function type from the constraints checks, so that only the copy constructor is considered, and the function(F) constructor template can be used later when bar is complete. The new test also checks for the same problem with assignment.

Tested x86_64-linux, committed to trunk. I plan to fix this for 4.8.3 and 4.9.1 too.

commit 1f0672b59e27a808e291891f97d38278b221d30a
Author: Jonathan Wakely a...@kayari.org
Date: Tue Apr 15 15:25:14 2014 +0100

    PR libstdc++/60594
    * include/std/functional (function::_Callable): Exclude own type from the callable checks.
    * testsuite/20_util/function/60594.cc: New.

diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional
index 5a987d9..0e80fa3 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -2149,8 +2149,15 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
         using _Invoke = decltype(__callable_functor(std::declval<_Functor&>())
                                  (std::declval<_ArgTypes>()...) );

+      // Used so the return type convertibility checks aren't done when
+      // performing overload resolution for copy construction/assignment.
+      template<typename _Tp>
+        using _NotSelf = __not_<is_same<_Tp, function>>;
+
       template<typename _Functor>
-        using _Callable = __check_func_return_type<_Invoke<_Functor>, _Res>;
+        using _Callable
+          = __and_<_NotSelf<_Functor>,
+                   __check_func_return_type<_Invoke<_Functor>, _Res>>;

       template<typename _Cond, typename _Tp>
         using _Requires = typename enable_if<_Cond::value, _Tp>::type;
@@ -2291,7 +2298,7 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
        *  reference_wrapper<F>, this function will not throw.
        */
       template<typename _Functor>
-        _Requires<_Callable<_Functor>, function&>
+        _Requires<_Callable<typename decay<_Functor>::type>, function&>
         operator=(_Functor&& __f)
         {
           function(std::forward<_Functor>(__f)).swap(*this);
diff --git a/libstdc++-v3/testsuite/20_util/function/60594.cc b/libstdc++-v3/testsuite/20_util/function/60594.cc
new file mode 100644
index 000..be80b3f
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/function/60594.cc
@@ -0,0 +1,36 @@
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+
+// Copyright (C) 2011-2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// libstdc++/60594
+
+#include <functional>
+#include <type_traits>
+struct bar;
+using F = std::function<bar()>;
+// check for copy constructible and assignable while 'bar' is incomplete
+constexpr int c = std::is_copy_constructible<F>::value;
+constexpr int a = std::is_copy_assignable<F>::value;
+struct bar { };
+bar func();
+void test()
+{
+  F g{ func };
+  g = func;
+}
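The shape of the fix can be reproduced in a few lines outside the library. The sketch below is not the libstdc++ implementation: `callback`, `lazy_and`, `returns_convertible`, `not_self` and `make_bar` are hypothetical names, and the wrapper only stores plain function pointers. What it shares with the patch is the ordering of the constraint: a "not my own type" test is evaluated first, and the return-type convertibility check (which needs a complete return type) is only instantiated when that first test passes.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Lazy conjunction: B is only instantiated when A::value is true,
// because it only becomes a base class in that case.
template <typename A, typename B>
struct lazy_and : std::conditional<A::value, B, A>::type {};

// Checks that calling an F yields something convertible to R.
// Instantiating this requires R to be a complete type.
template <typename F, typename R>
struct returns_convertible
    : std::is_convertible<decltype(std::declval<F&>()()), R> {};

// Hypothetical miniature of a function wrapper (not std::function).
template <typename R>
class callback {
  // True for every F except callback itself.
  template <typename F>
  using not_self = std::integral_constant<
      bool, !std::is_same<typename std::decay<F>::type, callback>::value>;

  // The self test comes first, so copying a callback never touches
  // returns_convertible.
  template <typename F>
  using callable = lazy_and<not_self<F>, returns_convertible<F, R>>;

  R (*target_)() = nullptr;

 public:
  callback() = default;
  callback(const callback&) = default;
  callback& operator=(const callback&) = default;

  template <typename F,
            typename = typename std::enable_if<callable<F>::value>::type>
  callback(F f) : target_(f) {}

  R operator()() const { return target_(); }
};

struct bar;  // still incomplete here
using bar_callback = callback<bar>;

// Overload resolution for the copy constructor happens while bar is
// incomplete; the constrained template is rejected by the self test alone.
static_assert(std::is_copy_constructible<bar_callback>::value,
              "copyable while bar is incomplete");

struct bar { int v; };
inline bar make_bar() { return bar{7}; }
```

Once `bar` is complete, `bar_callback c(make_bar);` selects the constrained constructor for the first time, so no stale result from the earlier trait query gets in the way. Without the self test, the convertibility check would have been instantiated by the `static_assert` above, which is the situation described in the PR.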
Re: [RFC] Enable virtual operands at -O0
On Tue, Apr 15, 2014 at 4:13 PM, Eric Botcazou ebotca...@adacore.com wrote: Hi, I recently stumbled on the ??? comment in tree-ssa-uninit.c and wondered why virtual operands are not enabled at -O0. As demonstrated by the attached patch, this would make it possible to unXFAIL a few uninit-*.c testcases. Tested on x86_64-suse-linux. ISTR some more ???/FIXMEs and/or special-casings we could remove with that. As followup, of course. The single reason why we don't have virtual operands at -O0 is compile-time btw - SSA rewrite doesn't come for free. But I don't mind - still maybe a quick comparison of stage1-gcc compile-time with/without that patch would be interesting? Thanks, Richard. * tree-ssa-operands.c (create_vop_var): Set DECL_IGNORED_P. (append_use): Run at -O0. (append_vdef): Likewise. * tree-ssa-uninit.c (warn_uninitialized_vars): Remove obsolete comment. testsuite/ * gcc.dg/uninit-B-O0.c: Remove XFAIL. * gcc.dg/uninit-I-O0.c: Likewise. * gcc.dg/uninit-pr19430-O0.c: Remove some XFAILs. -- Eric Botcazou
[PATCH 1/3, x86] X86 Silvermont vector cost model tune
Hi, I've separated the patch into 3. The patch passes x86 bootstrap. 1st part: 2014-04-15 Evgeny Stupachenko evstu...@gmail.com * config/i386/i386.c (slm_cost): Fixing vec_to_scalar_cost for Silvermont according latency table. (intel_cost): Ditto. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index f2e6957..bf4d576 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -1738,7 +1738,7 @@ struct processor_costs slm_cost = { 1, /* scalar load_cost. */ 1, /* scalar_store_cost. */ 1, /* vec_stmt_cost. */ - 1, /* vec_to_scalar_cost. */ + 4, /* vec_to_scalar_cost. */ 1, /* scalar_to_vec_cost. */ 1, /* vec_align_load_cost. */ 2, /* vec_unalign_load_cost. */ @@ -1815,7 +1815,7 @@ struct processor_costs intel_cost = { 1, /* scalar load_cost. */ 1, /* scalar_store_cost. */ 1, /* vec_stmt_cost. */ - 1, /* vec_to_scalar_cost. */ + 4, /* vec_to_scalar_cost. */ 1, /* scalar_to_vec_cost. */ 1, /* vec_align_load_cost. */ 2, /* vec_unalign_load_cost. */ Evgeny On Thu, Mar 6, 2014 at 12:58 AM, Evgeny Stupachenko evstu...@gmail.com wrote: slm_cost/intel_cost and TARGET_SLOW_PSHUFB are just preparation to a next vectorization patch. Changes in ix86_add_stmt_cost gives real performance to Silvermont. Let's move all to stage1. On Wed, Mar 5, 2014 at 9:29 PM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Mar 5, 2014 at 5:46 PM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Mar 5, 2014 at 7:58 AM, Evgeny Stupachenko evstu...@gmail.com wrote: Hi, The patch is for x86 Silvermont. It improves x86 Silvermont vector cost model. It gives +20% on facerec spec on Silvermont. It passes make check and bootstrap on x86. Is this patch ok for stage1? ChangeLog: 2014-03-05 Evgeny Stupachenko evstu...@gmail.com * config/i386/x86-tune.def (TARGET_SLOW_PSHUFB): Target for slow byte shuffle on some x86 architectures. * config/i386/i386.h (TARGET_SLOW_PSHUFB): Ditto. * config/i386/i386.c (processor_costs): Fixing vec_to_scalar_cost for Silvermont according latency table. 
(expand_vec_perm_even_odd_1): Avoid byte shuffles in architectures where they are slow (TARGET_SLOW_PSHUFB). (x86_add_stmt_cost): Fixing vector cost model for Silvermont. Thanks, Evgeny There are 3 separate changes in this patch: 1. Update slm_cost, which doesn't have a ChangeLog entry. 2. Add TARGET_SLOW_PSHUFB. 3. Update ix86_add_stmt_cost. I suggest you break it into 3 independent patches. I think that slm_cost/intel_cost and TARGET_SLOW_PSHUFB changes can still go into mainline at this stage since they are trivial tuning changes that should not destabilize the compiler. The ix86_add_stmt_cost should wait for stage 1. Uros.
[PATCH 2/3, x86] X86 Silvermont vector cost model tune
2nd part:

2014-04-15  Evgeny Stupachenko  evstu...@gmail.com

        * config/i386/x86-tune.def (TARGET_SLOW_PSHUFB): Target for slow byte shuffle on some x86 architectures.
        * config/i386/i386.h (TARGET_SLOW_PSHUFB): Ditto.
        * config/i386/i386.c (expand_vec_perm_even_odd_1): Avoid byte shuffles on architectures where they are slow (TARGET_SLOW_PSHUFB).

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index bf4d576..0ae3cda 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -44026,7 +44026,7 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
       gcc_unreachable ();

     case V8HImode:
-      if (TARGET_SSSE3)
+      if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
         return expand_vec_perm_pshufb2 (d);
       else
         {
@@ -44049,7 +44049,7 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
       break;

     case V16QImode:
-      if (TARGET_SSSE3)
+      if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
         return expand_vec_perm_pshufb2 (d);
       else
         {
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 51659de..1a884d8 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -425,6 +425,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
         ix86_tune_features[X86_TUNE_USE_VECTOR_FP_CONVERTS]
 #define TARGET_USE_VECTOR_CONVERTS \
         ix86_tune_features[X86_TUNE_USE_VECTOR_CONVERTS]
+#define TARGET_SLOW_PSHUFB \
+        ix86_tune_features[X86_TUNE_SLOW_PSHUFB]
 #define TARGET_FUSE_CMP_AND_BRANCH_32 \
         ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH_32]
 #define TARGET_FUSE_CMP_AND_BRANCH_64 \
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 8399102..9b0ff36 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -386,6 +386,10 @@ DEF_TUNE (X86_TUNE_USE_VECTOR_FP_CONVERTS, "use_vector_fp_converts",
    from integer to FP. */
 DEF_TUNE (X86_TUNE_USE_VECTOR_CONVERTS, "use_vector_converts", m_AMDFAM10)

+/* X86_TUNE_SLOW_SHUFB: Indicates tunings with slow pshufb instruction.  */
+DEF_TUNE (X86_TUNE_SLOW_PSHUFB, "slow_pshufb",
+          m_BONNELL | m_SILVERMONT | m_INTEL)
+
 /*****************************************************************************/
 /* AVX instruction selection tuning (some of SSE flags affects AVX, too)     */
 /*****************************************************************************/
[PATCH 3/3, x86] X86 Silvermont vector cost model tune
3rd part:

2014-04-15  Evgeny Stupachenko  evstu...@gmail.com

        * config/i386/i386.c (ix86_add_stmt_cost): Fixing vector cost model for Silvermont.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0ae3cda..2522b5c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -81,6 +81,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "context.h"
 #include "pass_manager.h"
 #include "target-globals.h"
+#include "tree-vectorizer.h"

 static rtx legitimize_dllimport_symbol (rtx, bool);
 static rtx legitimize_pe_coff_extern_decl (rtx, bool);
@@ -46329,6 +46330,18 @@ ix86_add_stmt_cost (void *data, int count, enum vect_cost_for_stmt kind,
     count *= 50;  /* FIXME.  */

   retval = (unsigned) (count * stmt_cost);
+
+  /* We need to multiply all vector stmt cost by 1.8 (estimated cost)
+     for Silvermont as it has out of order integer pipeline and can execute
+     2 scalar instruction per tick, but has in order SIMD pipeline.  */
+  if (TARGET_SILVERMONT || TARGET_INTEL)
+    if (stmt_info && stmt_info->stmt)
+      {
+        tree lhs_op = gimple_get_lhs (stmt_info->stmt);
+        if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)
+          retval = (retval * 18) / 10;
+      }
+
   cost[where] += retval;

   return retval;
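The 1.8 factor in the third part is applied with integer arithmetic, which is worth a quick look because the division truncates. The function below is a hypothetical stand-in for the tail of ix86_add_stmt_cost, written only to show the arithmetic; its name and parameters are invented for this sketch and the two booleans replace the real tuning and type checks.

```cpp
#include <cassert>

// Hypothetical model of the cost adjustment: `tune_silvermont` stands in for
// (TARGET_SILVERMONT || TARGET_INTEL) and `integer_result` for the check that
// the statement's left-hand side has INTEGER_TYPE.
unsigned scaled_stmt_cost(int count, int stmt_cost,
                          bool tune_silvermont, bool integer_result) {
  unsigned retval = static_cast<unsigned>(count * stmt_cost);
  // 1.8x expressed without floating point: multiply by 18, divide by 10.
  // Integer division truncates, so a cost of 1 stays at 1.
  if (tune_silvermont && integer_result)
    retval = (retval * 18) / 10;
  return retval;
}
```

With these inputs, a base cost of 40 becomes 72 under the Silvermont tuning and stays at 40 otherwise, while a base cost of 1 is unaffected by the scaling.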
[C++ Patch/RFC] Remove unify_success / unify_invalid unused parameter?
Hi, a lot of time ago I noticed that these parameters are unused: should I prepare a ChangeLog for the below or we have stylistic, etc, reasons for keeping the parameters? Thanks, Paolo. PS: I also see many int return types in the various unify* which could as well be bool. Opinions about that? Index: pt.c === --- pt.c(revision 209420) +++ pt.c(working copy) @@ -5377,7 +5377,7 @@ has_value_dependent_address (tree op) call.c */ static int -unify_success (bool /*explain_p*/) +unify_success (void) { return 0; } @@ -5392,7 +5392,7 @@ unify_parameter_deduction_failure (bool explain_p, } static int -unify_invalid (bool /*explain_p*/) +unify_invalid (void) { return 1; } @@ -16085,13 +16085,13 @@ check_non_deducible_conversion (tree parm, tree ar type = arg; if (same_type_p (parm, type)) -return unify_success (explain_p); +return unify_success (); if (strict == DEDUCE_CONV) { if (can_convert_arg (type, parm, NULL_TREE, flags, explain_p ? tf_warning_or_error : tf_none)) - return unify_success (explain_p); + return unify_success (); } else if (strict != DEDUCE_EXACT) { @@ -16098,7 +16098,7 @@ check_non_deducible_conversion (tree parm, tree ar if (can_convert_arg (parm, type, TYPE_P (arg) ? NULL_TREE : arg, flags, explain_p ? tf_warning_or_error : tf_none)) - return unify_success (explain_p); + return unify_success (); } if (strict == DEDUCE_EXACT) @@ -16249,11 +16249,11 @@ unify_one_argument (tree tparms, tree targs, tree int arg_strict; if (arg == error_mark_node || parm == error_mark_node) -return unify_invalid (explain_p); +return unify_invalid (); if (arg == unknown_type_node) /* We can't deduce anything from this, but we might get all the template args from other function args. 
*/ -return unify_success (explain_p); +return unify_success (); /* Implicit conversions (Clause 4) will be performed on a function argument to convert it to the type of the corresponding function @@ -16269,7 +16269,7 @@ unify_one_argument (tree tparms, tree targs, tree TYPE_P (parm) !uses_deducible_template_parms (parm)) /* For function parameters with only non-deducible template parameters, just return. */ -return unify_success (explain_p); +return unify_success (); switch (strict) { @@ -16314,7 +16314,7 @@ unify_one_argument (tree tparms, tree targs, tree if (resolve_overloaded_unification (tparms, targs, parm, arg, strict, arg_strict, explain_p)) - return unify_success (explain_p); + return unify_success (); return unify_overload_resolution_failure (explain_p, arg); } @@ -16321,7 +16321,7 @@ unify_one_argument (tree tparms, tree targs, tree arg_expr = arg; arg = unlowered_expr_type (arg); if (arg == error_mark_node) - return unify_invalid (explain_p); + return unify_invalid (); } arg_strict |= @@ -16548,7 +16548,7 @@ type_unification_real (tree tparms, SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (targs, TREE_VEC_LENGTH (targs)); #endif - return unify_success (explain_p); + return unify_success (); } /* Subroutine of type_unification_real. Args are like the variables @@ -17242,7 +17242,7 @@ unify_pack_expansion (tree tparms, tree targs, tre } } - return unify_success (explain_p); + return unify_success (); } /* Handle unification of the domain of an array. PARM_DOM and ARG_DOM are @@ -17369,12 +17369,12 @@ unify (tree tparms, tree targs, tree parm, tree ar parm = TREE_OPERAND (parm, 0); if (arg == error_mark_node) -return unify_invalid (explain_p); +return unify_invalid (); if (arg == unknown_type_node || arg == init_list_type_node) /* We can't deduce anything from this, but we might get all the template args from other function args. 
*/ -return unify_success (explain_p); +return unify_success (); /* If PARM uses template parameters, then we can't bail out here, even if ARG == PARM, since we won't record unifications for the @@ -17381,7 +17381,7 @@ unify (tree tparms, tree targs, tree parm, tree ar template parameters. We might need them if we're trying to figure out which of two things is more specialized. */ if (arg == parm !uses_template_parms (parm)) -return unify_success (explain_p); +return unify_success (); /* Handle init lists early, so the rest of the function can assume we're dealing with a type. */ @@ -17401,7 +17401,7 @@ unify (tree tparms, tree targs, tree parm, tree ar /* We can only deduce from an initializer list argument
[PATCH] Fix reassociation with -g (PR tree-optimization/60844)
Hi! The (admittedly ugly) reassoc stmt positioning stuff requires that we maintain uids in ascending order within each bb (equal uid for several adjacent stmts is ok), including debug stmts. We assign those initially, and for stmts we add we make sure to copy the uid from some adjacent insn. But, as the following testcase shows, we don't take into account that gsi_remove can add debug stmts, and those don't have uid set. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.9.1? 2014-04-15 Jakub Jelinek ja...@redhat.com PR tree-optimization/60844 * tree-ssa-reassoc.c (reassoc_remove_stmt): New function. (propagate_op_to_single_use, remove_visited_stmt_chain, linearize_expr, repropagate_negates, reassociate_bb): Use it instead of gsi_remove. * gcc.dg/pr60844.c: New test. --- gcc/tree-ssa-reassoc.c.jj 2014-03-13 10:38:09.0 +0100 +++ gcc/tree-ssa-reassoc.c 2014-04-15 13:59:14.511383249 +0200 @@ -221,6 +221,35 @@ static struct pointer_map_t *operand_ran static long get_rank (tree); static bool reassoc_stmt_dominates_stmt_p (gimple, gimple); +/* Wrapper around gsi_remove, which adjusts gimple_uid of debug stmts + possibly added by gsi_remove. */ + +bool +reassoc_remove_stmt (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + + if (!MAY_HAVE_DEBUG_STMTS || gimple_code (stmt) == GIMPLE_PHI) +return gsi_remove (gsi, true); + + gimple_stmt_iterator prev = *gsi; + gsi_prev (prev); + unsigned uid = gimple_uid (stmt); + basic_block bb = gimple_bb (stmt); + bool ret = gsi_remove (gsi, true); + if (!gsi_end_p (prev)) +gsi_next (prev); + else +prev = gsi_start_bb (bb); + gimple end_stmt = gsi_stmt (*gsi); + while ((stmt = gsi_stmt (prev)) != end_stmt) +{ + gcc_assert (stmt is_gimple_debug (stmt) gimple_uid (stmt) == 0); + gimple_set_uid (stmt, uid); + gsi_next (prev); +} + return ret; +} /* Bias amount for loop-carried phis. 
We want this to be larger than the depth of any reassociation tree we can see, but not larger than @@ -1123,7 +1152,7 @@ propagate_op_to_single_use (tree op, gim update_stmt (use_stmt); gsi = gsi_for_stmt (stmt); unlink_stmt_vdef (stmt); - gsi_remove (gsi, true); + reassoc_remove_stmt (gsi); release_defs (stmt); } @@ -3072,7 +3101,7 @@ remove_visited_stmt_chain (tree var) { var = gimple_assign_rhs1 (stmt); gsi = gsi_for_stmt (stmt); - gsi_remove (gsi, true); + reassoc_remove_stmt (gsi); release_defs (stmt); } else @@ -3494,7 +3523,7 @@ linearize_expr (gimple stmt) update_stmt (stmt); gsi = gsi_for_stmt (oldbinrhs); - gsi_remove (gsi, true); + reassoc_remove_stmt (gsi); release_defs (oldbinrhs); gimple_set_visited (stmt, true); @@ -3896,7 +3925,7 @@ repropagate_negates (void) gimple_assign_set_rhs_with_ops (gsi2, NEGATE_EXPR, x, NULL); user = gsi_stmt (gsi2); update_stmt (user); - gsi_remove (gsi, true); + reassoc_remove_stmt (gsi); release_defs (feed); plus_negates.safe_push (gimple_assign_lhs (user)); } @@ -4413,7 +4442,7 @@ reassociate_bb (basic_block bb) reassociations. */ if (has_zero_uses (gimple_get_lhs (stmt))) { - gsi_remove (gsi, true); + reassoc_remove_stmt (gsi); release_defs (stmt); /* We might end up removing the last stmt above which places the iterator to the end of the sequence. --- gcc/testsuite/gcc.dg/pr60844.c.jj 2014-04-15 14:01:27.561689401 +0200 +++ gcc/testsuite/gcc.dg/pr60844.c 2014-04-15 14:01:10.0 +0200 @@ -0,0 +1,16 @@ +/* PR tree-optimization/60844 */ +/* { dg-do compile } */ +/* { dg-options -O2 -g } */ +/* { dg-additional-options -mtune=atom { target { i?86-*-* x86_64-*-* } } } */ + +void +foo (int *x, int y, int z) +{ + int b, c = x[0], d = x[1]; + for (b = 0; b 1; b++) +{ + int e = (y ? 1 : 0) | (d ? 2 : 0) | (z ? 1 : 0); + e |= (c ? 2 : 0) | ((1 b) ? 1 : 0); + x[2 + b] = e; +} +} Jakub
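The invariant behind this patch can be shown with a toy model. Everything in the sketch is hypothetical: a basic block is reduced to a `std::vector` of `stmt` records, and `raw_remove` imitates a removal that leaves new debug statements with an unset uid (0) where the removed statement used to be. It is not GCC code; it only illustrates why the wrapper copies the removed statement's uid onto whatever the removal added.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct stmt {
  unsigned uid;
  bool is_debug;
};

// True when uids never decrease from one statement to the next
// (equal uids on adjacent statements are allowed).
bool uids_ascending(const std::vector<stmt>& bb) {
  for (std::size_t i = 1; i < bb.size(); ++i)
    if (bb[i].uid < bb[i - 1].uid) return false;
  return true;
}

// Stand-in for a plain removal: erases bb[pos] and inserts `extra_debug`
// debug statements with uid 0 in its place. Returns the index of the
// statement that followed the removed one.
std::size_t raw_remove(std::vector<stmt>& bb, std::size_t pos,
                       std::size_t extra_debug) {
  bb.erase(bb.begin() + pos);
  bb.insert(bb.begin() + pos, extra_debug, stmt{0, true});
  return pos + extra_debug;
}

// Wrapper in the spirit of reassoc_remove_stmt: remember the uid, remove,
// then walk the newly added statements and give them that uid.
std::size_t remove_keeping_uids(std::vector<stmt>& bb, std::size_t pos,
                                std::size_t extra_debug) {
  const unsigned uid = bb[pos].uid;
  const std::size_t next = raw_remove(bb, pos, extra_debug);
  for (std::size_t i = pos; i < next; ++i) {
    assert(bb[i].is_debug && bb[i].uid == 0);
    bb[i].uid = uid;
  }
  return next;
}
```

Calling `raw_remove` on a block with uids 1, 2, 3 leaves a 0 in the middle and breaks the ordering, whereas `remove_keeping_uids` on the same block keeps the sequence non-decreasing.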
Re: [PATCH] Add DW_AT_const_value as unsigned or int depending on type and value used.
Added a clarifying comment to the code and reinstated the TODO for the double case. OK to push? * dwarf2out.c (gen_enumeration_type_die): Add DW_AT_const_value as unsigned or int depending on type and value used. OK. Thanks! -cary
Re: [PATCH] [CLEANUP] Wrap locally-used functions in anonymous namespaces
On Tue, Apr 15, 2014 at 3:51 AM, Richard Biener richard.guent...@gmail.com wrote:
> On Mon, Apr 14, 2014 at 4:51 PM, Patrick Palka patr...@parcs.ath.cx wrote:
>> Hi everyone,
>>
>> This patch wraps a bunch of locally-used, non-debug functions in an anonymous namespace. These functions can't simply be marked as static because they are used as template arguments to hash_table::traverse, and the C++98 standard does not allow non-extern variables to be used as template arguments. The next best thing to marking them static is to define each of these functions inside an anonymous namespace.
>
> Hum, the formatting used looks super-ugly. I suppose a local visibility attribute would work as well? (well, what's the goal of the patch?)
>
> Thanks,
> Richard.

The goal of this patch is to resolve warnings emitted by -Wmissing-declarations for the GCC sources. Later I would like to propose adding -Wmissing-declarations to GCC's build flags, but I figured that these kinds of cleanup patches are good on their own.

I don't think a local visibility attribute would squelch the -Wmissing-declarations warnings. Is there a better/standardized format for defining a function within an anonymous namespace? I personally don't think the formatting is too bad.
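As a small illustration of the constraint Patrick describes, the sketch below passes a function from an anonymous namespace as a non-type template argument. The names `add_one`, `sum_mapped` and `sum_plus_one` are hypothetical and have nothing to do with GCC's hash_table; the helper merely imitates a traversal routine that takes its callback as a template parameter.

```cpp
#include <cassert>

namespace {
// Only visible in this translation unit in practice. Under C++98 rules a
// function in an anonymous namespace still has external linkage, so it is
// accepted as a template argument where a `static` function is not.
int add_one(int x) { return x + 1; }
}  // anonymous namespace

// Hypothetical traversal helper: the callback is a template argument rather
// than a runtime function pointer.
template <int (*Callback)(int)>
int sum_mapped(const int* values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += Callback(values[i]);
  return total;
}

int sum_plus_one(const int* values, int count) {
  return sum_mapped<add_one>(values, count);
}
```

Because `add_one` is declared inside the anonymous namespace, it also no longer counts as a global function defined without a prior declaration, which is the property the cleanup relies on.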
Re: [PATCH] [CLEANUP] Declare global functions before defining them
On Tue, Apr 15, 2014 at 3:52 AM, Richard Biener richard.guent...@gmail.com wrote:
> On Mon, Apr 14, 2014 at 4:52 PM, Patrick Palka patr...@parcs.ath.cx wrote:
>> Hi everyone,
>>
>> Many source files currently define a global function that is not previously declared within that source file because the source file did not include the appropriate header file that declares said function. This patch fixes a number of these occurrences by making sure to include the appropriate header file within the offending source files.
>>
>> Bootstrapped and regtested on x86_64-unknown-linux-gnu.
>
> How did you find these? (in the C bootstrap times -Wstrict-prototypes did that)
>
> Thanks,
> Richard.

Like with the other two patches, the changes in this patch address a subset of the warnings emitted by -Wmissing-declarations. In this case the subset is extern functions that are declared inside a header file but whose defining source file does not include said header file.
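The pattern being fixed can be sketched in a single file, with comments marking where the two halves would normally live. The file names `shape.h` and `shape.c` and the function `shape_area` are hypothetical; in a real tree the first part would be pulled in with an `#include` rather than written out.

```cpp
#include <cassert>

// ---- contents of a hypothetical "shape.h" ----
// The prototype that other translation units rely on.
int shape_area(int width, int height);

// ---- contents of a hypothetical "shape.c" ----
// When the defining file sees the prototype first (normally by including
// "shape.h"), this is no longer a global function defined without a previous
// declaration, and the compiler can check that the two signatures agree.
int shape_area(int width, int height) { return width * height; }
```

If the defining file skips the header, the definition still compiles, but a later change to the prototype in the header would go unnoticed in that file; including the header turns such a mismatch into something the compiler can diagnose.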
Re: [RFC] [Testsuite,ARM] Neon intrinsics executable tests
On 15 April 2014 16:18, Ramana Radhakrishnan ramana.radhakrish...@arm.com wrote: On 04/14/14 23:16, Christophe Lyon wrote: Hi Ramana, Here is an updated version of my proposal to include tests for Neon intrinsics. wrt to my previous post, I have made a few changes: - renamed the test files, removing the ref_ prefix. - removed the TEST_ prefix on some initialization macros - use the c-torture framework I have run it successfully on the following configurations: aarch64-none-linux-gnu aarch64-none-elf aarch64_be-none-elf arm-none-linux-gnueabihf armeb-none-linux-gnueabihf arm-none-linux-gnueabi armeb-none-linux-gnueabi arm-none-eabi using qemu for most of them and the Foundation Model for aarch64*elf I had a brief look at your patch and how does this run for AArch64 when you have such options in the testsuite ? +++ b/gcc/testsuite/gcc.target/arm/neon-intrinsics/vaba.c @@ -0,0 +1,145 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw { target { arm* } } } */ +/* { dg-add-options arm_neon } */ + Good catch... in fact these lines are ignored when using c-torture, I just forgot to clean them up. Additionally a README would help in terms of how one should add new tests. OK Any comments? Thanks, Christophe. On 29 October 2013 19:09, Christophe Lyon christophe.l...@linaro.org wrote: On 29 October 2013 03:24, Ramana Radhakrishnan ramra...@arm.com wrote: On 10/09/13 23:16, Christophe Lyon wrote: Irrespective of our earlier conversations on this now I'm actually wondering if instead of doing this and integrating this in the GCC source base it maybe easier to write a harness to test this cross on qemu or natively. Additionally setting up an auto-tester to do this might be a more productive use of time rather than manually dejagnuizing this which appears to be a tedious and slow process. This would be easy to setup, since the Makefile on gitorious is already targetting qemu. I used it occasionnally on boards with minimal changes. 
This just means we'd have to agree on how to set up such an auto-tester, where do we send the results to, etc... If you are sufficiently motivated to do the transition, I'm not opposed to putting it into the testsuite as a basic regression testing framework for neon intrinsics. I would really like to have all this converge to a good solution, so yes I want to convert the whole testsuite to dejagnu. I just want that we agree on the format before proceeding with the other tests, that's why I've just posted a subset, hopefully representative enough but easier to review. I'll try and play with this in some more detail with a couple of patches I'm doing in the area of neon intrinsics so it may be useful to cross check. OK let me know if you have further comments. As of now I understand that you are OK with this patch, modulo the removal of the 3 dg-* lines, correct? Thanks, Christophe. regards Ramana I'd like your feedback before continuing, as there are a lot more files to come. I have made some cleanup to help review, but the two .h files will need to grow as more intrinsics will be added (see the original ones). Which one should I compare this with in terms of the original file ? I have kept the same file names. I'd like to keep the modifications at a minimal level, to save my time when adapting each test (there are currently 145 test files, so 143 left:-). On to the patch itself. The prefix TEST_ seems a bit misleading in that it suggests this is testing something when in reality this is initializing stuff. In fact, TEST_ executes the intrinsics, and copies the results to memory when relevant. But I can easily change TEST_ to something else. So in the sample I posted: TEST_VABA: VAR=vaba(); vst1(BUFFER,VAR) TEST_VLD1: VAR=vld1(); vst1(BUFFER, VAR) VDUP is special in that it is a helper for other tests: TEST_VDUP: VAR1=vdup(VAR2,) and similarly for TEST_VLOAD and TEST_VSETLANE +# Exit immediately if this isn't an ARM target. 
+if ![istarget arm*-*-*] then { + return +} Also for aarch64*-*-* as all these intrinsics are compatible with the aarch64 port. I would also prefer that this be tortured over multiple optimization levels as many times we find issues with different optimization levels. OK, this sounds easy to do, and I agree. I preferred to post a simple version first. And given you talked to me about your plans to factorize arm and aarch64 tests, I thought it was better to start with a simple version I knew was working. More later I need to get back to something else and I need to play more with your original testsuite - but I'd like some discussion around some of these points anyway. Ramana OK thanks for the feedback. If we decide to go with auto-testers instead, the discussion will probably be shorter. Christophe + +# Load support procs. +load_lib
Re: [build] Correctly detect native TLS support with 64-bit gas on Solaris/x86 (PR target/60817)
On Tue, Apr 15, 2014 at 5:21 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: As reported in the PR, gcc/configure currently fails to detect native TLS support on x86_64-*-solaris2* with a 64-bit gas since it feeds it 32-bit TLS code. I hadn't noticed this so far since I've been using a 32-bit gas here (no idea why). The following patch fixes this by making sure 64-bit code is both used for 64-bit-default configurations and the necessary assembler flags passed. I've chosen to merge the i?86 and x86_64 cases to avoid duplicating considerable amounts of code. When using the native Solaris assembler, the relocs need to be in lower case as already done for 32-bit. Tested by configuring for x86_64-pc-solaris2.11 with 32-bit gas, 64-bit gas, /bin/as, i386-pc-solaris2.11 with 32-bit gas and /bin/as, x86_64-unknown-linux-gnu, and i686-unknown-linux-gnu and checking that native TLS support is detected correctly. Ok for mainline or should I rather bootstrap the change on a couple of those configurations? Thanks. Rainer 2014-04-15 Rainer Orth r...@cebitec.uni-bielefeld.de PR target/60817 * configure.ac (set_have_as_tls): Merge i[34567]86-*-* and x86_64-*-* cases. Pass necessary as flags on 64-bit Solaris/x86. Use lowercase relocs for x86_64-*-*. * configure: Regenerate. OK. Thanks, Uros.
Re: [PATCH 1/3, x86] X86 Silvermont vector cost model tune
On Tue, Apr 15, 2014 at 6:06 PM, Evgeny Stupachenko evstu...@gmail.com wrote: I've separated the patch into 3. The patch passes x86 bootstrap. 1st part: 2014-04-15 Evgeny Stupachenko evstu...@gmail.com * config/i386/i386.c (slm_cost): Fixing vec_to_scalar_cost for Silvermont according latency table. ... : Adjust vec_to_scalar_cost. (intel_cost): Ditto. OK for mainline with the above ChangeLog fix. Thanks, Uros.
Re: [PATCH][AArch64] Vectorise bswap[16,32,64]
Testcase weirdness? for (i 0; i N; ++i) { arr[i] = i; expect[i] = __builtin_bswap64 (i); if (y) /* Avoid vectorisation. */ abort (); } i 0 :) duplicated in all 3 testcases btw. -eric On Tue, Apr 15, 2014 at 4:25 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: Hi all, This patch enables aarch64 to vectorise bswap[16,32,64] operations by using the AdvancedSIMD forms of the rev[16,32,64] instructions. The TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION hook is extended to return the vectorised forms of __builtin_bswap* where possible and vector bswap patterns are added. I've added the tests in vect.exp and a new effective target check (vect_bswap) that can be extended for other targets in the future if they can also vectorise these operations. Is that ok? Bootstrapped and tested aarch64-none-linux-gnu. Ok for trunk? Thanks, Kyrill 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/aarch64/aarch64-builtins.c (aarch64_builtin_vectorized_function): Handle BUILT_IN_BSWAP16, BUILT_IN_BSWAP32, BUILT_IN_BSWAP64. * config/aarch64/aarch64-simd.md (bswapmode): New pattern. * config/aarch64/aarch64-simd-builtins.def: Define vector bswap builtins. * config/aarch64/iterator.md (VDQHSD): New mode iterator. (Vrevsuff): New mode attribute. 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * lib/target-supports.exp (check_effective_target_vect_bswap): New. * gcc.dg/vect/vect-bswap16: New test. * gcc.dg/vect/vect-bswap32: Likewise. * gcc.dg/vect/vect-bswap64: Likewise.
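For readers who have not seen the loop shape this patch is about, here is a minimal sketch of a byte-swapping loop of the kind a vectoriser could map to vector byte-reverse instructions. The names `bswap32_array` and `bswap32_one` are made up for illustration and do not come from the patch or its testcases; `__builtin_bswap32` is the GCC/Clang builtin the thread refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the patch): byte-swap every element.
// A plain counted loop over __builtin_bswap32 is the kind of code a
// vectoriser may turn into vector byte-reverse instructions.
void bswap32_array(std::uint32_t *dst, const std::uint32_t *src, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = __builtin_bswap32(src[i]);
}

// Hypothetical convenience wrapper for a single value.
std::uint32_t bswap32_one(std::uint32_t x)
{
  std::uint32_t out;
  bswap32_array(&out, &x, 1);
  return out;
}
```

Whether a given compiler actually vectorises this loop depends on the target and flags; the sketch only shows the source pattern.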
Re: [RFC] Enable virtual operands at -O0
ISTR some more ???/FIXMEs and/or special-casings we could remove with that. As followup, of course. It would be better to remove them all at once, so if you have specifics... The single reason why we don't have virtual operands at -O0 is compile-time btw - SSA rewrite doesn't come for free. But I don't mind - still maybe a quick comparison of stage1-gcc compile-time with/without that patch would be interesting? 3m3.306s vs 3m3.041s for the 64-bit build of an earlier compiler version. The difference doesn't seem to be much more noticeable on big preprocessed files, e.g. combine.i or pt.i, but I'm not sure this means anything. -- Eric Botcazou
[PATCH] Make SRA tolerate most throwing statements
Hi, back in January in http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00848.html Eric pointed out a testcase where the problem was SRA not scalarizing an aggregate because it was involved in a throwing statement. The reason is that SRA is likely to need to append new statements after each one where a replaced aggregate is present, but throwing statements must end their BBs. This patch comes up with a fix for most such situations by adding these new statements onto a single successor non-EH edge, if there is one and only one such edge. I have bootstrapped and tested a very similar version on x86_64-linux, bootstrap and testing of this exact one is currently underway. OK for trunk? Eric, if and once this gets in, can you please add the testcase from your original post to the suite? Thanks, Martin 2014-04-15 Martin Jambor mjam...@suse.cz * tree-sra.c (single_non_eh_succ): New function. (disqualify_ops_if_throwing_stmt): Renamed to disqualify_if_bad_bb_terminating_stmt. Allow throwing statements having one non-EH successor BB. (gsi_for_eh_followups): New function. (sra_modify_expr): If stmt ends bb, use single non-EH successor to generate loads into replacements. (sra_modify_assign): Likewise and also use the simple path for such statements. (sra_modify_function_body): Iterate safely over BBs. diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index ffef13d..4fd0f5e 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -1142,17 +1142,41 @@ build_access_from_expr (tree expr, gimple stmt, bool write) return false; } -/* Disqualify LHS and RHS for scalarization if STMT must end its basic block in - modes in which it matters, return true iff they have been disqualified. RHS - may be NULL, in that case ignore it. If we scalarize an aggregate in - intra-SRA we may need to add statements after each statement. This is not - possible if a statement unconditionally has to end the basic block. 
*/ +/* Return the single non-EH successor edge of BB or NULL if there is none or + more than one. */ + +static edge +single_non_eh_succ (basic_block bb) +{ + edge e, res = NULL; + edge_iterator ei; + + FOR_EACH_EDGE (e, ei, bb->succs) +if (!(e->flags & EDGE_EH)) + { + if (res) + return NULL; + res = e; + } + + return res; +} + +/* Disqualify LHS and RHS for scalarization if STMT has to terminate its BB and + there is no alternative spot where to put statements SRA might need to + generate after it. The spot we are looking for is an edge leading to a + single non-EH successor, if it exists and is indeed single. RHS may be + NULL, in that case ignore it. */ + static bool -disqualify_ops_if_throwing_stmt (gimple stmt, tree lhs, tree rhs) +disqualify_if_bad_bb_terminating_stmt (gimple stmt, tree lhs, tree rhs) { if ((sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA) - && (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt))) + && stmt_ends_bb_p (stmt)) { + if (single_non_eh_succ (gimple_bb (stmt))) + return false; + disqualify_base_of_expr (lhs, "LHS of a throwing stmt."); if (rhs) disqualify_base_of_expr (rhs, "RHS of a throwing stmt."); @@ -1180,7 +1204,7 @@ build_accesses_from_assign (gimple stmt) lhs = gimple_assign_lhs (stmt); rhs = gimple_assign_rhs1 (stmt); - if (disqualify_ops_if_throwing_stmt (stmt, lhs, rhs)) + if (disqualify_if_bad_bb_terminating_stmt (stmt, lhs, rhs)) return false; racc = build_access_from_expr_1 (rhs, stmt, false); @@ -1319,7 +1343,7 @@ scan_function (void) } t = gimple_call_lhs (stmt); - if (t && !disqualify_ops_if_throwing_stmt (stmt, t, NULL)) + if (t && !disqualify_if_bad_bb_terminating_stmt (stmt, t, NULL)) ret |= build_access_from_expr (t, stmt, true); break; @@ -2734,6 +2758,19 @@ get_access_for_expr (tree expr) return get_var_base_offset_size_access (base, offset, max_size); } +/* Split the single non-EH successor edge from BB (there must be exactly one) + and return a gimple iterator to the new block. 
*/ + +static gimple_stmt_iterator +gsi_for_eh_followups (basic_block bb) +{ + edge e = single_non_eh_succ (bb); + gcc_assert (e); + + basic_block new_bb = split_edge (e); + return gsi_start_bb (new_bb); +} + /* Replace the expression EXPR with a scalar replacement if there is one and generate other statements to do type conversion or subtree copying if necessary. GSI is used to place newly created statements, WRITE is true if @@ -2763,6 +2800,13 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write) type = TREE_TYPE (*expr); loc = gimple_location (gsi_stmt (*gsi)); + gimple_stmt_iterator alt_gsi = gsi_none (); + if (write && stmt_ends_bb_p (gsi_stmt (*gsi))) +{ + alt_gsi = gsi_for_eh_followups (gsi_bb (*gsi)); + gsi =
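The edge selection described in the SRA patch above can be modelled outside GCC. The following is only a toy sketch: `Edge` and `Block` are hypothetical stand-ins for GCC's CFG types, and the function merely mirrors the "exactly one non-EH successor, else nothing" rule stated in the patch comment.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for GCC's CFG types; only what the sketch needs.
struct Edge { int dest; bool is_eh; };
struct Block { std::vector<Edge> succs; };

// Return the only non-EH successor edge of BB, or nullptr if there is
// none or more than one.
const Edge *single_non_eh_succ(const Block &bb)
{
  const Edge *res = nullptr;
  for (const Edge &e : bb.succs)
    if (!e.is_eh)
      {
        if (res)
          return nullptr;   // a second non-EH edge: the successor is not single
        res = &e;
      }
  return res;
}
```

A block with one fall-through edge and one EH edge yields the fall-through edge; a block with two ordinary successors, or with only EH successors, yields nothing, which corresponds to the cases where the patch still disqualifies the aggregate.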
[patch] Add const to constexpr member functions
Add const to functions that would change meaning in C++14. Tested x86_64-linux, committed to trunk. commit f30d35d5a7aa3ff2e0d0e4010aecaf9f5a5fb9ed Author: Jonathan Wakely a...@kayari.org Date: Tue Apr 15 18:55:45 2014 +0100 * include/bits/atomic_base.h (__atomic_base<_PTp*>::_M_type_size): Add const to constexpr member functions. diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h index 242459a..1fc0ebb 100644 --- a/libstdc++-v3/include/bits/atomic_base.h +++ b/libstdc++-v3/include/bits/atomic_base.h @@ -675,10 +675,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Factored out to facilitate explicit specialization. constexpr ptrdiff_t - _M_type_size(ptrdiff_t __d) { return __d * sizeof(_PTp); } + _M_type_size(ptrdiff_t __d) const { return __d * sizeof(_PTp); } constexpr ptrdiff_t - _M_type_size(ptrdiff_t __d) volatile { return __d * sizeof(_PTp); } + _M_type_size(ptrdiff_t __d) const volatile { return __d * sizeof(_PTp); } public: __atomic_base() noexcept = default;
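As background for why the explicit const matters: in C++11 a constexpr non-static member function is implicitly const, while in C++14 it is not, so an unadorned declaration changes meaning between the two. A small sketch follows, using a hypothetical `Scaler` type that is not libstdc++ code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type for illustration; not libstdc++ code.
struct Scaler
{
  // Explicit const: callable on const objects in both C++11 and C++14.
  // Without it, C++14 would treat this as a non-const member function
  // and the call on the constexpr object below would be rejected.
  constexpr std::ptrdiff_t
  scale(std::ptrdiff_t d) const
  { return d * static_cast<std::ptrdiff_t>(sizeof(long)); }
};

constexpr Scaler kScaler{};
static_assert(kScaler.scale(2) == 2 * static_cast<std::ptrdiff_t>(sizeof(long)),
              "const constexpr member is usable on a constexpr object");
```

Spelling out const, as the patch does, keeps the same meaning under either language version.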
[patch] Use delegating constructors in std::shared_ptr
A minor simplification that removes a longstanding TODO note. Tested x86_64-linux, committed to trunk. commit 7769b63f43899b901bba08e5b2b3a6806e2195f2 Author: Jonathan Wakely a...@kayari.org Date: Tue Apr 15 19:00:47 2014 +0100 * include/bits/shared_ptr.h (shared_ptr::shared_ptr(nullptr_t)): Use delegating constructor. * include/bits/shared_ptr_base.h (__shared_ptr::__shared_ptr(nullptr_t)): Likewise. diff --git a/libstdc++-v3/include/bits/shared_ptr.h b/libstdc++-v3/include/bits/shared_ptr.h index 081d3bd..104c869 100644 --- a/libstdc++-v3/include/bits/shared_ptr.h +++ b/libstdc++-v3/include/bits/shared_ptr.h @@ -262,8 +262,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __p A null pointer constant. * @post use_count() == 0 && get() == nullptr */ - constexpr shared_ptr(nullptr_t __p) noexcept - : __shared_ptr<_Tp>(__p) { } + constexpr shared_ptr(nullptr_t __p) noexcept : shared_ptr() { } shared_ptr& operator=(const shared_ptr&) noexcept = default; diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h index 536df01..57398af 100644 --- a/libstdc++-v3/include/bits/shared_ptr_base.h +++ b/libstdc++-v3/include/bits/shared_ptr_base.h @@ -963,10 +963,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __shared_ptr(std::auto_ptr<_Tp1>&& __r); #endif - /* TODO: use delegating constructor */ - constexpr __shared_ptr(nullptr_t) noexcept - : _M_ptr(0), _M_refcount() - { } + constexpr __shared_ptr(nullptr_t) noexcept : __shared_ptr() { } template<typename _Tp1> __shared_ptr
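The delegating-constructor idiom used in this change can be shown in isolation. This is a sketch with a hypothetical `Handle` class; it is not the libstdc++ implementation, only the same C++11 technique of forwarding the nullptr_t constructor to the default constructor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal handle type; not the libstdc++ implementation.
class Handle
{
  int *ptr_;
  long count_;

public:
  constexpr Handle() noexcept : ptr_(nullptr), count_(0) { }

  // Delegates to the default constructor instead of repeating its
  // member initialisers, so the two can never drift apart.
  constexpr Handle(std::nullptr_t) noexcept : Handle() { }

  constexpr int *get() const noexcept { return ptr_; }
  constexpr long use_count() const noexcept { return count_; }
};
```

Constructing from nullptr then gives the same state as default construction, which is the postcondition the header comment documents.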
[patch] Fix non-reserved names in atomic
The parameter names should all be reserved names. Also remove some trailing whitespace at the end of lines. Tested x86_64-linux, committed to trunk. commit b6e5e08880da7d80c1f14c132bd4bd8eed301205 Author: Jonathan Wakely a...@kayari.org Date: Tue Apr 15 19:01:36 2014 +0100 * include/std/atomic: Uglify parameter names. diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic index 1b8e445..be7d0be 100644 --- a/libstdc++-v3/include/std/atomic +++ b/libstdc++-v3/include/std/atomic @@ -200,43 +200,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return __atomic_is_lock_free(sizeof(_M_i), nullptr); } void - store(_Tp __i, memory_order _m = memory_order_seq_cst) noexcept - { __atomic_store(&_M_i, &__i, _m); } + store(_Tp __i, memory_order __m = memory_order_seq_cst) noexcept + { __atomic_store(&_M_i, &__i, __m); } void - store(_Tp __i, memory_order _m = memory_order_seq_cst) volatile noexcept - { __atomic_store(&_M_i, &__i, _m); } + store(_Tp __i, memory_order __m = memory_order_seq_cst) volatile noexcept + { __atomic_store(&_M_i, &__i, __m); } _Tp - load(memory_order _m = memory_order_seq_cst) const noexcept + load(memory_order __m = memory_order_seq_cst) const noexcept { _Tp tmp; - __atomic_load(&_M_i, &tmp, _m); + __atomic_load(&_M_i, &tmp, __m); return tmp; } _Tp - load(memory_order _m = memory_order_seq_cst) const volatile noexcept + load(memory_order __m = memory_order_seq_cst) const volatile noexcept { _Tp tmp; - __atomic_load(&_M_i, &tmp, _m); + __atomic_load(&_M_i, &tmp, __m); return tmp; } _Tp - exchange(_Tp __i, memory_order _m = memory_order_seq_cst) noexcept + exchange(_Tp __i, memory_order __m = memory_order_seq_cst) noexcept { _Tp tmp; - __atomic_exchange(&_M_i, &__i, &tmp, _m); + __atomic_exchange(&_M_i, &__i, &tmp, __m); return tmp; } _Tp exchange(_Tp __i, - memory_order _m = memory_order_seq_cst) volatile noexcept + memory_order __m = memory_order_seq_cst) volatile noexcept { _Tp tmp; - __atomic_exchange(&_M_i, &__i, &tmp, _m); + __atomic_exchange(&_M_i, &__i, &tmp, 
__m); return tmp; } @@ -244,14 +244,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION compare_exchange_weak(_Tp& __e, _Tp __i, memory_order __s, memory_order __f) noexcept { - return __atomic_compare_exchange(&_M_i, &__e, &__i, true, __s, __f); + return __atomic_compare_exchange(&_M_i, &__e, &__i, true, __s, __f); } bool compare_exchange_weak(_Tp& __e, _Tp __i, memory_order __s, memory_order __f) volatile noexcept { - return __atomic_compare_exchange(&_M_i, &__e, &__i, true, __s, __f); + return __atomic_compare_exchange(&_M_i, &__e, &__i, true, __s, __f); } bool @@ -270,14 +270,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION compare_exchange_strong(_Tp& __e, _Tp __i, memory_order __s, memory_order __f) noexcept { - return __atomic_compare_exchange(&_M_i, &__e, &__i, false, __s, __f); + return __atomic_compare_exchange(&_M_i, &__e, &__i, false, __s, __f); } bool compare_exchange_strong(_Tp& __e, _Tp __i, memory_order __s, memory_order __f) volatile noexcept { - return __atomic_compare_exchange(&_M_i, &__e, &__i, false, __s, __f); + return __atomic_compare_exchange(&_M_i, &__e, &__i, false, __s, __f); } bool
Re: [v3] Slightly improve operator new
Ping http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00051.html On Sun, 2 Mar 2014, Marc Glisse wrote: Hello, inlining operator new (with LTO or otherwise), I noticed that it has a complicated implementation, which makes it hard to use this inlined code for optimizations. This patch does two things: 1) there are 2 calls to malloc, I am turning them into just one. At -Os, it does not change the generated code (RTL optimizers manage to merge the calls to malloc). At other levels (-O2, -O3, and especially with -g) it gives a smaller object file. And with just one malloc, some optimizations become much easier (see my recent calloc patch for instance). 2) malloc is predicted to return null 19 times out of 20 because of the loop (that didn't change with the patch), so I am adding __builtin_expect to let gcc optimize the fast path. Further discussion: a) I didn't add __builtin_expect for the test (sz == 0), it didn't change the generated code in my limited test. I was wondering if this test is necessary (new doesn't seem to ever call operator new(0)) or could be moved to operator new[] (new type[0] does call operator new[](0)), but since one can call operator new directly, it has to be protected indeed, so let's forget this point ;-) (too bad malloc is replaceable, so we can't use the fact that glibc already does the right thing) b) I have a bit of trouble parsing the standard. Is the nothrow operator new supposed to call the regular operator new? In particular, if a user replaces only the throwing operator new, should the nothrow operator new automatically call that function? That's not what we are currently doing (and it would be a perf regression). Required behavior: Return a non-null pointer to suitably aligned storage (3.7.4), or else return a null pointer. This nothrow version of operator new returns a pointer obtained as if acquired from the (possibly replaced) ordinary version. This requirement is binding on a replacement version of this function. 
Default behavior: Calls operator new(size). If the call returns normally, returns the result of that call. Otherwise, returns a null pointer. Passes bootstrap+testsuite on x86_64-linux-gnu. Stage 1? 2014-03-03 Marc Glisse marc.gli...@inria.fr * libsupc++/new_op.cc: Factor the calls to malloc, use __builtin_expect. * libsupc++/new_opnt.cc: Likewise. -- Marc Glisse
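The two changes described in this message (a single malloc call, and `__builtin_expect` on the failure test) can be sketched roughly as follows. This is an illustration under assumptions, not the committed libsupc++ code: the function is deliberately named `allocate_or_throw` (a made-up name) so that it does not replace the real operator new, and `__builtin_expect` is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the throwing operator new.
void *allocate_or_throw(std::size_t sz)
{
  if (sz == 0)
    sz = 1;   // a zero-size request must still return a usable pointer

  void *p;
  // One malloc call, placed in the loop condition; the hint tells the
  // compiler that failure is the unlikely path.
  while (__builtin_expect((p = std::malloc(sz)) == nullptr, false))
    {
      std::new_handler handler = std::get_new_handler();
      if (!handler)
        throw std::bad_alloc();
      handler();   // may release memory, throw, or terminate
    }
  return p;
}
```

With the allocation expressed as one call, the successful path is a straight line through the function, which is the property the message says makes later optimisations easier.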
Re: calloc = malloc + memset
Let me ping this. There's no hurry, but it may have got lost with 4.9 approaching. http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01205.html On Sun, 23 Mar 2014, Marc Glisse wrote: On Mon, 3 Mar 2014, Richard Biener wrote: That's a bit much of ad-hoc pattern-matching ... wouldn't be p = malloc (n); memset (p, 0, n); transform better suited to the strlen opt pass? After all that tracks what 'string' is associated with a SSA name pointer through arbitrary statements using a lattice. Like this? I had to move the strlen pass after the loop passes (and after dom or everything was too dirty) but long enough before the end (some optimizations are necessary after strlen). As a bonus, one more strlen is optimized in the current testcases :-) Running the pass twice would be another option I guess (it would require implementing the clone method), but without a testcase showing it is needed... Passes bootstrap+testsuite on x86_64-linux-gnu. 2014-03-23 Marc Glisse marc.gli...@inria.fr PR tree-optimization/57742 gcc/ * tree-ssa-strlen.c (get_string_length): Ignore malloc. (handle_builtin_malloc, handle_builtin_memset): New functions. (strlen_optimize_stmt): Call them. * passes.def: Move strlen after loop+dom. gcc/testsuite/ * g++.dg/tree-ssa/calloc.C: New testcase. * gcc.dg/tree-ssa/calloc-1.c: Likewise. * gcc.dg/tree-ssa/calloc-2.c: Likewise. * gcc.dg/strlenopt-9.c: Adapt. -- Marc Glisse
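To make the source pattern concrete, here is a sketch of the two equivalent forms the thread title refers to. The function names are made up, and the null check in the first function is an addition for the sake of a safe standalone example; the thread does not say whether the pass recognises this exact shape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Pattern under discussion: allocate, then zero the whole block.
unsigned char *zeroed_via_malloc(std::size_t n)
{
  unsigned char *p = static_cast<unsigned char *>(std::malloc(n));
  if (p)                      // added here for safety in the sketch
    std::memset(p, 0, n);
  return p;
}

// What the pair of calls is equivalent to.
unsigned char *zeroed_via_calloc(std::size_t n)
{
  return static_cast<unsigned char *>(std::calloc(n, 1));
}
```

Both functions hand back n zero bytes; the optimisation amounts to noticing that the first form can be expressed as the second.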
[patch] Fix libstdc++ tests w.r.t PR c++/60786
G++ accepts these tests but it shouldn't, and clang doesn't, so this makes them valid C++11. Tested x86_64-linux, committed to trunk. commit f1517e2ae280691724472cbd0f6b31fa98f313d0 Author: Jonathan Wakely jwak...@redhat.com Date: Tue Apr 15 19:39:15 2014 +0100 PR c++/60786 * testsuite/20_util/shared_ptr/requirements/explicit_instantiation/1.cc: Fix invalid explicit instantiations with unqualified names. * testsuite/20_util/shared_ptr/requirements/explicit_instantiation/2.cc: Likewise. * testsuite/20_util/tuple/53648.cc: Likewise. * testsuite/20_util/weak_ptr/requirements/explicit_instantiation/1.cc: Likewise. * testsuite/20_util/weak_ptr/requirements/explicit_instantiation/2.cc: Likewise. * testsuite/23_containers/unordered_map/requirements/debug_container.cc: Likewise. * testsuite/23_containers/unordered_map/requirements/explicit_instantiation/3.cc: Likewise. * testsuite/23_containers/unordered_multimap/requirements/debug.cc: Likewise. * testsuite/23_containers/unordered_multimap/requirements/explicit_instantiation/3.cc: Likewise. * testsuite/23_containers/unordered_multiset/requirements/debug.cc: Likewise. * testsuite/23_containers/unordered_multiset/requirements/explicit_instantiation/3.cc: Likewise. * testsuite/23_containers/unordered_set/requirements/debug_container.cc: Likewise. * testsuite/23_containers/unordered_set/requirements/explicit_instantiation/3.cc: Likewise. diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/explicit_instantiation/1.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/explicit_instantiation/1.cc index 40ebec0..0d81481 100644 --- a/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/explicit_instantiation/1.cc +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/explicit_instantiation/1.cc @@ -1,4 +1,4 @@ -// { dg-options "-std=gnu++0x" } +// { dg-options "-std=gnu++11" } // { dg-do compile } // Copyright (C) 2006-2014 Free Software Foundation, Inc. 
@@ -24,8 +24,7 @@ #include <testsuite_tr1.h> using namespace __gnu_test; -using std::shared_ptr; -template class shared_ptr<int>; -template class shared_ptr<void>; -template class shared_ptr<ClassType>; -template class shared_ptr<IncompleteClass>; +template class std::shared_ptr<int>; +template class std::shared_ptr<void>; +template class std::shared_ptr<ClassType>; +template class std::shared_ptr<IncompleteClass>; diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/explicit_instantiation/2.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/explicit_instantiation/2.cc index 148375a..37a0c1d 100644 --- a/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/explicit_instantiation/2.cc +++ b/libstdc++-v3/testsuite/20_util/shared_ptr/requirements/explicit_instantiation/2.cc @@ -1,4 +1,4 @@ -// { dg-options "-std=gnu++0x" } +// { dg-options "-std=gnu++11" } // { dg-do compile } // Copyright (C) 2007-2014 Free Software Foundation, Inc. @@ -27,8 +27,7 @@ // library this checks the templates can be instantiated for non-default // lock policy, for a single-threaded lib this is redundant but harmless. using namespace __gnu_test; -using std::__shared_ptr; using std::_S_single; -template class __shared_ptr<int, _S_single>; -template class __shared_ptr<ClassType, _S_single>; -template class __shared_ptr<IncompleteClass, _S_single>; +template class std::__shared_ptr<int, _S_single>; +template class std::__shared_ptr<ClassType, _S_single>; +template class std::__shared_ptr<IncompleteClass, _S_single>; diff --git a/libstdc++-v3/testsuite/20_util/tuple/53648.cc b/libstdc++-v3/testsuite/20_util/tuple/53648.cc index 7bde67e..fb37638 100644 --- a/libstdc++-v3/testsuite/20_util/tuple/53648.cc +++ b/libstdc++-v3/testsuite/20_util/tuple/53648.cc @@ -1,4 +1,4 @@ -// { dg-options "-std=gnu++0x" } +// { dg-options "-std=gnu++11" } // { dg-do compile } // Copyright (C) 2012-2014 Free Software Foundation, Inc. 
@@ -27,10 +27,10 @@ using std::tuple; struct A { }; -template class tupletuple; -template class tupletupletuple; -template class tupleA, tupleA, tupleA, tupleA; -template class tupletupletupleA, A, A, A; +template class std::tupletuple; +template class std::tupletupletuple; +template class std::tupleA, tupleA, tupleA, tupleA; +template class std::tupletupletupleA, A, A, A; // Verify the following QoI properties are preserved diff --git a/libstdc++-v3/testsuite/20_util/weak_ptr/requirements/explicit_instantiation/1.cc b/libstdc++-v3/testsuite/20_util/weak_ptr/requirements/explicit_instantiation/1.cc index c5a30f2..0a15e46 100644 --- a/libstdc++-v3/testsuite/20_util/weak_ptr/requirements/explicit_instantiation/1.cc +++ b/libstdc++-v3/testsuite/20_util/weak_ptr/requirements/explicit_instantiation/1.cc @@ -1,4 +1,4 @@ -// { dg-options -std=gnu++0x } +// { dg-options
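The form the fixed tests switch to, an explicit instantiation that names the template by its qualified name, looks like this in isolation. The namespace `lib` and template `box` are hypothetical stand-ins for `std` and its templates; the sketch shows only the accepted form, not the rejected one.

```cpp
#include <cassert>

namespace lib   // hypothetical namespace standing in for std
{
  template<typename T>
  struct box
  {
    T value;
    T get() const { return value; }
  };
}

// Explicit instantiation from outside the namespace, naming the template
// with a qualified name, as the updated tests do for std::shared_ptr<int>.
template class lib::box<int>;
```

The tests previously reached the template through a using-declaration and an unqualified name, which is what the commit message describes as invalid.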
Re: Make string_view operations involving CharT* *not* noexcept and consistent between string_view and string_view.tcc.
On 29/03/14 14:54 -0400, Ed Smith-Rowland wrote: All, In string_view I botched the noexcept specification of operations like find and friends with CharT* arguments. I'm a little surprised the inconsistency between string_view and string_view.tcc didn't error. In fact, in one repo that's a little behind trunk it does. I'll continue to look after that issue separately. I'm fixing this differently, by strengthening the exception specs as Marc suggested. I haven't addressed Marc's other comments, but we should do. Tested x86_64-linux, committed to trunk. commit 5ac00aa4544a4c10c3eeadb8ca2a3ce57d9e62ce Author: Jonathan Wakely jwak...@redhat.com Date: Tue Apr 15 19:45:29 2014 +0100 * include/experimental/string_view: Fix inconsistent exception specs. diff --git a/libstdc++-v3/include/experimental/string_view b/libstdc++-v3/include/experimental/string_view index bebeb6b..6b6588b 100644 --- a/libstdc++-v3/include/experimental/string_view +++ b/libstdc++-v3/include/experimental/string_view @@ -329,7 +329,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION find(_CharT __c, size_type __pos=0) const noexcept; size_type - find(const _CharT* __str, size_type __pos, size_type __n) const; + find(const _CharT* __str, size_type __pos, size_type __n) const noexcept; size_type find(const _CharT* __str, size_type __pos=0) const noexcept @@ -343,7 +343,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION rfind(_CharT __c, size_type __pos = npos) const noexcept; size_type - rfind(const _CharT* __str, size_type __pos, size_type __n) const; + rfind(const _CharT* __str, size_type __pos, size_type __n) const noexcept; size_type rfind(const _CharT* __str, size_type __pos = npos) const noexcept
[patch] libstdc++ testsuite tweaks
Three little fixes. Tested x86_64-linux, committed to trunk. commit 17beaac60c01164cc496da57a2a9ced7a487d17d Author: Jonathan Wakely jwak...@redhat.com Date: Tue Apr 15 19:41:42 2014 +0100 * testsuite/24_iterators/insert_iterator/requirements/container.cc: Do not use uninitialized members in mem-initializers. * testsuite/ext/throw_value/cons.cc: Fix most vexing parse. * testsuite/util/testsuite_common_types.h: Update comment. diff --git a/libstdc++-v3/testsuite/24_iterators/insert_iterator/requirements/container.cc b/libstdc++-v3/testsuite/24_iterators/insert_iterator/requirements/container.cc index 9aea7c9..162a16e 100644 --- a/libstdc++-v3/testsuite/24_iterators/insert_iterator/requirements/container.cc +++ b/libstdc++-v3/testsuite/24_iterators/insert_iterator/requirements/container.cc @@ -26,9 +26,9 @@ // Check data member 'container' accessible. class test_dm : public std::insert_iteratorstd::listint { - container_type l; - container_type::iterator i; + container_type l(); + container_type::iterator i(); container_type* p; public: - test_dm(): std::insert_iteratorstd::listint (l, i), p(container) { } + test_dm(): std::insert_iteratorstd::listint (l(), i()), p(container) { } }; diff --git a/libstdc++-v3/testsuite/ext/throw_value/cons.cc b/libstdc++-v3/testsuite/ext/throw_value/cons.cc index 40e67a8..c668975 100644 --- a/libstdc++-v3/testsuite/ext/throw_value/cons.cc +++ b/libstdc++-v3/testsuite/ext/throw_value/cons.cc @@ -1,4 +1,4 @@ -// { dg-options -std=gnu++0x } +// { dg-options -std=gnu++11 } // Copyright (C) 2009-2014 Free Software Foundation, Inc. 
// @@ -24,8 +24,8 @@ void foo1() { typedef __gnu_cxx::throw_value_limit value_type; value_type v1; - value_type v2(v2); - value_type v3(value_type()); + value_type v2{v1}; + value_type v3{value_type()}; } bool foo2() diff --git a/libstdc++-v3/testsuite/util/testsuite_common_types.h b/libstdc++-v3/testsuite/util/testsuite_common_types.h index 63339ef..abf6ea9 100644 --- a/libstdc++-v3/testsuite/util/testsuite_common_types.h +++ b/libstdc++-v3/testsuite/util/testsuite_common_types.h @@ -689,7 +689,8 @@ namespace __gnu_test struct _Concept; // NB: _Tp must be a literal type. -// Have to have user-defined default ctor for this to work. +// Have to have user-defined default ctor for this to work, +// or implicit default ctor must initialize all members. template<typename _Tp> struct _Concept<_Tp, true> {
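The "most vexing parse" fix in the throw_value test can be illustrated on its own. In the sketch below `Value` is a hypothetical type; the commented-out line shows the form that declares a function rather than an object, and the brace form is the one the patch switches to.

```cpp
#include <cassert>

// Hypothetical type for illustration.
struct Value
{
  int n;
  Value() : n(7) { }
};

int braces_make_an_object()
{
  // Value v(Value());   // most vexing parse: declares a function named v
  Value v{Value()};      // brace initialisation: unambiguously an object
  return v.n;
}
```

The same hunk also replaces `v2(v2)`, which initialises an object from itself, with `v2{v1}`.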
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On Tue, Apr 15, 2014 at 09:59:10AM -0400, Jason Merrill wrote: On 04/14/2014 11:10 AM, Marek Polacek wrote: + else if (TREE_CODE (val) == IDENTIFIER_NODE) +{ + tree t = lookup_name (val); + if (t && TREE_CODE (t) == CONST_DECL) +return DECL_INITIAL (t); +} I'm uncomfortable with this; we should have looked up any attributes in the parser. Does the testsuite hit this code? Thanks for looking at it. So the newer version of the patch contains: + else if (TREE_CODE (val) == IDENTIFIER_NODE) + { + tree t = lookup_name (val); + if (t && TREE_CODE (t) == CONST_DECL) + val = default_conversion (t); + } The testsuite doesn't hit this code with C++, but does hit this code with C. The thing is, if we have e.g. enum { A = 128 }; void *fn1 (void) __attribute__((assume_aligned (A))); then handle_assume_aligned_attribute walks the attribute arguments and gets the argument via TREE_VALUE. If this argument is an enum value, then for C the argument is identifier_node that contains const_decl, but for C++ the argument is directly const_decl. That means for C++ in get_attrib_value we just call default_conversion as before, but for C we call lookup_name first. Does this answer your question? Marek
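As a small illustration of an enumerator used as an attribute argument, here is a sketch compiled as C++, where (per the message above) the front end already sees the constant directly. It uses the GNU `aligned` attribute rather than `assume_aligned` so that the effect can be checked with `alignof`; that substitution, and the type name `Padded`, are choices made for this example and are not taken from the patch.

```cpp
#include <cassert>

enum { A = 128 };

// GNU attribute syntax with an enumerator as the alignment argument.
struct __attribute__((aligned(A))) Padded
{
  char c;
};

static_assert(alignof(Padded) == 128,
              "enumerator was accepted as the alignment value");
```

The thread is about making the C front end accept the same kind of argument.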
RE: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
-Original Message- From: Moore, Catherine Sent: Tuesday, April 15, 2014 8:49 AM To: Rozycki, Maciej; Richard Sandiford Cc: Matthew Fortune; gcc-patches@gcc.gnu.org; Moore, Catherine Subject: RE: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16 -Original Message- From: Maciej W. Rozycki [mailto:ma...@codesourcery.com] Sent: Tuesday, April 15, 2014 7:28 AM To: Richard Sandiford Cc: Matthew Fortune; Moore, Catherine; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16 On Tue, 15 Apr 2014, Richard Sandiford wrote: I believe you need to adjust constraints to ensure constant 0 is known to produce a 16-bit instruction encoding where possible. Otherwise you'll end up with suboptimal code when the instruction is in a branch delay slot. Yeah, it'd be good to do that too (although this is a preexisting problem). Well, it depends on how you look at the problem being solved here -- if it is for SW16, SH16 and SB16 GCC produces broken code for the `s0' source register, then indeed it is, whereas if it is GCC does not handle the source register set for SW16, SH16 and SB16 correctly, then it is a part of the same problem, not completely corrected. I can live with that until 4.10/4.9.1 though if you prefer. I'm relying on you guys to do the microMIPS stuff though -- I don't have a way of testing it. An assembly/objdump test is enough to cover this, so you've got all tools at hand, although I understand you may not be inclined to rush working on it. ;) I'll take care of this bit. I've attached an updated patch to address Maciej's concern with $0 and the microMIPS store instructions. Does this look okay to install? Thanks, Catherine umips-zero.cl Description: umips-zero.cl umips-zero.patch Description: umips-zero.patch
Re: [patch] Use delegating constructors in std::shared_ptr
On 15/04/14 21:51 +0200, Václav Zeman wrote: On 04/15/2014 08:29 PM, Jonathan Wakely wrote: A minor simplification that removes a longstanding TODO note. Tested x86_64-linux, committed to trunk. diff --git a/libstdc++-v3/include/bits/shared_ptr.h b/libstdc++-v3/include/bits/shared_ptr.h index 081d3bd..104c869 100644 --- a/libstdc++-v3/include/bits/shared_ptr.h +++ b/libstdc++-v3/include/bits/shared_ptr.h @@ -262,8 +262,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __p A null pointer constant. * @post use_count() == 0 && get() == nullptr */ - constexpr shared_ptr(nullptr_t __p) noexcept - : __shared_ptr<_Tp>(__p) { } + constexpr shared_ptr(nullptr_t __p) noexcept : shared_ptr() { } ^^^ Will this not cause unused parameter warning or some such? Not usually, because it's in a system header, but I'll commit the attached patch when it finishes testing anyway. Thanks. commit 1048a84b2f4a8626fe32cbfca5fd65701a71bd58 Author: Jonathan Wakely jwak...@redhat.com Date: Tue Apr 15 20:56:59 2014 +0100 * include/bits/shared_ptr.h (shared_ptr::shared_ptr(nullptr_t)): Remove name of unused parameter. diff --git a/libstdc++-v3/include/bits/shared_ptr.h b/libstdc++-v3/include/bits/shared_ptr.h index 104c869..290a0c9 100644 --- a/libstdc++-v3/include/bits/shared_ptr.h +++ b/libstdc++-v3/include/bits/shared_ptr.h @@ -262,7 +262,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __p A null pointer constant. * @post use_count() == 0 && get() == nullptr */ - constexpr shared_ptr(nullptr_t __p) noexcept : shared_ptr() { } + constexpr shared_ptr(nullptr_t) noexcept : shared_ptr() { } shared_ptr& operator=(const shared_ptr&) noexcept = default;
Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
Moore, Catherine <catherine_mo...@mentor.com> writes:
> -----Original Message-----
> From: Moore, Catherine
> Sent: Tuesday, April 15, 2014 8:49 AM
> To: Rozycki, Maciej; Richard Sandiford
> Cc: Matthew Fortune; gcc-patches@gcc.gnu.org; Moore, Catherine
> Subject: RE: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
>
> -----Original Message-----
> From: Maciej W. Rozycki [mailto:ma...@codesourcery.com]
> Sent: Tuesday, April 15, 2014 7:28 AM
> To: Richard Sandiford
> Cc: Matthew Fortune; Moore, Catherine; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [MIPS] Fix operands for microMIPS SW16, SH16 and SB16
>
> On Tue, 15 Apr 2014, Richard Sandiford wrote:
>
> I believe you need to adjust constraints to ensure constant 0 is known to produce a 16-bit instruction encoding where possible. Otherwise you'll end up with suboptimal code when the instruction is in a branch delay slot.
>
> Yeah, it'd be good to do that too (although this is a preexisting problem).
>
> Well, it depends on how you look at the problem being solved here -- if it is "for SW16, SH16 and SB16 GCC produces broken code for the `s0' source register", then indeed it is, whereas if it is "GCC does not handle the source register set for SW16, SH16 and SB16 correctly", then it is a part of the same problem, not completely corrected.
>
> I can live with that until 4.10/4.9.1 though if you prefer. I'm relying on you guys to do the microMIPS stuff though -- I don't have a way of testing it.
>
> An assembly/objdump test is enough to cover this, so you've got all tools at hand, although I understand you may not be inclined to rush working on it. ;)
>
> I'll take care of this bit.
>
> I've attached an updated patch to address Maciej's concern with $0 and the microMIPS store instructions. Does this look okay to install?

No, the point was that zero is modelled as a constant in RTL, so like Maciej says, the way to handle it is to use the J constraint (like some of the existing constraints use dJ for "any GPR or zero").

What we want to test is that:

    *ptr = 0;

is a 16-bit instruction. You could do that by adding -dp to the options and matching something like:

    MICROMIPS void
    f1 (unsigned char *ptr)
    {
      *ptr = 0;
    }
    ...[similarly for short and int]...

    /* { dg-final { scan-assembler "\tsb\t\\\$0, 0\\(\\\$4\\)\[^\n\]length = 2" } } */
    ...[similarly for sh and sw]...

Completely untested. I bet the regexp needs different backslashes. :-)

Thanks,
Richard
[RFC][PATCH] RL78 - clean-up of missing operand mode warnings.
Hi,

This patch cleans up some warnings when building due to missing operand modes. trampoline_init in rl78.md still produces warnings but I'm not entirely sure about how best to fix that insn and I didn't want to break anything.

Regards,
Richard

2014-04-15  Richard Hulme  <pepe...@yahoo.com>

    * config/rl78/rl78.md (addsi3, addsi3_internal_virt,
    addsi3_internal_real, subsi3, subsi3_internal_virt,
    subsi3_internal_real): Add missing modes to operands.
    * config/rl78/rl78-real.md (*movqi_real, *xorqi3_real): Likewise.
    * config/rl78/rl78-virt.md (*movqi_virt, *xorqi3_vidr): Likewise.

---
 gcc/config/rl78/rl78-real.md |  4 ++--
 gcc/config/rl78/rl78-virt.md |  4 ++--
 gcc/config/rl78/rl78.md      | 12 ++--
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/config/rl78/rl78-real.md b/gcc/config/rl78/rl78-real.md
index 5d5c598..847a82d 100644
--- a/gcc/config/rl78/rl78-real.md
+++ b/gcc/config/rl78/rl78-real.md
@@ -45,7 +45,7 @@
 
 (define_insn "*movqi_real"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=g,RaxbcWab,RaxbcWab,a, bcx,R, WabWd2WhlWh1WhbWbcWs1v, bcx")
-       (match_operand 1 "general_operand" "0,K,M, RInt8sJvWabWdeWd2WhlWh1WhbWbcWs1,Wab,aInt8J,a, R"))]
+       (match_operand:QI 1 "general_operand" "0,K,M, RInt8sJvWabWdeWd2WhlWh1WhbWbcWs1,Wab,aInt8J,a, R"))]
   "rl78_real_insns_ok ()"
   "@
    ; mov\t%0, %1
@@ -194,7 +194,7 @@
 (define_insn "*xorqi3_real"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=A,R,v")
        (xor:QI (match_operand:QI 1 "general_operand" "%0,0,0")
-               (match_operand 2 "general_operand" "iRvWabWhbWh1Whl,A,i")))
+               (match_operand:QI 2 "general_operand" "iRvWabWhbWh1Whl,A,i")))
    ]
   "rl78_real_insns_ok ()"
   "xor\t%0, %2"
diff --git a/gcc/config/rl78/rl78-virt.md b/gcc/config/rl78/rl78-virt.md
index 1db3751..189cf79 100644
--- a/gcc/config/rl78/rl78-virt.md
+++ b/gcc/config/rl78/rl78-virt.md
@@ -35,7 +35,7 @@
 
 (define_insn "*movqi_virt"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=vY,v,Wfr")
-       (match_operand 1 "general_operand" "vInt8JY,Wfr,vInt8J"))]
+       (match_operand:QI 1 "general_operand" "vInt8JY,Wfr,vInt8J"))]
   "rl78_virt_insns_ok ()"
   "v.mov %0, %1"
   [(set_attr "valloc" "op1")]
@@ -126,7 +126,7 @@
 (define_insn "*xor3_virt"
   [(set (match_operand:QI 0 "rl78_nonfar_nonimm_operand" "=v,vm,m")
        (xor:QI (match_operand:QI 1 "rl78_nonfar_operand" "%0,vm,vm")
-               (match_operand 2 "general_operand" "i,vm,vim")))
+               (match_operand:QI 2 "general_operand" "i,vm,vim")))
    ]
   "rl78_virt_insns_ok ()"
   "v.xor\t%0, %1, %2"
diff --git a/gcc/config/rl78/rl78.md b/gcc/config/rl78/rl78.md
index eb4c468..ede4eac 100644
--- a/gcc/config/rl78/rl78.md
+++ b/gcc/config/rl78/rl78.md
@@ -208,7 +208,7 @@
 (define_expand "addsi3"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=vm")
        (plus:SI (match_operand:SI 1 "general_operand" "vim")
-                (match_operand 2 "general_operand" "vim")))
+                (match_operand:SI 2 "general_operand" "vim")))
    ]
   ""
   "emit_insn (gen_addsi3_internal_virt (operands[0], operands[1], operands[2]));
@@ -218,7 +218,7 @@
 (define_insn "addsi3_internal_virt"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=v,vm, vm")
        (plus:SI (match_operand:SI 1 "general_operand" "0, vim, vim")
-                (match_operand 2 "general_operand" "vim,vim,vim")))
+                (match_operand:SI 2 "general_operand" "vim,vim,vim")))
    (clobber (reg:HI AX_REG))
    (clobber (reg:HI BC_REG))
    ]
@@ -230,7 +230,7 @@
 (define_insn "addsi3_internal_real"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=v,vU, vU")
        (plus:SI (match_operand:SI 1 "general_operand" "+0, viU, viU")
-                (match_operand 2 "general_operand" "viWabWhlWh1,viWabWhlWh1,viWabWhlWh1")))
+                (match_operand:SI 2 "general_operand" "viWabWhlWh1,viWabWhlWh1,viWabWhlWh1")))
    (clobber (reg:HI AX_REG))
    (clobber (reg:HI BC_REG))
    ]
@@ -245,7 +245,7 @@
 (define_expand "subsi3"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=vm")
        (minus:SI (match_operand:SI 1 "general_operand" "vim")
-                 (match_operand 2 "general_operand" "vim")))
+                 (match_operand:SI 2 "general_operand" "vim")))
    ]
   ""
   "emit_insn (gen_subsi3_internal_virt (operands[0], operands[1], operands[2]));
@@ -255,7 +255,7 @@
 (define_insn "subsi3_internal_virt"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=v,vm, vm")
        (minus:SI (match_operand:SI 1 "general_operand" "0, vim, vim")
-                 (match_operand 2 "general_operand" "vim,vim,vim")))
+                 (match_operand:SI 2 "general_operand" "vim,vim,vim")))
    (clobber (reg:HI AX_REG))
    (clobber (reg:HI BC_REG))
    ]
@@ -267,7 +267,7 @@
 (define_insn "subsi3_internal_real"
   [(set (match_operand:SI
RFA: Tighten checking for 'X' constraints
As Robert pointed out here:

    http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00416.html

we're a bit too eager when folding stuff into an 'X' constraint. The value at expand time is sensible, but after that asm_operand_ok allows arbitrary rtx expressions, including any number of registers as well as MEMs with unchecked addresses.

This is a target-independent problem, as shown by the testcase below. Reload would give bogus "impossible constraint in asm" errors while LRA ICEs.

Tested on x86_64-linux-gnu. OK to install?

Thanks,
Richard

gcc/
    * recog.c (asm_operand_ok): Tighten MEM validity for 'X'.

gcc/testsuite/
    * gcc.dg/torture/asm-x-constraint-1.c: New test.

Index: gcc/recog.c
===================================================================
--- gcc/recog.c 2014-04-12 22:43:54.729854903 +0100
+++ gcc/recog.c 2014-04-15 21:47:32.139873570 +0100
@@ -1840,7 +1840,17 @@ asm_operand_ok (rtx op, const char *cons
          break;
 
        case 'X':
-         result = 1;
+         /* Although the asm itself doesn't impose any restrictions on
+            the operand, we still need to restrict it to something that
+            can be reloaded and printed.
+
+            MEM operands are always reloaded to make them legitimate,
+            regardless of the constraint, so we need to handle them
+            in the same way as for 'm' and 'g'.  Since 'X' is not treated
+            as an address constraint, the only other valid operand types
+            are constants and registers.  */
+         result = (CONSTANT_P (op)
+                   || general_operand (op, VOIDmode));
          break;
 
        case 'g':
Index: gcc/testsuite/gcc.dg/torture/asm-x-constraint-1.c
===================================================================
--- /dev/null 2014-04-15 08:10:27.294524132 +0100
+++ gcc/testsuite/gcc.dg/torture/asm-x-constraint-1.c 2014-04-15 19:11:29.830962008 +0100
@@ -0,0 +1,27 @@
+void
+noprop1 (int **x, int y, int z)
+{
+  int *ptr = *x + y * z / 11;
+  __asm__ __volatile__ ("noprop1 %0" : : "X" (*ptr));
+}
+
+void
+noprop2 (int **x, int y, int z)
+{
+  int *ptr = *x + y * z / 11;
+  __asm__ __volatile__ ("noprop2 %0" : : "X" (ptr));
+}
+
+int *global_var;
+
+void
+const1 (void)
+{
+  __asm__ __volatile__ ("const1 %0" : : "X" (global_var));
+}
+
+void
+const2 (void)
+{
+  __asm__ __volatile__ ("const2 %0" : : "X" (*global_var));
+}
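To make the effect of the tightened check concrete, here is a toy model of the old and new 'X' rules. Everything in it is an assumption made for illustration: `OperandKind` is a hypothetical five-way classification rather than GCC's rtx representation, and `is_general_operand` only approximates general_operand as "constant, register, or memory with a legitimate address".

```cpp
#include <cassert>

// Hypothetical operand classification; a coarse stand-in for rtx codes.
enum class OperandKind {
  Constant,        // integer or symbolic constant
  Register,        // a single register
  MemValidAddr,    // memory reference whose address is legitimate
  MemInvalidAddr,  // memory reference with an unchecked, invalid address
  Arithmetic       // arbitrary expression, e.g. a sum involving several registers
};

// Toy counterpart of CONSTANT_P.
inline bool is_constant(OperandKind k) {
  return k == OperandKind::Constant;
}

// Toy approximation of general_operand: constants, registers and
// memory references with a valid address.
inline bool is_general_operand(OperandKind k) {
  return k == OperandKind::Constant || k == OperandKind::Register ||
         k == OperandKind::MemValidAddr;
}

// Old behaviour in this model: 'X' accepted anything ("result = 1").
inline bool x_constraint_ok_old(OperandKind) { return true; }

// New behaviour in this model: same shape as the patched expression.
inline bool x_constraint_ok_new(OperandKind k) {
  return is_constant(k) || is_general_operand(k);
}
```

In this model the two predicates differ only for `MemInvalidAddr` and `Arithmetic`, which are the cases the message describes as leading to bogus reload errors or LRA ICEs.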
Re: [RFC][PATCH] RL78 - clean-up of missing operand mode warnings.
I typically leave the mode off when the operand accepts a CONST_INT, as I've had problems with patterns matching CONST_INTs otherwise: CONST_INT rtx's do not have a mode (or rather, have VOIDmode). (Yes, I know gcc is supposed to accommodate that, but like I said, I've had problems...)
Re: C++ PATCH for c++/51747 (list-initialization from same type)
On Tue, 15 Apr 2014, Jason Merrill wrote:
> It's just vectors, because they're an extension; the patch I checked in
> covered the standard language.

Like this? (regtested on x86_64-linux-gnu)

2014-04-16  Marc Glisse  <marc.gli...@inria.fr>

gcc/cp/
    * decl.c (reshape_init_r): Handle a single element of vector type.

gcc/testsuite/
    * g++.dg/cpp0x/initlist-vect.C: New file.

-- 
Marc Glisse

Index: gcc/cp/decl.c
===================================================================
--- gcc/cp/decl.c (revision 209434)
+++ gcc/cp/decl.c (working copy)
@@ -5400,21 +5400,21 @@ reshape_init_r (tree type, reshape_iter
            maybe_warn_cpp0x (CPP0X_INITIALIZER_LISTS);
        }
 
       d->cur++;
       return init;
     }
 
   /* If T is a class type and the initializer list has a single element of
      type cv U, where U is T or a class derived from T, the object is
      initialized from that element.  Even if T is an aggregate.  */
-  if (cxx_dialect >= cxx11 && CLASS_TYPE_P (type)
+  if (cxx_dialect >= cxx11 && (CLASS_TYPE_P (type) || VECTOR_TYPE_P (type))
       && first_initializer_p
       && d->end - d->cur == 1
       && reference_related_p (type, TREE_TYPE (init)))
     {
       d->cur++;
       return init;
     }
 
   /* [dcl.init.aggr]
Index: gcc/testsuite/g++.dg/cpp0x/initlist-vect.C
===================================================================
--- gcc/testsuite/g++.dg/cpp0x/initlist-vect.C (revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/initlist-vect.C (working copy)
@@ -0,0 +1,6 @@
+// { dg-do compile { target c++11 } }
+
+typedef float X __attribute__ ((vector_size (4 * sizeof (float))));
+
+X x;
+X x2{x};
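As a runnable companion to the new test, the sketch below shows single-element list-initialization from the same type for both an aggregate class and a vector type. `Aggregate`, `Vec4` and the two helper functions are names made up for this example. It assumes a compiler that supports the `vector_size` attribute and accepts `Vec4 w{v}`, which for GCC is the behaviour this patch is meant to add.

```cpp
#include <cassert>

// Hypothetical aggregate used only for this illustration.
struct Aggregate {
  int value;
};

// Vector extension type, declared the same way as X in the test above.
typedef float Vec4 __attribute__ ((vector_size (4 * sizeof (float))));

// Class case: the single initializer has type Aggregate, so b is
// initialized from a rather than b.value being initialized from a.
inline int copy_aggregate_value() {
  Aggregate a{7};
  Aggregate b{a};
  return b.value;
}

// Vector case: the same rule applied to a vector type.
inline float copy_vector_lane() {
  Vec4 v = {1.0f, 2.0f, 3.0f, 4.0f};
  Vec4 w{v};
  return w[2];
}
```

If the compiler rejects `Vec4 w{v}`, that is the situation the patch addresses; the class case is the part described in the quoted message as already covered for the standard language.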
Minor ipa-devirt improvement
Hi,

this patch prevents ipa-devirt from devirtualizing to functions that are not exported and have no address taken from a virtual table; those are obviously not going to be targets of virtual calls.

Honza

    * ipa-devirt.c (referenced_from_vtable_p): New predicate.
    (maybe_record_node, likely_target_p): Use it.

Index: ipa-devirt.c
===================================================================
--- ipa-devirt.c (revision 209391)
+++ ipa-devirt.c (working copy)
@@ -598,6 +598,48 @@ build_type_inheritance_graph (void)
   timevar_pop (TV_IPA_INHERITANCE);
 }
 
+/* Return true if N has reference from live virtual table
+   (and thus can be a destination of polymorphic call).
+   Be conservatively correct when callgraph is not built or
+   if the method may be referred externally.  */
+
+static bool
+referenced_from_vtable_p (struct cgraph_node *node)
+{
+  int i;
+  struct ipa_ref *ref;
+  bool found = false;
+
+  if (node->externally_visible
+      || node->used_from_other_partition)
+    return true;
+
+  /* Keep this test constant time.
+     It is unlikely this can happen except for the case where speculative
+     devirtualization introduced many speculative edges to this node.
+     In this case the target is very likely alive anyway.  */
+  if (node->ref_list.referring.length () > 100)
+    return true;
+
+  /* We need references built.  */
+  if (cgraph_state <= CGRAPH_STATE_CONSTRUCTION)
+    return true;
+
+  for (i = 0; ipa_ref_list_referring_iterate (&node->ref_list,
+                                              i, ref); i++)
+
+    if ((ref->use == IPA_REF_ALIAS
+         && referenced_from_vtable_p (cgraph (ref->referring)))
+        || (ref->use == IPA_REF_ADDR
+            && TREE_CODE (ref->referring->decl) == VAR_DECL
+            && DECL_VIRTUAL_P (ref->referring->decl)))
+      {
+        found = true;
+        break;
+      }
+  return found;
+}
+
 /* If TARGET has associated node, record it in the NODES array.
    CAN_REFER specify if program can refer to the target directly.
    if TARGET is unknown (NULL) or it can not be inserted (for example because
@@ -634,11 +676,29 @@ maybe_record_node (vec <cgraph_node *>
 
   target_node = cgraph_get_node (target);
 
-  if (target_node != NULL
-      && ((TREE_PUBLIC (target)
-           || DECL_EXTERNAL (target))
-          || target_node->definition)
-      && symtab_real_symbol_p (target_node))
+  /* Method can only be called by polymorphic call if any
+     of vtables refering to it are alive.
+
+     While this holds for non-anonymous functions, too, there are
+     cases where we want to keep them in the list; for example
+     inline functions with -fno-weak are static, but we still
+     may devirtualize them when instance comes from other unit.
+     The same holds for LTO.
+
+     Currently we ignore these functions in speculative devirtualization.
+     ??? Maybe it would make sense to be more aggressive for LTO even
+     eslewhere.  */
+  if (!flag_ltrans
+      && type_in_anonymous_namespace_p (DECL_CONTEXT (target))
+      && (!target_node
+          || !referenced_from_vtable_p (target_node)))
+    ;
+  /* See if TARGET is useful function we can deal with.  */
+  else if (target_node != NULL
+           && (TREE_PUBLIC (target)
+               || DECL_EXTERNAL (target)
+               || target_node->definition)
+           && symtab_real_symbol_p (target_node))
     {
       gcc_assert (!target_node->global.inlined_to);
       gcc_assert (symtab_real_symbol_p (target_node));
@@ -1725,6 +1785,12 @@ likely_target_p (struct cgraph_node *n)
     return false;
   if (n->frequency < NODE_FREQUENCY_NORMAL)
     return false;
+  /* If there are no virtual tables refering the target alive,
+     the only way the target can be called is an instance comming from other
+     compilation unit; speculative devirtualization is build around an
+     assumption that won't happen.  */
+  if (!referenced_from_vtable_p (n))
+    return false;
   return true;
 }
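The logic of the new predicate can be sketched outside GCC with a small stand-alone model. This is only an illustration under stated assumptions: `Node`, `Ref`, `RefUse`, `kMaxRefs` and `referenced_from_vtable` are hypothetical names, the `is_vtable` flag stands in for "VAR_DECL with DECL_VIRTUAL_P", and the checks on `used_from_other_partition` and `cgraph_state` are left out because the model has no notion of partitions or callgraph construction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kinds of reference; "Other" covers anything not of interest.
enum class RefUse { Alias, Addr, Other };

struct Node;

// One incoming reference: who refers to the node, and how.
struct Ref {
  RefUse use;
  const Node* referring;
};

// Toy symbol-table node.
struct Node {
  bool externally_visible = false;
  bool is_vtable = false;      // stands in for a virtual table variable
  std::vector<Ref> referring;  // references that point at this node
};

// Same cut-off value as in the patch, to keep the scan bounded.
constexpr std::size_t kMaxRefs = 100;

// Returns true if the node may be the target of a polymorphic call:
// it is visible externally, has too many references to scan cheaply,
// has its address taken from a vtable, or has an alias for which any
// of these holds.
inline bool referenced_from_vtable(const Node& node) {
  if (node.externally_visible)
    return true;
  if (node.referring.size() > kMaxRefs)
    return true;
  for (const Ref& ref : node.referring) {
    if (ref.use == RefUse::Alias && referenced_from_vtable(*ref.referring))
      return true;
    if (ref.use == RefUse::Addr && ref.referring->is_vtable)
      return true;
  }
  return false;
}
```

A local method whose only incoming reference is an ordinary address-taking from a non-vtable variable yields false in this model, which corresponds to the case the patch excludes from devirtualization targets.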
Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
Robert Suchanek <robert.sucha...@imgtec.com> writes:
>> Hmm, marking them fixed was supposed to be a temporary reload-only thing,
>> until the move to LRA. It should never be worse to spill to these GPRs
>> over spilling to the stack, if the value isn't live across a call.
>
> I would say this also affects IRA/LRA integration. I found that it is more
> profitable to hide registers (MIPS16 only) in IRA to encourage spilling to
> memory. Otherwise $8-$15 would be treated like any other registers and LRA
> would insert reloads to move values in and out of these registers. My
> assumption is that if we could hide some of the registers in IRA but enable
> them in LRA then all registers in SPILL_REGS would be available keeping
> reasonable code size. Another way would be to increase the cost of moving
> values between M16_REGS and GR_REGS but it was already mentioned, and is
> true that there should be no difference of costs and it feels like a hack
> to make things work.

OK. This definitely sounds like it ought to be made to work, with some mixture of target and generic changes. But if it doesn't work out of the box then let's leave that for future work.

>> Did you see the failures even after your mips_regno_mode_ok_for_base_p
>> change? LRA should know how to reload a W address.
>
> Yes but I realize there is more. It fails because $sp is now included in
> BASE_REG_CLASS and W is based on it. However, I suppose that it would be
> too eager to say it is wrong and likely there is something missing in LRA
> if we want to keep all alternatives. Currently there is no check if a
> reloaded operand has a valid address, use of $sp in lbu/lhu cases. Even if
> we added extra checks we are less likely to benefit as we need to reload
> the base into register.

Not sure what you mean, sorry. W exists specifically to exclude $sp-based and $pc-based addresses. LRA AFAIK should already be able to reload addresses that are valid in the TARGET_LEGITIMATE_ADDRESS_P sense but which do not match the constraints for a particular insn.

Can you remember one of the tests that fails?

>> Even that might be too loose, since invalid scales will need to be
>> reloaded as a multiplication or shift, and there's no guarantee that the
>> target can do that without clobbering the flags. So maybe we should do
>> something like the patch below. Alternatively we could stick to the
>> decompose_mem_address-based check above and teach LRA to keep invalid
>> addresses for 'X'. The problem then is that we might ICE while printing
>> the operand.
>
> Tightening validity for 'X' appears to be reasonable. Will you commit this
> patch?

OK, just submitted separately.

Thanks,
Richard