update address taken: don't drop clobbers
Hello, we currently drop clobbers on variables whose address is not taken anymore. However, rewrite_stmt has code to replace them with an SSA_NAME with a default definition (an uninitialized variable), and I believe rewrite_update_stmt should do the same. This allows us to warn sometimes (see testcase), but during the debugging I also noticed several places where it allowed CCP to simplify further PHIs, so this is also an optimization. In an earlier version of the patch, I was using get_or_create_ssa_default_def (cfun, sym); (I was reusing the same variable). This passed bootstrap+testsuite on all languages except for ada. Indeed, the compiler wanted to coalesce several SSA_NAMEs, including those new ones, in out-of-ssa, but couldn't. There are abnormal PHIs involved. Maybe it shouldn't have insisted on coalescing an undefined ssa_name, maybe something should have prevented us from reaching such a situation, but creating a new variable was the simplest workaround. Some things could be done to improve the error message in uninit: - getting the location of the variable, - differenciating uninitialized from clobbered, but that can come later. Bootstrap+testsuite (all,obj-c++,ada,go) on x86_64-unknown-linux-gnu. 2014-06-30 Marc Glisse marc.gli...@inria.fr PR tree-optimization/60770 gcc/ * tree-ssa.c (execute_update_addresses_taken): Don't drop clobbers. * tree-into-ssa.c (maybe_register_def): Replace clobbers with a default definition. gcc/testsuite/ * gcc.dg/tree-ssa/pr60770-1.c: New file. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr60770-1.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr60770-1.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/pr60770-1.c (working copy) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O -Wall } */ + +int f(int n){ + int*p; + { +int yyy=n; +p=yyy; + } + return *p; /* { dg-warning yyy } */ +} Index: gcc/tree-into-ssa.c === --- gcc/tree-into-ssa.c (revision 212109) +++ gcc/tree-into-ssa.c (working copy) @@ -1831,26 +1831,38 @@ maybe_register_def (def_operand_p def_p, { tree def = DEF_FROM_PTR (def_p); tree sym = DECL_P (def) ? def : SSA_NAME_VAR (def); /* If DEF is a naked symbol that needs renaming, create a new name for it. */ if (marked_for_renaming (sym)) { if (DECL_P (def)) { - tree tracked_var; - - def = make_ssa_name (def, stmt); + if (gimple_clobber_p (stmt) is_gimple_reg (sym)) + { + /* Replace clobber stmts with a default def. Create a new +variable so we don't later think we must coalesce, which would +fail with some ada abnormal PHIs. Still, we try to keep a +similar name so error messages make sense. */ + unlink_stmt_vdef (stmt); + gsi_replace (gsi, gimple_build_nop (), true); + tree id = DECL_NAME (sym); + const char* name = id ? IDENTIFIER_POINTER (id) : 0; + tree newvar = create_tmp_var (TREE_TYPE (sym), name); + def = get_or_create_ssa_default_def (cfun, newvar); + } + else + def = make_ssa_name (def, stmt); SET_DEF (def_p, def); - tracked_var = target_for_debug_bind (sym); + tree tracked_var = target_for_debug_bind (sym); if (tracked_var) { gimple note = gimple_build_debug_bind (tracked_var, def, stmt); /* If stmt ends the bb, insert the debug stmt on the single non-EH edge from the stmt. */ if (gsi_one_before_end_p (gsi) stmt_ends_bb_p (stmt)) { basic_block bb = gsi_bb (gsi); edge_iterator ei; edge e, ef = NULL; Index: gcc/tree-ssa.c === --- gcc/tree-ssa.c (revision 212109) +++ gcc/tree-ssa.c (working copy) @@ -1607,32 +1607,20 @@ execute_update_addresses_taken (void) rhs = gimple_assign_rhs1 (stmt); if (gimple_assign_lhs (stmt) != lhs !useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs))) rhs = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), rhs); if (gimple_assign_lhs (stmt) != lhs) gimple_assign_set_lhs (stmt, lhs); - /* For var ={v} {CLOBBER}; where var lost - TREE_ADDRESSABLE just remove the stmt. */ - if (DECL_P (lhs) -TREE_CLOBBER_P (rhs) -bitmap_bit_p (suitable_for_renaming, DECL_UID (lhs))) - { - unlink_stmt_vdef (stmt
Re: [i386] Replace builtins with vector extensions
On Sat, 28 Jun 2014, Ulrich Drepper wrote: On Sat, Jun 28, 2014 at 6:42 AM, Marc Glisse marc.gli...@inria.fr wrote: Ping, nobody has an opinion on this? Or some explanation why I am mistaken to believe that #pragma target makes it safer now? It would enable a number of optimizations, like constant propagation, FMA contraction, etc. It would also allow us to remove several builtins. I see no problem with using the array-type access to the registers. As for replacing the builtins with arithmetic operators: I appreciate the possibility for optimization. But is there any chance the calls could not end up being implemented with a vector instruction? I think that would be bad. The intrinsics should be a way to guarantee that the programmer can create vector instructions. Otherwise we might just not support them. There is always a risk, but then even with builtins I think there was a small risk that an RTL optimization would mess things up. It is indeed higher if we expose the operation to the optimizers earlier, but it would be a bug if an optimization replaced a vector operation by something worse. Also, I am only proposing to handle the most trivial operations this way, not more complicated ones (like v[0]+=s) where we would be likely to fail generating the right instruction. And the pragma should ensure that the function will always be compiled in a mode where the vector instruction is available. ARM did the same and I don't think I have seen a bug reporting a regression about it (I haven't really looked though). Thanks, -- Marc Glisse
Re: [PATCH] Fix vector rotate regression (PR tree-optimization/57233)
On Thu, 26 Jun 2014, Jakub Jelinek wrote: So like this? I've also changed get_compute_type so that it will DTRT even for -mavx and V4DImode vectors, so e.g. f5/f6/f8 routines in avx-pr57233.c improve. Also, even for shifts by scalar, if e.g. target doesn't have shifts by scalar at all, and only has narrower vector by vector shifts, it should handle this case too. All that? Cool! @@ -1455,11 +1507,83 @@ expand_vector_operations_1 (gimple_stmt_ { op = optab_for_tree_code (code, type, optab_scalar); + compute_type = get_compute_type (code, op, type); + if (compute_type == type) + return; /* The rtl expander will expand vector/scalar as vector/vector -if necessary. Don't bother converting the stmt here. */ - if (optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing - optab_handler (opv, TYPE_MODE (type)) != CODE_FOR_nothing) +if necessary. Pick one with wider vector type. */ + tree compute_vtype = get_compute_type (code, opv, type); + if (count_type_subparts (compute_vtype) + count_type_subparts (compute_type)) + { + compute_type = compute_vtype; + op = opv; + } + } + + if (code == LROTATE_EXPR || code == RROTATE_EXPR) + { + if (compute_type == NULL_TREE) + compute_type = get_compute_type (code, op, type); + if (compute_type == type) return; + /* Before splitting vector rotates into scalar rotates, +see if we can't use vector shifts and BIT_IOR_EXPR +instead. For vector by vector rotates we'd also +need to check BIT_AND_EXPR and NEGATE_EXPR, punt there +for now, fold doesn't seem to create such rotates anyway. */ + if (compute_type == TREE_TYPE (type) + !VECTOR_INTEGER_TYPE_P (TREE_TYPE (rhs2))) + { + optab oplv, opl, oprv, opr, opo; + oplv = optab_for_tree_code (LSHIFT_EXPR, type, optab_vector); + /* Right shift always has to be logical, no matter what +signedness type has. */ + oprv = vlshr_optab; + opo = optab_for_tree_code (BIT_IOR_EXPR, type, optab_default); + opl = optab_for_tree_code (LSHIFT_EXPR, type, optab_scalar); + oprv = lshr_optab; + opr = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar); Looks like there are some typos in there, you are assigning to oprv twice. -- Marc Glisse
Re: [PATCH] Change default for --param allow-...-data-races to off
On Wed, 25 Jun 2014, Richard Biener wrote: On Wed, Jun 25, 2014 at 10:54 AM, Jakub Jelinek ja...@redhat.com wrote: On Wed, Jun 25, 2014 at 10:14:17AM +0200, Richard Biener wrote: Perhaps not unsurprisingly, the patch is very similar. Bootstrapped and tested on x86_64-linux. OK for trunk? Ok - please give the C++/atomics folks a chance to comment. This change of default behavior should also be documented in gcc-4.10/changes.html. Do we want to allow the store data races by default with -Ofast even in strict conformance modes (-std=c++11, -std=c++14, -std=c11)? I think so. -Ofast means -Ofast (same issue with fp contraction and other stuff -ffast-math enables that is not valid in strict conformance mode). One thing I am missing is a -single-thread option that would allow races, remove atomics and thread locals, etc, without breaking conformance as long as no second thread is created. I hope that not too many libraries use threads internally in a way that would break this. -- Marc Glisse
strlen: update datastructure when replacing malloc with calloc
Hello, in my calloc patch, I forgot to update the datastructure when replacing malloc with calloc. Bootstrap+testsuite on x86_64-linux-gnu. 2014-06-25 Marc Glisse marc.gli...@inria.fr PR tree-optimization/57742 gcc/ * tree-ssa-strlen.c (handle_builtin_memset): Update strinfo after replacing the statement. gcc/testsuite/ * gcc.dg/tree-ssa/calloc-3.c: New file. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/calloc-3.c === --- gcc/testsuite/gcc.dg/tree-ssa/calloc-3.c(revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/calloc-3.c(working copy) @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +void*f(){ +char*p=__builtin_malloc(42); +__builtin_memset(p,0,42); +__builtin_memset(p,0,42); +return p; +}; + +/* { dg-final { scan-tree-dump-not malloc optimized } } */ +/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */ +/* { dg-final { scan-tree-dump-not memset optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/tree-ssa-strlen.c === --- gcc/tree-ssa-strlen.c (revision 211968) +++ gcc/tree-ssa-strlen.c (working copy) @@ -1646,20 +1646,22 @@ handle_builtin_memset (gimple_stmt_itera enum built_in_function code1 = DECL_FUNCTION_CODE (callee1); tree size = gimple_call_arg (stmt2, 2); if (code1 == BUILT_IN_CALLOC) /* Not touching stmt1 */ ; else if (code1 == BUILT_IN_MALLOC operand_equal_p (gimple_call_arg (stmt1, 0), size, 0)) { gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt1); update_gimple_call (gsi1, builtin_decl_implicit (BUILT_IN_CALLOC), 2, size, build_one_cst (size_type_node)); + si1-length = build_int_cst (size_type_node, 0); + si1-stmt = gsi_stmt (gsi1); } else return true; tree lhs = gimple_call_lhs (stmt2); unlink_stmt_vdef (stmt2); if (lhs) { gimple assign = gimple_build_assign (lhs, ptr); gsi_replace (gsi, assign, false); }
Re: [GSoC][match-and-simplify] mark some more operators as commutative
On Mon, 23 Jun 2014, Richard Biener wrote: On Mon, Jun 23, 2014 at 3:32 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: * match.pd: Mark operators in some bitwise and plus-minus patterns to be commutative. /* A - (A +- B) - -+ B */ (match_and_simplify - (minus @0 (plus @0 @1)) + (minus @0 (plus:c @0 @1)) (negate @0)) seems pointless Why? a-(a+b) and a-(b+a) are both wanted and don't appear elsewhere in the file, no? Should simplify to (negate @1) though. -- Marc Glisse
Re: calloc = malloc + memset
On Mon, 23 Jun 2014, Jakub Jelinek wrote: Ok for trunk, sorry for the delay. Thanks. Richard has moved the passes a bit since then, but I still have exactly one spot where the testsuite is ok :-) I need strlen to be after dom (for calloc.C) and before vrp (for several strlenopt-*.c). I'll commit it tomorrow if there aren't any comments on the pass placement. 2014-06-24 Marc Glisse marc.gli...@inria.fr PR tree-optimization/57742 gcc/ * tree-ssa-strlen.c (get_string_length): Ignore malloc. (handle_builtin_malloc, handle_builtin_memset): New functions. (strlen_optimize_stmt): Call them. * passes.def: Move strlen after loop+dom but before vrp. gcc/testsuite/ * g++.dg/tree-ssa/calloc.C: New testcase. * gcc.dg/tree-ssa/calloc-1.c: Likewise. * gcc.dg/tree-ssa/calloc-2.c: Likewise. * gcc.dg/strlenopt-9.c: Adapt. -- Marc GlisseIndex: gcc/passes.def === --- gcc/passes.def (revision 211886) +++ gcc/passes.def (working copy) @@ -179,21 +179,20 @@ along with GCC; see the file COPYING3. DOM and erroneous path isolation should be due to degenerate PHI nodes. So rather than run the full propagators, run a specialized pass which only examines PHIs to discover const/copy propagation opportunities. */ NEXT_PASS (pass_phi_only_cprop); NEXT_PASS (pass_dse); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_dce); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); - NEXT_PASS (pass_strlen); NEXT_PASS (pass_ccp); /* After CCP we rewrite no longer addressed locals into SSA form if possible. */ NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_cse_sincos); NEXT_PASS (pass_optimize_bswap); NEXT_PASS (pass_split_crit_edges); NEXT_PASS (pass_pre); NEXT_PASS (pass_sink_code); NEXT_PASS (pass_asan); @@ -232,20 +231,21 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_loop_prefetch); NEXT_PASS (pass_iv_optimize); NEXT_PASS (pass_lim); NEXT_PASS (pass_tree_loop_done); POP_INSERT_PASSES () NEXT_PASS (pass_lower_vector_ssa); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); NEXT_PASS (pass_dominator); + NEXT_PASS (pass_strlen); NEXT_PASS (pass_vrp); /* The only const/copy propagation opportunities left after DOM and VRP should be due to degenerate PHI nodes. So rather than run the full propagators, run a specialized pass which only examines PHIs to discover const/copy propagation opportunities. */ NEXT_PASS (pass_phi_only_cprop); NEXT_PASS (pass_cd_dce); NEXT_PASS (pass_tracer); NEXT_PASS (pass_dse); Index: gcc/testsuite/g++.dg/tree-ssa/calloc.C === --- gcc/testsuite/g++.dg/tree-ssa/calloc.C (revision 0) +++ gcc/testsuite/g++.dg/tree-ssa/calloc.C (working copy) @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options -O3 -fdump-tree-optimized } */ + +typedef __SIZE_TYPE__ size_t; +inline void* operator new(size_t, void* p) throw() { return p; } + +typedef void (*handler_t)(void); +extern handler_t get_handle(); + +inline void* operator new(size_t sz) +{ + void *p; + + if (sz == 0) +sz = 1; + + while ((p = __builtin_malloc (sz)) == 0) +{ + handler_t handler = get_handle (); + if (! handler) +throw 42; + handler (); +} + return p; +} + +struct vect { + int *start, *end; + vect(size_t n) { +start = end = 0; +if (n (size_t)-1 / sizeof(int)) + throw 33; +if (n != 0) + start = static_castint* (operator new (n * sizeof(int))); +end = start + n; +int *p = start; +for (size_t l = n; l 0; --l, ++p) + *p = 0; + } +}; + +void f (void *p, int n) +{ + new (p) vect(n); +} + +/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */ +/* { dg-final { scan-tree-dump-not malloc optimized } } */ +/* { dg-final { scan-tree-dump-not memset optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/testsuite/gcc.dg/strlenopt-9.c === --- gcc/testsuite/gcc.dg/strlenopt-9.c (revision 211886) +++ gcc/testsuite/gcc.dg/strlenopt-9.c (working copy) @@ -11,21 +11,21 @@ fn1 (int r) optimized away. */ return strchr (p, '\0'); } __attribute__((noinline, noclone)) size_t fn2 (int r) { char *p, q[10]; strcpy (q, abc); p = r ? a : q; - /* String length for p varies, therefore strlen below isn't + /* String length is constant for both alternatives, and strlen is optimized away. */ return strlen (p); } __attribute__((noinline, noclone)) size_t fn3 (char *p, int n) { int
Re: calloc = malloc + memset
On Mon, 23 Jun 2014, Richard Biener wrote: On June 23, 2014 5:51:30 PM CEST, Marc Glisse marc.gli...@inria.fr wrote: On Mon, 23 Jun 2014, Jakub Jelinek wrote: Ok for trunk, sorry for the delay. Thanks. Richard has moved the passes a bit since then, but I still have exactly one spot where the testsuite is ok :-) I need strlen to be after dom (for calloc.C) and before vrp (for several strlenopt-*.c). I'll commit it tomorrow if there aren't any comments on the pass placement. But vrp does not run at -O1 - does strlenopt? { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 }, So that's just a missed optimization at -Os, I guess. -- Marc Glisse
Re: calloc = malloc + memset
On Mon, 23 Jun 2014, Andi Kleen wrote: FWIW i believe the transformation will break a large variety of micro benchmarks. calloc internally knows that memory fresh from the OS is zeroed. But the memory may not be faulted in yet. memset always faults in the memory. So if you have some test like buf = malloc(...) memset(buf, ...) start = get_time(); ... do something with buf end = get_time() Now the times will be completely off because the measured times includes the page faults. Good point. I guess working around compiler optimizations is part of the game for micro benchmarks, and their authors would be disappointed if the compiler didn't mess it up regularly in new and entertaining ways ;-) -- Marc Glisse
Re: calloc = malloc + memset
On Mon, 23 Jun 2014, Andi Kleen wrote: I would prefer to not do it. For the sake of micro benchmarks? I'm not sure it has a lot of benefit. It has a non-zero benefit. If you want to keep it please make sure there is an easy way to turn it off. Any of these flags works: -fdisable-tree-strlen -fno-builtin-malloc -fno-builtin-memset (assuming you wrote 'memset' explicitly in your code) -fno-builtin -ffreestanding -O1 -Os In the code, you can hide that the pointer passed to memset is the one returned by malloc by storing it in a volatile variable, or any other trick to hide from the compiler that we are doing memset(malloc(n),0,n). -- Marc Glisse
[doc] Remove duplicate -Wmaybe-uninitialized
Hello, a trivial patch to remove a duplicated option, you can see the second one 4 lines below in the patch. (the mixed use of single or double spaces in this list is strange) This was included in the bootstrap of another patch. 2014-06-23 Marc Glisse marc.gli...@inria.fr * doc/invoke.texi (Warning Options): Remove duplicated -Wmaybe-uninitialized. -- Marc GlisseIndex: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 211874) +++ gcc/doc/invoke.texi (working copy) @@ -246,21 +246,21 @@ Objective-C and Objective-C++ Dialects}. -Wno-deprecated -Wno-deprecated-declarations -Wdisabled-optimization @gol -Wno-discarded-qualifiers @gol -Wno-div-by-zero -Wdouble-promotion -Wempty-body -Wenum-compare @gol -Wno-endif-labels -Werror -Werror=* @gol -Wfatal-errors -Wfloat-equal -Wformat -Wformat=2 @gol -Wno-format-contains-nul -Wno-format-extra-args -Wformat-nonliteral @gol -Wformat-security -Wformat-signedness -Wformat-y2k @gol -Wframe-larger-than=@var{len} -Wno-free-nonheap-object -Wjump-misses-init @gol -Wignored-qualifiers @gol -Wimplicit -Wimplicit-function-declaration -Wimplicit-int @gol --Winit-self -Winline -Wmaybe-uninitialized @gol +-Winit-self -Winline @gol -Wno-int-to-pointer-cast -Wno-invalid-offsetof @gol -Winvalid-pch -Wlarger-than=@var{len} -Wunsafe-loop-optimizations @gol -Wlogical-op -Wlogical-not-parentheses -Wlong-long @gol -Wmain -Wmaybe-uninitialized -Wmissing-braces -Wmissing-field-initializers @gol -Wmissing-include-dirs @gol -Wno-multichar -Wnonnull -Wno-overflow -Wopenmp-simd @gol -Woverlength-strings -Wpacked -Wpacked-bitfield-compat -Wpadded @gol -Wparentheses -Wpedantic-ms-format -Wno-pedantic-ms-format @gol -Wpointer-arith -Wno-pointer-to-int-cast @gol -Wredundant-decls -Wno-return-local-addr @gol
Warn when returning the address of a temporary (middle-end) v2
Hello, I followed the advice in this discussion: https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00269.html and here is a new patch. I made an effort to isolate a path in at least one subcase so it doesn't look too strange that the warning is in this file. Computing the dominance info just to tweak the warning message may be a bit excessive. I kept the same option as the front-ends, I don't know if we want a different one, or maybe a Wmaybe-... version. There will be cases where we get a duplicate warning from -Wtarget-lifetime in fortran, but none in the testsuite, and I would rather have 2 warnings than miss such broken code. The uninit-G testcase is about initialization, not returning, so I am changing that, even if it is unnecessary with the current version of the patch (only activated at -O2). Bootstrap+testsuite (--enable-languages=all,obj-c++,ada,go) on x86_64-unknown-linux-gnu. (by the way, contrib/compare_tests is confused when I use all languages, it prints comm: file 1 is not in sorted order and tons of spurious differences) 2014-06-23 Marc Glisse marc.gli...@inria.fr PR c++/60517 gcc/c/ * c-typeck.c (c_finish_return): Return 0 instead of the address of a local variable. gcc/cp/ * typeck.c (maybe_warn_about_returning_address_of_local): Return whether it is returning the address of a local variable. (check_return_expr): Return 0 instead of the address of a local variable. gcc/c-family/ * c.opt (-Wreturn-local-addr): Move to common.opt. gcc/ * common.opt (-Wreturn-local-addr): Moved from c.opt. * gimple-ssa-isolate-paths.c: Include diagnostic-core.h. (isolate_path): New argument to avoid inserting a trap. (find_implicit_erroneous_behaviour): Handle returning the address of a local variable. (find_explicit_erroneous_behaviour): Likewise. (gimple_ssa_isolate_erroneous_paths): Calculate dominance info. gcc/testsuite/ * c-c++-common/addrtmp.c: New file. * c-c++-common/uninit-G.c: Adapt. -- Marc GlisseIndex: gcc/c/c-typeck.c === --- gcc/c/c-typeck.c(revision 211874) +++ gcc/c/c-typeck.c(working copy) @@ -9315,22 +9315,26 @@ c_finish_return (location_t loc, tree re if (DECL_P (inner) !DECL_EXTERNAL (inner) !TREE_STATIC (inner) DECL_CONTEXT (inner) == current_function_decl) { if (TREE_CODE (inner) == LABEL_DECL) warning_at (loc, OPT_Wreturn_local_addr, function returns address of label); else - warning_at (loc, OPT_Wreturn_local_addr, - function returns address of local variable); + { + warning_at (loc, OPT_Wreturn_local_addr, + function returns address of local variable); + tree zero = build_zero_cst (TREE_TYPE (res)); + t = build_compound_expr (loc, t, zero); + } } break; default: break; } break; } Index: gcc/c-family/c.opt === --- gcc/c-family/c.opt (revision 211874) +++ gcc/c-family/c.opt (working copy) @@ -682,24 +682,20 @@ ObjC ObjC++ Var(warn_protocol) Init(1) W Warn if inherited methods are unimplemented Wredundant-decls C ObjC C++ ObjC++ Var(warn_redundant_decls) Warning Warn about multiple declarations of the same object Wreorder C++ ObjC++ Var(warn_reorder) Warning LangEnabledBy(C++ ObjC++,Wall) Warn when the compiler reorders code -Wreturn-local-addr -C ObjC C++ ObjC++ Var(warn_return_local_addr) Init(1) Warning -Warn about returning a pointer/reference to a local or temporary variable. - Wreturn-type C ObjC C++ ObjC++ Var(warn_return_type) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Warn whenever a function's return type defaults to \int\ (C), or about inconsistent return types (C++) Wselector ObjC ObjC++ Var(warn_selector) Warning Warn if a selector has multiple methods Wshadow-ivar ObjC ObjC++ Var(warn_shadow_ivar) EnabledBy(Wshadow) Init(1) Warning Index: gcc/common.opt === --- gcc/common.opt (revision 211874) +++ gcc/common.opt (working copy) @@ -596,20 +596,24 @@ Common Var(warn_packed) Warning Warn when the packed attribute has no effect on struct layout Wpadded Common Var(warn_padded) Warning Warn when padding is required to align structure members Wpedantic Common Var(pedantic) Warning Issue warnings needed for strict compliance to the standard +Wreturn-local-addr +Common Var(warn_return_local_addr) Init(1) Warning +Warn about returning
Re: [i386] logical shift right in shrd
On Sat, 21 Jun 2014, Uros Bizjak wrote: On Fri, Jun 20, 2014 at 10:42 PM, Marc Glisse marc.gli...@inria.fr wrote: as reported in PR 61503, there seems to be a typo in the shrd pattern. I think it is quite unlikely to cause any problem, because the pattern is 1 instruction too long for combine to recognize it (by the way, if someone has suggestions for PR 55583...). But it is still better to fix it. Bootstrap+testsuite on x86_64-linux-gnu. 2014-06-21 Marc Glisse marc.gli...@inria.fr PR target/61503 * config/i386/i386.md (x86_64_shrd, x86_shrd): Replace ashiftrt with lshiftrt. OK for mainline and 4.9. Thanks. Er, I am sorry, I don't know what happened, but when testing the backport to 4.9 I got an obvious failure in the testsuite, which I am sure should also happen on trunk, but somehow I didn't see it (I am almost sure I tested the right branch though). Anyway, here is an updated patch, that did pass bootstrap+testsuite both on trunk and 4.9. I haven't committed anything yet, is the new patch ok? 2014-06-21 Marc Glisse marc.gli...@inria.fr PR target/61503 * config/i386/i386.md (x86_64_shrd, x86_shrd, ix86_rotrdwi3_doubleword): Replace ashiftrt with lshiftrt. -- Marc GlisseIndex: gcc/config/i386/i386.md === --- gcc/config/i386/i386.md (revision 211865) +++ gcc/config/i386/i386.md (working copy) @@ -9601,37 +9601,37 @@ (match_operand:DWI 1 register_operand) (match_operand:QI 2 nonmemory_operand))) (clobber (reg:CC FLAGS_REG))]) (match_dup 3)] TARGET_CMOVE [(const_int 0)] ix86_split_shift_insn (operands, operands[3], DWImode); DONE;) (define_insn x86_64_shrd [(set (match_operand:DI 0 nonimmediate_operand +r*m) -(ior:DI (ashiftrt:DI (match_dup 0) +(ior:DI (lshiftrt:DI (match_dup 0) (match_operand:QI 2 nonmemory_operand Jc)) (ashift:DI (match_operand:DI 1 register_operand r) (minus:QI (const_int 64) (match_dup 2) (clobber (reg:CC FLAGS_REG))] TARGET_64BIT shrd{q}\t{%s2%1, %0|%0, %1, %2} [(set_attr type ishift) (set_attr prefix_0f 1) (set_attr mode DI) (set_attr athlon_decode vector) (set_attr amdfam10_decode vector) (set_attr bdver1_decode vector)]) (define_insn x86_shrd [(set (match_operand:SI 0 nonimmediate_operand +r*m) -(ior:SI (ashiftrt:SI (match_dup 0) +(ior:SI (lshiftrt:SI (match_dup 0) (match_operand:QI 2 nonmemory_operand Ic)) (ashift:SI (match_operand:SI 1 register_operand r) (minus:QI (const_int 32) (match_dup 2) (clobber (reg:CC FLAGS_REG))] shrd{l}\t{%s2%1, %0|%0, %1, %2} [(set_attr type ishift) (set_attr prefix_0f 1) (set_attr mode SI) (set_attr pent_pair np) @@ -10069,27 +10069,27 @@ (rotatert:DWI (match_operand:DWI 1 register_operand 0) (match_operand:QI 2 shift_immediate_operand S))) (clobber (reg:CC FLAGS_REG)) (clobber (match_scratch:DWIH 3 =r))] # reload_completed [(set (match_dup 3) (match_dup 4)) (parallel [(set (match_dup 4) -(ior:DWIH (ashiftrt:DWIH (match_dup 4) (match_dup 2)) +(ior:DWIH (lshiftrt:DWIH (match_dup 4) (match_dup 2)) (ashift:DWIH (match_dup 5) (minus:QI (match_dup 6) (match_dup 2) (clobber (reg:CC FLAGS_REG))]) (parallel [(set (match_dup 5) -(ior:DWIH (ashiftrt:DWIH (match_dup 5) (match_dup 2)) +(ior:DWIH (lshiftrt:DWIH (match_dup 5) (match_dup 2)) (ashift:DWIH (match_dup 3) (minus:QI (match_dup 6) (match_dup 2) (clobber (reg:CC FLAGS_REG))])] { operands[6] = GEN_INT (GET_MODE_BITSIZE (MODEmode)); split_double_mode (DWImode, operands[0], 1, operands[4], operands[5]); }) (define_insn *bmi2_rorxmode3_1
Re: [patch] change specific int128 - generic intN
(Adding libstdc++@ in Cc: so they see the patch at https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01655.html ) On Sat, 21 Jun 2014, DJ Delorie wrote: New version of https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00723.html This is a big patch, but it includes all the features/changes/support requested since the initial patch. Tested with identical results before/after on x86-64 EXCEPT for g++.dg/ext/int128-1.C, which assumes __int128 is not supported with -std= but as per previous discussion, it now is (because __intN might be used for size_t, which is required). Tested on msp430 with significant improvements but a few regressions (expected). This patch replaces the current int128 support with a generic intN support, which happens to provide int128 as well as up to three additional intN types per target. I will post a separate patch for the msp430 backend demonstrating the use of this new feature for __int20, as well as a third to fix some problems with PSImode in general. The general idea is that genmodes has a new macro INT_N() for tm-modes.def, which gives the mode and precision to use for target-specific intN types. These types are always created, but the parser will error with unsupported if the target does not support that mode for that compile (i.e. because of command line switches, etc). There's an INT_N(TI,128) in the common sources. If the target defines any types larger than long long, those types (like int128 before) will be used for type promotion, but types smaller than long long will not, to avoid confusing the C type promotion rules. Otherwise, it's up to the source being compiled to specific __intN types as desired. Nice. A couple quick comments on the parts I understand while maintainers are resting for the week-end. Index: libstdc++-v3/src/c++11/limits.cc === [...] +#if !defined(__STRICT_ANSI__) Since the test on __STRICT_ANSI__ is removed for all other uses, it would seem consistent to me to remove this one as well. Besides, you are already testing __GLIBCXX_USE_INT_N_0, which as far as I understand is protected by !flag_iso (with the exception of size_t). -#if !defined(__STRICT_ANSI__) defined(_GLIBCXX_USE_INT128) + // Conditionalizing on __STRICT_ANSI__ here will break any port that + // uses one of these types for size_t. +#if defined(__GLIBCXX_USE_INT_N_0) template -struct __is_integral_helper__int128 +struct __is_integral_helper__GLIBCXX_TYPE_INT_N_0 Since the check for __STRICT_ANSI__ is removed, do we need to add __extension__ in front of __GLIBCXX_TYPE_INT_N_0 to avoid warning with -Wsystem-headers? --- gcc/cp/rtti.c (revision 211858) +++ gcc/cp/rtti.c (working copy) @@ -1506,31 +1506,44 @@ emit_support_tinfo_1 (tree bltn) void emit_support_tinfos (void) { /* Dummy static variable so we can put nullptr in the array; it will be set before we actually start to walk the array. */ - static tree *const fundamentals[] = + static tree * fundamentals[] = { void_type_node, boolean_type_node, wchar_type_node, char16_type_node, char32_type_node, char_type_node, signed_char_type_node, unsigned_char_type_node, short_integer_type_node, short_unsigned_type_node, integer_type_node, unsigned_type_node, long_integer_type_node, long_unsigned_type_node, long_long_integer_type_node, long_long_unsigned_type_node, -int128_integer_type_node, int128_unsigned_type_node, float_type_node, double_type_node, long_double_type_node, dfloat32_type_node, dfloat64_type_node, dfloat128_type_node, +#define FUND_INT_N_IDX 22 +// These eight are for intN_t nodes +nullptr_type_node, nullptr_type_node, nullptr_type_node, nullptr_type_node, +nullptr_type_node, nullptr_type_node, nullptr_type_node, nullptr_type_node, nullptr_type_node, 0 }; - int ix; + int ix, i; tree bltn_type, dtor; + ix = FUND_INT_N_IDX; + for (i = 0; i NUM_INT_N_ENTS; i ++) +if (int_n_enabled_p[i]) + { + fundamentals [ix++] = int_n_trees[i].signed_type; + fundamentals [ix++] = int_n_trees[i].unsigned_type; + } + fundamentals [ix++] = nullptr_type_node; + fundamentals [ix++] = 0; + push_abi_namespace (); bltn_type = xref_tag (class_type, get_identifier (__fundamental_type_info), /*tag_scope=*/ts_current, false); pop_abi_namespace (); if (!COMPLETE_TYPE_P (bltn_type)) That seems complicated. You just need to call emit_support_tinfo_1 on each of the types (see how fundamentals is used at the end of the function), no need to put everything in the array. -- Marc Glisse
[RTL] (vec_select (vec_concat a b) c) may be just a or b
Hello, this is another small simplification of RTL for vectors. Note that it doesn't really solve the problem, because these simplifications are only performed for single-use objects. If I start from vectors [a,b] and [c,d] and concatenate them into [a,b,c,d], then extract both halves, as in the original testcase in the PR, we won't notice that those are the original vectors. Still, better than nothing... (we output a vzeroupper for the testcase, that seems unnecessary) Bootstrap+testsuite on x86_64-linux-gnu. 2014-06-22 Marc Glisse marc.gli...@inria.fr PR target/44551 gcc/ * simplify-rtx.c (simplify_binary_operation_1) VEC_SELECT: Optimize inverse of a VEC_CONCAT. gcc/testsuite/ * gcc.target/i386/pr44551-1.c: New file. -- Marc GlisseIndex: gcc/simplify-rtx.c === --- gcc/simplify-rtx.c (revision 211867) +++ gcc/simplify-rtx.c (working copy) @@ -3359,20 +3359,64 @@ simplify_binary_operation_1 (enum rtx_co unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0)); unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1)); rtx subop0, subop1; gcc_assert (i0 2 i1 2); subop0 = XEXP (trueop0, i0); subop1 = XEXP (trueop0, i1); return simplify_gen_binary (VEC_CONCAT, mode, subop0, subop1); } + + /* If we select one half of a vec_concat, return that. */ + if (GET_CODE (trueop0) == VEC_CONCAT + CONST_INT_P (XVECEXP (trueop1, 0, 0))) + { + rtx subop0 = XEXP (trueop0, 0); + rtx subop1 = XEXP (trueop0, 1); + enum machine_mode mode0 = GET_MODE (subop0); + enum machine_mode mode1 = GET_MODE (subop1); + int li = GET_MODE_SIZE (GET_MODE_INNER (mode0)); + int l0 = GET_MODE_SIZE (mode0) / li; + int l1 = GET_MODE_SIZE (mode1) / li; + int i0 = INTVAL (XVECEXP (trueop1, 0, 0)); + if (i0 == 0 !side_effects_p (op1) mode == mode0) + { + bool success = true; + for (int i = 1; i l0; ++i) + { + rtx j = XVECEXP (trueop1, 0, i); + if (!CONST_INT_P (j) || INTVAL (j) != i) + { + success = false; + break; + } + } + if (success) + return subop0; + } + if (i0 == l0 !side_effects_p (op0) mode == mode1) + { + bool success = true; + for (int i = 1; i l1; ++i) + { + rtx j = XVECEXP (trueop1, 0, i); + if (!CONST_INT_P (j) || INTVAL (j) != i0 + i) + { + success = false; + break; + } + } + if (success) + return subop1; + } + } } if (XVECLEN (trueop1, 0) == 1 CONST_INT_P (XVECEXP (trueop1, 0, 0)) GET_CODE (trueop0) == VEC_CONCAT) { rtx vec = trueop0; int offset = INTVAL (XVECEXP (trueop1, 0, 0)) * GET_MODE_SIZE (mode); /* Try to find the element in the VEC_CONCAT. */ Index: gcc/testsuite/gcc.target/i386/pr44551-1.c === --- gcc/testsuite/gcc.target/i386/pr44551-1.c (revision 0) +++ gcc/testsuite/gcc.target/i386/pr44551-1.c (working copy) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -mavx } */ + +#include immintrin.h + +__m128i +foo (__m256i x, __m128i y) +{ + __m256i r = _mm256_insertf128_si256(x, y, 1); + __m128i a = _mm256_extractf128_si256(r, 1); + return a; +} + +/* { dg-final { scan-assembler-not vinsertf } } */ +/* { dg-final { scan-assembler-not vextractf } } */
[i386] logical shift right in shrd
Hello, as reported in PR 61503, there seems to be a typo in the shrd pattern. I think it is quite unlikely to cause any problem, because the pattern is 1 instruction too long for combine to recognize it (by the way, if someone has suggestions for PR 55583...). But it is still better to fix it. Bootstrap+testsuite on x86_64-linux-gnu. 2014-06-21 Marc Glisse marc.gli...@inria.fr PR target/61503 * config/i386/i386.md (x86_64_shrd, x86_shrd): Replace ashiftrt with lshiftrt. -- Marc GlisseIndex: gcc/config/i386/i386.md === --- gcc/config/i386/i386.md (revision 211856) +++ gcc/config/i386/i386.md (working copy) @@ -9601,37 +9601,37 @@ (match_operand:DWI 1 register_operand) (match_operand:QI 2 nonmemory_operand))) (clobber (reg:CC FLAGS_REG))]) (match_dup 3)] TARGET_CMOVE [(const_int 0)] ix86_split_shift_insn (operands, operands[3], DWImode); DONE;) (define_insn x86_64_shrd [(set (match_operand:DI 0 nonimmediate_operand +r*m) -(ior:DI (ashiftrt:DI (match_dup 0) +(ior:DI (lshiftrt:DI (match_dup 0) (match_operand:QI 2 nonmemory_operand Jc)) (ashift:DI (match_operand:DI 1 register_operand r) (minus:QI (const_int 64) (match_dup 2) (clobber (reg:CC FLAGS_REG))] TARGET_64BIT shrd{q}\t{%s2%1, %0|%0, %1, %2} [(set_attr type ishift) (set_attr prefix_0f 1) (set_attr mode DI) (set_attr athlon_decode vector) (set_attr amdfam10_decode vector) (set_attr bdver1_decode vector)]) (define_insn x86_shrd [(set (match_operand:SI 0 nonimmediate_operand +r*m) -(ior:SI (ashiftrt:SI (match_dup 0) +(ior:SI (lshiftrt:SI (match_dup 0) (match_operand:QI 2 nonmemory_operand Ic)) (ashift:SI (match_operand:SI 1 register_operand r) (minus:QI (const_int 32) (match_dup 2) (clobber (reg:CC FLAGS_REG))] shrd{l}\t{%s2%1, %0|%0, %1, %2} [(set_attr type ishift) (set_attr prefix_0f 1) (set_attr mode SI) (set_attr pent_pair np)
Re: C++ PATCH for c++/59296 (rvalue object and lvalue ref-qualifier)
On Thu, 19 Jun 2014, Jason Merrill wrote: We were treating a const member function like a normal const reference, and binding an rvalue object argument to it. But it doesn't work that way. That looks weird to me. The const version is a better match than the const, so we should pick that one in overload resolution, but if we remove the const version, the other one seems valid to me, we shouldn't reject this program: struct Type { void get() const { } }; int main() { Type{}.get(); } At least both clang and intel accept it, and I hope the standard didn't make it behave differently from regular function calls. -- Marc Glisse
Re: [RTL, i386] Use subreg instead of UNSPEC_CAST
On Tue, 19 Mar 2013, Richard Henderson wrote: I'm not fond of this, primarily because I believe the pattern should not exist at all. One year later, new try. Tweaking the pattern, I ended up with a copy of the mov pattern (the subreg is generated automatically when the modes don't match), so I just removed it. I know the comment in emit-rtl.c says splitters are a better way forward than subregs, but I haven't managed with splitters while the subreg patch is very simple :-) I added a -O0 testcase because when I was experimenting I had many versions that worked for -O2 but ICEd at -O0 (and vice versa), but it might be redundant with some other tests. Bootstrap+testsuite on x86_64-linux-gnu. 2014-06-10 Marc Glisse marc.gli...@inria.fr PR target/50829 gcc/ * config/i386/sse.md (enum unspec): Remove UNSPEC_CAST. (avx_castmodeavxsizesuffix_castmode): Remove. * config/i386/i386.c (builtin_description) [__builtin_ia32_si256_si, __builtin_ia32_ps256_ps, __builtin_ia32_pd256_pd]: Replace the removed insn with mov. * emit-rtl.c (validate_subreg): Allow vector-vector subregs. gcc/testsuite/ * gcc.target/i386/pr50829-1.c: New file. * gcc.target/i386/pr50829-2.c: New file. -- Marc GlisseIndex: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 211397) +++ gcc/config/i386/i386.c (working copy) @@ -29793,23 +29793,23 @@ static const struct builtin_description { OPTION_MASK_ISA_AVX, CODE_FOR_avx_roundps_sfix256, __builtin_ia32_ceilps_sfix256, IX86_BUILTIN_CEILPS_SFIX256, (enum rtx_code) ROUND_CEIL, (int) V8SI_FTYPE_V8SF_ROUND }, { OPTION_MASK_ISA_AVX, CODE_FOR_roundv8sf2, __builtin_ia32_roundps_az256, IX86_BUILTIN_ROUNDPS_AZ256, UNKNOWN, (int) V8SF_FTYPE_V8SF }, { OPTION_MASK_ISA_AVX, CODE_FOR_roundv8sf2_sfix, __builtin_ia32_roundps_az_sfix256, IX86_BUILTIN_ROUNDPS_AZ_SFIX256, UNKNOWN, (int) V8SI_FTYPE_V8SF }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_unpckhpd256, __builtin_ia32_unpckhpd256, IX86_BUILTIN_UNPCKHPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_unpcklpd256, __builtin_ia32_unpcklpd256, IX86_BUILTIN_UNPCKLPD256, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_unpckhps256, __builtin_ia32_unpckhps256, IX86_BUILTIN_UNPCKHPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_unpcklps256, __builtin_ia32_unpcklps256, IX86_BUILTIN_UNPCKLPS256, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF }, - { OPTION_MASK_ISA_AVX, CODE_FOR_avx_si256_si, __builtin_ia32_si256_si, IX86_BUILTIN_SI256_SI, UNKNOWN, (int) V8SI_FTYPE_V4SI }, - { OPTION_MASK_ISA_AVX, CODE_FOR_avx_ps256_ps, __builtin_ia32_ps256_ps, IX86_BUILTIN_PS256_PS, UNKNOWN, (int) V8SF_FTYPE_V4SF }, - { OPTION_MASK_ISA_AVX, CODE_FOR_avx_pd256_pd, __builtin_ia32_pd256_pd, IX86_BUILTIN_PD256_PD, UNKNOWN, (int) V4DF_FTYPE_V2DF }, + { OPTION_MASK_ISA_AVX, CODE_FOR_movv8si, __builtin_ia32_si256_si, IX86_BUILTIN_SI256_SI, UNKNOWN, (int) V8SI_FTYPE_V4SI }, + { OPTION_MASK_ISA_AVX, CODE_FOR_movv8sf, __builtin_ia32_ps256_ps, IX86_BUILTIN_PS256_PS, UNKNOWN, (int) V8SF_FTYPE_V4SF }, + { OPTION_MASK_ISA_AVX, CODE_FOR_movv4df, __builtin_ia32_pd256_pd, IX86_BUILTIN_PD256_PD, UNKNOWN, (int) V4DF_FTYPE_V2DF }, { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v8si, __builtin_ia32_si_si256, IX86_BUILTIN_SI_SI256, UNKNOWN, (int) V4SI_FTYPE_V8SI }, { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v8sf, __builtin_ia32_ps_ps256, IX86_BUILTIN_PS_PS256, UNKNOWN, (int) V4SF_FTYPE_V8SF }, { OPTION_MASK_ISA_AVX, CODE_FOR_vec_extract_lo_v4df, __builtin_ia32_pd_pd256, IX86_BUILTIN_PD_PD256, UNKNOWN, (int) V2DF_FTYPE_V4DF }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestpd, __builtin_ia32_vtestzpd, IX86_BUILTIN_VTESTZPD, EQ, (int) INT_FTYPE_V2DF_V2DF_PTEST }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestpd, __builtin_ia32_vtestcpd, IX86_BUILTIN_VTESTCPD, LTU, (int) INT_FTYPE_V2DF_V2DF_PTEST }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestpd, __builtin_ia32_vtestnzcpd, IX86_BUILTIN_VTESTNZCPD, GTU, (int) INT_FTYPE_V2DF_V2DF_PTEST }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestps, __builtin_ia32_vtestzps, IX86_BUILTIN_VTESTZPS, EQ, (int) INT_FTYPE_V4SF_V4SF_PTEST }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestps, __builtin_ia32_vtestcps, IX86_BUILTIN_VTESTCPS, LTU, (int) INT_FTYPE_V4SF_V4SF_PTEST }, { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vtestps, __builtin_ia32_vtestnzcps, IX86_BUILTIN_VTESTNZCPS, GTU, (int) INT_FTYPE_V4SF_V4SF_PTEST }, Index: gcc/config/i386/sse.md === --- gcc/config/i386/sse.md (revision 211397) +++ gcc/config/i386/sse.md (working copy) @@ -66,21 +66,20 @@ UNSPEC_AESKEYGENASSIST ;; For PCLMUL support UNSPEC_PCLMUL ;; For AVX support UNSPEC_PCMP UNSPEC_VPERMIL
PR54442 build_qualified_type produces a non-canonical type
Hello, in this PR, we end up with 3 types A, B and C such that TYPE_CANONICAL(A)=B and TYPE_CANONICAL(B)=C. I don't think that is supposed to happen. Here, build_qualified_type fills in TYPE_CANONICAL(A) by calling build_qualified_type (so get_qualified_type), which afaics isn't guaranteed to return a canonical type. I doubt the patch can be wrong, but it may be that this is a situation that is not supposed to happen and should be fixed elsewhere? Bootstrap+testsuite on x86_64-linux-gnu. 2014-06-09 Marc Glisse marc.gli...@inria.fr PR c++/54442 gcc/ * tree.c (build_qualified_type): Use a canonical type for TYPE_CANONICAL. gcc/testsuite/ * g++.dg/pr54442.C: New file. -- Marc GlisseIndex: gcc/testsuite/g++.dg/pr54442.C === --- gcc/testsuite/g++.dg/pr54442.C (revision 0) +++ gcc/testsuite/g++.dg/pr54442.C (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ + +struct S +{ + void s (int) const throw (); + void s (int) throw (); +}; + +typedef int index_t; + +void (S::*f) (index_t) = S::s; +void (S::*g) (index_t) const = S::s; Index: gcc/tree.c === --- gcc/tree.c (revision 211374) +++ gcc/tree.c (working copy) @@ -6319,22 +6319,25 @@ build_qualified_type (tree type, int typ TYPE_ALIGN (t) = TYPE_ALIGN (atomic_type); } } if (TYPE_STRUCTURAL_EQUALITY_P (type)) /* Propagate structural equality. */ SET_TYPE_STRUCTURAL_EQUALITY (t); else if (TYPE_CANONICAL (type) != type) /* Build the underlying canonical type, since it is different from TYPE. */ - TYPE_CANONICAL (t) = build_qualified_type (TYPE_CANONICAL (type), - type_quals); + { + tree c = build_qualified_type (TYPE_CANONICAL (type), +type_quals); + TYPE_CANONICAL (t) = TYPE_CANONICAL (c); + } else /* T is its own canonical type. */ TYPE_CANONICAL (t) = t; } return t; } /* Create a variant of type T with alignment ALIGN. */
Re: PR54442 build_qualified_type produces a non-canonical type
On Mon, 9 Jun 2014, Jason Merrill wrote: On 06/09/2014 10:18 AM, Marc Glisse wrote: I doubt the patch can be wrong, but it may be that this is a situation that is not supposed to happen and should be fixed elsewhere? Seems likely. What is the difference between the type returned from build_qualified_type (TYPE_CANONICAL and it's TYPE_CANONICAL? I would expect them to be the same. throws tree_list 0x7660e5c8 purpose integer_cst 0x764d6ba0 constant 1 (in what build_qualified_type returns) -- Marc Glisse
Re: __float128 typeinfo
On Fri, 6 Jun 2014, Paolo Carlini wrote: On 06/06/2014 04:16 PM, Marc Glisse wrote: abi_check is broken before my patch (134 incompatible symbols). Isn't broken for me, though. Likewise, AFAICS, on gcc-testresults. I would recommend investigating in some detail what's going on at your end... Ah, no, abi_check actually passed in the regular bootstrap, it is in the debug builds (-O0) that it fails because more than a hundred non-inlined inline functions find their way into the library. False alarm. -- Marc Glisse
__float128 typeinfo
Hello, here is a new try on adding __float128 typeinfo to libsupc++. The front-end part is based on the discussion with Jason yesterday. The libstdc++ part is copied from: https://gcc.gnu.org/ml/libstdc++/2014-04/msg00077.html (which wasn't reviewed), but I changed the testsuite. Michael will likely need to make some adjustments to his __float128 patch, but I believe it will only be a small configure tweak. Ramana, does it look safe for ARM? By the way, do you want to remove the XFmode defined in arm-modes.def and apparently unused? Bootstrap+testsuite on x86_64-linux-gnu. I manually checked that libstdc++.{a,so} gained exactly the symbols related to typeinfo for __float128. abi_check is broken before my patch (134 incompatible symbols). After it, the number of added symbols increases by 7, but the number of incompatible symbols remains the same, so I take it as a pass. 2014-06-06 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 gcc/cp/ * rtti.c (emit_support_tinfos): Handle __float128. libstdc++-v3/ * config/abi/pre/float128.ver: New file. * configure.ac: Use float128.ver when relevant. * configure: Regenerate. * testsuite/util/testsuite_abi.cc (check_version): Accept new CXXABI_FLOAT128_1.3.9 version. -- Marc GlisseIndex: gcc/cp/rtti.c === --- gcc/cp/rtti.c (revision 211311) +++ gcc/cp/rtti.c (working copy) @@ -1540,20 +1540,31 @@ emit_support_tinfos (void) /*tag_scope=*/ts_current, false); pop_abi_namespace (); if (!COMPLETE_TYPE_P (bltn_type)) return; dtor = CLASSTYPE_DESTRUCTORS (bltn_type); if (!dtor || DECL_EXTERNAL (dtor)) return; doing_runtime = 1; for (ix = 0; fundamentals[ix]; ix++) emit_support_tinfo_1 (*fundamentals[ix]); + + /* Search for an extra floating point type like __float128. */ + for (enum machine_mode mode = GET_CLASS_NARROWEST_MODE (MODE_FLOAT); + mode != VOIDmode; + mode = GET_MODE_WIDER_MODE (mode)) +{ + tree type = c_common_type_for_mode (mode, false); + if (type type != float_type_node type != double_type_node + type != long_double_type_node) + emit_support_tinfo_1 (type); +} } /* Finish a type info decl. DECL_PTR is a pointer to an unemitted tinfo decl. Determine whether it needs emitting, and if so generate the initializer. */ bool emit_tinfo_decl (tree decl) { tree type = TREE_TYPE (DECL_NAME (decl)); Index: libstdc++-v3/config/abi/pre/float128.ver === --- libstdc++-v3/config/abi/pre/float128.ver(revision 0) +++ libstdc++-v3/config/abi/pre/float128.ver(working copy) @@ -0,0 +1,10 @@ +# Appended to version file. + +CXXABI_FLOAT128_1.3.9 { + +# typeinfo and typeinfo name for __float128 +_ZT[IS]g; +_ZT[IS]Pg; +_ZT[IS]PKg; + +}; Index: libstdc++-v3/configure === --- libstdc++-v3/configure (revision 211311) +++ libstdc++-v3/configure (working copy) @@ -15698,20 +15698,23 @@ $as_echo #define _GLIBCXX_USE_FLOAT128 $as_echo $enable_float128 6; } rm -f conftest* ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext 5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS 5' ac_compiler_gnu=$ac_cv_c_compiler_gnu +if test $enable_float128 = yes; then + port_specific_symbol_files=$port_specific_symbol_files \$(top_srcdir)/config/abi/pre/float128.ver +fi # Checks for compiler support that doesn't require linking. # All these tests are for C++; save the language and the compiler flags. # The CXXFLAGS thing is suspicious, but based on similar bits previously # found in GLIBCXX_CONFIGURE. ac_ext=cpp ac_cpp='$CXXCPP $CPPFLAGS' ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext 5' Index: libstdc++-v3/configure.ac === --- libstdc++-v3/configure.ac (revision 211311) +++ libstdc++-v3/configure.ac (working copy) @@ -146,20 +146,23 @@ GLIBCXX_ENABLE_HOSTED # Enable descriptive messages to standard output on termination. GLIBCXX_ENABLE_VERBOSE # Enable compiler support that doesn't require linking. GLIBCXX_ENABLE_SJLJ_EXCEPTIONS GLIBCXX_ENABLE_PCH($is_hosted) GLIBCXX_ENABLE_THREADS GLIBCXX_ENABLE_ATOMIC_BUILTINS GLIBCXX_ENABLE_DECIMAL_FLOAT GLIBCXX_ENABLE_INT128_FLOAT128 +if test $enable_float128 = yes; then + port_specific_symbol_files=$port_specific_symbol_files \$(top_srcdir)/config/abi/pre/float128.ver +fi # Checks for compiler support that doesn't require linking. GLIBCXX_CHECK_COMPILER_FEATURES # Enable all the variable C++ runtime options that don't require linking. GLIBCXX_ENABLE_CSTDIO GLIBCXX_ENABLE_CLOCALE GLIBCXX_ENABLE_ALLOCATOR
Re: __float128 typeinfo
On Fri, 6 Jun 2014, Ramana Radhakrishnan wrote: On Fri, Jun 6, 2014 at 3:16 PM, Marc Glisse marc.gli...@inria.fr wrote: Hello, here is a new try on adding __float128 typeinfo to libsupc++. The front-end part is based on the discussion with Jason yesterday. The libstdc++ part is copied from: https://gcc.gnu.org/ml/libstdc++/2014-04/msg00077.html (which wasn't reviewed), but I changed the testsuite. Michael will likely need to make some adjustments to his __float128 patch, but I believe it will only be a small configure tweak. Ramana, does it look safe for ARM? By the way, do you want to remove the XFmode defined in arm-modes.def and apparently unused? Thanks for this though I know Tejas is actually trying to fix AArch64 first and then we'll move on to ARM for cleaning up all these types - as you realize it's quite a chunky project. Nice to know. I am a bit afraid it might be hard without breaking ABI compatibility at least a little. Good luck anyway. A patch to remove XFmode is pre-approved. Even for such a trivial patch, it would be good to do at least a build to check nothing breaks, but I don't have an easy way to do that (debian multiarch cross-builds don't work out of the box), it would be better if one of the people working on ARM could handle it. I'll try and have a look at this patch later today or over the weekend to see if it doesn't affect ARM / AArch64. Thanks! -- Marc Glisse
Re: __float128 typeinfo
On Fri, 6 Jun 2014, Jason Merrill wrote: On 06/06/2014 10:16 AM, Marc Glisse wrote: * config/abi/pre/float128.ver: New file. +CXXABI_FLOAT128_1.3.9 { What's your rationale for keeping this in a separate version block rather than in 1.3.9 (like __int128)? https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43622#c6 Powerpc already has those symbols in CXXABI_LDBL_1.3 (for a type that isn't __float128 but (de)mangles that way). That doesn't prevent from using CXXABI_1.3.9 in the x86/ia64-specific file though, so I can do that if you prefer. If a new target implements __float128 in 4.11 or 4.12, they will have to refine the configure test and provide a different .ver file to avoid adding symbols to an old version. Or I could try to share the CXXABI_LDBL_1.3 name although it doesn't correspond to long double in this case. I have no preference, I'll go with whatever people like. -- Marc Glisse
Re: __float128 typeinfo
On Fri, 6 Jun 2014, Jason Merrill wrote: On 06/06/2014 11:58 AM, Marc Glisse wrote: On Fri, 6 Jun 2014, Jason Merrill wrote: What's your rationale for keeping this in a separate version block rather than in 1.3.9 (like __int128)? Powerpc already has those symbols in CXXABI_LDBL_1.3 (for a type that isn't __float128 but (de)mangles that way). That doesn't prevent from using CXXABI_1.3.9 in the x86/ia64-specific file though, so I can do that if you prefer. If a new target implements __float128 in 4.11 or 4.12, they will have to refine the configure test and provide a different .ver file to avoid adding symbols to an old version. Fair enough. Let's stick with your approach then, but drop the 1.3.9 from the name. Ok. To help with the review, I am appending the updated patch (untested). (if we add numeric_limits__float128 later, I think it should go into GLIBCXX_FLOAT128_1) Why is __float128 handled as a target type, anyway? I'd think it ought to be a standard type that just isn't supported on all targets, like __int128. I don't know. Since it is implemented in software on all platforms as far as I know, it shouldn't be that specific to a target. But it isn't just __float128, on some targets my patch may start generating extra symbols for half-floats (ARM) or __float80. DJ Delorie's work on __intN may be a good direction for __floatN as well, IIUC we will have a global list of 4 __intN types, and the target decides the value of N for each of them, so we can refer to those type nodes in common code but they are still target-specific. Or maybe there isn't enough variety in float types to deserve this. -- Marc GlisseIndex: gcc/cp/rtti.c === --- gcc/cp/rtti.c (revision 211311) +++ gcc/cp/rtti.c (working copy) @@ -1540,20 +1540,31 @@ emit_support_tinfos (void) /*tag_scope=*/ts_current, false); pop_abi_namespace (); if (!COMPLETE_TYPE_P (bltn_type)) return; dtor = CLASSTYPE_DESTRUCTORS (bltn_type); if (!dtor || DECL_EXTERNAL (dtor)) return; doing_runtime = 1; for (ix = 0; fundamentals[ix]; ix++) emit_support_tinfo_1 (*fundamentals[ix]); + + /* Search for an extra floating point type like __float128. */ + for (enum machine_mode mode = GET_CLASS_NARROWEST_MODE (MODE_FLOAT); + mode != VOIDmode; + mode = GET_MODE_WIDER_MODE (mode)) +{ + tree type = c_common_type_for_mode (mode, false); + if (type type != float_type_node type != double_type_node + type != long_double_type_node) + emit_support_tinfo_1 (type); +} } /* Finish a type info decl. DECL_PTR is a pointer to an unemitted tinfo decl. Determine whether it needs emitting, and if so generate the initializer. */ bool emit_tinfo_decl (tree decl) { tree type = TREE_TYPE (DECL_NAME (decl)); Index: libstdc++-v3/config/abi/pre/float128.ver === --- libstdc++-v3/config/abi/pre/float128.ver(revision 0) +++ libstdc++-v3/config/abi/pre/float128.ver(working copy) @@ -0,0 +1,10 @@ +# Appended to version file. + +CXXABI_FLOAT128 { + +# typeinfo and typeinfo name for __float128 +_ZT[IS]g; +_ZT[IS]Pg; +_ZT[IS]PKg; + +}; Index: libstdc++-v3/configure === --- libstdc++-v3/configure (revision 211311) +++ libstdc++-v3/configure (working copy) @@ -15698,20 +15698,23 @@ $as_echo #define _GLIBCXX_USE_FLOAT128 $as_echo $enable_float128 6; } rm -f conftest* ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext 5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS 5' ac_compiler_gnu=$ac_cv_c_compiler_gnu +if test $enable_float128 = yes; then + port_specific_symbol_files=$port_specific_symbol_files \$(top_srcdir)/config/abi/pre/float128.ver +fi # Checks for compiler support that doesn't require linking. # All these tests are for C++; save the language and the compiler flags. # The CXXFLAGS thing is suspicious, but based on similar bits previously # found in GLIBCXX_CONFIGURE. ac_ext=cpp ac_cpp='$CXXCPP $CPPFLAGS' ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext 5' Index: libstdc++-v3/configure.ac === --- libstdc++-v3/configure.ac (revision 211311) +++ libstdc++-v3/configure.ac (working copy) @@ -146,20 +146,23 @@ GLIBCXX_ENABLE_HOSTED # Enable descriptive messages to standard output on termination. GLIBCXX_ENABLE_VERBOSE # Enable compiler support that doesn't require linking. GLIBCXX_ENABLE_SJLJ_EXCEPTIONS GLIBCXX_ENABLE_PCH($is_hosted) GLIBCXX_ENABLE_THREADS GLIBCXX_ENABLE_ATOMIC_BUILTINS GLIBCXX_ENABLE_DECIMAL_FLOAT GLIBCXX_ENABLE_INT128_FLOAT128 +if test $enable_float128 = yes
Re: PR61385: phiopt drops some PHIs
On Tue, 3 Jun 2014, Jeff Law wrote: On 06/03/14 08:08, Richard Biener wrote: All arguments get the same value (and the PHI in middle-bb is surely a singleton?), so it's way better to re-materialize the PHI as a gimple assignment at the start of the basic block. If they are singletons (which I expect), the easiest way is to propagate their single argument into all uses and simply remove them. I was all for propagating, but replace_uses_by calls fold_stmt on the using statements (that's normal). If I first move the statement and then replace, I may call fold_stmt on a stmt that isn't dominated by the defs of all its arguments. If I first replace, the statement I am supposed to move may have been modified so I need to find out what happened to it. In the end, the assignment seems easier and safer (and not too bad since this case almost never happens in practice). We certainly want to get rid of them :-) I'd start by finding out which pass left the degenerate, ensure it's not marked as SSA_NAME_OCCURS_IN_ABNORMAL_PHI, then propagate it away. If it's a systemic problem that a particular pass can leave degenerate PHIs, then you might want to schedule the phi-only propagator to run after that pass. It does const/copy propagation seeding from degenerate PHIs, so it ought to be very fast. This one comes from VRP2 apparently. But it isn't the only pass that produces singleton phis. I didn't look at them closely, but I assume LIM is normal, DOM maybe less so, etc. bootstrap+testsuite on x86_64-linux-gnu as usual. 2014-06-04 Marc Glisse marc.gli...@inria.fr PR tree-optimization/61385 gcc/ * tree-ssa-phiopt.c (value_replacement): Replace singleton PHIs with assignments. gcc/testsuite/ * gcc.dg/tree-ssa/pr61385.c: New file. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy) @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +#define assert(x) if (!(x)) __builtin_abort () + +int a, b, c, d, e, f, g; + +int +fn1 () +{ + int *h = c; + for (; c 1; c++) +{ + int *i = a, *k = a; + f = 0; + if (b) + return 0; + if (*h) + { + int **j = i; + *j = 0; + d = 0; + } + else + g = e = 0; + if (*h) + { + int **l = k; + *l = g; + } + d = *h; + assert (k == a || k); + assert (i); +} + return 0; +} + +int +main () +{ + fn1 (); + return 0; +} Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 211178) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -877,21 +877,35 @@ value_replacement (basic_block cond_bb, operand_equal_for_phi_arg_p (rhs2, cond_lhs) neutral_element_p (code_def, cond_rhs, true)) || (arg1 == rhs2 operand_equal_for_phi_arg_p (rhs1, cond_lhs) neutral_element_p (code_def, cond_rhs, false)) || (operand_equal_for_phi_arg_p (arg1, cond_rhs) (operand_equal_for_phi_arg_p (rhs2, cond_lhs) || operand_equal_for_phi_arg_p (rhs1, cond_lhs)) absorbing_element_p (code_def, cond_rhs { + /* Any PHI node is a singleton and we can replace it with an assignment. +We don't just call replace_uses_by because it calls fold_stmt and it +becomes hard to make all the right adjustments in the right order. */ gsi = gsi_for_stmt (cond); + gimple_stmt_iterator psi = gsi_start_phis (middle_bb); + while (!gsi_end_p (psi)) + { + gimple phi_moving = gsi_stmt (psi); + tree phi_res = gimple_phi_result (phi_moving); + tree phi_arg = gimple_phi_arg_def (phi_moving, 0); + gsi_remove (psi, false); + gsi_insert_before (gsi, gimple_build_assign (phi_res, phi_arg), +GSI_SAME_STMT); + } + gimple_stmt_iterator gsi_from = gsi_for_stmt (assign); gsi_move_before (gsi_from, gsi); replace_phi_edge_with_variable (cond_bb, e1, phi, lhs); return 2; } return 0; } /* The function minmax_replacement does the main work of doing the minmax
Re: PR61385: phiopt drops some PHIs
On Wed, 4 Jun 2014, Richard Biener wrote: So I'd say we should instead simply bail out if the middle-bb has a PHI node. Sounds good to me, so I am testing the mini-patch I had originally posted to bugzilla: 2014-06-04 Marc Glisse marc.gli...@inria.fr PR tree-optimization/61385 gcc/ * tree-ssa-phiopt.c (value_replacement): Punt if there are PHI nodes. gcc/testsuite/ * gcc.dg/tree-ssa/pr61385.c: New file. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy) @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +#define assert(x) if (!(x)) __builtin_abort () + +int a, b, c, d, e, f, g; + +int +fn1 () +{ + int *h = c; + for (; c 1; c++) +{ + int *i = a, *k = a; + f = 0; + if (b) + return 0; + if (*h) + { + int **j = i; + *j = 0; + d = 0; + } + else + g = e = 0; + if (*h) + { + int **l = k; + *l = g; + } + d = *h; + assert (k == a || k); + assert (i); +} + return 0; +} + +int +main () +{ + fn1 (); + return 0; +} Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 211221) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -842,20 +842,24 @@ value_replacement (basic_block cond_bb, /* Now optimize (x != 0) ? x + y : y to just y. The following condition is too restrictive, there can easily be another stmt in middle_bb, for instance a CONVERT_EXPR for the second argument. */ gimple assign = last_and_only_stmt (middle_bb); if (!assign || gimple_code (assign) != GIMPLE_ASSIGN || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0)) !POINTER_TYPE_P (TREE_TYPE (arg0 return 0; + /* Punt if there are (degenerate) PHIs in middle_bb, there should not be. */ + if (!gimple_seq_empty_p (phi_nodes (middle_bb))) +return 0; + /* Only transform if it removes the condition. */ if (!single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)), e0, e1)) return 0; /* Size-wise, this is always profitable. */ if (optimize_bb_for_speed_p (cond_bb) /* The special case is useless if it has a low probability. */ profile_status_for_fn (cfun) != PROFILE_ABSENT EDGE_PRED (middle_bb, 0)-probability PROB_EVEN /* If assign is cheap, there is no point avoiding it. */
Re: emit __float128 typeinfo
On Wed, 4 Jun 2014, Jason Merrill wrote: How about, in emit_support_tinfos, using type_for_mode to check for a TF-mode floating point type different from long_double_type_node? What should I pass as the mode argument? I can't just write TFmode, that will fail to compile on platforms that don't define it, and powerpc seems likely to call it JFmode instead quite soon. MAX_MODE_FLOAT maybe? But then if we configure with long double = __float128, we will miss __float80. Ah, we walk from GET_CLASS_NARROWEST_MODE (MODE_FLOAT) with GET_MODE_WIDER_MODE steps and test if the associated type is not in the list 0/float/double/long double. I think it should be ok with arm (it would be good if they removed their unused XFmode, but I don't even think it is necessary). Is that what you were suggesting? I'll try to write a patch, thanks. -- Marc Glisse
Re: [PATCH][match-and-simplify]
On Tue, 3 Jun 2014, Richard Biener wrote: On Mon, 2 Jun 2014, Marc Glisse wrote: (plus (bit_not @0) @0) if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) { build_int_cst (TREE_TYPE (@0), -1); }) +(match_and_simplify + (plus @0 (bit_not @0)) + if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) + { build_int_cst (TREE_TYPE (@0), -1); }) Why not just: (match_and_simplify (plus @0 (bit_not @0)) { build_all_ones_cst (TREE_TYPE (@0)); }) ? Works for vector/complex, I don't know what type a bit_not_expr can have where the simplification wouldn't be true. Sure. Thanks. Sorry for not being clear enough, the same remark applies to basically all the bitwise patterns in the file. I could have saved the remark for the pre-merge RFC period. I mostly posted it now so the next batch of patterns can be directly written in a general way so you (Prathamesh or Richard) have fewer to fix later, but it can certainly wait. /* ~~x - x */ (match_and_simplify (bit_not (bit_not @0)) if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) @0) We can drop the if line. /* (x | CST1) CST2 - (x CST2) | (CST1 CST2) */ (match_and_simplify (bit_and (bit_ior @0 INTEGER_CST_P@1) INTEGER_CST_P@2) if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) (bit_ior (bit_and @0 @2) (bit_and @1 @2))) Drop the if line and replace INTEGER_CST_P with CONSTANT_CLASS_P. etc. -- Marc Glisse
PR61385: phiopt drops some PHIs
Hello, apparently it is possible to have a PHI in the middle basic block of value_replacement, so I need to move it as well when I move the statement and remove the block. Bootstrap and testsuite on x86_64-linux-gnu (re-running for various reasons but they had completed successfully yesterday). 2014-06-03 Marc Glisse marc.gli...@inria.fr PR tree-optimization/61385 gcc/ * tree-ssa-phiopt.c (value_replacement): Copy PHI nodes before removing the basic block. gcc/testsuite/ * gcc.dg/tree-ssa/pr61385.c: New file. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy) @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +#define assert(x) if (!(x)) __builtin_abort () + +int a, b, c, d, e, f, g; + +int +fn1 () +{ + int *h = c; + for (; c 1; c++) +{ + int *i = a, *k = a; + f = 0; + if (b) + return 0; + if (*h) + { + int **j = i; + *j = 0; + d = 0; + } + else + g = e = 0; + if (*h) + { + int **l = k; + *l = g; + } + d = *h; + assert (k == a || k); + assert (i); +} + return 0; +} + +int +main () +{ + fn1 (); + return 0; +} Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 211178) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -877,20 +877,39 @@ value_replacement (basic_block cond_bb, operand_equal_for_phi_arg_p (rhs2, cond_lhs) neutral_element_p (code_def, cond_rhs, true)) || (arg1 == rhs2 operand_equal_for_phi_arg_p (rhs1, cond_lhs) neutral_element_p (code_def, cond_rhs, false)) || (operand_equal_for_phi_arg_p (arg1, cond_rhs) (operand_equal_for_phi_arg_p (rhs2, cond_lhs) || operand_equal_for_phi_arg_p (rhs1, cond_lhs)) absorbing_element_p (code_def, cond_rhs { + /* Move potential PHI nodes. */ + gimple_stmt_iterator psi = gsi_start_phis (middle_bb); + while (!gsi_end_p (psi)) + { + gimple phi_moving = gsi_stmt (psi); + gimple newphi = create_phi_node (gimple_phi_result (phi_moving), + cond_bb); + int nargs = cond_bb-preds-length(); + location_t loc = gimple_phi_arg_location (phi_moving, 0); + tree phi_arg = gimple_phi_arg_def (phi_moving, 0); + for (int i = 0; i nargs; ++i) + { + edge e = (*cond_bb-preds)[i]; + add_phi_arg (newphi, phi_arg, e, loc); + } + update_stmt (newphi); + gsi_remove (psi, false); + } + gsi = gsi_for_stmt (cond); gimple_stmt_iterator gsi_from = gsi_for_stmt (assign); gsi_move_before (gsi_from, gsi); replace_phi_edge_with_variable (cond_bb, e1, phi, lhs); return 2; } return 0; }
Re: calloc = malloc + memset
Ping? On Sat, 17 May 2014, Marc Glisse wrote: Ping Jakub? https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01104.html On Wed, 23 Apr 2014, Richard Biener wrote: On Fri, Apr 18, 2014 at 8:27 PM, Marc Glisse marc.gli...@inria.fr wrote: Thanks for the comments! On Fri, 18 Apr 2014, Jakub Jelinek wrote: The passes.def change makes me a little bit nervous, but if it works, perhaps. Would you prefer running the pass twice? I thought there would be less resistance to moving the pass than duplicating it. Indeed. I think placing it after loops and CSE (thus what you have done) makes sense. strlenopt itself shouldn't enable much additional optimizations. But well, pass ordering is always tricky. Didn't look at the rest of the changes, but Jakub is certainly able to approve the patch so I leave it to him. -- Marc Glisse
Re: PR61385: phiopt drops some PHIs
On Tue, 3 Jun 2014, Richard Biener wrote: On Tue, Jun 3, 2014 at 3:48 PM, Marc Glisse marc.gli...@inria.fr wrote: Hello, apparently it is possible to have a PHI in the middle basic block of value_replacement, so I need to move it as well when I move the statement and remove the block. Bootstrap and testsuite on x86_64-linux-gnu (re-running for various reasons but they had completed successfully yesterday). 2014-06-03 Marc Glisse marc.gli...@inria.fr PR tree-optimization/61385 gcc/ * tree-ssa-phiopt.c (value_replacement): Copy PHI nodes before removing the basic block. gcc/testsuite/ * gcc.dg/tree-ssa/pr61385.c: New file. -- Marc Glisse Index: gcc/testsuite/gcc.dg/tree-ssa/pr61385.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/pr61385.c (working copy) @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +#define assert(x) if (!(x)) __builtin_abort () + +int a, b, c, d, e, f, g; + +int +fn1 () +{ + int *h = c; + for (; c 1; c++) +{ + int *i = a, *k = a; + f = 0; + if (b) + return 0; + if (*h) + { + int **j = i; + *j = 0; + d = 0; + } + else + g = e = 0; + if (*h) + { + int **l = k; + *l = g; + } + d = *h; + assert (k == a || k); + assert (i); +} + return 0; +} + +int +main () +{ + fn1 (); + return 0; +} Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 211178) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -877,20 +877,39 @@ value_replacement (basic_block cond_bb, operand_equal_for_phi_arg_p (rhs2, cond_lhs) neutral_element_p (code_def, cond_rhs, true)) || (arg1 == rhs2 operand_equal_for_phi_arg_p (rhs1, cond_lhs) neutral_element_p (code_def, cond_rhs, false)) || (operand_equal_for_phi_arg_p (arg1, cond_rhs) (operand_equal_for_phi_arg_p (rhs2, cond_lhs) || operand_equal_for_phi_arg_p (rhs1, cond_lhs)) absorbing_element_p (code_def, cond_rhs { + /* Move potential PHI nodes. */ + gimple_stmt_iterator psi = gsi_start_phis (middle_bb); + while (!gsi_end_p (psi)) + { + gimple phi_moving = gsi_stmt (psi); + gimple newphi = create_phi_node (gimple_phi_result (phi_moving), + cond_bb); + int nargs = cond_bb-preds-length(); + location_t loc = gimple_phi_arg_location (phi_moving, 0); + tree phi_arg = gimple_phi_arg_def (phi_moving, 0); + for (int i = 0; i nargs; ++i) + { + edge e = (*cond_bb-preds)[i]; + add_phi_arg (newphi, phi_arg, e, loc); All arguments get the same value (and the PHI in middle-bb is surely a singleton?), Yes, there is a single incoming edge to middle-bb. so it's way better to re-materialize the PHI as a gimple assignment at the start of the basic block. I thought there might be a reason why it was a PHI and not an assignment so I wasn't sure doing that would be ok. It is indeed much easier, so I'll do that... If they are singletons (which I expect), the easiest way is to propagate their single argument into all uses and simply remove them. ... or indeed that, now that I have found a function called replace_uses_by which should do exactly what I need. Thanks, -- Marc Glisse
Re: [PATCH][match-and-simplify]
(plus (bit_not @0) @0) if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) { build_int_cst (TREE_TYPE (@0), -1); }) +(match_and_simplify + (plus @0 (bit_not @0)) + if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) + { build_int_cst (TREE_TYPE (@0), -1); }) Why not just: (match_and_simplify (plus @0 (bit_not @0)) { build_all_ones_cst (TREE_TYPE (@0)); }) ? Works for vector/complex, I don't know what type a bit_not_expr can have where the simplification wouldn't be true. -- Marc Glisse
Re: [patch] libstdc++/61374 fix string_view conversion and update to latest draft
On Sun, 1 Jun 2014, Daniel Krügler wrote: 2014-06-01 19:24 GMT+02:00 Jonathan Wakely jwak...@redhat.com: Tested x86_64-linux, committed to trunk. This should probably go on the 4.9 branch too, although we could leave the old default cosntructor semantics and just fix the conversion operator. Looking at the comparison functions of basic_string_view I noticed that these are not constexpr (nor seem to be other functions such as find()). We may want to wait until the compiler supports C++14 constexpr, trying to implement it with C++11 would be doable but not so useful in my opinion. -- Marc Glisse
Re: [C++11, C++14 PATCH 2/3] Support for SD-6: SG10 Feature Test Recommendations - c-family and testsuite
On Fri, 30 May 2014, Ed Smith-Rowland wrote: + cpp_undef (pfile, __cpp_constexpr); + cpp_define (pfile, __cpp_constexpr=201304); Could you set the other value in an else branch to avoid a def undef redef game? Also, I am pretty sure that gcc doesn't support the latest constexpr, we shouldn't define those macros lightly. -- Marc Glisse
Re: [C++11, C++14 PATCH 3/3] Support for SD-6: SG10 Feature Test Recommendations - libstdc++
On Fri, 30 May 2014, Ed Smith-Rowland wrote: +#define __cpp_lib_constexpr_functions 201210 appears quite a few times, I believe it will warn with -Wsystem-headers. I think you should check if it is already defined (or undefine it first). (you forgot to Cc: libstdc++) -- Marc Glisse
Re: [PATCH] Warn on and fold NULL checks against inline functions
On Mon, 26 May 2014, Patrick Palka wrote: This patch teaches the C++ frontend to warn on NULL checks against inline functions and it teaches the middle-end to fold NULL checks against inline functions. These two things are currently done for non-inline C++ functions, but inline functions are exceptional in that the C++ frontend marks them as weak, and NULL checks against weak symbols cannot be folded in general because the symbol may be mapped to NULL at link-time. But in the case of an inline function, the fact that it's a weak symbol is an implementation detail. And because it is not permitted to explicitly give an inline function the weak attribute (see handle_weak_attribute), in order to acheive $SUBJECT it suffices to assume that all inline functions are non-null, which is what this patch does. Thanks for working on this. You may want to mention PR 59948 in the ChangeLog. In the comments of that PR, Honza seems to have a different idea of what the fold-const test should look like. -- Marc Glisse
Re: [PATCH] Warn on and fold NULL checks against inline functions
On Mon, 26 May 2014, Patrick Palka wrote: On Mon, May 26, 2014 at 4:26 PM, Marc Glisse marc.gli...@inria.fr wrote: On Mon, 26 May 2014, Patrick Palka wrote: This patch teaches the C++ frontend to warn on NULL checks against inline functions and it teaches the middle-end to fold NULL checks against inline functions. These two things are currently done for non-inline C++ functions, but inline functions are exceptional in that the C++ frontend marks them as weak, and NULL checks against weak symbols cannot be folded in general because the symbol may be mapped to NULL at link-time. But in the case of an inline function, the fact that it's a weak symbol is an implementation detail. And because it is not permitted to explicitly give an inline function the weak attribute (see handle_weak_attribute), in order to acheive $SUBJECT it suffices to assume that all inline functions are non-null, which is what this patch does. Thanks for working on this. You may want to mention PR 59948 in the ChangeLog. In the comments of that PR, Honza seems to have a different idea of what the fold-const test should look like. I wonder, then, if the test in c-common.c is also broken... Thanks for the heads up about the PR. I think I should also mention PR 61144 which says the test can be wrong in the other direction (I didn't look at it closely). But even if the tests are broken, your patch may still be an improvement while waiting for the rewrite... -- Marc Glisse
Re: emit __float128 typeinfo
On Wed, 21 May 2014, Jason Merrill wrote: On 04/25/2014 05:04 AM, Marc Glisse wrote: Does this approach seem ok, or do we need to try harder to find a way to get this typeinfo into libsupc++? The latter, I think; these are base types, so they should go in the library. Hmm, ok. Because of the arm target, we can't just use the register_builtin_type hook as it is. The things I can think of right now are: 1) change the prototype of register_builtin_type so it takes an extra bool parameter that says if we want to generate runtime stuff like typeinfo (in addition to what register_builtin_type is already doing). I would then update all target calls with , false and let target maintainers switch to true when they are ready. or 2) create a new register_builtin_type_runtime lang hook that would be defined only in (obj-)c++ and would be used only to generate typeinfo. Does one of those seem acceptable? Another alternative might be to wait for the intN_t work to land and do the same for floatN_t, but that's too big for me. -- Marc Glisse
Re: [i386] Replace builtins with vector extensions
Ping On Mon, 28 Apr 2014, Marc Glisse wrote: Ping http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00590.html (note that ARM seems to be doing the same thing for their neon intrinsics, see Ramana's patch series posted today) On Fri, 11 Apr 2014, Marc Glisse wrote: Hello, the previous discussion on the topic was before we added all those #pragma target in *mmintrin.h: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html I believe that removes a large part of the arguments against it. Note that I only did a few of the more obvious intrinsics, I am waiting to see if this patch is accepted before doing more. Bootstrap+testsuite on x86_64-linux-gnu. 2014-04-11 Marc Glisse marc.gli...@inria.fr * config/i386/xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps, _mm_div_ps, _mm_store_ss, _mm_cvtss_f32): Use vector extensions instead of builtins. * config/i386/emmintrin.h (_mm_store_sd, _mm_cvtsd_f64, _mm_storeh_pd, _mm_cvtsi128_si64, _mm_cvtsi128_si64x, _mm_add_pd, _mm_sub_pd, _mm_mul_pd, _mm_div_pd, _mm_storel_epi64, _mm_movepi64_pi64, _mm_loadh_pd, _mm_loadl_pd): Likewise. (_mm_sqrt_sd): Fix comment. -- Marc Glisse
Re: calloc = malloc + memset
Ping Jakub? https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01104.html On Wed, 23 Apr 2014, Richard Biener wrote: On Fri, Apr 18, 2014 at 8:27 PM, Marc Glisse marc.gli...@inria.fr wrote: Thanks for the comments! On Fri, 18 Apr 2014, Jakub Jelinek wrote: The passes.def change makes me a little bit nervous, but if it works, perhaps. Would you prefer running the pass twice? I thought there would be less resistance to moving the pass than duplicating it. Indeed. I think placing it after loops and CSE (thus what you have done) makes sense. strlenopt itself shouldn't enable much additional optimizations. But well, pass ordering is always tricky. Didn't look at the rest of the changes, but Jakub is certainly able to approve the patch so I leave it to him. Thanks, Richard. By the way, I think even passes we run only once should have the required functions implemented so they can be run several times (at least most of them), in case users want to do that in plugins. I was surprised when I tried adding a second strlen pass and the compiler refused. --- gcc/testsuite/g++.dg/tree-ssa/calloc.C (revision 0) +++ gcc/testsuite/g++.dg/tree-ssa/calloc.C (working copy) @@ -0,0 +1,35 @@ +/* { dg-do compile { target c++11 } } */ +/* { dg-options -O3 -fdump-tree-optimized } */ + +#include new +#include vector +#include cstdlib + +void g(void*); +inline void* operator new(std::size_t sz) +{ + void *p; + + if (sz == 0) +sz = 1; + + // Slightly modified from the libsupc++ version, that one has 2 calls + // to malloc which makes it too hard to optimize. + while ((p = std::malloc (sz)) == 0) +{ + std::new_handler handler = std::get_new_handler (); + if (! handler) +throw std::bad_alloc(); + handler (); +} + return p; +} + +void f(void*p,int n){ + new(p)std::vectorint(n); +} + +/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */ +/* { dg-final { scan-tree-dump-not malloc optimized } } */ +/* { dg-final { scan-tree-dump-not memset optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ This looks to me way too much fragile, any time the libstdc++ or glibc headers change a little bit, you might need to adjust the dg-final directives. Much better would be if you just provided the prototypes yourself and subset of the std::vector you really need for the testcase. You can throw some class or int, it doesn't have to be std::bad_alloc, etc. I don't understand what seems so fragile to you. There is a single function in the .optimized dump, which just calls calloc in a loop. It doesn't seem that likely that a change in glibc/libstdc++ would make an extra memset pop up. A change in libstdc++ could easily prevent the optimization completely (I'd like to hope we can avoid that, half of the purpose of the testcase was making sure libstdc++ didn't change in a bad way), but I don't really see how it could keep it in a way that requires tweaking dg-final. While trying to write a standalone version, I hit again many missed optimizations, getting such nice things in the .optimized dump as: _12 = p_13 + sz_7; if (_12 != p_13) or: _12 = p_13 + sz_7; _30 = (unsigned long) _12; _9 = p_13 + 4; _10 = (unsigned long) _9; _11 = _30 - _10; _22 = _11 /[ex] 4; _21 = _22; _40 = _21 + 1; _34 = _40 * 4; It is embarrassing... I hope the combiner GSoC will work well and we can just add a dozen patterns to handle this before 4.10. --- gcc/testsuite/gcc.dg/strlenopt-9.c (revision 208772) +++ gcc/testsuite/gcc.dg/strlenopt-9.c (working copy) @@ -11,21 +11,21 @@ fn1 (int r) optimized away. */ return strchr (p, '\0'); } __attribute__((noinline, noclone)) size_t fn2 (int r) { char *p, q[10]; strcpy (q, abc); p = r ? a : q; - /* String length for p varies, therefore strlen below isn't + /* String length is constant for both alternatives, and strlen is optimized away. */ return strlen (p); Is this because of jump threading? It is PRE that turns: if (r_4(D) == 0) goto bb 5; else goto bb 3; bb 5: goto bb 4; bb 3: bb 4: # p_1 = PHI q(5), a(3) _5 = __builtin_strlen (p_1); into: if (r_4(D) == 0) goto bb 5; else goto bb 3; bb 5: _7 = __builtin_strlen (q); pretmp_8 = _7; goto bb 4; bb 3: bb 4: # p_1 = PHI q(5), a(3) # prephitmp_9 = PHI pretmp_8(5), 1(3) _5 = prephitmp_9; It says: Found partial redundancy for expression {call_expr__builtin_strlen,p_1}@.MEM_3 (0005) --- gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c(revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c(working copy) @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +#include stdlib.h +#include string.h Even this I find unsafe. The strlenopt*.c tests use it's custom strlenopt.h header for a reason, you might just add a calloc prototype in there and use that header. Might as well use __builtin_* then. +/* Handle a call
Re: [v3] Slightly improve operator new
Ping. On Tue, 15 Apr 2014, Marc Glisse wrote: Ping http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00051.html On Sun, 2 Mar 2014, Marc Glisse wrote: Hello, inlining operator new (with LTO or otherwise), I noticed that it has a complicated implementation, which makes it hard to use this inlined code for optimizations. This patch does two things: 1) there are 2 calls to malloc, I am turning them into just one. At -Os, it does not change the generated code (RTL optimizers manage to merge the calls to malloc). At other levels (-O2, -O3, and especially with -g) it gives a smaller object file. And with just one malloc, some optimizations become much easier (see my recent calloc patch for instance). 2) malloc is predicted to return null 19 times out of 20 because of the loop (that didn't change with the patch), so I am adding __builtin_expect to let gcc optimize the fast path. Further discussion: a) I didn't add __builtin_expect for the test (sz == 0), it didn't change the generated code in my limited test. I was wondering if this test is necessary (new doesn't seem to ever call operator new(0)) or could be moved to operator new[] (new type[0] does call operator new[](0)), but since one can call operator new directly, it has to be protected indeed, so let's forget this point ;-) (too bad malloc is replacable, so we can't use the fact that glibc already does the right thing) b) I have a bit of trouble parsing the standard. Is the nothrow operator new supposed to call the regular operator new? In particular, if a user replaces only the throwing operator new, should the nothrow operator new automatically call that function? That's not what we are currently doing (and it would be a perf regression). Required behavior: Return a non-null pointer to suitably aligned storage (3.7.4), or else return a null pointer. This nothrow version of operator new returns a pointer obtained as if acquired from the (possibly replaced) ordinary version. This requirement is binding on a replacement version of this function. Default behavior: Calls operator new(size). If the call returns normally, returns the result of that call. Otherwise, returns a null pointer. Passes bootstrap+testsuite on x86_64-linux-gnu. Stage 1? 2014-03-03 Marc Glisse marc.gli...@inria.fr * libsupc++/new_op.cc: Factor the calls to malloc, use __builtin_expect. * libsupc++/new_opnt.cc: Likewise. -- Marc Glisse
Re: PR61140: check the phi is unique in value_replacement
On Sat, 10 May 2014, Andrew Pinski wrote: On Sat, May 10, 2014 at 3:53 PM, Marc Glisse marc.gli...@inria.fr wrote: Hello, in my recent phiopt patch enhancing value_replacement to optimize x!=0?x+y:y, I forgot to check that there is no other PHI (not sure how I managed to miss that since I copy-pasted the line just below the test). If there are other phi nodes (with different arguments for those 2 branches), it would be possible to replace the phi argument and stop there (as value_replacement does for its other transformation). However, I am chosing to punt. The cost analysis would be different, and I wrote the transformation assuming that this single-phi test was already done higher in the function. I think we should have some good cost analysis because for this testcase, we should be able to get only one conditional move but right now with punting we don't. That's true. But note that the transformation is already very limited (gives up if there is a second statement in the middle bb, even a simple cast), so I would like to first quickly get the wrong-code regression out of the way, and we can make improvements afterwards (though we can of course start discussing them now). It seems like if there is only 1 extra non-singleton phi (in addition to the one we are transforming) and the target supports conditional move for this type and the direct branch has proba 50%, with the other restrictions already in place, we could go ahead. How does that sound? Not too specialized? If there are many phis, conditional moves are out, the branch will stay, and unless the edge to the operation has a very high proba, it doesn't seem like a good idea to pull the operation out of the branch. -- Marc Glisse
PR61140: check the phi is unique in value_replacement
Hello, in my recent phiopt patch enhancing value_replacement to optimize x!=0?x+y:y, I forgot to check that there is no other PHI (not sure how I managed to miss that since I copy-pasted the line just below the test). If there are other phi nodes (with different arguments for those 2 branches), it would be possible to replace the phi argument and stop there (as value_replacement does for its other transformation). However, I am chosing to punt. The cost analysis would be different, and I wrote the transformation assuming that this single-phi test was already done higher in the function. Bootstrap+testsuite on x86_64-linux-gnu. 2014-05-12 Marc Glisse marc.gli...@inria.fr PR tree-optimization/61140 gcc/ * tree-ssa-phiopt.c (value_replacement): Punt on multiple phis. gcc/testsuite/ * gcc.dg/tree-ssa/pr61140.c: New file. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/pr61140.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr61140.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/pr61140.c (working copy) @@ -0,0 +1,18 @@ +/* { dg-do run } */ +/* { dg-options -O2 } */ + +int a[1] = { 1 }, b = 1, c; + +int +main () +{ + for (; c 1; c++) +if (a[0]) +{ + a[0] = 1; + b = 0; +} + if (b) +__builtin_abort (); + return 0; +} Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 210301) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -842,20 +842,24 @@ value_replacement (basic_block cond_bb, /* Now optimize (x != 0) ? x + y : y to just y. The following condition is too restrictive, there can easily be another stmt in middle_bb, for instance a CONVERT_EXPR for the second argument. */ gimple assign = last_and_only_stmt (middle_bb); if (!assign || gimple_code (assign) != GIMPLE_ASSIGN || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0)) !POINTER_TYPE_P (TREE_TYPE (arg0 return 0; + /* Only transform if it removes the condition. */ + if (!single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)), e0, e1)) +return 0; + /* Size-wise, this is always profitable. */ if (optimize_bb_for_speed_p (cond_bb) /* The special case is useless if it has a low probability. */ profile_status_for_fn (cfun) != PROFILE_ABSENT EDGE_PRED (middle_bb, 0)-probability PROB_EVEN /* If assign is cheap, there is no point avoiding it. */ estimate_num_insns (assign, eni_time_weights) = 3 * estimate_num_insns (cond, eni_time_weights)) return 0;
Re: [patch] change specific int128 - generic intN
On Thu, 8 May 2014, DJ Delorie wrote: Assuming that the formula sizeof(type)*char_bit==precision works for all It doesn't. THe MSP430 has __int20 for example. Well, it wasn't a hard requirement, it is just that the library has to use a more complicated way to get the precision (use (unsigned TYPE)(-1) to get the unsigned max and compute the precision from that, probably). Would it be acceptable for the compiler to always define a set of macros for each of the intN types? What set of macros do you have in mind? I would have thought that would be discouraged, If we can't think of another way... -- Marc Glisse
Re: emit __float128 typeinfo
Ping http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01651.html On Fri, 25 Apr 2014, Marc Glisse wrote: On Fri, 25 Apr 2014, Marc Glisse wrote: the previous patch had to be reverted as it broke the strange handling of vectors in the ARM target. This new patch should be much more conservative I hope. Instead of adding this typeinfo to libsupc++, I am letting the FE know that it isn't available in libsupc++. There are 2 versions, both regtested fine. Does this approach seem ok, or do we need to try harder to find a way to get this typeinfo into libsupc++? 2014-04-25 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * rtti.c (emit_support_tinfos): Move the array... (fundamentals): ... and make it global. (typeinfo_in_lib_p): Use it. 2014-04-25 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * rtti.c (typeinfo_in_lib_p) [REAL_TYPE]: Check against a hardcoded list of available types. It seems better with a TYPE_CANONICAL in there. It passed bootstrap and the testsuite is running. -- Marc Glisse
Re: [patch] change specific int128 - generic intN
On Thu, 8 May 2014, DJ Delorie wrote: The libstdc++v3 headers have __int128 hard-coded all over the place. Any suggestions on parameterizing those for the __intN types that are actually supported by the target? (adding libstdc++@ in Cc:) The first idea that comes to mind (so possibly not such a good one) is to provide predefined macros: #define __EXTENDED_INTEGER_TYPE_1__ __int24 #define __EXTENDED_INTEGER_TYPE_2__ __int128 #undef __EXTENDED_INTEGER_TYPE_3__ Assuming that the formula sizeof(type)*char_bit==precision works for all types, it should be sufficient for the library (abs, type_traits and numeric_limits). -- Marc Glisse
Re: Optimize n?rotate(x,n):x
Hello, here is the latest version. Reviewers seemed happy with different versions, so I went for the simplest one. We only give up on the transformation if we are optimizing for speed, the short-cut has probability 50% and the operation the branch is avoiding is expensive (i.e. only divisions, with the current estimate_num_insns). The optab checks are gone, they should eventually be added to estimate_num_insns instead. I believe this is covered by the previous ok, but I won't commit anything before Tuesday. It would have been more general (and shorter) to call fold after substituting the tested value and see if the result matches the other phi argument (it might handle (a!=b)?a|b:a for instance), instead of the *_element_p tests. I may try that later, but I guess the current patch is good enough for now. 2014-05-06 Marc Glisse marc.gli...@inria.fr PR tree-optimization/59100 gcc/ * tree-ssa-phiopt.c: Include tree-inline.h. (neutral_element_p, absorbing_element_p): New functions. (value_replacement): Handle conditional binary operations with a neutral or absorbing element. gcc/testsuite/ * gcc.dg/tree-ssa/phi-opt-12.c: New file. * gcc.dg/tree-ssa/phi-opt-13.c: Likewise. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (working copy) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-phiopt1 } */ + +int f(int a, int b, int c) { + if (c 5) return c; + if (a == 0) return b; + return a + b; +} + +unsigned rot(unsigned x, int n) { + const int bits = __CHAR_BIT__ * __SIZEOF_INT__; + return (n == 0) ? x : ((x n) | (x (bits - n))); +} + +unsigned m(unsigned a, unsigned b) { + if (a == 0) +return 0; + else +return a b; +} + +/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */ +/* { dg-final { cleanup-tree-dump phiopt1 } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (working copy) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +// Division is expensive +long f(long a, long b) { + if (__builtin_expect(b == 1, 1)) return a; + return a / b; +} + +/* { dg-final { scan-tree-dump-times goto 2 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 209979) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -47,20 +47,21 @@ along with GCC; see the file COPYING3. #include tree-pass.h #include langhooks.h #include domwalk.h #include cfgloop.h #include tree-data-ref.h #include gimple-pretty-print.h #include insn-config.h #include expr.h #include optabs.h #include tree-scalar-evolution.h +#include tree-inline.h #ifndef HAVE_conditional_move #define HAVE_conditional_move (0) #endif static unsigned int tree_ssa_phiopt_worker (bool, bool); static bool conditional_replacement (basic_block, basic_block, edge, edge, gimple, tree, tree); static int value_replacement (basic_block, basic_block, edge, edge, gimple, tree, tree); @@ -652,20 +653,78 @@ operand_equal_for_value_replacement (con if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp)) return true; tmp = gimple_assign_rhs2 (def); if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp)) return true; return false; } +/* Returns true if ARG is a neutral element for operation CODE + on the RIGHT side. */ + +static bool +neutral_element_p (tree_code code, tree arg, bool right) +{ + switch (code) +{ +case PLUS_EXPR: +case BIT_IOR_EXPR: +case BIT_XOR_EXPR: + return integer_zerop (arg); + +case LROTATE_EXPR: +case RROTATE_EXPR: +case LSHIFT_EXPR: +case RSHIFT_EXPR: +case MINUS_EXPR: +case POINTER_PLUS_EXPR: + return right integer_zerop (arg); + +case MULT_EXPR: + return integer_onep (arg); + +case TRUNC_DIV_EXPR: +case CEIL_DIV_EXPR: +case FLOOR_DIV_EXPR: +case ROUND_DIV_EXPR: +case EXACT_DIV_EXPR: + return right integer_onep (arg); + +case BIT_AND_EXPR: + return integer_all_onesp (arg); + +default: + return false; +} +} + +/* Returns true if ARG is an absorbing element for operation CODE. */ + +static bool +absorbing_element_p (tree_code code, tree arg) +{ + switch (code) +{ +case BIT_IOR_EXPR: + return integer_all_onesp (arg); + +case MULT_EXPR: +case BIT_AND_EXPR: + return integer_zerop (arg); + +default: + return false
Re: [PATCH], RFC, add support for __float128/__ibm128 types on PowerPC
Minor detail: + if ((flags OPTION_MASK_FLOAT128) != 0) +rs6000_define_or_undefine_macro (define_p, __FLOAT128__); I recently added __SIZEOF_FLOAT128__ to the x86 target to advertise the availability of __float128, it would be good if we had a common macro (it can be in addition to __FLOAT128__). If you really don't like it, we can ask x86 maintainers if they like __FLOAT128__. -- Marc Glisse
Re: [i386] Replace builtins with vector extensions
Ping http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00590.html (note that ARM seems to be doing the same thing for their neon intrinsics, see Ramana's patch series posted today) On Fri, 11 Apr 2014, Marc Glisse wrote: Hello, the previous discussion on the topic was before we added all those #pragma target in *mmintrin.h: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html I believe that removes a large part of the arguments against it. Note that I only did a few of the more obvious intrinsics, I am waiting to see if this patch is accepted before doing more. Bootstrap+testsuite on x86_64-linux-gnu. 2014-04-11 Marc Glisse marc.gli...@inria.fr * config/i386/xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps, _mm_div_ps, _mm_store_ss, _mm_cvtss_f32): Use vector extensions instead of builtins. * config/i386/emmintrin.h (_mm_store_sd, _mm_cvtsd_f64, _mm_storeh_pd, _mm_cvtsi128_si64, _mm_cvtsi128_si64x, _mm_add_pd, _mm_sub_pd, _mm_mul_pd, _mm_div_pd, _mm_storel_epi64, _mm_movepi64_pi64, _mm_loadh_pd, _mm_loadl_pd): Likewise. (_mm_sqrt_sd): Fix comment. -- Marc Glisse
Re: [PATCH] Implement -fsanitize=float-divide-by-zero
On Mon, 28 Apr 2014, Marek Polacek wrote: This patch implements -fsanitize=float-divide-by-zero option that can be used to detect division by zero even when using floating types. Most of the code in ubsan_instrument_division was ready for this so this was mainly about handling REAL_TYPE there. Ideally this would all be unneeded, you would compile your program with pragma stdc fenv_access on, on glibc you would call feenableexcept(FE_DIVBYZERO) at the beginning of the program, done (I may have listed the wrong function, I am always confused by those names). But I guess we won't be there anytime soon, so in the mean time... Since division by a floating point zero can be a valid way of obtaining infinities and NaNs, I'm not 100% sure this ought to be enabled by default (that is, enabled when -fsanitize=undefined is specified). Please don't enable it with undefined. As you say, it is well defined (except when finite-math-only is in effect). If you want a meta-category for this kind of valid thing, maybe -fsanitize=unusual or -fsanitize=suspicious. -- Marc Glisse
Re: version typeinfo for 128bit types
On Fri, 25 Apr 2014, Jonathan Wakely wrote: On 24/04/14 20:03 +0200, Marc Glisse wrote: Grep seems to indicate that the manual is the only other place that needs updating, but that can wait. Is this patch ok, assuming the tests pass? OK, and sorry for forgetting about that file in the testsuite! Done. You might also want to update the latestp check to GLIBCXX_3.4.21 and CXXABI_TM_2 since I think adding symbols to GLIBCXX_3.4.20 or CXXABI_TM_1 at this point should make a noisy error. -- Marc Glisse
emit __float128 typeinfo
Hello, the previous patch had to be reverted as it broke the strange handling of vectors in the ARM target. This new patch should be much more conservative I hope. Instead of adding this typeinfo to libsupc++, I am letting the FE know that it isn't available in libsupc++. There are 2 versions, both regtested fine. Does this approach seem ok, or do we need to try harder to find a way to get this typeinfo into libsupc++? 2014-04-25 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * rtti.c (emit_support_tinfos): Move the array... (fundamentals): ... and make it global. (typeinfo_in_lib_p): Use it. 2014-04-25 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * rtti.c (typeinfo_in_lib_p) [REAL_TYPE]: Check against a hardcoded list of available types. -- Marc GlisseIndex: gcc/cp/rtti.c === --- gcc/cp/rtti.c (revision 209787) +++ gcc/cp/rtti.c (working copy) @@ -1042,42 +1042,68 @@ class_initializer (tinfo_s *ti, tree tar for (i = 0; i n; i++) CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, va_arg (extra_inits, tree)); va_end (extra_inits); init = build_constructor (init_list_type_node, v); TREE_CONSTANT (init) = 1; TREE_STATIC (init) = 1; return init; } +/* List of types for which the typeinfo should be placed in the + runtime library. + Dummy static variable so we can put nullptr in the array; it will be + set before we actually start to walk the array. */ +static tree *const fundamentals[] = +{ + void_type_node, + boolean_type_node, + wchar_type_node, char16_type_node, char32_type_node, + char_type_node, signed_char_type_node, unsigned_char_type_node, + short_integer_type_node, short_unsigned_type_node, + integer_type_node, unsigned_type_node, + long_integer_type_node, long_unsigned_type_node, + long_long_integer_type_node, long_long_unsigned_type_node, + int128_integer_type_node, int128_unsigned_type_node, + float_type_node, double_type_node, long_double_type_node, + dfloat32_type_node, dfloat64_type_node, dfloat128_type_node, + nullptr_type_node, + 0 +}; + /* Returns true if the typeinfo for type should be placed in the runtime library. */ static bool typeinfo_in_lib_p (tree type) { /* The typeinfo objects for `T*' and `const T*' are in the runtime library for simple types T. */ if (TYPE_PTR_P (type) (cp_type_quals (TREE_TYPE (type)) == TYPE_QUAL_CONST || cp_type_quals (TREE_TYPE (type)) == TYPE_UNQUALIFIED)) type = TREE_TYPE (type); switch (TREE_CODE (type)) { -case INTEGER_TYPE: case BOOLEAN_TYPE: -case REAL_TYPE: case VOID_TYPE: case NULLPTR_TYPE: return true; +case INTEGER_TYPE: +case REAL_TYPE: + for (int ix = 0; fundamentals[ix]; ix++) + if (*fundamentals[ix] == type) + return true; + return false; + case LANG_TYPE: /* fall through. */ default: return false; } } /* Generate the initializer for the type info describing TYPE. TK_INDEX is the index of the descriptor in the tinfo_desc vector. */ @@ -1505,38 +1531,20 @@ emit_support_tinfo_1 (tree bltn) /* Emit the type_info descriptors which are guaranteed to be in the runtime support. Generating them here guarantees consistency with the other structures. We use the following heuristic to determine when the runtime is being generated. If std::__fundamental_type_info is defined, and its destructor is defined, then the runtime is being built. */ void emit_support_tinfos (void) { - /* Dummy static variable so we can put nullptr in the array; it will be - set before we actually start to walk the array. */ - static tree *const fundamentals[] = - { -void_type_node, -boolean_type_node, -wchar_type_node, char16_type_node, char32_type_node, -char_type_node, signed_char_type_node, unsigned_char_type_node, -short_integer_type_node, short_unsigned_type_node, -integer_type_node, unsigned_type_node, -long_integer_type_node, long_unsigned_type_node, -long_long_integer_type_node, long_long_unsigned_type_node, -int128_integer_type_node, int128_unsigned_type_node, -float_type_node, double_type_node, long_double_type_node, -dfloat32_type_node, dfloat64_type_node, dfloat128_type_node, -nullptr_type_node, -0 - }; int ix; tree bltn_type, dtor; push_abi_namespace (); bltn_type = xref_tag (class_type, get_identifier (__fundamental_type_info), /*tag_scope=*/ts_current, false); pop_abi_namespace (); if (!COMPLETE_TYPE_P (bltn_type)) return; Index: gcc/cp/rtti.c === --- gcc/cp/rtti.c (revision 209787) +++ gcc/cp/rtti.c (working copy) @@ -1059,25 +1059,32 @@ typeinfo_in_lib_p (tree type
Re: emit __float128 typeinfo
On Fri, 25 Apr 2014, Marc Glisse wrote: the previous patch had to be reverted as it broke the strange handling of vectors in the ARM target. This new patch should be much more conservative I hope. Instead of adding this typeinfo to libsupc++, I am letting the FE know that it isn't available in libsupc++. There are 2 versions, both regtested fine. Does this approach seem ok, or do we need to try harder to find a way to get this typeinfo into libsupc++? 2014-04-25 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * rtti.c (emit_support_tinfos): Move the array... (fundamentals): ... and make it global. (typeinfo_in_lib_p): Use it. 2014-04-25 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * rtti.c (typeinfo_in_lib_p) [REAL_TYPE]: Check against a hardcoded list of available types. It seems better with a TYPE_CANONICAL in there. It passed bootstrap and the testsuite is running. -- Marc GlisseIndex: gcc/cp/rtti.c === --- gcc/cp/rtti.c (revision 209789) +++ gcc/cp/rtti.c (working copy) @@ -1042,42 +1042,69 @@ class_initializer (tinfo_s *ti, tree tar for (i = 0; i n; i++) CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, va_arg (extra_inits, tree)); va_end (extra_inits); init = build_constructor (init_list_type_node, v); TREE_CONSTANT (init) = 1; TREE_STATIC (init) = 1; return init; } +/* List of types for which the typeinfo should be placed in the + runtime library. + Dummy static variable so we can put nullptr in the array; it will be + set before we actually start to walk the array. */ +static tree *const fundamentals[] = +{ + void_type_node, + boolean_type_node, + wchar_type_node, char16_type_node, char32_type_node, + char_type_node, signed_char_type_node, unsigned_char_type_node, + short_integer_type_node, short_unsigned_type_node, + integer_type_node, unsigned_type_node, + long_integer_type_node, long_unsigned_type_node, + long_long_integer_type_node, long_long_unsigned_type_node, + int128_integer_type_node, int128_unsigned_type_node, + float_type_node, double_type_node, long_double_type_node, + dfloat32_type_node, dfloat64_type_node, dfloat128_type_node, + nullptr_type_node, + 0 +}; + /* Returns true if the typeinfo for type should be placed in the runtime library. */ static bool typeinfo_in_lib_p (tree type) { /* The typeinfo objects for `T*' and `const T*' are in the runtime library for simple types T. */ if (TYPE_PTR_P (type) (cp_type_quals (TREE_TYPE (type)) == TYPE_QUAL_CONST || cp_type_quals (TREE_TYPE (type)) == TYPE_UNQUALIFIED)) type = TREE_TYPE (type); switch (TREE_CODE (type)) { -case INTEGER_TYPE: case BOOLEAN_TYPE: -case REAL_TYPE: case VOID_TYPE: case NULLPTR_TYPE: return true; +case INTEGER_TYPE: +case REAL_TYPE: + type = TYPE_CANONICAL (type); + for (int ix = 0; fundamentals[ix]; ix++) + if (*fundamentals[ix] == type) + return true; + return false; + case LANG_TYPE: /* fall through. */ default: return false; } } /* Generate the initializer for the type info describing TYPE. TK_INDEX is the index of the descriptor in the tinfo_desc vector. */ @@ -1505,38 +1532,20 @@ emit_support_tinfo_1 (tree bltn) /* Emit the type_info descriptors which are guaranteed to be in the runtime support. Generating them here guarantees consistency with the other structures. We use the following heuristic to determine when the runtime is being generated. If std::__fundamental_type_info is defined, and its destructor is defined, then the runtime is being built. */ void emit_support_tinfos (void) { - /* Dummy static variable so we can put nullptr in the array; it will be - set before we actually start to walk the array. */ - static tree *const fundamentals[] = - { -void_type_node, -boolean_type_node, -wchar_type_node, char16_type_node, char32_type_node, -char_type_node, signed_char_type_node, unsigned_char_type_node, -short_integer_type_node, short_unsigned_type_node, -integer_type_node, unsigned_type_node, -long_integer_type_node, long_unsigned_type_node, -long_long_integer_type_node, long_long_unsigned_type_node, -int128_integer_type_node, int128_unsigned_type_node, -float_type_node, double_type_node, long_double_type_node, -dfloat32_type_node, dfloat64_type_node, dfloat128_type_node, -nullptr_type_node, -0 - }; int ix; tree bltn_type, dtor; push_abi_namespace (); bltn_type = xref_tag (class_type, get_identifier (__fundamental_type_info), /*tag_scope=*/ts_current, false); pop_abi_namespace (); if (!COMPLETE_TYPE_P (bltn_type)) return; Index: gcc/cp/rtti.c
Re: [c++] typeinfo for target types
On Thu, 24 Apr 2014, Ramana Radhakrishnan wrote: On Wed, Apr 23, 2014 at 8:43 PM, Marc Glisse marc.gli...@inria.fr wrote: On Wed, 23 Apr 2014, Richard Henderson wrote: On 04/13/2014 01:41 AM, Marc Glisse wrote: Hello, this patch generates typeinfo for target types. On x86_64, it adds these 6 lines to nm -C libsupc++.a. A follow-up patch will be needed to export and version those in the shared library. + V typeinfo for __float128 + V typeinfo for __float128 const* + V typeinfo for __float128* + V typeinfo name for __float128 + V typeinfo name for __float128 const* + V typeinfo name for __float128* Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in tsan/tls_race.c). 2014-04-13 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 gcc/c-family/ * c-common.c (registered_builtin_types): Make non-static. * c-common.h (registered_builtin_types): Declare. gcc/cp/ * rtti.c (emit_support_tinfo_1): New function, extracted from emit_support_tinfos. (emit_support_tinfos): Call it and iterate on registered_builtin_types. This is causing aarch64 builds to break. If it is causing too much trouble, we could ifdef out the last 2 lines of emit_support_tinfos and revert the libstdc++ changes (or even revert the whole thing). Any c++ compilation aborts at That's surprising, the code I touched is only ever supposed to run while compiling one file in libsupc++, if I understand correctly. #0 fancy_abort (file=0x14195c8 ../../git-rh/gcc/cp/mangle.c, line=2303, function=0x1419ff8 write_builtin_type(tree_node*)::__FUNCTION__ write_builtin_type) at ../../git-rh/gcc/diagnostic.c:1190 #1 0x007ce2b4 in write_builtin_type ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/mangle.c:2303 #2 0x007cc85c in write_type ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/mangle.c:1969 #3 0x007d4d98 in mangle_special_for_type ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df, code=0x1419a98 TI) at ../../git-rh/gcc/cp/mangle.c:3569 #4 0x007d4dcc in mangle_typeinfo_for_type ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/mangle.c:3585 #5 0x0070618c in get_tinfo_decl ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/rtti.c:422 #6 0x00709ff0 in emit_support_tinfo_1 ( bltn=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/rtti.c:1485 #7 0x0070a344 in emit_support_tinfos () at ../../git-rh/gcc/cp/rtti.c:1550 Presumably the backend needs to grow some mangling support for its builtins, aarch64 has complicated builtins... __builtin_aarch64_simd_df uses double_aarch64_type_node which is not the same as double_type_node. I mostly looked at the x86 backend, so I didn't notice that aarch64 registers a lot more builtins. but in the meantime can we do something less drastic than abort? Sounds good, but I am not sure how exactly. We could use a separate hook (register_builtin_type_for_typeinfo?) so back-ends have to explicitly say they want typeinfo, but it is ugly having to register types multiple times. We could add a parameter to the existing register_builtin_type saying whether we want typeinfo, but that means updating all back-ends. We could also make typeinfo_in_lib_p more strict so for REAL_TYPE it only returns true for the types listed in fundamentals. Well some of these scalar types are not really user visible which is where I believe the problem is coming from and prima-facie I don't think we should be inventing mangling for some of these internal types. If the types are not user-visible, it is not clear to me why they need to be registered with the front-end... We could get the mangling functions to take a parameter that says whether errors should be fatal and skip generating the typeinfo when we can't mangle, but there is no convenient way to communicate this mangling failure (0 bytes written?). Would mangling the aarch64 builtins be a lot of work? Did other platforms break as well? It's not a lot of work but I'd like to make sure we're doing the right thing on both AArch32 and AArch64. So, for now can we just revert this till the thing is sorted out. Ok, I'll commit the attached as soon as I've checked it isn't too broken. It is not a complete revert: splitting the rtti function is still cleaner, and the int128 symbols are still there. 2014-04-24 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 gcc/cp/ * rtti.c (emit_support_tinfos): Do not iterate on registered_builtin_types (partial revert). libstdc++/ * config/abi/pre/gnu.ver (CXXABI_1.3.9): Remove __float128 symbols. * config/abi/pre/gnu-versioned-namespace.ver: Likewise. * config/abi/post/x86_64
Re: [c++] typeinfo for target types
On Thu, 24 Apr 2014, Ramana Radhakrishnan wrote: Well some of these scalar types are not really user visible which is where I believe the problem is coming from and prima-facie I don't think we should be inventing mangling for some of these internal types. If the types are not user-visible, it is not clear to me why they need to be registered with the front-end... The vector types that are built on this are user visible, so I suspect that's why the scalar types need to be registered with the front-end. A lot of this comes from the original support for the intrinsics way that goes quite some time back so there is some digging needed here. Problematic part of the patch was reverted, arm and aarch64 should be ok now. Please do investigate... -- Marc Glisse
Re: version typeinfo for 128bit types
On Thu, 24 Apr 2014, Rainer Orth wrote: Marc Glisse marc.gli...@inria.fr writes: this is a follow-up for this patch: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00618.html once committed, g++ will generate typeinfo for __float128, and it needs versioning. While there, I noticed that __int128 has typeinfo but not typeinfo name, so I am adding it. I manually checked that the new symbols were exactly the 12 I expected, with the new version number. I did not test the gnu-versioned-namespace version. I manually updated baseline for x86_64. It is awfully inconvenient to do. I was expecting make new-abi-baseline to generate it for me, but it gives me plenty of extra symbols compared to the current one. Some random examples: It shouldn't be necessary to update all baselines whenever you add a new version to libstdc++.so.6. It seems to me that when you added CXXABI_1.3.9, you forgot to update libstdc++-v3/testsuite/util/testsuite_abi.cc for that. I had no idea this file even existed, thanks for the pointer! It makes so much more sense this way :-) Grep seems to indicate that the manual is the only other place that needs updating, but that can wait. Is this patch ok, assuming the tests pass? 2014-04-24 Marc Glisse marc.gli...@inria.fr * testsuite/util/testsuite_abi.cc (check_version): Update for CXXABI_1.3.9. -- Marc GlisseIndex: libstdc++-v3/testsuite/util/testsuite_abi.cc === --- libstdc++-v3/testsuite/util/testsuite_abi.cc(revision 209755) +++ libstdc++-v3/testsuite/util/testsuite_abi.cc(working copy) @@ -203,38 +203,39 @@ check_version(symbol test, bool added) known_versions.push_back(CXXABI_1.3); known_versions.push_back(CXXABI_LDBL_1.3); known_versions.push_back(CXXABI_1.3.1); known_versions.push_back(CXXABI_1.3.2); known_versions.push_back(CXXABI_1.3.3); known_versions.push_back(CXXABI_1.3.4); known_versions.push_back(CXXABI_1.3.5); known_versions.push_back(CXXABI_1.3.6); known_versions.push_back(CXXABI_1.3.7); known_versions.push_back(CXXABI_1.3.8); + known_versions.push_back(CXXABI_1.3.9); known_versions.push_back(CXXABI_TM_1); } compat_list::iterator begin = known_versions.begin(); compat_list::iterator end = known_versions.end(); // Check for compatible version. if (test.version_name.size()) { compat_list::iterator it1 = find(begin, end, test.version_name); compat_list::iterator it2 = find(begin, end, test.name); if (it1 != end) test.version_status = symbol::compatible; else test.version_status = symbol::incompatible; // Check that added symbols are added in the latest pre-release version. bool latestp = (test.version_name == GLIBCXX_3.4.20 -|| test.version_name == CXXABI_1.3.8 +|| test.version_name == CXXABI_1.3.9 || test.version_name == CXXABI_TM_1); if (added !latestp) test.version_status = symbol::incompatible; // Check that long double compatibility symbols demangled as // __float128 are put into some _LDBL_ version name. if (added test.demangled_name.find(__float128) != std::string::npos) { // Has to be in _LDBL_ version name. if (test.version_name.find(_LDBL_) == std::string::npos)
Re: [C PATCH] Improve error on attrs after declarator in a fndef (PR c/60915)
On Thu, 24 Apr 2014, Marek Polacek wrote: This PR is about not very clear error message when one tries to add attributes *after* the declarator in a function definition. cc1plus already handles this well, so I used the same message. I thought you had an earlier version of the patch where, instead of forbidden, you said move the attribute so users would know that writing the attribute *before* was fine? That sounded much better to me. -- Marc Glisse
Re: Optimize n?rotate(x,n):x
Honza, any comment on Richard's question? On Tue, 15 Apr 2014, Richard Biener wrote: On Mon, Apr 14, 2014 at 6:40 PM, Marc Glisse marc.gli...@inria.fr wrote: On Mon, 14 Apr 2014, Richard Biener wrote: + /* If the special case has a high probability, keep it. */ + if (EDGE_PRED (middle_bb, 0)-probability PROB_EVEN) I suppose Honza has a comment on how to test this properly (not sure if -probability or -frequency is always initialized properly). for example single_likely_edge tests profile_status_for_fn != PROFILE_ABSENT (and uses a fixed probability value ...). Anyway, the comparison looks backwards to me, but maybe I'm missing sth - I'd use = PROB_LIKELY ;) Maybe the comment is confusing? middle_bb contains the expensive operation (say a/b) that the special case skips entirely. If the division happens in less than 50% of cases (that's the proba of the edge going from cond to middle_bb), then doing the comparison+jump may be cheaper and I abort the optimization. At least the testcase with __builtin_expect should prove that I didn't do it backwards. Ah, indeed. My mistake. value-prof seems to use 50% as the cut-off where it may become interesting to special case division, hence my choice of PROB_EVEN. I am not sure which way you want to use PROB_LIKELY (80%). If we have more than 80% chances of executing the division, always perform it? Or if we have more than 80% chances of skipping the division, keep the branch? Ok, if it's from value-prof then that's fine. The patch is ok if Honza doesn't have any comments on whether it's ok to look at -probability unconditionally. Thanks, Richard. Attached is the latest version (passed the testsuite). Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (working copy) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-phiopt1 } */ + +int f(int a, int b, int c) { + if (c 5) return c; + if (a == 0) return b; + return a + b; +} + +unsigned rot(unsigned x, int n) { + const int bits = __CHAR_BIT__ * __SIZEOF_INT__; + return (n == 0) ? x : ((x n) | (x (bits - n))); +} + +unsigned m(unsigned a, unsigned b) { + if (a == 0) +return 0; + else +return a b; +} + +/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */ +/* { dg-final { cleanup-tree-dump phiopt1 } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (working copy) @@ -0,0 +1,19 @@ +/* { dg-do compile { target x86_64-*-* } } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +int f(int a, int b) { + if (__builtin_expect(a == 0, 1)) return b; + return a + b; +} + +// optab_handler can handle if(b==1) but not a/b +// so we consider a/b too expensive. +unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) { + if (b == 1) +return a; + else +return a / b; +} + +/* { dg-final { scan-tree-dump-times goto 4 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 209353) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -140,20 +140,37 @@ static bool gate_hoist_loads (void); x = PHI (CONST, a) Gets replaced with: bb0: bb2: t1 = a == CONST; t2 = b c; t3 = t1 t2; x = a; + + It also replaces + + bb0: + if (a != 0) goto bb1; else goto bb2; + bb1: + c = a + b; + bb2: + x = PHI c (bb1), b (bb0), ...; + + with + + bb0: + c = a + b; + bb2: + x = PHI c (bb0), ...; + ABS Replacement --- This transformation, implemented in abs_replacement, replaces bb0: if (a = 0) goto bb2; else goto bb1; bb1: x = -a; bb2: @@ -809,20 +826,103 @@ operand_equal_for_value_replacement (con if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp)) return true; tmp = gimple_assign_rhs2 (def); if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp)) return true; return false; } +/* Returns true if ARG is a neutral element for operation CODE + on the RIGHT side. */ + +static bool +neutral_element_p (tree_code code, tree arg, bool right) +{ + switch (code) +{ +case PLUS_EXPR: +case BIT_IOR_EXPR: +case BIT_XOR_EXPR: + return integer_zerop (arg); + +case LROTATE_EXPR: +case RROTATE_EXPR: +case LSHIFT_EXPR: +case RSHIFT_EXPR: +case MINUS_EXPR: +case POINTER_PLUS_EXPR: + return right integer_zerop (arg); + +case MULT_EXPR: + return integer_onep (arg); + +case TRUNC_DIV_EXPR: +case CEIL_DIV_EXPR
Re: [i386] define __SIZEOF_FLOAT128__
(Adding an i386 maintainer in Cc) http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00620.html On Sun, 13 Apr 2014, Marc Glisse wrote: Hello, some people like having a macro to test if a type is available (__SIZEOF_INT128__ for instance). This adds macros for __float80 and __float128. The types seem to be always available, so I didn't add any condition. If you think this is a bad idea, please close the PR. Bootstrap+testsuite on x86_64-linux-gnu. 2014-04-13 Marc Glisse marc.gli...@inria.fr PR preprocessor/56540 * config/i386/i386-c.c (ix86_target_macros): Define __SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__. -- Marc Glisse
Re: [c++] typeinfo for target types
On Wed, 23 Apr 2014, Richard Henderson wrote: On 04/13/2014 01:41 AM, Marc Glisse wrote: Hello, this patch generates typeinfo for target types. On x86_64, it adds these 6 lines to nm -C libsupc++.a. A follow-up patch will be needed to export and version those in the shared library. + V typeinfo for __float128 + V typeinfo for __float128 const* + V typeinfo for __float128* + V typeinfo name for __float128 + V typeinfo name for __float128 const* + V typeinfo name for __float128* Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in tsan/tls_race.c). 2014-04-13 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 gcc/c-family/ * c-common.c (registered_builtin_types): Make non-static. * c-common.h (registered_builtin_types): Declare. gcc/cp/ * rtti.c (emit_support_tinfo_1): New function, extracted from emit_support_tinfos. (emit_support_tinfos): Call it and iterate on registered_builtin_types. This is causing aarch64 builds to break. If it is causing too much trouble, we could ifdef out the last 2 lines of emit_support_tinfos and revert the libstdc++ changes (or even revert the whole thing). Any c++ compilation aborts at That's surprising, the code I touched is only ever supposed to run while compiling one file in libsupc++, if I understand correctly. #0 fancy_abort (file=0x14195c8 ../../git-rh/gcc/cp/mangle.c, line=2303, function=0x1419ff8 write_builtin_type(tree_node*)::__FUNCTION__ write_builtin_type) at ../../git-rh/gcc/diagnostic.c:1190 #1 0x007ce2b4 in write_builtin_type ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/mangle.c:2303 #2 0x007cc85c in write_type ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/mangle.c:1969 #3 0x007d4d98 in mangle_special_for_type ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df, code=0x1419a98 TI) at ../../git-rh/gcc/cp/mangle.c:3569 #4 0x007d4dcc in mangle_typeinfo_for_type ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/mangle.c:3585 #5 0x0070618c in get_tinfo_decl ( type=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/rtti.c:422 #6 0x00709ff0 in emit_support_tinfo_1 ( bltn=real_type 0x7fb1653540 __builtin_aarch64_simd_df) at ../../git-rh/gcc/cp/rtti.c:1485 #7 0x0070a344 in emit_support_tinfos () at ../../git-rh/gcc/cp/rtti.c:1550 Presumably the backend needs to grow some mangling support for its builtins, aarch64 has complicated builtins... __builtin_aarch64_simd_df uses double_aarch64_type_node which is not the same as double_type_node. I mostly looked at the x86 backend, so I didn't notice that aarch64 registers a lot more builtins. but in the meantime can we do something less drastic than abort? Sounds good, but I am not sure how exactly. We could use a separate hook (register_builtin_type_for_typeinfo?) so back-ends have to explicitly say they want typeinfo, but it is ugly having to register types multiple times. We could add a parameter to the existing register_builtin_type saying whether we want typeinfo, but that means updating all back-ends. We could get the mangling functions to take a parameter that says whether errors should be fatal and skip generating the typeinfo when we can't mangle, but there is no convenient way to communicate this mangling failure (0 bytes written?). Would mangling the aarch64 builtins be a lot of work? Did other platforms break as well? Isn't this only really an issue if someone tries to access one of these types via typeinfo? Yes. -- Marc Glisse
Re: Optimize n?rotate(x,n):x
On Wed, 23 Apr 2014, Jan Hubicka wrote: + /* Now optimize (x != 0) ? x + y : y to just y. + The following condition is too restrictive, there can easily be another + stmt in middle_bb, for instance a CONVERT_EXPR for the second argument. */ + gimple assign = last_and_only_stmt (middle_bb); + if (!assign || gimple_code (assign) != GIMPLE_ASSIGN + || gimple_assign_rhs_class (assign) != GIMPLE_BINARY_RHS + || (!INTEGRAL_TYPE_P (TREE_TYPE (arg0)) + !POINTER_TYPE_P (TREE_TYPE (arg0 +return 0; + + /* assign may call a libgcc routine, which is slow. */ + if (!is_cheap_stmt (assign) is_cheap_stmt (cond)) +return 0; + + /* If the special case has a high probability, keep it. */ + if (EDGE_PRED (middle_bb, 0)-probability PROB_EVEN) +return 0; Well, I would expect this transformation to be win always, + operation is virtually for free. Concerning profile - you must consider two cases. If the profile is guessed then the probability is most probably 71% to the non-zero case unless user used an expect (since I would expect only PRED_OPCODE_NONEQUAL to match). This is data dependent branch unless this sits in a loop and those are badly predicted statically. If the probability is read, then probabilities over (say) 10% and less than 90% means that the branch is likely not very well predictable (it still may be on advanced architectures if it is predictable from context) and getting rid of it is a good idea. So if you want to disable the optimization for x being likely zero, I guess testing that probability is over PROV_LIKELY is a resonable threshold - it will handle both builtin_expect and the (very) badly predictable branches. Maybe for FDO it should be higher, but we would need to do some research on it that is probably not worth the effort. The division transformation is bit different story, since cost of division is more than cost of branch and the 50% threshold is a limit for one value counter to be reliable. Thank you for the comments. If I understand correctly: - it is always ok to look at edge-probability (no need to check that the probabilities are available as Richard feared) - PROB_EVEN is an ok threshold for division (not sure I understood your last sentence) - for cheaper operations like addition, I should be less conservative and do the transformation always, or use a threshold of PROB_UNLIKELY. Are there other operations than division (among those listed in neutral_element_p or absorbing_element_p) that fall in the expensive category? I guess there are some platforms where multiplication is expensive, but few, and querying the target for the cost of operations seems exagerated. So I would go with: [move definition of code_def] if ((code_def == TRUNC_DIV_EXPR || code_def == CEIL_DIV_EXPR || code_def == FLOOR_DIV_EXPR || code_def == ROUND_DIV_EXPR || code_def == EXACT_DIV_EXPR) EDGE_PRED (middle_bb, 0)-probability PROB_EVEN) return 0; (and change the testsuite example with __builtin_expect to use division) or (my favorite): [move definition of code_def] int threshold = PROB_UNLIKELY; if (code_def == TRUNC_DIV_EXPR || code_def == CEIL_DIV_EXPR || code_def == FLOOR_DIV_EXPR || code_def == ROUND_DIV_EXPR || code_def == EXACT_DIV_EXPR) threshold = PROB_EVEN; if (EDGE_PRED (middle_bb, 0)-probability threshold) return 0; Is that ok, after re-testing? -- Marc Glisse
Re: [i386] define __SIZEOF_FLOAT128__
On Wed, 23 Apr 2014, H.J. Lu wrote: On Wed, Apr 23, 2014 at 11:48 AM, Marc Glisse marc.gli...@inria.fr wrote: (Adding an i386 maintainer in Cc) http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00620.html On Sun, 13 Apr 2014, Marc Glisse wrote: Hello, some people like having a macro to test if a type is available (__SIZEOF_INT128__ for instance). This adds macros for __float80 and __float128. The types seem to be always available, so I didn't add any condition. If you think this is a bad idea, please close the PR. Bootstrap+testsuite on x86_64-linux-gnu. 2014-04-13 Marc Glisse marc.gli...@inria.fr PR preprocessor/56540 * config/i386/i386-c.c (ix86_target_macros): Define __SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__. For __SIZEOF_FLOAT80__, you should check TARGET_128BIT_LONG_DOUBLE instead of TARGET_64BIT. Good point, thanks! It now matches i386-modes.def. Is this version (same changelog) ok? -- Marc GlisseIndex: gcc/config/i386/i386-c.c === --- gcc/config/i386/i386-c.c(revision 209721) +++ gcc/config/i386/i386-c.c(working copy) @@ -511,20 +511,27 @@ ix86_target_macros (void) if (!TARGET_80387) cpp_define (parse_in, _SOFT_FLOAT); if (TARGET_LONG_DOUBLE_64) cpp_define (parse_in, __LONG_DOUBLE_64__); if (TARGET_LONG_DOUBLE_128) cpp_define (parse_in, __LONG_DOUBLE_128__); + if (TARGET_128BIT_LONG_DOUBLE) +cpp_define (parse_in, __SIZEOF_FLOAT80__=16); + else +cpp_define (parse_in, __SIZEOF_FLOAT80__=12); + + cpp_define (parse_in, __SIZEOF_FLOAT128__=16); + cpp_define_formatted (parse_in, __ATOMIC_HLE_ACQUIRE=%d, IX86_HLE_ACQUIRE); cpp_define_formatted (parse_in, __ATOMIC_HLE_RELEASE=%d, IX86_HLE_RELEASE); ix86_target_macros_internal (ix86_isa_flags, ix86_arch, ix86_tune, ix86_fpmath, cpp_define); }
Re: [PATCH] Simplify a VEC_SELECT fed by its own inverse
On Mon, 21 Apr 2014, Richard Henderson wrote: On 04/21/2014 01:19 PM, Bill Schmidt wrote: + if (GET_CODE (trueop0) == VEC_SELECT + GET_MODE (XEXP (trueop0, 0)) == mode) + { + rtx op0_subop1 = XEXP (trueop0, 1); + gcc_assert (GET_CODE (op0_subop1) == PARALLEL); + gcc_assert (XVECLEN (trueop1, 0) == GET_MODE_NUNITS (mode)); + + /* Apply the outer ordering vector to the inner one. (The inner +ordering vector is expressly permitted to be of a different +length than the outer one.) If the result is { 0, 1, ..., n-1 } +then the two VEC_SELECTs cancel. */ + for (int i = 0; i XVECLEN (trueop1, 0); ++i) + { + rtx x = XVECEXP (trueop1, 0, i); + gcc_assert (CONST_INT_P (x)); + rtx y = XVECEXP (op0_subop1, 0, INTVAL (x)); + gcc_assert (CONST_INT_P (y)); In two places you're asserting that you've got a constant permutation. Surely there should be a non-assertion check and graceful exit for either select to be a variable permutation. Note that in the case where trueop0 is a CONST_VECTOR, we already check each element of trueop1: gcc_assert (CONST_INT_P (x)); In the case where the result is a scalar, we also have: gcc_assert (CONST_INT_P (XVECEXP (trueop1, 0, 0))); so we will have other issues if something ever creates a variable vec_select. Not that a graceful exit will hurt of course. -- Marc Glisse
Re: version typeinfo for 128bit types
On Tue, 22 Apr 2014, Jonathan Wakely wrote: On 14 April 2014 11:28, Jonathan Wakely wrote: On 14 April 2014 10:39, Marc Glisse wrote: PR libstdc++/43622 * config/abi/pre/gnu.ver (CXXABI_1.3.9): New version, new symbols. * config/abi/pre/gnu-versioned-namespace.ver: New symbols. * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Likewise. I'd like to wait until 4.9.0 has been released before adding new versions to the library, to avoid unnecessary differences between trunk and the 4.9.0 release candidate(s). I was about to say the patch is OK for trunk now, but noticed that the new CXXABI_1.3.9 block is added after the CXXABI_TM one, not after CXXABI_1.3.8 ... was that intentional? I don't have any opinion, I'll move it just after 1.3.8 if that seems more logical to you. -- Marc Glisse
Re: version typeinfo for 128bit types
Hello, as written in the PR, my patch seems wrong for platforms like powerpc that already had the __float128 typeinfo for long double with a different version. The following patch regtested fine on x86_64, and a hackish cross-build shows that float128.ver is ignored on powerpc (good). 2014-04-23 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * config/abi/pre/float128.ver: New file. * config/abi/pre/gnu.ver (CXXABI_1.3.9): Move __float128 typeinfo to the new file. * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update. * configure.ac: Use float128.ver when relevant. * configure: Regenerate. -- Marc GlisseIndex: libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt === --- libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt (revision 209658) +++ libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt (working copy) @@ -2514,20 +2514,21 @@ FUNC:atomic_flag_test_and_set_explicit@@ OBJECT:0:CXXABI_1.3 OBJECT:0:CXXABI_1.3.1 OBJECT:0:CXXABI_1.3.2 OBJECT:0:CXXABI_1.3.3 OBJECT:0:CXXABI_1.3.4 OBJECT:0:CXXABI_1.3.5 OBJECT:0:CXXABI_1.3.6 OBJECT:0:CXXABI_1.3.7 OBJECT:0:CXXABI_1.3.8 OBJECT:0:CXXABI_1.3.9 +OBJECT:0:CXXABI_FLOAT128_1.3.9 OBJECT:0:CXXABI_TM_1 OBJECT:0:GLIBCXX_3.4 OBJECT:0:GLIBCXX_3.4.1 OBJECT:0:GLIBCXX_3.4.10 OBJECT:0:GLIBCXX_3.4.11 OBJECT:0:GLIBCXX_3.4.12 OBJECT:0:GLIBCXX_3.4.13 OBJECT:0:GLIBCXX_3.4.14 OBJECT:0:GLIBCXX_3.4.15 OBJECT:0:GLIBCXX_3.4.16 @@ -2618,21 +2619,21 @@ OBJECT:16:_ZTISt16nested_exception@@CXXA OBJECT:16:_ZTISt8ios_base@@GLIBCXX_3.4 OBJECT:16:_ZTISt9exception@@GLIBCXX_3.4 OBJECT:16:_ZTISt9time_base@@GLIBCXX_3.4 OBJECT:16:_ZTISt9type_info@@GLIBCXX_3.4 OBJECT:16:_ZTIa@@CXXABI_1.3 OBJECT:16:_ZTIb@@CXXABI_1.3 OBJECT:16:_ZTIc@@CXXABI_1.3 OBJECT:16:_ZTId@@CXXABI_1.3 OBJECT:16:_ZTIe@@CXXABI_1.3 OBJECT:16:_ZTIf@@CXXABI_1.3 -OBJECT:16:_ZTIg@@CXXABI_1.3.9 +OBJECT:16:_ZTIg@@CXXABI_FLOAT128_1.3.9 OBJECT:16:_ZTIh@@CXXABI_1.3 OBJECT:16:_ZTIi@@CXXABI_1.3 OBJECT:16:_ZTIj@@CXXABI_1.3 OBJECT:16:_ZTIl@@CXXABI_1.3 OBJECT:16:_ZTIm@@CXXABI_1.3 OBJECT:16:_ZTIn@@CXXABI_1.3.5 OBJECT:16:_ZTIo@@CXXABI_1.3.5 OBJECT:16:_ZTIs@@CXXABI_1.3 OBJECT:16:_ZTIt@@CXXABI_1.3 OBJECT:16:_ZTIv@@CXXABI_1.3 @@ -3119,21 +3120,21 @@ OBJECT:2:_ZNSt10ctype_base5printE@@GLIBC OBJECT:2:_ZNSt10ctype_base5punctE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base5spaceE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base5upperE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base6xdigitE@@GLIBCXX_3.4 OBJECT:2:_ZTSa@@CXXABI_1.3 OBJECT:2:_ZTSb@@CXXABI_1.3 OBJECT:2:_ZTSc@@CXXABI_1.3 OBJECT:2:_ZTSd@@CXXABI_1.3 OBJECT:2:_ZTSe@@CXXABI_1.3 OBJECT:2:_ZTSf@@CXXABI_1.3 -OBJECT:2:_ZTSg@@CXXABI_1.3.9 +OBJECT:2:_ZTSg@@CXXABI_FLOAT128_1.3.9 OBJECT:2:_ZTSh@@CXXABI_1.3 OBJECT:2:_ZTSi@@CXXABI_1.3 OBJECT:2:_ZTSj@@CXXABI_1.3 OBJECT:2:_ZTSl@@CXXABI_1.3 OBJECT:2:_ZTSm@@CXXABI_1.3 OBJECT:2:_ZTSn@@CXXABI_1.3.9 OBJECT:2:_ZTSo@@CXXABI_1.3.9 OBJECT:2:_ZTSs@@CXXABI_1.3 OBJECT:2:_ZTSt@@CXXABI_1.3 OBJECT:2:_ZTSv@@CXXABI_1.3 @@ -3153,41 +3154,41 @@ OBJECT:32:_ZTIPKDe@@CXXABI_1.3.4 OBJECT:32:_ZTIPKDf@@CXXABI_1.3.4 OBJECT:32:_ZTIPKDi@@CXXABI_1.3.3 OBJECT:32:_ZTIPKDn@@CXXABI_1.3.5 OBJECT:32:_ZTIPKDs@@CXXABI_1.3.3 OBJECT:32:_ZTIPKa@@CXXABI_1.3 OBJECT:32:_ZTIPKb@@CXXABI_1.3 OBJECT:32:_ZTIPKc@@CXXABI_1.3 OBJECT:32:_ZTIPKd@@CXXABI_1.3 OBJECT:32:_ZTIPKe@@CXXABI_1.3 OBJECT:32:_ZTIPKf@@CXXABI_1.3 -OBJECT:32:_ZTIPKg@@CXXABI_1.3.9 +OBJECT:32:_ZTIPKg@@CXXABI_FLOAT128_1.3.9 OBJECT:32:_ZTIPKh@@CXXABI_1.3 OBJECT:32:_ZTIPKi@@CXXABI_1.3 OBJECT:32:_ZTIPKj@@CXXABI_1.3 OBJECT:32:_ZTIPKl@@CXXABI_1.3 OBJECT:32:_ZTIPKm@@CXXABI_1.3 OBJECT:32:_ZTIPKn@@CXXABI_1.3.5 OBJECT:32:_ZTIPKo@@CXXABI_1.3.5 OBJECT:32:_ZTIPKs@@CXXABI_1.3 OBJECT:32:_ZTIPKt@@CXXABI_1.3 OBJECT:32:_ZTIPKv@@CXXABI_1.3 OBJECT:32:_ZTIPKw@@CXXABI_1.3 OBJECT:32:_ZTIPKx@@CXXABI_1.3 OBJECT:32:_ZTIPKy@@CXXABI_1.3 OBJECT:32:_ZTIPa@@CXXABI_1.3 OBJECT:32:_ZTIPb@@CXXABI_1.3 OBJECT:32:_ZTIPc@@CXXABI_1.3 OBJECT:32:_ZTIPd@@CXXABI_1.3 OBJECT:32:_ZTIPe@@CXXABI_1.3 OBJECT:32:_ZTIPf@@CXXABI_1.3 -OBJECT:32:_ZTIPg@@CXXABI_1.3.9 +OBJECT:32:_ZTIPg@@CXXABI_FLOAT128_1.3.9 OBJECT:32:_ZTIPh@@CXXABI_1.3 OBJECT:32:_ZTIPi@@CXXABI_1.3 OBJECT:32:_ZTIPj@@CXXABI_1.3 OBJECT:32:_ZTIPl@@CXXABI_1.3 OBJECT:32:_ZTIPm@@CXXABI_1.3 OBJECT:32:_ZTIPn@@CXXABI_1.3.5 OBJECT:32:_ZTIPo@@CXXABI_1.3.5 OBJECT:32:_ZTIPs@@CXXABI_1.3 OBJECT:32:_ZTIPt@@CXXABI_1.3 OBJECT:32:_ZTIPv@@CXXABI_1.3 @@ -3228,21 +3229,21 @@ OBJECT:39:_ZTSSt13basic_filebufIwSt11cha OBJECT:39:_ZTSSt13basic_fstreamIcSt11char_traitsIcEE@@GLIBCXX_3.4 OBJECT:39:_ZTSSt13basic_fstreamIwSt11char_traitsIwEE@@GLIBCXX_3.4 OBJECT:39:_ZTSSt13basic_istreamIwSt11char_traitsIwEE@@GLIBCXX_3.4 OBJECT:39:_ZTSSt13basic_ostreamIwSt11char_traitsIwEE@@GLIBCXX_3.4 OBJECT:3:_ZTSPa@@CXXABI_1.3 OBJECT:3:_ZTSPb@@CXXABI_1.3 OBJECT:3:_ZTSPc@@CXXABI_1.3 OBJECT:3:_ZTSPd@@CXXABI_1.3
Re: [PATCH] Simplify a VEC_SELECT fed by its own inverse
On Mon, 21 Apr 2014, Bill Schmidt wrote: Note that it would be possible to do a more general transformation here, in which any vec_select feeding another could be replaced by a vec_select performing the composite function of the other two. I have not done this because I am unaware of this situation arising in practice. If it's desirable, I can extend the patch in this direction. It does arise, but I think it isn't done because not all permutations are (optimally) supported by all targets. Index: gcc/simplify-rtx.c === --- gcc/simplify-rtx.c (revision 209516) +++ gcc/simplify-rtx.c (working copy) @@ -3673,6 +3673,34 @@ simplify_binary_operation_1 (enum rtx_code code, e } } + /* If we have two nested selects that are inverses of each +other, replace them with the source operand. */ + if (GET_CODE (trueop0) == VEC_SELECT) + { + enum machine_mode reg_mode = GET_MODE (XEXP (trueop0, 0)); + rtx op0_subop1 = XEXP (trueop0, 1); + gcc_assert (VECTOR_MODE_P (reg_mode)); + gcc_assert (GET_MODE_INNER (mode) == GET_MODE_INNER (reg_mode)); + gcc_assert (GET_CODE (op0_subop1) == PARALLEL); + + if (XVECLEN (trueop1, 0) == XVECLEN (op0_subop1, 0)) + { + /* Apply the second ordering vector to the first. +If the result is { 0, 1, ..., n-1 } then the +two VEC_SELECTs cancel. */ + for (int i = 0; i XVECLEN (trueop1, 0); ++i) + { + rtx x = XVECEXP (trueop1, 0, i); + gcc_assert (CONST_INT_P (x)); + rtx y = XVECEXP (op0_subop1, 0, INTVAL (x)); + gcc_assert (CONST_INT_P (y)); + if (i != INTVAL (y)) + return 0; + } + return XEXP (trueop0, 0); + } + } I may have missed it, but don't you want to check that what you are returning has the right mode/length (or generate the obvious vec_select otherwise)? I don't know if any platform has such constructions (probably not), but in principle you could start from a vector of size 4, extract {1,0} from it, extract {1,0} from that, and you don't want to return the initial vector as is. On the other hand, I don't think you really care whether trueop1 is smaller than op0_subop1. Starting from a vector of size 2, extracting {1,0,1,0} then {3,0} gives the initial vector just fine. -- Marc Glisse
Re: [build] PR 43538: Don't overwrite CXXFLAGS_FOR_TARGET in config/mt-gnu
Ping http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01480.html On Thu, 23 Jan 2014, Marc Glisse wrote: Hello, although setting CFLAGS_FOR_TARGET before compiling gcc works fine, CXXFLAGS_FOR_TARGET is ignored. I don't see any good reason for that. I tested the patch by doing a regular bootstrap+testsuite on x86_64-unknown-linux-gnu. I also did a non-bootstrap build where I set CXXFLAGS_FOR_TARGET and checked that it now propagates to libstdc++ and others. config/ChangeLog: 2014-01-23 Marc Glisse marc.gli...@inria.fr PR target/43538 * mt-gnu: Don't reset CXXFLAGS_FOR_TARGET. -- Marc GlisseIndex: config/mt-gnu === --- config/mt-gnu (revision 209514) +++ config/mt-gnu (working copy) @@ -1 +1 @@ -CXXFLAGS_FOR_TARGET = $(CXXFLAGS) -D_GNU_SOURCE +CXXFLAGS_FOR_TARGET += -D_GNU_SOURCE
Re: [C PATCH] Warn if switch has boolean value (PR c/60439)
On Fri, 18 Apr 2014, Steven Bosscher wrote: On Fri, Apr 18, 2014 at 7:30 AM, Marek Polacek wrote: + if (TREE_CODE (type) == BOOLEAN_TYPE + || exp_code == TRUTH_ANDIF_EXPR + || exp_code == TRUTH_AND_EXPR + || exp_code == TRUTH_ORIF_EXPR + || exp_code == TRUTH_OR_EXPR + || exp_code == TRUTH_XOR_EXPR + || exp_code == TRUTH_NOT_EXPR + || exp_code == EQ_EXPR + || exp_code == NE_EXPR + || exp_code == LE_EXPR + || exp_code == GE_EXPR + || exp_code == LT_EXPR + || exp_code == GT_EXPR) Is there a TREE_CODE_CLASS or a #define for this? truth_value_p -- Marc Glisse
Re: calloc = malloc + memset
., then there is no zero termination. Sure, calling strlen or strchr etc. on the result is undefined behavior, just wondering if it isn't going to break anything. Though, the result is zero length and thus any dereferencing would be a bug, so perhaps we are fine. You know the pass better than I do... I don't think the pass does anything like: all 0-length strings are equivalent, let's replace one by another one, so I think we should be ok, but I don't mind treating calloc like malloc here if you think it is too risky, I don't think people call strlen(calloc(a,b)) very often, and it would let me remove the don't use si-length comment. Updated and tested on x86_64-linux-gnu. 2014-04-18 Marc Glisse marc.gli...@inria.fr PR tree-optimization/57742 gcc/ * tree-ssa-strlen.c (get_string_length): Ignore malloc. (handle_builtin_malloc, handle_builtin_memset): New functions. (strlen_optimize_stmt): Call them. * passes.def: Move strlen after loop+dom. gcc/testsuite/ * g++.dg/tree-ssa/calloc.C: New testcase. * gcc.dg/tree-ssa/calloc-1.c: Likewise. * gcc.dg/tree-ssa/calloc-2.c: Likewise. * gcc.dg/strlenopt-9.c: Adapt. -- Marc GlisseIndex: gcc/passes.def === --- gcc/passes.def (revision 209516) +++ gcc/passes.def (working copy) @@ -176,21 +176,20 @@ along with GCC; see the file COPYING3. DOM and erroneous path isolation should be due to degenerate PHI nodes. So rather than run the full propagators, run a specialized pass which only examines PHIs to discover const/copy propagation opportunities. */ NEXT_PASS (pass_phi_only_cprop); NEXT_PASS (pass_dse); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_dce); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); - NEXT_PASS (pass_strlen); NEXT_PASS (pass_ccp); /* After CCP we rewrite no longer addressed locals into SSA form if possible. */ NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_cse_sincos); NEXT_PASS (pass_optimize_bswap); NEXT_PASS (pass_split_crit_edges); NEXT_PASS (pass_pre); NEXT_PASS (pass_sink_code); NEXT_PASS (pass_asan); @@ -235,20 +234,21 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); NEXT_PASS (pass_dominator); /* The only const/copy propagation opportunities left after DOM should be due to degenerate PHI nodes. So rather than run the full propagators, run a specialized pass which only examines PHIs to discover const/copy propagation opportunities. */ NEXT_PASS (pass_phi_only_cprop); + NEXT_PASS (pass_strlen); NEXT_PASS (pass_vrp); NEXT_PASS (pass_cd_dce); NEXT_PASS (pass_tracer); NEXT_PASS (pass_dse); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); NEXT_PASS (pass_fold_builtins); NEXT_PASS (pass_optimize_widening_mul); NEXT_PASS (pass_tail_calls); NEXT_PASS (pass_rename_ssa_copies); Index: gcc/testsuite/g++.dg/tree-ssa/calloc.C === --- gcc/testsuite/g++.dg/tree-ssa/calloc.C (revision 0) +++ gcc/testsuite/g++.dg/tree-ssa/calloc.C (working copy) @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options -O3 -fdump-tree-optimized } */ + +typedef __SIZE_TYPE__ size_t; +inline void* operator new(size_t, void* p) throw() { return p; } + +typedef void (*handler_t)(void); +extern handler_t get_handle(); + +inline void* operator new(size_t sz) +{ + void *p; + + if (sz == 0) +sz = 1; + + while ((p = __builtin_malloc (sz)) == 0) +{ + handler_t handler = get_handle (); + if (! handler) +throw 42; + handler (); +} + return p; +} + +struct vect { + int *start, *end; + vect(size_t n) { +start = end = 0; +if (n (size_t)-1 / sizeof(int)) + throw 33; +if (n != 0) + start = static_castint* (operator new (n * sizeof(int))); +end = start + n; +int *p = start; +for (size_t l = n; l 0; --l, ++p) + *p = 0; + } +}; + +void f (void *p, int n) +{ + new (p) vect(n); +} + +/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */ +/* { dg-final { scan-tree-dump-not malloc optimized } } */ +/* { dg-final { scan-tree-dump-not memset optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/testsuite/gcc.dg/strlenopt-9.c === --- gcc/testsuite/gcc.dg/strlenopt-9.c (revision 209516) +++ gcc/testsuite/gcc.dg/strlenopt-9.c (working copy) @@ -11,21 +11,21 @@ fn1 (int r) optimized away. */ return strchr (p, '\0'); } __attribute__((noinline, noclone)) size_t fn2 (int r) { char *p, q[10]; strcpy
Re: [PATCH] Fix PR60849
On Thu, 17 Apr 2014, Richard Biener wrote: This fixes PR60849 by properly rejecting non-boolean typed comparisons from valid_gimple_rhs_p so they go through the gimplification paths. Could you also accept vector comparisons please? -- Marc Glisse
Re: [patch] change specific int128 - generic intN
On Tue, 15 Apr 2014, DJ Delorie wrote: I wasn't sure what to do with that array, since it was static and couldn't have empty slots in them like the arrays in tree.h. Also, do we need to have *every* type in that list? What's the rule for whether a type gets installed there or not? The comment says guaranteed to be in the runtime support but does that mean for this particular build (wrt multilibs) as not all intN types are guaranteed (even the int128 types were not guaranteed to be supported before my patch). In other parts of the patch, just taking out the special case for __int128 was sufficient to do the right thing for all __intN types. You need someone who understands this better than me (ask Jason). To be able to throw/catch a type, you need some typeinfo symbols. The front-end generates that for classes when they are defined. For fundamental types, it assumes libsupc++ will provide it, and the function you are modifying is the one generating libsupc++ (I am surprised your patch didn't cause any failure on x64_64, at least in abi_check). We need to generate the typeinfo for __intN, either in libsupc++, or in each TU, and since both cases will require code, I assume libsupc++ is preferable. I can certainly put the intN types in there, but note that it would mean regenerating the fundamentals[] array at runtime to include those types which are supported at the time. After the patch I linked, it should just mean calling the helper function on your new types, no need to touch the array. Do the entries in the array need to be in a particular order? No, any random order would do. -- Marc Glisse
Re: [C PATCH] Warn if switch has boolean value (PR c/60439)
On Fri, 18 Apr 2014, Marek Polacek wrote: This patch implements a new warning that warns when controlling expression of a switch has boolean value. (Intentionally I don't warn if the controlling expression is (un)signed:1 bit-field.) I guess the question is if this should be enabled by default or deserves some new warning option. Since clang does the former, I did it too and currently this warning is enabled by default. It can be enabled by -Wsome-name which is itself enabled by default but at least gives the possibility to use -Wno-some-name, -Werror=some-name, etc. No? I believe Manuel insists regularly that no new warning should use 0 (and old ones should progressively lose it). -- Marc Glisse
Re: C++ PATCH for c++/51747 (list-initialization from same type)
On Tue, 15 Apr 2014, Jason Merrill wrote: On 04/14/2014 06:02 PM, Marc Glisse wrote: shouldn't the same also apply if VECTOR_TYPE_P (type), not just for CLASS_TYPE_P (type)? Sure. Do you want to make that change? I can add || VECTOR_TYPE_P (type), yes, but I thought you might have ideas about other cases that might have been forgotten, maybe arrays or something (I didn't have time to test any further), and thus on what the right test should be. If it is just vectors I'll prepare a patch with a simple testcase. -- Marc Glisse
Re: [v3] Slightly improve operator new
Ping http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00051.html On Sun, 2 Mar 2014, Marc Glisse wrote: Hello, inlining operator new (with LTO or otherwise), I noticed that it has a complicated implementation, which makes it hard to use this inlined code for optimizations. This patch does two things: 1) there are 2 calls to malloc, I am turning them into just one. At -Os, it does not change the generated code (RTL optimizers manage to merge the calls to malloc). At other levels (-O2, -O3, and especially with -g) it gives a smaller object file. And with just one malloc, some optimizations become much easier (see my recent calloc patch for instance). 2) malloc is predicted to return null 19 times out of 20 because of the loop (that didn't change with the patch), so I am adding __builtin_expect to let gcc optimize the fast path. Further discussion: a) I didn't add __builtin_expect for the test (sz == 0), it didn't change the generated code in my limited test. I was wondering if this test is necessary (new doesn't seem to ever call operator new(0)) or could be moved to operator new[] (new type[0] does call operator new[](0)), but since one can call operator new directly, it has to be protected indeed, so let's forget this point ;-) (too bad malloc is replacable, so we can't use the fact that glibc already does the right thing) b) I have a bit of trouble parsing the standard. Is the nothrow operator new supposed to call the regular operator new? In particular, if a user replaces only the throwing operator new, should the nothrow operator new automatically call that function? That's not what we are currently doing (and it would be a perf regression). Required behavior: Return a non-null pointer to suitably aligned storage (3.7.4), or else return a null pointer. This nothrow version of operator new returns a pointer obtained as if acquired from the (possibly replaced) ordinary version. This requirement is binding on a replacement version of this function. Default behavior: Calls operator new(size). If the call returns normally, returns the result of that call. Otherwise, returns a null pointer. Passes bootstrap+testsuite on x86_64-linux-gnu. Stage 1? 2014-03-03 Marc Glisse marc.gli...@inria.fr * libsupc++/new_op.cc: Factor the calls to malloc, use __builtin_expect. * libsupc++/new_opnt.cc: Likewise. -- Marc Glisse
Re: calloc = malloc + memset
Let me ping this. There's no hurry, but it may have got lost with 4.9 approaching. http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01205.html On Sun, 23 Mar 2014, Marc Glisse wrote: On Mon, 3 Mar 2014, Richard Biener wrote: That's a bit much of ad-hoc pattern-matching ... wouldn't be p = malloc (n); memset (p, 0, n); transform better suited to the strlen opt pass? After all that tracks what 'string' is associated with a SSA name pointer through arbitrary satements using a lattice. Like this? I had to move the strlen pass after the loop passes (and after dom or everything was too dirty) but long enough before the end (some optimizations are necessary after strlen). As a bonus, one more strlen is optimized in the current testcases :-) Running the pass twice would be another option I guess (it would require implementing the clone method), but without a testcase showing it is needed... Passes bootstrap+testsuite on x86_64-linux-gnu. 2014-03-23 Marc Glisse marc.gli...@inria.fr PR tree-optimization/57742 gcc/ * tree-ssa-strlen.c (get_string_length): Ignore malloc. (handle_builtin_malloc, handle_builtin_memset): New functions. (strlen_optimize_stmt): Call them. * passes.def: Move strlen after loop+dom. gcc/testsuite/ * g++.dg/tree-ssa/calloc.C: New testcase. * gcc.dg/tree-ssa/calloc-1.c: Likewise. * gcc.dg/tree-ssa/calloc-2.c: Likewise. * gcc.dg/strlenopt-9.c: Adapt. -- Marc Glisse
Re: C++ PATCH for c++/51747 (list-initialization from same type)
On Tue, 15 Apr 2014, Jason Merrill wrote: It's just vectors, because they're an extension; the patch I checked in covered the standard language. Like this? (regtested on x86_64-linux-gnu) 2014-04-16 Marc Glisse marc.gli...@inria.fr gcc/cp/ * decl.c (reshape_init_r): Handle a single element of vector type. gcc/testsuite/ * g++.dg/cpp0x/initlist-vect.C: New file. -- Marc GlisseIndex: gcc/cp/decl.c === --- gcc/cp/decl.c (revision 209434) +++ gcc/cp/decl.c (working copy) @@ -5400,21 +5400,21 @@ reshape_init_r (tree type, reshape_iter maybe_warn_cpp0x (CPP0X_INITIALIZER_LISTS); } d-cur++; return init; } /* If T is a class type and the initializer list has a single element of type cv U, where U is T or a class derived from T, the object is initialized from that element. Even if T is an aggregate. */ - if (cxx_dialect = cxx11 CLASS_TYPE_P (type) + if (cxx_dialect = cxx11 (CLASS_TYPE_P (type) || VECTOR_TYPE_P (type)) first_initializer_p d-end - d-cur == 1 reference_related_p (type, TREE_TYPE (init))) { d-cur++; return init; } /* [dcl.init.aggr] Index: gcc/testsuite/g++.dg/cpp0x/initlist-vect.C === --- gcc/testsuite/g++.dg/cpp0x/initlist-vect.C (revision 0) +++ gcc/testsuite/g++.dg/cpp0x/initlist-vect.C (working copy) @@ -0,0 +1,6 @@ +// { dg-do compile { target c++11 } } + +typedef float X __attribute__ ((vector_size (4 * sizeof (float; + +X x; +X x2{x};
Re: [PATCH 1/3] libstdc++: Add time_get::get support.
On Tue, 15 Apr 2014, Paolo Carlini wrote: Anyway, the real issue is indeed that implementing those bits requires a new virtual function, and that would break the ABI. What is the status of the ABI half-break plan (abi_tag and such), necessary to get the remaining pieces of C++11? -- Marc Glisse
version typeinfo for 128bit types
Hello, this is a follow-up for this patch: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00618.html once committed, g++ will generate typeinfo for __float128, and it needs versioning. While there, I noticed that __int128 has typeinfo but not typeinfo name, so I am adding it. I manually checked that the new symbols were exactly the 12 I expected, with the new version number. I did not test the gnu-versioned-namespace version. I manually updated baseline for x86_64. It is awfully inconvenient to do. I was expecting make new-abi-baseline to generate it for me, but it gives me plenty of extra symbols compared to the current one. Some random examples: FUNC:_ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_PKS3_@@GLIBCXX_3.4 FUNC:_ZSt8distanceIPKcENSt15iterator_traitsIT_E15difference_typeES3_S3_@@GLIBCXX_3.4 FUNC:_ZSt19__iterator_categoryIPKmENSt15iterator_traitsIT_E17iterator_categoryERKS3_@@GLIBCXX_3.4 FUNC:_ZSt13__check_facetISt7codecvtIwc11__mbstate_tEERKT_PS4_@@GLIBCXX_3.4 FUNC:_ZSt13__check_facetISt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcRKT_PS7_@@GLIBCXX_3.4 FUNC:_ZNSt9exceptionC1Ev@@GLIBCXX_3.4 FUNC:_ZNSt8iteratorISt18input_iterator_tagclPcRcEC1Ev@@GLIBCXX_3.4 FUNC:_ZNSt8ios_base4setfESt13_Ios_FmtflagsS0_@@GLIBCXX_3.4 FUNC:_ZNSt7complexIfEC1Eff@@GLIBCXX_3.4 FUNC:_ZNSt6chrono13duration_castINS_8durationIlSt5ratioILl1ELl10lS2_ILl1ELl1NSt9enable_ifIXsrNS_13__is_durationIT_EE5valueES8_E4typeERKNS1_IT0_T1_EE@@GLIBCXX_3.4 FUNC:_ZNSt20bad_array_new_lengthC2Ev@@CXXABI_1.3.8 FUNC:_ZNSt14numeric_limitsIdE8infinityEv@@GLIBCXX_3.4 FUNC:_ZN10__cxxabiv117__class_type_info16__dyncast_resultC1Ei@@CXXABI_1.3 etc. Bootstrap+testsuite on x86_64-linux-gnu. 2014-04-14 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * config/abi/pre/gnu.ver (CXXABI_1.3.9): New version, new symbols. * config/abi/pre/gnu-versioned-namespace.ver: New symbols. * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Likewise. -- Marc GlisseIndex: config/abi/post/x86_64-linux-gnu/baseline_symbols.txt === --- config/abi/post/x86_64-linux-gnu/baseline_symbols.txt (revision 209354) +++ config/abi/post/x86_64-linux-gnu/baseline_symbols.txt (working copy) @@ -2513,20 +2513,21 @@ FUNC:atomic_flag_clear_explicit@@GLIBCXX FUNC:atomic_flag_test_and_set_explicit@@GLIBCXX_3.4.11 OBJECT:0:CXXABI_1.3 OBJECT:0:CXXABI_1.3.1 OBJECT:0:CXXABI_1.3.2 OBJECT:0:CXXABI_1.3.3 OBJECT:0:CXXABI_1.3.4 OBJECT:0:CXXABI_1.3.5 OBJECT:0:CXXABI_1.3.6 OBJECT:0:CXXABI_1.3.7 OBJECT:0:CXXABI_1.3.8 +OBJECT:0:CXXABI_1.3.9 OBJECT:0:CXXABI_TM_1 OBJECT:0:GLIBCXX_3.4 OBJECT:0:GLIBCXX_3.4.1 OBJECT:0:GLIBCXX_3.4.10 OBJECT:0:GLIBCXX_3.4.11 OBJECT:0:GLIBCXX_3.4.12 OBJECT:0:GLIBCXX_3.4.13 OBJECT:0:GLIBCXX_3.4.14 OBJECT:0:GLIBCXX_3.4.15 OBJECT:0:GLIBCXX_3.4.16 @@ -2617,20 +2618,21 @@ OBJECT:16:_ZTISt16nested_exception@@CXXA OBJECT:16:_ZTISt8ios_base@@GLIBCXX_3.4 OBJECT:16:_ZTISt9exception@@GLIBCXX_3.4 OBJECT:16:_ZTISt9time_base@@GLIBCXX_3.4 OBJECT:16:_ZTISt9type_info@@GLIBCXX_3.4 OBJECT:16:_ZTIa@@CXXABI_1.3 OBJECT:16:_ZTIb@@CXXABI_1.3 OBJECT:16:_ZTIc@@CXXABI_1.3 OBJECT:16:_ZTId@@CXXABI_1.3 OBJECT:16:_ZTIe@@CXXABI_1.3 OBJECT:16:_ZTIf@@CXXABI_1.3 +OBJECT:16:_ZTIg@@CXXABI_1.3.9 OBJECT:16:_ZTIh@@CXXABI_1.3 OBJECT:16:_ZTIi@@CXXABI_1.3 OBJECT:16:_ZTIj@@CXXABI_1.3 OBJECT:16:_ZTIl@@CXXABI_1.3 OBJECT:16:_ZTIm@@CXXABI_1.3 OBJECT:16:_ZTIn@@CXXABI_1.3.5 OBJECT:16:_ZTIo@@CXXABI_1.3.5 OBJECT:16:_ZTIs@@CXXABI_1.3 OBJECT:16:_ZTIt@@CXXABI_1.3 OBJECT:16:_ZTIv@@CXXABI_1.3 @@ -3117,25 +3119,28 @@ OBJECT:2:_ZNSt10ctype_base5printE@@GLIBC OBJECT:2:_ZNSt10ctype_base5punctE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base5spaceE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base5upperE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base6xdigitE@@GLIBCXX_3.4 OBJECT:2:_ZTSa@@CXXABI_1.3 OBJECT:2:_ZTSb@@CXXABI_1.3 OBJECT:2:_ZTSc@@CXXABI_1.3 OBJECT:2:_ZTSd@@CXXABI_1.3 OBJECT:2:_ZTSe@@CXXABI_1.3 OBJECT:2:_ZTSf@@CXXABI_1.3 +OBJECT:2:_ZTSg@@CXXABI_1.3.9 OBJECT:2:_ZTSh@@CXXABI_1.3 OBJECT:2:_ZTSi@@CXXABI_1.3 OBJECT:2:_ZTSj@@CXXABI_1.3 OBJECT:2:_ZTSl@@CXXABI_1.3 OBJECT:2:_ZTSm@@CXXABI_1.3 +OBJECT:2:_ZTSn@@CXXABI_1.3.9 +OBJECT:2:_ZTSo@@CXXABI_1.3.9 OBJECT:2:_ZTSs@@CXXABI_1.3 OBJECT:2:_ZTSt@@CXXABI_1.3 OBJECT:2:_ZTSv@@CXXABI_1.3 OBJECT:2:_ZTSw@@CXXABI_1.3 OBJECT:2:_ZTSx@@CXXABI_1.3 OBJECT:2:_ZTSy@@CXXABI_1.3 OBJECT:32:_ZNSbIwSt11char_traitsIwESaIwEE4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4 OBJECT:32:_ZNSs4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4 OBJECT:32:_ZTIPDd@@CXXABI_1.3.4 OBJECT:32:_ZTIPDe@@CXXABI_1.3.4 @@ -3148,39 +3153,41 @@ OBJECT:32:_ZTIPKDe@@CXXABI_1.3.4 OBJECT:32:_ZTIPKDf@@CXXABI_1.3.4 OBJECT:32:_ZTIPKDi@@CXXABI_1.3.3 OBJECT:32:_ZTIPKDn@@CXXABI_1.3.5 OBJECT:32:_ZTIPKDs@@CXXABI_1.3.3 OBJECT:32:_ZTIPKa@@CXXABI_1.3 OBJECT:32:_ZTIPKb@@CXXABI_1.3 OBJECT:32:_ZTIPKc@@CXXABI_1.3 OBJECT:32:_ZTIPKd@@CXXABI_1.3 OBJECT:32
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On Mon, 14 Apr 2014, Marek Polacek wrote: Currently it is not possible to use an enum value (or const int in C++) as an attribute argument (where compiler wants INTEGER_CST). This is because the values are represented by CONST_DECL nodes -- and we don't deal with them. I think it's viable to accept them too, and that is what this patch does along with some clean ups. Thanks. +/* Perform default promotions or extract the integer constant from + an enum value. This function is used for getting a value from + an attribute argument. */ + +static tree +get_attrib_value (tree val) +{ + if (val + TREE_CODE (val) != IDENTIFIER_NODE + TREE_CODE (val) != FUNCTION_DECL) +val = default_conversion (val); + else if (TREE_CODE (val) == IDENTIFIER_NODE) +{ + tree t = lookup_name (val); + if (t TREE_CODE (t) == CONST_DECL) + return DECL_INITIAL (t); +} + return val; +} If val is NULL, it seems you still end up looking at its TREE_CODE? (don't know if that can happen) In C++, default_conversion probably handles IDENTIFIER_NODE just fine, but having different code for the 2 languages would not be nice, right? -- Marc Glisse
Re: Optimize n?rotate(x,n):x
On Mon, 14 Apr 2014, Richard Biener wrote: + /* If the special case has a high probability, keep it. */ + if (EDGE_PRED (middle_bb, 0)-probability PROB_EVEN) I suppose Honza has a comment on how to test this properly (not sure if -probability or -frequency is always initialized properly). for example single_likely_edge tests profile_status_for_fn != PROFILE_ABSENT (and uses a fixed probability value ...). Anyway, the comparison looks backwards to me, but maybe I'm missing sth - I'd use = PROB_LIKELY ;) Maybe the comment is confusing? middle_bb contains the expensive operation (say a/b) that the special case skips entirely. If the division happens in less than 50% of cases (that's the proba of the edge going from cond to middle_bb), then doing the comparison+jump may be cheaper and I abort the optimization. At least the testcase with __builtin_expect should prove that I didn't do it backwards. value-prof seems to use 50% as the cut-off where it may become interesting to special case division, hence my choice of PROB_EVEN. I am not sure which way you want to use PROB_LIKELY (80%). If we have more than 80% chances of executing the division, always perform it? Or if we have more than 80% chances of skipping the division, keep the branch? Attached is the latest version (passed the testsuite).Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (working copy) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-phiopt1 } */ + +int f(int a, int b, int c) { + if (c 5) return c; + if (a == 0) return b; + return a + b; +} + +unsigned rot(unsigned x, int n) { + const int bits = __CHAR_BIT__ * __SIZEOF_INT__; + return (n == 0) ? x : ((x n) | (x (bits - n))); +} + +unsigned m(unsigned a, unsigned b) { + if (a == 0) +return 0; + else +return a b; +} + +/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */ +/* { dg-final { cleanup-tree-dump phiopt1 } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (working copy) @@ -0,0 +1,19 @@ +/* { dg-do compile { target x86_64-*-* } } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +int f(int a, int b) { + if (__builtin_expect(a == 0, 1)) return b; + return a + b; +} + +// optab_handler can handle if(b==1) but not a/b +// so we consider a/b too expensive. +unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) { + if (b == 1) +return a; + else +return a / b; +} + +/* { dg-final { scan-tree-dump-times goto 4 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/tree-ssa-phiopt.c === --- gcc/tree-ssa-phiopt.c (revision 209353) +++ gcc/tree-ssa-phiopt.c (working copy) @@ -140,20 +140,37 @@ static bool gate_hoist_loads (void); x = PHI (CONST, a) Gets replaced with: bb0: bb2: t1 = a == CONST; t2 = b c; t3 = t1 t2; x = a; + + It also replaces + + bb0: + if (a != 0) goto bb1; else goto bb2; + bb1: + c = a + b; + bb2: + x = PHI c (bb1), b (bb0), ...; + + with + + bb0: + c = a + b; + bb2: + x = PHI c (bb0), ...; + ABS Replacement --- This transformation, implemented in abs_replacement, replaces bb0: if (a = 0) goto bb2; else goto bb1; bb1: x = -a; bb2: @@ -809,20 +826,103 @@ operand_equal_for_value_replacement (con if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp)) return true; tmp = gimple_assign_rhs2 (def); if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp)) return true; return false; } +/* Returns true if ARG is a neutral element for operation CODE + on the RIGHT side. */ + +static bool +neutral_element_p (tree_code code, tree arg, bool right) +{ + switch (code) +{ +case PLUS_EXPR: +case BIT_IOR_EXPR: +case BIT_XOR_EXPR: + return integer_zerop (arg); + +case LROTATE_EXPR: +case RROTATE_EXPR: +case LSHIFT_EXPR: +case RSHIFT_EXPR: +case MINUS_EXPR: +case POINTER_PLUS_EXPR: + return right integer_zerop (arg); + +case MULT_EXPR: + return integer_onep (arg); + +case TRUNC_DIV_EXPR: +case CEIL_DIV_EXPR: +case FLOOR_DIV_EXPR: +case ROUND_DIV_EXPR: +case EXACT_DIV_EXPR: + return right integer_onep (arg); + +case BIT_AND_EXPR: + return integer_all_onesp (arg); + +default: + return false; +} +} + +/* Returns true if ARG is an absorbing element for operation CODE. */ + +static bool +absorbing_element_p (tree_code code,
Re: C++ PATCH for c++/51747 (list-initialization from same type)
On Fri, 11 Apr 2014, Jason Merrill wrote: Recent changes to the C++ standard have allowed the use of list-initialization with a single initializer of the same type as the target; this patch updates reshape_init accordingly. Tested x86_64-pc-linux-gnu, applying to trunk. Hello, shouldn't the same also apply if VECTOR_TYPE_P (type), not just for CLASS_TYPE_P (type)? Testcase in: http://stackoverflow.com/questions/23070982/c-initialization-of-intel-simd-intrinsics-class-members -- Marc Glisse
Re: [patch] change specific int128 - generic intN
On Mon, 14 Apr 2014, DJ Delorie wrote: * rtti.c (emit_support_tinfos): Remove __int128-specific entries. Hello, I wanted to check the interaction with: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00618.html but I don't see where you moved the __int128 typeinfo generation.
[c++] typeinfo for target types
Hello, this patch generates typeinfo for target types. On x86_64, it adds these 6 lines to nm -C libsupc++.a. A follow-up patch will be needed to export and version those in the shared library. + V typeinfo for __float128 + V typeinfo for __float128 const* + V typeinfo for __float128* + V typeinfo name for __float128 + V typeinfo name for __float128 const* + V typeinfo name for __float128* Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in tsan/tls_race.c). 2014-04-13 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 gcc/c-family/ * c-common.c (registered_builtin_types): Make non-static. * c-common.h (registered_builtin_types): Declare. gcc/cp/ * rtti.c (emit_support_tinfo_1): New function, extracted from emit_support_tinfos. (emit_support_tinfos): Call it and iterate on registered_builtin_types. -- Marc GlisseIndex: gcc/c-family/c-common.c === --- gcc/c-family/c-common.c (revision 209345) +++ gcc/c-family/c-common.c (working copy) @@ -3462,21 +3462,21 @@ c_common_fixed_point_type_for_size (unsi fixed-point types that have too many integral and fractional bits together); return 0; } return c_common_type_for_mode (mode, satp); } /* Used for communication between c_common_type_for_mode and c_register_builtin_type. */ -static GTY(()) tree registered_builtin_types; +tree registered_builtin_types; /* Return a data type that has machine mode MODE. If the mode is an integer, then UNSIGNEDP selects between signed and unsigned types. If the mode is a fixed-point mode, then UNSIGNEDP selects between saturating and nonsaturating types. */ tree c_common_type_for_mode (enum machine_mode mode, int unsignedp) { Index: gcc/c-family/c-common.h === --- gcc/c-family/c-common.h (revision 209345) +++ gcc/c-family/c-common.h (working copy) @@ -1006,20 +1006,24 @@ extern void do_warn_double_promotion (tr extern void set_underlying_type (tree); extern void record_locally_defined_typedef (tree); extern void maybe_record_typedef_use (tree); extern void maybe_warn_unused_local_typedefs (void); extern vectree, va_gc *make_tree_vector (void); extern void release_tree_vector (vectree, va_gc *); extern vectree, va_gc *make_tree_vector_single (tree); extern vectree, va_gc *make_tree_vector_from_list (tree); extern vectree, va_gc *make_tree_vector_copy (const vectree, va_gc *); +/* Used for communication between c_common_type_for_mode and + c_register_builtin_type. */ +extern GTY(()) tree registered_builtin_types; + /* In c-gimplify.c */ extern void c_genericize (tree); extern int c_gimplify_expr (tree *, gimple_seq *, gimple_seq *); extern tree c_build_bind_expr (location_t, tree, tree); /* In c-pch.c */ extern void pch_init (void); extern void pch_cpp_save_state (void); extern int c_common_valid_pch (cpp_reader *pfile, const char *name, int fd); extern void c_common_read_pch (cpp_reader *pfile, const char *name, int fd, Index: gcc/cp/rtti.c === --- gcc/cp/rtti.c (revision 209345) +++ gcc/cp/rtti.c (working copy) @@ -1458,20 +1458,58 @@ create_tinfo_types (void) FIELD_DECL, NULL_TREE, integer_type_node), build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE, type_info_ptr_type), build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE, type_info_ptr_type), NULL); pop_abi_namespace (); } +/* Helper for emit_support_tinfos. Emits the type_info descriptor of + a single type. */ + +void +emit_support_tinfo_1 (tree bltn) +{ + tree types[3]; + + if (bltn == NULL_TREE) +return; + types[0] = bltn; + types[1] = build_pointer_type (bltn); + types[2] = build_pointer_type (cp_build_qualified_type (bltn, + TYPE_QUAL_CONST)); + + for (int i = 0; i 3; ++i) +{ + tree tinfo = get_tinfo_decl (types[i]); + TREE_USED (tinfo) = 1; + mark_needed (tinfo); + /* The C++ ABI requires that these objects be COMDAT. But, +On systems without weak symbols, initialized COMDAT +objects are emitted with internal linkage. (See +comdat_linkage for details.) Since we want these objects +to have external linkage so that copies do not have to be +emitted in code outside the runtime library, we make them +non-COMDAT here. + +It might also not be necessary to follow this detail of the +ABI. */ + if (!flag_weak || ! targetm.cxx.library_rtti_comdat ()) + { + gcc_assert (TREE_PUBLIC (tinfo) !DECL_COMDAT (tinfo
[i386] define __SIZEOF_FLOAT128__
Hello, some people like having a macro to test if a type is available (__SIZEOF_INT128__ for instance). This adds macros for __float80 and __float128. The types seem to be always available, so I didn't add any condition. If you think this is a bad idea, please close the PR. Bootstrap+testsuite on x86_64-linux-gnu. 2014-04-13 Marc Glisse marc.gli...@inria.fr PR preprocessor/56540 * config/i386/i386-c.c (ix86_target_macros): Define __SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__. -- Marc GlisseIndex: gcc/config/i386/i386-c.c === --- gcc/config/i386/i386-c.c(revision 209345) +++ gcc/config/i386/i386-c.c(working copy) @@ -511,20 +511,27 @@ ix86_target_macros (void) if (!TARGET_80387) cpp_define (parse_in, _SOFT_FLOAT); if (TARGET_LONG_DOUBLE_64) cpp_define (parse_in, __LONG_DOUBLE_64__); if (TARGET_LONG_DOUBLE_128) cpp_define (parse_in, __LONG_DOUBLE_128__); + if (TARGET_64BIT) +cpp_define (parse_in, __SIZEOF_FLOAT80__=16); + else +cpp_define (parse_in, __SIZEOF_FLOAT80__=12); + + cpp_define (parse_in, __SIZEOF_FLOAT128__=16); + cpp_define_formatted (parse_in, __ATOMIC_HLE_ACQUIRE=%d, IX86_HLE_ACQUIRE); cpp_define_formatted (parse_in, __ATOMIC_HLE_RELEASE=%d, IX86_HLE_RELEASE); ix86_target_macros_internal (ix86_isa_flags, ix86_arch, ix86_tune, ix86_fpmath, cpp_define); }
Re: Optimize n?rotate(x,n):x
On Mon, 3 Mar 2014, Richard Biener wrote: On Sat, Mar 1, 2014 at 3:33 PM, Marc Glisse marc.gli...@inria.fr wrote: Hello, again, a stage 1 patch that I will ping then, but early comments are welcome. PR 59100 was asking to transform n?rotate(x,n):x to rotate(x,n) (because it can be hard to write a strictly valid rotate in plain C). The operation is really: (x != neutral) ? x op y : y where neutral is such that (neutral op y) is always y, so that's what I implemented (and absorbing elements while I was at it). For some operations on some platforms, the transformation may not be such a good idea, in particular if division is very slow and b is 1 most of the time, then computing a/b may be slower than (b!=1)?a/b:a. The easiest might be to comment out those operations in the switch for now. I think divisions are the only ones slow enough to deserve this, though there certainly are CPUs where multiplication is not so fast, and even for rotate it may not always be a win if the processor doesn't have a rotate instruction and the shift amount is almost always 0. You only handle integer operations, so checking for INTEGER_TYPE_P early on would make sense. Ah, I wasn't planning on restricting this to integers but yes, indeed, since it ended up that way... Note that some archs may dispatch to libgcc for integer operations so it may make sense to check whether that is the case (you can query optabs to check that) - if the comparison also dispatches to libgcc then of course the transform would be a win again. Ok. I am not sure about using cbranch_optab, other ideas for the comparison? Even on x86 TImode division uses libgcc. This reminds me of PR 58897 (unrelated). Note also that value-profiling may have created this special case in the first place! (gimple_divmod_fixed_value_transform) Test coverage must be limited if I didn't break the testsuite :-( value-profiling seems to set edge probabilities when it does the transformation, so I am checking only that. Otherwise I think this is a good transform. Did you check if it triggers during GCC bootstrap? It does, a few times. Java has the obvious: int padLength = blockSize; if (length % blockSize != 0) padLength = blockSize - length % blockSize; sched-deps.c has (I didn't check if it was actually this piece of code that triggered or another one in the same file): if ((ds1 t) !(ds2 t)) ds |= ds1 t; else if (!(ds1 t) (ds2 t)) ds |= ds2 t; and we end up seeing (phiopt3) if(a!=0)ds|=a. And mostly bitmap.h where we see quite a bit of: if (_867 != 0) goto bb 55; else goto bb 56; bb 55: start_bit_869 = _867 * 128; bb 56: # start_bit_870 = PHI 0(54), start_bit_869(55) Bootstrap+testsuite on x86_64-linux-gnu (modulo a minor reformatting, I'm retesting anyway). 2014-04-14 Marc Glisse marc.gli...@inria.fr PR tree-optimization/59100 gcc/ * tree-ssa-phiopt.c (neutral_element_p, absorbing_element_p, is_cheap_stmt): New functions. (value_replacement): Handle conditional binary operations with a neutral or absorbing element. gcc/testsuite/ * gcc.dg/tree-ssa/phi-opt-12.c: New file. * gcc.dg/tree-ssa/phi-opt-13.c: Likewise. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (working copy) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options -O -fdump-tree-phiopt1 } */ + +int f(int a, int b, int c) { + if (c 5) return c; + if (a == 0) return b; + return a + b; +} + +unsigned rot(unsigned x, int n) { + const int bits = __CHAR_BIT__ * __SIZEOF_INT__; + return (n == 0) ? x : ((x n) | (x (bits - n))); +} + +unsigned m(unsigned a, unsigned b) { + if (a == 0) +return 0; + else +return a b; +} + +/* { dg-final { scan-tree-dump-times goto 2 phiopt1 } } */ +/* { dg-final { cleanup-tree-dump phiopt1 } } */ Property changes on: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c ___ Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c === --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (working copy) @@ -0,0 +1,19 @@ +/* { dg-do compile { target x86_64-*-* } } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +int f(int a, int b) { + if (__builtin_expect(a == 0, 1)) return b; + return a + b; +} + +// optab_handler can handle if(b==1) but not a/b +// so we consider a/b too expensive. +unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) { + if (b == 1
Re: [v3] complex functions with expression template reals
On Sat, 12 Apr 2014, Eric Botcazou wrote: r209321 (for a patch accepted long ago, I guess it is better to leave a trace in gcc-patches near the commit date) Nope, see the doc, we have ChangeLog and gcc-cvs for this purpose. Those don't point to the conversation in gcc-patches where the patch was ok-ed by a maintainer (maybe we should start adding URLs pointing to the gcc-patches messages in commit logs?). I also meant it as a hello to whoever reviewed the patch, but it is true they likely already read new ChangeLog entries or follow (a filtered version of) gcc-cvs. Thanks, I won't do it again in the future :-) -- Marc Glisse
Re: [v3] complex functions with expression template reals
On Wed, 26 Feb 2014, Paolo Carlini wrote: On 02/26/2014 10:57 AM, Jonathan Wakely wrote: On 24 February 2014 09:10, Paolo Carlini wrote: Another option would be just using boost/multiprecision/mpfr.hpp when available. In general, I think it makes sense to have a minimum of infrastructure enabling tests checking interoperability with boost. If only we had a check_v3_target_header { args } it would be most of it, but it doesn't seem we do?!? Anyway I guess we can take care of that post 4.9.0 and commit the straightforward code tweak now. Jon? I'd be happy with it for Stage 1, but I don't think we should apply it now if it is only useful for people doing unusual (unsupported) things with std::complex. Ok. Marc, please remember to apply it as soon as Stage 1 re-opens. r209321 (for a patch accepted long ago, I guess it is better to leave a trace in gcc-patches near the commit date) -- Marc Glisse
Re: [v3] LWG 2106: move_iterator::reference
On Wed, 5 Mar 2014, Jonathan Wakely wrote: On 5 March 2014 19:36, Marc Glisse wrote: Hello, this issue got delayed in LWG, apparently because of a failed improvement to the wording along the way (happens, that's ok), but there seems to be a consensus on the resolution and I don't really see the point of waiting (it changes code that currently returns a reference to a temporary). Tested on x86_64-linux-gnu. Stage 1? Yes, OK for Stage 1. Please put _GLIBCXX_RESOLVE_DEFECTS (or whatever it is we use elsewhere) in the comment, rather than just DR 2106. r209323 (again, only posting because this was accepted long ago) -- Marc Glisse
[i386] Replace builtins with vector extensions
Hello, the previous discussion on the topic was before we added all those #pragma target in *mmintrin.h: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html I believe that removes a large part of the arguments against it. Note that I only did a few of the more obvious intrinsics, I am waiting to see if this patch is accepted before doing more. Bootstrap+testsuite on x86_64-linux-gnu. 2014-04-11 Marc Glisse marc.gli...@inria.fr * config/i386/xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps, _mm_div_ps, _mm_store_ss, _mm_cvtss_f32): Use vector extensions instead of builtins. * config/i386/emmintrin.h (_mm_store_sd, _mm_cvtsd_f64, _mm_storeh_pd, _mm_cvtsi128_si64, _mm_cvtsi128_si64x, _mm_add_pd, _mm_sub_pd, _mm_mul_pd, _mm_div_pd, _mm_storel_epi64, _mm_movepi64_pi64, _mm_loadh_pd, _mm_loadl_pd): Likewise. (_mm_sqrt_sd): Fix comment. -- Marc GlisseIndex: gcc/config/i386/emmintrin.h === --- gcc/config/i386/emmintrin.h (revision 209323) +++ gcc/config/i386/emmintrin.h (working copy) @@ -161,40 +161,40 @@ _mm_store_pd (double *__P, __m128d __A) extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_storeu_pd (double *__P, __m128d __A) { __builtin_ia32_storeupd (__P, __A); } /* Stores the lower DPFP value. */ extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_store_sd (double *__P, __m128d __A) { - *__P = __builtin_ia32_vec_ext_v2df (__A, 0); + *__P = __A[0]; } extern __inline double __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_cvtsd_f64 (__m128d __A) { - return __builtin_ia32_vec_ext_v2df (__A, 0); + return __A[0]; } extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_storel_pd (double *__P, __m128d __A) { _mm_store_sd (__P, __A); } /* Stores the upper DPFP value. */ extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_storeh_pd (double *__P, __m128d __A) { - *__P = __builtin_ia32_vec_ext_v2df (__A, 1); + *__P = __A[1]; } /* Store the lower DPFP value across two words. The address must be 16-byte aligned. */ extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_store1_pd (double *__P, __m128d __A) { _mm_store_pd (__P, __builtin_ia32_shufpd (__A, __A, _MM_SHUFFLE2 (0,0))); } @@ -215,86 +215,86 @@ extern __inline int __attribute__((__gnu _mm_cvtsi128_si32 (__m128i __A) { return __builtin_ia32_vec_ext_v4si ((__v4si)__A, 0); } #ifdef __x86_64__ /* Intel intrinsic. */ extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_cvtsi128_si64 (__m128i __A) { - return __builtin_ia32_vec_ext_v2di ((__v2di)__A, 0); + return __A[0]; } /* Microsoft intrinsic. */ extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_cvtsi128_si64x (__m128i __A) { - return __builtin_ia32_vec_ext_v2di ((__v2di)__A, 0); + return __A[0]; } #endif extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_add_pd (__m128d __A, __m128d __B) { - return (__m128d)__builtin_ia32_addpd ((__v2df)__A, (__v2df)__B); + return __A + __B; } extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_add_sd (__m128d __A, __m128d __B) { return (__m128d)__builtin_ia32_addsd ((__v2df)__A, (__v2df)__B); } extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_sub_pd (__m128d __A, __m128d __B) { - return (__m128d)__builtin_ia32_subpd ((__v2df)__A, (__v2df)__B); + return __A - __B; } extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_sub_sd (__m128d __A, __m128d __B) { return (__m128d)__builtin_ia32_subsd ((__v2df)__A, (__v2df)__B); } extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_mul_pd (__m128d __A, __m128d __B) { - return (__m128d)__builtin_ia32_mulpd ((__v2df)__A, (__v2df)__B); + return __A * __B; } extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_mul_sd (__m128d __A, __m128d __B) { return (__m128d)__builtin_ia32_mulsd ((__v2df)__A, (__v2df)__B); } extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_div_pd (__m128d __A, __m128d __B) { - return (__m128d)__builtin_ia32_divpd ((__v2df)__A, (__v2df)__B); + return __A / __B; } extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_div_sd (__m128d __A, __m128d __B) { return (__m128d)__builtin_ia32_divsd ((__v2df)__A, (__v2df)__B); } extern __inline __m128d __attribute__
Re: [PATCH] Fix PR c++/60764
On Wed, 9 Apr 2014, Jason Merrill wrote: Hmm, I would expect the parameter numbering for attribute nonnull and such to ignore the 'this' parameter. The doc for the format attribute says clearly: Since non-static C++ methods have an implicit this argument, the arguments of such methods should be counted from two, not one, when giving values for string-index and first-to-check. It would be strange to count arguments differently for different attributes. -- Marc Glisse
Re: Warn when returning the address of a temporary (middle-end)
On Mon, 7 Apr 2014, Richard Biener wrote: One hard part is avoiding duplicate warnings. Replacing the address with 0 is a convenient way to do that, so I did it both for my new warning and for the existing C/C++ ones. The patch breaks gfortran.dg/warn_target_lifetime_2.f90 because it ends up warning twice. I didn't touch that front-end because I don't know fortran, and the warning message Pointer at .1. in pointer assignment might outlive the pointer target doesn't seem very confident that the thing really is broken enough to be replaced by 0. I only tested (bootstrap+regression) the default languages, so ada/go may have a similar issue, to be handled if the approach seems ok. (I personally wouldn't care about duplicate warnings, but I know some people can't help complaining about it) This doesn't actually fix PR 60517, for that I was thinking of checking for memory reads if the first stop of walk_aliased_vdefs is a clobber (could also check __builtin_free), though I don't know in which pass to do that yet. Note that this will break working programs where inlining causes those references to no longer be returned (though maybe they already break with the clobbers we insert). I remember that POOMA/PETE was full of this ... (and made it reliably work by flattening all call trees with such calls). I mostly see programs that are written and tested using debug mode and that appear to be working, but break as soon as they are compiled in release mode because of dangling references. Breaking them more seems like a helpful thing to me ;-) (especially with a warning). But I would be happy not replacing the pointer and only emitting a warning, if people don't mind duplicate warnings or if we can find another way to avoid them. + if (warning_at (gimple_location (stmt), OPT_Wreturn_local_addr, + function returns address of local variable)) That's a front-end option, I should either move it to gcc/common.opt or use a different option. -- Marc Glisse
Re: Warn when returning the address of a temporary (middle-end)
On Mon, 7 Apr 2014, Jeff Law wrote: On 04/05/14 07:52, Marc Glisse wrote: Hello, we have front-end warnings about returning the address of a local variable. However, quite often in C++, people don't directly return the address of a temporary, it goes through a few functions which hide that fact. After some inlining, the fact that we are returning the address of a local variable can become obvious to the compiler, to the point where I have used, for debugging purposes, grep 'return ' on the optimized dump produced with -O3 -fkeep-inline-functions (I then had to sort through the global/local variables). fold_stmt looks like a good place for this, but it could go almost anywhere. It has to happen after enough inlining / copy propagation to make it useful though. I was wondering if this would be better implemented as a propagation problem so that cases where some, but not all paths to the return statement have local which reaches the return. ie ... if (foo) x = local else x = global return x; ISTM it ought to be a standard propagation problem and that the problematical ADDR_EXPRs are all going to be in PHI nodes. So I think you just search for problematical ADDR_EXPRs in PHI nodes as your seeds, then forward propagate through the CFG. Ultimately looking for any cases where those ADDR_EXPRs ultimately reach the return statement. Thoughts? I would tend to start from the return statements (assuming the return type is a pointer), look at the defining statement, do things if it is an assignment of an addr_expr, and recurse if it is a PHI. But maybe my brain is cabled backwards ;-) It would be good to piggy back on a pass that already does something similar, if we go that way. Do you know a convenient one? I am also afraid we may get more false positives, but maybe not. Last, the simple version actually works well enough that it discovered at least one real bug in real code, and I am afraid that by refining it too much we'll delay and get nothing in the end (my time and my knowledge of the compiler are limited enough to make it a real possibility). But I admit that's not a good argument. Thanks for the comments, -- Marc Glisse
Re: Warn when returning the address of a temporary (middle-end)
On Mon, 7 Apr 2014, Jeff Law wrote: I am also afraid we may get more false positives, but maybe not. The only false positives should come from paths which are unexecutable. One could argue that if we find any that we should warn, then isolate the path so that we get an immediate runtime trap rather than letting the address of the local escape through the return value. That in turn would argue for dumping it into gimple-isolate-erroneous-paths ;-) I wonder if trapping may be too strong? If a function sometimes returns a pointer to a local variable but the caller always ignores the return value in those cases, we can warn, but maybe we don't want to introduce a trap? Replacing the address with a null pointer seemed like a compromise, it will trap when you try to read it, but not if you ignore it. But if you think we can trap, ok. -- Marc Glisse
Warn when returning the address of a temporary (middle-end)
Hello, we have front-end warnings about returning the address of a local variable. However, quite often in C++, people don't directly return the address of a temporary, it goes through a few functions which hide that fact. After some inlining, the fact that we are returning the address of a local variable can become obvious to the compiler, to the point where I have used, for debugging purposes, grep 'return ' on the optimized dump produced with -O3 -fkeep-inline-functions (I then had to sort through the global/local variables). fold_stmt looks like a good place for this, but it could go almost anywhere. It has to happen after enough inlining / copy propagation to make it useful though. One hard part is avoiding duplicate warnings. Replacing the address with 0 is a convenient way to do that, so I did it both for my new warning and for the existing C/C++ ones. The patch breaks gfortran.dg/warn_target_lifetime_2.f90 because it ends up warning twice. I didn't touch that front-end because I don't know fortran, and the warning message Pointer at .1. in pointer assignment might outlive the pointer target doesn't seem very confident that the thing really is broken enough to be replaced by 0. I only tested (bootstrap+regression) the default languages, so ada/go may have a similar issue, to be handled if the approach seems ok. (I personally wouldn't care about duplicate warnings, but I know some people can't help complaining about it) This doesn't actually fix PR 60517, for that I was thinking of checking for memory reads if the first stop of walk_aliased_vdefs is a clobber (could also check __builtin_free), though I don't know in which pass to do that yet. 2014-04-05 Marc Glisse marc.gli...@inria.fr PR c++/60517 gcc/c/ * c-typeck.c (c_finish_return): Return 0 instead of the address of a local variable. gcc/cp/ * typeck.c (check_return_expr): Likewise. (maybe_warn_about_returning_address_of_local): Tell the caller if we warned. gcc/ * gimple-fold.c (fold_stmt_1): Warn if returning the address of a local variable. gcc/testsuite/ * c-c++-common/addrtmp.c: New testcase. * c-c++-common/uninit-G.c: Adjust. -- Marc GlisseIndex: gcc/c/c-typeck.c === --- gcc/c/c-typeck.c(revision 209157) +++ gcc/c/c-typeck.c(working copy) @@ -9254,23 +9254,25 @@ c_finish_return (location_t loc, tree re inner = TREE_OPERAND (inner, 0); while (REFERENCE_CLASS_P (inner) TREE_CODE (inner) != INDIRECT_REF) inner = TREE_OPERAND (inner, 0); if (DECL_P (inner) !DECL_EXTERNAL (inner) !TREE_STATIC (inner) DECL_CONTEXT (inner) == current_function_decl) - warning_at (loc, - OPT_Wreturn_local_addr, function returns address - of local variable); + { + warning_at (loc, OPT_Wreturn_local_addr, + function returns address of local variable); + t = build_zero_cst (TREE_TYPE (res)); + } break; default: break; } break; } retval = build2 (MODIFY_EXPR, TREE_TYPE (res), res, t); Index: gcc/cp/typeck.c === --- gcc/cp/typeck.c (revision 209157) +++ gcc/cp/typeck.c (working copy) @@ -49,21 +49,21 @@ static tree convert_for_assignment (tree static tree cp_pointer_int_sum (enum tree_code, tree, tree, tsubst_flags_t); static tree rationalize_conditional_expr (enum tree_code, tree, tsubst_flags_t); static int comp_ptr_ttypes_real (tree, tree, int); static bool comp_except_types (tree, tree, bool); static bool comp_array_types (const_tree, const_tree, bool); static tree pointer_diff (tree, tree, tree, tsubst_flags_t); static tree get_delta_difference (tree, tree, bool, bool, tsubst_flags_t); static void casts_away_constness_r (tree *, tree *, tsubst_flags_t); static bool casts_away_constness (tree, tree, tsubst_flags_t); -static void maybe_warn_about_returning_address_of_local (tree); +static bool maybe_warn_about_returning_address_of_local (tree); static tree lookup_destructor (tree, tree, tree, tsubst_flags_t); static void warn_args_num (location_t, tree, bool); static int convert_arguments (tree, vectree, va_gc **, tree, int, tsubst_flags_t); /* Do `exp = require_complete_type (exp);' to make sure exp does not have an incomplete type. (That includes void types.) Returns error_mark_node if the VALUE does not have complete type when this function returns. */ @@ -8253,79 +8253,81 @@ convert_for_initialization (tree exp, tr return rhs
Re: Fix ipa-devirt ICE
On Thu, 3 Apr 2014, Jan Hubicka wrote: + /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent + /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent + OTR_TOKEN == INT_MAX is used to mark calls that are provably Did you mean provably instead of probably in the first two? -- Marc Glisse
Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store
On Wed, 2 Apr 2014, Thomas Preud'homme wrote: Note that as it stands the patch does not work for arrays indexed with variable (such a tab[a] || (tab[a+1] 8)) because fold_const does not fold (a + 1) - a. Uh? It does fold a+1-a for me. What it doesn't do is look through the definition of b in b-a. Richard+GSoC will supposedly soon provide a function that does that. -- Marc Glisse
Re: [PATCH] Fix undefined behavior in IRA
On Tue, 25 Mar 2014, Marek Polacek wrote: This is a temporary fix for UB in IRA, where ubsan complains because there's signed iteger overflow in the multiplication. To shut this error up, we can perform the multiplication in unsigned and only then cast the result of the multiplication to int. Naive question: why do you want to shut the error up? If modular arithmetic makes sense for costs (sounds doubtful), they should use an unsigned type to begin with. Otherwise, this is making it harder to notice a bug (doesn't sound like an improvement). If you want to cast, doesn't something like long long (HOST_WIDEST_INT?) make more sense than unsigned? -- Marc Glisse
Re: [PATCH] Add test for PR c++/53711
On Sun, 23 Mar 2014, Patrick Palka wrote: On Fri, Mar 14, 2014 at 11:20 AM, Jason Merrill ja...@redhat.com wrote: Applied, thanks. Thanks! The commit seems to be missing the corresponding ChangeLog entry though. ChangeLog is kind of optional in the testsuite. Please keep providing entries, but don't be surprised when some go missing. -- Marc Glisse
Re: calloc = malloc + memset
On Mon, 3 Mar 2014, Richard Biener wrote: That's a bit much of ad-hoc pattern-matching ... wouldn't be p = malloc (n); memset (p, 0, n); transform better suited to the strlen opt pass? After all that tracks what 'string' is associated with a SSA name pointer through arbitrary satements using a lattice. Like this? I had to move the strlen pass after the loop passes (and after dom or everything was too dirty) but long enough before the end (some optimizations are necessary after strlen). As a bonus, one more strlen is optimized in the current testcases :-) Running the pass twice would be another option I guess (it would require implementing the clone method), but without a testcase showing it is needed... Passes bootstrap+testsuite on x86_64-linux-gnu. 2014-03-23 Marc Glisse marc.gli...@inria.fr PR tree-optimization/57742 gcc/ * tree-ssa-strlen.c (get_string_length): Ignore malloc. (handle_builtin_malloc, handle_builtin_memset): New functions. (strlen_optimize_stmt): Call them. * passes.def: Move strlen after loop+dom. gcc/testsuite/ * g++.dg/tree-ssa/calloc.C: New testcase. * gcc.dg/tree-ssa/calloc-1.c: Likewise. * gcc.dg/tree-ssa/calloc-2.c: Likewise. * gcc.dg/strlenopt-9.c: Adapt. -- Marc GlisseIndex: gcc/passes.def === --- gcc/passes.def (revision 208772) +++ gcc/passes.def (working copy) @@ -176,21 +176,20 @@ along with GCC; see the file COPYING3. DOM and erroneous path isolation should be due to degenerate PHI nodes. So rather than run the full propagators, run a specialized pass which only examines PHIs to discover const/copy propagation opportunities. */ NEXT_PASS (pass_phi_only_cprop); NEXT_PASS (pass_dse); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_dce); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); - NEXT_PASS (pass_strlen); NEXT_PASS (pass_ccp); /* After CCP we rewrite no longer addressed locals into SSA form if possible. */ NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_cse_sincos); NEXT_PASS (pass_optimize_bswap); NEXT_PASS (pass_split_crit_edges); NEXT_PASS (pass_pre); NEXT_PASS (pass_sink_code); NEXT_PASS (pass_asan); @@ -235,20 +234,21 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); NEXT_PASS (pass_dominator); /* The only const/copy propagation opportunities left after DOM should be due to degenerate PHI nodes. So rather than run the full propagators, run a specialized pass which only examines PHIs to discover const/copy propagation opportunities. */ NEXT_PASS (pass_phi_only_cprop); + NEXT_PASS (pass_strlen); NEXT_PASS (pass_vrp); NEXT_PASS (pass_cd_dce); NEXT_PASS (pass_tracer); NEXT_PASS (pass_dse); NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); NEXT_PASS (pass_fold_builtins); NEXT_PASS (pass_optimize_widening_mul); NEXT_PASS (pass_tail_calls); NEXT_PASS (pass_rename_ssa_copies); Index: gcc/testsuite/g++.dg/tree-ssa/calloc.C === --- gcc/testsuite/g++.dg/tree-ssa/calloc.C (revision 0) +++ gcc/testsuite/g++.dg/tree-ssa/calloc.C (working copy) @@ -0,0 +1,35 @@ +/* { dg-do compile { target c++11 } } */ +/* { dg-options -O3 -fdump-tree-optimized } */ + +#include new +#include vector +#include cstdlib + +void g(void*); +inline void* operator new(std::size_t sz) +{ + void *p; + + if (sz == 0) +sz = 1; + + // Slightly modified from the libsupc++ version, that one has 2 calls + // to malloc which makes it too hard to optimize. + while ((p = std::malloc (sz)) == 0) +{ + std::new_handler handler = std::get_new_handler (); + if (! handler) +throw std::bad_alloc(); + handler (); +} + return p; +} + +void f(void*p,int n){ + new(p)std::vectorint(n); +} + +/* { dg-final { scan-tree-dump-times calloc 1 optimized } } */ +/* { dg-final { scan-tree-dump-not malloc optimized } } */ +/* { dg-final { scan-tree-dump-not memset optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Property changes on: gcc/testsuite/g++.dg/tree-ssa/calloc.C ___ Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: gcc/testsuite/gcc.dg/strlenopt-9.c === --- gcc/testsuite/gcc.dg/strlenopt-9.c (revision 208772) +++ gcc/testsuite/gcc.dg/strlenopt-9.c (working copy) @@ -11,21 +11,21 @@ fn1 (int r