Re: RFC Asan instrumentation control
Hi! * c-c++-common/asan/no-asan-stack.c (this triggers read overflow because we haven't found a cross-platform way to grep for stack redzones instrumentation) I'd prefer no test in that case, or just some semi-platform specific test (scan that the 0x41b58ab3 constant doesn't appear in say some late RTL dump, or perhaps just assembly (just scan it with lower and upper case and decimal too)). Thanks, committed in 206458 without the c-c++-common/asan/no-asan-stack.c testfile. I'll fix this test according to your recommendations a bit later. -Maxim.
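As a small illustration of the "lower and upper case and decimal" remark, the sketch below formats the three spellings of the constant that such a scan would have to cover. The helper name asan_magic_spellings is made up for this note and is not part of GCC or its testsuite; only the value 0x41b58ab3 comes from the mail above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Constant quoted in the review above.  */
#define ASAN_STACK_MAGIC 0x41b58ab3u

/* Hypothetical helper (not a GCC function): write the lower-case hex,
   upper-case hex and decimal spellings of the constant into the three
   buffers, each of size LEN.  Returns 0 on success, -1 if any buffer
   is too small or formatting fails.  */
int
asan_magic_spellings (char *lower, char *upper, char *dec, size_t len)
{
  assert (lower != NULL && upper != NULL && dec != NULL);

  int n1 = snprintf (lower, len, "0x%x", ASAN_STACK_MAGIC);
  int n2 = snprintf (upper, len, "0x%X", ASAN_STACK_MAGIC);
  int n3 = snprintf (dec, len, "%u", ASAN_STACK_MAGIC);

  if (n1 < 0 || n2 < 0 || n3 < 0)
    return -1;
  if ((size_t) n1 >= len || (size_t) n2 >= len || (size_t) n3 >= len)
    return -1;
  return 0;
}
```

A scan-based test would then need to reject each of these spellings, since an assembler dump may print the immediate in any of them.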
Re: [patch][i386] Remove code executed only if reload_in_progress (i.e. never)
On Wed, Jan 8, 2014 at 10:58 PM, Jakub Jelinek ja...@redhat.com wrote: On Wed, Jan 08, 2014 at 10:51:53PM +0100, Steven Bosscher wrote: Hello Uros, and everyone else, Now that LRA is always used for the i386 targets, reload_in_progress is never set so all code conditional on it is now dead. The attached patch removes this code. Sadly I'm having difficulty testing the patch because I have no access to a suitable x86_64 or ix86 box :-) I'll try to test the patch on a compile farm machine, but I'm already posting the patch to hear if this is still OK for this late stage of the development cycle. It's not as if we're going to go back to reload so the code really is dead AFAICT, but it's obviously not a bug fix. While LRA is always on, making it harder to test with reload doesn't seem to be a good idea to me for 4.9, when some RA issue is reported for these architectures, often one just patches config/i386/i386.c by hand to enable reload instead of LRA and tests it with that instead. This patch would mean we'd need to keep around a patchset to apply for those purposes. I also think that we should leave reload functionality in 4.9, for the same reasons as Jakub presents. IMO, LRA still has some rough edges, please see [1] that triggers with improved x86 atomic_compare_and_swapdwi_doubleword pattern. [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58945 Uros.
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Wed, 8 Jan 2014, Jakub Jelinek wrote: On Wed, Jan 08, 2014 at 12:19:02PM +0100, Richard Biener wrote: On Wed, 8 Jan 2014, Jakub Jelinek wrote: On Wed, Jan 08, 2014 at 12:15:40PM +0100, Richard Biener wrote: I start to think this is a too complex transform for stmt folding ... Alternatively do update_call_from_tree (gsi, get_or_create_ssa_default_def (cfun, create_tmp_var (TREE_TYPE (lhs. The lhs might not be is_gimple_reg_type though. What to do in that case? In that case you can remove the stmt. Ok, so like this? Unfortunately, I haven't been able to construct a testcase where this would be folded later than during gimplification where lhs is obviously not a SSA_NAME and we can safely replace the call. Bootstrapped/regtested on x86_64-linux and i686-linux. Ok, though if you are happy to do another change then ... 2014-01-08 Jakub Jelinek ja...@redhat.com PR tree-optimization/59622 * gimple-fold.c (gimple_fold_call): Fix a typo in message. Handle __cxa_pure_virtual similarly to __builtin_unreachable, but replace the OBJ_TYPE_REF call with the noreturn and add if needed a setter of the lhs SSA_NAME. Don't devirtualize for inplace at all. * g++.dg/opt/pr59622-2.C: New test. * g++.dg/opt/pr59622-3.C: New test. * g++.dg/opt/pr59622-4.C: New test. 
--- gcc/gimple-fold.c.jj 2014-01-08 10:23:24.536443566 +0100 +++ gcc/gimple-fold.c 2014-01-08 17:02:40.356635177 +0100 @@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator * (OBJ_TYPE_REF_EXPR (callee) { fprintf (dump_file, -Type inheritnace inconsistent devirtualization of ); +Type inheritance inconsistent devirtualization of ); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); fprintf (dump_file, to ); print_generic_expr (dump_file, callee, TDF_SLIM); @@ -1177,26 +1177,46 @@ gimple_fold_call (gimple_stmt_iterator * gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee)); changed = true; } - else if (flag_devirtualize virtual_method_call_p (callee)) + else if (flag_devirtualize !inplace virtual_method_call_p (callee)) { bool final; vec cgraph_node *targets = possible_polymorphic_call_targets (callee, final); if (final targets.length () = 1) { + tree fndecl; if (targets.length () == 1) + fndecl = targets[0]-decl; + else + fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE); + + /* If fndecl (like __builtin_unreachable or + __cxa_pure_virtual) takes no arguments, doesn't have + return value and is noreturn, if the call doesn't have + lhs or lhs isn't SSA_NAME, replace the call with + the noreturn call, otherwise insert it before the call + and replace the call with setting of lhs to default def. */ + if (TREE_THIS_VOLATILE (fndecl) +VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fndecl))) +TYPE_ARG_TYPES (TREE_TYPE (fndecl)) == void_list_node) ... restrict this to the targets.length () == 0 case. Thanks, Richard. 
{ - gimple_call_set_fndecl (stmt, targets[0]-decl); - changed = true; - } - else if (!inplace) - { - tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE); + tree lhs = gimple_call_lhs (stmt); gimple new_stmt = gimple_build_call (fndecl, 0); gimple_set_location (new_stmt, gimple_location (stmt)); - gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + if (lhs TREE_CODE (lhs) == SSA_NAME) + { + tree var = create_tmp_var (TREE_TYPE (lhs), NULL); + tree def = get_or_create_ssa_default_def (cfun, var); + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + update_call_from_tree (gsi, def); + } + else + gsi_replace (gsi, new_stmt, true); return true; } + + gimple_call_set_fndecl (stmt, fndecl); + changed = true; } } } --- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj 2014-01-08 16:42:49.588747876 +0100 +++ gcc/testsuite/g++.dg/opt/pr59622-2.C 2014-01-08 16:42:49.588747876 +0100 @@ -0,0 +1,21 @@ +// PR tree-optimization/59622 +// { dg-do compile } +// { dg-options -O2 } + +namespace +{ + struct A + { +A () {} +virtual A *bar (int) = 0; +A *baz (int x) { return bar (x); } + }; +} + +A *a; + +void +foo () +{ + a-baz (0); +} ---
Re: [PATCH] Fix PR45586
This fixes the gimple verification ICEs in the Fortran testsuite. ...

This patch causes 6300+ regressions (-m32/-m64 all languages but go). However the following change

--- ../_clean/gcc/lto/lto.c 2014-01-04 15:51:44.0 +0100
+++ gcc/lto/lto.c 2014-01-08 08:26:09.0 +0100
@@ -310,7 +310,7 @@ hash_canonical_type (tree type)
     {
       v = iterative_hash_hashval_t (TYPE_REF_CAN_ALIAS_ALL (type), v);
       v = iterative_hash_hashval_t (TYPE_ADDR_SPACE (TREE_TYPE (type)), v);
-      v = iterative_hash_hashval_t (TYPE_RESTRICT (type), v);
+      /* v = iterative_hash_hashval_t (TYPE_RESTRICT (type), v); */
       v = iterative_hash_hashval_t (TREE_CODE (TREE_TYPE (type)), v);
     }

@@ -495,8 +495,8 @@ gimple_canonical_types_compatible_p (tre
           != TYPE_ADDR_SPACE (TREE_TYPE (t2)))
         return false;

-      if (TYPE_RESTRICT (t1) != TYPE_RESTRICT (t2))
-        return false;
+      /* if (TYPE_RESTRICT (t1) != TYPE_RESTRICT (t2))
+        return false; */

       if (TREE_CODE (TREE_TYPE (t1)) != TREE_CODE (TREE_TYPE (t2)))
         return false;

fixes PR45586 without regression. Further testing is blocked by PR59723.

Dominique
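One detail of the change above is worth spelling out: the hash and the comparison have to be relaxed together. If the comparison stops looking at the restrict qualifier while the hash still mixes it in, two types that compare equal can hash differently and never be merged. The following is only a toy model of that invariant; toy_type, toy_hash and toy_compatible_p are invented names and do not reproduce the lto.c functions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a pointer type: the code of the pointed-to type,
   its address space, and whether the pointer is restrict-qualified.  */
struct toy_type
{
  unsigned code;
  unsigned addr_space;
  unsigned is_restrict;
};

/* Simple mixing step, standing in for an iterative hash.  */
static unsigned
toy_mix (unsigned val, unsigned seed)
{
  return seed * 31u + val;
}

/* Hash T; the restrict bit is mixed in only if HASH_RESTRICT is set.  */
unsigned
toy_hash (const struct toy_type *t, int hash_restrict)
{
  assert (t != NULL);
  unsigned v = 0;
  v = toy_mix (t->addr_space, v);
  if (hash_restrict)
    v = toy_mix (t->is_restrict, v);
  v = toy_mix (t->code, v);
  return v;
}

/* Compare A and B; the restrict bit is checked only if
   COMPARE_RESTRICT is set.  */
int
toy_compatible_p (const struct toy_type *a, const struct toy_type *b,
                  int compare_restrict)
{
  assert (a != NULL && b != NULL);
  if (a->addr_space != b->addr_space)
    return 0;
  if (compare_restrict && a->is_restrict != b->is_restrict)
    return 0;
  return a->code == b->code;
}
```

With both flags off, a restrict and a non-restrict pointer compare equal and hash equal; turning off only the comparison would leave them equal but in different hash buckets.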
Re: [PATCH] Allow building if libsanitizer on RHEL5 (i.e. with 2.6.18-ish kernel headers, take 2)
Jakub Jelinek ja...@redhat.com writes: Hi! Here is an updated version which doesn't warn about #include_next. Ok for trunk? 2013-12-10 Jakub Jelinek ja...@redhat.com * sanitizer_common/Makefile.am (AM_CPPFLAGS): Add -isystem $(top_srcdir)/include/system. * sanitizer_common/Makefile.in: Regenerated. * include/system/linux/aio_abi.h: New header. * include/system/linux/mroute.h: New header. * include/system/linux/mroute6.h: New header. * include/system/linux/perf_event.h: New header. * include/system/linux/types.h: New header. This looks good to me. A better approach would have been what you said in another thread: | much better would be just say a testcase that would include the | sanitizer + kernel headers, guarded by recent enough LINUX_VERSION_CODE | or configure or similar, so it wouldn't prevent library build on older | kernel headers, the kernel ABI better be stable (only new things added, | not size of structures/magic constants etc. changed from time to time). But because of: | But Kostya is apparently not willing to do that, so this patch | provides a workaround in non-compiler-rt maintained files. Let's get this in then :-) Cheers. -- Dodji
Re: [PATCH] Fix PR58115
Bernd Edlinger bernd.edlin...@hotmail.de writes: I found another test case that still fails with today's trunk: #include <immintrin.h> __m256 a[10], b[10], c[10]; void __attribute__((target ("sse2"), optimize (3))) foo (void) { } void __attribute__((target ("avx"), optimize (3))) bar (void) { a[0] = _mm256_and_ps (b[0], c[0]); } compile with i686-pc-linux-gnu-gcc -O2 -msse2 -mno-avx -S The attached patch seems to fix this test case for targets that do not have SWITCHABLE_TARGET. What do you think about it? It looks like a correct fix, but the memcpy is going to be pretty expensive, since in most cases there will be no difference. Calling target_reinit is the rare case, and already very slow itself, so maybe an easier option would be to have a target_reinit counter. I.e. for !SWITCHABLE_TARGETs only, replace TREE_OPTIMIZATION_BASE_OPTABS with a "number of target_reinit calls" field. Not sure it's worth the effort though. The other targets should really move to SWITCHABLE_TARGET too. One of the reasons why I made SWITCHABLE_TARGET optional was that I was worried it might slow down the compiler for targets that didn't need it. Jakub's measurements suggest that any compile-time effect is in the noise though. I think Jakub's patch will fix this case, but I did not try. However even if the i386 is now clean, there are still many targets that use target_reinit() in target_set_current_function. FWIW I only see three others (nios, rs6000 and rx). nios and rs6000 are direct cut-and-pastes of the i386 version so should be easy to switch. rx looks more like MIPS in that it's switching between two specific subtargets. Thanks, Richard
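The counter idea can be sketched outside GCC roughly as follows. This is only a toy model under stated assumptions: toy_target_reinit, toy_opt_node and toy_cached_optabs are invented names, and the four-entry table stands in for optabs that are far larger in reality. The point is that checking one integer replaces comparing or copying the whole table on every function switch.

```c
#include <assert.h>
#include <string.h>

#define TOY_NOPTABS 4

/* Stand-in for the cached optab table; the real one is much larger.  */
struct toy_optabs
{
  int enabled[TOY_NOPTABS];
};

/* Hypothetical option node: cached optabs plus the reinit generation
   they were computed for.  */
struct toy_opt_node
{
  struct toy_optabs optabs;
  unsigned valid_for;
  int has_optabs;
};

/* Bumped on every (toy) target reinitialization.  */
static unsigned toy_reinit_count;
/* Number of expensive recomputations performed so far.  */
static int toy_recompute_count;

void
toy_target_reinit (void)
{
  toy_reinit_count++;
}

/* Pretend to do the expensive optab initialization.  */
static void
toy_compute_optabs (struct toy_optabs *o)
{
  for (int i = 0; i < TOY_NOPTABS; i++)
    o->enabled[i] = (int) ((toy_reinit_count + (unsigned) i) & 1u);
  toy_recompute_count++;
}

/* Return NODE's optabs, recomputing them only if the target has been
   reinitialized since they were cached.  */
const struct toy_optabs *
toy_cached_optabs (struct toy_opt_node *node)
{
  assert (node != NULL);
  if (!node->has_optabs || node->valid_for != toy_reinit_count)
    {
      toy_compute_optabs (&node->optabs);
      node->valid_for = toy_reinit_count;
      node->has_optabs = 1;
    }
  return &node->optabs;
}

int
toy_recomputations (void)
{
  return toy_recompute_count;
}
```

Repeated lookups on the same node cost one integer comparison; only a lookup after toy_target_reinit pays for a recomputation.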
Re: [PATCH] Use libbacktrace for libsanitizer symbolization (take 2, PR sanitizer/59136)
Jakub Jelinek ja...@redhat.com writes: This is a second attempt at libsanitizer symbolization using libbacktrace. The compiler-rt maintained bits have been already added by the recent merge from compiler-rt, so this patch is mostly configury/Makefile stuff. Rather than using libbacktrace.la built in libbacktrace directory directly this patch builds libsanitizer's own copy of a subset of libbacktrace that it actually needs (only everything required for backtrace_{{sym,pc}info,create_state}), OK. renames the symbols to __asan_backtrace_* so that when it is linked e.g. through -static-libasan etc. it doesn't clash with user symbols or other projects using libbacktrace (and, as libasan isn't yet symbol versioned, also doesn't export backtrace_* symbols from the DSO). So we are carrying a light fork of libbacktrace then. I am a little bit concerned about the maintenance cost of this over time. I guess we can figure out a way to factorize libbacktrace if the cost of its maintenance really rises. Regtested on x86_64-linux (--target_board=unix\{-m32,-m64\}), ok for trunk (will do full bootstrap/regtest momentarily)? Looks good to me. Thank you. -- Dodji
[PATCH] Fix PR59715
This fixes PR59715 by splitting critical edges again before code sinking. The critical edge splitting done before PRE was designed to survive until sinking originally, but at least since 4.5 PRE now eventually cleans up the CFG and thus undoes critical edge splitting. This results in less than optimal code placement (and lost opportunities) for sinking and it breaks (at least) the virtual operand updating code which assumes that critical edges are still split. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2014-01-09 Richard Biener rguent...@suse.de PR tree-optimization/59715 * tree-cfg.h (split_critical_edges): Declare. * tree-cfg.c (split_critical_edges): Export. * tree-ssa-sink.c (execute_sink_code): Split critical edges. * gcc.dg/torture/pr59715.c: New testcase. Index: gcc/tree-cfg.h === *** gcc/tree-cfg.h (revision 206421) --- gcc/tree-cfg.h (working copy) *** extern tree gimplify_build1 (gimple_stmt *** 93,97 --- 93,98 tree, tree); extern void extract_true_false_edges_from_block (basic_block, edge *, edge *); extern unsigned int execute_fixup_cfg (void); + extern unsigned int split_critical_edges (void); #endif /* _TREE_CFG_H */ Index: gcc/tree-cfg.c === *** gcc/tree-cfg.c (revision 206421) --- gcc/tree-cfg.c (working copy) *** static void make_goto_expr_edges (basic_ *** 159,165 static void make_gimple_asm_edges (basic_block); static edge gimple_redirect_edge_and_branch (edge, basic_block); static edge gimple_try_redirect_by_replacing_jump (edge, basic_block); - static unsigned int split_critical_edges (void); /* Various helpers. */ static inline bool stmt_starts_bb_p (gimple, gimple); --- 159,164 *** struct cfg_hooks gimple_cfg_hooks = { *** 7929,7935 /* Split all critical edges. */ ! static unsigned int split_critical_edges (void) { basic_block bb; --- 7954,7960 /* Split all critical edges. */ ! 
unsigned int split_critical_edges (void) { basic_block bb; Index: gcc/tree-ssa-sink.c === *** gcc/tree-ssa-sink.c (revision 206421) --- gcc/tree-ssa-sink.c (working copy) *** static void *** 567,573 execute_sink_code (void) { loop_optimizer_init (LOOPS_NORMAL); ! connect_infinite_loops_to_exit (); memset (sink_stats, 0, sizeof (sink_stats)); calculate_dominance_info (CDI_DOMINATORS); --- 567,573 execute_sink_code (void) { loop_optimizer_init (LOOPS_NORMAL); ! split_critical_edges (); connect_infinite_loops_to_exit (); memset (sink_stats, 0, sizeof (sink_stats)); calculate_dominance_info (CDI_DOMINATORS); Index: gcc/testsuite/gcc.dg/torture/pr59715.c === *** gcc/testsuite/gcc.dg/torture/pr59715.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr59715.c (working copy) *** *** 0 --- 1,21 + /* { dg-do run } */ + + extern void abort (void); + + int a = 2, b; + + int + main () + { + int c; + if (!b) + { + b = a; + c = a == 0 ? 1 : 1 % a; + if (c) + b = 0; + } + if (b != 0) + abort (); + return 0; + }
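For readers less familiar with the term: an edge is critical when its source block has more than one successor and its destination block has more than one predecessor, so code cannot be placed "on the edge" without a new block. The sketch below shows the idea on a toy edge list. It is not the tree-cfg.c implementation; toy_cfg, toy_edge_critical_p and toy_split_critical_edges are names invented for this note.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_EDGES 32

/* Toy CFG: blocks are numbered 0..n_blocks-1, edge E runs from
   src[E] to dst[E].  */
struct toy_cfg
{
  int n_blocks;
  int n_edges;
  int src[TOY_MAX_EDGES];
  int dst[TOY_MAX_EDGES];
};

/* Number of edges leaving block B.  */
static int
toy_count_succs (const struct toy_cfg *g, int b)
{
  int n = 0;
  for (int e = 0; e < g->n_edges; e++)
    if (g->src[e] == b)
      n++;
  return n;
}

/* Number of edges entering block B.  */
static int
toy_count_preds (const struct toy_cfg *g, int b)
{
  int n = 0;
  for (int e = 0; e < g->n_edges; e++)
    if (g->dst[e] == b)
      n++;
  return n;
}

/* An edge is critical if its source has several successors and its
   destination has several predecessors.  */
int
toy_edge_critical_p (const struct toy_cfg *g, int e)
{
  assert (g != NULL && e >= 0 && e < g->n_edges);
  return toy_count_succs (g, g->src[e]) > 1
         && toy_count_preds (g, g->dst[e]) > 1;
}

/* Route every critical edge through a fresh empty block.  Returns the
   number of edges split.  Splitting one edge does not change the
   successor count of its source or the predecessor count of its
   destination, so one pass over the original edges is enough.  */
int
toy_split_critical_edges (struct toy_cfg *g)
{
  int split = 0;
  int orig_edges = g->n_edges;

  for (int e = 0; e < orig_edges; e++)
    {
      if (!toy_edge_critical_p (g, e))
        continue;
      assert (g->n_edges < TOY_MAX_EDGES);
      int new_block = g->n_blocks++;
      /* New edge: new_block -> old destination.  */
      g->src[g->n_edges] = new_block;
      g->dst[g->n_edges] = g->dst[e];
      g->n_edges++;
      /* Old edge now ends at the new block.  */
      g->dst[e] = new_block;
      split++;
    }
  return split;
}
```

For the triangle 0->1, 0->2, 1->2, only 0->2 is critical; after splitting, the new block gives a sinking pass a place to put code that is needed on that path alone.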
Re: [PATCH] Fix PR58115
On Thu, Jan 09, 2014 at 09:02:28AM +, Richard Sandiford wrote: I think Jakub's patch will fix this case, but I did not try. However even if the i386 is now clean, there are still many targets that use target_reinit() in target_set_current_function. FWIW I only see three others (nios, rs6000 and rx). nios and rs6000 are direct cut-and-pastes of the i386 version so should be easy to switch. rx looks more like MIPS in that it's switching between two specific subtargets. Yeah, if i386 is changed into SWITCHABLE_TARGET, then I'd strongly encourage rs6000 and nios folks to follow suit. Jakub
RE: [PATCH] Fix PR58115
On Thu, 9 Jan 2014 10:28:43, Jakub Jelinek wrote: On Thu, Jan 09, 2014 at 09:02:28AM +, Richard Sandiford wrote: I think Jakub's patch will fix this case, but I did not try. However even if the i386 is now clean, there are still many targets that use target_reinit() in target_set_current_function. FWIW I only see three others (nios, rs6000 and rx). nios and rs6000 are direct cut-and-pastes of the i386 version so should be easy to switch. rx looks more like MIPS in that it's switching between two specific subtargets. Yeah, if i386 is changed into SWITCHABLE_TARGET, then I'd strongly encourage rs6000 and nios folks to follow suit. Jakub Ok for me. Hope they read this thread... If that is our policy for 4.9.0, then the comment in function.c where the targetm.set_current_function (fndecl); is called should _very_ clearly say that this callback is no longer allowed to call target_reinit(). Thanks Bernd.
Re: [PATCH] Fix PR58115
On Thu, Jan 09, 2014 at 10:36:43AM +0100, Bernd Edlinger wrote: On Thu, 9 Jan 2014 10:28:43, Jakub Jelinek wrote: On Thu, Jan 09, 2014 at 09:02:28AM +, Richard Sandiford wrote: I think Jakub's patch will fix this case, but I did not try. However even if the i386 is now clean, there are still many targets that use target_reinit() in target_set_current_function. FWIW I only see three others (nios, rs6000 and rx). nios and rs6000 are direct cut-and-pastes of the i386 version so should be easy to switch. rx looks more like MIPS in that it's switching between two specific subtargets. Yeah, if i386 is changed into SWITCHABLE_TARGET, then I'd strongly encourage rs6000 and nios folks to follow suit. Jakub Ok for me. Hope they read this thread... If that is our policy for 4.9.0, then the comment in function.c where the targetm.set_current_function (fndecl); is called should _very_ clearly say that this callback is no longer allowed to call target_reinit(). But that is not the case, even with the i386 SWITCHABLE_TARGET patch it may call target_reinit, because that is what save_target_globals_default_opts calls. It just calls it temporarily with the optimization_default_node. Jakub
Re: PR 59137: Incorrect liveness info during dbr_schedule
On Wed, Jan 8, 2014 at 8:46 PM, Steven Bosscher stevenb@gmail.com wrote: On Wed, Jan 8, 2014 at 8:27 PM, Richard Sandiford wrote: gcc/ PR rtl-optimization/59137 * reorg.c (steal_delay_list_from_target): Call update_block for elided insns. (steal_delay_list_from_fallthrough, relax_delay_slots): Likewise. gcc/testsuite/ PR rtl-optimization/59137 * gcc.target/mips/pr59137.c: New test. This is OK for trunk. For release branches I'll defer to the RMs. Works for me. Richard. Ciao! Steven
Re: [Patch] Regex bracket matcher cache optimization
On 01/08/2014 11:47 PM, Tim Shen wrote: On Wed, Jan 8, 2014 at 5:38 PM, Paolo Carlini paolo.carl...@oracle.com wrote: I agree, it's probably fine for now, but please actually attach the patch ;) Oops sorry . Jon, is this version Ok with you? Thanks, Paolo.
Re: [PATCH] Fix PR58115
On Thu, Jan 09, 2014 at 09:02:28AM +, Richard Sandiford wrote: It looks like a correct fix, but the memcpy is going to be pretty expensive, since in most cases there will be no difference. Perhaps we should add another tree code, which would represent the combination of TARGET_OPTION_NODE and OPTIMIZATION_NODE, FUNCTION_DECL would then refer to this combo node only and that new tree would refer to both TARGET_OPTION_NODE and OPTIMIZATION_NODE. That way we could stick the saved optabs into the new node rather than having default opt cached target optabs, non-default opt with default target optabs cached too, but for non-default target non-default opt we don't cache anything and always recompute. Or perhaps just merge both TARGET_OPTION_NODE and OPTIMIZATION_NODE into one and let both target and optimize attributes adjust it? Jakub
[PATCH, go]: Skip some go tests
Hello! The attached patch skips some go tests to avoid testsuite failures: - peano.go tests recursive calls. The test fails on targets that don't support -fsplit-stack (CentOS 5.3, alpha) - rotate[0123]-out.go take too long to compile. The tests should be skipped for the same reason as rotate.go: # This test produces a temporary file that takes too long # to compile--5 minutes on my laptop without optimization. # When compiling without optimization it tests nothing # useful, since the point of the test is to see whether # the compiler generates rotate instructions. There are two remaining warnings: go.test/test/nilcheck.go: unrecognized test line: // errorcheck -0 -N -d=nil go.test/test/nilptr3.go: unrecognized test line: // errorcheck -0 -d=nil (The patch doesn't address these warnings). 2014-01-09 Uros Bizjak ubiz...@gmail.com * go.test/go-test.exp (go-gc-tests): Don't run peano.go on systems which don't support -fsplit-stack. Skip rotate[0123]-out.go. The patch was tested on alpha-linux-gnu, where it fixes all but the recovery.go failure; the latter is due to an unimplemented feature on the alpha target. OK for mainline? Uros. Index: go-test.exp === --- go-test.exp (revision 206441) +++ go-test.exp (working copy) @@ -400,7 +400,8 @@ } if { ( [file tail $test] == select2.go \ - || [file tail $test] == stack.go ) \ + || [file tail $test] == stack.go \ + || [file tail $test] == peano.go ) \ ! [check_effective_target_split_stack] } { # chan/select2.go fails on targets without split stack, # because they allocate a large stack segment that blows @@ -409,7 +410,8 @@ continue } - if { [file tail $test] == rotate.go } { + if { ( [file tail $test] == rotate.go \ + || [file tail $test] == rotate\[0123\]-out.go ) } { # This test produces a temporary file that takes too long # to compile--5 minutes on my laptop without optimization. # When compiling without optimization it tests nothing
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, Jan 09, 2014 at 09:46:11AM +0100, Richard Biener wrote: + /* If fndecl (like __builtin_unreachable or +__cxa_pure_virtual) takes no arguments, doesn't have +return value and is noreturn, if the call doesn't have +lhs or lhs isn't SSA_NAME, replace the call with +the noreturn call, otherwise insert it before the call +and replace the call with setting of lhs to default def. */ + if (TREE_THIS_VOLATILE (fndecl) + VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fndecl))) + TYPE_ARG_TYPES (TREE_TYPE (fndecl)) == void_list_node) ... restrict this to the targets.length () == 0 case. Well, then the __cxa_pure_virtual testcases ICE again, but the pr59622-5.C testcase ICEs anyway, so here is a different patch (untested so far except for the tests). The issue with the calls is that when fold_stmt is done during gimplification (or omp lowering), we don't have cfg nor SSA form and nothing performs fixup_noreturn_calls (that function requires CFG and SSA form anyway). So, we have to fix that up by hand, but as we aren't in SSA form yet, dropping lhs is always safe for the noreturn calls. 2014-01-09 Jakub Jelinek ja...@redhat.com PR tree-optimization/59622 * gimple-fold.c (gimple_fold_call): Fix a typo in message. For __builtin_unreachable replace the OBJ_TYPE_REF call with a call to __builtin_unreachable and add if needed a setter of the lhs SSA_NAME. Don't devirtualize for inplace at all. For targets.length () == 1, if the call is noreturn and cfun isn't in SSA form yet, clear lhs. * g++.dg/opt/pr59622-2.C: New test. * g++.dg/opt/pr59622-3.C: New test. * g++.dg/opt/pr59622-4.C: New test. * g++.dg/opt/pr59622-5.C: New test. 
--- gcc/gimple-fold.c.jj2014-01-08 17:44:57.690582374 +0100 +++ gcc/gimple-fold.c 2014-01-09 11:05:40.165287975 +0100 @@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator * (OBJ_TYPE_REF_EXPR (callee) { fprintf (dump_file, - Type inheritnace inconsistent devirtualization of ); + Type inheritance inconsistent devirtualization of ); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); fprintf (dump_file, to ); print_generic_expr (dump_file, callee, TDF_SLIM); @@ -1177,7 +1177,7 @@ gimple_fold_call (gimple_stmt_iterator * gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee)); changed = true; } - else if (flag_devirtualize virtual_method_call_p (callee)) + else if (flag_devirtualize !inplace virtual_method_call_p (callee)) { bool final; vec cgraph_node *targets @@ -1188,13 +1188,29 @@ gimple_fold_call (gimple_stmt_iterator * { gimple_call_set_fndecl (stmt, targets[0]-decl); changed = true; + /* If the call becomes noreturn and we aren't in SSA form +yet, just drop the lhs, because fixup_noreturn_call +isn't run at that point yet. 
*/ + if ((gimple_call_flags (stmt) ECF_NORETURN) + !gimple_in_ssa_p (cfun) + gimple_call_lhs (stmt)) + gimple_call_set_lhs (stmt, NULL_TREE); } - else if (!inplace) + else { + tree lhs = gimple_call_lhs (stmt); tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE); gimple new_stmt = gimple_build_call (fndecl, 0); gimple_set_location (new_stmt, gimple_location (stmt)); - gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + if (lhs TREE_CODE (lhs) == SSA_NAME) + { + tree var = create_tmp_var (TREE_TYPE (lhs), NULL); + tree def = get_or_create_ssa_default_def (cfun, var); + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + update_call_from_tree (gsi, def); + } + else + gsi_replace (gsi, new_stmt, true); return true; } } --- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj 2014-01-09 10:57:46.246694025 +0100 +++ gcc/testsuite/g++.dg/opt/pr59622-2.C2014-01-09 10:57:46.246694025 +0100 @@ -0,0 +1,21 @@ +// PR tree-optimization/59622 +// { dg-do compile } +// { dg-options -O2 } + +namespace +{ + struct A + { +A () {} +virtual A *bar (int) = 0; +A *baz (int x) { return bar (x); } + }; +} + +A *a; + +void +foo () +{ + a-baz (0); +} --- gcc/testsuite/g++.dg/opt/pr59622-3.C.jj 2014-01-09 10:57:46.247694040 +0100 +++ gcc/testsuite/g++.dg/opt/pr59622-3.C2014-01-09
Re: [PATCH] Fix PR58115
On Thu, Jan 9, 2014 at 11:24 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Jan 09, 2014 at 09:02:28AM +, Richard Sandiford wrote: It looks like a correct fix, but the memcpy is going to be pretty expensive, since in most cases there will be no difference. Perhaps we should add another tree code, which would represent the combination of TARGET_OPTION_NODE and OPTIMIZATION_NODE, FUNCTION_DECL would then refer to this combo node only and that new tree would refer to both TARGET_OPTION_NODE and OPTIMIZATION_NODE. That way we could stick the saved optabs into the new node rather than having default opt cached target optabs, non-default opt with default target optabs cached too, but for non-default target non-default opt we don't cache anything and always recompute. Or perhaps just merge both TARGET_OPTION_NODE and OPTIMIZATION_NODE into one and let both target and optimize attributes adjust it? Yeah - I fail to see why we have two different tree nodes here anyway. Richard. Jakub
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, 9 Jan 2014, Jakub Jelinek wrote: On Thu, Jan 09, 2014 at 09:46:11AM +0100, Richard Biener wrote: + /* If fndecl (like __builtin_unreachable or + __cxa_pure_virtual) takes no arguments, doesn't have + return value and is noreturn, if the call doesn't have + lhs or lhs isn't SSA_NAME, replace the call with + the noreturn call, otherwise insert it before the call + and replace the call with setting of lhs to default def. */ + if (TREE_THIS_VOLATILE (fndecl) +VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fndecl))) +TYPE_ARG_TYPES (TREE_TYPE (fndecl)) == void_list_node) ... restrict this to the targets.length () == 0 case. Well, then the __cxa_pure_virtual testcases ICE again, but the pr59622-5.C testcase ICEs anyway, so here is a different patch (untested so far except for the tests). The issue with the calls is that when fold_stmt is done during gimplification (or omp lowering), we don't have cfg nor SSA form and nothing performs fixup_noreturn_calls (that function requires CFG and SSA form anyway). So, we have to fix that up by hand, but as we aren't in SSA form yet, dropping lhs is always safe for the noreturn calls. Don't we fix this up during CFG build? That is, I fail to see the issue with noreturn and pre-CFG? Richard. 2014-01-09 Jakub Jelinek ja...@redhat.com PR tree-optimization/59622 * gimple-fold.c (gimple_fold_call): Fix a typo in message. For __builtin_unreachable replace the OBJ_TYPE_REF call with a call to __builtin_unreachable and add if needed a setter of the lhs SSA_NAME. Don't devirtualize for inplace at all. For targets.length () == 1, if the call is noreturn and cfun isn't in SSA form yet, clear lhs. * g++.dg/opt/pr59622-2.C: New test. * g++.dg/opt/pr59622-3.C: New test. * g++.dg/opt/pr59622-4.C: New test. * g++.dg/opt/pr59622-5.C: New test. 
--- gcc/gimple-fold.c.jj 2014-01-08 17:44:57.690582374 +0100 +++ gcc/gimple-fold.c 2014-01-09 11:05:40.165287975 +0100 @@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator * (OBJ_TYPE_REF_EXPR (callee) { fprintf (dump_file, -Type inheritnace inconsistent devirtualization of ); +Type inheritance inconsistent devirtualization of ); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); fprintf (dump_file, to ); print_generic_expr (dump_file, callee, TDF_SLIM); @@ -1177,7 +1177,7 @@ gimple_fold_call (gimple_stmt_iterator * gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee)); changed = true; } - else if (flag_devirtualize virtual_method_call_p (callee)) + else if (flag_devirtualize !inplace virtual_method_call_p (callee)) { bool final; vec cgraph_node *targets @@ -1188,13 +1188,29 @@ gimple_fold_call (gimple_stmt_iterator * { gimple_call_set_fndecl (stmt, targets[0]-decl); changed = true; + /* If the call becomes noreturn and we aren't in SSA form + yet, just drop the lhs, because fixup_noreturn_call + isn't run at that point yet. 
*/ + if ((gimple_call_flags (stmt) ECF_NORETURN) +!gimple_in_ssa_p (cfun) +gimple_call_lhs (stmt)) + gimple_call_set_lhs (stmt, NULL_TREE); } - else if (!inplace) + else { + tree lhs = gimple_call_lhs (stmt); tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE); gimple new_stmt = gimple_build_call (fndecl, 0); gimple_set_location (new_stmt, gimple_location (stmt)); - gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + if (lhs TREE_CODE (lhs) == SSA_NAME) + { + tree var = create_tmp_var (TREE_TYPE (lhs), NULL); + tree def = get_or_create_ssa_default_def (cfun, var); + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + update_call_from_tree (gsi, def); + } + else + gsi_replace (gsi, new_stmt, true); return true; } } --- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj 2014-01-09 10:57:46.246694025 +0100 +++ gcc/testsuite/g++.dg/opt/pr59622-2.C 2014-01-09 10:57:46.246694025 +0100 @@ -0,0 +1,21 @@ +// PR tree-optimization/59622 +// { dg-do compile } +// { dg-options -O2 } + +namespace +{ + struct A + { +A () {} +virtual A *bar (int) = 0; +A *baz (int x) { return bar (x); } + }; +} + +A *a; + +void +foo () +{ + a-baz (0); +} ---
Re: [PATCH] Fix PR58115
On Thu, Jan 09, 2014 at 12:11:12PM +0100, Richard Biener wrote: On Thu, Jan 9, 2014 at 11:24 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Jan 09, 2014 at 09:02:28AM +, Richard Sandiford wrote: It looks like a correct fix, but the memcpy is going to be pretty expensive, since in most cases there will be no difference. Perhaps we should add another tree code, which would represent the combination of TARGET_OPTION_NODE and OPTIMIZATION_NODE, FUNCTION_DECL would then refer to this combo node only and that new tree would refer to both TARGET_OPTION_NODE and OPTIMIZATION_NODE. That way we could stick the saved optabs into the new node rather than having default opt cached target optabs, non-default opt with default target optabs cached too, but for non-default target non-default opt we don't cache anything and always recompute. Or perhaps just merge both TARGET_OPTION_NODE and OPTIMIZATION_NODE into one and let both target and optimize attributes adjust it? Yeah - I fail to see why we have two different tree nodes here anyway. Well, if the target_reinit stuff (except for optabs) is only dependent on the target flags and not on the optimization flags (is that really the case?), then by not having a single node only it is possible to save the 0.5MB or what target blob only once per target specific option combination, while with only one node we'd need to duplicate that. And, I think OPTIMIZATION_NODE works regardless of target support, doesn't need target_reinit etc. Jakub
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, Jan 09, 2014 at 12:15:00PM +0100, Richard Biener wrote: Well, then the __cxa_pure_virtual testcases ICE again, but the pr59622-5.C testcase ICEs anyway, so here is a different patch (untested so far except for the tests). The issue with the calls is that when fold_stmt is done during gimplification (or omp lowering), we don't have cfg nor SSA form and nothing performs fixup_noreturn_calls (that function requires CFG and SSA form anyway). So, we have to fix that up by hand, but as we aren't in SSA form yet, dropping lhs is always safe for the noreturn calls. Don't we fix this up during CFG build? Sure, we do, but we ICE far before we get there. #0 error (gmsgid=0x1613b69 LHS in noreturn call) at ../../gcc/diagnostic.c:1041 #1 0x00d64c7b in verify_gimple_call (stmt=gimple_call 0x71a2b000) at ../../gcc/tree-cfg.c:3149 #2 0x00d68e0f in verify_gimple_stmt (stmt=gimple_call 0x71a2b000) at ../../gcc/tree-cfg.c:4323 #3 0x00d69430 in verify_gimple_in_seq_2 (stmts=0x71a232d0) at ../../gcc/tree-cfg.c:4490 #4 0x00d69502 in verify_gimple_in_seq (stmts=0x71a232d0) at ../../gcc/tree-cfg.c:4520 #5 0x00af3806 in gimplify_body (fndecl=function_decl 0x71a0da00 baz, do_parms=true) at ../../gcc/gimplify.c:8599 #6 0x00af3c40 in gimplify_function_tree (fndecl=function_decl 0x71a0da00 baz) at ../../gcc/gimplify.c:8684 Jakub
Re: [Patch, fortran] PR34547 - [4.8/4.9 regression] NULL(): Fortran 2003 changes, accepts invalid, ICE on invalid
Ping! The patch is also ok with me.

Regarding the wording, I'd vote for Tobias' suggestion "NULL() intrinsic not permitted in data-transfer statement", but I'm also ok with the other variants.

Paul, please commit.

Cheers,
Janus

2013/12/1 Tobias Burnus bur...@net-b.de:
> Paul Richard Thomas wrote:
> > This one is trivial. NULL(...) is simply out of context in a transfer statement.
> >
> > Bootstrapped and regtested on FC17/x86_64. OK for trunk and 4.8?
>
> Looks good to me, except that I wonder whether the wording could be improved:
>
>   "Invalid context for NULL () intrinsic at %L"
>
> For instance, something like "NULL() intrinsic not permitted in data-transfer statement" or "Data transfer statement requires an associated pointer" or "NULL() is not an associated pointer as required for a data-transfer statement" or something like that, given that we know that the context is a data transfer statement.
>
> The standard requires: "If an output item is a pointer, it shall be associated with a target" (see just added quote to the PR).
>
> Thus, the patch is fine after shortly pondering about the wording; but I am also fine with your wording.
>
> Tobias
>
> > 2013-11-30  Paul Thomas  pa...@gcc.gnu.org
> >
> > 	PR fortran/34547
> > 	* resolve.c (resolve_transfer): EXPR_NULL is always in an invalid context in a transfer statement.
> >
> > 2013-11-30  Paul Thomas  pa...@gcc.gnu.org
> >
> > 	PR fortran/34547
> > 	* gfortran.dg/null_5.f90 : Include new error.
> > 	* gfortran.dg/null_6.f90 : Include new error.
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, 9 Jan 2014, Jakub Jelinek wrote: On Thu, Jan 09, 2014 at 12:15:00PM +0100, Richard Biener wrote: Well, then the __cxa_pure_virtual testcases ICE again, but the pr59622-5.C testcase ICEs anyway, so here is a different patch (untested so far except for the tests). The issue with the calls is that when fold_stmt is done during gimplification (or omp lowering), we don't have cfg nor SSA form and nothing performs fixup_noreturn_calls (that function requires CFG and SSA form anyway). So, we have to fix that up by hand, but as we aren't in SSA form yet, dropping lhs is always safe for the noreturn calls. Don't we fix this up during CFG build? Sure, we do, but we ICE far before we get there. #0 error (gmsgid=0x1613b69 LHS in noreturn call) at ../../gcc/diagnostic.c:1041 #1 0x00d64c7b in verify_gimple_call (stmt=gimple_call 0x71a2b000) at ../../gcc/tree-cfg.c:3149 #2 0x00d68e0f in verify_gimple_stmt (stmt=gimple_call 0x71a2b000) at ../../gcc/tree-cfg.c:4323 #3 0x00d69430 in verify_gimple_in_seq_2 (stmts=0x71a232d0) at ../../gcc/tree-cfg.c:4490 #4 0x00d69502 in verify_gimple_in_seq (stmts=0x71a232d0) at ../../gcc/tree-cfg.c:4520 #5 0x00af3806 in gimplify_body (fndecl=function_decl 0x71a0da00 baz, do_parms=true) at ../../gcc/gimplify.c:8599 #6 0x00af3c40 in gimplify_function_tree (fndecl=function_decl 0x71a0da00 baz) at ../../gcc/gimplify.c:8684 So what fixes it for int __attribute__((noreturn)) foo () {} int main() { return foo (); } ? Richard.
Re: [PATCH] libsanitizer demangling using cp-demangle.c
Jakub Jelinek ja...@redhat.com a écrit: 2013-12-10 Jakub Jelinek ja...@redhat.com * sanitizer_common/sanitizer_symbolizer_libbacktrace.h (LibbacktraceSymbolizer::Demangle): New declaration. * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc (POSIXSymbolizer::Demangle): Use libbacktrace_symbolizer_'s Demangle method if possible. * sanitizer_common/sanitizer_symbolizer_libbacktrace.cc: Include demangle.h if SANITIZE_CP_DEMANGLE is defined. (struct CplusV3DemangleData): New type. (CplusV3DemangleCallback, CplusV3Demangle): New functions. (SymbolizeCodePCInfoCallback, SymbolizeCodeCallback, SymbolizeDataCallback): Use CplusV3Demangle. * sanitizer_common/Makefile.am (AM_CXXFLAGS): Add -DSANITIZE_CP_DEMANGLE and -I $(top_srcdir)/../include. * libbacktrace/backtrace-rename.h (cplus_demangle_builtin_types, cplus_demangle_fill_ctor, cplus_demangle_fill_dtor, cplus_demangle_fill_extended_operator, cplus_demangle_fill_name, cplus_demangle_init_info, cplus_demangle_mangled_name, cplus_demangle_operators, cplus_demangle_print, cplus_demangle_print_callback, cplus_demangle_type, cplus_demangle_v3, cplus_demangle_v3_callback, is_gnu_v3_mangled_ctor, is_gnu_v3_mangled_dtor, java_demangle_v3, java_demangle_v3_callback): Define. (__asan_internal_memcmp, __asan_internal_strncmp): New prototypes. (memcmp, strncmp): Redefine. * libbacktrace/Makefile.am (libsanitizer_libbacktrace_la_SOURCES): Add ../../libiberty/cp-demangle.c. * libbacktrace/bridge.cc (__asan_internal_memcmp, __asan_internal_strncmp): New functions. * sanitizer_common/Makefile.in: Regenerated. * libbacktrace/Makefile.in: Regenerated. * configure: Regenerated. * configure.ac: Regenerated. * config.h.in: Regenerated. This looks good to me. Thanks. -- Dodji
Re: [PATCH] Fix PR45586
On Thu, 9 Jan 2014, Dominique Dhumieres wrote: This fixes the gimple verification ICEs in the Fortran testsuite. ... This patch causes 6300+ regressions (-m32/-m64 all languages but go). However the following change --- ../_clean/gcc/lto/lto.c 2014-01-04 15:51:44.0 +0100 +++ gcc/lto/lto.c 2014-01-08 08:26:09.0 +0100 @@ -310,7 +310,7 @@ hash_canonical_type (tree type) { v = iterative_hash_hashval_t (TYPE_REF_CAN_ALIAS_ALL (type), v); v = iterative_hash_hashval_t (TYPE_ADDR_SPACE (TREE_TYPE (type)), v); - v = iterative_hash_hashval_t (TYPE_RESTRICT (type), v); + /* v = iterative_hash_hashval_t (TYPE_RESTRICT (type), v); */ v = iterative_hash_hashval_t (TREE_CODE (TREE_TYPE (type)), v); } @@ -495,8 +495,8 @@ gimple_canonical_types_compatible_p (tre != TYPE_ADDR_SPACE (TREE_TYPE (t2))) return false; - if (TYPE_RESTRICT (t1) != TYPE_RESTRICT (t2)) - return false; + /* if (TYPE_RESTRICT (t1) != TYPE_RESTRICT (t2)) + return false; */ if (TREE_CODE (TREE_TYPE (t1)) != TREE_CODE (TREE_TYPE (t2))) return false; fixes PR45586 without regression. Further testing is blocked by PR59723. Yeah. I've tried to look at the fallout but got stuck with inappropriate changes for stage3. So I've settled for the following which beats some sense into the canonical type computation (it's supposed to glob everything that is supposed to TBAA-alias and that is allowed as useless_type_conversion_p-compatible for aggregates). LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2014-01-09 Richard Biener rguent...@suse.de PR lto/45586 * lto.c (hash_canonical_type): Do not hash TREE_ADDRESSABLE, TYPE_ALIGN, TYPE_RESTRICT or TYPE_REF_CAN_ALIAS_ALL. (gimple_canonical_types_compatible_p): Do not compare them either. Index: gcc/lto/lto.c === *** gcc/lto/lto.c (revision 206459) --- gcc/lto/lto.c (working copy) *** hash_canonical_type (tree type) *** 280,287 only existing types having the same features as the new type will be checked. 
*/ v = iterative_hash_hashval_t (TREE_CODE (type), 0); - v = iterative_hash_hashval_t (TREE_ADDRESSABLE (type), v); - v = iterative_hash_hashval_t (TYPE_ALIGN (type), v); v = iterative_hash_hashval_t (TYPE_MODE (type), v); /* Incorporate common features of numerical types. */ --- 280,285 *** hash_canonical_type (tree type) *** 308,316 pointed to but do not recurse to the pointed-to type. */ if (POINTER_TYPE_P (type)) { - v = iterative_hash_hashval_t (TYPE_REF_CAN_ALIAS_ALL (type), v); v = iterative_hash_hashval_t (TYPE_ADDR_SPACE (TREE_TYPE (type)), v); - v = iterative_hash_hashval_t (TYPE_RESTRICT (type), v); v = iterative_hash_hashval_t (TREE_CODE (TREE_TYPE (type)), v); } --- 306,312 *** gimple_canonical_types_compatible_p (tre *** 447,455 if (TREE_CODE (t1) != TREE_CODE (t2)) return false; - if (TREE_ADDRESSABLE (t1) != TREE_ADDRESSABLE (t2)) - return false; - /* Qualifiers do not matter for canonical type comparison purposes. */ /* Void types and nullptr types are always the same. */ --- 443,448 *** gimple_canonical_types_compatible_p (tre *** 458,465 return true; /* Can't be the same type if they have different alignment, or mode. */ ! if (TYPE_ALIGN (t1) != TYPE_ALIGN (t2) ! || TYPE_MODE (t1) != TYPE_MODE (t2)) return false; /* Non-aggregate types can be handled cheaply. */ --- 451,457 return true; /* Can't be the same type if they have different alignment, or mode. */ ! if (TYPE_MODE (t1) != TYPE_MODE (t2)) return false; /* Non-aggregate types can be handled cheaply. */ *** gimple_canonical_types_compatible_p (tre *** 486,503 useless_type_conversion_p would do. */ if (POINTER_TYPE_P (t1)) { - /* If the two pointers have different ref-all attributes, -they can't be the same type. 
*/ - if (TYPE_REF_CAN_ALIAS_ALL (t1) != TYPE_REF_CAN_ALIAS_ALL (t2)) - return false; - if (TYPE_ADDR_SPACE (TREE_TYPE (t1)) != TYPE_ADDR_SPACE (TREE_TYPE (t2))) return false; - if (TYPE_RESTRICT (t1) != TYPE_RESTRICT (t2)) - return false; - if (TREE_CODE (TREE_TYPE (t1)) != TREE_CODE (TREE_TYPE (t2))) return false; } --- 478,487
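A note on why the patch touches both functions: a hash used for bucketing has to ignore every property that the matching comparison ignores, otherwise two types that compare as compatible could land in different buckets and never be merged. The following is only a toy sketch of that invariant. ToyType, toy_mix, toy_hash and toy_compatible are made-up names for illustration and are not the lto.c code.

```cpp
#include <cassert>
#include <cstddef>

// Toy type descriptor; the fields are illustrative, not GCC's tree fields.
struct ToyType {
  int code;
  int mode;
  unsigned align;
  bool is_restrict;
};

// Hypothetical hash combiner, standing in for iterative_hash_hashval_t.
std::size_t toy_mix(std::size_t value, std::size_t seed) {
  return seed * 31u + value;
}

// Hash only the properties that toy_compatible below also inspects.
std::size_t toy_hash(const ToyType &t) {
  std::size_t v = toy_mix(static_cast<std::size_t>(t.code), 0);
  v = toy_mix(static_cast<std::size_t>(t.mode), v);
  return v;
}

// Compatibility ignores align and is_restrict, so the hash ignores them too.
bool toy_compatible(const ToyType &a, const ToyType &b) {
  return a.code == b.code && a.mode == b.mode;
}
```

With this arrangement, two toy types that differ only in alignment or restrict-ness hash identically and compare as compatible, which is the relationship the ChangeLog entry keeps between hash_canonical_type and gimple_canonical_types_compatible_p.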
[PING] Another build!=host fix
Hello,

and Ping for this patch: http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01552.html

Note, however, that cross-building is probably broken anyway at the moment by r205690.

Thanks
Bernd.

> there is a small problem with SSIZE_MAX, because it is not always defined, especially not in gcc/glimits.h, which seems to be the fall-back if the target fails to have a working limits.h.
>
> When I create a cross-compiler for --target=arm-linux-gnueabihf, the working limits.h is overwritten by fix-includes with a copy of gcc/glimits.h. Probably because it is not possible to compile the target headers with the build compiler and produce meaningful test results.
>
> However, because gcc/glimits.h does not define SSIZE_MAX, the following build fails with
>
> In file included from ../../gcc-4.9-20131215/gcc/config/host-linux.c:21:0:
> ../../gcc-4.9-20131215/gcc/config/host-linux.c: In function 'int linux_gt_pch_use_address(void*, size_t, int, size_t)':
> ../../gcc-4.9-20131215/gcc/config/host-linux.c:215:43: error: 'SSIZE_MAX' was not declared in this scope
>    nbytes = read (fd, base, MIN (size, SSIZE_MAX));
>                                        ^
> ../../gcc-4.9-20131215/gcc/system.h:351:26: note: in definition of macro 'MIN'
>  #define MIN(X,Y) ((X) < (Y) ? (X) : (Y))
>                           ^
>
> The simplest way to fix this would be to not use SSIZE_MAX here.
>
> Boot-Strapped and regression-tested on X86_64. Plus cross-build for arm-linux-gnueabihf.
> Ok for trunk?
>
> Thanks
> Bernd.
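For readers wondering what "not using SSIZE_MAX" could look like, here is a minimal sketch. It is not taken from the linked patch. It assumes that ssize_t has the same width as size_t, so that the largest size_t value shifted right by one bit is a safe upper bound for a single read() request. kMaxReadChunk and clamp_read_size are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Largest byte count we are willing to request from one read() call.
// Assumption: ssize_t is as wide as size_t, so its maximum equals the
// largest size_t value shifted right by one bit.
const std::size_t kMaxReadChunk = static_cast<std::size_t>(-1) >> 1;

// Hypothetical helper: clamp a requested length without relying on the
// SSIZE_MAX macro from limits.h.
std::size_t clamp_read_size(std::size_t size) {
  return size < kMaxReadChunk ? size : kMaxReadChunk;
}
```

A caller would then pass clamp_read_size (size) as the third argument of read instead of MIN (size, SSIZE_MAX), which keeps the code independent of whatever limits.h the cross build ends up with.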
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, Jan 09, 2014 at 12:31:47PM +0100, Richard Biener wrote: On Thu, 9 Jan 2014, Jakub Jelinek wrote: On Thu, Jan 09, 2014 at 12:15:00PM +0100, Richard Biener wrote: Well, then the __cxa_pure_virtual testcases ICE again, but the pr59622-5.C testcase ICEs anyway, so here is a different patch (untested so far except for the tests). The issue with the calls is that when fold_stmt is done during gimplification (or omp lowering), we don't have cfg nor SSA form and nothing performs fixup_noreturn_calls (that function requires CFG and SSA form anyway). So, we have to fix that up by hand, but as we aren't in SSA form yet, dropping lhs is always safe for the noreturn calls. Don't we fix this up during CFG build? Sure, we do, but we ICE far before we get there. #0 error (gmsgid=0x1613b69 LHS in noreturn call) at ../../gcc/diagnostic.c:1041 #1 0x00d64c7b in verify_gimple_call (stmt=gimple_call 0x71a2b000) at ../../gcc/tree-cfg.c:3149 #2 0x00d68e0f in verify_gimple_stmt (stmt=gimple_call 0x71a2b000) at ../../gcc/tree-cfg.c:4323 #3 0x00d69430 in verify_gimple_in_seq_2 (stmts=0x71a232d0) at ../../gcc/tree-cfg.c:4490 #4 0x00d69502 in verify_gimple_in_seq (stmts=0x71a232d0) at ../../gcc/tree-cfg.c:4520 #5 0x00af3806 in gimplify_body (fndecl=function_decl 0x71a0da00 baz, do_parms=true) at ../../gcc/gimplify.c:8599 #6 0x00af3c40 in gimplify_function_tree (fndecl=function_decl 0x71a0da00 baz) at ../../gcc/gimplify.c:8684 So what fixes it for int __attribute__((noreturn)) foo () {} int main() { return foo (); } ? gimplify_modify_expr has: if (!gimple_call_noreturn_p (assign)) gimple_call_set_lhs (assign, *to_p); Jakub
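To make the failure mode concrete: the guard in gimplify_modify_expr runs before folding, so a call that only becomes noreturn after fold_stmt has replaced the callee can still carry an LHS. The toy model below illustrates that ordering problem and the "drop the LHS by hand" repair. ToyCall and the toy_* functions are invented for this sketch and do not correspond to GCC's gimple API.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a call statement; not GCC's gimple representation.
struct ToyCall {
  std::string callee;
  bool noreturn;    // callee is declared noreturn
  std::string lhs;  // empty string means "no LHS"
};

// Mirrors the verifier rule from the backtrace: no LHS on a noreturn call.
bool toy_verify_call(const ToyCall &call) {
  return !(call.noreturn && !call.lhs.empty());
}

// Mirrors the guarded assignment: attach the LHS only if the call can return.
void toy_set_lhs(ToyCall &call, const std::string &lhs) {
  if (!call.noreturn)
    call.lhs = lhs;
}

// Models a fold that swaps in a new callee after the LHS was attached,
// then repairs the statement by dropping the LHS if the callee is noreturn.
void toy_fold_to(ToyCall &call, const std::string &new_callee,
                 bool new_noreturn) {
  call.callee = new_callee;
  call.noreturn = new_noreturn;
  if (call.noreturn)
    call.lhs.clear();
}
```

Without the final clear in toy_fold_to, a call that started out as an ordinary one would fail toy_verify_call after folding, which is the toy analogue of the "LHS in noreturn call" error.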
Re: [RFC] libgcov.c re-factoring and offline profile-tool
> I don't understand why static variables can cause any safety issue in multi-thread programs. For any process, gcov_dump will be called only once and there will be one instance of globals.

(v)fork, failing exec and other cases can lead to gcov_flush being called several times, and that calls gcov_exit, which uses the global state. I believe that a simple program containing two threads that both execute execve of a non-existing file will trigger concurrent writes on systems not having __gthread_mutex_lock, which seems to be in place to prevent it. I wonder if we should move the locking into gcov_exit itself? Is there something that will promise us that parallel streaming invoked by another thread at the failing execve is not going to end up in parallel with gcov_exit called via the atexit handler?

In any case, for sanity of setups without gthread support, I think we need to keep an eye on not doing something evil in this case - like writing into random file names or corrupting memory/files.

> 2014-01-08  Rong Xu  x...@google.com
>
> 	* libgcc/libgcov-driver.c (this_prg): make it local to save bss space.
> 	(gcov_exit_compute_summary): Ditto.
> 	(gcov_exit_merge_gcda): Ditto.
> 	(gcov_exit_merge_summary): Ditto.
> 	(gcov_exit_dump_gcov): Ditto.

This is OK, thanks!
Honza
Re: [PATCH] Fix PR59715
On 09-01-14 10:16, Richard Biener wrote:
> This fixes PR59715 by splitting critical edges again before code sinking. The critical edge splitting done before PRE was designed to survive until sinking originally, but at least since 4.5 PRE now eventually cleans up the CFG and thus undoes critical edge splitting. This results in less than optimal code placement (and lost opportunities) for sinking and it breaks (at least) the virtual operand updating code which assumes that critical edges are still split.

Richard,

this follow-up patch:
- notes in pass_pre that PROP_no_crit_edge is destroyed
- notes in pass_sink_code that PROP_no_crit_edge is not required (because it's now ensured by the pass itself)

Build and reg-tested pr59715.c on x86_64.

OK for stage3 trunk if bootstrap and full reg-test on x86_64 is ok?

Thanks,
- Tom

2014-01-09  Tom de Vries  t...@codesourcery.com

	* tree-ssa-pre.c (pass_data_pre): Add comment about PROP_no_crit_edges in properties_required. Add PROP_no_crit_edges to properties_destroyed.
	* tree-ssa-sink.c (pass_data_sink_code): Comment out PROP_no_crit_edges in properties_required.

diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 2de5db5..1e55356 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -4798,9 +4798,11 @@ const pass_data pass_data_pre =
   true, /* has_gate */
   true, /* has_execute */
   TV_TREE_PRE, /* tv_id */
+  /* PROP_no_crit_edges is ensured by placing pass_split_crit_edges before
+     pass_pre.  */
   ( PROP_no_crit_edges | PROP_cfg | PROP_ssa ), /* properties_required */
   0, /* properties_provided */
-  0, /* properties_destroyed */
+  PROP_no_crit_edges, /* properties_destroyed */
   TODO_rebuild_alias, /* todo_flags_start */
   TODO_verify_ssa, /* todo_flags_finish */
 };
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index a72a9e8..b56c4fe 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -604,7 +604,9 @@ const pass_data pass_data_sink_code =
   true, /* has_gate */
   true, /* has_execute */
   TV_TREE_SINK, /* tv_id */
-  ( PROP_no_crit_edges | PROP_cfg | PROP_ssa ), /* properties_required */
+  /* PROP_no_crit_edges is ensured by running split_critical_edges in
+     execute_sink_code.  */
+  ( /* PROP_no_crit_edges | */ PROP_cfg | PROP_ssa ), /* properties_required */
   0, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
Re: std::vector move assign patch
On Fri, Dec 27, 2013 at 10:27 AM, François Dumont frs.dum...@gmail.com wrote:
> Hi
>
> Here is a patch to fix an issue in normal mode during the move assignment. The destination vector allocator instance is moved too during the assignment, which is wrong.
>
> As I discovered this problem while working on issues with management of safe iterators during move operations, this patch also fixes those issues in the debug mode for the vector container. Fixes for other containers in debug mode will come later.
>
> 2013-12-27  François Dumont  fdum...@gcc.gnu.org
>
> 	* include/bits/stl_vector.h (std::vector::_M_move_assign): Pass *this allocator instance when building temporary vector instance so that *this allocator does not get moved.
> 	* include/debug/safe_base.h (_Safe_sequence_base(_Safe_sequence_base&&)): New.
> 	* include/debug/vector (__gnu_debug::vector(vector&&)): Use latter.
> 	(__gnu_debug::vector(vector&&, const allocator_type&)): Swap safe iterators if the instance is moved.
> 	(__gnu_debug::vector::operator=(vector&&)): Likewise.
> 	* testsuite/23_containers/vector/allocator/move.cc (test01): Add check on a vector iterator.
> 	* testsuite/23_containers/vector/allocator/move_assign.cc (test02): Likewise.
> 	(test03): New, test with a non-propagating allocator.
> 	* testsuite/23_containers/vector/debug/move_assign_neg.cc: New.
>
> Tested under Linux x86_64 normal and debug modes.
>
> I will be on vacation for a week starting today, so if you want to apply it quickly do not hesitate to do it yourself.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59738

--
H.J.
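The property at stake in the quoted patch can be checked from user code: when an allocator sets propagate_on_container_move_assignment to false, move assignment must leave the destination's allocator alone. The sketch below is a hypothetical stand-alone check in that spirit and is not the test03 from the patch. TaggedAlloc and tag_after_move_assign are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Minimal stateful allocator that must NOT propagate on move assignment.
template <typename T>
struct TaggedAlloc {
  using value_type = T;
  using propagate_on_container_move_assignment = std::false_type;

  int tag;  // identifies which allocator instance this is

  explicit TaggedAlloc(int t = 0) : tag(t) {}
  template <typename U>
  TaggedAlloc(const TaggedAlloc<U> &other) : tag(other.tag) {}

  T *allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
  void deallocate(T *p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

// Allocators compare equal only when they carry the same tag.
template <typename T, typename U>
bool operator==(const TaggedAlloc<T> &a, const TaggedAlloc<U> &b) {
  return a.tag == b.tag;
}
template <typename T, typename U>
bool operator!=(const TaggedAlloc<T> &a, const TaggedAlloc<U> &b) {
  return !(a == b);
}

// Returns the destination's allocator tag after `dst = std::move(src)`.
int tag_after_move_assign(int dst_tag, int src_tag) {
  using Alloc = TaggedAlloc<int>;
  std::vector<int, Alloc> src(Alloc{src_tag});
  std::vector<int, Alloc> dst(Alloc{dst_tag});
  src.push_back(42);
  dst = std::move(src);
  assert(dst.size() == 1 && dst[0] == 42);  // the element still arrives
  return dst.get_allocator().tag;
}
```

On a conforming library, tag_after_move_assign (1, 2) yields 1: the element is moved over, but the destination keeps its own allocator.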
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, 9 Jan 2014, Jakub Jelinek wrote: On Thu, Jan 09, 2014 at 12:31:47PM +0100, Richard Biener wrote: On Thu, 9 Jan 2014, Jakub Jelinek wrote: On Thu, Jan 09, 2014 at 12:15:00PM +0100, Richard Biener wrote: Well, then the __cxa_pure_virtual testcases ICE again, but the pr59622-5.C testcase ICEs anyway, so here is a different patch (untested so far except for the tests). The issue with the calls is that when fold_stmt is done during gimplification (or omp lowering), we don't have cfg nor SSA form and nothing performs fixup_noreturn_calls (that function requires CFG and SSA form anyway). So, we have to fix that up by hand, but as we aren't in SSA form yet, dropping lhs is always safe for the noreturn calls. Don't we fix this up during CFG build? Sure, we do, but we ICE far before we get there. #0 error (gmsgid=0x1613b69 LHS in noreturn call) at ../../gcc/diagnostic.c:1041 #1 0x00d64c7b in verify_gimple_call (stmt=gimple_call 0x71a2b000) at ../../gcc/tree-cfg.c:3149 #2 0x00d68e0f in verify_gimple_stmt (stmt=gimple_call 0x71a2b000) at ../../gcc/tree-cfg.c:4323 #3 0x00d69430 in verify_gimple_in_seq_2 (stmts=0x71a232d0) at ../../gcc/tree-cfg.c:4490 #4 0x00d69502 in verify_gimple_in_seq (stmts=0x71a232d0) at ../../gcc/tree-cfg.c:4520 #5 0x00af3806 in gimplify_body (fndecl=function_decl 0x71a0da00 baz, do_parms=true) at ../../gcc/gimplify.c:8599 #6 0x00af3c40 in gimplify_function_tree (fndecl=function_decl 0x71a0da00 baz) at ../../gcc/gimplify.c:8684 So what fixes it for int __attribute__((noreturn)) foo () {} int main() { return foo (); } ? gimplify_modify_expr has: if (!gimple_call_noreturn_p (assign)) gimple_call_set_lhs (assign, *to_p); Ok, it seems to be too early then - move it after the folding. Richard.
Re: [PATCH] Fix PR59715
On Thu, 9 Jan 2014, Tom de Vries wrote:

> On 09-01-14 10:16, Richard Biener wrote:
> > This fixes PR59715 by splitting critical edges again before code sinking. The critical edge splitting done before PRE was designed to survive until sinking originally, but at least since 4.5 PRE now eventually cleans up the CFG and thus undoes critical edge splitting. This results in less than optimal code placement (and lost opportunities) for sinking and it breaks (at least) the virtual operand updating code which assumes that critical edges are still split.
>
> Richard,
>
> this follow-up patch:
> - notes in pass_pre that PROP_no_crit_edge is destroyed
> - notes in pass_sink_code that PROP_no_crit_edge is not required (because it's now ensured by the pass itself)
>
> Build and reg-tested pr59715.c on x86_64.
>
> OK for stage3 trunk if bootstrap and full reg-test on x86_64 is ok?

Ok with /* PROP_no_crit_edges | */ not commented but removed.

Thanks,
Richard.
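As background for the properties_required / properties_destroyed discussion, here is a toy model of how such bitmasks can gate passes. It is a sketch under simplifying assumptions. The TOY_PROP_* values, ToyPass, toy_can_run and toy_run are invented for illustration and are not the GCC pass manager.

```cpp
#include <cassert>

// Toy property bits; the names echo the thread but the values are made up.
const unsigned TOY_PROP_cfg = 1u << 0;
const unsigned TOY_PROP_ssa = 1u << 1;
const unsigned TOY_PROP_no_crit_edges = 1u << 2;

// Toy pass descriptor with the three property fields discussed above.
struct ToyPass {
  unsigned required;
  unsigned provided;
  unsigned destroyed;
};

// A pass may run only if every required property currently holds.
bool toy_can_run(unsigned current, const ToyPass &pass) {
  return (current & pass.required) == pass.required;
}

// After a pass runs, add what it provides and clear what it destroys.
unsigned toy_run(unsigned current, const ToyPass &pass) {
  return (current | pass.provided) & ~pass.destroyed;
}
```

In this model, once a PRE-like pass lists the no-critical-edges bit as destroyed, a sinking pass that still lists it as required can no longer run, while one that drops the requirement (and splits edges itself) can.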
Re: [PATCH] Fix PR45586
On Thu, 9 Jan 2014, Jakub Jelinek wrote: On Thu, Jan 09, 2014 at 12:48:49PM +0100, Richard Biener wrote: *** gimple_canonical_types_compatible_p (tre *** 458,465 return true; /* Can't be the same type if they have different alignment, or mode. */ ! if (TYPE_ALIGN (t1) != TYPE_ALIGN (t2) ! || TYPE_MODE (t1) != TYPE_MODE (t2)) return false; /* Non-aggregate types can be handled cheaply. */ --- 451,457 return true; /* Can't be the same type if they have different alignment, or mode. */ ! if (TYPE_MODE (t1) != TYPE_MODE (t2)) The comment needs updating then. Fixed. Richard.
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, Jan 09, 2014 at 01:30:53PM +0100, Richard Biener wrote:
> > gimplify_modify_expr has:
> >   if (!gimple_call_noreturn_p (assign))
> >     gimple_call_set_lhs (assign, *to_p);
>
> Ok, it seems to be too early then - move it after the folding.

That wouldn't help all the other early calls of fold_stmt though. E.g. lower_omp. Plus, even in gimplify_modify_expr, doing it after fold_stmt would mean having to walk all stmts created by the folding?, check if they are calls (because a call can fold into nothing or something completely different). Isn't it better then fold_stmt does that instead?

	Jakub
Re: [PATCH] Fix PR45586
On Thu, Jan 09, 2014 at 12:48:49PM +0100, Richard Biener wrote: *** gimple_canonical_types_compatible_p (tre *** 458,465 return true; /* Can't be the same type if they have different alignment, or mode. */ ! if (TYPE_ALIGN (t1) != TYPE_ALIGN (t2) ! || TYPE_MODE (t1) != TYPE_MODE (t2)) return false; /* Non-aggregate types can be handled cheaply. */ --- 451,457 return true; /* Can't be the same type if they have different alignment, or mode. */ ! if (TYPE_MODE (t1) != TYPE_MODE (t2)) The comment needs updating then. Jakub
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, 9 Jan 2014, Jakub Jelinek wrote:

> On Thu, Jan 09, 2014 at 01:30:53PM +0100, Richard Biener wrote:
> > > gimplify_modify_expr has:
> > >   if (!gimple_call_noreturn_p (assign))
> > >     gimple_call_set_lhs (assign, *to_p);
> >
> > Ok, it seems to be too early then - move it after the folding.
>
> That wouldn't help all the other early calls of fold_stmt though. E.g. lower_omp. Plus, even in gimplify_modify_expr, doing it after fold_stmt would mean having to walk all stmts created by the folding?, check if they are calls (because a call can fold into nothing or something completely different). Isn't it better then fold_stmt does that instead?

Hmm, maybe. Not sure why we are this anal about requiring noreturn calls not to have a LHS. But if we require callers in SSA form to update the stmt and properly cleanup the cfg if fold_stmt returns true then it's reasonable to require at least something for callers from non-SSA/CFG code.

That is, I don't like this special-casing. If so, then rather don't fold at this point - thus if (... !inplace && in_ssa_form (cfun) ...) (or rather if we have a CFG - cfun && cfun->curr_properties & PROP_cfg).

Richard.
Improving mklog [was: Re: RFC Asan instrumentation control]
Hello, I have reproduced the problem with mklog mentioned by Jakub: In my experience mklog is pretty much useless, e.g. if you add a new function, it will list the previous function as being modified rather than the new one, etc. My focus was on functions from headers of diff-log chunks. I hacked a simple addition to mklog which skips unchanged functions in diff-log while adding function names to the final ChangeLog. New mklog results were verified by testsuite which compares reference ChangeLogs of patches from gcc trunk with logs generated by mklog. Patched mklog considerably reduced the number of unchanged functions in ChangeLog. Is it OK for trunk? Thank you, Tatiana Udalova mklog_patch.diff Description: Binary data
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, Jan 09, 2014 at 02:13:39PM +0100, Richard Biener wrote:
> On Thu, 9 Jan 2014, Jakub Jelinek wrote:
> > On Thu, Jan 09, 2014 at 01:30:53PM +0100, Richard Biener wrote:
> > > > gimplify_modify_expr has:
> > > >   if (!gimple_call_noreturn_p (assign))
> > > >     gimple_call_set_lhs (assign, *to_p);
> > >
> > > Ok, it seems to be too early then - move it after the folding.
> >
> > That wouldn't help all the other early calls of fold_stmt though. E.g. lower_omp. Plus, even in gimplify_modify_expr, doing it after fold_stmt would mean having to walk all stmts created by the folding?, check if they are calls (because a call can fold into nothing or something completely different). Isn't it better then fold_stmt does that instead?
>
> Hmm, maybe. Not sure why we are this anal about requiring noreturn calls not to have a LHS. But if we require callers in SSA form to update the stmt and properly cleanup the cfg if fold_stmt returns true then it's reasonable to require at least something for callers from non-SSA/CFG code.
>
> That is, I don't like this special-casing. If so, then rather don't fold at this point - thus if (... !inplace && in_ssa_form (cfun) ...) (or rather if we have a CFG - cfun && cfun->curr_properties & PROP_cfg).

But, isn't right now gimplification the only guaranteed folding of all stmts? I mean, other passes fold_stmt only if they propagate something into them, don't they? Also, in most cases the call actually isn't noreturn, so stopping all the devirtualization just for the unlikely case doesn't look like a good idea to me.

Perhaps, if you don't like the !gimple_in_ssa_p (cfun) in the condition we can just drop the lhs always in that case, just doing what we do for __builtin_unreachable if lhs is SSA_NAME:

  tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
  tree def = get_or_create_ssa_default_def (cfun, var);
  gsi_insert_after (gsi, gimple_build_assign (lhs, def), GSI_NEW_STMT);

	Jakub
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)
On Thu, 9 Jan 2014, Jakub Jelinek wrote:

> On Thu, Jan 09, 2014 at 02:13:39PM +0100, Richard Biener wrote:
> > On Thu, 9 Jan 2014, Jakub Jelinek wrote:
> > > On Thu, Jan 09, 2014 at 01:30:53PM +0100, Richard Biener wrote:
> > > > > gimplify_modify_expr has:
> > > > >   if (!gimple_call_noreturn_p (assign))
> > > > >     gimple_call_set_lhs (assign, *to_p);
> > > >
> > > > Ok, it seems to be too early then - move it after the folding.
> > >
> > > That wouldn't help all the other early calls of fold_stmt though. E.g. lower_omp. Plus, even in gimplify_modify_expr, doing it after fold_stmt would mean having to walk all stmts created by the folding?, check if they are calls (because a call can fold into nothing or something completely different). Isn't it better then fold_stmt does that instead?
> >
> > Hmm, maybe. Not sure why we are this anal about requiring noreturn calls not to have a LHS. But if we require callers in SSA form to update the stmt and properly cleanup the cfg if fold_stmt returns true then it's reasonable to require at least something for callers from non-SSA/CFG code.
> >
> > That is, I don't like this special-casing. If so, then rather don't fold at this point - thus if (... !inplace && in_ssa_form (cfun) ...) (or rather if we have a CFG - cfun && cfun->curr_properties & PROP_cfg).
>
> But, isn't right now gimplification the only guaranteed folding of all stmts?

Actually gimplification doesn't fold all stmts either, but yes.

> I mean, other passes fold_stmt only if they propagate something into them, don't they? Also, in most cases the call actually isn't noreturn, so stopping all the devirtualization just for the unlikely case doesn't look like a good idea to me.
>
> Perhaps, if you don't like the !gimple_in_ssa_p (cfun) in the condition we can just drop the lhs always in that case, just doing what we do for __builtin_unreachable if lhs is SSA_NAME:
>
>   tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
>   tree def = get_or_create_ssa_default_def (cfun, var);
>   gsi_insert_after (gsi, gimple_build_assign (lhs, def), GSI_NEW_STMT);

That works for me.

Richard.
Re: [Patch, Fortran, committed] PR 59612: iso_fortran_env segfaults with -fdump-fortran-original
After noticing that the bug is actually a regression (see PR 57042): Ok to backport to 4.7 and 4.8?

Cheers,
Janus

2013/12/29 Janus Weil ja...@gcc.gnu.org:
> Hi all,
>
> I have just committed an obvious patch for a segfault with -fdump-fortran-original (plus a small documentation fix):
>
> http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=206237
>
> Cheers,
> Janus
Re: [PATCH] libsanitizer demangling using cp-demangle.c
On Tue, Dec 10, 2013 at 3:38 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Dec 06, 2013 at 06:40:52AM -0800, Ian Lance Taylor wrote: There was a recent buggy patch to the demangler that added calls to malloc and realloc (2013-10-25 Gary Benson gben...@redhat.com). That patch must be fixed or reverted before the 4.9 release. The main code in the demangler must not call malloc/realloc. When that patch is fixed, you can use the cplus_demangle_v3_callback function to get a demangler that never calls malloc. AFAIK Gary is working on a fix, when that is fixed, with the following patch libsanitizer (when using libbacktrace for symbolization) will not use system malloc/realloc/free for the demangling at all. Tested on x86_64-linux (-m64/-m32). Note that the changes for the 3 files unfortunately will need to be applied upstream to compiler-rt, is that possible? 2013-12-10 Jakub Jelinek ja...@redhat.com * sanitizer_common/sanitizer_symbolizer_libbacktrace.h (LibbacktraceSymbolizer::Demangle): New declaration. * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc sanitizer_symbolizer_posix_libcdep.cc is the file from upstream. If it gets any change in the GCC variant, I will not be able to do merges from upstream until the same code is applied upstream. (POSIXSymbolizer::Demangle): Use libbacktrace_symbolizer_'s Demangle method if possible. * sanitizer_common/sanitizer_symbolizer_libbacktrace.cc: Include demangle.h if SANITIZE_CP_DEMANGLE is defined. (struct CplusV3DemangleData): New type. (CplusV3DemangleCallback, CplusV3Demangle): New functions. (SymbolizeCodePCInfoCallback, SymbolizeCodeCallback, SymbolizeDataCallback): Use CplusV3Demangle. * sanitizer_common/Makefile.am (AM_CXXFLAGS): Add -DSANITIZE_CP_DEMANGLE and -I $(top_srcdir)/../include. 
* libbacktrace/backtrace-rename.h (cplus_demangle_builtin_types, cplus_demangle_fill_ctor, cplus_demangle_fill_dtor, cplus_demangle_fill_extended_operator, cplus_demangle_fill_name, cplus_demangle_init_info, cplus_demangle_mangled_name, cplus_demangle_operators, cplus_demangle_print, cplus_demangle_print_callback, cplus_demangle_type, cplus_demangle_v3, cplus_demangle_v3_callback, is_gnu_v3_mangled_ctor, is_gnu_v3_mangled_dtor, java_demangle_v3, java_demangle_v3_callback): Define. (__asan_internal_memcmp, __asan_internal_strncmp): New prototypes. (memcmp, strncmp): Redefine. * libbacktrace/Makefile.am (libsanitizer_libbacktrace_la_SOURCES): Add ../../libiberty/cp-demangle.c. * libbacktrace/bridge.cc (__asan_internal_memcmp, __asan_internal_strncmp): New functions. * sanitizer_common/Makefile.in: Regenerated. * libbacktrace/Makefile.in: Regenerated. * configure: Regenerated. * configure.ac: Regenerated. * config.h.in: Regenerated. --- libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.h.jj 2013-12-05 12:04:28.0 +0100 +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.h 2013-12-10 11:01:26.777371566 +0100 @@ -29,6 +29,8 @@ class LibbacktraceSymbolizer { bool SymbolizeData(DataInfo *info); + const char *Demangle(const char *name); + private: explicit LibbacktraceSymbolizer(void *state) : state_(state) {} --- libsanitizer/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc.jj 2013-12-05 12:04:28.0 +0100 +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc 2013-12-10 11:03:02.971876505 +0100 @@ -513,6 +513,11 @@ class POSIXSymbolizer : public Symbolize SymbolizerScope sym_scope(this); if (internal_symbolizer_ != 0) return internal_symbolizer_-Demangle(name); +if (libbacktrace_symbolizer_ != 0) { + const char *demangled = libbacktrace_symbolizer_-Demangle(name); + if (demangled) + return demangled; +} return DemangleCXXABI(name); } --- libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc.jj 2013-12-09 
14:32:06.0 +0100 +++ libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc 2013-12-10 11:48:19.803830291 +0100 @@ -20,6 +20,10 @@ # include backtrace-supported.h # if SANITIZER_POSIX BACKTRACE_SUPPORTED !BACKTRACE_USES_MALLOC # include backtrace.h +# if SANITIZER_CP_DEMANGLE +# undef ARRAY_SIZE +# include demangle.h +# endif # else # define SANITIZER_LIBBACKTRACE 0 # endif @@ -31,6 +35,60 @@ namespace __sanitizer { namespace { +#if SANITIZER_CP_DEMANGLE +struct CplusV3DemangleData { + char *buf; + uptr size, allocated; +}; + +extern C { +static void CplusV3DemangleCallback(const char *s, size_t l, void *vdata) { + CplusV3DemangleData *data = (CplusV3DemangleData *)vdata; + uptr needed =
[PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
On 25/12/13 14:02, Tom de Vries wrote: On 07-12-13 16:07, Tom de Vries wrote: Richard, This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to address the issue that $6 is sometimes used in split calls. Build and reg-tested on MIPS. OK for stage1? Richard, Ping. This patch is the only part of -fuse-caller-save that still needs approval. This patch was submitted here ( http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00771.html ) and is required for the -fuse-caller-save optimization which was submitted here ( http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01234.html ). The patch fixes a correctness issue with -fuse-caller-save for MIPS. OK for stage1? Thanks, - Tom
Re: [PATCH] libsanitizer demangling using cp-demangle.c
On Thu, Jan 09, 2014 at 05:51:05PM +0400, Konstantin Serebryany wrote: On Tue, Dec 10, 2013 at 3:38 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Dec 06, 2013 at 06:40:52AM -0800, Ian Lance Taylor wrote: There was a recent buggy patch to the demangler that added calls to malloc and realloc (2013-10-25 Gary Benson gben...@redhat.com). That patch must be fixed or reverted before the 4.9 release. The main code in the demangler must not call malloc/realloc. When that patch is fixed, you can use the cplus_demangle_v3_callback function to get a demangler that never calls malloc. AFAIK Gary is working on a fix, when that is fixed, with the following patch libsanitizer (when using libbacktrace for symbolization) will not use system malloc/realloc/free for the demangling at all. Tested on x86_64-linux (-m64/-m32). Note that the changes for the 3 files unfortunately will need to be applied upstream to compiler-rt, is that possible? 2013-12-10 Jakub Jelinek ja...@redhat.com * sanitizer_common/sanitizer_symbolizer_libbacktrace.h (LibbacktraceSymbolizer::Demangle): New declaration. * sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc sanitizer_symbolizer_posix_libcdep.cc is the file from upstream. If it gets any change in the GCC variant, I will not be able to do merges from upstream until the same code is applied upstream. Sure, but we are nearing GCC 4.9 stage3 finish and really need to demangle the libbacktrace provided output. Has the compiler-rt situation been cleared up? Haven't seen any follow-ups after Chandler's reversion. So, this change is meant to be temporary, with hope that in upstream this will be resolved, either with the same patch or something similar. Jakub
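The callback style discussed above can be illustrated without libiberty. This is only a sketch under stated assumptions: `emit_pieces` is a hypothetical stand-in for a callback-driven producer such as `cplus_demangle_v3_callback`, the callback signature (`const char *`, `size_t`, `void *`) mirrors the `CplusV3DemangleCallback` declaration in the patch posted earlier in this thread, and `FixedBuf`, `append_piece` and `demangle_into` are illustrative names that do not appear in the patch. The consumer appends into a caller-supplied fixed buffer, so nothing on this path calls malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Caller-owned output buffer; hypothetical type for this sketch.  */
struct FixedBuf {
  char *buf;
  size_t size;      /* bytes used, not counting the terminating NUL */
  size_t capacity;  /* total bytes available in buf */
  int overflowed;   /* set once a piece did not fit */
};

/* Callback: append L bytes of S to the buffer, never allocating.  */
static void
append_piece (const char *s, size_t l, void *vdata)
{
  struct FixedBuf *data = (struct FixedBuf *) vdata;
  /* Leave room for the terminating NUL.  */
  if (data->overflowed || l >= data->capacity - data->size)
    {
      data->overflowed = 1;
      return;
    }
  memcpy (data->buf + data->size, s, l);
  data->size += l;
  data->buf[data->size] = '\0';
}

/* Hypothetical stand-in for a callback-driven demangler: it hands the
   result to the callback in several pieces.  */
static void
emit_pieces (void (*cb) (const char *, size_t, void *), void *opaque)
{
  cb ("foo", 3, opaque);
  cb ("::", 2, opaque);
  cb ("bar()", 5, opaque);
}

/* Returns 1 on success, 0 if OUT was too small.  */
static int
demangle_into (char *out, size_t out_size)
{
  struct FixedBuf data = { out, 0, out_size, 0 };
  assert (out_size > 0);
  out[0] = '\0';
  emit_pieces (append_piece, &data);
  return !data.overflowed;
}
```

A real symbolizer would presumably need a growable buffer from its own internal allocator rather than a fixed one; the fixed buffer here only keeps the sketch short.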
Re: [PATCH] Fix PR49718 : allow no_instrument_function attribute in class member definition/declaration
On 01/09/14 06:02, Jeff Law wrote:
On 01/08/14 02:05, Laurent Alfonsi wrote:
All, I was looking at PR49718. I have enclosed a simple fix for this bug report.

2014-01-07 Laurent Alfonsi laurent.alfo...@st.com

* c-family/c-common.c (handle_no_instrument_function_attribute): Allow no_instrument_function attribute in class member definition/declaration.

Looking at the implementation of the function attributes, I see no reason anymore to keep this error message. Let me know if I missed something. I have also added a testcase in the enclosed patch.

2014-01-07 Laurent Alfonsi laurent.alfo...@st.com

PR c++/49718
* g++.dg/pr49718.C: New

Isn't the idea here that if we've already generated the function body (presumably with instrumentation) that a no-instrument attribute appearing on a later declaration won't do anything useful?

jeff

Jeff,

You are right. That's probably the reason. From what I can see, the code instrumentation is performed in the gimplification pass (gimplify_function_tree), and the function attribute is handled and attached earlier, in the parsing phase. I've checked with an example like:

---8<--8<--8<--8<--8<---
int foo () { return 2; }
int bar () { return 1; }
int foo () __attribute__((no_instrument_function));
---8<--8<--8<--8<--8<---

The attribute is honored on the foo function. I might need to add this test case too. Let me know if the fix is OK.

Thanks,
Laurent
A question about forward_addr.
Hi all,

I'm confused by the comment in should_replace_address. Here is the code in fwprop.c:

/* OLD is a memory address.  Return whether it is good to use NEW instead,
   for a memory access in the given MODE.  */

static bool
should_replace_address (rtx old_rtx, rtx new_rtx, enum machine_mode mode,
                        addr_space_t as, bool speed)
{
  int gain;

  if (rtx_equal_p (old_rtx, new_rtx)
      || !memory_address_addr_space_p (mode, new_rtx, as))
    return false;

  /* Copy propagation is always ok.  */
  if (REG_P (old_rtx) && REG_P (new_rtx))
    return true;

  /* Prefer the new address if it is less expensive.  */
  gain = (address_cost (old_rtx, mode, as, speed)
          - address_cost (new_rtx, mode, as, speed));

  /* If the addresses have equivalent cost, prefer the new address
     if it has the highest `set_src_cost'.  That has the potential of
     eliminating the most insns without additional costs, and it
     is the same that cse.c used to do.  */
  if (gain == 0)
    gain = set_src_cost (new_rtx, speed) - set_src_cost (old_rtx, speed);

  return (gain > 0);
}

According to the comment, shouldn't 'return (gain > 0)' be 'return (gain >= 0)'?

Here is the case for forward_addr:

insn set r155 plus r167 + 32
insn set mem (r155) r188
insn set mem (plus r155 + 8) r189
..

If it is handled by the original code, the result will be:

insn set r155 plus r167 + 32
insn set mem (r167 + 32) r188
insn set mem (plus r155 + 8) r189

However, it is expected to be:

insn set mem (r167 + 32) r188
insn set mem (plus r167 + 40) r189

Since the cost of the address 'r155 + 8' is equal to that of 'r167 + 40', I think we should prefer the new address. Wouldn't that be profitable?

Brs, Peter Xu.
- Dying in the sun.
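The decision being questioned can be looked at in isolation. The following is a toy sketch, not GCC code: `toy_should_replace` and its integer cost parameters are hypothetical stand-ins for the results of `address_cost` and `set_src_cost`, and the `accept_ties` flag switches between the existing strict test and the non-strict test proposed in the question.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the final decision in should_replace_address.
   All names and parameters here are hypothetical; the real function
   works on rtx values and asks the target for the costs.  */
static bool
toy_should_replace (int old_addr_cost, int new_addr_cost,
                    int old_src_cost, int new_src_cost,
                    bool accept_ties)
{
  /* Prefer the new address if it is less expensive.  */
  int gain = old_addr_cost - new_addr_cost;

  /* On equal address cost, fall back to the source-cost difference.  */
  if (gain == 0)
    gain = new_src_cost - old_src_cost;

  /* accept_ties models the proposed non-strict comparison.  */
  return accept_ties ? gain >= 0 : gain > 0;
}
```

With equal address costs and equal source costs the gain is zero, so the strict test keeps the old address and only the non-strict variant replaces it. Whether accepting ties is profitable in general is exactly the open question of this message; the sketch takes no position on it.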
[Patch] Avoid gcc_assert in libgcov
As suggested by Honza, avoid bloating libgcov from gcc_assert by using a new macro gcov_nonruntime_assert in gcov-io.c that is only mapped to gcc_assert when not in libgcov. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Teresa 2014-01-09 Teresa Johnson tejohn...@google.com * gcov-io.c (gcov_position): Use gcov_nonruntime_assert. (gcov_is_error): Ditto. (gcov_rewrite): Ditto. (gcov_open): Ditto. (gcov_write_words): Ditto. (gcov_write_length): Ditto. (gcov_read_words): Ditto. (gcov_read_summary): Ditto. (gcov_sync): Ditto. (gcov_seek): Ditto. (gcov_histo_index): Ditto. (static void gcov_histogram_merge): Ditto. (compute_working_sets): Ditto. * gcov-io.h (gcov_nonruntime_assert): Define. Index: gcov-io.c === --- gcov-io.c (revision 206435) +++ gcov-io.c (working copy) @@ -67,7 +67,7 @@ GCOV_LINKAGE struct gcov_var static inline gcov_position_t gcov_position (void) { - gcc_assert (gcov_var.mode 0); + gcov_nonruntime_assert (gcov_var.mode 0); return gcov_var.start + gcov_var.offset; } @@ -83,7 +83,7 @@ gcov_is_error (void) GCOV_LINKAGE inline void gcov_rewrite (void) { - gcc_assert (gcov_var.mode 0); + gcov_nonruntime_assert (gcov_var.mode 0); gcov_var.mode = -1; gcov_var.start = 0; gcov_var.offset = 0; @@ -133,7 +133,7 @@ gcov_open (const char *name, int mode) s_flock.l_pid = getpid (); #endif - gcc_assert (!gcov_var.file); + gcov_nonruntime_assert (!gcov_var.file); gcov_var.start = 0; gcov_var.offset = gcov_var.length = 0; gcov_var.overread = -1u; @@ -291,14 +291,14 @@ gcov_write_words (unsigned words) { gcov_unsigned_t *result; - gcc_assert (gcov_var.mode 0); + gcov_nonruntime_assert (gcov_var.mode 0); #if IN_LIBGCOV if (gcov_var.offset = GCOV_BLOCK_SIZE) { gcov_write_block (GCOV_BLOCK_SIZE); if (gcov_var.offset) { - gcc_assert (gcov_var.offset == 1); + gcov_nonruntime_assert (gcov_var.offset == 1); memcpy (gcov_var.buffer, gcov_var.buffer + GCOV_BLOCK_SIZE, 4); } } @@ -393,9 +393,9 @@ gcov_write_length (gcov_position_t position) 
gcov_unsigned_t length; gcov_unsigned_t *buffer; - gcc_assert (gcov_var.mode 0); - gcc_assert (position + 2 = gcov_var.start + gcov_var.offset); - gcc_assert (position = gcov_var.start); + gcov_nonruntime_assert (gcov_var.mode 0); + gcov_nonruntime_assert (position + 2 = gcov_var.start + gcov_var.offset); + gcov_nonruntime_assert (position = gcov_var.start); offset = position - gcov_var.start; length = gcov_var.offset - offset - 2; buffer = (gcov_unsigned_t *) gcov_var.buffer[offset]; @@ -481,14 +481,14 @@ gcov_read_words (unsigned words) const gcov_unsigned_t *result; unsigned excess = gcov_var.length - gcov_var.offset; - gcc_assert (gcov_var.mode 0); + gcov_nonruntime_assert (gcov_var.mode 0); if (excess words) { gcov_var.start += gcov_var.offset; #if IN_LIBGCOV if (excess) { - gcc_assert (excess == 1); + gcov_nonruntime_assert (excess == 1); memcpy (gcov_var.buffer, gcov_var.buffer + gcov_var.offset, 4); } #else @@ -497,7 +497,7 @@ gcov_read_words (unsigned words) gcov_var.offset = 0; gcov_var.length = excess; #if IN_LIBGCOV - gcc_assert (!gcov_var.length || gcov_var.length == 1); + gcov_nonruntime_assert (!gcov_var.length || gcov_var.length == 1); excess = GCOV_BLOCK_SIZE; #else if (gcov_var.length + words gcov_var.alloc) @@ -614,7 +614,7 @@ gcov_read_summary (struct gcov_summary *summary) while (!cur_bitvector) { h_ix = bv_ix * 32; - gcc_assert (bv_ix GCOV_HISTOGRAM_BITVECTOR_SIZE); + gcov_nonruntime_assert (bv_ix GCOV_HISTOGRAM_BITVECTOR_SIZE); cur_bitvector = histo_bitvector[bv_ix++]; } while (!(cur_bitvector 0x1)) @@ -622,7 +622,7 @@ gcov_read_summary (struct gcov_summary *summary) h_ix++; cur_bitvector = 1; } - gcc_assert (h_ix GCOV_HISTOGRAM_SIZE); + gcov_nonruntime_assert (h_ix GCOV_HISTOGRAM_SIZE); csum-histogram[h_ix].num_counters = gcov_read_unsigned (); csum-histogram[h_ix].min_value = gcov_read_counter (); @@ -642,7 +642,7 @@ gcov_read_summary (struct gcov_summary *summary) GCOV_LINKAGE void gcov_sync (gcov_position_t base, gcov_unsigned_t length) { 
- gcc_assert (gcov_var.mode 0); + gcov_nonruntime_assert (gcov_var.mode 0); base += length; if (base - gcov_var.start = gcov_var.length) gcov_var.offset = base - gcov_var.start; @@ -661,7 +661,7 @@ gcov_sync (gcov_position_t base, gcov_unsigned_t l GCOV_LINKAGE void gcov_seek (gcov_position_t base) { - gcc_assert (gcov_var.mode 0); +
Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
On 30/03/13 16:10, Tom de Vries wrote: On 29/03/13 13:54, Tom de Vries wrote: I split the patch up into 10 patches, to facilitate further review: ... 0001-Add-command-line-option.patch 0002-Add-new-reg-note-REG_CALL_DECL.patch 0003-Add-implicit-parameter-to-find_all_hard_reg_sets.patch 0004-Add-TARGET_FN_OTHER_HARD_REG_USAGE-hook.patch 0005-Implement-TARGET_FN_OTHER_HARD_REG_USAGE-hook-for-ARM.patch 0006-Collect-register-usage-information.patch 0007-Use-collected-register-usage-information.patch 0008-Enable-by-default-at-O2-and-higher.patch 0009-Add-documentation.patch 0010-Add-test-case.patch ... I'll post these in reply to this email. Something went wrong with those emails, which were generated. I tested the emails by sending them to my work email, where they looked fine. I managed to reproduce the problem by sending them to my private email. It seems the problem was inconsistent EOL format. I've written a python script to handle composing the email, and posted it here using that script: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01311.html. Given that that email looks ok, I think I've addressed the problems now. I'll repost the patches. Sorry about the noise. Thanks, - Tom It's unfortunate that this feature doesn't fail safe when a port has not explicitly defined what should happen. Consequently, you'll need to add a patch for AArch64 which has two registers clobbered by PLT-based calls. R.
Re: [Patch] Avoid gcc_assert in libgcov
As suggested by Honza, avoid bloating libgcov from gcc_assert by using a new macro gcov_nonruntime_assert in gcov-io.c that is only mapped to gcc_assert when not in libgcov. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?

Thanks,
Teresa

2014-01-09 Teresa Johnson tejohn...@google.com

* gcov-io.c (gcov_position): Use gcov_nonruntime_assert.
(gcov_is_error): Ditto.
(gcov_rewrite): Ditto.
(gcov_open): Ditto.
(gcov_write_words): Ditto.
(gcov_write_length): Ditto.
(gcov_read_words): Ditto.
(gcov_read_summary): Ditto.
(gcov_sync): Ditto.
(gcov_seek): Ditto.
(gcov_histo_index): Ditto.
(static void gcov_histogram_merge): Ditto.
(compute_working_sets): Ditto.
* gcov-io.h (gcov_nonruntime_assert): Define.

@@ -481,14 +481,14 @@ gcov_read_words (unsigned words)
   const gcov_unsigned_t *result;
   unsigned excess = gcov_var.length - gcov_var.offset;

-  gcc_assert (gcov_var.mode > 0);
+  gcov_nonruntime_assert (gcov_var.mode > 0);
   if (excess < words)
     {
       gcov_var.start += gcov_var.offset;
 #if IN_LIBGCOV
       if (excess)
         {
-          gcc_assert (excess == 1);
+          gcov_nonruntime_assert (excess == 1);

It probably makes no sense to put nonruntime access into IN_LIBGCOV defines.

           memcpy (gcov_var.buffer, gcov_var.buffer + gcov_var.offset, 4);
         }
 #else
@@ -497,7 +497,7 @@ gcov_read_words (unsigned words)
       gcov_var.offset = 0;
       gcov_var.length = excess;
 #if IN_LIBGCOV
-      gcc_assert (!gcov_var.length || gcov_var.length == 1);
+      gcov_nonruntime_assert (!gcov_var.length || gcov_var.length == 1);
       excess = GCOV_BLOCK_SIZE;
 #else
       if (gcov_var.length + words > gcov_var.alloc)
@@ -614,7 +614,7 @@ gcov_read_summary (struct gcov_summary *summary)
           while (!cur_bitvector)
             {
               h_ix = bv_ix * 32;
-              gcc_assert (bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE);
+              gcov_nonruntime_assert (bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE);
               cur_bitvector = histo_bitvector[bv_ix++];
             }
           while (!(cur_bitvector & 0x1))
@@ -622,7 +622,7 @@ gcov_read_summary (struct gcov_summary *summary)
               h_ix++;
               cur_bitvector >>= 1;
             }
-          gcc_assert (h_ix < GCOV_HISTOGRAM_SIZE);
+          gcov_nonruntime_assert (h_ix < GCOV_HISTOGRAM_SIZE);

How many of those asserts can be triggered by a corrupted gcda file? I would like to make libgcov more safe WRT file corruptions, too, so in that case we should produce an error message. The rest of changes seems OK.

Honza
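One way a macro matching this description could be defined is sketched below. It is an assumption based only on the wording in this thread ("only mapped to gcc_assert when not in libgcov"), not a copy of the gcov-io.h hunk, which is not quoted here. `gcc_assert` is stubbed with the standard `assert` so the fragment compiles on its own, `IN_LIBGCOV` is set by hand, and `checked_mode`, `evaluations` and `demo` are hypothetical helpers that only exist to show the asserted expression is never evaluated in the libgcov configuration.

```c
#include <assert.h>

/* Stand-in so the sketch is self-contained; in GCC, gcc_assert is
   provided by the compiler's own headers.  */
#define gcc_assert(EXPR) assert (EXPR)

/* Pretend we are building the runtime library.  */
#define IN_LIBGCOV 1

#if IN_LIBGCOV
/* Keep EXPR syntactically checked but never evaluate it, so no
   assertion code ends up in libgcov.  */
#define gcov_nonruntime_assert(EXPR) ((void) (0 && (EXPR)))
#else
#define gcov_nonruntime_assert(EXPR) gcc_assert (EXPR)
#endif

/* Hypothetical helper: counts how often the asserted expression is
   actually evaluated.  */
static int evaluations;

static int
checked_mode (int mode)
{
  evaluations++;
  return mode > 0;
}

/* Returns the number of evaluations seen so far.  */
static int
demo (int mode)
{
  gcov_nonruntime_assert (checked_mode (mode));
  return evaluations;
}
```

A macro of this shape checks nothing at run time inside libgcov, so it would not by itself diagnose a corrupted gcda file; reporting such corruption, as Honza asks, would need separate error-handling code.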
Re: [PATCH, go]: Skip some go tests
On Thu, Jan 9, 2014 at 2:54 AM, Uros Bizjak ubiz...@gmail.com wrote: 2014-01-09 Uros Bizjak ubiz...@gmail.com * go.test/go-test.exp (go-gc-tests): Don't run peano.go on systems which don't support -fsplit-stack. Skip rotate[0123]-out.go. This is OK. Thanks. You might want to tweak the comment just under where you added peano.go. Then go ahead and commit. Ian
Re: [ARM] add armv7ve support
Hi Gerald, Sorry for the late reply! We're working on a list of all the ARM-related changes in 4.9. This will also be included. Kind regards, Renlin On 03/01/14 13:24, Gerald Pfeifer wrote: Renlin Li renlin...@arm.com wrote: Hi all, This patch will add armv7ve support to gcc. Armv7ve is basically an armv7-a architecture profile with Virtualization Extensions. Mind adding this to the release notes? Gerald
Re: [PATCH] Add zero-overhead looping for xtensa backend
Hi Sterling,

Attached please find version 2 of the patch. I applied this updated patch (with small adaptations) to gcc-4.8.2 and carried out some tests. I can execute the testcases in a simulator, which supports zero-overhead looping instructions. First of all, I can successfully build libgcc, libstdc++ and newlib for xtensa with this patch. The newly built xtensa gcc also passed the testsuite which comes with newlib. I also tested the cases under the gcc/testsuite/gcc.c-torture/execute/ directory. There are about 800+ cases tested. The test result shows no new failed case with this patch, compared with the original gcc version. Is that OK?

I also double checked the loop relaxation issue with binutils-2.24 (the latest version). The result shows that the assembler can do loop relaxation when the loop target is too far (> 256 bytes). And this is the reason why I don't check the size of the loop.

Index: gcc/ChangeLog
===
--- gcc/ChangeLog(revision 206463)
+++ gcc/ChangeLog(working copy)
@@ -1,3 +1,18 @@
+2014-01-09 Felix Yang fei.yang0...@gmail.com
+
+* config/xtensa/xtensa.c (xtensa_reorg): New.
+(xtensa_reorg_loops): New.
+(xtensa_can_use_doloop_p): New.
+(xtensa_invalid_within_doloop): New.
+(hwloop_optimize): New.
+(hwloop_fail): New.
+(hwloop_pattern_reg): New.
+(xtensa_emit_loop_end): Modified to emit the zero-overhead loop end label.
+(xtensa_doloop_hooks): Define.
+* config/xtensa/xtensa.md (doloop_end): New.
+(zero_cost_loop_start): Rewritten.
+(zero_cost_loop_end): Rewritten.
+
 2014-01-09 Richard Biener rguent...@suse.de

 PR tree-optimization/59715
Index: gcc/config/xtensa/xtensa.md
===
--- gcc/config/xtensa/xtensa.md(revision 206463)
+++ gcc/config/xtensa/xtensa.md(working copy)
@@ -1,6 +1,7 @@
 ;; GCC machine description for Tensilica's Xtensa architecture.
 ;; Copyright (C) 2001-2014 Free Software Foundation, Inc.
 ;; Contributed by Bob Wilson (bwil...@tensilica.com) at Tensilica.
+;; Zero-overhead looping support by Felix Yang (fei.yang0...@gmail.com).
;; This file is part of GCC. @@ -35,6 +36,8 @@ (UNSPEC_TLS_CALL9) (UNSPEC_TP10) (UNSPEC_MEMW11) + (UNSPEC_LSETUP_START 12) + (UNSPEC_LSETUP_END13) (UNSPECV_SET_FP1) (UNSPECV_ENTRY2) @@ -1289,41 +1292,67 @@ (set_attr length3)]) +;; Hardware loop support. + ;; Define the loop insns used by bct optimization to represent the -;; start and end of a zero-overhead loop (in loop.c). This start -;; template generates the loop insn; the end template doesn't generate -;; any instructions since loop end is handled in hardware. +;; start and end of a zero-overhead loop. This start template generates +;; the loop insn; the end template doesn't generate any instructions since +;; loop end is handled in hardware. (define_insn zero_cost_loop_start [(set (pc) -(if_then_else (eq (match_operand:SI 0 register_operand a) - (const_int 0)) - (label_ref (match_operand 1 )) - (pc))) - (set (reg:SI 19) -(plus:SI (match_dup 0) (const_int -1)))] +(if_then_else (ne (match_operand:SI 0 register_operand a) + (const_int 1)) + (label_ref (match_operand 1 )) + (pc))) + (set (match_operand:SI 2 register_operand +a0) +(plus (match_dup 2) + (const_int -1))) + (unspec [(const_int 0)] UNSPEC_LSETUP_START)] - loopnez\t%0, %l1 + loop\t%0, %l1_LEND [(set_attr typejump) (set_attr modenone) (set_attr length3)]) (define_insn zero_cost_loop_end [(set (pc) -(if_then_else (ne (reg:SI 19) (const_int 0)) - (label_ref (match_operand 0 )) - (pc))) - (set (reg:SI 19) -(plus:SI (reg:SI 19) (const_int -1)))] +(if_then_else (ne (match_operand:SI 0 register_operand a) + (const_int 1)) + (label_ref (match_operand 1 )) + (pc))) + (set (match_operand:SI 2 register_operand +a0) +(plus (match_dup 2) + (const_int -1))) + (unspec [(const_int 0)] UNSPEC_LSETUP_END)] { -xtensa_emit_loop_end (insn, operands); -return ; + xtensa_emit_loop_end (insn, operands); + return ; } [(set_attr typejump) (set_attr modenone) (set_attr length0)]) +; operand 0 is the loop count pseudo register +; operand 1 is the label to jump to at the top of 
the loop +(define_expand doloop_end + [(parallel [(set (pc) (if_then_else + (ne (match_operand:SI 0 ) + (const_int 1)) + (label_ref (match_operand 1 )) + (pc))) + (set (match_dup 0) + (plus:SI
Re: wide-int, build system
On Sat, Nov 23, 2013 at 8:20 PM, Mike Stump mikest...@comcast.net wrote: Richi has asked that we break the wide-int patch so that the individual port and front end maintainers can review their parts without having to go through the entire patch. This patch covers the build system (make). Ok? Needs updating (no explicit dependences for wide-int.h) but ok. Richard.
Re: wide-int, doc
On Sat, Nov 23, 2013 at 8:21 PM, Mike Stump mikest...@comcast.net wrote: Richi has asked that we break the wide-int patch so that the individual port and front end maintainers can review their parts without having to go through the entire patch. This patch covers the documentation. Ok? Ok. Thanks, Richard.
Re: wide-int, gimple
On Thu, Jan 2, 2014 at 5:10 AM, Mike Stump mikest...@comcast.net wrote: On Nov 28, 2013, at 6:20 AM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Nov 28, 2013 at 12:58 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Jakub Jelinek ja...@redhat.com writes: On Mon, Nov 25, 2013 at 12:24:30PM +0100, Richard Biener wrote: On Sat, Nov 23, 2013 at 8:21 PM, Mike Stump mikest...@comcast.net wrote: Richi has asked the we break the wide-int patch so that the individual port and front end maintainers can review their parts without have to go through the entire patch. This patch covers the gimple code. @@ -1754,7 +1754,7 @@ dump_ssaname_info (pretty_printer *buffer, tree node, int spc) if (!POINTER_TYPE_P (TREE_TYPE (node)) SSA_NAME_RANGE_INFO (node)) { - double_int min, max, nonzero_bits; + widest_int min, max, nonzero_bits; value_range_type range_type = get_range_info (node, min, max); if (range_type == VR_VARYING) this makes me suspect you are changing SSA_NAME_RANGE_INFO to embed two max wide_ints. That's a no-no. Well, the range_info_def struct right now contains 3 double_ints, which is unnecessary overhead for the most of the cases where the SSA_NAME's type has just at most HOST_BITS_PER_WIDE_INT bits and thus we could fit all 3 of them into 3 HOST_WIDE_INTs rather than 3 double_ints. So supposedly struct range_info_def could be a template on the type's precision rounded up to HWI bits, or say have 3 alternatives there, use FIXED_WIDE_INT (HOST_BITS_PER_WIDE_INT) for the smallest types, FIXED_WIDE_INT (2 * HOST_BITS_PER_WIDE_INT) aka double_int for the larger but still common ones, and widest_int for the rest, then the API to set/get it could use widest_int everywhere, and just what storage we'd use would depend on the precision of the type. This patch adds a trailing_wide_ints N that can be used at the end of a variable-length structure to store N wide_ints. There's also a macro to declare get/set methods for each of the N elements. 
At the moment I've only defined non-const operator[]. It'd be possible to add a const version later if necessary. The size of range_info_def for precisions that fit in M HWIs is then 1 + 3 * M, so 4 for the common case (down from 6 on trunk). The maximum is 7 for current x86_64 types (up from 6 on trunk). I wondered whether to keep the interface using widest_int, but I think wide_int works out more naturally. The only caller that wants to extend beyond the precision is CCP, but that's already special because the upper bits are supposed to be set (i.e. it's not a normal sign or zero extension). This relies on the SSA_NAME_ANTI_RANGE_P patch I just posted. If this is OK I'll look at using the same structure elsewhere. Looks good to me. So, is that an Ok for the gimple patch and the follow on work? Just double checking. Yes.
Re: wide-int, ipa
On Thu, Jan 2, 2014 at 5:12 AM, Mike Stump mikest...@comcast.net wrote: On Nov 23, 2013, at 11:22 AM, Mike Stump mikest...@comcast.net wrote: Richi has asked that we break the wide-int patch so that the individual port and front end maintainers can review their parts without having to go through the entire patch. This patch covers the ipa code. Ok? Ping? I promise, this patch isn't frightening. Small, easy to read and understand, doesn't require an ipa expert.

Why

@@ -968,7 +968,7 @@ get_polymorphic_call_info (tree fndecl,
     {
       base_pointer = TREE_OPERAND (base, 0);
       context->offset
-        += offset2 + mem_ref_offset (base).low * BITS_PER_UNIT;
+        += offset2 + mem_ref_offset (base).ulow () * BITS_PER_UNIT;
       context->outer_type = NULL;
     }
   /* We found base object. In this case the outer_type but then
@@ -1063,7 +1063,7 @@ compute_complex_assign_jump_func (struct ipa_node_params *info,
       || max_size == -1
       || max_size != size)
     return;
-  offset += mem_ref_offset (base).low * BITS_PER_UNIT;
+  offset += mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
   ssa = TREE_OPERAND (base, 0);
   if (TREE_CODE (ssa) != SSA_NAME
       || !SSA_NAME_IS_DEFAULT_DEF (ssa)

?

I think it should be to_short_addr () in the first case as well. Ok with that change.

Richard.
Re: wide-int, sched
On Thu, Jan 2, 2014 at 5:53 AM, Mike Stump mikest...@comcast.net wrote: On Nov 23, 2013, at 11:22 AM, Mike Stump mikest...@comcast.net wrote: Richi has asked that we break the wide-int patch so that the individual port and front end maintainers can review their parts without having to go through the entire patch. This patch covers the scheduler code. Ok? Ping? I promise, this one is easy… Ok. Richard.
Re: wide-int, loop
On Thu, Jan 2, 2014 at 5:27 AM, Mike Stump mikest...@comcast.net wrote:
On Nov 26, 2013, at 1:14 AM, Richard Biener richard.guent...@gmail.com wrote:

@@ -2662,8 +2661,8 @@ iv_number_of_iterations (struct loop *loop, rtx insn, rtx condition,
   iv1.step = const0_rtx;
   if (INTVAL (iv0.step) < 0)
     {
-      iv0.step = simplify_gen_unary (NEG, comp_mode, iv0.step, mode);
-      iv1.base = simplify_gen_unary (NEG, comp_mode, iv1.base, mode);
+      iv0.step = simplify_gen_unary (NEG, comp_mode, iv0.step, comp_mode);
+      iv1.base = simplify_gen_unary (NEG, comp_mode, iv1.base, comp_mode);
     }
   iv0.step = lowpart_subreg (mode, iv0.step, comp_mode);

separate bugfix?

most likely. i will submit separately.

@@ -1378,7 +1368,8 @@ decide_peel_simple (struct loop *loop, int flags)
   /* If we have realistic estimate on number of iterations, use it. */
   if (get_estimated_loop_iterations (loop, &iterations))
     {
-      if (double_int::from_shwi (npeel).ule (iterations))
+      /* TODO: unsigned/signed confusion */
+      if (wi::leu_p (npeel, iterations))
         {
           if (dump_file)
             {

what does this refer to? npeel is unsigned.

it was the fact that they were doing the from_shwi and then using an unsigned test.

Ah - probably a typo. Please just remove the TODO.

Done:

Index: loop-unroll.c
===
--- loop-unroll.c (revision 206183)
+++ loop-unroll.c (working copy)
@@ -1371,7 +1371,6 @@ decide_peel_simple (struct loop *loop, i
   /* If we have realistic estimate on number of iterations, use it. */
   if (get_estimated_loop_iterations (loop, &iterations))
     {
-      /* TODO: unsigned/signed confusion */
       if (wi::leu_p (npeel, iterations))
         {
           if (dump_file)

Otherwise looks good to me.

Kenny hasn't yet integrated the first into trunk, but I'd like to ask anyway: Ok?

Ok.

Richard.
a patch prototype for PR59535 (THUMB code size regression)
Hi, Richard. This week I've been working on THUMB code size issues. Here is the prototype of the patch for spilling into HI_REGS instead of memory. The patch decreases number of generated insns and makes the code faster as it removes a lot of loads/stores. I am sending the patch for your evaluation and for getting your opinion. If you like the code size results, I could create the real patch next week (the patch here will not work correctly when a user defines fixed registers by himself). Thanks in advance, Vlad. Index: config/arm/arm.c === --- config/arm/arm.c(revision 206089) +++ config/arm/arm.c(working copy) @@ -73,6 +73,8 @@ struct four_ints /* Forward function declarations. */ static bool arm_lra_p (void); +static reg_class_t arm_spill_class (reg_class_t, enum machine_mode); +static int arm_spill_hard_regno (int, reg_class_t, enum machine_mode); static bool arm_needs_doubleword_align (enum machine_mode, const_tree); static int arm_compute_static_chain_stack_bytes (void); static arm_stack_offsets *arm_get_frame_offsets (void); @@ -345,6 +347,12 @@ static const struct attribute_spec arm_a #undef TARGET_LRA_P #define TARGET_LRA_P arm_lra_p +#undef TARGET_SPILL_CLASS +#define TARGET_SPILL_CLASS arm_spill_class + +#undef TARGET_SPILL_HARD_REGNO +#define TARGET_SPILL_HARD_REGNO arm_spill_hard_regno + #undef TARGET_ATTRIBUTE_TABLE #define TARGET_ATTRIBUTE_TABLE arm_attribute_table @@ -5597,6 +5605,28 @@ arm_lra_p (void) return arm_lra_flag; } +/* Return class of registers which could be used for pseudo of MODE + and of class RCLASS for spilling instead of memory. Return NO_REGS + if it is not possible or non-profitable. */ +static reg_class_t +arm_spill_class (reg_class_t rclass, enum machine_mode mode) +{ + if (TARGET_THUMB1 mode == SImode + (rclass == LO_REGS || rclass == GENERAL_REGS)) +return HI_REGS; + return NO_REGS; +} + +/* ??? 
*/ +static int +arm_spill_hard_regno (int n, reg_class_t spill_class, enum machine_mode mode) +{ + gcc_assert (TARGET_THUMB1 mode == SImode spill_class == HI_REGS + n = 0); + int hard_regno = FIRST_HI_REGNUM + n; + return hard_regno 12 ? -1 : hard_regno; +} + /* Return true if mode/type need doubleword alignment. */ static bool arm_needs_doubleword_align (enum machine_mode mode, const_tree type) @@ -29236,6 +29266,7 @@ arm_conditional_register_usage (void) for (regno = FIRST_HI_REGNUM; regno = LAST_HI_REGNUM; ++regno) fixed_regs[regno] = call_used_regs[regno] = 1; + fixed_regs[12] = call_used_regs[12] = 1; } /* The link register can be clobbered by any branch insn, Index: doc/tm.texi === --- doc/tm.texi (revision 206089) +++ doc/tm.texi (working copy) @@ -2918,6 +2918,10 @@ A target hook which returns true if an a This hook defines a class of registers which could be used for spilling pseudos of the given mode and class, or @code{NO_REGS} if only memory should be used. Not defining this hook is equivalent to returning @code{NO_REGS} for all inputs. @end deftypefn +@deftypefn {Target Hook} int TARGET_SPILL_HARD_REGNO (int, @var{reg_class_t}, enum @var{machine_mode}) +This hook defines n-th (0, ...) register which could be used for spilling pseudos of the given mode and spill class, or -1 if there are no such regs anymore. The hook shoul be defined with spill_class hook and should be defined only for classes returned by spill_class. +@end deftypefn + @deftypefn {Target Hook} {enum machine_mode} TARGET_CSTORE_MODE (enum insn_code @var{icode}) This hook defines the machine mode to use for the boolean result of conditional store patterns. The ICODE argument is the instruction code for the cstore being performed. Not definiting this hook is the same as accepting the mode encoded into operand 0 of the cstore expander patterns. 
@end deftypefn Index: doc/tm.texi.in === --- doc/tm.texi.in (revision 206089) +++ doc/tm.texi.in (working copy) @@ -2549,6 +2549,8 @@ as below: @hook TARGET_SPILL_CLASS +@hook TARGET_SPILL_HARD_REGNO + @hook TARGET_CSTORE_MODE @node Old Constraints Index: lra-spills.c === --- lra-spills.c(revision 206089) +++ lra-spills.c(working copy) @@ -252,7 +252,7 @@ pseudo_reg_slot_compare (const void *v1p static int assign_spill_hard_regs (int *pseudo_regnos, int n) { - int i, k, p, regno, res, spill_class_size, hard_regno, nr; + int i, k, p, regno, res, hard_regno, nr; enum reg_class rclass, spill_class; enum machine_mode mode; lra_live_range_t r; @@ -271,7 +271,7 @@ assign_spill_hard_regs (int *pseudo_regn /* Set up reserved hard regs for every program point. */ reserved_hard_regs = XNEWVEC
Re: [patch] regcprop fix for PR rtl-optimization/54300
On 20/11/13 13:57, Richard Earnshaw wrote: On 19/11/13 17:48, Jeff Law wrote: On 11/19/13 10:32, Steven Bosscher wrote: Yes. In the GCC3 days it was important for sincos on i386, and on m68k it used to be important for some of the funnier patterns. Not sure if it's still useful today, though. Might be worth looking into, just to avoid the confusion in the future. I doubt it's changed all that much :-) There's been confusion about this before, where people assumed single_set really means just one SET in this pattern. (ISTR fixing gcse.c's hash_scan_rtx for this at some point...?). But that's not the semantics of single_set. Yes. And I'd expect confusion to continue :( Not sure if creating renaming to capture the actual semantics would help here. The proper test for just one SET is (!multiple_sets && single_set). At least, that's how I've always coded it... Seems reasonable for those cases where you have to ensure there really is just one set. jeff Provided we correctly note the other values that are killed, we can handle multiple sets safely. The one restriction we have to watch is where the dead set operations kill input values to the live set operation. I've committed my patch to trunk. I'll leave it to gestate a couple of days, but this is also needed on the active release branches as well. Well, a bit more than a few days... 4.8 backport has now been applied. 4.7 should follow shortly. R.
[Patch, Fortran] PR 58026: Bad error recovery for allocatable component of undeclared type
Hi all,

the attached patch started out as an ICE-on-invalid regression fix, but after the ICE had been fixed recently by other means, it was degraded to a mere error-recovery improvement. It removes some rather 'hackish' code that was added by Paul quite a long time ago.

Regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk?

Cheers,
Janus

2014-01-09  Janus Weil  ja...@gcc.gnu.org

    PR fortran/58026
    * decl.c (gfc_match_data_decl): Improve error recovery.

2014-01-09  Janus Weil  ja...@gcc.gnu.org

    PR fortran/58026
    * gfortran.dg/alloc_comp_basics_6.f90: New.

Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c (revision 206462)
+++ gcc/fortran/decl.c (working copy)
@@ -4287,12 +4287,10 @@ gfc_match_data_decl (void)
               || current_ts.u.derived->attr.zero_comp))
         goto ok;
 
-      /* Now we have an error, which we signal, and then fix up
-         because the knock-on is plain and simple confusing.  */
       gfc_error_now ("Derived type at %C has not been previously defined "
                      "and so cannot appear in a derived type definition");
-      current_attr.pointer = 1;
-      goto ok;
+      m = MATCH_ERROR;
+      goto cleanup;
     }
 
 ok:

! { dg-do compile }
!
! PR 58026: Bad error recovery for allocatable component of undeclared type
!
! Contributed by Joost VandeVondele joost.vandevond...@mat.ethz.ch

  type sysmtx_t
    type(ext_complex_t), allocatable :: S(:)
  end type

end
Re: a patch prototype for PR59535 (THUMB code size regression)
On 09/01/14 15:21, Vladimir Makarov wrote: Hi, Richard. This week I've been working on THUMB code size issues. Here is the prototype of the patch for spilling into HI_REGS instead of memory. The patch decreases number of generated insns and makes the code faster as it removes a lot of loads/stores. I am sending the patch for your evaluation and for getting your opinion. If you like the code size results, I could create the real patch next week (the patch here will not work correctly when a user defines fixed registers by himself). Thanks in advance, Vlad. Do you need to take into account HARD_REGNO_NREGS (mode) when doing the limit check? R. z Index: config/arm/arm.c === --- config/arm/arm.c (revision 206089) +++ config/arm/arm.c (working copy) @@ -73,6 +73,8 @@ struct four_ints /* Forward function declarations. */ static bool arm_lra_p (void); +static reg_class_t arm_spill_class (reg_class_t, enum machine_mode); +static int arm_spill_hard_regno (int, reg_class_t, enum machine_mode); static bool arm_needs_doubleword_align (enum machine_mode, const_tree); static int arm_compute_static_chain_stack_bytes (void); static arm_stack_offsets *arm_get_frame_offsets (void); @@ -345,6 +347,12 @@ static const struct attribute_spec arm_a #undef TARGET_LRA_P #define TARGET_LRA_P arm_lra_p +#undef TARGET_SPILL_CLASS +#define TARGET_SPILL_CLASS arm_spill_class + +#undef TARGET_SPILL_HARD_REGNO +#define TARGET_SPILL_HARD_REGNO arm_spill_hard_regno + #undef TARGET_ATTRIBUTE_TABLE #define TARGET_ATTRIBUTE_TABLE arm_attribute_table @@ -5597,6 +5605,28 @@ arm_lra_p (void) return arm_lra_flag; } +/* Return class of registers which could be used for pseudo of MODE + and of class RCLASS for spilling instead of memory. Return NO_REGS + if it is not possible or non-profitable. 
*/ +static reg_class_t +arm_spill_class (reg_class_t rclass, enum machine_mode mode) +{ + if (TARGET_THUMB1 mode == SImode + (rclass == LO_REGS || rclass == GENERAL_REGS)) +return HI_REGS; + return NO_REGS; +} + +/* ??? */ +static int +arm_spill_hard_regno (int n, reg_class_t spill_class, enum machine_mode mode) +{ + gcc_assert (TARGET_THUMB1 mode == SImode spill_class == HI_REGS +n = 0); + int hard_regno = FIRST_HI_REGNUM + n; + return hard_regno 12 ? -1 : hard_regno; +} + /* Return true if mode/type need doubleword alignment. */ static bool arm_needs_doubleword_align (enum machine_mode mode, const_tree type) @@ -29236,6 +29266,7 @@ arm_conditional_register_usage (void) for (regno = FIRST_HI_REGNUM; regno = LAST_HI_REGNUM; ++regno) fixed_regs[regno] = call_used_regs[regno] = 1; + fixed_regs[12] = call_used_regs[12] = 1; } /* The link register can be clobbered by any branch insn, Index: doc/tm.texi === --- doc/tm.texi (revision 206089) +++ doc/tm.texi (working copy) @@ -2918,6 +2918,10 @@ A target hook which returns true if an a This hook defines a class of registers which could be used for spilling pseudos of the given mode and class, or @code{NO_REGS} if only memory should be used. Not defining this hook is equivalent to returning @code{NO_REGS} for all inputs. @end deftypefn +@deftypefn {Target Hook} int TARGET_SPILL_HARD_REGNO (int, @var{reg_class_t}, enum @var{machine_mode}) +This hook defines n-th (0, ...) register which could be used for spilling pseudos of the given mode and spill class, or -1 if there are no such regs anymore. The hook shoul be defined with spill_class hook and should be defined only for classes returned by spill_class. +@end deftypefn + @deftypefn {Target Hook} {enum machine_mode} TARGET_CSTORE_MODE (enum insn_code @var{icode}) This hook defines the machine mode to use for the boolean result of conditional store patterns. The ICODE argument is the instruction code for the cstore being performed. 
Not definiting this hook is the same as accepting the mode encoded into operand 0 of the cstore expander patterns. @end deftypefn Index: doc/tm.texi.in === --- doc/tm.texi.in(revision 206089) +++ doc/tm.texi.in(working copy) @@ -2549,6 +2549,8 @@ as below: @hook TARGET_SPILL_CLASS +@hook TARGET_SPILL_HARD_REGNO + @hook TARGET_CSTORE_MODE @node Old Constraints Index: lra-spills.c === --- lra-spills.c (revision 206089) +++ lra-spills.c (working copy) @@ -252,7 +252,7 @@ pseudo_reg_slot_compare (const void *v1p static int assign_spill_hard_regs (int *pseudo_regnos, int n) { - int i, k, p, regno, res, spill_class_size, hard_regno, nr; + int i, k, p, regno, res,
Re: [PING^2][PATCH] -fuse-caller-save - Implement TARGET_FN_OTHER_HARD_REG_USAGE hook for MIPS
Tom de Vries tom_devr...@mentor.com writes: On 25/12/13 14:02, Tom de Vries wrote: On 07-12-13 16:07, Tom de Vries wrote: Richard, This patch implements the target hook TARGET_FN_OTHER_HARD_REG_USAGE (posted here: http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01318.html) for MIPS, to address the issue that $6 is sometimes used in split calls. Build and reg-tested on MIPS. OK for stage1? Richard, Ping. This patch is the only part of -fuse-caller-save that still needs approval. Hmm, where were parts 4 and 6 approved? Was looking for the discussion in the hope that it would answer the question I don't really understand, which is: this hook is only used during final, is that right? And the clobber that you're adding is exposed at the rtl level. So why do we need the hook at all? Why not just collect the usage information at the end of final rather than at the beginning, so that all splits during final have been done? For other cases (where the usage isn't explicit at the rtl level), why not record the usage in CALL_INSN_FUNCTION_USAGE instead? Thanks, Richard
Re: [PATCH, AArch64 6/6] aarch64: Define add_ssaaaa, sub_ddmmss, umul_ppmm
Hi,

This patch and the preceding aarch64.md patches all look good to me, but I cannot approve it. Thanks for adding the support for these missing patterns and defines!

Yufeng

On 01/08/14 18:13, Richard Henderson wrote:

We have good support for TImode arithmetic, so no need to do anything with inline assembly.

include/
    * longlong.h [__aarch64__] (add_ssaaaa, sub_ddmmss, umul_ppmm): New.
    [__aarch64__] (COUNT_LEADING_ZEROS_0): Define in terms of W_TYPE_SIZE.
---
 include/longlong.h | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/include/longlong.h b/include/longlong.h
index b4c1f400..1b11fc7 100644
--- a/include/longlong.h
+++ b/include/longlong.h
@@ -123,19 +123,35 @@ extern const UQItype __clz_tab[256] attribute_hidden;
 #endif /* __GNUC__ < 2 */
 
 #if defined (__aarch64__)
+#define add_ssaaaa(sh, sl, ah, al, bh, bl) \
+  do { \
+    UDWtype __x = (UDWtype)(UWtype)(ah) << 64 | (UWtype)(al); \
+    __x += (UDWtype)(UWtype)(bh) << 64 | (UWtype)(bl); \
+    (sh) = __x >> W_TYPE_SIZE; \
+    (sl) = __x; \
+  } while (0)
+#define sub_ddmmss(sh, sl, ah, al, bh, bl) \
+  do { \
+    UDWtype __x = (UDWtype)(UWtype)(ah) << 64 | (UWtype)(al); \
+    __x -= (UDWtype)(UWtype)(bh) << 64 | (UWtype)(bl); \
+    (sh) = __x >> W_TYPE_SIZE; \
+    (sl) = __x; \
+  } while (0)
+#define umul_ppmm(ph, pl, m0, m1) \
+  do { \
+    UDWtype __x = (UDWtype)(UWtype)(m0) * (UWtype)(m1); \
+    (ph) = __x >> W_TYPE_SIZE; \
+    (pl) = __x; \
+  } while (0)
+#define COUNT_LEADING_ZEROS_0 W_TYPE_SIZE
 #if W_TYPE_SIZE == 32
 #define count_leading_zeros(COUNT, X) ((COUNT) = __builtin_clz (X))
 #define count_trailing_zeros(COUNT, X) ((COUNT) = __builtin_ctz (X))
-#define COUNT_LEADING_ZEROS_0 32
-#endif /* W_TYPE_SIZE == 32 */
-
-#if W_TYPE_SIZE == 64
+#elif W_TYPE_SIZE == 64
 #define count_leading_zeros(COUNT, X) ((COUNT) = __builtin_clzll (X))
 #define count_trailing_zeros(COUNT, X) ((COUNT) = __builtin_ctzll (X))
-#define COUNT_LEADING_ZEROS_0 64
 #endif /* W_TYPE_SIZE == 64 */
-
 #endif /* __aarch64__ */
 
 #if defined (__alpha) && W_TYPE_SIZE == 64
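To make the double-word trick concrete, here is a small standalone sketch of the same idea as functions rather than macros. It assumes a 64-bit word and a compiler that provides unsigned __int128 (GCC or Clang on a 64-bit target); the toy_* names are made up for illustration and are not part of longlong.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Double-word type; unsigned __int128 is a GCC/Clang extension.  */
typedef unsigned __int128 toy_udw;

/* (sh:sl) = (ah:al) + (bh:bl), carry out of the high word discarded.  */
static void toy_add_ssaaaa(uint64_t *sh, uint64_t *sl,
                           uint64_t ah, uint64_t al,
                           uint64_t bh, uint64_t bl)
{
    assert(sh != NULL && sl != NULL);
    toy_udw x = ((toy_udw)ah << 64) | al;
    x += ((toy_udw)bh << 64) | bl;
    *sh = (uint64_t)(x >> 64);
    *sl = (uint64_t)x;
}

/* (sh:sl) = (ah:al) - (bh:bl), borrow out of the high word discarded.  */
static void toy_sub_ddmmss(uint64_t *sh, uint64_t *sl,
                           uint64_t ah, uint64_t al,
                           uint64_t bh, uint64_t bl)
{
    assert(sh != NULL && sl != NULL);
    toy_udw x = ((toy_udw)ah << 64) | al;
    x -= ((toy_udw)bh << 64) | bl;
    *sh = (uint64_t)(x >> 64);
    *sl = (uint64_t)x;
}

/* (ph:pl) = m0 * m1 as a full 128-bit product.  */
static void toy_umul_ppmm(uint64_t *ph, uint64_t *pl,
                          uint64_t m0, uint64_t m1)
{
    assert(ph != NULL && pl != NULL);
    toy_udw x = (toy_udw)m0 * m1;
    *ph = (uint64_t)(x >> 64);
    *pl = (uint64_t)x;
}
```

The point of writing it this way is that the compiler can lower the 128-bit operations to add-with-carry and high-multiply instructions on its own, so no inline assembly is needed.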
Re: [PATCH i386 4/8] [AVX512] [6/8] Add substed patterns: `sae' subst.
On Wed, Dec 18, 2013 at 5:02 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, On 02 Dec 16:10, Kirill Yukhin wrote: Hello, On 19 Nov 12:11, Kirill Yukhin wrote: Hello, On 15 Nov 20:07, Kirill Yukhin wrote: Is it ok for trunk? Ping. Ping. Ping. Ping. Rebased patch in the bottom. This patch caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59733 Now bootstrap-asan failed with 24GB RAM and is almost unusable. H.J.
Re: [PATCH, AArch64 4/6] soft-fp: Commonize creation of TImode types
On 01/08/2014 12:39 PM, Joseph S. Myers wrote:
> On Wed, 8 Jan 2014, Richard Henderson wrote:
>
>> diff --git a/libgcc/soft-fp/soft-fp.h b/libgcc/soft-fp/soft-fp.h
>> index 696fc86..b54b1ed 100644
>> --- a/libgcc/soft-fp/soft-fp.h
>> +++ b/libgcc/soft-fp/soft-fp.h
>> @@ -237,6 +237,11 @@ typedef int DItype __attribute__ ((mode (DI)));
>>  typedef unsigned int UQItype __attribute__ ((mode (QI)));
>>  typedef unsigned int USItype __attribute__ ((mode (SI)));
>>  typedef unsigned int UDItype __attribute__ ((mode (DI)));
>> +#if _FP_W_TYPE_SIZE == 64
>> +typedef int TItype __attribute__ ((mode (TI)));
>> +typedef unsigned int UTItype __attribute__ ((mode (TI)));
>> +#endif
>
> This isn't the right conditional. _FP_W_TYPE_SIZE is ultimately an optimization choice and need not be related to whether any TImode functions are being defined using soft-fp, or whether TImode is supported at all. I think the most you can do is have sfp-machine.h define a macro to say that TImode should be supported in soft-fp, rather than actually defining the types itself.

The documentation for longlong.h says we must have a double-word type defined. Given how easy it is to support a double-word type...

> (If someone were to use soft-fp on hppa64, then they might well use _FP_W_TYPE_SIZE == 64, but hppa64 doesn't support TImode.)

... I can't imagine that this is anything but a bug. Not that anyone seems to be doing any hppa work at all these past years.

r~
Re: [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2)
On 01/08/2014 04:45 AM, Jakub Jelinek wrote: 2014-01-07 Jakub Jelinek ja...@redhat.com PR target/58115 * tree-core.h (struct target_globals): New forward declaration. (struct tree_target_option): Add globals field. * tree.h (TREE_TARGET_GLOBALS): Define. (prepare_target_option_nodes_for_pch): New prototype. * target-globals.h (struct target_globals): Define even if !SWITCHABLE_TARGET. * tree.c (prepare_target_option_node_for_pch, prepare_target_option_nodes_for_pch): New functions. * config/i386/i386.h (SWITCHABLE_TARGET): Define. * config/i386/i386.c: Include target-globals.h. (ix86_set_current_function): Instead of doing target_reinit unconditionally, use save_target_globals_default_opts and restore_target_globals. c-family/ * c-pch.c (c_common_write_pch): Call prepare_target_option_nodes_for_pch. Ok. r~
Fix ipa-devirt ICE on virtual inheritance
Hi,

this patch fixes an ipa-devirt testcase that has been giving me bad sleep for months. The problem turned out to be a combination of three issues (which greatly confused me). This patch fixes the first two.

First, the representation of BINFOs under multiple inheritance actually differs from my mental model. For a diamond-shaped graph

      A
     / \
    B   C
     \ /
      D

A is a common virtual base of B and C. I assumed that there would be two binfos representing A, linked from the binfos representing B and C, both pointing to the virtual table of A. This did not work, so I then assumed that there is one shared binfo. In reality, however, we have two binfos, but only the first one has a vtable associated.

The second issue, also addressed in this patch, is the lookup of the corresponding vtable. I copied code from get_binfo_at_offset that dives into the structure and tracks the last base that has a non-zero offset. This is because vtables of bases starting at the same offset are shared. This, however, does not work with multiple inheritance: A may have offset 0 while we reach it via C, which has a non-zero offset. In this case we really want D's vtable instead of C's.

So instead of tracking one vtable, I now maintain a stack in which one can look up the corresponding base. An alternative is to mimic what get_binfo_at_offset does by walking fields instead of bases. Here the walk would bypass B/C and get directly to A, but then I would have difficulty looking up A's binfo. A final alternative is to use BINFO_INHERITANCE_CHAIN the same way the C++ FE does, but we do not stream it and I would like to avoid using it unless really necessary.

Bootstrapped/regtested ppc64-linux.

Honza

    PR ipa/58252
    PR ipa/59226
    * ipa-devirt.c (record_target_from_binfo): Take as argument stack of binfos and lookup matching one for virtual inheritance.
    (possible_polymorphic_call_targets_1): Update.

    * g++.dg/ipa/devirt-20.C: New testcase.
    * g++.dg/torture/pr58252.C: Likewise.
    * g++.dg/torture/pr59226.C: Likewise.
Index: ipa-devirt.c === --- ipa-devirt.c(revision 206362) +++ ipa-devirt.c(working copy) @@ -614,10 +614,8 @@ maybe_record_node (vec cgraph_node * This match what get_binfo_at_offset does, but with offset being unknown. - TYPE_BINFO is binfo holding an virtual table matching - BINFO's type. In the case of single inheritance, this - is binfo of BINFO's type ancestor (vtable is shared), - otherwise it is binfo of BINFO's type. + TYPE_BINFOS is a stack of BINFOS of types with defined + virtual table seen on way from class type to BINFO. MATCHED_VTABLES tracks virtual tables we already did lookup for virtual function in. INSERTED tracks nodes we already @@ -630,7 +628,7 @@ static void record_target_from_binfo (vec cgraph_node * nodes, tree binfo, tree otr_type, - tree type_binfo, + vec tree type_binfos, HOST_WIDE_INT otr_token, tree outer_type, HOST_WIDE_INT offset, @@ -642,10 +640,32 @@ record_target_from_binfo (vec cgraph_no int i; tree base_binfo; - gcc_checking_assert (BINFO_VTABLE (type_binfo)); + if (BINFO_VTABLE (binfo)) +type_binfos.safe_push (binfo); if (types_same_for_odr (type, outer_type)) { + int i; + tree type_binfo = NULL; + + /* Lookup BINFO with virtual table. For normal types it is always last +binfo on stack. */ + for (i = type_binfos.length () - 1; i = 0; i--) + if (BINFO_OFFSET (type_binfos[i]) == BINFO_OFFSET (binfo)) + { + type_binfo = type_binfos[i]; + break; + } + if (BINFO_VTABLE (binfo)) + type_binfos.pop (); + /* If this is duplicated BINFO for base shared by virtual inheritance, +we may not have its associated vtable. This is not a problem, since +we will walk it on the other path. */ + if (!type_binfo) + { + gcc_assert (BINFO_VIRTUAL_P (binfo)); + return; + } tree inner_binfo = get_binfo_at_offset (type_binfo, offset, otr_type); /* For types in anonymous namespace first check if the respective vtable @@ -676,12 +696,11 @@ record_target_from_binfo (vec cgraph_no /* Walking bases that have no virtual method is pointless excercise. 
*/ if (polymorphic_type_binfo_p (base_binfo)) record_target_from_binfo (nodes, base_binfo, otr_type, - /* In the case of single inheritance, - the virtual table is shared with - the outer type. */ - BINFO_VTABLE (base_binfo) ? base_binfo : type_binfo, + type_binfos, otr_token,
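The stack lookup described in this message can be sketched in a few lines of plain C. This is only a toy model under stated assumptions: toy_binfo and toy_lookup_vtable_binfo are hypothetical names, an entry is pushed only when it has a vtable, and matching is done purely on the byte offset, which is a simplification of what the patch does with BINFO_OFFSET.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a BINFO: which type it describes, where that base
   sits inside the outer object, and whether it carries a vtable.  */
struct toy_binfo {
    const char *type_name;
    long offset;
    bool has_vtable;
};

/* STACK holds the binfos with a vtable seen on the way from the outer
   type down to BINFO, outermost first.  Return the most recently
   pushed entry that starts at the same offset as BINFO, or NULL if
   there is none (the duplicated-virtual-base case, which the patch
   leaves to the walk over the other path).  */
static const struct toy_binfo *
toy_lookup_vtable_binfo(const struct toy_binfo *const *stack, size_t depth,
                        const struct toy_binfo *binfo)
{
    assert(binfo != NULL);
    assert(stack != NULL || depth == 0);
    for (size_t i = depth; i-- > 0; )
        if (stack[i]->offset == binfo->offset)
            return stack[i];
    return NULL;
}
```

For the diamond above, reaching A (offset 0) through C (say offset 16) with the stack holding D and C yields D's entry rather than C's, which is the behaviour the message asks for.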
Re: a patch prototype for PR59535 (THUMB code size regression)
On 1/9/2014, 10:30 AM, Richard Earnshaw wrote: On 09/01/14 15:21, Vladimir Makarov wrote: Hi, Richard. This week I've been working on THUMB code size issues. Here is the prototype of the patch for spilling into HI_REGS instead of memory. The patch decreases number of generated insns and makes the code faster as it removes a lot of loads/stores. I am sending the patch for your evaluation and for getting your opinion. If you like the code size results, I could create the real patch next week (the patch here will not work correctly when a user defines fixed registers by himself). Thanks in advance, Vlad. Do you need to take into account HARD_REGNO_NREGS (mode) when doing the limit check? In this patch only SImode is permitted. The hooks also will be different in the final version of the patch.
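For what the limit check might look like once more than SImode is allowed, here is a hedged sketch in plain C. It is not the hook from the patch: toy_spill_hard_regno, its first/last bounds and the nregs parameter (standing in for a HARD_REGNO_NREGS-style count) are all assumptions made for illustration.

```c
#include <assert.h>

/* Return the N-th (0, 1, ...) candidate register in [first, last] for a
   value that occupies NREGS consecutive hard registers, or -1 once the
   value would no longer fit below LAST.  All parameters are made up
   for this sketch.  */
static int toy_spill_hard_regno(int n, int first, int last, int nregs)
{
    assert(n >= 0 && nregs >= 1 && first <= last);
    int regno = first + n;
    /* The last register the value would occupy must stay in range.  */
    if (regno + nregs - 1 > last)
        return -1;
    return regno;
}
```

With a single-register mode the check degenerates to a plain upper bound; with nregs of 2 the last register of the range stops being a valid starting point, which is the case the question about HARD_REGNO_NREGS is getting at.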
Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
On 01/08/2014 10:34 AM, Jakub Jelinek wrote:
>    struct target_globals *g;
> -
> -  g = ggc_alloc_target_globals ();
> -  g->flag_state = XCNEW (struct target_flag_state);
> -  g->regs = XCNEW (struct target_regs);
> +  struct target_globals_extra {
> +    struct target_globals g;
> +    struct target_flag_state flag_state;
> +    struct target_regs regs;
> +    struct target_hard_regs hard_regs;
> +    struct target_reload reload;
> +    struct target_expmed expmed;
> +    struct target_optabs optabs;
> +    struct target_cfgloop cfgloop;
> +    struct target_ira ira;
> +    struct target_ira_int ira_int;
> +    struct target_lra_int lra_int;
> +    struct target_builtins builtins;
> +    struct target_gcse gcse;
> +    struct target_bb_reorder bb_reorder;
> +    struct target_lower_subreg lower_subreg;
> +  } *p;
> +  p = (struct target_globals_extra *)
> +      ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
> +                                       PASS_MEM_STAT);
> +  g = (struct target_globals *) p;
> +  g->flag_state = &p->flag_state;
> +  g->regs = &p->regs;
>    g->rtl = ggc_alloc_cleared_target_rtl ();

So, we're relying on something pointing to G, thus keeping the whole P alive? I suppose that works but it's fairly ugly that's for sure.

As for the extra ~500k wasted on x86_64, we can either fix our gc allocator to do something sensible with these high-order allocations, or we can do nearly this same trick only with libc. I.e.

  struct target_globals_extra {
    struct target_flag_state flag_state;
    struct target_regs regs;
    struct target_hard_regs hard_regs;
    struct target_reload reload;
    struct target_expmed expmed;
    struct target_optabs optabs;
    struct target_cfgloop cfgloop;
    struct target_ira ira;
    struct target_ira_int ira_int;
    struct target_lra_int lra_int;
    struct target_builtins builtins;
    struct target_gcse gcse;
    struct target_bb_reorder bb_reorder;
    struct target_lower_subreg lower_subreg;
  } *p;

  g = ggc_alloc_target_globals ();
  p = XCNEW (target_globals_extra);
  ...

r~
[PING^2] [PATCH]SIMD-Enabled functions for C++
Hello Jakub, Did you get a chance to look at this patch (http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00116.html)? I think I have fixed all the changes you requested. Is it ok for trunk? Thanks, Balaji V. Iyer.
Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
On Thu, Jan 09, 2014 at 08:25:31AM -0800, Richard Henderson wrote:
> > +  p = (struct target_globals_extra *)
> > +      ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
> > +                                       PASS_MEM_STAT);
> > +  g = (struct target_globals *) p;
> > +  g->flag_state = &p->flag_state;
> > +  g->regs = &p->regs;
> >    g->rtl = ggc_alloc_cleared_target_rtl ();
>
> So, we're relying on something pointing to G, thus keeping the whole P alive? I suppose that works but it's fairly ugly that's for sure.

The separate structures aren't really installed individually, they are always installed together through restore_target_globals. As long as any FUNCTION_DECL with such a TARGET_OPTION_NODE exists, it will be reachable.

The reason why it needs to be GC is:
1) in two of these target_* structures there are embedded rtxes etc. the GC needs to see
2) if all FUNCTION_DECLs with such a combination of target attributes are GCed, we'd leak memory

> As for the extra ~500k wasted on x86_64, we can either fix our gc allocator to do something sensible with these high-order allocations, or we can do nearly this same trick only with libc. I.e.
>
>   struct target_globals_extra {
>     struct target_flag_state flag_state;
>     struct target_regs regs;
>     struct target_hard_regs hard_regs;
>     struct target_reload reload;
>     struct target_expmed expmed;
>     struct target_optabs optabs;
>     struct target_cfgloop cfgloop;
>     struct target_ira ira;
>     struct target_ira_int ira_int;
>     struct target_lra_int lra_int;
>     struct target_builtins builtins;
>     struct target_gcse gcse;
>     struct target_bb_reorder bb_reorder;
>     struct target_lower_subreg lower_subreg;
>   } *p;
>
>   g = ggc_alloc_target_globals ();
>   p = XCNEW (target_globals_extra);
>   ...

That would be fine for 1), but would mean 2). It is also fine to GC allocate each structure individually, but some (like bb_reorder) are say just 4 bytes long, so it might be overkill. As noted by Richard S., IRA/LRA still puts pointers to heap allocated objects into some of the structures, so there is some leak anyway, but not as big.

Jakub
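The "one allocation, header first" layout being debated can be sketched outside GCC as follows. This is only an illustration under stated assumptions: it uses calloc and free where the patch uses the GC allocator, and the toy_* structures are made-up stand-ins with two sub-structures instead of fourteen.

```c
#include <assert.h>
#include <stdlib.h>

/* Made-up stand-ins for two of the per-target state structures.  */
struct toy_flag_state { int flags; };
struct toy_regs { int n_regs; };

/* The header that the rest of the program sees: only pointers.  */
struct toy_globals {
    struct toy_flag_state *flag_state;
    struct toy_regs *regs;
};

/* Header and pointed-to structures laid out in one block.  Because
   the header is the first member, a pointer to the header is also a
   pointer to the whole block.  */
struct toy_globals_extra {
    struct toy_globals g;
    struct toy_flag_state flag_state;
    struct toy_regs regs;
};

/* Allocate everything with a single zeroed allocation and point the
   header's members into the same block.  Returns NULL on failure.  */
static struct toy_globals *toy_alloc_globals(void)
{
    struct toy_globals_extra *p = calloc(1, sizeof *p);
    if (p == NULL)
        return NULL;
    p->g.flag_state = &p->flag_state;
    p->g.regs = &p->regs;
    return &p->g;
}

/* Release the block; the header pointer is the block pointer.  */
static void toy_free_globals(struct toy_globals *g)
{
    if (g == NULL)
        return;
    struct toy_globals_extra *p = (struct toy_globals_extra *)g;
    /* The header must still point into its own block.  */
    assert(g->flag_state == &p->flag_state && g->regs == &p->regs);
    free(p);
}
```

In the GC setting the same property is what keeps the whole block alive: anything that reaches the header reaches the block, so the embedded structures cannot be collected separately.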
Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
On 01/09/2014 08:35 AM, Jakub Jelinek wrote: That would be fine for 1), but would mean 2). It is also fine to GC allocate each structure individually, but some (like bb_reorder) are say just 4 bytes long, so it might be overkill. Hmm.. Perhaps define the whole structure as you do, but somewhere global enough that ggc-page.c can see it, and add to the extra_order_size_table? I don't know how much memory wastage there would be there, but I can't imagine it's as much as 0.5MB. r~
[PATCH, committed] Fix for PR 59094
Hello Everyone,

The following patch will fix the bug in PR 59094. The main issue was that version specific libraries are not stored in the correct location. The patch below should fix that. It is committed since the person who filed the bug has confirmed that the fix works.

Index: libcilkrts/Makefile.in
===
--- libcilkrts/Makefile.in (revision 206468)
+++ libcilkrts/Makefile.in (working copy)
@@ -401,7 +401,8 @@
  -no-undefined
 
 # C/C++ header files for Cilk.
-cilkincludedir = $(includedir)/cilk
+# cilkincludedir = $(includedir)/cilk
+cilkincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/cilk
 cilkinclude_HEADERS = \
  include/cilk/cilk_api.h \
  include/cilk/cilk_api_linux.h \
Index: libcilkrts/ChangeLog
===
--- libcilkrts/ChangeLog (revision 206468)
+++ libcilkrts/ChangeLog (working copy)
@@ -1,3 +1,10 @@
+2014-01-09  Balaji V. Iyer  balaji.v.i...@intel.com
+
+ bootstrap/59094
+ * Makefile.am (cilkincludedir): Fixed a bug to store version-specific
+ runtime libraries in the correct place.
+ * Makefile.in: Regenerate.
+
 2013-12-13  Balaji V. Iyer  balaji.v.i...@intel.com
 
  * Makefile.am (GENERAL_FLAGS): Removed undefining of Cilk keywords.
Index: libcilkrts/Makefile.am
===
--- libcilkrts/Makefile.am (revision 206468)
+++ libcilkrts/Makefile.am (working copy)
@@ -108,7 +108,8 @@
 libcilkrts_la_LDFLAGS += -no-undefined
 
 # C/C++ header files for Cilk.
-cilkincludedir = $(includedir)/cilk
+# cilkincludedir = $(includedir)/cilk
+cilkincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/cilk
 cilkinclude_HEADERS = \
  include/cilk/cilk_api.h \
  include/cilk/cilk_api_linux.h \

Thanks,

Balaji V. Iyer.
Re: [Patch,testsuite] Fix testcases that use bind_pic_locally
On Wed, Jan 08, 2014 at 12:28:56PM +, Jakub Jelinek wrote: On Wed, Jan 08, 2014 at 11:49:08AM +, Vidya Praveen wrote: On Tue, Jan 07, 2014 at 09:35:54PM +, Mike Stump wrote: On Dec 17, 2013, at 6:06 AM, Vidya Praveen vidyaprav...@arm.com wrote: bind_pic_locally is broken for targets that doesn't pass -fPIC/-fpic by default [1][2]. Let's give Jakub 2 days to weigh in? If no objections, Ok, though, do see about adding documentation for it. Sure. I didn't respin the patch with documentation since I wanted to know if the solution is acceptable. If this patch is OK, I'll respin with the documentation for bind_pic_locally_ok. I kinda would like a simpler interface for these two, but? that can be follow on work, if someone has a bright idea and some time to implement it. Could you explain what do you mean by simpler interface here? The simpler interface, as I said earlier, would be just to make sure /* { dg-add-options bind_pic_locally } */ does the right thing, I really don't believe you've tried hard enough. It is true dejagnu's default_target_compile has: if {[board_info $dest exists multilib_flags]} { append add_flags [board_info $dest multilib_flags] } last (before just adding -o $destfile; is multilib_flags where the -fpic/-fPIC comes in, right?), but if say dg-add-options bind_pic_locally adds the necessary options not to dg-extra-tools-flags, but to some other variable and say gcc_target_compile (and g++_target_compile) around the [target_compile ...] invocation e.g. temporarily append that other variable (if not empty) to board_info's multilib_flags and afterwards remove it, I don't see why it wouldn't work. Tcl is quite flexible in this. Thanks Jakub. I seem to have not properly understood your earlier email. I could do this and works fine. I'll test and post the patch. VP.
[PATCH][testsuite][ARM] Properly figure -mfloat-abi option for crypto tests
Hi all, When adding the testsuite options for the crypto tests we need to make sure that don't end up adding -mfloat-abi=softfp to a hard-float target like arm-none-linux-gnueabihf. This patch adds that code to figure out which -mfpu/-mfloat-abi combination to use in a similar approach to the NEON tests. This patch addresses the same failures that Christophe mentioned in http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00375.html but with this patch we can get those tests to PASS on arm-none-linux-gnueabihf instead of being just UNSUPPORTED. Tested arm-none-linux-gnueabihf and arm-none-eabi. Ok for trunk? Thanks, Kyrill 2014-01-09 Kyrylo Tkachov kyrylo.tkac...@arm.com * lib/target-supports.exp (check_effective_target_arm_crypto_ok_nocache): New. (check_effective_target_arm_crypto_ok): Use above procedure. (add_options_for_arm_crypto): Use et_arm_crypto_flags.diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 5166679..f1f4024 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -2301,19 +2301,37 @@ proc check_effective_target_arm_unaligned { } { } # Return 1 if this is an ARM target supporting -mfpu=crypto-neon-fp-armv8 -# -mfloat-abi=softfp. -proc check_effective_target_arm_crypto_ok {} { +# -mfloat-abi=softfp or equivalent options. Some multilibs may be +# incompatible with these options. Also set et_arm_crypto_flags to the +# best options to add. 
+ +proc check_effective_target_arm_crypto_ok_nocache { } { +global et_arm_crypto_flags +set et_arm_crypto_flags if { [check_effective_target_arm32] } { - return [check_no_compiler_messages arm_crypto_ok object { - int foo (void) - { - __asm__ volatile (aese.8 q0, q0); - return 0; - } - } -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp] -} else { - return 0 + foreach flags { -mfloat-abi=softfp -mfpu=crypto-neon-fp-armv8 -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp} { + if { [check_no_compiler_messages_nocache arm_crypto_ok object { + #include arm_neon.h + uint8x16_t + foo (uint8x16_t a, uint8x16_t b) + { + return vaeseq_u8 (a, b); + } + } $flags] } { + set et_arm_crypto_flags $flags + return 1 + } + } } + +return 0 +} + +# Return 1 if this is an ARM target supporting -mfpu=crypto-neon-fp-armv8 + +proc check_effective_target_arm_crypto_ok { } { +return [check_cached_effective_target arm_crypto_ok \ + check_effective_target_arm_crypto_ok_nocache] } # Add options for crypto extensions. @@ -2321,7 +2339,8 @@ proc add_options_for_arm_crypto { flags } { if { ! [check_effective_target_arm_crypto_ok] } { return $flags } -return $flags -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp +global et_arm_crypto_flags +return $flags $et_arm_crypto_flags } # Add the options needed for NEON. We need either -mfloat-abi=softfp
Re: [PATCH, go]: Skip some go tests
On Thu, Jan 9, 2014 at 4:01 PM, Ian Lance Taylor i...@google.com wrote: On Thu, Jan 9, 2014 at 2:54 AM, Uros Bizjak ubiz...@gmail.com wrote: 2014-01-09 Uros Bizjak ubiz...@gmail.com * go.test/go-test.exp (go-gc-tests): Don't run peano.go on systems which don't support -fsplit-stack. Skip rotate[0123]-out.go. This is OK. Thanks. You might want to tweak the comment just under where you added peano.go. Then go ahead and commit. Actually, we don't even have to compile/execute generator file, and included rotate.go is skipped due to // skip in its test line. Attached patch was committed to mainline after re-test on x86_64-pc-linux-gnu. Uros. Index: go.test/go-test.exp === --- go.test/go-test.exp (revision 206468) +++ go.test/go-test.exp (working copy) @@ -400,17 +400,16 @@ } if { ( [file tail $test] == select2.go \ - || [file tail $test] == stack.go ) \ + || [file tail $test] == stack.go \ + || [file tail $test] == peano.go ) \ ! [check_effective_target_split_stack] } { - # chan/select2.go fails on targets without split stack, - # because they allocate a large stack segment that blows - # out the memory calculations. + # These tests fails on targets without split stack. untested $name continue } - if { [file tail $test] == rotate.go } { - # This test produces a temporary file that takes too long + if [string match *go.test/test/rotate\[0123\].go $test] { + # These tests produces a temporary file that takes too long # to compile--5 minutes on my laptop without optimization. # When compiling without optimization it tests nothing # useful, since the point of the test is to see whether
[PATCH][ARM] Fix arm_init_iwmmxt_builtins to handle only iwmmxt entries
Hi all,

After my CRC32 intrinsics patch that added new entries into the bdesc_2arg table, the arm_init_iwmmxt_builtins function tries to iterate over them and blows up, causing an ICE when trying to compile with -mcpu=iwmmxt. This patch fixes that by ignoring the non-iwmmxt entries in that table when initialising the iwmmxt builtins. With this patch the gcc.target/arm/mmx-2.c comes back and PASSes.

Tested arm-none-eabi on qemu. Ok for trunk?

Thanks,
Kyrill

2014-01-09  Kyrylo Tkachov  kyrylo.tkac...@arm.com

    * config/arm/arm.c (arm_init_iwmmxt_builtins): Skip non-iwmmxt builtins.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c8bf7c1..842d67f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -24244,7 +24244,7 @@ arm_init_iwmmxt_builtins (void)
       enum machine_mode mode;
       tree type;
 
-      if (d->name == 0)
+      if (d->name == 0 || !(d->mask == FL_IWMMXT || d->mask == FL_IWMMXT2))
         continue;
 
       mode = insn_data[d->icode].operand[1].mode;
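As a quick illustration of the filter, here is a standalone sketch in plain C. The table, flag values and function name are made up (the real table is bdesc_2arg and the real flags are FL_IWMMXT and FL_IWMMXT2); only the shape of the condition follows the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature flags, one bit each.  */
#define TOY_FL_IWMMXT  (1u << 0)
#define TOY_FL_IWMMXT2 (1u << 1)
#define TOY_FL_CRC32   (1u << 2)

/* Toy stand-in for one entry of a shared builtin description table.  */
struct toy_builtin {
    const char *name;   /* NULL marks an unnamed entry */
    unsigned mask;      /* feature the builtin belongs to */
};

/* Count the entries an iwmmxt-only initialiser should process:
   unnamed entries and entries owned by other features are skipped,
   using the same condition as the patch.  */
static size_t toy_count_iwmmxt(const struct toy_builtin *table, size_t n)
{
    assert(table != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        const struct toy_builtin *d = &table[i];
        if (d->name == NULL
            || !(d->mask == TOY_FL_IWMMXT || d->mask == TOY_FL_IWMMXT2))
            continue;
        count++;
    }
    return count;
}
```

The check is an exact comparison against the two iwmmxt masks rather than a bit test, so an entry tagged for any other feature is skipped even if more flags are added later.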
[PATCH][ARM] Add CRC32 to the feature flags of Cortex-A53, A57
Hi all,

The Cortex-A53 and Cortex-A57 processors support the CRC32 extensions to ARMv8-a, so we specify that in their definitions in arm-cores.def. This also updates their big.LITTLE amalgamation and removes the redundant FL_THUMB_DIV and FL_ARM_DIV there since ARMv8-a already implies those flags.

Tested arm-none-eabi on a model. Ok for trunk?

2014-01-09  Kyrylo Tkachov  kyrylo.tkac...@arm.com

    * config/arm/arm-cores.def (cortex-a53): Specify FL_CRC32.
    (cortex-a57): Likewise.
    (cortex-a57.cortex-a53): Likewise. Remove redundant flags.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d961e25..1e97273 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -152,8 +152,8 @@ ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e)
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
 
 /* V8 Architecture Processors */
-ARM_CORE("cortex-a53", cortexa53, cortexa53, 8A, FL_LDSCHED, cortex_a53)
-ARM_CORE("cortex-a57", cortexa57, cortexa15, 8A, FL_LDSCHED, cortex_a15)
+ARM_CORE("cortex-a53", cortexa53, cortexa53, 8A, FL_LDSCHED | FL_CRC32, cortex_a53)
+ARM_CORE("cortex-a57", cortexa57, cortexa15, 8A, FL_LDSCHED | FL_CRC32, cortex_a15)
 
 /* V8 big.LITTLE implementations */
-ARM_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
+ARM_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A, FL_LDSCHED | FL_CRC32, cortex_a15)
[PATCH][ARM] Get mode for rtx costs calculations for SET RTX from destination reg
Hi all,

SET RTXs don't have a mode, so the code to calculate a reg-to-reg set in the arm rtx costs function needs to get the mode from one of the registers involved. We already did that when the source is a CONST_INT. This patch fixes that oversight and also prevents us from falling through or recursing, since the cost calculated for (set (reg) (reg)) should be final at that point.

Tested arm-none-eabi on qemu.

Ok for trunk?

Thanks,
Kyrill

2014-01-09  Kyrylo Tkachov  kyrylo.tkac...@arm.com

    * config/arm/arm.c (arm_new_rtx_costs): Use destination mode when handling a SET rtx.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c8bf7c1..4c991c2 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9092,6 +9092,9 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
     {
     case SET:
       *cost = 0;
+      /* SET RTXs don't have a mode so we get it from the destination.  */
+      mode = GET_MODE (SET_DEST (x));
+
       if (REG_P (SET_SRC (x))
           && REG_P (SET_DEST (x)))
         {
@@ -9106,6 +9109,8 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
              in 16 bits in Thumb mode.  */
           if (!speed_p && TARGET_THUMB && outer_code == COND_EXEC)
             *cost = 1;
+
+          return true;
         }
 
       if (CONST_INT_P (SET_SRC (x)))
@@ -9113,7 +9118,6 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
           /* Handle CONST_INT here, since the value doesn't have a mode
              and we would otherwise be unable to work out the true cost.  */
           *cost = rtx_cost (SET_DEST (x), SET, 0, speed_p);
-          mode = GET_MODE (SET_DEST (x));
           outer_code = SET;
           /* Slightly lower the cost of setting a core reg to a constant.
              This helps break up chains and allows for better scheduling.  */
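The idea of taking the mode from the destination can be shown with a toy model. Everything below (the toy_* types and mode_for_costs) is hypothetical and heavily simplified; it is not GCC's rtx representation, just a sketch of why a SET has to borrow its mode from an operand.

```c
#include <stddef.h>

/* Toy stand-ins for machine modes and RTX codes (hypothetical).  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_SET };

struct toy_rtx
{
  enum toy_code code;
  enum toy_mode mode;          /* TOY_VOIDmode for SET and CONST_INT */
  const struct toy_rtx *dest;  /* used by TOY_SET only */
  const struct toy_rtx *src;   /* used by TOY_SET only */
};

/* Pick the mode a cost calculation should use for X.  A SET carries
   no mode of its own, so use the destination's mode instead.  */
static enum toy_mode
mode_for_costs (const struct toy_rtx *x)
{
  if (x->code == TOY_SET && x->dest != NULL)
    return x->dest->mode;
  return x->mode;
}
```

For a (set (reg:DI) (reg:DI)) in this model the function returns the DImode of the destination rather than the SET's own void mode, so a register-count based cost would be computed for the right width.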
[Patch] Remove references to non-existent tree-flow.h file
While looking at PR 59335 (plugin doesn't build) I saw the comments about tree-flow.h and tree-flow-inline.h not existing anymore. While these files have been removed there are still some references to them in Makefile.in, doc/tree-ssa.texi, and a couple of source files. This patch removes the references to these now-nonexistent files. OK to checkin? Steve Ellcey sell...@mips.com 2014-01-09 Steve Ellcey sell...@mips.com * Makefile.in (TREE_FLOW_H): Remove. (TREE_SSA_H): Add files names from tree-flow.h. * doc/tree-ssa.texi (Annotations): Remove reference to tree-flow.h * tree.h: Remove tree-flow.h reference. * hash-table.h: Remove tree-flow.h reference. * tree-ssa-loop-niter.c (dump_affine_iv): Replace tree-flow.h reference with tree-ssa-loop.h. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 459b1ba..8eb4f68 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -929,11 +929,10 @@ CPP_ID_DATA_H = $(CPPLIB_H) $(srcdir)/../libcpp/include/cpp-id-data.h CPP_INTERNAL_H = $(srcdir)/../libcpp/internal.h $(CPP_ID_DATA_H) TREE_DUMP_H = tree-dump.h $(SPLAY_TREE_H) $(DUMPFILE_H) TREE_PASS_H = tree-pass.h $(TIMEVAR_H) $(DUMPFILE_H) -TREE_FLOW_H = tree-flow.h tree-flow-inline.h tree-ssa-operands.h \ +TREE_SSA_H = tree-ssa.h tree-ssa-operands.h \ $(BITMAP_H) sbitmap.h $(BASIC_BLOCK_H) $(GIMPLE_H) \ $(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \ tree-ssa-alias.h -TREE_SSA_H = tree-ssa.h $(TREE_FLOW_H) PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H) TREE_PRETTY_PRINT_H = tree-pretty-print.h $(PRETTY_PRINT_H) GIMPLE_PRETTY_PRINT_H = gimple-pretty-print.h $(TREE_PRETTY_PRINT_H) diff --git a/gcc/doc/tree-ssa.texi b/gcc/doc/tree-ssa.texi index 391dba8..e0238bd 100644 --- a/gcc/doc/tree-ssa.texi +++ b/gcc/doc/tree-ssa.texi @@ -53,9 +53,6 @@ variable has aliases. All these attributes are stored in data structures called annotations which are then linked to the field @code{ann} in @code{struct tree_common}. -Presently, we define annotations for variables (@code{var_ann_t}). 
-Annotations are defined and documented in @file{tree-flow.h}. - @node SSA Operands @section SSA Operands diff --git a/gcc/hash-table.h b/gcc/hash-table.h index 2b04067..034385c 100644 --- a/gcc/hash-table.h +++ b/gcc/hash-table.h @@ -1050,10 +1050,7 @@ hash_table Descriptor, Allocator::end () /* Iterate through the elements of hash_table HTAB, using hash_table ::iterator ITER, - storing each element in RESULT, which is of type TYPE. - - This macro has this form for compatibility with the - FOR_EACH_HTAB_ELEMENT currently defined in tree-flow.h. */ + storing each element in RESULT, which is of type TYPE. */ #define FOR_EACH_HASH_TABLE_ELEMENT(HTAB, RESULT, TYPE, ITER) \ for ((ITER) = (HTAB).begin (); \ diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 5a10297..7628363 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -1311,7 +1311,7 @@ dump_affine_iv (FILE *file, affine_iv *iv) if EVERY_ITERATION is true, we know the test is executed on every iteration. The results (number of iterations and assumptions as described in - comments at struct tree_niter_desc in tree-flow.h) are stored to NITER. + comments at struct tree_niter_desc in tree-ssa-loop.h) are stored to NITER. Returns false if it fails to determine number of iterations, true if it was determined (possibly with some assumptions). */ diff --git a/gcc/tree.h b/gcc/tree.h index fa79b6f..67454b7 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1114,9 +1114,6 @@ extern void protected_set_expr_location (tree, location_t); the given label expression. */ #define LABEL_EXPR_LABEL(NODE) TREE_OPERAND (LABEL_EXPR_CHECK (NODE), 0) -/* VDEF_EXPR accessors are specified in tree-flow.h, along with the other - accessors for SSA operands. */ - /* CATCH_EXPR accessors. */ #define CATCH_TYPES(NODE) TREE_OPERAND (CATCH_EXPR_CHECK (NODE), 0) #define CATCH_BODY(NODE) TREE_OPERAND (CATCH_EXPR_CHECK (NODE), 1)
[C++ testcase, committed] PR 59730
Hi,

this just adds the testcase to mainline.

Thanks,
Paolo.

//

2014-01-09  Paolo Carlini  paolo.carl...@oracle.com

    PR c++/59730
    * g++.dg/cpp0x/variadic145.C: New.

Index: g++.dg/cpp0x/variadic145.C
===================================================================
--- g++.dg/cpp0x/variadic145.C (revision 0)
+++ g++.dg/cpp0x/variadic145.C (working copy)
@@ -0,0 +1,13 @@
+// PR c++/59730
+// { dg-do compile { target c++11 } }
+
+template <typename> void declval();
+template <typename> void forward();
+template <typename> class D;
+template <typename _Functor, typename... _Bound_args>
+class D <_Functor(_Bound_args...)> {
+  template <typename... _Args, decltype(declval<_Functor>)>
+  void operator()(...) {
+    0(forward<_Args>...);
+  }
+};
[ping] Re: [patch] Pass -fuse-ld=gold to gccgo on targets supporting -fsplit-stack
ping patch

On 29.11.2013 14:29, Matthias Klose wrote:
> to get full advantage of the -fsplit-stack option, gccgo binaries have to be
> linked with gold, not the bfd linker. When the system linker defaults to the
> bfd linker, then gccgo should explicitly use the gold linker, passing
> -fuse-ld=gold, unless another -fuse-ld option is present.
>
> Tested with and without having ld.gold on the system.
>
>   Matthias
Re: [PATCH, go]: Skip some go tests
On Thu, Jan 9, 2014 at 2:54 AM, Uros Bizjak ubiz...@gmail.com wrote:
>
> There are two remaining warnings:
>
> go.test/test/nilcheck.go: unrecognized test line: // errorcheck -0 -N -d=nil
> go.test/test/nilptr3.go: unrecognized test line: // errorcheck -0 -d=nil

Thanks, not sure how I missed those. Those tests are really testing specific gc compiler behaviour anyhow, so we should just skip them with gccgo. This patch does that. Committed to mainline.

Ian

2014-01-09  Ian Lance Taylor  i...@google.com

    * go.test/go-test.exp (go-gc-tests): Skip nilptr tests that test the other Go compiler.

Index: go.test/go-test.exp
===================================================================
--- go.test/go-test.exp (revision 206473)
+++ go.test/go-test.exp (working copy)
@@ -1143,6 +1143,10 @@ proc go-gc-tests { } {
                    || $test_line == "// \$G \$D/pkg.go && pack grcS pkg.a pkg.\$A 2> /dev/null && rm pkg.\$A && \$G -I. -u \$D/main.go" } {
             # This tests the gc -u option, which gccgo does not
             # support.
+        } elseif { $test_line == "// errorcheck -0 -N -d=nil" \
+                   || $test_line == "// errorcheck -0 -d=nil" } {
+            # This tests gc nil pointer checks using -d=nil, which
+            # gccgo does not support.
         } else {
             clone_output "$name: unrecognized test line: $test_line"
             unsupported $name
Re: [PATCH,rs6000] Add -maltivec={le,be} options
Thanks for the comments! Here is a second go-round at the patch with improved documentation. I'm happy to change the wording if it can be further improved. Thanks, Bill 2014-01-09 Bill Schmidt wschm...@linux.vnet.ibm.com * doc/invoke.texi: Add -maltivec={be,le} options, and document default element-order behavior for -maltivec. * config/rs6000/rs6000.opt: Add -maltivec={be,le} options. * config/rs6000/rs6000.c (rs6000_option_override_internal): Ensure that -maltivec={le,be} implies -maltivec; disallow -maltivec=le when targeting big endian, at least for now. * config/rs6000/rs6000.h: Add #define of VECTOR_ELT_ORDER_BIG. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 206442) +++ gcc/doc/invoke.texi (working copy) @@ -18855,6 +18855,37 @@ the AltiVec instruction set. You may also need to @option{-mabi=altivec} to adjust the current ABI with AltiVec ABI enhancements. +When -maltivec is used, rather than -maltivec=le or -maltivec=be, the +element order for Altivec intrinsics such as vec_splat, vec_extract, +and vec_insert will match array element order corresponding to the +endianness of the target. That is, element zero identifies the +leftmost element in a vector register when targeting a big-endian +platform, and identifies the rightmost element in a vector register +when targeting a little-endian platform. + +@item -maltivec=be +@opindex maltivec=be +Generate Altivec instructions using big-endian element order, +regardless of whether the target is big- or little-endian. This is +the default when targeting a big-endian platform. + +The element order is used to interpret element numbers in Altivec +intrinsics such as vec_splat, vec_extract, and vec_insert. By +default, these will match array element order corresponding to the +endianness for the target. + +@item -maltivec=le +@opindex maltivec=le +Generate Altivec instructions using little-endian element order, +regardless of whether the target is big- or little-endian. 
This is +the default when targeting a little-endian platform. This option is +currently ignored when targeting a big-endian platform. + +The element order is used to interpret element numbers in Altivec +intrinsics such as vec_splat, vec_extract, and vec_insert. By +default, these will match array element order corresponding to the +endianness for the target. + @item -mvrsave @itemx -mno-vrsave @opindex mvrsave Index: gcc/config/rs6000/rs6000.opt === --- gcc/config/rs6000/rs6000.opt(revision 206442) +++ gcc/config/rs6000/rs6000.opt(working copy) @@ -140,6 +140,14 @@ maltivec Target Report Mask(ALTIVEC) Var(rs6000_isa_flags) Use AltiVec instructions +maltivec=le +Target Report RejectNegative Var(rs6000_altivec_element_order, 1) Save +Generate Altivec instructions using little-endian element order + +maltivec=be +Target Report RejectNegative Var(rs6000_altivec_element_order, 2) +Generate Altivec instructions using big-endian element order + mhard-dfp Target Report Mask(DFP) Var(rs6000_isa_flags) Use decimal floating point instructions Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 206442) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -3238,6 +3238,18 @@ rs6000_option_override_internal (bool global_init_ !(processor_target_table[tune_index].target_enable OPTION_MASK_HTM)) rs6000_isa_flags |= ~rs6000_isa_flags_explicit OPTION_MASK_STRICT_ALIGN; + /* -maltivec={le,be} implies -maltivec. */ + if (rs6000_altivec_element_order != 0) +rs6000_isa_flags |= OPTION_MASK_ALTIVEC; + + /* Disallow -maltivec=le in big endian mode for now. This is not + known to be useful for anyone. */ + if (BYTES_BIG_ENDIAN rs6000_altivec_element_order == 1) +{ + warning (0, N_(-maltivec=le not allowed for big-endian targets)); + rs6000_altivec_element_order = 0; +} + /* Add some warnings for VSX. 
*/ if (TARGET_VSX) { Index: gcc/config/rs6000/rs6000.h === --- gcc/config/rs6000/rs6000.h (revision 206442) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -468,6 +468,15 @@ extern int rs6000_vector_align[]; ? rs6000_vector_align[(MODE)] \ : (int)GET_MODE_BITSIZE ((MODE))) +/* Determine the element order to use for vector instructions. By + default we use big-endian element order when targeting big-endian, + and little-endian element order when targeting little-endian. For + programs being ported from BE Power to LE Power, it can sometimes + be useful to use big-endian element order when targeting little-endian. + This is set via -maltivec=be, for example. */ +#define VECTOR_ELT_ORDER_BIG
[PATCH] Ignore DECL_ALIGN of SSA_NAME underlying decls for dynamic stack realignment (PR middle-end/47735)
Hi!

As discussed in the PR, if a var isn't addressable and has gimple reg type, I don't see any point in honoring its DECL_ALIGN: we only refer to the var through SSA_NAME_VAR of SSA_NAMEs, nothing is allocated on the stack immediately, and the SSA_NAMEs are turned into pseudos for which we only care about their modes and corresponding alignments if they need to be spilled to stack.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-01-09  Jakub Jelinek  ja...@redhat.com

    PR middle-end/47735
    * cfgexpand.c (expand_one_var): For SSA_NAMEs, if the underlying
    var satisfies use_register_for_decl, just take into account type
    alignment, rather than decl alignment.

    * gcc.target/i386/pr47735.c: New test.

--- gcc/cfgexpand.c.jj 2014-01-08 19:37:33.630986939 +0100
+++ gcc/cfgexpand.c 2014-01-09 13:38:45.073324129 +0100
@@ -1215,8 +1215,11 @@ expand_one_var (tree var, bool toplevel,
          we conservatively assume it will be on stack even if VAR is
          eventually put into register after RA pass.  For non-automatic
          variables, which won't be on stack, we collect alignment of
-         type and ignore user specified alignment.  */
-      if (TREE_STATIC (var) || DECL_EXTERNAL (var))
+         type and ignore user specified alignment.  Similarly for
+         SSA_NAMEs for which use_register_for_decl returns true.  */
+      if (TREE_STATIC (var)
+          || DECL_EXTERNAL (var)
+          || (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl (var)))
         align = MINIMUM_ALIGNMENT (TREE_TYPE (var),
                                    TYPE_MODE (TREE_TYPE (var)),
                                    TYPE_ALIGN (TREE_TYPE (var)));
--- gcc/testsuite/gcc.target/i386/pr47735.c.jj 2014-01-09 13:30:14.410941107 +0100
+++ gcc/testsuite/gcc.target/i386/pr47735.c 2014-01-09 13:28:45.0 +0100
@@ -0,0 +1,16 @@
+/* PR middle-end/47735 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fomit-frame-pointer" } */
+
+unsigned
+mulh (unsigned a, unsigned b)
+{
+  unsigned long long l __attribute__ ((aligned (32)))
+    = ((unsigned long long) a * (unsigned long long) b) >> 32;
+  return l;
+}
+
+/* No need to dynamically realign the stack here.  */
+/* { dg-final { scan-assembler-not "and\[^\n\r]*%\[re\]sp" } } */
+/* Nor use a frame pointer.  */
+/* { dg-final { scan-assembler-not "%\[re\]bp" } } */

    Jakub
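For anyone who wants to try the testcase outside the testsuite, here is the function on its own as a minimal sketch. The right shift by 32 is my reading of the test's intent (a high-part multiply, as the name mulh suggests), and the expected values assume a 32-bit unsigned and a 64-bit unsigned long long.

```c
/* High 32 bits of the 64-bit product of A and B.  The over-aligned
   local never has its address taken, so it can live in a register
   and its requested alignment should not force stack realignment.  */
unsigned
mulh (unsigned a, unsigned b)
{
  unsigned long long l __attribute__ ((aligned (32)))
    = ((unsigned long long) a * (unsigned long long) b) >> 32;
  return l;
}
```

Compiling this with -O2 -fomit-frame-pointer -S on x86 and looking for an "and" on the stack pointer, or any use of the frame pointer, is what the dg-final directives in the test check for.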
[PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 5)
On Thu, Jan 09, 2014 at 02:27:40PM +0100, Richard Biener wrote: Perhaps, if you don't like the !gimple_in_ssa_p (cfun) in the condition we can just drop the lhs always in that case, just doing what we do for __builtin_unreachable if lhs is SSA_NAME: tree var = create_tmp_var (TREE_TYPE (lhs), NULL); tree def = get_or_create_ssa_default_def (cfun, var); gsi_insert_after (gsi, gimple_build_assign (lhs, def), GSI_NEW_STMT); That works for me. So like this? Bootstrapped/regtested on x86_64-linux and i686-linux, ok? 2014-01-09 Jakub Jelinek ja...@redhat.com PR tree-optimization/59622 * gimple-fold.c (gimple_fold_call): Fix a typo in message. For __builtin_unreachable replace the OBJ_TYPE_REF call with a call to __builtin_unreachable and add if needed a setter of the lhs SSA_NAME. Don't devirtualize for inplace at all. For targets.length () == 1, if the call is noreturn and cfun isn't in SSA form yet, clear lhs. * g++.dg/opt/pr59622-2.C: New test. * g++.dg/opt/pr59622-3.C: New test. * g++.dg/opt/pr59622-4.C: New test. * g++.dg/opt/pr59622-5.C: New test. 
--- gcc/gimple-fold.c.jj2014-01-08 17:44:57.690582374 +0100 +++ gcc/gimple-fold.c 2014-01-09 14:34:40.816149806 +0100 @@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator * (OBJ_TYPE_REF_EXPR (callee) { fprintf (dump_file, - Type inheritnace inconsistent devirtualization of ); + Type inheritance inconsistent devirtualization of ); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); fprintf (dump_file, to ); print_generic_expr (dump_file, callee, TDF_SLIM); @@ -1177,24 +1177,45 @@ gimple_fold_call (gimple_stmt_iterator * gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee)); changed = true; } - else if (flag_devirtualize virtual_method_call_p (callee)) + else if (flag_devirtualize !inplace virtual_method_call_p (callee)) { bool final; vec cgraph_node *targets = possible_polymorphic_call_targets (callee, final); if (final targets.length () = 1) { + tree lhs = gimple_call_lhs (stmt); if (targets.length () == 1) { gimple_call_set_fndecl (stmt, targets[0]-decl); changed = true; + /* If the call becomes noreturn, remove the lhs. 
*/ + if (lhs (gimple_call_flags (stmt) ECF_NORETURN)) + { + if (TREE_CODE (lhs) == SSA_NAME) + { + tree var = create_tmp_var (TREE_TYPE (lhs), NULL); + tree def = get_or_create_ssa_default_def (cfun, var); + gimple new_stmt = gimple_build_assign (lhs, def); + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + } + gimple_call_set_lhs (stmt, NULL_TREE); + } } - else if (!inplace) + else { tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE); gimple new_stmt = gimple_build_call (fndecl, 0); gimple_set_location (new_stmt, gimple_location (stmt)); - gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + if (lhs TREE_CODE (lhs) == SSA_NAME) + { + tree var = create_tmp_var (TREE_TYPE (lhs), NULL); + tree def = get_or_create_ssa_default_def (cfun, var); + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + update_call_from_tree (gsi, def); + } + else + gsi_replace (gsi, new_stmt, true); return true; } } --- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj 2014-01-09 10:57:46.246694025 +0100 +++ gcc/testsuite/g++.dg/opt/pr59622-2.C2014-01-09 10:57:46.246694025 +0100 @@ -0,0 +1,21 @@ +// PR tree-optimization/59622 +// { dg-do compile } +// { dg-options -O2 } + +namespace +{ + struct A + { +A () {} +virtual A *bar (int) = 0; +A *baz (int x) { return bar (x); } + }; +} + +A *a; + +void +foo () +{ + a-baz (0); +} --- gcc/testsuite/g++.dg/opt/pr59622-3.C.jj 2014-01-09 10:57:46.247694040 +0100 +++ gcc/testsuite/g++.dg/opt/pr59622-3.C2014-01-09 10:57:46.247694040 +0100 @@ -0,0 +1,21 @@ +// PR tree-optimization/59622 +// { dg-do compile } +// { dg-options -O2 } + +struct C { int a; int b; }; + +namespace +{ + struct A + { +virtual C foo (); +C bar () { return foo (); } + }; +} + +C +baz () +{ + A a; + return a.bar (); +} --- gcc/testsuite/g++.dg/opt/pr59622-4.C.jj
PATCH: Put a breakpoint on __sanitizer::Report
Hi,

This patch puts a breakpoint on __sanitizer::Report to help with debugging sanitizer issues. OK to install?

Thanks.

--
H.J.
--
2014-01-09  H.J. Lu  hongjiu...@intel.com

    * gdbasan.in: Put a breakpoint on __sanitizer::Report.

diff --git a/gcc/gdbasan.in b/gcc/gdbasan.in
index cf05825..3a6fca0 100644
--- a/gcc/gdbasan.in
+++ b/gcc/gdbasan.in
@@ -1,3 +1,7 @@
 # Put a breakpoint on __asan_report_error to help with debugging buffer
 # overflow.
 b __asan_report_error
+
+# Put a breakpoint on __sanitizer::Report to help with debugging sanitizer
+# issues.
+b __sanitizer::Report
Re: [patch] PR56572 flatten unnecessary nested transactions after inlining
On 01/06/14 13:40, Richard Henderson wrote: On 12/19/2013 11:06 AM, Richard Biener wrote: Aldy Hernandez al...@redhat.com wrote: I'd still like to catch the common cases, like I do with this patch. Perhaps we move this code to the .tmmark pass and handle the uninstrumented case. rth? tmmark is way way later than you'd want. I believe that ipa_tm is the right place. That's where we generate clones. The clones know a-priori that they're called within a transaction and thus all internal transations may be eliminated. And thus any inlining that would happen after ipa_tm would properly inline the clone, and thus no more need be done. I have a patch (attached for reference) removing the nested transactions while we are creating the clones (as suggested), but the uninstrumented code path complicates things. I'm afraid I don't have any good news. Consider this: inline void f() { __transaction_atomic { a = 12345; } } void g() { __transaction_atomic { f(); } } The problem is that when we add the uninstrumented code path later in tmipa, we end up with the following for g(): g () { bb 2: __transaction_atomic // SUBCODE=[ GTMA_HAVE_LOAD GTMA_HAVE_STORE ] goto bb 3; bb 5: f ();/* uninstrumented path */ __builtin__ITM_commitTransaction (); goto bb 4; bb 3: f (); [tm-clone]/* instrumented path */ __builtin__ITM_commitTransaction (); bb 4: return; } Since we only removed the transaction in the clone of f(), plain regular f() will still have the additional transaction, so inlining will still yield a g() with a nested transaction in the uninstrumented path. So we're back to square one, needing a separate pass to remove the nested transactions, and this pass will unfortunately have to deal with the uninstrumented/instrumented paths. This has taken longer to fix than I expected, so I'm going to put this aside for now and concentrate on some P1-P2's. 
For the record, since you don't like this pass in the .tmmark pass which is WAY late, can we have a tree pass right after the IPA passes (thus after inlining)? I'll add some notes to the PR so we can pick this up later. Aldy diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c index fe6dc28..59b589c 100644 --- a/gcc/trans-mem.c +++ b/gcc/trans-mem.c @@ -1,5 +1,7 @@ /* Passes for transactional memory support. Copyright (C) 2008-2014 Free Software Foundation, Inc. + Contributed by Richard Henderson r...@redhat.com and + Aldy Hernandez al...@redhat.com. This file is part of GCC. @@ -4106,8 +4108,8 @@ maybe_push_queue (struct cgraph_node *node, code path. QUEUE are the basic blocks inside the transaction represented in REGION. - Later in split_code_paths() we will add the conditional to choose - between the two alternatives. */ + Later in the tmmark pass (expand_transaction) we will add the + conditional to choose between the two alternatives. */ static void ipa_uninstrument_transaction (struct tm_region *region, @@ -4192,29 +4194,11 @@ ipa_tm_scan_calls_transaction (struct tm_ipa_cg_data *d, bbs = get_tm_region_blocks (r-entry_block, r-exit_blocks, NULL, d-transaction_blocks_normal, false); - // Generate the uninstrumented code path for this transaction. - ipa_uninstrument_transaction (r, bbs); - FOR_EACH_VEC_ELT (bbs, i, bb) ipa_tm_scan_calls_block (callees_p, bb, false); bbs.release (); } - - // ??? copy_bbs should maintain cgraph edges for the blocks as it is - // copying them, rather than forcing us to do this externally. - rebuild_cgraph_edges (); - - // ??? In ipa_uninstrument_transaction we don't try to update dominators - // because copy_bbs doesn't return a VEC like iterate_fix_dominators expects. - // Instead, just release dominators here so update_ssa recomputes them. - free_dominance_info (CDI_DOMINATORS); - - // When building the uninstrumented code path, copy_bbs will have invoked - // create_new_def_for starting an ssa update context. 
There is only one - // instance of this context, so resolve ssa updates before moving on to - // the next function. - update_ssa (TODO_update_ssa); } /* Scan all calls in NODE as if this is the transactional clone, @@ -4890,10 +4874,11 @@ ipa_tm_create_version_alias (struct cgraph_node *node, void *data) return false; } -/* Create a copy of the function (possibly declaration only) of OLD_NODE, - appropriate for the transactional clone. */ +/* Create a copy of the function (possibly declaration only) of + OLD_NODE, appropriate for the transactional clone. Returns the + cgraph node for the newly created clone. */ -static void +static struct cgraph_node * ipa_tm_create_version (struct cgraph_node *old_node) { tree new_decl, old_decl, tm_name; @@ -4947,13 +4932,12 @@ ipa_tm_create_version (struct cgraph_node *old_node) ipa_tm_mark_forced_by_abi_node
Re: PATCH: Put a breakpoint on __sanitizer::Report
On Thu, Jan 09, 2014 at 10:28:56AM -0800, H.J. Lu wrote:
> Hi,
>
> This patch puts a breakpoint on __sanitizer::Report to help with debugging
> sanitizer issues. OK to install?

Ok.

> 2014-01-09  H.J. Lu  hongjiu...@intel.com
>
>     * gdbasan.in: Put a breakpoint on __sanitizer::Report.
>
> diff --git a/gcc/gdbasan.in b/gcc/gdbasan.in
> index cf05825..3a6fca0 100644
> --- a/gcc/gdbasan.in
> +++ b/gcc/gdbasan.in
> @@ -1,3 +1,7 @@
>  # Put a breakpoint on __asan_report_error to help with debugging buffer
>  # overflow.
>  b __asan_report_error
> +
> +# Put a breakpoint on __sanitizer::Report to help with debugging sanitizer
> +# issues.
> +b __sanitizer::Report

    Jakub
Re: [PATCH, AArch64 4/6] soft-fp: Commonize creation of TImode types
On Thu, 9 Jan 2014, Richard Henderson wrote:

> > This isn't the right conditional. _FP_W_TYPE_SIZE is ultimately an
> > optimization choice and need not be related to whether any TImode
> > functions are being defined using soft-fp, or whether TImode is supported
> > at all. I think the most you can do is have sfp-machine.h define a macro
> > to say that TImode should be supported in soft-fp, rather than actually
> > defining the types itself.
>
> The documentation for longlong.h says we must have a double-word type defined.
> Given how easy it is to support a double-word type...

I suppose that's a reason to define TImode types under that condition unless and until soft-fp is used with _FP_W_TYPE_SIZE == 64 for an architecture not supporting them (there's also the possibility it might be used with _FP_W_TYPE_SIZE == 32 but with TImode support wanted, though defining the types in sfp-machine.h would of course be possible then).

But of course the patches need proposing for glibc first (for longlong.h things are less clear, as long as a patch applied to one place is promptly then applied to the other).

--
Joseph S. Myers
jos...@codesourcery.com
Re: std::vector move assign patch
On 9 January 2014 12:22, H.J. Lu wrote: On Fri, Dec 27, 2013 at 10:27 AM, François Dumont frs.dum...@gmail.com wrote: Hi Here is a patch to fix an issue in normal mode during the move assignment. The destination vector allocator instance is moved too during the assignment which is wrong. As I discover this problem while working on issues with management of safe iterators during move operations this patch also fix those issues in the debug mode for the vector container. Fixes for other containers in debug mode will come later. 2013-12-27 François Dumont fdum...@gcc.gnu.org * include/bits/stl_vector.h (std::vector::_M_move_assign): Pass *this allocator instance when building temporary vector instance so that *this allocator do not get moved. * include/debug/safe_base.h (_Safe_sequence_base(_Safe_sequence_base)): New. * include/debug/vector (__gnu_debug::vector(vector)): Use latter. (__gnu_debug::vector(vector, const allocator_type)): Swap safe iterators if the instance is moved. (__gnu_debug::vector::operator=(vector)): Likewise. * testsuite/23_containers/vector/allocator/move.cc (test01): Add check on a vector iterator. * testsuite/23_containers/vector/allocator/move_assign.cc (test02): Likewise. (test03): New, test with a non-propagating allocator. * testsuite/23_containers/vector/debug/move_assign_neg.cc: New. Tested under Linux x86_64 normal and debug modes. I will be in vacation for a week starting today so if you want to apply it quickly do not hesitate to do it yourself. This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59738 Fixed by the attached patch, tested x86_64-linux and committed to trunk. I've also rotated the libstdc++ ChangeLog. 2014-01-09 Jonathan Wakely jwak...@redhat.com PR libstdc++/59738 * include/bits/stl_vector.h (vector::_M_move_assign): Restore support for non-Movable types. 
commit c12a0d112781150c2888de7c63960e22ef4ffcbb
Author: Jonathan Wakely jwak...@redhat.com
Date:   Thu Jan 9 16:50:50 2014 +

    PR libstdc++/59738
    * include/bits/stl_vector.h (vector::_M_move_assign): Restore
    support for non-Movable types.

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 3638a8c..2cedd39 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1433,7 +1433,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
       void
       _M_move_assign(vector&& __x, std::true_type) noexcept
       {
-        const vector __tmp(std::move(*this), get_allocator());
+        vector __tmp(get_allocator());
+        this->_M_impl._M_swap_data(__tmp._M_impl);
         this->_M_impl._M_swap_data(__x._M_impl);
         if (_Alloc_traits::_S_propagate_on_move_assign())
           std::__alloc_on_move(_M_get_Tp_allocator(),
Re: [PATCH,rs6000] Add -maltivec={le,be} options
On Thu, 9 Jan 2014, Bill Schmidt wrote:

> +When -maltivec is used, rather than -maltivec=le or -maltivec=be, the
> +element order for Altivec intrinsics such as vec_splat, vec_extract,
> +and vec_insert will match array element order corresponding to the
> +endianness of the target. That is, element zero identifies the
> +leftmost element in a vector register when targeting a big-endian
> +platform, and identifies the rightmost element in a vector register
> +when targeting a little-endian platform.

Use @option{} markup around option names and @code{} around intrinsic names, here and in the discussion of intrinsics under individual options.

--
Joseph S. Myers
jos...@codesourcery.com
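As an aside on the element-order rule in the quoted documentation text, the mapping from an element number to a position in the register can be sketched in a few lines of C. The function name and its parameters are made up for this illustration; it only restates the rule that element zero is leftmost under big-endian element order and rightmost under little-endian element order.

```c
#include <stdbool.h>

/* Return the position of element ELT, counted from the left end of a
   vector register holding NELTS elements.  Hypothetical helper: with
   big-endian element order element 0 is leftmost, with little-endian
   element order element 0 is rightmost.  */
static unsigned int
position_from_left (unsigned int elt, unsigned int nelts,
                    bool big_endian_order)
{
  return big_endian_order ? elt : nelts - 1 - elt;
}
```

So for a four-element vector, element 0 sits at position 0 under -maltivec=be semantics and at position 3 under little-endian element order, which is why intrinsics such as vec_extract need to know which convention is in force.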
[Patch, testsuite, mips] Fix test gcc.dg/delay-slot-1.c for MIPS
The gcc.dg/delay-slot-1.c test is failing for MIPS targets that do not support the 64-bit ABI because it didn't check whether that support existed before using the -mabi=64 flag. This patch fixes the problem by using the mips64 check.

OK to checkin?

Steve Ellcey
sell...@mips.com

2014-01-09  Steve Ellcey  sell...@mips.com

    * gcc.dg/delay-slot-1.c: Add check for 64 bit support.

diff --git a/gcc/testsuite/gcc.dg/delay-slot-1.c b/gcc/testsuite/gcc.dg/delay-slot-1.c
index f3bcd8e..bfc0273 100644
--- a/gcc/testsuite/gcc.dg/delay-slot-1.c
+++ b/gcc/testsuite/gcc.dg/delay-slot-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2" } */
-/* { dg-options "-O2 -mabi=64" { target mips-*-linux-* } } */
+/* { dg-options "-O2 -mabi=64" { target { mips*-*-linux* && mips64 } } } */
 
 struct offset_v1 {
     int k_uniqueness;
Re: [PATCH,rs6000] Add -maltivec={le,be} options
On Thu, Jan 9, 2014 at 1:14 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:
> Thanks for the comments! Here is a second go-round at the patch with
> improved documentation. I'm happy to change the wording if it can be
> further improved.
>
> Thanks,
> Bill
>
> 2014-01-09  Bill Schmidt  wschm...@linux.vnet.ibm.com
>
>     * doc/invoke.texi: Add -maltivec={be,le} options, and document
>     default element-order behavior for -maltivec.
>     * config/rs6000/rs6000.opt: Add -maltivec={be,le} options.
>     * config/rs6000/rs6000.c (rs6000_option_override_internal): Ensure
>     that -maltivec={le,be} implies -maltivec; disallow -maltivec=le
>     when targeting big endian, at least for now.
>     * config/rs6000/rs6000.h: Add #define of VECTOR_ELT_ORDER_BIG.

The patch and text look good, with the markup fixes requested by Joseph.

Thanks, David
Re: [Patch, testsuite, mips] Fix test gcc.dg/delay-slot-1.c for MIPS
Steve Ellcey sell...@mips.com writes:
> 2014-01-09  Steve Ellcey  sell...@mips.com
>
>     * gcc.dg/delay-slot-1.c: Add check for 64 bit support.

OK, thanks. Pedantically it's "Restrict -mabi=64 to 64-bit processors.", since we're not really checking whether the support is there, but whether n64 is compatible with the currently-selected processor.

Last time I looked, user options take precedence over test options, so someone testing specific -mabi options (like I do for mips64-linux-gnu) won't be affected either way. And for people testing -march= without -mabi=, or people using toolchains configured using --with-arch=32-bit-proc, I agree the change is a good thing.

Thanks,
Richard
PATCH: Remove the unused btver1
Hi Uros,

btver1 is never used. This patch removes it. It avoids:

insn-attrtab.c:extern int internal_dfa_insn_code_btver1 (rtx);
insn-attrtab.c:extern int insn_default_latency_btver1 (rtx);

OK to install? Thanks.

--
H.J.
--
2014-01-09  H.J. Lu  hongjiu...@intel.com

    * config/i386/i386.md (cpu): Remove the unused btver1.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index de0b2dd..954bbed 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -366,7 +366,7 @@
 ;; Processor type.
 (define_attr "cpu" "none,pentium,pentiumpro,geode,k6,athlon,k8,core2,nehalem,
                     atom,slm,generic,amdfam10,bdver1,bdver2,bdver3,bdver4,
-                    btver1,btver2"
+                    btver2"
   (const (symbol_ref "ix86_schedule")))
 
 ;; A basic instruction type.  Refinements due to arguments to be
Re: PATCH: Remove the unused btver1
On Thu, Jan 9, 2014 at 8:32 PM, H.J. Lu hjl.to...@gmail.com wrote: btver1 is never used. This patch removes it. It avoids: insn-attrtab.c:extern int internal_dfa_insn_code_btver1 (rtx); insn-attrtab.c:extern int insn_default_latency_btver1 (rtx); OK to install? OK. Thanks, Uros.
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Wed, Jan 8, 2014 at 2:33 PM, Rong Xu x...@google.com wrote: On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka hubi...@ucw.cz wrote: @@ -325,6 +311,9 @@ static struct gcov_summary all_prg; #endif /* crc32 for this program. */ static gcov_unsigned_t crc32; +/* Use this summary checksum rather the computed one if the value is + *non-zero. */ +static gcov_unsigned_t saved_summary_checksum; Why do you need to save the checksum? Won't it reset summary back with multiple streaming? This was for the gcov_tool. The checksum will be recomputed in gcov_exit and the value will depend on the order of the gcov_info list (the order will be different after reading from gcda files to memory). The purpose was to have the same summary_checksum so that I can get identical gcov-dump output. I would really like to avoid introducing those static vars that are used exclusively by gcov_exit. What about putting them into a gcov_context structure that is passed around the functions that were broken out? With my recent patch that localizes this_prg, we only use 64 more bytes in bss. Do you still think we have to remove all these statics? libgcc ChangeLog entries should be in libgcc/ChangeLog, not gcc/ChangeLog. I checked in a patch to move them to libgcc/ChangeLog. -- H.J.
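As a rough illustration of the gcov_context idea raised above (bundle the state used only by gcov_exit into one structure and pass it to the helper functions, instead of keeping file-scope statics), a minimal sketch might look like this. The structure layout and the helper name are assumptions made for the example; only crc32 and saved_summary_checksum are names from the quoted patch, and the typedef is a stand-in for libgcov's own gcov_unsigned_t.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for libgcov's typedef; assumed to be 32 bits here.  */
typedef uint32_t gcov_unsigned_t;

/* Hypothetical context replacing the file-scope statics.  */
struct gcov_context
{
  gcov_unsigned_t crc32;                  /* crc32 for this program */
  gcov_unsigned_t saved_summary_checksum; /* 0 means "not set" */
};

/* Hypothetical helper: use the saved checksum rather than the
   computed one if the saved value is non-zero.  */
gcov_unsigned_t
effective_summary_checksum (const struct gcov_context *ctx,
                            gcov_unsigned_t computed)
{
  return ctx->saved_summary_checksum != 0
         ? ctx->saved_summary_checksum
         : computed;
}
```

The trade-off discussed in the thread still applies: the context removes the statics at the cost of threading one more pointer through every function split out of gcov_exit.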
Re: [RFC] libgcov.c re-factoring and offline profile-tool
My bad. Thanks for the fix! -Rong On Thu, Jan 9, 2014 at 11:47 AM, H.J. Lu hjl.to...@gmail.com wrote: libgcc ChangeLog entries should be in libgcc/ChangeLog, not gcc/ChangeLog. I checked in a patch to move them to libgcc/ChangeLog. -- H.J.
[Patch, Fortran, committed] Fix buglet in cpp.c
Committed as obvious: Rev. 206487.

Tobias

Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog (revision 206486)
+++ gcc/fortran/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2014-01-09  Tobias Burnus  bur...@net-b.de
+
+    * cpp.c (gfc_cpp_handle_option): Add missing break.
+    * trans-io.c (transfer_expr): Silence unused value warning.
+
 2014-01-08  Janus Weil  ja...@gcc.gnu.org
 
     PR fortran/58182
Index: gcc/fortran/cpp.c
===
--- gcc/fortran/cpp.c (revision 206486)
+++ gcc/fortran/cpp.c (working copy)
@@ -363,6 +363,7 @@ gfc_cpp_handle_option (size_t scode, const char *a
 
     case OPT_Wdate_time:
       gfc_cpp_option.warn_date_time = value;
+      break;
 
     case OPT_A:
     case OPT_D:
Index: gcc/fortran/trans-io.c
===
--- gcc/fortran/trans-io.c (revision 206486)
+++ gcc/fortran/trans-io.c (working copy)
@@ -2152,7 +2152,7 @@ transfer_expr (gfc_se * se, gfc_typespec * ts, tre
          function, if only referenced in an io statement, requires this check
          (see PR58771).  */
       if (ts->u.derived->backend_decl == NULL_TREE)
-        tmp = gfc_typenode_for_spec (ts);
+        (void) gfc_typenode_for_spec (ts);
 
       for (c = ts->u.derived->components; c; c = c->next)
         {
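The cpp.c part of this fix is an ordinary switch fall-through: without the break, handling OPT_Wdate_time would continue into the OPT_A/OPT_D case. A small self-contained sketch of the effect is below. The option codes, the struct and both functions are hypothetical and only loosely modeled on gfc_cpp_handle_option; in particular, the counter stands in for whatever the following case really does.

```c
#include <assert.h>

/* Hypothetical option codes; not GCC's real OPT_* enumerators.  */
enum opt_code { OPT_WARN_DATE_TIME, OPT_DEFINE };

struct cpp_options
{
  int warn_date_time; /* set by OPT_WARN_DATE_TIME */
  int deferred_count; /* options queued by OPT_DEFINE */
};

/* Buggy variant: the first case lacks a break, so the option is also
   processed as if it were OPT_DEFINE.  */
void
handle_option_buggy (struct cpp_options *o, enum opt_code code, int value)
{
  switch (code)
    {
    case OPT_WARN_DATE_TIME:
      o->warn_date_time = value;
      /* fall through (this is the bug) */
    case OPT_DEFINE:
      o->deferred_count++;
      break;
    }
}

/* Fixed variant: the added break keeps the two cases separate.  */
void
handle_option_fixed (struct cpp_options *o, enum opt_code code, int value)
{
  switch (code)
    {
    case OPT_WARN_DATE_TIME:
      o->warn_date_time = value;
      break;
    case OPT_DEFINE:
      o->deferred_count++;
      break;
    }
}
```

With the buggy variant, passing OPT_WARN_DATE_TIME both sets the flag and bumps deferred_count; with the fixed variant only the flag changes.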
Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 5)
Jakub Jelinek ja...@redhat.com wrote: On Thu, Jan 09, 2014 at 02:27:40PM +0100, Richard Biener wrote: Perhaps, if you don't like the !gimple_in_ssa_p (cfun) in the condition we can just drop the lhs always in that case, just doing what we do for __builtin_unreachable if lhs is SSA_NAME: tree var = create_tmp_var (TREE_TYPE (lhs), NULL); tree def = get_or_create_ssa_default_def (cfun, var); gsi_insert_after (gsi, gimple_build_assign (lhs, def), GSI_NEW_STMT); That works for me. So like this? Bootstrapped/regtested on x86_64-linux and i686-linux, ok? Ok, Thanks. 2014-01-09 Jakub Jelinek ja...@redhat.com PR tree-optimization/59622 * gimple-fold.c (gimple_fold_call): Fix a typo in message. For __builtin_unreachable replace the OBJ_TYPE_REF call with a call to __builtin_unreachable and add if needed a setter of the lhs SSA_NAME. Don't devirtualize for inplace at all. For targets.length () == 1, if the call is noreturn and cfun isn't in SSA form yet, clear lhs. * g++.dg/opt/pr59622-2.C: New test. * g++.dg/opt/pr59622-3.C: New test. * g++.dg/opt/pr59622-4.C: New test. * g++.dg/opt/pr59622-5.C: New test. 
--- gcc/gimple-fold.c.jj 2014-01-08 17:44:57.690582374 +0100
+++ gcc/gimple-fold.c 2014-01-09 14:34:40.816149806 +0100
@@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator *
                                          (OBJ_TYPE_REF_EXPR (callee)
             {
               fprintf (dump_file,
-                       "Type inheritnace inconsistent devirtualization of ");
+                       "Type inheritance inconsistent devirtualization of ");
               print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
               fprintf (dump_file, " to ");
               print_generic_expr (dump_file, callee, TDF_SLIM);
@@ -1177,24 +1177,45 @@ gimple_fold_call (gimple_stmt_iterator *
           gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee));
           changed = true;
         }
-      else if (flag_devirtualize && virtual_method_call_p (callee))
+      else if (flag_devirtualize && !inplace && virtual_method_call_p (callee))
         {
           bool final;
           vec <cgraph_node *>targets
             = possible_polymorphic_call_targets (callee, &final);
           if (final && targets.length () <= 1)
             {
+              tree lhs = gimple_call_lhs (stmt);
               if (targets.length () == 1)
                 {
                   gimple_call_set_fndecl (stmt, targets[0]->decl);
                   changed = true;
+                  /* If the call becomes noreturn, remove the lhs.  */
+                  if (lhs && (gimple_call_flags (stmt) & ECF_NORETURN))
+                    {
+                      if (TREE_CODE (lhs) == SSA_NAME)
+                        {
+                          tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
+                          tree def = get_or_create_ssa_default_def (cfun, var);
+                          gimple new_stmt = gimple_build_assign (lhs, def);
+                          gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+                        }
+                      gimple_call_set_lhs (stmt, NULL_TREE);
+                    }
                 }
-              else if (!inplace)
+              else
                 {
                   tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
                   gimple new_stmt = gimple_build_call (fndecl, 0);
                   gimple_set_location (new_stmt, gimple_location (stmt));
-                  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+                  if (lhs && TREE_CODE (lhs) == SSA_NAME)
+                    {
+                      tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
+                      tree def = get_or_create_ssa_default_def (cfun, var);
+                      gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+                      update_call_from_tree (gsi, def);
+                    }
+                  else
+                    gsi_replace (gsi, new_stmt, true);
                   return true;
                 }
             }
--- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj 2014-01-09 10:57:46.246694025 +0100
+++ gcc/testsuite/g++.dg/opt/pr59622-2.C 2014-01-09 10:57:46.246694025 +0100
@@ -0,0 +1,21 @@
+// PR tree-optimization/59622
+// { dg-do compile }
+// { dg-options "-O2" }
+
+namespace
+{
+  struct A
+  {
+    A () {}
+    virtual A *bar (int) = 0;
+    A *baz (int x) { return bar (x); }
+  };
+}
+
+A *a;
+
+void
+foo ()
+{
+  a->baz (0);
+}
--- gcc/testsuite/g++.dg/opt/pr59622-3.C.jj 2014-01-09 10:57:46.247694040 +0100
+++ gcc/testsuite/g++.dg/opt/pr59622-3.C 2014-01-09 10:57:46.247694040 +0100
@@ -0,0 +1,21 @@
+// PR tree-optimization/59622
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct C { int a; int b; };
+
+namespace
+{
+  struct A
+  {
+    virtual C foo ();
+    C bar () { return foo (); }
+  };
+}
+
+C
+baz ()
+{
+  A a;
+  return a.bar ();
+}
--- gcc/testsuite/g++.dg/opt/pr59622-4.C.jj 2014-01-09 10:57:46.247694040