Re: [gofrontend-dev] [PATCH 7/9] Gccgo port to s390[x] -- part I
On Tue, Oct 28, 2014 at 10:30:08AM -0700, Ian Taylor wrote:

On Tue, Oct 28, 2014 at 7:31 AM, Dominik Vogt v...@linux.vnet.ibm.com wrote:

The attached patch contains all the discussed changes.

I made a few formatting changes. I patched the test to work on x86, by making the char types accept either int8 or uint8, and making the long double tests accept any floating point size. Approved and applied as attached.

Great, thanks! By the way, the changes I made to this patch do not interfere with patch #8 (complex type support for -fdump-go-spec) in any way - it should still apply without conflict.

Ciao

Dominik ^_^ ^_^

--
Dominik Vogt
IBM Germany
Re: [gofrontend-dev] [PATCH 8/9] Gccgo port to s390[x] -- part I
Patch updated to remove conflicts with changed tests in patch 7.

Ciao

Dominik ^_^ ^_^

--
Dominik Vogt
IBM Germany

From e81ca934b619cad8b3872f28edbf3d2d0afeeec9 Mon Sep 17 00:00:00 2001
From: Dominik Vogt v...@linux.vnet.ibm.com
Date: Fri, 5 Sep 2014 07:31:01 +0100
Subject: [PATCH 8/9] Gccgo port to s390[x] -- part I

godump: Support _Complex types in go_format_type.

1) float/double _Complex are represented as complex64/complex128 in Go as
   appropriate.
2) Add tests.
---
 gcc/godump.c                            | 34 +
 gcc/testsuite/gcc.misc-tests/godump-1.c | 30 +
 2 files changed, 64 insertions(+)

diff --git a/gcc/godump.c b/gcc/godump.c
index 7a05664..fccd3eb 100644
--- a/gcc/godump.c
+++ b/gcc/godump.c
@@ -780,6 +780,40 @@ go_format_type (struct godump_container *container, tree type,
         }
       break;

+    case COMPLEX_TYPE:
+      {
+        const char *s;
+        char buf[100];
+        tree real_type;
+
+        real_type = TREE_TYPE (type);
+        if (TREE_CODE (real_type) == REAL_TYPE)
+          {
+            switch (TYPE_PRECISION (real_type))
+              {
+              case 32:
+                s = "complex64";
+                break;
+              case 64:
+                s = "complex128";
+                break;
+              default:
+                snprintf (buf, sizeof buf, "INVALID-complex-%u",
+                          2 * TYPE_PRECISION (real_type));
+                s = buf;
+                ret = false;
+                break;
+              }
+          }
+        else
+          {
+            s = "INVALID-complex-non-real";
+            ret = false;
+          }
+        obstack_grow (ob, s, strlen (s));
+      }
+      break;
+
     case BOOLEAN_TYPE:
       obstack_grow (ob, "bool", 4);
       break;
diff --git a/gcc/testsuite/gcc.misc-tests/godump-1.c b/gcc/testsuite/gcc.misc-tests/godump-1.c
index 876cf28..f339cc9 100644
--- a/gcc/testsuite/gcc.misc-tests/godump-1.c
+++ b/gcc/testsuite/gcc.misc-tests/godump-1.c
@@ -104,6 +104,21 @@ d_t d_v2;
 typedef long double ld_t;
 long double ld_v1;
 ld_t ld_v2;
+typedef _Complex cx_t;
+_Complex cx_v1;
+cx_t cx_v2;
+typedef float _Complex fcx_t;
+float _Complex fcx_v1;
+fcx_t fcx_v2;
+typedef double _Complex dcx_t;
+double _Complex dcx_v1;
+dcx_t dcx_v2;
+typedef long double _Complex ldcx_t;
+long double _Complex ldcx_v1;
+ldcx_t ldcx_v2;
+typedef int _Complex icx_t;
+int _Complex icx_v1;
+icx_t icx_v2;

 /* nested typedefs */
 typedef int ni_t;
@@ -301,6 +316,11 @@ typedef int8_t (*func_t)(void *p);
 /* { dg-final { scan-file godump-1.out "(?n)^type _f_t float\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^type _d_t float\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^// type _ld_t INVALID-float-\[0-9\]*$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^type _cx_t complex\[0-9\]*$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^type _fcx_t complex\[0-9\]*$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^type _dcx_t complex\[0-9\]*$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^// type _ldcx_t INVALID-complex-256$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^// type _icx_t INVALID-complex-non-real$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^type _ni_t int\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^type _ni2_t int\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^type _ni3_t int\[0-9\]*$" } } */
@@ -414,6 +434,16 @@ typedef int8_t (*func_t)(void *p);
 /* { dg-final { scan-file godump-1.out "(?n)^var _d_v2 _d_t$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^// var _ld_v1 INVALID-float-\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^// var _ld_v2 INVALID-float-\[0-9\]*$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _cx_v1 complex\[0-9\]*$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _cx_v2 _cx_t$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _fcx_v1 complex\[0-9\]*$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _fcx_v2 _fcx_t$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _dcx_v1 complex\[0-9\]*$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _dcx_v2 _dcx_t$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^// var _ldcx_v1 INVALID-complex-256$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^// var _ldcx_v2 INVALID-complex-256$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^// var _icx_v1 INVALID-complex-non-real$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^// var _icx_v2 INVALID-complex-non-real$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^var _ni2_v2 _ni2_t$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^var _ni3_v2 _ni3_t$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^var _e1_v1 int$" } } */
--
1.8.4.2
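The mapping that the patch above adds for COMPLEX_TYPE can be sketched in standalone C. This is only an illustration: `go_complex_name` is a hypothetical helper, not a function in godump.c, and it takes the precision of the real component as a plain integer rather than a GCC `tree`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of godump.c: write the Go type name for a
   C _Complex type whose real component has REAL_BITS bits of precision.
   Returns false when Go has no matching type, mirroring how the patch
   clears `ret' for the INVALID-complex-* cases.  */
static bool
go_complex_name (unsigned int real_bits, char *buf, size_t len)
{
  switch (real_bits)
    {
    case 32:
      snprintf (buf, len, "complex64");
      return true;
    case 64:
      snprintf (buf, len, "complex128");
      return true;
    default:
      /* A complex value is twice as wide as its real component.  */
      snprintf (buf, len, "INVALID-complex-%u", 2 * real_bits);
      return false;
    }
}
```

With a real component of 128 bits the helper yields INVALID-complex-256, which is the string the new _ldcx_t tests look for.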
Re: [PATCH] Fix PR63665
2014-10-28 Richard Biener rguent...@suse.de

PR tree-optimization/63665
* tree-vect-slp.c (vect_get_mask_element): Properly handle accessing out-of-bound elements.

Does it fix the assertion failure on the attached testcase? If so, would you mind committing the testcase with the patch?

* gnat.dg/opt42.ad[sb]: New test.

--
Eric Botcazou

package Opt42 is

   type Index_Type is range 1 .. 7;
   type Row_Type is array (Index_Type) of Float;
   type Array_Type is array (Index_Type) of Row_Type;

   function "*" (Left, Right : in Array_Type) return Array_Type;

end Opt42;

-- { dg-do compile }
-- { dg-options "-cargs --param max-completely-peeled-insns=200 -margs -O3" }

package body Opt42 is

   function "*" (Left, Right : in Array_Type) return Array_Type is
      Temp   : Float;
      Result : Array_Type;
   begin
      for I in Index_Type loop
         for J in Index_Type loop
            Temp := 0.0;
            for K in Index_Type loop
               Temp := Temp + Left (I) (K) * Right (K) (J);
            end loop;
            Result (I) (J) := Temp;
         end loop;
      end loop;
      return Result;
   end "*";

end Opt42;
[committed]: change my email address
I will commit the following change in MAINTAINERS.

Tristan.

2014-10-29 Tristan Gingold ging...@adacore.com

* MAINTAINERS: Change my email address.

--- MAINTAINERS (revision 216822)
+++ MAINTAINERS (working copy)
@@ -136,7 +136,7 @@
 RTEMS Ports             Joel Sherrill           j...@oarcorp.com
 RTEMS Ports             Ralf Corsepius          ralf.corsep...@rtems.org
 VMS                     Douglas Rupp            douglas.b.r...@gmail.com
-VMS                     Tristan Gingold         ging...@adacore.com
+VMS                     Tristan Gingold         tging...@free.fr
 VxWorks ports           Nathan Sidwell          nat...@codesourcery.com
 windows, cygwin, mingw  Kai Tietz               kti...@redhat.com
Re: [PATCH, x86, 63534] Fix '-p' profile for 32 bit PIC mode
Patch with the fixes: Bootstrap, gcc make check and spec2000 with -p passed. 2014-10-29 Evgeny Stupachenko evstu...@gmail.com gcc/testsuite PR target/63534 * gcc.target/i386/mcount_pic.c: New. gcc/ PR target/63534 * config/i386/i386.c (ix86_init_pic_reg): Emit SET_GOT to REAL_PIC_OFFSET_TABLE_REGNUM for mcount profiling. (ix86_save_reg): Save REAL_PIC_OFFSET_TABLE_REGNUM when profiling using mcount in 32bit PIC mode. (ix86_elim_entry_set_got): New. (ix86_expand_prologue): For the mcount profiling emit new SET_GOT in PROLOGUE, delete initial if possible. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 6235c4f..fe61b8c 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -6190,8 +6190,15 @@ ix86_init_pic_reg (void) } else { - rtx insn = emit_insn (gen_set_got (pic_offset_table_rtx)); + /* If there is future mcount call in the function it is more profitable + to emit SET_GOT into ABI defined REAL_PIC_OFFSET_TABLE_REGNUM. */ + rtx reg = crtl-profile + ? gen_rtx_REG (Pmode, REAL_PIC_OFFSET_TABLE_REGNUM) + : pic_offset_table_rtx; + rtx insn = emit_insn (gen_set_got (reg)); RTX_FRAME_RELATED_P (insn) = 1; + if (crtl-profile) +emit_move_insn (pic_offset_table_rtx, reg); add_reg_note (insn, REG_CFA_FLUSH_QUEUE, NULL_RTX); } @@ -9471,15 +9478,23 @@ ix86_select_alt_pic_regnum (void) static bool ix86_save_reg (unsigned int regno, bool maybe_eh_return) { - if (pic_offset_table_rtx - !ix86_use_pseudo_pic_reg () - regno == REAL_PIC_OFFSET_TABLE_REGNUM - (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM) - || crtl-profile - || crtl-calls_eh_return - || crtl-uses_const_pool - || cfun-has_nonlocal_label)) -return ix86_select_alt_pic_regnum () == INVALID_REGNUM; + if (regno == REAL_PIC_OFFSET_TABLE_REGNUM + pic_offset_table_rtx) +{ + if (ix86_use_pseudo_pic_reg ()) + { + /* REAL_PIC_OFFSET_TABLE_REGNUM used by call to + _mcount in prologue. 
*/ + if (!TARGET_64BIT flag_pic crtl-profile) + return true; + } + else if (df_regs_ever_live_p (REAL_PIC_OFFSET_TABLE_REGNUM) + || crtl-profile + || crtl-calls_eh_return + || crtl-uses_const_pool + || cfun-has_nonlocal_label) +return ix86_select_alt_pic_regnum () == INVALID_REGNUM; +} if (crtl-calls_eh_return maybe_eh_return) { @@ -10818,6 +10833,29 @@ ix86_finalize_stack_realign_flags (void) crtl-stack_realign_finalized = true; } +/* Delete SET_GOT right after entry block if it is allocated to reg. */ + +static void +ix86_elim_entry_set_got (rtx reg) +{ + basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)-next_bb; + rtx_insn *c_insn = BB_HEAD (bb); + if (!NONDEBUG_INSN_P (c_insn)) +c_insn = next_nonnote_nondebug_insn (c_insn); + if (c_insn NONJUMP_INSN_P (c_insn)) +{ + rtx pat = PATTERN (c_insn); + if (GET_CODE (pat) == PARALLEL) + { + rtx vec = XVECEXP (pat, 0, 0); + if (GET_CODE (vec) == SET + XINT (XEXP (vec, 1), 1) == UNSPEC_SET_GOT + REGNO (XEXP (vec, 0)) == REGNO (reg)) + delete_insn (c_insn); + } +} +} + /* Expand the prologue into a bunch of separate insns. */ void @@ -11271,6 +11309,20 @@ ix86_expand_prologue (void) if (!sse_registers_saved) ix86_emit_save_sse_regs_using_mov (frame.sse_reg_save_offset); + /* For the mcount profiling on 32 bit PIC mode we need to emit SET_GOT + in PROLOGUE. */ + if (!TARGET_64BIT pic_offset_table_rtx crtl-profile !flag_fentry) +{ + rtx pic = gen_rtx_REG (Pmode, REAL_PIC_OFFSET_TABLE_REGNUM); + insn = emit_insn (gen_set_got (pic)); + RTX_FRAME_RELATED_P (insn) = 1; + add_reg_note (insn, REG_CFA_FLUSH_QUEUE, NULL_RTX); + emit_insn (gen_prologue_use (pic)); + /* Deleting already emmitted SET_GOT if exist and allocated to +REAL_PIC_OFFSET_TABLE_REGNUM. 
*/ + ix86_elim_entry_set_got (pic); +} + if (crtl-drap_reg !crtl-stack_realign_needed) { /* vDRAP is setup but after reload it turns out stack realign diff --git a/gcc/testsuite/gcc.target/i386/mcount_pic.c b/gcc/testsuite/gcc.target/i386/mcount_pic.c new file mode 100644 index 000..6132cdf --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mcount_pic.c @@ -0,0 +1,15 @@ +/* PR target/63534 */ +/* Check correct mcount generation. */ +/* { dg-do run } */ +/* { dg-require-effective-target fpic } */ +/* { dg-require-effective-target ia32 } */ +/* { dg-options -O2 -fpic -p -save-temps } */ + +int main () +{ + return 0; +} + +/* { dg-final { scan-assembler mcount } } */ +/* { dg-final { scan-assembler get_pc_thunk } } */ +/* { dg-final { cleanup-saved-temps } } */ On Tue, Oct 28, 2014 at 9:19 PM,
[PATCH][match-and-simplify] Allow SCCVN to follow SSA use-def chains
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-10-29 Richard Biener rguent...@suse.de

* tree-ssa-sccvn.c (try_to_simplify): Allow gimple_fold_stmt_to_constant_1 to follow SSA use-def chains.
(visit_use): Likewise.

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c (revision 216798)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -3113,7 +3113,7 @@ try_to_simplify (gimple stmt)
      transparently support materializing temporary SSA names
      created by gimple_simplify - or we never value-number to them.  */
-  tem = gimple_fold_stmt_to_constant_1 (stmt, vn_valueize);
+  tem = gimple_fold_stmt_to_constant_1 (stmt, vn_valueize, vn_valueize);
   if (tem
       && (TREE_CODE (tem) == SSA_NAME
           || is_gimple_min_invariant (tem)))
@@ -3274,6 +3274,7 @@ visit_use (tree use)
         {
           /* Try constant folding based on our current lattice.  */
           tree simplified = gimple_fold_stmt_to_constant_1 (stmt,
+                                                            vn_valueize,
                                                             vn_valueize);
           if (simplified)
             {
Re: [PATCH, x86, 63534] Fix '-p' profile for 32 bit PIC mode
On Wed, Oct 29, 2014 at 12:21:15PM +0300, Evgeny Stupachenko wrote: Patch with the fixes: Bootstrap, gcc make check and spec2000 with -p passed. 2014-10-29 Evgeny Stupachenko evstu...@gmail.com gcc/testsuite PR target/63534 * gcc.target/i386/mcount_pic.c: New. gcc/ PR target/63534 * config/i386/i386.c (ix86_init_pic_reg): Emit SET_GOT to REAL_PIC_OFFSET_TABLE_REGNUM for mcount profiling. (ix86_save_reg): Save REAL_PIC_OFFSET_TABLE_REGNUM when profiling using mcount in 32bit PIC mode. (ix86_elim_entry_set_got): New. (ix86_expand_prologue): For the mcount profiling emit new SET_GOT in PROLOGUE, delete initial if possible. Ok, thanks. Jakub
[PATCH]Partially fix PR61529, bound basic block frequency
Hi all,

This is a simple patch to fix the ICE in comment 2 of PR61529: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61529

Bound checking code is added to make sure the frequency is within the legal range.

As far as I have observed, the r215830 patch fixes the glibc building ICE, and this patch should fix the ICE while building the sample code in comment 2 using an aarch64-none-elf toolchain. With this, all the ICEs reported in this bug ticket should be fixed.

x86_64-unknown-linux-gnu bootstrap and regression test have been done, no new issue. The aarch64-none-elf toolchain has been tested on the model. No new regression.

Is this Okay for trunk?

gcc/ChangeLog:

2014-10-29 Renlin Li renlin...@arm.com

PR middle-end/61529
* tree-ssa-threadupdate.c (compute_path_counts): Bound path_in_freq.

commit c44195cb52ec8ac6386b2b7afe467b680422fb2e
Author: Renlin Li renlin...@arm.com
Date: Tue Oct 28 16:30:42 2014 +

    fix pr61529

    Change-Id: Ie5e58510f21a4d7a609306006270c3168ab48d06

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index d2cf4de..e3077a1 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -730,6 +730,10 @@ compute_path_counts (struct redirection_data *rd,
           nonpath_count += ein->count;
         }
     }
+
+  if (path_in_freq > BB_FREQ_MAX)
+    path_in_freq = BB_FREQ_MAX;
+
   BITMAP_FREE (in_edge_srcs);

   /* Now compute the fraction of the total count coming into the first
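The bound added by this patch amounts to a saturating sum. A minimal sketch in standalone C follows; `sum_and_bound_freq` and `SKETCH_BB_FREQ_MAX` are names made up for the illustration, and the value 10000 is an assumption standing in for GCC's BB_FREQ_MAX.

```c
/* Assumed value for illustration only; the real limit is GCC's
   BB_FREQ_MAX.  */
#define SKETCH_BB_FREQ_MAX 10000

/* Hypothetical helper: accumulate the frequencies of the incoming edges
   on a path and clamp the total, so that later scaling code never sees a
   frequency above the legal maximum.  */
static int
sum_and_bound_freq (const int *edge_freqs, int n)
{
  int path_in_freq = 0;

  for (int i = 0; i < n; i++)
    path_in_freq += edge_freqs[i];

  /* Same shape as the check added to compute_path_counts.  */
  if (path_in_freq > SKETCH_BB_FREQ_MAX)
    path_in_freq = SKETCH_BB_FREQ_MAX;

  return path_in_freq;
}
```

Two incoming edges with frequencies 6000 and 7000 sum to 13000, which the helper clamps to the assumed maximum of 10000.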
RE: [PATCH] Fix up sign extension in bswap
From: Jakub Jelinek [mailto:ja...@redhat.com] Sent: Tuesday, October 28, 2014 12:27 PM Thomas, you know the code better, can you from the fix figure out a testcase that current trunk miscompiles or doesn't optimize because of this bug? Here you are (see attachment). Best regards, Thomas
Re: [PATCH, IPA ICF] Fix PR63664, PR63574 (segfault in ipa-icf pass)
On Tue, Oct 28, 2014 at 5:14 PM, Ilya Enkovich enkovich@gmail.com wrote:

Hi,

This patch fixes PR63664 and PR63574. The problem is NULL types for labels not being handled by ICF properly. I assume it is OK for labels to have a NULL type, so I added a check to ICF rather than fixing label generation.

Bootstrapped and checked on linux-x86_64. OK for trunk?

Instead, it shouldn't be called for labels.

Richard.

Thanks,
Ilya
--
gcc/

2014-10-28 Ilya Enkovich ilya.enkov...@intel.com

PR ipa/63664
PR bootstrap/63574
* ipa-icf-gimple.c (func_checker::compatible_types_p): Allow NULL args.

gcc/testsuite/

2014-10-28 Ilya Enkovich ilya.enkov...@intel.com

PR ipa/63664
* gcc.dg/ipa/pr63664.C: New.

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 1369b74..afc0eeb 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -169,6 +169,11 @@ bool func_checker::compatible_types_p (tree t1, tree t2,
                                        bool compare_polymorphic,
                                        bool first_argument)
 {
+  if (!t1 && !t2)
+    return true;
+  else if (!t1 || !t2)
+    return false;
+
   if (TREE_CODE (t1) != TREE_CODE (t2))
     return return_false_with_msg ("different tree types");

diff --git a/gcc/testsuite/gcc.dg/ipa/pr63664.C b/gcc/testsuite/gcc.dg/ipa/pr63664.C
new file mode 100644
index 000..31d96d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr63664.C
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+class test {
+ public:
+  test (int val, int *p)
+    {
+      int_val = *p;
+      bool_val = (val != int_val);
+    }
+
+  ~test ()
+    {
+      if (!bool_val)
+        return;
+    }
+
+  int get_int_val () const { return int_val; }
+
+ private:
+  bool bool_val;
+  int int_val;
+};
+
+static int __attribute__ ((noinline))
+f1 (int i, int *p)
+{
+  test obj (i, p);
+  return obj.get_int_val ();
+}
+
+static int __attribute__ ((noinline))
+f2 (int i, int *p)
+{
+  test obj (i, p);
+  return obj.get_int_val ();
+}
+
+int
+f (int i, int *p)
+{
+  return f1 (i, p) + f2 (i, p);
+}
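The NULL handling proposed in the patch is the usual "both absent is equal, one absent is different" guard in front of a comparison that dereferences its arguments. A minimal sketch in plain C, where `struct sketch_type` and `sketch_compatible_types_p` are hypothetical stand-ins for GCC's `tree` and `func_checker::compatible_types_p`:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GCC type node; only the tree code matters
   for this illustration.  */
struct sketch_type
{
  int code;
};

/* Two missing types are treated as compatible, exactly one missing type
   as incompatible.  Only when both are present is the node itself
   inspected, so the comparison never dereferences NULL.  */
static bool
sketch_compatible_types_p (const struct sketch_type *t1,
                           const struct sketch_type *t2)
{
  if (!t1 && !t2)
    return true;
  else if (!t1 || !t2)
    return false;

  return t1->code == t2->code;
}
```

Richard's objection in the thread is about where the guard lives, not its shape: his suggestion is that the caller should not ask for a type comparison on labels at all.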
RE: [PATCH] Fix up sign extension in bswap
Bummer. Why didn't my MUA warn me on this one? Here you are.

Best regards,

Thomas

-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
Sent: Wednesday, October 29, 2014 9:33 AM
To: 'Jakub Jelinek'; Richard Biener
Cc: GCC Patches
Subject: RE: [PATCH] Fix up sign extension in bswap

From: Jakub Jelinek [mailto:ja...@redhat.com]
Sent: Tuesday, October 28, 2014 12:27 PM

Thomas, you know the code better, can you from the fix figure out a testcase that current trunk miscompiles or doesn't optimize because of this bug?

Here you are (see attachment).

Best regards,

Thomas

marker_not_cast_testcase.1.0.diff
Description: Binary data
Re: [PATCH] Add memory barriers to xbegin/xend/xabort
On Wed, Oct 29, 2014 at 4:31 AM, Andi Kleen a...@firstfloor.org wrote: From: Andi Kleen a...@linux.intel.com xbegin/xend/xabort were missing memory barriers. This can lead to memory operations being moved out of transactions, which would cause unexpected races. Always generate implicit memory barriers for these intrinsics. The compat header versions always generated memory barriers, so this also improves compatibility. Passes test suite. Ok for release branches? Hmm, can't the insns themselves properly clobber/use memory? I suppose they are UNSPEC_VOLATILE anyway, right? Richard. gcc/: 2014-10-28 Andi Kleen a...@linux.intel.com PR target/63672 * config/i386/i386.c (ix86_expand_builtin): Generate memory barrier after abort. * config/i386/i386.md (xbegin): Add memory barrier. (xend): Rename to ... (xend_1): New. Generate memory barrier and emit xend. --- gcc/config/i386/i386.c | 1 + gcc/config/i386/i386.md | 18 +- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ec3e056..ec0df40 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -36413,6 +36413,7 @@ addcarryx: return const0_rtx; } emit_insn (gen_xabort (op0)); + emit_insn (gen_memory_blockage ()); return 0; default: diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 7ba07c3..3544e60 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -18530,6 +18530,9 @@ emit_move_insn (operands[0], ax_reg); + operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode)); + MEM_VOLATILE_P (operands[0]) = 1; + DONE; }) @@ -18546,13 +18549,26 @@ [(set_attr type other) (set_attr length 6)]) -(define_insn xend +(define_insn xend_1 [(unspec_volatile [(const_int 0)] UNSPECV_XEND)] TARGET_RTM xend [(set_attr type other) (set_attr length 3)]) +(define_expand xend + [(set (match_dup 0) + (unspec:BLK [(const_int 0)] UNSPECV_XEND))] /* or match_dup 0 ? 
*/ + TARGET_RTM +{ + emit_insn (gen_xend_1 ()); + + operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode)); + MEM_VOLATILE_P (operands[0]) = 1; + + DONE; +}) + (define_insn xabort [(unspec_volatile [(match_operand:SI 0 const_0_to_255_operand n)] UNSPECV_XABORT)] -- 2.1.1
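For readers unfamiliar with what a memory blockage does at the source level, here is a minimal sketch using a GNU C compiler barrier. It does not execute any RTM instruction; `sketch_memory_blockage` and `update_inside_region` are hypothetical names, and the empty asm with a "memory" clobber is only an analogue of the barrier the patch emits around xbegin, xend and xabort.

```c
/* GNU C compiler-level barrier: the empty asm with a "memory" clobber
   tells the compiler it may not move or cache memory accesses across
   this point.  It emits no instruction.  */
#define sketch_memory_blockage() __asm__ __volatile__ ("" : : : "memory")

static int shared_counter;

/* Hypothetical critical region.  The barriers stand in for the ones the
   patch attaches to the transaction intrinsics, so the update of
   shared_counter cannot be hoisted above the start of the region or
   sunk below its end by the compiler.  */
static int
update_inside_region (int delta)
{
  sketch_memory_blockage ();    /* where the transaction would begin */
  shared_counter += delta;
  sketch_memory_blockage ();    /* where the transaction would end */
  return shared_counter;
}
```

Richard's question in the thread is whether the same effect could come from the instruction patterns themselves clobbering memory, rather than from a separate blockage insn.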
Re: fix math wrt volatile-bitfields vs C++ model
On Wed, Oct 29, 2014 at 6:24 AM, DJ Delorie d...@redhat.com wrote:

Looks ok to me, but can you add a testcase please? Also check if 4.9 is affected.

Sorry for the delay, this finally made it back to the top of my to-do list. Testcase included which fails without and passes with this patch. 4.9 is affected and the same patch fixes it. Tested on rx-elf, x86 32/64, and arm32.

Ok. For the branch please wait until after 4.9.2 is out.

Thanks,
Richard.

2014-10-29 DJ Delorie d...@redhat.com

* expmed.c (strict_volatile_bitfield_p): Fix off-by-one error.

2014-10-29 DJ Delorie d...@redhat.com

* gcc.dg/20141029-1.c: New.

Index: expmed.c
===================================================================
--- expmed.c (revision 216811)
+++ expmed.c (working copy)
@@ -454,13 +454,13 @@ strict_volatile_bitfield_p (rtx op0, uns
          && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
     return false;

   /* Check for cases where the C++ memory model applies.  */
   if (bitregion_end != 0
       && (bitnum - bitnum % modesize < bitregion_start
-          || bitnum - bitnum % modesize + modesize > bitregion_end))
+          || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
     return false;

   return true;
 }

 /* Return true if OP is a memory and if a bitfield of size BITSIZE at
Index: testsuite/gcc.dg/20141029-1.c
===================================================================
--- testsuite/gcc.dg/20141029-1.c (revision 0)
+++ testsuite/gcc.dg/20141029-1.c (revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-fstrict-volatile-bitfields -fdump-rtl-final" } */
+
+#define PERIPH (*(volatile struct system_periph *)0x81234)
+
+struct system_periph {
+  union {
+    unsigned short WORD;
+    struct {
+      unsigned short a:1;
+      unsigned short b:1;
+      unsigned short  :5;
+      unsigned short c:1;
+      unsigned short  :8;
+    } BIT;
+  } ALL;
+};
+
+void
+foo()
+{
+  while (1)
+    {
+      PERIPH.ALL.BIT.a = 1;
+    }
+}
+/* { dg-final { scan-rtl-dump-times "mem/v(/.)*:HI" 4 "final" } } */
+/* { dg-final { cleanup-rtl-dump "final" } } */
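The off-by-one is easier to see with the window bounds written out. The sketch below is standalone C with a hypothetical helper, `window_within_region`; it assumes, as the `- 1` in the fix implies, that the region end is the position of the last bit in the region (inclusive).

```c
#include <stdbool.h>

/* Hypothetical helper: does the MODESIZE-bit aligned window that contains
   bit BITNUM lie entirely inside the region [REGION_START, REGION_END]?
   REGION_END is assumed to be inclusive.  */
static bool
window_within_region (unsigned long bitnum, unsigned long modesize,
                      unsigned long region_start, unsigned long region_end)
{
  unsigned long first = bitnum - bitnum % modesize;
  /* The last bit of the window is first + modesize - 1.  Comparing
     first + modesize against an inclusive end rejects windows that fit
     exactly, which is the error the patch fixes.  */
  unsigned long last = first + modesize - 1;

  return first >= region_start && last <= region_end;
}
```

For a 16-bit field region covering bits 0 to 15, the window for bit 0 has first = 0 and last = 15 and is accepted; the pre-patch comparison would have tested 16 > 15 and rejected it.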
Re: [PATCH] Fix PR63665
On Wed, Oct 29, 2014 at 10:04 AM, Eric Botcazou ebotca...@adacore.com wrote:

2014-10-28 Richard Biener rguent...@suse.de

PR tree-optimization/63665
* tree-vect-slp.c (vect_get_mask_element): Properly handle accessing out-of-bound elements.

Does it fix the assertion failure on the attached testcase? If so, would you mind committing the testcase with the patch?

Sorry - I saw this mail too late. The assert already triggers various existing tests in gcc.dg/vect on x86_64-linux so I didn't add a new one. Feel free to add the test if you think more Ada coverage is warranted here.

Thanks,
Richard.

* gnat.dg/opt42.ad[sb]: New test.

--
Eric Botcazou
Re: [PATCH] Fix up sign extension in bswap
On Wed, Oct 29, 2014 at 09:36:02AM -, Thomas Preud'homme wrote:

Bummer. Why didn't my MUA warn me on this one?

I think this is ok for trunk with proper ChangeLog entry.

Jakub
Re: RFA: Remove redundant enum from enum machine_mode
On Wed, Oct 29, 2014 at 10:20 AM, Richard Sandiford richard.sandif...@arm.com wrote: In https://gcc.gnu.org/ml/gcc/2014-10/msg00206.html I asked: I have some plans to clean up the machine_mode handling and perhaps make it hierarchical, so that functions that can only handle scalar integer modes (say) will be able to take a scalar_int_mode rather than a machine_mode as argument. The first step would be to do a blanket removal of the (in C++) redundant enum from all those enum machine_mode variables, parameters and fields. Regardless of whether the hierarchy sounds like a good idea, would removing the enum be OK? There's never a good time for that much churn, but since the release branches are fairly mature and since we already have quite a bit of churn in rtl land between 4.9 and 5, now seemed like as good a time as any. No-one objected (or said it was a good idea :-)) so this patch makes the change. The hand-written part is very simple. The automatic part is to run this hacky script: #!/bin/bash rfind gcc -name '*.h' -o -name '*.c' -o -name '*.md' -o -name '*.texi*' \ -o -name '*.awk' -o -name '*.def' | grep -v testsuite/ | xargs sed -i 's/enum machine_mode/machine_mode/g' sed -i 's/machine_mode\\n{);/enum machine_mode\\n{);/' gcc/genmodes.c rfind gcc -name '*.texi*' | xargs sed -i -e 's/ {machine_mode} / machine_mode /g' \ -e 's/enum @var{machine_mode}/machine_mode/' \ -e '/TARGET_SPILL_CLASS/s/machine_mode)/@var{machine_mode})/' touch gcc/doc/tm.texi I've attached the result for reference. The special handling of TARGET_SPILL_CLASS is due to a bug in the way genhooks handles target.def entries that have no argument names. It affects other hooks too, e.g.: @deftypefn {Target Hook} int TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN (struct cgraph_node *@var{}, struct cgraph_simd_clone *@var{}, @var{tree}, @var{int}) where tree and int shouldn't be @var{...}s. I'd like to deal with that separately. Tested on x86_64-linux-gnu. OK to install? Ok. 
(not sure if we have headers that are still used by programs compiled with a C compiler in some weird setups) Thanks, Richard. Thanks, Richard gcc/ * gengtype.c (main): Treat machine_mode as a scalar typedef. * genmodes.c (emit_insn_modes_h): Hide inline functions if USED_FOR_TARGET. Index: gcc/gengtype.c === --- gcc/gengtype.c (revision 216806) +++ gcc/gengtype.c (working copy) @@ -5650,6 +5650,7 @@ POS_HERE (do_scalar_typedef (jword, pos)); POS_HERE (do_scalar_typedef (JCF_u2, pos)); POS_HERE (do_scalar_typedef (void, pos)); + POS_HERE (do_scalar_typedef (machine_mode, pos)); POS_HERE (do_typedef (PTR, create_pointer (resolve_typedef (void, pos)), pos)); Index: gcc/genmodes.c === --- gcc/genmodes.c (revision 216806) +++ gcc/genmodes.c (working copy) @@ -1108,7 +1108,7 @@ printf (#define NUM_INT_N_ENTS %d\n, n_int_n_ents); - puts (\n#if GCC_VERSION = 4001\n); + puts (\n#if !defined (USED_FOR_TARGET) GCC_VERSION = 4001\n); emit_mode_size_inline (); emit_mode_nunits_inline (); emit_mode_inner_inline (); gcc/ada/ * gcc-interface/decl.c, gcc-interface/gigi.h, gcc-interface/misc.c, gcc-interface/trans.c, gcc-interface/utils.c, gcc-interface/utils2.c: Remove redundant enum from machine_mode. c-family/ * c-common.c, c-common.h, c-cppbuiltin.c, c-lex.c: Remove redundant enum from machine_mode. c/ * c-decl.c, c-tree.h, c-typeck.c: Remove redundant enum from machine_mode. cp/ * constexpr.c: Remove redundant enum from machine_mode. fortran/ * trans-types.c, trans-types.h: Remove redundant enum from machine_mode. go/ * go-lang.c: Remove redundant enum from machine_mode. java/ * builtins.c, java-tree.h, typeck.c: Remove redundant enum from machine_mode. lto/ * lto-lang.c: Remove redundant enum from machine_mode. 
gcc/ * addresses.h, alias.c, asan.c, auto-inc-dec.c, bt-load.c, builtins.c, builtins.h, caller-save.c, calls.c, calls.h, cfgexpand.c, cfgloop.h, cfgrtl.c, combine.c, compare-elim.c, config/aarch64/aarch64-builtins.c, config/aarch64/aarch64-protos.h, config/aarch64/aarch64-simd.md, config/aarch64/aarch64.c, config/aarch64/aarch64.h, config/aarch64/aarch64.md, config/alpha/alpha-protos.h, config/alpha/alpha.c, config/arc/arc-protos.h, config/arc/arc.c, config/arc/arc.h,
RE: [PATCH, C++] Fix PR63366: __complex not equivalent to __complex double in C++
From: Nathan Sidwell [mailto:nat...@codesourcery.com]
Sent: Thursday, October 09, 2014 2:30 PM

On 10/09/14 09:25, Jason Merrill wrote:

I would think we want to handle this up in the existing defaulted_int block:

my thought was to at least put it next to the explicit_int = -1 above. It seems more sensible to keep it in this block as the existing defaulted_int block is for types for which it is not an error to omit the int type specifier.

Here is an updated patch which moves the statement as requested by Nathan:

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d26a432..f382e27 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -9187,6 +9187,7 @@ grokdeclarator (const cp_declarator *declarator,
       int is_main;

       explicit_int = -1;
+      defaulted_int = 1;

       /* We handle `main' specially here, because 'main () { }' is
          so common.  With no options, it is allowed.  With -Wreturn-type,
diff --git a/gcc/testsuite/g++.dg/torture/pr63366.C b/gcc/testsuite/g++.dg/torture/pr63366.C
new file mode 100644
index 000..af59b98
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr63366.C
@@ -0,0 +1,11 @@
+// { dg-do run }
+// { dg-options "-fpermissive" }
+// { dg-prune-output "ISO C\\+\\+ forbids declaration of 'type name' with no type" }
+
+#include <typeinfo>
+
+int
+main (void)
+{
+  return typeid (__complex) != typeid (__complex double);
+}

ChangeLog unchanged. Ok for trunk?

Best regards,

Thomas
[PATCH][AArch64] Restore recog state after finding pre-madd instruction
Hi all,

This patch fixes an issue with the final_prescan workaround for the Cortex-A53 erratum 835769 where calling recog_memoized could modify the recog data for the multiply-accumulate instruction when looking at a preceding asm block. This can lead to wrong code generation.

The solution is to call extract_constrain_insn_cached to restore the recog data before exiting aarch64_madd_needs_nop.

Bootstrapped and tested on aarch64-none-linux-gnu. A compile testcase is added demonstrating the issue.

Ok for trunk?

Thanks,
Kyrill

2014-10-28 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (aarch64_madd_needs_nop): Restore recog state after aarch64_prev_real_insn call.

2014-10-28 Kyrylo Tkachov kyrylo.tkac...@arm.com

* gcc.target/aarch64/madd_after_asm_1.c: New test.

commit e11cd63678d7c334d2f9f124e197695ea470025c
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date: Mon Oct 27 13:29:25 2014 +

    [AArch64] Restore recog state after finding pre-madd instruction

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cbbc482..002cb44 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7769,6 +7769,10 @@ aarch64_madd_needs_nop (rtx_insn* insn)
     return false;

   prev = aarch64_prev_real_insn (insn);
+  /* aarch64_prev_real_insn can call recog_memoized on insns other than INSN.
+     Restore recog state to INSN to avoid state corruption.  */
+  extract_constrain_insn_cached (insn);
+
   if (!prev || !has_memory_op (prev))
     return false;

diff --git a/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c b/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c
new file mode 100644
index 000..523941d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c
@@ -0,0 +1,14 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mfix-cortex-a53-835769" } */
+
+int
+test (int a, double b, int c, int d, int e)
+{
+  double result;
+  __asm__ __volatile ("// %0, %1"
+                      : "=w" (result)
+                      : "0" (b)
+                      : /* No clobbers */
+                      );
+  return c * d + e;
+}
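The failure mode described here, a helper that silently repoints a global cache at a different instruction, can be shown with a toy model in plain C. Nothing below is GCC code: the `sketch_*` names are hypothetical, instructions are plain integers, and the one-entry cache stands in for the recog data.

```c
#include <stdbool.h>

/* Toy stand-in for GCC's global recog data: a one-entry cache that
   remembers which "insn" was decoded last.  */
static int sketch_cached_id = -1;
static int sketch_cached_operand;

/* Decode insn ID into the cache unless it is already there.  */
static void
sketch_extract_cached (int id)
{
  if (sketch_cached_id != id)
    {
      sketch_cached_id = id;
      sketch_cached_operand = id * 10;
    }
}

/* Inspect the previous insn.  As a side effect the cache now describes
   that insn rather than the one the caller was working on.  */
static bool
sketch_prev_has_memory_op (int id)
{
  sketch_extract_cached (id - 1);
  return sketch_cached_operand % 20 == 0;
}

/* Same shape as the fix: after looking at the neighbour, re-extract the
   original insn so later consumers of the cache see the right data.  */
static bool
sketch_needs_nop (int id)
{
  sketch_extract_cached (id);
  bool result = sketch_prev_has_memory_op (id);
  sketch_extract_cached (id);   /* restore the cache to ID */
  return result;
}
```

Without the final call, the cache would still describe insn 4 after `sketch_needs_nop (5)` returns, which is the analogue of the stale recog data that led to wrong code.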
[PATCH][AArch64][4.8] Restore recog state after finding pre-madd instruction
Hi all,

This is the 4.8 backport of the trunk patch (https://gcc.gnu.org/ml/gcc-patches/2014-10/msg03019.html). Tested similarly.

Ok for that branch?

Thanks,
Kyrill

2014-10-28 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (aarch64_madd_needs_nop): Restore recog state after aarch64_prev_real_insn call.

2014-10-28 Kyrylo Tkachov kyrylo.tkac...@arm.com

* gcc.target/aarch64/madd_after_asm_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d756763..c262792 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6148,6 +6148,10 @@ aarch64_madd_needs_nop (rtx insn)
     return false;

   prev = aarch64_prev_real_insn (insn);
+  /* aarch64_prev_real_insn can call recog_memoized on insns other than INSN.
+     Restore recog state to INSN to avoid state corruption.  */
+  extract_constrain_insn_cached (insn);
+
   if (!prev || !has_memory_op (prev))
     return false;

diff --git a/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c b/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c
new file mode 100644
index 000..523941d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c
@@ -0,0 +1,14 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mfix-cortex-a53-835769" } */
+
+int
+test (int a, double b, int c, int d, int e)
+{
+  double result;
+  __asm__ __volatile ("// %0, %1"
+                      : "=w" (result)
+                      : "0" (b)
+                      : /* No clobbers */
+                      );
+  return c * d + e;
+}
[PATCH][AArch64][4.9] Restore recog state after finding pre-madd instruction
Hi all,

This is the backport of the trunk patch posted at https://gcc.gnu.org/ml/gcc-patches/2014-10/msg03019.html. It is essentially the same content (only the diff context differs).

Jakub, this is a regression fix so, if ok'd, can we get this into 4.9.2 please?

Bootstrapped and regtested on aarch64-none-linux-gnu.

Thanks,
Kyrill

2014-10-28 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (aarch64_madd_needs_nop): Restore recog state after aarch64_prev_real_insn call.

2014-10-28 Kyrylo Tkachov kyrylo.tkac...@arm.com

* gcc.target/aarch64/madd_after_asm_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 34354d4..52c0471 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6557,6 +6557,10 @@ aarch64_madd_needs_nop (rtx insn)
     return false;

   prev = aarch64_prev_real_insn (insn);
+  /* aarch64_prev_real_insn can call recog_memoized on insns other than INSN.
+     Restore recog state to INSN to avoid state corruption.  */
+  extract_constrain_insn_cached (insn);
+
   if (!prev)
     return false;

diff --git a/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c b/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c
new file mode 100644
index 000..523941d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/madd_after_asm_1.c
@@ -0,0 +1,14 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mfix-cortex-a53-835769" } */
+
+int
+test (int a, double b, int c, int d, int e)
+{
+  double result;
+  __asm__ __volatile ("// %0, %1"
+                      : "=w" (result)
+                      : "0" (b)
+                      : /* No clobbers */
+                      );
+  return c * d + e;
+}
Re: [PATCH][ARM] Optimize copysign/copysignf for soft-float using BFI
On 26/08/14 13:36, Richard Earnshaw wrote:
> On 29/07/14 15:49, Jiong Wang wrote:
> > test done
> > ===
> > no regression on the full toolchain test on arm-none-eabi.
> >
> > ok to install?
>
> Hmm, I think this is wrong for DF mode.
>
> The principle the patch works on is by tying the output to the value
> containing the sign bit, and then copying the rest of the other value
> into that value. However, for DF mode it only copies 31 of the 63 bits
> needed; the least significant 32 bits of the mantissa are not copied over.
>
> R.

updated the patch. fixed the DF mode bug.

no regression on arm-none-eabi multi-lib test.

ok to trunk?

gcc/
    * config/arm/arm.md (copysignsf3): New define_expand for SImode.
    (copysigndf3): New define_expand for DImode.

gcc/testsuite/
    * gcc.target/arm/copysign_softfloat_1.c: New copysign/copysignf
    testcase for soft-float.

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index cd9ab6c..2a7dc11 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11131,6 +11131,47 @@
   [(set_attr "predicable" "yes")]
 )

+(define_expand "copysignsf3"
+  [(match_operand:SF 0 "register_operand")
+   (match_operand:SF 1 "register_operand")
+   (match_operand:SF 2 "register_operand")]
+  "TARGET_SOFT_FLOAT && arm_arch_thumb2"
+  {
+    emit_move_insn (operands[0], operands[2]);
+    emit_insn (gen_insv_t2 (simplify_gen_subreg (SImode, operands[0], SFmode, 0),
+                            GEN_INT (31), GEN_INT (0),
+                            simplify_gen_subreg (SImode, operands[1], SFmode, 0)));
+    DONE;
+  }
+)
+
+(define_expand "copysigndf3"
+  [(match_operand:DF 0 "register_operand")
+   (match_operand:DF 1 "register_operand")
+   (match_operand:DF 2 "register_operand")]
+  "TARGET_SOFT_FLOAT && arm_arch_thumb2"
+  {
+    rtx op0_low = gen_lowpart (SImode, operands[0]);
+    rtx op0_high = gen_highpart (SImode, operands[0]);
+    rtx op1_low = gen_lowpart (SImode, operands[1]);
+    rtx op1_high = gen_highpart (SImode, operands[1]);
+    rtx op2_high = gen_highpart (SImode, operands[2]);
+
+    rtx scratch1 = gen_reg_rtx (SImode);
+    rtx scratch2 = gen_reg_rtx (SImode);
+    emit_move_insn (scratch1, op2_high);
+    emit_move_insn (scratch2, op1_high);
+
+    emit_insn(gen_rtx_SET(SImode, scratch1,
+                          gen_rtx_LSHIFTRT (SImode, op2_high, GEN_INT(31))));
+    emit_insn(gen_insv_t2(scratch2, GEN_INT(1), GEN_INT(31), scratch1));
+    emit_move_insn (op0_low, op1_low);
+    emit_move_insn (op0_high, scratch2);
+
+    DONE;
+  }
+)
+
 ;; Vector bits common to IWMMXT and Neon
 (include "vec-common.md")
 ;; Load the Intel Wireless Multimedia Extension patterns
diff --git a/gcc/testsuite/gcc.target/arm/copysign_softfloat_1.c b/gcc/testsuite/gcc.target/arm/copysign_softfloat_1.c
new file mode 100644
index 000..990c46e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/copysign_softfloat_1.c
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-skip-if "skip override" { *-*-* } { "-mfloat-abi=softfp" "-mfloat-abi=hard" } { "" } } */
+/* { dg-options "-O2 -mfloat-abi=soft --save-temps" } */
+extern void abort (void);
+
+#define N 16
+
+float a_f[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
+                -12.5f, -15.6f, -18.7f, -21.8f,
+                24.9f, 27.1f, 30.2f, 33.3f,
+                36.4f, 39.5f, 42.6f, 45.7f};
+
+float b_f[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
+                -9.0f, 1.0f, -2.0f, 3.0f,
+                -4.0f, -5.0f, 6.0f, 7.0f,
+                -8.0f, -9.0f, 10.0f, 11.0f};
+
+float c_f[N] = {-0.1f, 3.2f, -6.3f, 9.4f,
+                -12.5f, 15.6f, -18.7f, 21.8f,
+                -24.9f, -27.1f, 30.2f, 33.3f,
+                -36.4f, -39.5f, 42.6f, 45.7f};
+
+double a_d[N] = {-0.1, -3.2, -6.3, -9.4,
+                 -12.5, -15.6, -18.7, -21.8,
+                 24.9, 27.1, 30.2, 33.3,
+                 36.4, 39.5, 42.6, 45.7};
+
+double b_d[N] = {-1.2, 3.4, -5.6, 7.8,
+                 -9.0, 1.0, -2.0, 3.0,
+                 -4.0, -5.0, 6.0, 7.0,
+                 -8.0, -9.0, 10.0, 11.0};
+
+double c_d[N] = {-0.1, 3.2, -6.3, 9.4,
+                 -12.5, 15.6, -18.7, 21.8,
+                 -24.9, -27.1, 30.2, 33.3,
+                 -36.4, -39.5, 42.6, 45.7};
+
+int
+main (int argc, char **argv)
+{
+  int index = 0;
+
+/* { dg-final { scan-assembler-times "bfi" 2 } } */
+/* { dg-final { scan-assembler-times "lsr" 1 } } */
+  for (index; index < N; index++)
+    {
+      if (__builtin_copysignf (a_f[index], b_f[index]) != c_f[index])
+        abort();
+    }
+
+  for (index = 0; index < N; index++)
+    {
+      if (__builtin_copysign (a_d[index], b_d[index]) != c_d[index])
+        abort();
+    }
+
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
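
As a rough illustration of what the two expanders above compute, here is a plain C sketch of the same bit manipulation. It is not the code GCC generates; the function names soft_copysignf and soft_copysign are made up for this example, and it assumes float and double are IEEE-754 binary32 and binary64. The double version shows the point raised in review: the low 32 bits of the magnitude operand have to be carried over as well as the high word.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: sign of Y, magnitude of X, 32-bit float.
   Mirrors "start from Y, then insert the low 31 bits of X"
   (a bit-field insert with lsb 0, width 31).  */
static float
soft_copysignf (float x, float y)
{
  uint32_t xb, yb;
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  yb = (yb & 0x80000000u) | (xb & 0x7fffffffu);
  memcpy (&x, &yb, sizeof x);
  return x;
}

/* Hypothetical helper: sign of Y, magnitude of X, 64-bit double,
   done on 32-bit halves as a 32-bit soft-float target would.  */
static double
soft_copysign (double x, double y)
{
  uint64_t xb, yb, rb;
  uint32_t x_low, x_high, y_high, sign;

  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  x_low = (uint32_t) xb;
  x_high = (uint32_t) (xb >> 32);
  y_high = (uint32_t) (yb >> 32);

  /* Logical shift right: move Y's sign bit down to bit 0.  */
  sign = y_high >> 31;
  /* Bit-field insert: width 1 at bit 31 of X's high word.  */
  x_high = (x_high & 0x7fffffffu) | (sign << 31);

  /* The low word of X is copied unchanged; leaving this out would
     lose the least significant 32 bits of the mantissa.  */
  rb = ((uint64_t) x_high << 32) | x_low;
  memcpy (&x, &rb, sizeof x);
  return x;
}
```

With these definitions, soft_copysignf (-3.2f, 3.4f) gives 3.2f and soft_copysign (24.9, -4.0) gives -24.9, matching entries in the test arrays of the patch.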
[PATCH] Add test for PR52769
PR52769 reports a bug that has been fixed in 4.7, but the test case was never added. So I'd like to put this test in and close PR52769.

Ok?

2014-10-29  Marek Polacek  pola...@redhat.com

    PR c/52769
    * gcc.dg/pr52769.c: New test.

diff --git gcc/testsuite/gcc.dg/pr52769.c gcc/testsuite/gcc.dg/pr52769.c
index e69de29..138cecb 100644
--- gcc/testsuite/gcc.dg/pr52769.c
+++ gcc/testsuite/gcc.dg/pr52769.c
@@ -0,0 +1,24 @@
+/* PR c/52769 */
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+typedef struct
+{
+  int should_be_zero;
+  char s[6];
+  int x;
+} foo_t;
+
+int
+main (void)
+{
+  volatile foo_t foo = {
+    .s = "123456",
+    .x = 2
+  };
+
+  if (foo.should_be_zero != 0)
+    __builtin_abort ();
+
+  return 0;
+}

    Marek
Re: [PATCH 4/n] OpenMP 4.0 offloading infrastructure: lto-wrapper
Hello Richard, Jan, On 16 Oct 13:22, Jakub Jelinek wrote: On Thu, Oct 16, 2014 at 03:17:36PM +0400, Ilya Verbin wrote: The rest LGTM, but please run it through LTO review (Richard/Honza) too. Ping? -- Thanks, k Jakub
[PATCH, ifcvt] Allow CC mode if HAVE_cbranchcc4
Hi,

The patch enhances ifcvt to allow_cc_mode if HAVE_cbranchcc4.

Bootstrap and no make check regression on X86-64.

Will add new test cases after ccmp is enabled.

Ok for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-10-29  Zhenqiang Chen  zhenqiang.c...@arm.com

    * ifcvt.c (noce_emit_cmove, noce_get_alt_condition, noce_get_condition):
    Allow CC mode if HAVE_cbranchcc4.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index a28f5c1..5cd0ac0 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1441,10 +1441,17 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
       end_sequence ();
     }

-  /* Don't even try if the comparison operands are weird.  */
+  /* Don't even try if the comparison operands are weird
+     except that the target supports cbranchcc4.  */
   if (! general_operand (cmp_a, GET_MODE (cmp_a))
       || ! general_operand (cmp_b, GET_MODE (cmp_b)))
-    return NULL_RTX;
+    {
+#if HAVE_cbranchcc4
+      if (GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
+          || cmp_b != const0_rtx)
+#endif
+        return NULL_RTX;
+    }

 #if HAVE_conditional_move
   unsignedp = (code == LTU || code == GEU
@@ -1770,6 +1777,11 @@ noce_get_alt_condition (struct noce_if_info *if_info, rtx target,
   rtx cond, set;
   rtx_insn *insn;
   int reverse;
+  int allow_cc_mode = false;
+#if HAVE_cbranchcc4
+  allow_cc_mode = true;
+#endif
+

   /* If target is already mentioned in the known condition, return it.  */
   if (reg_mentioned_p (target, if_info->cond))
@@ -1891,7 +1903,7 @@ noce_get_alt_condition (struct noce_if_info *if_info, rtx target,
     }

   cond = canonicalize_condition (if_info->jump, cond, reverse,
-                                 earliest, target, false, true);
+                                 earliest, target, allow_cc_mode, true);
   if (! cond || ! reg_mentioned_p (target, cond))
     return NULL;

@@ -2347,6 +2359,10 @@ noce_get_condition (rtx_insn *jump, rtx_insn **earliest, bool then_else_reversed
 {
   rtx cond, set, tmp;
   bool reverse;
+  int allow_cc_mode = false;
+#if HAVE_cbranchcc4
+  allow_cc_mode = true;
+#endif

   if (! any_condjump_p (jump))
     return NULL_RTX;
@@ -2383,7 +2399,7 @@ noce_get_condition (rtx_insn *jump, rtx_insn **earliest, bool then_else_reversed
   /* Otherwise, fall back on canonicalize_condition to do the dirty
      work of manipulating MODE_CC values and COMPARE rtx codes.  */
   tmp = canonicalize_condition (jump, cond, reverse, earliest,
-                                NULL_RTX, false, true);
+                                NULL_RTX, allow_cc_mode, true);

   /* We don't handle side-effects in the condition, like handling
      REG_INC notes and making sure no duplicate conditions are emitted.  */

> -----Original Message-----
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Tuesday, October 28, 2014 12:03 AM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Ping] [PATCH, 10/10] aarch64: Handle ccmp in ifcvt to make it work with cmov
>
> On 10/27/2014 12:50 AM, Zhenqiang Chen wrote:
> > Good point. It is not ccmp special. It is cbranchcc4 related. If I
> > understand correct, without cbranchcc4, we need put the result to a
> > tmp register and generate additional compares, which is not good for
> > performance.
>
> It won't be an additional compare. It is regenerating the compare at a
> new location. The old comparison would be deleted via dead code
> elimination.
>
> That said,
>
> > +#if HAVE_cbranchcc4
> > +  allow_cc_mode = true;
> > +#endif
>
> does seem to be the right solution. If a target has this, we ought to be
> able to reasonably expect that it's got all the other patterns that this
> implies. If a target is missing them, hopefully we can get that sorted
> fairly quickly after this patch is included.
>
> > +#if HAVE_cbranchcc4
> > +  if (!(GET_MODE_CLASS (GET_MODE (cmp_a)) == MODE_CC
> > +        || GET_MODE_CLASS (GET_MODE (cmp_b)) == MODE_CC))
> > #endif
>
> This test looks weird, considering what we're looking for. I think a
> better test is
>
>   if (GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
>       || cmp_b != const0_rtx)
>
> Accepting something like (compare (reg:CC a) (reg:CC b)) is definitely
> non-canonical. Even (compare (const_int 0) (reg:CC flags)) is odd.
>
> The ifcvt.c change should go in by itself.
>
> The expr.c change should also be standalone.
>
> The ccmp.c change should probably be merged with the initial commit of
> ccmp.c.
>
> The aarch64.md change should probably be merged with the patch that adds
> cbranchcc.
>
> r~
Re: [PATCH 5/5] add libcc1
On Tue, Oct 28, 2014 at 05:36:50PM +, Phil Muldoon wrote: On 28/10/14 13:19, Joseph S. Myers wrote: I'm seeing a different bootstrap failure from those already discussed: In file included from /scratch/jmyers/fsf/gcc-mainline/libcc1/../gcc/gcc-plugin.h:28:0, from /scratch/jmyers/fsf/gcc-mainline/libcc1/plugin.cc:34: /scratch/jmyers/fsf/gcc-mainline/libcc1/../gcc/system.h:653:17: fatal error: gmp.h: No such file or directory It appears the build is ignoring the --with-gmp option passed to configure. Since gmp.h is included in system.h, if you include system.h you have to pass the right -I option corresponding to --with-gmp / --with-gmp-include. (There are several other such configure options for MPFR, MPC, CLooG, ISL, libiconv at least - whether they are relevant depends on whether your code ends up including the relevant headers.) Hi, sorry for the troubles! I am having difficulty seeing this fail on my system. I built gmp from upstream, installed it, and pointed to the install location with --with-gmp. Which stage does your build fail at? I am actually not totally sure how to respect the -with-gmp argument in libcc1. auto* tools are not my strongest skill. ;) I notice gcc/configure.ac I think just exports the variables to Makefile.in from the main configure script. That what we should do in this case? Here is a patch I'm bootstrapping/regtesting now (but, with system gmp installed). I've verified that with this patch stage1 libcc1 is built without -Werror in flags, while stage2 libcc1 is built with -Werror. If this passes bootstrap/regtest, is it ok for trunk (should fix two bootstrap issues)? Is the https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02936.html patch ok too (that one already tested; another bootstrap issue)? 2014-10-29 Jakub Jelinek ja...@redhat.com Phil Muldoon pmuld...@redhat.com * configure.ac: Remove -Werror addition to WARN_FLAGS. Add ACX_PROG_CC_WARNINGS_ARE_ERRORS and AC_ARG_VAR for GMPINC. * Makefile.am (AM_CPPFLAGS): Add $(GMPINC). 
(WERROR_FLAG): Remove. (AM_CXXFLAGS): Use $(WERROR) instead of $(WERROR_FLAG). * configure: Regenerated. * Makefile.in: Regenerated. --- libcc1/configure.ac.jj 2014-10-28 14:39:52.0 +0100 +++ libcc1/configure.ac 2014-10-29 10:01:36.515497687 +0100 @@ -52,8 +52,10 @@ gcc_version=`cat $srcdir/../gcc/BASE-VER AC_SUBST(gcc_version) ACX_PROG_CC_WARNING_OPTS([-W -Wall], [WARN_FLAGS]) -WARN_FLAGS=$WARN_FLAGS -Werror AC_SUBST(WARN_FLAGS) +ACX_PROG_CC_WARNINGS_ARE_ERRORS([manual]) + +AC_ARG_VAR(GMPINC,[How to find GMP include files]) libsuffix= if test $GXX = yes; then --- libcc1/Makefile.am.jj 2014-10-29 09:53:00.0 +0100 +++ libcc1/Makefile.am 2014-10-29 10:02:08.481885746 +0100 @@ -21,9 +21,8 @@ gcc_build_dir = ../$(host_subdir)/gcc AM_CPPFLAGS = -I $(srcdir)/../include -I $(srcdir)/../libgcc \ -I $(gcc_build_dir) -I$(srcdir)/../gcc \ -I $(srcdir)/../gcc/c -I $(srcdir)/../gcc/c-family \ - -I $(srcdir)/../libcpp/include -WERROR_FLAG = -Werror -AM_CXXFLAGS = $(WARN_FLAGS) $(WERROR_FLAG) $(visibility) + -I $(srcdir)/../libcpp/include $(GMPINC) +AM_CXXFLAGS = $(WARN_FLAGS) $(WERROR) $(visibility) override CXXFLAGS := $(filter-out -fsanitize=address,$(CXXFLAGS)) override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS)) # Can be simplified when libiberty becomes a normal convenience library. --- libcc1/configure.jj 2014-10-28 14:39:52.0 +0100 +++ libcc1/configure2014-10-29 10:02:32.957419099 +0100 @@ -605,6 +605,8 @@ LIBOBJS ENABLE_PLUGIN_FALSE ENABLE_PLUGIN_TRUE libsuffix +GMPINC +WERROR WARN_FLAGS gcc_version visibility @@ -743,6 +745,7 @@ with_pic enable_fast_install with_gnu_ld enable_libtool_lock +enable_werror_always enable_plugin ' ac_precious_vars='build_alias @@ -757,7 +760,8 @@ CPP CXX CXXFLAGS CCC -CXXCPP' +CXXCPP +GMPINC' # Initialize some variables set by options. 
@@ -1387,6 +1391,7 @@ Optional Features: --enable-fast-install[=PKGS] optimize for fast installation [default=yes] --disable-libtool-lock avoid locking (might break parallel builds) + --enable-werror-always enable -Werror despite compiler version --enable-plugin enable plugin support Optional Packages: @@ -1409,6 +1414,7 @@ Some influential environment variables: CXX C++ compiler command CXXFLAGSC++ compiler flags CXXCPP C++ preprocessor + GMPINC How to find GMP include files Use these variables to override the choices made by `configure' or to help it to find libraries and programs with nonstandard names/locations. @@ -10530,7 +10536,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 10533 configure +#line 10539 configure #include confdefs.h #if HAVE_DLFCN_H @@
RE: [Ping] [PATCH, 1/10] two hooks for conditional compare (ccmp)
> -----Original Message-----
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Monday, October 27, 2014 10:56 PM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Ping] [PATCH, 1/10] two hooks for conditional compare (ccmp)
>
> On 10/27/2014 12:47 AM, Zhenqiang Chen wrote:
> > + @var{NULL} if the combination of @var{prev} and this comparison is\n\
>
> @code{NULL}

Thanks! Patch is updated.

-Zhenqiang

1-hooks.patch
Description: Binary data
Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables
Hello Richard, Jan, On 08 Oct 11:23, Jakub Jelinek wrote: On Tue, Sep 30, 2014 at 06:53:20PM +0400, Ilya Verbin wrote: Bootstrapped and regtested on top of patch 2. Is it OK for trunk? LGTM, with the requested var/section renames. Would like if Honza and/or Richard had a look at the cgraph/LTO stuff in the patch though. Ping? -- Thanks, K Jakub
RE: [Ping] [PATCH, 2/10] prepare ccmp
> -----Original Message-----
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Monday, October 27, 2014 11:14 PM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Ping] [PATCH, 2/10] prepare ccmp
>
> On 10/27/2014 12:48 AM, Zhenqiang Chen wrote:
> > > On 09/22/2014 11:43 PM, Zhenqiang Chen wrote:
> > > > +  /* If jumps are cheap and the target does not support conditional
> > > > +     compare, turn some more codes into jumpy sequences.  */
> > > > +  else if (BRANCH_COST (optimize_insn_for_speed_p (), false) < 4
> > > > +           && (targetm.gen_ccmp_first == NULL))
> > >
> > > Don't add unnecessary parenthesis around the == expression.
> > >
> > > Otherwise ok.
> > >
> > > r~
> >
> > 2-prepare.patch
> >
> > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> > index 5200053..cfd4070 100644
> > --- a/gcc/cfgexpand.c
> > +++ b/gcc/cfgexpand.c
> > @@ -2115,9 +2115,10 @@ expand_gimple_cond (basic_block bb, gimple stmt)
> >        op0 = gimple_assign_rhs1 (second);
> >        op1 = gimple_assign_rhs2 (second);
> >      }
> > -  /* If jumps are cheap turn some more codes into
> > -     jumpy sequences.  */
> > -  else if (BRANCH_COST (optimize_insn_for_speed_p (), false) < 4)
> > +  /* If jumps are cheap and the target does not support conditional
> > +     compare, turn some more codes into jumpy sequences.  */
> > +  else if ((BRANCH_COST (optimize_insn_for_speed_p (), false) < 4)
> > +           && (!targetm.gen_ccmp_first))
>
> Did you not understand what I meant by parenthesis?

Thanks! Patch is updated.

-Zhenqiang

2-prepare.patch
Description: Binary data
RE: [Ping] [PATCH, 4/10] expand ccmp
Patch is rebased and merged with other changes according to comments. Thanks! -Zhenqiang -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Zhenqiang Chen Sent: Tuesday, September 23, 2014 2:44 PM To: gcc-patches@gcc.gnu.org Subject: [Ping] [PATCH, 4/10] expand ccmp Ping? Patch is rebased and regenerated since [PATCH, 3/10] skip swapping operands used in ccmp is discarded. Please find the updated patch in attachment. Bootstrap and no make check regression on X86-64. Thanks! -Zhenqiang ChangeLog: 2014-09-23 Zhenqiang Chen zhenqiang.c...@linaro.org * ccmp.c: New file. * ccmp.h: New file. * Makefile.in: Add ccmp.o * expr.c: #include ccmp.h (expand_expr_real_1): Try to expand ccmp. -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Zhenqiang Chen Sent: Tuesday, July 01, 2014 4:01 PM To: Richard Earnshaw Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, 4/10] expand ccmp On 25 June 2014 23:16, Richard Earnshaw rearn...@arm.com wrote: On 23/06/14 07:59, Zhenqiang Chen wrote: Hi, This patch includes the main logic to expand ccmp instructions. In the patch, * ccmp_candidate_p is used to identify the CCMP candidate * expand_ccmp_expr is the main entry, which calls expand_ccmp_expr_1 to expand CCMP. * expand_ccmp_expr_1 uses a recursive algorithm to expand CCMP. It calls gen_ccmp_first and gen_ccmp_next to generate CCMP instructions. During expanding, we must make sure that no instruction can clobber the CC reg except the compares. So clobber_cc_p and check_clobber_cc are introduced to do the check. * If the final result is not used in a COND_EXPR (checked by function used_in_cond_stmt_p), it calls cstorecc4 pattern to store the CC to a general register. Bootstrap and no make check regression on X86-64. OK for trunk? Thanks! 
-Zhenqiang ChangeLog: 2014-06-23 Zhenqiang Chen zhenqiang.c...@linaro.org * ccmp.c (ccmp_candidate_p, used_in_cond_stmt_p, check_clobber_cc, clobber_cc_p, expand_ccmp_next, expand_ccmp_expr_1, expand_ccmp_expr): New functions to expand ccmp. * ccmp.h (expand_ccmp_expr): New prototype. * expr.c: #include ccmp.h (expand_expr_real_1): Try to expand ccmp. diff --git a/gcc/ccmp.c b/gcc/ccmp.c index 665c2a5..97b3910 100644 --- a/gcc/ccmp.c +++ b/gcc/ccmp.c @@ -47,6 +47,262 @@ along with GCC; see the file COPYING3. If not see #include expmed.h #include ccmp.h +/* The following functions expand conditional compare (CCMP) instructions. + Here is a short description about the over all algorithm: + * ccmp_candidate_p is used to identify the CCMP candidate + + * expand_ccmp_expr is the main entry, which calls expand_ccmp_expr_1 + to expand CCMP. + + * expand_ccmp_expr_1 uses a recursive algorithm to expand CCMP. + It calls two target hooks gen_ccmp_first and gen_ccmp_next + to generate + CCMP instructions. +- gen_ccmp_first expands the first compare in CCMP. +- gen_ccmp_next expands the following compares. + + During expanding, we must make sure that no instruction can clobber the + CC reg except the compares. So clobber_cc_p and check_clobber_cc are + introduced to do the check. + + * If the final result is not used in a COND_EXPR (checked by function + used_in_cond_stmt_p), it calls cstorecc4 pattern to store + the CC to a + general register. */ + +/* Check whether G is a potential conditional compare candidate. 
+*/ static bool ccmp_candidate_p (gimple g) { + tree rhs = gimple_assign_rhs_to_tree (g); + tree lhs, op0, op1; + gimple gs0, gs1; + enum tree_code tcode, tcode0, tcode1; + tcode = TREE_CODE (rhs); + + if (tcode != BIT_AND_EXPR tcode != BIT_IOR_EXPR) +return false; + + lhs = gimple_assign_lhs (g); + op0 = TREE_OPERAND (rhs, 0); + op1 = TREE_OPERAND (rhs, 1); + + if ((TREE_CODE (op0) != SSA_NAME) || (TREE_CODE (op1) != SSA_NAME) + || !has_single_use (lhs)) +return false; + + gs0 = get_gimple_for_ssa_name (op0); + gs1 = get_gimple_for_ssa_name (op1); if (!gs0 || !gs1 || + !is_gimple_assign (gs0) || !is_gimple_assign (gs1) + /* g, gs0 and gs1 must be in the same basic block, since + current stage +is out-of-ssa. We can not guarantee the correctness when forwording +the gs0 and gs1 into g whithout DATAFLOW analysis. */ + || gimple_bb (gs0) != gimple_bb (gs1) + || gimple_bb (gs0) != gimple_bb (g)) +return false; + + if
libcc1
It would be nice to have libcc1 built just once, not bootstrap it, but it is a build module, is that possible?

In toplevel configure.ac I'm seeing:

host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim gdb gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1"

shouldn't libcc1 be in build_tools instead? I mean, it is a library meant to be dlopened by gdb and gcc plugin that uses that library, so in canadian-cross should be for the build target, where the resulting compiler will be run and where gdb will be run.

Could something like following work? Phil, can you try that? Perhaps some toplevel Makefile* changes would be needed too.

--- configure.ac	2014-10-28 14:39:53.018852391 +0100
+++ configure.ac	2014-10-29 11:27:39.866152791 +0100
@@ -129,7 +129,7 @@ extra_host_args=
 build_libs="build-libiberty build-libcpp"

 # these tools are built for the build environment
-build_tools="build-texinfo build-flex build-bison build-m4 build-fixincludes"
+build_tools="build-texinfo build-flex build-bison build-m4 build-fixincludes build-libcc1"

 # these libraries are used by various programs built for the host environment
 #f
@@ -141,7 +141,7 @@ host_libs=intl libiberty opcodes bfd re
 # binutils, gas and ld appear in that order because it makes sense to run
 # "make check" in that particular order.
 # If --enable-gold is used, "gold" may replace "ld".
-host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim gdb gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1"
+host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim gdb gprof etc expect dejagnu m4 utils guile fastjar gnattools"

 # libgcj represents the runtime libraries only used by gcj.
 libgcj="target-libffi \
@@ -2659,11 +2659,18 @@ for module in ${build_configdirs} ; do
     echo 1>&2 "*** removing ${build_subdir}/${module}/Makefile to force reconfigure"
     rm -f ${build_subdir}/${module}/Makefile
   fi
+
+  # Don't bootstrap libcc1
+  case $module in
+    libcc1) build_bootstrap_suffix=no-bootstrap ;;
+    *) build_bootstrap_suffix=$bootstrap_suffix ;;
+  esac
+
   extrasub_build="$extrasub_build
 /^@if build-$module\$/d
 /^@endif build-$module\$/d
-/^@if build-$module-$bootstrap_suffix\$/d
-/^@endif build-$module-$bootstrap_suffix\$/d
+/^@if build-$module-$build_bootstrap_suffix\$/d
+/^@endif build-$module-$build_bootstrap_suffix\$/d"
 done
 extrasub_host=
 for module in ${configdirs} ; do

    Jakub
RE: [Ping] [PATCH, 6/10] aarch64: add ccmp CC mode
-Original Message- From: Richard Henderson [mailto:r...@redhat.com] Sent: Monday, October 27, 2014 11:20 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org Subject: Re: [Ping] [PATCH, 6/10] aarch64: add ccmp CC mode On 10/27/2014 12:48 AM, Zhenqiang Chen wrote: -Original Message- From: Richard Henderson [mailto:r...@redhat.com] Sent: Saturday, October 11, 2014 11:32 PM To: Zhenqiang Chen; gcc-patches@gcc.gnu.org Subject: Re: [Ping] [PATCH, 6/10] aarch64: add ccmp CC mode On 09/22/2014 11:44 PM, Zhenqiang Chen wrote: +case CC_DNEmode: + return comp_code == NE ? AARCH64_NE : AARCH64_EQ; +case CC_DEQmode: + return comp_code == NE ? AARCH64_EQ : AARCH64_NE; +case CC_DGEmode: + return comp_code == NE ? AARCH64_GE : AARCH64_LT; +case CC_DLTmode: + return comp_code == NE ? AARCH64_LT : AARCH64_GE; +case CC_DGTmode: + return comp_code == NE ? AARCH64_GT : AARCH64_LE; +case CC_DLEmode: + return comp_code == NE ? AARCH64_LE : AARCH64_GT; +case CC_DGEUmode: + return comp_code == NE ? AARCH64_CS : AARCH64_CC; +case CC_DLTUmode: + return comp_code == NE ? AARCH64_CC : AARCH64_CS; +case CC_DGTUmode: + return comp_code == NE ? AARCH64_HI : AARCH64_LS; +case CC_DLEUmode: + return comp_code == NE ? AARCH64_LS : AARCH64_HI; I think these should return -1 if comp_code is not EQ. Like the CC_Zmode case below. Since the code can not guarantee that the CC is used in cbranchcc insns when expand, it maybe in a tmp register. After some optimizations the CC is forwarded in cbranchcc insn. So the comp_code might be any legal COMPARE. Um, no. The point of returning -1 is to avoid combining with comparisons for which we cannot produce the proper result. Patch is updated. E.g. the existing CC_Zmode, where only the Z bit is valid. We want to reject combination with LTU, which checks the C bit. Are you honestly suggesting that using CC_DNEmode with GE can be made to make sense in any way? No. I misuse the function in previous patch. Now it returns -1. Thanks! 
-Zhenqiang 6-ccmp-cc-mode.patch Description: Binary data
Re: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.
On 10/10/14 15:48, David Sherwood wrote:
> Hi,
>
> I have a fix (originally written by Tejas Belagod) for the following bug:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810
>
> Could someone take a look please?
>
> Thanks!
> David Sherwood.
>
> ChangeLog:
>
> gcc/:
> 2014-10-10  David Sherwood  david.sherw...@arm.com
>
>     * config/aarch64/aarch64-protos.h (aarch64_simd_attr_length_rglist,
>     aarch64_reverse_mask): New decls.
>     * config/aarch64/iterators.md (UNSPEC_REV_REGLIST): New enum.
>     * config/aarch64/iterators.md (insn_count): New mode_attr.
>     * config/aarch64/aarch64-simd.md (vec_store_lanes(o/c/x)i,
>     vec_load_lanes(o/c/x)i): Fixed to work for Big Endian.
>     * config/aarch64/aarch64-simd.md (aarch64_rev_reglist,
>     aarch64_simd_(ld/st)(2/3/4)): Added.
>     * config/aarch64/aarch64.c (aarch64_simd_attr_length_rglist,
>     aarch64_reverse_mask): Added.

> + RTVEC_ELT (v, (i * usize + j)) = GEN_INT (((i+1) * usize) - 1 - j);

s/i+1/i + 1. Remove extra parentheses.

Tejas.
RE: [PATCH] Fix up sign extension in bswap
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, October 29, 2014 9:41 AM
>
> I think this is ok for trunk with proper ChangeLog entry.

Done with following ChangeLog entry:

2014-10-29  Thomas Preud'homme  thomas.preudho...@arm.com

    * gcc.dg/optimize-bswapsi-1.c (swap32_e): New bswap test.
    * gcc.dg/optimize-bswapsi-3.c: New test.

Best regards,

Thomas
Re: libcc1
On 10/29/2014 11:31 AM, Jakub Jelinek wrote:
> It would be nice to have libcc1 built just once, not bootstrap it, but
> it is a build module, is that possible?
> In toplevel configure.ac I'm seeing:
> host_tools=texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim gdb gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1

Stuff such as texinfo and flex is in host_tools just as a relic of the old Cygnus tree. fixincludes is in there for running it after installation. The ones that matter in the common case are binutils, gas, ld, gcc, gdb, gnattools and of course libcc1.

> shouldn't libcc1 be in build_tools instead? I mean, it is a library
> meant to be dlopened by gdb and gcc plugin that uses that library, so
> in canadian-cross should be for the build target, where the resulting
> compiler will be run and where gdb will be run.

That is host, not build. Build is the system you are on. Say you're cross-building a native mingw compiler and debugger:

  build = i686-pc-linux-gnu
  host = i686-pc-mingw (or whatever they use these days)
  target = i686-pc-mingw

You cannot link build-libcc1 (for i686-pc-linux-gnu) into host-gcc or host-gdb.

But you surely know this, so perhaps it's me who is missing something.
RE: [Ping] [PATCH, 8/10] aarch64: ccmp insn patterns
-Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Richard Henderson Sent: Monday, October 27, 2014 11:47 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org Subject: Re: [Ping] [PATCH, 8/10] aarch64: ccmp insn patterns On 10/27/2014 12:49 AM, Zhenqiang Chen wrote: + {AARCH64_CC_Z, 0}, /* EQ, Z == 1. */ {0, AARCH64_CC_Z}, /* NE, Z + == 0. */ {AARCH64_CC_C, 0}, /* CS, C == 1. */ {0, AARCH64_CC_C}, + /* CC, C == 0. */ {0, 0}, /* MI, not supported*/ {0, 0}, /* PL, + not supported*/ {0, 0}, /* VS, not supported*/ {0, 0}, /* VC, not + supported*/ Why not go ahead and fill out the table? You know what needs to go in these slots, after all. Updated. + {AARCH64_CC_C, AARCH64_CC_Z}, /* HI, C ==1 Z == 0. */ + {AARCH64_CC_Z, AARCH64_CC_C}, /* LS, !(C == 1 Z == 0). */ + {AARCH64_CC_N | AARCH64_CC_V, AARCH64_CC_N}, /* GE, N == V. */ + {AARCH64_CC_N, AARCH64_CC_N | AARCH64_CC_V}, /* LT, N != V. */ + {AARCH64_CC_N | AARCH64_CC_V, AARCH64_CC_Z}, /* GT, Z == 0 N == + V. */ {AARCH64_CC_Z, AARCH64_CC_N | AARCH64_CC_V}, /* LE, !(Z == 0 + N == V). */ Perhaps it's me, but does it make things clearer to reduce these? That is, for the compound conditions, we need not make both sub-conditions be false, only one of them. E.g. {AARCH64_CC_C, 0} /* HI, C ==1 Z == 0. */ {0, AARCH64_CC_C} /* LS, !(C ==1 Z == 0) */ {0, AARCH64_CC_V} /* GE, N == V */ {AARCH64_CC_V, 0} /* LT, N != V */ {0, AARCH64_CC_Z} /* GT, Z == 0 N == V */ {AARCH64_CC_Z, 0} /* LE, !(Z == 0 N == V) */ At which point it becomes blindingly obvious that while we can't compress the table with ~nczv, we can index it with reverse_comparison instead. Updated. +case 'k': + { + int cond_code; + rtx op0 = XEXP (x, 0); + enum rtx_code mode_code; + /* Print a condition (eq, ne, etc) of ccmp. 
*/ + + if (!COMPARISON_P (x) || !ccmp_cc_register (op0, GET_MODE (op0))) + { + output_operand_lossage (invalid operand for '%%%c', code); + return; + } + + mode_code = aarch64_ccmp_mode_to_code (GET_MODE (op0)); + cond_code = aarch64_get_condition_code_1 (CCmode, mode_code); + gcc_assert (cond_code = 0); + fputs (aarch64_condition_codes[cond_code], f); + } Is there a branch with all the patches applied? I can't look back at the modified aarch64_get_condition_code_1, but off-hand I can't think of why %m/%M wouldn't work. Surely It's my fault. %m/%M work well in the new patch. And I add a check aarch64_ccmp_mode_to_code (GET_MODE (operands[1])) == GET_CODE (operands[5]) on the patterns to make sure that the compare and CC mode are aligned. Thanks! -Zhenqiang aarch64_get_condition_code_1 (GET_MODE (op0), GET_CODE (x)) will yield the correct cond_code. If it didn't, then surely branches wouldn't work at all. These are not some magic new kind of conditions; they're exactly the same. r~ 7-8-ccmp-patterns.patch Description: Binary data
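
To make the table discussion above easier to follow, here is a small C sketch that evaluates the AArch64 condition codes from a 4-bit NZCV value, using the definitions given in the quoted comments (HI is C == 1 and Z == 0, GE is N == V, and so on). The names FLAG_*, COND_* and cond_holds are invented for this example and are not the identifiers from the patch. The sketch also assumes one particular reading of the reduced table, namely that each pair is {flags that satisfy the condition, flags that fail it}; that reading is a guess, not something the thread states.

```c
#include <stdbool.h>

/* Bits of a 4-bit NZCV immediate; N is the most significant bit.
   Hypothetical names for this sketch.  */
enum { FLAG_V = 1, FLAG_C = 2, FLAG_Z = 4, FLAG_N = 8 };

/* Condition codes, listed in inverse pairs (EQ/NE, CS/CC, ...).  */
enum cond
{
  COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS,
  COND_VC, COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE
};

/* Return true if condition COND is satisfied by flag value NZCV.  */
static bool
cond_holds (enum cond cond, unsigned nzcv)
{
  bool n = (nzcv & FLAG_N) != 0;
  bool z = (nzcv & FLAG_Z) != 0;
  bool c = (nzcv & FLAG_C) != 0;
  bool v = (nzcv & FLAG_V) != 0;

  switch (cond)
    {
    case COND_EQ: return z;
    case COND_NE: return !z;
    case COND_CS: return c;
    case COND_CC: return !c;
    case COND_MI: return n;
    case COND_PL: return !n;
    case COND_VS: return v;
    case COND_VC: return !v;
    case COND_HI: return c && !z;
    case COND_LS: return !(c && !z);
    case COND_GE: return n == v;
    case COND_LT: return n != v;
    case COND_GT: return !z && n == v;
    case COND_LE: return !(!z && n == v);
    }
  return false;
}
```

Under that reading, the single-flag entries in the reduced table are enough: FLAG_C alone satisfies HI and 0 fails it, 0 satisfies GE and FLAG_V fails it, 0 satisfies GT and FLAG_Z fails it. The pair for each inverse condition (LS, LT, LE) is the same pair swapped, which is presumably why indexing by the reversed comparison works.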
Re: [PATCH 5/5] add libcc1
On 10/29/2014 11:28 AM, Jakub Jelinek wrote:
> If this passes bootstrap/regtest, is it ok for trunk (should fix two
> bootstrap issues)?
> Is the https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02936.html
> patch ok too (that one already tested; another bootstrap issue)?

Both seem okay, though I'd have to look at the whole thread to understand what libcc1 is. :)

Just two questions:

1) what's the issue that you need to disable asan for?

2) why is GMPLIB not handled in the same way?

Paolo
[PATCH][BUILDROBOT] Unused static function (was: RFA: AVR: add infrastructure for device packages)
On Wed, 2014-10-29 02:23:31 +0100, Jan-Benedict Glaw jbg...@lug-owl.de wrote:
> On Wed, 2014-10-08 18:50:32 +0100, Joern Rennecke joern.renne...@embecosm.com wrote:
> > Attached is the GCC patch for the basic device package infrastructure.
> > OK to apply?
>
> There's some fallout on config-list.mk builds:
>
> g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../../gcc/gcc -I../../../gcc/gcc/. -I../../../gcc/gcc/../include -I../../../gcc/gcc/../libcpp/include -I/opt/cfarm/mpc/include -I../../../gcc/gcc/../libdecnumber -I../../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I../../../gcc/gcc/../libbacktrace-I. -I. -I../../../gcc/gcc -I../../../gcc/gcc/. -I../../../gcc/gcc/../include -I../../../gcc/gcc/../libcpp/include -I/opt/cfarm/mpc/include -I../../../gcc/gcc/../libdecnumber -I../../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I../../../gcc/gcc/../libbacktrace ../../../gcc/gcc/config/avr/driver-avr.c
> ../../../gcc/gcc/config/avr/driver-avr.c:35:1: error: ‘void avr_set_current_device(const char*)’ defined but not used [-Werror=unused-function]
>  avr_set_current_device (const char *name)
>  ^
> cc1plus: all warnings being treated as errors
> make[2]: *** [driver-avr.o] Error 1
>
> See build http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=370682
>
> Is it planned to use that function later on? Or shall we just drop it?

So I suggest to just remove it.

2014-10-29  Jan-Benedict Glaw  jbg...@lug-owl.de

    * config/avr/driver-avr.c (avr_set_current_device): Remove.

diff --git a/gcc/config/avr/driver-avr.c b/gcc/config/avr/driver-avr.c
index 24a26d4..50de944 100644
--- a/gcc/config/avr/driver-avr.c
+++ b/gcc/config/avr/driver-avr.c
@@ -28,22 +28,3 @@ const avr_arch_t *avr_current_arch = NULL;

 /* Current device.  */
 const avr_mcu_t *avr_current_device = NULL;
-
-/* Initialize avr_current_arch and avr_current_device variables.  */
-
-static void
-avr_set_current_device (const char *name)
-{
-
-  if (NULL != avr_current_arch)
-    return;
-
-  for (avr_current_device = avr_mcu_types; avr_current_device->name;
-       avr_current_device++)
-    {
-      if (strcmp (avr_current_device->name, name) == 0)
-        break;
-    }
-
-  avr_current_arch = &avr_arch_types[avr_current_device->arch];
-}

--
Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Friends are relatives you make for yourself.
the second :
[Patch 0/6] Hookize MOVE_BY_PIECES_P
Hi, As discussed in the thread starting at: https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02359.html it would be useful to completely remove MOVE_BY_PIECES_P, rather than leaving it half-dead. This patch series has a small respin of the patch approved in that thread, followed by patches for each of the architectures using MOVE_BY_PIECES_P, followed by a final patch removing and poisoning the target macro. I haven't been able to test the target patches beyond building a compiler as I don't have access to hardware or emulators for these platforms. I would appreciate help from the maintainers of those ports where it can be given. The target-independent patches I've bootstrapped and tested on x86_64/ARM/AArch64 with no issues. OK for trunk? Thanks, James --- James Greenhalgh (6): [Patch 1/6] Hookize MOVE_BY_PIECES_P, remove most uses of MOVE_RATIO [Patch 2/6 s390] Deprecate MOVE_BY_PIECES_P, move to hookized version [Patch 3/6 arc] Deprecate MOVE_BY_PIECES_P, move to hookized version [Patch 4/6 sh] Deprecate MOVE_BY_PIECES_P, move to hookized version [Patch 5/6 mips] Deprecate MOVE_BY_PIECES_P, move to hookized version [Patch 6/6] Remove MOVE_BY_PIECES_P
[Patch 1/6] Hookize MOVE_BY_PIECES_P, remove most uses of MOVE_RATIO
Hi, This is a very minor respin of the patch at: https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02359.html dropping the dependency on the refactor in: https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01925.html The patch is otherwise unmodified from what was approved in September. Is this still OK? Thanks, James --- gcc/ 2014-10-28 James Greenhalgh james.greenha...@arm.com * target.def (move_by_pieces_profitable_p): New. * doc/tm.texi.in (MOVE_BY_PIECES_P): Reduce documentation to a stub describing that this macro is deprecated. (TARGET_MOVE_BY_PIECES_PROFITABLE_P): Add hook. * doc/tm.texi: Regenerate. * expr.c (MOVE_BY_PIECES_P): Remove. (STORE_BY_PIECES_P): Rewrite in terms of TARGET_MOVE_BY_PIECES_PROFITABLE_P. (can_move_by_pieces): Likewise. (emit_block_move_hints): Rewrite in terms of can_move_by_pieces. (emit_push_insn): Likewise. (expand_constructor): Likewise. * targhooks.c (get_move_ratio): New. (default_move_by_pieces_profitable_p): Likewise. * targhooks.h (default_move_by_pieces_profitable_p): New. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 5036d4f..c50227a 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6124,11 +6124,38 @@ If you don't define this, a reasonable default is used. @end defmac @defmac MOVE_BY_PIECES_P (@var{size}, @var{alignment}) -A C expression used to determine whether @code{move_by_pieces} will be used to -copy a chunk of memory, or whether some other block move mechanism -will be used. Defaults to 1 if @code{move_by_pieces_ninsns} returns less -than @code{MOVE_RATIO}. -@end defmac +A C expression used to implement the default behaviour of +@code{TARGET_MOVE_BY_PIECES_PROFITABLE_P}. New ports should implement +that hook in preference to this macro, which is deprecated. 
+@end defmac + +@deftypefn {Target Hook} bool TARGET_MOVE_BY_PIECES_PROFITABLE_P (unsigned int @var{size}, unsigned int @var{alignment}, bool @var{speed_p}) +GCC will attempt several strategies when asked to copy between +two areas of memory, for example when copying a @code{struct}. +@code{move_by_pieces} implements such a copy as a sequence of +memory-to-memory move insns. Alternate strategies are to expand the +@code{movmem} optab, to emit a library call, or to emit a unit-by-unit +loop-based copy. + +This target hook should return true if, for a memory move with a given +@var{size} and @var{alignment}, using the @code{move_by_pieces} +infrastructure is expected to result in better code generation. +Both @var{size} and @var{alignment} are measured in terms of storage +units. + +The parameter @var{speed_p} is true if the code is currently being +optimized for speed rather than size. + +Returning true for higher values of @var{size} can improve code generation +for speed if the target does not provide an implementation of the +@code{movmem} standard name, if the @code{movmem} implementation would be +more expensive than a sequence of move insns, or if the overhead of a +library call would dominate that of the body of the copy. + +Returning true for higher values of @code{size} may also cause an increase +in code size, for example where the number of insns emitted to perform a +move would be greater than that of a library call. +@end deftypefn @defmac MOVE_MAX_PIECES A C expression used by @code{move_by_pieces} to determine the largest unit diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 5674e6c..f3c90f8 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -4601,12 +4601,13 @@ If you don't define this, a reasonable default is used. 
@end defmac @defmac MOVE_BY_PIECES_P (@var{size}, @var{alignment}) -A C expression used to determine whether @code{move_by_pieces} will be used to -copy a chunk of memory, or whether some other block move mechanism -will be used. Defaults to 1 if @code{move_by_pieces_ninsns} returns less -than @code{MOVE_RATIO}. +A C expression used to implement the default behaviour of +@code{TARGET_MOVE_BY_PIECES_PROFITABLE_P}. New ports should implement +that hook in preference to this macro, which is deprecated. @end defmac +@hook TARGET_MOVE_BY_PIECES_PROFITABLE_P + @defmac MOVE_MAX_PIECES A C expression used by @code{move_by_pieces} to determine the largest unit a load or store used to copy memory is. Defaults to @code{MOVE_MAX}. diff --git a/gcc/expr.c b/gcc/expr.c index a5bf13a..6b3291f 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -164,14 +164,6 @@ static void do_tablejump (rtx, enum machine_mode, rtx, rtx, rtx, int); static rtx const_vector_from_tree (tree); static void write_complex_part (rtx, rtx, bool); -/* This macro is used to determine whether move_by_pieces should be called - to perform a structure copy. */ -#ifndef MOVE_BY_PIECES_P -#define MOVE_BY_PIECES_P(SIZE, ALIGN) \ - (move_by_pieces_ninsns (SIZE, ALIGN, MOVE_MAX_PIECES + 1) \ -(unsigned int) MOVE_RATIO (optimize_insn_for_speed_p ()))
Re: libcc1
On Wed, Oct 29, 2014 at 11:37:26AM +0100, Paolo Bonzini wrote: On 10/29/2014 11:31 AM, Jakub Jelinek wrote: shouldn't libcc1 be in build_tools instead? I mean, it is a library meant to be dlopened by gdb and gcc plugin that uses that library, so in canadian-cross should be for the build target, where the resulting compiler will be run and where gdb will be run. That is host, not build. Build is the system you are on. Oops, sorry, mixed that, sure, it should be host tool then. So without the first two hunks and third hunk changed so that it doesn't bootstrap it? Doesn't that mean that when bootstrapping natively it will be built by the system compiler rather than the newly built compiler? I think fixincludes is only built during stage1 normally, we don't need libcc1 during stage1/stage2 unless not bootstrapping, it is needed just for installation and testing. --- configure.ac2014-10-28 14:39:53.018852391 +0100 +++ configure.ac2014-10-29 11:43:19.873216226 +0100 @@ -2677,6 +2677,7 @@ for module in ${configdirs} ; do fi case ${module},${bootstrap_fixincludes} in fixincludes,no) host_bootstrap_suffix=no-bootstrap ;; +libcc1,*) host_bootstrap_suffix=no-bootstrap ;; *) host_bootstrap_suffix=$bootstrap_suffix ;; esac extrasub_host=$extrasub_host Jakub
[Patch 2/6 s390] Deprecate MOVE_BY_PIECES_P, move to hookized version
Hi,

This patch moves s390 to TARGET_MOVE_BY_PIECES_PROFITABLE_P.

I tried building a compiler and there were no fires, but otherwise, I have no reasonable way to test this patch. If one of the s390 maintainers wants to pick it up and test it, that would be much appreciated.

Ok?

James

---
2014-10-29  James Greenhalgh  james.greenha...@arm.com

    * config/s390/s390.c (s390_move_by_pieces_profitable_p): New.
    (TARGET_MOVE_BY_PIECES_PROFITABLE_P): Likewise.
    * config/s390/s390.h (MOVE_BY_PIECES_P): Remove.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 1b10805..f531e12 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -12043,6 +12043,17 @@ s390_option_override (void)
   register_pass (insert_pass_s390_early_mach);
 }
 
+/* Implement TARGET_MOVE_BY_PIECES_PROFITABLE_P.  */
+
+static bool
+s390_move_by_pieces_profitable_p (unsigned int size,
+                                  unsigned int align ATTRIBUTE_UNUSED,
+                                  bool speed_p ATTRIBUTE_UNUSED)
+{
+  return (size == 1 || size == 2
+          || size == 4 || (TARGET_ZARCH && size == 8));
+}
+
 /* Initialize GCC target structure.  */
 
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -12228,6 +12239,9 @@ s390_option_override (void)
 #undef TARGET_SET_UP_BY_PROLOGUE
 #define TARGET_SET_UP_BY_PROLOGUE s300_set_up_by_prologue
 
+#undef TARGET_MOVE_BY_PIECES_PROFITABLE_P
+#define TARGET_MOVE_BY_PIECES_PROFITABLE_P s390_move_by_pieces_profitable_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-s390.h"
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index c5edace..688c2fb 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -744,11 +744,6 @@ do { \
 #define MOVE_MAX_PIECES (TARGET_ZARCH ? 8 : 4)
 #define MAX_MOVE_MAX 16
 
-/* Determine whether to use move_by_pieces or block move insn.  */
-#define MOVE_BY_PIECES_P(SIZE, ALIGN) \
-  ( (SIZE) == 1 || (SIZE) == 2 || (SIZE) == 4 \
-    || (TARGET_ZARCH && (SIZE) == 8) )
-
 /* Determine whether to use clear_by_pieces or block clear insn.  */
 #define CLEAR_BY_PIECES_P(SIZE, ALIGN) \
   ( (SIZE) == 1 || (SIZE) == 2 || (SIZE) == 4 \
@@ -756,7 +751,9 @@ do { \
 
 /* This macro is used to determine whether store_by_pieces should be
    called to memcpy storage when the source is a constant string.  */
-#define STORE_BY_PIECES_P(SIZE, ALIGN) MOVE_BY_PIECES_P (SIZE, ALIGN)
+#define STORE_BY_PIECES_P(SIZE, ALIGN) \
+  targetm.move_by_pieces_profitable_p \
+    (SIZE, ALIGN, optimize_function_for_size_p (cfun))
 
 /* Likewise to decide whether to memset storage with byte values other
    than zero.  */
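For readers who want to see the s390 predicate in isolation, here is a minimal standalone sketch. It is not the GCC code: sketch_s390_profitable_p is a hypothetical name, and the zarch parameter is an assumption standing in for TARGET_ZARCH, which in GCC is a target flag rather than an argument. The align and speed_p parameters are left out because the patch marks them unused.

```c
#include <stdbool.h>

/* Sketch only: mirrors the size test from the s390 hook above.
   "zarch" is a hypothetical parameter standing in for TARGET_ZARCH.  */
static bool
sketch_s390_profitable_p (unsigned int size, bool zarch)
{
  /* Copies of 1, 2 or 4 units always qualify; 8 units only on zarch.  */
  return size == 1 || size == 2 || size == 4 || (zarch && size == 8);
}
```

The 8-unit case lines up with MOVE_MAX_PIECES being (TARGET_ZARCH ? 8 : 4) in the s390.h context shown in the diff.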
[Patch 3/6 arc] Deprecate MOVE_BY_PIECES_P, move to hookized version
Hi, This patch moves arc to TARGET_MOVE_BY_PIECES_PROFITABLE_P. While I am there, arc defines a macro CAN_MOVE_BY_PIECES, which is unused, so clean that up too. I tried building a compiler but no amount of fiddling with target strings got me to a sensible result, so this patch is completely untested. If one of the arc maintainers could give it a spin that would be helpful. OK? Thanks, James --- 2014-10-28 James Greenhalgh james.greenha...@arm.com * config/arc/arc.c (TARGET_MOVE_BY_PIECES_PROFITABLE_P): New. (arc_move_by_pieces_profitable_p): Likewise. * confir/arc/arc.h (MOVE_BY_PIECES_P): Delete. (CAN_MOVE_BY_PIECES): Likewise. diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index 8bfebfd..fcebe59 100644 --- a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -415,6 +415,10 @@ static void output_short_suffix (FILE *file); static bool arc_frame_pointer_required (void); +static bool arc_move_by_pieces_profitable_p (unsigned int, + unsigned int, + bool); + /* Implements target hook vector_mode_supported_p. */ static bool @@ -530,6 +534,9 @@ static void arc_finalize_pic (void); #undef TARGET_DELEGITIMIZE_ADDRESS #define TARGET_DELEGITIMIZE_ADDRESS arc_delegitimize_address +#undef TARGET_MOVE_BY_PIECES_PROFITABLE_P +#define TARGET_MOVE_BY_PIECES_PROFITABLE_P arc_move_by_pieces_profitable_p + /* Usually, we will be able to scale anchor offsets. When this fails, we want LEGITIMIZE_ADDRESS to kick in. */ #undef TARGET_MIN_ANCHOR_OFFSET @@ -9383,6 +9390,16 @@ arc_legitimize_reload_address (rtx *p, enum machine_mode mode, int opnum, return false; } +/* Implement TARGET_MOVE_BY_PIECES_PROFITABLE_P. 
*/ + +static bool +arc_move_by_pieces_profitable_p (unsigned int size ATTRIBUTE_UNUSED, + unsigned int align ATTRIBUTE_UNUSED, + bool speed_p ATTRIBUTE_UNUSED) +{ + return false; +} + struct gcc_target targetm = TARGET_INITIALIZER; #include gt-arc.h diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h index 2b0a04c..1a2c6b1 100644 --- a/gcc/config/arc/arc.h +++ b/gcc/config/arc/arc.h @@ -1553,12 +1553,6 @@ extern int arc_return_address_regs[4]; in one reasonably fast instruction. */ #define MOVE_MAX 4 -/* Let the movmem expander handle small block moves. */ -#define MOVE_BY_PIECES_P(LEN, ALIGN) 0 -#define CAN_MOVE_BY_PIECES(SIZE, ALIGN) \ - (move_by_pieces_ninsns (SIZE, ALIGN, MOVE_MAX_PIECES + 1) \ -(unsigned int) MOVE_RATIO (!optimize_size)) - /* Undo the effects of the movmem pattern presence on STORE_BY_PIECES_P . */ #define MOVE_RATIO(SPEED) ((SPEED) ? 15 : 3)
[Patch 4/6 sh] Deprecate MOVE_BY_PIECES_P, move to hookized version
Hi,

This patch moves sh to TARGET_MOVE_BY_PIECES_PROFITABLE_P.

I tried building a compiler and there were no fires, but otherwise, I have no reasonable way to test this patch. If one of the sh maintainers wants to pick it up and test it, that would be much appreciated.

Thanks,
James

---
gcc/

2014-10-28  James Greenhalgh  james.greenha...@arm.com

    * config/sh/sh.c (TARGET_MOVE_BY_PIECES_PROFITABLE_P): New.
    (sh_move_by_pieces_profitable_p): Likewise.
    * config/sh/sh.h (MOVE_BY_PIECES_P): Remove.

diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c
index 1662b55..0b907b9 100644
--- a/gcc/config/sh/sh.c
+++ b/gcc/config/sh/sh.c
@@ -338,6 +338,9 @@ static void sh_conditional_register_usage (void);
 static bool sh_legitimate_constant_p (enum machine_mode, rtx);
 static int mov_insn_size (enum machine_mode, bool);
 static int mov_insn_alignment_mask (enum machine_mode, bool);
+static bool sh_move_by_pieces_profitable_p (unsigned int size,
+                                            unsigned int align,
+                                            bool speed_p);
 static bool sequence_insn_p (rtx_insn *);
 static void sh_canonicalize_comparison (int *, rtx *, rtx *, bool);
 static void sh_canonicalize_comparison (enum rtx_code, rtx, rtx,
@@ -640,6 +643,9 @@ static const struct attribute_spec sh_attribute_table[] =
 #undef TARGET_FIXED_CONDITION_CODE_REGS
 #define TARGET_FIXED_CONDITION_CODE_REGS sh_fixed_condition_code_regs
 
+#undef TARGET_MOVE_BY_PIECES_PROFITABLE_P
+#define TARGET_MOVE_BY_PIECES_PROFITABLE_P sh_move_by_pieces_profitable_p
+
 /* Machine-specific symbol_ref flags.  */
 #define SYMBOL_FLAG_FUNCVEC_FUNCTION (SYMBOL_FLAG_MACH_DEP << 0)
 
@@ -13674,4 +13680,15 @@ sh_mode_priority (int entity ATTRIBUTE_UNUSED, int n)
   return ((TARGET_FPU_SINGLE != 0) ^ (n) ? FP_MODE_SINGLE : FP_MODE_DOUBLE);
 }
 
+/* Implement TARGET_MOVE_BY_PIECES_PROFITABLE_P.  */
+
+static bool
+sh_move_by_pieces_profitable_p (unsigned int size,
+                                unsigned int align,
+                                bool speed_p)
+{
+  return move_by_pieces_ninsns (size, align, MOVE_MAX_PIECES + 1)
+         < (!speed_p ? 2 : (align >= 32) ? 16 : 2);
+}
+
 #include "gt-sh.h"
diff --git a/gcc/config/sh/sh.h b/gcc/config/sh/sh.h
index 5b8b4a1..e115b1e 100644
--- a/gcc/config/sh/sh.h
+++ b/gcc/config/sh/sh.h
@@ -1591,10 +1591,6 @@ struct sh_args {
 #define USE_STORE_PRE_DECREMENT(mode) ((mode == SImode || mode == DImode) \
   ? 0 : TARGET_SH1)
 
-#define MOVE_BY_PIECES_P(SIZE, ALIGN) \
-  (move_by_pieces_ninsns (SIZE, ALIGN, MOVE_MAX_PIECES + 1) \
-   < (optimize_size ? 2 : ((ALIGN >= 32) ? 16 : 2)))
-
 #define STORE_BY_PIECES_P(SIZE, ALIGN) \
   (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
    < (optimize_size ? 2 : ((ALIGN >= 32) ? 16 : 2)))
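The sh heuristic compares an instruction count against a limit that depends on alignment and on whether the code is being optimized for speed. The sketch below models only that comparison, under stated assumptions: sketch_sh_limit and sketch_sh_profitable_p are hypothetical names, and ninsns is passed in by the caller as a stand-in for the result of move_by_pieces_ninsns, which is internal to GCC and cannot be called from a standalone program. The comparison operators are taken to be "<" and ">=", as the shape of the macro being removed suggests.

```c
#include <stdbool.h>

/* Sketch only: the limit on the number of move insns.
   When not optimizing for speed the limit is 2; otherwise it is 16 for
   alignments of at least 32, and 2 for anything smaller.  */
static unsigned int
sketch_sh_limit (unsigned int align, bool speed_p)
{
  if (!speed_p)
    return 2;
  return align >= 32 ? 16 : 2;
}

/* Sketch only: "ninsns" is a hypothetical parameter standing in for
   move_by_pieces_ninsns (size, align, MOVE_MAX_PIECES + 1).  */
static bool
sketch_sh_profitable_p (unsigned int ninsns, unsigned int align, bool speed_p)
{
  return ninsns < sketch_sh_limit (align, speed_p);
}
```

With these assumptions, a well-aligned copy needing 15 moves is expanded by pieces when optimizing for speed, while one needing 16 is not.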
Re: libcc1
On 10/29/2014 11:45 AM, Jakub Jelinek wrote: On Wed, Oct 29, 2014 at 11:37:26AM +0100, Paolo Bonzini wrote: On 10/29/2014 11:31 AM, Jakub Jelinek wrote: shouldn't libcc1 be in build_tools instead? I mean, it is a library meant to be dlopened by gdb and gcc plugin that uses that library, so in canadian-cross should be for the build target, where the resulting compiler will be run and where gdb will be run. That is host, not build. Build is the system you are on. Oops, sorry, mixed that, sure, it should be host tool then. So without the first two hunks and third hunk changed so that it doesn't bootstrap it? Doesn't that mean that when bootstrapping natively it will be built by the system compiler rather than the newly built compiler? IIRC it will be built after stage3 completes, with the just-bootstrapped compiler. I think fixincludes is only built during stage1 normally, we don't need libcc1 during stage1/stage2 unless not bootstrapping, it is needed just for installation and testing. --- configure.ac 2014-10-28 14:39:53.018852391 +0100 +++ configure.ac 2014-10-29 11:43:19.873216226 +0100 @@ -2677,6 +2677,7 @@ for module in ${configdirs} ; do fi case ${module},${bootstrap_fixincludes} in fixincludes,no) host_bootstrap_suffix=no-bootstrap ;; +libcc1,*) host_bootstrap_suffix=no-bootstrap ;; *) host_bootstrap_suffix=$bootstrap_suffix ;; esac extrasub_host=$extrasub_host This makes sense. Paolo
[Patch 5/6 mips] Deprecate MOVE_BY_PIECES_P, move to hookized version
Hi, This patch moves mips to TARGET_MOVE_BY_PIECES_PROFITABLE_P. I tried building a compiler and there were no fires, I don't have access to any MIPS hardware, so if one of the MIPS maintainers wanted to pick this up and test it, that would be very much appreciated. OK? Thanks, James --- gcc/ 2014-10-28 James Greenhalgh james.greenha...@arm.com * config/mips/mips.h (MOVE_BY_PIECES_P): Remove. * config/mips/mips.c (TARGET_MOVE_BY_PIECES_PROFITABLE_P): New. (mips_move_by_pieces_p): Rename to... (mips_move_by_pieces_profitable_p): ...this, use new hook parameters, use the default hook implementation as a fall-back. diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 2f9d2da..4d7ef81 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -7172,7 +7172,9 @@ mips_function_ok_for_sibcall (tree decl, tree exp ATTRIBUTE_UNUSED) /* Implement MOVE_BY_PIECES_P. */ bool -mips_move_by_pieces_p (unsigned HOST_WIDE_INT size, unsigned int align) +mips_move_by_pieces_profitable_p (unsigned int size, + unsigned int align, + bool speed_p) { if (HAVE_movmemsi) { @@ -7191,10 +7193,8 @@ mips_move_by_pieces_p (unsigned HOST_WIDE_INT size, unsigned int align) return size UNITS_PER_WORD; return size = MIPS_MAX_MOVE_BYTES_STRAIGHT; } - /* The default value. If this becomes a target hook, we should - call the default definition instead. */ - return (move_by_pieces_ninsns (size, align, MOVE_MAX_PIECES + 1) - (unsigned int) MOVE_RATIO (optimize_insn_for_speed_p ())); + + return default_move_by_pieces_profitable_p (size, align, speed_p); } /* Implement STORE_BY_PIECES_P. 
*/ @@ -19116,6 +19116,9 @@ mips_lra_p (void) #undef TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS #define TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS true +#undef TARGET_MOVE_BY_PIECES_PROFITABLE_P +#define TARGET_MOVE_BY_PIECES_PROFITABLE_P mips_move_by_pieces_profitable_p + #undef TARGET_SPILL_CLASS #define TARGET_SPILL_CLASS mips_spill_class #undef TARGET_LRA_P diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index c7b998b..6872940 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -2872,9 +2872,6 @@ while (0) ? MIPS_MAX_MOVE_BYTES_STRAIGHT / MOVE_MAX \ : MIPS_CALL_RATIO / 2) -#define MOVE_BY_PIECES_P(SIZE, ALIGN) \ - mips_move_by_pieces_p (SIZE, ALIGN) - /* For CLEAR_RATIO, when optimizing for size, give a better estimate of the length of a memset call, but use the default otherwise. */
[Patch 6/6] Remove MOVE_BY_PIECES_P
Hi, This final patch gets rid of MOVE_BY_PIECES_P. Bootstrapped on x86_64, ARM and AArch64. Thanks, James --- gcc/ 2014-10-28 James Greenhalgh james.greenha...@arm.com * doc/tm.texi.in (MOVE_BY_PIECES_P): Remove. * doc/tm.texi: Regenerate. * system.h: Poison MOVE_BY_PIECES_P. * targhooks.c (default_move_by_pieces_profitable_p): Remove MOVE_BY_PIECES_P. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index c50227a..86d783e 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6123,12 +6123,6 @@ optimized for speed rather than size. If you don't define this, a reasonable default is used. @end defmac -@defmac MOVE_BY_PIECES_P (@var{size}, @var{alignment}) -A C expression used to implement the default behaviour of -@code{TARGET_MOVE_BY_PIECES_PROFITABLE_P}. New ports should implement -that hook in preference to this macro, which is deprecated. -@end defmac - @deftypefn {Target Hook} bool TARGET_MOVE_BY_PIECES_PROFITABLE_P (unsigned int @var{size}, unsigned int @var{alignment}, bool @var{speed_p}) GCC will attempt several strategies when asked to copy between two areas of memory, for example when copying a @code{struct}. diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index f3c90f8..f085796 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -4600,12 +4600,6 @@ optimized for speed rather than size. If you don't define this, a reasonable default is used. @end defmac -@defmac MOVE_BY_PIECES_P (@var{size}, @var{alignment}) -A C expression used to implement the default behaviour of -@code{TARGET_MOVE_BY_PIECES_PROFITABLE_P}. New ports should implement -that hook in preference to this macro, which is deprecated. 
-@end defmac - @hook TARGET_MOVE_BY_PIECES_PROFITABLE_P @defmac MOVE_MAX_PIECES diff --git a/gcc/system.h b/gcc/system.h index dbe1ceb..b9b90d4 100644 --- a/gcc/system.h +++ b/gcc/system.h @@ -847,7 +847,8 @@ extern void fancy_abort (const char *, int, const char *) ATTRIBUTE_NORETURN; HOT_TEXT_SECTION_NAME LEGITIMATE_CONSTANT_P ALWAYS_STRIP_DOTDOT \ OUTPUT_ADDR_CONST_EXTRA SMALL_REGISTER_CLASSES ASM_OUTPUT_IDENT \ ASM_BYTE_OP MEMBER_TYPE_FORCES_BLK LIBGCC2_HAS_SF_MODE \ - LIBGCC2_HAS_DF_MODE LIBGCC2_HAS_XF_MODE LIBGCC2_HAS_TF_MODE + LIBGCC2_HAS_DF_MODE LIBGCC2_HAS_XF_MODE LIBGCC2_HAS_TF_MODE \ + MOVE_BY_PIECES_P /* Target macros only used for code built for the target, that have moved to libgcc-tm.h or have never been present elsewhere. */ diff --git a/gcc/targhooks.c b/gcc/targhooks.c index 9ba3f8b..068e24e 100644 --- a/gcc/targhooks.c +++ b/gcc/targhooks.c @@ -1433,12 +1433,8 @@ default_move_by_pieces_profitable_p (unsigned int size ATTRIBUTE_UNUSED, unsigned int alignment ATTRIBUTE_UNUSED, bool speed_p ATTRIBUTE_UNUSED) { -#ifndef MOVE_BY_PIECES_P return move_by_pieces_ninsns (size, alignment, MOVE_MAX_PIECES + 1) get_move_ratio (speed_p); -#else - return !!(MOVE_BY_PIECES_P (size, alignment)); -#endif } bool
Re: [PATCH 5/5] add libcc1
On Wed, Oct 29, 2014 at 11:37:42AM +0100, Paolo Bonzini wrote: On 10/29/2014 11:28 AM, Jakub Jelinek wrote: If this passes bootstrap/regtest, is it ok for trunk (should fix two bootstrap issues)? Is the https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02936.html patch ok too (that one already tested; another bootstrap issue)? Both seem okay, though I'd have to look at the whole thread to understand what libcc1 is. :) It is a library for communication between the debugger and a GCC plugin (and the plugin itself). So, the library is dlopened into GDB and the plugin that links against that library is dlopened by GCC when GDB asks the library it dlopened to run the compiler with the plugin. Just two questions: 1) what's the issue that you need to disable asan for? -fsanitize=address generally doesn't work or doesn't work too well, if the binary is not built with -fsanitize=address, but shared library dlopened into it is. So, we want to avoid instrumenting plugins that way (we already don't instrument lto-plugin for that reason, because ld might not be asan instrumented, and libcc1 is similar case, when gdb dlopens the library, it might not be instrumented either). 2) why is GMPLIB not handled in the same way? The only problem is that system.h includes gmp.h, so we need a way to find that header. I think libcc1 doesn't use any functions from gmp itself, so if gmp.h can be included, GMPLIB isn't really needed. Jakub
Re: [PATCH 5/5] add libcc1
On 10/29/2014 11:51 AM, Jakub Jelinek wrote: On Wed, Oct 29, 2014 at 11:37:42AM +0100, Paolo Bonzini wrote: On 10/29/2014 11:28 AM, Jakub Jelinek wrote: If this passes bootstrap/regtest, is it ok for trunk (should fix two bootstrap issues)? Is the https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02936.html patch ok too (that one already tested; another bootstrap issue)? Both seem okay, though I'd have to look at the whole thread to understand what libcc1 is. :) It is a library for communication between the debugger and a GCC plugin (and the plugin itself). So, the library is dlopened into GDB and the plugin that links against that library is dlopened by GCC when GDB asks the library it dlopened to run the compiler with the plugin. Just two questions: 1) what's the issue that you need to disable asan for? -fsanitize=address generally doesn't work or doesn't work too well, if the binary is not built with -fsanitize=address, but shared library dlopened into it is. So, we want to avoid instrumenting plugins that way (we already don't instrument lto-plugin for that reason, because ld might not be asan instrumented, and libcc1 is similar case, when gdb dlopens the library, it might not be instrumented either). Thanks for explaining. I can see intuitively why that could be a problem... 2) why is GMPLIB not handled in the same way? The only problem is that system.h includes gmp.h, so we need a way to find that header. I think libcc1 doesn't use any functions from gmp itself, so if gmp.h can be included, GMPLIB isn't really needed. Ah, got it. Is it hard to move the inclusion to the actual users? Paolo
Re: libcc1
On 29/10/14 10:31, Jakub Jelinek wrote: It would be nice to have libcc1 built just once, not bootstrap it, but it is a build module, is that possible? In toplevel configure.ac I'm seeing: host_tools=texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim gdb gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1 shouldn't libcc1 be in build_tools instead? I mean, it is a library meant to be dlopened by gdb and gcc plugin that uses that library, so in canadian-cross should be for the build target, where the resulting compiler will be run and where gdb will be run. Could something like following work? Phil, can you try that? Perhaps some toplevel Makefile* changes would be needed too. From GDB's point-of-view, as long as we have access to the .so that is built that's all GDB wants. So whichever stage it is produced should be fine. My archaeology into the source repository has not revealed why we needed bootstrap. Perhaps we included it out of a sense of paranoia for testing. I've CC'd Tom on this, so he may have an opinion or insight. From my point of view, I see no value in bootstrapping libcc1 now. It's not a required build to bootstrap GCC. Cheers Phil
Re: [PATCH 5/5] add libcc1
On Wed, Oct 29, 2014 at 11:53:28AM +0100, Paolo Bonzini wrote: 2) why is GMPLIB not handled in the same way? The only problem is that system.h includes gmp.h, so we need a way to find that header. I think libcc1 doesn't use any functions from gmp itself, so if gmp.h can be included, GMPLIB isn't really needed. Ah, got it. Is it hard to move the inclusion to the actual users? I think it is hard. I think it has been moved to system.h very much intentionally, as including gmp.h only in selected headers was causing lots of troubles, e.g. because of #pragma GCC poison at the end of system.h, I believe some gmp.h versions were using some poisoned symbols. system.h doesn't include gmp.h if -DGENERATOR_FILE, but libcc1 is not a generator, so that is not appropriate, it can use various other GCC headers that are not suitable for generators. GMPINC has been suggested by Joseph, I'd think if we ever need also GMPLIB, we'd clearly see it as link failures of libcc1 first and could add it only when really needed. Jakub
Re: [PATCH 5/5] add libcc1
On 29/10/14 10:53, Paolo Bonzini wrote: 2) why is GMPLIB not handled in the same way? The only problem is that system.h includes gmp.h, so we need a way to find that header. I think libcc1 doesn't use any functions from gmp itself, so if gmp.h can be included, GMPLIB isn't really needed. Ah, got it. Is it hard to move the inclusion to the actual users? We don't, I was looking at this issue today. It is just as Jakub explains. Cheers Phil
Re: libcc1
On 10/29/2014 11:58 AM, Phil Muldoon wrote: My archaeology into the source repository has not revealed why we needed bootstrap. Perhaps we included it out of a sense of paranoia for testing. I've CC'd Tom on this, so he may have an opinion or insight. From my point of view, I see no value in bootstrapping libcc1 now. It's not a required build to bootstrap GCC. Then I agree, I don't think it needs to be bootstrapped. Paolo
Re: [PATCH 5/5] add libcc1
On 29/10/14 10:53, Paolo Bonzini wrote: On 10/29/2014 11:51 AM, Jakub Jelinek wrote: On Wed, Oct 29, 2014 at 11:37:42AM +0100, Paolo Bonzini wrote: On 10/29/2014 11:28 AM, Jakub Jelinek wrote: If this passes bootstrap/regtest, is it ok for trunk (should fix two bootstrap issues)? Is the https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02936.html patch ok too (that one already tested; another bootstrap issue)? Both seem okay, though I'd have to look at the whole thread to understand what libcc1 is. :) It is a library for communication between the debugger and a GCC plugin (and the plugin itself). So, the library is dlopened into GDB and the plugin that links against that library is dlopened by GCC when GDB asks the library it dlopened to run the compiler with the plugin. Adding on to what Jakub said, it allows GDB access to GCC's parser. There are a number of reasons why, but right now that means we can compile code snippets in GDB (without the source of the current inferior), allow access to symbols of that inferior in that source snippet, etc. We then inject and execute it. We are currently writing a wiki article about it. Not required reading or anything, but more information for the curious. https://sourceware.org/gdb/wiki/GCCCompileAndExecute (That is a work in progress). There is also a video of a presentation I did at Cauldron somewhere. Cheers Phil
Re: libcc1
On 29/10/14 10:31, Jakub Jelinek wrote:

It would be nice to have libcc1 built just once, not bootstrap it, but it is a build module, is that possible? In toplevel configure.ac I'm seeing: host_tools=texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim gdb gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1 shouldn't libcc1 be in build_tools instead? I mean, it is a library meant to be dlopened by gdb and gcc plugin that uses that library, so in canadian-cross should be for the build target, where the resulting compiler will be run and where gdb will be run. Could something like following work? Phil, can you try that? Perhaps some toplevel Makefile* changes would be needed too.

From GDB's point of view, as long as the .so is available in the finished product in all of the scenarios above, that is fine. I will test your patch and report back.

Cheers
Phil
RE: [Patch 1/6] Hookize MOVE_BY_PIECES_P, remove most uses of MOVE_RATIO
Hi James,

I think you have a bug in the following hunk where you pass STORE_MAX_PIECES in place of the optimise for speed flag. I guess you would need an extra argument to pass a different *_MAX_PIECES value in.

thanks,
Matthew

@@ -192,8 +184,7 @@ static void write_complex_part (rtx, rtx, bool);
    called to memcpy storage when the source is a constant string.  */
 #ifndef STORE_BY_PIECES_P
 #define STORE_BY_PIECES_P(SIZE, ALIGN) \
-  (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
-   < (unsigned int) MOVE_RATIO (optimize_insn_for_speed_p ()))
+  (targetm.move_by_pieces_profitable_p (SIZE, ALIGN, STORE_MAX_PIECES))
 #endif
 
 /* This is run to set up which modes can be use
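To illustrate why the hunk Matthew points at compiles cleanly yet misbehaves, here is a small sketch. It relies only on standard C: a nonzero integer converted to bool becomes true. Everything named demo_* or DEMO_* is hypothetical. The value 8 is an arbitrary stand-in for STORE_MAX_PIECES, and the toy policy inside demo_profitable_p is an assumption chosen so that the flag visibly changes the result; it is not GCC's heuristic.

```c
#include <stdbool.h>

/* Hypothetical stand-in for STORE_MAX_PIECES; the real value is
   target-specific.  */
#define DEMO_STORE_MAX_PIECES 8

/* Hypothetical hook with the same parameter shape as the one in the
   patch: the third parameter is a flag, not a piece size.  */
static bool
demo_profitable_p (unsigned int size, unsigned int align, bool speed_p)
{
  (void) align;
  /* Toy policy: allow larger copies only when optimizing for speed.  */
  return size <= (speed_p ? 16u : 4u);
}

/* Same shape as the hunk: an integer constant is passed where the
   flag is expected.  It silently converts to true, so the callee
   always behaves as if optimizing for speed.  */
static bool
demo_store_by_pieces_buggy (unsigned int size, unsigned int align)
{
  return demo_profitable_p (size, align, DEMO_STORE_MAX_PIECES);
}
```

In this sketch an 8-unit store is reported as profitable through the buggy wrapper even though a direct call with speed_p set to false rejects it, which is the kind of silent behaviour change the review comment is about.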
Re: [PATCH] Add test for PR52769
On Wed, Oct 29, 2014 at 11:24 AM, Marek Polacek pola...@redhat.com wrote:

PR52769 reports a bug that has been fixed in 4.7, but the test case was never added. So I'd like to put this test in and close PR52769. Ok?

Ok everywhere.

Thanks,
Richard.

2014-10-29  Marek Polacek  pola...@redhat.com

    PR c/52769
    * gcc.dg/pr52769.c: New test.

diff --git gcc/testsuite/gcc.dg/pr52769.c gcc/testsuite/gcc.dg/pr52769.c
index e69de29..138cecb 100644
--- gcc/testsuite/gcc.dg/pr52769.c
+++ gcc/testsuite/gcc.dg/pr52769.c
@@ -0,0 +1,24 @@
+/* PR c/52769 */
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+typedef struct
+{
+  int should_be_zero;
+  char s[6];
+  int x;
+} foo_t;
+
+int
+main (void)
+{
+  volatile foo_t foo = {
+    .s = "123456",
+    .x = 2
+  };
+
+  if (foo.should_be_zero != 0)
+    __builtin_abort ();
+
+  return 0;
+}

Marek
Re: [PATCH 1/X, i386, PR54232] Enable EBX for x86 in 32bits PIC code
The test passes now. So let's remove xfail.

2014-10-29  Evgeny Stupachenko  evstu...@gmail.com

gcc/testsuite
    * gcc.target/i386/pr23098.c: Remove xfail.

diff --git a/gcc/testsuite/gcc.target/i386/pr23098.c b/gcc/testsuite/gcc.target/i386/pr23098.c
index 7f118dc..7f118bb 100644
--- a/gcc/testsuite/gcc.target/i386/pr23098.c
+++ b/gcc/testsuite/gcc.target/i386/pr23098.c
@@ -1,7 +1,7 @@
 /* PR rtl-optimization/23098 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -fPIC" } */
-/* { dg-final { scan-assembler-not "\.LC\[0-9\]" { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "\.LC\[0-9\]" } } */
 /* { dg-require-effective-target ia32 } */
 /* { dg-require-effective-target fpic } */

On Thu, Oct 23, 2014 at 4:19 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
> Evgeny Stupachenko evstu...@gmail.com writes:
>> Reattached.
>>
>> On Mon, Oct 13, 2014 at 8:22 PM, Uros Bizjak ubiz...@gmail.com wrote:
>>> On Mon, Oct 13, 2014 at 4:53 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
>>>> ChangeLog for testsuite:
>>>>
>>>> 2014-10-13  Evgeny Stupachenko  evstu...@gmail.com
>>>>
>>>>     PR target/8340
>>>>     PR middle-end/47602
>>>>     PR rtl-optimization/55458
>>>>     * gcc.target/i386/pic-1.c: Remove dg-error as test should pass now.
>>>>     * gcc.target/i386/pr55458.c: Likewise.
>>>>     * gcc.target/i386/pr47602.c: New.
>>>>     * gcc.target/i386/pr23098.c: Move to XFAIL.
>
> The unconditional XFAIL is wrong: the test now XPASSes on i386-pc-solaris2.11 and x86_64-unknown-linux-gnu, i686-unknown-linux-gnu for 32-bit.
>
> Please fix.
>
> Rainer
>
> --
> Rainer Orth, Center for Biotechnology, Bielefeld University
Re: genmatch infinite loop during bootstrap on AIX
On Fri, 24 Oct 2014, David Edelsohn wrote:
> genmatch is hanging when bootstrapping on AIX (gcc111). When I attach to the process:
>
> #0 0x1007efac in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string ()
> #1 0x1000e6b0 in _ZN6parser13parse_captureEP7operand (this=0x300594b8, op=0x0) at /home/dje/src/src/gcc/genmatch.c:2607
> #2 0x1000e9f0 in _ZN6parser10parse_exprEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2669
> #3 0x1000ee38 in _ZN6parser8parse_opEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2728
> #4 0x1000efc4 in _ZN6parser14parse_simplifyEjR3vecIP8simplify7va_heap6vl_ptrEP12predicate_idP4expr (this=0x2ff20208, match_location=4614, simplifiers=..., matcher=0x0, result=0x0) at /home/dje/src/src/gcc/genmatch.c:2792
> #5 0x100102fc in _ZN6parser13parse_patternEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:3052
> #6 0x10010c0c in _ZN6parser9parse_forEj (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2991
> #7 0x10010350 in _ZN6parser13parse_patternEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:3090
> #8 0x1001122c in _ZN6parserC2EP10cpp_reader (this=0x2ff20208, r_=0x3003bbec) at /home/dje/src/src/gcc/genmatch.c:3122
> #9 0x10004acc in main (argc=<error reading variable>, argv=<error reading variable>) at _start_ :3204

(I've re-built stage2 build/genmatch with -g, thus no optimization and debug info)

Then I see a different frame #0 (std::allocator<char>::allocator()) and for frame #1 I see

   0x100098b4 <+160>: stw r9,88(r31)
   0x100098b8 <+164>: lwz r9,152(r31)
   0x100098bc <+168>: lwz r30,12(r9)
   0x100098c0 <+172>: addi r9,r31,64
   0x100098c4 <+176>: mr r3,r9
   0x100098c8 <+180>: bl 0x100984dc <_ZNSaIcEC1Ev>
=> 0x100098cc <+184>: lwz r2,20(r1)

while for _ZNSaIcEC1Ev there doesn't seem to be proper debug information (maybe I'm missing some tricks for that) even though stage1 libstdc++ was built with -g. The disassembly of this (empty!) constructor looks completely weird though:

(gdb) down
#0 0x100984dc in std::allocator<char>::allocator() ()
(gdb) disassemble
Dump of assembler code for function _ZNSaIcEC1Ev:
=> 0x100984dc <+0>: addi r12,r2,-9528
   0x100984e0 <+4>: stw r2,20(r1)
   0x100984e4 <+8>: lwz r0,0(r12)
   0x100984e8 <+12>: lwz r2,4(r12)
   0x100984ec <+16>: mtctr r0
   0x100984f0 <+20>: bctr
   0x100984f4 <+24>: .long 0x0
   0x100984f8 <+28>: .long 0xca000
   0x100984fc <+32>: .long 0x0
   0x10098500 <+36>: .long 0x18
End of assembler dump.

'bctr' seems to be a jump to $r0 (0x100984dc) here and all other instructions are fancy no-ops?

I do see a long list of warnings at link time similar to

ld: 0711-768 WARNING: Object /home/rguenth/obj/prev-powerpc-ibm-aix7.1.0.0/libstdc++-v3/src/.libs/libstdc++.a[libstdc++.so.6], section 1, function .std::time_get<wchar_t, std::istreambuf_iterator<wchar_t, std::char_traits<wchar_t> > >::_M_extract_via_format(std::istreambuf_iterator<wchar_t, std::char_traits<wchar_t> >, std::istreambuf_iterator<wchar_t, std::char_traits<wchar_t> >, std::ios_base&, std::_Ios_Iostate&, tm*, wchar_t const*) const: The branch at address 0x10042638 is not followed by a recognized no-op or TOC-reload instruction. The unrecognized instruction is 0x4BFFFEBC.

so maybe some weird PPC stuff is not set up correctly in libstdc++ so that the above function doesn't compute its return address correctly.

Maybe we only run into this because genmatch is the first and only generator program that actually uses libstdc++ and we don't do well using a libstdc++ built with -g only (and no optimization). This is after all the very first entry into libstdc++ (to an empty function).

I am making the bootstrap continue by copying over stage1 genmatch. Let's see if stage3 fails the same way (it should use the optimized libstdc++ from stage2).

Thanks,
Richard.
Re: [PATCH]Partially fix PR61529, bound basic block frequency
Hi Renlin,

Are the incoming edge counts or probabilities insane in this case? I guess the patch is ok if we need to do this to handle those incoming insanities. But I can't approve patches myself.

However, this is a fix to code (r215739) committed after the ICEs in the original bug report and in comment 2 were reported, so I wonder if it is just hiding the original problem. Originally this was reported to be due to r210538 - ccing Dehao who was the author of that patch. Dehao, did you get a chance to look at this bug and see why your change triggered it? It is possible that Dehao's patch simply amplified an even further upstream profile insanity, but it would be good to confirm.

Thanks!
Teresa

On Wed, Oct 29, 2014 at 2:26 AM, Renlin Li renlin...@arm.com wrote:
> Hi all,
>
> This is a simple patch to fix the ICE in comment 2 of PR61529: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61529
>
> Bound checking code is added to make sure the frequency is within the legal range.
>
> As far as I have observed, the r215830 patch fixes the glibc building ICE. And this patch should fix the ICE while building the sample code in comment 2 using the aarch64-none-elf toolchain. With this, all the ICEs reported in this bug ticket should be fixed.
>
> x86_64-unknown-linux-gnu bootstrap and regression test have been done, no new issue. The aarch64-none-elf toolchain has been tested on the model. No new regression.
>
> Is this Okay for trunk?
>
> gcc/ChangeLog:
>
> 2014-10-29  Renlin Li  renlin...@arm.com
>
>     PR middle-end/61529
>     * tree-ssa-threadupdate.c (compute_path_counts): Bound path_in_freq.

--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
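The patch itself is not quoted in this mail, so the following is only a sketch of what "bounding" a frequency could look like, not the change to compute_path_counts. It assumes the legal range is 0 to BB_FREQ_MAX inclusive; the value used for BB_FREQ_MAX here and the helper name clamp_frequency are illustrative assumptions.

```c
/* Assumed upper bound for a basic block frequency; illustrative value.  */
#define BB_FREQ_MAX 10000

/* Hypothetical helper: force a computed frequency into [0, BB_FREQ_MAX].
   A wider input type is used so that an out-of-range intermediate
   result can be represented before it is clamped.  */
int
clamp_frequency (long freq)
{
  if (freq < 0)
    return 0;
  if (freq > BB_FREQ_MAX)
    return BB_FREQ_MAX;
  return (int) freq;
}
```

As the reply above points out, a clamp like this stops the ICE but does not explain where the out-of-range value came from, which is why the upstream profile data is still worth checking.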
Optimize powerpc*-*-linux* e500 hardfp/soft-fp use
Continuing the cleanups of libgcc soft-fp configuration for powerpc*-*-linux* in preparation for implementing TARGET_ATOMIC_ASSIGN_EXPAND_FENV for soft-float and e500, this patch optimizes the choice of which functions to build for the e500 cases. For e500v2, use of hardfp is generally right, except that calls to __unordsf2 and __unorddf2 are actually generated by GCC from __builtin_isunordered and so they need to be implemented with soft-fp to avoid recursively calling themselves. For e500v1, hardfp is right for SFmode (except for __unordsf2) but soft-fp for DFmode (and when using soft-fp, as usual it's best for the conversions between DFmode and integers all to come directly from soft-fp rather than some coming from libgcc2.c). Thus, new variables hardfp_exclusions and softfp_extras are added that configurations using t-hardfp and t-softfp can use to achieve the desired effect of selectively mixing the two sources of functions. Tested with no regressions for crosses to powerpc-linux-gnuspe (both e500v1 and e500v2); also checked that the same set of symbols and versions is exported from shared libgcc before and after the patch. OK to commit? 2014-10-29 Joseph Myers jos...@codesourcery.com * config/t-hardfp (hardfp_exclusions): Document new variable for user to define. (hardfp_func_list): Exclude functions from $(hardfp_exclusions). * config/t-softfp (softfp_extras): Document new variable for user to define. (softfp_func_list): Add functions from $(softfp_extras). * config/rs6000/t-e500v1-fp, config/rs6000/t-e500v2-fp: New files. * config.host (powerpc*-*-linux*): For e500v1, use rs6000/t-e500v1-fp and t-hardfp; do not use t-softfp-sfdf and t-softfp-excl. For e500v2, use t-hardfp-sfdf, rs6000/t-e500v2-fp and t-hardfp; do not use t-softfp-sfdf and t-softfp-excl. 
Index: libgcc/config/rs6000/t-e500v1-fp === --- libgcc/config/rs6000/t-e500v1-fp(revision 0) +++ libgcc/config/rs6000/t-e500v1-fp(working copy) @@ -0,0 +1,32 @@ +# Copyright (C) 2014 Free Software Foundation, Inc. + +# This file is part of GCC. + +# GCC is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. + +# GCC is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# http://www.gnu.org/licenses/. + +# Use hardfp.c for SFmode (except __unordsf2), soft-fp for DFmode. +# For SFmode, libgcc2.c functions are used where applicable; for +# DFmode, they are excluded. +hardfp_float_modes := sf +hardfp_int_modes := si +hardfp_extensions := +hardfp_truncations := +hardfp_exclusions := unordsf2 +softfp_float_modes := df +softfp_int_modes := si di +softfp_extensions := sfdf +softfp_truncations := dfsf +softfp_exclude_libgcc2 := n +softfp_extras := unordsf2 Index: libgcc/config/rs6000/t-e500v2-fp === --- libgcc/config/rs6000/t-e500v2-fp(revision 0) +++ libgcc/config/rs6000/t-e500v2-fp(working copy) @@ -0,0 +1,26 @@ +# Copyright (C) 2014 Free Software Foundation, Inc. + +# This file is part of GCC. + +# GCC is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. + +# GCC is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# http://www.gnu.org/licenses/. + +# All operations except __unordsf2 and __unorddf2 can come from hardfp.c. +hardfp_exclusions := unordsf2 unorddf2 +softfp_float_modes := +softfp_int_modes := +softfp_extensions := +softfp_truncations := +softfp_exclude_libgcc2 := n +softfp_extras := unordsf2 unorddf2 Index: libgcc/config/t-hardfp === --- libgcc/config/t-hardfp (revision 216787) +++ libgcc/config/t-hardfp (working copy) @@ -32,6 +32,13 @@ #e.g. sfdf # hardfp_truncations: a list of truncations between hardware floating-point # modes, e.g. dfsf +# +# If some functions that would otherwise be defined should not be +# defined by this file (typically because the target would
Re: Optimize powerpc*-*-linux* e500 hardfp/soft-fp use
On Wed, Oct 29, 2014 at 8:54 AM, Joseph S. Myers jos...@codesourcery.com wrote: Continuing the cleanups of libgcc soft-fp configuration for powerpc*-*-linux* in preparation for implementing TARGET_ATOMIC_ASSIGN_EXPAND_FENV for soft-float and e500, this patch optimizes the choice of which functions to build for the e500 cases. For e500v2, use of hardfp is generally right, except that calls to __unordsf2 and __unorddf2 are actually generated by GCC from __builtin_isunordered and so they need to be implemented with soft-fp to avoid recursively calling themselves. For e500v1, hardfp is right for SFmode (except for __unordsf2) but soft-fp for DFmode (and when using soft-fp, as usual it's best for the conversions between DFmode and integers all to come directly from soft-fp rather than some coming from libgcc2.c). Thus, new variables hardfp_exclusions and softfp_extras are added that configurations using t-hardfp and t-softfp can use to achieve the desired effect of selectively mixing the two sources of functions. Tested with no regressions for crosses to powerpc-linux-gnuspe (both e500v1 and e500v2); also checked that the same set of symbols and versions is exported from shared libgcc before and after the patch. OK to commit? 2014-10-29 Joseph Myers jos...@codesourcery.com * config/t-hardfp (hardfp_exclusions): Document new variable for user to define. (hardfp_func_list): Exclude functions from $(hardfp_exclusions). * config/t-softfp (softfp_extras): Document new variable for user to define. (softfp_func_list): Add functions from $(softfp_extras). * config/rs6000/t-e500v1-fp, config/rs6000/t-e500v2-fp: New files. * config.host (powerpc*-*-linux*): For e500v1, use rs6000/t-e500v1-fp and t-hardfp; do not use t-softfp-sfdf and t-softfp-excl. For e500v2, use t-hardfp-sfdf, rs6000/t-e500v2-fp and t-hardfp; do not use t-softfp-sfdf and t-softfp-excl. Okay. Thanks, David
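A note on why the __unord* functions are special, as a sketch rather than the libgcc soft-fp code. If such a function were written in terms of __builtin_isunordered, and the compiler expands that built-in into a call to __unordsf2, the function would end up calling itself. An implementation that only inspects the bit pattern has no such dependency. The names below are hypothetical, and the code assumes float is a 32-bit IEEE 754 single.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a float from its raw bit pattern.  */
float
sf_from_bits (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Return nonzero if X is a NaN, using only integer operations:
   exponent all ones and a non-zero fraction.  */
int
sf_is_nan_bits (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return (u & 0x7fffffffu) > 0x7f800000u;
}

/* Soft-fp style "unordered" test: true if either operand is a NaN.
   No floating-point comparison is performed, so the compiler has no
   reason to emit a call back into an unordered-compare routine.  */
int
unord_sf_soft (float a, float b)
{
  return sf_is_nan_bits (a) || sf_is_nan_bits (b);
}
```

For example, unord_sf_soft (sf_from_bits (0x7fc00000u), 1.0f) is nonzero because the first operand is a quiet NaN, while an infinity (0x7f800000) is ordered.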
[PATCH] AArch64: Add TARGET_SCHED_REASSOCIATION_WIDTH
This patch adds the TARGET_SCHED_REASSOCIATION_WIDTH hook. Separate settings for integer, floating point and vector modes are supported via the CPU tuning parameters. Setting the FP reassociation width to 4 improves FP performance on SPEC2000 by ~1.3%.

OK for commit?

ChangeLog:
2014-10-29  Wilco Dijkstra  wdijk...@arm.com

    * gcc/config/aarch64/aarch64-protos.h (tune_params): Add reassociation tuning parameters.
    * gcc/config/aarch64/aarch64.c (TARGET_SCHED_REASSOCIATION_WIDTH): Define.
    (aarch64_reassociation_width): New function.
    (generic_tunings): Add reassociation tuning parameters.
    (cortexa53_tunings): Likewise.
    (cortexa57_tunings): Likewise.
    (thunderx_tunings): Likewise.

---
 gcc/config/aarch64/aarch64-protos.h |  3 +++
 gcc/config/aarch64/aarch64.c        | 36 +++---
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 810644c..9c03f7b 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -170,6 +170,9 @@ struct tune_params
   const struct cpu_vector_cost *const vec_costs;
   const int memmov_cost;
   const int issue_rate;
+  const int int_reassoc_width;
+  const int fp_reassoc_width;
+  const int vec_reassoc_width;
 };

 HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e6cd5eb..4d67722 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -309,7 +309,10 @@ static const struct tune_params generic_tunings =
   &generic_regmove_cost,
   &generic_vector_cost,
   NAMED_PARAM (memmov_cost, 4),
-  NAMED_PARAM (issue_rate, 2)
+  NAMED_PARAM (issue_rate, 2),
+  1,  /* int_reassoc_width.  */
+  1,  /* fp_reassoc_width.  */
+  1   /* vec_reassoc_width.  */
 };

 static const struct tune_params cortexa53_tunings =
@@ -319,7 +322,10 @@ static const struct tune_params cortexa53_tunings =
   &cortexa53_regmove_cost,
   &generic_vector_cost,
   NAMED_PARAM (memmov_cost, 4),
-  NAMED_PARAM (issue_rate, 2)
+  NAMED_PARAM (issue_rate, 2),
+  1,  /* int_reassoc_width.  */
+  4,  /* fp_reassoc_width.  */
+  1   /* vec_reassoc_width.  */
 };

 static const struct tune_params cortexa57_tunings =
@@ -329,7 +335,10 @@ static const struct tune_params cortexa57_tunings =
   &cortexa57_regmove_cost,
   &cortexa57_vector_cost,
   NAMED_PARAM (memmov_cost, 4),
-  NAMED_PARAM (issue_rate, 3)
+  NAMED_PARAM (issue_rate, 3),
+  1,  /* int_reassoc_width.  */
+  4,  /* fp_reassoc_width.  */
+  1   /* vec_reassoc_width.  */
 };

 static const struct tune_params thunderx_tunings =
@@ -340,6 +349,9 @@ static const struct tune_params thunderx_tunings =
   &generic_vector_cost,
   NAMED_PARAM (memmov_cost, 6),
-  NAMED_PARAM (issue_rate, 2)
+  NAMED_PARAM (issue_rate, 2),
+  1,  /* int_reassoc_width.  */
+  4,  /* fp_reassoc_width.  */
+  1   /* vec_reassoc_width.  */
 };

 /* A processor implementing AArch64.  */
@@ -429,6 +441,19 @@ static const char * const aarch64_condition_codes[] =
   "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"
 };

+static int
+aarch64_reassociation_width (unsigned opc ATTRIBUTE_UNUSED,
+                             enum machine_mode mode)
+{
+  if (VECTOR_MODE_P (mode))
+    return aarch64_tune_params->vec_reassoc_width;
+  if (INTEGRAL_MODE_P (mode))
+    return aarch64_tune_params->int_reassoc_width;
+  if (FLOAT_MODE_P (mode))
+    return aarch64_tune_params->fp_reassoc_width;
+  return 1;
+}
+
 /* Provide a mapping from gcc register numbers to dwarf register numbers.  */
 unsigned
 aarch64_dbx_register_number (unsigned regno)
@@ -10147,6 +10172,9 @@ aarch64_asan_shadow_offset (void)
 #undef TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS aarch64_preferred_reload_class

+#undef TARGET_SCHED_REASSOCIATION_WIDTH
+#define TARGET_SCHED_REASSOCIATION_WIDTH aarch64_reassociation_width
+
 #undef TARGET_SECONDARY_RELOAD
 #define TARGET_SECONDARY_RELOAD aarch64_secondary_reload

--
1.9.1
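For readers unfamiliar with the hook, here is a small standalone sketch of the per-mode-class lookup the patch adds, plus an illustration of what a wider reassociation width allows. The enum, struct and function names are hypothetical stand-ins (the real code uses machine modes and the tune_params structure), and the tuning values simply mirror the cortexa57 entry quoted above.

```c
/* Hypothetical mode classes standing in for the VECTOR_MODE_P,
   INTEGRAL_MODE_P and FLOAT_MODE_P checks.  */
enum mode_class_sketch
{
  CLASS_INT,
  CLASS_FLOAT,
  CLASS_VECTOR,
  CLASS_OTHER
};

/* Hypothetical cut-down version of the three new tuning fields.  */
struct reassoc_tuning
{
  int int_reassoc_width;
  int fp_reassoc_width;
  int vec_reassoc_width;
};

/* Values mirroring the cortexa57 entry in the patch.  */
const struct reassoc_tuning example_tuning = { 1, 4, 1 };

/* Pick the width for a mode class; anything else falls back to 1,
   i.e. no parallel reassociation.  */
int
reassociation_width (const struct reassoc_tuning *t,
                     enum mode_class_sketch c)
{
  switch (c)
    {
    case CLASS_VECTOR:
      return t->vec_reassoc_width;
    case CLASS_INT:
      return t->int_reassoc_width;
    case CLASS_FLOAT:
      return t->fp_reassoc_width;
    default:
      return 1;
    }
}

/* Width 1: a serial dependency chain, each add waits for the last.  */
double
sum_chain (double a, double b, double c, double d)
{
  return ((a + b) + c) + d;
}

/* Wider width: the same sum as a tree, so (a + b) and (c + d) are
   independent and can be in flight at the same time.  */
double
sum_tree (double a, double b, double c, double d)
{
  return (a + b) + (c + d);
}
```

The two sums agree for exactly representable inputs such as 1, 2, 3 and 4, but floating-point addition is not associative in general, so the compiler only rewrites FP chains this way when it is permitted to reorder FP arithmetic.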
[match-and-simplify] fix segfault in parser::parse_for
genmatch segfaults if a user-defined operator is not specified, e.g.:

(for (oper1 oper2...) pattern)

    * genmatch.c (parser::parse_for): Call peek instead of peek_ident.

Thanks,
Prathamesh

Index: genmatch.c
===================================================================
--- genmatch.c (revision 216826)
+++ genmatch.c (working copy)
@@ -2953,8 +2953,8 @@

   while (1)
     {
-      token = peek_ident ();
-      if (token == 0)
+      token = peek ();
+      if (token->type != CPP_NAME)
         break;

       /* Insert the user defined operators into the operator hash.  */
Re: genmatch infinite loop during bootstrap on AIX
On Wed, Oct 29, 2014 at 8:26 AM, Richard Biener rguent...@suse.de wrote: On Fri, 24 Oct 2014, David Edelsohn wrote: genmatch is hanging when bootstrapping on AIX (gcc111). When I attach to the process: #0 0x1007efac in std::basic_stringchar, std::char_traitschar, std::allocatorchar ::basic_string () #1 0x1000e6b0 in _ZN6parser13parse_captureEP7operand (this=0x300594b8, op=0x0) at /home/dje/src/src/gcc/genmatch.c:2607 #2 0x1000e9f0 in _ZN6parser10parse_exprEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2669 #3 0x1000ee38 in _ZN6parser8parse_opEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2728 #4 0x1000efc4 in _ZN6parser14parse_simplifyEjR3vecIP8simplify7va_heap6vl_ptrEP12predicate_idP4expr (this=0x2ff20208, match_location=4614, simplifiers=..., matcher=0x0, result=0x0) at /home/dje/src/src/gcc/genmatch.c:2792 #5 0x100102fc in _ZN6parser13parse_patternEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:3052 #6 0x10010c0c in _ZN6parser9parse_forEj (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2991 #7 0x10010350 in _ZN6parser13parse_patternEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:3090 #8 0x1001122c in _ZN6parserC2EP10cpp_reader (this=0x2ff20208, r_=0x3003bbec) at /home/dje/src/src/gcc/genmatch.c:3122 #9 0x10004acc in main (argc=error reading variable, argv=error reading variable) at _start_ :3204 (I've re-built stage2 build/genmatch with -g, thus no optimization and debug info) Then I see a different frame #0 (std::allocatorchar::allocator()) and for frame #1 I see 0x100098b4 +160: stw r9,88(r31) 0x100098b8 +164: lwz r9,152(r31) 0x100098bc +168: lwz r30,12(r9) 0x100098c0 +172: addir9,r31,64 0x100098c4 +176: mr r3,r9 0x100098c8 +180: bl 0x100984dc _ZNSaIcEC1Ev = 0x100098cc +184: lwz r2,20(r1) while for _ZNSaIcEC1Ev there doesn't seem to be proper debug information (maybe I'm missing some tricks for that) even though stage1 libstdc++ was built with -g. The dissassembly of this (empty!) 
constructor looks completely weird though: (gdb) down #0 0x100984dc in std::allocatorchar::allocator() () (gdb) disassemble Dump of assembler code for function _ZNSaIcEC1Ev: = 0x100984dc +0: addir12,r2,-9528 0x100984e0 +4: stw r2,20(r1) 0x100984e4 +8: lwz r0,0(r12) 0x100984e8 +12:lwz r2,4(r12) 0x100984ec +16:mtctr r0 0x100984f0 +20:bctr 0x100984f4 +24:.long 0x0 0x100984f8 +28:.long 0xca000 0x100984fc +32:.long 0x0 0x10098500 +36:.long 0x18 End of assembler dump. bctr is the end of the function. It is an unconditional, indirect jump, likely a tail call. The instructions after the bctr are part of the function epilogue on AIX with information about the function, originally for AIX exception handling and stack walking, not used by GCC EH. 'bctr' seems to be a jump to $r0 (0x100984dc) here and all other instructions are fancy no-ops? I do see a long list of warnings at link time similar to ld: 0711-768 WARNING: Object /home/rguenth/obj/prev-powerpc-ibm-aix7.1.0.0/libst dc++-v3/src/.libs/libstdc++.a[libstdc++.so.6], section 1, function .std::time_ge twchar_t, std::istreambuf_iteratorwchar_t, std::char_traitswchar_t ::_M_e xtract_via_format(std::istreambuf_iteratorwchar_t, std::char_traitswchar_t , std::istreambuf_iteratorwchar_t, std::char_traitswchar_t , std::ios_base, std::_Ios_Iostate, tm*, wchar_t const*) const: The branch at address 0x10042638 is not followed by a recognized no-op or TOC-reload instruction. The unrecognized instruction is 0x4BFFFEBC. so maybe some weird PPC stuff is not set up correctly in libstdc++ so that the above function doesn't compute its return address correctly. Those warnings are normal. GCC is generating a tail call to a global function that it knows is in the same translation unit (binds_local_p). Depending on how one interprets SVR4 ABI, one should be able to interpose the call, which could call a function external to the TU. 
AIX (and PPC64 BE) require a nop instruction after calls to global functions that can be replaced with an instruction to restore the TOC (GOT) if the call is determined to reference a function in another TU at link-edit time. The instruction is not followed by the no-op, so the AIX linker complains. It's basically complaining that GCC is being too aggressive in optimization -- tail call to global function in same source file -- but it's not a bug. Maybe we only run into this because genmatch is the first and only generator program that actually uses libstdc++ and we don't do well using a libstdc++ built with -g only (and no optimization). This is after all the very first entry into libstdc++ (to an empty function). I am making the bootstrap continue by copying over stage1 genmatch. Let's see if stage3 fails the
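As an illustration of the linker check described above, this sketch classifies the instruction word that follows a call. The constants are assumptions for 32-bit AIX: 0x60000000 is the usual "ori 0,0,0" no-op and 0x80410014 encodes "lwz r2,20(r1)", the TOC reload visible after the bl in the disassembly earlier in the thread. The real linker may accept further no-op encodings, and the function names are made up for the example.

```c
#include <stdint.h>

/* Assumed encodings for 32-bit AIX (see the note above).  */
#define PPC_NOP_ORI       0x60000000u   /* ori 0,0,0      */
#define PPC_TOC_RELOAD32  0x80410014u   /* lwz r2,20(r1)  */

/* Nonzero if INSN is one of the two words this sketch recognizes as
   valid after a call to a global function.  */
int
is_nop_or_toc_reload (uint32_t insn)
{
  return insn == PPC_NOP_ORI || insn == PPC_TOC_RELOAD32;
}

/* Primary opcode: the top six bits of the instruction word.  */
unsigned
ppc_primary_opcode (uint32_t insn)
{
  return insn >> 26;
}

/* Signed byte displacement of an I-form branch (primary opcode 18):
   a 24-bit LI field shifted left by two, sign-extended from bit 25.  */
int32_t
ppc_branch_displacement (uint32_t insn)
{
  int32_t li = (int32_t) (insn & 0x03fffffcu);
  if (li & 0x02000000)
    li -= 0x04000000;
  return li;
}
```

Applied to the word from the warning, 0x4BFFFEBC has primary opcode 18 and a displacement of -324 bytes, i.e. it is an ordinary backward branch rather than a no-op or TOC reload. That matches the explanation that the code following the tail call is simply whatever comes next in the function.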
[PATCH] Improve spillcost of literal pool loads
This patch adjusts the spill cost of literal pool loads to reduce the chance of them being caller-saved (which is inefficient). Such loads should be rematerialized and thus should not include the cost of a spill store. Previously this was done only for constants for which legitimate_constant_p is true; however, it is the right thing to do for any constant, including constants in literal pools (which are typically not legitimate). Also use ALL_REGS rather than GENERAL_REGS, as ALL_REGS has the correct floating point register costs.

ChangeLog:
2014-10-29  Wilco Dijkstra  wdijk...@arm.com

    * gcc/ira-costs.c (scan_one_insn): Improve spill cost adjustment.

---
 gcc/ira-costs.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
index 122815b..c4a1934 100644
--- a/gcc/ira-costs.c
+++ b/gcc/ira-costs.c
@@ -1455,19 +1455,18 @@ scan_one_insn (rtx_insn *insn)
      mem_cost might result in it being loaded using the specialized
      instruction into a register, then stored into stack and loaded
      again from the stack.  See PR52208.
-
+
      Don't do this if SET_SRC (set) has side effect.  See PR56124.  */
   if (set != 0 && REG_P (SET_DEST (set)) && MEM_P (SET_SRC (set))
       && (note = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL_RTX
       && ((MEM_P (XEXP (note, 0))
            && !side_effects_p (SET_SRC (set)))
           || (CONSTANT_P (XEXP (note, 0))
-              && targetm.legitimate_constant_p (GET_MODE (SET_DEST (set)),
-                                                XEXP (note, 0))
+              && (! flag_pic || LEGITIMATE_PIC_OPERAND_P (XEXP (note, 0)))
               && REG_N_SETS (REGNO (SET_DEST (set))) == 1))
       && general_operand (SET_SRC (set), GET_MODE (SET_SRC (set))))
     {
-      enum reg_class cl = GENERAL_REGS;
+      enum reg_class cl = ALL_REGS;
       rtx reg = SET_DEST (set);
       int num = COST_INDEX (REGNO (reg));

--
1.9.1
[Patch, testsuite] [AArch64,ARM] support bswap tests on aarch64_be
Hi,

Following discussions after Thomas's patches improving bswap support (https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01279.html), I noticed that:
* the associated tests weren't executed on aarch64_be
* ARM targets older than v6 do not support the needed instructions.

The attached patch changes check_effective_target_bswap():
- accept aarch64*-*-* instead of aarch64-*-*
- when target is arm*-*-*, check __ARM_ARCH >= 6

2014-10-29  Christophe Lyon  christophe.l...@linaro.org

    * lib/target-supports.exp (check_effective_target_bswap): Update conditions for AArch64 and ARM targets.

OK?

Christophe.

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 4398345..80ff52d 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4892,9 +4892,8 @@ proc check_effective_target_bswap { } {
         verbose "check_effective_target_bswap: using cached result" 2
     } else {
         set et_bswap_saved 0
-        if { [istarget aarch64-*-*]
+        if { [istarget aarch64*-*-*]
              || [istarget alpha*-*-*]
-             || [istarget arm*-*-*]
              || [istarget i?86-*-*]
              || [istarget m68k-*-*]
              || [istarget powerpc*-*-*]
@@ -4902,6 +4901,16 @@ proc check_effective_target_bswap { } {
              || [istarget s390*-*-*]
              || [istarget x86_64-*-*] } {
             set et_bswap_saved 1
+        } else {
+            if { [istarget arm*-*-*]
+                 && [check_no_compiler_messages_nocache arm_v6_or_later object {
+                     #if __ARM_ARCH < 6
+                     #error not armv6 or later
+                     #endif
+                     int i;
+                 } ""] } {
+                set et_bswap_saved 1
+            }
         }
     }
Re: genmatch infinite loop during bootstrap on AIX
On Wed, Oct 29, 2014 at 2:10 PM, David Edelsohn dje@gmail.com wrote: On Wed, Oct 29, 2014 at 8:26 AM, Richard Biener rguent...@suse.de wrote: On Fri, 24 Oct 2014, David Edelsohn wrote: genmatch is hanging when bootstrapping on AIX (gcc111). When I attach to the process: #0 0x1007efac in std::basic_stringchar, std::char_traitschar, std::allocatorchar ::basic_string () #1 0x1000e6b0 in _ZN6parser13parse_captureEP7operand (this=0x300594b8, op=0x0) at /home/dje/src/src/gcc/genmatch.c:2607 #2 0x1000e9f0 in _ZN6parser10parse_exprEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2669 #3 0x1000ee38 in _ZN6parser8parse_opEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2728 #4 0x1000efc4 in _ZN6parser14parse_simplifyEjR3vecIP8simplify7va_heap6vl_ptrEP12predicate_idP4expr (this=0x2ff20208, match_location=4614, simplifiers=..., matcher=0x0, result=0x0) at /home/dje/src/src/gcc/genmatch.c:2792 #5 0x100102fc in _ZN6parser13parse_patternEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:3052 #6 0x10010c0c in _ZN6parser9parse_forEj (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:2991 #7 0x10010350 in _ZN6parser13parse_patternEv (this=0x2ff20208) at /home/dje/src/src/gcc/genmatch.c:3090 #8 0x1001122c in _ZN6parserC2EP10cpp_reader (this=0x2ff20208, r_=0x3003bbec) at /home/dje/src/src/gcc/genmatch.c:3122 #9 0x10004acc in main (argc=error reading variable, argv=error reading variable) at _start_ :3204 (I've re-built stage2 build/genmatch with -g, thus no optimization and debug info) Then I see a different frame #0 (std::allocatorchar::allocator()) and for frame #1 I see 0x100098b4 +160: stw r9,88(r31) 0x100098b8 +164: lwz r9,152(r31) 0x100098bc +168: lwz r30,12(r9) 0x100098c0 +172: addir9,r31,64 0x100098c4 +176: mr r3,r9 0x100098c8 +180: bl 0x100984dc _ZNSaIcEC1Ev = 0x100098cc +184: lwz r2,20(r1) while for _ZNSaIcEC1Ev there doesn't seem to be proper debug information (maybe I'm missing some tricks for that) even though stage1 libstdc++ was built with -g. 
The dissassembly of this (empty!) constructor looks completely weird though: (gdb) down #0 0x100984dc in std::allocatorchar::allocator() () (gdb) disassemble Dump of assembler code for function _ZNSaIcEC1Ev: = 0x100984dc +0: addir12,r2,-9528 0x100984e0 +4: stw r2,20(r1) 0x100984e4 +8: lwz r0,0(r12) 0x100984e8 +12:lwz r2,4(r12) 0x100984ec +16:mtctr r0 0x100984f0 +20:bctr 0x100984f4 +24:.long 0x0 0x100984f8 +28:.long 0xca000 0x100984fc +32:.long 0x0 0x10098500 +36:.long 0x18 End of assembler dump. bctr is the end of the function. It is an unconditional, indirect jump, likely a tail call. The instructions after the bctr are part of the function epilogue on AIX with information about the function, originally for AIX exception handling and stack walking, not used by GCC EH. 'bctr' seems to be a jump to $r0 (0x100984dc) here and all other instructions are fancy no-ops? I do see a long list of warnings at link time similar to ld: 0711-768 WARNING: Object /home/rguenth/obj/prev-powerpc-ibm-aix7.1.0.0/libst dc++-v3/src/.libs/libstdc++.a[libstdc++.so.6], section 1, function .std::time_ge twchar_t, std::istreambuf_iteratorwchar_t, std::char_traitswchar_t ::_M_e xtract_via_format(std::istreambuf_iteratorwchar_t, std::char_traitswchar_t , std::istreambuf_iteratorwchar_t, std::char_traitswchar_t , std::ios_base, std::_Ios_Iostate, tm*, wchar_t const*) const: The branch at address 0x10042638 is not followed by a recognized no-op or TOC-reload instruction. The unrecognized instruction is 0x4BFFFEBC. so maybe some weird PPC stuff is not set up correctly in libstdc++ so that the above function doesn't compute its return address correctly. Those warnings are normal. GCC is generating a tail call to a global function that it knows is in the same translation unit (binds_local_p). Depending on how one interprets SVR4 ABI, one should be able to interpose the call, which could call a function external to the TU. 
AIX (and PPC64 BE) require a nop instruction after calls to global functions that can be replaced with an instruction to restore the TOC (GOT) if the call is determined to reference a function in another TU at link-edit time. The instruction is not followed by the no-op, so the AIX linker complains. It's basically complaining that GCC is being too aggressive in optimization -- tail call to global function in same source file -- but it's not a bug. Maybe we only run into this because genmatch is the first and only generator program that actually uses libstdc++ and we don't do well using a libstdc++ built with -g only (and no optimization). This is after all the very first entry into libstdc++ (to an empty function). I am
RE: [PATCH] Fix PR63259: bswap not recognized when finishing with rotation
From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Wednesday, October 08, 2014 8:27 AM I wouldn't worry about that too much. Indeed the question would be what should be canonical on GIMPLE (expanders should choose the optimal vairant from both). I think a tree code should be always prefered to a builtin function call - which means a rotate is more canonical than a bswap16 call. Below is the updated patch. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2014-10-29 Thomas Preud'homme thomas.preudho...@arm.com PR tree-optimization/63259 * tree-ssa-math-opts.c (bswap_replace): Replace expression by a rotation left if it is a 16 bit byte swap. (pass_optimize_bswap::execute): Also consider bswap in LROTATE_EXPR and RROTATE_EXPR statements if it is a byte rotation. *** gcc/testsuite/ChangeLog *** 2014-10-29 Thomas Preud'homme thomas.preudho...@arm.com PR tree-optimization/63259 * optimize-bswapsi-1.c (swap32_f): New bswap pass test. * optimize-bswaphi-1.c: Drop useless SIType definition and fix typo in following comment. diff --git a/gcc/testsuite/gcc.dg/optimize-bswaphi-1.c b/gcc/testsuite/gcc.dg/optimize-bswaphi-1.c index 18aba28..692fceb 100644 --- a/gcc/testsuite/gcc.dg/optimize-bswaphi-1.c +++ b/gcc/testsuite/gcc.dg/optimize-bswaphi-1.c @@ -42,11 +42,10 @@ uint32_t read_be16_3 (unsigned char *data) return *(data + 1) | (*data 8); } -typedef int SItype __attribute__ ((mode (SI))); typedef int HItype __attribute__ ((mode (HI))); /* Test that detection of significant sign extension works correctly. This - checks that unknown byte marker are set correctly in cast of cast. */ + checks that unknown byte markers are set correctly in cast of cast. 
*/ HItype swap16 (HItype in) diff --git a/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c b/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c index cfde218..ad3ede4 100644 --- a/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c +++ b/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c @@ -78,5 +78,16 @@ swap32_e (SItype in) | (((in 24) 0xFF) 0); } -/* { dg-final { scan-tree-dump-times 32 bit bswap implementation found at 5 bswap } } */ +/* This variant comes from PR63259. It compiles to a gimple sequence that ends + with a rotation instead of a bitwise OR. */ + +unsigned +swap32_f (unsigned in) +{ + in = ((in 0xff00ff00) 8) | ((in 0x00ff00ff) 8); + in = ((in 0x) 16) | ((in 0x) 16); + return in; +} + +/* { dg-final { scan-tree-dump-times 32 bit bswap implementation found at 6 bswap } } */ /* { dg-final { cleanup-tree-dump bswap } } */ diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index e0f2924..5b656e0 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -2187,7 +2187,7 @@ bswap_replace (gimple cur_stmt, gimple_stmt_iterator gsi, gimple src_stmt, struct symbolic_number *n, bool bswap) { tree src, tmp, tgt; - gimple call; + gimple bswap_stmt; src = gimple_assign_rhs1 (src_stmt); tgt = gimple_assign_lhs (cur_stmt); @@ -2293,16 +2293,28 @@ bswap_replace (gimple cur_stmt, gimple_stmt_iterator gsi, gimple src_stmt, tmp = src; - /* Convert the src expression if necessary. */ - if (!useless_type_conversion_p (TREE_TYPE (tmp), bswap_type)) + /* Canonical form for 16 bit bswap is a rotate expression. 
*/ + if (bswap n-range == 16) { - gimple convert_stmt; - tmp = make_temp_ssa_name (bswap_type, NULL, bswapsrc); - convert_stmt = gimple_build_assign_with_ops (NOP_EXPR, tmp, src, NULL); - gsi_insert_before (gsi, convert_stmt, GSI_SAME_STMT); + tree count = build_int_cst (NULL, BITS_PER_UNIT); + bswap_type = TREE_TYPE (src); + src = fold_build2 (LROTATE_EXPR, bswap_type, src, count); + bswap_stmt = gimple_build_assign (NULL, src); } + else +{ + /* Convert the src expression if necessary. */ + if (!useless_type_conversion_p (TREE_TYPE (tmp), bswap_type)) + { + gimple convert_stmt; + tmp = make_temp_ssa_name (bswap_type, NULL, bswapsrc); + convert_stmt = gimple_build_assign_with_ops (NOP_EXPR, tmp, src, + NULL); + gsi_insert_before (gsi, convert_stmt, GSI_SAME_STMT); + } - call = gimple_build_call (fndecl, 1, tmp); + bswap_stmt = gimple_build_call (fndecl, 1, tmp); +} tmp = tgt; @@ -2315,7 +2327,7 @@ bswap_replace (gimple cur_stmt, gimple_stmt_iterator gsi, gimple src_stmt, gsi_insert_after (gsi, convert_stmt, GSI_SAME_STMT); } - gimple_call_set_lhs (call, tmp); + gimple_set_lhs (bswap_stmt, tmp); if (dump_file) { @@ -2324,7 +2336,7 @@ bswap_replace (gimple cur_stmt, gimple_stmt_iterator gsi, gimple src_stmt, print_gimple_stmt (dump_file, cur_stmt, 0, 0); } - gsi_insert_after (gsi, call, GSI_SAME_STMT); + gsi_insert_after (gsi,
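The idea behind the patch (a 16 bit byte swap is the same thing as a rotation by 8 bits, and a 32 bit byte swap can end in a rotation by 16 bits) can be checked with a small standalone sketch. This is not code from the patch: the helper names are made up for illustration, and the masks in swap32_rotate are an assumption modelled on the swap32_f test, whose constants are garbled above.

```cpp
#include <cassert>
#include <cstdint>

// Reference 16-bit byte swap: pick the two bytes apart and exchange them.
constexpr std::uint16_t bswap16_bytes(std::uint16_t x) {
    return static_cast<std::uint16_t>(((x & 0xffu) << 8) | ((x >> 8) & 0xffu));
}

// Rotate a 16-bit value left by n bits (0 < n < 16).
constexpr std::uint16_t rotl16(std::uint16_t x, unsigned n) {
    return static_cast<std::uint16_t>(
        ((static_cast<unsigned>(x) << n) | (static_cast<unsigned>(x) >> (16u - n)))
        & 0xffffu);
}

// Rotate a 32-bit value left by n bits (0 < n < 32).
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n) {
    return (x << n) | (x >> (32u - n));
}

// Reference 32-bit byte swap written as four shifted bytes ORed together.
constexpr std::uint32_t bswap32_reference(std::uint32_t x) {
    return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8)
         | ((x & 0x00ff0000u) >> 8) | ((x & 0xff000000u) >> 24);
}

// 32-bit byte swap whose last step is a rotation instead of a bitwise OR:
// first swap the bytes inside each half-word, then rotate the halves.
// The masks are assumed, not copied from the test case.
constexpr std::uint32_t swap32_rotate(std::uint32_t in) {
    return rotl32(((in & 0xff00ff00u) >> 8) | ((in & 0x00ff00ffu) << 8), 16);
}

// Exhaustive check for 16 bits plus one spot check for 32 bits.
void check_bswap_rotate() {
    for (unsigned v = 0; v <= 0xffffu; ++v) {
        const auto x = static_cast<std::uint16_t>(v);
        assert(rotl16(x, 8) == bswap16_bytes(x));
    }
    assert(swap32_rotate(0x11223344u) == 0x44332211u);
}
```

Calling check_bswap_rotate() from a test driver exercises every 16-bit input, which is why a rotate can serve as the canonical form without losing any case.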
Re: [match-and-simplify] fix segfault in parser::parse_for
On Wed, Oct 29, 2014 at 2:01 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote:
> genmatch segfaults if a user-defined operator is not specified,
> e.g.:
> (for (oper1 oper2...)
>   pattern)
>
> * genmatch.c (parser::parse_for): Call peek instead of peek_ident.

Thanks - applied.

Richard.

> Thanks,
> Prathamesh
RE: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)
Hello,

This is a new version of the patch in which the reported issue is fixed.
The patch is also rebased to revision 216452 and some minor code clean-up
is done.

--

Lowering is applied only to bit-field copy sequences that are merged.
The data structure representing bit-field copy sequences is renamed and
reduced in size.
The optimization is turned on by default for -O2 and higher.
Some comments are fixed.

Benchmarking was performed on WebKit for Android.
Code size reduction was noticed on several files; the best examples are:

  core/rendering/style/StyleMultiColData (632 -> 520 bytes)
  core/platform/graphics/FontDescription (1715 -> 1475 bytes)
  core/rendering/style/FillLayer (5069 -> 4513 bytes)
  core/rendering/style/StyleRareInheritedData (5618 -> 5346 bytes)
  core/css/CSSSelectorList (4047 -> 3887 bytes)
  core/platform/animation/CSSAnimationData (3844 -> 3440 bytes)
  core/css/resolver/FontBuilder (13818 -> 13350 bytes)
  core/platform/graphics/Font (16447 -> 15975 bytes)

Example:

One of the motivating examples for this work was the copy constructor of
a class which contains bit-fields.

C++ code:

  class A
  {
  public:
    A(const A &x);
    unsigned a : 1;
    unsigned b : 2;
    unsigned c : 4;
  };

  A::A(const A &x)
  {
    a = x.a;
    b = x.b;
    c = x.c;
  }

GIMPLE code without optimization:

  <bb 2>:
  _3 = x_2(D)->a;
  this_4(D)->a = _3;
  _6 = x_2(D)->b;
  this_4(D)->b = _6;
  _8 = x_2(D)->c;
  this_4(D)->c = _8;
  return;

Optimized GIMPLE code:

  <bb 2>:
  _10 = x_2(D)->D.1867;
  _11 = BIT_FIELD_REF <_10, 7, 0>;
  _12 = this_4(D)->D.1867;
  _13 = _12 & 128;
  _14 = (unsigned char) _11;
  _15 = _13 | _14;
  this_4(D)->D.1867 = _15;
  return;

Generated MIPS32r2 assembly code without optimization:

        lw      $3,0($5)
        lbu     $2,0($4)
        andi    $3,$3,0x1
        andi    $2,$2,0xfe
        or      $2,$2,$3
        sb      $2,0($4)
        lw      $3,0($5)
        andi    $2,$2,0xf9
        andi    $3,$3,0x6
        or      $2,$2,$3
        sb      $2,0($4)
        lw      $3,0($5)
        andi    $2,$2,0x87
        andi    $3,$3,0x78
        or      $2,$2,$3
        j       $31
        sb      $2,0($4)

Optimized MIPS32r2 assembly code:

        lw      $3,0($5)
        lbu     $2,0($4)
        andi    $3,$3,0x7f
        andi    $2,$2,0x80
        or      $2,$3,$2
        j       $31
        sb      $2,0($4)

The algorithm works at the basic block level and consists of the
following three major steps:

1. Walk the statement list of the basic block. For each pair of
   statements that copies bit-field content from one memory location to
   another, record the statement pointers and other necessary data in
   the corresponding data structure.
2. Identify records that represent adjacent bit-field accesses and mark
   them as merged.
3. Lower the bit-field accesses that can be merged, using the new
   (merged) field size.

A new command line option, -fmerge-bitfields, is introduced.

Tested - passed gcc regression tests for MIPS32r2.

Changelog -

gcc/ChangeLog:
2014-04-22  Zoran Jovanovic  (zoran.jovano...@imgtec.com)

    * common.opt (fmerge-bitfields): New option.
    * doc/invoke.texi: Add reference to -fmerge-bitfields.
    * doc/invoke.texi: Add -fmerge-bitfields to the list of optimization
    flags turned on at -O2.
    * tree-sra.c (lower_bitfields): New function.
    Entry for (-fmerge-bitfields).
    (part_of_union_p): New function.
    (bf_access_candidate_p): New function.
    (lower_bitfield_read): New function.
    (lower_bitfield_write): New function.
    (bitfield_stmt_bfcopy_pair::hash): New function.
    (bitfield_stmt_bfcopy_pair::equal): New function.
    (bitfield_stmt_bfcopy_pair::remove): New function.
    (create_and_insert_bfcopy): New function.
    (get_bit_offset): New function.
    (add_stmt_bfcopy_pair): New function.
    (cmp_bfcopies): New function.
    (get_merged_bit_field_size): New function.
    * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
    (field_byte_offset): Move declaration to tree.h and make it extern.
    * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
    * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
    * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
    * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h.
    * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c.
    (simple_type_size_in_bits): Move from dwarf2out.c.
    * tree.h (expressions_equal_p): Add declaration.
    (field_byte_offset): Add declaration.
Patch - diff --git a/gcc/common.opt b/gcc/common.opt index 5db5e1e..cec145c 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2270,6 +2270,10 @@ ftree-sra Common Report Var(flag_tree_sra) Optimization Perform scalar replacement of aggregates +fmerge-bitfields +Common Report Var(flag_tree_bitfield_merge) Optimization +Merge loads and stores of consecutive bitfields + ftree-ter Common Report Var(flag_tree_ter) Optimization Replace temporary expressions in the SSA-normal pass diff --git
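The effect of merging the three adjacent bit-field copies from the motivating example can be modelled with plain masks on a single byte. The sketch below is only an illustration under an assumed layout (a in bit 0, b in bits 1-2, c in bits 3-6, bit 7 owned by something else), which matches the masks visible in the MIPS listings; the function names are hypothetical and none of this is taken from the patch itself.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of the byte that holds the three bit-fields.
constexpr std::uint8_t kMaskA = 0x01;  // unsigned a : 1
constexpr std::uint8_t kMaskB = 0x06;  // unsigned b : 2
constexpr std::uint8_t kMaskC = 0x78;  // unsigned c : 4
constexpr std::uint8_t kMaskMerged = kMaskA | kMaskB | kMaskC;  // 0x7f

// Copy one field: clear it in dst, then OR in the matching bits of src.
constexpr std::uint8_t copy_field(std::uint8_t dst, std::uint8_t src,
                                  std::uint8_t mask) {
    return static_cast<std::uint8_t>((dst & ~mask) | (src & mask));
}

// Unmerged form: three separate read-modify-write steps, one per field.
constexpr std::uint8_t copy_fields_separately(std::uint8_t dst, std::uint8_t src) {
    return copy_field(copy_field(copy_field(dst, src, kMaskA), src, kMaskB),
                      src, kMaskC);
}

// Merged form: a single step that treats the three fields as one 7-bit field.
constexpr std::uint8_t copy_fields_merged(std::uint8_t dst, std::uint8_t src) {
    return copy_field(dst, src, kMaskMerged);
}

// Compare both forms for every possible destination/source byte pair.
void check_bitfield_merge() {
    for (unsigned d = 0; d < 256; ++d)
        for (unsigned s = 0; s < 256; ++s)
            assert(copy_fields_separately(static_cast<std::uint8_t>(d),
                                          static_cast<std::uint8_t>(s))
                   == copy_fields_merged(static_cast<std::uint8_t>(d),
                                         static_cast<std::uint8_t>(s)));
}
```

The merged form needs one load, two masks, one OR and one store, which is the shape of the optimized GIMPLE and assembly quoted in the message.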
Re: [PATCH, IPA ICF] Fix PR63664, PR63574 (segfault in ipa-icf pass)
On 29 Oct 10:34, Richard Biener wrote:
> On Tue, Oct 28, 2014 at 5:14 PM, Ilya Enkovich enkovich@gmail.com wrote:
> > Hi,
> >
> > This patch fixes PR63664 and PR63574. The problem is that NULL types
> > for labels are not handled properly by ICF. I assume it is OK for
> > labels to have a NULL type, so I added a check to ICF rather than
> > fixing label generation.
> >
> > Bootstrapped and checked on linux-x86_64. OK for trunk?
>
> Instead it shouldn't be called for labels.
>
> Richard.

Here is a version which doesn't compare types for labels. Is it OK?

Bootstrapped and checked on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-10-29  Ilya Enkovich  ilya.enkov...@intel.com

    PR ipa/63664
    PR bootstrap/63574
    * ipa-icf-gimple.c (func_checker::compatible_types_p): Assert for
    null args.
    (func_checker::compare_operand): Don't compare types for labels.

gcc/testsuite/

2014-10-29  Ilya Enkovich  ilya.enkov...@intel.com

    PR ipa/63664
    * gcc.dg/ipa/pr63664.C: New.

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 1369b74..094e8ab 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -169,6 +169,9 @@ bool
 func_checker::compatible_types_p (tree t1, tree t2,
                                   bool compare_polymorphic,
                                   bool first_argument)
 {
+  gcc_assert (t1);
+  gcc_assert (t2);
+
   if (TREE_CODE (t1) != TREE_CODE (t2))
     return return_false_with_msg ("different tree types");
 
@@ -214,11 +217,15 @@ func_checker::compare_operand (tree t1, tree t2)
   else if (!t1 || !t2)
     return false;
 
-  tree tt1 = TREE_TYPE (t1);
-  tree tt2 = TREE_TYPE (t2);
+  if (TREE_CODE (t1) != LABEL_DECL
+      && TREE_CODE (t2) != LABEL_DECL)
+    {
+      tree tt1 = TREE_TYPE (t1);
+      tree tt2 = TREE_TYPE (t2);
 
-  if (!func_checker::compatible_types_p (tt1, tt2))
-    return false;
+      if (!func_checker::compatible_types_p (tt1, tt2))
+        return false;
+    }
 
   base1 = get_addr_base_and_unit_offset (t1, &offset1);
   base2 = get_addr_base_and_unit_offset (t2, &offset2);
diff --git a/gcc/testsuite/gcc.dg/ipa/pr63664.C b/gcc/testsuite/gcc.dg/ipa/pr63664.C
new file mode 100644
index 000..31d96d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr63664.C
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+class test {
+ public:
+  test (int val, int *p)
+    {
+      int_val = *p;
+      bool_val = (val != int_val);
+    }
+
+  ~test ()
+    {
+      if (!bool_val)
+        return;
+    }
+
+  int get_int_val () const { return int_val; }
+
+ private:
+  bool bool_val;
+  int int_val;
+};
+
+static int __attribute__ ((noinline))
+f1 (int i, int *p)
+{
+  test obj (i, p);
+  return obj.get_int_val ();
+}
+
+static int __attribute__ ((noinline))
+f2 (int i, int *p)
+{
+  test obj (i, p);
+  return obj.get_int_val ();
+}
+
+int
+f (int i, int *p)
+{
+  return f1 (i, p) + f2 (i, p);
+}
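The shape of the fix (assert on null types in the type comparison, and skip that comparison entirely when either operand is a label) can be shown with a toy model. The sketch below does not use GCC's tree API; Kind, Type, Operand and both functions are hypothetical stand-ins chosen for illustration, and the final kind comparison is a placeholder for the rest of the real operand comparison.

```cpp
#include <cassert>

// Toy stand-ins for tree codes and types; not GCC data structures.
enum class Kind { Var, Label };

struct Type {
    int id;
};

struct Operand {
    Kind kind;
    const Type *type;  // labels carry no type in this model
};

// Type comparison: a null type here is treated as a caller bug,
// mirroring the gcc_assert calls added by the patch.
bool compatible_types(const Type *t1, const Type *t2) {
    assert(t1 != nullptr);
    assert(t2 != nullptr);
    return t1->id == t2->id;
}

// Operand comparison: types are only inspected when neither side is a label.
bool compare_operand(const Operand *o1, const Operand *o2) {
    if (!o1 && !o2)
        return true;
    if (!o1 || !o2)
        return false;

    if (o1->kind != Kind::Label && o2->kind != Kind::Label) {
        if (!compatible_types(o1->type, o2->type))
            return false;
    }

    // Placeholder for the remaining structural comparison.
    return o1->kind == o2->kind;
}
```

With this guard, two label operands with null types compare without ever reaching the asserting type check, which is the failure mode the PRs describe.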
Re: [PATCH, IPA ICF] Fix PR63664, PR63574 (segfault in ipa-icf pass)
On 10/29/2014 02:45 PM, Ilya Enkovich wrote:
> On 29 Oct 10:34, Richard Biener wrote:
> > On Tue, Oct 28, 2014 at 5:14 PM, Ilya Enkovich enkovich@gmail.com wrote:
> > > Hi,
> > >
> > > This patch fixes PR63664 and PR63574. The problem is that NULL types
> > > for labels are not handled properly by ICF. I assume it is OK for
> > > labels to have a NULL type, so I added a check to ICF rather than
> > > fixing label generation.
> > >
> > > Bootstrapped and checked on linux-x86_64. OK for trunk?
> >
> > Instead it shouldn't be called for labels.
> >
> > Richard.
>
> Here is a version which doesn't compare types for labels. Is it OK?

Hello.

I have just been testing a patch where the pass does not call
compare_operand for gimple labels. As the pass creates a mapping between
labels and basic blocks, such a comparison will not be necessary.

Thanks,
Martin

> Bootstrapped and checked on linux-x86_64.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2014-10-29  Ilya Enkovich  ilya.enkov...@intel.com
>
>     PR ipa/63664
>     PR bootstrap/63574
>     * ipa-icf-gimple.c (func_checker::compatible_types_p): Assert for
>     null args.
>     (func_checker::compare_operand): Don't compare types for labels.
>
> gcc/testsuite/
>
> 2014-10-29  Ilya Enkovich  ilya.enkov...@intel.com
>
>     PR ipa/63664
>     * gcc.dg/ipa/pr63664.C: New.
[gimple-classes, committed 3/3] Strengthen remaining gimple_try_ accessors to require a gtry *
gcc/ChangeLog.gimple-classes: * gimple.h (gimple_try_kind): Strengthen param from const_gimple to const gtry *. (gimple_try_catch_is_cleanup): Likewise. (gimple_try_eval_ptr): Strengthen param from gimple gtry *. (gimple_try_eval): Likewise. (gimple_try_cleanup_ptr): Likewise. (gimple_try_cleanup): Likewise. * gimple-low.c (lower_stmt): Introduce new local gtry *try_stmt via a checked cast, using it in place of stmt for typesafety. (lower_try_catch): Strengthen local stmt from gimple to gtry *. (gimple_stmt_may_fallthru): Within case GIMPLE_TRY, introduce new local gtry *try_stmt via a checked cast, using it in place of stmt for typesafety. * gimple-walk.c (walk_gimple_stmt): Likewise. * omp-low.c (lower_omp_1): Likewise. * tree-cfg.c (verify_gimple_in_seq_2): Likewise. (do_warn_unused_result): Likewise. * tree-eh.c (collect_finally_tree): Likewise. (replace_goto_queue_1): Likewise. (honor_protect_cleanup_actions): Replace check against GIMPLE_TRY with a dyn_cast gtry *, introducing new local try_stmt, using it in place of stmt. (optimize_double_finally): Strengthen local oneh from gimple to gtry *, via a dyn_cast. (refactor_eh_r): Within case GIMPLE_TRY, introduce new local gtry *try_stmt via a checked cast, using it in place of stmt for typesafety. * tree-inline.c (remap_gimple_stmt): Likewise. (estimate_num_insns): Likewise. --- gcc/ChangeLog.gimple-classes | 32 ++ gcc/gimple-low.c | 78 gcc/gimple-walk.c| 21 +++- gcc/gimple.h | 19 +-- gcc/omp-low.c| 7 ++-- gcc/tree-cfg.c | 14 +--- gcc/tree-eh.c| 67 + gcc/tree-inline.c| 17 +++--- 8 files changed, 161 insertions(+), 94 deletions(-) diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes index 50b87b9..10c3957 100644 --- a/gcc/ChangeLog.gimple-classes +++ b/gcc/ChangeLog.gimple-classes @@ -1,5 +1,37 @@ 2014-10-28 David Malcolm dmalc...@redhat.com + * gimple.h (gimple_try_kind): Strengthen param from const_gimple + to const gtry *. + (gimple_try_catch_is_cleanup): Likewise. 
+ (gimple_try_eval_ptr): Strengthen param from gimple gtry *. + (gimple_try_eval): Likewise. + (gimple_try_cleanup_ptr): Likewise. + (gimple_try_cleanup): Likewise. + * gimple-low.c (lower_stmt): Introduce new local gtry *try_stmt + via a checked cast, using it in place of stmt for typesafety. + (lower_try_catch): Strengthen local stmt from gimple to gtry *. + (gimple_stmt_may_fallthru): Within case GIMPLE_TRY, introduce new + local gtry *try_stmt via a checked cast, using it in place of stmt + for typesafety. + * gimple-walk.c (walk_gimple_stmt): Likewise. + * omp-low.c (lower_omp_1): Likewise. + * tree-cfg.c (verify_gimple_in_seq_2): Likewise. + (do_warn_unused_result): Likewise. + * tree-eh.c (collect_finally_tree): Likewise. + (replace_goto_queue_1): Likewise. + (honor_protect_cleanup_actions): Replace check against GIMPLE_TRY + with a dyn_cast gtry *, introducing new local try_stmt, using it + in place of stmt. + (optimize_double_finally): Strengthen local oneh from gimple to + gtry *, via a dyn_cast. + (refactor_eh_r): Within case GIMPLE_TRY, introduce new local + gtry *try_stmt via a checked cast, using it in place of stmt for + typesafety. + * tree-inline.c (remap_gimple_stmt): Likewise. + (estimate_num_insns): Likewise. + +2014-10-28 David Malcolm dmalc...@redhat.com + * gimple.h (gimple_goto_dest): Strengthen param from const_gimple to const ggoto *. * cfgexpand.c (expand_gimple_stmt_1): Add checked cast to ggoto * diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c index ab191a0..0aec000 100644 --- a/gcc/gimple-low.c +++ b/gcc/gimple-low.c @@ -271,27 +271,30 @@ lower_stmt (gimple_stmt_iterator *gsi, struct lower_data *data) return; case GIMPLE_TRY: - if (gimple_try_kind (stmt) == GIMPLE_TRY_CATCH) - lower_try_catch (gsi, data); - else - { - /* It must be a GIMPLE_TRY_FINALLY. 
*/ - bool cannot_fallthru; - lower_sequence (gimple_try_eval_ptr (stmt), data); - cannot_fallthru = data-cannot_fallthru; - - /* The finally clause is always executed after the try clause, -so if it does not fall through, then the try-finally will not -fall through. Otherwise, if the try clause does not fall -through, then when the finally clause falls through it will -resume execution wherever the try clause
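The pattern this series applies (accessors take the specific statement subclass, and callers obtain it through either a checked cast or a dyn_cast) can be sketched outside of GCC as follows. This is a simplified model, not gimple.h: the class names are invented, and unlike GCC's as_a <gtry *> (stmt) spelling, the template argument here is the class rather than a pointer type.

```cpp
#include <cassert>

// Toy statement hierarchy with a code tag, loosely modelled on gimple codes.
enum class Code { Try, Goto };

struct Stmt {
    Code code;
    explicit Stmt(Code c) : code(c) {}
};

struct TryStmt : Stmt {
    int kind;
    explicit TryStmt(int k) : Stmt(Code::Try), kind(k) {}
};

struct GotoStmt : Stmt {
    GotoStmt() : Stmt(Code::Goto) {}
};

// One tag test per subclass.
template <typename T> bool is_a(const Stmt *s);
template <> bool is_a<TryStmt>(const Stmt *s) { return s->code == Code::Try; }
template <> bool is_a<GotoStmt>(const Stmt *s) { return s->code == Code::Goto; }

// Checked cast: the caller already knows the code; a mismatch is a bug.
template <typename T> T *as_a(Stmt *s) {
    assert(is_a<T>(s));
    return static_cast<T *>(s);
}

// Dynamic cast: returns null when the statement has a different code.
template <typename T> T *dyn_cast(Stmt *s) {
    return is_a<T>(s) ? static_cast<T *>(s) : nullptr;
}

// "Strengthened" accessor: it cannot be called with an arbitrary Stmt.
int try_kind(const TryStmt *t) { return t->kind; }

// Inside a switch on the code, a checked cast yields the typed local.
int describe(Stmt *s) {
    switch (s->code) {
    case Code::Try: {
        TryStmt *try_stmt = as_a<TryStmt>(s);
        return try_kind(try_stmt);
    }
    case Code::Goto:
        return -1;
    }
    return -2;
}
```

Passing a GotoStmt to try_kind is now a compile-time error rather than a runtime check, which is the typesafety gain the ChangeLog entries refer to.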
[gimple-classes, committed 0/3] More accessor typesafety
I've pushed the following three patches to the git branch
"dmalcolm/gimple-classes".

Successfully bootstrapped & regrtested the combination of the three
patches on x86_64-unknown-linux-gnu (Fedora 20) - same results relative
to an unpatched control bootstrap of trunk's r216746.

David Malcolm (3):
  Strengthen params of all gimple_wce_ accessors
  Make gimple_goto_dest require a const ggoto *
  Strengthen remaining gimple_try_ accessors to require a gtry *

 gcc/ChangeLog.gimple-classes | 106 +++
 gcc/cfgexpand.c              |   2 +-
 gcc/doc/gimple.texi          |   2 +-
 gcc/gimple-low.c             |  78 ---
 gcc/gimple-walk.c            |  28 +++-
 gcc/gimple.c                 |   8 ++--
 gcc/gimple.h                 |  47 ---
 gcc/gimplify.c               |   6 +--
 gcc/gsstruct.def             |   2 +-
 gcc/ipa-icf-gimple.c         |   7 +--
 gcc/ipa-icf-gimple.h         |   4 +-
 gcc/omp-low.c                |   9 ++--
 gcc/tree-cfg.c               |  30
 gcc/tree-cfgcleanup.c        |   9 ++--
 gcc/tree-eh.c                |  82 +++--
 gcc/tree-inline.c            |  22 ++---
 gcc/tree-nested.c            |   4 +-
 gcc/tree-ssa-dom.c           |  12 ++---
 gcc/tree-ssa-sccvn.c         |   2 +-
 gcc/tree-ssa-threadedge.c    |   2 +-
 gcc/tree-ssa-threadupdate.c  |   2 +-
 21 files changed, 305 insertions(+), 159 deletions(-)

--
1.7.11.7
[gimple-classes, committed 1/3] Strengthen params of all gimple_wce_ accessors
gcc/ChangeLog.gimple-classes: * doc/gimple.texi (Class hierarchy of GIMPLE statements): Update for renaming of gimple_statement_wce to gwce. * gimple-walk.c (walk_gimple_stmt): Add checked cast to gwce * within case GIMPLE_WITH_CLEANUP_EXPR. * gimple.c (gimple_build_wce): Strengthen return type and local p from gimple to gwce *. (gimple_copy): Add checked casts to gwce * within case GIMPLE_WITH_CLEANUP_EXPR. * gimple.h (struct gimple_statement_wce): Rename to... (struct gwce): ...this. (is_a_helper gimple_statement_wce *::test): Rename to... (is_a_helper gwce *::test): ...this. (gimple_build_wce): Strengthen return type from gimple to gwce *. (gimple_wce_cleanup_ptr): Strengthen param from gimple to gwce *. (gimple_wce_cleanup): Likewise. (gimple_wce_set_cleanup): Likewise. (gimple_wce_cleanup_eh_only): Strengthen param from const_gimple to const gwce *. (gimple_wce_set_cleanup_eh_only): Strengthen param from gimple to gwce *. * gimplify.c (gimplify_cleanup_point_expr): Replace check against GIMPLE_WITH_CLEANUP_EXPR with a dyn_cast gwce *, strengthening local wce from gimple to gwce *. (gimple_push_cleanup): Strengthen local wce from gimple to gwce *. * gsstruct.def (GSS_WCE): Update for renaming of gimple_statement_wce to gwce. * tree-inline.c (remap_gimple_stmt): Add checked cast to gwce * within case GIMPLE_WITH_CLEANUP_EXPR. --- gcc/ChangeLog.gimple-classes | 32 gcc/doc/gimple.texi | 2 +- gcc/gimple-walk.c| 5 +++-- gcc/gimple.c | 8 gcc/gimple.h | 25 +++-- gcc/gimplify.c | 6 +++--- gcc/gsstruct.def | 2 +- gcc/tree-inline.c| 3 ++- 8 files changed, 57 insertions(+), 26 deletions(-) diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes index 133965c..b7a62de 100644 --- a/gcc/ChangeLog.gimple-classes +++ b/gcc/ChangeLog.gimple-classes @@ -1,5 +1,37 @@ 2014-10-28 David Malcolm dmalc...@redhat.com + * doc/gimple.texi (Class hierarchy of GIMPLE statements): Update + for renaming of gimple_statement_wce to gwce. 
+ * gimple-walk.c (walk_gimple_stmt): Add checked cast to gwce * + within case GIMPLE_WITH_CLEANUP_EXPR. + * gimple.c (gimple_build_wce): Strengthen return type and local + p from gimple to gwce *. + (gimple_copy): Add checked casts to gwce * within case + GIMPLE_WITH_CLEANUP_EXPR. + * gimple.h (struct gimple_statement_wce): Rename to... + (struct gwce): ...this. + (is_a_helper gimple_statement_wce *::test): Rename to... + (is_a_helper gwce *::test): ...this. + (gimple_build_wce): Strengthen return type from gimple to gwce *. + (gimple_wce_cleanup_ptr): Strengthen param from gimple to gwce *. + (gimple_wce_cleanup): Likewise. + (gimple_wce_set_cleanup): Likewise. + (gimple_wce_cleanup_eh_only): Strengthen param from const_gimple + to const gwce *. + (gimple_wce_set_cleanup_eh_only): Strengthen param from gimple to + gwce *. + * gimplify.c (gimplify_cleanup_point_expr): Replace check against + GIMPLE_WITH_CLEANUP_EXPR with a dyn_cast gwce *, strengthening + local wce from gimple to gwce *. + (gimple_push_cleanup): Strengthen local wce from gimple to + gwce *. + * gsstruct.def (GSS_WCE): Update for renaming of + gimple_statement_wce to gwce. + * tree-inline.c (remap_gimple_stmt): Add checked cast to gwce * + within case GIMPLE_WITH_CLEANUP_EXPR. + +2014-10-28 David Malcolm dmalc...@redhat.com + * auto-profile.c (autofdo::function_instance::find_icall_target_map): Strengthen param stmt from gimple to gcall *. 
(autofdo::autofdo_source_profile::update_inlined_ind_target): diff --git a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi index 621c860..de7345e 100644 --- a/gcc/doc/gimple.texi +++ b/gcc/doc/gimple.texi @@ -414,7 +414,7 @@ kinds, along with their relationships to @code{GSS_} values (layouts) and + gtry |layout: GSS_TRY, code: GIMPLE_TRY | - + gimple_statement_wce + + gwce |layout: GSS_WCE, code: GIMPLE_WITH_CLEANUP_EXPR | + gomp_continue diff --git a/gcc/gimple-walk.c b/gcc/gimple-walk.c index a6ea1ec..002308c 100644 --- a/gcc/gimple-walk.c +++ b/gcc/gimple-walk.c @@ -635,8 +635,9 @@ walk_gimple_stmt (gimple_stmt_iterator *gsi, walk_stmt_fn callback_stmt, break; case GIMPLE_WITH_CLEANUP_EXPR: - ret = walk_gimple_seq_mod (gimple_wce_cleanup_ptr (stmt), callback_stmt, -callback_op, wi); + ret = walk_gimple_seq_mod (gimple_wce_cleanup_ptr (as_a gwce * (stmt)), +
Re: [PATCH, IPA ICF] Fix PR63664, PR63574 (segfault in ipa-icf pass)
2014-10-29 17:01 GMT+03:00 Martin Liška mli...@suse.cz:
> On 10/29/2014 02:45 PM, Ilya Enkovich wrote:
> > On 29 Oct 10:34, Richard Biener wrote:
> > > On Tue, Oct 28, 2014 at 5:14 PM, Ilya Enkovich enkovich@gmail.com wrote:
> > > > Hi,
> > > >
> > > > This patch fixes PR63664 and PR63574. The problem is that NULL
> > > > types for labels are not handled properly by ICF. I assume it is
> > > > OK for labels to have a NULL type, so I added a check to ICF
> > > > rather than fixing label generation.
> > > >
> > > > Bootstrapped and checked on linux-x86_64. OK for trunk?
> > >
> > > Instead it shouldn't be called for labels.
> > >
> > > Richard.
> >
> > Here is a version which doesn't compare types for labels. Is it OK?
>
> Hello.
>
> I have just been testing a patch where the pass does not call
> compare_operand for gimple labels. As the pass creates a mapping between
> labels and basic blocks, such a comparison will not be necessary.

OK. That would be better.

Thanks,
Ilya

> Thanks,
> Martin
>
> > Bootstrapped and checked on linux-x86_64.
> >
> > Thanks,
> > Ilya
> > --
> > gcc/
> >
> > 2014-10-29  Ilya Enkovich  ilya.enkov...@intel.com
> >
> >     PR ipa/63664
> >     PR bootstrap/63574
> >     * ipa-icf-gimple.c (func_checker::compatible_types_p): Assert for
> >     null args.
> >     (func_checker::compare_operand): Don't compare types for labels.
> >
> > gcc/testsuite/
> >
> > 2014-10-29  Ilya Enkovich  ilya.enkov...@intel.com
> >
> >     PR ipa/63664
> >     * gcc.dg/ipa/pr63664.C: New.
[PATCH] Fix for PR63587
Hello.

The following patch fixes PR63587, where we put DECL_RESULT in
cgraph_node::expand_thunk into local_decls. The patch has been tested on
x86_64-linux-pc without any regression, and bootstrap works correctly.

Ready for trunk?

Thanks,
Martin

gcc/testsuite/ChangeLog:

2014-10-29  Martin Liska  mli...@suse.cz

    * g++.dg/ipa/pr63587-1.C: New test.
    * g++.dg/ipa/pr63587-2.C: New test.

gcc/ChangeLog:

2014-10-29  Martin Liska  mli...@suse.cz

    * cgraphunit.c (cgraph_node::expand_thunk): Only VAR_DECLs are put
    to local declarations.
    * function.c (add_local_decl): Implementation moved from header
    file, assert introduced for tree type.
    * function.h: Likewise.

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index a86bd1b..6f61f5c 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1550,7 +1550,9 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk)
           else if (!is_gimple_reg_type (restype))
             {
               restmp = resdecl;
-              add_local_decl (cfun, restmp);
+
+              if (TREE_CODE (restmp) == VAR_DECL)
+                add_local_decl (cfun, restmp);
               BLOCK_VARS (DECL_INITIAL (current_function_decl)) = restmp;
             }
           else
diff --git a/gcc/function.c b/gcc/function.c
index ee229ad..893ca6f 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -6441,6 +6441,15 @@ match_asm_constraints_1 (rtx_insn *insn, rtx *p_sets, int noutputs)
     df_insn_rescan (insn);
 }
 
+/* Add the decl D to the local_decls list of FUN.  */
+
+void
+add_local_decl (struct function *fun, tree d)
+{
+  gcc_assert (TREE_CODE (d) == VAR_DECL);
+  vec_safe_push (fun->local_decls, d);
+}
+
 namespace {
 
 const pass_data pass_data_match_asm_constraints =
diff --git a/gcc/function.h b/gcc/function.h
index 66384e5..aa47018 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -668,11 +668,7 @@ struct GTY(()) function {
 
 /* Add the decl D to the local_decls list of FUN.
*/ -static inline void -add_local_decl (struct function *fun, tree d) -{ - vec_safe_push (fun-local_decls, d); -} +void add_local_decl (struct function *fun, tree d); #define FOR_EACH_LOCAL_DECL(FUN, I, D) \ FOR_EACH_VEC_SAFE_ELT_REVERSE ((FUN)-local_decls, I, D) diff --git a/gcc/testsuite/g++.dg/ipa/pr63587-1.C b/gcc/testsuite/g++.dg/ipa/pr63587-1.C new file mode 100644 index 000..cbf872e --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr63587-1.C @@ -0,0 +1,92 @@ +// PR ipa/63587 +// { dg-do compile { target c++11 } } +// { dg-options -O2 -fno-strict-aliasing } + +template class struct A +{ +}; +template typename struct B +{ + template typename struct C; +}; +class D; +template typename class F; +struct G +{ + void operator()(const D , D); +}; +class D +{ +public: + D (int); +}; +struct H +{ + H (int); +}; +template typename _Key, typename, typename, typename _Compare, typename +class I +{ + typedef _Key key_type; + template typename _Key_compare struct J + { +_Key_compare _M_key_compare; + }; + J_Compare _M_impl; + +public: + Aint _M_get_insert_unique_pos (const key_type ); + Aint _M_get_insert_hint_unique_pos (H ); + template typename... 
_Args int _M_emplace_hint_unique (H, _Args ...); +}; +template typename _Key, typename _Tp, typename _Compare = G, + typename _Alloc = FA_Tp +class K +{ + typedef _Key key_type; + typedef _Key value_type; + typedef typename B_Alloc::template Cvalue_type _Pair_alloc_type; + Ikey_type, value_type, int, _Compare, _Pair_alloc_type _M_t; + +public: + void operator[](key_type) + { +_M_t._M_emplace_hint_unique (0); + } +}; +template typename _Key, typename _Val, typename _KeyOfValue, + typename _Compare, typename _Alloc +Aint +I_Key, _Val, _KeyOfValue, _Compare, _Alloc::_M_get_insert_unique_pos ( + const key_type p1) +{ + _M_impl._M_key_compare (p1, 0); +} +template typename _Key, typename _Val, typename _KeyOfValue, + typename _Compare, typename _Alloc +Aint +I_Key, _Val, _KeyOfValue, _Compare, _Alloc::_M_get_insert_hint_unique_pos ( + H ) +{ + _M_get_insert_unique_pos (0); +} +template typename _Key, typename _Val, typename _KeyOfValue, + typename _Compare, typename _Alloc +template typename... _Args +int +I_Key, _Val, _KeyOfValue, _Compare, _Alloc::_M_emplace_hint_unique ( + H p1, _Args ...) +{ + _M_get_insert_hint_unique_pos (p1); +} +namespace { +struct L; +} +void +fn1 () +{ + KD, L a; + a[0]; + KD, int b; + b[0]; +} diff --git a/gcc/testsuite/g++.dg/ipa/pr63587-2.C b/gcc/testsuite/g++.dg/ipa/pr63587-2.C new file mode 100644 index 000..f31c5bd --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr63587-2.C @@ -0,0 +1,250 @@ +// PR ipa/63587 +// { dg-do compile { target c++11 } } +// { dg-options -O2 } + +namespace boost { +class basic_cstring +{ +public: + basic_cstring (char *); +}; +template typename struct identity +{ +}; +struct make_identity; +struct function_buffer +{ +}; +template typename FunctionObj struct function_obj_invoker0 +{ + static int + invoke (function_buffer ) + { +FunctionObj f; +
Re: [PATCH, C++] Fix PR63366: __complex not equivalent to __complex double in C++
On 10/29/14 02:47, Thomas Preud'homme wrote:

> It seems more sensible to keep it in this block as the existing
> defaulted_int block is for types for which it is not an error to omit
> the int type specifier.

It's not an error to omit it for complex - but of course means something
different.

> IMHO it would be confusing to set type to integer_type_node when that's
> definitely wrong.

But then setting 'defaulted_int' when that's not the case is also
confusing.

> ChangeLog unchanged. Ok for trunk?

Anyway, I have no further comments on this patch and defer to Jason.

nathan
[PATCH][8/n] Merge from match-and-simplify, conversion patterns
This merges a set of conversion patterns and removes the corresponding
code from both fold-const.c and tree-ssa-forwprop.c.

 fold-const.c        |   36 ------------------------------------
 match.pd            |   42 ++++++++++++++++++++++++++++++++++++++++++
 tree-ssa-forwprop.c |   65 -----------------------------------------------------------------
 3 files changed, 42 insertions(+), 101 deletions(-)

(hopefully it will always look that nice!)

Bootstrapped and tested on x86_64-unknown-linux-gnu, I'll apply shortly.

Thanks,
Richard.

2014-10-29  Richard Biener  <rguent...@suse.de>

	* match.pd: Implement a first set of conversion patterns.
	* fold-const.c (fold_unary_loc): Remove them here.
	* tree-ssa-forwprop.c (simplify_vce): Remove.
	(pass_forwprop::execute): Do not call simplify_vce.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 216798)
+++ gcc/match.pd	(working copy)
@@ -90,6 +90,48 @@ (define_predicates
+/* Simplifications of conversions.  */
+
+/* Basic strip-useless-type-conversions / strip_nops.  */
+(for cvt (convert view_convert)
+ (simplify
+  (cvt @0)
+  (if ((GIMPLE && useless_type_conversion_p (type, TREE_TYPE (@0)))
+       || (GENERIC && type == TREE_TYPE (@0)))
+   @0)))
+
+/* Contract view-conversions.  */
+(simplify
+  (view_convert (view_convert @0))
+  (view_convert @0))
+
+/* For integral conversions with the same precision or pointer
+   conversions use a NOP_EXPR instead.  */
+(simplify
+  (view_convert @0)
+  (if ((INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type))
+       && (INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE (@0)))
+       && TYPE_PRECISION (type) == TYPE_PRECISION (TREE_TYPE (@0)))
+   (convert @0)))
+
+/* Strip inner integral conversions that do not change precision or size.  */
+(simplify
+  (view_convert (convert@0 @1))
+  (if ((INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE (@0)))
+       && (INTEGRAL_TYPE_P (TREE_TYPE (@1)) || POINTER_TYPE_P (TREE_TYPE (@1)))
+       && (TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1)))
+       && (TYPE_SIZE (TREE_TYPE (@0)) == TYPE_SIZE (TREE_TYPE (@1))))
+   (view_convert @1)))
+
+/* Re-association barriers around constants and other re-association
+   barriers can be removed.  */
+(simplify
+ (paren CONSTANT_CLASS_P@0)
+ @0)
+(simplify
+ (paren (paren@1 @0))
+ @1)
+
 /* Simple example for a user-defined predicate - modeled after
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 216801)
+++ gcc/fold-const.c	(working copy)
@@ -7661,14 +7661,6 @@ fold_unary_loc (location_t loc, enum tre
 
   switch (code)
     {
-    case PAREN_EXPR:
-      /* Re-association barriers around constants and other re-association
-         barriers can be removed.  */
-      if (CONSTANT_CLASS_P (op0)
-          || TREE_CODE (op0) == PAREN_EXPR)
-        return fold_convert_loc (loc, type, op0);
-      return NULL_TREE;
-
     case NON_LVALUE_EXPR:
       if (!maybe_lvalue_p (op0))
         return fold_convert_loc (loc, type, op0);
@@ -7677,9 +7669,6 @@ fold_unary_loc (location_t loc, enum tre
     CASE_CONVERT:
     case FLOAT_EXPR:
     case FIX_TRUNC_EXPR:
-      if (TREE_TYPE (op0) == type)
-        return op0;
-
       if (COMPARISON_CLASS_P (op0))
         {
           /* If we have (type) (a CMP b) and type is an integral type, return
@@ -7950,35 +7939,10 @@ fold_unary_loc (location_t loc, enum tre
       return tem ? tem : NULL_TREE;
 
     case VIEW_CONVERT_EXPR:
-      if (TREE_TYPE (op0) == type)
-        return op0;
-      if (TREE_CODE (op0) == VIEW_CONVERT_EXPR)
-        return fold_build1_loc (loc, VIEW_CONVERT_EXPR,
-                                type, TREE_OPERAND (op0, 0));
       if (TREE_CODE (op0) == MEM_REF)
         return fold_build2_loc (loc, MEM_REF, type,
                                 TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1));
 
-      /* For integral conversions with the same precision or pointer
-         conversions use a NOP_EXPR instead.  */
-      if ((INTEGRAL_TYPE_P (type)
-           || POINTER_TYPE_P (type))
-          && (INTEGRAL_TYPE_P (TREE_TYPE (op0))
-              || POINTER_TYPE_P (TREE_TYPE (op0)))
-          && TYPE_PRECISION (type) == TYPE_PRECISION (TREE_TYPE (op0)))
-        return fold_convert_loc (loc, type, op0);
-
-      /* Strip inner integral conversions that do not change the precision.  */
-      if (CONVERT_EXPR_P (op0)
-          && (INTEGRAL_TYPE_P (TREE_TYPE (op0))
-              || POINTER_TYPE_P (TREE_TYPE (op0)))
-          && (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
-              || POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0))))
-          && (TYPE_PRECISION (TREE_TYPE (op0))
-              == TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0, 0)))))
-        return fold_build1_loc (loc, VIEW_CONVERT_EXPR,
-                                type, TREE_OPERAND (op0, 0));
-
       return
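
As a rough illustration of why the "use a NOP_EXPR instead" pattern above is safe, here is a small sketch in plain C (not GCC internals). It assumes that a VIEW_CONVERT_EXPR behaves like a bitwise reinterpretation and a NOP_EXPR like an ordinary value conversion; all helper names are hypothetical. For integer types of equal precision the two agree, while for a float source they generally do not, which is why the pattern is restricted to integral and pointer types.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the bits of a 32-bit signed value
   as unsigned, roughly what a VIEW_CONVERT_EXPR does.  */
uint32_t
view_convert_i32_to_u32 (int32_t x)
{
  uint32_t r;
  memcpy (&r, &x, sizeof r);
  return r;
}

/* Hypothetical helper: ordinary value conversion, roughly a NOP_EXPR.  */
uint32_t
convert_i32_to_u32 (int32_t x)
{
  return (uint32_t) x;
}

/* Same two operations with a float source; here precision matches but
   the source is not an integral type, so the results differ.  */
uint32_t
view_convert_f32_to_u32 (float x)
{
  uint32_t r;
  memcpy (&r, &x, sizeof r);
  return r;
}

uint32_t
convert_f32_to_u32 (float x)
{
  return (uint32_t) x;
}
```

The integer pair returns identical results for every input because `int32_t` is two's complement and the conversion to `uint32_t` is modular; the float pair is only there as a counterexample.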
Re: libcc1
On 29/10/14 11:24, Phil Muldoon wrote:
On 29/10/14 10:31, Jakub Jelinek wrote:

It would be nice to have libcc1 built just once, not bootstrap it, but it is a build module, is that possible? In toplevel configure.ac I'm seeing:

host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim gdb gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1"

shouldn't libcc1 be in build_tools instead? I mean, it is a library meant to be dlopened by gdb and gcc plugin that uses that library, so in canadian-cross should be for the build target, where the resulting compiler will be run and where gdb will be run. Could something like following work? Phil, can you try that? Perhaps some toplevel Makefile* changes would be needed too.

From a point of view of GDB, as long as in all scenarios above the .so is available in the finished product that is fine. I will test your patch and report back.

I built with bootstrap enabled, and also disabled with this patch. In both cases the .so is available. So it looks good. I also ran GDB compile testcases against both .so's and all looks good there too.

Cheers
Phil
Re: [Ping] [PATCH, 8/10] aarch64: ccmp insn patterns
On 10/29/2014 03:37 AM, Zhenqiang Chen wrote: It's my fault. %m/%M work well in the new patch. And I add a check aarch64_ccmp_mode_to_code (GET_MODE (operands[1])) == GET_CODE (operands[5]) on the patterns to make sure that the compare and CC mode are aligned. Looks good. r~
Re: libcc1
On 29/10/14 14:26, Phil Muldoon wrote:
On 29/10/14 11:24, Phil Muldoon wrote:
On 29/10/14 10:31, Jakub Jelinek wrote:

It would be nice to have libcc1 built just once, not bootstrap it, but it is a build module, is that possible? In toplevel configure.ac I'm seeing:

host_tools="texinfo flex bison binutils gas ld fixincludes gcc cgen sid sim gdb gprof etc expect dejagnu m4 utils guile fastjar gnattools libcc1"

shouldn't libcc1 be in build_tools instead? I mean, it is a library meant to be dlopened by gdb and gcc plugin that uses that library, so in canadian-cross should be for the build target, where the resulting compiler will be run and where gdb will be run. Could something like following work? Phil, can you try that? Perhaps some toplevel Makefile* changes would be needed too.

From a point of view of GDB, as long as in all scenarios above the .so is available in the finished product that is fine. I will test your patch and report back.

I built with bootstrap enabled, and also disabled with this patch. In both cases the .so is available. So it looks good. I also ran GDB compile testcases against both .so's and all looks good there too.

Cheers
Phil

I forgot to ask: I am fine with this patch. I concur with Jakub that building libcc1 as part of bootstrap is not needed. Does anyone else object to removing libcc1.so from bootstrap?

Cheers
Phil
RE: [PATCH, C++] Fix PR63366: __complex not equivalent to __complex double in C++
From: Nathan Sidwell [mailto:nathanmsidw...@gmail.com] On Behalf Of Nathan Sidwell It's not an error to omit it for complex - but of course means something different. IMHO it would be confusing to set type to integer_type_node when that's definitely wrong. But then setting 'defaulted_int' when that's not the case is also confusing. Oh in that case the patch is incomplete. Currently a complex alone gives an error at compilation which is why I added -fpermissive to the testcase. The patch doesn't change this behavior. Best regards, Thomas
Re: [PATCH, C++] Fix PR63366: __complex not equivalent to __complex double in C++
On 10/29/14 07:32, Thomas Preud'homme wrote: From: Nathan Sidwell [mailto:nathanmsidw...@gmail.com] On Behalf Of Oh in that case the patch is incomplete. Currently a complex alone gives an error at compilation which is why I added -fpermissive to the testcase. The patch doesn't change this behavior. It's quite probable I'm wrong -- I forgot that you mentioned -fpermissive before. In which case your reasoning is sound. Still deferring to Jason though. nathan
Re: [Ping] [PATCH, 6/10] aarch64: add ccmp CC mode
On 10/29/2014 03:31 AM, Zhenqiang Chen wrote: Patch is updated. Looks good. r~
Re: [Patchv2 3/4] Control SRA and IPA-SRA by a param rather than MOVE_RATIO
On Wed, Oct 01, 2014 at 05:38:12PM +0100, James Greenhalgh wrote:
On Fri, Sep 26, 2014 at 10:11:13AM +0100, Richard Biener wrote:
On Thu, Sep 25, 2014 at 4:57 PM, James Greenhalgh james.greenha...@arm.com wrote:

Given the special value to note the default for the new --params is zero a user cannot disable scalarization that way. I still somehow dislike that you need a target hook to compute the default. Why doesn't it work to do, in opts.c:default_options_optimization

  maybe_set_param_value (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED,
                         get_move_ratio (speed_p) * MOVE_MAX_PIECES,
                         opts->x_param_values, opts_set->x_param_values);

and override that default in targets option_override hook the same way?

The problem I am having is getting get_move_ratio right, without breaking the modular design. default_options_optimization, and the rest of opts.c is going to end up in libcommon-target.a, so we are not going to have access to any backend-specific symbols.

An early draft of this patch used the MOVE_RATIO macro to set the default value. This worked fine for AArch64 and ARM targets (which both use a simple C expression for MOVE_RATIO), but failed for x86_64 which defines MOVE_RATIO as so:

  #define MOVE_RATIO(speed) ((speed) ? ix86_cost->move_ratio : 3)

Dealing with that ix86_cost symbol is what causes us the pain. It seems reasonable that a target might want to define MOVE_RATIO as some function of their tuning parameters, so I don't want to disallow that usage.

This inspired me to try turning this into a target hook, but this doesn't help as opts.c only gets access to common-target.def target hooks. These suffer the same problem, they don't have access to any backend symbols.

I suppose I could port any target with a definition of MOVE_RATIO to override the default parameter value in their option overriding code, but that makes this a very large patch set (many targets define MOVE_RATIO). Is this an avenue worth exploring? I agree the very special target hook is not ideal.
Hi, Did you have any further thoughts on this? I'm still unable to come up with a way to set these parameters which allows them to default to their current (MOVE_RATIO derived) values. If the only way to make this work is to add code to TARGET_OPTION_OVERRIDE for all targets that define MOVE_RATIO, then I suppose I can do that, but I'd prefer a neater way to make it work, if you can think of one. Thanks, James
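
To make the concern about the zero sentinel concrete, here is a minimal sketch in plain C. It is not code from the patch: the function name, the `SKETCH_` constant and the ratio values 15 and 3 are all made up for illustration. It assumes the param uses 0 to mean "not set, fall back to the MOVE_RATIO-derived default", which is the behaviour described in the thread.

```c
/* Hypothetical stand-in for MOVE_MAX_PIECES on some target.  */
#define SKETCH_MOVE_MAX_PIECES 8

/* Hypothetical stand-in for a target's MOVE_RATIO (speed); the numbers
   are illustrative only.  */
static unsigned int
sketch_move_ratio (int speed_p)
{
  return speed_p ? 15 : 3;
}

/* Resolve the effective maximum scalarization size.  A param value of 0
   is treated as "use the target default", so a user cannot pass 0 to
   mean "never scalarize".  */
unsigned int
sketch_sra_max_scalarization_size (unsigned int param_value, int speed_p)
{
  if (param_value == 0)
    return sketch_move_ratio (speed_p) * SKETCH_MOVE_MAX_PIECES;
  return param_value;
}
```

With these made-up numbers, an unset param resolves to 120 when optimizing for speed and 24 when optimizing for size, and an explicit `--param` value of 0 is indistinguishable from "unset".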
Re: [PATCH] Fix for PR63587
On Wed, Oct 29, 2014 at 3:14 PM, Martin Liška mli...@suse.cz wrote: Hello. Following patch fixes PR63587, where we put DECL_RESULT in cgraph_node::expand_thunk to local_decls. Patch has been tested on x86_64-linux-pc without any regression and bootstrap works correctly. Ready for trunk? Ok. Thanks, Richard. Thanks, Martin
[PATCHv2] PR ipa/63576: Process speculative edges in ICF
Hi all,

This patch is an attempt to fix PR ipa/63576. It has been corrected according to the note made by Jiong Wang:

On 27.10.2014 18:41, Jiong Wang wrote:

how about using early exit for above code, something like:

  if (!e->speculative
      || profile_status_for_fn (DECL_STRUCT_FUNCTION (dst->decl)) == PROFILE_ABSENT)
    {
      e->count = bb->count;
      e->frequency = (e->speculative
                      ? CGRAPH_FREQ_BASE
                      : compute_call_stmt_bb_frequency (dst->decl, bb));
      return;
    }
  gcc_assert (e->count >= 0);
  ... ...

Ok, that's the right idea.

Jan Hubicka wrote:

Then you need to sum counts (instead of taking ones from BB) and turn them back to frequencies (because it is profile only counts should be non-0)

It seems that counts and frequencies are gathered in some special manner, and this patch simply adds counts from speculative edges and from basic blocks. Of course, I don't know whether this is the proper way, so please correct me or point me to the place where it is documented. Honza, can you explain your comment on the bug?

I've changed the patch, then bootstrapped and regtested it on x86_64-unknown-linux-gnu again. Ok for trunk?

--
Best regards,
Ilya Palachev

---
gcc/

2014-10-27  Ilya Palachev  <i.palac...@samsung.com>

	* ipa-utils.c (compute_edge_count_and_frequency): New function.
	(ipa_merge_profiles): Handle speculative case.
---
 gcc/ipa-utils.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
index e4ea84c..177a170 100644
--- a/gcc/ipa-utils.c
+++ b/gcc/ipa-utils.c
@@ -390,6 +390,37 @@ get_base_var (tree t)
   return t;
 }
 
+/* Computes count and frequency for edges.  */
+
+static void
+compute_edge_count_and_frequency (struct cgraph_edge *e,
+                                  struct cgraph_node *dst)
+{
+  basic_block bb = gimple_bb (e->call_stmt);
+  if (!e->speculative
+      || profile_status_for_fn (DECL_STRUCT_FUNCTION (dst->decl))
+         == PROFILE_ABSENT)
+    {
+      e->count = bb->count;
+      e->frequency = compute_call_stmt_bb_frequency (dst->decl, bb);
+      return;
+    }
+  gcc_assert (e->count >= 0);
+  e->count += bb->count;
+  gcc_assert (e->frequency >= 0);
+
+  int entry_freq = ENTRY_BLOCK_PTR_FOR_FN
+    (DECL_STRUCT_FUNCTION (dst->decl))->frequency;
+  int freq = e->frequency + bb->frequency;
+
+  if (!entry_freq)
+    entry_freq = 1, freq++;
+
+  freq = freq * CGRAPH_FREQ_BASE / entry_freq;
+  if (freq > CGRAPH_FREQ_MAX)
+    freq = CGRAPH_FREQ_MAX;
+  e->frequency = freq;
+}
 
 /* SRC and DST are going to be merged.  Take SRC's profile and merge it into
    DST so it is not going to be lost.  Destroy SRC's body on the way.  */
@@ -547,19 +578,11 @@ ipa_merge_profiles (struct cgraph_node *dst,
       pop_cfun ();
       for (e = dst->callees; e; e = e->next_callee)
         {
-          gcc_assert (!e->speculative);
-          e->count = gimple_bb (e->call_stmt)->count;
-          e->frequency = compute_call_stmt_bb_frequency
-                           (dst->decl,
-                            gimple_bb (e->call_stmt));
+          compute_edge_count_and_frequency (e, dst);
         }
       for (e = dst->indirect_calls; e; e = e->next_callee)
         {
-          gcc_assert (!e->speculative);
-          e->count = gimple_bb (e->call_stmt)->count;
-          e->frequency = compute_call_stmt_bb_frequency
-                           (dst->decl,
-                            gimple_bb (e->call_stmt));
+          compute_edge_count_and_frequency (e, dst);
         }
       src->release_body ();
       inline_update_overall_summary (dst);
--
2.1.1
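
The frequency arithmetic in the speculative branch of the patch above can be tried in isolation with the following sketch. It is a standalone approximation, not GCC code: the function name is hypothetical, and the two constants are assumed to match CGRAPH_FREQ_BASE and CGRAPH_FREQ_MAX as defined in cgraph.h at the time (1000 and 100000); treat those values as assumptions.

```c
/* Assumed values of CGRAPH_FREQ_BASE and CGRAPH_FREQ_MAX.  */
#define SKETCH_FREQ_BASE 1000
#define SKETCH_FREQ_MAX 100000

/* Mirror of the scaling step as posted: sum the edge and block
   frequencies, guard against a zero entry frequency, rescale relative
   to the entry block and clamp to the maximum.  */
int
sketch_scale_edge_frequency (int edge_freq, int bb_freq, int entry_freq)
{
  int freq = edge_freq + bb_freq;

  if (!entry_freq)
    {
      entry_freq = 1;
      freq++;
    }

  freq = freq * SKETCH_FREQ_BASE / entry_freq;
  if (freq > SKETCH_FREQ_MAX)
    freq = SKETCH_FREQ_MAX;
  return freq;
}
```

The sketch only reproduces the arithmetic; whether summing the two frequencies is the right model is exactly the question the mail puts to Honza.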
Re: [PATCH, ifcvt] Allow CC mode if HAVE_cbranchcc4
On 10/29/2014 03:27 AM, Zhenqiang Chen wrote: ChangeLog: 2014-10-29 Zhenqiang Chen zhenqiang.c...@arm.com * ifcvt.c (noce_emit_cmove, noce_get_alt_condition, noce_get_condition): Allow CC mode if HAVE_cbranchcc4. Ok. r~
Re: [PATCH 5/5] add libcc1
On 10/29/2014 11:59 AM, Jakub Jelinek wrote: Ah, got it. Is it hard to move the inclusion to the actual users? I think it is hard. I think it has been moved to system.h very much intentionally, as including gmp.h only in selected headers was causing lots of troubles, e.g. because of #pragma GCC poison at the end of system.h, I believe some gmp.h versions were using some poisoned symbols. system.h doesn't include gmp.h if -DGENERATOR_FILE, but libcc1 is not a generator, so that is not appropriate, it can use various other GCC headers that are not suitable for generators. GMPINC has been suggested by Joseph, I'd think if we ever need also GMPLIB, we'd clearly see it as link failures of libcc1 first and could add it only when really needed. Fair enough, thanks! Paolo
[gimple-classes, committed 2/3] Make gimple_goto_dest require a const ggoto *
gcc/ChangeLog.gimple-classes:
	* gimple.h (gimple_goto_dest): Strengthen param from const_gimple to
	const ggoto *.
	* cfgexpand.c (expand_gimple_stmt_1): Add checked cast to ggoto *
	within case GIMPLE_GOTO.
	* gimple-walk.c (walk_stmt_load_store_addr_ops): Add checked cast
	to ggoto *.
	* ipa-icf-gimple.c (ipa_icf_gimple::func_checker::compare_bb): Add
	checked casts to ggoto * within case GIMPLE_GOTO.
	(ipa_icf_gimple::func_checker::compare_gimple_goto): Strengthen
	both params from gimple to const ggoto *.
	* ipa-icf-gimple.h (ipa_icf_gimple::func_checker::compare_gimple_goto):
	Likewise.
	* omp-low.c (diagnose_sb_2): Add checked cast to ggoto * within
	case GIMPLE_GOTO.
	* tree-cfg.c (computed_goto_p): Replace check for GIMPLE_GOTO with
	a dyn_cast <ggoto *>, introducing new local "goto_stmt".
	(handle_abnormal_edges): Strengthen local "last" from gimple to
	ggoto *.
	(make_goto_expr_edges): Add checked cast to ggoto * within region
	where we know it's a simple goto.
	(simple_goto_p): Replace check for GIMPLE_GOTO with a
	dyn_cast <ggoto *>, introducing new local "goto_stmt".
	* tree-cfgcleanup.c (cleanup_control_flow_bb): Likewise, using
	new "goto_stmt" in place of "stmt".
	* tree-eh.c (replace_goto_queue_cond_clause): Likewise, using
	new "goto_stmt" in place of gimple_seq_first_stmt (new_seq).
	(maybe_record_in_goto_queue): Add checked cast to ggoto * within
	case GIMPLE_GOTO.
	* tree-inline.c (inline_forbidden_p_stmt): Likewise.
	* tree-nested.c (convert_nonlocal_reference_stmt): Likewise.
	(convert_nl_goto_reference): Add checked cast to ggoto *.
	* tree-ssa-dom.c (initialize_hash_element): Replace check for
	GIMPLE_GOTO with a dyn_cast <ggoto *>, introducing new local
	"goto_stmt".
	(optimize_stmt): Likewise.
	(propagate_rhs_into_lhs): Add checked cast to ggoto *.
	* tree-ssa-sccvn.c (cond_dom_walker::before_dom_children): Likewise.
	* tree-ssa-threadedge.c (simplify_control_stmt_condition): Likewise.
	* tree-ssa-threadupdate.c (bb_ends_with_multiway_branch): Likewise.
--- gcc/ChangeLog.gimple-classes | 42 ++ gcc/cfgexpand.c | 2 +- gcc/gimple-walk.c| 2 +- gcc/gimple.h | 3 +-- gcc/ipa-icf-gimple.c | 7 --- gcc/ipa-icf-gimple.h | 4 ++-- gcc/omp-low.c| 2 +- gcc/tree-cfg.c | 16 ++-- gcc/tree-cfgcleanup.c| 9 + gcc/tree-eh.c| 15 --- gcc/tree-inline.c| 2 +- gcc/tree-nested.c| 4 ++-- gcc/tree-ssa-dom.c | 12 ++-- gcc/tree-ssa-sccvn.c | 2 +- gcc/tree-ssa-threadedge.c| 2 +- gcc/tree-ssa-threadupdate.c | 2 +- 16 files changed, 87 insertions(+), 39 deletions(-) diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes index b7a62de..50b87b9 100644 --- a/gcc/ChangeLog.gimple-classes +++ b/gcc/ChangeLog.gimple-classes @@ -1,5 +1,47 @@ 2014-10-28 David Malcolm dmalc...@redhat.com + * gimple.h (gimple_goto_dest): Strengthen param from const_gimple to + const ggoto *. + * cfgexpand.c (expand_gimple_stmt_1): Add checked cast to ggoto * + within case GIMPLE_GOTO. + * gimple-walk.c (walk_stmt_load_store_addr_ops): Add checked cast + to ggoto *. + * ipa-icf-gimple.c (ipa_icf_gimple::func_checker::compare_bb): Add + checked casts to ggoto * within case GIMPLE_GOTO. + (ipa_icf_gimple::func_checker::compare_gimple_goto): Strengthen + both params from gimple to const ggoto *. + * ipa-icf-gimple.h (ipa_icf_gimple::func_checker::compare_gimple_goto): + Likewise. + * omp-low.c (diagnose_sb_2): Add checked cast to ggoto * within + case GIMPLE_GOTO. + * tree-cfg.c (computed_goto_p): Replace check for GIMPLE_GOTO with + a dyn_cast ggoto *, introducing new local goto_stmt. + (handle_abnormal_edges): Strengthen local last from gimple to + ggoto *. + (make_goto_expr_edges): Add checked cast to ggoto * within region + where we know it's a simple goto. + (simple_goto_p): Replace check for GIMPLE_GOTO with a + dyn_cast ggoto *, introducing new local goto_stmt. + * tree-cfgcleanup.c (cleanup_control_flow_bb): Likewise, using + new goto_stmt in place of stmt. 
+ * tree-eh.c (replace_goto_queue_cond_clause): Likewise, using + new goto_stmt in place of gimple_seq_first_stmt (new_seq). + (maybe_record_in_goto_queue): Add checked cast to ggoto * within + case GIMPLE_GOTO. + * tree-inline.c (inline_forbidden_p_stmt): Likewise. + * tree-nested.c (convert_nonlocal_reference_stmt): Likewise. + (convert_nl_goto_reference): Add
Re: [Ping] [PATCH, 1/10] two hooks for conditional compare (ccmp)
On 10/29/2014 03:28 AM, Zhenqiang Chen wrote: Thanks! Patch is updated. Ok. r~
Re: [gofrontend-dev] [PATCH 8/9] Gccgo port to s390[x] -- part I
On Wed, Oct 29, 2014 at 12:01 AM, Dominik Vogt v...@linux.vnet.ibm.com wrote: Patch updated to remove conflicts with changed tests in patch 7. Thanks. Approved and committed. Ian
Re: [Ping] [PATCH, 2/10] prepare ccmp
On 10/29/2014 03:29 AM, Zhenqiang Chen wrote: Thanks! Patch is updated. Ok. r~
Re: [gomp4] Rationalise thread-local variables in libgomp OpenACC support
On Tue, 28 Oct 2014 11:16:19 + Julian Brown jul...@codesourcery.com wrote: Hi, This patch rationalises TLS support by moving all thread-local variables into a single structure. Because this meant interfering with how per-thread/per-device initialisation was done, I took the opportunity to tidy up a couple of other bits along the way. Highlights are: Here's a slightly-updated version of the patch, adjusted for Thomas's removal of the queue.h list-handling functions. ChangeLog as before. Thanks, Juliancommit ab4e9ff7a52e43418d6d2fc5b5e76e0065e130d5 Author: Julian Brown jul...@codesourcery.com Date: Mon Oct 27 08:43:07 2014 -0700 TLS rework diff --git a/libgomp/env.c b/libgomp/env.c index 32fb92c..8b22e6f 100644 --- a/libgomp/env.c +++ b/libgomp/env.c @@ -28,6 +28,7 @@ #include libgomp.h #include libgomp_f.h #include target.h +#include oacc-int.h #include ctype.h #include stdlib.h #include stdio.h diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h index e31573c..1496437 100644 --- a/libgomp/libgomp-plugin.h +++ b/libgomp/libgomp-plugin.h @@ -50,8 +50,4 @@ extern void GOMP_PLUGIN_mutex_destroy (gomp_mutex_t *mutex); extern void GOMP_PLUGIN_mutex_lock (gomp_mutex_t *mutex); extern void GOMP_PLUGIN_mutex_unlock (gomp_mutex_t *mutex); -/* target.c */ - -extern void GOMP_PLUGIN_async_unmap_vars (void *ptr); - #endif diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index 538aabb..c6a88a2 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -337,4 +337,5 @@ PLUGIN_1.0 { GOMP_PLUGIN_mutex_lock; GOMP_PLUGIN_mutex_unlock; GOMP_PLUGIN_async_unmap_vars; + GOMP_PLUGIN_acc_thread; }; diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c index 08b6b95..dddfe05 100644 --- a/libgomp/oacc-async.c +++ b/libgomp/oacc-async.c @@ -29,6 +29,7 @@ #include openacc.h #include libgomp.h #include target.h +#include oacc-int.h int acc_async_test (int async) @@ -36,13 +37,13 @@ acc_async_test (int async) if (async acc_async_sync) gomp_fatal (invalid async argument: 
%d, async); - return ACC_dev-openacc.async_test_func (async); + return base_dev-openacc.async_test_func (async); } int acc_async_test_all (void) { - return ACC_dev-openacc.async_test_all_func (); + return base_dev-openacc.async_test_all_func (); } void @@ -51,22 +52,19 @@ acc_wait (int async) if (async acc_async_sync) gomp_fatal (invalid async argument: %d, async); - ACC_dev-openacc.async_wait_func (async); - return; + base_dev-openacc.async_wait_func (async); } void acc_wait_async (int async1, int async2) { - ACC_dev-openacc.async_wait_async_func (async1, async2); - return; + base_dev-openacc.async_wait_async_func (async1, async2); } void acc_wait_all (void) { - ACC_dev-openacc.async_wait_all_func (); - return; + base_dev-openacc.async_wait_all_func (); } void @@ -75,6 +73,5 @@ acc_wait_all_async (int async) if (async acc_async_sync) gomp_fatal (invalid async argument: %d, async); - ACC_dev-openacc.async_wait_all_async_func (async); - return; + base_dev-openacc.async_wait_all_async_func (async); } diff --git a/libgomp/oacc-cuda.c b/libgomp/oacc-cuda.c index f587325..3daf5b1 100644 --- a/libgomp/oacc-cuda.c +++ b/libgomp/oacc-cuda.c @@ -29,14 +29,15 @@ #include config.h #include libgomp.h #include target.h +#include oacc-int.h void * acc_get_current_cuda_device (void) { void *p = NULL; - if (ACC_dev ACC_dev-openacc.cuda.get_current_device_func) -p = ACC_dev-openacc.cuda.get_current_device_func (); + if (base_dev base_dev-openacc.cuda.get_current_device_func) +p = base_dev-openacc.cuda.get_current_device_func (); return p; } @@ -46,8 +47,8 @@ acc_get_current_cuda_context (void) { void *p = NULL; - if (ACC_dev ACC_dev-openacc.cuda.get_current_context_func) -p = ACC_dev-openacc.cuda.get_current_context_func (); + if (base_dev base_dev-openacc.cuda.get_current_context_func) +p = base_dev-openacc.cuda.get_current_context_func (); return p; } @@ -60,8 +61,8 @@ acc_get_cuda_stream (int async) if (async 0) return p; - if (ACC_dev ACC_dev-openacc.cuda.get_stream_func) -p = 
ACC_dev-openacc.cuda.get_stream_func (async); + if (base_dev base_dev-openacc.cuda.get_stream_func) +p = base_dev-openacc.cuda.get_stream_func (async); return p; } @@ -73,9 +74,11 @@ acc_set_cuda_stream (int async, void *stream) if (async 0 || stream == NULL) return 0; + + ACC_lazy_initialize (); - if (ACC_dev ACC_dev-openacc.cuda.set_stream_func) -s = ACC_dev-openacc.cuda.set_stream_func (async, stream); + if (base_dev base_dev-openacc.cuda.set_stream_func) +s = base_dev-openacc.cuda.set_stream_func (async, stream); return s; } diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c index f44ca5e..6fe8f6c 100644 --- a/libgomp/oacc-host.c +++ b/libgomp/oacc-host.c @@ -35,6 +35,9 @@ #include target.h #ifdef HOST_NONSHM_PLUGIN #include
[testsuite,ARM] PR61153 Fix vbic and vorn tests
Hi, In PR61153, the vbic and vorn tests fail because when compiled at -O0 the expected Neon instructions are not generated, making scan-assembler fail. This patch: - replaces -O0 by -O2 - moves the declaration of local variables used as intrinsics parameters and results to global declarations, to prevent the compiler from optimizing the whole test away. OK? Christophe. 2014-10-29 Christophe Lyon christophe.l...@linaro.org PR target/61153 * gcc.target/arm/neon/vbicQs16.c: Compile at O2 and move variables declarations from local to global. * gcc.target/arm/neon/vbicQs16.c: Likewise. * gcc.target/arm/neon/vbicQs32.c: Likewise. * gcc.target/arm/neon/vbicQs64.c: Likewise. * gcc.target/arm/neon/vbicQs8.c: Likewise. * gcc.target/arm/neon/vbicQu16.c: Likewise. * gcc.target/arm/neon/vbicQu32.c: Likewise. * gcc.target/arm/neon/vbicQu64.c: Likewise. * gcc.target/arm/neon/vbicQu8.c: Likewise. * gcc.target/arm/neon/vbics16.c: Likewise. * gcc.target/arm/neon/vbics32.c: Likewise. * gcc.target/arm/neon/vbics64.c: Likewise. * gcc.target/arm/neon/vbics8.c: Likewise. * gcc.target/arm/neon/vbicu16.c: Likewise. * gcc.target/arm/neon/vbicu32.c: Likewise. * gcc.target/arm/neon/vbicu64.c: Likewise. * gcc.target/arm/neon/vbicu8.c: Likewise. * gcc.target/arm/neon/vornQs16.c: Likewise. * gcc.target/arm/neon/vornQs32.c: Likewise. * gcc.target/arm/neon/vornQs64.c: Likewise. * gcc.target/arm/neon/vornQs8.c: Likewise. * gcc.target/arm/neon/vornQu16.c: Likewise. * gcc.target/arm/neon/vornQu32.c: Likewise. * gcc.target/arm/neon/vornQu64.c: Likewise. * gcc.target/arm/neon/vornQu8.c: Likewise. * gcc.target/arm/neon/vorns16.c: Likewise. * gcc.target/arm/neon/vorns32.c: Likewise. * gcc.target/arm/neon/vorns64.c: Likewise. * gcc.target/arm/neon/vorns8.c: Likewise. * gcc.target/arm/neon/vornu16.c: Likewise. * gcc.target/arm/neon/vornu32.c: Likewise. * gcc.target/arm/neon/vornu64.c: Likewise. * gcc.target/arm/neon/vornu8.c: Likewise. 
diff --git a/gcc/testsuite/gcc.target/arm/neon/vbicQs16.c b/gcc/testsuite/gcc.target/arm/neon/vbicQs16.c index e15a260..ccb81e4 100644 --- a/gcc/testsuite/gcc.target/arm/neon/vbicQs16.c +++ b/gcc/testsuite/gcc.target/arm/neon/vbicQs16.c @@ -3,17 +3,17 @@ /* { dg-do assemble } */ /* { dg-require-effective-target arm_neon_ok } */ -/* { dg-options -save-temps -O0 } */ +/* { dg-options -save-temps -O2 } */ /* { dg-add-options arm_neon } */ #include arm_neon.h +int16x8_t out_int16x8_t; +int16x8_t arg0_int16x8_t; +int16x8_t arg1_int16x8_t; + void test_vbicQs16 (void) { - int16x8_t out_int16x8_t; - int16x8_t arg0_int16x8_t; - int16x8_t arg1_int16x8_t; - out_int16x8_t = vbicq_s16 (arg0_int16x8_t, arg1_int16x8_t); } diff --git a/gcc/testsuite/gcc.target/arm/neon/vbicQs32.c b/gcc/testsuite/gcc.target/arm/neon/vbicQs32.c index f376bf0..64f2a43 100644 --- a/gcc/testsuite/gcc.target/arm/neon/vbicQs32.c +++ b/gcc/testsuite/gcc.target/arm/neon/vbicQs32.c @@ -3,17 +3,17 @@ /* { dg-do assemble } */ /* { dg-require-effective-target arm_neon_ok } */ -/* { dg-options -save-temps -O0 } */ +/* { dg-options -save-temps -O2 } */ /* { dg-add-options arm_neon } */ #include arm_neon.h +int32x4_t out_int32x4_t; +int32x4_t arg0_int32x4_t; +int32x4_t arg1_int32x4_t; + void test_vbicQs32 (void) { - int32x4_t out_int32x4_t; - int32x4_t arg0_int32x4_t; - int32x4_t arg1_int32x4_t; - out_int32x4_t = vbicq_s32 (arg0_int32x4_t, arg1_int32x4_t); } diff --git a/gcc/testsuite/gcc.target/arm/neon/vbicQs64.c b/gcc/testsuite/gcc.target/arm/neon/vbicQs64.c index 87049f1..7b5d05b 100644 --- a/gcc/testsuite/gcc.target/arm/neon/vbicQs64.c +++ b/gcc/testsuite/gcc.target/arm/neon/vbicQs64.c @@ -3,17 +3,17 @@ /* { dg-do assemble } */ /* { dg-require-effective-target arm_neon_ok } */ -/* { dg-options -save-temps -O0 } */ +/* { dg-options -save-temps -O2 } */ /* { dg-add-options arm_neon } */ #include arm_neon.h +int64x2_t out_int64x2_t; +int64x2_t arg0_int64x2_t; +int64x2_t arg1_int64x2_t; + void test_vbicQs64 
(void) { - int64x2_t out_int64x2_t; - int64x2_t arg0_int64x2_t; - int64x2_t arg1_int64x2_t; - out_int64x2_t = vbicq_s64 (arg0_int64x2_t, arg1_int64x2_t); } diff --git a/gcc/testsuite/gcc.target/arm/neon/vbicQs8.c b/gcc/testsuite/gcc.target/arm/neon/vbicQs8.c index 4f64e88..89a882c 100644 --- a/gcc/testsuite/gcc.target/arm/neon/vbicQs8.c +++ b/gcc/testsuite/gcc.target/arm/neon/vbicQs8.c @@ -8,12 +8,12 @@ #include arm_neon.h +int8x16_t out_int8x16_t; +int8x16_t arg0_int8x16_t; +int8x16_t arg1_int8x16_t; + void test_vbicQs8 (void) { - int8x16_t out_int8x16_t; - int8x16_t arg0_int8x16_t; - int8x16_t arg1_int8x16_t; - out_int8x16_t = vbicq_s8 (arg0_int8x16_t, arg1_int8x16_t); } diff --git a/gcc/testsuite/gcc.target/arm/neon/vbicQu16.c b/gcc/testsuite/gcc.target/arm/neon/vbicQu16.c index f92f9b3..51d14a0 100644 ---
Re: [testsuite,ARM] PR61153 Fix vbic and vorn tests
On Wed, Oct 29, 2014 at 3:26 PM, Christophe Lyon christophe.l...@linaro.org wrote: Hi, In PR61153, the vbic and vorn tests fail because when compiled at -O0 the expected Neon instructions are not generated, making scan-assembler fail. This patch: - replaces -O0 by -O2 - moves the declaration of local variables used as intrinsics parameters and results to global declarations, to prevent the compiler from optimizing the whole test away. OK? If you really want to do it , do it in neon-testgen.ml and do it for the whole lot. regards Ramana Christophe. 2014-10-29 Christophe Lyon christophe.l...@linaro.org PR target/61153 * gcc.target/arm/neon/vbicQs16.c: Compile at O2 and move variables declarations from local to global. * gcc.target/arm/neon/vbicQs16.c: Likewise. * gcc.target/arm/neon/vbicQs32.c: Likewise. * gcc.target/arm/neon/vbicQs64.c: Likewise. * gcc.target/arm/neon/vbicQs8.c: Likewise. * gcc.target/arm/neon/vbicQu16.c: Likewise. * gcc.target/arm/neon/vbicQu32.c: Likewise. * gcc.target/arm/neon/vbicQu64.c: Likewise. * gcc.target/arm/neon/vbicQu8.c: Likewise. * gcc.target/arm/neon/vbics16.c: Likewise. * gcc.target/arm/neon/vbics32.c: Likewise. * gcc.target/arm/neon/vbics64.c: Likewise. * gcc.target/arm/neon/vbics8.c: Likewise. * gcc.target/arm/neon/vbicu16.c: Likewise. * gcc.target/arm/neon/vbicu32.c: Likewise. * gcc.target/arm/neon/vbicu64.c: Likewise. * gcc.target/arm/neon/vbicu8.c: Likewise. * gcc.target/arm/neon/vornQs16.c: Likewise. * gcc.target/arm/neon/vornQs32.c: Likewise. * gcc.target/arm/neon/vornQs64.c: Likewise. * gcc.target/arm/neon/vornQs8.c: Likewise. * gcc.target/arm/neon/vornQu16.c: Likewise. * gcc.target/arm/neon/vornQu32.c: Likewise. * gcc.target/arm/neon/vornQu64.c: Likewise. * gcc.target/arm/neon/vornQu8.c: Likewise. * gcc.target/arm/neon/vorns16.c: Likewise. * gcc.target/arm/neon/vorns32.c: Likewise. * gcc.target/arm/neon/vorns64.c: Likewise. * gcc.target/arm/neon/vorns8.c: Likewise. * gcc.target/arm/neon/vornu16.c: Likewise. 
* gcc.target/arm/neon/vornu32.c: Likewise. * gcc.target/arm/neon/vornu64.c: Likewise. * gcc.target/arm/neon/vornu8.c: Likewise.
Re: [Patch 1/6] Hookize MOVE_BY_PIECES_P, remove most uses of MOVE_RATIO
On Wed, Oct 29, 2014 at 11:42:06AM +, Matthew Fortune wrote:

Hi James, I think you have a bug in the following hunk where you pass STORE_MAX_PIECES in place of the optimise for speed flag. I guess you would need an extra argument to pass a different *_MAX_PIECES value in.

Yup, good spot and agreed. I think I'll respin this series and get rid of all the *_BY_PIECES_P in one sweep. I'm thinking of something like:

  use_by_pieces_infrastructure_p (unsigned int size,
                                  unsigned int alignment,
                                  enum by_pieces_mode mode,
                                  bool speed_p)

which will take the type of by_pieces operation as the third parameter.

Thanks,
James

@@ -192,8 +184,7 @@ static void write_complex_part (rtx, rtx, bool);
    called to memcpy storage when the source is a constant string.  */
 #ifndef STORE_BY_PIECES_P
 #define STORE_BY_PIECES_P(SIZE, ALIGN) \
-  (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
-   < (unsigned int) MOVE_RATIO (optimize_insn_for_speed_p ()))
+  (targetm.move_by_pieces_profitable_p (SIZE, ALIGN, STORE_MAX_PIECES))
 #endif
 
 /* This is run to set up which modes can be use
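
One way the proposed single predicate could be shaped is sketched below in plain C. Only the signature comes from the mail above; the enumerators, the per-operation piece sizes, the ratio values and the instruction-count estimate are all hypothetical simplifications (for instance, alignment is taken in bytes here, and the real move_by_pieces_ninsns is considerably more involved).

```c
#include <stdbool.h>

/* Hypothetical enumeration of by-pieces operations.  */
enum by_pieces_mode
{
  MOVE_BY_PIECES,
  STORE_BY_PIECES
};

/* Hypothetical per-operation maximum piece size in bytes, standing in
   for MOVE_MAX_PIECES and STORE_MAX_PIECES.  */
static unsigned int
sketch_max_pieces (enum by_pieces_mode mode)
{
  switch (mode)
    {
    case MOVE_BY_PIECES:
      return 8;
    case STORE_BY_PIECES:
      return 4;
    }
  return 1;
}

/* Crude estimate of the number of instructions: one per piece, where
   the piece size is limited by both alignment and the maximum.  */
static unsigned int
sketch_ninsns (unsigned int size, unsigned int alignment,
               unsigned int max_pieces)
{
  unsigned int piece = alignment < max_pieces ? alignment : max_pieces;
  if (piece == 0)
    piece = 1;
  return (size + piece - 1) / piece;
}

/* The operation type selects the piece size, and SPEED_P selects the
   ratio, so the two can no longer be confused at the call site.  */
bool
use_by_pieces_infrastructure_p (unsigned int size, unsigned int alignment,
                                enum by_pieces_mode mode, bool speed_p)
{
  unsigned int ratio = speed_p ? 15 : 3;  /* Illustrative values only.  */
  return sketch_ninsns (size, alignment, sketch_max_pieces (mode)) < ratio;
}
```

Keeping the operation kind and the speed flag as separate, differently typed parameters is what would rule out the mix-up Matthew spotted, where a *_MAX_PIECES value was passed in the position of the speed flag.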
Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'
On 10/27/14 9:42, Chen Gang wrote:
On 10/27/14 2:22, Michael Eager wrote:

Microblaze-sim provides basic instruction set architecture and memory simulation. There is no operating system support. (It's also quite old. I'm not sure which version of the MB architecture it models, but it is not recent.) Microblaze-sim is not a full system simulator, like QEMU. To be able to run a program which requires glibc, you need to be able to boot a full Linux image on the simulator, which microblaze-sim cannot do. QEMU models an entire processor and can boot a Linux image.

At present, I can run upstream qemu 2.1.2 with the upstream Linux kernel 3.17-rc7 and a simple ramfs successfully. After modifying the ramfs, I can also run a hello world program linked statically against glibc (built by upstream mc_gcc) successfully.

- For ramfs: wget http://www.wiki.xilinx.com/file/view/microblaze_complete.cpio.gz/419243588/microblaze_complete.cpio.gz
- Related qemu command:

  ./microblaze-softmmu/qemu-system-microblaze -M petalogix-s3adsp1800 \
    -kernel ../linux-stable.microblaze/arch/microblaze/boot/linux.bin \
    -no-reboot -append "console=ttyUL0,115200 doreboot" -nographic

Next, I shall try to get our gdb and DejaGNU working with it:

- Find out how to let qemu support networking and rsh (the ramfs needs telnetd, the kernel may need the related driver, and the related qemu hardware needs to be tested).
- Get gdb working with it, then configure DejaGNU (if we need to test programs linked dynamically against glibc, that will fail for now because the glibc version in the ramfs does not match).
- At last, run our tests.

It seems there are still many things to try. Any ideas, suggestions, and additions are welcome (especially for ramfs networking and/or glibc, and DejaGNU configuration ...).

Thanks.

OK, thank you very much, I shall go back to qemu, and will try my best to finish within this month.

Thanks
--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed
RE: [PATCH, PR63307] Fix generation of new declarations in random order
The question remains, are the decls all you need from the traversal (i.e. what you need to act upon)? From my earlier skim of the original code that wasn't that obvious. You can have in decl_map at least also BLOCKs, perhaps types too, what else? Jakub, Seems the BLOCKs are the only exception, they can be added in map by insert_decl_map (id, wd-block, DECL_INITIAL (inner_fn)); in cilk_outline In other cases adding to decl_map is being done through add_variable routine which is called only for DECLs (in extract_free_variables) Your fix for bootstrap looks correct since in cilk_outline we deal with error_mark_node values which are set only for DECLs. Thanks, Igor