[Ping]Re: [PR63762][4.9] Backport the patch which fixes "GCC generates UNPREDICTABLE STR with Rn = Rt for arm"
On 20/11/14 16:17, Renlin Li wrote: Hi all, This is a backport for gcc-4_9-branch of the patch "[PR63762]GCC generates UNPREDICTABLE STR with Rn = Rt for arm" posted in: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html arm-none-eabi has been test on the model, no new issues. bootstrapping and regression tested on x86, no new issues. Is it Okay for gcc-4_9-branch? gcc/ChangeLog: 2014-11-20 Renlin Li PR middle-end/63762 * ira.c (ira): Update preferred class. gcc/testsuite/ChangeLog: 2014-11-20 Renlin Li PR middle-end/63762 * gcc.dg/pr63762.c: New. Ping for it. Regards, Renlin Li
Re: [PATCH] Fix PR lto/64075
On Wed, Nov 26, 2014 at 1:35 AM, Ilya Enkovich wrote: > Hi, > > This patch fixes LTO streamers which were not adjusted when function_code > field was extended up to 12 bits. > > OK for trunk after bootstrap and check? > > Thanks, > Ilya > -- > gcc/ > > 2014-11-26 Ilya Enkovich > > * tree-streamer-in.c (unpack_ts_function_decl_value_fields): Use > proper size for function_code bitfield. > (pack_ts_function_decl_value_fields): Likewise. > > gcc/testsuite/ > > 2014-11-26 Ilya Enkovich > > * gcc.dg/pr64075.c: New. > > > diff --git a/gcc/testsuite/gcc.dg/pr64075.c b/gcc/testsuite/gcc.dg/pr64075.c > new file mode 100644 > index 000..f3c8dc4 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr64075.c > @@ -0,0 +1,8 @@ > +/* PR lto/64075 */ > +/* { dg-do compile } */ > +/* { dg-options "-flto" } */ You should use /* { dg-do compile { target lto } } */ -- H.J.
[PATCH] Fix PR63738
I am testing the following (obvious) patch to avoid generating overlapping life-ranges for SSA names that occur in abnormal PHIs. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-11-26 Richard Biener PR middle-end/63738 * tree-data-ref.c (split_constant_offset_1): Do not follow SSA edges for SSA names with SSA_NAME_OCCURS_IN_ABNORMAL_PHI. * gcc.dg/torture/pr63738.c: New testcase. Index: gcc/tree-data-ref.c === --- gcc/tree-data-ref.c (revision 218076) +++ gcc/tree-data-ref.c (working copy) @@ -674,6 +674,9 @@ split_constant_offset_1 (tree type, tree case SSA_NAME: { + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op0)) + return false; + gimple def_stmt = SSA_NAME_DEF_STMT (op0); enum tree_code subcode; Index: gcc/testsuite/gcc.dg/torture/pr63738.c === --- gcc/testsuite/gcc.dg/torture/pr63738.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr63738.c (working copy) @@ -0,0 +1,27 @@ +/* { dg-do compile } */ + +#include + +struct longjmp_buffer { + jmp_buf buf; +}; + +void plouf(); + +extern long interprete() +{ + long * sp; + int i; + long *args; + int n; + + struct longjmp_buffer raise_buf; + _setjmp (raise_buf.buf); + + plouf(); + sp -= 4; + for (i = 0; i < n; i++) +args[i] = sp[10-i]; + plouf(); + return 0; +}
Re: [PATCH 2/2] PR debug/38757 continued. Handle C11, C++11 and C++14.
On Wed, Nov 26, 2014 at 11:19:42AM +0100, Mark Wielaard wrote: > Ping. Rebased patch attached. > > I have submitted patches for elfutils, valgrind, gdb and binutils for > this. But they are pending till this patch hits GCC first. Ok, thanks. Jakub
Re: [PATCH 2/2] PR debug/38757 continued. Handle C11, C++11 and C++14.
On Fri, 2014-11-21 at 21:34 +0100, Mark Wielaard wrote: > On Fri, Nov 21, 2014 at 09:28:45AM +0100, Jakub Jelinek wrote: > > I think best would be to tweak > > if (value < 2 || value > 4) > > error_at (loc, "dwarf version %d is not supported", value); > > else > > opts->x_dwarf_version = value; > > so that we accept value 5 too, and for now, until the > > most common consumers are changed, use > > if (dwarf_version >= 5 /* || !dwarf_strict */) > > so that > > - you can actually use it in the test with -gdwarf-5 > > - you can commit it right away > > - people can start playing with what it will mean to support DWARF5 > > > > GCC 4.5 also allowed -gdwarf-4 even when DWARF4 has not been released yet. > > When there are consumers that can grok it, we can uncomment the > > || !dwarf_strict. > > That makes sense and would be convenient for me. > > I made the change in opts.c and added some minimal documentation. > And made sure we only emit the new DWARFv5 language values, but not yet > anything else (the table header format has changed for debug_info and > debug_line in v5, but we don't emit new style headers yet). The testcases > were updated to explicitly add -gdwarf-5. > > > >else if (strncmp (language_string, "GNU C", 5) == 0) > > > { > > >language = DW_LANG_C89; > > >if (dwarf_version >= 3 || !dwarf_strict) > > > - if (strcmp (language_string, "GNU C99") == 0) > > > - language = DW_LANG_C99; > > > + { > > > + if (strcmp (language_string, "GNU C89") != 0) > > > + language = DW_LANG_C99; > > > + > > > + if (dwarf_version >= 5 || !dwarf_strict) > > > + if (strcmp (language_string, "GNU C11") == 0) > > > + language = DW_LANG_C11; > > > + } > > > > Shouldn't we emit at least DW_LANG_C99 for GNU C11 if > > not dwarf_version >= 5 /* || !dwarf_strict */ but > > dwarf_version >= 3 || !dwarf_strict is true? > > Yes, that is the intention. If it is a versioned GNU C then it is > at least DW_LANG_C89, if we have -gdwarf-3 or higher and it isn't > GNU C89 then it is at least DW_LANG_C99 and if we have -gdwarf-5 > and it is GNU C11 then we emit DW_LANG_C11. > > I added an explicit testcase for this. > > > BTW, noticed we don't have anything for Fortran 2003 and 2008, > > filed a DWARF Issue for that. > > Thanks. I have only focussed on C and C++ because I don't know anything > about version changes in other language standards. > > With the above change everything keeps working fine. You only need a > patched GDB when explicitly using -gdwarf-5. > > OK to commit? Ping. Rebased patch attached. I have submitted patches for elfutils, valgrind, gdb and binutils for this. But they are pending till this patch hits GCC first. Thanks, Mark From a9c0a73bc74a7a25edd449cf42d179c16fbc13f4 Mon Sep 17 00:00:00 2001 From: Mark Wielaard Date: Fri, 21 Nov 2014 21:18:00 +0100 Subject: [PATCH] PR debug/38757 continued. Handle C11, C++11 and C++14. Add experimental (minimal) DWARFv5 support. This change depends on the new DWARFv5 constants mentioned in the following draft: http://dwarfstd.org/doc/dwarf5.20141029.pdf gcc/ChangeLog * doc/invoke.texi (-gdwarf-@{version}): Mention experimental DWARFv5. * opts.c (common_handle_option): Accept -gdwarf-5. * dwarf2out.c (is_cxx): Add DW_LANG_C_plus_plus_11 and DW_LANG_C_plus_plus_14. (lower_bound_default): Likewise. Plus DW_LANG_C11. (gen_compile_unit_die): Output DW_LANG_C_plus_plus_11, DW_LANG_C_plus_plus_14 or DW_LANG_C11. (output_compilation_unit_header): Output at most a DWARFv4 header. (output_skeleton_debug_sections): Likewise. (output_line_info): Likewise. (output_aranges): Document header version number. gcc/testsuite/ChangeLog * gcc.dg/debug/dwarf2/lang-c11.c: New test. * gcc.dg/debug/dwarf2/lang-c11-d4-strict.c: Likewise. * g++.dg/debug/dwarf2/lang-cpp11.C: Likewise. * g++.dg/debug/dwarf2/lang-cpp14.C: Likewise. * g++.dg/debug/dwarf2/lang-cpp98.C: Likewise. include/ChangeLog * dwarf2.h: Add DW_LANG_C_plus_plus_11, DW_LANG_C11 and DW_LANG_C_plus_plus_14. --- gcc/ChangeLog | 14 +++ gcc/doc/invoke.texi| 4 +- gcc/dwarf2out.c| 45 +- gcc/opts.c | 2 +- gcc/testsuite/ChangeLog| 8 gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp11.C | 6 +++ gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp14.C | 6 +++ gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp98.C | 6 +++ .../gcc.dg/debug/dwarf2/lang-c11-d4-strict.c | 7 gcc/testsuite/gcc.dg/debug/dwarf2/lang-c11.c | 6 +++ include/ChangeLog | 5 +++ include/dwarf2.h | 4 ++ 12 files changed, 101 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp11.C create mode 100644 gcc/testsuite/g++.dg/debug/dwarf2/lang
[PATCH] Fix overly restrictive condition in get_symbol_constant_value
The following fixes an overly restrictive condition on the zeros we produce when folding a read from a zero-initialized global. Now it matches what we allow elsewhere and what is useful. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2014-11-26 Richard Biener * gimple-fold.c (get_symbol_constant_value): Allow all GIMPLE register type zero-constants. Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c (revision 218073) +++ gcc/gimple-fold.c (working copy) @@ -254,8 +254,7 @@ get_symbol_constant_value (tree sym) have zero as the initializer if they may not be overridden at link or run time. */ if (!val - && (INTEGRAL_TYPE_P (TREE_TYPE (sym)) - || SCALAR_FLOAT_TYPE_P (TREE_TYPE (sym + && is_gimple_reg_type (TREE_TYPE (sym))) return build_zero_cst (TREE_TYPE (sym)); }
Re: [PATCH][AArch64] Implement vsqrt_f64 intrinsic
Hi Kyrill, On 21 November 2014 at 16:52, Marcus Shawcroft wrote: > On 17 November 2014 17:35, Kyrill Tkachov wrote: > >> 2014-11-17 Kyrylo Tkachov >> >> * config/aarch64/arm_neon.h (vsqrt_f64): New intrinsic. >> >> 2014-11-17 Kyrylo Tkachov >> >> * gcc.target/aarch64/simd/vsqrt_f64_1.c > > OK /Marcus Your new test fails at the scan-assembly step because all the code is optimized away (even at -O1). Christophe.
Re: [PATCH] Fix PR lto/64075
On Wed, Nov 26, 2014 at 10:35 AM, Ilya Enkovich wrote: > Hi, > > This patch fixes LTO streamers which were not adjusted when function_code > field was extended up to 12 bits. > > OK for trunk after bootstrap and check? Ok. Thanks, Richard. > Thanks, > Ilya > -- > gcc/ > > 2014-11-26 Ilya Enkovich > > * tree-streamer-in.c (unpack_ts_function_decl_value_fields): Use > proper size for function_code bitfield. > (pack_ts_function_decl_value_fields): Likewise. > > gcc/testsuite/ > > 2014-11-26 Ilya Enkovich > > * gcc.dg/pr64075.c: New. > > > diff --git a/gcc/testsuite/gcc.dg/pr64075.c b/gcc/testsuite/gcc.dg/pr64075.c > new file mode 100644 > index 000..f3c8dc4 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr64075.c > @@ -0,0 +1,8 @@ > +/* PR lto/64075 */ > +/* { dg-do compile } */ > +/* { dg-options "-flto" } */ > + > +_Complex float test (float a, float b, float c, float d) > +{ > + return 1.0iF; > +} > diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c > index 99448dd..eb205ed 100644 > --- a/gcc/tree-streamer-in.c > +++ b/gcc/tree-streamer-in.c > @@ -333,7 +333,7 @@ unpack_ts_function_decl_value_fields (struct bitpack_d > *bp, tree expr) >if (DECL_BUILT_IN_CLASS (expr) != NOT_BUILT_IN) > { >DECL_FUNCTION_CODE (expr) = (enum built_in_function) bp_unpack_value > (bp, > - > 11); > + > 12); >if (DECL_BUILT_IN_CLASS (expr) == BUILT_IN_NORMAL > && DECL_FUNCTION_CODE (expr) >= END_BUILTINS) > fatal_error ("machine independent builtin code out of range"); > diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c > index ad58b84..0d87cff 100644 > --- a/gcc/tree-streamer-out.c > +++ b/gcc/tree-streamer-out.c > @@ -300,7 +300,7 @@ pack_ts_function_decl_value_fields (struct bitpack_d *bp, > tree expr) >bp_pack_value (bp, DECL_PURE_P (expr), 1); >bp_pack_value (bp, DECL_LOOPING_CONST_OR_PURE_P (expr), 1); >if (DECL_BUILT_IN_CLASS (expr) != NOT_BUILT_IN) > -bp_pack_value (bp, DECL_FUNCTION_CODE (expr), 11); > +bp_pack_value (bp, DECL_FUNCTION_CODE (expr), 12); > } > >
Re: [PATCH 03/08] PR jit/63854: Fix leak in real.c for i386:init_ext_80387_constants
On Wed, Nov 26, 2014 at 2:39 AM, David Malcolm wrote: > Valgrind of testsuite/jit.dg/test-types.c showed this leak in real.c > when converting the strings in init_ext_80387_constants to real. > > 160 bytes in 5 blocks are definitely lost in loss record 89 of 144 >at 0x4A0645D: malloc (in > /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) >by 0x30AF40C368: __gmp_default_allocate (memory.c:44) >by 0x309842EA20: mpfr_init2 (init2.c:55) >by 0x527C033: real_from_string(real_value*, char const*) (real.c:2045) >by 0x56F0CEF: init_ext_80387_constants() (i386.c:9200) >by 0x56F0E4A: standard_80387_constant_p(rtx_def*) (i386.c:9237) >by 0x5740115: ix86_rtx_costs(rtx_def*, int, int, int, int*, bool) > (i386.c:41735) >by 0x52E4417: rtx_cost(rtx_def*, rtx_code, int, bool) (rtlanal.c:3855) >by 0x4F28E9C: set_src_cost(rtx_def*, bool) (rtl.h:2111) >by 0x4F33AFB: compress_float_constant(rtx_def*, rtx_def*) (expr.c:3667) >by 0x4F3380E: emit_move_insn(rtx_def*, rtx_def*) (expr.c:3590) >by 0x4F3A1B6: store_expr_with_bounds(tree_node*, rtx_def*, int, bool, > tree_node*) (expr.c:5540) > > Fix it. Ok. Thanks, Richard. > gcc/ChangeLog: > PR jit/63854 > * real.c (real_from_string): Add missing mpfr_clear. > --- > gcc/real.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/gcc/real.c b/gcc/real.c > index cebbe8d..5e8050d 100644 > --- a/gcc/real.c > +++ b/gcc/real.c > @@ -2067,6 +2067,7 @@ real_from_string (REAL_VALUE_TYPE *r, const char *str) > gcc_assert (r->cl = rvc_normal); > /* Set a sticky bit if mpfr_strtofr was inexact. */ > r->sig[0] |= inexact; > + mpfr_clear (m); > } > } > > -- > 1.8.5.3 >
Re: [PATCH 01/08] PR jit/63854: Fix leak in tree-ssa-math-opts.c
On Wed, Nov 26, 2014 at 2:39 AM, David Malcolm wrote: > Running testsuite/jit.dg/test-functions.c under valgrind showed this > leak (amongst others): > > 400 bytes in 10 blocks are definitely lost in loss record 142 of 181 >at 0x4A0645D: malloc (in > /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) >by 0x5DCDF2F: xrealloc (xmalloc.c:177) >by 0x53726CE: void > va_heap::reserve(vec vl_embed>*&, unsigned int, bool) (vec.h:310) >by 0x5371BB1: vec vl_ptr>::reserve(unsigned int, bool) (vec.h:1428) >by 0x5370F5D: vec vl_ptr>::safe_push(gimple_statement_base* const&) (vec.h:1537) >by 0x5523E71: maybe_record_sincos(vec vl_ptr>*, basic_block_def**, gimple_statement_base*) > (tree-ssa-math-opts.c:718) >by 0x552403E: execute_cse_sincos_1(tree_node*) (tree-ssa-math-opts.c:760) >by 0x5526224: (anonymous namespace)::pass_cse_sincos::execute(function*) > (tree-ssa-math-opts.c:1497) >by 0x5250095: execute_one_pass(opt_pass*) (passes.c:2311) >by 0x525030C: execute_pass_list_1(opt_pass*) (passes.c:2363) >by 0x525033D: execute_pass_list_1(opt_pass*) (passes.c:2364) >by 0x525037A: execute_pass_list(function*, opt_pass*) (passes.c:2374) > > For some reason (which I've filed for myself as PR jit/64020), this > code was bailing out: > > fndecl = mathfn_built_in (type, BUILT_IN_CEXPI); > if (!fndecl) > return false; > > That exit path is missing a: > stmts.release (); > and thus is leaking the buffer of stmts on the way out. > > Fix it by converting stmts from vec<> to auto_vec<>, to avoid the need to > have handwritten release calls on every exit path. Ok. Thanks, Richard. > gcc/ChangeLog: > PR jit/63854 > * tree-ssa-math-opts.c (execute_cse_sincos_1): Fix a missing > release of stmts by converting it to an auto_vec. > --- > gcc/tree-ssa-math-opts.c | 9 ++--- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c > index f9c30bf..5e08c7ee 100644 > --- a/gcc/tree-ssa-math-opts.c > +++ b/gcc/tree-ssa-math-opts.c > @@ -740,7 +740,7 @@ execute_cse_sincos_1 (tree name) >tree fndecl, res, type; >gimple def_stmt, use_stmt, stmt; >int seen_cos = 0, seen_sin = 0, seen_cexpi = 0; > - vec stmts = vNULL; > + auto_vec stmts; >basic_block top_bb = NULL; >int i; >bool cfg_changed = false; > @@ -773,10 +773,7 @@ execute_cse_sincos_1 (tree name) > } > >if (seen_cos + seen_sin + seen_cexpi <= 1) > -{ > - stmts.release (); > - return false; > -} > +return false; > >/* Simply insert cexpi at the beginning of top_bb but not earlier than > the name def statement. */ > @@ -835,8 +832,6 @@ execute_cse_sincos_1 (tree name) > cfg_changed = true; > } > > - stmts.release (); > - >return cfg_changed; > } > > -- > 1.8.5.3 >
[PATCH] Fix PR lto/64075
Hi, This patch fixes LTO streamers which were not adjusted when function_code field was extended up to 12 bits. OK for trunk after bootstrap and check? Thanks, Ilya -- gcc/ 2014-11-26 Ilya Enkovich * tree-streamer-in.c (unpack_ts_function_decl_value_fields): Use proper size for function_code bitfield. (pack_ts_function_decl_value_fields): Likewise. gcc/testsuite/ 2014-11-26 Ilya Enkovich * gcc.dg/pr64075.c: New. diff --git a/gcc/testsuite/gcc.dg/pr64075.c b/gcc/testsuite/gcc.dg/pr64075.c new file mode 100644 index 000..f3c8dc4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr64075.c @@ -0,0 +1,8 @@ +/* PR lto/64075 */ +/* { dg-do compile } */ +/* { dg-options "-flto" } */ + +_Complex float test (float a, float b, float c, float d) +{ + return 1.0iF; +} diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c index 99448dd..eb205ed 100644 --- a/gcc/tree-streamer-in.c +++ b/gcc/tree-streamer-in.c @@ -333,7 +333,7 @@ unpack_ts_function_decl_value_fields (struct bitpack_d *bp, tree expr) if (DECL_BUILT_IN_CLASS (expr) != NOT_BUILT_IN) { DECL_FUNCTION_CODE (expr) = (enum built_in_function) bp_unpack_value (bp, - 11); + 12); if (DECL_BUILT_IN_CLASS (expr) == BUILT_IN_NORMAL && DECL_FUNCTION_CODE (expr) >= END_BUILTINS) fatal_error ("machine independent builtin code out of range"); diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c index ad58b84..0d87cff 100644 --- a/gcc/tree-streamer-out.c +++ b/gcc/tree-streamer-out.c @@ -300,7 +300,7 @@ pack_ts_function_decl_value_fields (struct bitpack_d *bp, tree expr) bp_pack_value (bp, DECL_PURE_P (expr), 1); bp_pack_value (bp, DECL_LOOPING_CONST_OR_PURE_P (expr), 1); if (DECL_BUILT_IN_CLASS (expr) != NOT_BUILT_IN) -bp_pack_value (bp, DECL_FUNCTION_CODE (expr), 11); +bp_pack_value (bp, DECL_FUNCTION_CODE (expr), 12); }
Re: [PATCH v3] gcc/c-family/c-cppbuiltin.c: Let buffer enough to print host wide integer value
On 11/26/14 15:33, Jakub Jelinek wrote: > On Wed, Nov 26, 2014 at 09:41:16AM +0800, Chen Gang wrote: >> On 11/26/14 8:31, Joseph Myers wrote: >>> On Wed, 26 Nov 2014, Chen Gang wrote: >>> + gcc_assert (wi::fits_to_tree_p (value, char_type_node) +|| wi::fits_to_tree_p (value, short_integer_type_node) +|| wi::fits_to_tree_p (value, integer_type_node) +|| wi::fits_to_tree_p (value, long_integer_type_node) +|| wi::fits_to_tree_p (value, long_long_integer_type_node)); >>> >>> It doesn't make sense to check for char or short, since you can't write a >>> constant of one of those types. And it doesn't make sense to check for >>> int or long when checking for long long, as the ranges of int and long are >>> subsets of that of long long. So just check long long here. >>> + buf = (char *) alloca (strlen (macro) + vlen + extra); + + sprintf (buf, "%s=%s"HOST_WIDE_INT_PRINT_DEC"%s%s", > > Oh, and please use spaces around HOST_WIDE_INT_PRINT_DEC etc., > with C++11 user defined literals it is always better to have whitespace > in there. > OK, thanks, I shall notice about it when send patch v4. Thanks. -- Chen Gang Open, share, and attitude like air, water, and life which God blessed
RE: [PATCH, AARCH64] Fix ICE in CCMP (PR64015)
> -Original Message- > From: Richard Henderson [mailto:r...@redhat.com] > Sent: Tuesday, November 25, 2014 5:25 PM > To: Zhenqiang Chen > Cc: Marcus Shawcroft; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH, AARCH64] Fix ICE in CCMP (PR64015) > > On 11/25/2014 09:41 AM, Zhenqiang Chen wrote: > > I want to confirm with you two things before I rework it. > > (1) expand_insn needs an optab_handler as input. Do I need to define a > ccmp_optab with different mode support in optabs.def? > > No, look again: expand_insn needs an enum insn_code as input. Since this is > the backend, you can use any icode name you like, which means that you can > use CODE_FOR_ccmp_and etc directly. > > > (2) To make sure later operands not clobber CC, all operands are expanded > before ccmp-first in current implementation. If taking tree/gimple as input, > what's your preferred logic to guarantee CC not clobbered? > > Hmm. Perhaps the target hook will need to output two sequences, each of > which will be concatenated while looping around the calls to gen_ccmp_next. > The first sequence will be operand preparation and the second sequence will > be ccmp generation. > > Something like > > bool > aarch64_gen_ccmp_start(rtx *prep_seq, rtx *gen_seq, >int cmp_code, int bit_code, >tree op0, tree op1) { > bool success; > > start_sequence (); > // Widen and expand operands > *prep_seq = get_insns (); > end_sequence (); > > start_sequence (); > // Generate the first compare > *gen_seq = get_insns (); > end_sequence (); > > return success; > } > > bool > aarch64_gen_ccmp_next(rtx *prep_seq, rtx *gen_seq, > rtx prev, int cmp_code, int bit_code, > tree op0, tree op1) { > bool success; > > push_to_sequence (*prep_seq); > // Widen and expand operands > *prep_seq = get_insns (); > end_sequence (); > > push_to_sequence (*gen_seq); > // Generate the next ccmp > *gen_seq = get_insns (); > end_sequence (); > > return success; > } > > If there are ever any failures, the middle-end can simply discard the > sequences. If everything succeeds, it simply calls emit_insn on both > sequences. When there are more than one ccmps, it will be complexity to maintain the un-emitted sequences in a recursive algorithm. E.g. CC0 = CMP (a, b); CC1 = CCMP (NE (CC0, 0), CMP (e, f)); ... CCn = CCMP (NE (CCn-1, 0), CMP (...)); I read the codes to expand cstoresi and cbranchsi. It just uses normal expand_operands. So I think we can keep expand_operands in middle-end. For the start one, we do not need worry about it since its last step should compare to set CC. And it is easy to delete the insns when fail. static rtx aarch64_gen_ccmp_first (int code, rtx op0, rtx op1) { ... // init the vars op0 = prepare_operand (icode, op0, 2, op_mode, cmp_mode, unsignedp); op1 = prepare_operand (icode, op1, 3, op_mode, cmp_mode, unsignedp); if (!op0 || !op1) return NULL_RTX; cmp = gen_rtx_fmt_ee ((enum rtx_code) code, cmp_mode, op0, op1); target = gen_rtx_REG (CCmode, CC_REGNUM); create_output_operand (&ops[0], target, CCmode); create_fixed_operand (&ops[1], cmp); create_fixed_operand (&ops[2], op0); create_fixed_operand (&ops[3], op1); if (!maybe_expand_insn (icode, 4, ops)) return NULL_RTX; return gen_rtx_REG (cc_mode, CC_REGNUM); } aarch64_gen_ccmp_next (rtx prev, int cmp_code, rtx op0, rtx op1, int bit_code) { ... // init the vars op0 = prepare_operand (icode, op0, 2, op_mode, cmp_mode, unsignedp); op1 = prepare_operand (icode, op1, 3, op_mode, cmp_mode, unsignedp); if (!op0 || !op1) return NULL_RTX; /* Check to make sure CC is not clobbered since prepare_operand might generates copy or mode convertion insns, although no test shows such insns clobber CC. */ ... cmp1 = gen_rtx_fmt_ee ((enum rtx_code) cmp_code, cmp_mode, op0, op1); cmp0 = gen_rtx_fmt_ee (NE, cmp_mode, prev, const0_rtx); target = gen_rtx_REG (cc_mode, CC_REGNUM); create_fixed_operand (&ops[0], prev); create_fixed_operand (&ops[1], target); create_fixed_operand (&ops[2], op0); create_fixed_operand (&ops[3], op1); create_fixed_operand (&ops[4], cmp0); create_fixed_operand (&ops[5], cmp1); if (!maybe_expand_insn (icode, 6, ops)) return NULL_RTX; return target; } Does such change align with your comments? Thanks! -Zhenqiang
Re: [PATCH] Enhance ASAN_CHECK optimization
On Tue, Nov 25, 2014 at 08:06:00PM +0300, Yury Gribov wrote: > +/* Traits class for tree hash maps below. */ > + > +struct tree_map_traits : default_hashmap_traits > +{ > + static inline hashval_t hash (const_tree ref) > +{ > + return iterative_hash_expr (ref, 0); > +} > + > + static inline bool equal_keys (const_tree ref1, const_tree ref2) > +{ > + return operand_equal_p (ref1, ref2, 0); > +} Formatting. The {} should be indented like static and return 2 columns to the right of that. > @@ -281,37 +316,46 @@ maybe_optimize_asan_check_ifn (struct sanopt_ctx *ctx, > gimple stmt) > >gimple_set_uid (stmt, info->freeing_call_events); > > - auto_vec &v = ctx->asan_check_map.get_or_insert (ptr); > - if (v.is_empty ()) > + auto_vec *ptr_checks = &ctx->asan_check_map.get_or_insert (ptr); > + gimple g = maybe_get_dominating_check (*ptr_checks); > + > + tree base_addr = maybe_get_single_definition (ptr); > + auto_vec *base_checks = NULL; > + if (base_addr) > { > - /* For this PTR we don't have any ASAN_CHECK stmts recorded, so there's > - nothing to optimize yet. */ > - v.safe_push (stmt); > - return false; > + base_checks = &ctx->asan_check_map.get_or_insert (base_addr); > + /* Original pointer might have been invalidated. */ > + ptr_checks = ctx->asan_check_map.get (ptr); > } For base_addr computation, you don't really need g or ptr_checks, do you? So why not move the: auto_vec *ptr_checks = &ctx->asan_check_map.get_or_insert (ptr); gimple g = maybe_get_dominating_check (*ptr_checks); lines below the if? > @@ -404,10 +445,7 @@ sanopt_optimize_walker (basic_block bb, struct > sanopt_ctx *ctx) >basic_block son; >gimple_stmt_iterator gsi; >sanopt_info *info = (sanopt_info *) bb->aux; > - bool asan_check_optimize > -= (flag_sanitize & SANITIZE_ADDRESS) > - && ((flag_sanitize & flag_sanitize_recover > -& SANITIZE_KERNEL_ADDRESS) == 0); > + bool asan_check_optimize = (flag_sanitize & SANITIZE_ADDRESS) != 0; > >for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) > { I'm afraid I'm not convinced about this hunk. If asan (kernel-address) is recovering, I don't see a difference from not reporting two different invalid accesses to the same function and not reporting two integer overflows in the same function, at least if they have different location_t. Jakub
Re: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc inside)
On Wed, Oct 29, 2014 at 6:34 AM, Zoran Jovanovic wrote: > Hello, > This is new patch version in which reported issue is fixed. > Also, patch is rebased to the revision 216452 and some minor code clean-up is > done. FYI. This causes gfc_add_interface_mapping in fortrant/trans-expr.c to be miscompiled for aarch64-linux-gnu. I am still debugging it and trying to get a smaller testcase. Thanks, Andrew > > -- > > Lowering is applied only for bit-fields copy sequences that are merged. > Data structure representing bit-field copy sequences is renamed and reduced > in size. > Optimization turned on by default for -O2 and higher. > Some comments fixed. > > Benchmarking performed on WebKit for Android. > Code size reduction noticed on several files, best examples are: > > core/rendering/style/StyleMultiColData (632->520 bytes) > core/platform/graphics/FontDescription (1715->1475 bytes) > core/rendering/style/FillLayer (5069->4513 bytes) > core/rendering/style/StyleRareInheritedData (5618->5346) > core/css/CSSSelectorList(4047->3887) > core/platform/animation/CSSAnimationData (3844->3440 bytes) > core/css/resolver/FontBuilder (13818->13350 bytes) > core/platform/graphics/Font (16447->15975 bytes) > > > Example: > > One of the motivating examples for this work was copy constructor of the > class which contains bit-fields. > > C++ code: > class A > { > public: > A(const A &x); > unsigned a : 1; > unsigned b : 2; > unsigned c : 4; > }; > > A::A(const A&x) > { > a = x.a; > b = x.b; > c = x.c; > } > > GIMPLE code without optimization: > > : > _3 = x_2(D)->a; > this_4(D)->a = _3; > _6 = x_2(D)->b; > this_4(D)->b = _6; > _8 = x_2(D)->c; > this_4(D)->c = _8; > return; > > Optimized GIMPLE code: > : > _10 = x_2(D)->D.1867; > _11 = BIT_FIELD_REF <_10, 7, 0>; > _12 = this_4(D)->D.1867; > _13 = _12 & 128; > _14 = (unsigned char) _11; > _15 = _13 | _14; > this_4(D)->D.1867 = _15; > return; > > Generated MIPS32r2 assembly code without optimization: > lw $3,0($5) > lbu $2,0($4) > andi$3,$3,0x1 > andi$2,$2,0xfe > or $2,$2,$3 > sb $2,0($4) > lw $3,0($5) > andi$2,$2,0xf9 > andi$3,$3,0x6 > or $2,$2,$3 > sb $2,0($4) > lw $3,0($5) > andi$2,$2,0x87 > andi$3,$3,0x78 > or $2,$2,$3 > j $31 > sb $2,0($4) > > Optimized MIPS32r2 assembly code: > lw $3,0($5) > lbu $2,0($4) > andi$3,$3,0x7f > andi$2,$2,0x80 > or $2,$3,$2 > j $31 > sb $2,0($4) > > > Algorithm works on basic block level and consists of following 3 major steps: > 1. Go through basic block statements list. If there are statement pairs that > implement copy of bit field content from one memory location to another > record statements pointers and other necessary data in corresponding data > structure. > 2. Identify records that represent adjacent bit field accesses and mark them > as merged. > 3. Lower bit-field accesses by using new field size for those that can be > merged. > > > New command line option "-fmerge-bitfields" is introduced. > > > Tested - passed gcc regression tests for MIPS32r2. > > > Changelog - > > gcc/ChangeLog: > 2014-04-22 Zoran Jovanovic (zoran.jovano...@imgtec.com) > * common.opt (fmerge-bitfields): New option. > * doc/invoke.texi: Add reference to "-fmerge-bitfields". > * doc/invoke.texi: Add "-fmerge-bitfields" to the list of optimization > flags turned on at -O2. > * tree-sra.c (lower_bitfields): New function. > Entry for (-fmerge-bitfields). > (part_of_union_p): New function. > (bf_access_candidate_p): New function. > (lower_bitfield_read): New function. > (lower_bitfield_write): New function. > (bitfield_stmt_bfcopy_pair::hash): New function. > (bitfield_stmt_bfcopy_pair::equal): New function. > (bitfield_stmt_bfcopy_pair::remove): New function. > (create_and_insert_bfcopy): New function. > (get_bit_offset): New function. > (add_stmt_bfcopy_pair): New function. > (cmp_bfcopies): New function. > (get_merged_bit_field_size): New function. > * dwarf2out.c (simple_type_size_in_bits): Move to tree.c. > (field_byte_offset): Move declaration to tree.h and make it extern. > * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test. > * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test. > * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c. > * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h. > * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c. > (simple_type_size_in_bits): Move from dwarf2out.c. > * tree.h (expressions_equal_p): Add declaration. > (field_byte_offset
[Committed] Add a new testcase
Hi, While working on an aarch64 patch, I ran into wrong code produced by my patch. I am not ready to submit the patch yet but since I reduced the testcase and there was no testcase in the testsuite yet, I thought I commit the testcase. This testcase is reduced from aarch64_float_const_representable_p in aarch64.c. Thanks, Andrew Pinski ChangeLog: * gcc.c-torture/execute/20141125-1.c: New testcase. Index: testsuite/gcc.c-torture/execute/20141125-1.c === --- testsuite/gcc.c-torture/execute/20141125-1.c(revision 0) +++ testsuite/gcc.c-torture/execute/20141125-1.c(revision 0) @@ -0,0 +1,17 @@ +int f(long long a) __attribute__((noinline,noclone)); +int f(long long a) +{ + if (a & 0x3ffull) +return 1; + return 1024; +} + +int main(void) +{ + if(f(0x48375d80ull) != 1) +__builtin_abort (); + if (f(0xfc00ull) != 1024) +__builtin_abort (); + return 0; +} +
RE: [PATCH][MIPS] Fix P5600 memory cost
Committed with ChangeLog entry fixes. Prachi -Original Message- From: Matthew Fortune Sent: Wednesday, November 5, 2014 4:07 PM To: Prachi Godbole; gcc-patches@gcc.gnu.org Subject: RE: [PATCH][MIPS] Fix P5600 memory cost > The patch below fixes the memory cost for P5600. > > ChangeLog: > 2014-11-05 Prachi Godbole > > * config/mips/mips.c (mips_rtx_cost_data): Fix memory_letency cost for > p5600. Please follow these instructions to add yourself to MAINTAINERS in the write-after-approval section now that you have write access to GCC: https://gcc.gnu.org/svnwrite.html#authenticated OK with fixes to the changelog entry: latency not latency. Remember to tab in the changelog entry and split the line as it will exceed 80 chars. Also two spaces between the date/name and name/email. E.g. 2014-11-05 Prachi Godbole * config/mips/mips.c (mips_rtx_cost_data): Fix memory_latency cost for p5600. Thanks, Matthew > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index > af6a913..558ba2f 100644 > --- a/gcc/config/mips/mips.c > +++ b/gcc/config/mips/mips.c > @@ -1193,7 +1193,7 @@ static const struct mips_rtx_cost_data > COSTS_N_INSNS (8),/* int_div_si */ > COSTS_N_INSNS (8),/* int_div_di */ > 2,/* branch_cost */ > - 10 /* memory_latency */ > + 4 /* memory_latency */ >} > }; > ^L
Patch ping^2: [PATCH] -fsanitize=vptr instrumentation (take 2)
On Wed, Nov 12, 2014 at 03:05:46PM +0100, Jakub Jelinek wrote: > On Tue, Oct 28, 2014 at 01:44:50PM +0100, Jakub Jelinek wrote: > > On Mon, Oct 27, 2014 at 05:16:05PM +0100, Jakub Jelinek wrote: > > > Here is an updated patch, ok if bootstrap/testing passes (so far just > > > checked with > > > make -j16 -k check RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} > > > asan.exp tsan.exp ubsan.exp' > > > )? > > I'd like to ping the > https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02945.html > patch. Ping. Jakub
Re: [PATCH 04/08] PR jit/63854: Remove xstrdup from ipa/cgraph fprintf calls
Hello! > cgraph*.c and ipa-*.c use xstrdup on strings when dumping them via > fprintf, leaking all of the duplicated buffers. > > Is/was there a reason for doing this? Yes, please see [1] and PR 53136 [2]. As said in [1]: "There is a problem with multiple calls of cgraph_node_name in fprintf dumps. Please note that C++ uses caching in cxx_printable_name_internal (aka LANG_HOOKS_DECL_PRINTABLE_NAME), so when cxx_printable_name_internal is called multiple times from printf (i.e. fprintf "%s/%i -> %s/%i"), it can happen that the first string gets evicted by the second call, before fprintf is fully evaluated." > Taking them out fixes these leaks (seen when dumping is enabled): But you will get "Invalid read of size X" instead. The patch at [1] fixed these, but introduced memory leaks, which were tolerable at the time: "I think that small memory leak is tolerable here (the changes are exclusively in the dump code), and follows the same approach as in java frontend." It seems that these assumptions are not valid anymore. [1] https://gcc.gnu.org/ml/gcc-patches/2012-04/msg01904.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53136 Uros.