[PATCH] Fix a test with bogus size_t type
We are now more strict when accepting user-defined initializer_lists; in particular, we now require sizetype, not just any integral type. The following test failed with -m32, because it had a bogus type of size_t, with -m32 it usually should be unsigned int, not unsigned long. Test passes now with both -m32/-m64, ok for trunk? 2015-04-02 Marek Polacek pola...@redhat.com * g++.dg/cpp0x/pr57101.C: Use proper type for size_t. diff --git gcc/testsuite/g++.dg/cpp0x/pr57101.C gcc/testsuite/g++.dg/cpp0x/pr57101.C index 94b576f..c0fc966 100644 --- gcc/testsuite/g++.dg/cpp0x/pr57101.C +++ gcc/testsuite/g++.dg/cpp0x/pr57101.C @@ -1,7 +1,7 @@ // { dg-do compile { target c++11 } } // { dg-options -fcompare-debug } -typedef long unsigned size_t; +typedef __SIZE_TYPE__ size_t; namespace { template typename _Tp, _Tp __v struct integral_constant Marek
[PATCH] Fix PR65650 (1/n in merging CCP and copyprop)
The following makes CCP track copies which avoids pass ordering issues between CCP and copyprop as seen from the testcase. Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1. For stage1 I'd like to get rid of copyprop completely, a 2nd patch in the series will remove the copyprop instances immediately preceeding/following CCP. CCP needs some TLC and I'm going to apply that during stage1. Thanks, Richard. 2015-04-02 Richard Biener rguent...@suse.de PR tree-optimization/65650 * tree-ssa-ccp.c (valid_lattice_transition): Allow lattice transitions involving copies. (set_lattice_value): Adjust for copy lattice state. (ccp_lattice_meet): Do not merge UNDEFINED and a copy to the copy. (bit_value_unop): Adjust what we treat as varying mask. (bit_value_binop): Likewise. (bit_value_assume_aligned): Likewise. (evaluate_stmt): When we simplified to a SSA name record a copy instead of dropping to varying. (visit_assignment): Simplify. * gimple-match.h (gimple_simplify): Add another callback. * gimple-fold.c (fold_stmt_1): Adjust caller. (gimple_fold_stmt_to_constant_1): Likewise - pass valueize for the 2nd callback. * gimple-match-head.c (gimple_simplify): Add a callback that is used to valueize the stmt operands and use it that way. * gcc.dg/tree-ssa/ssa-ccp-36.c: New testcase. Index: gcc/tree-ssa-ccp.c === *** gcc/tree-ssa-ccp.c.orig 2015-04-01 15:13:46.424457049 +0200 --- gcc/tree-ssa-ccp.c 2015-04-01 16:33:50.194715581 +0200 *** valid_lattice_transition (ccp_prop_value *** 439,444 --- 439,455 /* Now both lattice values are CONSTANT. */ + /* Allow arbitrary copy changes as we might look through PHI a_1, ... + when only a single copy edge is executable. */ + if (TREE_CODE (old_val.value) == SSA_NAME +TREE_CODE (new_val.value) == SSA_NAME) + return true; + + /* Allow transitioning from a constant to a copy. */ + if (is_gimple_min_invariant (old_val.value) +TREE_CODE (new_val.value) == SSA_NAME) + return true; + /* Allow transitioning from PHI x, not executable == x to PHI x, y == common alignment. */ if (TREE_CODE (old_val.value) != INTEGER_CST *** set_lattice_value (tree var, ccp_prop_va *** 527,535 caller that this was a non-transition. */ if (old_val-lattice_val != new_val.lattice_val || (new_val.lattice_val == CONSTANT ! TREE_CODE (new_val.value) == INTEGER_CST ! (TREE_CODE (old_val-value) != INTEGER_CST ! || new_val.mask != old_val-mask))) { /* ??? We would like to delay creation of INTEGER_CSTs from partially constants here. */ --- 538,547 caller that this was a non-transition. */ if (old_val-lattice_val != new_val.lattice_val || (new_val.lattice_val == CONSTANT ! (TREE_CODE (new_val.value) != TREE_CODE (old_val-value) ! || simple_cst_equal (new_val.value, old_val-value) != 1 ! || (TREE_CODE (new_val.value) == INTEGER_CST ! new_val.mask != old_val-mask { /* ??? We would like to delay creation of INTEGER_CSTs from partially constants here. */ *** ccp_finalize (void) *** 958,969 static void ccp_lattice_meet (ccp_prop_value_t *val1, ccp_prop_value_t *val2) { ! if (val1-lattice_val == UNDEFINED) { /* UNDEFINED M any = any */ *val1 = *val2; } ! else if (val2-lattice_val == UNDEFINED) { /* any M UNDEFINED = any Nothing to do. VAL1 already contains the value we want. */ --- 970,989 static void ccp_lattice_meet (ccp_prop_value_t *val1, ccp_prop_value_t *val2) { ! if (val1-lattice_val == UNDEFINED ! /* For UNDEFINED M SSA we can't use SSA because its definition ! may not dominate the PHI node. */ !(val2-lattice_val != CONSTANT ! || TREE_CODE (val2-value) != SSA_NAME)) { /* UNDEFINED M any = any */ *val1 = *val2; } ! else if (val2-lattice_val == UNDEFINED ! /* For UNDEFINED M SSA we can't use SSA because its definition ! may not dominate the PHI node. */ ! (val1-lattice_val != CONSTANT ! || TREE_CODE (val1-value) != SSA_NAME)) { /* any M UNDEFINED = any Nothing to do. VAL1 already contains the value we want. */ *** bit_value_unop (enum tree_code code, tre *** 1513,1519 || rval.mask == -1); bit_value_unop_1 (code, type, value, mask, TREE_TYPE (rhs), value_to_wide_int (rval), rval.mask); ! if (mask != -1) { val.lattice_val = CONSTANT; val.mask = mask; --- 1533,1539 || rval.mask
[PATCH, i386]: Use UNSPEC_{FILD_FIST}_ATOMIC in {load,store}di_via_fpu
Hello! Trivial patch to make insn dumps more informative. 2015-04-02 Uros Bizjak ubiz...@gmail.com * config/i386/sync.md (UNSPEC_FILD_ATOMIC, UNSPEC_FIST_ATOMIC): New. (loaddi_via_fpu): Use UNSPEC_FILD_ATOMIC. (storedi_via_fpu): Use UNSPEC_FIST_ATOMIC. * reg-stack.c (get_true_reg): Change UNSPEC_LDA to UNSPEC_FILD_ATOMIC. (subst_stack_regs_pat): Change UNSPEC_STA to UNSPEC_FIST_ATOMIC. Bootstrapped and regtested on x86_64-linux-gnu {,-m32} and committed to mainline SVN. Uros. Index: config/i386/sync.md === --- config/i386/sync.md (revision 221826) +++ config/i386/sync.md (working copy) @@ -22,6 +22,9 @@ UNSPEC_SFENCE UNSPEC_MFENCE + UNSPEC_FILD_ATOMIC + UNSPEC_FIST_ATOMIC + ;; __atomic support UNSPEC_LDA UNSPEC_STA @@ -302,7 +305,8 @@ (define_insn loaddi_via_fpu [(set (match_operand:DF 0 register_operand =f) - (unspec:DF [(match_operand:DI 1 memory_operand m)] UNSPEC_LDA))] + (unspec:DF [(match_operand:DI 1 memory_operand m)] + UNSPEC_FILD_ATOMIC))] TARGET_80387 fild%Z1\t%1 [(set_attr type fmov) @@ -311,7 +315,8 @@ (define_insn storedi_via_fpu [(set (match_operand:DI 0 memory_operand =m) - (unspec:DI [(match_operand:DF 1 register_operand f)] UNSPEC_STA))] + (unspec:DI [(match_operand:DF 1 register_operand f)] + UNSPEC_FIST_ATOMIC))] TARGET_80387 { gcc_assert (find_regno_note (insn, REG_DEAD, FIRST_STACK_REG) != NULL_RTX); Index: reg-stack.c === --- reg-stack.c (revision 221826) +++ reg-stack.c (working copy) @@ -448,7 +448,7 @@ get_true_reg (rtx *pat) case UNSPEC: if (XINT (*pat, 1) == UNSPEC_TRUNC_NOOP - || XINT (*pat, 1) == UNSPEC_LDA) + || XINT (*pat, 1) == UNSPEC_FILD_ATOMIC) pat = XVECEXP (*pat, 0, 0); return pat; @@ -1670,8 +1670,8 @@ subst_stack_regs_pat (rtx_insn *insn, stack_ptr re case UNSPEC: switch (XINT (pat_src, 1)) { - case UNSPEC_STA: case UNSPEC_FIST: + case UNSPEC_FIST_ATOMIC: case UNSPEC_FIST_FLOOR: case UNSPEC_FIST_CEIL:
Re: [Patch, fortran, PR44672, v3] [F08] ALLOCATE with SOURCE and no array-spec
Hi all, during debugging of a larger fortran source I figured that my previous patch on 44672 had still some issues, when it comes to adding a gfc_code into the chain of codes and with a symbol. Adding a new gfc_code object before the current one is now solved be creating a new gfc_code object, copying the current one to the new one, initialize the old one to the new data and setting its next pointer to the current one. Because in the gfc_code.ext.alloc a flag is introduced, that is only set by the C-code adding a new gfc_code object, that flag can be used to prevent doing this process endlessly. I also learned, that one has to commit newly created symbols or one may get a very strange error in an assert in gfc_enforce_clean_symbol_state (). After adding the gfc_commit_symbol () everything was fine. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Ok for 5.2 trunk? Regards, Andre On Wed, 1 Apr 2015 15:15:40 +0200 Andre Vehreschild ve...@gmx.de wrote: Hi all, during debugging another fortran code, I figured that some cases were not yet met. Especially the case where a class array is in the source= or mold= expression. This new version of the patch fixes the issue now. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Ok for 5.2 trunk? Regards, Andre On Mon, 30 Mar 2015 19:47:49 +0200 Andre Vehreschild ve...@gmx.de wrote: Dear all, please find attach a patch fixing pr44672: integer, dimension(:) :: arr allocate(arr, source = [1,2,3]) as for F2008:C633 now is no longer flagged, beside when you insist on -std=f2003 or lower. Furthermore does the patch implement the F2008 feature of obsoleting the explicit array specification on the arrays to allocate, when an array valued source=/mold= expression is given. Bootstrap and regtests ok on x86_64-linux-gnu/F20. This batched is based on a trunk having my latest for pr60322 patched in (else deltas may occur). Ok for 5.2 trunk? Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 643cd6a..9835edc 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -2394,6 +2394,9 @@ typedef struct gfc_code { gfc_typespec ts; gfc_alloc *list; + /* Take the array specification from expr3 to allocate arrays + without an explicit array specification. */ + unsigned arr_spec_from_expr3:1; } alloc; diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c index 316b413..21add32 100644 --- a/gcc/fortran/resolve.c +++ b/gcc/fortran/resolve.c @@ -6804,7 +6804,7 @@ conformable_arrays (gfc_expr *e1, gfc_expr *e2) have a trailing array reference that gives the size of the array. */ static bool -resolve_allocate_expr (gfc_expr *e, gfc_code *code) +resolve_allocate_expr (gfc_expr *e, gfc_code *code, bool *array_alloc_wo_spec) { int i, pointer, allocatable, dimension, is_abstract; int codimension; @@ -7103,9 +7103,20 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code) if (!ref2 || ref2-type != REF_ARRAY || ref2-u.ar.type == AR_FULL || (dimension ref2-u.ar.dimen == 0)) { - gfc_error (Array specification required in ALLOCATE statement - at %L, e-where); - goto failure; + /* F08:C633. */ + if (code-expr3) + { + if (!gfc_notify_std (GFC_STD_F2008, Array specification required + in ALLOCATE statement at %L, e-where)) + goto failure; + *array_alloc_wo_spec = true; + } + else + { + gfc_error (Array specification required in ALLOCATE statement + at %L, e-where); + goto failure; + } } /* Make sure that the array section reference makes sense in the @@ -7124,7 +7135,7 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code) for (i = 0; i ar-dimen; i++) { - if (ref2-u.ar.type == AR_ELEMENT) + if (ar-type == AR_ELEMENT || ar-type == AR_FULL) goto check_symbols; switch (ar-dimen_type[i]) @@ -7201,12 +7212,23 @@ failure: return false; } + +static gfc_code * +build_assignment (gfc_exec_op op, gfc_expr *expr1, gfc_expr *expr2, + gfc_component *comp1, gfc_component *comp2, locus loc); + + static void resolve_allocate_deallocate (gfc_code *code, const char *fcn) { gfc_expr *stat, *errmsg, *pe, *qe; gfc_alloc *a, *p, *q; + /* When this flag is set already, then this allocate has already been + resolved. Doing so again, would result in an endless loop. */ + if (code-ext.alloc.arr_spec_from_expr3) +return; + stat = code-expr1; errmsg = code-expr2; @@ -7375,8 +7397,97 @@ resolve_allocate_deallocate (gfc_code *code, const char *fcn) if (strcmp (fcn, ALLOCATE) == 0) { + bool arr_alloc_wo_spec = false; for (a = code-ext.alloc.list; a; a = a-next) - resolve_allocate_expr (a-expr, code); + resolve_allocate_expr (a-expr, code, arr_alloc_wo_spec); + + if (arr_alloc_wo_spec code-expr3) + {
Re: C++ PATCH for c++/65554 (ICE with user-defined initializer_list)
On Wed, Apr 01, 2015 at 03:37:19PM -0700, H.J. Lu wrote: This caused: FAIL: g++.dg/cpp0x/pr57101.C -std=gnu++11 (test for excess errors) FAIL: g++.dg/cpp0x/pr57101.C -std=gnu++14 (test for excess errors) on 32 bit system. Thanks, I'll fix it momentarily.
Re: [CHKP] Never expand instrumentation thunks
On 04/02/2015 08:51 AM, Ilya Enkovich wrote: Ping. Patch doesn't affect not instrumented code. Thanks, Ilya 2015-03-18 15:19 GMT+03:00 Ilya Enkovich enkovich@gmail.com: Hi, This patch disables attempts to expand instrumentation thunks which appear when we create specialized function versions. Problem was found during SPEC2006 insatrumentation with '-Ofast -flto'. Unfortunately I couldn't make a small reproducer. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2015-03-18 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (cgraph_node::expand_thunk): Don't expand instrumentation thunks. So is it the case that these instrumentation thunks have a gimple body and thus will be expanded elsewhere? Or are you trying to solve some different problem? Jeff
Re: [PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers
Hi, This patch fixes nested PARALLEL in retval for isntrumented calls. Current possible call: (call_insn:TI 6 30 17 2 (set (parallel [ (expr_list:REG_DEP_TRUE (parallel:TI [ (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1) (const_int 8 [0x8])) ]) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1) (const_int 0 [0])) ]) (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41] function_decl 0x7f6f50e02ca8 test1.chkp) [0 test1.chkp S1 A8]) (const_int 0 [0]))) complex.c:11 670 {*call_value} Such construction causes DF analysis problems. This patch tranforms it to: (call_insn:TI 6 30 17 2 (set (parallel:TI [ (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1) (const_int 8 [0x8])) (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1) (const_int 0 [0])) ]) (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41] function_decl 0x7fb609bf7ca8 test1.chkp) [0 test1.chkp S1 A8]) (const_int 0 [0]))) complex.c:11 670 {*call_value} Yep, nested parallels should not happen. 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386.c (ix86_expand_call): Avoid nested PARALLEL in returned call value. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 744642c..1d821cd 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -25624,8 +25624,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1, { rtx b0 = gen_rtx_REG (BND64mode, FIRST_BND_REG); rtx b1 = gen_rtx_REG (BND64mode, FIRST_BND_REG + 1); - retval = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, retval, b0, b1)); - chkp_put_regs_to_expr_list (retval); + if (GET_CODE (retval) == PARALLEL) + { + b0 = gen_rtx_EXPR_LIST (VOIDmode, b0, const0_rtx); + b1 = gen_rtx_EXPR_LIST (VOIDmode, b1, const0_rtx); + rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, b0, b1)); + retval = chkp_join_splitted_slot (retval, par); I do not understand this code. Isn't it just droping the original return value constructing prallelcontaining only the BND_REGs? Honza
Re: [CHKP] Never expand instrumentation thunks
On 04/02/2015 03:13 PM, Jan Hubicka wrote: I think it is really the transparent alias issue. The comment seems pretty clear about it. What is confusing is that instrumentation thunks are called thunks when they are really not - thunk is a small hunk of code, while instrumentation thunk is a transparent alias. And it may be the terminology gets in the way. I was at least partly interested in what code we generated for that testcase. Too bad I did not notice we introduced transparent aliases, i would push out my code for that. I will compare Ilya's changes with mine and hopefully we can catch more bugs and unify the code next stage1. Sounds good. jeff
[PATCH, c6x] handle unk_isa in TARGET_CPU_CPP_BUILTINS
On Wed, Apr 01, 2015 at 11:37:41PM +0200, Bernhard Reutner-Fischer wrote: Bernd, same for c6x for unk_isa, fwiw. Attached. Ok for trunk for the c6x bits? Ok for trunk for the bfin bits? thanks, thanks, On 1 April 2015 at 23:34, Bernhard Reutner-Fischer rep.dot@gmail.com wrote: Hi, gcc/c-family/c-cppbuiltin.c In file included from ./tm.h:21:0, from ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c:23: ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c: In function ‘void c_cpp_builtins(cpp_reader*)’: ../../../../../../home/me/src/gcc-5.0.mine/gcc/config/bfin/bfin.h:43:14: error: enumeration value ‘BFIN_CPU_UNKNOWN’ not handled in switch [-Werror=switch] switch (bfin_cpu_type) \ ^ ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c:1243:3: note: in expansion of macro ‘TARGET_CPU_CPP_BUILTINS’ TARGET_CPU_CPP_BUILTINS (); ^ cc1plus: all warnings being treated as errors make[2]: *** [c-family/c-cppbuiltin.o] Error 1 gcc/ChangeLog: 2015-04-01 Bernhard Reutner-Fischer al...@gcc.gnu.org * config/bfin/bfin.h (TARGET_CPU_CPP_BUILTINS): Add BFIN_CPU_UNKNOWN. From da0fd9e46efb975f8e5d8854f221689ff8808bc5 Mon Sep 17 00:00:00 2001 From: Bernhard Reutner-Fischer rep.dot@gmail.com Date: Thu, 1 Apr 2015 23:42:08 +0200 Subject: [PATCH 6/6] handle c6x unk_isa in TARGET_CPU_CPP_BUILTINS gcc/ChangeLog: 2015-04-01 Bernhard Reutner-Fischer al...@gcc.gnu.org * config/c6x/c6x.h (TARGET_CPU_CPP_BUILTINS): Add unk_isa. Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com --- gcc/config/c6x/c6x.h |2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/config/c6x/c6x.h b/gcc/config/c6x/c6x.h index 58a7ac6..9b3fcfb 100644 --- a/gcc/config/c6x/c6x.h +++ b/gcc/config/c6x/c6x.h @@ -84,6 +84,8 @@ extern c6x_cpu_t c6x_arch; \ switch (c6x_arch)\ { \ + case unk_isa:\ + break;\ case C6X_CPU_C62X: \ builtin_define (_TMS320C6200); \ break;\ -- 1.7.10.4
Re: [PATCH] [ARM] Add support for the Samsung Exynos M1 processor
Hi, from what I understand, Evandro has addressed the comments from Kyrill. Are there other problems to be addressed before the patches can go in? Thanks, Sebastian On Tue, Mar 31, 2015 at 7:30 PM, Evandro Menezes e.mene...@samsung.com wrote: Hi, Kyrill. At this moment, it suffices to use the same scheduling as Cortex A57, but more specific details are to be expected. I couldn't check the build though, as my Arndale is strange today. As soon as it's healthy, I'll check it. I appreciate your feedback. -- Evandro Menezes Austin, TX -Original Message- From: Kyrill Tkachov [mailto:kyrylo.tkac...@arm.com] Sent: Tuesday, March 31, 2015 3:33 To: Evandro Menezes; 'GCC Patches' Subject: Re: [PATCH] [ARM] Add support for the Samsung Exynos M1 processor Hi Evandro On 30/03/15 22:51, Evandro Menezes wrote: The Samsung Exynos M1 implements the ARMv8 ISA and this patch adds support for it through the -mcpu command-line option. The patch was checked on arm-unknown-linux-gnueabihf without new failures. OK for trunk? -- Evandro Menezes Austin, TX 0001-ARM-Add-option-for-the-Samsung-Exynos-M1-core-for-AR.patch diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def index b22ea7f..0710a38 100644 --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -168,6 +168,7 @@ ARM_CORE(cortex-a17.cortex-a7, cortexa17cortexa7, cortexa7, 7A, FL_LDSCHED | ARM_CORE(cortex-a53,cortexa53, cortexa53, 8A, FL_LDSCHED | FL_CRC32, cortex_a53) ARM_CORE(cortex-a57,cortexa57, cortexa57, 8A, FL_LDSCHED | FL_CRC32, cortex_a57) ARM_CORE(cortex-a72,cortexa72, cortexa57, 8A, FL_LDSCHED | FL_CRC32, cortex_a57) +ARM_CORE(exynos-m1, exynosm1, exynosm1,8A, FL_LDSCHED | FL_CRC32, exynosm1) There are two problems with this: * The 3rd field of ARM_CORE represents the scheduling identifier and without a separate pipeline description for exynosm1 this will just use the generic_sched scheduler which performs quite poorly on modern cores. Would you prefer to reuse a pipeline description from one of the pre-existing ones? Look for example at the cortex-a72 definition: ARM_CORE(cortex-a72,cortexa72, cortexa57, ...snip here the cortexa57 means 'make scheduling decisions for cortexa57'. * The final field in ARM_CORE specifies the tuning struct to be used for this core. This should be defined in arm.c and have the form 'arm_ident_tune, so for your case it should be arm_exynosm1_tune. This isn't defined in your patch, so it won't compile without that. You can write a custom tuning struct yourself, or reuse a tuning struct for one of the existing cores, if you'd like. Also, you should add exynosm1 to the switch statement in arm_issue_rate to specify the issue rate. I have a patch for next stage1 that should refactor it all into the tuning structs (https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02706.html) but until that goes in, you should fill in the switch statement there. Thanks, Kyrill 0001-AArch64-Add-option-for-the-Samsung-Exynos-M1-core-fo.patch Description: Binary data 0001-ARM-Add-option-for-the-Samsung-Exynos-M1-core.patch Description: Binary data
Re: [CHKP] Never expand instrumentation thunks
On 04/02/2015 02:11 PM, Jan Hubicka wrote: 2015-03-18 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (cgraph_node::expand_thunk): Don't expand instrumentation thunks. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index e640907..abc9cfe 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) tree thunk_fndecl = decl; tree a; + /* Instrumentation thunk is the same function with + a different signature. Never need to expand it. */ + if (thunk.add_pointer_bounds_args) +return false; Yeah, this is another case where we hit problem with transparent alias pretending to be thunk :) I guess the patch is OK for GCC-5 and for next stage1 we can clean this up. I was actually building a compiler so I could take a look at this one under a debugger ;-) jeff
Re: [CHKP] Support returned bounds in thunks expand
Ping 2015-03-10 13:12 GMT+03:00 Ilya Enkovich enkovich@gmail.com: Hi, Currentl we loose returned bounds when functions are merged. This patch fixes it by adding returne bounds support for cgraph_node::expand_thunk. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? 2015-03-06 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (cgraph_node::expand_thunk): Build returned bounds for instrumented functions. I think if the extra bultin call is needed here, the same andling needs to be added to ipa-split. We really ought to unify that code - it is surprisingly difficult to produce a wrapper call these days. + if (instrumentation_clone + !DECL_BY_REFERENCE (resdecl) + restmp + BOUNDED_P (restmp)) + { + tree fn = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET); + gcall *retbnd = gimple_build_call (fn, 1, restmp); + + resbnd = create_tmp_reg (pointer_bounds_type_node, retbnd); + gimple_call_set_lhs (retbnd, resbnd); + + gsi_insert_after (bsi, retbnd, GSI_NEW_STMT); + create_edge (get_create (fn), retbnd, callees-count, callees-frequency); + } I am not sure why we need to check here: Originaly we have two bounded functions, A and B. We turn function B to a wrapper of function A. If callers of bounded functions need to call a builtin, I would expect all callers of B to do it already, so I would expect that wrapper is safe here? Or do we mix instrumented and non-instrumented versions somehow? Addding code handling return value is going to turn the call to non-tailcall, so you probably want to drop the tailcall flag, too. Honza
Re: [CHKP] Never expand instrumentation thunks
2015-03-18 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (cgraph_node::expand_thunk): Don't expand instrumentation thunks. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index e640907..abc9cfe 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) tree thunk_fndecl = decl; tree a; + /* Instrumentation thunk is the same function with + a different signature. Never need to expand it. */ + if (thunk.add_pointer_bounds_args) +return false; Yeah, this is another case where we hit problem with transparent alias pretending to be thunk :) I guess the patch is OK for GCC-5 and for next stage1 we can clean this up. Honza if (!force_gimple_thunk this_adjusting targetm.asm_out.can_output_mi_thunk (thunk_fndecl, fixed_offset,
Re: [PATCH, CHKP] Fix cdtor merge for instrumented functions
Hi, This patch doesn't allow instrumentation thunks calls while merging constructors and destructors. Not isntrumented code is not affeceted. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- gcc/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * ipa.c (ipa_cdtor_merge): Skip instrumentation thunks. So the problem here is that you do have two names for the function, one that is not instrumented and other that is instrumented? I am bit surprised we get instrumentation on ctors that should not take or return pointer parameter, but I see one can trigger that at least by manually adding constructor attribute. I think what you need is to drop DECL_STATIC_CONSTRUCTOR/DESTRUCTURO flags when producing the transparent alias. Honza
Re: [CHKP] Never expand instrumentation thunks
On 04/02/2015 02:11 PM, Jan Hubicka wrote: 2015-03-18 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (cgraph_node::expand_thunk): Don't expand instrumentation thunks. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index e640907..abc9cfe 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) tree thunk_fndecl = decl; tree a; + /* Instrumentation thunk is the same function with + a different signature. Never need to expand it. */ + if (thunk.add_pointer_bounds_args) +return false; Yeah, this is another case where we hit problem with transparent alias pretending to be thunk :) I guess the patch is OK for GCC-5 and for next stage1 we can clean this up. I was actually building a compiler so I could take a look at this one under a debugger ;-) I think it is really the transparent alias issue. The comment seems pretty clear about it. What is confusing is that instrumentation thunks are called thunks when they are really not - thunk is a small hunk of code, while instrumentation thunk is a transparent alias. Too bad I did not notice we introduced transparent aliases, i would push out my code for that. I will compare Ilya's changes with mine and hopefully we can catch more bugs and unify the code next stage1. Honza
Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses
On 03/30/2015 10:37 PM, Mikhail Maltsev wrote: Hi! I'm currently working on the proposed task of replacing rtx objects (i.e. struct rtx_def) with derived classes. I would like to get some feedback on this work (it's far from being finished, but basically I would like to know, whether my modifications are appropriate, e.g. one may consider that this is too much for just refactoring, because sometimes they involve small modification of semantics). The attached patch is not well tested, i.e. I bootstrapped and regtested it only on x86_64, but I'll perform more extensive testing before submitting the next version. The key points I would like to ask about: 1. The original task was to replace the rtx type by rtx_insn *, where it is appropriate. But rtx_insn has several derived classes, such as rtx_code_label, for example. So I tried to use the most derived type when possible. Is it OK? Definitely. In general the idea here is to exploit the static type checking done in the compiler to avoid runtime checking and failures. 2. Not all of these type promotions can be done by just looking at function callers and callees (and some functions will only be generated during the build of some rare architecture) and checks already done in them. In a couple of cases I referred to comments and my general understanding of code semantics. In one case this actually caused a regression (in the patch it is fixed, of course), because of somewhat misleading comment (see live_label_rtx function added in patch for details) The question is - are such changes OK for refactoring (or it should strictly preserve semantics)? They're OK, but it may be easier to run things through the review process if refactoring is kept separate from strengthening the type checking. 3. In lra-constraints.c I added a new class rtx_usage_list, which, IMHO, allows to group the functions which work with usage list in a more explicit manner and make some conditions more self-explaining. I hope that Vladimir Makarov (in this case, because it concerns LRA) and other authors will not object against such intrusion into their code (or would rather tell me what should be fixed in my patch(es), instead of just refusing to apply it/them). In general, are such changes OK or should better be avoided? A couple of questions related to further work: I don't see anything inherently wrong with this concept. Though again, I'd suggest separating out these changes from type safety work. 1. I noticed that emit_insn function, in fact, does two kinds of things: it can either add it's argument to the chain, or, if the argument is a pattern, it creates a new instruction based on that pattern. Shouldn't this logic be separated in the callers? That would be wise. There's probably several of these kinds of things lurking around. 2. Are there any plans on implementing a better class hierarchy on AST's (union tree_node type). I see that C++ FE uses a huge number of macros (which check TREE_CODE and some boolean flags). Could this be improved somehow? It's in progress and I'm hoping Andrew will be in a position to post this work soon. jeff
Re: [PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers
On Thu, Apr 2, 2015 at 10:17 PM, Jan Hubicka hubi...@ucw.cz wrote: rtx b0 = gen_rtx_REG (BND64mode, FIRST_BND_REG); rtx b1 = gen_rtx_REG (BND64mode, FIRST_BND_REG + 1); - retval = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, retval, b0, b1)); - chkp_put_regs_to_expr_list (retval); + if (GET_CODE (retval) == PARALLEL) + { + b0 = gen_rtx_EXPR_LIST (VOIDmode, b0, const0_rtx); + b1 = gen_rtx_EXPR_LIST (VOIDmode, b1, const0_rtx); + rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, b0, b1)); + retval = chkp_join_splitted_slot (retval, par); I do not understand this code. Isn't it just droping the original return value constructing prallelcontaining only the BND_REGs? Please see chkp_join_splitted_slot that merges two parallels. Uros.
Re: [PATCH, CHKP] Fix static const bounds creation in LTO
Hi, Current in LTO static const bounds are created as external symbol. It doesn't work in case original symbols were removed and privatized. This patch introduces a separate comdat symbol to be used in LTO. Bootstrapped and tested on x86_64-unknown-linux-gnu. Does this approach look OK? As I understand it, you want to create a static variables, like __chkp_zero_bounds which should be shared across translation units. Currently the function does: static tree chkp_make_static_const_bounds (HOST_WIDE_INT lb, HOST_WIDE_INT ub, const char *name) { tree var; /* With LTO we may have constant bounds already in varpool. Try to find it. */ var = chkp_find_const_bounds_var (lb, ub); if (var) return var; var = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier (name), pointer_bounds_type_node); TREE_PUBLIC (var) = 1; TREE_USED (var) = 1; TREE_READONLY (var) = 1; TREE_STATIC (var) = 1; TREE_ADDRESSABLE (var) = 0; DECL_ARTIFICIAL (var) = 1; DECL_READ_P (var) = 1; /* We may use this symbol during ctors generation in chkp_finish_file when all symbols are emitted. Force output to avoid undefined symbols in ctors. */ if (!in_lto_p) { DECL_INITIAL (var) = targetm.chkp_make_bounds_constant (lb, ub); DECL_COMDAT (var) = 1; varpool_node::get_create (var)-set_comdat_group (DECL_ASSEMBLER_NAME (var)); varpool_node::get_create (var)-force_output = 1; } else DECL_EXTERNAL (var) = 1; varpool_node::finalize_decl (var); } You should not set comdat group by hand, or we get into troubles on non-ELF targets. Just call make_decl_one_only (var, DECL_ASSEMBLER_NAME (var)); Now in LTO I guess you want to check if the symbol is provided by the source compilation unit, i.e. call symtab_node::get_for_asmname (DECL_ASSEMBLER_NAME (var)) and if it is there, check it have proper type and fatal_error otherwise and if it does, discar var and use existing variable This avoid a duplicate declaration of a symbol that is invalid in symtab. (once we solve the transparent alias thing, I will add verification for that) Because you already set force_output, the symbol should not be privatized and renamed in any way. If there are cases the symbol can get legally optimized out, I guess you can also avoid setting force_output and we can stream references to the decls into optimization summary in a case we want ot bind for it in WPA, but that can wait for next stage1. Honza Thanks, Ilya -- gcc/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c (CHKP_LTO_SYMBOL_SUFFIX): New. (chkp_make_static_const_bounds): Use another symbol in LTO. gcc/testsuite/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/lto/chkp-static-bounds_0.c: New. diff --git a/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c new file mode 100644 index 000..e896eb1 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c @@ -0,0 +1,26 @@ +/* { dg-lto-do link } */ +/* { dg-require-effective-target mpx } */ +/* { dg-lto-options { { -flto -flto-partition=max -fcheck-pointer-bounds -mmpx } } } */ + +char *cc; + +int test1 (const char *c) +{ + c = __builtin___bnd_init_ptr_bounds (c); + cc = c; + return c[0] * 2; +} + +struct S +{ + int (*fnptr) (const char *); +} S; + +struct S s1 = {test1}; +struct S s2 = {test1}; +struct S s3 = {test1}; + +int main (int argc, const char **argv) +{ + return s1.fnptr (argv[0]) + s2.fnptr (argv[1]); +} diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c index d2df4ba..0578936 100644 --- a/gcc/tree-chkp.c +++ b/gcc/tree-chkp.c @@ -421,6 +421,7 @@ static bool in_chkp_pass; #define CHKP_VAR_BOUNDS_PREFIX __chkp_var_bounds_ #define CHKP_ZERO_BOUNDS_VAR_NAME __chkp_zero_bounds #define CHKP_NONE_BOUNDS_VAR_NAME __chkp_none_bounds +#define CHKP_LTO_SYMBOL_SUFFIX .lto /* Static checker constructors may become very large and their compilation with optimization may take too much time. @@ -1906,7 +1907,8 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb, HOST_WIDE_INT ub, const char *name) { - tree var; + tree var, id; + varpool_node *node; /* With LTO we may have constant bounds already in varpool. Try to find it. */ @@ -1915,8 +1917,22 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb, if (var) return var; - var = build_decl (UNKNOWN_LOCATION, VAR_DECL, - get_identifier (name), pointer_bounds_type_node); + /* In LTO we may have symbol with changed visibility, comdat + group etc. Therefore we shouldn't recreate the same symbol. + Use LTO version instead. */ + if (in_lto_p) +{
Re: [PATCH] [ARM] Add support for the Samsung Exynos M1 processor
On Thu, Apr 02, 2015 at 11:19:14PM +0100, Sebastian Pop wrote: Hi, from what I understand, Evandro has addressed the comments from Kyrill. Are there other problems to be addressed before the patches can go in? Trunk is currently in Stage 4 development, these patches are fairly low-risk, but they are certainly not regression fixes. I'll defer to port maintainers and release managers for the final say, but in my opinion it would not be appropriate to commit them until Stage 1 development for GCC 6.0 opens (hopefully in a few weeks). For the AArch64 patch, I was expecting to see a respin after Junmo's comment on Wednesday [1]. In particular: +AARCH64_CORE(exynos-m1, exynosm1, exynosm1, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1) As there has not been a scheduling model contributed for the exynos-m1, this will use *no* scheduling model on AArch64. This is unlikely to give good performance. +static const struct tune_params exynosm1_tunings = +{ + cortexa57_extra_costs, + cortexa57_addrcost_table, + cortexa57_regmove_cost, + cortexa57_vector_cost, + 4, /* memmov_cost */ + 3, /* issue_rate */ + (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD + | AARCH64_FUSE_MOVK_MOVK), /* fuseable_ops */ + 16, /* function_align. */ + 8, /* jump_align. */ + 4, /* loop_align. */ + 2, /* int_reassoc_width. */ + 4, /* fp_reassoc_width. */ + 1/* vec_reassoc_width. */ +}; + As these are identical to the Cortex-A57 tuning, is there any reason to add them? I'd prefer if we took a copy-on-modify policy for these tuning structs, only adding them where there is a difference. Thanks, James --- [1]: https://gcc.gnu.org/ml/gcc-patches/2015-04/msg1.html
Re: [RS6000] Fix 65576 regression
On Thu, Apr 02, 2015 at 03:52:35PM +1030, Alan Modra wrote: This cures an unrecognizable insn ICE by modifying a predicate of extenddftf2_internal (the only place this predicate is used) to ensure that rtl optimization passes do not substitute 0.0 for a register with known 0.0 value, except when VSX is enabled. ie. Don't undo the necessary register move emitted by the extenddftf2_fprs expander. Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux and powerpc-linux. OK to apply? PR target/65576 PR target/65240 * config/rs6000/predicates.md (zero_reg_mem_operand): Exclude 0.0 constant unless TARGET_VSX. Index: gcc/config/rs6000/predicates.md === --- gcc/config/rs6000/predicates.md (revision 221805) +++ gcc/config/rs6000/predicates.md (working copy) @@ -964,7 +964,8 @@ ;; Return 1 if the operand is CONST_DOUBLE 0, register or memory operand. (define_predicate zero_reg_mem_operand - (ior (match_operand 0 zero_fp_constant) + (ior (and (match_test TARGET_VSX) + (match_operand 0 zero_fp_constant)) (match_operand 0 reg_or_mem_operand))) ;; Return 1 if the operand is a CONST_INT and it is the element for 64-bit I definately prefer Alan's patch over mine. In looking at extenddftf2_internal, I believe you are correct David, in that final alternative will never match because 0.0 will not be valid (pre-VSX 0.0 won't be allowed as operand2, VSX 0.0 will not match n constraint). For the VSX case, the n should be either j or E. If you are doing a double-long double conversion, and it happens to pick GPRs for some reason, the GPR side will do a load/move of 0.0 instead of using li to load 0 on pre-VSX systems. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [RS6000] Fix 65576 regression
On Thu, Apr 02, 2015 at 01:03:19PM -0400, David Edelsohn wrote: On Thu, Apr 2, 2015 at 1:22 AM, Alan Modra amo...@gmail.com wrote: This cures an unrecognizable insn ICE by modifying a predicate of extenddftf2_internal (the only place this predicate is used) to ensure that rtl optimization passes do not substitute 0.0 for a register with known 0.0 value, except when VSX is enabled. ie. Don't undo the necessary register move emitted by the extenddftf2_fprs expander. Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux and powerpc-linux. OK to apply? PR target/65576 PR target/65240 * config/rs6000/predicates.md (zero_reg_mem_operand): Exclude 0.0 constant unless TARGET_VSX. This change seems to disable the final alternative in the extenddftf2_internal pattern. It looks to me like it was already disabled, by r96390. -- Alan Modra Australia Development Lab, IBM
Re: [Ping, Patch, fortran, pr65548, v1] [5 Regression] gfc_conv_procedure_call
On 04/02/2015 03:28 AM, Andre Vehreschild wrote: Ping! This should be in 5.1. Dominique and I feel like this patch is nearly obvious. Regards, Andre On Wed, 25 Mar 2015 14:35:54 +0100 Andre Vehreschild ve...@gmx.de wrote: Hi all, please find attached a fix for the recently introduced regression when allocating arrays with an intrinsic function for source=. The patch addresses this issue by using gfc_conv_expr_descriptor () for intrinsic functions. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Ok for trunk? Yes, ok for trunk. Thanks, Jerry
Re: [libstdc++/65033] Give alignment info to libatomic
On Wed, 25 Mar 2015, Jonathan Wakely wrote: I've convinced myself that Richard's patch is correct in all cases, but I think we also want this patch, to fix PR62259 and PR65147. For the generic std::atomicT (i.e. not the integral or pointer specializations) we should increase the alignment of atomic types that have the same size as one of the standard integral types. This should be consistent with what the C front end does for _Atomic, based on what Joseph told me on IRC: Wrong. jsm28 jwakely: _Atomic aligns 1/2/4/8/16-byte types the same as integer types of that size. No it doesn't! It's same or higher as. jsm28 (Which may not be alignment = size, depending on the architecture.) Ideally we'd use an attribute like Andrew describes at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62259#c4 but that's not going to happen for GCC 5, so this just looks for an integral type of the same size and uses its alignment. Tested x86_64-linux, powerpc64le-linux. I'll wait for RM approval for this and Richard's patch (which is OK from a libstdc++ perspective). brgds, H-P
Re: [RS6000] Fix 65576 regression
On Thu, Apr 2, 2015 at 7:41 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On Thu, Apr 02, 2015 at 03:52:35PM +1030, Alan Modra wrote: This cures an unrecognizable insn ICE by modifying a predicate of extenddftf2_internal (the only place this predicate is used) to ensure that rtl optimization passes do not substitute 0.0 for a register with known 0.0 value, except when VSX is enabled. ie. Don't undo the necessary register move emitted by the extenddftf2_fprs expander. Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux and powerpc-linux. OK to apply? PR target/65576 PR target/65240 * config/rs6000/predicates.md (zero_reg_mem_operand): Exclude 0.0 constant unless TARGET_VSX. Index: gcc/config/rs6000/predicates.md === --- gcc/config/rs6000/predicates.md (revision 221805) +++ gcc/config/rs6000/predicates.md (working copy) @@ -964,7 +964,8 @@ ;; Return 1 if the operand is CONST_DOUBLE 0, register or memory operand. (define_predicate zero_reg_mem_operand - (ior (match_operand 0 zero_fp_constant) + (ior (and (match_test TARGET_VSX) + (match_operand 0 zero_fp_constant)) (match_operand 0 reg_or_mem_operand))) ;; Return 1 if the operand is a CONST_INT and it is the element for 64-bit I definately prefer Alan's patch over mine. In looking at extenddftf2_internal, I believe you are correct David, in that final alternative will never match because 0.0 will not be valid (pre-VSX 0.0 won't be allowed as operand2, VSX 0.0 will not match n constraint). For the VSX case, the n should be either j or E. If you are doing a double-long double conversion, and it happens to pick GPRs for some reason, the GPR side will do a load/move of 0.0 instead of using li to load 0 on pre-VSX systems. If the final alternative cannot occur, it should be removed as well. Alan, would you please test that change also? Thanks, David
[PATCH] Add myself as reviewer for arm port
Hi all, I've committed this as per https://gcc.gnu.org/ml/gcc/2015-04/msg00012.html with r221833. Thanks, Kyrill 2015-04-02 Kyrylo Tkachov kyrylo.tkac...@arm.com * MAINTAINERS: Add myself as an arm port reviewer. Index: MAINTAINERS === --- MAINTAINERS (revision 221832) +++ MAINTAINERS (working copy) @@ -268,6 +268,7 @@ Reviewers +arm port Kyrylo Tkachov kyrylo.tkac...@arm.com C front end Marek Polacek pola...@redhat.com dataflow Paolo Bonzini bonz...@gnu.org dataflow Seongbae Park seongbae.p...@gmail.com
Re: [Ping, Patch, fortran, pr65548, v1] [5 Regression] gfc_conv_procedure_call
Ping! This should be in 5.1. Dominique and I feel like this patch is nearly obvious. Regards, Andre On Wed, 25 Mar 2015 14:35:54 +0100 Andre Vehreschild ve...@gmx.de wrote: Hi all, please find attached a fix for the recently introduced regression when allocating arrays with an intrinsic function for source=. The patch addresses this issue by using gfc_conv_expr_descriptor () for intrinsic functions. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Ok for trunk? Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de
Re: Announcing AArch64 and ARM port reviewers
On Thu, Apr 02, 2015 at 06:39:57AM +0100, Jeff Law wrote: I'm pleased to announce that James Greenhalgh has been appointed as a reviewer for the AArch64 port and that Kyrylo Tkachov has been appointed as a reviewer for the ARM port. James Kyrylo, if you could add yourself to the MAINTAINERS file for the additional roles, it would be appreciated. Thanks! Done as r221834, with the attached patch. Cheers, James Index: ChangeLog === --- ChangeLog (revision 221833) +++ ChangeLog (revision 221834) @@ -1,3 +1,7 @@ +2015-04-02 James Greenhalgh james.greenha...@arm.com + + * MAINTAINERS: Add myself as a reviewer for the AArch64 port. + 2015-04-02 Kyrylo Tkachov kyrylo.tkac...@arm.com * MAINTAINERS: Add myself as an arm port reviewer. Index: MAINTAINERS === --- MAINTAINERS (revision 221833) +++ MAINTAINERS (revision 221834) @@ -268,6 +268,7 @@ Reviewers +aarch64 port James Greenhalgh james.greenha...@arm.com arm port Kyrylo Tkachov kyrylo.tkac...@arm.com C front end Marek Polacek pola...@redhat.com dataflow Paolo Bonzini bonz...@gnu.org
[committed] Fix wrong generated by pa_output_move_double
The attached change fixes an old bug. In debugging the segmentation fault of cc76narrowing1 in the ppl-1.1 test suite, I found that we incorrectly handle register indexed memory operands in double word loads on 32-bit targets. In particular, when there is an overlap between the source registers and the destination registers, we get wrong assembly code. For example, with (insn:TI 220 543 544 (set (reg:DF 28 %r28 [orig:133 D.246652 ] [133]) (mem:DF (plus:SI (reg/f:SI 29 %r29 [orig:167 _49-D.173044.impl ] [167]) (reg:SI 28 %r28 [163])) [142 MEM[(const double )_53]+0 S8 A64]) ) 72 {*pa.md:3858} we generated the following assembly code: ldo 4(%r29),%r29 ldw %r29(%r28),%r29 ldo -4(%r29),%r29 ldw %r29(%r28),%r28 The first ldw instruction clobbers %r29. Tested on hppa-unknown-linux-gnu and hppa2.0w-hp-hpux11.11. Committed to trunk, and 4.9 and 4.8 branches. Dave -- John David Anglin dave.ang...@bell.net 2015-04-02 John David Anglin dang...@gcc.gnu.org * config/pa/pa.c (pa_output_move_double): Directly handle register indexed memory operand. Simplify handling of scaled register indexed memory operands. Index: config/pa/pa.c === --- config/pa/pa.c (revision 221781) +++ config/pa/pa.c (working copy) @@ -2595,29 +2595,30 @@ GET_CODE (XEXP (addr, 0)) == MULT) { rtx xoperands[4]; - rtx high_reg = gen_rtx_SUBREG (SImode, operands[0], 0); - if (!reg_overlap_mentioned_p (high_reg, addr)) - { - xoperands[0] = high_reg; - xoperands[1] = XEXP (addr, 1); - xoperands[2] = XEXP (XEXP (addr, 0), 0); - xoperands[3] = XEXP (XEXP (addr, 0), 1); - output_asm_insn ({sh%O3addl %2,%1,%0|shladd,l %2,%O3,%1,%0}, - xoperands); - return ldw 4(%0),%R0\n\tldw 0(%0),%0; - } - else - { - xoperands[0] = high_reg; - xoperands[1] = XEXP (addr, 1); - xoperands[2] = XEXP (XEXP (addr, 0), 0); - xoperands[3] = XEXP (XEXP (addr, 0), 1); - output_asm_insn ({sh%O3addl %2,%1,%R0|shladd,l %2,%O3,%1,%R0}, - xoperands); - return ldw 0(%R0),%0\n\tldw 4(%R0),%R0; - } + /* Load address into left half of destination register. */ + xoperands[0] = gen_rtx_SUBREG (SImode, operands[0], 0); + xoperands[1] = XEXP (addr, 1); + xoperands[2] = XEXP (XEXP (addr, 0), 0); + xoperands[3] = XEXP (XEXP (addr, 0), 1); + output_asm_insn ({sh%O3addl %2,%1,%0|shladd,l %2,%O3,%1,%0}, + xoperands); + return ldw 4(%0),%R0\n\tldw 0(%0),%0; } + else if (GET_CODE (addr) == PLUS + REG_P (XEXP (addr, 0)) + REG_P (XEXP (addr, 1))) + { + rtx xoperands[3]; + + /* Load address into left half of destination register. */ + xoperands[0] = gen_rtx_SUBREG (SImode, operands[0], 0); + xoperands[1] = XEXP (addr, 0); + xoperands[2] = XEXP (addr, 1); + output_asm_insn ({addl|add,l} %1,%2,%0, + xoperands); + return ldw 4(%0),%R0\n\tldw 0(%0),%0; + } } /* If an operand is an unoffsettable memory ref, find a register
[PATCH] Fix PR65660 in parts, better fix for PR64909
When I fixed PR64909 I knew it wasn't a 100% precise fix. This bites us now in PR65660 (which also shows other issues). Thus the following precise fix for PR64909 (which didn't turn out as too involved) - compute a cost vector for the scalar iteration cost and apply it for peeled iterations. Bootstrap and regtest running on x86_64-unknown-linux-gnu. I plan to apply this to trunk and to the 4.9 branch where I backported the original fix to as it fixes round-off errors which can lead to very unprofitable vectorizations. Richard. 2015-04-02 Richard Biener rguent...@suse.de PR tree-optimization/64909 PR tree-optimization/65660 * tree-vectorizer.h (vect_get_known_peeling_cost): Adjust to take a cost vector for scalar iteration cost. (vect_get_single_scalar_iteration_cost): Likewise. * tree-vect-loop.c (vect_get_single_scalar_iteration_cost): Compute the scalar iteration cost into a cost vector. (vect_get_known_peeling_cost): Use the scalar cost vector to account for the cost of the peeled iterations. (vect_estimate_min_profitable_iters): Likewise. * tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): Likewise. Index: gcc/tree-vectorizer.h === --- gcc/tree-vectorizer.h (revision 220580) +++ gcc/tree-vectorizer.h (working copy) @@ -1101,10 +1101,12 @@ extern bool vectorizable_reduction (gimp extern bool vectorizable_induction (gimple, gimple_stmt_iterator *, gimple *); extern tree get_initial_def_for_reduction (gimple, tree, tree *); extern int vect_min_worthwhile_factor (enum tree_code); -extern int vect_get_known_peeling_cost (loop_vec_info, int, int *, int, +extern int vect_get_known_peeling_cost (loop_vec_info, int, int *, + stmt_vector_for_cost *, stmt_vector_for_cost *, stmt_vector_for_cost *); -extern int vect_get_single_scalar_iteration_cost (loop_vec_info); +extern int vect_get_single_scalar_iteration_cost (loop_vec_info, + stmt_vector_for_cost *); /* In tree-vect-slp.c. */ extern void vect_free_slp_instance (slp_instance); Index: gcc/tree-vect-loop.c === --- gcc/tree-vect-loop.c(revision 220580) +++ gcc/tree-vect-loop.c(working copy) @@ -2653,12 +2653,13 @@ vect_force_simple_reduction (loop_vec_in /* Calculate the cost of one scalar iteration of the loop. */ int -vect_get_single_scalar_iteration_cost (loop_vec_info loop_vinfo) +vect_get_single_scalar_iteration_cost (loop_vec_info loop_vinfo, + stmt_vector_for_cost *scalar_cost_vec) { struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo); int nbbs = loop-num_nodes, factor, scalar_single_iter_cost = 0; - int innerloop_iters, i, stmt_cost; + int innerloop_iters, i; /* Count statements in scalar loop. Using this as scalar cost for a single iteration for now. @@ -2699,17 +2700,20 @@ vect_get_single_scalar_iteration_cost (l !STMT_VINFO_IN_PATTERN_P (stmt_info)) continue; + vect_cost_for_stmt kind; if (STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt))) { if (DR_IS_READ (STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt - stmt_cost = vect_get_stmt_cost (scalar_load); + kind = scalar_load; else - stmt_cost = vect_get_stmt_cost (scalar_store); + kind = scalar_store; } else -stmt_cost = vect_get_stmt_cost (scalar_stmt); +kind = scalar_stmt; - scalar_single_iter_cost += stmt_cost * factor; + scalar_single_iter_cost + += record_stmt_cost (scalar_cost_vec, factor, kind, +NULL, 0, vect_prologue); } } return scalar_single_iter_cost; @@ -2719,7 +2723,7 @@ vect_get_single_scalar_iteration_cost (l int vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue, int *peel_iters_epilogue, - int scalar_single_iter_cost, + stmt_vector_for_cost *scalar_cost_vec, stmt_vector_for_cost *prologue_cost_vec, stmt_vector_for_cost *epilogue_cost_vec) { @@ -2736,8 +2740,10 @@ vect_get_known_peeling_cost (loop_vec_in /* If peeled iterations are known but number of scalar loop iterations are unknown, count a taken branch per peeled loop. */ - retval = record_stmt_cost (prologue_cost_vec, 2, cond_branch_taken, + retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
[PATCH, i386]: Revert unwanted part of r216820 to fix movqi_internal type calculation
Hello! Attached patch reverts unwanted part of r216820 [1] movqi_internal change to restore correct type calculation. 2015-04-02 Uros Bizjak ubiz...@gmail.com Revert parts of r216820. * config/i386/i386.md (movqi_internal): Correct type calculation for alternatives 3 and 5. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN. [1] https://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=216820 Uros. Index: config/i386/i386.md === --- config/i386/i386.md (revision 221826) +++ config/i386/i386.md (working copy) @@ -2533,9 +2533,7 @@ } [(set_attr isa *,*,*,*,*,*,*,*,*,*,avx512dq,avx512dq) (set (attr type) - (cond [(eq_attr alternative 3,5) - (const_string imovx) - (eq_attr alternative 7,8,9,10,11) + (cond [(eq_attr alternative 7,8,9,10,11) (const_string mskmov) (and (eq_attr alternative 5) (not (match_operand:QI 1 aligned_operand))) @@ -2546,6 +2544,8 @@ (ior (not (match_test TARGET_PARTIAL_REG_STALL)) (not (match_test TARGET_QIMODE_MATH (const_string imov) + (eq_attr alternative 3,5) + (const_string imovx) (and (match_test TARGET_MOVX) (eq_attr alternative 2)) (const_string imovx)
Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX
On Wed, Apr 1, 2015 at 9:34 PM, Jeff Law l...@redhat.com wrote: On 03/31/2015 03:47 AM, Ilya Enkovich wrote: On 23 Mar 13:19, Ilya Enkovich wrote: Hi, May this patch go into trunk at this point? It is very important for dynamic MPX codes. Thanks, Ilya I additionally documented changes in invoke.texi. OK for trunk? Thanks, Ilya -- gcc/ 2015-03-31 Ilya Enkovich ilya.enkov...@intel.com PR driver/65444 * config/i386/linux-common.h (MPX_SPEC): New. (CHKP_SPEC): Add MPX_SPEC. * doc/invoke.texi (-fcheck-pointer-boudns): Document possible issues with '-z bndplt' support in linker. libmpx/ 2015-03-31 Ilya Enkovich ilya.enkov...@intel.com PR driver/65444 * configure.ac: Add check for '-z bndplt' support by linker. Add link_mpx output variable. * libmpx.spec.in (link_mpx): New. * configure: Regenerate. Just to make sure I understand. With this patch we conditionally pass -z bndplt based on whether or not it's supported by the linker we find when we configure GCC. Failure to pass -z bndplt won't cause the program to misbehave, but will limit the effectiveness of MPX. Gold doesn't support -z bndplt, just newer versions of the BFD linker. Gold issues an error for -z bndplt, old BFD linkers will issue a warning. There are, at least in theory, use cases where we might not have a PLT and thus -z bndplt wouldn't make sense anyway. I guess the one thing I don't like here is that whether or not to pass -z bndplt is made at the time we configure GCC. Yet, it's trivial for someone to change the system linker later, either to gold or from an old BFD linker that doesn't support -z bndplt to one that does support -z bndplt. [ Note we have the same issue with certain assembler feature tests. ] I'm not aware of any real infrastructure in GCC to query the behavior of the linker at link time and then customize the options passed. So if it's going to be configurable, then that's the only time to do the test. I strongly disagree with HJ's assertion that we should always pass the flag, regardless of the underlying linker. So, in an ideal world, we'd query the linker at link time and pass the flag anytime we have a linker that supports the capability and perhaps warn if the linker doesn't support that capability. Given we're not in that ideal world, I think Ilya's patch is reasonable and should be installed. Without proper PLT for MPX, all external function calls will clear bound registers. MPX is a security feature. Cyber criminals only need to get it right 1 time. An organization who uses MPX for cyber security may not realize they leave a door open due to an old linker. What I want to avoid is 2 years from now, bank of foobar comes out saying that they thought they were protected by MPX, but somehow MPX didn't catch a buffer overflow it was supposed to and there was no compiler error message to warn programmers. -- H.J.
Re: [libstdc++/65033] Give alignment info to libatomic
On Thu, 12 Feb 2015, Richard Henderson wrote: When we fixed PR54005, Hm, there's confusion. When was this fixed? (Not fixed AFAICT.) Maybe you mean PR54004, but there was no note there either. Or maybe there's a typo and you meant some other PR and that PR54005 is supposedly fixed by this patch (committed as r221701) ...but it doesn't seem right: you use a specific object when deducing the alignment for the fake-pointer, so it's used anyway and is_lock_free must not be object-specific (despite its name) and only type-specific as mandated by the standard (see PR). To wit, deduce from the known-alignment of _Tp, not known-alignment of _M_i. Or is this what happens; they're the same? Why then use __alignof(_M_i) (the object-alignment) instead of _S_alignment (the deduced alas insufficiently increased type-alignment)? making sure that atomic_is_lock_free returns the same value for all objects of a given type, (That would work but it doesn't seem to be the case.) we probably should have changed the interface so that we would pass size and alignment rather than size and object pointer. Instead, we decided that passing null for the object pointer would be sufficient. But as this PR shows, we really do need to take alignment into account. Regarding what's actually needed, alignment of an atomic type should always be *forced to be at least the natural alignment of of the object* (with non-power-of-two sized-objects rounded up) and until then atomic types won't work for targets where the non-atomic equivalents have less alignment (as straddling a page-boundary won't be lock-less-atomic anywhere where straddling a page-boundary may cause a non-atomic-access...) So, not target-specific except for targets that require even higher-than-natural alignment. The following patch constructs a fake object pointer that is maximally misaligned. (With regards to the known object alignment of the _M_i object.) This allows the interface to both the builtin and to libatomic to remain unchanged. Which probably makes this back-portable to maintenance releases as well. I believe that for all of our current systems, size_t == uintptr_t, so the reinterpret_cast ought not generate warnings. The test case is problematic, as there's currently no good place to put it. The libstdc++ testsuite doesn't have the libatomic library path configured, and the libatomic testsuite doesn't have the libstdc++ include paths configured. Yet another example where we really need an install tree for testing. Thoughts? brgds, H-P
Re: [PATCH, CHKP] Restore transparent alias chains
On 04/02/2015 12:48 PM, Jan Hubicka wrote: On 03/20/2015 02:20 AM, Ilya Enkovich wrote: Hi, Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS bit. We always expect it for instrumentation clones, thus restore it input_cgraph_1. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2015-03-20 Ilya Enkovich ilya.enkov...@intel.com * lto-cgraph.c (input_cgraph_1): Always link instrumented assembler name with original one. This appears to be a code path that only triggers when MPX is enabled and is roughly analogous to the code in chkp_build_instrumented_fndecl links things up. OK for the trunk. I think this will lead to wrong code. At this time, we may have multple declarations sharing single assembler name (and thus IDENTIFIER_TRANSPARENT_ALIAS link). Think of case where one unit defines static function and other global function of the same name. We may end up renaming the symbol but keeping bogus transparent alias link that will trigger on other symbol. Then I think the code in ipa-chkp chkp_build_instrumented_fndecl (and more generally how we're using transparent aliases) may need some rethinking. jeff
Re: Silence merge warnings on artificial types
On Thu, Apr 02, 2015 at 09:23:03PM +0300, Ilya Verbin wrote: Hmm, libgomp.c++/target-3.C still fails. Here is what I see in need_assembler_name_p: Guess we should make the .omp_data_s.* types TYPE_ARTIFICIAL too. Will take care of that tomorrow. Jakub
[PATCH, CHKP] Fix static const bounds creation in LTO
Hi, Current in LTO static const bounds are created as external symbol. It doesn't work in case original symbols were removed and privatized. This patch introduces a separate comdat symbol to be used in LTO. Bootstrapped and tested on x86_64-unknown-linux-gnu. Does this approach look OK? Thanks, Ilya -- gcc/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c (CHKP_LTO_SYMBOL_SUFFIX): New. (chkp_make_static_const_bounds): Use another symbol in LTO. gcc/testsuite/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/lto/chkp-static-bounds_0.c: New. diff --git a/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c new file mode 100644 index 000..e896eb1 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c @@ -0,0 +1,26 @@ +/* { dg-lto-do link } */ +/* { dg-require-effective-target mpx } */ +/* { dg-lto-options { { -flto -flto-partition=max -fcheck-pointer-bounds -mmpx } } } */ + +char *cc; + +int test1 (const char *c) +{ + c = __builtin___bnd_init_ptr_bounds (c); + cc = c; + return c[0] * 2; +} + +struct S +{ + int (*fnptr) (const char *); +} S; + +struct S s1 = {test1}; +struct S s2 = {test1}; +struct S s3 = {test1}; + +int main (int argc, const char **argv) +{ + return s1.fnptr (argv[0]) + s2.fnptr (argv[1]); +} diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c index d2df4ba..0578936 100644 --- a/gcc/tree-chkp.c +++ b/gcc/tree-chkp.c @@ -421,6 +421,7 @@ static bool in_chkp_pass; #define CHKP_VAR_BOUNDS_PREFIX __chkp_var_bounds_ #define CHKP_ZERO_BOUNDS_VAR_NAME __chkp_zero_bounds #define CHKP_NONE_BOUNDS_VAR_NAME __chkp_none_bounds +#define CHKP_LTO_SYMBOL_SUFFIX .lto /* Static checker constructors may become very large and their compilation with optimization may take too much time. @@ -1906,7 +1907,8 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb, HOST_WIDE_INT ub, const char *name) { - tree var; + tree var, id; + varpool_node *node; /* With LTO we may have constant bounds already in varpool. Try to find it. */ @@ -1915,8 +1917,22 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb, if (var) return var; - var = build_decl (UNKNOWN_LOCATION, VAR_DECL, -get_identifier (name), pointer_bounds_type_node); + /* In LTO we may have symbol with changed visibility, comdat + group etc. Therefore we shouldn't recreate the same symbol. + Use LTO version instead. */ + if (in_lto_p) +{ + int len = strlen (name) + strlen (CHKP_LTO_SYMBOL_SUFFIX) + 1; + char *new_name = XALLOCAVEC (char, len); + strcpy (new_name, name); + strcat (new_name, CHKP_LTO_SYMBOL_SUFFIX); + id = get_identifier (new_name); +} + else +id = get_identifier (name); + + var = build_decl (UNKNOWN_LOCATION, VAR_DECL, id, +pointer_bounds_type_node); TREE_PUBLIC (var) = 1; TREE_USED (var) = 1; @@ -1925,18 +1941,17 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb, TREE_ADDRESSABLE (var) = 0; DECL_ARTIFICIAL (var) = 1; DECL_READ_P (var) = 1; + DECL_INITIAL (var) = targetm.chkp_make_bounds_constant (lb, ub); + DECL_COMDAT (var) = 1; + DECL_WEAK (var) = 1; /* We may use this symbol during ctors generation in chkp_finish_file when all symbols are emitted. Force output to avoid undefined symbols in ctors. */ - if (!in_lto_p) -{ - DECL_INITIAL (var) = targetm.chkp_make_bounds_constant (lb, ub); - DECL_COMDAT (var) = 1; - varpool_node::get_create (var)-set_comdat_group (DECL_ASSEMBLER_NAME (var)); - varpool_node::get_create (var)-force_output = 1; -} - else -DECL_EXTERNAL (var) = 1; + node = varpool_node::get_create (var); + node-set_comdat_group (DECL_ASSEMBLER_NAME (var)); + node-force_output = 1; + node-externally_visible = 1; + varpool_node::finalize_decl (var); return var;
Re: [C++ Patch/RFC] PR 64085
Hi, I guess I can as well ping this - still in my tree, no issues so far... On 03/04/2015 07:29 PM, Paolo Carlini wrote: Hi, today while working a bit on Bugzilla I found this issue which should be easy to fix, not a regression, therefore possibly post 5.0. At parsing time, cp_parser_lambda_introducer ends up calling real_lvalue_p on the lambda initializer (via add_capture), but then the initializer, which is a CALL_EXPR, crashes immediately lvalue_kind because its TREE_TYPE is not set (we are dealing with a template function). I guess that, in this case too, we want to check first dependent_type_p. Tested x86_64-linux. https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00241.html Thanks, Paolo.
[PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers
Hi, This patch fixes nested PARALLEL in retval for isntrumented calls. Current possible call: (call_insn:TI 6 30 17 2 (set (parallel [ (expr_list:REG_DEP_TRUE (parallel:TI [ (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1) (const_int 8 [0x8])) ]) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1) (const_int 0 [0])) ]) (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41] function_decl 0x7f6f50e02ca8 test1.chkp) [0 test1.chkp S1 A8]) (const_int 0 [0]))) complex.c:11 670 {*call_value} Such construction causes DF analysis problems. This patch tranforms it to: (call_insn:TI 6 30 17 2 (set (parallel:TI [ (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1) (const_int 8 [0x8])) (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1) (const_int 0 [0])) ]) (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41] function_decl 0x7fb609bf7ca8 test1.chkp) [0 test1.chkp S1 A8]) (const_int 0 [0]))) complex.c:11 670 {*call_value} Only MPX target is affected. Bootstrapped and tested on x86_64-unknown-linux-gnu. Does it look OK? Thanks, Ilya -- 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386.c (ix86_expand_call): Avoid nested PARALLEL in returned call value. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 744642c..1d821cd 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -25624,8 +25624,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1, { rtx b0 = gen_rtx_REG (BND64mode, FIRST_BND_REG); rtx b1 = gen_rtx_REG (BND64mode, FIRST_BND_REG + 1); - retval = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, retval, b0, b1)); - chkp_put_regs_to_expr_list (retval); + if (GET_CODE (retval) == PARALLEL) + { + b0 = gen_rtx_EXPR_LIST (VOIDmode, b0, const0_rtx); + b1 = gen_rtx_EXPR_LIST (VOIDmode, b1, const0_rtx); + rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, b0, b1)); + retval = chkp_join_splitted_slot (retval, par); + } + else + { + retval = gen_rtx_PARALLEL (VOIDmode, +gen_rtvec (3, retval, b0, b1)); + chkp_put_regs_to_expr_list (retval); + } } call = gen_rtx_SET (VOIDmode, retval, call);
[PATCH, CHKP] Fix ipa-comdats for instrumentation thunks
Hi, With r221574 (https://gcc.gnu.org/ml/gcc-cvs/2015-03/msg00495.html) thunks don't get comdat groups assigned and this causes a failure in cgraph checker for instrumentation thunks. It happens because instrumentation thunk may reference local symbol in comdat not being in comdat by itself. This patch fixes the problem. Doesn't affect not instrumented code. Testing is in progress. Does it look OK? Thanks, Ilya -- gcc/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * ipa-comdats.c (ipa_comdats): Visit all instrumentation thunks to set proper comdat group. gcc/testsuite/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * gcc.target/i386/mpx/chkp-thunk-comdat-1.cc: New. * gcc.target/i386/mpx/chkp-thunk-comdat-2.cc: New. diff --git a/gcc/ipa-comdats.c b/gcc/ipa-comdats.c index f349f9f..30bcad8 100644 --- a/gcc/ipa-comdats.c +++ b/gcc/ipa-comdats.c @@ -348,10 +348,9 @@ ipa_comdats (void) } /* Finally assign symbols to the sections. */ - + cgraph_node *fun; FOR_EACH_DEFINED_SYMBOL (symbol) { - struct cgraph_node *fun; symbol-aux = NULL; if (!symbol-get_comdat_group () !symbol-alias @@ -388,6 +387,20 @@ ipa_comdats (void) true); } } + + /* Instrumentation thunks reference original node and thus + need to be in the same comdat group. Otherwise we may + get a local instrumented symbol in a comdat group and + the referencing original node outside of it. */ + FOR_EACH_DEFINED_FUNCTION (fun) +if (fun-instrumentation_clone +fun-instrumented_version +!fun-instrumented_version-alias +fun-get_comdat_group () +!fun-instrumented_version-get_comdat_group ()) + fun-instrumented_version-call_for_symbol_and_aliases + (set_comdat_group_1, fun, true); + return 0; } diff --git a/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-1.cc b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-1.cc new file mode 100644 index 000..26d3c48 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-1.cc @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fcheck-pointer-bounds -mmpx } */ + +namespace +{ + template int dim + int __attribute__((noinline)) + f1 () + { +return dim; + } +} + +int +test () +{ + return f13 (); +} diff --git a/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-2.cc b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-2.cc new file mode 100644 index 000..2b1abe9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-2.cc @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fcheck-pointer-bounds -mmpx } */ + +class c1 +{ +public: + virtual int test1 (const char *); +}; + +class c2 +{ +public: + int test2 (const char *); +}; + +int +c1::test1 (const char *) +{ + return 0; +} + +int +c2::test2 (const char *) +{ + return 0; +}
C++ PATCH for c++/65642 (rejects-valid-constexpr)
This fixes the regression part of c++/65642. Some history behind this: up until r214941 we were able to fold A[i] + j into A[i + j], but in this revision richi removed this transformation. Since we need this transformation when dealing with constexprs, I added it back into cxx_fold_indirect_ref function (r221544). But that was a wrong place; we need to be able to handle such a transformation even when we're not dealing with INDIRECT_REFs, so I moved the folding into cxx_eval_constant_expression and introduced cxx_eval_pointer_plus_expression (r221777). That was the right thing to do, but I should've also added an evaluation of the first op of a POINTER_PLUS_EXPR, otherwise cxx_eval_pointer_plus_expression won't be able to do its job properly in some cases: e.g. when it gets s + 1 where s is a PARM_DECL: the function expects the s[0] + 1 form. The fix for the second part of this PR should follow soon. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2015-04-02 Marek Polacek pola...@redhat.com PR c++/65642 * constexpr.c (cxx_eval_pointer_plus_expression): Call cxx_eval_constant_expression on the first operand. * g++.dg/cpp0x/constexpr-fold1.C: New test. * g++.dg/cpp0x/constexpr-fold2.C: New test. diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c index 2100f94..f5be8df 100644 --- gcc/cp/constexpr.c +++ gcc/cp/constexpr.c @@ -2933,6 +2933,9 @@ cxx_eval_pointer_plus_expression (const constexpr_ctx *ctx, tree t, tree op01 = TREE_OPERAND (t, 1); location_t loc = EXPR_LOCATION (t); + op00 = cxx_eval_constant_expression (ctx, op00, lval, + non_constant_p, overflow_p); + STRIP_NOPS (op00); if (TREE_CODE (op00) != ADDR_EXPR) return NULL_TREE; diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-fold1.C gcc/testsuite/g++.dg/cpp0x/constexpr-fold1.C index e69de29..414a0da 100644 --- gcc/testsuite/g++.dg/cpp0x/constexpr-fold1.C +++ gcc/testsuite/g++.dg/cpp0x/constexpr-fold1.C @@ -0,0 +1,65 @@ +// PR c++/65642 +// { dg-do compile { target c++11 } } + +// Check we're able to evaluate these. + +#define SA(X) static_assert((X),#X) + +constexpr char s[] = abc; +constexpr int t[] = { 'a', 'b', 'c', '\0' }; + +constexpr char +fn1 (const char *p) +{ + return *(p + 1); +} + +constexpr char +fn2 (const char *p) +{ + return p[1]; +} + +constexpr int +fn3 (const int *p) +{ + return *(p + 1); +} + +constexpr int +fn4 (const int *p) +{ + return p[1]; +} + +constexpr auto c1 = fn1 (s[0]); +constexpr auto c2 = fn1 (s[1]); +constexpr auto c3 = fn1 (s[2]); + +SA (c1 == 'b'); +SA (c2 == 'c'); +SA (c3 == '\0'); + +constexpr auto d1 = fn2 (s[0]); +constexpr auto d2 = fn2 (s[1]); +constexpr auto d3 = fn2 (s[2]); + +SA (d1 == 'b'); +SA (d2 == 'c'); +SA (d3 == '\0'); + +constexpr auto e1 = fn3 (t[0]); +constexpr auto e2 = fn3 (t[1]); +constexpr auto e3 = fn3 (t[2]); + +SA (e1 == 'b'); +SA (e2 == 'c'); +SA (e3 == '\0'); + +constexpr auto f1 = fn4 (t[0]); +constexpr auto f2 = fn4 (t[1]); +constexpr auto f3 = fn4 (t[2]); + +SA (f1 == 'b'); +SA (f2 == 'c'); +SA (f3 == '\0'); diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-fold2.C gcc/testsuite/g++.dg/cpp0x/constexpr-fold2.C index e69de29..98aca2a 100644 --- gcc/testsuite/g++.dg/cpp0x/constexpr-fold2.C +++ gcc/testsuite/g++.dg/cpp0x/constexpr-fold2.C @@ -0,0 +1,30 @@ +// PR c++/65642 +// { dg-do compile { target c++11 } } + +#define SA(X) static_assert((X),#X) + +constexpr char s[] = abc; + +constexpr bool +cmp (char const *a, char const *b) +{ + return a == b; +} + +constexpr bool +fn1 (const char *s) +{ + return cmp (s, s + 1); +} + +constexpr bool +fn2 (const char *s) +{ + return cmp (s + 1, s + 1); +} + +constexpr auto c1 = fn1 (s[0]); +constexpr auto c2 = fn2 (s[0]); + +SA (!c1); +SA (c2); Marek
Re: New regression on ARM Linux
Richard Biener wrote: On Tue, 31 Mar 2015, Alan Lawrence wrote: (1) If we wish to keep the AAPCS principle that varargs are passed just as named args, we should use TYPE_MAIN_VARIANT inside arm_needs_doubleword_alignment, which will then ignore overalignment on both varargs _and named args_. However this would be silently ABI-changing? (2) It seems to me that SRA is the only way for overalignment info to be present on a value, so the patch to tree-sra.c/create_access_replacement seems to make things more consistent? I'm not so sure about (2), SCCVN records the type of a reference and PRE uses it to create the LHS temporaries to insert them. You'd need some tricky order of optimizations to expose that to a call argument though (copy-propagating the inserted value to a call argument). LIM may have similar issues (when doing store-motion), so may predictive commoning and loop distribution (and maybe others I forgot). Hmmm. The other cases you mention are somewhat worrisome, as we don't know if there are testcases that might tickle these too. But I think making a last-minute change to the backend, *that would affect the ABI for prototyped arguments*, is asking for trouble; we don't have time to think through the implications or make a proper spec, and I can easily see us wanting to change it *again* for gcc 6! So I think your patch to tree-sra.c (create_access_replacement) seems the right fix? --Alan
[PATCH, CHKP] Fix cdtor merge for instrumented functions
Hi, This patch doesn't allow instrumentation thunks calls while merging constructors and destructors. Not isntrumented code is not affeceted. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- gcc/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * ipa.c (ipa_cdtor_merge): Skip instrumentation thunks. gcc/testsuite/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/lto/chkp-ctor-merge_0.c: New. diff --git a/gcc/ipa.c b/gcc/ipa.c index b3752de..84ab542 100644 --- a/gcc/ipa.c +++ b/gcc/ipa.c @@ -1159,8 +1159,10 @@ ipa_cdtor_merge (void) { struct cgraph_node *node; FOR_EACH_DEFINED_FUNCTION (node) -if (DECL_STATIC_CONSTRUCTOR (node-decl) - || DECL_STATIC_DESTRUCTOR (node-decl)) +if ((DECL_STATIC_CONSTRUCTOR (node-decl) +|| DECL_STATIC_DESTRUCTOR (node-decl)) +!(node-thunk.thunk_p + node-thunk.add_pointer_bounds_args)) record_cdtor_fn (node); build_cdtor_fns (); static_ctors.release (); diff --git a/gcc/testsuite/gcc.dg/lto/chkp-ctor-merge_0.c b/gcc/testsuite/gcc.dg/lto/chkp-ctor-merge_0.c new file mode 100644 index 000..ac4095b --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/chkp-ctor-merge_0.c @@ -0,0 +1,23 @@ +/* { dg-lto-do run } */ +/* { dg-require-effective-target mpx } */ +/* { dg-lto-options { { -O2 -flto -fcheck-pointer-bounds -mmpx -nodefaultlibs -lc } } } */ + +int glob = 1; + +void __attribute__((constructor)) +ctor1 () +{ + glob += 1; +} + + +void __attribute__((constructor)) +ctor2 () +{ + glob -= 2; +} + +int main (int argc, const char **argv) +{ + return glob; +}
C++ PATCH for c++/65625 (ICE using variable template as type)
We were specifically handling misuse of a function template, but we also need to handle variable templates now. Tested x86_64-pc-linux-gnu, applying to trunk. commit d743a07ee6b19b3fec59c81ae2cbedbf38d91c4b Author: Jason Merrill ja...@redhat.com Date: Wed Apr 1 19:02:03 2015 -0400 PR c++/65625 * decl.c (make_typename_type): Handle seeing a variable template. diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 31b8e0c..c4731ae 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -3480,9 +3480,9 @@ make_typename_type (tree context, tree name, enum tag_types tag_type, if (TREE_CODE (name) == TEMPLATE_ID_EXPR) { name = TREE_OPERAND (name, 0); - if (TREE_CODE (name) == TEMPLATE_DECL) + if (DECL_TYPE_TEMPLATE_P (name)) name = TREE_OPERAND (fullname, 0) = DECL_NAME (name); - else if (TREE_CODE (name) == OVERLOAD) + if (TREE_CODE (name) != IDENTIFIER_NODE) { if (complain tf_error) error (%qD is not a type, name); diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ23.C b/gcc/testsuite/g++.dg/cpp1y/var-templ23.C new file mode 100644 index 000..a5b19d2 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1y/var-templ23.C @@ -0,0 +1,7 @@ +// PR c++/65625 +// { dg-do compile { target c++14 } } + +namespace std { + template typename T int declval; + typename std::declval d; // { dg-error not a type } +}
Re: [CHKP] Support returned bounds in thunks expand
Ping 2015-03-10 13:12 GMT+03:00 Ilya Enkovich enkovich@gmail.com: Hi, Currentl we loose returned bounds when functions are merged. This patch fixes it by adding returne bounds support for cgraph_node::expand_thunk. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- gcc/ 2015-03-06 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (cgraph_node::expand_thunk): Build returned bounds for instrumented functions. gcc/testsuite/ 2015-03-06 Ilya Enkovich ilya.enkov...@intel.com * gcc/testsuite/gcc.target/i386/thunk-retbnd.c: New. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index e640907..fc38e67 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -1581,6 +1581,7 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) int i; tree resdecl; tree restmp = NULL; + tree resbnd = NULL; gcall *call; greturn *ret; @@ -1697,6 +1698,21 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) gsi_insert_after (bsi, call, GSI_NEW_STMT); if (!alias_is_noreturn) { + if (instrumentation_clone + !DECL_BY_REFERENCE (resdecl) + restmp + BOUNDED_P (restmp)) + { + tree fn = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET); + gcall *retbnd = gimple_build_call (fn, 1, restmp); + + resbnd = create_tmp_reg (pointer_bounds_type_node, retbnd); + gimple_call_set_lhs (retbnd, resbnd); + + gsi_insert_after (bsi, retbnd, GSI_NEW_STMT); + create_edge (get_create (fn), retbnd, callees-count, callees-frequency); + } + if (restmp !this_adjusting (fixed_offset || virtual_offset)) { @@ -1766,6 +1782,7 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) ret = gimple_build_return (restmp); else ret = gimple_build_return (resdecl); + gimple_return_set_retbnd (ret, resbnd); gsi_insert_after (bsi, ret, GSI_NEW_STMT); } diff --git a/gcc/testsuite/gcc.target/i386/thunk-retbnd.c b/gcc/testsuite/gcc.target/i386/thunk-retbnd.c new file mode 100644 index 000..d9bd031 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/thunk-retbnd.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target mpx } */ +/* { dg-options -O2 -fcheck-pointer-bounds -mmpx -fdump-tree-optimized } */ +/* { dg-final { scan-tree-dump-times return glob, 2 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ + +int glob; + +int * +test1 (void) +{ + return glob; +} + +int * +test2 (void) +{ + return test1 (); +}
Re: [CHKP, PATCH] Fix LTO cgraph merge for instrumented functions
Ping 2015-03-19 11:34 GMT+03:00 Ilya Enkovich enkovich@gmail.com: On 12 Mar 13:27, Ilya Enkovich wrote: Hi, Currently cgraph merge has several issues with instrumented code: - original function node may be removed = no assembler name conflict is detected between function and variable - only orig_decl name is privatized for instrumented function = node still shares assembler name which causes infinite privatization loop - information about changed name is stored in file_data of instrumented node = original section name may be not found for original function - chkp reference is not fixed when nodes are merged This patch should fix theese problems by keeping instrumentation thunks reachable, privatizing both nodes and fixing chkp references. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya Here is an updated version where chkp_maybe_fix_chkp_ref also removes duplicating references which may appear during cgraph nodes merge. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- gcc/ 2015-03-19 Ilya Enkovich ilya.enkov...@intel.com * ipa-chkp.h (chkp_maybe_fix_chkp_ref): New. * ipa-chkp.c (chkp_maybe_fix_chkp_ref): New. * ipa.c (symbol_table::remove_unreachable_nodes): Don't remove instumentation thunks calling reachable functions. * lto-cgraph.c: Include ipa-chkp.h. (input_symtab): Fix chkp references for boundary nodes. * lto/lto-partition.c (privatize_symbol_name_1): New. (privatize_symbol_name): Privatize both decl and orig_decl names for instrumented functions. * lto/lto-symtab.c: Include ipa-chkp.h. (lto_cgraph_replace_node): Fix chkp references for merged function nodes. gcc/testsuite/ 2015-03-19 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/lto/chkp-privatize-1_0.c: New. * gcc.dg/lto/chkp-privatize-1_1.c: New. * gcc.dg/lto/chkp-privatize-2_0.c: New. * gcc.dg/lto/chkp-privatize-2_1.c: New. diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c index 0b857ff..7a7fc3c 100644 --- a/gcc/ipa-chkp.c +++ b/gcc/ipa-chkp.c @@ -414,6 +414,40 @@ chkp_instrumentable_p (tree fndecl) (!fn || !copy_forbidden (fn, fndecl))); } +/* Check NODE has a correct IPA_REF_CHKP reference. + Create a new reference if required. */ + +void +chkp_maybe_fix_chkp_ref (cgraph_node *node) +{ + /* Firstly check node needs IPA_REF_CHKP. */ + if (node-instrumentation_clone + || !node-instrumented_version) +return; + + /* Check we already have a proper IPA_REF_CHKP. + Remove incorrect refs. */ + int i; + ipa_ref *ref = NULL; + bool found = false; + for (i = 0; node-iterate_reference (i, ref); i++) +if (ref-use == IPA_REF_CHKP) + { + /* Found proper reference. */ + if (ref-referred == node-instrumented_version +!found) + found = true; + else + { + ref-remove_reference (); + i--; + } + } + + if (!found) +node-create_reference (node-instrumented_version, IPA_REF_CHKP, NULL); +} + /* Return clone created for instrumentation of NODE or NULL. */ cgraph_node * diff --git a/gcc/ipa-chkp.h b/gcc/ipa-chkp.h index 6708fe9..5fa7d88 100644 --- a/gcc/ipa-chkp.h +++ b/gcc/ipa-chkp.h @@ -24,5 +24,6 @@ extern tree chkp_copy_function_type_adding_bounds (tree orig_type); extern tree chkp_maybe_clone_builtin_fndecl (tree fndecl); extern cgraph_node *chkp_maybe_create_clone (tree fndecl); extern bool chkp_instrumentable_p (tree fndecl); +extern void chkp_maybe_fix_chkp_ref (cgraph_node *node); #endif /* GCC_IPA_CHKP_H */ diff --git a/gcc/ipa.c b/gcc/ipa.c index b3752de..3054afe 100644 --- a/gcc/ipa.c +++ b/gcc/ipa.c @@ -492,7 +492,22 @@ symbol_table::remove_unreachable_nodes (FILE *file) } else if (cnode-thunk.thunk_p) enqueue_node (cnode-callees-callee, first, reachable); - + + /* For instrumentation clones we always need original +function node for proper LTO privatization. */ + if (cnode-instrumentation_clone + reachable.contains (cnode) + cnode-definition) + { + gcc_assert (cnode-instrumented_version || in_lto_p); + if (cnode-instrumented_version) + { + enqueue_node (cnode-instrumented_version, first, + reachable); + reachable.add (cnode-instrumented_version); + } + } + /* If any reachable function has simd clones, mark them as reachable as well. */ if (cnode-simd_clones) diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index c875fed..b9196eb 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -80,6 +80,7 @@ along with GCC;
Re: [CHKP] Never expand instrumentation thunks
Ping. Patch doesn't affect not instrumented code. Thanks, Ilya 2015-03-18 15:19 GMT+03:00 Ilya Enkovich enkovich@gmail.com: Hi, This patch disables attempts to expand instrumentation thunks which appear when we create specialized function versions. Problem was found during SPEC2006 insatrumentation with '-Ofast -flto'. Unfortunately I couldn't make a small reproducer. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2015-03-18 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (cgraph_node::expand_thunk): Don't expand instrumentation thunks. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index e640907..abc9cfe 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) tree thunk_fndecl = decl; tree a; + /* Instrumentation thunk is the same function with + a different signature. Never need to expand it. */ + if (thunk.add_pointer_bounds_args) +return false; if (!force_gimple_thunk this_adjusting targetm.asm_out.can_output_mi_thunk (thunk_fndecl, fixed_offset,
Re: [PATCH] Fix PR ipa/65557
On Wed, Apr 01, 2015 at 11:42:28PM +0100, H.J. Lu wrote: On Mon, Mar 30, 2015 at 3:19 PM, Martin Liška mli...@suse.cz wrote: You are right, there's one more occurrence of the usage. I'm going to install the patch I've attached. This caused: FAIL: g++.dg/torture/pr64378.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error) FAIL: g++.dg/torture/pr64378.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) on x86-64. Likewise on aarch64-none-elf and aarch64-none-linux-gnu. Thanks, James /build-gcc/obj/gcc2/gcc/testsuite/g++6/../../xg++ -B/build-gcc/obj/gcc2/gcc/testsuite/g++6/../../ /gcc-src/gcc/gcc/testsuite/g++.dg/torture/pr64378.C -fno-diagnostics-show-caret -fdiagnostics-color=never -nostdinc++ -I/build-gcc/obj/gcc2/aarch64-none-elf/libstdc++-v3/include/aarch64-none-elf -I/build-gcc/obj/gcc2/aarch64-none-elf/libstdc++-v3/include -I/gcc-src/gcc/libstdc++-v3/libsupc++ -I/gcc-src/gcc/libstdc++-v3/include/backward -I/gcc-src/gcc/libstdc++-v3/testsuite/util -fmessage-length=0 -O2 -flto -fno-use-linker-plugin -flto-partition=none -fno-ipa-cp -S -specs=aem-ve.specs -mcmodel=small -o pr64378.s /gcc-src/gcc/gcc/testsuite/g++.dg/torture/pr64378.C:24:1: internal compiler error: Segmentation fault 0xc2c81a crash_signal /gcc-src/gcc/gcc/toplev.c:383 0x10e434e vecipa_param_descriptor, va_heap, vl_embed::operator[](unsigned int) /gcc-src/gcc/gcc/vec.h:736 0x10e434e vecipa_param_descriptor, va_heap, vl_ptr::operator[](unsigned int) /gcc-src/gcc/gcc/vec.h:1202 0x10e434e ipa_is_param_used /gcc-src/gcc/gcc/ipa-prop.h:405 0x10e434e ipa_icf::sem_item_optimizer::update_hash_by_addr_refs() /gcc-src/gcc/gcc/ipa-icf.c:2511 0x10e45c5 ipa_icf::sem_item_optimizer::execute() /gcc-src/gcc/gcc/ipa-icf.c:2394 0x10e6aa6 ipa_icf_driver /gcc-src/gcc/gcc/ipa-icf.c:3304 0x10e6aa6 ipa_icf::pass_ipa_icf::execute(function*) /gcc-src/gcc/gcc/ipa-icf.c:3351 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See http://gcc.gnu.org/bugs.html for instructions.
Re: [PATCH, CHKP] Restore transparent alias chains
Ping. This patch doesn't affect not instrumented code. Thanks, Ilya 2015-03-20 11:20 GMT+03:00 Ilya Enkovich enkovich@gmail.com: Hi, Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS bit. We always expect it for instrumentation clones, thus restore it input_cgraph_1. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2015-03-20 Ilya Enkovich ilya.enkov...@intel.com * lto-cgraph.c (input_cgraph_1): Always link instrumented assembler name with original one. diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index c875fed..d782327 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -1613,9 +1613,8 @@ input_cgraph_1 (struct lto_file_decl_data *file_data, } /* Restore decl names reference. */ - if (IDENTIFIER_TRANSPARENT_ALIAS (DECL_ASSEMBLER_NAME (cnode-decl)) - !TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode-decl))) - TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode-decl)) + IDENTIFIER_TRANSPARENT_ALIAS (DECL_ASSEMBLER_NAME (cnode-decl)) = 1; + TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode-decl)) = DECL_ASSEMBLER_NAME (cnode-orig_decl); } }
Re: C++ PATCH for c++/65642 (rejects-valid-constexpr)
OK. Jason
Re: [PATCH, CHKP] Fix ipa-comdats for instrumentation thunks
Hi, With r221574 (https://gcc.gnu.org/ml/gcc-cvs/2015-03/msg00495.html) thunks don't get comdat groups assigned and this causes a failure in cgraph checker for instrumentation thunks. It happens because instrumentation thunk may reference local symbol in comdat not being in comdat by itself. This patch fixes the problem. Doesn't affect not instrumented code. Testing is in progress. Does it look OK? Thanks, Ilya -- gcc/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * ipa-comdats.c (ipa_comdats): Visit all instrumentation thunks to set proper comdat group. gcc/testsuite/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * gcc.target/i386/mpx/chkp-thunk-comdat-1.cc: New. * gcc.target/i386/mpx/chkp-thunk-comdat-2.cc: New. diff --git a/gcc/ipa-comdats.c b/gcc/ipa-comdats.c index f349f9f..30bcad8 100644 --- a/gcc/ipa-comdats.c +++ b/gcc/ipa-comdats.c @@ -348,10 +348,9 @@ ipa_comdats (void) } /* Finally assign symbols to the sections. */ - + cgraph_node *fun; FOR_EACH_DEFINED_SYMBOL (symbol) { - struct cgraph_node *fun; symbol-aux = NULL; if (!symbol-get_comdat_group () !symbol-alias @@ -388,6 +387,20 @@ ipa_comdats (void) true); } } + + /* Instrumentation thunks reference original node and thus + need to be in the same comdat group. Otherwise we may + get a local instrumented symbol in a comdat group and + the referencing original node outside of it. */ + FOR_EACH_DEFINED_FUNCTION (fun) +if (fun-instrumentation_clone + fun-instrumented_version + !fun-instrumented_version-alias + fun-get_comdat_group () + !fun-instrumented_version-get_comdat_group ()) + fun-instrumented_version-call_for_symbol_and_aliases + (set_comdat_group_1, fun, true); I think you want to handle them same way as the aliasesthunks are handled. This fix is symptomatic: the code may assign them to different comdat groups and may propagate that furhter. Honza
Re: [RS6000] Fix 65576 regression
On Thu, Apr 2, 2015 at 1:22 AM, Alan Modra amo...@gmail.com wrote: This cures an unrecognizable insn ICE by modifying a predicate of extenddftf2_internal (the only place this predicate is used) to ensure that rtl optimization passes do not substitute 0.0 for a register with known 0.0 value, except when VSX is enabled. ie. Don't undo the necessary register move emitted by the extenddftf2_fprs expander. Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux and powerpc-linux. OK to apply? PR target/65576 PR target/65240 * config/rs6000/predicates.md (zero_reg_mem_operand): Exclude 0.0 constant unless TARGET_VSX. This change seems to disable the final alternative in the extenddftf2_internal pattern. - David
Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers
On 24 Mar 15:06, Jakub Jelinek wrote: On Tue, Mar 24, 2015 at 12:22:27PM +0300, Ilya Enkovich wrote: 2015-03-24 11:33 GMT+03:00 Jakub Jelinek ja...@redhat.com: On Thu, Mar 19, 2015 at 11:29:44AM +0300, Ilya Enkovich wrote: + /* We might propagate instrumented function pointer into + not instrumented function and vice versa. In such a + case we need to either fix function declaration or + remove bounds from call statement. */ + if (callee) +skip_bounds = chkp_redirect_edge (e); I just want to say that I'm not really excited about all this compile time cost that is added everywhere unconditionally for chkp. I think much better would be to guard most of it with proper option check first and only do the more expensive part if the option has been used. Agree, overhead for not instrumented code should be minimized. Unfortunately there is no option check I can use to guard chkp codes due to LTO. Currently it is allowed to pass -fcheck-pointer-bounds for IL generation and don't pass it for final code generation. I suppose I may set this (or some new) flag if see instrumented node when read cgraph and then use it to guard chkp related codes. Would it be acceptable? The question is what you want to do in the LTO case for the different cases, in particular a TU compiled with -fcheck-pointer-bounds and LTO link without that, or TU compiled without -fcheck-pointer-bounds and LTO link with it. It could be handled as LTO incompatible option, where lto1 would error out if you try to mix -fcheck-pointer-bounds with -fno-check-pointer-bounds code, or e.g. similar to var-tracking, you could consider adjusting the IL upon LTO reading if if some TU has been built with -fcheck-pointer-bounds and the LTO link is -fno-check-pointer-bounds. Dunno what will happen with -fno-check-pointer-bounds TUs LTO linked with -fcheck-pointer-bounds. Or another possibility is to or in -fcheck-pointer-bounds from all TUs. Maybe replace attribute usage with a new flag in tree_decl_with_vis structure? Depends, might be better to stick it into cgraph_node instead, depends on whether you are querying it already early in the FEs or just during GIMPLE when the cgraph node should be created too. Jakub Hi, Here is an updated version of the patch. I added merge for -fcheck-pointer-bounds option in LTO and used it as a guard for new code. Testing is in progress. Does this version look OK? Thanks, Ilya -- gcc/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com PR target/65527 * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Add redirection for instrumented calls. * lto-wrapper.c (merge_and_complain): Merge -fcheck-pointer-bounds. (append_compiler_options): Append -fcheck-pointer-bounds. * tree-chkp.h (chkp_copy_call_skip_bounds): New. (chkp_redirect_edge): New. * tree-chkp.c (chkp_copy_call_skip_bounds): New. (chkp_redirect_edge): New. gcc/testsuite/ 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com PR target/65527 * gcc.target/i386/mpx/chkp-fix-calls-1.c: New. * gcc.target/i386/mpx/chkp-fix-calls-2.c: New. * gcc.target/i386/mpx/chkp-fix-calls-3.c: New. * gcc.target/i386/mpx/chkp-fix-calls-4.c: New. diff --git a/gcc/cgraph.c b/gcc/cgraph.c index 85531c8..28b5996 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -1277,14 +1277,25 @@ cgraph_edge::redirect_call_stmt_to_callee (void) { cgraph_edge *e = this; - tree decl = gimple_call_fndecl (e-call_stmt); - tree lhs = gimple_call_lhs (e-call_stmt); + tree decl; + tree lhs; gcall *new_stmt; gimple_stmt_iterator gsi; + bool skip_bounds = false; #ifdef ENABLE_CHECKING cgraph_node *node; #endif + /* We might propagate instrumented function pointer into + not instrumented function and vice versa. In such a + case we need to either fix function declaration or + remove bounds from call statement. */ + if (flag_check_pointer_bounds callee) +skip_bounds = chkp_redirect_edge (e); + + decl = gimple_call_fndecl (e-call_stmt); + lhs = gimple_call_lhs (e-call_stmt); + if (e-speculative) { cgraph_edge *e2; @@ -1390,7 +1401,8 @@ cgraph_edge::redirect_call_stmt_to_callee (void) } if (e-indirect_unknown_callee - || decl == e-callee-decl) + || (decl == e-callee-decl + !skip_bounds)) return e-call_stmt; #ifdef ENABLE_CHECKING @@ -1415,13 +1427,19 @@ cgraph_edge::redirect_call_stmt_to_callee (void) } } - if (e-callee-clone.combined_args_to_skip) + if (e-callee-clone.combined_args_to_skip + || skip_bounds) { int lp_nr; - new_stmt - = gimple_call_copy_skip_args (e-call_stmt, - e-callee-clone.combined_args_to_skip); + new_stmt = e-call_stmt; + if (e-callee-clone.combined_args_to_skip) + new_stmt +
Re: [CHKP, PATCH] Fix LTO cgraph merge for instrumented functions
Ping It would help if you add hubi...@ucw.cz to CC for IPA related patches. 2015-03-19 11:34 GMT+03:00 Ilya Enkovich enkovich@gmail.com: On 12 Mar 13:27, Ilya Enkovich wrote: Hi, Currently cgraph merge has several issues with instrumented code: - original function node may be removed = no assembler name conflict is detected between function and variable Why do you need to detect this one? - only orig_decl name is privatized for instrumented function = node still shares assembler name which causes infinite privatization loop - information about changed name is stored in file_data of instrumented node = original section name may be not found for original function - chkp reference is not fixed when nodes are merged This patch should fix theese problems by keeping instrumentation thunks reachable, privatizing both nodes and fixing chkp references. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Next stage1 I definitly hope we will be able to reduce number of special cases we need for chck nodes. I will try to read the code and your design document and give it some tought. 2015-03-19 Ilya Enkovich ilya.enkov...@intel.com * ipa-chkp.h (chkp_maybe_fix_chkp_ref): New. * ipa-chkp.c (chkp_maybe_fix_chkp_ref): New. * ipa.c (symbol_table::remove_unreachable_nodes): Don't remove instumentation thunks calling reachable functions. * lto-cgraph.c: Include ipa-chkp.h. (input_symtab): Fix chkp references for boundary nodes. * lto/lto-partition.c (privatize_symbol_name_1): New. (privatize_symbol_name): Privatize both decl and orig_decl names for instrumented functions. * lto/lto-symtab.c: Include ipa-chkp.h. (lto_cgraph_replace_node): Fix chkp references for merged function nodes. gcc/testsuite/ 2015-03-19 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/lto/chkp-privatize-1_0.c: New. * gcc.dg/lto/chkp-privatize-1_1.c: New. * gcc.dg/lto/chkp-privatize-2_0.c: New. * gcc.dg/lto/chkp-privatize-2_1.c: New. diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c index 0b857ff..7a7fc3c 100644 --- a/gcc/ipa-chkp.c +++ b/gcc/ipa-chkp.c @@ -414,6 +414,40 @@ chkp_instrumentable_p (tree fndecl) (!fn || !copy_forbidden (fn, fndecl))); } +/* Check NODE has a correct IPA_REF_CHKP reference. + Create a new reference if required. */ + +void +chkp_maybe_fix_chkp_ref (cgraph_node *node) OK, so you have the chkp references that links to instrumented_version. You do not stream them for boundary symbols but for some reason they are needed, but when you can end up with wrong CHKP link? diff --git a/gcc/ipa.c b/gcc/ipa.c index b3752de..3054afe 100644 --- a/gcc/ipa.c +++ b/gcc/ipa.c @@ -492,7 +492,22 @@ symbol_table::remove_unreachable_nodes (FILE *file) } else if (cnode-thunk.thunk_p) enqueue_node (cnode-callees-callee, first, reachable); - + + /* For instrumentation clones we always need original +function node for proper LTO privatization. */ + if (cnode-instrumentation_clone + reachable.contains (cnode) + cnode-definition) + { + gcc_assert (cnode-instrumented_version || in_lto_p); + if (cnode-instrumented_version) + { + enqueue_node (cnode-instrumented_version, first, + reachable); + reachable.add (cnode-instrumented_version); + } + } + This is OK +/* Mangle NODE symbol name into a local name. + This is necessary to do + 1) if two or more static vars of same assembler name + are merged into single ltrans unit. + 2) if previously static var was promoted hidden to avoid possible conflict + with symbols defined out of the LTO world. */ + +static bool +privatize_symbol_name (symtab_node *node) +{ + if (!privatize_symbol_name_1 (node, node-decl)) +return false; + /* We could change name which is a target of transparent alias chain of instrumented function name. Fix alias chain if so .*/ - if (cnode) + if (cgraph_node *cnode = dyn_cast cgraph_node * (node)) { tree iname = NULL_TREE; if (cnode-instrumentation_clone) - iname = DECL_ASSEMBLER_NAME (cnode-decl); + { + /* If we want to privatize instrumentation clone +then we also need to privatize original function. */ + if (cnode-instrumented_version) + privatize_symbol_name (cnode-instrumented_version); + else + privatize_symbol_name_1 (cnode, cnode-orig_decl); This is because these are TRANSPARENT_ALIASes and thus visibility needs to match? I plan to add generic support for
Re: [PATCH, CHKP] Restore transparent alias chains
On 03/20/2015 02:20 AM, Ilya Enkovich wrote: Hi, Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS bit. We always expect it for instrumentation clones, thus restore it input_cgraph_1. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2015-03-20 Ilya Enkovich ilya.enkov...@intel.com * lto-cgraph.c (input_cgraph_1): Always link instrumented assembler name with original one. This appears to be a code path that only triggers when MPX is enabled and is roughly analogous to the code in chkp_build_instrumented_fndecl links things up. OK for the trunk. jeff
Re: Silence merge warnings on artificial types
On Tue, Mar 31, 2015 at 09:51:21 +0200, Jan Hubicka wrote: this patch adds the ARTIFICIAL flag check to avoid ODR merging to these. I oriignally tested DECL_ARTIFICIAL (decl) (that is TYPE_NAME) that randomly dropped type names on some classes but not all. Jason, please do you know what is meaning of DECL_ARTIFICIAL on class type names? Perhaps we can drop them to 0 in free lang data? With this bug I triggered wrong devirtualization because we no longer insert non-odr types into a type inheritance graph. This is fixed by the lto_read_decls change and finally I triggered an ICE in ipa-devirt that due to the bug output a warning late and ICEd on streamer cache being NULL. I guess it is better to guard it even though all wanrings should be output early. Bootsrapped/regtested x86_64-linux, will commit it after chromium rebuild. Honza * tree.c (need_assembler_name_p): Artificial types have no ODR names. * ipa-devirt.c (warn_odr): Do not try to apply ODR cache when no caching is done. * lto.c (lto_read_decls): Move code registering odr types out of TYPE_CANONICAL conditional and also register polymorphic types. Index: tree.c === --- tree.c (revision 221777) @@ -5139,6 +5145,7 @@ need_assembler_name_p (tree decl) decl == TYPE_NAME (TREE_TYPE (decl)) !is_lang_specific (TREE_TYPE (decl)) AGGREGATE_TYPE_P (TREE_TYPE (decl)) + !TYPE_ARTIFICIAL (TREE_TYPE (decl)) !variably_modified_type_p (TREE_TYPE (decl), NULL_TREE) !type_in_anonymous_namespace_p (TREE_TYPE (decl))) return !DECL_ASSEMBLER_NAME_SET_P (decl); Hmm, libgomp.c++/target-3.C still fails. Here is what I see in need_assembler_name_p: type_decl 0x2b5a8751aed8 .omp_data_s.3 type record_type 0x2b5a87536690 .omp_data_s.3 BLK size integer_cst 0x2b5a873ce318 constant 192 unit size integer_cst 0x2b5a873ce2e8 constant 24 align 64 symtab 0 alias set -1 canonical type 0x2b5a87536690 fields field_decl 0x2b5a87537000 b.0 type pointer_type 0x2b5a87536150 unsigned DI file test.cpp line 8 col 13 size integer_cst 0x2b5a873ace58 constant 64 unit size integer_cst 0x2b5a873ace70 constant 8 align 64 offset_align 128 offset integer_cst 0x2b5a873ace88 constant 0 bit offset integer_cst 0x2b5a873aced0 constant 0 context record_type 0x2b5a87536690 .omp_data_s.3 chain field_decl 0x2b5a87537098 s reference_to_this reference_type 0x2b5a87536738 VOID file test.cpp line 11 col 13 align 1 (TYPE_ARTIFICIAL = 0) type_decl 0x2b5a8751a850 .omp_data_s.3 type record_type 0x2b5a8752cdc8 .omp_data_s.3 BLK size integer_cst 0x2b5a873ce318 constant 192 unit size integer_cst 0x2b5a873ce2e8 constant 24 align 64 symtab 0 alias set -1 canonical type 0x2b5a8752cdc8 fields field_decl 0x2b5a8751a980 b.0 type pointer_type 0x2b5a8752c498 unsigned DI file test.cpp line 8 col 13 size integer_cst 0x2b5a873ace58 constant 64 unit size integer_cst 0x2b5a873ace70 constant 8 align 64 offset_align 128 offset integer_cst 0x2b5a873ace88 constant 0 bit offset integer_cst 0x2b5a873aced0 constant 0 context record_type 0x2b5a8752cdc8 .omp_data_s.3 chain field_decl 0x2b5a8751a8e8 s pointer_to_this pointer_type 0x2b5a87536888 VOID file test.cpp line 11 col 13 align 1 (TYPE_ARTIFICIAL = 0) I see, the DECL_ARTIFICIAL is set on TYPE_DECL but it is not set on type itself. Jakub, this looks like an artificial type, so perhaps we want to set it in omp-low.c? Otherwise I can add extra test for DECL_NAMELESS, but marking type as artifical looks like correct fix to me. Honza -- Ilya
Re: [PATCH, CHKP] Restore transparent alias chains
On 03/20/2015 02:20 AM, Ilya Enkovich wrote: Hi, Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS bit. We always expect it for instrumentation clones, thus restore it input_cgraph_1. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- 2015-03-20 Ilya Enkovich ilya.enkov...@intel.com * lto-cgraph.c (input_cgraph_1): Always link instrumented assembler name with original one. This appears to be a code path that only triggers when MPX is enabled and is roughly analogous to the code in chkp_build_instrumented_fndecl links things up. OK for the trunk. I think this will lead to wrong code. At this time, we may have multple declarations sharing single assembler name (and thus IDENTIFIER_TRANSPARENT_ALIAS link). Think of case where one unit defines static function and other global function of the same name. We may end up renaming the symbol but keeping bogus transparent alias link that will trigger on other symbol. During WPA the assembler names are never fully unique, but we also probably do not need to set IDENTIFIER_TRANSPARENT_ALIAS. I think IDENTIFIER_TRANSPARENT_ALIAS should be set in rename_statics and separately in ltrans on the place we skip lto_symtab merging. At least it is the place I link it with my patch for supporting transparent aliases in cgraph. I will try to find time to audit chkp code - I missed the addition of transparent aliases in the initial chkp patches. Symbol table code expect 1-1 correspondence between symbol table entries and real symbols. Basically all places we special case aliases or weakrefs needs to be revisited for transparent aliases. Honza
Re: [PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers
On Thu, Apr 2, 2015 at 5:07 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch fixes nested PARALLEL in retval for isntrumented calls. Current possible call: (call_insn:TI 6 30 17 2 (set (parallel [ (expr_list:REG_DEP_TRUE (parallel:TI [ (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1) (const_int 8 [0x8])) ]) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1) (const_int 0 [0])) ]) (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41] function_decl 0x7f6f50e02ca8 test1.chkp) [0 test1.chkp S1 A8]) (const_int 0 [0]))) complex.c:11 670 {*call_value} Such construction causes DF analysis problems. This patch tranforms it to: (call_insn:TI 6 30 17 2 (set (parallel:TI [ (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1) (const_int 8 [0x8])) (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1) (const_int 0 [0])) ]) (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41] function_decl 0x7fb609bf7ca8 test1.chkp) [0 test1.chkp S1 A8]) (const_int 0 [0]))) complex.c:11 670 {*call_value} Only MPX target is affected. Bootstrapped and tested on x86_64-unknown-linux-gnu. Does it look OK? Thanks, Ilya -- 2015-04-02 Ilya Enkovich ilya.enkov...@intel.com * config/i386/i386.c (ix86_expand_call): Avoid nested PARALLEL in returned call value. OK. Thanks, Uros.
Re: Silence merge warnings on artificial types
On Tue, Mar 31, 2015 at 09:51:21 +0200, Jan Hubicka wrote: this patch adds the ARTIFICIAL flag check to avoid ODR merging to these. I oriignally tested DECL_ARTIFICIAL (decl) (that is TYPE_NAME) that randomly dropped type names on some classes but not all. Jason, please do you know what is meaning of DECL_ARTIFICIAL on class type names? Perhaps we can drop them to 0 in free lang data? With this bug I triggered wrong devirtualization because we no longer insert non-odr types into a type inheritance graph. This is fixed by the lto_read_decls change and finally I triggered an ICE in ipa-devirt that due to the bug output a warning late and ICEd on streamer cache being NULL. I guess it is better to guard it even though all wanrings should be output early. Bootsrapped/regtested x86_64-linux, will commit it after chromium rebuild. Honza * tree.c (need_assembler_name_p): Artificial types have no ODR names. * ipa-devirt.c (warn_odr): Do not try to apply ODR cache when no caching is done. * lto.c (lto_read_decls): Move code registering odr types out of TYPE_CANONICAL conditional and also register polymorphic types. Index: tree.c === --- tree.c(revision 221777) @@ -5139,6 +5145,7 @@ need_assembler_name_p (tree decl) decl == TYPE_NAME (TREE_TYPE (decl)) !is_lang_specific (TREE_TYPE (decl)) AGGREGATE_TYPE_P (TREE_TYPE (decl)) + !TYPE_ARTIFICIAL (TREE_TYPE (decl)) !variably_modified_type_p (TREE_TYPE (decl), NULL_TREE) !type_in_anonymous_namespace_p (TREE_TYPE (decl))) return !DECL_ASSEMBLER_NAME_SET_P (decl); Hmm, libgomp.c++/target-3.C still fails. Here is what I see in need_assembler_name_p: type_decl 0x2b5a8751aed8 .omp_data_s.3 type record_type 0x2b5a87536690 .omp_data_s.3 BLK size integer_cst 0x2b5a873ce318 constant 192 unit size integer_cst 0x2b5a873ce2e8 constant 24 align 64 symtab 0 alias set -1 canonical type 0x2b5a87536690 fields field_decl 0x2b5a87537000 b.0 type pointer_type 0x2b5a87536150 unsigned DI file test.cpp line 8 col 13 size integer_cst 0x2b5a873ace58 constant 64 unit size integer_cst 0x2b5a873ace70 constant 8 align 64 offset_align 128 offset integer_cst 0x2b5a873ace88 constant 0 bit offset integer_cst 0x2b5a873aced0 constant 0 context record_type 0x2b5a87536690 .omp_data_s.3 chain field_decl 0x2b5a87537098 s reference_to_this reference_type 0x2b5a87536738 VOID file test.cpp line 11 col 13 align 1 (TYPE_ARTIFICIAL = 0) type_decl 0x2b5a8751a850 .omp_data_s.3 type record_type 0x2b5a8752cdc8 .omp_data_s.3 BLK size integer_cst 0x2b5a873ce318 constant 192 unit size integer_cst 0x2b5a873ce2e8 constant 24 align 64 symtab 0 alias set -1 canonical type 0x2b5a8752cdc8 fields field_decl 0x2b5a8751a980 b.0 type pointer_type 0x2b5a8752c498 unsigned DI file test.cpp line 8 col 13 size integer_cst 0x2b5a873ace58 constant 64 unit size integer_cst 0x2b5a873ace70 constant 8 align 64 offset_align 128 offset integer_cst 0x2b5a873ace88 constant 0 bit offset integer_cst 0x2b5a873aced0 constant 0 context record_type 0x2b5a8752cdc8 .omp_data_s.3 chain field_decl 0x2b5a8751a8e8 s pointer_to_this pointer_type 0x2b5a87536888 VOID file test.cpp line 11 col 13 align 1 (TYPE_ARTIFICIAL = 0) -- Ilya
Re: [CHKP] Support returned bounds in thunks expand
On 04/02/2015 08:49 AM, Ilya Enkovich wrote: Ping 2015-03-10 13:12 GMT+03:00 Ilya Enkovich enkovich@gmail.com: Hi, Currentl we loose returned bounds when functions are merged. This patch fixes it by adding returne bounds support for cgraph_node::expand_thunk. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Ilya -- gcc/ 2015-03-06 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (cgraph_node::expand_thunk): Build returned bounds for instrumented functions. gcc/testsuite/ 2015-03-06 Ilya Enkovich ilya.enkov...@intel.com * gcc/testsuite/gcc.target/i386/thunk-retbnd.c: New. I really dislike the amount of gimple and bounded pointer knowledge in this code. It seems like a significant modularity violation and while you didn't introduce the gimple stuff, we probably shouldn't be making it worse. Is it possible to let this code build up the thunk, then pass it off as a whole to the chkp code to add the instrumentation, particularly for the return value? ALso, is this critical for stage4? It looks like this is strictly a QofI change, correct? jeff
Re: [PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers
On Thu, Apr 2, 2015 at 10:17 PM, Jan Hubicka hubi...@ucw.cz wrote: rtx b0 = gen_rtx_REG (BND64mode, FIRST_BND_REG); rtx b1 = gen_rtx_REG (BND64mode, FIRST_BND_REG + 1); - retval = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, retval, b0, b1)); - chkp_put_regs_to_expr_list (retval); + if (GET_CODE (retval) == PARALLEL) + { + b0 = gen_rtx_EXPR_LIST (VOIDmode, b0, const0_rtx); + b1 = gen_rtx_EXPR_LIST (VOIDmode, b1, const0_rtx); + rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, b0, b1)); + retval = chkp_join_splitted_slot (retval, par); I do not understand this code. Isn't it just droping the original return value constructing prallelcontaining only the BND_REGs? Please see chkp_join_splitted_slot that merges two parallels. I see, OK then :) Honza Uros.