[PATCH] Fix a test with bogus size_t type

2015-04-02 Thread Marek Polacek
We are now more strict when accepting user-defined initializer_lists; in
particular, we now require sizetype, not just any integral type.  The
following test failed with -m32, because it had a bogus type of size_t,
with -m32 it usually should be unsigned int, not unsigned long.

Test passes now with both -m32/-m64, ok for trunk?

2015-04-02  Marek Polacek  pola...@redhat.com

* g++.dg/cpp0x/pr57101.C: Use proper type for size_t.

diff --git gcc/testsuite/g++.dg/cpp0x/pr57101.C 
gcc/testsuite/g++.dg/cpp0x/pr57101.C
index 94b576f..c0fc966 100644
--- gcc/testsuite/g++.dg/cpp0x/pr57101.C
+++ gcc/testsuite/g++.dg/cpp0x/pr57101.C
@@ -1,7 +1,7 @@
 // { dg-do compile { target c++11 } }
 // { dg-options -fcompare-debug }
 
-typedef long unsigned size_t;
+typedef __SIZE_TYPE__ size_t;
 namespace
 {
   template  typename _Tp, _Tp __v  struct integral_constant

Marek


[PATCH] Fix PR65650 (1/n in merging CCP and copyprop)

2015-04-02 Thread Richard Biener

The following makes CCP track copies which avoids pass ordering
issues between CCP and copyprop as seen from the testcase.

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.

For stage1 I'd like to get rid of copyprop completely, a 2nd patch
in the series will remove the copyprop instances immediately
preceeding/following CCP.

CCP needs some TLC and I'm going to apply that during stage1.

Thanks,
Richard.

2015-04-02  Richard Biener  rguent...@suse.de

PR tree-optimization/65650
* tree-ssa-ccp.c (valid_lattice_transition): Allow lattice
transitions involving copies.
(set_lattice_value): Adjust for copy lattice state.
(ccp_lattice_meet): Do not merge UNDEFINED and a copy to the copy.
(bit_value_unop): Adjust what we treat as varying mask.
(bit_value_binop): Likewise.
(bit_value_assume_aligned): Likewise.
(evaluate_stmt): When we simplified to a SSA name record a copy
instead of dropping to varying.
(visit_assignment): Simplify.

* gimple-match.h (gimple_simplify): Add another callback.
* gimple-fold.c (fold_stmt_1): Adjust caller.
(gimple_fold_stmt_to_constant_1): Likewise - pass valueize
for the 2nd callback.
* gimple-match-head.c (gimple_simplify): Add a callback that is
used to valueize the stmt operands and use it that way.

* gcc.dg/tree-ssa/ssa-ccp-36.c: New testcase.

Index: gcc/tree-ssa-ccp.c
===
*** gcc/tree-ssa-ccp.c.orig 2015-04-01 15:13:46.424457049 +0200
--- gcc/tree-ssa-ccp.c  2015-04-01 16:33:50.194715581 +0200
*** valid_lattice_transition (ccp_prop_value
*** 439,444 
--- 439,455 
  
/* Now both lattice values are CONSTANT.  */
  
+   /* Allow arbitrary copy changes as we might look through PHI a_1, ...
+  when only a single copy edge is executable.  */
+   if (TREE_CODE (old_val.value) == SSA_NAME
+TREE_CODE (new_val.value) == SSA_NAME)
+ return true;
+ 
+   /* Allow transitioning from a constant to a copy.  */
+   if (is_gimple_min_invariant (old_val.value)
+TREE_CODE (new_val.value) == SSA_NAME)
+ return true;
+ 
/* Allow transitioning from PHI x, not executable == x
   to PHI x, y == common alignment.  */
if (TREE_CODE (old_val.value) != INTEGER_CST
*** set_lattice_value (tree var, ccp_prop_va
*** 527,535 
   caller that this was a non-transition.  */
if (old_val-lattice_val != new_val.lattice_val
|| (new_val.lattice_val == CONSTANT
!  TREE_CODE (new_val.value) == INTEGER_CST
!  (TREE_CODE (old_val-value) != INTEGER_CST
! || new_val.mask != old_val-mask)))
  {
/* ???  We would like to delay creation of INTEGER_CSTs from
 partially constants here.  */
--- 538,547 
   caller that this was a non-transition.  */
if (old_val-lattice_val != new_val.lattice_val
|| (new_val.lattice_val == CONSTANT
!  (TREE_CODE (new_val.value) != TREE_CODE (old_val-value)
! || simple_cst_equal (new_val.value, old_val-value) != 1
! || (TREE_CODE (new_val.value) == INTEGER_CST
!  new_val.mask != old_val-mask
  {
/* ???  We would like to delay creation of INTEGER_CSTs from
 partially constants here.  */
*** ccp_finalize (void)
*** 958,969 
  static void
  ccp_lattice_meet (ccp_prop_value_t *val1, ccp_prop_value_t *val2)
  {
!   if (val1-lattice_val == UNDEFINED)
  {
/* UNDEFINED M any = any   */
*val1 = *val2;
  }
!   else if (val2-lattice_val == UNDEFINED)
  {
/* any M UNDEFINED = any
   Nothing to do.  VAL1 already contains the value we want.  */
--- 970,989 
  static void
  ccp_lattice_meet (ccp_prop_value_t *val1, ccp_prop_value_t *val2)
  {
!   if (val1-lattice_val == UNDEFINED
!   /* For UNDEFINED M SSA we can't use SSA because its definition
!  may not dominate the PHI node.  */
!(val2-lattice_val != CONSTANT
! || TREE_CODE (val2-value) != SSA_NAME))
  {
/* UNDEFINED M any = any   */
*val1 = *val2;
  }
!   else if (val2-lattice_val == UNDEFINED
!  /* For UNDEFINED M SSA we can't use SSA because its definition
! may not dominate the PHI node.  */
!   (val1-lattice_val != CONSTANT
!  || TREE_CODE (val1-value) != SSA_NAME))
  {
/* any M UNDEFINED = any
   Nothing to do.  VAL1 already contains the value we want.  */
*** bit_value_unop (enum tree_code code, tre
*** 1513,1519 
  || rval.mask == -1);
bit_value_unop_1 (code, type, value, mask,
TREE_TYPE (rhs), value_to_wide_int (rval), rval.mask);
!   if (mask != -1)
  {
val.lattice_val = CONSTANT;
val.mask = mask;
--- 1533,1539 
  || rval.mask 

[PATCH, i386]: Use UNSPEC_{FILD_FIST}_ATOMIC in {load,store}di_via_fpu

2015-04-02 Thread Uros Bizjak
Hello!

Trivial patch to make insn dumps more informative.

2015-04-02  Uros Bizjak  ubiz...@gmail.com

* config/i386/sync.md (UNSPEC_FILD_ATOMIC, UNSPEC_FIST_ATOMIC): New.
(loaddi_via_fpu): Use UNSPEC_FILD_ATOMIC.
(storedi_via_fpu): Use UNSPEC_FIST_ATOMIC.
* reg-stack.c (get_true_reg): Change UNSPEC_LDA to UNSPEC_FILD_ATOMIC.
(subst_stack_regs_pat): Change UNSPEC_STA to UNSPEC_FIST_ATOMIC.

Bootstrapped and regtested on x86_64-linux-gnu {,-m32} and committed
to mainline SVN.

Uros.
Index: config/i386/sync.md
===
--- config/i386/sync.md (revision 221826)
+++ config/i386/sync.md (working copy)
@@ -22,6 +22,9 @@
   UNSPEC_SFENCE
   UNSPEC_MFENCE
 
+  UNSPEC_FILD_ATOMIC
+  UNSPEC_FIST_ATOMIC
+
   ;; __atomic support
   UNSPEC_LDA
   UNSPEC_STA
@@ -302,7 +305,8 @@
 
 (define_insn loaddi_via_fpu
   [(set (match_operand:DF 0 register_operand =f)
-   (unspec:DF [(match_operand:DI 1 memory_operand m)] UNSPEC_LDA))]
+   (unspec:DF [(match_operand:DI 1 memory_operand m)]
+  UNSPEC_FILD_ATOMIC))]
   TARGET_80387
   fild%Z1\t%1
   [(set_attr type fmov)
@@ -311,7 +315,8 @@
 
 (define_insn storedi_via_fpu
   [(set (match_operand:DI 0 memory_operand =m)
-   (unspec:DI [(match_operand:DF 1 register_operand f)] UNSPEC_STA))]
+   (unspec:DI [(match_operand:DF 1 register_operand f)]
+  UNSPEC_FIST_ATOMIC))]
   TARGET_80387
 {
   gcc_assert (find_regno_note (insn, REG_DEAD, FIRST_STACK_REG) != NULL_RTX);
Index: reg-stack.c
===
--- reg-stack.c (revision 221826)
+++ reg-stack.c (working copy)
@@ -448,7 +448,7 @@ get_true_reg (rtx *pat)
 
   case UNSPEC:
if (XINT (*pat, 1) == UNSPEC_TRUNC_NOOP
-   || XINT (*pat, 1) == UNSPEC_LDA)
+   || XINT (*pat, 1) == UNSPEC_FILD_ATOMIC)
  pat =  XVECEXP (*pat, 0, 0);
return pat;
 
@@ -1670,8 +1670,8 @@ subst_stack_regs_pat (rtx_insn *insn, stack_ptr re
  case UNSPEC:
switch (XINT (pat_src, 1))
  {
- case UNSPEC_STA:
  case UNSPEC_FIST:
+ case UNSPEC_FIST_ATOMIC:
 
  case UNSPEC_FIST_FLOOR:
  case UNSPEC_FIST_CEIL:


Re: [Patch, fortran, PR44672, v3] [F08] ALLOCATE with SOURCE and no array-spec

2015-04-02 Thread Andre Vehreschild
Hi all,

during debugging of a larger fortran source I figured that my previous patch on
44672 had still some issues, when it comes to adding a gfc_code into the chain
of codes and with a symbol. Adding a new gfc_code object before the current one
is now solved be creating a new gfc_code object, copying the current one to the
new one, initialize the old one to the new data and setting its next pointer to
the current one. Because in the gfc_code.ext.alloc a flag is introduced, that
is only set by the C-code adding a new gfc_code object, that flag can be used
to prevent doing this process endlessly. I also learned, that one has to commit
newly created symbols or one may get a very strange error in an assert in
gfc_enforce_clean_symbol_state (). After adding the gfc_commit_symbol ()
everything was fine.

Bootstraps and regtests ok on x86_64-linux-gnu/F20.

Ok for 5.2 trunk?

Regards,
Andre

On Wed, 1 Apr 2015 15:15:40 +0200
Andre Vehreschild ve...@gmx.de wrote:

 Hi all,
 
 during debugging another fortran code, I figured that some cases were not yet
 met. Especially the case where a class array is in the source= or mold=
 expression. This new version of the patch fixes the issue now.
 
 Bootstraps and regtests ok on x86_64-linux-gnu/F20.
 
 Ok for 5.2 trunk?
 
 Regards,
   Andre
 
 On Mon, 30 Mar 2015 19:47:49 +0200
 Andre Vehreschild ve...@gmx.de wrote:
 
  Dear all,
  
  please find attach a patch fixing pr44672:
  
  integer, dimension(:) :: arr
  allocate(arr, source = [1,2,3])
  
  as for F2008:C633 now is no longer flagged, beside when you insist on
  -std=f2003 or lower. Furthermore does the patch implement the F2008 feature
  of obsoleting the explicit array specification on the arrays to allocate,
  when an array valued source=/mold= expression is given.
  
  Bootstrap and regtests ok on x86_64-linux-gnu/F20.
  
  This batched is based on a trunk having my latest for pr60322 patched in
  (else deltas may occur).
  
  Ok for 5.2 trunk?
  
  Regards,
  Andre
 
 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 643cd6a..9835edc 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2394,6 +2394,9 @@ typedef struct gfc_code
 {
   gfc_typespec ts;
   gfc_alloc *list;
+  /* Take the array specification from expr3 to allocate arrays
+	 without an explicit array specification.  */
+  unsigned arr_spec_from_expr3:1;
 }
 alloc;
 
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 316b413..21add32 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -6804,7 +6804,7 @@ conformable_arrays (gfc_expr *e1, gfc_expr *e2)
have a trailing array reference that gives the size of the array.  */
 
 static bool
-resolve_allocate_expr (gfc_expr *e, gfc_code *code)
+resolve_allocate_expr (gfc_expr *e, gfc_code *code, bool *array_alloc_wo_spec)
 {
   int i, pointer, allocatable, dimension, is_abstract;
   int codimension;
@@ -7103,9 +7103,20 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code)
   if (!ref2 || ref2-type != REF_ARRAY || ref2-u.ar.type == AR_FULL
   || (dimension  ref2-u.ar.dimen == 0))
 {
-  gfc_error (Array specification required in ALLOCATE statement 
-		 at %L, e-where);
-  goto failure;
+  /* F08:C633.  */
+  if (code-expr3)
+	{
+	  if (!gfc_notify_std (GFC_STD_F2008, Array specification required 
+			   in ALLOCATE statement at %L, e-where))
+	goto failure;
+	  *array_alloc_wo_spec = true;
+	}
+  else
+	{
+	  gfc_error (Array specification required in ALLOCATE statement 
+		 at %L, e-where);
+	  goto failure;
+	}
 }
 
   /* Make sure that the array section reference makes sense in the
@@ -7124,7 +7135,7 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code)
 
   for (i = 0; i  ar-dimen; i++)
 {
-  if (ref2-u.ar.type == AR_ELEMENT)
+  if (ar-type == AR_ELEMENT || ar-type == AR_FULL)
 	goto check_symbols;
 
   switch (ar-dimen_type[i])
@@ -7201,12 +7212,23 @@ failure:
   return false;
 }
 
+
+static gfc_code *
+build_assignment (gfc_exec_op op, gfc_expr *expr1, gfc_expr *expr2,
+		  gfc_component *comp1, gfc_component *comp2, locus loc);
+
+
 static void
 resolve_allocate_deallocate (gfc_code *code, const char *fcn)
 {
   gfc_expr *stat, *errmsg, *pe, *qe;
   gfc_alloc *a, *p, *q;
 
+  /* When this flag is set already, then this allocate has already been
+ resolved.  Doing so again, would result in an endless loop.  */
+  if (code-ext.alloc.arr_spec_from_expr3)
+return;
+
   stat = code-expr1;
   errmsg = code-expr2;
 
@@ -7375,8 +7397,97 @@ resolve_allocate_deallocate (gfc_code *code, const char *fcn)
 
   if (strcmp (fcn, ALLOCATE) == 0)
 {
+  bool arr_alloc_wo_spec = false;
   for (a = code-ext.alloc.list; a; a = a-next)
-	resolve_allocate_expr (a-expr, code);
+	resolve_allocate_expr (a-expr, code, arr_alloc_wo_spec);
+
+  if (arr_alloc_wo_spec  code-expr3)
+	{

Re: C++ PATCH for c++/65554 (ICE with user-defined initializer_list)

2015-04-02 Thread Marek Polacek
On Wed, Apr 01, 2015 at 03:37:19PM -0700, H.J. Lu wrote:
 This caused:
 
 FAIL: g++.dg/cpp0x/pr57101.C  -std=gnu++11 (test for excess errors)
 FAIL: g++.dg/cpp0x/pr57101.C  -std=gnu++14 (test for excess errors)
 
 on 32 bit system.

Thanks, I'll fix it momentarily.


Re: [CHKP] Never expand instrumentation thunks

2015-04-02 Thread Jeff Law

On 04/02/2015 08:51 AM, Ilya Enkovich wrote:

Ping. Patch doesn't affect not instrumented code.

Thanks,
Ilya

2015-03-18 15:19 GMT+03:00 Ilya Enkovich enkovich@gmail.com:

Hi,

This patch disables attempts to expand instrumentation thunks which appear when 
we create specialized function versions.  Problem was found during SPEC2006 
insatrumentation with '-Ofast -flto'.  Unfortunately I couldn't make a small 
reproducer.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
2015-03-18  Ilya Enkovich  ilya.enkov...@intel.com

 * cgraphunit.c (cgraph_node::expand_thunk): Don't expand
 instrumentation thunks.
So is it the case that these instrumentation thunks have a gimple body 
and thus will be expanded elsewhere?  Or are you trying to solve some 
different problem?


Jeff



Re: [PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers

2015-04-02 Thread Jan Hubicka
 Hi,
 
 This patch fixes nested PARALLEL in retval for isntrumented calls. Current 
 possible call:
 
 (call_insn:TI 6 30 17 2 (set (parallel [
 (expr_list:REG_DEP_TRUE (parallel:TI [
 (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0)
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1)
 (const_int 8 [0x8]))
 ])
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
 (const_int 0 [0]))
 ])
 (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41]  function_decl 
 0x7f6f50e02ca8 test1.chkp) [0 test1.chkp S1 A8])
 (const_int 0 [0]))) complex.c:11 670 {*call_value}
 
 Such construction causes DF analysis problems.  This patch tranforms it to:
 
 (call_insn:TI 6 30 17 2 (set (parallel:TI [
 (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0)
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1)
 (const_int 8 [0x8]))
 (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
 (const_int 0 [0]))
 ])
 (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41]  function_decl 
 0x7fb609bf7ca8 test1.chkp) [0 test1.chkp S1 A8])
 (const_int 0 [0]))) complex.c:11 670 {*call_value}

Yep, nested parallels should not happen.

 2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com
 
   * config/i386/i386.c (ix86_expand_call): Avoid nested
   PARALLEL in returned call value.
 
 
 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 744642c..1d821cd 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -25624,8 +25624,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx 
 callarg1,
   {
 rtx b0 = gen_rtx_REG (BND64mode, FIRST_BND_REG);
 rtx b1 = gen_rtx_REG (BND64mode, FIRST_BND_REG + 1);
 -   retval = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, retval, b0, b1));
 -   chkp_put_regs_to_expr_list (retval);
 +   if (GET_CODE (retval) == PARALLEL)
 + {
 +   b0 = gen_rtx_EXPR_LIST (VOIDmode, b0, const0_rtx);
 +   b1 = gen_rtx_EXPR_LIST (VOIDmode, b1, const0_rtx);
 +   rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, b0, b1));
 +   retval = chkp_join_splitted_slot (retval, par);

I do not understand this code. Isn't it just droping the original return value
constructing prallelcontaining only the BND_REGs?

Honza


Re: [CHKP] Never expand instrumentation thunks

2015-04-02 Thread Jeff Law

On 04/02/2015 03:13 PM, Jan Hubicka wrote:


I think it is really the transparent alias issue.  The comment seems pretty 
clear about it.
What is confusing is that instrumentation thunks are called thunks when they are
really not - thunk is a small hunk of code, while instrumentation thunk is a 
transparent
alias.
And it may be the terminology gets in the way.  I was at least partly 
interested in what code we generated for that testcase.




Too bad I did not notice we introduced transparent aliases, i would push out my
code for that. I will compare Ilya's changes with mine and hopefully we can
catch more bugs and unify the code next stage1.

Sounds good.
jeff


[PATCH, c6x] handle unk_isa in TARGET_CPU_CPP_BUILTINS

2015-04-02 Thread Bernhard Reutner-Fischer
On Wed, Apr 01, 2015 at 11:37:41PM +0200, Bernhard Reutner-Fischer wrote:
Bernd,

same for c6x for unk_isa, fwiw.

Attached.

Ok for trunk for the c6x bits?
Ok for trunk for the bfin bits?

thanks,

thanks,

On 1 April 2015 at 23:34, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
 Hi,

 gcc/c-family/c-cppbuiltin.c
 In file included from ./tm.h:21:0,
  from
 ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c:23:
 ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c:
 In function ‘void c_cpp_builtins(cpp_reader*)’:
 ../../../../../../home/me/src/gcc-5.0.mine/gcc/config/bfin/bfin.h:43:14:
 error: enumeration value ‘BFIN_CPU_UNKNOWN’ not handled in switch
 [-Werror=switch]
switch (bfin_cpu_type)   \
   ^
 ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c:1243:3:
 note: in expansion of macro ‘TARGET_CPU_CPP_BUILTINS’
TARGET_CPU_CPP_BUILTINS ();
^
 cc1plus: all warnings being treated as errors
 make[2]: *** [c-family/c-cppbuiltin.o] Error 1

 gcc/ChangeLog:

 2015-04-01  Bernhard Reutner-Fischer  al...@gcc.gnu.org

   * config/bfin/bfin.h (TARGET_CPU_CPP_BUILTINS): Add BFIN_CPU_UNKNOWN.
From da0fd9e46efb975f8e5d8854f221689ff8808bc5 Mon Sep 17 00:00:00 2001
From: Bernhard Reutner-Fischer rep.dot@gmail.com
Date: Thu, 1 Apr 2015 23:42:08 +0200
Subject: [PATCH 6/6] handle c6x unk_isa in TARGET_CPU_CPP_BUILTINS

gcc/ChangeLog:

2015-04-01  Bernhard Reutner-Fischer  al...@gcc.gnu.org

* config/c6x/c6x.h (TARGET_CPU_CPP_BUILTINS): Add unk_isa.

Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 gcc/config/c6x/c6x.h |2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/c6x/c6x.h b/gcc/config/c6x/c6x.h
index 58a7ac6..9b3fcfb 100644
--- a/gcc/config/c6x/c6x.h
+++ b/gcc/config/c6x/c6x.h
@@ -84,6 +84,8 @@ extern c6x_cpu_t c6x_arch;
 		\
   switch (c6x_arch)\
 	{	\
+	case unk_isa:\
+	  break;\
 	case C6X_CPU_C62X:			\
 	  builtin_define (_TMS320C6200);	\
 	  break;\
-- 
1.7.10.4



Re: [PATCH] [ARM] Add support for the Samsung Exynos M1 processor

2015-04-02 Thread Sebastian Pop
Hi,

from what I understand, Evandro has addressed the comments from Kyrill.
Are there other problems to be addressed before the patches can go in?

Thanks,
Sebastian

On Tue, Mar 31, 2015 at 7:30 PM, Evandro Menezes e.mene...@samsung.com wrote:
 Hi, Kyrill.

 At this moment, it suffices to use the same scheduling as Cortex A57, but
 more specific details are to be expected.

 I couldn't check the build though, as my Arndale is strange today.  As soon
 as it's healthy, I'll check it.

 I appreciate your feedback.

 --
 Evandro Menezes  Austin, TX


 -Original Message-
 From: Kyrill Tkachov [mailto:kyrylo.tkac...@arm.com]
 Sent: Tuesday, March 31, 2015 3:33
 To: Evandro Menezes; 'GCC Patches'
 Subject: Re: [PATCH] [ARM] Add support for the Samsung Exynos M1 processor

 Hi Evandro
 On 30/03/15 22:51, Evandro Menezes wrote:
  The Samsung Exynos M1 implements the ARMv8 ISA and this patch adds
  support for it through the -mcpu command-line option.
 
  The patch was checked on arm-unknown-linux-gnueabihf without new
 failures.
 
  OK for trunk?
 
  -- Evandro Menezes Austin, TX
 
  0001-ARM-Add-option-for-the-Samsung-Exynos-M1-core-for-AR.patch
 
 
  diff --git a/gcc/config/arm/arm-cores.def
  b/gcc/config/arm/arm-cores.def index b22ea7f..0710a38 100644
  --- a/gcc/config/arm/arm-cores.def
  +++ b/gcc/config/arm/arm-cores.def
  @@ -168,6 +168,7 @@ ARM_CORE(cortex-a17.cortex-a7, cortexa17cortexa7,
 cortexa7, 7A,  FL_LDSCHED |
ARM_CORE(cortex-a53,cortexa53, cortexa53,   8A, FL_LDSCHED |
 FL_CRC32, cortex_a53)
ARM_CORE(cortex-a57,cortexa57, cortexa57,   8A, FL_LDSCHED |
 FL_CRC32, cortex_a57)
ARM_CORE(cortex-a72,cortexa72, cortexa57,   8A, FL_LDSCHED |
 FL_CRC32, cortex_a57)
  +ARM_CORE(exynos-m1,  exynosm1,  exynosm1,8A, FL_LDSCHED |
 FL_CRC32,
 exynosm1)

 There are two problems with this:
 * The 3rd field of ARM_CORE represents the scheduling identifier and
 without
 a separate pipeline description for exynosm1 this will just use the
 generic_sched scheduler which performs quite poorly on modern cores.
 Would
 you prefer to reuse a pipeline description from one of the pre-existing
 ones?
 Look for example at the cortex-a72 definition:
 ARM_CORE(cortex-a72,cortexa72, cortexa57,  ...snip
 here the cortexa57 means 'make scheduling decisions for cortexa57'.

 * The final field in ARM_CORE specifies the tuning struct to be used for
 this
 core.
 This should be defined in arm.c and have the form 'arm_ident_tune, so
 for
 your case it should be arm_exynosm1_tune. This isn't defined in your
 patch,
 so it won't compile without that. You can write a custom tuning struct
 yourself, or reuse a tuning struct for one of the existing cores, if you'd
 like.

 Also, you should add exynosm1 to the switch statement in arm_issue_rate to
 specify the issue rate. I have a patch for next stage1 that should
 refactor
 it all into the tuning structs
 (https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02706.html) but until that
 goes in, you should fill in the switch statement there.

 Thanks,
 Kyrill



0001-AArch64-Add-option-for-the-Samsung-Exynos-M1-core-fo.patch
Description: Binary data


0001-ARM-Add-option-for-the-Samsung-Exynos-M1-core.patch
Description: Binary data


Re: [CHKP] Never expand instrumentation thunks

2015-04-02 Thread Jeff Law

On 04/02/2015 02:11 PM, Jan Hubicka wrote:

2015-03-18  Ilya Enkovich  ilya.enkov...@intel.com

 * cgraphunit.c (cgraph_node::expand_thunk): Don't expand
 instrumentation thunks.


diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index e640907..abc9cfe 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool 
force_gimple_thunk)
tree thunk_fndecl = decl;
tree a;

+  /* Instrumentation thunk is the same function with
+ a different signature.  Never need to expand it.  */
+  if (thunk.add_pointer_bounds_args)
+return false;


Yeah, this is another case where we hit problem with transparent alias 
pretending
to be thunk :)
I guess the patch is OK for GCC-5 and for next stage1 we can clean this up.
I was actually building a compiler so I could take a look at this one 
under a debugger ;-)


jeff



Re: [CHKP] Support returned bounds in thunks expand

2015-04-02 Thread Jan Hubicka
 Ping
 
 2015-03-10 13:12 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
  Hi,
 
  Currentl we loose returned bounds when functions are merged.  This patch 
  fixes it by adding returne bounds support for cgraph_node::expand_thunk.  
  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?
  2015-03-06  Ilya Enkovich  ilya.enkov...@intel.com
 
  * cgraphunit.c (cgraph_node::expand_thunk): Build returned
  bounds for instrumented functions.

I think if the extra bultin call is needed here, the same andling needs to be
added to ipa-split.  We really ought to unify that code - it is surprisingly
difficult to produce a wrapper call these days.
  + if (instrumentation_clone
  +  !DECL_BY_REFERENCE (resdecl)
  +  restmp
  +  BOUNDED_P (restmp))
  +   {
  + tree fn = targetm.builtin_chkp_function 
  (BUILT_IN_CHKP_BNDRET);
  + gcall *retbnd = gimple_build_call (fn, 1, restmp);
  +
  + resbnd = create_tmp_reg (pointer_bounds_type_node, retbnd);
  + gimple_call_set_lhs (retbnd, resbnd);
  +
  + gsi_insert_after (bsi, retbnd, GSI_NEW_STMT);
  + create_edge (get_create (fn), retbnd, callees-count, 
  callees-frequency);
  +   }

I am not sure why we need to check here:  Originaly we have two bounded 
functions, A and B.
We turn function B to a wrapper of function A. If callers of bounded functions 
need
to call a builtin, I would expect all callers of B to do it already, so I would 
expect
that wrapper is safe here?

Or do we mix instrumented and non-instrumented versions somehow?

Addding code handling return value is going to turn the call to non-tailcall, 
so you probably
want to drop the tailcall flag, too. 

Honza


Re: [CHKP] Never expand instrumentation thunks

2015-04-02 Thread Jan Hubicka
  2015-03-18  Ilya Enkovich  ilya.enkov...@intel.com
 
  * cgraphunit.c (cgraph_node::expand_thunk): Don't expand
  instrumentation thunks.
 
 
  diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
  index e640907..abc9cfe 100644
  --- a/gcc/cgraphunit.c
  +++ b/gcc/cgraphunit.c
  @@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, 
  bool force_gimple_thunk)
 tree thunk_fndecl = decl;
 tree a;
 
  +  /* Instrumentation thunk is the same function with
  + a different signature.  Never need to expand it.  */
  +  if (thunk.add_pointer_bounds_args)
  +return false;

Yeah, this is another case where we hit problem with transparent alias 
pretending
to be thunk :)
I guess the patch is OK for GCC-5 and for next stage1 we can clean this up.

Honza
 
 if (!force_gimple_thunk  this_adjusting
  targetm.asm_out.can_output_mi_thunk (thunk_fndecl, fixed_offset,


Re: [PATCH, CHKP] Fix cdtor merge for instrumented functions

2015-04-02 Thread Jan Hubicka
 Hi,
 
 This patch doesn't allow instrumentation thunks calls while merging 
 constructors and destructors.  Not isntrumented code is not affeceted.  
 Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?
 
 Thanks,
 Ilya
 --
 gcc/
 
 2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com
 
   * ipa.c (ipa_cdtor_merge): Skip instrumentation thunks.

So the problem here is that you do have two names for the function, one that
is not instrumented and other that is instrumented?  I am bit surprised we get
instrumentation on ctors that should not take or return pointer parameter,
but I see one can trigger that at least by manually adding constructor 
attribute.

I think what you need is to drop DECL_STATIC_CONSTRUCTOR/DESTRUCTURO flags when
producing the transparent alias.

Honza


Re: [CHKP] Never expand instrumentation thunks

2015-04-02 Thread Jan Hubicka
 On 04/02/2015 02:11 PM, Jan Hubicka wrote:
 2015-03-18  Ilya Enkovich  ilya.enkov...@intel.com
 
  * cgraphunit.c (cgraph_node::expand_thunk): Don't expand
  instrumentation thunks.
 
 
 diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
 index e640907..abc9cfe 100644
 --- a/gcc/cgraphunit.c
 +++ b/gcc/cgraphunit.c
 @@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, 
 bool force_gimple_thunk)
 tree thunk_fndecl = decl;
 tree a;
 
 +  /* Instrumentation thunk is the same function with
 + a different signature.  Never need to expand it.  */
 +  if (thunk.add_pointer_bounds_args)
 +return false;
 
 Yeah, this is another case where we hit problem with transparent alias 
 pretending
 to be thunk :)
 I guess the patch is OK for GCC-5 and for next stage1 we can clean this up.
 I was actually building a compiler so I could take a look at this
 one under a debugger ;-)

I think it is really the transparent alias issue.  The comment seems pretty 
clear about it.
What is confusing is that instrumentation thunks are called thunks when they are
really not - thunk is a small hunk of code, while instrumentation thunk is a 
transparent
alias.

Too bad I did not notice we introduced transparent aliases, i would push out my
code for that. I will compare Ilya's changes with mine and hopefully we can
catch more bugs and unify the code next stage1.

Honza


Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-04-02 Thread Jeff Law

On 03/30/2015 10:37 PM, Mikhail Maltsev wrote:

Hi!

I'm currently working on the proposed task of replacing rtx objects
(i.e. struct rtx_def) with derived classes. I would like to get some
feedback on this work (it's far from being finished, but basically I
would like to know, whether my modifications are appropriate, e.g. one
may consider that this is too much for just refactoring, because
sometimes they involve small modification of semantics).

The attached patch is not well tested, i.e. I bootstrapped and regtested
it only on x86_64, but I'll perform more extensive testing before
submitting the next version.

The key points I would like to ask about:

1. The original task was to replace the rtx type by rtx_insn *, where it
is appropriate. But rtx_insn has several derived classes, such as
rtx_code_label, for example. So I tried to use the most derived type
when possible. Is it OK?
Definitely.  In general the idea here is to exploit the static type 
checking done in the compiler to avoid runtime checking and failures.





2. Not all of these type promotions can be done by just looking at
function callers and callees (and some functions will only be generated
during the build of some rare architecture) and checks already done in
them. In a couple of cases I referred to comments and my general
understanding of code semantics. In one case this actually caused a
regression (in the patch it is fixed, of course), because of somewhat
misleading comment (see live_label_rtx function added in patch for
details) The question is - are such changes OK for refactoring (or it
should strictly preserve semantics)?
They're OK, but it may be easier to run things through the review 
process if refactoring is kept separate from strengthening the type 
checking.





3. In lra-constraints.c I added a new class rtx_usage_list, which, IMHO,
allows to group the functions which work with usage list in a more
explicit manner and make some conditions more self-explaining. I hope
that Vladimir Makarov (in this case, because it concerns LRA) and other
authors will not object against such intrusion into their code (or
would rather tell me what should be fixed in my patch(es), instead of
just refusing to apply it/them). In general, are such changes OK or
should better be avoided?

A couple of questions related to further work:
I don't see anything inherently wrong with this concept.  Though again, 
I'd suggest separating out these changes from type safety work.




1. I noticed that emit_insn function, in fact, does two kinds of things:
it can either add it's argument to the chain, or, if the argument is a
pattern, it creates a new instruction based on that pattern. Shouldn't
this logic be separated in the callers?
That would be wise.  There's probably several of these kinds of things 
lurking around.




2. Are there any plans on implementing a better class hierarchy on AST's
(union tree_node type). I see that C++ FE uses a huge number of macros
(which check TREE_CODE and some boolean flags). Could this be improved
somehow?
It's in progress and I'm hoping Andrew will be in a position to post 
this work soon.


jeff


Re: [PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers

2015-04-02 Thread Uros Bizjak
On Thu, Apr 2, 2015 at 10:17 PM, Jan Hubicka hubi...@ucw.cz wrote:

 rtx b0 = gen_rtx_REG (BND64mode, FIRST_BND_REG);
 rtx b1 = gen_rtx_REG (BND64mode, FIRST_BND_REG + 1);
 -   retval = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, retval, b0, b1));
 -   chkp_put_regs_to_expr_list (retval);
 +   if (GET_CODE (retval) == PARALLEL)
 + {
 +   b0 = gen_rtx_EXPR_LIST (VOIDmode, b0, const0_rtx);
 +   b1 = gen_rtx_EXPR_LIST (VOIDmode, b1, const0_rtx);
 +   rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, b0, b1));
 +   retval = chkp_join_splitted_slot (retval, par);

 I do not understand this code. Isn't it just droping the original return value
 constructing prallelcontaining only the BND_REGs?

Please see chkp_join_splitted_slot that merges two parallels.

Uros.


Re: [PATCH, CHKP] Fix static const bounds creation in LTO

2015-04-02 Thread Jan Hubicka
 Hi,
 
 Current in LTO static const bounds are created as external symbol.  It 
 doesn't work in case original symbols were removed and privatized.  This 
 patch introduces a separate comdat symbol to be used in LTO.   Bootstrapped 
 and tested on x86_64-unknown-linux-gnu.  Does this approach look OK?

As I understand it, you want to create a static variables, like 
__chkp_zero_bounds
which should be shared across translation units.  Currently the function does:
  static tree
  chkp_make_static_const_bounds (HOST_WIDE_INT lb,
 HOST_WIDE_INT ub,
 const char *name)
  {
tree var;

/* With LTO we may have constant bounds already in varpool.
   Try to find it.  */
var = chkp_find_const_bounds_var (lb, ub);

if (var)
  return var;

var  = build_decl (UNKNOWN_LOCATION, VAR_DECL,
   get_identifier (name), pointer_bounds_type_node);

TREE_PUBLIC (var) = 1;
TREE_USED (var) = 1;
TREE_READONLY (var) = 1;
TREE_STATIC (var) = 1;
TREE_ADDRESSABLE (var) = 0;
DECL_ARTIFICIAL (var) = 1;
DECL_READ_P (var) = 1;
/* We may use this symbol during ctors generation in chkp_finish_file
   when all symbols are emitted.  Force output to avoid undefined
   symbols in ctors.  */
if (!in_lto_p)
  {
DECL_INITIAL (var) = targetm.chkp_make_bounds_constant (lb, ub);
DECL_COMDAT (var) = 1;
varpool_node::get_create (var)-set_comdat_group (DECL_ASSEMBLER_NAME 
(var));
varpool_node::get_create (var)-force_output = 1;
  }
else
  DECL_EXTERNAL (var) = 1;
varpool_node::finalize_decl (var);
  }

You should not set comdat group by hand, or we get into troubles on non-ELF
targets.  Just call make_decl_one_only (var, DECL_ASSEMBLER_NAME (var));

Now in LTO I guess you want to  check if the symbol is provided by the source
compilation unit, i.e. call symtab_node::get_for_asmname (DECL_ASSEMBLER_NAME
(var)) and if it is there, check it have proper type and fatal_error otherwise
and if it does, discar var and use existing variable

This avoid a duplicate declaration of a symbol that is invalid in symtab.
(once we solve the transparent alias thing, I will add verification for that)
Because you already set force_output, the symbol should not be privatized and
renamed in any way.

If there are cases the symbol can get legally optimized out, I guess you
can also avoid setting force_output and we can stream references to the
decls into optimization summary in a case we want ot bind for it in WPA,
but that can wait for next stage1.

Honza
 
 Thanks,
 Ilya
 --
 gcc/
 
 2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com
 
   * tree-chkp.c (CHKP_LTO_SYMBOL_SUFFIX): New.
   (chkp_make_static_const_bounds): Use another
   symbol in LTO.
 
 gcc/testsuite/
 
 2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com
 
   * gcc.dg/lto/chkp-static-bounds_0.c: New.
 
 
 diff --git a/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c 
 b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c
 new file mode 100644
 index 000..e896eb1
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c
 @@ -0,0 +1,26 @@
 +/* { dg-lto-do link } */
 +/* { dg-require-effective-target mpx } */
 +/* { dg-lto-options { { -flto -flto-partition=max -fcheck-pointer-bounds 
 -mmpx } } } */
 +
 +char *cc;
 +
 +int test1 (const char *c)
 +{
 +  c = __builtin___bnd_init_ptr_bounds (c);
 +  cc = c;
 +  return c[0] * 2;
 +}
 +
 +struct S
 +{
 +  int (*fnptr) (const char *);
 +} S;
 +
 +struct S s1 = {test1};
 +struct S s2 = {test1};
 +struct S s3 = {test1};
 +
 +int main (int argc, const char **argv)
 +{
 +  return s1.fnptr (argv[0]) + s2.fnptr (argv[1]);
 +}
 diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
 index d2df4ba..0578936 100644
 --- a/gcc/tree-chkp.c
 +++ b/gcc/tree-chkp.c
 @@ -421,6 +421,7 @@ static bool in_chkp_pass;
  #define CHKP_VAR_BOUNDS_PREFIX __chkp_var_bounds_
  #define CHKP_ZERO_BOUNDS_VAR_NAME __chkp_zero_bounds
  #define CHKP_NONE_BOUNDS_VAR_NAME __chkp_none_bounds
 +#define CHKP_LTO_SYMBOL_SUFFIX .lto
  
  /* Static checker constructors may become very large and their
 compilation with optimization may take too much time.
 @@ -1906,7 +1907,8 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb,
  HOST_WIDE_INT ub,
  const char *name)
  {
 -  tree var;
 +  tree var, id;
 +  varpool_node *node;
  
/* With LTO we may have constant bounds already in varpool.
   Try to find it.  */
 @@ -1915,8 +1917,22 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb,
if (var)
  return var;
  
 -  var  = build_decl (UNKNOWN_LOCATION, VAR_DECL,
 -  get_identifier (name), pointer_bounds_type_node);
 +  /* In LTO we may have symbol with changed visibility, comdat
 + group etc. Therefore we shouldn't recreate the same symbol.
 + Use LTO version instead.  */
 +  if (in_lto_p)
 +{
 

Re: [PATCH] [ARM] Add support for the Samsung Exynos M1 processor

2015-04-02 Thread James Greenhalgh
On Thu, Apr 02, 2015 at 11:19:14PM +0100, Sebastian Pop wrote:
 Hi,
 
 from what I understand, Evandro has addressed the comments from Kyrill.
 Are there other problems to be addressed before the patches can go in?

Trunk is currently in Stage 4 development, these patches are fairly
low-risk, but they are certainly not regression fixes. I'll defer
to port maintainers and release managers for the final say, but in my
opinion it would not be appropriate to commit them until Stage 1
development for GCC 6.0 opens (hopefully in a few weeks).

For the AArch64 patch, I was expecting to see a respin after Junmo's
comment on Wednesday [1].

In particular:

+AARCH64_CORE(exynos-m1,   exynosm1,  exynosm1,  8,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1)

As there has not been a scheduling model contributed for the exynos-m1,
this will use *no* scheduling model on AArch64. This is unlikely to give
good performance.

+static const struct tune_params exynosm1_tunings =
+{
+  cortexa57_extra_costs,
+  cortexa57_addrcost_table,
+  cortexa57_regmove_cost,
+  cortexa57_vector_cost,
+  4, /* memmov_cost  */
+  3, /* issue_rate  */
+  (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
+   | AARCH64_FUSE_MOVK_MOVK), /* fuseable_ops  */
+  16,  /* function_align.  */
+  8,   /* jump_align.  */
+  4,   /* loop_align.  */
+  2,   /* int_reassoc_width.  */
+  4,   /* fp_reassoc_width.  */
+  1/* vec_reassoc_width.  */
+};
+

As these are identical to the Cortex-A57 tuning, is there any reason to
add them? I'd prefer if we took a copy-on-modify policy for these
tuning structs, only adding them where there is a difference.

Thanks,
James

---
[1]: https://gcc.gnu.org/ml/gcc-patches/2015-04/msg1.html



Re: [RS6000] Fix 65576 regression

2015-04-02 Thread Michael Meissner
On Thu, Apr 02, 2015 at 03:52:35PM +1030, Alan Modra wrote:
 This cures an unrecognizable insn ICE by modifying a predicate of
 extenddftf2_internal (the only place this predicate is used) to ensure
 that rtl optimization passes do not substitute 0.0 for a register with
 known 0.0 value, except when VSX is enabled.  ie. Don't undo the
 necessary register move emitted by the extenddftf2_fprs expander.
 Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux
 and powerpc-linux.  OK to apply? 
 
   PR target/65576
   PR target/65240
   * config/rs6000/predicates.md (zero_reg_mem_operand): Exclude
   0.0 constant unless TARGET_VSX.
 
 Index: gcc/config/rs6000/predicates.md
 ===
 --- gcc/config/rs6000/predicates.md   (revision 221805)
 +++ gcc/config/rs6000/predicates.md   (working copy)
 @@ -964,7 +964,8 @@
 
  ;; Return 1 if the operand is CONST_DOUBLE 0, register or memory operand.
  (define_predicate zero_reg_mem_operand
 -  (ior (match_operand 0 zero_fp_constant)
 +  (ior (and (match_test TARGET_VSX)
 + (match_operand 0 zero_fp_constant))
 (match_operand 0 reg_or_mem_operand)))
 
  ;; Return 1 if the operand is a CONST_INT and it is the element for 64-bit
 

I definately prefer Alan's patch over mine.

In looking at extenddftf2_internal, I believe you are correct David, in that
final alternative will never match because 0.0 will not be valid (pre-VSX 0.0
won't be allowed as operand2, VSX 0.0 will not match n constraint).  For the
VSX case, the n should be either j or E.

If you are doing a double-long double conversion, and it happens to pick GPRs
for some reason, the GPR side will do a load/move of 0.0 instead of using li to
load 0 on pre-VSX systems.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [RS6000] Fix 65576 regression

2015-04-02 Thread Alan Modra
On Thu, Apr 02, 2015 at 01:03:19PM -0400, David Edelsohn wrote:
 On Thu, Apr 2, 2015 at 1:22 AM, Alan Modra amo...@gmail.com wrote:
  This cures an unrecognizable insn ICE by modifying a predicate of
  extenddftf2_internal (the only place this predicate is used) to ensure
  that rtl optimization passes do not substitute 0.0 for a register with
  known 0.0 value, except when VSX is enabled.  ie. Don't undo the
  necessary register move emitted by the extenddftf2_fprs expander.
  Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux
  and powerpc-linux.  OK to apply?
 
  PR target/65576
  PR target/65240
  * config/rs6000/predicates.md (zero_reg_mem_operand): Exclude
  0.0 constant unless TARGET_VSX.
 
 This change seems to disable the final alternative in the
 extenddftf2_internal pattern.

It looks to me like it was already disabled, by r96390.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [Ping, Patch, fortran, pr65548, v1] [5 Regression] gfc_conv_procedure_call

2015-04-02 Thread Jerry DeLisle

On 04/02/2015 03:28 AM, Andre Vehreschild wrote:

Ping!

This should be in 5.1. Dominique and I feel like this patch is nearly obvious.

Regards,
Andre

On Wed, 25 Mar 2015 14:35:54 +0100
Andre Vehreschild ve...@gmx.de wrote:


Hi all,

please find attached a fix for the recently introduced regression when
allocating arrays with an intrinsic function for source=. The patch addresses
this issue by using gfc_conv_expr_descriptor () for intrinsic functions.

Bootstraps and regtests ok on x86_64-linux-gnu/F20.

Ok for trunk?


Yes, ok for trunk.

Thanks,

Jerry



Re: [libstdc++/65033] Give alignment info to libatomic

2015-04-02 Thread Hans-Peter Nilsson
On Wed, 25 Mar 2015, Jonathan Wakely wrote:
 I've convinced myself that Richard's patch is correct in all cases,
 but I think we also want this patch, to fix PR62259 and PR65147.

 For the generic std::atomicT (i.e. not the integral or pointer
 specializations) we should increase the alignment of atomic types that
 have the same size as one of the standard integral types. This should
 be consistent with what the C front end does for _Atomic, based on
 what Joseph told me on IRC:

Wrong.

 jsm28 jwakely: _Atomic aligns 1/2/4/8/16-byte types the same as
integer types of that size.

No it doesn't!  It's same or higher as.

 jsm28 (Which may not be alignment = size, depending on the
architecture.)

 Ideally we'd use an attribute like Andrew describes at
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62259#c4 but that's not
 going to happen for GCC 5, so this just looks for an integral type of
 the same size and uses its alignment.

 Tested x86_64-linux, powerpc64le-linux.

 I'll wait for RM approval for this and Richard's patch (which is OK
 from a libstdc++ perspective).


brgds, H-P


Re: [RS6000] Fix 65576 regression

2015-04-02 Thread David Edelsohn
On Thu, Apr 2, 2015 at 7:41 PM, Michael Meissner
meiss...@linux.vnet.ibm.com wrote:
 On Thu, Apr 02, 2015 at 03:52:35PM +1030, Alan Modra wrote:
 This cures an unrecognizable insn ICE by modifying a predicate of
 extenddftf2_internal (the only place this predicate is used) to ensure
 that rtl optimization passes do not substitute 0.0 for a register with
 known 0.0 value, except when VSX is enabled.  ie. Don't undo the
 necessary register move emitted by the extenddftf2_fprs expander.
 Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux
 and powerpc-linux.  OK to apply?

   PR target/65576
   PR target/65240
   * config/rs6000/predicates.md (zero_reg_mem_operand): Exclude
   0.0 constant unless TARGET_VSX.

 Index: gcc/config/rs6000/predicates.md
 ===
 --- gcc/config/rs6000/predicates.md   (revision 221805)
 +++ gcc/config/rs6000/predicates.md   (working copy)
 @@ -964,7 +964,8 @@

  ;; Return 1 if the operand is CONST_DOUBLE 0, register or memory operand.
  (define_predicate zero_reg_mem_operand
 -  (ior (match_operand 0 zero_fp_constant)
 +  (ior (and (match_test TARGET_VSX)
 + (match_operand 0 zero_fp_constant))
 (match_operand 0 reg_or_mem_operand)))

  ;; Return 1 if the operand is a CONST_INT and it is the element for 64-bit


 I definately prefer Alan's patch over mine.

 In looking at extenddftf2_internal, I believe you are correct David, in that
 final alternative will never match because 0.0 will not be valid (pre-VSX 0.0
 won't be allowed as operand2, VSX 0.0 will not match n constraint).  For the
 VSX case, the n should be either j or E.

 If you are doing a double-long double conversion, and it happens to pick GPRs
 for some reason, the GPR side will do a load/move of 0.0 instead of using li 
 to
 load 0 on pre-VSX systems.

If the final alternative cannot occur, it should be removed as well.
Alan, would you please test that change also?

Thanks, David


[PATCH] Add myself as reviewer for arm port

2015-04-02 Thread Kyrill Tkachov

Hi all,

I've committed this as per https://gcc.gnu.org/ml/gcc/2015-04/msg00012.html 
with r221833.

Thanks,
Kyrill

2015-04-02  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* MAINTAINERS: Add myself as an arm port reviewer.


Index: MAINTAINERS
===
--- MAINTAINERS	(revision 221832)
+++ MAINTAINERS	(working copy)
@@ -268,6 +268,7 @@
 
 			Reviewers
 
+arm port		Kyrylo Tkachov		kyrylo.tkac...@arm.com
 C front end		Marek Polacek		pola...@redhat.com
 dataflow		Paolo Bonzini		bonz...@gnu.org
 dataflow		Seongbae Park		seongbae.p...@gmail.com




Re: [Ping, Patch, fortran, pr65548, v1] [5 Regression] gfc_conv_procedure_call

2015-04-02 Thread Andre Vehreschild
Ping! 

This should be in 5.1. Dominique and I feel like this patch is nearly obvious.

Regards,
Andre

On Wed, 25 Mar 2015 14:35:54 +0100
Andre Vehreschild ve...@gmx.de wrote:

 Hi all,
 
 please find attached a fix for the recently introduced regression when
 allocating arrays with an intrinsic function for source=. The patch addresses
 this issue by using gfc_conv_expr_descriptor () for intrinsic functions.
 
 Bootstraps and regtests ok on x86_64-linux-gnu/F20.
 
 Ok for trunk?
 
 Regards,
   Andre


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: Announcing AArch64 and ARM port reviewers

2015-04-02 Thread James Greenhalgh
On Thu, Apr 02, 2015 at 06:39:57AM +0100, Jeff Law wrote:
 I'm pleased to announce that James Greenhalgh has been appointed as a 
 reviewer for the AArch64 port and that Kyrylo Tkachov has been appointed 
 as a reviewer for the ARM port.
 
 James  Kyrylo, if you could add yourself to the MAINTAINERS file for 
 the additional roles, it would be appreciated.

Thanks!

Done as r221834, with the attached patch.

Cheers,
James

Index: ChangeLog
===
--- ChangeLog	(revision 221833)
+++ ChangeLog	(revision 221834)
@@ -1,3 +1,7 @@
+2015-04-02  James Greenhalgh  james.greenha...@arm.com
+
+	* MAINTAINERS: Add myself as a reviewer for the AArch64 port.
+
 2015-04-02  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
 	* MAINTAINERS: Add myself as an arm port reviewer.
Index: MAINTAINERS
===
--- MAINTAINERS	(revision 221833)
+++ MAINTAINERS	(revision 221834)
@@ -268,6 +268,7 @@
 
 			Reviewers
 
+aarch64 port		James Greenhalgh	james.greenha...@arm.com
 arm port		Kyrylo Tkachov		kyrylo.tkac...@arm.com
 C front end		Marek Polacek		pola...@redhat.com
 dataflow		Paolo Bonzini		bonz...@gnu.org


[committed] Fix wrong generated by pa_output_move_double

2015-04-02 Thread John David Anglin
The attached change fixes an old bug.  In debugging the segmentation fault of 
cc76narrowing1 in the
ppl-1.1 test suite, I found that we incorrectly handle register indexed memory 
operands in double word
loads on 32-bit targets.  In particular, when there is an overlap between the 
source registers and the
destination registers, we get wrong assembly code.  For example, with

(insn:TI 220 543 544 (set (reg:DF 28 %r28 [orig:133 D.246652 ] [133])
(mem:DF (plus:SI (reg/f:SI 29 %r29 [orig:167 _49-D.173044.impl ] [167])
(reg:SI 28 %r28 [163])) [142 MEM[(const double )_53]+0 S8 A64])
) 72 {*pa.md:3858}

we generated the following assembly code:

ldo 4(%r29),%r29
ldw %r29(%r28),%r29
ldo -4(%r29),%r29
ldw %r29(%r28),%r28

The first ldw instruction clobbers %r29.

Tested on hppa-unknown-linux-gnu and hppa2.0w-hp-hpux11.11.  Committed to 
trunk, and 4.9 and 4.8 branches.

Dave
--
John David Anglin   dave.ang...@bell.net


2015-04-02  John David Anglin  dang...@gcc.gnu.org

* config/pa/pa.c (pa_output_move_double): Directly handle register
indexed memory operand.  Simplify handling of scaled register indexed
memory operands.

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 221781)
+++ config/pa/pa.c  (working copy)
@@ -2595,29 +2595,30 @@
GET_CODE (XEXP (addr, 0)) == MULT)
{
  rtx xoperands[4];
- rtx high_reg = gen_rtx_SUBREG (SImode, operands[0], 0);
 
- if (!reg_overlap_mentioned_p (high_reg, addr))
-   {
- xoperands[0] = high_reg;
- xoperands[1] = XEXP (addr, 1);
- xoperands[2] = XEXP (XEXP (addr, 0), 0);
- xoperands[3] = XEXP (XEXP (addr, 0), 1);
- output_asm_insn ({sh%O3addl %2,%1,%0|shladd,l %2,%O3,%1,%0},
-  xoperands);
- return ldw 4(%0),%R0\n\tldw 0(%0),%0;
-   }
- else
-   {
- xoperands[0] = high_reg;
- xoperands[1] = XEXP (addr, 1);
- xoperands[2] = XEXP (XEXP (addr, 0), 0);
- xoperands[3] = XEXP (XEXP (addr, 0), 1);
- output_asm_insn ({sh%O3addl %2,%1,%R0|shladd,l %2,%O3,%1,%R0},
-  xoperands);
- return ldw 0(%R0),%0\n\tldw 4(%R0),%R0;
-   }
+ /* Load address into left half of destination register.  */
+ xoperands[0] = gen_rtx_SUBREG (SImode, operands[0], 0);
+ xoperands[1] = XEXP (addr, 1);
+ xoperands[2] = XEXP (XEXP (addr, 0), 0);
+ xoperands[3] = XEXP (XEXP (addr, 0), 1);
+ output_asm_insn ({sh%O3addl %2,%1,%0|shladd,l %2,%O3,%1,%0},
+  xoperands);
+ return ldw 4(%0),%R0\n\tldw 0(%0),%0;
}
+  else if (GET_CODE (addr) == PLUS
+   REG_P (XEXP (addr, 0))
+   REG_P (XEXP (addr, 1)))
+   {
+ rtx xoperands[3];
+
+ /* Load address into left half of destination register.  */
+ xoperands[0] = gen_rtx_SUBREG (SImode, operands[0], 0);
+ xoperands[1] = XEXP (addr, 0);
+ xoperands[2] = XEXP (addr, 1);
+ output_asm_insn ({addl|add,l} %1,%2,%0,
+  xoperands);
+ return ldw 4(%0),%R0\n\tldw 0(%0),%0;
+   }
 }
 
   /* If an operand is an unoffsettable memory ref, find a register


[PATCH] Fix PR65660 in parts, better fix for PR64909

2015-04-02 Thread Richard Biener

When I fixed PR64909 I knew it wasn't a 100% precise fix.  This bites
us now in PR65660 (which also shows other issues).  Thus the following
precise fix for PR64909 (which didn't turn out as too involved) - compute
a cost vector for the scalar iteration cost and apply it for peeled
iterations.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

I plan to apply this to trunk and to the 4.9 branch where I backported
the original fix to as it fixes round-off errors which can lead to
very unprofitable vectorizations.

Richard.

2015-04-02  Richard Biener  rguent...@suse.de

PR tree-optimization/64909
PR tree-optimization/65660
* tree-vectorizer.h (vect_get_known_peeling_cost): Adjust
to take a cost vector for scalar iteration cost.
(vect_get_single_scalar_iteration_cost): Likewise.
* tree-vect-loop.c (vect_get_single_scalar_iteration_cost):
Compute the scalar iteration cost into a cost vector.
(vect_get_known_peeling_cost): Use the scalar cost vector to
account for the cost of the peeled iterations.
(vect_estimate_min_profitable_iters): Likewise.
* tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost):
Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   (revision 220580)
+++ gcc/tree-vectorizer.h   (working copy)
@@ -1101,10 +1101,12 @@ extern bool vectorizable_reduction (gimp
 extern bool vectorizable_induction (gimple, gimple_stmt_iterator *, gimple *);
 extern tree get_initial_def_for_reduction (gimple, tree, tree *);
 extern int vect_min_worthwhile_factor (enum tree_code);
-extern int vect_get_known_peeling_cost (loop_vec_info, int, int *, int,
+extern int vect_get_known_peeling_cost (loop_vec_info, int, int *,
+   stmt_vector_for_cost *,
stmt_vector_for_cost *,
stmt_vector_for_cost *);
-extern int vect_get_single_scalar_iteration_cost (loop_vec_info);
+extern int vect_get_single_scalar_iteration_cost (loop_vec_info,
+ stmt_vector_for_cost *);
 
 /* In tree-vect-slp.c.  */
 extern void vect_free_slp_instance (slp_instance);
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 220580)
+++ gcc/tree-vect-loop.c(working copy)
@@ -2653,12 +2653,13 @@ vect_force_simple_reduction (loop_vec_in
 
 /* Calculate the cost of one scalar iteration of the loop.  */
 int
-vect_get_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
+vect_get_single_scalar_iteration_cost (loop_vec_info loop_vinfo,
+  stmt_vector_for_cost *scalar_cost_vec)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
   int nbbs = loop-num_nodes, factor, scalar_single_iter_cost = 0;
-  int innerloop_iters, i, stmt_cost;
+  int innerloop_iters, i;
 
   /* Count statements in scalar loop.  Using this as scalar cost for a single
  iteration for now.
@@ -2699,17 +2700,20 @@ vect_get_single_scalar_iteration_cost (l
   !STMT_VINFO_IN_PATTERN_P (stmt_info))
 continue;
 
+ vect_cost_for_stmt kind;
   if (STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt)))
 {
   if (DR_IS_READ (STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt
-   stmt_cost = vect_get_stmt_cost (scalar_load);
+   kind = scalar_load;
  else
-   stmt_cost = vect_get_stmt_cost (scalar_store);
+   kind = scalar_store;
 }
   else
-stmt_cost = vect_get_stmt_cost (scalar_stmt);
+kind = scalar_stmt;
 
-  scalar_single_iter_cost += stmt_cost * factor;
+ scalar_single_iter_cost
+   += record_stmt_cost (scalar_cost_vec, factor, kind,
+NULL, 0, vect_prologue);
 }
 }
   return scalar_single_iter_cost;
@@ -2719,7 +2723,7 @@ vect_get_single_scalar_iteration_cost (l
 int
 vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue,
  int *peel_iters_epilogue,
- int scalar_single_iter_cost,
+ stmt_vector_for_cost *scalar_cost_vec,
 stmt_vector_for_cost *prologue_cost_vec,
 stmt_vector_for_cost *epilogue_cost_vec)
 {
@@ -2736,8 +2740,10 @@ vect_get_known_peeling_cost (loop_vec_in
 
   /* If peeled iterations are known but number of scalar loop
  iterations are unknown, count a taken branch per peeled loop.  */
-  retval = record_stmt_cost (prologue_cost_vec, 2, cond_branch_taken,
+  retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
 

[PATCH, i386]: Revert unwanted part of r216820 to fix movqi_internal type calculation

2015-04-02 Thread Uros Bizjak
Hello!

Attached patch reverts unwanted part of r216820 [1] movqi_internal
change to restore correct type calculation.

2015-04-02  Uros Bizjak  ubiz...@gmail.com

Revert parts of r216820.
* config/i386/i386.md (movqi_internal): Correct type calculation
for alternatives 3 and 5.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} and
committed to mainline SVN.

[1] https://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=216820

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 221826)
+++ config/i386/i386.md (working copy)
@@ -2533,9 +2533,7 @@
 }
   [(set_attr isa *,*,*,*,*,*,*,*,*,*,avx512dq,avx512dq)
(set (attr type)
- (cond [(eq_attr alternative 3,5)
- (const_string imovx)
-   (eq_attr alternative 7,8,9,10,11)
+ (cond [(eq_attr alternative 7,8,9,10,11)
  (const_string mskmov)
(and (eq_attr alternative 5)
 (not (match_operand:QI 1 aligned_operand)))
@@ -2546,6 +2544,8 @@
 (ior (not (match_test TARGET_PARTIAL_REG_STALL))
  (not (match_test TARGET_QIMODE_MATH
  (const_string imov)
+   (eq_attr alternative 3,5)
+ (const_string imovx)
(and (match_test TARGET_MOVX)
 (eq_attr alternative 2))
  (const_string imovx)


Re: [PATCH, libmpx, i386, PR driver/65444] Pass '-z bndplt' when building dynamic objects with MPX

2015-04-02 Thread H.J. Lu
On Wed, Apr 1, 2015 at 9:34 PM, Jeff Law l...@redhat.com wrote:
 On 03/31/2015 03:47 AM, Ilya Enkovich wrote:

 On 23 Mar 13:19, Ilya Enkovich wrote:

 Hi,

 May this patch go into trunk at this point?  It is very important for
 dynamic MPX codes.

 Thanks,
 Ilya


 I additionally documented changes in invoke.texi.  OK for trunk?

 Thanks,
 Ilya
 --
 gcc/

 2015-03-31  Ilya Enkovich  ilya.enkov...@intel.com

 PR driver/65444
 * config/i386/linux-common.h (MPX_SPEC): New.
 (CHKP_SPEC): Add MPX_SPEC.
 * doc/invoke.texi (-fcheck-pointer-boudns): Document
 possible issues with '-z bndplt' support in linker.

 libmpx/

 2015-03-31  Ilya Enkovich  ilya.enkov...@intel.com

 PR driver/65444
 * configure.ac: Add check for '-z bndplt' support
 by linker. Add link_mpx output variable.
 * libmpx.spec.in (link_mpx): New.
 * configure: Regenerate.

 Just to make sure I understand.  With this patch we conditionally pass -z
 bndplt based on whether or not it's supported by the linker we find when we
 configure GCC.

 Failure to pass -z bndplt won't cause the program to misbehave, but will
 limit the effectiveness of MPX.

 Gold doesn't support -z bndplt, just newer versions of the BFD linker.

 Gold issues an error for -z bndplt, old BFD linkers will issue a warning.

 There are, at least in theory, use cases where we might not have a PLT and
 thus -z bndplt wouldn't make sense anyway.

 I guess the one thing I don't like here is that whether or not to pass -z
 bndplt is made at the time we configure GCC.  Yet, it's trivial for someone
 to change the system linker later, either to gold or from an old BFD linker
 that doesn't support -z bndplt to one that does support -z bndplt.

 [ Note we have the same issue with certain assembler feature tests. ]

 I'm not aware of any real infrastructure in GCC to query the behavior of the
 linker at link time and then customize the options passed.  So if it's going
 to be configurable, then that's the only time to do the test.

 I strongly disagree with HJ's assertion that we should always pass the flag,
 regardless of the underlying linker.

 So, in an ideal world, we'd query the linker at link time and pass the flag
 anytime we have a linker that supports the capability and perhaps warn if
 the linker doesn't support that capability.

 Given we're not in that ideal world, I think Ilya's patch is reasonable and
 should be installed.

Without proper PLT for MPX, all external function calls will clear bound
registers.  MPX is a security feature. Cyber criminals only need to get
it right 1 time.  An organization who uses MPX for cyber security may
not realize they leave a door open due to an old linker.  What I want to
avoid is 2 years from now, bank of foobar comes out saying that they
thought they were protected by MPX, but somehow MPX didn't catch a
buffer overflow it was supposed to and there was no compiler error message
to warn programmers.


-- 
H.J.


Re: [libstdc++/65033] Give alignment info to libatomic

2015-04-02 Thread Hans-Peter Nilsson
On Thu, 12 Feb 2015, Richard Henderson wrote:
 When we fixed PR54005,

Hm, there's confusion.  When was this fixed?  (Not fixed
AFAICT.)  Maybe you mean PR54004, but there was no note there
either.  Or maybe there's a typo and you meant some other PR and
that PR54005 is supposedly fixed by this patch (committed as
r221701)

...but it doesn't seem right: you use a specific object when
deducing the alignment for the fake-pointer, so it's used anyway
and is_lock_free must not be object-specific (despite its name)
and only type-specific as mandated by the standard (see PR).

To wit, deduce from the known-alignment of _Tp, not
known-alignment of _M_i. Or is this what happens; they're the
same?  Why then use __alignof(_M_i) (the object-alignment)
instead of _S_alignment (the deduced alas insufficiently
increased type-alignment)?

 making sure that atomic_is_lock_free returns the same
 value for all objects of a given type,

(That would work but it doesn't seem to be the case.)

 we probably should have changed the
 interface so that we would pass size and alignment rather than size and object
 pointer.

 Instead, we decided that passing null for the object pointer would be
 sufficient.  But as this PR shows, we really do need to take alignment into
 account.

Regarding what's actually needed, alignment of an atomic type
should always be *forced to be at least the natural alignment of
of the object* (with non-power-of-two sized-objects rounded up)
and until then atomic types won't work for targets where the
non-atomic equivalents have less alignment (as straddling a
page-boundary won't be lock-less-atomic anywhere where
straddling a page-boundary may cause a non-atomic-access...) So,
not target-specific except for targets that require even
higher-than-natural alignment.

 The following patch constructs a fake object pointer that is maximally
 misaligned.

(With regards to the known object alignment of the _M_i object.)

  This allows the interface to both the builtin and to libatomic to
 remain unchanged.  Which probably makes this back-portable to maintenance
 releases as well.

 I believe that for all of our current systems, size_t == uintptr_t, so the
 reinterpret_cast ought not generate warnings.

 The test case is problematic, as there's currently no good place to put it.
 The libstdc++ testsuite doesn't have the libatomic library path configured, 
 and
 the libatomic testsuite doesn't have the libstdc++ include paths configured.
 Yet another example where we really need an install tree for testing.  
 Thoughts?

brgds, H-P


Re: [PATCH, CHKP] Restore transparent alias chains

2015-04-02 Thread Jeff Law

On 04/02/2015 12:48 PM, Jan Hubicka wrote:

On 03/20/2015 02:20 AM, Ilya Enkovich wrote:

Hi,

Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS bit.  
We always expect it for instrumentation clones, thus restore it input_cgraph_1. 
 Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
2015-03-20  Ilya Enkovich  ilya.enkov...@intel.com

* lto-cgraph.c (input_cgraph_1): Always link instrumented
assembler name with original one.

This appears to be a code path that only triggers when MPX is
enabled and is roughly analogous to the code in
chkp_build_instrumented_fndecl links things up.

OK for the trunk.


I think this will lead to wrong code. At this time, we may have multple
declarations sharing single assembler name (and thus
IDENTIFIER_TRANSPARENT_ALIAS link). Think of case where one unit defines static
function and other global function of the same name. We may end up renaming the
symbol but keeping bogus transparent alias link that will trigger on other
symbol.
Then I think the code in ipa-chkp chkp_build_instrumented_fndecl (and 
more generally how we're using transparent aliases) may need some 
rethinking.


jeff


Re: Silence merge warnings on artificial types

2015-04-02 Thread Jakub Jelinek
On Thu, Apr 02, 2015 at 09:23:03PM +0300, Ilya Verbin wrote:
 Hmm, libgomp.c++/target-3.C still fails.
 Here is what I see in need_assembler_name_p:

Guess we should make the .omp_data_s.* types TYPE_ARTIFICIAL too.
Will take care of that tomorrow.

Jakub


[PATCH, CHKP] Fix static const bounds creation in LTO

2015-04-02 Thread Ilya Enkovich
Hi,

Current in LTO static const bounds are created as external symbol.  It doesn't 
work in case original symbols were removed and privatized.  This patch 
introduces a separate comdat symbol to be used in LTO.   Bootstrapped and 
tested on x86_64-unknown-linux-gnu.  Does this approach look OK?

Thanks,
Ilya
--
gcc/

2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

* tree-chkp.c (CHKP_LTO_SYMBOL_SUFFIX): New.
(chkp_make_static_const_bounds): Use another
symbol in LTO.

gcc/testsuite/

2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.dg/lto/chkp-static-bounds_0.c: New.


diff --git a/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c 
b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c
new file mode 100644
index 000..e896eb1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c
@@ -0,0 +1,26 @@
+/* { dg-lto-do link } */
+/* { dg-require-effective-target mpx } */
+/* { dg-lto-options { { -flto -flto-partition=max -fcheck-pointer-bounds -mmpx 
} } } */
+
+char *cc;
+
+int test1 (const char *c)
+{
+  c = __builtin___bnd_init_ptr_bounds (c);
+  cc = c;
+  return c[0] * 2;
+}
+
+struct S
+{
+  int (*fnptr) (const char *);
+} S;
+
+struct S s1 = {test1};
+struct S s2 = {test1};
+struct S s3 = {test1};
+
+int main (int argc, const char **argv)
+{
+  return s1.fnptr (argv[0]) + s2.fnptr (argv[1]);
+}
diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index d2df4ba..0578936 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -421,6 +421,7 @@ static bool in_chkp_pass;
 #define CHKP_VAR_BOUNDS_PREFIX __chkp_var_bounds_
 #define CHKP_ZERO_BOUNDS_VAR_NAME __chkp_zero_bounds
 #define CHKP_NONE_BOUNDS_VAR_NAME __chkp_none_bounds
+#define CHKP_LTO_SYMBOL_SUFFIX .lto
 
 /* Static checker constructors may become very large and their
compilation with optimization may take too much time.
@@ -1906,7 +1907,8 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb,
   HOST_WIDE_INT ub,
   const char *name)
 {
-  tree var;
+  tree var, id;
+  varpool_node *node;
 
   /* With LTO we may have constant bounds already in varpool.
  Try to find it.  */
@@ -1915,8 +1917,22 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb,
   if (var)
 return var;
 
-  var  = build_decl (UNKNOWN_LOCATION, VAR_DECL,
-get_identifier (name), pointer_bounds_type_node);
+  /* In LTO we may have symbol with changed visibility, comdat
+ group etc. Therefore we shouldn't recreate the same symbol.
+ Use LTO version instead.  */
+  if (in_lto_p)
+{
+  int len = strlen (name) + strlen (CHKP_LTO_SYMBOL_SUFFIX) + 1;
+  char *new_name = XALLOCAVEC (char, len);
+  strcpy (new_name, name);
+  strcat (new_name, CHKP_LTO_SYMBOL_SUFFIX);
+  id = get_identifier (new_name);
+}
+  else
+id = get_identifier (name);
+
+  var  = build_decl (UNKNOWN_LOCATION, VAR_DECL, id,
+pointer_bounds_type_node);
 
   TREE_PUBLIC (var) = 1;
   TREE_USED (var) = 1;
@@ -1925,18 +1941,17 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb,
   TREE_ADDRESSABLE (var) = 0;
   DECL_ARTIFICIAL (var) = 1;
   DECL_READ_P (var) = 1;
+  DECL_INITIAL (var) = targetm.chkp_make_bounds_constant (lb, ub);
+  DECL_COMDAT (var) = 1;
+  DECL_WEAK (var) = 1;
   /* We may use this symbol during ctors generation in chkp_finish_file
  when all symbols are emitted.  Force output to avoid undefined
  symbols in ctors.  */
-  if (!in_lto_p)
-{
-  DECL_INITIAL (var) = targetm.chkp_make_bounds_constant (lb, ub);
-  DECL_COMDAT (var) = 1;
-  varpool_node::get_create (var)-set_comdat_group (DECL_ASSEMBLER_NAME 
(var));
-  varpool_node::get_create (var)-force_output = 1;
-}
-  else
-DECL_EXTERNAL (var) = 1;
+  node = varpool_node::get_create (var);
+  node-set_comdat_group (DECL_ASSEMBLER_NAME (var));
+  node-force_output = 1;
+  node-externally_visible = 1;
+
   varpool_node::finalize_decl (var);
 
   return var;


Re: [C++ Patch/RFC] PR 64085

2015-04-02 Thread Paolo Carlini

Hi,

I guess I can as well ping this - still in my tree, no issues so far...

On 03/04/2015 07:29 PM, Paolo Carlini wrote:

Hi,

today while working a bit on Bugzilla I found this issue which should 
be easy to fix, not a regression, therefore possibly post 5.0. At 
parsing time, cp_parser_lambda_introducer ends up calling 
real_lvalue_p on the lambda initializer (via add_capture), but then 
the initializer, which is a CALL_EXPR, crashes immediately lvalue_kind 
because its TREE_TYPE is not set (we are dealing with a template 
function). I guess that, in this case too, we want to check first 
dependent_type_p. Tested x86_64-linux.


https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00241.html

Thanks,
Paolo.


[PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers

2015-04-02 Thread Ilya Enkovich
Hi,

This patch fixes nested PARALLEL in retval for isntrumented calls. Current 
possible call:

(call_insn:TI 6 30 17 2 (set (parallel [
(expr_list:REG_DEP_TRUE (parallel:TI [
(expr_list:REG_DEP_TRUE (reg:DF 21 xmm0)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:DF 22 xmm1)
(const_int 8 [0x8]))
])
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
(const_int 0 [0]))
])
(call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41]  function_decl 
0x7f6f50e02ca8 test1.chkp) [0 test1.chkp S1 A8])
(const_int 0 [0]))) complex.c:11 670 {*call_value}

Such construction causes DF analysis problems.  This patch tranforms it to:

(call_insn:TI 6 30 17 2 (set (parallel:TI [
(expr_list:REG_DEP_TRUE (reg:DF 21 xmm0)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:DF 22 xmm1)
(const_int 8 [0x8]))
(expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
(const_int 0 [0]))
(expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
(const_int 0 [0]))
])
(call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41]  function_decl 
0x7fb609bf7ca8 test1.chkp) [0 test1.chkp S1 A8])
(const_int 0 [0]))) complex.c:11 670 {*call_value}


Only MPX target is affected.  Bootstrapped and tested on 
x86_64-unknown-linux-gnu.  Does it look OK?

Thanks,
Ilya
--
2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

* config/i386/i386.c (ix86_expand_call): Avoid nested
PARALLEL in returned call value.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 744642c..1d821cd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -25624,8 +25624,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
{
  rtx b0 = gen_rtx_REG (BND64mode, FIRST_BND_REG);
  rtx b1 = gen_rtx_REG (BND64mode, FIRST_BND_REG + 1);
- retval = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, retval, b0, b1));
- chkp_put_regs_to_expr_list (retval);
+ if (GET_CODE (retval) == PARALLEL)
+   {
+ b0 = gen_rtx_EXPR_LIST (VOIDmode, b0, const0_rtx);
+ b1 = gen_rtx_EXPR_LIST (VOIDmode, b1, const0_rtx);
+ rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, b0, b1));
+ retval = chkp_join_splitted_slot (retval, par);
+   }
+ else
+   {
+ retval = gen_rtx_PARALLEL (VOIDmode,
+gen_rtvec (3, retval, b0, b1));
+ chkp_put_regs_to_expr_list (retval);
+   }
}
 
   call = gen_rtx_SET (VOIDmode, retval, call);


[PATCH, CHKP] Fix ipa-comdats for instrumentation thunks

2015-04-02 Thread Ilya Enkovich
Hi,

With r221574 (https://gcc.gnu.org/ml/gcc-cvs/2015-03/msg00495.html) thunks 
don't get comdat groups assigned and this causes a failure in cgraph checker 
for instrumentation thunks.  It happens because instrumentation thunk may 
reference local symbol in comdat not being in comdat by itself.  This patch 
fixes the problem.  Doesn't affect not instrumented code.  Testing is in 
progress.  Does it look OK?

Thanks,
Ilya
--
gcc/

2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

* ipa-comdats.c (ipa_comdats): Visit all instrumentation
thunks to set proper comdat group.

gcc/testsuite/

2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.target/i386/mpx/chkp-thunk-comdat-1.cc: New.
* gcc.target/i386/mpx/chkp-thunk-comdat-2.cc: New.


diff --git a/gcc/ipa-comdats.c b/gcc/ipa-comdats.c
index f349f9f..30bcad8 100644
--- a/gcc/ipa-comdats.c
+++ b/gcc/ipa-comdats.c
@@ -348,10 +348,9 @@ ipa_comdats (void)
 }
 
   /* Finally assign symbols to the sections.  */
-
+  cgraph_node *fun;
   FOR_EACH_DEFINED_SYMBOL (symbol)
 {
-  struct cgraph_node *fun;
   symbol-aux = NULL; 
   if (!symbol-get_comdat_group ()
   !symbol-alias
@@ -388,6 +387,20 @@ ipa_comdats (void)
   true);
}
 }
+
+  /* Instrumentation thunks reference original node and thus
+ need to be in the same comdat group.  Otherwise we may
+ get a local instrumented symbol in a comdat group and
+ the referencing original node outside of it.  */
+  FOR_EACH_DEFINED_FUNCTION (fun)
+if (fun-instrumentation_clone
+fun-instrumented_version
+!fun-instrumented_version-alias
+fun-get_comdat_group ()
+!fun-instrumented_version-get_comdat_group ())
+  fun-instrumented_version-call_for_symbol_and_aliases
+   (set_comdat_group_1, fun, true);
+
   return 0;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-1.cc 
b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-1.cc
new file mode 100644
index 000..26d3c48
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-1.cc
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fcheck-pointer-bounds -mmpx } */
+
+namespace
+{
+  template int dim
+  int __attribute__((noinline))
+  f1 ()
+  {
+return dim;
+  }
+}
+
+int
+test ()
+{
+  return f13 ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-2.cc 
b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-2.cc
new file mode 100644
index 000..2b1abe9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-2.cc
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fcheck-pointer-bounds -mmpx } */
+
+class c1
+{
+public:
+  virtual int test1 (const char *);
+};
+
+class c2
+{
+public:
+  int test2 (const char *);
+};
+
+int
+c1::test1 (const char *)
+{
+  return 0;
+}
+
+int
+c2::test2 (const char *)
+{
+  return 0;
+}


C++ PATCH for c++/65642 (rejects-valid-constexpr)

2015-04-02 Thread Marek Polacek
This fixes the regression part of c++/65642.  Some history behind this:
up until r214941 we were able to fold A[i] + j into A[i + j], but
in this revision richi removed this transformation.  Since we need this
transformation when dealing with constexprs, I added it back into
cxx_fold_indirect_ref function (r221544).  But that was a wrong place; we
need to be able to handle such a transformation even when we're not dealing
with INDIRECT_REFs, so I moved the folding into cxx_eval_constant_expression
and introduced cxx_eval_pointer_plus_expression (r221777).  That was the
right thing to do, but I should've also added an evaluation of the first op
of a POINTER_PLUS_EXPR, otherwise cxx_eval_pointer_plus_expression won't be
able to do its job properly in some cases: e.g. when it gets s + 1 where
s is a PARM_DECL: the function expects the s[0] + 1 form.

The fix for the second part of this PR should follow soon.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-04-02  Marek Polacek  pola...@redhat.com

PR c++/65642
* constexpr.c (cxx_eval_pointer_plus_expression): Call
cxx_eval_constant_expression on the first operand.

* g++.dg/cpp0x/constexpr-fold1.C: New test.
* g++.dg/cpp0x/constexpr-fold2.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 2100f94..f5be8df 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -2933,6 +2933,9 @@ cxx_eval_pointer_plus_expression (const constexpr_ctx 
*ctx, tree t,
   tree op01 = TREE_OPERAND (t, 1);
   location_t loc = EXPR_LOCATION (t);
 
+  op00 = cxx_eval_constant_expression (ctx, op00, lval,
+  non_constant_p, overflow_p);
+
   STRIP_NOPS (op00);
   if (TREE_CODE (op00) != ADDR_EXPR)
 return NULL_TREE;
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-fold1.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-fold1.C
index e69de29..414a0da 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-fold1.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-fold1.C
@@ -0,0 +1,65 @@
+// PR c++/65642
+// { dg-do compile { target c++11 } }
+
+// Check we're able to evaluate these.
+
+#define SA(X) static_assert((X),#X)
+
+constexpr char s[] = abc;
+constexpr int t[] = { 'a', 'b', 'c', '\0' };
+
+constexpr char
+fn1 (const char *p)
+{
+  return *(p + 1);
+}
+
+constexpr char
+fn2 (const char *p)
+{
+  return p[1];
+}
+
+constexpr int
+fn3 (const int *p)
+{
+  return *(p + 1);
+}
+
+constexpr int
+fn4 (const int *p)
+{
+  return p[1];
+}
+
+constexpr auto c1 = fn1 (s[0]);
+constexpr auto c2 = fn1 (s[1]);
+constexpr auto c3 = fn1 (s[2]);
+
+SA (c1 == 'b');
+SA (c2 == 'c');
+SA (c3 == '\0');
+
+constexpr auto d1 = fn2 (s[0]);
+constexpr auto d2 = fn2 (s[1]);
+constexpr auto d3 = fn2 (s[2]);
+
+SA (d1 == 'b');
+SA (d2 == 'c');
+SA (d3 == '\0');
+
+constexpr auto e1 = fn3 (t[0]);
+constexpr auto e2 = fn3 (t[1]);
+constexpr auto e3 = fn3 (t[2]);
+
+SA (e1 == 'b');
+SA (e2 == 'c');
+SA (e3 == '\0');
+
+constexpr auto f1 = fn4 (t[0]);
+constexpr auto f2 = fn4 (t[1]);
+constexpr auto f3 = fn4 (t[2]);
+
+SA (f1 == 'b');
+SA (f2 == 'c');
+SA (f3 == '\0');
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-fold2.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-fold2.C
index e69de29..98aca2a 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-fold2.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-fold2.C
@@ -0,0 +1,30 @@
+// PR c++/65642
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+constexpr char s[] = abc;
+
+constexpr bool
+cmp (char const *a, char const *b)
+{
+  return a == b;
+}
+
+constexpr bool
+fn1 (const char *s)
+{
+  return cmp (s, s + 1);
+}
+
+constexpr bool
+fn2 (const char *s)
+{
+  return cmp (s + 1, s + 1);
+}
+
+constexpr auto c1 = fn1 (s[0]);
+constexpr auto c2 = fn2 (s[0]);
+
+SA (!c1);
+SA (c2);

Marek


Re: New regression on ARM Linux

2015-04-02 Thread Alan Lawrence

Richard Biener wrote:
  On Tue, 31 Mar 2015, Alan Lawrence wrote:
 
  (1) If we wish to keep the AAPCS principle that varargs are passed just as
  named args, we should use TYPE_MAIN_VARIANT inside
  arm_needs_doubleword_alignment, which will then ignore overalignment on 
both
  varargs _and named args_. However this would be silently ABI-changing?
 
  (2) It seems to me that SRA is the only way for overalignment info to be
  present on a value, so the patch to tree-sra.c/create_access_replacement 
seems

  to make things more consistent?
 
  I'm not so sure about (2), SCCVN records the type of a reference
  and PRE uses it to create the LHS temporaries to insert them.
  You'd need some tricky order of optimizations to expose that to
  a call argument though (copy-propagating the inserted value to
  a call argument).  LIM may have similar issues (when doing store-motion),
  so may predictive commoning and loop distribution (and maybe others I
  forgot).


Hmmm. The other cases you mention are somewhat worrisome, as we don't know if
there are testcases that might tickle these too. But I think making a
last-minute change to the backend, *that would affect the ABI for prototyped
arguments*, is asking for trouble; we don't have time to think through the
implications or make a proper spec, and I can easily see us wanting to change it
*again* for gcc 6! So I think your patch to tree-sra.c
(create_access_replacement) seems the right fix?

--Alan




[PATCH, CHKP] Fix cdtor merge for instrumented functions

2015-04-02 Thread Ilya Enkovich
Hi,

This patch doesn't allow instrumentation thunks calls while merging 
constructors and destructors.  Not isntrumented code is not affeceted.  
Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

* ipa.c (ipa_cdtor_merge): Skip instrumentation thunks.

gcc/testsuite/

2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.dg/lto/chkp-ctor-merge_0.c: New.


diff --git a/gcc/ipa.c b/gcc/ipa.c
index b3752de..84ab542 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -1159,8 +1159,10 @@ ipa_cdtor_merge (void)
 {
   struct cgraph_node *node;
   FOR_EACH_DEFINED_FUNCTION (node)
-if (DECL_STATIC_CONSTRUCTOR (node-decl)
-   || DECL_STATIC_DESTRUCTOR (node-decl))
+if ((DECL_STATIC_CONSTRUCTOR (node-decl)
+|| DECL_STATIC_DESTRUCTOR (node-decl))
+!(node-thunk.thunk_p
+ node-thunk.add_pointer_bounds_args))
record_cdtor_fn (node);
   build_cdtor_fns ();
   static_ctors.release ();
diff --git a/gcc/testsuite/gcc.dg/lto/chkp-ctor-merge_0.c 
b/gcc/testsuite/gcc.dg/lto/chkp-ctor-merge_0.c
new file mode 100644
index 000..ac4095b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/chkp-ctor-merge_0.c
@@ -0,0 +1,23 @@
+/* { dg-lto-do run } */
+/* { dg-require-effective-target mpx } */
+/* { dg-lto-options { { -O2 -flto -fcheck-pointer-bounds -mmpx -nodefaultlibs 
-lc } } } */
+
+int glob = 1;
+
+void __attribute__((constructor))
+ctor1 ()
+{
+  glob += 1;
+}
+
+
+void __attribute__((constructor))
+ctor2 ()
+{
+  glob -= 2;
+}
+
+int main (int argc, const char **argv)
+{
+  return glob;
+}


C++ PATCH for c++/65625 (ICE using variable template as type)

2015-04-02 Thread Jason Merrill
We were specifically handling misuse of a function template, but we also 
need to handle variable templates now.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit d743a07ee6b19b3fec59c81ae2cbedbf38d91c4b
Author: Jason Merrill ja...@redhat.com
Date:   Wed Apr 1 19:02:03 2015 -0400

	PR c++/65625
	* decl.c (make_typename_type): Handle seeing a variable template.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 31b8e0c..c4731ae 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -3480,9 +3480,9 @@ make_typename_type (tree context, tree name, enum tag_types tag_type,
   if (TREE_CODE (name) == TEMPLATE_ID_EXPR)
 {
   name = TREE_OPERAND (name, 0);
-  if (TREE_CODE (name) == TEMPLATE_DECL)
+  if (DECL_TYPE_TEMPLATE_P (name))
 	name = TREE_OPERAND (fullname, 0) = DECL_NAME (name);
-  else if (TREE_CODE (name) == OVERLOAD)
+  if (TREE_CODE (name) != IDENTIFIER_NODE)
 	{
 	  if (complain  tf_error)
 	error (%qD is not a type, name);
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ23.C b/gcc/testsuite/g++.dg/cpp1y/var-templ23.C
new file mode 100644
index 000..a5b19d2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ23.C
@@ -0,0 +1,7 @@
+// PR c++/65625
+// { dg-do compile { target c++14 } }
+
+namespace std {
+  template typename T int declval;
+  typename std::declval d;	// { dg-error not a type }
+}


Re: [CHKP] Support returned bounds in thunks expand

2015-04-02 Thread Ilya Enkovich
Ping

2015-03-10 13:12 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 Hi,

 Currentl we loose returned bounds when functions are merged.  This patch 
 fixes it by adding returne bounds support for cgraph_node::expand_thunk.  
 Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

 Thanks,
 Ilya
 --
 gcc/

 2015-03-06  Ilya Enkovich  ilya.enkov...@intel.com

 * cgraphunit.c (cgraph_node::expand_thunk): Build returned
 bounds for instrumented functions.

 gcc/testsuite/

 2015-03-06  Ilya Enkovich  ilya.enkov...@intel.com

 * gcc/testsuite/gcc.target/i386/thunk-retbnd.c: New.


 diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
 index e640907..fc38e67 100644
 --- a/gcc/cgraphunit.c
 +++ b/gcc/cgraphunit.c
 @@ -1581,6 +1581,7 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool 
 force_gimple_thunk)
int i;
tree resdecl;
tree restmp = NULL;
 +  tree resbnd = NULL;

gcall *call;
greturn *ret;
 @@ -1697,6 +1698,21 @@ cgraph_node::expand_thunk (bool output_asm_thunks, 
 bool force_gimple_thunk)
gsi_insert_after (bsi, call, GSI_NEW_STMT);
if (!alias_is_noreturn)
 {
 + if (instrumentation_clone
 +  !DECL_BY_REFERENCE (resdecl)
 +  restmp
 +  BOUNDED_P (restmp))
 +   {
 + tree fn = targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDRET);
 + gcall *retbnd = gimple_build_call (fn, 1, restmp);
 +
 + resbnd = create_tmp_reg (pointer_bounds_type_node, retbnd);
 + gimple_call_set_lhs (retbnd, resbnd);
 +
 + gsi_insert_after (bsi, retbnd, GSI_NEW_STMT);
 + create_edge (get_create (fn), retbnd, callees-count, 
 callees-frequency);
 +   }
 +
   if (restmp  !this_adjusting
(fixed_offset || virtual_offset))
 {
 @@ -1766,6 +1782,7 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool 
 force_gimple_thunk)
 ret = gimple_build_return (restmp);
   else
 ret = gimple_build_return (resdecl);
 + gimple_return_set_retbnd (ret, resbnd);

   gsi_insert_after (bsi, ret, GSI_NEW_STMT);
 }
 diff --git a/gcc/testsuite/gcc.target/i386/thunk-retbnd.c 
 b/gcc/testsuite/gcc.target/i386/thunk-retbnd.c
 new file mode 100644
 index 000..d9bd031
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/i386/thunk-retbnd.c
 @@ -0,0 +1,19 @@
 +/* { dg-do compile } */
 +/* { dg-require-effective-target mpx } */
 +/* { dg-options -O2 -fcheck-pointer-bounds -mmpx -fdump-tree-optimized } */
 +/* { dg-final { scan-tree-dump-times return glob, 2 optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */
 +
 +int glob;
 +
 +int *
 +test1 (void)
 +{
 +  return glob;
 +}
 +
 +int *
 +test2 (void)
 +{
 +  return test1 ();
 +}


Re: [CHKP, PATCH] Fix LTO cgraph merge for instrumented functions

2015-04-02 Thread Ilya Enkovich
Ping

2015-03-19 11:34 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 On 12 Mar 13:27, Ilya Enkovich wrote:
 Hi,

 Currently cgraph merge has several issues with instrumented code:
  - original function node may be removed = no assembler name conflict is 
 detected between function and variable
  - only orig_decl name is privatized for instrumented function = node still 
 shares assembler name which causes infinite privatization loop
  - information about changed name is stored in file_data of instrumented 
 node = original section name may be not found for original function
  - chkp reference is not fixed when nodes are merged

 This patch should fix theese problems by keeping instrumentation thunks 
 reachable, privatizing both nodes and fixing chkp references.  Bootstrapped 
 and tested on x86_64-unknown-linux-gnu.  OK for trunk?


 Thanks,
 Ilya

 Here is an updated version where chkp_maybe_fix_chkp_ref also removes 
 duplicating references which may appear during cgraph nodes merge.  
 Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?


 Thanks,
 Ilya
 --
 gcc/

 2015-03-19  Ilya Enkovich  ilya.enkov...@intel.com

 * ipa-chkp.h (chkp_maybe_fix_chkp_ref): New.
 * ipa-chkp.c (chkp_maybe_fix_chkp_ref): New.
 * ipa.c (symbol_table::remove_unreachable_nodes): Don't
 remove instumentation thunks calling reachable functions.
 * lto-cgraph.c: Include ipa-chkp.h.
 (input_symtab): Fix chkp references for boundary nodes.
 * lto/lto-partition.c (privatize_symbol_name_1): New.
 (privatize_symbol_name): Privatize both decl and orig_decl
 names for instrumented functions.
 * lto/lto-symtab.c: Include ipa-chkp.h.
 (lto_cgraph_replace_node): Fix chkp references for merged
 function nodes.

 gcc/testsuite/

 2015-03-19  Ilya Enkovich  ilya.enkov...@intel.com

 * gcc.dg/lto/chkp-privatize-1_0.c: New.
 * gcc.dg/lto/chkp-privatize-1_1.c: New.
 * gcc.dg/lto/chkp-privatize-2_0.c: New.
 * gcc.dg/lto/chkp-privatize-2_1.c: New.


 diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c
 index 0b857ff..7a7fc3c 100644
 --- a/gcc/ipa-chkp.c
 +++ b/gcc/ipa-chkp.c
 @@ -414,6 +414,40 @@ chkp_instrumentable_p (tree fndecl)
(!fn || !copy_forbidden (fn, fndecl)));
  }

 +/* Check NODE has a correct IPA_REF_CHKP reference.
 +   Create a new reference if required.  */
 +
 +void
 +chkp_maybe_fix_chkp_ref (cgraph_node *node)
 +{
 +  /* Firstly check node needs IPA_REF_CHKP.  */
 +  if (node-instrumentation_clone
 +  || !node-instrumented_version)
 +return;
 +
 +  /* Check we already have a proper IPA_REF_CHKP.
 + Remove incorrect refs.  */
 +  int i;
 +  ipa_ref *ref = NULL;
 +  bool found = false;
 +  for (i = 0; node-iterate_reference (i, ref); i++)
 +if (ref-use == IPA_REF_CHKP)
 +  {
 +   /* Found proper reference.  */
 +   if (ref-referred == node-instrumented_version
 +!found)
 + found = true;
 +   else
 + {
 +   ref-remove_reference ();
 +   i--;
 + }
 +  }
 +
 +  if (!found)
 +node-create_reference (node-instrumented_version, IPA_REF_CHKP, NULL);
 +}
 +
  /* Return clone created for instrumentation of NODE or NULL.  */

  cgraph_node *
 diff --git a/gcc/ipa-chkp.h b/gcc/ipa-chkp.h
 index 6708fe9..5fa7d88 100644
 --- a/gcc/ipa-chkp.h
 +++ b/gcc/ipa-chkp.h
 @@ -24,5 +24,6 @@ extern tree chkp_copy_function_type_adding_bounds (tree 
 orig_type);
  extern tree chkp_maybe_clone_builtin_fndecl (tree fndecl);
  extern cgraph_node *chkp_maybe_create_clone (tree fndecl);
  extern bool chkp_instrumentable_p (tree fndecl);
 +extern void chkp_maybe_fix_chkp_ref (cgraph_node *node);

  #endif /* GCC_IPA_CHKP_H */
 diff --git a/gcc/ipa.c b/gcc/ipa.c
 index b3752de..3054afe 100644
 --- a/gcc/ipa.c
 +++ b/gcc/ipa.c
 @@ -492,7 +492,22 @@ symbol_table::remove_unreachable_nodes (FILE *file)
 }
   else if (cnode-thunk.thunk_p)
 enqueue_node (cnode-callees-callee, first, reachable);
 -
 +
 + /* For instrumentation clones we always need original
 +function node for proper LTO privatization.  */
 + if (cnode-instrumentation_clone
 +  reachable.contains (cnode)
 +  cnode-definition)
 +   {
 + gcc_assert (cnode-instrumented_version || in_lto_p);
 + if (cnode-instrumented_version)
 +   {
 + enqueue_node (cnode-instrumented_version, first,
 +   reachable);
 + reachable.add (cnode-instrumented_version);
 +   }
 +   }
 +
   /* If any reachable function has simd clones, mark them as
  reachable as well.  */
   if (cnode-simd_clones)
 diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
 index c875fed..b9196eb 100644
 --- a/gcc/lto-cgraph.c
 +++ b/gcc/lto-cgraph.c
 @@ -80,6 +80,7 @@ along with GCC; 

Re: [CHKP] Never expand instrumentation thunks

2015-04-02 Thread Ilya Enkovich
Ping. Patch doesn't affect not instrumented code.

Thanks,
Ilya

2015-03-18 15:19 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 Hi,

 This patch disables attempts to expand instrumentation thunks which appear 
 when we create specialized function versions.  Problem was found during 
 SPEC2006 insatrumentation with '-Ofast -flto'.  Unfortunately I couldn't make 
 a small reproducer.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK 
 for trunk?

 Thanks,
 Ilya
 --
 2015-03-18  Ilya Enkovich  ilya.enkov...@intel.com

 * cgraphunit.c (cgraph_node::expand_thunk): Don't expand
 instrumentation thunks.


 diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
 index e640907..abc9cfe 100644
 --- a/gcc/cgraphunit.c
 +++ b/gcc/cgraphunit.c
 @@ -1508,6 +1508,10 @@ cgraph_node::expand_thunk (bool output_asm_thunks, 
 bool force_gimple_thunk)
tree thunk_fndecl = decl;
tree a;

 +  /* Instrumentation thunk is the same function with
 + a different signature.  Never need to expand it.  */
 +  if (thunk.add_pointer_bounds_args)
 +return false;

if (!force_gimple_thunk  this_adjusting
 targetm.asm_out.can_output_mi_thunk (thunk_fndecl, fixed_offset,


Re: [PATCH] Fix PR ipa/65557

2015-04-02 Thread James Greenhalgh
On Wed, Apr 01, 2015 at 11:42:28PM +0100, H.J. Lu wrote:
 On Mon, Mar 30, 2015 at 3:19 PM, Martin Liška mli...@suse.cz wrote:
  You are right, there's one more occurrence of the usage.
  I'm going to install the patch I've attached.
 
 
 This caused:
 
 FAIL: g++.dg/torture/pr64378.C   -O2 -flto -fno-use-linker-plugin
 -flto-partition=none  (internal compiler error)
 FAIL: g++.dg/torture/pr64378.C   -O2 -flto -fno-use-linker-plugin
 -flto-partition=none  (test for excess errors)
 
 on x86-64.

Likewise on aarch64-none-elf and aarch64-none-linux-gnu.

Thanks,
James



/build-gcc/obj/gcc2/gcc/testsuite/g++6/../../xg++ 
-B/build-gcc/obj/gcc2/gcc/testsuite/g++6/../../ 
/gcc-src/gcc/gcc/testsuite/g++.dg/torture/pr64378.C -fno-diagnostics-show-caret 
-fdiagnostics-color=never -nostdinc++ 
-I/build-gcc/obj/gcc2/aarch64-none-elf/libstdc++-v3/include/aarch64-none-elf 
-I/build-gcc/obj/gcc2/aarch64-none-elf/libstdc++-v3/include 
-I/gcc-src/gcc/libstdc++-v3/libsupc++ 
-I/gcc-src/gcc/libstdc++-v3/include/backward 
-I/gcc-src/gcc/libstdc++-v3/testsuite/util -fmessage-length=0 -O2 -flto 
-fno-use-linker-plugin -flto-partition=none -fno-ipa-cp -S -specs=aem-ve.specs 
-mcmodel=small -o pr64378.s

/gcc-src/gcc/gcc/testsuite/g++.dg/torture/pr64378.C:24:1: internal compiler 
error: Segmentation fault
0xc2c81a crash_signal
/gcc-src/gcc/gcc/toplev.c:383
0x10e434e vecipa_param_descriptor, va_heap, vl_embed::operator[](unsigned int)
/gcc-src/gcc/gcc/vec.h:736
0x10e434e vecipa_param_descriptor, va_heap, vl_ptr::operator[](unsigned int)
/gcc-src/gcc/gcc/vec.h:1202
0x10e434e ipa_is_param_used
/gcc-src/gcc/gcc/ipa-prop.h:405
0x10e434e ipa_icf::sem_item_optimizer::update_hash_by_addr_refs()
/gcc-src/gcc/gcc/ipa-icf.c:2511
0x10e45c5 ipa_icf::sem_item_optimizer::execute()
/gcc-src/gcc/gcc/ipa-icf.c:2394
0x10e6aa6 ipa_icf_driver
/gcc-src/gcc/gcc/ipa-icf.c:3304
0x10e6aa6 ipa_icf::pass_ipa_icf::execute(function*)
/gcc-src/gcc/gcc/ipa-icf.c:3351
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.



Re: [PATCH, CHKP] Restore transparent alias chains

2015-04-02 Thread Ilya Enkovich
Ping. This patch doesn't affect not instrumented code.

Thanks,
Ilya

2015-03-20 11:20 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 Hi,

 Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS bit. 
  We always expect it for instrumentation clones, thus restore it 
 input_cgraph_1.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for 
 trunk?

 Thanks,
 Ilya
 --
 2015-03-20  Ilya Enkovich  ilya.enkov...@intel.com

 * lto-cgraph.c (input_cgraph_1): Always link instrumented
 assembler name with original one.


 diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
 index c875fed..d782327 100644
 --- a/gcc/lto-cgraph.c
 +++ b/gcc/lto-cgraph.c
 @@ -1613,9 +1613,8 @@ input_cgraph_1 (struct lto_file_decl_data *file_data,
 }

   /* Restore decl names reference.  */
 - if (IDENTIFIER_TRANSPARENT_ALIAS (DECL_ASSEMBLER_NAME 
 (cnode-decl))
 -  !TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode-decl)))
 -   TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode-decl))
 + IDENTIFIER_TRANSPARENT_ALIAS (DECL_ASSEMBLER_NAME 
 (cnode-decl)) = 1;
 + TREE_CHAIN (DECL_ASSEMBLER_NAME (cnode-decl))
   = DECL_ASSEMBLER_NAME (cnode-orig_decl);
 }
 }


Re: C++ PATCH for c++/65642 (rejects-valid-constexpr)

2015-04-02 Thread Jason Merrill

OK.

Jason


Re: [PATCH, CHKP] Fix ipa-comdats for instrumentation thunks

2015-04-02 Thread Jan Hubicka
 Hi,
 
 With r221574 (https://gcc.gnu.org/ml/gcc-cvs/2015-03/msg00495.html) thunks 
 don't get comdat groups assigned and this causes a failure in cgraph checker 
 for instrumentation thunks.  It happens because instrumentation thunk may 
 reference local symbol in comdat not being in comdat by itself.  This patch 
 fixes the problem.  Doesn't affect not instrumented code.  Testing is in 
 progress.  Does it look OK?
 
 Thanks,
 Ilya
 --
 gcc/
 
 2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com
 
   * ipa-comdats.c (ipa_comdats): Visit all instrumentation
   thunks to set proper comdat group.
 
 gcc/testsuite/
 
 2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com
 
   * gcc.target/i386/mpx/chkp-thunk-comdat-1.cc: New.
   * gcc.target/i386/mpx/chkp-thunk-comdat-2.cc: New.
 
 
 diff --git a/gcc/ipa-comdats.c b/gcc/ipa-comdats.c
 index f349f9f..30bcad8 100644
 --- a/gcc/ipa-comdats.c
 +++ b/gcc/ipa-comdats.c
 @@ -348,10 +348,9 @@ ipa_comdats (void)
  }
  
/* Finally assign symbols to the sections.  */
 -
 +  cgraph_node *fun;
FOR_EACH_DEFINED_SYMBOL (symbol)
  {
 -  struct cgraph_node *fun;
symbol-aux = NULL; 
if (!symbol-get_comdat_group ()
  !symbol-alias
 @@ -388,6 +387,20 @@ ipa_comdats (void)
  true);
   }
  }
 +
 +  /* Instrumentation thunks reference original node and thus
 + need to be in the same comdat group.  Otherwise we may
 + get a local instrumented symbol in a comdat group and
 + the referencing original node outside of it.  */
 +  FOR_EACH_DEFINED_FUNCTION (fun)
 +if (fun-instrumentation_clone
 +  fun-instrumented_version
 +  !fun-instrumented_version-alias
 +  fun-get_comdat_group ()
 +  !fun-instrumented_version-get_comdat_group ())
 +  fun-instrumented_version-call_for_symbol_and_aliases
 + (set_comdat_group_1, fun, true);

I think you want to handle them same way as the aliasesthunks are handled.
This fix is symptomatic: the code may assign them to different comdat groups
and may propagate that furhter.

Honza


Re: [RS6000] Fix 65576 regression

2015-04-02 Thread David Edelsohn
On Thu, Apr 2, 2015 at 1:22 AM, Alan Modra amo...@gmail.com wrote:
 This cures an unrecognizable insn ICE by modifying a predicate of
 extenddftf2_internal (the only place this predicate is used) to ensure
 that rtl optimization passes do not substitute 0.0 for a register with
 known 0.0 value, except when VSX is enabled.  ie. Don't undo the
 necessary register move emitted by the extenddftf2_fprs expander.
 Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux
 and powerpc-linux.  OK to apply?

 PR target/65576
 PR target/65240
 * config/rs6000/predicates.md (zero_reg_mem_operand): Exclude
 0.0 constant unless TARGET_VSX.

This change seems to disable the final alternative in the
extenddftf2_internal pattern.

- David


Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-04-02 Thread Ilya Enkovich
On 24 Mar 15:06, Jakub Jelinek wrote:
 On Tue, Mar 24, 2015 at 12:22:27PM +0300, Ilya Enkovich wrote:
  2015-03-24 11:33 GMT+03:00 Jakub Jelinek ja...@redhat.com:
   On Thu, Mar 19, 2015 at 11:29:44AM +0300, Ilya Enkovich wrote:
   +  /* We might propagate instrumented function pointer into
   + not instrumented function and vice versa.  In such a
   + case we need to either fix function declaration or
   + remove bounds from call statement.  */
   +  if (callee)
   +skip_bounds = chkp_redirect_edge (e);
  
   I just want to say that I'm not really excited about all this compile time
   cost that is added everywhere unconditionally for chkp.
   I think much better would be to guard most of it with proper option check
   first and only do the more expensive part if the option has been used.
  
  Agree, overhead for not instrumented code should be minimized.
  Unfortunately there is no option check I can use to guard chkp codes
  due to LTO. Currently it is allowed to pass -fcheck-pointer-bounds for
  IL generation and don't pass it for final code generation. I suppose I
  may set this (or some new) flag if see instrumented node when read
  cgraph and then use it to guard chkp related codes. Would it be
  acceptable?
 
 The question is what you want to do in the LTO case for the different cases,
 in particular a TU compiled with -fcheck-pointer-bounds and LTO link without
 that, or TU compiled without -fcheck-pointer-bounds and LTO link with it.
 It could be handled as LTO incompatible option, where lto1 would error out
 if you try to mix -fcheck-pointer-bounds with -fno-check-pointer-bounds
 code, or e.g. similar to var-tracking, you could consider adjusting the IL
 upon LTO reading if if some TU has been built with -fcheck-pointer-bounds
 and the LTO link is -fno-check-pointer-bounds.  Dunno what will happen
 with -fno-check-pointer-bounds TUs LTO linked with -fcheck-pointer-bounds.
 Or another possibility is to or in -fcheck-pointer-bounds from all TUs.
 
  Maybe replace attribute usage with a new flag in tree_decl_with_vis 
  structure?
 
 Depends, might be better to stick it into cgraph_node instead, depends on
 whether you are querying it already early in the FEs or just during GIMPLE
 when the cgraph node should be created too.
 
   Jakub

Hi,

Here is an updated version of the patch.  I added merge for 
-fcheck-pointer-bounds option in LTO and used it as a guard for new code.  
Testing is in progress.  Does this version look OK?

Thanks,
Ilya
--
gcc/

2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

PR target/65527
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Add
redirection for instrumented calls.
* lto-wrapper.c (merge_and_complain): Merge -fcheck-pointer-bounds.
(append_compiler_options): Append -fcheck-pointer-bounds.
* tree-chkp.h (chkp_copy_call_skip_bounds): New.
(chkp_redirect_edge): New.
* tree-chkp.c (chkp_copy_call_skip_bounds): New.
(chkp_redirect_edge): New.

gcc/testsuite/

2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

PR target/65527
* gcc.target/i386/mpx/chkp-fix-calls-1.c: New.
* gcc.target/i386/mpx/chkp-fix-calls-2.c: New.
* gcc.target/i386/mpx/chkp-fix-calls-3.c: New.
* gcc.target/i386/mpx/chkp-fix-calls-4.c: New.


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 85531c8..28b5996 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1277,14 +1277,25 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 {
   cgraph_edge *e = this;
 
-  tree decl = gimple_call_fndecl (e-call_stmt);
-  tree lhs = gimple_call_lhs (e-call_stmt);
+  tree decl;
+  tree lhs;
   gcall *new_stmt;
   gimple_stmt_iterator gsi;
+  bool skip_bounds = false;
 #ifdef ENABLE_CHECKING
   cgraph_node *node;
 #endif
 
+  /* We might propagate instrumented function pointer into
+ not instrumented function and vice versa.  In such a
+ case we need to either fix function declaration or
+ remove bounds from call statement.  */
+  if (flag_check_pointer_bounds  callee)
+skip_bounds = chkp_redirect_edge (e);
+
+  decl = gimple_call_fndecl (e-call_stmt);
+  lhs = gimple_call_lhs (e-call_stmt);
+
   if (e-speculative)
 {
   cgraph_edge *e2;
@@ -1390,7 +1401,8 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 }
 
   if (e-indirect_unknown_callee
-  || decl == e-callee-decl)
+  || (decl == e-callee-decl
+  !skip_bounds))
 return e-call_stmt;
 
 #ifdef ENABLE_CHECKING
@@ -1415,13 +1427,19 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
}
 }
 
-  if (e-callee-clone.combined_args_to_skip)
+  if (e-callee-clone.combined_args_to_skip
+  || skip_bounds)
 {
   int lp_nr;
 
-  new_stmt
-   = gimple_call_copy_skip_args (e-call_stmt,
- e-callee-clone.combined_args_to_skip);
+  new_stmt = e-call_stmt;
+  if (e-callee-clone.combined_args_to_skip)
+   new_stmt
+

Re: [CHKP, PATCH] Fix LTO cgraph merge for instrumented functions

2015-04-02 Thread Jan Hubicka
 Ping
It would help if you add hubi...@ucw.cz to CC for IPA related patches.
 
 2015-03-19 11:34 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
  On 12 Mar 13:27, Ilya Enkovich wrote:
  Hi,
 
  Currently cgraph merge has several issues with instrumented code:
   - original function node may be removed = no assembler name conflict is 
  detected between function and variable

Why do you need to detect this one?
   - only orig_decl name is privatized for instrumented function = node 
  still shares assembler name which causes infinite privatization loop
   - information about changed name is stored in file_data of instrumented 
  node = original section name may be not found for original function
   - chkp reference is not fixed when nodes are merged
 
  This patch should fix theese problems by keeping instrumentation thunks 
  reachable, privatizing both nodes and fixing chkp references.  
  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Next stage1 I definitly hope we will be able to reduce number of special cases
we need for chck nodes.  I will try to read the code and your design document
and give it some tought.
  2015-03-19  Ilya Enkovich  ilya.enkov...@intel.com
 
  * ipa-chkp.h (chkp_maybe_fix_chkp_ref): New.
  * ipa-chkp.c (chkp_maybe_fix_chkp_ref): New.
  * ipa.c (symbol_table::remove_unreachable_nodes): Don't
  remove instumentation thunks calling reachable functions.
  * lto-cgraph.c: Include ipa-chkp.h.
  (input_symtab): Fix chkp references for boundary nodes.
  * lto/lto-partition.c (privatize_symbol_name_1): New.
  (privatize_symbol_name): Privatize both decl and orig_decl
  names for instrumented functions.
  * lto/lto-symtab.c: Include ipa-chkp.h.
  (lto_cgraph_replace_node): Fix chkp references for merged
  function nodes.
 
  gcc/testsuite/
 
  2015-03-19  Ilya Enkovich  ilya.enkov...@intel.com
 
  * gcc.dg/lto/chkp-privatize-1_0.c: New.
  * gcc.dg/lto/chkp-privatize-1_1.c: New.
  * gcc.dg/lto/chkp-privatize-2_0.c: New.
  * gcc.dg/lto/chkp-privatize-2_1.c: New.
 
 
  diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c
  index 0b857ff..7a7fc3c 100644
  --- a/gcc/ipa-chkp.c
  +++ b/gcc/ipa-chkp.c
  @@ -414,6 +414,40 @@ chkp_instrumentable_p (tree fndecl)
 (!fn || !copy_forbidden (fn, fndecl)));
   }
 
  +/* Check NODE has a correct IPA_REF_CHKP reference.
  +   Create a new reference if required.  */
  +
  +void
  +chkp_maybe_fix_chkp_ref (cgraph_node *node)

OK, so you have the chkp references that links to instrumented_version.
You do not stream them for boundary symbols but for some reason they are needed,
but when you can end up with wrong CHKP link?
  diff --git a/gcc/ipa.c b/gcc/ipa.c
  index b3752de..3054afe 100644
  --- a/gcc/ipa.c
  +++ b/gcc/ipa.c
  @@ -492,7 +492,22 @@ symbol_table::remove_unreachable_nodes (FILE *file)
  }
else if (cnode-thunk.thunk_p)
  enqueue_node (cnode-callees-callee, first, reachable);
  -
  +
  + /* For instrumentation clones we always need original
  +function node for proper LTO privatization.  */
  + if (cnode-instrumentation_clone
  +  reachable.contains (cnode)
  +  cnode-definition)
  +   {
  + gcc_assert (cnode-instrumented_version || in_lto_p);
  + if (cnode-instrumented_version)
  +   {
  + enqueue_node (cnode-instrumented_version, first,
  +   reachable);
  + reachable.add (cnode-instrumented_version);
  +   }
  +   }
  +
This is OK
  +/* Mangle NODE symbol name into a local name.
  +   This is necessary to do
  +   1) if two or more static vars of same assembler name
  +  are merged into single ltrans unit.
  +   2) if previously static var was promoted hidden to avoid possible 
  conflict
  +  with symbols defined out of the LTO world.  */
  +
  +static bool
  +privatize_symbol_name (symtab_node *node)
  +{
  +  if (!privatize_symbol_name_1 (node, node-decl))
  +return false;
  +
 /* We could change name which is a target of transparent alias
chain of instrumented function name.  Fix alias chain if so  .*/
  -  if (cnode)
  +  if (cgraph_node *cnode = dyn_cast cgraph_node * (node))
   {
 tree iname = NULL_TREE;
 if (cnode-instrumentation_clone)
  -   iname = DECL_ASSEMBLER_NAME (cnode-decl);
  +   {
  + /* If we want to privatize instrumentation clone
  +then we also need to privatize original function.  */
  + if (cnode-instrumented_version)
  +   privatize_symbol_name (cnode-instrumented_version);
  + else
  +   privatize_symbol_name_1 (cnode, cnode-orig_decl);

This is because these are TRANSPARENT_ALIASes and thus visibility needs to 
match?
I plan to add generic support for 

Re: [PATCH, CHKP] Restore transparent alias chains

2015-04-02 Thread Jeff Law

On 03/20/2015 02:20 AM, Ilya Enkovich wrote:

Hi,

Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS bit.  
We always expect it for instrumentation clones, thus restore it input_cgraph_1. 
 Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
2015-03-20  Ilya Enkovich  ilya.enkov...@intel.com

* lto-cgraph.c (input_cgraph_1): Always link instrumented
assembler name with original one.
This appears to be a code path that only triggers when MPX is enabled 
and is roughly analogous to the code in chkp_build_instrumented_fndecl 
links things up.


OK for the trunk.

jeff


Re: Silence merge warnings on artificial types

2015-04-02 Thread Jan Hubicka
 On Tue, Mar 31, 2015 at 09:51:21 +0200, Jan Hubicka wrote:
  this patch adds the ARTIFICIAL flag check to avoid ODR merging to these.
  I oriignally tested DECL_ARTIFICIAL (decl) (that is TYPE_NAME) that randomly
  dropped type names on some classes but not all.
  
  Jason, please do you know what is meaning of DECL_ARTIFICIAL on class type
  names? Perhaps we can drop them to 0 in free lang data?
  
  With this bug I triggered wrong devirtualization because we no longer insert
  non-odr types into a type inheritance graph.  This is fixed by the 
  lto_read_decls
  change and finally I triggered an ICE in ipa-devirt that due to the bug
  output a warning late and ICEd on streamer cache being NULL.  I guess it is
  better to guard it even though all wanrings should be output early.
  
  Bootsrapped/regtested x86_64-linux, will commit it after chromium rebuild.
  
  Honza
  
  * tree.c (need_assembler_name_p): Artificial types have no ODR
  names.
  * ipa-devirt.c (warn_odr): Do not try to apply ODR cache when
  no caching is done.
  
  * lto.c (lto_read_decls): Move code registering odr types out
  of TYPE_CANONICAL conditional and also register polymorphic types.
  Index: tree.c
  ===
  --- tree.c  (revision 221777)
  @@ -5139,6 +5145,7 @@ need_assembler_name_p (tree decl)
  decl == TYPE_NAME (TREE_TYPE (decl))
  !is_lang_specific (TREE_TYPE (decl))
  AGGREGATE_TYPE_P (TREE_TYPE (decl))
  +   !TYPE_ARTIFICIAL (TREE_TYPE (decl))
  !variably_modified_type_p (TREE_TYPE (decl), NULL_TREE)
  !type_in_anonymous_namespace_p (TREE_TYPE (decl)))
   return !DECL_ASSEMBLER_NAME_SET_P (decl);
 
 Hmm, libgomp.c++/target-3.C still fails.
 Here is what I see in need_assembler_name_p:
 
  type_decl 0x2b5a8751aed8 .omp_data_s.3
 type record_type 0x2b5a87536690 .omp_data_s.3 BLK
 size integer_cst 0x2b5a873ce318 constant 192
 unit size integer_cst 0x2b5a873ce2e8 constant 24
 align 64 symtab 0 alias set -1 canonical type 0x2b5a87536690
 fields field_decl 0x2b5a87537000 b.0 type pointer_type 
 0x2b5a87536150
 unsigned DI file test.cpp line 8 col 13
 size integer_cst 0x2b5a873ace58 constant 64
 unit size integer_cst 0x2b5a873ace70 constant 8
 align 64 offset_align 128
 offset integer_cst 0x2b5a873ace88 constant 0
 bit offset integer_cst 0x2b5a873aced0 constant 0 context 
 record_type 0x2b5a87536690 .omp_data_s.3 chain field_decl 0x2b5a87537098 
 s reference_to_this reference_type 0x2b5a87536738
 VOID file test.cpp line 11 col 13
 align 1
 (TYPE_ARTIFICIAL = 0)
 
  type_decl 0x2b5a8751a850 .omp_data_s.3
 type record_type 0x2b5a8752cdc8 .omp_data_s.3 BLK
 size integer_cst 0x2b5a873ce318 constant 192
 unit size integer_cst 0x2b5a873ce2e8 constant 24
 align 64 symtab 0 alias set -1 canonical type 0x2b5a8752cdc8
 fields field_decl 0x2b5a8751a980 b.0 type pointer_type 
 0x2b5a8752c498
 unsigned DI file test.cpp line 8 col 13
 size integer_cst 0x2b5a873ace58 constant 64
 unit size integer_cst 0x2b5a873ace70 constant 8
 align 64 offset_align 128
 offset integer_cst 0x2b5a873ace88 constant 0
 bit offset integer_cst 0x2b5a873aced0 constant 0 context 
 record_type 0x2b5a8752cdc8 .omp_data_s.3 chain field_decl 0x2b5a8751a8e8 
 s
 pointer_to_this pointer_type 0x2b5a87536888
 VOID file test.cpp line 11 col 13
 align 1
 (TYPE_ARTIFICIAL = 0)

I see, the DECL_ARTIFICIAL is set on TYPE_DECL but it is not set on type
itself.  Jakub, this looks like an artificial type, so perhaps we want to set
it in omp-low.c? 
Otherwise I can add extra test for DECL_NAMELESS, but marking type as artifical
looks like correct fix to me.

Honza
 
   -- Ilya


Re: [PATCH, CHKP] Restore transparent alias chains

2015-04-02 Thread Jan Hubicka
 On 03/20/2015 02:20 AM, Ilya Enkovich wrote:
 Hi,
 
 Identifiers read with input_identifier miss IDENTIFIER_TRANSPARENT_ALIAS 
 bit.  We always expect it for instrumentation clones, thus restore it 
 input_cgraph_1.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK 
 for trunk?
 
 Thanks,
 Ilya
 --
 2015-03-20  Ilya Enkovich  ilya.enkov...@intel.com
 
  * lto-cgraph.c (input_cgraph_1): Always link instrumented
  assembler name with original one.
 This appears to be a code path that only triggers when MPX is
 enabled and is roughly analogous to the code in
 chkp_build_instrumented_fndecl links things up.
 
 OK for the trunk.

I think this will lead to wrong code. At this time, we may have multple
declarations sharing single assembler name (and thus
IDENTIFIER_TRANSPARENT_ALIAS link). Think of case where one unit defines static
function and other global function of the same name. We may end up renaming the
symbol but keeping bogus transparent alias link that will trigger on other
symbol.

During WPA the assembler names are never fully unique, but we also probably do
not need to set IDENTIFIER_TRANSPARENT_ALIAS.  

I think IDENTIFIER_TRANSPARENT_ALIAS should be set in rename_statics and
separately in ltrans on the place we skip lto_symtab merging. At least it is
the place I link it with my patch for supporting transparent aliases in cgraph.

I will try to find time to audit chkp code - I missed the addition of
transparent aliases in the initial chkp patches.  Symbol table code expect 1-1
correspondence between symbol table entries and real symbols.  Basically all
places we special case aliases or weakrefs needs to be revisited for
transparent aliases.

Honza


Re: [PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers

2015-04-02 Thread Uros Bizjak
On Thu, Apr 2, 2015 at 5:07 PM, Ilya Enkovich enkovich@gmail.com wrote:
 Hi,

 This patch fixes nested PARALLEL in retval for isntrumented calls. Current 
 possible call:

 (call_insn:TI 6 30 17 2 (set (parallel [
 (expr_list:REG_DEP_TRUE (parallel:TI [
 (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0)
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1)
 (const_int 8 [0x8]))
 ])
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
 (const_int 0 [0]))
 ])
 (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41]  function_decl 
 0x7f6f50e02ca8 test1.chkp) [0 test1.chkp S1 A8])
 (const_int 0 [0]))) complex.c:11 670 {*call_value}

 Such construction causes DF analysis problems.  This patch tranforms it to:

 (call_insn:TI 6 30 17 2 (set (parallel:TI [
 (expr_list:REG_DEP_TRUE (reg:DF 21 xmm0)
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:DF 22 xmm1)
 (const_int 8 [0x8]))
 (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
 (const_int 0 [0]))
 (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
 (const_int 0 [0]))
 ])
 (call/j (mem:QI (symbol_ref:DI (test1) [flags 0x41]  function_decl 
 0x7fb609bf7ca8 test1.chkp) [0 test1.chkp S1 A8])
 (const_int 0 [0]))) complex.c:11 670 {*call_value}


 Only MPX target is affected.  Bootstrapped and tested on 
 x86_64-unknown-linux-gnu.  Does it look OK?

 Thanks,
 Ilya
 --
 2015-04-02  Ilya Enkovich  ilya.enkov...@intel.com

 * config/i386/i386.c (ix86_expand_call): Avoid nested
 PARALLEL in returned call value.

OK.

Thanks,
Uros.


Re: Silence merge warnings on artificial types

2015-04-02 Thread Ilya Verbin
On Tue, Mar 31, 2015 at 09:51:21 +0200, Jan Hubicka wrote:
 this patch adds the ARTIFICIAL flag check to avoid ODR merging to these.
 I oriignally tested DECL_ARTIFICIAL (decl) (that is TYPE_NAME) that randomly
 dropped type names on some classes but not all.
 
 Jason, please do you know what is meaning of DECL_ARTIFICIAL on class type
 names? Perhaps we can drop them to 0 in free lang data?
 
 With this bug I triggered wrong devirtualization because we no longer insert
 non-odr types into a type inheritance graph.  This is fixed by the 
 lto_read_decls
 change and finally I triggered an ICE in ipa-devirt that due to the bug
 output a warning late and ICEd on streamer cache being NULL.  I guess it is
 better to guard it even though all wanrings should be output early.
 
 Bootsrapped/regtested x86_64-linux, will commit it after chromium rebuild.
 
 Honza
 
   * tree.c (need_assembler_name_p): Artificial types have no ODR
   names.
   * ipa-devirt.c (warn_odr): Do not try to apply ODR cache when
   no caching is done.
 
   * lto.c (lto_read_decls): Move code registering odr types out
   of TYPE_CANONICAL conditional and also register polymorphic types.
 Index: tree.c
 ===
 --- tree.c(revision 221777)
 @@ -5139,6 +5145,7 @@ need_assembler_name_p (tree decl)
 decl == TYPE_NAME (TREE_TYPE (decl))
 !is_lang_specific (TREE_TYPE (decl))
 AGGREGATE_TYPE_P (TREE_TYPE (decl))
 +   !TYPE_ARTIFICIAL (TREE_TYPE (decl))
 !variably_modified_type_p (TREE_TYPE (decl), NULL_TREE)
 !type_in_anonymous_namespace_p (TREE_TYPE (decl)))
  return !DECL_ASSEMBLER_NAME_SET_P (decl);

Hmm, libgomp.c++/target-3.C still fails.
Here is what I see in need_assembler_name_p:

 type_decl 0x2b5a8751aed8 .omp_data_s.3
type record_type 0x2b5a87536690 .omp_data_s.3 BLK
size integer_cst 0x2b5a873ce318 constant 192
unit size integer_cst 0x2b5a873ce2e8 constant 24
align 64 symtab 0 alias set -1 canonical type 0x2b5a87536690
fields field_decl 0x2b5a87537000 b.0 type pointer_type 0x2b5a87536150
unsigned DI file test.cpp line 8 col 13
size integer_cst 0x2b5a873ace58 constant 64
unit size integer_cst 0x2b5a873ace70 constant 8
align 64 offset_align 128
offset integer_cst 0x2b5a873ace88 constant 0
bit offset integer_cst 0x2b5a873aced0 constant 0 context 
record_type 0x2b5a87536690 .omp_data_s.3 chain field_decl 0x2b5a87537098 s 
reference_to_this reference_type 0x2b5a87536738
VOID file test.cpp line 11 col 13
align 1
(TYPE_ARTIFICIAL = 0)

 type_decl 0x2b5a8751a850 .omp_data_s.3
type record_type 0x2b5a8752cdc8 .omp_data_s.3 BLK
size integer_cst 0x2b5a873ce318 constant 192
unit size integer_cst 0x2b5a873ce2e8 constant 24
align 64 symtab 0 alias set -1 canonical type 0x2b5a8752cdc8
fields field_decl 0x2b5a8751a980 b.0 type pointer_type 0x2b5a8752c498
unsigned DI file test.cpp line 8 col 13
size integer_cst 0x2b5a873ace58 constant 64
unit size integer_cst 0x2b5a873ace70 constant 8
align 64 offset_align 128
offset integer_cst 0x2b5a873ace88 constant 0
bit offset integer_cst 0x2b5a873aced0 constant 0 context 
record_type 0x2b5a8752cdc8 .omp_data_s.3 chain field_decl 0x2b5a8751a8e8 s
pointer_to_this pointer_type 0x2b5a87536888
VOID file test.cpp line 11 col 13
align 1
(TYPE_ARTIFICIAL = 0)

  -- Ilya


Re: [CHKP] Support returned bounds in thunks expand

2015-04-02 Thread Jeff Law

On 04/02/2015 08:49 AM, Ilya Enkovich wrote:

Ping

2015-03-10 13:12 GMT+03:00 Ilya Enkovich enkovich@gmail.com:

Hi,

Currentl we loose returned bounds when functions are merged.  This patch fixes 
it by adding returne bounds support for cgraph_node::expand_thunk.  
Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-03-06  Ilya Enkovich  ilya.enkov...@intel.com

 * cgraphunit.c (cgraph_node::expand_thunk): Build returned
 bounds for instrumented functions.

gcc/testsuite/

2015-03-06  Ilya Enkovich  ilya.enkov...@intel.com

 * gcc/testsuite/gcc.target/i386/thunk-retbnd.c: New.
I really dislike the amount of gimple and bounded pointer knowledge in 
this code.


It seems like a significant modularity violation and while you didn't 
introduce the gimple stuff, we probably shouldn't be making it worse.


Is it possible to let this code build up the thunk, then pass it off as 
a whole to the chkp code to add the instrumentation, particularly for 
the return value?


ALso, is this critical for stage4?  It looks like this is strictly a 
QofI change, correct?


jeff



Re: [PATCH, CHKP, i386] Fix retval generated for instrumented calls returning values on multiple registers

2015-04-02 Thread Jan Hubicka
 On Thu, Apr 2, 2015 at 10:17 PM, Jan Hubicka hubi...@ucw.cz wrote:
 
  rtx b0 = gen_rtx_REG (BND64mode, FIRST_BND_REG);
  rtx b1 = gen_rtx_REG (BND64mode, FIRST_BND_REG + 1);
  -   retval = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, retval, b0, 
  b1));
  -   chkp_put_regs_to_expr_list (retval);
  +   if (GET_CODE (retval) == PARALLEL)
  + {
  +   b0 = gen_rtx_EXPR_LIST (VOIDmode, b0, const0_rtx);
  +   b1 = gen_rtx_EXPR_LIST (VOIDmode, b1, const0_rtx);
  +   rtx par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, b0, b1));
  +   retval = chkp_join_splitted_slot (retval, par);
 
  I do not understand this code. Isn't it just droping the original return 
  value
  constructing prallelcontaining only the BND_REGs?
 
 Please see chkp_join_splitted_slot that merges two parallels.

I see, OK then :)

Honza
 
 Uros.