Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior

2014-11-18 Thread Tim Shen
On Tue, Nov 18, 2014 at 11:19 AM, Paolo Carlini
 wrote:
> Jonathan lately is following your work much better than me, but naively
> seems weird that _M_begin is non-const and _M_end is const, a different type
> anyway.

Hmm. The current regex_search algorithm is implemented as try match
starting from _M_begin; if it doesn't match, start over from
_M_begin+1, ...

As we can tell, _M_begin is never changed, and it's const.

The problem is when the executer reaches the accept state (which
indicates a match) we use _M_current == _M_begin to verify if it's an
empty match. It is possible that, when we are not in the first
iteration, say, in the second iteration actually, _M_current is
initialized with _M_begin+1. It turns out even _M_current has never
been increased (no chars are eaten, aka empty match), _M_current !=
_M_begin is still true.

This fix is making each regex_search iteration more thorough, with
increased _M_begin, as if it's a new regex _M_search_from_first.

I've carefully (admittedly, after sending this patch) inspect
everywhere when _M_begin is used. It turns out _M_begin is under
well-defined (the initial position of _M_current when current
iteration starts) invariants (see _Executor<>::_M_at_begin), indicated
by the use of regex_constants::match_prev_avail. This flag actually
implies that __begin iterator passed into regex_search is not always
"the physical boundary" who matches "^". Boost (and we) conforms this
behavior:

std::regex_search("asdf", std::regex("^asdf"),
std::regex_constants::match_prev_avail)

returns false.

It's more elegant to move _Executor::_M_search out of its class and
make _M_begin still const, but _Executor costs too much to initialize.


-- 
Regards,
Tim Shen


Re: [PATCH] Fix up r217118 - simplify_binary_operation_1 (PR rtl-optimization/63843)

2014-11-18 Thread Richard Biener
On November 18, 2014 10:59:28 PM CET, Jakub Jelinek  wrote:
>Hi!
>
>The case ASHIFTRT: which is meant for this optimization is preceeded by
>/* FALLTHRU */ from ROTATE cases, which can't be optimized this way.
>Thus, this patch limits this optimization to ASHIFTRT.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

>2014-11-18  Jakub Jelinek  
>
>   PR rtl-optimization/63843
>   * simplify-rtx.c (simplify_binary_operation_1) : For
>   optimization of ashiftrt of subreg of lshiftrt, check that code
>   is ASHIFTRT.
>
>   * gcc.c-torture/execute/pr63843.c: New test.
>
>--- gcc/simplify-rtx.c.jj  2014-11-18 10:17:01.0 +0100
>+++ gcc/simplify-rtx.c 2014-11-18 13:44:26.727792683 +0100
>@@ -3118,10 +3118,11 @@ simplify_binary_operation_1 (enum rtx_co
>  (subreg:M1 (ashiftrt:M2 (reg:M2)
>  (const_int ))
>   ).  */
>-  if (!VECTOR_MODE_P (mode)
>+  if (code == ASHIFTRT
>+&& !VECTOR_MODE_P (mode)
>   && SUBREG_P (op0)
>   && CONST_INT_P (op1)
>-  && (GET_CODE (SUBREG_REG (op0)) == LSHIFTRT)
>+&& GET_CODE (SUBREG_REG (op0)) == LSHIFTRT
>   && !VECTOR_MODE_P (GET_MODE (SUBREG_REG (op0)))
>   && CONST_INT_P (XEXP (SUBREG_REG (op0), 1))
>   && (GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (op0)))
>--- gcc/testsuite/gcc.c-torture/execute/pr63843.c.jj   2014-11-18
>13:43:45.417544096 +0100
>+++ gcc/testsuite/gcc.c-torture/execute/pr63843.c  2014-11-18
>13:43:32.0 +0100
>@@ -0,0 +1,31 @@
>+/* PR rtl-optimization/63843 */
>+
>+static inline __attribute__ ((always_inline))
>+unsigned short foo (unsigned short v)
>+{
>+  return (v << 8) | (v >> 8);
>+}
>+
>+unsigned short __attribute__ ((noinline, noclone, hot))
>+bar (unsigned char *x)
>+{
>+  unsigned int a;
>+  unsigned short b;
>+  __builtin_memcpy (&a, &x[0], sizeof (a));
>+  a ^= 0x80808080U;
>+  __builtin_memcpy (&x[0], &a, sizeof (a));
>+  __builtin_memcpy (&b, &x[2], sizeof (b));
>+  return foo (b);
>+}
>+
>+int
>+main ()
>+{
>+  unsigned char x[8] = { 0x01, 0x01, 0x01, 0x01 };
>+  if (__CHAR_BIT__ == 8
>+  && sizeof (short) == 2
>+  && sizeof (int) == 4
>+  && bar (x) != 0x8181U)
>+__builtin_abort ();
>+  return 0;
>+}
>
>   Jakub




Re: [PING][PATCH] [AARCH64, NEON] Improve vcls(q?) vcnt(q?) and vld1(q?)_dup intrinsics

2014-11-18 Thread Yangfei (Felix)
> On 17 November 2014 06:58, Yangfei (Felix)  wrote:
> > PING?
> > BTW: It seems that Alan's way of improving vld1(q?)_dup intrinsic is more
> elegant.
> > So is the improvement of vcls(q?) vcnt(q?) OK for trunk?  Thanks.
> 
> Please rebase over Alan's patch and repost, thank you /Marcus


I rebased the patch on the latest trunk. 
Regtested for aarch64-linux-gnu with qemu. 
OK for the trunk? 


Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 217717)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,14 @@
+2014-11-13  Felix Yang  
+   Shanyao Chen  
+
+   * config/aarch64/aarch64-simd.md (clrsb2, popcount2): New
+   patterns.
+   * config/aarch64/aarch64-simd-builtins.def (clrsb, popcount): New
+   builtins.
+   * config/aarch64/arm_neon.h (vcls_s8, vcls_s16, vcls_s32, vclsq_s8,
+   vclsq_s16, vclsq_s32, vcnt_p8, vcnt_s8, vcnt_u8, vcntq_p8, vcntq_s8,
+   vcntq_u8): Rewrite using builtin functions.
+
 2014-11-18  Felix Yang  
 
* config/aarch64/aarch64.c (doloop_end): New pattern.
Index: gcc/config/aarch64/arm_neon.h
===
--- gcc/config/aarch64/arm_neon.h   (revision 217717)
+++ gcc/config/aarch64/arm_neon.h   (working copy)
@@ -5317,138 +5317,6 @@ vaddlvq_u32 (uint32x4_t a)
   return result;
 }
 
-__extension__ static __inline int8x8_t __attribute__ ((__always_inline__))
-vcls_s8 (int8x8_t a)
-{
-  int8x8_t result;
-  __asm__ ("cls %0.8b,%1.8b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int16x4_t __attribute__ ((__always_inline__))
-vcls_s16 (int16x4_t a)
-{
-  int16x4_t result;
-  __asm__ ("cls %0.4h,%1.4h"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int32x2_t __attribute__ ((__always_inline__))
-vcls_s32 (int32x2_t a)
-{
-  int32x2_t result;
-  __asm__ ("cls %0.2s,%1.2s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int8x16_t __attribute__ ((__always_inline__))
-vclsq_s8 (int8x16_t a)
-{
-  int8x16_t result;
-  __asm__ ("cls %0.16b,%1.16b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int16x8_t __attribute__ ((__always_inline__))
-vclsq_s16 (int16x8_t a)
-{
-  int16x8_t result;
-  __asm__ ("cls %0.8h,%1.8h"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int32x4_t __attribute__ ((__always_inline__))
-vclsq_s32 (int32x4_t a)
-{
-  int32x4_t result;
-  __asm__ ("cls %0.4s,%1.4s"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline poly8x8_t __attribute__ ((__always_inline__))
-vcnt_p8 (poly8x8_t a)
-{
-  poly8x8_t result;
-  __asm__ ("cnt %0.8b,%1.8b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int8x8_t __attribute__ ((__always_inline__))
-vcnt_s8 (int8x8_t a)
-{
-  int8x8_t result;
-  __asm__ ("cnt %0.8b,%1.8b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__))
-vcnt_u8 (uint8x8_t a)
-{
-  uint8x8_t result;
-  __asm__ ("cnt %0.8b,%1.8b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline poly8x16_t __attribute__ ((__always_inline__))
-vcntq_p8 (poly8x16_t a)
-{
-  poly8x16_t result;
-  __asm__ ("cnt %0.16b,%1.16b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int8x16_t __attribute__ ((__always_inline__))
-vcntq_s8 (int8x16_t a)
-{
-  int8x16_t result;
-  __asm__ ("cnt %0.16b,%1.16b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
-vcntq_u8 (uint8x16_t a)
-{
-  uint8x16_t result;
-  __asm__ ("cnt %0.16b,%1.16b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
 #define vcopyq_lane_f32(a, b, c, d) \
   __extension__ \
 ({  \
@@ -14082,6 +13950,44 @@ vcltzd_f64 (float64_t __a)
   return __a < 0.0 ? -1ll : 0ll;
 }
 
+/* vcls.  */
+
+__extension__ static __inline int8x8_t __attribute__ ((__always_inline__))
+vcls_s8 (int8x8_t __a)
+{
+  return __builtin_aarch64_clrsb

Re: [PATCH] gcc/c-family/c-cppbuiltin.c: Use 20 instead of 18 for the maximized 64-bits integer decimal string length

2014-11-18 Thread Chen Gang
On 11/19/14 11:03, Chen Gang wrote:
> On 11/19/14 0:44, Joseph Myers wrote:
>> On Sun, 16 Nov 2014, Chen Gang wrote:
>>
>>> The maximize 64-bits integer decimal string length excluding NUL is 20 (
>>> '-9223372036854775808'), so need use 20 instead of 18 for HOST_WIDE_INT.
>>>
>>> 2014-11-16  Chen Gang  
>>>
>>> * c-family/c-cppbuiltin.c (builtin_define_with_int_value):  Use
>>> 20 instead of 18 for the maximize 64-bits integer decimal
>>> string length
>>
>> OK.  (Though it's not a good idea to use builtin_define_with_int_value 
>> with large arguments, as it won't generate any suffixes for arguments 
>> outside the range of target int, and -9223372036854775808 would actually 
>> need to be expressed as (-9223372036854775807LL-1).  I think it's OK that 
>> it doesn't parenthesize negative numbers when outputting them - that the 
>> only cases for which the lack of parentheses could affect the parse are 
>> invalid for other reasons - though parentheses around negative numbers 
>> output might be a good idea anyway to make it obvious the output is OK, 
>> and would accord with how some macros such as __*_MIN_EXP__ are output; 
>> those values should probably use builtin_define_with_int_value.)
>>
> 
> OK, thanks, what you said sounds reasonable to me. We need '(' and ')'
> for negative members, and "LL" for the members which is larger than 32
> bits.
> 
> For me, for simplify thinking, can let all numbers have '(', ')' and
> "LL" (whether it is positive numbers or negative numbers, whether it is
> large numbers or small numbers), and one sprintf() is enough:
> 
>   sprintf(buf, "%s=("HOST_WIDE_INT_PRINT_DEC")LL", macro, value);

Oh, sorry, need be:

  sprintf(buf, "%s=("HOST_WIDE_INT_PRINT_DEC"LL)", macro, value);

> 
> And excuse me, I do not understand why use "(-9223372036854775807LL-1)"
> instead of "(-9223372036854775808LL)": compiler will report warning for
> it, but for me, it is more likely the compiler's own issue:
> 
>   bash-3.2# cat ./test.c 
>   #include 
>   
>   int main ()
>   {
> long long i = (-9223372036854775808LL);
> long long j = 0x8000LL;
> long long k = 0x7fffLL;
>   
> printf("\ni: %lld, j: %lld, k: %lld\n", i, j, k);
>   
> return 0;
>   }
>   bash-3.2# gcc -o test test.c
>   test.c:5:19: warning: integer constant is larger than the largest signed 
> integer type
> long long i = (-9223372036854775808LL);
> ^
>   1 warning generated.
>   bash-3.2# ./test 
>   
>   i: -9223372036854775808, j: -9223372036854775808, k: 9223372036854775807
> 
> 
> 
> Thanks
> 

-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed


Several small C++ constexpr PATCHes, including c++/63924 (constexpr and empty classes)

2014-11-18 Thread Jason Merrill
The first patch fixes 63924; I'm not sure whether or not it is really 
valid, but it's easy enough to allow "copying" a non-constant empty 
class object.


The second patch fixes GNU statement-expressions in 
constant-expressions.  This is mostly interesting statement-expressions 
are used internally by e.g. build_vec_init, and another patch I'm 
working on broke because they weren't supported properly.


The last two patches are basically reformatting.  The third moves the 
allow_non_const flag into constexpr_ctx, and the fourth replaces all the 
explicit NULL jump_target arguments with a default argument.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit b9482d7269f16f62657e480632e662d77477355d
Author: Jason Merrill 
Date:   Tue Nov 18 21:16:10 2014 -0500

	PR c++/63924
	* constexpr.c (cxx_eval_constant_expression) [PARM_DECL]: A load
	from a variable of empty class type is constant.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 1b330a0..1303fdc 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2845,6 +2845,12 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
 	r = *p;
   else if (addr)
 	/* Defer in case this is only used for its type.  */;
+  else if (is_empty_class (TREE_TYPE (t)))
+	{
+	  /* If the class is empty, we aren't actually loading anything.  */
+	  r = build_constructor (TREE_TYPE (t), NULL);
+	  TREE_CONSTANT (r) = true;
+	}
   else
 	{
 	  if (!ctx->quiet)
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-empty8.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-empty8.C
new file mode 100644
index 000..8c1414a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-empty8.C
@@ -0,0 +1,7 @@
+// PR c++/63924
+// { dg-do compile { target c++11 } }
+
+struct A { };
+constexpr bool f(A a) { return true; }
+template  constexpr bool g() { return true; }
+constexpr bool g(A a) { return g(); }
commit deec200eb93467dde2873fddc4ab30ffeb31003a
Author: Jason Merrill 
Date:   Tue Nov 18 12:14:36 2014 -0500

	* constexpr.c (cxx_eval_statement_list): Handle statement-expressions.
	(potential_constant_expression_1): Handle STMT_EXPR.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 4669586..1b330a0 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2685,6 +2685,14 @@ cxx_eval_statement_list (const constexpr_ctx *ctx, tree t,
 {
   tree_stmt_iterator i;
   tree_stmt_iterator default_label = tree_stmt_iterator();
+  tree local_target;
+  /* In a statement-expression we want to return the last value.  */
+  tree r = NULL_TREE;
+  if (!jump_target)
+{
+  local_target = NULL_TREE;
+  jump_target = &local_target;
+}
   for (i = tsi_start (t); !tsi_end_p (i); tsi_next (&i))
 {
 reenter:
@@ -2699,10 +2707,9 @@ cxx_eval_statement_list (const constexpr_ctx *ctx, tree t,
 	  else
 	continue;
 	}
-  cxx_eval_constant_expression (ctx, stmt,
-false,
-non_constant_p, overflow_p,
-jump_target);
+  r = cxx_eval_constant_expression (ctx, stmt, false,
+	non_constant_p, overflow_p,
+	jump_target);
   if (*non_constant_p)
 	break;
   if (returns (jump_target) || breaks (jump_target))
@@ -2714,7 +2721,7 @@ cxx_eval_statement_list (const constexpr_ctx *ctx, tree t,
   *jump_target = NULL_TREE;
   goto reenter;
 }
-  return NULL_TREE;
+  return r;
 }
 
 /* Evaluate a LOOP_EXPR for side-effects.  Handles break and return
@@ -3885,6 +3892,9 @@ potential_constant_expression_1 (tree t, bool want_rval, tsubst_flags_t flags)
 	return false;
   return true;
 
+case STMT_EXPR:
+  return potential_constant_expression_1 (STMT_EXPR_STMT (t), rval, flags);
+
 case LAMBDA_EXPR:
 case DYNAMIC_CAST_EXPR:
 case PSEUDO_DTOR_EXPR:
@@ -3900,7 +3910,6 @@ potential_constant_expression_1 (tree t, bool want_rval, tsubst_flags_t flags)
   /* GCC internal stuff.  */
 case VA_ARG_EXPR:
 case OBJ_TYPE_REF:
-case STMT_EXPR:
 case TRANSACTION_EXPR:
 case ASM_EXPR:
 fail:
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-stmtexpr2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-stmtexpr2.C
new file mode 100644
index 000..34ca9fa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-stmtexpr2.C
@@ -0,0 +1,12 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+#define SA(X) static_assert((X),#X)
+
+template 
+constexpr T f (T t)
+{
+  return ({ t+1; });
+}
+
+SA(f(42) == 43);
commit 38fb0e8ae0387b39c4ec825ebc086ae5e482cf2c
Author: Jason Merrill 
Date:   Tue Nov 18 09:31:54 2014 -0500

	* constexpr.c (struct constexpr_ctx): Add quiet field.
	(cxx_eval_outermost_constant_expr, is_sub_constant_expr): Set it.
	(lots): Replace allow_non_constant parameter with ctx->quiet.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 517bf23..7cdf649 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -36,7 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 static bool verify_constant (tree, 

Re: [PATCH] gcc/c-family/c-cppbuiltin.c: Use 20 instead of 18 for the maximized 64-bits integer decimal string length

2014-11-18 Thread Chen Gang
On 11/19/14 0:44, Joseph Myers wrote:
> On Sun, 16 Nov 2014, Chen Gang wrote:
> 
>> The maximize 64-bits integer decimal string length excluding NUL is 20 (
>> '-9223372036854775808'), so need use 20 instead of 18 for HOST_WIDE_INT.
>>
>> 2014-11-16  Chen Gang  
>>
>>  * c-family/c-cppbuiltin.c (builtin_define_with_int_value):  Use
>>  20 instead of 18 for the maximize 64-bits integer decimal
>>  string length
> 
> OK.  (Though it's not a good idea to use builtin_define_with_int_value 
> with large arguments, as it won't generate any suffixes for arguments 
> outside the range of target int, and -9223372036854775808 would actually 
> need to be expressed as (-9223372036854775807LL-1).  I think it's OK that 
> it doesn't parenthesize negative numbers when outputting them - that the 
> only cases for which the lack of parentheses could affect the parse are 
> invalid for other reasons - though parentheses around negative numbers 
> output might be a good idea anyway to make it obvious the output is OK, 
> and would accord with how some macros such as __*_MIN_EXP__ are output; 
> those values should probably use builtin_define_with_int_value.)
> 

OK, thanks, what you said sounds reasonable to me. We need '(' and ')'
for negative members, and "LL" for the members which is larger than 32
bits.

For me, for simplify thinking, can let all numbers have '(', ')' and
"LL" (whether it is positive numbers or negative numbers, whether it is
large numbers or small numbers), and one sprintf() is enough:

  sprintf(buf, "%s=("HOST_WIDE_INT_PRINT_DEC")LL", macro, value);

And excuse me, I do not understand why use "(-9223372036854775807LL-1)"
instead of "(-9223372036854775808LL)": compiler will report warning for
it, but for me, it is more likely the compiler's own issue:

  bash-3.2# cat ./test.c 
  #include 
  
  int main ()
  {
long long i = (-9223372036854775808LL);
long long j = 0x8000LL;
long long k = 0x7fffLL;
  
printf("\ni: %lld, j: %lld, k: %lld\n", i, j, k);
  
return 0;
  }
  bash-3.2# gcc -o test test.c
  test.c:5:19: warning: integer constant is larger than the largest signed 
integer type
long long i = (-9223372036854775808LL);
^
  1 warning generated.
  bash-3.2# ./test 
  
  i: -9223372036854775808, j: -9223372036854775808, k: 9223372036854775807



Thanks
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed


[patch, arm] Minor optimization on thumb2 tail call

2014-11-18 Thread Joey Ye
Current thumb2 -Os generates suboptimal code for following tail call case:

int f4(int b, int a, int c, int d);
int g(int a, int b, int c, int d)
{ return f4(b, a, c, d); }

arm-none-eabi-gcc -Os -mthumb -mcpu=cortex-m3 test.c

push
{r4, lr}
mov r4, r1
mov r1, r0
mov r0, r4
pop {r4, lr}

b f4

There are two issues: The first one is that saving/restoring lr is not
necessary, as there is no return via pop pc. The second one is that even if
we managed to avoid lr push/pop, ldmia.w sp!, {r4} is still emitted as there
is a missing pattern for pop single and code size is not optimal.

This patch fixes these two issues and introduces a shared test case. CSiBE
thumb2 -Os shows cross board code size reduction, except for one case with 4
bytes regression. The case is like:

void f ()
{
   if ()
 ...
   else if ()
 ...
   else g();
}

There are N=2 non-sibcall returns and S=1 sibcall return. Originally the
non-sibcall returns are just pop {r4, r5, pc}, now they become
  b.n  .Lreturn

.Lreturn:
  pop {r4, r5}
  bx lr

The one byte save from sibcall return does not win the non-sibcall return
regressions back. In general scenario, number of N non-sibcall returns use
b.n branching to merged tail, number of S sibcalls save 2 bytes by avoid
poping lr. It results in 4-2*S bytes regression. In the worst scenario, each
non-sibcall return has to use b.w branching to merged tail, resulting in
(N-S)*2 bytes regression. The worst scenario is rare, according to CSiBE.
The general regression scenario can only regress 2 bytes at most. So I would
not introduce additional complexity to handle the regression case.

Make check cortex-m3: pass
thumb2 bootstrap (O2/Os): pass

* config/arm/arm.c (arm_compute_save_reg_mask):
Do not save lr in case of tail call.
* config/arm/thumb2.md (*thumb2_pop_single): New pattern.

* gcc.target/arm/thumb2-pop-single.c: New test.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4f04707..20d0b9e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19190,6 +19190,7 @@ arm_compute_save_reg_mask (void)
   || (save_reg_mask
  && optimize_size
  && ARM_FUNC_TYPE (func_type) == ARM_FT_NORMAL
+ && !crtl->tail_call_emit
  && !crtl->calls_eh_return))
 save_reg_mask |= 1 << LR_REGNUM;
 
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 64acfea..29cfb17 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -267,6 +267,17 @@
(set_attr "type" "multiple")]
 )
 
+;; Pop a single register as its size is preferred over a post-incremental
load
+(define_insn "*thumb2_pop_single"
+  [(set (match_operand:SI 0 "low_register_operand" "=r")
+(mem:SI (post_inc:SI (reg:SI SP_REGNUM]
+  "TARGET_THUMB2 && (reload_in_progress || reload_completed)"
+  "pop\t{%0}"
+  [(set_attr "type" "load1")
+   (set_attr "length" "2")
+   (set_attr "predicable" "yes")]
+)
+
 ;; We have two alternatives here for memory loads (and similarly for
stores)
 ;; to reflect the fact that the permissible constant pool ranges differ
 ;; between ldr instructions taking low regs and ldr instructions taking
high





RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-18 Thread Maciej W. Rozycki
On Tue, 18 Nov 2014, Matthew Fortune wrote:

> > >  With a quick look at `mips_process_sync_loop' it looks to me the
> > > other NOP is produced from here:
> > >
> > >   else if (!(required_oldval && cmp))
> > > mips_multi_add_insn ("nop", NULL);
> > >
> > > -- correct?  If so, then can't you just make it:
> > 
> > Correct.
> > 
> > >
> > >   else if (!(required_oldval && cmp) && !TARGET_FIX_R1)
> > > mips_multi_add_insn ("nop", NULL);
> > >
> > > instead?  It appears plain and simple, and if you're concerned about
> > > it being unobvious to the casual reader, then add a one-line comment
> > > or suchlike.  It's not like TARGET_FIX_R1 is going to imply
> > > anything else about branches in the future (and r6 code won't run on
> > > an R10k anyway).
> 
> I'm afraid I disagree about it being plain and simple even with a comment.
> The amount of code in the backend that makes assumptions based on derived
> variables is very high and it makes the code hard to read (especially w.r.t.
> branches).  I'm OK with adding the code to avoid the extra NOP but have it
> based on mips_branch_likely and fix the callers so that it is set
> appropriately.

 Sure, fine with me too.  It looks to me it'll be more natural even to 
have `mips_branch_likely' preset on entering `mips_process_sync_loop'.

  Maciej


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Mike Stump
On Nov 18, 2014, at 3:42 PM, Jakub Jelinek  wrote:
> No, I'm not touching tmp array at all in that case, only look at vp
> individual bytes.  Either they are all 0, or all 0xff, or I return NULL.

Oh, sorry, I misread where the break; in your code was going.  I might have 
been misled by:

> - gcc_assert (GET_MODE_PRECISION (outer_submode)
> - <= MAX_BITSIZE_MODE_ANY_INT);

in your patch.  You don’t need that anymore, do you?  If not, can you remove 
this part.

The rest looks like normal rtl/vector code, I don’t see anything wrong with it.

Re: [PATCH 4/4] Data structure is used for inline_summary struct.

2014-11-18 Thread Trevor Saunders
On Tue, Nov 18, 2014 at 07:59:26PM +0100, Jan Hubicka wrote:
> > Hi,
> > 
> > On Tue, Nov 18, 2014 at 04:39:00PM +0100, Jan Hubicka wrote:
> > > > On Fri, Nov 14, 2014 at 08:59:10PM +0100, Jan Hubicka wrote:
> > > > > > > 
> > > > > > > > b) with GTY, we cannot call destructor
> > > > > > > 
> > > > > > > Everything in symbol table is expecitely memory managed (i.e. 
> > > > > > > enver left
> > > > > > > to be freed by garbage collector). It resists in GTY only to 
> > > > > > > allow linking
> > > > > > > garbage collected object from them and to get PCH working.
> > > > > > > 
> > > > > > 
> > > > > > Well, if I understand the intent correctly, summaries are for stuff
> > > > > > that is not in the symbol table.  For example jump functions are a
> > > > > Correct.
> > > > > > vector of structures possibly containing trees, so everything has to
> > > > > > be in garbage collected memory.
> > > > > > 
> > > > > > When an edge is removed, it is necessary to be notified about it
> > > > > > immediately, for example to decrement rdesc_refcount (you might 
> > > > > > argue
> > > > > > that that should be done in a separate hook and not from within a
> > > > > > summary class but then you start to rely on hook invocation ordering
> > > > > > so I think it is better to eventually use the summaries for it too).
> > > > > 
> > > > > I do not see why ctors/dtors can not do the reference counting. In 
> > > > > fact
> > > > > this is how refcounting is done usually anyway?
> > > > > 
> > > > 
> > > > Well, when there is no garbage collection involved then yes, that is
> > > > how you normally do it but in the GC case, there is the question of
> > > > what is the appropriate time to call destructor on garbage collected
> > > > data (like jump functions)?
> > > 
> > > I still fail to see problem here.  Summaries are explicitly managed- they 
> > > are
> > > constructed at summary construction time or when new callgarph node is
> > > introduced/duplicated.  They are destroyed when callgarph node is 
> > > destroyed or
> > > whole summary is ddestroyed.  It is job of the summary datastructure to 
> > > call
> > > proper ctors/dtors, not job of garbage collector that provides the 
> > > underlying
> > > memory management.
> > 
> > I do not think that all summaries (in the meaning of a description of
> > one particular symbol table node or call graph edge) are explicitely
> > managed.  For example ipa_edge_args or ipa_agg_replacement_value
> > (which my alignment patch changes to ipcp_transformation_summary) are
> > allocated in GC memory because they contain trees.
> > 
> > > 
> > > If you have datastructure that points to something that is not
> > > explicitly managed (i.e. tree expression), you just can not have
> > > non-trivial constructor on that datastructure, because that is freed
> > > transparently by gty that don't do destruction...
> > 
> > I admit to not being particularly bright today but that seems to be
> > exactly my point.
> 
> Well, in your case you have datastructure jump_function that contain a pointer
> to tree (EXPR).  What I am trying to explain is that I see no reson why
> jump_function needs to be POD. The tree pointed to by EXPR pointer can not
> have a dtor by itself because GGC will not call it upon freeing.

ggc does call dtors when it is about to sweep an object.  however if you
explicitly call ggc_free on an object with a dtor you need to call the
dtor manually I believe.

> It is true that jump_function lives in GGC memory (to make pointer to expr
> work) but it never gets removed by ggc_collect because it is always pointed to
> by the summary datastructure.  There are two ways to free the jump_function
> datastructure.

assuming it doesn't need to go into pch, and I would really think it
never does there's no real reason it has to live on the heap, you could
iterate over all the summaries and mark all the trees they point at
"manually".

>   1) removing the symbol node it is attached to.
>  Here the symtab code will call removal hook that was registered by 
> container
>  template. The container will call destructor of jump_function and the 
> ggc_free
>  its memory
>   2) removing the summary.  In this case I would again expect the container
>  template to walk all summaries and free them.
> 
> So even if your structure lives in GGC memory it is not really garbage
> collected and thus the lack of machinery to call dtors at a time ggc decides 
> to
> free something is not a problem?
> 
> In fact looking at struct default_hashmap_traits, I see:
> 
>   /* Called to dispose of the key and value before marking the entry as
>  deleted.  */
> 
>   template static void remove (T &v) { v.~T (); }
> 
> This trait gets called by the underlying hash table so if you have explicitly
> managed hashmap (in GGC memory or not), things just work.  Only catch is that
> if you let your hashmap to be garbage collected, then your dtor is not called.

actually I think it should be, ggc tra

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Jakub Jelinek
On Tue, Nov 18, 2014 at 03:30:56PM -0800, Mike Stump wrote:
> > Before wide-int got merged, the testcase worked, though the code didn't
> > bother checking anything, just created 0 or constm1_rtx for the two cases
> > that could happen and if something else appeared, could just return what
> > matched low TImode (or DImode for -m32).
> 
> tmp is sized for MAX_BITSIZE_MODE_ANY_INT, but, you remove the limiter for
> units that keeps u in bounds.  Doesn’t this access 32 bytes of OImode
> values in a 16 byte data structure?

No, I'm not touching tmp array at all in that case, only look at vp
individual bytes.  Either they are all 0, or all 0xff, or I return NULL.

> and GET_MODE_BITSIZE (outer_submode) is > MAX_BITSIZE_MODE_ANY_INT, right?
> 
> You can’t copy more bytes than the size of the destination has?

On the testcase we have
(subreg:OI (reg:V8SI 1234) 0)
and try to propagate
(const_vector:V8SI [8 x (const_int 0)])
into it.  All zeros even in OImode (or XImode) is still
(const_int 0), all ones even in those modes is (const_int -1),
which are the only constants I'm using for those ultra-wide modes,
anything else will return NULL (but not ICE).

Jakub


Re: [PATCH] Make IPA-CP propagate alignment information of pointers

2014-11-18 Thread Martin Jambor
Hi,

On Mon, Nov 17, 2014 at 01:05:23PM +0100, Richard Biener wrote:
> On Sat, Nov 15, 2014 at 2:04 AM, Martin Jambor  wrote:
> > Hi,
> >
> > this patch adds very simple propagation of alignment of pointers to
> > IPA-CP.  Because I have not attempted to estimate profitability of
> > such propagation in any way, it does not do any cloning, just
> > propagation when the alignment is known and the same in all contexts.
> >
> > I have seen this shrinking code in a few vectorization testcases in
> > our testsuite, especially if run with LTO.
> >
> > The patch is slightly bigger because the streaming of transformation
> > summaries had to be modified (and many parts renamed) a little bit to
> > also hold the resultant alignment information.
> >
> > In the future we may replace this by a more fancy VRP-based IPA-CP but
> > I think it is worth having this simple addition in 5.0.  Bootstrapped
> > and tested on x86_64-linux.  OK for trunk?
> >

Apart from changes suggested by Richi (more on that below), I have
renamed the field bitpos of ipa_alignment to misalign, which is how it
is called in ptr_info_def in tree-ssanames.h, because it is really the
offset in bytes, not bits.

> > Index: src/gcc/ipa-cp.c
> > ===
> > --- src.orig/gcc/ipa-cp.c   2014-11-14 22:06:06.240031030 +0100
> > +++ src/gcc/ipa-cp.c2014-11-15 01:40:38.260521577 +0100
> > @@ -1372,6 +1405,80 @@ propagate_context_accross_jump_function
> >return ret;
> >  }
> >
> > +/* Propagate alignments accross jump function JFUNC that is associated with
> > +   edge CS and update DEST_LAT accordingly.  */
> > +
> > +static bool
> > +propagate_alignment_accross_jump_function (struct cgraph_edge *cs,
> > +  struct ipa_jump_func *jfunc,
> > +  struct ipcp_param_lattices 
> > *dest_lat)
> > +{
> > +  if (alignment_bottom_p (dest_lat))
> > +return false;
> > +
> > +  ipa_alignment cur;
> > +  cur.known = false;
> > +  if (jfunc->alignment.known)
> > +cur = jfunc->alignment;
> > +  else if (jfunc->type == IPA_JF_PASS_THROUGH
> > +  || jfunc->type == IPA_JF_ANCESTOR)
> > +{
> > +  struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
> > +  struct ipcp_param_lattices *src_lats;
> > +  HOST_WIDE_INT offset = 0;
> > +  int src_idx;
> > +
> > +  if (jfunc->type == IPA_JF_PASS_THROUGH)
> > +   {
> > + enum tree_code op = ipa_get_jf_pass_through_operation (jfunc);
> > + if (op != NOP_EXPR)
> > +   {
> > + if (op != POINTER_PLUS_EXPR
> > + && op != PLUS_EXPR
> > + && op != MINUS_EXPR)
> > +   goto prop_fail;
> > + tree operand = ipa_get_jf_pass_through_operand (jfunc);
> > + if (!tree_fits_shwi_p (operand))
> > +   goto prop_fail;
> > + offset = tree_to_shwi (operand);
> > +   }
> > + src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
> > +   }
> > +  else
> > +   {
> > + src_idx = ipa_get_jf_ancestor_formal_id (jfunc);
> > + offset = ipa_get_jf_ancestor_offset (jfunc);
> > +   }
> > +
> > +  src_lats = ipa_get_parm_lattices (caller_info, src_idx);
> > +  if (!src_lats->alignment.known
> > + || alignment_bottom_p (src_lats))
> > +   goto prop_fail;
> > +
> > +  cur = src_lats->alignment;
> > +
> > +  if (offset != 0
> > + && (offset % cur.align != 0))
> > +   goto prop_fail;
> 
> why fail?  shouldn't you simply add offset to cur.bitpos?
> 

Right, I have changed this whole condition to simple

  cur.misalign = (cur.misalign + offset) % cur.align;

> > +}
> > +
> > +  if (cur.known)
> > +{
> > +  if (!dest_lat->alignment.known)
> > +   {
> > + dest_lat->alignment = cur;
> > + return true;
> > +   }
> > +  else if (dest_lat->alignment.align == cur.align
> > +  && dest_lat->alignment.bitpos == cur.bitpos)
> > +   return false;
> > +}
> > +
> > + prop_fail:
> > +  set_alignment_to_bottom (dest_lat);
> > +  return true;
> > +}
> > +
> >  /* If DEST_PLATS already has aggregate items, check that aggs_by_ref 
> > matches
> > NEW_AGGS_BY_REF and if not, mark all aggs as bottoms and return true 
> > (in all
> > other cases, return false).  If there are no aggregate items, set
> > Index: src/gcc/ipa-prop.c
> > ===
> > --- src.orig/gcc/ipa-prop.c 2014-11-14 22:06:06.240031030 +0100
> > +++ src/gcc/ipa-prop.c  2014-11-15 01:40:05.564520331 +0100
> > @@ -1716,6 +1734,24 @@ ipa_compute_jump_functions_for_edge (str
> > useful_context = true;
> > }
> >
> > +  if (POINTER_TYPE_P (TREE_TYPE(arg)))
> > +   {
> > + unsigned HOST_WIDE_INT hwi_bitpos;
> > + unsigned align;
> > +
> > + 

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Mike Stump
On Nov 18, 2014, at 1:52 PM, Jakub Jelinek  wrote:
> OImode/XImode on i?86/x86_64 are not <= MAX_BITSIZE_MODE_ANY_INT, because
> they are never used for integer arithmetics (and there is no way
> to represent all their values in RTL if not using CONST_WIDE_INT).
> As the following testcase shows, simplify_immed_subreg can be called
> with such modes though, e.g. trying to forward propagate a CONST_VECTOR
> (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
> in the IL directly) into a SUBREG_REG.
> The following patch instead of ICE handles the most common cases (all 0
> and all 1 CONST_VECTORs) and returns NULL otherwise.
> 
> Before wide-int got merged, the testcase worked, though the code didn't
> bother checking anything, just created 0 or constm1_rtx for the two cases
> that could happen and if something else appeared, could just return what
> matched low TImode (or DImode for -m32).

tmp is sized for MAX_BITSIZE_MODE_ANY_INT, but, you remove the limiter for 
units that keeps u in bounds.  Doesn’t this access 32 bytes of OImode values in 
a 16 byte data structure?

Next, from_arrary uses a wide_int, and this from the documentation applies:

   All three flavors of wide_int are represented as a vector of 
 
   HOST_WIDE_INTs.  The default and widest_int vectors contain enough elements  
 
   to hold a value of MAX_BITSIZE_MODE_ANY_INT bits.  offset_int contains only  

   enough elements to hold ADDR_MAX_PRECISION bits.  The values are stored  

   in the vector with the least significant HOST_BITS_PER_WIDE_INT bits 
 
   in element 0.
 

If you look at the code to from_arrary:

wide_int_storage::from_array (const HOST_WIDE_INT *val, unsigned int len,
  unsigned int precision, bool need_canon_p)
{
  wide_int result = wide_int::create (precision);
  result.set_len (wi::from_array (result.write_val (), val, len, precision,
  need_canon_p));


unsigned int
wi::from_array (HOST_WIDE_INT *val, const HOST_WIDE_INT *xval,
unsigned int xlen, unsigned int precision, bool need_canon)
{
  for (unsigned i = 0; i < xlen; i++)
val[i] = xval[i];

it just does a blind copy of all the xlen hunks which forms from units, and 
units is:

int units
  = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1)
  / HOST_BITS_PER_WIDE_INT;

and GET_MODE_BITSIZE (outer_submode) is > MAX_BITSIZE_MODE_ANY_INT, right?

You can’t copy more bytes than the size of the destination has?



Re: [PATCH, PR62167] Fix tail-merge pass for dead type-unsafe code

2014-11-18 Thread Jeff Law

On 11/18/14 09:57, Tom de Vries wrote:

Richard,

this (trunk) patch fixes PR62167.

The patch fixes a problem that triggers with the test-case on the 4.8
branch, when tail-merge makes a dead type-unsafe load alive.

I'm not able to reproduce this bug on 4.9 and trunk with the same
test-case. On those branches, the tail-merge already does not happen.

The reason for the difference is as follows: With 4.8 the two phi
arguments of the phi in the tail block are value-numbered identically:
...
SCC consists of: p_14
Value numbering p_14 stmt = p_14 = MEM[(struct head *)_13].first;
Setting value number of p_14 to p_14 (changed)

SCC consists of: p_15
Value numbering p_15 stmt = p_15 = _13->next;
Setting value number of p_15 to p_14 (changed)
...

With 4.9 (and trunk), that's not the case:
...
SCC consists of: p_14
Value numbering p_14 stmt = p_14 = MEM[(struct head *)&heads][k.1_9].first;
Setting value number of p_14 to p_14 (changed)

SCC consists of: p_15
Value numbering p_15 stmt = p_15 = _13->next;
Setting value number of p_15 to p_15 (changed)
...

I'm not sure the bug triggers on trunk and 4.9, but I see no reason why
it could not trigger, so I'd prefer to apply the patch to 4.9 and trunk
as well.

The patch introduces an xfail for pr51879-12.c. I can follow up with a
patch to improve upon that, but I think that's better limited to trunk
only.

Bootstrapped and reg-tested on x86_64/trunk.
So instead of simply disabling for anything with virtual operands, 
shouldn't instead we be comparing the virtual operands and alias 
information?  Seems to me that if we did that properly, this would "just 
work".  Right?



jeff



Re: Audit some ipa passes for optimization attribute

2014-11-18 Thread Martin Jambor
On Tue, Nov 18, 2014 at 09:48:52PM +0100, Jan Hubicka wrote:
> Hi,
> this patch goes through most of ipa passes: ipa-devirt, ipa-cp, 
> ipa-pure-const,
> ipa-profile and ipa-inline and audits them for opt_for_fn.
> I did not converted yet ipa-reference because the code is organized in a way
> making it bit difficult to hook the tests in and ipa-icf that I want to do
> separately.
> 
> The patch also enables ipa inehritance graph building by default.  It makes it
> easy to support -fdevirtualize functoins within -fno-devirtualize unit and it
> is not hard to do so.
> 
> Bootstrapped/regtested x86_64-linux and tested on firefox.  I will try to 
> find time
> to design testcases that passes can be randomly turned off on per function 
> basis,
> I would definitely welcome a help with this task - there are quite few options
> to test and I am also lagging behind with ipa-cp devirt testcases.
> 
> Martin: I disabled sanity check in ipa_value_from_jfunc, 
> ipa_context_from_jfunc
> because I do not have function context handy.  If you think it is worth to 
> keep
> it, I will pass the stuff around.

No, I don't think they are useful enough to bother.

Thanks,

Martin


[PATCH] Updates ssa and inline summary in the correct location for AutoFDO

2014-11-18 Thread Dehao Chen
This patch updates ssa and inline summary in the correct location for AutoFDO.

Bootstrapped and passed regression test. OK for trunk?

Thanks,
Dehao

gcc/ChangeLog:
2014-11-18  Dehao Chen  

* auto-profile.c (afdo_annotate_cfg): Invoke update_ssa in the right
place.
(auto_profile): Recompute inline summary after processing cgraph node.

Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 217741)
+++ gcc/auto-profile.c (working copy)
@@ -1514,7 +1514,14 @@ afdo_annotate_cfg (const stmt_set &promoted_stmts)
   profile_status_for_fn (cfun) = PROFILE_READ;
 }
   if (flag_value_profile_transformations)
-gimple_value_profile_transformations ();
+{
+  gimple_value_profile_transformations ();
+  free_dominance_info (CDI_DOMINATORS);
+  free_dominance_info (CDI_POST_DOMINATORS);
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+  calculate_dominance_info (CDI_DOMINATORS);
+  update_ssa (TODO_update_ssa);
+}
 }

 /* Wrapper function to invoke early inliner.  */
@@ -1592,7 +1599,6 @@ auto_profile (void)
 early_inline ();
 autofdo::afdo_annotate_cfg (promoted_stmts);
 compute_function_frequency ();
-update_ssa (TODO_update_ssa);

 /* Local pure-const may imply need to fixup the cfg.  */
 if (execute_fixup_cfg () & TODO_cleanup_cfg)
@@ -1601,6 +1607,7 @@ auto_profile (void)
 free_dominance_info (CDI_DOMINATORS);
 free_dominance_info (CDI_POST_DOMINATORS);
 cgraph_edge::rebuild_edges ();
+compute_inline_parameters (cgraph_node::get (current_function_decl), true);
 pop_cfun ();
   }


Re: [PATCH] Simple improvement for predicate computation in if-convert phase.

2014-11-18 Thread H.J. Lu
On Fri, Oct 17, 2014 at 6:08 AM, Yuri Rumyantsev  wrote:
> Jeff,
>
> I prepared another patch that includes test-case as you requested.
>
> Below are answers on your questions.
>
>> First, for the benefit of anyone trying to understand what you're doing, 
>> defining what "cd equivalent" means would be >helpful.
>
> I added the following  comment to function:
>
>fwe call basic blocks bb1 and bb2
>cd-equivalent if they are executed under the same condition.
>
>
> Is it sufficient?
>
>>So, do you have a case where the dominated_by_p test above is true and 
>>is_predicated(bb) returns true as well?  I think this >part of the change is 
>>largely responsible for the hack you're doing with having the function scoped 
>>static variable join_bb.
>
> I don't have such test-case and I assume that if bb is always
> executed, it is not predicated.
>
> I also deleted "join_bb" in my changes.
>
>
> Is it OK for trunk now.
>
> Thanks.
> Yuri.
>
> 2014-10-17  Yuri Rumyantsev  
> gcc/ChangeLog
>
> * tree-if-conv.c (add_to_predicate_list): Check unconditionally
> that bb is always executed to early exit. Use predicate of
> cd-equivalent block for join blocks if it exists.
> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree.
> (tree_if_conversion): Free post-dominance information.
>
> gcc/testsuite/ChangeLog
>
> * gcc/dg/tree-ssa/ifc-cd.c: New test.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63941


-- 
H.J.


Re: [PATCH 4/4] Data structure is used for inline_summary struct.

2014-11-18 Thread Martin Jambor
On Tue, Nov 18, 2014 at 07:59:26PM +0100, Jan Hubicka wrote:
> > Hi,
> > 
> > On Tue, Nov 18, 2014 at 04:39:00PM +0100, Jan Hubicka wrote:
> > > > On Fri, Nov 14, 2014 at 08:59:10PM +0100, Jan Hubicka wrote:
> > > > > > > 
> > > > > > > > b) with GTY, we cannot call destructor
> > > > > > > 
> > > > > > > Everything in symbol table is expecitely memory managed (i.e. 
> > > > > > > enver left
> > > > > > > to be freed by garbage collector). It resists in GTY only to 
> > > > > > > allow linking
> > > > > > > garbage collected object from them and to get PCH working.
> > > > > > > 
> > > > > > 
> > > > > > Well, if I understand the intent correctly, summaries are for stuff
> > > > > > that is not in the symbol table.  For example jump functions are a
> > > > > Correct.
> > > > > > vector of structures possibly containing trees, so everything has to
> > > > > > be in garbage collected memory.
> > > > > > 
> > > > > > When an edge is removed, it is necessary to be notified about it
> > > > > > immediately, for example to decrement rdesc_refcount (you might 
> > > > > > argue
> > > > > > that that should be done in a separate hook and not from within a
> > > > > > summary class but then you start to rely on hook invocation ordering
> > > > > > so I think it is better to eventually use the summaries for it too).
> > > > > 
> > > > > I do not see why ctors/dtors can not do the reference counting. In 
> > > > > fact
> > > > > this is how refcounting is done usually anyway?
> > > > > 
> > > > 
> > > > Well, when there is no garbage collection involved then yes, that is
> > > > how you normally do it but in the GC case, there is the question of
> > > > what is the appropriate time to call destructor on garbage collected
> > > > data (like jump functions)?
> > > 
> > > I still fail to see problem here.  Summaries are explicitly managed- they 
> > > are
> > > constructed at summary construction time or when new callgarph node is
> > > introduced/duplicated.  They are destroyed when callgarph node is 
> > > destroyed or
> > > whole summary is ddestroyed.  It is job of the summary datastructure to 
> > > call
> > > proper ctors/dtors, not job of garbage collector that provides the 
> > > underlying
> > > memory management.
> > 
> > I do not think that all summaries (in the meaning of a description of
> > one particular symbol table node or call graph edge) are explicitely
> > managed.  For example ipa_edge_args or ipa_agg_replacement_value
> > (which my alignment patch changes to ipcp_transformation_summary) are
> > allocated in GC memory because they contain trees.
> > 
> > > 
> > > If you have datastructure that points to something that is not
> > > explicitly managed (i.e. tree expression), you just can not have
> > > non-trivial constructor on that datastructure, because that is freed
> > > transparently by gty that don't do destruction...
> > 
> > I admit to not being particularly bright today but that seems to be
> > exactly my point.
> 
> Well, in your case you have datastructure jump_function that contain a pointer
> to tree (EXPR).  What I am trying to explain is that I see no reson why
> jump_function needs to be POD.

I never said that the summary object needs to be a POD, I only said I
liked the possibility of storing very simple objects (without wrapping
them in classes with constructors and destructors).  That is of course
nothing more than my personal preference.

> The tree pointed to by EXPR pointer can not
> have a dtor by itself because GGC will not call it upon freeing.
> 
> It is true that jump_function lives in GGC memory (to make pointer to expr
> work) but it never gets removed by ggc_collect because it is always pointed to
> by the summary datastructure.  There are two ways to free the jump_function
> datastructure.
>   1) removing the symbol node it is attached to.
>  Here the symtab code will call removal hook that was registered by 
> container
>  template. The container will call destructor of jump_function and the 
> ggc_free
>  its memory
>   2) removing the summary.  In this case I would again expect the container
>  template to walk all summaries and free them.
> 
> So even if your structure lives in GGC memory it is not really garbage
> collected and thus the lack of machinery to call dtors at a time ggc decides 
> to
> free something is not a problem?
> 
> In fact looking at struct default_hashmap_traits, I see:
> 
>   /* Called to dispose of the key and value before marking the entry as
>  deleted.  */
> 
>   template static void remove (T &v) { v.~T (); }

Now I see, I should have read your previous email more carefully, by
explicitely managed you mean that destructors will be called
explicitely by the summary infrastructure.  I was wondering how you
wanted to rip the summaries out of GGC memory.

Well, I suppose that would work, and since explicit calls to
destructors are basically the counterpart of placement new that we
already plan to use, it might be actually

Re: [Patch] Improving jump-thread pass for PR 54742

2014-11-18 Thread Sebastian Pop
Richard Biener wrote:
> On Tue, Nov 11, 2014 at 2:14 AM, Sebastian Pop  wrote:
> > Hi Jeff,
> >
> > I have adapted the code generation part from James' patch to current trunk, 
> > and
> > the resulting patch gets the 30% speedup on coremark and passes bootstrap 
> > of GCC.
> >
> > Ok for trunk?
> 
> In addition to missing documentation for the parameters
> 
> +  /* If we are copying an abnormally shaped region, keep track of
> +which block will become our loop header.  */
> +  if ((region[i] != entry->dest && region[i] == loop->header)
> + || (region[i] != entry->src && region[i] == loop->latch))
> +   {
> + save_loop_details = true;
> + memo_loop_latch_no = i;
> + memo_loop_header_no = i;
> +   }
> 
> this looks bogus as you overwrite latch/header.  

Right, this should be:

if (region[i] != entry->dest && region[i] == loop->header)
  {
save_loop_details = true;
memo_loop_header_no = i;
  }

if (region[i] != entry->src && region[i] == loop->latch)
  {
save_loop_details = true;
memo_loop_latch_no = i;
  }


> I wonder what you
> tried to fix with this as "abnormally shaped" isn't sth we support
> given the check before (all blocks must belong to the same loop
> and thus entry is always the loop header or there is no loop)?
> 

The regions that we duplicate start inside a loop and stay inside the same loop,
and the jump threading path is not allowed to go in deeper nested loops.

The reason why we need to modify the sese duplication function is that the sese
region that we need to duplicate starts at an arbitrary place inside the loop,
whereas the current user of the sese duplication function tree-ssa-loop-ch.c:245
starts at the edge entering the loop and exits at the latch edge.

> I'll leave the rest to Jeff but it looks good to me from an overall
> structure.
> 

Thanks for your review.

Sebastian

PS: Patch passed bootstrap and regtest on x86_64-linux.

PS: I have run some perf analysis with the patch:
- on a bootstrap of GCC I see 3209 FSM jump threads
- libpng and libjpeg contain FSM jump threads, the perf increase is in the 1%
  (measured on simulators and reduced data sets)
- coremark gets jump threaded (as expected)
- I'm setting up the llvm test-suite and I will report perf differences
>From aee8e01469c05e370b757b20c357a1c9dae57950 Mon Sep 17 00:00:00 2001
From: Sebastian Pop 
Date: Fri, 26 Sep 2014 14:54:20 -0500
Subject: [PATCH] extend jump thread for finite state automata PR 54742

Adapted from a patch from James Greenhalgh.

	* params.def (max-fsm-thread-path-insns, max-fsm-thread-length,
	max-fsm-thread-paths): New.

	* doc/invoke.texi (max-fsm-thread-path-insns, max-fsm-thread-length,
	max-fsm-thread-paths): Documented.

	* testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c: New.

	* tree-cfg.c (gimple_duplicate_sese_region): Save and restore loop
	header and latch.

	* tree-ssa-threadedge.c (simplify_control_stmt_condition): Restore the
	original value of cond when simplification fails.
	(fsm_find_thread_path): New.
	(fsm_find_control_statement_thread_paths): New.
	(fsm_thread_through_normal_block): Call find_control_statement_thread_paths.

	* tree-ssa-threadupdate.c (dump_jump_thread_path): Pretty print
	EDGE_START_FSM_THREAD.
	(thread_through_all_blocks): Generate code for EDGE_START_FSM_THREAD edges
	calling gimple_duplicate_sese_region.

	* tree-ssa-threadupdate.h (jump_thread_edge_type): Add EDGE_START_FSM_THREAD.
---
 gcc/doc/invoke.texi  |  12 ++
 gcc/params.def   |  15 ++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c |  38 +
 gcc/tree-cfg.c   |  26 ++-
 gcc/tree-ssa-threadedge.c| 201 ++-
 gcc/tree-ssa-threadupdate.c  |  52 +-
 gcc/tree-ssa-threadupdate.h  |   1 +
 7 files changed, 340 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 13270bc..613edbb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10586,6 +10586,18 @@ large and significantly increase compile time at optimization level
 @option{-O1} and higher.  This parameter is a maximum nubmer of statements
 in a single generated constructor.  Default value is 5000.
 
+@item max-fsm-thread-path-insns
+Maximum number of instructions to copy when duplicating blocks on a
+finite state automaton jump thread path.  The default is 100.
+
+@item max-fsm-thread-length
+Maximum number of basic blocks on a finite state automaton jump thread
+path.  The default is 10.
+
+@item max-fsm-thread-paths
+Maximum number of new jump thread paths to create for a finite state
+automaton.  The default is 50.
+
 @end table
 @end table
 
diff --git a/gcc/params.def b/gcc/params.def
index d2d2add..55c287e 100644
--- a/gcc/params.def
+++ b/g

Re: [Patch] pr63937: TARGET_USE_BY_PIECES_INFRASTRUCTURE_P should take an unsigned HOST_WIDE_INT size argument

2014-11-18 Thread Jakub Jelinek
On Tue, Nov 18, 2014 at 09:57:37PM +, James Greenhalgh wrote:
> 2014-11-18  James Greenhalgh  
> 
>   PR target/63937
>   * target.def (use_by_pieces_infrastructure_p): Take unsigned
>   HOST_WIDE_INT as the size parameter.
>   * targhooks.c (default_use_by_pieces_infrastructure_p): Likewise.
>   * targhooks.h (default_use_by_pieces_infrastructure_p): Likewise.
>   * config/arc/arc.c (arc_use_by_pieces_infrastructure_p)): Likewise.
>   * config/mips/mips.c (mips_use_by_pieces_infrastructure_p)): Likewise.
>   * config/s390/s390.c (s390_use_by_pieces_infrastructure_p)): Likewise.
>   * config/sh/sh.c (sh_use_by_pieces_infrastructure_p)): Likewise.
>   * config/aarch64/aarch64.c
>   (aarch64_use_by_pieces_infrastructure_p)): Likewise.
>   * doc/tm.texi: Regenerate.
> 
> 2014-11-18  James Greenhalgh  
> 
>   * gcc.dg/memset-2.c: New.

Please mention the PR also in the testsuite/ChangeLog entry.
Ok with that change, thanks.

Jakub


[committed] Add testcase for PR tree-optimization/61042

2014-11-18 Thread Jakub Jelinek
Hi!

This PR got fixed by a fix for another PR, but I've committed
the testcase to the testsuite for better coverage.

2014-11-18  Jakub Jelinek  

PR tree-optimization/61042
* gcc.c-torture/compile/pr61042.c: New test.

--- gcc/testsuite/gcc.c-torture/compile/pr61042.c.jj2014-11-18 
14:23:56.004671762 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr61042.c   2014-11-18 
14:23:35.0 +0100
@@ -0,0 +1,10 @@
+/* PR tree-optimization/61042 */
+
+int a, b, c[1], d, f;
+
+void
+foo ()
+{
+  for (; b; b++)
+c[0] = c[f] && (d = a);
+}

Jakub


Re: Don't override all other CFLAGS_FOR_TARGET when optimizing libraries for space

2014-11-18 Thread Jeff Law

On 11/17/14 03:16, Bob Dunlop wrote:

Hi,

This patch prevents optimizing libraries for space from overriding all
previous CFLAGS_FOR_TARGET settings.  Add optimization flags to any
pre-existing values.

I hit this problem whilst building a Gcc cross compiler and newlib
library for ARM with crosstool-NG-1.20.0.  The newlib developers
suggested I should try posting it here.


--- gcc-4.9.2/config/mt-ospace  1999-09-04 16:09:22.0 +0100
+++ gcc-new/config/mt-ospace2014-11-17 09:50:08.0 +
@@ -1,3 +1,3 @@
  # Build libraries optimizing for space, not speed.
- CFLAGS_FOR_TARGET = -g -Os
- CXXFLAGS_FOR_TARGET = -g -Os
+ CFLAGS_FOR_TARGET += -g -Os
+ CXXFLAGS_FOR_TARGET += -g -Os

Thanks.  Installed.

jeff



[PATCH] Fix simd clone vectorization with EH (PR tree-optimization/63915)

2014-11-18 Thread Jakub Jelinek
Hi!

Simd clone vectorization uses vect_finish_stmt_generation which
handles adding the new calls properly to EH tables, but afterwards
we replace the original call with a normal assignment, and if the original
call can throw, we need to remove it from the EH tables.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2014-11-18  Jakub Jelinek  

PR tree-optimization/63915
* tree-vect-stmts.c (vectorizable_simd_clone_call): Pass
true instead of false as last argument to gsi_replace.

* c-c++-common/gomp/pr60823-4.c: New test.

--- gcc/tree-vect-stmts.c.jj2014-11-14 00:10:39.0 +0100
+++ gcc/tree-vect-stmts.c   2014-11-18 17:02:15.635257023 +0100
@@ -3195,7 +3195,7 @@ vectorizable_simd_clone_call (gimple stm
   set_vinfo_for_stmt (new_stmt, stmt_info);
   set_vinfo_for_stmt (stmt, NULL);
   STMT_VINFO_STMT (stmt_info) = new_stmt;
-  gsi_replace (gsi, new_stmt, false);
+  gsi_replace (gsi, new_stmt, true);
   unlink_stmt_vdef (stmt);
 
   return true;
--- gcc/testsuite/c-c++-common/gomp/pr60823-4.c.jj  2014-11-18 
17:06:38.859319961 +0100
+++ gcc/testsuite/c-c++-common/gomp/pr60823-4.c 2014-11-18 17:06:34.602383632 
+0100
@@ -0,0 +1,7 @@
+/* PR tree-optimization/63915 */
+/* { dg-do run } */
+/* { dg-require-effective-target vect_simd_clones } */
+/* { dg-options "-O2 -fopenmp-simd" } */
+/* { dg-additional-options "-fpic" { target fpic } } */
+
+#include "pr60823-2.c"

Jakub


[PATCH] Fix ubsan -fsanitize=signed-integer-overflow expansion (PR sanitizer/63520)

2014-11-18 Thread Jakub Jelinek
Hi!

Apparently, expand_expr with EXPR_WRITE can return
a SUBREG with SUBREG_PROMOTED_VAR_P set on it.  For
UBSAN_CHECK_{ADD,SUB,MUL} expansion, I've been doing just
emit_move_insn into it, which apparently is wrong in that case,
store_expr instead uses convert_move for it.  The
{ADD,SUB,MUL}_OVERFLOW (i.e. __builtin_*_overflow) expansion
shouldn't need it, as the result is complex and complex integers
aren't promoted that way.  As store_expr* uses a tree expression
to store, while I have rtx, I just wrote a short helper function
for this.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  

PR sanitizer/63520
* internal-fn.c (expand_ubsan_result_store): New function.
(expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow):
Use it instead of just emit_move_insn.

* c-c++-common/ubsan/pr63520.c: New test.

--- gcc/internal-fn.c.jj2014-11-12 13:28:47.0 +0100
+++ gcc/internal-fn.c   2014-11-18 15:35:46.395916823 +0100
@@ -395,6 +395,21 @@ expand_arith_overflow_result_store (tree
   write_complex_part (target, lres, false);
 }
 
+/* Helper for expand_*_overflow.  Store RES into TARGET.  */
+
+static void
+expand_ubsan_result_store (rtx target, rtx res)
+{
+  if (GET_CODE (target) == SUBREG && SUBREG_PROMOTED_VAR_P (target))
+/* If this is a scalar in a register that is stored in a wider mode   
+   than the declared mode, compute the result into its declared mode
+   and then convert to the wider mode.  Our value is the computed
+   expression.  */
+convert_move (SUBREG_REG (target), res, SUBREG_PROMOTED_SIGN (target));
+  else
+emit_move_insn (target, res);
+}
+
 /* Add sub/add overflow checking to the statement STMT.
CODE says whether the operation is +, or -.  */
 
@@ -809,7 +824,7 @@ expand_addsub_overflow (location_t loc,
   if (lhs)
 {
   if (is_ubsan)
-   emit_move_insn (target, res);
+   expand_ubsan_result_store (target, res);
   else
{
  if (do_xor)
@@ -904,7 +919,7 @@ expand_neg_overflow (location_t loc, tre
   if (lhs)
 {
   if (is_ubsan)
-   emit_move_insn (target, res);
+   expand_ubsan_result_store (target, res);
   else
expand_arith_overflow_result_store (lhs, target, mode, res);
 }
@@ -1590,7 +1605,7 @@ expand_mul_overflow (location_t loc, tre
   if (lhs)
 {
   if (is_ubsan)
-   emit_move_insn (target, res);
+   expand_ubsan_result_store (target, res);
   else
expand_arith_overflow_result_store (lhs, target, mode, res);
 }
--- gcc/testsuite/c-c++-common/ubsan/pr63520.c.jj   2014-11-18 
15:40:07.271273710 +0100
+++ gcc/testsuite/c-c++-common/ubsan/pr63520.c  2014-11-18 15:40:40.971673904 
+0100
@@ -0,0 +1,16 @@
+/* PR sanitizer/63520 */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=undefined" } */
+
+int a;
+
+void
+foo (void)
+{
+  while (1)
+{
+  if (a == 1)
+   break;
+  a -= 1;
+}
+}


Jakub


Re: [C++ PATCH] Fix -fsanitize={null,alignment} reference instrumentation (PR sanitizer/63813)

2014-11-18 Thread Jason Merrill

OK.

Jason


[PATCH] Fix up r217118 - simplify_binary_operation_1 (PR rtl-optimization/63843)

2014-11-18 Thread Jakub Jelinek
Hi!

The case ASHIFTRT: which is meant for this optimization is preceeded by
/* FALLTHRU */ from ROTATE cases, which can't be optimized this way.
Thus, this patch limits this optimization to ASHIFTRT.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  

PR rtl-optimization/63843
* simplify-rtx.c (simplify_binary_operation_1) : For
optimization of ashiftrt of subreg of lshiftrt, check that code
is ASHIFTRT.

* gcc.c-torture/execute/pr63843.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-18 10:17:01.0 +0100
+++ gcc/simplify-rtx.c  2014-11-18 13:44:26.727792683 +0100
@@ -3118,10 +3118,11 @@ simplify_binary_operation_1 (enum rtx_co
  (subreg:M1 (ashiftrt:M2 (reg:M2)
  (const_int ))
   ).  */
-  if (!VECTOR_MODE_P (mode)
+  if (code == ASHIFTRT
+ && !VECTOR_MODE_P (mode)
   && SUBREG_P (op0)
   && CONST_INT_P (op1)
-  && (GET_CODE (SUBREG_REG (op0)) == LSHIFTRT)
+ && GET_CODE (SUBREG_REG (op0)) == LSHIFTRT
   && !VECTOR_MODE_P (GET_MODE (SUBREG_REG (op0)))
   && CONST_INT_P (XEXP (SUBREG_REG (op0), 1))
   && (GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (op0)))
--- gcc/testsuite/gcc.c-torture/execute/pr63843.c.jj2014-11-18 
13:43:45.417544096 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr63843.c   2014-11-18 
13:43:32.0 +0100
@@ -0,0 +1,31 @@
+/* PR rtl-optimization/63843 */
+
+static inline __attribute__ ((always_inline))
+unsigned short foo (unsigned short v)
+{
+  return (v << 8) | (v >> 8);
+}
+
+unsigned short __attribute__ ((noinline, noclone, hot))
+bar (unsigned char *x)
+{
+  unsigned int a;
+  unsigned short b;
+  __builtin_memcpy (&a, &x[0], sizeof (a));
+  a ^= 0x80808080U;
+  __builtin_memcpy (&x[0], &a, sizeof (a));
+  __builtin_memcpy (&b, &x[2], sizeof (b));
+  return foo (b);
+}
+
+int
+main ()
+{
+  unsigned char x[8] = { 0x01, 0x01, 0x01, 0x01 };
+  if (__CHAR_BIT__ == 8
+  && sizeof (short) == 2
+  && sizeof (int) == 4
+  && bar (x) != 0x8181U)
+__builtin_abort ();
+  return 0;
+}

Jakub


[Patch] pr63937: TARGET_USE_BY_PIECES_INFRASTRUCTURE_P should take an unsigned HOST_WIDE_INT size argument

2014-11-18 Thread James Greenhalgh

Hi,

The TARGET_USE_BY_PIECES_INFRASTRUCTURE_P hook takes an unsigned
int for the size of memory operation. This is dangerous, as all the
callers of this hook pass an unsigned HOST_WIDE_INT.

This causes pr63937, where the compiler ends up trying to emit
an unfeasible number of instructions for a word-by-word copy of a > 4GB
array (which wraps around and becomes an 8-byte copy that the compiler
momentarily thinks can be done in a single instruction, D'oh).

This target hook should be using unsigned HOST_WIDE_INT in line with
what its callers expect.

Fixed as attached.

An x86_64 bootstrap comes back clean, and I've built cross-compilers
for the other touched targets

Thanks,
James

---
2014-11-18  James Greenhalgh  

PR target/63937
* target.def (use_by_pieces_infrastructure_p): Take unsigned
HOST_WIDE_INT as the size parameter.
* targhooks.c (default_use_by_pieces_infrastructure_p): Likewise.
* targhooks.h (default_use_by_pieces_infrastructure_p): Likewise.
* config/arc/arc.c (arc_use_by_pieces_infrastructure_p)): Likewise.
* config/mips/mips.c (mips_use_by_pieces_infrastructure_p)): Likewise.
* config/s390/s390.c (s390_use_by_pieces_infrastructure_p)): Likewise.
* config/sh/sh.c (sh_use_by_pieces_infrastructure_p)): Likewise.
* config/aarch64/aarch64.c
(aarch64_use_by_pieces_infrastructure_p)): Likewise.
* doc/tm.texi: Regenerate.

2014-11-18  James Greenhalgh  

* gcc.dg/memset-2.c: New.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3548335..fd3c819 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10196,7 +10196,7 @@ aarch64_asan_shadow_offset (void)
 }
 
 static bool
-aarch64_use_by_pieces_infrastructure_p (unsigned int size,
+aarch64_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
 	unsigned int align,
 	enum by_pieces_operation op,
 	bool speed_p)
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 0f3825e..764f736 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -416,7 +416,7 @@ static void output_short_suffix (FILE *file);
 
 static bool arc_frame_pointer_required (void);
 
-static bool arc_use_by_pieces_infrastructure_p (unsigned int,
+static bool arc_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT,
 		unsigned int,
 		enum by_pieces_operation op,
 		bool);
@@ -9374,7 +9374,7 @@ arc_legitimize_reload_address (rtx *p, machine_mode mode, int opnum,
 /* Implement TARGET_USE_BY_PIECES_INFRASTRUCTURE_P.  */
 
 static bool
-arc_use_by_pieces_infrastructure_p (unsigned int size,
+arc_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
 unsigned int align,
 enum by_pieces_operation op,
 bool speed_p)
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 02268f3..db58d07 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -7235,7 +7235,7 @@ mips_function_ok_for_sibcall (tree decl, tree exp ATTRIBUTE_UNUSED)
 /* Implement TARGET_USE_MOVE_BY_PIECES_INFRASTRUCTURE_P.  */
 
 bool
-mips_use_by_pieces_infrastructure_p (unsigned int size,
+mips_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
  unsigned int align,
  enum by_pieces_operation op,
  bool speed_p)
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 3152762..ae3ffd1 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -12036,7 +12036,7 @@ s390_option_override (void)
 /* Implement TARGET_USE_BY_PIECES_INFRASTRUCTURE_P.  */
 
 static bool
-s390_use_by_pieces_infrastructure_p (unsigned int size,
+s390_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
  unsigned int align ATTRIBUTE_UNUSED,
  enum by_pieces_operation op ATTRIBUTE_UNUSED,
  bool speed_p ATTRIBUTE_UNUSED)
diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c
index 5bac2af..e449121 100644
--- a/gcc/config/sh/sh.c
+++ b/gcc/config/sh/sh.c
@@ -338,7 +338,7 @@ static void sh_conditional_register_usage (void);
 static bool sh_legitimate_constant_p (machine_mode, rtx);
 static int mov_insn_size (machine_mode, bool);
 static int mov_insn_alignment_mask (machine_mode, bool);
-static bool sh_use_by_pieces_infrastructure_p (unsigned int,
+static bool sh_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT,
 	   unsigned int,
 	   enum by_pieces_operation,
 	   bool);
@@ -13680,7 +13680,7 @@ sh_mode_priority (int entity ATTRIBUTE_UNUSED, int n)
 /* Implement TARGET_USE_BY_PIECES_INFRASTRUCTURE_P.  */
 
 static bool
-sh_use_by_pieces_infrastructure_p (unsigned int size,
+sh_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
    unsigned int align,
    enum by_pieces_operation op,
    bool speed_p)
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c09c510..0d3a9fd 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6205,7 +6205,7 @@ optimized for speed rat

[C++ PATCH] Fix -fsanitize={null,alignment} reference instrumentation (PR sanitizer/63813)

2014-11-18 Thread Jakub Jelinek
Hi!

As the testcase shows, if there is a reinterpret_cast, ubsan
reference instrumentation can ICE, because for NOP_EXPR
to REFERENCE_TYPE we pass argument of that NOP_EXPR, and if it is not
a POINTER_TYPE or POINTER_TYPE to a different type, we can get the
type to use in the diagnostics wrong or ICE.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2014-11-18  Jakub Jelinek  

PR sanitizer/63813
* c-ubsan.c (ubsan_maybe_instrument_reference_or_call): Change type
argument to ptype, set type to TREE_TYPE (ptype).  Don't call
get_pointer_alignment for non-pointers.  Use ptype, or if it is
reference type, corresponding pointer type, as type of kind
argument.
(ubsan_maybe_instrument_reference,
ubsan_maybe_instrument_member_call): Adjust callers.

* g++.dg/ubsan/pr63813.C: New test.

--- gcc/c-family/c-ubsan.c.jj   2014-10-29 09:49:47.0 +0100
+++ gcc/c-family/c-ubsan.c  2014-11-18 12:31:42.700607456 +0100
@@ -383,18 +383,19 @@ ubsan_maybe_instrument_array_ref (tree *
 }
 
 static tree
-ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree type,
+ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype,
  enum ubsan_null_ckind ckind)
 {
-  tree orig_op = op;
-  bool instrument = false;
-  unsigned int mina = 0;
-
   if (current_function_decl == NULL_TREE
   || lookup_attribute ("no_sanitize_undefined",
   DECL_ATTRIBUTES (current_function_decl)))
 return NULL_TREE;
 
+  tree type = TREE_TYPE (ptype);
+  tree orig_op = op;
+  bool instrument = false;
+  unsigned int mina = 0;
+
   if (flag_sanitize & SANITIZE_ALIGNMENT)
 {
   mina = min_align_of_type (type);
@@ -431,13 +432,20 @@ ubsan_maybe_instrument_reference_or_call
}
   else if (flag_sanitize & SANITIZE_NULL)
instrument = true;
-  if (mina && mina > get_pointer_alignment (op) / BITS_PER_UNIT)
-   instrument = true;
+  if (mina && mina > 1)
+   {
+ if (!POINTER_TYPE_P (TREE_TYPE (op))
+ || mina > get_pointer_alignment (op) / BITS_PER_UNIT)
+   instrument = true;
+   }
 }
   if (!instrument)
 return NULL_TREE;
   op = save_expr (orig_op);
-  tree kind = build_int_cst (TREE_TYPE (op), ckind);
+  gcc_assert (POINTER_TYPE_P (ptype));
+  if (TREE_CODE (ptype) == REFERENCE_TYPE)
+ptype = build_pointer_type (TREE_TYPE (ptype));
+  tree kind = build_int_cst (ptype, ckind);
   tree align = build_int_cst (pointer_sized_int_node, mina);
   tree call
 = build_call_expr_internal_loc (loc, IFN_UBSAN_NULL, void_type_node,
@@ -453,7 +461,7 @@ ubsan_maybe_instrument_reference (tree s
 {
   tree op = TREE_OPERAND (stmt, 0);
   op = ubsan_maybe_instrument_reference_or_call (EXPR_LOCATION (stmt), op,
-TREE_TYPE (TREE_TYPE (stmt)),
+TREE_TYPE (stmt),
 UBSAN_REF_BINDING);
   if (op)
 TREE_OPERAND (stmt, 0) = op;
@@ -471,7 +479,7 @@ ubsan_maybe_instrument_member_call (tree
   || !POINTER_TYPE_P (TREE_TYPE (op)))
 return;
   op = ubsan_maybe_instrument_reference_or_call (EXPR_LOCATION (stmt), op,
-TREE_TYPE (TREE_TYPE (op)),
+TREE_TYPE (op),
 is_ctor ? UBSAN_CTOR_CALL
 : UBSAN_MEMBER_CALL);
   if (op)
--- gcc/testsuite/g++.dg/ubsan/pr63813.C.jj 2014-11-18 12:37:57.790991586 
+0100
+++ gcc/testsuite/g++.dg/ubsan/pr63813.C2014-11-18 12:50:58.723345975 
+0100
@@ -0,0 +1,12 @@
+// PR sanitizer/63813
+// { dg-do compile }
+// { dg-options "-fsanitize=undefined -O1" }
+
+struct A {};
+struct B { long foo () const; A &bar () const; };
+
+A &
+B::bar () const
+{
+  return *reinterpret_cast  (foo ());
+}

Jakub


[PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Jakub Jelinek
Hi!

OImode/XImode on i?86/x86_64 are not <= MAX_BITSIZE_MODE_ANY_INT, because
they are never used for integer arithmetics (and there is no way
to represent all their values in RTL if not using CONST_WIDE_INT).
As the following testcase shows, simplify_immed_subreg can be called
with such modes though, e.g. trying to forward propagate a CONST_VECTOR
(i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
in the IL directly) into a SUBREG_REG.
The following patch instead of ICE handles the most common cases (all 0
and all 1 CONST_VECTORs) and returns NULL otherwise.

Before wide-int got merged, the testcase worked, though the code didn't
bother checking anything, just created 0 or constm1_rtx for the two cases
that could happen and if something else appeared, could just return what
matched low TImode (or DImode for -m32).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  

PR target/63910
* simplify-rtx.c (simplify_immed_subreg): For integer modes
wider than MAX_BITSIZE_MODE_ANY_INT, handle all zeros and all ones
and for other values return NULL_RTX.

* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-11 00:06:19.0 +0100
+++ gcc/simplify-rtx.c  2014-11-18 10:17:01.198668555 +0100
@@ -5505,6 +5505,21 @@ simplify_immed_subreg (machine_mode oute
HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
wide_int r;
 
+   if (GET_MODE_PRECISION (outer_submode) > MAX_BITSIZE_MODE_ANY_INT)
+ {
+   /* Handle just all zeros and all ones CONST_VECTORs in
+  this case.  */
+   if ((vp[0] & value_mask) == 0)
+ elems[elem] = const0_rtx;
+   else if ((vp[0] & value_mask) == value_mask)
+ elems[elem] = constm1_rtx;
+   else
+ return NULL_RTX;
+   for (i = value_bit; i < elem_bitsize; i += value_bit)
+ if ((vp[i / value_bit] & value_mask) != (vp[0] & value_mask))
+   return NULL_RTX;
+   break;
+ }
for (u = 0; u < units; u++)
  {
unsigned HOST_WIDE_INT buf = 0;
@@ -5516,8 +5531,6 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   <= MAX_BITSIZE_MODE_ANY_INT);
r = wide_int::from_array (tmp, units,
  GET_MODE_PRECISION (outer_submode));
elems[elem] = immed_wide_int_const (r, outer_submode);
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-18 10:12:24.282659318 
+0100
+++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-18 10:12:00.0 
+0100
@@ -0,0 +1,12 @@
+/* PR target/63910 */
+/* { dg-do compile } */
+/* { dg-options "-O -mstringop-strategy=vector_loop -mavx512f" } */
+
+extern void bar (float *c);
+
+void
+foo (void)
+{
+  float c[1024] = { };
+  bar (c);
+}

Jakub


[PATCH] Fix -fsanitize=bool -fnon-call-exceptions (PR sanitizer/63913)

2014-11-18 Thread Jakub Jelinek
Hi!

This patch fixes instrumentation of bool/enum loads if they could
throw.  We want to keep the EH on the load, and push
further statements on the fallthru edge.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  

PR sanitizer/63913
* ubsan.c: Include tree-eh.h.
(instrument_bool_enum_load): Handle loads that can throw.

* g++.dg/ubsan/pr63913.C: New test.

--- gcc/ubsan.c.jj  2014-11-14 15:11:49.0 +0100
+++ gcc/ubsan.c 2014-11-18 09:29:20.409209556 +0100
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.
 #include "dfp.h"
 #include "builtins.h"
 #include "tree-object-size.h"
+#include "tree-eh.h"
 
 /* Map from a tree to a VAR_DECL tree.  */
 
@@ -1135,7 +1136,9 @@ instrument_bool_enum_load (gimple_stmt_i
   || TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
 return;
 
+  bool can_throw = stmt_could_throw_p (stmt);
   location_t loc = gimple_location (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
   tree ptype = build_pointer_type (TREE_TYPE (rhs));
   tree atype = reference_alias_ptr_type (rhs);
   gimple g = gimple_build_assign (make_ssa_name (ptype, NULL),
@@ -1145,9 +1148,24 @@ instrument_bool_enum_load (gimple_stmt_i
   tree mem = build2 (MEM_REF, utype, gimple_assign_lhs (g),
 build_int_cst (atype, 0));
   tree urhs = make_ssa_name (utype, NULL);
-  g = gimple_build_assign (urhs, mem);
-  gimple_set_location (g, loc);
-  gsi_insert_before (gsi, g, GSI_SAME_STMT);
+  if (can_throw)
+{
+  gimple_assign_set_lhs (stmt, urhs);
+  g = gimple_build_assign_with_ops (NOP_EXPR, lhs, urhs, NULL_TREE);
+  gimple_set_location (g, loc);
+  edge e = find_fallthru_edge (gimple_bb (stmt)->succs);
+  gsi_insert_on_edge_immediate (e, g);
+  gimple_assign_set_rhs_with_ops (gsi, MEM_REF, mem, NULL_TREE);
+  update_stmt (stmt);
+  *gsi = gsi_for_stmt (g);
+  g = stmt;
+}
+  else
+{
+  g = gimple_build_assign (urhs, mem);
+  gimple_set_location (g, loc);
+  gsi_insert_before (gsi, g, GSI_SAME_STMT);
+}
   minv = fold_convert (utype, minv);
   maxv = fold_convert (utype, maxv);
   if (!integer_zerop (minv))
@@ -1169,8 +1187,11 @@ instrument_bool_enum_load (gimple_stmt_i
   gimple_set_location (g, loc);
   gsi_insert_after (gsi, g, GSI_NEW_STMT);
 
-  gimple_assign_set_rhs_with_ops (&gsi2, NOP_EXPR, urhs, NULL_TREE);
-  update_stmt (stmt);
+  if (!can_throw)
+{
+  gimple_assign_set_rhs_with_ops (&gsi2, NOP_EXPR, urhs, NULL_TREE);
+  update_stmt (stmt);
+}
 
   gsi2 = gsi_after_labels (then_bb);
   if (flag_sanitize_undefined_trap_on_error)
--- gcc/testsuite/g++.dg/ubsan/pr63913.C.jj 2014-11-18 09:34:10.603981487 
+0100
+++ gcc/testsuite/g++.dg/ubsan/pr63913.C2014-11-18 09:33:47.0 
+0100
@@ -0,0 +1,12 @@
+// PR sanitizer/63913
+// { dg-do compile }
+// { dg-options "-fsanitize=bool -fnon-call-exceptions" }
+
+struct B { B (); ~B (); };
+
+double
+foo (bool *x)
+{
+  B b;
+  return *x;
+}

Jakub


C++ PATCH for c++/63940 (constexpr errors in bootstrap)

2014-11-18 Thread Jason Merrill
I'm not sure why this wasn't showing up on x86_64-linux, but here's a 
fix.  When we lower SIZEOF_EXPR during genericization, we don't then 
fold the containing expression, so the constexpr evaluator shouldn't 
assume that everything has been folded.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 1c056fd82cc25ac0ec30be50b37f274c6ba0e496
Author: Jason Merrill 
Date:   Tue Nov 18 15:18:57 2014 -0500

	PR c++/63940
	* constexpr.c (cxx_eval_binary_expression): Don't assume the
	expression was already folded.
	(cxx_eval_unary_expression): Likewise.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 5abea14..517bf23 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1461,9 +1461,17 @@ cxx_eval_unary_expression (const constexpr_ctx *ctx, tree t,
 	   addr, non_constant_p, overflow_p,
 	   NULL);
   VERIFY_CONSTANT (arg);
-  if (arg == orig_arg)
-return t;
-  r = fold_build1 (TREE_CODE (t), TREE_TYPE (t), arg);
+  location_t loc = EXPR_LOCATION (t);
+  enum tree_code code = TREE_CODE (t);
+  tree type = TREE_TYPE (t);
+  r = fold_unary_loc (loc, code, type, arg);
+  if (r == NULL_TREE)
+{
+  if (arg == orig_arg)
+	r = t;
+  else
+	r = build1_loc (loc, code, type, arg);
+}
   VERIFY_CONSTANT (r);
   return r;
 }
@@ -1488,9 +1496,18 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, tree t,
   allow_non_constant, addr,
   non_constant_p, overflow_p, NULL);
   VERIFY_CONSTANT (rhs);
-  if (lhs == orig_lhs && rhs == orig_rhs)
-return t;
-  r = fold_build2 (TREE_CODE (t), TREE_TYPE (t), lhs, rhs);
+
+  location_t loc = EXPR_LOCATION (t);
+  enum tree_code code = TREE_CODE (t);
+  tree type = TREE_TYPE (t);
+  r = fold_binary_loc (loc, code, type, lhs, rhs);
+  if (r == NULL_TREE)
+{
+  if (lhs == orig_lhs && rhs == orig_rhs)
+	r = t;
+  else
+	r = build2_loc (loc, code, type, lhs, rhs);
+}
   VERIFY_CONSTANT (r);
   return r;
 }


Re: [PATCH, MPX wrappers 1/3] Add MPX wrappers library

2014-11-18 Thread Jeff Law

On 11/18/14 09:48, Ilya Enkovich wrote:

On 15 Nov 00:10, Jeff Law wrote:

On 11/14/14 10:26, Ilya Enkovich wrote:

Hi,

This patch introduces a simple library with several wrappers to be used with 
MPX and Pointer Bounds Checker.  Wrappers allow to obtain, copy and just keep 
alive bounds whrough widely use library calls.  It significantly increases 
checking  quality.

Thanks,
Ilya
--
gcc/

2014-11-14  Ilya Enkovich  

* gcc.c (MPX_SPEC): Add wrappers library.

libmpx/

2014-11-14  Ilya Enkovich  

* Makefile.am (SUBDIRS): New.
(MAKEOVERRIDES): New.
* Makefile.in: Regenerate.
* configure.ac: Add mpxintr/Makefile to config
files.
* configure: Regenerate.
* mpxwrap/Makefile.am: New.
* mpxwrap/Makefile.in: New.
* mpxwrap/libtool-version: New.
* mpxwrap/mpx_wrappers.cc: New.

As Joseph mentioned, symbol versioning.  Anytime a target side
library is added to GCC, it should be properly versioned.

Don't forget copyright headers in the new files.  Remember it has to
be suitable for embeddeding in the target without infecting the
target with the GPL.  LGPL or GPL + exception clause seem the most
appropriate to me.


Jeff



Thank you for review!  Here is a version with license and versioning added.

Thanks,
Ilya
--
gcc/

2014-11-18  Ilya Enkovich  

* gcc.c (MPX_SPEC): Add wrappers library.

libmpx/

2014-11-18  Ilya Enkovich  

* Makefile.am (SUBDIRS): New.
(MAKEOVERRIDES): New.
* Makefile.in: Regenerate.
* configure.ac: Add mpxintr/Makefile to config
files.
* configure: Regenerate.
* mpxwrap/Makefile.am: New.
* mpxwrap/Makefile.in: New.
* mpxwrap/libtool-version: New.
* mpxwrap/mpx_wrappers.cc: New.
* mpxwrap/libmpxwrappers.map: New.

OK.
Jeff



Re: [gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign

2014-11-18 Thread David Malcolm
On Tue, 2014-11-18 at 10:53 +0100, Richard Biener wrote:
> On Tue, Nov 18, 2014 at 2:59 AM, David Malcolm  wrote:
> > On Mon, 2014-11-17 at 11:06 +0100, Richard Biener wrote:
> >> On Sat, Nov 15, 2014 at 12:00 PM, David Malcolm  
> >> wrote:
> >> > On Thu, 2014-11-13 at 11:45 +0100, Richard Biener wrote:
> >> >> On Thu, Nov 13, 2014 at 2:41 AM, David Malcolm  
> >> >> wrote:
> >> >> > On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote:
> >> >> >> On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek  
> >> >> >> wrote:
> >> >> >> > On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
> >> >> >> >> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
> >> >> >> >> > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> >> >> >> >> > > To be constructive here - the above case is from within a
> >> >> >> >> > > GIMPLE_ASSIGN case label
> >> >> >> >> > > and thus I'd have expected
> >> >> >> >> > >
> >> >> >> >> > > case GIMPLE_ASSIGN:
> >> >> >> >> > >   {
> >> >> >> >> > > gassign *a1 = as_a  (s1);
> >> >> >> >> > > gassign *a2 = as_a  (s2);
> >> >> >> >> > >   lhs1 = gimple_assign_lhs (a1);
> >> >> >> >> > >   lhs2 = gimple_assign_lhs (a2);
> >> >> >> >> > >   if (TREE_CODE (lhs1) != SSA_NAME
> >> >> >> >> > >   && TREE_CODE (lhs2) != SSA_NAME)
> >> >> >> >> > > return (operand_equal_p (lhs1, lhs2, 0)
> >> >> >> >> > > && gimple_operand_equal_value_p 
> >> >> >> >> > > (gimple_assign_rhs1 (a1),
> >> >> >> >> > >  
> >> >> >> >> > > gimple_assign_rhs1 (a2)));
> >> >> >> >> > >   else if (TREE_CODE (lhs1) == SSA_NAME
> >> >> >> >> > >&& TREE_CODE (lhs2) == SSA_NAME)
> >> >> >> >> > > return vn_valueize (lhs1) == vn_valueize (lhs2);
> >> >> >> >> > >   return false;
> >> >> >> >> > >   }
> >> >> >> >> > >
> >> >> >> >> > > instead.  That's the kind of changes I have expected and have 
> >> >> >> >> > > approved of.
> >> >> >> >> >
> >> >> >> >> > But even that looks like just adding extra work for all 
> >> >> >> >> > developers, with no
> >> >> >> >> > gain.  You only have to add extra code and extra temporaries, 
> >> >> >> >> > in switches
> >> >> >> >> > typically also have to add {} because of the temporaries and 
> >> >> >> >> > thus extra
> >> >> >> >> > indentation level, and it doesn't simplify anything in the code.
> >> >> >> >>
> >> >> >> >> The branch attempts to use the C++ typesystem to capture 
> >> >> >> >> information
> >> >> >> >> about the kinds of gimple statement we expect, both:
> >> >> >> >>   (A) so that the compiler can detect type errors, and
> >> >> >> >>   (B) as a comprehension aid to the human reader of the code
> >> >> >> >>
> >> >> >> >> The ideal here is when function params and struct field can be
> >> >> >> >> strengthened from "gimple" to a subclass ptr.  This captures the
> >> >> >> >> knowledge that every use of a function or within a struct has a 
> >> >> >> >> given
> >> >> >> >> gimple code.
> >> >> >> >
> >> >> >> > I just don't like all the as_a/is_a stuff enforced everywhere,
> >> >> >> > it means more typing, more temporaries, more indentation.
> >> >> >> > So, as I view it, instead of the checks being done cheaply (yes, I 
> >> >> >> > think
> >> >> >> > the gimple checking as we have right now is very cheap) under the
> >> >> >> > hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those 
> >> >> >> > changes
> >> >> >> > put the burden on the developers, who has to check that manually 
> >> >> >> > through
> >> >> >> > the as_a/is_a stuff everywhere, more typing and uglier syntax.
> >> >> >> > I just don't see that as a step forward, instead a huge step 
> >> >> >> > backwards.
> >> >> >> > But perhaps I'm alone with this.
> >> >> >> > Can you e.g. compare the size of - lines in your patchset 
> >> >> >> > combined, and
> >> >> >> > size of + lines in your patchset?  As in, if your changes lead to 
> >> >> >> > less
> >> >> >> > typing or more.
> >> >> >>
> >> >> >> I see two ways out here.  One is to add overloads to all the 
> >> >> >> functions
> >> >> >> taking the special types like
> >> >> >>
> >> >> >> tree
> >> >> >> gimple_assign_rhs1 (gimple *);
> >> >> >>
> >> >> >> or simply add
> >> >> >>
> >> >> >> gassign *operator ()(gimple *g) { return as_a  (g); }
> >> >> >>
> >> >> >> into a gimple-compat.h header which you include in places that
> >> >> >> are not converted "nicely".
> >> >> >
> >> >> > Thanks for the suggestions.
> >> >> >
> >> >> > Am I missing something, or is the gimple-compat.h idea above not 
> >> >> > valid C
> >> >> > ++?
> >> >> >
> >> >> > Note that "gimple" is still a typedef to
> >> >> >   gimple_statement_base *
> >> >> > (as noted before, the gimple -> gimple * change would break everyone
> >> >> > else's patches, so we talked about that as a followup patch for early
> >> >> > stage3).
> >> >> >
> >> >> > Given that, if I try to create an "operator ()" ou

Re: [PATCH] Fix PR56480 aka DR374. Allow explicit specialization in enclosing namespace.

2014-11-18 Thread Jason Merrill

On 11/12/2014 06:53 AM, Markus Trippelsdorf wrote:

Anyway, I will defer working on this until next stage1, because I don't
have time to implement this before stage1 closes on Saturday.


I think this should be OK for stage 3, especially given that you first 
sent the patch before the end of stage 1.


Jason




Re: Query about the TREE_TYPE field

2014-11-18 Thread Jason Merrill

On 11/18/2014 02:32 PM, Andrew MacLeod wrote:

So it effectively does nothing.   Unless Jason can think of a good
reason for it, we probably ought to turf it.  Its effectively a NOP.


Yeah, go ahead.

Jason




Audit some ipa passes for optimization attribute

2014-11-18 Thread Jan Hubicka
Hi,
this patch goes through most of ipa passes: ipa-devirt, ipa-cp, ipa-pure-const,
ipa-profile and ipa-inline and audits them for opt_for_fn.
I did not converted yet ipa-reference because the code is organized in a way
making it bit difficult to hook the tests in and ipa-icf that I want to do
separately.

The patch also enables ipa inehritance graph building by default.  It makes it
easy to support -fdevirtualize functoins within -fno-devirtualize unit and it
is not hard to do so.

Bootstrapped/regtested x86_64-linux and tested on firefox.  I will try to find 
time
to design testcases that passes can be randomly turned off on per function 
basis,
I would definitely welcome a help with this task - there are quite few options
to test and I am also lagging behind with ipa-cp devirt testcases.

Martin: I disabled sanity check in ipa_value_from_jfunc, ipa_context_from_jfunc
because I do not have function context handy.  If you think it is worth to keep
it, I will pass the stuff around.

Honza

* ipa-cp.c (ipcp_cloning_candidate_p): Use opt_for_fn.
(ipa_value_from_jfunc, ipa_context_from_jfunc): Skip sanity check.
(ipa_get_indirect_edge_target_1): Use opt_for_fn.
(good_cloning_opportunity_p): Likewise.
(ipa-cp gate): Enable ipa-cp with LTO.
* ipa-profile.c (ipa_propagate_frequency): Use opt_for_fn.
* ipa.c (symbol_table::remove_unreachable_nodes): Always build type
inheritance.
* ipa-inline-transform.c (inline_transform): Check if there are inlines
to apply even at -O0.
* cgraphunit.c (cgraph_node::finalize_function): Use opt_for_fn.
(analyze_functions): Build type inheritance graph.
* ipa-inline.c (can_inline_edge_p): Use opt_for_fn.
(want_early_inline_function_p, want_inline_small_function_p):
Likewise.
(check_callers): Likewise.
(edge_badness): Likewise.
(inline_small_functions): Always be ready for indirect inlining
to happend.
(ipa_inline): Always use want_inline_function_to_all_callers_p.
(early_inline_small_functions): Use opt_for_fn.
* ipa-inline-analysis.c (estimate_function_body_sizes): use opt_for_fn.
(estimate_function_body_sizes): Likewise.
(compute_inline_parameters): Likewise.
(estimate_edge_devirt_benefit): Likewise.
(inline_analyze_function): Likewise.
* ipa-devirt.c (ipa_devirt): Likewise.
(gate): Use in_lto_p.
* ipa-prop.c (ipa_func_spec_opts_forbid_analysis_p): Use opt_for_fn.
(try_make_edge_direct_virtual_call): Likewise.
(update_indirect_edges_after_inlining): Likewise.
(ipa_free_all_structures_after_ipa_cp): Add in_lto_p check.
* common.opt (findirect-inlining): Turn into optimization.
* ipa-pure-const.c (add_new_function): Use opt_for_fn.
(pure_const_generate_summary): Likewise.
(gate_pure_const): Always enable with in_lto_p.

Index: ipa-cp.c
===
--- ipa-cp.c(revision 217675)
+++ ipa-cp.c(working copy)
@@ -566,7 +566,7 @@ ipcp_cloning_candidate_p (struct cgraph_
 
   gcc_checking_assert (node->has_gimple_body_p ());
 
-  if (!flag_ipa_cp_clone)
+  if (!opt_for_fn (node->decl, flag_ipa_cp_clone))
 {
   if (dump_file)
 fprintf (dump_file, "Not considering %s for cloning; "
@@ -902,10 +902,7 @@ ipa_value_from_jfunc (struct ipa_node_pa
  ipcp_lattice *lat;
 
  if (!info->lattices)
-   {
- gcc_checking_assert (!flag_ipa_cp);
- return NULL_TREE;
-   }
+   return NULL_TREE;
  lat = ipa_get_scalar_lat (info, idx);
  if (!lat->is_single_const ())
return NULL_TREE;
@@ -967,10 +964,7 @@ ipa_context_from_jfunc (ipa_node_params
   else
{
  if (!info->lattices)
-   {
- gcc_checking_assert (!flag_ipa_cp);
- return ctx;
-   }
+   return ctx;
  ipcp_lattice *lat;
  lat = ipa_get_poly_ctx_lat (info, srcidx);
  if (!lat->is_single_const ())
@@ -1786,7 +1780,7 @@ ipa_get_indirect_edge_target_1 (struct c
return NULL_TREE;
 }
 
-  if (!flag_devirtualize)
+  if (!opt_for_fn (ie->caller->decl, flag_devirtualize))
 return NULL_TREE;
 
   gcc_assert (!ie->indirect_info->agg_contents);
@@ -1884,8 +1878,8 @@ ipa_get_indirect_edge_target_1 (struct c
   struct cgraph_node *node;
   if (*speculative)
return target;
-  if (!flag_devirtualize_speculatively || ie->speculative
- || !ie->maybe_hot_p ())
+  if (!opt_for_fn (ie->caller->decl, flag_devirtualize_speculatively)
+ || ie->speculative || !ie->maybe_hot_p ())
return NULL;
   node = try_speculative_devirtualization (ie->indirect_info->otr_type,
   ie->indirect_info->otr_token,
@@ -2003,7 +1997,7 @@ good_clo

Re: [AArch64, Patch] Restructure arm_neon.h vector types's implementation(Take 2).

2014-11-18 Thread Marc Glisse

On Tue, 18 Nov 2014, James Greenhalgh wrote:


On Wed, Nov 05, 2014 at 11:31:24PM +, James Greenhalgh wrote:

On Wed, Nov 05, 2014 at 09:50:52PM +, Marc Glisse wrote:

Thanks. Do you know if anyone is planning to "port" this patch to the arm
target (which IIRC has the same issue)? No pressure, this is just so I
know if I should ping the alternate __float128 patch or wait.


Yes, you're right, the arm target also has this issue. I have a port of
Tejas' patch based on top of some other refactoring of the arm
target's builtin infrastructure I've been working on - I'm hoping to get
that all rebased and proposed upstream in the next couple of weeks.


This is all done now (r217700), so your patch should be safe to go in.


Thanks, I retested on x86_64-linux and committed as r217735. Keeping my 
fingers crossed...


--
Marc Glisse


RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-18 Thread Matthew Fortune
> > From: Maciej W. Rozycki [mailto:ma...@codesourcery.com]
> >
> > On Tue, 18 Nov 2014, Andrew Bennett wrote:
> >
> > > Produces (for the atomic operation):
> > >
> > >.setnoat
> > > sync
> > > 1:
> > > ll  $3,0($5)
> > > and $1,$3,$4
> > > bne $1,$7,2f
> > > and $1,$3,$6
> > > or  $1,$1,$8
> > > sc  $1,0($5)
> > > beql$1,$0,1b
> > > nop
> > > nop
> > > sync
> > > 2:
> > > .setat
> >
> >  With a quick look at `mips_process_sync_loop' it looks to me the
> > other NOP is produced from here:
> >
> >   else if (!(required_oldval && cmp))
> > mips_multi_add_insn ("nop", NULL);
> >
> > -- correct?  If so, then can't you just make it:
> 
> Correct.
> 
> >
> >   else if (!(required_oldval && cmp) && !TARGET_FIX_R1)
> > mips_multi_add_insn ("nop", NULL);
> >
> > instead?  It appears plain and simple, and if you're concerned about
> > it being unobvious to the casual reader, then add a one-line comment
> > or suchlike.  It's not like TARGET_FIX_R1 is going to imply
> > anything else about branches in the future (and r6 code won't run on
> > an R10k anyway).

I'm afraid I disagree about it being plain and simple even with a comment.
The amount of code in the backend that makes assumptions based on derived
variables is very high and it makes the code hard to read (especially w.r.t.
branches).  I'm OK with adding the code to avoid the extra NOP but have it
based on mips_branch_likely and fix the callers so that it is set
appropriately.

Thanks,
Matthew



Re: [PR63762]GCC generates UNPREDICTABLE STR with Rn = Rt for arm

2014-11-18 Thread Vladimir Makarov
On 11/18/2014 05:56 AM, Renlin Li wrote:
> Hi all,
>
> This patch should fix PR63762.
>
> Hard_reg allocated during IRA pass conflicts with hard_reg allocated
> in LRA pass because of inconsistent information between allocno and
> reg_pref.
>
> The staled reg_class information in reg_pref is used by LRA while
> doing hard register assignment. LRA makes wrong decision
> to choose same hard_reg number for pseudo registers with overlapped
> liveness.
>
>
> For this particular case manifested on arm target:
>
> (insn 180 183 184 15 (set (mem:SF (post_inc:SI (reg/v/f:SI 223 [ in
> ])) [2 MEM[base: in_68, offset: 4294967292B]+0 S4 A32])
> (reg/v:SF 290 [orig:224 val ] [224])) unpreditable-str.c:33
> 618 {*movsf_vfp}
>  (expr_list:REG_INC (reg/v/f:SI 223 [ in ])
> (nil)))
>
> IRA:
> r290(2 ,SF, GENERAL_REGS) -> r278(2, SF, GENERAL_REGS) -> r224 (16,
> SF, VFP_LO_REGS)
> reg_pref[290] is set following its original_reg's(r224) info which is
> VFP_LO_REGS.
>
> LRA:
> r302(SI, GENERAL_REGS) -> r223(SI, GENERAL_REGS)
>
> while selecting hard register for r302, register class is checked
> first, r290 is VFP_LO_REGS (from reg_pref[290]), r302 is GENERAL_REGS,
> so they will never conflict. Thus, hard register number 2 is chosen
> for r302 without problem.
>
>
>
> A testcase is added for all targets as I think it's a middle-end
> issue. And sorry for not being able to simplify it.
> arm-none-eabi has been test on the model, no new issues. bootstrapping
> and regression tested on x86, no new issues.
>
> Is it Okay to commit?
Yes.  Thanks very much for working on it and fixing it.



Re: [PATCH] -fsanitize-recover=list

2014-11-18 Thread Alexey Samsonov
On Mon, Nov 17, 2014 at 11:53 PM, Jakub Jelinek  wrote:
> On Mon, Nov 17, 2014 at 11:39:47PM -0800, Alexey Samsonov wrote:
>> > That is not what I think we've agreed on and what is implemented in GCC.
>> > -fsanitize-recover only enables recovery of the undefined plus
>> > undefined-like sanitizers, in particular it doesn't enable recover from
>> > kernel-address, because -fsanitize-recover should be a deprecated option
>> > and kernel-address never used it before.
>>
>> Hm, indeed, I messed it up. In the older thread we agree that plain
>> -f(no-)sanitize-recover
>> should be a deprecated synonym for the current meaning of these flags,
>> which only specify recoverability for UBSan-related checks :-/
>>
>> Has GCC already shipped with existing implementation? I'm just curious
>> if it's convenient to have flags that would enable/disable recovery for all
>> sanitizers (analogous to -Werror flags which exist in a standalone form, but
>> may accept specific warnings, so that one can write
>>   "$(CC) foo.cc -Wno-error -Werror=sign-compare"
>
> Well, no sanitizer is on by default, so you can just use the same
> list for -fno-sanitize-recover= or -fsanitize-recover= as for -fsanitize=
> if you want.

Yeah, but it may look somewhat redundant. Also, re-using the exact
same list may lead to diagnostic messages (if you write
-fsanitize=unreachable,null -fsanitize-recover=unreachable,null).

>> > So, in GCC -fsanitize-recover stands for
>> > -fsanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
>> > and -fno-sanitize-recover stands for
>> > -fno-sanitize-recover=undefined,float-divide-by-zero,float-cast-overflow
>> >
>> > > * -fsanitize-recover=: Enable recovery for all selected checks
>> > > or group of checks. It is forbidden to list unrecoverable sanitizers
>> > > here (e.g., "-fsanitize-recover=address" will produce an error).
>> >
>> > We only error on
>> > -fsanitize-recover=address
>> > -fsanitize-recover=thread
>> > -fsanitize-recover=leak
>>
>> Right, it's a good idea to error on sanitizer kinds we don't have
>> recovery for. I will
>> add the errors for TSan/MSan/LSan etc.
>>
>> > but not say on
>> > -fsanitize-recover=unreachable
>> > which is part of undefined; unreachable isn't recoverable silently.
>> > Likewise -fsanitize-recover=return.  Otherwise one couldn't use
>> > -fsanitize-recover=undefined which is useful.
>>
>> Can't this be fixed by checking the actual values of -fsanitize-recover= flag
>> before expanding the sanitizer groups (i.e. turning "undefined" into
>> "null,unreachable,return,")?
>>
>> We should probably be able to distinguish between 
>> -fsanitize-recover=undefined,
>> and -fsanitize-recover=unreachable,.
>> In the first case we can enable recovery
>> for all parts of "undefined" that support it. In the second, we can
>> produce an error as user
>> explicitly tried to enable recovery for sanitizer that doesn't support it.
>
> But in that case you would need to diagnose it already when parsing the
> option, or add code that would remember what bits have been set from other
> option sets.
> In the former case, you'd diagnose even
> -fsanitize-recover=unreachable -fno-sanitize-recover=undefined
> even when in that case unreachable in the end is not recoverable.
> We don't error out for
> -fsanitize-recover=address -fno-sanitize-recover=address
> because in the end address is not recovered.

OK, that's a good question: should we diagnose -fsanitize-recover=address
if it was later disabled by -fno-sanitize-recover=address? There is an argument
for doing this: "-fsanitize-recover=address" is unsupported, so this
flag shouldn't
have been provided in the first place. It makes much more sense to
delete it rather
than override it. It couldn't be passed down from some global
project-wide configuration
(like CFLAGS), because it wouldn't work anywhere.

The only case where overriding might come in handy is re-using the same list for
-fsanitize= and -fsanitize-recover= flags that you mention:
# SANITIZERS is a list of sanitizers we want to enable.
CFLAGS = "${CFLAGS} -fsanitize=${SANITIZERS} -fsanitize-recover=${SANITIZERS}"
# Oops - we produce an error if ${SANITIZERS} contains "address".

However, even if we decide to *not* diagnose unrecoverable sanitizer
kinds disabled later
in the command line, we can still implement warnings for "unreachable" properly.
You can scan "-fsanitize-recover" flags from right to left and keep
track of all sanitizers that
"would be disabled". When you see "-fsanitize=unreachable" (with
literal "unreachable" as a value),
you diagnose the error only if "unreachable" wouldn't be disabled
later by some -fno-sanitize-recover flag.

FWIW, we implement this logic in Clang for regular -fsanitize= flags.
-- 
Alexey Samsonov, Mountain View, CA


Re: Query about the TREE_TYPE field

2014-11-18 Thread Andrew MacLeod

On 11/18/2014 01:36 PM, Jeff Law wrote:

On 11/18/14 09:30, Andrew MacLeod wrote:


I tried doing the if before chaning to TREE_TYPE...  absolutely no
effect on the testsuite or anything else :-)

What do you think, should I check this in?   What is there is clearly
incorrect.we could also revert the original patch since that is what
the code base is effectively is doing at the moment...
What I tend to do in these situations is roll back to the version 
prior to the "fix" and try the testcase with that compiler.  Then I 
can walk through under GDB control to find out whatever I need.


Clearly something needs to change here, but the question is what :-) 
And I think rolling back and debugging that compiler is the best way 
to know get more information to allow this to move forward.


From 2008. gah.  anyway, Doesn't help.   The code snippet never 
triggers. The function is called 240 times on the testcase, and each 
time its a FUNCTION_DECL, which becomes a FUNCTION_TYPE when we take 
TREE_TYPE.


Test testcase ICE's before and after the change.

So it effectively does nothing.   Unless Jason can think of a good 
reason for it, we probably ought to turf it.  Its effectively a NOP.


Andrew






	* attribs.c: Remove always true condition, as TREE_TYPE(x) will never
	compare equal to a TYPE_DECL.

Index: attribs.c
===
*** attribs.c	(revision 217234)
--- attribs.c	(working copy)
*** decl_attributes (tree *node, tree attrib
*** 502,512 
if (spec->type_required && DECL_P (*anode))
  	{
  	  anode = &TREE_TYPE (*anode);
! 	  /* Allow ATTR_FLAG_TYPE_IN_PLACE for the type's naming decl.  */
! 	  if (!(TREE_CODE (*anode) == TYPE_DECL
! 		&& *anode == TYPE_NAME (TYPE_MAIN_VARIANT
! 	(TREE_TYPE (*anode)
! 	flags &= ~(int) ATTR_FLAG_TYPE_IN_PLACE;
  	}
  
if (spec->function_type_required && TREE_CODE (*anode) != FUNCTION_TYPE
--- 502,508 
if (spec->type_required && DECL_P (*anode))
  	{
  	  anode = &TREE_TYPE (*anode);
! 	  flags &= ~(int) ATTR_FLAG_TYPE_IN_PLACE;
  	}
  
if (spec->function_type_required && TREE_CODE (*anode) != FUNCTION_TYPE


Re: [Patch] Improving jump-thread pass for PR 54742

2014-11-18 Thread Jeff Law

On 11/18/14 12:25, Steve Ellcey wrote:

On Mon, 2014-11-17 at 09:24 +, James Greenhalgh wrote:


For what it is worth, I've bootstrapped and tested this patch on
aarch64-none-linux-gnu and arm-none-linux-gnueabi with no issues, and
both targets get the expected speedup in the interesting benchmark.
I've also thrown some of the larger popular benchmark suites at it, and
they've compiled and run without any compilation issues or miscompares.

I'm happy to help out with the testing and bug-triaging effort once this
patch goes in.

Some very shallow comments: you should document the new parameters
in doc/invoke.texi and you ought to run contrib/check_GNU_style.sh
on the patch and clean up the coding style issues it highlights.

Thanks,
James Greenhalgh


I tested the patch on MIPS and things looked good there too.  I got the
desired speedup and did not see any regressions.
It's on my list of things to look at.  The patch clearly was in prior to 
stage1 close and is thus eligible for inclusion in GCC 5.  Slogging 
through stuff as quickly as I can.



jeff


Re: [Patch] Improving jump-thread pass for PR 54742

2014-11-18 Thread Steve Ellcey
On Mon, 2014-11-17 at 09:24 +, James Greenhalgh wrote:

> For what it is worth, I've bootstrapped and tested this patch on
> aarch64-none-linux-gnu and arm-none-linux-gnueabi with no issues, and
> both targets get the expected speedup in the interesting benchmark.
> I've also thrown some of the larger popular benchmark suites at it, and
> they've compiled and run without any compilation issues or miscompares.
> 
> I'm happy to help out with the testing and bug-triaging effort once this
> patch goes in.
> 
> Some very shallow comments: you should document the new parameters
> in doc/invoke.texi and you ought to run contrib/check_GNU_style.sh
> on the patch and clean up the coding style issues it highlights.
> 
> Thanks,
> James Greenhalgh

I tested the patch on MIPS and things looked good there too.  I got the
desired speedup and did not see any regressions.

Steve Ellcey



RE: [PATCH] MIPS/GCC: Unconditional jump generation bug fix

2014-11-18 Thread Matthew Fortune
> > -Original Message-
> > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
> > Sent: Tuesday, November 18, 2014 12:22 PM
> > To: Rozycki, Maciej
> > Cc: gcc-patches@gcc.gnu.org; Moore, Catherine; Eric Christopher
> > Subject: RE: [PATCH] MIPS/GCC: Unconditional jump generation bug fix
> >
> > Maciej W. Rozycki  writes:
> > > On Mon, 17 Nov 2014, Matthew Fortune wrote:
> > >
> > > > >   gcc/
> > > > >   * gcc/config/mips/mips.md (*jump_absolute): Use a branch when
> in
> > > > >   range, a jump otherwise.
> > > >
> > > > OK.
> > > >
> > > > I only got my head around this code last week otherwise I wouldn't
> > > > have known whether this was correct!
> > >
> > >  Committed now, thanks for your review.  How about 4.9? -- once it
> > > is unfrozen, that is.
> >
> > I admit to being a bit more nervous about 4.9 but the test coverage
> > seems thorough enough. I guess I would have been less concerned if the
> > optimisation was still just tied to TARGET_MICROMIPS for the 4.9
> branch.
> >
> > Catherine, what do you think?
> >
> This is okay for 4.9 IMO.

OK

Matthew


Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior

2014-11-18 Thread Paolo Carlini

Hi,

On 11/18/2014 08:12 PM, Tim Shen wrote:

Bootstrapped and tested.
Jonathan lately is following your work much better than me, but naively 
seems weird that _M_begin is non-const and _M_end is const, a different 
type anyway.


Paolo.


[Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior

2014-11-18 Thread Tim Shen
Bootstrapped and tested.

Thanks!


-- 
Regards,
Tim Shen
commit f17155183b9ae1283d04f3bbdb61d05d9279ebe4
Author: timshen 
Date:   Tue Nov 18 00:07:28 2014 -0800

PR libstdc++/63920
* include/bits/regex_executor.h: Make _M_begin non const.
* include/bits/regex_executor.tcc (_Executor<>::_M_search): Increase
_M_begin in search algorithm, so that _M_begin is treated as
"current start position" for each search iteration.
* testsuite/28_regex/algorithms/regex_search/ecma/flags.cc: New
testcase.

diff --git a/libstdc++-v3/include/bits/regex_executor.h 
b/libstdc++-v3/include/bits/regex_executor.h
index b26992c..2d7796f 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -205,7 +205,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 public:
   _ResultsVec   _M_cur_results;
   _BiIter   _M_current;
-  const _BiIter _M_begin;
+  _BiIter   _M_begin;
   const _BiIter _M_end;
   const _RegexT&_M_re;
   const _NFAT&  _M_nfa;
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc 
b/libstdc++-v3/include/bits/regex_executor.tcc
index 38d4781..a973667 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -39,17 +39,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 bool _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
 _M_search()
 {
+  if (_M_search_from_first())
+   return true;
   if (_M_flags & regex_constants::match_continuous)
-   return _M_search_from_first();
-  auto __cur = _M_begin;
-  do
+   return false;
+  _M_flags |= regex_constants::match_prev_avail;
+  while (_M_begin != _M_end)
{
- _M_current = __cur;
- if (_M_main(_Match_mode::_Prefix))
+ ++_M_begin;
+ if (_M_search_from_first())
return true;
}
-  // Continue when __cur == _M_end
-  while (__cur++ != _M_end);
   return false;
 }
 
diff --git 
a/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/ecma/flags.cc 
b/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/ecma/flags.cc
index 45ad0f6..6b038ee 100644
--- a/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/ecma/flags.cc
+++ b/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/ecma/flags.cc
@@ -65,6 +65,8 @@ test01()
 regex_constants::match_prev_avail));
   VERIFY( regex_search_debug("ba"+1, regex("\\Ba"),
 regex_constants::match_prev_avail));
+  // PR libstdc++/63920
+  VERIFY(!regex_search_debug("a", regex("b*"), 
regex_constants::match_not_null));
 }
 
 int


C++ PATCH for c++/63925 (constexpr pointer increment)

2014-11-18 Thread Jason Merrill

The middle-end doesn't like when I give it a PLUS_EXPR with pointer type.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 98e8254b2e89e1e34521a7cf9f027bd78eddd3ab
Author: Jason Merrill 
Date:   Tue Nov 18 13:36:46 2014 -0500

	PR c++/63925
	* constexpr.c (cxx_eval_increment_expression): Use POINTER_PLUS_EXPR.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 4325caa..5abea14 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2566,8 +2566,17 @@ cxx_eval_increment_expression (const constexpr_ctx *ctx, tree t,
 
   /* The modified value.  */
   bool inc = (code == PREINCREMENT_EXPR || code == POSTINCREMENT_EXPR);
-  tree mod = fold_build2 (inc ? PLUS_EXPR : MINUS_EXPR,
-			  type, val, offset);
+  tree mod;
+  if (POINTER_TYPE_P (type))
+{
+  /* The middle end requires pointers to use POINTER_PLUS_EXPR.  */
+  offset = convert_to_ptrofftype (offset);
+  if (!inc)
+	offset = fold_build1 (NEGATE_EXPR, TREE_TYPE (offset), offset);
+  mod = fold_build2 (POINTER_PLUS_EXPR, type, val, offset);
+}
+  else
+mod = fold_build2 (inc ? PLUS_EXPR : MINUS_EXPR, type, val, offset);
   VERIFY_CONSTANT (mod);
 
   /* Storing the modified value.  */
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-incr1.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-incr1.C
index 2b099c8..ecd7c04 100644
--- a/gcc/testsuite/g++.dg/cpp1y/constexpr-incr1.C
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-incr1.C
@@ -1,4 +1,5 @@
 // { dg-do compile { target c++14 } }
+#define SA(X) static_assert((X),#X)
 
 constexpr int f (int i)
 {
@@ -8,6 +9,15 @@ constexpr int f (int i)
   return x;
 }
 
+constexpr int* g (int* p)
+{
+  ++p;
+  return p;
+}
+
 constexpr int i = f(42);
-#define SA(X) static_assert((X),#X)
 SA(i==44);
+
+int array[4];
+constexpr int* p = g(array);
+SA(p == &array[1]);


Re: [PATCH 4/4] Data structure is used for inline_summary struct.

2014-11-18 Thread Jan Hubicka
> Hi,
> 
> On Tue, Nov 18, 2014 at 04:39:00PM +0100, Jan Hubicka wrote:
> > > On Fri, Nov 14, 2014 at 08:59:10PM +0100, Jan Hubicka wrote:
> > > > > > 
> > > > > > > b) with GTY, we cannot call destructor
> > > > > > 
> > > > > > Everything in symbol table is expecitely memory managed (i.e. enver 
> > > > > > left
> > > > > > to be freed by garbage collector). It resists in GTY only to allow 
> > > > > > linking
> > > > > > garbage collected object from them and to get PCH working.
> > > > > > 
> > > > > 
> > > > > Well, if I understand the intent correctly, summaries are for stuff
> > > > > that is not in the symbol table.  For example jump functions are a
> > > > Correct.
> > > > > vector of structures possibly containing trees, so everything has to
> > > > > be in garbage collected memory.
> > > > > 
> > > > > When an edge is removed, it is necessary to be notified about it
> > > > > immediately, for example to decrement rdesc_refcount (you might argue
> > > > > that that should be done in a separate hook and not from within a
> > > > > summary class but then you start to rely on hook invocation ordering
> > > > > so I think it is better to eventually use the summaries for it too).
> > > > 
> > > > I do not see why ctors/dtors can not do the reference counting. In fact
> > > > this is how refcounting is done usually anyway?
> > > > 
> > > 
> > > Well, when there is no garbage collection involved then yes, that is
> > > how you normally do it but in the GC case, there is the question of
> > > what is the appropriate time to call destructor on garbage collected
> > > data (like jump functions)?
> > 
> > I still fail to see problem here.  Summaries are explicitly managed- they 
> > are
> > constructed at summary construction time or when new callgarph node is
> > introduced/duplicated.  They are destroyed when callgarph node is destroyed 
> > or
> > whole summary is ddestroyed.  It is job of the summary datastructure to call
> > proper ctors/dtors, not job of garbage collector that provides the 
> > underlying
> > memory management.
> 
> I do not think that all summaries (in the meaning of a description of
> one particular symbol table node or call graph edge) are explicitely
> managed.  For example ipa_edge_args or ipa_agg_replacement_value
> (which my alignment patch changes to ipcp_transformation_summary) are
> allocated in GC memory because they contain trees.
> 
> > 
> > If you have datastructure that points to something that is not
> > explicitly managed (i.e. tree expression), you just can not have
> > non-trivial constructor on that datastructure, because that is freed
> > transparently by gty that don't do destruction...
> 
> I admit to not being particularly bright today but that seems to be
> exactly my point.

Well, in your case you have datastructure jump_function that contain a pointer
to tree (EXPR).  What I am trying to explain is that I see no reson why
jump_function needs to be POD. The tree pointed to by EXPR pointer can not
have a dtor by itself because GGC will not call it upon freeing.

It is true that jump_function lives in GGC memory (to make pointer to expr
work) but it never gets removed by ggc_collect because it is always pointed to
by the summary datastructure.  There are two ways to free the jump_function
datastructure.
  1) removing the symbol node it is attached to.
 Here the symtab code will call removal hook that was registered by 
container
 template. The container will call destructor of jump_function and the 
ggc_free
 its memory
  2) removing the summary.  In this case I would again expect the container
 template to walk all summaries and free them.

So even if your structure lives in GGC memory it is not really garbage
collected and thus the lack of machinery to call dtors at a time ggc decides to
free something is not a problem?

In fact looking at struct default_hashmap_traits, I see:

  /* Called to dispose of the key and value before marking the entry as
 deleted.  */

  template static void remove (T &v) { v.~T (); }

This trait gets called by the underlying hash table so if you have explicitly
managed hashmap (in GGC memory or not), things just work.  Only catch is that
if you let your hashmap to be garbage collected, then your dtor is not called.

So probably the dtors are working same way with Martin's summaries?
I guess we can follow same scheme here, have summary_traits that default
to calling correspondin ctors/dtors. 

Honza

> 
> Martin


Re: [PATCH, Pointer Bounds Checker, Builtins instrumentation 3/5] Expand instrumented builtin calls

2014-11-18 Thread David Edelsohn
On Tue, Nov 18, 2014 at 1:44 PM, Jeff Law  wrote:
> On 11/18/14 05:18, Richard Biener wrote:
>
>>>
>>> As I understand the main problem is that fixing dbxout in trunk won't
>>> help to build stage1 compiler.
>>
>>
>> Simply build stage1 without debug info on AIX then.
>>
>> Btw, you have to start fixing the bug at some point ... (we can
>> backport to 4.8 and 4.9).  Of course dbxout.c is in kind of
>> deep-freeze unmaintained state.
>>
>> I don't think we ever encouraged work-arounds for broken host compilers
>> if that requires substantial work.
>
> It's always depended on the amount of work involved.  We have to do far less
> catering to broken host compilers these days.
>
> I wouldn't lose any sleep if we just truncated as these stab strings get
> really big.  I don't think we want to much too much effort into dbxout.c :-)

AIX assembler does not support continuation lines, but apparently one
may be able to emit multiple, adjacent .stabx pseudo-ops if the string
is broken at a grammatically correct location.  In other words,
"continuations" with a complicated string of characters to end one
.stabx and start another.

- David


Re: [PATCH, Pointer Bounds Checker, Builtins instrumentation 3/5] Expand instrumented builtin calls

2014-11-18 Thread Jeff Law

On 11/18/14 05:18, Richard Biener wrote:



As I understand the main problem is that fixing dbxout in trunk won't
help to build stage1 compiler.


Simply build stage1 without debug info on AIX then.

Btw, you have to start fixing the bug at some point ... (we can
backport to 4.8 and 4.9).  Of course dbxout.c is in kind of
deep-freeze unmaintained state.

I don't think we ever encouraged work-arounds for broken host compilers
if that requires substantial work.
It's always depended on the amount of work involved.  We have to do far 
less catering to broken host compilers these days.


I wouldn't lose any sleep if we just truncated as these stab strings get 
really big.  I don't think we want to much too much effort into dbxout.c :-)



Jeff



Re: [PATCH 5/5] combine: preferably delete dead SETs in PARALLELs

2014-11-18 Thread Jeff Law

On 11/18/14 06:42, Segher Boessenkool wrote:

On Tue, Nov 18, 2014 at 06:38:49AM -0600, Segher Boessenkool wrote:

Does it help pr52714 where we'd like to rip apart a PARALLEL with two
sets, one of which is dead.  If so, it might allow us to  optimize that
code better.


It does not seem to fix the testcase.  I'll investigate why not.
You're talking about the

   (parallel [(set (pc) (pc))
  (set (X) (sp))])

right?  I guess the "set pc pc" is not marked as unused...


The very first thing that is checked for is !(added_sets_2 && i1 == 0)
which matches in this case.  I wonder what that condition is supposed
to accomplish -- "optimisation" I suppose?

Hacking that out, it still won't match because the "set pc pc" is not
considered dead.  It's a no-op, not the same thing.  I'll try to widen
the condition...
Just to be clear, I wouldn't consider fixing this a requirement to get 
your change in.   I think 5/5 was ready to go as-is.  Obviously if you 
can improve on it, that's great ;-)



jeff


Re: Query about the TREE_TYPE field

2014-11-18 Thread Jeff Law

On 11/18/14 09:30, Andrew MacLeod wrote:


I tried doing the if before chaning to TREE_TYPE...  absolutely no
effect on the testsuite or anything else :-)

What do you think, should I check this in?   What is there is clearly
incorrect.we could also revert the original patch since that is what
the code base is effectively is doing at the moment...
What I tend to do in these situations is roll back to the version prior 
to the "fix" and try the testcase with that compiler.  Then I can walk 
through under GDB control to find out whatever I need.


Clearly something needs to change here, but the question is what :-) 
And I think rolling back and debugging that compiler is the best way to 
know get more information to allow this to move forward.


Jeff


Re: [patch] New std::string implementation

2014-11-18 Thread Jonathan Wakely

On 18/11/14 10:45 +0100, Richard Biener wrote:

Looking at all these issues that just pop up inside libstdc++ I wonder
if this whole business will not blow up in our face once out in the wild...


I'm trying to ensure that most of the the pain is dealt with inside
libstdc++ and so users won't have the same kind of issues.

I've got to the point where everything works except programs which use
custom facets. If you create a custom locale with a user-defined facet
in a translation unit that uses the new ABI and imbue an iostream with
that locale, your user-defined facet will not be used, because the
iostream definitions are instantiated in the library using the old
ABI. I've never met anyone who uses custom facets, but that doesn't
mean it's OK for them to not work properly using  the new ABI.

So if I add dg-options "-D_GLIBCXX_USE_CXX11_ABI" to the last 40 tests
that are failing, everything passes. Maybe that's what I should do for
now.


Re: [PATCH 4/4] Data structure is used for inline_summary struct.

2014-11-18 Thread Martin Jambor
Hi,

On Tue, Nov 18, 2014 at 04:39:00PM +0100, Jan Hubicka wrote:
> > On Fri, Nov 14, 2014 at 08:59:10PM +0100, Jan Hubicka wrote:
> > > > > 
> > > > > > b) with GTY, we cannot call destructor
> > > > > 
> > > > > Everything in symbol table is expecitely memory managed (i.e. enver 
> > > > > left
> > > > > to be freed by garbage collector). It resists in GTY only to allow 
> > > > > linking
> > > > > garbage collected object from them and to get PCH working.
> > > > > 
> > > > 
> > > > Well, if I understand the intent correctly, summaries are for stuff
> > > > that is not in the symbol table.  For example jump functions are a
> > > Correct.
> > > > vector of structures possibly containing trees, so everything has to
> > > > be in garbage collected memory.
> > > > 
> > > > When an edge is removed, it is necessary to be notified about it
> > > > immediately, for example to decrement rdesc_refcount (you might argue
> > > > that that should be done in a separate hook and not from within a
> > > > summary class but then you start to rely on hook invocation ordering
> > > > so I think it is better to eventually use the summaries for it too).
> > > 
> > > I do not see why ctors/dtors can not do the reference counting. In fact
> > > this is how refcounting is done usually anyway?
> > > 
> > 
> > Well, when there is no garbage collection involved then yes, that is
> > how you normally do it but in the GC case, there is the question of
> > what is the appropriate time to call destructor on garbage collected
> > data (like jump functions)?
> 
> I still fail to see problem here.  Summaries are explicitly managed- they are
> constructed at summary construction time or when new callgarph node is
> introduced/duplicated.  They are destroyed when callgarph node is destroyed or
> whole summary is ddestroyed.  It is job of the summary datastructure to call
> proper ctors/dtors, not job of garbage collector that provides the underlying
> memory management.

I do not think that all summaries (in the meaning of a description of
one particular symbol table node or call graph edge) are explicitely
managed.  For example ipa_edge_args or ipa_agg_replacement_value
(which my alignment patch changes to ipcp_transformation_summary) are
allocated in GC memory because they contain trees.

> 
> If you have datastructure that points to something that is not
> explicitly managed (i.e. tree expression), you just can not have
> non-trivial constructor on that datastructure, because that is freed
> transparently by gty that don't do destruction...

I admit to not being particularly bright today but that seems to be
exactly my point.

Martin


Go patch committed: Fix initialization order

2014-11-18 Thread Ian Taylor
The Go frontend was slightly incorrect in the way that it handled the
order of initialization (http://golang.org/issue/8052).  Fortunately
it rarely made a difference in real code.  This patch by Chris
Manghane fixes it.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
diff -r 82f97044669e go/gogo.cc
--- a/go/gogo.ccFri Nov 14 10:02:13 2014 -0800
+++ b/go/gogo.ccTue Nov 18 09:02:23 2014 -0800
@@ -1018,11 +1018,11 @@
 {
  public:
   Var_init()
-: var_(NULL), init_(NULL)
+: var_(NULL), init_(NULL), dep_count_(0)
   { }
 
   Var_init(Named_object* var, Bstatement* init)
-: var_(var), init_(init)
+: var_(var), init_(init), dep_count_(0)
   { }
 
   // Return the variable.
@@ -1035,13 +1035,38 @@
   init() const
   { return this->init_; }
 
+  // Return the number of remaining dependencies.
+  size_t
+  dep_count() const
+  { return this->dep_count_; }
+
+  // Increment the number of dependencies.
+  void
+  add_dependency()
+  { ++this->dep_count_; }
+
+  // Decrement the number of dependencies.
+  void
+  remove_dependency()
+  { --this->dep_count_; }
+
  private:
   // The variable being initialized.
   Named_object* var_;
   // The initialization statement.
   Bstatement* init_;
+  // The number of initializations this is dependent on.  A variable
+  // initialization should not be emitted if any of its dependencies
+  // have not yet been resolved.
+  size_t dep_count_;
 };
 
+// For comparing Var_init keys in a map.
+
+inline bool
+operator<(const Var_init& v1, const Var_init& v2)
+{ return v1.var()->name() < v2.var()->name(); }
+
 typedef std::list Var_inits;
 
 // Sort the variable initializations.  The rule we follow is that we
@@ -1052,14 +1077,21 @@
 static void
 sort_var_inits(Gogo* gogo, Var_inits* var_inits)
 {
+  if (var_inits->empty())
+return;
+
   typedef std::pair No_no;
   typedef std::map Cache;
   Cache cache;
 
-  Var_inits ready;
-  while (!var_inits->empty())
-{
-  Var_inits::iterator p1 = var_inits->begin();
+  // A mapping from a variable initialization to a set of
+  // variable initializations that depend on it.
+  typedef std::map > Init_deps;
+  Init_deps init_deps;
+  for (Var_inits::iterator p1 = var_inits->begin();
+   p1 != var_inits->end();
+   ++p1)
+{
   Named_object* var = p1->var();
   Expression* init = var->var_value()->init();
   Block* preinit = var->var_value()->preinit();
@@ -1067,11 +1099,13 @@
 
   // Start walking through the list to see which variables VAR
   // needs to wait for.
-  Var_inits::iterator p2 = p1;
-  ++p2;
-
-  for (; p2 != var_inits->end(); ++p2)
+  for (Var_inits::iterator p2 = var_inits->begin();
+  p2 != var_inits->end();
+  ++p2)
{
+ if (var == p2->var())
+   continue;
+
  Named_object* p2var = p2->var();
  No_no key(var, p2var);
  std::pair ins =
@@ -1080,6 +1114,10 @@
ins.first->second = expression_requires(init, preinit, dep, p2var);
  if (ins.first->second)
{
+ // VAR depends on P2VAR.
+ init_deps[*p2].insert(&(*p1));
+ p1->add_dependency();
+
  // Check for cycles.
  key = std::make_pair(p2var, var);
  ins = cache.insert(std::make_pair(key, false));
@@ -1100,36 +1138,66 @@
 p2var->message_name().c_str());
  p2 = var_inits->end();
}
- else
-   {
- // We can't emit P1 until P2 is emitted.  Move P1.
- Var_inits::iterator p3 = p2;
- ++p3;
- var_inits->splice(p3, *var_inits, p1);
-   }
- break;
}
}
-
-  if (p2 == var_inits->end())
+}
+
+  // If there are no dependencies then the declaration order is sorted.
+  if (!init_deps.empty())
+{
+  // Otherwise, sort variable initializations by emitting all variables 
with
+  // no dependencies in declaration order. VAR_INITS is already in
+  // declaration order.
+  Var_inits ready;
+  while (!var_inits->empty())
{
- // VAR does not depends upon any other initialization expressions.
-
- // Check for a loop of VAR on itself.  We only do this if
- // INIT is not NULL and there is no dependency; when INIT is
- // NULL, it means that PREINIT sets VAR, which we will
- // interpret as a loop.
- if (init != NULL && dep == NULL
- && expression_requires(init, preinit, NULL, var))
-   error_at(var->location(),
-"initialization expression for %qs depends upon itself",
-var->message_name().c_str());
- ready.splice(ready.end(), *var_inits, p1);
+ Var_inits::iterator v1;;
+ for (v1 = var_inits->begin(); v1 != var_inits->end(); ++v1)
+   {
+ if

Re: Stream out default optimization nodes

2014-11-18 Thread Jan Hubicka
> On Tue, Nov 18, 2014 at 9:27 AM, Jan Hubicka  wrote:
> >> https://gcc.gnu.org/ml/gcc-regression/2014-11/msg00473.html
> >>
> >> /export/gnu/import/git/gcc-test-profiled/bld/./prev-gcc/xg++
> >> -B/export/gnu/import/git/gcc-test-profiled/bld/./prev-gcc/
> >> -B/usr/5.0.0/x86_64-unknown-linux-gnu/bin/ -nostdinc++
> >> -B/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
> >> -B/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
> >>  
> >> -I/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu
> >>  
> >> -I/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include
> >>  
> >> -I/export/gnu/import/git/gcc-test-profiled/src-trunk/libstdc++-v3/libsupc++
> >> -L/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
> >> -L/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
> >>   -g -O2 -flto=jobserver -frandom-seed=1 -fprofile-use -DIN_GCC
> >> -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
> >> -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
> >> -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
> >> -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE
> >> -static-libstdc++ -static-libgcc  -o build/genmatch \
> >> build/genmatch.o ../libcpp/libcpp.a ../libiberty/libiberty.a
> >> build/errors.o build/vec.o build/hash-table.o
> >> .././libiberty/libiberty.a
> >> ../../src-trunk/libcpp/lex.c: In function âend_directiveâ:
> >> ../../src-trunk/libcpp/lex.c:442:43: error:
> >> â__builtin_ia32_pcmpestri128â needs isa option -m32 -msse4.2
> >>index = __builtin_ia32_pcmpestri128 (search, 4, sv, 16, 0);
> >>^
> >> make[7]: *** [/tmp/ccTC6Hk9.ltrans9.ltrans.o] Error 1
> >
> > Indeed, it looks like the bug is that search_line_sse42 gets inlined int
> > end_directive that is compiled w/o SSE support.  Probably something that
> > happened previously, too, just led to compiling the function with
> > SSE4.2
> >
> > I will need to setup -m32 LTO bootstrap enviornment...
> >
> 
> This is -m64 LTO, not -m32.
OK then the message seems bogus, too.  I will try to reproduce it.

Honza
> 
> 
> -- 
> H.J.


Re: Stream out default optimization nodes

2014-11-18 Thread H.J. Lu
On Tue, Nov 18, 2014 at 9:27 AM, Jan Hubicka  wrote:
>> https://gcc.gnu.org/ml/gcc-regression/2014-11/msg00473.html
>>
>> /export/gnu/import/git/gcc-test-profiled/bld/./prev-gcc/xg++
>> -B/export/gnu/import/git/gcc-test-profiled/bld/./prev-gcc/
>> -B/usr/5.0.0/x86_64-unknown-linux-gnu/bin/ -nostdinc++
>> -B/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
>> -B/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
>>  
>> -I/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu
>>  
>> -I/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include
>>  -I/export/gnu/import/git/gcc-test-profiled/src-trunk/libstdc++-v3/libsupc++
>> -L/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
>> -L/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
>>   -g -O2 -flto=jobserver -frandom-seed=1 -fprofile-use -DIN_GCC
>> -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
>> -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
>> -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
>> -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE
>> -static-libstdc++ -static-libgcc  -o build/genmatch \
>> build/genmatch.o ../libcpp/libcpp.a ../libiberty/libiberty.a
>> build/errors.o build/vec.o build/hash-table.o
>> .././libiberty/libiberty.a
>> ../../src-trunk/libcpp/lex.c: In function âend_directiveâ:
>> ../../src-trunk/libcpp/lex.c:442:43: error:
>> â__builtin_ia32_pcmpestri128â needs isa option -m32 -msse4.2
>>index = __builtin_ia32_pcmpestri128 (search, 4, sv, 16, 0);
>>^
>> make[7]: *** [/tmp/ccTC6Hk9.ltrans9.ltrans.o] Error 1
>
> Indeed, it looks like the bug is that search_line_sse42 gets inlined int
> end_directive that is compiled w/o SSE support.  Probably something that
> happened previously, too, just led to compiling the function with
> SSE4.2
>
> I will need to setup -m32 LTO bootstrap enviornment...
>

This is -m64 LTO, not -m32.


-- 
H.J.


Re: Stream out default optimization nodes

2014-11-18 Thread Jan Hubicka
> https://gcc.gnu.org/ml/gcc-regression/2014-11/msg00473.html
> 
> /export/gnu/import/git/gcc-test-profiled/bld/./prev-gcc/xg++
> -B/export/gnu/import/git/gcc-test-profiled/bld/./prev-gcc/
> -B/usr/5.0.0/x86_64-unknown-linux-gnu/bin/ -nostdinc++
> -B/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
> -B/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
>  
> -I/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu
>  
> -I/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include
>  -I/export/gnu/import/git/gcc-test-profiled/src-trunk/libstdc++-v3/libsupc++
> -L/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
> -L/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
>   -g -O2 -flto=jobserver -frandom-seed=1 -fprofile-use -DIN_GCC
> -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
> -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
> -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
> -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE
> -static-libstdc++ -static-libgcc  -o build/genmatch \
> build/genmatch.o ../libcpp/libcpp.a ../libiberty/libiberty.a
> build/errors.o build/vec.o build/hash-table.o
> .././libiberty/libiberty.a
> ../../src-trunk/libcpp/lex.c: In function âend_directiveâ:
> ../../src-trunk/libcpp/lex.c:442:43: error:
> â__builtin_ia32_pcmpestri128â needs isa option -m32 -msse4.2
>index = __builtin_ia32_pcmpestri128 (search, 4, sv, 16, 0);
>^
> make[7]: *** [/tmp/ccTC6Hk9.ltrans9.ltrans.o] Error 1

Indeed, it looks like the bug is that search_line_sse42 gets inlined int
end_directive that is compiled w/o SSE support.  Probably something that
happened previously, too, just led to compiling the function with
SSE4.2

I will need to setup -m32 LTO bootstrap enviornment...

Honza


RE: [PATCH] MIPS/GCC: Unconditional jump generation bug fix

2014-11-18 Thread Moore, Catherine


> -Original Message-
> From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
> Sent: Tuesday, November 18, 2014 12:22 PM
> To: Rozycki, Maciej
> Cc: gcc-patches@gcc.gnu.org; Moore, Catherine; Eric Christopher
> Subject: RE: [PATCH] MIPS/GCC: Unconditional jump generation bug fix
> 
> Maciej W. Rozycki  writes:
> > On Mon, 17 Nov 2014, Matthew Fortune wrote:
> >
> > > > gcc/
> > > > * gcc/config/mips/mips.md (*jump_absolute): Use a branch when in
> > > > range, a jump otherwise.
> > >
> > > OK.
> > >
> > > I only got my head around this code last week otherwise I wouldn't
> > > have known whether this was correct!
> >
> >  Committed now, thanks for your review.  How about 4.9? -- once it is
> > unfrozen, that is.
> 
> I admit to being a bit more nervous about 4.9 but the test coverage seems
> thorough enough. I guess I would have been less concerned if the
> optimisation was still just tied to TARGET_MICROMIPS for the 4.9 branch.
> 
> Catherine, what do you think?
> 
This is okay for 4.9 IMO.


RE: [PATCH] MIPS/GCC: Unconditional jump generation bug fix

2014-11-18 Thread Matthew Fortune
Maciej W. Rozycki  writes:
> On Mon, 17 Nov 2014, Matthew Fortune wrote:
> 
> > >   gcc/
> > >   * gcc/config/mips/mips.md (*jump_absolute): Use a branch when in
> > >   range, a jump otherwise.
> >
> > OK.
> >
> > I only got my head around this code last week otherwise I wouldn't
> > have known whether this was correct!
> 
>  Committed now, thanks for your review.  How about 4.9? -- once it is
> unfrozen, that is.

I admit to being a bit more nervous about 4.9 but the test coverage seems
thorough enough. I guess I would have been less concerned if the
optimisation was still just tied to TARGET_MICROMIPS for the 4.9 branch.

Catherine, what do you think?

Matthew


Re: [Patch, Fortran] Convert gfc_fatal_error to common diagnostics

2014-11-18 Thread Manuel López-Ibáñez
On 15 November 2014 10:23, Tobias Burnus  wrote:
> Especially since color diagnostic is now the default [1], it makes sense to
> convert more gfortran diagnostics to use the common diagnostics.
>
> For an example, see [1]. That also brings all the nice features like placing
> the warning option in brackets:
>   Warning: USE statement at (1) has no ONLY qualifier [-Wuse-without-only]
> Adding -Werror changes it to:
>   Error: USE statement at (1) has no ONLY qualifier
> [-Werror=use-without-only]
> -fno-diagnostics-show-caret compactifies the error into a single line
> without showing the source code - and other nice features.

In theory, one should also be able to use:

https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html#Diagnostic-Pragmas

however, I am not sure what is the syntax for Fortran #pragmas.
Perhaps the Fortran FE needs to parse the pragmas first or set it up
to use the pragma handler used by C/C++.

Cheers,

Manuel.


C++ PATCH for c++/63934 (constexpr failure on ARM)

2014-11-18 Thread Jason Merrill
On ARM constructors return a pointer, so the test for void return 
breaks.  We also don't need to mess with the CONSTRUCTOR we built up in 
the new scheme.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ddfea2b5af06860433d32ad440673c2df37dc8f1
Author: Jason Merrill 
Date:   Tue Nov 18 11:14:22 2014 -0500

	PR c++/63934
	* constexpr.c (cxx_eval_call_expression): Check DECL_CONSTRUCTOR_P
	rather than VOID_TYPE_P.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 67d039e..691dd78 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1327,7 +1327,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 	addr, non_constant_p, overflow_p,
 	&jump_target);
 
-	  if (VOID_TYPE_P (TREE_TYPE (res)))
+	  if (DECL_CONSTRUCTOR_P (fun))
 		/* This can be null for a subobject constructor call, in
 		   which case what we care about is the initialization
 		   side-effects rather than the value.  We could get at the
@@ -1366,7 +1366,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 	{
 	  /* If this was a call to initialize an object, set the type of
 	 the CONSTRUCTOR to the type of that object.  */
-	  if (DECL_CONSTRUCTOR_P (fun))
+	  if (DECL_CONSTRUCTOR_P (fun) && !use_new_call)
 	{
 	  tree ob_arg = get_nth_callarg (t, 0);
 	  STRIP_NOPS (ob_arg);


[PATCH, PR62167] Fix tail-merge pass for dead type-unsafe code

2014-11-18 Thread Tom de Vries

Richard,

this (trunk) patch fixes PR62167.

The patch fixes a problem that triggers with the test-case on the 4.8 branch, 
when tail-merge makes a dead type-unsafe load alive.


I'm not able to reproduce this bug on 4.9 and trunk with the same test-case. On 
those branches, the tail-merge already does not happen.


The reason for the difference is as follows: With 4.8 the two phi arguments of 
the phi in the tail block are value-numbered identically:

...
SCC consists of: p_14
Value numbering p_14 stmt = p_14 = MEM[(struct head *)_13].first;
Setting value number of p_14 to p_14 (changed)

SCC consists of: p_15
Value numbering p_15 stmt = p_15 = _13->next;
Setting value number of p_15 to p_14 (changed)
...

With 4.9 (and trunk), that's not the case:
...
SCC consists of: p_14
Value numbering p_14 stmt = p_14 = MEM[(struct head *)&heads][k.1_9].first;
Setting value number of p_14 to p_14 (changed)

SCC consists of: p_15
Value numbering p_15 stmt = p_15 = _13->next;
Setting value number of p_15 to p_15 (changed)
...

I'm not sure the bug triggers on trunk and 4.9, but I see no reason why it could 
not trigger, so I'd prefer to apply the patch to 4.9 and trunk as well.


The patch introduces an xfail for pr51879-12.c. I can follow up with a patch to 
improve upon that, but I think that's better limited to trunk only.


Bootstrapped and reg-tested on x86_64/trunk.

OK for trunk/stage3, 4.8, 4.9?

Thanks,
- Tom
2014-11-17  Tom de Vries  

	PR tree-optimization/62167
	* tree-ssa-tail-merge.c (stmt_local_def): Handle statements with vuse
	conservatively.
	(gimple_equal_p): Don't use vn_valueize to compare for lhs equality of
	assigns.

	* gcc.dg/pr51879-12.c: Add xfails.
	* gcc.dg/pr62167-run.c: New test.
	* gcc.dg/pr62167.c: New test.
---
 gcc/testsuite/gcc.dg/pr51879-12.c  |  4 +--
 gcc/testsuite/gcc.dg/pr62167-run.c | 47 +++
 gcc/testsuite/gcc.dg/pr62167.c | 50 ++
 gcc/tree-ssa-tail-merge.c  |  6 +++--
 4 files changed, 103 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr62167-run.c
 create mode 100644 gcc/testsuite/gcc.dg/pr62167.c

diff --git a/gcc/testsuite/gcc.dg/pr51879-12.c b/gcc/testsuite/gcc.dg/pr51879-12.c
index 8126505..85e2687 100644
--- a/gcc/testsuite/gcc.dg/pr51879-12.c
+++ b/gcc/testsuite/gcc.dg/pr51879-12.c
@@ -24,6 +24,6 @@ foo (int y)
   baz (a);
 }
 
-/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */
-/* { dg-final { scan-tree-dump-times "bar2 \\(" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "bar2 \\(" 1 "pre" { xfail *-*-* } } } */
 /* { dg-final { cleanup-tree-dump "pre" } } */
diff --git a/gcc/testsuite/gcc.dg/pr62167-run.c b/gcc/testsuite/gcc.dg/pr62167-run.c
new file mode 100644
index 000..37214a3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr62167-run.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-tail-merge" } */
+
+struct node
+{
+  struct node *next;
+  struct node *prev;
+};
+
+struct node node;
+
+struct head
+{
+  struct node *first;
+};
+
+struct head heads[5];
+
+int k = 2;
+
+struct head *head = &heads[2];
+
+int
+main ()
+{
+  struct node *p;
+
+  node.next = (void*)0;
+
+  node.prev = (void *)head;
+
+  head->first = &node;
+
+  struct node *n = head->first;
+
+  struct head *h = &heads[k];
+
+  heads[2].first = n->next;
+
+  if ((void*)n->prev == (void *)h)
+p = h->first;
+  else
+/* Dead tbaa-unsafe load from ((struct node *)&heads[2])->next.  */
+p = n->prev->next;
+
+  return !(p == (void*)0);
+}
diff --git a/gcc/testsuite/gcc.dg/pr62167.c b/gcc/testsuite/gcc.dg/pr62167.c
new file mode 100644
index 000..f8c31a0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr62167.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-tail-merge -fdump-tree-pre" } */
+
+struct node
+{
+  struct node *next;
+  struct node *prev;
+};
+
+struct node node;
+
+struct head
+{
+  struct node *first;
+};
+
+struct head heads[5];
+
+int k = 2;
+
+struct head *head = &heads[2];
+
+int
+main ()
+{
+  struct node *p;
+
+  node.next = (void*)0;
+
+  node.prev = (void *)head;
+
+  head->first = &node;
+
+  struct node *n = head->first;
+
+  struct head *h = &heads[k];
+
+  heads[2].first = n->next;
+
+  if ((void*)n->prev == (void *)h)
+p = h->first;
+  else
+/* Dead tbaa-unsafe load from ((struct node *)&heads[2])->next.  */
+p = n->prev->next;
+
+  return !(p == (void*)0);
+}
+
+/* { dg-final { scan-tree-dump-not "Removing basic block" "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index 303bd5e..1651985 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -326,7 +326,8 @@ stmt_local_def (gimple stmt)
 
   if (gimple_vdef (stmt) != NULL_TREE
   || gimple_has_side_effects (stmt)
-  || gimple_could_trap_p_1 (stmt, false, false))
+  

AIX libstdc++ failures

2014-11-18 Thread David Edelsohn
Jason,

After your recent C++ patches from yesterday, I am encountering a huge
number of libstdc++ failures on AIX preventing creation of
executables.  They all are of the form:

/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
424:31:   in constexpr expansion of 'std::__detail::_BracketMatcher<
, , 
>::_S_cache_size, false, false>()'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
425:20: error: '(int)(1ul * 8ul)' is not a constant expression
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
424:31:   in constexpr expansion of 'std::__detail::_BracketMatcher<
, , 
>::_S_cache_size, false, true>()'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
425:20: error: '(int)(1ul * 8ul)' is not a constant expression
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
424:31:   in constexpr expansion of 'std::__detail::_BracketMatcher<
, , 
>::_S_cache_size, true, false>()'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
425:20: error: '(int)(1ul * 8ul)' is not a constant expression
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
424:31:   in constexpr expansion of 'std::__detail::_BracketMatcher<
, , 
>::_S_cache_size, true, true>()'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
425:20: error: '(int)(1ul * 8ul)' is not a constant expression

The full output for one testcase is attached.

Thanks, David
Executing on host: /tmp/20141118/./gcc/xg++ -shared-libgcc 
-B/tmp/20141118/./gcc -nostdinc++ 
-L/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/src 
-L/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/src/.libs 
-L/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/libsupc++/.libs 
-B/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20141118/powerpc-ibm-aix7.1.0.0/bin/
 
-B/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20141118/powerpc-ibm-aix7.1.0.0/lib/
 -isystem 
/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20141118/powerpc-ibm-aix7.1.0.0/include
 -isystem 
/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20141118/powerpc-ibm-aix7.1.0.0/sys-include
 -B/tmp/20141118/powerpc-ibm-aix7.1.0.0/./libstdc++-v3/src/.libs 
-D_GLIBCXX_ASSERT -fmessage-length=0 -ffunction-sections -fdata-sections -g -O2 
-DLOCALEDIR="." -nostdinc++ 
-I/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/powerpc-ibm-aix7.1.0.0
 -I/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include 
-I/nasfarm/edelsohn/src/src/libstdc++-v3/libsupc++ 
-I/nasfarm/edelsohn/src/src/libstdc++-v3/include/backward 
-I/nasfarm/edelsohn/src/src/libstdc++-v3/testsuite/util 
/nasfarm/edelsohn/src/src/libstdc++-v3/testsuite/28_regex/iterators/regex_token_iterator/char/string_01.cc
   -std=gnu++11 ./libtestc++.a 
/gsa/yktgsa/home/e/d/edelsohn/install/lib/libiconv.a  -lm   -o ./string_01.exe  
  (timeout = 600)
In file included from 
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/regex:61:0,
 from 
/nasfarm/edelsohn/src/src/libstdc++-v3/testsuite/28_regex/iterators/regex_token_iterator/char/string_01.cc:26:
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.h:
 In instantiation of 'struct 
std::__detail::_BracketMatcher, false, false>':
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.tcc:403:53:
   required from 'void 
std::__detail::_Compiler<_TraitsT>::_M_insert_character_class_matcher() [with 
bool __icase = false; bool __collate = false; _TraitsT = 
std::regex_traits]'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.tcc:321:2:
   required from 'bool std::__detail::_Compiler<_TraitsT>::_M_atom() [with 
_TraitsT = std::regex_traits]'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.tcc:139:7:
   required from 'bool std::__detail::_Compiler<_TraitsT>::_M_term() [with 
_TraitsT = std::regex_traits]'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.tcc:121:7:
   required from 'void std::__detail::_Compiler<_TraitsT>::_M_alternative() 
[with _TraitsT = std::regex_traits]'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.tcc:97:7:
   required from 'void std::__detail::_Compiler<_TraitsT>::_M_disjunction() 
[with _TraitsT = std::regex_traits]'
/tmp/20141118/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/regex_compiler.tcc:82:7:
   required from 
'std::__detail::_Compiler<_TraitsT>::_Compiler(std::__detail::_Compiler<_TraitsT>::_IterT,
 std::__detail::_Compiler<_TraitsT>::_IterT, const typename 
_TraitsT::locale_type&a

Re: Fix speculation in ipa-cp

2014-11-18 Thread Martin Jambor
Hi,

On Sun, Nov 16, 2014 at 12:56:45AM +0100, Jan Hubicka wrote:
> Hi,
> this patch enables propagation of speculative contextes I promised to fix 
> after Martin's
> merge. There were few bugs that ended up disturbing testsuite:

Wonderful, thanks a lot.

> 
>  1) ipa_polymorphic_call_context::combine_with did not handle very well case
> where the incomming type is in construction and it's current type is 
> known.
> This made us to drop the ball on one of devirt-*.C testcases
>  2) ipa_get_indirect_edge_target_1 did not correctly apply the offset 
> adjustment
> and type_preserved prior using the known context.  This caused an ICE 
> while
> building Firefox

Opps, I was worried I would accidently remove something like that.

>  4) I noticed that find_more_scalar_values_for_callers_subset seems to 
> contain a bug
> when searching if all incomming edges do contribute a same constant to a
> given parameter.  The code seems to set the constant to NULL and then set 
> it
> to non-NULL at first occurence.  When it however hits two different 
> constants it
> resets back to NULL and later it may get again non-NULL.

I don't there was a bug.  When it was reset back to NULL, there was
the break statement that quit the loop?  Your new condition (!first &&
!newval) will never trigger because in the second (or any later)
iteration, either we had hit the break statement or set newval to a
non-NULL value.

(If you however still think the code was wrong, it should be also
fixed on release branches.)

> 
> Otherwise the patch just disable parts where speculation si cleared; it makes 
> equal_to
> to work better and it also add meet operation that is used to compute context 
> of edges
> that have multiple callers.

I admit I did not expect the ipa-cp parts to be this small.

>   * ipa-polymorphic-call.c
>   (ipa_polymorphic_call_context::speculation_consistent_p): Constify.
>   (ipa_polymorphic_call_context::meet_speculation_with): New function.
>   (ipa_polymorphic_call_context::combine_with): Handle types in 
> construction
>   better.
>   (ipa_polymorphic_call_context::equal_to): Do not bother about useless
>   speculation.
>   (ipa_polymorphic_call_context::meet_with): New function.
>   * cgraph.h (class ipa_polymorphic_call_context): Add
>   meet_width, meet_speculation_with; constify speculation_consistent_p.
>   * ipa-cp.c (ipa_context_from_jfunc): Handle speculation; combine with 
> incomming
>   context.
>   (propagate_context_accross_jump_function): Likewise; be more cureful.
>   about set_contains_variable.
>   (ipa_get_indirect_edge_target_1): Fix handling of dynamic type changes.
>   (find_more_scalar_values_for_callers_subset): Fix.
>   (find_more_contexts_for_caller_subset): Perform meet operation.
> Index: ipa-polymorphic-call.c
> ===
> --- ipa-polymorphic-call.c(revision 217609)
> +++ ipa-polymorphic-call.c(working copy)
> @@ -1727,7 +1727,7 @@ bool
>  ipa_polymorphic_call_context::speculation_consistent_p (tree spec_outer_type,
>   HOST_WIDE_INT 
> spec_offset,
>   bool 
> spec_maybe_derived_type,
> - tree otr_type)
> + tree otr_type) const
>  {
>if (!flag_devirtualize_speculatively)
>  return false;
> @@ -1873,6 +1873,102 @@ ipa_polymorphic_call_context::combine_sp
>return false;
>  }
>  
> +/* Make speculation less specific so
> +   NEW_OUTER_TYPE, NEW_OFFSET, NEW_MAYBE_DERIVED_TYPE is also included.
> +   If OTR_TYPE is set, assume the context is used with OTR_TYPE.  */
> +

It would be nice if the return value was documented too.

> +bool
> +ipa_polymorphic_call_context::meet_speculation_with
> +   (tree new_outer_type, HOST_WIDE_INT new_offset, bool 
> new_maybe_derived_type,
> +tree otr_type)
> +{
> +  if (!new_outer_type && speculative_outer_type)
> +{
> +  clear_speculation ();
> +  return true;
> +}
> +
> +  /* restrict_to_inner_class may eliminate wrong speculation making our job
> + easeier.  */
> +  if (otr_type)
> +restrict_to_inner_class (otr_type);
> +
> +  if (!speculative_outer_type
> +  || !speculation_consistent_p (speculative_outer_type,
> + speculative_offset,
> + speculative_maybe_derived_type,
> + otr_type))
> +return false;
> +
> +  if (!speculation_consistent_p (new_outer_type, new_offset,
> +  new_maybe_derived_type, otr_type))
> +{
> +  clear_speculation ();
> +  return true;
> +}
> +
> +  else if (types_must_be_same_for_odr (speculative_outer_type,
> +new_outer_type))
> +{
> 

RE: [PATCH] microMIPS/GCC: Correct 64-bit instruction sizes

2014-11-18 Thread Maciej W. Rozycki
On Mon, 17 Nov 2014, Moore, Catherine wrote:

> > * config/mips/mips.md (compression): Add `micromips32' setting.
> > (enabled, length): Handle it.
> > (shift_compression): Replace `micromips' with `micromips32' in
> > the `compression' attribute.
> > (*add3, sub3): Likewise.
> 
> Yes, this looks good.

 Applied now, thanks for your review.

  Maciej


RE: [PATCH] MIPS/GCC: Unconditional jump generation bug fix

2014-11-18 Thread Maciej W. Rozycki
On Mon, 17 Nov 2014, Matthew Fortune wrote:

> > gcc/
> > * gcc/config/mips/mips.md (*jump_absolute): Use a branch when in
> > range, a jump otherwise.
> 
> OK.
> 
> I only got my head around this code last week otherwise I wouldn't have
> known whether this was correct!

 Committed now, thanks for your review.  How about 4.9? -- once it is 
unfrozen, that is.

  Maciej


Re: [PATCH, MPX wrappers 1/3] Add MPX wrappers library

2014-11-18 Thread Ilya Enkovich
On 15 Nov 00:10, Jeff Law wrote:
> On 11/14/14 10:26, Ilya Enkovich wrote:
> >Hi,
> >
> >This patch introduces a simple library with several wrappers to be used with 
> >MPX and Pointer Bounds Checker.  Wrappers allow to obtain, copy and just 
> >keep alive bounds whrough widely use library calls.  It significantly 
> >increases checking  quality.
> >
> >Thanks,
> >Ilya
> >--
> >gcc/
> >
> >2014-11-14  Ilya Enkovich  
> >
> > * gcc.c (MPX_SPEC): Add wrappers library.
> >
> >libmpx/
> >
> >2014-11-14  Ilya Enkovich  
> >
> > * Makefile.am (SUBDIRS): New.
> > (MAKEOVERRIDES): New.
> > * Makefile.in: Regenerate.
> > * configure.ac: Add mpxintr/Makefile to config
> > files.
> > * configure: Regenerate.
> > * mpxwrap/Makefile.am: New.
> > * mpxwrap/Makefile.in: New.
> > * mpxwrap/libtool-version: New.
> > * mpxwrap/mpx_wrappers.cc: New.
> As Joseph mentioned, symbol versioning.  Anytime a target side
> library is added to GCC, it should be properly versioned.
> 
> Don't forget copyright headers in the new files.  Remember it has to
> be suitable for embeddeding in the target without infecting the
> target with the GPL.  LGPL or GPL + exception clause seem the most
> appropriate to me.
> 
> 
> Jeff
> 

Thank you for review!  Here is a version with license and versioning added.

Thanks,
Ilya
--
gcc/

2014-11-18  Ilya Enkovich  

* gcc.c (MPX_SPEC): Add wrappers library.

libmpx/

2014-11-18  Ilya Enkovich  

* Makefile.am (SUBDIRS): New.
(MAKEOVERRIDES): New.
* Makefile.in: Regenerate.
* configure.ac: Add mpxintr/Makefile to config
files.
* configure: Regenerate.
* mpxwrap/Makefile.am: New.
* mpxwrap/Makefile.in: New.
* mpxwrap/libtool-version: New.
* mpxwrap/mpx_wrappers.cc: New.
* mpxwrap/libmpxwrappers.map: New.


diff --git a/gcc/gcc.c b/gcc/gcc.c
index 8ee7bba..56d7e79 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -811,9 +811,10 @@ proper position among the other output files.  */
 
 #ifndef MPX_SPEC
 #define MPX_SPEC "\
-%{!nostdlib:%{!nodefaultlibs:%{mmpx:\
+%{!nostdlib:%{!nodefaultlibs:%{mmpx:%{fcheck-pointer-bounds:\
 %{static:%nMPX runtime is disabled due to -static used}\
-%{!static:-lmpx"
+%{!static:-lmpx}\
+%{!fno-chkp-use-wrappers:-lmpxwrappers}"
 #endif
 
 /* -u* was put back because both BSD and SysV seem to support it.  */
diff --git a/libmpx/Makefile.am b/libmpx/Makefile.am
index c6b479f..553ad30 100644
--- a/libmpx/Makefile.am
+++ b/libmpx/Makefile.am
@@ -1,6 +1,9 @@
 ACLOCAL_AMFLAGS = -I .. -I ../config
 
-SUBDIRS = mpxrt
+SUBDIRS = mpxrt mpxwrap
+
+## May be used by toolexeclibdir.
+gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
@@ -39,3 +42,5 @@ AM_MAKEFLAGS = \
"PICFLAG=$(PICFLAG)" \
"RANLIB=$(RANLIB)" \
"DESTDIR=$(DESTDIR)"
+
+MAKEOVERRIDES =
diff --git a/libmpx/configure.ac b/libmpx/configure.ac
index 9a761c4..f3abead 100644
--- a/libmpx/configure.ac
+++ b/libmpx/configure.ac
@@ -110,7 +110,7 @@ fi
 
 AC_CONFIG_FILES([Makefile])
 AC_CONFIG_HEADERS(config.h)
-AC_CONFIG_FILES(AC_FOREACH([DIR], [mpxrt], [DIR/Makefile]),
+AC_CONFIG_FILES(AC_FOREACH([DIR], [mpxrt mpxwrap], [DIR/Makefile ]),
   [cat > vpsed$$ << \_EOF
 s!`test -f '$<' || echo '$(srcdir)/'`!!
 _EOF
diff --git a/libmpx/mpxwrap/Makefile.am b/libmpx/mpxwrap/Makefile.am
new file mode 100644
index 000..c254d9b
--- /dev/null
+++ b/libmpx/mpxwrap/Makefile.am
@@ -0,0 +1,54 @@
+ALCLOCAL_AMFLAGS = -I .. -I ../config
+
+# May be used by toolexeclibdir.
+gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
+
+libmpxwrappers_la_CFLAGS = -fcheck-pointer-bounds -mmpx -fno-chkp-check-read \
+  -fno-chkp-check-write -fno-chkp-use-wrappers
+libmpxwrappers_la_DEPENDENCIES = libmpxwrappers.map
+libmpxwrappers_la_LDFLAGS = -Wl,--version-script=$(srcdir)/libmpxwrappers.map
+
+toolexeclib_LTLIBRARIES = libmpxwrappers.la
+
+libmpxwrappers_la_SOURCES = mpx_wrappers.c
+
+# Work around what appears to be a GNU make bug handling MAKEFLAGS
+# values defined in terms of make variables, as is the case for CC and
+# friends when we are called from the top level Makefile.
+AM_MAKEFLAGS = \
+   "AR_FLAGS=$(AR_FLAGS)" \
+   "CC_FOR_BUILD=$(CC_FOR_BUILD)" \
+   "CFLAGS=$(CFLAGS)" \
+   "CXXFLAGS=$(CXXFLAGS)" \
+   "CFLAGS_FOR_BUILD=$(CFLAGS_FOR_BUILD)" \
+   "CFLAGS_FOR_TARGET=$(CFLAGS_FOR_TARGET)" \
+   "INSTALL=$(INSTALL)" \
+   "INSTALL_DATA=$(INSTALL_DATA)" \
+   "INSTALL_PROGRAM=$(INSTALL_PROGRAM)" \
+   "INSTALL_SCRIPT=$(INSTALL_SCRIPT)" \
+   "JC1FLAGS=$(JC1FLAGS)" \
+   "LDFLAGS=$(LDFLAGS)" \
+   "LIBCFLAGS=$(LIBCFLAGS)" \
+   "LIBCFLAGS_FOR_TARGET=$(LIBCFLAGS_FOR_TARGET)" \
+   "MAKE=$(MAKE)" \
+   "MAKEINFO=$(MAKEINFO) $(MAKEINF

Re: [PATCH] gcc/c-family/c-cppbuiltin.c: Use 20 instead of 18 for the maximized 64-bits integer decimal string length

2014-11-18 Thread Joseph Myers
On Sun, 16 Nov 2014, Chen Gang wrote:

> The maximize 64-bits integer decimal string length excluding NUL is 20 (
> '-9223372036854775808'), so need use 20 instead of 18 for HOST_WIDE_INT.
> 
> 2014-11-16  Chen Gang  
> 
>   * c-family/c-cppbuiltin.c (builtin_define_with_int_value):  Use
>   20 instead of 18 for the maximize 64-bits integer decimal
>   string length

OK.  (Though it's not a good idea to use builtin_define_with_int_value 
with large arguments, as it won't generate any suffixes for arguments 
outside the range of target int, and -9223372036854775808 would actually 
need to be expressed as (-9223372036854775807LL-1).  I think it's OK that 
it doesn't parenthesize negative numbers when outputting them - that the 
only cases for which the lack of parentheses could affect the parse are 
invalid for other reasons - though parentheses around negative numbers 
output might be a good idea anyway to make it obvious the output is OK, 
and would accord with how some macros such as __*_MIN_EXP__ are output; 
those values should probably use builtin_define_with_int_value.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, MPX wrappers 2/3] Replace some function calls with wrapper calls during instrumentation

2014-11-18 Thread Ilya Enkovich
On 18 Nov 16:23, Joseph Myers wrote:
> On Tue, 18 Nov 2014, Ilya Enkovich wrote:
> 
> > +@item -fcheck-pointer-bounds
> > +@opindex fcheck-pointer-bounds
> > +@opindex fno-check-pointer-bounds
> > +Enable Pointer Bounds Checker instrumentation.  Each memory reference
> > +is instrumented with checks of pointer used for memory access against
> > +bounds associated with that pointer.  Generated instrumentation may
> > +be controlled by various @option{-fchkp-*} options.
> 
> If this is only operational given -mmpx and when the generated code is run 
> on a processor supporting MPX, I think the documentation needs to make 
> that clear.
> 
> > +@item -fchkp-use-fast-string-functions
> > +@opindex fchkp-use-fast-string-functions
> > +@opindex fno-chkp-use-fast-string-functions
> > +Allow to use *_nobnd versions of string functions (not copying bounds)
> > +by Pointer Bounds Checker.  Disabled by default.
> 
> @code{*_nobnd}.
> 
> > +@item -fchkp-use-nochk-string-functions
> > +@opindex fchkp-use-nochk-string-functions
> > +@opindex fno-chkp-use-nochk-string-functions
> > +Allow to use *_nochk versions of string functions (not checking bounds)
> > +by Pointer Bounds Checker.  Disabled by default.
> 
> @code{*_nochk).
> 
> > +@item -fchkp-instrument-marked-only
> > +@opindex fchkp-instrument-marked-only
> > +@opindex fno-chkp-instrument-marked-only
> > +Instructs Pointer Bounds Checker to instrument only functions
> > +marked with bnd_instrument attribute.  Disabled by default.
> 
> @code{bnd_instrument}.
> 
> > +@item -fchkp-use-wrappers
> > +@opindex fchkp-use-wrappers
> > +@opindex fno-chkp-use-wrappers
> > +Allows Pointer Bounds Checker to replace calls to builtin function
> > +with calls to wrapper functions.  Enabled by default.
> 
> "built-in functions".
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com

Thank you for comments!  Below is a fixed version.

Ilya
--
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 85dcb98..a9f8f99 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1040,6 +1040,10 @@ fchkp-instrument-marked-only
 C ObjC C++ ObjC++ LTO Report Var(flag_chkp_instrument_marked_only) Init(0)
 Instrument only functions marked with bnd_instrument attribute.
 
+fchkp-use-wrappers
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_use_wrappers) Init(1)
+Transform instrumented builtin calls into calls to wrappers.
+
 fcilkplus
 C ObjC C++ ObjC++ LTO Report Var(flag_cilkplus) Init(0)
 Enable Cilk Plus
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 89edddb..60056da 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -299,6 +299,15 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-d@var{letters}  -dumpspecs  -dumpmachine  -dumpversion @gol
 -fsanitize=@var{style} -fsanitize-recover -fsanitize-recover=@var{style} @gol
 -fasan-shadow-offset=@var{number} -fsanitize-undefined-trap-on-error @gol
+-fcheck-pointer-bounds -fchkp-check-incomplete-type @gol
+-fchkp-first-field-has-own-bounds -fchkp-narrow-bounds @gol
+-fchkp-narrow-to-innermost-array -fchkp-optimize @gol
+-fchkp-use-fast-string-functions -fchkp-use-nochk-string-functions @gol
+-fchkp-use-static-bounds -fchkp-use-static-const-bounds @gol
+-fchkp-treat-zero-dynamic-size-as-infinite -fchkp-check-read @gol
+-fchkp-check-read -fchkp-check-write -fchkp-store-bounds @gol
+-fchkp-instrument-calls -fchkp-instrument-marked-only @gol
+-fchkp-use-wrappers @gol
 -fdbg-cnt-list -fdbg-cnt=@var{counter-value-list} @gol
 -fdisable-ipa-@var{pass_name} @gol
 -fdisable-rtl-@var{pass_name} @gol
@@ -5693,6 +5702,119 @@ a @code{libubsan} library routine.  The advantage of 
this is that the
 @code{libubsan} library is not needed and will not be linked in, so this
 is usable even for use in freestanding environments.
 
+@item -fcheck-pointer-bounds
+@opindex fcheck-pointer-bounds
+@opindex fno-check-pointer-bounds
+Enable Pointer Bounds Checker instrumentation.  Each memory reference
+is instrumented with checks of pointer used for memory access against
+bounds associated with that pointer.  Generated instrumentation may
+be controlled by various @option{-fchkp-*} options.  Currently there
+is only Intel MPX based implementation available, thus i386 target
+and @option{-mmpx} are required.
+
+@item -fchkp-check-incomplete-type
+@opindex fchkp-check-incomplete-type
+@opindex fno-chkp-check-incomplete-type
+Generate pointer bounds checks for variables with incomplete type.
+Enabled by default
+
+@item -fchkp-narrow-bounds
+@opindex fchkp-narrow-bounds
+@opindex fno-chkp-narrow-bounds
+Controls bounds used by Pointer Bounds Checker for pointers to object
+fields.  If narrowing is enabled then field bounds are used.  Otherwise
+object bounds are used.  See also @option{-fchkp-narrow-to-innermost-array}
+and @option{-fchkp-first-field-has-own-bounds}.  Enabled by default.
+
+@item -fchkp-first-field-has-own-bounds
+@opindex fchkp-first-field-has-own-bounds
+@opindex fno-chkp-first-field-has-own-bounds
+Forces Pointer Bounds Checke

Re: Query about the TREE_TYPE field

2014-11-18 Thread Andrew MacLeod

On 11/18/2014 09:52 AM, Andrew MacLeod wrote:

On 11/18/2014 09:40 AM, Jason Merrill wrote:

On 11/18/2014 09:26 AM, Andrew MacLeod wrote:

I was poking around attribs.c while trial running my tree-type-safety
stuff, and it triggered something in decl_attributes() that seems fishy
to me.  It looks like it was part of the fix for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35315

decl_attributes() can be passed a tree node which is either decl or a
type, but we get to this little snippet:

   if (spec->type_required && DECL_P (*anode))
 {
   anode = &TREE_TYPE (*anode);
   /* Allow ATTR_FLAG_TYPE_IN_PLACE for the type's naming 
decl.  */

   if (!(TREE_CODE (*anode) == TYPE_DECL
 && *anode == TYPE_NAME (TYPE_MAIN_VARIANT
 (TREE_TYPE (*anode)
 flags &= ~(int) ATTR_FLAG_TYPE_IN_PLACE;
 }

anode is changed to point to the TREE_TYPE of the original decl, and
*then* checks if it is a TYPE_DECL...   That doesnt seem right to me..
we can't have a TYPE_DECL as a TREE_TYPE can we?


No.


is that code suppose to be checking is the original DECL is a TYPE_DECL
rather than the TREE_TYPE?


I think so.

Maybe the assignment to anode should be after the if instead of in 
front

of it?


Probably.

Strange that the 35315 patch fixed the testcase with that change 
being a placebo...


Jason

Indeed.   The condition is negated,  so effectively  it becomes an 
always true condition and the ATTR_FLAG_TYPE_IN_PLACE is always turned 
off when a DECL is passed and a type is required..


maybe that works most/all of the time?   huh, that also means the code 
is the same as it was before the patch :-P


maybe one of the follow up patches in bugzilla supercede it?

Andrew

I tried doing the if before chaning to TREE_TYPE...  absolutely no 
effect on the testsuite or anything else :-)


What do you think, should I check this in?   What is there is clearly 
incorrect.we could also revert the original patch since that is what 
the code base is effectively is doing at the moment...


Andrew


	* attribs.c (decl_attributes): Check for TYPE_DECL before switching to
	using the TREE_TYPE.

Index: attribs.c
===
*** attribs.c	(revision 217234)
--- attribs.c	(working copy)
*** decl_attributes (tree *node, tree attrib
*** 501,512 
  	 the decl's type in place here.  */
if (spec->type_required && DECL_P (*anode))
  	{
- 	  anode = &TREE_TYPE (*anode);
  	  /* Allow ATTR_FLAG_TYPE_IN_PLACE for the type's naming decl.  */
  	  if (!(TREE_CODE (*anode) == TYPE_DECL
  		&& *anode == TYPE_NAME (TYPE_MAIN_VARIANT
  	(TREE_TYPE (*anode)
  	flags &= ~(int) ATTR_FLAG_TYPE_IN_PLACE;
  	}
  
if (spec->function_type_required && TREE_CODE (*anode) != FUNCTION_TYPE
--- 501,512 
  	 the decl's type in place here.  */
if (spec->type_required && DECL_P (*anode))
  	{
  	  /* Allow ATTR_FLAG_TYPE_IN_PLACE for the type's naming decl.  */
  	  if (!(TREE_CODE (*anode) == TYPE_DECL
  		&& *anode == TYPE_NAME (TYPE_MAIN_VARIANT
  	(TREE_TYPE (*anode)
  	flags &= ~(int) ATTR_FLAG_TYPE_IN_PLACE;
+ 	  anode = &TREE_TYPE (*anode);
  	}
  
if (spec->function_type_required && TREE_CODE (*anode) != FUNCTION_TYPE


Re: [PING^2][PATCH] GCC/test: Set timeout factor for c11-atomic-exec-5.c

2014-11-18 Thread Maciej W. Rozycki
On Mon, 17 Nov 2014, Mike Stump wrote:

> > http://gcc.gnu.org/ml/gcc-patches/2014-09/msg00242.html
> > 
> > is still waiting, please review.
> 
> Wait no more.
> 
> Ok.

 Applied now, thanks for your review.

  Maciej


RE: [PATCH] Cancel bswap opt when intermediate stmts are reused

2014-11-18 Thread Thomas Preud'homme
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Monday, November 17, 2014 12:47 PM
> 
> Hmm.  I am a little bit concerned about the malloc traffic generated here.
> So why not use a vec, get rid of the ->next pointer and
> use a hash_map  to associate the stmt with
> an index into the vector?

Sure, it even makes things easier. However I don't understand why a vector is
better than malloc. Is it a matter of fragmentation?

> 
> At this point I'd rather leave DCE to DCE ...

I thought since all information is there why not do it. It makes it easier to
read the dump of the pass.

> 
> Ick - do we really have to use gotos here?  Can you refactor this
> a bit to avoid it?

Yes I can. I needed the same kind of thing for fixing PR63761 (r217409) and
found a way to avoid it.

> 
> The whole scheme wouldn't exactly fly with the idea of eventually
> using a lattice to propagate the symbolic numbers, but well...
 
> 
> I think the overall idea is sound and if you have time to adjust according
> to my comments that would be nice.

To be honest I think it should wait for next stage1. After several ping I took
another look at the testcase with the idea of measuring the size reduction
the patch could give and was surprised that in all cases I could construct the
size was actually bigger. Performance might be improved nonetheless but
I think this needs more careful consideration.

And as you said the approach would need to be changed if bswap was
rewritten to do a forward analysis. At last, nobody reported a code size or
performance regression so far due to the changes so that might be a non
issue. If such a report happens, then it will be a good time to revisit that
decision.

Do you agree? 
 

> 
> Sorry for the very late review.

That's alright, I know how it is. Thank you for keeping track of it. I actually
feel sorry I didn't warn about my findings. I thought the patch fell through
the cracks and didn't want to spam gcc-patches uselessly.

Best regards,

Thomas





Re: [PATCH, MPX wrappers 2/3] Replace some function calls with wrapper calls during instrumentation

2014-11-18 Thread Joseph Myers
On Tue, 18 Nov 2014, Ilya Enkovich wrote:

> +@item -fcheck-pointer-bounds
> +@opindex fcheck-pointer-bounds
> +@opindex fno-check-pointer-bounds
> +Enable Pointer Bounds Checker instrumentation.  Each memory reference
> +is instrumented with checks of pointer used for memory access against
> +bounds associated with that pointer.  Generated instrumentation may
> +be controlled by various @option{-fchkp-*} options.

If this is only operational given -mmpx and when the generated code is run 
on a processor supporting MPX, I think the documentation needs to make 
that clear.

> +@item -fchkp-use-fast-string-functions
> +@opindex fchkp-use-fast-string-functions
> +@opindex fno-chkp-use-fast-string-functions
> +Allow to use *_nobnd versions of string functions (not copying bounds)
> +by Pointer Bounds Checker.  Disabled by default.

@code{*_nobnd}.

> +@item -fchkp-use-nochk-string-functions
> +@opindex fchkp-use-nochk-string-functions
> +@opindex fno-chkp-use-nochk-string-functions
> +Allow to use *_nochk versions of string functions (not checking bounds)
> +by Pointer Bounds Checker.  Disabled by default.

@code{*_nochk).

> +@item -fchkp-instrument-marked-only
> +@opindex fchkp-instrument-marked-only
> +@opindex fno-chkp-instrument-marked-only
> +Instructs Pointer Bounds Checker to instrument only functions
> +marked with bnd_instrument attribute.  Disabled by default.

@code{bnd_instrument}.

> +@item -fchkp-use-wrappers
> +@opindex fchkp-use-wrappers
> +@opindex fno-chkp-use-wrappers
> +Allows Pointer Bounds Checker to replace calls to builtin function
> +with calls to wrapper functions.  Enabled by default.

"built-in functions".

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, Pointer Bounds Checker, Builtins instrumentation 3/5] Expand instrumented builtin calls

2014-11-18 Thread Jeff Law

On 11/18/14 02:55, Richard Biener wrote:

On Tue, Nov 18, 2014 at 3:59 AM, Jeff Law  wrote:

On 11/17/14 13:05, Ilya Enkovich wrote:



How comes you emit debug info for functions that do not exist and thus
are never used?  Is problem caused by builtins going after
END_CHKP_BUILTINS? Or some info generated for all builtins ignoring
its existence?


IIRC, stabs aren't pruned based on whether or not they're used.  So every
file that includes tree-core.h is going to have every name/value pair for
the enum.  Hence the major heartburn for AIX.


I wonder how AIX deals with the ever increasing number of target builtins
then.

Well.  Why can't we just drop enum values that exceed the stabs format?
At least I don't understand why we cannot "fix" this in the stabs output.
After all what happens to non-GCC code with such large enums?
I can live with that -- I don't expect we'll have the need to be looking 
at those enum values on AIX often :-)


jeff


Re: [PATCH][wwwdocs] Add Cortex-A53 erratum workaround note to AArch64 changes for 4.9

2014-11-18 Thread Kyrill Tkachov


On 17/11/14 17:14, Marcus Shawcroft wrote:

On 14 November 2014 15:06, Kyrill Tkachov  wrote:

Hi all,

Considering that the erratum workaround option was backported to 4.9, I
assume we'll need an item for that
in the changes.html for that branch?

The text is the same as in the trunk version that I committed recently.

Looks OK to me. /Marcus


Thanks, but I've since reworked this (and the 4.8 version) in response 
to feedback.

The latest version is here:
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02075.html

Kyrill








Re: [PATCH][ARM/AArch64] Improve modeled latency between FP operations and FP->GP register moves

2014-11-18 Thread Ramana Radhakrishnan
On Tue, Nov 11, 2014 at 11:59 AM, Kyrill Tkachov  wrote:
> Hi all,
>
> This patch models the latency of moves between FP and GP registers on the
> A15 and A57 a bit more accurately by splitting the reservations for FP->GP
> and GP->FP moves and adding an appropriate bypass.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf and
> aarch64-none-linux-gnu.
>
> Ok for trunk?

Ok - thanks.

Ramana
>
> Thanks,
> Kyrill
>
> 2014-11-11  Kyrylo Tkachov  
>
> * config/arm/cortex-a15-neon.md (cortex_a15_vfp_to_from_gp):
> Split into...
> (cortex_a15_gp_to_vfp): ...This.
> (cortex_a15_fp_to_gp): ...And this.
> Define and comment bypass from vfp operations to fp->gp moves.


Re: [PATCH] driver: ignore SIGINT while waiting on subprocesses to finish

2014-11-18 Thread Michael Matz
Hi,

On Mon, 17 Nov 2014, Richard Biener wrote:

> This means I can no longer interrupt a compile that is running too long? 

No, that's not what it means, cc1 will also get the SIGINT.

> You should instead debug the actual compiler, not the driver.

-wrapper is specifically also for invoking cc1 with gdb from the driver 
(that's the usecase documented with -wrapper), so it better should work as 
intended.  I don't know what problems Patrick had with that, though.  For 
me gcc -wrapper gdb,--args works as expected (as in ^C interrupts cc1 
returning to gdb).


Ciao,
Michael.


RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-18 Thread Andrew Bennett
> From: Maciej W. Rozycki [mailto:ma...@codesourcery.com]
> 
> On Tue, 18 Nov 2014, Andrew Bennett wrote:
> 
> > Produces (for the atomic operation):
> >
> >.setnoat
> > sync
> > 1:
> > ll  $3,0($5)
> > and $1,$3,$4
> > bne $1,$7,2f
> > and $1,$3,$6
> > or  $1,$1,$8
> > sc  $1,0($5)
> > beql$1,$0,1b
> > nop
> > nop
> > sync
> > 2:
> > .setat
> 
>  With a quick look at `mips_process_sync_loop' it looks to me the
> other NOP is produced from here:
> 
>   else if (!(required_oldval && cmp))
> mips_multi_add_insn ("nop", NULL);
> 
> -- correct?  If so, then can't you just make it:

Correct.

> 
>   else if (!(required_oldval && cmp) && !TARGET_FIX_R1)
> mips_multi_add_insn ("nop", NULL);
> 
> instead?  It appears plain and simple, and if you're concerned about it
> being unobvious to the casual reader, then add a one-line comment or
> suchlike.  It's not like TARGET_FIX_R1 is going to imply anything
> else about branches in the future (and r6 code won't run on an R10k
> anyway).

True.  Actually this is what my original patch had in it, but Matthew and I
decided to leave it out for the initial submission.  I am happy to add
it back in, and provide a comment to explain what is happening. 
 
Catherine are you happy with this?


Regards,


Andrew


RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-18 Thread Maciej W. Rozycki
On Tue, 18 Nov 2014, Andrew Bennett wrote:

> Produces (for the atomic operation):
> 
>.setnoat
> sync
> 1:
> ll  $3,0($5)
> and $1,$3,$4
> bne $1,$7,2f
> and $1,$3,$6
> or  $1,$1,$8
> sc  $1,0($5)
> beql$1,$0,1b
> nop
> nop
> sync
> 2:
> .setat

 With a quick look at `mips_process_sync_loop' it looks to me the
other NOP is produced from here:

  else if (!(required_oldval && cmp))
mips_multi_add_insn ("nop", NULL);

-- correct?  If so, then can't you just make it:

  else if (!(required_oldval && cmp) && !TARGET_FIX_R1)
mips_multi_add_insn ("nop", NULL);

instead?  It appears plain and simple, and if you're concerned about it 
being unobvious to the casual reader, then add a one-line comment or 
suchlike.  It's not like TARGET_FIX_R1 is going to imply anything 
else about branches in the future (and r6 code won't run on an R10k 
anyway).

  Maciej


Re: PATCH: PR bootstrap/63784: [5 Regression] profiledbootstrap failure with bootstrap-lto

2014-11-18 Thread H.J. Lu
On Fri, Nov 14, 2014 at 2:19 AM, Richard Biener
 wrote:
> On Fri, Nov 14, 2014 at 12:15 AM, H.J. Lu  wrote:
>> On Tue, Nov 11, 2014 at 8:02 AM, H.J. Lu  wrote:
>>> On Mon, Nov 10, 2014 at 11:42 AM, H.J. Lu  wrote:
 On Mon, Nov 10, 2014 at 5:44 AM, Richard Biener
  wrote:
> On Mon, Nov 10, 2014 at 2:43 PM, Jakub Jelinek  wrote:
>> On Mon, Nov 10, 2014 at 05:32:32AM -0800, H.J. Lu wrote:
>>> On Mon, Nov 10, 2014 at 4:05 AM, Jakub Jelinek  wrote:
>>> > On Mon, Nov 10, 2014 at 12:50:44PM +0100, Richard Biener wrote:
>>> >> On Sun, Nov 9, 2014 at 5:46 PM, H.J. Lu  wrote:
>>> >> > Hi,
>>> >> >
>>> >> > r216964 disables bootstrap for libcc1 which exposed 2 things:
>>> >> >
>>> >> > 1. libcc1 isn't compiled with LTO even when GCC is configured with
>>> >> > "--with-build-config=bootstrap-lto".  It may be intentional since
>>> >> > libcc1 is disabled for bootstrap.
>>> >> > 2. -fPIC isn't used to created libcc1.so, which is OK if libcc1 is
>>> >> > compiled with LTO which remembers PIC option.
>>> >>
>>> >> Why is this any special to LTO?  If it is then it looks like a LTO
>>> >> (driver) issue to me?  Why are we linking the pic libibterty into
>>> >> a non-pic libcc1?
>>> >
>>> > I admit I haven't tried LTO bootstrap, but from normal bootstrap logs,
>>> > libcc1 is built normally using libtool using -fPIC only, and linked 
>>> > into
>>> > libcc1.so.0.0.0 and libcc1plugin.so.0.0.0, and of course against the
>>> > pic/libiberty.a, because we need PIC code in the shared libraries.
>>> > So, I don't understand the change at all.
>>> >
>>> > Jakub
>>>
>>> This is the command line to build libcc1.la:
>>
>> Sure, but there was -fPIC used to compile all the *.o files that are 
>> being
>> linked into libcc1.so, so LTO should know that.
>
> And it does.  If not please file a bug with a smaller testcase than libcc1
> and libiberty.
>

 There is nothing wrong with linker.  It is a slm-lto bug in libtool.  I 
 uploaded
 a testcase at

 https://gcc.gnu.org/bugzilla/attachment.cgi?id=33931

>>>
>>> My patch is a backport of libtool LTO support:
>>>
>>> commit b81fd4ef009c24a86a7e64727ea09efb410ea149
>>> Author: Ralf Wildenhues 
>>> Date:   Sun Aug 29 17:31:29 2010 +0200
>>>
>>> Support GCC LTO on GNU/Linux.
>>>
>>> * libltdl/config/ltmain.m4sh (func_mode_link): Allow through
>>> flags matching -O*, -flto*, -fwhopr, -fuse-linker-plugin.
>>> * libltdl/m4/libtool.m4 (_LT_CMD_GLOBAL_SYMBOLS): Drop symbols
>>> starting with __gnu_lto.
>>> (_LT_LINKER_SHLIBS) [linux] :
>>> Add $pic_flag for GCC.
>>> (_LT_LANG_CXX_CONFIG) [linux] :
>>> Likewise.
>>> (_LT_SYS_HIDDEN_LIBDEPS): Ignore files matching *.lto.o.
>>> * NEWS: Update.
>>>
>>> Signed-off-by: Ralf Wildenhues 
>>>
>>> OK to install?
>>>
>>
>> Ping.
>>
>> Stage 1 will be closed tomorrow.  I'd like to restore LTO bootstrap.
>
> Bugfixing is still possible after that date.  I suppose you don't call
> LTO bootstrap a new feature ;)
>

New LTO bugs have been introduced since LTO bootstrap was broken
a few weeks ago.  I'd like to check in this one so that we have a chance
to prevent further damage.  Any objects to backport this from libtool
upstream?

-- 
H.J.


Re: Stream out default optimization nodes

2014-11-18 Thread H.J. Lu
On Mon, Nov 17, 2014 at 10:38 AM, Jan Hubicka  wrote:
> Hi,
> this patch makes us to store default optimization node same way as we stream
> target node.  This means that command line options given at compile time
> prevail those given at linktime.  Previously we sort of combined both.
>
> We still have a lot of flags that are global (i.e. not marked as Optimization)
> but affect way how the unit is output.  Since I woul dlike to replace the old
> option merging, I would like to add "Global" attribute to each of them in .opt
> file and generate streaming code for them same way as we do for
> optimization/target nodes.
>
> This patch regtested/bootstrapped x86_64-linux and in ealrier tree also
> ppc64-linux/ppc64-aix that do not work for me at the moment.
> I alosuse it in my tree for some time and tested firefox/libreoffice builds
>
> OK?
> Honza
>
> * tree.c (free_lang_data_in_decl): Store default optimization node.
> Index: tree.c
> ===
> --- tree.c  (revision 217659)
> +++ tree.c  (working copy)
> @@ -5118,6 +5118,9 @@ free_lang_data_in_decl (tree decl)
>   if (!DECL_FUNCTION_SPECIFIC_TARGET (decl))
> DECL_FUNCTION_SPECIFIC_TARGET (decl)
>   = target_option_default_node;
> + if (!DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl))
> +   DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl)
> + = optimization_default_node;
> }
>
>/* DECL_SAVED_TREE holds the GENERIC representation for DECL.

I think one of your LTO streaming change breaks GCC LTO build:

https://gcc.gnu.org/ml/gcc-regression/2014-11/msg00473.html

/export/gnu/import/git/gcc-test-profiled/bld/./prev-gcc/xg++
-B/export/gnu/import/git/gcc-test-profiled/bld/./prev-gcc/
-B/usr/5.0.0/x86_64-unknown-linux-gnu/bin/ -nostdinc++
-B/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
-B/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
 
-I/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu
 
-I/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include
 -I/export/gnu/import/git/gcc-test-profiled/src-trunk/libstdc++-v3/libsupc++
-L/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
-L/export/gnu/import/git/gcc-test-profiled/bld/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
  -g -O2 -flto=jobserver -frandom-seed=1 -fprofile-use -DIN_GCC
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE
-static-libstdc++ -static-libgcc  -o build/genmatch \
build/genmatch.o ../libcpp/libcpp.a ../libiberty/libiberty.a
build/errors.o build/vec.o build/hash-table.o
.././libiberty/libiberty.a
../../src-trunk/libcpp/lex.c: In function âend_directiveâ:
../../src-trunk/libcpp/lex.c:442:43: error:
â__builtin_ia32_pcmpestri128â needs isa option -m32 -msse4.2
   index = __builtin_ia32_pcmpestri128 (search, 4, sv, 16, 0);
   ^
make[7]: *** [/tmp/ccTC6Hk9.ltrans9.ltrans.o] Error 1


-- 
H.J.


Re: [PATCH][AArch64] Adjust generic move costs

2014-11-18 Thread Ramana Radhakrishnan
On Mon, Nov 17, 2014 at 5:13 PM, Marcus Shawcroft
 wrote:
> On 14 November 2014 14:35, Wilco Dijkstra  wrote:
>
>> 2014-11-14  Wilco Dijkstra  
>>
>> * gcc/config/aarch64/aarch64.c (generic_regmove_cost):
>> Increase FP move cost.
>
> OK /Marcus

Changelog should probably indicate PR target/61915 when committing ?

Ramana


[PATCH V2] plugin event for C/C++ function definitions

2014-11-18 Thread Andres Tiraboschi
Hi, this patch adds a new plugin event PLUGIN_START_PARSE_FUNCTION and 
PLUGIN_FINISH_PARSE_FUNCTION that are invoked at start_function and 
finish_function respectively in the C and C++ frontends. 
PLUGIN_START_PARSE_FUNCTION is called before parsing the function body.
PLUGIN_FINISH_PARSE_FUNCTION is called after parsing a function definition.

changelog:

 gcc/c/c-decl.c: Invoke callbacks in start_function and finish_function.

 gcc/cp/decl.c: Invoke callbacks in start_function and finish_function.

 gcc/doc/plugins.texi: Add documentation about PLUGIN_START_FUNCTION and 
PLUGIN_FINISH_FUNCTION
 gcc/plugin.def: Add events for start_function and finish_function. 
 
 gcc/plugin.c (register_callback, invoke_plugin_callbacks): Same.

 gcc/testsuite/g++.dg/plugin/def_plugin.c: New test plugin. 
 
 gcc/testsuite/g++.dg/plugin/def-plugin-test.C: Testcase for above plugin.
 gcc/testsuite/g++.dg/plugin/plugin.exp 

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index e23284a..fea3334 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -8073,6 +8073,7 @@ start_function (struct c_declspecs *declspecs, struct 
c_declarator *declarator,
 
   decl1 = grokdeclarator (declarator, declspecs, FUNCDEF, true, NULL,
  &attributes, NULL, NULL, DEPRECATED_NORMAL);
+  invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl1);
 
   /* If the declarator is not suitable for a function definition,
  cause a syntax error.  */
@@ -8886,6 +8887,7 @@ finish_function (void)
  It's still in DECL_STRUCT_FUNCTION, and we'll restore it in
  tree_rest_of_compilation.  */
   set_cfun (NULL);
+  invoke_plugin_callbacks (PLUGIN_FINISH_PARSE_FUNCTION, 
current_function_decl);
   current_function_decl = NULL;
 }
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d4adbeb..ce2f832 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13631,6 +13631,7 @@ start_function (cp_decl_specifier_seq *declspecs,
   tree decl1;
 
   decl1 = grokdeclarator (declarator, declspecs, FUNCDEF, 1, &attrs);
+  invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl1);
   if (decl1 == error_mark_node)
 return false;
   /* If the declarator is not suitable for a function definition,
@@ -14260,6 +14261,7 @@ finish_function (int flags)
   vec_free (deferred_mark_used_calls);
 }
 
+  invoke_plugin_callbacks (PLUGIN_FINISH_PARSE_FUNCTION, fndecl);
   return fndecl;
 }
 
diff --git a/gcc/doc/plugins.texi b/gcc/doc/plugins.texi
index 4a839b8..1c9e074 100644
--- a/gcc/doc/plugins.texi
+++ b/gcc/doc/plugins.texi
@@ -174,6 +174,8 @@ Callbacks can be invoked at the following pre-determined 
events:
 @smallexample
 enum plugin_event
 @{
+  PLUGIN_START_PARSE_FUNCTION,  /* Called before parsing the body of a 
function. */
+  PLUGIN_FINISH_PARSE_FUNCTION, /* After finishing parsing a function. */
   PLUGIN_PASS_MANAGER_SETUP,/* To hook into pass manager.  */
   PLUGIN_FINISH_TYPE,   /* After finishing parsing a type.  */
   PLUGIN_FINISH_DECL,   /* After finishing parsing a declaration. */
diff --git a/gcc/plugin.c b/gcc/plugin.c
index 8debc09..f7a8b64 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -433,6 +433,8 @@ register_callback (const char *plugin_name,
return;
  }
   /* Fall through.  */
+  case PLUGIN_START_PARSE_FUNCTION:
+  case PLUGIN_FINISH_PARSE_FUNCTION:
   case PLUGIN_FINISH_TYPE:
   case PLUGIN_FINISH_DECL:
   case PLUGIN_START_UNIT:
@@ -511,6 +513,8 @@ invoke_plugin_callbacks_full (int event, void *gcc_data)
gcc_assert (event >= PLUGIN_EVENT_FIRST_DYNAMIC);
gcc_assert (event < event_last);
   /* Fall through.  */
+  case PLUGIN_START_PARSE_FUNCTION:
+  case PLUGIN_FINISH_PARSE_FUNCTION:
   case PLUGIN_FINISH_TYPE:
   case PLUGIN_FINISH_DECL:
   case PLUGIN_START_UNIT:
diff --git a/gcc/plugin.def b/gcc/plugin.def
index df5d383..04faec9 100644
--- a/gcc/plugin.def
+++ b/gcc/plugin.def
@@ -18,6 +18,12 @@ along with GCC; see the file COPYING3.  If not see
 .  */
 
 
+/* Called before parsing the body of a function.  */
+DEFEVENT (PLUGIN_START_PARSE_FUNCTION)
+
+/* After finishing parsing a function. */
+DEFEVENT (PLUGIN_FINISH_PARSE_FUNCTION)
+
 /* To hook into pass manager.  */
 DEFEVENT (PLUGIN_PASS_MANAGER_SETUP)
 
diff --git a/gcc/testsuite/g++.dg/plugin/def-plugin-test.C 
b/gcc/testsuite/g++.dg/plugin/def-plugin-test.C
new file mode 100644
index 000..b7f2d3d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/plugin/def-plugin-test.C
@@ -0,0 +1,13 @@
+int global = 12;
+
+int function1(void);
+
+int function2(int a) // { dg-warning "Start fndef function2" }
+{
+  return function1() + a;
+} //  { dg-warning "Finish fndef function2" }
+
+int function1(void) // { dg-warning "Start fndef function1" }
+{
+  return global + 1;
+} //  { dg-warning "Finish fndef function1" }
diff --gi

Re: [PATCH 4/4] Data structure is used for inline_summary struct.

2014-11-18 Thread Jan Hubicka
> On Fri, Nov 14, 2014 at 08:59:10PM +0100, Jan Hubicka wrote:
> > > > 
> > > > > b) with GTY, we cannot call destructor
> > > > 
> > > > Everything in symbol table is expecitely memory managed (i.e. enver left
> > > > to be freed by garbage collector). It resists in GTY only to allow 
> > > > linking
> > > > garbage collected object from them and to get PCH working.
> > > > 
> > > 
> > > Well, if I understand the intent correctly, summaries are for stuff
> > > that is not in the symbol table.  For example jump functions are a
> > Correct.
> > > vector of structures possibly containing trees, so everything has to
> > > be in garbage collected memory.
> > > 
> > > When an edge is removed, it is necessary to be notified about it
> > > immediately, for example to decrement rdesc_refcount (you might argue
> > > that that should be done in a separate hook and not from within a
> > > summary class but then you start to rely on hook invocation ordering
> > > so I think it is better to eventually use the summaries for it too).
> > 
> > I do not see why ctors/dtors can not do the reference counting. In fact
> > this is how refcounting is done usually anyway?
> > 
> 
> Well, when there is no garbage collection involved then yes, that is
> how you normally do it but in the GC case, there is the question of
> what is the appropriate time to call destructor on garbage collected
> data (like jump functions)?

I still fail to see problem here.  Summaries are explicitly managed- they are
constructed at summary construction time or when new callgarph node is
introduced/duplicated.  They are destroyed when callgarph node is destroyed or
whole summary is ddestroyed.  It is job of the summary datastructure to call
proper ctors/dtors, not job of garbage collector that provides the underlying
memory management.

If you have datastructure that points to something that is not explicitly
managed (i.e. tree expression), you just can not have non-trivial constructor
on that datastructure, because that is freed transparently by gty that don't do
destruction...

Honza



Re: [PATCH] Fix PR63868 - Guard offloading with ifdef ENABLE_OFFLOADING

2014-11-18 Thread Jakub Jelinek
On Tue, Nov 18, 2014 at 06:31:59PM +0300, Ilya Verbin wrote:
> @@ -8287,7 +8289,9 @@ expand_omp_target (struct omp_region *region)
>if (kind == GF_OMP_TARGET_KIND_REGION)
>  {
>unsigned srcidx, dstidx, num;
> +#ifdef ENABLE_OFFLOADING
>struct cgraph_node *node;
> +#endif

Please instead move the struct cgraph_node *node; declaration
right above where it is used for the first time.

There is no goto involved there, and it isn't in a switch either,
so you probably also can do just:
struct cgraph_node *node = cgraph_node::get (child_fn);
instead.

Ok with that change.

> @@ -8414,18 +8418,22 @@ expand_omp_target (struct omp_region *region)
>DECL_STRUCT_FUNCTION (child_fn)->curr_properties = 
> cfun->curr_properties;
>cgraph_node::add_new_function (child_fn, true);
>  
> +#ifdef ENABLE_OFFLOADING
>/* Add the new function to the offload table.  */
>vec_safe_push (offload_funcs, child_fn);
> +#endif
>  
>/* Fix the callgraph edges for child_cfun.  Those for cfun will be
>fixed in a following pass.  */
>push_cfun (child_cfun);
>cgraph_edge::rebuild_edges ();
>  
> +#ifdef ENABLE_OFFLOADING
>/* Prevent IPA from removing child_fn as unreachable, since there are 
> no
>refs from the parent function to child_fn in offload LTO mode.  */
>node = cgraph_node::get (child_fn);
>node->mark_force_output ();
> +#endif
>  
>/* Some EH regions might become dead, see PR34608.  If
>pass_cleanup_cfg isn't the first pass to happen with the

Jakub


[PATCH] Fix PR63868 - Guard offloading with ifdef ENABLE_OFFLOADING

2014-11-18 Thread Ilya Verbin
Hi,

This patch disables the generation of sections with offload IR and offload
tables by default, when a compiler is configured without
--enable-offload-targets= .
Bootsrap and regtesting in progress, OK after it finished?

  -- Ilya


PR regression/63868
gcc/
* cgraph.c (cgraph_node::create): Guard g->have_offload with
ifdef ENABLE_OFFLOADING.
* omp-low.c (create_omp_child_function): Likewise.
(expand_omp_target): Guard node and offload_funcs
with ifdef ENABLE_OFFLOADING.
* varpool.c (varpool_node::get_create): Guard g->have_offload and
offload_vars with ifdef ENABLE_OFFLOADING.


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index cc04744..18ae6a8 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -500,7 +500,9 @@ cgraph_node::create (tree decl)
   && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
 {
   node->offloadable = 1;
+#ifdef ENABLE_OFFLOADING
   g->have_offload = true;
+#endif
 }
 
   node->register_symbol ();
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 915d55f..44ce479 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -1961,7 +1961,9 @@ create_omp_child_function (omp_context *ctx, bool 
task_copy)
if (is_targetreg_ctx (octx))
  {
cgraph_node::get_create (decl)->offloadable = 1;
+#ifdef ENABLE_OFFLOADING
g->have_offload = true;
+#endif
break;
  }
 }
@@ -8287,7 +8289,9 @@ expand_omp_target (struct omp_region *region)
   if (kind == GF_OMP_TARGET_KIND_REGION)
 {
   unsigned srcidx, dstidx, num;
+#ifdef ENABLE_OFFLOADING
   struct cgraph_node *node;
+#endif
 
   /* If the target region needs data sent from the parent
 function, then the very first statement (except possible
@@ -8414,18 +8418,22 @@ expand_omp_target (struct omp_region *region)
   DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
   cgraph_node::add_new_function (child_fn, true);
 
+#ifdef ENABLE_OFFLOADING
   /* Add the new function to the offload table.  */
   vec_safe_push (offload_funcs, child_fn);
+#endif
 
   /* Fix the callgraph edges for child_cfun.  Those for cfun will be
 fixed in a following pass.  */
   push_cfun (child_cfun);
   cgraph_edge::rebuild_edges ();
 
+#ifdef ENABLE_OFFLOADING
   /* Prevent IPA from removing child_fn as unreachable, since there are no
 refs from the parent function to child_fn in offload LTO mode.  */
   node = cgraph_node::get (child_fn);
   node->mark_force_output ();
+#endif
 
   /* Some EH regions might become dead, see PR34608.  If
 pass_cleanup_cfg isn't the first pass to happen with the
@@ -12502,7 +12510,7 @@ omp_finish_file (void)
 
   varpool_node::finalize_decl (vars_decl);
   varpool_node::finalize_decl (funcs_decl);
-   }
+}
   else
 {
   for (unsigned i = 0; i < num_funcs; i++)
diff --git a/gcc/varpool.c b/gcc/varpool.c
index 80dd496..50f2e6e 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -171,9 +171,11 @@ varpool_node::get_create (tree decl)
   && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
 {
   node->offloadable = 1;
+#ifdef ENABLE_OFFLOADING
   g->have_offload = true;
   if (!in_lto_p)
vec_safe_push (offload_vars, decl);
+#endif
 }
 
   node->register_symbol ();


Re: RFA (tree-inline): PATCH for more C++14 constexpr support

2014-11-18 Thread Kyrill Tkachov


On 17/11/14 18:15, Jason Merrill wrote:

On 11/17/2014 05:29 AM, Richard Biener wrote:

can you rename it to copy_fn please?  It really copies parameter and
result and then the body.

Ok with that change.

Done.  Here's what I'm checking in, along with a second patch to enable
the new code for C++11 as well:


Hi Jason,

These broke building libstdc++ on arm-none-eabi.
I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63936 for it.

Cheers,
Kyrill





Re: [ping] account for register spans in expand_builtin_init_dwarf_reg_sizes

2014-11-18 Thread Olivier Hainque

On Nov 18, 2014, at 03:29 , Jason Merrill  wrote:
> What happens when the outer loop hits a register that we've already seen as 
> part of a span?

 Hmm, I was under the impression that this was supposed never to happen. I can
 see that this is not so clear though.

 What would happen with the current code is the production of extraneous
 instructions initializing the same slot, presumably with the same values.

 I have a candidate improvement to prevent processing the same regno
 multiple times.

 Will followup as soon as I have some test results.

 Thanks for your feedback,

 Olivier

 

RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-18 Thread Andrew Bennett
>  OK, this does look to me like the correct way to address the issue, but
> where is the second NOP that you previously mentioned?  I fail to see it
> here and this code can't be made any better, there isn't anything you
> could possibly schedule into the delay slot as there is nothing else to
> do in this loop.

The following testcase shows this occurring.

short v, count, ret;

int
main ()
{
  v = 0;
  count = 0;

  __atomic_exchange_n (&v, count + 1, __ATOMIC_RELAXED);

  return 0;
}

Produces (for the atomic operation):

   .setnoat
sync
1:
ll  $3,0($5)
and $1,$3,$4
bne $1,$7,2f
and $1,$3,$6
or  $1,$1,$8
sc  $1,0($5)
beql$1,$0,1b
nop
nop
sync
2:
.setat


Regards,



Andrew


Re: [PATCH, MPX wrappers 2/3] Replace some function calls with wrapper calls during instrumentation

2014-11-18 Thread Ilya Enkovich
On 15 Nov 00:16, Jeff Law wrote:
> On 11/14/14 10:29, Ilya Enkovich wrote:
> >Hi,
> >
> >This patch adds wrapper calls.  It's simply achieved by using proper name 
> >for builtin clones.  Patches apply over builtins instrumentation series.
> >
> >Thanks,
> >Ilya
> >--
> >2014-11-14  Ilya Enkovich  
> >
> > * c-family/c.opt (fchkp-use-wrappers): New.
> > * ipa-chkp.c (CHKP_WRAPPER_SYMBOL_PREFIX): New.
> > (chkp_wrap_function): New.
> > (chkp_build_instrumented_fndecl): Support wrapped
> > functions.
> invoke.texi documentation please.
> 
> OK with the invoke.texi doc work.  Patch #3 is OK.
> 
> Jeff
> 

invoke.texi misses all chkp options description.  I fixed it.  Here is a new 
patch.

Thanks,
Ilya
--
2014-11-18  Ilya Enkovich  

* c-family/c.opt (fchkp-use-wrappers): New.
* ipa-chkp.c (CHKP_WRAPPER_SYMBOL_PREFIX): New.
(chkp_wrap_function): New.
(chkp_build_instrumented_fndecl): Support wrapped
functions.
* doc/invoke.texi (-fcheck-pointer-bounds): New.
(-fchkp-check-incomplete-type): New.
(-fchkp-first-field-has-own-bounds): New.
(-fchkp-narrow-bounds): New.
(-fchkp-narrow-to-innermost-array): New.
(-fchkp-optimize): New.
(-fchkp-use-fast-string-functions): New.
(-fchkp-use-nochk-string-functions): New.
(-fchkp-use-static-bounds): New.
(-fchkp-use-static-const-bounds): New.
(-fchkp-treat-zero-dynamic-size-as-infinite): New.
(-fchkp-check-read): New.
(-fchkp-check-write): New.
(-fchkp-store-bounds): New.
(-fchkp-instrument-calls): New.
(-fchkp-instrument-marked-only): New.
(-fchkp-use-wrappers): New.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 85dcb98..a9f8f99 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1040,6 +1040,10 @@ fchkp-instrument-marked-only
 C ObjC C++ ObjC++ LTO Report Var(flag_chkp_instrument_marked_only) Init(0)
 Instrument only functions marked with bnd_instrument attribute.
 
+fchkp-use-wrappers
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_use_wrappers) Init(1)
+Transform instrumented builtin calls into calls to wrappers.
+
 fcilkplus
 C ObjC C++ ObjC++ LTO Report Var(flag_cilkplus) Init(0)
 Enable Cilk Plus
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 89edddb..09a0683 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -299,6 +299,15 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-d@var{letters}  -dumpspecs  -dumpmachine  -dumpversion @gol
 -fsanitize=@var{style} -fsanitize-recover -fsanitize-recover=@var{style} @gol
 -fasan-shadow-offset=@var{number} -fsanitize-undefined-trap-on-error @gol
+-fcheck-pointer-bounds -fchkp-check-incomplete-type @gol
+-fchkp-first-field-has-own-bounds -fchkp-narrow-bounds @gol
+-fchkp-narrow-to-innermost-array -fchkp-optimize @gol
+-fchkp-use-fast-string-functions -fchkp-use-nochk-string-functions @gol
+-fchkp-use-static-bounds -fchkp-use-static-const-bounds @gol
+-fchkp-treat-zero-dynamic-size-as-infinite -fchkp-check-read @gol
+-fchkp-check-read -fchkp-check-write -fchkp-store-bounds @gol
+-fchkp-instrument-calls -fchkp-instrument-marked-only @gol
+-fchkp-use-wrappers @gol
 -fdbg-cnt-list -fdbg-cnt=@var{counter-value-list} @gol
 -fdisable-ipa-@var{pass_name} @gol
 -fdisable-rtl-@var{pass_name} @gol
@@ -5693,6 +5702,117 @@ a @code{libubsan} library routine.  The advantage of 
this is that the
 @code{libubsan} library is not needed and will not be linked in, so this
 is usable even for use in freestanding environments.
 
+@item -fcheck-pointer-bounds
+@opindex fcheck-pointer-bounds
+@opindex fno-check-pointer-bounds
+Enable Pointer Bounds Checker instrumentation.  Each memory reference
+is instrumented with checks of pointer used for memory access against
+bounds associated with that pointer.  Generated instrumentation may
+be controlled by various @option{-fchkp-*} options.
+
+@item -fchkp-check-incomplete-type
+@opindex fchkp-check-incomplete-type
+@opindex fno-chkp-check-incomplete-type
+Generate pointer bounds checks for variables with incomplete type.
+Enabled by default
+
+@item -fchkp-narrow-bounds
+@opindex fchkp-narrow-bounds
+@opindex fno-chkp-narrow-bounds
+Controls bounds used by Pointer Bounds Checker for pointers to object
+fields.  If narrowing is enabled then field bounds are used.  Otherwise
+object bounds are used.  See also @option{-fchkp-narrow-to-innermost-array}
+and @option{-fchkp-first-field-has-own-bounds}.  Enabled by default.
+
+@item -fchkp-first-field-has-own-bounds
+@opindex fchkp-first-field-has-own-bounds
+@opindex fno-chkp-first-field-has-own-bounds
+Forces Pointer Bounds Checker to use narrowed bounds for address of the
+first field in the structure.  By default pointer to the first field has
+the same bounds as pointer to the whole structure.
+
+@item -fchkp-narrow-to-innermost-array
+@opindex fchkp-narrow-to-innermost-array
+@opindex fno-chkp-narrow-to-innermost-

Re: [PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL

2014-11-18 Thread Kyrill Tkachov


On 18/11/14 02:48, Terry Guo wrote:

+ 
+   The Thumb-1 assembly code are now generated in unified syntax. The 
new option
+-masm-syntax-unified can be used to specify whether 
inline assembly
+code are using unified syntax. By default the option is off which means
+non-unified syntax is used. However this is subject to change in 
future releases.
+Eventually the non-unified syntax will be deprecated.
+  
+ 

Hi Terry,

Sorry for the late comment, I see this has already been committed.

I think it should be "assembly code is now generated".
Also "whether inline assembly code is using unified syntax".

Kyrill



RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-18 Thread Maciej W. Rozycki
On Tue, 18 Nov 2014, Andrew Bennett wrote:

> My fix places a nop in the delay slot of the branch likely instruction
> by using the %~ output operation.  This then causes the sync code for the 
> previous example to be correct:
> 
> .setnoat
> sync # 15   sync_new_addsi/2[length = 24]
> 1:
> ll  $3,0($4)
> addu$1,$3,$2
> sc  $1,0($4)
> beql$1,$0,1b
> nop
> addu$3,$3,$2
> sync
> .setat

 OK, this does look to me like the correct way to address the issue, but 
where is the second NOP that you previously mentioned?  I fail to see it 
here and this code can't be made any better, there isn't anything you 
could possibly schedule into the delay slot as there is nothing else to 
do in this loop.

  Maciej


RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-18 Thread Matthew Fortune
> The atomic-compare-exchange-3.c and atomic-op-3.c tests are failing when
> using the -mfix-r1 option.  This is due to the fact that the delay
> slot of the branch instruction that checks if the atomic operation was
> not successful can be filled with an operation that returns the output
> result.  For the branch likely case this operation can not go in the
> delay slot because it wont execute when the atomic operation was
> successful and therefore the wrong result will be returned.  This patch
> forces a nop to be placed in the delay slot if a branch likely is used.
> The fix is as simple as possible; it does cause a second nop to be
> introduced
> after the branch instruction in the branch likely case.  As this option is
> rarely used, it makes more sense to keep the code readable than trying to
> optimise it.
> 
> The atomic tests now pass successfully.  The ChangeLog and patch are
> below.
> 
> Ok to commit?

I'm OK with just making the fix-r1 case safe rather than also
complicating the normal code path to remove the normal delay slot NOP in
this special case.  I'd like to see what Catherine thinks though.  To
remove the second NOP I believe we would have to add a !TARGET_FIX_R1
to the condition of the normal delay slot NOP. This seems a bit convoluted
when the real reason is because branch likely is in use.  To make use of
the mips_branch_likely flag it would have to be set earlier in:
mips_output_sync_loop and also get set in mips_sync_loop_insns.

If Catherine thinks getting rid of the second NOP is important enough then
please base it on mips_branch_likely and fix the callers.

> gcc/
>   * config/mips/mips.c (mips_process_sync_loop): Place a nop in the
>   delay slot of the branch likely instruction.
> 
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index 02268f3..6dd3728 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -12997,7 +12997,12 @@ mips_process_sync_loop (rtx_insn *insn, rtx
> *operands)
>   This will sometimes be a delayed branch; see the write code below
>   for details.  */
>mips_multi_add_insn (is_64bit_p ? "scd\t%0,%1" : "sc\t%0,%1", at, mem,
> NULL);
> -  mips_multi_add_insn ("beq%?\t%0,%.,1b", at, NULL);
> +
> +  /* We can not put the NEWVAL = $TMP3 or CMP = 1 operations in the delay
> slot
> + of the branch if it is a branch likely because they will not execute
> when
> + the atomic operation is successful, so place a NOP in there instead.
> */
> +

I found the comment hard to digest, perhaps:

/* When using branch likely (-mfix-r1), the delay slot instruction will
   be annulled on false.  The normal delay slot instructions calculate the
   overall result of the atomic operation and must not be annulled.
   Unconditionally use a NOP instead for the branch likely case.  */

> +  mips_multi_add_insn ("beq%?\t%0,%.,1b%~", at, NULL);
> 
>/* if (INSN1 != MOVE && INSN1 != LI) NEWVAL = $TMP3 [delay slot].  */
>if (insn1 != SYNC_INSN1_MOVE && insn1 != SYNC_INSN1_LI && tmp3 !=
> newval)

Matthew


small C++ PATCH to instantiate_template_1

2014-11-18 Thread Jason Merrill
While looking at 55992 I noticed that we weren't pushing into dependent 
scope properly; we need to pass the "entering scope" flag to 
tsubst_aggr_type so that we get the version of the type with the 
members.  This isn't enough to fix 55992, but may well fix other bugs.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit c12c225ae102f77c6b4b1f98a69adbee474cb3a3
Author: Jason Merrill 
Date:   Mon Nov 17 16:26:45 2014 -0500

	* pt.c (instantiate_template_1): Use tsubst_aggr_type for context.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c0f3b8c..6661325 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -15856,8 +15856,8 @@ instantiate_template_1 (tree tmpl, tree orig_args, tsubst_flags_t complain)
 ++processing_template_decl;
   if (DECL_CLASS_SCOPE_P (gen_tmpl))
 {
-  tree ctx = tsubst (DECL_CONTEXT (gen_tmpl), targ_ptr,
-			 complain, gen_tmpl);
+  tree ctx = tsubst_aggr_type (DECL_CONTEXT (gen_tmpl), targ_ptr,
+   complain, gen_tmpl, true);
   push_nested_class (ctx);
 }
   /* Substitute template parameters to obtain the specialization.  */


RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-18 Thread Andrew Bennett
> -Original Message-
> From: Maciej W. Rozycki [mailto:ma...@codesourcery.com]
> Sent: 18 November 2014 13:48
> To: Andrew Bennett
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] If using branch likelies in MIPS sync code fill the delay
> slot with a nop
> 
> On Tue, 18 Nov 2014, Andrew Bennett wrote:
> 
> > The atomic-compare-exchange-3.c and atomic-op-3.c tests are failing when
> > using the -mfix-r1 option.  This is due to the fact that the delay
> > slot of the branch instruction that checks if the atomic operation was
> > not successful can be filled with an operation that returns the output
> > result.  For the branch likely case this operation can not go in the
> > delay slot because it wont execute when the atomic operation was
> > successful and therefore the wrong result will be returned.  This patch
> > forces a nop to be placed in the delay slot if a branch likely is used.
> > The fix is as simple as possible; it does cause a second nop to be
> introduced
> > after the branch instruction in the branch likely case.  As this option is
> > rarely used, it makes more sense to keep the code readable than trying to
> > optimise it.
> 
>  Can you please be a bit more elaborate on how the wrong code sequence
> looks like, why it is produced like that and how your fix changes it?
> Without seeing examples of code generated it is very hard to imagine
> what is really going on here.

Ok if we compile the following cut-down atomic-op-3.c test case with
-mfix-r1.

extern void abort(void);

int v, count, res;
int
main ()
{
  v = 0;
  count = 1;

  if (__atomic_add_fetch (&v, count, __ATOMIC_RELAXED) != 1)
abort ();

  return 0;
}

The following assembly code is produced for the atomic operation:

.setnoat
sync # 15   sync_new_addsi/2[length = 24]
1:
ll  $3,0($4)
addu$1,$3,$2
sc  $1,0($4)
beql$1,$0,1b
addu$3,$3,$2
sync
.setat


Notice here that the addu is in the delay slot of the branch likely instruction.
This is computing the value that exists in the memory location after the atomic 
operation has completed.  Because it is a branch likely instruction
this will only run when the atomic operation fails, and hence when it is
successful it will not return the correct value.


The offending code is in mips_process_sync_loop:

mips_multi_add_insn ("beq%?\t%0,%.,1b", at, NULL);
 
  /* if (INSN1 != MOVE && INSN1 != LI) NEWVAL = $TMP3 [delay slot].  */
  if (insn1 != SYNC_INSN1_MOVE && insn1 != SYNC_INSN1_LI && tmp3 != newval)
{
  mips_multi_copy_insn (tmp3_insn);
  mips_multi_set_operand (mips_multi_last_index (), 0, newval);
}
  else if (!(required_oldval && cmp))
mips_multi_add_insn ("nop", NULL);
  
  /* CMP = 1 -- either standalone or in a delay slot.  */
  if (required_oldval && cmp)
mips_multi_add_insn ("li\t%0,1", cmp, NULL);


For the branch likely case the delay slot could be filled either with the
operation to compute the value that exists in memory after the atomic operation
has completed; an operation to return if the atomic operation is successful 
or not; or a nop.  The first two operations should not be in the delay slot 
of the branch if it is a branch likely because they will only run if the atomic 
operation was unsuccessful.

My fix places a nop in the delay slot of the branch likely instruction
by using the %~ output operation.  This then causes the sync code for the 
previous example to be correct:

.setnoat
sync # 15   sync_new_addsi/2[length = 24]
1:
ll  $3,0($4)
addu$1,$3,$2
sc  $1,0($4)
beql$1,$0,1b
nop
addu$3,$3,$2
sync
.setat



Regards,



Andrew




Re: [PATCH, Pointer Bounds Checker, Builtins instrumentation 3/5] Expand instrumented builtin calls

2014-11-18 Thread David Edelsohn
On Tue, Nov 18, 2014 at 9:07 AM, Richard Biener
 wrote:

> Can we emit this particular STABS piece in "raw" form then?  Thus
> does it work to have sth like
>
>.stabs ...
>.string "..."
>.stabs ...
>
> or whatever way to emit assembled data there?

The stabstring debug information is saved as a length and a string.
For 32 bit AIX, the length is an unsigned short.  Longer entries
apparently can be stored in multiple symbol table entries, but it's
not clear how to emit assembly to produce that effect.  For 64 bit
AIX, the length is an unsigned int 32 bits, so one may be able to work
around this problem by compiling GCC on AIX in 64 bit mode.  A
conversion to default 64 bit build will take a lot of work.

Thanks, David


Re: RFA (tree-inline): PATCH for more C++14 constexpr support

2014-11-18 Thread Andreas Schwab
Jason Merrill  writes:

> commit e52e82e56507d1de1932abcafd80683c4dc00d1e
> Author: Jason Merrill 
> Date:   Sun Nov 16 17:14:12 2014 -0500
>
>   * constexpr.c (use_new_call): Always use new call handling.
>
> diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
> index 57d0c46..8881271 100644
> --- a/gcc/cp/constexpr.c
> +++ b/gcc/cp/constexpr.c
> @@ -1021,8 +1021,8 @@ adjust_temp_type (tree type, tree temp)
>  }
>  
>  /* True if we want to use the new handling of constexpr calls based on
> -   DECL_SAVED_TREE.  Currently only active for C++14 mode.  */
> -#define use_new_call (cxx_dialect >= cxx14)
> +   DECL_SAVED_TREE.  */
> +#define use_new_call true
>  
>  /* Subroutine of cxx_eval_call_expression.
> We are processing a call expression (either CALL_EXPR or

FAIL: 18_support/numeric_limits/requirements/constexpr_functions.cc (test for 
excess errors)

$ gcc/xg++ -shared-libgcc -Bgcc -nostdinc++ -Lm68k-linux/libstdc++-v3/src 
-Lm68k-linux/libstdc++-v3/src/.libs -Lm68k-linux/libstdc++-v3/libsupc++/.libs 
-B/daten/cross/m68k-linux/m68k-linux/bin/ 
-B/daten/cross/m68k-linux/m68k-linux/lib/ -isystem 
/daten/cross/m68k-linux/m68k-linux/include -isystem 
/daten/cross/m68k-linux/m68k-linux/sys-include 
-Bm68k-linux/./libstdc++-v3/src/.libs -D_GLIBCXX_ASSERT -fmessage-length=0 
-ffunction-sections -fdata-sections -O2 -g -D_GNU_SOURCE -DLOCALEDIR="." 
-nostdinc++ -Im68k-linux/libstdc++-v3/include/m68k-linux 
-Im68k-linux/libstdc++-v3/include -I../libstdc++-v3/libsupc++ 
-I../libstdc++-v3/include/backward -I../libstdc++-v3/testsuite/util 
../libstdc++-v3/testsuite/18_support/numeric_limits/requirements/constexpr_functions.cc
 -std=gnu++14 -S -o constexpr_functions.s
In file included from 
../libstdc++-v3/testsuite/18_support/numeric_limits/requirements/constexpr_functions.cc:21:0:
../libstdc++-v3/testsuite/18_support/numeric_limits/requirements/constexpr_functions.cc:
 In instantiation of ‘void 
__gnu_test::constexpr_member_functions::operator()()::_Concept::__constraint() 
[with _Ttesttype = std::numeric_limits; _Tbasetype = char]’:
../libstdc++-v3/testsuite/18_support/numeric_limits/requirements/constexpr_functions.cc:55:2:
   required from ‘void __gnu_test::constexpr_member_functions::operator()() 
[with _Ttesttype = std::numeric_limits; _Tbasetype = char]’
m68k-linux/libstdc++-v3/include/ext/typelist.h:197:2:   required from ‘void 
__gnu_cxx::typelist::detail::apply_generator2_, __gnu_cxx::typelist::chain 
>::operator()(Gn&) [with Gn = __gnu_test::constexpr_member_functions; Hd1 = 
std::numeric_limits; TlT = 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::null_type> > > > > > > > > > > > >; Hd2 = char; TlV = 
__gnu_cxx::typelist::chain > > > > > 
> > > > > > >]’
m68k-linux/libstdc++-v3/include/ext/typelist.h:199:6:   required from ‘void 
__gnu_cxx::typelist::detail::apply_generator2_, __gnu_cxx::typelist::chain 
>::operator()(Gn&) [with Gn = __gnu_test::constexpr_member_functions; Hd1 = 
std::numeric_limits; TlT = 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::null_type> > > > > > > > > > > > > >; Hd2 = bool; TlV = 
__gnu_cxx::typelist::chain > > > > > 
> > > > > > > >]’
m68k-linux/libstdc++-v3/include/ext/typelist.h:424:8:   required from ‘void 
__gnu_cxx::typelist::apply_generator(Gn&, TypelistT, TypelistV) [with Gn = 
__gnu_test::constexpr_member_functions; TypelistT = 
__gnu_cxx::typelist::node<__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::chain, 
__gnu_cxx::typelist::null_type> > > > > > > > > > > > > > > >; TypelistV = 
__gnu_cxx::typelist::node<__gnu_cxx::typelist::chain > > > > > 
> > > > > > > > > >]’
../libstdc++-v3/testsuite/18_support/numeric_limits/requirements/constexpr_functions.cc:68:50:
   required from here
../libstdc++-v3/testsuite/18_support/numeric_limits/requirements/constexpr_functions.cc:37:25:
   in constexpr expansion of ‘std::numeric_limits::min()’
m68k-linux/libstdc++-v3/include/l

Re: [PATCH] Check 'fd' neither -1 nor 0, before close it

2014-11-18 Thread Chen Gang
On 11/18/14 0:38, Joseph Myers wrote:
> On Sat, 15 Nov 2014, Chen Gang wrote:
> 
>> Also in c_common_read_pch(), when failure occurs, also need be sure the
>> 'fd' is not '-1' for the next close operation.
> 
> Please clarify how c_common_read_pch gets called with fd == -1.  
> Certainly checking after an error is too late; we shouldn't call fdopen in 
> that case at all, and I think something further up the call chain should 
> have avoided calling c_common_read_pch with fd == -1 at all (the question 
> is just exactly what point on the call chain is missing such a check and 
> needs it added).
> 

According to current source code, the author wants 'fd' should never be
'-1' in c_common_read_pch(). "grep -rn read_pch *" in root directory,
then "grep -rn _cpp_stack_file *", can know it is used in 3 areas:

 - c_common_pch_pragma() in "gcc/c-family/c-pch.c", it has already
   checked 'fd' before call c_common_read_pch()

 - _cpp_stack_include() in "libcpp/files.c", before _cpp_stack_file(),
   has called and checked _cpp_find_file().

 - cpp_read_main_file() in "libcpp/init.c", before _cpp_stack_file(),
   has called and checked _cpp_find_file().

In c_common_read_pch(), even if 'fd' is '-1', we should check it firstly
before call fdopen(), instead of check '-1' in failure code block.

But for me, it contents another 2 related issues:

 - _cpp_find_file() always return a valid pointer, so related check code
   "file == NULL" is no use for _cpp_find_file() in _cpp_stack_include().

 - According to its comments, _cpp_find_file() can not be sure of
   'file->fd' must not be '-1', even when "file->err_no == 0", we need
   reopen it if necessary.

Thanks.
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed


Re: [PATCH, Pointer Bounds Checker, Builtins instrumentation 3/5] Expand instrumented builtin calls

2014-11-18 Thread Richard Biener
On Tue, Nov 18, 2014 at 2:44 PM, David Edelsohn  wrote:
> On Tue, Nov 18, 2014 at 8:33 AM, Ilya Enkovich  wrote:
>> 2014-11-18 15:18 GMT+03:00 Richard Biener :
>>> On Tue, Nov 18, 2014 at 1:13 PM, Ilya Enkovich  
>>> wrote:
 2014-11-18 15:04 GMT+03:00 Richard Biener :
> On Tue, Nov 18, 2014 at 11:51 AM, Ilya Enkovich  
> wrote:
>> 2014-11-18 5:56 GMT+03:00 Jeff Law :
>>> On 11/17/14 13:43, Ilya Enkovich wrote:
>>>

 I don't fully understand how this problem appears.  Is it fully AIX
 specific and doesn't affect any other target?  May we put all _CHKP
 codes to the end of enum and ignore them by AIX? Limiting number of
 codes in enum due to restrictions of specific debug info format seems
 weird.  If something cannot be encoded for debug info then it should
 be ignored.  I really hoped that creation of functions by demand would
 allow to avoid such kind of problems :(

 I'll try to make a patch reducing amound of builtin codes to a
 required minimum in case it appears to be the best option we have.
>>>
>>> It's a problem with the AIX native tools.  Presumably they don't like 
>>> really
>>> long lines -- it was a relatively common issue in the past with vendor
>>> tools.
>>>
>>> I think we should proceed on two fronts.  First if David could 
>>> investigate
>>> the stabs-in-xcoff bits a bit to see if DBX_CONTIN_LEN can be used to
>>> resolve the issue.  Second if you could look at now duplicating every
>>> builtin in the enumeration since it's a combination of the number of 
>>> enums
>>> and the length of the debug strings to represent them that's causing AIX
>>> heartburn.
>>>
>>>
>>>
>>> jeff
>>
>> I see.  I can reduce the number of _CHKP builtin codes. Experimental
>> patch shows resulting END_BUILTINS = 1242.  But we should expect
>> similar problem for i386 target builds hosted on AIX (here
>> http://toolchain.lug-owl.de/buildbot/ I see such build is tested).
>> Current IX86_BUILTIN_MAX is 2467 which is greater than current
>> END_BUILTINS.
>
> I think it's better to fix dbxout.c to do something sensible with
> too long lines for enum types.  It should be easy to cook up a
> testcase that is not GCC.
>
> enum { entry1, entry2, entry3 };
>

 As I understand the main problem is that fixing dbxout in trunk won't
 help to build stage1 compiler.
>>>
>>> Simply build stage1 without debug info on AIX then.
>>>
>>> Btw, you have to start fixing the bug at some point ... (we can
>>> backport to 4.8 and 4.9).  Of course dbxout.c is in kind of
>>> deep-freeze unmaintained state.
>>
>> Seems with no continuation lines support there is no easy way to limit
>> stabstring length.  But it should be easy to fix this particular
>> problem (too long stabstring due to many enum values) by discarding
>> the rest of values when some string length threshold is hit.  Possible
>> patch:
>>
>> diff --git a/gcc/dbxout.c b/gcc/dbxout.c
>> index aa15a39..9b0b82d 100644
>> --- a/gcc/dbxout.c
>> +++ b/gcc/dbxout.c
>> @@ -168,6 +168,10 @@ along with GCC; see the file COPYING3.  If not see
>>  #define DBX_CONTIN_CHAR '\\'
>>  #endif
>>
>> +#ifndef DBX_MAX_LENGTH
>> +#define DBX_MAX_LENGTH 0
>> +#endif
>> +
>>  enum typestatus {TYPE_UNSEEN, TYPE_XREF, TYPE_DEFINED};
>>
>>  /* Structure recording information about a C data type.
>> @@ -2270,6 +2274,12 @@ dbxout_type (tree type, int full)
>>   stabstr_C (',');
>>   if (TREE_CHAIN (tem) != 0)
>> CONTIN;
>> +
>> + /* Prevent too long strings rejected by linker.  */
>> + if (DBX_CONTIN_LENGTH == 0
>> + && DBX_MAX_LENGTH != 0
>> + && obstack_object_size (&stabstr_ob) > DBX_MAX_LENGTH)
>> +   break;
>> }
>>
>>stabstr_C (';');
>>
>> With this we should be able to set limit in target config files.
>> Tried on linux with a small test with limit set to 20:
>>
>>>cat test.c
>> enum e1 {
>>   val1, val2, val3
>> };
>>>gcc test.c -S -gstabs
>>>grep e1 test.s
>> .stabs  "e1:T(0,21)=eval1:0,val2:1,;",128,0,0,0
>>
>>
>> The last enum value was discarded.
>>
>> I don't know which target should set this limit and what limit values
>> should be used.
>
> Hi, Ilya
>
> Thanks for the preliminary patch to dbxout.c.  I was planning to work
> on something similar today.
>
> Continuations would have been a nice solution, but, as I wrote
> earlier, AIX assembler does not honor stabstring continuations.

Can we emit this particular STABS piece in "raw" form then?  Thus
does it work to have sth like

   .stabs ...
   .string "..."
   .stabs ...

or whatever way to emit assembled data there?

>  It's
> one line, period. As much as can fit on one line.  I believe that AIX
> allows up to 64K.

Can we get the assembler fixed? (sorry to ask this stupid question
again .

  1   2   >