Re: [PATCH PR52272]Be smart when adding iv candidates

2015-11-07 Thread Bin.Cheng
On Fri, Nov 6, 2015 at 9:24 PM, Richard Biener wrote:
> On Wed, Nov 4, 2015 at 11:18 AM, Bin Cheng wrote:
>> Hi,
>> PR52272 reported a performance regression in spec2006/410.bwaves once GCC is
>> prevented from representing the address of one memory object using the
>> address of another memory object.  As I commented in that PR, we have two
>> possible fixes for this:
>> 1) Improve how TMR.base is deduced, so that we can represent the address of
>> one memory object using another one, while not breaking PR50955.
>> 2) Add iv candidates with the base object stripped.  In this way, we use the
>> common base-stripped part to represent all address expressions, in the form
>> of [base_1 + common], [base_2 + common], ..., [base_n + common].
>>
>> In terms of code generation, method 2) is at least as good as 1), and
>> actually better in my opinion.  The problem with 2) is that we need to tell
>> when iv candidates should be added for the common part and when they
>> shouldn't.  This issue can be generalized as follows: we know IVO tries to
>> add candidates by deriving them from iv uses.  One disadvantage is that
>> candidates are derived from each iv use independently.  It doesn't take
>> common subexpressions among different iv uses into consideration.  As a
>> result, candidates for common subexpressions are not added, while many
>> useless candidates are added.
>>
>> As a matter of fact, a candidate derived from an iv use is useful only if
>> it's common enough to be shared among different uses.  A candidate is most
>> likely useless if it's derived from a single use and cannot be shared by
>> others.  This patch works by first recording all kinds of candidates
>> derived from iv uses, then adding candidates for the common ones.
>>
>> The patch improves 410.bwaves by 3-4% on x86_64.  I also saw a regression for
>> 400.perlbench and a small regression for 401.bzip on x86_64, but I can confirm
>> they are false alarms caused by alignment issues.
>> For aarch64, fp cases are obviously improved for both spec2000 and spec2006.
>> The patch also causes a 2-3% regression for 459.GemsFDTD, which I think is
>> another unrelated issue caused by the heuristic candidate selection algorithm.
>> Unfortunately, I don't have a fix for it currently.
>>
>> This patch may add more candidates in some cases, but generally the number
>> of candidates is smaller because we no longer add useless candidates.
>> Statistics show there are considerably fewer loops with more than 30
>> candidates when building spec2k6 on x86_64 with this patch.
>>
>> Bootstrapped and tested on x86_64.  I will re-test it against latest trunk on
>> AArch64.  Is it OK?
>
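
The record-then-add scheme described in the quoted message can be sketched
roughly as below.  This is only an illustration, not code from the patch:
the struct layout, the linear scan (standing in for the hash table) and the
helper count_shared_cands are all assumptions, and real candidates hold tree
expressions rather than plain integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a common candidate.  */
struct common_cand
{
  long base;      /* base with the memory object stripped */
  long step;      /* per-iteration increment */
  unsigned uses;  /* number of iv uses that derived this pair */
};

#define MAX_CANDS 64

struct cand_table
{
  struct common_cand cands[MAX_CANDS];
  size_t n;
};

/* Phase 1: record a (base, step) pair derived from one iv use.  A linear
   scan stands in for the hash table lookup.  Returns the number of uses
   seen so far for this pair, or 0 if the table is full.  */
static unsigned
record_common_cand (struct cand_table *tab, long base, long step)
{
  assert (tab != NULL);
  for (size_t i = 0; i < tab->n; i++)
    if (tab->cands[i].base == base && tab->cands[i].step == step)
      return ++tab->cands[i].uses;

  if (tab->n == MAX_CANDS)
    return 0;

  tab->cands[tab->n].base = base;
  tab->cands[tab->n].step = step;
  tab->cands[tab->n].uses = 1;
  tab->n++;
  return 1;
}

/* Phase 2: count the pairs shared by at least MIN_USES uses.  In this
   sketch only those would be turned into real candidates.  */
static size_t
count_shared_cands (const struct cand_table *tab, unsigned min_uses)
{
  size_t shared = 0;
  for (size_t i = 0; i < tab->n; i++)
    if (tab->cands[i].uses >= min_uses)
      shared++;
  return shared;
}
```

Recording (0, 8) twice and (16, 8) once, then asking for pairs with at least
two uses, leaves a single shared candidate; the pair seen only once is never
added.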
> +inline bool
> +iv_common_cand_hasher::equal (const iv_common_cand *ccand1,
> +  const iv_common_cand *ccand2)
> +{
> +  return ccand1->hash == ccand2->hash
> +&& operand_equal_p (ccand1->base, ccand2->base, 0)
> +&& operand_equal_p (ccand1->step, ccand2->step, 0)
> +&& TYPE_PRECISION (TREE_TYPE (ccand1->base))
> + == TYPE_PRECISION (TREE_TYPE (ccand2->base));
>
Hi Richard,
Thanks for reviewing.

> I'm wondering on the TYPE_PRECISION check.  a) why is that needed?
Because operand_equal_p doesn't check type precision for constant int
nodes, and IVO needs to take precision into consideration.
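
To illustrate why the extra check matters, here is a minimal sketch with
made-up types.  struct int_cst, struct cand_key and value_equal_p are
hypothetical; value_equal_p only mimics the behaviour described above
(comparing constants by value without looking at the width) and is not the
real operand_equal_p.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an integer constant node: a value plus the
   bit width of its type.  */
struct int_cst
{
  long value;
  unsigned precision;
};

/* Hypothetical hash table key for a common candidate.  */
struct cand_key
{
  unsigned hash;
  struct int_cst base;
  struct int_cst step;
};

/* Value-only comparison: deliberately ignores the width.  */
static bool
value_equal_p (const struct int_cst *a, const struct int_cst *b)
{
  return a->value == b->value;
}

/* Equality for the hash table: cheap hash check first, then the values,
   then the explicit precision check that keeps a 32-bit and a 64-bit
   candidate with the same constants apart.  */
static bool
cand_key_equal (const struct cand_key *k1, const struct cand_key *k2)
{
  return k1->hash == k2->hash
	 && value_equal_p (&k1->base, &k2->base)
	 && value_equal_p (&k1->step, &k2->step)
	 && k1->base.precision == k2->base.precision;
}
```

With base 0 and step 4 in both keys, the 32-bit and 64-bit variants compare
unequal only because of the last clause.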

> and b) what kind of tree is base so that it is safe to inspect TYPE_PRECISION
> unconditionally?
Both SCEV and IVO work on expressions whose type satisfies
POINTER_TYPE_P or INTEGRAL_TYPE_P, so it should be safe to access the
precision unconditionally.

>
> +  slot = data->iv_common_cand_tab->find_slot (&ent, INSERT);
> +  if (*slot == NULL)
> +{
> +  *slot = XNEW (struct iv_common_cand);
>
> allocate from the IV obstack instead?  I see we do a lot of heap allocations
> in IVOPTs, so we can improve that as followup as well.
>
Yes, small structures in IVO like iv, iv_use, iv_cand and iv_common_cand
are better allocated on an obstack.  Actually I have already made
that change for struct iv; the others will follow as well.
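
As a rough illustration of the allocation pattern being discussed (many small
objects that all die together at the end of a loop), here is a sketch of a
bump allocator that is reset once per processed loop.  It is not GCC's
obstack; struct arena, arena_alloc and arena_reset are hypothetical names
used only for this example, and it assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 4096

/* Hypothetical fixed-size bump allocator standing in for an obstack.  */
struct arena
{
  _Alignas (max_align_t) unsigned char buf[ARENA_SIZE];
  size_t used;
};

/* Carve SIZE bytes out of the arena, suitably aligned for any object.
   Returns NULL when the arena is exhausted.  */
static void *
arena_alloc (struct arena *a, size_t size)
{
  const size_t align = _Alignof (max_align_t);
  size_t start = (a->used + align - 1) / align * align;

  assert (a != NULL);
  if (start > ARENA_SIZE || size > ARENA_SIZE - start)
    return NULL;

  a->used = start + size;
  return a->buf + start;
}

/* Release everything at once, e.g. after one loop has been processed.  */
static void
arena_reset (struct arena *a)
{
  a->used = 0;
}
```

The point of the design is that freeing is a single store per loop rather
than one free call per iv, iv_use or iv_cand.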

Thanks,
bin
> We probably should empty the obstack after each processed loop.
>
> Thanks,
> Richard.
>
>
>> Thanks,
>> bin
>>
>> 2015-11-03  Bin Cheng  
>>
>> PR tree-optimization/52272
>> * tree-ssa-loop-ivopts.c (struct iv_common_cand): New struct.
>> (struct iv_common_cand_hasher): New struct.
>> (iv_common_cand_hasher::hash): New function.
>> (iv_common_cand_hasher::equal): New function.
>> (struct ivopts_data): New fields, iv_common_cand_tab and
>> iv_common_cands.
>> (tree_ssa_iv_optimize_init): Initialize above fields.
>> (record_common_cand, common_cand_cmp): New functions.
>> (add_iv_candidate_derived_from_uses): New function.
>> (add_iv_candidate_for_use): Record iv_common_cands derived from
>> iv use in hash table, instead of adding candidates directly.
>> (add_iv_candidate_for_uses): Call
>> add_iv_candidate_derived_from_uses.
>> (record_important_candidates)

Re: [PATCH] Add -fchecking

2015-11-07 Thread Jeff Law

On 11/07/2015 01:47 PM, Gerald Pfeifer wrote:

On Tue, 27 Oct 2015, Richard Biener wrote:

This adds -fchecking as a way to enable internal consistency checks
even in release builds (or disable checking with -fno-checking - up to
a certain extent - with checking enabled).


How (much) do we want to advertize this?

I don't think much -- it's really a developer-centric option.

Jeff


[PATCH 4a/4] [ARM] PR63870 Use internal_error() for invalid lane numbers

2015-11-07 Thread charles . baylis
From: Charles Baylis 

  Charles Baylis  

* config/arm/neon.md (neon_vld1_lane): Use internal_error for
invalid lane number.
(neon_vst1_lane): Likewise.
(neon_vld2_lane): Likewise.
(neon_vst2_lane): Likewise.
(neon_vld3_lane): Likewise.
(neon_vst3_lane): Likewise.
(neon_vld4_lane): Likewise.
(neon_vst4_lane): Likewise.

Change-Id: I72686845119df2f857fed98e7e0a588c532159a7
---
 gcc/config/arm/neon.md | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index e8db020..99caf96 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -4265,7 +4265,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   operands[3] = GEN_INT (lane);
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   if (max == 1)
 return "vld1.\t%P0, %A1";
   else
@@ -4287,7 +4287,7 @@ if (BYTES_BIG_ENDIAN)
   operands[3] = GEN_INT (lane);
   int regno = REGNO (operands[0]);
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   else if (lane >= max / 2)
 {
   lane -= max / 2;
@@ -4373,7 +4373,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   operands[2] = GEN_INT (lane);
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   if (max == 1)
 return "vst1.\t{%P1}, %A0";
   else
@@ -4394,7 +4394,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[1]);
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   else if (lane >= max / 2)
 {
   lane -= max / 2;
@@ -4465,7 +4465,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[0]);
   rtx ops[4];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   ops[0] = gen_rtx_REG (DImode, regno);
   ops[1] = gen_rtx_REG (DImode, regno + 2);
   ops[2] = operands[1];
@@ -4490,7 +4490,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[0]);
   rtx ops[4];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   else if (lane >= max / 2)
 {
   lane -= max / 2;
@@ -4580,7 +4580,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[1]);
   rtx ops[4];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   ops[0] = operands[0];
   ops[1] = gen_rtx_REG (DImode, regno);
   ops[2] = gen_rtx_REG (DImode, regno + 2);
@@ -4605,7 +4605,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[1]);
   rtx ops[4];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   else if (lane >= max / 2)
 {
   lane -= max / 2;
@@ -4724,7 +4724,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[0]);
   rtx ops[5];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   ops[0] = gen_rtx_REG (DImode, regno);
   ops[1] = gen_rtx_REG (DImode, regno + 2);
   ops[2] = gen_rtx_REG (DImode, regno + 4);
@@ -4751,7 +4751,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[0]);
   rtx ops[5];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   else if (lane >= max / 2)
 {
   lane -= max / 2;
@@ -4896,7 +4896,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[1]);
   rtx ops[5];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   ops[0] = operands[0];
   ops[1] = gen_rtx_REG (DImode, regno);
   ops[2] = gen_rtx_REG (DImode, regno + 2);
@@ -4923,7 +4923,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[1]);
   rtx ops[5];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   else if (lane >= max / 2)
 {
   lane -= max / 2;
@@ -5046,7 +5046,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[0]);
   rtx ops[6];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   ops[0] = gen_rtx_REG (DImode, regno);
   ops[1] = gen_rtx_REG (DImode, regno + 2);
   ops[2] = gen_rtx_REG (DImode, regno + 4);
@@ -5074,7 +5074,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[0]);
   rtx ops[6];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out of range");
   else if (lane >= max / 2)
 {
   lane -= max / 2;
@@ -5226,7 +5226,7 @@ if (BYTES_BIG_ENDIAN)
   int regno = REGNO (operands[1]);
   rtx ops[6];
   if (lane < 0 || lane >= max)
-error ("lane out of range");
+internal_error ("lane out

[PATCH 4b/4] [ARM] PR63870 Remove error for invalid lane numbers

2015-11-07 Thread charles . baylis
From: Charles Baylis 

  Charles Baylis  

* config/arm/neon.md (neon_vld1_lane): Remove error for invalid
lane number.
(neon_vst1_lane): Likewise.
(neon_vld2_lane): Likewise.
(neon_vst2_lane): Likewise.
(neon_vld3_lane): Likewise.
(neon_vst3_lane): Likewise.
(neon_vld4_lane): Likewise.
(neon_vst4_lane): Likewise.

Change-Id: Id7b4b6fa7320157e62e5bae574b4c4688d921774
---
 gcc/config/arm/neon.md | 48 
 1 file changed, 8 insertions(+), 40 deletions(-)

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index e8db020..6574e6e 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -4264,8 +4264,6 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[3]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   operands[3] = GEN_INT (lane);
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
   if (max == 1)
 return "vld1.\t%P0, %A1";
   else
@@ -4286,9 +4284,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   operands[3] = GEN_INT (lane);
   int regno = REGNO (operands[0]);
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
-  else if (lane >= max / 2)
+  if (lane >= max / 2)
 {
   lane -= max / 2;
   regno += 2;
@@ -4372,8 +4368,6 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[2]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   operands[2] = GEN_INT (lane);
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
   if (max == 1)
 return "vst1.\t{%P1}, %A0";
   else
@@ -4393,9 +4387,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[2]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[1]);
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
-  else if (lane >= max / 2)
+  if (lane >= max / 2)
 {
   lane -= max / 2;
   regno += 2;
@@ -4464,8 +4456,6 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[0]);
   rtx ops[4];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
   ops[0] = gen_rtx_REG (DImode, regno);
   ops[1] = gen_rtx_REG (DImode, regno + 2);
   ops[2] = operands[1];
@@ -4489,9 +4479,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[0]);
   rtx ops[4];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
-  else if (lane >= max / 2)
+  if (lane >= max / 2)
 {
   lane -= max / 2;
   regno += 2;
@@ -4579,8 +4567,6 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[1]);
   rtx ops[4];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
   ops[0] = operands[0];
   ops[1] = gen_rtx_REG (DImode, regno);
   ops[2] = gen_rtx_REG (DImode, regno + 2);
@@ -4604,9 +4590,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[1]);
   rtx ops[4];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
-  else if (lane >= max / 2)
+  if (lane >= max / 2)
 {
   lane -= max / 2;
   regno += 2;
@@ -4723,8 +4707,6 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[0]);
   rtx ops[5];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
   ops[0] = gen_rtx_REG (DImode, regno);
   ops[1] = gen_rtx_REG (DImode, regno + 2);
   ops[2] = gen_rtx_REG (DImode, regno + 4);
@@ -4750,9 +4732,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[0]);
   rtx ops[5];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
-  else if (lane >= max / 2)
+  if (lane >= max / 2)
 {
   lane -= max / 2;
   regno += 2;
@@ -4895,8 +4875,6 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[1]);
   rtx ops[5];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
   ops[0] = operands[0];
   ops[1] = gen_rtx_REG (DImode, regno);
   ops[2] = gen_rtx_REG (DImode, regno + 2);
@@ -4922,9 +4900,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[1]);
   rtx ops[5];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
-  else if (lane >= max / 2)
+  if (lane >= max / 2)
 {
   lane -= max / 2;
   regno += 2;
@@ -5045,8 +5021,6 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[0]);
   rtx ops[6];
-  if (lane < 0 || lane >= max)
-error ("lane out of range");
   ops[0] = gen_rtx_REG (DImode, regno);
   ops[1] = gen_rtx_REG (DImode, regno + 2);
   ops[2] = gen_rtx_REG (DImode, regno + 4);
@@ -5073,9 +5047,7 @@ if (BYTES_BIG_ENDIAN)
   HOST_WIDE_INT max = GET_MODE_

[PATCH 2/4] [ARM] PR63870 Mark lane indices of vldN/vstN with appropriate qualifier

2015-11-07 Thread charles . baylis
From: Charles Baylis 

gcc/ChangeLog:

  Charles Baylis  

PR target/63870
* config/arm/arm-builtins.c: (arm_load1_qualifiers) Use
qualifier_struct_load_store_lane_index.
(arm_storestruct_lane_qualifiers) Likewise.
* config/arm/neon.md: (neon_vld1_lane) Reverse lane numbers for
big-endian.
(neon_vst1_lane) Likewise.
(neon_vld2_lane) Likewise.
(neon_vst2_lane) Likewise.
(neon_vld3_lane) Likewise.
(neon_vst3_lane) Likewise.
(neon_vld4_lane) Likewise.
(neon_vst4_lane) Likewise.

Change-Id: Ic39898d288701bc5b712490265be688f5620c4e2
---
 gcc/config/arm/arm-builtins.c |  4 ++--
 gcc/config/arm/neon.md| 49 +++
 2 files changed, 28 insertions(+), 25 deletions(-)

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 6e3aad4..113e3da 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -152,7 +152,7 @@ arm_load1_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 static enum arm_type_qualifiers
 arm_load1_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_none, qualifier_const_pointer_map_mode,
-  qualifier_none, qualifier_immediate };
+  qualifier_none, qualifier_struct_load_store_lane_index };
 #define LOAD1LANE_QUALIFIERS (arm_load1_lane_qualifiers)
 
 /* The first argument (return type) of a store should be void type,
@@ -171,7 +171,7 @@ arm_store1_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 static enum arm_type_qualifiers
 arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_void, qualifier_pointer_map_mode,
-  qualifier_none, qualifier_immediate };
+  qualifier_none, qualifier_struct_load_store_lane_index };
 #define STORE1LANE_QUALIFIERS (arm_storestruct_lane_qualifiers)
 
 #define v8qi_UP  V8QImode
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index e5a2b0f..e8db020 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -4261,8 +4261,9 @@ if (BYTES_BIG_ENDIAN)
 UNSPEC_VLD1_LANE))]
   "TARGET_NEON"
 {
-  HOST_WIDE_INT lane = INTVAL (operands[3]);
+  HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[3]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
+  operands[3] = GEN_INT (lane);
   if (lane < 0 || lane >= max)
 error ("lane out of range");
   if (max == 1)
@@ -4281,8 +4282,9 @@ if (BYTES_BIG_ENDIAN)
 UNSPEC_VLD1_LANE))]
   "TARGET_NEON"
 {
-  HOST_WIDE_INT lane = INTVAL (operands[3]);
+  HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[3]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
+  operands[3] = GEN_INT (lane);
   int regno = REGNO (operands[0]);
   if (lane < 0 || lane >= max)
 error ("lane out of range");
@@ -4367,8 +4369,9 @@ if (BYTES_BIG_ENDIAN)
  UNSPEC_VST1_LANE))]
   "TARGET_NEON"
 {
-  HOST_WIDE_INT lane = INTVAL (operands[2]);
+  HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[2]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
+  operands[2] = GEN_INT (lane);
   if (lane < 0 || lane >= max)
 error ("lane out of range");
   if (max == 1)
@@ -4387,7 +4390,7 @@ if (BYTES_BIG_ENDIAN)
  UNSPEC_VST1_LANE))]
   "TARGET_NEON"
 {
-  HOST_WIDE_INT lane = INTVAL (operands[2]);
+  HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[2]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[1]);
   if (lane < 0 || lane >= max)
@@ -4396,8 +4399,8 @@ if (BYTES_BIG_ENDIAN)
 {
   lane -= max / 2;
   regno += 2;
-  operands[2] = GEN_INT (lane);
 }
+  operands[2] = GEN_INT (lane);
   operands[1] = gen_rtx_REG (mode, regno);
   if (max == 2)
 return "vst1.\t{%P1}, %A0";
@@ -4457,7 +4460,7 @@ if (BYTES_BIG_ENDIAN)
UNSPEC_VLD2_LANE))]
   "TARGET_NEON"
 {
-  HOST_WIDE_INT lane = INTVAL (operands[3]);
+  HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[3]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[0]);
   rtx ops[4];
@@ -4466,7 +4469,7 @@ if (BYTES_BIG_ENDIAN)
   ops[0] = gen_rtx_REG (DImode, regno);
   ops[1] = gen_rtx_REG (DImode, regno + 2);
   ops[2] = operands[1];
-  ops[3] = operands[3];
+  ops[3] = GEN_INT (lane);
   output_asm_insn ("vld2.\t{%P0[%c3], %P1[%c3]}, %A2", ops);
   return "";
 }
@@ -4482,7 +4485,7 @@ if (BYTES_BIG_ENDIAN)
UNSPEC_VLD2_LANE))]
   "TARGET_NEON"
 {
-  HOST_WIDE_INT lane = INTVAL (operands[3]);
+  HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[3]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[0]);
   rtx ops[4];
@@ -4572,7 +4575,7 @@ if (BYTES_BIG_ENDIAN)
  UNSPEC_VST2_LANE))]
   "TARGET_NEON"
 {
-  HOST_WIDE_INT lane = INTVAL (operands[2]);
+  HOST_WIDE_INT lane = ENDIAN_LANE_N(mode, INTVAL (operands[2]));
   HOST_WIDE_INT max = GET_MODE_NUNITS (mode);
   int regno = REGNO (operands[1]);
   rtx ops[4];
@@ -4581,7 +4584,7 @@ if (BYTES_BIG_ENDIAN)

[PATCH v3 0/4] [ARM] PR63870 vldN_lane/vstN_lane error messages

2015-11-07 Thread charles . baylis
From: Charles Baylis 

Previous discussion: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00657.html

This is a minor update to the previous patch set, fixing one coding style issue
in the first patch, and adding a fourth patch for which there are two options,
described below.

  [ARM] PR63870 Add qualifiers for NEON builtins
  [ARM] PR63870 Mark lane indices of vldN/vstN with appropriate
qualifier
  [ARM] PR63870 Add test cases

These two patches are alternative options. Alan suggested removing the error
checks at assembly time, since the user-supplied lane number is always
checked earlier. I thought it might be better to catch this case as an internal
error, to guard against future bugs. If we don't use the internal error, then
the assembler will catch any use of invalid lane numbers. I'm not sure which is
preferred, so both options are presented. Either one can be applied:
  [ARM] PR63870 Use internal_error() for invalid lane numbers
  [ARM] PR63870 Remove error for invalid lane numbers
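
The trade-off between the two options can be sketched as follows. This is an
illustration only: lane_in_range_p, endian_lane_n and expand_lane_index are
hypothetical helpers, and the reversal formula (nunits - 1 - lane on
big-endian) is an assumption about what "reverse lane numbers for big-endian"
means, since the definition of ENDIAN_LANE_N is not shown in this series
excerpt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if LANE indexes a vector of NUNITS lanes.  */
static bool
lane_in_range_p (long lane, long nunits)
{
  return lane >= 0 && lane < nunits;
}

/* Hypothetical helper modelled on the idea behind ENDIAN_LANE_N: lanes
   are numbered in reverse order on big-endian targets (assumption).  */
static long
endian_lane_n (long nunits, long lane, bool big_endian)
{
  return big_endian ? nunits - 1 - lane : lane;
}

/* Early check, as done at builtin expansion time: reject a bad
   user-supplied lane with -1, otherwise return the lane to emit.  */
static long
expand_lane_index (long user_lane, long nunits, bool big_endian)
{
  if (!lane_in_range_p (user_lane, nunits))
    return -1;  /* this is where the user-facing diagnostic belongs */

  long lane = endian_lane_n (nunits, user_lane, big_endian);

  /* The mapping preserves the range, so a failure past this point could
     only come from a compiler bug.  Option 4a keeps a check like this
     as an internal error; option 4b drops it.  */
  assert (lane_in_range_p (lane, nunits));
  return lane;
}
```

Because the reversal maps the range 0..nunits-1 onto itself, a lane that
passes the early check cannot become invalid later, which is why the late
check can only ever fire on an internal inconsistency.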

Passes make check for arm-unknown-linux-gnueabihf and
armeb-unknown-linux-gnueabihf with no regressions. As mentioned in the last
thread, the new *_f16 tests fail on armeb-* due to unrelated problems with
half float moves. 

OK for trunk? I prefer patch 4a, but will commit 4b if that is preferred.

 gcc/config/arm/arm-builtins.c  | 52 +++-
 gcc/config/arm/arm.c   |  1 +
 gcc/config/arm/arm.h   |  3 +
 gcc/config/arm/neon.md | 97 --
 .../advsimd-intrinsics/vld2_lane_f16_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_f32_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_f64_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_p8_indices_1.c|  5 +-
 .../advsimd-intrinsics/vld2_lane_s16_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_s32_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_s64_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_s8_indices_1.c|  5 +-
 .../advsimd-intrinsics/vld2_lane_u16_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_u32_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_u64_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2_lane_u8_indices_1.c|  5 +-
 .../advsimd-intrinsics/vld2q_lane_f16_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_f32_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_f64_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_p8_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2q_lane_s16_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_s32_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_s64_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_s8_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld2q_lane_u16_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_u32_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_u64_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld2q_lane_u8_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_f16_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_f32_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_f64_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_p8_indices_1.c|  5 +-
 .../advsimd-intrinsics/vld3_lane_s16_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_s32_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_s64_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_s8_indices_1.c|  5 +-
 .../advsimd-intrinsics/vld3_lane_u16_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_u32_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_u64_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3_lane_u8_indices_1.c|  5 +-
 .../advsimd-intrinsics/vld3q_lane_f16_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_f32_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_f64_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_p8_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3q_lane_s16_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_s32_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_s64_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_s8_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld3q_lane_u16_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_u32_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_u64_indices_1.c  |  5 +-
 .../advsimd-intrinsics/vld3q_lane_u8_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld4_lane_f16_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld4_lane_f32_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld4_lane_f64_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld4_lane_p8_indices_1.c|  5 +-
 .../advsimd-intrinsics/vld4_lane_s16_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld4_lane_s32_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld4_lane_s64_indices_1.c   |  5 +-
 .../advsimd-intrinsics/vld4_lane_s8_indices_1.c|  5 +-
 .../advsimd-intrinsics/vld4_lane_u16_indices_1.c   |  5 +-
 .../advs

[PATCH 1/4] [ARM] PR63870 Add qualifiers for NEON builtins

2015-11-07 Thread charles . baylis
From: Charles Baylis 

gcc/ChangeLog:

  Charles Baylis  

PR target/63870
* config/arm/arm-builtins.c (enum arm_type_qualifiers): New enumerator
qualifier_struct_load_store_lane_index.
(builtin_arg): New enumerator NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
(arm_expand_neon_args): New parameter. Remove ellipsis. Handle NEON
argument qualifiers.
(arm_expand_neon_builtin): Handle new NEON argument qualifier.
* config/arm/arm.h (ENDIAN_LANE_N): New macro.

Change-Id: Iaa14d8736879fa53776319977eda2089f0a26647
---
 gcc/config/arm/arm-builtins.c | 48 +++
 gcc/config/arm/arm.c  |  1 +
 gcc/config/arm/arm.h  |  3 +++
 3 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index bad3dc3..6e3aad4 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -67,7 +67,9 @@ enum arm_type_qualifiers
   /* Polynomial types.  */
   qualifier_poly = 0x100,
   /* Lane indices - must be within range of previous argument = a vector.  */
-  qualifier_lane_index = 0x200
+  qualifier_lane_index = 0x200,
+  /* Lane indices for single lane structure loads and stores.  */
+  qualifier_struct_load_store_lane_index = 0x400
 };
 
 /*  The qualifier_internal allows generation of a unary builtin from
@@ -1963,6 +1965,7 @@ typedef enum {
   NEON_ARG_COPY_TO_REG,
   NEON_ARG_CONSTANT,
   NEON_ARG_LANE_INDEX,
+  NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX,
   NEON_ARG_MEMORY,
   NEON_ARG_STOP
 } builtin_arg;
@@ -2020,9 +2023,9 @@ neon_dereference_pointer (tree exp, tree type, machine_mode mem_mode,
 /* Expand a Neon builtin.  */
 static rtx
 arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode,
- int icode, int have_retval, tree exp, ...)
+ int icode, int have_retval, tree exp,
+ builtin_arg *args)
 {
-  va_list ap;
   rtx pat;
   tree arg[SIMD_MAX_BUILTIN_ARGS];
   rtx op[SIMD_MAX_BUILTIN_ARGS];
@@ -2037,13 +2040,11 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode,
  || !(*insn_data[icode].operand[0].predicate) (target, tmode)))
 target = gen_reg_rtx (tmode);
 
-  va_start (ap, exp);
-
   formals = TYPE_ARG_TYPES (TREE_TYPE (arm_builtin_decls[fcode]));
 
   for (;;)
 {
-  builtin_arg thisarg = (builtin_arg) va_arg (ap, int);
+  builtin_arg thisarg = args[argc];
 
   if (thisarg == NEON_ARG_STOP)
break;
@@ -2079,6 +2080,18 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode,
op[argc] = copy_to_mode_reg (mode[argc], op[argc]);
  break;
 
+   case NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX:
+ gcc_assert (argc > 1);
+ if (CONST_INT_P (op[argc]))
+   {
+ neon_lane_bounds (op[argc], 0,
+   GET_MODE_NUNITS (map_mode), exp);
+ /* Keep to GCC-vector-extension lane indices in the RTL.  */
+ op[argc] =
+   GEN_INT (ENDIAN_LANE_N (map_mode, INTVAL (op[argc])));
+   }
+ goto constant_arg;
+
case NEON_ARG_LANE_INDEX:
  /* Previous argument must be a vector, which this indexes.  */
  gcc_assert (argc > 0);
@@ -2089,19 +2102,22 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode,
}
  /* Fall through - if the lane index isn't a constant then
 the next case will error.  */
+
case NEON_ARG_CONSTANT:
+constant_arg:
  if (!(*insn_data[icode].operand[opno].predicate)
  (op[argc], mode[argc]))
-   error_at (EXPR_LOCATION (exp), "incompatible type for argument 
%d, "
-  "expected %", argc + 1);
+   {
+ error ("%Kargument %d must be a constant immediate",
+exp, argc + 1);
+ return const0_rtx;
+   }
  break;
+
 case NEON_ARG_MEMORY:
  /* Check if expand failed.  */
  if (op[argc] == const0_rtx)
- {
-   va_end (ap);
return 0;
- }
  gcc_assert (MEM_P (op[argc]));
  PUT_MODE (op[argc], mode[argc]);
  /* ??? arm_neon.h uses the same built-in functions for signed
@@ -2122,8 +2138,6 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode,
}
 }
 
-  va_end (ap);
-
   if (have_retval)
 switch (argc)
   {
@@ -2235,6 +2249,8 @@ arm_expand_neon_builtin (int fcode, tree exp, rtx target)
 
   if (d->qualifiers[qualifiers_k] & qualifier_lane_index)
args[k] = NEON_ARG_LANE_INDEX;
+  else if (d->qualifiers[qualifiers_k] & qualifier_struct_load_store_lane_index)
+   args[k] = NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX;

Re: [PATCH] Add support for ARM embedded multilibs

2015-11-07 Thread Jasmin J.
Hello Ramana and Tejas!

The patch is originally from Terry Guo
(see https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00729.html).
SVN commit r210320 on 
svn://gcc.gnu.org/svn/gcc/branches/ARM/embedded-4_9-branch .

The original was using "with_multilib_list" instead of TM_MULTILIB_CONFIG.
Moreover, it did not check each argument of "$with_multilib_list".

I simplified the patch and reworked it to use TM_MULTILIB_CONFIG. Additionally,
each argument of "$with_multilib_list" is now checked.
I added the missing "armv7".

I added the FSF header to t-rmprofile and a little explanation.

Concerning the copyright assignment:

I found this sentence on the gcc contribute page:
  ... a copyright disclaimer to put the change in the public domain is
  acceptable as well.
and
  Small changes can be accepted without a copyright disclaimer or a copyright
  assignment on file.

So here it is:

*
* I submit this change in the public domain.
*


In the meantime, I found the copyright assignment form. I will send it soon
to gnu.org.

Concerning testing:

> see for example how I added t-aprofile to the backend and the kind of 
> testing it underwent
If this patch is now in principle acceptable, I will start working on your
suggested test scripts.

> The t-rmprofile file will need updating to newer values for -mcpu and 
> march
I will leave this open for other people, because I am not familiar with
the different CPU and ARCH variants. Keep in mind that I am only porting
Terry's patch. But if someone tells me what is required, I can add
it now and include it in the test scripts.

Regards,
   Jasmin
From cfe11cfdfbe3c7655bac246bbf503ac0f5c7114d Mon Sep 17 00:00:00 2001
From: Jasmin Jessich 
Date: Sat, 24 Oct 2015 00:43:48 +0200
Subject: [PATCH] Add support for ARM embedded multilibs

Based on svn://gcc.gnu.org/svn/gcc/branches/ARM/embedded-4_9-branch
commit r210320 from Terry Guo  .

 * config.gcc (--with-multilib-list): Accept arm embedded cores.
 * configure/configure.ac: Helptext.
 * config/arm/t-rmprofile: New file.

Signed-off-by: Terry Guo 
Signed-off-by: Jasmin Jessich 
---
 gcc/config.gcc |  14 ++
 gcc/config/arm/t-rmprofile | 121 +
 gcc/configure  |   2 +-
 gcc/configure.ac   |   2 +-
 4 files changed, 137 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/arm/t-rmprofile

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 9cc765e..57f333d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3796,6 +3796,18 @@ case "${target}" in
 	tmake_file="${tmake_file} arm/t-aprofile"
 	break
 	;;
+armv6-m|armv7|armv7-m|armv7e-m|armv7-r|armv7-a|cortex-m7)
+	if test "x$with_arch" != x \
+	|| test "x$with_cpu" != x \
+	|| test "x$with_float" != x \
+	|| test "x$with_fpu" != x \
+	|| test "x$with_mode" != x ; then
+	echo "Error: You cannot use any of --with-arch/cpu/fpu/float/mode with --with-multilib-list=${with_multilib_list}" 1>&2
+	exit 1
+	fi
+	tmake_file_ml=" arm/t-rmprofile"
+	TM_MULTILIB_CONFIG="${TM_MULTILIB_CONFIG},${arm_multilib}"
+	;;
 default)
 	;;
 *)
@@ -3804,6 +3816,8 @@ case "${target}" in
 	;;
 esac
 			done
+			tmake_file="${tmake_file}${tmake_file_ml}"
+			TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
 		fi
 		;;
 
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
new file mode 100644
index 000..65d60c0
--- /dev/null
+++ b/gcc/config/arm/t-rmprofile
@@ -0,0 +1,121 @@
+# Copyright (C) 2012-2015 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+# This is a target makefile fragment that attempts to get
+# multilibs built for the range of CPU's, FPU's and ABI's the user did
+# customize via the configure option --with-multilib-list.
+# It should not be used in conjunction with another make file fragment and
+# assumes --with-arch, --with-cpu, --with-fpu, --with-float, --with-mode
+# have their default values during the configure step.  We enforce
+# this during the top-level configury.
+
+comma := ,
+space :=
+space +=
+
+MULTILIB_OPTIONS   = mthumb/marm
+MULTILIB_DIRNAMES  = thumb arm
+MULTILIB_OPTIONS  += march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7/mcpu=cortex-m7
+MULTILIB_DIRNAMES += armv6-m armv7-m armv7e-m 

Re: [PATCH] c/67882 - improve -Warray-bounds for invalid offsetof

2015-11-07 Thread Segher Boessenkool
On Tue, Oct 20, 2015 at 10:10:44PM +, Joseph Myers wrote:
> > typedef struct FA5_7 {
> >   int i;
> >   char a5_7 [5][7];
> > } FA5_7;
> > 
> > __builtin_offsetof (FA5_7, a5_7 [0][7]), // { dg-warning "index" }
> > __builtin_offsetof (FA5_7, a5_7 [1][7]), // { dg-warning "index" }
> > __builtin_offsetof (FA5_7, a5_7 [5][0]), // { dg-warning "index" }
> > __builtin_offsetof (FA5_7, a5_7 [5][7]), // { dg-warning "index" }
> > 
> > Here I think the last one of these is most likely invalid (being 8 bytes 
> > past
> > the end of the object, rather than just one) and the others valid. Can you
> > confirm this? (If the &a.v[2].a example is considered invalid, then I think
> > the a5_7[5][0] test would be the equivalent and ought to also be considered
> > invalid).
> 
> The last one is certainly invalid.  The one before is arguably invalid as 
> well (in the unary '&' equivalent, &a5_7[5][0] which is equivalent to 
> a5_7[5] + 0, the questionable operation is implicit conversion of a5_7[5] 
> from array to pointer - an array expression gets converted to an 
> expression "that points to the initial element of the array object", but 
> there is no array object a5_7[5] here).

C11, 6.5.2.1/3:
Successive subscript operators designate an element of a
multidimensional array object. If E is an n-dimensional array (n >= 2)
with dimensions i x j x . . . x k, then E (used as other than an lvalue)
is converted to a pointer to an (n - 1)-dimensional array with
dimensions j x . . . x k. If the unary * operator is applied to this
pointer explicitly, or implicitly as a result of subscripting, the
result is the referenced (n - 1)-dimensional array, which itself is
converted into a pointer if used as other than an lvalue. It follows
from this that arrays are stored in row-major order (last subscript
varies fastest).

As far as I see, a5_7[5] here is never treated as an array, just as a
pointer, and &a5_7[5][0] is valid.
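
As a side illustration of the row-major rule quoted above, here is a
minimal sketch in C.  The helper name a5_7_offset is made up for this
example; it only reproduces the byte arithmetic (rows are 7 chars wide,
the last subscript varies fastest) and does not by itself settle which
member designators are valid.

```c
#include <stddef.h>

typedef struct FA5_7 {
  int i;
  char a5_7[5][7];
} FA5_7;

/* Hypothetical helper: byte offset of a5_7[row][col], computed by hand
   from the row-major layout.  Each row occupies sizeof (char[7]) bytes,
   and the column index simply counts chars within the row.  */
size_t
a5_7_offset (size_t row, size_t col)
{
  return offsetof (FA5_7, a5_7) + row * sizeof (char[7]) + col;
}
```

For in-bounds indices this agrees with offsetof (FA5_7, a5_7[row][col]).
Arithmetically, (0, 7) lands on the same byte as (1, 0), and (5, 0) lands
exactly one past the end of the array.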


Segher


Re: Fix ipa-polymorphic-call-info ICE

2015-11-07 Thread Jan Hubicka
> Hi!
> 
> On Fri, 6 Nov 2015 07:07:27 +0100, Jan Hubicka  wrote:
> > this patch fixes a triple thinko in
> > ipa_polymorphic_call_context::restrict_to_inner_type when dealing with an
> > offset that is out of the range of the type considered.  In this case 
> > function
> > should return true only when type is dynamic (so there may be additional 
> > type
> > after the known type) or derivations are allowed (so the type may get 
> > bigger).
> > There is a check that is supposed to make this happen, but it clears the flags
> > before checking them, which is not a good idea.
> > 
> > There is also a check in contains_type_p that is supposed to shortcut this 
> > scenario,
> > but instead of checking TYPE_SIZE it checks type itself (ouch).
> > 
> > Bootstrapped/regtested x86_64-linux, will commit it shortly.
> 
> > --- testsuite/g++.dg/lto/pr68057_0.C(revision 0)
> > +++ testsuite/g++.dg/lto/pr68057_0.C(revision 0)
> > @@ -0,0 +1,23 @@
> > +// { dg-lto-do compile }
> > +/* { dg-extra-ld-options { -O2 -Wno-odr -r -nostdlib } } */
> 
> I'm seeing "WARNING: lto.exp does not support dg-lto-do compile" as well
> as a few FAILs:

Ah, sorry, I must have messed up the testcase. Will fix it ASAP.

Honza


Re: [PATCH 1/9] ENABLE_CHECKING refactoring

2015-11-07 Thread Gerald Pfeifer
On Wed, 21 Oct 2015, Jeff Law wrote:
> I might even claim it's already helping.  While we're still seeing 
> syntax errors in the conditionally compiled code, it doesn't feel like 
> we're seeing it as often as in the past.  That's purely anecdotal based 
> on what I've seen fly by over the last couple years.

In the past, when I was offline (or at least off GCC) for two weeks, 
perhaps three, there were usually one, if not two, instances of my 
daily bootstrap on i386 and FreeBSD failing.

This is rarely the case anymore these days, and if anything happens,
usually it's fixed the following day.

So it seems things have quite improved.

Gerald


[SPARC] Add missing final period

2015-11-07 Thread Eric Botcazou
Applied on the mainline.


2015-11-07  Eric Botcazou  

* config/sparc/sparc.opt (mfix-at697f): Add final period.

-- 
Eric Botcazou
Index: config/sparc/sparc.opt
===
--- config/sparc/sparc.opt	(revision 229840)
+++ config/sparc/sparc.opt	(working copy)
@@ -209,7 +209,7 @@ Enable strict 32-bit psABI struct return
 mfix-at697f
 Target Report RejectNegative Var(sparc_fix_at697f)
 Enable workaround for single erratum of AT697F processor
-(corresponding to erratum #13 of AT697E processor)
+(corresponding to erratum #13 of AT697E processor).
 
 mfix-ut699
 Target Report RejectNegative Var(sparc_fix_ut699)


[committed] gcc.dg/Wno-frame-address.c: Skip on hppa*-*-*

2015-11-07 Thread John David Anglin
We need to skip on hppa since it's not possible to access arbitrary frames.

Dave
--
John David Anglin   dave.ang...@bell.net


2015-11-07  John David Anglin  

* gcc.dg/Wno-frame-address.c: Skip on hppa*-*-*.

Index: gcc.dg/Wno-frame-address.c
===
--- gcc.dg/Wno-frame-address.c  (revision 229906)
+++ gcc.dg/Wno-frame-address.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-skip-if "Cannot access arbitrary stack frames" { arm*-*-* visium-*-* } } */
+/* { dg-skip-if "Cannot access arbitrary stack frames" { arm*-*-* hppa*-*-* visium-*-* } } */
 /* { dg-options "-Werror" } */
 
 /* Verify that -Wframe-address is not enabled by default by enabling


Re: [PATCH] Add -fchecking

2015-11-07 Thread Gerald Pfeifer
On Tue, 27 Oct 2015, Richard Biener wrote:
> This adds -fchecking as a way to enable internal consistency checks
> even in release builds (or disable checking with -fno-checking - up to
> a certain extent - with checking enabled).

How (much) do we want to advertize this?

Gerald


[PATCH] PR fortran/68224 -- Check for NULL() in an array spec.

2015-11-07 Thread Steve Kargl
NULL() can only appear in a few situations.  It cannot
be part of an array spec.  See testcase for example.
OK to commit?
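
To make the shape of the check easier to see outside the parser, here is
a simplified sketch in C.  The toy_* types and names are hypothetical
stand-ins invented for this example; they are not the gfortran data
structures, and the real test is the one in the patch below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the expression kinds.  */
enum toy_expr_type { TOY_EXPR_CONSTANT, TOY_EXPR_FUNCTION };
enum toy_basic_type { TOY_BT_UNKNOWN, TOY_BT_INTEGER };

struct toy_expr {
  enum toy_expr_type expr_type;
  enum toy_basic_type type;
  const char *symbol_name;   /* NULL when no symbol is attached.  */
};

/* Return true if E looks like a bare NULL() reference: a function
   expression with no known type whose symbol is named "null".  */
bool
toy_is_untyped_null_call (const struct toy_expr *e)
{
  return e->expr_type == TOY_EXPR_FUNCTION
         && e->type == TOY_BT_UNKNOWN
         && e->symbol_name != NULL
         && strcmp (e->symbol_name, "null") == 0;
}
```

A bound for which this predicate holds would be rejected with the
"scalar INTEGER expression" error; a typed function call or a constant
would pass through.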

2015-11-07  Steven G. Kargl  

PR fortran/68224
* array.c (match_array_element_spec): Check for invalid NULL().
While here, fix nearby comments.

2015-11-07  Steven G. Kargl  

PR fortran/68224
* gfortran.dg/pr68224.f90: New test.

-- 
Steve
Index: gcc/fortran/array.c
===
--- gcc/fortran/array.c	(revision 229933)
+++ gcc/fortran/array.c	(working copy)
@@ -147,9 +147,9 @@ matched:
 }
 
 
-/* Match an array reference, whether it is the whole array or a
-   particular elements or a section. If init is set, the reference has
-   to consist of init expressions.  */
+/* Match an array reference, whether it is the whole array or particular
+   elements or a section.  If init is set, the reference has to consist
+   of init expressions.  */
 
 match
 gfc_match_array_ref (gfc_array_ref *ar, gfc_array_spec *as, int init,
@@ -422,6 +422,13 @@ match_array_element_spec (gfc_array_spec
   if (!gfc_expr_check_typed (*upper, gfc_current_ns, false))
 return AS_UNKNOWN;
 
+  if ((*upper)->expr_type == EXPR_FUNCTION && (*upper)->ts.type == BT_UNKNOWN
+  && (*upper)->symtree && strcmp ((*upper)->symtree->name, "null") == 0)
+{
+  gfc_error ("Expecting a scalar INTEGER expression at %C");
+  return AS_UNKNOWN;
+}
+
   if (gfc_match_char (':') == MATCH_NO)
 {
   *lower = gfc_get_int_expr (gfc_default_integer_kind, NULL, 1);
@@ -442,13 +449,20 @@ match_array_element_spec (gfc_array_spec
   if (!gfc_expr_check_typed (*upper, gfc_current_ns, false))
 return AS_UNKNOWN;
 
+  if ((*upper)->expr_type == EXPR_FUNCTION && (*upper)->ts.type == BT_UNKNOWN
+  && (*upper)->symtree && strcmp ((*upper)->symtree->name, "null") == 0)
+{
+  gfc_error ("Expecting a scalar INTEGER expression at %C");
+  return AS_UNKNOWN;
+}
+
   return AS_EXPLICIT;
 }
 
 
 /* Matches an array specification, incidentally figuring out what sort
-   it is. Match either a normal array specification, or a coarray spec
-   or both. Optionally allow [:] for coarrays.  */
+   it is.  Match either a normal array specification, or a coarray spec
+   or both.  Optionally allow [:] for coarrays.  */
 
 match
 gfc_match_array_spec (gfc_array_spec **asp, bool match_dim, bool match_codim)
Index: gcc/testsuite/gfortran.dg/pr68224.f90
===
--- gcc/testsuite/gfortran.dg/pr68224.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/pr68224.f90	(working copy)
@@ -0,0 +1,10 @@
+! { dg-do compile }
+! PR fortran/68224
+! Original code contribute by Gerhard Steinmetz
+! 
+! 
+program p
+   integer, parameter :: a(null()) = [1, 2]   ! { dg-error "scalar INTEGER expression" }
+   integer, parameter :: b(null():*) = [1, 2]   ! { dg-error "scalar INTEGER expression" }
+   integer, parameter :: c(1:null()) = [1, 2]   ! { dg-error "scalar INTEGER expression" }
+end program p


Re: Combined constructs' clause splitting

2015-11-07 Thread Cesar Philippidis
On 11/07/2015 03:45 AM, Thomas Schwinge wrote:
> Hi!
> 
> On Fri, 6 Nov 2015 15:31:23 -0800, Cesar Philippidis  
> wrote:
>> I've applied this patch to gomp-4_0-branch which backports most of my
>> front end changes from trunk. Note that I found a regression while
>> testing, which is also present in trunk. It looks like
>> kernels-acc-loop-reduction.c is failing because I'm incorrectly
>> propagating the reduction variable to both the kernels and loop
>> constructs for combined 'acc kernels loop'. The problem here is that
>> kernels don't support the reduction clause. I'll fix that next week.
> 
> Always need to consider both what the specification allows -- and thus
> what the front ends accept/refuse -- as well as what we might do
> differently, internally in later processing stages.  I have not analyzed
> whether it makes sense to have the OMP_CLAUSE_REDUCTION of a combined
> "kernels loop reduction([...])" construct be attached to the outer
> OACC_KERNELS or inner OACC_LOOP, or duplicated for both.
> 
> Tom, if you need a solution for that right now/want to restore the
> previous behavior (attached to inner OACC_LOOP only), here's what you
> should try: in gcc/c-family/c-omp.c:c_oacc_split_loop_clauses remove the
> special handling for OMP_CLAUSE_REDUCTION, and move it to "Loop clauses"
> section, and in

That should work.

> gcc/fortran/trans-openmp.c:gfc_trans_oacc_combined_directive I don't see
> reduction clauses being handled, hmm, maybe the Fortran front end is
> doing that differently?

You're correct, reductions are being associated with kernels and
parallel constructs. This is one area that needed more test cases, but
things like

  'acc parallel reduction(+:var) copy(var)'

were broken because of the recent gimplifier changes, so I couldn't test
for it. I was planning on fixing both problems (reductions and variable
appearing in multiple clauses) after Nathan's firstprivate and default
gimplifier changes landed in trunk.

Cesar



Re: [gomp4] backport trunk FE changes

2015-11-07 Thread Cesar Philippidis
On 11/07/2015 04:30 AM, Thomas Schwinge wrote:
> Hi!
> 
> On Fri, 6 Nov 2015 15:31:23 -0800, Cesar Philippidis  
> wrote:
>> I've applied this patch to gomp-4_0-branch which backports most of my
>> front end changes from trunk.
> 
>> --- a/gcc/cp/pt.c
>> +++ b/gcc/cp/pt.c
>> @@ -14398,7 +14398,6 @@ tsubst_omp_clauses (tree clauses, bool declare_simd, 
>> bool allow_fields,
>>  case OMP_CLAUSE_NUM_GANGS:
>>  case OMP_CLAUSE_NUM_WORKERS:
>>  case OMP_CLAUSE_VECTOR_LENGTH:
>> -case OMP_CLAUSE_GANG:
>>  case OMP_CLAUSE_WORKER:
>>  case OMP_CLAUSE_VECTOR:
>>  case OMP_CLAUSE_ASYNC:
>> @@ -14427,7 +14426,7 @@ tsubst_omp_clauses (tree clauses, bool declare_simd, 
>> bool allow_fields,
>>  = tsubst_omp_clause_decl (OMP_CLAUSE_DECL (oc), args, complain,
>>in_decl);
>>break;
>> -case OMP_CLAUSE_LINEAR:
>> +case OMP_CLAUSE_GANG:
>>  case OMP_CLAUSE_ALIGNED:
>>OMP_CLAUSE_DECL (nc)
>>  = tsubst_omp_clause_decl (OMP_CLAUSE_DECL (oc), args, complain,
> 
> This -- unintentional, I suppose ;-) -- removal of OMP_CLAUSE_LINEAR
> caused a lot of regressions; committed to gomp-4_0-branch in r229928:

Thank you. I had two versions of this patch and I committed the wrong
one. That was the only change though.

Cesar



Re: [Patch, fortran] PR68196 [4.9/5/6 Regression] ICE on function result with procedure pointer component

2015-11-07 Thread Steve Kargl
On Wed, Nov 04, 2015 at 04:03:10PM +0100, Paul Richard Thomas wrote:
> 
> 2015-11-04  Paul Thomas  
> 
> PR fortran/68196
> * class.c (has_finalizer_component): Prevent infinite recursion
> through this function if the derived type and that of its
> component are the same.
> * trans-types.c (gfc_get_derived_type): Do the same for proc
> pointers by ignoring the explicit interface for the component.
> 
> PR fortran/66465
> * check.c (same_type_check): If either of the expressions is
> BT_PROCEDURE, use the typespec from the symbol, rather than the
> expression.
> 
> 2015-11-04  Paul Thomas  
> 
> PR fortran/68196
> * gfortran.dg/proc_ptr_47.f90: New test.
> 
> PR fortran/66465
> * gfortran.dg/pr66465.f90: New test.

OK.  Thanks for the patch.

-- 
steve


RFC: Incomplete Draft Patches to Correct Errors in Loop Unrolling Frequencies (bugzilla problem 68212)

2015-11-07 Thread Kelvin Nilsen


This is a draft patch to partially address the concerns described in 
bugzilla problem report 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212). The patch is 
incomplete in the sense that there are some known shortcomings with 
nested loops which I am still working on.  I am sending this out for 
comments at this time because we would like these patches to be 
integrated into the GCC 6 release and want to begin responding to 
community feedback as soon as possible in order to make the integration 
possible.


The problem described in Bugzilla 68212 is that code produced by loop 
unrolling has incorrect block frequencies.  The erroneous block 
frequencies result because block frequencies are not adjusted to account 
for the execution contexts into which they are copied. These incorrect 
frequencies disable and confuse subsequent compiler optimizations.  The 
general idea of how we address this problem is twofold:


 1. Before a loop body is unpeeled into a pre-header location, we 
temporarily adjust the loop body frequencies to represent the values 
appropriate for the context into which the loop body is to be copied.


 2. After unrolling the loop body (by replicating the loop body (N-1) 
times within the loop), we recompute all frequencies associated with 
blocks contained within the loop.


Additional test programs will be added to the bugzilla report and will 
be integrated into the regression test suite as part of the final 
submission of this patch.


ChangeLog:

2015-11-07  Kelvin Nilsen 

* cfgloopmanip.h
(in_loop_p): new extern declaration
(zero_loop_frequencies): new extern declaration
(increment_loop_frequencies): new extern declaration

* cfgloopmanip.c
(in_loop_p): new helper routine
(zero_loop_frequencies): new helper routine
(block_ladder_rung): new struct definition for helper routines
(same_edge_p): new helper routine
(in_edge_set_p): new helper routine
(in_call_chain_p): new helper routine
(recursively_zero_frequency): new helper routine
(recursion_detected_p): new helper routine
(in_loop_of_header_p): new helper routine
(recursively_get_loop_blocks): new helper routine
(get_loop_blocks): new helper routine
(in_block_set_p): new helper routine
(get_exit_edges_from_loop_blocks): new helper routine
(zero_partial_loop_frequencies): new helper routine
(recursively_increment_frequency): new helper routine
(increment_loop_frequencies): new helper routine
(internal): new helper routine
(check_loop_frequency_integrity): new helper routine
(set_zero_probability): added another parameter
(duplicate_loop_to_header_edge): Add code to recompute loop 
body frequencies after blocks are replicated (unrolled) into the loop 
body. Introduce certain helper routines because existing infrastructure 
routines are not reliable during typical executions of 
duplicate_loop_to_header_edge().


* loop-unroll.c
(unroll_loop_constant_iterations): After replicating the loop 
body within a loop, recompute the frequencies for all blocks contained 
within the loop.
(unroll_loop_runtime_iterations): Before copying loop body to 
preheader location, temporarily adjust the loop body frequencies to 
represent the context into which the loop body will be copied. After 
replicating the loop body within a loop, recompute the frequencies for 
all blocks contained within the loop.


Index: loop-unroll.c
===
--- loop-unroll.c   (.../trunk/gcc) (revision 229257)
+++ loop-unroll.c   (.../branches/ibm/kelvin-1/gcc) (working copy)
@@ -587,14 +587,14 @@ unroll_loop_constant_iterations (struct loop *loop
   /* Now unroll the loop.  */
 
   opt_info_start_duplication (opt_info);
+
   ok = duplicate_loop_to_header_edge (loop, loop_latch_edge (loop),
  max_unroll,
  wont_exit, desc->out_edge,
  &remove_edges,
- DLTHE_FLAG_UPDATE_FREQ
- | (opt_info
+ opt_info
 ? DLTHE_RECORD_COPY_NUMBER
-  : 0));
+  : 0);
   gcc_assert (ok);
 
   if (opt_info)
@@ -876,6 +876,7 @@ unroll_loop_runtime_iterations (struct loop *loop)
   auto_vec dom_bbs;
 
   body = get_loop_body (loop);
+
   for (i = 0; i < loop->num_nodes; i++)
 {
   vec ldom;
@@ -943,6 +944,7 @@ unroll_loop_runtime_iterations (struct loop *loop)
   && !desc->noloop_assumptions)
 bitmap_set_bit (wont_exit, 1);
   ezc_swtch = loop_preheader_edge (loop)->src;
+
   ok = duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
 

Re: Add null identifiers to genmatch

2015-11-07 Thread Pedro Alves
Hi Richard,

Passerby comment below.

On 11/07/2015 01:21 PM, Richard Sandiford wrote:
> -/* Lookup the identifier ID.  */
> +/* Lookup the identifier ID.  Allow "null" if ALLOW_NULL.  */
>  
>  id_base *
> -get_operator (const char *id)
> +get_operator (const char *id, bool allow_null = false)
>  {
> +  if (allow_null && strcmp (id, "null") == 0)
> +return null_id;
> +
>id_base tem (id_base::CODE, id);

Boolean params are best avoided if possible, IMO.  In this case,
it seems this could instead be a new wrapper function, like:

id_base *
get_operator_allow_null (const char *id)
{
  if (strcmp (id, "null") == 0)
return null_id;
  return get_operator (id);
}

Then callers are more obvious as you no longer have to know
what true/false mean:

   const char *id = get_ident ();
-  if (get_operator (id) != NULL)
+  if (get_operator_allow_null (id) != NULL)
fatal_at (token, "operator already defined");


vs:

   const char *id = get_ident ();
-  if (get_operator (id) != NULL)
+  if (get_operator (id, true) != NULL)
fatal_at (token, "operator already defined");


Thanks,
Pedro Alves



[PATCH] Complete cxa_atexit for AIX

2015-11-07 Thread David Edelsohn
IBM xlC++ compiler provides its own implementation of atexit() to
provide correct interaction between atexit() and destructors.  GCC
needs to provide the same through libgcc.

I previously added partial support based on the implementation in
GLIBC (copied with FSF permission) and confirmed that destructors
continued to function properly.  This patch completes the support with
the matching definition of atexit().
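
To illustrate why atexit() and destructor registration need to share one
mechanism, here is a toy model in C.  It is not the libgcc or GLIBC
implementation; all toy_* names are invented for this sketch.  Both
kinds of handler go onto a single list and are run in reverse order of
registration, so their relative order is preserved.

```c
#include <stddef.h>

#define TOY_MAX_HANDLERS 32

struct toy_handler {
  void (*with_arg) (void *);  /* destructor-style handler, or NULL */
  void (*plain) (void);       /* atexit-style handler, or NULL */
  void *arg;
};

static struct toy_handler toy_handlers[TOY_MAX_HANDLERS];
static size_t toy_handler_count;

/* Append H to the shared list; return 0 on success, -1 when full.  */
static int
toy_register (struct toy_handler h)
{
  if (toy_handler_count == TOY_MAX_HANDLERS)
    return -1;
  toy_handlers[toy_handler_count++] = h;
  return 0;
}

/* Destructor-style registration: FN is later called with ARG.  */
int
toy_cxa_atexit (void (*fn) (void *), void *arg)
{
  struct toy_handler h = { fn, NULL, arg };
  return toy_register (h);
}

/* atexit-style registration on the same list.  */
int
toy_atexit (void (*fn) (void))
{
  struct toy_handler h = { NULL, fn, NULL };
  return toy_register (h);
}

/* Run all handlers, most recently registered first.  */
void
toy_run_handlers (void)
{
  while (toy_handler_count > 0)
    {
      struct toy_handler h = toy_handlers[--toy_handler_count];
      if (h.with_arg)
        h.with_arg (h.arg);
      else if (h.plain)
        h.plain ();
    }
}

/* Example handlers that append one character to a log, so that the
   order in which they ran can be inspected afterwards.  */
char toy_log[8];
size_t toy_log_len;

void
toy_log_dtor (void *arg)
{
  if (toy_log_len < sizeof toy_log - 1)
    toy_log[toy_log_len++] = *(const char *) arg;
}

void
toy_log_plain (void)
{
  if (toy_log_len < sizeof toy_log - 1)
    toy_log[toy_log_len++] = 'p';
}
```

Registering a destructor for 'a', then a plain handler, then a
destructor for 'b', and finally calling toy_run_handlers leaves "bpa" in
the log.  With two separate lists that interleaving would be lost.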

With this patch, cxa_atexit tests in the testsuite function correctly.

I plan to backport this patch to the GCC 5 branch, in which cxa_atexit
is not enabled by default.

Bootstrapped on powerpc-ibm-aix7.1.0.0 and powerpc-ibm-aix7.1.3.0.

Thanks, David

* config/rs6000/atexit.c: New file.
* config/rs6000/t-aix-cxa (LIB2ADDEH): Build atexit.c.
* config/rs6000/libgcc-aix-cxa.ver (atexit): Add symbol to exports.
* config/rs6000/cxa_finalize.c
(catomic_compare_and_exchange_bool_acq): Negate return value.




Re: [Fortran, patch, pr68218, v1] ALLOCATE with size given by a module function

2015-11-07 Thread Paul Richard Thomas
Dear Andre,

OK for trunk.

I understand that you have investigated the issue(s) reported to you
by Dominique and can find no sign of them.

Thanks

Paul

On 5 November 2015 at 15:29, Andre Vehreschild  wrote:
> Hi all,
>
> attached is a rather trivial patch to prevent multiple evaluations of a
> function in:
>
>   allocate( array(func()) )
>
> The patch tests whether the upper bound of the array is a function
> and calls gfc_evaluate_now().
>
> Bootstrapped and regtested for x86_64-linux-gnu/f21.
>
> Ok for trunk?
>
> Regards,
> Andre
> --
> Andre Vehreschild * Email: vehre ad gmx dot de



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx


[gomp4] Merge trunk r229809 (2015-11-05) into gomp-4_0-branch

2015-11-07 Thread Thomas Schwinge
Hi!

Committed to gomp-4_0-branch in r229929:

commit f782e15f314aa57eb7bca3bfdea54fba6c48e929
Merge: eb7d11e e103794
Author: tschwinge 
Date:   Sat Nov 7 13:42:07 2015 +

svn merge -r 229770:229809 svn+ssh://gcc.gnu.org/svn/gcc/trunk


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@229929 
138bc75d-0d04-0410-961f-82ee72b054a4


Grüße
 Thomas




OpenACC Firstprivate

2015-11-07 Thread Nathan Sidwell

Jakub,
this patch implements firstprivate support for openacc.  This is pretty 
straightforward -- they're just regular auto variables, but with an initialization 
value from the host.


The gimplify.c implementation is somewhat different to gomp4 branch, as I've 
added new bits to enum omp_region_type, rather than add 2 new fields to 
omp_region_ctx.  The new enums use bits already defined in omp_region_type:


+  ORT_ACC = 0x40,  /* An OpenACC region.  */
+  ORT_ACC_DATA = ORT_ACC | ORT_TARGET_DATA, /* Data construct.  */
+  ORT_ACC_PARALLEL = ORT_ACC | ORT_TARGET,  /* Parallel construct */
+  ORT_ACC_KERNELS  = ORT_ACC | ORT_TARGET | 0x80,  /* Kernels construct.  */

On gomp4 we were already setting those bits, but then setting the new fields to 
indicate 'openacc'.  Many places in gimplify.c where we check for '== 
ORT_TARGET_DATA' or ORT_TARGET get changed to '& ORT_TARGET_DATA' etc.


On gomp4 for things like an openacc loop we were setting ORT_WORKSHARE, so 
nearly all checks for == ORT_WORKSHARE get an additional '|| X == ORT_ACC'.


Although this patch doesn't make use of the difference between ORT_ACC_KERNELS 
and ORT_ACC_PARALLEL, the default handling patch will -- they have different 
behaviours.
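

For readers following along, here is a small standalone sketch in C of
why the '==' checks become '&' checks.  The TOY_ names mirror the enum
above but are a separate toy copy; the value 16 for TOY_ORT_TARGET_DATA
is an assumption, since that enumerator is not shown in this mail.

```c
#include <stdbool.h>

enum toy_region_type {
  TOY_ORT_WORKSHARE = 0,
  TOY_ORT_TARGET_DATA = 16,   /* assumed value, not shown above */
  TOY_ORT_TARGET = 32,
  TOY_ORT_ACC = 0x40,
  TOY_ORT_ACC_DATA = TOY_ORT_ACC | TOY_ORT_TARGET_DATA,
  TOY_ORT_ACC_PARALLEL = TOY_ORT_ACC | TOY_ORT_TARGET,
  TOY_ORT_ACC_KERNELS = TOY_ORT_ACC | TOY_ORT_TARGET | 0x80,
  TOY_ORT_NONE = 0x100
};

/* Mask test: true for the plain target region and for the OpenACC
   parallel and kernels variants that include the TARGET bit.  */
bool
toy_is_target_region (enum toy_region_type t)
{
  return (t & TOY_ORT_TARGET) != 0;
}

/* Mask test for data regions, OpenMP or OpenACC.  */
bool
toy_is_target_data_region (enum toy_region_type t)
{
  return (t & TOY_ORT_TARGET_DATA) != 0;
}

/* Equality test widened with an extra alternative, in the style of
   the '|| X == ORT_ACC' additions described above.  */
bool
toy_is_workshare_like (enum toy_region_type t)
{
  return t == TOY_ORT_WORKSHARE || t == TOY_ORT_ACC;
}
```

An equality test against TOY_ORT_TARGET would miss TOY_ORT_ACC_PARALLEL
(0x60) and TOY_ORT_ACC_KERNELS (0xe0), whereas the mask test catches
both and still rejects TOY_ORT_NONE (0x100).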


I think the gimplify.c changes are then obvious from that, but let me know.

In omp-low the changes are to remove 'sorry' and build the initializer exprs in 
lower_omp_target.


As you can see this fixes a few xfails.

I'll post the default handling patch, which is much more localized.

nathan
2015-11-06  Nathan Sidwell  
	Cesar Philippidis  

	gcc/
	* gcc/gimplify.c (enum  omp_region_type): Add ORT_ACC,
	ORT_ACC_DATA, ORT_ACC_PARALLEL, ORT_ACC_KERNELS.  Adjust ORT_NONE.
	(new_omp_context): Initialize all fields.
	(gimple_add_tmp_var): Add ORT_ACC checks.
	(gimplify_var_or_parm_decl): Likewise.
	(omp_firstprivatize_variable): Likewise. Use ORT_TARGET_DATA as a
	mask.
	(omp_add_variable): Look in outer contexts for openacc and allow
	reductions with other sharing. Add ORT_ACC and ORT_TARGET_DATA
	checks.
	(omp_notice_variable, omp_is_private, omp_check_private): Add
	ORT_ACC checks.
	(gimplify_scan_omp_clauses): Treat ORT_ACC as ORT_WORKSHARE.
	Permit private openacc reductions.
	(gimplify_oacc_cache): Specify ORT_ACC.
	(gimplify_omp_workshare): Adjust OpenACC region types.
	(gimplify_omp_target_update): Likewise.
	* gcc/omp-low.c (scan_sharing_clauses): Remove Openacc
	firstprivate sorry.
	(lower_rec_input_clauses): Don't handle openacc firstprivate
	references here.
	(lower_omp_target): Emit initializers for openacc firstprivate vars.

	gcc/testsuite/
	* gfortran.dg/goacc/private-3.f95: Remove xfail.
	* gfortran.dg/goacc/combined_loop.f90: Remove xfail.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Remove xfail.
	* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Remove xfail.
	* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: New.

Index: gcc/gimplify.c
===
--- gcc/gimplify.c	(revision 229892)
+++ gcc/gimplify.c	(working copy)
@@ -108,9 +108,15 @@ enum omp_region_type
   /* Data region with offloading.  */
   ORT_TARGET = 32,
   ORT_COMBINED_TARGET = 33,
+
+  ORT_ACC = 0x40,  /* An OpenACC region.  */
+  ORT_ACC_DATA = ORT_ACC | ORT_TARGET_DATA, /* Data construct.  */
+  ORT_ACC_PARALLEL = ORT_ACC | ORT_TARGET,  /* Parallel construct */
+  ORT_ACC_KERNELS  = ORT_ACC | ORT_TARGET | 0x80,  /* Kernels construct.  */
+
   /* Dummy OpenMP region, used to disable expansion of
  DECL_VALUE_EXPRs in taskloop pre body.  */
-  ORT_NONE = 64
+  ORT_NONE = 0x100
 };
 
 /* Gimplify hashtable helper.  */
@@ -377,6 +383,12 @@ new_omp_context (enum omp_region_type re
   else
 c->default_kind = OMP_CLAUSE_DEFAULT_UNSPECIFIED;
 
+  c->combined_loop = false;
+  c->distribute = false;
+  c->target_map_scalars_firstprivate = false;
+  c->target_map_pointers_as_0len_arrays = false;
+  c->target_firstprivatize_array_bases = false;
+
   return c;
 }
 
@@ -689,7 +701,8 @@ gimple_add_tmp_var (tree tmp)
 	  struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp;
 	  while (ctx
 		 && (ctx->region_type == ORT_WORKSHARE
-		 || ctx->region_type == ORT_SIMD))
+		 || ctx->region_type == ORT_SIMD
+		 || ctx->region_type == ORT_ACC))
 	ctx = ctx->outer_context;
 	  if (ctx)
 	omp_add_variable (ctx, tmp, GOVD_LOCAL | GOVD_SEEN);
@@ -1804,7 +1817,8 @@ gimplify_var_or_parm_decl (tree *expr_p)
 	  struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp;
 	  while (ctx
 		 && (ctx->region_type == ORT_WORKSHARE
-		 || ctx->region_type == ORT_SIMD))
+		 || ctx->region_type == ORT_SIMD
+		 || ctx->region_type == ORT_ACC))
 	ctx = ctx->outer_context;
 	  if (!ctx && !nonlocal_vlas->add (decl))
 	{
@@ -5579,7 +5593,8 @@ omp_firstprivatize_variable (struct gimp
 	}
   else if (ctx->region_type != ORT_WORKSHARE
 	   && ctx->region_type != ORT_SIMD
-	   && ctx->region_type != ORT_TARGET_DATA)
+	   

Re: [sh] Add flag_unsafe_math_optimizations to sincossf3

2015-11-07 Thread Oleg Endo
On Sat, 2015-11-07 at 13:27 +, Richard Sandiford wrote:
> builtins.c uses the following code to guard expansions involving
> optabs:
> 
> CASE_FLT_FN (BUILT_IN_EXP):
> CASE_FLT_FN (BUILT_IN_EXP10):
> CASE_FLT_FN (BUILT_IN_POW10):
> CASE_FLT_FN (BUILT_IN_EXP2):
> CASE_FLT_FN (BUILT_IN_EXPM1):
> CASE_FLT_FN (BUILT_IN_LOGB):
> CASE_FLT_FN (BUILT_IN_LOG):
> CASE_FLT_FN (BUILT_IN_LOG10):
> CASE_FLT_FN (BUILT_IN_LOG2):
> CASE_FLT_FN (BUILT_IN_LOG1P):
> CASE_FLT_FN (BUILT_IN_TAN):
> CASE_FLT_FN (BUILT_IN_ASIN):
> CASE_FLT_FN (BUILT_IN_ACOS):
> CASE_FLT_FN (BUILT_IN_ATAN):
> CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
>   /* Treat these like sqrt only if unsafe math optimizations are
> allowed,
>because of possible accuracy problems.  */
>   if (! flag_unsafe_math_optimizations)
>   break;
> [...]
> CASE_FLT_FN (BUILT_IN_ILOGB):
>   if (! flag_unsafe_math_optimizations)
>   break;
> [...]
> CASE_FLT_FN (BUILT_IN_ATAN2):
> CASE_FLT_FN (BUILT_IN_LDEXP):
> CASE_FLT_FN (BUILT_IN_SCALB):
> CASE_FLT_FN (BUILT_IN_SCALBN):
> CASE_FLT_FN (BUILT_IN_SCALBLN):
>   if (! flag_unsafe_math_optimizations)
>   break;
> [...]
> CASE_FLT_FN (BUILT_IN_SIN):
> CASE_FLT_FN (BUILT_IN_COS):
>   if (! flag_unsafe_math_optimizations)
>   break;
> [...]
> CASE_FLT_FN (BUILT_IN_SINCOS):
>   if (! flag_unsafe_math_optimizations)
>   break;
> 
> I think it's really up to the optab to decide whether it's safe
> for !flag_unsafe_math_optimizations or not, and AFAICT, all optabs
> but sh.md:sincossf3 already check.  This patch makes the sh pattern
> check too.

In sh.c (sh_option_override) TARGET_FSCA is enabled only when
 flag_unsafe_math_optimizations != 0.  Thus another check in the
patterns is not done.  However, there is PR 67723 and the fix for it
will probably be checking all the necessary flags in the pattern
conditions or something like that.  Thus, please feel free to commit.

Cheers,
Oleg


Extend tree-call-cdce to calls whose result is used

2015-11-07 Thread Richard Sandiford
For -fmath-errno, builtins.c currently expands calls to sqrt to:

y = sqrt_optab (x);
if (y != y)
  [ sqrt (x); or errno = EDOM; ]

The drawbacks of this are:

- the call to sqrt is protected by the result of the optab rather
  than the input.  It would be better to check !(x >= 0), like
  tree-call-cdce.c does.

- the branch isn't exposed at the gimple level and so gets little
  high-level optimisation.

- we do this for log too, but for log a zero input produces
  -inf rather than a NaN, and sets errno to ERANGE rather than EDOM.

This patch moves the code to tree-call-cdce.c instead, with the optab
operation being represented as an internal function.  This means that
we can use the existing argument-based range checks rather than the
result-based checks and that we get more gimple optimisation of
the branch.

Previously the pass was only enabled by default at -O2 or above,
but the old builtins.c code was enabled at -O.  The patch therefore
enables the pass at -O as well.

The previous patch to cfgexpand.c handled cases where functions
don't (or are assumed not to) set errno, so this patch makes
the builtins.c code dead.

Tested on x86_64-linux-gnu, aarch64-linux-gnu, arm-linux-gnueabi
and visium-elf (for the EDOM stuff).  OK to install?

Thanks,
Richard


gcc/
* builtins.c (expand_errno_check, expand_builtin_mathfn)
(expand_builtin_mathfn_2): Delete.
(expand_builtin): Remove handling of functions with
internal function equivalents.
* internal-fn.def (SET_EDOM): New internal function.
* internal-fn.h (set_edom_supported_p): Declare.
* internal-fn.c (expand_SET_EDOM): New function.
(set_edom_supported_p): Likewise.
* tree-call-cdce.c: Include builtins.h and internal-fn.h.
Rewrite comment at head of file.
(is_call_dce_candidate): Rename to...
(can_test_argument_range): ...this.  Don't check gimple_call_lhs
or gimple_call_builtin_p here.
(edom_only_function): New function.
(shrink_wrap_one_built_in_call_with_conds): New function, split out
from...
(shrink_wrap_one_built_in_call): ...here.
(can_use_internal_fn, use_internal_fn): New functions.
(shrink_wrap_conditional_dead_built_in_calls): Call use_internal_fn
for calls that have an lhs.
(pass_call_cdce::gate): Remove optimize_function_for_speed_p check.
(pass_call_cdce::execute): Skip blocks that are optimized for size.
Check gimple_call_builtin_p here.  Use can_use_internal_fn for
calls with an lhs.
* opts.c (default_options_table): Enable -ftree-builtin-call-cdce
at -O and above.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index bbcc7dc3..1c13a51 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -101,9 +101,6 @@ static rtx expand_builtin_apply (rtx, rtx, rtx);
 static void expand_builtin_return (rtx);
 static enum type_class type_to_class (tree);
 static rtx expand_builtin_classify_type (tree);
-static void expand_errno_check (tree, rtx);
-static rtx expand_builtin_mathfn (tree, rtx, rtx);
-static rtx expand_builtin_mathfn_2 (tree, rtx, rtx);
 static rtx expand_builtin_mathfn_3 (tree, rtx, rtx);
 static rtx expand_builtin_mathfn_ternary (tree, rtx, rtx);
 static rtx expand_builtin_interclass_mathfn (tree, rtx);
@@ -1972,286 +1969,6 @@ replacement_internal_fn (gcall *call)
   return IFN_LAST;
 }
 
-/* If errno must be maintained, expand the RTL to check if the result,
-   TARGET, of a built-in function call, EXP, is NaN, and if so set
-   errno to EDOM.  */
-
-static void
-expand_errno_check (tree exp, rtx target)
-{
-  rtx_code_label *lab = gen_label_rtx ();
-
-  /* Test the result; if it is NaN, set errno=EDOM because
- the argument was not in the domain.  */
-  do_compare_rtx_and_jump (target, target, EQ, 0, GET_MODE (target),
-  NULL_RTX, NULL, lab,
-  /* The jump is very likely.  */
-  REG_BR_PROB_BASE - (REG_BR_PROB_BASE / 2000 - 1));
-
-#ifdef TARGET_EDOM
-  /* If this built-in doesn't throw an exception, set errno directly.  */
-  if (TREE_NOTHROW (TREE_OPERAND (CALL_EXPR_FN (exp), 0)))
-{
-#ifdef GEN_ERRNO_RTX
-  rtx errno_rtx = GEN_ERRNO_RTX;
-#else
-  rtx errno_rtx
- = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno"));
-#endif
-  emit_move_insn (errno_rtx,
- gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx)));
-  emit_label (lab);
-  return;
-}
-#endif
-
-  /* Make sure the library call isn't expanded as a tail call.  */
-  CALL_EXPR_TAILCALL (exp) = 0;
-
-  /* We can't set errno=EDOM directly; let the library call do it.
- Pop the arguments right away in case the call gets deleted.  */
-  NO_DEFER_POP;
-  expand_call (exp, target, 0);
-  OK_DEFER_POP;
-  emit_label (lab);
-}
-
-/* Expand a call to one of the builtin math functions (sqrt, exp, or log).
-   Return NULL_RTX if a normal

Short-cut generation of simple built-in functions

2015-11-07 Thread Richard Sandiford
This patch short-circuits the builtins.c expansion code for a particular
gimple call if:

- the function has an associated internal function
- the target implements that internal function
- the call has no side effects

This allows a later patch to remove the builtins.c code, once calls with
side effects have been handled.

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* builtins.h (called_as_built_in): Declare.
* builtins.c (called_as_built_in): Make external.
* internal-fn.h (expand_internal_call): Define a variant that
specifies the internal function explicitly.
* internal-fn.c (expand_load_lanes_optab_fn)
(expand_store_lanes_optab_fn, expand_ANNOTATE, expand_GOMP_SIMD_LANE)
(expand_GOMP_SIMD_VF, expand_GOMP_SIMD_LAST_LANE)
(expand_GOMP_SIMD_ORDERED_START, expand_GOMP_SIMD_ORDERED_END)
(expand_UBSAN_NULL, expand_UBSAN_BOUNDS, expand_UBSAN_VPTR)
(expand_UBSAN_OBJECT_SIZE, expand_ASAN_CHECK, expand_TSAN_FUNC_EXIT)
(expand_UBSAN_CHECK_ADD, expand_UBSAN_CHECK_SUB)
(expand_UBSAN_CHECK_MUL, expand_ADD_OVERFLOW, expand_SUB_OVERFLOW)
(expand_MUL_OVERFLOW, expand_LOOP_VECTORIZED)
(expand_mask_load_optab_fn, expand_mask_store_optab_fn)
(expand_ABNORMAL_DISPATCHER, expand_BUILTIN_EXPECT, expand_VA_ARG)
(expand_UNIQUE, expand_GOACC_DIM_SIZE, expand_GOACC_DIM_POS)
(expand_GOACC_LOOP, expand_GOACC_REDUCTION, expand_direct_optab_fn)
(expand_unary_optab_fn, expand_binary_optab_fn): Add an internal_fn
argument.
(internal_fn_expanders): Update prototype.
(expand_internal_call): Define a variant that specifies the
internal function explicitly. Use it to implement the previous
interface.
* cfgexpand.c (expand_call_stmt): Try to expand calls to built-in
functions as calls to internal functions.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index f65011e..bbcc7dc3 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -222,7 +222,7 @@ is_builtin_fn (tree decl)
of the optimization level.  This means whenever a function is invoked with
its "internal" name, which normally contains the prefix "__builtin".  */
 
-static bool
+bool
 called_as_built_in (tree node)
 {
   /* Note that we must use DECL_NAME, not DECL_ASSEMBLER_NAME_SET_P since
diff --git a/gcc/builtins.h b/gcc/builtins.h
index 917eb90..1d00068 100644
--- a/gcc/builtins.h
+++ b/gcc/builtins.h
@@ -50,6 +50,7 @@ extern struct target_builtins *this_target_builtins;
 extern bool force_folding_builtin_constant_p;
 
 extern bool is_builtin_fn (tree);
+extern bool called_as_built_in (tree);
 extern bool get_object_alignment_1 (tree, unsigned int *,
unsigned HOST_WIDE_INT *);
 extern unsigned int get_object_alignment (tree);
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index bfbc958..dc7d4f5 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2551,10 +2551,25 @@ expand_call_stmt (gcall *stmt)
   return;
 }
 
+  /* If this is a call to a built-in function and it has no effect other
+ than setting the lhs, try to implement it using an internal function
+ instead.  */
+  decl = gimple_call_fndecl (stmt);
+  if (gimple_call_lhs (stmt)
+  && !gimple_vdef (stmt)
+  && (optimize || (decl && called_as_built_in (decl))))
+{
+  internal_fn ifn = replacement_internal_fn (stmt);
+  if (ifn != IFN_LAST)
+   {
+ expand_internal_call (ifn, stmt);
+ return;
+   }
+}
+
   exp = build_vl_exp (CALL_EXPR, gimple_call_num_args (stmt) + 3);
 
   CALL_EXPR_FN (exp) = gimple_call_fn (stmt);
-  decl = gimple_call_fndecl (stmt);
   builtin_p = decl && DECL_BUILT_IN (decl);
 
   /* If this is not a builtin function, the function type through which the
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 9f9f9cf..c03c0fc 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -103,7 +103,7 @@ get_multi_vector_move (tree array_type, convert_optab optab)
 /* Expand LOAD_LANES call STMT using optab OPTAB.  */
 
 static void
-expand_load_lanes_optab_fn (gcall *stmt, convert_optab optab)
+expand_load_lanes_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   struct expand_operand ops[2];
   tree type, lhs, rhs;
@@ -127,7 +127,7 @@ expand_load_lanes_optab_fn (gcall *stmt, convert_optab optab)
 /* Expand STORE_LANES call STMT using optab OPTAB.  */
 
 static void
-expand_store_lanes_optab_fn (gcall *stmt, convert_optab optab)
+expand_store_lanes_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 {
   struct expand_operand ops[2];
   tree type, lhs, rhs;
@@ -149,7 +149,7 @@ expand_store_lanes_optab_fn (gcall *stmt, convert_optab optab)
 }
 
 static void
-expand_ANNOTATE (gcall *)
+expand_ANNOTATE (internal_fn, gcall *)
 {
   gcc_unreachable ();
 }
@@ -157,7 +157,7 @@ expand_ANNOTATE (gcall *)
 /* This should get expanded i

[sh] Add flag_unsafe_math_optimizations to sincossf3

2015-11-07 Thread Richard Sandiford
builtins.c uses the following code to guard expansions involving optabs:

CASE_FLT_FN (BUILT_IN_EXP):
CASE_FLT_FN (BUILT_IN_EXP10):
CASE_FLT_FN (BUILT_IN_POW10):
CASE_FLT_FN (BUILT_IN_EXP2):
CASE_FLT_FN (BUILT_IN_EXPM1):
CASE_FLT_FN (BUILT_IN_LOGB):
CASE_FLT_FN (BUILT_IN_LOG):
CASE_FLT_FN (BUILT_IN_LOG10):
CASE_FLT_FN (BUILT_IN_LOG2):
CASE_FLT_FN (BUILT_IN_LOG1P):
CASE_FLT_FN (BUILT_IN_TAN):
CASE_FLT_FN (BUILT_IN_ASIN):
CASE_FLT_FN (BUILT_IN_ACOS):
CASE_FLT_FN (BUILT_IN_ATAN):
CASE_FLT_FN (BUILT_IN_SIGNIFICAND):
  /* Treat these like sqrt only if unsafe math optimizations are allowed,
 because of possible accuracy problems.  */
  if (! flag_unsafe_math_optimizations)
break;
[...]
CASE_FLT_FN (BUILT_IN_ILOGB):
  if (! flag_unsafe_math_optimizations)
break;
[...]
CASE_FLT_FN (BUILT_IN_ATAN2):
CASE_FLT_FN (BUILT_IN_LDEXP):
CASE_FLT_FN (BUILT_IN_SCALB):
CASE_FLT_FN (BUILT_IN_SCALBN):
CASE_FLT_FN (BUILT_IN_SCALBLN):
  if (! flag_unsafe_math_optimizations)
break;
[...]
CASE_FLT_FN (BUILT_IN_SIN):
CASE_FLT_FN (BUILT_IN_COS):
  if (! flag_unsafe_math_optimizations)
break;
[...]
CASE_FLT_FN (BUILT_IN_SINCOS):
  if (! flag_unsafe_math_optimizations)
break;

I think it's really up to the optab to decide whether it's safe
for !flag_unsafe_math_optimizations or not, and AFAICT, all optabs
but sh.md:sincossf3 already check.  This patch makes the sh pattern
check too.

Tested on sh-elf.  OK to install?

Thanks,
Richard


gcc/
* config/sh/sh.md (sincossf3): Require flag_unsafe_math_optimizations.

diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
index 557a0f0..0c3b9f2 100644
--- a/gcc/config/sh/sh.md
+++ b/gcc/config/sh/sh.md
@@ -13722,7 +13722,7 @@ label:
(unspec:SF [(match_operand:SF 2 "fp_arith_reg_operand")] UNSPEC_FCOSA))
(set (match_operand:SF 1 "nonimmediate_operand")
(unspec:SF [(match_dup 2)] UNSPEC_FSINA))]
-  "TARGET_FPU_ANY && TARGET_FSCA"
+  "TARGET_FPU_ANY && TARGET_FSCA && flag_unsafe_math_optimizations"
 {
   rtx scaled = gen_reg_rtx (SFmode);
   rtx truncated = gen_reg_rtx (SImode);



Replace match.pd DEFINE_MATH_FNs with auto-generated lists

2015-11-07 Thread Richard Sandiford
This patch autogenerates the operator lists for maths functions
like SQRT, adding an additional entry for internal functions.
E.g.:

(define_operator_list SQRT
BUILT_IN_SQRTF
BUILT_IN_SQRT
BUILT_IN_SQRTL
IFN_SQRT)

and:

(define_operator_list CABS
BUILT_IN_CABSF
BUILT_IN_CABS
BUILT_IN_CABSL
null)

(since there's no internal function for CABS).
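
As a rough illustration of the output format only, the following self-contained C sketch builds one such list in a buffer.  It is not the gencfn-macros.c implementation: format_operator_list and append are hypothetical helpers, the four-space indentation is a guess, and the suffix order ("F", "", "L") is taken from the two examples above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Append the concatenation of A, B and C to BUF, tracking the used
   length in *LEN.  Return false if the text did not fit in SIZE.  */
static bool
append (char *buf, size_t size, size_t *len,
        const char *a, const char *b, const char *c)
{
  int n = snprintf (buf + *len, size - *len, "%s%s%s", a, b, c);
  if (n < 0 || (size_t) n >= size - *len)
    return false;
  *len += (size_t) n;
  return true;
}

/* Format an operator list for NAME into BUF.  SUFFIXES is a
   null-terminated list of built-in suffixes; INTERNAL_P says whether
   the group has an internal function.  Groups without one get "null"
   as their final entry.  */
static bool
format_operator_list (char *buf, size_t size, const char *name,
                      bool internal_p, const char *const *suffixes)
{
  size_t len = 0;
  if (!append (buf, size, &len, "(define_operator_list ", name, "\n"))
    return false;
  for (unsigned int i = 0; suffixes[i]; ++i)
    if (!append (buf, size, &len, "    BUILT_IN_", name, suffixes[i])
        || !append (buf, size, &len, "\n", "", ""))
      return false;
  if (internal_p)
    return append (buf, size, &len, "    IFN_", name, ")\n");
  return append (buf, size, &len, "    null)\n", "", "");
}
```

Calling it with name "SQRT" and internal_p true gives a list ending in IFN_SQRT, while "CABS" with internal_p false gives one ending in null, so every list has the same number of entries.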

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* Makefile.in (MOSTLYCLEANFILES): Add cfn-operators.pd.
(generated_files): Likewise.
(s-cfn-operators, cfn-operators.pd): New rules.
(s-match): Depend on cfn-operators.pd.
* gencfn-macros.c: Expand comment to describe -o behavior.
(print_define_operator_list): New function.
(main): Accept -o.  Call print_define_operator_list.
* genmatch.c (main): Add "." to the include path.
* match.pd (DEFINE_MATH_FN): Delete.  Include cfn-operators.pd
instead.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 298bb38..a21aaf5 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1566,7 +1566,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h insn-codes.h \
  tm-preds.h tm-constrs.h checksum-options gimple-match.c generic-match.c \
  tree-check.h min-insn-modes.c insn-modes.c insn-modes.h \
  genrtl.h gt-*.h gtype-*.h gtype-desc.c gtyp-input.list \
- case-cfn-macros.h \
+ case-cfn-macros.h cfn-operators.pd \
  xgcc$(exeext) cpp$(exeext) $(FULL_DRIVER_NAME) \
  $(EXTRA_PROGRAMS) gcc-cross$(exeext) \
  $(SPECS) collect2$(exeext) gcc-ar$(exeext) gcc-nm$(exeext) \
@@ -2252,6 +2252,14 @@ s-case-cfn-macros: build/gencfn-macros$(build_exeext)
$(STAMP) s-case-cfn-macros
 case-cfn-macros.h: s-case-cfn-macros; @true
 
+s-cfn-operators: build/gencfn-macros$(build_exeext)
+   $(RUN_GEN) build/gencfn-macros$(build_exeext) -o \
+ > tmp-cfn-operators.pd
+   $(SHELL) $(srcdir)/../move-if-change tmp-cfn-operators.pd \
+ cfn-operators.pd
+   $(STAMP) s-cfn-operators
+cfn-operators.pd: s-cfn-operators; @true
+
 target-hooks-def.h: s-target-hooks-def-h; @true
 # make sure that when we build info files, the used tm.texi is up to date.
 $(srcdir)/doc/tm.texi: s-tm-texi; @true
@@ -2318,7 +2326,7 @@ s-tm-texi: build/genhooks$(build_exeext) $(srcdir)/doc/tm.texi.in
 gimple-match.c: s-match gimple-match-head.c ; @true
 generic-match.c: s-match generic-match-head.c ; @true
 
-s-match: build/genmatch$(build_exeext) $(srcdir)/match.pd
+s-match: build/genmatch$(build_exeext) $(srcdir)/match.pd cfn-operators.pd
$(RUN_GEN) build/genmatch$(build_exeext) --gimple $(srcdir)/match.pd \
> tmp-gimple-match.c
$(RUN_GEN) build/genmatch$(build_exeext) --generic $(srcdir)/match.pd \
@@ -2439,7 +2447,8 @@ generated_files = config.h tm.h $(TM_P_H) $(TM_H) multilib.h \
$(ALL_GTFILES_H) gtype-desc.c gtype-desc.h gcov-iov.h \
options.h target-hooks-def.h insn-opinit.h \
common/common-target-hooks-def.h pass-instances.def \
-   c-family/c-target-hooks-def.h params.list case-cfn-macros.h
+   c-family/c-target-hooks-def.h params.list case-cfn-macros.h \
+   cfn-operators.pd
 
 #
 # How to compile object files to run on the build machine.
diff --git a/gcc/gencfn-macros.c b/gcc/gencfn-macros.c
index 5ee3af0..401c429 100644
--- a/gcc/gencfn-macros.c
+++ b/gcc/gencfn-macros.c
@@ -40,7 +40,27 @@ along with GCC; see the file COPYING3.  If not see
   case CFN_BUILT_IN_SQRTL:
   case CFN_SQRT:
 
-   The macros for groups with no internal function drop the last line.  */
+   The macros for groups with no internal function drop the last line.
+
+   When run with -o, the generator prints a similar list of
+   define_operator_list directives, for use by match.pd.  Each operator
+   list starts with the built-in functions, in order of ascending type width.
+   This is followed by an entry for the internal function, or "null" if there
+   is no internal function for the group.  For example:
+
+ (define_operator_list SQRT
+BUILT_IN_SQRTF
+BUILT_IN_SQRT
+BUILT_IN_SQRTL
+IFN_SQRT)
+
+   and:
+
+ (define_operator_list CABS
+BUILT_IN_CABSF
+BUILT_IN_CABS
+BUILT_IN_CABSL
+null)  */
 
 #include "bconfig.h"
 #include "system.h"
@@ -89,6 +109,23 @@ print_case_cfn (const char *name, bool internal_p,
   printf ("\n");
 }
 
+/* Print an operator list for all combined functions related to NAME,
+   with the null-terminated list of suffixes in SUFFIXES.  INTERNAL_P
+   says whether CFN_ also exists.  */
+
+static void
+print_define_operator_list (const char *name, bool internal_p,
+   const char *const *suffixes)
+{
+  printf ("(define_operator_list %s\n", name);
+  for (unsigned int i = 0; suffixes[i]; ++i)
+printf ("BUILT_IN_%s%s\n", name, suffixes[i]);
+  if (internal_p)
+printf 

Add null identifiers to genmatch

2015-11-07 Thread Richard Sandiford
This patch adds a null identifier that can never match anything and
can never be generated.  It is only valid in operator lists and fors.
Later patches will add uses of it.

The idea is to allow operator lists for maths functions that have
four entries:

- float built-in
- double built-in
- long double built-in
- internal function

Not all maths functions have an associated internal function,
and for those the final operator will be "null".  Any simplification
that tries to use a null substitution will be skipped.
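
The skipping behaviour can be pictured with a small C sketch.  This is a simplified model rather than genmatch code: usable_substitutions is a hypothetical helper that treats operators as plain strings and drops every "null" entry, whereas the patch only skips a substitution when the null operator is actually used in the match or result.

```c
#include <string.h>

/* Copy the entries of OPS that can be substituted into a pattern to
   OUT, skipping "null" placeholders.  OUT must have room for N_OPS
   entries.  Return the number of entries copied.  */
static unsigned int
usable_substitutions (const char *const *ops, unsigned int n_ops,
                      const char **out)
{
  unsigned int n_out = 0;
  for (unsigned int j = 0; j < n_ops; ++j)
    {
      /* A null operator matches nothing, so no pattern is generated
         for this position.  */
      if (strcmp (ops[j], "null") == 0)
        continue;
      out[n_out++] = ops[j];
    }
  return n_out;
}
```

For a four-entry list whose last entry is "null", such as the CABS case, this yields three substitutions; a list with a real internal function in the last slot yields four.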

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* doc/match-and-simplify.texi: Document the "null" identifier.
* genmatch.c (id_base::NULL_ID): New kind.
(null_id): New variable.
(get_operator): Add a parameter that says whether null identifiers
are allowed.
(contains_id): New function.
(lower_for): Skip substitutions that would have a null_id in
either the match or the result.
(parser::parse_for): Allow the null identifier to be used.
(parser::parse_operator_list): Likewise.
(main): Initialize null_id.

diff --git a/gcc/doc/match-and-simplify.texi b/gcc/doc/match-and-simplify.texi
index c5c2b7e..db6519d 100644
--- a/gcc/doc/match-and-simplify.texi
+++ b/gcc/doc/match-and-simplify.texi
@@ -323,6 +323,11 @@ is the same as
   (POW (abs @@0) (mult @@1 @{ built_real (TREE_TYPE (@@1), dconsthalf); @}
 @end smallexample
 
+@code{for}s and operator lists can include the special identifier
+@code{null} that matches nothing and can never be generated.  This can
+be used to pad an operator list so that it has a standard form,
+even if there isn't a suitable operator for every form.
+
 Another building block are @code{with} expressions in the
 result expression which nest the generated code in a new C block
 followed by its argument:
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index c7ab4a4..cff32b0 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -297,7 +297,7 @@ commutative_ternary_tree_code (enum tree_code code)
 
 struct id_base : nofree_ptr_hash<id_base>
 {
-  enum id_kind { CODE, FN, PREDICATE, USER } kind;
+  enum id_kind { CODE, FN, PREDICATE, USER, NULL_ID } kind;
 
   id_base (id_kind, const char *, int = -1);
 
@@ -324,6 +324,9 @@ id_base::equal (const id_base *op1,
  && strcmp (op1->id, op2->id) == 0);
 }
 
+/* The special id "null", which matches nothing.  */
+static id_base *null_id;
+
 /* Hashtable of known pattern operators.  This is pre-seeded from
all known tree codes and all known builtin function ids.  */
 static hash_table<id_base> *operators;
@@ -479,11 +482,14 @@ operator==(id_base &id, enum tree_code code)
   return false;
 }
 
-/* Lookup the identifier ID.  */
+/* Lookup the identifier ID.  Allow "null" if ALLOW_NULL.  */
 
 id_base *
-get_operator (const char *id)
+get_operator (const char *id, bool allow_null = false)
 {
+  if (allow_null && strcmp (id, "null") == 0)
+return null_id;
+
   id_base tem (id_base::CODE, id);
 
   id_base *op = operators->find_with_hash (&tem, tem.hashval);
@@ -1115,6 +1121,40 @@ lower_cond (simplify *s, vec& simplifiers)
 }
 }
 
+/* Return true if O refers to ID.  */
+
+bool
+contains_id (operand *o, user_id *id)
+{
+  if (capture *c = dyn_cast<capture *> (o))
+return c->what && contains_id (c->what, id);
+
+  if (expr *e = dyn_cast<expr *> (o))
+{
+  if (e->operation == id)
+   return true;
+  for (unsigned i = 0; i < e->ops.length (); ++i)
+   if (contains_id (e->ops[i], id))
+ return true;
+  return false;
+}
+
+  if (with_expr *w = dyn_cast <with_expr *> (o))
+return (contains_id (w->with, id)
+   || contains_id (w->subexpr, id));
+
+  if (if_expr *ife = dyn_cast <if_expr *> (o))
+return (contains_id (ife->cond, id)
+   || contains_id (ife->trueexpr, id)
+   || (ife->falseexpr && contains_id (ife->falseexpr, id)));
+
+  if (c_expr *ce = dyn_cast<c_expr *> (o))
+return ce->capture_ids && ce->capture_ids->get (id->id);
+
+  return false;
+}
+
+
 /* In AST operand O replace operator ID with operator WITH.  */
 
 operand *
@@ -1270,16 +1310,29 @@ lower_for (simplify *sin, vec& simplifiers)
  operand *result_op = s->result;
  vec<std::pair<user_id *, id_base *> > subst;
  subst.create (n_ids);
+ bool skip = false;
  for (unsigned i = 0; i < n_ids; ++i)
{
  user_id *id = ids[i];
  id_base *oper = id->substitutes[j % id->substitutes.length ()];
+ if (oper == null_id
+ && (contains_id (match_op, id)
+ || contains_id (result_op, id)))
+   {
+ skip = true;
+ break;
+   }
  subst.quick_push (std::make_pair (id, oper));
  match_op = replace_id (match_op, id, oper);
  if (result_op
  && !can_delay_subst)
 

Extend mathfn_built_in to handle combined_fn

2015-11-07 Thread Richard Sandiford
This patch extends mathfn_built_in to handle combined_fn, but keeps the
old built_in_function interface around since it's a common case.

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* builtins.h (mathfn_built_in): Add a variant that takes
a combined_fn.
* builtins.c: Include case-cfn-macros.h.
(CASE_MATHFN): Use CASE_CFN_*.
(CASE_MATHFN_REENT): Use CFN_ codes.
(mathfn_built_in_2, mathfn_built_in_1): Replace built_in_function
argument with a combined_fn.
(mathfn_built_in): Add a variant that takes a combined_fn.
(expand_builtin_int_roundingfn_2): Update callers accordingly.
(fold_builtin_sincos, fold_builtin_classify): Likewise.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index c393f7c..f65011e 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-chkp.h"
 #include "rtl-chkp.h"
 #include "internal-fn.h"
+#include "case-cfn-macros.h"
 
 
 struct target_builtins default_target_builtins;
@@ -1751,120 +1752,121 @@ expand_builtin_classify_type (tree exp)
determines which among a set of three builtin math functions is
appropriate for a given type mode.  The `F' and `L' cases are
automatically generated from the `double' case.  */
-#define CASE_MATHFN(BUILT_IN_MATHFN) \
-  case BUILT_IN_MATHFN: case BUILT_IN_MATHFN##F: case BUILT_IN_MATHFN##L: \
-  fcode = BUILT_IN_MATHFN; fcodef = BUILT_IN_MATHFN##F ; \
-  fcodel = BUILT_IN_MATHFN##L ; break;
+#define CASE_MATHFN(MATHFN) \
+  CASE_CFN_##MATHFN: \
+  fcode = BUILT_IN_##MATHFN; fcodef = BUILT_IN_##MATHFN##F ; \
+  fcodel = BUILT_IN_##MATHFN##L ; break;
 /* Similar to above, but appends _R after any F/L suffix.  */
-#define CASE_MATHFN_REENT(BUILT_IN_MATHFN) \
-  case BUILT_IN_MATHFN##_R: case BUILT_IN_MATHFN##F_R: case BUILT_IN_MATHFN##L_R: \
-  fcode = BUILT_IN_MATHFN##_R; fcodef = BUILT_IN_MATHFN##F_R ; \
-  fcodel = BUILT_IN_MATHFN##L_R ; break;
+#define CASE_MATHFN_REENT(MATHFN) \
+  case CFN_BUILT_IN_##MATHFN##_R: \
+  case CFN_BUILT_IN_##MATHFN##F_R: \
+  case CFN_BUILT_IN_##MATHFN##L_R: \
+  fcode = BUILT_IN_##MATHFN##_R; fcodef = BUILT_IN_##MATHFN##F_R ; \
+  fcodel = BUILT_IN_##MATHFN##L_R ; break;
 
 /* Return a function equivalent to FN but operating on floating-point
values of type TYPE, or END_BUILTINS if no such function exists.
-   This is purely an operation on built-in function codes; it does not
-   guarantee that the target actually has an implementation of the
-   function.  */
+   This is purely an operation on function codes; it does not guarantee
+   that the target actually has an implementation of the function.  */
 
 static built_in_function
-mathfn_built_in_2 (tree type, built_in_function fn)
+mathfn_built_in_2 (tree type, combined_fn fn)
 {
   built_in_function fcode, fcodef, fcodel;
 
   switch (fn)
 {
-  CASE_MATHFN (BUILT_IN_ACOS)
-  CASE_MATHFN (BUILT_IN_ACOSH)
-  CASE_MATHFN (BUILT_IN_ASIN)
-  CASE_MATHFN (BUILT_IN_ASINH)
-  CASE_MATHFN (BUILT_IN_ATAN)
-  CASE_MATHFN (BUILT_IN_ATAN2)
-  CASE_MATHFN (BUILT_IN_ATANH)
-  CASE_MATHFN (BUILT_IN_CBRT)
-  CASE_MATHFN (BUILT_IN_CEIL)
-  CASE_MATHFN (BUILT_IN_CEXPI)
-  CASE_MATHFN (BUILT_IN_COPYSIGN)
-  CASE_MATHFN (BUILT_IN_COS)
-  CASE_MATHFN (BUILT_IN_COSH)
-  CASE_MATHFN (BUILT_IN_DREM)
-  CASE_MATHFN (BUILT_IN_ERF)
-  CASE_MATHFN (BUILT_IN_ERFC)
-  CASE_MATHFN (BUILT_IN_EXP)
-  CASE_MATHFN (BUILT_IN_EXP10)
-  CASE_MATHFN (BUILT_IN_EXP2)
-  CASE_MATHFN (BUILT_IN_EXPM1)
-  CASE_MATHFN (BUILT_IN_FABS)
-  CASE_MATHFN (BUILT_IN_FDIM)
-  CASE_MATHFN (BUILT_IN_FLOOR)
-  CASE_MATHFN (BUILT_IN_FMA)
-  CASE_MATHFN (BUILT_IN_FMAX)
-  CASE_MATHFN (BUILT_IN_FMIN)
-  CASE_MATHFN (BUILT_IN_FMOD)
-  CASE_MATHFN (BUILT_IN_FREXP)
-  CASE_MATHFN (BUILT_IN_GAMMA)
-  CASE_MATHFN_REENT (BUILT_IN_GAMMA) /* GAMMA_R */
-  CASE_MATHFN (BUILT_IN_HUGE_VAL)
-  CASE_MATHFN (BUILT_IN_HYPOT)
-  CASE_MATHFN (BUILT_IN_ILOGB)
-  CASE_MATHFN (BUILT_IN_ICEIL)
-  CASE_MATHFN (BUILT_IN_IFLOOR)
-  CASE_MATHFN (BUILT_IN_INF)
-  CASE_MATHFN (BUILT_IN_IRINT)
-  CASE_MATHFN (BUILT_IN_IROUND)
-  CASE_MATHFN (BUILT_IN_ISINF)
-  CASE_MATHFN (BUILT_IN_J0)
-  CASE_MATHFN (BUILT_IN_J1)
-  CASE_MATHFN (BUILT_IN_JN)
-  CASE_MATHFN (BUILT_IN_LCEIL)
-  CASE_MATHFN (BUILT_IN_LDEXP)
-  CASE_MATHFN (BUILT_IN_LFLOOR)
-  CASE_MATHFN (BUILT_IN_LGAMMA)
-  CASE_MATHFN_REENT (BUILT_IN_LGAMMA) /* LGAMMA_R */
-  CASE_MATHFN (BUILT_IN_LLCEIL)
-  CASE_MATHFN (BUILT_IN_LLFLOOR)
-  CASE_MATHFN (BUILT_IN_LLRINT)
-  CASE_MATHFN (BUILT_IN_LLROUND)
-  CASE_MATHFN (BUILT_IN_LOG)
-  CASE_MATHFN (BUILT_IN_LOG10)
-  CASE_MATHFN (BUILT_IN_LOG1P)
-  CASE_MATHFN (BUILT_IN_LOG2)
-  CASE_MATHFN (BUILT_IN_

Use combined_fn in tree-vect-patterns.c

2015-11-07 Thread Richard Sandiford
Another patch to extend uses of built_in_function to combined_fn,
this time in tree-vect-patterns.c.  The old code didn't handle the
long double pow variants, but I think that's because no one had a target
that would benefit rather than because the code would mishandle them.

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* tree-vect-patterns.c: Include case-cfn-macros.h.
(vect_recog_pow_pattern): Use combined_fn instead of built-in codes.

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index d003d33..bab9a4f 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-vectorizer.h"
 #include "dumpfile.h"
 #include "builtins.h"
+#include "case-cfn-macros.h"
 
 /* Pattern recognition functions  */
 static gimple *vect_recog_widen_sum_pattern (vec *, tree *,
@@ -1007,23 +1008,17 @@ vect_recog_pow_pattern (vec *stmts, tree *type_in,
tree *type_out)
 {
   gimple *last_stmt = (*stmts)[0];
-  tree fn, base, exp = NULL;
+  tree base, exp = NULL;
   gimple *stmt;
   tree var;
 
   if (!is_gimple_call (last_stmt) || gimple_call_lhs (last_stmt) == NULL)
 return NULL;
 
-  fn = gimple_call_fndecl (last_stmt);
-  if (fn == NULL_TREE || DECL_BUILT_IN_CLASS (fn) != BUILT_IN_NORMAL)
-   return NULL;
-
-  switch (DECL_FUNCTION_CODE (fn))
+  switch (gimple_call_combined_fn (last_stmt))
 {
-case BUILT_IN_POWIF:
-case BUILT_IN_POWI:
-case BUILT_IN_POWF:
-case BUILT_IN_POW:
+CASE_CFN_POW:
+CASE_CFN_POWI:
   base = gimple_call_arg (last_stmt, 0);
   exp = gimple_call_arg (last_stmt, 1);
   if (TREE_CODE (exp) != REAL_CST



Use combined_fn in tree-ssa-math-opts.c

2015-11-07 Thread Richard Sandiford
Another patch to extend uses of built_in_function to combined_fn, this time
in tree-ssa-math-opts.c.

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* tree-ssa-math-opts.c: Include case-cfn-macros.h.
(execute_cse_sincos_1): Use combined_fn instead of built-in codes.
(pass_cse_sincos::execute): Likewise.

diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 41fcabf..ccfed32 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -110,6 +110,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa.h"
 #include "builtins.h"
 #include "params.h"
+#include "case-cfn-macros.h"
 
 /* This structure represents one basic block that either computes a
division, or is a common dominator for basic block that compute a
@@ -725,22 +726,20 @@ execute_cse_sincos_1 (tree name)
   FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, name)
 {
   if (gimple_code (use_stmt) != GIMPLE_CALL
- || !gimple_call_lhs (use_stmt)
- || !(fndecl = gimple_call_fndecl (use_stmt))
- || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
+ || !gimple_call_lhs (use_stmt))
continue;
 
-  switch (DECL_FUNCTION_CODE (fndecl))
+  switch (gimple_call_combined_fn (use_stmt))
{
-   CASE_FLT_FN (BUILT_IN_COS):
+   CASE_CFN_COS:
  seen_cos |= maybe_record_sincos (&stmts, &top_bb, use_stmt) ? 1 : 0;
  break;
 
-   CASE_FLT_FN (BUILT_IN_SIN):
+   CASE_CFN_SIN:
  seen_sin |= maybe_record_sincos (&stmts, &top_bb, use_stmt) ? 1 : 0;
  break;
 
-   CASE_FLT_FN (BUILT_IN_CEXPI):
+   CASE_CFN_CEXPI:
  seen_cexpi |= maybe_record_sincos (&stmts, &top_bb, use_stmt) ? 1 : 0;
  break;
 
@@ -779,19 +778,18 @@ execute_cse_sincos_1 (tree name)
   for (i = 0; stmts.iterate (i, &use_stmt); ++i)
 {
   tree rhs = NULL;
-  fndecl = gimple_call_fndecl (use_stmt);
 
-  switch (DECL_FUNCTION_CODE (fndecl))
+  switch (gimple_call_combined_fn (use_stmt))
{
-   CASE_FLT_FN (BUILT_IN_COS):
+   CASE_CFN_COS:
  rhs = fold_build1 (REALPART_EXPR, type, res);
  break;
 
-   CASE_FLT_FN (BUILT_IN_SIN):
+   CASE_CFN_SIN:
  rhs = fold_build1 (IMAGPART_EXPR, type, res);
  break;
 
-   CASE_FLT_FN (BUILT_IN_CEXPI):
+   CASE_CFN_CEXPI:
  rhs = res;
  break;
 
@@ -1727,26 +1725,24 @@ pass_cse_sincos::execute (function *fun)
   for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
 {
  gimple *stmt = gsi_stmt (gsi);
- tree fndecl;
 
  /* Only the last stmt in a bb could throw, no need to call
 gimple_purge_dead_eh_edges if we change something in the middle
 of a basic block.  */
  cleanup_eh = false;
 
- if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)
+ if (is_gimple_call (stmt)
  && gimple_call_lhs (stmt))
{
  tree arg, arg0, arg1, result;
  HOST_WIDE_INT n;
  location_t loc;
 
- fndecl = gimple_call_fndecl (stmt);
- switch (DECL_FUNCTION_CODE (fndecl))
+ switch (gimple_call_combined_fn (stmt))
{
-   CASE_FLT_FN (BUILT_IN_COS):
-   CASE_FLT_FN (BUILT_IN_SIN):
-   CASE_FLT_FN (BUILT_IN_CEXPI):
+   CASE_CFN_COS:
+   CASE_CFN_SIN:
+   CASE_CFN_CEXPI:
  /* Make sure we have either sincos or cexp.  */
  if (!targetm.libc_has_function (function_c99_math_complex)
  && !targetm.libc_has_function (function_sincos))
@@ -1757,7 +1753,7 @@ pass_cse_sincos::execute (function *fun)
cfg_changed |= execute_cse_sincos_1 (arg);
  break;
 
-   CASE_FLT_FN (BUILT_IN_POW):
+   CASE_CFN_POW:
  arg0 = gimple_call_arg (stmt, 0);
  arg1 = gimple_call_arg (stmt, 1);
 
@@ -1777,7 +1773,7 @@ pass_cse_sincos::execute (function *fun)
}
  break;
 
-   CASE_FLT_FN (BUILT_IN_POWI):
+   CASE_CFN_POWI:
  arg0 = gimple_call_arg (stmt, 0);
  arg1 = gimple_call_arg (stmt, 1);
  loc = gimple_location (stmt);
@@ -1826,7 +1822,7 @@ pass_cse_sincos::execute (function *fun)
}
  break;
 
-   CASE_FLT_FN (BUILT_IN_CABS):
+   CASE_CFN_CABS:
  arg0 = gimple_call_arg (stmt, 0);
  loc = gimple_location (stmt);
  result = gimple_expand_builtin_cabs (&gsi, loc, arg0);



Use combined_fn in tree-ssa-reassoc.c

2015-11-07 Thread Richard Sandiford
Another patch to extend uses of built_in_function to combined_fn, this time
in tree-ssa-reassoc.c.

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* tree-ssa-reassoc.c: Include case-cfn-macros.h.
(stmt_is_power_of_op): Use combined_fn instead of built-in codes.
(decrement_power, acceptable_pow_call): Likewise.
(attempt_builtin_copysign): Likewise.

diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index a75290c..9394664 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -50,6 +50,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "builtins.h"
 #include "gimplify.h"
+#include "case-cfn-macros.h"
 
 /*  This is a simple global reassociation pass.  It is, in part, based
 on the LLVM pass of the same name (They do some things more/less
@@ -1035,21 +1036,13 @@ oecount_cmp (const void *p1, const void *p2)
 static bool
 stmt_is_power_of_op (gimple *stmt, tree op)
 {
-  tree fndecl;
-
   if (!is_gimple_call (stmt))
 return false;
 
-  fndecl = gimple_call_fndecl (stmt);
-
-  if (!fndecl
-  || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
-return false;
-
-  switch (DECL_FUNCTION_CODE (gimple_call_fndecl (stmt)))
+  switch (gimple_call_combined_fn (stmt))
 {
-CASE_FLT_FN (BUILT_IN_POW):
-CASE_FLT_FN (BUILT_IN_POWI):
+CASE_CFN_POW:
+CASE_CFN_POWI:
   return (operand_equal_p (gimple_call_arg (stmt, 0), op, 0));
   
 default:
@@ -1068,9 +1061,9 @@ decrement_power (gimple *stmt)
   HOST_WIDE_INT power;
   tree arg1;
 
-  switch (DECL_FUNCTION_CODE (gimple_call_fndecl (stmt)))
+  switch (gimple_call_combined_fn (stmt))
 {
-CASE_FLT_FN (BUILT_IN_POW):
+CASE_CFN_POW:
   arg1 = gimple_call_arg (stmt, 1);
   c = TREE_REAL_CST (arg1);
   power = real_to_integer (&c) - 1;
@@ -1078,7 +1071,7 @@ decrement_power (gimple *stmt)
   gimple_call_set_arg (stmt, 1, build_real (TREE_TYPE (arg1), cint));
   return power;
 
-CASE_FLT_FN (BUILT_IN_POWI):
+CASE_CFN_POWI:
   arg1 = gimple_call_arg (stmt, 1);
   power = TREE_INT_CST_LOW (arg1) - 1;
   gimple_call_set_arg (stmt, 1, build_int_cst (TREE_TYPE (arg1), power));
@@ -3937,7 +3930,7 @@ break_up_subtract (gimple *stmt, gimple_stmt_iterator *gsip)
 static bool
 acceptable_pow_call (gimple *stmt, tree *base, HOST_WIDE_INT *exponent)
 {
-  tree fndecl, arg1;
+  tree arg1;
   REAL_VALUE_TYPE c, cint;
 
   if (!first_pass_instance
@@ -3946,15 +3939,9 @@ acceptable_pow_call (gimple *stmt, tree *base, HOST_WIDE_INT *exponent)
   || !has_single_use (gimple_call_lhs (stmt)))
 return false;
 
-  fndecl = gimple_call_fndecl (stmt);
-
-  if (!fndecl
-  || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
-return false;
-
-  switch (DECL_FUNCTION_CODE (fndecl))
+  switch (gimple_call_combined_fn (stmt))
 {
-CASE_FLT_FN (BUILT_IN_POW):
+CASE_CFN_POW:
   if (flag_errno_math)
return false;
 
@@ -3976,7 +3963,7 @@ acceptable_pow_call (gimple *stmt, tree *base, 
HOST_WIDE_INT *exponent)
 
   break;
 
-CASE_FLT_FN (BUILT_IN_POWI):
+CASE_CFN_POWI:
   *base = gimple_call_arg (stmt, 0);
   arg1 = gimple_call_arg (stmt, 1);
 
@@ -4636,35 +4623,40 @@ attempt_builtin_copysign (vec *ops)
  && has_single_use (oe->op))
{
  gimple *def_stmt = SSA_NAME_DEF_STMT (oe->op);
- if (gimple_call_builtin_p (def_stmt, BUILT_IN_NORMAL))
+ if (gcall *old_call = dyn_cast <gcall *> (def_stmt))
{
- tree fndecl = gimple_call_fndecl (def_stmt);
  tree arg0, arg1;
- switch (DECL_FUNCTION_CODE (fndecl))
+ switch (gimple_call_combined_fn (old_call))
{
-   CASE_FLT_FN (BUILT_IN_COPYSIGN):
- arg0 = gimple_call_arg (def_stmt, 0);
- arg1 = gimple_call_arg (def_stmt, 1);
+   CASE_CFN_COPYSIGN:
+ arg0 = gimple_call_arg (old_call, 0);
+ arg1 = gimple_call_arg (old_call, 1);
  /* The first argument of copysign must be a constant,
 otherwise there's nothing to do.  */
  if (TREE_CODE (arg0) == REAL_CST)
{
- tree mul = const_binop (MULT_EXPR, TREE_TYPE (cst),
- cst, arg0);
+ tree type = TREE_TYPE (arg0);
+ tree mul = const_binop (MULT_EXPR, type, cst, arg0);
  /* If we couldn't fold to a single constant, skip it.
 That happens e.g. for inexact multiplication when
 -frounding-math.  */
  if (mul == NULL_TREE)
break;
- /* Instead of adjusting the old DEF_STMT, let's build
-a new call to not leak the LHS and preve

Use combined_fn in tree-vrp.c

2015-11-07 Thread Richard Sandiford
Another patch to extend uses of built_in_function to combined_fn, this time
in tree-vrp.c.
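
As a side note, the result ranges quoted in the comments of
extract_range_basic ([0, prec] for ffs and popcount, [0, 1] for parity,
[0, prec-1] for clrsb) can be spot-checked outside the compiler.  The
sketch below is only an illustration.  It assumes a GCC-compatible
compiler that provides the __builtin_* functions, and the toy_* helpers
are made-up names that are not part of the patch.

```cpp
#include <cassert>
#include <climits>

// Return true if the results of the bit-counting built-ins for X fall
// inside the ranges quoted in the extract_range_basic comments.
// Hypothetical helper, written only for this illustration.
static bool
toy_bitop_ranges_hold (unsigned int x)
{
  const int prec = sizeof (unsigned int) * CHAR_BIT;
  const int ffs = __builtin_ffs ((int) x);
  const int pop = __builtin_popcount (x);
  const int par = __builtin_parity (x);
  const int clrsb = __builtin_clrsb ((int) x);
  return (ffs >= 0 && ffs <= prec          // ffs: [0, prec]
	  && pop >= 0 && pop <= prec       // popcount: [0, prec]
	  && par >= 0 && par <= 1          // parity: [0, 1]
	  && clrsb >= 0 && clrsb <= prec - 1);  // clrsb: [0, prec-1]
}

// Exercise the boundary cases: no bits set, one bit set, all bits set.
static void
toy_check_bitop_ranges ()
{
  assert (toy_bitop_ranges_hold (0u));
  assert (toy_bitop_ranges_hold (1u));
  assert (toy_bitop_ranges_hold (~0u));
}
```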

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* tree-vrp.c: Include case-cfn-macros.h.
(extract_range_basic): Switch on combined_fn rather than handling
built-in functions and internal functions separately.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 87c0265..4b4179d 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-threadedge.h"
 #include "omp-low.h"
 #include "target.h"
+#include "case-cfn-macros.h"
 
 /* Range of values that can be associated with an SSA_NAME after VRP
has executed.  */
@@ -3791,14 +3792,16 @@ extract_range_basic (value_range *vr, gimple *stmt)
   bool sop = false;
   tree type = gimple_expr_type (stmt);
 
-  if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
+  if (is_gimple_call (stmt))
 {
-  tree fndecl = gimple_call_fndecl (stmt), arg;
+  tree arg;
   int mini, maxi, zerov = 0, prec;
+  enum tree_code subcode = ERROR_MARK;
+  combined_fn cfn = gimple_call_combined_fn (stmt);
 
-  switch (DECL_FUNCTION_CODE (fndecl))
+  switch (cfn)
{
-   case BUILT_IN_CONSTANT_P:
+   case CFN_BUILT_IN_CONSTANT_P:
  /* If the call is __builtin_constant_p and the argument is a
 function parameter resolve it to false.  This avoids bogus
 array bound warnings.
@@ -3814,8 +3817,8 @@ extract_range_basic (value_range *vr, gimple *stmt)
  break;
  /* Both __builtin_ffs* and __builtin_popcount return
 [0, prec].  */
-   CASE_INT_FN (BUILT_IN_FFS):
-   CASE_INT_FN (BUILT_IN_POPCOUNT):
+   CASE_CFN_FFS:
+   CASE_CFN_POPCOUNT:
  arg = gimple_call_arg (stmt, 0);
  prec = TYPE_PRECISION (TREE_TYPE (arg));
  mini = 0;
@@ -3843,7 +3846,7 @@ extract_range_basic (value_range *vr, gimple *stmt)
}
  goto bitop_builtin;
  /* __builtin_parity* returns [0, 1].  */
-   CASE_INT_FN (BUILT_IN_PARITY):
+   CASE_CFN_PARITY:
  mini = 0;
  maxi = 1;
  goto bitop_builtin;
@@ -3852,7 +3855,7 @@ extract_range_basic (value_range *vr, gimple *stmt)
 On many targets where the CLZ RTL or optab value is defined
 for 0 the value is prec, so include that in the range
 by default.  */
-   CASE_INT_FN (BUILT_IN_CLZ):
+   CASE_CFN_CLZ:
  arg = gimple_call_arg (stmt, 0);
  prec = TYPE_PRECISION (TREE_TYPE (arg));
  mini = 0;
@@ -3907,7 +3910,7 @@ extract_range_basic (value_range *vr, gimple *stmt)
 If there is a ctz optab for this mode and
 CTZ_DEFINED_VALUE_AT_ZERO, include that in the range,
 otherwise just assume 0 won't be seen.  */
-   CASE_INT_FN (BUILT_IN_CTZ):
+   CASE_CFN_CTZ:
  arg = gimple_call_arg (stmt, 0);
  prec = TYPE_PRECISION (TREE_TYPE (arg));
  mini = 0;
@@ -3956,7 +3959,7 @@ extract_range_basic (value_range *vr, gimple *stmt)
break;
  goto bitop_builtin;
  /* __builtin_clrsb* returns [0, prec-1].  */
-   CASE_INT_FN (BUILT_IN_CLRSB):
+   CASE_CFN_CLRSB:
  arg = gimple_call_arg (stmt, 0);
  prec = TYPE_PRECISION (TREE_TYPE (arg));
  mini = 0;
@@ -3966,33 +3969,22 @@ extract_range_basic (value_range *vr, gimple *stmt)
  set_value_range (vr, VR_RANGE, build_int_cst (type, mini),
   build_int_cst (type, maxi), NULL);
  return;
-   default:
- break;
-   }
-}
-  else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
-{
-  enum tree_code subcode = ERROR_MARK;
-  unsigned ifn_code = gimple_call_internal_fn (stmt);
-
-  switch (ifn_code)
-   {
-   case IFN_UBSAN_CHECK_ADD:
+   case CFN_UBSAN_CHECK_ADD:
  subcode = PLUS_EXPR;
  break;
-   case IFN_UBSAN_CHECK_SUB:
+   case CFN_UBSAN_CHECK_SUB:
  subcode = MINUS_EXPR;
  break;
-   case IFN_UBSAN_CHECK_MUL:
+   case CFN_UBSAN_CHECK_MUL:
  subcode = MULT_EXPR;
  break;
-   case IFN_GOACC_DIM_SIZE:
-   case IFN_GOACC_DIM_POS:
+   case CFN_GOACC_DIM_SIZE:
+   case CFN_GOACC_DIM_POS:
  /* Optimizing these two internal functions helps the loop
 optimizer eliminate outer comparisons.  Size is [1,N]
 and pos is [0,N-1].  */
  {
-   bool is_pos = ifn_code == IFN_GOACC_DIM_POS;
+   bool is_pos = cfn == CFN_GOACC_DIM_POS;
int axis = get_oacc_ifn_dim_arg (stmt);
int size = get_oacc_fn_dim_size (current_function_decl, axis);
 



Make more use of combined_fn

2015-11-07 Thread Richard Sandiford
This patch generalises fold-const.[hc] routines to use combined_fn
instead of built_in_function.  It also updates gimple-ssa-backprop.c
since the update is simple and it avoids churn on the call to
negate_mathfn_p.
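
For context, negate_mathfn_p tests whether a function is odd, i.e.
-f(x) == f(-x), and it only accepts the rint family when
!flag_rounding_math.  The standalone sketch below tries to show why that
guard matters.  It is an illustration rather than compiler code:
rint_is_odd_at is a hypothetical helper, and it assumes a build without
options such as -ffast-math that would let the compiler ignore the
dynamic rounding mode.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Return true if rint (-X) == -rint (X) when both sides are evaluated
// under rounding mode MODE (one of the FE_* constants).
static bool
rint_is_odd_at (double x, int mode)
{
  const int saved = std::fegetround ();
  const int status = std::fesetround (mode);
  assert (status == 0);
  (void) status;

  // The volatile accesses are meant to stop the compiler from folding
  // the calls or moving them across the rounding-mode changes.
  volatile double input = x;
  volatile double lhs = std::rint (-input);
  volatile double rhs = -std::rint (input);

  std::fesetround (saved);
  return lhs == rhs;
}
```

With x = 0.5 the identity holds under FE_TONEAREST, where both sides are
-0.0.  Under FE_UPWARD, rint (-0.5) is -0.0 while -rint (0.5) is -1.0,
so the function is no longer odd.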

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard

[I've attached a -b form of the patch too since it's easier to read.]


gcc/
* fold-const.h (negate_mathfn_p): Take a combined_fn rather
than a built_in_function.
(tree_call_nonnegative_warnv_p): Take a combined_fn rather than
a function decl.
(integer_valued_real_call_p): Likewise.
* fold-const.c: Include case-cfn-macros.h.
(negate_mathfn_p): Take a combined_fn rather than a built_in_function.
(negate_expr_p): Update accordingly.
(tree_call_nonnegative_warnv_p): Take a combined_fn rather than
a function decl.
(integer_valued_real_call_p): Likewise.
(tree_invalid_nonnegative_warnv_p): Update accordingly.
(integer_valued_real_p): Likewise.
* gimple-fold.c (gimple_call_nonnegative_warnv_p): Update call
to tree_call_nonnegative_warnv_p.
(gimple_call_integer_valued_real_p): Likewise
integer_valued_real_call_p.
* gimple-ssa-backprop.c: Include case-cfn-macros.h.
(backprop::process_builtin_call_use): Extend to combined_fn.
(strip_sign_op_1): Likewise.
(backprop::process_use): Don't check for built-in calls here.
(backprop::execute): Likewise.
(backprop::optimize_builtin_call): Update call to negate_mathfn_p.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index ae28445..a7085ef 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -73,6 +73,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "tree-into-ssa.h"
 #include "md5.h"
+#include "case-cfn-macros.h"
 
 #ifndef LOAD_EXTEND_OP
 #define LOAD_EXTEND_OP(M) UNKNOWN
@@ -313,39 +314,39 @@ fold_overflow_warning (const char* gmsgid, enum warn_strict_overflow_code wc)
is odd, i.e. -f(x) == f(-x).  */
 
 bool
-negate_mathfn_p (enum built_in_function code)
-{
-  switch (code)
-{
-CASE_FLT_FN (BUILT_IN_ASIN):
-CASE_FLT_FN (BUILT_IN_ASINH):
-CASE_FLT_FN (BUILT_IN_ATAN):
-CASE_FLT_FN (BUILT_IN_ATANH):
-CASE_FLT_FN (BUILT_IN_CASIN):
-CASE_FLT_FN (BUILT_IN_CASINH):
-CASE_FLT_FN (BUILT_IN_CATAN):
-CASE_FLT_FN (BUILT_IN_CATANH):
-CASE_FLT_FN (BUILT_IN_CBRT):
-CASE_FLT_FN (BUILT_IN_CPROJ):
-CASE_FLT_FN (BUILT_IN_CSIN):
-CASE_FLT_FN (BUILT_IN_CSINH):
-CASE_FLT_FN (BUILT_IN_CTAN):
-CASE_FLT_FN (BUILT_IN_CTANH):
-CASE_FLT_FN (BUILT_IN_ERF):
-CASE_FLT_FN (BUILT_IN_LLROUND):
-CASE_FLT_FN (BUILT_IN_LROUND):
-CASE_FLT_FN (BUILT_IN_ROUND):
-CASE_FLT_FN (BUILT_IN_SIN):
-CASE_FLT_FN (BUILT_IN_SINH):
-CASE_FLT_FN (BUILT_IN_TAN):
-CASE_FLT_FN (BUILT_IN_TANH):
-CASE_FLT_FN (BUILT_IN_TRUNC):
+negate_mathfn_p (combined_fn fn)
+{
+  switch (fn)
+{
+CASE_CFN_ASIN:
+CASE_CFN_ASINH:
+CASE_CFN_ATAN:
+CASE_CFN_ATANH:
+CASE_CFN_CASIN:
+CASE_CFN_CASINH:
+CASE_CFN_CATAN:
+CASE_CFN_CATANH:
+CASE_CFN_CBRT:
+CASE_CFN_CPROJ:
+CASE_CFN_CSIN:
+CASE_CFN_CSINH:
+CASE_CFN_CTAN:
+CASE_CFN_CTANH:
+CASE_CFN_ERF:
+CASE_CFN_LLROUND:
+CASE_CFN_LROUND:
+CASE_CFN_ROUND:
+CASE_CFN_SIN:
+CASE_CFN_SINH:
+CASE_CFN_TAN:
+CASE_CFN_TANH:
+CASE_CFN_TRUNC:
   return true;
 
-CASE_FLT_FN (BUILT_IN_LLRINT):
-CASE_FLT_FN (BUILT_IN_LRINT):
-CASE_FLT_FN (BUILT_IN_NEARBYINT):
-CASE_FLT_FN (BUILT_IN_RINT):
+CASE_CFN_LLRINT:
+CASE_CFN_LRINT:
+CASE_CFN_NEARBYINT:
+CASE_CFN_RINT:
   return !flag_rounding_math;
 
 default:
@@ -506,7 +507,7 @@ negate_expr_p (tree t)
 
 case CALL_EXPR:
   /* Negate -f(x) as f(-x).  */
-  if (negate_mathfn_p (builtin_mathfn_code (t)))
+  if (negate_mathfn_p (get_call_combined_fn (t)))
 	return negate_expr_p (CALL_EXPR_ARG (t, 0));
   break;
 
@@ -693,7 +694,7 @@ fold_negate_expr (location_t loc, tree t)
 
 case CALL_EXPR:
   /* Negate -f(x) as f(-x).  */
-  if (negate_mathfn_p (builtin_mathfn_code (t))
+  if (negate_mathfn_p (get_call_combined_fn (t))
 	  && negate_expr_p (CALL_EXPR_ARG (t, 0)))
 	{
 	  tree fndecl, arg;
@@ -12905,121 +12906,120 @@ tree_single_nonnegative_warnv_p (tree t, bool *strict_overflow_p, int depth)
*STRICT_OVERFLOW_P.  DEPTH is the current nesting depth of the query.  */
 
 bool
-tree_call_nonnegative_warnv_p (tree type, tree fndecl, tree arg0, tree arg1,
+tree_call_nonnegative_warnv_p (tree type, combined_fn fn, tree arg0, tree arg1,
 			   bool *strict_overflow_p, int depth)
 {
-  if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
-switch (DECL_FUNCTION_CODE (fndecl))
-  {
-	CASE_FLT_FN (BUILT_IN_ACOS):
-	CASE_FLT_FN (BUILT_IN_ACOSH):
-	CASE_FLT_FN (

Extend fold_const_call to combined_fn

2015-11-07 Thread Richard Sandiford
This patch extends fold_const_call so that it can handle internal
as well as built-in functions.
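
The scalar folders touched by this patch only fold a call when the
constant argument lies inside the function's domain (see the
real_compare guards in the fold_const_call_ss hunk).  A rough standalone
model of that pattern is sketched here.  It uses <cmath> in place of
MPFR and real_value, and every toy_* name is invented for the example.

```cpp
#include <cassert>
#include <cmath>

// Toy function codes; the real code switches on combined_fn.
enum toy_fn { TOY_SQRT, TOY_ASIN, TOY_ACOSH, TOY_SIN };

// Try to evaluate FN (ARG) at "compile time".  Store the value in
// *RESULT and return true on success; return false if ARG is outside
// the domain of FN, in which case the call would be left alone.
static bool
toy_fold_const_call_ss (double *result, toy_fn fn, double arg)
{
  assert (result);
  switch (fn)
    {
    case TOY_SQRT:
      if (!(arg >= 0))
	return false;
      *result = std::sqrt (arg);
      return true;

    case TOY_ASIN:
      if (!(arg >= -1 && arg <= 1))
	return false;
      *result = std::asin (arg);
      return true;

    case TOY_ACOSH:
      if (!(arg >= 1))
	return false;
      *result = std::acosh (arg);
      return true;

    case TOY_SIN:
      // Defined for every finite argument, so no guard is needed here.
      *result = std::sin (arg);
      return true;
    }
  return false;
}
```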

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* fold-const-call.h (fold_const_call): Replace built_in_function
arguments with combined_fn arguments.
* fold-const-call.c: Include case-cfn-macros.h.
(fold_const_call_ss, fold_const_call_cs, fold_const_call_sc)
(fold_const_call_cc, fold_const_call_sss, fold_const_call_ccc)
(fold_const_call_, fold_const_call_1, fold_const_call): Replace
built_in_function arguments with combined_fn arguments.
* builtins.c (fold_builtin_sincos, fold_builtin_1, fold_builtin_2)
(fold_builtin_3): Update calls to fold_const_call.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index edf0086..c393f7c 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7348,7 +7348,7 @@ fold_builtin_sincos (location_t loc,
   if (TREE_CODE (arg0) == REAL_CST)
 {
   tree complex_type = build_complex_type (type);
-  call = fold_const_call (fn, complex_type, arg0);
+  call = fold_const_call (as_combined_fn (fn), complex_type, arg0);
 }
   if (!call)
 {
@@ -8193,7 +8193,7 @@ fold_builtin_1 (location_t loc, tree fndecl, tree arg0)
   if (TREE_CODE (arg0) == ERROR_MARK)
 return NULL_TREE;
 
-  if (tree ret = fold_const_call (fcode, type, arg0))
+  if (tree ret = fold_const_call (as_combined_fn (fcode), type, arg0))
 return ret;
 
   switch (fcode)
@@ -8320,7 +8320,7 @@ fold_builtin_2 (location_t loc, tree fndecl, tree arg0, 
tree arg1)
   || TREE_CODE (arg1) == ERROR_MARK)
 return NULL_TREE;
 
-  if (tree ret = fold_const_call (fcode, type, arg0, arg1))
+  if (tree ret = fold_const_call (as_combined_fn (fcode), type, arg0, arg1))
 return ret;
 
   switch (fcode)
@@ -8419,7 +8419,8 @@ fold_builtin_3 (location_t loc, tree fndecl,
   || TREE_CODE (arg2) == ERROR_MARK)
 return NULL_TREE;
 
-  if (tree ret = fold_const_call (fcode, type, arg0, arg1, arg2))
+  if (tree ret = fold_const_call (as_combined_fn (fcode), type,
+ arg0, arg1, arg2))
 return ret;
 
   switch (fcode)
diff --git a/gcc/fold-const-call.c b/gcc/fold-const-call.c
index 49793a5..94801d2 100644
--- a/gcc/fold-const-call.c
+++ b/gcc/fold-const-call.c
@@ -26,6 +26,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "options.h"
 #include "fold-const.h"
 #include "fold-const-call.h"
+#include "case-cfn-macros.h"
 #include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO.  */
 
 /* Functions that test for certain constant types, abstracting away the
@@ -574,114 +575,114 @@ fold_const_builtin_nan (tree type, tree arg, bool quiet)
in format FORMAT.  Return true on success.  */
 
 static bool
-fold_const_call_ss (real_value *result, built_in_function fn,
+fold_const_call_ss (real_value *result, combined_fn fn,
const real_value *arg, const real_format *format)
 {
   switch (fn)
 {
-CASE_FLT_FN (BUILT_IN_SQRT):
+CASE_CFN_SQRT:
   return (real_compare (GE_EXPR, arg, &dconst0)
  && do_mpfr_arg1 (result, mpfr_sqrt, arg, format));
 
-CASE_FLT_FN (BUILT_IN_CBRT):
+CASE_CFN_CBRT:
   return do_mpfr_arg1 (result, mpfr_cbrt, arg, format);
 
-CASE_FLT_FN (BUILT_IN_ASIN):
+CASE_CFN_ASIN:
   return (real_compare (GE_EXPR, arg, &dconstm1)
  && real_compare (LE_EXPR, arg, &dconst1)
  && do_mpfr_arg1 (result, mpfr_asin, arg, format));
 
-CASE_FLT_FN (BUILT_IN_ACOS):
+CASE_CFN_ACOS:
   return (real_compare (GE_EXPR, arg, &dconstm1)
  && real_compare (LE_EXPR, arg, &dconst1)
  && do_mpfr_arg1 (result, mpfr_acos, arg, format));
 
-CASE_FLT_FN (BUILT_IN_ATAN):
+CASE_CFN_ATAN:
   return do_mpfr_arg1 (result, mpfr_atan, arg, format);
 
-CASE_FLT_FN (BUILT_IN_ASINH):
+CASE_CFN_ASINH:
   return do_mpfr_arg1 (result, mpfr_asinh, arg, format);
 
-CASE_FLT_FN (BUILT_IN_ACOSH):
+CASE_CFN_ACOSH:
   return (real_compare (GE_EXPR, arg, &dconst1)
  && do_mpfr_arg1 (result, mpfr_acosh, arg, format));
 
-CASE_FLT_FN (BUILT_IN_ATANH):
+CASE_CFN_ATANH:
   return (real_compare (GE_EXPR, arg, &dconstm1)
  && real_compare (LE_EXPR, arg, &dconst1)
  && do_mpfr_arg1 (result, mpfr_atanh, arg, format));
 
-CASE_FLT_FN (BUILT_IN_SIN):
+CASE_CFN_SIN:
   return do_mpfr_arg1 (result, mpfr_sin, arg, format);
 
-CASE_FLT_FN (BUILT_IN_COS):
+CASE_CFN_COS:
   return do_mpfr_arg1 (result, mpfr_cos, arg, format);
 
-CASE_FLT_FN (BUILT_IN_TAN):
+CASE_CFN_TAN:
   return do_mpfr_arg1 (result, mpfr_tan, arg, format);
 
-CASE_FLT_FN (BUILT_IN_SINH):
+CASE_CFN_SINH:
   return do_mpfr_arg1 (result, mpfr_sinh, arg, format);
 
-CASE_FLT_FN (BUILT_IN_COSH):
+CASE_CFN_COSH:
   return do_mpfr_arg1 (result, mpfr_cosh, arg, form

Add gencfn-macros.c

2015-11-07 Thread Richard Sandiford
This patch automatically generates case macros such as:

CASE_CFN_SQRT

for each {F,,L} floating-point built-in function and each {,L,LL,IMAX}
integer built-in function.  The macros match the same built-in
functions as CASE_FLT_FN and CASE_INT_FN but in addition include
the associated internal function, if any.

The idea is to make sure that users of combined_fn don't need to know
which built-in functions have internal-function equivalents.  If we add
a new function to internal-fn.def, all combined_fn users should pick it
up automatically.
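
To make the idea concrete, here is a hand-written sketch of what one
such macro could look like and how a switch would use it.  This is only
an assumed shape, not the generator's actual output: the TOY_* enum and
macros are invented, and in this toy only SQRT has an internal-function
equivalent.

```cpp
#include <cassert>

// Toy combined enum: three built-in variants per function, plus an
// internal function for SQRT only.
enum toy_combined_fn
{
  TOY_CFN_BUILT_IN_SQRTF,
  TOY_CFN_BUILT_IN_SQRT,
  TOY_CFN_BUILT_IN_SQRTL,
  TOY_CFN_BUILT_IN_CBRTF,
  TOY_CFN_BUILT_IN_CBRT,
  TOY_CFN_BUILT_IN_CBRTL,
  TOY_CFN_BUILT_IN_PRINTF,
  TOY_CFN_SQRT
};

// The {F,,L} built-ins plus the associated internal function.
#define TOY_CASE_CFN_SQRT \
  case TOY_CFN_BUILT_IN_SQRTF: \
  case TOY_CFN_BUILT_IN_SQRT: \
  case TOY_CFN_BUILT_IN_SQRTL: \
  case TOY_CFN_SQRT

// No internal function here, so only the built-ins are listed.
#define TOY_CASE_CFN_CBRT \
  case TOY_CFN_BUILT_IN_CBRTF: \
  case TOY_CFN_BUILT_IN_CBRT: \
  case TOY_CFN_BUILT_IN_CBRTL

// A user of the macros does not need to know which functions have
// internal equivalents.
static bool
toy_is_root_fn (toy_combined_fn fn)
{
  switch (fn)
    {
    TOY_CASE_CFN_SQRT:
    TOY_CASE_CFN_CBRT:
      return true;

    default:
      return false;
    }
}

// Small self-check of the switch.
static void
toy_check_root_fns ()
{
  assert (toy_is_root_fn (TOY_CFN_SQRT));
  assert (!toy_is_root_fn (TOY_CFN_BUILT_IN_PRINTF));
}
```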

The generator wants to use "hash_set ",
so the patch follows hash_map in using the types given by the
traits as the key.  This is a no-op for current users of hash_set.

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* Makefile.in (HASH_TABLE_H): Add GGC_H.
(MOSTLYCLEANFILES, generated_files): Add case-cfn-macros.h.
(s-case-cfn-macros, case-cfn-macros.h, build/gencfn-macros.o)
(build/gencfn-macros$(build_exeext)): New rules.
(genprogerr): Add cfn-macros.
* hash-set.h (hash_set): Use the traits value_type as the key.
* gencfn-macros.c: New file.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 34d2356..298bb38 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -876,7 +876,7 @@ endif
 # Shorthand variables for dependency lists.
 DUMPFILE_H = $(srcdir)/../libcpp/include/line-map.h dumpfile.h
 VEC_H = vec.h statistics.h $(GGC_H)
-HASH_TABLE_H = $(HASHTAB_H) hash-table.h
+HASH_TABLE_H = $(HASHTAB_H) hash-table.h $(GGC_H)
 EXCEPT_H = except.h $(HASHTAB_H)
 TARGET_DEF = target.def target-hooks-macros.h target-insns.def
 C_TARGET_DEF = c-family/c-target.def target-hooks-macros.h
@@ -1566,6 +1566,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h 
insn-codes.h \
  tm-preds.h tm-constrs.h checksum-options gimple-match.c generic-match.c \
  tree-check.h min-insn-modes.c insn-modes.c insn-modes.h \
  genrtl.h gt-*.h gtype-*.h gtype-desc.c gtyp-input.list \
+ case-cfn-macros.h \
  xgcc$(exeext) cpp$(exeext) $(FULL_DRIVER_NAME) \
  $(EXTRA_PROGRAMS) gcc-cross$(exeext) \
  $(SPECS) collect2$(exeext) gcc-ar$(exeext) gcc-nm$(exeext) \
@@ -2243,6 +2244,14 @@ s-constrs-h: $(MD_DEPS) build/genpreds$(build_exeext)
$(SHELL) $(srcdir)/../move-if-change tmp-constrs.h tm-constrs.h
$(STAMP) s-constrs-h
 
+s-case-cfn-macros: build/gencfn-macros$(build_exeext)
+   $(RUN_GEN) build/gencfn-macros$(build_exeext) -c \
+ > tmp-case-cfn-macros.h
+   $(SHELL) $(srcdir)/../move-if-change tmp-case-cfn-macros.h \
+ case-cfn-macros.h
+   $(STAMP) s-case-cfn-macros
+case-cfn-macros.h: s-case-cfn-macros; @true
+
 target-hooks-def.h: s-target-hooks-def-h; @true
 # make sure that when we build info files, the used tm.texi is up to date.
 $(srcdir)/doc/tm.texi: s-tm-texi; @true
@@ -2430,7 +2439,7 @@ generated_files = config.h tm.h $(TM_P_H) $(TM_H) 
multilib.h \
$(ALL_GTFILES_H) gtype-desc.c gtype-desc.h gcov-iov.h \
options.h target-hooks-def.h insn-opinit.h \
common/common-target-hooks-def.h pass-instances.def \
-   c-family/c-target-hooks-def.h params.list
+   c-family/c-target-hooks-def.h params.list case-cfn-macros.h
 
 #
 # How to compile object files to run on the build machine.
@@ -2577,6 +2586,8 @@ build/genmddump.o : genmddump.c $(RTL_BASE_H) 
$(BCONFIG_H) $(SYSTEM_H)\
 build/genmatch.o : genmatch.c $(BCONFIG_H) $(SYSTEM_H) \
   coretypes.h errors.h $(HASH_TABLE_H) hash-map.h $(GGC_H) is-a.h \
   tree.def builtins.def
+build/gencfn-macros.o : gencfn-macros.c $(BCONFIG_H) $(SYSTEM_H)   \
+  coretypes.h errors.h $(HASH_TABLE_H) hash-set.h builtins.def internal-fn.def
 
 # Compile the programs that generate insn-* from the machine description.
 # They are compiled with $(COMPILER_FOR_BUILD), and associated libraries,
@@ -2593,7 +2604,7 @@ genprogmd = $(genprogrtl) mddeps constants enums
 $(genprogmd:%=build/gen%$(build_exeext)): $(BUILD_MD)
 
 # All these programs need to report errors.
-genprogerr = $(genprogmd) genrtl modes gtype hooks
+genprogerr = $(genprogmd) genrtl modes gtype hooks cfn-macros
 $(genprogerr:%=build/gen%$(build_exeext)): $(BUILD_ERRORS)
 
 # Remaining build programs.
@@ -2603,6 +2614,7 @@ genprog = $(genprogerr) check checksum condmd match
 build/genautomata$(build_exeext) : BUILD_LIBS += -lm
 
 build/genrecog$(build_exeext) : build/hash-table.o build/inchash.o
+build/gencfn-macros$(build_exeext) : build/hash-table.o build/ggc-none.o
 
 # For stage1 and when cross-compiling use the build libcpp which is
 # built with NLS disabled.  For stage2+ use the host library and
diff --git a/gcc/gencfn-macros.c b/gcc/gencfn-macros.c
new file mode 100644
index 000..5ee3af0
--- /dev/null
+++ b/gcc/gencfn-macros.c
@@ -0,0 +1,176 @@
+/* Generate macros based on the combined_fn enum.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software

Add internal bitcount functions

2015-11-07 Thread Richard Sandiford
This patch adds internal function equivalents of all the INT_FN functions.
Unlike the math functions, these functions never set errno and the internal
functions should be exactly equivalent to the built-in ones.  The reason
for defining the internal functions is so that we can extend the
functionality to other modes, in particular vector modes.
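
The internal-fn.def change relies on the usual X-macro fallback: a user
that does not care about the INT_FN distinction leaves
DEF_INTERNAL_INT_FN undefined and gets the DEF_INTERNAL_OPTAB_FN
behaviour.  The sketch below models that with a list macro instead of an
included .def file so that it is self-contained.  All TOY_* names are
invented, and the fallback is written out by hand where the real .def
file uses #ifndef.

```cpp
#include <cassert>

// Stand-in for internal-fn.def.
#define TOY_INTERNAL_FNS \
  TOY_DEF_OPTAB_FN (MASK_LOAD) \
  TOY_DEF_INT_FN (CLZ) \
  TOY_DEF_INT_FN (POPCOUNT)

// User 1: builds the enum and treats INT_FN entries like any other
// optab function (the fallback case).
#define TOY_DEF_OPTAB_FN(NAME) TOY_IFN_##NAME,
#define TOY_DEF_INT_FN(NAME) TOY_DEF_OPTAB_FN (NAME)
enum toy_ifn { TOY_INTERNAL_FNS TOY_IFN_LAST };
#undef TOY_DEF_INT_FN
#undef TOY_DEF_OPTAB_FN

// User 2: overrides the INT_FN macro to record which entries extend
// the integer built-ins.
#define TOY_DEF_OPTAB_FN(NAME) false,
#define TOY_DEF_INT_FN(NAME) true,
static const bool toy_int_fn_array[TOY_IFN_LAST + 1] = {
  TOY_INTERNAL_FNS
  false
};
#undef TOY_DEF_INT_FN
#undef TOY_DEF_OPTAB_FN

// Return true if FN was declared with TOY_DEF_INT_FN.
static bool
toy_int_fn_p (toy_ifn fn)
{
  assert (fn <= TOY_IFN_LAST);
  return toy_int_fn_array[fn];
}
```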

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* internal-fn.def (DEF_INTERNAL_INT_FN): New macro.
(CLRSB, CLZ, CTZ, FFS, PARITY, POPCOUNT): New functions.
* builtins.c (associated_internal_fn): Handle them.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 0a9b185..edf0086 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1916,6 +1916,8 @@ associated_internal_fn (tree fndecl)
 {
 #define DEF_INTERNAL_FLT_FN(NAME, FLAGS, OPTAB, TYPE) \
 CASE_FLT_FN (BUILT_IN_##NAME): return IFN_##NAME;
+#define DEF_INTERNAL_INT_FN(NAME, FLAGS, OPTAB, TYPE) \
+CASE_INT_FN (BUILT_IN_##NAME): return IFN_##NAME;
 #include "internal-fn.def"
 
 CASE_FLT_FN (BUILT_IN_POW10):
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 65e158e..bf8047a 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
  DEF_INTERNAL_FN (NAME, FLAGS, FNSPEC)
  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
  DEF_INTERNAL_FLT_FN (NAME, FLAGS, OPTAB, TYPE)
+ DEF_INTERNAL_INT_FN (NAME, FLAGS, OPTAB, TYPE)
 
where NAME is the name of the function, FLAGS is a set of
ECF_* flags and FNSPEC is a string describing functions fnspec.
@@ -53,6 +54,10 @@ along with GCC; see the file COPYING3.  If not see
function BUILT_IN_{F,,L}.  Unlike some built-in functions,
these internal functions never set errno.
 
+   DEF_INTERNAL_INT_FN is like DEF_INTERNAL_OPTAB_FN, but in addition
+   says that the function extends the C-level BUILT_IN_{,L,LL,IMAX}
+   group of functions to any integral mode (including vector modes).
+
Each entry must have a corresponding expander of the form:
 
  void expand_NAME (gimple_call stmt)
@@ -75,6 +80,11 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_INT_FN
+#define DEF_INTERNAL_INT_FN(NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 
@@ -119,6 +129,14 @@ DEF_INTERNAL_FLT_FN (SCALB, ECF_CONST, scalb, binary)
 /* FP scales.  */
 DEF_INTERNAL_FLT_FN (LDEXP, ECF_CONST, ldexp, binary)
 
+/* Unary integer ops.  */
+DEF_INTERNAL_INT_FN (CLRSB, ECF_CONST, clrsb, unary)
+DEF_INTERNAL_INT_FN (CLZ, ECF_CONST, clz, unary)
+DEF_INTERNAL_INT_FN (CTZ, ECF_CONST, ctz, unary)
+DEF_INTERNAL_INT_FN (FFS, ECF_CONST, ffs, unary)
+DEF_INTERNAL_INT_FN (PARITY, ECF_CONST, parity, unary)
+DEF_INTERNAL_INT_FN (POPCOUNT, ECF_CONST, popcount, unary)
+
 DEF_INTERNAL_FN (GOMP_SIMD_LANE, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_VF, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_LAST_LANE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
@@ -163,6 +181,7 @@ DEF_INTERNAL_FN (GOACC_LOOP, ECF_PURE | ECF_NOTHROW, NULL)
 /* OpenACC reduction abstraction.  See internal-fn.h  for usage.  */
 DEF_INTERNAL_FN (GOACC_REDUCTION, ECF_NOTHROW | ECF_LEAF, NULL)
 
+#undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
 #undef DEF_INTERNAL_OPTAB_FN
 #undef DEF_INTERNAL_FN



Add internal math functions

2015-11-07 Thread Richard Sandiford
This patch adds internal functions for simple FLT_FN built-in functions,
in cases where an associated optab already exists.  Unlike some of the
built-in functions, these internal functions never set errno.

LDEXP is the odd one out in that its second operand is an integer.
All the others operate on uniform types.

The patch also adds a function to query the internal function associated
with a built-in function (if any), and another to test whether a given
gcall could be replaced by a call to an internal function on the current
target (as long as the caller deals with errno appropriately).
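
The split between the two queries can be modelled roughly as follows.
This is a toy sketch under stated assumptions, not the patch's code: the
toy_* types are invented, a std::set stands in for the target's optab
support, and errno handling is left to the caller as described above.

```cpp
#include <cassert>
#include <set>

// Toy stand-ins; the real enums come from builtins.def and
// internal-fn.def.
enum toy_builtin
{
  TOY_BUILT_IN_SQRTF,
  TOY_BUILT_IN_SQRT,
  TOY_BUILT_IN_POW10,
  TOY_BUILT_IN_PRINTF
};
enum toy_ifn { TOY_IFN_SQRT, TOY_IFN_EXP10, TOY_IFN_LAST };

// A hypothetical, much simplified function declaration.
struct toy_decl
{
  bool normal_builtin_p;
  toy_builtin code;
};

// Table lookup only: says nothing about what the target can do.
static toy_ifn
toy_associated_internal_fn (const toy_decl &decl)
{
  assert (decl.normal_builtin_p);
  switch (decl.code)
    {
    case TOY_BUILT_IN_SQRTF:
    case TOY_BUILT_IN_SQRT:
      return TOY_IFN_SQRT;

    case TOY_BUILT_IN_POW10:
      return TOY_IFN_EXP10;

    default:
      return TOY_IFN_LAST;
    }
}

// Lookup plus a target query.  SUPPORTED models the set of internal
// functions that the current target can expand directly.
static toy_ifn
toy_replacement_internal_fn (const toy_decl &decl,
			     const std::set<toy_ifn> &supported)
{
  if (decl.normal_builtin_p)
    {
      toy_ifn ifn = toy_associated_internal_fn (decl);
      if (ifn != TOY_IFN_LAST && supported.count (ifn) != 0)
	return ifn;
    }
  return TOY_IFN_LAST;
}
```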

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* builtins.h (associated_internal_fn): Declare.
(replacement_internal_fn): Likewise.
* builtins.c: Include internal-fn.h.
(associated_internal_fn, replacement_internal_fn): New functions.
* internal-fn.def (DEF_INTERNAL_FLT_FN): New macro.
(ACOS, ASIN, ATAN, COS, EXP, EXP10, EXP2, EXPM1, LOG, LOG10, LOG1P)
(LOG2, LOGB, SIGNIFICAND, SIN, SQRT, TAN, CEIL, FLOOR, NEARBYINT)
(RINT, ROUND, TRUNC, ATAN2, COPYSIGN, FMOD, POW, REMAINDER, SCALB)
(LDEXP): New functions.
* internal-fn.c: Include recog.h.
(unary_direct, binary_direct): New macros.
(expand_direct_optab_fn): New function.
(expand_unary_optab_fn): New macro.
(expand_binary_optab_fn): Likewise.
(direct_unary_optab_supported_p): Likewise.
(direct_binary_optab_supported_p): Likewise.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index ad661c1..0a9b185 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cilk.h"
 #include "tree-chkp.h"
 #include "rtl-chkp.h"
+#include "internal-fn.h"
 
 
 struct target_builtins default_target_builtins;
@@ -1901,6 +1902,63 @@ mathfn_built_in (tree type, enum built_in_function fn)
   return mathfn_built_in_1 (type, fn, /*implicit=*/ 1);
 }
 
+/* If BUILT_IN_NORMAL function FNDECL has an associated internal function,
+   return its code, otherwise return IFN_LAST.  Note that this function
+   only tests whether the function is defined in internals.def, not whether
+   it is actually available on the target.  */
+
+internal_fn
+associated_internal_fn (tree fndecl)
+{
+  gcc_checking_assert (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL);
+  tree return_type = TREE_TYPE (TREE_TYPE (fndecl));
+  switch (DECL_FUNCTION_CODE (fndecl))
+{
+#define DEF_INTERNAL_FLT_FN(NAME, FLAGS, OPTAB, TYPE) \
+CASE_FLT_FN (BUILT_IN_##NAME): return IFN_##NAME;
+#include "internal-fn.def"
+
+CASE_FLT_FN (BUILT_IN_POW10):
+  return IFN_EXP10;
+
+CASE_FLT_FN (BUILT_IN_DREM):
+  return IFN_REMAINDER;
+
+CASE_FLT_FN (BUILT_IN_SCALBN):
+CASE_FLT_FN (BUILT_IN_SCALBLN):
+  if (REAL_MODE_FORMAT (TYPE_MODE (return_type))->b == 2)
+   return IFN_LDEXP;
+  return IFN_LAST;
+
+default:
+  return IFN_LAST;
+}
+}
+
+/* If CALL is a call to a BUILT_IN_NORMAL function that could be replaced
+   on the current target by a call to an internal function, return the
+   code of that internal function, otherwise return IFN_LAST.  The caller
+   is responsible for ensuring that any side-effects of the built-in
+   call are dealt with correctly.  E.g. if CALL sets errno, the caller
+   must decide that the errno result isn't needed or make it available
+   in some other way.  */
+
+internal_fn
+replacement_internal_fn (gcall *call)
+{
+  if (gimple_call_builtin_p (call, BUILT_IN_NORMAL))
+{
+  internal_fn ifn = associated_internal_fn (gimple_call_fndecl (call));
+  if (ifn != IFN_LAST)
+   {
+ tree_pair types = direct_internal_fn_types (ifn, call);
+ if (direct_internal_fn_supported_p (ifn, types))
+   return ifn;
+   }
+}
+  return IFN_LAST;
+}
+
 /* If errno must be maintained, expand the RTL to check if the result,
TARGET, of a built-in function call, EXP, is NaN, and if so set
errno to EDOM.  */
diff --git a/gcc/builtins.h b/gcc/builtins.h
index b039632..7f92d07 100644
--- a/gcc/builtins.h
+++ b/gcc/builtins.h
@@ -94,4 +94,7 @@ extern char target_percent_s[3];
 extern char target_percent_c[3];
 extern char target_percent_s_newline[4];
 
+extern internal_fn associated_internal_fn (tree);
+extern internal_fn replacement_internal_fn (gcall *);
+
 #endif
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 72536da..9f9f9cf 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dojump.h"
 #include "expr.h"
 #include "ubsan.h"
+#include "recog.h"
 
 /* The names of each internal function, indexed by function number.  */
 const char *const internal_fn_name_array[] = {
@@ -73,6 +74,8 @@ init_internal_fns ()
 #define load_lanes_direct { -1, -1 }
 #define mask_store_direct { 3, 3 }
 #define store_

Add basic support for direct_optab internal functions

2015-11-07 Thread Richard Sandiford
This patch adds a concept of internal functions that map directly to an
optab (here called "direct internal functions").  The function can only
be used if the associated optab can be used.

We currently have four functions like that:

- LOAD_LANES
- STORE_LANES
- MASK_LOAD
- MASK_STORE

so the patch converts them to the new infrastructure.  These four
all need different types of optabs, but future patches will add
regular unary and binary ones.

In general we need one or two modes to decide whether an optab is
supported, depending on whether it's a convert_optab or not.
This in turn means that we need up to two types to decide whether
an internal function is supported.  The patch records which types
are needed for each internal function, using -1 if the return type
should be used and N>=0 if the type of argument N should be used.

(LOAD_LANES and STORE_LANES are unusual in that both optab modes
come from the same array type.)
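
As a rough illustration of the selector encoding (-1 for the return
type, N>=0 for the type of argument N), the sketch below resolves a pair
of selectors against a simplified call.  Types are plain strings here,
the toy_* names are invented, and the argument names in the usage note
are placeholders rather than the real operand layout.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model of a call: a return type plus the argument types.
struct toy_call
{
  std::string return_type;
  std::vector<std::string> arg_types;
};

// The two type selectors recorded for a direct internal function.
struct toy_direct_info
{
  int type0;
  int type1;
};

// The same initializers as the *_direct macros in the patch.
static const toy_direct_info toy_mask_load_direct = { -1, -1 };
static const toy_direct_info toy_mask_store_direct = { 3, 3 };
static const toy_direct_info toy_store_lanes_direct = { 0, 0 };

// Resolve one selector: -1 picks the return type, N>=0 picks the type
// of argument N.  The "not direct" value -2 must not reach this point.
static std::string
toy_select_type (const toy_call &call, int selector)
{
  assert (selector >= -1);
  if (selector == -1)
    return call.return_type;
  return call.arg_types.at (static_cast<std::size_t> (selector));
}

// Return the pair of types that decides whether the optab is usable.
static std::pair<std::string, std::string>
toy_direct_types (const toy_direct_info &info, const toy_call &call)
{
  return std::make_pair (toy_select_type (call, info.type0),
			 toy_select_type (call, info.type1));
}
```

For a call modelled as { "v4si", { "ptr", "align", "mask" } },
toy_mask_load_direct selects "v4si" twice, because both selectors point
at the return type.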

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* coretypes.h (tree_pair): New type.
* internal-fn.def (DEF_INTERNAL_OPTAB_FN): New macro.  Use it
for MASK_LOAD, LOAD_LANES, MASK_STORE and STORE_LANES.
* internal-fn.h (direct_internal_fn_info): New structure.
(direct_internal_fn_array): Declare.
(direct_internal_fn_p, direct_internal_fn): New functions.
(direct_internal_fn_types, direct_internal_fn_supported_p): Declare.
* internal-fn.c (not_direct, mask_load_direct, load_lanes_direct)
(mask_store_direct, store_lanes_direct): New macros.
(direct_internal_fn_array): New array.
(get_multi_vector_move): Return the optab handler without asserting
that it is available.
(expand_LOAD_LANES): Rename to...
(expand_load_lanes_optab_fn): ...this and add an optab argument.
(expand_STORE_LANES): Rename to...
(expand_store_lanes_optab_fn): ...this and add an optab argument.
(expand_MASK_LOAD): Rename to...
(expand_mask_load_optab_fn): ...this and add an optab argument.
(expand_MASK_STORE): Rename to...
(expand_mask_store_optab_fn): ...this and add an optab argument.
(direct_internal_fn_types, direct_optab_supported_p)
(multi_vector_optab_supported_p, direct_internal_fn_supported_p)
(direct_internal_fn_supported_p): New functions.
(direct_mask_load_optab_supported_p): New macro.
(direct_load_lanes_optab_supported_p): Likewise.
(direct_mask_store_optab_supported_p): Likewise.
(direct_store_lanes_optab_supported_p): Likewise.

diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 3439c38..d4a75db 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -251,6 +251,8 @@ namespace gcc {
   class context;
 }
 
+typedef std::pair <tree, tree> tree_pair;
+
 #else
 
 struct _dont_use_rtx_here_;
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index afbfae8..72536da 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -66,13 +66,27 @@ init_internal_fns ()
   internal_fn_fnspec_array[IFN_LAST] = 0;
 }
 
+/* Create static initializers for the information returned by
+   direct_internal_fn.  */
+#define not_direct { -2, -2 }
+#define mask_load_direct { -1, -1 }
+#define load_lanes_direct { -1, -1 }
+#define mask_store_direct { 3, 3 }
+#define store_lanes_direct { 0, 0 }
+
+const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
+#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) not_direct,
+#define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) TYPE##_direct,
+#include "internal-fn.def"
+  not_direct
+};
+
 /* ARRAY_TYPE is an array of vector modes.  Return the associated insn
-   for load-lanes-style optab OPTAB.  The insn must exist.  */
+   for load-lanes-style optab OPTAB, or CODE_FOR_nothing if none.  */
 
 static enum insn_code
 get_multi_vector_move (tree array_type, convert_optab optab)
 {
-  enum insn_code icode;
   machine_mode imode;
   machine_mode vmode;
 
@@ -80,15 +94,13 @@ get_multi_vector_move (tree array_type, convert_optab optab)
   imode = TYPE_MODE (array_type);
   vmode = TYPE_MODE (TREE_TYPE (array_type));
 
-  icode = convert_optab_handler (optab, imode, vmode);
-  gcc_assert (icode != CODE_FOR_nothing);
-  return icode;
+  return convert_optab_handler (optab, imode, vmode);
 }
 
-/* Expand LOAD_LANES call STMT.  */
+/* Expand LOAD_LANES call STMT using optab OPTAB.  */
 
 static void
-expand_LOAD_LANES (gcall *stmt)
+expand_load_lanes_optab_fn (gcall *stmt, convert_optab optab)
 {
   struct expand_operand ops[2];
   tree type, lhs, rhs;
@@ -106,13 +118,13 @@ expand_LOAD_LANES (gcall *stmt)
 
   create_output_operand (&ops[0], target, TYPE_MODE (type));
   create_fixed_operand (&ops[1], mem);
-  expand_insn (get_multi_vector_move (type, vec_load_lanes_optab), 2, ops);
+  expand_insn (get_multi_vector_move (type, optab), 2, ops);
 }
 
-/* Expand STORE_LANES call STMT.  */
+/* Expand STORE_LANES call STMT using opt

Add a combined_fn enum

2015-11-07 Thread Richard Sandiford
I'm working on a patch series that needs to be able to treat built-in
functions and internal functions in a similar way.  This patch adds a
new enum, combined_fn, that combines the two together.  It also adds
utility functions for seeing which combined_fn (if any) is called by
a given CALL_EXPR or gcall.
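
As a rough illustration of the idea (not the code in the patch), two enums can
share one numbering by keeping the first enum's values and offsetting the
second by the first enum's end marker.  All names below (toy_builtin,
toy_internal, toy_combined and the helpers) are made up for this sketch; the
real enums are generated from builtins.def and internal-fn.def.

```cpp
// Toy stand-ins for built_in_function and internal_fn (hypothetical names).
enum toy_builtin { TB_SIN, TB_COS, TB_END };
enum toy_internal { TI_LOAD_LANES, TI_STORE_LANES, TI_LAST };

// Built-in codes keep their numbers; internal codes start at TB_END.
enum toy_combined {
  TC_SIN = int (TB_SIN),
  TC_COS = int (TB_COS),
  TC_LOAD_LANES = int (TB_END) + int (TI_LOAD_LANES),
  TC_STORE_LANES = int (TB_END) + int (TI_STORE_LANES),
  TC_LAST
};

// Convert either source enum into the combined numbering.
inline toy_combined as_combined (toy_builtin fn)
{
  return toy_combined (int (fn));
}

inline toy_combined as_combined (toy_internal fn)
{
  return toy_combined (int (fn) + int (TB_END));
}

// Classify a combined code; TC_LAST belongs to neither group.
inline bool is_builtin (toy_combined code)
{
  return int (code) < int (TB_END);
}

inline bool is_internal (toy_combined code)
{
  return int (code) >= int (TB_END) && code != TC_LAST;
}

// Recover the internal code; only meaningful when is_internal (code).
inline toy_internal as_internal (toy_combined code)
{
  return toy_internal (int (code) - int (TB_END));
}
```

With this layout a single switch over toy_combined can handle both kinds of
function, and the end marker doubles as the "no such function" value.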

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
OK to install?

Thanks,
Richard


gcc/
* tree-core.h (internal_fn): Move immediately after the definition
of built_in_function.
(combined_fn): New enum.
* tree.h (as_combined_fn, builtin_fn_p, as_builtin_fn)
(internal_fn_p, as_internal_fn): New functions.
(get_call_combined_fn, combined_fn_name): Declare.
* tree.c (get_call_combined_fn): New function.
(combined_fn_name): Likewise.
* gimple.h (gimple_call_combined_fn): Declare.
* gimple.c (gimple_call_combined_fn): New function.

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 4ce38da..de3520a 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -2530,6 +2530,27 @@ gimple_call_builtin_p (const gimple *stmt, enum 
built_in_function code)
   return false;
 }
 
+/* If CALL is a call to a combined_fn (i.e. an internal function or
+   a normal built-in function), return its code, otherwise return
+   CFN_LAST.  */
+
+combined_fn
+gimple_call_combined_fn (const gimple *stmt)
+{
+  if (const gcall *call = dyn_cast <const gcall *> (stmt))
+{
+  if (gimple_call_internal_p (call))
+   return as_combined_fn (gimple_call_internal_fn (call));
+
+  tree fndecl = gimple_call_fndecl (stmt);
+  if (fndecl
+ && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
+ && gimple_builtin_call_types_compatible_p (stmt, fndecl))
+   return as_combined_fn (DECL_FUNCTION_CODE (fndecl));
+}
+  return CFN_LAST;
+}
+
 /* Return true if STMT clobbers memory.  STMT is required to be a
GIMPLE_ASM.  */
 
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 781801b..13cfbce 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1499,6 +1499,7 @@ extern tree gimple_signed_type (tree);
 extern alias_set_type gimple_get_alias_set (tree);
 extern bool gimple_ior_addresses_taken (bitmap, gimple *);
 extern bool gimple_builtin_call_types_compatible_p (const gimple *, tree);
+extern combined_fn gimple_call_combined_fn (const gimple *);
 extern bool gimple_call_builtin_p (const gimple *);
 extern bool gimple_call_builtin_p (const gimple *, enum built_in_class);
 extern bool gimple_call_builtin_p (const gimple *, enum built_in_function);
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 954368f..afb53be 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -184,6 +184,35 @@ enum built_in_function {
   END_BUILTINS
 };
 
+/* Internal functions.  */
+enum internal_fn {
+#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) IFN_##CODE,
+#include "internal-fn.def"
+  IFN_LAST
+};
+
+/* An enum that combines target-independent built-in functions with
+   internal functions, so that they can be treated in a similar way.
+   The numbers for built-in functions are the same as for the
+   built_in_function enum.  The numbers for internal functions
+   start at END_BUILTINS.  */
+enum combined_fn {
+#define DEF_BUILTIN(ENUM, N, C, T, LT, B, F, NA, AT, IM, COND) \
+  CFN_##ENUM = int (ENUM),
+#include "builtins.def"
+
+#define DEF_BUILTIN(ENUM, N, C, T, LT, B, F, NA, AT, IM, COND)
+#define DEF_BUILTIN_CHKP(ENUM, N, C, T, LT, B, F, NA, AT, IM, COND) \
+  CFN_##ENUM##_CHKP = int (ENUM##_CHKP),
+#include "builtins.def"
+
+#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) \
+  CFN_##CODE = int (END_BUILTINS) + int (IFN_##CODE),
+#include "internal-fn.def"
+
+  CFN_LAST
+};
+
 /* Tree code classes.  Each tree_code has an associated code class
represented by a TREE_CODE_CLASS.  */
 enum tree_code_class {
@@ -766,13 +795,6 @@ enum annot_expr_kind {
   annot_expr_kind_last
 };
 
-/* Internal functions.  */
-enum internal_fn {
-#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) IFN_##CODE,
-#include "internal-fn.def"
-  IFN_LAST
-};
-
 /*---
 Type definitions
 ---*/
diff --git a/gcc/tree.c b/gcc/tree.c
index 5b9a7bd..94c3a1a 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -9316,6 +9316,25 @@ get_callee_fndecl (const_tree call)
   return NULL_TREE;
 }
 
+/* If CALL_EXPR CALL calls a normal built-in function or an internal function,
+   return the associated function code, otherwise return CFN_LAST.  */
+
+combined_fn
+get_call_combined_fn (const_tree call)
+{
+  /* It's invalid to call this function with anything but a CALL_EXPR.  */
+  gcc_assert (TREE_CODE (call) == CALL_EXPR);
+
+  if (!CALL_EXPR_FN (call))
+return as_combined_fn (CALL_EXPR_IFN (call));
+
+  tree fndecl = get_callee_fndecl (call);
+  if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
+return as_combined_fn (DECL_FUNCTI

Re: improved RTL-level if conversion using scratchpads [half-hammock edition]

2015-11-07 Thread Bernhard Reutner-Fischer
On November 6, 2015 5:27:44 PM GMT+01:00, Bernd Schmidt 
 wrote:
>On 11/06/2015 04:52 PM, Sebastian Pop wrote:
>
>>> opinion). If you want a half-finished redzone allocator, I can send
>you a
>>> patch.
>>
>> Yes please.  Let's get it work.
>
>Here you go. This is incomplete and does not compile, but it shows the 
>direction I have in mind and isn't too far off. I had a similar patch 

--- a/gcc/function.c
+++ b/gcc/function.c
@@ -217,10 +217,10 @@ free_after_compilation (struct function *f)
 HOST_WIDE_INT
 get_frame_size (void)
 {
-  if (FRAME_GROWS_DOWNWARD)
-return -frame_offset;
+  if (-crtl->frame.grows_downward)
+return -crtl->frame.frame_offset;
   else
-return frame_offset;
+return crtl->frame.frame_offset;
 }
 
frame.grows_downward is a bool it seems and as such I wonder what the minus in 
the condition means or is supposed to achieve?
Something we (should?) warn about?
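
For reference, a minimal sketch of what the language does here, assuming
grows_downward really is a plain bool (the helper names are hypothetical):
unary minus promotes the bool to int, so the condition tests -1 or 0, which is
the same truth value as the bool itself.

```cpp
// Plain test of the flag.
inline bool taken_plain (bool flag)
{
  if (flag)
    return true;
  return false;
}

// Test of the negated flag: -true is the int -1 and -false is the int 0,
// and only zero versus non-zero matters in a condition.
inline bool taken_negated (bool flag)
{
  if (-flag)
    return true;
  return false;
}
```

So under that assumption the minus has no effect on which branch is taken.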

Just curious..
Cheers,




Combined constructs' clause splitting (was: [gomp4] backport trunk FE changes)

2015-11-07 Thread Thomas Schwinge
Hi!

On Fri, 6 Nov 2015 15:31:23 -0800, Cesar Philippidis  
wrote:
> I've applied this patch to gomp-4_0-branch which backports most of my
> front end changes from trunk. Note that I found a regression while
> testing, which is also present in trunk. It looks like
> kernels-acc-loop-reduction.c is failing because I'm incorrectly
> propagating the reduction variable to both the kernels and loop
> constructs for combined 'acc kernels loop'. The problem here is that
> kernels don't support the reduction clause. I'll fix that next week.

Always need to consider both what the specification allows -- and thus
what the front ends accept/refuse -- as well as what we might do
differently, internally in later processing stages.  I have not analyzed
whether it makes sense to have the OMP_CLAUSE_REDUCTION of a combined
"kernels loop reduction([...])" construct be attached to the outer
OACC_KERNELS or inner OACC_LOOP, or duplicated for both.

Tom, if you need a solution for that right now/want to restore the
previous behavior (attached to inner OACC_LOOP only), here's what you
should try: in gcc/c-family/c-omp.c:c_oacc_split_loop_clauses remove the
special handling for OMP_CLAUSE_REDUCTION, and move it to the "Loop clauses"
section, and in
gcc/fortran/trans-openmp.c:gfc_trans_oacc_combined_directive I don't see
reduction clauses being handled, hmm, maybe the Fortran front end is
doing that differently?


Grüße
 Thomas


signature.asc
Description: PGP signature


[RFC] [PATCH V2]: RE: [RFC] [Patch] Relax tree-if-conv.c trap assumptions.

2015-11-07 Thread Kumar, Venkataramanan
Hi Richard,

> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Friday, October 30, 2015 5:00 PM
> To: Kumar, Venkataramanan
> Cc: Andrew Pinski; gcc-patches@gcc.gnu.org
> Subject: Re: [RFC] [Patch] Relax tree-if-conv.c trap assumptions.
> 
> On Fri, Oct 30, 2015 at 11:21 AM, Kumar, Venkataramanan
>  wrote:
> > Hi Andrew,
> >
> >> -Original Message-
> >> From: Andrew Pinski [mailto:pins...@gmail.com]
> >> Sent: Friday, October 30, 2015 3:38 PM
> >> To: Kumar, Venkataramanan
> >> Cc: Richard Beiner (richard.guent...@gmail.com);
> >> gcc-patches@gcc.gnu.org
> >> Subject: Re: [RFC] [Patch] Relax tree-if-conv.c trap assumptions.
> >>
> >> On Fri, Oct 30, 2015 at 6:06 PM, Kumar, Venkataramanan
> >>  wrote:
> >> > Hi Richard,
> >> >
> >> > I am trying to "if-convert the store" in the below test case and
> >> > later help it to get vectorized under -Ofast
> >> > -ftree-loop-if-convert-stores -fno-common
> >> >
> >> > #define LEN 4096
> >> >  __attribute__((aligned(32))) float array[LEN]; void test() { for (int i = 0; i < LEN; i++) {
> >> >     if (array[i] > (float)0.)
> >> >       array[i] = 3;
> >> >   }
> >> > }
> >> >
> >> > Currently GCC 5.2 does not vectorize it.
> >> > https://goo.gl/9nS6Pd
> >> >
> >> > However ICC seems to vectorize it
> >> > https://goo.gl/y1yGHx
> >> >
> >> > As discussed with you earlier, to allow "if convert store" I am
> >> > checking the following:
> >> >
> >> > (1) We already read the reference "array[i]" unconditionally once.
> >> > (2) I am now checking if we are conditionally writing to memory
> >> > which is defined as read and write and is bound to the definition
> >> > we are seeing.
> >>
> >>
> >> I don't think this is thread safe 
> >>
> >
> > I thought under -ftree-loop-if-convert-stores it is ok to do this
> > transformation.
> 
> Yes, that's what we have done in the past here.  Note that I think we should
> remove the flag in favor of the --param allow-store-data-races and if-convert
> safe stores always (stores to thread-local memory).  Esp. using masked
> stores should be always safe.
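
To make the data-race concern concrete, here is a hand-written sketch of the
two loop shapes under discussion.  It is not compiler output, and the function
names are made up for illustration.

```cpp
#include <cstddef>

// Original form: the store happens only on iterations where the test holds.
inline void clamp_conditional (float *a, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    if (a[i] > 0.0f)
      a[i] = 3.0f;
}

// If-converted form: every iteration stores, writing back the old value
// when the test fails.  The result is the same single-threaded, but the
// extra stores could race with another thread writing a[i], and would
// fault if the memory were read-only.
inline void clamp_if_converted (float *a, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    a[i] = a[i] > 0.0f ? 3.0f : a[i];
}
```

The second form has no control flow in the loop body, which is what the
vectorizer wants; a masked store would avoid the extra writes altogether.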
> 
> > Regards,
> > Venkat.
> >
> >> Thanks,
> >> Andrew
> >>
> >> >
> >> > With this change, I am able to if-convert and then vectorize the case
> >> > as well.
> >> >
> >> > /build/gcc/xgcc -B ./build/gcc/  ifconv.c -Ofast -fopt-info-vec  -S
> >> > -ftree-loop-if-convert-stores -fno-common
> >> > ifconv.c:2:63: note: loop vectorized
> >> >
> >> > Patch
> >> > --
> >> > diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> >> > index f201ab5..6475cc0 100644
> >> > --- a/gcc/tree-if-conv.c
> >> > +++ b/gcc/tree-if-conv.c
> >> > @@ -727,6 +727,34 @@ write_memrefs_written_at_least_once (gimple *stmt,
> >> >    return true;
> >> >  }
> >> >
> >> > +static bool
> >> > +write_memrefs_safe_to_access_unconditionally (gimple *stmt,
> >> > +                                              vec<data_reference_p> drs) {
> 
> { to the next line
> 
> The function has a bad name; it should be write_memrefs_writable ().
> 
> >> > +  int i;
> >> > +  data_reference_p a;
> >> > +  bool found = false;
> >> > +
> >> > +  for (i = 0; drs.iterate (i, &a); i++)
> >> > +{
> >> > +  if (DR_STMT (a) == stmt
> >> > +   && DR_IS_WRITE (a)
> >> > +   && (DR_WRITTEN_AT_LEAST_ONCE (a) == 0)
> >> > +   && (DR_RW_UNCONDITIONALLY (a) == 1))
> >> > + {
> >> > +   tree base = get_base_address (DR_REF (a));
> >> > +   found = false;
> >> > +   if (DECL_P (base)
> >> > +   && decl_binds_to_current_def_p (base)
> >> > +   && !TREE_READONLY (base))
> >> > + {
> >> > +   found = true;
> 
> So if the vector ever would contain more than one write you'd return true if
> only one of them is not readonly.
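
The concern above is the difference between "some write is writable" and
"every write is writable".  A simplified sketch, using a made-up toy_ref
struct rather than GCC's data_reference, and not reproducing the patch's
exact control flow:

```cpp
#include <vector>

// Hypothetical, simplified stand-in for a data reference.
struct toy_ref
{
  bool is_write;
  bool writable;
};

// Accepts as soon as one write is writable; a read-only write elsewhere
// in the vector goes unnoticed.
inline bool any_write_writable (const std::vector<toy_ref> &refs)
{
  for (const toy_ref &r : refs)
    if (r.is_write && r.writable)
      return true;
  return false;
}

// Rejects as soon as one write is not writable, which is the property an
// unconditional store needs.
inline bool all_writes_writable (const std::vector<toy_ref> &refs)
{
  for (const toy_ref &r : refs)
    if (r.is_write && !r.writable)
      return false;
  return true;
}
```

With one writable and one read-only write in the vector, the first check
passes and the second fails.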
> 
> IMHO all the routines need refactoring to operate on single DRs which AFAIK
> is the only case if-conversion handles anyway (can't if-convert calls or
> aggregate assignments or asms).  Ugh, it seems the whole thing is quadratic,
> doing linear walks to find the DRs for a stmt ...
> 
> A simple
> 
> Index: gcc/tree-if-conv.c
> ===================================================================
> --- gcc/tree-if-conv.c  (revision 229572)
> +++ gcc/tree-if-conv.c  (working copy)
> @@ -612,9 +612,10 @@ memrefs_read_or_written_unconditionally
>data_reference_p a, b;
>tree ca = bb_predicate (gimple_bb (stmt));
> 
> -  for (i = 0; drs.iterate (i, &a); i++)
> -if (DR_STMT (a) == stmt)
> -  {
> +  for (i = gimple_uid (stmt) - 1; drs.iterate (i, &a); i++)
> +{
> +if (DR_STMT (a) != stmt)
> +  break;
> bool found = false;
> int x = DR_RW_UNCONDITIONALLY (a);
> 
> @@ -684,10 +685,13 @@ write_memrefs_written_at_least_once (gim
>data_reference_p a, b;
>tree ca = bb_predicate (gimple_bb (stmt));
> 
> -  for (i = 0; drs.iterate (i, &a); i++)
> -if (DR_STMT (a) == stmt
> -   && DR_IS_WR

Re: [PATCH 1/7] New obstack_next_free is not an lvalue

2015-11-07 Thread Richard Sandiford
Alan Modra  writes:
> diff --git a/gcc/gensupport.c b/gcc/gensupport.c
> index 0480e17..484ead2 100644
> --- a/gcc/gensupport.c
> +++ b/gcc/gensupport.c
> @@ -2253,7 +2253,7 @@ htab_eq_string (const void *s1, const void *s2)
> and a permanent heap copy of STR is created.  */
>  
>  static void
> -add_mnemonic_string (htab_t mnemonic_htab, const char *str, int len)
> +add_mnemonic_string (htab_t mnemonic_htab, const char *str, size_t len)
>  {
>char *new_str;
>void **slot;
> @@ -2306,7 +2306,7 @@ gen_mnemonic_setattr (htab_t mnemonic_htab, rtx insn)
>for (i = 0; *cp; )
>  {
>const char *ep, *sp;
> -  int size = 0;
> +  size_t size = 0;
>  
>while (ISSPACE (*cp))
>   cp++;
> @@ -2333,8 +2333,7 @@ gen_mnemonic_setattr (htab_t mnemonic_htab, rtx insn)
>   {
> /* Don't set a value if there are more than one
>instruction in the string.  */
> -   obstack_next_free (&string_obstack) =
> - obstack_next_free (&string_obstack) - size;
> +   obstack_blank_fast (&string_obstack, -size);
> size = 0;
>  
> cp = sp;

Maybe ssize_t instead of size_t since we're using the negative?  Or does
that still trigger a warning?
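
For context, a small sketch of what negating an unsigned size does in the
language itself.  The helper names are made up, and std::ptrdiff_t stands in
for ssize_t, which is POSIX rather than standard C++.  It says nothing about
how obstack_blank_fast treats its argument.

```cpp
#include <cstddef>
#include <cstdint>

// Negating an unsigned value wraps modulo 2^N, so for size > 0 the result
// is a very large positive number, not a negative one.
inline std::size_t negate_unsigned (std::size_t size)
{
  return -size;
}

// Converting to a signed type before negating keeps the value negative.
inline std::ptrdiff_t negate_signed (std::size_t size)
{
  return -static_cast<std::ptrdiff_t> (size);
}
```

For a size of 5, the first helper returns SIZE_MAX - 4 and the second
returns -5.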

OK either way for gensupport.c, thanks.

Richard


Re: [gomp4] OpenACC reduction tests

2015-11-07 Thread Thomas Schwinge
Hi!

On Wed, 23 Sep 2015 09:56:44 +0200, I wrote:
> On Fri, 18 Sep 2015 10:11:25 +0200, I wrote:
> > On Fri, 17 Jul 2015 11:13:59 -0700, Cesar Philippidis 
> >  wrote:
> > > This patch updates the libgomp OpenACC reduction test cases [...]

> > Given the following
> > -Wuninitialized/-Wmaybe-uninitialized warnings (for -O1, for example),
> > maybe there's some initialization of (internal) variables missing?
> > (These user-visible warnings about compiler internals need to be
> > addressed regardless.)  Would you please have a look at that?
> > 
> > source-gcc/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90: In 
> > function 'redsub_combined_._omp_fn.0':
> > source-gcc/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90:73:0: 
> > warning: '<anonymous>' is used uninitialized in this function
> > [-Wuninitialized]
> >!$acc loop reduction(+:sum) gang worker vector
> > ^
> > source-gcc/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90: In 
> > function 'redsub_vector_._omp_fn.1':
> > source-gcc/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90:60:0: 
> > warning: '<anonymous>' is used uninitialized in this function
> > [-Wuninitialized]
> >!$acc loop reduction(+:sum) vector
> > ^
> > source-gcc/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90: In 
> > function 'redsub_worker_._omp_fn.2':
> > source-gcc/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90:47:0: 
> > warning: '<anonymous>' is used uninitialized in this function
> > [-Wuninitialized]
> >!$acc loop reduction(+:sum) worker
> > ^
> > source-gcc/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90: In 
> > function 'redsub_gang_._omp_fn.3':
> > source-gcc/libgomp/testsuite/libgomp.oacc-fortran/reduction-5.f90:34:0: 
> > warning: 'sum.43' may be used uninitialized in this function 
> > [-Wmaybe-uninitialized]
> >!$acc loop reduction(+:sum) gang
> > ^

I didn't see anyone explicitly claim to have fixed that; however, the
warnings are gone.


Grüße
 Thomas


signature.asc
Description: PGP signature


[gomp4, committed] Revert "Add counter inits to zero_iter_bb in expand_omp_for_init_counts"

2015-11-07 Thread Tom de Vries

Hi,

this patch reverts "Add counter inits to zero_iter_bb in 
expand_omp_for_init_counts". We no longer split off the kernels region 
in ssa-mode, so there's no need for this patch anymore.


Committed to gomp-4_0-branch.

Thanks,
- Tom
Revert "Add counter inits to zero_iter_bb in expand_omp_for_init_counts"

2015-10-08  Tom de Vries  

	revert:
	2015-10-08  Tom de Vries  

	* omp-low.c (expand_omp_for_init_counts): Add inits for counters in
	zero_iter_bb.
	(expand_omp_for_generic): Remove TREE_NO_WARNING settings on counters.

	* gcc.dg/gomp/collapse-2.c: New test.
---
 gcc/omp-low.c  | 35 ++
 gcc/testsuite/gcc.dg/gomp/collapse-2.c | 19 --
 2 files changed, 14 insertions(+), 40 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/gomp/collapse-2.c

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 437f8c1..76f1ae9 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -7437,7 +7437,6 @@ expand_omp_for_init_counts (struct omp_for_data *fd, gimple_stmt_iterator *gsi,
 	  break;
 	}
 }
-  bool created_zero_iter_bb = false;
   for (i = 0; i < (fd->ordered ? fd->ordered : fd->collapse); i++)
 {
   tree itype = TREE_TYPE (fd->loops[i].v);
@@ -7493,7 +7492,6 @@ expand_omp_for_init_counts (struct omp_for_data *fd, gimple_stmt_iterator *gsi,
 	  gsi_insert_before (gsi, assign_stmt, GSI_SAME_STMT);
 	  set_immediate_dominator (CDI_DOMINATORS, zero_iter_bb,
    entry_bb);
-	  created_zero_iter_bb = true;
 	}
 	  ne = make_edge (entry_bb, zero_iter_bb, EDGE_FALSE_VALUE);
 	  ne->probability = REG_BR_PROB_BASE / 2000 - 1;
@@ -7547,25 +7545,6 @@ expand_omp_for_init_counts (struct omp_for_data *fd, gimple_stmt_iterator *gsi,
 	  expand_omp_build_assign (gsi, fd->loop.n2, t);
 	}
 }
-
-  if (created_zero_iter_bb)
-{
-  /* Atm counts[0] doesn't seem to be used beyond create_zero_iter_bb,
-	 but for robustness-sake we include that one as well.  */
-  for (i = 0; i < (fd->ordered ? fd->ordered : fd->collapse); i++)
-	{
-	  tree var = counts[i];
-	  if (!SSA_VAR_P (var))
-	continue;
-
-	  tree zero = build_zero_cst (type);
-	  gassign *assign_stmt = gimple_build_assign (var, zero);
-	  basic_block &zero_iter_bb
-	= i < fd->collapse ? zero_iter1_bb : zero_iter2_bb;
-	  gimple_stmt_iterator gsi = gsi_after_labels (zero_iter_bb);
-	  gsi_insert_before (&gsi, assign_stmt, GSI_SAME_STMT);
-	}
-}
 }
 
 
@@ -8237,6 +8216,7 @@ expand_omp_for_generic (struct omp_region *region,
   bool seq_loop = (start_fn == BUILT_IN_NONE || next_fn == BUILT_IN_NONE);
   edge e, ne;
   tree *counts = NULL;
+  int i;
   bool ordered_lastprivate = false;
 
   gcc_assert (!broken_loop || !in_combined_parallel);
@@ -8283,6 +8263,13 @@ expand_omp_for_generic (struct omp_region *region,
 
   if (zero_iter1_bb)
 	{
+	  /* Some counts[i] vars might be uninitialized if
+	 some loop has zero iterations.  But the body shouldn't
+	 be executed in that case, so just avoid uninit warnings.  */
+	  for (i = first_zero_iter1;
+	   i < (fd->ordered ? fd->ordered : fd->collapse); i++)
+	if (SSA_VAR_P (counts[i]))
+	  TREE_NO_WARNING (counts[i]) = 1;
 	  gsi_prev (&gsi);
 	  e = split_block (entry_bb, gsi_stmt (gsi));
 	  entry_bb = e->dest;
@@ -8294,6 +8281,12 @@ expand_omp_for_generic (struct omp_region *region,
 	}
   if (zero_iter2_bb)
 	{
+	  /* Some counts[i] vars might be uninitialized if
+	 some loop has zero iterations.  But the body shouldn't
+	 be executed in that case, so just avoid uninit warnings.  */
+	  for (i = first_zero_iter2; i < fd->ordered; i++)
+	if (SSA_VAR_P (counts[i]))
+	  TREE_NO_WARNING (counts[i]) = 1;
 	  if (zero_iter1_bb)
 	make_edge (zero_iter2_bb, entry_bb, EDGE_FALLTHRU);
 	  else
diff --git a/gcc/testsuite/gcc.dg/gomp/collapse-2.c b/gcc/testsuite/gcc.dg/gomp/collapse-2.c
deleted file mode 100644
index 5319f89..000
--- a/gcc/testsuite/gcc.dg/gomp/collapse-2.c
+++ /dev/null
@@ -1,19 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O2 -fopenmp -fdump-tree-ssa" } */
-
-#define N 100
-
-int a[N][N];
-
-void
-foo (int m, int n)
-{
-  int i, j;
-#pragma omp parallel
-#pragma omp for collapse(2) schedule (runtime)
-  for (i = 0; i < m; i++)
-for (j = 0; j < n; j++)
-  a[i][j] = 1;
-}
-
-/* { dg-final { scan-tree-dump-not "(?n)PHI.*count.*\\(D\\)" "ssa" } } */
-- 
1.9.1



[gomp4, committed] Remove no_overflow_tree_code

2015-11-07 Thread Tom de Vries

Hi,

this patch removes dead code from gomp-4_0-branch.

Committed to gomp-4_0-branch.

Thanks,
- Tom
Remove no_overflow_tree_code

2015-11-07  Tom de Vries  

	* tree.c (no_overflow_tree_code): Remove.
	* tree.h (no_overflow_tree_code): Remove.
---
 gcc/tree.c | 24 
 gcc/tree.h |  1 -
 2 files changed, 25 deletions(-)

diff --git a/gcc/tree.c b/gcc/tree.c
index 535c2d1..c7a3313 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -7606,30 +7606,6 @@ associative_tree_code (enum tree_code code)
   return false;
 }
 
-/* Return true if CODE represents an tree code that cannot overflow, given
-   operand type OP_TYPE.  Otherwise return false.  */
-bool
-no_overflow_tree_code (enum tree_code code, tree op_type)
-{
-  /* For now, just handle associative tree codes.  */
-  switch (code)
-{
-case BIT_IOR_EXPR:
-case BIT_AND_EXPR:
-case BIT_XOR_EXPR:
-  return true;
-
-case MIN_EXPR:
-case MAX_EXPR:
-  return (ANY_INTEGRAL_TYPE_P (op_type)
-	  && TREE_CODE (op_type) != COMPLEX_TYPE);
-
-default:
-  break;
-}
-  return false;
-}
-
 /* Return true if CODE represents a commutative tree code.  Otherwise
return false.  */
 bool
diff --git a/gcc/tree.h b/gcc/tree.h
index 92d6a89..f3e2a48 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4451,7 +4451,6 @@ extern tree get_file_function_name (const char *);
 extern tree get_callee_fndecl (const_tree);
 extern int type_num_arguments (const_tree);
 extern bool associative_tree_code (enum tree_code);
-extern bool no_overflow_tree_code (enum tree_code, tree);
 extern bool commutative_tree_code (enum tree_code);
 extern bool commutative_ternary_tree_code (enum tree_code);
 extern bool operation_can_overflow (enum tree_code);
-- 
1.9.1



[gomp4, committed] Make formatting resemble trunk in f95-lang.c

2015-11-07 Thread Tom de Vries

Hi,

this patch removes formatting differences with trunk.

Committed to gomp-4_0-branch.

Thanks,
- Tom
Make formatting resemble trunk in f95-lang.c

2015-11-07  Tom de Vries  

	* f95-lang.c: Make formatting resemble trunk.
---
 gcc/fortran/f95-lang.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c
index a63ebb3..40546e6 100644
--- a/gcc/fortran/f95-lang.c
+++ b/gcc/fortran/f95-lang.c
@@ -563,6 +563,7 @@ gfc_define_builtin (const char *name, tree type, enum built_in_function code,
   set_builtin_decl (code, decl, true);
 }
 
+
 #define DO_DEFINE_MATH_BUILTIN(code, name, argtype, tbase) \
 gfc_define_builtin ("__builtin_" name "l", tbase##longdouble[argtype], \
 			BUILT_IN_ ## code ## L, name "l", \
-- 
1.9.1



[gomp4, committed] Cleanup formatting of DEF_GOACC_BUILTINs

2015-11-07 Thread Tom de Vries

Hi,

this patch removes formatting differences with trunk.

Committed to gomp-4_0-branch.

Thanks,
- Tom
From a77fd266102498a909886cecde1b57adf9350d90 Mon Sep 17 00:00:00 2001
From: Tom de Vries 
Date: Fri, 6 Nov 2015 22:11:08 +0100
Subject: [PATCH 2/4] Cleanup formatting of DEF_GOACC_BUILTINs

2015-11-07  Tom de Vries  

	* omp-builtins.def: Cleanup formatting.
---
 gcc/omp-builtins.def | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/omp-builtins.def b/gcc/omp-builtins.def
index 1504a48..e04edc2 100644
--- a/gcc/omp-builtins.def
+++ b/gcc/omp-builtins.def
@@ -32,12 +32,10 @@ along with GCC; see the file COPYING3.  If not see
 DEF_GOACC_BUILTIN (BUILT_IN_ACC_GET_DEVICE_TYPE, "acc_get_device_type",
 		   BT_FN_INT, ATTR_NOTHROW_LIST)
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DATA_START, "GOACC_data_start",
-		   BT_FN_VOID_INT_SIZE_PTR_PTR_PTR,
-		   ATTR_NOTHROW_LIST)
+		   BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST)
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DATA_END, "GOACC_data_end",
 		   BT_FN_VOID, ATTR_NOTHROW_LIST)
-DEF_GOACC_BUILTIN (BUILT_IN_GOACC_ENTER_EXIT_DATA,
-		   "GOACC_enter_exit_data",
+DEF_GOACC_BUILTIN (BUILT_IN_GOACC_ENTER_EXIT_DATA, "GOACC_enter_exit_data",
 		   BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_INT_INT_VAR,
 		   ATTR_NOTHROW_LIST)
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_PARALLEL, "GOACC_parallel_keyed",
-- 
1.9.1



[gomp4, committed] Undo cgraph_node::release_body workaround

2015-11-07 Thread Tom de Vries

Hi,

this patch removes a workaround that's no longer needed, now that we 
split off the kernels region at the first omp-expand pass.


Committed to gomp-4_0-branch.

Thanks,
- Tom
From 5e9a609006b45c51598a3d52d5ab55b72a186f67 Mon Sep 17 00:00:00 2001
From: Tom de Vries 
Date: Fri, 6 Nov 2015 22:10:31 +0100
Subject: [PATCH 1/4] Undo cgraph_node::release_body workaround

2015-11-07  Tom de Vries  

	* cgraph.c (cgraph_node::release_body): Remove workaround.
---
 gcc/cgraph.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 8fe1ab4..7839c72 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1707,15 +1707,6 @@ release_function_body (tree decl)
 void
 cgraph_node::release_body (bool keep_arguments)
 {
-  /* The omp-expansion of the oacc kernels directive is post-poned till after
- all_small_ipa_passes.  That means pass_ipa_free_lang_data, which tries to
- release the body of the offload function, is run before omp_expand_target 
- can process the oacc kernels directive,  and omp_expand_target would crash
- trying to access the body.  This snippet works around this problem.
- FIXME: This should probably be fixed in a different way.  */
-  if (offloadable)
-return;
-
   ipa_transforms_to_apply.release ();
   if (!used_as_abstract_origin && symtab->state != PARSING)
 {
-- 
1.9.1



Re: [PATCH] i386: Use the STC bb-reorder algorithm at -Os (PR67864)

2015-11-07 Thread Uros Bizjak
On Sat, Nov 7, 2015 at 3:34 AM, Segher Boessenkool
 wrote:
> Adding x86 maintainer, ping?
>
> On Fri, Oct 16, 2015 at 05:53:41AM -0700, Segher Boessenkool wrote:
>> For x86, STC still gives better results for optimise-for-size than
>> "simple" does.  So use STC at -Os as well.
>>
>> Is this okay for trunk?
>>
>>
>> Segher
>>
>>
>> 2015-10-16  Segher Boessenkool  
>>
>>   PR rtl-optimization/67864
>>   * common/config/i386/i386-common.c (ix86_option_optimization_table)
>>   : Use REORDER_BLOCKS_ALGORITHM_STC
>>   at -Os and up.

OK.

Thanks,
Uros.

>> ---
>>  gcc/common/config/i386/i386-common.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/gcc/common/config/i386/i386-common.c 
>> b/gcc/common/config/i386/i386-common.c
>> index 79b2472..bb9f29c 100644
>> --- a/gcc/common/config/i386/i386-common.c
>> +++ b/gcc/common/config/i386/i386-common.c
>> @@ -1011,6 +1011,9 @@ static const struct default_options 
>> ix86_option_optimization_table[] =
>>  { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
>>  /* Enable function splitting at -O2 and higher.  */
>>  { OPT_LEVELS_2_PLUS, OPT_freorder_blocks_and_partition, NULL, 1 },
>> +/* The STC algorithm produces the smallest code at -Os, for x86.  */
>> +{ OPT_LEVELS_2_PLUS, OPT_freorder_blocks_algorithm_, NULL,
>> +  REORDER_BLOCKS_ALGORITHM_STC },
>>  /* Turn off -fschedule-insns by default.  It tends to make the
>> problem with not enough registers even worse.  */
>>  { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 },
>> --
>> 2.4.3


Re: [AArch64] Fix vqtb[lx][234] on big-endian

2015-11-07 Thread James Greenhalgh
On Fri, Nov 06, 2015 at 09:37:17PM +0100, Christophe Lyon wrote:
> On 6 November 2015 at 18:03, James Greenhalgh  
> wrote:
> > On Fri, Nov 06, 2015 at 02:49:38PM +0100, Christophe Lyon wrote:
> >> Hi,
> >>
> >> As mentioned by James a few weeks ago, the vqtb[lx][234] intrinsics
> >> are failing on aarch64_be.
> >>
> >> The attached patch fixes them, and rewrites them using new builtins
> >> instead of inline assembly.
> >>
> >> I wondered about the names of the new builtins, I hope I got them
> >> right: qtbl3, qtbl4, qtbx3, qtbx4 with v8qi and v16qi modes.
> >>
> >> I have modified the existing aarch64_tbl3v8qi and aarch64_tbx4v8qi to
> >> use <mode> and share the code with the v16qi variants.
> >>
> >> In arm_neon.h, I moved the rewritten intrinsics to the bottom of the
> >> file, in alphabetical order, although the comment says "Start of
> >> optimal implementations in approved order": the previous ones really
> >> seem to be in alphabetical order.
> >>
> >> And I added a new testcase, skipped for arm* targets.
> >>
> >> This has been tested on aarch64-none-elf and aarch64_be-none-elf
> >> targets, using the Foundation model.
> >>
> >> OK?
> >
> > Hi Christophe,
> >
> > Thanks for this. With this patch I think we can finally say that
> > aarch64_be Neon intrinsics are in as good a state as aarch64 Neon
> > intrinsics. On our internal testsuite the pass rate is now equivalent
> > between the two. I'm very grateful for your work in this area!
> 
> Thanks for the quick review, committed as r229886.
> 
> We are still missing many tests for most of the armv8 intrinsics.
> A significant effort, apparently not worth it since you say your
> internal testsuite is now clean.

The internal testsuite is of no use to the rest of the community and is
unlikely to be feasible to submit upstream, so I wouldn't write off
extending the (excellent) set of GCC tests you've been adding so far
as "not worth it". Certainly they were a big help for the big-endian
work.

> Actually, you say the pass rate is equivalent on little and
> big-endian: does it mean that it is not 100%?

Yes, I picked my words carefully :-)

The remaining failures are missing intrinsics and conformance issues when
the intrinsics are combined and folded. For an idea of what is missing,
take a look at the LLVM test-suite I pointed you at a few weeks ago:


/SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c

I'll try to get some of the "folding" examples into the upstream
bugzilla - generally they are issues where the semantics of the
intrinsic are well defined for signed overflow, but our use of C
constructs means the midend considers signed overflow undefined, and
performs more aggressive optimisation.
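
As a sketch of the kind of mismatch meant here (a made-up helper, not code
from arm_neon.h): an operation whose semantics are "wrap on overflow" cannot
be written as a plain signed C addition, because the compiler may assume that
addition never overflows.  Doing the arithmetic in unsigned and mapping back
avoids that.

```cpp
#include <cstdint>

// Wrapping 32-bit add done in unsigned arithmetic, where overflow is
// defined to wrap, then mapped back to the signed range without relying
// on an out-of-range unsigned-to-signed conversion.
inline std::int32_t wrapping_add_s32 (std::int32_t a, std::int32_t b)
{
  std::uint32_t r
    = static_cast<std::uint32_t> (a) + static_cast<std::uint32_t> (b);
  if (r <= static_cast<std::uint32_t> (INT32_MAX))
    return static_cast<std::int32_t> (r);
  return -static_cast<std::int32_t> (UINT32_MAX - r) - 1;
}
```

A plain `a + b` on int32_t operands would be undefined for INT32_MAX + 1,
whereas this helper returns INT32_MIN.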

Thanks,
James

> >
> > This patch is OK for trunk.
> >
> > Thanks again,
> > James
> >
> >>
> >> Christophe.
> >
> >> 2015-11-06  Christophe Lyon  
> >>
> >>   gcc/testsuite/
> >>   * gcc.target/aarch64/advsimd-intrinsics/vqtbX.c: New test.
> >>
> >>   gcc/
> >>   * config/aarch64/aarch64-simd-builtins.def: Update builtins
> >>   tables: add tbl3v16qi, qtbl[34]*, tbx4v16qi, qtbx[34]*.
> >>   * config/aarch64/aarch64-simd.md (aarch64_tbl3v8qi): Rename to...
> >>   (aarch64_tbl3) ... this, which supports v16qi too.
> >>   (aarch64_tbx4v8qi): Rename to...
> >>   (aarch64_tbx4): ... this.
> >>   (aarch64_qtbl3): New pattern.
> >>   (aarch64_qtbx3): New pattern.
> >>   (aarch64_qtbl4): New pattern.
> >>   (aarch64_qtbx4): New pattern.
> >>   * config/aarch64/arm_neon.h (vqtbl2_s8, vqtbl2_u8, vqtbl2_p8)
> >>   (vqtbl2q_s8, vqtbl2q_u8, vqtbl2q_p8, vqtbl3_s8, vqtbl3_u8)
> >>   (vqtbl3_p8, vqtbl3q_s8, vqtbl3q_u8, vqtbl3q_p8, vqtbl4_s8)
> >>   (vqtbl4_u8, vqtbl4_p8, vqtbl4q_s8, vqtbl4q_u8, vqtbl4q_p8)
> >>   (vqtbx2_s8, vqtbx2_u8, vqtbx2_p8, vqtbx2q_s8, vqtbx2q_u8)
> >>   (vqtbx2q_p8, vqtbx3_s8, vqtbx3_u8, vqtbx3_p8, vqtbx3q_s8)
> >>   (vqtbx3q_u8, vqtbx3q_p8, vqtbx4_s8, vqtbx4_u8, vqtbx4_p8)
> >>   (vqtbx4q_s8, vqtbx4q_u8, vqtbx4q_p8): Rewrite using builtin
> >>   functions.
> >
> 


***ping*** [Patch, fortran] PR68196 [4.9/5/6 Regression] ICE on function result with procedure pointer component

2015-11-07 Thread Paul Richard Thomas
***ping***

On 4 November 2015 at 16:03, Paul Richard Thomas
 wrote:
> Dear All,
>
> The patch for these PRs is fully explained by the comments and/or
> changelogs. PR66465 has no connection with PR68196, other than Damian
> asking if it is connected!
>
> Bootstrapped and regtested on x86_64/FC21 - OK for trunk and a few
> weeks later 4.9 and 5 branches?
>
> Cheers
>
> Paul
>
> 2015-11-04  Paul Thomas  
>
> PR fortran/68196
> * class.c (has_finalizer_component): Prevent infinite recursion
> through this function if the derived type and that of its
> component are the same.
> * trans-types.c (gfc_get_derived_type): Do the same for proc
> pointers by ignoring the explicit interface for the component.
>
> PR fortran/66465
> * check.c (same_type_check): If either of the expressions is
> BT_PROCEDURE, use the typespec from the symbol, rather than the
> expression.
>
> 2015-11-04  Paul Thomas  
>
> PR fortran/68196
> * gfortran.dg/proc_ptr_47.f90: New test.
>
> PR fortran/66465
> * gfortran.dg/pr66465.f90: New test.



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx


[PATCH 7/7] Configury changes for obstack optimization

2015-11-07 Thread Alan Modra
Provides defines used to determine whether glibc obstacks are
compatible.  Generally speaking, 32-bit targets won't need to use
obstack.o from libiberty if glibc is used, while 64-bit targets will,
until glibc gets the new obstack code.
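
A minimal sketch of the kind of test such a define could feed. The message
does not say how the define is consumed, so the predicate below is an
assumption: it treats the old (int-based) and new (size_t-based) obstack
interfaces as interchangeable only where the two types have the same width.
The function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical predicate, not part of the patch: old obstack code passes
   sizes as int, new code as size_t, so the two agree on calling
   conventions only where both types have the same width.  */
int
obstack_sizes_match (size_t sizeof_size_t, size_t sizeof_int)
{
  return sizeof_size_t == sizeof_int;
}

/* Evaluate the predicate for the target this file is compiled for.  */
int
host_obstack_sizes_match (void)
{
  return obstack_sizes_match (sizeof (size_t), sizeof (int));
}
```

On a typical ILP32 target this yields 1, and on a typical LP64 target 0,
which lines up with the 32-bit/64-bit split described above.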

* configure.ac: Check size of size_t.
* configure: Regenerate.

diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 868be8e..1ab5235 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -276,6 +276,7 @@ libiberty_AC_DECLARE_ERRNO
 # Determine sizes of some types.
 AC_CHECK_SIZEOF([int])
 AC_CHECK_SIZEOF([long])
+AC_CHECK_SIZEOF([size_t])
 
 # Check for presense of long long
 AC_CHECK_TYPE([long long],


[PATCH 6/7] Silence obstack.c -Wc++compat warning

2015-11-07 Thread Alan Modra
Fixes
warning: request for implicit conversion from ‘void *’ to ‘struct _obstack_chunk *’ not permitted in C++ [-Wc++-compat]

I moved the assignment to h->chunk to fix an overlong line, then
decided it would be better after the alloc failure check just to do
things the same way as in _obstack_newchunk.
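
The pattern can be sketched with toy types (everything named toy_* is
illustrative and not from obstack.c): cast the allocator's void * result
explicitly, check for failure, and only then store the pointer into the
owning structure.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the obstack types; names are illustrative only.  */
struct toy_chunk { char *limit; struct toy_chunk *prev; char contents[4]; };
struct toy_stack { struct toy_chunk *chunk; size_t chunk_size; };

/* Allocator returning void *, as a chunk function does.  */
static void *
toy_chunkfun (size_t size)
{
  return malloc (size);
}

/* Returns 1 on success, 0 on allocation failure.  */
int
toy_begin (struct toy_stack *h, size_t size)
{
  struct toy_chunk *chunk;

  /* Explicit cast: C converts void * implicitly, C++ does not.  */
  chunk = (struct toy_chunk *) toy_chunkfun (size);
  if (!chunk)
    return 0;
  /* Store into the stack only after the failure check.  */
  h->chunk = chunk;
  h->chunk_size = size;
  chunk->prev = NULL;
  chunk->limit = (char *) chunk + size;
  return 1;
}
```

Unlike this sketch, the real code calls obstack_alloc_failed_handler on
failure instead of returning a status.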

* obstack.c (_obstack_newchunk): Silence -Wc++compat warning.
(_obstack_begin_worker): Likewise.  Move assignment to h->chunk
after alloc failure check.

diff --git a/libiberty/obstack.c b/libiberty/obstack.c
index 9f34da1..6d8d672 100644
--- a/libiberty/obstack.c
+++ b/libiberty/obstack.c
@@ -138,9 +138,10 @@ _obstack_begin_worker (struct obstack *h,
   h->chunk_size = size;
   h->alignment_mask = alignment - 1;
 
-  chunk = h->chunk = call_chunkfun (h, h->chunk_size);
+  chunk = (struct _obstack_chunk *) call_chunkfun (h, h->chunk_size);
   if (!chunk)
 (*obstack_alloc_failed_handler) ();
+  h->chunk = chunk;
   h->next_free = h->object_base = __PTR_ALIGN ((char *) chunk, chunk->contents,
alignment - 1);
   h->chunk_limit = chunk->limit = (char *) chunk + h->chunk_size;
@@ -202,7 +203,7 @@ _obstack_newchunk (struct obstack *h, _OBSTACK_SIZE_T length)
 
   /* Allocate and initialize the new chunk.  */
   if (obj_size <= sum1 && sum1 <= sum2)
-new_chunk = call_chunkfun (h, new_size);
+new_chunk = (struct _obstack_chunk *) call_chunkfun (h, new_size);
   if (!new_chunk)
 (*obstack_alloc_failed_handler)();
   h->chunk = new_chunk;


[PATCH 5/7] Modify obstack.[hc] to avoid having to include other gnulib files

2015-11-07 Thread Alan Modra
Using the standard gnulib obstack source requires importing quite a
lot of other files from gnulib, and requires build changes.

If one did want to use gnulib obstack directly, then it would need to
go in a sub-directory and after ".../gnulib-tool --import obstack"
we'd have the following:

./lib:
alignof.h   gettext.h    obstack.h    stdlib.in.h     unistd.in.h
exitfail.c  Makefile.am  stddef.in.h  sys_types.in.h
exitfail.h  obstack.c    stdint.in.h  unistd.c

./m4:
00gnulib.m4         gnulib-comp.m4   obstack.m4   stdint.m4       wchar_t.m4
absolute-header.m4  gnulib-tool.m4   off_t.m4     stdlib_h.m4
extern-inline.m4    include_next.m4  onceonly.m4  sys_types_h.m4
gnulib-cache.m4     longlong.m4      ssize_t.m4   unistd_h.m4
gnulib-common.m4    multiarch.m4     stddef_h.m4  warn-on-use.m4

./snippet:
arg-nonnull.h  c++defs.h  _Noreturn.h  warn-on-use.h

include/
PR gdb/17133
* obstack.h (__attribute_pure__): Expand _GL_ATTRIBUTE_PURE.
libiberty/
PR gdb/17133
* obstack.c (__alignof__): Expand alignof_type from alignof.h.
(obstack_exit_failure): Don't use exitfail.h.
(_): Include libintl.h when HAVE_LIBINTL_H and nls enabled.
Provide default.  Don't include gettext.h.
(_Noreturn): Define.
* obstacks.texi: Adjust node references to external libc info files.
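
The __alignof__ entry above refers to an offsetof-based fallback. A minimal
sketch of that idea in C follows; TOY_ALIGNOF and the two wrapper functions
are illustrative names, and the macro only works for type names that can be
followed directly by a member name (so not array or function types).

```c
#include <stddef.h>

/* Illustrative macro, same idea as the C branch of the fallback: the
   offset of a member placed right after a single char equals that
   member's alignment requirement within a struct.  */
#define TOY_ALIGNOF(type) \
  offsetof (struct { char slot1; type slot2; }, slot2)

/* Small wrappers so the values can be inspected from ordinary code.  */
size_t toy_alignof_char (void) { return TOY_ALIGNOF (char); }
size_t toy_alignof_double (void) { return TOY_ALIGNOF (double); }
```

Some compilers warn about defining a type inside offsetof, which is
presumably one reason to prefer a built-in __alignof__ where one exists.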

diff --git a/include/obstack.h b/include/obstack.h
index 0ff3309..0d13c72 100644
--- a/include/obstack.h
+++ b/include/obstack.h
@@ -142,7 +142,11 @@
 P, A)
 
 #ifndef __attribute_pure__
-# define __attribute_pure__ _GL_ATTRIBUTE_PURE
+# if defined __GNUC_MINOR__ && __GNUC__ * 1000 + __GNUC_MINOR__ >= 2096
+#  define __attribute_pure__ __attribute__ ((__pure__))
+# else
+#  define __attribute_pure__
+# endif
 #endif
 
 #ifdef __cplusplus
diff --git a/libiberty/obstack.c b/libiberty/obstack.c
index 3b99dfa..9f34da1 100644
--- a/libiberty/obstack.c
+++ b/libiberty/obstack.c
@@ -51,9 +51,14 @@
 /* If GCC, or if an oddball (testing?) host that #defines __alignof__,
use the already-supplied __alignof__.  Otherwise, this must be Gnulib
(as glibc assumes GCC); defer to Gnulib's alignof_type.  */
-# if !defined __GNUC__ && !defined __alignof__
-#  include 
-#  define __alignof__(type) alignof_type (type)
+# if !defined __GNUC__ && !defined __IBM__ALIGNOF__ && !defined __alignof__
+#  if defined __cplusplus
+template  struct alignof_helper { char __slot1; type __slot2; };
+#   define __alignof__(type) offsetof (alignof_helper, __slot2)
+#  else
+#   define __alignof__(type) \
+  offsetof (struct { char __slot1; type __slot2; }, __slot2)
+#  endif
 # endif
 # include 
 # include 
@@ -309,17 +314,34 @@ _obstack_memory_used (struct obstack *h)
 #  ifdef _LIBC
 int obstack_exit_failure = EXIT_FAILURE;
 #  else
-#   include "exitfail.h"
-#   define obstack_exit_failure exit_failure
+#   ifndef EXIT_FAILURE
+#define EXIT_FAILURE 1
+#   endif
+#   define obstack_exit_failure EXIT_FAILURE
 #  endif
 
-#  ifdef _LIBC
+#  if defined _LIBC || (HAVE_LIBINTL_H && ENABLE_NLS)
 #   include 
+#   ifndef _
+#define _(msgid) gettext (msgid)
+#   endif
 #  else
-#   include "gettext.h"
+#   ifndef _
+#define _(msgid) (msgid)
+#   endif
 #  endif
-#  ifndef _
-#   define _(msgid) gettext (msgid)
+
+#  if !(defined _Noreturn\
+|| (defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112))
+#   if ((defined __GNUC__\
+&& (__GNUC__ >= 3 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 8)))\
+   || (defined __SUNPRO_C && __SUNPRO_C >= 0x5110))
+#define _Noreturn __attribute__ ((__noreturn__))
+#   elif defined _MSC_VER && _MSC_VER >= 1200
+#define _Noreturn __declspec (noreturn)
+#   else
+#define _Noreturn
+#   endif
 #  endif
 
 #  ifdef _LIBC
diff --git a/libiberty/obstacks.texi b/libiberty/obstacks.texi
index 1bfc878..b2d2403 100644
--- a/libiberty/obstacks.texi
+++ b/libiberty/obstacks.texi
@@ -93,7 +93,7 @@ them are freed.  These macros should appear before any use of obstacks
 in the source file.
 
 Usually these are defined to use @code{malloc} via the intermediary
-@code{xmalloc} (@pxref{Unconstrained Allocation}).  This is done with
+@code{xmalloc} (@pxref{Unconstrained Allocation, , , libc, The GNU C Library Reference Manual}).  This is done with
 the following pair of macro definitions:
 
 @smallexample
@@ -172,8 +172,8 @@ The value of this variable is a pointer to a function that
 @code{obstack} uses when @code{obstack_chunk_alloc} fails to allocate
 memory.  The default action is to print a message and abort.
 You should supply a function that either calls @code{exit}
-(@pxref{Program Termination}) or @code{longjmp} (@pxref{Non-Local
-Exits}) and doesn't return.
+(@pxref{Program Termination, , , libc, The GNU C Library Reference Manual}) or @code{longjmp} (@pxref{Non-Local
+Exits, , , 

[PATCH 4/7] Copy gnulib obstack files

2015-11-07 Thread Alan Modra
This copies obstack.[ch] from gnulib, and updates the docs.  The next
patch should be applied if someone repeats the import at a later date.

include/
PR gdb/17133
* obstack.h: Import current gnulib file.
libiberty/
PR gdb/17133
* obstack.c: Import current gnulib file.
* obstacks.texi: Updated doc, from glibc's manual/memory.texi.

diff --git a/include/obstack.h b/include/obstack.h
index 9759af4..0ff3309 100644
--- a/include/obstack.h
+++ b/include/obstack.h
@@ -1,106 +1,102 @@
 /* obstack.h - object stack macros
Copyright (C) 1988-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
 
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
 
-   NOTE: The canonical source of this file is maintained with the GNU C Library.
-   Bugs can be reported to bug-gl...@gnu.org.
-
-   This program is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published by the
-   Free Software Foundation; either version 2, or (at your option) any
-   later version.
-
-   This program is distributed in the hope that it will be useful,
+   The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
 
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301,
-   USA.  */
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
 
 /* Summary:
 
-All the apparent functions defined here are macros. The idea
-is that you would use these pre-tested macros to solve a
-very specific set of problems, and they would run fast.
-Caution: no side-effects in arguments please!! They may be
-evaluated MANY times!!
-
-These macros operate a stack of objects.  Each object starts life
-small, and may grow to maturity.  (Consider building a word syllable
-by syllable.)  An object can move while it is growing.  Once it has
-been "finished" it never changes address again.  So the "top of the
-stack" is typically an immature growing object, while the rest of the
-stack is of mature, fixed size and fixed address objects.
-
-These routines grab large chunks of memory, using a function you
-supply, called `obstack_chunk_alloc'.  On occasion, they free chunks,
-by calling `obstack_chunk_free'.  You must define them and declare
-them before using any obstack macros.
-
-Each independent stack is represented by a `struct obstack'.
-Each of the obstack macros expects a pointer to such a structure
-as the first argument.
-
-One motivation for this package is the problem of growing char strings
-in symbol tables.  Unless you are "fascist pig with a read-only mind"
---Gosper's immortal quote from HAKMEM item 154, out of context--you
-would not like to put any arbitrary upper limit on the length of your
-symbols.
-
-In practice this often means you will build many short symbols and a
-few long symbols.  At the time you are reading a symbol you don't know
-how long it is.  One traditional method is to read a symbol into a
-buffer, realloc()ating the buffer every time you try to read a symbol
-that is longer than the buffer.  This is beaut, but you still will
-want to copy the symbol from the buffer to a more permanent
-symbol-table entry say about half the time.
-
-With obstacks, you can work differently.  Use one obstack for all symbol
-names.  As you read a symbol, grow the name in the obstack gradually.
-When the name is complete, finalize it.  Then, if the symbol exists already,
-free the newly read name.
-
-The way we do this is to take a large chunk, allocating memory from
-low addresses.  When you want to build a symbol in the chunk you just
-add chars above the current "high water mark" in the chunk.  When you
-have finished adding chars, because you got to the end of the symbol,
-you know how long the chars are, and you can create a new object.
-Mostly the chars will not burst over the highest address of the chunk,
-because you would typically expect a chunk to be (say) 100 times as
-long as an average object.
-
-In case that isn't clear, when we have enough chars to make up
-the object, THEY ARE ALREADY CONTIGUOUS IN THE CHUNK (guaranteed)
-so we just point to it where it lies.  No moving of chars is
-needed and this is the second win: potentially 

[PATCH 3/7] Update libsanitizer obstack interceptors

2015-11-07 Thread Alan Modra
New obstack uses sensible types, size_t instead of int for length
params.  Since libsanitizer does not use prototypes from obstack.h to
call the real functions, it's necessary to update the libsanitizer
function declarations emitted by the INTERCEPTOR macro.

As per the comment added to configure.ac, it would be nice if we could
update to a more recent autoconf, but what I have should do given the
limited target support for libsanitizer.

I'll be pushing this one upstream too, when I figure out something
reasonable for cmake.

* sanitizer_common/sanitizer_common_interceptors.inc: Update size
params for _obstack_begin_1, _obstack_begin, _obstack_newchunk
interceptors.
* configure.ac: Substitute OBSTACK_DEFS.
* asan/Makefile.am: Add OBSTACK_DEFS to DEFS.
* tsan/Makefile.am: Likewise.
* configure: Regenerate.
* Makefile.in: Regenerate.
* asan/Makefile.in: Regenerate.
* interception/Makefile.in: Regenerate.
* libbacktrace/Makefile.in: Regenerate.
* lsan/Makefile.in: Regenerate.
* sanitizer_common/Makefile.in: Regenerate.
* tsan/Makefile.in: Regenerate.
* ubsan/Makefile.in: Regenerate.

diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index 9b8c77e..92b9027 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -4874,8 +4874,9 @@ static void initialize_obstack(__sanitizer_obstack *obstack) {
 sizeof(*obstack->chunk));
 }
 
-INTERCEPTOR(int, _obstack_begin_1, __sanitizer_obstack *obstack, int sz,
-int align, void *(*alloc_fn)(uptr arg, uptr sz),
+INTERCEPTOR(int, _obstack_begin_1, __sanitizer_obstack *obstack,
+_OBSTACK_SIZE_T sz, _OBSTACK_SIZE_T align,
+void *(*alloc_fn)(uptr arg, SIZE_T sz),
 void (*free_fn)(uptr arg, void *p)) {
   void *ctx;
   COMMON_INTERCEPTOR_ENTER(ctx, _obstack_begin_1, obstack, sz, align, alloc_fn,
@@ -4884,8 +4885,10 @@ INTERCEPTOR(int, _obstack_begin_1, __sanitizer_obstack *obstack, int sz,
   if (res) initialize_obstack(obstack);
   return res;
 }
-INTERCEPTOR(int, _obstack_begin, __sanitizer_obstack *obstack, int sz,
-int align, void *(*alloc_fn)(uptr sz), void (*free_fn)(void *p)) {
+INTERCEPTOR(int, _obstack_begin, __sanitizer_obstack *obstack,
+_OBSTACK_SIZE_T sz, _OBSTACK_SIZE_T align,
+void *(*alloc_fn)(SIZE_T sz),
+void (*free_fn)(void *p)) {
   void *ctx;
   COMMON_INTERCEPTOR_ENTER(ctx, _obstack_begin, obstack, sz, align, alloc_fn,
free_fn);
@@ -4893,7 +4896,8 @@ INTERCEPTOR(int, _obstack_begin, __sanitizer_obstack *obstack, int sz,
   if (res) initialize_obstack(obstack);
   return res;
 }
-INTERCEPTOR(void, _obstack_newchunk, __sanitizer_obstack *obstack, int length) {
+INTERCEPTOR(void, _obstack_newchunk, __sanitizer_obstack *obstack,
+_OBSTACK_SIZE_T length) {
   void *ctx;
   COMMON_INTERCEPTOR_ENTER(ctx, _obstack_newchunk, obstack, length);
   REAL(_obstack_newchunk)(obstack, length);
diff --git a/libsanitizer/configure.ac b/libsanitizer/configure.ac
index 81fd46d..72b13a1 100644
--- a/libsanitizer/configure.ac
+++ b/libsanitizer/configure.ac
@@ -335,6 +335,30 @@ fi
 
 AC_SUBST([RPC_DEFS], [$rpc_defs])
 
+dnl If this file is processed by autoconf-2.67 or later then the CPPFLAGS
+dnl "-o conftest.iii" can disappear, conftest.iii be replaced with
+dnl conftest.i in the sed command line, and the rm deleted.
+dnl Not all cpp's accept -o, and gcc -E does not accept a second file
+dnl argument as the output file.
+AC_CACHE_CHECK([obstack params],
+[libsanitizer_cv_sys_obstack],
+[save_cppflags=$CPPFLAGS
+CPPFLAGS="-I${srcdir}/../include -o conftest.iii $CPPFLAGS"
+AC_PREPROC_IFELSE([AC_LANG_SOURCE([
+#include "obstack.h"
+#ifdef _OBSTACK_SIZE_T
+_OBSTACK_SIZE_T
+#else
+int
+#endif
+])],
+[libsanitizer_cv_sys_obstack=`sed -e '/^#/d;/^[ ]*$/d' conftest.iii | sed -e '$!d;s/size_t/SIZE_T/'`],
+[libsanitizer_cv_sys_obstack=int])
+CPPFLAGS=$save_cppflags
+rm -f conftest.iii
+])
+AC_SUBST([OBSTACK_DEFS], [-D_OBSTACK_SIZE_T=\"$libsanitizer_cv_sys_obstack\"])
+
 AM_CONDITIONAL(LIBBACKTRACE_SUPPORTED,
   [test "x${BACKTRACE_SUPPORTED}x${BACKTRACE_USES_MALLOC}" = "x1x0"])
 
diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am
index bd3cd73..4500e21 100644
--- a/libsanitizer/asan/Makefile.am
+++ b/libsanitizer/asan/Makefile.am
@@ -3,7 +3,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 
-DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DASAN_HAS_EXCEPTIONS=1 -DASAN_NEEDS_SEGV=1 -DCAN_SANITIZE_UB=0
+DEFS =

[PATCH 2/7] Correct libvtv obstack use

2015-11-07 Thread Alan Modra
Fixes a compile error with both old and new obstacks due to
obstack_chunk_free having the wrong signature.  Also, setting chunk
size and alignment before obstack_init is pointless since they are
overwritten.
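
A minimal sketch of the corrected usage, assuming a glibc-style <obstack.h>
is available on the host. The helper names (chunk_alloc, chunk_free,
demo_obstack_roundtrip) are illustrative and not from vtv_malloc.cc; the
point is that the free function takes a void *, and that chunk size and
alignment are passed to obstack_specify_allocation rather than being set
beforehand.

```c
#include <obstack.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Chunk allocator: takes the requested size, returns the chunk.  */
static void *
chunk_alloc (size_t size)
{
  return malloc (size);
}

/* Chunk deallocator: takes a pointer to the chunk, not a size.  */
static void
chunk_free (void *p)
{
  free (p);
}

/* Copy TEXT into a fresh obstack, measure it there, release everything.  */
size_t
demo_obstack_roundtrip (const char *text)
{
  struct obstack ob;
  char *copy;
  size_t len;

  /* Size, alignment and both functions are supplied in one call.  */
  obstack_specify_allocation (&ob, 4096, sizeof (long),
                              chunk_alloc, chunk_free);
  copy = (char *) obstack_copy0 (&ob, text, strlen (text));
  len = strlen (copy);
  obstack_free (&ob, NULL);
  return len;
}
```

Passing NULL to obstack_free releases every chunk, so the obstack would
need to be initialized again before further use.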

* vtv_malloc.cc (obstack_chunk_free): Correct param type.
(__vtv_malloc_init): Use obstack_specify_allocation.

diff --git a/libvtv/vtv_malloc.cc b/libvtv/vtv_malloc.cc
index ecd07eb..ea26b82 100644
--- a/libvtv/vtv_malloc.cc
+++ b/libvtv/vtv_malloc.cc
@@ -194,7 +194,7 @@ obstack_chunk_alloc (size_t size)
 }
 
 static void
-obstack_chunk_free (size_t)
+obstack_chunk_free (void *)
 {
   /* Do nothing. For our purposes there should be very little
  de-allocation. */
@@ -217,14 +217,13 @@ __vtv_malloc_init (void)
 #endif
 VTV_error ();
 
-  obstack_chunk_size (&vtv_obstack) = VTV_PAGE_SIZE;
-  obstack_alignment_mask (&vtv_obstack) = sizeof (long) - 1;
   /* We guarantee that the obstack alloc failed handler will never be
  called because in case the allocation of the chunk fails, it will
  never return */
   obstack_alloc_failed_handler = NULL;
 
-  obstack_init (&vtv_obstack);
+  obstack_specify_allocation (&vtv_obstack, VTV_PAGE_SIZE, sizeof (long),
+ obstack_chunk_alloc, obstack_chunk_free);
   malloc_initialized = 1;
 }
 


[PATCH 1/7] New obstack_next_free is not an lvalue

2015-11-07 Thread Alan Modra
New obstack.h casts obstack_next_free to (void *), resulting in it
being a non-lvalue, and warnings on pointer arithmetic.
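
The effect can be sketched with a toy structure (all toy_* / TOY_* names
are illustrative, not the real obstack definitions): once the accessor
casts to void *, callers have to cast to char * before doing arithmetic,
and have to shrink the object some other way than assigning through the
accessor.

```c
#include <stddef.h>

/* Toy stand-in for struct obstack; names are illustrative only.  */
struct toy_stack { char *object_base; char *next_free; };

/* Like the new obstack_next_free: the cast makes the result a void *
   rvalue, so it can be neither assigned to nor used in arithmetic
   without a further cast.  */
#define TOY_NEXT_FREE(s) ((void *) (s)->next_free)

/* Shrink the growing object by N bytes without assigning through
   TOY_NEXT_FREE.  */
void
toy_shrink (struct toy_stack *s, size_t n)
{
  s->next_free -= n;
}

/* Last byte of the growing object; the caller guarantees it is
   non-empty.  */
char
toy_last_char (struct toy_stack *s)
{
  return *((char *) TOY_NEXT_FREE (s) - 1);
}
```

In the patch itself the shrinking is done with obstack_blank_fast and the
reads use a (char *) cast on obstack_next_free.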

gcc/
* gensupport.c (add_mnemonic_string): Make len param a size_t.
(gen_mnemonic_setattr): Make "size" var a size_t.  Use
obstack_blank_fast to shrink obstack.  Cast obstack_next_free
return value.
gcc/objc/
* objc-encoding.c (encode_aggregate_within): Cast obstack_next_free
return value.

diff --git a/gcc/gensupport.c b/gcc/gensupport.c
index 0480e17..484ead2 100644
--- a/gcc/gensupport.c
+++ b/gcc/gensupport.c
@@ -2253,7 +2253,7 @@ htab_eq_string (const void *s1, const void *s2)
and a permanent heap copy of STR is created.  */
 
 static void
-add_mnemonic_string (htab_t mnemonic_htab, const char *str, int len)
+add_mnemonic_string (htab_t mnemonic_htab, const char *str, size_t len)
 {
   char *new_str;
   void **slot;
@@ -2306,7 +2306,7 @@ gen_mnemonic_setattr (htab_t mnemonic_htab, rtx insn)
   for (i = 0; *cp; )
 {
   const char *ep, *sp;
-  int size = 0;
+  size_t size = 0;
 
   while (ISSPACE (*cp))
cp++;
@@ -2333,8 +2333,7 @@ gen_mnemonic_setattr (htab_t mnemonic_htab, rtx insn)
{
  /* Don't set a value if there are more than one
 instruction in the string.  */
- obstack_next_free (&string_obstack) =
-   obstack_next_free (&string_obstack) - size;
+ obstack_blank_fast (&string_obstack, -size);
  size = 0;
 
  cp = sp;
@@ -2346,7 +2345,7 @@ gen_mnemonic_setattr (htab_t mnemonic_htab, rtx insn)
obstack_1grow (&string_obstack, '*');
   else
add_mnemonic_string (mnemonic_htab,
-obstack_next_free (&string_obstack) - size,
+(char *) obstack_next_free (&string_obstack) - size,
 size);
   i++;
 }
diff --git a/gcc/objc/objc-encoding.c b/gcc/objc/objc-encoding.c
index 4848021..9c577e9 100644
--- a/gcc/objc/objc-encoding.c
+++ b/gcc/objc/objc-encoding.c
@@ -495,13 +495,14 @@ encode_aggregate_within (tree type, int curtype, int format, int left,
 
   if (flag_next_runtime)
 {
-  if (ob_size > 0  &&  *(obstack_next_free (&util_obstack) - 1) == '^')
+  if (ob_size > 0
+ && *((char *) obstack_next_free (&util_obstack) - 1) == '^')
pointed_to = true;
 
   if ((format == OBJC_ENCODE_INLINE_DEFS || generating_instance_variables)
  && (!pointed_to || ob_size - curtype == 1
  || (ob_size - curtype == 2
- && *(obstack_next_free (&util_obstack) - 2) == 'r')))
+ && *((char *) obstack_next_free (&util_obstack) - 2) == 'r')))
inline_contents = true;
 }
   else
@@ -512,9 +513,10 @@ encode_aggregate_within (tree type, int curtype, int format, int left,
 comment above applies: in that case we should avoid encoding
 the names of instance variables.
   */
-  char c1 = ob_size > 1 ? *(obstack_next_free (&util_obstack) - 2) : 0;
-  char c0 = ob_size > 0 ? *(obstack_next_free (&util_obstack) - 1) : 0;
+  char c0, c1;
 
+  c1 = ob_size > 1 ? *((char *) obstack_next_free (&util_obstack) - 2) : 0;
+  c0 = ob_size > 0 ? *((char *) obstack_next_free (&util_obstack) - 1) : 0;
   if (c0 == '^' || (c1 == '^' && c0 == 'r'))
pointed_to = true;
 


[PATCH 0/7] 64-bit obstack support in libiberty

2015-11-07 Thread Alan Modra
This patchset imports new obstack support to libiberty, to better
support 64-bit systems, and fix an old gdb bug.  Most of the necessary
changes outside of libiberty were committed October last year, but a
few more incompatibilities have crept in since then.  The first three
patches fix these problems.  Patch 4 does the import from gnulib, and
edits the docs as if they had been imported from glibc.  Patch 5 makes
modifications for libiberty.  Patch 6 is a warning fix that I'll see
about pushing upstream, and finally, patch 7 supplies a define used to
determine whether libiberty needs obstack.o.

The cumulative patch series was bootstrapped and regression tested on
x86_64-linux, and also after just the first three patches.

Alan Modra (7):
  New obstack_next_free is not an lvalue
  Correct libvtv obstack use
  Update libsanitizer obstack interceptors
  Copy gnulib obstack files
  Modify obstack.[hc] to avoid having to include other gnulib files
  Silence obstack.c -Wc++compat warning
  Configury changes for obstack optimization

 gcc/gensupport.c   |   9 +-
 gcc/objc/objc-encoding.c   |  10 +-
 include/obstack.h  | 910 ++---
 libiberty/configure|  58 ++
 libiberty/configure.ac |   1 +
 libiberty/obstack.c| 570 +
 libiberty/obstacks.texi| 257 +++---
 libsanitizer/Makefile.in   |   1 +
 libsanitizer/asan/Makefile.am  |   2 +-
 libsanitizer/asan/Makefile.in  |   3 +-
 libsanitizer/configure |  38 +-
 libsanitizer/configure.ac  |  24 +
 libsanitizer/interception/Makefile.in  |   1 +
 libsanitizer/libbacktrace/Makefile.in  |   1 +
 libsanitizer/lsan/Makefile.in  |   1 +
 libsanitizer/sanitizer_common/Makefile.in  |   1 +
 .../sanitizer_common_interceptors.inc  |  14 +-
 libsanitizer/tsan/Makefile.am  |   2 +-
 libsanitizer/tsan/Makefile.in  |   3 +-
 libsanitizer/ubsan/Makefile.in |   1 +
 libvtv/vtv_malloc.cc   |   7 +-
 21 files changed, 957 insertions(+), 957 deletions(-)