[Committed] S/390: Add missing constraints in builtin patterns

2017-03-08 Thread Andreas Krebbel
gcc/ChangeLog:

2017-03-09  Andreas Krebbel  

* config/s390/vx-builtins.md ("vfee<mode>", "vfeez<mode>")
("vfenez<mode>"): Add missing constraints.
---
 gcc/config/s390/vx-builtins.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 1e9010a..6aff378 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -1351,9 +1351,9 @@
 
 ; vfeeb, vfeeh, vfeef
 (define_insn "vfee<mode>"
-  [(set (match_operand:VI_HW_QHS 0 "register_operand" "")
-   (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "")
-  (match_operand:VI_HW_QHS 2 "register_operand" "")
+  [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
+   (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand"  "v")
+  (match_operand:VI_HW_QHS 2 "register_operand"  "v")
   (const_int 0)]
  UNSPEC_VEC_VFEE))]
   "TARGET_VX"
@@ -1362,9 +1362,9 @@
 
 ; vfeezb, vfeezh, vfeezf
 (define_insn "vfeez<mode>"
-  [(set (match_operand:VI_HW_QHS 0 "register_operand" "")
-   (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "")
-  (match_operand:VI_HW_QHS 2 "register_operand" "")
+  [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
+   (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand"  "v")
+  (match_operand:VI_HW_QHS 2 "register_operand"  "v")
   (const_int VSTRING_FLAG_ZS)]
  UNSPEC_VEC_VFEE))]
   "TARGET_VX"
@@ -1423,9 +1423,9 @@
 
 ; vfenezb, vfenezh, vfenezf
 (define_insn "vfenez<mode>"
-  [(set (match_operand:VI_HW_QHS 0 "register_operand" "")
-   (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "")
-  (match_operand:VI_HW_QHS 2 "register_operand" "")
+  [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
+   (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand"  "v")
+  (match_operand:VI_HW_QHS 2 "register_operand"  "v")
   (const_int VSTRING_FLAG_ZS)]
  UNSPEC_VEC_VFENE))]
   "TARGET_VX"
-- 
2.9.1



Re: [PATCH][AArch64] Implement ALU_BRANCH fusion

2017-03-08 Thread Hurugalawadi, Naveen
Hi James,

Thanks for the review and your comments.

>> I'd need more detail on what types of instruction pairs you
>> are trying to fuse. 

The documentation describes it as follows:
Single uop ALU instruction may fuse with adjacent branch instruction in the 
same bundle

>> This comment looks incorrect - there is no vulcan_alu_basic reservation

Modified as per comment.

Please let us know whether the description is sufficient.

Thanks,
Naveen

New German PO file for 'gcc' (version 7.1-b20170226)

2017-03-08 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

http://translationproject.org/latest/gcc/de.po

(This file, 'gcc-7.1-b20170226.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




C++ PATCH for c++/79797, ICE with array NSDMI

2017-03-08 Thread Jason Merrill
In this testcase, ctx->object was the array subobject and 'this'
refers to the containing object, so we failed.  Fixed by generalizing
PLACEHOLDER_EXPR handling more (in a way that replace_placeholders_r
already handles).

Tested x86_64-pc-linux-gnu, applying to trunk.
commit a383ab56e2ea367509255f5a6b95fb5ad0b20f88
Author: Jason Merrill 
Date:   Wed Mar 8 16:03:14 2017 -0500

PR c++/79797 - ICE with self-reference in array DMI.

* constexpr.c (lookup_placeholder): Split out...
(cxx_eval_constant_expression): ...from here.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index f114da0..2510e23e 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3827,6 +3827,35 @@ cxx_eval_switch_expr (const constexpr_ctx *ctx, tree t,
   return NULL_TREE;
 }
 
+/* Find the object of TYPE under initialization in CTX.  */
+
+static tree
+lookup_placeholder (const constexpr_ctx *ctx, bool lval, tree type)
+{
+  if (!ctx || !ctx->ctor || (lval && !ctx->object))
+return NULL_TREE;
+
+  /* We could use ctx->object unconditionally, but using ctx->ctor when we
+ can is a minor optimization.  */
+  if (!lval && same_type_p (TREE_TYPE (ctx->ctor), type))
+return ctx->ctor;
+
+  /* Since an object cannot have a field of its own type, we can search outward
+ from ctx->object to find the unique containing object of TYPE.  */
+  tree ob = ctx->object;
+  while (ob)
+{
+  if (same_type_ignoring_top_level_qualifiers_p (TREE_TYPE (ob), type))
+   break;
+  if (handled_component_p (ob))
+   ob = TREE_OPERAND (ob, 0);
+  else
+   ob = NULL_TREE;
+}
+
+  return ob;
+}
+
 /* Attempt to reduce the expression T to a constant value.
On failure, issue diagnostic and return error_mark_node.  */
 /* FIXME unify with c_fully_fold */
@@ -4468,27 +4497,15 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
   break;
 
 case PLACEHOLDER_EXPR:
-  if (!ctx || !ctx->ctor || (lval && !ctx->object)
- || !(same_type_ignoring_top_level_qualifiers_p
-  (TREE_TYPE (t), TREE_TYPE (ctx->ctor))))
-   {
- /* A placeholder without a referent.  We can get here when
-checking whether NSDMIs are noexcept, or in massage_init_elt;
-just say it's non-constant for now.  */
- gcc_assert (ctx->quiet);
- *non_constant_p = true;
- break;
-   }
-  else
-   {
- /* Use of the value or address of the current object.  We could
-use ctx->object unconditionally, but using ctx->ctor when we
-can is a minor optimization.  */
- tree ctor = lval ? ctx->object : ctx->ctor;
- return cxx_eval_constant_expression
-   (ctx, ctor, lval,
-non_constant_p, overflow_p);
-   }
+  /* Use of the value or address of the current object.  */
+  if (tree ctor = lookup_placeholder (ctx, lval, TREE_TYPE (t)))
+   return cxx_eval_constant_expression (ctx, ctor, lval,
+non_constant_p, overflow_p);
+  /* A placeholder without a referent.  We can get here when
+checking whether NSDMIs are noexcept, or in massage_init_elt;
+just say it's non-constant for now.  */
+  gcc_assert (ctx->quiet);
+  *non_constant_p = true;
   break;
 
 case EXIT_EXPR:
diff --git a/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr7.C b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr7.C
new file mode 100644
index 0000000..2134ac9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr7.C
@@ -0,0 +1,12 @@
+// PR c++/79797
+// { dg-do compile { target c++14 } }
+
+struct A
+{
+  A* x[1]{(A*)this};
+};
+
+extern constexpr A a{};
+
+#define SA(X) static_assert ((X), #X)
+SA (a.x[0] == &a);


Re: C++ PATCH to fix ICE in strip_typedefs with -Wpadded (PR c++/79900)

2017-03-08 Thread Jason Merrill
OK.

On Wed, Mar 8, 2017 at 6:00 PM, Marek Polacek  wrote:
> We crash on an assert in strip_typedefs, because we find ourselves in a
> scenario where RESULT, the main variant of a struct, was modified in
> finalize_record_size (its TYPE_ALIGN changed), but its variants (T in
> strip_typedefs) weren't fixed-up yet; that will happen slightly later in
> finalize_type_size.  So there's a discrepancy that confuses the code.
>
> This patch implements what Jason suggested to me on IRC, i.e., skip the
> attribute handling if RESULT is complete and T isn't.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2017-03-08  Marek Polacek  
>
> PR c++/79900 - ICE in strip_typedefs
> * tree.c (strip_typedefs): Skip the attribute handling if T is
> a variant type which hasn't been updated yet.
>
> * g++.dg/warn/Wpadded-1.C: New test.
>
> diff --git gcc/cp/tree.c gcc/cp/tree.c
> index d3c63b8..b3a4a10 100644
> --- gcc/cp/tree.c
> +++ gcc/cp/tree.c
> @@ -1548,29 +1548,40 @@ strip_typedefs (tree t, bool *remove_attributes)
> result = TYPE_MAIN_VARIANT (t);
>  }
>gcc_assert (!typedef_variant_p (result));
> -  if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result)
> -  || TYPE_ALIGN (t) != TYPE_ALIGN (result))
> +
> +  if (COMPLETE_TYPE_P (result) && !COMPLETE_TYPE_P (t))
> +  /* If RESULT is complete and T isn't, it's likely the case that T
> + is a variant of RESULT which hasn't been updated yet.  Skip the
> + attribute handling.  */;
> +  else
>  {
> -  gcc_assert (TYPE_USER_ALIGN (t));
> -  if (remove_attributes)
> -   *remove_attributes = true;
> -  else
> +  if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result)
> + || TYPE_ALIGN (t) != TYPE_ALIGN (result))
> {
> - if (TYPE_ALIGN (t) == TYPE_ALIGN (result))
> -   result = build_variant_type_copy (result);
> + gcc_assert (TYPE_USER_ALIGN (t));
> + if (remove_attributes)
> +   *remove_attributes = true;
>   else
> -   result = build_aligned_type (result, TYPE_ALIGN (t));
> - TYPE_USER_ALIGN (result) = true;
> +   {
> + if (TYPE_ALIGN (t) == TYPE_ALIGN (result))
> +   result = build_variant_type_copy (result);
> + else
> +   result = build_aligned_type (result, TYPE_ALIGN (t));
> + TYPE_USER_ALIGN (result) = true;
> +   }
> +   }
> +
> +  if (TYPE_ATTRIBUTES (t))
> +   {
> + if (remove_attributes)
> +   result = apply_identity_attributes (result, TYPE_ATTRIBUTES (t),
> +   remove_attributes);
> + else
> +   result = cp_build_type_attribute_variant (result,
> + TYPE_ATTRIBUTES (t));
> }
>  }
> -  if (TYPE_ATTRIBUTES (t))
> -{
> -  if (remove_attributes)
> -   result = apply_identity_attributes (result, TYPE_ATTRIBUTES (t),
> -   remove_attributes);
> -  else
> -   result = cp_build_type_attribute_variant (result, TYPE_ATTRIBUTES 
> (t));
> -}
> +
>return cp_build_qualified_type (result, cp_type_quals (t));
>  }
>
> diff --git gcc/testsuite/g++.dg/warn/Wpadded-1.C gcc/testsuite/g++.dg/warn/Wpadded-1.C
> index e69de29..b3f0581 100644
> --- gcc/testsuite/g++.dg/warn/Wpadded-1.C
> +++ gcc/testsuite/g++.dg/warn/Wpadded-1.C
> @@ -0,0 +1,22 @@
> +// PR c++/79900 - ICE in strip_typedefs
> +// { dg-do compile }
> +// { dg-options "-Wpadded" }
> +
> +template  struct A;
> +template  struct B { // { dg-warning "padding struct size to alignment boundary" }
> +  long _M_off;
> +  int _M_state;
> +};
> +template <> struct A { typedef B pos_type; };
> +enum _Ios_Openmode {};
> +struct C {
> +  typedef _Ios_Openmode openmode;
> +};
> +template  struct D {
> +  typedef typename _Traits::pos_type pos_type;
> +  pos_type m_fn1(pos_type, C::openmode);
> +};
> +template class D;
> +template 
> +typename D<_CharT, _Traits>::pos_type D<_CharT, _Traits>::m_fn1(pos_type x,
> +C::openmode) { return x; }
>
> Marek


Re: [PATCH] -Wduplicated-branches -fopenmp ICE in inchash::add_expr (PR c++/79672)

2017-03-08 Thread Jakub Jelinek
On Thu, Mar 09, 2017 at 12:02:17AM +0100, Marek Polacek wrote:
> Ping.
> 
> On Wed, Mar 01, 2017 at 08:09:05PM +0100, Marek Polacek wrote:
> > The following testcase ICEd with -Wduplicated-branches and -fopenmp
> > because we tried to hash an omp_parallel expression that contained some
> > TREE_VECs, but those aren't handled in inchash::add_expr.  Handling
> > that is easy and fixes the ICE.
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > 
> > 2017-03-01  Marek Polacek  
> > 
> > PR c++/79672
> > * tree.c (inchash::add_expr): Handle TREE_VEC.
> > 
> > * g++.dg/warn/Wduplicated-branches2.C: Fix PR.
> > * g++.dg/warn/Wduplicated-branches3.C: New test.

Ok, thanks.

Jakub


Re: [PATCH] -Wduplicated-branches -fopenmp ICE in inchash::add_expr (PR c++/79672)

2017-03-08 Thread Marek Polacek
Ping.

On Wed, Mar 01, 2017 at 08:09:05PM +0100, Marek Polacek wrote:
> The following testcase ICEd with -Wduplicated-branches and -fopenmp
> because we tried to hash an omp_parallel expression that contained some
> TREE_VECs, but those aren't handled in inchash::add_expr.  Handling
> that is easy and fixes the ICE.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2017-03-01  Marek Polacek  
> 
>   PR c++/79672
>   * tree.c (inchash::add_expr): Handle TREE_VEC.
> 
>   * g++.dg/warn/Wduplicated-branches2.C: Fix PR.
>   * g++.dg/warn/Wduplicated-branches3.C: New test.
> 
> diff --git gcc/testsuite/g++.dg/warn/Wduplicated-branches2.C gcc/testsuite/g++.dg/warn/Wduplicated-branches2.C
> index 4da2d54..7e14c5f 100644
> --- gcc/testsuite/g++.dg/warn/Wduplicated-branches2.C
> +++ gcc/testsuite/g++.dg/warn/Wduplicated-branches2.C
> @@ -1,4 +1,4 @@
> -// PR c/6427
> +// PR c/64279
>  // { dg-do compile { target c++11 } }
>  // { dg-options "-Wduplicated-branches" }
>  
> diff --git gcc/testsuite/g++.dg/warn/Wduplicated-branches3.C gcc/testsuite/g++.dg/warn/Wduplicated-branches3.C
> index e69de29..26dab85 100644
> --- gcc/testsuite/g++.dg/warn/Wduplicated-branches3.C
> +++ gcc/testsuite/g++.dg/warn/Wduplicated-branches3.C
> @@ -0,0 +1,18 @@
> +// PR c++/79672
> +// { dg-do compile }
> +// { dg-options "-Wduplicated-branches -fopenmp" }
> +// { dg-require-effective-target fopenmp }
> +
> +template void foo()
> +{
> +  if (N > 0)
> +  {
> +#pragma omp parallel for
> +for (int i = 0; i < 10; ++i) ;
> +  }
> +}
> +
> +void bar()
> +{
> +  foo<0>();
> +}
> diff --git gcc/tree.c gcc/tree.c
> index 42c8a2d..8f87e7c 100644
> --- gcc/tree.c
> +++ gcc/tree.c
> @@ -7865,6 +7865,10 @@ add_expr (const_tree t, inchash::hash &hstate, unsigned int flags)
> inchash::add_expr (tsi_stmt (i), hstate, flags);
>   return;
>}
> +case TREE_VEC:
> +  for (int i = 0; i < TREE_VEC_LENGTH (t); ++i)
> + inchash::add_expr (TREE_VEC_ELT (t, i), hstate, flags);
> +  return;
>  case FUNCTION_DECL:
>/* When referring to a built-in FUNCTION_DECL, use the __builtin__ form.
>Otherwise nodes that compare equal according to operand_equal_p might
> 
>   Marek

Marek


C++ PATCH to fix ICE in strip_typedefs with -Wpadded (PR c++/79900)

2017-03-08 Thread Marek Polacek
We crash on an assert in strip_typedefs, because we find ourselves in a
scenario where RESULT, the main variant of a struct, was modified in
finalize_record_size (its TYPE_ALIGN changed), but its variants (T in
strip_typedefs) weren't fixed-up yet; that will happen slightly later in
finalize_type_size.  So there's a discrepancy that confuses the code.

This patch implements what Jason suggested to me on IRC, i.e., skip the
attribute handling if RESULT is complete and T isn't.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-03-08  Marek Polacek  

PR c++/79900 - ICE in strip_typedefs
* tree.c (strip_typedefs): Skip the attribute handling if T is
a variant type which hasn't been updated yet.

* g++.dg/warn/Wpadded-1.C: New test.

diff --git gcc/cp/tree.c gcc/cp/tree.c
index d3c63b8..b3a4a10 100644
--- gcc/cp/tree.c
+++ gcc/cp/tree.c
@@ -1548,29 +1548,40 @@ strip_typedefs (tree t, bool *remove_attributes)
result = TYPE_MAIN_VARIANT (t);
 }
   gcc_assert (!typedef_variant_p (result));
-  if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result)
-  || TYPE_ALIGN (t) != TYPE_ALIGN (result))
+
+  if (COMPLETE_TYPE_P (result) && !COMPLETE_TYPE_P (t))
+  /* If RESULT is complete and T isn't, it's likely the case that T
+ is a variant of RESULT which hasn't been updated yet.  Skip the
+ attribute handling.  */;
+  else
 {
-  gcc_assert (TYPE_USER_ALIGN (t));
-  if (remove_attributes)
-   *remove_attributes = true;
-  else
+  if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result)
+ || TYPE_ALIGN (t) != TYPE_ALIGN (result))
{
- if (TYPE_ALIGN (t) == TYPE_ALIGN (result))
-   result = build_variant_type_copy (result);
+ gcc_assert (TYPE_USER_ALIGN (t));
+ if (remove_attributes)
+   *remove_attributes = true;
  else
-   result = build_aligned_type (result, TYPE_ALIGN (t));
- TYPE_USER_ALIGN (result) = true;
+   {
+ if (TYPE_ALIGN (t) == TYPE_ALIGN (result))
+   result = build_variant_type_copy (result);
+ else
+   result = build_aligned_type (result, TYPE_ALIGN (t));
+ TYPE_USER_ALIGN (result) = true;
+   }
+   }
+
+  if (TYPE_ATTRIBUTES (t))
+   {
+ if (remove_attributes)
+   result = apply_identity_attributes (result, TYPE_ATTRIBUTES (t),
+   remove_attributes);
+ else
+   result = cp_build_type_attribute_variant (result,
+ TYPE_ATTRIBUTES (t));
}
 }
-  if (TYPE_ATTRIBUTES (t))
-{
-  if (remove_attributes)
-   result = apply_identity_attributes (result, TYPE_ATTRIBUTES (t),
-   remove_attributes);
-  else
-   result = cp_build_type_attribute_variant (result, TYPE_ATTRIBUTES (t));
-}
+
   return cp_build_qualified_type (result, cp_type_quals (t));
 }
 
diff --git gcc/testsuite/g++.dg/warn/Wpadded-1.C gcc/testsuite/g++.dg/warn/Wpadded-1.C
index e69de29..b3f0581 100644
--- gcc/testsuite/g++.dg/warn/Wpadded-1.C
+++ gcc/testsuite/g++.dg/warn/Wpadded-1.C
@@ -0,0 +1,22 @@
+// PR c++/79900 - ICE in strip_typedefs
+// { dg-do compile }
+// { dg-options "-Wpadded" }
+
+template  struct A;
+template  struct B { // { dg-warning "padding struct size to alignment boundary" }
+  long _M_off;
+  int _M_state;
+};
+template <> struct A { typedef B pos_type; };
+enum _Ios_Openmode {};
+struct C {
+  typedef _Ios_Openmode openmode;
+};
+template  struct D {
+  typedef typename _Traits::pos_type pos_type;
+  pos_type m_fn1(pos_type, C::openmode);
+};
+template class D;
+template 
+typename D<_CharT, _Traits>::pos_type D<_CharT, _Traits>::m_fn1(pos_type x,
+C::openmode) { return x; }

Marek


Re: [PATCH, Fortran, Coarray, v1] Add support for failed images

2017-03-08 Thread Christophe Lyon
Hi,

On 5 March 2017 at 12:41, Andre Vehreschild  wrote:
> Hi Jerry,
>
> thanks for seconding my read of the standard and reviewing so quickly.
> Committed as r245900.
>

I've noticed that the new test:
gfortran.dg/coarray/fail_image_2.f08 -fcoarray=single  -O2  -latomic
execution test
fails on arm and aarch64.
I'm using qemu if it matters, and my gfortran.log has:
spawn /XXX/qemu-wrapper.sh ./fail_image_2.exe
FAIL: gfortran.dg/coarray/fail_image_2.f08 -fcoarray=single  -O2
-latomic execution test

that is, no obvious error message :-(

Am I the only one seeing this?

Thanks,

Christophe

> Regards,
> Andre
>
> On Sat, 4 Mar 2017 15:06:25 -0800
> Jerry DeLisle  wrote:
>
>> On 03/04/2017 09:58 AM, Andre Vehreschild wrote:
>> > Hi all,
>> >
>> > attached patch polishes the one begun by Alessandro. It adds documentation
>> > and fixes the style issues. Furthermore did I try to interpret the standard
>> > according to the FAIL IMAGE statement. IMHO should it just quit the
>> > executable without any error code. The caf_single library emits "FAIL
>> > IMAGE" to stderr, while in coarray=single mode it just quits. What do you
>> > think?
>> >
>> > Bootstraps and regtests ok on x86_64-linux/f25. Ok for trunk? (May be
>> > later).
>> >
>> > Gruß,
>> > Andre
>> >
>>
>> From my read:
>>
>> "A failed image is usually associated with a hardware failure of the
>> processor, memory system, or interconnection network"
>>
>> Since the FAIL IMAGE statement is intended to simulate such failure, I agree
>> with your interpretation as well, it just stops execution.
>>
>> Yes OK for trunk now.
>>
>> Jerry
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de


[PATCH rs6000] Fix PR79907

2017-03-08 Thread Pat Haugen
The following patch fixes an "insn does not match its constraints" ICE
that occurred when copying a VR reg to a GPR when -mno-upper-regs-df is
in effect. The regclass used for the wi/wj constraints was being
incorrectly based on TARGET_UPPER_REGS_DF instead of
TARGET_UPPER_REGS_DI. This patch fixes the initialization of the wi
constraint, which in turn is used to set the value of the wj constraint.

Bootstrap/regtest on powerpc64le and powerpc64 (-m32/-m64) with no new
regressions. Ok for trunk?

-Pat


2017-03-08  Pat Haugen  

PR target/79907
* config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok): Test
TARGET_UPPER_REGS_DI when setting 'wi' constraint regclass.

testsuite/ChangeLog:
2017-03-08  Pat Haugen  

* gcc.target/powerpc/pr79907.c: New.



Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c  (revision 245961)
+++ config/rs6000/rs6000.c  (working copy)
@@ -3182,7 +3182,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
   else
rs6000_constraints[RS6000_CONSTRAINT_ws] = FLOAT_REGS;

-  if (TARGET_UPPER_REGS_DF)	/* DImode  */
+  if (TARGET_UPPER_REGS_DI)	/* DImode  */
rs6000_constraints[RS6000_CONSTRAINT_wi] = VSX_REGS;
   else
rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;
Index: testsuite/gcc.target/powerpc/pr79907.c
===================================================================
--- testsuite/gcc.target/powerpc/pr79907.c  (nonexistent)
+++ testsuite/gcc.target/powerpc/pr79907.c  (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O3 -mno-upper-regs-df" } */
+
+int foo (short a[], int x)
+{
+  unsigned int i;
+  for (i = 0; i < 1000; i++)
+{
+  x = a[i];
+  a[i] = (x <= 0 ? 0 : x);
+}
+  return x;
+}



Re: [PATCH][AArch64] Add neon_pairwise_add & neon_pairwise_add_q types

2017-03-08 Thread James Greenhalgh
On Mon, Mar 06, 2017 at 05:09:44AM +, Hurugalawadi, Naveen wrote:
> Hi,
> 
> Please find attached the patch that adds "neon_pairwise_add" &
> "neon_pairwise_add_q" types for AArch64.
> 
> The patch doesn't change spec but improves other benchmarks.
> 
> Bootstrapped and Regression tested on aarch64-thunder-linux.
> Please review the patch and let us know if its okay for Stage-1?

The whitespace in various places in this patch is inconsistent with the
whitespace around the modified line. For example:

--- a/gcc/config/arm/cortex-a9-neon.md
+++ b/gcc/config/arm/cortex-a9-neon.md
@@ -52,6 +52,7 @@
   (const_string "neon_int_2")
   (eq_attr "type" "neon_neg, neon_neg_q,\
neon_reduc_add, neon_reduc_add_q,\
+  neon_pairwise_add, neon_pairwise_add_q,\
neon_reduc_add_long,\
neon_add_long, neon_sub_long")
   (const_string "neon_int_3")

Where most lines use 8 spaces, and you've used tabs.

While all these lines should really have been tabs to begin with, they
didn't, and introducing inconsistency from line to line is worse than
just following the existing style.

With the whitespace fixed, this patch is OK for stage 1.

Thanks,
James

> 2017-03-06  Julian Brown  
>   Naveen H.S  
> 
>   * config/aarch64/aarch64-simd.md (aarch64_reduc_plus_internal)
>   (aarch64_reduc_plus_internalv2si, aarch64_addp, aarch64_addpdi):
>   Use neon_pairwise_add/neon_pairwise_add_q as appropriate.
>   * config/aarch64/iterators.md (reduc_pairwise): New mode attribute.
>   * config/aarch64/thunderx.md (thunderx_neon_add, thunderx_neon_add_q):
>   Tweak for neon_pairwise_add split.
>   * config/aarch64/thunderx2t99.md (thunderx2t99_asimd_int): Add
>   neon_pairwise_add/neon_pairwise_add_q types.
>   * config/arm/cortex-a15-neon.md (cortex_a15_neon_type): Likewise.
>   * config/arm/cortex-a17-neon.md (cortex_a17_neon_type): Likewise.
>   * config/arm/cortex-a57.md (cortex_a57_neon_type): Likewise.
>   * config/arm/cortex-a8-neon.md (cortex_a8_neon_type): Likewise.
>   * config/arm/cortex-a9-neon.md (cortex_a9_neon_type): Likewise.
>   * config/arm/xgene1.md (xgene1_neon_arith): Likewise.
>   * config/arm/types.md (neon_pairwise_add, neon_pairwise_add_q): Add.

> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 338b9f8..878f86a 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -2101,7 +2101,7 @@
>   UNSPEC_ADDV))]
>   "TARGET_SIMD"
>   "add\\t%0, %1."
> -  [(set_attr "type" "neon_reduc_add")]
> +  [(set_attr "type" "neon_<reduc_pairwise>_add")]
>  )
>  
>  (define_insn "aarch64_reduc_plus_internalv2si"
> @@ -2110,7 +2110,7 @@
>   UNSPEC_ADDV))]
>   "TARGET_SIMD"
>   "addp\\t%0.2s, %1.2s, %1.2s"
> -  [(set_attr "type" "neon_reduc_add")]
> +  [(set_attr "type" "neon_pairwise_add")]
>  )
>  
>  (define_insn "reduc_plus_scal_<mode>"
> @@ -4405,7 +4405,7 @@
>UNSPEC_ADDP))]
>"TARGET_SIMD"
>"addp\t%0, %1, %2"
> -  [(set_attr "type" "neon_reduc_add")]
> +  [(set_attr "type" "neon_pairwise_add")]
>  )
>  
>  (define_insn "aarch64_addpdi"
> @@ -4415,7 +4415,7 @@
>UNSPEC_ADDP))]
>"TARGET_SIMD"
>"addp\t%d0, %1.2d"
> -  [(set_attr "type" "neon_reduc_add")]
> +  [(set_attr "type" "neon_pairwise_add")]
>  )
>  
>  ;; sqrt
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index c59d31e..c829cb5 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -790,6 +790,12 @@
> (V2SF "p") (V4SF  "v")
> (V4HF "v") (V8HF  "v")])
>  
> +(define_mode_attr reduc_pairwise [(V8QI "reduc") (V16QI "reduc")
> +   (V4HI "reduc") (V8HI "reduc")
> +   (V2SI "pairwise") (V4SI "reduc")
> +   (V2DI "pairwise") (V2DF "pairwise")
> +   (V2SF "pairwise") (V4SF "reduc")])
> +
>  (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")])
>  (define_mode_attr VSI2QI [(V2SI "V8QI") (V4SI "V16QI")])
>  
> diff --git a/gcc/config/aarch64/thunderx.md b/gcc/config/aarch64/thunderx.md
> --- a/gcc/config/aarch64/thunderx.md
> +++ b/gcc/config/aarch64/thunderx.md
> @@ -266,7 +266,8 @@
>  
>  (define_insn_reservation "thunderx_neon_add" 4
>(and (eq_attr "tune" "thunderx")
> -   (eq_attr "type" "neon_reduc_add, neon_reduc_minmax, neon_fp_reduc_add_s, \
> +   (eq_attr "type" "neon_reduc_add, neon_pairwise_add, neon_reduc_minmax,\
> + neon_fp_reduc_add_s, \
>   neon_fp_reduc_add_d, neon_fp_to_int_s, neon_fp_to_int_d, \
>   

Re: [PATCH][AArch64] Implement ALU_BRANCH fusion

2017-03-08 Thread James Greenhalgh
On Mon, Mar 06, 2017 at 05:10:10AM +, Hurugalawadi, Naveen wrote:
> Hi,
> 
> Please find attached the patch that implements alu_branch fusion
> for AArch64.
> The patch doesn't change spec but improves other benchmarks.
> 
> Bootstrapped and Regression tested on aarch64-thunder-linux.
> Please review the patch and let us know if its okay for Stage-1?

This description is insufficient for me to review this patch - in
particular I'd need more detail on what types of instruction pairs you
are trying to fuse. From inspection you will be trying to fuse any
ALU operation with an unconditional direct branch. Is that what you
intend?

i.e. you are looking to fuse instruction sequences like:

  add   x0, x1, #5
  b .L3

  csel  x0, x1, x1, gt
  b .L4

Have I understood that right?

> +  if (aarch64_fusion_enabled_p (AARCH64_FUSE_ALU_BRANCH)
> +  && any_uncondjump_p (curr))
> +{
> +  /* These types correspond to the reservation "vulcan_alu_basic" for
> +  Broadcom Vulcan: these are ALU operations that produce a single uop
> +  during instruction decoding.  */

This comment looks incorrect - there is no vulcan_alu_basic reservation
in trunk GCC.

Thanks,
James


Re: [PATCH][AArch64] Add crc reservations for Thunderx2t99

2017-03-08 Thread James Greenhalgh
On Mon, Mar 06, 2017 at 05:10:04AM +, Hurugalawadi, Naveen wrote:
> Hi,
> 
> Please find attached the patch that adds crc reservations for Thunderx2t99.
> 
> Bootstrapped and Regression tested on aarch64-thunder-linux.
> Please review the patch and let us know if its okay for Stage-1?

OK for Stage 1.

Thanks,
James

> 2017-03-06  Julian Brown  
>   Naveen H.S  
> 
>   * config/aarch64/thunderx2t99.md (thunderx2t99_crc): New Reservation.

> diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
> index 2eb136b..936078c 100644
> --- a/gcc/config/aarch64/thunderx2t99.md
> +++ b/gcc/config/aarch64/thunderx2t99.md
> @@ -462,3 +462,10 @@
> (eq_attr "type" "crypto_sha1_fast,crypto_sha1_xor,crypto_sha1_slow,\
>   crypto_sha256_fast,crypto_sha256_slow"))
>"thunderx2t99_f1")
> +
> +;; CRC extension.
> +
> +(define_insn_reservation "thunderx2t99_crc" 4
> +  (and (eq_attr "tune" "thunderx2t99")
> +   (eq_attr "type" "crc"))
> +  "thunderx2t99_i1")



Re: [PATCH][AArch64] Add aes and sha reservations for Thunderx2t99

2017-03-08 Thread James Greenhalgh
On Mon, Mar 06, 2017 at 05:09:58AM +, Hurugalawadi, Naveen wrote:
> Hi,
> 
> Please find attached the patch that adds aes and sha reservations for
> Thunderx2t99.
> 
> Bootstrapped and Regression tested on aarch64-thunder-linux.
> Please review the patch and let us know if its okay for Stage-1?
> 
> Thanks,
> Naveen
> 
> 2017-03-06  Julian Brown  
>   Naveen H.S  
> 
>   * config/aarch64/thunderx2t99.md (thunderx2t99_aes, thunderx2t99_sha):
>   New reservations.

> diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
> index f807547..2eb136b 100644
> --- a/gcc/config/aarch64/thunderx2t99.md
> +++ b/gcc/config/aarch64/thunderx2t99.md
> @@ -443,7 +443,22 @@
> (eq_attr "type" "neon_store2_one_lane,neon_store2_one_lane_q"))
>"thunderx2t99_ls01,thunderx2t99_f01")
>  
> +;; Crypto extensions.
> +
> +; FIXME: Forwarding path for aese/aesmc or aesd/aesimc pairs?

Do you need these if you've enabled instruction fusion?

I'm not sure on the exact mechanics of how instruction fusion works, and
whether you'd get further scheduling benefits from modeling a forwarding
path if one exists.

Regardless, this patch is OK for Stage 1.

Thanks,
James

> +(define_insn_reservation "thunderx2t99_aes" 5
> +  (and (eq_attr "tune" "thunderx2t99")
> +   (eq_attr "type" "crypto_aese,crypto_aesmc"))
> +  "thunderx2t99_f1")
> +
>  (define_insn_reservation "thunderx2t99_pmull" 5
>(and (eq_attr "tune" "thunderx2t99")
> (eq_attr "type" "crypto_pmull"))
>"thunderx2t99_f1")
> +
> +(define_insn_reservation "thunderx2t99_sha" 7
> +  (and (eq_attr "tune" "thunderx2t99")
> +   (eq_attr "type" "crypto_sha1_fast,crypto_sha1_xor,crypto_sha1_slow,\
> + crypto_sha256_fast,crypto_sha256_slow"))
> +  "thunderx2t99_f1")



Re: [PATCH][AArch64] Fix type for 1-element load

2017-03-08 Thread James Greenhalgh
On Mon, Mar 06, 2017 at 05:09:51AM +, Hurugalawadi, Naveen wrote:
> Hi,
> 
> Please find attached the patch that fixes type for 1-element load in AArch64.
> 
> Bootstrapped and Regression tested on aarch64-thunder-linux.
> Please review the patch and let us know if its okay for Stage-1?
> 
> Thanks,
> Naveen
> 
> 2017-03-06  Julian Brown  
>   Naveen H.S  
> 
>   * config/aarch64/aarch64-simd.md (aarch64_simd_vec_set): Fix
>   type for 1-element load.

OK.

Thanks,
James

> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 878f86a..0443a86 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -561,7 +561,7 @@
>   gcc_unreachable ();
>   }
>}
> -  [(set_attr "type" "neon_from_gp, neon_ins, neon_load1_1reg")]
> +  [(set_attr "type" "neon_from_gp, neon_ins, neon_load1_one_lane")]
>  )
>  
>  (define_insn "*aarch64_simd_vec_copy_lane"



[PATCH] Fix -fsanitize=address atomic builtin handling (PR sanitizer/79944)

2017-03-08 Thread Jakub Jelinek
Hi!

As reported, we were using often completely bogus access sizes for
atomic builtins, where the access size should be determined purely by their
_{1,2,4,8,16} suffix, not by whatever type the passed pointer points to,
pointer conversions are useless.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-03-08  Jakub Jelinek  

PR sanitizer/79944
* asan.c (get_mem_refs_of_builtin_call): For BUILT_IN_ATOMIC* and
BUILT_IN_SYNC*, determine the access type from the size suffix and
always build a MEM_REF with that type.  Handle forgotten
BUILT_IN_SYNC_FETCH_AND_NAND_16 and BUILT_IN_SYNC_NAND_AND_FETCH_16.

* c-c++-common/asan/pr79944.c: New test.

--- gcc/asan.c.jj   2017-03-06 12:32:28.0 +0100
+++ gcc/asan.c  2017-03-08 12:24:11.151353229 +0100
@@ -603,218 +603,208 @@ get_mem_refs_of_builtin_call (const gcall
 case BUILT_IN_STRLEN:
   source0 = gimple_call_arg (call, 0);
   len = gimple_call_lhs (call);
-  break ;
+  break;
 
 /* And now the __atomic* and __sync builtins.
These are handled differently from the classical memory memory
access builtins above.  */
 
 case BUILT_IN_ATOMIC_LOAD_1:
-case BUILT_IN_ATOMIC_LOAD_2:
-case BUILT_IN_ATOMIC_LOAD_4:
-case BUILT_IN_ATOMIC_LOAD_8:
-case BUILT_IN_ATOMIC_LOAD_16:
   is_store = false;
-  /* fall through.  */
-
+  /* FALLTHRU */
 case BUILT_IN_SYNC_FETCH_AND_ADD_1:
-case BUILT_IN_SYNC_FETCH_AND_ADD_2:
-case BUILT_IN_SYNC_FETCH_AND_ADD_4:
-case BUILT_IN_SYNC_FETCH_AND_ADD_8:
-case BUILT_IN_SYNC_FETCH_AND_ADD_16:
-
 case BUILT_IN_SYNC_FETCH_AND_SUB_1:
-case BUILT_IN_SYNC_FETCH_AND_SUB_2:
-case BUILT_IN_SYNC_FETCH_AND_SUB_4:
-case BUILT_IN_SYNC_FETCH_AND_SUB_8:
-case BUILT_IN_SYNC_FETCH_AND_SUB_16:
-
 case BUILT_IN_SYNC_FETCH_AND_OR_1:
-case BUILT_IN_SYNC_FETCH_AND_OR_2:
-case BUILT_IN_SYNC_FETCH_AND_OR_4:
-case BUILT_IN_SYNC_FETCH_AND_OR_8:
-case BUILT_IN_SYNC_FETCH_AND_OR_16:
-
 case BUILT_IN_SYNC_FETCH_AND_AND_1:
-case BUILT_IN_SYNC_FETCH_AND_AND_2:
-case BUILT_IN_SYNC_FETCH_AND_AND_4:
-case BUILT_IN_SYNC_FETCH_AND_AND_8:
-case BUILT_IN_SYNC_FETCH_AND_AND_16:
-
 case BUILT_IN_SYNC_FETCH_AND_XOR_1:
-case BUILT_IN_SYNC_FETCH_AND_XOR_2:
-case BUILT_IN_SYNC_FETCH_AND_XOR_4:
-case BUILT_IN_SYNC_FETCH_AND_XOR_8:
-case BUILT_IN_SYNC_FETCH_AND_XOR_16:
-
 case BUILT_IN_SYNC_FETCH_AND_NAND_1:
-case BUILT_IN_SYNC_FETCH_AND_NAND_2:
-case BUILT_IN_SYNC_FETCH_AND_NAND_4:
-case BUILT_IN_SYNC_FETCH_AND_NAND_8:
-
 case BUILT_IN_SYNC_ADD_AND_FETCH_1:
-case BUILT_IN_SYNC_ADD_AND_FETCH_2:
-case BUILT_IN_SYNC_ADD_AND_FETCH_4:
-case BUILT_IN_SYNC_ADD_AND_FETCH_8:
-case BUILT_IN_SYNC_ADD_AND_FETCH_16:
-
 case BUILT_IN_SYNC_SUB_AND_FETCH_1:
-case BUILT_IN_SYNC_SUB_AND_FETCH_2:
-case BUILT_IN_SYNC_SUB_AND_FETCH_4:
-case BUILT_IN_SYNC_SUB_AND_FETCH_8:
-case BUILT_IN_SYNC_SUB_AND_FETCH_16:
-
 case BUILT_IN_SYNC_OR_AND_FETCH_1:
-case BUILT_IN_SYNC_OR_AND_FETCH_2:
-case BUILT_IN_SYNC_OR_AND_FETCH_4:
-case BUILT_IN_SYNC_OR_AND_FETCH_8:
-case BUILT_IN_SYNC_OR_AND_FETCH_16:
-
 case BUILT_IN_SYNC_AND_AND_FETCH_1:
-case BUILT_IN_SYNC_AND_AND_FETCH_2:
-case BUILT_IN_SYNC_AND_AND_FETCH_4:
-case BUILT_IN_SYNC_AND_AND_FETCH_8:
-case BUILT_IN_SYNC_AND_AND_FETCH_16:
-
 case BUILT_IN_SYNC_XOR_AND_FETCH_1:
-case BUILT_IN_SYNC_XOR_AND_FETCH_2:
-case BUILT_IN_SYNC_XOR_AND_FETCH_4:
-case BUILT_IN_SYNC_XOR_AND_FETCH_8:
-case BUILT_IN_SYNC_XOR_AND_FETCH_16:
-
 case BUILT_IN_SYNC_NAND_AND_FETCH_1:
-case BUILT_IN_SYNC_NAND_AND_FETCH_2:
-case BUILT_IN_SYNC_NAND_AND_FETCH_4:
-case BUILT_IN_SYNC_NAND_AND_FETCH_8:
-
 case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
-case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
-case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
-case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
-case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
-
 case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
-case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
-case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
-case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
-case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
-
 case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
-case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
-case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
-case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
-case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
-
 case BUILT_IN_SYNC_LOCK_RELEASE_1:
-case BUILT_IN_SYNC_LOCK_RELEASE_2:
-case BUILT_IN_SYNC_LOCK_RELEASE_4:
-case BUILT_IN_SYNC_LOCK_RELEASE_8:
-case BUILT_IN_SYNC_LOCK_RELEASE_16:
-
 case BUILT_IN_ATOMIC_EXCHANGE_1:
-case BUILT_IN_ATOMIC_EXCHANGE_2:
-case BUILT_IN_ATOMIC_EXCHANGE_4:
-case BUILT_IN_ATOMIC_EXCHANGE_8:
-case BUILT_IN_ATOMIC_EXCHANGE_16:
-
 case 

[committed] Fix taskloop handling inside of parallel construct body (PR c/79940)

2017-03-08 Thread Jakub Jelinek
Hi!

We split OMP_TASKLOOP into 3 constructs, two GIMPLE_OMP_FOR
with GIMPLE_OMP_PARALLEL sandwiched in between them, so that
it is possible to compute the number of iterations etc. before calling
GOMP_taskloop*.  Using the original iterator in the outer gfor
doesn't play very well if the taskloop region is nested in other
OpenMP regions like parallel.  This patch just creates a temporary
for that.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed
to trunk so far, queued for 6.x.

2017-03-08  Jakub Jelinek  

PR c/79940
* gimplify.c (gimplify_omp_for): Replace index var in outer
taskloop statement with an artificial variable and add
OMP_CLAUSE_PRIVATE clause for it.

* testsuite/libgomp.c/pr79940.c: New test.

--- gcc/gimplify.c.jj   2017-02-21 09:03:57.0 +0100
+++ gcc/gimplify.c  2017-03-08 10:06:11.926501447 +0100
@@ -10232,8 +10232,9 @@ gimplify_omp_for (tree *expr_p, gimple_s
   gimple_omp_for_set_combined_into_p (gfor, true);
   for (i = 0; i < (int) gimple_omp_for_collapse (gfor); i++)
{
- t = unshare_expr (gimple_omp_for_index (gfor, i));
- gimple_omp_for_set_index (gforo, i, t);
+ tree type = TREE_TYPE (gimple_omp_for_index (gfor, i));
+ tree v = create_tmp_var (type);
+ gimple_omp_for_set_index (gforo, i, v);
  t = unshare_expr (gimple_omp_for_initial (gfor, i));
  gimple_omp_for_set_initial (gforo, i, t);
  gimple_omp_for_set_cond (gforo, i,
@@ -10241,7 +10242,13 @@ gimplify_omp_for (tree *expr_p, gimple_s
  t = unshare_expr (gimple_omp_for_final (gfor, i));
  gimple_omp_for_set_final (gforo, i, t);
  t = unshare_expr (gimple_omp_for_incr (gfor, i));
+ gcc_assert (TREE_OPERAND (t, 0) == gimple_omp_for_index (gfor, i));
+ TREE_OPERAND (t, 0) = v;
  gimple_omp_for_set_incr (gforo, i, t);
+ t = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE);
+ OMP_CLAUSE_DECL (t) = v;
+ OMP_CLAUSE_CHAIN (t) = gimple_omp_for_clauses (gforo);
+ gimple_omp_for_set_clauses (gforo, t);
}
   gimplify_seq_add_stmt (pre_p, gforo);
 }
--- libgomp/testsuite/libgomp.c/pr79940.c.jj2017-03-08 10:47:26.179154442 
+0100
+++ libgomp/testsuite/libgomp.c/pr79940.c   2017-03-08 10:46:46.0 
+0100
@@ -0,0 +1,47 @@
+/* PR c/79940 */
+
+int
+main ()
+{
+  int i, j, l, m;
+  int a[1], b[1], c[1];
+  for (i = 0; i < 1; i++)
+{
+  a[i] = i;
+  b[i] = i & 31;
+}
+#pragma omp parallel shared(a, b, c)
+#pragma omp single
+#pragma omp taskloop shared(a, b, c)
+  for (i = 0; i < 1; i++)
+c[i] = a[i] + b[i];
+#pragma omp parallel
+#pragma omp single
+  {
+#pragma omp taskloop shared(a, b, c) lastprivate (i)
+for (i = 0; i < 1; i++)
+  c[i] += a[i] + b[i];
+l = i;
+  }
+#pragma omp parallel
+#pragma omp single
+#pragma omp taskloop shared(a, b, c) collapse(2)
+  for (i = 0; i < 100; i++)
+for (j = 0; j < 100; j++)
+  c[i * 100 + j] += a[i * 100 + j] + b[i * 100 + j];
+#pragma omp parallel
+#pragma omp single
+  {
+#pragma omp taskloop shared(a, b, c) lastprivate (i, j)
+for (i = 0; i < 100; i++)
+  for (j = 0; j < 100; j++)
+   c[i * 100 + j] += a[i * 100 + j] + b[i * 100 + j];
+m = i * 100 + j;
+  }
+  for (i = 0; i < 1; i++)
+if (a[i] != i || b[i] != (i & 31) || c[i] != 4 * i + 4 * (i & 31))
+  __builtin_abort ();
+  if (l != 1 || m != 10100)
+__builtin_abort ();
+  return 0;
+}

Jakub


[PATCH][AArch64] Emit tighter strong atomic compare-exchange loop when comparing against zero

2017-03-08 Thread Kyrill Tkachov

Hi all,

For the testcase in this patch where the value of x is zero we currently 
generate:
foo:
mov w1, 4
.L2:
ldaxr   w2, [x0]
cmp w2, 0
bne .L3
stxr    w3, w1, [x0]
cbnz    w3, .L2
.L3:
cset    w0, eq
ret

We currently cannot merge the cmp and b.ne inside the loop into a cbnz because 
we need
the condition flags set for the return value of the function (i.e. the cset at 
the end).
But if we re-jig the sequence in that case we can generate a tighter loop:
foo:
mov w1, 4
.L2:
ldaxr   w2, [x0]
cbnz    w2, .L3
stxr    w3, w1, [x0]
cbnz    w3, .L2
.L3:
cmp w2, 0
cset    w0, eq
ret

So we add an explicit compare after the loop and inside the loop we use the 
fact that
we're comparing against zero to emit a CBNZ. This means we may re-do the 
comparison twice
(once inside the CBNZ, once at the CMP at the end), but there is now less code 
inside the loop.

I've seen this sequence appear in glibc locking code so maybe it's worth adding 
the extra bit
of complexity to the compare-exchange splitter to catch this case.

Bootstrapped and tested on aarch64-none-linux-gnu.  In previous iterations of
the patch, where I had gotten some logic wrong, it caused miscompiles of
libgomp leading to timeouts in its testsuite, but this version passes
everything cleanly.

Ok for GCC 8? (I know it's early, but might as well get it out in case someone 
wants to try it out)

Thanks,
Kyrill

2017-03-08  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
Emit CBNZ inside loop when doing a strong exchange and comparing
against zero.  Generate the CC flags after the loop.

2017-03-08  Kyrylo Tkachov  

* gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: New test.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 76a2de20dfcd4ea38fb7c58a9e8612509c5987bd..5fa8e197328ce4cb1718ff7d99b1ea95e02129a4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -12095,6 +12095,17 @@ aarch64_split_compare_and_swap (rtx operands[])
   mode = GET_MODE (mem);
   model = memmodel_from_int (INTVAL (model_rtx));
 
+  /* When OLDVAL is zero and we want the strong version we can emit a tighter
+loop:
+.label1:
+	LD[A]XR	rval, [mem]
+	CBNZ	rval, .label2
+	ST[L]XR	scratch, newval, [mem]
+	CBNZ	scratch, .label1
+.label2:
+	CMP	rval, 0.  */
+  bool strong_zero_p = !is_weak && oldval == const0_rtx;
+
   label1 = NULL;
   if (!is_weak)
 {
@@ -12111,11 +12122,21 @@ aarch64_split_compare_and_swap (rtx operands[])
   else
 aarch64_emit_load_exclusive (mode, rval, mem, model_rtx);
 
-  cond = aarch64_gen_compare_reg (NE, rval, oldval);
-  x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
-  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-			gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+  if (strong_zero_p)
+{
+  x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
+  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+}
+  else
+{
+  cond = aarch64_gen_compare_reg (NE, rval, oldval);
+  x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+ gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
+  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+}
 
   aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
@@ -12134,7 +12155,15 @@ aarch64_split_compare_and_swap (rtx operands[])
 }
 
   emit_label (label2);
-
+  /* If we used a CBNZ in the exchange loop emit an explicit compare with RVAL
+ to set the condition flags.  If this is not used it will be removed by
+ later passes.  */
+  if (strong_zero_p)
+{
+  cond = gen_rtx_REG (CCmode, CC_REGNUM);
+  x = gen_rtx_COMPARE (CCmode, rval, const0_rtx);
+  emit_insn (gen_rtx_SET (cond, x));
+}
   /* Emit any final barrier needed for a __sync operation.  */
   if (is_mm_sync (model))
 aarch64_emit_post_barrier (model);
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
new file mode 100644
index ..b14a7c294376f03cd13077d18d865f83a04bd04e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int *a)
+{
+  int x = 0;
+  return __atomic_compare_exchange_n (a, &x, 4, 0,
+  __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
+}
+
+/* { dg-final { scan-assembler-times "cbnz\\tw\[0-9\]+" 2 } } */


Re: [RFC] VEC_SELECT sanity checking in genrecog (arm, aarch64, mips)

2017-03-08 Thread Kyrill Tkachov


On 08/03/17 15:51, Bill Schmidt wrote:

On Mar 6, 2017, at 6:55 AM, Jakub Jelinek  wrote:

Nice.  So just arm, aarch64 and rs6000 to go.

Proposed patch for rs6000 is here:  
https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00372.html.


arm patch is here https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00358.html
Doing some more testing on the aarch64 version.

Thanks,
Kyrill


Thanks,
Bill




Re: [PATCH] Fix PRs 79955 and 79956 (hopefully)

2017-03-08 Thread Jeff Law

On 03/08/2017 03:12 AM, Richard Biener wrote:


The following avoids warning about completely out-of-bound accesses
as uninitialized.

For GCC8 we likely want to enhance path isolation to catch those
cases (and eventually issue a diagnostic about them).  And we
finally want to fix complete peeling to not introduce those
accesses.
FWIW, I want to do isolation of o-o-b array accesses and this would 
likely fit into that reasonably well.


jeff



Re: [RFC] VEC_SELECT sanity checking in genrecog (arm, aarch64, mips)

2017-03-08 Thread Bill Schmidt

> On Mar 6, 2017, at 6:55 AM, Jakub Jelinek  wrote:
> 
> Nice.  So just arm, aarch64 and rs6000 to go.

Proposed patch for rs6000 is here:  
https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00372.html.

Thanks,
Bill


[PATCH, rs6000] Fix incorrect mode usage for vec_select

2017-03-08 Thread Bill Schmidt
Hi,

As noted by Jakub in https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00183.html,
the PowerPC back end incorrectly uses vec_select with 2 elements for a mode
that has only one.  This is due to faulty mode iterator use:  V1TImode was
wrongly included in the VSX_LE mode iterator, but should instead have been
in the VSX_LE_128 mode iterator.

This patch fixes that, and with VSX_LE no longer including V1TImode, it is
now redundant with VSX_D, so that patch removes VSX_LE altogether.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this ok for trunk?

I am uncertain whether we should backport the fix to gcc 5 and 6, since,
although the code is technically incorrect, it works just fine.  The fix
is needed in trunk to permit the sanity checking that Jakub has proposed
for genrecog.

Thanks,
Bill


2017-03-08  Bill Schmidt  

* config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): Use rotate
instead of vec_select for V1TImode.
* config/rs6000/vsx.md (VSX_LE): Remove mode iterator that is no
longer needed.
(VSX_LE_128): Add V1TI to this mode iterator.
(*vsx_le_perm_load_<mode>): Change to use VSX_D mode iterator.
(*vsx_le_perm_store_<mode>): Likewise.
(pre-reload splitter for VSX stores): Likewise.
(post-reload splitter for VSX stores): Likewise.
(*vsx_xxpermdi2_le_<mode>): Likewise.
(*vsx_lxvd2x2_le_<mode>): Likewise.
(*vsx_stxvd2x2_le_<mode>): Likewise.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 245952)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -10420,7 +10420,7 @@ rs6000_gen_le_vsx_permute (rtx source, machine_mod
 {
   /* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point, and
  128-bit integers if they are allowed in VSX registers.  */
-  if (FLOAT128_VECTOR_P (mode) || mode == TImode)
+  if (FLOAT128_VECTOR_P (mode) || mode == TImode || mode == V1TImode)
 return gen_rtx_ROTATE (mode, source, GEN_INT (64));
   else
 {
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 245952)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -27,15 +27,12 @@
 ;; Iterator for the 2 64-bit vector types
 (define_mode_iterator VSX_D [V2DF V2DI])
 
-;; Iterator for the 2 64-bit vector types + 128-bit types that are loaded with
-;; lxvd2x to properly handle swapping words on little endian
-(define_mode_iterator VSX_LE [V2DF V2DI V1TI])
-
 ;; Mode iterator to handle swapping words on little endian for the 128-bit
 ;; types that goes in a single vector register.
 (define_mode_iterator VSX_LE_128 [(KF   "FLOAT128_VECTOR_P (KFmode)")
  (TF   "FLOAT128_VECTOR_P (TFmode)")
- (TI   "TARGET_VSX_TIMODE")])
+ (TI   "TARGET_VSX_TIMODE")
+ V1TI])
 
 ;; Iterator for the 2 32-bit vector types
 (define_mode_iterator VSX_W [V4SF V4SI])
@@ -387,8 +384,8 @@
 ;; The patterns for LE permuted loads and stores come before the general
 ;; VSX moves so they match first.
 (define_insn_and_split "*vsx_le_perm_load_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=")
-(match_operand:VSX_LE 1 "memory_operand" "Z"))]
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=")
+(match_operand:VSX_D 1 "memory_operand" "Z"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
@@ -501,8 +498,8 @@
(set_attr "length" "8")])
 
 (define_insn "*vsx_le_perm_store_<mode>"
-  [(set (match_operand:VSX_LE 0 "memory_operand" "=Z")
-(match_operand:VSX_LE 1 "vsx_register_operand" "+"))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "=Z")
+(match_operand:VSX_D 1 "vsx_register_operand" "+"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
@@ -509,8 +506,8 @@
(set_attr "length" "12")])
 
 (define_split
-  [(set (match_operand:VSX_LE 0 "memory_operand" "")
-(match_operand:VSX_LE 1 "vsx_register_operand" ""))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "")
+(match_operand:VSX_D 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && !reload_completed"
   [(set (match_dup 2)
 (vec_select:
@@ -528,8 +525,8 @@
 ;; The post-reload split requires that we re-permute the source
 ;; register in case it is still live.
 (define_split
-  [(set (match_operand:VSX_LE 0 "memory_operand" "")
-(match_operand:VSX_LE 1 "vsx_register_operand" ""))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "")
+(match_operand:VSX_D 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && reload_completed"
   [(set (match_dup 1)
 (vec_select:
@@ 

Re: [PATCH,testsuite] Add check_effective_target_rdynamic and use it in g++.dg/lto/pr69589_0.C.

2017-03-08 Thread Rainer Orth
Hi Toma,

> g++.dg/lto/pr69589_0.C is currently failing for mips-mti-elf with the 
> following error:
>
> xg++: error: unrecognized command line option '-rdynamic'
>
> However, it passes just fine for mips-mti-linux-gnu.
> I think that we should skip this test for mips-mti-elf.

could it be that mips/sde.h is just missing -rdynamic handling in
LINK_SPEC?  At least some bare-metal targets using gld *do* support
-rdynamic (cf. xtensa/elf.h and aarch64/aarch64-elf-raw.h)...

That said, I like the general direction of this patch, even when the new
keyword is only used in one place.

> This patch achieves this by adding support for check_effective_target_rdynamic
> and then using it in g++.dg/lto/pr69589_0.C.
>
> This patch also removes the existing dg-skip-if, as it is made redundant by 
> the
> effective target check. The latter is also more convenient, as we won't have 
> to
> keep tweaking the dg-skip-if for failing targets anymore.
>
> This is similar to my recent check_effective_target_gettimeofday patch,
> which was greatly improved by Rainer Orth's review.
>
> Tested with mips-mti-elf and mips-mti-linux-gnu.
>
> Does this look good ?
[...]
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 2766af4..f46f0ba 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -1484,6 +1484,16 @@ proc check_effective_target_static_libgfortran { } {
>  } "-static"]
>  }
>  
> +# Return 1 if we can use the -rdynamic option, 0 otherwise.
> +#
> +# When the target name changes, replace the cached result.

I think this line isn't necessary.  Just keep the first.

Ok for mainline with that fixed.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] V2: Fix compile warning building testcase

2017-03-08 Thread Ian Lance Taylor via gcc-patches
On Wed, Mar 8, 2017 at 4:55 AM, Sam Thursfield
 wrote:
>
> Thanks a lot for reviewing the patch. I've attached a new version with some
> comments below.
>
> On 08/03/17 04:03, Ian Lance Taylor via gcc-patches wrote:
>>
>> Thanks.  The patch submission is fine but 1) you didn't say which
>> version of GCC you are using; 2) I don't understand why it works.  If
>> BACKTRACE_SUPPORTED is 0, then I would expect that you would see a
>> warning for test1, test2, test3, and test4.  Your patch doesn't fix
>> those, so it sounds like you are only seeing the warning for test5, so
>> BACKTRACE_SUPPORTED must be 1.  test5 is not compiled if
>> BACKTRACE_SUPPORTS_DATA is 0, so BACKTRACE_SUPPORTS_DATA must be 1.
>> So both BACKTRACE_SUPPORTED and BACKTRACE_SUPPORTS_DATA must be 1, so
>> I don't understand what your patch fixes.
>
>
> I'm using GCC 6.3.0 here to compile.
>
> I see your point about the fix not making sense. The values of the
> defines are this:
>
> #define BACKTRACE_SUPPORTED 0
> #define BACKTRACE_SUPPORTS_DATA 1
>
> I noticed that test1(), test2(), test3() and test4() all have the
> "unused" attribute:
>
> static int test1 (void) __attribute__ ((noinline, unused));
> static inline int test2 (void) __attribute__ ((always_inline, unused));
> static int test3 (void) __attribute__ ((noinline, unused));
> static inline int test4 (void) __attribute__ ((always_inline, unused));
>
> So that will be why they don't trigger -Wunused-function :-)
>
> It seems to me now that it'd be neater to add that same attribute to
> test5(), instead of using an #ifdef guard. Attached is a new patch that does
> so.

Makes sense.  Thanks.  Committed to mainline.

Ian


Re: [RFC PATCH libiberty] Fix infinite recursion in demangler

2017-03-08 Thread Nathan Sidwell

On 03/08/2017 08:29 AM, Mark Wielaard wrote:

On Wed, 2017-03-08 at 08:22 -0500, Nathan Sidwell wrote:

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00089.html?

Patch looks good to me (not a libiberty maintainer)

One question:
+  if (dc == NULL || dc->d_printing > 1)

Can dc ever be null here, or is this just extra defensiveness?


Yes, dc can be NULL. There is a similar check (and setting of
d_print_error) in d_print_comp_inner.


thanks.  in case you'd not noticed Ian T's okayed my review.  so this is 
good to go


nathan

--
Nathan Sidwell


Re: [RFC PATCH libiberty] Fix infinite recursion in demangler

2017-03-08 Thread Mark Wielaard
On Wed, 2017-03-08 at 08:22 -0500, Nathan Sidwell wrote:
> https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00089.html?
> 
> Patch looks good to me (not a libiberty maintainer)
> 
> One question:
> +  if (dc == NULL || dc->d_printing > 1)
> 
> Can dc ever be null here, or is this just extra defensiveness?

Yes, dc can be NULL. There is a similar check (and setting of
d_print_error) in d_print_comp_inner. 

Cheers,

Mark


[RFC PATCH libiberty] Fix infinite recursion in demangler

2017-03-08 Thread Nathan Sidwell

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00089.html?

Patch looks good to me (not a libiberty maintainer)

One question:
+  if (dc == NULL || dc->d_printing > 1)

Can dc ever be null here, or is this just extra defensiveness?

nathan

--
Nathan Sidwell


Re: [PATCH] Fix ICE with -Walloca-larger-than=[>INT_MAX] (PR middle-end/79809)

2017-03-08 Thread Marek Polacek
On Wed, Mar 08, 2017 at 12:30:59PM +0100, Andreas Schwab wrote:
> On Mär 02 2017, Marek Polacek <pola...@redhat.com> wrote:
> 
> > diff --git gcc/testsuite/g++.dg/Walloca1.C gcc/testsuite/g++.dg/Walloca1.C
> > index e69de29..23b97e8 100644
> > --- gcc/testsuite/g++.dg/Walloca1.C
> > +++ gcc/testsuite/g++.dg/Walloca1.C
> > @@ -0,0 +1,6 @@
> > +/* PR middle-end/79809 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-Walloca-larger-than=4207115063 
> > -Wvla-larger-than=1233877270 -O2" } */
> > +
> > +int a;
> +char *b = static_cast<char *>(__builtin_alloca (a)); // { dg-warning 
> > "argument to .alloca. may be too large" }
> 
> FAIL: g++.dg/Walloca1.C  -std=gnu++11  (test for warnings, line 6)
> FAIL: g++.dg/Walloca1.C  -std=gnu++11 (test for excess errors)
> Excess errors:
> /daten/aranym/gcc/gcc-20170308/gcc/testsuite/g++.dg/Walloca1.C:6:48: warning: 
> unbounded use of 'alloca' [-Walloca-larger-than=]

Ok, this should help:

2017-03-08  Marek Polacek  <pola...@redhat.com>

* g++.dg/Walloca1.C: Adjust dg-warning.

diff --git gcc/testsuite/g++.dg/Walloca1.C gcc/testsuite/g++.dg/Walloca1.C
index 23b97e8..818c6f0 100644
--- gcc/testsuite/g++.dg/Walloca1.C
+++ gcc/testsuite/g++.dg/Walloca1.C
@@ -3,4 +3,4 @@
 /* { dg-options "-Walloca-larger-than=4207115063 -Wvla-larger-than=1233877270 
-O2" } */
 
 int a;
-char *b = static_cast<char *>(__builtin_alloca (a)); // { dg-warning "argument 
to .alloca. may be too large" }
+char *b = static_cast<char *>(__builtin_alloca (a)); // { dg-warning "argument 
to .alloca. may be too large|unbounded use of" }

Marek


Re: [PATCH] V2: Fix compile warning building testcase

2017-03-08 Thread Sam Thursfield

Hi Ian

Thanks a lot for reviewing the patch. I've attached a new version with 
some comments below.


On 08/03/17 04:03, Ian Lance Taylor via gcc-patches wrote:

Thanks.  The patch submission is fine but 1) you didn't say which
version of GCC you are using; 2) I don't understand why it works.  If
BACKTRACE_SUPPORTED is 0, then I would expect that you would see a
warning for test1, test2, test3, and test4.  Your patch doesn't fix
those, so it sounds like you are only seeing the warning for test5, so
BACKTRACE_SUPPORTED must be 1.  test5 is not compiled if
BACKTRACE_SUPPORTS_DATA is 0, so BACKTRACE_SUPPORTS_DATA must be 1.
So both BACKTRACE_SUPPORTED and BACKTRACE_SUPPORTS_DATA must be 1, so
I don't understand what your patch fixes.


I'm using GCC 6.3.0 here to compile.

I see your point about the fix not making sense. The values of the
defines are this:

#define BACKTRACE_SUPPORTED 0
#define BACKTRACE_SUPPORTS_DATA 1

I noticed that test1(), test2(), test3() and test4() all have the
"unused" attribute:

static int test1 (void) __attribute__ ((noinline, unused));
static inline int test2 (void) __attribute__ ((always_inline, unused));
static int test3 (void) __attribute__ ((noinline, unused));
static inline int test4 (void) __attribute__ ((always_inline, unused));

So that will be why they don't trigger -Wunused-function :-)

It seems to me now that it'd be neater to add that same attribute to 
test5(), instead of using an #ifdef guard. Attached is a new patch that 
does so.


Again, I've tested this using GCC 6.3.0 on powerpc-ibm-aix7.2.0.0.

Sam

--
Sam Thursfield, Codethink Ltd.
Office telephone: +44 161 236 5575
From 87b52dee5ca3b8feca379877c873c22b673f1b7b Mon Sep 17 00:00:00 2001
From: Sam Thursfield 
Date: Wed, 8 Mar 2017 06:48:30 -0600
Subject: [PATCH] libbacktrace: Fix compile warning building testcase

This warning occurred when the BACKTRACE_SUPPORTED flag is defined to 0.

../../../gcc/libbacktrace/btest.c:624:1: error: 'test5' defined but not used [-Werror=unused-function]
 test5 (void)
 ^
cc1: all warnings being treated as errors

The other test functions have the 'unused' attribute, presumably to
avoid similar errors, so let's handle test5() the same way, instead
of using an #ifdef guard.

libbacktrace/:

2017-03-08  Sam Thursfield  

   * btest.c (test5): Replace #ifdef guard with 'unused' attribute
   to fix compile warning when BACKTRACE_SUPPORTED isn't defined.
---
 libbacktrace/btest.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/libbacktrace/btest.c b/libbacktrace/btest.c
index e28a3d8..92cb325 100644
--- a/libbacktrace/btest.c
+++ b/libbacktrace/btest.c
@@ -616,7 +616,7 @@ f33 (int f1line, int f2line)
   return failures;
 }
 
-#if BACKTRACE_SUPPORTS_DATA
+static int test5 (void) __attribute__ ((unused));
 
 int global = 1;
 
@@ -686,8 +686,6 @@ test5 (void)
   return failures;
 }
 
-#endif /* BACKTRACE_SUPPORTS_DATA  */
-
 static void
 error_callback_create (void *data ATTRIBUTE_UNUSED, const char *msg,
 		   int errnum)
-- 
2.2.2



Re: [PATCH] Fix ICE with -Walloca-larger-than=[>INT_MAX] (PR middle-end/79809)

2017-03-08 Thread Andreas Schwab
On Mär 02 2017, Marek Polacek <pola...@redhat.com> wrote:

> diff --git gcc/testsuite/g++.dg/Walloca1.C gcc/testsuite/g++.dg/Walloca1.C
> index e69de29..23b97e8 100644
> --- gcc/testsuite/g++.dg/Walloca1.C
> +++ gcc/testsuite/g++.dg/Walloca1.C
> @@ -0,0 +1,6 @@
> +/* PR middle-end/79809 */
> +/* { dg-do compile } */
> +/* { dg-options "-Walloca-larger-than=4207115063 
> -Wvla-larger-than=1233877270 -O2" } */
> +
> +int a;
> +char *b = static_cast<char *>(__builtin_alloca (a)); // { dg-warning 
> "argument to .alloca. may be too large" }

FAIL: g++.dg/Walloca1.C  -std=gnu++11  (test for warnings, line 6)
FAIL: g++.dg/Walloca1.C  -std=gnu++11 (test for excess errors)
Excess errors:
/daten/aranym/gcc/gcc-20170308/gcc/testsuite/g++.dg/Walloca1.C:6:48: warning: 
unbounded use of 'alloca' [-Walloca-larger-than=]

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH] Loop splitting breaks with loops of pointer type

2017-03-08 Thread Richard Biener
On Wed, Mar 8, 2017 at 11:53 AM, Andrew Haley  wrote:
> Loop splitting is fine when the control variable is of integer type,
> but when it is a pointer type the upper bound of the new loop is
> calculated incorrectly.
> The calculation should be guard_init + (end-beg), but instead we do
> guard_init - (end-beg).
>
> Fixed thusly.  Bootstrapped, regtested.
>
> OK?

Ok.

Thanks,
Richard.

> Andrew.
>
>
> 2017-03-08  Andrew Haley  
>
> PR tree-optimization/79894
> * tree-ssa-loop-split.c (compute_new_first_bound): When
> calculating the new upper bound, (END-BEG) should be added, not
> subtracted.
>
> Index: gcc/tree-ssa-loop-split.c
> ===
> --- gcc/tree-ssa-loop-split.c   (revision 245948)
> +++ gcc/tree-ssa-loop-split.c   (working copy)
> @@ -436,7 +436,6 @@
>if (POINTER_TYPE_P (TREE_TYPE (guard_init)))
>  {
>enddiff = gimple_convert (stmts, sizetype, enddiff);
> -  enddiff = gimple_build (stmts, NEGATE_EXPR, sizetype, enddiff);
>newbound = gimple_build (stmts, POINTER_PLUS_EXPR,
>TREE_TYPE (guard_init),
>guard_init, enddiff);
>
>
> 2017-03-08  Andrew Haley  
>
> PR tree-optimization/79894
> * gcc.dg/tree-ssa/pr79943.c: New test.
>
> Index: gcc/testsuite/gcc.dg/tree-ssa/pr79943.c
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/pr79943.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr79943.c (revision 0)
> @@ -0,0 +1,40 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
> +/* { dg-require-effective-target int32plus } */
> +
> +#ifdef __cplusplus
> +extern "C" void abort (void);
> +#else
> +extern void abort (void);
> +#endif
> +
> +typedef struct {
> +  int n;
> +} region_t;
> +
> +void set (region_t *region) __attribute__((noinline));
> +void doit (region_t *beg, region_t *end, region_t *limit)
> +  __attribute__((noinline));
> +
> +region_t regions[10];
> +
> +void
> +set (region_t *region) {
> +  region->n = 1;
> +}
> +
> +void
> +doit (region_t *beg, region_t *end, region_t *limit) {
> +  for (region_t *cur = beg; cur < end; cur++) {
> +if (cur < limit) {
> +  set(cur);
> +}
> +  }
> +}
> +
> +int
> +main (void) {
> +  doit(&regions[0], &regions[2], &regions[10]);
> +  if (regions[1].n != 1)
> +abort();
> +}


Re: [PATCH 2/5] Get bounds for a PARM_DECL (PR ipa/79761).

2017-03-08 Thread Richard Biener
On Tue, Mar 7, 2017 at 5:07 PM, Martin Liška  wrote:
> On 03/07/2017 03:57 PM, Richard Biener wrote:
>> On Thu, Mar 2, 2017 at 6:06 PM, marxin  wrote:
>>> gcc/ChangeLog:
>>>
>>> 2017-03-06  Martin Liska  
>>>
>>> PR ipa/79761
>>> * tree-chkp.c (chkp_get_bound_for_parm): Get bounds for a param.
>>> (chkp_find_bounds_1): Remove gcc_unreachable.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2017-03-06  Martin Liska  
>>>
>>> PR ipa/79761
>>> * g++.dg/pr79761.C: New test.
>>> ---
>>>  gcc/testsuite/g++.dg/pr79761.C | 34 ++
>>>  gcc/tree-chkp.c|  3 +--
>>>  2 files changed, 35 insertions(+), 2 deletions(-)
>>>  create mode 100644 gcc/testsuite/g++.dg/pr79761.C
>>>
>>> diff --git a/gcc/testsuite/g++.dg/pr79761.C b/gcc/testsuite/g++.dg/pr79761.C
>>> new file mode 100644
>>> index 000..b1f92d2b036
>>> --- /dev/null
>>> +++ b/gcc/testsuite/g++.dg/pr79761.C
>>> @@ -0,0 +1,34 @@
>>> +/* { dg-do compile { target { ! x32 } } } */
>>> +/* { dg-options "-fcheck-pointer-bounds -mmpx -mabi=ms" } */
>>> +
>>> +struct Foo
>>> +{
>>> +  Foo() : a(1), b(1), c('a') {}
>>> +  int a;
>>> +  int b;
>>> +  char c;
>>> +};
>>> +
>>> +static Foo copy_foo(Foo) __attribute__((noinline, noclone));
>>> +
>>> +static Foo copy_foo(Foo A)
>>> +{
>>> +  return A;
>>> +}
>>> +
>>> +struct Bar : Foo
>>> +{
>>> +  Bar(Foo t) : Foo(copy_foo(t)) {}
>>> +};
>>> +
>>> +Foo F;
>>> +
>>> +int main (void)
>>> +{
>>> +  Bar B (F);
>>> +
>>> +  if (B.a != 1 || B.b != 1 || B.c != 'a')
>>> +__builtin_abort ();
>>> +
>>> +  return 0;
>>> +}
>>> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
>>> index 3d497f51ed8..d5683b1b9cf 100644
>>> --- a/gcc/tree-chkp.c
>>> +++ b/gcc/tree-chkp.c
>>> @@ -2353,7 +2353,7 @@ chkp_get_next_bounds_parm (tree parm)
>>>  static tree
>>>  chkp_get_bound_for_parm (tree parm)
>>>  {
>>> -  tree decl = SSA_NAME_VAR (parm);
>>> +  tree decl = TREE_CODE (parm) == PARM_DECL ? parm : SSA_NAME_VAR (parm);
>>>tree bounds;
>>>
>>>gcc_assert (TREE_CODE (decl) == PARM_DECL);
>>> @@ -3602,7 +3602,6 @@ chkp_find_bounds_1 (tree ptr, tree ptr_src, 
>>> gimple_stmt_iterator *iter)
>>>break;
>>>
>>>  case PARM_DECL:
>>> -  gcc_unreachable ();
>>>bounds = chkp_get_bound_for_parm (ptr_src);
>>
>> But this is just useless work ... just do
>>
>>   case PARM_DECL:
>>  /* Handled above but failed.  */
>>  break;
>
> Ok, let's return invalid bounds. Please see updated patch.

Ok.

> Martin
>
>>
>> the SSA_NAME case is similarly redundantly calling 
>> chkp_get_registered_bounds.
>>
>> Richard.
>>
>>>break;
>>>
>>> --
>>> 2.11.1
>>>
>>>
>


Re: [PATCH 1/5] Fix *_CST ICEs connected to MPX.

2017-03-08 Thread Richard Biener
On Tue, Mar 7, 2017 at 5:01 PM, Martin Liška  wrote:
> On 03/07/2017 03:53 PM, Richard Biener wrote:
>> On Tue, Mar 7, 2017 at 3:48 PM, Martin Liška  wrote:
>>> On 03/07/2017 11:17 AM, Rainer Orth wrote:
 marxin  writes:

> diff --git a/gcc/testsuite/g++.dg/pr79769.C 
> b/gcc/testsuite/g++.dg/pr79769.C
> new file mode 100644
> index 000..f9223db1b2d
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr79769.C
> @@ -0,0 +1,4 @@
> +/* { dg-do compile { target { ! x32 } } } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx -mabi=ms" } */

 ... and again: make this x86-only.

   Rainer

>>>
>>> Thanks. I'm sending v2 of the patch.
>>
>> Hmm, not sure why we should handle REAL_CST here explicitly, for example.
>>
>> Why not, instead of internal_error in the default: case do
>>
>>   bounds = chkp_get_invalid_op_bounds ();
>
> Because chkp_get_invalid_op_bounds() returns bounds that are always valid,
> and as it's a security extension, I would be strict here in order not to
> handle something that can bypass the checking.
>
>>
>> there?  For the testcase why do we invoke chkp_find_bounds_1 on sth that is
>> a REAL_CST for example?
>
> It's called when setting bounds in a call expr:
>
> #0  chkp_find_bounds_1 (ptr=0x76a03720, ptr_src=0x76a03720, 
> iter=0x7fffd5d0) at ../../gcc/tree-chkp.c:3734
> #1  0x00ec7c7d in chkp_find_bounds (ptr=0x76a03720, 
> iter=0x7fffd5d0) at ../../gcc/tree-chkp.c:3768
> #2  0x00ec22e1 in chkp_add_bounds_to_call_stmt (gsi=0x7fffd5d0) 
> at ../../gcc/tree-chkp.c:1901
> #3  0x00ec9a1a in chkp_instrument_function () at 
> ../../gcc/tree-chkp.c:4344
> #4  0x00eca4cb in chkp_execute () at ../../gcc/tree-chkp.c:4528

So this happens for calls with mismatched arguments?  I believe that

  if (BOUNDED_TYPE_P (type)
  || pass_by_reference (NULL, TYPE_MODE (type), type, true))
new_args.safe_push (chkp_find_bounds (call_arg, gsi));

may be misguided given pass_by_reference is not exposed at GIMPLE and thus
for the pass_by_reference case it should fall back to
"chkp_get_invalid_op_bounds"
and thus eventually pass this down as a flag to chkp_find_bounds (or check
on call_arg again).

Note that here 'type' and call_arg may not match (asking for trouble).  Plus
'fntype' is computed bogusly (should use gimple_call_fntype).

It later does sth like

  /* For direct calls fndecl is replaced with instrumented version.  */
  if (fndecl)
{
  tree new_decl = chkp_maybe_create_clone (fndecl)->decl;
  gimple_call_set_fndecl (new_call, new_decl);
  /* In case of a type cast we should modify used function
 type instead of using type of new fndecl.  */
  if (gimple_call_fntype (call) != TREE_TYPE (fndecl))
{
  tree type = gimple_call_fntype (call);
  type = chkp_copy_function_type_adding_bounds (type);
  gimple_call_set_fntype (new_call, type);
}
  else
gimple_call_set_fntype (new_call, TREE_TYPE (new_decl));

but obviously if gimple_call_fntype and fndecl don't match cloning
the mismatched decl isn't of very much help ... (it may just generate
more mismatches with mismatching bounds...).

> ...
>
> Martin
>
>>
>> Richard.
>>
>>> Martin
>


[PATCH] Loop splitting breaks with loops of pointer type

2017-03-08 Thread Andrew Haley
Loop splitting is fine when the control variable is of integer type,
but when it is a pointer type the upper bound of the new loop is
calculated incorrectly.
The calculation should be guard_init + (end-beg), but instead we do
guard_init - (end-beg).

Fixed thusly.  Bootstrapped, regtested.

OK?

Andrew.


2017-03-08  Andrew Haley  

PR tree-optimization/79894
* tree-ssa-loop-split.c (compute_new_first_bound): When
calculating the new upper bound, (END-BEG) should be added, not
subtracted.

Index: gcc/tree-ssa-loop-split.c
===
--- gcc/tree-ssa-loop-split.c   (revision 245948)
+++ gcc/tree-ssa-loop-split.c   (working copy)
@@ -436,7 +436,6 @@
   if (POINTER_TYPE_P (TREE_TYPE (guard_init)))
 {
   enddiff = gimple_convert (stmts, sizetype, enddiff);
-  enddiff = gimple_build (stmts, NEGATE_EXPR, sizetype, enddiff);
   newbound = gimple_build (stmts, POINTER_PLUS_EXPR,
   TREE_TYPE (guard_init),
   guard_init, enddiff);


2017-03-08  Andrew Haley  

PR tree-optimization/79894
* gcc.dg/tree-ssa/pr79943.c: New test.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr79943.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr79943.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr79943.c (revision 0)
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
+/* { dg-require-effective-target int32plus } */
+
+#ifdef __cplusplus
+extern "C" void abort (void);
+#else
+extern void abort (void);
+#endif
+
+typedef struct {
+  int n;
+} region_t;
+
+void set (region_t *region) __attribute__((noinline));
+void doit (region_t *beg, region_t *end, region_t *limit)
+  __attribute__((noinline));
+
+region_t regions[10];
+
+void
+set (region_t *region) {
+  region->n = 1;
+}
+
+void
+doit (region_t *beg, region_t *end, region_t *limit) {
+  for (region_t *cur = beg; cur < end; cur++) {
+if (cur < limit) {
+  set(cur);
+}
+  }
+}
+
+int
+main (void) {
+  doit(&regions[0], &regions[2], &regions[10]);
+  if (regions[1].n != 1)
+abort();
+}


Re: stabilize store merging

2017-03-08 Thread Richard Biener
On Wed, Mar 8, 2017 at 12:58 AM, Alexandre Oliva  wrote:
> Don't let pointer randomization change the order in which we process
> store chains.  This may cause SSA_NAMEs to be released in different
> order, and if they're reused later, they may cause differences in SSA
> partitioning, leading to differences in expand, and ultimately to
> different code.
>
> bootstrap-debug-lean (-fcompare-debug) on i686-linux-gnu has failed in
> haifa-sched.c since r245196 exposed the latent ordering problem in
> store merging.  In this case, the IR differences (different SSA names
> selected for copies in out-of-SSA, resulting in some off-by-one
> differences in pseudos) were not significant enough to be visible in
> the compiler output.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

Any reason you chose to maintain a new hash-map instead of at
interesting places gather to-be-processed chains into a vector and
sort that after IDs?  Traversing the hash-map still doesn't get you
a reliable order but only one depending on the chain creation and
thus stmt walk order - plus the hash function and size - which may be
good enough
to fix the issue at hand of course.  But it will for example still make
testcase reduction fragile if shrinking the hash-map by removing unrelated
code ends up processing things in different order.

So - if we fix this can we fix it by properly sorting things?

Thanks,
Richard.

> for  gcc/ChangeLog
>
> * gimple-ssa-store-merging.c (INCLUDE_MAP): Define.
> (struct imm_store_chain_info): Add seqno field and ctor parm.
> (class pass_store_merging): Add m_store_seq field.
> (pass_store_merging::terminate_and_process_all_chains):
> Iterate over m_store_seq.
> (pass_store_merging::terminate_all_aliasing_chains):
> Likewise.
> (pass_store_merging::terminate_and_release_chain):
> Erase the m_store_seq entry.
> (pass_store_merging::execute): Check for debug stmts first.
> Compute seqno, add the m_store_seq entry.
> ---
>  gcc/gimple-ssa-store-merging.c |   58 
> +---
>  1 file changed, 42 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
> index 17ac94a..0bf7d89 100644
> --- a/gcc/gimple-ssa-store-merging.c
> +++ b/gcc/gimple-ssa-store-merging.c
> @@ -103,6 +103,7 @@
>[p + 4B] (16-bit) := 0xabcd; //  val & 0x;  */
>
>  #include "config.h"
> +#define INCLUDE_MAP
>  #include "system.h"
>  #include "coretypes.h"
>  #include "backend.h"
> @@ -675,11 +676,12 @@ merged_store_group::apply_stores ()
>
>  struct imm_store_chain_info
>  {
> +  unsigned seqno;
>tree base_addr;
>auto_vec m_store_info;
>auto_vec m_merged_store_groups;
>
> -  imm_store_chain_info (tree b_a) : base_addr (b_a) {}
> +  imm_store_chain_info (unsigned seq, tree b_a) : seqno (seq), base_addr 
> (b_a) {}
>bool terminate_and_process_chain ();
>bool coalesce_immediate_stores ();
>bool output_merged_store (merged_store_group *);
> @@ -718,6 +720,18 @@ public:
>  private:
>hash_map m_stores;
>
> +  /* Use m_store_seq to order elements in m_stores, and iterate over
> + them in a predictable way.  It maps sequence numbers to the base
> + addresses stored as keys in m_stores.  Using this order avoids
> + extraneous differences in the compiler output just because of
> + tree pointer variations (e.g. different chains end up in
> + different positions of m_stores, so they are handled in different
> + orders, so they allocate or release SSA names in different
> + orders, and when they get reused, subsequent passes end up
> + getting different SSA names, which may ultimately change
> + decisions when going out of SSA).  */
> +  std::map m_store_seq;
> +
>bool terminate_and_process_all_chains ();
>bool terminate_all_aliasing_chains (imm_store_chain_info **,
>   bool, gimple *);
> @@ -730,11 +744,16 @@ private:
>  bool
>  pass_store_merging::terminate_and_process_all_chains ()
>  {
> -  hash_map::iterator iter
> -= m_stores.begin ();
>bool ret = false;
> -  for (; iter != m_stores.end (); ++iter)
> -ret |= terminate_and_release_chain ((*iter).second);
> +  for (std::map::iterator next = m_store_seq.begin (),
> +iter = next; iter != m_store_seq.end (); iter = next)
> +{
> +  next++;
> +  tree base_addr = (*iter).second;
> +  ret |= terminate_and_release_chain (*m_stores.get (base_addr));
> +}
> +  gcc_assert (m_stores.elements () == 0);
> +  gcc_assert (m_store_seq.empty ());
>
>return ret;
>  }
> @@ -799,15 +818,17 @@ pass_store_merging::terminate_all_aliasing_chains 
> (imm_store_chain_info
> }
>  }
>
> -  

[PATCH][ARM] PR target/79911: Invalid vec_select arguments

2017-03-08 Thread Kyrill Tkachov

Hi all,

This patch fixes the NEON patterns flagged by Jakub's genrecog verification
improvements:
../../gcc/config/arm/neon.md:1338:1: element mode mismatch between vec_select 
HImode and its operand QImode
../../gcc/config/arm/neon.md:1338:1: element mode mismatch between vec_select 
SImode and its operand QImode
../../gcc/config/arm/neon.md:1338:1: element mode mismatch between vec_select 
QImode and its operand HImode
../../gcc/config/arm/neon.md:1338:1: element mode mismatch between vec_select 
SImode and its operand HImode
../../gcc/config/arm/neon.md:1338:1: element mode mismatch between vec_select 
QImode and its operand SImode
../../gcc/config/arm/neon.md:1338:1: element mode mismatch between vec_select 
HImode and its operand SImode
../../gcc/config/arm/neon.md:1353:1: element mode mismatch between vec_select 
HImode and its operand QImode
../../gcc/config/arm/neon.md:1353:1: element mode mismatch between vec_select 
SImode and its operand QImode
../../gcc/config/arm/neon.md:1353:1: element mode mismatch between vec_select 
QImode and its operand HImode
../../gcc/config/arm/neon.md:1353:1: element mode mismatch between vec_select 
SImode and its operand HImode
../../gcc/config/arm/neon.md:1353:1: element mode mismatch between vec_select 
QImode and its operand SImode
../../gcc/config/arm/neon.md:1353:1: element mode mismatch between vec_select 
HImode and its operand SImode
../../gcc/config/arm/neon.md:1407:1: element mode mismatch between vec_select 
HImode and its operand QImode
../../gcc/config/arm/neon.md:1407:1: element mode mismatch between vec_select 
SImode and its operand QImode
../../gcc/config/arm/neon.md:1407:1: element mode mismatch between vec_select 
QImode and its operand HImode
../../gcc/config/arm/neon.md:1407:1: element mode mismatch between vec_select 
SImode and its operand HImode
../../gcc/config/arm/neon.md:1407:1: element mode mismatch between vec_select 
QImode and its operand SImode
../../gcc/config/arm/neon.md:1407:1: element mode mismatch between vec_select 
HImode and its operand SImode
../../gcc/config/arm/neon.md:1422:1: element mode mismatch between vec_select 
HImode and its operand QImode
../../gcc/config/arm/neon.md:1422:1: element mode mismatch between vec_select 
SImode and its operand QImode
../../gcc/config/arm/neon.md:1422:1: element mode mismatch between vec_select 
QImode and its operand HImode
../../gcc/config/arm/neon.md:1422:1: element mode mismatch between vec_select 
SImode and its operand HImode
../../gcc/config/arm/neon.md:1422:1: element mode mismatch between vec_select 
QImode and its operand SImode
../../gcc/config/arm/neon.md:1422:1: element mode mismatch between vec_select 
HImode and its operand SImode

The problem is that the affected patterns use both the VQI iterator [V16QI V8HI 
V4SI] and the
VW iterator [V8QI V4HI V2SI]. The intent was to use V8QI when VQI returned 
V16QI, V4HI when VQI returned V8HI etc.
But using both iterators at the same time creates a Cartesian product of
combinations, so the iterators expanded to some patterns that are not valid,
that is, the inner mode of the vec_select mode is not the same as the inner
mode of its vector argument.

This patch fixes that by using the appropriate V_HALF mode attribute instead of 
the VW iterator, so VQI is the only
real mode iterator used in these patterns. I took care to leave the expanded 
name of each pattern the same so that
their gen_* invocations from widen_ssum3 and widen_usum3 remain the 
same.

With this patch the arm backend passes the genrecog validations.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill


2017-03-08  Kyrylo Tkachov  

PR target/79911
* config/arm/neon.md (vec_sel_widen_ssum_lo3):
Rename to...
(vec_sel_widen_ssum_lo3): ... This. Avoid mismatch
between vec_select and vector argument.
(vec_sel_widen_ssum_hi3): Rename to...
(vec_sel_widen_ssum_hi3): ... This. Likewise.
(vec_sel_widen_usum_lo3): Rename to...
(vec_sel_widen_usum_lo3): ... This.
(vec_sel_widen_usum_hi3): Rename to...
(vec_sel_widen_usum_hi3): ... This.
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index cf281df..50d89eb 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1335,14 +1335,14 @@ (define_expand "widen_ssum3"
   }
 )
 
-(define_insn "vec_sel_widen_ssum_lo3"
-  [(set (match_operand: 0 "s_register_operand" "=w")
-	(plus:
-	 (sign_extend:
-	  (vec_select:VW
+(define_insn "vec_sel_widen_ssum_lo3"
+  [(set (match_operand: 0 "s_register_operand" "=w")
+	(plus:
+	 (sign_extend:
+	  (vec_select:
 	   (match_operand:VQI 1 "s_register_operand" "%w")
 	   (match_operand:VQI 2 "vect_par_constant_low" "")))
-	 (match_operand: 3 "s_register_operand" "0")))]
+	 (match_operand: 3 "s_register_operand" "0")))]
   "TARGET_NEON"
 {
   return BYTES_BIG_ENDIAN ?  "vaddw.\t%q0, %q3, %f1" :
@@ -1350,13 +1350,14 @@ (define_insn "vec_sel_widen_ssum_lo3"
 }
   [(set_attr 

Re: [PATCH] Fix PRs 79955 and 79956 (hopefully)

2017-03-08 Thread Jakub Jelinek
On Wed, Mar 08, 2017 at 11:12:47AM +0100, Richard Biener wrote:
> 
> The following avoids warning about completely out-of-bound accesses
> as uninitialized.
> 
> For GCC8 we likely want to enhance path isolation to catch those
> cases (and eventually issue a diagnostic about them).  And we
> finally want to fix complete peeling to not introduce those
> accesses.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

LGTM, thanks.

> 2017-03-08  Richard Biener  
> 
>   PR tree-optimization/79955
>   PR tree-optimization/79956
>   * tree-ssa-uninit.c (warn_uninitialized_vars): Do not warn
>   for accesses that are completely outside of the variable.
> 
>   * gcc.dg/uninit-24.c: New testcase.

Jakub


[PATCH] Fix PRs 79955 and 79956 (hopefully)

2017-03-08 Thread Richard Biener

The following avoids warning about completely out-of-bound accesses
as uninitialized.

For GCC8 we likely want to enhance path isolation to catch those
cases (and eventually issue a diagnostic about them).  And we
finally want to fix complete peeling to not introduce those
accesses.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-03-08  Richard Biener  

PR tree-optimization/79955
PR tree-optimization/79956
* tree-ssa-uninit.c (warn_uninitialized_vars): Do not warn
for accesses that are completely outside of the variable.

* gcc.dg/uninit-24.c: New testcase.

Index: gcc/tree-ssa-uninit.c
===
--- gcc/tree-ssa-uninit.c   (revision 245968)
+++ gcc/tree-ssa-uninit.c   (working copy)
@@ -287,6 +287,17 @@ warn_uninitialized_vars (bool warn_possi
  || TREE_NO_WARNING (base))
continue;
 
+ /* Do not warn if the access is fully outside of the
+variable.  */
+ if (ref.size != -1
+ && ref.max_size == ref.size
+ && (ref.offset + ref.size <= 0
+ || (ref.offset >= 0
+ && TREE_CODE (DECL_SIZE (base)) == INTEGER_CST
+ && compare_tree_int (DECL_SIZE (base),
+  ref.offset) <= 0)))
+   continue;
+
  /* Limit the walking to a constant number of stmts after
 we overcommit quadratic behavior for small functions
 and O(n) behavior.  */
Index: gcc/testsuite/gcc.dg/uninit-24.c
===
--- gcc/testsuite/gcc.dg/uninit-24.c(nonexistent)
+++ gcc/testsuite/gcc.dg/uninit-24.c(working copy)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O -Wmaybe-uninitialized" } */
+
+int foo (int x)
+{
+  int y;
+  if (x)
+  return *(&y + 1); /* { dg-bogus "may be used uninitialized" } */
+  return 0;
+}


Re: [PATCH] Small ubsan vector arith optimization to fix one testcase from PR sanitizer/79904

2017-03-08 Thread Marek Polacek
On Wed, Mar 08, 2017 at 11:06:38AM +0100, Jakub Jelinek wrote:
> On Wed, Mar 08, 2017 at 11:03:53AM +0100, Richard Biener wrote:
> > On Wed, 8 Mar 2017, Jakub Jelinek wrote:
> > 
> > > On Wed, Mar 08, 2017 at 09:15:05AM +0100, Richard Biener wrote:
> > > > Ok.  Note that another option for the loopy case is to do
> > > > 
> > > >   for (;;)
> > > > {
> > > >   vec >> by-one-elt;
> > > >   elt = BIT_FIELD_REF ;
> > > > }
> > > 
> > > Indeed, that is a possibility, but I guess I'd need to construct
> > > the result similarly if resv is non-NULL.  Also, not sure about big endian
> > > vectors and whether BIT_FIELD_REF with zero or size - elt_size is
> > > more efficient there.
> > > 
> > > In any case, the PR was about s390 without vectors enabled, so this 
> > > wouldn't
> > > apply.
> > > 
> > > > when whole-vector shifts are available (they are constructed by
> > > > VEC_PERM_EXPR if vec_perm_const_ok for that mask).  If you end up
> > > > doing variable-index array accesses you can as well spill the
> > > > vector to memory and use memory accesses on that.  Not sure how
> > > > to arrange that from this part of the expander.
> > > 
> > > Shouldn't something else during the expansion force it into memory if it 
> > > is
> > > more efficient to expand it that way?  Apparently it is forced into memory
> > 
> > Possibly - but it might end up spilling in the loop itself and thus be
> > rather inefficient?
> 
> Ok.  That said, this is -fsanitize=undefined which slows down code anyway,
> so making it more efficient can be done in GCC8 IMNSHO.

Indeed -- I'd push out that to 8.

Marek


Re: [PATCH] Small ubsan vector arith optimization to fix one testcase from PR sanitizer/79904

2017-03-08 Thread Jakub Jelinek
On Wed, Mar 08, 2017 at 11:03:53AM +0100, Richard Biener wrote:
> On Wed, 8 Mar 2017, Jakub Jelinek wrote:
> 
> > On Wed, Mar 08, 2017 at 09:15:05AM +0100, Richard Biener wrote:
> > > Ok.  Note that another option for the loopy case is to do
> > > 
> > >   for (;;)
> > > {
> > >   vec >> by-one-elt;
> > >   elt = BIT_FIELD_REF ;
> > > }
> > 
> > Indeed, that is a possibility, but I guess I'd need to construct
> > the result similarly if resv is non-NULL.  Also, not sure about big endian
> > vectors and whether BIT_FIELD_REF with zero or size - elt_size is
> > more efficient there.
> > 
> > In any case, the PR was about s390 without vectors enabled, so this wouldn't
> > apply.
> > 
> > > when whole-vector shifts are available (they are constructed by
> > > VEC_PERM_EXPR if vec_perm_const_ok for that mask).  If you end up
> > > doing variable-index array accesses you can as well spill the
> > > vector to memory and use memory accesses on that.  Not sure how
> > > to arrange that from this part of the expander.
> > 
> > Shouldn't something else during the expansion force it into memory if it is
> > more efficient to expand it that way?  Apparently it is forced into memory
> 
> Possibly - but it might end up spilling in the loop itself and thus be
> rather inefficient?

Ok.  That said, this is -fsanitize=undefined which slows down code anyway,
so making it more efficient can be done in GCC8 IMNSHO.

Jakub


Re: [PATCH] Small ubsan vector arith optimization to fix one testcase from PR sanitizer/79904

2017-03-08 Thread Richard Biener
On Wed, 8 Mar 2017, Jakub Jelinek wrote:

> On Wed, Mar 08, 2017 at 09:15:05AM +0100, Richard Biener wrote:
> > Ok.  Note that another option for the loopy case is to do
> > 
> >   for (;;)
> > {
> >   vec >> by-one-elt;
> >   elt = BIT_FIELD_REF ;
> > }
> 
> Indeed, that is a possibility, but I guess I'd need to construct
> the result similarly if resv is non-NULL.  Also, not sure about big endian
> vectors and whether BIT_FIELD_REF with zero or size - elt_size is
> more efficient there.
> 
> In any case, the PR was about s390 without vectors enabled, so this wouldn't
> apply.
> 
> > when whole-vector shifts are available (they are constructed by
> > VEC_PERM_EXPR if vec_perm_const_ok for that mask).  If you end up
> > doing variable-index array accesses you can as well spill the
> > vector to memory and use memory accesses on that.  Not sure how
> > to arrange that from this part of the expander.
> 
> Shouldn't something else during the expansion force it into memory if it is
> more efficient to expand it that way?  Apparently it is forced into memory

Possibly - but it might end up spilling in the loop itself and thus be
rather inefficient?

> on s390 and the ICE is that the backend doesn't like something on it.

Could be - as I said I didn't look into what the ICE actually is.

Richard.


[PATCH] For broken exception handling in GDB on AIX platform

2017-03-08 Thread Nitish Kumar Mishra
Hi,

I got some review comments from Bernhard Reutner-Fischer, and I have
updated the patch accordingly.
This patch is for the bug opened here:
https://sourceware.org/bugzilla/show_bug.cgi?id=21187

Please find the attachment below.

Thanks and Regards,
Nitish K Mishra.
diff --git a/configure.ac b/configure.ac
index 3ec86c1..c400251 100644
--- a/configure.ac
+++ b/configure.ac
@@ -471,6 +471,13 @@ ENABLE_LIBSTDCXX=default)
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
 fi]
 
+AC_ARG_ENABLE(staticlib,
+AS_HELP_STRING([--disable-staticlib],
+  [do not link libstdc++ and libgcc library statically, default is static linking]),
+ENABLE_STATICLIB=$enableval,
+ENABLE_STATICLIB=yes)
+
+
 # If this is accelerator compiler and its target is intelmic we enable
 # target liboffloadmic by default.  If this is compiler with offloading
 # for intelmic we enable host liboffloadmic by default.  Otherwise
@@ -1406,9 +1413,10 @@ if test -z "$LD"; then
   fi
 fi
 
-# Check whether -static-libstdc++ -static-libgcc is supported.
+# If ENABLE_STATICLIB is set for configuration, check whether -static-libstdc++ -static-libgcc is supported
 have_static_libs=no
-if test "$GCC" = yes; then
+if test "$ENABLE_STATICLIB" = yes; then
+ if test "$GCC" = yes; then
   saved_LDFLAGS="$LDFLAGS"
 
   LDFLAGS="$LDFLAGS -static-libstdc++ -static-libgcc"
@@ -1424,6 +1432,7 @@ int main() {}],
   AC_LANG_POP(C++)
 
   LDFLAGS="$saved_LDFLAGS"
+ fi
 fi
 
 ACX_PROG_GNAT
@@ -1741,6 +1750,9 @@ AC_ARG_WITH(stage1-ldflags,
  # trust that they are doing what they want.
  if test "$stage1_libs" = "" -a "$have_static_libs" = yes; then
stage1_ldflags="-static-libstdc++ -static-libgcc"
+   else
+  # If static lib is disabled.
+   stage1_ldflags=""
  fi])
 AC_SUBST(stage1_ldflags)
 
@@ -1768,8 +1780,11 @@ AC_ARG_WITH(boot-ldflags,
  # In stages 2 and 3, default to linking libstdc++ and libgcc
  # statically.  But if the user explicitly specified the libraries to
  # use, trust that they are doing what they want.
- if test "$poststage1_libs" = ""; then
+ if test "$poststage1_libs" = "" -a "$have_static_libs" = yes; then
poststage1_ldflags="-static-libstdc++ -static-libgcc"
+   else
+   # If static library linking is disabled.
+   poststage1_ldflags=""
  fi])
 AC_SUBST(poststage1_ldflags)
 


[Ada] Fix PR ada/79903

2017-03-08 Thread Eric Botcazou
This (apparently) fixes the build of the runtime on RTEMS.

Tested on x86-64/Linux, applied on mainline and 6 branch.


2017-03-08  Thanassis Tsiodras  

PR ada/79903
* socket.c (__gnat_gethostbyaddr): Add missing test for __rtems__.

-- 
Eric Botcazou

Index: socket.c
===
--- socket.c	(revision 245628)
+++ socket.c	(working copy)
@@ -202,7 +202,7 @@ __gnat_gethostbyaddr (const char *addr,
   struct hostent *rh;
   int ri;
 
-#if defined(__linux__) || defined(__GLIBC__)
+#if defined(__linux__) || defined(__GLIBC__) || defined(__rtems__)
   (void) gethostbyaddr_r (addr, len, type, ret, buf, buflen, &rh, h_errnop);
 #else
   rh = gethostbyaddr_r (addr, len, type, ret, buf, buflen, h_errnop);


Re: [PATCH] Small ubsan vector arith optimization to fix one testcase from PR sanitizer/79904

2017-03-08 Thread Jakub Jelinek
On Wed, Mar 08, 2017 at 09:15:05AM +0100, Richard Biener wrote:
> Ok.  Note that another option for the loopy case is to do
> 
>   for (;;)
> {
>   vec >> by-one-elt;
>   elt = BIT_FIELD_REF ;
> }

Indeed, that is a possibility, but I guess I'd need to construct
the result similarly if resv is non-NULL.  Also, not sure about big endian
vectors and whether BIT_FIELD_REF with zero or size - elt_size is
more efficient there.

In any case, the PR was about s390 without vectors enabled, so this wouldn't
apply.

> when whole-vector shifts are available (they are constructed by
> VEC_PERM_EXPR if vec_perm_const_ok for that mask).  If you end up
> doing variable-index array accesses you can as well spill the
> vector to memory and use memory accesses on that.  Not sure how
> to arrange that from this part of the expander.

Shouldn't something else during the expansion force it into memory if it is
more efficient to expand it that way?  Apparently it is forced into memory
on s390 and the ICE is that the backend doesn't like something on it.

Jakub


[Ada] Fix PR ada/79945

2017-03-08 Thread Eric Botcazou
This adjusts the declaration of Default_Bit_Order for PowerPC/Linux to handle 
the new little-endian variant.  And, on mainline only, this merges system.ads 
files for the big-endian & little-endian variants of ARM/Linux and MIPS/Linux, 
since their only difference is precisely the Default_Bit_Order setting.

Tested on x86-64/Linux and PowerPC64/Linux, applied on mainline and 6 branch.


2017-03-08  Eric Botcazou  

PR ada/79945
* system-linux-ppc.ads (Default_Bit_Order): Use Standard's setting.

* system-linux-arm.ads (Default_Bit_Order): Likewise.
* system-linux-mips.ads (Default_Bit_Order): Likewise.
* system-linux-armeb.ads: Delete.
* system-linux-mipsel.ads: Likewise.
* gcc-interface/Makefile.in (MIPS/Linux): Adjust.
(ARM/Linux): Likewise.

-- 
Eric Botcazou

Index: gcc-interface/Makefile.in
===
--- gcc-interface/Makefile.in	(revision 245767)
+++ gcc-interface/Makefile.in	(working copy)
@@ -1871,12 +1871,7 @@ ifeq ($(strip $(filter-out mips% linux%,
   s-taspri.adshttp://www.gnu.org/licenses/>.  --
---  --
--- GNAT was originally developed  by the GNAT team at  New York University. --
--- Extensive contributions were provided by Ada Core Technologies Inc.  --
---  --
---
-
-package System is
-   pragma Pure;
-   --  Note that we take advantage of the implementation permission to make
-   --  this unit Pure instead of Preelaborable; see RM 13.7.1(15). In Ada
-   --  2005, this is Pure in any case (AI-362).
-
-   pragma No_Elaboration_Code_All;
-   --  Allow the use of that restriction in units that WITH this unit
-
-   type Name is (SYSTEM_NAME_GNAT);
-   System_Name : constant Name := SYSTEM_NAME_GNAT;
-
-   --  System-Dependent Named Numbers
-
-   Min_Int   : constant := Long_Long_Integer'First;
-   Max_Int   : constant := Long_Long_Integer'Last;
-
-   Max_Binary_Modulus: constant := 2 ** Long_Long_Integer'Size;
-   Max_Nonbinary_Modulus : constant := 2 ** Integer'Size - 1;
-
-   Max_Base_Digits   : constant := Long_Long_Float'Digits;
-   Max_Digits: constant := Long_Long_Float'Digits;
-
-   Max_Mantissa  : constant := 63;
-   Fine_Delta: constant := 2.0 ** (-Max_Mantissa);
-
-   Tick  : constant := 0.000_001;
-
-   --  Storage-related Declarations
-
-   type Address is private;
-   pragma Preelaborable_Initialization (Address);
-   Null_Address : constant Address;
-
-   Storage_Unit : constant := 8;
-   Word_Size: constant := Standard'Word_Size;
-   Memory_Size  : constant := 2 ** Word_Size;
-
-   --  Address comparison
-
-   function "<"  (Left, Right : Address) return Boolean;
-   function "<=" (Left, Right : Address) return Boolean;
-   function ">"  (Left, Right : Address) return Boolean;
-   function ">=" (Left, Right : Address) return Boolean;
-   function "="  (Left, Right : Address) return Boolean;
-
-   pragma Import (Intrinsic, "<");
-   pragma Import (Intrinsic, "<=");
-   pragma Import (Intrinsic, ">");
-   pragma Import (Intrinsic, ">=");
-   pragma Import (Intrinsic, "=");
-
-   --  Other System-Dependent Declarations
-
-   type Bit_Order is (High_Order_First, Low_Order_First);
-   Default_Bit_Order : constant Bit_Order := High_Order_First;
-   pragma Warnings (Off, Default_Bit_Order); -- kill constant condition warning
-
-   --  Priority-related Declarations (RM D.1)
-
-   --  0 .. 98 corresponds to the system priority range 1 .. 99.
-   --
-   --  If the scheduling policy is SCHED_FIFO or SCHED_RR the runtime makes use
-   --  of the entire range provided by the system.
-   --
-   --  If the scheduling policy is SCHED_OTHER the only valid system priority
-   --  is 1 and other values are simply ignored.
-
-   Max_Priority   : constant Positive := 97;
-   Max_Interrupt_Priority : constant Positive := 98;
-
-   subtype Any_Priority   is Integer  range  0 .. 98;
-   subtype Priority   is Any_Priority range  0 .. 97;
-   subtype Interrupt_Priority is Any_Priority range 98 .. 98;
-
-   Default_Priority : constant Priority := 48;
-
-private
-
-   type Address is mod Memory_Size;
-   Null_Address : constant Address := 0;
-
-   --------------------------------------
-   -- System Implementation Parameters --
-   --------------------------------------
-
-   --  These parameters provide information about the target that is used
-   --  by the compiler. They are in the private part of System, where they
-   --  can be accessed using the special circuitry in the Targparm unit
-   --  whose source should be consulted for more detailed descriptions
-   --  of the individual switch 

[committed] Fix avr build

2017-03-08 Thread Jakub Jelinek
Hi!

Richard reported that after my PR79345 r245833 fix avr no longer builds:
../../gcc/config/avr/avr.md:1060:1: define_expand for setmemhi needs to have match_scratch numbers above all other operands
The following patch ought to fix that, committed as obvious to trunk.
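For readers unfamiliar with the rule the error message states: in a define_expand, every match_scratch must carry a number higher than all other operand numbers, so a match_dup may only name one of the lower-numbered non-scratch operands. A minimal sketch of the shape of the fix (hypothetical pattern fragment, not the full avr setmemhi expander):

```lisp
;; Invalid: the scratch is operand 4, but match_dup 5 names a
;; non-scratch operand with a higher number than the scratch.
(clobber (match_scratch:HI 4 ""))
(clobber (match_dup 5))

;; Valid: non-scratch operands occupy 0..4; the scratch comes last as 5.
(clobber (match_scratch:HI 5 ""))
(clobber (match_dup 4))
```

The patch below simply swaps operand numbers 4 and 5 so the scratch is last.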

2017-03-08  Jakub Jelinek  

* config/avr/avr.md (setmemhi): Make sure match_dup
operand number comes before match_scratch.

--- gcc/config/avr/avr.md.jj	2017-02-06 13:32:14.0 +0100
+++ gcc/config/avr/avr.md   2017-03-08 09:46:39.828773285 +0100
@@ -1062,8 +1062,8 @@ (define_expand "setmemhi"
(match_operand 2 "const_int_operand" ""))
   (use (match_operand:HI 1 "const_int_operand" ""))
   (use (match_operand:HI 3 "const_int_operand" ""))
-  (clobber (match_scratch:HI 4 ""))
-  (clobber (match_dup 5))])]
+  (clobber (match_scratch:HI 5 ""))
+  (clobber (match_dup 4))])]
   ""
   {
 rtx addr0;
@@ -1077,7 +1077,7 @@ (define_expand "setmemhi"
   FAIL;
 
 mode = u8_operand (operands[1], VOIDmode) ? QImode : HImode;
-operands[5] = gen_rtx_SCRATCH (mode);
+operands[4] = gen_rtx_SCRATCH (mode);
 operands[1] = copy_to_mode_reg (mode,
 gen_int_mode (INTVAL (operands[1]), mode));
 addr0 = copy_to_mode_reg (Pmode, XEXP (operands[0], 0));

Jakub


[PATCH] Fix PR79920

2017-03-08 Thread Richard Biener

The following fixes a bug in vect_transform_slp_perm_load, which tries
to be clever in computing something like "ncopies" but fails to do that
correctly (and in fact it can't be done).  Instead, just compute all
loads/permutations manually.

The fix is as simple as

Index: tree-vect-slp.c
===================================================================
--- tree-vect-slp.c (revision 245279)
+++ tree-vect-slp.c (working copy)
@@ -3412,7 +3412,7 @@ vect_transform_slp_perm_load (slp_tree n
   int second_vec_index = -1;
   bool noop_p = true;
 
-  for (int j = 0; j < unroll_factor; j++)
+  for (int j = 0; j < vf; j++)
 {
   for (int k = 0; k < group_size; k++)
{
@@ -3486,7 +3486,7 @@ vect_transform_slp_perm_load (slp_tree n
 	      vect_create_mask_and_perm (stmt, mask_vec, first_vec_index,
 					 second_vec_index,
 					 gsi, node, vectype, dr_chain,
-					 ncopies, vect_stmts_counter++);
+					 1, vect_stmts_counter++);
 	    }
 
  index = 0;

but it allows dead code to be removed and the now pointless helper
vect_create_mask_and_perm to vanish.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-03-08  Richard Biener  

PR tree-optimization/79920
* tree-vect-slp.c (vect_create_mask_and_perm): Remove and inline
with ncopies == 1 to ...
(vect_transform_slp_perm_load): ... here.  Properly compute
all element loads by iterating VF times over the group.  Do
not handle ncopies (computed in a broken way) in
vect_create_mask_and_perm.

* gcc.dg/vect/pr79920.c: New testcase.

Index: gcc/tree-vect-slp.c
===================================================================
*** gcc/tree-vect-slp.c (revision 245947)
--- gcc/tree-vect-slp.c (working copy)
*** vect_get_slp_defs (vec<tree> ops, slp_tr
*** 3379,3444 
  }
  }
  
- 
- /* Create NCOPIES permutation statements using the mask MASK_BYTES (by
-    building a vector of type MASK_TYPE from it) and two input vectors placed in
-    DR_CHAIN at FIRST_VEC_INDX and SECOND_VEC_INDX for the first copy and
-    shifting by STRIDE elements of DR_CHAIN for every copy.
-    (STRIDE is the number of vectorized stmts for NODE divided by the number of
-    copies).
-    VECT_STMTS_COUNTER specifies the index in the vectorized stmts of NODE, where
-    the created stmts must be inserted.  */
- 
- static inline void
- vect_create_mask_and_perm (gimple *stmt,
-tree mask, int first_vec_indx, int second_vec_indx,
-gimple_stmt_iterator *gsi, slp_tree node,
-tree vectype, vec<tree> dr_chain,
-int ncopies, int vect_stmts_counter)
- {
-   tree perm_dest;
-   gimple *perm_stmt = NULL;
-   int i, stride_in, stride_out;
-   tree first_vec, second_vec, data_ref;
- 
-   stride_out = SLP_TREE_NUMBER_OF_VEC_STMTS (node) / ncopies;
-   stride_in = dr_chain.length () / ncopies;
- 
-   /* Initialize the vect stmts of NODE to properly insert the generated
-  stmts later.  */
-   for (i = SLP_TREE_VEC_STMTS (node).length ();
-i < (int) SLP_TREE_NUMBER_OF_VEC_STMTS (node); i++)
- SLP_TREE_VEC_STMTS (node).quick_push (NULL);
- 
-   perm_dest = vect_create_destination_var (gimple_assign_lhs (stmt), vectype);
-   for (i = 0; i < ncopies; i++)
- {
-   first_vec = dr_chain[first_vec_indx];
-   second_vec = dr_chain[second_vec_indx];
- 
-   /* Generate the permute statement if necessary.  */
-   if (mask)
-   {
- perm_stmt = gimple_build_assign (perm_dest, VEC_PERM_EXPR,
-  first_vec, second_vec, mask);
- data_ref = make_ssa_name (perm_dest, perm_stmt);
- gimple_set_lhs (perm_stmt, data_ref);
- vect_finish_stmt_generation (stmt, perm_stmt, gsi);
-   }
-   else
-   /* If mask was NULL_TREE generate the requested identity transform.  */
-   perm_stmt = SSA_NAME_DEF_STMT (first_vec);
- 
-   /* Store the vector statement in NODE.  */
-   SLP_TREE_VEC_STMTS (node)[stride_out * i + vect_stmts_counter]
-   = perm_stmt;
- 
-   first_vec_indx += stride_in;
-   second_vec_indx += stride_in;
- }
- }
- 
- 
  /* Generate vector permute statements from a list of loads in DR_CHAIN.
 If ANALYZE_ONLY is TRUE, only check that it is possible to create valid
 permute statements for the SLP node NODE of the SLP instance
--- 3379,3384 
*** vect_transform_slp_perm_load (slp_tree n
*** 3456,3462 
int nunits, vec_index = 0;
tree vectype = STMT_VINFO_VECTYPE (stmt_info);
int group_size = SLP_INSTANCE_GROUP_SIZE (slp_node_instance);
!   int unroll_factor, mask_element, ncopies;
unsigned char *mask;

Re: [PATCH] Small ubsan vector arith optimization to fix one testcase from PR sanitizer/79904

2017-03-08 Thread Richard Biener
On Tue, 7 Mar 2017, Jakub Jelinek wrote:

> Hi!
> 
> If any of the operands of the UBSAN_{ADD,SUB,MUL}_OVERFLOW ifn with
> vector operand is a uniform vector, expanding it as VCE on the VECTOR_CST
> followed by ARRAY_REF with variable index in the loop is unnecessarily
> expensive and nothing fixes it afterwards.
> This works around ICE on s390 at least for the uniform vectors.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.  Note that another option for the loopy case is to do

  for (;;)
{
  vec >> by-one-elt;
  elt = BIT_FIELD_REF ;
}

when whole-vector shifts are available (they are constructed by
VEC_PERM_EXPR if vec_perm_const_ok for that mask).  If you end up
doing variable-index array accesses you can as well spill the
vector to memory and use memory accesses on that.  Not sure how
to arrange that from this part of the expander.

Unsure what the ICE you see with s390 is about...

Thanks,
Richard.



> 2017-03-07  Jakub Jelinek  
> 
>   PR sanitizer/79904
>   * internal-fn.c (expand_vector_ubsan_overflow): If arg0 or arg1
>   is a uniform vector, use uniform_vector_p return value instead of
>   building ARRAY_REF on folded VIEW_CONVERT_EXPR to array type.
> 
>   * gcc.dg/ubsan/pr79904.c: New test.
> 
> --- gcc/internal-fn.c.jj  2017-02-23 08:48:40.0 +0100
> +++ gcc/internal-fn.c 2017-03-07 11:55:56.261465702 +0100
> @@ -1869,12 +1869,20 @@ expand_vector_ubsan_overflow (location_t
>if (cnt > 4)
>   {
> tree atype = build_array_type_nelts (eltype, cnt);
> -   op0 = fold_build1_loc (loc, VIEW_CONVERT_EXPR, atype, arg0);
> -   op0 = build4_loc (loc, ARRAY_REF, eltype, op0, cntv,
> - NULL_TREE, NULL_TREE);
> -   op1 = fold_build1_loc (loc, VIEW_CONVERT_EXPR, atype, arg1);
> -   op1 = build4_loc (loc, ARRAY_REF, eltype, op1, cntv,
> - NULL_TREE, NULL_TREE);
> +   op0 = uniform_vector_p (arg0);
> +   if (op0 == NULL_TREE)
> + {
> +   op0 = fold_build1_loc (loc, VIEW_CONVERT_EXPR, atype, arg0);
> +   op0 = build4_loc (loc, ARRAY_REF, eltype, op0, cntv,
> + NULL_TREE, NULL_TREE);
> + }
> +   op1 = uniform_vector_p (arg1);
> +   if (op1 == NULL_TREE)
> + {
> +   op1 = fold_build1_loc (loc, VIEW_CONVERT_EXPR, atype, arg1);
> +   op1 = build4_loc (loc, ARRAY_REF, eltype, op1, cntv,
> + NULL_TREE, NULL_TREE);
> + }
> if (resv)
>   {
> res = fold_build1_loc (loc, VIEW_CONVERT_EXPR, atype, resv);
> --- gcc/testsuite/gcc.dg/ubsan/pr79904.c.jj	2017-03-07 11:58:53.266120958 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/pr79904.c	2017-03-07 11:58:46.0 +0100
> @@ -0,0 +1,11 @@
> +/* PR sanitizer/79904 */
> +/* { dg-do compile } */
> +/* { dg-options "-fsanitize=signed-integer-overflow -Wno-psabi" } */
> +
> +typedef signed char V __attribute__((vector_size (8))); 
> +
> +void
> +foo (V *a) 
> +{ 
> +  *a = *a * 3; 
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Re: [PATCH] Increment value instead of a pointer in ADA macro processing.

2017-03-08 Thread Eric Botcazou
> I prepared quite obvious fix for that and tested make check -k
> RUNTESTFLAGS="dg.exp=dump-ada-spec-*" on x86_64-linux-gnu.

Thanks for spotting and fixing it.  Please put the fix on the 6 branch too.

-- 
Eric Botcazou