Re: [PATCH] Fix store-merging vuse handling (PR tree-optimization/83170, PR tree-optimization/83241)

2017-12-01 Thread Richard Biener
On December 1, 2017 11:36:19 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>The bswap infrastructure uses the vuse field to make sure all the loads
>are
>having the same gimple_vuse and also uses it in bswap_replace.
>When this infrastructure is used inside of the store-merging pass, the
>problem is that the old stores are being removed and new added, so
>gimple_vuse of the loads we record during process_stmt can change.
>So, this patch updates the vuse fields before we plan to use it (in
>try_coalesce_bswap for the checking and in output_merged_stores for the
>bswap_replace purposes).
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2017-12-01  Jakub Jelinek  
>
>   PR tree-optimization/83170
>   PR tree-optimization/83241
>   * gimple-ssa-store-merging.c
>   (imm_store_chain_info::try_coalesce_bswap): Update vuse field from
>   gimple_vuse (ins_stmt) in case it has changed.
>   (imm_store_chain_info::output_merged_store): Likewise.
>
>   * gcc.dg/store_merging_17.c: New test.
>
>--- gcc/gimple-ssa-store-merging.c.jj  2017-12-01 09:17:36.0
>+0100
>+++ gcc/gimple-ssa-store-merging.c 2017-12-01 16:03:40.806918965 +0100
>@@ -2384,6 +2384,9 @@ imm_store_chain_info::try_coalesce_bswap
>   this_n.type = type;
>   if (!this_n.base_addr)
>   this_n.range = try_size / BITS_PER_UNIT;
>+  else
>+  /* Update vuse in case it has changed by output_merged_stores.  */
>+  this_n.vuse = gimple_vuse (info->ins_stmt);
>   unsigned int bitpos = info->bitpos - infof->bitpos;
>   if (!do_shift_rotate (LSHIFT_EXPR, &this_n,
>   BYTES_BIG_ENDIAN
>@@ -3341,10 +3344,16 @@ imm_store_chain_info::output_merged_stor
>we've checked the aliasing already in try_coalesce_bswap and
>we want to sink the need load into seq.  So need to use new_vuse
>on the load.  */
>-  if (n->base_addr && n->vuse == NULL)
>+  if (n->base_addr)
>   {
>-n->vuse = new_vuse;
>-ins_stmt = NULL;
>+if (n->vuse == NULL)
>+  {
>+n->vuse = new_vuse;
>+ins_stmt = NULL;
>+  }
>+else
>+  /* Update vuse in case it has changed by output_merged_stores. 
>*/
>+  n->vuse = gimple_vuse (ins_stmt);
>   }
>   bswap_res = bswap_replace (gsi_start (seq), ins_stmt, fndecl,
>bswap_type, load_type, n, bswap);
>--- gcc/testsuite/gcc.dg/store_merging_17.c.jj 2017-12-01
>16:07:20.590224536 +0100
>+++ gcc/testsuite/gcc.dg/store_merging_17.c2017-12-01
>16:07:01.0 +0100
>@@ -0,0 +1,17 @@
>+/* PR tree-optimization/83241 */
>+/* { dg-do compile { target store_merge } } */
>+/* { dg-options "-O2" } */
>+
>+struct S { int a; short b[32]; } e;
>+struct T { volatile int c; int d; } f;
>+
>+void
>+foo ()
>+{
>+  struct T g = f;
>+  e.b[0] = 6;
>+  e.b[1] = 6;
>+  e.b[4] = g.d;
>+  e.b[5] = g.d >> 16;
>+  e.a = 1;
>+}
>
>   Jakub



Re: [PATCH] Handle POINTER_DIFF_EXPR in chkp

2017-12-01 Thread Richard Biener
On December 2, 2017 1:14:03 AM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>The following testcase shows that chkp_compute_bounds_for_assignment
>should know about POINTER_DIFF_EXPR and handle it like it used
>to handle MINUS_EXPR of 2 pointers in the past.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2017-12-01  Jakub Jelinek  
>
>   * tree-chkp.c (chkp_compute_bounds_for_assignment): Handle
>   POINTER_DIFF_EXPR.
>
>   * gcc.target/i386/mpx/pointer-diff-1.c: New test.
>
>--- gcc/tree-chkp.c.jj 2017-11-13 09:31:29.0 +0100
>+++ gcc/tree-chkp.c2017-12-01 15:11:40.331410514 +0100
>@@ -2762,6 +2762,7 @@ chkp_compute_bounds_for_assignment (tree
> case FLOAT_EXPR:
> case REALPART_EXPR:
> case IMAGPART_EXPR:
>+case POINTER_DIFF_EXPR:
>   /* No valid bounds may be produced by these exprs.  */
>   bounds = chkp_get_invalid_op_bounds ();
>   break;
>--- gcc/testsuite/gcc.target/i386/mpx/pointer-diff-1.c.jj  2017-12-01
>15:18:52.805002275 +0100
>+++ gcc/testsuite/gcc.target/i386/mpx/pointer-diff-1.c 2017-12-01
>15:17:35.0 +0100
>@@ -0,0 +1,8 @@
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -mmpx -fcheck-pointer-bounds" } */
>+
>+char *
>+foo (char *p, char *q)
>+{
>+  return (char *) (p - q);/* { dg-bogus "pointer bounds were lost due
>to unexpected expression" } */
>+}
>
>   Jakub



Re: [PATCH] Fix -Wreturn-type with -fsanitize=return (PR c++/81212)

2017-12-01 Thread Richard Biener
On December 2, 2017 1:14:10 AM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>With -fsanitize=return, we add __builtin_ubsan_handle_missing_return
>instead of __builtin_unreachable with BUILTINS_LOCATION locus,
>the following patch teaches warn function return pass to handle that
>too.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2017-12-01  Jakub Jelinek  
>
>   PR c++/81212
>   * tree-cfg.c (pass_warn_function_return::execute): Handle
>   __builtin_ubsan_handle_missing_return like __builtin_unreachable
>   with BUILTINS_LOCATION.
>
>   * g++.dg/ubsan/pr81212.C: New test.
>   * g++.dg/ubsan/return-1.C: Add -Wno-return-type to dg-options.
>   * g++.dg/ubsan/return-2.C: Add -Wno-return-type to dg-options.
>   * g++.dg/ubsan/return-7.C: Add -Wno-return-type to dg-options.
>
>--- gcc/tree-cfg.c.jj  2017-12-01 09:10:12.0 +0100
>+++ gcc/tree-cfg.c 2017-12-01 19:38:21.933049446 +0100
>@@ -9151,10 +9151,13 @@ pass_warn_function_return::execute (func
> if (EDGE_COUNT (bb->succs) == 0)
>   {
> gimple *last = last_stmt (bb);
>+const enum built_in_function ubsan_missing_ret
>+  = BUILT_IN_UBSAN_HANDLE_MISSING_RETURN;
> if (last
>-&& (LOCATION_LOCUS (gimple_location (last))
>-== BUILTINS_LOCATION)
>-&& gimple_call_builtin_p (last, BUILT_IN_UNREACHABLE))
>+&& ((LOCATION_LOCUS (gimple_location (last))
>+ == BUILTINS_LOCATION
>+ && gimple_call_builtin_p (last, BUILT_IN_UNREACHABLE))
>+|| gimple_call_builtin_p (last, ubsan_missing_ret)))
>   {
> gimple_stmt_iterator gsi = gsi_for_stmt (last);
> gsi_prev_nondebug (&gsi);
>--- gcc/testsuite/g++.dg/ubsan/pr81212.C.jj2017-12-01
>19:30:43.251747933 +0100
>+++ gcc/testsuite/g++.dg/ubsan/pr81212.C   2017-12-01 19:30:25.0
>+0100
>@@ -0,0 +1,16 @@
>+// PR c++/81212
>+// { dg-do compile }
>+// { dg-options "-Wreturn-type -fsanitize=return" }
>+
>+struct S
>+{
>+  S (void *);
>+  void *s;
>+};
>+
>+S
>+foo (bool x, void *y)
>+{
>+  if (x)
>+return S (y);
>+} // { dg-warning "control reaches end of non-void function" }
>--- gcc/testsuite/g++.dg/ubsan/return-1.C.jj   2013-11-22
>21:05:59.0 +0100
>+++ gcc/testsuite/g++.dg/ubsan/return-1.C  2017-12-01 22:23:17.05380
>+0100
>@@ -1,5 +1,5 @@
> // { dg-do run }
>-// { dg-options "-fsanitize=return" }
>+// { dg-options "-fsanitize=return -Wno-return-type" }
> // { dg-shouldfail "ubsan" }
> 
> struct S { S (); ~S (); };
>--- gcc/testsuite/g++.dg/ubsan/return-2.C.jj   2014-10-22
>15:52:16.0 +0200
>+++ gcc/testsuite/g++.dg/ubsan/return-2.C  2017-12-01 22:23:32.059679081
>+0100
>@@ -1,5 +1,5 @@
> // { dg-do run }
>-// { dg-options "-fsanitize=return -fno-sanitize-recover=return" }
>+// { dg-options "-fsanitize=return -fno-sanitize-recover=return
>-Wno-return-type" }
> 
> struct S { S (); ~S (); };
> 
>--- gcc/testsuite/g++.dg/ubsan/return-7.C.jj   2016-11-23
>20:50:51.0 +0100
>+++ gcc/testsuite/g++.dg/ubsan/return-7.C  2017-12-01 22:24:03.894281135
>+0100
>@@ -1,5 +1,5 @@
> // { dg-do run }
>-// { dg-options "-fsanitize=undefined" }
>+// { dg-options "-fsanitize=undefined -Wno-return-type" }
> // { dg-shouldfail "ubsan" }
> 
> struct S { S (); ~S (); };
>
>   Jakub



Re: [PATCH] Small hack for DECL_MODE lack of vector_type_mode-like behavior (PR target/78643, PR target/80583)

2017-12-01 Thread Richard Biener
On December 2, 2017 1:14:15 AM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>In TYPE_MODE we have a hack for vector types where we dynamically
>adjust it based on whether it is used from a function where the vector
>mode
>is or isn't supported (depending on target attribute, function
>multiversioning, etc.), but we don't have anything like that in
>DECL_MODE.
>
>The following is just a small hack that should be backportable to
>adjust
>similarly DECL_MODE when looking at fields - for static VAR_DECLs we
>already
>have code to cope with that, furthermore we need to cope there with the
>case where DECL_RTL is created in one context and used in another one.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2017-12-01  Jakub Jelinek  
>
>   PR target/78643
>   PR target/80583
>   * expr.c (get_inner_reference): If DECL_MODE of a non-bitfield
>   is BLKmode for vector field with vector raw mode, use TYPE_MODE
>   instead of DECL_MODE.
>
>   * gcc.target/i386/pr80583.c: New test.
>
>--- gcc/expr.c.jj  2017-11-27 09:27:41.0 +0100
>+++ gcc/expr.c 2017-12-01 11:36:19.011863308 +0100
>@@ -7032,7 +7032,16 @@ get_inner_reference (tree exp, HOST_WIDE
>size.  */
>   mode = TYPE_MODE (DECL_BIT_FIELD_TYPE (field));
>   else if (!DECL_BIT_FIELD (field))
>-  mode = DECL_MODE (field);
>+  {
>+mode = DECL_MODE (field);
>+/* For vector fields re-check the target flags, as DECL_MODE
>+   could have been set with different target flags than
>+   the current function has.  */
>+if (mode == BLKmode
>+&& VECTOR_TYPE_P (TREE_TYPE (field))
>+&& VECTOR_MODE_P (TYPE_MODE_RAW (TREE_TYPE (field))))
>+  mode = TYPE_MODE (TREE_TYPE (field));
>+  }
>   else if (DECL_MODE (field) == BLKmode)
>   blkmode_bitfield = true;
> 
>--- gcc/testsuite/gcc.target/i386/pr80583.c.jj 2017-12-01
>11:44:28.106603314 +0100
>+++ gcc/testsuite/gcc.target/i386/pr80583.c2017-12-01
>11:44:12.0 +0100
>@@ -0,0 +1,13 @@
>+/* PR target/80583 */
>+/* { dg-do compile } */
>+/* { dg-options "-O0 -mno-avx" } */
>+
>+typedef int V __attribute__((__vector_size__(32)));
>+struct S { V a; };
>+
>+V __attribute__((target ("avx")))
>+foo (struct S *b)
>+{
>+  V x = b->a;
>+  return x;
>+}
>
>   Jakub



Re: [PATCH] Re: loading of zeros into {x,y,z}mm registers (take 2)

2017-12-01 Thread Kirill Yukhin
On 02 Dec 01:13, Jakub Jelinek wrote:
> On Fri, Dec 01, 2017 at 02:54:28PM +0100, Jakub Jelinek wrote:
> > Will try this:
> 
> That failed to bootstrap, but here is an updated version that passed
> bootstrap/regtest on x86_64-linux and i686-linux, ok for trunk?
Great. OK.

--
Thanks, K


[PATCH] doc update for -dp

2017-12-01 Thread Segher Boessenkool
Committing as obvious and trivial.


Segher


2017-12-01  Segher Boessenkool  

* doc/invoke.texi (-dp): Say that instruction cost is printed as well.

---
 gcc/doc/invoke.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c105800..b4e0231 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13317,7 +13317,7 @@ Produce a core dump whenever an error occurs.
 @item -dp
 @opindex dp
 Annotate the assembler output with a comment indicating which
-pattern and alternative is used.  The length of each instruction is
+pattern and alternative is used.  The length and cost of each instruction are
 also printed.
 
 @item -dP
-- 
1.8.3.1



libgo patch committed: Export cgoCheck functions

2017-12-01 Thread Ian Lance Taylor
This patch to libgo exports the cgoCheck functions.  The functions
cgoCheckPointer and cgoCheckResult are called by code generated by
cgo. That means that we need to export them using go:linkname, as
otherwise they are local symbols. The cgo code currently uses weak
references to only call the symbols if they are defined, which is why
it has been working--the cgo code has not been doing any checks.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
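
The weak-reference behavior described above can be illustrated in C.  This is only a minimal sketch, not the code that cgo generates: optional_check is a hypothetical symbol standing in for a function such as cgoCheckPointer, and the weak attribute is a GCC/Clang extension that behaves this way on ELF-style targets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a runtime check such as cgoCheckPointer.
   It is declared weak and never defined here, so on ELF targets its
   address resolves to NULL when no other object file provides it.  */
extern void optional_check (void *p) __attribute__ ((weak));

/* Call the check only if the symbol was resolved at link time.
   Returns 1 if the check ran and 0 if it was silently skipped.  */
int
call_check_if_defined (void *p)
{
  assert (p != NULL);
  if (optional_check != NULL)
    {
      optional_check (p);
      return 1;
    }
  return 0;
}
```

With no definition of optional_check linked in, the call is skipped without any diagnostic, which is how a missing export can go unnoticed.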

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 255346)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1949a203fca0c8bde6f2690ebc36427c5e3953c7
+338f7434175bb71f3c8905e9ad7f480aec3afee6
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/cgocall.go
===
--- libgo/go/runtime/cgocall.go (revision 255340)
+++ libgo/go/runtime/cgocall.go (working copy)
@@ -11,6 +11,10 @@ import (
"unsafe"
 )
 
+// Functions called by cgo-generated code.
+//go:linkname cgoCheckPointer runtime.cgoCheckPointer
+//go:linkname cgoCheckResult runtime.cgoCheckResult
+
 // Pointer checking for cgo code.
 
 // We want to detect all cases where a program that does not use


Go patch committed: Avoid middle-end control flow warnings

2017-12-01 Thread Ian Lance Taylor
The GCC middle-end has started emitting "control reaches end of
non-void function" warnings. These are not too useful for Go, which
implements its own error for this in the frontend.  Avoid the
middle-end warnings for Go by 1) marking the builtin function panic
and the compiler-generated function __go_runtime_error as not
returning and 2) adding a default case to the switch used for select
statements that simply calls __builtin_unreachable.  This fixes
https://golang.org/issue/22767.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.
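
The two techniques above have direct C-level analogues.  The following is a hedged sketch rather than anything the Go frontend emits: fatal_error is a hypothetical stand-in for a function like panic, and select_case_value is an invented example of a switch whose default case cannot occur.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a runtime routine that never returns,
   analogous to marking panic as not returning.  */
__attribute__ ((noreturn)) static void
fatal_error (void)
{
  abort ();
}

/* Every path either returns or calls a noreturn function, and the
   default case is marked unreachable, so there is no path on which
   control falls off the end of this non-void function.  */
int
select_case_value (int index)
{
  /* Contract: callers pass only 0, 1 or 2.  */
  assert (index >= 0 && index <= 2);
  switch (index)
    {
    case 0:
      return 10;
    case 1:
      return 20;
    case 2:
      fatal_error ();
    default:
      __builtin_unreachable ();
    }
}
```

Without the noreturn attribute and the unreachable default, a compiler could reasonably warn that control reaches the end of select_case_value.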

Ian


2017-12-01  Ian Lance Taylor  

* go-gcc.cc (Gcc_backend::Gcc_backend): Define
__builtin_unreachable.
(Gcc_backend::function): Add does_not_return parameter.
Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc(revision 255340)
+++ gcc/go/go-gcc.cc(working copy)
@@ -486,7 +486,8 @@ class Gcc_backend : public Backend
   Bfunction*
   function(Btype* fntype, const std::string& name, const std::string& asm_name,
bool is_visible, bool is_declaration, bool is_inlinable,
-   bool disable_split_stack, bool in_unique_section, Location);
+   bool disable_split_stack, bool does_not_return,
+  bool in_unique_section, Location);
 
   Bstatement*
   function_defer_statement(Bfunction* function, Bexpression* undefer,
@@ -760,6 +761,12 @@ Gcc_backend::Gcc_backend()
const_ptr_type_node,
NULL_TREE),
   false, false);
+
+  // The compiler uses __builtin_unreachable for cases that can not
+  // occur.
+  this->define_builtin(BUILT_IN_UNREACHABLE, "__builtin_unreachable", NULL,
+  build_function_type(void_type_node, void_list_node),
+  true, true);
 }
 
 // Get an unnamed integer type.
@@ -3012,8 +3019,8 @@ Bfunction*
 Gcc_backend::function(Btype* fntype, const std::string& name,
   const std::string& asm_name, bool is_visible,
   bool is_declaration, bool is_inlinable,
-  bool disable_split_stack, bool in_unique_section,
-  Location location)
+  bool disable_split_stack, bool does_not_return,
+ bool in_unique_section, Location location)
 {
   tree functype = fntype->get_tree();
   if (functype != error_mark_node)
@@ -3049,6 +3056,8 @@ Gcc_backend::function(Btype* fntype, con
   tree attr = get_identifier ("no_split_stack");
   DECL_ATTRIBUTES(decl) = tree_cons(attr, NULL_TREE, NULL_TREE);
 }
+  if (does_not_return)
+TREE_THIS_VOLATILE(decl) = 1;
   if (in_unique_section)
 resolve_unique_section(decl, 0, 1);
 
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 255340)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-8cd42a3e9e0e618bb09e67be73f7d2f2477a0faa
+1949a203fca0c8bde6f2690ebc36427c5e3953c7
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/backend.h
===
--- gcc/go/gofrontend/backend.h (revision 255340)
+++ gcc/go/gofrontend/backend.h (working copy)
@@ -711,12 +711,15 @@ class Backend
   // IS_INLINABLE is true if the function can be inlined.
   // DISABLE_SPLIT_STACK is true if this function may not split the stack; this
   // is used for the implementation of recover.
+  // DOES_NOT_RETURN is true for a function that does not return; this is used
+  // for the implementation of panic.
   // IN_UNIQUE_SECTION is true if this function should be put into a unique
   // location if possible; this is used for field tracking.
   virtual Bfunction*
   function(Btype* fntype, const std::string& name, const std::string& asm_name,
bool is_visible, bool is_declaration, bool is_inlinable,
-   bool disable_split_stack, bool in_unique_section, Location) = 0;
+   bool disable_split_stack, bool does_not_return,
+  bool in_unique_section, Location) = 0;
 
   // Create a statement that runs all deferred calls for FUNCTION.  This should
   // be a statement that looks like this in C++:
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 255340)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -711,7 +711,7 @@ Gogo::init_imports(std::vectorbackend()->function(fntype, user_name, 
init_name,
true, true, true, false,
-   false, unknown_loc);
+   false, false, unknown_loc);
   Bexpression* pfunc_code =
   this->backend()->functi

[PATCH] Delete obsolete DWARF1 references.

2017-12-01 Thread Jim Wilson
I noticed a stray DWARF_DEBUG reference.  The DWARF1 support was removed in
2004.  So I did a quick find|grep and I found two more.  This patch removes
them.

This was tested with quick x86_64-linux and mips-wrs-vxworks builds just to
verify that they still build, and by hand checking the docs to make sure they
are OK.

Committed.

gcc/
* common.opt (use_gnu_debug_info_extensions): Delete DWARF_DEBUG from
comment.
* config/vx-common.h (DWARF_DEBUGGING_INFO): Delete undef.
* doc/tm.texi.in (PREFERRED_DEBUGGING_TYPE): Delete DWARF_DEBUG
reference.
* doc/tm.texi: Regenerate.
---
 gcc/common.opt | 2 +-
 gcc/config/vx-common.h | 1 -
 gcc/doc/tm.texi| 4 ++--
 gcc/doc/tm.texi.in | 4 ++--
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 28a0185f0cf..ffcbf850216 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -137,7 +137,7 @@ enum debug_info_levels debug_info_level = DINFO_LEVEL_NONE
 
 ; Nonzero means use GNU-only extensions in the generated symbolic
 ; debugging information.  Currently, this only has an effect when
-; write_symbols is set to DBX_DEBUG, XCOFF_DEBUG, or DWARF_DEBUG.
+; write_symbols is set to DBX_DEBUG or XCOFF_DEBUG.
 Variable
 bool use_gnu_debug_info_extensions
 
diff --git a/gcc/config/vx-common.h b/gcc/config/vx-common.h
index d8f04eced4d..a75f5a00f48 100644
--- a/gcc/config/vx-common.h
+++ b/gcc/config/vx-common.h
@@ -70,7 +70,6 @@ along with GCC; see the file COPYING3.  If not see
 #define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
 
 /* None of these other formats is supported.  */
-#undef DWARF_DEBUGGING_INFO
 #undef DBX_DEBUGGING_INFO
 #undef XCOFF_DEBUGGING_INFO
 #undef VMS_DEBUGGING_INFO
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f16e73c31b1..b39c7efa415 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9591,8 +9591,8 @@ A C expression that returns the type of debugging output 
GCC should
 produce when the user specifies just @option{-g}.  Define
 this if you have arranged for GCC to support more than one format of
 debugging output.  Currently, the allowable values are @code{DBX_DEBUG},
-@code{DWARF_DEBUG}, @code{DWARF2_DEBUG},
-@code{XCOFF_DEBUG}, @code{VMS_DEBUG}, and @code{VMS_AND_DWARF2_DEBUG}.
+@code{DWARF2_DEBUG}, @code{XCOFF_DEBUG}, @code{VMS_DEBUG},
+and @code{VMS_AND_DWARF2_DEBUG}.
 
 When the user specifies @option{-ggdb}, GCC normally also uses the
 value of this macro to select the debugging output format, but with two
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 39f6fcaaa11..57b83a8e542 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -6629,8 +6629,8 @@ A C expression that returns the type of debugging output 
GCC should
 produce when the user specifies just @option{-g}.  Define
 this if you have arranged for GCC to support more than one format of
 debugging output.  Currently, the allowable values are @code{DBX_DEBUG},
-@code{DWARF_DEBUG}, @code{DWARF2_DEBUG},
-@code{XCOFF_DEBUG}, @code{VMS_DEBUG}, and @code{VMS_AND_DWARF2_DEBUG}.
+@code{DWARF2_DEBUG}, @code{XCOFF_DEBUG}, @code{VMS_DEBUG},
+and @code{VMS_AND_DWARF2_DEBUG}.
 
 When the user specifies @option{-ggdb}, GCC normally also uses the
 value of this macro to select the debugging output format, but with two
-- 
2.14.1



[PATCH] Handle POINTER_DIFF_EXPR in chkp

2017-12-01 Thread Jakub Jelinek
Hi!

The following testcase shows that chkp_compute_bounds_for_assignment
should know about POINTER_DIFF_EXPR and handle it like it used
to handle MINUS_EXPR of 2 pointers in the past.
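
At the source level, the expression in question is an ordinary pointer subtraction.  As a small illustration (element_distance is a name made up for this sketch), the result is an integer of type ptrdiff_t, not a pointer, which is why no valid pointer bounds can be derived from it:

```c
#include <assert.h>
#include <stddef.h>

/* Subtracting two pointers into the same array yields a ptrdiff_t,
   i.e. an element count rather than a pointer.  */
ptrdiff_t
element_distance (const char *end, const char *begin)
{
  assert (end != NULL && begin != NULL);
  return end - begin;
}
```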

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-12-01  Jakub Jelinek  

* tree-chkp.c (chkp_compute_bounds_for_assignment): Handle
POINTER_DIFF_EXPR.

* gcc.target/i386/mpx/pointer-diff-1.c: New test.

--- gcc/tree-chkp.c.jj  2017-11-13 09:31:29.0 +0100
+++ gcc/tree-chkp.c 2017-12-01 15:11:40.331410514 +0100
@@ -2762,6 +2762,7 @@ chkp_compute_bounds_for_assignment (tree
 case FLOAT_EXPR:
 case REALPART_EXPR:
 case IMAGPART_EXPR:
+case POINTER_DIFF_EXPR:
   /* No valid bounds may be produced by these exprs.  */
   bounds = chkp_get_invalid_op_bounds ();
   break;
--- gcc/testsuite/gcc.target/i386/mpx/pointer-diff-1.c.jj   2017-12-01 
15:18:52.805002275 +0100
+++ gcc/testsuite/gcc.target/i386/mpx/pointer-diff-1.c  2017-12-01 
15:17:35.0 +0100
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmpx -fcheck-pointer-bounds" } */
+
+char *
+foo (char *p, char *q)
+{
+  return (char *) (p - q); /* { dg-bogus "pointer bounds were lost due to 
unexpected expression" } */
+}

Jakub


[C++ PATCH] Diagnose = delete override of a friend function defined earlier (PR c++/80259)

2017-12-01 Thread Jakub Jelinek
Hi!

As the testcase shows, we weren't diagnosing the foo case in the testcase
and would just silently overwrite old DECL_INITIAL with error_mark_node
and ICE later on.  Fixed thusly, bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?

2017-12-01  Jakub Jelinek  

PR c++/80259
* decl2.c (grokfield): Diagnose = delete redefinition of a friend.

* g++.dg/cpp0x/pr80259.C: New test.

--- gcc/cp/decl2.c.jj   2017-11-21 08:43:50.0 +0100
+++ gcc/cp/decl2.c  2017-12-01 12:01:32.671514761 +0100
@@ -911,9 +911,18 @@ grokfield (const cp_declarator *declarat
{
  if (init == ridpointers[(int)RID_DELETE])
{
- DECL_DELETED_FN (value) = 1;
- DECL_DECLARED_INLINE_P (value) = 1;
- DECL_INITIAL (value) = error_mark_node;
+ if (friendp && decl_defined_p (value))
+   {
+ error ("redefinition of %q#D", value);
+ inform (DECL_SOURCE_LOCATION (value),
+ "%q#D previously defined here", value);
+   }
+ else
+   {
+ DECL_DELETED_FN (value) = 1;
+ DECL_DECLARED_INLINE_P (value) = 1;
+ DECL_INITIAL (value) = error_mark_node;
+   }
}
  else if (init == ridpointers[(int)RID_DEFAULT])
{
--- gcc/testsuite/g++.dg/cpp0x/pr80259.C.jj 2017-12-01 12:06:54.611405404 
+0100
+++ gcc/testsuite/g++.dg/cpp0x/pr80259.C2017-12-01 12:06:19.0 
+0100
@@ -0,0 +1,13 @@
+// PR c++/80259
+// { dg-do compile { target c++11 } }
+
+void foo () {} // { dg-message "previously defined here" }
+void bar ();
+
+struct A
+{
+  friend void foo () = delete; // { dg-error "redefinition of" }
+  friend void bar () = delete; // { dg-message "previously defined here" }
+};
+
+void bar () {} // { dg-error "redefinition of" }

Jakub


[PATCH] Fix -Wreturn-type with -fsanitize=return (PR c++/81212)

2017-12-01 Thread Jakub Jelinek
Hi!

With -fsanitize=return, we add __builtin_ubsan_handle_missing_return
instead of __builtin_unreachable with BUILTINS_LOCATION locus,
the following patch teaches warn function return pass to handle that too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-12-01  Jakub Jelinek  

PR c++/81212
* tree-cfg.c (pass_warn_function_return::execute): Handle
__builtin_ubsan_handle_missing_return like __builtin_unreachable
with BUILTINS_LOCATION.

* g++.dg/ubsan/pr81212.C: New test.
* g++.dg/ubsan/return-1.C: Add -Wno-return-type to dg-options.
* g++.dg/ubsan/return-2.C: Add -Wno-return-type to dg-options.
* g++.dg/ubsan/return-7.C: Add -Wno-return-type to dg-options.

--- gcc/tree-cfg.c.jj   2017-12-01 09:10:12.0 +0100
+++ gcc/tree-cfg.c  2017-12-01 19:38:21.933049446 +0100
@@ -9151,10 +9151,13 @@ pass_warn_function_return::execute (func
  if (EDGE_COUNT (bb->succs) == 0)
{
  gimple *last = last_stmt (bb);
+ const enum built_in_function ubsan_missing_ret
+   = BUILT_IN_UBSAN_HANDLE_MISSING_RETURN;
  if (last
- && (LOCATION_LOCUS (gimple_location (last))
- == BUILTINS_LOCATION)
- && gimple_call_builtin_p (last, BUILT_IN_UNREACHABLE))
+ && ((LOCATION_LOCUS (gimple_location (last))
+  == BUILTINS_LOCATION
+  && gimple_call_builtin_p (last, BUILT_IN_UNREACHABLE))
+ || gimple_call_builtin_p (last, ubsan_missing_ret)))
{
  gimple_stmt_iterator gsi = gsi_for_stmt (last);
  gsi_prev_nondebug (&gsi);
--- gcc/testsuite/g++.dg/ubsan/pr81212.C.jj 2017-12-01 19:30:43.251747933 
+0100
+++ gcc/testsuite/g++.dg/ubsan/pr81212.C2017-12-01 19:30:25.0 
+0100
@@ -0,0 +1,16 @@
+// PR c++/81212
+// { dg-do compile }
+// { dg-options "-Wreturn-type -fsanitize=return" }
+
+struct S
+{
+  S (void *);
+  void *s;
+};
+
+S
+foo (bool x, void *y)
+{
+  if (x)
+return S (y);
+}  // { dg-warning "control reaches end of non-void function" }
--- gcc/testsuite/g++.dg/ubsan/return-1.C.jj2013-11-22 21:05:59.0 
+0100
+++ gcc/testsuite/g++.dg/ubsan/return-1.C   2017-12-01 22:23:17.05380 
+0100
@@ -1,5 +1,5 @@
 // { dg-do run }
-// { dg-options "-fsanitize=return" }
+// { dg-options "-fsanitize=return -Wno-return-type" }
 // { dg-shouldfail "ubsan" }
 
 struct S { S (); ~S (); };
--- gcc/testsuite/g++.dg/ubsan/return-2.C.jj2014-10-22 15:52:16.0 
+0200
+++ gcc/testsuite/g++.dg/ubsan/return-2.C   2017-12-01 22:23:32.059679081 
+0100
@@ -1,5 +1,5 @@
 // { dg-do run }
-// { dg-options "-fsanitize=return -fno-sanitize-recover=return" }
+// { dg-options "-fsanitize=return -fno-sanitize-recover=return 
-Wno-return-type" }
 
 struct S { S (); ~S (); };
 
--- gcc/testsuite/g++.dg/ubsan/return-7.C.jj2016-11-23 20:50:51.0 
+0100
+++ gcc/testsuite/g++.dg/ubsan/return-7.C   2017-12-01 22:24:03.894281135 
+0100
@@ -1,5 +1,5 @@
 // { dg-do run }
-// { dg-options "-fsanitize=undefined" }
+// { dg-options "-fsanitize=undefined -Wno-return-type" }
 // { dg-shouldfail "ubsan" }
 
 struct S { S (); ~S (); };

Jakub


[PATCH] Small hack for DECL_MODE lack of vector_type_mode-like behavior (PR target/78643, PR target/80583)

2017-12-01 Thread Jakub Jelinek
Hi!

In TYPE_MODE we have a hack for vector types where we dynamically
adjust it based on whether it is used from a function where the vector mode
is or isn't supported (depending on target attribute, function
multiversioning, etc.), but we don't have anything like that in DECL_MODE.

The following is just a small hack that should be backportable to adjust
similarly DECL_MODE when looking at fields - for static VAR_DECLs we already
have code to cope with that, furthermore we need to cope there with the
case where DECL_RTL is created in one context and used in another one.
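
The idea can be shown with a toy model.  Everything in this sketch is hypothetical (the toy_mode enum, the toy_field struct and both functions are invented for illustration and are not GCC's data structures); it only mirrors the decision described above: when the recorded field mode is BLKmode but the type's raw mode is a vector mode, ask again using the current function's target flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for machine modes; not GCC's real enum.  */
enum toy_mode { TOY_BLKMODE, TOY_V8SIMODE };

/* Hypothetical, simplified view of a struct field.  */
struct toy_field
{
  enum toy_mode decl_mode;  /* Mode recorded when the field was laid out.  */
  bool is_vector_type;      /* Whether the field has a vector type.  */
  enum toy_mode raw_mode;   /* Mode of the type, ignoring target flags.  */
};

/* Mode the type gets in the current function: the raw vector mode if
   the function supports it, otherwise BLKmode.  */
static enum toy_mode
toy_type_mode (const struct toy_field *f, bool vector_mode_supported)
{
  if (f->is_vector_type && f->raw_mode != TOY_BLKMODE
      && !vector_mode_supported)
    return TOY_BLKMODE;
  return f->raw_mode;
}

/* Start from the recorded mode, but re-check vector fields whose
   recorded mode is BLKmode against the current function.  */
enum toy_mode
toy_field_mode (const struct toy_field *f, bool vector_mode_supported)
{
  assert (f != NULL);
  enum toy_mode mode = f->decl_mode;
  if (mode == TOY_BLKMODE && f->is_vector_type
      && f->raw_mode != TOY_BLKMODE)
    mode = toy_type_mode (f, vector_mode_supported);
  return mode;
}
```

In this model a field laid out without vector support still gets the vector mode when it is accessed from a function that does support it, which corresponds to the situation exercised by the new test.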

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-12-01  Jakub Jelinek  

PR target/78643
PR target/80583
* expr.c (get_inner_reference): If DECL_MODE of a non-bitfield
is BLKmode for vector field with vector raw mode, use TYPE_MODE
instead of DECL_MODE.

* gcc.target/i386/pr80583.c: New test.

--- gcc/expr.c.jj   2017-11-27 09:27:41.0 +0100
+++ gcc/expr.c  2017-12-01 11:36:19.011863308 +0100
@@ -7032,7 +7032,16 @@ get_inner_reference (tree exp, HOST_WIDE
 size.  */
mode = TYPE_MODE (DECL_BIT_FIELD_TYPE (field));
   else if (!DECL_BIT_FIELD (field))
-   mode = DECL_MODE (field);
+   {
+ mode = DECL_MODE (field);
+ /* For vector fields re-check the target flags, as DECL_MODE
+could have been set with different target flags than
+the current function has.  */
+ if (mode == BLKmode
+ && VECTOR_TYPE_P (TREE_TYPE (field))
+ && VECTOR_MODE_P (TYPE_MODE_RAW (TREE_TYPE (field))))
+   mode = TYPE_MODE (TREE_TYPE (field));
+   }
   else if (DECL_MODE (field) == BLKmode)
blkmode_bitfield = true;
 
--- gcc/testsuite/gcc.target/i386/pr80583.c.jj  2017-12-01 11:44:28.106603314 
+0100
+++ gcc/testsuite/gcc.target/i386/pr80583.c 2017-12-01 11:44:12.0 
+0100
@@ -0,0 +1,13 @@
+/* PR target/80583 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -mno-avx" } */
+
+typedef int V __attribute__((__vector_size__(32)));
+struct S { V a; };
+
+V __attribute__((target ("avx")))
+foo (struct S *b)
+{
+  V x = b->a;
+  return x;
+}

Jakub


[PATCH] Re: loading of zeros into {x,y,z}mm registers (take 2)

2017-12-01 Thread Jakub Jelinek
On Fri, Dec 01, 2017 at 02:54:28PM +0100, Jakub Jelinek wrote:
> Will try this:

That failed to bootstrap, but here is an updated version that passed
bootstrap/regtest on x86_64-linux and i686-linux, ok for trunk?

2017-12-01  Jakub Jelinek  

* config/i386/i386-protos.h (standard_sse_constant_opcode): Change
last argument to rtx pointer.
* config/i386/i386.c (standard_sse_constant_opcode): Replace X argument
with OPERANDS.  For AVX+ 128-bit VEX encoded instructions over 256-bit
or 512-bit.  If setting EXT_REX_SSE_REG_P, use EVEX encoded insn
depending on the chosen ISAs.
* config/i386/i386.md (*movxi_internal_avx512f, *movoi_internal_avx,
*movti_internal, *movdi_internal, *movsi_internal, *movtf_internal,
*movdf_internal, *movsf_internal): Adjust standard_sse_constant_opcode
callers.
* config/i386/sse.md (mov_internal): Likewise.
* config/i386/mmx.md (*mov_internal): Likewise.

--- gcc/config/i386/i386-protos.h.jj2017-10-28 09:00:44.0 +0200
+++ gcc/config/i386/i386-protos.h   2017-12-01 14:39:36.498608799 +0100
@@ -52,7 +52,7 @@ extern int standard_80387_constant_p (rt
 extern const char *standard_80387_constant_opcode (rtx);
 extern rtx standard_80387_constant_rtx (int);
 extern int standard_sse_constant_p (rtx, machine_mode);
-extern const char *standard_sse_constant_opcode (rtx_insn *, rtx);
+extern const char *standard_sse_constant_opcode (rtx_insn *, rtx *);
 extern bool ix86_standard_x87sse_constant_load_p (const rtx_insn *, rtx);
 extern bool symbolic_reference_mentioned_p (rtx);
 extern bool extended_reg_mentioned_p (rtx);
--- gcc/config/i386/i386.c.jj   2017-12-01 09:19:07.0 +0100
+++ gcc/config/i386/i386.c  2017-12-01 14:36:38.884847618 +0100
@@ -10380,12 +10380,13 @@ standard_sse_constant_p (rtx x, machine_
 }
 
 /* Return the opcode of the special instruction to be used to load
-   the constant X.  */
+   the constant operands[1] into operands[0].  */
 
 const char *
-standard_sse_constant_opcode (rtx_insn *insn, rtx x)
+standard_sse_constant_opcode (rtx_insn *insn, rtx *operands)
 {
   machine_mode mode;
+  rtx x = operands[1];
 
   gcc_assert (TARGET_SSE);
 
@@ -10395,34 +10396,51 @@ standard_sse_constant_opcode (rtx_insn *
 {
   switch (get_attr_mode (insn))
{
+   case MODE_TI:
+ if (!EXT_REX_SSE_REG_P (operands[0]))
+   return "%vpxor\t%0, %d0";
+ /* FALLTHRU */
case MODE_XI:
- return "vpxord\t%g0, %g0, %g0";
case MODE_OI:
- return (TARGET_AVX512VL
- ? "vpxord\t%x0, %x0, %x0"
- : "vpxor\t%x0, %x0, %x0");
-   case MODE_TI:
- return (TARGET_AVX512VL
- ? "vpxord\t%x0, %x0, %x0"
- : "%vpxor\t%0, %d0");
+ if (EXT_REX_SSE_REG_P (operands[0]))
+   return (TARGET_AVX512VL
+   ? "vpxord\t%x0, %x0, %x0"
+   : "vpxord\t%g0, %g0, %g0");
+ return "vpxor\t%x0, %x0, %x0";
 
+   case MODE_V2DF:
+ if (!EXT_REX_SSE_REG_P (operands[0]))
+   return "%vxorpd\t%0, %d0";
+ /* FALLTHRU */
case MODE_V8DF:
- return (TARGET_AVX512DQ
- ? "vxorpd\t%g0, %g0, %g0"
- : "vpxorq\t%g0, %g0, %g0");
case MODE_V4DF:
- return "vxorpd\t%x0, %x0, %x0";
-   case MODE_V2DF:
- return "%vxorpd\t%0, %d0";
+ if (!EXT_REX_SSE_REG_P (operands[0]))
+   return "vxorpd\t%x0, %x0, %x0";
+ else if (TARGET_AVX512DQ)
+   return (TARGET_AVX512VL
+   ? "vxorpd\t%x0, %x0, %x0"
+   : "vxorpd\t%g0, %g0, %g0");
+ else
+   return (TARGET_AVX512VL
+   ? "vpxorq\t%x0, %x0, %x0"
+   : "vpxorq\t%g0, %g0, %g0");
 
+   case MODE_V4SF:
+ if (!EXT_REX_SSE_REG_P (operands[0]))
+   return "%vxorps\t%0, %d0";
+ /* FALLTHRU */
case MODE_V16SF:
- return (TARGET_AVX512DQ
- ? "vxorps\t%g0, %g0, %g0"
- : "vpxord\t%g0, %g0, %g0");
case MODE_V8SF:
- return "vxorps\t%x0, %x0, %x0";
-   case MODE_V4SF:
- return "%vxorps\t%0, %d0";
+ if (!EXT_REX_SSE_REG_P (operands[0]))
+   return "vxorps\t%x0, %x0, %x0";
+ else if (TARGET_AVX512DQ)
+   return (TARGET_AVX512VL
+   ? "vxorps\t%x0, %x0, %x0"
+   : "vxorps\t%g0, %g0, %g0");
+ else
+   return (TARGET_AVX512VL
+   ? "vpxord\t%x0, %x0, %x0"
+   : "vpxord\t%g0, %g0, %g0");
 
default:
  gcc_unreachable ();
@@ -10449,11 +10467,14 @@ standard_sse_constant_opcode (rtx_insn *
case MODE_V2DF:
case MODE_V4SF:
  gcc_assert (TARGET_SSE2);
- return (TARGET_AVX512F
- ? "vpternlogd\t{$0xFF, %0, %0, %0|%0, %0, %0, 0xFF}"
-  

[PATCH] v2: C/C++: don't suggest implementation names as spelling fixes (PR c/83236)

2017-12-01 Thread David Malcolm
On Fri, 2017-12-01 at 22:56 +0100, Jakub Jelinek wrote:
> On Fri, Dec 01, 2017 at 04:48:20PM -0500, David Malcolm wrote:
> > PR c/83236 reports an issue where the C FE unhelpfully suggests the
> > use
> > of glibc's private "__ino_t" type when it fails to recognize
> > "ino_t":
> > 
> > $ cat > test.c <<EOF
> > #include 
> > ino_t inode;
> > EOF
> > $ gcc -std=c89 -fsyntax-only test.c
> > test.c:2:1: error: unknown type name 'ino_t'; did you mean
> > '__ino_t'?
> >  ino_t inode;
> >  ^
> >  __ino_t
> > 
> > This patch updates the C/C++ FEs suggestions for unrecognized
> > identifiers
> > so that they don't suggest names that are reserved for use by the
> > implementation i.e. those that begin with an underscore and either
> > an
> > uppercase letter or another underscore.
> > 
> > However, it allows built-in macros that match this pattern to be
> > suggested, since it's useful to be able to suggest __FILE__,
> > __LINE__
> > etc.  Other macros *are* filtered.
> > 
> > One wart with the patch: the existing macro-handling spellcheck
> > code
> > is in spellcheck-tree.c, and needs to call the new function
> > "name_reserved_for_implementation_p", however the latter relates to
> > the C family of FEs.
> > Perhaps I should move all of the macro-handling stuff in
> > spellcheck-tree.h/c to e.g. a new c-family/c-spellcheck.h/c as a
> > first step?
> > 
> > Successfully bootstrapped&regtested on x86_64-pc-linux-gnu.
> > 
> > OK for trunk?
> > 
> > gcc/c/ChangeLog:
> > PR c/83236
> > * c-decl.c (lookup_name_fuzzy): Don't suggest names that are
> > reserved for use by the implementation.
> > 
> > gcc/cp/ChangeLog:
> > PR c/83236
> > * name-lookup.c (consider_binding_level): Don't suggest names
> > that
> > are reserved for use by the implementation.
> > 
> > gcc/ChangeLog:
> > PR c/83236
> > * spellcheck-tree.c (name_reserved_for_implementation_p): New
> > function.
> > (should_suggest_as_macro_p): New function.
> > (find_closest_macro_cpp_cb): Move the check for NT_MACRO to
> > should_suggest_as_macro_p and call it.
> > (selftest::test_name_reserved_for_implementation_p): New
> > function.
> > (selftest::spellcheck_tree_c_tests): Call it.
> > * spellcheck-tree.h (name_reserved_for_implementation_p): New
> > decl.
> > 
> > gcc/testsuite/ChangeLog:
> > PR c/83236
> > * c-c++-common/spellcheck-reserved.c: New test case.
> > ---
> >  gcc/c/c-decl.c   |  5 +++
> >  gcc/cp/name-lookup.c | 18 +++---
> >  gcc/spellcheck-tree.c| 46
> > +++-
> >  gcc/spellcheck-tree.h|  2 ++
> >  gcc/testsuite/c-c++-common/spellcheck-reserved.c | 25
> > +
> >  5 files changed, 91 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/c-c++-common/spellcheck-
> > reserved.c
> > 
> > diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> > index 56c63d8..dfd136d 100644
> > --- a/gcc/c/c-decl.c
> > +++ b/gcc/c/c-decl.c
> > @@ -4041,6 +4041,11 @@ lookup_name_fuzzy (tree name, enum
> > lookup_name_fuzzy_kind kind, location_t loc)
> > if (TREE_CODE (binding->decl) == FUNCTION_DECL)
> >   if (C_DECL_IMPLICIT (binding->decl))
> > continue;
> > +   /* Don't suggest names that are reserved for use by the
> > +  implementation.  */
> > +   if (name_reserved_for_implementation_p
> > +   (IDENTIFIER_POINTER (binding->id)))
> 
> Can't you use a temporary to avoid wrapping line between function
> name and ( ?

Fixed.

> More importantly, does this mean if I mistype __builtin_strtchr it
> won't suggest __builtin_strrchr?  Would be nice if the filtering
> of the names reserved for implementation isn't done if the
> name being looked up is reserved for implementation.

Good idea, thanks.

Here's an updated version of the patch.

Changed in v2:
* don't filter suggestions if the name being looked up
  is itself reserved for implementation
* fix wrapping in c-decl.c's lookup_name_fuzzy
* name-lookup.c (consider_binding_level): rename new variable from "name"
  to "suggestion" to avoid shadowing a param
* spellcheck-tree.c (test_name_reserved_for_implementation_p): Add more
  test coverage ("_" and "__")
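
As a rough illustration of the rule described above, here is a minimal
sketch in C.  It is not the code from the patch: reserved_name_p and
should_suggest_p are hypothetical names, and the special handling of
built-in macros such as __FILE__ is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the predicate described in the patch:
   true if NAME starts with an underscore followed by either another
   underscore or an uppercase letter.  */
static bool
reserved_name_p (const char *name)
{
  if (name[0] != '_')
    return false;
  return name[1] == '_' || (name[1] >= 'A' && name[1] <= 'Z');
}

/* The v2 rule: reserved candidates are only filtered out when the
   name being looked up is not itself reserved.  */
static bool
should_suggest_p (const char *looked_up, const char *candidate)
{
  if (reserved_name_p (looked_up))
    return true;
  return !reserved_name_p (candidate);
}
```

With this sketch, "__ino_t" is rejected as a suggestion for "ino_t",
while "__builtin_strrchr" remains a candidate for "__builtin_strtchr".
Note that "_" alone is not treated as reserved but "__" is.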

One additional wart I noticed writing the testcase is that the
C and C++ frontends offer different suggestions for "__builtin_strtchr".
C recommends:
  __builtin_strchr
whereas C++ recommends:
  __builtin_strrchr

The reason is that the C FE visits the builtins in order of builtins.def,
whereas C++ visits them in the reverse order.

Both have the same edit distance, and so the first "winner" in
best_match varies between FEs.

This is a pre-existing issue, though (not sure if it warrants a PR).
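
To make the tie concrete, here is a small self-contained sketch in C.
It uses a plain Levenshtein distance and a "first strictly better
candidate wins" loop; both are simplifications assumed for
illustration, and edit_distance and first_best_match are hypothetical
names rather than GCC's actual spellcheck API.

```c
#include <stddef.h>
#include <string.h>

/* Plain Levenshtein distance using a single rolling row.
   Assumes T is shorter than 64 characters.  */
static size_t
edit_distance (const char *s, const char *t)
{
  size_t n = strlen (s), m = strlen (t);
  size_t row[64];

  for (size_t j = 0; j <= m; j++)
    row[j] = j;
  for (size_t i = 1; i <= n; i++)
    {
      size_t diag = row[0];	/* Value at [i-1][j-1].  */
      row[0] = i;
      for (size_t j = 1; j <= m; j++)
	{
	  size_t up = row[j];	/* Value at [i-1][j].  */
	  size_t cost = diag + (s[i - 1] != t[j - 1]);
	  if (up + 1 < cost)
	    cost = up + 1;
	  if (row[j - 1] + 1 < cost)
	    cost = row[j - 1] + 1;
	  row[j] = cost;
	  diag = up;
	}
    }
  return row[m];
}

/* Return the first candidate with the smallest distance to GOAL.
   Because the comparison is strict, ties go to whichever candidate
   is visited first.  */
static const char *
first_best_match (const char *goal, const char *const *cands, size_t n)
{
  const char *best = NULL;
  size_t best_dist = (size_t) -1;

  for (size_t i = 0; i < n; i++)
    {
      size_t d = edit_distance (goal, cands[i]);
      if (d < best_dist)
	{
	  best_dist = d;
	  best = cands[i];
	}
    }
  return best;
}
```

Both "__builtin_strchr" (one deletion) and "__builtin_strrchr" (one
substitution) are at distance 1 from "__builtin_strtchr", so visiting
the candidates in the opposite order flips the result.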

Bootstrap&regtest in progress; OK if it passes?

As before, the other wart is that the existing macro-handling
spellcheck code is in spellcheck-tree.c, and needs to call the
new function "name_reserved_for_implemen

Re: [PATCH] PR libgcc/83112, Fix warnings on libgcc float128-ifunc.c

2017-12-01 Thread Segher Boessenkool
On Fri, Dec 01, 2017 at 12:40:22AM -0500, Michael Meissner wrote:
> After committing the previous patch, I noticed that it was now generating
> warnings for __{mul,div}kc3_{sw,hw} not having a prototype that I hadn't
> noticed during development of the patch.  This is due to the fact that before I
> added the ifunc support, it was only compiling __{mul,div}kc3, and those have
> built-in declarations.  I installed this patch as being obvious:
> 
> 2017-11-30  Michael Meissner  
> 
>   * config/rs6000/_mulkc3.c (__mulkc3): Add forward declaration.
>   * config/rs6000/_divkc3.c (__divkc3): Likewise.
> 
> Index: libgcc/config/rs6000/_divkc3.c
> ===
> --- libgcc/config/rs6000/_divkc3.c(revision 255288)
> +++ libgcc/config/rs6000/_divkc3.c(working copy)
> @@ -37,6 +37,8 @@ typedef __complex float KCtype __attribu
>  #define __divkc3 __divkc3_sw
>  #endif
>  
> +extern KCtype __divkc3 (KFtype, KFtype, KFtype, KFtype);
> +
>  KCtype
>  __divkc3 (KFtype a, KFtype b, KFtype c, KFtype d)
>  {

How does this warn?  -Wmissing-declarations?  Should this declaration be
in a header then?

A code comment explaining why you do a declaration for exactly the same
thing as there is two lines later would help; otherwise people will try
to delete it again :-)


Segher


Re: [PATCH] final: Improve output for -dp and -fverbose-asm

2017-12-01 Thread Segher Boessenkool
On Thu, Nov 30, 2017 at 05:49:27PM -0700, Jeff Law wrote:
> I think length and costing information are definitely things we want to
> include.  Length is less of an issue now than it was in the past, but it
> definitely has value.

At least for risc targets length is usually pretty boring, but this is
not the length of machine insns, RTL insns instead, making it more
interesting; and when it is wrong it leads to hard to debug problems,
and we don't have this info easily accessible elsewhere.  Similar goes
for the insn_cost: if you need to debug it, the -dp output is a very
convenient place to quickly get an overview of what we generate.


Segher


Re: [PATCH #2], PR target/81959, Fix ++int to _Float128 conversion on power9

2017-12-01 Thread Michael Meissner
On Fri, Dec 01, 2017 at 05:33:39PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Nov 30, 2017 at 04:52:44PM -0500, Michael Meissner wrote:
> > No, then it tends to generate worse code if it is done before the first split
> > pass (because it no longer keeps the address together).  I've been thinking
> > that in general, we should replace these calls with a new predicate that before
> > register allocation allows normal memory addresses, but during/after RA, it
> > becomes more strict.  In my experience, with RELOAD that wasn't feasible, but
> > LRA can handle it (and RELOAD is no longer an issue).
> 
> Can't you use the "strict" arg to legitimate_address_p and friends?

Well legitimate_address_p allows various D-form addresses, pre-inc/pre-dec, etc.
It has no context on what the address is being used for.  Secondary reload does
have the context, but I've seen post reload passes redo stuff (and typically
then it has to add more code to match the constraints once again).

> > > --- gcc/testsuite/gcc.target/powerpc/pr81959.c(revision 0)
> > > +++ gcc/testsuite/gcc.target/powerpc/pr81959.c(revision 0)
> > > @@ -0,0 +1,25 @@
> > > +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
> > > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > > +/* { dg-options "-mpower9-vector -O2 -mfloat128" } */
> > 
> > powerpc*-*-*, or does that not work?
> > 
> > It needs 64-bit because various machine independent parts of the compiler want
> > to use TImode if there is arithmetic support for KFmode to copy things, and
> > TImode isn't supported in 32-bit.
> 
> That's what lp64 is for.
> 
> > The __float128 support is not built if the compiler is a 32-bit compiler (the
> > enabler for _float128 is in linux64.h)
> 
> So we need some bugzilla predicate for that really?

Or possibly implement the support in 32-bit compilers (and not break embedded
targets).

> Okay for trunk.  Further improvements welcome ;-)  Thanks!

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH #2], PR target/81959, Fix ++int to _Float128 conversion on power9

2017-12-01 Thread Segher Boessenkool
Hi!

On Thu, Nov 30, 2017 at 04:52:44PM -0500, Michael Meissner wrote:
> No, then it tends to generate worse code if it is done before the first split
> pass (because it no longer keeps the address together).  I've been thinking
> that in general, we should replace these calls with a new predicate that before
> register allocation allows normal memory addresses, but during/after RA, it
> becomes more strict.  In my experience, with RELOAD that wasn't feasible, but
> LRA can handle it (and RELOAD is no longer an issue).

Can't you use the "strict" arg to legitimate_address_p and friends?

> > --- gcc/testsuite/gcc.target/powerpc/pr81959.c  (revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/pr81959.c  (revision 0)
> > @@ -0,0 +1,25 @@
> > +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
> > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > +/* { dg-options "-mpower9-vector -O2 -mfloat128" } */
> 
> powerpc*-*-*, or does that not work?
> 
> It needs 64-bit because various machine independent parts of the compiler want
> to use TImode if there is arithmetic support for KFmode to copy things, and
> TImode isn't supported in 32-bit.

That's what lp64 is for.

> The __float128 support is not built if the compiler is a 32-bit compiler (the
> enabler for _float128 is in linux64.h)

So we need some bugzilla predicate for that really?

Okay for trunk.  Further improvements welcome ;-)  Thanks!


Segher


Go patch committed: Add size threshold for nil checks

2017-12-01 Thread Ian Lance Taylor
This patch to the Go frontend by Than McIntosh adds a new control
variable to the Gogo class that stores the size threshold for nil
checks. This value can be used to control the policy for deciding when
a given dereference operation needs a check and when it does not. A size
threshold of -1 means that every potentially faulting dereference
needs an explicit check (and branch to error call). A size threshold
of K (where K > 0) means that if the size of the object being
dereferenced is >= K, then we need a check.  Currently for gccgo we
keep the same policy: do a nil check for an offset more than 4096.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
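
The threshold policy described above can be sketched as a small C
predicate.  This is only an illustration of the stated rule, not the
frontend's code: needs_explicit_nil_check is a hypothetical name, and
the behavior for a threshold of 0 or other negative values is not
specified above, so the sketch simply falls through to the size
comparison in those cases.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decide whether a dereference of an object of
   OBJECT_SIZE bytes needs an explicit nil check under THRESHOLD.  */
static bool
needs_explicit_nil_check (int64_t object_size, int64_t threshold)
{
  /* -1 means every potentially faulting dereference is checked.  */
  if (threshold == -1)
    return true;
  /* Otherwise only objects at least as large as the threshold are
     checked; smaller ones rely on the system faulting.  */
  return object_size >= threshold;
}
```

With the gccgo setting of 4096, an 8-byte object gets no explicit
check while a 4096-byte or larger one does.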

Ian


2017-12-01  Than McIntosh  

* go-c.h (go_create_gogo_args): Add nil_check_size_threshold
field.
* go-lang.c (go_langhook_init): Set nil_check_size_threshold.
Index: gcc/go/go-c.h
===
--- gcc/go/go-c.h   (revision 254090)
+++ gcc/go/go-c.h   (working copy)
@@ -47,6 +47,7 @@ struct go_create_gogo_args
   bool check_divide_overflow;
   bool compiling_runtime;
   int debug_escape_level;
+  int64_t nil_check_size_threshold;
 };
 
 extern void go_create_gogo (const struct go_create_gogo_args*);
Index: gcc/go/go-lang.c
===
--- gcc/go/go-lang.c(revision 254090)
+++ gcc/go/go-lang.c(working copy)
@@ -112,6 +112,7 @@ go_langhook_init (void)
   args.check_divide_overflow = go_check_divide_overflow;
   args.compiling_runtime = go_compiling_runtime;
   args.debug_escape_level = go_debug_escape_level;
+  args.nil_check_size_threshold = 4096;
   args.linemap = go_get_linemap();
   args.backend = go_get_backend();
   go_create_gogo (&args);
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 255266)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-0d6b3abcbfe04949db947081651a503ceb12fe6e
+8cd42a3e9e0e618bb09e67be73f7d2f2477a0faa
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 254748)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -290,7 +290,7 @@ Expression::get_interface_type_descripto
   Expression::make_interface_info(rhs, INTERFACE_INFO_METHODS, location);
 
   Expression* descriptor =
-  Expression::make_unary(OPERATOR_MULT, mtable, location);
+  Expression::make_dereference(mtable, NIL_CHECK_NOT_NEEDED, location);
   descriptor = Expression::make_field_reference(descriptor, 0, location);
   Expression* nil = Expression::make_nil(location);
 
@@ -393,7 +393,8 @@ Expression::convert_interface_to_type(Ty
 {
   obj = Expression::make_unsafe_cast(Type::make_pointer_type(lhs_type), 
obj,
  location);
-  obj = Expression::make_unary(OPERATOR_MULT, obj, location);
+  obj = Expression::make_dereference(obj, NIL_CHECK_NOT_NEEDED,
+ location);
 }
   return Expression::make_compound(check_iface, obj, location);
 }
@@ -3842,24 +3843,20 @@ Unary_expression::do_flatten(Gogo* gogo,
   && !this->expr_->is_variable())
 {
   go_assert(this->expr_->type()->points_to() != NULL);
-  Type* ptype = this->expr_->type()->points_to();
-  if (!ptype->is_void_type())
+  switch (this->requires_nil_check(gogo))
 {
-  int64_t s;
-  bool ok = ptype->backend_type_size(gogo, &s);
-  if (!ok)
+  case NIL_CHECK_ERROR_ENCOUNTERED:
 {
   go_assert(saw_errors());
   return Expression::make_error(this->location());
 }
-  if (s >= 4096 || this->issue_nil_check_)
-{
-  Temporary_statement* temp =
-  Statement::make_temporary(NULL, this->expr_, location);
-  inserter->insert(temp);
-  this->expr_ =
-  Expression::make_temporary_reference(temp, location);
-}
+  case NIL_CHECK_NOT_NEEDED:
+break;
+  case NIL_CHECK_NEEDED:
+this->create_temp_ = true;
+break;
+  case NIL_CHECK_DEFAULT:
+go_unreachable();
 }
 }
 
@@ -3960,6 +3957,41 @@ Unary_expression::base_is_static_initial
   return false;
 }
 
+// Return whether this dereference expression requires an explicit nil
+// check. If we are dereferencing the pointer to a large struct
+// (greater than the specified size threshold), we need to check for
+// nil. We don't bother to check for small structs because we expect
+// the system to crash on a nil pointer dereference. However, if we
+// know the address of this expression is being taken, we must alway

Re: [PATCH, rs6000] gimple folding of vec_msum()

2017-12-01 Thread Bill Schmidt
Hi Will,

> On Dec 1, 2017, at 3:43 PM, Will Schmidt  wrote:
> 
> On Fri, 2017-12-01 at 18:46 +0100, Richard Biener wrote:
>> On December 1, 2017 6:22:21 PM GMT+01:00, Will Schmidt 
>>  wrote:
>>> Hi,
>>> Add support for folding of vec_msum in GIMPLE.
>>> 
>>> This uses the DOT_PROD_EXPR gimple op, which is sensitive to type
>>> mismatches:
>>> error: type mismatch in dot product reduction
>>> __vector signed int
>>> __vector signed char
>>> __vector unsigned char
>>> D.2798 = DOT_PROD_EXPR ;
>>> So for those cases with a signed/unsigned mismatch in the arguments,
>>> this
>>> converts those arguments to their signed type.
>>> 
>>> This also adds a define_expand for sdot_prodv16qi. This is based on a
>>> similar
>>> existing entry.
>>> 
>>> Testing coverage is handled by the existing
>>> gcc.target/powerpc/fold-vec-msum*.c tests.
>>> 
>>> Sniff-tests have passed on P8.  full regtests currently running on
>>> other assorted
>>> power systems.
>>> OK for trunk with successful results?
>> 
>> Note DOT_PROD_EXPR is only useful when the result is reduced to a scalar 
>> later and the reduction order is irrelevant. 
>> 
>> This is because GIMPLE doesn't specify whether the reduction reduces 
>> odd/even or high/low lanes of the argument vectors.  Does vec_msum specify 
>> that? 
> 
> Not that I see, but there may be an implied intent here that just isn't
> obvious to me.   I'll defer to ... someone. :-)
> 
>> That said, it exists as a 'hack' for the vectorizer and isn't otherwise 
>> useful for GIMPLE.
> 
> OK.  With that in mind, should I just try to split this out into
> separate multiply and add steps?

No.  The semantics of vec_msum are very specific and can't be accurately 
represented in GIMPLE.  This one should be left as a call until expand.
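
For readers following along, here is a scalar C sketch of why the lane
grouping matters.  It assumes the usual description of the msum
instructions (each result word sums the products of its own four
adjacent bytes plus the addend) and uses signed bytes for both inputs
as a simplification; msum_adjacent, msum_strided and horizontal_sum
are hypothetical names.

```c
#include <stdint.h>

#define NWORDS 4

/* vec_msum-style grouping: word I accumulates the products of the
   four adjacent bytes 4*I .. 4*I+3, plus the addend C[I].  */
static void
msum_adjacent (const int8_t *a, const int8_t *b, const int32_t *c,
	       int32_t *out)
{
  for (int i = 0; i < NWORDS; i++)
    {
      out[i] = c[i];
      for (int j = 0; j < 4; j++)
	out[i] += (int32_t) a[4 * i + j] * b[4 * i + j];
    }
}

/* A different grouping that a generic dot-product reduction would be
   free to pick: word I accumulates bytes I, I+4, I+8 and I+12.  */
static void
msum_strided (const int8_t *a, const int8_t *b, const int32_t *c,
	      int32_t *out)
{
  for (int i = 0; i < NWORDS; i++)
    {
      out[i] = c[i];
      for (int j = 0; j < 4; j++)
	out[i] += (int32_t) a[i + 4 * j] * b[i + 4 * j];
    }
}

/* Reduce the four result words to a scalar.  */
static int32_t
horizontal_sum (const int32_t *v)
{
  int32_t sum = 0;
  for (int i = 0; i < NWORDS; i++)
    sum += v[i];
  return sum;
}
```

With a = {1, ..., 16}, b all ones and a zero addend, the adjacent
grouping yields {10, 26, 42, 58} and the strided one {28, 32, 36, 40}.
The vectors differ, but both reduce to 136, which is why the grouping
only stops mattering once the result is summed to a scalar.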

Thanks!
Bill

> 
> Thanks,
> -Will
> 
> 
> 
> 
>> 
>> 
>> Richard. 
>> 
>>> Thanks
>>> -Will
>>> 
>>> [gcc]
>>> 
>>> 2017-12-01  Will Schmidt  
>>> 
>>> * config/rs6000/altivec.md (sdot_prodv16qi): New.
>>> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
>>> gimple-folding of vec_msum.
>>> (builtin_function_type): Add entries for VMSUMU[BH]M and VMSUMMBM.
>>> 
>>> diff --git a/gcc/config/rs6000/altivec.md
>>> b/gcc/config/rs6000/altivec.md
>>> index 7122f99..fa9e121 100644
>>> --- a/gcc/config/rs6000/altivec.md
>>> +++ b/gcc/config/rs6000/altivec.md
>>> @@ -3349,11 +3349,26 @@
>>>(match_operand:V8HI 2 "register_operand" "v")]
>>>UNSPEC_VMSUMSHM)))]
>>>  "TARGET_ALTIVEC"
>>>  "
>>> {
>>> -  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
>>> operands[2], operands[3]));
>>> +  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
>>> +  operands[2], operands[3]));
>>> +  DONE;
>>> +}")
>>> +
>>> +(define_expand "sdot_prodv16qi"
>>> +  [(set (match_operand:V4SI 0 "register_operand" "=v")
>>> +(plus:V4SI (match_operand:V4SI 3 "register_operand" "v")
>>> +   (unspec:V4SI [(match_operand:V16QI 1
>>> "register_operand" "v")
>>> + (match_operand:V16QI 2
>>> "register_operand" "v")]
>>> +UNSPEC_VMSUMM)))]
>>> +  "TARGET_ALTIVEC"
>>> +  "
>>> +{
>>> +  emit_insn (gen_altivec_vmsummbm (operands[0], operands[1],
>>> +  operands[2], operands[3]));
>>>  DONE;
>>> }")
>>> 
>>> (define_expand "widen_usum3"
>>>  [(set (match_operand:V4SI 0 "register_operand" "=v")
>>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>>> index 551d9c4..552fcdd 100644
>>> --- a/gcc/config/rs6000/rs6000.c
>>> +++ b/gcc/config/rs6000/rs6000.c
>>> @@ -16614,10 +16614,40 @@ rs6000_gimple_fold_builtin
>>> (gimple_stmt_iterator *gsi)
>>>case VSX_BUILTIN_CMPLE_2DI:
>>>case VSX_BUILTIN_CMPLE_U2DI:
>>>  fold_compare_helper (gsi, LE_EXPR, stmt);
>>>  return true;
>>> 
>>> +/* vec_msum.  */
>>> +case ALTIVEC_BUILTIN_VMSUMUHM:
>>> +case ALTIVEC_BUILTIN_VMSUMSHM:
>>> +case ALTIVEC_BUILTIN_VMSUMUBM:
>>> +case ALTIVEC_BUILTIN_VMSUMMBM:
>>> +  {
>>> +   arg0 = gimple_call_arg (stmt, 0);
>>> +   arg1 = gimple_call_arg (stmt, 1);
>>> +   tree arg2 = gimple_call_arg (stmt, 2);
>>> +   lhs = gimple_call_lhs (stmt);
>>> +   if ( TREE_TYPE (arg0) == TREE_TYPE (arg1))
>>> + g = gimple_build_assign (lhs, DOT_PROD_EXPR, arg0, arg1, arg2);
>>> +   else
>>> + {
>>> +   // For the case where we have a mix of signed/unsigned
>>> +   // arguments, convert both multiply args to their signed type.
>>> +   gimple_seq stmts = NULL;
>>> +   location_t loc = gimple_location (stmt);
>>> +   tree new_arg_type = signed_type_for (TREE_TYPE (arg0));
>>> +   tree signed_arg0 = gimple_convert (&stmts, loc, new_arg_type,
>>> arg0);
>>> +   tree signed_arg1 = gimple_convert (&stmts, loc, new_arg_type,
>>> arg1);
>>> +   gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>>> +  

[RFC][PATCH] PR preprocessor/83173: Additional check before decrementing highest_location

2017-12-01 Thread Mike Gulick
I've come up with some patches that fix PR preprocessor/83173, which I reported
a couple of weeks ago.

The first patch is a test case.  The second and third patches are two versions
of the fix.  The first version is simpler, but it may still leave in place some
subtle incorrect behavior that happens when the current source location is less
than LINE_MAP_MAX_COLUMN_NUMBER.  The second version tries to handle that case
as well, however I'm less comfortable with it as I don't know whether I'm
computing the source_location of the *end* of the current line correctly in all
cases.  Both of these pass the gcc/g++ test suites with no regressions.

Thanks in advance for the review/feedback!

-Mike
From 6ff0068284c346c8db08c4b6b4d9a66d8464aeac Mon Sep 17 00:00:00 2001
From: Mike Gulick 
Date: Thu, 30 Nov 2017 18:35:48 -0500
Subject: [PATCH 1/2] PR preprocessor/83173: New test

2017-12-01  Mike Gulick  

	PR preprocessor/83173
	* gcc.dg/plugin/pr83173.c: New test.
	* gcc.dg/plugin/pr83173.h: Header for pr83173.c
	* gcc.dg/plugin/pr83173-1.h: Header for pr83173.c
	* gcc.dg/plugin/pr83173-2.h: Header for pr83173.c
	* gcc.dg/plugin/location_overflow_pp_plugin.c: New plugin to
	override line_table->highest_location for preprocessor.
---
 .../gcc.dg/plugin/location_overflow_pp_plugin.c| 44 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp |  1 +
 gcc/testsuite/gcc.dg/plugin/pr83173-1.h|  2 +
 gcc/testsuite/gcc.dg/plugin/pr83173-2.h|  2 +
 gcc/testsuite/gcc.dg/plugin/pr83173.c  | 21 +++
 gcc/testsuite/gcc.dg/plugin/pr83173.h  |  2 +
 6 files changed, 72 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/location_overflow_pp_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/pr83173-1.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/pr83173-2.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/pr83173.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/pr83173.h

diff --git a/gcc/testsuite/gcc.dg/plugin/location_overflow_pp_plugin.c b/gcc/testsuite/gcc.dg/plugin/location_overflow_pp_plugin.c
new file mode 100644
index 000..ba5a795b937
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/location_overflow_pp_plugin.c
@@ -0,0 +1,44 @@
+/* Plugin for testing how gracefully we degrade in the face of very
+   large source files.  */
+
+#include "config.h"
+#include "gcc-plugin.h"
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic.h"
+
+int plugin_is_GPL_compatible;
+
+static location_t base_location;
+
+/* Callback handler for the PLUGIN_PRAGMAS event.  This is used to set the
+   initial line table offset for the preprocessor, to make it appear as if we
+   had parsed a very large file.  PRAGMA_START_UNIT is not suitable here as is
+   not invoked during the preprocessor stage.  */
+
+static void
+on_pragma_registration (void *gcc_data, void *user_data)
+{
+  line_table->highest_location = base_location;
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	 struct plugin_gcc_version * /*version */ )
+{
+  /* Read VALUE from -fplugin-arg-location_overflow_pp_plugin-value=
+ in hexadecimal form into base_location.  */
+  for (int i = 0; i < plugin_info->argc; i++)
+{
+  if (0 == strcmp (plugin_info->argv[i].key, "value"))
+	base_location = strtol (plugin_info->argv[i].value, NULL, 16);
+}
+
+  if (!base_location)
+error_at (UNKNOWN_LOCATION, "missing plugin argument");
+
+  register_callback (plugin_info->base_name,
+		 PLUGIN_PRAGMAS, on_pragma_registration, NULL);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index c7a3b4dbf2f..69d67caa846 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -79,6 +79,7 @@ set plugin_test_list [list \
 { location_overflow_plugin.c \
 	  location-overflow-test-1.c \
 	  location-overflow-test-2.c } \
+{ location_overflow_pp_plugin.c pr83173.c } \
 { must_tail_call_plugin.c \
 	  must-tail-call-1.c \
 	  must-tail-call-2.c } \
diff --git a/gcc/testsuite/gcc.dg/plugin/pr83173-1.h b/gcc/testsuite/gcc.dg/plugin/pr83173-1.h
new file mode 100644
index 000..bf05d561976
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/pr83173-1.h
@@ -0,0 +1,2 @@
+#pragma once
+#define PR83173_1_H
diff --git a/gcc/testsuite/gcc.dg/plugin/pr83173-2.h b/gcc/testsuite/gcc.dg/plugin/pr83173-2.h
new file mode 100644
index 000..dd0fc94bf53
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/pr83173-2.h
@@ -0,0 +1,2 @@
+#pragma once
+#define PR83173_2_H
diff --git a/gcc/testsuite/gcc.dg/plugin/pr83173.c b/gcc/testsuite/gcc.dg/plugin/pr83173.c
new file mode 100644
index 000..ff1858a2b33
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/pr83173.c
@@ -0,0 +1,21 @@
+/*
+  { dg-options "-fplugin-arg-location_overflow_pp_plugin-value=0x6001" }
+  { dg-additional-files "pr83173.h" "pr83173-1.h" "pr83173-2.h" }
+  { dg-do preprocess }
+*

Re: [PATCH] rs6000: Improve fusion assembler output

2017-12-01 Thread Segher Boessenkool
On Thu, Nov 30, 2017 at 12:54:53PM -0500, Michael Meissner wrote:
> On Thu, Nov 30, 2017 at 11:59:37AM +, Segher Boessenkool wrote:
> > This improves the output for load and store fusion a little.  In most
> > cases it removes the comment output, because that makes the generated
> > assembler code hard to read, and equivalent info is available with -dp
> > anyway.  For the vector loads it puts the comment on the second insn,
> > where it doesn't interfere with other debug comments.
> > 
> > Mike, does this look good?  Or is there something I'm missing :-)
> > 
> > Tested on powerpc64-linux {-m32,-m64}.
> 
> The comment was used by my perl script (analyze-ppc-asm) that looks at .s files
> and prints out statistics.  I can adjust the tool so it no longer looks for the
> comment, but actually looks at the adjacent instructions (which I do in a few
> other cases).

Ah okay, thanks!  I'll commit the patch then.


Segher


Re: [PATCH] final: Improve output for -dp and -fverbose-asm

2017-12-01 Thread Segher Boessenkool
Hi!

On Thu, Nov 30, 2017 at 10:55:04AM -0700, Martin Sebor wrote:
> >Or, for that matter, what "length" means?  Could be byte-length, sure.
> >But OTOH, for a RISC target it's always four, so why print it?  The GCC
> >developers surely meant cycle-length with that, nothing else makes sense.
> 
> Heh.  I thought it meant the length of the instruction in bytes,
> and it made perfect sense to me.  Sounds like I misinterpreted it.

It is:

"Lengths are measured in addressable storage units (bytes)."

(which is in the manual just fine; gccint of course, not the user manual).

> Which suggests that it should be mentioned in the manual (whatever
> label it ends up with).  With it documented (and the position on
> the line made clear), the length= or l= part could even be skipped
> altogether to save a few more bytes if that's important (I don't
> think it is in this case).

It is documented with -dp (I'll document it prints insn cost too).


Segher


Re: [PATCH] rs6000: Cleanup bdz/bdnz insn/splitter, add new insn/splitter for bdzt/bdzf/bdnzt/bdnzf

2017-12-01 Thread Segher Boessenkool
Hi Aaron,

On Thu, Nov 30, 2017 at 11:31:47AM -0600, Aaron Sawdey wrote:
> This does some cleanup/consolidation so that bdz/bdnz are supported by
> a single insn and splitter, and adds a new insn and splitter to support
> the conditional form of those (bdzt/bdzf/bdnzt/bdnzf).
> 
> This is going to be used for the memcmp() builtin expansion patch which
> is next. That also will require the change to canonicalize_condition I
> posted before Thanksgiving to prevent doloop from being confused by
> bdnzt et al.

>   * config/rs6000/rs6000.md (cceq_ior_compare): Remove * so I can use it
>   to generate rtl.
>   (cceq_ior_compare_complement): Give it a name so I can use it, and
>   change boolean_or_operator predicate to boolean_operator so it can
>   be used to generate a crand.
>   Define new code iterator eqne, and new code_attrs bd/bd_neg.

(eqne): New code_iterator.
etc.

>   (_) New name for ctr_internal[12] now combined into
>   a single define_insn. There is now just a single splitter for this
>   that looks whether the decremented ctr result is going to a register
>   or memory and generates the appropriate rtl.

Colon after ), two spaces after a full stop.  You don't need to explain
what a change is for btw; changelog is *what* changed, not *why*.

>   (tf_) A new insn pattern for the conditional form branch
>   decrement (bdnzt/bdnzf/bdzt/bdzf). This also has a splitter similar
>   to the one for _.
>   * config/rs6000/rs6000.c (rs6000_legitimate_combined_insn): Updated
>   with the new names of the branch decrement patterns, and added the
>   names of the branch decrement conditional patterns.

> +  && (icode == CODE_FOR_bdz_si
> +   || icode == CODE_FOR_bdz_di
> +   || icode == CODE_FOR_bdnz_si
> +   || icode == CODE_FOR_bdnz_di
> +   || icode == CODE_FOR_bdztf_si
> +   || icode == CODE_FOR_bdnztf_si
> +   || icode == CODE_FOR_bdztf_di
> +   || icode == CODE_FOR_bdnztf_di))

Please swap bdnztf_si and bdztf_di so it is clearer you handle all cases?

> +(define_code_iterator eqne [eq ne])
> +(define_code_attr bd [(ne "bdnz") (eq "bdz")])
> +(define_code_attr bd_neg [(ne "bdz") (eq "bdnz")])

Maybe order those as eq, ne as well?

> +  rtx ctrin = operands[1];
> +  rtx ctrout = operands[0];
> +  rtx ctrtmp = operands[4];

I don't think these temporaries improve legibility at all?

>operands[7] = gen_rtx_fmt_ee (GET_CODE (operands[2]), VOIDmode, 
> operands[3],
>   const0_rtx);
> +  emit_insn (gen_rtx_SET (operands[3],
> +  gen_rtx_COMPARE (CCmode, ctrin, const1_rtx)));
> +  if (gpc_reg_operand (ctrout, mode))
> +emit_insn (gen_add3 (ctrout, ctrin, constm1_rtx));
> +  else
> +{
> +  emit_insn (gen_add3 (ctrtmp, ctrin, constm1_rtx));
> +  emit_move_insn (ctrout, ctrtmp);
> +} 

(Space at end of line).

> +/* No DONE so branch comes from the pattern.  */
>  })

> +(define_insn "tf_"
>[(set (pc)
> - (if_then_else (match_operator 2 "comparison_operator"
> -   [(match_operand:P 1 "gpc_reg_operand")
> -(const_int 1)])
> -   (match_operand 5)
> -   (match_operand 6)))
> -   (set (match_operand:P 0 "nonimmediate_operand")
> + (if_then_else
> +   (and
> +  (eqne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
> +(const_int 1))
> +  (match_operator 3 "branch_comparison_operator"
> +   [(match_operand 4 "cc_reg_operand" "y,y,y,y")
> +(const_int 0)]))
> +   (label_ref (match_operand 0))
> +   (pc)))

Those last two lines should be indented the same as the "(and" (they are
sub rtx of the if_then_else).

> +{
> +  if (which_alternative != 0)
> +return "#";
> +  else if (get_attr_length (insn) == 4)
> +{
> +  if (branch_positive_comparison_operator (operands[3],
> +   GET_MODE (operands[3])))
> +return "t %j3,%l0";
> +  else
> +return "f %j3,%l0";

Eight leading spaces should be a tab.

> +}
> +  else
> +{ 

Trailing space.

> +static char seq[96];
> +char *bcs = output_cbranch (operands[3], "$+8", 1, insn);
> + sprintf(seq, " $+12\;%s;b %%l0", bcs);
> + return seq;

The indent should be six spaces on these four lines.

It should be "const char *" really; but output_cbranch has a big bug
as well it seems: it returns a pointer to a string on its stack!  Uh-oh.

> +;; Now the splitter if we could not allocate the CTR register
> +(define_split
> +  [(set (pc)
> + (if_then_else
> +   (and
> +  (match_operator 1 "comparison_operator"
> +  [(match_operand:P 0 "gpc_reg_operand")
> +   (const_int 1)])
> +  (match_operator 3 "branch_comparison_operator"

[PATCH] Fix store-merging vuse handling (PR tree-optimization/83170, PR tree-optimization/83241)

2017-12-01 Thread Jakub Jelinek
Hi!

The bswap infrastructure uses the vuse field to make sure all the loads are
having the same gimple_vuse and also uses it in bswap_replace.
When this infrastructure is used inside of the store-merging pass, the
problem is that the old stores are being removed and new added, so
gimple_vuse of the loads we record during process_stmt can change.
So, this patch updates the vuse fields before we plan to use it (in
try_coalesce_bswap for the checking and in output_merged_stores for the
bswap_replace purposes).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-12-01  Jakub Jelinek  

PR tree-optimization/83170
PR tree-optimization/83241
* gimple-ssa-store-merging.c
(imm_store_chain_info::try_coalesce_bswap): Update vuse field from
gimple_vuse (ins_stmt) in case it has changed.
(imm_store_chain_info::output_merged_store): Likewise.

* gcc.dg/store_merging_17.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2017-12-01 09:17:36.0 +0100
+++ gcc/gimple-ssa-store-merging.c  2017-12-01 16:03:40.806918965 +0100
@@ -2384,6 +2384,9 @@ imm_store_chain_info::try_coalesce_bswap
   this_n.type = type;
   if (!this_n.base_addr)
this_n.range = try_size / BITS_PER_UNIT;
+  else
+   /* Update vuse in case it has changed by output_merged_stores.  */
+   this_n.vuse = gimple_vuse (info->ins_stmt);
   unsigned int bitpos = info->bitpos - infof->bitpos;
   if (!do_shift_rotate (LSHIFT_EXPR, &this_n,
BYTES_BIG_ENDIAN
@@ -3341,10 +3344,16 @@ imm_store_chain_info::output_merged_stor
 we've checked the aliasing already in try_coalesce_bswap and
 we want to sink the need load into seq.  So need to use new_vuse
 on the load.  */
-  if (n->base_addr && n->vuse == NULL)
+  if (n->base_addr)
{
- n->vuse = new_vuse;
- ins_stmt = NULL;
+ if (n->vuse == NULL)
+   {
+ n->vuse = new_vuse;
+ ins_stmt = NULL;
+   }
+ else
+   /* Update vuse in case it has changed by output_merged_stores.  */
+   n->vuse = gimple_vuse (ins_stmt);
}
   bswap_res = bswap_replace (gsi_start (seq), ins_stmt, fndecl,
 bswap_type, load_type, n, bswap);
--- gcc/testsuite/gcc.dg/store_merging_17.c.jj  2017-12-01 16:07:20.590224536 +0100
+++ gcc/testsuite/gcc.dg/store_merging_17.c 2017-12-01 16:07:01.0 +0100
@@ -0,0 +1,17 @@
+/* PR tree-optimization/83241 */
+/* { dg-do compile { target store_merge } } */
+/* { dg-options "-O2" } */
+
+struct S { int a; short b[32]; } e;
+struct T { volatile int c; int d; } f;
+
+void
+foo ()
+{
+  struct T g = f;
+  e.b[0] = 6;
+  e.b[1] = 6;
+  e.b[4] = g.d;
+  e.b[5] = g.d >> 16;
+  e.a = 1;
+}

Jakub


Fwd: RFC: remove the "tile" architecture from glibc

2017-12-01 Thread Jeff Law

Something to consider... I'm not suggesting we remove it at this point;
it really depends on whether Walter wants to continue to maintain
the bits.


 Forwarded Message 
Subject: RFC: remove the "tile" architecture from glibc
Date: Fri, 1 Dec 2017 16:34:09 -0500
From: Chris Metcalf 
To: GNU C Library 

The tile architecture was introduced to glibc in 2011 and first
appeared in glibc 2.15.  The chip family of TILEPro and TILE-Gx was
developed by Tilera, which was eventually acquired by Mellanox.  Now
at Mellanox we are developing new chips based on the ARM64
architecture; our last TILE-Gx chip (the Gx72) was released in 2013,
and our customers using the tile architecture products are now all in
maintenance mode, as far as we know, and not looking to upgrade their
software to newer open-source releases.

Compounding this state of affairs is the fact that after twelve years
here I am moving on next week; my last day at Mellanox is December
8th.  Since tracking upstream development of the old tile architecture
is not a high priority for Mellanox, reasonably enough, it seems
cleanest at this point to propose removal of the architecture from the
glibc tree, so that the 2.26 release will be the last release to have
tile support.

If there is any desire to continue to support the tile architecture in
glibc, I'm happy to hand off to someone else as maintainer.  I'm aware
of one issue in the current code, which is that upstream gcc vector
insn support has a bug in it that causes some of the string functions
to misbehave; I can publish a fix for that before handing off, if desired.

I will in any case be dropping off the glibc list (other than perhaps
occasionally reading the archives) at the end of next week.  It's been
a rewarding experience following glibc's development over the last six
years and I will certainly miss being part of this community.

I'm keeping that libc.so.6 sticker I got from Carlos, though!  :)

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com



Re: [PATCH] C/C++: don't suggest implementation names as spelling fixes (PR c/83236)

2017-12-01 Thread Jakub Jelinek
On Fri, Dec 01, 2017 at 04:48:20PM -0500, David Malcolm wrote:
> PR c/83236 reports an issue where the C FE unhelpfully suggests the use
> of glibc's private "__ino_t" type when it fails to recognize "ino_t":
> 
> $ cat > test.c <<EOF
> #include 
> ino_t inode;
> EOF
> $ gcc -std=c89 -fsyntax-only test.c
> test.c:2:1: error: unknown type name 'ino_t'; did you mean '__ino_t'?
>  ino_t inode;
>  ^
>  __ino_t
> 
> This patch updates the C/C++ FEs suggestions for unrecognized identifiers
> so that they don't suggest names that are reserved for use by the
> implementation i.e. those that begin with an underscore and either an
> uppercase letter or another underscore.
> 
> However, it allows built-in macros that match this pattern to be
> suggested, since it's useful to be able to suggest __FILE__, __LINE__
> etc.  Other macros *are* filtered.
> 
> One wart with the patch: the existing macro-handling spellcheck code
> is in spellcheck-tree.c, and needs to call the new function
> "name_reserved_for_implementation_p", however the latter relates to
> the C family of FEs.
> Perhaps I should move all of the macro-handling stuff in
> spellcheck-tree.h/c to e.g. a new c-family/c-spellcheck.h/c as a
> first step?
> 
> Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
> 
> OK for trunk?
> 
> gcc/c/ChangeLog:
>   PR c/83236
>   * c-decl.c (lookup_name_fuzzy): Don't suggest names that are
>   reserved for use by the implementation.
> 
> gcc/cp/ChangeLog:
>   PR c/83236
>   * name-lookup.c (consider_binding_level): Don't suggest names that
>   are reserved for use by the implementation.
> 
> gcc/ChangeLog:
>   PR c/83236
>   * spellcheck-tree.c (name_reserved_for_implementation_p): New
>   function.
>   (should_suggest_as_macro_p): New function.
>   (find_closest_macro_cpp_cb): Move the check for NT_MACRO to
>   should_suggest_as_macro_p and call it.
>   (selftest::test_name_reserved_for_implementation_p): New function.
>   (selftest::spellcheck_tree_c_tests): Call it.
>   * spellcheck-tree.h (name_reserved_for_implementation_p): New
>   decl.
> 
> gcc/testsuite/ChangeLog:
>   PR c/83236
>   * c-c++-common/spellcheck-reserved.c: New test case.
> ---
>  gcc/c/c-decl.c   |  5 +++
>  gcc/cp/name-lookup.c | 18 +++---
>  gcc/spellcheck-tree.c| 46 +++-
>  gcc/spellcheck-tree.h|  2 ++
>  gcc/testsuite/c-c++-common/spellcheck-reserved.c | 25 +
>  5 files changed, 91 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/spellcheck-reserved.c
> 
> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> index 56c63d8..dfd136d 100644
> --- a/gcc/c/c-decl.c
> +++ b/gcc/c/c-decl.c
> @@ -4041,6 +4041,11 @@ lookup_name_fuzzy (tree name, enum lookup_name_fuzzy_kind kind, location_t loc)
>   if (TREE_CODE (binding->decl) == FUNCTION_DECL)
> if (C_DECL_IMPLICIT (binding->decl))
>   continue;
> + /* Don't suggest names that are reserved for use by the
> +implementation.  */
> + if (name_reserved_for_implementation_p
> + (IDENTIFIER_POINTER (binding->id)))

Can't you use a temporary to avoid wrapping line between function
name and ( ?

More importantly, does this mean if I mistype __builtin_strtchr it
won't suggest __builtin_strrchr?  Would be nice if the filtering
of the names reserved for implementation isn't done if the
name being looked up is reserved for implementation.
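
A minimal sketch of that suggestion, assuming a hypothetical helper
should_skip_candidate_p that is not part of the patch.  The reserved-name
check mirrors the one the patch adds to spellcheck-tree.c, with the
standard isupper standing in for GCC's ISUPPER; the helper only filters a
reserved candidate when the name that was typed is not itself reserved:

```c
#include <ctype.h>
#include <stdbool.h>

/* Same rule as in the patch: a leading underscore followed by an
   uppercase letter or a second underscore.  */
static bool
name_reserved_for_implementation_p (const char *str)
{
  if (str[0] != '_')
    return false;
  return str[1] == '_' || isupper ((unsigned char) str[1]);
}

/* Hypothetical helper, not in the patch: return true if CANDIDATE
   should be dropped as a suggestion for TYPED.  A reserved candidate
   is only dropped when TYPED is not itself a reserved name.  */
static bool
should_skip_candidate_p (const char *typed, const char *candidate)
{
  if (name_reserved_for_implementation_p (typed))
    return false;
  return name_reserved_for_implementation_p (candidate);
}
```

With this shape, "ino_t" would no longer be offered "__ino_t", while a
mistyped "__builtin_strtchr" could still be offered "__builtin_strrchr".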

Jakub


[PATCH] C/C++: don't suggest implementation names as spelling fixes (PR c/83236)

2017-12-01 Thread David Malcolm
PR c/83236 reports an issue where the C FE unhelpfully suggests the use
of glibc's private "__ino_t" type when it fails to recognize "ino_t":

$ cat > test.c <
ino_t inode;
EOF
$ gcc -std=c89 -fsyntax-only test.c
test.c:2:1: error: unknown type name 'ino_t'; did you mean '__ino_t'?
 ino_t inode;
 ^
 __ino_t

This patch updates the C/C++ FEs suggestions for unrecognized identifiers
so that they don't suggest names that are reserved for use by the
implementation i.e. those that begin with an underscore and either an
uppercase letter or another underscore.

However, it allows built-in macros that match this pattern to be
suggested, since it's useful to be able to suggest __FILE__, __LINE__
etc.  Other macros *are* filtered.

One wart with the patch: the existing macro-handling spellcheck code
is in spellcheck-tree.c, and needs to call the new function
"name_reserved_for_implementation_p", however the latter relates to
the C family of FEs.
Perhaps I should move all of the macro-handling stuff in
spellcheck-tree.h/c to e.g. a new c-family/c-spellcheck.h/c as a
first step?

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/c/ChangeLog:
PR c/83236
* c-decl.c (lookup_name_fuzzy): Don't suggest names that are
reserved for use by the implementation.

gcc/cp/ChangeLog:
PR c/83236
* name-lookup.c (consider_binding_level): Don't suggest names that
are reserved for use by the implementation.

gcc/ChangeLog:
PR c/83236
* spellcheck-tree.c (name_reserved_for_implementation_p): New
function.
(should_suggest_as_macro_p): New function.
(find_closest_macro_cpp_cb): Move the check for NT_MACRO to
should_suggest_as_macro_p and call it.
(selftest::test_name_reserved_for_implementation_p): New function.
(selftest::spellcheck_tree_c_tests): Call it.
* spellcheck-tree.h (name_reserved_for_implementation_p): New
decl.

gcc/testsuite/ChangeLog:
PR c/83236
* c-c++-common/spellcheck-reserved.c: New test case.
---
 gcc/c/c-decl.c   |  5 +++
 gcc/cp/name-lookup.c | 18 +++---
 gcc/spellcheck-tree.c| 46 +++-
 gcc/spellcheck-tree.h|  2 ++
 gcc/testsuite/c-c++-common/spellcheck-reserved.c | 25 +
 5 files changed, 91 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/spellcheck-reserved.c

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 56c63d8..dfd136d 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -4041,6 +4041,11 @@ lookup_name_fuzzy (tree name, enum lookup_name_fuzzy_kind kind, location_t loc)
if (TREE_CODE (binding->decl) == FUNCTION_DECL)
  if (C_DECL_IMPLICIT (binding->decl))
continue;
+   /* Don't suggest names that are reserved for use by the
+  implementation.  */
+   if (name_reserved_for_implementation_p
+   (IDENTIFIER_POINTER (binding->id)))
+ continue;
switch (kind)
  {
  case FUZZY_LOOKUP_TYPENAME:
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 9f65c4d..fd2c335 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -5633,10 +5633,20 @@ consider_binding_level (tree name, best_match  &bm,
  && DECL_ANTICIPATED (d))
continue;
 
-  if (tree name = DECL_NAME (d))
-   /* Ignore internal names with spaces in them.  */
-   if (!strchr (IDENTIFIER_POINTER (name), ' '))
- bm.consider (IDENTIFIER_POINTER (name));
+  tree name = DECL_NAME (d);
+  if (!name)
+   continue;
+
+  /* Ignore internal names with spaces in them.  */
+  if (strchr (IDENTIFIER_POINTER (name), ' '))
+   continue;
+
+  /* Don't suggest names that are reserved for use by the
+implementation.  */
+  if (name_reserved_for_implementation_p (IDENTIFIER_POINTER (name)))
+   continue;
+
+  bm.consider (IDENTIFIER_POINTER (name));
 }
 }
 
diff --git a/gcc/spellcheck-tree.c b/gcc/spellcheck-tree.c
index 56740b9..4c244fa 100644
--- a/gcc/spellcheck-tree.c
+++ b/gcc/spellcheck-tree.c
@@ -66,6 +66,36 @@ find_closest_identifier (tree target, const auto_vec 
*candidates)
   return bm.get_best_meaningful_candidate ();
 }
 
+/* Return true iff STR begin with an underscore and either an uppercase
+   letter or another underscore, and is thus, for C and C++, reserved for
+   use by the implementation.  */
+
+bool
+name_reserved_for_implementation_p (const char *str)
+{
+  if (str[0] != '_')
+return false;
+  return (str[1] == '_' || ISUPPER(str[1]));
+}
+
+/* Return true iff HASHNODE is a macro that should be offered as a
+   suggestion for a misspelling.  */
+
+static bool
+should_suggest_as_macro_p (cpp_hashnode *hashnode)
+{
+  if (hashnode->type != NT_MACRO)
+return false;
+
+  /* Don't suggest names reserved for th

Re: [PATCH, rs6000] gimple folding of vec_msum()

2017-12-01 Thread Will Schmidt
On Fri, 2017-12-01 at 18:46 +0100, Richard Biener wrote:
> On December 1, 2017 6:22:21 PM GMT+01:00, Will Schmidt 
>  wrote:
> >Hi,
> >Add support for folding of vec_msum in GIMPLE.
> >
> >This uses the DOT_PROD_EXPR gimple op, which is sensitive to type
> >mismatches:
> > error: type mismatch in dot product reduction
> > __vector signed int
> > __vector signed char
> > __vector unsigned char
> > D.2798 = DOT_PROD_EXPR ;
> >So for those cases with a signed/unsigned mismatch in the arguments,
> >this
> >converts those arguments to their signed type.
> >
> >This also adds a define_expand for sdot_prodv16qi. This is based on a
> >similar
> >existing entry.
> >
> >Testing coverage is handled by the existing
> >gcc.target/powerpc/fold-vec-msum*.c tests.
> >
> >Sniff-tests have passed on P8.  full regtests currently running on
> >other assorted
> >power systems.
> >OK for trunk with successful results?
> 
> Note DOT_PROD_EXPR is only useful when the result is reduced to a scalar 
> later and the reduction order is irrelevant. 
> 
> This is because GIMPLE doesn't specify whether the reduction reduces odd/even 
> or high/low lanes of the argument vectors.  Does vec_msum specify that? 

Not that I see, but there may be an implied intent here that just isn't
obvious to me.   I'll defer to ... someone. :-)
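
To make the lane-order concern concrete, here is a rough scalar sketch.
The two lane groupings below (dot_adjacent and dot_strided) are
illustrative assumptions, not a statement of what GIMPLE or the hardware
actually does.  Both accumulate 16 byte products into 4 lanes; the
per-lane results differ, and only the fully reduced scalar agrees:

```c
#include <stddef.h>

enum { NBYTES = 16, NLANES = 4 };

/* Lane i accumulates the products of the adjacent bytes 4*i .. 4*i+3.  */
static void
dot_adjacent (const signed char *a, const signed char *b,
              const int *acc, int *out)
{
  for (size_t lane = 0; lane < NLANES; lane++)
    {
      out[lane] = acc[lane];
      for (size_t k = 0; k < NBYTES / NLANES; k++)
        out[lane] += a[4 * lane + k] * b[4 * lane + k];
    }
}

/* Lane i accumulates the products of bytes i, i+4, i+8 and i+12.  */
static void
dot_strided (const signed char *a, const signed char *b,
             const int *acc, int *out)
{
  for (size_t lane = 0; lane < NLANES; lane++)
    {
      out[lane] = acc[lane];
      for (size_t k = 0; k < NBYTES / NLANES; k++)
        out[lane] += a[lane + NLANES * k] * b[lane + NLANES * k];
    }
}

/* Sum all lanes into one scalar.  */
static int
reduce (const int *v)
{
  int sum = 0;
  for (size_t lane = 0; lane < NLANES; lane++)
    sum += v[lane];
  return sum;
}
```

With a = 1..16, b all ones and a zero accumulator, the two functions
produce different vectors, yet reduce gives the same total for both.  A
builtin whose per-lane result is observable would need the grouping
pinned down.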

> That said, it exists as a 'hack' for the vectorizer and isn't otherwise 
> useful for GIMPLE.

OK.  With that in mind, should I just try to split this out into
separate multiply and add steps?

Thanks,
-Will




>  
> 
> Richard. 
> 
> >Thanks
> >-Will
> >
> >[gcc]
> >
> >2017-12-01  Will Schmidt  
> >
> > * config/rs6000/altivec.md (sdot_prodv16qi): New.
> > * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
> > gimple-folding of vec_msum.
> > (builtin_function_type): Add entries for VMSUMU[BH]M and VMSUMMBM.
> >
> >diff --git a/gcc/config/rs6000/altivec.md
> >b/gcc/config/rs6000/altivec.md
> >index 7122f99..fa9e121 100644
> >--- a/gcc/config/rs6000/altivec.md
> >+++ b/gcc/config/rs6000/altivec.md
> >@@ -3349,11 +3349,26 @@
> > (match_operand:V8HI 2 "register_operand" "v")]
> > UNSPEC_VMSUMSHM)))]
> >   "TARGET_ALTIVEC"
> >   "
> > {
> >-  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
> >operands[2], operands[3]));
> >+  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
> >+   operands[2], operands[3]));
> >+  DONE;
> >+}")
> >+
> >+(define_expand "sdot_prodv16qi"
> >+  [(set (match_operand:V4SI 0 "register_operand" "=v")
> >+(plus:V4SI (match_operand:V4SI 3 "register_operand" "v")
> >+   (unspec:V4SI [(match_operand:V16QI 1
> >"register_operand" "v")
> >+ (match_operand:V16QI 2
> >"register_operand" "v")]
> >+UNSPEC_VMSUMM)))]
> >+  "TARGET_ALTIVEC"
> >+  "
> >+{
> >+  emit_insn (gen_altivec_vmsummbm (operands[0], operands[1],
> >+   operands[2], operands[3]));
> >   DONE;
> > }")
> > 
> > (define_expand "widen_usum3"
> >   [(set (match_operand:V4SI 0 "register_operand" "=v")
> >diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> >index 551d9c4..552fcdd 100644
> >--- a/gcc/config/rs6000/rs6000.c
> >+++ b/gcc/config/rs6000/rs6000.c
> >@@ -16614,10 +16614,40 @@ rs6000_gimple_fold_builtin
> >(gimple_stmt_iterator *gsi)
> > case VSX_BUILTIN_CMPLE_2DI:
> > case VSX_BUILTIN_CMPLE_U2DI:
> >   fold_compare_helper (gsi, LE_EXPR, stmt);
> >   return true;
> > 
> >+/* vec_msum.  */
> >+case ALTIVEC_BUILTIN_VMSUMUHM:
> >+case ALTIVEC_BUILTIN_VMSUMSHM:
> >+case ALTIVEC_BUILTIN_VMSUMUBM:
> >+case ALTIVEC_BUILTIN_VMSUMMBM:
> >+  {
> >+arg0 = gimple_call_arg (stmt, 0);
> >+arg1 = gimple_call_arg (stmt, 1);
> >+tree arg2 = gimple_call_arg (stmt, 2);
> >+lhs = gimple_call_lhs (stmt);
> >+if ( TREE_TYPE (arg0) == TREE_TYPE (arg1))
> >+  g = gimple_build_assign (lhs, DOT_PROD_EXPR, arg0, arg1, arg2);
> >+else
> >+  {
> >+// For the case where we have a mix of signed/unsigned
> >+// arguments, convert both multiply args to their signed type.
> >+gimple_seq stmts = NULL;
> >+location_t loc = gimple_location (stmt);
> >+tree new_arg_type = signed_type_for (TREE_TYPE (arg0));
> >+tree signed_arg0 = gimple_convert (&stmts, loc, new_arg_type,
> >arg0);
> >+tree signed_arg1 = gimple_convert (&stmts, loc, new_arg_type,
> >arg1);
> >+gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
> >+g = gimple_build_assign (lhs, DOT_PROD_EXPR,
> >+ signed_arg0, signed_arg1, arg2);
> >+  }
> >+gimple_set_location (g, gimple_location (stmt));
> >+gsi_replace (gsi, g, true);
> >+return true;
> >+  }
> >+
> > default:
> >   if (TARGET_DEBUG_BU

C++ PATCH for c++/79228, complex literal suffixes

2017-12-01 Thread Jason Merrill
79228 points out that C++14 defines complex literal suffixes that
conflict with the GNU suffixes.  In this patch I take the approach
that in C++14 and up, if <complex> has been included we assume that
the user wants the C++14 suffixes and give a hard error if they aren't
found; otherwise we assume that the user wants the GNU suffixes and
give a pedwarn if -Wpedantic.

While looking at this, I noticed that David's #include suggestion code
wasn't suggesting <complex> for references to std::complex, so the
second patch addresses that.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit e39b7d506d236ce7ef9f64d1bcf0b384bb2d8038
Author: Jason Merrill 
Date:   Fri Dec 1 07:45:03 2017 -0500

PR c++/79228 - extensions hide C++14 complex literal operators

libcpp/
* expr.c (interpret_float_suffix): Ignore 'i' in C++14 and up.
(interpret_int_suffix): Likewise.
gcc/cp/
* parser.c (cp_parser_userdef_numeric_literal): Be helpful about
'i' in C++14 and up.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index b469d1c1760..6e4c24362c6 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -4397,11 +4397,75 @@ cp_parser_userdef_numeric_literal (cp_parser *parser)
 
   release_tree_vector (args);
 
-  error ("unable to find numeric literal operator %qD", name);
-  if (!cpp_get_options (parse_in)->ext_numeric_literals)
-inform (token->location, "use -std=gnu++11 or -fext-numeric-literals "
+  /* In C++14 the standard library defines complex number suffixes that
+ conflict with GNU extensions.  Prefer them if  is #included.  */
+  bool ext = cpp_get_options (parse_in)->ext_numeric_literals;
+  bool i14 = (cxx_dialect > cxx11
+ && (id_equal (suffix_id, "i")
+ || id_equal (suffix_id, "if")
+ || id_equal (suffix_id, "il")));
+  diagnostic_t kind = DK_ERROR;
+  int opt = 0;
+
+  if (i14 && ext)
+{
+  tree cxlit = lookup_qualified_name (std_node,
+ get_identifier ("complex_literals"),
+ 0, false, false);
+  if (cxlit == error_mark_node)
+   {
+ /* No , so pedwarn and use GNU semantics.  */
+ kind = DK_PEDWARN;
+ opt = OPT_Wpedantic;
+   }
+}
+
+  bool complained
+= emit_diagnostic (kind, input_location, opt,
+  "unable to find numeric literal operator %qD", name);
+
+  if (!complained)
+/* Don't inform either.  */;
+  else if (i14)
+{
+  inform (token->location, "add % "
+ "(from ) to enable the C++14 user-defined literal "
+ "suffixes");
+  if (ext)
+   inform (token->location, "or use % instead of % for the "
+   "GNU built-in suffix");
+}
+  else if (!ext)
+inform (token->location, "use -fext-numeric-literals "
"to enable more built-in suffixes");
-  return error_mark_node;
+
+  if (kind == DK_ERROR)
+value = error_mark_node;
+  else
+{
+  /* Use the built-in semantics.  */
+  tree type;
+  if (id_equal (suffix_id, "i"))
+   {
+ if (TREE_CODE (value) == INTEGER_CST)
+   type = integer_type_node;
+ else
+   type = double_type_node;
+   }
+  else if (id_equal (suffix_id, "if"))
+   type = float_type_node;
+  else /* if (id_equal (suffix_id, "il")) */
+   type = long_double_type_node;
+
+  value = build_complex (build_complex_type (type),
+fold_convert (type, integer_zero_node),
+fold_convert (type, value));
+}
+
+  if (cp_parser_uncommitted_to_tentative_parse_p (parser))
+/* Avoid repeated diagnostics.  */
+token->u.value = value;
+  return value;
 }
 
 /* Parse a user-defined string constant.  Returns a call to a user-defined
diff --git a/gcc/testsuite/g++.dg/cpp0x/gnu_fext-numeric-literals.C b/gcc/testsuite/g++.dg/cpp0x/gnu_fext-numeric-literals.C
index 6a8398b896a..ac2db287f3a 100644
--- a/gcc/testsuite/g++.dg/cpp0x/gnu_fext-numeric-literals.C
+++ b/gcc/testsuite/g++.dg/cpp0x/gnu_fext-numeric-literals.C
@@ -4,7 +4,7 @@
 //  Integer imaginary...
 
 constexpr unsigned long long
-operator"" i(unsigned long long n) // { dg-warning "shadowed by implementation" }
+operator"" i(unsigned long long n) // { dg-warning "shadowed by implementation" "" { target c++11_only } }
 { return 4 * n + 0; }
 
 constexpr unsigned long long
@@ -22,7 +22,7 @@ operator"" J(unsigned long long n) // { dg-warning "shadowed by implementation"
 //  Floating-point imaginary...
 
 constexpr long double
-operator"" i(long double n) // { dg-warning "shadowed by implementation" }
+operator"" i(long double n) // { dg-warning "shadowed by implementation" "" { target c++11_only } }
 { return 4.0L * n + 0.0L; }
 
 constexpr long double
diff --git a/gcc/testsuite/g++.dg/cpp0x/std_fext-numeric-literals.C b/gcc/testsuite/g++.dg/cpp0x/std_fext-numeric-literals.C
index 7caaa7cee8d..ff1e7b6d

[Patch][aarch64] Add missing thunderx2-t99 instruction scheduling pipeline descriptions.

2017-12-01 Thread Steve Ellcey
There are a number of instruction types defined in aarch64.md which do not
have pipeline/scheduling information in thunderx2t99.md.  This patch adds
some of them: all the missing types except the neon ones, which I hope to
include in a follow-up patch.

Bootstrapped and tested with no regressions on a thunderx2.

I know we are in stage3 but I hope this type of platform-specific
change is still OK to check in.

Steve Ellcey
sell...@cavium.com


2017-11-30  Steve Ellcey  

* config/aarch64/thunderx2t99.md (thunderx2t99_branch): Add trap
to reservation.
(thunderx2t99_nothing): New insn reservation.
(thunderx2t99_mrs): New insn reservation.
(thunderx2t99_multiple): New insn reservation.
(thunderx2t99_alu_basic): Add bfx to reservation.
(thunderx2t99_fp_cmp): Add fccmps and fccmpd to reservation.


diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
index 5bcf4ff..5e48521 100644
--- a/gcc/config/aarch64/thunderx2t99.md
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -69,9 +69,26 @@
 
 (define_insn_reservation "thunderx2t99_branch" 1
   (and (eq_attr "tune" "thunderx2t99")
-   (eq_attr "type" "call,branch"))
+   (eq_attr "type" "call,branch,trap"))
   "thunderx2t99_i2")
 
+;; Misc instructions.
+
+(define_insn_reservation "thunderx2t99_nothing" 0
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "no_insn,block"))
+  "nothing")
+
+(define_insn_reservation "thunderx2t99_mrs" 0
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "mrs"))
+  "thunderx2t99_i2")
+
+(define_insn_reservation "thunderx2t99_multiple" 1
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "multiple"))
+  "thunderx2t99_i0+thunderx2t99_i1+thunderx2t99_i2+thunderx2t99_ls0+thunderx2t99_ls1+thunderx2t99_sd+thunderx2t99_i1m1+thunderx2t99_i1m2+thunderx2t99_i1m3+thunderx2t99_ls0d1+thunderx2t99_ls0d2+thunderx2t99_ls0d3+thunderx2t99_ls1d1+thunderx2t99_ls1d2+thunderx2t99_ls1d3+thunderx2t99_f0+thunderx2t99_f1")
+
 ;; Integer arithmetic/logic instructions.
 
 ; Plain register moves are handled by renaming, and don't create any uops.
@@ -87,7 +104,7 @@
    adc_reg,adc_imm,adcs_reg,adcs_imm,\
    logic_reg,logic_imm,logics_reg,logics_imm,\
    csel,adr,mov_imm,shift_reg,shift_imm,bfm,\
-   rbit,rev,extend,rotate_imm"))
+   bfx,rbit,rev,extend,rotate_imm"))
   "thunderx2t99_i012")
 
 (define_insn_reservation "thunderx2t99_alu_shift" 2
@@ -155,7 +172,7 @@
 
 (define_insn_reservation "thunderx2t99_fp_cmp" 5
   (and (eq_attr "tune" "thunderx2t99")
-   (eq_attr "type" "fcmps,fcmpd"))
+   (eq_attr "type" "fcmps,fcmpd,fccmps,fccmpd"))
   "thunderx2t99_f01")
 
 (define_insn_reservation "thunderx2t99_fp_divsqrt_s" 16


[PATCH, obv?] Fix missing newlines from local-pure-const pass dump

2017-12-01 Thread Luis Machado
I noticed the debugging output from local-pure-const pass is missing a
newline in a couple places, leading to this:

 local analysis of main
   scanning: i ={v} 0;
Volatile stmt is not const/pure
Volatile operand is not const/pure  scanning: j ={v} 20;
Volatile stmt is not const/pure
Volatile operand is not const/pure  scanning: vol.0_10 ={v} i;
Volatile stmt is not const/pure

It should've been:

 local analysis of main
   scanning: i ={v} 0;
Volatile stmt is not const/pure
Volatile operand is not const/pure
   scanning: j ={v} 20;
Volatile stmt is not const/pure
Volatile operand is not const/pure
   scanning: vol.0_10 ={v} i;
Volatile stmt is not const/pure

Seems fairly obvious. OK?

gcc/ChangeLog:

2017-12-01  Luis Machado  

* ipa-pure-const.c (check_decl): Add missing newline.
(state_from_flags): Likewise.
---
 gcc/ipa-pure-const.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index bdc7522..22f92fc 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -332,7 +332,7 @@ check_decl (funct_state local,
 {
   local->pure_const_state = IPA_NEITHER;
   if (dump_file)
-fprintf (dump_file, "Volatile operand is not const/pure");
+fprintf (dump_file, "Volatile operand is not const/pure\n");
   return;
 }
 
@@ -446,7 +446,7 @@ state_from_flags (enum pure_const_state_e *state, bool *looping,
 {
   *looping = true;
   if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, " looping");
+   fprintf (dump_file, " looping\n");
 }
   if (flags & ECF_CONST)
 {
-- 
2.7.4



[patch, fortran, committed] Fix part 2 of PR 83224

2017-12-01 Thread Thomas Koenig

Hello world,

I have committed the fix for the second part (and original
test case) of PR 83224 as obvious after regression-testing
as r255331.

Regards

Thomas

2017-12-01  Thomas Koenig  

PR fortran/83224
* frontend-passes.c (create_var): Also handle
character arrays, handling deferred lengths.

2017-12-01  Thomas Koenig  

PR fortran/83224
* gfortran.dg/dependency_51.f90: New test.
Index: frontend-passes.c
===
--- frontend-passes.c	(Revision 255294)
+++ frontend-passes.c	(Arbeitskopie)
@@ -767,7 +767,7 @@ create_var (gfc_expr * e, const char *vname)
 }
 
   deferred = 0;
-  if (e->ts.type == BT_CHARACTER && e->rank == 0)
+  if (e->ts.type == BT_CHARACTER)
 {
   gfc_expr *length;
 
@@ -778,6 +778,8 @@ create_var (gfc_expr * e, const char *vname)
   else
 	{
 	  symbol->attr.allocatable = 1;
+	  symbol->ts.u.cl->length = NULL;
+	  symbol->ts.deferred = 1;
 	  deferred = 1;
 	}
 }
@@ -790,7 +792,7 @@ create_var (gfc_expr * e, const char *vname)
 
   result = gfc_get_expr ();
   result->expr_type = EXPR_VARIABLE;
-  result->ts = e->ts;
+  result->ts = symbol->ts;
   result->ts.deferred = deferred;
   result->rank = e->rank;
   result->shape = gfc_copy_shape (e->shape, e->rank);
! { dg-do  run }
! PR 83224 - dependency mishandling with an array constructor
! Original test case by Urban Jost
program dusty_corner
  implicit none
  character(len=:),allocatable :: words(:)
  integer :: n

  words=[character(len=3) :: 'one', 'two']
  n = 5
  words=[character(len=n) :: words, 'three']
  if (any(words /= [ "one  ", "two  ", "three"])) call abort

end program dusty_corner


Re: [PATCH] handle non-constant offsets in -Wstringop-overflow (PR 77608)

2017-12-01 Thread Martin Sebor

On 12/01/2017 01:26 AM, Jeff Law wrote:

On 11/30/2017 01:30 PM, Martin Sebor wrote:

On 11/22/2017 05:03 PM, Jeff Law wrote:

On 11/21/2017 12:07 PM, Martin Sebor wrote:

On 11/21/2017 09:55 AM, Jeff Law wrote:

On 11/19/2017 04:28 PM, Martin Sebor wrote:

On 11/18/2017 12:53 AM, Jeff Law wrote:

On 11/17/2017 12:36 PM, Martin Sebor wrote:

The attached patch enhances -Wstringop-overflow to detect more
instances of buffer overflow at compile time by handling non-
constant offsets into the destination object that are known to
be in some range.  The solution could be improved by handling
even more cases (e.g., anti-ranges or offsets relative to
pointers beyond the beginning of an object) but it's a start.

In addition to bootstrapping/regtesting GCC, also tested with
Binutils/GDB, Glibc, and the Linux kernel on x86_64 with no
regressions.

Martin

The top of GDB fails to compile at the moment so the validation
there was incomplete.

gcc-77608.diff


PR middle-end/77608 - missing protection on trivially detectable
runtime buffer overflow

gcc/ChangeLog:

PR middle-end/77608
* builtins.c (compute_objsize): Handle non-constant offsets.

gcc/testsuite/ChangeLog:

PR middle-end/77608
* gcc.dg/Wstringop-overflow.c: New test.

The recursive call into compute_objsize passing in the ostype avoids
having to think about the whole object vs nearest containing object
issues.  Right?

What's left to worry about is maximum or minimum remaining bytes in the
object.  At least that's my understanding of how ostype works here.

So we get the amount remaining, ignoring the variable offset, from the
recursive call (SIZE).  The space left after we account for the variable
offset is [SIZE - MAX, SIZE - MIN].  So ISTM for type 0/1 you have to
return SIZE-MIN (which you do) and for type 2/3 you have to return
SIZE-MAX which I think you get wrong (and you have to account for the
possibility that MAX or MIN is greater than SIZE and thus there's
nothing left).
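
The interval arithmetic above can be sketched as follows.  This is only
an illustration under stated assumptions: struct remaining and
remaining_after_offset are hypothetical names, not anything in
builtins.c, and sizes are plain 64-bit integers rather than trees.  The
"most" field is SIZE - MIN, the "least" field is SIZE - MAX, and both
are clamped to zero when the offset bound exceeds SIZE:

```c
#include <stdint.h>

/* Bounds on the bytes left in an object once a ranged offset has been
   applied.  Hypothetical type, for illustration only.  */
struct remaining
{
  uint64_t least;  /* SIZE - MAX: space that is certainly available.  */
  uint64_t most;   /* SIZE - MIN: space that may be available.  */
};

/* SIZE is the space left ignoring the variable offset; the offset is
   only known to lie in [MIN_OFF, MAX_OFF].  Clamp at zero so that a
   bound larger than SIZE means nothing is left.  */
static struct remaining
remaining_after_offset (uint64_t size, uint64_t min_off, uint64_t max_off)
{
  struct remaining r;
  r.most = min_off < size ? size - min_off : 0;
  r.least = max_off < size ? size - max_off : 0;
  return r;
}
```

For a 7-byte object and an offset in [2, 5] this gives least 2 and most
5.  With an unconstrained 32-bit unsigned offset, [0, UINT32_MAX], it
gives least 0 and most 7.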


Subtracting the upper bound of the offset from the size instead
of the lower bound when the caller is asking for the minimum
object size would make the result essentially meaningless in
common cases where the offset is smaller than size_t, as in:

  char a[7];

  void f (const char *s, unsigned i)
  {
__builtin_strcpy (a + i, s);
  }

Here, i's range is [0, UINT_MAX].

IMO, it's only useful to use the lower bound here, otherwise
the result would only rarely be non-zero.

But when we're asking for the minimum left, aren't we essentially asking
for "how much space am I absolutely sure I can write"?  And if that is
the question, then the only conservatively correct answer is to subtract
the high bound.


I suppose you could look at it that way but IME with this work
(now, and also last year when I submitted a patch actually
changing the built-in), using the upper bound is just not that
useful because it's too often way too big.  There's no way to
distinguish an out-of-range upper bound that's the result of
an inadequate attempt to constrain a value from an out-of-range
upper bound that is sufficiently constrained but in a way GCC
doesn't see.

Understood.

So while it's reasonable to not warn in those cases where we just have
crap range information (that's always going to be the case for some code
regardless of how good my work or Andrew/Aldy's work is), we have to be
very careful and make sure that nobody acts on this information for
optimization purposes because what we're returning is not conservatively
correct.




There are no clients of this API that would be affected by
the decision one way or the other (unless the user specifies
a -Wstringop-overflow= argument greater than the default 2)
so I don't think what we do now matters much, if at all.

Right, but what's to stop someone without knowledge of the
implementation and its quirk of not returning the conservatively safe
result from using the results in other ways?


Presumably they would find out by testing their code.  But this
is a hypothetical scenario.  I added the function for warnings.
I wasn't expecting it to be used for optimization, no such uses
have emerged, and I don't have the impression that anyone is
contemplating adding them (certainly not in stage 3).  If you
think the function could be useful for optimization then we
should certainly consider changing it as we gain experience
with it under those conditions.

Merely passing tests does not mean the code is correct, we both have the
war stories and scars to prove it. :-)  Hell, I prove it to myself
nearly daily :(

Furthermore, just because nobody is using the function today in an
optimization context does not mean it will always be the case.  Worse
yet, someone could potentially use a caller of compute_objsize without
knowing about the limitations.

In fact, if I look at how we handle expand_builtin_mempcpy we have:

 /* Avoid expanding mempcpy into memcpy when the call is determined
 to overflow the buffer.  This also prevents the 

Re: [PATCH, rs6000] gimple folding of vec_msum()

2017-12-01 Thread Richard Biener
On December 1, 2017 6:22:21 PM GMT+01:00, Will Schmidt 
 wrote:
>Hi,
>Add support for folding of vec_msum in GIMPLE.
>
>This uses the DOT_PROD_EXPR gimple op, which is sensitive to type
>mismatches:
>   error: type mismatch in dot product reduction
>   __vector signed int
>   __vector signed char
>   __vector unsigned char
>   D.2798 = DOT_PROD_EXPR ;
>So for those cases with a signed/unsigned mismatch in the arguments,
>this
>converts those arguments to their signed type.
>
>This also adds a define_expand for sdot_prodv16qi. This is based on a
>similar
>existing entry.
>
>Testing coverage is handled by the existing
>gcc.target/powerpc/fold-vec-msum*.c tests.
>
>Sniff-tests have passed on P8.  full regtests currently running on
>other assorted
>power systems.
>OK for trunk with successful results?

Note DOT_PROD_EXPR is only useful when the result is reduced to a scalar later 
and the reduction order is irrelevant. 

This is because GIMPLE doesn't specify whether the reduction reduces odd/even 
or high/low lanes of the argument vectors.  Does vec_msum specify that? 

That said, it exists as a 'hack' for the vectorizer and isn't otherwise useful 
for GIMPLE. 
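
As a reference point for the lane question, here is a scalar C++ model of the
mixed-sign byte form (vmsummbm) as I understand the AltiVec definition: each
32-bit result element adds the four byte products from the same word position
to the accumulator, with wraparound.  This is a sketch under that assumption,
not text from the patch; the type aliases and the function name are
hypothetical.

```cpp
#include <array>
#include <cstdint>

using V16s = std::array<std::int8_t, 16>;   // vector signed char
using V16u = std::array<std::uint8_t, 16>;  // vector unsigned char
using V4i  = std::array<std::int32_t, 4>;   // vector signed int

// Scalar model: out[w] = acc[w] + sum of a[4w+j] * b[4w+j] for j = 0..3,
// computed modulo 2^32.
V4i msum_mixed_bytes (const V16s &a, const V16u &b, const V4i &acc)
{
  V4i out{};
  for (int w = 0; w < 4; ++w)
    {
      // Accumulate in unsigned arithmetic so wraparound is well defined.
      std::uint32_t sum = static_cast<std::uint32_t> (acc[w]);
      for (int j = 0; j < 4; ++j)
        {
          int product = int (a[4 * w + j]) * int (b[4 * w + j]);
          sum += static_cast<std::uint32_t> (product);
        }
      out[w] = static_cast<std::int32_t> (sum);
    }
  return out;
}
```

In this model the grouping of input bytes into output words is fixed, which is
exactly the property DOT_PROD_EXPR leaves unspecified.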

Richard. 

>Thanks
>-Will
>
>[gcc]
>
>2017-12-01  Will Schmidt  
>
>   * config/rs6000/altivec.md (sdot_prodv16qi): New.
>   * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
>   gimple-folding of vec_msum.
>   (builtin_function_type): Add entries for VMSUMU[BH]M and VMSUMMBM.
>
>diff --git a/gcc/config/rs6000/altivec.md
>b/gcc/config/rs6000/altivec.md
>index 7122f99..fa9e121 100644
>--- a/gcc/config/rs6000/altivec.md
>+++ b/gcc/config/rs6000/altivec.md
>@@ -3349,11 +3349,26 @@
> (match_operand:V8HI 2 "register_operand" "v")]
> UNSPEC_VMSUMSHM)))]
>   "TARGET_ALTIVEC"
>   "
> {
>-  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
>operands[2], operands[3]));
>+  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
>+ operands[2], operands[3]));
>+  DONE;
>+}")
>+
>+(define_expand "sdot_prodv16qi"
>+  [(set (match_operand:V4SI 0 "register_operand" "=v")
>+(plus:V4SI (match_operand:V4SI 3 "register_operand" "v")
>+   (unspec:V4SI [(match_operand:V16QI 1
>"register_operand" "v")
>+ (match_operand:V16QI 2
>"register_operand" "v")]
>+UNSPEC_VMSUMM)))]
>+  "TARGET_ALTIVEC"
>+  "
>+{
>+  emit_insn (gen_altivec_vmsummbm (operands[0], operands[1],
>+ operands[2], operands[3]));
>   DONE;
> }")
> 
> (define_expand "widen_usum3"
>   [(set (match_operand:V4SI 0 "register_operand" "=v")
>diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>index 551d9c4..552fcdd 100644
>--- a/gcc/config/rs6000/rs6000.c
>+++ b/gcc/config/rs6000/rs6000.c
>@@ -16614,10 +16614,40 @@ rs6000_gimple_fold_builtin
>(gimple_stmt_iterator *gsi)
> case VSX_BUILTIN_CMPLE_2DI:
> case VSX_BUILTIN_CMPLE_U2DI:
>   fold_compare_helper (gsi, LE_EXPR, stmt);
>   return true;
> 
>+/* vec_msum.  */
>+case ALTIVEC_BUILTIN_VMSUMUHM:
>+case ALTIVEC_BUILTIN_VMSUMSHM:
>+case ALTIVEC_BUILTIN_VMSUMUBM:
>+case ALTIVEC_BUILTIN_VMSUMMBM:
>+  {
>+  arg0 = gimple_call_arg (stmt, 0);
>+  arg1 = gimple_call_arg (stmt, 1);
>+  tree arg2 = gimple_call_arg (stmt, 2);
>+  lhs = gimple_call_lhs (stmt);
>+  if ( TREE_TYPE (arg0) == TREE_TYPE (arg1))
>+g = gimple_build_assign (lhs, DOT_PROD_EXPR, arg0, arg1, arg2);
>+  else
>+{
>+  // For the case where we have a mix of signed/unsigned
>+  // arguments, convert both multiply args to their signed type.
>+  gimple_seq stmts = NULL;
>+  location_t loc = gimple_location (stmt);
>+  tree new_arg_type = signed_type_for (TREE_TYPE (arg0));
>+  tree signed_arg0 = gimple_convert (&stmts, loc, new_arg_type,
>arg0);
>+  tree signed_arg1 = gimple_convert (&stmts, loc, new_arg_type,
>arg1);
>+  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>+  g = gimple_build_assign (lhs, DOT_PROD_EXPR,
>+   signed_arg0, signed_arg1, arg2);
>+}
>+  gimple_set_location (g, gimple_location (stmt));
>+  gsi_replace (gsi, g, true);
>+  return true;
>+  }
>+
> default:
>   if (TARGET_DEBUG_BUILTIN)
>   fprintf (stderr, "gimple builtin intrinsic not matched:%d %s %s\n",
>fn_code, fn_name1, fn_name2);
>   break;
>@@ -18080,16 +18110,23 @@ builtin_function_type (machine_mode mode_ret,
>machine_mode mode_arg0,
> case CRYPTO_BUILTIN_VPERMXOR_V8HI:
> case CRYPTO_BUILTIN_VPERMXOR_V16QI:
> case CRYPTO_BUILTIN_VSHASIGMAW:
> case CRYPTO_BUILTIN_VSHASIGMAD:
> case CRYPTO_BUILTIN_VSHASIGMA:
>+case ALTIVEC_BUILTIN_VMSUMUHM:
>+case ALTIVEC_

Re: Patch to fix an undefined behavior in fortran/decl.c

2017-12-01 Thread Thomas Koenig

Hi Quing,

this is a very straightforward fix for an undefined behavior in 
fortran/decl.c:


> -  sprintf (name, "%s_%d", name, kind_value);
> +  sprintf (name + strlen (name), "_%d", kind_value);

OK for trunk. Thanks for the patch!
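
For context: passing the same buffer as both the destination and a %s
argument makes the source and destination overlap, which is undefined for
sprintf.  A minimal sketch of the appending form used in the fix follows.  The
helper name and the capacity argument are hypothetical additions for
illustration; the patch itself calls sprintf directly.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Append "_<kind>" to the NUL-terminated string already stored in NAME.
// Writing starts at the existing terminator, so the output never overlaps
// an input argument.  CAP is the total size of the NAME buffer.
void append_kind (char *name, std::size_t cap, int kind)
{
  std::size_t len = std::strlen (name);
  std::snprintf (name + len, cap - len, "_%d", kind);
}
```

With a 32-byte buffer holding "real", append_kind (buf, sizeof buf, 8) leaves
"real_8" in the buffer.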

Regards

Thomas


[PATCH, rs6000] gimple folding of vec_msum()

2017-12-01 Thread Will Schmidt
Hi,
Add support for folding of vec_msum in GIMPLE.

This uses the DOT_PROD_EXPR gimple op, which is sensitive to type mismatches:
error: type mismatch in dot product reduction
__vector signed int
__vector signed char
__vector unsigned char
D.2798 = DOT_PROD_EXPR ;
So for those cases with a signed/unsigned mismatch in the arguments, this
converts those arguments to their signed type.

This also adds a define_expand for sdot_prodv16qi. This is based on a similar
existing entry.

Testing coverage is handled by the existing gcc.target/powerpc/fold-vec-msum*.c 
tests.

Sniff-tests have passed on P8.  Full regtests are currently running on other
assorted power systems.
OK for trunk with successful results?

Thanks
-Will

[gcc]

2017-12-01  Will Schmidt  

* config/rs6000/altivec.md (sdot_prodv16qi): New.
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
gimple-folding of vec_msum.
(builtin_function_type): Add entries for VMSUMU[BH]M and VMSUMMBM.

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 7122f99..fa9e121 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -3349,11 +3349,26 @@
  (match_operand:V8HI 2 "register_operand" "v")]
 UNSPEC_VMSUMSHM)))]
   "TARGET_ALTIVEC"
   "
 {
-  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1], operands[2], operands[3]));
+  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
+  operands[2], operands[3]));
+  DONE;
+}")
+
+(define_expand "sdot_prodv16qi"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+(plus:V4SI (match_operand:V4SI 3 "register_operand" "v")
+   (unspec:V4SI [(match_operand:V16QI 1 "register_operand" "v")
+ (match_operand:V16QI 2 "register_operand" "v")]
+UNSPEC_VMSUMM)))]
+  "TARGET_ALTIVEC"
+  "
+{
+  emit_insn (gen_altivec_vmsummbm (operands[0], operands[1],
+  operands[2], operands[3]));
   DONE;
 }")
 
 (define_expand "widen_usum3"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 551d9c4..552fcdd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16614,10 +16614,40 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case VSX_BUILTIN_CMPLE_2DI:
 case VSX_BUILTIN_CMPLE_U2DI:
   fold_compare_helper (gsi, LE_EXPR, stmt);
   return true;
 
+/* vec_msum.  */
+case ALTIVEC_BUILTIN_VMSUMUHM:
+case ALTIVEC_BUILTIN_VMSUMSHM:
+case ALTIVEC_BUILTIN_VMSUMUBM:
+case ALTIVEC_BUILTIN_VMSUMMBM:
+  {
+   arg0 = gimple_call_arg (stmt, 0);
+   arg1 = gimple_call_arg (stmt, 1);
+   tree arg2 = gimple_call_arg (stmt, 2);
+   lhs = gimple_call_lhs (stmt);
+   if ( TREE_TYPE (arg0) == TREE_TYPE (arg1))
+ g = gimple_build_assign (lhs, DOT_PROD_EXPR, arg0, arg1, arg2);
+   else
+ {
+   // For the case where we have a mix of signed/unsigned
+   // arguments, convert both multiply args to their signed type.
+   gimple_seq stmts = NULL;
+   location_t loc = gimple_location (stmt);
+   tree new_arg_type = signed_type_for (TREE_TYPE (arg0));
+   tree signed_arg0 = gimple_convert (&stmts, loc, new_arg_type, arg0);
+   tree signed_arg1 = gimple_convert (&stmts, loc, new_arg_type, arg1);
+   gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+   g = gimple_build_assign (lhs, DOT_PROD_EXPR,
+signed_arg0, signed_arg1, arg2);
+ }
+   gimple_set_location (g, gimple_location (stmt));
+   gsi_replace (gsi, g, true);
+   return true;
+  }
+
 default:
   if (TARGET_DEBUG_BUILTIN)
fprintf (stderr, "gimple builtin intrinsic not matched:%d %s %s\n",
 fn_code, fn_name1, fn_name2);
   break;
@@ -18080,16 +18110,23 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
 case CRYPTO_BUILTIN_VPERMXOR_V8HI:
 case CRYPTO_BUILTIN_VPERMXOR_V16QI:
 case CRYPTO_BUILTIN_VSHASIGMAW:
 case CRYPTO_BUILTIN_VSHASIGMAD:
 case CRYPTO_BUILTIN_VSHASIGMA:
+case ALTIVEC_BUILTIN_VMSUMUHM:
+case ALTIVEC_BUILTIN_VMSUMUBM:
   h.uns_p[0] = 1;
   h.uns_p[1] = 1;
   h.uns_p[2] = 1;
   h.uns_p[3] = 1;
   break;
 
+/* The second parm to this vec_msum variant is unsigned.  */
+case ALTIVEC_BUILTIN_VMSUMMBM:
+  h.uns_p[2] = 1;
+  break;
+
 /* signed permute functions with unsigned char mask.  */
 case ALTIVEC_BUILTIN_VPERM_16QI:
 case ALTIVEC_BUILTIN_VPERM_8HI:
 case ALTIVEC_BUILTIN_VPERM_4SI:
 case ALTIVEC_BUILTIN_VPERM_4SF:




Re: [Patch, fortran] PRs 82605, 82606 and 82622 - PDT problems

2017-12-01 Thread Paul Richard Thomas
It turned out that I couldn't stop and so fixed a number of other
bugs. All are minor tweaks, well hidden behind the pdt attributes.

Committed as r255311.

Paul

2017-12-01  Paul Thomas  

PR fortran/82605
* resolve.c (get_pdt_constructor): Initialize 'cons' to NULL.
(resolve_pdt): Correct typo in prior comment. Emit an error if
any parameters are deferred and the object is neither pointer
nor allocatable.

PR fortran/82606
* decl.c (gfc_get_pdt_instance): Continue if the parameter sym
is not present or has no name. Select the parameter by name
of component, rather than component order. Remove all the other
manipulations of 'tail' when building the pdt instance.
(gfc_match_formal_arglist): Emit an error if a star is picked
up in a PDT decl parameter list.

PR fortran/82622
* trans-array.c (set_loop_bounds): If a GFC_SS_COMPONENT has an
info->end, use it rather than falling through to
gcc_unreachable.
(structure_alloc_comps): Check that param->name is non-null
before comparing with the component name.
* trans-decl.c (gfc_get_symbol_decl): Do not use the static
initializer for PDT symbols.
(gfc_init_default_dt): Do nothing for PDT symbols.
* trans-io.c (transfer_array_component): Parameterized array
components use the descriptor ubound since the shape is not
available.

PR fortran/82719
PR fortran/82720
* trans-expr.c (gfc_conv_component_ref): Do not use the charlen
backend_decl of pdt strings. Use the hidden component instead.
* trans-io.c (transfer_expr): Do not do IO on "hidden" string
lengths. Use the hidden string length for pdt string transfers
by adding it to the se structure. When finished nullify the
se string length.

PR fortran/82866
* decl.c (gfc_match_formal_arglist): If a name is not found or
star is found, while reading a type parameter list, emit an
immediate error.
(gfc_match_derived_decl): On reading a PDT parameter list, on
failure to match call gfc_error_recovery.

PR fortran/82978
* decl.c (build_struct): Character kind defaults to 1, so use
kind_expr whatever is the set value.
(gfc_get_pdt_instance): Ditto.
* trans-array.c (structure_alloc_comps): Copy the expression
for the PDT string length before parameter substitution. Use
this expression for evaluation and free it after use.

2017-12-01  Paul Thomas  

PR fortran/82605
* gfortran.dg/pdt_4.f03 : Incorporate the new error.

PR fortran/82606
* gfortran.dg/pdt_19.f03 : New test.
* gfortran.dg/pdt_21.f03 : New test.

PR fortran/82622
* gfortran.dg/pdt_20.f03 : New test.
* gfortran.dg/pdt_22.f03 : New test.

PR fortran/82719
PR fortran/82720
* gfortran.dg/pdt_23.f03 : New test.

PR fortran/82866
* gfortran.dg/pdt_24.f03 : New test.

PR fortran/82978
* gfortran.dg/pdt_10.f03 : Correct for error in coding the for
kind 4 component and change the kind check appropriately.
* gfortran.dg/pdt_25.f03 : New test.


On 30 November 2017 at 12:47, Paul Richard Thomas
 wrote:
> This patch fixes the above PRs and the additional problems in comment
> #1 of both 82606 and 82622.
>
> For the main part, the patch consists of 'obvious' tweaks to the PDT
> machinery. The exception to this is the chunk in
> trans-array.c(set_loop_bounds), which is needed to handle
> parameterized array components coming from trans-io.c. This is safe
> because the code would have fallen through to gcc_unreachable
> otherwise. If the info->end is present then this can be used.
>
> Bootstrapped and regtested on FC23/x86_64 - OK for trunk?
>
> I will commit tomorrow morning if there are no complaints in the meantime.
>
> Regards
>
> Paul
>
> 2017-11-30  Paul Thomas  
>
> PR fortran/82605
> * resolve.c (get_pdt_constructor): Initialize 'cons' to NULL.
> (resolve_pdt): Correct typo in prior comment. Emit an error if
> any parameters are deferred and the object is neither pointer
> nor allocatable.
>
> PR fortran/82606
> * decl.c (gfc_get_pdt_instance): Continue if the parameter sym
> is not present or has no name. Select the parameter by name
> of component, rather than component order. Remove all the other
> manipulations of 'tail' when building the pdt instance.
> (gfc_match_formal_arglist): Emit an error if a star is picked
> up in a PDT decl parameter list.
>
> PR fortran/82622
> * trans-array.c (set_loop_bounds): If a GFC_SS_COMPONENT has an
> info->end, use it rather than falling through to
> gcc_unreachable.
> (structure_alloc_comps): Check that param->name is non-null
> before comparing with the component name.
> * trans-decl.c (gfc_get_symbol_decl): Do not use the static
> initializer for PDT symbols.
> (gfc_init_default_dt): Do nothing for PDT symbols.
> * trans-io.c (transfer_array_component): Parameterized array
> components use the

Re: [patch] remove cilk-plus

2017-12-01 Thread Paolo Carlini

Hi,

On 01/12/2017 16:43, Jeff Law wrote:

On 12/01/2017 03:28 AM, Paolo Carlini wrote:

Hi,

On 16/11/2017 16:33, Koval, Julia wrote:

// I failed to send patch itself, it is too big even in gzipped form.
What is the right way to send such big patches?

Hi, this patch removes cilkplus. Ok for trunk?

Now that cilkplus is gone I suppose we should clean-up Bugzilla about
that. Shall I go ahead and essentially close all the bugs we got? As
WONTFIX or what else? Let's agree on something. In principle we could
keep the regressions for the sake of the existing release branches but I
think we got very, very, few of those and anyway I don't see who gonna
work on that...

Not sure if we have a policy in this space or not.

If we don't then my vote would be for CLOSE/WONTFIX now.  That seems to
most accurately reflect state -- we're not going to be fixing any Cilk+
stuff on the trunk or in the release branches.
Excellent, thanks Jeff. Thus, barring strong contrary opinions, I'll 
take care of that over the weekend.


Cheers,
Paolo.


Re: [PATCH] Fix -Wsystem-header warnings in libstdc++

2017-12-01 Thread Jonathan Wakely

On 01/12/17 15:11 +, Jonathan Wakely wrote:

This fixes a number of warnings that show up with -Wsystem-headers


This fixes some more.

Tested powerpc64le-linux, committed to trunk.


commit cea830828177721a6d201dd6c201c34235626641
Author: Jonathan Wakely 
Date:   Fri Dec 1 15:59:01 2017 +

Fix narrowing conversions in string_view types

* include/experimental/string_view (basic_string_view::_S_compare):
Use value-init so narrowing conversions are not ill-formed.
* include/std/string_view (basic_string_view::_S_compare): Likewise.

diff --git a/libstdc++-v3/include/experimental/string_view b/libstdc++-v3/include/experimental/string_view
index 96d1f58f8e9..ef171ecc025 100644
--- a/libstdc++-v3/include/experimental/string_view
+++ b/libstdc++-v3/include/experimental/string_view
@@ -422,11 +422,11 @@ inline namespace fundamentals_v1
   static constexpr int
   _S_compare(size_type __n1, size_type __n2) noexcept
   {
-	return difference_type{__n1 - __n2} > std::numeric_limits<int>::max()
+	return difference_type(__n1 - __n2) > std::numeric_limits<int>::max()
 	 ? std::numeric_limits<int>::max()
-	 : difference_type{__n1 - __n2} < std::numeric_limits<int>::min()
+	 : difference_type(__n1 - __n2) < std::numeric_limits<int>::min()
 	 ? std::numeric_limits<int>::min()
-	 : static_cast<int>(difference_type{__n1 - __n2});
+	 : static_cast<int>(difference_type(__n1 - __n2));
   }
 
   size_t	_M_len;
diff --git a/libstdc++-v3/include/std/string_view b/libstdc++-v3/include/std/string_view
index 1266a07d04f..3b2901ab3c6 100644
--- a/libstdc++-v3/include/std/string_view
+++ b/libstdc++-v3/include/std/string_view
@@ -408,7 +408,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static constexpr int
   _S_compare(size_type __n1, size_type __n2) noexcept
   {
-	const difference_type __diff{__n1 - __n2};
+	const difference_type __diff = __n1 - __n2;
 	if (__diff > std::numeric_limits<int>::max())
 	  return std::numeric_limits<int>::max();
 	if (__diff < std::numeric_limits<int>::min())
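
A standalone sketch of what the change avoids, using std::size_t and
std::ptrdiff_t in place of the member typedefs.  The function name is
hypothetical, and the sketch assumes the usual wraparound when the unsigned
difference is converted to a signed type.  Brace-initializing a signed
difference type from an unsigned expression is a narrowing conversion, while
an explicit conversion or copy-initialization is accepted.

```cpp
#include <cstddef>
#include <limits>

// Clamp the difference of two lengths into the range of int.
constexpr int compare_lengths (std::size_t n1, std::size_t n2) noexcept
{
  // std::ptrdiff_t diff{n1 - n2};  // narrowing: unsigned to signed in braces
  const std::ptrdiff_t diff = static_cast<std::ptrdiff_t> (n1 - n2);
  if (diff > std::numeric_limits<int>::max ())
    return std::numeric_limits<int>::max ();
  if (diff < std::numeric_limits<int>::min ())
    return std::numeric_limits<int>::min ();
  return static_cast<int> (diff);
}
```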

commit b992ef59e17964034ffb6dd094629468998ee6e9
Author: Jonathan Wakely 
Date:   Thu Nov 30 16:26:22 2017 +

Disable -Wliteral-suffix for standard UDLs

* include/bits/basic_string.h (operator""s): Add pragmas to disable
-Wliteral-suffix warnings.
* include/experimental/string_view (operator""sv): Likewise.
* include/std/chrono (operator""h, operator""min, operator""s)
(operator""ms, operator""us, operator""ns): Likewise.
* include/std/complex (operator""if, operator""i, operator""il):
Likewise.
* include/std/string_view (operator""sv): Likewise.
* testsuite/20_util/duration/literals/range.cc: Adjust dg-error.

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index a4b81137571..70373e7448a 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -6665,6 +6665,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
   inline namespace string_literals
   {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wliteral-suffix"
 _GLIBCXX_DEFAULT_ABI_TAG
 inline basic_string
 operator""s(const char* __str, size_t __len)
@@ -6689,6 +6691,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return basic_string{__str, __len}; }
 #endif
 
+#pragma GCC diagnostic pop
   } // inline namespace string_literals
   } // inline namespace literals
 
diff --git a/libstdc++-v3/include/experimental/string_view b/libstdc++-v3/include/experimental/string_view
index 8eaf9ec3d96..96d1f58f8e9 100644
--- a/libstdc++-v3/include/experimental/string_view
+++ b/libstdc++-v3/include/experimental/string_view
@@ -644,6 +644,8 @@ namespace experimental
   {
   inline namespace string_view_literals
   {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wliteral-suffix"
 inline constexpr basic_string_view
 operator""sv(const char* __str, size_t __len) noexcept
 { return basic_string_view{__str, __len}; }
@@ -663,6 +665,7 @@ namespace experimental
 operator""sv(const char32_t* __str, size_t __len) noexcept
 { return basic_string_view{__str, __len}; }
 #endif
+#pragma GCC diagnostic pop
   } // namespace string_literals
   } // namespace literals
 } // namespace experimental
diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 9491508e637..2419e82acce 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -884,6 +884,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
   inline namespace chrono_literals
   {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wliteral-suffix"
 template
   struct _Checked_integral_constant
   : integral_constant<_Rep, static_cast<_Rep>(_Val)>
@@ -958,6 +960,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   operator""ns()
   { return __check_overflow(); }
 
+#pragma GCC diagnostic p

Re: [PATCH][AArch64] Fix address printing on ILP32

2017-12-01 Thread James Greenhalgh
On Thu, Nov 30, 2017 at 05:27:47PM +, Wilco Dijkstra wrote:
> Fix address printing for ILP32.  The md file uses 'a' in assembler
> templates for symbolic addresses in adrp/add, which end up calling 
> aarch64_print_operand_address.  However in ILP32 these are not valid
> memory addresses (being ptr_mode rather than Pmode), so the assert
> triggers.  Since it is incorrect to use symbols in memory addresses
> (besides literal pool accesses), change the 'a' to 'c' in the md file.
> 
> Skip one failing test in ILP32 which combines the 'p' modifier with the 'a'
> assembler template to fake a memory reference.
> 
> This fixes the ICE in 
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02509.html.

OK.

Thanks,
James

> ChangeLog:
> 2017-11-30  Wilco Dijkstra  
> 
> gcc/
>   * config/aarch64/aarch64.md (call_insn): Use %c rather than %a.
>   (call_value_insn): Likewise.
>   (sibcall_insn): Likewise.
>   (sibcall_value_insn): Likewise.
>   (movsi_aarch64): Likewise.
>   (movdi_aarch64): Likewise.
>   (add_losym_): Likewise.
>   (ldr_got_small_): Likewise.
>   (ldr_got_small_sidi): Likewise.
>   (ldr_got_small_28k_): Likewise.
>   (ldr_got_small_28k_sidi): Likewise.
>   * config/aarch64/aarch64.c (aarch64_print_address_internal):
>   Move output_addr_const to symbolic case. Add error check.
> testsuite/
>   * gcc.dg/asm-4.c: Skip on AArch64 with ILP32 as test is incorrect.
> 


Re: [PATCH] ARM testsuite: force hardfp for addr-modes-float.c

2017-12-01 Thread Charles Baylis
On 30 November 2017 at 15:56, Kyrill  Tkachov
 wrote:

>
> So is it the case that you don't run any arm tests that include arm_neon.h
> in your configuration?

No, it is only the case that any arm test which includes arm_neon.h
(in fact, any system header) *and* uses dg-add-options
-mfloat-abi=hard fails on my configuration (And -mfloat-abi=softfp
fails in my configurations which default to hardfp). [1]

The only test which currently has -mfloat-abi=hard and #include
<arm_neon.h> is gcc.target/arm/pr51534.c, and it FAILs in my
arm-unknown-linux-gnueabi configuration.

> If so, then I would be fine with leaving this test unsupported on this
> configuration.

I don't see why, when the test can simply be fixed with
__attribute__((pcs)), but if you prefer I can respin the patch
accordingly.

> By the way, I notice that in addr-modes-float.c the arm_neon_ok check is
> placed before the dg-add-options.
> I don't remember the arcane rules exactly, but I think the effective target
> check should go before it, so that the test gets skipped properly.

OK, I can respin the patch with that change.

[1] full details as follows:

$ arm-unknown-linux-gnueabi-gcc -v
COLLECT_GCC=/home/cbaylis/tools//tools-arm-unknown-linux-gnueabi-git/bin/arm-unknown-linux-gnueabi-gcc
COLLECT_LTO_WRAPPER=/home/cbaylis/tools/tools-arm-unknown-linux-gnueabi-git/bin/../libexec/gcc/arm-unknown-linux-gnueabi/8.0.0/lto-wrapper
Target: arm-unknown-linux-gnueabi
Configured with: /home/cbaylis/srcarea/gcc/gcc-git/configure
--prefix=/home/cbaylis/tools//tools-arm-unknown-linux-gnueabi-git
--target=arm-unknown-linux-gnueabi --enable-languages=c,c++
--with-sysroot=/home/cbaylis/tools//sysroot-arm-unknown-linux-gnueabi-git
--with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16
--with-float=softfp --with-mode=thumb
Thread model: posix
gcc version 8.0.0 20171124 (experimental) (GCC)

$ cat tn.c
#include <stdio.h>

$ arm-unknown-linux-gnueabi-gcc -mfloat-abi=hard tn.c
In file included from
/home/cbaylis/tools/sysroot-arm-unknown-linux-gnueabi-git/usr/include/features.h:447,
 from
/home/cbaylis/tools/sysroot-arm-unknown-linux-gnueabi-git/usr/include/bits/libc-header-start.h:33,
 from
/home/cbaylis/tools/sysroot-arm-unknown-linux-gnueabi-git/usr/include/stdio.h:27,
 from tn.c:2:
/home/cbaylis/tools/sysroot-arm-unknown-linux-gnueabi-git/usr/include/gnu/stubs.h:10:11:
fatal error: gnu/stubs-hard.h: No such file or directory
 # include <gnu/stubs-hard.h>
   ^~
compilation terminated.


Re: [patch] remove cilk-plus

2017-12-01 Thread Jeff Law
On 12/01/2017 03:28 AM, Paolo Carlini wrote:
> Hi,
> 
> On 16/11/2017 16:33, Koval, Julia wrote:
>> // I failed to send patch itself, it is too big even in gzipped form. 
>> What is the right way to send such big patches?
>>
>> Hi, this patch removes cilkplus. Ok for trunk?
> Now that cilkplus is gone I suppose we should clean-up Bugzilla about
> that. Shall I go ahead and essentially close all the bugs we got? As
> WONTFIX or what else? Let's agree on something. In principle we could
> keep the regressions for the sake of the existing release branches but I
> think we got very, very, few of those and anyway I don't see who gonna
> work on that...
Not sure if we have a policy in this space or not.

If we don't then my vote would be for CLOSE/WONTFIX now.  That seems to
most accurately reflect state -- we're not going to be fixing any Cilk+
stuff on the trunk or in the release branches.


jeff


[Committed] S/390: Split MVC instruction for better forwarding

2017-12-01 Thread Andreas Krebbel
Certain lengths used in an MVC instruction might disable operand
forwarding.  Split MVCs into up to 2 forwardable ones if possible.
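
The length conditions in the patch below can be restated as a small C++
sketch.  It only mirrors the integer tests on the const_int operands, which
the sketch treats as byte counts; the struct and function names are
hypothetical, and none of the address or overlap checks are modelled.

```cpp
// Result of the hypothetical split decision.
struct MvcSplit { bool split; int first; int second; };

// New peephole: a copy longer than 16 and at most 32 bytes becomes a
// 16-byte copy followed by a copy of the remaining bytes.
constexpr MvcSplit split_mvc (int len)
{
  return (len > 16 && len <= 32) ? MvcSplit{true, 16, len - 16}
                                 : MvcSplit{false, len, 0};
}

// Existing merge peephole, extra guard: merge two adjacent copies only if
// both are already longer than 16 bytes or the merged copy still fits
// in 16 bytes.
constexpr bool may_merge (int len_a, int len_b)
{
  return (len_a > 16 && len_b > 16) || (len_a + len_b <= 16);
}
```

For example, split_mvc (24) yields a 16-byte copy plus an 8-byte copy, and
may_merge (8, 16) is false because the merged 24-byte copy would cross the
16-byte limit.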

Bootstrapped and regtested on s390x.

gcc/ChangeLog:

2017-12-01  Andreas Krebbel  

* config/s390/predicates.md (plus16_Q_operand): New predicate.
* config/s390/s390.md: Disable MVC merging peephole if it would
disable operand forwarding.
(new peephole2): Split MVCs if it would turn them into up to 2
forwardable MVCs.
---
 gcc/config/s390/predicates.md | 19 +++
 gcc/config/s390/s390.md   | 22 +-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index bbff8d8..e140b68 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -67,6 +67,25 @@
   return true;
 })
 
+;; Return true if the address of the mem operand plus 16 is still a
+;; valid Q constraint address.
+
+(define_predicate "plus16_Q_operand"
+  (and (match_code "mem")
+   (match_operand 0 "general_operand"))
+{
+  rtx addr = XEXP (op, 0);
+  if (REG_P (addr))
+return true;
+
+  if (GET_CODE (addr) != PLUS
+  || !REG_P (XEXP (addr, 0))
+  || !CONST_INT_P (XEXP (addr, 1)))
+return false;
+
+  return SHORT_DISP_IN_RANGE (INTVAL (XEXP (addr, 1)) + 16);
+})
+
 ;; Return true if OP is a valid operand for the BRAS instruction.
 ;; Allow SYMBOL_REFs and @PLT stubs.
 
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 1d63523..093f6f9 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -2720,7 +2720,9 @@
 [(set (match_operand:BLK 3 "memory_operand" "")
   (match_operand:BLK 4 "memory_operand" ""))
  (use (match_operand 5 "const_int_operand" ""))])]
-  "s390_offset_p (operands[0], operands[3], operands[2])
+  "((INTVAL (operands[2]) > 16 && INTVAL (operands[5]) > 16)
+|| (INTVAL (operands[2]) + INTVAL (operands[5]) <= 16))
+   && s390_offset_p (operands[0], operands[3], operands[2])
&& s390_offset_p (operands[1], operands[4], operands[2])
&& !s390_overlap_p (operands[0], operands[1],
INTVAL (operands[2]) + INTVAL (operands[5]))
@@ -2732,6 +2734,24 @@
operands[7] = gen_rtx_MEM (BLKmode, XEXP (operands[1], 0));
operands[8] = GEN_INT (INTVAL (operands[2]) + INTVAL (operands[5]));")
 
+(define_peephole2
+  [(parallel
+[(set (match_operand:BLK 0 "plus16_Q_operand" "")
+  (match_operand:BLK 1 "plus16_Q_operand" ""))
+ (use (match_operand 2 "const_int_operand" ""))])]
+  "INTVAL (operands[2]) > 16 && INTVAL (operands[2]) <= 32"
+  [(parallel
+[(set (match_dup 0) (match_dup 1))
+ (use (const_int 16))])
+   (parallel
+[(set (match_dup 3) (match_dup 4))
+ (use (match_dup 5))])]
+  "operands[3] = change_address (operands[0], VOIDmode,
+ plus_constant (Pmode, XEXP (operands[0], 0), 16));
+   operands[4] = change_address (operands[1], VOIDmode,
+ plus_constant (Pmode, XEXP (operands[1], 0), 16));
+   operands[5] = GEN_INT (INTVAL (operands[2]) - 16);")
+
 
 ;
 ; load_multiple pattern(s).
-- 
2.9.1



Re: libstdc++ PATCH to harmonize noexcept

2017-12-01 Thread Jonathan Wakely

On 14/11/17 13:56 -0500, Jason Merrill wrote:

While working on an unrelated issue I noticed that the compiler didn't
like some of these declarations after preprocessing, when they aren't
protected by system-header permissiveness.

I thought about limiting the permissiveness to only extern "C"
functions, but I believe that system headers are adding more C++
declarations, so we'd likely run into this issue again.

Shouldn't we build libstdc++ with -Wsystem-headers?  Maybe along with
-Wno-error=system-headers?

Jonathan approved these changes elsewhere.

Jason



commit abe66184d116f85f10108191b081f48dd0cfe3aa
Author: Jason Merrill 
Date:   Tue Nov 14 13:48:57 2017 -0500

   Correct noexcept mismatch in declarations.

   * include/bits/fs_ops.h (permissions): Add noexcept.
   * include/bits/fs_fwd.h (copy, copy_file): Remove noexcept.
   (permissions): Add noexcept.
   * include/std/string_view (find_first_of): Add noexcept.



There's another one needed too:

--- a/libstdc++-v3/libsupc++/eh_throw.cc
+++ b/libstdc++-v3/libsupc++/eh_throw.cc
@@ -53,8 +53,10 @@ __gxx_exception_cleanup (_Unwind_Reason_Code code, _Unwind_Exception *exc)
}

extern "C" __cxa_refcounted_exception*
-__cxxabiv1::__cxa_init_primary_exception(void *obj, std::type_info *tinfo,
- void (_GLIBCXX_CDTOR_CALLABI *dest) (void *))
+__cxxabiv1::
+__cxa_init_primary_exception(void *obj, std::type_info *tinfo,
+void (_GLIBCXX_CDTOR_CALLABI *dest) (void *))
+_GLIBCXX_NOTHROW
{
  __cxa_refcounted_exception *header
= __get_refcounted_exception_header_from_obj (obj);



[PATCH] Fix -Wsystem-header warnings in libstdc++

2017-12-01 Thread Jonathan Wakely

This fixes a number of warnings that show up with -Wsystem-headers

Tested powerpc64le-linux, committed to trunk.


commit cc833c247c3b334c56feff8898bd02c8f9f3fc6a
Author: Jonathan Wakely 
Date:   Fri Dec 1 14:13:47 2017 +

Add comment to fix -Wfallthrough warning

* include/bits/locale_facets_nonio.tcc (money_get::_M_extract): Add
fallthrough comment.

diff --git a/libstdc++-v3/include/bits/locale_facets_nonio.tcc b/libstdc++-v3/include/bits/locale_facets_nonio.tcc
index a449c41e6b8..135dd0b9d8f 100644
--- a/libstdc++-v3/include/bits/locale_facets_nonio.tcc
+++ b/libstdc++-v3/include/bits/locale_facets_nonio.tcc
@@ -282,6 +282,7 @@ _GLIBCXX_BEGIN_NAMESPACE_LDBL_OR_CXX11
 		  ++__beg;
 		else
 		  __testvalid = false;
+		// fallthrough
 	  case money_base::none:
 		// Only if not at the end of the pattern.
 		if (__i != 3)
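
A reduced illustration of the pattern, unrelated to money_get itself (the
function name is hypothetical).  GCC's fallthrough warning accepts a
recognised comment placed immediately before the next case label as a marker
that falling through is intentional.

```cpp
// Count how many stages run when starting at STAGE.
int count_steps (int stage)
{
  int steps = 0;
  switch (stage)
    {
    case 0:
      ++steps;
      // fallthrough
    case 1:
      ++steps;
      break;
    default:
      break;
    }
  return steps;
}
```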

commit 98e449432c7ceddb157ccc5e94e6c2886c5d33e1
Author: Jonathan Wakely 
Date:   Thu Nov 30 16:57:20 2017 +

Fix -Wempty-body warnings for debug assertions

* include/bits/node_handle.h (_Node_handle_common::operator=)
(_Node_handle_common::_M_swap): Add braces around debug assertions.

diff --git a/libstdc++-v3/include/bits/node_handle.h b/libstdc++-v3/include/bits/node_handle.h
index 7f109ada6f1..8a1e465893e 100644
--- a/libstdc++-v3/include/bits/node_handle.h
+++ b/libstdc++-v3/include/bits/node_handle.h
@@ -87,10 +87,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		|| !this->_M_alloc)
 	  this->_M_alloc = std::move(__nh._M_alloc);
 	else
-	  __glibcxx_assert(this->_M_alloc == __nh._M_alloc);
+	  {
+		__glibcxx_assert(this->_M_alloc == __nh._M_alloc);
+	  }
 	  }
 	else
-	  __glibcxx_assert(_M_alloc);
+	  {
+	__glibcxx_assert(_M_alloc);
+	  }
 	__nh._M_ptr = nullptr;
 	__nh._M_alloc = nullopt;
 	return *this;
@@ -109,7 +113,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	|| !_M_alloc || !__nh._M_alloc)
 	  _M_alloc.swap(__nh._M_alloc);
 	else
-	  __glibcxx_assert(_M_alloc == __nh._M_alloc);
+	  {
+	__glibcxx_assert(_M_alloc == __nh._M_alloc);
+	  }
   }
 
 private:
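
For readers wondering why the braces matter: a minimal sketch, assuming a debug assertion macro that expands to nothing when checks are disabled. SKETCH_ASSERT, SKETCH_ENABLE_ASSERTIONS and pick() are made-up names for illustration, not the libstdc++ definitions. Without the braces the disabled macro leaves a bare "else ;", which is what -Wempty-body reports.

```cpp
#include <cassert>

// Hypothetical stand-in for a debug assertion macro: it expands to nothing
// unless SKETCH_ENABLE_ASSERTIONS is defined (an assumption for this sketch).
#ifdef SKETCH_ENABLE_ASSERTIONS
# define SKETCH_ASSERT(cond) assert(cond)
#else
# define SKETCH_ASSERT(cond)
#endif

// Returns LEFT when TAKE_LEFT is true, otherwise RIGHT.
int pick(bool take_left, int left, int right)
{
  int result = right;
  if (take_left)
    result = left;
  else
    {
      // With the braces the else-body is a compound statement even when the
      // macro is empty, so there is no "else ;" for -Wempty-body to flag.
      SKETCH_ASSERT(right >= 0);
    }
  return result;
}
```

The behaviour is the same with or without the braces; only the diagnostic changes.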

commit 93ebef15310e0ebed92041ebbc6c860c1b06e2a6
Author: Jonathan Wakely 
Date:   Thu Nov 30 20:14:31 2017 +

Use const char* to fix -Wwrite-strings warning

* include/ext/ropeimpl.h (rope::_S_dump): Use const char*.

diff --git a/libstdc++-v3/include/ext/ropeimpl.h b/libstdc++-v3/include/ext/ropeimpl.h
index 9e88ce14c18..4842034c1e8 100644
--- a/libstdc++-v3/include/ext/ropeimpl.h
+++ b/libstdc++-v3/include/ext/ropeimpl.h
@@ -1139,7 +1139,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	}
   else
 	{
-	  char* __kind;
+	  const char* __kind;
 	  
 	  switch (__r->_M_tag)
 	{

commit 3a05b2c46c829bc7c3698d3d432372c9b70881c7
Author: Jonathan Wakely 
Date:   Thu Nov 30 20:18:47 2017 +

Add [[noreturn]] attributes to fix warning

* libsupc++/nested_exception.h (__throw_with_nested_impl): Add
noreturn attribute.

diff --git a/libstdc++-v3/libsupc++/nested_exception.h b/libstdc++-v3/libsupc++/nested_exception.h
index 43970b4ef86..27bccfce35f 100644
--- a/libstdc++-v3/libsupc++/nested_exception.h
+++ b/libstdc++-v3/libsupc++/nested_exception.h
@@ -92,6 +92,7 @@ namespace std
   // Throw an exception of unspecified type that is publicly derived from
   // both remove_reference_t<_Tp> and nested_exception.
   template
+[[noreturn]]
 inline void
 __throw_with_nested_impl(_Tp&& __t, true_type)
 {
@@ -100,6 +101,7 @@ namespace std
 }
 
   template
+[[noreturn]]
 inline void
 __throw_with_nested_impl(_Tp&& __t, false_type)
 { throw std::forward<_Tp>(__t); }

commit a82e6e608b99d5be038869b3745d1f49ddfc022b
Author: Jonathan Wakely 
Date:   Fri Dec 1 13:59:59 2017 +

Remove stray semi-colons at namespace scope

* include/bits/regex_executor.tcc (_Executor::_M_rep_once_more):
Remove semi-colon after function body.
* include/bits/uniform_int_dist.h (_Power_of_2): Likewise.

diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 2ceba35e7b8..008ffa0e836 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -170,7 +170,7 @@ namespace __detail
   // visited more than twice. It's `twice` instead of `once` because
   // we need to spare one more time for potential group capture.
   template
+	   bool __dfs_mode>
 void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
 _M_rep_once_more(_Match_mode __match_mode, _StateIdT __i)
 {
@@ -193,7 +193,7 @@ namespace __detail
 	  __rep_count.second--;
 	}
 	}
-};
+}
 
   // _M_alt branch is "match once more", while _M_next is "get me out
   // of this quantifier". Executing _M_next first or _M_alt first don't
diff --git a/libstdc++-v3/include/bits/uniform_int_dist.h b/libstdc++-v3/include/bits/uniform_int_dist.h
index 16509

Re: [PATCH] Fix overflow in __gnu_cxx::__numeric_traits::__min

2017-12-01 Thread Jonathan Wakely

On 01/12/17 14:34 +, Jonathan Wakely wrote:

On 01/12/17 15:22 +0100, Paolo Carlini wrote:

Hi,

On 01/12/2017 15:11, Jonathan Wakely wrote:

On 01/12/17 14:02 +, Jonathan Wakely wrote:

Is there a reason we left-shift into the sign bit, causing undefined
behaviour? The approach used in std::numeric_limits seems better.


The current code warns with -Wpedantic -Wsystem-headers:

/usr/include/c++/7/ext/numeric_traits.h:58:35: warning: overflow 
in implicit constant conversion [-Woverflow]

  static const _Value __min = __glibcxx_min(_Value);
  ^
Many details can be found in c++/52119. Which should probably be 
updated, right?


At the time I handled the libstdc++ side of the issue, and for some 
reason forgot to fix at the same time the ext/numeric_traits bits. I 
have no idea why, maybe to be conservative, in a way.


Huh, so is the warning wrong? Is it only undefined in C++98?


Oh I see, we actually get a different warning on trunk:

/home/jwakely/gcc/8/include/c++/8.0.0/ext/numeric_traits.h:58:55: warning: 
overflow in conversion from ‘int’ to ‘short int’ changes value from ‘32768’ to 
‘-32768’ [-Woverflow]
  static const _Value __min = __glibcxx_min(_Value);
  ^

So we just need a cast instead.




[PATCH] Add noexcept to std::integral_constant members

2017-12-01 Thread Jonathan Wakely

C++14 added noexcept to the integral_constant member functions, and it
should always have been on the integer_sequence one.

* include/std/type_traits (integral_constant): Make member functions
noexcept (LWG 2346).
* include/std/utility (integer_sequence): Likewise.

Tested powerpc64le-linux, committed to trunk.
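
A small sketch of how the effect could be observed from user code, assuming a C++14 (or later) compiler and a library that has this change. The three helper function names are made up for illustration.

```cpp
#include <type_traits>
#include <utility>

// True when integral_constant's conversion operator is declared noexcept.
constexpr bool conversion_is_noexcept()
{
  return noexcept(static_cast<int>(std::integral_constant<int, 3>{}));
}

// True when integral_constant's call operator is declared noexcept.
constexpr bool call_is_noexcept()
{
  return noexcept(std::integral_constant<int, 3>{}());
}

// True when integer_sequence::size() is declared noexcept.
constexpr bool size_is_noexcept()
{
  return noexcept(std::integer_sequence<int, 0, 1, 2>::size());
}
```

With a library that lacks the noexcept specifiers these helpers would be expected to return false instead.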


commit 63ab060f8e958744dd06f09d7c639f45e66d09c0
Author: Jonathan Wakely 
Date:   Fri Dec 1 13:52:08 2017 +

Add noexcept to std::integral_constant members

* include/std/type_traits (integral_constant): Make member functions
noexcept (LWG 2346).
* include/std/utility (integer_sequence): Likewise.

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 723c137f5b9..1d639e452f3 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -71,12 +71,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static constexpr _Tp  value = __v;
   typedef _Tp   value_type;
   typedef integral_constant<_Tp, __v>   type;
-  constexpr operator value_type() const { return value; }
+  constexpr operator value_type() const noexcept { return value; }
 #if __cplusplus > 201103L
 
 #define __cpp_lib_integral_constant_callable 201304
 
-  constexpr value_type operator()() const { return value; }
+  constexpr value_type operator()() const noexcept { return value; }
 #endif
 };
 
diff --git a/libstdc++-v3/include/std/utility b/libstdc++-v3/include/std/utility
index e7386320e2a..da17928feee 100644
--- a/libstdc++-v3/include/std/utility
+++ b/libstdc++-v3/include/std/utility
@@ -321,7 +321,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct integer_sequence
 {
   typedef _Tp value_type;
-  static constexpr size_t size() { return sizeof...(_Idx); }
+  static constexpr size_t size() noexcept { return sizeof...(_Idx); }
 };
 
   /// Alias template make_integer_sequence


Patch to fix an undefined behavior in fortran/decl.c

2017-12-01 Thread Qing Zhao
Hi,

This is a very straightforward fix for undefined behavior in fortran/decl.c:


The sprintf man page clearly states:

===
NOTES
  Some programs imprudently rely on code such as the following

  sprintf(buf, "%s some further text", buf);

  to append text to buf.  However, the standards explicitly note that the
  results are undefined if source and destination buffers overlap when
  calling sprintf(), snprintf(), vsprintf(), and vsnprintf().  Depending on
  the version of gcc(1) used, and the compiler options employed, calls such
  as the above will not produce the expected results.
===

In gcc/fortran/decl.c, there is exactly such a case:

3361   sprintf (name, "%s_%d", name, kind_value);

This patch fixes it.
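
To illustrate the idea outside of gfortran: a minimal sketch that appends at the end of the existing string instead of passing the buffer as both source and destination. append_kind_suffix and its buf_size parameter are hypothetical; the patch itself keeps plain sprintf, while the sketch uses snprintf so it can also report truncation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Appends "_<kind_value>" to the NUL-terminated string in BUF, which has room
// for BUF_SIZE bytes in total.  The output starts at the terminating NUL, so
// no argument overlaps the destination.  Returns false if the suffix did not
// fit completely.
bool append_kind_suffix(char *buf, std::size_t buf_size, int kind_value)
{
  assert(buf != nullptr);
  std::size_t len = std::strlen(buf);
  if (len >= buf_size)
    return false;
  int written = std::snprintf(buf + len, buf_size - len, "_%d", kind_value);
  return written >= 0 && static_cast<std::size_t>(written) < buf_size - len;
}
```

For example, a 32-byte buffer holding "pdt" ends up holding "pdt_8" after append_kind_suffix (buf, 32, 8).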

Bootstrapped and tested on both x86 and AArch64; no regressions.

Okay for trunk?

Thanks.

Qing


gcc/fortran/ChangeLog

2017-11-30  Qing Zhao  <qing.z...@oracle.com>

   * fortran/decl.c (gfc_get_pdt_instance): Adjust the call to
   sprintf to avoid the same buffer being both source and
   destination.

---
gcc/fortran/decl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index e57cfde..02dda24 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -3358,7 +3358,7 @@ gfc_get_pdt_instance (gfc_actual_arglist *param_list, gfc_symbol **sym,
   }

  gfc_extract_int (kind_expr, &kind_value);
-  sprintf (name, "%s_%d", name, kind_value);
+  sprintf (name + strlen (name), "_%d", kind_value);

  if (!name_seen && actual_param)
   actual_param = actual_param->next;
-- 
1.9.1

Re: [PATCH][gimple-linterchange] Rewrite compute_access_stride

2017-12-01 Thread Bin.Cheng
On Fri, Dec 1, 2017 at 2:26 PM, Richard Biener  wrote:
> On Fri, 1 Dec 2017, Bin.Cheng wrote:
>
>> On Fri, Dec 1, 2017 at 12:31 PM, Richard Biener  wrote:
>> >
>> > This is the access stride computation change.  Apart from the
>> > stride extraction I adjusted the cost model to handle non-constant
>> > strides by checking if either is a multiple of the other and
>> > simply fail interchanging if it's the wrong way around for one
>> > ref or if the simple method using multiple_of_p fails to determine
>> > either case.
>> >
>> > This still handles the bwaves case.
>> >
>> > I think we want additional testcases with variable strides for each
>> > case we add - I believe this is the most conservative way to treat
>> > variable strides.
>> >
>> > It may be inconsistent with the constant stride handling where you
>> > allow for many OK DRs to outweigh a few not OK DRs, but as it
>> > worked for bwaves it must be good enough ;)
>> >
>> > Tested on x86_64-unknown-linux-gnu (just the interchange testcases).
>> >
>> > Currently running a bootstrap with -O3 -g -floop-interchange.
>> >
>> > Ok for the branch?
>> Ok.  This is actually closer to the motivation: a simple/conservative cost
>> model that only transforms code when it's known to be good.
>> I will check the impact on the number of interchanges in SPEC.
>
> Few high-level observations.
>
> In tree_loop_interchange::interchange we try interchanging adjacent
> loops, starting from innermost with outer of innermost.  This way
> the innermost loop will bubble up as much as possible.  But we
> don't seem to handle bubbling multiple loops like for
>
>  for (i=0; i   for (j=0; j for (k=0; k   a[j][k][i] = 1.;
>
> because the innermost two loops are correctly ordered so we then
> try interchanging the k and the i loop which succeeds but then
> we stop.  So there's something wrong in the iteration scheme.
> I would have expected it to be quadratic, basically iterating
> the ::interchange loop until we didn't perform any interchange
> (or doing sth more clever).
Yes, I restricted it to a single-pass process over the loop nest.
Ideally we could create a vector of loop_cand for the whole loop nest,
then sort/permute the loops w.r.t. the computed strides.
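
A toy sketch of the "iterate until nothing changes" scheme discussed above, under loud assumptions: each loop level is modelled by a single stride value, a larger stride should end up further out, and every adjacent interchange is assumed to be legal. interchange_until_stable and the vector-of-strides model are hypothetical and are not the pass's data structures.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// strides[0] models the outermost loop and strides.back() the innermost.
// Sweeps from the innermost pair outwards and repeats until a full sweep
// performs no interchange.  Returns the number of interchanges performed.
std::size_t interchange_until_stable(std::vector<long> &strides)
{
  std::size_t swaps = 0;
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (std::size_t inner = strides.size(); inner-- > 1; )
        if (strides[inner] > strides[inner - 1])
          {
            // The inner loop has the larger stride, so move it outwards.
            std::swap(strides[inner], strides[inner - 1]);
            changed = true;
            ++swaps;
          }
    }
  return swaps;
}
```

In this model a nest with strides {1, 10000, 100} needs two sweeps that interchange something: a single sweep stops at {10000, 1, 100}, while iterating to a fixed point reaches {10000, 100, 1}. The worst case is quadratic in the nest depth, as noted in the quoted text.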

>
> loop_cand::can_interchange_p seems to perform per BB checks
> (supported_operations, num_stmts) that with the way we interchange
> should disallow any such BB in a loop that we interchange or
> interchange across.  That means it looks like sth we should
> perform early, like during data dependence gathering by for
> example inlining find_data_references_in_bb and doing those
> per-stmt checks there?
Yes.  The only problem is the check on the reduction.  We can build up
all loop_cand earlier, or simply move non-reduction checks earlier.

>
> In prepare_perfect_loop_nest we seem to be somewhat quadratic
> in the way we re-compute dependences if doing so failed
> (we also always just strip the outermost loop while the failing
> DDR could involve a DR that is in an inner loop).  I think it
> should be possible to re-structure this computing dependences
> from inner loop body to outer loop bodies (the ddrs vector
> is, as opposed to the dr vector, unsorted I think).
Even more, we would want an interface in tree-data-ref.c so that data
dependences can be computed/assembled level by level in a loop nest.
Loop distribution would benefit as well.

> I haven't fully thought this out yet though - a similar
> iteration scheme could improve DR gathering though that's not
> so costly.
I can change this one now.

>
> Overall we should try improving on function names, we have
> valid_data_dependences, can_interchange_loops, should_interchange_loops,
> can_interchange_p which all are related but do slightly different
> things.  My usual approach is to inline all single-use functions
> to improve things (and make program flow more visible).  But
> I guess that's too much implementation detail.
>
> Didn't get to the IV re-mapping stuff yet but you (of course)
> end up with quite some dead IVs when iterating the interchange.
> You seem to add a new canonical IV just to avoid rewriting the
> existing exit test, right?  Defering that to the "final"
> interchange on a nest should avoid those dead IVs.
Hmm, with the help of the newly created DCE interface, all dead IV/RE will
be deleted after this pass.  Note that in the current implementation, dead
code can be generated from the mapped IV, the canonical IV and the reduction.
Deferring adding the canonical IV does not look practical w.r.t. the current
level-by-level interchange, because the inner loop's IV is needed for
interchange?

>
> Will now leave for the weekend.
Have a nice WE!

Thanks,
bin
>
> Thanks,
> Richard.


Re: Update tune flags for generic

2017-12-01 Thread Jakub Jelinek
On Fri, Dec 01, 2017 at 03:41:49PM +0100, Jan Hubicka wrote:
> @@ -99,25 +97,25 @@ DEF_TUNE (X86_TUNE_MEMORY_MISMATCH_STALL
> conditional jump instruction for 32 bit TARGET.
> FIXME: revisit for generic.  */

Remove the last line of the comment above?

>  DEF_TUNE (X86_TUNE_FUSE_CMP_AND_BRANCH_32, "fuse_cmp_and_branch_32",
> -   m_CORE_ALL | m_BDVER | m_ZNVER1)
> +   m_CORE_ALL | m_BDVER | m_ZNVER1 | m_GENERIC)

Jakub


Update tune flags for generic

2017-12-01 Thread Jan Hubicka
Hi,
this patch updates the tuning flags for generic. It drops flags used by old
chips (X86_TUNE_PARTIAL_FLAG_REG_STALL, which is needed only for the original
core2, and X86_TUNE_PAD_RETURNS, which is needed for pre-Bulldozer chips)
and enables the fusion logic because it seems wasteful to let the scheduler
prevent it.

I will update the scheduler model incrementally.

Bootstrapped/regtested x86_64-linux, plan to commit it today.

Honza

* x86-tune.def (X86_TUNE_PARTIAL_FLAG_REG_STALL): Disable for
generic
(X86_TUNE_FUSE_CMP_AND_BRANCH_32, X86_TUNE_FUSE_CMP_AND_BRANCH_64,
X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS, X86_TUNE_FUSE_ALU_AND_BRANCH):
Enable for generic.
(X86_TUNE_PAD_RETURNS): Disable for generic.
Index: i386/x86-tune.def
===
--- i386/x86-tune.def   (revision 255304)
+++ i386/x86-tune.def   (working copy)
@@ -75,12 +75,10 @@ DEF_TUNE (X86_TUNE_SSE_SPLIT_REGS, "sse_
setting full flags.
 
The flags does not affect generation of INC and DEC that is controlled
-   by X86_TUNE_USE_INCDEC.
+   by X86_TUNE_USE_INCDEC.  */
 
-   This flag may be dropped from generic once core2-corei5 machines are
-   rare enough.  */
 DEF_TUNE (X86_TUNE_PARTIAL_FLAG_REG_STALL, "partial_flag_reg_stall",
-  m_CORE2 | m_GENERIC)
+  m_CORE2)
 
 /* X86_TUNE_MOVX: Enable to zero extend integer registers to avoid
partial dependencies.  */
@@ -99,25 +97,25 @@ DEF_TUNE (X86_TUNE_MEMORY_MISMATCH_STALL
conditional jump instruction for 32 bit TARGET.
FIXME: revisit for generic.  */
 DEF_TUNE (X86_TUNE_FUSE_CMP_AND_BRANCH_32, "fuse_cmp_and_branch_32",
- m_CORE_ALL | m_BDVER | m_ZNVER1)
+ m_CORE_ALL | m_BDVER | m_ZNVER1 | m_GENERIC)
 
 /* X86_TUNE_FUSE_CMP_AND_BRANCH_64: Fuse compare with a subsequent
conditional jump instruction for TARGET_64BIT.
FIXME: revisit for generic.  */
 DEF_TUNE (X86_TUNE_FUSE_CMP_AND_BRANCH_64, "fuse_cmp_and_branch_64",
- m_NEHALEM | m_SANDYBRIDGE | m_HASWELL | m_BDVER | m_ZNVER1)
+ m_NEHALEM | m_SANDYBRIDGE | m_HASWELL | m_BDVER | m_ZNVER1 | m_GENERIC)
 
 /* X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS: Fuse compare with a
subsequent conditional jump instruction when the condition jump
check sign flag (SF) or overflow flag (OF).  */
 DEF_TUNE (X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS, "fuse_cmp_and_branch_soflags",
- m_NEHALEM | m_SANDYBRIDGE | m_HASWELL | m_BDVER | m_ZNVER1)
+ m_NEHALEM | m_SANDYBRIDGE | m_HASWELL | m_BDVER | m_ZNVER1 | m_GENERIC)
 
 /* X86_TUNE_FUSE_ALU_AND_BRANCH: Fuse alu with a subsequent conditional
jump instruction when the alu instruction produces the CCFLAG consumed by
the conditional jump instruction. */
 DEF_TUNE (X86_TUNE_FUSE_ALU_AND_BRANCH, "fuse_alu_and_branch",
-  m_SANDYBRIDGE | m_HASWELL)
+  m_SANDYBRIDGE | m_HASWELL | m_GENERIC)
 
 
 /*/
@@ -194,7 +192,7 @@ DEF_TUNE (X86_TUNE_PAD_SHORT_FUNCTION, "
architecture expect at most one jump per 2 byte window.  Failing to
pad returns leads to misaligned return stack.  */
 DEF_TUNE (X86_TUNE_PAD_RETURNS, "pad_returns",
-  m_ATHLON_K8 | m_AMDFAM10 | m_GENERIC)
+  m_ATHLON_K8 | m_AMDFAM10)
 
 /* X86_TUNE_FOUR_JUMP_LIMIT: Some CPU cores are not able to predict more
than 4 branch instructions in the 16 byte window.  */


[patch] Fix bug in an OpenACC async test case

2017-12-01 Thread Cesar Philippidis
This patch fixes a race condition bug in
libgomp.oacc-c-c++-common/data-2-lib.c. That is an OpenACC test which
exercises the runtime wait API, for use in conjunction with asynchronous
OpenACC offloaded regions. I'm not sure why this problem went undetected
for so long. Either the parallel region runs so fast on the GPU that
the copied-out data is correct, or Nvidia's CUDA runtime blocks all
device->host data transfers until the GPU is no longer processing the
data. I suspect it's the former.

I've applied this patch to trunk and og7 as obvious.

Cesar
2017-12-01  Cesar Philippidis  

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Add missing
	call to acc_wait (1).


diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c
index 1694f582363..f553d3d839c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c
@@ -64,6 +64,8 @@ main (int argc, char **argv)
   for (i = 0; i < N; i++)
 b[i] = a[i];
 
+  acc_wait (1);
+
   acc_memcpy_from_device (a, d_a, nbytes);
   acc_memcpy_from_device (b, d_b, nbytes);
   


Re: [PATCH] Fix overflow in __gnu_cxx::__numeric_traits::__min

2017-12-01 Thread Jonathan Wakely

On 01/12/17 15:22 +0100, Paolo Carlini wrote:

Hi,

On 01/12/2017 15:11, Jonathan Wakely wrote:

On 01/12/17 14:02 +, Jonathan Wakely wrote:

Is there a reason we left-shift into the sign bit, causing undefined
behaviour? The approach used in std::numeric_limits seems better.


The current code warns with -Wpedantic -Wsystem-headers:

/usr/include/c++/7/ext/numeric_traits.h:58:35: warning: overflow in 
implicit constant conversion [-Woverflow]

  static const _Value __min = __glibcxx_min(_Value);
  ^
Many details can be found in c++/52119. Which should probably be 
updated, right?


At the time I handled the libstdc++ side of the issue, and for some 
reason forgot to fix at the same time the ext/numeric_traits bits. I 
have no idea why, maybe to be conservative, in a way.


Huh, so is the warning wrong? Is it only undefined in C++98?




Re: [PATCH][gimple-linterchange] Rewrite compute_access_stride

2017-12-01 Thread Richard Biener
On Fri, 1 Dec 2017, Bin.Cheng wrote:

> On Fri, Dec 1, 2017 at 12:31 PM, Richard Biener  wrote:
> >
> > This is the access stride computation change.  Apart from the
> > stride extraction I adjusted the cost model to handle non-constant
> > strides by checking if either is a multiple of the other and
> > simply fail interchanging if it's the wrong way around for one
> > ref or if the simple method using multiple_of_p fails to determine
> > either case.
> >
> > This still handles the bwaves case.
> >
> > I think we want additional testcases with variable strides for each
> > case we add - I believe this is the most conservative way to treat
> > variable strides.
> >
> > It may be inconsistent with the constant stride handling where you
> > allow for many OK DRs to outweigh a few not OK DRs, but as it
> > worked for bwaves it must be good enough ;)
> >
> > Tested on x86_64-unknown-linux-gnu (just the interchange testcases).
> >
> > Currently running a bootstrap with -O3 -g -floop-interchange.
> >
> > Ok for the branch?
> Ok.  This is actually closer to the motivation: a simple/conservative cost
> model that only transforms code when it's known to be good.
> I will check the impact on the number of interchanges in SPEC.

Few high-level observations.

In tree_loop_interchange::interchange we try interchanging adjacent
loops, starting from innermost with outer of innermost.  This way
the innermost loop will bubble up as much as possible.  But we
don't seem to handle bubbling multiple loops like for

 for (i=0; i

Re: [PATCH] Fix overflow in __gnu_cxx::__numeric_traits::__min

2017-12-01 Thread Paolo Carlini

Hi,

On 01/12/2017 15:11, Jonathan Wakely wrote:

On 01/12/17 14:02 +, Jonathan Wakely wrote:

Is there a reason we left-shift into the sign bit, causing undefined
behaviour? The approach used in std::numeric_limits seems better.


The current code warns with -Wpedantic -Wsystem-headers:

/usr/include/c++/7/ext/numeric_traits.h:58:35: warning: overflow in 
implicit constant conversion [-Woverflow]

  static const _Value __min = __glibcxx_min(_Value);
  ^
Many details can be found in c++/52119. Which should probably be 
updated, right?


At the time I handled the libstdc++ side of the issue, and for some 
reason forgot to fix at the same time the ext/numeric_traits bits. I 
have no idea why, maybe to be conservative, in a way.


Paolo.


Re: [PATCH] Fix overflow in __gnu_cxx::__numeric_traits::__min

2017-12-01 Thread Jonathan Wakely

On 01/12/17 14:02 +, Jonathan Wakely wrote:

Is there a reason we left-shift into the sign bit, causing undefined
behaviour? The approach used in std::numeric_limits seems better.


The current code warns with -Wpedantic -Wsystem-headers:

/usr/include/c++/7/ext/numeric_traits.h:58:35: warning: overflow in implicit 
constant conversion [-Woverflow]
  static const _Value __min = __glibcxx_min(_Value);
  ^



[PATCH] Fix overflow in __gnu_cxx::__numeric_traits::__min

2017-12-01 Thread Jonathan Wakely

Is there a reason we left-shift into the sign bit, causing undefined
behaviour? The approach used in std::numeric_limits seems better.
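
To make the concern concrete: a minimal sketch of computing the minimum from the maximum, in the spirit of the std::numeric_limits approach, so that a 1 is never shifted into the sign bit. The sketch_* templates are hypothetical helpers written for this note, not the macros in ext/numeric_traits.h, and they assume a two's complement integer type.

```cpp
#include <climits>

// True for signed integer types.
template<typename T>
constexpr bool sketch_is_signed() { return T(-1) < T(0); }

// Number of value bits (excludes the sign bit for signed types).
template<typename T>
constexpr int sketch_digits()
{ return static_cast<int>(sizeof(T) * CHAR_BIT) - (sketch_is_signed<T>() ? 1 : 0); }

// Largest value: the shift stops one bit short of the sign bit, then the
// remaining low bit is filled in by the "<< 1) + 1" step.
template<typename T>
constexpr T sketch_max()
{
  return sketch_is_signed<T>()
    ? static_cast<T>((((T(1) << (sketch_digits<T>() - 1)) - 1) << 1) + 1)
    : static_cast<T>(~T(0));
}

// Smallest value: derived from the maximum, no shift into the sign bit.
template<typename T>
constexpr T sketch_min()
{
  return sketch_is_signed<T>() ? static_cast<T>(-sketch_max<T>() - 1) : T(0);
}
```

For int, every intermediate value stays within [INT_MIN, INT_MAX], which is the point of writing the minimum as -max - 1.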


commit e18203057f8e46a3b35239977d8d703df47cdc28
Author: Jonathan Wakely 
Date:   Thu Nov 30 18:15:49 2017 +

Fix overflow

diff --git a/libstdc++-v3/include/ext/numeric_traits.h 
b/libstdc++-v3/include/ext/numeric_traits.h
index 3138eaac716..dd13ef6075c 100644
--- a/libstdc++-v3/include/ext/numeric_traits.h
+++ b/libstdc++-v3/include/ext/numeric_traits.h
@@ -41,18 +41,19 @@ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // Compile time constants for builtin types.
-  // Sadly std::numeric_limits member functions cannot be used for this.
+  // Sadly std::numeric_limits member functions cannot be used for this
+  // before C++11 made them constexpr.
 #define __glibcxx_signed(_Tp) ((_Tp)(-1) < 0)
 #define __glibcxx_digits(_Tp) \
   (sizeof(_Tp) * __CHAR_BIT__ - __glibcxx_signed(_Tp))
 
-#define __glibcxx_min(_Tp) \
-  (__glibcxx_signed(_Tp) ? (_Tp)1 << __glibcxx_digits(_Tp) : (_Tp)0)
-
 #define __glibcxx_max(_Tp) \
   (__glibcxx_signed(_Tp) ? \
(_Tp)1 << (__glibcxx_digits(_Tp) - 1)) - 1) << 1) + 1) : ~(_Tp)0)
 
+#define __glibcxx_min(_Tp) \
+  (__glibcxx_signed(_Tp) ? -__glibcxx_max(_Tp) - 1 : (_Tp)0)
+
   template
 struct __numeric_traits_integer
 {


[PATCH] Re: loading of zeros into {x,y,z}mm registers

2017-12-01 Thread Jakub Jelinek
On Fri, Dec 01, 2017 at 01:18:43PM +0100, Jakub Jelinek wrote:
> > Furthermore this
> > 
> > typedef double __attribute__((vector_size(16))) v2df_t;
> > typedef double __attribute__((vector_size(32))) v4df_t;
> > 
> > void test1(void) {
> > register v2df_t x asm("xmm31") = {};
> > asm volatile("" :: "v" (x));
> > }
> > 
> > void test2(void) {
> > register v4df_t x asm("ymm31") = {};
> > asm volatile("" :: "v" (x));
> > }
> > 
> > translates to "vxorpd %xmm31, %xmm31, %xmm31" for both
> > functions with -mavx512vl, yet afaict the instructions would #UD
> > without AVX-512DQ, which suggests to me that the original
> > intention wasn't fully met.
> 
> This indeed is a bug, please file a PR; we should IMHO just use
> vpxorq instead in that case, which is just AVX512VL and doesn't need
> DQ.  Of course if DQ is available, we should use vxorpd.
> Working on a fix.
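
A rough sketch of the selection rule described in the quoted text, for the double-precision case only. zeroing_mnemonic_for_df and its two boolean parameters are hypothetical stand-ins for the register class and ISA checks; the sketch ignores operand widths and is not the code in i386.c.

```cpp
#include <string>

// EXT_REX_REG: the destination is one of xmm16-31/ymm16-31/zmm16-31, which
// can only be encoded with EVEX.  AVX512DQ: that ISA extension is enabled.
// Returns just the mnemonic used to zero the register.
std::string zeroing_mnemonic_for_df(bool ext_rex_reg, bool avx512dq)
{
  if (!ext_rex_reg)
    return "vxorpd";   // low registers: the VEX encoded form is available
  if (avx512dq)
    return "vxorpd";   // the EVEX encoded vxorpd requires AVX512DQ
  return "vpxorq";     // integer xor: AVX512F, plus VL for narrower widths
}
```

So the test1/test2 functions quoted above, built with -mavx512vl but without AVX512DQ, would be expected to get vpxorq rather than vxorpd.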

Will try this:

2017-12-01  Jakub Jelinek  

* config/i386/i386-protos.h (standard_sse_constant_opcode): Change
last argument to rtx pointer.
* config/i386/i386.c (standard_sse_constant_opcode): Replace X argument
with OPERANDS.  For AVX+ 128-bit VEX encoded instructions over 256-bit
or 512-bit.  If setting EXT_REX_SSE_REG_P, use EVEX encoded insn
depending on the chosen ISAs.
* config/i386/i386.md (*movxi_internal_avx512f, *movoi_internal_avx,
*movti_internal, *movdi_internal, *movsi_internal, *movtf_internal,
*movdf_internal, *movsf_internal): Adjust standard_sse_constant_opcode
callers.
* config/i386/sse.md (mov_internal): Likewise.
* config/i386/mmx.md (*mov_internal): Likewise.

--- gcc/config/i386/i386-protos.h.jj	2017-10-28 09:00:44.0 +0200
+++ gcc/config/i386/i386-protos.h   2017-12-01 14:39:36.498608799 +0100
@@ -52,7 +52,7 @@ extern int standard_80387_constant_p (rt
 extern const char *standard_80387_constant_opcode (rtx);
 extern rtx standard_80387_constant_rtx (int);
 extern int standard_sse_constant_p (rtx, machine_mode);
-extern const char *standard_sse_constant_opcode (rtx_insn *, rtx);
+extern const char *standard_sse_constant_opcode (rtx_insn *, rtx *);
 extern bool ix86_standard_x87sse_constant_load_p (const rtx_insn *, rtx);
 extern bool symbolic_reference_mentioned_p (rtx);
 extern bool extended_reg_mentioned_p (rtx);
--- gcc/config/i386/i386.c.jj   2017-12-01 09:19:07.0 +0100
+++ gcc/config/i386/i386.c  2017-12-01 14:36:38.884847618 +0100
@@ -10380,12 +10380,13 @@ standard_sse_constant_p (rtx x, machine_
 }
 
 /* Return the opcode of the special instruction to be used to load
-   the constant X.  */
+   the constant operands[1] into operands[0].  */
 
 const char *
-standard_sse_constant_opcode (rtx_insn *insn, rtx x)
+standard_sse_constant_opcode (rtx_insn *insn, rtx *operands)
 {
   machine_mode mode;
+  rtx x = operands[1];
 
   gcc_assert (TARGET_SSE);
 
@@ -10395,34 +10396,51 @@ standard_sse_constant_opcode (rtx_insn *
 {
   switch (get_attr_mode (insn))
{
+   case MODE_TI:
+ if (!EXT_REX_SSE_REG_P (operands[0]))
+   return "%vpxor\t%0, %d0";
+ /* FALLTHRU */
case MODE_XI:
- return "vpxord\t%g0, %g0, %g0";
case MODE_OI:
- return (TARGET_AVX512VL
- ? "vpxord\t%x0, %x0, %x0"
- : "vpxor\t%x0, %x0, %x0");
-   case MODE_TI:
- return (TARGET_AVX512VL
- ? "vpxord\t%x0, %x0, %x0"
- : "%vpxor\t%0, %d0");
+ if (EXT_REX_SSE_REG_P (operands[0]))
+   return (TARGET_AVX512VL
+   ? "vpxord\t%x0, %x0, %x0"
+   : "vpxord\t%g0, %g0, %g0");
+ return "vpxor\t%x0, %x0, %x0";
 
+   case MODE_V2DF:
+ if (!EXT_REX_SSE_REG_P (operands[0]))
+   return "%vxorpd\t%0, %d0";
+ /* FALLTHRU */
case MODE_V8DF:
- return (TARGET_AVX512DQ
- ? "vxorpd\t%g0, %g0, %g0"
- : "vpxorq\t%g0, %g0, %g0");
case MODE_V4DF:
- return "vxorpd\t%x0, %x0, %x0";
-   case MODE_V2DF:
- return "%vxorpd\t%0, %d0";
+ if (EXT_REX_SSE_REG_P (operands[0]))
+   if (TARGET_AVX512DQ)
+ return (TARGET_AVX512VL
+ ? "vxorpd\t%x0, %x0, %x0"
+ : "vxorpd\t%g0, %g0, %g0");
+   else
+ return (TARGET_AVX512VL
+ ? "vpxorq\t%x0, %x0, %x0"
+ : "vpxorq\t%g0, %g0, %g0");
+  return "vxorpd\t%x0, %x0, %x0";
 
+   case MODE_V4SF:
+ if (!EXT_REX_SSE_REG_P (operands[0]))
+   return "%vxorps\t%0, %d0";
+ /* FALLTHRU */
case MODE_V16SF:
- return (TARGET_AVX512DQ
- ? "vxorps\t%g0, %g0, %g0"
- : "vpxord\t%g0, %g0, %g0");
case MODE_V8SF:
- return "vxorps\t%x0, %x0, %x0";
-   case MODE_V4SF:
- return "%vxorps\t%0, %d0";
+ if (EX

Re: [PATCH][gimple-linterchange] Rewrite compute_access_stride

2017-12-01 Thread Bin.Cheng
On Fri, Dec 1, 2017 at 12:31 PM, Richard Biener  wrote:
>
> This is the access stride computation change.  Apart from the
> stride extraction I adjusted the cost model to handle non-constant
> strides by checking if either is a multiple of the other and
> simply fail interchanging if it's the wrong way around for one
> ref or if the simple method using multiple_of_p fails to determine
> either case.
>
> This still handles the bwaves case.
>
> I think we want additional testcases with variable strides for each
> case we add - I believe this is the most conservative way to treat
> variable strides.
>
> It may be inconsistent with the constant stride handling where you
> allow for many OK DRs to outweigh a few not OK DRs, but as it
> worked for bwaves it must be good enough ;)
>
> Tested on x86_64-unknown-linux-gnu (just the interchange testcases).
>
> Currently running a bootstrap with -O3 -g -floop-interchange.
>
> Ok for the branch?
Ok.  This is actually closer to the motivation: a simple/conservative cost
model that only transforms code when it's known to be good.
I will check the impact on the number of interchanges in SPEC.

Thanks,
bin
>
> Richard.
>
> 2017-12-01  Richard Biener  
>
> * gimple-loop-interchange.cc (estimate_val_by_simplify_replace):
> Remove.
> (compute_access_stride): Rewrite using instantiate_scev,
> remove constant substitution.
> (should_interchange_loops): Adjust for non-constant strides.
>
> Index: gcc/gimple-loop-interchange.cc
> ===
> --- gcc/gimple-loop-interchange.cc  (revision 255303)
> +++ gcc/gimple-loop-interchange.cc  (working copy)
> @@ -1325,42 +1325,6 @@ tree_loop_interchange::move_code_to_inne
>  }
>  }
>
> -/* Estimate and return the value of EXPR by replacing variables in EXPR
> -   with CST_TREE and simplifying.  */
> -
> -static tree
> -estimate_val_by_simplify_replace (tree expr, tree cst_tree)
> -{
> -  unsigned i, n;
> -  tree ret = NULL_TREE, e, se;
> -
> -  if (!expr)
> -return NULL_TREE;
> -
> -  /* Do not bother to replace constants.  */
> -  if (CONSTANT_CLASS_P (expr))
> -return expr;
> -
> -  if (!EXPR_P (expr))
> -return cst_tree;
> -
> -  n = TREE_OPERAND_LENGTH (expr);
> -  for (i = 0; i < n; i++)
> -{
> -  e = TREE_OPERAND (expr, i);
> -  se = estimate_val_by_simplify_replace (e, cst_tree);
> -  if (e == se)
> -   continue;
> -
> -  if (!ret)
> -   ret = copy_node (expr);
> -
> -  TREE_OPERAND (ret, i) = se;
> -}
> -
> -  return (ret ? fold (ret) : expr);
> -}
> -
>  /* Given data reference DR in LOOP_NEST, the function computes DR's access
> stride at each level of loop from innermost LOOP to outer.  On success,
> it saves access stride at each level loop in a vector which is pointed
> @@ -1388,44 +1352,31 @@ compute_access_stride (struct loop *loop
>
>tree ref = DR_REF (dr);
>tree scev_base = build_fold_addr_expr (ref);
> -  tree access_size = TYPE_SIZE_UNIT (TREE_TYPE (ref));
> -  tree niters = build_int_cst (sizetype, AVG_LOOP_NITER);
> -  access_size = fold_build2 (MULT_EXPR, sizetype, niters, access_size);
> -
> -  do {
> -tree scev_fn = analyze_scalar_evolution (loop, scev_base);
> -if (chrec_contains_undetermined (scev_fn)
> -   || chrec_contains_symbols_defined_in_loop (scev_fn, loop->num))
> -  break;
> -
> -if (TREE_CODE (scev_fn) != POLYNOMIAL_CHREC)
> -  {
> -   scev_base = scev_fn;
> -   strides->safe_push (build_int_cst (sizetype, 0));
> -   continue;
> -  }
> -
> -scev_base = CHREC_LEFT (scev_fn);
> -if (tree_contains_chrecs (scev_base, NULL))
> -  break;
> -
> -tree scev_step = fold_convert (sizetype, CHREC_RIGHT (scev_fn));
> -
> -enum ev_direction scev_dir = scev_direction (scev_fn);
> -/* Estimate if step isn't constant.  */
> -if (scev_dir == EV_DIR_UNKNOWN)
> -  {
> -   scev_step = estimate_val_by_simplify_replace (scev_step, niters);
> -   if (TREE_CODE (scev_step) != INTEGER_CST
> -   || tree_int_cst_lt (scev_step, access_size))
> - scev_step = access_size;
> -  }
> -/* Compute absolute value of scev step.  */
> -else if (scev_dir == EV_DIR_DECREASES)
> -  scev_step = fold_build1 (NEGATE_EXPR, sizetype, scev_step);
> -
> -strides->safe_push (scev_step);
> -  } while (loop != loop_nest && (loop = loop_outer (loop)) != NULL);
> +  tree scev = analyze_scalar_evolution (loop, scev_base);
> +  scev = instantiate_scev (loop_preheader_edge (loop_nest), loop, scev);
> +  if (! chrec_contains_undetermined (scev))
> +{
> +  tree sl = scev;
> +  struct loop *expected = loop;
> +  while (TREE_CODE (sl) == POLYNOMIAL_CHREC)
> +   {
> + struct loop *sl_loop = get_chrec_loop (sl);
> + while (sl_loop != expected)
> +   {
> + strides->safe_push (size_int (0));
> + expected = loop_outer (expe

[PATCHv3] Add a warning for invalid function casts

2017-12-01 Thread Bernd Edlinger
Hi,

this version of the patch improves the heuristic check to take the
target hook into account, to handle cases correctly when both or only
one parameter is _not_ promoted to int.

Both C and C++ FE should of course use the same logic here.
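
To make the heuristic easier to follow, here is a minimal stand-alone model of it, not the code from the patch. ArgType, INT_PRECISION and safe_arg_type_equiv_p are hypothetical names, the target hook is reduced to a single boolean that is assumed to give the same answer for both types, and the comptypes fallback is approximated by field-wise equality.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a parameter type; not a GCC type.
struct ArgType {
  bool is_pointer;
  bool is_integral;
  unsigned precision;  // width in bits
  bool is_unsigned;
};

const unsigned INT_PRECISION = 32;  // assumed width of int on the target

// PROMOTES models the target hook: true if arguments narrower than int
// are promoted to int when passed.
bool safe_arg_type_equiv_p(const ArgType &t1, const ArgType &t2, bool promotes)
{
  assert(t1.precision > 0 && t2.precision > 0);

  // Any two pointers are treated as interchangeable.
  if (t1.is_pointer && t2.is_pointer)
    return true;

  // Same-width integers: signedness only matters when a value narrower
  // than int is promoted, because the extension then differs.
  if (t1.is_integral && t2.is_integral && t1.precision == t2.precision
      && (t1.is_unsigned == t2.is_unsigned
          || !promotes
          || t1.precision >= INT_PRECISION))
    return true;

  // Stand-in for the comptypes fallback: require identical descriptions.
  return t1.is_pointer == t2.is_pointer
         && t1.is_integral == t2.is_integral
         && t1.precision == t2.precision
         && t1.is_unsigned == t2.is_unsigned;
}
```

Under this model int and unsigned int compare as equivalent, while short and unsigned short do so only when the target does not promote.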


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
gcc:
2017-10-06  Bernd Edlinger  

* doc/invoke.texi: Document -Wcast-function-type.
* recog.h (stored_funcptr): Change signature.
* tree-dump.c (dump_node): Avoid warning.
* typed-splay-tree.h (typed_splay_tree): Avoid warning.

libcpp:
2017-10-06  Bernd Edlinger  

* internal.h (maybe_print_line): Change signature.

c-family:
2017-10-06  Bernd Edlinger  

* c.opt (Wcast-function-type): New warning option.
* c-lex.c (get_fileinfo): Avoid warning.
* c-ppoutput.c (scan_translation_unit_directives_only): Remove cast.

c:
2017-10-06  Bernd Edlinger  

* c-typeck.c (c_safe_arg_type_equiv_p,
c_safe_function_type_cast_p): New function.
(build_c_cast): Implement -Wcast-function-type.

cp:
2017-10-06  Bernd Edlinger  

* decl2.c (start_static_storage_duration_function): Avoid warning.
* typeck.c (cxx_safe_arg_type_equiv_p,
cxx_safe_function_type_cast_p): New function.
(build_reinterpret_cast_1): Implement -Wcast-function-type.

testsuite:
2017-10-06  Bernd Edlinger  

* c-c++-common/Wcast-function-type.c: New test.
* g++.dg/Wcast-function-type.C: New test.
Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c	(revision 255270)
+++ gcc/c/c-typeck.c	(working copy)
@@ -5452,6 +5452,62 @@ handle_warn_cast_qual (location_t loc, tree type,
   while (TREE_CODE (in_type) == POINTER_TYPE);
 }
 
+/* Heuristic check if two parameter types can be considered ABI-equivalent.  */
+
+static bool
+c_safe_arg_type_equiv_p (tree t1, tree t2)
+{
+  t1 = TYPE_MAIN_VARIANT (t1);
+  t2 = TYPE_MAIN_VARIANT (t2);
+
+  if (TREE_CODE (t1) == POINTER_TYPE
+  && TREE_CODE (t2) == POINTER_TYPE)
+return true;
+
+  /* The signedness of the parameter matters only when an integral
+ type smaller than int is promoted to int, otherwise only the
+ precision of the parameter matters.
+ This check should make sure that the callee does not see
+ undefined values in argument registers.  */
+  if (INTEGRAL_TYPE_P (t1)
+  && INTEGRAL_TYPE_P (t2)
+  && TYPE_PRECISION (t1) == TYPE_PRECISION (t2)
+  && (targetm.calls.promote_prototypes (t1)
+	  == targetm.calls.promote_prototypes (t2)
+	  || TYPE_PRECISION (t1) >= TYPE_PRECISION (integer_type_node))
+  && (TYPE_UNSIGNED (t1) == TYPE_UNSIGNED (t2)
+	  || !targetm.calls.promote_prototypes (t1)
+	  || TYPE_PRECISION (t1) >= TYPE_PRECISION (integer_type_node)))
+return true;
+
+  return comptypes (t1, t2);
+}
+
+/* Check if a type cast between two function types can be considered safe.  */
+
+static bool
+c_safe_function_type_cast_p (tree t1, tree t2)
+{
+  if (TREE_TYPE (t1) == void_type_node &&
+  TYPE_ARG_TYPES (t1) == void_list_node)
+return true;
+
+  if (TREE_TYPE (t2) == void_type_node &&
+  TYPE_ARG_TYPES (t2) == void_list_node)
+return true;
+
+  if (!c_safe_arg_type_equiv_p (TREE_TYPE (t1), TREE_TYPE (t2)))
+return false;
+
+  for (t1 = TYPE_ARG_TYPES (t1), t2 = TYPE_ARG_TYPES (t2);
+   t1 && t2;
+   t1 = TREE_CHAIN (t1), t2 = TREE_CHAIN (t2))
+if (!c_safe_arg_type_equiv_p (TREE_VALUE (t1), TREE_VALUE (t2)))
+  return false;
+
+  return true;
+}
+
 /* Build an expression representing a cast to type TYPE of expression EXPR.
LOC is the location of the cast-- typically the open paren of the cast.  */
 
@@ -5645,6 +5701,16 @@ build_c_cast (location_t loc, tree type, tree expr
 	pedwarn (loc, OPT_Wpedantic, "ISO C forbids "
 		 "conversion of object pointer to function pointer type");
 
+  if (TREE_CODE (type) == POINTER_TYPE
+	  && TREE_CODE (otype) == POINTER_TYPE
+	  && TREE_CODE (TREE_TYPE (type)) == FUNCTION_TYPE
+	  && TREE_CODE (TREE_TYPE (otype)) == FUNCTION_TYPE
+	  && !c_safe_function_type_cast_p (TREE_TYPE (type),
+	   TREE_TYPE (otype)))
+	warning_at (loc, OPT_Wcast_function_type,
+		"cast between incompatible function types"
+		" from %qT to %qT", otype, type);
+
   ovalue = value;
   value = convert (type, value);
 
Index: gcc/c-family/c-lex.c
===
--- gcc/c-family/c-lex.c	(revision 255270)
+++ gcc/c-family/c-lex.c	(working copy)
@@ -101,9 +101,11 @@ get_fileinfo (const char *name)
   struct c_fileinfo *fi;
 
   if (!file_info_tree)
-file_info_tree = splay_tree_new ((splay_tree_compare_fn) strcmp,
+file_info_tree = splay_tree_new ((splay_tree_compare_fn)
+ (void (*) (void)) strcmp,
  0,
- (splay_tree_delete_value_fn) free);
+ (splay_tree_delete_value_fn)
+  

[PATCH][gimple-linterchange] Rewrite compute_access_stride

2017-12-01 Thread Richard Biener

This is the access stride computation change.  Apart from the
stride extraction I adjusted the cost model to handle non-constant
strides by checking if either is a multiple of the other and
simply fail interchanging if it's the wrong way around for one
ref or if the simple method using multiple_of_p fails to determine
either case.
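
As a rough illustration of that check (a sketch, not the code in the patch): assume a stride can be modelled as a constant times a product of symbols. Stride, multiple_of, RefStrides and should_interchange are hypothetical names, and treating equal strides as neutral is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a possibly non-constant stride: coeff * product(syms).
struct Stride {
  long coeff;
  std::vector<std::string> syms;  // symbolic factors, kept sorted
};

// Simple syntactic test: is A provably a multiple of B?
bool multiple_of(const Stride &a, const Stride &b)
{
  assert(std::is_sorted(a.syms.begin(), a.syms.end()));
  assert(std::is_sorted(b.syms.begin(), b.syms.end()));
  if (a.coeff == 0)
    return true;   // zero is a multiple of anything
  if (b.coeff == 0)
    return false;
  return a.coeff % b.coeff == 0
         && std::includes(a.syms.begin(), a.syms.end(),
                          b.syms.begin(), b.syms.end());
}

struct RefStrides {
  Stride inner;  // stride of the ref in the current inner loop
  Stride outer;  // stride of the ref in the current outer loop
};

// Interchange only if no ref argues against it and at least one argues for it.
bool should_interchange(const std::vector<RefStrides> &refs)
{
  bool profitable = false;
  for (const RefStrides &r : refs)
    {
      bool inner_is_multiple = multiple_of(r.inner, r.outer);
      bool outer_is_multiple = multiple_of(r.outer, r.inner);
      if (!inner_is_multiple && !outer_is_multiple)
        return false;  // cannot decide either way: give up
      if (inner_is_multiple && outer_is_multiple)
        continue;      // equal strides: neutral (assumption)
      if (outer_is_multiple)
        return false;  // already in the better order for this ref
      profitable = true;
    }
  return profitable;
}
```

With an inner stride of 8*n and an outer stride of 8 the sketch votes for interchange. Swapping the two, or using unrelated symbols such as 8*n versus 8*m, makes it give up.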

This still handles the bwaves case.

I think we want additional testcases with variable strides for each
case we add - I believe this is the most conservative way to treat
variable strides.

It may be inconsistent with the constant stride handling where you
allow for many OK DRs to outweigh a few not OK DRs, but as it
worked for bwaves it must be good enough ;)

Tested on x86_64-unknown-linux-gnu (just the interchange testcases).

Currently running a bootstrap with -O3 -g -floop-interchange.

Ok for the branch?

Richard.

2017-12-01  Richard Biener  

* gimple-loop-interchange.cc (estimate_val_by_simplify_replace):
Remove.
(compute_access_stride): Rewrite using instantiate_scev,
remove constant substitution.
(should_interchange_loops): Adjust for non-constant strides.

Index: gcc/gimple-loop-interchange.cc
===
--- gcc/gimple-loop-interchange.cc  (revision 255303)
+++ gcc/gimple-loop-interchange.cc  (working copy)
@@ -1325,42 +1325,6 @@ tree_loop_interchange::move_code_to_inne
 }
 }
 
-/* Estimate and return the value of EXPR by replacing variables in EXPR
-   with CST_TREE and simplifying.  */
-
-static tree
-estimate_val_by_simplify_replace (tree expr, tree cst_tree)
-{
-  unsigned i, n;
-  tree ret = NULL_TREE, e, se;
-
-  if (!expr)
-return NULL_TREE;
-
-  /* Do not bother to replace constants.  */
-  if (CONSTANT_CLASS_P (expr))
-return expr;
-
-  if (!EXPR_P (expr))
-return cst_tree;
-
-  n = TREE_OPERAND_LENGTH (expr);
-  for (i = 0; i < n; i++)
-{
-  e = TREE_OPERAND (expr, i);
-  se = estimate_val_by_simplify_replace (e, cst_tree);
-  if (e == se)
-   continue;
-
-  if (!ret)
-   ret = copy_node (expr);
-
-  TREE_OPERAND (ret, i) = se;
-}
-
-  return (ret ? fold (ret) : expr);
-}
-
 /* Given data reference DR in LOOP_NEST, the function computes DR's access
stride at each level of loop from innermost LOOP to outer.  On success,
it saves access stride at each level loop in a vector which is pointed
@@ -1388,44 +1352,31 @@ compute_access_stride (struct loop *loop
 
   tree ref = DR_REF (dr);
   tree scev_base = build_fold_addr_expr (ref);
-  tree access_size = TYPE_SIZE_UNIT (TREE_TYPE (ref));
-  tree niters = build_int_cst (sizetype, AVG_LOOP_NITER);
-  access_size = fold_build2 (MULT_EXPR, sizetype, niters, access_size);
-
-  do {
-tree scev_fn = analyze_scalar_evolution (loop, scev_base);
-if (chrec_contains_undetermined (scev_fn)
-   || chrec_contains_symbols_defined_in_loop (scev_fn, loop->num))
-  break;
-
-if (TREE_CODE (scev_fn) != POLYNOMIAL_CHREC)
-  {
-   scev_base = scev_fn;
-   strides->safe_push (build_int_cst (sizetype, 0));
-   continue;
-  }
-
-scev_base = CHREC_LEFT (scev_fn);
-if (tree_contains_chrecs (scev_base, NULL))
-  break;
-
-tree scev_step = fold_convert (sizetype, CHREC_RIGHT (scev_fn));
-
-enum ev_direction scev_dir = scev_direction (scev_fn);
-/* Estimate if step isn't constant.  */
-if (scev_dir == EV_DIR_UNKNOWN)
-  {
-   scev_step = estimate_val_by_simplify_replace (scev_step, niters);
-   if (TREE_CODE (scev_step) != INTEGER_CST
-   || tree_int_cst_lt (scev_step, access_size))
- scev_step = access_size;
-  }
-/* Compute absolute value of scev step.  */
-else if (scev_dir == EV_DIR_DECREASES)
-  scev_step = fold_build1 (NEGATE_EXPR, sizetype, scev_step);
-
-strides->safe_push (scev_step);
-  } while (loop != loop_nest && (loop = loop_outer (loop)) != NULL);
+  tree scev = analyze_scalar_evolution (loop, scev_base);
+  scev = instantiate_scev (loop_preheader_edge (loop_nest), loop, scev);
+  if (! chrec_contains_undetermined (scev))
+{
+  tree sl = scev;
+  struct loop *expected = loop;
+  while (TREE_CODE (sl) == POLYNOMIAL_CHREC)
+   {
+ struct loop *sl_loop = get_chrec_loop (sl);
+ while (sl_loop != expected)
+   {
+ strides->safe_push (size_int (0));
+ expected = loop_outer (expected);
+   }
+ strides->safe_push (CHREC_RIGHT (sl));
+ sl = CHREC_LEFT (sl);
+ expected = loop_outer (expected);
+   }
+  if (! tree_contains_chrecs (sl, NULL))
+   while (expected != loop_outer (loop_nest))
+ {
+   strides->safe_push (size_int (0));
+   expected = loop_outer (expected);
+ }
+}
 
   dr->aux = strides;
 }
@@ -1538,6 +1489,9 @@ should_interchange_loops (unsigned i_idx
   struct data_reference *dr;
   bool all_seq_dr_be

[PATCH][gimple-linterchange] add testcase from bwaves

2017-12-01 Thread Richard Biener

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-12-01  Richard Biener  

* gfortran.dg/pr81303.f: New testcase.

Index: gcc/testsuite/gfortran.dg/pr81303.f
===
--- gcc/testsuite/gfortran.dg/pr81303.f (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr81303.f (working copy)
@@ -0,0 +1,44 @@
+! { dg-do compile }
+! { dg-options "-O3 -floop-interchange -fdump-tree-linterchange-details" }
+
+subroutine mat_times_vec(y,x,a,axp,ayp,azp,axm,aym,azm,
+ $  nb,nx,ny,nz)
+implicit none
+integer nb,nx,ny,nz,i,j,k,m,l,kit,im1,ip1,jm1,jp1,km1,kp1
+
+real*8 y(nb,nx,ny,nz),x(nb,nx,ny,nz)
+
+real*8 a(nb,nb,nx,ny,nz),
+ 1  axp(nb,nb,nx,ny,nz),ayp(nb,nb,nx,ny,nz),azp(nb,nb,nx,ny,nz),
+ 2  axm(nb,nb,nx,ny,nz),aym(nb,nb,nx,ny,nz),azm(nb,nb,nx,ny,nz)
+
+
+  do k=1,nz
+ km1=mod(k+nz-2,nz)+1
+ kp1=mod(k,nz)+1
+ do j=1,ny
+jm1=mod(j+ny-2,ny)+1
+jp1=mod(j,ny)+1
+do i=1,nx
+   im1=mod(i+nx-2,nx)+1
+   ip1=mod(i,nx)+1
+   do l=1,nb
+  y(l,i,j,k)=0.0d0
+  do m=1,nb
+ y(l,i,j,k)=y(l,i,j,k)+
+ 1   a(l,m,i,j,k)*x(m,i,j,k)+
+ 2   axp(l,m,i,j,k)*x(m,ip1,j,k)+
+ 3   ayp(l,m,i,j,k)*x(m,i,jp1,k)+
+ 4   azp(l,m,i,j,k)*x(m,i,j,kp1)+
+ 5   axm(l,m,i,j,k)*x(m,im1,j,k)+
+ 6   aym(l,m,i,j,k)*x(m,i,jm1,k)+
+ 7   azm(l,m,i,j,k)*x(m,i,j,km1)
+  enddo
+   enddo
+enddo
+ enddo
+enddo  
+return
+end
+
+! { dg-final { scan-tree-dump-times "is interchanged" 1 "linterchange" } }


Re: [PATCH branch/gimple-linterchange]Use dyn_cast instead of is_a<> and as_a<>

2017-12-01 Thread Richard Biener
On Fri, Dec 1, 2017 at 11:53 AM, Bin Cheng  wrote:
> Hi,
> This is a simple patch using dyn_cast instead of is_a<> and as_a<>, as
> suggested by review.
> This is for branches/gimple-linterchange; bootstrapped and tested the same
> way as when the branch was created.  Is it OK?

Ok.

Richard.



> Thanks,
> bin
> 2017-11-30  Bin Cheng  
>
> * gimple-loop-interchange.cc (is-a.h): New header file.
> (loop_cand::find_reduction_by_stmt): Use dyn_cast instead of is_a<>
> and as_a<>.
> (loop_cand::analyze_iloop_reduction_var): Ditto.
> (loop_cand::analyze_oloop_reduction_var): Ditto.  Check gimple stmt
> against phi node directly.


Re: RFC: Variable-length VECTOR_CSTs

2017-12-01 Thread Richard Biener
On Thu, Nov 30, 2017 at 2:18 PM, Richard Sandiford
 wrote:
> Richard Sandiford  writes:
>> Richard Biener  writes:
>>> On Wed, Nov 29, 2017 at 12:57 PM, Richard Sandiford
>>>  wrote:
 It was clear from the SVE reviews that people were unhappy with how
 "special" the variable-length case was.  One particular concern was
 the use of VEC_DUPLICATE_CST and VEC_SERIES_CST, and the way that
 that would in turn lead to different representations of VEC_PERM_EXPRs
 with constant permute vectors, and difficulties in invoking
 vec_perm_const_ok.

 This is an RFC for a VECTOR_CST representation that treats each
 specific constant as one instance of an arbitrary-length sequence.
 The representation then extends to variable-length vectors in a
 natural way.

 As discussed on IRC, if a vector contains X*N elements for some
 constant N and integer X>0, the main features we need are:

 1) the ability to represent a sequence that duplicates N values

This is useful for SLP invariants.

 2) the ability to represent a sequence that starts with N values and
is followed by zeros

This is useful for the initial value in a double or SLP reduction

 3) the ability to represent N interleaved series

This is useful for SLP inductions and for VEC_PERM_EXPRs.

 For (2), zero isn't necessarily special, since vectors used in an AND
 reduction might need to fill with ones.  Also, we might need up to N
 different fill values with mixed SLP operations; it isn't necessarily
 safe to assume that a single fill value will always be enough.

 The same goes for (3): there's no reason in principle why the
 steps in an SLP induction should all be the same (although they
 do need to be at the moment).  E.g. once we support SLP on:

   for (unsigned int i = 0; i < n; i += 2)
 {
   x[i] += 4 + i;
   x[i + 1] += 11 + i * 3;
 }

 we'll need {[4, 14], +, [2, 6]}.

 So the idea is to represent vectors as P interleaved patterns of the form:

   [BASE0, BASE1, BASE1 + STEP, BASE1 + STEP*2, ...]

 where the STEP is always zero (actually null) for non-integer vectors.
 This is effectively projecting a "foreground" value of P elements
 onto an arbitrary-length "background" sequence, where the background
 sequence contains P parallel linear series.

 E.g. to pick an extreme and unlikely example,

   [42, 99, 2, 20, 3, 30, 4, 40, ...]

 has 2 patterns:

   BASE0 = 42, BASE1 = 2, STEP = 1
   BASE0 = 99, BASE1 = 20, STEP = 10

 The more useful cases are degenerate versions of this general case.
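
As a sketch of how such an encoding expands back into elements (an illustration only, not the patch's data structures): Pattern, element and expand are hypothetical names, and the elements are assumed to be plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One pattern of the form [BASE0, BASE1, BASE1 + STEP, BASE1 + STEP*2, ...].
struct Pattern {
  long base0;
  long base1;
  long step;
};

// Element I of a vector built from interleaved patterns: pattern I % P
// supplies its (I / P)-th value.
long element(const std::vector<Pattern> &pats, std::size_t i)
{
  assert(!pats.empty());
  const Pattern &p = pats[i % pats.size()];
  std::size_t k = i / pats.size();
  if (k == 0)
    return p.base0;
  return p.base1 + static_cast<long>(k - 1) * p.step;
}

// Materialise the first N elements of the sequence.
std::vector<long> expand(const std::vector<Pattern> &pats, std::size_t n)
{
  std::vector<long> out;
  for (std::size_t i = 0; i < n; ++i)
    out.push_back(element(pats, i));
  return out;
}
```

Expanding the two patterns {42, 2, 1} and {99, 20, 10} to eight elements gives back 42, 99, 2, 20, 3, 30, 4, 40. A duplicated value is the degenerate case base0 == base1 with step 0, and "N values followed by zeros" is base1 == 0 with step 0.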

 As far as memory consumption goes: the number of patterns needed for a
 fixed-length vector with 2*N elements is always at most N; in the worst
 case, we simply interleave the first N elements with the second N elements.
 The worst-case increase in footprint is therefore N trees for the steps.
 In practice the footprint is usually smaller than it was before, since
 most constants do have a pattern.

 The patch below implements this for trees.  I have patches to use the
 same style of encoding for CONST_VECTOR and vec_perm_indices, but the
 tree one is probably easiest to read.

 The patch only adds the representation.  Follow-on patches make more
 use of it (and usually make things simpler; e.g. integer_zerop is no
 longer a looping operation).

 Does this look better?
>>>
>>> Yes, the overall design looks good.  I wonder why you chose to have
>>> the number of patterns being a power of two?  I suppose this is
>>> to have the same number of elements from all patterns in the final
>>> vector (which is power-of-two sized)?
>>
>> Right.  The rtl and vec_perm_indices parts don't have this restriction,
>> since some ports do define non-power-of-2 vectors for internal use.
>> The problem is that, since VECTOR_CSTs are used by the FE, we need
>> to support all valid vector lengths without blowing the 16-bit field.
>> Using the same style of representation as TYPE_VECTOR_SUBPARTS seemed
>> like the safest way of doing that.
>>
>>> I wonder if there exists a vector where say a three-pattern
>>> interleaving would be smaller than a four-pattern one?
>>
>> Only in the non-power-of-2 case.
>>
>>> Given you add flags for various purposes would it make sense to
>>> overload 'step' with a regular element to avoid the storage increase
>>> in case step is unnecessary?  This makes it have three elements
>>> which is of course awkward :/
>>
>> I wondered about keeping it as an array of trees and tacking the
>> steps onto the end as an optional addition.  But the idea is that
>> tree_vector_pattern becomes the preferred way of handling constant
>> vectors, if it can be used, so it seemed neater to use in the tree
>> node too.
>
> In the en

[PATCH] Fix PR83232

2017-12-01 Thread Richard Biener

The following fixes an old missed basic-block vectorization issue
that shows up as a regression caused by an x86 cost change lumping a
lot more code into the same BB.

We are sorting DRs after constant offset and if there are multiple
refs with the same offset we break the DR group.  This doesn't handle
things like

 a[0] = ..;
 a[1] = ..;
 ...
 a[0] = ..;
 a[1] = ..;

very well (read: not).  The temporary fix for GCC 8 to solve the fma3d
regression (and avoid a STLF issue) is to not break groups at such point
but simply ignore the duplicates we run into for group construction
so only the first group in a BB with exact duplicates will be identified.
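
A simplified model of that group construction, assuming stores only and a list already sorted by offset and then by statement order. Ref and build_store_group are hypothetical names; the real function works on data references and checks much more.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data reference: constant offset plus the
// UID of the statement that performs the access.
struct Ref {
  long init;
  int uid;
};

// Build the first store group from REFS.  Exact duplicates of the previous
// offset are skipped instead of terminating the group; a gap other than
// TYPE_SIZE still ends it.
std::vector<int> build_store_group(const std::vector<Ref> &refs, long type_size)
{
  std::vector<int> group;
  if (refs.empty())
    return group;
  group.push_back(refs[0].uid);
  for (std::size_t i = 1; i < refs.size(); ++i)
    {
      long init_prev = refs[i - 1].init;
      long init_b = refs[i].init;
      assert(init_prev <= init_b);  // sorting guarantees this
      if (init_b == init_prev)
        continue;                   // drop the later duplicate
      if (init_b - init_prev != type_size)
        break;                      // stores with gaps are not supported
      group.push_back(refs[i].uid);
    }
  return group;
}
```

For the a[0], a[1], a[0], a[1] sequence with 8-byte elements, the sorted list is (0,1) (0,3) (8,2) (8,4) and the sketch yields the group {1, 2}. Breaking at the first duplicate would have left only {1}.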

A more elaborate fix isn't suitable at this stage IMHO (I suggested how
to do it in the comment but didn't try yet to see how complicated it
would end up).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2017-12-01  Richard Biener  

PR tree-optimization/83232
* tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Fix
detection of same access. Instead of breaking the group here
do not consider the duplicate.  Add comment explaining real fix.

* gfortran.dg/vect/pr83232.f90: New testcase.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 255300)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -2841,10 +2841,6 @@ vect_analyze_data_ref_accesses (vec_info
  if (data_ref_compare_tree (DR_STEP (dra), DR_STEP (drb)) != 0)
break;
 
- /* Do not place the same access in the interleaving chain twice.  */
- if (tree_int_cst_compare (DR_INIT (dra), DR_INIT (drb)) == 0)
-   break;
-
  /* Check the types are compatible.
 ???  We don't distinguish this during sorting.  */
  if (!types_compatible_p (TREE_TYPE (DR_REF (dra)),
@@ -2854,7 +2850,25 @@ vect_analyze_data_ref_accesses (vec_info
  /* Sorting has ensured that DR_INIT (dra) <= DR_INIT (drb).  */
  HOST_WIDE_INT init_a = TREE_INT_CST_LOW (DR_INIT (dra));
  HOST_WIDE_INT init_b = TREE_INT_CST_LOW (DR_INIT (drb));
- gcc_assert (init_a <= init_b);
+ HOST_WIDE_INT init_prev
+   = TREE_INT_CST_LOW (DR_INIT (datarefs_copy[i-1]));
+ gcc_assert (init_a <= init_b
+ && init_a <= init_prev
+ && init_prev <= init_b);
+
+ /* Do not place the same access in the interleaving chain twice.  */
+ if (init_b == init_prev)
+   {
+ gcc_assert (gimple_uid (DR_STMT (datarefs_copy[i-1]))
+ < gimple_uid (DR_STMT (drb)));
+ /* ???  For now we simply "drop" the later reference which is
+otherwise the same rather than finishing off this group.
+In the end we'd want to re-process duplicates forming
+multiple groups from the refs, likely by just collecting
+all candidates (including duplicates and split points
+below) in a vector and then process them together.  */
+ continue;
+   }
 
  /* If init_b == init_a + the size of the type * k, we have an
 interleaving, and DRA is accessed before DRB.  */
@@ -2866,10 +2880,7 @@ vect_analyze_data_ref_accesses (vec_info
  /* If we have a store, the accesses are adjacent.  This splits
 groups into chunks we support (we don't support vectorization
 of stores with gaps).  */
- if (!DR_IS_READ (dra)
- && (init_b - (HOST_WIDE_INT) TREE_INT_CST_LOW
-(DR_INIT (datarefs_copy[i-1]))
- != type_size_a))
+ if (!DR_IS_READ (dra) && init_b - init_prev != type_size_a)
break;
 
  /* If the step (if not zero or non-constant) is greater than the
Index: gcc/testsuite/gfortran.dg/vect/pr83232.f90
===
--- gcc/testsuite/gfortran.dg/vect/pr83232.f90  (nonexistent)
+++ gcc/testsuite/gfortran.dg/vect/pr83232.f90  (working copy)
@@ -0,0 +1,33 @@
+! { dg-do compile }
+! { dg-require-effective-target vect_double }
+! { dg-additional-options "-funroll-loops --param vect-max-peeling-for-alignment=0 -fdump-tree-slp-details" }
+
+  SUBROUTINE MATERIAL_41_INTEGRATION ( STRESS,YLDC,EFPS,   &
+ &  DTnext,Dxx,Dyy,Dzz,Dxy,Dxz,Dyz,MatID,P1,P3 )
+  REAL(KIND(0D0)), INTENT(INOUT) :: STRESS(6)
+  REAL(KIND(0D0)), INTENT(IN):: DTnext
+  REAL(KIND(0D0)), INTENT(IN):: Dxx,Dyy,Dzz,Dxy,Dxz,Dyz
+  REAL(KIND(0D0)) :: Einc(6)
+  REAL(KIND(0D0)) :: P1,P3
+
+  Einc(1) = DTnext * Dxx ! (1)
+  Einc(2) = DTnext * Dyy
+  Einc(3) = DTnext * Dzz
+  Einc(4) = DTnext * Dxy
+  Einc(5) = DTnext * Dxz
+  Einc(6) = DTnext * Dyz
+  DO i = 1,6
+STRESS(i) = STRESS(i) + P

[PATCH branch/gimple-linterchange]Use dyn_cast instead of is_a<> and as_a<>

2017-12-01 Thread Bin Cheng
Hi,
This is a simple patch using dyn_cast instead of is_a<> and as_a<>, as
suggested by review.
This is for branches/gimple-linterchange; bootstrapped and tested the same
way as when the branch was created.  Is it OK?
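
For readers unfamiliar with the idiom, a tiny self-contained model of the change follows. It does not use GCC's is-a.h; Stmt, Phi, is_phi, as_phi and dyn_cast_phi are hypothetical stand-ins for the templates.

```cpp
#include <cassert>

// Hypothetical statement hierarchy with a kind tag.
struct Stmt {
  enum Kind { ASSIGN, PHI };
  Kind kind;
  explicit Stmt(Kind k) : kind(k) {}
};

struct Phi : Stmt {
  Phi() : Stmt(PHI) {}
};

// Stand-in for is_a<>: a pure type test.
bool is_phi(const Stmt *s) { return s->kind == Stmt::PHI; }

// Stand-in for as_a<>: a checked downcast, valid only if the test holds.
Phi *as_phi(Stmt *s)
{
  assert(is_phi(s));
  return static_cast<Phi *>(s);
}

// Stand-in for dyn_cast<>: test and cast in one step, null on mismatch.
Phi *dyn_cast_phi(Stmt *s) { return is_phi(s) ? as_phi(s) : nullptr; }

// The shape of the code before the patch...
Phi *find_phi_two_step(Stmt *stmt)
{
  Phi *phi = nullptr;
  if (is_phi(stmt))
    phi = as_phi(stmt);
  return phi;
}

// ...and after it.
Phi *find_phi_one_step(Stmt *stmt) { return dyn_cast_phi(stmt); }
```

Both helpers return the same pointer for every input, which is why the replacement is a pure simplification.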

Thanks,
bin
2017-11-30  Bin Cheng  

* gimple-loop-interchange.cc (is-a.h): New header file.
(loop_cand::find_reduction_by_stmt): Use dyn_cast instead of is_a<>
and as_a<>.
(loop_cand::analyze_iloop_reduction_var): Ditto.
(loop_cand::analyze_oloop_reduction_var): Ditto.  Check gimple stmt
against phi node directly.

From 88ddf90ee183f2e58bb5d4b38d14733412603b44 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 29 Nov 2017 11:23:52 +
Subject: [PATCH 40/42] dyn_cast

---
 gcc/gimple-loop-interchange.cc | 25 +++--
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index 7afafb8..e999822 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
+#include "is-a.h"
 #include "tree.h"
 #include "gimple.h"
 #include "tree-pass.h"
@@ -270,12 +271,9 @@ unsupported_edge (edge e)
 reduction_p
 loop_cand::find_reduction_by_stmt (gimple *stmt)
 {
-  gphi *phi = NULL;
+  gphi *phi = dyn_cast <gphi *> (stmt);
   reduction_p re;
 
-  if (is_a <gphi *> (stmt))
-phi = as_a <gphi *> (stmt);
-
   for (unsigned i = 0; m_reductions.iterate (i, &re); ++i)
 if ((phi != NULL && phi == re->lcssa_phi)
 	|| (stmt == re->producer || stmt == re->consumer))
@@ -591,10 +589,8 @@ loop_cand::analyze_iloop_reduction_var (tree var)
 	continue;
 
   /* Or else it's used in PHI itself.  */
-  use_phi = NULL;
-  if (is_a <gphi *> (stmt)
-	  && (use_phi = as_a <gphi *> (stmt)) != NULL
-	  && use_phi == phi)
+  use_phi = dyn_cast <gphi *> (stmt);
+  if (use_phi == phi)
 	continue;
 
   if (use_phi != NULL
@@ -684,12 +680,7 @@ loop_cand::analyze_oloop_reduction_var (loop_cand *iloop, tree var)
   if (is_gimple_debug (stmt))
 	continue;
 
-  if (!flow_bb_inside_loop_p (m_loop, gimple_bb (stmt)))
-	return false;
-
-  if (! is_a <gphi *> (stmt)
-	  || (use_phi = as_a <gphi *> (stmt)) == NULL
-	  || use_phi != inner_re->phi)
+  if (stmt != inner_re->phi)
 	return false;
 }
 
@@ -701,10 +692,8 @@ loop_cand::analyze_oloop_reduction_var (loop_cand *iloop, tree var)
 	continue;
 
   /* Or else it's used in PHI itself.  */
-  use_phi = NULL;
-  if (is_a <gphi *> (stmt)
-	  && (use_phi = as_a <gphi *> (stmt)) != NULL
-	  && use_phi == phi)
+  use_phi = dyn_cast <gphi *> (stmt);
+  if (use_phi == phi)
 	continue;
 
   if (lcssa_phi == NULL
-- 
1.9.1



Re: [patch] remove cilk-plus

2017-12-01 Thread Paolo Carlini

Hi,

On 16/11/2017 16:33, Koval, Julia wrote:

// I failed to send patch itself, it is too big even in gzipped form.  What is 
the right way to send such big patches?

Hi, this patch removes cilkplus. Ok for trunk?
Now that cilkplus is gone, I suppose we should clean up Bugzilla
accordingly. Shall I go ahead and close essentially all the related bugs
we have? As WONTFIX, or something else? Let's agree on something. In
principle we could keep the regressions open for the sake of the existing
release branches, but I think we have very, very few of those and anyway
I don't see who is going to work on them...


Cheers,
Paolo.


Re: [PATCH][ARM] Fix wrong code by arm_final_prescan with fp16 move instructions

2017-12-01 Thread Sudakshina Das

On 30/11/17 16:07, Kyrill Tkachov wrote:


On 30/11/17 16:06, Sudakshina Das wrote:

Hi Kyrill

On 27/11/17 12:25, Kyrill Tkachov wrote:
> Hi Sudi,
>
> On 24/11/17 14:57, Sudi Das wrote:
>> Hi
>>
>> For the following test case:
>> __fp16
>> test_select (__fp16 a, __fp16 b, __fp16 c)
>> {
>>return (a < b) ? b : c;
>> }
>>
>> when compiled with -mfpu=fp-armv8 -march=armv8.2-a+fp16 -marm
>> -mfloat-abi=hard trunk generates wrong code:
>>
>> test_select:
>>  @ args = 0, pretend = 0, frame = 0
>>  @ frame_needed = 0, uses_anonymous_args = 0
>>  @ link register save eliminated.
>>  vcvtb.f32.f16   s0, s0
>>  vcvtb.f32.f16   s15, s1
>>  vcmpe.f32   s0, s15
>>  vmrs    APSR_nzcv, FPSCR
>>  // <-- No conditional branch!
>>  vmov    s1, s2  @ __fp16
>> .L2:
>>  vmov    s0, s1  @ __fp16
>>  bx  lr
>>
>> There should have been a conditional branch there to skip one of the
>> VMOVs.
>> This patch fixes this problem by making *movhf_vfp_fp16 unconditional
>> wherever needed.
>>
>> Testing done: Add a new test case and checked for regressions
>> arm-none-linux-gnueabihf.
>>
>> Is this ok for trunk?
>>
>
> This is ok, assuming a bootstrap on arm-none-linux-gnueabihf passes
> as well.
> Does this bug appear on the GCC 7 branch?
> If so, could you please test this patch on that branch as well?
>

I have tested the patch and also sent a new patch request for gcc-7
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02577.html



Thanks Sudi, this is ok to commit to the branch after we let this patch
bake on trunk for a week without problems.



Committed as r255301 on trunk. Will wait for a week before committing to 
gcc-7.


Thanks
Sudi


Kyrill



Thanks
Sudi

> Thanks,
> Kyrill
>
>> Sudi
>>
>> ChangeLog entries are as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2017-11-24  Sudakshina Das 
>>
>> * config/arm/vfp.md (*movhf_vfp_fp16): Add conds attribute.
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2017-11-24  Sudakshina Das 
>>
>> * gcc.target/arm/armv8_2-fp16-move-2.c: New test.
>







RE: [PATCH][GCC][ARM] Generate .arch and .arch_extensions for each function if required. [Patch (3/3)]

2017-12-01 Thread Tamar Christina
Ping, 

This patch has also been bootstrapped with no issues.
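
As a reminder of what the quoted description below amounts to, here is a minimal sketch of "emit a directive only when it differs from the one last printed". It is not the code in arm.c: DirectiveEmitter and declare_function are hypothetical names, and the interaction between .arch, .arch_extension and .fpu is deliberately ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical emitter that remembers the last directives it printed.
struct DirectiveEmitter {
  std::string last_arch;
  std::string last_fpu;
  std::vector<std::string> out;  // lines of assembly output

  void declare_function(const std::string &name, const std::string &arch,
                        const std::string &fpu)
  {
    assert(!name.empty());
    // Print .arch only if it changed since the previous function.
    if (arch != last_arch)
      {
        out.push_back("\t.arch " + arch);
        last_arch = arch;
      }
    // Same logic for .fpu.
    if (fpu != last_fpu)
      {
        out.push_back("\t.fpu " + fpu);
        last_fpu = fpu;
      }
    out.push_back(name + ":");
  }
};
```

Declaring f and g with the same settings and then h with a different architecture prints the directives once for f, nothing extra for g, and a single .arch line before h.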

> -Original Message-
> From: Tamar Christina
> Sent: Tuesday, November 21, 2017 17:29
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com; Kyrylo Tkachov
> 
> Subject: RE: [PATCH][GCC][ARM] Generate .arch and .arch_extensions for
> each function if required. [Patch (3/3)]
> 
> Ping
> 
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Tamar Christina
> > Sent: Monday, November 6, 2017 16:52
> > To: gcc-patches@gcc.gnu.org
> > Cc: nd ; Ramana Radhakrishnan
> > ; Richard Earnshaw
> > ; ni...@redhat.com; Kyrylo Tkachov
> > 
> > Subject: [PATCH][GCC][ARM] Generate .arch and .arch_extensions for
> > each function if required. [Patch (3/3)]
> >
> > Hi All,
> >
> > This patch adds the needed machinery to generate the appropriate .arch
> > and .arch_extension directives per function.
> >
> > Borrowing from AArch64 this is only done when it's required (i.e. when
> > the directives to be set differ from the currently set one).
> >
> > As part of this the .fpu directive has also been cleaned up to follow
> > the same logic.
> >
> > Regtested on arm-none-eabi and no regressions.
> >
> > Ok for trunk?
> >
> > gcc/
> > 2017-11-06  Tamar Christina  
> >
> > PR target/82641
> > * config/arm/arm.c (INCLUDE_STRING): Define.
> > (arm_last_printed_arch_string, arm_last_printed_fpu_string): New.
> > (arm_declare_function_name): Conservatively emit .arch,
> > .arch_extensions
> > and .fpu.
> >
> > gcc/testsuite/
> > 2017-11-06  Tamar Christina  
> >
> > PR target/82641
> > * gcc.target/arm/pragma_arch_attribute_2.c: New.
> > * gcc.target/arm/pragma_arch_attribute_2.c: New.
> > * gcc.target/arm/pragma_arch_attribute_3.c: New.
> > * gcc.target/arm/pragma_fpu_attribute.c: New.
> > * gcc.target/arm/pragma_fpu_attribute_2.c: New.
> >
> > --


Re: [PATCH] handle non-constant offsets in -Wstringop-overflow (PR 77608)

2017-12-01 Thread Jeff Law
On 11/30/2017 01:30 PM, Martin Sebor wrote:
> On 11/22/2017 05:03 PM, Jeff Law wrote:
>> On 11/21/2017 12:07 PM, Martin Sebor wrote:
>>> On 11/21/2017 09:55 AM, Jeff Law wrote:
 On 11/19/2017 04:28 PM, Martin Sebor wrote:
> On 11/18/2017 12:53 AM, Jeff Law wrote:
>> On 11/17/2017 12:36 PM, Martin Sebor wrote:
>>> The attached patch enhances -Wstringop-overflow to detect more
>>> instances of buffer overflow at compile time by handling non-
>>> constant offsets into the destination object that are known to
>>> be in some range.  The solution could be improved by handling
>>> even more cases (e.g., anti-ranges or offsets relative to
>>> pointers beyond the beginning of an object) but it's a start.
>>>
>>> In addition to bootstrapping/regtesting GCC, also tested with
>>> Binutils/GDB, Glibc, and the Linux kernel on x86_64 with no
>>> regressions.
>>>
>>> Martin
>>>
>>> The top of GDB fails to compile at the moment so the validation
>>> there was incomplete.
>>>
>>> gcc-77608.diff
>>>
>>>
>>> PR middle-end/77608 - missing protection on trivially detectable
>>> runtime buffer overflow
>>>
>>> gcc/ChangeLog:
>>>
>>>     PR middle-end/77608
>>>     * builtins.c (compute_objsize): Handle non-constant offsets.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     PR middle-end/77608
>>>     * gcc.dg/Wstringop-overflow.c: New test.
>> The recursive call into compute_objsize passing in the ostype avoids
>> having to think about the whole object vs nearest containing object
>> issues.  Right?
>>
>> What's left to worry about is maximum or minimum remaining bytes
>> in the
>> object.  At least that's my understanding of how ostype works here.
>>
>> So we get the amount remaining, ignoring the variable offset, from
>> the
>> recursive call (SIZE).  The space left after we account for the
>> variable
>> offset is [SIZE - MAX, SIZE - MIN].  So ISTM for type 0/1 you have to
>> return SIZE-MIN (which you do) and for type 2/3 you have to return
>> SIZE-MAX which I think you get wrong (and you have to account for the
>> possibility that MAX or MIN is greater than SIZE and thus there's
>> nothing left).
>
> Subtracting the upper bound of the offset from the size instead
> of the lower bound when the caller is asking for the minimum
> object size would make the result essentially meaningless in
> common cases where the offset is smaller than size_t, as in:
>
>   char a[7];
>
>   void f (const char *s, unsigned i)
>   {
>     __builtin_strcpy (a + i, s);
>   }
>
> Here, i's range is [0, UINT_MAX].
>
> IMO, it's only useful to use the lower bound here, otherwise
> the result would only rarely be non-zero.
 But when we're asking for the minimum left, aren't we essentially
 asking
 for "how much space am I absolutely sure I can write"?  And if that is
 the question, then the only conservatively correct answer is to
 subtract
 the high bound.
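
The arithmetic under discussion can be written down as a small sketch. It only illustrates the two bounds and is not what compute_objsize does; Range and remaining_bytes are hypothetical names.

```cpp
#include <cassert>

// Closed interval [lo, hi] of byte counts.
struct Range {
  unsigned long long lo;
  unsigned long long hi;
};

// Space left in an object of SIZE bytes after an offset known only to lie
// in OFF: [SIZE - OFF.hi, SIZE - OFF.lo], with each end clamped at zero.
Range remaining_bytes(unsigned long long size, Range off)
{
  assert(off.lo <= off.hi);
  Range left;
  left.lo = off.hi >= size ? 0 : size - off.hi;  // conservative minimum
  left.hi = off.lo >= size ? 0 : size - off.lo;  // optimistic maximum
  return left;
}
```

For char a[7] and an unsigned offset in [0, UINT_MAX] this gives [0, 7]: the conservative minimum is zero, which is the result described above as rarely useful for warnings, while the maximum is the full seven bytes.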
>>>
>>> I suppose you could look at it that way but IME with this work
>>> (now, and also last year when I submitted a patch actually
>>> changing the built-in), using the upper bound is just not that
>>> useful because it's too often way too big.  There's no way to
>>> distinguish an out-of-range upper bound that's the result of
>>> an inadequate attempt to constrain a value from an out-of-range
>>> upper bound that is sufficiently constrained but in a way GCC
>>> doesn't see.
>> Understood.
>>
>> So while it's reasonable to not warn in those cases where we just have
>> crap range information (that's always going to be the case for some code
>> regardless of how good my work or Andrew/Aldy's work is), we have to be
>> very careful and make sure that nobody acts on this information for
>> optimization purposes because what we're returning is not conservatively
>> correct.
>>
>>
>>>
>>> There are no clients of this API that would be affected by
>>> the decision one way or the other (unless the user specifies
>>> a -Wstringop-overflow= argument greater than the default 2)
>>> so I don't think what we do now matters much, if at all.
>> Right, but what's to stop someone without knowledge of the
>> implementation and its quirk of not returning the conservatively safe
>> result from using the results in other ways.
> 
> Presumably they would find out by testing their code.  But this
> is a hypothetical scenario.  I added the function for warnings.
> I wasn't expecting it to be used for optimization, no such uses
> have emerged, and I don't have the impression that anyone is
> contemplating adding them (certainly not in stage 3).  If you
> think the function could be useful for optimization then we
> should certainly consider changing it as we gain experience
> with it under those conditions.
Merely passing tests does no