[PATCH] Fix ICE in fixup_abnormal_edges (PR rtl-optimization/88018)

2018-11-14 Thread Jakub Jelinek
Hi!

On the following testcase, we have a call (not marked noreturn), which can
throw, followed immediately by __builtin_unreachable (), so it effectively
is noreturn in this particular call site (i.e. if it returns, it is UB).
In RTL this is represented by the corresponding bb having just EH edge
successor, no fallthrough edge.  If one of the callers of
fixup_abnormal_edges (the stack pass in this case, or LRA or reload) adds
some instructions after the call and calls fixup_abnormal_edges, it ICEs,
as it tries to delete those insns and insert them on the NULL edge.
Somebody has hit that issue already but solved it for USE insns only; this
patch does that for all the cases where we don't have the fallthru edge.
If there weren't any instructions after the call before one of these 3
passes, I'd say it must be UB to return from such a call and thus it should
be unimportant if those insns are executed or not.
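
For illustration only, here is a minimal sketch of the source-level shape described above. It is not the PR's testcase (that one is pulled in from pr69667.C), and the names may_throw, call_then_unreachable and exits_by_exception are hypothetical:

```cpp
#include <stdexcept>

// Hypothetical callee: not declared noreturn, but it always throws here.
void may_throw() { throw std::runtime_error("thrown"); }

// The call is immediately followed by __builtin_unreachable (), so a normal
// return from may_throw () would be undefined behaviour.  Only the EH path
// out of the call is well defined.
void call_then_unreachable()
{
  may_throw();
  __builtin_unreachable();
}

// Returns true if the exception propagated out of the call.
bool exits_by_exception()
{
  try { call_then_unreachable(); }
  catch (const std::runtime_error&) { return true; }
  return false;
}
```

In a function like call_then_unreachable, the block ending in the call has only an EH successor, which is the situation in which insns added after the call have no fallthru edge to be moved to.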

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-15  Jakub Jelinek  

PR rtl-optimization/88018
* cfgrtl.c (fixup_abnormal_edges): Guard moving insns to fallthru edge
on the presence of fallthru edge, rather than if it is a USE or not.

* g++.dg/tsan/pr88018.C: New test.

--- gcc/cfgrtl.c.jj 2018-11-14 00:55:51.695333876 +0100
+++ gcc/cfgrtl.c 2018-11-14 17:24:46.414915101 +0100
@@ -3332,8 +3332,15 @@ fixup_abnormal_edges (void)
 If it's placed after a trapping call (i.e. that
 call is the last insn anyway), we have no fallthru
 edge.  Simply delete this use and don't try to insert
-on the non-existent edge.  */
- if (GET_CODE (PATTERN (insn)) != USE)
+on the non-existent edge.
+Similarly, sometimes a call that can throw is
+followed in the source with __builtin_unreachable (),
+meaning that there is UB if the call returns rather
+than throws.  If there weren't any instructions
+following such calls before, supposedly even the ones
+we've deleted aren't significant and can be
+removed.  */
+ if (e)
{
  /* We're not deleting it, we're moving it.  */
  insn->set_undeleted ();
--- gcc/testsuite/g++.dg/tsan/pr88018.C.jj  2018-11-14 17:26:46.224944969 +0100
+++ gcc/testsuite/g++.dg/tsan/pr88018.C 2018-11-14 17:28:40.057073142 +0100
@@ -0,0 +1,6 @@
+// PR rtl-optimization/88018
+// { dg-do compile }
+// { dg-skip-if "" { *-*-* }  { "*" } { "-O0" } }
+// { dg-options "-fsanitize=thread -fno-ipa-pure-const -O1 -fno-inline-functions-called-once -w" }
+
+#include "../pr69667.C"

Jakub


[PATCH, csky] Update dynamic linker's name

2018-11-14 Thread 瞿仙淼
Hi, 
I have submitted a patch to update the dynamic linker's name.


Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 266171)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2018-11-15  Xianmiao Qu  
+
+   * config/csky/csky-linux-elf.h (LINUX_DYNAMIC_LINKER): Remove.
+   (GLIBC_DYNAMIC_LINKER): Define.
+   (LINUX_TARGET_LINK_SPEC): Update the dynamic linker's name.
+
 2018-11-15  Bin Cheng  
 
PR tree-optimization/84648
Index: gcc/config/csky/csky-linux-elf.h
===
--- gcc/config/csky/csky-linux-elf.h   (revision 266171)
+++ gcc/config/csky/csky-linux-elf.h   (working copy)
@@ -61,7 +61,7 @@
   %{mvdsp:-mvdsp}  \
   "
 
-#define LINUX_DYNAMIC_LINKER  "/lib/ld.so.1"
+#define GLIBC_DYNAMIC_LINKER "/lib/ld-linux-cskyv2%{mhard-float:-hf}%{mbig-endian:-be}.so.1"
 
 #define LINUX_TARGET_LINK_SPEC "%{h*} %{version:-v}\
%{b}\
@@ -70,7 +70,7 @@
%{symbolic:-Bsymbolic}  \
%{!static:  \
  %{rdynamic:-export-dynamic}   \
- %{!shared:-dynamic-linker " LINUX_DYNAMIC_LINKER "}}  \
+ %{!shared:-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}}   \
-X  \
%{mbig-endian:-EB} %{mlittle-endian:-EL}\
%{EB:-EB} %{EL:-EL}"
Index: libgcc/ChangeLog
===
--- libgcc/ChangeLog   (revision 266171)
+++ libgcc/ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2018-11-15  Xianmiao Qu  
+
+   * config/csky/linux-unwind.h: Fix coding style.
+
 2018-11-13  Xianmiao Qu  
 
* config/csky/linux-unwind.h (_sig_ucontext_t): Remove.
Index: libgcc/config/csky/linux-unwind.h
===
--- libgcc/config/csky/linux-unwind.h   (revision 266171)
+++ libgcc/config/csky/linux-unwind.h   (working copy)
@@ -25,10 +25,8 @@
 
 #ifndef inhibit_libc
 
-/*
- * Do code reading to identify a signal frame, and set the frame state data
- * appropriately.  See unwind-dw2.c for the structs.
- */
+/* Do code reading to identify a signal frame, and set the frame state data
+   appropriately.  See unwind-dw2.c for the structs.  */
 
 #include 
 #include 




Re: [wwwdocs] [committed] Add ARC news

2018-11-14 Thread Gerald Pfeifer
On Wed, 14 Nov 2018, claz...@gmail.com wrote:
> I've just committed the attached patch containing the news for the ARC
> backend.

Nice!  (Both in terms of improvements to the ARC target, and this
release notes update.)

Gerald


Re: [PATCH] MIPS: Add `-mfix-r5900' option for the R5900 short loop erratum

2018-11-14 Thread Hans-Peter Nilsson
On Tue, 13 Nov 2018, Maciej W. Rozycki wrote:

> On Sun, 11 Nov 2018, Fredrik Noring wrote:
>
> > ../../../../libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc:71:1:
> >  note: in expansion of macro ‘COMPILER_CHECK’
> >71 | COMPILER_CHECK(struct_kernel_stat_sz == sizeof(struct stat));
> >   | ^~
>
>  I guess `struct_kernel_stat_sz' and `sizeof(struct stat)' do not match.
> You may try making a preprocessed source with the same GCC invocation
> (possibly with `-dD' added if needed) to see how these items have been
> defined in your build environment.  This may reveal something obvious.
>
>  Also unless you realise the problem is due to misconfiguration, please
> file it in GCC bugzilla as a GCC 9 regression (since as you say 8.2.0
> builds just fine in your environment).  We don't want things to break with
> new releases.

This sounds familiar.

Perhaps the local edits I made for sanitizer support for MIPS
have been overwritten by the upstream import?  I know I made a
boo-boo and didn't "upstream" that.

brgds, H-P


[doc, committed] __attribute__((aligned)) linker restrictions

2018-11-14 Thread Sandra Loosemore
I've checked in this patch for PR 56334, following the recommendation in 
comment 1 in that issue to


* distinguish between stack-allocated and statically-allocated variables
* mention object file format restrictions and not just blame it on the linker.
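
As a rough illustration of that distinction (a sketch only, using GCC attribute syntax; the 64-byte value and the function names are arbitrary choices and not part of the patch):

```cpp
#include <cstdint>

// Returns true if a stack variable requested with 64-byte alignment
// really starts on a 64-byte boundary.
bool stack_buffer_is_aligned()
{
  // GCC attribute syntax; 64 is an arbitrary illustrative value.
  char buf[16] __attribute__((aligned(64)));
  buf[0] = 0;  // touch the buffer so it is actually used
  return reinterpret_cast<std::uintptr_t>(&buf[0]) % 64 == 0;
}

// The same request on a variable with static storage duration.  Whether it
// is honoured is what may depend on the linker and object file format.
static char static_buf[16] __attribute__((aligned(64)));

bool static_buffer_is_aligned()
{
  return reinterpret_cast<std::uintptr_t>(&static_buf[0]) % 64 == 0;
}
```

The stack case is handled by the compiler alone, which is why the new documentation text says it is not subject to linker restrictions; the static case is the one where the achievable alignment can be capped.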


-Sandra
2018-11-15  Sandra Loosemore  

	PR other/56334

	gcc/
	* doc/extend.texi (Common Function Attributes): Clarify linker
	restrictions on "aligned" attribute.
	(Common Variable Attributes): Likewise.  Mention that linker
	restrictions don't apply to stack-allocated variables.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 266169)
+++ gcc/doc/extend.texi	(working copy)
@@ -2396,7 +2396,8 @@ alignment this overrides the effect of t
 function.
 
 Note that the effectiveness of @code{aligned} attributes may be
-limited by inherent limitations in your linker.  On many systems, the
+limited by inherent limitations in the system linker 
+and/or object file format.  On some systems, the
 linker is only able to arrange for functions to be aligned up to a
 certain maximum alignment.  (For some linkers, the maximum supported
 alignment may be very very small.)  See your linker documentation for
@@ -6132,8 +6133,9 @@ attribute must be specified as well.  Wh
 @code{aligned} attribute can both increase and decrease alignment, and
 specifying the @code{packed} attribute generates a warning.
 
-Note that the effectiveness of @code{aligned} attributes may be limited
-by inherent limitations in your linker.  On many systems, the linker is
+Note that the effectiveness of @code{aligned} attributes for static
+variables may be limited by inherent limitations in the system linker
+and/or object file format.  On some systems, the linker is
 only able to arrange for variables to be aligned up to a certain maximum
 alignment.  (For some linkers, the maximum supported alignment may
 be very very small.)  If your linker is only able to align variables
@@ -6141,6 +6143,9 @@ up to a maximum of 8-byte alignment, the
 in an @code{__attribute__} still only provides you with 8-byte
 alignment.  See your linker documentation for further information.
 
+Stack variables are not affected by linker restrictions; GCC can properly
+align them on any target.
+
 The @code{aligned} attribute can also be used for functions
 (@pxref{Common Function Attributes}.)
 


Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT

2018-11-14 Thread Andrew Pinski
On Fri, May 13, 2016 at 3:51 AM Richard Biener  wrote:
>
>
> The following patch adds BIT_FIELD_INSERT, an operation to
> facilitate doing bitfield inserts on registers (as opposed
> to currently where we'd have a BIT_FIELD_REF store).
>
> Originally this was developed as part of bitfield lowering
> where bitfield stores were lowered into read-modify-write
> cycles and the modify part, instead of doing shifting and masking,
> be kept in a more high-level form to ease combining them.
>
> A second use case (the above is still valid) is vector element
> inserts which we currently can only do via memory or
> by extracting all components and re-building the vector using
> a CONSTRUCTOR.  For this second use case I added code
> re-writing the BIT_FIELD_REF stores the C family FEs produce
> into BIT_FIELD_INSERT when update-address-taken can otherwise
> re-write a decl into SSA form (the testcase shows we miss
> a similar opportunity with the MEM_REF form of a vector insert,
> I plan to fix that for the final submission).
>
> One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> is that the size of the insertion is given implicitly via the
> type size/precision of the value to insert.  That avoids
> introducing ways to have quaternary ops in folding and GIMPLE stmts.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> Richard.
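
For readers who want the intended semantics in concrete terms, here is a plain integer sketch of a bit-field insert. It is only an analogy under stated assumptions: the function name bit_field_insert is hypothetical, the field size is passed explicitly (in the patch it is implied by the type of the inserted value), bits are counted from the least significant end, and endianness is ignored.

```cpp
#include <cstdint>

// Insert the low `bitsize` bits of `value` into `word` at bit position
// `bitpos` (counting from the least significant bit).  Requires
// 0 < bitsize and bitpos + bitsize <= 32.
constexpr std::uint32_t
bit_field_insert(std::uint32_t word, std::uint32_t value,
                 unsigned bitpos, unsigned bitsize)
{
  // Mask with `bitsize` ones; special-case 32 to avoid shifting by the width.
  const std::uint32_t ones
    = bitsize >= 32 ? ~std::uint32_t(0) : (std::uint32_t(1) << bitsize) - 1;
  // Clear the destination field, then OR in the zero-extended new value.
  return (word & ~(ones << bitpos)) | ((value & ones) << bitpos);
}

static_assert(bit_field_insert(0xFFFFFFFFu, 0x0u, 8, 8) == 0xFFFF00FFu, "");
static_assert(bit_field_insert(0x00000000u, 0xABu, 4, 8) == 0x00000AB0u, "");
```

This is the same mask-and-shift shape that the constant folding in fold_ternary_loc below performs on wide_int values.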
>
> 2011-06-16  Richard Guenther  
>
> PR tree-optimization/29756
> * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> * fold-const.c (operand_equal_p): Likewise.
> (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> * tree-inline.c (estimate_operator_cost): Likewise.
> * tree-pretty-print.c (dump_generic_node): Likewise.
> * tree-ssa-operands.c (get_expr_operands): Likewise.
> * cfgexpand.c (expand_debug_expr): Likewise.
> * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
>
> * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> vector inserts using BIT_FIELD_REF on the lhs.
> (execute_update_addresses_taken): Do it.
>
> * gcc.dg/tree-ssa/vector-6.c: New testcase.
>
> Index: trunk/gcc/expr.c
> ===
> *** trunk.orig/gcc/expr.c   2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/expr.c   2016-05-12 15:40:32.481225744 +0200
> *** expand_expr_real_2 (sepops ops, rtx targ
> *** 9358,9363 
> --- 9358,9380 
> target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> return target;
>
> + case BIT_FIELD_INSERT:
> +   {
> +   unsigned bitpos = tree_to_uhwi (treeop2);
> +   unsigned bitsize;
> +   if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> + bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> +   else
> + bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> +   rtx op0 = expand_normal (treeop0);
> +   rtx op1 = expand_normal (treeop1);
> +   rtx dst = gen_reg_rtx (mode);
> +   emit_move_insn (dst, op0);
> +   store_bit_field (dst, bitsize, bitpos, 0, 0,
> +TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> +   return dst;
> +   }
> +
>   default:
> gcc_unreachable ();
>   }
> Index: trunk/gcc/fold-const.c
> ===
> *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/fold-const.c  2016-05-13 09:41:13.509812127 +0200
> *** operand_equal_p (const_tree arg0, const_
> *** 3163,3168 
> --- 3163,3169 
>
> case VEC_COND_EXPR:
> case DOT_PROD_EXPR:
> +   case BIT_FIELD_INSERT:
>   return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
>
> default:
> *** fold_ternary_loc (location_t loc, enum t
> *** 11870,11875 
> --- 11871,11916 
> }
> return NULL_TREE;
>
> + case BIT_FIELD_INSERT:
> +   /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> +   if (TREE_CODE (arg0) == INTEGER_CST
> + && TREE_CODE (arg1) == INTEGER_CST)
> +   {
> + unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> + unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> + wide_int tem = wi::bit_and (arg0,
> + wi::shifted_mask (bitpos, bitsize, true,
> +   TYPE_PRECISION (type)));
> + wide_int tem2
> +   = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> +   bitsize), bitpos);
> + 

[doc, committed] fix documentation about interaction between -flto and -O

2018-11-14 Thread Sandra Loosemore
I've checked in this patch to address the complaints about a bad example 
in the discussion of -flto reported in PR55102 and its duplicate 
PR56700.  The bad example implied that you can compile files at -O0 and 
then link them with -O3 to get full optimization, which is not correct. 
I replaced that with some language explaining why you need to compile 
with optimization to get the full effect of LTO.


I also did a bit of copy-editing and rearrangement of material in the 
-flto discussion so that it flows a little better and isn't quite as wordy.


-Sandra

2018-11-14  Sandra Loosemore  

	PR lto/55102
	PR lto/56700

	gcc/
	* doc/invoke.texi (Optimize Options): Remove bad example about
	interaction between -flto and -O.  Replace it with a note that
	you need to compile with -O and not just link.  Copy-edit -flto
	discussion to reduce verbiage and improve flow.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 266162)
+++ gcc/doc/invoke.texi	(working copy)
@@ -9822,15 +9822,11 @@ The above generates bytecode for @file{f
 merges them together into a single GIMPLE representation and optimizes
 them as usual to produce @file{myprog}.
 
-The only important thing to keep in mind is that to enable link-time
+The important thing to keep in mind is that to enable link-time
 optimizations you need to use the GCC driver to perform the link step.
-GCC then automatically performs link-time optimization if any of the
+GCC automatically performs link-time optimization if any of the
 objects involved were compiled with the @option{-flto} command-line option.  
-You generally
-should specify the optimization options to be used for link-time
-optimization though GCC tries to be clever at guessing an
-optimization level to use from the options used at compile time
-if you fail to specify one at link time.  You can always override
+You can always override
 the automatic decision to do link-time optimization
 by passing @option{-fno-lto} to the link command.
 
@@ -9844,8 +9840,8 @@ the linker plugin is not available, @opt
 used to allow the compiler to make these assumptions, which leads
 to more aggressive optimization decisions.
 
-When @option{-fuse-linker-plugin} is not enabled, when a file is
-compiled with @option{-flto}, the generated object file is larger than
+When a file is compiled with @option{-flto} without
+@option{-fuse-linker-plugin}, the generated object file is larger than
 a regular object file because it contains GIMPLE bytecodes and the usual
 final code (see @option{-ffat-lto-objects}.  This means that
 object files with LTO information can be linked as normal object
@@ -9854,20 +9850,6 @@ interprocedural optimizations are applie
 @option{-fno-fat-lto-objects} is enabled the compile stage is faster
 but you cannot perform a regular, non-LTO link on them.
 
-Additionally, the optimization flags used to compile individual files
-are not necessarily related to those used at link time.  For instance,
-
-@smallexample
-gcc -c -O0 -ffat-lto-objects -flto foo.c
-gcc -c -O0 -ffat-lto-objects -flto bar.c
-gcc -o myprog -O3 foo.o bar.o
-@end smallexample
-
-This produces individual object files with unoptimized assembler
-code, but the resulting binary @file{myprog} is optimized at
-@option{-O3}.  If, instead, the final binary is generated with
-@option{-fno-lto}, then @file{myprog} is not optimized.
-
 When producing the final binary, GCC only
 applies link-time optimizations to those files that contain bytecode.
 Therefore, you can mix and match object files and libraries with
@@ -9875,15 +9857,22 @@ GIMPLE bytecodes and final object code.
 which files to optimize in LTO mode and which files to link without
 further processing.
 
-There are some code generation flags preserved by GCC when
-generating bytecodes, as they need to be used during the final link
-stage.  Generally options specified at link time override those
-specified at compile time.
+Generally, options specified at link time override those
+specified at compile time, although in some cases GCC attempts to infer
+link-time options from the settings used to compile the input files.
 
 If you do not specify an optimization level option @option{-O} at
 link time, then GCC uses the highest optimization level 
-used when compiling the object files.
+used when compiling the object files.  Note that it is generally 
+ineffective to specify an optimization level option only at link time and 
+not at compile time, for two reasons.  First, compiling without 
+optimization suppresses compiler passes that gather information 
+needed for effective optimization at link time.  Second, some early
+optimization passes can be performed only at compile time and 
+not at link time.
 
+There are some code generation flags preserved by GCC when
+generating bytecodes, as they need to be used during the final link.
 Currently, the following options and their settings are taken from
 the first 

Re: [PATCH] Fix incorrect assertion when deallocating big block

2018-11-14 Thread Jonathan Wakely

On 14/11/18 20:26 +, Jonathan Wakely wrote:

On 14/11/18 10:31 +0100, Christophe Lyon wrote:

On Tue, 13 Nov 2018 at 23:58, Jonathan Wakely  wrote:


Since a big_block rounds up the size to a multiple of big_block::min it
is wrong to assert that the supplied number of bytes equals the
big_block's size(). Add big_block::alloc_size(size_t) to calculate the
allocated size consistently, and add comments to the code.

   * src/c++17/memory_resource.cc (big_block): Improve comments.
   (big_block::all_ones): Remove.
   (big_block::big_block(size_t, size_t)): Use alloc_size.
   (big_block::size()): Add comment, replace all_ones with equivalent
   expression.
   (big_block::align()): Shift value of correct type.
   (big_block::alloc_size(size_t)): New function to round up size.
   (__pool_resource::allocate(size_t, size_t)): Add comment.
   (__pool_resource::deallocate(void*, size_t, size_t)): Likewise. Fix
   incorrect assertion by using big_block::alloc_size(size_t).
   * testsuite/20_util/unsynchronized_pool_resource/allocate.cc: Add
   more tests for unpooled allocations.
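
To make the rounding concrete, here is a small sketch of the idea. It is not the libstdc++ code: block_min stands in for big_block::min, its value of 64 is an assumption for illustration, it is assumed to be a power of two, and overflow near SIZE_MAX is ignored.

```cpp
#include <cstddef>

// Hypothetical granularity; assumed to be a power of two.
constexpr std::size_t block_min = 64;

// Round `bytes` up to the next multiple of block_min.
constexpr std::size_t alloc_size(std::size_t bytes)
{
  return (bytes + block_min - 1) & ~(block_min - 1);
}

static_assert(alloc_size(1) == 64, "");
static_assert(alloc_size(64) == 64, "");
static_assert(alloc_size(65) == 128, "");
```

With such rounding, a block allocated for 65 bytes records a size of 128, so a deallocation check has to compare the stored size against alloc_size(65) rather than against 65 itself.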



Hi Jonathan,

I've noticed that the updated test fails on arm*:
FAIL: 20_util/unsynchronized_pool_resource/allocate.cc execution test

the log says:
/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc:232:
void test06(): Assertion 'false' failed.

The same happens on aarch64-elf with -mabi=ilp32


Should be fixed by this patch, committed to trunk.


I forgot to actually commit this. *Now* it's committed to trunk
(r266163).




[PATCH] Optimize pool resource allocation

2018-11-14 Thread Jonathan Wakely

A recent change caused a performance regression. This restores the
previous performance and adds a performance test.

* scripts/check_performance: Allow tests to choose a -std flag.
* src/c++17/memory_resource.cc (bitset::get_first_unset()): Use local
variables of the right types. Call update_next_word() unconditionally.
* testsuite/20_util/unsynchronized_pool_resource/cons.cc: New test.
* testsuite/performance/20_util/memory_resource/pools.cc: New test.
* testsuite/util/testsuite_performance.h (time_counter): Allow
timer to be restarted.

Tested x86_64-linux, committed to trunk.
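
For context, a simplified stand-alone sketch of the "find and set the first clear bit" operation that the patch touches. This is not the libstdc++ implementation: block_bitset is a hypothetical type, the trailing-ones count is done with a portable loop instead of std::__countr_one, and clearing bits on deallocation is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for the bitset in memory_resource.cc: each set bit
// marks a block that is in use.
struct block_bitset
{
  using word = std::uint64_t;
  static constexpr std::size_t bits_per_word = 64;

  std::vector<word> words;
  std::size_t next_word = 0;  // first word that may still have a clear bit

  explicit block_bitset(std::size_t nwords) : words(nwords, 0) { }

  // Count trailing one bits (portable replacement for std::__countr_one).
  static std::size_t countr_one(word w)
  {
    std::size_t n = 0;
    while (n < bits_per_word && ((w >> n) & 1))
      ++n;
    return n;
  }

  // Skip over words that are completely full.
  void update_next_word()
  {
    while (next_word < words.size() && ~words[next_word] == 0)
      ++next_word;
  }

  // Set the first clear bit and return its index, or size_t(-1) if full.
  std::size_t get_first_unset()
  {
    const std::size_t wd = next_word;
    if (wd < words.size())
      {
        const std::size_t n = countr_one(words[wd]);
        if (n < bits_per_word)
          {
            words[wd] |= word(1) << n;
            update_next_word();  // called unconditionally, as in the patch
            return wd * bits_per_word + n;
          }
      }
    return std::size_t(-1);
  }
};
```

Reading next_word into a local once and returning from that local mirrors the change in the diff below, where _M_next_word is copied into wd before use.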


commit da6a822ba7b4aec1fa2fe9994e75539eec55f80e
Author: Jonathan Wakely 
Date:   Thu Nov 15 00:01:53 2018 +

Optimize pool resource allocation

A recent change caused a performance regression. This restores the
previous performance and adds a performance test.

* scripts/check_performance: Allow tests to choose a -std flag.
* src/c++17/memory_resource.cc (bitset::get_first_unset()): Use local
variables of the right types. Call update_next_word() unconditionally.
* testsuite/20_util/unsynchronized_pool_resource/cons.cc: New test.
* testsuite/performance/20_util/memory_resource/pools.cc: New test.
* testsuite/util/testsuite_performance.h (time_counter): Allow
timer to be restarted.

diff --git a/libstdc++-v3/scripts/check_performance b/libstdc++-v3/scripts/check_performance
index d196355bd44..3fa927480c9 100755
--- a/libstdc++-v3/scripts/check_performance
+++ b/libstdc++-v3/scripts/check_performance
@@ -44,6 +44,8 @@ do
   TESTNAME=$SRC_DIR/testsuite/$NAME
   FILE_NAME="`basename $NAME`"
   FILE_NAME="`echo $FILE_NAME | sed 's/.cc//g'`"
+  ORIG_CXX="$CXX"
+  CXX="$CXX `sed -n 's/.* STD=/-std=/p' $TESTNAME`"
 
   # TEST_S == single thread
   # TEST_B == do both single and multi-thread
@@ -90,6 +92,7 @@ do
mv tmp.$FILE_NAME $FILE_NAME.xml
 fi
   fi
+  CXX="$ORIG_CXX"
 done
 
 exit 0
diff --git a/libstdc++-v3/src/c++17/memory_resource.cc b/libstdc++-v3/src/c++17/memory_resource.cc
index 605bdd53950..79c1665146d 100644
--- a/libstdc++-v3/src/c++17/memory_resource.cc
+++ b/libstdc++-v3/src/c++17/memory_resource.cc
@@ -335,17 +335,16 @@ namespace pmr
 
 size_type get_first_unset() noexcept
 {
-  if (_M_next_word < nwords())
+  const size_type wd = _M_next_word;
+  if (wd < nwords())
{
- const size_type n = std::__countr_one(_M_words[_M_next_word]);
+ const size_type n = std::__countr_one(_M_words[wd]);
  if (n < bits_per_word)
{
  const word bit = word(1) << n;
- _M_words[_M_next_word] |= bit;
- const size_t res = (_M_next_word * bits_per_word) + n;
- if (n == (bits_per_word - 1))
-   update_next_word();
- return res;
+ _M_words[wd] |= bit;
+ update_next_word();
+ return (wd * bits_per_word) + n;
}
}
   return size_type(-1);
diff --git a/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/cons.cc b/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/cons.cc
new file mode 100644
index 000..14519ba1d00
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/cons.cc
@@ -0,0 +1,80 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++17" }
+// { dg-do run { target c++17 } }
+
+#include 
+#include 
+#include 
+
+void
+test01()
+{
+  __gnu_test::memory_resource test_mr1, test_mr2;
+  __gnu_test::default_resource_mgr mgr(&test_mr1);
+
+  const std::pmr::pool_options opts{1, 2};
+  using std::pmr::unsynchronized_pool_resource;
+
+  unsynchronized_pool_resource p1 = {opts, &test_mr2};
+  VERIFY( p1.upstream_resource() == &test_mr2 );
+  unsynchronized_pool_resource p2;
+  VERIFY( p2.upstream_resource() == std::pmr::get_default_resource() );
+  unsynchronized_pool_resource p3{&test_mr2};
+  VERIFY( p3.upstream_resource() == &test_mr2 );
+  unsynchronized_pool_resource p4{opts};
+  VERIFY( p4.upstream_resource() == std::pmr::get_default_resource() );
+
+  static_assert(!std::is_copy_constru

Re: [PATCH 21/25] GCN Back-end (part 2/2).

2018-11-14 Thread Jeff Law
On 11/12/18 5:53 AM, Andrew Stubbs wrote:
> On 09/11/2018 19:39, Jeff Law wrote:
>>> +
>>> +/* Generate epilogue.  Called from gen_epilogue during
>>> pro_and_epilogue pass.
>>> +
>>> +   See gcn_expand_prologue for stack details.  */
>>> +
>>> +void
>>> +gcn_expand_epilogue (void)
>> You probably need a barrier in here to ensure that the scheduler doesn't
>> move an aliased memory reference into the local stack beyond the stack
>> adjustment.
>>
>> You're less likely to run into it because you eliminate frame pointers
>> fairly aggressively, but it's still the right thing to do.
> 
> Sorry, I'm not sure I understand what the problem is? How can this
> happen? Surely the scheduler wouldn't change the logic of the code?
There's a particular case that has historically been problematical.

If you have this kind of sequence in the epilogue

restore register using FP
move fp->sp  (deallocates frame)
return

Under certain circumstances the scheduler can swap the register restore
and move from fp into sp creating something like this:

move fp->sp (deallocates frame)
restore register using FP (reads from deallocated frame)
return

That would normally be OK, except if you take an interrupt between the
first two instructions.  If interrupt handling is done without switching
stacks, then the interrupt handler may write into the just de-allocated
frame destroying the values that were saved in the prologue.

You may not need to worry about that today on the GCN port, but you
really want to fix it now so that it's never a problem.  You *really*
don't want to have to debug this kind of problem in the wild.  Been
there, done that, more than once :(


>> This seems a bit hokey.  Why specifically is combine removing the USE?
> 
> I don't understand combine fully enough to explain it now, although at
> the time I wrote this, and in a GCC 7 code base, I had followed the code
> through and observed what it was doing.
> 
> Basically, if you have two patterns that do the same operation, but one
> has a "parallel" with an additional "use", then combine will tend to
> prefer the one without the "use". That doesn't stop the code working,
> but it makes a premature (accidental) decision about instruction
> selection that we'd prefer to leave to the register allocator.
> 
> I don't recall if it did this to lone instructions, but it would
> certainly do so when combining two (or more) instructions, and IIRC
> there are typically plenty of simple moves around that can be easily
> combined.
I would hazard a guess that combine saw the one without the use as
"simpler" and preferred it.  I think you've made a bit of a fundamental
problem with the way the EXEC register is being handled.  Hopefully you
can get by with some magic UNSPEC wrappers without having to do too much
surgery.

> 
>>> +  /* "Manually Inserted Wait States (NOPs)."
>>> +
>>> + GCN hardware detects most kinds of register dependencies, but
>>> there
>>> + are some exceptions documented in the ISA manual.  This pass
>>> + detects the missed cases, and inserts the documented number of
>>> NOPs
>>> + required for correct execution.  */
>> How unpleasant :(  But if it's what you need to do, so be it.  I'll
>> assume the compiler is the right place to do this -- though some ports
>> handle this kind of stuff in the assembler or linker.
> 
> We're using an LLVM assembler and linker, so we have tried to use them
> as is, rather than making parallel changes that would prevent GCC
> working with the last numbered release of LLVM (see the work around for
> assembler bugs in the BImode mode instruction).
> 
> Expecting the assembler to fix this up would also throw off the
> compiler's offset calculations, and the near/far branch instructions
> have different register requirements it's better for the compiler to
> know about.
> 
> The MIPS backend also inserts NOPs in a similar way.
MIPS isn't that simple.  If you're not in a .reorder block and you don't
have interlocks, then it leaves it to the assembler...

If you have near/far branch calculations, then those have to be aware of
the nop insertions and you're generally better off doing them both in
the same tool.  You've chosen the compiler.  It's a valid choice, but
does have some downsides.  The assembler is a valid choice too, with a
different set of downsides.

> 
> In future, I'd like to have the scheduler insert real instructions into
> these slots, but that's very much on the to-do list.
If you can model this as a latency between the two points where you
need to insert the nops, then the scheduler will fill in what it can.
But it doesn't generally handle non-interlocked processors.   So you'll
still want your little pass to fix things up when the scheduler couldn't
find useful work to schedule into those bubbles.

> 
> Oh, OK. :-(
> 
> I have no idea whether the architecture has those issues or not.
The guideline I would give to determine if you're vulnerable.

Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-14 Thread Jozef Lawrynowicz
On Wed, 14 Nov 2018 11:30:39 -0500
Paul Koning  wrote:

> > On Nov 14, 2018, at 10:44 AM, Jozef Lawrynowicz 
> > wrote:
> > 
> > Patch 1 tweaks dg directives in tests specifically for msp430. Many of
> > these are extensions to existing target selectors in dg directives.
> > 
> > <0001-TESTSUITE-MSP430-Tweak-dg-directives-for-msp430-elf.patch>  
> 
> For pr41779.c, you have
> 
> +/* { dg-skip-if "int is smaller than float" { msp430-*-* } } */
> 
> I take it that means: sizeof(int) < sizeof(float).  That property also holds
> for pdp11 and perhaps other targets.  Would it make sense to introduce a new
> effective-target flag for that check instead?
> 
>   paul
> 

Paul,

Yes, you are correct; the comment implies sizeof(int) < sizeof(float).

I believe this was the only test where this property affected the test
results, so a new effective-target flag is probably only worth adding if it
affects at least a couple of tests.
On the other hand, I suppose there is no harm in adding another
check-effective-target and it at least means we'll catch failures across more
targets.

I'd be curious if the line I added the xfail to in c-c++-common/pr57371-2.c
also fails for pdp11.

The conversion to float might be getting optimized out whenever
sizeof(int) < sizeof(float).
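
One way to reason about that guess (an assumption on my part, not something checked against the msp430 or pdp11 back ends): if an integer type has no more value bits than float has mantissa digits, every value converts to float exactly, which gives the compiler room to simplify away the conversion. A sketch of that check, with the hypothetical name converts_exactly_to_float:

```cpp
#include <cstdint>
#include <limits>

// True if every value of integer type I converts to float without rounding,
// i.e. I has no more value bits than float has mantissa digits.  Assumes a
// binary floating-point format.
template<typename I>
constexpr bool converts_exactly_to_float()
{
  return std::numeric_limits<float>::radix == 2
    && std::numeric_limits<I>::digits <= std::numeric_limits<float>::digits;
}

static_assert(converts_exactly_to_float<std::int16_t>(), "16-bit int fits");
```

On a host with IEEE single-precision float the same check is false for a 32-bit integer type, which would be consistent with the test behaving differently on targets where int is 16 bits wide.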

Thanks,
Jozef


Re: [PATCH][RFC] Come up with -flive-patching master option.

2018-11-14 Thread Miroslav Benes
On Tue, 13 Nov 2018, Qing Zhao wrote:

> Hi,
> 
> > On Nov 13, 2018, at 1:18 PM, Miroslav Benes  wrote:
> > 
> >> Attached is the patch for new -flive-patching=[inline-only-static | 
> >> inline-clone] master option.
> >> 
> >> '-flive-patching=LEVEL'
> >> Control GCC's optimizations to provide a safe compilation for
> >> live-patching.  Provides multiple-level control on how many of the
> >> optimizations are enabled by users' request.  The LEVEL argument
> >> should be one of the following:
> >> 
> >> 'inline-only-static'
> >> 
> >>  Only enable inlining of static functions, disable all other
> >>  ipa optimizations/analyses.  As a result, when patching a
> >>  static routine, all its callers need to be patched as well.
> >> 
> >> 'inline-clone'
> >> 
> >>  Only enable inlining and all optimizations that internally
> >>  create clone, for example, cloning, ipa-sra, partial inlining,
> >>  etc.; disable all other ipa optimizations/analyses.  As a
> >>  result, when patching a routine, all its callers and its
> >>  clones' callers need to be patched as well.
> > 
> > Based on our previous discussion I assume that "clone" optimizations are 
> > safe (for LP) and the others are not. Anyway I'd welcome a note mentioning 
> > that disabled optimizations are dangerous for LP.
> 
> Actually, I don't think that those disabled optimizations are “dangerous” for
> live-patching. One of the major reasons we disable them is that the compiler
> currently does NOT provide a good way to compute the impacted function list
> for those optimizations. Therefore, we disable them at this time.
> 
> Many of them could be enabled too if the compiler can report the impacted
> function list accurately in the future.

Yes, you can formulate it like that. On the other hand, I (we) have always 
tried to keep a set of patched functions as small as possible. So some 
cost-benefit analysis would have to be done. However, that's another 
problem and we can discuss it later.

> > 
> > I know it may be the same for you, but it is not for me as a GCC user. 
> > "internally create clone" sounds very... well, internal. It does not 
> > describe the option much for an ordinary user who has no knowledge about GCC 
> > internals.
> > 
> > So could you rephrase it a bit, please?
> 
> I tried to make this clear. please see the following:
> 
> '-flive-patching=LEVEL'
>  Control GCC's optimizations to provide a safe compilation for
>  live-patching.
> 
>  If the compiler's optimization uses a function's body or
>  information extracted from its body to optimize/change another
>  function, the latter is called an impacted function of the former.
>  If a function is patched, its impacted functions should be patched
>  too.
> 
>  The impacted functions are decided by the compiler's
>  interprocedural optimizations.  For example, inlining a function
>  into its caller, cloning a function and changing its caller to call
>  this new clone, or extracting a function's pureness/constness
>  information to optimize its direct or indirect callers, etc.
> 
>  Usually, the more ipa optimizations enabled, the larger the number
>  of impacted functions for each function.  In order to control the
>  number of impacted functions and compute the list of impacted
>  functions easily, we provide control to partially enable ipa
>  optimizations on two different levels.
> 
>  The LEVEL argument should be one of the following:
> 
>  'inline-only-static'
> 
>   Only enable inlining of static functions, disable all other
>   interprocedural optimizations/analyses.  As a result, when
>   patching a static routine, all its callers need to be patched
>   as well.
> 
>  'inline-clone'
> 
>   Only enable inlining and cloning optimizations, which includes
>   inlining, cloning, interprocedural scalar replacement of
>   aggregates and partial inlining.  Disable all other
>   interprocedural optimizations/analyses.  As a result, when
>   patching a routine, all its callers and its clones' callers
>   need to be patched as well.
> 
>  When -flive-patching is specified without any value, the default value
>  is "inline-clone".
> 
>  This flag is disabled by default.

Sounds better. Thanks.

Miroslav
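
The term "impacted function" in the proposed text can be illustrated with a small sketch. All names below are hypothetical and only serve the illustration; whether the helper is really inlined is up to the compiler.

```cpp
#include <cassert>

// Hypothetical names throughout; this only illustrates the term
// "impacted function" from the proposed documentation.

// A static helper.  Under -flive-patching=inline-only-static it may still
// be inlined into its callers.
static int clamp_to_limit(int value)
{
  const int limit = 100;  // suppose a live patch changes this limit
  return value > limit ? limit : value;
}

// If clamp_to_limit is inlined here, the machine code of scale_and_clamp
// contains a copy of its body.  scale_and_clamp is then an impacted
// function of clamp_to_limit and has to be patched together with it.
int scale_and_clamp(int value)
{
  return clamp_to_limit(value * 2);
}
```

With only inlining of static functions enabled, the list of impacted functions for a patched static routine is just its callers, which is what makes the list easy to compute.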

Re: [PATCH 5/6] ifcvt: Only created temporaries as needed.

2018-11-14 Thread Jeff Law
On 11/14/18 6:07 AM, Robin Dapp wrote:
> noce_convert_multiple_sets creates temporaries for the destination of
> every emitted cmov and expects subsequent passes to get rid of them.  This
> does not happen every time and even if the temporaries are removed, code
> generation can be affected adversely.  In this patch, temporaries are
> only created if the destination of a set is used in an emitted condition
> check.
> 
> --
> 
> gcc/ChangeLog:
> 
> 2018-11-14  Robin Dapp  
> 
>   * ifcvt.c (check_need_temps): New function.
>   (noce_convert_multiple_sets): Only created temporaries if needed.
This looks pretty reasonable.  ISTM it ought to be able to go forward if
it's tested independently.

jeff


Re: [PATCH 2/6] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2018-11-14 Thread Jeff Law
On 11/14/18 6:07 AM, Robin Dapp wrote:
> This patch checks whether the current target supports conditional moves
> with immediate then/else operands and allows noce_convert_multiple_sets
> to deal with constants subsequently.
> Also, minor refactoring is performed.
> 
> --
> 
> gcc/ChangeLog:
> 
> 2018-11-14  Robin Dapp  
> 
>   * ifcvt.c (have_const_cmov): New function.
>   (noce_convert_multiple_sets): Allow constants if supported.
>   (bb_ok_for_noce_convert_multiple_sets): Likewise.
>   (check_cond_move_block): Refactor.
> ---
>  gcc/ifcvt.c | 46 --
>  1 file changed, 36 insertions(+), 10 deletions(-)
> 
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index ddf077fa051..660bb46eb1c 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -3077,6 +3077,27 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx 
> cond,
>return false;
>  }
>  
> +/* Check if we have a movcc pattern that accepts constants as then/else
> +   operand (op 2/3).  */
> +static bool
> +have_const_cmov (machine_mode mode)
> +{
> +  enum insn_code icode;
> +  if ((icode = direct_optab_handler (movcc_optab, mode))
> +  != CODE_FOR_nothing)
> +{
> +  if (insn_data[(int) icode].operand[2].predicate
> +   && (insn_data[(int) icode].operand[2].predicate
> + (const1_rtx, insn_data[(int) icode].operand[2].mode)))
> + if (insn_data[(int) icode].operand[3].predicate
> + && (insn_data[(int) icode].operand[3].predicate
> +   (const1_rtx, insn_data[(int) icode].operand[3].mode)))
> +   return true;
> +}
> +
> +  return false;
> +}
This may ultimately be too simplistic.  There are targets where some
constants are OK, but others may not be.   By checking the predicate
like this I think you can cause over-aggressive if-conversion if the
target allows a range of integers in the expander's operand predicate,
but allows a narrower range in the actual define_insn (presumably the
expander loads them into a pseudo or somesuch in that case).

We know that over-aggressive if-conversion into conditional moves hurts
some targets.

Ideally you'd create the actual insn with the constants you want to use
and see if that's recognized as well as compute its cost.  Is that just
too painful at this point for some reason?


> @@ -3689,7 +3717,7 @@ check_cond_move_block (basic_block bb,
>  {
>rtx set, dest, src;
>  
> -  if (!NONDEBUG_INSN_P (insn) || JUMP_P (insn))
> +  if (!active_insn_p (insn))
>   continue;
>set = single_set (insn);
>if (!set)
> @@ -3705,10 +3733,8 @@ check_cond_move_block (basic_block bb,
>if (!CONSTANT_P (src) && !register_operand (src, VOIDmode))
>   return FALSE;
>  
> -  if (side_effects_p (src) || side_effects_p (dest))
> - return FALSE;
> -
> -  if (may_trap_p (src) || may_trap_p (dest))
> +  /* Check for side effects and trapping.  */
> +  if (!noce_operand_ok (src) || !noce_operand_ok (dest))
>   return FALSE;
>  
>/* Don't try to handle this if the source register was
These two hunks are probably OK as general cleanups.  Note that
noce_operand_ok is not strictly the same as checking side_effects_p and
may_trap_p in the case of a MEM.  But you've already filtered out MEMs
before you get here.

Jeff
> 



+reminder+ Don’t build gdb/readline/libreadline.a, when --with-system-readline is supplied

2018-11-14 Thread Дилян Палаузов
 Forwarded Message 
From: Дилян Палаузов 
To: gcc-patches@gcc.gnu.org
Subject: Don’t build gdb/readline/libreadline.a, when --with-system-
readline is supplied
Date: Sat, 27 Oct 2018 19:53:44 +

Building GDB always builds the bundled libreadline.a, even if use of
the libreadline installed on the system was requested with
--with-system-readline.

The change below is for binutils-gdb’s/configure.ac, which is
maintained by gcc.


See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87741 [GCC] and 
https://sourceware.org/bugzilla/show_bug.cgi?id=18632 [GDB] for
details.



diff --git a/configure.ac b/configure.ac
--- a/configure.ac
+++ b/configure.ac
@@ -253,6 +253,12 @@ if test x$with_system_zlib = xyes ; then
   noconfigdirs="$noconfigdirs zlib"
 fi
 
+# Don't compile the bundled readline/libreadline.a in gdb-binutils if
+#  --with-system-readline is provided.
+if test x$with_system_readline = xyes ; then
+  noconfigdirs="$noconfigdirs readline"
+fi
+
 # some tools are so dependent upon X11 that if we're not building with X,
 # it's not even worth trying to configure, much less build, that tool.



Re: [PATCH] RFC: elide repeated source locations (PR other/84889)

2018-11-14 Thread David Malcolm
On Mon, 2018-11-12 at 13:37 -0700, Martin Sebor wrote:
> On 11/11/2018 07:43 PM, David Malcolm wrote:
> > We often emit more than one diagnostic at the same source location.
> > For example, the C++ frontend can emit many diagnostics at
> > the same source location when suggesting overload candidates.
> > 
> > For example:
> > 
> > ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In
> > function 'int test_3(s, t)':
> > ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
> > error: no match for 'operator&&' (operand types are 's' and 't')
> >38 |   return param_s && param_t;
> >   |  ^~
> > ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
> > note: candidate: 'operator&&(bool, bool)' 
> > ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
> > note:   no known conversion for argument 2 from 't' to 'bool'
> > 
> > This is overly verbose.  Note how the same location has been
> > printed
> > three times, obscuring the pertinent messages.
> > 
> > This patch adds a new "elide" value to -fdiagnostics-show-location=
> > and makes it the default (previously it was "once").  With elision
> > the above is printed as:
> > 
> > ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In
> > function 'int test_3(s, t)':
> > ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
> > error: no match for 'operator&&' (operand types are 's' and 't')
> >38 |   return param_s && param_t;
> >   |  ^~
> >   = note: candidate: 'operator&&(bool, bool)' 
> >   = note:   no known conversion for argument 2 from 't' to
> > 'bool'
> > 
> > where the followup notes are printed with a '=' lined up with
> > the source code margin.
> > 
> > Thoughts?
> 
> I agree the long pathname in the notes is at first glance redundant
> but I'm not sure about using '=' as a shorthand for it.  I have
> written many scripts to parse GCC output to extract all diagnostics
> (including notes) and publish those on a Web page somewhere, as I'm
> sure others must have.  All those scripts would stop working with
> this change and require changes to the build system to work again.
> Making those changes can be a substantial undertaking in some
> organizations.
> 
> Have you considered printing just the file name instead?  Or any
> other alternatives?

"-fdiagnostics-show-location=once" will restore the old behavior.
Alternatively, if you want to parse GCC output, I'm adding a JSON
output format; see:
  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01038.html
(I'm testing an updated version of that patch)

Dave


Re: [PATCH] Fix bootstrap with GCC 4.1.2 (PR bootstrap/86739)

2018-11-14 Thread Marc Glisse

On Wed, 14 Nov 2018, Richard Biener wrote:


On Wed, 14 Nov 2018, Jakub Jelinek wrote:


Hi!

As mentioned in the PR, with GCC before 4.3 one can't instantiate std::pair
where one or both of the template parameters are reference types, because
the std::pair constructor has arguments references to the template parameter
types and the CWG that resolved hasn't been applied to those compilers.

The following patch works around it by not returning
std::pair object, but instead a different class that
holds the two references and has conversion operator to std::pair.

If that conversion operator isn't acceptable, in the PR there is another
patch which adjusts the (so far) two spots which need to be changed in that
case.

Bootstrapped/regtested on x86_64-linux and i686-linux (using GCC 7 as
bootstrap compiler) and tested on the preprocessed source with GCC 4.1.
Ok for trunk?


Works for me if C++ people have no better idea.


A number of C++ people actually dislike std::pair on general principle and 
believe that we should instead use classes with meaningful field names 
(key and value?). Of course, that's way more disruptive and thus probably 
less desirable here.


--
Marc Glisse
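
A minimal sketch of the two ideas discussed above, assuming a string key and an int value: a small class that holds two references under meaningful names and still converts to std::pair. The type and function names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type; the member names key/value follow the suggestion in
// the mail, everything else is made up for illustration.
struct entry_ref
{
  const std::string &key;
  int &value;

  entry_ref(const std::string &k, int &v) : key(k), value(v) {}

  // Conversion for callers that still expect a std::pair of references.
  operator std::pair<const std::string &, int &>() const
  {
    return std::pair<const std::string &, int &>(key, value);
  }
};

// Returns named references instead of a std::pair.
inline entry_ref make_entry_ref(const std::string &k, int &v)
{
  return entry_ref(k, v);
}
```

Callers can use `e.key` and `e.value` directly, and code written against the pair interface keeps compiling through the conversion operator. Note that this sketch targets a current compiler; it does not try to reproduce the GCC 4.1 limitation itself.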


Re: C++ PATCH for c++/87781, detect invalid elaborated-type-specifier

2018-11-14 Thread Marek Polacek
On Wed, Nov 14, 2018 at 10:03:50AM -0500, Jason Merrill wrote:
> On Wed, Nov 14, 2018 at 9:55 AM Marek Polacek  wrote:
> >
> > In elaborated-type-specifier, the template keyword can only follow a
> > nested-name-specifier:
> >
> >   class-key nested-name-specifier template[opt] simple-template-id
> >
> > but we weren't detecting it.
> >
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> >
> > 2018-11-14  Marek Polacek  
> >
> > PR c++/87781 - detect invalid elaborated-type-specifier.
> > * parser.c (cp_parser_elaborated_type_specifier): Ensure that
> > template follows a nested-name-specifier.
> >
> > * g++.dg/parse/elab3.C: New test.
> >
> > diff --git gcc/cp/parser.c gcc/cp/parser.c
> > index e9e49b15702..0ab44ab93e3 100644
> > --- gcc/cp/parser.c
> > +++ gcc/cp/parser.c
> > @@ -17986,6 +17986,10 @@ cp_parser_elaborated_type_specifier (cp_parser* 
> > parser,
> >  template-id or not.  */
> >if (!template_p)
> > cp_parser_parse_tentatively (parser);
> > +  /* The `template' keyword must follow a nested-name-specifier.  */
> > +  else if (!nested_name_specifier)
> > +   return error_mark_node;
> 
> Don't we want a diagnostic here?

We'd get "invalid declarator" even without a diagnostic there but I guess it'd
be nicer to say what the actual problem is.  Unsure which diagnostic to go with,
this patch uses cp_parser_error though.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-11-14  Marek Polacek  

PR c++/87781 - detect invalid elaborated-type-specifier.
* parser.c (cp_parser_elaborated_type_specifier): Ensure that
template follows a nested-name-specifier.

* g++.dg/parse/elab3.C: New test.
* g++.dg/template/crash115.C: Adjust dg-error.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index e9e49b15702..bfcf42b0f39 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -17986,6 +17986,14 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
 template-id or not.  */
   if (!template_p)
cp_parser_parse_tentatively (parser);
+  /* The `template' keyword must follow a nested-name-specifier.  */
+  else if (!nested_name_specifier)
+   {
+ cp_parser_error (parser, "%<template%> must follow a nested-"
+  "name-specifier");
+ return error_mark_node;
+   }
+
   /* Parse the template-id.  */
   token = cp_lexer_peek_token (parser->lexer);
   decl = cp_parser_template_id (parser, template_p,
diff --git gcc/testsuite/g++.dg/parse/elab3.C gcc/testsuite/g++.dg/parse/elab3.C
new file mode 100644
index 000..61338fb7ac4
--- /dev/null
+++ gcc/testsuite/g++.dg/parse/elab3.C
@@ -0,0 +1,6 @@
+// PR c++/87781
+// { dg-do compile }
+// { dg-options "" }
+
+template class A;
+class template A *p; // { dg-error ".template. must follow|invalid" }
diff --git gcc/testsuite/g++.dg/template/crash115.C 
gcc/testsuite/g++.dg/template/crash115.C
index 5c9f525cd64..80f8683a136 100644
--- gcc/testsuite/g++.dg/template/crash115.C
+++ gcc/testsuite/g++.dg/template/crash115.C
@@ -1,3 +1,3 @@
 // PR c++/56534
 
-template < struct template rebind < > // { dg-error "expected" }
+template < struct template rebind < > // { dg-error "expected|must follow" }


Re: [PATCH] Fix incorrect assertion when deallocating big block

2018-11-14 Thread Jonathan Wakely

On 14/11/18 10:31 +0100, Christophe Lyon wrote:

On Tue, 13 Nov 2018 at 23:58, Jonathan Wakely  wrote:


Since a big_block rounds up the size to a multiple of big_block::min it
is wrong to assert that the supplied number of bytes equals the
big_block's size(). Add big_block::alloc_size(size_t) to calculate the
allocated size consistently, and add comments to the code.

* src/c++17/memory_resource.cc (big_block): Improve comments.
(big_block::all_ones): Remove.
(big_block::big_block(size_t, size_t)): Use alloc_size.
(big_block::size()): Add comment, replace all_ones with equivalent
expression.
(big_block::align()): Shift value of correct type.
(big_block::alloc_size(size_t)): New function to round up size.
(__pool_resource::allocate(size_t, size_t)): Add comment.
(__pool_resource::deallocate(void*, size_t, size_t)): Likewise. Fix
incorrect assertion by using big_block::alloc_size(size_t).
* testsuite/20_util/unsynchronized_pool_resource/allocate.cc: Add
more tests for unpooled allocations.
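
The rounding described above can be sketched as follows. The granularity of 64 is an assumption (the real value of big_block::min is not shown in this thread), the helper names are hypothetical, and the sketch ignores overflow for sizes close to SIZE_MAX.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical granularity; stands in for big_block::min.
constexpr std::size_t block_min = 64;

// Round bytes up to the next multiple of block_min (a power of two).
constexpr std::size_t alloc_size(std::size_t bytes)
{
  return (bytes + block_min - 1) & ~(block_min - 1);
}

// A deallocation check has to compare against the rounded size, because
// the size recorded at allocation time was rounded as well.
constexpr bool matches_recorded_size(std::size_t recorded, std::size_t bytes)
{
  return recorded == alloc_size(bytes);
}
```

Asserting `recorded == bytes` would fail for any request that is not already a multiple of the granularity, which is the kind of incorrect assertion the ChangeLog entry refers to.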



Hi Jonathan,

I've noticed that the updated test fails on arm*:
FAIL: 20_util/unsynchronized_pool_resource/allocate.cc execution test

the log says:
/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc:232:
void test06(): Assertion 'false' failed.

The same happens on aarch64-elf with -mabi=ilp32


Should be fixed by this patch, committed to trunk.
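
As a rough illustration of why a loop bound of 64 breaks on 32-bit targets: shifting a std::size_t by its width or more is undefined, so the bound has to come from the type itself. The helper below is a hypothetical sketch, not code from the test.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Number of value bits in std::size_t: 32 on ILP32 targets such as arm or
// aarch64 with -mabi=ilp32, 64 on LP64 targets.
constexpr int size_bits = std::numeric_limits<std::size_t>::digits;

// Returns 2^i as a std::size_t, or 0 if the shift would be undefined.
constexpr std::size_t pow2_or_zero(int i)
{
  return (i >= 0 && i < size_bits) ? (std::size_t(1) << i) : 0;
}
```

A test loop written as `for (int i = 0; i < size_bits; ++i)` stays within the defined range on every target.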


commit b9aea25cc625b0ee3322d8185c2fc9354a800ebb
Author: Jonathan Wakely 
Date:   Wed Nov 14 20:23:58 2018 +

Fix test that does undefined shifts greater than width of size_t

* testsuite/20_util/unsynchronized_pool_resource/allocate.cc: Fix
test for 32-bit targets. Test additional allocation sizes.

diff --git a/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc b/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc
index 749655b63c7..0325a4358b6 100644
--- a/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc
+++ b/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc
@@ -170,7 +170,7 @@ test05()
 void
 test06()
 {
-  struct custom_mr : std::pmr::memory_resource
+  struct checking_mr : std::pmr::memory_resource
   {
 size_t expected_size = 0;
 size_t expected_alignment = 0;
@@ -178,29 +178,30 @@ test06()
 struct bad_size { };
 struct bad_alignment { };
 
-void* do_allocate(std::size_t b, std::size_t a)
+void* do_allocate(std::size_t bytes, std::size_t align)
 {
-  if (expected_size != 0)
-  {
-	if (b < expected_size)
-	  throw bad_size();
-	else if (a != expected_alignment)
-	  throw bad_alignment();
-	// Else just throw, don't try to allocate:
-	throw std::bad_alloc();
-  }
+  // Internal data structures in unsynchronized_pool_resource need to
+  // allocate memory, so handle those normally:
+  if (align <= alignof(std::max_align_t))
+	return std::pmr::new_delete_resource()->allocate(bytes, align);
 
-  return std::pmr::new_delete_resource()->allocate(b, a);
+  // This is a large, unpooled allocation. Check the arguments:
+  if (bytes < expected_size)
+	throw bad_size();
+  else if (align != expected_alignment)
+	throw bad_alignment();
+  // Else just throw, don't really try to allocate:
+  throw std::bad_alloc();
 }
 
-void do_deallocate(void* p, std::size_t b, std::size_t a)
-{ std::pmr::new_delete_resource()->deallocate(p, b, a); }
+void do_deallocate(void* p, std::size_t bytes, std::size_t align)
+{ std::pmr::new_delete_resource()->deallocate(p, bytes, align); }
 
 bool do_is_equal(const memory_resource& r) const noexcept
 { return false; }
   };
 
-  custom_mr c;
+  checking_mr c;
   std::pmr::unsynchronized_pool_resource r({1, 1}, &c);
   std::pmr::pool_options opts = r.options();
   const std::size_t largest_pool = opts.largest_required_pool_block;
@@ -214,23 +215,26 @@ test06()
 
   // Try allocating various very large sizes and ensure the size requested
   // from the upstream allocator is at least as large as needed.
-  for (int i = 1; i < 64; ++i)
+  for (int i = 0; i < std::numeric_limits<std::size_t>::digits; ++i)
   {
-for (auto b : { -1, 0, 1, 3 })
+for (auto b : { -63, -5, -1, 0, 1, 3, std::numeric_limits<int>::max() })
 {
   std::size_t bytes = std::size_t(1) << i;
-  bytes += b;
+  bytes += b; // For negative b this can wrap to a large positive value.
   c.expected_size = bytes;
   c.expected_alignment = large_alignment;
+  bool caught_bad_alloc = false;
   try {
 	(void) r.allocate(bytes, large_alignment);
   } catch (const std::bad_alloc&) {
 	// expect to catch bad_alloc
-  } catch (custom_mr::bad_size) {
-	VERIFY(false);
-  } catch (custom_mr::bad_alignment) {
-	VERIFY(false);
+	caught_bad_alloc = true;
+  } catch (checking_mr::bad_size) {
+	VERIFY( ! "allocation from upstream resour

Re: [PATCH 1/6] ifcvt: Store the number of created cmovs.

2018-11-14 Thread Jeff Law
On 11/14/18 6:07 AM, Robin Dapp wrote:
> This patch saves the number of created conditional moves by
> noce_convert_multiple_sets in the IF_INFO struct.  This may be used by
> the backend to decide more easily whether to accept a generated sequence or
> not.
> 
> --
> 
> gcc/ChangeLog:
> 
> 2018-11-14  Robin Dapp  
> 
>   * ifcvt.c (noce_convert_multiple_sets): Set cmov count.
>   (noce_find_if_block): Set cmov count.
>   * ifcvt.h (struct noce_if_info): Add cmov count.
So this series came in after stage1 close.  I'm not aware of a bug
this series is meant to fix, but perhaps you just didn't include a
reference to it.

Anyway, patches #1 and #3 are OK for the trunk right now.

Jeff


[doc, committed] improve documentation for -Og

2018-11-14 Thread Sandra Loosemore
This patch is for PR 59658, "Document -f* flags enabled by -Og".  As 
noted in the issue comments, this is tricky because -Og uses a different 
pass list from the other -O levels, and like -O0 it completely ignores 
-f* options for many optimization passes.  And, like -O0, there's really 
nothing in the code to identify which flags are ignored without checking 
them individually against the pass list.  I opened PR 88024 to improve 
the code in that area.  Meanwhile this documentation patch identifies 
the flags that are explicitly defaulted to off and documents that other 
flags are ignored, consistently with how the manual documents the 
similar -O0 abbreviated-passlist behavior.


I have also included a code patch here to better sort the 
default_options_table array in opts.c so it is easier to keep the 
documentation in sync with it.  I think that counts as "obvious" so I 
have gone ahead and committed this patch.


-Sandra
2018-11-14  Sandra Loosemore  

	PR middle-end/59658

	gcc/
	* doc/invoke.texi (Optimize Options): Clarify that -O0 and -Og
	also suppress many optimizations.  Alphabetize option lists for
	-O1, -O2, and -Os.  Add list of options disabled with -Og, and
	correct documentation for those options to say that.
	* opts.c (default_options_table): Sort table by level and option
	name, to make it easier to correlate to the manual.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 266082)
+++ gcc/doc/invoke.texi	(working copy)
@@ -7896,9 +7896,10 @@ each of them.
 Not all optimizations are controlled directly by a flag.  Only
 optimizations that have a flag are listed in this section.
 
-Most optimizations are only enabled if an @option{-O} level is set on
-the command line.  Otherwise they are disabled, even if individual
-optimization flags are specified.
+Most optimizations are completely disabled at @option{-O0} or if an
+@option{-O} level is not set on the command line, even if individual
+optimization flags are specified.  Similarly, @option{-Og} suppresses
+many optimization passes.
 
 Depending on the target and how GCC was configured, a slightly different
 set of optimizations may be enabled at each @option{-O} level than
@@ -7918,9 +7919,14 @@ With @option{-O}, the compiler tries to
 time, without performing any optimizations that take a great deal of
 compilation time.
 
+@c Note that in addition to the default_options_table list in opts.c,
+@c several optimization flags default to true but control optimization
+@c passes that are explicitly disabled at -O0.
+
 @option{-O} turns on the following optimization flags:
-@gccoptlist{
--fauto-inc-dec @gol
+
+@c Please keep the following list alphabetized.
+@gccoptlist{-fauto-inc-dec @gol
 -fbranch-count-reg @gol
 -fcombine-stack-adjustments @gol
 -fcompare-elim @gol
@@ -7931,11 +7937,11 @@ compilation time.
 -fdse @gol
 -fforward-propagate @gol
 -fguess-branch-probability @gol
--fif-conversion2 @gol
 -fif-conversion @gol
+-fif-conversion2 @gol
 -finline-functions-called-once @gol
--fipa-pure-const @gol
 -fipa-profile @gol
+-fipa-pure-const @gol
 -fipa-reference @gol
 -fipa-reference-addressable @gol
 -fmerge-constants @gol
@@ -7958,11 +7964,11 @@ compilation time.
 -ftree-forwprop @gol
 -ftree-fre @gol
 -ftree-phiprop @gol
+-ftree-pta @gol
 -ftree-scev-cprop @gol
 -ftree-sink @gol
 -ftree-slsr @gol
 -ftree-sra @gol
--ftree-pta @gol
 -ftree-ter @gol
 -funit-at-a-time}
 
@@ -7975,10 +7981,12 @@ and the performance of the generated cod
 
 @option{-O2} turns on all optimization flags specified by @option{-O}.  It
 also turns on the following optimization flags:
-@gccoptlist{-fthread-jumps @gol
--falign-functions  -falign-jumps @gol
--falign-loops  -falign-labels @gol
+
+@c Please keep the following list alphabetized!
+@gccoptlist{-falign-functions  -falign-jumps @gol
+-falign-labels  -falign-loops @gol
 -fcaller-saves @gol
+-fcode-hoisting @gol
 -fcrossjumping @gol
 -fcse-follow-jumps  -fcse-skip-blocks @gol
 -fdelete-null-pointer-checks @gol
@@ -7988,11 +7996,8 @@ also turns on the following optimization
 -fhoist-adjacent-loads @gol
 -finline-small-functions @gol
 -findirect-inlining @gol
--fipa-cp @gol
--fipa-bit-cp @gol
--fipa-vrp @gol
--fipa-sra @gol
--fipa-icf @gol
+-fipa-bit-cp  -fipa-cp  -fipa-icf @gol
+-fipa-ra  -fipa-sra  -fipa-vrp @gol
 -fisolate-erroneous-paths-dereference @gol
 -flra-remat @gol
 -foptimize-sibling-calls @gol
@@ -8002,16 +8007,15 @@ also turns on the following optimization
 -freorder-blocks-algorithm=stc @gol
 -freorder-blocks-and-partition -freorder-functions @gol
 -frerun-cse-after-loop  @gol
--fsched-interblock  -fsched-spec @gol
 -fschedule-insns  -fschedule-insns2 @gol
+-fsched-interblock  -fsched-spec @gol
 -fstore-merging @gol
 -fstrict-aliasing @gol
+-fthread-jumps @gol
 -ftree-builtin-call-dce @gol
--ftree-switch-conversion -ftree-tail-merge @gol
--fcode-hoisting @gol
 -ftree-pre @gol
--ftree-vrp @gol
--fipa-ra}
+-ftree-switch-co

[C++ Patch PING] Re: [C++ Patch] Improve compute_array_index_type locations

2018-11-14 Thread Paolo Carlini

Hi,

gently pinging this older patch of mine: given the previous 
create_array_type_for_decl change, its gist should not be very 
controversial...


On 06/11/18 10:01, Paolo Carlini wrote:

Hi,

when I improved create_array_type_for_decl I didn't notice that it 
calls compute_array_index_type as helper, which simply needs to have 
the location information propagated. Tested x86_64-linux.


    https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00331.html

Thanks, Paolo.




Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-14 Thread Jozef Lawrynowicz

On 14/11/2018 18:50, Paul Koning wrote:



On Nov 14, 2018, at 1:00 PM, Jozef Lawrynowicz  wrote:

On 14/11/2018 16:54, Andreas Schwab wrote:

On Nov 14 2018, Jozef Lawrynowicz  wrote:


diff --git a/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c 
b/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
index 6b1c427..71d24ce 100644
--- a/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
+++ b/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
@@ -1,6 +1,7 @@
  /* Test __builtin_{add,sub}_overflow on {,un}signed long int.  */
  /* { dg-do run } */
  /* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-timeout 120 { target msp430-*-* } } */

Are you sure you want to _decrease_ the timeout?  The default is 300
seconds.

Andreas.


The timeout as set in the dejagnu configuration for msp430 
([dejagnu.git]/baseboards/msp430-sim.exp) is 30, which is rarely hit. There are 
some tests which pass most of the time but will occasionally time out so 
maybe the default timeout in dejagnu is worth increasing a little as well.

Would it make sense to use { dg-timeout-factor 4 ... } instead?  That would 
make it explicit that you're raising rather than lowering the timeout.

paul



Thanks, I wasn't aware of that directive. Using dg-timeout-factor does seem
more appropriate in this case.

Jozef.



Re: [PATCH] Add missing ZLIBINC to CFLAGS-optinfo-emit-json.o

2018-11-14 Thread David Malcolm
On Wed, 2018-11-14 at 09:49 +, Kyrill Tkachov wrote:
> On 13/11/18 18:45, David Malcolm wrote:
> > On Tue, 2018-11-13 at 17:58 +, Kyrill Tkachov wrote:
> > > Hi David,
> > > 
> > > On 09/11/18 21:00, Jeff Law wrote:
> > > > On 11/9/18 10:51 AM, David Malcolm wrote:
> > > > > One of the concerns noted at Cauldron about -fsave-
> > > > > optimization-
> > > > > record
> > > > > was the size of the output files.
> > > > > 
> > > > > This file implements compression of the -fsave-optimization-
> > > > > record
> > > > > output, using zlib.
> > > > > 
> > > > > I did some before/after testing of this patch, using SPEC
> > > > > 2017's
> > > > > 502.gcc_r with -O3, looking at the sizes of the generated
> > > > > FILENAME.opt-record.json[.gz] files.
> > > > > 
> > > > > The largest file was for insn-attrtab.c:
> > > > >before:  171736285 bytes (164M)
> > > > >after: 5304015 bytes (5.1M)
> > > > > 
> > > > > Smallest file was for vasprintf.c:
> > > > >before:  30567 bytes
> > > > >after:4485 bytes
> > > > > 
> > > > > Median file by size before was lambda-mat.c:
> > > > >before:2266738 bytes (2.2M)
> > > > >after:   75988 bytes (74K)
> > > > > 
> > > > > Total of all files in the benchmark:
> > > > >before: 2041720713 bytes (1.9G)
> > > > >after:66870770 bytes (63.8M)
> > > > > 
> > > > > ...so clearly compression is a big win in terms of file size,
> > > > > at
> > > > > the
> > > > > cost of making the files slightly more awkward to work with.
> > > > > [1]
> > > > > I also wonder if we want to support any pre-filtering of the
> > > > > output
> > > > > (FWIW roughly half of the biggest file seems to be "Adding
> > > > > assert
> > > > > for "
> > > > > messages from tree-vrp.c).
> > > > > 
> > > > > Successfully bootstrapped & regrtested on x86_64-pc-linux-
> > > > > gnu.
> > > > > 
> > > > > OK for trunk?
> > > > > 
> > > 
> > > So does this now add a dependency on zlib?
> > > I can't build GCC on my aarch64-none-linux machine after this
> > > patch
> > > due to a missing zlib.h.
> > > I see there's a zlib in the top-level GCC tree. Is that
> > > build/used
> > > during the GCC build itself?
> > > 
> > > Thanks,
> > > Kyrill
> > 
> > Sorry about that.  Does the following patch fix the build for you?
> 
> Yes, that fixes it.
> Thanks David!
> 
> Kyrill

Thanks; I've committed it to trunk as r266156.

Dave


Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses

2018-11-14 Thread Marek Polacek
On Wed, Nov 14, 2018 at 09:55:39PM +0530, Umesh Kalappa wrote:
> My bad Marek and thank you for pointing that out.
> 
> Please find the attached correct one (pr52869.patch) .

Index: gcc/cp/ChangeLog
===
--- gcc/cp/ChangeLog(revision 266026)
+++ gcc/cp/ChangeLog(working copy)
@@ -1,3 +1,9 @@
+2018-11-14  Kamlesh Kumar  
+
+   PR c++/52869
+   *parser.c () :  restore the old current_class_{ptr,ref} by
+   inject_this_parameter().
+

So the correct CL entry would look like

2018-11-14  Kamlesh Kumar  

DR 1207
PR c++/52869
* parser.c (cp_parser_noexcept_specification_opt): Call
inject_this_parameter.

or so.

Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c (revision 266026)
+++ gcc/cp/parser.c (working copy)
@@ -24615,11 +24615,24 @@
 {
   tree expr;
   cp_lexer_consume_token (parser->lexer);
-
+   

You're adding whitespaces where they shouldn't be.  Let's avoid changes like 
these.

   if (cp_lexer_peek_token (parser->lexer)->type == CPP_OPEN_PAREN)
{
  matching_parens parens;
  parens.consume_open (parser);
+ 
+ if (current_class_type)
+  inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);
+  else
+{
+  /*clear the current_class_ptr for non class type , like 
+   int foo() noexcept(*this)
+   {   
+ return 1;
+   }   
+ */
+current_class_ptr = NULL_TREE;
+}
 
I don't believe that's what Jason meant by restoring; I think you want

  tree save_ccp = current_class_ptr;
  tree save_ccr = current_class_ref;

  inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);

  [...]

  current_class_ptr = save_ccp;
  current_class_ref = save_ccr;

In the future, if using diff, please also use the -p option.

Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 266026)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2018-11-14  Kamlesh Kumar  
+
+   PR g++.dg/52869
+   * g++.dg/pr52869.C: New.

Should be "PR c++/52869".

Index: gcc/testsuite/g++.dg/pr52869.C
===
--- gcc/testsuite/g++.dg/pr52869.C  (nonexistent)
+++ gcc/testsuite/g++.dg/pr52869.C  (working copy)

Maybe move the test to testsuite/g++.dg/DRs?

@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -g" } */

Why these options?  I don't think you need -g.

+struct S {
+void f() { }
+void g() noexcept(noexcept(f())) { }
+void h() noexcept(noexcept(this->f())) { }
+};
+
+struct Nyan {
+   constexpr Nyan &operator++() noexcept { return *this; }
+   constexpr void omg() noexcept(noexcept(++*this)) {}
+};

I was hoping you'd also add a test with 'this' in noexcept in a class template.

This test doesn't compile on all dialects:
FAIL: g++.dg/pr52869.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/pr52869.C  -std=gnu++11 (test for excess errors)

You can run just the new test in all dialects using:
GXX_TESTSUITE_STDS=98,11,14,17,2a make check-c++ RUNTESTFLAGS=dg.exp=pr52869.C

The noexcept specifier is only in C++11 and newer I think.
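
As a rough illustration of the two points above (a class-template test, and
limiting the test to C++11 and newer), a test file could look something like
the sketch below.  The struct S and its member p are made up for this
example, and it is only expected to compile with a compiler that accepts
'this' in a noexcept-specifier, i.e. with the fix for this PR applied.

```cpp
// { dg-do compile { target c++11 } }

// Hypothetical class template using 'this' inside noexcept-specifiers.
template <typename T>
struct S
{
  T *p;
  void f () noexcept { }
  // Implicit and explicit uses of 'this' in the noexcept operand.
  void g () noexcept (noexcept (this->f ())) { }
  void h () noexcept (noexcept (*this->p)) { }
};

// Force instantiation of both exception specifications.
static_assert (noexcept (S<int> ().g ()), "g should be noexcept");
static_assert (noexcept (S<int> ().h ()), "h should be noexcept");
```

The target selector in the first line keeps the test from being run with
-std=gnu++98, where noexcept does not exist.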

+template< typename T >
+T sine( T const& a, T const& b ) noexcept
+{
+static_assert( noexcept( T(a / sqrt(a * a  + b * b)) ), "throwing expr" );
+return a / sqrt(a * a  + b * b);
+}
+
+int foo() noexcept
+{
+  return 1;
+}
+

I don't understand what this part of the test is testing.  It compiles even
without the patch.  Let's drop it.

Marek


Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-14 Thread Paul Koning



> On Nov 14, 2018, at 1:00 PM, Jozef Lawrynowicz wrote:
> 
> On 14/11/2018 16:54, Andreas Schwab wrote:
>> On Nov 14 2018, Jozef Lawrynowicz  wrote:
>> 
>>> diff --git a/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c b/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
>>> index 6b1c427..71d24ce 100644
>>> --- a/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
>>> +++ b/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
>>> @@ -1,6 +1,7 @@
>>>  /* Test __builtin_{add,sub}_overflow on {,un}signed long int.  */
>>>  /* { dg-do run } */
>>>  /* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
>>> +/* { dg-timeout 120 { target msp430-*-* } } */
>> Are you sure you want to _decrease_ the timeout?  The default is 300
>> seconds.
>> 
>> Andreas.
>> 
> The timeout as set in the dejagnu configuration for msp430
> ([dejagnu.git]/baseboards/msp430-sim.exp) is 30, which is rarely hit. There
> are some tests which pass most of the time but will occasionally time out,
> so maybe the default timeout in dejagnu is worth increasing a little as well.

Would it make sense to use { dg-timeout-factor 4 ... } instead?  That would 
make it explicit that you're raising rather than lowering the timeout.

paul




Re: [PATCH], Remove power9 fusion support

2018-11-14 Thread Segher Boessenkool
On Wed, Nov 07, 2018 at 01:36:50PM -0500, Michael Meissner wrote:
> On Mon, Nov 05, 2018 at 04:09:23PM -0600, Segher Boessenkool wrote:
> > Hi Mike,
> > 
> > On Fri, Nov 02, 2018 at 02:37:34PM -0400, Michael Meissner wrote:
> > > This patch removes all of the so-called power9 fusion support for the GCC
> > > > compiler.  It leaves -mpower9-fusion as a deprecated switch in case
> > > > somebody used it (the switch was never documented).
> > 
> > As Mike Stump says, please just remove it.  The option was never documented,
> > most likely zero people use it, and those that do shouldn't have and can
> > easily adjust.
> > 
> > > [gcc]
> > > 2018-11-02  Michael Meissner  
> > > 
> > >   * config/rs6000/constraints.md (wF constraint): Only document the
> > >   wF constraint for power8 fusion.  Remove documentation for power9
> > >   fusion.
> > 
> > It wasn't documented as being anything for p8 before.  So that was wrong?
> 
> The switch wasn't documented.  In the constraint (which is what I'm changing
> here), the constraint mentioned p9 fusion in the documentation string.

Yeah; so it was wrong.  k.

> > >   (rs6000_option_override_internal): Delete power9 fusion option
> > >   support.  If we do -mcpu=power8 -mtune=power9, turn off power8
> > >   fusion.
> > 
> > That doesn't sound right.  Either the -mcpu= or the -mtune= should turn
> > it on, but neither should turn it off.  It sounds like you want -mtune
> > to say whether fusion is enabled or not?  That sounds fine, but this
> > should be implemented more directly (or more generically).
> 
> Ok, I will look at it.

Thanks.

> > > >/* Power8 currently will only do the fusion if the top 11 bits of the addis
> > > - value are all 1's or 0's.  Ignore this restriction if we are testing
> > > - advanced fusion.  */
> > > -  if (TARGET_P9_FUSION)
> > > -return 1;
> > > -
> > > + value are all 1's or 0's.  */
> > >return (IN_RANGE (value >> 16, -32, 31));
> > >  })
> > 
> > I think this is top 12 bits equal, not 11, so [-16..15].
> 
> It is 11 bits, check section 12.1.12 in the  power8 book IV.
> 
>   addis(SI) first 11 bits must be all 0’s or all 1’s

As we discussed offline, it is 12.
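
For reference, the arithmetic behind the two candidate ranges can be checked
mechanically.  The sketch below does not say anything about what the hardware
does; it only shows that, for a 16-bit addis immediate (value >> 16), "top 11
bits all equal" corresponds to [-32..31] and "top 12 bits all equal"
corresponds to [-16..15].  The helper names are invented for this example and
are not from the GCC sources.

```cpp
#include <cstdint>

// Illustrative helpers only; these names are not from the GCC sources.

// True if the top N bits of a 16-bit immediate are all 0's or all 1's.
bool
top_bits_uniform (std::int16_t imm, int n)
{
  std::uint16_t bits = static_cast<std::uint16_t> (imm);
  std::uint16_t mask = static_cast<std::uint16_t> (0xFFFFu << (16 - n));
  std::uint16_t top = bits & mask;
  return top == 0 || top == mask;
}

// Same shape as IN_RANGE (value, lo, hi), for plain ints.
bool
in_range (int value, int lo, int hi)
{
  return value >= lo && value <= hi;
}

// Exhaustively compare the bit test with a signed range test over all
// 16-bit immediates.
bool
range_matches (int n, int lo, int hi)
{
  for (int v = -32768; v <= 32767; v++)
    if (top_bits_uniform (static_cast<std::int16_t> (v), n)
        != in_range (v, lo, hi))
      return false;
  return true;
}
```

With these definitions, range_matches (11, -32, 31) and
range_matches (12, -16, 15) both hold, so a 12-bit restriction would need the
IN_RANGE bounds in the patch changed to -16 and 15.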


Segher


Re: [debug/88006] -fdebug-types-section gives undefined ref

2018-11-14 Thread Jeff Law
On 11/14/18 8:08 AM, Nathan Sidwell wrote:
> On 11/14/18 7:33 AM, Richard Biener wrote:
> 
>> Hmm, there is reference_to_unused () used in a related case.  But generally
>> for late emission such references are "OK" and expected to be pruned
>> later by resolve_addr () (which I see we do not call for type units?!).
>> Quote:
>>
>>
>> /* Resolve DW_OP_addr and DW_AT_const_value CONST_STRING arguments to
>> an address in .rodata section if the string literal is emitted there,
>> or remove the containing location list or replace DW_AT_const_value
>> with DW_AT_location and empty location expression, if it isn't found
>> in .rodata.  Similarly for SYMBOL_REFs, keep only those that refer
>> to something that has been emitted in the current CU.  */
>>
>> static void
>> resolve_addr (dw_die_ref die)
>> {
> 
> That does seem to work.  The attached survives bootstrapping on
> x86_64-linux, ok?
> 
> nathan
> 
> -- 
> Nathan Sidwell
> 
> pr88006-2.diff
> 
> 2018-11-14  Nathan Sidwell  
> 
>   PR debug/88006
>   PR debug/87462
>   * dwarf2out.c (dwarf2out_finish): Apply resolve_addr to comdat
>   type list.
> 
>   * g++.dg/debug/dwarf2/pr87462.C: New.
>   * g++.dg/debug/dwarf2/pr88006.C: New.
OK
jeff


Re: Allow target to override gnu-user.h crti and crtn

2018-11-14 Thread Jeff Law
On 11/12/18 4:31 AM, Alan Modra wrote:
> Also give target access to the gnu-user.h LINK_GCC_C_SEQUENCE_SPEC.
> In preparation for using gnu-user.h in rs6000/.
> 
> Bootstrapped etc. powerpc64le-linux.  OK?
> 
>   * config/gnu-user.h (GNU_USER_TARGET_CRTI): Define.
>   (GNU_USER_TARGET_STARTFILE_SPEC): Use it here.
>   (GNU_USER_TARGET_CRTN): Define.
>   (GNU_USER_TARGET_ENDFILE_SPEC): Use it here.
>   (GNU_USER_TARGET_LINK_GCC_C_SEQUENCE_SPEC): Define.
OK
jeff


Re: Delete !HAVE_LD_PIE variants of startfile/endfile specs

2018-11-14 Thread Jeff Law
On 11/12/18 4:29 AM, Alan Modra wrote:
> This patch is a small cleanup.
> 
> The HAVE_LD_PIE variant doesn't contain anything that will break
> linking when !HAVE_LD_PIE that isn't already broken if you choose to
> build PIEs with a linker that doesn't support PIE.  All this
> HAVE_LD_PIE protects is the choice of different crt files, which is
> more about libc capability than linker capability.
> 
> Bootstrapped etc. powerpc64le-linux and x86_64-linux.  OK?
> 
>   * config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Delete
>   !HAVE_LD_PIE variant.
>   (GNU_USER_TARGET_ENDFILE_SPEC): Likewise.
OK
jeff


Re: [PATCH] pretty-print.c: add selftest::test_prefixes_and_wrapping

2018-11-14 Thread Jeff Law
On 11/11/18 6:39 PM, David Malcolm wrote:
> gcc/ChangeLog:
>   * pretty-print.c (class selftest::test_pretty_printer): New
>   subclass of pretty_printer.
>   (selftest::test_prefixes_and_wrapping): New test.
>   (selftest::pretty_print_c_tests): Call it.
> ---
OK
jeff



Re: RFA: Fix add_predicate_code to acknowledge ZERO_EXTRACT as an lvalue.

2018-11-14 Thread Jeff Law
On 11/11/18 1:52 AM, Joern Wolfgang Rennecke wrote:
> With a configurable vector size, it is not really feasible to represent
> every vector register inside GCC as a collection of lots of imaginary
> BITS_PER_WORD registers.  So you got your general purpose registers that
> are BITS_PER_WORD, and vector registers that are a bit or a lot larger.
> To avoid invalid code being emitted by reload, you have to define
> TARGET_CAN_CHANGE_MODE_CLASS to reject the use of vector registers for
> values where certain kinds of SUBREGs are used.  In practice, that's most
> of them.
> To avoid register allocation mayhem, the port has to steer the middle-end
> away from the tried-and-true-and-generating-abysmal-code path of SUBREGs.
> There are a number of choices for lvalues.  vec_select is sort of obvious
> and works to a point, but it doesn't scale well because the access
> representation changes according to the content of vector registers.  And
> it doesn't work at all as an lvalue.
> ZERO_EXTRACT has none of these problems.  It can describe a bitfield
> access independent of the vector structure (if any) of outer and inner
> mode, and it is valid as an lvalue.
> Unfortunately, add_predicate_code in gensupport.c didn't get the message.
> This patch fixes that.
> 
> Bootstrapped and regression tested on  x86_64-pc-linux-gnu .
> 
> gensupport-zext-patch-266008.txt
> 
> 2018-11-10  Joern Rennecke  
> 
>   * gensupport.c (add_predicate_code): Properly handle ZERO_EXTRACT
>   as an lvalue.
OK
jeff


Re: [PATCH] diagnose built-in declarations without prototype (PR 83656)

2018-11-14 Thread Jeff Law
On 11/6/18 4:56 PM, Martin Sebor wrote:
> In response to Joseph's comment I've removed the interaction
> with -Wpedantic from the updated patch.
> 
> In addition, to help detect bugs like the one in the test case
> for pr87886, I have also enhanced the detection of incompatible
> calls to include integer/real type mismatches so that calls like
> the one below are diagnosed:
> 
>   extern double sqrt ();
>   int f (int x)
>   {
> return sqrt (x);   // passing int where double is expected
>   }
> 
> With the removal of the -Wpedantic interaction declaring abort()
> without a prototype is no longer diagnosed and so the test suite
> changes to add the prototype are not necessary.  I decided not
> to back them out because Jeff indicated a preference for making
> these kinds of improvements in general in an unrelated
> discussion.
> 
> 
> 
> gcc-83656.diff
> 
> PR c/83656 - missing -Wbuiltin-declaration-mismatch on declaration without prototype
> 
> gcc/c/ChangeLog:
> 
>   PR c/83656
>   * c-decl.c (header_for_builtin_fn): Declare.
>   (diagnose_mismatched_decls): Diagnose declarations of built-in
>   functions without a prototype.
>   * c-typeck.c (maybe_warn_builtin_no_proto_arg): New function.
>   (convert_argument): Same.
>   (convert_arguments): Factor code out into convert_argument.
>   Detect mismatches between built-in formal arguments in calls
>   to built-in without prototype.
>   (build_conditional_expr): Same.
>   (type_or_builtin_type): New function.
>   (convert_for_assignment): Add argument.  Conditionally issue
>   warnings instead of errors for mismatches.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/83656
>   * gcc.dg/20021006-1.c
>   * gcc.dg/Wbuiltin-declaration-mismatch.c: New test.
>   * gcc.dg/Wbuiltin-declaration-mismatch-2.c: New test.
>   * gcc.dg/Wbuiltin-declaration-mismatch-3.c: New test.
>   * gcc.dg/Wbuiltin-declaration-mismatch-4.c: New test.
>   * gcc.dg/Walloca-16.c: Adjust.
>   * gcc.dg/Wrestrict-4.c: Adjust.
>   * gcc.dg/Wrestrict-5.c: Adjust.
>   * gcc.dg/atomic/stdatomic-generic.c: Adjust.
>   * gcc.dg/atomic/stdatomic-lockfree.c: Adjust.
>   * gcc.dg/initpri1.c: Adjust.
>   * gcc.dg/pr15698-1.c: Adjust.
>   * gcc.dg/pr69156.c: Adjust.
>   * gcc.dg/pr83463.c: Adjust.
>   * gcc.dg/redecl-4.c: Adjust.
>   * gcc.dg/tls/thr-init-2.c: Adjust.
>   * gcc.dg/torture/pr55890-2.c: Adjust.
>   * gcc.dg/torture/pr55890-3.c: Adjust.
>   * gcc.dg/torture/pr67741.c: Adjust.
>   * gcc.dg/torture/stackalign/sibcall-1.c: Adjust.
>   * gcc.dg/torture/tls/thr-init-1.c: Adjust.
>   * gcc.dg/tree-ssa/builtins-folding-gimple-ub.c: Adjust.
> 


> @@ -3547,8 +3598,24 @@ convert_arguments (location_t loc, vec 
> arg_loc, tree typelist,
>if (parmval == error_mark_node)
>   error_args = true;
>  
> +  if (!type && builtin_type && TREE_CODE (builtin_type) != VOID_TYPE)
> + {
> +   /* For a call to a built-in function declared without a prototype,
> +  perform the coversions from the argument to the expected type
> +  but issue warnings rather than errors for any mismatches.
> +  Ignore the converted argument and use the PARMVAL obtained
> +  above by applying default coversions instead.  */
s/coversions/conversions/

Two of 'em in that comment.  OK with that nit fixed.

jeff




Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-14 Thread Alexander Monakov
On Wed, 14 Nov 2018, Segher Boessenkool wrote:
> Yeah, using local register vars to access global registers only works
> by accident.  It does work currently though, and people apparently use
> it, so we shouldn't break it :-/

In the proposed approach (copying from/to pseudos just before/after the
asm) we can emulate historic behavior by making uninitialized pseudos
take values of the corresponding hardregs. If we decide that doing that
just for must-uninit pseudos is enough, it's a simple extension to the
existing init-regs pass.  Does this sound reasonable?

Alexander


Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-14 Thread Jozef Lawrynowicz

On 14/11/2018 16:54, Andreas Schwab wrote:
> On Nov 14 2018, Jozef Lawrynowicz  wrote:
>
>> diff --git a/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c b/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
>> index 6b1c427..71d24ce 100644
>> --- a/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
>> +++ b/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
>> @@ -1,6 +1,7 @@
>>  /* Test __builtin_{add,sub}_overflow on {,un}signed long int.  */
>>  /* { dg-do run } */
>>  /* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
>> +/* { dg-timeout 120 { target msp430-*-* } } */
>
> Are you sure you want to _decrease_ the timeout?  The default is 300
> seconds.
>
> Andreas.

The timeout as set in the dejagnu configuration for msp430
([dejagnu.git]/baseboards/msp430-sim.exp) is 30, which is rarely hit.
There are some tests which pass most of the time but will
occasionally time out, so maybe the default timeout in dejagnu is worth
increasing a little as well.




Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-14 Thread Segher Boessenkool
On Wed, Nov 14, 2018 at 06:50:38PM +0300, Alexander Monakov wrote:
> On Wed, 14 Nov 2018, Segher Boessenkool wrote:
> > > I think with "=g" rather than "+g" this example is ok.
> > 
> > No, it needs the register var as an input.  That is the whole *point*.
> 
> Hm. I think I see what you meant, but "+g" is not correct either: the
> asm, by intent, depends *on the current value in the 'sp' hardreg*, not
> *on the current value of some automatic variable that is supposed to be
> passed on the 'sp' hardreg to the asm* (which is what is expressed by the
> input constraint).

Yeah, using local register vars to access global registers only works
by accident.  It does work currently though, and people apparently use
it, so we shouldn't break it :-/


Segher


Re: Don't use %z printf format length specified

2018-11-14 Thread Michael Matz
Hi,

On Wed, 14 Nov 2018, Alexander Monakov wrote:

> On Wed, 14 Nov 2018, Michael Matz wrote:
> 
> > Hi,
> > 
> > it's not c++98 conforming and I get 1 million warnings when compiling.  
> > Initially I had also casts to long at the various arguments, but I get no 
> > warning with this variant either, so I removed them again.  I'm currently 
> > regstrapping this, okay for trunk if successful?
> 
> Surely this will break mingw-w64 where size_t is 64-bit yet long is 32-bit?

Okay, probably.  Then consider the same patch with sprinkling casts to 
long for all these arguments (the actual numbers will fit that type in 
reality).  I could of course also use PRIu64 but that makes my eyes bleed.


Ciao,
Michael.
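
To make the variant with casts concrete, here is a small stand-alone sketch
of the idea.  format_count is a made-up helper, not something in the GCC
tree, and it casts to unsigned long (rather than long) so the argument
matches %lu exactly.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of GCC: format a size_t count without
// relying on the "%zu" length modifier.
std::string
format_count (std::size_t n)
{
  char buf[32];
  // The explicit cast makes the argument type match "%lu" on every host,
  // including hosts where size_t is wider than unsigned long.  Values
  // above ULONG_MAX would be truncated there.
  std::snprintf (buf, sizeof buf, "%9lu", static_cast<unsigned long> (n));
  return buf;
}
```

Without the cast, passing a 64-bit size_t for %lu on a host with a 32-bit
long is undefined behavior, which is the mingw-w64 concern raised above.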


Re: Don't use %z printf format length specified

2018-11-14 Thread Andreas Schwab
On Nov 14 2018, Michael Matz  wrote:

> Initially I had also casts to long at the various arguments, but I get no 
> warning with this variant either, so I removed them again.

That is probably pure luck.  size_t is not required to be the same as
unsigned long.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Don't use %z printf format length specified

2018-11-14 Thread Alexander Monakov
On Wed, 14 Nov 2018, Michael Matz wrote:

> Hi,
> 
> it's not c++98 conforming and I get 1 million warnings when compiling.  
> Initially I had also casts to long at the various arguments, but I get no 
> warning with this variant either, so I removed them again.  I'm currently 
> regstrapping this, okay for trunk if successful?

Surely this will break mingw-w64 where size_t is 64-bit yet long is 32-bit?

Alexander


Don't use %z printf format length specified

2018-11-14 Thread Michael Matz
Hi,

it's not c++98 conforming and I get 1 million warnings when compiling.  
Initially I had also casts to long at the various arguments, but I get no 
warning with this variant either, so I removed them again.  I'm currently 
regstrapping this, okay for trunk if successful?


Ciao,
Michael.

* alloc-pool.h (pool_usage::dump): Don't use %z but %l.
(pool_usage::dump_footer): Likewise.
* bitmap.h (bitmap_usage::dump): Likewise.
* ggc-common.c (ggc_usage::dump): Likewise.
* ggc-page.c (ggc_print_statistics): Likewise.
* mem-stats.h (mem_usage::dump): Likewise.
(mem_usage::dump_footer): Likewise.
* rtl.c (dump_rtx_statistics): Likewise.
* vec.c (vec_usage::dump): Likewise.
(vec_usage::dump_footer): Likewise.

diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h
index d17a05ca4fb1..8a6747d89cc6 100644
--- a/gcc/alloc-pool.h
+++ b/gcc/alloc-pool.h
@@ -63,8 +63,8 @@ struct pool_usage: public mem_usage
   {
 char *location_string = loc->to_string ();
 
-fprintf (stderr, "%-32s%-48s %5zu%c%9zu%c:%5.1f%%%9zu"
-"%c%9zu%c:%5.1f%%%12zu\n",
+fprintf (stderr, "%-32s%-48s %5lu%c%9lu%c:%5.1f%%%9lu"
+"%c%9lu%c:%5.1f%%%12lu\n",
 m_pool_name, location_string,
 SIZE_AMOUNT (m_instances),
 SIZE_AMOUNT (m_allocated),
@@ -91,7 +91,7 @@ struct pool_usage: public mem_usage
   dump_footer ()
   {
 print_dash_line ();
-fprintf (stderr, "%s%82zu%c%10zu%c\n", "Total",
+fprintf (stderr, "%s%82lu%c%10lu%c\n", "Total",
 SIZE_AMOUNT (m_instances), SIZE_AMOUNT (m_allocated));
 print_dash_line ();
   }
diff --git a/gcc/bitmap.h b/gcc/bitmap.h
index 973ea846baf1..7c547aba9a5b 100644
--- a/gcc/bitmap.h
+++ b/gcc/bitmap.h
@@ -239,8 +239,8 @@ struct bitmap_usage: public mem_usage
   {
 char *location_string = loc->to_string ();
 
-fprintf (stderr, "%-48s %9zu%c:%5.1f%%"
-"%9zu%c%9zu%c:%5.1f%%"
+fprintf (stderr, "%-48s %9lu%c:%5.1f%%"
+"%9lu%c%9lu%c:%5.1f%%"
 "%11" PRIu64 "%c%11" PRIu64 "%c%10s\n",
 location_string, SIZE_AMOUNT (m_allocated),
 get_percent (m_allocated, total.m_allocated),
diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c
index 9fdba23ce4c2..0fe06ccd653d 100644
--- a/gcc/ggc-common.c
+++ b/gcc/ggc-common.c
@@ -884,12 +884,13 @@ struct ggc_usage: public mem_usage
   {
 size_t balance = get_balance ();
 fprintf (stderr,
-"%-48s %9zu%c:%5.1f%%%9zu%c:%5.1f%%"
-"%9zu%c:%5.1f%%%9zu%c:%5.1f%%%9zu%c\n",
+"%-48s %9lu%c:%5.1f%%%9lu%c:%5.1f%%"
+"%9lu%c:%5.1f%%%9lu%c:%5.1f%%%9lu%c\n",
 prefix, SIZE_AMOUNT (m_collected),
 get_percent (m_collected, total.m_collected),
 SIZE_AMOUNT (m_freed), get_percent (m_freed, total.m_freed),
-SIZE_AMOUNT (balance), get_percent (balance, total.get_balance ()),
+SIZE_AMOUNT (balance),
+get_percent (balance, total.get_balance ()),
 SIZE_AMOUNT (m_overhead),
 get_percent (m_overhead, total.m_overhead),
 SIZE_AMOUNT (m_times));
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 00c2864711f0..6ead4a144390 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -2288,14 +2288,14 @@ ggc_print_statistics (void)
  overhead += (sizeof (page_entry) - sizeof (long)
   + BITMAP_SIZE (OBJECTS_IN_PAGE (p) + 1));
}
-  fprintf (stderr, "%-8zu %10zu%c %10zu%c %10zu%c\n",
+  fprintf (stderr, "%-8lu %10lu%c %10lu%c %10lu%c\n",
   OBJECT_SIZE (i),
   SIZE_AMOUNT (allocated),
   SIZE_AMOUNT (in_use),
   SIZE_AMOUNT (overhead));
   total_overhead += overhead;
 }
-  fprintf (stderr, "%-8s %10zu%c %10zu%c %10zu%c\n",
+  fprintf (stderr, "%-8s %10lu%c %10lu%c %10lu%c\n",
   "Total",
   SIZE_AMOUNT (G.bytes_mapped),
   SIZE_AMOUNT (G.allocated),
@@ -2335,11 +2335,11 @@ ggc_print_statistics (void)
   for (i = 0; i < NUM_ORDERS; i++)
if (G.stats.total_allocated_per_order[i])
  {
-   fprintf (stderr, "Total Overhead  page size %9zu: %9"
+   fprintf (stderr, "Total Overhead  page size %9lu: %9"
 HOST_LONG_LONG_FORMAT "d%c\n",
 OBJECT_SIZE (i),
 SIZE_AMOUNT (G.stats.total_overhead_per_order[i]));
-   fprintf (stderr, "Total Allocated page size %9zu: %9"
+   fprintf (stderr, "Total Allocated page size %9lu: %9"
 HOST_LONG_LONG_FORMAT "d%c\n",
 OBJECT_SIZE (i),
 SIZE_AMOUNT (G.stats.total_allocated_per_order[i]));
diff --git a/gcc/mem-stats.h b/gcc/mem-stats.h
index 6ab92211cf47..a2da009f100e 100644
--- a/gcc/mem-stats.h
+++ b/gcc/mem-stats.h
@@ -205,8 +205,8 @@ struct mem_usage
   {
 char *location_string = loc->to_string ();
 
-

Re: Handle libphobos in contrib/gcc_update

2018-11-14 Thread Jeff Law
On 11/14/18 9:29 AM, Rainer Orth wrote:
> I noticed that libphobos isn't currently handled in gcc_update.  The
> following patch fixes this.
> 
> I guess this is close to obvious?
Seems so to me :-)

jeff


Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-14 Thread Andreas Schwab
On Nov 14 2018, Jozef Lawrynowicz  wrote:

> diff --git a/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c b/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
> index 6b1c427..71d24ce 100644
> --- a/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
> +++ b/gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c
> @@ -1,6 +1,7 @@
>  /* Test __builtin_{add,sub}_overflow on {,un}signed long int.  */
>  /* { dg-do run } */
>  /* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
> +/* { dg-timeout 120 { target msp430-*-* } } */

Are you sure you want to _decrease_ the timeout?  The default is 300
seconds.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PATCH][rs6000] inline expansion of memcmp using vsx

2018-11-14 Thread Aaron Sawdey
This patch generalizes some of the functions added earlier to do vsx expansion
of strncmp so that they can also generate the code needed for memcmp. I
reorganized expand_block_compare() a little to be able to make use of this
there. The vsx code is more compact so I've changed the default block compare
inline limit to 63 bytes. The vsx code is only used if there are at least 16
bytes to compare as this means we don't have to do complex code to compare
less than one chunk. If vsx is not available the limit is cut in half. The
performance is good, vsx memcmp is considerably faster than the gpr inline
code if the strings are equal and is comparable if the strings have a 10%
chance of being equal (spread across the string).

Currently regtesting, ok for trunk if tests pass?

Thanks!
   Aaron

2018-11-14  Aaron Sawdey  

* config/rs6000/rs6000-string.c (emit_vsx_zero_reg): New function.
(expand_cmp_vec_sequence): Rename and modify
expand_strncmp_vec_sequence.
(emit_final_compare_vec): Rename and modify emit_final_str_compare_vec.
(generate_6432_conversion): New function.
(expand_block_compare): Add support for vsx.
(expand_block_compare_gpr): New function.
* config/rs6000/rs6000.opt (rs6000_block_compare_inline_limit): Increase
default limit to 63 because of more compact vsx code.




Index: gcc/config/rs6000/rs6000-string.c
===
--- gcc/config/rs6000/rs6000-string.c   (revision 266034)
+++ gcc/config/rs6000/rs6000-string.c   (working copy)
@@ -615,6 +615,283 @@
 }
 }

+static rtx
+emit_vsx_zero_reg()
+{
+  unsigned int i;
+  rtx zr[16];
+  for (i = 0; i < 16; i++)
+zr[i] = GEN_INT (0);
+  rtvec zv = gen_rtvec_v (16, zr);
+  rtx zero_reg = gen_reg_rtx (V16QImode);
+  rs6000_expand_vector_init (zero_reg, gen_rtx_PARALLEL (V16QImode, zv));
+  return zero_reg;
+}
+
+/* Generate the sequence of compares for strcmp/strncmp using vec/vsx
+   instructions.
+
+   BYTES_TO_COMPARE is the number of bytes to be compared.
+   ORIG_SRC1 is the unmodified rtx for the first string.
+   ORIG_SRC2 is the unmodified rtx for the second string.
+   S1ADDR is the register to use for the base address of the first string.
+   S2ADDR is the register to use for the base address of the second string.
+   OFF_REG is the register to use for the string offset for loads.
+   S1DATA is the register for loading the first string.
+   S2DATA is the register for loading the second string.
+   VEC_RESULT is the rtx for the vector result indicating the byte difference.
+   EQUALITY_COMPARE_REST is a flag to indicate we need to make a cleanup call
+   to strcmp/strncmp if we have equality at the end of the inline comparison.
+   P_CLEANUP_LABEL is a pointer to rtx for a label we generate if we need code
+   to clean up and generate the final comparison result.
+   FINAL_MOVE_LABEL is rtx for a label we can branch to when we can just
+   set the final result.
+   CHECKZERO indicates whether the sequence should check for zero bytes
+   for use doing strncmp, or not (for use doing memcmp).  */
+static void
+expand_cmp_vec_sequence (unsigned HOST_WIDE_INT bytes_to_compare,
+rtx orig_src1, rtx orig_src2,
+rtx s1addr, rtx s2addr, rtx off_reg,
+rtx s1data, rtx s2data, rtx vec_result,
+bool equality_compare_rest, rtx *p_cleanup_label,
+rtx final_move_label, bool checkzero)
+{
+  machine_mode load_mode;
+  unsigned int load_mode_size;
+  unsigned HOST_WIDE_INT cmp_bytes = 0;
+  unsigned HOST_WIDE_INT offset = 0;
+  rtx zero_reg = NULL;
+
+  gcc_assert (p_cleanup_label != NULL);
+  rtx cleanup_label = *p_cleanup_label;
+
+  emit_move_insn (s1addr, force_reg (Pmode, XEXP (orig_src1, 0)));
+  emit_move_insn (s2addr, force_reg (Pmode, XEXP (orig_src2, 0)));
+
+  if (checkzero && !TARGET_P9_VECTOR)
+zero_reg = emit_vsx_zero_reg();
+
+  while (bytes_to_compare > 0)
+{
+  /* VEC/VSX compare sequence for P8:
+check each 16B with:
+lxvd2x 32,28,8
+lxvd2x 33,29,8
+vcmpequb 2,0,1  # compare strings
+vcmpequb 4,0,3  # compare w/ 0
+xxlorc 37,36,34   # first FF byte is either mismatch or end of string
+vcmpequb. 7,5,3  # reg 7 contains 0
+bnl 6,.Lmismatch
+
+For the P8 LE case, we use lxvd2x and compare full 16 bytes
+but then use vgbbd and a shift to get two bytes with the
+information we need in the correct order.
+
+VEC/VSX compare sequence if TARGET_P9_VECTOR:
+lxvb16x/lxvb16x # load 16B of each string
+vcmpnezb.   # produces difference location or zero byte location
+bne 6,.Lmismatch
+
+Use the overlapping compare trick for the last block if it is
+less than 16 bytes.
+  */
+
+  load_mode = V16QImode;
+  load_mode_size = GET_M

[committed] Fix rl78 newlib build failure due to bogus operand_subword_force argument

2018-11-14 Thread Jeff Law
ify_subreg that will result in using the
mode of the operand which is precisely what we want.

And that's precisely what this patch does.  That fixes the ICE and I've
verified the assembly code looks right on the rl78 port.  Furthermore,
newlib will successfully build with this patch.

It's been through a full cycle in my tester.  So it's been through
bootstraps on big/little endian systems, built kernels, built runtimes
(glibc, newlib, libgcc) -- essentially covering nearly all our targets
to varying degrees.

Installing on the trunk.

Jeff

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3efd96b570e..be75c6874c8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2018-11-14  Jeff Law  
+
+   * optabs.c (expand_binop): Pass INT_MODE to operand_subword_force
+   iff the operand is a constant.
+
 2018-11-14  Aldy Hernandez  
 
* gimple-ssa-evrp-analyze.c
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 605c90c..c7d1f22e7a8 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -1377,12 +1377,14 @@ expand_binop (machine_mode mode, optab binoptab, rtx op0, rtx op1,
   start_sequence ();
 
   /* Do the actual arithmetic.  */
+  enum machine_mode op0_mode = CONSTANT_P (op0) ? int_mode : VOIDmode;
+  enum machine_mode op1_mode = CONSTANT_P (op1) ? int_mode : VOIDmode;
   for (i = 0; i < GET_MODE_BITSIZE (int_mode) / BITS_PER_WORD; i++)
{
  rtx target_piece = operand_subword (target, i, 1, int_mode);
  rtx x = expand_binop (word_mode, binoptab,
-   operand_subword_force (op0, i, int_mode),
-   operand_subword_force (op1, i, int_mode),
+   operand_subword_force (op0, i, op0_mode),
+   operand_subword_force (op1, i, op1_mode),
target_piece, unsignedp, next_methods);
 
  if (x == 0)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 50e53f0b196..cee33796cc5 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2018-11-14  Jeff Law  
+
+   * gcc.c-torture/compile/20181114.c: New test.
+
 2018-11-14  Richard Biener  
 
PR middle-end/87985
diff --git a/gcc/testsuite/gcc.c-torture/compile/20181114-1.c b/gcc/testsuite/gcc.c-torture/compile/20181114-1.c
new file mode 100644
index 000..9bcc3992f64
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/20181114-1.c
@@ -0,0 +1,6 @@
+int
+_vfprintf_r (double fp)
+{
+  if (__builtin_signbit (fp))
+return '-';
+}


Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-14 Thread Paul Koning



> On Nov 14, 2018, at 10:44 AM, Jozef Lawrynowicz wrote:
> 
> Patch 1 tweaks dg directives in tests specifically for msp430. Many of
> these are extensions to existing target selectors in dg directives.
> 
> <0001-TESTSUITE-MSP430-Tweak-dg-directives-for-msp430-elf.patch>

For pr41779.c, you have

+/* { dg-skip-if "int is smaller than float" { msp430-*-* } } */

I take it that means: sizeof(int) < sizeof(float).  That property also holds 
for pdp11 and perhaps other targets.  Would it make sense to introduce a new 
effective-target flag for that check instead?

paul



Re: record_ranges_from_incoming_edge: use value_range API for creating new range

2018-11-14 Thread Aldy Hernandez

On 11/13/18 1:47 PM, Richard Biener wrote:
> On November 13, 2018 5:40:59 PM GMT+01:00, Aldy Hernandez wrote:
>> With your cleanups, the main raison d'etre for my patch goes away, but
>> here is the promised removal of ignore_equivs_equal_p.
>>
>> I think the == operator is a bit confusing, and equality intent should
>> be clearly specified.  I am providing the following for the derived
>> class (with no hidden default arguments):
>>
>> bool equal_p (const value_range &, bool ignore_equivs) const;
>>
>> and providing the following for the base class:
>>
>> bool equal_p (const value_range_base &) const;
>>
>> I am also removing access to both the == and the != operators.  It
>> should now be clear from the code whether the equivalence bitmap is
>> being taken into account or not.
>>
>> What do you think?
>
> Sounds good.


Committed to trunk.

Thanks.
Aldy
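
The idea of spelling out the comparison intent at each call site can be sketched in plain C. Everything here is a hypothetical, simplified stand-in (the real classes are C++ and carry trees and bitmaps); it only illustrates an equality test whose caller must state whether the equivalence set is compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a value range: bounds plus a
   bitmask playing the role of the equivalence bitmap.  */
struct range
{
  long min;
  long max;
  unsigned long equivs;
};

/* Compare two ranges.  The caller must say explicitly whether the
   equivalence set takes part in the comparison.  */
static bool
range_equal_p (const struct range *a, const struct range *b,
	       bool ignore_equivs)
{
  assert (a != NULL && b != NULL);
  if (a->min != b->min || a->max != b->max)
    return false;
  return ignore_equivs || a->equivs == b->equivs;
}
```

Two ranges with the same bounds but different equivalence sets compare equal only when ignore_equivs is true.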


Handle libphobos in contrib/gcc_update

2018-11-14 Thread Rainer Orth
I noticed that libphobos isn't currently handled in gcc_update.  The
following patch fixes this.

I guess this is close to obvious?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-11-01  Rainer Orth  

* gcc_update (files_and_dependencies): Handle libphobos.

# HG changeset patch
# Parent  e011b14b6da3d49e0b2a37cdad178b545a8c34ca
Handle libphobos in contrib/gcc_update

diff --git a/contrib/gcc_update b/contrib/gcc_update
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -169,6 +169,12 @@ libbacktrace/aclocal.m4: libbacktrace/co
 libbacktrace/Makefile.in: libbacktrace/Makefile.am libbacktrace/aclocal.m4
 libbacktrace/configure: libbacktrace/configure.ac libbacktrace/aclocal.m4
 libbacktrace/config.h.in: libbacktrace/configure.ac libbacktrace/aclocal.m4
+libphobos/Makefile.in: libphobos/Makefile.am libphobos/configure.ac libphobos/aclocal.m4
+libphobos/aclocal.m4: libphobos/configure.ac libphobos/acinclude.m4
+libphobos/config.h.in: libphobos/configure.ac libphobos/aclocal.m4
+libphobos/configure: libphobos/configure.ac libphobos/aclocal.m4
+libphobos/src/Makefile.in: libphobos/src/Makefile.am libphobos/aclocal.m4
+libphobos/testsuite/Makefile.in: libphobos/testsuite/Makefile.am libphobos/aclocal.m4
 # Top level
 Makefile.in: Makefile.tpl Makefile.def
 configure: configure.ac config/acx.m4


Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses

2018-11-14 Thread Umesh Kalappa
My bad Marek and thank you for pointing that out.

Please find the attached correct one (pr52869.patch) .

~Umesh


pr52869.patch
Description: Binary data


Re: Fix PR86575

2018-11-14 Thread Michael Matz
Hi,

On Wed, 14 Nov 2018, Marek Polacek wrote:

> >  static gimple *
> >  collect_fallthrough_labels (gimple_stmt_iterator *gsi_p,
> > -   auto_vec  *labels)
> > +   auto_vec  *labels,
> > +   location_t *prevloc)
> >  {
> >gimple *prev = NULL;
> 
> Looks good, thanks, though PREVLOC should probably be described in the 
> comment.

Gah.  And I thought about exactly this shortly before committing, and then 
forgot :)  Fixed in svn.


Ciao,
Michael.


[PATCH 7/7][MSP430][TESTSUITE] Fix tests for msp430-elf large memory model

2018-11-14 Thread Jozef Lawrynowicz

Patch 7 fixes tests for msp430-elf in the large memory model.
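
As a worked example of the 991014-1.c adjustment: with the large memory model size_t has 20 value bits but occupies 4 bytes, so deriving the width from sizeof would give 30 instead of 18 as the shift count. A minimal sketch of the arithmetic, with a hypothetical helper name:

```c
#include <assert.h>

/* Hypothetical helper mirroring the bufsize expression in 991014-1.c:
   a quarter of the range addressable with SIZE_BITS bits, minus 256.  */
static long long
bufsize_for_bits (unsigned int size_bits)
{
  /* Keep the shift count valid for long long.  */
  assert (size_bits >= 2 && size_bits <= 62);
  return (1LL << (size_bits - 2)) - 256;
}
```

For a 20-bit size_t this gives 2^18 - 256 = 261888, which fits the address space, whereas the sizeof-based count of 32 bits would give 2^30 - 256.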

From 494465f13df814bf3daad5e330d2c7139f2db625 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Sat, 10 Nov 2018 16:08:44 +
Subject: [PATCH 7/7] [TESTSUITE] Fix tests for msp430-elf large memory model

2018-11-14  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/991014-1.c: Fix bufsize definition for
	msp430 large memory model.
	* gcc.dg/Walloca-1.c: Don't expect warning for msp430 large memory
	model.
	* gcc.dg/Walloca-2.c: Likewise.
	* gcc.dg/c99-const-expr-2.c: Define ZERO macro for msp430 large memory
	model.
	* gcc.dg/format/format.h: Prefix typedefs using __SIZE_TYPE__ and
	__PTRDIFF_TYPE__ with __extension__.
	* gcc.dg/lto/20081210-1_0.c: Always typedef uintptr_t as
	__UINTPTR_TYPE__.
	* gcc.dg/pr36227.c: Likewise.
	* gcc.dg/pr42611.c: Use __INTPTR_MAX__ as the maximum object size if
	size_t and ptr_t are the same size.
	* gcc.dg/pr78973.c: dg-warning XFAIL for int16 but not msp430 large
	memory model.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Update dg-warning
	directives for msp430 large memory model.
	* gcc.dg/tree-ssa/pr66449.c: Always use __INTPTR_TYPE__ when integer
	type equal in size to ptr_t is required.
	* gcc.dg/tree-ssa/ssa-dom-thread-8.c: Extend pointer size checking
	macro for msp430.
	* lib/target-supports.exp (check_effective_target_msp430_large_mem):
	New. 

---
 gcc/testsuite/gcc.c-torture/execute/991014-1.c |  7 -
 gcc/testsuite/gcc.dg/Walloca-1.c   |  4 +--
 gcc/testsuite/gcc.dg/Walloca-2.c   |  8 +++---
 gcc/testsuite/gcc.dg/c99-const-expr-2.c|  2 ++
 gcc/testsuite/gcc.dg/format/format.h   |  6 ++--
 gcc/testsuite/gcc.dg/lto/20081210-1_0.c|  8 +-
 gcc/testsuite/gcc.dg/pr36227.c | 10 +--
 gcc/testsuite/gcc.dg/pr42611.c |  3 +-
 gcc/testsuite/gcc.dg/pr78973.c |  2 +-
 .../gcc.dg/tree-ssa/builtin-sprintf-warn-3.c   | 32 +++---
 gcc/testsuite/gcc.dg/tree-ssa/pr66449.c|  8 ++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-8.c   |  8 +++---
 gcc/testsuite/lib/target-supports.exp  |  7 +
 13 files changed, 52 insertions(+), 53 deletions(-)

diff --git a/gcc/testsuite/gcc.c-torture/execute/991014-1.c b/gcc/testsuite/gcc.c-torture/execute/991014-1.c
index e0bcd6d..95e38ce 100644
--- a/gcc/testsuite/gcc.c-torture/execute/991014-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/991014-1.c
@@ -1,11 +1,16 @@
-
 typedef __SIZE_TYPE__ Size_t;
 
+#ifdef __MSP430X_LARGE__
+/* size_t is __int20, so 20 bits, for __MSP430X_LARGE__, but __SIZEOF_POINTER__
+   returns the bytesize which is 4.  */
+#define bufsize ((1L << (20 - 2))-256)
+#else  /* !__MSP430X_LARGE__ */
 #if __SIZEOF_LONG__ < __SIZEOF_POINTER__
 #define bufsize ((1LL << (8 * sizeof(Size_t) - 2))-256)
 #else
 #define bufsize ((1L << (8 * sizeof(Size_t) - 2))-256)
 #endif
+#endif
 
 struct huge_struct
 {
diff --git a/gcc/testsuite/gcc.dg/Walloca-1.c b/gcc/testsuite/gcc.dg/Walloca-1.c
index 85e9160..c9a6c57 100644
--- a/gcc/testsuite/gcc.dg/Walloca-1.c
+++ b/gcc/testsuite/gcc.dg/Walloca-1.c
@@ -24,8 +24,8 @@ void foo1 (size_t len, size_t len2, size_t len3)
   char *s = alloca (123);
   useit (s);			// OK, constant argument to alloca
 
-  s = alloca (num);		// { dg-warning "large due to conversion" "" { target lp64 } }
-  // { dg-warning "unbounded use of 'alloca'" "" { target { ! lp64 } } .-1 }
+  s = alloca (num);		// { dg-warning "large due to conversion" "" { target { { lp64 } || { msp430_large_mem } } } }
+  // { dg-warning "unbounded use of 'alloca'" "" { target { { ! lp64 } && { ! msp430_large_mem } } } .-1 }
   useit (s);
 
   s = alloca (3);		/* { dg-warning "is too large" } */
diff --git a/gcc/testsuite/gcc.dg/Walloca-2.c b/gcc/testsuite/gcc.dg/Walloca-2.c
index 766ff8d..446c811 100644
--- a/gcc/testsuite/gcc.dg/Walloca-2.c
+++ b/gcc/testsuite/gcc.dg/Walloca-2.c
@@ -13,7 +13,7 @@ g1 (int n)
 // 32-bit targets because VRP is not giving us any range info for
 // the argument to __builtin_alloca.  This should be fixed by the
 // upcoming range work.
-p = __builtin_alloca (n); // { dg-bogus "unbounded use of 'alloca'" "" { xfail { ! lp64 } } }
+p = __builtin_alloca (n); // { dg-bogus "unbounded use of 'alloca'" "" { xfail { { ! lp64 } && { ! msp430_large_mem } } } }
   else
 p = __builtin_malloc (n);
   f (p);
@@ -36,9 +36,9 @@ g3 (int n)
   void *p;
   if (n > 0 && n < 3000)
 {
-  p = __builtin_alloca (n); // { dg-warning "'alloca' may be too large" "" { target lp64} }
-  // { dg-message "note:.*argument may be as large as 2999" "note" { target lp64 } .-1 }
-  // { dg-warning "unbounded use of 'alloca'" "" { target { ! lp64 } } .-2 }
+  p = __builtin_alloca (n); // { dg-warning "'alloca' may be too large" "" { target { lp64 || msp430_large_mem } } }
+  // { dg-message "note:.*argument may be as

Re: Fix PR86575

2018-11-14 Thread Marek Polacek
On Wed, Nov 14, 2018 at 02:51:45PM +, Michael Matz wrote:
> Hi,
> 
> our warning code sometimes adds locations to statements which didn't have 
> them before, which can in turn lead to code changes (here only label 
> numbers change).  It seems better to not do that from warning code, and 
> here it's easy to do: just return the location we want to use for 
> warnings, don't change it in the statement itself.
> 
> Regstrapped on x86-64, okay for trunk?
> 
> 
> Ciao,
> Michael.
> 
>   PR middle-end/86575
>   * gimplify.c (collect_fallthrough_labels): Add new argument,
>   return location via that, don't modify statements.
>   (warn_implicit_fallthrough_r): Adjust call, don't use
>   statement location directly.
> 
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 509fc2f3f5be..22dff0e546c9 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -1938,10 +1938,12 @@ last_stmt_in_scope (gimple *stmt)
>  
>  static gimple *
>  collect_fallthrough_labels (gimple_stmt_iterator *gsi_p,
> - auto_vec  *labels)
> + auto_vec  *labels,
> + location_t *prevloc)
>  {
>gimple *prev = NULL;

Looks good, thanks, though PREVLOC should probably be described in the comment.

Marek
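
The pattern under discussion, reporting a location through an out-parameter instead of writing it into the statement, can be sketched in plain C. The types and the function below are hypothetical simplifications, not the gimplify.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified statement: 0 stands for "no location".  */
struct stmt
{
  int location;
  const struct stmt *next;
};

/* Walk the chain and return the last statement.  The location to use
   for a warning is reported through *PREVLOC: the location of the most
   recent statement that has one.  No statement is modified.  */
static const struct stmt *
last_stmt_with_loc (const struct stmt *first, int *prevloc)
{
  const struct stmt *prev = NULL;

  assert (prevloc != NULL);
  *prevloc = 0;
  for (const struct stmt *s = first; s != NULL; s = s->next)
    {
      if (s->location != 0)
	*prevloc = s->location;
      prev = s;
    }
  return prev;
}
```

Because the statements are only read, emitting or suppressing a warning cannot change what later passes see.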


[PATCH 6/7][MSP430][TESTSUITE] Fix tests requiring float printf support when GCC was configured with --enable-newlib-nano-formatted-io

2018-11-14 Thread Jozef Lawrynowicz

Patch 6 fixes tests expecting printf float support for targets which have been
configured with "newlib-nano-formatted-io". When newlib is configured in this
way, float printf is enabled at link time by registering _printf_float as an
undefined symbol.
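
For illustration, this is the kind of code affected: anything that formats a floating-point value through the printf family. The helper name is hypothetical. On a newlib-nano configuration such a program would need -Wl,-u,_printf_float on the link line for "%f" to be formatted; on other C libraries no extra flag is involved.

```c
#include <assert.h>
#include <stdio.h>

/* Format VALUE with "%f" into BUF.  Returns the number of characters
   the full output needs, as snprintf does.  */
static int
format_double (char *buf, size_t size, double value)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "%f", value);
}
```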

From 15a04e0139ec40196ddb79f1125635029dccae68 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Sat, 10 Nov 2018 16:02:25 +
Subject: [PATCH 6/7] [TESTSUITE] Fix tests requiring float printf support when
 GCC was configured with --enable-newlib-nano-formatted-io

2018-11-14  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* lib/target-supports.exp (check_effective_target_newlib_nano_io): New. 
	* gcc.c-torture/execute/920501-8.c: Register undefined linker symbol
	_printf_float for newlib_nano_io target.
	* gcc.c-torture/execute/930513-1.c: Likewise.
	* gcc.dg/torture/builtin-sprintf.c: Likewise.
	* gcc.c-torture/execute/ieee/920810-1.x: New.
---
 gcc/testsuite/gcc.c-torture/execute/920501-8.c  | 2 ++
 gcc/testsuite/gcc.c-torture/execute/930513-1.c  | 2 ++
 gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x | 4 
 gcc/testsuite/gcc.dg/torture/builtin-sprintf.c  | 3 ++-
 gcc/testsuite/lib/target-supports.exp   | 4 
 5 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x

diff --git a/gcc/testsuite/gcc.c-torture/execute/920501-8.c b/gcc/testsuite/gcc.c-torture/execute/920501-8.c
index 62780a0..7e4fa17 100644
--- a/gcc/testsuite/gcc.c-torture/execute/920501-8.c
+++ b/gcc/testsuite/gcc.c-torture/execute/920501-8.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */
+
 #include 
 #include 
 
diff --git a/gcc/testsuite/gcc.c-torture/execute/930513-1.c b/gcc/testsuite/gcc.c-torture/execute/930513-1.c
index 4544471..f163007 100644
--- a/gcc/testsuite/gcc.c-torture/execute/930513-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/930513-1.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */
+
 #include 
 char buf[2];
 
diff --git a/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x b/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x
new file mode 100644
index 000..8edec730
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x
@@ -0,0 +1,4 @@
+if { [check_effective_target_newlib_nano_io] } {
+lappend additional_flags "-Wl,-u,_printf_float"
+}
+return 0
diff --git a/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c b/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c
index 6f8b7a9..5684fd7 100644
--- a/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c
+++ b/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c
@@ -1,6 +1,7 @@
 /* PR tree-optimization/86274 - SEGFAULT when logging std::to_string(NAN)
{ dg-do run }
-   { dg-options "-O2 -Wall" } */
+   { dg-options "-O2 -Wall" }
+   { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */
 
 #define X        "0xdeadbeef"
 #define nan(x)   __builtin_nan (x)
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 5235d5e..ced1582 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8946,3 +8946,7 @@ proc check_effective_target_cet { } {
 	}
 } "-O2" ]
 }
+
+proc check_effective_target_newlib_nano_io { } {
+return [check_configured_with "--enable-newlib-nano-formatted-io"]
+}
-- 
2.7.4



[PATCH 5/7][MSP430][TESTSUITE] Prune messages about ISO C not supporting __int20 from output of tests

2018-11-14 Thread Jozef Lawrynowicz

Patch 5 deals with ISO C errors emitted by tests when the large memory model is
used. size_t and ptrdiff_t are __int20 with -mlarge, and if the test is
compiled with -pedantic-errors and -std=* or -ansi, then use of these types
causes an error of the form:
  ISO C does not support __int20 types
I fixed this by adding dg-prune-output directives to tests which cause this
error.

Alternatively, I considered adding typedefs preceded by __extension__ to fix
these errors, but in many cases __SIZE_TYPE__ is used directly, so replacing all
of these with a new typedef'd type changes the code in more places, in some cases
changing the offset for dg-warning or dg-error directives. Changing the line
numbers for dg-warning/dg-error adds further manual steps to comparing
test results, and as these are generic tests I wanted to minimize the effect on
the test results for other targets.
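
For comparison, the rejected alternative would look roughly like this. It is a sketch only; the typedef and function names are hypothetical, and the code relies on the GCC-specific __extension__ keyword and __SIZE_TYPE__ macro.

```c
#include <assert.h>

/* GCC-specific: __extension__ suppresses pedantic diagnostics for the
   declaration it prefixes.  The typedef name is hypothetical.  */
__extension__ typedef __SIZE_TYPE__ test_size_t;

/* Uses the typedef instead of spelling __SIZE_TYPE__ directly.  */
static test_size_t
array_bytes (test_size_t count, test_size_t elem_size)
{
  assert (elem_size != 0);
  return count * elem_size;
}
```

Every direct use of __SIZE_TYPE__ in a test would have to be rewritten in terms of such a typedef, which is the churn the dg-prune-output approach avoids.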

From ed24754b1d97992400bb374916d87cce151f7e89 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Sat, 10 Nov 2018 15:47:21 +
Subject: [PATCH 5/7] [TESTSUITE] Prune messages about ISO C not supporting
 __int20 from output of tests

2018-11-14  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* gcc.dg/addr_builtin-1.c: Prune ISO C does not support __int20
	message from output.
	* gcc.dg/c11-static-assert-3.c: Likewise.
	* gcc.dg/c11-uni-string-1.c: Likewise.
	* gcc.dg/c99-const-expr-10.c: Likewise.
	* gcc.dg/c99-const-expr-6.c: Likewise.
	* gcc.dg/c99-const-expr-9.c: Likewise.
	* gcc.dg/c99-init-1.c: Likewise.
	* gcc.dg/c99-stdint-5.c: Likewise.
	* gcc.dg/c99-stdint-6.c: Likewise.
	* gcc.dg/pr52549.c: Likewise.
	* gcc.dg/pr61240.c: Likewise.
	* gcc.dg/pr71558.c: Likewise.
	* gcc.dg/pr77587.c: Likewise.
	* gcc.dg/pr79223.c: Likewise.
	* gcc.dg/vla-11.c: Likewise.
	* gcc.dg/vla-9.c: Likewise.

---
 gcc/testsuite/gcc.dg/addr_builtin-1.c  | 3 ++-
 gcc/testsuite/gcc.dg/c11-static-assert-3.c | 1 +
 gcc/testsuite/gcc.dg/c11-uni-string-1.c| 1 +
 gcc/testsuite/gcc.dg/c99-const-expr-10.c   | 1 +
 gcc/testsuite/gcc.dg/c99-const-expr-6.c| 1 +
 gcc/testsuite/gcc.dg/c99-const-expr-9.c| 1 +
 gcc/testsuite/gcc.dg/c99-init-1.c  | 1 +
 gcc/testsuite/gcc.dg/c99-stdint-5.c| 1 +
 gcc/testsuite/gcc.dg/c99-stdint-6.c| 1 +
 gcc/testsuite/gcc.dg/pr52549.c | 1 +
 gcc/testsuite/gcc.dg/pr61240.c | 1 +
 gcc/testsuite/gcc.dg/pr71558.c | 1 +
 gcc/testsuite/gcc.dg/pr77587.c | 1 +
 gcc/testsuite/gcc.dg/pr79223.c | 3 ++-
 gcc/testsuite/gcc.dg/vla-11.c  | 1 +
 gcc/testsuite/gcc.dg/vla-9.c   | 1 +
 16 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/addr_builtin-1.c b/gcc/testsuite/gcc.dg/addr_builtin-1.c
index 4a0888a..7d91c62 100644
--- a/gcc/testsuite/gcc.dg/addr_builtin-1.c
+++ b/gcc/testsuite/gcc.dg/addr_builtin-1.c
@@ -1,5 +1,6 @@
 /* PR66516 - missing diagnostic on taking the address of a builtin function
-   { dg-do compile }  */
+   { dg-do compile }
+   { dg-prune-output "ISO C does not support.*__int20" }  */
 
 typedef void (F)(void);
 typedef __UINTPTR_TYPE__ uintptr_t;
diff --git a/gcc/testsuite/gcc.dg/c11-static-assert-3.c b/gcc/testsuite/gcc.dg/c11-static-assert-3.c
index 9799b97..ea369e9 100644
--- a/gcc/testsuite/gcc.dg/c11-static-assert-3.c
+++ b/gcc/testsuite/gcc.dg/c11-static-assert-3.c
@@ -1,6 +1,7 @@
 /* Test C11 static assertions.  Invalid assertions.  */
 /* { dg-do compile } */
 /* { dg-options "-std=c11 -pedantic-errors" } */
+/* { dg-prune-output "ISO C does not support.*__int20" } */
 
 _Static_assert (__INT_MAX__ * 2, "overflow"); /* { dg-warning "integer overflow in expression" } */
 /* { dg-error "overflow in constant expression" "error" { target *-*-* } .-1 } */
diff --git a/gcc/testsuite/gcc.dg/c11-uni-string-1.c b/gcc/testsuite/gcc.dg/c11-uni-string-1.c
index 9f86bea..9b47f6a6 100644
--- a/gcc/testsuite/gcc.dg/c11-uni-string-1.c
+++ b/gcc/testsuite/gcc.dg/c11-uni-string-1.c
@@ -1,6 +1,7 @@
 /* Test Unicode strings in C11.  Test valid code.  */
 /* { dg-do run } */
 /* { dg-options "-std=c11 -pedantic-errors" } */
+/* { dg-prune-output "ISO C does not support.*__int20" } */
 
 /* More thorough tests are in c-c++-common/raw-string-*.c; this test
verifies the particular subset (Unicode but not raw strings) that
diff --git a/gcc/testsuite/gcc.dg/c99-const-expr-10.c b/gcc/testsuite/gcc.dg/c99-const-expr-10.c
index 2aca610..bb5af01 100644
--- a/gcc/testsuite/gcc.dg/c99-const-expr-10.c
+++ b/gcc/testsuite/gcc.dg/c99-const-expr-10.c
@@ -4,6 +4,7 @@
 /* Origin: Joseph Myers  */
 /* { dg-do compile } */
 /* { dg-options "-std=iso9899:1999 -pedantic-errors" } */
+/* { dg-prune-output "ISO C does not support.*__int20" } */
 
 void *p = (__SIZE_TYPE__)(void *)0; /* { dg-error "without a cast" } */
 struct s { void *a; } q = { (__SIZE_TYPE__)(void *)0 }; /* { dg-error "without a cast|near initialization" } */
diff --git a/gcc/testsuite/gcc.dg/c99-const-expr-6.c b/gcc/testsuite/gcc.dg/c99-con

[PATCH 4/7][MSP430][TESTSUITE] Fix tests when int is 16-bit by default

2018-11-14 Thread Jozef Lawrynowicz

Patch 4 fixes tests when int is 16-bit by default.
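
As an illustration of the kind of change involved, here is a sketch in the style of the rotate-8.c adjustment: the constant is chosen by the width of int so that it still fits in unsigned int on a 16-bit-int target. The function name is hypothetical; __SIZEOF_INT__ is the GCC predefined macro used in the patch.

```c
#include <assert.h>
#include <limits.h>

/* Pick a constant that fits in unsigned int on 16-bit-int targets,
   as the rotate-8.c change does.  */
#if __SIZEOF_INT__ == 2
#define LARGE_UNSIGNED 0x1234U
#else
#define LARGE_UNSIGNED 0x12345678U
#endif

/* Rotate X left by Y bits.  Both shift counts are masked to the
   width of unsigned int, so neither shift is out of range.  */
static unsigned int
rotate_left (unsigned int x, unsigned int y)
{
  const unsigned int mask = CHAR_BIT * sizeof (unsigned int) - 1;

  /* The masking trick needs the width to be a power of two.  */
  assert ((mask & (mask + 1)) == 0);
  return (x << (y & mask)) | (x >> (-y & mask));
}
```

With a 32-bit int, rotate_left (0x12345678U, 8) yields 0x34567812U; with a 16-bit int the 32-bit literal would have type unsigned long and no longer test rotation of unsigned int, which is why the smaller constant is selected.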

From 62b273f73cd7a4db22b1161f450ae7267d185890 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 8 Nov 2018 23:09:38 +
Subject: [PATCH 4/7] [TESTSUITE] Fix tests when int is 16-bit by default

2018-11-14  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* c-c++-common/Warray-bounds-3.c (test_strcpy_bounds): Use long instead
	of int if __SIZEOF_INT__ == 2.
	* c-c++-common/Wrestrict.c: Test memcpy range with smaller length when
	__SIZEOF_SIZE_T__ < 4.
	* c-c++-common/rotate-8.c: Define smaller "large" constant when
	__SIZEOF_INT__ == 2.
	* gcc.dg/pr53037-1.c: Add dg-require-effective-target int32.
	* gcc.dg/pr53037-2.c: Likewise.
	* gcc.dg/pr53037-3.c: Likewise.
	* gcc.dg/pr85512.c: Likewise.
	* gcc.dg/pr59963-2.c: Add dg-warning for int16.
	* gcc.dg/sancov/cmp0.c: Explicitly use __INT32_TYPE__ instead of int.
	* gcc.dg/tree-ssa/addadd.c: Fix dg-final directives for int16.
	* gcc.dg/tree-ssa/pr79327-2.c: Likewise.
	* gcc.dg/tree-ssa/builtin-sprintf-2.c: Filter out invalid tests for
	int16.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-10.c: Update sizes in dg-warning
	directives for int16.

---
 gcc/testsuite/c-c++-common/Warray-bounds-3.c   |  4 +
 gcc/testsuite/c-c++-common/Wrestrict.c |  5 ++
 gcc/testsuite/c-c++-common/rotate-8.c  | 14 +++-
 gcc/testsuite/gcc.dg/pr53037-1.c   |  2 +-
 gcc/testsuite/gcc.dg/pr53037-2.c   |  2 +-
 gcc/testsuite/gcc.dg/pr53037-3.c   |  2 +-
 gcc/testsuite/gcc.dg/pr59963-2.c   |  1 +
 gcc/testsuite/gcc.dg/pr85512.c |  1 +
 gcc/testsuite/gcc.dg/sancov/cmp0.c | 14 +++-
 gcc/testsuite/gcc.dg/tree-ssa/addadd.c |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-2.c  |  9 ++-
 .../gcc.dg/tree-ssa/builtin-sprintf-warn-10.c  | 94 +++---
 gcc/testsuite/gcc.dg/tree-ssa/pr79327-2.c  |  5 +-
 13 files changed, 96 insertions(+), 61 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/Warray-bounds-3.c b/gcc/testsuite/c-c++-common/Warray-bounds-3.c
index 2ee8146..e49d44ad 100644
--- a/gcc/testsuite/c-c++-common/Warray-bounds-3.c
+++ b/gcc/testsuite/c-c++-common/Warray-bounds-3.c
@@ -326,7 +326,11 @@ void test_strcpy_bounds (char *d, const char *s)
 
 struct MA
 {
+#if __SIZEOF_INT__ == 2
+  long i;
+#else
   int i;
+#endif
   char a5[5];
   char a11[11];
 };
diff --git a/gcc/testsuite/c-c++-common/Wrestrict.c b/gcc/testsuite/c-c++-common/Wrestrict.c
index 36a1ffa..efd72ef 100644
--- a/gcc/testsuite/c-c++-common/Wrestrict.c
+++ b/gcc/testsuite/c-c++-common/Wrestrict.c
@@ -262,8 +262,13 @@ void test_memcpy_range (char *d, size_t sz)
   {
 /* Create an offset in the range [0, -1].  */
 size_t o = sz << 1;
+#if __SIZEOF_SIZE_T__ < 4
+T (d, d + o, 1234);
+T (d + o, d, 2345);
+#else
 T (d, d + o, 12345);
 T (d + o, d, 23456);
+#endif
   }
 
   /* Exercise memcpy with both destination and source pointer offsets
diff --git a/gcc/testsuite/c-c++-common/rotate-8.c b/gcc/testsuite/c-c++-common/rotate-8.c
index 9ba3e94..f27634a 100644
--- a/gcc/testsuite/c-c++-common/rotate-8.c
+++ b/gcc/testsuite/c-c++-common/rotate-8.c
@@ -5,6 +5,12 @@
 /* { dg-final { scan-tree-dump-times "r\[<>]\[<>]" 23 "optimized" } } */
 /* { dg-final { scan-tree-dump-not "PHI <" "optimized" } } */
 
+#if __SIZEOF_INT__ == 2
+#define LARGE_UNSIGNED 0x1234U
+#else
+#define LARGE_UNSIGNED 0x12345678U
+#endif
+
 unsigned int
 f1 (unsigned int x, unsigned char y)
 {
@@ -60,25 +66,25 @@ f8 (unsigned int x, unsigned char y)
 unsigned int
 f9 (unsigned int x, int y)
 {
-  return (0x12345678U << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (0x12345678U >> (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+  return (LARGE_UNSIGNED << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (LARGE_UNSIGNED >> (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
 }
 
 unsigned int
 f10 (unsigned int x, int y)
 {
-  return (0x12345678U >> (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (0x12345678U << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+  return (LARGE_UNSIGNED >> (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (LARGE_UNSIGNED << (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
 }
 
 unsigned int
 f11 (unsigned int x, int y)
 {
-  return (0x12345678U >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (0x12345678U << (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+  return (LARGE_UNSIGNED >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (LARGE_UNSIGNED << (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
 }
 
 unsigned int
 f12 (unsigned int x, int y)
 {
-  return (0x12345678U << (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (0x12345678U >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
+  return (LARGE_UNSIGNED << (-y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1))) | (LARGE_UNSIGNED >> (y & (__CHAR_BIT__ * __SIZEOF_INT__ - 1)));
 }
 
 unsigned
diff --git a/gcc/testsuite/gcc.dg/pr53037-1.c b/gcc/testsuite/gcc.dg/pr53037-1.c
index ce0

Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-14 Thread Alexander Monakov
On Wed, 14 Nov 2018, Segher Boessenkool wrote:
> > I think with "=g" rather than "+g" this example is ok.
> 
> No, it needs the register var as an input.  That is the whole *point*.

Hm. I think I see what you meant, but "+g" is not correct either: the
asm, by intent, depends *on the current value in the 'sp' hardreg*, not
*on the current value of some automatic variable that is supposed to be
passed on the 'sp' hardreg to the asm* (which is what expressed by the
input constraint).

Consider what would happen in the scenario demonstrated in PR 87984:
suppose you have (e.g. after inlining 'retsp' in a loop):

  for (int i=0; i<2; i++)
{
  register long sp asm ("%rsp");
  asm ("" : "+r" (sp));
  
}

and then after unrolling

  register long sp asm ("%rsp");
  asm ("" : "+r" (sp));
  
  asm ("" : "+r" (sp));
  

where only the first asm has an uninitialized input, and the second asm
implies restoring hardreg %rsp to the value in variable sp.

So at a minimum you'd need to use two separate register variables:

  register long sp_in asm ("%rsp");
  register long sp asm ("%rsp");
  asm ("" : "=r" (sp) : "r" (sp_in));

Alexander


[PATCH 3/7][MSP430][TESTSUITE] Dynamically check if size_t is large enough for tests containing large structs/arrays

2018-11-14 Thread Jozef Lawrynowicz

Patch 3 sets up require-effective-target directives for tests which
require the compilation of large arrays.
Targets which have 16-bit or 20-bit size_t fail to compile tests with large
arrays designed to test 32-bit or 64-bit behaviour. Rather than enumerating
another target to skip, I've replaced the target selector in some tests with
a size checking procedure:
- size20plus (new)
- size32plus
size20plus checks to see if a 16-bit structure/array size is supported,
similarly to how the existing size32plus checks to see if a 24-bit
structure/array size is supported.
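
The property these selectors test can be approximated in portable C. This is a sketch under the assumption that the checks amount to asking whether an object size needing more than 16 (size20plus) or more than 24 (size32plus) bits is representable; the helper name is hypothetical and the real checks are Tcl procedures that try to compile a large array.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if size_t can hold a value that needs more than BITS
   bits, 0 otherwise.  */
static int
size_exceeds_bits (unsigned int bits)
{
  /* Shifting by the full width or more would be undefined.  */
  if (bits >= CHAR_BIT * sizeof (size_t))
    return 0;
  return (SIZE_MAX >> bits) != 0;
}
```

A target with a 16-bit size_t gives 0 for both 16 and 24, one with a 20-bit size_t would give 1 for 16 and 0 for 24, and typical 32-bit and 64-bit hosts give 1 for both.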

From 23ab77f7e44e104595adb0b5cabd9caf93141ffd Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 8 Nov 2018 22:39:12 +
Subject: [PATCH 3/7] [TESTSUITE] Dynamically check if size_t is large enough
 for tests containing large structs/arrays

2018-11-14  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* gcc.c-torture/compile/20151204.c: Add dg-require-effective-target
	size20plus.
	* gcc.dg/pr34225.c: Likewise.
	* gcc.dg/pr40971.c: Likewise.
	* gcc.dg/pr69071.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-10.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-2.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-3.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-5.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-6.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-7.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-8.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-9.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-11.c: Add dg-require-effective-target
	size32plus.
	* gcc.dg/Walloc-size-larger-than-4.c: Likewise.
	* gcc.dg/Walloc-size-larger-than-5.c: Likewise.
	* gcc.dg/Walloc-size-larger-than-6.c: Likewise.
	* gcc.dg/Walloc-size-larger-than-7.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-1.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-1b.c: Likewise.
	* lib/target-supports.exp (check_effective_target_size20plus): New.
	(check_effective_target_size32plus): Update comment. 

---
 gcc/testsuite/gcc.c-torture/compile/20151204.c  |  2 +-
 gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c|  2 +-
 gcc/testsuite/gcc.dg/Walloc-size-larger-than-5.c|  2 +-
 gcc/testsuite/gcc.dg/Walloc-size-larger-than-6.c|  2 +-
 gcc/testsuite/gcc.dg/Walloc-size-larger-than-7.c|  2 +-
 gcc/testsuite/gcc.dg/pr34225.c  |  1 +
 gcc/testsuite/gcc.dg/pr40971.c  |  1 +
 gcc/testsuite/gcc.dg/pr69071.c  |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-1.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-10.c |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-11.c |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-1b.c |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-2.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-3.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-5.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-6.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-7.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-8.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-9.c  |  3 ++-
 gcc/testsuite/lib/target-supports.exp   | 18 +++---
 20 files changed, 45 insertions(+), 20 deletions(-)

diff --git a/gcc/testsuite/gcc.c-torture/compile/20151204.c b/gcc/testsuite/gcc.c-torture/compile/20151204.c
index 6a46abf..e41f6c1 100644
--- a/gcc/testsuite/gcc.c-torture/compile/20151204.c
+++ b/gcc/testsuite/gcc.c-torture/compile/20151204.c
@@ -1,4 +1,4 @@
-/* { dg-skip-if "Array too big" { "avr-*-*" "pdp11-*-*" } } */
+/* { dg-require-effective-target size20plus } */
 
 typedef __SIZE_TYPE__ size_t;
 
diff --git a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c b/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c
index 4b3a64b..54e43cd 100644
--- a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c
+++ b/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c
@@ -1,6 +1,6 @@
 /* PR middle-end/82063 - issues with arguments enabled by -Wall
{ dg-do compile }
-   { dg-skip-if "small address space" { "pdp11-*-*" } }
+   { dg-require-effective-target size32plus }
{ dg-options "-O -Walloc-size-larger-than=1MiB -ftrack-macro-expansion=0" } */
 
 void sink (void*);
diff --git a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-5.c b/gcc/testsuite/gcc.dg/Walloc-size-larger-than-5.c
index 4217ad6..774c4cf 100644
--- a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-5.c
+++ b/gcc/testsuite/gcc.dg/Walloc-size-larger-than-5.c
@@ -1,6 +1,6 @@
 /* PR middle-end/82063 - issues with arguments enabled by -Wall
{ dg-do compile }
-   { dg-skip-if "small address space" { "pdp11-*-*" } }
+   { dg-require-effective-target size32plus }
{ dg-options "-O -Walloc-size-larger-than=1MB -ftrack-macro-expansion=0" } */
 
 void sink (void*);
diff --git a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-6.c b/gcc/testsuite/gcc.dg/Walloc-size-larger-than-6.c
index a46fce7..2dfc663 100644
--- a/gcc/testsuite/gcc.dg/Wall

RE: [PATCH 1/9][GCC][AArch64][middle-end] Implement SLP recognizer for Complex addition with rotate and complex MLA with rotation

2018-11-14 Thread Tamar Christina
Hi Richard,

Thanks for the feedback, I've replied inline below.
I'll wait for your answers before making changes.

> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, November 14, 2018 12:21
> To: Tamar Christina 
> Cc: GCC Patches ; nd ; Richard
> Guenther ; Zdenek Dvorak ; Richard
> Earnshaw ; James Greenhalgh
> ; Marcus Shawcroft
> 
> Subject: Re: [PATCH 1/9][GCC][AArch64][middle-end] Implement SLP
> recognizer for Complex addition with rotate and complex MLA with rotation
> 
> On Sun, Nov 11, 2018 at 11:26 AM Tamar Christina
>  wrote:
> >
> > Hi All,
> >
> > This patch adds support for SLP vectorization of Complex number
> > arithmetic with rotations along the Argand plane.
> >
> > For this to work it has to recognize two statements in parallel as it
> > needs to match against operations towards both the real and imaginary
> > numbers.  The instructions generated by this change are also only
> > available in their vector forms.  As such I add them as pattern statements
> > to the stmt.  The BB is left untouched and so the scalar loop is untouched.
> >
> > The instructions also require the loads to be contiguous and so when a
> > match is made, and the code decides it is able to do the replacement
> > it re-organizes the SLP tree such that the loads are contiguous.
> > Since this operation cannot be undone it only does this if it's sure that
> > the resulting loads can be made contiguous.
> >
> > It gets this guarantee by only allowing the replacement if between the
> > matched expression and the loads there are no other expressions it
> > doesn't know aside from type casts.
> >
> > When a match occurs over multiple expressions, the dead statements are
> > immediately removed from the tree to prevent verification failures later.
> >
> > Because the pattern matching is done after SLP analysis has analyzed
> > the usage of the instruction it also marks the instructions as used and the
> > old ones as unused.
> >
> > When a replacement is done a new internal function is generated which
> > the back-end has to expand to the proper instruction sequences.
> >
> > For now, this patch only adds support for Complex addition with rotate
> > and Complex FMLA with rotation of 0 and 180. However, the intention is
> > to add support for Complex subtraction and Complex multiplication in
> > the future.
> >
> > Concretely, this generates
> >
> > ldr q1, [x1, x3]
> > ldr q2, [x0, x3]
> > ldr q0, [x2, x3]
> > fcmla   v0.2d, v1.2d, v2.2d, #180
> > fcmla   v0.2d, v1.2d, v2.2d, #90
> > str q0, [x2, x3]
> > add x3, x3, 16
> > cmp x3, 3200
> > bne .L2
> > ret
> >
> > now instead of
> >
> > add x3, x2, 31
> > sub x4, x3, x1
> > sub x3, x3, x0
> > cmp x4, 62
> > mov x4, 62
> > ccmp    x3, x4, 0, hi
> > bls .L5
> > mov x3, x0
> > mov x0, x1
> > add x1, x2, 3200
> > .p2align 3,,7
> > .L3:
> > ld2 {v16.2d - v17.2d}, [x2]
> > ld2 {v2.2d - v3.2d}, [x3], 32
> > ld2 {v0.2d - v1.2d}, [x0], 32
> > mov v7.16b, v17.16b
> > fmul    v6.2d, v0.2d, v3.2d
> > fmla    v7.2d, v1.2d, v3.2d
> > fmla    v6.2d, v1.2d, v2.2d
> > fmls    v7.2d, v2.2d, v0.2d
> > fadd    v4.2d, v6.2d, v16.2d
> > mov v5.16b, v7.16b
> > st2 {v4.2d - v5.2d}, [x2], 32
> > cmp x2, x1
> > bne .L3
> > ret
> > .L5:
> > add x4, x2, 8
> > add x6, x0, 8
> > add x5, x1, 8
> > mov x3, 0
> > .p2align 3,,7
> > .L2:
> > ldr d1, [x6, x3]
> > ldr d4, [x1, x3]
> > ldr d5, [x5, x3]
> > ldr d3, [x0, x3]
> > fmul    d2, d4, d1
> > ldr d0, [x4, x3]
> > fmadd   d0, d5, d1, d0
> > ldr d1, [x2, x3]
> > fmadd   d2, d5, d3, d2
> > fmsub   d0, d4, d3, d0
> > fadd    d1, d1, d2
> > str d1, [x2, x3]
> > str d0, [x4, x3]
> > add x3, x3, 16
> > cmp x3, 3200
> > bne .L2
> > ret
> >
> > Bootstrap and Regtests on aarch64-none-linux-gnu, arm-none-gnueabihf
> > and x86_64-pc-linux-gnu are still ongoing, but the previous patch showed no
> > regressions.
> >
> > The instructions have also been tested on aarch64-none-elf and
> > arm-none-eabi on a Armv8.3-a model and -march=Armv8.3-a+fp16 and all
> tests pass.
> >
> > Ok for trunk?
> 
> I first have a few high-level questions.  Complex addition when the complex
> values are in vectors looks trivial to me and maps to vector addition.  Your
> md.texi description of fcadd mentions a rotation 'm' but doesn't further
> explain the details - I suppose fcadd@var{m}@var{n}3 really means
> fcadd90@var{n}3 and fcadd270@var{n}3, thus the rotation b

RE: [PATCH 8/9][GCC][Arm] Add autovectorization support for complex multiplication and addition

2018-11-14 Thread Tamar Christina
Hi Richard,

> > Ok for trunk?
> 
> +;; The complex mla operations always need to expand to two instructions.
> +;; The first operation does half the computation and the second does the
> +;; remainder.  Because of this, expand early.
> +(define_expand "fcmla4"
> +  [(set (match_operand:VF 0 "register_operand")
> +   (plus:VF (match_operand:VF 1 "register_operand")
> +(unspec:VF [(match_operand:VF 2 "register_operand")
> +(match_operand:VF 3 "register_operand")]
> +VCMLA)))]
> +  "TARGET_COMPLEX"
> +{
> +  emit_insn (gen_neon_vcmla (operands[0], operands[1],
> +                            operands[2], operands[3]));
> +  emit_insn (gen_neon_vcmla (operands[0], operands[0],
> +                            operands[2], operands[3]));
> +  DONE;
> +})
> 
> What's the two halves?  Why hide this from the vectorizer if you go down all
> to the detail and expose the rotation to it?
> 

The two halves are an implementation detail of the instruction in Armv8.3-a. As
far as the vectorizer is concerned, all you want to do is an FMA rotating one
of the operands by 0 or 180 degrees.

Also note that the "rotations" in these instructions aren't exactly the same as
what would fall under rotation of a complex number, as each instruction can
only do half of the final computation you want.

In the ISA these instructions have to be used in a pair, where the rotations
determine the operation you want to perform. E.g. a rotation of #0 followed by
#90 makes it a multiply and accumulate.

A rotation of #180 followed by #90 makes this a vector complex subtract, which
is intended to be used with the first call using a register cleared to 0 (it
essentially becomes an "FMS" if you don't clear the register).
Each "rotation" determines what operation is done and which parts of the
complex number are used. You change the "rotations" and the grouping of the
instructions to get different operations.

I did not expose this to the vectorizer as it seems very ISA specific.
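
To make the pairing concrete, here is a minimal scalar sketch of one complex
lane. It assumes the per-rotation behaviour as I read it from the Armv8.3-a
FCMLA description; the `fcmla_lane` helper, the `Rot` enum and `complex_mla`
are hypothetical names for illustration and are not part of the patch. It
shows that #0 followed by #90 on the same accumulator yields a full complex
multiply-accumulate:

```cpp
#include <complex>

// Hypothetical scalar model of one complex lane of FCMLA.  The names are
// illustrative only, and the per-rotation formulas are an assumption based
// on my reading of the Armv8.3-a documentation.
enum class Rot { R0, R90, R180, R270 };

using cplx = std::complex<double>;

// Each call performs only half of a complex multiply-accumulate.
cplx fcmla_lane (cplx acc, cplx a, cplx b, Rot rot)
{
  double re = acc.real (), im = acc.imag ();
  switch (rot)
    {
    case Rot::R0:
      re += a.real () * b.real (); im += a.real () * b.imag (); break;
    case Rot::R90:
      re -= a.imag () * b.imag (); im += a.imag () * b.real (); break;
    case Rot::R180:
      re -= a.real () * b.real (); im -= a.real () * b.imag (); break;
    case Rot::R270:
      re += a.imag () * b.imag (); im -= a.imag () * b.real (); break;
    }
  return cplx (re, im);
}

// #0 followed by #90 on the same accumulator computes acc + a * b.
cplx complex_mla (cplx acc, cplx a, cplx b)
{
  acc = fcmla_lane (acc, a, b, Rot::R0);
  return fcmla_lane (acc, a, b, Rot::R90);
}
```

Neither half on its own is a meaningful complex operation, which is why the
expander always emits the two instructions together.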

> +;; The vcadd and vcmla patterns are made UNSPEC for the explicitly due to the
> +;; fact that their usage need to guarantee that the source vectors are
> +;; contiguous.  It would be wrong to describe the operation without being able
> +;; to describe the permute that is also required, but even if that is done
> +;; the permute would have been created as a LOAD_LANES which means the values
> +;; in the registers are in the wrong order.
> 
> Hmm, it's totally non-obvious to me how this relates to loads or what a
> "non-contiguous" register would be?  That is, once you make this an unspec
> combine will never be able to synthesize this from intrinsics code that
> doesn't use this form.
> 
> +(define_insn "neon_vcadd"
> +  [(set (match_operand:VF 0 "register_operand" "=w")
> +   (unspec:VF [(match_operand:VF 1 "register_operand" "w")
> +   (match_operand:VF 2 "register_operand" "w")]
> +   VCADD))]
> 

Yes, that's my goal: if operand1 and operand2 are loaded by instructions that
would have permuted the values in the register, then the instruction doesn't
work.

The instruction does the permute itself, so it expects the values to have been
loaded using a simple load and not a LOAD_LANES. So I intend to prevent combine
from recognizing the operation for that reason.  For the ADD, combine can be
used, but then you'd have to match the load and store since you have to change
these; for the rest you'll run far afoul of combine's 5 instruction limit.

Kind Regards,
Tamar

> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 2018-11-11  Tamar Christina  
> >
> > * config/arm/arm.c (arm_arch8_3, arm_arch8_4): New.
> > * config/arm/arm.h (TARGET_COMPLEX, arm_arch8_3, arm_arch8_4): New.
> > (arm_option_reconfigure_globals): Use them.
> > * config/arm/iterators.md (VDF, VQ_HSF): New.
> > (VCADD, VCMLA): New.
> > (VF_constraint, rot, rotsplit1, rotsplit2): Add V4HF and V8HF.
> > * config/arm/neon.md (neon_vcadd, fcadd3,
> > neon_vcmla, fcmla4): New.
> > * config/arm/unspecs.md (UNSPEC_VCADD90, UNSPEC_VCADD270,
> > UNSPEC_VCMLA, UNSPEC_VCMLA90, UNSPEC_VCMLA180, UNSPEC_VCMLA270): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2018-11-11  Tamar Christina  
> >
> > * gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_1.c: Add Arm
> > support.
> > * gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_2.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_3.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_4.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_5.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_6.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_1.c: Likewise.
> > * gcc.

[PATCH 2/7][MSP430][TESTSUITE] Add path to libssp to the linker search path when checking for -fstack-protector support

2018-11-14 Thread Jozef Lawrynowicz

Patch 2 fixes issues finding libssp when linking tests or checking for
fstack_protector support.

From 6c6f34bae386a5f396e6f9630514fc7080c2f940 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Sun, 11 Nov 2018 14:30:32 +
Subject: [PATCH 2/7] [TESTSUITE] Add path to libssp to the linker search path
 when checking for -fstack-protector support

2018-11-14  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	lib/g++.exp (g++_link_flags): Append path to libssp to link flags.
	lib/gcc.exp (gcc_link_flags): New.
	(gcc_init): Append gcc_link_flags result to TEST_EXTRA_LIBS. 
	lib/target-supports.exp (check_effective_target_fstack_protector): Pass
	path to libssp as extra flags to check_runtime.
---
 gcc/testsuite/lib/g++.exp |  4 
 gcc/testsuite/lib/gcc.exp | 34 ++
 gcc/testsuite/lib/target-supports.exp |  2 +-
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/g++.exp b/gcc/testsuite/lib/g++.exp
index c0ffcdf..710bc9b 100644
--- a/gcc/testsuite/lib/g++.exp
+++ b/gcc/testsuite/lib/g++.exp
@@ -149,6 +149,10 @@ proc g++_link_flags { paths } {
 	  append flags "-B${gccpath}/libitm/ -L${gccpath}/libitm/.libs"
 	  append ld_library_path ":${gccpath}/libitm/.libs"
   }
+  if [file exists "${gccpath}/libssp/.libs/libssp.a"] {
+	  append flags "-L${gccpath}/libssp/.libs "
+	  append ld_library_path ":${gccpath}/libssp/.libs"
+  }
   append ld_library_path [gcc-set-multilib-library-path $GXX_UNDER_TEST]
 } else {
   global tool_root_dir
diff --git a/gcc/testsuite/lib/gcc.exp b/gcc/testsuite/lib/gcc.exp
index 61e995a..4c5c652 100644
--- a/gcc/testsuite/lib/gcc.exp
+++ b/gcc/testsuite/lib/gcc.exp
@@ -78,6 +78,37 @@ proc gcc_version { } {
 }
 
 #
+# gcc_link_flags -- provide gcc_link_flags, based on g++_link_flags
+# (originally from libgloss.exp) which knows about the gcc tree structure
+#
+
+proc gcc_link_flags { paths } {
+global ld_library_path
+
+set gccpath ${paths}
+set flags ""
+
+if { $gccpath != "" } {
+	if [file exists "${gccpath}/libssp/.libs/libssp.a"] {
+	append flags "-L${gccpath}/libssp/.libs "
+	append ld_library_path ":${gccpath}/libssp/.libs"
+	}
+} else {
+	global tool_root_dir
+
+	set libssp [lookfor_file ${tool_root_dir} libssp]
+	if { $libssp != "" } {
+	append flags "-L${libssp} "
+	append ld_library_path ":${libssp}"
+	}
+}
+
+set_ld_library_path_env_vars
+
+return "$flags"
+}
+
+#
 # gcc_init -- called at the start of each .exp script.
 #
 # There currently isn't much to do, but always using it allows us to
@@ -95,6 +126,7 @@ proc gcc_init { args } {
 global TOOL_EXECUTABLE
 global gcc_warning_prefix
 global gcc_error_prefix
+global TEST_EXTRA_LIBS
 
 if { $gcc_initialized == 1 } { return; }
 
@@ -114,6 +146,8 @@ proc gcc_init { args } {
 set gcc_error_prefix "(fatal )?error:"
 
 gcc_maybe_build_wrapper "${tmpdir}/gcc-testglue.o"
+
+append TEST_EXTRA_LIBS "[gcc_link_flags [get_multilibs]]"
 }
 
 #
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 4966e50..093b12a 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1062,7 +1062,7 @@ proc check_effective_target_fstack_protector {} {
 	  char buf[64];
 	  return !strcpy (buf, strrchr (argv[0], '/'));
 	}
-} "-fstack-protector"]
+} "-fstack-protector -B[get_multilibs]/libssp/ -L[get_multilibs]/libssp/.libs"]
 }
 
 # Return 1 if the target supports -fstack-check or -fstack-check=$stack_kind
-- 
2.7.4



[PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-14 Thread Jozef Lawrynowicz

Patch 1 tweaks dg directives in tests specifically for msp430. Many of
these are extensions to existing target selectors in dg directives.

From a730d945647923c5c10e8487ca3c2a24511abf3d Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 8 Nov 2018 18:55:57 +
Subject: [PATCH 1/7] [TESTSUITE][MSP430] Tweak dg-directives for msp430-elf

2018-11-14  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* c-c++-common/pr41779.c: Skip for msp430.
	* gcc.dg/Wno-frame-address.c: Likewise.
	* gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise.
	* gcc.dg/ifcvt-4.c: Likewise.
	* gcc.dg/pr34856.c: Likewise.
	* gcc.dg/pr84670-4.c: Likewise.
	* gcc.dg/pr85859.c: Likewise.
	* gcc.dg/builtin-apply2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-dse-26.c: Likewise.
	* c-c++-common/pr57371-2.c: XFAIL optimized dump scan for msp430.
	* c-c++-common/torture/builtin-arith-overflow-10.c: Increase timeout
	for msp430.
	* c-c++-common/torture/builtin-arith-overflow-p-10.c: Likewise.
	* gcc.c-torture/execute/arith-rand-ll.c: Likewise.
	* gcc.dg/attr-alloc_size-11.c: Remove dg-warning XFAIL for msp430.
	* gcc.dg/tree-ssa/20040204-1.c: Likewise.
	* gcc.dg/compat/struct-by-value-16a_x.c: Build at -O1 for msp430
	so it fits.
	* gcc.dg/lto/20091013-1_1.c: Add xfail-if for msp430.
	* gcc.dg/lto/20091013-1_2.c: Likewise.
	* gcc.dg/tree-ssa/loop-1.c: Fix expected dg-final behaviour for msp430.
	* gcc.dg/tree-ssa/gen-vect-25.c: Likewise.
	* gcc.dg/tree-ssa/gen-vect-11.c: Likewise.
	* gcc.dg/tree-ssa/loop-35.c: Likewise.
	* gcc.dg/tree-ssa/pr23455.c: Likewise.
	* gcc.dg/weak/typeof-2.c: Likewise.
	* gcc.target/msp430/interrupt_fn_placement.c: Skip for 430 ISA.
	* gcc.target/msp430/pr78818-data-region.c: Fix scan-assembler text.
	* gcc.target/msp430/pr79242.c: Don't skip for -msmall.
	* gcc.target/msp430/special-regs.c: Use "__asm__" instead of "asm".
	* lib/target-supports.exp
	(check_effective_target_logical_op_short_circuit): Add msp430.
---
 gcc/testsuite/c-c++-common/pr41779.c | 3 ++-
 gcc/testsuite/c-c++-common/pr57371-2.c   | 2 +-
 gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-10.c   | 1 +
 gcc/testsuite/c-c++-common/torture/builtin-arith-overflow-p-10.c | 1 +
 gcc/testsuite/gcc.c-torture/execute/arith-rand-ll.c  | 2 ++
 gcc/testsuite/gcc.dg/Wno-frame-address.c | 2 +-
 gcc/testsuite/gcc.dg/attr-alloc_size-11.c| 4 ++--
 gcc/testsuite/gcc.dg/builtin-apply2.c| 2 +-
 gcc/testsuite/gcc.dg/compat/struct-by-value-16a_x.c  | 2 ++
 gcc/testsuite/gcc.dg/ifcvt-4.c   | 2 +-
 gcc/testsuite/gcc.dg/lto/20091013-1_1.c  | 2 +-
 gcc/testsuite/gcc.dg/lto/20091013-1_2.c  | 2 +-
 gcc/testsuite/gcc.dg/pr34856.c   | 1 +
 gcc/testsuite/gcc.dg/pr84670-4.c | 1 +
 gcc/testsuite/gcc.dg/pr85859.c   | 1 +
 gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c| 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c   | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c  | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c  | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/loop-1.c   | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/loop-35.c  | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr23455.c  | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c   | 1 +
 gcc/testsuite/gcc.dg/weak/typeof-2.c | 2 ++
 gcc/testsuite/gcc.target/msp430/interrupt_fn_placement.c | 1 +
 gcc/testsuite/gcc.target/msp430/pr78818-data-region.c| 3 ++-
 gcc/testsuite/gcc.target/msp430/pr79242.c| 2 +-
 gcc/testsuite/gcc.target/msp430/special-regs.c   | 8 
 gcc/testsuite/lib/target-supports.exp| 1 +
 29 files changed, 42 insertions(+), 26 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/pr41779.c b/gcc/testsuite/c-c++-common/pr41779.c
index c42a0f5..4ecedec 100644
--- a/gcc/testsuite/c-c++-common/pr41779.c
+++ b/gcc/testsuite/c-c++-common/pr41779.c
@@ -1,6 +1,7 @@
 /* PR41779: Wconversion cannot see through real*integer promotions. */
 /* { dg-do compile } */
-/* { dg-skip-if "doubles are floats" { "avr-*-*" } } */
+/* { dg-skip-if "doubles are floats" { avr-*-* } } */
+/* { dg-skip-if "int is smaller than float" { msp430-*-* } } */
 /* { dg-options "-std=c99 -Wconversion" { target c } } */
 /* { dg-options "-Wconversion" { target c++ } } */
 /* { dg-require-effective-target large_double } */
diff --git a/gcc/testsuite/c-c++-common/pr57371-2.c b/gcc/testsuite/c-c++-common/pr57371-2.c
index d07cff3..9ff83eb 100644
--- a/gcc/testsuite/c-c++-common/pr57371-2.c
+++ b/gcc/testsuite/c-c++-common/pr57371-2.c
@@

Re: RFA: vectorizer patches 1/2 : WIDEN_MULT_PLUS support

2018-11-14 Thread Joern Wolfgang Rennecke



On 14/11/18 09:53, Richard Biener wrote:

WIDEN_MULT_PLUS is special on our target in that it creates double-sized
vectors.

Are there really double-size vectors or does the target simply produce
the output in two vectors?  Usually targets have WIDEN_MULT_PLUS_LOW/HIGH
or _EVEN/ODD split operations.  Or, like - what I now remember - for the
DOT_PROD_EXPR optab, the target already reduces element pairs of the
result vector (unspecified which ones) so the result vector is of the same
size as the inputs.

The output of widening multiply and widening multiply-add is stored
in two consecutive registers.  So, they can be used as separate
vectors, but you can't choose the register numbers independently.
OTOH, you can treat them together as a double-sized vector, but
without any extra alignment requirements over a single-sized vector.
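
A minimal scalar model of that layout, assuming 8 x 16-bit inputs and a pair
of 4 x 32-bit result registers.  The type and function names are hypothetical,
and which lanes land in which register of the pair is an assumption made only
for illustration:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V8HI = std::array<std::int16_t, 8>;   // one input vector register
using V4SI = std::array<std::int32_t, 4>;   // one result vector register

// The widened result occupies two consecutive registers, modelled here as a
// pair that can also be viewed as one double-sized vector.
struct RegPair { V4SI lo; V4SI hi; };

// acc += widen (a) * widen (b), element-wise (assumed lane assignment).
RegPair widen_mult_plus (RegPair acc, const V8HI &a, const V8HI &b)
{
  for (std::size_t i = 0; i < 4; ++i)
    {
      acc.lo[i] += std::int32_t (a[i]) * std::int32_t (b[i]);
      acc.hi[i] += std::int32_t (a[i + 4]) * std::int32_t (b[i + 4]);
    }
  return acc;
}

// Final reduction, done once after the loop: add the lanes of both registers.
std::int32_t reduce (const RegPair &acc)
{
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < 4; ++i)
    sum += acc.lo[i] + acc.hi[i];
  return sum;
}
```

In this model the accumulator keeps the double-sized shape throughout the loop
and the two halves are only added together in the epilogue.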


That is, if your target produces two vectors you may instead want to hide
that fact by claiming you support DOT_PROD_EXPR and expanding
it to the widen-mult-plus plus reducing (adding) the two result vectors to
get a single one.


Doing a part of the reduction in the loop is a bit pointless.

I have tried another approach, to advertise the WIDEN_MULT_PLUS
and WIDEN_MULT operations as LO/HI part operations of the double
vector size, and also add fake double-vector patterns for move, widening
and add (they get expanded or split to single-vector patterns).
That seems to work for the dot product, it's like the code is unrolled by
a factor of two.  There are a few drawbacks though:
- the tree optimizer creates separate WIDEN_MULT and PLUS expressions,
and it is left to the combiner to clean that up.  That combination and
register allocation might be a bit fragile.
- if the input isn't known to be aligned to the doubled vector size, a
run-time check is inserted to use an unvectorized loop if there is no excess
alignment.
- auto-increment for the loads is lost.  I can probably fix this by keeping
double-sized loads around for longer or with some special-purpose pass, but
both might have some other drawbacks.  But there's actually a configuration
option for an instruction to load multiple vector registers with
register-indirect or auto-increment, so there is some merit to have a pattern
for it.
- the code size is larger.
- vectorization will fail if any other code is mixed in for which no
double-vector patterns are provided.
- this approach uses SUBREGs in ways that are not safe according to the
documentation.
But then, other ports like i386 and aarch64-little endian do that too.
I think it is now (since we have SUBREG_BYTE) safe to have subregs of registers
with hard reg sizes larger than UNITS_PER_WORD, as long as you refer to entire
hard registers.  Maybe we could change the documentation?
AFAICT, there are also only four places that need to be patched to make
a lowpart access with a SUBREG of such a hard register safe. I'm trying
this at the moment, it was just a few hours late for the phase 1->3 deadline.

I suppose for WIDEN_SUM_EXPR, I'd have to have one double-vector-sized
pattern that adds the products of the two input vectors into the double output
vector, and leave the rtl loop optimizer to get the constant pool load of the
all-ones vector out of the loop.  But again, there'll be issues with excess
alignment requirements and code size.


The vectorizer cannot really deal with multiple sizes, thus for example
a V4SI * V4SI + V4DI operation and that all those tree codes are exposed
as "scalar" is sth that continues to confuse me but is mainly done because
at pattern recognition time there's only the scalars.
Well, the vectorizer makes an exception for reductions as it'll allow to
maintain either a vector or a scalar during the loop, so why not allow other
sizes for that value as well?  It's all hidden in the final reduction emitted
by the epilogue.
For vectorization I would advise to provide expansion patterns for
codes that are already supported, in your case DOT_PROD_EXPR.
With vector size doubling, it seems to work better with LO/HI multiply
and PLUS (and let the combiner take the strain).
Without... for a straight expansion, there is little point.  The previous sum
is in one register, the multiply results are spread over two registers, and
DOT_PROD_EXPR is supposed to yield a scalar.  Even with a reduction instruction
to sum up two registers, you need another instruction to add up all three, so
a minimum of three instructions.  LO/HI multiply can be fudged by doing a full
multiply and picking half the result, and cse should reduce that to one
multiply.  Again, two adds needed, because the reduction variable is too narrow
to use widening multiply-add.
There may be some merit to DOT_PROD_EXPR if I make it do something strange.
But there's no easy way to use a special purpose mode, since there's no
matching reduction pattern for a DOT_PROD_EXPR, and the reduction for a
WIDEN_SUM_EXPR is not readily distinguishable from the one for a non-w

[PATCH 0/7][MSP430][TESTSUITE] Fix GCC tests for msp430-elf

2018-11-14 Thread Jozef Lawrynowicz

The following series of patches fixes a number of test failures when running
the GCC DejaGNU testsuite for msp430-elf.

The raw output from contrib/compare_tests is a bit misleading for some tests,
as lines have been added to the source code, changing line numbers for
dg-warning/dg-error tests. I verified there are no regressions
or new failures for x86_64-pc-linux-gnu (gcc, g++), avr (gcc) and msp430-elf
with the small and large memory model (gcc, g++).

For the msp430 small memory model (default) here are the summary results:

BEFORE:
=== gcc Summary ===

# of expected passes            90663
# of unexpected failures        447
# of unexpected successes       6
# of expected failures          287
# of unresolved testcases       93
# of unsupported tests          4171

=== g++ Summary ===

# of expected passes            99772
# of unexpected failures        2673
# of expected failures          433
# of unresolved testcases       1812
# of unsupported tests          5862

AFTER:
=== gcc Summary ===

# of expected passes            90750
# of unexpected failures        150
# of unexpected successes       3
# of expected failures          295
# of unresolved testcases       14
# of unsupported tests          4313

=== g++ Summary ===

# of expected passes            99804
# of unexpected failures        706
# of expected failures          436
# of unresolved testcases       46
# of unsupported tests          7778

For the -mlarge configuration here are the summary results:
BEFORE:
=== gcc Summary ===

# of expected passes            90451
# of unexpected failures        590
# of unexpected successes       7
# of expected failures          286
# of unresolved testcases       162
# of unsupported tests          4172

=== g++ Summary ===

# of expected passes            99509
# of unexpected failures        2885
# of expected failures          433
# of unresolved testcases       1875
# of unsupported tests          5862

AFTER:
=== gcc Summary ===

# of expected passes            90641
# of unexpected failures        154
# of unexpected successes       3
# of expected failures          293
# of unresolved testcases       32
# of unsupported tests          4355

=== g++ Summary ===

# of expected passes            99529
# of unexpected failures        949
# of expected failures          436
# of unresolved testcases       128
# of unsupported tests          7759


The "before" results do not include the "obvious" testsuite changes committed
in r265924, r265926, r265927, r265928,
(https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00618.html)
but the "after" results do.

Patch 1 tweaks dg directives in tests specifically for msp430. Many of
these are extensions to existing target selectors in dg directives.

Patch 2 fixes issues finding libssp when linking tests or checking for
fstack_protector support.

Patch 3 sets up require-effective-target directives for tests which
require the compilation of large arrays.
Targets which have 16-bit or 20-bit size_t fail to compile tests with large
arrays designed to test 32-bit or 64-bit behaviour. Rather than enumerating
another target to skip, I've replaced the target selector in some tests with
a size checking procedure:
- size20plus (new)
- size32plus
size20plus checks to see if a 16-bit structure/array size is supported,
similarly to how the existing size32plus checks to see if a 24-bit
structure/array size is supported.

Patch 4 fixes tests when int is 16 bits by default.

Patch 5 deals with ISO C errors emitted by tests when the large memory model is
used. size_t and ptrdiff_t are __int20 with -mlarge, and if the test is
compiled with -pedantic-errors and -std=* or -ansi, then use of these types
causes an error of the form:
  ISO C does not support __int20 types
I fixed this by adding dg-prune-output directives to tests which cause this
error.
Alternatively, I considered adding typedefs preceded by  __extension__ to fix
these errors, but in many cases __SIZE_TYPE__ is directly used so replacing all
these with a new typedef'd type changes the code in more places, in some cases
changing the offset for dg-warning or dg-error directives. Changing the line
numbers for dg-warning/dg-error adds further manual steps to comparing
testresults and as these are generic tests I wanted to minimize the effect on
the testresults for other targets.

Patch 6 fixes tests expecting printf float support for targets which have been
configured with "newlib-nano-formatted-io". When newlib is configured in this
way, float printf is enabled at build time by regi

Re: [C++ Patch] Fix two grokdeclarator locations

2018-11-14 Thread Jason Merrill
OK.
On Wed, Nov 14, 2018 at 4:26 AM Paolo Carlini  wrote:
>
> Hi,
>
> On 14/11/18 01:30, Jason Merrill wrote:
> > On 11/12/18 6:39 AM, Paolo Carlini wrote:
> >> Hi again,
> >>
> >> On 08/11/18 10:26, Paolo Carlini wrote:
> >>> Hi,
> >>>
> >>> two additional grokdeclarator locations that we can easily fix by
> >>> using declarator->id_loc. Slightly more interesting, testing
> >>> revealed a latent issue in the make_id_declarator uses:
> >>> cp_parser_member_declaration wasn't setting declarator->id_loc, thus
> >>> I decided to add a location_t parameter to make_id_declarator itself
> >>> and adjust all the callers. Tested x86_64-linux.
> >>
> >> PS: In my local tree I have the cp_parser_objc_class_ivars change
> >> using token->location instead of UNKNOWN_LOCATION, thus all the
> >> make_id_declarator calls should be completely fine location-wise.
> > Great, I was going to ask about that.  Can I see that patch, then?
>
> Thanks, well I didn't post it because it is a trivial incremental change
> vs the posted one. Attached.
>
> Paolo.
>
> //
>


Re: [C++ PATCH] P1236R1 - Signed integers are two's complement

2018-11-14 Thread Jason Merrill
OK, thanks.
On Wed, Nov 14, 2018 at 7:12 AM Jakub Jelinek  wrote:
>
> Hi!
>
> This paper from what I can see mostly just codifies our
> implementation-defined behavior as standard (two's complement as the only
> possible representation of signed integers, but still keeping UB signed
> integer overflows), the only changes I found that IMHO need changing
> in GCC are that left shifts of signed integers are now always well defined
> (well, like for unsigned shifts out of bounds shift count is still UB),
> so we need to accept such shifts in constant expressions, don't warn about
> those and don't sanitize it in ubsan when in -std=c++2a or -std=gnu++2a
> modes.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2018-11-14  Jakub Jelinek  
>
> P1236R1 - Signed integers are two's complement
> gcc/cp/
> * constexpr.c (cxx_eval_check_shift_p): Disable the signed LSHIFT_EXPR
> checks for c++2a.
> gcc/c-family/
> * c-warn.c (maybe_warn_shift_overflow): Don't warn for c++2a.
> * c-ubsan.c (ubsan_instrument_shift): Make signed shifts
> with in-range second operand well defined for -std=c++2a.
> gcc/
> * doc/invoke.texi (Wshift-overflow): Adjust documentation for
> c++2a.
> gcc/testsuite/
> * g++.dg/cpp2a/constexpr-shift1.C: New test.
> * g++.dg/warn/permissive-1.C (enum A, enum D): Don't expect
> diagnostics here for c++2a.
> * g++.dg/cpp0x/constexpr-shift1.C (fn3, i3, fn4, i4): Don't expect
> diagnostics here for c++2a.
> * g++.dg/cpp0x/constexpr-60049.C (f3, x3, y3): Likewise.
> * g++.dg/ubsan/cxx11-shift-1.C (main): Add some further tests.
> * g++.dg/ubsan/cxx11-shift-2.C (main): Likewise.
> * g++.dg/ubsan/cxx2a-shift-1.C: New test.
> * g++.dg/ubsan/cxx2a-shift-2.C: New test.
>
> --- gcc/cp/constexpr.c.jj   2018-11-08 18:07:56.071075133 +0100
> +++ gcc/cp/constexpr.c  2018-11-14 08:56:23.705641740 +0100
> @@ -1920,9 +1920,14 @@ cxx_eval_check_shift_p (location_t loc,
>   if E1 has a signed type and non-negative value, and E1x2^E2 is
>   representable in the corresponding unsigned type of the result type,
>   then that value, converted to the result type, is the resulting value;
> - otherwise, the behavior is undefined.  */
> -  if (code == LSHIFT_EXPR && !TYPE_UNSIGNED (lhstype)
> -  && (cxx_dialect >= cxx11))
> + otherwise, the behavior is undefined.
> + For C++2a:
> + The value of E1 << E2 is the unique value congruent to E1 x 2^E2 modulo
> + 2^N, where N is the range exponent of the type of the result.  */
> +  if (code == LSHIFT_EXPR
> +  && !TYPE_UNSIGNED (lhstype)
> +  && cxx_dialect >= cxx11
> +  && cxx_dialect < cxx2a)
>  {
>if (tree_int_cst_sgn (lhs) == -1)
> {
> --- gcc/c-family/c-warn.c.jj2018-08-27 17:50:41.649525026 +0200
> +++ gcc/c-family/c-warn.c   2018-11-14 08:59:02.424022594 +0100
> @@ -2286,6 +2286,8 @@ diagnose_mismatched_attributes (tree old
>  /* Warn if signed left shift overflows.  We don't warn
> about left-shifting 1 into the sign bit in C++14; cf.
> 
> +   and don't warn for C++2a at all, as signed left shifts never
> +   overflow.
> LOC is a location of the shift; OP0 and OP1 are the operands.
> Return true if an overflow is detected, false otherwise.  */
>
> @@ -2300,7 +2302,7 @@ maybe_warn_shift_overflow (location_t lo
>unsigned int prec0 = TYPE_PRECISION (type0);
>
>/* Left-hand operand must be signed.  */
> -  if (TYPE_UNSIGNED (type0))
> +  if (TYPE_UNSIGNED (type0) || cxx_dialect >= cxx2a)
>  return false;
>
>unsigned int min_prec = (wi::min_precision (wi::to_wide (op0), SIGNED)
> @@ -2309,7 +2311,7 @@ maybe_warn_shift_overflow (location_t lo
> * However, shifting 1 _out_ of the sign bit, as in
> * INT_MIN << 1, is considered an overflow.
> */
> -  if (!tree_int_cst_sign_bit(op0) && min_prec == prec0 + 1)
> +  if (!tree_int_cst_sign_bit (op0) && min_prec == prec0 + 1)
>  {
>/* Never warn for C++14 onwards.  */
>if (cxx_dialect >= cxx14)
> --- gcc/c-family/c-ubsan.c.jj   2018-11-14 00:55:32.351645152 +0100
> +++ gcc/c-family/c-ubsan.c  2018-11-14 08:35:13.131600599 +0100
> @@ -134,7 +134,10 @@ ubsan_instrument_shift (location_t loc,
>if (TYPE_OVERFLOW_WRAPS (type0)
>|| maybe_ne (GET_MODE_BITSIZE (TYPE_MODE (type0)),
>TYPE_PRECISION (type0))
> -  || !sanitize_flags_p (SANITIZE_SHIFT_BASE))
> +  || !sanitize_flags_p (SANITIZE_SHIFT_BASE)
> +  /* In C++2a and later, shifts are well defined except when
> +the second operand is not within bounds.  */
> +  || cxx_dialect >= cxx2a)
>  ;
>
>/* For signed x << y, in C99/C11, the following:
> --- gcc/doc/invoke.texi.jj  2018-11-14 00:55:36.636576201 +0100
> +++ gcc/doc/invoke.texi 2018-11-14
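
As a rough illustration of the C++2a wording quoted in the constexpr.c hunk
above, the following sketch computes E1 << E2 as the unique value congruent to
E1 x 2^E2 modulo 2^N for N = 32.  The helper name `shl_c2a` is hypothetical,
and the code is written so that it stays well defined in earlier dialects too;
it is only a model of the rule, not how GCC implements the shift.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: model of the C++2a rule for a 32-bit signed E1.
// Assumes 0 <= count < 32; an out-of-range count is still undefined.
std::int32_t shl_c2a (std::int32_t e1, unsigned count)
{
  // Unsigned arithmetic is already performed modulo 2^32.
  std::uint32_t u = static_cast<std::uint32_t> (e1) << count;

  // Map the unsigned result back to the congruent signed value without
  // relying on implementation-defined narrowing conversions.
  const std::int32_t max = std::numeric_limits<std::int32_t>::max ();
  const std::int32_t min = std::numeric_limits<std::int32_t>::min ();
  if (u <= static_cast<std::uint32_t> (max))
    return static_cast<std::int32_t> (u);
  return static_cast<std::int32_t> (u - 0x80000000u) + min;
}
```

Under this rule shifting a negative value, or shifting a 1 into or out of the
sign bit, simply yields the wrapped result, which is why the patch drops the
constant-expression check, the warning and the ubsan instrumentation for
c++2a.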

Re: [debug/88006] -fdebug-types-section gives undefined ref

2018-11-14 Thread Nathan Sidwell

On 11/14/18 7:33 AM, Richard Biener wrote:


Hmm, there is reference_to_unused () used in a related case.  But generally
for late emission such references are "OK" and expected to be pruned
later by resolve_addr () (which I see we do not call for type units?!).  Quote:

/* Resolve DW_OP_addr and DW_AT_const_value CONST_STRING arguments to
an address in .rodata section if the string literal is emitted there,
or remove the containing location list or replace DW_AT_const_value
with DW_AT_location and empty location expression, if it isn't found
in .rodata.  Similarly for SYMBOL_REFs, keep only those that refer
to something that has been emitted in the current CU.  */

static void
resolve_addr (dw_die_ref die)
{


That does seem to work.  The attached survives bootstrapping on
x86_64-linux, ok?


nathan

--
Nathan Sidwell
2018-11-14  Nathan Sidwell  

	PR debug/88006
	PR debug/87462
	* dwarf2out.c (dwarf2out_finish): Apply resolve_addr to comdat
	type list.

	* g++.dg/debug/dwarf2/pr87462.C: New.
	* g++.dg/debug/dwarf2/pr88006.C: New.

Index: dwarf2out.c
===
--- dwarf2out.c	(revision 266082)
+++ dwarf2out.c	(working copy)
@@ -31182,6 +31182,8 @@ dwarf2out_finish (const char *filename)
 FOR_EACH_CHILD (die, c, gcc_assert (! c->die_mark));
   }
 #endif
+  for (ctnode = comdat_type_list; ctnode != NULL; ctnode = ctnode->next)
+resolve_addr (ctnode->root_die);
   resolve_addr (comp_unit_die ());
   move_marked_base_types ();
 
Index: testsuite/g++.dg/debug/dwarf2/pr87462.C
===
--- testsuite/g++.dg/debug/dwarf2/pr87462.C	(revision 0)
+++ testsuite/g++.dg/debug/dwarf2/pr87462.C	(working copy)
@@ -0,0 +1,20 @@
+// { dg-additional-options "-dA -std=gnu++17 -gdwarf-4 -O1 -fdebug-types-section" }
+// reject .pseudo label, but "label" is ok.
+// { dg-final { scan-assembler-not "\[^L\"\]_ZN5Test18testFuncEv" } }
+// undefined ref to _ZN5Test18testFuncEv
+
+class Test1 {
+public:
+  static int testFunc() { return 1; }
+};
+
+template 
+class TestWrapper {
+public:
+  static T func() __attribute((noinline)) { return (*funcImpl)(); } 
+};
+
+int main() {
+  return TestWrapper::func();
+}
Index: testsuite/g++.dg/debug/dwarf2/pr88006.C
===
--- testsuite/g++.dg/debug/dwarf2/pr88006.C	(revision 0)
+++ testsuite/g++.dg/debug/dwarf2/pr88006.C	(working copy)
@@ -0,0 +1,39 @@
+// { dg-additional-options "-dA -std=gnu++17 -gdwarf-4 -O1 -fdebug-types-section" }
+// reject .pseudo label, but "label" is ok.
+// { dg-final { scan-assembler-not "\[^\"\]_ZN3Foo4mfunEv" } }
+// undefined ref to _ZN3Foo4mfunEv
+
+struct Foo {
+  void mfun () {}
+};
+
+struct A { static constexpr bool Value = false; };
+
+template  struct B { typedef int Type; };
+
+class Arg
+{
+  template  struct Local : A {};
+
+public:
+  template ::Value>::Type>
+  Arg (Init) {}
+};
+
+class Lambda {
+  static constexpr int Unused = 0;
+
+public:
+  Lambda (Arg);
+};
+
+// Generated ref to Foo::mfun in the type die of an instantiation of this
+template  struct Callable {};
+
+class I {
+  I() : lamb ([this] {}) {}
+
+  Lambda lamb;
+
+  Callable<&Foo::mfun> bm;
+};


Re: Fix PR86575

2018-11-14 Thread Richard Biener
On Wed, Nov 14, 2018 at 3:51 PM Michael Matz  wrote:
>
> Hi,
>
> our warning code sometimes adds locations to statement which didn't have
> them before, which can in turn lead to code changes (here only label
> numbers change).  It seems better to not do that from warning code, and
> here it's easy to do: just return the location we want to use for
> warnings, don't change it in the statement itself.
>
> Regstrapped on x86-64, okay for trunk?

OK.

Richard.

>
> Ciao,
> Michael.
>
> PR middle-end/86575
> * gimplify.c (collect_fallthrough_labels): Add new argument,
> return location via that, don't modify statements.
> (warn_implicit_fallthrough_r): Adjust call, don't use
> statement location directly.
>
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 509fc2f3f5be..22dff0e546c9 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -1938,10 +1938,12 @@ last_stmt_in_scope (gimple *stmt)
>
>  static gimple *
>  collect_fallthrough_labels (gimple_stmt_iterator *gsi_p,
> -   auto_vec  *labels)
> +   auto_vec  *labels,
> +   location_t *prevloc)
>  {
>gimple *prev = NULL;
>
> +  *prevloc = UNKNOWN_LOCATION;
>do
>  {
>if (gimple_code (gsi_stmt (*gsi_p)) == GIMPLE_BIND)
> @@ -1978,7 +1980,7 @@ collect_fallthrough_labels (gimple_stmt_iterator *gsi_p,
>   /* It might be a label without a location.  Use the
>  location of the scope then.  */
>   if (!gimple_has_location (prev))
> -   gimple_set_location (prev, bind_loc);
> +   *prevloc = bind_loc;
> }
>   gsi_next (gsi_p);
>   continue;
> @@ -2061,6 +2063,8 @@ collect_fallthrough_labels (gimple_stmt_iterator *gsi_p,
>  && (gimple_code (gsi_stmt (*gsi_p)) != GIMPLE_LABEL
>  || !gimple_has_location (gsi_stmt (*gsi_p))));
>
> +  if (prev && gimple_has_location (prev))
> +*prevloc = gimple_location (prev);
>return prev;
>  }
>
> @@ -2157,7 +2161,8 @@ warn_implicit_fallthrough_r (gimple_stmt_iterator 
> *gsi_p, bool *handled_ops_p,
>
> /* Vector of labels that fall through.  */
> auto_vec  labels;
> -   gimple *prev = collect_fallthrough_labels (gsi_p, &labels);
> +   location_t prevloc;
> +   gimple *prev = collect_fallthrough_labels (gsi_p, &labels, &prevloc);
>
> /* There might be no more statements.  */
> if (gsi_end_p (*gsi_p))
> @@ -2185,8 +2190,8 @@ warn_implicit_fallthrough_r (gimple_stmt_iterator 
> *gsi_p, bool *handled_ops_p,
>  /* Try to be clever and don't warn when the statement
> can't actually fall through.  */
>  && gimple_stmt_may_fallthru (prev)
> -&& gimple_has_location (prev))
> - warned_p = warning_at (gimple_location (prev),
> +&& prevloc != UNKNOWN_LOCATION)
> + warned_p = warning_at (prevloc,
>  OPT_Wimplicit_fallthrough_,
>  "this statement may fall through");
> if (warned_p)


Re: C++ PATCH for c++/87781, detect invalid elaborated-type-specifier

2018-11-14 Thread Jason Merrill
On Wed, Nov 14, 2018 at 9:55 AM Marek Polacek  wrote:
>
> In elaborated-type-specifier, the template keyword can only follow a
> nested-name-specifier:
>
>   class-key nested-name-specifier template[opt] simple-template-id
>
> but we weren't detecting it.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2018-11-14  Marek Polacek  
>
> PR c++/87781 - detect invalid elaborated-type-specifier.
> * parser.c (cp_parser_elaborated_type_specifier): Ensure that
> template follows a nested-name-specifier.
>
> * g++.dg/parse/elab3.C: New test.
>
> diff --git gcc/cp/parser.c gcc/cp/parser.c
> index e9e49b15702..0ab44ab93e3 100644
> --- gcc/cp/parser.c
> +++ gcc/cp/parser.c
> @@ -17986,6 +17986,10 @@ cp_parser_elaborated_type_specifier (cp_parser* 
> parser,
>  template-id or not.  */
>if (!template_p)
> cp_parser_parse_tentatively (parser);
> +  /* The `template' keyword must follow a nested-name-specifier.  */
> +  else if (!nested_name_specifier)
> +   return error_mark_node;

Don't we want a diagnostic here?

Jason


Re: [PATCH][RFC] Come up with -flive-patching master option.

2018-11-14 Thread Martin Liška
On 11/13/18 10:16 PM, Qing Zhao wrote:
> Hi,
> 
>> On Nov 13, 2018, at 1:18 PM, Miroslav Benes  wrote:
>>
>>> Attached is the patch for new -flive-patching=[inline-only-static | 
>>> inline-clone] master option.
>>>
>>> '-flive-patching=LEVEL'
>>> Control GCC's optimizations to provide a safe compilation for
>>> live-patching.  Provides multiple-level control on how many of the
>>> optimizations are enabled by users' request.  The LEVEL argument
>>> should be one of the following:
>>>
>>> 'inline-only-static'
>>>
>>>  Only enable inlining of static functions, disable all other
>>>  ipa optimizations/analyses.  As a result, when patching a
>>>  static routine, all its callers need to be patched as well.
>>>
>>> 'inline-clone'
>>>
>>>  Only enable inlining and all optimizations that internally
>>>  create clone, for example, cloning, ipa-sra, partial inlining,
>>>  etc.; disable all other ipa optimizations/analyses.  As a
>>>  result, when patching a routine, all its callers and its
>>>  clones' callers need to be patched as well.
>> Based on our previous discussion I assume that "clone" optimizations are 
>> safe (for LP) and the others are not. Anyway I'd welcome a note mentioning 
>> that disabled optimizations are dangerous for LP.
> Actually, I don’t think that those disabled optimizations are “dangerous” for
> live-patching.  One of the major reasons we disable them
> is that the compiler currently does NOT provide a good way to compute
> the impacted function list for those optimizations.
> Therefore, we disable them at this time.
> 
> Many of them could be enabled too if the compiler can report the impacted
> function list accurately in the future.
> 
> 
> 
>> I know it may be the same for you, but it is not for me as a GCC user. 
>> "internally create clone" sounds very... well, internal. It does not 
>> describe the option much for an ordinary user who has no knowledge of GCC 
>> internals.
>>
>> So could you rephrase it a bit, please?
> I tried to make this clear. please see the following:
> 
> '-flive-patching=LEVEL'
>  Control GCC's optimizations to provide a safe compilation for
>  live-patching.
> 
>  If the compiler's optimization uses a function's body or
>  information extracted from its body to optimize/change another
>  function, the latter is called an impacted function of the former.
>  If a function is patched, its impacted functions should be patched
>  too.
> 
>  The impacted functions are decided by the compiler's
>  interprocedural optimizations.  For example, inlining a function
>  into its caller, cloning a function and changing its caller to call
>  this new clone, or extracting a function's pureness/constness
>  information to optimize its direct or indirect callers, etc.
> 
>  Usually, the more ipa optimizations are enabled, the larger the number
>  of impacted functions for each function.  In order to control the
>  number of impacted functions and compute the list of impacted
>  functions easily, we provide control to partially enable ipa
>  optimizations on two different levels.
> 
>  The LEVEL argument should be one of the following:
> 
>  'inline-only-static'
> 
>   Only enable inlining of static functions, disable all other
>   interprocedural optimizations/analyses.  As a result, when
>   patching a static routine, all its callers need to be patched
>   as well.
> 
>  'inline-clone'
> 
>   Only enable inlining and cloning optimizations, which includes
>   inlining, cloning, interprocedural scalar replacement of
>   aggregates and partial inlining.  Disable all other
>   interprocedural optimizations/analyses.  As a result, when
>   patching a routine, all its callers and its clones' callers
>   need to be patched as well.
> 
>  When -flive-patching is specified without any value, the default value
>  is "inline-clone".
> 
>  This flag is disabled by default.
> 
> 
>>> When -flive-patching specified without any value, the default value
>>> is "inline-clone".
>>>
>>> This flag is disabled by default.
>>>
>>> let me know your comments and suggestions on the implementation.
>> I compared it to Martin's patch and ipa-icf-variables is not covered in 
>> yours (I may have missed something).
> Yes, you are right. I added this into my patch.
> 
> I am attaching the new patch here.
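
The notion of an "impacted function" in the quoted documentation can be illustrated with a small sketch.  This is not how GCC computes anything; it only assumes that some hypothetical table `inlined_into` records which functions received a copy of another function's body, and then walks that table transitively.  All names here are made up for illustration, and the result includes the patched function itself for convenience.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical table: inlined_into[F] lists the functions that received
// a copy of F's body through inlining.
typedef std::map<std::string, std::vector<std::string> > inline_map;

// Returns PATCHED plus, transitively, every function it was inlined into.
// Under the documentation's definition these all have to be patched together.
std::set<std::string>
impacted_functions (const inline_map &inlined_into, const std::string &patched)
{
  std::set<std::string> seen;
  std::vector<std::string> work (1, patched);
  while (!work.empty ())
    {
      std::string f = work.back ();
      work.pop_back ();
      if (!seen.insert (f).second)
	continue;			// already handled
      inline_map::const_iterator it = inlined_into.find (f);
      if (it != inlined_into.end ())
	work.insert (work.end (), it->second.begin (), it->second.end ());
    }
  return seen;
}
```

The more interprocedural optimizations feed such a table, the larger the returned set gets, which is the trade-off the two LEVEL values are meant to control.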

Hello.

Please use
git diff HEAD~ > /tmp/patch && ~/Programming/gcc/contrib/check_GNU_style.py /tmp/patch

in order to address many formatting issues of the patch (skip the ones reported 
in common.opt).

> 
> 
> flive-patching.patch
> 
> From c0785675eb29599754aaf11c70901c12dd3ea821 Mon Sep 17 00:00:00 2001
> From: qing zhao 
> Date: Tue, 13 Nov 2018 13:02:57 -0800
> Subject: [PATCH] Add -flive-patching to support live patching

Re: [PATCH][cunroll] Add unroll-known-loop-iterations-only param and use it in aarch64

2018-11-14 Thread Richard Biener
On Tue, Nov 13, 2018 at 3:33 PM Richard Biener
 wrote:
>
> On Tue, Nov 13, 2018 at 10:48 AM Kyrill Tkachov
>  wrote:
> >
> >
> > On 13/11/18 09:28, Richard Biener wrote:
> > > On Tue, Nov 13, 2018 at 10:15 AM Kyrill Tkachov
> > >  wrote:
> > >> Hi Richard,
> > >>
> > >> On 13/11/18 08:24, Richard Biener wrote:
> > >>> On Mon, Nov 12, 2018 at 7:20 PM Kyrill Tkachov
> > >>>  wrote:
> >  On 12/11/18 14:10, Richard Biener wrote:
> > > On Fri, Nov 9, 2018 at 6:57 PM Kyrill Tkachov
> > >  wrote:
> > >> On 09/11/18 12:18, Richard Biener wrote:
> > >>> On Fri, Nov 9, 2018 at 11:47 AM Kyrill Tkachov
> > >>>  wrote:
> >  Hi all,
> > 
> >  In this testcase the codegen for VLA SVE is worse than it could be 
> >  due to unrolling:
> > 
> >  fully_peel_me:
> >          mov     x1, 5
> >          ptrue   p1.d, all
> >          whilelo p0.d, xzr, x1
> >          ld1d    z0.d, p0/z, [x0]
> >          fadd    z0.d, z0.d, z0.d
> >          st1d    z0.d, p0, [x0]
> >          cntd    x2
> >          addvl   x3, x0, #1
> >          whilelo p0.d, x2, x1
> >          beq     .L1
> >          ld1d    z0.d, p0/z, [x0, #1, mul vl]
> >          fadd    z0.d, z0.d, z0.d
> >          st1d    z0.d, p0, [x3]
> >          cntw    x2
> >          incb    x0, all, mul #2
> >          whilelo p0.d, x2, x1
> >          beq     .L1
> >          ld1d    z0.d, p0/z, [x0]
> >          fadd    z0.d, z0.d, z0.d
> >          st1d    z0.d, p0, [x0]
> >  .L1:
> >          ret
> > 
> >  In this case, due to the vector-length-agnostic nature of SVE the 
> >  compiler doesn't know the loop iteration count.
> >  For such loops we don't want to unroll if we don't end up 
> >  eliminating branches as this just bloats code size
> >  and hurts icache performance.
> > 
> >  This patch introduces a new unroll-known-loop-iterations-only 
> >  param that disables cunroll when the loop iteration
> >  count is unknown (SCEV_NOT_KNOWN). This case occurs much more 
> >  often for SVE VLA code, but it does help some
> >  Advanced SIMD cases as well where loops with an unknown iteration 
> >  count are not unrolled when it doesn't eliminate
> >  the branches.
> > 
> >  So for the above testcase we generate now:
> >  fully_peel_me:
> >          mov     x2, 5
> >          mov     x3, x2
> >          mov     x1, 0
> >          whilelo p0.d, xzr, x2
> >          ptrue   p1.d, all
> >  .L2:
> >          ld1d    z0.d, p0/z, [x0, x1, lsl 3]
> >          fadd    z0.d, z0.d, z0.d
> >          st1d    z0.d, p0, [x0, x1, lsl 3]
> >          incd    x1
> >          whilelo p0.d, x1, x3
> >          bne     .L2
> >          ret
> > 
> >  Not perfect still, but it's preferable to the original code.
> >  The new param is enabled by default on aarch64 but disabled for 
> >  other targets, leaving their behaviour unchanged
> >  (until other target people experiment with it and set it, if 
> >  appropriate).
> > 
> >  Bootstrapped and tested on aarch64-none-linux-gnu.
> >  Benchmarked on SPEC2017 on a Cortex-A57 and there are no 
> >  differences in performance.
> > 
> >  Ok for trunk?
> > >>> Hum.  Why introduce a new --param and not simply key on
> > >>> flag_peel_loops instead?  That is
> > >>> enabled by default at -O3 and with FDO but you of course can control
> > >>> that in your targets
> > >>> post-option-processing hook.
> > >> You mean like this?
> > >> It's certainly a simpler patch, but I was just a bit hesitant of 
> > >> making this change for all targets :)
> > >> But I suppose it's a reasonable change.
> > > No, that change is backward.  What I said is that peeling is already
> > > conditional on
> > > flag_peel_loops and that is enabled by -O3.  So you want to disable
> > > flag_peel_loops for
> > > SVE instead in the target.
> >  Sorry, I got confused by the similarly named functions.
> >  I'm talking about try_unroll_loop_completely when run as part of 
> >  canonicalize_induction_variables i.e. the "ivcanon" pass
> >  (sorry about blaming cunroll here). This doesn't get called through 
> >  the try_unroll_loops_completely path.
> > >>> Well, peeling gets disabled.  From your patch I see you want to
> > >>> disable "unrolling" when
> > >>> the number of loop iteration is not constant.  That

Re: [RFC][PATCH]Merge VEC_COND_EXPR into MASK_STORE after loop vectorization

2018-11-14 Thread Richard Biener
On Fri, Nov 9, 2018 at 4:49 PM Renlin Li  wrote:
>
> Hi Richard,
>
> On 11/09/2018 11:48 AM, Richard Biener wrote:
> > On Thu, Nov 8, 2018 at 5:55 PM Renlin Li  wrote:
> >>
> >> Hi Richard,
> >>
> >>
> >> *However*, after I rebased my patch on the latest trunk.
> >> Got the following dump from ifcvt:
> >>  [local count: 1006632961]:
> >> # i_20 = PHI 
> >> # ivtmp_18 = PHI 
> >> a_10 = array[i_20];
> >> _1 = a_10 & 1;
> >> _2 = a_10 + 1;
> >> _ifc__34 = _1 != 0 ? _2 : a_10;
> >> array[i_20] = _ifc__34;
> >> _4 = a_10 + 2;
> >> _ifc__37 = _ifc__34 > 10 ? _4 : _ifc__34;
> >> array[i_20] = _ifc__37;
> >> i_13 = i_20 + 1;
> >> ivtmp_5 = ivtmp_18 - 1;
> >> if (ivtmp_5 != 0)
> >>   goto ; [93.33%]
> >> else
> >>   goto ; [6.67%]
> >>
> >> the redundant load is not generated, but you could still see the 
> >> unconditional store.
> >
> > Yes, I fixed the redundant loads recently and indeed dead stores
> > remain (for the particular
> > testcase they would be easy to remove).
>
> Right.
>
> >
> >> After loop vectorization, the following is generated (without my change):
> >
> > Huh.  But that's not because of if-conversion but because SVE needs to
> > mask _all_
> > loop operations that are not safe to execute with the loop_mask!
> >
> >> vect_a_10.6_6 = .MASK_LOAD (vectp_array.4_35, 4B, loop_mask_7);
> >> a_10 = array[i_20];
> >> vect__1.7_39 = vect_a_10.6_6 & vect_cst__38;
> >> _1 = a_10 & 1;
> >> vect__2.8_41 = vect_a_10.6_6 + vect_cst__40;
> >> _2 = a_10 + 1;
> >> vect__ifc__34.9_43 = VEC_COND_EXPR  >> vect__2.8_41, vect_a_10.6_6>;
> >> _ifc__34 = _1 != 0 ? _2 : a_10;
> >> .MASK_STORE (vectp_array.10_45, 4B, loop_mask_7, vect__ifc__34.9_43);
> >> vect__4.12_49 = vect_a_10.6_6 + vect_cst__48;
> >> _4 = a_10 + 2;
> >> vect__ifc__37.13_51 = VEC_COND_EXPR  
> >> vect_cst__50, vect__4.12_49, vect__ifc__34.9_43>;
> >> _ifc__37 = _ifc__34 > 10 ? _4 : _ifc__34;
> >> .MASK_STORE (vectp_array.14_53, 4B, loop_mask_7, vect__ifc__37.13_51);
> >>
> >> With the old ifcvt code, my change here could improve it a little bit 
> >> by eliminating some redundant loads.
> >> With the new code, it cannot improve it further. I'll adjust the patch 
> >> based on the latest trunk.
> >
> > So what does the patch change the above to?  The code has little to no
> > comments apart from a
> > small picture with code _before_ the transform...
> It is like this:
>vect_a_10.6_6 = .MASK_LOAD (vectp_array.4_35, 4B, loop_mask_7);
>a_10 = array[i_20];
>vect__1.7_39 = vect_a_10.6_6 & vect_cst__38;
>_1 = a_10 & 1;
>vect__2.8_41 = vect_a_10.6_6 + vect_cst__40;
>_2 = a_10 + 1;
>_60 = vect__1.7_39 != vect_cst__42;
>vect__ifc__34.9_43 = VEC_COND_EXPR <_60, vect__2.8_41, vect_a_10.6_6>;
>_ifc__34 = _1 != 0 ? _2 : a_10;
>vec_mask_and_61 = _60 & loop_mask_7;
>.MASK_STORE (vectp_array.10_45, 4B, vec_mask_and_61, vect__2.8_41);
>vect__4.12_49 = vect_a_10.6_6 + vect_cst__48;
>_4 = a_10 + 2;
>vect__ifc__37.13_51 = VEC_COND_EXPR  vect_cst__50, 
> vect__4.12_49, vect__ifc__34.9_43>;
>_ifc__37 = _ifc__34 > 10 ? _4 : _ifc__34;
>.MASK_STORE (vectp_array.14_53, 4B, loop_mask_7, vect__ifc__37.13_51);

Ah, OK, now I see what you do.

> As the loaded value is used later, it cannot be removed.
>
> With the change, ideally, less data is stored.
> However, it might generate more instructions.
>
> 1. The load cannot be eliminated.  Apparently, your change eliminates most of the
> redundant loads.
> The rest are necessary or not easy to remove.
> 2. An additional AND instruction is needed.
>
> With a simpler test case like this:
>
> static int array[100];
> int test (int a, int i)
> {
>for (unsigned i = 0; i < 16; i++)
>  {
>if (a & 1)
> array[i] = a + 1;
>  }
>return array[i];
> }
>
> The new code-gen will be:
>vect__2.4_29 = vect_cst__27 + vect_cst__28;
>_44 = vect_cst__34 != vect_cst__35;
>vec_mask_and_45 = _44 & loop_mask_32;
>.MASK_STORE (vectp_array.9_37, 4B, vec_mask_and_45, vect__2.4_29);
>
> While the old one is:
>
>vect__2.4_29 = vect_cst__27 + vect_cst__28;
>vect__ifc__24.7_33 = .MASK_LOAD (vectp_array.5_30, 4B, loop_mask_32);
>vect__ifc__26.8_36 = VEC_COND_EXPR  vect__2.4_29, vect__ifc__24.7_33>;
>.MASK_STORE (vectp_array.9_37, 4B, loop_mask_32, vect__ifc__26.8_36);
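
The two code-gen variants quoted above can be modelled lane by lane in plain C++.  The sketch below is only an emulation under assumed names (`vec`, `mask`, `store_via_select`, `store_via_combined_mask` are not GCC identifiers); it shows that storing the selected value under the loop mask leaves memory in the same state as storing the new value under the AND of both masks.

```cpp
#include <array>
#include <cstddef>

const std::size_t N = 4;		// illustrative vector length
typedef std::array<int, N> vec;
typedef std::array<bool, N> mask;

// Old form: masked load, VEC_COND_EXPR-like select, store under LOOP_MASK.
void
store_via_select (vec &mem, const mask &loop_mask, const mask &cond,
		  const vec &x)
{
  for (std::size_t i = 0; i < N; ++i)
    if (loop_mask[i])
      {
	int l = mem[i];			// the masked load
	mem[i] = cond[i] ? x[i] : l;	// the select feeding the store
      }
}

// New form: no load; store X under the AND of COND and LOOP_MASK.
void
store_via_combined_mask (vec &mem, const mask &loop_mask, const mask &cond,
			 const vec &x)
{
  for (std::size_t i = 0; i < N; ++i)
    if (cond[i] && loop_mask[i])
      mem[i] = x[i];
}
```

The final memory contents agree, but the old form also rewrites lanes where the condition is false with their previous value, which is why the store-data-race question comes up in the reply.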

I don't see the masked load here on x86_64 btw. (I don't see
if-conversion generating a load).
I guess that's again when store-data-races are allowed that it uses a
RMW cycle and vectorization
generating the masked variants for the loop-mask.  Which means for SVE
if-conversion should
prefer the masked-store variant even when store data races are allowed?

>
> >
> > I was wondering whether we can implement
> >
> >l = [masked]load;
> >tem = cond ? x : l;
> >masked-store = tem;
> >
> > pattern matching in a regular pass - forwprop for example.  Note the
> 

C++ PATCH for c++/87781, detect invalid elaborated-type-specifier

2018-11-14 Thread Marek Polacek
In elaborated-type-specifier, the template keyword can only follow a
nested-name-specifier:

  class-key nested-name-specifier template[opt] simple-template-id

but we weren't detecting it.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-11-14  Marek Polacek  

PR c++/87781 - detect invalid elaborated-type-specifier.
* parser.c (cp_parser_elaborated_type_specifier): Ensure that
template follows a nested-name-specifier.

* g++.dg/parse/elab3.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index e9e49b15702..0ab44ab93e3 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -17986,6 +17986,10 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
 template-id or not.  */
   if (!template_p)
cp_parser_parse_tentatively (parser);
+  /* The `template' keyword must follow a nested-name-specifier.  */
+  else if (!nested_name_specifier)
+   return error_mark_node;
+
   /* Parse the template-id.  */
   token = cp_lexer_peek_token (parser->lexer);
   decl = cp_parser_template_id (parser, template_p,
diff --git gcc/testsuite/g++.dg/parse/elab3.C gcc/testsuite/g++.dg/parse/elab3.C
new file mode 100644
index 00000000000..5488d90afcf
--- /dev/null
+++ gcc/testsuite/g++.dg/parse/elab3.C
@@ -0,0 +1,5 @@
+// PR c++/87781
+// { dg-do compile }
+
+template class A;
+class template A *p; // { dg-error "" }


Re: [Committed][AArch64] Fix PR62178 testcase failures

2018-11-14 Thread Segher Boessenkool
On Wed, Nov 14, 2018 at 12:37:05PM +0000, Wilco Dijkstra wrote:
> +/* { dg-final { scan-assembler-not { dup } } } */
> +/* { dg-final { scan-assembler-not { fmov } } } */

{ dup }   is the same as   " dup "  , that is, with spaces and all.
I don't think you want that (there usually is a tab character before
mnemonics, so \s would work better, or use \m and \M.  Or nothing,
just delete the spaces, if you are sure nothing in the generated
assembler code says "dup").
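
The difference can be tried out with a small sketch.  It uses C++ `std::regex` (ECMAScript syntax) instead of the Tcl regexps that the testsuite really uses, so `\b` stands in for Tcl's `\m` and `\M`; the assembler line is a made-up sample and `found` is a hypothetical helper.

```cpp
#include <regex>
#include <string>

// Returns true if PATTERN matches anywhere in LINE, similar in spirit to
// how a scan-assembler directive searches the generated output.
bool
found (const std::string &line, const std::string &pattern)
{
  return std::regex_search (line, std::regex (pattern));
}

// Made-up assembler line: the mnemonic has a tab on either side.
const std::string sample_line = "\tdup\tv0.4s, w0";
```

With literal spaces the pattern never sees the tab-delimited mnemonic, so a scan-assembler-not test written that way would pass even when the instruction is present.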


Segher


Fix PR86575

2018-11-14 Thread Michael Matz
Hi,

Our warning code sometimes adds locations to statements that didn't have
them before, which can in turn lead to code changes (here only label
numbers change).  It seems better not to do that from warning code, and
here it's easy to avoid: just return the location we want to use for
warnings instead of changing it in the statement itself.
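
As a simplified model of that idea (not the gimplify.c code itself), the following sketch reports the warning location through an out parameter and never writes to the statement.  `stmt`, `pick_warning_location` and the stand-in definitions of `location_t` and `UNKNOWN_LOCATION` are assumptions made for illustration only.

```cpp
#include <cassert>

typedef unsigned int location_t;	  // stand-in for GCC's location_t
const location_t UNKNOWN_LOCATION = 0;	  // stand-in for GCC's constant

struct stmt { location_t loc; };	  // toy statement

// Stores the location to warn at in *WARN_LOC: the statement's own
// location if it has one, otherwise SCOPE_LOC.  S is never modified.
const stmt *
pick_warning_location (const stmt *s, location_t scope_loc,
		       location_t *warn_loc)
{
  assert (warn_loc != nullptr);
  *warn_loc = UNKNOWN_LOCATION;
  if (s)
    *warn_loc = s->loc != UNKNOWN_LOCATION ? s->loc : scope_loc;
  return s;
}
```

Because the statement stays untouched, compiling with or without the warning enabled sees the same IL, which is the property the patch is after.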

Regstrapped on x86-64, okay for trunk?


Ciao,
Michael.

PR middle-end/86575
* gimplify.c (collect_fallthrough_labels): Add new argument,
return location via that, don't modify statements.
(warn_implicit_fallthrough_r): Adjust call, don't use
statement location directly.

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 509fc2f3f5be..22dff0e546c9 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1938,10 +1938,12 @@ last_stmt_in_scope (gimple *stmt)
 
 static gimple *
 collect_fallthrough_labels (gimple_stmt_iterator *gsi_p,
-   auto_vec  *labels)
+   auto_vec  *labels,
+   location_t *prevloc)
 {
   gimple *prev = NULL;
 
+  *prevloc = UNKNOWN_LOCATION;
   do
 {
   if (gimple_code (gsi_stmt (*gsi_p)) == GIMPLE_BIND)
@@ -1978,7 +1980,7 @@ collect_fallthrough_labels (gimple_stmt_iterator *gsi_p,
  /* It might be a label without a location.  Use the
 location of the scope then.  */
  if (!gimple_has_location (prev))
-   gimple_set_location (prev, bind_loc);
+   *prevloc = bind_loc;
}
  gsi_next (gsi_p);
  continue;
@@ -2061,6 +2063,8 @@ collect_fallthrough_labels (gimple_stmt_iterator *gsi_p,
 && (gimple_code (gsi_stmt (*gsi_p)) != GIMPLE_LABEL
  || !gimple_has_location (gsi_stmt (*gsi_p))));
 
+  if (prev && gimple_has_location (prev))
+*prevloc = gimple_location (prev);
   return prev;
 }
 
@@ -2157,7 +2161,8 @@ warn_implicit_fallthrough_r (gimple_stmt_iterator *gsi_p, bool *handled_ops_p,
 
/* Vector of labels that fall through.  */
auto_vec  labels;
-   gimple *prev = collect_fallthrough_labels (gsi_p, &labels);
+   location_t prevloc;
+   gimple *prev = collect_fallthrough_labels (gsi_p, &labels, &prevloc);
 
/* There might be no more statements.  */
if (gsi_end_p (*gsi_p))
@@ -2185,8 +2190,8 @@ warn_implicit_fallthrough_r (gimple_stmt_iterator *gsi_p, bool *handled_ops_p,
 /* Try to be clever and don't warn when the statement
can't actually fall through.  */
 && gimple_stmt_may_fallthru (prev)
-&& gimple_has_location (prev))
- warned_p = warning_at (gimple_location (prev),
+&& prevloc != UNKNOWN_LOCATION)
+ warned_p = warning_at (prevloc,
 OPT_Wimplicit_fallthrough_,
 "this statement may fall through");
if (warned_p)


Re: [PATCH][libbacktrace] Handle DW_FORM_GNU_strp_alt

2018-11-14 Thread Tom de Vries
On 14-11-18 14:25, Jakub Jelinek wrote:
> On Wed, Nov 14, 2018 at 02:08:05PM +0100, Tom de Vries wrote:
>>> +btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0
>>
>> Hmm, I already discovered that specifying the -O0 doesn't work, since
>> it's overridden by $(CFLAGS).
>>
>> With a hack like this:
>> ...
>> diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
>> index 2fec9bbb4b6..8bdf13b3546 100644
>> --- a/libbacktrace/Makefile.am
>> +++ b/libbacktrace/Makefile.am
>> @@ -99,11 +99,14 @@ check_PROGRAMS += btest
>>  if HAVE_DWZ
>>
>>  btest_dwz_SOURCES = btest_dwz.c testlib.c
>> -btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0
>> +btest_dwz_CFLAGS = $(AM_CFLAGS) -g
>>  btest_dwz_LDADD = libbacktrace.la
>>
>>  check_PROGRAMS += btest_dwz
>>
>> +btest_dwz-btest_dwz.o: btest_dwz.c
>> +   $(AM_V_CC)$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES)
>> $(AM_CPPFLAGS) $(CPPFLAGS) $(btest_dwz_CFLAGS) $(CFLAGS) -O0 -c -o
>> btest_dwz-btest_dwz.o `test -f 'btest_dwz.c' || echo
>> '$(srcdir)/'`btest_dwz.c
> 
> Can't you instead do something like:
> btest_dwz.o: CFLAGS += -g -O0
> or something similar

Hi,

yes, that works, thanks.

> (whatever the corresponding goal is)?

The goal is to run the testcase with a setting lower than -O2, such that
we can successfully run a substantial portion of the test without
needing support for DW_FORM_GNU_ref_alt.

[ At O2 we get constprop versions of some functions, which have an
abstract origin, which tends to be moved to the common debug file by dwz
-m, after which we need support for DW_FORM_GNU_ref_alt to get to the
name of the function. ]

> Otherwise, the patch looks generally ok to me,

Great.

> but yes, I've been wondering
> how you can get away with DW_FORM_GNU_ref_alt not implemented properly.
> 

Indeed, DW_FORM_GNU_ref_alt support is required to make this work in
general.

But I observed that implementing just DW_FORM_GNU_strp_alt improves on
the current situation, so I thought it was worthwhile submitting this as
a separate patch.

Updated patch attached (which also rewrites btest_dwz.c to an include of
btest.c, while disabling the inline tests that require DW_FORM_GNU_ref_alt).

Thanks,
- Tom

[libbacktrace] Handle DW_FORM_GNU_strp_alt

The dwz tool attempts to optimize DWARF debugging information contained in ELF
shared libraries and ELF executables for size.

With the dwz -m option, it attempts to optimize by moving DWARF debugging
information entries (DIEs), strings and macro descriptions duplicated in
more than one object into a newly created ELF ET_REL object whose filename is
given as -m option argument.  The debug sections in the executables and
shared libraries specified on the command line are then modified again,
referring to the entities in the newly created object.

After a dwz invocation:
...
$ dwz -m c.debug a.out b.out
...
both a.out and b.out contain a .gnu_debugaltlink section referring to c.debug,
and use "DWZ DWARF multifile extensions" such as DW_FORM_GNU_strp_alt and
DW_FORM_GNU_ref_alt to refer to the content of c.debug.

The .gnu_debugaltlink consists of a filename and the expected buildid.
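
A minimal sketch of splitting such a section, assuming the filename is NUL-terminated and the build-id occupies all remaining bytes.  This is not the libbacktrace code from the patch; `debugaltlink` and `parse_debugaltlink` are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

struct debugaltlink
{
  std::string filename;			// e.g. "c.debug"
  std::vector<unsigned char> buildid;	// expected build-id bytes
};

// Splits the raw section contents into the NUL-terminated filename and the
// build-id bytes that follow it.  Returns false if no terminator is found.
bool
parse_debugaltlink (const unsigned char *data, std::size_t size,
		    debugaltlink *out)
{
  const void *nul = std::memchr (data, 0, size);
  if (nul == NULL)
    return false;
  std::size_t len = (const unsigned char *) nul - data;
  out->filename.assign ((const char *) data, len);
  out->buildid.assign (data + len + 1, data + size);
  return true;
}
```

A reader would then open the named file and compare its own build-id note against `buildid` before trusting its contents.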

This patch adds to libbacktrace:
- finding a file matching the .gnu_debugaltlink filename
- verifying the .gnu_debugaltlink buildid
- reading the dwarf of the .gnu_debugaltlink
- support for DW_FORM_GNU_strp_alt
- a testcase btest_dwz.c, which includes btest.c but excludes the inline tests
  and is compiled with -O, such that it only requires DW_FORM_GNU_strp_alt.

Bootstrapped and reg-tested on x86_64.

2018-11-11  Tom de Vries  

	* dwarf.c (struct dwarf_data): Add altlink field.
	(read_attribute): Add altlink parameter.  Handle DW_FORM_GNU_strp_alt
	using altlink.
	(find_address_ranges, build_address_map, build_dwarf_data): Add and
	handle altlink parameter.
	(read_referenced_name, read_function_entry): Add argument to
	read_attribute call.
	(backtrace_dwarf_add): Add and handle fileline_entry and
	fileline_altlink parameters.
	* elf.c (elf_open_debugfile_by_debugaltlink): New function.
	(elf_add): Add and handle fileline_entry, with_buildid_data and
	with_buildid_size parameters.  Handle .gnu_debugaltlink section.
	(phdr_callback, backtrace_initialize): Add arguments to elf_add calls.
	* internal.h (backtrace_dwarf_add): Add fileline_entry and
	fileline_altlink parameters.
	* configure.ac (DWZ): Set with AC_CHECK_PROG.
	(HAVE_DWZ): Set with AM_CONDITIONAL.
	* configure: Regenerate.
	* Makefile.am (check_PROGRAMS): Add btest_dwz.
	(TESTS): Add btest_dwz_2 and btest_dwz_3.
	* Makefile.in: Regenerate.
	* btest.c (INLINE_TESTS): Define to 1 if not defined.
	(main): Only run test2 and test4 if INLINE_TESTS.
	* btest_dwz.c: New file. Define INLINE_TESTS to 0 and include btest.c.

---
 libbacktrace/Makefile.am  |  22 +
 libbacktrace/Makefile.in  |  80 +--
 libbacktrace/btest.c  |  12 +
 libbacktrace/btest_dwz.c  |  33 +
 libbacktrace/configure|  57 +-
 libbacktrace/configure.ac |   3 ++
 libbacktra

Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses

2018-11-14 Thread Marek Polacek
On Wed, Nov 14, 2018 at 05:18:40PM +0530, Umesh Kalappa wrote:
> Thank you Jason and Marek for the suggestions .
> 
> Attached patch(pr86512.patch)  along the Changelog .

It seems you've attached the wrong patch.

Marek


Re: [PATCH] Fix PR87985

2018-11-14 Thread Richard Biener
On Mon, 12 Nov 2018, Richard Biener wrote:

> 
> The following fixes split_constant_offset's unbounded un-CSEing of
> expressions when following SSA def stmts.  Simply limiting it to
> single-uses isn't good for consumers so the following instead
> limits analysis by implementing a cache.  Note this may still
> end up un-CSEing stuff but I didn't want to try inserting
> SAVE_EXPRs in split_constant_offset result...  (maybe I should
> simply try though...).  Another option would be to give up
> when we see several uses of an "interesting" expression, thus
> make the hash-map a visited thing instead (but the result would
> be somewhat odd I guess).
> 
> Anyway, the following preserves existing behavior while fixing
> the compile-time issue for the testcase (which doesn't end up
> generating anything interesting).

So I sneaked in "obvious" changes not expanding stuff without
contributions to the constant.  Obviously that broke some
vectorizer tests where for dependence analysis we have to have
1:1 matching bases and ptr + (sizetype)i[+ 0] vs. ptr + (sizetype)(i+4)
was expanded inconsistently.  I've taken out those changes and
committed the following instead.
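
Why a cache bounds the work can be seen in a toy model of the testcase's `a = a + b; b = b + a;` chain.  The sketch below is not the tree-data-ref.c code: `node`, `expand` and `make_chain` are made-up names, and an `std::unordered_map` keyed by node index stands in for the hash-map added by the patch.

```cpp
#include <unordered_map>
#include <vector>

// Toy SSA-like DAG: a node is a leaf (lhs == -1) or the sum of two
// earlier nodes.
struct node { int lhs; int rhs; };

// Counts the leaf occurrences in the fully expanded form of node N.
// CACHE ensures each node is expanded once; VISITS counts those expansions.
unsigned long long
expand (const std::vector<node> &g, int n,
	std::unordered_map<int, unsigned long long> &cache, unsigned &visits)
{
  auto it = cache.find (n);
  if (it != cache.end ())
    return it->second;
  ++visits;
  unsigned long long r = 1;
  if (g[n].lhs >= 0)
    r = (expand (g, g[n].lhs, cache, visits)
	 + expand (g, g[n].rhs, cache, visits));
  cache[n] = r;
  return r;
}

// Builds STEPS repetitions of "a = a + b; b = b + a;" on two leaves.
std::vector<node>
make_chain (int steps)
{
  std::vector<node> g = { {-1, -1}, {-1, -1} };	// leaves a and b
  int a = 0, b = 1;
  for (int i = 0; i < steps; ++i)
    {
      g.push_back ({a, b});
      a = (int) g.size () - 1;
      g.push_back ({b, a});
      b = (int) g.size () - 1;
    }
  return g;
}
```

The size of the expanded form grows like the Fibonacci numbers, while the number of expansions stays equal to the number of nodes; without the cache the work would follow the former.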

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

From 111674b00c850e849c66ab569a63c6ff7466110a Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Mon, 12 Nov 2018 14:45:27 +0100
Subject: [PATCH] fix-pr87985

PR middle-end/87985
* tree-data-ref.c (split_constant_offset): Add wrapper
allocating a cache hash-map.
(split_constant_offset_1): Cache results of expanding
expressions from SSA def stmts.

* gcc.dg/pr87985.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/pr87985.c b/gcc/testsuite/gcc.dg/pr87985.c
new file mode 100644
index 00000000000..c0d07ff918f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87985.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -ftree-slp-vectorize" } */
+
+char *bar (void);
+__INTPTR_TYPE__ baz (void);
+
+void
+foo (__INTPTR_TYPE__ *q)
+{
+  char *p = bar ();
+  __INTPTR_TYPE__ a = baz ();
+  __INTPTR_TYPE__ b = baz ();
+  int i = 0;
+#define X q[i++] = a; q[i++] = b; a = a + b; b = b + a;
+#define Y X X X X X X X X X X
+#define Z Y Y Y Y Y Y Y Y Y Y
+  Z Z Z Z Z Z Z Z Z Z
+  p[a] = 1;
+  p[b] = 2;
+}
diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index 6019c6168bf..0096afb9ba7 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -95,10 +95,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-affine.h"
 #include "params.h"
 #include "builtins.h"
-#include "stringpool.h"
-#include "tree-vrp.h"
-#include "tree-ssanames.h"
 #include "tree-eh.h"
+#include "ssa.h"
 
 static struct datadep_stats
 {
@@ -584,6 +582,10 @@ debug_ddrs (vec ddrs)
   dump_ddrs (stderr, ddrs);
 }
 
+static void
+split_constant_offset (tree exp, tree *var, tree *off,
+  hash_map > &cache);
+
 /* Helper function for split_constant_offset.  Expresses OP0 CODE OP1
(the type of the result is TYPE) as VAR + OFF, where OFF is a nonzero
constant of type ssizetype, and returns true.  If we cannot do this
@@ -592,7 +594,8 @@ debug_ddrs (vec ddrs)
 
 static bool
 split_constant_offset_1 (tree type, tree op0, enum tree_code code, tree op1,
-tree *var, tree *off)
+tree *var, tree *off,
+hash_map > &cache)
 {
   tree var0, var1;
   tree off0, off1;
@@ -613,8 +616,8 @@ split_constant_offset_1 (tree type, tree op0, enum tree_code code, tree op1,
   /* FALLTHROUGH */
 case PLUS_EXPR:
 case MINUS_EXPR:
-  split_constant_offset (op0, &var0, &off0);
-  split_constant_offset (op1, &var1, &off1);
+  split_constant_offset (op0, &var0, &off0, cache);
+  split_constant_offset (op1, &var1, &off1, cache);
   *var = fold_build2 (code, type, var0, var1);
   *off = size_binop (ocode, off0, off1);
   return true;
@@ -623,7 +626,7 @@ split_constant_offset_1 (tree type, tree op0, enum tree_code code, tree op1,
   if (TREE_CODE (op1) != INTEGER_CST)
return false;
 
-  split_constant_offset (op0, &var0, &off0);
+  split_constant_offset (op0, &var0, &off0, cache);
   *var = fold_build2 (MULT_EXPR, type, var0, op1);
   *off = size_binop (MULT_EXPR, off0, fold_convert (ssizetype, op1));
   return true;
@@ -647,7 +650,7 @@ split_constant_offset_1 (tree type, tree op0, enum tree_code code, tree op1,
 
if (poffset)
  {
-   split_constant_offset (poffset, &poffset, &off1);
+   split_constant_offset (poffset, &poffset, &off1, cache);
off0 = size_binop (PLUS_EXPR, off0, off1);
if (POINTER_TYPE_P (TREE_TYPE (base)))
  base = fold_build_pointer_plus (base, poffset);
@@ -691,18 +694,48 @@ split_constant_offset_1 (tree type, tree op0, enum tree_code code, tree op1,
if (gimple_code (def_stmt) != GIMPLE_ASSIGN)
  return false;
 
- 

[PATCH] Add missing dir to create_testsuite_files script

2018-11-14 Thread Jonathan Wakely

* scripts/create_testsuite_files: Add special_functions to the list
of directories to search. Add comment referring to conformance.exp.
* testsuite/libstdc++-dg/conformance.exp: Add comment referring
to create_testsuite_files.

Committed to trunk.

commit de9099395703eac44f7d1ab7c06dc20718dcc0b8
Author: Jonathan Wakely 
Date:   Wed Nov 14 14:11:14 2018 +

Add missing dir to create_testsuite_files script

* scripts/create_testsuite_files: Add special_functions to the list
of directories to search. Add comment referring to conformance.exp.
* testsuite/libstdc++-dg/conformance.exp: Add comment referring
to create_testsuite_files.

diff --git a/libstdc++-v3/scripts/create_testsuite_files 
b/libstdc++-v3/scripts/create_testsuite_files
index 156304c2ad2..40e81cea8a9 100755
--- a/libstdc++-v3/scripts/create_testsuite_files
+++ b/libstdc++-v3/scripts/create_testsuite_files
@@ -31,8 +31,10 @@ tests_file_perf="$outdir/testsuite_files_performance"
 cd $srcdir
 # This is the ugly version of "everything but the current directory".  It's
 # what has to happen when find(1) doesn't support -mindepth, or -xtype.
+# The directories here should be consistent with libstdc++-dg/conformance.exp
 dlist=`echo [0-9][0-9]*`
 dlist="$dlist abi backward ext performance tr1 tr2 decimal experimental"
+dlist="$dlist special_functions"
 find $dlist "(" -type f -o -type l ")" -name "*.cc" -print > $tmp.01
 find $dlist "(" -type f -o -type l ")" -name "*.c" -print > $tmp.02
 cat  $tmp.01 $tmp.02 | sort > $tmp.1
diff --git a/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp 
b/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
index 49ac6fb1649..f372d670f6b 100644
--- a/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
+++ b/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
@@ -52,6 +52,7 @@ if {[info exists tests_file] && [file exists $tests_file]} {
 close $f
 } else {
 # Find directories that might have tests.
+# This list should be consistent with scripts/create_testsuite_files
 set subdirs [glob "$srcdir/\[0-9\]\[0-9\]*"]
 lappend subdirs "$srcdir/abi"
 lappend subdirs "$srcdir/backward"


Re: [PATCH] Fix bootstrap with GCC 4.1.2 (PR bootstrap/86739)

2018-11-14 Thread Jonathan Wakely

On 14/11/18 09:25 +0100, Richard Biener wrote:

On Wed, 14 Nov 2018, Jakub Jelinek wrote:


Hi!

As mentioned in the PR, with GCC before 4.3 one can't instantiate std::pair
where one or both of the template parameters are reference types, because
the std::pair constructor has arguments references to the template parameter
types and the CWG that resolved hasn't been applied to those compilers.

The following patch works around it by not returning
std::pair object, but instead a different class that
holds the two references and has conversion operator to std::pair.

If that conversion operator isn't acceptable, in the PR there is another
patch which adjusts the (so far) two spots which need to be changed in that
case.

Bootstrapped/regtested on x86_64-linux and i686-linux (using GCC 7 as
bootstrap compiler) and tested on the preprocessed source with GCC 4.1.
Ok for trunk?


Works for me if C++ people have no better idea.


Looks good to me.



Re: [PATCH] Support simd function declarations via a pre-include.

2018-11-14 Thread Jakub Jelinek
On Wed, Nov 14, 2018 at 03:09:49PM +0100, Martin Liška wrote:
> > So omp-simd-notinbranch or omp_simd_notinbranch?
> > Any particular reason for this weird syntax and for not also
> > supporting inbranch or just simd?
> 
> Questionable whether to support as current glibc vector ABI only uses 
> notinbranch?

simd attribute supports all those three cases, i.e.
__attribute__((simd)), __attribute__((simd ("notinbranch"))) and
__attribute__((simd ("inbranch"))).
So at least for consistency it would be nice to support the same thing.
Some users could e.g. provide a vectorized definition for other builtins
and have e.g. inbranch support there only (the vectorizer can use those
by providing all ones masks).
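
For illustration, here is a minimal sketch of the three C spellings listed above. The functions scale2, add1, half and apply_all are hypothetical and not part of the patch; only the attribute spellings come from this thread. A compiler that does not know the attribute will typically warn and ignore it.

```c
#include <assert.h>

/* Hypothetical functions; only the attribute spellings come from the thread.  */

/* Plain simd: by default all clones (masked and unmasked) may be created.  */
__attribute__((simd)) double scale2 (double x) { return 2.0 * x; }

/* notinbranch: unmasked clones only.  */
__attribute__((simd ("notinbranch"))) double add1 (double x) { return x + 1.0; }

/* inbranch: masked clones only; an unconditional call can still use
   them by passing an all-ones mask.  */
__attribute__((simd ("inbranch"))) double half (double x) { return 0.5 * x; }

/* A loop the vectorizer might turn into calls to the vector clones
   when optimizing.  The scalar semantics are unchanged either way.  */
void
apply_all (double *v, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    v[i] = half (add1 (scale2 (v[i])));
}
```

Supporting all three spellings on the Fortran side would mirror this, so a header could describe a builtin that only has masked variants.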

Jakub


Re: [PATCH] Support simd function declarations via a pre-include.

2018-11-14 Thread Martin Liška
On 11/14/18 12:35 PM, Jakub Jelinek wrote:
> On Wed, Nov 14, 2018 at 11:06:04AM +0100, Martin Liška wrote:
>> Question I have is about default search locations for the header file. On my 
>> machine I can
>> see:
>> access("/home/marxin/bin/gcc2/lib64/gcc/x86_64-pc-linux-gnu/9.0.0/math-vector-fortran.h",
>>  R_OK) = -1 ENOENT (No such file or directory)
>> access("/home/marxin/bin/gcc2/lib64/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/lib/x86_64-pc-linux-gnu/9.0.0/math-vector-fortran.h",
>>  R_OK) = -1 ENOENT (No such file or directory)
>> access("/home/marxin/bin/gcc2/lib64/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/lib/../lib64/math-vector-fortran.h",
>>  R_OK) = -1 ENOENT (No such file or directory)
>> access("/home/marxin/bin/gcc2/lib64/gcc/x86_64-pc-linux-gnu/9.0.0/../../../x86_64-pc-linux-gnu/9.0.0/math-vector-fortran.h",
>>  R_OK) = -1 ENOENT (No such file or directory)
>> access("/home/marxin/bin/gcc2/lib64/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../lib64/math-vector-fortran.h",
>>  R_OK) = -1 ENOENT (No such file or directory)
>> access("/lib/x86_64-pc-linux-gnu/9.0.0/math-vector-fortran.h", R_OK) = -1 
>> ENOENT (No such file or directory)
>> access("/lib/../lib64/math-vector-fortran.h", R_OK) = -1 ENOENT (No such 
>> file or directory)
>> access("/usr/lib/x86_64-pc-linux-gnu/9.0.0/math-vector-fortran.h", R_OK) = 
>> -1 ENOENT (No such file or directory)
>> access("/usr/lib/../lib64/math-vector-fortran.h", R_OK) = -1 ENOENT (No such 
>> file or directory)
>> access("/home/marxin/bin/gcc2/lib64/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/lib/math-vector-fortran.h",
>>  R_OK) = -1 ENOENT (No such file or directory)
>> access("/home/marxin/bin/gcc2/lib64/gcc/x86_64-pc-linux-gnu/9.0.0/../../../math-vector-fortran.h",
>>  R_OK) = -1 ENOENT (No such file or directory)
>> access("/lib/math-vector-fortran.h", R_OK) = -1 ENOENT (No such file or 
>> directory)
>> access("/usr/lib/math-vector-fortran.h", R_OK) = -1 ENOENT (No such file or 
>> directory)
>>
>> Aren't these locations desired for libraries, instead of include locations?
> 
> That isn't correct indeed.
> What about find_a_file (&include_prefixes, ... )?

Thanks, so setting the last argument to true should handle the multilib
support here.

> Though, in the design where to put the file we really need to have multilib
> (and multiarch on Debian/Ubuntu) in mind, because e.g. on x86_64-linux you
> want to find a m64 version of the header for -m64, but a different for -m32
> and there is always the possibility somebody installs a 32-bit gfortran on 
> x86_64.
> 
>> --- a/gcc/config/gnu-user.h
>> +++ b/gcc/config/gnu-user.h
>> @@ -170,3 +170,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively. 
>>  If not, see
>>LD_STATIC_OPTION " --whole-archive -llsan --no-whole-archive " \
>>LD_DYNAMIC_OPTION "}}%{!static-liblsan:-llsan}"
>>  #endif
>> +
>> +#undef TARGET_F951_NOSTDINC_OPTIONS
>> +#define TARGET_F951_NOSTDINC_OPTIONS "%:fortran-header-file(-fpre-include= 
>> math-vector-fortran.h)"
> 
> Too long line, use some \s to split it up.
> 
>> +   Flags are one of:
>> + - omp-simd-notinbranch.
> 
> So omp-simd-notinbranch or omp_simd_notinbranch?
> Any particular reason for this weird syntax and for not also
> supporting inbranch or just simd?

Questionable whether to support as current glibc vector ABI only uses 
notinbranch?

> 
>> +
>> +   When we come here, we have already matched the !GCC$ builtin string.  */
>> +match
>> +gfc_match_gcc_builtin (void)
>> +{
>> +  char builtin[GFC_MAX_SYMBOL_LEN + 1];
>> +
>> +  if (gfc_match_name (builtin) != MATCH_YES)
>> +return MATCH_ERROR;
>> +
>> +  gfc_gobble_whitespace ();
>> +  if (gfc_match ("attributes") != MATCH_YES)
>> +return MATCH_ERROR;
>> +
>> +  gfc_gobble_whitespace ();
>> +  if (gfc_match ("omp_simd_notinbranch") != MATCH_YES)
>> +return MATCH_ERROR;
> 
> Why so many gfc_match calls?  Can't you e.g. just do
>   if (gfc_match ("%n attributes simd", builtin) != MATCH_YES)
> return MATCH_ERROR;
> 
>   int builtin_kind = 0; /* Or whatever, just want to show the parsing here.  
> */
>   if (gfc_match ("(notinbranch)") == MATCH_YES)
> builtin_kind = -1;
>   else if (gfc_match ("(inbranch)") == MATCH_YES)
> builtin_kind = 1;
> 
> The space in gfc_match indicates gfc_gobble_whitespace (), i.e. optionally
> eating whitespace (note, in fixed form white space is optionally eaten
> pretty much always).  If you want a mandatory space, there is "% ".
> So it depends on whether in fixed form we want to support e.g.
> !GCC$ BUI LTIN SINf ATTRI BUT ESSIMD(NOT IN BRANCH)
> and in free form e.g.
> !gcc$ builtin sinf attributessimd (notinbranch)
> I wouldn't use omp_simd because in C/C++ the attribute is called simd.
>> +
>> +  char *r = XNEWVEC (char, strlen (builtin) + 32);
>> +  sprintf (r, "__builtin_%s", builtin);
>> +  vectorized_builtins.safe_push (r);
> 
> Perhaps make it vector of const char *, int pairs, so that you

Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-14 Thread Segher Boessenkool
On Wed, Nov 14, 2018 at 03:33:43PM +0300, Alexander Monakov wrote:
> On Wed, 14 Nov 2018, Jakub Jelinek wrote:
> 
> > On Wed, Nov 14, 2018 at 06:22:51AM -0600, Segher Boessenkool wrote:
> > > Btw, if you just add
> > > 
> > > void *
> > > retsp (void)
> > > {
> > >   register void *sp __asm ("sp");
> > >   asm ("" : "+g" (sp));  // <-- this line
> > >   return sp;
> > > }
> > > 
> > > everything works fine.
> > 
> > Even in what you are proposing, i.e. handle the var as any other var
> > in SSA form and only copy into the hard register right before asm and out of
> > it after it?
> > Because 
> > {
> >   void *sp;
> >   asm ("" : "+g" (sp));
> >   return sp;
> > }
> > would store into the register default definition of the SSA_NAME (the var
> > has no initializer).
> 
> I think with "=g" rather than "+g" this example is ok.

No, it needs the register var as an input.  That is the whole *point*.
It could output elsewhere, like with

void *
retsp (void)
{
  register void *sp __asm ("sp");
  void *p;
  asm ("" : "=g" (p) : "0" (sp));
  return p;
}

(which also works reliably with current GCC).

Or like

void *
retsp (void)
{
  register void *sp __asm ("sp");
  void *p;
  asm ("mov %0,%1" : "=r" (p) : "r" (sp));
  return p;
}

if you don't want to tie the asm input and output (but use an extra
machine instruction, alas).


Segher


Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-14 Thread Segher Boessenkool
On Wed, Nov 14, 2018 at 01:27:26PM +0100, Jakub Jelinek wrote:
> On Wed, Nov 14, 2018 at 06:22:51AM -0600, Segher Boessenkool wrote:
> > Btw, if you just add
> > 
> > void *
> > retsp (void)
> > {
> >   register void *sp __asm ("sp");
> >   asm ("" : "+g" (sp));  // <-- this line
> >   return sp;
> > }
> > 
> > everything works fine.
> 
> Even in what you are proposing, i.e. handle the var as any other var
> in SSA form and only copy into the hard register right before asm and out of
> it after it?

Yes, *only* in that: with current trunk sp lives in the "sp" hard register
at the "return sp", which cannot work reliably (what value is returned?
It is unspecified).

> Because 
> {
>   void *sp;
>   asm ("" : "+g" (sp));
>   return sp;
> }
> would store into the register default definition of the SSA_NAME (the var
> has no initializer).

I'm more concerned about what it looks like in RTL, but sure :-)  What
*should* it do before RTL?  Not much at all I think, just keep track that
this var is a register asm and that's that?


Segher


Re: [PATCH v3] Add sinh(atanh(x)) and cosh(atanh(x)) optimizations

2018-11-14 Thread Wilco Dijkstra
Hi,


> Indeed. After plotting the graph of both functions, it is very clear
> that this check isn't required. Sorry about that.

It wouldn't be clear from the graph, you need to check that +0.0, -0.0,
out of range values, infinities, NaNs give the same answer before/after
your transformation. If so, then you don't need anything extra except
for unsafe-math-optimizations and no-math-errno (given errno handling
is changed).
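
As a sketch of what that check looks like on paper, the special cases can be tabulated by hand. This assumes the transformation replaces sinh(atanh(x)) by x/sqrt(1-x^2) and cosh(atanh(x)) by 1/sqrt(1-x^2); that is the standard identity, but the exact form used by the patch is not quoted in this message.

```latex
\begin{align*}
y = \operatorname{atanh}(x),\ |x| < 1
  &\;\Longrightarrow\; \tanh y = x,\quad \cosh^2 y - \sinh^2 y = 1 \\
  &\;\Longrightarrow\; \cosh y = \frac{1}{\sqrt{1 - x^2}},\qquad
     \sinh y = \frac{x}{\sqrt{1 - x^2}} \\[4pt]
x = \pm 0 &: \quad \sinh(\operatorname{atanh}(\pm 0)) = \pm 0
   = \frac{\pm 0}{\sqrt{1}} \\
x = \pm 1 &: \quad \sinh(\operatorname{atanh}(\pm 1)) = \sinh(\pm\infty) = \pm\infty
   = \frac{\pm 1}{\sqrt{+0}} \\
|x| > 1 &: \quad \operatorname{atanh}(x) = \mathrm{NaN},\qquad
   \sqrt{1 - x^2} = \mathrm{NaN}
\end{align*}
```

Under that assumption the values agree in every row; what can still differ is rounding and the exception flags raised.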

> There can be NaNs and Infinities. For NaNs, take any input that is
> outside the [-1, 1] line.
> For Infinities, take x = -1, or x = 1. I think these must be 'honored'
> as to ensure compatibility with the original expression.

The question is whether you get the same answer for these, not whether
you can end up with an infinity or NaN. The idea is that we optimize based
on the assumption there are no infinities or NaNs. FP operations can still
produce infinities or NaNs, the compiler just doesn't need to worry about
treating them correctly, and it's the programmer's responsibility to ensure
they are not generated.

> so I must check for
> !HONOR_SIGNED_ZEROS (type) && HONOR_NANS (type) && HONOR_INFINITIES (type)
> that is correct? Also, is it safe to remove the !finite_math_only with
> this, as now it is stated that the type supports infinity and NaNs?

No that doesn't look quite right. First check whether the transformation
handles zero/inf/NaN correctly, if so you don't need any of this.

> However, I am not sure if it is OK to remove unsafe-math-optimizations
> even if it enables
> finite_math_only because of the 2 ULP error. As stated in the first
> iteration, the user can be
> using a very precise math library that yields 0 ULP.

Well 0 ULP would be an impossibility. Unsafe math seems reasonable since
it does behave slightly differently (including in terms of exception flags set).
It's unfortunate GCC doesn't have a clear definition of IEEE conformance
modes like various other compilers.

Wilco





Re: [PATCH][libbacktrace] Handle DW_FORM_GNU_strp_alt

2018-11-14 Thread Jakub Jelinek
On Wed, Nov 14, 2018 at 02:08:05PM +0100, Tom de Vries wrote:
> > +btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0
> 
> Hmm, I already discovered that specifying the -O0 doesn't work, since
> it's overridden by $(CFLAGS).
> 
> With a hack like this:
> ...
> diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
> index 2fec9bbb4b6..8bdf13b3546 100644
> --- a/libbacktrace/Makefile.am
> +++ b/libbacktrace/Makefile.am
> @@ -99,11 +99,14 @@ check_PROGRAMS += btest
>  if HAVE_DWZ
> 
>  btest_dwz_SOURCES = btest_dwz.c testlib.c
> -btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0
> +btest_dwz_CFLAGS = $(AM_CFLAGS) -g
>  btest_dwz_LDADD = libbacktrace.la
> 
>  check_PROGRAMS += btest_dwz
> 
> +btest_dwz-btest_dwz.o: btest_dwz.c
> +   $(AM_V_CC)$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES)
> $(AM_CPPFLAGS) $(CPPFLAGS) $(btest_dwz_CFLAGS) $(CFLAGS) -O0 -c -o
> btest_dwz-btest_dwz.o `test -f 'btest_dwz.c' || echo
> '$(srcdir)/'`btest_dwz.c

Can't you instead do something like:
btest_dwz.o: CFLAGS += -g -O0
or something similar (whatever the corresponding goal is)?

Otherwise, the patch looks generally ok to me, but yes, I've been wondering
how you can get away with DW_FORM_GNU_ref_alt not implemented properly.

Jakub


Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses

2018-11-14 Thread Umesh Kalappa
>>We are running make check-gcc (x86_64) and will let you know about any
>>regressions.
No regressions found.

~Umesh
On Wed, Nov 14, 2018 at 5:18 PM Umesh Kalappa  wrote:
>
> Thank you Jason and Marek for the suggestions .
>
> Attached patch(pr86512.patch)  along the Changelog .
>
> and also please note tested the patch for x86_64 only with "make -k
> check-gcc RUNTESTFLAGS=dg.exp=g++.dg" and see no regressions.
>
> We are running make check-gcc (x86_64) and will let you know about any
> regressions.
>
> Meanwhile ,Please let us know your thoughts on the patch.
>
> Thank you
> ~Umesh
>
> On Wed, Nov 14, 2018 at 2:55 AM Jason Merrill  wrote:
> >
> > On Tue, Nov 13, 2018 at 10:40 AM Marek Polacek  wrote:
> > > On Tue, Nov 13, 2018 at 11:49:55AM +0530, Umesh Kalappa wrote:
> > > > Hi All,
> > > >
> > > > the following patch fix the subjected issue
> > > >
> > > > Index: gcc/cp/parser.c
> > > > ===
> > > > --- gcc/cp/parser.c (revision 266026)
> > > > +++ gcc/cp/parser.c (working copy)
> > > > @@ -24615,6 +24615,8 @@
> > > >  {
> > > >tree expr;
> > > >cp_lexer_consume_token (parser->lexer);
> > > > +
> > > > +  inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);
> > > >
> > > >if (cp_lexer_peek_token (parser->lexer)->type == CPP_OPEN_PAREN)
> > > > {
> > > >
> > > >
> > > > ok to commit along the testcase with changelog update ?
> > >
> > > Thanks for the patch.
> > >
> > > Please also include the testcase along with the patch (and I think it 
> > > should
> > > also test noexcept in a template).  Please also include a ChangeLog entry
> > > in the patch submission.
> > >
> > > Can you describe how this patch has been tested?
> > >
> > > Further, wouldn't it be better to call inject_this_parameter inside the
> > > CPP_OPEN_PAREN block?  If noexcept doesn't have any expression, then it
> > > can't refer to "this".
> >
> > Agreed, thanks.  You also need to restore the old
> > current_class_{ptr,ref} at the end of the noexcept-specifier.
> >
> > Jason


Re: [PR81878]: fix --disable-bootstrap --enable-languages=ada, and cross-back gnattools build

2018-11-14 Thread Arnaud Charlet
> Huh, indeed - it's a host_module without bootstrap ...  and libada is
> a target_module not bootstrapped either.  So we're indeed in a curious
> situation where we have a bootstrap of Ada requiring a host Ada but
> nothing of Ada is actually bootstrapped ... ;)

Not sure what you mean by that, all the files needed to compile gnat1 and
gnatbind (which includes most of the files under gcc/gcc/ada and all the files
under gcc/gcc/ada/gcc-interface) are bootstrapped. What's not bootstrapped are
the Ada runtime (only a subset is as part of bootstrapping gnat1/gnatbind)
and Ada tools.

If we were starting from scratch, we would indeed likely have a different
and simpler bootstrap scheme where:

- we first build gnat1 only
- then we build the Ada runtime (libgnat/libgnarl)
- then we build Ada tools (gnatbind, gnatlink, gnatmake, etc...)

and then we iterate again for stage2 and stage3 on the above using the
previously built toolchain.

Doing the above at this stage and given the complexity of the GCC Makefiles
would require a lot of complex and error prone work, not sure it's worth the
trouble and it would likely take a lot of time and effort to get all the
combinations of possible builds (including all complex cases of
"standard" cross and canadian cross builds) working.

> Yeah, I expected that for non-bootstrap.  And I somehow assumed it
> was bootstrapped so I'd get gnattools and gnat1 not depending on the
> host compiler libs.  I guess we're lucky for gnat1 because it's written
> in C?

gnat1 is written mostly in Ada not in C (most of the Ada files under
gcc/gcc/ada are used for gnat1).

Arno


Re: [PATCH] Add C++ runtime support for new 128-bit long double format

2018-11-14 Thread Michael Meissner
On Mon, Nov 12, 2018 at 11:09:45AM +, Jonathan Wakely wrote:
> This adds support for the new 128-bit long double format on powerpc64,
> see https://fedoraproject.org/wiki/Changes/PPC64LE_Float128_Transition
> for more details.
> 
> Most of the required changes are to the locale facets that parse and
> print long doubles, as used by iostreams for reading/writing numbers.

Thanks for the patches.  Unfortunately, when I use the Advance Toolchain
AT12 libraries, which have the f128 functions, and try to compile a test,
I get some redefinition errors:

In file included from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/bits/move.h:55,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/bits/nested_exception.h:40,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/exception:144,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/ios:39,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/ostream:38,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/iostream:39,
 from foo.cc:1:
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/type_traits:340:12:
 error: redefinition of 'struct std::__is_floating_point_helper'
  340 | struct __is_floating_point_helper<__float128>
  |^~
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/type_traits:335:12:
 note: previous definition of 'struct std::__is_floating_point_helper'
  335 | struct __is_floating_point_helper
  |^~~
In file included from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/cstdlib:77,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/ext/string_conversions.h:41,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/bits/basic_string.h:6412,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/string:52,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/bits/locale_classes.h:40,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/bits/ios_base.h:41,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/ios:42,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/ostream:38,
 from 
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/iostream:39,
 from foo.cc:1:
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/bits/std_abs.h:102:3:
 error: redefinition of 'constexpr long double std::abs(long double)'
  102 |   abs(__float128 __x)
  |   ^~~
/home/meissner/fsf-install-ppc64le/ieee-c++/include/c++/9.0.0/bits/std_abs.h:78:3:
 note: 'constexpr long double std::abs(long double)' previously defined here
   78 |   abs(long double __x)
  |   ^~~

I configured the compiler with the options:

/home/meissner/fsf-src/ieee-c++/configure \
--prefix=/home/meissner/fsf-install-ppc64le/ieee-c++ \
--enable-languages=c,c++,fortran --enable-checking 
--enable-stage1-checking \
--enable-gnu-indirect-function --enable-plugin --enable-decimal-float \
--with-long-double-128 --enable-secureplt --enable-threads=posix \
--enable-__cxa_atexit --with-cpu=power8 \
--with-as=/opt/at12.0/bin/as --with-ld=/opt/at12.0/bin/ld \
--with-gnu-as=/opt/at12.0/bin/as --with-gnu-ld=/opt/at12.0/bin/ld \
--with-advance-toolchain=at12.0 
--with-native-system-header-dir=/opt/at12.0/include \
--without-ppl --without-cloog --without-isl

The test case was:

#include <iostream>

#ifndef TYPE
#define TYPE long double
#endif

volatile TYPE a = (TYPE)3, b = (TYPE)4;

int main (void)
{
  std::cout << "Value is " << a+b << "\n";
  return 0;
}

And I invoked the compiler with:

$ /home/meissner/fsf-install-ppc64le/ieee-c++/bin/g++ -O2 
-mabi=ieeelongdouble -Wno-psabi foo.cc

Let me know if I can help in testing future versions of the patch.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



[PATCH 1/6] ifcvt: Store the number of created cmovs.

2018-11-14 Thread Robin Dapp
This patch saves the number of conditional moves created by
noce_convert_multiple_sets in the IF_INFO struct.  The backend can use
this to decide more easily whether to accept a generated sequence.
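
The caller-side pattern can be sketched in isolation as follows. This is a simplified model, not the GCC code: struct model_if_info, the function-pointer hook type and model_accept_up_to_two are hypothetical stand-ins. The point it illustrates is that the count is published before the profitability hook runs and cleared again if the hook rejects the sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct noce_if_info.  */
struct model_if_info
{
  const char *transform_name;
  unsigned int created_cmovs;
};

/* Signature of a modelled target profitability hook.  */
typedef bool (*model_profitable_fn) (const struct model_if_info *);

/* Record the transform name and cmov count before asking the hook,
   and clear both again if the hook rejects the sequence.  */
bool
model_convert_multiple_sets (struct model_if_info *info, unsigned int count,
                             model_profitable_fn profitable_p)
{
  assert (info != NULL && profitable_p != NULL);

  info->transform_name = "noce_convert_multiple_sets";
  info->created_cmovs = count;

  if (!profitable_p (info))
    {
      info->transform_name = "";
      info->created_cmovs = 0;
      return false;
    }
  return true;
}

/* Example hook: accept sequences with at most two conditional moves.  */
bool
model_accept_up_to_two (const struct model_if_info *info)
{
  return info->created_cmovs <= 2;
}
```

Resetting the fields on rejection keeps stale data from leaking into a later transformation attempt on the same structure.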

--

gcc/ChangeLog:

2018-11-14  Robin Dapp  

* ifcvt.c (noce_convert_multiple_sets): Set cmov count.
(noce_find_if_block): Set cmov count.
* ifcvt.h (struct noce_if_info): Add cmov count.
---
 gcc/ifcvt.c | 10 --
 gcc/ifcvt.h |  4 
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 8b3907618e7..ddf077fa051 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3247,9 +3247,14 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   /* Actually emit the sequence if it isn't too expensive.  */
   rtx_insn *seq = get_insns ();
 
+  if_info->transform_name = "noce_convert_multiple_sets";
+  if_info->created_cmovs = count;
+
   if (!targetm.noce_conversion_profitable_p (seq, if_info))
 {
   end_sequence ();
+  if_info->transform_name = "";
+  if_info->created_cmovs = 0;
   return FALSE;
 }
 
@@ -3296,7 +3301,7 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 }
 
   num_updated_if_blocks++;
-  if_info->transform_name = "noce_convert_multiple_sets";
+
   return TRUE;
 }
 
@@ -4060,7 +4065,8 @@ noce_find_if_block (basic_block test_bb, edge then_edge, 
edge else_edge,
  and jump_insns are always given a cost of 1 by seq_cost, so treat
  both instructions as having cost COSTS_N_INSNS (1).  */
   if_info.original_cost = COSTS_N_INSNS (2);
-
+  if_info.transform_name = "";
+  if_info.created_cmovs = 0;
 
   /* Do the real work.  */
 
diff --git a/gcc/ifcvt.h b/gcc/ifcvt.h
index a18ba94b8df..50f40bbd1e5 100644
--- a/gcc/ifcvt.h
+++ b/gcc/ifcvt.h
@@ -108,6 +108,10 @@ struct noce_if_info
   /* The name of the noce transform that succeeded in if-converting
  this structure.  Used for debugging.  */
   const char *transform_name;
+
+  /* The number of created conditional moves in case we convert multiple
+ sets.  */
+  unsigned int created_cmovs;
 };
 
 #endif /* GCC_IFCVT_H */
-- 
2.17.0



[PATCH 4/6] S/390: Implement noce_conversion_profitable_p.

2018-11-14 Thread Robin Dapp
This patch implements noce_conversion_profitable_p for S/390.  It checks
which transformation ifcvt used and returns true if
noce_convert_multiple_sets created at most MAX_IFCVT_INSNS conditional
moves; otherwise it falls back to the default hook.
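
The decision logic can be modelled outside GCC roughly as below. All names with a model_ or MODEL_ prefix are hypothetical, and the default hook is reduced to a plain cost comparison, which is an assumption for illustration rather than what the real default does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the transform enum.  */
enum model_transform
{
  MODEL_TRANSFORM_NONE = 0,
  MODEL_TRANSFORM_CMOVE,
  MODEL_TRANSFORM_CONVERT_MULTIPLE_SETS
};

/* Hypothetical, simplified stand-in for struct noce_if_info.  */
struct model_if_info
{
  enum model_transform transform;
  unsigned int created_cmovs;
  unsigned int seq_cost;       /* modelled cost of the new sequence */
  unsigned int original_cost;  /* modelled cost of the code replaced */
};

#define MODEL_MAX_IFCVT_INSNS 2

/* Stand-in for the default hook: a plain cost comparison (assumption).  */
bool
model_default_profitable_p (const struct model_if_info *info)
{
  return info->seq_cost <= info->original_cost;
}

/* Accept short multiple-set conversions outright; otherwise defer.  */
bool
model_s390_profitable_p (const struct model_if_info *info)
{
  assert (info != NULL);
  if (info->transform == MODEL_TRANSFORM_CONVERT_MULTIPLE_SETS
      && info->created_cmovs <= MODEL_MAX_IFCVT_INSNS)
    return true;
  return model_default_profitable_p (info);
}
```

With this shape, a conversion that emits two conditional moves is accepted even when the modelled cost comparison would reject it, while longer sequences are still judged by cost.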

--

gcc/ChangeLog:

2018-11-14  Robin Dapp  

* config/s390/s390.c (MAX_IFCVT_INSNS): Define.
(s390_noce_conversion_profitable_p): Implement.
(TARGET_NOCE_CONVERSION_PROFITABLE_P): Define.
---
 gcc/config/s390/s390.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 0f33101d779..1018d9b8057 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -86,6 +86,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "ipa-fnsummary.h"
 #include "sched-int.h"
+#include "ifcvt.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -393,6 +394,8 @@ struct s390_address
   bool literal_pool;
 };
 
+#define MAX_IFCVT_INSNS 2
+
 /* Few accessor macros for struct cfun->machine->s390_frame_layout.  */
 
 #define cfun_frame_layout (cfun->machine->frame_layout)
@@ -15989,6 +15992,17 @@ s390_case_values_threshold (void)
   return default_case_values_threshold ();
 }
 
+static bool
+s390_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
+{
+  if (if_info->transform
+  && if_info->transform == ifcvt_transform_noce_convert_multiple_sets
+  && if_info->created_cmovs <= MAX_IFCVT_INSNS)
+return true;
+
+  return default_noce_conversion_profitable_p (seq, if_info);
+}
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ -16240,6 +16254,9 @@ s390_case_values_threshold (void)
 #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT 
s390_support_vector_misalignment
 
+#undef TARGET_NOCE_CONVERSION_PROFITABLE_P
+#define TARGET_NOCE_CONVERSION_PROFITABLE_P s390_noce_conversion_profitable_p
+
 #undef TARGET_VECTOR_ALIGNMENT
 #define TARGET_VECTOR_ALIGNMENT s390_vector_alignment
 
-- 
2.17.0



[PATCH 6/6] S/390: Add test for noce_convert_multiple_sets.

2018-11-14 Thread Robin Dapp
New test.

--

gcc/testsuite/ChangeLog:

2018-11-14  Robin Dapp  

* gcc.target/s390/ifcvt-two-insns-int.c: New test.
---
 .../gcc.target/s390/ifcvt-two-insns-int.c | 26 +++
 1 file changed, 26 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c

diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c 
b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
new file mode 100644
index 000..952c8fd890e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
@@ -0,0 +1,26 @@
+/* Check load on condition for bool.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O2 -march=z13" } */
+
+/* { dg-final { scan-assembler "lochinhe\t%r.?,1" } } */
+/* { dg-final { scan-assembler "locrhe\t.*" } } */
+#include 
+
+int foo (int *a, unsigned int n)
+{
+  int min = 99;
+  int bla = 0;
+  for (int i = 0; i < n; i++)
+{
+  if (a[i] < min)
+   {
+ min = a[i];
+ bla = 1;
+   }
+}
+
+  if (bla)
+min += 1;
+  return min;
+}
-- 
2.17.0



[PATCH 3/6] ifcvt: Use enum instead of transform_name string.

2018-11-14 Thread Robin Dapp
This patch introduces an enum for ifcvt's various noce transformations.
As the transformation might be queried by the backend, I find it nicer
to allow checking for a proper type instead of a string comparison.
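
A minimal sketch of the enum-plus-name-lookup idea, assuming a much shorter list of transforms than the real one; the MODEL_ enumerators and model_get_transform_name are hypothetical and only mirror the shape of what the ChangeLog below describes.

```c
#include <assert.h>

/* Hypothetical, shortened list of transforms.  */
enum model_transform
{
  MODEL_TRANSFORM_NONE = 0,
  MODEL_TRANSFORM_NOCE_TRY_MOVE,
  MODEL_TRANSFORM_NOCE_TRY_CMOVE,
  MODEL_TRANSFORM_NOCE_CONVERT_MULTIPLE_SETS,
  MODEL_TRANSFORM_COUNT
};

/* Map an enumerator back to a printable name for dump output.  */
const char *
model_get_transform_name (enum model_transform t)
{
  static const char *const names[MODEL_TRANSFORM_COUNT] = {
    "", "noce_try_move", "noce_try_cmove", "noce_convert_multiple_sets"
  };
  assert ((unsigned int) t < MODEL_TRANSFORM_COUNT);
  return names[t];
}
```

A backend can then compare enumerators directly, and the string is only produced where debugging output needs it.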

--

gcc/ChangeLog:

2018-11-14  Robin Dapp  

* ifcvt.c (noce_try_move): Use new function.
(noce_try_ifelse_collapse): Likewise.
(noce_try_store_flag): Likewise.
(noce_try_inverse_constants): Likewise.
(noce_try_store_flag_constants): Likewise.
(noce_try_addcc): Likewise.
(noce_try_store_flag_mask): Likewise.
(noce_try_cmove): Likewise.
(noce_try_cmove_arith): Likewise.
(noce_try_minmax): Likewise.
(noce_try_abs): Likewise.
(noce_try_sign_mask): Likewise.
(noce_try_bitop): Likewise.
(noce_convert_multiple_sets): Likewise.
(noce_process_if_block): Likewise.
(noce_find_if_block): Likewise.
* ifcvt.h (enum ifcvt_transform): Introduce enum.
(ifcvt_get_transform_name): New function.
(struct noce_if_info): Use enum instead of string.
---
 gcc/ifcvt.c | 46 ++--
 gcc/ifcvt.h | 67 ++---
 2 files changed, 88 insertions(+), 25 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 660bb46eb1c..94822c583fe 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1105,7 +1105,7 @@ noce_try_move (struct noce_if_info *if_info)
  emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
}
-  if_info->transform_name = "noce_try_move";
+  if_info->transform = ifcvt_transform_noce_try_move;
   return TRUE;
 }
   return FALSE;
@@ -1139,7 +1139,7 @@ noce_try_ifelse_collapse (struct noce_if_info * if_info)
   emit_insn_before_setloc (seq, if_info->jump,
  INSN_LOCATION (if_info->insn_a));
 
-  if_info->transform_name = "noce_try_ifelse_collapse";
+  if_info->transform = ifcvt_transform_noce_try_ifelse_collapse;
   return TRUE;
 }
 
@@ -1186,7 +1186,7 @@ noce_try_store_flag (struct noce_if_info *if_info)
 
   emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
-  if_info->transform_name = "noce_try_store_flag";
+  if_info->transform = ifcvt_transform_noce_try_store_flag;
   return TRUE;
 }
   else
@@ -1265,7 +1265,7 @@ noce_try_inverse_constants (struct noce_if_info *if_info)
 
   emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
-  if_info->transform_name = "noce_try_inverse_constants";
+  if_info->transform = ifcvt_transform_noce_try_inverse_constants;
   return true;
 }
 
@@ -1485,7 +1485,7 @@ noce_try_store_flag_constants (struct noce_if_info 
*if_info)
 
   emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
-  if_info->transform_name = "noce_try_store_flag_constants";
+  if_info->transform = ifcvt_transform_noce_try_store_flag_constants;
 
   return TRUE;
 }
@@ -1546,7 +1546,7 @@ noce_try_addcc (struct noce_if_info *if_info)
 
  emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
- if_info->transform_name = "noce_try_addcc";
+ if_info->transform = ifcvt_transform_noce_try_addcc;
 
  return TRUE;
}
@@ -1588,7 +1588,7 @@ noce_try_addcc (struct noce_if_info *if_info)
 
  emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
- if_info->transform_name = "noce_try_addcc";
+ if_info->transform = ifcvt_transform_noce_try_addcc;
  return TRUE;
}
  end_sequence ();
@@ -1639,7 +1639,7 @@ noce_try_store_flag_mask (struct noce_if_info *if_info)
 
  emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
- if_info->transform_name = "noce_try_store_flag_mask";
+ if_info->transform = ifcvt_transform_noce_try_store_flag_mask;
 
  return TRUE;
}
@@ -1791,7 +1791,7 @@ noce_try_cmove (struct noce_if_info *if_info)
 
  emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
- if_info->transform_name = "noce_try_cmove";
+ if_info->transform = ifcvt_transform_noce_try_cmove;
 
  return TRUE;
}
@@ -1844,7 +1844,7 @@ noce_try_cmove (struct noce_if_info *if_info)
 
  emit_insn_before_setloc (seq, if_info->jump,
   INSN_LOCATION (if_info->insn_a));
- if_info->transform_name = "noce_try_cmove";
+ if_info->transform = ifcvt_transform_noce_try_cmove;
  retur

[PATCH 5/6] ifcvt: Only create temporaries as needed.

2018-11-14 Thread Robin Dapp
noce_convert_multiple_sets creates temporaries for the destination of
every emitted cmov and expects subsequent passes to get rid of them.  This
does not always happen, and even when the temporaries are removed, code
generation can be adversely affected.  With this patch, temporaries are
created only if the destination of a set is used in an emitted condition
check.
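
A minimal standalone sketch of why a temporary is needed only when a
destination is mentioned in the condition.  This is plain C rather than
RTL and is not part of the patch; the function names (branchy,
cmov_no_temps, cmov_temps_as_needed, temps_demo) are made up for
illustration, and each ?: expression stands in for one emitted cmov that
re-evaluates the comparison.

```c
#include <assert.h>

/* Reference semantics: the branchy source that if-conversion starts from.  */
static void
branchy (int *x, int *z, int y, int a, int b)
{
  if (*x > y)
    {
      *x = a;
      *z = b;
    }
}

/* Writing every destination in place is wrong when a destination also
   appears in the condition: the second comparison sees the new x.  */
static void
cmov_no_temps (int *x, int *z, int y, int a, int b)
{
  *x = (*x > y) ? a : *x;
  *z = (*x > y) ? b : *z;
}

/* x is mentioned in the condition, so its new value is parked in a
   temporary; z is not, so it can be written in place.  */
static void
cmov_temps_as_needed (int *x, int *z, int y, int a, int b)
{
  int tmp_x = (*x > y) ? a : *x;
  *z = (*x > y) ? b : *z;
  *x = tmp_x;
}

void
temps_demo (void)
{
  int x0 = 5, z0 = 1, x1 = 5, z1 = 1, x2 = 5, z2 = 1;

  branchy (&x0, &z0, 3, 0, 7);
  cmov_no_temps (&x1, &z1, 3, 0, 7);
  cmov_temps_as_needed (&x2, &z2, 3, 0, 7);

  assert (x0 == 0 && z0 == 7);    /* what the source program computes */
  assert (x2 == x0 && z2 == z0);  /* one temporary is enough here */
  assert (z1 == 1);               /* in-place version lost the write to z */
}
```

Only x gets a temporary in the last variant, which mirrors the intent
described above: a destination that the condition does not read can be
the direct target of its conditional move.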

--

gcc/ChangeLog:

2018-11-14  Robin Dapp  

* ifcvt.c (check_need_temps): New function.
(noce_convert_multiple_sets): Only create temporaries if needed.
---
 gcc/ifcvt.c | 54 ++---
 1 file changed, 51 insertions(+), 3 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 94822c583fe..6d1803ed40d 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -99,6 +99,10 @@ static int dead_or_predicable (basic_block, basic_block, 
basic_block,
   edge, int);
 static void noce_emit_move_insn (rtx, rtx);
 static rtx_insn *block_has_only_trap (basic_block);
+static void check_need_temps (basic_block bb,
+  hash_map<rtx, bool> *need_temp,
+  rtx cond);
+
 
 /* Count the number of non-jump active insns in BB.  */
 
@@ -3166,6 +3170,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   auto_vec<rtx_insn *> unmodified_insns;
   int count = 0;
 
+  hash_map<rtx, bool> need_temps;
+
+  check_need_temps (then_bb, &need_temps, cond);
+
+  hash_map<rtx, bool> temps_created;
+
   FOR_BB_INSNS (then_bb, insn)
 {
   /* Skip over non-insns.  */
@@ -3176,10 +3186,20 @@ noce_convert_multiple_sets (struct noce_if_info 
*if_info)
   gcc_checking_assert (set);
 
   rtx target = SET_DEST (set);
-  rtx temp = gen_reg_rtx (GET_MODE (target));
   rtx new_val = SET_SRC (set);
   rtx old_val = target;
 
+  rtx dest = SET_DEST (set);
+
+  rtx temp;
+  if (need_temps.get (dest))
+   {
+ temp = gen_reg_rtx (GET_MODE (target));
+ temps_created.put (target, true);
+   }
+  else
+   temp = target;
+
   /* If we were supposed to read from an earlier write in this block,
 we've changed the register allocation.  Rewire the read.  While
 we are looking, also try to catch a swap idiom.  */
@@ -3269,8 +3289,8 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 
   /* Now fixup the assignments.  */
   for (int i = 0; i < count; i++)
-noce_emit_move_insn (targets[i], temporaries[i]);
-
+if (temps_created.get(targets[i]) && targets[i] != temporaries[i])
+  noce_emit_move_insn (targets[i], temporaries[i]);
 
   /* Actually emit the sequence if it isn't too expensive.  */
   rtx_insn *seq = get_insns ();
@@ -3775,6 +3795,34 @@ check_cond_move_block (basic_block bb,
   return TRUE;
 }
 
+/* Check for which sets we need to emit temporaries to hold the destination of
+   a conditional move.  */
+static void
+check_need_temps (basic_block bb, hash_map<rtx, bool> *need_temp, rtx cond)
+{
+  rtx_insn *insn;
+
+  FOR_BB_INSNS (bb, insn)
+{
+  rtx set, dest;
+
+  if (!active_insn_p (insn))
+   continue;
+
+  set = single_set (insn);
+  if (set == NULL_RTX)
+   continue;
+
+  dest = SET_DEST (set);
+
+  /* noce_emit_cmove will emit the condition check every time it is called
+ so we need a temp register if the destination is modified.  */
+  if (reg_overlap_mentioned_p (dest, cond))
+   need_temp->put (dest, true);
+}
+}
+
+
 /* Given a basic block BB suitable for conditional move conversion,
a condition COND, and pointer maps THEN_VALS and ELSE_VALS containing
the register values depending on COND, emit the insns in the block as
-- 
2.17.0



Re: [PATCH][libbacktrace] Handle DW_FORM_GNU_strp_alt

2018-11-14 Thread Tom de Vries
On 13-11-18 14:42, Tom de Vries wrote:
> diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
> index 3c1bd49dd7b..2fec9bbb4b6 100644
> --- a/libbacktrace/Makefile.am
> +++ b/libbacktrace/Makefile.am
> @@ -96,6 +96,28 @@ btest_LDADD = libbacktrace.la
>  
>  check_PROGRAMS += btest
>  
> +if HAVE_DWZ
> +
> +btest_dwz_SOURCES = btest_dwz.c testlib.c
> +btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0

Hmm, I already discovered that specifying the -O0 doesn't work, since
it's overridden by $(CFLAGS).

With a hack like this:
...
diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
index 2fec9bbb4b6..8bdf13b3546 100644
--- a/libbacktrace/Makefile.am
+++ b/libbacktrace/Makefile.am
@@ -99,11 +99,14 @@ check_PROGRAMS += btest
 if HAVE_DWZ

 btest_dwz_SOURCES = btest_dwz.c testlib.c
-btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0
+btest_dwz_CFLAGS = $(AM_CFLAGS) -g
 btest_dwz_LDADD = libbacktrace.la

 check_PROGRAMS += btest_dwz

+btest_dwz-btest_dwz.o: btest_dwz.c
+   $(AM_V_CC)$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES)
$(AM_CPPFLAGS) $(CPPFLAGS) $(btest_dwz_CFLAGS) $(CFLAGS) -O0 -c -o
btest_dwz-btest_dwz.o `test -f 'btest_dwz.c' || echo
'$(srcdir)/'`btest_dwz.c
+
 TESTS += btest_dwz_2 btest_dwz_3

 btest_dwz_2 btest_dwz_3: btest_dwz_23
...
I managed to get the -O0 to be effective.

Then copying btest.c to btest_dwz.c and replacing "btest.c" with
"btest_dwz.c" gets me passes for all but the inline tests:
...
$ ./btest_dwz_2
PASS: backtrace_full noinline
test2: [0]: got main expected f13
FAIL: backtrace_full inline
PASS: backtrace_simple noinline
test4: [0]: got main expected f33
FAIL: backtrace_simple inline
PASS: backtrace_syminfo variable
...
which is expected, because in the inline case some DW_AT_abstract_origin
attributes use DW_FORM_GNU_ref_alt, which is not supported yet.

Thanks,
- Tom


[PATCH 2/6] ifcvt: Allow constant operands in noce_convert_multiple_sets.

2018-11-14 Thread Robin Dapp
This patch checks whether the current target supports conditional moves
with immediate then/else operands and allows noce_convert_multiple_sets
to deal with constants subsequently.
Also, minor refactoring is performed.
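
A rough standalone model of the operand check, for readers without the
GCC sources at hand.  Everything here is a hypothetical stand-in
(operand_kind, mock_insn, mock_have_const_cmov and the predicates are
not GCC names); a NULL insn plays the role of CODE_FOR_nothing, and, as
in the patch, a missing predicate is treated as "no".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for operand kinds and an insn_data entry.  */
enum operand_kind { OPERAND_REG, OPERAND_CONST };
typedef bool (*predicate_fn) (enum operand_kind);
struct mock_insn { predicate_fn predicate[4]; };

static bool
reg_only (enum operand_kind k)
{
  return k == OPERAND_REG;
}

static bool
reg_or_const (enum operand_kind k)
{
  return k == OPERAND_REG || k == OPERAND_CONST;
}

/* True if both the "then" (op 2) and "else" (op 3) operands of the
   conditional move pattern accept a constant.  */
static bool
mock_have_const_cmov (const struct mock_insn *insn)
{
  if (insn == NULL)  /* models "no movcc pattern for this mode" */
    return false;

  for (int i = 2; i <= 3; i++)
    if (insn->predicate[i] == NULL || !insn->predicate[i] (OPERAND_CONST))
      return false;

  return true;
}

void
const_cmov_demo (void)
{
  const struct mock_insn regs
    = { { reg_only, reg_only, reg_only, reg_only } };
  const struct mock_insn imms
    = { { reg_only, reg_only, reg_or_const, reg_or_const } };
  const struct mock_insn half
    = { { reg_only, reg_only, reg_or_const, reg_only } };

  assert (!mock_have_const_cmov (NULL));
  assert (!mock_have_const_cmov (&regs));
  assert (mock_have_const_cmov (&imms));
  assert (!mock_have_const_cmov (&half));  /* both operands must accept it */
}
```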

--

gcc/ChangeLog:

2018-11-14  Robin Dapp  

* ifcvt.c (have_const_cmov): New function.
(noce_convert_multiple_sets): Allow constants if supported.
(bb_ok_for_noce_convert_multiple_sets): Likewise.
(check_cond_move_block): Refactor.
---
 gcc/ifcvt.c | 46 --
 1 file changed, 36 insertions(+), 10 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index ddf077fa051..660bb46eb1c 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3077,6 +3077,27 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx 
cond,
   return false;
 }
 
+/* Check if we have a movcc pattern that accepts constants as then/else
+   operand (op 2/3).  */
+static bool
+have_const_cmov (machine_mode mode)
+{
+  enum insn_code icode;
+  if ((icode = direct_optab_handler (movcc_optab, mode))
+  != CODE_FOR_nothing)
+{
+  if (insn_data[(int) icode].operand[2].predicate
+ && (insn_data[(int) icode].operand[2].predicate
+   (const1_rtx, insn_data[(int) icode].operand[2].mode)))
+   if (insn_data[(int) icode].operand[3].predicate
+   && (insn_data[(int) icode].operand[3].predicate
+ (const1_rtx, insn_data[(int) icode].operand[3].mode)))
+ return true;
+}
+
+  return false;
+}
+
 /* We have something like:
 
  if (x > y)
@@ -3194,7 +3215,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 we'll end up trying to emit r4:HI = cond ? (r1:SI) : (r3:HI).
 Wrap the two cmove operands into subregs if appropriate to prevent
 that.  */
-  if (GET_MODE (new_val) != GET_MODE (temp))
+
+  /* Check if we can emit a cmove with constant operands.  */
+  bool allow_constants = have_const_cmov (GET_MODE (target));
+
+  if (!(allow_constants && CONST_INT_P (new_val))
+ && GET_MODE (new_val) != GET_MODE (temp))
{
  machine_mode src_mode = GET_MODE (new_val);
  machine_mode dst_mode = GET_MODE (temp);
@@ -3205,7 +3231,8 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
}
  new_val = lowpart_subreg (dst_mode, new_val, src_mode);
}
-  if (GET_MODE (old_val) != GET_MODE (temp))
+  if (!(allow_constants && CONST_INT_P (old_val))
+ && GET_MODE (old_val) != GET_MODE (temp))
{
  machine_mode src_mode = GET_MODE (old_val);
  machine_mode dst_mode = GET_MODE (temp);
@@ -3339,9 +3366,10 @@ bb_ok_for_noce_convert_multiple_sets (basic_block 
test_bb)
   if (!REG_P (dest))
return false;
 
-  if (!(REG_P (src)
-  || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
-  && subreg_lowpart_p (src))))
+  if (!((REG_P (src)
+ || (have_const_cmov (GET_MODE (dest)) && CONST_INT_P (src)))
+   || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
+ && subreg_lowpart_p (src))))
return false;
 
   /* Destination must be appropriate for a conditional write.  */
@@ -3689,7 +3717,7 @@ check_cond_move_block (basic_block bb,
 {
   rtx set, dest, src;
 
-  if (!NONDEBUG_INSN_P (insn) || JUMP_P (insn))
+  if (!active_insn_p (insn))
continue;
   set = single_set (insn);
   if (!set)
@@ -3705,10 +3733,8 @@ check_cond_move_block (basic_block bb,
   if (!CONSTANT_P (src) && !register_operand (src, VOIDmode))
return FALSE;
 
-  if (side_effects_p (src) || side_effects_p (dest))
-   return FALSE;
-
-  if (may_trap_p (src) || may_trap_p (dest))
+  /* Check for side effects and trapping.  */
+  if (!noce_operand_ok (src) || !noce_operand_ok (dest))
return FALSE;
 
   /* Don't try to handle this if the source register was
-- 
2.17.0



[PATCH 0/6] If conversion with multiple sets.

2018-11-14 Thread Robin Dapp
Hi,

The following patch set was created in an attempt to allow multiple sets to be
if-converted.  I was not able to make it work out of the box, since I found the
cost estimate for the newly created sequence to always be much higher than
that of the original sequence.
This is due to noce_convert_multiple_sets creating temporaries that will only
get optimized away (if at all) after the cost estimation.  Therefore, I decided
to expose the number of created conditional moves to the backend in the hope
that all temporaries get eliminated eventually.  The backend may still use the
cost estimation, but currently the original_cost is not even set up properly
when noce_conversion_profitable_p is called.

The series also allows noce_convert_multiple_sets to use immediate operands
without moving them into a register.  Moreover, it tries to create temporaries
only when needed, so that cost estimation may become easier in the future.

Regards
 Robin

--

  ifcvt: Store the number of created cmovs.
  ifcvt: Allow constant operands in noce_convert_multiple_sets.
  ifcvt: Use enum instead of transform_name string.
  S/390: Implement noce_conversion_profitable_p.
  ifcvt: Only create temporaries as needed.
  S/390: Add test for noce_convert_multiple_sets.

 gcc/config/s390/s390.c|  17 ++
 gcc/ifcvt.c   | 148 ++
 gcc/ifcvt.h   |  71 -
 .../gcc.target/s390/ifcvt-two-insns-int.c |  26 +++
 4 files changed, 226 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c

-- 
2.17.0



[PATCH] Fix PR88021

2018-11-14 Thread Richard Biener


This reportedly fixes PR88021 - I forgot to change some ints to
lambda_ints when widening the representation of lambda_vectors.
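
A small standalone sketch of the kind of truncation this is about.  It
is not the tree-data-ref.c code: lambda_int is assumed here to be a
64-bit signed type and int to be 32 bits wide, and the function names
(row_add_narrow, row_add_wide, lambda_int_demo) are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: lambda_int is a 64-bit signed integer.  */
typedef int64_t lambda_int;

/* R2 = R2 + CONST1 * R1, with the multiplier squeezed through an int.  */
static void
row_add_narrow (const lambda_int *r1, lambda_int *r2, int n, int const1)
{
  for (int i = 0; i < n; i++)
    r2[i] += const1 * r1[i];
}

/* Same operation with the multiplier kept at full width.  */
static void
row_add_wide (const lambda_int *r1, lambda_int *r2, int n, lambda_int const1)
{
  for (int i = 0; i < n; i++)
    r2[i] += const1 * r1[i];
}

void
lambda_int_demo (void)
{
  const lambda_int factor = (lambda_int) 1 << 32;  /* too big for 32-bit int */
  const lambda_int r1[2] = { 1, 2 };
  lambda_int wide[2] = { 10, 20 };
  lambda_int narrow[2] = { 10, 20 };

  row_add_wide (r1, wide, 2, factor);
  /* The conversion to int is implementation-defined; whatever value it
     yields, it cannot equal FACTOR when int is 32 bits wide.  */
  row_add_narrow (r1, narrow, 2, (int) factor);

  assert (wide[0] == 10 + factor);
  assert (wide[1] == 20 + 2 * factor);
  assert (narrow[0] != wide[0]);
}
```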

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-11-14  Richard Biener  

PR middle-end/88021
* tree-data-ref.c (lambda_matrix_row_add): Change const1 argument
to lambda_int.
(lambda_vector_mult_const): Likewise.
(lambda_matrix_right_hermite): Use lambda_int temporaries.

diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index 0096afb9ba7..5b554b02b4a 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -3442,8 +3483,9 @@ lambda_matrix_id (lambda_matrix mat, int size)
   mat[i][j] = (i == j) ? 1 : 0;
 }
 
-/* Return the first nonzero element of vector VEC1 between START and N.
-   We must have START <= N.   Returns N if VEC1 is the zero vector.  */
+/* Return the index of the first nonzero element of vector VEC1 between
+   START and N.  We must have START <= N.
+   Returns N if VEC1 is the zero vector.  */
 
 static int
 lambda_vector_first_nz (lambda_vector vec1, int n, int start)
@@ -3458,7 +3500,8 @@ lambda_vector_first_nz (lambda_vector vec1, int n, int 
start)
R2 = R2 + CONST1 * R1.  */
 
 static void
-lambda_matrix_row_add (lambda_matrix mat, int n, int r1, int r2, int const1)
+lambda_matrix_row_add (lambda_matrix mat, int n, int r1, int r2,
+  lambda_int const1)
 {
   int i;
 
@@ -3474,7 +3517,7 @@ lambda_matrix_row_add (lambda_matrix mat, int n, int r1, 
int r2, int const1)
 
 static void
 lambda_vector_mult_const (lambda_vector vec1, lambda_vector vec2,
- int size, int const1)
+ int size, lambda_int const1)
 {
   int i;
 
@@ -3539,7 +3582,7 @@ lambda_matrix_right_hermite (lambda_matrix A, int m, int 
n,
{
  while (S[i][j] != 0)
{
- int sigma, factor, a, b;
+ lambda_int sigma, factor, a, b;
 
  a = S[i-1][j];
  b = S[i][j];


Re: [PATCH] S/390: Fix expectation in mrecord-mcount test for 31-bit mode

2018-11-14 Thread Andreas Krebbel
On 14.11.18 13:55, Ilya Leoshkevich wrote:
> The emitted address is .long, not .quad, in that case.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-11-14  Ilya Leoshkevich  
> 
>   * gcc.target/s390/mrecord-mcount.c (profileme): Expect .long in
>   31-bit mode.

Ok. Thanks!

Andreas



Re: [PATCH] S/390: Disable 3 global-array-* tests for 31-bit mode

2018-11-14 Thread Andreas Krebbel
On 14.11.18 13:50, Ilya Leoshkevich wrote:
> These tests rely on larl->movdi merge, which is not implemented for
> 31-bit mode.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-11-14  Ilya Leoshkevich  
> 
>   * gcc.target/s390/global-array-almost-huge-element.c: Run only
>   in 64-bit mode.
>   * gcc.target/s390/global-array-almost-negative-huge-element.c:
>   Likewise.
>   * gcc.target/s390/global-array-even-element.c: Likewise.

Ok. Thanks!

Andreas



Re: [PATCH v3] Add sinh(atanh(x)) and cosh(atanh(x)) optimizations

2018-11-14 Thread Giuliano Augusto Faulin Belinassi
On Wed, Nov 14, 2018 at 7:41 AM Richard Biener
 wrote:
>
> On Tue, Nov 13, 2018 at 10:25 PM Giuliano Augusto Faulin Belinassi
>  wrote:
> >
> > Only do the optimization if flag_signed_zeros &&
> > !flag_finite_math_only is set, as suggested in the previous iteration.
> >
> > Before, the patch did the optimization even when -fno-signed-zeros and
> > -ffinite-math-only were set. This could generate badly incorrect
> > results for targets that do not support infinities or signed zeros.
>
> How's the result wrong if there are no signed zeros?  Note that -ffast-math
> enables -fno-signed-zeros for example.  So both of your checks look
> backwards.

Indeed. After plotting the graph of both functions, it is very clear
that this check isn't required. Sorry about that.

> Also the support for signed zeros and infs/nans should be guarded with
>
> !HONOR_SIGNED_ZEROS (type) && !HONOR_NANS (type) && !HONOR_INFINITIES (type)
>
> which then means there's no difference between -0. and 0. and there are no
> NaNs or Infs in the inputs and output NaNs or Infs need not be produced.

There can be NaNs and Infinities. For NaNs, take any input that is
outside the interval [-1, 1].
For Infinities, take x = -1 or x = 1. I think these must be 'honored'
to ensure compatibility with the original expression.
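
For reference, the closed forms behind these two cases can be derived as
follows. This is only the textbook identity; the exact expression the
patch emits in match.pd may be written differently.

```latex
\begin{align*}
y &= \operatorname{atanh}(x), \qquad \tanh y = x, \qquad |x| < 1 \\
\cosh^2 y - \sinh^2 y = 1
  &\;\Longrightarrow\; 1 - \tanh^2 y = \frac{1}{\cosh^2 y} \\
\cosh(\operatorname{atanh}(x)) &= \frac{1}{\sqrt{1 - x^2}}
  \qquad (\cosh y > 0) \\
\sinh(\operatorname{atanh}(x)) &= \tanh y \cdot \cosh y
  = \frac{x}{\sqrt{1 - x^2}}
\end{align*}
```

As x approaches -1 or 1 the denominator goes to 0, so cosh(atanh(x))
grows to +Inf and sinh(atanh(x)) to -Inf or +Inf respectively. For
|x| > 1 the term 1 - x^2 is negative, so neither side has a real value,
which is the NaN case.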

So I must check for
!HONOR_SIGNED_ZEROS (type) && HONOR_NANS (type) && HONOR_INFINITIES (type)
Is that correct? Also, is it safe to remove the !finite_math_only check with
this, now that the type is stated to support infinities and NaNs?

However, I am not sure if it is OK to remove unsafe-math-optimizations
even if it enables finite_math_only, because of the 2 ULP error. As stated
in the first iteration, the user may be using a very precise math library
that yields 0 ULP.

> Richard.
>
> > I also updated the tests with the proper flags.
> >
> > gcc/ChangeLog
> > 2018-11-13  Giuliano Belinassi  
> >
> > * match.pd (sinh (atanh (x))): New simplification rules.
> > (cosh (atanh (x))): Likewise.
> >
> > gcc/testsuite/ChangeLog
> > 2018-11-13  Giuliano Belinassi  
> >
> > * gcc.dg/sinhatanh-1.c: New test.
> > * gcc.dg/sinhatanh-2.c: New test.
> >
> > There are no tests in trunk that seem to be breaking because of this patch.


[PATCH] S/390: Fix expectation in mrecord-mcount test for 31-bit mode

2018-11-14 Thread Ilya Leoshkevich
The emitted address is .long, not .quad, in that case.

gcc/testsuite/ChangeLog:

2018-11-14  Ilya Leoshkevich  

* gcc.target/s390/mrecord-mcount.c (profileme): Expect .long in
31-bit mode.
---
 gcc/testsuite/gcc.target/s390/mrecord-mcount.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/s390/mrecord-mcount.c 
b/gcc/testsuite/gcc.target/s390/mrecord-mcount.c
index d8a23ffdca4..54ced9f7a11 100644
--- a/gcc/testsuite/gcc.target/s390/mrecord-mcount.c
+++ b/gcc/testsuite/gcc.target/s390/mrecord-mcount.c
@@ -5,6 +5,7 @@ void
 profileme (void)
 {
   /* { dg-final { scan-assembler ".section __mcount_loc, \"a\",@progbits" } } */
-  /* { dg-final { scan-assembler ".quad 1b" } } */
+  /* { dg-final { scan-assembler ".long 1b" { target { ! lp64 } } } } */
+  /* { dg-final { scan-assembler ".quad 1b" { target { lp64 } } } } */
   /* { dg-final { scan-assembler ".previous" } } */
 }
-- 
2.19.1


