calculate overflow type in wide int arithmetic

2018-07-05 Thread Aldy Hernandez
The reason for this patch is the set of changes showcased in tree-vrp.c. 
Basically I'd like to discourage rolling our own overflow and underflow 
calculation when doing wide int arithmetic.  We should have a 
centralized place for this, that is-- in the wide int code itself ;-).


The only cases I care about are plus/minus, which I have implemented, 
but we also get division for free, since AFAICT, division can only 
overflow positively:


-MIN / -1 => +OVERFLOW

Multiplication OTOH, can underflow, but I've not implemented it because 
we have no uses for it.  I have added a note in the code explaining this.
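
As a rough illustration of the tri-state result described above, here is a sketch in plain C on `int`. It is not the wide-int implementation; the helper names `add_dir`, `sub_dir` and `div_dir` and the -1/0/+1 encoding are assumptions made only for this example.

```c
#include <limits.h>

/* Hypothetical encoding used in this sketch:
   -1 = underflow, 0 = no overflow, +1 = overflow.  */

/* Direction in which a + b would leave the range of int.  */
static int
add_dir (int a, int b)
{
  if (b > 0 && a > INT_MAX - b)
    return 1;
  if (b < 0 && a < INT_MIN - b)
    return -1;
  return 0;
}

/* Direction in which a - b would leave the range of int.  */
static int
sub_dir (int a, int b)
{
  if (b < 0 && a > INT_MAX + b)
    return 1;
  if (b > 0 && a < INT_MIN + b)
    return -1;
  return 0;
}

/* Truncating division (caller guarantees b != 0): the only
   out-of-range case is INT_MIN / -1, which overflows positively.  */
static int
div_dir (int a, int b)
{
  return (a == INT_MIN && b == -1) ? 1 : 0;
}
```

A caller that only wants a yes/no answer can still test the result for non-zero, which is why the existing `if (overflow)` checks keep working after the type change.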


Originally I tried to only change plus/minus, but that made code that 
dealt with plus/minus in addition to div or mult a lot uglier.  You'd 
have to special case "int overflow_for_add_stuff" and "bool 
overflow_for_everything_else".  Changing everything to int makes things 
consistent.


Note: I have left poly-int as is, with its concept of yes/no for 
overflow.  I can adapt this as well if desired.


Tested on x86-64 Linux.

OK for trunk?
gcc/

	* tree-vrp.c (vrp_int_const_binop): Change overflow type to int.
	(combine_bound): Use wide-int overflow calculation instead of
	rolling our own.
	* calls.c (maybe_warn_alloc_args_overflow): Change overflow type to
	int.
	* fold-const.c (int_const_binop_2): Same.
	(extract_muldiv_1): Same.
	(fold_div_compare): Same.
	(fold_abs_const): Same.
	* match.pd: Same.
	* poly-int.h (add): Same.
	(sub): Same.
	(neg): Same.
	(mul): Same.
	* predict.c (predict_iv_comparison): Same.
	* profile-count.c (slow_safe_scale_64bit): Same.
	* simplify-rtx.c (simplify_const_binary_operation): Same.
	* tree-chrec.c (tree_fold_binomial): Same.
	* tree-data-ref.c (split_constant_offset_1): Same.
	* tree-if-conv.c (idx_within_array_bound): Same.
	* tree-scalar-evolution.c (iv_can_overflow_p): Same.
	* tree-ssa-phiopt.c (minmax_replacement): Same.
	* tree-vect-loop.c (is_nonwrapping_integer_induction): Same.
	* tree-vect-stmts.c (vect_truncate_gather_scatter_offset): Same.
	* vr-values.c (vr_values::adjust_range_with_scev): Same.
	* wide-int.cc (wi::add_large): Same.
	(wi::mul_internal): Same.
	(wi::sub_large): Same.
	(wi::divmod_internal): Same.
	* wide-int.h: Change overflow type to int for neg, add, mul, smul,
	umul, div_trunc, div_floor, div_ceil, div_round, mod_trunc,
	mod_ceil, mod_round, add_large, sub_large, mul_internal,
	divmod_internal.

gcc/cp/

	* decl.c (build_enumerator): Change overflow type to int.
	* init.c (build_new_1): Same.

diff --git a/gcc/calls.c b/gcc/calls.c
index 1970f1c51dd..14c34cca883 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1517,7 +1517,7 @@ maybe_warn_alloc_args_overflow (tree fn, tree exp, tree args[2], int idx[2])
   wide_int x = wi::to_wide (argrange[0][0], szprec);
   wide_int y = wi::to_wide (argrange[1][0], szprec);
 
-  bool vflow;
+  int vflow;
   wide_int prod = wi::umul (x, y, &vflow);
 
   if (vflow)
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 0ea3c4a3490..dccca1502b3 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -14628,7 +14628,6 @@ build_enumerator (tree name, tree value, tree enumtype, tree attributes,
 	  if (TYPE_VALUES (enumtype))
 	{
 	  tree prev_value;
-	  bool overflowed;
 
 	  /* C++03 7.2/4: If no initializer is specified for the first
 		 enumerator, the type is an unspecified integral
@@ -14642,6 +14641,7 @@ build_enumerator (tree name, tree value, tree enumtype, tree attributes,
 		value = error_mark_node;
 	  else
 		{
+		  int overflowed;
 		  tree type = TREE_TYPE (prev_value);
 		  signop sgn = TYPE_SIGN (type);
 		  widest_int wi = wi::add (wi::to_widest (prev_value), 1, sgn,
@@ -14668,7 +14668,7 @@ incremented enumerator value is too large for %") : G_("\
 incremented enumerator value is too large for %"));
 			}
 		  if (type == NULL_TREE)
-			overflowed = true;
+			overflowed = 1;
 		  else
 			value = wide_int_to_tree (type, wi);
 		}
diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 76ce0b829dd..85df1a2efb9 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -2943,7 +2943,7 @@ build_new_1 (vec **placement, tree type, tree nelts,
   tree inner_nelts_cst = maybe_constant_value (inner_nelts);
   if (TREE_CODE (inner_nelts_cst) == INTEGER_CST)
 	{
-	  bool overflow;
+	  int overflow;
 	  offset_int result = wi::mul (wi::to_offset (inner_nelts_cst),
    inner_nelts_count, SIGNED, &overflow);
 	  if (overflow)
@@ -3072,7 +3072,7 @@ build_new_1 (vec **placement, tree type, tree nelts,
 	 maximum object size and is safe even if we choose not to use
 	 a cookie after all.  */
   max_size -= wi::to_offset (cookie_size);
-  bool overflow;
+  int overflow;
   inner_size = wi::mul (wi::to_offset (size), inner_nelts_count, SIGNED,
 			&overflow);
   if (overflow || wi::gtu_p (inner_size, max_size))
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 8476c223e4f..5cfd5edd77d 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-c

[Patch, fortran] PR66679 - [OOP] ICE with class(*) and transfer

2018-07-05 Thread Paul Richard Thomas
The comment in the patch says it all.

Bootstrapped and regtested on FC28/x86_64 - OK for trunk?

Paul

2018-07-05  Paul Thomas  

PR fortran/66679
* trans-intrinsic.c (gfc_conv_intrinsic_transfer): Class array
elements are returned as references to the data element. Get
the class expression by stripping back the references. Use this
for the element size.

2018-07-05  Paul Thomas  

PR fortran/66679
* gfortran.dg/transfer_class_3.f90: New test.
Index: gcc/fortran/trans-intrinsic.c
===
*** gcc/fortran/trans-intrinsic.c	(revision 262299)
--- gcc/fortran/trans-intrinsic.c	(working copy)
*** gfc_conv_intrinsic_transfer (gfc_se * se
*** 7346,7358 
tree upper;
tree lower;
tree stmt;
gfc_actual_arglist *arg;
gfc_se argse;
gfc_array_info *info;
stmtblock_t block;
int n;
bool scalar_mold;
!   gfc_expr *source_expr, *mold_expr;
  
info = NULL;
if (se->loop)
--- 7346,7359 
tree upper;
tree lower;
tree stmt;
+   tree class_ref = NULL_TREE;
gfc_actual_arglist *arg;
gfc_se argse;
gfc_array_info *info;
stmtblock_t block;
int n;
bool scalar_mold;
!   gfc_expr *source_expr, *mold_expr, *class_expr;
  
info = NULL;
if (se->loop)
*** gfc_conv_intrinsic_transfer (gfc_se * se
*** 7383,7389 
  {
gfc_conv_expr_reference (&argse, arg->expr);
if (arg->expr->ts.type == BT_CLASS)
! 	source = gfc_class_data_get (argse.expr);
else
  	source = argse.expr;
  
--- 7384,7407 
  {
gfc_conv_expr_reference (&argse, arg->expr);
if (arg->expr->ts.type == BT_CLASS)
! 	{
! 	  tmp = build_fold_indirect_ref_loc (input_location, argse.expr);
! 	  if (GFC_CLASS_TYPE_P (TREE_TYPE (tmp)))
! 	source = gfc_class_data_get (tmp);
! 	  else
! 	{
! 	  /* Array elements are evaluated as a reference to the data.
! 		 To obtain the vptr for the element size, the argument
! 		 expression must be stripped to the class reference and
! 		 re-evaluated. The pre and post blocks are not needed.  */
! 	  gcc_assert (arg->expr->expr_type == EXPR_VARIABLE);
! 	  source = argse.expr;
! 	  class_expr = gfc_find_and_cut_at_last_class_ref (arg->expr);
! 	  gfc_init_se (&argse, NULL);
! 	  gfc_conv_expr (&argse, class_expr);
! 	  class_ref = argse.expr;
! 	}
! 	}
else
  	source = argse.expr;
  
*** gfc_conv_intrinsic_transfer (gfc_se * se
*** 7395,7400 
--- 7413,7421 
  	 argse.string_length);
  	  break;
  	case BT_CLASS:
+ 	  if (class_ref != NULL_TREE)
+ 	tmp = gfc_class_vtab_size_get (class_ref);
+ 	  else
  	tmp = gfc_class_vtab_size_get (argse.expr);
  	  break;
  	default:
Index: gcc/testsuite/gfortran.dg/transfer_class_3.f90
===
*** gcc/testsuite/gfortran.dg/transfer_class_3.f90	(nonexistent)
--- gcc/testsuite/gfortran.dg/transfer_class_3.f90	(working copy)
***
*** 0 
--- 1,18 
+ ! { dg-do run }
+ !
+ ! Test the fix for PR66679.
+ !
+ ! Contributed by Miha Polajnar  
+ !
+ program main
+   implicit none
+   class(*), allocatable :: vec(:)
+   integer :: var, ans(2)
+   allocate(vec(2),source=[1_4, 2_4])
+ 
+ ! This worked correctly.
+   if (any (transfer(vec,[var],2) .ne. [1_4, 2_4])) stop 1
+ 
+ ! This caused an ICE.
+   if (any ([transfer(vec(1),[var]), transfer(vec(2),[var])] .ne. [1_4, 2_4])) stop 2
+ end program main


Re: [PATCH] use TYPE_SIZE instead of TYPE_DOMAIN to compute array size (PR 86400)

2018-07-05 Thread Richard Biener
On Thu, Jul 5, 2018 at 4:17 AM Martin Sebor  wrote:
>
> A change of mine to the strlen pass assumes that the strlen
> argument points to an object of the correct type and does
> not correctly handle GIMPLE where the argument has the wrong
> type such as in:
>
>extern char a[1][2];
>n = strlen (*a);
>
> where the strlen pass actually sees
>
>n = strlen (a);
>
> The attached patch corrects the code to use TYPE_SIZE to
> determine the size of the array argument rather than using
> TYPE_DOMAIN.
>
> Tested on x86_64-linux.

OK.

Richard.

> Martin
>
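
To make the distinction in the quoted description concrete, here is a small C sketch. It is not the strlen pass code, and the helper names are made up for illustration. For `char a[1][2]`, the outer array has one element but occupies two bytes, so a bound taken from the element count differs from one taken from the object size.

```c
#include <stddef.h>

static char a[1][2];

/* Bound derived from the number of elements of the outer array
   (loosely analogous to using the array's index domain): 1 - 1.  */
static size_t
bound_from_element_count (void)
{
  return sizeof a / sizeof a[0] - 1;
}

/* Bound derived from the size of the whole object in bytes
   (loosely analogous to using the type's size): 2 - 1.  */
static size_t
bound_from_byte_size (void)
{
  return sizeof a - 1;
}
```

When the argument is viewed as a plain sequence of characters, only the byte-size bound matches the storage that is actually there.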


Re: [PATCH] Add experimental::sample and experimental::shuffle from N4531

2018-07-05 Thread Christophe Lyon
On Wed, 4 Jul 2018 at 18:56, Jonathan Wakely  wrote:
>
> On 29/06/18 10:45 +0100, Jonathan Wakely wrote:
> >On 29/06/18 09:39 +0200, Christophe Lyon wrote:
> >>On Fri, 29 Jun 2018 at 09:21, Jonathan Wakely  wrote:
> >>>
> >>>On 29/06/18 08:55 +0200, Christophe Lyon wrote:
> On Mon, 25 Jun 2018 at 18:23, Jonathan Wakely  wrote:
> >
> > The additions to  were added in 2015 but the new
> > algorithms in  were not. This adds them.
> >
> > * include/experimental/algorithm (sample, shuffle): Add new 
> > overloads
> > using per-thread random number engine.
> > > * testsuite/experimental/algorithm/sample.cc: Simplify and reduce
> > dependencies by using __gnu_test::test_container.
> > * testsuite/experimental/algorithm/sample-2.cc: New.
> > * testsuite/experimental/algorithm/shuffle.cc: New.
> >
> > Tested x86_64-linux, committed to trunk.
> >
> > This would be safe to backport, but nobody has noticed the algos are
> > missing or complained, so it doesn't seem very important to backport.
> >
> >
> 
> Hi,
> 
> On bare-metal targets (aarch64 and arm + newlib), I've noticed that
> the two new tests fail:
> PASS: experimental/algorithm/shuffle.cc (test for excess errors)
> spawn 
> /aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-eabi/gcc3/utils/bin/qemu-wrapper.sh
> ./shuffle.exe
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  random_device::random_device(const std::string&)
> 
> *** EXIT code 4242
> FAIL: experimental/algorithm/shuffle.cc execution test
> 
> PASS: experimental/algorithm/sample-2.cc (test for excess errors)
> spawn 
> /aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-eabi/gcc3/utils/bin/qemu-wrapper.sh
> ./sample-2.exe
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  random_device::random_device(const std::string&)
> 
> *** EXIT code 4242
> FAIL: experimental/algorithm/sample-2.cc execution test
> 
> Does this ring a bell?
> >>>
> >>>Does the existing testsuite/experimental/random/randint.cc file fail
> >>>in the same way?
> >>>
> >>
> >>Yes it does.
> >>
> >>And so do:
> >>25_algorithms/make_heap/complexity.cc
> >
> >This one also uses std::random_device.
> >
> >>23_containers/array/element_access/at_neg.cc
> >
> >Hmm,
> >
> > // Expected behavior is to either throw and have the uncaught
> > // exception end up in a terminate handler which eventually exits,
> > // or abort. (Depending on -fno-exceptions.)
> >
> >So this is expected to XFAIL.
> >
> >>26_numerics/random/random_device/cons/default.cc
> >
> >We should XFAIL the ones that use std::random_device, if we can
> >identify an effective target to describe them.
>
> This adds a new "random_device" effective-target, so the tests are
> disabled when the random_device isn't usable.
>
> Tested powerpc64le-linux, committed to trunk. If this works for
> Christophe's bare metal targets I'll backport it to gcc-8-branch too.
>
Yes, that works for me: the tests are now UNSUPPORTED on aarch64*-elf
and arm-none-eabi.

Thanks!


[testsuite] Simplify dg-final

2018-07-05 Thread Tom de Vries
[ was: [PATCH, testsuite/guality] Use line number vars in gdb-test ]

On Wed, Jul 04, 2018 at 08:32:49PM +0100, Richard Sandiford wrote:
> Tom de Vries  writes:
> > +proc dg-final { args } {
> > +upvar dg-final-code final-code
> > +
> > +if { [llength $args] > 2 } {
> > +   error "[lindex $args 0]: too many arguments"
> > +}
> > +set line [lindex $args 0]
> > +set code [lindex $args 1]
> > +set directive [lindex $code 0]
> > +set withline \
> > +   [switch $directive {
> > +   gdb-test {expr {1}}
> > +   default  {expr {0}}
> > +   }]
> > +if { $withline == 1 } {
> > +   set code [linsert $code 1 $line]
> > +}
> > +append final-code "$code\n"
> > +}
> 
> Like the idea, but I think:
> 
> set withline \
>   [switch $directive {
>   gdb-test {expr {1}}
>   default  {expr {0}}
>   }]
> if { $withline == 1 } {
>   set code [linsert $code 1 $line]
> }
> 
> would be clearer as:
> 
> switch $directive {
>   gdb-test {
>   set code [linsert $code 1 $line]
>   }
> }

Agreed, thanks for the comment.  Committed as below.

Thanks,
- Tom

[testsuite] Simplify dg-final

2018-07-05  Tom de Vries  

* lib/gcc-dg.exp (dg-final): Simplify tcl code.

---
 gcc/testsuite/lib/gcc-dg.exp | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 9e0b3f4ef95..f5e6bef5dd9 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -123,13 +123,10 @@ proc dg-final { args } {
 set line [lindex $args 0]
 set code [lindex $args 1]
 set directive [lindex $code 0]
-set withline \
-   [switch $directive {
-   gdb-test {expr {1}}
-   default  {expr {0}}
-   }]
-if { $withline == 1 } {
-   set code [linsert $code 1 $line]
+switch $directive {
+   gdb-test {
+   set code [linsert $code 1 $line]
+   }
 }
 append final-code "$code\n"
 }


Re: [PATCH] -fopt-info: add indentation via DUMP_VECT_SCOPE

2018-07-05 Thread Christophe Lyon
On Tue, 3 Jul 2018 at 15:53, Richard Biener  wrote:
>
> On Tue, Jul 3, 2018 at 3:52 PM David Malcolm  wrote:
> >
> > On Tue, 2018-07-03 at 09:37 +0200, Richard Biener wrote:
> > > On Mon, Jul 2, 2018 at 7:00 PM David Malcolm 
> > > wrote:
> > > >
> > > > On Mon, 2018-07-02 at 14:23 +0200, Christophe Lyon wrote:
> > > > > > On Fri, 29 Jun 2018 at 10:09, Richard Biener wrote:
> > > > > > >
> > > > > > > On Tue, Jun 26, 2018 at 5:43 PM David Malcolm wrote:
> > > > > > >
> > > > > > > This patch adds a concept of nested "scopes" to dumpfile.c's
> > > > > > > dump_*_loc
> > > > > > > calls, and wires it up to the DUMP_VECT_SCOPE macro in tree-
> > > > > > > vectorizer.h,
> > > > > > > so that the nested structure is shown in -fopt-info by
> > > > > > > indentation.
> > > > > > >
> > > > > > > For example, this converts -fopt-info-all e.g. from:
> > > > > > >
> > > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > > test.c:8:3: note: === analyze_loop_nest ===
> > > > > > > test.c:8:3: note: === vect_analyze_loop_form ===
> > > > > > > test.c:8:3: note: === get_loop_niters ===
> > > > > > > test.c:8:3: note: symbolic number of iterations is (unsigned
> > > > > > > int)
> > > > > > > n_9(D)
> > > > > > > test.c:8:3: note: not vectorized: loop contains function
> > > > > > > calls or
> > > > > > > data references that cannot be analyzed
> > > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > > >
> > > > > > > to:
> > > > > > >
> > > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > > test.c:8:3: note:  === analyze_loop_nest ===
> > > > > > > test.c:8:3: note:   === vect_analyze_loop_form ===
> > > > > > > > test.c:8:3: note:    === get_loop_niters ===
> > > > > > > test.c:8:3: note:   symbolic number of iterations is
> > > > > > > (unsigned
> > > > > > > int) n_9(D)
> > > > > > > test.c:8:3: note:   not vectorized: loop contains function
> > > > > > > calls
> > > > > > > or data references that cannot be analyzed
> > > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > > >
> > > > > > > showing that the "symbolic number of iterations" message is
> > > > > > > within
> > > > > > > the "=== analyze_loop_nest ===" (and not within the
> > > > > > > "=== vect_analyze_loop_form ===").
> > > > > > >
> > > > > > > This is also enabling work for followups involving
> > > > > > > optimization
> > > > > > > records
> > > > > > > (allowing the records to directly capture the nested
> > > > > > > structure of
> > > > > > > the
> > > > > > > dump messages).
> > > > > > >
> > > > > > > Successfully bootstrapped & regrtested on x86_64-pc-linux-
> > > > > > > gnu.
> > > > > > >
> > > > > > > OK for trunk?
> > > > >
> > > > > Hi,
> > > > >
> > > > > I've noticed that this patch (r262246) caused regressions on
> > > > > aarch64:
> > > > > gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built SLP
> > > > > cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-tree-
> > > > > dump
> > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > gcc.dg/vect/slp-perm-8.c scan-tree-dump vect "note: Built SLP
> > > > > cancelled: can use load/store-lanes"
> > > > >
> > > > > > The problem is that there are now more spaces between "note:" and
> > > > > > "Built"; the attached small patch accounts for that in slp-perm-1.c.
> > > >
> > 

[PATCH] Move ((A & N) + B) & M -> (A + B) & M etc. optimization from fold-const.c to match.pd (PR tree-optimization/86401)

2018-07-05 Thread Jakub Jelinek
Hi!

I encountered this while testing the rotate patterns discussed with
Jonathan for std::__rot{l,r}: in the rotate-9.c test, without this
patch, f1 is optimized into a rotate but f2 is not.

Fixed by moving the optimization from fold-const.c to match.pd (well,
leaving a helper in fold-const.c to avoid too much duplication).
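
For readers unfamiliar with the transformation, the following C sketch shows the simplest form of the identity on 32-bit unsigned values. It is not the match.pd pattern or the fold-const.c helper; the function names are invented for this example. The identity holds because the low k bits of a sum depend only on the low k bits of the operands.

```c
#include <stdint.h>

/* Returns 1 if M has the form (1 << k) - 1 and is not all-ones,
   mirroring the shape of the check in the code being moved.  */
static int
is_low_bit_mask (uint32_t m)
{
  return m != UINT32_MAX && (m & (m + 1)) == 0;
}

/* Unsimplified form: ((A & N) + B) & M.  */
static uint32_t
masked_sum_original (uint32_t a, uint32_t b, uint32_t n, uint32_t m)
{
  return ((a & n) + b) & m;
}

/* Simplified form, valid when M is a low-bit mask and (N & M) == M:
   (A + B) & M.  */
static uint32_t
masked_sum_simplified (uint32_t a, uint32_t b, uint32_t m)
{
  return (a + b) & m;
}
```

Whenever `is_low_bit_mask (m)` holds and `(n & m) == m`, the two functions return the same value for any `a` and `b`.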

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-07-05  Jakub Jelinek  

PR tree-optimization/86401
* fold-const.c (fold_binary_loc) : Move the
((A & N) + B) & M -> (A + B) & M etc. optimization into ...
(fold_bit_and_mask): ... here.  New helper function for match.pd.
* fold-const.h (fold_bit_and_mask): Declare.
* match.pd (((A & N) + B) & M -> (A + B) & M): New optimization.

* gcc.dg/tree-ssa/pr86401-1.c: New test.
* gcc.dg/tree-ssa/pr86401-2.c: New test.
* c-c++-common/rotate-9.c: New test.

--- gcc/fold-const.c.jj 2018-06-26 09:05:23.196346433 +0200
+++ gcc/fold-const.c 2018-07-04 12:47:59.139981801 +0200
@@ -10236,121 +10236,6 @@ fold_binary_loc (location_t loc, enum tr
}
}
 
-  /* For constants M and N, if M == (1LL << cst) - 1 && (N & M) == M,
-((A & N) + B) & M -> (A + B) & M
-Similarly if (N & M) == 0,
-((A | N) + B) & M -> (A + B) & M
-and for - instead of + (or unary - instead of +)
-and/or ^ instead of |.
-If B is constant and (B & M) == 0, fold into A & M.  */
-  if (TREE_CODE (arg1) == INTEGER_CST)
-   {
- wi::tree_to_wide_ref cst1 = wi::to_wide (arg1);
- if ((~cst1 != 0) && (cst1 & (cst1 + 1)) == 0
- && INTEGRAL_TYPE_P (TREE_TYPE (arg0))
- && (TREE_CODE (arg0) == PLUS_EXPR
- || TREE_CODE (arg0) == MINUS_EXPR
- || TREE_CODE (arg0) == NEGATE_EXPR)
- && (TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg0))
- || TREE_CODE (TREE_TYPE (arg0)) == INTEGER_TYPE))
-   {
- tree pmop[2];
- int which = 0;
- wide_int cst0;
-
- /* Now we know that arg0 is (C + D) or (C - D) or
--C and arg1 (M) is == (1LL << cst) - 1.
-Store C into PMOP[0] and D into PMOP[1].  */
- pmop[0] = TREE_OPERAND (arg0, 0);
- pmop[1] = NULL;
- if (TREE_CODE (arg0) != NEGATE_EXPR)
-   {
- pmop[1] = TREE_OPERAND (arg0, 1);
- which = 1;
-   }
-
- if ((wi::max_value (TREE_TYPE (arg0)) & cst1) != cst1)
-   which = -1;
-
- for (; which >= 0; which--)
-   switch (TREE_CODE (pmop[which]))
- {
- case BIT_AND_EXPR:
- case BIT_IOR_EXPR:
- case BIT_XOR_EXPR:
-   if (TREE_CODE (TREE_OPERAND (pmop[which], 1))
-   != INTEGER_CST)
- break;
-   cst0 = wi::to_wide (TREE_OPERAND (pmop[which], 1)) & cst1;
-   if (TREE_CODE (pmop[which]) == BIT_AND_EXPR)
- {
-   if (cst0 != cst1)
- break;
- }
-   else if (cst0 != 0)
- break;
-   /* If C or D is of the form (A & N) where
-  (N & M) == M, or of the form (A | N) or
-  (A ^ N) where (N & M) == 0, replace it with A.  */
-   pmop[which] = TREE_OPERAND (pmop[which], 0);
-   break;
- case INTEGER_CST:
-   /* If C or D is a N where (N & M) == 0, it can be
-  omitted (assumed 0).  */
-   if ((TREE_CODE (arg0) == PLUS_EXPR
-|| (TREE_CODE (arg0) == MINUS_EXPR && which == 0))
-   && (cst1 & wi::to_wide (pmop[which])) == 0)
- pmop[which] = NULL;
-   break;
- default:
-   break;
- }
-
- /* Only build anything new if we optimized one or both arguments
-above.  */
- if (pmop[0] != TREE_OPERAND (arg0, 0)
- || (TREE_CODE (arg0) != NEGATE_EXPR
- && pmop[1] != TREE_OPERAND (arg0, 1)))
-   {
- tree utype = TREE_TYPE (arg0);
- if (! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg0)))
-   {
- /* Perform the operations in a type that has defined
-overflow behavior.  */
- utype = unsigned_type_for (TREE_TYPE (arg0));
- if (pmop[0] != NULL)
-   pmop[0] = fold_convert_loc (loc, utype, pmop[0]);
- if (pmop[1] != NULL)
-   pmop[1] = fold_convert_loc (loc, utype, pmop[1]);
-   }
-
- 

Re: [PATCH][PR sanitizer/84250] Avoid global symbols collision when using both ASan and UBSan

2018-07-05 Thread Jakub Jelinek
On Wed, Jul 04, 2018 at 08:20:47PM +0300, Maxim Ostapenko wrote:
> On 07/04/2018 05:45 AM, Jeff Law wrote:
> > On 05/23/2018 11:15 AM, Maxim Ostapenko wrote:
> >> as described in PR, when using both ASan and UBSan
> >> (-fsanitize=address,undefined ), we have symbols collision for global
> >> functions, like __sanitizer_set_report_path. This leads to fuzzy results
> >> when printing reports into files e.g. for this test case:
> >>
> >> #include 
> >> int main(int argc, char **argv) {
> >>     __sanitizer_set_report_path("/tmp/sanitizer.txt");
> >>     int i = 23;
> >>     i <<= 32;
> >>     int *array = new int[100];
> >>     delete [] array;
> >>     return array[argc];
> >> }
> >>
> >> only ASan's report gets written to file; UBSan output goes to stderr.
> >>
> >> To resolve this issue we could use two approaches:
> >>
> >> 1) Use the same approach that is implemented in Clang (UBSan embedded
> >> into ASan). The only caveat here is that we need to link (unused) C++ part
> >> of UBSan even in C programs when linking static ASan runtime. This
> >> happens because GCC, as opposed to Clang, doesn't split C and C++
> >> runtimes for sanitizers.
> >>
> >> 2) Just add SANITIZER_INTERFACE_ATTRIBUTE to report_file global
> >> variable. In this case all __sanitizer_set_report_path calls will set
> >> the same report_file variable. IMHO this is a hacky way to fix the
> >> issue, it's better to use the first option if possible.
> >>
> >>
> >> The attached patch fixes the symbols collision by embedding UBSan into
> >> ASan (variant 1), just like we do for LSan.
> >>
> >>
> >> Regtested/bootstrapped on x86_64-unknown-linux-gnu, looks reasonable
> >> enough for trunk?
> >>
> >>
> >> -Maxim
> >>
> >>
> >> pr84250-2.diff
> >>
> >>
> >> gcc/ChangeLog:
> >>
> >> 2018-05-23  Maxim Ostapenko  
> >>
> >>* config/gnu-user.h (LIBASAN_EARLY_SPEC): Pass -lstdc++ for static
> >>libasan.
> >>* gcc.c: Do not pass LIBUBSAN_SPEC if ASan is enabled with UBSan.
> >>
> >> libsanitizer/ChangeLog:
> >>
> >> 2018-05-23  Maxim Ostapenko  
> >>
> >>* Makefile.am: Reorder libs.
> >>* Makefile.in: Regenerate.
> >>* asan/Makefile.am: Define DCAN_SANITIZE_UB=1, add dependency on
> >>libsanitizer_ubsan.la.
> >>* asan/Makefile.in: Regenerate.
> >>* ubsan/Makefile.am: Define new libsanitizer_ubsan.la library.
> >>* ubsan/Makefile.in: Regenerate.
> > You know this code better than anyone else working on GCC.  My only
> > concern would be the kernel builds with asan, but I suspect they're
> > providing their own runtime anyway, so the libstdc++ caveat shouldn't apply.
> 
> Yes, you are right, kernel provides its own runtime.
> 
> >
> > OK for the trunk.
> 
> Ok, thanks, I'll apply the patch today (with fixed ChangeLog entry).

This broke the c-c++-common/asan/pr59063-2.c test:

FAIL: c-c++-common/asan/pr59063-2.c   -O1  (test for excess errors)
Excess errors:
/usr/bin/ld: cannot find -lstdc++

While it could be fixed by tweaking asan-dg.exp, on reflection variant 1)
is actually not a good idea: it will not work properly anyway if you link
one library with -fsanitize=undefined and another library with
-fsanitize=address.  The right solution is to make the two libraries
coexist sanely, so I'd prefer 2) or, if not exporting a variable, exporting
an accessor function to get the address of the variable (or of a whole block
of shared state in one object shared between the libraries).

Yes, it means trying to get something accepted upstream, but anything else
is an ugly hack.

Jakub
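
The accessor-function idea mentioned above could look roughly like the following C sketch. Every name here (`report_state`, `get_report_state`, `runtime_a_set_report_path`, `runtime_b_get_report_path`) is hypothetical and does not correspond to the libsanitizer sources; the point is only that both runtimes reach one shared object through a single exported function.

```c
#include <stddef.h>

/* Hypothetical block of shared state.  In a real runtime this would
   live in one library and be reached by the others only through the
   accessor below.  */
struct report_state
{
  const char *path;
};

/* Single exported accessor returning the address of the shared state.  */
struct report_state *
get_report_state (void)
{
  static struct report_state state = { NULL };
  return &state;
}

/* Two hypothetical runtimes both go through the accessor, so a path
   set by one is the path seen by the other.  */
void
runtime_a_set_report_path (const char *path)
{
  get_report_state ()->path = path;
}

const char *
runtime_b_get_report_path (void)
{
  return get_report_state ()->path;
}
```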


Re: [PATCH] -fopt-info: add indentation via DUMP_VECT_SCOPE

2018-07-05 Thread Richard Biener
On Thu, Jul 5, 2018 at 10:42 AM Christophe Lyon
 wrote:
>
> On Tue, 3 Jul 2018 at 15:53, Richard Biener  
> wrote:
> >
> > On Tue, Jul 3, 2018 at 3:52 PM David Malcolm  wrote:
> > >
> > > On Tue, 2018-07-03 at 09:37 +0200, Richard Biener wrote:
> > > > On Mon, Jul 2, 2018 at 7:00 PM David Malcolm 
> > > > wrote:
> > > > >
> > > > > On Mon, 2018-07-02 at 14:23 +0200, Christophe Lyon wrote:
> > > > > > > On Fri, 29 Jun 2018 at 10:09, Richard Biener wrote:
> > > > > > > >
> > > > > > > > On Tue, Jun 26, 2018 at 5:43 PM David Malcolm wrote:
> > > > > > > >
> > > > > > > > This patch adds a concept of nested "scopes" to dumpfile.c's
> > > > > > > > dump_*_loc
> > > > > > > > calls, and wires it up to the DUMP_VECT_SCOPE macro in tree-
> > > > > > > > vectorizer.h,
> > > > > > > > so that the nested structure is shown in -fopt-info by
> > > > > > > > indentation.
> > > > > > > >
> > > > > > > > For example, this converts -fopt-info-all e.g. from:
> > > > > > > >
> > > > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > > > test.c:8:3: note: === analyze_loop_nest ===
> > > > > > > > test.c:8:3: note: === vect_analyze_loop_form ===
> > > > > > > > test.c:8:3: note: === get_loop_niters ===
> > > > > > > > test.c:8:3: note: symbolic number of iterations is (unsigned
> > > > > > > > int)
> > > > > > > > n_9(D)
> > > > > > > > test.c:8:3: note: not vectorized: loop contains function
> > > > > > > > calls or
> > > > > > > > data references that cannot be analyzed
> > > > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > > > >
> > > > > > > > to:
> > > > > > > >
> > > > > > > > test.c:8:3: note: === analyzing loop ===
> > > > > > > > test.c:8:3: note:  === analyze_loop_nest ===
> > > > > > > > test.c:8:3: note:   === vect_analyze_loop_form ===
> > > > > > > > > test.c:8:3: note:    === get_loop_niters ===
> > > > > > > > test.c:8:3: note:   symbolic number of iterations is
> > > > > > > > (unsigned
> > > > > > > > int) n_9(D)
> > > > > > > > test.c:8:3: note:   not vectorized: loop contains function
> > > > > > > > calls
> > > > > > > > or data references that cannot be analyzed
> > > > > > > > test.c:8:3: note: vectorized 0 loops in function
> > > > > > > >
> > > > > > > > showing that the "symbolic number of iterations" message is
> > > > > > > > within
> > > > > > > > the "=== analyze_loop_nest ===" (and not within the
> > > > > > > > "=== vect_analyze_loop_form ===").
> > > > > > > >
> > > > > > > > This is also enabling work for followups involving
> > > > > > > > optimization
> > > > > > > > records
> > > > > > > > (allowing the records to directly capture the nested
> > > > > > > > structure of
> > > > > > > > the
> > > > > > > > dump messages).
> > > > > > > >
> > > > > > > > Successfully bootstrapped & regrtested on x86_64-pc-linux-
> > > > > > > > gnu.
> > > > > > > >
> > > > > > > > OK for trunk?
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I've noticed that this patch (r262246) caused regressions on
> > > > > > aarch64:
> > > > > > gcc.dg/vect/slp-perm-1.c -flto -ffat-lto-objects  scan-tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-1.c scan-tree-dump vect "note: Built SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-2.c -flto -ffat-lto-objects  scan-tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-2.c scan-tree-dump vect "note: Built SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-3.c -flto -ffat-lto-objects  scan-tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-3.c scan-tree-dump vect "note: Built SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-5.c -flto -ffat-lto-objects  scan-tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-5.c scan-tree-dump vect "note: Built SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "note: Built SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-7.c -flto -ffat-lto-objects  scan-tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-7.c scan-tree-dump vect "note: Built SLP
> > > > > > cancelled: can use load/store-lanes"
> > > > > > gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects  scan-tree-
> > > > > > dump
> > > > > > vect "note: Built SLP cancelled: can use load/store-lanes"
> > > > > > gc

Re: [PATCH][PR sanitizer/84250] Avoid global symbols collision when using both ASan and UBSan

2018-07-05 Thread Maxim Ostapenko
On 07/05/2018 12:01 PM, Jakub Jelinek wrote:
> On Wed, Jul 04, 2018 at 08:20:47PM +0300, Maxim Ostapenko wrote:
>> On 07/04/2018 05:45 AM, Jeff Law wrote:
>>> On 05/23/2018 11:15 AM, Maxim Ostapenko wrote:
 as described in PR, when using both ASan and UBSan
 (-fsanitize=address,undefined ), we have symbols collision for global
 functions, like __sanitizer_set_report_path. This leads to fuzzy results
 when printing reports into files e.g. for this test case:

 #include 
 int main(int argc, char **argv) {
  __sanitizer_set_report_path("/tmp/sanitizer.txt");
  int i = 23;
  i <<= 32;
  int *array = new int[100];
  delete [] array;
  return array[argc];
 }

 only ASan's report gets written to file; UBSan output goes to stderr.

 To resolve this issue we could use two approaches:

 1) Use the same approach that is implemented in Clang (UBSan embedded
 into ASan). The only caveat here is that we need to link the (unused) C++ part
 of UBSan even in C programs when linking static ASan runtime. This
 happens because GCC, as opposed to Clang, doesn't split C and C++
 runtimes for sanitizers.

 2) Just add SANITIZER_INTERFACE_ATTRIBUTE to report_file global
 variable. In this case all __sanitizer_set_report_path calls will set
 the same report_file variable. IMHO this is a hacky way to fix the
 issue, it's better to use the first option if possible.


 The attached patch fixes the symbols collision by embedding UBSan into
 ASan (variant 1), just like we do for LSan.


 Regtested/bootstrapped on x86_64-unknown-linux-gnu, looks reasonable
 enough for trunk?


 -Maxim


 pr84250-2.diff


 gcc/ChangeLog:

 2018-05-23  Maxim Ostapenko  

* config/gnu-user.h (LIBASAN_EARLY_SPEC): Pass -lstdc++ for static
libasan.
* gcc.c: Do not pass LIBUBSAN_SPEC if ASan is enabled with UBSan.

 libsanitizer/ChangeLog:

 2018-05-23  Maxim Ostapenko  

* Makefile.am: Reorder libs.
* Makefile.in: Regenerate.
* asan/Makefile.am: Define DCAN_SANITIZE_UB=1, add dependency on
libsanitizer_ubsan.la.
* asan/Makefile.in: Regenerate.
* ubsan/Makefile.am: Define new libsanitizer_ubsan.la library.
* ubsan/Makefile.in: Regenerate.
>>> You know this code better than anyone else working on GCC.  My only
>>> concern would be the kernel builds with asan, but I suspect they're
>>> providing their own runtime anyway, so the libstdc++ caveat shouldn't apply.
>> Yes, you are right, kernel provides its own runtime.
>>
>>> OK for the trunk.
>> Ok, thanks, I'll apply the patch today (with fixed ChangeLog entry).
> This broke the c-c++-common/asan/pr59063-2.c test:
>
> FAIL: c-c++-common/asan/pr59063-2.c   -O1  (test for excess errors)
> Excess errors:
> /usr/bin/ld: cannot find -lstdc++

I must have overlooked this, sorry :(.

> While it could be fixed by tweaking asan-dg.exp, thinking about this, the
> 1) variant is actually not a good idea, it will not work properly anyway
> if you link one library with -fsanitize=undefined and another library
> with -fsanitize=address, the right solution is to make the two libraries
> coexist sanely

Yes, you're right. Btw, we have pretty much the same situation with ASan + 
LSan, right?

> , so I'd prefer 2) or if not exporting a variable, export
> an accessor function to get the address of the variable (or whole block of
> shared state in one object between the libraries).
>
> Yes, it means trying to get something accepted upstream, but anything else
> is an ugly hack.

Ok. Could you please revert this patch for me (I don't have write 
access to the repo right now) so we can cook up a proper fix?

-Maxim

>
>   Jakub
>
>
>



Re: calculate overflow type in wide int arithmetic

2018-07-05 Thread Richard Biener
On Thu, Jul 5, 2018 at 9:35 AM Aldy Hernandez  wrote:
>
> The reason for this patch are the changes showcased in tree-vrp.c.
> Basically I'd like to discourage rolling our own overflow and underflow
> calculation when doing wide int arithmetic.  We should have a
> centralized place for this, that is-- in the wide int code itself ;-).
>
> The only cases I care about are plus/minus, which I have implemented,
> but we also get division for free, since AFAICT, division can only
> positive overflow:
>
> -MIN / -1 => +OVERFLOW
>
> Multiplication OTOH, can underflow, but I've not implemented it because
> we have no uses for it.  I have added a note in the code explaining this.
>
> Originally I tried to only change plus/minus, but that made code that
> dealt with plus/minus in addition to div or mult a lot uglier.  You'd
> have to special case "int overflow_for_add_stuff" and "bool
> overflow_for_everything_else".  Changing everything to int, makes things
> consistent.
>
> Note: I have left poly-int as is, with its concept of yes/no for
> overflow.  I can adapt this as well if desired.
>
> Tested on x86-64 Linux.
>
> OK for trunk?

This all looks straightforward except for the following:

   else if (op1)
 {
   if (minus_p)
-   {
- wi = -wi::to_wide (op1);
-
- /* Check for overflow.  */
- if (sgn == SIGNED
- && wi::neg_p (wi::to_wide (op1))
- && wi::neg_p (wi))
-   ovf = 1;
- else if (sgn == UNSIGNED && wi::to_wide (op1) != 0)
-   ovf = -1;
-   }
+   wi = wi::neg (wi::to_wide (op1));
   else
wi = wi::to_wide (op1);

you fail to handle - -INT_MIN.

Given the fact that for multiplication (or others, didn't look too closely)
you didn't implement the direction indicator, I wonder if it would be more
appropriate to do

enum ovfl { OVFL_NONE = 0, OVFL_UNDERFLOW = -1, OVFL_OVERFLOW = 1,
OVFL_UNKNOWN = 2 };

and tell us the "truth" here?

Hopefully if (overflow) will still work with that.

Otherwise can you please add a toplevel comment to wide-int.h as to what the
overflow result semantically is for a) SIGNED and b) UNSIGNED operations?
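
A minimal sketch of the direction-aware result discussed above, assuming plain 32-bit signed integers instead of wide_int. The helpers add_with_ovfl and neg_with_ovfl are hypothetical names used only for illustration and are not part of wide-int.h; the enum is the one proposed above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Direction-aware overflow indicator, as proposed in the mail above.
enum ovfl { OVFL_NONE = 0, OVFL_UNDERFLOW = -1, OVFL_OVERFLOW = 1,
            OVFL_UNKNOWN = 2 };

// Hypothetical helper: signed 32-bit addition that reports the direction
// of any overflow.  The sum is computed in 64 bits so the check itself
// cannot overflow.
inline int32_t add_with_ovfl (int32_t a, int32_t b, ovfl *overflow)
{
  assert (overflow != nullptr);
  const int64_t wide = static_cast<int64_t> (a) + static_cast<int64_t> (b);
  if (wide > std::numeric_limits<int32_t>::max ())
    *overflow = OVFL_OVERFLOW;
  else if (wide < std::numeric_limits<int32_t>::min ())
    *overflow = OVFL_UNDERFLOW;
  else
    *overflow = OVFL_NONE;
  // Wrap to 32 bits (modular on GCC; guaranteed modular since C++20).
  return static_cast<int32_t> (static_cast<uint32_t> (wide));
}

// Hypothetical helper: signed negation.  The only overflowing input is
// INT_MIN, because -INT_MIN equals INT_MAX + 1.
inline int32_t neg_with_ovfl (int32_t a, ovfl *overflow)
{
  assert (overflow != nullptr);
  if (a == std::numeric_limits<int32_t>::min ())
    {
      *overflow = OVFL_OVERFLOW;
      return a;  // The wrapped result is INT_MIN itself.
    }
  *overflow = OVFL_NONE;
  return -a;
}
```

Because OVFL_NONE is zero and every other enumerator is non-zero, a plain "if (overflow)" test keeps working in this sketch.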

Thanks,
Richard.


Re: Enhance __gnu_debug::string debug assertion

2018-07-05 Thread Jonathan Wakely

On 05/07/18 07:28 +0200, François Dumont wrote:
    This patch improves the assertion message generated in 2 
__gnu_debug::string constructors giving the assertion context thanks 
to the __FUNCTION__ macro.


Was:

/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/string:56: 
const _CharT* __gnu_debug::__check_string(const _CharT*, _Integer, 
const char*, unsigned int, const char*) [with _CharT = char; _Integer 
= long unsigned int]: Assertion '__s != 0 || __n == 0' failed.

XFAIL: 21_strings/basic_string/debug/1_neg.cc execution test

Now:

/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/string:172:
In function:
    __gnu_debug::basic_string<_CharT, _Traits,
    _Allocator>::basic_string(const _CharT*,
    __gnu_debug::basic_string<_CharT, _Traits, _Allocator>::size_type,
    const _Allocator&) [with _CharT = char; _Traits = std::char_traits<char>;
    _Allocator = std::allocator<char>; __gnu_debug::basic_string<_CharT,
    _Traits, _Allocator>::size_type = long unsigned int]

Error: __s != 0 || __n == 0.
XFAIL: 21_strings/basic_string/debug/1_neg.cc execution test

    Tested under Linux x86_64 normal and debug modes.

    If not told otherwise I plan to commit the attached patch tomorrow.


Looks good - thanks.



Re: [PATCH 1/3][POPCOUNT] Handle COND_EXPR in expression_expensive_p

2018-07-05 Thread Kugan Vivekanandarajah
Hi Richard,

Thanks for the review.

On 28 June 2018 at 21:26, Richard Biener  wrote:
> On Wed, Jun 27, 2018 at 7:00 AM Kugan Vivekanandarajah
>  wrote:
>>
>> Hi Richard,
>>
>> Thanks for the review.
>>
>> On 25 June 2018 at 20:01, Richard Biener  wrote:
>> > On Fri, Jun 22, 2018 at 11:13 AM Kugan Vivekanandarajah
>> >  wrote:
>> >>
>> >> [PATCH 1/3][POPCOUNT] Handle COND_EXPR in expression_expensive_p
>> >
>> > This says that COND_EXPR itself isn't expensive.  I think we should
>> > constrain that a bit.
>> > I think a good default would be to only allow a single COND_EXPR which
>> > you can achieve
>> > by adding a bool in_cond_expr_p = false argument to the function, pass
>> > in_cond_expr_p
>> > down and pass true down from the COND_EXPR handling itself.
>> >
>> > I'm not sure if we should require either COND_EXPR arm (operand 1 or
>> > 2) to be constant
>> > or !EXPR_P (then multiple COND_EXPRs might be OK).
>> >
>> > The main idea is to avoid evaluating many expressions but only
>> > choosing one in the end.
>> >
>> > The simplest patch achieving that is sth like
>> >
>> > +  if (code == COND_EXPR)
>> > +return (expression_expensive_p (TREE_OPERAND (expr, 0))
>> >   || (EXPR_P (TREE_OPERAND (expr, 1)) && EXPR_P
>> > (TREE_OPERAND (expr, 2)))
>> > +   || expression_expensive_p (TREE_OPERAND (expr, 1))
>> > +   || expression_expensive_p (TREE_OPERAND (expr, 2)));
>> >
>> > OK with that change.
>>
>> Is || (EXPR_P (TREE_OPERAND (expr, 1)) || EXPR_P (TREE_OPERAND (expr,
>> 2))) slightly better ?
>> Attaching  with the change. Is this OK?
>
> Well, it won't allow x != 0 ? popcount (x) : 1 because popcount(x) is 
> CALL_EXPR.
>
>>
>>
>> Because, for pr81661.c, we now allow as not expensive
>> > type > size 
>> unit-size 
>> align:32 warn_if_not_align:0 symtab:0 alias-set 1
>> canonical-type 0x769455e8 precision:32 min > 0x7692dee8 -2147483648> max > 2147483647>
>> pointer_to_this >
>>
>> arg:0 
>>
>> arg:0 
>> visited
>> def_stmt a.1_10 = a;
>> version:10>
>> arg:1 >
>> arg:1 
>>
>> arg:0 > _Bool>
>>
>> arg:0 > 0x769455e8 int>
>>
>> arg:0 > 0x769455e8 int>
>> arg:0  arg:1 > 0x7694a0d8 -1>>
>> arg:1 > 0x769455e8 int>
>> visited
>> def_stmt c.2_11 = c;
>> version:11>>
>> arg:1 > 0x769455e8 int>
>> visited
>> def_stmt b.3_13 = b;
>> version:13>>
>> arg:1 > int>
>>
>> arg:0 > 0x769455e8 int>
>>
>> arg:0 > 0x76a55b28>
>>
>> arg:0 > 0x76a55b28>
>>
>> arg:0 > 
>> arg:0  arg:1
>> >>
>> arg:1 > 0x76a55b28>
>> arg:0 
>> arg:2 >>
>>
>> Which also leads to an ICE in gimplify_modify_expr. I think this is a
>> latent issue and I am trying to find the source
>
> Well, I think that's because some COND_EXPRs only gimplify to
> conditional code.  See gimplify_cond_expr:
>
>   if (gimplify_ctxp->allow_rhs_cond_expr
>   /* If either branch has side effects or could trap, it can't be
>  evaluated unconditionally.  */
>   && !TREE_SIDE_EFFECTS (then_)
>   && !generic_expr_could_trap_p (then_)
>   && !TREE_SIDE_EFFECTS (else_)
>   && !generic_expr_could_trap_p (else_))
> return gimplify_pure_cond_expr (expr_p, pre_p);
>
> so we probably have to treat TREE_SIDE_EFFECTS / generic_expr_could_trap_p as
> "expensive" as well for the purpose of final value replacement unless we are
> going to support a code-generation way different from gimplification.

Is the attached patch, which does this, OK? I had to fix a couple of
testcases: final value replacement now removes the loop in pr64183.c,
and pr85073.c is a popcount pattern, so I just disabled it there so
that we still test what was tested earlier.
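
For illustration only, a self-contained sketch of the rule discussed above, on a toy expression tree rather than on GCC trees. The types Code and Expr and the functions make, unsafe_to_speculate and expensive_p are hypothetical stand-ins; the side_effects and could_trap flags play the role of TREE_SIDE_EFFECTS and generic_expr_could_trap_p, and "division is expensive" is just an assumed cost rule.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins for tree codes; these are not the real GCC definitions.
enum class Code { Constant, Variable, Plus, Divide, Call, Cond };

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr
{
  Code code = Code::Constant;
  bool side_effects = false;  // stand-in for TREE_SIDE_EFFECTS
  bool could_trap = false;    // stand-in for generic_expr_could_trap_p
  std::vector<ExprPtr> ops;   // for Cond: condition, then-arm, else-arm
};

ExprPtr make (Code code, std::vector<ExprPtr> ops = {},
              bool side_effects = false, bool could_trap = false)
{
  auto e = std::make_shared<Expr> ();
  e->code = code;
  e->side_effects = side_effects;
  e->could_trap = could_trap;
  e->ops = std::move (ops);
  return e;
}

// True if E or any sub-expression has side effects or could trap, i.e.
// it must not be evaluated unconditionally.
bool unsafe_to_speculate (const Expr &e)
{
  if (e.side_effects || e.could_trap)
    return true;
  for (const ExprPtr &op : e.ops)
    if (unsafe_to_speculate (*op))
      return true;
  return false;
}

// Toy cost model: division is expensive, and a conditional is expensive
// when either arm cannot be evaluated unconditionally.
bool expensive_p (const Expr &e)
{
  if (e.code == Code::Divide)
    return true;
  if (e.code == Code::Cond
      && (unsafe_to_speculate (*e.ops[1]) || unsafe_to_speculate (*e.ops[2])))
    return true;
  for (const ExprPtr &op : e.ops)
    if (expensive_p (*op))
      return true;
  return false;
}
```

With this model, x != 0 ? popcount (x) : 1 stays cheap as long as the call is flagged neither as trapping nor as having side effects, while a conditional with a trapping arm (as with -ftrapv arithmetic) is rejected.
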
>
> The testcase you cite uses -ftrapv which is why we run into this issue.  Note
> that final value replacement deals with this by rewriting the expression to
> unsigned but it does so only after gimplification.  IIRC Jakub recently
> added a helper to rewrite GENERIC to unsigned so that might be useful
> in this context.
Could you kindly point me to Jakub's patch, please?

Thanks,
Kugan


>
> Richard.
>
>> the expr in gimple_modify_expr is
>> > type > size 
>> unit-size 
>> align:32 warn_if_not_align:0 symtab:0 alias-set 1
>> canonical-type 0x769455e8 precision:32 min > 0x7692dee8 -2147483648> max > 2147483647>
>> pointer_to_this >
>> side-effects
>> arg:0 > 0x769455e8 int>
>> used ignored SI (null):0:0 size > 32> unit-size 
>> align:32 warn_if_not_align:0 

[PATCH, S390] Avoid LA with base and index on z13

2018-07-05 Thread Robin Dapp
Hi,

this patch avoids emitting LA on z13 and later when the address has both
an index and a base since a regular add is faster in that case.

Regtested on s390x.

Regards
 Robin

--

gcc/ChangeLog:

2018-07-05  Robin Dapp  

* config/s390/s390.c (preferred_la_operand_p): Do not use
LA with base and index on z13 or later.
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 5add5985866..df9357fa9e5 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -4630,6 +4630,11 @@ preferred_la_operand_p (rtx op1, rtx op2)
   if (addr.indx && s390_tune == PROCESSOR_2817_Z196)
 return false;
 
+  /* Avoid LA when the address has index as well as base registers,
+ a regular add is still faster then. */
+  if (addr.base && addr.indx && s390_tune >= PROCESSOR_2964_Z13)
+return false;
+
   if (!TARGET_64BIT && !addr.pointer)
 return false;
 


[PATCH][debug] Handle references to skipped params in remap_ssa_name

2018-07-05 Thread Tom de Vries
[ was: Re: [testsuite/guality, committed] Prevent optimization of local in
vla-1.c ]

On Wed, Jul 04, 2018 at 02:32:27PM +0200, Tom de Vries wrote:
> On 07/03/2018 11:05 AM, Tom de Vries wrote:
> > On 07/02/2018 10:16 AM, Jakub Jelinek wrote:
> >> On Mon, Jul 02, 2018 at 09:44:04AM +0200, Richard Biener wrote:
> >>> Given the array has size i + 1 it's upper bound should be 'i' and 'i'
> >>> should be available via DW_OP_[GNU_]entry_value.
> >>>
> >>> I see it is
> >>>
> >>> <175>   DW_AT_upper_bound : 10 byte block: 75 1 8 20 24 8 20 26 31
> >>> 1c   (DW_OP_breg5 (rdi): 1; DW_OP_const1u: 32; DW_OP_shl;
> >>> DW_OP_const1u: 32; DW_OP_shra; DW_OP_lit1; DW_OP_minus)
> >>>
> >>> and %rdi is 1.  Not sure why gdb fails to print it's length.  Yes, the
> >>> storage itself doesn't have a location but the
> >>> type specifies the size.
> >>>
> >>> (gdb) ptype a
> >>> type = char [variable length]
> >>> (gdb) p sizeof(a)
> >>> $3 = 0
> >>>
> >>> this looks like a gdb bug to me?
> >>>
> > 
> > With gdb patch:
> > ...
> > diff --git a/gdb/findvar.c b/gdb/findvar.c
> > index 8ad5e25cb2..ebaff923a1 100644
> > --- a/gdb/findvar.c
> > +++ b/gdb/findvar.c
> > @@ -789,6 +789,8 @@ default_read_var_value
> >break;
> > 
> >  case LOC_OPTIMIZED_OUT:
> > +  if (is_dynamic_type (type))
> > +   type = resolve_dynamic_type (type, NULL,
> > +/* Unused address.  */ 0);
> >return allocate_optimized_out_value (type);
> > 
> >  default:
> > ...
> > 
> > I get:
> > ...
> > $ ./gdb -batch -ex "b f1" -ex "r" -ex "p sizeof (a)" vla-1.exe
> > Breakpoint 1 at 0x4004a8: file vla-1.c, line 17.
> > 
> > Breakpoint 1, f1 (i=i@entry=5) at vla-1.c:17
> > 17return a[0];
> > $1 = 6
> > ...
> > 
> 
> Well, for -O1 and -O2.
> 
> For O3, I get instead:
> ...
> $ ./gdb vla-1.exe -q -batch -ex "b f1" -ex "run" -ex "p sizeof (a)"
> Breakpoint 1 at 0x4004b0: f1. (2 locations)
> 
> Breakpoint 1, f1 (i=5) at vla-1.c:17
> 17return a[0];
> $1 = 0
> ...
> 

Hi,

When compiling guality/vla-1.c with -O3 -g, vla 'a[i + 1]' in f1 is optimized
away, but f1 still contains a debug expression describing the upper bound of the
vla (D.1914):
...
 __attribute__((noinline))
 f1 (intD.6 iD.1900)
 {
   
   saved_stack.1_2 = __builtin_stack_save ();
   # DEBUG BEGIN_STMT
   # DEBUG D#3 => i_1(D) + 1
   # DEBUG D#2 => (long intD.8) D#3
   # DEBUG D#1 => D#2 + -1
   # DEBUG D.1914 => (sizetype) D#1
...

Then f1 is cloned to a version f1.constprop with no parameters, eliminating
parameter i, and 'DEBUG D#3 => i_1(D) + 1' turns into 'D#3 => NULL'.
Consequently, 'print sizeof (a)' yields '0' in gdb.

This patch fixes that by recognizing eliminated parameters in remap_ssa_name,
defining a debug expression linking back to the eliminated parameter, and
using that debug expression to replace references to the eliminated parameter:
...
 __attribute__((noinline))
 f1.constprop ()
 {
   intD.6 iD.1949;

   
   # DEBUG D#8 s=> iD.1900
   # DEBUG iD.1949 => D#8

   
+  # DEBUG D#6 s=> iD.1900
   saved_stack.1_1 = __builtin_stack_save ();
   # DEBUG BEGIN_STMT
-  # DEBUG D#3 => NULL
+  # DEBUG D#3 => D#6 + 1
   # DEBUG D#2 => (long intD.8) D#3
   # DEBUG D#1 => D#2 + -1
   # DEBUG D.1951 => (sizetype) D#1
...

The inserted debug expression (D#6) is a duplicate of the debug expression
that will be inserted after copy_body in tree_function_versioning (D#8), so the
patch contains a todo to fix the duplication.
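
To make the dependency chain concrete, here is a small model of the debug binds for 'a[i + 1]', assuming char elements so that the size equals the upper bound plus one. The functions vla_upper_bound and vla_sizeof are hypothetical and only mirror the D#3 / D#2 / D#1 / D.1914 chain shown above; an empty optional stands for a bind that was reset to NULL.

```cpp
#include <cstddef>
#include <optional>

// Model of the debug-bind chain:
//   D#3 => i + 1;  D#2 => (long) D#3;  D#1 => D#2 + -1;
//   D.1914 => (sizetype) D#1
// If the parameter value is unavailable, every dependent bind is
// unavailable as well.
std::optional<std::size_t> vla_upper_bound (std::optional<int> i)
{
  if (!i)
    return std::nullopt;
  const int d3 = *i + 1;
  const long d2 = static_cast<long> (d3);
  const long d1 = d2 + -1;
  return static_cast<std::size_t> (d1);
}

// Size of the char VLA: upper bound plus one element.
std::optional<std::size_t> vla_sizeof (std::optional<int> i)
{
  const std::optional<std::size_t> upper = vla_upper_bound (i);
  if (!upper)
    return std::nullopt;
  return *upper + 1;
}
```

With i = 5 this model yields 6, matching the -O1/-O2 gdb output quoted earlier; with the parameter unavailable, nothing downstream can be computed.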

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

[debug] Handle references to skipped params in remap_ssa_name

2018-07-05  Tom de Vries  

* tree-inline.c (remap_ssa_name): Handle references to skipped
params.

---
 gcc/tree-inline.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 427ef959740..0fa996cab49 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -204,11 +204,22 @@ remap_ssa_name (tree name, copy_body_data *id)
  gimple *def_temp;
  gimple_stmt_iterator gsi;
  tree val = SSA_NAME_VAR (name);
+ bool skipped_parm_decl = false;
 
  n = id->decl_map->get (val);
  if (n != NULL)
-   val = *n;
- if (TREE_CODE (val) != PARM_DECL)
+   {
+ if (TREE_CODE (*n) == DEBUG_EXPR_DECL)
+   return *n;
+
+ if (TREE_CODE (*n) == VAR_DECL
+ && DECL_ABSTRACT_ORIGIN (*n)
+ && TREE_CODE (DECL_ABSTRACT_ORIGIN (*n)) == PARM_DECL)
+   skipped_parm_decl = true;
+ else
+   val = *n;
+   }
+ if (TREE_CODE (val) != PARM_DECL && !skipped_parm_decl)
{
  processing_debug_stmt = -1;
  return name;
@@ -219,6 +230,8 @@ remap_ssa_name (tree name, copy_body_data *id)

Re: [PATCH 1/3][POPCOUNT] Handle COND_EXPR in expression_expensive_p

2018-07-05 Thread Richard Biener
On Thu, Jul 5, 2018 at 1:02 PM Kugan Vivekanandarajah
 wrote:
>
> Hi Richard,
>
> Thanks for the review.
>
> On 28 June 2018 at 21:26, Richard Biener  wrote:
> > On Wed, Jun 27, 2018 at 7:00 AM Kugan Vivekanandarajah
> >  wrote:
> >>
> >> Hi Richard,
> >>
> >> Thanks for the review.
> >>
> >> On 25 June 2018 at 20:01, Richard Biener  
> >> wrote:
> >> > On Fri, Jun 22, 2018 at 11:13 AM Kugan Vivekanandarajah
> >> >  wrote:
> >> >>
> >> >> [PATCH 1/3][POPCOUNT] Handle COND_EXPR in expression_expensive_p
> >> >
> >> > This says that COND_EXPR itself isn't expensive.  I think we should
> >> > constrain that a bit.
> >> > I think a good default would be to only allow a single COND_EXPR which
> >> > you can achieve
> >> > by adding a bool in_cond_expr_p = false argument to the function, pass
> >> > in_cond_expr_p
> >> > down and pass true down from the COND_EXPR handling itself.
> >> >
> >> > I'm not sure if we should require either COND_EXPR arm (operand 1 or
> >> > 2) to be constant
> >> > or !EXPR_P (then multiple COND_EXPRs might be OK).
> >> >
> >> > The main idea is to avoid evaluating many expressions but only
> >> > choosing one in the end.
> >> >
> >> > The simplest patch achieving that is sth like
> >> >
> >> > +  if (code == COND_EXPR)
> >> > +return (expression_expensive_p (TREE_OPERAND (expr, 0))
> >> >   || (EXPR_P (TREE_OPERAND (expr, 1)) && EXPR_P
> >> > (TREE_OPERAND (expr, 2)))
> >> > +   || expression_expensive_p (TREE_OPERAND (expr, 1))
> >> > +   || expression_expensive_p (TREE_OPERAND (expr, 2)));
> >> >
> >> > OK with that change.
> >>
> >> Is || (EXPR_P (TREE_OPERAND (expr, 1)) || EXPR_P (TREE_OPERAND (expr,
> >> 2))) slightly better ?
> >> Attaching  with the change. Is this OK?
> >
> > Well, it won't allow x != 0 ? popcount (x) : 1 because popcount(x) is 
> > CALL_EXPR.
> >
> >>
> >>
> >> Because, for pr81661.c, we now allow as not expensive
> >>  >> type  >> size 
> >> unit-size 
> >> align:32 warn_if_not_align:0 symtab:0 alias-set 1
> >> canonical-type 0x769455e8 precision:32 min  >> 0x7692dee8 -2147483648> max  >> 2147483647>
> >> pointer_to_this >
> >>
> >> arg:0 
> >>
> >> arg:0  >> int>
> >> visited
> >> def_stmt a.1_10 = a;
> >> version:10>
> >> arg:1 >
> >> arg:1 
> >>
> >> arg:0  >> _Bool>
> >>
> >> arg:0  >> 0x769455e8 int>
> >>
> >> arg:0  >> 0x769455e8 int>
> >> arg:0  arg:1  >> 0x7694a0d8 -1>>
> >> arg:1  >> 0x769455e8 int>
> >> visited
> >> def_stmt c.2_11 = c;
> >> version:11>>
> >> arg:1  >> 0x769455e8 int>
> >> visited
> >> def_stmt b.3_13 = b;
> >> version:13>>
> >> arg:1  >> 0x769455e8 int>
> >>
> >> arg:0  >> 0x769455e8 int>
> >>
> >> arg:0  >> 0x76a55b28>
> >>
> >> arg:0  >> 0x76a55b28>
> >>
> >> arg:0  >> 
> >> arg:0  arg:1
> >> >>
> >> arg:1  >> 0x76a55b28>
> >> arg:0 
> >> arg:2 >>
> >>
> >> Which also leads to an ICE in gimplify_modify_expr. I think this is a
> >> latent issue and I am trying to find the source
> >
> > Well, I think that's because some COND_EXPRs only gimplify to
> > conditional code.  See gimplify_cond_expr:
> >
> >   if (gimplify_ctxp->allow_rhs_cond_expr
> >   /* If either branch has side effects or could trap, it can't 
> > be
> >  evaluated unconditionally.  */
> >   && !TREE_SIDE_EFFECTS (then_)
> >   && !generic_expr_could_trap_p (then_)
> >   && !TREE_SIDE_EFFECTS (else_)
> >   && !generic_expr_could_trap_p (else_))
> > return gimplify_pure_cond_expr (expr_p, pre_p);
> >
> > so we probably have to treat TREE_SIDE_EFFECTS / generic_expr_could_trap_p 
> > as
> > "expensive" as well for the purpose of final value replacement unless we are
> > going to support a code-generation way different from gimplification.
>
> Is the attached patch which does this is OK?. I had to fix couple of
> testcases because now the final value replacement removed the loop for
> pr64183.c and pr85073.c is popcount pattern so I just disabled it so
> that we can test what was tested earlier.

The patch is OK.

> >
> > The testcase you cite uses -ftrapv which is why we run into this issue.  
> > Note
> > that final value replacement deals with this by rewriting the expression to
> > unsigned but it does so only after gimplification.  IIRC Jakub recently
> > added a helper to rewrite GENERIC to unsigned so that might be useful
> > in this context.
> Could you kindly refer me to Jakubs patch please.

I couldn't find it quickly, asked Jakub now.

Thank

Re: [PATCH][debug] Handle references to skipped params in remap_ssa_name

2018-07-05 Thread Richard Biener
On Thu, Jul 5, 2018 at 1:25 PM Tom de Vries  wrote:
>
> [ was: Re: [testsuite/guality, committed] Prevent optimization of local in
> vla-1.c ]
>
> On Wed, Jul 04, 2018 at 02:32:27PM +0200, Tom de Vries wrote:
> > On 07/03/2018 11:05 AM, Tom de Vries wrote:
> > > On 07/02/2018 10:16 AM, Jakub Jelinek wrote:
> > >> On Mon, Jul 02, 2018 at 09:44:04AM +0200, Richard Biener wrote:
> > >>> Given the array has size i + 1 it's upper bound should be 'i' and 'i'
> > >>> should be available via DW_OP_[GNU_]entry_value.
> > >>>
> > >>> I see it is
> > >>>
> > >>> <175>   DW_AT_upper_bound : 10 byte block: 75 1 8 20 24 8 20 26 31
> > >>> 1c   (DW_OP_breg5 (rdi): 1; DW_OP_const1u: 32; DW_OP_shl;
> > >>> DW_OP_const1u: 32; DW_OP_shra; DW_OP_lit1; DW_OP_minus)
> > >>>
> > >>> and %rdi is 1.  Not sure why gdb fails to print it's length.  Yes, the
> > >>> storage itself doesn't have a location but the
> > >>> type specifies the size.
> > >>>
> > >>> (gdb) ptype a
> > >>> type = char [variable length]
> > >>> (gdb) p sizeof(a)
> > >>> $3 = 0
> > >>>
> > >>> this looks like a gdb bug to me?
> > >>>
> > >
> > > With gdb patch:
> > > ...
> > > diff --git a/gdb/findvar.c b/gdb/findvar.c
> > > index 8ad5e25cb2..ebaff923a1 100644
> > > --- a/gdb/findvar.c
> > > +++ b/gdb/findvar.c
> > > @@ -789,6 +789,8 @@ default_read_var_value
> > >break;
> > >
> > >  case LOC_OPTIMIZED_OUT:
> > > +  if (is_dynamic_type (type))
> > > +   type = resolve_dynamic_type (type, NULL,
> > > +/* Unused address.  */ 0);
> > >return allocate_optimized_out_value (type);
> > >
> > >  default:
> > > ...
> > >
> > > I get:
> > > ...
> > > $ ./gdb -batch -ex "b f1" -ex "r" -ex "p sizeof (a)" vla-1.exe
> > > Breakpoint 1 at 0x4004a8: file vla-1.c, line 17.
> > >
> > > Breakpoint 1, f1 (i=i@entry=5) at vla-1.c:17
> > > 17return a[0];
> > > $1 = 6
> > > ...
> > >
> >
> > Well, for -O1 and -O2.
> >
> > For O3, I get instead:
> > ...
> > $ ./gdb vla-1.exe -q -batch -ex "b f1" -ex "run" -ex "p sizeof (a)"
> > Breakpoint 1 at 0x4004b0: f1. (2 locations)
> >
> > Breakpoint 1, f1 (i=5) at vla-1.c:17
> > 17return a[0];
> > $1 = 0
> > ...
> >
>
> Hi,
>
> When compiling guality/vla-1.c with -O3 -g, vla 'a[i + 1]' in f1 is optimized
> away, but f1 still contains a debug expression describing the upper bound of 
> the
> vla (D.1914):
> ...
>  __attribute__((noinline))
>  f1 (intD.6 iD.1900)
>  {
>
>saved_stack.1_2 = __builtin_stack_save ();
># DEBUG BEGIN_STMT
># DEBUG D#3 => i_1(D) + 1
># DEBUG D#2 => (long intD.8) D#3
># DEBUG D#1 => D#2 + -1
># DEBUG D.1914 => (sizetype) D#1
> ...
>
> Then f1 is cloned to a version f1.constprop with no parameters, eliminating
> parameter i, and 'DEBUG D#3 => i_1(D) + 1' turns into 'D#3 => NULL'.
> Consequently, 'print sizeof (a)' yields '0' in gdb.

So does gdb correctly recognize there isn't any size available or do we somehow
generate invalid debug info, not recognizing that D#3 => NULL means
"optimized out" and thus all dependent expressions are "optimized out" as well?

That is, shouldn't gdb do

(gdb) print sizeof (a)
<optimized out>

?

> This patch fixes that by recognizing eliminated parameters in remap_ssa_name,
> defining a debug expression linking back to the the eliminated parameter, and
> using that debug expression to replace references to the eliminated parameter:
> ...
>  __attribute__((noinline))
>  f1.constprop ()
>  {
>intD.6 iD.1949;
>
>
># DEBUG D#8 s=> iD.1900
># DEBUG iD.1949 => D#8
>
>
> +  # DEBUG D#6 s=> iD.1900
>saved_stack.1_1 = __builtin_stack_save ();
># DEBUG BEGIN_STMT
> -  # DEBUG D#3 => NULL
> +  # DEBUG D#3 => D#6 + 1
># DEBUG D#2 => (long intD.8) D#3
># DEBUG D#1 => D#2 + -1
># DEBUG D.1951 => (sizetype) D#1
> ...
>
> The inserted debug expression (D#6) is a duplicate of the debug expression
> that will be inserted after copy_body in tree_function_versioning (D#8), so 
> the
> patch contains a todo to fix the duplication.
>
> Bootstrapped and reg-tested on x86_64.
>
> OK for trunk?
>
> Thanks,
> - Tom
>
> [debug] Handle references to skipped params in remap_ssa_name
>
> 2018-07-05  Tom de Vries  
>
> * tree-inline.c (remap_ssa_name): Handle references to skipped
> params.
>
> ---
>  gcc/tree-inline.c | 17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> index 427ef959740..0fa996cab49 100644
> --- a/gcc/tree-inline.c
> +++ b/gcc/tree-inline.c
> @@ -204,11 +204,22 @@ remap_ssa_name (tree name, copy_body_data *id)
>   gimple *def_temp;
>   gimple_stmt_iterator gsi;
>   tree val = SSA_NAME_VAR (name);
> + bool skipped_parm_decl = false;
>
>   n = id->decl_map->get (val);
>   if (n != NULL)
> -   val = *n;
> -  

Re: [PATCH, S390] Avoid LA with base and index on z13

2018-07-05 Thread Ulrich Weigand
Robin Dapp wrote:

> * config/s390/s390.c (preferred_la_operand_p): Do not use
>   LA with base and index on z13 or later.

The code just before your change reads:

  /* Avoid LA instructions with index register on z196; it is
 preferable to use regular add instructions when possible.
 Starting with zEC12 the la with index register is "uncracked"
 again.  */
  if (addr.indx && s390_tune == PROCESSOR_2817_Z196)
return false;

But on zEC12 LA works pretty much the same as on z13/z14, it is
indeed not cracked, but still a 2-cycle instruction when using
an index register.  So I guess the change really should apply
to zEC12 as well, and this could be as simple as changing the
above line to:

  if (addr.indx && s390_tune >= PROCESSOR_2817_Z196)

(Note that "addr.base && addr.indx" is the same as just checking
for addr.indx, since s390_decompose_address will never fill in
*just* an index.)
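
A sketch of the suggested check, assuming a reduced copy of the tuning enum in which later processors compare greater. The struct address and the function avoid_la_p are hypothetical stand-ins for the decomposed address and for the relevant part of preferred_la_operand_p.

```cpp
// Reduced, illustrative copy of the tuning enum; the ordering matters,
// the full list in the s390 back end is longer.
enum processor_type
{
  PROCESSOR_2097_Z10,
  PROCESSOR_2817_Z196,
  PROCESSOR_2827_ZEC12,
  PROCESSOR_2964_Z13,
  PROCESSOR_3906_Z14
};

// Hypothetical stand-in for the decomposed address: which registers
// are present.
struct address
{
  bool base;
  bool indx;
};

// Avoid LA with an index register on z196 and everything newer.
// Checking addr.indx alone is enough under the assumption stated above
// that an index never appears without a base.
bool avoid_la_p (address addr, processor_type tune)
{
  return addr.indx && tune >= PROCESSOR_2817_Z196;
}
```

The single ">=" comparison then covers z196, zEC12, z13 and z14 without a separate z13 test.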

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



[PATCH] Update config.guess and config.sub

2018-07-05 Thread Sebastian Huber
* config.guess: Sync with upstream version 2018-06-26.
* config.sub: Sync with upstream version 2018-07-02.
---
 config.guess | 6 +++---
 config.sub   | 8 +++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/config.guess b/config.guess
index 883a6713bf0..445c406836e 100755
--- a/config.guess
+++ b/config.guess
@@ -2,7 +2,7 @@
 # Attempt to guess a canonical system name.
 #   Copyright 1992-2018 Free Software Foundation, Inc.
 
-timestamp='2018-05-19'
+timestamp='2018-06-26'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -894,8 +894,8 @@ EOF
# other systems with GNU libc and userland
echo "$UNAME_MACHINE-unknown-`echo "$UNAME_SYSTEM" | sed 's,^[^/]*/,,' 
| tr "[:upper:]" "[:lower:]"``echo "$UNAME_RELEASE"|sed -e 's/[-(].*//'`-$LIBC"
exit ;;
-i*86:Minix:*:*)
-   echo "$UNAME_MACHINE"-pc-minix
+*:Minix:*:*)
+   echo "$UNAME_MACHINE"-unknown-minix
exit ;;
 aarch64:Linux:*:*)
echo "$UNAME_MACHINE"-unknown-linux-"$LIBC"
diff --git a/config.sub b/config.sub
index d1f5b549034..072700fb037 100755
--- a/config.sub
+++ b/config.sub
@@ -2,7 +2,7 @@
 # Configuration validation subroutine script.
 #   Copyright 1992-2018 Free Software Foundation, Inc.
 
-timestamp='2018-05-24'
+timestamp='2018-07-02'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -1125,6 +1125,12 @@ case $basic_machine in
ps2)
basic_machine=i386-ibm
;;
+   riscv)
+   basic_machine=riscv32-unknown
+   ;;
+   riscv-*)
+   basic_machine=`echo "$basic_machine" | sed 's/^riscv/riscv32/'`
+   ;;
rm[46]00)
basic_machine=mips-siemens
;;
-- 
2.13.7
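
For readers less used to the shell case syntax, the two new config.sub cases can be modelled as below. This is only an illustrative C++ rendering; canonicalize_riscv is a hypothetical name and the real logic remains the shell code in the diff.

```cpp
#include <string>

// Mirrors the two added cases:
//   riscv    -> riscv32-unknown
//   riscv-*  -> leading "riscv" replaced by "riscv32"
// Anything else is returned unchanged.
std::string canonicalize_riscv (const std::string &basic_machine)
{
  if (basic_machine == "riscv")
    return "riscv32-unknown";
  if (basic_machine.compare (0, 6, "riscv-") == 0)
    return "riscv32" + basic_machine.substr (5);
  return basic_machine;
}
```

Names that already carry a width, such as riscv64-unknown, do not match either case and pass through untouched.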



Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Sebastian Huber

On 05/07/18 14:00, Sebastian Huber wrote:

* config.guess: Sync with upstream version 2018-06-26.
* config.sub: Sync with upstream version 2018-07-02.


I would also like to backport this to GCC 8.

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.



Re: [PATCH,rs6000] Fix implementation of vec_unpackh, vec_unpackl builtins

2018-07-05 Thread Segher Boessenkool
Hi Carl,

On Tue, Jul 03, 2018 at 02:36:22PM -0700, Carl Love wrote:
> Please let me know if the patch looks OK for GCC mainline. The patch
> also needs to be backported to GCC 8.

Looks great, thanks!  Okay for trunk, and also for 8.


Segher


> 2018-07-03  Carl Love  
> 
>   * config/rs6000/rs6000-c.c: Map ALTIVEC_BUILTIN_VEC_UNPACKH for
>   float argument to VSX_BUILTIN_DOUBLEH_V4SF.
>   Map ALTIVEC_BUILTIN_VEC_UNPACKL for float argument to
>   VSX_BUILTIN_DOUBLEL_V4SF.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-07-03  Carl Love  
>   * gcc.target/altivec-1-runnable.c: New test file.
>   * gcc.target/altivec-2-runnable.c: New test file.
>   * gcc.target/vsx-7.c (main2): Change expected instruction
>   for tests.


Re: [PATCH 1/3][POPCOUNT] Handle COND_EXPR in expression_expensive_p

2018-07-05 Thread Richard Biener
On Thu, Jul 5, 2018 at 1:29 PM Richard Biener
 wrote:
>
> On Thu, Jul 5, 2018 at 1:02 PM Kugan Vivekanandarajah
>  wrote:
> >
> > Hi Richard,
> >
> > Thanks for the review.
> >
> > On 28 June 2018 at 21:26, Richard Biener  wrote:
> > > On Wed, Jun 27, 2018 at 7:00 AM Kugan Vivekanandarajah
> > >  wrote:
> > >>
> > >> Hi Richard,
> > >>
> > >> Thanks for the review.
> > >>
> > >> On 25 June 2018 at 20:01, Richard Biener  
> > >> wrote:
> > >> > On Fri, Jun 22, 2018 at 11:13 AM Kugan Vivekanandarajah
> > >> >  wrote:
> > >> >>
> > >> >> [PATCH 1/3][POPCOUNT] Handle COND_EXPR in expression_expensive_p
> > >> >
> > >> > This says that COND_EXPR itself isn't expensive.  I think we should
> > >> > constrain that a bit.
> > >> > I think a good default would be to only allow a single COND_EXPR which
> > >> > you can achieve
> > >> > by adding a bool in_cond_expr_p = false argument to the function, pass
> > >> > in_cond_expr_p
> > >> > down and pass true down from the COND_EXPR handling itself.
> > >> >
> > >> > I'm not sure if we should require either COND_EXPR arm (operand 1 or
> > >> > 2) to be constant
> > >> > or !EXPR_P (then multiple COND_EXPRs might be OK).
> > >> >
> > >> > The main idea is to avoid evaluating many expressions but only
> > >> > choosing one in the end.
> > >> >
> > >> > The simplest patch achieving that is sth like
> > >> >
> > >> > +  if (code == COND_EXPR)
> > >> > +return (expression_expensive_p (TREE_OPERAND (expr, 0))
> > >> >   || (EXPR_P (TREE_OPERAND (expr, 1)) && EXPR_P
> > >> > (TREE_OPERAND (expr, 2)))
> > >> > +   || expression_expensive_p (TREE_OPERAND (expr, 1))
> > >> > +   || expression_expensive_p (TREE_OPERAND (expr, 2)));
> > >> >
> > >> > OK with that change.
> > >>
> > >> Is || (EXPR_P (TREE_OPERAND (expr, 1)) || EXPR_P (TREE_OPERAND (expr,
> > >> 2))) slightly better ?
> > >> Attaching  with the change. Is this OK?
> > >
> > > Well, it won't allow x != 0 ? popcount (x) : 1 because popcount(x) is 
> > > CALL_EXPR.
> > >
> > >>
> > >>
> > >> Because, for pr81661.c, we now allow as not expensive
> > >>  > >> type  > >> size 
> > >> unit-size 
> > >> align:32 warn_if_not_align:0 symtab:0 alias-set 1
> > >> canonical-type 0x769455e8 precision:32 min  > >> 0x7692dee8 -2147483648> max  > >> 2147483647>
> > >> pointer_to_this >
> > >>
> > >> arg:0  > >> int>
> > >>
> > >> arg:0  > >> int>
> > >> visited
> > >> def_stmt a.1_10 = a;
> > >> version:10>
> > >> arg:1 >
> > >> arg:1  > >> int>
> > >>
> > >> arg:0  > >> _Bool>
> > >>
> > >> arg:0  > >> 0x769455e8 int>
> > >>
> > >> arg:0  > >> 0x769455e8 int>
> > >> arg:0  arg:1  > >> 0x7694a0d8 -1>>
> > >> arg:1  > >> 0x769455e8 int>
> > >> visited
> > >> def_stmt c.2_11 = c;
> > >> version:11>>
> > >> arg:1  > >> 0x769455e8 int>
> > >> visited
> > >> def_stmt b.3_13 = b;
> > >> version:13>>
> > >> arg:1  > >> 0x769455e8 int>
> > >>
> > >> arg:0  > >> 0x769455e8 int>
> > >>
> > >> arg:0  > >> 0x76a55b28>
> > >>
> > >> arg:0  > >> 0x76a55b28>
> > >>
> > >> arg:0  > >> 
> > >> arg:0  arg:1
> > >> >>
> > >> arg:1  > >> 0x76a55b28>
> > >> arg:0 
> > >> arg:2 >>
> > >>
> > >> Which also leads to an ICE in gimplify_modify_expr. I think this is a
> > >> latent issue and I am trying to find the source
> > >
> > > Well, I think that's because some COND_EXPRs only gimplify to
> > > conditional code.  See gimplify_cond_expr:
> > >
> > >   if (gimplify_ctxp->allow_rhs_cond_expr
> > >   /* If either branch has side effects or could trap, it 
> > > can't be
> > >  evaluated unconditionally.  */
> > >   && !TREE_SIDE_EFFECTS (then_)
> > >   && !generic_expr_could_trap_p (then_)
> > >   && !TREE_SIDE_EFFECTS (else_)
> > >   && !generic_expr_could_trap_p (else_))
> > > return gimplify_pure_cond_expr (expr_p, pre_p);
> > >
> > > so we probably have to treat TREE_SIDE_EFFECTS / 
> > > generic_expr_could_trap_p as
> > > "expensive" as well for the purpose of final value replacement unless we 
> > > are
> > > going to support a code-generation way different from gimplification.
> >
> > Is the attached patch, which does this, OK? I had to fix a couple of
> > testcases because the final value replacement now removes the loop in
> > pr64183.c, and pr85073.c is a popcount pattern, so I just disabled it so
> > that we can test what was tested earlier.
>
> The patch is OK.
>
> > >
> > > The testcase you cite uses -ftrapv which is why we run into
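
The COND_EXPR rule discussed in this thread can be pictured with a small C++ sketch. This is a toy model under stated assumptions, not GCC code: `Kind`, `Expr`, `is_expr` and `expensive_p` are hypothetical stand-ins for tree codes, `tree`, `EXPR_P` and `expression_expensive_p`, and which operations count as cheap is assumed for the example.

```cpp
#include <vector>

// Toy stand-ins for GCC tree codes (hypothetical, for illustration only).
enum class Kind { Leaf, CheapOp, ExpensiveOp, Cond };

struct Expr {
  Kind kind;
  std::vector<Expr> ops;  // Cond: ops[0] = condition, ops[1]/ops[2] = arms
};

// Rough analogue of EXPR_P: anything that is not a constant or a name.
static bool is_expr(const Expr &e) { return e.kind != Kind::Leaf; }

// Rough analogue of expression_expensive_p with the suggested COND_EXPR rule.
static bool expensive_p(const Expr &e) {
  if (e.kind == Kind::ExpensiveOp)
    return true;
  // Reject a COND_EXPR when both arms need evaluating: that would compute
  // two expressions only to keep one of them.
  if (e.kind == Kind::Cond && is_expr(e.ops[1]) && is_expr(e.ops[2]))
    return true;
  // Otherwise the expression is expensive if any operand is.
  for (const Expr &op : e.ops)
    if (expensive_p(op))
      return true;
  return false;
}
```

Under this rule a shape like `x != 0 ? popcount (x) : 1` is accepted because one arm is a leaf, whereas replacing `&&` with `||` would reject it, which matches the objection raised above.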

Re: [patch, fortran] Asynchronous I/O, take 3

2018-07-05 Thread Dominique d'Humières
I have done a full regression testing with the patch at
https://gcc.gnu.org/ml/fortran/2018-07/msg4.html + Rainer’s patch 
at https://gcc.gnu.org/ml/fortran/2018-07/msg7.html

I am currently testing the patch at
https://gcc.gnu.org/ml/fortran/2018-07/msg8.html

so far, so good!

IMO the tests should go to gfortran.dg (they pass my tests).

I think there is a spurious STOP in libgomp.fortran/async_io_1.f90 and the last 
STOP 1 should be STOP 7, as in:

--- libgomp/testsuite/libgomp.fortran/async_io_1.f90	2018-07-03 12:26:41.0 +0200
+++ gcc/testsuite/gfortran.dg/asynchronous_6.f90	2018-07-05 12:57:35.0 +0200
@@ -21,7 +21,6 @@ program main
   write (10,'(A)', asynchronous=yes) 'asdf'
   write (10,*, asynchronous=yes) cc
   close (10)
-  stop
   open (20, file='a.dat', asynchronous=yes)
   read (20, *, asynchronous=yes) i, j
   read (20, *, asynchronous=yes) k, l
@@ -42,6 +41,6 @@ program main
   open(20, file='c.dat', asynchronous=yes) 
   read(20, *, asynchronous=yes) res
   wait (20)
-  if (any(res /= is)) stop 1
+  if (any(res /= is)) stop 7
   close (20,status="delete")
 end program

Thanks for the hard work!

Dominique



Re: [PATCH][debug] Handle references to skipped params in remap_ssa_name

2018-07-05 Thread Tom de Vries
On 07/05/2018 01:39 PM, Richard Biener wrote:
> On Thu, Jul 5, 2018 at 1:25 PM Tom de Vries  wrote:
>>
>> [ was: Re: [testsuite/guality, committed] Prevent optimization of local in
>> vla-1.c ]
>>
>> On Wed, Jul 04, 2018 at 02:32:27PM +0200, Tom de Vries wrote:
>>> On 07/03/2018 11:05 AM, Tom de Vries wrote:
 On 07/02/2018 10:16 AM, Jakub Jelinek wrote:
> On Mon, Jul 02, 2018 at 09:44:04AM +0200, Richard Biener wrote:
>> Given the array has size i + 1 its upper bound should be 'i' and 'i'
>> should be available via DW_OP_[GNU_]entry_value.
>>
>> I see it is
>>
>> <175>   DW_AT_upper_bound : 10 byte block: 75 1 8 20 24 8 20 26 31
>> 1c   (DW_OP_breg5 (rdi): 1; DW_OP_const1u: 32; DW_OP_shl;
>> DW_OP_const1u: 32; DW_OP_shra; DW_OP_lit1; DW_OP_minus)
>>
>> and %rdi is 1.  Not sure why gdb fails to print its length.  Yes, the
>> storage itself doesn't have a location but the
>> type specifies the size.
>>
>> (gdb) ptype a
>> type = char [variable length]
>> (gdb) p sizeof(a)
>> $3 = 0
>>
>> this looks like a gdb bug to me?
>>

 With gdb patch:
 ...
 diff --git a/gdb/findvar.c b/gdb/findvar.c
 index 8ad5e25cb2..ebaff923a1 100644
 --- a/gdb/findvar.c
 +++ b/gdb/findvar.c
 @@ -789,6 +789,8 @@ default_read_var_value
break;

  case LOC_OPTIMIZED_OUT:
 +  if (is_dynamic_type (type))
 +   type = resolve_dynamic_type (type, NULL,
 +/* Unused address.  */ 0);
return allocate_optimized_out_value (type);

  default:
 ...

 I get:
 ...
 $ ./gdb -batch -ex "b f1" -ex "r" -ex "p sizeof (a)" vla-1.exe
 Breakpoint 1 at 0x4004a8: file vla-1.c, line 17.

 Breakpoint 1, f1 (i=i@entry=5) at vla-1.c:17
 17return a[0];
 $1 = 6
 ...

>>>
>>> Well, for -O1 and -O2.
>>>
>>> For O3, I get instead:
>>> ...
>>> $ ./gdb vla-1.exe -q -batch -ex "b f1" -ex "run" -ex "p sizeof (a)"
>>> Breakpoint 1 at 0x4004b0: f1. (2 locations)
>>>
>>> Breakpoint 1, f1 (i=5) at vla-1.c:17
>>> 17return a[0];
>>> $1 = 0
>>> ...
>>>
>>
>> Hi,
>>
>> When compiling guality/vla-1.c with -O3 -g, vla 'a[i + 1]' in f1 is optimized
>> away, but f1 still contains a debug expression describing the upper bound of 
>> the
>> vla (D.1914):
>> ...
>>  __attribute__((noinline))
>>  f1 (intD.6 iD.1900)
>>  {
>>
>>saved_stack.1_2 = __builtin_stack_save ();
>># DEBUG BEGIN_STMT
>># DEBUG D#3 => i_1(D) + 1
>># DEBUG D#2 => (long intD.8) D#3
>># DEBUG D#1 => D#2 + -1
>># DEBUG D.1914 => (sizetype) D#1
>> ...
>>
>> Then f1 is cloned to a version f1.constprop with no parameters, eliminating
>> parameter i, and 'DEBUG D#3 => i_1(D) + 1' turns into 'D#3 => NULL'.
>> Consequently, 'print sizeof (a)' yields '0' in gdb.
> 
> So does gdb correctly recognize there isn't any size available or do we 
> somehow
> generate invalid debug info, not recognizing that D#3 => NULL means
> "optimized out" and thus all dependent expressions are "optimized out" as 
> well?
> 
> That is, shouldn't gdb do
> 
> (gdb) print sizeof (a)
> <optimized out>
> 
> ?

The type for the vla gcc is emitting is a DW_TAG_array_type with
DW_TAG_subrange_type without DW_AT_upper_bound or DW_AT_count, which
makes the upper bound value 'unknown'. So I'd say the debug info is valid.
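
As a rough illustration of why one missing value makes everything derived from it unknown, here is a minimal C++17 sketch of the debug-bind chain quoted above (D#3, D#2, D#1, D.1914). It is only a model: `Value`, `add`, `vla_upper_bound` and `vla_sizeof` are hypothetical names, and `std::nullopt` stands in for "optimized out".

```cpp
#include <optional>

using Value = std::optional<long>;  // std::nullopt models "optimized out"

// Adds a constant, propagating "optimized out" unchanged.
static Value add(Value v, long k) {
  if (!v)
    return std::nullopt;
  return *v + k;
}

// Mirrors the debug-bind chain for 'char a[i + 1]'.
static Value vla_upper_bound(Value i) {
  Value d3 = add(i, 1);    // D#3 => i + 1
  Value d2 = d3;           // D#2 => (long int) D#3
  Value d1 = add(d2, -1);  // D#1 => D#2 + -1
  return d1;               // D.1914 => (sizetype) D#1
}

// Size in bytes of a char array with the given upper bound.
static Value vla_sizeof(Value i) {
  return add(vla_upper_bound(i), 1);
}
```

With `i` known to be 5 the model yields a size of 6, as in the -O1/-O2 gdb session; once `i` is gone, the size is unknown rather than 0.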

Using this gdb patch:
...
diff --git a/gdb/eval.c b/gdb/eval.c
index 9db6e7c69d..ea6f782c5b 100644
--- a/gdb/eval.c
+++ b/gdb/eval.c
@@ -3145,6 +3145,8 @@ evaluate_subexp_for_sizeof (...)
{
  val = evaluate_subexp (NULL_TYPE, exp, pos, EVAL_NORMAL);
  type = value_type (val);
+ if (TYPE_LENGTH (type) == 0)
+   return allocate_optimized_out_value (size_type);
}
   else
(*pos) += 4;
...

I get:
...
$ ./gdb vla-1.exe -batch -ex "b f1" -ex run -ex "p sizeof (a)"
Breakpoint 1 at 0x4004b0: f1. (2 locations)

Breakpoint 1, f1 (i=5) at vla-1.c:17
17return a[0];
$1 = <optimized out>
...

Thanks,
- Tom


[PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-05 Thread Thomas Preudhomme
In case of high register pressure in PIC mode, the address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. ARM does lack stack_protect_set and stack_protect_test insn
patterns, but defining them does not help, as the address is expanded
regularly and the patterns only deal with the copy and test of the
guard with the canary.

This problem does not occur for x86 targets because the PIC access and
the test can be done in the same instruction. AArch64 is exempt too
because its PIC access insn patterns are a mov of an UNSPEC, which
prevents the second access in the epilogue from being CSEd in the
cse_local pass with the first access in the prologue.

The approach followed here is to create new "combined" set and test
standard pattern names that take the unexpanded guard and do the set or
test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
to hide the individual instructions being generated from the compiler and
to split the pattern into generic load, compare and branch instructions
after register allocation, therefore avoiding any spilling. This is
implemented here for the ARM targets. For targets not implementing these
new standard pattern names, the existing stack_protect_set and
stack_protect_test pattern names are used.

To be able to split PIC access after register allocation, the functions
had to be augmented to force a new PIC register load and to control
which register it loads into. This is because sharing the PIC register
between prologue and epilogue could lead to spilling due to CSE again
which an attacker could use to control what the canary gets compared
against.
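
The attack described above can be modelled in a few lines of C++. This is a simulation under stated assumptions, not generated code: the `Frame` layout and all function names are hypothetical, and the "spill slot" is just a struct member that an overflow is assumed to reach.

```cpp
// Toy model of a protected frame (hypothetical layout, for illustration).
struct Frame {
  long canary;                     // copy of the guard made in the prologue
  const long *spilled_guard_addr;  // guard address spilled to the stack
};

// Vulnerable epilogue: trusts the guard address reloaded from the spill slot.
static bool check_with_spilled_address(const Frame &f) {
  return f.canary == *f.spilled_guard_addr;
}

// Hardened epilogue: recomputes the guard address, ignoring the spill slot.
static bool check_with_recomputed_address(const Frame &f, const long *guard) {
  return f.canary == *guard;
}

// Simulates an overflow that overwrites both the canary and the spill slot.
static void smash(Frame &f, const long *attacker_value) {
  f.canary = *attacker_value;
  f.spilled_guard_addr = attacker_value;
}
```

After `smash`, the check that reloads the spilled address still passes, while the check that recomputes the address detects the corruption. Forcing a fresh PIC register load in the epilogue aims for the second behaviour.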

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to new require_pic_register prototype.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(stack_protect_combined_set): New insn_and_split pattern.
(stack_protect_set): New insn pattern.
(stack_protect_combined_test): New insn_and_split pattern.
(stack_protect_test): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme  

PR target/85434
* gcc.target/arm/pr85434.c: New test.

Testing: Bootstrapped on ARM in both Arm and Thumb-2 mode as well as on
AArch64. The testsuite shows no regressions on these 3 variants, either
with default flags or with -fstack-protector-all.

Is this ok for trunk? If yes, would this be acceptable as a backport to
GCC 6, 7 and 8 provided that no regression is found?

Best regards,

Thomas
From d917d48c2005e46154383589f203d06f3c6167e0 Mon Sep 17 00:00:00 2001
From: Thomas Preud'homme 
Date: Tue, 8 May 2018 15:47:05 +0100
Subject: [PATCH] PR85434: Prevent spilling of stack protector guard's address
 on ARM


Re: [PATCH] libtool: Sort output of 'find' to enable deterministic builds.

2018-07-05 Thread Bernhard M. Wiedemann
On 2018-06-29 17:09, Jeff Law wrote:
> In the immediate term, applying the patch to both instances seems wise.
> 
> Bernhard, do you have commit privs?

no, and I don't really want privs, since I expect to be doing only a few
patches for gcc.
Can you (or someone else) please commit the patch?


Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-05 Thread Qing Zhao
Hi,

I have sent two emails with the updated patches on 7/3:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00065.html
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00070.html

however, these 2 emails  were not successfully forwarded to the 
gcc-patches@gcc.gnu.org mailing list.

So, I am sending the same email again in this one, hopefully this time it can 
go through.
Qing

Hi, Jeff,

thanks a lot for your review and comments.

I have addressed your comments, updated the patch, and retested on both
aarch64 and x86.

The major changes in this version compared to the previous version are:

1. in routine expand_builtin_memcmp:
* move the inlining transformation AFTER the warning is issued for
-Wstringop-overflow;
* only apply inlining when NO warning is issued.
2. in the testsuite, add a new testcase strcmpopt_6.c for this case.
3. update comments to:
* capitalize the first word.
* capitalize all the arguments.

NOTE: the routines expand_builtin_strcmp and expand_builtin_strncmp are not
changed.
The reason is that there is currently NO overflow checking for these two routines.
If we need overflow checking for these two routines, I think that a separate
patch is needed.
If this is needed, let me know; I can work on a separate patch that issues
warnings for strcmp/strncmp when
-Wstringop-overflow is specified.

The new patch is as follows; please take a look at it.

thanks.

Qing

gcc/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * builtins.c (expand_builtin_memcmp): Inline the calls first
+   when result_eq is false.
+   (expand_builtin_strcmp): Inline the calls first.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): New routine. Expand a string compare
+   call by using a sequence of char comparison.
+   (inline_expand_builtin_string_cmp): New routine. Inline expansion
+   a call to str(n)cmp/memcmp.
+   * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
option.
+   * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
+

gcc/testsuite/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_5.c: New test.
+   * gcc.dg/strcmpopt_6.c: New test.
+



0001-3nd-Patch-for-PR78009.patch
Description: Binary data


> On Jun 28, 2018, at 12:10 AM, Jeff Law  wrote:
> 
> 
> So this generally looks pretty good.  The biggest technical concern is
> making sure we're doing the right thing WRT issuing warnings.  You can
> tackle that problem by deferring inlining to a later point after
> warnings have been issued or by verifying that your routines do not
> inline in cases where warnings will be issued.  It may be worth adding
> testcases for these issues.
> 
> There's a large number of comments that need capitalization fixes.
> 
> Given there was no measured runtime performance impact, but slight
> improvements on codesize for values <= 3, let's go ahead with that as
> the default.
> 
> Can you address the issues above and repost for final review?
> 
> Thanks,
> jeff



[PATCH] PR libstdc++/58265 implement LWG 2063 for COW strings

2018-07-05 Thread Jonathan Wakely

For COW strings the default constructor does not allocate when
_GLIBCXX_FULLY_DYNAMIC_STRING == 0, so can be noexcept. The move
constructor and swap do not allocate when the allocators are equal, so
add conditional noexcept using allocator_traits::is_always_equal.

PR libstdc++/58265
* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
[_GLIBCXX_FULLY_DYNAMIC_STRING==0] (basic_string::basic_string()):
Add _GLIBCXX_NOEXCEPT.
(basic_string::operator=(basic_string&&)): Add _GLIBCXX_NOEXCEPT_IF
to depend on the allocator's is_always_equal property (LWG 2063).
(basic_string::swap(basic_string&)): Likewise.
* include/bits/basic_string.tcc [!_GLIBCXX_USE_CXX11_ABI]
(basic_string::swap(basic_string&)): Likewise.
* testsuite/21_strings/basic_string/allocator/char/move_assign.cc:
Check is_nothrow_move_assignable.
* testsuite/21_strings/basic_string/allocator/wchar_t/move_assign.cc:
Check is_nothrow_move_assignable.
* testsuite/21_strings/basic_string/cons/char/
noexcept_move_construct.cc: Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/
noexcept_move_construct.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.
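
For readers unfamiliar with the technique, here is a minimal sketch of a conditional exception specification keyed on `allocator_traits<Alloc>::is_always_equal`, assuming C++17. It is not the libstdc++ implementation: `StatefulAlloc`, `Holder` and `assign_is_noexcept` are hypothetical names, and the toy `swap` ignores the unequal-allocator case that a real container has to handle.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// A stateful allocator: the type is not empty, so the default
// allocator_traits<...>::is_always_equal is std::false_type.
template <typename T>
struct StatefulAlloc {
  using value_type = T;
  int id = 0;
  StatefulAlloc() = default;
  template <typename U>
  StatefulAlloc(const StatefulAlloc<U> &other) : id(other.id) {}
  T *allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
  void deallocate(T *p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

template <typename T, typename U>
bool operator==(const StatefulAlloc<T> &a, const StatefulAlloc<U> &b) {
  return a.id == b.id;
}
template <typename T, typename U>
bool operator!=(const StatefulAlloc<T> &a, const StatefulAlloc<U> &b) {
  return !(a == b);
}

// Toy container whose swap/assign are noexcept only for always-equal allocators.
template <typename Alloc>
class Holder {
  using Traits = std::allocator_traits<Alloc>;
  Alloc alloc_;
  typename Traits::pointer data_ = nullptr;

public:
  void swap(Holder &other) noexcept(Traits::is_always_equal::value) {
    // A real container may need to allocate here when the allocators differ,
    // which is why the guarantee cannot be unconditional.
    std::swap(data_, other.data_);
  }
  Holder &assign(Holder &&other) noexcept(Traits::is_always_equal::value) {
    this->swap(other);
    return *this;
  }
};

// Reports whether move-assign on Holder<Alloc> is declared non-throwing.
template <typename Alloc>
constexpr bool assign_is_noexcept() {
  return noexcept(
      std::declval<Holder<Alloc> &>().assign(std::declval<Holder<Alloc>>()));
}

static_assert(assign_is_noexcept<std::allocator<char>>(), "always equal");
static_assert(!assign_is_noexcept<StatefulAlloc<char>>(), "may differ");
```

The two `static_assert`s show the intended effect: the guarantee holds for `std::allocator` but not for an allocator whose instances can compare unequal.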


commit 926b3b642595383cb4abfbeb3586eecc721c1935
Author: Jonathan Wakely 
Date:   Thu Jul 5 15:12:06 2018 +0100

PR libstdc++/58265 implement LWG 2063 for COW strings

For COW strings the default constructor does not allocate when
_GLIBCXX_FULLY_DYNAMIC_STRING == 0, so can be noexcept. The move
constructor and swap do not allocate when the allocators are equal, so
add conditional noexcept using allocator_traits::is_always_equal.

PR libstdc++/58265
* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
[_GLIBCXX_FULLY_DYNAMIC_STRING==0] (basic_string::basic_string()):
Add _GLIBCXX_NOEXCEPT.
(basic_string::operator=(basic_string&&)): Add _GLIBCXX_NOEXCEPT_IF
to depend on the allocator's is_always_equal property (LWG 2063).
(basic_string::swap(basic_string&)): Likewise.
* include/bits/basic_string.tcc [!_GLIBCXX_USE_CXX11_ABI]
(basic_string::swap(basic_string&)): Likewise.
* testsuite/21_strings/basic_string/allocator/char/move_assign.cc:
Check is_nothrow_move_assignable.
* testsuite/21_strings/basic_string/allocator/wchar_t/move_assign.cc:
Check is_nothrow_move_assignable.
* testsuite/21_strings/basic_string/cons/char/
noexcept_move_construct.cc: Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/
noexcept_move_construct.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index a77074da249..baad58682b6 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3486,6 +3486,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
*/
   basic_string()
 #if _GLIBCXX_FULLY_DYNAMIC_STRING == 0
+  _GLIBCXX_NOEXCEPT
   : _M_dataplus(_S_empty_rep()._M_refdata(), _Alloc()) { }
 #else
   : _M_dataplus(_S_construct(size_type(), _CharT(), _Alloc()), _Alloc()){ }
@@ -3642,7 +3643,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  @param  __str  Source string.
*/
   basic_string&
-  operator=(const basic_string& __str) 
+  operator=(const basic_string& __str)
   { return this->assign(__str); }
 
   /**
@@ -3675,9 +3676,9 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  The contents of @a str are moved into this string (without copying).
*  @a str is a valid, but unspecified string.
**/
-  // PR 58265, this should be noexcept.
   basic_string&
   operator=(basic_string&& __str)
+  _GLIBCXX_NOEXCEPT_IF(allocator_traits<_Alloc>::is_always_equal::value)
   {
// NB: DR 1204.
this->swap(__str);
@@ -5111,9 +5112,9 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  Exchanges the contents of this string with that of @a __s in constant
*  time.
   */
-  // PR 58265, this should be noexcept.
   void
-  swap(basic_string& __s);
+  swap(basic_string& __s)
+  _GLIBCXX_NOEXCEPT_IF(allocator_traits<_Alloc>::is_always_equal::value);
 
   // String operations:
   /**
diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
index 04b68ca0202..51bbb7bd6a0 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -967,6 +967,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 void
 basic_string<_CharT, _Traits, _Alloc>::
 swap(basic_string& __s)
+_GLIBCXX_NOEXCEPT_IF(allocator_traits<_Alloc>::is_always_equal::value)
 {
   if (_M_rep()->_M_is_leaked())
_M_rep()->_M_set_sharable();
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_st

Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-05 Thread Martin Sebor

On 07/05/2018 09:46 AM, Qing Zhao wrote:

Hi,

I have sent two emails with the updated patches on 7/3:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00065.html
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00070.html

however, these 2 emails  were not successfully forwarded to the 
gcc-patches@gcc.gnu.org mailing list.

So, I am sending the same email again in this one, hopefully this time it can 
go through.
Qing


Thanks for taking care to issue the warnings (and thanks to
Jeff for pointing it out)!

One of the basic design principles that I myself have
accidentally violated in the past is that warning options
should not impact the emitted object code.  I don't think
your patch actually does introduce this dependency by having
the codegen depend on the result of check_access() -- I'm
pretty sure the function is designed to do the validation
irrespective of warning options and return based on
the result of the validation and not based on whether
a warning was issued.  But the choice of the variable name,
no_overflow_warn, suggests that it does, in fact, have this
effect.  So I would suggest to rename the variable and add
a test that verifies that this dependency does not exist.

Beyond that, an enhancement to this optimization that might
be worth considering is inlining even non-constant calls
with array arguments whose size is no greater than the limit.
As in:

  extern char a[4], *b;

  int n = strcmp (a, b);

Because strcmp arguments are required to be nul-terminated
strings, a's length above must be at most 3.  This is analogous
to similar optimizations GCC performs, such as folding to zero
calls to strlen() with one-element arrays.
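
To make the idea of inline expansion concrete, here is a hand-written C++ illustration of what a call such as `strcmp (s, "ab")` amounts to when unrolled into byte comparisons. It is a sketch of the concept only, not the code the patch generates, and the function name is hypothetical.

```cpp
// Hand-unrolled equivalent of strcmp (s, "ab"); bytes compare as unsigned char.
static int strcmp_ab_unrolled(const char *s) {
  const unsigned char *u = reinterpret_cast<const unsigned char *>(s);
  // Any mismatch, including an early NUL in s, returns before the next byte
  // is read, so s is never read past its terminator.
  int d = u[0] - 'a';
  if (d != 0)
    return d;
  d = u[1] - 'b';
  if (d != 0)
    return d;
  return u[2] - '\0';
}
```

Only the sign of the result is meaningful, as with `strcmp` itself. The number of comparisons is bounded by the length of the constant string, which is the quantity the inline-length limit is meant to control.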

Martin



Hi, Jeff,

thanks a lot for your review and comments.

I have addressed your comments, updated the patch, and retested on both
aarch64 and x86.

The major changes in this version compared to the previous version are:

1. in routine expand_builtin_memcmp:
* move the inlining transformation AFTER the warning is issued for
-Wstringop-overflow;
* only apply inlining when NO warning is issued.
2. in the testsuite, add a new testcase strcmpopt_6.c for this case.
3. update comments to:
* capitalize the first word.
* capitalize all the arguments.

NOTE: the routines expand_builtin_strcmp and expand_builtin_strncmp are not
changed.
The reason is that there is currently NO overflow checking for these two routines.
If we need overflow checking for these two routines, I think that a separate
patch is needed.
If this is needed, let me know; I can work on a separate patch that issues
warnings for strcmp/strncmp when
-Wstringop-overflow is specified.

The new patch is as follows; please take a look at it.

thanks.

Qing

gcc/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * builtins.c (expand_builtin_memcmp): Inline the calls first
+   when result_eq is false.
+   (expand_builtin_strcmp): Inline the calls first.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): New routine. Expand a string compare
+   call by using a sequence of char comparison.
+   (inline_expand_builtin_string_cmp): New routine. Inline expansion
+   a call to str(n)cmp/memcmp.
+   * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
option.
+   * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
+

gcc/testsuite/ChangeLog

+2018-07-02  Qing Zhao  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_5.c: New test.
+   * gcc.dg/strcmpopt_6.c: New test.
+






On Jun 28, 2018, at 12:10 AM, Jeff Law  wrote:


So this generally looks pretty good.  The biggest technical concern is
making sure we're doing the right thing WRT issuing warnings.  You can
tackle that problem by deferring inlining to a later point after
warnings have been issued or by verifying that your routines do not
inline in cases where warnings will be issued.  It may be worth adding
testcases for these issues.

There's a large number of comments that need capitalization fixes.

Given there was no measured runtime performance impact, but slight
improvements on codesize for values <= 3, let's go ahead with that as
the default.

Can you address the issues above and repost for final review?

Thanks,
jeff






Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Palmer Dabbelt

On Thu, 05 Jul 2018 05:00:20 PDT (-0700), sebastian.hu...@embedded-brains.de 
wrote:

* config.guess: Sync with upstream version 2018-06-26.
* config.sub: Sync with upstream version 2018-07-02.
---
 config.guess | 6 +++---
 config.sub   | 8 +++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/config.guess b/config.guess
index 883a6713bf0..445c406836e 100755
--- a/config.guess
+++ b/config.guess
@@ -2,7 +2,7 @@
 # Attempt to guess a canonical system name.
 #   Copyright 1992-2018 Free Software Foundation, Inc.

-timestamp='2018-05-19'
+timestamp='2018-06-26'

 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -894,8 +894,8 @@ EOF
# other systems with GNU libc and userland
echo "$UNAME_MACHINE-unknown-`echo "$UNAME_SYSTEM" | sed 's,^[^/]*/,,' | tr "[:upper:]" 
"[:lower:]"``echo "$UNAME_RELEASE"|sed -e 's/[-(].*//'`-$LIBC"
exit ;;
-i*86:Minix:*:*)
-   echo "$UNAME_MACHINE"-pc-minix
+*:Minix:*:*)
+   echo "$UNAME_MACHINE"-unknown-minix
exit ;;
 aarch64:Linux:*:*)
echo "$UNAME_MACHINE"-unknown-linux-"$LIBC"
diff --git a/config.sub b/config.sub
index d1f5b549034..072700fb037 100755
--- a/config.sub
+++ b/config.sub
@@ -2,7 +2,7 @@
 # Configuration validation subroutine script.
 #   Copyright 1992-2018 Free Software Foundation, Inc.

-timestamp='2018-05-24'
+timestamp='2018-07-02'

 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -1125,6 +1125,12 @@ case $basic_machine in
ps2)
basic_machine=i386-ibm
;;
+   riscv)
+   basic_machine=riscv32-unknown
+   ;;
+   riscv-*)
+   basic_machine=`echo "$basic_machine" | sed 's/^riscv/riscv32/'`
+   ;;
rm[46]00)
basic_machine=mips-siemens
;;


I'm not sure what the policy is on getting config stuff approved for commit, 
but just FYI there's another RISC-V related patch to config.sub that changes 
the behavior of "riscv-*" tuples.  I'm assuming we should take both, as it's 
odd to sync half way to the head of config.
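
For reference, the rewriting that the 2018-07-02 config.sub hunk above applies to bare "riscv" machine names can be paraphrased in C++ as follows. This is only a restatement of the two shell cases for illustration; the function name is hypothetical and nothing here is part of config.sub.

```cpp
#include <string>

// Paraphrase of the two 'riscv' cases in the 2018-07-02 config.sub hunk.
static std::string rewrite_riscv_2018_07_02(const std::string &basic_machine) {
  // riscv)  basic_machine=riscv32-unknown
  if (basic_machine == "riscv")
    return "riscv32-unknown";
  // riscv-*)  sed 's/^riscv/riscv32/'
  if (basic_machine.compare(0, 6, "riscv-") == 0)
    return "riscv32" + basic_machine.substr(5);
  // Everything else, including riscv32-* and riscv64-*, is left alone here.
  return basic_machine;
}
```

The follow-up upstream commit quoted below removes this rewriting and accepts `riscv` and `riscv-*` as they are, so the two versions disagree on exactly these inputs.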


When I try to build it I see "Unsupported RISC-V target riscv-unknown-elf", so 
there's at least some extra autoconf wizardry that needs to happen in here.  I'm 
actually not sure what the "riscv-*" tuples are supposed to do so I've added 
Liviu as I don't want to misrepresent his desires and get into trouble again 
:).


I'm fine with pretty much anything when it comes to this tuple stuff, so feel 
free to consider it all pre-approved from a RISC-V perspective -- though I 
assume it needs a GCC global maintainer to approve it as well.  My only 
constraint is that it doesn't break anything that currently builds, as I don't 
want to force a flag day on everyone because of this.


Thanks for submitting the patch!

Here's the config commit, for reference:

commit dd5d5dd697df579a5ebd119a88475b446c07c6b0
Author: Ben Elliston 
Date:   Tue Jul 3 21:18:29 2018 +1000

   * config.sub: Do not rewrite riscv -> riscv32.
   * testsuite/config-sub.data: Adjust tests.

diff --git a/ChangeLog b/ChangeLog
index dc19a4b02ba6..db7a24b8a2a3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2018-07-03  Liviu Ionescu 
+   Ben Elliston  
+
+   * config.sub: Do not rewrite riscv -> riscv32.
+   * testsuite/config-sub.data: Adjust tests.
+
2018-06-26  Sevan Janiyan  
Ben Elliston  

diff --git a/config.sub b/config.sub
index 072700fb037c..c95acc681d1b 100755
--- a/config.sub
+++ b/config.sub
@@ -2,7 +2,7 @@
# Configuration validation subroutine script.
#   Copyright 1992-2018 Free Software Foundation, Inc.

-timestamp='2018-07-02'
+timestamp='2018-07-03'

# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
@@ -625,7 +625,7 @@ case $basic_machine in
| powerpc | powerpc64 | powerpc64le | powerpcle \
| pru \
| pyramid \
-   | riscv32 | riscv64 \
+   | riscv | riscv32 | riscv64 \
| rl78 | rx \
| score \
| sh | sh[1234] | sh[24]a | sh[24]aeb | sh[23]e | sh[234]eb | sheb | 
shbe | shle | sh[1234]le | sh3ele \
@@ -752,7 +752,7 @@ case $basic_machine in
| powerpc-* | powerpc64-* | powerpc64le-* | powerpcle-* \
| pru-* \
| pyramid-* \
-   | riscv32-* | riscv64-* \
+   | riscv-* | riscv32-* | riscv64-* \
| rl78-* | romp-* | rs6000-* | rx-* \
| sh-* | sh[1234]-* | sh[24]a-* | sh[24]aeb-* | sh[23]e-* | sh[34]eb-* 
| sheb-* | shbe-* \
| shle-* | sh[1234]le-* | sh3ele-* | sh64-* | sh64le-* \
@@ -1125,12 +1125,6 @@ case $basic_machine in
ps2)
basic_machine=i386-ibm
;;
-   riscv)
-   basic

Re: [PATCH] PR libstdc++/58265 implement LWG 2063 for COW strings

2018-07-05 Thread Jonathan Wakely

On 05/07/18 16:55 +0100, Jonathan Wakely wrote:

For COW strings the default constructor does not allocate when
_GLIBCXX_FULLY_DYNAMIC_STRING == 0, so can be noexcept. The move
constructor and swap do not allocate when the allocators are equal, so
add conditional noexcept using allocator_traits::is_always_equal.

PR libstdc++/58265
* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
[_GLIBCXX_FULLY_DYNAMIC_STRING==0] (basic_string::basic_string()):
Add _GLIBCXX_NOEXCEPT.
(basic_string::operator=(basic_string&&)): Add _GLIBCXX_NOEXCEPT_IF
to depend on the allocator's is_always_equal property (LWG 2063).
(basic_string::swap(basic_string&)): Likewise.
* include/bits/basic_string.tcc [!_GLIBCXX_USE_CXX11_ABI]
(basic_string::swap(basic_string&)): Likewise.
* testsuite/21_strings/basic_string/allocator/char/move_assign.cc:
Check is_nothrow_move_assignable.
* testsuite/21_strings/basic_string/allocator/wchar_t/move_assign.cc:
Check is_nothrow_move_assignable.
* testsuite/21_strings/basic_string/cons/char/
noexcept_move_construct.cc: Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/
noexcept_move_construct.cc: Likewise.


I missed a bit, finished by this patch. With these changes the SSO and
COW strings are slightly closer in behaviour, although the COW one is
still missing lots of C++11 features (like passing const_iterator
instead of iterator) and C++17 features (deduction guides).

Tested powerpc64le-linux, committed to trunk.


commit 265fc27e34d7fb8fb80653e3f9782c56c70a7ce4
Author: Jonathan Wakely 
Date:   Thu Jul 5 17:08:54 2018 +0100

PR libstdc++/58265 add noexcept to basic_string::assign(basic_string&&)

PR libstdc++/58265
* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
(basic_string::assign(basic_string&&)): Add conditional noexcept
depending on the allocator's is_always_equal property (LWG 2063).
* testsuite/21_strings/basic_string/modifiers/assign/char/
move_assign.cc: Check for non-throwing exception specification.
* testsuite/21_strings/basic_string/modifiers/assign/wchar_t/
move_assign.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index baad58682b6..2d1b9dc6c29 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -725,7 +725,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*  The contents of @a str are moved into this string (without copying).
*  @a str is a valid, but unspecified string.
**/
-  // PR 58265, this should be noexcept.
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 2063. Contradictory requirements for string move assignment
   basic_string&
@@ -4275,9 +4274,9 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  This function sets this string to the exact contents of @a __str.
*  @a __str is a valid, but unspecified string.
*/
-  // PR 58265, this should be noexcept.
   basic_string&
   assign(basic_string&& __str)
+  noexcept(allocator_traits<_Alloc>::is_always_equal::value)
   {
 	this->swap(__str);
 	return *this;
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/char/move_assign.cc b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/char/move_assign.cc
index e9116b9c0e0..7089fea04c2 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/char/move_assign.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/char/move_assign.cc
@@ -32,6 +32,9 @@ void test01()
   a.push_back('1');
   b.assign(std::move(a));
   VERIFY( b.size() == 1 && b[0] == '1' && a.size() == 0 );
+
+  // True for std::allocator because is_always_equal, but not true in general:
+  static_assert(noexcept(a.assign(std::move(b))), "lwg 2063");
 }
 
 int main()
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign.cc b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign.cc
index 74e342a8ef4..8d394602a9f 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign.cc
@@ -32,6 +32,9 @@ void test01()
   a.push_back(L'1');
   b.assign(std::move(a));
   VERIFY( b.size() == 1 && b[0] == '1' && a.size() == 0 );
+
+  // True for std::allocator because is_always_equal, but not true in general:
+  static_assert(noexcept(a.assign(std::move(b))), "lwg 2063");
 }
 
 int main()
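
As a rough illustration of the conditional exception specification added in the patch above, here is a stand-alone sketch. The `toy_string` class, the `arena_alloc` allocator and the `move_assign_is_noexcept` helper are hypothetical names invented for this example and are not part of libstdc++; only `std::allocator_traits<...>::is_always_equal` is the real library facility (C++17). The sketch assumes a stateful allocator reports `is_always_equal` as false, which is the case where move assignment may need to allocate and so may throw.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical stateful allocator: two instances may refer to different
// arenas, so it is not "always equal".
template<typename T>
struct arena_alloc
{
  using value_type = T;
  using is_always_equal = std::false_type;

  int arena_id = 0;

  T* allocate(std::size_t n)
  { return static_cast<T*>(::operator new(n * sizeof(T))); }

  void deallocate(T* p, std::size_t) noexcept
  { ::operator delete(p); }
};

// Hypothetical string-like class; only the exception specification of
// assign() is of interest here.
template<typename Alloc = std::allocator<char>>
class toy_string
{
public:
  toy_string&
  assign(toy_string&& other)
  noexcept(std::allocator_traits<Alloc>::is_always_equal::value)
  {
    // Illustration only: a real implementation would swap or copy the
    // buffers here, and copying is the step that could throw.
    (void) other;
    return *this;
  }
};

// True when assigning from an rvalue toy_string<Alloc> cannot throw.
template<typename Alloc>
constexpr bool move_assign_is_noexcept =
  noexcept(std::declval<toy_string<Alloc>&>()
	     .assign(std::declval<toy_string<Alloc>>()));

// True for std::allocator because it is always equal ...
static_assert(move_assign_is_noexcept<std::allocator<char>>, "lwg 2063");
// ... but not true in general.
static_assert(!move_assign_is_noexcept<arena_alloc<char>>, "lwg 2063");
```

This matches the comment in the new tests: the `static_assert` there holds for `std::allocator`, but would not hold for every allocator.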


Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Jeff Law
On 07/05/2018 06:00 AM, Sebastian Huber wrote:
>   * config.guess: Sync with upstream version 2018-06-26.
>   * config.sub: Sync with upstream version 2018-07-02.
OK.  And I think in general syncing to the latest version from upstream
ought not require explicit approval.

Richi/Jakub have the final decision about whether or not to backport to
the gcc-8 branch.

Jeff


[PATCH] Add xfail-if to some tests that fail with COW strings

2018-07-05 Thread Jonathan Wakely
These tests fail when run with -D_GLIBCXX_USE_CXX11_ABI=0 


* testsuite/21_strings/basic_string/cons/char/deduction.cc: XFAIL for
COW strings.
* testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc:
Likewise.
* testsuite/21_strings/basic_string/requirements/
explicit_instantiation/debug.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.

commit 643a1bb749c54ecc5faed9c675d7f4a29cfbea6a
Author: Jonathan Wakely 
Date:   Thu Jul 5 17:44:09 2018 +0100

Add xfail-if to some tests that fail with COW strings

These tests fail when run with -D_GLIBCXX_USE_CXX11_ABI=0

* testsuite/21_strings/basic_string/cons/char/deduction.cc: XFAIL for
COW strings.
* testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc:
Likewise.
* testsuite/21_strings/basic_string/requirements/
explicit_instantiation/debug.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/deduction.cc b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/deduction.cc
index fc28467e29b..4662fbd4b4d 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/deduction.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/deduction.cc
@@ -17,6 +17,7 @@
 
 // { dg-options "-std=gnu++17" }
 // { dg-do compile { target c++17 } }
+// { dg-xfail-if "COW string missing deduction guides" { ! cxx11-abi } }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc
index c40651f13db..7740af51123 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc
@@ -17,6 +17,7 @@
 
 // { dg-options "-std=gnu++17" }
 // { dg-do compile { target c++17 } }
+// { dg-xfail-if "COW string missing deduction guides" { ! cxx11-abi } }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/requirements/explicit_instantiation/debug.cc b/libstdc++-v3/testsuite/21_strings/basic_string/requirements/explicit_instantiation/debug.cc
index a166a9b1d58..20b8f59ba3d 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/requirements/explicit_instantiation/debug.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/requirements/explicit_instantiation/debug.cc
@@ -20,8 +20,9 @@
 #include 
 
 // { dg-do compile }
+// { dg-xfail-if "COW string missing some required members" { ! cxx11-abi } }
 
 // libstdc++/21770
 namespace debug = __gnu_debug;
-template class debug::basic_string, 
+template class debug::basic_string,
   std::allocator >;


Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Jeff Law
On 07/05/2018 10:51 AM, Palmer Dabbelt wrote:
> On Thu, 05 Jul 2018 05:00:20 PDT (-0700),
> sebastian.hu...@embedded-brains.de wrote:
>> * config.guess: Sync with upstream version 2018-06-26.
>> * config.sub: Sync with upstream version 2018-07-02.
>> ---
>>  config.guess | 6 +++---
>>  config.sub   | 8 +++-
>>  2 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/config.guess b/config.guess
>> index 883a6713bf0..445c406836e 100755
>> --- a/config.guess
>> +++ b/config.guess
>> @@ -2,7 +2,7 @@
>>  # Attempt to guess a canonical system name.
>>  #   Copyright 1992-2018 Free Software Foundation, Inc.
>>
>> -timestamp='2018-05-19'
>> +timestamp='2018-06-26'
>>
>>  # This file is free software; you can redistribute it and/or modify it
>>  # under the terms of the GNU General Public License as published by
>> @@ -894,8 +894,8 @@ EOF
>>  # other systems with GNU libc and userland
>>  echo "$UNAME_MACHINE-unknown-`echo "$UNAME_SYSTEM" | sed
>> 's,^[^/]*/,,' | tr "[:upper:]" "[:lower:]"``echo "$UNAME_RELEASE"|sed
>> -e 's/[-(].*//'`-$LIBC"
>>  exit ;;
>> -    i*86:Minix:*:*)
>> -    echo "$UNAME_MACHINE"-pc-minix
>> +    *:Minix:*:*)
>> +    echo "$UNAME_MACHINE"-unknown-minix
>>  exit ;;
>>  aarch64:Linux:*:*)
>>  echo "$UNAME_MACHINE"-unknown-linux-"$LIBC"
>> diff --git a/config.sub b/config.sub
>> index d1f5b549034..072700fb037 100755
>> --- a/config.sub
>> +++ b/config.sub
>> @@ -2,7 +2,7 @@
>>  # Configuration validation subroutine script.
>>  #   Copyright 1992-2018 Free Software Foundation, Inc.
>>
>> -timestamp='2018-05-24'
>> +timestamp='2018-07-02'
>>
>>  # This file is free software; you can redistribute it and/or modify it
>>  # under the terms of the GNU General Public License as published by
>> @@ -1125,6 +1125,12 @@ case $basic_machine in
>>  ps2)
>>  basic_machine=i386-ibm
>>  ;;
>> +    riscv)
>> +    basic_machine=riscv32-unknown
>> +    ;;
>> +    riscv-*)
>> +    basic_machine=`echo "$basic_machine" | sed 's/^riscv/riscv32/'`
>> +    ;;
>>  rm[46]00)
>>  basic_machine=mips-siemens
>>  ;;
> 
> I'm not sure what the policy is on getting config stuff approved for
> commit, but just FYI there's another RISC-V related patch to config.sub
> that changes the behavior of "riscv-*" tuples.  I'm assuming we should
> take both, as it's odd to sync half way to the head of config.
> 
> When I try to build it I see "Unsupported RISC-V target
> riscv-unknown-elf", so there's at least some extra autoconf wizardry that
> needs to happen in here.  I'm actually not sure what the "riscv-*"
> tuples are supposed to do so I've added Liviu as I don't want to
> misrepresent his desires and get into trouble again :).
> 
> I'm fine with pretty much anything when it comes to this tuple stuff, so
> feel free to consider it all pre-approved from a RISC-V perspective --
> though I assume it needs a GCC global maintainer to approve it as well. 
> My only constraint is that it doesn't break anything that currently
> builds, as I don't want to force a flag day on everyone because of this.
> 
> Thanks for submitting the patch!
> 
> Here's the config commit, for reference:
> 
> commit dd5d5dd697df579a5ebd119a88475b446c07c6b0
> Author: Ben Elliston 
> Date:   Tue Jul 3 21:18:29 2018 +1000
> 
>    * config.sub: Do not rewrite riscv -> riscv32.
>    * testsuite/config-sub.data: Adjust tests.
If this is from upstream, consider it pre-approved for the trunk.

jeff


Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Liviu Ionescu



> On 5 Jul 2018, at 19:51, Palmer Dabbelt  wrote:
> 
> ... When I try to build it I see "Unsupported RISC-V target 
> riscv-unknown-elf",

I guess configure is fine; you need to allow for the `riscv-` prefix in 
gcc/config.gcc, around line 3982.

> so there's at least some extra autoconf wizardry that needs to happen in here. 
>  I'm actually not sure what the "riscv-*" tuples are supposed to do so I've 
> added Liviu as I don't want to misrepresent his desires and get into trouble 
> again :).
> ...  My only constraint is that it doesn't break anything that currently 
> builds, as I don't want to force a flag day on everyone because of this.

that's a reasonable desire.

however I guess that automatically transforming riscv- into riscv32- will break 
my builds.

if you take a look at the changes in my gcc fork, you'll see that later, in 
`config.gcc` I identify the `riscv-none-embed` tuple and provide a separate 
`elf-embed.h` instead of your `elf.h` which automatically links with libgloss. 

https://github.com/riscv/riscv-gcc/commit/5a282a3bc4e0f8700733dbf2c4d41aa528537e61

maybe this is not the best solution, but so far it worked.

if you have a better proposal, I am ready to consider it.

---

the requirements are simple:

- the resulting names of the binaries should not include any {32|64}; the idea 
to rename the binaries after the build is not acceptable
- the header file (elf.h) should be the edited one, which does not 
automatically link libgloss, since this is harmful for bare metal toolchains.

---

if for now the `-embed` suffix is not yet part of the script, I think I can 
continue to define it locally in my fork, until we have a RISC-V EABI and I can 
use `riscv-none-eabi-`.

however I think that adding it to the script is reasonable and might be also 
useful for other cross embedded toolchains that do not have an EABI.

---

generally speaking, I think that John's proposal to clearly differentiate between 
Linux native compilers and cross embedded toolchains should be considered; it 
seems that he has a deep understanding of the problem and can provide a good 
solution.


regards,

Liviu




[PATCH] Fix several AVX512 intrinsic mask arguments.

2018-07-05 Thread Grazvydas Ignotas
gcc/ChangeLog:

2018-07-05  Grazvydas Ignotas  

* config/i386/avx512bwintrin.h: (_mm512_mask_cmp_epi8_mask,
_mm512_mask_cmp_epu8_mask): Fix mask arguments.
---
 gcc/config/i386/avx512bwintrin.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/avx512bwintrin.h b/gcc/config/i386/avx512bwintrin.h
index bd389fa..24ad5f1 100644
--- a/gcc/config/i386/avx512bwintrin.h
+++ b/gcc/config/i386/avx512bwintrin.h
@@ -3043,7 +3043,7 @@ _mm512_cmp_epi16_mask (__m512i __X, __m512i __Y, const int __P)
 
 extern __inline __mmask64
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_mask_cmp_epi8_mask (__mmask32 __U, __m512i __X, __m512i __Y,
+_mm512_mask_cmp_epi8_mask (__mmask64 __U, __m512i __X, __m512i __Y,
   const int __P)
 {
   return (__mmask64) __builtin_ia32_cmpb512_mask ((__v64qi) __X,
@@ -3081,7 +3081,7 @@ _mm512_cmp_epu16_mask (__m512i __X, __m512i __Y, const int __P)
 
 extern __inline __mmask64
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_mask_cmp_epu8_mask (__mmask32 __U, __m512i __X, __m512i __Y,
+_mm512_mask_cmp_epu8_mask (__mmask64 __U, __m512i __X, __m512i __Y,
   const int __P)
 {
   return (__mmask64) __builtin_ia32_ucmpb512_mask ((__v64qi) __X,
-- 
2.7.4
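
To see why the narrower parameter type matters, the following sketch models the 64 byte lanes of a 512-bit compare with plain integers. It does not use the real intrinsics or builtins, so it runs without AVX512BW hardware. The names `mask32`, `mask64`, `masked_compare`, `wrapper_before`, `wrapper_after` and `demo_mask_truncation` are all hypothetical stand-ins chosen for this illustration, and the model assumes the builtin simply ANDs the write mask with the per-lane match bits.

```cpp
#include <cassert>
#include <cstdint>

using mask32 = std::uint32_t;   // stands in for __mmask32
using mask64 = std::uint64_t;   // stands in for __mmask64

// Models the builtin: only lanes enabled in `enabled` can report a match.
constexpr mask64 masked_compare(mask64 enabled, mask64 matches)
{ return enabled & matches; }

// Before the patch: the parameter is only 32 bits wide, so the caller's
// upper 32 mask bits are dropped by the implicit conversion at the call.
constexpr mask64 wrapper_before(mask32 u, mask64 matches)
{ return masked_compare(static_cast<mask64>(u), matches); }

// After the patch: all 64 mask bits reach the builtin.
constexpr mask64 wrapper_after(mask64 u, mask64 matches)
{ return masked_compare(u, matches); }

// Small usage demonstration with set bits in both halves of the mask.
inline void demo_mask_truncation()
{
  mask64 user_mask = 0xFFFF00000000FFFFull;
  mask64 all_match = ~mask64{0};

  // The high 16 enabled lanes are silently lost with the 32-bit parameter.
  assert(wrapper_before(user_mask, all_match) == 0x000000000000FFFFull);
  // With the 64-bit parameter they are preserved.
  assert(wrapper_after(user_mask, all_match) == 0xFFFF00000000FFFFull);
}
```

A runtime test for the real intrinsics would presumably follow the same pattern: use a mask with both set and clear bits in the high 32 bits and check those lanes in the result.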



Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Jim Wilson

On 07/05/2018 09:51 AM, Palmer Dabbelt wrote:
> When I try to build it I see "Unsupported RISC-V target
> riscv-unknown-elf", so there's at least some extra autoconf wizardry that
> needs to happen in here.  I'm actually not sure what the "riscv-*"
> tuples are supposed to do so I've added Liviu as I don't want to
> misrepresent his desires and get into trouble again :).


I objected to this on the config.sub package mailing list, because it 
adds riscv-linux as a valid configure tuple, and we never wanted that. 
But this seems to be a losing battle.


Jim


Re: [PATCH] Fix several AVX512 intrinsic mask arguments.

2018-07-05 Thread Jakub Jelinek
On Thu, Jul 05, 2018 at 08:30:27PM +0300, Grazvydas Ignotas wrote:
> gcc/ChangeLog:
> 
> 2018-07-05  Grazvydas Ignotas  
> 
>   * config/i386/avx512bwintrin.h: (_mm512_mask_cmp_epi8_mask,
>   _mm512_mask_cmp_epu8_mask): Fix mask arguments.

LGTM, but
1) I think it would be nice to add a runtime testcase that fails (on avx512bw hw)
   without this patch and succeeds with this patch (have some non-zero and
   zero bits in the high 32 bits of the mask and test that the result is
   correct).
2) there are other functions that have this bug, e.g.
   _mm_mask_cmp_epi8_mask, _mm256_mask_cmp_epi8_mask,
   _mm_mask_cmp_epu8_mask, _mm256_mask_cmp_epu8_mask in avx512vlbwintrin.h

Let's grep for all suspicious parts:
echo `sed -n '/^_mm.*__mmask/,/^}/p' config/i386/*.h | sed 's/^}/@@@/'` | sed 's/@@@/}\n/g' | grep '__mmask8.*__mmask\(16\|32\|64\)'
 _mm512_mask_bitshuffle_epi64_mask (__mmask8 __M, __m512i __A, __m512i __B) { 
return (__mmask64) __builtin_ia32_vpshufbitqmb512_mask ((__v64qi) __A, 
(__v64qi) __B, (__mmask64) __M); }
 _mm_mask_cmp_epi8_mask (__mmask8 __U, __m128i __X, __m128i __Y, const int __P) 
{ return (__mmask16) __builtin_ia32_cmpb128_mask ((__v16qi) __X, (__v16qi) __Y, 
__P, (__mmask16) __U); }
 _mm_mask_cmp_epu8_mask (__mmask8 __U, __m128i __X, __m128i __Y, const int __P) 
{ return (__mmask16) __builtin_ia32_ucmpb128_mask ((__v16qi) __X, (__v16qi) 
__Y, __P, (__mmask16) __U); }
echo `sed -n '/^_mm.*__mmask/,/^}/p' config/i386/*.h | sed 's/^}/@@@/'` | sed 's/@@@/}\n/g' | grep '__mmask16.*__mmask\(8\|32\|64\)'
 _mm512_mask_xor_epi64 (__m512i __W, __mmask16 __U, __m512i __A, __m512i __B) { 
return (__m512i) __builtin_ia32_pxorq512_mask ((__v8di) __A, (__v8di) __B, 
(__v8di) __W, (__mmask8) __U); }
 _mm512_maskz_xor_epi64 (__mmask16 __U, __m512i __A, __m512i __B) { return 
(__m512i) __builtin_ia32_pxorq512_mask ((__v8di) __A, (__v8di) __B, (__v8di) 
_mm512_setzero_si512 (), (__mmask8) __U); }
 _mm512_mask_cmpneq_epi64_mask (__mmask16 __M, __m512i __X, __m512i __Y) { 
return (__mmask8) __builtin_ia32_cmpq512_mask ((__v8di) __X, (__v8di) __Y, 4, 
(__mmask8) __M); }
 _mm256_mask_cmp_epi8_mask (__mmask16 __U, __m256i __X, __m256i __Y, const int 
__P) { return (__mmask32) __builtin_ia32_cmpb256_mask ((__v32qi) __X, (__v32qi) 
__Y, __P, (__mmask32) __U); }
 _mm256_mask_cmp_epu8_mask (__mmask16 __U, __m256i __X, __m256i __Y, const int 
__P) { return (__mmask32) __builtin_ia32_ucmpb256_mask ((__v32qi) __X, 
(__v32qi) __Y, __P, (__mmask32) __U); }
 _mm_mask_add_ps (__m128 __W, __mmask16 __U, __m128 __A, __m128 __B) { return 
(__m128) __builtin_ia32_addps128_mask ((__v4sf) __A, (__v4sf) __B, (__v4sf) 
__W, (__mmask8) __U); }
 _mm_maskz_add_ps (__mmask16 __U, __m128 __A, __m128 __B) { return (__m128) 
__builtin_ia32_addps128_mask ((__v4sf) __A, (__v4sf) __B, (__v4sf) 
_mm_setzero_ps (), (__mmask8) __U); }
 _mm256_mask_add_ps (__m256 __W, __mmask16 __U, __m256 __A, __m256 __B) { 
return (__m256) __builtin_ia32_addps256_mask ((__v8sf) __A, (__v8sf) __B, 
(__v8sf) __W, (__mmask8) __U); }
 _mm256_maskz_add_ps (__mmask16 __U, __m256 __A, __m256 __B) { return (__m256) 
__builtin_ia32_addps256_mask ((__v8sf) __A, (__v8sf) __B, (__v8sf) 
_mm256_setzero_ps (), (__mmask8) __U); }
 _mm_mask_sub_ps (__m128 __W, __mmask16 __U, __m128 __A, __m128 __B) { return 
(__m128) __builtin_ia32_subps128_mask ((__v4sf) __A, (__v4sf) __B, (__v4sf) 
__W, (__mmask8) __U); }
 _mm_maskz_sub_ps (__mmask16 __U, __m128 __A, __m128 __B) { return (__m128) 
__builtin_ia32_subps128_mask ((__v4sf) __A, (__v4sf) __B, (__v4sf) 
_mm_setzero_ps (), (__mmask8) __U); }
 _mm256_mask_sub_ps (__m256 __W, __mmask16 __U, __m256 __A, __m256 __B) { 
return (__m256) __builtin_ia32_subps256_mask ((__v8sf) __A, (__v8sf) __B, 
(__v8sf) __W, (__mmask8) __U); }
 _mm256_maskz_sub_ps (__mmask16 __U, __m256 __A, __m256 __B) { return (__m256) 
__builtin_ia32_subps256_mask ((__v8sf) __A, (__v8sf) __B, (__v8sf) 
_mm256_setzero_ps (), (__mmask8) __U); }
 _mm256_maskz_cvtepi32_ps (__mmask16 __U, __m256i __A) { return (__m256) 
__builtin_ia32_cvtdq2ps256_mask ((__v8si) __A, (__v8sf) _mm256_setzero_ps (), 
(__mmask8) __U); }
 _mm_maskz_cvtepi32_ps (__mmask16 __U, __m128i __A) { return (__m128) 
__builtin_ia32_cvtdq2ps128_mask ((__v4si) __A, (__v4sf) _mm_setzero_ps (), 
(__mmask8) __U); }
echo `sed -n '/^_mm.*__mmask/,/^}/p' config/i386/*.h | sed 's/^}/@@@/'` | sed 's/@@@/}\n/g' | grep '__mmask32.*__mmask\(8\|16\|64\)'
 _mm512_mask_cmp_epi8_mask (__mmask32 __U, __m512i __X, __m512i __Y, const int 
__P) { return (__mmask64) __builtin_ia32_cmpb512_mask ((__v64qi) __X, (__v64qi) 
__Y, __P, (__mmask64) __U); }
 _mm512_mask_cmp_epu8_mask (__mmask32 __U, __m512i __X, __m512i __Y, const int 
__P) { return (__mmask64) __builtin_ia32_ucmpb512_mask ((__v64qi) __X, 
(__v64qi) __Y, __P, (__mmask64) __U); }
 _mm256_mask_cvtepi8_epi16 (__m256i __W, __mmask32 __U, __m128i __A) { return 
(__m256i) __builtin_ia32_pmovsxbw256_mask ((__v16qi) __A, (__v16hi) __W, 
(__mmask16) __U); }
 _mm_

Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Liviu Ionescu



> On 5 Jul 2018, at 21:26, Jim Wilson  wrote:
> 
> because it adds riscv-linux as a valid configure tuple, and we never wanted 
> that

If this is really a problem I guess you can blacklist it somehow.

But this proves once again that Linux native compilers and cross embedded 
toolchains should be processed differently.


Regards,

Liviu




Re: [PATCH] libtool: Sort output of 'find' to enable deterministic builds.

2018-07-05 Thread Jeff Law
On 07/05/2018 08:53 AM, Bernhard M. Wiedemann wrote:
> On 2018-06-29 17:09, Jeff Law wrote:
>> In the immediate term, applying the patch to both instances seems wise.
>>
>> Bernhard, do you have commit privs?
> 
> no, and I don't really want privs, since I expect to be doing only a few
> patches for gcc.
> Can you (or someone else) please commit the patch?
Installed on the trunk.

I did not update the libgo copies since we're a downstream consumer.

jeff


Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Jim Wilson
On Thu, Jul 5, 2018 at 11:31 AM, Liviu Ionescu  wrote:
> If this is really a problem I guess you can blacklist it somehow.
> But this proves once again that Linux native compilers and cross embedded 
> toolchains should be processed differently.

It isn't a major problem.  There is the issue that the more different
configure triplets we have the more work I need to do to keep them all
working.  So from my viewpoint, I'd rather not have riscv-*, but now
that we have it, I can fix the gcc config.sub to make it work.  On the
linux side, the expectation is that everyone will be using
riscv64-linux, and we don't have any usable upstream riscv32-linux
support yet, so there isn't any problem (yet).

Jim


Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Liviu Ionescu



> On 5 Jul 2018, at 22:17, Jim Wilson  wrote:
> 
> ... I can fix the gcc config.sub to make it work.

Or you can edit `gcc/config.gcc` and trigger an error for `riscv-linux*`.


Regards,

Liviu



Re: [PATCH] Enable decimal float on x86_64 kFreeBSD and Hurd

2018-07-05 Thread Jeff Law
On 07/04/2018 03:24 PM, James Clarke wrote:
> config/
>   * dfp.m4 (enable_decimal_float): Enable for x86_64*-*-gnu* to
>   catch x86_64 kFreeBSD and Hurd.
> 
> gcc/
>   * configure: Regenerate.
> 
> libdecnumber/
>   * configure: Regenerate.
> 
> libgcc/
>   * configure: Regenerate.
Thanks.  Installed on the trunk.

jeff


Re: [patch] jump threading multiple paths that start from the same BB

2018-07-05 Thread Jeff Law
On 07/04/2018 02:12 AM, Aldy Hernandez wrote:
> 
> 
> On 07/03/2018 08:16 PM, Jeff Law wrote:
>> On 07/03/2018 03:31 AM, Aldy Hernandez wrote:
>>> On 07/02/2018 07:08 AM, Christophe Lyon wrote:
>>>
>> On 11/07/2017 10:33 AM, Aldy Hernandez wrote:
>>> While poking around in the backwards threader I noticed that we bail if
>>> we have already seen a starting BB.
>>>
>>>  /* Do not jump-thread twice from the same block.  */
>>>  if (bitmap_bit_p (threaded_blocks, entry->src->index)
>>>
>>> This limitation discards paths that are sub-paths of paths that have
>>> already been threaded.
>>>
>>> The following patch scans the remaining to-be-threaded paths to identify
>>> if any of them start from the same point, and are thus sub-paths of the
>>> just-threaded path.  By removing the common prefix of blocks in upcoming
>>> threadable paths, and then rewiring first non-common block
>>> appropriately, we expose new threading opportunities, since we are no
>>> longer starting from the same BB.  We also simplify the would-be
>>> threaded paths, because we don't duplicate already duplicated paths.
>>> [snip]
 Hi,

 I've noticed a regression on aarch64:
 FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
 threaded: 3"
 very likely caused by this patch (appeared between 262282 and 262294)

 Christophe
>>>
>>> The test needs to be adjusted here.
>>>
>>> The long story is that the aarch64 IL is different at thread3 time in
>>> that it has 2 profitable sub-paths that can now be threaded with my
>>> patch.  This is causing the threaded count to be 5 for aarch64, versus 3
>>> for x86-64.  Previously we couldn't thread these in aarch64, so the
>>> backwards threader would bail.
>>>
>>> One can see the different threading opportunities by sticking
>>> debug_all_paths() at the top of thread_through_all_blocks().  You will
>>> notice that aarch64 has far more candidates to begin with.  The IL on
>>> the x86 backend has no paths that start on the same BB.  The aarch64,
>>> on the other hand, has many to choose from:
>>>
>>> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11,
>>> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
>>> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
>>> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
>>> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
>>> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
>>> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
>>> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
>>> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 19,
>>>
>>> Some of these prove unprofitable, but 2 more than before are profitable now.
>>>
>>>
>>>
>>> BTW, I see another threading related failure on aarch64 which is
>>> unrelated to my patch, and was previously there:
>>>
>>> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps
>>> threaded"
>>>
>>> This is probably another IL incompatibility between architectures.
>>>
>>> Anyways... the attached patch fixes the regression.  I have added a note
>>> to the test explaining the IL differences.  We really should rewrite all
>>> the threading tests (I am NOT volunteering ;-)).
>>>
>>> OK for trunk?
>>> Aldy
>>>
>>> curr.patch
>>>
>>>
>>> gcc/testsuite/
>>>
>>> * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust test because aarch64
>>> has a slightly different IL that provides more threading
>>> opportunities.
>> OK.
>>
>> WRT rewriting the tests.  I'd certainly agree that we don't have the
>> right set of knobs to allow us to characterize the target nor do we have
>> the right dumping/scanning facilities to describe and query the CFG
>> changes.
>>
>> The fact that the IL changes so much across targets is a sign that
>> target dependency (probably BRANCH_COST) is twiddling the gimple we
>> generate.  I strongly suspect we'd be a lot better off if we tackled the
>> BRANCH_COST problem first.
> 
> Huh.  I've always accepted differing IL between architectures as a
> necessary evil for things like auto-vectorization and the like.
Yes.  We've made a conscious decision that introducing target
dependencies for the autovectorizer makes sense.  BRANCH_COST on the
other hand is a different beast :-)

> 
> What's the ideal plan here? A knob to set default values for target
> dependent variables that can affect IL layout?  Then we could pass
> -fthis-is-an-IL-test and things would be normalized?
Well, lots of things.

I'd like decisions about how to expand branches deferred until rtl
expansion.  Kai was poking at this in the past but never really got any
traction.

Many tests should turn into gimple IL tests.

And I'd like a better framework for testing what we're doing to the IL.


Jeff
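
The common-prefix idea discussed in this thread can be sketched as follows. This is not GCC's implementation: the `edge` and `path` types and the helpers `same_start`, `common_prefix_length` and `strip_prefix` are hypothetical names for this illustration, with a path modelled simply as a list of (source BB, destination BB) pairs like the ones printed by the debug dump quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A jump-threading path modelled as a list of CFG edges (src BB, dest BB).
using edge = std::pair<int, int>;
using path = std::vector<edge>;

// True if both paths are non-empty and start from the same basic block.
bool same_start(const path& a, const path& b)
{
  return !a.empty() && !b.empty() && a.front().first == b.front().first;
}

// Number of leading edges that the two paths have in common.
std::size_t common_prefix_length(const path& a, const path& b)
{
  std::size_t n = 0;
  while (n < a.size() && n < b.size() && a[n] == b[n])
    ++n;
  return n;
}

// Returns what is left of P after dropping its first N edges.
path strip_prefix(const path& p, std::size_t n)
{
  assert(n <= p.size());
  return path(p.begin() + static_cast<std::ptrdiff_t>(n), p.end());
}
```

For the aarch64 candidates listed above, the path `52 -> 56, ..., 10 -> 11` and the path `52 -> 56, ..., 10 -> 11, 11 -> 35` start from the same block and share their first five edges, so only `11 -> 35` remains once the common prefix is stripped. The paths starting at block 51 or 53 share no prefix with them in this model.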



Re: [PATCH][Middle-end]3rd patch of PR78809

2018-07-05 Thread Jeff Law
On 07/05/2018 09:35 AM, Qing Zhao wrote:
> Hi,
> 
> I have sent two emails with the updated patches on 7/3:
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00065.html
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00070.html
> 
> however, these 2 emails  were not successfully forwarded to the
> gcc-patches@gcc.gnu.org  mailing list.
> 
> So, I am sending the same email again in this one, hopefully this time
> it can go through. 
The original went through as well.

jeff


Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).

2018-07-05 Thread Jeff Law
On 04/10/2018 06:27 AM, Martin Liška wrote:
> On 04/10/2018 11:19 AM, Jakub Jelinek wrote:
>> On Mon, Apr 09, 2018 at 02:31:04PM +0200, Martin Liška wrote:
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2018-03-28  Martin Liska  
>>>
>>> * gcc.dg/string-opt-1.c:
>> I guess you really didn't mean to keep the above entry around, just the one
>> below, right?
> Sure, fixed.
> 
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2018-03-14  Martin Liska  
>>>
>>> * gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target
>>> and others.
>>> --- a/gcc/config.gcc
>>> +++ b/gcc/config.gcc
>>> @@ -1607,6 +1607,7 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
>>> x86_64-*-linux*)
>>> tm_file="${tm_file} linux.h linux-android.h i386/linux-common.h i386/linux64.h"
>>> extra_options="${extra_options} linux-android.opt"
>>> +   extra_objs="${extra_objs} x86-linux.o"
>>> ;;
>> This should go into the i[34567]86-*-linux*) case too (outside of the
>> if test x$enable_targets = xall; then conditional).
>> Or maybe better, remove the above and do it in:
>> i[34567]86-*-linux* | x86_64-*-linux*)
>> extra_objs="${extra_objs} cet.o"
>> tmake_file="$tmake_file i386/t-linux i386/t-cet"
>> ;;
>> spot, just add x86-linux.o next to cet.o.
> Done.
> 
>>> --- a/gcc/config/i386/linux.h
>>> +++ b/gcc/config/i386/linux.h
>>> @@ -24,3 +24,5 @@ along with GCC; see the file COPYING3.  If not see
>>>  
>>>  #undef MUSL_DYNAMIC_LINKER
>>>  #define MUSL_DYNAMIC_LINKER "/lib/ld-musl-i386.so.1"
>>> +
>>> +#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed
>>> diff --git a/gcc/config/i386/linux64.h b/gcc/config/i386/linux64.h
>>> index f2d913e30ac..d855f5cc239 100644
>>> --- a/gcc/config/i386/linux64.h
>>> +++ b/gcc/config/i386/linux64.h
>>> @@ -37,3 +37,5 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>>  #define MUSL_DYNAMIC_LINKER64 "/lib/ld-musl-x86_64.so.1"
>>>  #undef MUSL_DYNAMIC_LINKERX32
>>>  #define MUSL_DYNAMIC_LINKERX32 "/lib/ld-musl-x32.so.1"
>>> +
>>> +#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed
>> And the above two changes should be replaced by a change in
>> gcc/config/i386/linux-common.h.
> Likewise.
> 
>>> +#include "coretypes.h"
>>> +#include "cp/cp-tree.h" /* This is why we're a separate module.  */
>> Why do you need cp/cp-tree.h?  That is just too weird.
>> The function just uses libc_speed (in coretypes.h), built_in_function
>> (likewise), OPTION_GLIBC (config/linux.h).
> I ended up with minimal set of includes:
> 
> #include "config.h"
> #include "system.h"
> #include "coretypes.h"
> #include "backend.h"
> #include "tree.h"
> 
> I'm retesting the patch.
> 
> Martin
> 
>>  Jakub
>>
> 
> 0001-Introduce-new-libc_func_speed-target-hook-PR-middle-.patch
> 
> 
> From bed35715063f9435b697eaf4c9868f81e8556de8 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Wed, 14 Mar 2018 09:44:18 +0100
> Subject: [PATCH] Introduce new libc_func_speed target hook (PR
>  middle-end/81657).
> 
> gcc/ChangeLog:
> 
> 2018-03-14  Martin Liska  
> 
>   PR middle-end/81657
>   * builtins.c (expand_builtin_memory_copy_args): Handle situation
>   when libc library provides a fast mempcpy implementation.
>   * config/linux-protos.h (ix86_linux_libc_func_speed): New.
>   (TARGET_LIBC_FUNC_SPEED): Likewise.
>   * config/i386/linux-common.h (SUBTARGET_LIBC_FUNC_SPEED): Define
>   macro.
>   * config/i386/t-linux: Add x86-linux.o.
>   * config.gcc: Likewise.
>   * config/i386/x86-linux.c: New file.
>   * coretypes.h (enum libc_speed): Likewise.
>   * doc/tm.texi: Document new target hook.
>   * doc/tm.texi.in: Likewise.
>   * expr.c (emit_block_move_hints): Handle libc bail out argument.
>   * expr.h (emit_block_move_hints): Add new parameters.
>   * target.def: Add new hook.
>   * targhooks.c (enum libc_speed): New enum.
>   (default_libc_func_speed): Provide a default hook
>   implementation.
>   * targhooks.h (default_libc_func_speed): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-03-14  Martin Liska  
> 
>   * gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target
>   and others.
This looks pretty reasonable now.  Let's go with it.  If we need to
adjust other targets we certainly can fault them in as their properties
are discovered/updated.

jeff
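
For readers unfamiliar with the two functions being traded off in this thread: `mempcpy` (a GNU extension) copies like `memcpy` but returns a pointer just past the last byte written, so it can always be expressed as `memcpy` plus an offset. The sketch below shows that equivalence with a hypothetical portable helper, `copy_and_advance`, and a hypothetical caller, `join`. It does not use the new target hook and makes no claim about which form is faster on any given libc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper with mempcpy semantics, built on top of memcpy:
// copies N bytes and returns a pointer just past the copied data.
inline void* copy_and_advance(void* dest, const void* src, std::size_t n)
{
  return static_cast<char*>(std::memcpy(dest, src, n)) + n;
}

// Concatenates two byte ranges into OUT and NUL-terminates the result.
// OUT must have room for NA + NB + 1 bytes.  Returns the string length.
std::size_t join(char* out, const char* a, std::size_t na,
		 const char* b, std::size_t nb)
{
  assert(out != nullptr);
  char* end = static_cast<char*>(copy_and_advance(out, a, na));
  end = static_cast<char*>(copy_and_advance(end, b, nb));
  *end = '\0';
  return static_cast<std::size_t>(end - out);
}
```

The running end pointer in `join` is the usage pattern where a fast native `mempcpy` saves the caller from recomputing `dest + n` after each copy.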


Re: [PING][PATCH, rs6000, C/C++] Fix PR target/86324: divkc3-1.c FAILs when compiling with -mabi=ieeelongdouble

2018-07-05 Thread Jeff Law
On 07/02/2018 03:50 PM, Peter Bergner wrote:
> I'd like to PING:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01713.html
> 
> I've included the entire patch below, since I missed the test cases in
> the original submission and Segher asked for some updated text for the
> hook documentation which I've included below.
> 
> Peter
> 
> 
> gcc/
>   PR target/86324
>   * target.def (translate_mode_attribute): New hook.
>   * targhooks.h (default_translate_mode_attribute): Declare.
>   * targhooks.c (default_translate_mode_attribute): New function.
>   * doc/tm.texi.in (TARGET_TRANSLATE_MODE_ATTRIBUTE): New hook.
>   * doc/tm.texi: Regenerate.
>   * config/rs6000/rs6000.c (TARGET_TRANSLATE_MODE_ATTRIBUTE): Define.
>   (rs6000_translate_mode_attribute): New function.
> 
> gcc/c-family/
>   PR target/86324
>   * c-attribs.c (handle_mode_attribute): Call new translate_mode_attribute
>   target hook.
> 
> gcc/testsuite/
>   PR target/86324
>   * gcc.target/powerpc/pr86324-1.c: New test.
>   * gcc.target/powerpc/pr86324-2.c: Likewise.
OK.
jeff


Re: [RFC PATCH] diagnose built-in declarations without prototype (PR 83656)

2018-07-05 Thread Jeff Law
On 07/04/2018 11:32 AM, Martin Sebor wrote:
> On 07/03/2018 08:33 PM, Jeff Law wrote:
>>
>>>
>>> But since the number of warnings here hasn't changed, the ones
>>> in GCC logs predate my changes.  So updating the tests seems
>>> like an improvement to consider independently of the patch.
>> Agreed.  I'm still wary of proceeding given the general concerns about
>> configure tests.  It's good that GCC's configury bits aren't affected,
>> but I'm not sure we can generalize a whole lot from that.
> 
> So what's the next step?  I'm open to relaxing the warning
> so it only triggers with -Wall or -Wextra and not by default
> if that's considered necessary.
I'm not sure :-)  The problem is we have notable potential to break
things and do so in ways that are going to be painful to find.

Having them only turn on for -Wextra might be a compromise position.
But even if we do that I don't really see how we take the next step (ie,
adding it to Wall).


> 
> At the same time, the instances of the warning we have seen
> have all been issued for the configure tests for years and
> we have not seen any new instances of it as a result of
> this change, so the concern that the patch might lead to some
> more while at the same time accepting the ones we know about
> doesn't make sense to me.
Again, I don't think we can generalize much from the GCC autoconf
scripts and the failure modes are going to be extremely painful to track
down to this change.

While we have this concern with every new warning or enhancements to
existing warnings, this specific instance is worse because of how it
interacts with relatively common configury code.

Jeff




Re: [PATCH] tighten up -Wbuiltin-declaration-mismatch (PR 86125)

2018-07-05 Thread Jeff Law
On 06/28/2018 09:14 AM, Martin Sebor wrote:
> On 06/27/2018 11:20 PM, Jeff Law wrote:
>> On 06/26/2018 05:32 PM, Martin Sebor wrote:
>>> Attached is an updated patch to tighten up the warning and also
>>> prevent ICEs in the middle-end like in PR 86308 or PR 86202.
>>>
>>> I took Richard's suggestion to add the POINTER_TYPE_P() check
>>> to detect pointer/integer conflicts.  That also avoids the ICEs
>>> above.
>>>
>>> I also dealt with the fileptr_type_node problem so that file
>>> I/O built-ins can be declared to take any object pointer type
>>> as an argument, and that argument has to be the same for all
>>> them.
>>>
>>> I'm not too happy about the interaction with -Wextra but short
>>> of enabling the stricter checks even without it or introducing
>>> multiple levels for -Wbuiltin-declaration-mismatch I don't see
>>> a good alternative.
>>>
>>> Martin
>>>
>>> gcc-86125.diff
>>>
>>>
>>> PR c/86125 - missing -Wbuiltin-declaration-mismatch on a mismatched
>>> return type
>>> PR middle-end/86308 - ICE in verify_gimple calling index() with an
>>> invalid declaration
>>> PR middle-end/86202 - ICE in get_range_info calling an invalid
>>> memcpy() declaration
>>>
>>> gcc/c/ChangeLog:
>>>
>>> PR c/86125
>>> PR middle-end/86202
>>> PR middle-end/86308
>>> * c-decl.c (match_builtin_function_types): Add arguments.
>>> (diagnose_mismatched_decls): Diagnose mismatched declarations
>>> of built-ins more strictly.
>>> * doc/invoke.texi (-Wbuiltin-declaration-mismatch): Update.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> PR c/86125
>>> PR middle-end/86202
>>> PR middle-end/86308
>>> * gcc.dg/Wbuiltin-declaration-mismatch.c: New test.
>>> * gcc.dg/Wbuiltin-declaration-mismatch-2.c: New test.
>>> * gcc.dg/Wbuiltin-declaration-mismatch-3.c: New test.
>>> * gcc.dg/Wbuiltin-declaration-mismatch-4.c: New test.
>>> * gcc.dg/builtins-69.c: New test.
>>
>> [ ... ]
>>
>>>
>>> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
>>> index af16cfd..6c9e667 100644
>>> --- a/gcc/c/c-decl.c
>>> +++ b/gcc/c/c-decl.c
>>> @@ -1628,43 +1628,82 @@ c_bind (location_t loc, tree decl, bool
>>> is_global)
>>>    bind (DECL_NAME (decl), decl, scope, false, nested, loc);
>>>  }
>>>
>>> +
>>>  /* Subroutine of compare_decls.  Allow harmless mismatches in return
>>>     and argument types provided that the type modes match.  This
>>> function
>>> -   return a unified type given a suitable match, and 0 otherwise.  */
>>> +   returns a unified type given a suitable match, and 0 otherwise.  */
>>>
>>>  static tree
>>> -match_builtin_function_types (tree newtype, tree oldtype)
>>> +match_builtin_function_types (tree newtype, tree oldtype,
>>> +  tree *strict, unsigned *argno)
>> As Joseph notes, you need to update the function comment here.
>>
>> [ ... ]
>>
>>> +  /* Store the first FILE* argument type seen (whatever it is),
>>> + and expect any subsequent declarations of file I/O built-ins
>>> + to refer to it rather than to fileptr_type_node which is just
>>> + void*.  */
>>> +  static tree last_fileptr_type;
>> Is this actually safe?  Isn't the type in GC memory?  And if so, what
>> prevents it from being GC'd?  At the least I think you need to register
>> this as a GC root.  Why are we handling fileptr_types specially here to
>> begin with?
> 
> IIUC, garbage collection runs after front end processing (between
> separate passes) so the node should not be freed while the front
> end is holding on to it.  There are other examples in the FE of
> similar static usage (e.g., in c-format.c).
You've stuffed a potentially GC'd object into a static and that's going
to trigger an "is this correct/safe" discussion every time it's noticed :-)

Yes it's true that GC only happens at well known points and if an object
lives entirely in the front-end you can probably get away without the
GTY marker.  But then you have to actually prove there's nothing in the
middle/back ends that potentially call into this code.

I generally dislike that approach because it's bad from a long term
maintenance standpoint.  It's an implementation constraint that someone
has to remember forever to avoid hard to find bugs from being introduced.

Another way to help alleviate these concerns would be to assign the
object NULL once we're done parsing.

Or you can add a GTY marker.  There's a bit of overhead to this since
the GC system has to walk through all the registered roots.

Or you can conditionalize the code on some other variable which
indicates whether or not the parser is still running. "the_parser" might
be usable for this purpose.
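
To make the trade-off above concrete, here is a minimal sketch of why a
pointer held only in an unregistered static is at risk.  Everything in it
(Node, ToyGC, add_root, the two functions) is hypothetical and is not GCC's
collector; it only models the idea that a collector keeps what it can reach
from registered roots.  In GCC itself the marker form would, as far as I
know, be spelled along the lines of "static GTY(()) tree last_fileptr_type;".

```cpp
#include <vector>

// Toy stand-in for a GC-managed object; not GCC's real `tree` type.
struct Node {
  explicit Node(int v) : value(v), marked(false), live(true) {}
  int value;
  bool marked;  // set during the mark phase
  bool live;    // cleared once the toy collector has "freed" the node
};

// Toy collector: a node stays live only if a registered root points at it.
class ToyGC {
public:
  ~ToyGC() {
    for (Node *n : heap_) delete n;
  }
  Node *alloc(int value) {
    heap_.push_back(new Node(value));
    return heap_.back();
  }
  // Register the address of a pointer variable as a root.
  void add_root(Node **slot) { roots_.push_back(slot); }
  void collect() {
    for (Node *n : heap_) n->marked = false;
    for (Node **slot : roots_)
      if (*slot) (*slot)->marked = true;
    // "Free" everything that was not reached from a root.
    for (Node *n : heap_)
      if (!n->marked) n->live = false;
  }

private:
  std::vector<Node *> heap_;
  std::vector<Node **> roots_;
};

// Models `static tree last_fileptr_type;` with no root registration.
bool unregistered_static_survives() {
  ToyGC gc;
  static Node *cached;
  cached = gc.alloc(1);
  gc.collect();
  return cached->live;
}

// Models the same static once it has been registered as a root.
bool registered_root_survives() {
  ToyGC gc;
  static Node *cached;
  cached = gc.alloc(2);
  gc.add_root(&cached);
  gc.collect();
  return cached->live;
}
```

In this toy model the unregistered static ends up pointing at a collected
node, while the registered one keeps its node alive.  The "assign NULL once
parsing is done" option corresponds to never reading the static after a
collection could have run.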


> 
> The code detects mismatches between arguments to different file
> I/O functions, as in:
> 
>   struct SomeFile;
> 
>   // okay, FILE is struct SomeFile
>   int fputc (int, struct SomeFile*);
> 
>   struct OtherFile;
>   int fputs (const char*, struct OtherFile*);   // warning
I must be missing something.  What makes the first OK and the second

Re: [patch] Improve specs processing to allow %* in function arguments

2018-07-05 Thread Jeff Law
On 06/05/2018 08:13 AM, Olivier Hainque wrote:
> Hello,
> 
> The attached patch is a proposal to improve specs processing
> so %* works in spec function arguments (it doesn't as of today).
> 
> The immediate motivation is to allow a cleaner implementation of
> the -mmacosx-version-min support on darwin. I'll send a followup
> patch for that if the preliminary improvement suggested here gets
> approved. It seems generally useful in any case.
> 
> The idea is to propagate the matched pattern down into the spec
> processing chain when we have it, in particular from do_spec_1 to
> handle_spec_function, to eval_spec_function, to do_spec_2 and
> then do_spec_1 for the arguments.
> 
> Testing-wise, we have this running nightly on all our targets,
> currently based on gcc-7. We are using the facility on darwin in
> particular, with the reworked support for -mmacosx-version-min
> combined with local changes to support -mios-version-min as well.
>  
> Bootstrapped and regression tested with mainline on x86_64-linux.
> 
> Ok to commit ?
> 
> Thanks a lot in advance!
> 
> With Kind Regards,
> 
> Olivier
> 
> 2018-06-05  Olivier Hainque  
> 
> * gcc.c (handle_spec_function): Accept a soft_matched_part
> argument, as do_spec_1.  Pass it down to ...
> (eval_spec_function): Accept a soft_matched_part argument,
> and pass it down to ...
> (do_spec_2): Accept a soft_matched_part argument, and pass
> it down to do_spec_1.
> (do_spec_1): Pass soft_matched_part to handle_spec_function.
> (handle_braces): Update call to handle_spec_function.
> (driver::set_up_specs): Update calls to do_spec_2.
> (compare_debug_dump_opt_spec_function): Likewise.
> (compare_debug_self_opt_spec_function): Likewise.
OK.
jeff


Re: [PATCH] Fix several AVX512 intrinsic mask arguments.

2018-07-05 Thread Grazvydas Ignotas
On Thu, Jul 5, 2018 at 9:28 PM, Jakub Jelinek  wrote:
> On Thu, Jul 05, 2018 at 08:30:27PM +0300, Grazvydas Ignotas wrote:
>> gcc/ChangeLog:
>>
>> 2018-07-05  Grazvydas Ignotas  
>>
>>   * config/i386/avx512bwintrin.h: (_mm512_mask_cmp_epi8_mask,
>>   _mm512_mask_cmp_epu8_mask): Fix mask arguments.
>
> LGTM, but
> 1) I think it would be nice to add a runtime testcase that fails (on avx512bw hw)
>    without this patch and succeeds with this patch (have some non-zero and
>    zero bits in the high 32 bits of the mask and test that the result is
>    correct

Looks like the existing tests can already do it if we correct an
apparent mistake (see attached patch).
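
For readers following along, a small sketch of the class of bug being
discussed.  The names mask32, mask64, masked_compare_all_equal,
cmp_narrow_param and cmp_wide_param are made up for illustration and are not
the real intrinsics or builtins; the typedef widths are assumptions standing
in for __mmask32 and __mmask64.  The point is only that a mask parameter
narrower than the number of lanes silently drops the high bits.

```cpp
#include <cstdint>

// Assumed stand-ins for the __mmask32 / __mmask64 typedefs.
typedef std::uint32_t mask32;
typedef std::uint64_t mask64;

// Toy "compare" over 64 lanes that reports every lane as matching and then
// applies the write mask, so the result is simply the mask itself.
static mask64 masked_compare_all_equal(mask64 m) {
  return ~mask64(0) & m;
}

// Buggy wrapper shape: the parameter is 32 bits wide although the operation
// covers 64 lanes, so the upper half of the caller's mask never arrives.
mask64 cmp_narrow_param(mask32 u) {
  return masked_compare_all_equal(static_cast<mask64>(u));
}

// Fixed wrapper shape: the parameter matches the lane count.
mask64 cmp_wide_param(mask64 u) {
  return masked_compare_all_equal(u);
}
```

A caller passing a 64-bit mask to the narrow variant gets an implicit
conversion to 32 bits, which is why a runtime test needs set bits in the high
32 bits of the mask to notice the problem.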

> 2) there are other functions that have this bug, e.g.
>    _mm_mask_cmp_epi8_mask, _mm256_mask_cmp_epi8_mask,
>    _mm_mask_cmp_epu8_mask, _mm256_mask_cmp_epu8_mask in avx512vlbwintrin.h
>
> Let's grep for all suspicious parts:
> echo `sed -n '/^_mm.*__mmask/,/^}/p' config/i386/*.h | sed 's/^}/@@@/'` | sed 
> 's/@@@/}\n/g' | grep '__mmask8.*__mmask\(16\|32\|64\)'
>  _mm512_mask_bitshuffle_epi64_mask (__mmask8 __M, __m512i __A, __m512i __B) { 
> return (__mmask64) __builtin_ia32_vpshufbitqmb512_mask ((__v64qi) __A, 
> (__v64qi) __B, (__mmask64) __M); }
>  _mm_mask_cmp_epi8_mask (__mmask8 __U, __m128i __X, __m128i __Y, const int 
> __P) { return (__mmask16) __builtin_ia32_cmpb128_mask ((__v16qi) __X, 
> (__v16qi) __Y, __P, (__mmask16) __U); }
>  _mm_mask_cmp_epu8_mask (__mmask8 __U, __m128i __X, __m128i __Y, const int 
> __P) { return (__mmask16) __builtin_ia32_ucmpb128_mask ((__v16qi) __X, 
> (__v16qi) __Y, __P, (__mmask16) __U); }
> echo `sed -n '/^_mm.*__mmask/,/^}/p' config/i386/*.h | sed 's/^}/@@@/'` | sed 
> 's/@@@/}\n/g' | grep '__mmask16.*__mmask\(8\|32\|64\)'
>  _mm512_mask_xor_epi64 (__m512i __W, __mmask16 __U, __m512i __A, __m512i __B) 
> { return (__m512i) __builtin_ia32_pxorq512_mask ((__v8di) __A, (__v8di) __B, 
> (__v8di) __W, (__mmask8) __U); }
>  _mm512_maskz_xor_epi64 (__mmask16 __U, __m512i __A, __m512i __B) { return 
> (__m512i) __builtin_ia32_pxorq512_mask ((__v8di) __A, (__v8di) __B, (__v8di) 
> _mm512_setzero_si512 (), (__mmask8) __U); }
>  _mm512_mask_cmpneq_epi64_mask (__mmask16 __M, __m512i __X, __m512i __Y) { 
> return (__mmask8) __builtin_ia32_cmpq512_mask ((__v8di) __X, (__v8di) __Y, 4, 
> (__mmask8) __M); }
>  _mm256_mask_cmp_epi8_mask (__mmask16 __U, __m256i __X, __m256i __Y, const 
> int __P) { return (__mmask32) __builtin_ia32_cmpb256_mask ((__v32qi) __X, 
> (__v32qi) __Y, __P, (__mmask32) __U); }
>  _mm256_mask_cmp_epu8_mask (__mmask16 __U, __m256i __X, __m256i __Y, const 
> int __P) { return (__mmask32) __builtin_ia32_ucmpb256_mask ((__v32qi) __X, 
> (__v32qi) __Y, __P, (__mmask32) __U); }
>  _mm_mask_add_ps (__m128 __W, __mmask16 __U, __m128 __A, __m128 __B) { return 
> (__m128) __builtin_ia32_addps128_mask ((__v4sf) __A, (__v4sf) __B, (__v4sf) 
> __W, (__mmask8) __U); }
>  _mm_maskz_add_ps (__mmask16 __U, __m128 __A, __m128 __B) { return (__m128) 
> __builtin_ia32_addps128_mask ((__v4sf) __A, (__v4sf) __B, (__v4sf) 
> _mm_setzero_ps (), (__mmask8) __U); }
>  _mm256_mask_add_ps (__m256 __W, __mmask16 __U, __m256 __A, __m256 __B) { 
> return (__m256) __builtin_ia32_addps256_mask ((__v8sf) __A, (__v8sf) __B, 
> (__v8sf) __W, (__mmask8) __U); }
>  _mm256_maskz_add_ps (__mmask16 __U, __m256 __A, __m256 __B) { return 
> (__m256) __builtin_ia32_addps256_mask ((__v8sf) __A, (__v8sf) __B, (__v8sf) 
> _mm256_setzero_ps (), (__mmask8) __U); }
>  _mm_mask_sub_ps (__m128 __W, __mmask16 __U, __m128 __A, __m128 __B) { return 
> (__m128) __builtin_ia32_subps128_mask ((__v4sf) __A, (__v4sf) __B, (__v4sf) 
> __W, (__mmask8) __U); }
>  _mm_maskz_sub_ps (__mmask16 __U, __m128 __A, __m128 __B) { return (__m128) 
> __builtin_ia32_subps128_mask ((__v4sf) __A, (__v4sf) __B, (__v4sf) 
> _mm_setzero_ps (), (__mmask8) __U); }
>  _mm256_mask_sub_ps (__m256 __W, __mmask16 __U, __m256 __A, __m256 __B) { 
> return (__m256) __builtin_ia32_subps256_mask ((__v8sf) __A, (__v8sf) __B, 
> (__v8sf) __W, (__mmask8) __U); }
>  _mm256_maskz_sub_ps (__mmask16 __U, __m256 __A, __m256 __B) { return 
> (__m256) __builtin_ia32_subps256_mask ((__v8sf) __A, (__v8sf) __B, (__v8sf) 
> _mm256_setzero_ps (), (__mmask8) __U); }
>  _mm256_maskz_cvtepi32_ps (__mmask16 __U, __m256i __A) { return (__m256) 
> __builtin_ia32_cvtdq2ps256_mask ((__v8si) __A, (__v8sf) _mm256_setzero_ps (), 
> (__mmask8) __U); }
>  _mm_maskz_cvtepi32_ps (__mmask16 __U, __m128i __A) { return (__m128) 
> __builtin_ia32_cvtdq2ps128_mask ((__v4si) __A, (__v4sf) _mm_setzero_ps (), 
> (__mmask8) __U); }
> echo `sed -n '/^_mm.*__mmask/,/^}/p' config/i386/*.h | sed 's/^}/@@@/'` | sed 
> 's/@@@/}\n/g' | grep '__mmask32.*__mmask\(8\|16\|64\)'
>  _mm512_mask_cmp_epi8_mask (__mmask32 __U, __m512i __X, __m512i __Y, const 
> int __P) { return (__mmask64) __builtin_ia32_cmpb512_mask ((__v64qi) __X, 
> (__v64qi) __Y, __P, (__mmask64) __U); }
>  _mm512_mask_cmp_epu8_mask (__mmask32 __U, __m512i __X

[PATCH] doc clarification: DONE and FAIL in define_split and define_peephole2

2018-07-05 Thread Paul Koning
Currently DONE and FAIL are documented only for define_expand, but they also 
work in essentially the same way for define_split and define_peephole2.

If FAIL is used in a define_insn_and_split, the output pattern cannot be the 
usual "#" dummy value. 

This patch updates the doc to describe those cases.  Ok for trunk?

paul

ChangeLog:

2018-07-05  Paul Koning  

* doc/md.texi (define_split): Document DONE and FAIL.  Describe
interaction with usual "#" output template in
define_insn_and_split.
(define_peephole2): Document DONE and FAIL.

Index: doc/md.texi
===
--- doc/md.texi (revision 262455)
+++ doc/md.texi (working copy)
@@ -8060,6 +8060,30 @@ those in @code{define_expand}, however, these stat
 generate any new pseudo-registers.  Once reload has completed, they also
 must not allocate any space in the stack frame.
 
+There are two special macros defined for use in the preparation statements:
+@code{DONE} and @code{FAIL}.  Use them with a following semicolon,
+as a statement.
+
+@table @code
+
+@findex DONE
+@item DONE
+Use the @code{DONE} macro to end RTL generation for the splitter.  The
+only RTL insns generated as replacement for the matched input insn will
+be those already emitted by explicit calls to @code{emit_insn} within
+the preparation statements; the replacement pattern is not used.
+
+@findex FAIL
+@item FAIL
+Make the @code{define_split} fail on this occasion.  When a @code{define_split}
+fails, it means that the splitter was not truly available for the inputs
+it was given, and this split is not done.
+@end table
+
+If the preparation falls through (invokes neither @code{DONE} nor
+@code{FAIL}), then the @code{define_split} uses the replacement
+template.
+
 Patterns are matched against @var{insn-pattern} in two different
 circumstances.  If an insn needs to be split for delay slot scheduling
 or insn scheduling, the insn is already known to be valid, which means
@@ -8232,6 +8256,15 @@ functionality as two separate @code{define_insn} a
 patterns.  It exists for compactness, and as a maintenance tool to prevent
 having to ensure the two patterns' templates match.
 
+In @code{define_insn_and_split}, the output template is usually simply
+@samp{#} since the assembly output is done by @code{define_insn}
+statements matching the generated insns, not by this
+@code{define_insn_and_split} statement.  But if @code{FAIL} is used in
+the preparation statements for certain input insns, those will not be
+split and during assembly output will again match this
+@code{define_insn_and_split}.  In that case, the appropriate assembly
+output statements are needed in the output template.
+
 @end ifset
 @ifset INTERNALS
 @node Including Patterns
@@ -8615,6 +8648,31 @@ so here's a silly made-up example:
   "")
 @end smallexample
 
+There are two special macros defined for use in the preparation statements:
+@code{DONE} and @code{FAIL}.  Use them with a following semicolon,
+as a statement.
+
+@table @code
+
+@findex DONE
+@item DONE
+Use the @code{DONE} macro to end RTL generation for the peephole.  The
+only RTL insns generated as replacement for the matched input insn will
+be those already emitted by explicit calls to @code{emit_insn} within
+the preparation statements; the replacement pattern is not used.
+
+@findex FAIL
+@item FAIL
+Make the @code{define_peephole2} fail on this occasion.  When a @code{define_peephole2}
+fails, it means that the replacement was not truly available for the
+particular inputs it was given, and the input insns are left unchanged.
+@end table
+
+If the preparation falls through (invokes neither @code{DONE} nor
+@code{FAIL}), then the @code{define_peephole2} uses the replacement
+template.
+
+
 @noindent
 If we had not added the @code{(match_dup 4)} in the middle of the input
 sequence, it might have been the case that the register we chose at the



[PATCH] PR libstdc++/85831 define move constructors and operators for exceptions

2018-07-05 Thread Jonathan Wakely

PR libstdc++/85831
* config/abi/pre/gnu.ver: Export move constructors and move
assignment operators for std::logic_error and std::runtime_error.
* include/std/stdexcept: Use _GLIBCXX_NOTHROW instead of
_GLIBCXX_USE_NOEXCEPT.
(logic_error, runtime_error): Declare move constructors and move
assignment operators. When not declared already, define copy
constructors and copy assignment operators as explicit-defaulted.
(domain_error, invalid_argument, length_error, out_of_range)
(overflow_error, underflow_error): Define move constructors and move
assignment operators as explicitly-defaulted.
* libsupc++/exception.h (exception): Likewise.
* src/c++11/cow-stdexcept.cc (logic_error, runtime_error): Define
move constructors and move assignment operators as defaulted.
* testsuite/19_diagnostics/stdexcept.cc: Check that constructors and
assignment operators are defined.

Tested powerpc64le-linux, committed to trunk.
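
As a rough illustration of what user code exercises here (not the test added
by the patch), the sketch below moves the standard exception types and checks
that the message travels with them.  The function names are hypothetical.
The checks should hold whether the move is served by a real move constructor
or by the copy operations as a fallback, so they say nothing about which of
the two a given library build uses.

```cpp
#include <cstring>
#include <stdexcept>
#include <utility>

// Move-construct a std::logic_error and check the message is preserved.
bool move_construct_keeps_message() {
  std::logic_error original("bad state");
  std::logic_error moved(std::move(original));
  return std::strcmp(moved.what(), "bad state") == 0;
}

// Move-assign a std::runtime_error and check the target takes the new message.
bool move_assign_keeps_message() {
  std::runtime_error target("first");
  std::runtime_error source("second");
  target = std::move(source);
  return std::strcmp(target.what(), "second") == 0;
}
```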

commit cdf98fe7aeb8f7815daed0d30709395ef2d34d7a
Author: Jonathan Wakely 
Date:   Thu Jul 5 12:09:48 2018 +0100

PR libstdc++/85831 define move constructors and operators for exceptions

PR libstdc++/85831
* config/abi/pre/gnu.ver: Export move constructors and move
assignment operators for std::logic_error and std::runtime_error.
* include/std/stdexcept: Use _GLIBCXX_NOTHROW instead of
_GLIBCXX_USE_NOEXCEPT.
(logic_error, runtime_error): Declare move constructors and move
assignment operators. When not declared already, define copy
constructors and copy assignment operators as explicit-defaulted.
(domain_error, invalid_argument, length_error, out_of_range)
(overflow_error, underflow_error): Define move constructors and move
assignment operators as explicitly-defaulted.
* libsupc++/exception.h (exception): Likewise.
* src/c++11/cow-stdexcept.cc (logic_error, runtime_error): Define
move constructors and move assignment operators as defaulted.
* testsuite/19_diagnostics/stdexcept.cc: Check that constructors and
assignment operators are defined.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 782b1238742..521cebf1f80 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -2014,6 +2014,13 @@ GLIBCXX_3.4.26 {
 # std::basic_string::insert(const_iterator, initializer_list)
 
_ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE6insertEN9__gnu_cxx17__normal_iteratorIPK[cw]S4_EESt16initializer_listI[cw]E;
 
+# std::logic_error move operations
+_ZNSt11logic_errorC[12]EOS_;
+_ZNSt11logic_erroraSEOS_;
+# std::runtime_error move operations
+_ZNSt13runtime_errorC[12]EOS_;
+_ZNSt13runtime_erroraSEOS_;
+
 } GLIBCXX_3.4.25;
 
 # Symbols in the support library (libsupc++) have their own tag.
diff --git a/libstdc++-v3/include/std/stdexcept b/libstdc++-v3/include/std/stdexcept
index 5267e5692bf..4fcc719f005 100644
--- a/libstdc++-v3/include/std/stdexcept
+++ b/libstdc++-v3/include/std/stdexcept
@@ -55,8 +55,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __cow_string();
 __cow_string(const std::string&);
 __cow_string(const char*, size_t);
-__cow_string(const __cow_string&) _GLIBCXX_USE_NOEXCEPT;
-__cow_string& operator=(const __cow_string&) _GLIBCXX_USE_NOEXCEPT;
+__cow_string(const __cow_string&) _GLIBCXX_NOTHROW;
+__cow_string& operator=(const __cow_string&) _GLIBCXX_NOTHROW;
 ~__cow_string();
 #if __cplusplus >= 201103L
 __cow_string(__cow_string&&) noexcept;
@@ -83,7 +83,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   char _M_bytes[sizeof(__str)];
 };
 
-__sso_string() _GLIBCXX_USE_NOEXCEPT;
+__sso_string() _GLIBCXX_NOTHROW;
 __sso_string(const std::string&);
 __sso_string(const char*, size_t);
 __sso_string(const __sso_string&);
@@ -122,19 +122,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cplusplus >= 201103L
 explicit
 logic_error(const char*) _GLIBCXX_TXN_SAFE;
+
+logic_error(logic_error&&) noexcept;
+logic_error& operator=(logic_error&&) noexcept;
 #endif
 
 #if _GLIBCXX_USE_CXX11_ABI || _GLIBCXX_DEFINE_STDEXCEPT_COPY_OPS
-logic_error(const logic_error&) _GLIBCXX_USE_NOEXCEPT;
-logic_error& operator=(const logic_error&) _GLIBCXX_USE_NOEXCEPT;
+logic_error(const logic_error&) _GLIBCXX_NOTHROW;
+logic_error& operator=(const logic_error&) _GLIBCXX_NOTHROW;
+#elif __cplusplus >= 201103L
+logic_error(const logic_error&) = default;
+logic_error& operator=(const logic_error&) = default;
 #endif
 
-virtual ~logic_error() _GLIBCXX_TXN_SAFE_DYN _GLIBCXX_USE_NOEXCEPT;
+virtual ~logic_error() _GLIBCXX_TXN_SAFE_DYN _GLIBCXX_NOTHROW;
 
 /** Returns a C-style character string describin

Re: [PATCH 0/3][POPCOUNT]

2018-07-05 Thread Jeff Law
On 06/24/2018 08:41 PM, Kugan Vivekanandarajah wrote:
> Hi Jeff,
> 
> Thanks for the comments.
> 
> On 23 June 2018 at 02:06, Jeff Law  wrote:
>> On 06/22/2018 03:11 AM, Kugan Vivekanandarajah wrote:
>>> When we set niter with maybe_zero, currently final_value_replacement
>>> will not happen because expression_expensive_p does not handle it. Patch 1
>>> adds this.
>>>
>>> With that we have the following optimized gimple.
>>>
>>>[local count: 118111601]:
>>>   if (b_4(D) != 0)
>>> goto ; [89.00%]
>>>   else
>>> goto ; [11.00%]
>>>
>>>[local count: 105119324]:
>>>   _2 = (unsigned long) b_4(D);
>>>   _9 = __builtin_popcountl (_2);
>>>   c_3 = b_4(D) != 0 ? _9 : 1;
>>>
>>>[local count: 118111601]:
>>>   # c_12 = PHI 
>>>
>>> I assume that 1 in  b_4(D) != 0 ? _9 : 1; is OK (?) because the
>>> latch executing zero times for b_4 == 0 means that the body will execute
>>> once.
>> ISTM that DOM ought to have simplified the conditional, unless there's
>> some other way to get to bb3.  We know that b_4 is nonzero and thus c_3
>> must have the value _9.
> As of now, dom is not optimizing it. With the attached hack, it can be made 
> to.
What's strange is I'm not getting the c_3 = (b_4 != 0) ... in any of the
dumps I'm looking at.  Instead it's c_3 = _9, which is what I would
expect since we know that b_4 != 0
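
To spell out the simplification I'm expecting, here is a sketch at the source
level.  The actual test case is not shown in this thread, so both functions
below are assumptions about its general shape: count_bits_loop is the classic
clear-lowest-set-bit loop, and guarded_select mirrors the dump quoted above,
where the inner "b != 0" test sits inside a block already guarded by the same
condition and therefore always picks the popcount result.

```cpp
// Classic loop form: clears the lowest set bit until nothing is left.
int count_bits_loop(unsigned long b) {
  int c = 0;
  while (b) {
    b &= b - 1;
    ++c;
  }
  return c;
}

// Shape of the quoted dump: the select is only reached when b != 0,
// so its condition is redundant and the result is just the popcount.
int guarded_select(unsigned long b) {
  if (b != 0) {
    int pc = __builtin_popcountl(b);  // GCC/Clang builtin
    return b != 0 ? pc : 1;
  }
  return 0;
}
```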


My tests have been on x86_64 and aarch64 linux targets.  I've tried with
patch#1 installed as well as with patch #1 and patch #2 together.

What target, what flags and what patches do I need to see this?

Jeff


[PATCH, Ada] RISC-V: Initial riscv linux Ada port.

2018-07-05 Thread Jim Wilson
I was asked about Ada support, so I tried cross building a native RISC-V Linux
Ada compiler, and it turned out to be possible with a little bit of work.  I
just started with the MIPS support, and then fixed everything that was
obviously wrong: endianness, error numbers, signal numbers, struct_sigaction
offsets, etc.

The result is good enough to bootstrap natively and seems to give reasonable
native testsuite results for a first attempt.  The machine I'm running on has
broken icache flushing, so trampolines won't work, and I suspect that is
causing a lot of the testsuite failures.  Here are the Ada testsuite results
I'm getting at the moment.

=== acats Summary ===
# of expected passes            2138
# of unexpected failures        182

=== gnat Summary ===

# of expected passes            2757
# of unexpected failures        26
# of expected failures          24
# of unsupported tests          25

Ada is a low priority side project for me, so if you want non-trivial changes
it may be a while before I can get to them.  There is a lot of other stuff
higher on my priority list at the moment, such as getting native gdb support
working.  If this isn't OK as is, then I'm willing to put work-in-progress
patches in a bug report or on a branch or something.

OK?

Jim

gcc/ada/
* Makefile.rtl: Add riscv*-linux* support.
* libgnarl/s-linux__riscv.ads: New.
* libgnat/system-linux-riscv.ads: New.
---
 gcc/ada/Makefile.rtl   |  28 +
 gcc/ada/libgnarl/s-linux__riscv.ads| 133 ++
 gcc/ada/libgnat/system-linux-riscv.ads | 147 +
 3 files changed, 308 insertions(+)
 create mode 100644 gcc/ada/libgnarl/s-linux__riscv.ads
 create mode 100644 gcc/ada/libgnat/system-linux-riscv.ads

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index f69170d9fe3..374c60b576e 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2468,6 +2468,34 @@ ifeq ($(strip $(filter-out %x32 linux%,$(target_cpu) 
$(target_os))),)
   LIBRARY_VERSION := $(LIB_VERSION)
 endif
 
+# RISC-V Linux
+ifeq ($(strip $(filter-out riscv% linux%,$(target_cpu) $(target_os))),)
+  LIBGNAT_TARGET_PAIRS = \
+  a-intnam.adshttp://www.gnu.org/licenses/>.  --
+--  --
+--
+
+--  This is the RISC-V version of this package
+
+--  This package encapsulates cpu specific differences between implementations
+--  of GNU/Linux, in order to share s-osinte-linux.ads.
+
+--  PLEASE DO NOT add any with-clauses to this package or remove the pragma
+--  Preelaborate. This package is designed to be a bottom-level (leaf) package
+
+with Interfaces.C;
+
+package System.Linux is
+   pragma Preelaborate;
+
+   ----------
+   -- Time --
+   ----------
+
+   subtype int         is Interfaces.C.int;
+   subtype long        is Interfaces.C.long;
+   subtype suseconds_t is Interfaces.C.long;
+   subtype time_t      is Interfaces.C.long;
+   subtype clockid_t   is Interfaces.C.int;
+
+   type timespec is record
+  tv_sec  : time_t;
+  tv_nsec : long;
+   end record;
+   pragma Convention (C, timespec);
+
+   type timeval is record
+  tv_sec  : time_t;
+  tv_usec : suseconds_t;
+   end record;
+   pragma Convention (C, timeval);
+
+   -----------
+   -- Errno --
+   -----------
+
+   EAGAIN: constant := 11;
+   EINTR : constant := 4;
+   EINVAL: constant := 22;
+   ENOMEM: constant := 12;
+   EPERM : constant := 1;
+   ETIMEDOUT : constant := 110;
+
+   -------------
+   -- Signals --
+   -------------
+
+   SIGHUP : constant := 1; --  hangup
+   SIGINT : constant := 2; --  interrupt (rubout)
+   SIGQUIT: constant := 3; --  quit (ASCD FS)
+   SIGILL : constant := 4; --  illegal instruction (not reset)
+   SIGTRAP: constant := 5; --  trace trap (not reset)
+   SIGIOT : constant := 6; --  IOT instruction
+   SIGABRT: constant := 6; --  used by abort, replace SIGIOT in the  future
+   SIGBUS : constant := 7; --  bus error
+   SIGFPE : constant := 8; --  floating point exception
+   SIGKILL: constant := 9; --  kill (cannot be caught or ignored)
+   SIGUSR1: constant := 10; --  user defined signal 1
+   SIGSEGV: constant := 11; --  segmentation violation
+   SIGUSR2: constant := 12; --  user defined signal 2
+   SIGPIPE: constant := 13; --  write on a pipe with no one to read it
+   SIGALRM: constant := 14; --  alarm clock
+   SIGTERM: constant := 15; --  software termination signal from kill
+   SIGSTKFLT  : constant := 16; --  coprocessor stack fault (Linux)
+   SIGCLD : constant := 17; --  alias for SIGCHLD
+   SIGCHLD: constant := 17; --  child status change
+   SIGCONT: constant := 18; --  stopped process has been continued
+   SIGSTOP: constant := 19; -- 

[PATCH, Ada] Makefile patches from initial RISC-V cross/native build.

2018-07-05 Thread Jim Wilson
These are some patches I needed to complete my cross build of a native
riscv linux Ada compiler.  Some paths were different on the build machine
and host machine.  I needed to pass options into gnatmake to work around this,
and that required fixing some makefile rules to use $(GNATMAKE) instead of
calling gnatmake directly.

Tested with native riscv-linux bootstrap with Ada enabled.

OK?

Jim

gcc/ada/
* Make-generated.in (treeprs.ads): Use $(GNATMAKE) instead of gnatmake.
(einfo.h, sinfo.h, stamp-snames, stamp-nmake): Likewise.
* gcc-interface/Makefile.in (xoscons): Likewise.
---
 gcc/ada/Make-generated.in | 10 +-
 gcc/ada/gcc-interface/Makefile.in |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/Make-generated.in b/gcc/ada/Make-generated.in
index 757eaa85b90..bdcb62c4e56 100644
--- a/gcc/ada/Make-generated.in
+++ b/gcc/ada/Make-generated.in
@@ -28,21 +28,21 @@ $(ADA_GEN_SUBDIR)/treeprs.ads : 
$(ADA_GEN_SUBDIR)/treeprs.adt $(ADA_GEN_SUBDIR)/
-$(MKDIR) $(ADA_GEN_SUBDIR)/bldtools/treeprs
$(RM) $(addprefix $(ADA_GEN_SUBDIR)/bldtools/treeprs/,$(notdir $^))
$(CP) $^ $(ADA_GEN_SUBDIR)/bldtools/treeprs
-   (cd $(ADA_GEN_SUBDIR)/bldtools/treeprs; gnatmake -q xtreeprs ; 
./xtreeprs treeprs.ads )
+   (cd $(ADA_GEN_SUBDIR)/bldtools/treeprs; $(GNATMAKE) -q xtreeprs ; 
./xtreeprs treeprs.ads )
$(MOVE_IF_CHANGE) $(ADA_GEN_SUBDIR)/bldtools/treeprs/treeprs.ads 
$(ADA_GEN_SUBDIR)/treeprs.ads
 
 $(ADA_GEN_SUBDIR)/einfo.h : $(ADA_GEN_SUBDIR)/einfo.ads 
$(ADA_GEN_SUBDIR)/einfo.adb $(ADA_GEN_SUBDIR)/xeinfo.adb 
$(ADA_GEN_SUBDIR)/ceinfo.adb
-$(MKDIR) $(ADA_GEN_SUBDIR)/bldtools/einfo
$(RM) $(addprefix $(ADA_GEN_SUBDIR)/bldtools/einfo/,$(notdir $^))
$(CP) $^ $(ADA_GEN_SUBDIR)/bldtools/einfo
-   (cd $(ADA_GEN_SUBDIR)/bldtools/einfo; gnatmake -q xeinfo ; ./xeinfo 
einfo.h )
+   (cd $(ADA_GEN_SUBDIR)/bldtools/einfo; $(GNATMAKE) -q xeinfo ; ./xeinfo 
einfo.h )
$(MOVE_IF_CHANGE) $(ADA_GEN_SUBDIR)/bldtools/einfo/einfo.h 
$(ADA_GEN_SUBDIR)/einfo.h
 
 $(ADA_GEN_SUBDIR)/sinfo.h : $(ADA_GEN_SUBDIR)/sinfo.ads 
$(ADA_GEN_SUBDIR)/sinfo.adb $(ADA_GEN_SUBDIR)/xsinfo.adb 
$(ADA_GEN_SUBDIR)/csinfo.adb
-$(MKDIR) $(ADA_GEN_SUBDIR)/bldtools/sinfo
$(RM) $(addprefix $(ADA_GEN_SUBDIR)/bldtools/sinfo/,$(notdir $^))
$(CP) $^ $(ADA_GEN_SUBDIR)/bldtools/sinfo
-   (cd $(ADA_GEN_SUBDIR)/bldtools/sinfo; gnatmake -q xsinfo ; ./xsinfo 
sinfo.h )
+   (cd $(ADA_GEN_SUBDIR)/bldtools/sinfo; $(GNATMAKE) -q xsinfo ; ./xsinfo 
sinfo.h )
$(MOVE_IF_CHANGE) $(ADA_GEN_SUBDIR)/bldtools/sinfo/sinfo.h 
$(ADA_GEN_SUBDIR)/sinfo.h
 
 $(ADA_GEN_SUBDIR)/snames.h $(ADA_GEN_SUBDIR)/snames.ads 
$(ADA_GEN_SUBDIR)/snames.adb : $(ADA_GEN_SUBDIR)/stamp-snames ; @true
@@ -50,7 +50,7 @@ $(ADA_GEN_SUBDIR)/stamp-snames : 
$(ADA_GEN_SUBDIR)/snames.ads-tmpl $(ADA_GEN_SUB
-$(MKDIR) $(ADA_GEN_SUBDIR)/bldtools/snamest
$(RM) $(addprefix $(ADA_GEN_SUBDIR)/bldtools/snamest/,$(notdir $^))
$(CP) $^ $(ADA_GEN_SUBDIR)/bldtools/snamest
-   (cd $(ADA_GEN_SUBDIR)/bldtools/snamest; gnatmake -q xsnamest ; 
./xsnamest )
+   (cd $(ADA_GEN_SUBDIR)/bldtools/snamest; $(GNATMAKE) -q xsnamest ; 
./xsnamest )
$(MOVE_IF_CHANGE) $(ADA_GEN_SUBDIR)/bldtools/snamest/snames.ns 
$(ADA_GEN_SUBDIR)/snames.ads
$(MOVE_IF_CHANGE) $(ADA_GEN_SUBDIR)/bldtools/snamest/snames.nb 
$(ADA_GEN_SUBDIR)/snames.adb
$(MOVE_IF_CHANGE) $(ADA_GEN_SUBDIR)/bldtools/snamest/snames.nh 
$(ADA_GEN_SUBDIR)/snames.h
@@ -61,7 +61,7 @@ $(ADA_GEN_SUBDIR)/stamp-nmake: $(ADA_GEN_SUBDIR)/sinfo.ads 
$(ADA_GEN_SUBDIR)/nma
-$(MKDIR) $(ADA_GEN_SUBDIR)/bldtools/nmake
$(RM) $(addprefix $(ADA_GEN_SUBDIR)/bldtools/nmake/,$(notdir $^))
$(CP) $^ $(ADA_GEN_SUBDIR)/bldtools/nmake
-   (cd $(ADA_GEN_SUBDIR)/bldtools/nmake; gnatmake -q xnmake ; ./xnmake -b 
nmake.adb ; ./xnmake -s nmake.ads)
+   (cd $(ADA_GEN_SUBDIR)/bldtools/nmake; $(GNATMAKE) -q xnmake ; ./xnmake 
-b nmake.adb ; ./xnmake -s nmake.ads)
$(MOVE_IF_CHANGE) $(ADA_GEN_SUBDIR)/bldtools/nmake/nmake.ads 
$(ADA_GEN_SUBDIR)/nmake.ads
$(MOVE_IF_CHANGE) $(ADA_GEN_SUBDIR)/bldtools/nmake/nmake.adb 
$(ADA_GEN_SUBDIR)/nmake.adb
touch $(ADA_GEN_SUBDIR)/stamp-nmake
diff --git a/gcc/ada/gcc-interface/Makefile.in 
b/gcc/ada/gcc-interface/Makefile.in
index 9a52e6d8edb..601f23afc1c 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -613,7 +613,7 @@ OSCONS_EXTRACT=$(OSCONS_CC) $(GNATLIBCFLAGS_FOR_C) -S 
s-oscons-tmplt.i
-$(MKDIR) ./bldtools/oscons
$(RM) $(addprefix ./bldtools/oscons/,$(notdir $^))
$(CP) $^ ./bldtools/oscons
-   (cd ./bldtools/oscons ; gnatmake -q xoscons)
+   (cd ./bldtools/oscons ; $(GNATMAKE) -q xoscons)
 
 $(RTSDIR)/s-oscons.ads: ../stamp-gnatlib1-$(RTSDIR) s-oscons-tmplt.c gsocket.h 
./bldtools/oscons/xoscons
$(RM) $(RTSD

[PATCH] fold strlen() of aggregate members (PR 77357)

2018-07-05 Thread Martin Sebor

GCC folds accesses to members of constant aggregates except
for character arrays/strings.  For example, the strlen() call
below is not folded:

  const char a[][4] = { "1", "12" };

  int f (void) { return strlen (a[1]); }

The attached change set enhances the string_constant() function
to make it possible to extract string constants from aggregate
initializers (CONSTRUCTORS).

The initial solution was much simpler but as is often the case,
MEM_REF made it fail to fold things like:

  int f (void) { return strlen (a[1] + 1); }

Handling those made the project a bit more interesting and
the final solution somewhat more involved.

To handle offsets into aggregate string members the patch also
extends the fold_ctor_reference() function to extract entire
string array initializers even if the offset points past
the beginning of the string and even though the size and
exact type of the reference are not known (there isn't enough
information in a MEM_REF to determine that).
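
As an illustration only, the two calls discussed above can be put into one self-contained C file. The function names len_member, len_member_plus_1 and check_lengths are made up for this sketch. It checks the values the calls must produce, not whether a particular GCC version actually folds them at compile time.

```c
#include <assert.h>
#include <string.h>

/* Array of fixed-size character arrays: a[0] holds "1", a[1] holds "12".
   The unused trailing bytes of each element are implicitly zero.  */
static const char a[][4] = { "1", "12" };

/* strlen of a whole member string.  */
size_t len_member (void) { return strlen (a[1]); }

/* strlen at an offset into the member; this is the form the message
   above says was initially not folded because of MEM_REF.  */
size_t len_member_plus_1 (void) { return strlen (a[1] + 1); }

/* Sanity check of the expected results.  */
void check_lengths (void)
{
  assert (len_member () == 2);
  assert (len_member_plus_1 () == 1);
}
```

Compiling such a file with -O2 -fdump-tree-optimized and looking for remaining strlen calls in the dump is one way to see whether folding took place.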

Tested along with the patch for PR 86415 on x86_64-linux.

Martin
PR middle-end/77357 - strlen of constant strings not folded

gcc/ChangeLog:

	* builtins.c (c_strlen): Avoid out-of-bounds warnings when
	accessing implicitly initialized array elements.
	* expr.c (string_constant): Handle string initializers of
	character arrays within aggregates.
	* gimple-fold.c (fold_array_ctor_reference): Add argument.
	Store element offset.  As a special case, handle zero size.
	(fold_nonarray_ctor_reference): Same.
	(fold_ctor_reference): Add argument.  Store subobject offset.
	* gimple-fold.h (fold_ctor_reference): Add argument.

gcc/testsuite/ChangeLog:

	PR middle-end/77357
	* gcc.dg/strlenopt-49.c: New test.
	* gcc.dg/strlenopt-50.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 91658e8..2f9d5d7 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -602,8 +602,15 @@ c_strlen (tree src, int only_value)
 = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (src;
 
   /* Set MAXELTS to sizeof (SRC) / sizeof (*SRC) - 1, the maximum possible
- length of SRC.  */
-  unsigned maxelts = TREE_STRING_LENGTH (src) / eltsize - 1;
+ length of SRC.  Prefer TYPE_SIZE() to TREE_STRING_LENGTH() if possible
+ in case the latter is less than the size of the array.  */
+  HOST_WIDE_INT maxelts = TREE_STRING_LENGTH (src);
+  tree type = TREE_TYPE (src);
+  if (tree size = TYPE_SIZE_UNIT (type))
+if (tree_fits_shwi_p (size))
+  maxelts = tree_to_uhwi (size);
+
+  maxelts = maxelts / eltsize - 1;
 
   /* PTR can point to the byte representation of any string type, including
  char* and wchar_t*.  */
diff --git a/gcc/expr.c b/gcc/expr.c
index 56751df..be3ab93 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -54,11 +54,13 @@ along with GCC; see the file COPYING3.  If not see
 #include "reload.h"
 #include "langhooks.h"
 #include "common/common-target.h"
+#include "tree-dfa.h"
 #include "tree-ssa-live.h"
 #include "tree-outof-ssa.h"
 #include "tree-ssa-address.h"
 #include "builtins.h"
 #include "ccmp.h"
+#include "gimple-fold.h"
 #include "rtx-vector-builder.h"
 
 
@@ -11274,54 +11276,20 @@ is_aligning_offset (const_tree offset, const_tree exp)
 tree
 string_constant (tree arg, tree *ptr_offset)
 {
-  tree array, offset, lower_bound;
+  tree array;
   STRIP_NOPS (arg);
 
+  poly_int64 base_off = 0;
+
   if (TREE_CODE (arg) == ADDR_EXPR)
 {
-  if (TREE_CODE (TREE_OPERAND (arg, 0)) == STRING_CST)
-	{
-	  *ptr_offset = size_zero_node;
-	  return TREE_OPERAND (arg, 0);
-	}
-  else if (TREE_CODE (TREE_OPERAND (arg, 0)) == VAR_DECL)
-	{
-	  array = TREE_OPERAND (arg, 0);
-	  offset = size_zero_node;
-	}
-  else if (TREE_CODE (TREE_OPERAND (arg, 0)) == ARRAY_REF)
-	{
-	  array = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
-	  offset = TREE_OPERAND (TREE_OPERAND (arg, 0), 1);
-	  if (TREE_CODE (array) != STRING_CST && !VAR_P (array))
-	return 0;
-
-	  /* Check if the array has a nonzero lower bound.  */
-	  lower_bound = array_ref_low_bound (TREE_OPERAND (arg, 0));
-	  if (!integer_zerop (lower_bound))
-	{
-	  /* If the offset and base aren't both constants, return 0.  */
-	  if (TREE_CODE (lower_bound) != INTEGER_CST)
-	return 0;
-	  if (TREE_CODE (offset) != INTEGER_CST)
-		return 0;
-	  /* Adjust offset by the lower bound.  */
-	  offset = size_diffop (fold_convert (sizetype, offset),
-fold_convert (sizetype, lower_bound));
-	}
-	}
-  else if (TREE_CODE (TREE_OPERAND (arg, 0)) == MEM_REF)
-	{
-	  array = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
-	  offset = TREE_OPERAND (TREE_OPERAND (arg, 0), 1);
-	  if (TREE_CODE (array) != ADDR_EXPR)
-	return 0;
-	  array = TREE_OPERAND (array, 0);
-	  if (TREE_CODE (array) != STRING_CST && !VAR_P (array))
-	return 0;
-	}
-  else
-	return 0;
+  arg = TREE_OPERAND (arg, 0);
+  array = get_addr_base_and_unit_offset (arg, &base_off);
+  if (!array
+	  || (TREE_CODE (array) != VAR_DECL
+	  && TREE_CODE (array) !=

[PATCH] fold strlen() of substrings within strings (PR 86415)

2018-07-05 Thread Martin Sebor

GCC folds strlen() calls to empty substrings within arrays
whose elements are explicitly initialized with NULs but
fails to do the same for elements that are zeroed out
implicitly.  For example:

  const char a[7] = "123\000\000\000";
  int f (void)
  {
return strlen (a + 5);   // folded
  }

but

  const char b[7] = "123";
  int g (void)
  {
return strlen (b + 5);   // not folded
  }

This is because the c_getstr() function only considers
the leading TREE_STRING_LENGTH() number of elements of
an array and not also the remaining elements up to the full
size of the array the string may be stored in.

The attached patch enhances the function to also consider
those elements.  If there are more elements in the array
than TREE_STRING_LENGTH() evaluates to they must be all
NULs because the array is constant and initialized.
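
The rule described above can be modelled outside the compiler. The sketch below is not c_getstr() itself: substr_len and its parameters are hypothetical, literal_len stands in for TREE_STRING_LENGTH() (assumed here to count the terminating NUL), and array_size stands in for the size of the array. Unlike the patch, it reports the C-level strlen() of the substring.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the substring rule.  Return 1 and store the
   length of the substring at OFFSET in *LEN, or return 0 if the literal
   is not NUL-terminated or OFFSET lies outside the array.  */
int substr_len (const char *literal, size_t literal_len,
                size_t array_size, size_t offset, size_t *len)
{
  if (literal_len == 0
      || literal[literal_len - 1] != '\0'
      || offset >= array_size)
    return 0;

  if (offset < literal_len)
    /* Inside the explicitly initialized part.  */
    *len = strlen (literal + offset);
  else
    /* Past the literal but inside the array: implicit NULs only.  */
    *len = 0;
  return 1;
}
```

For the array b above, substr_len ("123", 4, 7, 5, &n) stores 0 in n, while an offset of 7 is rejected because it is outside the array.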

Tested along with the patch for PR 77357 on x86_64-linux.

Martin
PR tree-optimization/86415 - strlen() not folded for substrings within constant arrays

gcc/ChangeLog:

	PR tree-optimization/86415
	* fold-const.c (c_getstr): Handle substrings.

gcc/testsuite/ChangeLog:

	PR tree-optimization/86415
	* gcc.dg/strlenopt-48.c: New test.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 8476c22..1729348 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -14567,14 +14567,13 @@ fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE_INT off)
 			  ptr, size_int (off));
 }
 
-/* Return a char pointer for a C string if it is a string constant
-   or sum of string constant and integer constant.  We only support
-   string constants properly terminated with '\0' character.
-   If STRLEN is a valid pointer, length (including terminating character)
-   of returned string is stored to the argument.  */
+/* Return a char pointer for a C string if SRC refers to a NUL
+   terminated string constant or a NUL-terminated substring at
+   some offset within one.  If STRLEN is non-null, store
+   the length of the returned string in *STRLEN.  */
 
 const char *
-c_getstr (tree src, unsigned HOST_WIDE_INT *strlen)
+c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */)
 {
   tree offset_node;
 
@@ -14594,18 +14593,37 @@ c_getstr (tree src, unsigned HOST_WIDE_INT *strlen)
 	offset = tree_to_uhwi (offset_node);
 }
 
+  /* STRING_LENGTH is the size of the string literal, including any
+ embedded NULs.  STRING_SIZE is the size of the array the string
+ literal is stored in.  */
   unsigned HOST_WIDE_INT string_length = TREE_STRING_LENGTH (src);
+  unsigned HOST_WIDE_INT string_size = string_length;
+  tree type = TREE_TYPE (src);
+  if (tree size = TYPE_SIZE_UNIT (type))
+if (tree_fits_shwi_p (size))
+  string_size = tree_to_uhwi (size);
+
   const char *string = TREE_STRING_POINTER (src);
 
-  /* Support only properly null-terminated strings.  */
+  /* Support only properly null-terminated strings but handle
+ consecutive strings within the same array, such as the six
+ substrings in "1\0002\0003".  */
   if (string_length == 0
   || string[string_length - 1] != '\0'
-  || offset >= string_length)
+  || offset >= string_size)
 return NULL;
 
   if (strlen)
-*strlen = string_length - offset;
-  return string + offset;
+{
+  /* Compute and store the length of the substring at OFFSET.
+	 All offsets past the initial length refer to null strings.  */
+  if (offset <= string_length)
+	*strlen = string_length - offset;
+  else
+	*strlen = 0;
+}
+
+  return offset <= string_length ? string + offset : "";
 }
 
 /* Given a tree T, compute which bits in T may be nonzero.  */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-48.c b/gcc/testsuite/gcc.dg/strlenopt-48.c
new file mode 100644
index 000..1d1d368
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-48.c
@@ -0,0 +1,116 @@
+/* PR tree-optimization/86415 - strlen() not folded for substrings
+   within constant arrays
+   { dg-do compile }
+   { dg-options "-O2 -Wall -fdump-tree-gimple -fdump-tree-ccp" } */
+
+#include "strlenopt.h"
+
+#define CONCAT(x, y) x ## y
+#define CAT(x, y) CONCAT (x, y)
+#define FAILNAME(name) CAT (call_ ## name ##_on_line_, __LINE__)
+
+#define FAIL(name) do {\
+extern void FAILNAME (name) (void);		\
+FAILNAME (name)();\
+  } while (0)
+
+/* Macro to emit a call to funcation named
+ call_in_true_branch_not_eliminated_on_line_NNN()
+   for each call that's expected to be eliminated.  The dg-final
+   scan-tree-dump-time directive at the bottom of the test verifies
+   that no such call appears in output.  */
+#define ELIM(expr) \
+  if (!(expr)) FAIL (in_true_branch_not_eliminated); else (void)0
+
+#define T(s, n) ELIM (strlen (s) == n)
+
+/*  1
+	 0 1  23 4  567 8  901234  */
+#define STR "1\00012\000123\0001234\0"
+
+const char a[]   = STR;
+const char b[20] = STR;
+
+void test_literal (void)
+{
+  /* Verify that strlen() of substrings within a string literal are
+ correctly folded.

Re: [PATCH] fold strlen() of substrings within strings (PR 86415)

2018-07-05 Thread Jeff Law
On 07/05/2018 05:54 PM, Martin Sebor wrote:
> GCC folds strlen() calls to empty substrings within arrays
> whose elements are explicitly initialized with NULs but
> fails to do the same for elements that are zeroed out
> implicitly.  For example:
> 
>   const char a[7] = "123\000\000\000";
>   int f (void)
>   {
> return strlen (a + 5);   // folded
>   }
> 
> but
> 
>   const char b[7] = "123";
>   int g (void)
>   {
> return strlen (b + 5);   // not folded
>   }
> 
> This is because the c_getstr() function only considers
> the leading TREE_STRING_LENGTH() number of elements of
> an array and not also the remaining elements up to the full
> size of the array the string may be stored in.
> 
> The attached patch enhances the function to also consider
> those elements.  If there are more elements in the array
> than TREE_STRING_LENGTH() evaluates to they must be all
> NULs because the array is constant and initialized.
> 
> Tested along with the patch for PR 77357 on x86_64-linux.
> 
> Martin
> 
> gcc-86415.diff
> 
> 
> PR tree-optimization/86415 - strlen() not folded for substrings within 
> constant arrays
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/86415
>   * fold-const.c (c_getstr): Handle substrings.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/86415
>   * gcc.dg/strlenopt-48.c: New test.
OK.
jeff


[PATCH] RISC-V: Add support for riscv-*-*.

2018-07-05 Thread Jim Wilson
Support for riscv-* was added to config.sub upstream, so we need to handle it
in gcc configure.  Just one place needs to be fixed for now to make this work.

Tested with riscv{32,64}-{elf,linux} and riscv-elf cross builds.

Committed.

Jim

gcc/
* config.gcc (riscv*-*-*): When setting xlen, handle riscv-*.
---
 gcc/config.gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 63162aab676..78e84c2b864 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4091,7 +4091,7 @@ case "${target}" in
supported_defaults="abi arch tune"
 
case "${target}" in
-   riscv32*) xlen=32 ;;
+   riscv-* | riscv32*) xlen=32 ;;
riscv64*) xlen=64 ;;
*) echo "Unsupported RISC-V target ${target}" 1>&2; exit 1 ;;
esac
-- 
2.17.1



Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Jim Wilson
On Thu, Jul 5, 2018 at 12:22 PM, Liviu Ionescu  wrote:
>> On 5 Jul 2018, at 22:17, Jim Wilson  wrote:
>> ... I can fix the gcc config.sub to make it work.
> Or you can edit `gcc/config.gcc` and trigger an error for `riscv-linux*`.

I added patches to binutils and gcc to make riscv-elf work.
riscv-rtems probably also works as a result of this.  I'm not worrying
about riscv-linux for now.  There may be more stuff that needs to be
fixed to make riscv-* work though.  I only checked binutils and gcc.

Jim


Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Jeff Law
On 07/05/2018 09:18 PM, Jim Wilson wrote:
> On Thu, Jul 5, 2018 at 12:22 PM, Liviu Ionescu  wrote:
>>> On 5 Jul 2018, at 22:17, Jim Wilson  wrote:
>>> ... I can fix the gcc config.sub to make it work.
>> Or you can edit `gcc/config.gcc` and trigger an error for `riscv-linux*`.
> 
> I added patches to binutils and gcc to make riscv-elf work.
> riscv-rtems probably also works as a result of this.  I'm not worrying
> about riscv-linux for now.  There may be more stuff that needs to be
> fixed to make riscv-* work though.  I only checked binutils and gcc.
My tester bootstraps riscv64 linux via qemu/chroot once every 24hrs.
If it fails, I'll pass that info along...

jeff


[committed] [PR tree-optimization/86010] More aggressively trim partially dead mem* and str* calls

2018-07-05 Thread Jeff Law
As noted in BZ 86010 we can be more aggressive when trimming tails of
mem* or str* calls in gimple DSE since trimming a tail doesn't affect
alignment and residuals are usually handled pretty efficiently in libc.

Additionally, if the total number of live bytes left is smaller than a
word, then it's highly likely we'll open-code the mem* or str* routine.
So we allow more aggressive trimming in that case too.

What's left to be able to close out 86010 is to identify when a memory
store could be merged with a subsequent memset.   I'm skeptical of the
importance of that optimization, though perhaps it comes up often enough
with structure initializations to be worth doing.

Bootstrapped and regression tested on x86_64-linux-gnu.  Installing on
the trunk.

Jeff
PR tree-optimization/86010
* tree-ssa-dse.c (compute_trims): More aggressively trim at
both the head and tail of mem* and str* calls.

diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 1af50a0..ebc4a1e 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -240,11 +240,14 @@ compute_trims (ao_ref *ref, sbitmap live, int *trim_head, 
int *trim_tail,
 
   /* Now identify how much, if any of the tail we can chop off.  */
   HOST_WIDE_INT const_size;
+  int last_live = bitmap_last_set_bit (live);
   if (ref->size.is_constant (&const_size))
 {
   int last_orig = (const_size / BITS_PER_UNIT) - 1;
-  int last_live = bitmap_last_set_bit (live);
-  *trim_tail = (last_orig - last_live) & ~0x1;
+  /* We can leave inconvenient amounts on the tail as
+residual handling in mem* and str* functions is usually
+reasonably efficient.  */
+  *trim_tail = last_orig - last_live;
 }
   else
 *trim_tail = 0;
@@ -252,7 +255,12 @@ compute_trims (ao_ref *ref, sbitmap live, int *trim_head, 
int *trim_tail,
   /* Identify how much, if any of the head we can chop off.  */
   int first_orig = 0;
   int first_live = bitmap_first_set_bit (live);
-  *trim_head = (first_live - first_orig) & ~0x1;
+  *trim_head = first_live - first_orig;
+
+  /* If more than a word remains, then make sure to keep the
+ starting point at least word aligned.  */
+  if (last_live - first_live > UNITS_PER_WORD)
+*trim_head &= (UNITS_PER_WORD - 1);
 
   if ((*trim_head || *trim_tail)
   && dump_file && (dump_flags & TDF_DETAILS))
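
For readers who want to experiment with the trimming arithmetic, here is a stand-alone sketch. It is not the GCC code: model_trims, the byte array live and the parameter word (standing in for UNITS_PER_WORD, assumed to be a power of two) are all hypothetical. Note that the sketch rounds the head trim down to a multiple of the word size with the complemented mask ~(word - 1), the usual idiom for the behaviour the comment in the hunk describes, whereas the hunk above is written without the complement.

```c
#include <assert.h>

/* Hypothetical model: LIVE[i] is nonzero if byte I of an N-byte store is
   still live.  Compute how many dead bytes can be dropped from the head
   and from the tail.  At least one byte must be live.  */
void model_trims (const unsigned char *live, int n, int word,
                  int *trim_head, int *trim_tail)
{
  int first_live = 0;
  while (first_live < n && !live[first_live])
    first_live++;
  assert (first_live < n);

  int last_live = n - 1;
  while (last_live > first_live && !live[last_live])
    last_live--;

  /* The tail can be trimmed to any length.  */
  *trim_tail = (n - 1) - last_live;
  *trim_head = first_live;

  /* If more than a word remains, keep the start word aligned by
     rounding the head trim down to a multiple of WORD.  */
  if (last_live - first_live > word)
    *trim_head &= ~(word - 1);
}
```

With a 32-byte store, an 8-byte word and live bytes 11 through 30, the model yields a head trim of 8 and a tail trim of 1. With a 16-byte store and live bytes 9 through 15, a word or less remains, so the head trim is the full 9 bytes.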


[committed] Fix tree-ssa/asm-2.c on the v850

2018-07-05 Thread Jeff Law


r0 on the v850 is a hardwired 0 value.  For reasons unknown I exposed it
in the register file.

This runs afoul of tree-ssa/asm-2.c which has a local variable
explicitly assigned to register 0.  This naturally blows up.

The fix is trivial, use a different register like other ports do.

Installing on the trunk.

Jeff
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 8496a38c291..4952b18983f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2018-07-05  Jeff Law  
+
+   * gcc.dg/tree-ssa/asm-2.c (REGISTER): Override for v850 too.
+
 2018-07-05  Paul Thomas  
 
PR fortran/86408
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/asm-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/asm-2.c
index 4dc4a9d6c6a..00c3079181d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/asm-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/asm-2.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-optimized" } */
 
-#ifdef __hppa__
+#if defined(__hppa__) || defined(__v850__)
 #define REGISTER "1"
 #else
 #ifdef __moxie__


Re: [PATCH 0/3][POPCOUNT]

2018-07-05 Thread Richard Biener
On July 6, 2018 12:03:11 AM GMT+02:00, Jeff Law  wrote:
>On 06/24/2018 08:41 PM, Kugan Vivekanandarajah wrote:
>> Hi Jeff,
>> 
>> Thanks for the comments.
>> 
>> On 23 June 2018 at 02:06, Jeff Law  wrote:
>>> On 06/22/2018 03:11 AM, Kugan Vivekanandarajah wrote:
 When we set niter with maybe_zero, currently final_value_relacement
 will not happen due to expression_expensive_p not handling. Patch 1
 adds this.

 With that we have the following optimized gimple.

[local count: 118111601]:
   if (b_4(D) != 0)
 goto ; [89.00%]
   else
 goto ; [11.00%]

[local count: 105119324]:
   _2 = (unsigned long) b_4(D);
   _9 = __builtin_popcountl (_2);
   c_3 = b_4(D) != 0 ? _9 : 1;

[local count: 118111601]:
   # c_12 = PHI 

 I assume that 1 in  b_4(D) != 0 ? _9 : 1; is OK (?) because when
>the
 latch execute zero times for b_4 == 0 means that the body will
>execute
 ones.
>>> ISTM that DOM ought to have simplified the conditional, unless
>there's
>>> some other way to get to bb3.  We know that b_4 is nonzero and thus
>c_3
>>> must have the value _9.
>> As of now, dom is not optimizing it. With the attached hack, it can
>be made to.
>What's strange is I'm not getting the c_3 = (b_4 != 0) ... in any of
>the
>dumps I'm looking at.  Instead it's c_3 = _9, which is what I would
>expect since we know that b_4 != 0
>
>
>My tests have been on x86_64 and aarch64 linux targets.  I've tried
>with
>patch#1 installed as well as with patch #1 and patch #2 together.
>
>What target, what flags and what patches do I need to see this?

I believe it has been fixed in niters analysis to avoid the condition if it is 
known to be true. 

Richard. 

>Jeff



Re: [PATCH 0/3][POPCOUNT]

2018-07-05 Thread Jeff Law
On 07/05/2018 11:39 PM, Richard Biener wrote:
> On July 6, 2018 12:03:11 AM GMT+02:00, Jeff Law  wrote:
>> On 06/24/2018 08:41 PM, Kugan Vivekanandarajah wrote:
>>> Hi Jeff,
>>>
>>> Thanks for the comments.
>>>
>>> On 23 June 2018 at 02:06, Jeff Law  wrote:
 On 06/22/2018 03:11 AM, Kugan Vivekanandarajah wrote:
> When we set niter with maybe_zero, currently final_value_relacement
> will not happen due to expression_expensive_p not handling. Patch 1
> adds this.
>
> With that we have the following optimized gimple.
>
>[local count: 118111601]:
>   if (b_4(D) != 0)
> goto ; [89.00%]
>   else
> goto ; [11.00%]
>
>[local count: 105119324]:
>   _2 = (unsigned long) b_4(D);
>   _9 = __builtin_popcountl (_2);
>   c_3 = b_4(D) != 0 ? _9 : 1;
>
>[local count: 118111601]:
>   # c_12 = PHI 
>
> I assume that 1 in  b_4(D) != 0 ? _9 : 1; is OK (?) because when
>> the
> latch execute zero times for b_4 == 0 means that the body will
>> execute
> ones.
 ISTM that DOM ought to have simplified the conditional, unless
>> there's
 some other way to get to bb3.  We know that b_4 is nonzero and thus
>> c_3
 must have the value _9.
>>> As of now, dom is not optimizing it. With the attached hack, it can
>> be made to.
>> What's strange is I'm not getting the c_3 = (b_4 != 0) ... in any of
>> the
>> dumps I'm looking at.  Instead it's c_3 = _9, which is what I would
>> expect since we know that b_4 != 0
>>
>>
>> My tests have been on x86_64 and aarch64 linux targets.  I've tried
>> with
>> patch#1 installed as well as with patch #1 and patch #2 together.
>>
>> What target, what flags and what patches do I need to see this?
> 
> I believe it has been fixed in niters analysis to avoid the condition if it 
> is known to be true. 
Ah.  Presumably the change from 6-16.  I'll back that out and retry.

jeff


[PATCH], Add configuration checks to PowerPC --with-long-double-format=ieee

2018-07-05 Thread Michael Meissner
This patch adds a simple check of whether the GLIBC should be capable of
switching the long double format on the PowerPC to IEEE 128-bit floating point.
At the moment, library work is not yet finished, but I'm assuming that the
patches will be in place when GLIBC 2.28 is released.  If it turns out that the
finished support does not make it until 2.29, we can adjust the patch later.

Right now, if you use standard GLIBC 2.27 or earlier (ignoring the bits that
actually use long double that will need to be handled), you will not be able to
build libstdc++-v3 when long double is configured to be IEEE 128-bit due to
errors with overloaded functions like issignaling (where both __float128 and
long double versions are defined).  The GLIBC team has a fix for this, and it
should appear in 2.28.

This patch checks whether the GLIBC version is 2.28 before allowing you to
switch the long double type.  Because the work to prepare GLIBC for the switch
is being done using an Advance Toolchain framework, the patch allows an Advance
Toolchain 2.27 with the --with-advance-toolchain configuration option (the
official AT 11 release uses GLIBC 2.26 as a framework, and when completed the
AT 12 release should use GLIBC 2.28).
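
The decision described above boils down to a version comparison with a relaxed threshold for the Advance Toolchain. The following C sketch models only that logic. The names version_gte and ieee_long_double_ok are hypothetical, and the directory checks the configure fragment performs for the Advance Toolchain are reduced here to a single flag.

```c
#include <assert.h>

/* Hypothetical helper: nonzero if MAJOR.MINOR >= REQ_MAJOR.REQ_MINOR.  */
int version_gte (int major, int minor, int req_major, int req_minor)
{
  return major > req_major
         || (major == req_major && minor >= req_minor);
}

/* Nonzero if a GLIBC of version MAJOR.MINOR is assumed to support IEEE
   128-bit long double: 2.28 in general, 2.27 when the development
   Advance Toolchain is in use.  */
int ieee_long_double_ok (int major, int minor, int with_advance_toolchain)
{
  int ieee_minor = with_advance_toolchain ? 27 : 28;
  assert (major >= 0 && minor >= 0);
  return version_gte (major, minor, 2, ieee_minor);
}
```

Under these assumptions a stock GLIBC 2.27 is rejected, while 2.27 with the Advance Toolchain flag set, or any 2.28 and later, is accepted.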

I have checked it on a little endian power8 system, building both toolchains
using IBM long double and IEEE long double configurations.  The tests that
depend on the library support for long double that failed before still fail.

I also did IEEE long double builds using the host GLIBC and the AT 11, and
verified that once GCC is configured it generates an error.  I built bootstrap
compilers on a big endian system, and verified if I selected IEEE long double,
it would fail, since I currently don't have a big endian GLIBC with the fixes
installed.

Can I check this in on the trunk and on the GCC 8 branch?

2018-07-05  Michael Meissner  

* configure.ac (powerpc64*-*-linux*): Combine big and little
endian checks for the long double format.  Add checks to make sure
the GLIBC can handle configuration of long double to be IEEE
128-bit before building GCC.
* configure: Regenerate.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 262443)
+++ gcc/configure.ac(working copy)
@@ -6031,23 +6031,48 @@ AC_ARG_WITH([long-double-format],
   [AS_HELP_STRING([--with-long-double-format={ieee,ibm}]
  [Specify whether PowerPC long double uses IEEE or IBM 
format])],[
 case "$target:$with_long_double_format" in
-  powerpc64le-*-linux*:ieee | powerpc64le-*-linux*:ibm)
-:
-;;
-  powerpc64-*-linux*:ieee | powerpc64-*-linux*:ibm)
-# IEEE 128-bit emulation is only built on 64-bit VSX Linux systems
-case "$with_cpu" in
-  power7 | power8 | power9 | power1*)
+  powerpc64le-*-linux*:ibm | powerpc64-*-linux*:ibm | \
+  powerpc64le-*-linux*:ieee | powerpc64-*-linux*:ieee)
+# IEEE 128-bit emulation is only built on 64-bit VSX Linux systems.
+# Little endian 64-bit systems are always VSX, but big endian systems
+# might default to power4.
+case "$target:$with_cpu" in
+  powerpc64le-* | *:power7 | *:power8 | *:power9 | *:power1*)
:
;;
   *)
AC_MSG_ERROR([Configuration option --with-long-double-format is only \
 supported if the default cpu is power7 or newer])
with_long_double_format=""
-   ;;
-  esac
-  ;;
-  xpowerpc64*-*-linux*:*)
+esac
+
+if test "x$with_long_double_format" = xieee; then
+  # See if we have a new enough GLIBC to allow using IEEE 128-bit long
+  # double.  We assume the public 2.28 GLIBC and the development version of
+  # the Advance Toolchain (2.27) have all of the missing bits.
+  ieee_minor="28"
+  glibc_ieee="no"
+  atoolchain=""
+  if test "x$with_advance_toolchain" != x \
+-a -d "/opt/$with_advance_toolchain/." \
+-a -d "/opt/$with_advance_toolchain/bin/." \
+-a -d "/opt/$with_advance_toolchain/include/."; then
+
+   ieee_minor="27"
+   atoolchain="Advance Toolchain "
+  fi
+  GCC_GLIBC_VERSION_GTE_IFELSE([2], [$ieee_minor], [glibc_ieee=yes], )
+  if test "x$glibc_ieee" = xyes; then
+   echo "${atoolchain}GLIBC appears to have IEEE long double support" 1>&2
+
+  else
+   AC_MSG_ERROR([Configuration option --with-long-double-format=ieee \
+needs ${atoolchain}GLIBC 2.${ieee_minor} or newer])
+   with_long_double_format=""
+  fi
+fi
+;;
+  powerpc64*-*-linux*:*)
 AC_MSG_ERROR([--with-long-double-format argument should be ibm or ieee])
 with_long_double_format=""
 ;;
Index: gcc/configure
===
--- gcc/configure   (revision 262443)
+++ gcc/configure   (working copy)
@@ -2

Re: [PATCH 0/3][POPCOUNT]

2018-07-05 Thread Kugan Vivekanandarajah
Hi Jeff,

Thanks for looking into it.

On 6 July 2018 at 08:03, Jeff Law  wrote:
> On 06/24/2018 08:41 PM, Kugan Vivekanandarajah wrote:
>> Hi Jeff,
>>
>> Thanks for the comments.
>>
>> On 23 June 2018 at 02:06, Jeff Law  wrote:
>>> On 06/22/2018 03:11 AM, Kugan Vivekanandarajah wrote:
 When we set niter with maybe_zero, currently final_value_relacement
 will not happen due to expression_expensive_p not handling. Patch 1
 adds this.

 With that we have the following optimized gimple.

[local count: 118111601]:
   if (b_4(D) != 0)
 goto ; [89.00%]
   else
 goto ; [11.00%]

[local count: 105119324]:
   _2 = (unsigned long) b_4(D);
   _9 = __builtin_popcountl (_2);
   c_3 = b_4(D) != 0 ? _9 : 1;

[local count: 118111601]:
   # c_12 = PHI 

 I assume that 1 in  b_4(D) != 0 ? _9 : 1; is OK (?) because when the
 latch execute zero times for b_4 == 0 means that the body will execute
 ones.
>>> ISTM that DOM ought to have simplified the conditional, unless there's
>>> some other way to get to bb3.  We know that b_4 is nonzero and thus c_3
>>> must have the value _9.
>> As of now, dom is not optimizing it. With the attached hack, it can be made 
>> to.
> What's strange is I'm not getting the c_3 = (b_4 != 0) ... in any of the
> dumps I'm looking at.  Instead it's c_3 = _9, which is what I would
> expect since we know that b_4 != 0
>
>
> My tests have been on x86_64 and aarch64 linux targets.  I've tried with
> patch#1 installed as well as with patch #1 and patch #2 together.
>
> What target, what flags and what patches do I need to see this?
You need the patch 1 (attaching) to get that. With Patch 2 in this
series, it will be optimized.

I haven't committed the patches yet as I am testing all the three
patches. I will commit after testing on current trunk.

Thanks,
Kugan


>
> Jeff
From 12263df77931aa55d205b9db470436848d762684 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Fri, 22 Jun 2018 14:10:26 +1000
Subject: [PATCH 1/3] generate popcount when checked for zero

Change-Id: I951e6d487268b757cbdaa8dcf671ab1377490db6
---
 gcc/gimplify.c  |  2 +-
 gcc/gimplify.h  |  1 +
 gcc/testsuite/gcc.dg/tree-ssa/pr64183.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr85073.c |  2 +-
 gcc/tree-scalar-evolution.c | 12 
 5 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 48ac92e..c86ad1a 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -3878,7 +3878,7 @@ gimplify_pure_cond_expr (tree *expr_p, gimple_seq *pre_p)
EXPR is GENERIC, while tree_could_trap_p can be called
only on GIMPLE.  */
 
-static bool
+bool
 generic_expr_could_trap_p (tree expr)
 {
   unsigned i, n;
diff --git a/gcc/gimplify.h b/gcc/gimplify.h
index dd0e4c0..62ca869 100644
--- a/gcc/gimplify.h
+++ b/gcc/gimplify.h
@@ -83,6 +83,7 @@ extern enum gimplify_status gimplify_arg (tree *, gimple_seq *, location_t,
 extern void gimplify_function_tree (tree);
 extern enum gimplify_status gimplify_va_arg_expr (tree *, gimple_seq *,
 		  gimple_seq *);
+extern bool generic_expr_could_trap_p (tree expr);
 gimple *gimplify_assign (tree, tree, gimple_seq *);
 
 #endif /* GCC_GIMPLIFY_H */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr64183.c b/gcc/testsuite/gcc.dg/tree-ssa/pr64183.c
index 7a854fc..50d0c5a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr64183.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr64183.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fno-tree-vectorize -fdump-tree-cunroll-details" } */
+/* { dg-options "-O3 -fno-tree-vectorize -fdisable-tree-sccp -fdump-tree-cunroll-details" } */
 
 int bits;
 unsigned int size;
diff --git a/gcc/testsuite/gcc.target/i386/pr85073.c b/gcc/testsuite/gcc.target/i386/pr85073.c
index 187102d..71a5d23 100644
--- a/gcc/testsuite/gcc.target/i386/pr85073.c
+++ b/gcc/testsuite/gcc.target/i386/pr85073.c
@@ -1,6 +1,6 @@
 /* PR target/85073 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -mbmi" } */
+/* { dg-options "-O2 -mbmi -fdisable-tree-sccp" } */
 
 int
 foo (unsigned x)
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index 4b0ec02..8e29005 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -3508,6 +3508,18 @@ expression_expensive_p (tree expr)
   return false;
 }
 
+  if (code == COND_EXPR)
+return (expression_expensive_p (TREE_OPERAND (expr, 0))
+	|| (EXPR_P (TREE_OPERAND (expr, 1))
+		&& EXPR_P (TREE_OPERAND (expr, 2)))
+	/* If either branch has side effects or could trap.  */
+	|| TREE_SIDE_EFFECTS (TREE_OPERAND (expr, 1))
+	|| generic_expr_could_trap_p (TREE_OPERAND (expr, 1))
+	|| TREE_SIDE_EFFECTS (TREE_OPERAND (expr, 0))
+	|| generic_expr_could_trap_p (TREE_OPERAND (expr, 0))
+	|| expression_expensive_p (TREE_OPERAND (expr, 1))
+	|| expression_expensive_p (TREE_OPERAND

Re: [PATCH] Update config.guess and config.sub

2018-07-05 Thread Sebastian Huber

On 05/07/18 18:51, Palmer Dabbelt wrote:


I'm not sure what the policy is on getting config stuff approved for 
commit, but just FYI there's another RISC-V related patch to 
config.sub that changes the behavior of "riscv-*" tuples.  I'm 
assuming we should take both, as it's odd to sync half way to the head 
of config. 


I updated Binutils (master and binutils-2_31-branch) and GCC (master) to 
use the latest versions of config.sub (2018-07-03) and config.guess  
(2018-06-26).


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.