[SVE] [aarch64] Add "@" in vcond_mask pattern

2019-08-21 Thread Prathamesh Kulkarni
Hi Richard,
I screwed up while committing pr88839 fix, by not including the change
to aarch64-sve.md
that adds "@" to vcond_mask pattern, which resulted in trunk failing to build
with: "gen_vcond_mask" was not declared in this scope -:/
The attached patch resolves the failure.
OK to commit ?

Thanks,
Prathamesh
2019-08-22  Prathamesh Kulkarni  

* aarch64-sve.md (vcond_mask): Add "@".

diff --git a/gcc/config/aarch64/aarch64-sve.md 
b/gcc/config/aarch64/aarch64-sve.md
index ac65e691d73..f58353e9c6d 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -3927,7 +3927,7 @@
 ;; vcond_mask operand order: true, false, mask
 ;; UNSPEC_SEL operand order: mask, true, false (as for VEC_COND_EXPR)
 ;; SEL operand order:mask, true, false
-(define_expand "vcond_mask_"
+(define_expand "@vcond_mask_"
   [(set (match_operand:SVE_ALL 0 "register_operand")
(unspec:SVE_ALL
  [(match_operand: 3 "register_operand")


Re: [PATCH v6 3/3] PR80791 Consider doloop cmp use in ivopts

2019-08-21 Thread Bin.Cheng
On Thu, Aug 22, 2019 at 11:18 AM Kewen.Lin  wrote:
>
> Hi Bin,
>
> Thanks for your time!
>
> on 2019/8/21 下午8:32, Bin.Cheng wrote:
> > On Wed, Aug 14, 2019 at 3:23 PM Kewen.Lin  wrote:
> >>
> >> Hi!
> >>
> >> Comparing to the previous versions of implementation mainly based on the
> >> existing IV cands but zeroing the related group/use cost, this new one is 
> >> based
> >> on Richard and Segher's suggestion introducing one doloop dedicated IV 
> >> cand.
> >>
> >> Some key points are listed below:
> >>   1) New field doloop_p in struct iv_cand to indicate doloop dedicated IV 
> >> cand.
> >>   2) Special name "doloop" assigned.
> >>   3) Doloop IV cand with form (niter+1, +, -1)
> >>   4) For doloop IV cand, no extra one cost like BIV, assign zero cost for 
> >> step.
> >>   5) Support may_be_zero (regressed PR is in this case), the base of 
> >> doloop IV
> >>  can be COND_EXPR, add handlings in cand_value_at and may_eliminate_iv.
> >>   6) Add more expr support in force_expr_to_var_cost for reasonable cost
> >>  calculation on the IV base with may_be_zero (like COND_EXPR).
> >>   7) Set zero cost when using doloop IV cand for doloop use.
> >>   8) Add three hooks (should we merge _generic and _address?).
> >> *) have_count_reg_decr_p, is to indicate the target has special 
> >> hardware
> >>count register, we shouldn't consider the impact of doloop IV when
> >>calculating register pressures.
> >> *) doloop_cost_for_generic, is the extra cost when using doloop IV 
> >> cand for
> >>generic type IV use.
> >> *) doloop_cost_for_address, is the extra cost when using doloop IV 
> >> cand for
> >>address type IV use.
> > What will happen if doloop IV cand be used for generic/address type iv
> > use?  Can RTL doloop can still perform doloop optimization in this
> > case?
> >
>
> On Power, we put the iteration count into hardware count register, it takes 
> very
> high cost to move the count to GPR, so the cost is set as INF to make it 
> impossible
> to use it for generic/address type iv use.  But as some discussion before, on 
> some
> targets using GPR instead of hardware count register, they probably want to 
> use this
> doloop iv used for other uses if profitable.  These two hooks offer the 
> possibility.
> In that case, I think RTL doloop can still perform since it can still get the
> pattern and transform.  The generic/address uses can still use it.
> >>
> >> Bootstrapped on powerpc64le-linux-gnu and regression testing passed 
> >> excepting
> >> for one failure on gcc/testsuite/gcc.dg/guality/loop-1.c at -O3 which is 
> >> tracked
> >> by PR89983.
> >>
> >> Any comments and suggestions are highly appreciated.  Thanks!
> > Not sure if I understand the patch correctly, some comments embedded.
> >
> > +  /* The number of doloop candidate in the set.  */
> > +  unsigned n_doloop_cands;
> > +
> > This is unnecessary.  See below comments.
> >
> > -add_candidate_1 (data, base, step, important,
> > -IP_NORMAL, use, NULL, orig_iv);
> > +add_candidate_1 (data, base, step, important, IP_NORMAL, use, NULL, 
> > doloop,
> > +orig_iv);
> >if (ip_end_pos (data->current_loop)
> >&& allow_ip_end_pos_p (data->current_loop))
> > -add_candidate_1 (data, base, step, important, IP_END, use, NULL, 
> > orig_iv);
> > +add_candidate_1 (data, base, step, important, IP_END, use, NULL, 
> > doloop,
> > +orig_iv);
> > Do we need to skip ip_end_pos case for doloop candidate?  Because the
> > candidate increment will be inserted in latch, i.e, increment position
> > is after exit condition.
> >
>
> Yes, we should skip it.  Currently function find_doloop_use has the check on 
> an
> empty latch and gimple_cond to latch, partially excluding it.  But it's still 
> good
> to guard it directly here.
>
> > -  tree_to_aff_combination (iv->base, type, val);
> > +  tree base = iv->base;
> > +  /* See add_iv_candidate_for_doloop, if may_be_zero is set, we want to 
> > extract
> > + the value under !may_be_zero to get the compact bound which also well 
> > fits
> > + for may_be_zero since we ensure the value for it is const one.  */
> > +  if (cand->doloop_p && desc->may_be_zero && !integer_zerop
> > (desc->may_be_zero))
> > +base = fold_build2 (PLUS_EXPR, TREE_TYPE (niter),
> > +   unshare_expr (rewrite_to_non_trapping_overflow 
> > (niter)),
> > +   build_int_cst (TREE_TYPE (niter), 1));
> > +  tree_to_aff_combination (base, type, val);
> > I don't quite follow here.  The iv->base is computed from niter, I
> > suppose compact bound is for cheaper candidate initialization?  Why
> > it's possible to extract !may_be_zero niter for may_be_zero here?  The
> > niter under !may_be_zero has no indication about the real niter under
> > may_be_zero.
> >
>
> As you note below, the cand_value for doloop would be zero, but for the case
> may_be_zero set, the 

Re: Implement C++20 p1424 - 'constexpr' feature macro concerns.

2019-08-21 Thread Ed Smith-Rowland via gcc-patches

On 8/20/19 6:14 PM, Jonathan Wakely wrote:

On 16/08/19 22:39 -0400, Ed Smith-Rowland via libstdc++ wrote:
The latest draft and I guess the above paper changed the macro names 
for the C++20 constexpr lib featues.


__cpp_lib_constexpr_algorithms ->__cpp_lib_constexpr.


The __cpp_lib_constexpr macro is (now?) for a different feature.

The proposed resolution for https://cplusplus.github.io/LWG/issue3256
is to keep the __cpp_lib_constexpr_algorithms name to indicate support
for what is currently described by __cpp_lib_constexpr_swap_algorithms
(and see https://cplusplus.github.io/LWG/issue3257 for another issue
in this area)


I'm testing the attached.

I changed back to __cpp_lib_constexpr_algorithms from __cpp_lib_constexpr.

I dropped__cpp_lib_constexpr_swap_algorithms in favor of 
__cpp_lib_constexpr_algorithms.


I'm leaving __cpp_lib_constexpr alone.?? We don't test this anyway.?? It's 
sitting in .


Is it possible that __cpp_lib_constexpr_char_traits also should be 
bumped from 201611 to 201806 because of P1032 section 10? Note that I'm 
not done with this but I'd like to get ahead of the macro issues.?? 
Presumably, __cpp_lib_constexpr_algorithms gets bumped.


Ed


2019-08-22  Edward Smith-Rowland  <3dw...@verizon.net>

Implement C++20 p1424 - 'constexpr' feature macro concerns,
Issue 3256 - Feature testing macro for constexpr algorithms,
and Issue 3257 - Missing feature testing macro update from P0858.
* include/std/version (__cpp_lib_constexpr_algorithms): Bump value.
* include/bits/algorithmfwd.h: Ditto.
* include/std/utility: Ditto.
* testsuite/25_algorithms/constexpr_macro.cc: Ditto.
* testsuite/25_algorithms/cpp_lib_constexpr.cc: New check for
__cpp_lib_constexpr macro in .
* testsuite/20_util/exchange/constexpr.cc: Add check for
__cpp_lib_constexpr macro in .
* testsuite/25_algorithms/adjacent_find/constexpr.cc: Remove check for
__cpp_lib_constexpr_algorithms.
* testsuite/25_algorithms/all_of/constexpr.cc: Ditto.
* testsuite/25_algorithms/any_of/constexpr.cc: Ditto.
* testsuite/25_algorithms/binary_search/constexpr.cc: Ditto.
* testsuite/25_algorithms/copy/constexpr.cc: Ditto.
* testsuite/25_algorithms/copy_backward/constexpr.cc: Ditto.
* testsuite/25_algorithms/copy_if/constexpr.cc: Ditto.
* testsuite/25_algorithms/copy_n/constexpr.cc: Ditto.
* testsuite/25_algorithms/count/constexpr.cc: Ditto.
* testsuite/25_algorithms/count_if/constexpr.cc: Ditto.
* testsuite/25_algorithms/equal/constexpr.cc: Ditto.
* testsuite/25_algorithms/equal_range/constexpr.cc: Ditto.
* testsuite/25_algorithms/fill/constexpr.cc: Ditto.
* testsuite/25_algorithms/fill_n/constexpr.cc: Ditto.
* testsuite/25_algorithms/find/constexpr.cc: Ditto.
* testsuite/25_algorithms/find_end/constexpr.cc: Ditto.
* testsuite/25_algorithms/find_first_of/constexpr.cc: Ditto.
* testsuite/25_algorithms/find_if/constexpr.cc: Ditto.
* testsuite/25_algorithms/find_if_not/constexpr.cc: Ditto.
* testsuite/25_algorithms/for_each/constexpr.cc: Ditto.
* testsuite/25_algorithms/generate/constexpr.cc: Ditto.
* testsuite/25_algorithms/generate_n/constexpr.cc: Ditto.
* testsuite/25_algorithms/is_heap/constexpr.cc: Ditto.
* testsuite/25_algorithms/is_heap_until/constexpr.cc: Ditto.
* testsuite/25_algorithms/is_partitioned/constexpr.cc: Ditto.
* testsuite/25_algorithms/is_permutation/constexpr.cc: Ditto.
* testsuite/25_algorithms/is_sorted/constexpr.cc: Ditto.
* testsuite/25_algorithms/is_sorted_until/constexpr.cc: Ditto.
* testsuite/25_algorithms/lexicographical_compare/constexpr.cc: Ditto.
* testsuite/25_algorithms/lower_bound/constexpr.cc: Ditto.
* testsuite/25_algorithms/merge/constexpr.cc: Ditto.
* testsuite/25_algorithms/mismatch/constexpr.cc: Ditto.
* testsuite/25_algorithms/none_of/constexpr.cc: Ditto.
* testsuite/25_algorithms/partition_copy/constexpr.cc: Ditto.
* testsuite/25_algorithms/partition_point/constexpr.cc: Ditto.
* testsuite/25_algorithms/remove/constexpr.cc: Ditto.
* testsuite/25_algorithms/remove_copy/constexpr.cc: Ditto.
* testsuite/25_algorithms/remove_copy_if/constexpr.cc: Ditto.
* testsuite/25_algorithms/remove_if/constexpr.cc: Ditto.
* testsuite/25_algorithms/replace_copy/constexpr.cc: Ditto.
* testsuite/25_algorithms/replace_copy_if/constexpr.cc: Ditto.
* testsuite/25_algorithms/replace_if/constexpr.cc: Ditto.
* testsuite/25_algorithms/reverse_copy/constexpr.cc: Ditto.
* testsuite/25_algorithms/rotate_copy/constexpr.cc: Ditto.
* testsuite/25_algorithms/search/constexpr.cc: Ditto.
* testsuite/25_algorithms/search_n/constexpr.cc: Ditto.
* 

Re: [PATCH v6 3/3] PR80791 Consider doloop cmp use in ivopts

2019-08-21 Thread Kewen.Lin
Hi Bin,

Thanks for your time!

on 2019/8/21 下午8:32, Bin.Cheng wrote:
> On Wed, Aug 14, 2019 at 3:23 PM Kewen.Lin  wrote:
>>
>> Hi!
>>
>> Comparing to the previous versions of implementation mainly based on the
>> existing IV cands but zeroing the related group/use cost, this new one is 
>> based
>> on Richard and Segher's suggestion introducing one doloop dedicated IV cand.
>>
>> Some key points are listed below:
>>   1) New field doloop_p in struct iv_cand to indicate doloop dedicated IV 
>> cand.
>>   2) Special name "doloop" assigned.
>>   3) Doloop IV cand with form (niter+1, +, -1)
>>   4) For doloop IV cand, no extra one cost like BIV, assign zero cost for 
>> step.
>>   5) Support may_be_zero (regressed PR is in this case), the base of doloop 
>> IV
>>  can be COND_EXPR, add handlings in cand_value_at and may_eliminate_iv.
>>   6) Add more expr support in force_expr_to_var_cost for reasonable cost
>>  calculation on the IV base with may_be_zero (like COND_EXPR).
>>   7) Set zero cost when using doloop IV cand for doloop use.
>>   8) Add three hooks (should we merge _generic and _address?).
>> *) have_count_reg_decr_p, is to indicate the target has special hardware
>>count register, we shouldn't consider the impact of doloop IV when
>>calculating register pressures.
>> *) doloop_cost_for_generic, is the extra cost when using doloop IV cand 
>> for
>>generic type IV use.
>> *) doloop_cost_for_address, is the extra cost when using doloop IV cand 
>> for
>>address type IV use.
> What will happen if doloop IV cand be used for generic/address type iv
> use?  Can RTL doloop can still perform doloop optimization in this
> case?
> 

On Power, we put the iteration count into hardware count register, it takes very
high cost to move the count to GPR, so the cost is set as INF to make it 
impossible
to use it for generic/address type iv use.  But as some discussion before, on 
some
targets using GPR instead of hardware count register, they probably want to use 
this
doloop iv used for other uses if profitable.  These two hooks offer the 
possibility.
In that case, I think RTL doloop can still perform since it can still get the 
pattern and transform.  The generic/address uses can still use it.
>>
>> Bootstrapped on powerpc64le-linux-gnu and regression testing passed excepting
>> for one failure on gcc/testsuite/gcc.dg/guality/loop-1.c at -O3 which is 
>> tracked
>> by PR89983.
>>
>> Any comments and suggestions are highly appreciated.  Thanks!
> Not sure if I understand the patch correctly, some comments embedded.
> 
> +  /* The number of doloop candidate in the set.  */
> +  unsigned n_doloop_cands;
> +
> This is unnecessary.  See below comments.
> 
> -add_candidate_1 (data, base, step, important,
> -IP_NORMAL, use, NULL, orig_iv);
> +add_candidate_1 (data, base, step, important, IP_NORMAL, use, NULL, 
> doloop,
> +orig_iv);
>if (ip_end_pos (data->current_loop)
>&& allow_ip_end_pos_p (data->current_loop))
> -add_candidate_1 (data, base, step, important, IP_END, use, NULL, 
> orig_iv);
> +add_candidate_1 (data, base, step, important, IP_END, use, NULL, doloop,
> +orig_iv);
> Do we need to skip ip_end_pos case for doloop candidate?  Because the
> candidate increment will be inserted in latch, i.e, increment position
> is after exit condition.
> 

Yes, we should skip it.  Currently function find_doloop_use has the check on an
empty latch and gimple_cond to latch, partially excluding it.  But it's still 
good
to guard it directly here.

> -  tree_to_aff_combination (iv->base, type, val);
> +  tree base = iv->base;
> +  /* See add_iv_candidate_for_doloop, if may_be_zero is set, we want to 
> extract
> + the value under !may_be_zero to get the compact bound which also well 
> fits
> + for may_be_zero since we ensure the value for it is const one.  */
> +  if (cand->doloop_p && desc->may_be_zero && !integer_zerop
> (desc->may_be_zero))
> +base = fold_build2 (PLUS_EXPR, TREE_TYPE (niter),
> +   unshare_expr (rewrite_to_non_trapping_overflow 
> (niter)),
> +   build_int_cst (TREE_TYPE (niter), 1));
> +  tree_to_aff_combination (base, type, val);
> I don't quite follow here.  The iv->base is computed from niter, I
> suppose compact bound is for cheaper candidate initialization?  Why
> it's possible to extract !may_be_zero niter for may_be_zero here?  The
> niter under !may_be_zero has no indication about the real niter under
> may_be_zero.
> 

As you note below, the cand_value for doloop would be zero, but for the case
may_be_zero set, the current calculation would take care of the whole niter
expression including the cond_expr introduced by may_be_zero check, it's 
unexpected.  The purpose is to use the value under condition !may_be_zero
for the calculation, and yes, to get expected zero finally.

> -  cand_value_at (loop, cand, 

Re: [PATCH] Add MD Function type check for builtin_md vectorize

2019-08-21 Thread luoxhu

On 2019/8/21 15:40, Richard Biener wrote:

On Tue, 20 Aug 2019, Xiong Hu Luo wrote:


The DECL_MD_FUNCTION_CODE added in r274404(PR 91421) by rsandifo requires that
DECL to be a BUILTIN_IN_MD class built-in, asserts will happen when lto
as the patch r274411(PR 91287) outputs some math function symbol to the object,
this patch will check function type before do builtin_md vectorize.


I think Richard fixed this already.


Thanks. It was fixed by Richard's r274524 already. Please ignore this
patch.

Xionghu



Richard.


gcc/ChangeLog

2019-08-21  Xiong Hu Luo  

* tree-vect-stmts.c (vectorizable_call): Check callee built-in type.
* gcc/tree.h (DECL_MD_FUNCTION_P): New function.
---
  gcc/tree-vect-stmts.c |  2 +-
  gcc/tree.h| 12 
  2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 1e2dfe5d22d..ef947f20d63 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -3376,7 +3376,7 @@ vectorizable_call (stmt_vec_info stmt_info, 
gimple_stmt_iterator *gsi,
if (cfn != CFN_LAST)
fndecl = targetm.vectorize.builtin_vectorized_function
  (cfn, vectype_out, vectype_in);
-  else if (callee)
+  else if (callee && DECL_MD_FUNCTION_P (callee))
fndecl = targetm.vectorize.builtin_md_vectorized_function
  (callee, vectype_out, vectype_in);
  }
diff --git a/gcc/tree.h b/gcc/tree.h
index b910c5cb475..8cce89e5cf3 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3905,6 +3905,18 @@ DECL_MD_FUNCTION_CODE (const_tree decl)
return fndecl.function_code;
  }
  
+/* Return true if decl is a FUNCTION_DECL with built-in class BUILT_IN_MD.

+   Otherwise return false.  */
+inline bool
+DECL_MD_FUNCTION_P (const_tree decl)
+{
+  const tree_function_decl  = FUNCTION_DECL_CHECK (decl)->function_decl;
+  if (fndecl.built_in_class == BUILT_IN_MD)
+return true;
+  else
+return false;
+}
+
  /* Return the frontend-specific built-in function that DECL represents,
 given that it is known to be a FUNCTION_DECL with built-in class
 BUILT_IN_FRONTEND.  */







Re: [PATCH V2 2/8] bpf: new GCC port

2019-08-21 Thread Jose E. Marchesi


> +#undef TARGET_PASS_BY_REFERENCE
> +#define TARGET_PASS_BY_REFERENCE bpf_pass_by_reference

I might have misunderstood, but I thought from an earlier (IRC?)
message, it wasn't possible for the callee to access the caller's
frame, which was why you had the error about running out of argument
registers.  If so, won't passing by reference make the argument
inaccessible in practice?  I don't see what you gain by defining
the hook, since I'd have assumed (at least after the fix above)
that it would be better to pass by value and get an error about
having no argument registers left.

Yes.  I added that hook before I had the restriction of number of
arguments in place.  Removing it.

Happy auto correction :)

A colleague (who actually _uses_ eBPF extensively, ahem) tells me that
the kernel verifier allows to pass addresses of the caller's stack
frame, tracking that it is a ptr to a stack location, and it knows which
stack it came from.  So it is indeed possible for the callee to access
the caller's frame, and therefore to pass arguments by reference.

On the downside, it is not possible for a callee to access the caller's
frame applying an offset to its frame pointer, because the stacks are
disjoint.  This means that most probably I will have to dedicate a real,
not eliminable register to act as the arg pointer, if I want to get rid
of the annoying limitation on the number of arguments...  and in order
to keep ABI compatibility with llvm built objects, this register is
gonna have to be %r5, i.e. the last register usable to pass arguments,
but it should be only used for that purpose if the function gets more
than 5 arguments...  sounds messy, but there is hope, yay!

However, unless someone comes with a magical macro to define or an
existing target doing the same thing, I am deferring attacking this
problem for later (TM) and for the time being I will keep both the
ability of passing aggregates and other big arguments by reference, and
the limit on number of arguments (this is what clang does.)

I hope that's ok for you people.
Salud!


RE: PR90724 - ICE with __sync_bool_compare_and_swap with -march=armv8.2-a

2019-08-21 Thread JiangNing OS


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org  On
> Behalf Of JiangNing OS
> Sent: Thursday, August 22, 2019 8:24 AM
> To: Prathamesh Kulkarni ; James Greenhalgh
> 
> Cc: gcc Patches ; Richard Sandiford
> ; Kyrill Tkachov ;
> nd 
> Subject: RE: PR90724 - ICE with __sync_bool_compare_and_swap with -
> march=armv8.2-a
> 
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org  On
> > Behalf Of Prathamesh Kulkarni
> > Sent: Thursday, August 22, 2019 2:36 AM
> > To: James Greenhalgh 
> > Cc: gcc Patches ; Richard Sandiford
> > ; Kyrill Tkachov
> > ; nd 
> > Subject: Re: PR90724 - ICE with __sync_bool_compare_and_swap with -
> > march=armv8.2-a
> >
> > On Mon, 19 Aug 2019 at 22:14, James Greenhalgh
> >  wrote:
> > >
> > > On Thu, Aug 15, 2019 at 02:11:25PM +0100, Prathamesh Kulkarni wrote:
> > > > On Thu, 8 Aug 2019 at 11:22, Prathamesh Kulkarni
> > > >  wrote:
> > > > >
> > > > > On Thu, 1 Aug 2019 at 15:34, Prathamesh Kulkarni
> > > > >  wrote:
> > > > > >
> > > > > > On Thu, 25 Jul 2019 at 11:56, Prathamesh Kulkarni
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Wed, 17 Jul 2019 at 18:15, Prathamesh Kulkarni
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Wed, 17 Jul 2019 at 13:45, Kyrill Tkachov
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > Hi Prathamesh
> > > > > > > > >
> > > > > > > > > On 7/10/19 12:24 PM, Prathamesh Kulkarni wrote:
> > > > > > > > > > Hi,
> > > > > > > > > > For following test-case, static long long AL[24];
> > > > > > > > > >
> > > > > > > > > > int
> > > > > > > > > > check_ok (void)
> > > > > > > > > > {
> > > > > > > > > >   return (__sync_bool_compare_and_swap (AL+1,
> > > > > > > > > > 0x20003ll, 0x1234567890ll)); }
> > > > > > > > > >
> > > > > > > > > > Compiling with -O2 -march=armv8.2-a results in:
> > > > > > > > > > pr90724.c: In function ‘check_ok’:
> > > > > > > > > > pr90724.c:7:1: error: unrecognizable insn:
> > > > > > > > > > 7 | }
> > > > > > > > > >   | ^
> > > > > > > > > > (insn 11 10 12 2 (set (reg:CC 66 cc)
> > > > > > > > > > (compare:CC (reg:DI 95)
> > > > > > > > > > (const_int 8589934595 [0x20003])))
> > "pr90724.c":6:11 -1
> > > > > > > > > >  (nil))
> > > > > > > > > >
> > > > > > > > > > IIUC, the issue is that 0x20003 falls outside the
> > > > > > > > > > range of allowable immediate in cmp ? If it's replaced
> > > > > > > > > > by a small constant then it works.
> > > > > > > > > >
> > > > > > > > > > The ICE results with -march=armv8.2-a because, we
> > > > > > > > > > enter if
> > > > > > > > > > (TARGET_LSE) { ... } condition in
> > > > > > > > > > aarch64_expand_compare_and_swap, while with
> > > > > > > > > > -march=armv8.a it goes into else, which forces oldval
> > > > > > > > > > into register if the predicate fails to match.
> > > > > > > > > >
> > > > > > > > > > The attached patch checks if y (oldval) satisfies
> > > > > > > > > > aarch64_plus_operand predicate and if not, forces it
> > > > > > > > > > to be in
> > register, which resolves ICE.
> > > > > > > > > > Does it look OK ?
> > > > > > > > > >
> > > > > > > > > > Bootstrap+testing in progress on aarch64-linux-gnu.
> > > > > > > > > >
> > > > > > > > > > PS: The issue has nothing to do with SVE, which I
> > > > > > > > > > incorrectly mentioned in bug report.
> > > > > > > > > >
> > > > > > > > > This looks ok to me (but you'll need maintainer approval).
> > > > > > > > >
> > > > > > > > > Does this fail on the branches as well?
> > > > > > > > Hi Kyrill,
> > > > > > > > Thanks for the review. The test also fails on gcc-9-branch
> > > > > > > > (but not on
> > gcc-8).
> > > > > > > Hi James,
> > > > > > > Is the patch OK to commit  ?
> > > > > > > https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > > > > > ping * 3:
> > > > > > https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > > > > ping * 4:
> > > > > https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > > > ping * 5: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > >
> > > Hi,
> > >
> > > Sorry, this missed my filters as it didn't mention AArch64 in the
> > > subject line.
> > >
> > > Thais is good for trunk, thanks for waiting.
> > Thanks, committed to trunk in r274805.
> 
> Trunk aarch64 build failure is exposed by this commit.
> 
> ../../gcc/gcc/config/aarch64/aarch64.c: In functionbool
> aarch64_evpc_sel(expand_vec_perm_d*)   :
> ../../gcc/gcc/config/aarch64/aarch64.c:18018:75: error:gen_vcond_mask
> was not declared in this scope
>emit_insn (gen_vcond_mask (vmode, vmode, d->target, d->op1, d->op0,
> pred));
>^
> make[3]: *** [aarch64.o] Error 1
> make[3]: Leaving directory `/home/amptest/gcc/build/gcc'

Oh. It is caused by a different commit 274810 rather than 274805.

> 
> > Is this OK to backport to gcc-9-branch ?
> >
> > Thanks,
> > Prathamesh
> > >
> > > James
> > >


RE: PR90724 - ICE with __sync_bool_compare_and_swap with -march=armv8.2-a

2019-08-21 Thread JiangNing OS
> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org  On
> Behalf Of Prathamesh Kulkarni
> Sent: Thursday, August 22, 2019 2:36 AM
> To: James Greenhalgh 
> Cc: gcc Patches ; Richard Sandiford
> ; Kyrill Tkachov ;
> nd 
> Subject: Re: PR90724 - ICE with __sync_bool_compare_and_swap with -
> march=armv8.2-a
> 
> On Mon, 19 Aug 2019 at 22:14, James Greenhalgh
>  wrote:
> >
> > On Thu, Aug 15, 2019 at 02:11:25PM +0100, Prathamesh Kulkarni wrote:
> > > On Thu, 8 Aug 2019 at 11:22, Prathamesh Kulkarni
> > >  wrote:
> > > >
> > > > On Thu, 1 Aug 2019 at 15:34, Prathamesh Kulkarni
> > > >  wrote:
> > > > >
> > > > > On Thu, 25 Jul 2019 at 11:56, Prathamesh Kulkarni
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, 17 Jul 2019 at 18:15, Prathamesh Kulkarni
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Wed, 17 Jul 2019 at 13:45, Kyrill Tkachov
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > Hi Prathamesh
> > > > > > > >
> > > > > > > > On 7/10/19 12:24 PM, Prathamesh Kulkarni wrote:
> > > > > > > > > Hi,
> > > > > > > > > For following test-case, static long long AL[24];
> > > > > > > > >
> > > > > > > > > int
> > > > > > > > > check_ok (void)
> > > > > > > > > {
> > > > > > > > >   return (__sync_bool_compare_and_swap (AL+1,
> > > > > > > > > 0x20003ll, 0x1234567890ll)); }
> > > > > > > > >
> > > > > > > > > Compiling with -O2 -march=armv8.2-a results in:
> > > > > > > > > pr90724.c: In function ‘check_ok’:
> > > > > > > > > pr90724.c:7:1: error: unrecognizable insn:
> > > > > > > > > 7 | }
> > > > > > > > >   | ^
> > > > > > > > > (insn 11 10 12 2 (set (reg:CC 66 cc)
> > > > > > > > > (compare:CC (reg:DI 95)
> > > > > > > > > (const_int 8589934595 [0x20003])))
> "pr90724.c":6:11 -1
> > > > > > > > >  (nil))
> > > > > > > > >
> > > > > > > > > IIUC, the issue is that 0x20003 falls outside the
> > > > > > > > > range of allowable immediate in cmp ? If it's replaced
> > > > > > > > > by a small constant then it works.
> > > > > > > > >
> > > > > > > > > The ICE results with -march=armv8.2-a because, we enter
> > > > > > > > > if
> > > > > > > > > (TARGET_LSE) { ... } condition in
> > > > > > > > > aarch64_expand_compare_and_swap, while with
> > > > > > > > > -march=armv8.a it goes into else, which forces oldval
> > > > > > > > > into register if the predicate fails to match.
> > > > > > > > >
> > > > > > > > > The attached patch checks if y (oldval) satisfies
> > > > > > > > > aarch64_plus_operand predicate and if not, forces it to be in
> register, which resolves ICE.
> > > > > > > > > Does it look OK ?
> > > > > > > > >
> > > > > > > > > Bootstrap+testing in progress on aarch64-linux-gnu.
> > > > > > > > >
> > > > > > > > > PS: The issue has nothing to do with SVE, which I
> > > > > > > > > incorrectly mentioned in bug report.
> > > > > > > > >
> > > > > > > > This looks ok to me (but you'll need maintainer approval).
> > > > > > > >
> > > > > > > > Does this fail on the branches as well?
> > > > > > > Hi Kyrill,
> > > > > > > Thanks for the review. The test also fails on gcc-9-branch (but 
> > > > > > > not on
> gcc-8).
> > > > > > Hi James,
> > > > > > Is the patch OK to commit  ?
> > > > > > https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > > > > ping * 3:
> > > > > https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > > > ping * 4: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > > ping * 5: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> >
> > Hi,
> >
> > Sorry, this missed my filters as it didn't mention AArch64 in the
> > subject line.
> >
> > Thais is good for trunk, thanks for waiting.
> Thanks, committed to trunk in r274805.

Trunk aarch64 build failure is exposed by this commit.

../../gcc/gcc/config/aarch64/aarch64.c: In function ���bool 
aarch64_evpc_sel(expand_vec_perm_d*)���:
../../gcc/gcc/config/aarch64/aarch64.c:18018:75: error: ���gen_vcond_mask��� 
was not declared in this scope
   emit_insn (gen_vcond_mask (vmode, vmode, d->target, d->op1, d->op0, pred));
   ^
make[3]: *** [aarch64.o] Error 1
make[3]: Leaving directory `/home/amptest/gcc/build/gcc'

> Is this OK to backport to gcc-9-branch ?
> 
> Thanks,
> Prathamesh
> >
> > James
> >


Re: [wwwdocs] Document C++ news in GCC 10 Release Notes

2019-08-21 Thread Gerald Pfeifer
On Mon, 19 Aug 2019, Marek Polacek wrote:
> I'm of the mind that we should advertise some of the new cool
> C++ changes going into GCC 10, esp. those that are user-visible.

I concur. :-)

> Checking this in.

Thank you!

Gerald


Re: [RFA] [tree-optimization/80576] Handle non-constant sizes in DSE

2019-08-21 Thread Jeff Law
On 8/19/19 7:57 AM, Richard Biener wrote:

> 
> +static tree
> +objsize_from_type (tree object)
> +{
> +  if (TREE_CODE (object) != ADDR_EXPR)
> +return NULL_TREE;
> +
> +  tree type = TREE_TYPE (object);
> +  if (POINTER_TYPE_P (type))
> 
> that if looks suspicious...  I'd say
>   if (!POINTER_TYPE_P (type))
> return NULL_TREE;
> 
> is better
Sure.  But we may not be able to use this anyway as you noted later,
depending on the type is not safe.

> 
> +type = TREE_TYPE (type);
> 
> +  if (TREE_CODE (type) == ARRAY_TYPE && !array_at_struct_end_p (object))
> +{
> 
> array_at_struct_end_p will never return true here since you pass it
> an ADDR_EXPR...  you wanted to pass TREE_OPERAND (object, 0) here?
I think you're right.  Given this was cribbed from the tail of another
function elsewhere, I suspect that other function has the same issue.

I suspect ultimately we want to fix that other copy and drop
objsize_from_type.


> 
> Also you seem to use this info to constrain optimization when you
> might remember that types of addresses do not carry such information...
> Thus it should be "trivially" possible to write a testcase that is miscompiled
> after your patch.  I also don't see this really exercised in the
> testcases you add?
Arggh.  You're absolutely correct.  I must be blocking out that entire
discussion from last summer due to the trama :-)

If the destination is the address of a _DECL node, can we use the size
of the _DECL?


> All of them reference a static array a ...
> 
> +   if (TREE_CODE (size) != INTEGER_CST)
> 
> TREE_CODE (size) == SSA_NAME you want
Agreed.  Will fix.

jeff


Re: [RFA] [tree-optimization/80576] Handle non-constant sizes in DSE

2019-08-21 Thread Jeff Law
On 8/16/19 4:21 PM, Martin Sebor wrote:
> On 8/16/19 12:15 PM, Jeff Law wrote:
>> On 8/16/19 12:09 PM, Marc Glisse wrote:
>>> On Fri, 16 Aug 2019, Jeff Law wrote:
>>>
 This patch improves our ability to detect dead stores by handling cases
 where the size memcpy, memset, strncpy, etc call is not constant.  This
 addresses some, but not all, of the issues in 80576.

 The key here is when the size is not constant we can make conservative
 decisions that still give us a chance to analyze the code for dead
 stores.

 Remember that for dead store elimination, we're trying to prove that
 given two stores, the second store overwrites (partially or fully) the
 same memory locations as the first store.  That makes the first store
 either partially or fully dead.

 When we encounter the first store, we set up a bitmap of bytes written
 by that store (live_bytes).  We then look at subsequent stores and
 clear
 the appropriate entries in the bitmap.

 If the first store has a nonconstant length argument we can use the
 range of the length argument (max) and the size of the destination
 object to make a conservative estimation of how many bytes are written.

 For the second store the conservative thing to do for a non-constant
 length is to use the minimum of the range of the length argument.
>>>
>>> So I guess it won't handle things like
>>>
>>> void f(char*p,int n){
>>>    __builtin_memset(p,3,n);
>>>    __builtin_memset(p,7,n);
>>> }
>>>
>>> where we know nothing about the length, except that it is the same? Or
>>> do you look at symbolic ranges?
>> Nope.  I think ao_ref can represent that, so it'd just be a matter of
>> recording "n" as the length, then verifying that the second call's
>> length is "n" as well.  That makes the first call dead.  We'd have to
>> bypass the byte tracking in that case, but I think that's trivial
>> because we already have a means to do that when the sizes are too large.
>>
>>>
 This doesn't come up a lot in practice.  But it also happens to put
 some
 of the infrastructure in place to handle strcpy and strcpy_chk which
 are
 needed to fully resolve 80576.

 Bootstrapped and regression tested on x86, x86_64, ppc64le, ppc64,
 ppc32, aarch64, sparc, s390x and probably others.  Also verified that
 the tests work on the various *-elf targets in my tester.

 OK for the trunk?
>>>
>>> ENOPATCH
>> Opps.    Attached.
> 
> It's an improvement and I realize you said it doesn't handle
> everything and that you don't think it comes up a lot, but...
> I would actually expect the following example (from the bug)
> not to be that uncommon:
> 
>   void g (char *s)
>   {
>     char a[8];
>     __builtin_strcpy (a, s);
>     __builtin_memset (a, 0, sizeof a);
>     f (a);
>   }
> 
> or at least to be more common than the equivalent alternative
> the improvement does optimize:
> 
>   void g (char *s)
>   {
>     char a[8];
>     __builtin_memcpy (a, s,  __builtin_strlen (s));
>     __builtin_memset (a, 0, 8);
>     f (a);
>   }
> 
> It seems that making the first one work should be just a matter
> of handling strcpy analogously (the string length doesn't matter).
> 
> As an aside, the new tests make me realize that -Wstringop-overflow
> should be enhanced to detect this problem (i.e., a consider
> the largest size in a PHI).
Certainly not seeing much of these in the code I've looked at.  It may
be a case that aliasing gets in the way often.

Also note that we've got the same issues in this space that we did with
the strlen optimization improvements for last year.  I totally spaced
that and the net result may be we have to avoid using the type size for
anything in DSE which may make it impossible to really do a good job.

JEff


Re: [PATCH] fold constant flexarrays to strings (PR 91490)

2019-08-21 Thread Jeff Law
On 8/21/19 2:50 PM, Martin Sebor wrote:
> This patch is a subset of the solution for PR 91457 whose main
> goal is to eliminate inconsistencies in warnings issued for
> out-of-bounds accesses to the various flavors of flexible array
> members of constant objects.  That patch was posted here:
>   https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01202.html
> 
> Like PR 91457, this patch also relies on improving optimization
> to issue better quality warnings.  (I.e., with the latter being
> what motivated it.)
> 
> The optimization enhances string_constant to recognize empty
> CONSTRUCTORs returned by fold_ctor_reference for references
> to members of constant objects with either "insufficient"
> initializers (e.g., given struct S { char n, a[]; } s = { 0 };)
> or with braced-list initializers (e.g., given
> struct S s = { 3 { 1, 2, 3, 0 } };  The patch lets string_constant
> convert the CONSTRUCTOR for s.a into a STRING_CST, which in turn
> enables the folding of calls to built-ins like strlen, strchr, or
> strcmp with such arguments.
> 
> Exposing the strings to the folder then also lets it detect and
> issue warnings for out-of-bounds offsets in more instances of
> such references than before.
> 
> The remaining changes in the patch mostly enhance the places
> that use the no-warning bit to prevent redundant diagnostics.
> 
> Tested on x86_64-linux.
> 
> Martin
> 
> gcc-91490.diff
> 
> PR middle-end/91490 - bogus argument missing terminating nul warning on 
> strlen of a flexible array member
> 
> gcc/c-family/ChangeLog:
> 
>   PR middle-end/91490
>   * c-common.c (braced_list_to_string): Add argument and overload.
>   Handle flexible length arrays.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR middle-end/91490
>   * c-c++-common/Warray-bounds-7.c: New test.
>   * gcc.dg/Warray-bounds-39.c: Expect either -Warray-bounds or
>   -Wstringop-overflow.
>   * gcc.dg/strlenopt-78.c: New test.
> 
> gcc/ChangeLog:
> 
>   PR middle-end/91490
>   * builtins.c (c_strlen): Rename argument and introduce new local.
>   Set no-warning bit on original argument.
>   * expr.c (string_constant): Pass argument type to fold_ctor_reference.
>   * gimple-fold.c (fold_nonarray_ctor_reference): Return a STRING_CST
>   for missing initializers.
>   * tree.c (build_string_literal): Handle optional argument.
>   * tree.h (build_string_literal): Add defaulted argument.
>   * gimple-ssa-warn-restrict.c (maybe_diag_access_bounds): Check
>   no-warning bit on original expression.
> 

> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 2810867235d..ef942ffce57 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -8868,7 +8876,7 @@ braced_lists_to_strings (tree type, tree ctor)
>unsigned HOST_WIDE_INT idx;
>FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (ctor), idx, val)
>   {
> -   val = braced_lists_to_strings (ttp, val);
> +   val = braced_lists_to_strings (ttp, val, code == RECORD_TYPE);
Do you need to handle UNION_TYPE or QUAL_UNION_TYPE here too?  If so,
RECORD_OR_UNION_TYPE_P is a better test.


> diff --git a/gcc/expr.c b/gcc/expr.c
> index 92979289e83..d16e0982dcf 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -11402,6 +11402,16 @@ is_aligning_offset (const_tree offset, const_tree 
> exp)
>  tree
>  string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
>  {
> +  tree dummy = NULL_TREE;;
> +  if (!mem_size)
> +mem_size = 
> +
> +  tree chartype = TREE_TYPE (arg);
> +  if (POINTER_TYPE_P (chartype))
> +chartype = TREE_TYPE (chartype);
> +  while (TREE_CODE (chartype) == ARRAY_TYPE)
> +chartype = TREE_TYPE (chartype);
I'm hesitant to ACK this bit.  I can understand that if we're given an
ADDR_EXPR, which should be a POINTER_TYPE_P, that we'll want to look at
what it's pointing to.  But shouldn't we verify that it's an ADDR_EXPR
before just blindly stripping away the outer type?

I also wonder if this (assuming we keep it in some form) belongs after
the STRIP_NOPs.


> @@ -11602,17 +11605,39 @@ string_constant (tree arg, tree *ptr_offset, tree 
> *mem_size, tree *decl)
>int len = native_encode_expr (init, charbuf, sizeof charbuf, 0);
>if (len > 0)
>   {
> -   /* Construct a string literal with elements of ELTYPE and
> +   /* Construct a string literal with elements of INITTYPE and
>the representation above.  Then strip
>the ADDR_EXPR (ARRAY_REF (...)) around the STRING_CST.  */
> -   init = build_string_literal (len, (char *)charbuf, eltype);
> +   init = build_string_literal (len, (char *)charbuf, inittype);
> init = TREE_OPERAND (TREE_OPERAND (init, 0), 0);
>   }
>  }
>  
> +  tree initsize = TYPE_SIZE_UNIT (inittype);
> +
> +  if (TREE_CODE (init) == CONSTRUCTOR && !CONSTRUCTOR_ELTS (init))
Would initializer_zerop be better here?  That would catch
zero-initialized constructors as well.

If 

Re: Fix crash with -fdump-ada-spec

2019-08-21 Thread Joseph Myers
On Wed, 21 Aug 2019, Eric Botcazou wrote:

> > Rather than checking if the outermost declarator is
> > cdk_function and using its parameters if so, it's necessary to check if
> > the *innermost declarator that isn't cdk_id or cdk_attrs* is cdk_function
> > and use its parameters if so, as that's what actually determines if the
> > declarator for the entity being declared has the form of a function
> > declarator (see how grokdeclarator determines funcdef_syntax).
> 
> Thanks for the hint.  Tentative patch attached based on it, which certainly 
> makes -fdump-ada-spec happy, as witnessed by the dump-ada-spec-15.c change.
> 
> Is it something that you would approve for mainline?

This patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Fix crash with -fdump-ada-spec

2019-08-21 Thread Eric Botcazou
> The problem is that the logic in c_parser_declaration_or_fndef for setting
> DECL_ARGUMENTS (not previously meaningful in non-definition declarations
> at all), as introduced by
> 
> r253411 | dmalcolm | 2017-10-04 14:10:59 + (Wed, 04 Oct 2017) | 85 lines
> 
> C: underline parameters in mismatching function calls
> 
> (and subsequently changed by r268574, but that change isn't relevant here)
> is incorrect.

As a matter of fact, this logic helped -fdump-ada-spec too by making the name 
of the parameters available in non-definition declarations.

> Rather than checking if the outermost declarator is
> cdk_function and using its parameters if so, it's necessary to check if
> the *innermost declarator that isn't cdk_id or cdk_attrs* is cdk_function
> and use its parameters if so, as that's what actually determines if the
> declarator for the entity being declared has the form of a function
> declarator (see how grokdeclarator determines funcdef_syntax).

Thanks for the hint.  Tentative patch attached based on it, which certainly 
makes -fdump-ada-spec happy, as witnessed by the dump-ada-spec-15.c change.

Is it something that you would approve for mainline?


c/
* c-parser.c (c_parser_declaration_or_fndef): Set DECL_ARGUMENTS of a
FUNCTION_DECL to the right value in the presence of nested declarators.

testsuite/
* c-c++-common/dump-ada-spec-15.c: Check that the parameters are named.

-- 
Eric BotcazouIndex: c/c-parser.c
===
--- c/c-parser.c	(revision 274744)
+++ c/c-parser.c	(working copy)
@@ -2161,10 +2161,40 @@ c_parser_declaration_or_fndef (c_parser *parser, b
 	all_prefix_attrs));
 	  if (d
 		  && TREE_CODE (d) == FUNCTION_DECL
-		  && declarator->kind == cdk_function
 		  && DECL_ARGUMENTS (d) == NULL_TREE
 		  && DECL_INITIAL (d) == NULL_TREE)
-		DECL_ARGUMENTS (d) = declarator->u.arg_info->parms;
+		{
+		  /* Check if the innermost declarator that isn't cdk_id or
+		 cdk_attrs is cdk_function.  */
+		  const struct c_declarator *decl = declarator;
+		  const struct c_declarator *last_non_id_attrs = NULL;
+
+		  while (decl)
+		switch (decl->kind)
+		  {
+		  case cdk_array:
+		  case cdk_function:
+		  case cdk_pointer:
+			last_non_id_attrs = decl;
+			decl = decl->declarator;
+			break;
+
+		  case cdk_attrs:
+			decl = decl->declarator;
+			break;
+
+		  case cdk_id:
+			decl = 0;
+			break;
+
+		  default:
+			gcc_unreachable ();
+		  }
+		  /* If so, use its parameters for the function.  */
+		  if (last_non_id_attrs
+		  && last_non_id_attrs->kind == cdk_function)
+		DECL_ARGUMENTS (d) = last_non_id_attrs->u.arg_info->parms;
+		}
 	  if (omp_declare_simd_clauses.exists ())
 		{
 		  tree parms = NULL_TREE;
Index: testsuite/c-c++-common/dump-ada-spec-15.c
===
--- testsuite/c-c++-common/dump-ada-spec-15.c	(revision 274794)
+++ testsuite/c-c++-common/dump-ada-spec-15.c	(working copy)
@@ -3,4 +3,6 @@
 
 extern void (*signal (int __sig, void (*__handler)(int)))(int);
 
+/* { dg-final { scan-ada-spec "uu_sig" } } */
+/* { dg-final { scan-ada-spec "uu_handler" } } */
 /* { dg-final { cleanup-ada-spec } } */


Re: [PATCH] [PR81810] unused strcpy to a local buffer not eliminated

2019-08-21 Thread Jeff Law
On 8/21/19 11:23 AM, kamlesh kumar wrote:
> Hi ,
> This patch include fix for PR81810
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> Thanks
> ./Kamlesh
> 
> 2019-08-21  Kamlesh Kumar   
> 
>PR tree-optimization/81810
>* tree-ssa-dse.c (dse_dom_walker::dse_optimize_stmt): Added
>BUILT_IN_STRCPY to consider for dse.
>(maybe_trim_memstar_call): Likewise.
>(initialize_ao_ref_for_dse): Likewise.
>* gcc.dg/tree-ssa/pr81810.c: New testcase.
And just to confirm, the work I'm already doing in DSE will pick up the
missed case.

jeff


Re: [PATCH] [PR81810] unused strcpy to a local buffer not eliminated

2019-08-21 Thread Jeff Law
On 8/21/19 11:23 AM, kamlesh kumar wrote:
> Hi ,
> This patch include fix for PR81810
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> Thanks
> ./Kamlesh
> 
> 2019-08-21  Kamlesh Kumar   
> 
>PR tree-optimization/81810
>* tree-ssa-dse.c (dse_dom_walker::dse_optimize_stmt): Added
>BUILT_IN_STRCPY to consider for dse.
>(maybe_trim_memstar_call): Likewise.
>(initialize_ao_ref_for_dse): Likewise.
>* gcc.dg/tree-ssa/pr81810.c: New testcase.
This doesn't look right to me.


> 
> diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
> index ba67884a825..dc4da4c9730 100644
> --- a/gcc/tree-ssa-dse.c
> +++ b/gcc/tree-ssa-dse.c
> @@ -115,8 +115,11 @@ initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write)
> case BUILT_IN_MEMSET_CHK:
> case BUILT_IN_STRNCPY:
> case BUILT_IN_STRNCPY_CHK:
> +   case BUILT_IN_STRCPY:
>   {
> -   tree size = gimple_call_arg (stmt, 2);
> +   tree size = NULL;
> +   if (gimple_call_num_args (stmt) > 2)
> +   size = gimple_call_arg (stmt, 2);
> tree ptr = gimple_call_arg (stmt, 0);
> ao_ref_init_from_ptr_and_size (write, ptr, size);
> return true;
This routine serves two purposes.

It can be called for a statement that is potentially dead.  In that case
it has to set the size of the ao_ref to cover all the data potentially
written by the strcpy.

It can also be called for a statement that makes an earlier statement
dead.  In this case you have to set the size for the ao_ref to the
number of bytes that you know have to be written.

In both cases the size is a function of the source string.  You can't
just leave the size NULL and expect everything to continue to work.






> @@ -470,6 +473,7 @@ maybe_trim_memstar_call (ao_ref *ref, sbitmap live,
> gimple *stmt)
>  case BUILT_IN_MEMCPY:
>  case BUILT_IN_MEMMOVE:
>  case BUILT_IN_STRNCPY:
> +case BUILT_IN_STRCPY:
>  case BUILT_IN_MEMCPY_CHK:
>  case BUILT_IN_MEMMOVE_CHK:
>  case BUILT_IN_STRNCPY_CHK:
How do you expect to trim a partially dead strcpy call?  The only way to
do that is to adjust the source string.  There's no mechanisms right now
to do that.  So this change can't be correct either.



> @@ -966,6 +970,7 @@ dse_dom_walker::dse_optimize_stmt (gimple_stmt_iterator
> *gsi)
> case BUILT_IN_MEMCPY:
> case BUILT_IN_MEMMOVE:
> case BUILT_IN_STRNCPY:
> +   case BUILT_IN_STRCPY:
> case BUILT_IN_MEMSET:
> case BUILT_IN_MEMCPY_CHK:
> case BUILT_IN_MEMMOVE_CHK:
This would be OK if all the other stuff was done right :-)


> @@ -975,11 +980,14 @@ dse_dom_walker::dse_optimize_stmt
> (gimple_stmt_iterator *gsi)
> /* Occasionally calls with an explicit length of zero
>show up in the IL.  It's pointless to do analysis
>on them, they're trivially dead.  */
> -   tree size = gimple_call_arg (stmt, 2);
> -   if (integer_zerop (size))
> +   if (gimple_call_num_args (stmt) > 2)
>   {
> -   delete_dead_or_redundant_call (gsi, "dead");
> -   return;
> +   tree size = gimple_call_arg (stmt, 2);
> +   if (integer_zerop (size))
> + {
> +   delete_dead_or_redundant_call (gsi, "dead");
> +   return;
> + }
>   }
?!?  This looks like you're just working around the fact that the other
parts of the patch are incorrect.


> 
> /* If this is a memset call that initializes an object
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81810.c
> b/gcc/testsuite/gcc.dg/tree-ssa/pr81810.c
> new file mode 100644
> index 000..89cf36a1367
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81810.c
> @@ -0,0 +1,25 @@
> +/* PR tree-optimization/81810
> +   { dg-do compile }
> +   { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +void f (const void *p, unsigned n)
> +{
> +  char a[8];
> +  __builtin_memcpy (a, p, n);
> +}
> +
> +void g (const char *s)
> +{
> +  char a[8];
> +  __builtin_strcpy (a, s);
> +}
> +
> +void h (const char *s)
> +{
> +  char a[8];
> +  __builtin_strncpy (a, s, sizeof a);
> +}
So just to be clear, the only case that's not handled here is the second
one involving strcpy.  The memcpy and strncpy cases are already handled.

The second case isn't handled because DSE doesn't know about the
semantics of strcpy yet.  That's precisely what I've been working on
recently and I'm pretty sure my changes (if we can deal with the
problems regarding some low level gimple semantics and sub-object
consistency) will fix this issue.  If my changes don't, then it
something I'm pretty sure can be addressed correctly using my changes as
underlying infrastructure.

jeff



[PATCH] fold constant flexarrays to strings (PR 91490)

2019-08-21 Thread Martin Sebor

This patch is a subset of the solution for PR 91457 whose main
goal is to eliminate inconsistencies in warnings issued for
out-of-bounds accesses to the various flavors of flexible array
members of constant objects.  That patch was posted here:
  https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01202.html

Like PR 91457, this patch also relies on improving optimization
to issue better quality warnings.  (I.e., with the latter being
what motivated it.)

The optimization enhances string_constant to recognize empty
CONSTRUCTORs returned by fold_ctor_reference for references
to members of constant objects with either "insufficient"
initializers (e.g., given struct S { char n, a[]; } s = { 0 };)
or with braced-list initializers (e.g., given
struct S s = { 3 { 1, 2, 3, 0 } };  The patch lets string_constant
convert the CONSTRUCTOR for s.a into a STRING_CST, which in turn
enables the folding of calls to built-ins like strlen, strchr, or
strcmp with such arguments.

Exposing the strings to the folder then also lets it detect and
issue warnings for out-of-bounds offsets in more instances of
such references than before.

The remaining changes in the patch mostly enhance the places
that use the no-warning bit to prevent redundant diagnostics.

Tested on x86_64-linux.

Martin
PR middle-end/91490 - bogus argument missing terminating nul warning on strlen of a flexible array member

gcc/c-family/ChangeLog:

	PR middle-end/91490
	* c-common.c (braced_list_to_string): Add argument and overload.
	Handle flexible length arrays.

gcc/testsuite/ChangeLog:

	PR middle-end/91490
	* c-c++-common/Warray-bounds-7.c: New test.
	* gcc.dg/Warray-bounds-39.c: Expect either -Warray-bounds or
	-Wstringop-overflow.
	* gcc.dg/strlenopt-78.c: New test.

gcc/ChangeLog:

	PR middle-end/91490
	* builtins.c (c_strlen): Rename argument and introduce new local.
	Set no-warning bit on original argument.
	* expr.c (string_constant): Pass argument type to fold_ctor_reference.
	* gimple-fold.c (fold_nonarray_ctor_reference): Return a STRING_CST
	for missing initializers.
	* tree.c (build_string_literal): Handle optional argument.
	* tree.h (build_string_literal): Add defaulted argument.
	* gimple-ssa-warn-restrict.c (maybe_diag_access_bounds): Check
	no-warning bit on original expression.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9a766e4ad63..073b92a1dc9 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -620,7 +620,7 @@ unterminated_array (tree exp, tree *size /* = NULL */, bool *exact /* = NULL */)
into the instruction stream and zero if it is going to be expanded.
E.g. with i++ ? "foo" : "bar", if ONLY_VALUE is nonzero, constant 3
is returned, otherwise NULL, since
-   len = c_strlen (src, 1); if (len) expand_expr (len, ...); would not
+   len = c_strlen (ARG, 1); if (len) expand_expr (len, ...); would not
evaluate the side-effects.
 
If ONLY_VALUE is two then we do not emit warnings about out-of-bound
@@ -628,7 +628,7 @@ unterminated_array (tree exp, tree *size /* = NULL */, bool *exact /* = NULL */)
into the instruction stream.
 
Additional information about the string accessed may be recorded
-   in DATA.  For example, if SRC references an unterminated string,
+   in DATA.  For example, if ARG references an unterminated string,
then the declaration will be stored in the DECL field.   If the
length of the unterminated string can be determined, it'll be
stored in the LEN field.  Note this length could well be different
@@ -640,7 +640,7 @@ unterminated_array (tree exp, tree *size /* = NULL */, bool *exact /* = NULL */)
The value returned is of type `ssizetype'.  */
 
 tree
-c_strlen (tree src, int only_value, c_strlen_data *data, unsigned eltsize)
+c_strlen (tree arg, int only_value, c_strlen_data *data, unsigned eltsize)
 {
   /* If we were not passed a DATA pointer, then get one to a local
  structure.  That avoids having to check DATA for NULL before
@@ -650,7 +650,8 @@ c_strlen (tree src, int only_value, c_strlen_data *data, unsigned eltsize)
 data = _strlen_data;
 
   gcc_checking_assert (eltsize == 1 || eltsize == 2 || eltsize == 4);
-  STRIP_NOPS (src);
+
+  tree src = STRIP_NOPS (arg);
   if (TREE_CODE (src) == COND_EXPR
   && (only_value || !TREE_SIDE_EFFECTS (TREE_OPERAND (src, 0
 {
@@ -762,11 +763,15 @@ c_strlen (tree src, int only_value, c_strlen_data *data, unsigned eltsize)
 {
   /* Suppress multiple warnings for propagated constant strings.  */
   if (only_value != 2
-	  && !TREE_NO_WARNING (src)
+	  && !TREE_NO_WARNING (arg)
 	  && warning_at (loc, OPT_Warray_bounds,
 			 "offset %qwi outside bounds of constant string",
 			 eltoff))
-	TREE_NO_WARNING (src) = 1;
+	{
+	  if (decl)
+	inform (DECL_SOURCE_LOCATION (decl), "%qE declared here", decl);
+	  TREE_NO_WARNING (arg) = 1;
+	}
   return NULL_TREE;
 }
 
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 2810867235d..ef942ffce57 100644
--- a/gcc/c-family/c-common.c

Re: [PATCH] issue consistent warning for past-the-end array stores (PR 91457)

2019-08-21 Thread Martin Sebor

On 8/20/19 1:26 AM, Richard Biener wrote:

On Tue, Aug 20, 2019 at 4:32 AM Martin Sebor  wrote:


On 8/19/19 8:10 AM, Richard Biener wrote:

On Sat, Aug 17, 2019 at 12:43 AM Martin Sebor  wrote:


With the recent enhancement to the strlen handling of multibyte
stores the g++.dg/warn/Warray-bounds-4.C for zero-length arrays
started failing on hppa (and probably elsewhere as well).  This
is partly the result of the added detection of past-the-end
writes into the strlen pass which detects more instances of
the problem than -Warray-bounds.  Since the IL each warning
works with varies between targets, the same invalid code can
be diagnosed by one warning one target and different warning
on another.

The attached patch does three things:

1) It enhances compute_objsize to also determine the size of
a flexible array member (and its various variants), including
from its initializer if necessary.  (This resolves 91457 but
introduces another warning where was previously just one.)
2) It guards the new instance of -Wstringop-overflow with
the no-warning bit on the assignment to avoid warning on code
that's already been diagnosed.
3) It arranges for -Warray-bounds to set the no-warning bit on
the enclosing expression to keep -Wstringop-overflow from issuing
another warning for the same problem.

Testing the compute_objsize enhancement to bring it up to par
with -Warray-bounds in turn exposed a weakness in the latter
warning for flexible array members.  Rather than snowballing
additional improvements into this one I decided to put that
off until later, so the new -Warray-bounds test has a bunch
of XFAILs.  I'll see if I can find the time to deal with those
either still in stage 1 or in stage 3 (one of them is actually
an ancient regression).


+static tree
+get_initializer_for (tree init, tree decl)
+{

can't you use fold_ctor_reference here?


Yes, but only with an additional enhancement.  Char initializers
for flexible array members aren't transformed to STRING_CSTs yet,
so without the size of the initializer specified, the function
returns the initializer for the smallest subobject, or char in
this case.  I've enhanced the function to handle them.


So at the moment it returns an empty array constructor, correct?
Isn't that the more canonical representation?  The STRING_CST
index type doesn't really handle "empty" strings and I expect code
more confused about that than about an empty CTOR?


Yes.  Returning an empty CTOR is more general than an empty
string and enables more optimizations.  It requires changing
the caller(s) that look for a string but I think that's fine.
Thanks for the hint!


+/* Determine the size of the flexible array FLD from the initializer
+   expression for the struct object DECL in which the meber is declared
+   (possibly recursively).  Return the size or zero constant if it isn't
+   initialized.  */
+
+static tree
+get_flexarray_size (tree decl, tree fld)
+{
+  if (tree init = DECL_INITIAL (decl))
+{
+  init = get_initializer_for (init, fld);
+  if (init)
+   return TYPE_SIZE_UNIT (TREE_TYPE (init));
+}
+
+  return integer_zero_node;

so you're hoping that the (sub-)CONSTRUCTOR get_initializer_for
returns has a complete type but the initialized object didn't get it
completed.  Isnt that wishful thinking?


I don't know what you mean.  When might a CONSTRUCTOR not have
a complete type, and if/when it doesn't, why would that be
a problem here?  TYPE_SIZE_UNIT will evaluate to null meaning
"don't know" and that's fine.  Could you try to be more specific
about the problem you're pointing out?


And why return integer_zero_node
rather than NULL_TREE here?


Because the size of a flexible array member with no initializer
is zero.



+  if (TREE_CODE (dest) == COMPONENT_REF)
+{
+  *pdecl = TREE_OPERAND (dest, 1);
+
+  /* If the member has a size return it.  Otherwise it's a flexible
+array member.  */
+  if (tree size = DECL_SIZE_UNIT (*pdecl))
+   return size;

because here you do.


Not sure what you mean here either.  (This code was also a bit


You return NULL_TREE.


out of date WRT to the patch I had tested.  Not sure how that
happened.  The attached patch is up to date.)



Also once you have an underlying VAR_DECL you can compute
the flexarray size by DECL_SIZE (var) - offset-of flexarray member.
Isn't that way cheaper than walking the initializer (possibly many
times?)


It would be nice if it were this easy.  Is the value of DECL_SIZE
(var) supposed to include the size of the flexible array member?


Yes, DECL_SIZE of the VAR_DECL is the size we use for assembling.
It is usually only the types that remain incomplete (or too small) since
the FE doesn't create many variants of a struct S { int n; char x[]; }
when "instantiating" it via struct S s = { 5, "abcde" };  that then holds
true for the DECL_SIZE of the FIELD_DECL of x as well.


I don't see it mentioned in the comments in tree.h and in my tests
it only does in C but not in C++.  

Re: [RFC] Enable math functions linking with static library for LTO

2019-08-21 Thread Joseph Myers
I'm afraid I'm not clear on the semantics of builtin_with_linkage_p (as in 
the current checked-in version).  It says:

/* Return true if the builtin DECL is implemented in a standard library.
   Otherwise returns false which doesn't guarantee it is not (thus the list of
   handled builtins below may be incomplete).  */

But what does "is implemented in a standard library" mean?  Certainly 
plenty of GCC ports are for systems that e.g. don't have fmaf64 which is 
included in that switch statement, and calls to fmaf64 on such systems 
will result in an undefined reference unless it's on an architecture where 
GCC can expand that inline.

Specifically: does Tejas's roundeven patch (and likewise the fadd patch) 
need updating to add those functions to the switch statement in 
builtin_with_linkage_p?  What are the correct semantics for a built-in 
function with the following properties: it may not be in the system libc / 
libm on all systems, but GCC can't always expand it inline either, so it 
may result in an undefined reference on systems where it's not in system 
libc / libm?

If it's only meant to be for functions you *know* are in a standard 
library on the system used, I'd expect checks of the libc_has_function 
target hook.  If, however, it's meant to be for functions that might 
result in an undefined reference (where GCC expects there to be a library 
function to satisfy that reference, but some systems don't have that 
function), I'd expect it to be based on whether a library name was passed 
when add_builtin_function was called, if that's something you can 
determine by examining the DECL - and in that case I wouldn't expect 
there to be a long switch statement listing particular functions, because 
the relevant information is specified when the built-in functions are 
created in the first place.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Builtin function roundeven folding implementation

2019-08-21 Thread Joseph Myers
On Wed, 21 Aug 2019, Martin Jambor wrote:

>   - I have listed roundeven variants in extend.texi.  If I did not find
> the right spot, I will gladly move to a more appropriate one.

I don't think they should be documented with the __builtin_* that are 
always expanded inline.  They should be documented in the list of 
functions that *might* be expanded inline.  That is, where extend.texi 
says:

  [...]  Many of these
  functions are only optimized in certain cases; if they are not optimized in
  a particular case, a call to the library function is emitted.

  @opindex ansi
  @opindex std
  Outside strict ISO C mode (@option{-ansi}, @option{-std=c90},
  @option{-std=c99} or @option{-std=c11}), the functions
  @code{_exit}, [...]
  may be handled as built-in functions.
  All these functions have corresponding versions
  prefixed with @code{__builtin_}, which may be used even in strict C90
  mode.

(Until we enable these by default for C2X, at which point they'd be listed 
as C2X functions after the C99 ones.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PR90724 - ICE with __sync_bool_compare_and_swap with -march=armv8.2-a

2019-08-21 Thread Prathamesh Kulkarni
On Mon, 19 Aug 2019 at 22:14, James Greenhalgh  wrote:
>
> On Thu, Aug 15, 2019 at 02:11:25PM +0100, Prathamesh Kulkarni wrote:
> > On Thu, 8 Aug 2019 at 11:22, Prathamesh Kulkarni
> >  wrote:
> > >
> > > On Thu, 1 Aug 2019 at 15:34, Prathamesh Kulkarni
> > >  wrote:
> > > >
> > > > On Thu, 25 Jul 2019 at 11:56, Prathamesh Kulkarni
> > > >  wrote:
> > > > >
> > > > > On Wed, 17 Jul 2019 at 18:15, Prathamesh Kulkarni
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, 17 Jul 2019 at 13:45, Kyrill Tkachov
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hi Prathamesh
> > > > > > >
> > > > > > > On 7/10/19 12:24 PM, Prathamesh Kulkarni wrote:
> > > > > > > > Hi,
> > > > > > > > For following test-case,
> > > > > > > > static long long AL[24];
> > > > > > > >
> > > > > > > > int
> > > > > > > > check_ok (void)
> > > > > > > > {
> > > > > > > >   return (__sync_bool_compare_and_swap (AL+1, 0x20003ll,
> > > > > > > > 0x1234567890ll));
> > > > > > > > }
> > > > > > > >
> > > > > > > > Compiling with -O2 -march=armv8.2-a results in:
> > > > > > > > pr90724.c: In function ‘check_ok’:
> > > > > > > > pr90724.c:7:1: error: unrecognizable insn:
> > > > > > > > 7 | }
> > > > > > > >   | ^
> > > > > > > > (insn 11 10 12 2 (set (reg:CC 66 cc)
> > > > > > > > (compare:CC (reg:DI 95)
> > > > > > > > (const_int 8589934595 [0x20003]))) 
> > > > > > > > "pr90724.c":6:11 -1
> > > > > > > >  (nil))
> > > > > > > >
> > > > > > > > IIUC, the issue is that 0x20003 falls outside the range of
> > > > > > > > allowable immediate in cmp ? If it's replaced by a small 
> > > > > > > > constant then
> > > > > > > > it works.
> > > > > > > >
> > > > > > > > The ICE results with -march=armv8.2-a because, we enter if
> > > > > > > > (TARGET_LSE) { ... } condition
> > > > > > > > in aarch64_expand_compare_and_swap, while with -march=armv8.a 
> > > > > > > > it goes
> > > > > > > > into else,
> > > > > > > > which forces oldval into register if the predicate fails to 
> > > > > > > > match.
> > > > > > > >
> > > > > > > > The attached patch checks if y (oldval) satisfies 
> > > > > > > > aarch64_plus_operand
> > > > > > > > predicate and if not, forces it to be in register, which 
> > > > > > > > resolves ICE.
> > > > > > > > Does it look OK ?
> > > > > > > >
> > > > > > > > Bootstrap+testing in progress on aarch64-linux-gnu.
> > > > > > > >
> > > > > > > > PS: The issue has nothing to do with SVE, which I incorrectly
> > > > > > > > mentioned in bug report.
> > > > > > > >
> > > > > > > This looks ok to me (but you'll need maintainer approval).
> > > > > > >
> > > > > > > Does this fail on the branches as well?
> > > > > > Hi Kyrill,
> > > > > > Thanks for the review. The test also fails on gcc-9-branch (but not 
> > > > > > on gcc-8).
> > > > > Hi James,
> > > > > Is the patch OK to commit  ?
> > > > > https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > > > ping * 3: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > > ping * 4: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
> > ping * 5: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00793.html
>
> Hi,
>
> Sorry, this missed my filters as it didn't mention AArch64 in the subject
> line.
>
> Thais is good for trunk, thanks for waiting.
Thanks, committed to trunk in r274805.
Is this OK to backport to gcc-9-branch ?

Thanks,
Prathamesh
>
> James
>


Re: [SVE] PR86753

2019-08-21 Thread Prathamesh Kulkarni
On Thu, 15 Aug 2019 at 01:50, Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Wed, Aug 14, 2019 at 6:49 PM Richard Biener
> >  wrote:
> >>
> >> On Wed, Aug 14, 2019 at 5:06 PM Prathamesh Kulkarni
> >>  wrote:
> >> >
> >> > Hi,
> >> > The attached patch tries to fix PR86753.
> >> >
> >> > For following test:
> >> > void
> >> > f1 (int *restrict x, int *restrict y, int *restrict z)
> >> > {
> >> >   for (int i = 0; i < 100; ++i)
> >> > x[i] = y[i] ? z[i] : 10;
> >> > }
> >> >
> >> > vect dump shows:
> >> >   vect_cst__42 = { 0, ... };
> >> >   vect_cst__48 = { 0, ... };
> >> >
> >> >   vect__4.7_41 = .MASK_LOAD (vectp_y.5_38, 4B, loop_mask_40);
> >> >   _4 = *_3;
> >> >   _5 = z_12(D) + _2;
> >> >   mask__35.8_43 = vect__4.7_41 != vect_cst__42;
> >> >   _35 = _4 != 0;
> >> >   vec_mask_and_46 = mask__35.8_43 & loop_mask_40;
> >> >   vect_iftmp.11_47 = .MASK_LOAD (vectp_z.9_44, 4B, vec_mask_and_46);
> >> >   iftmp.0_13 = 0;
> >> >   vect_iftmp.12_50 = VEC_COND_EXPR  >> > vect_iftmp.11_47, vect_cst__49>;
> >> >
> >> > and following code-gen:
> >> > L2:
> >> > ld1wz0.s, p2/z, [x1, x3, lsl 2]
> >> > cmpne   p1.s, p3/z, z0.s, #0
> >> > cmpne   p0.s, p2/z, z0.s, #0
> >> > ld1wz0.s, p0/z, [x2, x3, lsl 2]
> >> > sel z0.s, p1, z0.s, z1.s
> >> >
> >> > We could reuse vec_mask_and_46 in vec_cond_expr since the conditions
> >> > vect__4.7_41 != vect_cst__48 and vect__4.7_41 != vect_cst__42
> >> > are equivalent, and vect_iftmp.11_47 depends on vect__4.7_41 != 
> >> > vect_cst__48.
> >> >
> >> > I suppose in general for vec_cond_expr  if T comes from masked 
> >> > load,
> >> > which is conditional on C, then we could reuse the mask used in load,
> >> > in vec_cond_expr ?
> >> >
> >> > The patch maintains a hash_map cond_to_vec_mask
> >> > from  vec_mask (with loop predicate applied).
> >> > In prepare_load_store_mask, we record  -> vec_mask & 
> >> > loop_mask,
> >> > and in vectorizable_condition, we check if  exists in
> >> > cond_to_vec_mask
> >> > and if found, the corresponding vec_mask is used as 1st operand of
> >> > vec_cond_expr.
> >> >
> >> >  is represented with cond_vmask_key, and the patch
> >> > adds tree_cond_ops to represent condition operator and operands coming
> >> > either from cond_expr
> >> > or a gimple comparison stmt. If the stmt is not comparison, it returns
> >> >  and inserts that into cond_to_vec_mask.
> >> >
> >> > With patch, the redundant p1 is eliminated and sel uses p0 for above 
> >> > test.
> >> >
> >> > For following test:
> >> > void
> >> > f2 (int *restrict x, int *restrict y, int *restrict z, int fallback)
> >> > {
> >> >   for (int i = 0; i < 100; ++i)
> >> > x[i] = y[i] ? z[i] : fallback;
> >> > }
> >> >
> >> > input to vectorizer has operands swapped in cond_expr:
> >> >   _36 = _4 != 0;
> >> >   iftmp.0_14 = .MASK_LOAD (_5, 32B, _36);
> >> >   iftmp.0_8 = _4 == 0 ? fallback_12(D) : iftmp.0_14;
> >> >
> >> > So we need to check for inverted condition in cond_to_vec_mask,
> >> > and swap the operands.
> >> > Does the patch look OK so far ?
> >> >
> >> > One major issue remaining with the patch is value  numbering.
> >> > Currently, it does value numbering for entire function using sccvn
> >> > during start of vect pass, which is too expensive since we only need
> >> > block based VN. I am looking into that.
> >>
> >> Why do you need it at all?  We run VN on the if-converted loop bodies btw.
>
> This was my suggestion, but with the idea being to do the numbering
> per-statement as we vectorise.  We'll then see pattern statements too.
>
> That's important because we use pattern statements to set the right
> vector boolean type (e.g. vect_recog_mask_conversion_pattern).
> So some of the masks we care about don't exist after if converison.
>
> > Also I can't trivially see the equality of the masks and probably so
> > can't VN.  Is it that we just don't bother to apply loop_mask to
> > VEC_COND but there's no harm if we do?
>
> Yeah.  The idea of the optimisation is to decide when it's more profitable
> to apply the loop mask, even though doing so isn't necessary.  It would
> be hard to do after vectorisation because the masks aren't equivalent.
> We're relying on knowledge of how the vectoriser uses the result.
Hi,
Sorry for late response. This is an updated patch, that integrates
block-based VN into vect pass.
The patch
(a) Exports visit_stmt (renamed to vn_visit_stmt), vn_bb_init to
initialize VN state, and vn_bb_free to free it.
(b) Calls vn_visit_stmt in vect_transform_stmt for value numbering
stmts. We're only interested in obtaining
value numbers, not eliminating redundancies.
Does it look in the right direction ?
I am not sure if the initialization in vn_bb_init is entirely correct.

PS: The patch seems to regress fmla_2.c. I am looking into it.

Thanks,
Prathamesh
>
> Thanks,
> Richard
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index eb7e4be09e6..26a46757854 100644
--- 

Re: Merge from trunk to gccgo branch

2019-08-21 Thread Ian Lance Taylor
I've merged trunk revision 274803 to the gccgo branch.

Ian


Re: [PATCH] Builtin function roundeven folding implementation

2019-08-21 Thread Martin Jambor
Hi,

On Wed, Aug 21 2019, Joseph Myers wrote:
> On Wed, 21 Aug 2019, Martin Jambor wrote:
>
>> Hi Tejas,
>> 
>> On Wed, Aug 14 2019, Tejas Joshi wrote:
>> > Hi.
>> > Here is a clean patch that does not fold roundeven resulting for
>> > integer type and the conditions for folding functions
>> > round/ceil/floor/roundeven and trunc only checks for signaling NaN.
>> 
>> wouldn't checking for *signalling* NaNs mean using the macro
>> REAL_VALUE_ISSIGNALING_NAN rather than REAL_VALUE_ISNAN?
>
> Yes, it would.  So the patch should be retested / reposted with that 
> change.
>
>> > \ No newline at end of file
>
> This should be fixed (in both places, the tests should each end with 
> exactly one newline character).
>
> The new built-in functions roundeven / roundevenf / roundevenl also need 
> to be added to the lists in extend.texi (the list of GNU extension 
> functions for now, though once we add DEF_C2X_BUILTIN they'd go in a list 
> of C2X built-in functions instead).

OK, because I will be committing the patch on Tejas's behalf anyway, I
took the liberty of doing these few last changes too:

  - I replaced all REAL_VALUE_ISNAN in Tejas's patch with
REAL_VALUE_ISSIGNALING_NAN.

  - I have fixed the missing newlines in the testcases

  - I have listed roundeven variants in extend.texi.  If I did not find
the right spot, I will gladly move to a more appropriate one.

Otherwise I have not changed the patch in any way.  It has passed
bootstrap and testing on x86_64-linux, I will test on at least aarch64
too together with the followup i386 expansion patch.

Tejas, please have a quick look whether it all looks OK.

Joseph, please let me know if I can commit this to trunk.

Thanks a lot,

Martin


gcc/ChangeLog:

2019-08-21  Tejas Joshi  
Martin Jambor  

* builtins.c (mathfn_built_in_2): Added CASE_MATHFN for ROUNDEVEN.
* builtins.def: Added function definitions for roundeven function
variants.
* fold-const-call.c (fold_const_call_ss): Added case for roundeven
function call.
* fold-const.c (negate_mathfn_p): Added case for roundeven function.
(tree_call_nonnegative_warnv_p): Added case for roundeven function.
(integer_valued_real_call_p): Added case for roundeven function.
* real.c (is_even): New function. Returns true if real number is
even, otherwise returns false.
(is_halfway_below): New function. Returns true if real number is
halfway between two integers, else return false.
(real_roundeven): New function. Round real number to nearest
integer, rounding halfway cases towards even.
* real.h (real_value): Added descriptive comments.
Added function declaration for roundeven function.
* doc/extend.texi (Other Builtins): Document roundeven and its variants.

gcc/testsuite/ChangeLog:

2019-08-21  Tejas Joshi  

* gcc.dg/torture/builtin-round-roundeven.c: New test.
* gcc.dg/torture/builtin-round-roundevenf128.c: New test.
---
 gcc/builtins.c|  1 +
 gcc/builtins.def  |  6 ++
 gcc/doc/extend.texi   | 16 
 gcc/fold-const-call.c | 23 +++--
 gcc/fold-const.c  |  6 ++
 gcc/real.c| 83 +++
 gcc/real.h|  9 ++
 .../gcc.dg/torture/builtin-round-roundeven.c  | 23 +
 .../torture/builtin-round-roundevenf128.c | 20 +
 9 files changed, 182 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/builtin-round-roundeven.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/builtin-round-roundevenf128.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9a766e4ad63..5149d901a96 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2056,6 +2056,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
 CASE_MATHFN (REMQUO)
 CASE_MATHFN_FLOATN (RINT)
 CASE_MATHFN_FLOATN (ROUND)
+CASE_MATHFN (ROUNDEVEN)
 CASE_MATHFN (SCALB)
 CASE_MATHFN (SCALBLN)
 CASE_MATHFN (SCALBN)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 6d41bdb4f44..8bb7027aac7 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -542,12 +542,18 @@ DEF_C99_BUILTIN(BUILT_IN_RINTL, "rintl", 
BT_FN_LONGDOUBLE_LONGDOUBLE, AT
 #define RINT_TYPE(F) BT_FN_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_RINT, "rint", RINT_TYPE, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef RINT_TYPE
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVEN, "roundeven", BT_FN_DOUBLE_DOUBLE, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENF, "roundevenf", BT_FN_FLOAT_FLOAT, 
ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENL, "roundevenl", 
BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_ROUND, "round", BT_FN_DOUBLE_DOUBLE, 
ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_ROUNDF, "roundf", BT_FN_FLOAT_FLOAT, 

[PATCH 4/4] Modifications to the testsuite

2019-08-21 Thread Martin Jambor
This are all modifications to the testsuite required to get to the
state described in the cover letter of the entire IPA-SRA
patch-series.  Please note that ipa/ipa-sra-2.c and ipa/ipa-sra-6.c
should actually be svn rm-ed instead as they try to invoke
functionality that the new IPA-SRA does not have (splitting aggregates
passed by reference into individual bits passed by reference).  For
more information, see the cover letter of the whole IPA-SRA patch-set.

This patch has already been approved by Jeff, but I'm re-sending it
for completeness.  There has actually been a conflict in the options
of an LTO testcase, that is the only change compared to the previous
submission.

Martin

2019-08-20  Martin Jambor  

* g++.dg/ipa/pr81248.C: Adjust dg-options and dump-scan.
* gcc.dg/ipa/ipa-sra-1.c: Likewise.
* gcc.dg/ipa/ipa-sra-10.c: Likewise.
* gcc.dg/ipa/ipa-sra-11.c: Likewise.
* gcc.dg/ipa/ipa-sra-3.c: Likewise.
* gcc.dg/ipa/ipa-sra-4.c: Likewise.
* gcc.dg/ipa/ipa-sra-5.c: Likewise.
* gcc.dg/ipa/ipacost-2.c: Disable ipa-sra.
* gcc.dg/ipa/ipcp-agg-9.c: Likewise.
* gcc.dg/ipa/pr78121.c: Adjust scan pattern.
* gcc.dg/ipa/vrp1.c: Likewise.
* gcc.dg/ipa/vrp2.c: Likewise.
* gcc.dg/ipa/vrp3.c: Likewise.
* gcc.dg/ipa/vrp7.c: Likewise.
* gcc.dg/ipa/vrp8.c: Likewise.
* gcc.dg/noreorder.c: use noipa attribute instead of noinline.
* gcc.dg/ipa/20040703-wpa.c: New test.
* gcc.dg/ipa/ipa-sra-12.c: New test.
* gcc.dg/ipa/ipa-sra-13.c: Likewise.
* gcc.dg/ipa/ipa-sra-14.c: Likewise.
* gcc.dg/ipa/ipa-sra-15.c: Likewise.
* gcc.dg/ipa/ipa-sra-16.c: Likewise.
* gcc.dg/ipa/ipa-sra-17.c: Likewise.
* gcc.dg/ipa/ipa-sra-18.c: Likewise.
* gcc.dg/ipa/ipa-sra-19.c: Likewise.
* gcc.dg/ipa/ipa-sra-20.c: Likewise.
* gcc.dg/ipa/ipa-sra-21.c: Likewise.
* gcc.dg/ipa/ipa-sra-22.c: Likewise.
* gcc.dg/sso/ipa-sra-1.c: Likewise.
* g++.dg/ipa/ipa-sra-2.C: Likewise.
* g++.dg/ipa/ipa-sra-3.C: Likewise.
* gcc.dg/tree-ssa/ipa-cp-1.c: Make return value used.
* g++.dg/ipa/devirt-19.C: Add missing return, add -fipa-cp-clone
option.
* g++.dg/lto/devirt-19_0.C: Add -fipa-cp-clone option.

* gcc.dg/ipa/ipa-sra-2.c: Mark to be removed.
* gcc.dg/ipa/ipa-sra-6.c: Likewise.
---
 gcc/testsuite/g++.dg/ipa/devirt-19.C |   5 +-
 gcc/testsuite/g++.dg/ipa/ipa-sra-1.C |  46 +++
 gcc/testsuite/g++.dg/ipa/ipa-sra-2.C |  19 +++
 gcc/testsuite/g++.dg/ipa/ipa-sra-3.C |   9 ++
 gcc/testsuite/g++.dg/ipa/pr81248.C   |   4 +-
 gcc/testsuite/g++.dg/lto/devirt-19_0.C   |   2 +-
 gcc/testsuite/gcc.dg/ipa/20040703-wpa.c  | 151 +++
 gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c |   4 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-10.c|   4 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-11.c|   6 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c|  50 
 gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c|  49 
 gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c|  60 +
 gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c|  61 +
 gcc/testsuite/gcc.dg/ipa/ipa-sra-16.c|  74 +++
 gcc/testsuite/gcc.dg/ipa/ipa-sra-17.c| 102 +++
 gcc/testsuite/gcc.dg/ipa/ipa-sra-18.c|  49 
 gcc/testsuite/gcc.dg/ipa/ipa-sra-19.c|  31 +
 gcc/testsuite/gcc.dg/ipa/ipa-sra-2.c |   9 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-20.c|  38 ++
 gcc/testsuite/gcc.dg/ipa/ipa-sra-21.c|  33 +
 gcc/testsuite/gcc.dg/ipa/ipa-sra-22.c|  56 +
 gcc/testsuite/gcc.dg/ipa/ipa-sra-3.c |   7 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-4.c |   8 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-5.c |   4 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-6.c |   6 +-
 gcc/testsuite/gcc.dg/ipa/ipacost-2.c |   4 +-
 gcc/testsuite/gcc.dg/ipa/ipcp-agg-9.c|   2 +-
 gcc/testsuite/gcc.dg/ipa/pr78121.c   |   2 +-
 gcc/testsuite/gcc.dg/ipa/vrp1.c  |   4 +-
 gcc/testsuite/gcc.dg/ipa/vrp2.c  |   4 +-
 gcc/testsuite/gcc.dg/ipa/vrp3.c  |   2 +-
 gcc/testsuite/gcc.dg/ipa/vrp7.c  |   2 +-
 gcc/testsuite/gcc.dg/ipa/vrp8.c  |   2 +-
 gcc/testsuite/gcc.dg/noreorder.c |   6 +-
 gcc/testsuite/gcc.dg/sso/ipa-sra-1.c |  57 +
 gcc/testsuite/gcc.dg/tree-ssa/ipa-cp-1.c |   2 +-
 37 files changed, 931 insertions(+), 43 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/ipa-sra-1.C
 create mode 100644 gcc/testsuite/g++.dg/ipa/ipa-sra-2.C
 create mode 100644 gcc/testsuite/g++.dg/ipa/ipa-sra-3.C
 create mode 100644 gcc/testsuite/gcc.dg/ipa/20040703-wpa.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-sra-14.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c
 create mode 100644 

[PATCH 3/4] New IPA-SRA implementation

2019-08-21 Thread Martin Jambor
This patch actually adds the analysis bits of IPA-SRA - both the
function summary generation and the interprocedural analysis and
decision stage.  The transformation itself then happens in the call
graph cloning infrastructure changes which are in the previous patch.
Please see the cover letter of the whole patch-set for more
information.

This is mostly only a rebase on the current trunk of the earlier
submission, the only functional change is that the pass does not clone
when all the work (unused parameter removal) has already been done by
IPA-CP.

Martin

2019-08-20  Martin Jambor  

* Makefile.in (GTFILES): Added ipa-sra.c.
(OBJS): Added ipa-sra.o.
* dbgcnt.def: Added ipa_sra_params and ipa_sra_retvalues.
* doc/invoke.texi (ipa-sra-max-replacements): New.
* ipa-sra.c: New file.
* lto-section-in.c (lto_section_name): Added ipa-sra section.
* lto-streamer.h (lto_section_type): Likewise.
* params.def (PARAM_IPA_SRA_MAX_REPLACEMENTS): New.
* passes.def: Add new IPA-SRA.
* tree-pass.h (make_pass_ipa_sra): Declare.
---
 gcc/Makefile.in  |3 +-
 gcc/dbgcnt.def   |2 +
 gcc/doc/invoke.texi  |5 +
 gcc/ipa-sra.c| 4030 ++
 gcc/lto-section-in.c |3 +-
 gcc/lto-streamer.h   |1 +
 gcc/params.def   |7 +
 gcc/passes.def   |1 +
 gcc/tree-pass.h  |1 +
 9 files changed, 4051 insertions(+), 2 deletions(-)
 create mode 100644 gcc/ipa-sra.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 9573f58221c..0cccaa64b40 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1369,6 +1369,7 @@ OBJS = \
init-regs.o \
internal-fn.o \
ipa-cp.o \
+   ipa-sra.o \
ipa-devirt.o \
ipa-fnsummary.o \
ipa-polymorphic-call.o \
@@ -2528,7 +2529,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h 
$(srcdir)/coretypes.h \
   $(srcdir)/reload.h $(srcdir)/caller-save.c $(srcdir)/symtab.c \
   $(srcdir)/alias.c $(srcdir)/bitmap.c $(srcdir)/cselib.c $(srcdir)/cgraph.c \
   $(srcdir)/ipa-prop.c $(srcdir)/ipa-cp.c $(srcdir)/ipa-utils.h \
-  $(srcdir)/ipa-param-manipulation.h $(srcdir)/dbxout.c \
+  $(srcdir)/ipa-param-manipulation.h $(srcdir)/ipa-sra.c $(srcdir)/dbxout.c \
   $(srcdir)/signop.h \
   $(srcdir)/dwarf2out.h \
   $(srcdir)/dwarf2asm.c \
diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index 19936375505..c023c755af8 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -167,6 +167,8 @@ DEBUG_COUNTER (if_after_combine)
 DEBUG_COUNTER (if_after_reload)
 DEBUG_COUNTER (if_conversion)
 DEBUG_COUNTER (if_conversion_tree)
+DEBUG_COUNTER (ipa_sra_params)
+DEBUG_COUNTER (ipa_sra_retvalues)
 DEBUG_COUNTER (ira_move)
 DEBUG_COUNTER (local_alloc_for_sched)
 DEBUG_COUNTER (merged_ipa_icf)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ca111792885..499d6127a79 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11885,6 +11885,11 @@ parameters only when their cumulative size is less or 
equal to
 @option{ipa-sra-ptr-growth-factor} times the size of the original
 pointer parameter.
 
+@item ipa-sra-max-replacements
+Maximum pieces of an aggregate that IPA-SRA tracks.  As a
+consequence, it is also the maximum number of replacements of a formal
+parameter.
+
 @item sra-max-scalarization-size-Ospeed
 @itemx sra-max-scalarization-size-Osize
 The two Scalar Reduction of Aggregates passes (SRA and IPA-SRA) aim to
diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
new file mode 100644
index 000..9dc85650f23
--- /dev/null
+++ b/gcc/ipa-sra.c
@@ -0,0 +1,4030 @@
+/* Interprocedural scalar replacement of aggregates
+   Copyright (C) 2008-2019 Free Software Foundation, Inc.
+
+   Contributed by Martin Jambor 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+/* IPA-SRA is an interprocedural pass that removes unused function return
+   values (turning functions returning a value which is never used into void
+   functions), removes unused function parameters.  It can also replace an
+   aggregate parameter by a set of other parameters representing part of the
+   original, turning those passed by reference into new ones which pass the
+   value directly.
+
+   The pass is a true IPA one, which means that it works in three stages in
+   order to be able to take advantage of LTO.  First, summaries about functions
+   and each calls are generated. 

[PATCH 1/4] Remove old IPA-SRA, introduce tree-sra.h

2019-08-21 Thread Martin Jambor
This patch removes the old IPA-SRA.  Please see the covert letter for
more information about the whole patch-set.

Martin

2019-07-23  Martin Jambor  

* dbgcnt.def: Remove eipa_sra.
* passes.def: Remove old IPA-SRA.
* tree-pass.h (make_pass_early_ipa_sra): Remove declaration.
* tree-sra.c: Removed IPA-SRA.  Include tree-sra.h.
(type_internals_preclude_sra_p): Make public.
* tree-sra.h: New file.
---
 gcc/dbgcnt.def  |1 -
 gcc/passes.def  |1 -
 gcc/tree-pass.h |1 -
 gcc/tree-sra.c  | 1859 +--
 gcc/tree-sra.h  |   31 +
 5 files changed, 68 insertions(+), 1825 deletions(-)
 create mode 100644 gcc/tree-sra.h

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index 230072f7bb5..19936375505 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -156,7 +156,6 @@ DEBUG_COUNTER (df_byte_scan)
 DEBUG_COUNTER (dse)
 DEBUG_COUNTER (dse1)
 DEBUG_COUNTER (dse2)
-DEBUG_COUNTER (eipa_sra)
 DEBUG_COUNTER (gcse2_delete)
 DEBUG_COUNTER (global_alloc_at_func)
 DEBUG_COUNTER (global_alloc_at_reg)
diff --git a/gcc/passes.def b/gcc/passes.def
index 1a7fd144f87..99c0fee3155 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -89,7 +89,6 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_dse);
  NEXT_PASS (pass_cd_dce);
  NEXT_PASS (pass_phiopt, true /* early_p */);
- NEXT_PASS (pass_early_ipa_sra);
  NEXT_PASS (pass_tail_recursion);
  NEXT_PASS (pass_convert_switch);
  NEXT_PASS (pass_cleanup_eh);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 1c8df3d0a71..8f03c84132b 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -355,7 +355,6 @@ extern gimple_opt_pass *make_pass_early_tree_profile 
(gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_cleanup_eh (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sra (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sra_early (gcc::context *ctxt);
-extern gimple_opt_pass *make_pass_early_ipa_sra (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tail_recursion (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tail_calls (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_fix_loops (gcc::context *ctxt);
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 1505ce57d30..48589323a1e 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -96,15 +96,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-cfg.h"
 #include "tree-dfa.h"
 #include "tree-ssa.h"
-#include "symbol-summary.h"
-#include "ipa-param-manipulation.h"
-#include "ipa-prop.h"
 #include "params.h"
 #include "dbgcnt.h"
-#include "tree-inline.h"
-#include "ipa-fnsummary.h"
-#include "ipa-utils.h"
 #include "builtins.h"
+#include "tree-sra.h"
 
 
 /* Enumeration of all aggregate reductions we can do.  */
@@ -169,8 +164,7 @@ struct access
   struct access *first_child;
 
   /* In intraprocedural SRA, pointer to the next sibling in the access tree as
- described above.  In IPA-SRA this is a pointer to the next access
- belonging to the same group (having the same representative).  */
+ described above.  */
   struct access *next_sibling;
 
   /* Pointers to the first and last element in the linked list of assign
@@ -185,9 +179,6 @@ struct access
  when grp_to_be_replaced flag is set.  */
   tree replacement_decl;
 
-  /* Is this access an access to a non-addressable field? */
-  unsigned non_addressable : 1;
-
   /* Is this access made in reverse storage order? */
   unsigned reverse : 1;
 
@@ -260,19 +251,6 @@ struct access
 
   /* Should TREE_NO_WARNING of a replacement be set?  */
   unsigned grp_no_warning : 1;
-
-  /* Is it possible that the group refers to data which might be (directly or
- otherwise) modified?  */
-  unsigned grp_maybe_modified : 1;
-
-  /* Set when this is a representative of a pointer to scalar (i.e. by
- reference) parameter which we consider for turning into a plain scalar
- (i.e. a by value parameter).  */
-  unsigned grp_scalar_ptr : 1;
-
-  /* Set when we discover that this pointer is not safe to dereference in the
- caller.  */
-  unsigned grp_not_necessarilly_dereferenced : 1;
 };
 
 typedef struct access *access_p;
@@ -349,29 +327,6 @@ static struct obstack name_obstack;
propagated to their assignment counterparts. */
 static struct access *work_queue_head;
 
-/* Number of parameters of the analyzed function when doing early ipa SRA.  */
-static int func_param_count;
-
-/* scan_function sets the following to true if it encounters a call to
-   __builtin_apply_args.  */
-static bool encountered_apply_args;
-
-/* Set by scan_function when it finds a recursive call.  */
-static bool encountered_recursive_call;
-
-/* Set by scan_function when it finds a recursive call with less actual
-   arguments than formal parameters..  */
-static bool encountered_unchangable_recursive_call;
-
-/* This is a table in which for each basic block and parameter 

[PATCH 0/4] True IPA reimplementation of IPA-SRA (v4)

2019-08-21 Thread Martin Jambor
Hello,

I have fixed two bugs of the previous IPA-SRA submission and re-based
it on a recent trunk and am posting it again so that people can have
look at an up-to-date version (and also as a reminder that it is still
pending review).  Previous submissions are available at:

   - https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01529.html and
   - https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00472.html

This reimplementation is a full IPA pass that can handle strictly
connected components in the call-graph, can take advantage of LTO and
does not weirdly switch functions in pass-pipeline like our current
quasi-IPA SRA does.  Unlike the current IPA-SRA it can also remove
return values, even in SCCs.  On the other hand, it is less powerful
when it comes to structures passed by reference.  By design it will
not create references to bits of an aggregate because that turned out
to be just obfuscation in practice.  However, it also cannot usually
split aggregates passed by reference that are just passed to another
function (where splitting would be useful) because it cannot perform
the same TBAA analysis like the current implementation which already
knows what types it should look at because it has access to bodies of
all functions attempts to modify (for an example see
https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00955.html).

Since May I have switched the order of IPA-CP and IPA-SRA, so that
IPA-CP is run before IPA-SRA and it can propagate constants in
aggregates before these are split and tracking is lost.  The July
submission frequently cloned even if all the parameter removal work
has already been done by IPA-CP, this version does not have that
problem.  Like before, there are dbgcounters.

This submission has
https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01567.html incorporated.

The patch has successfully bootstrapped and LTO-profilebootstrapped
and tested on x86_64-linux.  Testing on x86_64 fixes 24 LTO tests in
guality/pr68860-1.c but regresses 13 guality tests, 3 of which are not
LTO.  I plan to bootstrap on ppc64le and aarch64 like I did before
submitting the previous version.  Right after my last submission,
Martin Liška used a version which contained both aforementioned fixes
(but of course not the re-base) to build over 5000 openSUSE packages
(vast majority of them with LTO).

After the patches are approved, I would like to commit them in one big
commit but in order to ease review, I have split the big patch into
four patches:
  1. Removal of old IPA-SRA,
  2. New parameter manipulation infrastructure,
  3. New IPA-SRA implementation, and
  4. Modifications to the testsuite.

When applied one after another they should compile but I have only
exhaustively tested the complete patch.  If you want to experiment
with new IPA-SRA or the parameter manipulation infrastructure it
introduces, I suggest that you clone the jamborm/ipa-sra branch from
our git "mirror"
(https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/jamborm/ipa-sra).

Thanks in advance for any questions, comments and suggestions,

Martin


2019-08-20  Martin Jambor  

* coretypes.h (cgraph_edge): Declare.
* ipa-param-manipulation.c: Rewrite.
* ipa-param-manipulation.h: Likewise.
* Makefile.in (GTFILES): Added ipa-param-manipulation.h and ipa-sra.c.
(OBJS): Added ipa-sra.o.
* cgraph.h (ipa_replace_map): Removed fields old_tree, replace_p
and ref_p, added fields param_adjustments and performed_splits.
(struct cgraph_clone_info): Remove ags_to_skip and
combined_args_to_skip, new field param_adjustments.
(cgraph_node::create_clone): Changed parameters to use
ipa_param_adjustments.
(cgraph_node::create_virtual_clone): Likewise.
(cgraph_node::create_virtual_clone_with_body): Likewise.
(tree_function_versioning): Likewise.
(cgraph_build_function_type_skip_args): Removed.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Convert to
using ipa_param_adjustments.
(clone_of_p): Likewise.
* cgraphclones.c (cgraph_build_function_type_skip_args): Removed.
(build_function_decl_skip_args): Likewise.
(duplicate_thunk_for_node): Adjust parameters using
ipa_param_body_adjustments, copy param_adjustments instead of
args_to_skip.
(cgraph_node::create_clone): Convert to using ipa_param_adjustments.
(cgraph_node::create_virtual_clone): Likewise.
(cgraph_node::create_version_clone_with_body): Likewise.
(cgraph_materialize_clone): Likewise.
(symbol_table::materialize_all_clones): Likewise.
* ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Simplify
ipa_replace_map check.
* ipa-cp.c (get_replacement_map): Do not initialize removed fields.
(initialize_node_lattices): Make aware that some parameters might have
already been removed.
(want_remove_some_param_p): New function.

[PATCH] [PR81810] unused strcpy to a local buffer not eliminated

2019-08-21 Thread kamlesh kumar
Hi ,
This patch include fix for PR81810
Bootstrapped/regtested on x86_64-linux, ok for trunk?

Thanks
./Kamlesh

2019-08-21  Kamlesh Kumar   

   PR tree-optimization/81810
   * tree-ssa-dse.c (dse_dom_walker::dse_optimize_stmt): Added
   BUILT_IN_STRCPY to consider for dse.
   (maybe_trim_memstar_call): Likewise.
   (initialize_ao_ref_for_dse): Likewise.
   * gcc.dg/tree-ssa/pr81810.c: New testcase.


diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index ba67884a825..dc4da4c9730 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -115,8 +115,11 @@ initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write)
case BUILT_IN_MEMSET_CHK:
case BUILT_IN_STRNCPY:
case BUILT_IN_STRNCPY_CHK:
+   case BUILT_IN_STRCPY:
  {
-   tree size = gimple_call_arg (stmt, 2);
+   tree size = NULL;
+   if (gimple_call_num_args (stmt) > 2)
+   size = gimple_call_arg (stmt, 2);
tree ptr = gimple_call_arg (stmt, 0);
ao_ref_init_from_ptr_and_size (write, ptr, size);
return true;
@@ -470,6 +473,7 @@ maybe_trim_memstar_call (ao_ref *ref, sbitmap live,
gimple *stmt)
 case BUILT_IN_MEMCPY:
 case BUILT_IN_MEMMOVE:
 case BUILT_IN_STRNCPY:
+case BUILT_IN_STRCPY:
 case BUILT_IN_MEMCPY_CHK:
 case BUILT_IN_MEMMOVE_CHK:
 case BUILT_IN_STRNCPY_CHK:
@@ -966,6 +970,7 @@ dse_dom_walker::dse_optimize_stmt (gimple_stmt_iterator
*gsi)
case BUILT_IN_MEMCPY:
case BUILT_IN_MEMMOVE:
case BUILT_IN_STRNCPY:
+   case BUILT_IN_STRCPY:
case BUILT_IN_MEMSET:
case BUILT_IN_MEMCPY_CHK:
case BUILT_IN_MEMMOVE_CHK:
@@ -975,11 +980,14 @@ dse_dom_walker::dse_optimize_stmt
(gimple_stmt_iterator *gsi)
/* Occasionally calls with an explicit length of zero
   show up in the IL.  It's pointless to do analysis
   on them, they're trivially dead.  */
-   tree size = gimple_call_arg (stmt, 2);
-   if (integer_zerop (size))
+   if (gimple_call_num_args (stmt) > 2)
  {
-   delete_dead_or_redundant_call (gsi, "dead");
-   return;
+   tree size = gimple_call_arg (stmt, 2);
+   if (integer_zerop (size))
+ {
+   delete_dead_or_redundant_call (gsi, "dead");
+   return;
+ }
  }

/* If this is a memset call that initializes an object
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81810.c
b/gcc/testsuite/gcc.dg/tree-ssa/pr81810.c
new file mode 100644
index 000..89cf36a1367
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81810.c
@@ -0,0 +1,25 @@
+/* PR tree-optimization/81810
+   { dg-do compile }
+   { dg-options "-O2 -fdump-tree-optimized" } */
+
+void f (const void *p, unsigned n)
+{
+  char a[8];
+  __builtin_memcpy (a, p, n);
+}
+
+void g (const char *s)
+{
+  char a[8];
+  __builtin_strcpy (a, s);
+}
+
+void h (const char *s)
+{
+  char a[8];
+  __builtin_strncpy (a, s, sizeof a);
+}
+
+/* { dg-final { scan-tree-dump-not "__builtin_memcpy" "optimized" } }
+   { dg-final { scan-tree-dump-not "__builtin_strcpy" "optimized" } }
+   { dg-final { scan-tree-dump-not "__builtin_strncpy" "optimized" } } */


Go patch committed: Don't use pkgpath for fieldtrack of unexported field

2019-08-21 Thread Ian Lance Taylor
This patch to the Go frontend fixes it to not use the full package
path for the fieldtrack information generated for an unexported field.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 274800)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-7da359f4659f051c05ff442037cfa61febd5
+82d27f0f140f33406cf59c0fb262f6dba3077f8e
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 274800)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -13795,7 +13795,7 @@ Field_reference_expression::do_lower(Gog
   if (nt != NULL)
 s.append(Gogo::unpack_hidden_name(nt->name()));
   s.push_back('.');
-  s.append(field->field_name());
+  s.append(Gogo::unpack_hidden_name(field->field_name()));
   s.push_back('"');
 
   // We can't use a string here, because internally a string holds a


Re: [PATCH] rs6000: Add ordered compares (PR58684)

2019-08-21 Thread Joseph Myers
On Thu, 8 Aug 2019, Segher Boessenkool wrote:

> 2) Is it *required*, or can we delete ordered compares in some cases?
> 2a) Like, if we test ab, we only need one compare instruction,
> not the two that are generate right now.

Yes, you only need one ordered compare there.  The relevant text from the 
C standard Annex F is "This specification does not require support for 
trap handlers that maintain information about the order or count of 
floating-point exceptions. Therefore, between function calls, 
floating-point exceptions need not be precise: the actual order and number 
of occurrences of floating-point exceptions (> 1) may vary from what the 
source code expresses.".

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable

2019-08-21 Thread Kyrill Tkachov

Hi James,

On 8/21/19 1:48 PM, James Greenhalgh wrote:

On Mon, Oct 24, 2016 at 03:27:10PM +0100, Kyrill Tkachov wrote:

Hi all,

When storing a 64-bit immediate that has equal bottom and top halves we 
currently
synthesize the repeating 32-bit pattern twice and perform a single X-store.
With this patch we synthesize the 32-bit pattern once into a W register and 
store
that twice using an STP. This reduces codesize bloat from synthesising the same
constant multiple times at the expense of converting a store to a store-pair.
It will only trigger if we can save two or more instructions, so it will only 
transform:
  mov x1, 49370
  movkx1, 0xc0da, lsl 32
  str x1, [x0]

into:

  mov w1, 49370
  stp w1, w1, [x0]

when optimising for -Os, whereas it will always transform a 4-insn synthesis
sequence into a two-insn sequence + STP (see comments in the patch).

This patch triggers already but will trigger more with the store merging pass
that I'm working on since that will generate more of these repeating 64-bit 
constants.
This helps improve codegen on 456.hmmer where store merging can sometimes 
create very
complex repeating constants and target-specific expand needs to break them down.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Hi Kyrill,

Does this do the right thing for:

   void bar(u64 *x)
   {
*(volatile u64 *)x = 0xabcdef10abcdef10;
   }

C.f. 
https://lore.kernel.org/lkml/20190821103200.kpufwtviqhpbuv2n@willie-the-truck/

i.e. is this optimization still valid for volatile?


Not sure if it's valid, but it's most likely undesirable for whatever 
gain it gives in this scenario.


I'm testing a patch to disable it for volatile destinations.

Thanks,

Kyrill


Thanks,
James


Thanks,
Kyrill

2016-10-24  Kyrylo Tkachov  

  * config/aarch64/aarch64.md (mov): Call
  aarch64_split_dimode_const_store on DImode constant stores.
  * config/aarch64/aarch64-protos.h (aarch64_split_dimode_const_store):
  New prototype.
  * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New
  function.

2016-10-24  Kyrylo Tkachov  

  * gcc.target/aarch64/store_repeating_constant_1.c: New test.
  * gcc.target/aarch64/store_repeating_constant_2.c: Likewise.


Re: [PATCH] builtin fadd variants implementation

2019-08-21 Thread Joseph Myers
On Thu, 8 Aug 2019, Tejas Joshi wrote:

> +/* Try to evaluate:
> +
> +  *RESULT = fadd (*ARG0, *ARG1)
> +
> +   in format FORMAT.  Return true on success.  */
> +
> +static bool
> +fold_const_fadd (real_value *result, const real_value *arg0,
> +  const real_value *arg1, const real_format *format)
> +{
> +  if (REAL_VALUE_ISSIGNALING_NAN (*arg0)
> +  || REAL_VALUE_ISSIGNALING_NAN (*arg1))
> +return false;
> +
> +  if (real_fadd (result, format, arg0, arg1)
> +  && !flag_errno_math)
> +return false;

I think that rather than having a real_fadd function, you should just call 
real_arithmetic with a PLUS_EXPR argument and then move the overflow etc. 
checks out to here.

That way, rather than having a fold_const_fadd function and then similar 
functions for fsub / fmul / fdiv, you can have one function that covers 
all four binary narrowing operations, which all have very similar logic, 
with an argument that is PLUS_EXPR etc. passed to the 
fold_const_narrow_binary (or whatever you call it) function.

> +  if (!exact_real_truncate (format, result)
> +  && !(flag_rounding_math && flag_trapping_math))
> +return false;

I don't think the logic is right here.  What you want is: don't fold for 
inexact results if flag_rounding_math || flag_trapping_math.  I.e.:

  if (!exact_real_truncate (format, result)
  && (flag_rounding_math || flag_trapping_math))

> +/* Perform addition of X and Y and place the result in R to narrowed
> +   down type. Return TRUE if result involves overflow/underflow.  */
> +
> +bool
> +real_fadd (REAL_VALUE_TYPE *r, format_helper fmt,
> +const REAL_VALUE_TYPE *x, const REAL_VALUE_TYPE *y)
> +{
> +  do_add (r, x, y, 0);
> +
> +  /* Overflow/underflow check.  */
> +  if (REAL_EXP (r) > fmt->emax)
> +  {
> +get_inf (r, r->sign);
> +return true;
> +  }

This isn't the right way to check for overflow.  You can have a result 
that's within the exponent range of the narrower format before rounding, 
but overflows after the rounding is done.  The right way to check for 
overflow (given that the result is inexact, and !flag_rounding_math && 
!flag_trapping_math && flag_errno_math so that inexact results are OK but 
not if they overflow) would be to check after rounding whether the result 
is Inf when the arguments were finite.  (When you get only fdiv, that also 
becomes relevant for exact cases; fdiv (1.0, 0.0) produces an exact 
infinity but can't be folded if flag_errno_math || flag_trapping_math.)

As well as overflow you also need to consider the case of fadd (Inf, -Inf) 
which would set errno to EDOM and raise "invalid" so, although exact, is 
also not allowed to be folded if flag_errno_math || flag_trapping_math.  
The generic logic there is: if the arguments are not NaN but the result is 
NaN, you can't fold if flag_errno_math || flag_trapping_math (but 
flag_rounding_math is irrelevant in that case).

> diff --git a/gcc/testsuite/gcc.dg/builtin-fadd-1.c 
> b/gcc/testsuite/gcc.dg/builtin-fadd-1.c
> new file mode 100644
> index 000..958d5a019d7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/builtin-fadd-1.c
> @@ -0,0 +1,31 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#define TEST(FN, VAL1, VAL2, RESULT) \
> +  if (__builtin_##FN (VAL1, VAL2) != RESULT) __builtin_abort ();

The tests ought to be tests of what gets optimized away (so causing link 
failures, as in the roundeven tests, if the calls don't get optimized or 
get optimized incorrectly).

The tests should include cases that demonstrate there is only a single 
rounding involved: that there is no intermediate rounding to the wider 
type.  (E.g. fadd (0x1.01p0, FLT_MIN), as an example from the glibc 
tests: cases where an intermediate rounding produces a result half way 
between two values of the narrower type, but the exact value is such that 
the result of fadd should end up with last bit odd whereas double rounding 
would result in last bit even in such half-way cases.)

> +  TEST(fadd, __FLT_MAX__, __FLT_MAX__/2, 3*__FLT_MAX__/2);

It would seem better to write the expected result here as __builtin_inff 
().

Then you should have some tests of what does *not* get optimized with 
given compiler options if possible.  (Such a test might e.g. define a 
static fadd function locally that checks it gets called as expected, or 
else check the exceptions / errno if you rely on a suitable libm being 
available.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Add missing popcount simplifications (PR90693)

2019-08-21 Thread Jeff Law
On 8/21/19 7:57 AM, Wilco Dijkstra wrote:
> Hi Richard,
> 
>>>
>>> I think this should be in expand stage where there could be comparison
>>> of the cost of the RTLs.
>>
>> I tend to agree here, if not then for the reason the "simplified" variants
>> have more GIMPLE stmts which means they are not "simpler".  In
>> fact I'd argue for canonicalization we'd want to have the reverse
>> "simplifications" on GIMPLE and expansion based on target cost.
>  
> So how would this work? Expand works on one statement at a time, but
> we are dealing with more complex expressions here. When we get a
> popcount (x) > 1 in expand_gimple_cond, the popcount has already been
> expanded. And the code in builtins.c that emits popcount doesn't see or
> consider the comparison, so it would be difficult to change it at that point.
> None of the infrastructure in expand seems to be set up to do complex
> pattern matches and replacements at expand time...
> 
> Costing would be difficult too since rtx_cost doesn't support builtins or
> calls, so each backend would need to be modified to add costs for these.
> 
> So what is the best place to do pattern matches? I thought it was all
> moving to match.pd.
I believe the expanders have access to more than one statement via the
use-def chains and TER's transformations.

jeff


Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-21 Thread Robin Dapp
I'm going to commit the attached two patches.  Removed the redundant
changes in test cases and added constructor initialization of
fold_all_stmts.

Regards
 Robin

--

gcc/ChangeLog:

2019-08-21  Robin Dapp  

* gimple-loop-versioning.cc (loop_versioning::record_address_fragment):
Add nop_convert case.
* tree-ssa-propagate.c
(substitute_and_fold_dom_walker::before_dom_children):
Fold all statements if requested.
* tree-ssa-propagate.h (class substitute_and_fold_engine):
Allow to fold all statements.
* tree-vrp.c (class vrp_folder):
Let substitute_and_fold_engine fold all statements.

gcc/fortran/ChangeLog:

2019-08-21  Robin Dapp  

* trans-intrinsic.c (gfc_conv_intrinsic_findloc): Initialize
to prevent "maybe used uninitialized".

--

gcc/ChangeLog:

2019-08-21  Robin Dapp  

* match.pd: Add (T)(A) + CST -> (T)(A + CST).

gcc/testsuite/ChangeLog:

2019-08-21  Robin Dapp  

* gcc.dg/tree-ssa/copy-headers-5.c: Do not run vrp pass.
* gcc.dg/tree-ssa/copy-headers-7.c: Do not run vrp pass.
* gcc.dg/tree-ssa/loop-15.c: Remove XFAIL.
* gcc.dg/tree-ssa/pr23744.c: Change search pattern.
* gcc.dg/wrapped-binop-simplify.c: New test.
diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index a6e33833680..99ec5f34319 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -5428,8 +5428,9 @@ gfc_conv_intrinsic_findloc (gfc_se *se, gfc_expr *expr)
   tree type;
   tree tmp;
   tree found;
-  tree forward_branch;
-  tree back_branch;
+  /* Initialize here to avoid 'maybe used uninitialized'.  */
+  tree forward_branch = NULL_TREE;
+  tree back_branch = NULL_TREE;
   gfc_loopinfo loop;
   gfc_ss *arrayss;
   gfc_ss *maskss;
diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
index 8fa19488490..35344b7b448 100644
--- a/gcc/gimple-loop-versioning.cc
+++ b/gcc/gimple-loop-versioning.cc
@@ -1266,6 +1266,12 @@ loop_versioning::record_address_fragment (gimple *stmt,
 		  continue;
 		}
 	}
+	  if (CONVERT_EXPR_CODE_P (code))
+	{
+	  tree op1 = gimple_assign_rhs1 (assign);
+	  address->terms[i].expr = strip_casts (op1);
+	  continue;
+	}
 	}
   i += 1;
 }
diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index 0862f83e9a1..7172ef8b4e6 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -814,7 +814,6 @@ ssa_propagation_engine::ssa_propagate (void)
   ssa_prop_fini ();
 }
 
-
 /* Return true if STMT is of the form 'mem_ref = RHS', where 'mem_ref'
is a non-volatile pointer dereference, a structure reference or a
reference to a single _DECL.  Ignore volatile memory references
@@ -1071,6 +1070,14 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb)
 	  stmt = gsi_stmt (i);
 	  gimple_set_modified (stmt, true);
 	}
+  /* Also fold if we want to fold all statements.  */
+  else if (substitute_and_fold_engine->fold_all_stmts
+	  && fold_stmt (, follow_single_use_edges))
+	{
+	  did_replace = true;
+	  stmt = gsi_stmt (i);
+	  gimple_set_modified (stmt, true);
+	}
 
   /* Some statements may be simplified using propagator
 	 specific information.  Do this before propagating
diff --git a/gcc/tree-ssa-propagate.h b/gcc/tree-ssa-propagate.h
index 81b635e0787..f79c2ffadf3 100644
--- a/gcc/tree-ssa-propagate.h
+++ b/gcc/tree-ssa-propagate.h
@@ -100,6 +100,8 @@ class ssa_propagation_engine
 class substitute_and_fold_engine
 {
  public:
+  substitute_and_fold_engine (bool fold_all_stmts = false)
+: fold_all_stmts (fold_all_stmts) { }
   virtual ~substitute_and_fold_engine (void) { }
   virtual bool fold_stmt (gimple_stmt_iterator *) { return false; }
   virtual tree get_value (tree) { return NULL_TREE; }
@@ -107,6 +109,10 @@ class substitute_and_fold_engine
   bool substitute_and_fold (basic_block = NULL);
   bool replace_uses_in (gimple *);
   bool replace_phi_args_in (gphi *);
+
+  /* Users like VRP can set this when they want to perform
+ folding for every propagation.  */
+  bool fold_all_stmts;
 };
 
 #endif /* _TREE_SSA_PROPAGATE_H  */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index e2850682da2..1cba9b4d180 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -6257,6 +6257,7 @@ vrp_prop::visit_phi (gphi *phi)
 class vrp_folder : public substitute_and_fold_engine
 {
  public:
+  vrp_folder () : substitute_and_fold_engine (/* Fold all stmts.  */ true) {  }
   tree get_value (tree) FINAL OVERRIDE;
   bool fold_stmt (gimple_stmt_iterator *) FINAL OVERRIDE;
   bool fold_predicate_in (gimple_stmt_iterator *);
diff --git a/gcc/match.pd b/gcc/match.pd
index 0317bc704f7..22282615f84 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2020,6 +2020,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (cst && !TREE_OVERFLOW (cst))
  (plus { cst; } @0
 
+/* ((T)(A)) + CST -> (T)(A + CST)  */
+#if GIMPLE
+  (simplify
+   (plus (convert 

Re: [PATCH V2 2/8] bpf: new GCC port

2019-08-21 Thread Jonathan Corbet
On Tue, 20 Aug 2019 23:14:35 +0200
jose.march...@oracle.com (Jose E. Marchesi) wrote:

> The kernel verifier doesn't allow backward jumps.
> 
> This may change at some point.  There is much discussion among the
> kernel hackers in whether it is possible to allow bounded loops in a
> safe way.  In that case, some of the restrictions may be lifted.

Actually, bounded loops are supported and allowed in the 5.3 kernel.

https://lwn.net/Articles/794934/

jon


C++ PATCH for c++/91304 - prefix attributes ignored in condition

2019-08-21 Thread Marek Polacek
Currently, we disregard prefix attributes in conditions, e.g.:

  if ([[maybe_unused]] int i = f()) { }

The problem here is that although we've parsed the attribute, it
was never passed down to start_decl, so the effects were lost.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-08-21  Marek Polacek  

PR c++/91304 - prefix attributes ignored in condition.
* parser.c (cp_parser_condition): Handle prefix attributes.

* g++.dg/cpp0x/gen-attrs-70.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index dbbfe1dbc2f..b410a6c030f 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -12066,6 +12066,10 @@ cp_parser_condition (cp_parser* parser)
   /* Restore the saved message.  */
   parser->type_definition_forbidden_message = saved_message;
 
+  /* Gather the attributes that were provided with the
+ decl-specifiers.  */
+  tree prefix_attributes = type_specifiers.attributes;
+
   cp_parser_maybe_commit_to_declaration (parser,
 type_specifiers.any_specifiers_p);
 
@@ -12116,7 +12120,7 @@ cp_parser_condition (cp_parser* parser)
  /* Create the declaration.  */
  decl = start_decl (declarator, _specifiers,
 /*initialized_p=*/true,
-attributes, /*prefix_attributes=*/NULL_TREE,
+attributes, prefix_attributes,
 _scope);
 
  /* Parse the initializer.  */
diff --git gcc/testsuite/g++.dg/cpp0x/gen-attrs-70.C 
gcc/testsuite/g++.dg/cpp0x/gen-attrs-70.C
new file mode 100644
index 000..90a2e97a3f6
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/gen-attrs-70.C
@@ -0,0 +1,13 @@
+// PR c++/91304 - prefix attributes ignored in condition.
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-Wall -Wextra" }
+
+int f();
+
+void g()
+{   
+  if ([[maybe_unused]] int i = f()) { }
+  if ([[deprecated]] int i = f()) { i = 10; } // { dg-warning ".i. is 
deprecated" }
+  if (int i [[maybe_unused]] = f()) { }
+  if (int i [[deprecated]] = f()) { i = 10; } // { dg-warning ".i. is 
deprecated" }
+}


Go patch committed: If hidden function referenced by inline, don't hide descriptor

2019-08-21 Thread Ian Lance Taylor
This patch to the Go frontend checks if a hidden function is
referenced by an inline function when deciding whether to hid the
function descriptor.  This fixes https://golang.org/issue/33739.  The
test case is https://golang.org/cl/191001.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 274755)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1846b07fec2b91facc02ea269f7ab250b30f90b4
+7da359f4659f051c05ff442037cfa61febd5
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 274755)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -1560,10 +1560,14 @@ Func_descriptor_expression::do_get_backe
  || no->name().find("equal") != std::string::npos))
is_exported_runtime = true;
 
+  bool is_referenced_by_inline =
+   no->is_function() && no->func_value()->is_referenced_by_inline();
+
   bool is_hidden = ((no->is_function()
 && no->func_value()->enclosing() != NULL)
|| (Gogo::is_hidden_name(no->name())
-   && !is_exported_runtime)
+   && !is_exported_runtime
+   && !is_referenced_by_inline)
|| Gogo::is_thunk(no));
 
   bvar = context->backend()->immutable_struct(var_name, asm_name,
Index: gcc/go/gofrontend/gogo.h
===
--- gcc/go/gofrontend/gogo.h(revision 274755)
+++ gcc/go/gofrontend/gogo.h(working copy)
@@ -1543,6 +1543,11 @@ class Function
   set_is_inline_only()
   { this->is_inline_only_ = true; }
 
+  // Report whether the function is referenced by an inline body.
+  bool
+  is_referenced_by_inline() const
+  { return this->is_referenced_by_inline_; }
+
   // Mark the function as referenced by an inline body.
   void
   set_is_referenced_by_inline()


Re: [PATCH] Add missing popcount simplifications (PR90693)

2019-08-21 Thread Wilco Dijkstra
Hi Richard,

> >
> > I think this should be in expand stage where there could be comparison
> > of the cost of the RTLs.
> 
> I tend to agree here, if not then for the reason the "simplified" variants
> have more GIMPLE stmts which means they are not "simpler".  In
> fact I'd argue for canonicalization we'd want to have the reverse
> "simplifications" on GIMPLE and expansion based on target cost.
 
So how would this work? Expand works on one statement at a time, but
we are dealing with more complex expressions here. When we get a
popcount (x) > 1 in expand_gimple_cond, the popcount has already been
expanded. And the code in builtins.c that emits popcount doesn't see or
consider the comparison, so it would be difficult to change it at that point.
None of the infrastructure in expand seems to be set up to do complex
pattern matches and replacements at expand time...

Costing would be difficult too since rtx_cost doesn't support builtins or
calls, so each backend would need to be modified to add costs for these.

So what is the best place to do pattern matches? I thought it was all
moving to match.pd.

Wilco

Re: [C++] Protect call to copy_attributes_to_builtin (PR91505)

2019-08-21 Thread Richard Biener
On Wed, Aug 21, 2019 at 2:59 PM Richard Sandiford
 wrote:
>
> copy_attributes_to_builtin only handles BUILT_IN_NORMAL, but C++ was
> calling it immediately after the:
>
>   if (DECL_BUILT_IN_CLASS (newdecl) == BUILT_IN_NORMAL)
>
> block.  The corresponding C code calls it inside the block instead.
>
> Tested on x86_64-linux-gnu.  OK to install?

OK (looks even obvious).

Richard.

> Richard
>
>
> 2019-08-21  Richard Sandiford  
>
> gcc/cp/
> PR c++/91505
> * decl.c (duplicate_decls): Call copy_attributes_to_builtin inside
> the BUILT_IN_NORMAL block rather than afterward.
>
> gcc/testsuite/
> PR c++/91505
> * g++.target/i386/crc32-4.C: New test.
>
> Index: gcc/cp/decl.c
> ===
> --- gcc/cp/decl.c   2019-08-15 15:26:04.740237801 +0100
> +++ gcc/cp/decl.c   2019-08-21 13:57:57.336587359 +0100
> @@ -2565,9 +2565,9 @@ duplicate_decls (tree newdecl, tree oldd
> set_builtin_decl_declared_p (fncode, true);
>   break;
> }
> -   }
>
> - copy_attributes_to_builtin (newdecl);
> + copy_attributes_to_builtin (newdecl);
> +   }
> }
>if (new_defines_function)
> /* If defining a function declared with other language
> Index: gcc/testsuite/g++.target/i386/crc32-4.C
> ===
> --- /dev/null   2019-07-30 08:53:31.317691683 +0100
> +++ gcc/testsuite/g++.target/i386/crc32-4.C 2019-08-21 13:57:57.336587359 
> +0100
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx2" } */
> +
> +#include "../../gcc.target/i386/crc32-4.c"


[PATCH] libgcc/crtstuff.c: Fix incorrect alignment of entries in CRT data structures (PR target/91306)

2019-08-21 Thread Jozef Lawrynowicz
As described in PR target/91306, the alignment of entries in CRT data structures
as set in libgcc/crtstuff.c can be too large for msp430-elf.

crtstuff.c assumes the correct alignment of entries with a type of (void *) or
int32 is the "sizeof" that type.

However, for msp430-elf in the large memory model, pointer size is 4 bytes, but
they only need to be 2-byte aligned. Likewise for int32.

This is causing problems for __frame_dummy_init_array_entry, which gets 
inserted into .init_array.
It is being forced into a 4 byte alignment, which can result in a gap between
__frame_dummy_init_array_entry and the previous entry (or the start of the
table). So when the startup code comes to run through the entries
in .init_array, it interprets 2-bytes of padding 0s as part of a function
address, resulting in a function call to the incorrect location.

This issue has only recently appeared because msp430-elf was just transitioned
to use init/fini_array by default, as mandated by the mspabi.

The other CRT array entries in crtstuff.c are not used for msp430-elf, so aren't
causing problems.

As suggested in the PR, I tried changing the alignment of the array entry from
sizeof(func_ptr) to __alignof__(func_ptr), and this fixed the issue. In the
attached patch I also updated the other uses of sizeof in "aligned" attributes
to use __alignof__ instead.

I understand the alignment to sizeof(func_ptr) was originally added to fix an
issue on mips64 (r182066
https://gcc.gnu.org/ml/gcc-patches/2011-12/msg00393.html), however I do
not have access to a mips64 platform for which to bootstrap on. I tried building
a cross-compiler with the "aligned" attributes removed to see if I could
reproduce the failure, but the build completed successfully.
I built the mips64 cross-compiler with my patch applied and it
also completed successfully.

I successfully regtested the patch for msp430-elf. It fixes many execution
failures for msp430-elf/-mlarge.
I also successfully bootstrapped and regtested x86_64-pc-linux-gnu (g++ gcc
gfortran libatomic libgomp libitm libstdc++ objc).

According to the mips eabi (is this the right one?
http://www.cygwin.com/ml/binutils/2003-06/msg00436.html), pointers have 64-bit
size and alignment in 64-bit mode, so I am assuming using __alignof__ instead of
sizeof isn't going to cause problems.

Ok for trunk?
>From 636ffa0c4f15047dc27c829d9a3c8fea9ad5d635 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 15 Aug 2019 14:17:25 +0100
Subject: [PATCH] libgcc/crtstuff.c: Fix incorrect alignment of entries in CRT
 data structures

libgcc/ChangeLog:

2019-08-21  Jozef Lawrynowicz  

	* crtstuff.c (__CTOR_LIST__): Align to the "__alignof__" the array
	element type, instead of "sizeof" the element type.
	(__DTOR_LIST__): Likewise.
	(__TMC_LIST__): Likewise.
	(__do_global_dtors_aux_fini_array_entry): Likewise.
	(__frame_dummy_init_array_entry): Likewise.
	(__CTOR_END__): Likewise.
	(__DTOR_END__): Likweise.
	(__FRAME_END__): Likewise.
	(__TMC_END__): Likewise.
---
 libgcc/crtstuff.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
index 4927a9f8977..c8a8e2c87a5 100644
--- a/libgcc/crtstuff.c
+++ b/libgcc/crtstuff.c
@@ -233,11 +233,11 @@ CTOR_LIST_BEGIN;
 static func_ptr force_to_data[1] __attribute__ ((__used__)) = { };
 asm (__LIBGCC_CTORS_SECTION_ASM_OP__);
 STATIC func_ptr __CTOR_LIST__[1]
-  __attribute__ ((__used__, aligned(sizeof(func_ptr
+  __attribute__ ((__used__, aligned(__alignof__(func_ptr
   = { (func_ptr) (-1) };
 #else
 STATIC func_ptr __CTOR_LIST__[1]
-  __attribute__ ((__used__, section(".ctors"), aligned(sizeof(func_ptr
+  __attribute__ ((__used__, section(".ctors"), aligned(__alignof__(func_ptr
   = { (func_ptr) (-1) };
 #endif /* __CTOR_LIST__ alternatives */
 
@@ -246,11 +246,11 @@ DTOR_LIST_BEGIN;
 #elif defined(__LIBGCC_DTORS_SECTION_ASM_OP__)
 asm (__LIBGCC_DTORS_SECTION_ASM_OP__);
 STATIC func_ptr __DTOR_LIST__[1]
-  __attribute__ ((aligned(sizeof(func_ptr
+  __attribute__ ((aligned(__alignof__(func_ptr
   = { (func_ptr) (-1) };
 #else
 STATIC func_ptr __DTOR_LIST__[1]
-  __attribute__((section(".dtors"), aligned(sizeof(func_ptr
+  __attribute__((section(".dtors"), aligned(__alignof__(func_ptr
   = { (func_ptr) (-1) };
 #endif /* __DTOR_LIST__ alternatives */
 #endif /* USE_INITFINI_ARRAY */
@@ -265,7 +265,7 @@ STATIC EH_FRAME_SECTION_CONST char __EH_FRAME_BEGIN__[]
 
 #if USE_TM_CLONE_REGISTRY
 STATIC func_ptr __TMC_LIST__[]
-  __attribute__((used, section(".tm_clone_table"), aligned(sizeof(void*
+  __attribute__((used, section(".tm_clone_table"), aligned(__alignof__(void*
   = { };
 # ifdef HAVE_GAS_HIDDEN
 extern func_ptr __TMC_END__[] __attribute__((__visibility__ ("hidden")));
@@ -430,8 +430,8 @@ __do_global_dtors_aux (void)
 CRT_CALL_STATIC_FUNCTION (FINI_SECTION_ASM_OP, __do_global_dtors_aux)
 #elif defined (FINI_ARRAY_SECTION_ASM_OP)
 static func_ptr 

[C++] Protect call to copy_attributes_to_builtin (PR91505)

2019-08-21 Thread Richard Sandiford
copy_attributes_to_builtin only handles BUILT_IN_NORMAL, but C++ was
calling it immediately after the:

  if (DECL_BUILT_IN_CLASS (newdecl) == BUILT_IN_NORMAL)

block.  The corresponding C code calls it inside the block instead.

Tested on x86_64-linux-gnu.  OK to install?

Richard


2019-08-21  Richard Sandiford  

gcc/cp/
PR c++/91505
* decl.c (duplicate_decls): Call copy_attributes_to_builtin inside
the BUILT_IN_NORMAL block rather than afterward.

gcc/testsuite/
PR c++/91505
* g++.target/i386/crc32-4.C: New test.

Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c   2019-08-15 15:26:04.740237801 +0100
+++ gcc/cp/decl.c   2019-08-21 13:57:57.336587359 +0100
@@ -2565,9 +2565,9 @@ duplicate_decls (tree newdecl, tree oldd
set_builtin_decl_declared_p (fncode, true);
  break;
}
-   }
 
- copy_attributes_to_builtin (newdecl);
+ copy_attributes_to_builtin (newdecl);
+   }
}
   if (new_defines_function)
/* If defining a function declared with other language
Index: gcc/testsuite/g++.target/i386/crc32-4.C
===
--- /dev/null   2019-07-30 08:53:31.317691683 +0100
+++ gcc/testsuite/g++.target/i386/crc32-4.C 2019-08-21 13:57:57.336587359 
+0100
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2" } */
+
+#include "../../gcc.target/i386/crc32-4.c"


Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable

2019-08-21 Thread James Greenhalgh
On Mon, Oct 24, 2016 at 03:27:10PM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> When storing a 64-bit immediate that has equal bottom and top halves we 
> currently
> synthesize the repeating 32-bit pattern twice and perform a single X-store.
> With this patch we synthesize the 32-bit pattern once into a W register and 
> store
> that twice using an STP. This reduces codesize bloat from synthesising the 
> same
> constant multiple times at the expense of converting a store to a store-pair.
> It will only trigger if we can save two or more instructions, so it will only 
> transform:
>  mov x1, 49370
>  movkx1, 0xc0da, lsl 32
>  str x1, [x0]
> 
> into:
> 
>  mov w1, 49370
>  stp w1, w1, [x0]
> 
> when optimising for -Os, whereas it will always transform a 4-insn synthesis
> sequence into a two-insn sequence + STP (see comments in the patch).
> 
> This patch triggers already but will trigger more with the store merging pass
> that I'm working on since that will generate more of these repeating 64-bit 
> constants.
> This helps improve codegen on 456.hmmer where store merging can sometimes 
> create very
> complex repeating constants and target-specific expand needs to break them 
> down.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for trunk?

Hi Kyrill,

Does this do the right thing for:

  void bar(u64 *x)
  {
*(volatile u64 *)x = 0xabcdef10abcdef10;
  }

C.f. 
https://lore.kernel.org/lkml/20190821103200.kpufwtviqhpbuv2n@willie-the-truck/

i.e. is this optimization still valid for volatile?

Thanks,
James

> 
> Thanks,
> Kyrill
> 
> 2016-10-24  Kyrylo Tkachov  
> 
>  * config/aarch64/aarch64.md (mov): Call
>  aarch64_split_dimode_const_store on DImode constant stores.
>  * config/aarch64/aarch64-protos.h (aarch64_split_dimode_const_store):
>  New prototype.
>  * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New
>  function.
> 
> 2016-10-24  Kyrylo Tkachov  
> 
>  * gcc.target/aarch64/store_repeating_constant_1.c: New test.
>  * gcc.target/aarch64/store_repeating_constant_2.c: Likewise.



Re: [PATCH v6 3/3] PR80791 Consider doloop cmp use in ivopts

2019-08-21 Thread Bin.Cheng
On Wed, Aug 14, 2019 at 3:23 PM Kewen.Lin  wrote:
>
> Hi!
>
> Comparing to the previous versions of implementation mainly based on the
> existing IV cands but zeroing the related group/use cost, this new one is 
> based
> on Richard and Segher's suggestion introducing one doloop dedicated IV cand.
>
> Some key points are listed below:
>   1) New field doloop_p in struct iv_cand to indicate doloop dedicated IV 
> cand.
>   2) Special name "doloop" assigned.
>   3) Doloop IV cand with form (niter+1, +, -1)
>   4) For doloop IV cand, no extra one cost like BIV, assign zero cost for 
> step.
>   5) Support may_be_zero (regressed PR is in this case), the base of doloop IV
>  can be COND_EXPR, add handlings in cand_value_at and may_eliminate_iv.
>   6) Add more expr support in force_expr_to_var_cost for reasonable cost
>  calculation on the IV base with may_be_zero (like COND_EXPR).
>   7) Set zero cost when using doloop IV cand for doloop use.
>   8) Add three hooks (should we merge _generic and _address?).
> *) have_count_reg_decr_p, is to indicate the target has special hardware
>count register, we shouldn't consider the impact of doloop IV when
>calculating register pressures.
> *) doloop_cost_for_generic, is the extra cost when using doloop IV cand 
> for
>generic type IV use.
> *) doloop_cost_for_address, is the extra cost when using doloop IV cand 
> for
>address type IV use.
What will happen if doloop IV cand be used for generic/address type iv
use?  Can RTL doloop can still perform doloop optimization in this
case?

>
> Bootstrapped on powerpc64le-linux-gnu and regression testing passed excepting
> for one failure on gcc/testsuite/gcc.dg/guality/loop-1.c at -O3 which is 
> tracked
> by PR89983.
>
> Any comments and suggestions are highly appreciated.  Thanks!
Not sure if I understand the patch correctly, some comments embedded.

+  /* The number of doloop candidate in the set.  */
+  unsigned n_doloop_cands;
+
This is unnecessary.  See below comments.

-add_candidate_1 (data, base, step, important,
-IP_NORMAL, use, NULL, orig_iv);
+add_candidate_1 (data, base, step, important, IP_NORMAL, use, NULL, doloop,
+orig_iv);
   if (ip_end_pos (data->current_loop)
   && allow_ip_end_pos_p (data->current_loop))
-add_candidate_1 (data, base, step, important, IP_END, use, NULL, orig_iv);
+add_candidate_1 (data, base, step, important, IP_END, use, NULL, doloop,
+orig_iv);
Do we need to skip ip_end_pos case for doloop candidate?  Because the
candidate increment will be inserted in latch, i.e, increment position
is after exit condition.

-  tree_to_aff_combination (iv->base, type, val);
+  tree base = iv->base;
+  /* See add_iv_candidate_for_doloop, if may_be_zero is set, we want to extract
+ the value under !may_be_zero to get the compact bound which also well fits
+ for may_be_zero since we ensure the value for it is const one.  */
+  if (cand->doloop_p && desc->may_be_zero && !integer_zerop
(desc->may_be_zero))
+base = fold_build2 (PLUS_EXPR, TREE_TYPE (niter),
+   unshare_expr (rewrite_to_non_trapping_overflow (niter)),
+   build_int_cst (TREE_TYPE (niter), 1));
+  tree_to_aff_combination (base, type, val);
I don't quite follow here.  The iv->base is computed from niter, I
suppose compact bound is for cheaper candidate initialization?  Why
it's possible to extract !may_be_zero niter for may_be_zero here?  The
niter under !may_be_zero has no indication about the real niter under
may_be_zero.

-  cand_value_at (loop, cand, use->stmt, desc->niter, );
+  cand_value_at (loop, cand, use->stmt, desc, );
If I understand correctly, doloop use/cand will only be
identified/added for single exit loop, and there will be only one
cond(doloop) iv_use and only one doloop cand for doloop loop.  So the
cand_value at niter at use position would be 0.  If that's the case,
we can skip calling cand_value_at here for doloop cand.  The change to
cand_value_at would be unnecessary neither.

-  expensive.  */
-  if (!integer_zerop (desc->may_be_zero))
+  expensive.
+
+ For doloop candidate, we have considered MAY_BE_ZERO for IV base, need to
+ support MAY_BE_ZERO ? 0 : NITER, so simply bypass this check.  */
+  if (!integer_zerop (desc->may_be_zero) && !cand->doloop_p)
 return iv_elimination_compare_lt (data, cand, comp, desc);
And we can early return before this?

+  if (may_be_zero)
+{
+  if (COMPARISON_CLASS_P (may_be_zero))
+   {
+ niter = fold_build3 (COND_EXPR, ntype, may_be_zero,
+  build_int_cst (ntype, 0),
+  rewrite_to_non_trapping_overflow (niter));
+   }
+  /* Don't try to obtain the iteration count expression when may_be_zero is
+integer_nonzerop (actually iteration count is one) or else.  */
+  else
+   return;
+}

Re: PC-relative TLS support

2019-08-21 Thread Alan Modra
On Mon, Aug 19, 2019 at 07:45:19AM -0500, Segher Boessenkool wrote:
> But if you think we can remove the !TARGET_TLS_MARKERS everywhere it
> is relevant at all, now is the time, patches very welcome, it would be
> a nice cleanup :-)  Needs testing everywhere of course, but now is
> stage 1 :-)

This patch removes !TARGET_TLS_MARKERS support.  -mtls-markers (and
-mno-tls-markers) disappear as valid options too, because I figure
they haven't been used too much except by people testing the
compiler.  Bootstrapped and regression tested powerpc64le-linux and
powerpc-ibm-aix7.1.3.0 (on gcc111).  I believe powerpc*-darwin doesn't
support TLS.

Requiring an 8 year old binutils-2.20 shouldn't be that onerous.

Note that this patch doesn't remove the configure test to set
HAVE_AS_TLS_MARKERS.  I was wondering whether I ought to hook that
into a "sorry, your assembler is too old" error?

* config/rs6000/rs6000-protos.h (rs6000_output_tlsargs): Delete.
* config/rs6000/rs6000.c (rs6000_output_tlsargs): Delete.
(rs6000_legitimize_tls_address): Remove !TARGET_TLS_MARKERS code.
(rs6000_call_template_1): Delete TARGET_TLS_MARKERS test and
allow other UNSPECs besides UNSPEC_TLSGD and UNSPEC_TLSLD.
(rs6000_indirect_call_template_1): Likewise.
(rs6000_pltseq_template): Likewise.
(rs6000_opt_vars): Remove "tls-markers" entry.
* config/rs6000/rs6000.h (TARGET_TLS_MARKERS): Don't define.
(IS_NOMARK_TLSGETADDR): Likewise.
* config/rs6000/rs6000.md (tls_gd): Replace TARGET_TLS_MARKERS
with !TARGET_XCOFF.
(tls_gd_high, tls_gd_low): Likewise.
(tls_ld, tls_ld_high, tls_ld_low): Likewise.
(pltseq_plt_pcrel): Likewise.
(call_value_local32): Remove IS_NOMARK_TLSGETADDR predicate test.
(call_value_local64): Likewise.
(call_value_indirect_nonlocal_sysv): Remove IS_NOMARK_TLSGETADDR
output and length attribute sub-expression.
(call_value_nonlocal_sysv),
(call_value_nonlocal_sysv_secure),
(call_value_local_aix, call_value_nonlocal_aix),
(call_value_indirect_aix, call_value_indirect_elfv2),
(call_value_indirect_pcrel): Likewise.
* config/rs6000/rs6000.opt (mtls-markers): Delete.
* doc/install.texi (powerpc-*-*): Require binutils-2.20.

diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 06e40d94b17..88b5b7cec55 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -139,7 +139,6 @@ extern bool valid_sf_si_move (rtx, rtx, machine_mode);
 extern void rs6000_emit_move (rtx, rtx, machine_mode);
 extern bool rs6000_legitimate_offset_address_p (machine_mode, rtx,
bool, bool);
-extern void rs6000_output_tlsargs (rtx *);
 extern rtx rs6000_find_base_term (rtx);
 extern rtx rs6000_return_addr (int, rtx);
 extern void rs6000_output_symbol_ref (FILE*, rtx);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index e792116fb40..5e2b08c3c72 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -8329,41 +8329,6 @@ rs6000_legitimize_tls_address_aix (rtx addr, enum 
tls_model model)
   return dest;
 }
 
-/* Output arg setup instructions for a !TARGET_TLS_MARKERS
-   __tls_get_addr call.  */
-
-void
-rs6000_output_tlsargs (rtx *operands)
-{
-  /* Set up operands for output_asm_insn, without modifying OPERANDS.  */
-  rtx op[3];
-
-  /* The set dest of the call, ie. r3, which is also the first arg reg.  */
-  op[0] = operands[0];
-  /* The TLS symbol from global_tlsarg stashed as CALL operand 2.  */
-  op[1] = XVECEXP (operands[2], 0, 0);
-  if (XINT (operands[2], 1) == UNSPEC_TLSGD)
-{
-  /* The GOT register.  */
-  op[2] = XVECEXP (operands[2], 0, 1);
-  if (TARGET_CMODEL != CMODEL_SMALL)
-   output_asm_insn ("addis %0,%2,%1@got@tlsgd@ha\n\t"
-"addi %0,%0,%1@got@tlsgd@l", op);
-  else
-   output_asm_insn ("addi %0,%2,%1@got@tlsgd", op);
-}
-  else if (XINT (operands[2], 1) == UNSPEC_TLSLD)
-{
-  if (TARGET_CMODEL != CMODEL_SMALL)
-   output_asm_insn ("addis %0,%1,%&@got@tlsld@ha\n\t"
-"addi %0,%0,%&@got@tlsld@l", op);
-  else
-   output_asm_insn ("addi %0,%1,%&@got@tlsld", op);
-}
-  else
-gcc_unreachable ();
-}
-
 /* Passes the tls arg value for global dynamic and local dynamic
emit_library_call_value in rs6000_legitimize_tls_address to
rs6000_call_aix and rs6000_call_sysv.  This is used to emit the
@@ -8465,16 +8430,10 @@ rs6000_legitimize_tls_address (rtx addr, enum tls_model 
model)
  rtx arg = gen_rtx_UNSPEC (Pmode, gen_rtvec (2, addr, got),
UNSPEC_TLSGD);
  tga = rs6000_tls_get_addr ();
+ rtx argreg = gen_rtx_REG (Pmode, 3);
+ emit_insn (gen_rtx_SET (argreg, arg));
  global_tlsarg = arg;
- if 

Re: [PATCH 1/8] Document signaling for min, max and ltgt operations

2019-08-21 Thread Segher Boessenkool
On Wed, Aug 21, 2019 at 02:00:10PM +0200, Ilya Leoshkevich wrote:
> > Am 20.08.2019 um 19:13 schrieb Segher Boessenkool 
> > :
> > On Tue, Aug 20, 2019 at 06:13:00PM +0200, Ilya Leoshkevich wrote:
> >>> Am 20.08.2019 um 17:50 schrieb Segher Boessenkool 
> >>> :
> >>> There is currently no way to say (in trees or gimple or rtl) whether
> >>> comparisons are signaling ("ordered", generate a trap on an unordered
> >>> result).  I am working on this, but :-)
> >> 
> >> Isn't there?  This whole series is based on the following assumption:
> >> LT, LE, GT, GE are definitely signaling; LTGT is most likely signaling
> >> as well; the rest are not signaling.  This is based on gccint 11.6.3:
> >> Unary and Binary Expressions.
> > 
> > There is currently no way to implement, say, iseqsig.  And whether an
> > operation is signaling should be determined by the language frontend,
> > not separately by each backend!
> 
> Wouldn't expressing it as ((x <= y) && (x >= y)) work?

That is optimised only partially (on gimple) (first a compare for <=,
then a branch if false, then return "==").

I meant it cannot be implemented directly as one RTL (or gimple) insn,
although it can be done as one machine insn on many architectures.

> > (There should be a signaling and a non-signaling version of every float
> > comparison that can be unordered).
> 
> I wholeheartedly agree.  I had to write quite a few ugly patterns to
> work around the lack of e.g. non-signaling GT.

Ooh, buy in, I like it.  Thanks for the encouragement :-)


Segher


Re: [PATCH] Builtin function roundeven folding implementation

2019-08-21 Thread Joseph Myers
On Wed, 21 Aug 2019, Martin Jambor wrote:

> Hi Tejas,
> 
> On Wed, Aug 14 2019, Tejas Joshi wrote:
> > Hi.
> > Here is a clean patch that does not fold roundeven resulting for
> > integer type and the conditions for folding functions
> > round/ceil/floor/roundeven and trunc only checks for signaling NaN.
> 
> wouldn't checking for *signalling* NaNs mean using the macro
> REAL_VALUE_ISSIGNALING_NAN rather than REAL_VALUE_ISNAN?

Yes, it would.  So the patch should be retested / reposted with that 
change.

> > \ No newline at end of file

This should be fixed (in both places, the tests should each end with 
exactly one newline character).

The new built-in functions roundeven / roundevenf / roundevenl also need 
to be added to the lists in extend.texi (the list of GNU extension 
functions for now, though once we add DEF_C2X_BUILTIN they'd go in a list 
of C2X built-in functions instead).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 1/8] Document signaling for min, max and ltgt operations

2019-08-21 Thread Ilya Leoshkevich



> Am 20.08.2019 um 19:13 schrieb Segher Boessenkool 
> :
> 
> On Tue, Aug 20, 2019 at 06:13:00PM +0200, Ilya Leoshkevich wrote:
>>> Am 20.08.2019 um 17:50 schrieb Segher Boessenkool 
>>> :
>>> There is currently no way to say (in trees or gimple or rtl) whether
>>> comparisons are signaling ("ordered", generate a trap on an unordered
>>> result).  I am working on this, but :-)
>> 
>> Isn't there?  This whole series is based on the following assumption:
>> LT, LE, GT, GE are definitely signaling; LTGT is most likely signaling
>> as well; the rest are not signaling.  This is based on gccint 11.6.3:
>> Unary and Binary Expressions.
> 
> There is currently no way to implement, say, iseqsig.  And whether an
> operation is signaling should be determined by the language frontend,
> not separately by each backend!

Wouldn't expressing it as ((x <= y) && (x >= y)) work?

In any case, I'll need to check whether my patch series handles iseqsig
in at least remotely sane way..

> 
> (There should be a signaling and a non-signaling version of every float
> comparison that can be unordered).

I wholeheartedly agree.  I had to write quite a few ugly patterns to
work around the lack of e.g. non-signaling GT.


Re: Fix crash with -fdump-ada-spec

2019-08-21 Thread Joseph Myers
On Wed, 21 Aug 2019, Eric Botcazou wrote:

> This fixes a crash of the C compiler when -fdump-ada-spec is passed for:
> 
> extern void (*signal (int __sig, void (*__handler)(int)))(int);
> 
> It appears that the C parser is somewhat confused by this declaration and 
> builds a FUNCTION_DECL whose type has got 2 arguments (TYPE_ARG_TYPES list) 
> but which itself has got only 1 argument (DECL_ARGUMENTS list) corresponding 
> to the anonymous (int) at the end.  The C++ parser doesn't have the issue.
> 
> Joseph, is that a known issue of the C parser?

The problem is that the logic in c_parser_declaration_or_fndef for setting 
DECL_ARGUMENTS (not previously meaningful in non-definition declarations 
at all), as introduced by

r253411 | dmalcolm | 2017-10-04 14:10:59 + (Wed, 04 Oct 2017) | 85 lines

C: underline parameters in mismatching function calls

(and subsequently changed by r268574, but that change isn't relevant here) 
is incorrect.  Rather than checking if the outermost declarator is 
cdk_function and using its parameters if so, it's necessary to check if 
the *innermost declarator that isn't cdk_id or cdk_attrs* is cdk_function 
and use its parameters if so, as that's what actually determines if the 
declarator for the entity being declared has the form of a function 
declarator (see how grokdeclarator determines funcdef_syntax).  In your 
example, the outermost declarator is the one for the target type of the 
pointer that is the return type.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [SVE] PR88839

2019-08-21 Thread Prathamesh Kulkarni
On Wed, 21 Aug 2019 at 15:18, Richard Sandiford
 wrote:
>
> Prathamesh Kulkarni  writes:
> > Hi,
> > The attached patch is a fix for PR88839 ported from sve-acle-branch.
> > OK to commit to trunk ?
> >
> > Thanks,
> > Prathamesh
> >
> > 2019-08-21  Prathamesh Kulkarni  
> >   Richard Sandiford  
> >
> >   PR target/88839
> >   * config/aarch64/aarch64.c (aarch64_evpc_sel): New function.
> >   (aarch64_expand_vec_perm_const_1): Call aarch64_evpc_sel.
> >
> > testsuite/
> >   * gcc.target/aarch64/sve/sel_1.c: New test.
> >   * gcc.target/aarch64/sve/sel_2.c: Likewise.
> >   * gcc.target/aarch64/sve/sel_3.c: Likewise.
> >   * gcc.target/aarch64/sve/sel_4.c: Likewise.
> >   * gcc.target/aarch64/sve/sel_5.c: Likewise.
> >   * gcc.target/aarch64/sve/sel_6.c: Likewise.
> >
> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> > index ec60e972f5f..f8d5270b982 100644
> > --- a/gcc/config/aarch64/aarch64.c
> > +++ b/gcc/config/aarch64/aarch64.c
> > @@ -16632,6 +16632,50 @@ aarch64_evpc_sve_tbl (struct expand_vec_perm_d *d)
> >return true;
> >  }
> >
> > +/* Try to implement D using SVE SEL instruction.  */
> > +
> > +static bool
> > +aarch64_evpc_sel (struct expand_vec_perm_d *d)
> > +{
> > +  machine_mode vmode = d->vmode;
> > +  int nunits = GET_MODE_UNIT_SIZE (vmode);
>
> Sorry for not noticing last time, but this should be "unit_size"
> rather than "nunit".
Oops, sorry about that.
>
> OK with that change, thanks.
Thanks, will commit the patch after validating the patch on trunk.

Thanks,
Prathamesh
>
> Richard


Re: [PATCH] Fix PR91482

2019-08-21 Thread Richard Biener
On Tue, 20 Aug 2019, Richard Biener wrote:

> 
> Excessive use of __builtin_assume_aligned can cause missed optimizations
> because those calls are propagation barriers.  The following removes
> those that are redundant and provide no extra information, on the
> testcase allowng store-merging to apply.
> 
> Since the bit lattice and the const/copy lattice are merged
> we cannot track this during CCP propagation (make a copy out
> of the redundant call and propagate that out).  Thus I apply
> it in the CCP specific folding routine called during
> substitute_and_fold only.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

The following is what I have applied.

Richard.

2019-08-21  Richard Biener  

PR tree-optimization/91482
* tree-ssa-ccp.c (ccp_folder::fold_stmt): Remove useless
BUILT_IN_ASSUME_ALIGNED calls.

* gcc.dg/tree-ssa/pr91482.c: New testcase.

Index: gcc/tree-ssa-ccp.c
===
--- gcc/tree-ssa-ccp.c  (revision 274750)
+++ gcc/tree-ssa-ccp.c  (working copy)
@@ -2315,6 +2315,32 @@ ccp_folder::fold_stmt (gimple_stmt_itera
  }
   }
 
+   /* If there's no extra info from an assume_aligned call,
+  drop it so it doesn't act as otherwise useless dataflow
+  barrier.  */
+   if (gimple_call_builtin_p (stmt, BUILT_IN_ASSUME_ALIGNED))
+ {
+   tree ptr = gimple_call_arg (stmt, 0);
+   ccp_prop_value_t ptrval = get_value_for_expr (ptr, true);
+   if (ptrval.lattice_val == CONSTANT
+   && TREE_CODE (ptrval.value) == INTEGER_CST
+   && ptrval.mask != 0)
+ {
+   ccp_prop_value_t val
+ = bit_value_assume_aligned (stmt, NULL_TREE, ptrval, false);
+   unsigned int ptralign = least_bit_hwi (ptrval.mask.to_uhwi ());
+   unsigned int align = least_bit_hwi (val.mask.to_uhwi ());
+   if (ptralign == align
+   && ((TREE_INT_CST_LOW (ptrval.value) & (align - 1))
+   == (TREE_INT_CST_LOW (val.value) & (align - 1
+ {
+   bool res = update_call_from_tree (gsi, ptr);
+   gcc_assert (res);
+   return true;
+ }
+ }
+ }
+
/* Propagate into the call arguments.  Compared to replace_uses_in
   this can use the argument slot types for type verification
   instead of the current argument type.  We also can safely
Index: gcc/testsuite/gcc.dg/tree-ssa/pr91482.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr91482.c (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr91482.c (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1 -fdump-tree-store-merging" } */
+
+void write64 (void *p)
+{
+  unsigned *p1 = (unsigned *) __builtin_assume_aligned (p, 8);
+  *p1++ = 0;
+  unsigned *p2 = (unsigned *) __builtin_assume_aligned (p1, 4);
+  *p2++ = 1;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_assume_aligned" 1 "ccp1" } } */
+/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 
2 stores" "store-merging" { target lp64 } } } */


Re: [PATCH 2/8] Introduce vcond_supported_p hook

2019-08-21 Thread Ilya Leoshkevich
> Am 20.08.2019 um 13:54 schrieb Richard Sandiford :
> 
> Ilya Leoshkevich  writes:
>>> Am 20.08.2019 um 12:13 schrieb Richard Sandiford 
>>> :
>>> 
>>> Ilya Leoshkevich  writes:
 z13 supports only non-signaling vector comparisons.  This means we
 cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.
 However, we cannot express this restriction today: the code only checks
 whether vcond$a$b optab, which does not contain information about the
 operation.
 
 Introduce a hook that tells whether target supports certain vector
 comparison operations with certain modes.
 
 gcc/ChangeLog:
 
 2019-08-09  Ilya Leoshkevich  
 
* doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document.
* doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document.
* optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p
in addition to get_vcond_icode.
* target.def (vcond_supported_p): New hook.
* targhooks.c (default_vcond_supported_p): Likewise.
* targhooks.h (default_vcond_supported_p): Likewise.
>>> 
>>> IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses
>>> the predicate for operand 3 to test whether a particular comparison
>>> is supported.  I guess this would require a cached rtx to avoid
>>> generating too much garbage rtl though (via GTY((cache))).
>> 
>> How can I implement such a predicate?  Would calling maybe_gen_insn with
>> a fake rtx be reasonable?  In this case, what would be the best way to
>> generate fake input operands?  The existing code that calls
>> maybe_gen_insn gets the corresponding rtxes from upper layers.
> 
> I was thinking of something like optabs.c:can_compare_p, but with
> some caching to reduce the overhead, and comparing registers rather
> than constants.  E.g.:
> 
> static rtx cached_binop GTY ((cached));
> 
> rtx
> get_cached_binop (machine_mode mode, rtx_code code, machine_mode op_mode)
> {
>  ...create or modify cached_binop, with register operands...;
>  return cached_binop;
> }

Thanks, I got it working!  Still need to run the regtest, but the result
looks promising.

I have a question about caching.  If I define just `static rtx
cached_binop`, like you suggest, I would expect to see constant cache
misses.  Shouldn't it be something like `hash_map, rtx>`?


Re: [PATCH] Builtin function roundeven folding implementation

2019-08-21 Thread Martin Jambor
Hi Tejas,

On Wed, Aug 14 2019, Tejas Joshi wrote:
> Hi.
> Here is a clean patch that does not fold roundeven resulting for
> integer type and the conditions for folding functions
> round/ceil/floor/roundeven and trunc only checks for signaling NaN.

wouldn't checking for *signalling* NaNs mean using the macro
REAL_VALUE_ISSIGNALING_NAN rather than REAL_VALUE_ISNAN?

Joseph, would fixing this make the patch good to go?  Tejas's GSoC
project results would look much better if we could get this and the
(already approved) follow-up patch committed by the end of the week.

Thanks a lot,

Martin


> 2019-06-12  Tejas Joshi  
>
> * builtins.c (mathfn_built_in_2): Added CASE_MATHFN for ROUNDEVEN.
> * builtins.def: Added function definitions for roundeven function
> variants.
> * fold-const-call.c (fold_const_call_ss): Added case for roundeven
> function call.
> * fold-const.c (negate_mathfn_p): Added case for roundeven function.
> (tree_call_nonnegative_warnv_p): Added case for roundeven function.
> (integer_valued_real_call_p): Added case for roundeven function.
> * real.c (is_even): New function. Returns true if real number is
> even, otherwise returns false.
> (is_halfway_below): New function. Returns true if real number is
> halfway between two integers, else return false.
> (real_roundeven): New function. Round real number to nearest
> integer, rounding halfway cases towards even.
> * real.h (real_value): Added descriptive comments.
> Added function declaration for roundeven function.
>
> gcc/testsuite/ChangeLog:
>
> 2019-06-12  Tejas Joshi  
>
> * gcc.dg/torture/builtin-round-roundeven.c: New test.
> * gcc.dg/torture/builtin-round-roundevenf128.c: New test.
>
> On Sat, 10 Aug 2019 at 02:15, Joseph Myers  wrote:
>>
>> On Fri, 28 Jun 2019, Tejas Joshi wrote:
>>
>> > +CASE_CFN_ROUNDEVEN:
>> > +CASE_CFN_ROUNDEVEN_FN:
>> > +  if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
>>
>> Checking flag_errno_math here does not make sense.  roundeven never sets
>> errno (at least, TS 18661-1 makes it implementation-defined whether sNaN
>> input counts as a domain error, but I'm not aware of implementations that
>> make it a domain error and set errno, and typically GCC follows glibc in
>> such cases in the absence of known implementations requiring a different
>> approach).
>>
>> The only case where you need to avoid folding is where the argument is a
>> signaling NaN (it's fine to fold for quiet NaNs).  In that case, you need
>> to avoid folding to avoid losing an exception (if the user cares about
>> signaling NaNs, they probably care about exceptions) - so it still doesn't
>> matter whether the library implementation also sets errno or not.
>>
>> (Yes, this means the existing ceil / floor / round checks should be
>> adjusted just to check for signaling NaN, though that's fairly cosmetic as
>> calls to those functions with quiet NaN argument still get folded via
>> match.pd.  trunc ought also check for signaling NaN rather than folding
>> unconditionally, so all those functions should end up with the same
>> conditions for folding.)
>>
>> > @@ -898,6 +907,10 @@ fold_const_call_ss (wide_int *result, combined_fn fn,
>> >return fold_const_conversion (result, real_round, arg,
>> >   precision, format);
>> >
>> > +CASE_CFN_ROUNDEVEN:
>> > +CASE_CFN_ROUNDEVEN_FN:
>> > +  return fold_const_conversion (result, real_roundeven, arg, 
>> > precision, format);
>> > +
>>
>> This is the version of fold_const_call_ss for functions returning a result
>> of integer type; roundeven returns an integer value in a floating-point
>> type.  I don't think this code should be there, and I don't think this
>> version of the function should be called at all for roundeven.
>>
>> --
>> Joseph S. Myers
>> jos...@codesourcery.com
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index e2ba356c0d3..8ceb077b0bf 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -2056,6 +2056,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
>  CASE_MATHFN (REMQUO)
>  CASE_MATHFN_FLOATN (RINT)
>  CASE_MATHFN_FLOATN (ROUND)
> +CASE_MATHFN (ROUNDEVEN)
>  CASE_MATHFN (SCALB)
>  CASE_MATHFN (SCALBLN)
>  CASE_MATHFN (SCALBN)
> diff --git a/gcc/builtins.def b/gcc/builtins.def
> index 6d41bdb4f44..8bb7027aac7 100644
> --- a/gcc/builtins.def
> +++ b/gcc/builtins.def
> @@ -542,12 +542,18 @@ DEF_C99_BUILTIN(BUILT_IN_RINTL, "rintl", 
> BT_FN_LONGDOUBLE_LONGDOUBLE, AT
>  #define RINT_TYPE(F) BT_FN_##F##_##F
>  DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_RINT, "rint", RINT_TYPE, 
> ATTR_CONST_NOTHROW_LEAF_LIST)
>  #undef RINT_TYPE
> +DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVEN, "roundeven", 
> BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENF, "roundevenf", 
> BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENL, 

Re: [PATCH V2 2/8] bpf: new GCC port

2019-08-21 Thread Segher Boessenkool
On Tue, Aug 20, 2019 at 03:42:53PM -0600, Jeff Law wrote:
> > I have been thinking about Segher's suggestion on providing options to
> > lift some of the limitations, for compiler testing.  Unfortunately, many
> > of the restrictions are deeply rooted in the design of the
> > architecture... or the other way around.  Finding sane ways to implement
> > these extensions will be fun :)
> Hell, it's a virtual architecture.  I'd just make up new instructions
> for the missing functionality, make them dependent on an flag.  I think
> the PRU is in a similar position and uses that approach.  PTX might have
> as well.

This approach works well for simulators for physical architectures, too.

> > This may change at some point.  There is much discussion among the
> > kernel hackers in whether it is possible to allow bounded loops in a
> > safe way.  In that case, some of the restrictions may be lifted.
> ACK.  It's an interesting problem.  Would it help if we could annotate
> loops with bound information?  Not sure how to preserve that from gimple
> down to assembly, but it's worth pondering.

You probably should have machine insns that iterate a loop some number of
times given when you first start the loop (and cannot be changed later,
except maybe exiting from the loop).  Like "doloop" in GCC.  Maybe only
allow a constant number of times, if the verifier want to see that?

The only thing the verifier should be concerned with is how long the code
takes to run, or am I missing something?


Segher


Re: [PATCH V2 2/8] bpf: new GCC port

2019-08-21 Thread Segher Boessenkool
On Tue, Aug 20, 2019 at 11:14:35PM +0200, Jose E. Marchesi wrote:
> The kernel verifier doesn't allow backward jumps.
> 
> This may change at some point.  There is much discussion among the
> kernel hackers in whether it is possible to allow bounded loops in a
> safe way.  In that case, some of the restrictions may be lifted.
> 
> For now, only loops that can be peeled/massaged and then fully unrolled
> are supported.

You can also generate code like

x5: call x4
jump x1
x4: call x2
x2: call x1
x1: do things once here
ret

to do fixed number of iteration loops.  Disgusting?  You decide :-)

(Or is something in that not allowed by the verifier?)


Segher


Fix crash with -fdump-ada-spec

2019-08-21 Thread Eric Botcazou
This fixes a crash of the C compiler when -fdump-ada-spec is passed for:

extern void (*signal (int __sig, void (*__handler)(int)))(int);

It appears that the C parser is somewhat confused by this declaration and 
builds a FUNCTION_DECL whose type has got 2 arguments (TYPE_ARG_TYPES list) 
but which itself has got only 1 argument (DECL_ARGUMENTS list) corresponding 
to the anonymous (int) at the end.  The C++ parser doesn't have the issue.

Joseph, is that a known issue of the C parser?

The attached patch changes -fdump-ada-spec to using TYPE_ARG_TYPES for it.


2019-08-21  Eric Botcazou  

c-family/
* c-ada-spec.c (dump_ada_function_declaration): Be prepared for broken
function declarations where arguments are missing.  Rename variables.


2019-08-21  Eric Botcazou  

* c-c++-common/dump-ada-spec-15.c: New test.

-- 
Eric BotcazouIndex: c-family/c-ada-spec.c
===
--- c-family/c-ada-spec.c	(revision 274744)
+++ c-family/c-ada-spec.c	(working copy)
@@ -1589,14 +1589,13 @@ dump_ada_function_declaration (pretty_printer *buf
 			   bool is_method, bool is_constructor,
 			   bool is_destructor, int spc)
 {
-  tree arg;
-  const tree node = TREE_TYPE (func);
+  tree type = TREE_TYPE (func);
+  tree arg = TYPE_ARG_TYPES (type);
+  tree t;
   char buf[17];
-  int num = 0, num_args = 0, have_args = true, have_ellipsis = false;
+  int num, num_args = 0, have_args = true, have_ellipsis = false;
 
   /* Compute number of arguments.  */
-  arg = TYPE_ARG_TYPES (node);
-
   if (arg)
 {
   while (TREE_CHAIN (arg) && arg != error_mark_node)
@@ -1627,25 +1626,29 @@ dump_ada_function_declaration (pretty_printer *buf
   pp_left_paren (buffer);
 }
 
+  /* For a function, see if we have the corresponding arguments.  */
   if (TREE_CODE (func) == FUNCTION_DECL)
-arg = DECL_ARGUMENTS (func);
+{
+  arg = DECL_ARGUMENTS (func);
+  for (t = arg, num = 0; t; t = DECL_CHAIN (t))
+	num++;
+  if (num < num_args)
+	arg = NULL_TREE;
+}
   else
 arg = NULL_TREE;
 
-  if (arg == NULL_TREE)
+  /* Otherwise, only print the types.  */
+  if (!arg)
 {
   have_args = false;
-  arg = TYPE_ARG_TYPES (node);
-
-  if (arg && TREE_CODE (TREE_VALUE (arg)) == VOID_TYPE)
-	arg = NULL_TREE;
+  arg = TYPE_ARG_TYPES (type);
 }
 
   if (is_constructor)
 arg = TREE_CHAIN (arg);
 
-  /* Print the argument names (if available) & types.  */
-
+  /* Print the argument names (if available) and types.  */
   for (num = 1; num <= num_args; num++)
 {
   if (have_args)
@@ -1663,13 +1666,13 @@ dump_ada_function_declaration (pretty_printer *buf
 	  pp_string (buffer, buf);
 	}
 
-	  dump_ada_node (buffer, TREE_TYPE (arg), node, spc, false, true);
+	  dump_ada_node (buffer, TREE_TYPE (arg), type, spc, false, true);
 	}
   else
 	{
 	  sprintf (buf, "arg%d : ", num);
 	  pp_string (buffer, buf);
-	  dump_ada_node (buffer, TREE_VALUE (arg), node, spc, false, true);
+	  dump_ada_node (buffer, TREE_VALUE (arg), type, spc, false, true);
 	}
 
   /* If the type is a pointer to a tagged type, we need to differentiate
@@ -1707,11 +1710,11 @@ dump_ada_function_declaration (pretty_printer *buf
   if (num_args > 0)
 pp_right_paren (buffer);
 
-  if (is_constructor || !VOID_TYPE_P (TREE_TYPE (node)))
+  if (is_constructor || !VOID_TYPE_P (TREE_TYPE (type)))
 {
   pp_string (buffer, " return ");
-  tree type = is_constructor ? DECL_CONTEXT (func) : TREE_TYPE (node);
-  dump_ada_node (buffer, type, type, spc, false, true);
+  tree rtype = is_constructor ? DECL_CONTEXT (func) : TREE_TYPE (type);
+  dump_ada_node (buffer, rtype, rtype, spc, false, true);
 }
 }
 
/* { dg-do compile } */
/* { dg-options "-fdump-ada-spec" } */

extern void (*signal (int __sig, void (*__handler)(int)))(int);

/* { dg-final { cleanup-ada-spec } } */


Re: [SVE] PR88839

2019-08-21 Thread Richard Sandiford
Prathamesh Kulkarni  writes:
> Hi,
> The attached patch is a fix for PR88839 ported from sve-acle-branch.
> OK to commit to trunk ?
>
> Thanks,
> Prathamesh
>
> 2019-08-21  Prathamesh Kulkarni  
>   Richard Sandiford  
>
>   PR target/88839
>   * config/aarch64/aarch64.c (aarch64_evpc_sel): New function.
>   (aarch64_expand_vec_perm_const_1): Call aarch64_evpc_sel.
>
> testsuite/
>   * gcc.target/aarch64/sve/sel_1.c: New test.
>   * gcc.target/aarch64/sve/sel_2.c: Likewise.
>   * gcc.target/aarch64/sve/sel_3.c: Likewise.
>   * gcc.target/aarch64/sve/sel_4.c: Likewise.
>   * gcc.target/aarch64/sve/sel_5.c: Likewise.
>   * gcc.target/aarch64/sve/sel_6.c: Likewise.
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index ec60e972f5f..f8d5270b982 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -16632,6 +16632,50 @@ aarch64_evpc_sve_tbl (struct expand_vec_perm_d *d)
>return true;
>  }
>  
> +/* Try to implement D using SVE SEL instruction.  */
> +
> +static bool
> +aarch64_evpc_sel (struct expand_vec_perm_d *d)
> +{
> +  machine_mode vmode = d->vmode;
> +  int nunits = GET_MODE_UNIT_SIZE (vmode);

Sorry for not noticing last time, but this should be "unit_size"
rather than "nunit".

OK with that change, thanks.

Richard


[COMMITTED] [testsuite][arm] Add missing quotes to expected warning messages.

2019-08-21 Thread Christophe Lyon
Hi,

The arm/cmse/cmse-9.c test was missing quotes in the expected warning messages.
Fix as obvious with r274793.

Christophe

2019-08-21  Christophe Lyon  

   * gcc.target/arm/cmse/cmse-9.c: Add quotes to expected
   warning messages.

Index: gcc/testsuite/gcc.target/arm/cmse/cmse-9.c
===
--- gcc/testsuite/gcc.target/arm/cmse/cmse-9.c  (revision 274792)
+++ gcc/testsuite/gcc.target/arm/cmse/cmse-9.c  (revision 274793)
@@ -2,12 +2,12 @@
 /* { dg-skip-if "Testing exclusion of -mcmse" { arm-*-* } { "-mcmse"
} { "" } }  */


-void __attribute__ ((cmse_nonsecure_call)) (*bar) (int); /* {
dg-warning "attribute ignored without -mcmse option" } */
-typedef void __attribute__ ((cmse_nonsecure_call)) baz (int); /* {
dg-warning "attribute ignored without -mcmse option" } */
+void __attribute__ ((cmse_nonsecure_call)) (*bar) (int); /* {
dg-warning "attribute ignored without '-mcmse' option" } */
+typedef void __attribute__ ((cmse_nonsecure_call)) baz (int); /* {
dg-warning "attribute ignored without '-mcmse' option" } */

 int __attribute__ ((cmse_nonsecure_entry))
 foo (int a, baz b)
-{ /* { dg-warning "attribute ignored without -mcmse option" } */
+{ /* { dg-warning "attribute ignored without '-mcmse' option" } */
   bar (a);
   b (a);
   return a + 1;


[PATCH][AArch64] Add support for missing CPUs

2019-08-21 Thread Dennis Zhang
Hi all,

This patch adds '-mcpu' options for following CPUs:
Cortex-A77, Cortex-A76AE, Cortex-A65, Cortex-A65AE, and Cortex-A34.

Related specifications are as following:
https://developer.arm.com/ip-products/processors/cortex-a

Bootstraped/regtested for aarch64-none-linux-gnu.

Please help to check if it's ready.

Many thanks!
Dennis

gcc/ChangeLog:

2019-08-21  Dennis Zhang  

* config/aarch64/aarch64-cores.def (AARCH64_CORE): New entries
for Cortex-A77, Cortex-A76AE, Cortex-A65, Cortex-A65AE, and
Cortex-A34.
* config/aarch64/aarch64-tune.md: Regenerated.
* doc/invoke.texi: Document the new processors.
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 82d91d62519..c0be109009f 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -46,6 +46,7 @@
 /* ARMv8-A Architecture Processors.  */
 
 /* ARM ('A') cores. */
+AARCH64_CORE("cortex-a34",  cortexa34, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa35, 0x41, 0xd02, -1)
 AARCH64_CORE("cortex-a35",  cortexa35, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa35, 0x41, 0xd04, -1)
 AARCH64_CORE("cortex-a53",  cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53, 0x41, 0xd03, -1)
 AARCH64_CORE("cortex-a57",  cortexa57, cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, 0xd07, -1)
@@ -100,6 +101,10 @@ AARCH64_CORE("thunderx2t99",  thunderx2t99,  thunderx2t99, 8_1A,  AARCH64_FL_FOR
 AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa53, 0x41, 0xd05, -1)
 AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa73, 0x41, 0xd0a, -1)
 AARCH64_CORE("cortex-a76",  cortexa76, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa72, 0x41, 0xd0b, -1)
+AARCH64_CORE("cortex-a76ae",  cortexa76ae, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa72, 0x41, 0xd0e, -1)
+AARCH64_CORE("cortex-a77",  cortexa77, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa72, 0x41, 0xd0d, -1)
+AARCH64_CORE("cortex-a65",  cortexa65, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd06, -1)
+AARCH64_CORE("cortex-a65ae",  cortexa65ae, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd43, -1)
 AARCH64_CORE("ares",  ares, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, neoversen1, 0x41, 0xd0c, -1)
 AARCH64_CORE("neoverse-n1",  neoversen1, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, neoversen1, 0x41, 0xd0c, -1)
 AARCH64_CORE("neoverse-e1",  neoversee1, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa53, 0x41, 0xd4a, -1)
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index 2b1ec85ae31..a6a14b7fc77 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,neoversen1,neoversee1,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
+	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa65,cortexa65ae,ares,neoversen1,neoversee1,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 29585cf15aa..3aa59b9a125 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15809,7 +15809,9 @@ Specify the name of the target processor for which GCC should tune the
 performance of the code.  Permissible values for this option are:
 @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 

[Ada] Avoid spurious error in GNATprove mode on non-null access types

2019-08-21 Thread Pierre-Marie de Rodat
GNATprove directly handles non-null access checks, and requires that the
frontend does not insert explicit checks in the form of conditional
exceptions being raised. Now fixed.

There is no impact on compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Yannick Moy  

gcc/ada/

* checks.adb (Install_Null_Excluding_Check): Do not install
check in GNATprove mode.--- gcc/ada/checks.adb
+++ gcc/ada/checks.adb
@@ -7964,6 +7964,12 @@ package body Checks is
  return;
   end if;
 
+  --  In GNATprove mode, we do not apply the check
+
+  if GNATprove_Mode then
+ return;
+  end if;
+
   --  Otherwise install access check
 
   Insert_Action (N,



[Ada] Do not rewrite argument of pragma Entry_Queue_Length in analysis

2019-08-21 Thread Pierre-Marie de Rodat
It was unusual for the analysis phase of compiler to rewrite static
expression of pragma Entry_Queue_Length with its value. This typically
happens when expanding the AST and only if needed for a given backend.
In particular, GNATprove doesn't need such an expansion and actually
needs the unrewritten AST to detect references to constants declared
with SPARK_Mode => Off within a code with SPARK_Mode => On.

This change has no impact on compilation, so no frontend test is
provided.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Piotr Trojanek  

gcc/ada/

* sem_prag.adb (Max_Entry_Queue_Length): Do not substitute
expression of the pragma argument with its value during
analysis.
* sem_util.adb (Get_Max_Queue_Length): Compute value of the
pragma argument when needed.--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -19599,15 +19599,6 @@ package body Sem_Prag is
 
 end if;
 
---  Manually substitute the expression value of the pragma argument
---  if it's not an integer literal because this is not taken care
---  of automatically elsewhere.
-
-if Nkind (Arg) /= N_Integer_Literal then
-   Rewrite (Arg, Make_Integer_Literal (Sloc (Arg), Val));
-   Set_Etype (Arg, Etype (Original_Node (Arg)));
-end if;
-
 Record_Rep_Item (Entry_Id, N);
  end Max_Entry_Queue_Length;
 

--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -9763,7 +9763,8 @@ package body Sem_Util is
  return Uint_0;
   end if;
 
-  Max := Intval (Expression (First (Pragma_Argument_Associations (Prag;
+  Max := Expr_Value
+(Expression (First (Pragma_Argument_Associations (Prag;
 
   --  Since -1 and 0 are equivalent, return 0 for instances of -1 for
   --  uniformity.



[Ada] More complete information level for -gnatR4 output

2019-08-21 Thread Pierre-Marie de Rodat
This instructs -gnatR4 to also list the Etype of components in
user-declared record types if it is compiler-generated, for example in:

package P3 is

type idx is range 1 .. 100;

type Arr is array (Idx range <>) of Character;

type Rec is record
   C : Arr (1 .. 5);
end record;

end P3;

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Eric Botcazou  

gcc/ada/

* repinfo.adb (List_Array_Info): In -gnatR4 mode, set the
relevant flag on the component type here instead of...
(List_Object_Info): Likewise for the object type.
(List_Entities): ...here.  In -gnatR4 mode, recurse into
entities local to a record type.
(List_Component_Layout): In -gnatR4 mode, mark the type as
relevant.--- gcc/ada/repinfo.adb
+++ gcc/ada/repinfo.adb
@@ -357,6 +357,14 @@ package body Repinfo is
  Write_Eol;
  Write_Line ("}");
   end if;
+
+  --  The component type is relevant for an array
+
+  if List_Representation_Info = 4
+and then Is_Itype (Component_Type (Base_Type (Ent)))
+  then
+ Relevant_Entities.Set (Component_Type (Base_Type (Ent)), True);
+  end if;
end List_Array_Info;
 
---
@@ -539,20 +547,17 @@ package body Repinfo is
  List_Record_Info (E, Bytes_Big_Endian);
   end if;
 
+  --  Recurse into entities local to a record type
+
+  if List_Representation_Info = 4 then
+ List_Entities (E, Bytes_Big_Endian, False);
+  end if;
+
elsif Is_Array_Type (E) then
   if List_Representation_Info >= 1 then
  List_Array_Info (E, Bytes_Big_Endian);
   end if;
 
-  --  The component type is relevant for an array
-
-  if List_Representation_Info = 4
-and then Is_Itype (Component_Type (Base_Type (E)))
-  then
- Relevant_Entities.Set
-   (Component_Type (Base_Type (E)), True);
-  end if;
-
elsif Is_Type (E) then
   if List_Representation_Info >= 2 then
  List_Type_Info (E);
@@ -564,13 +569,6 @@ package body Repinfo is
   E_Loop_Parameter,
   E_Variable)
then
-  --  The type is relevant for an object
-
-  if List_Representation_Info = 4 and then Is_Itype (Etype (E))
-  then
- Relevant_Entities.Set (Etype (E), True);
-  end if;
-
   if List_Representation_Info >= 2 then
  List_Object_Info (E);
   end if;
@@ -975,6 +973,12 @@ package body Repinfo is
 
  List_Linker_Section (Ent);
   end if;
+
+  --  The type is relevant for an object
+
+  if List_Representation_Info = 4 and then Is_Itype (Etype (Ent)) then
+ Relevant_Entities.Set (Etype (Ent), True);
+  end if;
end List_Object_Info;
 
--
@@ -1283,6 +1287,12 @@ package body Repinfo is
  else
 Write_Line (";");
  end if;
+
+ --  The type is relevant for a component
+
+ if List_Representation_Info = 4 and then Is_Itype (Etype (Ent)) then
+Relevant_Entities.Set (Etype (Ent), True);
+ end if;
   end List_Component_Layout;
 
   



[Ada] Fix assertion failure on derived private protected type

2019-08-21 Thread Pierre-Marie de Rodat
This fixes an assertion failure on the instantiation of a generic
package on a type derived from the private view of a protected type,
ultimately caused by Finalize_Address returning Empty for the subtype
built for the generic actual type of the instantiation.

Finalize_Address has a special processing for untagged derivations of
private views, but it would no longer trigger for the subtype because
this subtype is now represented as a subtype of an implicit derived base
type instead of as the derived type of an implicit subtype previously.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Eric Botcazou  

gcc/ada/

* exp_util.adb (Finalize_Address): Deal consistently with
subtypes of private protected types.

gcc/testsuite/

* gnat.dg/prot9.adb, gnat.dg/prot9_gen.ads,
gnat.dg/prot9_pkg1.ads, gnat.dg/prot9_pkg2.ads: New testcase.--- gcc/ada/exp_util.adb
+++ gcc/ada/exp_util.adb
@@ -5347,6 +5347,7 @@ package body Exp_Util is
--
 
function Finalize_Address (Typ : Entity_Id) return Entity_Id is
+  Btyp : constant Entity_Id := Base_Type (Typ);
   Utyp : Entity_Id := Typ;
 
begin
@@ -5386,12 +5387,12 @@ package body Exp_Util is
   --  records do not automatically inherit operations, but maybe they
   --  should???)
 
-  if Is_Untagged_Derivation (Typ) then
- if Is_Protected_Type (Typ) then
-Utyp := Corresponding_Record_Type (Root_Type (Base_Type (Typ)));
+  if Is_Untagged_Derivation (Btyp) then
+ if Is_Protected_Type (Btyp) then
+Utyp := Corresponding_Record_Type (Root_Type (Btyp));
 
  else
-Utyp := Underlying_Type (Root_Type (Base_Type (Typ)));
+Utyp := Underlying_Type (Root_Type (Btyp));
 
 if Is_Protected_Type (Utyp) then
Utyp := Corresponding_Record_Type (Utyp);

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/prot9.adb
@@ -0,0 +1,10 @@
+--  { dg-do compile }
+
+with Prot9_Gen;
+with Prot9_Pkg1;
+
+procedure Prot9 is
+   package Dummy is new Prot9_Gen (Prot9_Pkg1.Prot_Type);
+begin
+   null;
+end Prot9;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/prot9_gen.ads
@@ -0,0 +1,9 @@
+generic
+  type Field_Type is limited private;
+package Prot9_Gen is
+
+  type Field_Pointer is access all Field_Type;
+
+  Pointer : Field_Pointer := new Field_Type;
+
+end Prot9_Gen;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/prot9_pkg1.ads
@@ -0,0 +1,11 @@
+with Prot9_Pkg2;
+
+package Prot9_Pkg1 is
+
+   type Prot_Type is limited private;
+
+private
+
+   type Prot_Type is new Prot9_Pkg2.Prot_Type;
+
+end Prot9_Pkg1;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/prot9_pkg2.ads
@@ -0,0 +1,16 @@
+with Ada.Containers.Doubly_Linked_Lists;
+
+package Prot9_Pkg2 is
+
+   type Prot_type is limited private;
+
+private
+
+   package My_Lists is new Ada.Containers.Doubly_Linked_Lists (Integer);
+
+   protected type Prot_type is
+   private
+ L : My_Lists.List;
+   end Prot_type;
+
+end Prot9_Pkg2;



[Ada] Improve detection of end of the process by GNAT.Expect

2019-08-21 Thread Pierre-Marie de Rodat
'read' system call may be interrupted by signal with 'errno' is set to
EINTER. In this case, re-try a few times.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Vadim Godunko  

gcc/ada/

* libgnat/g-expect.adb (Expect_Internal): Attempt to read
several times when 'read' returns non-positive.--- gcc/ada/libgnat/g-expect.adb
+++ gcc/ada/libgnat/g-expect.adb
@@ -692,15 +692,21 @@ package body GNAT.Expect is
Buffer_Size := 4096;
 end if;
 
-N := Read (Descriptors (D).Output_Fd, Buffer'Address,
-   Buffer_Size);
+--  Read may be interrupted on Linux by a signal and
+--  need to be repeated. We don't want to check for
+--  errno = EINTER, so just attempt to read a few
+--  times.
+
+for J in 1 .. 3 loop
+   N := Read (Descriptors (D).Output_Fd,
+  Buffer'Address, Buffer_Size);
+
+   exit when N > 0;
+end loop;
 
 --  Error or End of file
 
 if N <= 0 then
-   --  ??? Note that ddd tries again up to three times
-   --  in that case. See LiterateA.C:174
-
Close (Descriptors (D).Input_Fd);
Descriptors (D).Input_Fd := Invalid_FD;
Result := Expect_Process_Died;



[Ada] Undefined master in task with limited class-wide aliased entry formal

2019-08-21 Thread Pierre-Marie de Rodat
In the case of a task declaring an entry with an aliased formal
parameter of a limited class-wide type, the front end was creating a
master object (_master) for the access type generated for such an entry
formal inside the task specification, even though such access types
don't need an associated master.  The master object wasn't being copied
into the procedure expanded for the task body, but a renaming for the
master appeared in the statements of the task body, and the LLVM back
end rejects this since the master object doesn't appear in the expanded
task procedure (for some reason, gigi doesn't complain). This is fixed
by suppressing the creation of the master object in the case where the
access-to-limited-class-wide access type is the type of a component in
an entry's parameter block.  This is similar to the suppression done for
the master object in other cases, where the access type designates a
type explicitly containing tasks (though the suppression involves
testing Comes_From_Source in that case).

No simple test (and this only affects the LLVM-based compiler).

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Gary Dismukes  

gcc/ada/

* exp_ch3.adb (Build_Master): Suppress call to
Build_Class_Wide_Master in the case where the
access-to-limited-class-wide type was created for a component in
an entry's formal parameter
block (Is_Parameter_Block_Component_Type), to prevent a master
from being created for such access types generated by the front
end in a task spec for entry formals in a parameter block.  Add
a ??? about whether this suppression should be done more
generally (such as by using Comes_From_Source).--- gcc/ada/exp_ch3.adb
+++ gcc/ada/exp_ch3.adb
@@ -5518,7 +5518,14 @@ package body Exp_Ch3 is
  --  Note: This code covers access-to-limited-interfaces because they
  --can be used to reference tasks implementing them.
 
- elsif Is_Limited_Class_Wide_Type (Desig_Typ)
+ --  Suppress the master creation for access types created for entry
+ --  formal parameters (parameter block component types). Seems like
+ --  suppression should be more general for compiler-generated types,
+ --  but testing Comes_From_Source, like the code above does, may be
+ --  too general in this case (affects some test output)???
+
+ elsif not Is_Param_Block_Component_Type (Ptr_Typ)
+   and then Is_Limited_Class_Wide_Type (Desig_Typ)
and then Tasking_Allowed
  then
 Build_Class_Wide_Master (Ptr_Typ);



[Ada] Fix type mismatch in extended return statement expansion

2019-08-21 Thread Pierre-Marie de Rodat
This fixes a (sub)type mismatch in the expansion of an extended return
statement generated for a built-in-place function that doesn't need a
BIP_Alloc_Form parameter but returns unconstrained.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Eric Botcazou  

gcc/ada/

* exp_ch6.adb (Expand_N_Extended_Return_Statement): In the case
of a built-in-place function that doesn't need a BIP_Alloc_Form
parameter but returns unconstrained, build the return
consistently using the function's result subtype.  Remove bypass
added in previous change.--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -5199,7 +5199,7 @@ package body Exp_Ch6 is
end if;
 
--  When the function's subtype is unconstrained, a run-time
-   --  test is needed to determine the form of allocation to use
+   --  test may be needed to decide the form of allocation to use
--  for the return object. The function has an implicit formal
--  parameter indicating this. If the BIP_Alloc_Form formal has
--  the value one, then the caller has passed access to an
@@ -5235,13 +5235,6 @@ package body Exp_Ch6 is
  SS_Allocator   : Node_Id;
 
   begin
- --  Reuse the itype created for the function's implicit
- --  access formal. This avoids the need to create a new
- --  access type here, plus it allows assigning the access
- --  formal directly without applying a conversion.
-
- --Ref_Type := Etype (Object_Access);
-
  --  Create an access type designating the function's
  --  result subtype.
 
@@ -5572,6 +5565,64 @@ package body Exp_Ch6 is
 
  Obj_Acc_Formal := Alloc_Obj_Id;
   end;
+
+   --  When the function's subtype is unconstrained and a run-time
+   --  test is not needed, we nevertheless need to build the return
+   --  using the function's result subtype.
+
+   elsif not Is_Constrained (Underlying_Type (Etype (Func_Id)))
+   then
+  declare
+ Alloc_Obj_Id   : Entity_Id;
+ Alloc_Obj_Decl : Node_Id;
+ Ptr_Type_Decl  : Node_Id;
+ Ref_Type   : Entity_Id;
+
+  begin
+ --  Create an access type designating the function's
+ --  result subtype.
+
+ Ref_Type := Make_Temporary (Loc, 'A');
+
+ Ptr_Type_Decl :=
+   Make_Full_Type_Declaration (Loc,
+ Defining_Identifier => Ref_Type,
+ Type_Definition =>
+   Make_Access_To_Object_Definition (Loc,
+ All_Present=> True,
+ Subtype_Indication =>
+   New_Occurrence_Of (Ret_Obj_Typ, Loc)));
+
+ Insert_Before (Ret_Obj_Decl, Ptr_Type_Decl);
+
+ --  Create an access object initialized to the conversion
+ --  of the implicit access value passed in by the caller.
+
+ Alloc_Obj_Id := Make_Temporary (Loc, 'R');
+ Set_Etype (Alloc_Obj_Id, Ref_Type);
+
+ --  See the ??? comment a few lines above about the use of
+ --  an unchecked conversion here.
+
+ Alloc_Obj_Decl :=
+   Make_Object_Declaration (Loc,
+ Defining_Identifier => Alloc_Obj_Id,
+ Object_Definition   =>
+   New_Occurrence_Of (Ref_Type, Loc),
+ Expression =>
+   Make_Unchecked_Type_Conversion (Loc,
+ Subtype_Mark =>
+   New_Occurrence_Of (Ref_Type, Loc),
+ Expression   =>
+   New_Occurrence_Of (Obj_Acc_Formal, Loc)));
+
+ Insert_Before (Ret_Obj_Decl, Alloc_Obj_Decl);
+
+ --  Remember the local access object for use in the
+ --  dereference of the renaming created below.
+
+ Obj_Acc_Formal := Alloc_Obj_Id;
+  end;
end if;
 
--  Replace the return object declaration with a renaming of a
@@ -5615,23 +5666,7 @@ package body Exp_Ch6 is
   Set_Comes_From_Extended_Return_Statement (Return_Stmt);
 
   Rewrite (N, Result);
-
-  declare
- T : constant Entity_Id := Etype (Ret_Obj_Id);
-  begin
- Analyze (N, Suppress => All_Checks);
-
- --  In some cases, analysis of N can set the Etype 

[Ada] More precise propagation of Size attribute in generic instances

2019-08-21 Thread Pierre-Marie de Rodat
GNATprove analyzer for SPARK code depends on the frontend to accurately
propagate the known value of Size attribute. This was not done for
formal type parameters in generic instantiations. Now fixed.

There is no impact on compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Yannick Moy  

gcc/ada/

* sem_ch3.adb (Analyze_Subtype_Declaration): Inherit RM_Size
field for formal type parameters in generic instantiations.--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -,6 +,14 @@ package body Sem_Ch3 is
 =>
Set_Ekind(Id, E_Record_Subtype);
 
+   --  Subtype declarations introduced for formal type parameters
+   --  in generic instantiations should inherit the Size value of
+   --  the type they rename.
+
+   if Present (Generic_Parent_Type (N)) then
+  Set_RM_Size   (Id, RM_Size (T));
+   end if;
+
if Ekind (T) = E_Record_Subtype
  and then Present (Cloned_Subtype (T))
then



[Ada] Ignore subprogram address in ownership checking

2019-08-21 Thread Pierre-Marie de Rodat
Ownership checking done as in GNATprove should ignore address of
subprograms, as it applies only on objects. Now fixed.

There is no impact on compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Yannick Moy  

gcc/ada/

* sem_spark.adb (Process_Path): Do nothing on address of
subprogram.--- gcc/ada/sem_spark.adb
+++ gcc/ada/sem_spark.adb
@@ -5110,6 +5110,7 @@ package body Sem_SPARK is
   --  in an object.
 
   if not Present (Root)
+or else not Is_Object (Root)
 or else not Is_Deep (Etype (Root))
   then
  return;



[Ada] Max_Entry_Queue_Length aspect for protected entries

2019-08-21 Thread Pierre-Marie de Rodat
Allow values of negative one to be accepted as a valid parameter as a
special case.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Justin Squirek  

gcc/ada/

* doc/gnat_rm/implementation_defined_pragmas.rst: Modify
documentation to reflect expected behavior.
* gnat_rm.texi: Regenerate.
* sem_prag.adb (Analyze_Pragma): Modify handling of pragma
Max_Entry_Queue_Length to not reject integer values of negative
one.
* sem_util.adb (Get_Max_Queue_Length): Add processing for values
of negative one to fit within the current scheme.--- gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
+++ gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
@@ -3888,8 +3888,10 @@ Syntax::
 
 This pragma is used to specify the maximum callers per entry queue for
 individual protected entries and entry families. It accepts a single
-positive integer as a parameter and must appear after the declaration
-of an entry.
+integer (-1 or more) as a parameter and must appear after the declaration of an
+entry.
+
+A value of -1 represents no additional restriction on queue length.
 
 Pragma No_Body
 ==

--- gcc/ada/gnat_rm.texi
+++ gcc/ada/gnat_rm.texi
@@ -1764,9 +1764,9 @@ Syntax:
 pragma Aggregate_Individually_Assign;
 @end example
 
-Where possible GNAT will store the binary representation of a record aggregate
+Where possible, GNAT will store the binary representation of a record aggregate
 in memory for space and performance reasons. This configuration pragma changes
-this behaviour so that record aggregates are instead always converted into
+this behavior so that record aggregates are instead always converted into
 individual assignment statements.
 
 @node Pragma Allow_Integer_Address,Pragma Annotate,Pragma Aggregate_Individually_Assign,Implementation Defined Pragmas
@@ -5394,8 +5394,10 @@ pragma Max_Entry_Queue (static_integer_EXPRESSION);
 
 This pragma is used to specify the maximum callers per entry queue for
 individual protected entries and entry families. It accepts a single
-positive integer as a parameter and must appear after the declaration
-of an entry.
+integer (-1 or more) as a parameter and must appear after the declaration of an
+entry.
+
+A value of -1 represents no additional restriction on queue length.
 
 @node Pragma No_Body,Pragma No_Caching,Pragma Max_Queue_Length,Implementation Defined Pragmas
 @anchor{gnat_rm/implementation_defined_pragmas pragma-no-body}@anchor{a1}

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -19538,7 +19538,7 @@ package body Sem_Prag is
 | Pragma_Max_Entry_Queue_Depth
 | Pragma_Max_Queue_Length
  =>
- Max_Queue_Length : declare
+ Max_Entry_Queue_Length : declare
 Arg: Node_Id;
 Entry_Decl : Node_Id;
 Entry_Id   : Entity_Id;
@@ -19589,9 +19589,9 @@ package body Sem_Prag is
 
 Val := Expr_Value (Arg);
 
-if Val <= 0 then
+if Val < -1 then
Error_Pragma_Arg
- ("argument for pragma% must be positive", Arg1);
+ ("argument for pragma% cannot be less than -1", Arg1);
 
 elsif not UI_Is_In_Int_Range (Val) then
Error_Pragma_Arg
@@ -19609,7 +19609,7 @@ package body Sem_Prag is
 end if;
 
 Record_Rep_Item (Entry_Id, N);
- end Max_Queue_Length;
+ end Max_Entry_Queue_Length;
 
  -
  -- Memory_Size --

--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -9752,16 +9752,27 @@ package body Sem_Util is
function Get_Max_Queue_Length (Id : Entity_Id) return Uint is
   pragma Assert (Is_Entry (Id));
   Prag : constant Entity_Id := Get_Pragma (Id, Pragma_Max_Queue_Length);
+  Max  : Uint;
 
begin
-  --  A value of 0 represents no maximum specified, and entries and entry
-  --  families with no Max_Queue_Length aspect or pragma default to it.
+  --  A value of 0 or -1 represents no maximum specified, and entries and
+  --  entry families with no Max_Queue_Length aspect or pragma default to
+  --  it.
 
   if not Present (Prag) then
  return Uint_0;
   end if;
 
-  return Intval (Expression (First (Pragma_Argument_Associations (Prag;
+  Max := Intval (Expression (First (Pragma_Argument_Associations (Prag;
+
+  --  Since -1 and 0 are equivalent, return 0 for instances of -1 for
+  --  uniformity.
+
+  if Max = -1 then
+ return Uint_0;
+  end if;
+
+  return Max;
end Get_Max_Queue_Length;
 




[Ada] Missing attribute update in new_copy_tree

2019-08-21 Thread Pierre-Marie de Rodat
The compiler crashes processing an internally generated cloned tree that
has a subprogram call with a named actual parameter.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-21  Javier Miranda  

gcc/ada/

* sem_util.adb (Update_Named_Associations): Update
First_Named_Actual when the subprogram call has a single named
actual.

gcc/testsuite/

* gnat.dg/implicit_param.adb, gnat.dg/implicit_param_pkg.ads:
New testcase.--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -20623,6 +20623,10 @@ package body Sem_Util is
  Old_Next : Node_Id;
 
   begin
+ if No (First_Named_Actual (Old_Call)) then
+return;
+ end if;
+
  --  Recreate the First/Next_Named_Actual chain of a call by traversing
  --  the chains of both the old and new calls in parallel.
 
@@ -20630,15 +20634,16 @@ package body Sem_Util is
  Old_Act := First (Parameter_Associations (Old_Call));
  while Present (Old_Act) loop
 if Nkind (Old_Act) = N_Parameter_Association
-  and then Present (Next_Named_Actual (Old_Act))
+  and then Explicit_Actual_Parameter (Old_Act)
+ = First_Named_Actual (Old_Call)
 then
-   if First_Named_Actual (Old_Call) =
-Explicit_Actual_Parameter (Old_Act)
-   then
-  Set_First_Named_Actual (New_Call,
-Explicit_Actual_Parameter (New_Act));
-   end if;
+   Set_First_Named_Actual (New_Call,
+ Explicit_Actual_Parameter (New_Act));
+end if;
 
+if Nkind (Old_Act) = N_Parameter_Association
+  and then Present (Next_Named_Actual (Old_Act))
+then
--  Scan the actual parameter list to find the next suitable
--  named actual. Note that the list may be out of order.
 

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/implicit_param.adb
@@ -0,0 +1,19 @@
+--  { dg-do compile }
+
+with Implicit_Param_Pkg;
+
+procedure Implicit_Param is
+subtype Tiny is Integer range 1 .. 5;
+V : Tiny := 4;
+
+function Func62 return Implicit_Param_Pkg.Lim_Rec is
+begin
+   return
+ (case V is
+   when 1 .. 3 => Implicit_Param_Pkg.Func_Lim_Rec,
+   when 4 .. 5 => raise Program_Error);
+end Func62;
+
+begin
+null;
+end Implicit_Param;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/implicit_param_pkg.ads
@@ -0,0 +1,8 @@
+package Implicit_Param_Pkg is
+type Lim_Rec is limited record
+A : Integer;
+B : Boolean;
+end record;
+
+function Func_Lim_Rec return Lim_Rec;
+end Implicit_Param_Pkg;



[PATCH, PR d/90444] Committed fix for ICE in d_init_builtins, at d/d-builtins.cc:1121

2019-08-21 Thread Iain Buclaw
Hi,

This patch fixes an ICE in the D front-end that occurred in some
specific targets where va_list is an anonymous RECORD_TYPE, such as
mips -mabi=eabi.

These are now built as its equivalent D type, and exposed in the
gcc.builtins module.

Bootstrapped and regression tested the D testsuite on
x86_64-linux-gnu, with further checking done on cross-compiler targets
mips64vr-elf, msp430-elf, pdp11-aout, and visium-elf to verify the ICE
no longer persists.

Committed to trunk as r274765.

--
Iain
---
gcc/d/ChangeLog:

PR d/90444
* d-builtins.cc (build_frontend_type): Build anonymous RECORD_TYPE
nodes as well, push all fields to the struct members.
(d_build_builtins_module): Push anonymous va_list structs to the
builtins module, naming them __builtin_va_list.
(d_init_builtins): Use sorry instead of gcc_unreachable if va_list did
not succeed in being represented as a D type.
---
diff --git a/gcc/d/d-builtins.cc b/gcc/d/d-builtins.cc
index 3ebee721a25..5619ebb1a09 100644
--- a/gcc/d/d-builtins.cc
+++ b/gcc/d/d-builtins.cc
@@ -213,38 +213,54 @@ build_frontend_type (tree type)
   break;
 
 case RECORD_TYPE:
-  if (TYPE_NAME (type))
+{
+  Identifier *ident = TYPE_IDENTIFIER (type) ?
+	Identifier::idPool (IDENTIFIER_POINTER (TYPE_IDENTIFIER (type))) : NULL;
+
+  /* Neither the `object' and `gcc.builtins' modules will not exist when
+	 this is called.  Use a stub 'object' module parent in the meantime.
+	 If `gcc.builtins' is later imported, the parent will be overridden
+	 with the correct module symbol.  */
+  static Identifier *object = Identifier::idPool ("object");
+  static Module *stubmod = Module::create ("object.d", object, 0, 0);
+
+  StructDeclaration *sdecl = StructDeclaration::create (Loc (), ident,
+			false);
+  sdecl->parent = stubmod;
+  sdecl->structsize = int_size_in_bytes (type);
+  sdecl->alignsize = TYPE_ALIGN_UNIT (type);
+  sdecl->alignment = STRUCTALIGN_DEFAULT;
+  sdecl->sizeok = SIZEOKdone;
+  sdecl->type = (TypeStruct::create (sdecl))->addMod (mod);
+  sdecl->type->ctype = type;
+  sdecl->type->merge2 ();
+
+  sdecl->members = new Dsymbols;
+
+  for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
 	{
-	  tree structname = DECL_NAME (TYPE_NAME (type));
-	  Identifier *ident
-	= Identifier::idPool (IDENTIFIER_POINTER (structname));
-
-	  /* Neither the `object' and `gcc.builtins' modules will not exist when
-	 this is called.  Use a stub 'object' module parent in the meantime.
-	 If `gcc.builtins' is later imported, the parent will be overridden
-	 with the correct module symbol.  */
-	  static Identifier *object = Identifier::idPool ("object");
-	  static Module *stubmod = Module::create ("object.d", object, 0, 0);
-
-	  StructDeclaration *sdecl = StructDeclaration::create (Loc (), ident,
-false);
-	  sdecl->parent = stubmod;
-	  sdecl->structsize = int_size_in_bytes (type);
-	  sdecl->alignsize = TYPE_ALIGN_UNIT (type);
-	  sdecl->alignment = STRUCTALIGN_DEFAULT;
-	  sdecl->sizeok = SIZEOKdone;
-	  sdecl->type = (TypeStruct::create (sdecl))->addMod (mod);
-	  sdecl->type->ctype = type;
-	  sdecl->type->merge2 ();
-
-	  /* Does not seem necessary to convert fields, but the members field
-	 must be non-null for the above size setting to stick.  */
-	  sdecl->members = new Dsymbols;
-	  dtype = sdecl->type;
-	  builtin_converted_decls.safe_push (builtin_data (dtype, type, sdecl));
-	  return dtype;
+	  Type *ftype = build_frontend_type (TREE_TYPE (field));
+	  if (!ftype)
+	{
+	  delete sdecl->members;
+	  return NULL;
+	}
+
+	  Identifier *fident
+	= Identifier::idPool (IDENTIFIER_POINTER (DECL_NAME (field)));
+	  VarDeclaration *vd = VarDeclaration::create (Loc (), ftype, fident,
+		   NULL);
+	  vd->offset = tree_to_uhwi (DECL_FIELD_OFFSET (field));
+	  vd->semanticRun = PASSsemanticdone;
+	  vd->csym = field;
+	  sdecl->members->push (vd);
+	  sdecl->fields.push (vd);
 	}
-  break;
+
+  dtype = sdecl->type;
+  builtin_converted_decls.safe_push (builtin_data (dtype, type, sdecl));
+  return dtype;
+}
 
 case FUNCTION_TYPE:
   dtype = build_frontend_type (TREE_TYPE (type));
@@ -561,7 +577,7 @@ d_build_builtins_module (Module *m)
   /* Currently, there is no need to run semantic, but we do want to output
 	 initializers, typeinfo, and others on demand.  */
   Dsymbol *dsym = builtin_converted_decls[i].dsym;
-  if (dsym != NULL)
+  if (dsym != NULL && !dsym->isAnonymous ())
 	{
 	  dsym->parent = m;
 	  members->push (dsym);
@@ -569,7 +585,18 @@ d_build_builtins_module (Module *m)
 }
 
   /* va_list should already be built, so no need to convert to D type again.  */
-  members->push (build_alias_declaration ("__builtin_va_list", Type::tvalist));
+  StructDeclaration *sd = (Type::tvalist->ty == Tstruct)
+? ((TypeStruct *) 

[PATCH, libphobos] Committed merge with upstream phobos 66ae77ac3

2019-08-21 Thread Iain Buclaw
Hi,

This patch merges the libphobos library with upstream phobos 66ae77ac3.

Finishes off build support for phobos on musl targets.

Bootstrapped and regression tested on x86_64-linux-gnu and x86_64-linux-musl.

Committed to trunk as r274770.

--
Iain
---
diff --git a/libphobos/src/MERGE b/libphobos/src/MERGE
index cfac5c66b4e..6a1e00874bb 100644
--- a/libphobos/src/MERGE
+++ b/libphobos/src/MERGE
@@ -1,4 +1,4 @@
-3dc363783ea7b1e82336983486a14f3f44bbeadd
+66ae77ac3f97a007a12738e4bc02b3bbfef99bba
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/phobos repository.
diff --git a/libphobos/src/std/experimental/allocator/building_blocks/region.d b/libphobos/src/std/experimental/allocator/building_blocks/region.d
index 43dfdb788e5..835d0937cec 100644
--- a/libphobos/src/std/experimental/allocator/building_blocks/region.d
+++ b/libphobos/src/std/experimental/allocator/building_blocks/region.d
@@ -580,6 +580,12 @@ struct InSituRegion(size_t size, size_t minAlign = platformAlignment)
 assert(a.length == 2001);
 }
 
+version(CRuntime_Musl)
+{
+// sbrk and brk are disabled in Musl:
+// https://git.musl-libc.org/cgit/musl/commit/?id=7a995fe706e519a4f55399776ef0df9596101f93
+// https://git.musl-libc.org/cgit/musl/commit/?id=863d628d93ea341b6a32661a1654320ce69f6a07
+} else:
 private extern(C) void* sbrk(long);
 private extern(C) int brk(shared void*);
 
diff --git a/libphobos/src/std/socket.d b/libphobos/src/std/socket.d
index 2853dbde0c6..a4ba39c8336 100644
--- a/libphobos/src/std/socket.d
+++ b/libphobos/src/std/socket.d
@@ -163,47 +163,7 @@ string formatSocketError(int err) @trusted
 {
 cs = strerror_r(err, buf.ptr, buf.length);
 }
-else version (OSX)
-{
-auto errs = strerror_r(err, buf.ptr, buf.length);
-if (errs == 0)
-cs = buf.ptr;
-else
-return "Socket error " ~ to!string(err);
-}
-else version (FreeBSD)
-{
-auto errs = strerror_r(err, buf.ptr, buf.length);
-if (errs == 0)
-cs = buf.ptr;
-else
-return "Socket error " ~ to!string(err);
-}
-else version (NetBSD)
-{
-auto errs = strerror_r(err, buf.ptr, buf.length);
-if (errs == 0)
-cs = buf.ptr;
-else
-return "Socket error " ~ to!string(err);
-}
-else version (DragonFlyBSD)
-{
-auto errs = strerror_r(err, buf.ptr, buf.length);
-if (errs == 0)
-cs = buf.ptr;
-else
-return "Socket error " ~ to!string(err);
-}
-else version (Solaris)
-{
-auto errs = strerror_r(err, buf.ptr, buf.length);
-if (errs == 0)
-cs = buf.ptr;
-else
-return "Socket error " ~ to!string(err);
-}
-else version (CRuntime_Bionic)
+else
 {
 auto errs = strerror_r(err, buf.ptr, buf.length);
 if (errs == 0)
@@ -211,8 +171,6 @@ string formatSocketError(int err) @trusted
 else
 return "Socket error " ~ to!string(err);
 }
-else
-static assert(0);
 
 auto len = strlen(cs);
 
diff --git a/libphobos/src/std/stdio.d b/libphobos/src/std/stdio.d
index 63bc32e9694..4c1ad0baa15 100644
--- a/libphobos/src/std/stdio.d
+++ b/libphobos/src/std/stdio.d
@@ -44,38 +44,38 @@ version (CRuntime_Glibc)
 version = GCC_IO;
 version = HAS_GETDELIM;
 }
-
-version (OSX)
+else version (CRuntime_Bionic)
 {
 version = GENERIC_IO;
 version = HAS_GETDELIM;
 }
-
-version (FreeBSD)
+else version (CRuntime_Musl)
 {
 version = GENERIC_IO;
 version = HAS_GETDELIM;
 }
 
-version (NetBSD)
+version (OSX)
 {
 version = GENERIC_IO;
 version = HAS_GETDELIM;
 }
-
-version (DragonFlyBSD)
+else version (FreeBSD)
 {
 version = GENERIC_IO;
 version = HAS_GETDELIM;
 }
-
-version (Solaris)
+else version (NetBSD)
 {
 version = GENERIC_IO;
-version = NO_GETDELIM;
+version = HAS_GETDELIM;
 }
-
-version (CRuntime_Bionic)
+else version (DragonFlyBSD)
+{
+version = GENERIC_IO;
+version = HAS_GETDELIM;
+}
+else version (Solaris)
 {
 version = GENERIC_IO;
 version = NO_GETDELIM;


[PATCH PR d/91339] Committed merge with upstream dmd b37a537d3

2019-08-21 Thread Iain Buclaw
Hi,

This patch merges the dmd frontend implementation with upstream dmd b37a537d3.

Fixes PR d/91339, in which an error: cannot find source code for
runtime library file 'object.d' occurred when the path contained a
'~'.

Bootstrapped and regression tested the D testsuite on x86_64-linux-gnu.

Committed to trunk as r274771.

--
Iain
---
diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index cb7b6bfac7f..578f3fc0309 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-375ed10aa7eb28755f92775ca5c5399550cd100b
+b37a537d36c2ac69afa505a3110e2328c9fc0114
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/root/filename.c b/gcc/d/dmd/root/filename.c
index 6144d810fb5..ad6b1148c7b 100644
--- a/gcc/d/dmd/root/filename.c
+++ b/gcc/d/dmd/root/filename.c
@@ -110,7 +110,8 @@ Strings *FileName::splitPath(const char *path)
 case '~':
 {
 char *home = getenv("HOME");
-if (home)
+// Expand ~ only if it is prefixing the rest of the path.
+if (!buf.offset && p[1] == '/' && home)
 buf.writestring(home);
 else
 buf.writestring("~");


[PATCH, PR d/88722] Partially fix ICE: in register_moduleinfo, at d/modules.cc:40

2019-08-21 Thread Iain Buclaw
Hi,

This patch turns an ICE into a sorry() error.  Not marking the PR as
fixed, as an implementation of ModuleInfo code generation should be
worked out when named section support is not present on the target.

Bootstrapped and regression tested the D testsuite on
x86_64-linux-gnu, with further checking done on pdp11-aout
cross-compiler target to verify the sorry() path is executed and the
ICE is no longer present.

Committed to trunk as r274769.

--
Iain
---
gcc/d/ChangeLog:

PR d/88722
* modules.cc: Include diagnostic.h.
(register_moduleinfo): Use sorry instead of gcc_assert for targets
without named sections.
---
diff --git a/gcc/d/modules.cc b/gcc/d/modules.cc
index 88cc5e89e9a..a25e06ae1cd 100644
--- a/gcc/d/modules.cc
+++ b/gcc/d/modules.cc
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dmd/module.h"
 
 #include "tree.h"
+#include "diagnostic.h"
 #include "fold-const.h"
 #include "tm.h"
 #include "function.h"
@@ -404,7 +405,8 @@ build_dso_registry_var (const char * name, tree type)
 static void
 register_moduleinfo (Module *decl, tree minfo)
 {
-  gcc_assert (targetm_common.have_named_sections);
+  if (!targetm_common.have_named_sections)
+sorry ("%<-fmoduleinfo%> is not supported on this target");
 
   /* Build the ModuleInfo reference, this is done once for every Module.  */
   tree ident = mangle_internal_decl (decl, "__moduleRef", "Z");


[PATCH, d] Committed merge with upstream dmd 375ed10aa

2019-08-21 Thread Iain Buclaw
Hi,

This patch merges the dmd frontend implementation with upstream dmd 375ed10aa.

This allows the frontend to be able to compile on targets where the
pointer size is 16-bits.

Bootstrapped and regression tested the D testsuite on
x86_64-linux-gnu, with further checking done on msp430-elf
cross-compiler target to verify that at least simple programs can be
compiled on 16-bit targets.

Committed to trunk as r274768.

--
Iain
---
gcc/d/ChangeLog:

* d-target.cc: Include diagnostic.h.
(Target::_init): Set Tsize_t and Tptrdiff_t as D ushort and short if
the target pointer size is 2.  Add sorry if the pointer size is not
either 2, 4, or 8.
---
diff --git a/gcc/d/d-target.cc b/gcc/d/d-target.cc
index 8d85534f054..dfaf9bf3792 100644
--- a/gcc/d/d-target.cc
+++ b/gcc/d/d-target.cc
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree.h"
 #include "memmodel.h"
 #include "fold-const.h"
+#include "diagnostic.h"
 #include "stor-layout.h"
 #include "tm.h"
 #include "tm_p.h"
@@ -145,17 +146,24 @@ Target::_init (void)
   Target::maxStaticDataSize = tree_to_shwi (TYPE_MAX_VALUE (integer_type_node));
 
   /* Define what type to use for size_t, ptrdiff_t.  */
-  if (POINTER_SIZE == 64)
+  if (Target::ptrsize == 8)
 {
   global.params.isLP64 = true;
   Tsize_t = Tuns64;
   Tptrdiff_t = Tint64;
 }
-  else
+  else if (Target::ptrsize == 4)
 {
   Tsize_t = Tuns32;
   Tptrdiff_t = Tint32;
 }
+  else if (Target::ptrsize == 2)
+{
+  Tsize_t = Tuns16;
+  Tptrdiff_t = Tint16;
+}
+  else
+sorry ("D does not support pointers on this target.");
 
   Type::tsize_t = Type::basic[Tsize_t];
   Type::tptrdiff_t = Type::basic[Tptrdiff_t];
diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index d208aea3717..cb7b6bfac7f 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-792f0fdf249b21531dc91690024827f4f9ecbb97
+375ed10aa7eb28755f92775ca5c5399550cd100b
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/expression.c b/gcc/d/dmd/expression.c
index c674392095e..5f1bfa8f5a9 100644
--- a/gcc/d/dmd/expression.c
+++ b/gcc/d/dmd/expression.c
@@ -2920,10 +2920,12 @@ void IntegerExp::normalize()
 case Tint64:value = (d_int64) value;break;
 case Tuns64:value = (d_uns64) value;break;
 case Tpointer:
-if (Target::ptrsize == 4)
-value = (d_uns32) value;
-else if (Target::ptrsize == 8)
+if (Target::ptrsize == 8)
 value = (d_uns64) value;
+else if (Target::ptrsize == 4)
+value = (d_uns32) value;
+else if (Target::ptrsize == 2)
+value = (d_uns16) value;
 else
 assert(0);
 break;
diff --git a/gcc/d/dmd/hdrgen.c b/gcc/d/dmd/hdrgen.c
index 4eaa1ae1050..395aa3212b5 100644
--- a/gcc/d/dmd/hdrgen.c
+++ b/gcc/d/dmd/hdrgen.c
@@ -2152,10 +2152,12 @@ public:
 if ((sinteger_t)uval >= 0)
 {
 dinteger_t sizemax;
-if (Target::ptrsize == 4)
-sizemax = 0xUL;
-else if (Target::ptrsize == 8)
+if (Target::ptrsize == 8)
 sizemax = 0xULL;
+else if (Target::ptrsize == 4)
+sizemax = 0xUL;
+else if (Target::ptrsize == 2)
+sizemax = 0xUL;
 else
 assert(0);
 if (uval <= sizemax && uval <= 0x7FFFULL)
@@ -2296,12 +2298,10 @@ public:
 buf->writestring("cast(");
 buf->writestring(t->toChars());
 buf->writeByte(')');
-if (Target::ptrsize == 4)
-goto L3;
-else if (Target::ptrsize == 8)
+if (Target::ptrsize == 8)
 goto L4;
 else
-assert(0);
+goto L3;
 
 default:
 /* This can happen if errors, such as


[PATCH, PR d/90445] Committed fix for ICE in d_build_c_type_nodes, at d/d-builtins.cc:783

2019-08-21 Thread Iain Buclaw
Hi,

This patch fixes another ICE in the D front-end that occurred on
targets where SIZE_TYPE was unhandled by d_build_c_type_nodes.

Now signed_type_for() is used to set signed_size_type_node, and
(u)intmax_type_node is determined by UINTMAX_TYPE.

Bootstrapped and regression tested the D testsuite on
x86_64-linux-gnu, with further checking done on cross-compiler targets
mips64vr-elf, msp430-elf, pdp11-aout, and visium-elf to verify the ICE
no longer persists.

Committed to trunk as r274766.

--
Iain
---
gcc/d/ChangeLog:

PR d/90445
* d-builtins.cc (d_build_c_type_nodes): Test UINTMAX_TYPE for setting
uintmax_type_node.  Set signed_size_type_node as the signed_type_for
size_type_node.
---
diff --git a/gcc/d/d-builtins.cc b/gcc/d/d-builtins.cc
index 845f6a3d586..c90fc9051d9 100644
--- a/gcc/d/d-builtins.cc
+++ b/gcc/d/d-builtins.cc
@@ -762,27 +762,25 @@ d_build_c_type_nodes (void)
 = build_pointer_type (build_qualified_type (char_type_node,
 		TYPE_QUAL_CONST));
 
-  if (strcmp (SIZE_TYPE, "unsigned int") == 0)
+  if (strcmp (UINTMAX_TYPE, "unsigned int") == 0)
 {
   intmax_type_node = integer_type_node;
   uintmax_type_node = unsigned_type_node;
-  signed_size_type_node = integer_type_node;
 }
-  else if (strcmp (SIZE_TYPE, "long unsigned int") == 0)
+  else if (strcmp (UINTMAX_TYPE, "long unsigned int") == 0)
 {
   intmax_type_node = long_integer_type_node;
   uintmax_type_node = long_unsigned_type_node;
-  signed_size_type_node = long_integer_type_node;
 }
-  else if (strcmp (SIZE_TYPE, "long long unsigned int") == 0)
+  else if (strcmp (UINTMAX_TYPE, "long long unsigned int") == 0)
 {
   intmax_type_node = long_long_integer_type_node;
   uintmax_type_node = long_long_unsigned_type_node;
-  signed_size_type_node = long_long_integer_type_node;
 }
   else
 gcc_unreachable ();
 
+  signed_size_type_node = signed_type_for (size_type_node);
   wint_type_node = unsigned_type_node;
   pid_type_node = integer_type_node;
 }


[PATCH, PR d/90446] Committed fix for segfault in build_function_type at gcc/tree.c:8539

2019-08-21 Thread Iain Buclaw
Hi,

This patch fixes a segmentation fault in the D front-end when handling
built-in types that are PSImode.

D does not have the notion of integer types to be of any size other
than the fixed-sized byte, short, int, long, and cent types, however
integer types with non-standard sizes are still matched to D types by
comparing TYPE_SIZE_UNIT, and internally casted as appropriate.

Bootstrapped and regression tested the D testsuite on
x86_64-linux-gnu, with further checking done on cross-compiler targets
mips64vr-elf, msp430-elf, pdp11-aout, and visium-elf to verify the ICE
no longer persists.

Committed to trunk as r274767.

--
Iain
---
gcc/d/ChangeLog:

PR d/90446
* d-lang.cc (d_type_for_mode): Check for all internal __intN types.
(d_type_for_size): Likewise.
---
diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc
index f23f719a2c3..db0db0e71dc 100644
--- a/gcc/d/d-lang.cc
+++ b/gcc/d/d-lang.cc
@@ -1360,6 +1360,17 @@ d_type_for_mode (machine_mode mode, int unsignedp)
   if (mode == TYPE_MODE (build_pointer_type (d_int_type)))
 return build_pointer_type (d_int_type);
 
+  for (int i = 0; i < NUM_INT_N_ENTS; i ++)
+{
+  if (int_n_enabled_p[i] && mode == int_n_data[i].m)
+	{
+	  if (unsignedp)
+	return int_n_trees[i].unsigned_type;
+	  else
+	return int_n_trees[i].signed_type;
+	}
+}
+
   if (COMPLEX_MODE_P (mode))
 {
   machine_mode inner_mode;
@@ -1408,6 +1419,17 @@ d_type_for_size (unsigned bits, int unsignedp)
   if (bits <= TYPE_PRECISION (d_cent_type))
 return unsignedp ? d_ucent_type : d_cent_type;
 
+  for (int i = 0; i < NUM_INT_N_ENTS; i ++)
+{
+  if (int_n_enabled_p[i] && bits == int_n_data[i].bitsize)
+	{
+	  if (unsignedp)
+	return int_n_trees[i].unsigned_type;
+	  else
+	return int_n_trees[i].signed_type;
+	}
+}
+
   return 0;
 }
 


Re: [PATCH] Add MD Function type check for builtin_md vectorize

2019-08-21 Thread Richard Biener
On Tue, 20 Aug 2019, Xiong Hu Luo wrote:

> The DECL_MD_FUNCTION_CODE added in r274404(PR 91421) by rsandifo requires that
> DECL to be a BUILTIN_IN_MD class built-in, asserts will happen when lto
> as the patch r274411(PR 91287) outputs some math function symbol to the 
> object,
> this patch will check function type before do builtin_md vectorize.

I think Richard fixed this already.

Richard.

> gcc/ChangeLog
> 
> 2019-08-21  Xiong Hu Luo  
> 
>   * tree-vect-stmts.c (vectorizable_call): Check callee built-in type.
>   * gcc/tree.h (DECL_MD_FUNCTION_P): New function.
> ---
>  gcc/tree-vect-stmts.c |  2 +-
>  gcc/tree.h| 12 
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 1e2dfe5d22d..ef947f20d63 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -3376,7 +3376,7 @@ vectorizable_call (stmt_vec_info stmt_info, 
> gimple_stmt_iterator *gsi,
>if (cfn != CFN_LAST)
>   fndecl = targetm.vectorize.builtin_vectorized_function
> (cfn, vectype_out, vectype_in);
> -  else if (callee)
> +  else if (callee && DECL_MD_FUNCTION_P (callee))
>   fndecl = targetm.vectorize.builtin_md_vectorized_function
> (callee, vectype_out, vectype_in);
>  }
> diff --git a/gcc/tree.h b/gcc/tree.h
> index b910c5cb475..8cce89e5cf3 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -3905,6 +3905,18 @@ DECL_MD_FUNCTION_CODE (const_tree decl)
>return fndecl.function_code;
>  }
>  
> +/* Return true if decl is a FUNCTION_DECL with built-in class BUILT_IN_MD.
> +   Otherwise return false.  */
> +inline bool
> +DECL_MD_FUNCTION_P (const_tree decl)
> +{
> +  const tree_function_decl  = FUNCTION_DECL_CHECK 
> (decl)->function_decl;
> +  if (fndecl.built_in_class == BUILT_IN_MD)
> +return true;
> +  else
> +return false;
> +}
> +
>  /* Return the frontend-specific built-in function that DECL represents,
> given that it is known to be a FUNCTION_DECL with built-in class
> BUILT_IN_FRONTEND.  */
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

[PATCH] Fix invalid RTL sharing caused by STV changes

2019-08-21 Thread Richard Biener


I am testing the following and will commit it as obvious.

Richard.

2019-08-21  Richard Biener  

PR target/91498
* config/i386/i386-features.c
(general_scalar_chain::make_vector_copies): Copy stack temporary
rtx when using it multiple times.
(general_scalar_chain::convert_reg): Likewise.

Index: gcc/config/i386/i386-features.c
===
--- gcc/config/i386/i386-features.c (revision 274764)
+++ gcc/config/i386/i386-features.c (working copy)
@@ -706,7 +706,7 @@ general_scalar_chain::make_vector_copies
gen_rtx_SUBREG (SImode, reg, 4));
  }
else
- emit_move_insn (tmp, reg);
+ emit_move_insn (copy_rtx (tmp), reg);
emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
gen_gpr_to_xmm_move_src (vmode, tmp)));
  }
@@ -810,7 +810,7 @@ general_scalar_chain::convert_reg (unsig
  adjust_address (tmp, SImode, 4));
}
  else
-   emit_move_insn (scopy, tmp);
+   emit_move_insn (scopy, copy_rtx (tmp));
}
  else if (!TARGET_64BIT && smode == DImode)
{


Re: [PATCH 2/8] bpf: new GCC port

2019-08-21 Thread Segher Boessenkool
On Tue, Aug 20, 2019 at 04:05:40PM -0400, Hans-Peter Nilsson wrote:
> On Tue, 20 Aug 2019, Jose E. Marchesi wrote:
> > > On Thu, Aug 15, 2019 at 12:22:46AM +0200, Jose E. Marchesi wrote:
> > > > --- a/configure
> > > > +++ b/configure
> > Yeah by mistake I used a Debian patched autoconf 2.96.  Will regenerate
> > using vanilla autoconf for subsequent versions of the patch.
> 
> It's nice that this is identified and hopefully resolved, but
> since nobody mentioned it I'll just point out that
> it's preferable to *not at all* include generated files like
> configure in patches.  See
> .

"Do not include generated files as part of the patch, just mention them
in the ChangeLog (e.g., "* configure: Regenerate.")."

That's not common practice nowadays I think?  It's also not good advice
for people who might get it wrong.  Also, the patches on the mailing
list should preferably be exactly what is committed.  For sanity.


Segher