date:20190919

Re: [PATCH, Fortran] Optionally suppress no-automatic overwrites recursive warning - for review

2019-09-19 Thread Bernhard Reutner-Fischer

On Thu, 19 Sep 2019 17:46:29 +0200
Tobias Burnus  wrote:

> Hi Mark,
> 
> On 9/19/19 3:40 PM, Mark Eggleston wrote:
> > The following warning is produced when -fno-automatic and -frecursive 
> > are used at the same time:
> >
> > f951: Warning: Flag '-fno-automatic' overwrites '-frecursive'
> >
> > This patch allows the warning to be switched off using a new option, 
> > -Woverwrite-recursive, initialised to on.
> >
> > I don't have a test case for this as I don't know how to test for a 
> > warning that isn't related to a line of code.  
> 
> Try:
> 
> ! { dg-warning "Flag .-fno-automatic. overwrites .-frecursive." "" { 
> target *-*-* } 0 }
> 
> The syntax is { dg-warning "message", "label" {target ...} linenumber }, 
> where linenumber = 0 means it can be on any line.
> 
> If the output doesn't match (but I think it does with "Warning:"), 
> general messages can be caught with "dg-message".

Also:

> @@ -411,7 +411,7 @@ gfc_post_options (const char **pfilename)
>&& flag_max_stack_var_size != 0)
>  gfc_warning_now (0, "Flag %<-fno-automatic%> overwrites 
> %<-fmax-stack-var-size=%d%>",
>flag_max_stack_var_size);
> -  else if (!flag_automatic && flag_recursive)
> +  else if (!flag_automatic && flag_recursive && warn_overwrite_recursive)
>  gfc_warning_now (0, "Flag %<-fno-automatic%> overwrites 
> %<-frecursive%>");
>else if (!flag_automatic && flag_openmp)
>  gfc_warning_now (0, "Flag %<-fno-automatic%> overwrites %<-frecursive%> 
> implied by "
> 

Doesn't look right to me. Do you want
gfc_warning_now (OPT_Woverwrite_recursive, "Flag ...
instead?

thanks,

[PATCH] Fix PR91767

2019-09-19 Thread Richard Biener



regno_reg_rtx[] is a smoking gun with C++ and references.

Committed as obvious.

Richard.

2019-09-20  Richard Biener  

PR target/91767
* config/i386/i386-features.c (general_scalar_chain::convert_registers):
Ensure there's a sequence point between allocating the new register
and passing a reference to a reg via regno_reg_rtx.

Index: gcc/config/i386/i386-features.c
===
--- gcc/config/i386/i386-features.c (revision 275988)
+++ gcc/config/i386/i386-features.c (working copy)
@@ -1210,7 +1210,10 @@ general_scalar_chain::convert_registers
   bitmap_iterator bi;
   unsigned id;
   EXECUTE_IF_SET_IN_BITMAP (defs_conv, 0, id, bi)
-defs_map.put (regno_reg_rtx[id], gen_reg_rtx (smode));
+{
+  rtx chain_reg = gen_reg_rtx (smode);
+  defs_map.put (regno_reg_rtx[id], chain_reg);
+}
   EXECUTE_IF_SET_IN_BITMAP (insns_conv, 0, id, bi)
 for (df_ref ref = DF_INSN_UID_DEFS (id); ref; ref = DF_REF_NEXT_LOC (ref))
   if (bitmap_bit_p (defs_conv, DF_REF_REGNO (ref)))

Re: [PATCH] Help compiler detect invalid code

2019-09-19 Thread François Dumont

I already realized that previous patch will be too controversial to be 
accepted.


In this new version I just implement a real memmove in __memmove so that 
in copy_backward there is no need for a shortcut to a more defensive code.


I'll see if in Debug mode I can do something.

François


On 9/19/19 10:27 PM, François Dumont wrote:

Hi

    I start working on making recently added constexpr tests to work 
in Debug mode.


    It appears that the compiler is able to detect code mistakes 
pretty well as long we don't try to hide the code intention with a 
defensive approach. This is why I'd like to propose to replace '__n > 
0' conditions with '__n != 0'.


    The result is demonstrated by the constexpr_neg.cc tests. What do 
you think ?


    * include/bits/stl_algobase.h (__memmove): Return _Tp*.
    (__memmove): Loop as long as __n is not 0.
    (__copy_move<>::__copy_m): Likewise.
    (__copy_move_backward<>::__copy_move_b): Likewise.
    * testsuite/25_algorithms/copy/constexpr.cc: Add check on copied 
values.

    * testsuite/25_algorithms/copy_backward/constexpr.cc: Likewise.
    * testsuite/25_algorithms/copy/constexpr_neg.cc: New.
    * testsuite/25_algorithms/copy_backward/constexpr.cc: New.

    I'll submit the patch to fix Debug mode depending on the decision 
for this one.


François



diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index 4eba053ac75..94a79b85d15 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -83,27 +83,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   template
 _GLIBCXX14_CONSTEXPR
-inline void*
-__memmove(_Tp* __dst, const _Tp* __src, size_t __num)
+inline void
+__memmove(_Tp* __dst, const _Tp* __src, ptrdiff_t __num)
 {
 #ifdef __cpp_lib_is_constant_evaluated
   if (std::is_constant_evaluated())
 	{
-	  for(; __num > 0; --__num)
+	  __dst += __num;
+	  __src += __num;
+	  for (; __num != 0; --__num)
 	{
 	  if constexpr (_IsMove)
-		*__dst = std::move(*__src);
+		*--__dst = std::move(*--__src);
 	  else
-		*__dst = *__src;
-	  ++__src;
-	  ++__dst;
+		*--__dst = *--__src;
 	}
-	  return __dst;
 	}
   else
 #endif
-	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
-  return __dst;
+	__builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
 }
 
   /*
@@ -730,12 +728,6 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
 			 && __is_pointer<_BI2>::__value
 			 && __are_same<_ValueType1, _ValueType2>::__value);
 
-#ifdef __cpp_lib_is_constant_evaluated
-  if (std::is_constant_evaluated())
-	return std::__copy_move_backward::__copy_move_b(__first, __last,
-			   __result);
-#endif
   return std::__copy_move_backward<_IsMove, __simple,
    _Category>::__copy_move_b(__first,
  __last,
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/constexpr.cc b/libstdc++-v3/testsuite/25_algorithms/copy/constexpr.cc
index 67910b8773e..fc70db6dae7 100644
--- a/libstdc++-v3/testsuite/25_algorithms/copy/constexpr.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/constexpr.cc
@@ -22,14 +22,25 @@
 #include 
 
 constexpr bool
-test()
+test1()
 {
-  constexpr std::array ca0{{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}};
+  constexpr std::array ca0{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}};
   std::array ma0{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}};
 
   const auto out6 = std::copy(ca0.begin(), ca0.begin() + 8, ma0.begin() + 2);
 
-  return out6 == ma0.begin() + 10;
+  return out6 == ma0.begin() + 10 && *(ma0.begin() + 2) == 1 && *out6 == 0;
 }
 
-static_assert(test());
+static_assert(test1());
+
+constexpr bool
+test2()
+{
+  std::array ma0{{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}};
+  const auto out6 = std::copy(ma0.begin(), ma0.begin() + 8, ma0.begin() + 2);
+
+  return out6 == ma0.begin() + 10 && *(ma0.begin() + 9) == 7;
+}
+
+static_assert(test2());
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/constexpr_neg.cc b/libstdc++-v3/testsuite/25_algorithms/copy/constexpr_neg.cc
new file mode 100644
index 000..49052467409
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/constexpr_neg.cc
@@ -0,0 +1,38 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-s

Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

2019-09-19 Thread Segher Boessenkool

On Thu, Sep 12, 2019 at 02:54:50PM -0700, Nick Desaulniers wrote:
> I have seen instances where instruction selection fails to select the
> appropriate way to branch when inline asm size is misjudged, resulting
> in un-encodeable jumps (as in the branch target is too far to be
> encoded in the number of bits of a single jump/branch instruction).

"asm inline" *only* influences the estimated size *for inlining
purposes*.  Not for anything else.  Everything else still uses
conservative estimates.


Segher

Re: [10/32] Remove global call sets: combine.c

2019-09-19 Thread Segher Boessenkool

Hi Richard,

Sorry this too me so long to get back to.

On Thu, Sep 12, 2019 at 08:51:59AM +0100, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > On Wed, Sep 11, 2019 at 08:08:38PM +0100, Richard Sandiford wrote:
> >>hard_reg_set_iterator hrsi;
> >> -  EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, i, 
> >> hrsi)
> >> +  EXECUTE_IF_SET_IN_HARD_REG_SET (abi.full_and_partial_reg_clobbers 
> >> (),
> >> +0, i, hrsi)
> >
> > So "abi" in that means calls?
> 
> "abi" is the interface of the callee function, taking things like
> function attributes and -fipa-ra into account.
> 
> The register sets are describing what the callee does rather than
> what calls to it do.  E.g. on targets that allow linker stubs to be
> inserted between calls, the scratch registers reserved for linker stubs
> are still call-clobbered, even if the target of the call doesn't use
> them.  (Those call clobbers are represented separately, at least when
> TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS is true.  When it's
> false we don't use -fipa-ra information at all.)
> 
> > It is not such a great name like that.  Since its children are
> > very_long_names, it doesn't need to be only three chars itself,
> > either?
> 
> OK, what name would you prefer?

Maybe call_abi is a good name?  It's difficult to capture the subtleties
in a short enough name.  As always :-)


Segher

Re: [PATCH 0/4] True IPA reimplementation of IPA-SRA (v4)

2019-09-19 Thread Martin Jambor

Hello,

this is what I have committed after Honza's approval and after
re-testing on x86_64-linux and aarch64-linux.

Thanks a lot for bearing with me,

Martin

2019-09-20  Martin Jambor  

* coretypes.h (cgraph_edge): Declare.
* ipa-param-manipulation.c: Rewrite.
* ipa-param-manipulation.h: Likewise.
* Makefile.in (GTFILES): Added ipa-param-manipulation.h and ipa-sra.c.
(OBJS): Added ipa-sra.o.
* cgraph.h (ipa_replace_map): Removed fields old_tree, replace_p
and ref_p, added fields param_adjustments and performed_splits.
(struct cgraph_clone_info): Remove ags_to_skip and
combined_args_to_skip, new field param_adjustments.
(cgraph_node::create_clone): Changed parameters to use
ipa_param_adjustments.
(cgraph_node::create_virtual_clone): Likewise.
(cgraph_node::create_virtual_clone_with_body): Likewise.
(tree_function_versioning): Likewise.
(cgraph_build_function_type_skip_args): Removed.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Convert to
using ipa_param_adjustments.
(clone_of_p): Likewise.
* cgraphclones.c (cgraph_build_function_type_skip_args): Removed.
(build_function_decl_skip_args): Likewise.
(duplicate_thunk_for_node): Adjust parameters using
ipa_param_body_adjustments, copy param_adjustments instead of
args_to_skip.
(cgraph_node::create_clone): Convert to using ipa_param_adjustments.
(cgraph_node::create_virtual_clone): Likewise.
(cgraph_node::create_version_clone_with_body): Likewise.
(cgraph_materialize_clone): Likewise.
(symbol_table::materialize_all_clones): Likewise.
* ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Simplify
ipa_replace_map check.
* ipa-cp.c (get_replacement_map): Do not initialize removed fields.
(initialize_node_lattices): Make aware that some parameters might have
already been removed.
(want_remove_some_param_p): New function.
(create_specialized_node): Convert to using ipa_param_adjustments and
deal with possibly pre-existing adjustments.
* lto-cgraph.c (output_cgraph_opt_summary_p): Likewise.
(output_node_opt_summary): Do not stream removed fields.  Stream
parameter adjustments instead of argumetns to skip.
(input_node_opt_summary): Likewise.
(input_node_opt_summary): Likewise.
* lto-section-in.c (lto_section_name): Added ipa-sra section.
* lto-streamer.h (lto_section_type): Likewise.
* tree-inline.h (copy_body_data): New fields killed_new_ssa_names and
param_body_adjs.
(copy_decl_to_var): Declare.
* tree-inline.c (update_clone_info): Do not remap old_tree.
(remap_gimple_stmt): Use ipa_param_body_adjustments to modify gimple
statements, walk all extra generated statements and remap their
operands.
(redirect_all_calls): Add killed SSA names to a hash set.
(remap_ssa_name): Do not remap killed SSA names.
(copy_arguments_for_versioning): Renames to copy_arguments_nochange,
half of functionality moved to ipa_param_body_adjustments.
(copy_decl_to_var): Make exported.
(copy_body): Destroy killed_new_ssa_names hash set.
(expand_call_inline): Remap performed splits.
(update_clone_info): Likewise.
(tree_function_versioning): Simplify tree_map processing.  Updated to
accept ipa_param_adjustments and use ipa_param_body_adjustments.
* omp-simd-clone.c (simd_clone_vector_of_formal_parm_types): Adjust
for the new interface.
(simd_clone_clauses_extract): Likewise, make args an auto_vec.
(simd_clone_compute_base_data_type): Likewise.
(simd_clone_init_simd_arrays): Adjust for the new interface.
(simd_clone_adjust_argument_types): Likewise.
(struct modify_stmt_info): Likewise.
(ipa_simd_modify_stmt_ops): Likewise.
(ipa_simd_modify_function_body): Likewise.
(simd_clone_adjust): Likewise.
* tree-sra.c: Removed IPA-SRA.  Include tree-sra.h.
(type_internals_preclude_sra_p): Make public.
* tree-sra.h: New file.
* ipa-inline-transform.c (save_inline_function_body): Update to
refelct new tree_function_versioning signature.
* ipa-prop.c (adjust_agg_replacement_values): Use a helper from
ipa_param_adjustments to get current parameter indices.
(ipcp_modif_dom_walker::before_dom_children): Likewise.
(ipcp_update_bits): Likewise.
(ipcp_update_vr): Likewise.
* ipa-split.c (split_function): Convert to using ipa_param_adjustments.
* ipa-sra.c: New file.
* multiple_target.c (create_target_clone): Update to reflet new type
of create_version_clone_with_body.
* trans-mem.c (ipa_tm_create_version): Update to reflect new type of

Re: [C++ PATCH 2/4] Fix conversions for built-in operator overloading candidates.

2019-09-19 Thread Marek Polacek

On Thu, Sep 19, 2019 at 04:11:16PM -0400, Jason Merrill wrote:
> Do any of you have a reproducer for this?

I've attached it to 91825.

Marek

[PATCH] Help compiler detect invalid code

2019-09-19 Thread François Dumont


Hi

    I start working on making recently added constexpr tests to work in 
Debug mode.


    It appears that the compiler is able to detect code mistakes pretty 
well as long we don't try to hide the code intention with a defensive 
approach. This is why I'd like to propose to replace '__n > 0' 
conditions with '__n != 0'.


    The result is demonstrated by the constexpr_neg.cc tests. What do 
you think ?


    * include/bits/stl_algobase.h (__memmove): Return _Tp*.
    (__memmove): Loop as long as __n is not 0.
    (__copy_move<>::__copy_m): Likewise.
    (__copy_move_backward<>::__copy_move_b): Likewise.
    * testsuite/25_algorithms/copy/constexpr.cc: Add check on copied 
values.

    * testsuite/25_algorithms/copy_backward/constexpr.cc: Likewise.
    * testsuite/25_algorithms/copy/constexpr_neg.cc: New.
    * testsuite/25_algorithms/copy_backward/constexpr.cc: New.

    I'll submit the patch to fix Debug mode depending on the decision 
for this one.


François

diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index 4eba053ac75..33593197b4f 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -83,13 +83,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   template
 _GLIBCXX14_CONSTEXPR
-inline void*
-__memmove(_Tp* __dst, const _Tp* __src, size_t __num)
+inline _Tp*
+__memmove(_Tp* __dst, const _Tp* __src, ptrdiff_t __num)
 {
 #ifdef __cpp_lib_is_constant_evaluated
   if (std::is_constant_evaluated())
 	{
-	  for(; __num > 0; --__num)
+	  for (; __num != 0; --__num)
 	{
 	  if constexpr (_IsMove)
 		*__dst = std::move(*__src);
@@ -100,10 +100,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	}
 	  return __dst;
 	}
-  else
 #endif
-	return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
-  return __dst;
+  return static_cast<_Tp*>(
+	__builtin_memmove(__dst, __src, sizeof(_Tp) * __num));
 }
 
   /*
@@ -398,7 +397,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	__copy_m(_II __first, _II __last, _OI __result)
 	{
 	  typedef typename iterator_traits<_II>::difference_type _Distance;
-	  for(_Distance __n = __last - __first; __n > 0; --__n)
+	  for (_Distance __n = __last - __first; __n != 0; --__n)
 	{
 	  *__result = *__first;
 	  ++__first;
@@ -418,7 +417,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	__copy_m(_II __first, _II __last, _OI __result)
 	{
 	  typedef typename iterator_traits<_II>::difference_type _Distance;
-	  for(_Distance __n = __last - __first; __n > 0; --__n)
+	  for (_Distance __n = __last - __first; __n != 0; --__n)
 	{
 	  *__result = std::move(*__first);
 	  ++__first;
@@ -446,8 +445,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 	  const ptrdiff_t _Num = __last - __first;
 	  if (_Num)
-	std::__memmove<_IsMove>(__result, __first, _Num);
-	  return __result + _Num;
+	return std::__memmove<_IsMove>(__result, __first, _Num);
+	  return __result;
 	}
 };
 
@@ -671,7 +670,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
 	{
 	  typename iterator_traits<_BI1>::difference_type
 	__n = __last - __first;
-	  for (; __n > 0; --__n)
+	  for (; __n != 0; --__n)
 	*--__result = *--__last;
 	  return __result;
 	}
@@ -688,7 +687,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
 	{
 	  typename iterator_traits<_BI1>::difference_type
 	__n = __last - __first;
-	  for (; __n > 0; --__n)
+	  for (; __n != 0; --__n)
 	*--__result = std::move(*--__last);
 	  return __result;
 	}
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/constexpr.cc b/libstdc++-v3/testsuite/25_algorithms/copy/constexpr.cc
index 67910b8773e..a0460603496 100644
--- a/libstdc++-v3/testsuite/25_algorithms/copy/constexpr.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/constexpr.cc
@@ -24,12 +24,12 @@
 constexpr bool
 test()
 {
-  constexpr std::array ca0{{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}};
+  constexpr std::array ca0{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}};
   std::array ma0{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}};
 
   const auto out6 = std::copy(ca0.begin(), ca0.begin() + 8, ma0.begin() + 2);
 
-  return out6 == ma0.begin() + 10;
+  return out6 == ma0.begin() + 10 && *(ma0.begin() + 2) == 1 && *out6 == 0;
 }
 
 static_assert(test());
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/constexpr_neg.cc b/libstdc++-v3/testsuite/25_algorithms/copy/constexpr_neg.cc
new file mode 100644
index 000..34e20be97eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/constexpr_neg.cc
@@ -0,0 +1,50 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; wit

Re: [C++ PATCH 2/4] Fix conversions for built-in operator overloading candidates.

2019-09-19 Thread Jason Merrill

Do any of you have a reproducer for this?

On Thu, Sep 19, 2019 at 3:43 PM Jason Merrill  wrote:
>
> I've reverted this patch for the moment.
>
> On Wed, Sep 18, 2019 at 8:19 PM JiangNing OS
>  wrote:
> >
> > Hi Jason,
> >
> > This commit caused boot-strap failure on aarch64. Is it a bug? Can this be 
> > fixed ASAP?
> >
> > ../../gcc/gcc/expmed.c:5602:19: error: ���int_mode��� may be used 
> > uninitialized in this function [-Werror=maybe-uninitialized]
> >  5602 |   scalar_int_mode int_mode;
> >   |   ^~~~
> >
> > Thanks,
> > -Jiangning
> >
> > > -Original Message-
> > > From: gcc-patches-ow...@gcc.gnu.org 
> > > On Behalf Of Jason Merrill
> > > Sent: Monday, September 16, 2019 12:33 PM
> > > To: gcc-patches@gcc.gnu.org
> > > Subject: [C++ PATCH 2/4] Fix conversions for built-in operator overloading
> > > candidates.
> > >
> > > While working on C++20 operator<=>, I noticed that build_new_op_1 was
> > > doing too much conversion when a built-in candidate was selected; the
> > > standard says it should only perform user-defined conversions, and then
> > > leave the normal operator semantics to handle any standard conversions.
> > > This is important for operator<=> because a comparison of two different
> > > unscoped enums is ill-formed; if we promote the enums to int here,
> > > cp_build_binary_op never gets to see the original operand types, so we 
> > > can't
> > > give the error.
> > >
> > > Tested x86_64-pc-linux-gnu, applying to trunk.
> > >
> > >   * call.c (build_new_op_1): Don't apply any standard conversions to
> > >   the operands of a built-in operator.  Don't suppress conversions in
> > >   cp_build_unary_op.
> > >   * typeck.c (cp_build_unary_op): Do integral promotions for enums.
> > > ---
> > >  gcc/cp/call.c| 51 
> > >  gcc/cp/typeck.c  |  4 ++--
> > >  gcc/cp/ChangeLog |  7 +++
> > >  3 files changed, 34 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/gcc/cp/call.c b/gcc/cp/call.c index c3045d948c5..457fa6605c2
> > > 100644
> > > --- a/gcc/cp/call.c
> > > +++ b/gcc/cp/call.c
> > > @@ -6139,41 +6139,40 @@ build_new_op_1 (const op_location_t &loc,
> > > enum tree_code code, int flags,
> > > break;
> > >   }
> > >
> > > -   /* We need to strip any leading REF_BIND so that bitfields
> > > -  don't cause errors.  This should not remove any important
> > > -  conversions, because builtins don't apply to class
> > > -  objects directly.  */
> > > +   /* "If a built-in candidate is selected by overload resolution, 
> > > the
> > > +  operands of class type are converted to the types of the
> > > +  corresponding parameters of the selected operation function,
> > > +  except that the second standard conversion sequence of a
> > > +  user-defined conversion sequence (12.3.3.1.2) is not applied."
> > > +*/
> > > conv = cand->convs[0];
> > > -   if (conv->kind == ck_ref_bind)
> > > - conv = next_conversion (conv);
> > > -   arg1 = convert_like (conv, arg1, complain);
> > > +   if (conv->user_conv_p)
> > > + {
> > > +   while (conv->kind != ck_user)
> > > + conv = next_conversion (conv);
> > > +   arg1 = convert_like (conv, arg1, complain);
> > > + }
> > >
> > > if (arg2)
> > >   {
> > > conv = cand->convs[1];
> > > -   if (conv->kind == ck_ref_bind)
> > > - conv = next_conversion (conv);
> > > -   else
> > > - arg2 = decay_conversion (arg2, complain);
> > > -
> > > -   /* We need to call warn_logical_operator before
> > > -  converting arg2 to a boolean_type, but after
> > > -  decaying an enumerator to its value.  */
> > > -   if (complain & tf_warning)
> > > - warn_logical_operator (loc, code, boolean_type_node,
> > > -code_orig_arg1, arg1,
> > > -code_orig_arg2, arg2);
> > > -
> > > -   arg2 = convert_like (conv, arg2, complain);
> > > +   if (conv->user_conv_p)
> > > + {
> > > +   while (conv->kind != ck_user)
> > > + conv = next_conversion (conv);
> > > +   arg2 = convert_like (conv, arg2, complain);
> > > + }
> > >   }
> > > +
> > > if (arg3)
> > >   {
> > > conv = cand->convs[2];
> > > -   if (conv->kind == ck_ref_bind)
> > > - conv = next_conversion (conv);
> > > -   convert_like (conv, arg3, complain);
> > > +   if (conv->user_conv_p)
> > > + {
> > > +   while (conv->kind != ck_user)
> > > + conv = next_conversion (conv);
> > > +   arg3 = convert_like (conv, arg3, complain);
> > > + }
> > >   }
> > > -
> > >   }
> >

Re: Patch to support extended characters in C/C++ identifiers

2019-09-19 Thread Lewis Hyatt

On Thu, Sep 19, 2019 at 3:57 PM Joseph Myers  wrote:
>
> On Thu, 12 Sep 2019, Lewis Hyatt wrote:
>
> > Attached is a single patch relative to current trunk that incorporates all 
> > of
> > your feedback. I gzipped it like last time just in case the invalid UTF-8 in
> > the tests presents a problem. The code changes are the same as before other
> > than comments.
> >
> > The documentation is now updated... there were a couple other places that 
> > also
> > seemed reasonable to me to update, hope it sounds OK.
>
> Thanks, I've now committed this patch.
>

Thank you very much.

-Lewis

Re: Patch to support extended characters in C/C++ identifiers

2019-09-19 Thread Joseph Myers

On Thu, 12 Sep 2019, Lewis Hyatt wrote:

> Attached is a single patch relative to current trunk that incorporates all of
> your feedback. I gzipped it like last time just in case the invalid UTF-8 in
> the tests presents a problem. The code changes are the same as before other
> than comments.
> 
> The documentation is now updated... there were a couple other places that also
> seemed reasonable to me to update, hope it sounds OK.

Thanks, I've now committed this patch.

-- 
Joseph S. Myers
jos...@codesourcery.com

[Darwin, PPC, testsuite, committed] Fix pr89313.c fail.

2019-09-19 Thread Iain Sandoe

Hi,

Darwin defines '__POWERPC__' rather than '__powerpc__' so check for
the upper case version too in order to select the correct register
name.

tested on powerpc-darwin9, , applied to mainline.
thanks
Iain

gcc/testsuite:

2019-09-19  Iain Sandoe  

* gcc.dg/pr89313.c: Test for __POWERPC__ in addition to
__powerpc__ in register name selection.


diff --git a/gcc/testsuite/gcc.dg/pr89313.c b/gcc/testsuite/gcc.dg/pr89313.c
index 6688323fbd..76cb0910b9 100644
--- a/gcc/testsuite/gcc.dg/pr89313.c
+++ b/gcc/testsuite/gcc.dg/pr89313.c
@@ -8,7 +8,7 @@
 # define REG "r0"
 #elif defined (__i386__)
 # define REG "%eax"
-#elif defined (__powerpc__)
+#elif defined (__powerpc__) || defined (__POWERPC__)
 # define REG "r3"
 #elif defined (__s390__)
 # define REG "0"

Re: [C++ PATCH 2/4] Fix conversions for built-in operator overloading candidates.

2019-09-19 Thread Jason Merrill

I've reverted this patch for the moment.

On Wed, Sep 18, 2019 at 8:19 PM JiangNing OS
 wrote:
>
> Hi Jason,
>
> This commit caused boot-strap failure on aarch64. Is it a bug? Can this be 
> fixed ASAP?
>
> ../../gcc/gcc/expmed.c:5602:19: error: ���int_mode��� may be used 
> uninitialized in this function [-Werror=maybe-uninitialized]
>  5602 |   scalar_int_mode int_mode;
>   |   ^~~~
>
> Thanks,
> -Jiangning
>
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org 
> > On Behalf Of Jason Merrill
> > Sent: Monday, September 16, 2019 12:33 PM
> > To: gcc-patches@gcc.gnu.org
> > Subject: [C++ PATCH 2/4] Fix conversions for built-in operator overloading
> > candidates.
> >
> > While working on C++20 operator<=>, I noticed that build_new_op_1 was
> > doing too much conversion when a built-in candidate was selected; the
> > standard says it should only perform user-defined conversions, and then
> > leave the normal operator semantics to handle any standard conversions.
> > This is important for operator<=> because a comparison of two different
> > unscoped enums is ill-formed; if we promote the enums to int here,
> > cp_build_binary_op never gets to see the original operand types, so we can't
> > give the error.
> >
> > Tested x86_64-pc-linux-gnu, applying to trunk.
> >
> >   * call.c (build_new_op_1): Don't apply any standard conversions to
> >   the operands of a built-in operator.  Don't suppress conversions in
> >   cp_build_unary_op.
> >   * typeck.c (cp_build_unary_op): Do integral promotions for enums.
> > ---
> >  gcc/cp/call.c| 51 
> >  gcc/cp/typeck.c  |  4 ++--
> >  gcc/cp/ChangeLog |  7 +++
> >  3 files changed, 34 insertions(+), 28 deletions(-)
> >
> > diff --git a/gcc/cp/call.c b/gcc/cp/call.c index c3045d948c5..457fa6605c2
> > 100644
> > --- a/gcc/cp/call.c
> > +++ b/gcc/cp/call.c
> > @@ -6139,41 +6139,40 @@ build_new_op_1 (const op_location_t &loc,
> > enum tree_code code, int flags,
> > break;
> >   }
> >
> > -   /* We need to strip any leading REF_BIND so that bitfields
> > -  don't cause errors.  This should not remove any important
> > -  conversions, because builtins don't apply to class
> > -  objects directly.  */
> > +   /* "If a built-in candidate is selected by overload resolution, the
> > +  operands of class type are converted to the types of the
> > +  corresponding parameters of the selected operation function,
> > +  except that the second standard conversion sequence of a
> > +  user-defined conversion sequence (12.3.3.1.2) is not applied."
> > +*/
> > conv = cand->convs[0];
> > -   if (conv->kind == ck_ref_bind)
> > - conv = next_conversion (conv);
> > -   arg1 = convert_like (conv, arg1, complain);
> > +   if (conv->user_conv_p)
> > + {
> > +   while (conv->kind != ck_user)
> > + conv = next_conversion (conv);
> > +   arg1 = convert_like (conv, arg1, complain);
> > + }
> >
> > if (arg2)
> >   {
> > conv = cand->convs[1];
> > -   if (conv->kind == ck_ref_bind)
> > - conv = next_conversion (conv);
> > -   else
> > - arg2 = decay_conversion (arg2, complain);
> > -
> > -   /* We need to call warn_logical_operator before
> > -  converting arg2 to a boolean_type, but after
> > -  decaying an enumerator to its value.  */
> > -   if (complain & tf_warning)
> > - warn_logical_operator (loc, code, boolean_type_node,
> > -code_orig_arg1, arg1,
> > -code_orig_arg2, arg2);
> > -
> > -   arg2 = convert_like (conv, arg2, complain);
> > +   if (conv->user_conv_p)
> > + {
> > +   while (conv->kind != ck_user)
> > + conv = next_conversion (conv);
> > +   arg2 = convert_like (conv, arg2, complain);
> > + }
> >   }
> > +
> > if (arg3)
> >   {
> > conv = cand->convs[2];
> > -   if (conv->kind == ck_ref_bind)
> > - conv = next_conversion (conv);
> > -   convert_like (conv, arg3, complain);
> > +   if (conv->user_conv_p)
> > + {
> > +   while (conv->kind != ck_user)
> > + conv = next_conversion (conv);
> > +   arg3 = convert_like (conv, arg3, complain);
> > + }
> >   }
> > -
> >   }
> >  }
> >
> > @@ -6241,7 +6240,7 @@ build_new_op_1 (const op_location_t &loc, enum
> > tree_code code, int flags,
> >  case REALPART_EXPR:
> >  case IMAGPART_EXPR:
> >  case ABS_EXPR:
> > -  return cp_build_unary_op (code, arg1, candidates != 0, complain);
> > +  return cp_build_unary_op (code, arg1, false, complain);
> >

C++ PATCH for c++/91819 - ICE with operator++ and enum.

2019-09-19 Thread Marek Polacek

This is an ICE that started with the recent r275745.  The problem here is that
for a POSTINCREMENT_EXPR build_new_op_1 is called with null arg2, so arg2_type 
is
also null after
 5819   tree arg2_type = arg2 ? unlowered_expr_type (arg2) : NULL_TREE;
but then we make arg2 nonnull
 5887   if (code == POSTINCREMENT_EXPR || code == POSTDECREMENT_EXPR)
 5888 arg2 = integer_zero_node;
while arg2_type is still null and so
 5940   else if (! arg2 || ! CLASS_TYPE_P (arg2_type))
crashes.  Fixed by setting arg2_type in the ++/-- case.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-09-19  Marek Polacek  

PR c++/91819 - ICE with operator++ and enum.
* call.c (build_new_op_1): Set arg2_type.

* g++.dg/other/operator4.C: New test.

diff --git gcc/cp/call.c gcc/cp/call.c
index b780b0af58e..512421b4772 100644
--- gcc/cp/call.c
+++ gcc/cp/call.c
@@ -5878,7 +5878,10 @@ build_new_op_1 (const op_location_t &loc, enum tree_code 
code, int flags,
 goto builtin;
 
   if (code == POSTINCREMENT_EXPR || code == POSTDECREMENT_EXPR)
-arg2 = integer_zero_node;
+{
+  arg2 = integer_zero_node;
+  arg2_type = integer_type_node;
+}
 
   vec_alloc (arglist, 3);
   arglist->quick_push (arg1);
diff --git gcc/testsuite/g++.dg/other/operator4.C 
gcc/testsuite/g++.dg/other/operator4.C
new file mode 100644
index 000..e7a41c01a58
--- /dev/null
+++ gcc/testsuite/g++.dg/other/operator4.C
@@ -0,0 +1,22 @@
+// PR c++/91819 - ICE with operator++ and enum.
+// { dg-do compile }
+
+enum Foo
+{
+  a,
+  b
+};
+
+inline Foo operator++(Foo &f, int) 
+{
+  return f = (Foo)(f + 1);
+}
+
+int main()
+{
+  int count = 0;
+  for (Foo f = a; f <= b; f++) {
+count++;
+  }
+  return count;
+}

[C++ PATCH] Handle [[likely]] on compound-statement.

2019-09-19 Thread Jason Merrill

I overlooked this case when adding [[likely]] handling to
cp_parser_statement.

Tested x86_64-pc-linux-gnu, applying to trunk.

* parser.c (cp_parser_statement): Handle [[likely]] on
compound-statement.
---
 gcc/cp/parser.c   | 5 -
 gcc/testsuite/g++.dg/cpp2a/attr-likely5.C | 9 +
 gcc/cp/ChangeLog  | 5 +
 3 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/attr-likely5.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 165039ef07c..da0ffacc218 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -11297,7 +11297,10 @@ cp_parser_statement (cp_parser* parser, tree 
in_statement_expr,
 }
   /* Anything that starts with a `{' must be a compound-statement.  */
   else if (token->type == CPP_OPEN_BRACE)
-statement = cp_parser_compound_statement (parser, NULL, BCS_NORMAL, false);
+{
+  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
+  statement = cp_parser_compound_statement (parser, NULL, BCS_NORMAL, 
false);
+}
   /* CPP_PRAGMA is a #pragma inside a function body, which constitutes
  a statement all its own.  */
   else if (token->type == CPP_PRAGMA)
diff --git a/gcc/testsuite/g++.dg/cpp2a/attr-likely5.C 
b/gcc/testsuite/g++.dg/cpp2a/attr-likely5.C
new file mode 100644
index 000..166214835d8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/attr-likely5.C
@@ -0,0 +1,9 @@
+// { dg-do compile { target c++11 } }
+
+void f(int i)
+{
+  if (i) [[likely]]
+{
+  ++i;
+}
+}
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index de1677f6142..3015d6806d0 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,8 @@
+2019-09-17  Jason Merrill  
+
+   * parser.c (cp_parser_statement): Handle [[likely]] on
+   compound-statement.
+
 2019-09-19  Jason Merrill  
 
Revert:

base-commit: aae9c42b1657673a0df0829380edc7c6b7e486b1
-- 
2.21.0

Re: Monotonically increasing counter (was Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git)

2019-09-19 Thread Jason Merrill

On Wed, Aug 14, 2019 at 2:14 PM Jason Merrill  wrote:
> On Mon, Aug 5, 2019 at 2:22 PM Jason Merrill  wrote:
> > On 8/5/19 11:34 AM, Jakub Jelinek wrote:
> > > On Mon, Aug 05, 2019 at 11:20:09AM -0400, Jason Merrill wrote:
> > >> I agree.  But for those who want a monotonically increasing
> > >> identifier, there's already one in git: CommitDate.  In the discussion
> > >> of this issue four years ago,
> > >
> > > While commit date is monotonically increasing, it has the problem that at
> > > certain dates there are very few commits, at others many.  When doing
> > > bisection by hand, one does the revision computation (min+max)/2 in head
> > > (it doesn't have to be precise of course, just roughly, and isn't perfect
> > > either, because in svn all of trunk and branches contribute to the 
> > > revision
> > > numbers), with dates it would be division into further uneven chunks.
> >
> > That's true, but is it a major problem?  If you have multiple commits on
> > one day, you (can) have multiple binaries with the same date and
> > different times, and you can adjust your heuristic for choosing the next
> > bisection point accordingly.  Over longer periods, the number of commits
> > per day averages out.
> >
> > > Could we tag the branchpoints, say when we branch off gcc 10, we tag the
> > > branchpoint as tags/r11 and then we could use r11-123 as 123th commit on 
> > > the
> > > trunk since the branchpoint, and let bugzilla and web redirection handle
> > > those rNN- style identifiers?
> > > git describe --all --match 'r[0-9]*' ... | sed ...
> > > to map hashes etc. to these rNN- identifiers and something to map them
> > > back to hashes say for git web?
> >
> > Well, having such tags would allow git describe to produce identifiers
> > that you might find more readable.  For instance, if I do
> >
> > git tag -a -m 'GCC 9 branchpoint' b9 $(git merge-base trunk gcc-9-branch)
>
> Though I guess what you were suggesting is slightly different: this
> will put the tag in the history of both trunk and branch, and it would
> be better for "r11" to be only in the history of GCC 11.  So probably
> best to tag the commit that bumps BASE-VER instead, i.e.
>
> $ git tag -a -m 'GCC 10 stage 1 open' gcc10
> 70f448fa5347ba24e0916201dd8549bc16783ff0
> $ git tag -a -m 'GCC 9 stage 1 open' gcc9
> 949bc65ce4d0d7dd036ccfb279bffe63d02feee6
> $ git tag -a -m 'GCC 8 stage 1 open' gcc8
> 498621e8159c1f494a9b8a5f2c3e5225c74ed242
> ...
> $ git describe trunk
> gcc10-2527-gac18cc031cd
> $ git describe gcc-9-branch
> gcc9-7633-g28a024c36af
>
> Does this sound good to you?  Anyone have thoughts about naming for the tags?
>
> Since alphabetical sorting won't do well with gcc9 and gcc10, you may
> want to use the beginning of time tag for naming your binaries.  Also
> because the stage 1 boundary isn't that interesting for bisection.
>
> > git tag -a -m'Beginning of Time' r1 3cf0d8938a953ef13e57239613d42686f152b4fe
> > git describe --match r1 trunk
> >
> > r1-170718-gdb868bacf6a
> >
> > These tags don't need to be shared, this works fine locally.
> >
> > Note that when you feed such an identifier to other git commands, they
> > ignore the first two parts and just use the hash.
> >
> > This might be a good alternative to dates for you, and people could
> > refer to them interchangeably with raw hashes in the web interface.

I suppose we also need to decide what form we want to use for the
equivalent of gcc.gnu.org/rN.  I figure it needs to be some prefix
followed by a "commit-ish" (hash, tag, etc.).  Perhaps "g:" as the
prefix, so

gcc.gnu.org/g:HEAD
gcc.gnu.org/g:gcc-9-branch
gcc.gnu.org/g:3cf0d8938a953e

?

Jason

Re: [PATCH] Fix PR91790

2019-09-19 Thread Bill Schmidt




On 9/19/19 1:34 PM, Segher Boessenkool wrote:

Hi!

On Tue, Sep 17, 2019 at 09:45:54AM +0200, Richard Biener wrote:

The following fixes an old vectorizer issue with realignment support
(thus only powerpc is affected) and BB vectorization.  The realignment
token is set up from the wrong data-ref which causes an SSA verification
failure but in other circumstances might simply generate wrong code.

Bootstrap running on x86_64-unknown-linux-gnu, I'll install this
as obvious on trunk.

PPC folks - you know best how to appropriately test a target
where we use the re-alignment optimization.  IIRC on later
powerpc hardware this isn't exercised at all since we can use
unaligned accesses.

The issue is at least present on the GCC 9 branch as well but I'd
appreciate testing where it exercises the path before considering
a backport.

Is there a testcase?



Richard, can you turn the PR's reported test into a torture test case?  
We post P7 big-endian results frequently to gcc-testresults, and this 
bug hasn't fired on anything there, so it's not covered by existing 
tests.  Nothing has turned up on the testers since your patch went in, 
so having the new test added should be sufficient, I'd think.  P7 or 
older running big-endian is what's needed to test realignment support.


Thanks,

Bill



You can use -malign-natural to get stricter alignment requirements,
that might help.

Cc:ing Bill, this is vectorizer :-)


Segher

Re: [patch, testsuite, arm] Fix ICE in gcc.dg/gimplefe-28.c

2019-09-19 Thread Sandra Loosemore


On 9/19/19 2:40 AM, Kyrill Tkachov wrote:


Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp    (revision 275699)
+++ gcc/testsuite/lib/target-supports.exp    (working copy)
@@ -6670,6 +6670,9 @@ proc add_options_for_sqrt_insn { flags }
  if { [istarget amdgcn*-*-*] } {
  return "$flags -ffast-math"
  }
+    if { [istarget arm*-*-*] } {
+    return "$flags -mfpu=vfp -mfloat-abi=softfp"
+    }
  return $flags
  }

I tend to think it's a bit cleaner to use one of the add_options_for_* 
helpers we have that know the right float-abi and mfpu options to pass.

Would using add_options_for_arm_fp or add_options_for_arm_vfp3 work here?


add_options_for_arm_fp only adds a -mfloat-abi option which isn't 
sufficient by itself to enable floating-point on the default multilib 
for arm-none-eabi.  It needs -mfpu= too.


My understanding is that all VFP fpus have support for sqrt so why 
require vfp3?


I could put the magic options in a new function called 
add_options_for_arm_vfp instead of directly in 
add_options_for_sqrt_insn, and fix it up to cope with -mfloat-abi=hard 
multilibs too.  Would that be an improvement?


-Sandra

Re: [PATCH] Fix PR91790

2019-09-19 Thread Segher Boessenkool

Hi!

On Tue, Sep 17, 2019 at 09:45:54AM +0200, Richard Biener wrote:
> The following fixes an old vectorizer issue with realignment support
> (thus only powerpc is affected) and BB vectorization.  The realignment
> token is set up from the wrong data-ref which causes an SSA verification
> failure but in other circumstances might simply generate wrong code.
> 
> Bootstrap running on x86_64-unknown-linux-gnu, I'll install this
> as obvious on trunk.
> 
> PPC folks - you know best how to appropriately test a target
> where we use the re-alignment optimization.  IIRC on later
> powerpc hardware this isn't exercised at all since we can use
> unaligned accesses.
> 
> The issue is at least present on the GCC 9 branch as well but I'd
> appreciate testing where it exercises the path before considering
> a backport.

Is there a testcase?

You can use -malign-natural to get stricter alignment requirements,
that might help.

Cc:ing Bill, this is vectorizer :-)


Segher

Re: [SVE] PR91532

2019-09-19 Thread Richard Biener

On Thu, 19 Sep 2019, Prathamesh Kulkarni wrote:

> Hi,
> For PR91532, the dead store is trivially deleted if we place dse pass
> between ifcvt and vect. Would it be OK to add another instance of dse there ?
> Or should we add an ad-hoc "basic-block dse" sub-pass to ifcvt that
> will clean up the dead store ?

No, the issue is the same as PR33315 and exists on the non-vectorized
code as well.

Richard.

Re: [PATCH][ARM] Add logical DImode expanders

2019-09-19 Thread Wilco Dijkstra

Hi Richard,

> Please reformat this as one mapping per line.  Over time I expect this 
> is only going to grow.

Sure, I've committed it reformatted as r275970.

Wilco

Re: [PATCH][ARM] Remove support for MULS

2019-09-19 Thread Wilco Dijkstra

Hi Richard, Kyrill,

>> I disagree. If they still trigger and generate better code than without 
>> we should keep them.
> 
>> What kind of code is *common* varies greatly from user to user.

Not really - doing a multiply and checking whether the result is zero is
exceedingly rare. I found only 3 cases out of 7300 mul/mla in all of
SPEC2006... Overall codesize effect with -Os: 28 bytes or 0.00045%.

So we really should not even consider wasting any more time on
maintaining such useless patterns.

> Also, the main reason for restricting their use was that in the 'olden 
> days', when we had multi-cycle implementations of the multiply 
> instructions with short-circuit fast termination when the result was 
> completed, the flag setting variants would never short-circuit.

That only applied to conditional multiplies IIRC, some implementations
would not early-terminate if the condition failed. Today there are serious
penalties for conditional multiplies - but that's something to address in a
different patch.

> These days we have fixed cycle counts for multiply instructions, so this 
> is no-longer a penalty.  

No, there is a large overhead on modern cores when you set the flags,
and there are other penalties due to the extra micro-ops.

> In the thumb2 case in particular we can often 
> reduce mul-cmp (6 bytes) to muls (2 bytes), that's a 66% saving on this 
> sequence and definitely worth exploiting when we can, even if it's not 
> all that common.

Using muls+cbz is equally small. With my patch we generate this with -Os:

void g(void);
int f(int x)
{ 
  if (x * x != 0)
g();
}

f:
mulsr0, r0, r0
push{r3, lr}
cbz r0, .L9
bl  g
.L9:
pop {r3, pc}

Wilco

Re: [Ada] Fix 32/64bit mistake on SYSTEM_INFO component in s-win32

2019-09-19 Thread Olivier Hainque

Hello Iain & Rainer,

Thanks for the heads up and sorry for the disruption from
this one.

> On 19 Sep 2019, at 17:02, Iain Sandoe  wrote:
> 
>> Given that the test cannot compile on anything but *-*-linux* and
>> *-*-mingw*, I'd rather restrict the test to those two (or more targets
>> that decide to implement the missing interface).
> 
> that makes sense, too (targets with support can opt in)

Works for me as well. The point of the test is really to check
a 32/64 discrepancy initially observed on Windows. It is not meant
as a general representative test of the System.Task_Info feature.

If you get it to work on Darwin or Solaris and wish to opt-in
afterwards, this is also fine of course :)

We can take care of adding the required filter.

Cheers,

Olivier

Re: [PATCH, nvptx] Expand OpenACC child function arguments to use CUDA params space

2019-09-19 Thread Thomas Schwinge

Hi Chung-Lin!

On 2019-09-10T19:41:59+0800, Chung-Lin Tang  wrote:
> this is a completely new implementation of an earlier optimization
> that Cesar submitted:
> https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01202.html

Thanks for your re-work!

> The objective is to transform the original single-record-pointer argument
> form (OpenMP/pthreads originated) to multiple scalar parameters, that
> the CUDA runtime will place directly in the .params space for GPU kernels:
>
> #pragma acc parallel copy(a, b) copyin(c)
> {
>a += b;
>b -= c;
> }
>
> compiles to GIMPLE as:
>
> __attribute__((oacc function (1, 1, 32), omp target entrypoint))
> main._omp_fn.0 (const struct .omp_data_t.8 & restrict .omp_data_i)
> {
>...
>_3 = .omp_data_i_2(D)->a;
>_4 = *_3;
>_5 = .omp_data_i_2(D)->b;
>_6 = *_5;
>...
>
> this patch adds pass to transform into:
>
> __attribute__((oacc function (1, 1, 32), omp target entrypoint))
> main._omp_fn.0 (int * c, int * b, int * a)
> {
>...
>_3 = a;
>_4 = *_3;
>_5 = b;
>_6 = *_5;
>...

ACK.

> Cesar's original implementation tried to do this in the middle-end,
> which required lots of changes throughout the compiler, libgomp interface,
> etc. and required a dependency on libffi for the CPU-host fallback child
> function (since there is no longer a known, fixed single-pointer argument
> interface to all child functions)

Specifically, the major problem -- per my understanding -- is that
Cesar's implementation does this in the early stages of the middle end
('pass_lower_omp'), before the target vs. offload target code paths get
separated, and so the transformation was done for target ("host
fallback") as well as all offload targets, without each of them having
the option to opt in/out.

As can be seen from the new highly localized code changes (nvptx code
only), your re-work clearly fixes that aspect!  :-)

> This new implementation works by modifying the GIMPLE for child functions
> directly at the very start (before, actually) of RTL expansion

That's now near the other end of the pipeline.  ;-) What's the motivation
for putting it there, instead of early in the nvptx offloading
compilation (around 'pass_oacc_device_lower' etc. time, where I would've
assumed this transformation to be done)?  Not asking you to change that
now, but curious for the reason.

> and thus
> is placed in TARGET_EXPAND_TO_RTL_HOOK, as the core issue is we inherently
> need something different generated between the host-fallback vs for the GPU.

(Likewise, different per each offload target.)

> The new nvptx_expand_to_rtl_hook modifies the function decl type and
> arguments, and scans the gimple body to remove occurrences of .omp_data_i.*
> Detection of OpenACC child functions is done through "omp target entrypoint"
> and "oacc function" attributes. Because OpenMP target child functions
> have a more elaborate wrapper generated for them, this pass only supports
> OpenACC right now.

At the Cauldron, the question indeed has been raised (Jakub, Tom) why not
enabled for OpenMP, too.  My answer was that this surely can be done, but
the change as presented here already is an improvement over the current
status ("stands on its own", as Jeff Law would call it), so I'm fine with
you handling OpenACC first, and then OpenMP can follow later (at some as
of yet indeterminite point in time, even).

> libgomp has tested with this patch x86_64-linux (nvptx-none accelerator)
> without regressions

Can you present performance numbers, too?

> (I'm currently undergoing more gcc tests as well).

As these changes, being confined to nvptx code only, can't possibly have
any effect on other target testing, I assume that's nvptx target testing
you're talking about?  (..., where also I'm not expecting any
disturbance.)

> Is this okay for trunk?

I'm not the one to approve these code changes, but I do have a few
comments/questions:

> --- gcc/config/nvptx/nvptx.c  (revision 275493)
> +++ gcc/config/nvptx/nvptx.c  (working copy)

> +static void
> +nvptx_expand_to_rtl_hook (void)
> +{
> +  /* For utilizing CUDA .param kernel arguments, we detect and modify
> + the gimple of offloaded child functions, here before RTL expansion,
> + starting with standard OMP form:
> +  foo._omp_fn.0 (const struct .omp_data_t.8 & restrict .omp_data_i) { 
> ... }
> +   
> + and transform it into a style where the OMP data record fields are
> + "exploded" into individual scalar arguments:
> +  foo._omp_fn.0 (int * a, int * b, int * c) { ... }
> +
> + Note that there are implicit assumptions of how OMP lowering (and/or 
> other
> + intervening passes) behaves contained in this transformation code;
> + if those passes change in their output, this code may possibly need
> + updating.  */
> +
> +  if (lookup_attribute ("omp target entrypoint",
> + DECL_ATTRIBUTES (current_function_decl))
> +  /* The rather indirect manner in which OpenMP target functions

Re: [PATCH 3/4] New IPA-SRA implementation

2019-09-19 Thread Martin Jambor

Hi,

On Fri, Sep 13 2019, Jan Hubicka wrote:
>> This patch actually adds the analysis bits of IPA-SRA - both the
>> function summary generation and the interprocedural analysis and
>> decision stage.  The transformation itself then happens in the call
>> graph cloning infrastructure changes which are in the previous patch.
>> Please see the cover letter of the whole patch-set for more
>> information.
>> 
>> This is mostly only a rebase on the current trunk of the earlier
>> submission, the only functional change is that the pass does not clone
>> when all the work (unused parameter removal) has already been done by
>> IPA-CP.
>> 
>> Martin
>> 
>> 2019-08-20  Martin Jambor  
>> 
>> * Makefile.in (GTFILES): Added ipa-sra.c.
>> (OBJS): Added ipa-sra.o.
>> * dbgcnt.def: Added ipa_sra_params and ipa_sra_retvalues.
>> * doc/invoke.texi (ipa-sra-max-replacements): New.
>> * ipa-sra.c: New file.
>> * lto-section-in.c (lto_section_name): Added ipa-sra section.
>> * lto-streamer.h (lto_section_type): Likewise.
>> * params.def (PARAM_IPA_SRA_MAX_REPLACEMENTS): New.
>> * passes.def: Add new IPA-SRA.
>> * tree-pass.h (make_pass_ipa_sra): Declare.
> OK

Thanks!

>> +#define IPA_SRA_MAX_PARAM_FLOW_LEN 7
> I would move this to the place ISRA_ARG_SIZE_LIMIT_BITS is defined so
> they appaer at same place.  Perhaps C++ way would be to use constant
> member variable?

OK

...

>> +
>> +/* Quick mapping from a decl to its param descriptor.  */
>> +
>> +static hash_map *decl2desc;
>> +
>> +/* Countdown of allowe Alias analysis steps during summary building.  */
>> +
>> +static int aa_walking_limit;
>> +
>> +/* This is a table in which for each basic block and parameter there is a
>> +   distance (offset + size) in that parameter which is dereferenced and
>> +   accessed in that BB.  */
>> +static HOST_WIDE_INT *bb_dereferences = NULL;
>> +/* How many by-reference parameters there are in the current function.  */
>> +static int by_ref_count;
>> +
>> +/* Bitmap of BBs that can cause the function to "stop" progressing by
>> +   returning, throwing externally, looping infinitely or calling a function
>> +   which might abort etc.. */
>> +static bitmap final_bbs;
>> +
>> +/* Obstack to allocate various small structures required only when 
>> generating
>> +   summary for a function.  */
>> +static struct obstack gensum_obstack;
>
> Perhaps place the static vars at the beginig of the namespace.
> I think "static" should not be used when declaring vars in a namespace.


OK, I moved them to the beginning of the namespace and removed the
static modifier from them.



>> +/* Scan body function described by NODE and FUN and create access trees for
>> +   parameters.  */
>> +
>> +static void
>> +scan_function (cgraph_node *node, struct function *fun)
>> +{
>> +  basic_block bb;
>> +
>> +  FOR_EACH_BB_FN (bb, fun)
>> +{
>> +  gimple_stmt_iterator gsi;
>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> +{
>> +  gimple *stmt = gsi_stmt (gsi);
>> +
>> +  if (stmt_can_throw_external (fun, stmt))
>> +bitmap_set_bit (final_bbs, bb->index);
>> +  switch (gimple_code (stmt))
>> +{
>> +case GIMPLE_RETURN:
>> +  {
>> +tree t = gimple_return_retval (as_a  (stmt));
>> +if (t != NULL_TREE)
>> +  scan_expr_access (t, stmt, ISRA_CTX_LOAD, bb);
>> +bitmap_set_bit (final_bbs, bb->index);
>> +  }
>> +  break;
>> +
>> +case GIMPLE_ASSIGN:
>> +  if (gimple_assign_single_p (stmt)
>> +  && !gimple_clobber_p (stmt))
>> +{
>> +  tree rhs = gimple_assign_rhs1 (stmt);
>> +  scan_expr_access (rhs, stmt, ISRA_CTX_LOAD, bb);
>> +  tree lhs = gimple_assign_lhs (stmt);
>> +  scan_expr_access (lhs, stmt, ISRA_CTX_STORE, bb);
>> +}
>> +  break;
>> +
>> +case GIMPLE_CALL:
>> +  {
>> +unsigned argument_count = gimple_call_num_args (stmt);
>> +scan_call_info call_info;
>> +call_info.cs = node->get_edge (stmt);
>> +call_info.argument_count = argument_count;
>> +
>> +for (unsigned i = 0; i < argument_count; i++)
>> +  {
>> +call_info.arg_idx = i;
>> +scan_expr_access (gimple_call_arg (stmt, i), stmt,
>> +  ISRA_CTX_ARG, bb, &call_info);
>> +  }
>> +
>> +tree lhs = gimple_call_lhs (stmt);
>> +if (lhs)
>> +  scan_expr_access (lhs, stmt, ISRA_CTX_STORE, bb);
>> +int flags = gimple_call_flags (stmt);
>> +if ((flags & (ECF_CONST | ECF_PURE)) == 0)
>> +  bitmap_set_bit (final_bbs, bb->index);
>> +  }
>> +  break;
>> +
>> +case GIMPLE_ASM:
>> +  {
>> +gasm *asm_stmt = as_a  (stmt);

[SVE] PR91532

2019-09-19 Thread Prathamesh Kulkarni

Hi,
For PR91532, the dead store is trivially deleted if we place dse pass
between ifcvt and vect. Would it be OK to add another instance of dse there ?
Or should we add an ad-hoc "basic-block dse" sub-pass to ifcvt that
will clean up the dead store ?

Thanks,
Prathamesh

Re: [PATCH][ARM] Add logical DImode expanders

2019-09-19 Thread Richard Earnshaw (lists)


On 19/09/2019 15:26, Wilco Dijkstra wrote:

Hi Richard,


except we can do better than this...
(see below).  With that change, this just becomes di3

+(define_code_attr logical_op [(ior "ior") (xor "xor") (and "and")])


These should just be added to the existing 'optab' attribute, there's no
need for an additional attribute.



+(define_code_attr logical_OP [(ior "IOR") (xor "XOR") (and "AND")])


You don't need this, just use  where you want this substitution.


Yes we can avoid the new code attributes indeed and add to the existing
optab one. Here is what I did:

[PATCH][ARM] Simplify logical DImode iterators

Further simplify the logical DImode expander using code iterator and
obtab attributes.  This avoids adding unnecessary code_attr entries.

ChangeLog:
2019-09-19  Wilco Dijkstra  

* config/arm/arm.md (di3): Use  and .
* config/arm/iterators.md (optab): Add and, ior, xor entries.
(logical_op): Remove code attribute.
(logical_OP): Likewise.


OK, but...


--

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
d54082b13dd702aa55a1465c0fbfa87baa89149a..d607f88cb05ffa9cd8a47b8c8e0c53ea3a5ca411
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2039,16 +2039,16 @@ (define_expand "divdf3"
  ; operands or complex immediates, which leads to fewer LDRD/STRD instructions.
  ; So an explicit expander is needed to generate better code.
  
-(define_expand "di3"

+(define_expand "di3"
[(set (match_operand:DI   0 "s_register_operand")
(LOGICAL:DI (match_operand:DI 1 "s_register_operand")
-   (match_operand:DI 2 "arm_di_operand")))]
+   (match_operand:DI 2 "arm_di_operand")))]
"TARGET_32BIT"
{
-  rtx low  = simplify_gen_binary (, SImode,
+  rtx low  = simplify_gen_binary (, SImode,
  gen_lowpart (SImode, operands[1]),
  gen_lowpart (SImode, operands[2]));
-  rtx high = simplify_gen_binary (, SImode,
+  rtx high = simplify_gen_binary (, SImode,
  gen_highpart (SImode, operands[1]),
  gen_highpart_mode (SImode, DImode,
 operands[2]));
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 
5e3299e850813db2f3c0a25a6cde779d1d0d1d55..98ded4b22b2885e77e0ea0f1ce73ed12845115d3
 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -287,9 +287,6 @@ (define_code_attr cmp_type [(eq "i") (gt "s") (ge "s") (lt "s") (le 
"s")])
  
  (define_code_attr vfml_op [(plus "a") (minus "s")])
  
-(define_code_attr logical_op [(ior "ior") (xor "xor") (and "and")])

-(define_code_attr logical_OP [(ior "IOR") (xor "XOR") (and "AND")])
-
  ;;
  ;; Int iterators
  ;;
@@ -797,7 +794,7 @@ (define_code_attr VQH_sign [(plus "i") (smin "s") (smax "s") (umin 
"u")
  (umax "u")])
  
  (define_code_attr cnb [(ltu "CC_C") (geu "CC")])

-(define_code_attr optab [(ltu "ltu") (geu "geu")])
+(define_code_attr optab [(ltu "ltu") (geu "geu") (and "and") (ior "ior") (xor 
"xor")])


Please reformat this as one mapping per line.  Over time I expect this 
is only going to grow.


R.

  
  ;; Assembler mnemonics for signedness of widening operations.

  (define_code_attr US [(sign_extend "s") (zero_extend "u")])

Re: [PATCH, Fortran] Optionally suppress no-automatic overwrites recursive warning - for review

2019-09-19 Thread Tobias Burnus


Hi Mark,

On 9/19/19 3:40 PM, Mark Eggleston wrote:
The following warning is produced when -fno-automatic and -frecursive 
are used at the same time:


f951: Warning: Flag '-fno-automatic' overwrites '-frecursive'

This patch allows the warning to be switched off using a new option, 
-Woverwrite-recursive, initialised to on.


I don't have a test case for this as I don't know how to test for a 
warning that isn't related to a line of code.


Try:

! { dg-warning "Flag .-fno-automatic. overwrites .-frecursive." "" { 
target *-*-* } 0 }


The syntax is { dg-warning "message", "label" {target ...} linenumber }, 
where linenumber = 0 means it can be on any line.


If the output doesn't match (but I think it does with "Warning:"), 
general messages can be caught with "dg-message".



Thanks,

Tobias

Re: [PATCH][x86] Fix PR91814

2019-09-19 Thread Uros Bizjak

On Thu, Sep 19, 2019 at 5:30 PM Richard Biener  wrote:
>
>
> Boostrapped and tested on x86_64-unknown-linux-gnu.
>
> OK?

OK.

Thanks,
Uros.

> Thanks,
> Richard.
>
> 2019-09-19  Richard Biener  
>
> PR target/91814
> * config/i386/i386-features.c (gen_gpr_to_xmm_move_src):
> Force operand to a register if it isn't nonimmediate_operand.
>
> Index: gcc/config/i386/i386-features.c
> ===
> --- gcc/config/i386/i386-features.c (revision 275959)
> +++ gcc/config/i386/i386-features.c (working copy)
> @@ -668,10 +668,13 @@ scalar_chain::emit_conversion_insns (rtx
>  static rtx
>  gen_gpr_to_xmm_move_src (enum machine_mode vmode, rtx gpr)
>  {
> +  if (!nonimmediate_operand (gpr, GET_MODE_INNER (vmode)))
> +gpr = force_reg (GET_MODE_INNER (vmode), gpr);
>switch (GET_MODE_NUNITS (vmode))
>  {
>  case 1:
> -  return gen_rtx_SUBREG (vmode, gpr, 0);
> +  /* We are not using this case currently.  */
> +  gcc_unreachable ();
>  case 2:
>return gen_rtx_VEC_CONCAT (vmode, gpr,
>  CONST0_RTX (GET_MODE_INNER (vmode)));

[PATCH][x86] Fix PR91814

2019-09-19 Thread Richard Biener



Boostrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2019-09-19  Richard Biener  

PR target/91814
* config/i386/i386-features.c (gen_gpr_to_xmm_move_src):
Force operand to a register if it isn't nonimmediate_operand.

Index: gcc/config/i386/i386-features.c
===
--- gcc/config/i386/i386-features.c (revision 275959)
+++ gcc/config/i386/i386-features.c (working copy)
@@ -668,10 +668,13 @@ scalar_chain::emit_conversion_insns (rtx
 static rtx
 gen_gpr_to_xmm_move_src (enum machine_mode vmode, rtx gpr)
 {
+  if (!nonimmediate_operand (gpr, GET_MODE_INNER (vmode)))
+gpr = force_reg (GET_MODE_INNER (vmode), gpr);
   switch (GET_MODE_NUNITS (vmode))
 {
 case 1:
-  return gen_rtx_SUBREG (vmode, gpr, 0);
+  /* We are not using this case currently.  */
+  gcc_unreachable ();
 case 2:
   return gen_rtx_VEC_CONCAT (vmode, gpr,
 CONST0_RTX (GET_MODE_INNER (vmode)));

Re: [ARM/FDPIC v6 13/24] [ARM] FDPIC: Force LSB bit for PC in Cortex-M architecture

2019-09-19 Thread Kyrill Tkachov




On 9/19/19 4:13 PM, Christophe Lyon wrote:
On Tue, 17 Sep 2019 at 14:08, Christophe Lyon  
wrote:

>
> On 17/09/2019 13:38, Wilco Dijkstra wrote:
> > Hi Christophe,
> >
> > Can you explain this in more detail - it doesn't make sense to me 
to force the
> > Thumb bit during unwinding since it should already be correct, 
even on a
> > Thumb-only CPU. Perhaps the kernel code that pushes an incorrect 
address on

> > the stack could be fixed instead?
> >
> >> Without this, when we are unwinding across a signal frame we can jump
> >> to an even address which leads to an exception.
> >>
> >> This is needed in __gnu_persnality_sigframe_fdpic() when 
restoring the

> >> PC from the signal frame since the PC saved by the kernel has the LSB
> >> bit set to zero.
> >
> > Wilco
> > .
> >
>
> Indeed, I've noticed the problem mentioned by Matthew since I 
committed that patch.

>
> I was about to propose a fix, replacing #if (__thumb__) with #if 
(!__ARM_ARCH_ISA_ARM), but you are right: maybe the kernel code should 
be fixed instead.

>
> So far I haven't managed to reproduce a failure in FDPIC mode 
without this patch though...

>
> Thanks and sorry for the breakage.
>

I'm having problems with the board I use for testing, so I propose to
revert that patch until I have a better description of the problem it
fixed.
OK?


Ok by me as long as lives the fdpic toolchain in a usable state (barring 
the potential issue here)


Thanks,

Kyrill




Christophe

> Christophe
>

Re: [ARM/FDPIC v6 13/24] [ARM] FDPIC: Force LSB bit for PC in Cortex-M architecture

2019-09-19 Thread Christophe Lyon

On Tue, 17 Sep 2019 at 14:08, Christophe Lyon  wrote:
>
> On 17/09/2019 13:38, Wilco Dijkstra wrote:
> > Hi Christophe,
> >
> > Can you explain this in more detail - it doesn't make sense to me to force 
> > the
> > Thumb bit during unwinding since it should already be correct, even on a
> > Thumb-only CPU. Perhaps the kernel code that pushes an incorrect address on
> > the stack could be fixed instead?
> >
> >> Without this, when we are unwinding across a signal frame we can jump
> >> to an even address which leads to an exception.
> >>
> >> This is needed in __gnu_persnality_sigframe_fdpic() when restoring the
> >> PC from the signal frame since the PC saved by the kernel has the LSB
> >> bit set to zero.
> >
> > Wilco
> > .
> >
>
> Indeed, I've noticed the problem mentioned by Matthew since I committed that 
> patch.
>
> I was about to propose a fix, replacing #if (__thumb__) with #if 
> (!__ARM_ARCH_ISA_ARM), but you are right: maybe the kernel code should be 
> fixed instead.
>
> So far I haven't managed to reproduce a failure in FDPIC mode without this 
> patch though...
>
> Thanks and sorry for the breakage.
>

I'm having problems with the board I use for testing, so I propose to
revert that patch until I have a better description of the problem it
fixed.
OK?

Christophe

> Christophe
>

Re: [Ada] Fix 32/64bit mistake on SYSTEM_INFO component in s-win32

2019-09-19 Thread Iain Sandoe

Hi Rainer,

> On 19 Sep 2019, at 15:51, Rainer Orth  wrote:
> 

>>> On 18 Sep 2019, at 09:39, Pierre-Marie de Rodat  wrote:
>>> 
>> 
>>> gcc/testsuite/
>>> 
>>> * gnat.dg/system_info1.adb: New testcase.
>> 
>> This new test fails everywhere on Darwin, which doesn’t have an 
>> implementation for
>> System.Task_Info.Number_Of_Processors
>> 
>> Given 
>> "pragma Obsolescent (Task_Info, "use System.Multiprocessors and CPU 
>> aspect”);”
>> 
>> is it worth me trying to implement the Task_Info stuff?
> 
> I'm seeing the same on Solaris (will be every non-Linux/MinGW target).
> I've implemented Number_Of_Processors using
> sysconf(__SC_NPROCESSORS_ONLN), which is also available on Darwin.  Will
> submit the patch tomorrow once testing has finished…

OK. it’s likely that the same (almost) patch will work for Darwin which also 
has that Posix call.

I’ll look at your patch and adapt it for Darwin then,

I don’t *think* we can make:

s-tasinf__posix.ad? 

using that since  I think __SC_NPROCESSORS_ONLN is an optional addition? 
(but ICBW about that)

>> or should I just skip this test on Darwin?
>> 
>> (if the latter, then I would plan to apply the patch below)
> [...]
>> diff --git a/gcc/testsuite/gnat.dg/system_info1.adb 
>> b/gcc/testsuite/gnat.dg/system_info1.adb
>> index 493a18e907..c1523a277f 100644
>> --- a/gcc/testsuite/gnat.dg/system_info1.adb
>> +++ b/gcc/testsuite/gnat.dg/system_info1.adb
>> @@ -1,4 +1,4 @@
>> ---  { dg-do run }
>> +--  { dg-do run { target { ! *-*-darwin* } } }
> 
> Given that the test cannot compile on anything but *-*-linux* and
> *-*-mingw*, I'd rather restrict the test to those two (or more targets
> that decide to implement the missing interface).

that makes sense, too (targets with support can opt in)

cheers
Iain

> 
>   Rainer
> 
> -- 
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Come up with debug counter for store-merging.

2019-09-19 Thread Bernhard Reutner-Fischer

On Wed, 18 Sep 2019 11:01:59 +0200
Richard Biener  wrote:

> On Wed, Sep 18, 2019 at 9:22 AM Martin Liška  wrote:
> >
> > Hi.
> >
> > After I spent quite some time with PR91758, I would like
> > to see a debug counter in store merging for the next time.
> >
> > Ready to be installed?  
> OK.

> @@ -4195,7 +4196,8 @@ imm_store_chain_info::output_merged_stores ()
>bool ret = false;
>FOR_EACH_VEC_ELT (m_merged_store_groups, i, merged_store)
>  {
> -  if (output_merged_store (merged_store))
> +  if (dbg_cnt (store_merging)
> +   && output_merged_store (merged_store))

Doesn't this over-count actual merged stores? You really count the number of 
attempted store merges, not the actual successful stores merged. As i would 
expect to count the latter, i would have expected ..

>   {
> unsigned int j;
> store_immediate_info *store;

dbg_cnt (store_merging) here, i.e. after the guard that determines if
the merge is benefical, no?

thanks,

Re: [Ada] Fix 32/64bit mistake on SYSTEM_INFO component in s-win32

2019-09-19 Thread Rainer Orth

Hi Iain,

>> On 18 Sep 2019, at 09:39, Pierre-Marie de Rodat  wrote:
>> 
>
>> gcc/testsuite/
>> 
>>  * gnat.dg/system_info1.adb: New testcase.
>
> This new test fails everywhere on Darwin, which doesn’t have an 
> implementation for
> System.Task_Info.Number_Of_Processors
>
> Given 
> "pragma Obsolescent (Task_Info, "use System.Multiprocessors and CPU aspect”);”
>
> is it worth me trying to implement the Task_Info stuff?

I'm seeing the same on Solaris (will be every non-Linux/MinGW target).
I've implemented Number_Of_Processors using
sysconf(__SC_NPROCESSORS_ONLN), which is also available on Darwin.  Will
submit the patch tomorrow once testing has finished...

> or should I just skip this test on Darwin?
>
> (if the latter, then I would plan to apply the patch below)
[...]
> diff --git a/gcc/testsuite/gnat.dg/system_info1.adb 
> b/gcc/testsuite/gnat.dg/system_info1.adb
> index 493a18e907..c1523a277f 100644
> --- a/gcc/testsuite/gnat.dg/system_info1.adb
> +++ b/gcc/testsuite/gnat.dg/system_info1.adb
> @@ -1,4 +1,4 @@
> ---  { dg-do run }
> +--  { dg-do run { target { ! *-*-darwin* } } }

Given that the test cannot compile on anything but *-*-linux* and
*-*-mingw*, I'd rather restrict the test to those two (or more targets
that decide to implement the missing interface).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [Ada] Fix 32/64bit mistake on SYSTEM_INFO component in s-win32

2019-09-19 Thread Iain Sandoe

Hi Folks,

> On 18 Sep 2019, at 09:39, Pierre-Marie de Rodat  wrote:
> 

> gcc/testsuite/
> 
>   * gnat.dg/system_info1.adb: New testcase.

This new test fails everywhere on Darwin, which doesn’t have an implementation 
for
System.Task_Info.Number_Of_Processors

Given 
"pragma Obsolescent (Task_Info, "use System.Multiprocessors and CPU aspect”);”

is it worth me trying to implement the Task_Info stuff?
or should I just skip this test on Darwin?

(if the latter, then I would plan to apply the patch below)

cheers
Iain


2019-09-19  Iain Sandoe  

* gnat.dg/system_info1.adb: Skip for Darwin.


diff --git a/gcc/testsuite/gnat.dg/system_info1.adb 
b/gcc/testsuite/gnat.dg/system_info1.adb
index 493a18e907..c1523a277f 100644
--- a/gcc/testsuite/gnat.dg/system_info1.adb
+++ b/gcc/testsuite/gnat.dg/system_info1.adb
@@ -1,4 +1,4 @@
---  { dg-do run }
+--  { dg-do run { target { ! *-*-darwin* } } }
 
 with System.Multiprocessors;
 with System.Task_Info;

Re: [PATCH, AArch64 v4 0/6] LSE atomics out-of-line

2019-09-19 Thread Richard Henderson

On 9/18/19 5:58 AM, Kyrill Tkachov wrote:
> Thanks for this.
> 
> I've bootstrapped and tested this patch series on systems with and without LSE
> support, both with and without patch [6/6], so 4 setups in total.
> 
> It all looks clean for me.
> 
> I'm favour of this series going in (modulo patch 6/6, leaving the option to
> turn it on to the user).
> 
> I've got a couple of small comments on some of the patches that IMO can be
> fixed when committing.

Thanks.  Committed with the requested modifications.


r~

[PATCH V4] Extend IPA-CP to support arithmetically-computed value-passing on by-ref argument (PR ipa/91682)

2019-09-19 Thread Feng Xue OS

Fix a bug on unary/binary operation check.

Feng
---
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 33d52fe5537..f218f1093b8 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1244,23 +1244,23 @@ initialize_node_lattices (struct cgraph_node *node)
   }
 }
 
-/* Return the result of a (possibly arithmetic) pass through jump function
-   JFUNC on the constant value INPUT.  RES_TYPE is the type of the parameter
-   to which the result is passed.  Return NULL_TREE if that cannot be
-   determined or be considered an interprocedural invariant.  */
+/* Return the result of a (possibly arithmetic) operation on the constant
+   value INPUT.  OPERAND is 2nd operand for binary operation.  RES_TYPE is
+   the type of the parameter to which the result is passed.  Return
+   NULL_TREE if that cannot be determined or be considered an
+   interprocedural invariant.  */
 
 static tree
-ipa_get_jf_pass_through_result (struct ipa_jump_func *jfunc, tree input,
-   tree res_type)
+ipa_get_jf_arith_result (enum tree_code opcode, tree input, tree operand,
+tree res_type)
 {
   tree res;
 
-  if (ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
+  if (opcode == NOP_EXPR)
 return input;
   if (!is_gimple_ip_invariant (input))
 return NULL_TREE;
 
-  tree_code opcode = ipa_get_jf_pass_through_operation (jfunc);
   if (!res_type)
 {
   if (TREE_CODE_CLASS (opcode) == tcc_comparison)
@@ -1274,8 +1274,7 @@ ipa_get_jf_pass_through_result (struct ipa_jump_func 
*jfunc, tree input,
   if (TREE_CODE_CLASS (opcode) == tcc_unary)
 res = fold_unary (opcode, res_type, input);
   else
-res = fold_binary (opcode, res_type, input,
-  ipa_get_jf_pass_through_operand (jfunc));
+res = fold_binary (opcode, res_type, input, operand);
 
   if (res && !is_gimple_ip_invariant (res))
 return NULL_TREE;
@@ -1283,6 +1282,21 @@ ipa_get_jf_pass_through_result (struct ipa_jump_func 
*jfunc, tree input,
   return res;
 }
 
+/* Return the result of a (possibly arithmetic) pass through jump function
+   JFUNC on the constant value INPUT.  RES_TYPE is the type of the parameter
+   to which the result is passed.  Return NULL_TREE if that cannot be
+   determined or be considered an interprocedural invariant.  */
+
+static tree
+ipa_get_jf_pass_through_result (struct ipa_jump_func *jfunc, tree input,
+   tree res_type)
+{
+  return ipa_get_jf_arith_result (ipa_get_jf_pass_through_operation (jfunc),
+ input,
+ ipa_get_jf_pass_through_operand (jfunc),
+ res_type);
+}
+
 /* Return the result of an ancestor jump function JFUNC on the constant value
INPUT.  Return NULL_TREE if that cannot be determined.  */
 
@@ -1416,6 +1430,146 @@ ipa_context_from_jfunc (ipa_node_params *info, 
cgraph_edge *cs, int csidx,
   return ctx;
 }
 
+/* See if NODE is a clone with a known aggregate value at a given OFFSET of a
+   parameter with the given INDEX.  */
+
+static tree
+get_clone_agg_value (struct cgraph_node *node, HOST_WIDE_INT offset,
+int index)
+{
+  struct ipa_agg_replacement_value *aggval;
+
+  aggval = ipa_get_agg_replacements_for_node (node);
+  while (aggval)
+{
+  if (aggval->offset == offset
+ && aggval->index == index)
+   return aggval->value;
+  aggval = aggval->next;
+}
+  return NULL_TREE;
+}
+
+/* Determine whether ITEM, jump function for an aggregate part, evaluates to a
+   single known constant value and if so, return it.  Otherwise return NULL.
+   NODE and INFO describes the caller node or the one it is inlined to, and
+   its related info.  */
+
+static tree
+ipa_agg_value_from_node (class ipa_node_params *info,
+struct cgraph_node *node,
+struct ipa_agg_jf_item *item)
+{
+  tree value = NULL_TREE;
+  int src_idx;
+
+  if (item->offset < 0 || item->jftype == IPA_JF_UNKNOWN)
+return NULL_TREE;
+
+  if (item->jftype == IPA_JF_CONST)
+return item->value.constant;
+
+  gcc_checking_assert (item->jftype == IPA_JF_PASS_THROUGH
+  || item->jftype == IPA_JF_LOAD_AGG);
+
+  src_idx = item->value.pass_through.formal_id;
+
+  if (info->ipcp_orig_node)
+{
+  if (item->jftype == IPA_JF_PASS_THROUGH)
+   value = info->known_csts[src_idx];
+  else
+   value = get_clone_agg_value (node, item->value.load_agg.offset,
+src_idx);
+}
+  else if (info->lattices)
+{
+  class ipcp_param_lattices *src_plats
+   = ipa_get_parm_lattices (info, src_idx);
+
+  if (item->jftype == IPA_JF_PASS_THROUGH)
+   {
+ struct ipcp_lattice *lat = &src_plats->itself;
+
+ if (!lat->is_single_const ())
+   return NULL_TREE;
+
+ value = lat->values->value;
+   }
+  else if (src_plats->aggs
+  && !src_plats

[Patch,committed][OG9] Update dg-* in to reduce test failures

2019-09-19 Thread Tobias Burnus


Hi all,

the gfortran.dg/goacc/kernels-decompose-1.f95 change is just "note:" to 
"optimized:". It reduces the fails a bit – but I still see some XPASS 
and FAIL, which I haven't checked.


The other two files have an additional output which looks OK and, hence, 
has been added to the expected result:

"optimized: beginning .parloops. region in OpenACC .kernels. construct".

Cheers,

Tobias

commit e74aaa153d846fe852b6be202258daa46f48a900
Author: Tobias Burnus 
Date:   Thu Sep 19 15:57:08 2019 +0200

Reduce testsuite fails

gcc/testsuite/
2019-09-19  Tobias Burnus  

* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Add
one dg-message for additional -fopt-info-optimized-omp output.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-1.f95: Change 'note' to
'optimized' in dg-message.

diff --git a/gcc/testsuite/ChangeLog.openacc b/gcc/testsuite/ChangeLog.openacc
index 6faedb6866a..5f452e64a93 100644
--- a/gcc/testsuite/ChangeLog.openacc
+++ b/gcc/testsuite/ChangeLog.openacc
@@ -1,3 +1,11 @@
+2019-09-19  Tobias Burnus  
+
+	* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Add
+	one dg-message for additional -fopt-info-optimized-omp output.
+	* gfortran.dg/goacc/classify-kernels.f95: Likewise.
+	* gfortran.dg/goacc/kernels-decompose-1.f95: Change 'note' to
+	'optimized' in dg-message.
+
 2019-09-17  Julian Brown  
 
 	* c-c++-common/goacc/classify-kernels-unparallelized.c: Update expected
diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
index 6e4001b4f9b..6411f481064 100644
--- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
@@ -21,6 +21,7 @@ program main
 
   !$acc kernels copyin (a(0:n-1), b(0:n-1)) copyout (c(0:n-1))
   do i = 0, n - 1 ! { dg-message "optimized: assigned OpenACC seq loop parallelism" }
+  ! { dg-message "optimized: beginning .parloops. region in OpenACC .kernels. construct" "" { target *-*-* } 23 }
  c(i) = a(f (i)) + b(f (i))
   end do
   !$acc end kernels
diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
index a0a5fd93bbc..8ee3e3d54aa 100644
--- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
@@ -17,6 +17,7 @@ program main
 
   !$acc kernels copyin (a(0:n-1), b(0:n-1)) copyout (c(0:n-1))
   do i = 0, n - 1 ! { dg-message "optimized: assigned OpenACC gang loop parallelism" }
+  ! { dg-message "optimized: beginning .parloops. region in OpenACC .kernels. construct" "" { target *-*-* } 19 }
  c(i) = a(i) + b(i)
   end do
   !$acc end kernels
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95
index 8173c3651e1..073b963f50d 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1.f95
@@ -24,7 +24,7 @@ program main
   integer :: a(N), b(N), c(N)
 
   !$acc kernels
-  x = 0 ! { dg-message "note: beginning .gang-single. region in OpenACC .kernels. construct" }
+  x = 0 ! { dg-message "optimized: beginning .gang-single. region in OpenACC .kernels. construct" }
   y = 0
   y_l = x < 10
   z = x
@@ -32,67 +32,67 @@ program main
   ;
   !$acc end kernels
 
-  !$acc kernels ! { dg-message "note: assigned OpenACC gang loop parallelism" }
-  do i = 1, N ! { dg-message "note: beginning .parloops. region in OpenACC .kernels. construct" }
+  !$acc kernels ! { dg-message "optimized: assigned OpenACC gang loop parallelism" }
+  do i = 1, N ! { dg-message "optimized: beginning .parloops. region in OpenACC .kernels. construct" }
  a(i) = 0
   end do
   !$acc end kernels
 
-  !$acc kernels loop ! { dg-message "note: assigned OpenACC gang loop parallelism" }
-  ! { dg-message "note: forwarded loop nest in OpenACC .kernels. construct to .parloops. for analysis" "" { target *-*-* } .-1 }
+  !$acc kernels loop ! { dg-message "optimized: assigned OpenACC gang loop parallelism" }
+  ! { dg-message "optimized: forwarded loop nest in OpenACC .kernels. construct to .parloops. for analysis" "" { target *-*-* } .-1 }
   do i = 1, N
  b(i) = a(N - i + 1)
   end do
 
   !$acc kernels
-  !$acc loop ! { dg-message "note: assigned OpenACC gang loop parallelism" }
-  ! { dg-message "note: forwarded loop nest in OpenACC .kernels. construct to .parloops. for analysis" "" { target *-*-* } .-1 }
+  !$acc loop ! { dg-message "optimized: assigned OpenACC gang loop parallelism" }
+  ! { dg-message "optimized: forwarded loop nest in OpenACC .kernels. construct to .parloops. for analysis" "" { target *-*-* } .-1 }
   do i = 1, N
  b(i) = a(N - i + 1)
   end do
 
-  !$acc loop ! { dg

Re: [PATCH][ARM] Add logical DImode expanders

2019-09-19 Thread Wilco Dijkstra

Hi Richard,

> except we can do better than this...
> (see below).  With that change, this just becomes di3
>> +(define_code_attr logical_op [(ior "ior") (xor "xor") (and "and")])
>
> These should just be added to the existing 'optab' attribute, there's no 
> need for an additional attribute.

>> +(define_code_attr logical_OP [(ior "IOR") (xor "XOR") (and "AND")])
>
> You don't need this, just use  where you want this substitution.

Yes we can avoid the new code attributes indeed and add to the existing
optab one. Here is what I did:

[PATCH][ARM] Simplify logical DImode iterators

Further simplify the logical DImode expander using code iterator and
obtab attributes.  This avoids adding unnecessary code_attr entries.

ChangeLog:
2019-09-19  Wilco Dijkstra  

* config/arm/arm.md (di3): Use  and .
* config/arm/iterators.md (optab): Add and, ior, xor entries.
(logical_op): Remove code attribute.
(logical_OP): Likewise.
--

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
d54082b13dd702aa55a1465c0fbfa87baa89149a..d607f88cb05ffa9cd8a47b8c8e0c53ea3a5ca411
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2039,16 +2039,16 @@ (define_expand "divdf3"
 ; operands or complex immediates, which leads to fewer LDRD/STRD instructions.
 ; So an explicit expander is needed to generate better code.
 
-(define_expand "di3"
+(define_expand "di3"
   [(set (match_operand:DI0 "s_register_operand")
(LOGICAL:DI (match_operand:DI 1 "s_register_operand")
-   (match_operand:DI 2 "arm_di_operand")))]
+   (match_operand:DI 2 "arm_di_operand")))]
   "TARGET_32BIT"
   {
-  rtx low  = simplify_gen_binary (, SImode,
+  rtx low  = simplify_gen_binary (, SImode,
  gen_lowpart (SImode, operands[1]),
  gen_lowpart (SImode, operands[2]));
-  rtx high = simplify_gen_binary (, SImode,
+  rtx high = simplify_gen_binary (, SImode,
  gen_highpart (SImode, operands[1]),
  gen_highpart_mode (SImode, DImode,
 operands[2]));
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 
5e3299e850813db2f3c0a25a6cde779d1d0d1d55..98ded4b22b2885e77e0ea0f1ce73ed12845115d3
 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -287,9 +287,6 @@ (define_code_attr cmp_type [(eq "i") (gt "s") (ge "s") (lt 
"s") (le "s")])
 
 (define_code_attr vfml_op [(plus "a") (minus "s")])
 
-(define_code_attr logical_op [(ior "ior") (xor "xor") (and "and")])
-(define_code_attr logical_OP [(ior "IOR") (xor "XOR") (and "AND")])
-
 ;;
 ;; Int iterators
 ;;
@@ -797,7 +794,7 @@ (define_code_attr VQH_sign [(plus "i") (smin "s") (smax 
"s") (umin "u")
 (umax "u")])
 
 (define_code_attr cnb [(ltu "CC_C") (geu "CC")])
-(define_code_attr optab [(ltu "ltu") (geu "geu")])
+(define_code_attr optab [(ltu "ltu") (geu "geu") (and "and") (ior "ior") (xor 
"xor")])
 
 ;; Assembler mnemonics for signedness of widening operations.
 (define_code_attr US [(sign_extend "s") (zero_extend "u")])

[PATCH] [og9] Add 'ephemeral' parameter to GOMP_OFFLOAD_openacc_async_host2dev

2019-09-19 Thread Julian Brown

This patch fixes a build failure in the NVPTX libgomp plugin after the
EPHEMERAL parameter was added to the GOMP_OFFLOAD_openacc_async_host2dev
plugin entry point.

So far the parameter is ignored, which is no change from the previous
behaviour but which I think may lead to a data race in some (presumably
rare) circumstances -- for reasons I describe in the FIXME comment added
to the function.

I will apply this to the openacc-gcc-9-branch shortly (to unbreak the
build), and investigate further.

Julian

ChangeLog

libgomp/
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_openacc_async_host2dev):
Add EPHEMERAL parameter, and FIXME function comment.
---
 libgomp/ChangeLog.openacc |  5 +
 libgomp/plugin/plugin-nvptx.c | 13 -
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/libgomp/ChangeLog.openacc b/libgomp/ChangeLog.openacc
index 943a9e4a933..7813760e642 100644
--- a/libgomp/ChangeLog.openacc
+++ b/libgomp/ChangeLog.openacc
@@ -1,3 +1,8 @@
+2019-09-19  Julian Brown  
+
+   * plugin/plugin-nvptx.c (GOMP_OFFLOAD_openacc_async_host2dev):
+   Add EPHEMERAL parameter, and FIXME function comment.
+
 2019-09-18  Tobias Burnus  
 
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Fix dg-warning
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 4beb3222e8f..452415e1879 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1868,9 +1868,20 @@ GOMP_OFFLOAD_dev2dev (int ord, void *dst, const void 
*src, size_t n)
   return true;
 }
 
+/* FIXME: It is unknown whether the cuMemcpyHtoDAsync API call caches source
+   data before the asynchronous copy takes place.  Either way there is a data
+   race associated with ignoring the EPHEMERAL parameter here -- either if it
+   is TRUE (because we are copying uncached data that may disappear before the
+   async copy takes place) or if it is FALSE (because the source data may be
+   cached/snapshotted here before it is modified by an earlier async operation,
+   so stale data gets copied to the target).
+   Neither problem has been observed in practice, so far.  */
+
 bool
 GOMP_OFFLOAD_openacc_async_host2dev (int ord, void *dst, const void *src,
-size_t n, struct goacc_asyncqueue *aq)
+size_t n,
+bool ephemeral __attribute__((unused)),
+struct goacc_asyncqueue *aq)
 {
   if (!nvptx_attach_host_thread_to_device (ord)
   || !cuda_memcpy_sanity_check (src, dst, n))
-- 
2.22.0

[PATCH, Fortran] Optionally suppress no-automatic overwrites recursive warning - for review

2019-09-19 Thread Mark Eggleston

The following warning is produced when -fno-automatic and -frecursive 
are used at the same time:


f951: Warning: Flag '-fno-automatic' overwrites '-frecursive'

This is a reasonable warning, however, recursion can be used with the 
-fno-automatic flag where recursion is enabled using -frecusive instead 
of explicitly using the recursive keyword.


If all relevant subroutines local variables are explicitly declared 
automatic then recursion will work and the warning becomes a nuisance 
when -Werror is used in the build environment.


This patch allows the warning to be switched off using a new option, 
-Woverwrite-recursive, initialised to on.


I don't have a test case for this as I don't know how to test for a 
warning that isn't related to a line of code.


It is a very simple patch, can the test case be omitted?

ChangeLog

gcc/fortran

    Mark Eggleston  

    * lang.opt: Add option -Woverwrite-recursive initialised as on.
    * option.c (gfc_post_options): Output warning only if it is enabled.

--
https://www.codethink.co.uk/privacy.html

>From cfdcde8229ebed15304c1adb8085365f8ad5970a Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Tue, 16 Apr 2019 09:09:12 +0100
Subject: [PATCH 09/12] Suppress warning with -Wno-overwrite-recursive

The message "Warning: Flag '-fno-automatic' overwrites '-frecursive'" is
output by default when -fno-automatic and -frecursive are used together.
It warns that recursion may be broken, however if all the relavent variables
in the recursive procedure have automatic attributes the warning is
unnecessary so -Wno-overwrite-recursive can be used to suppress it. This
will allow compilation when warnings are regarded as errors.

Suppress warning with -Wno-overwrite-recursive
---
 gcc/fortran/lang.opt  | 4 
 gcc/fortran/options.c | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 2aba57d00b5..7d9fd3e048c 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -289,6 +289,10 @@ Wopenmp-simd
 Fortran
 ; Documented in C
 
+Woverwrite-recursive
+Fortran Warning Var(warn_overwrite_recursive) Init(1)
+Warn that -fno-automatic may break recursion.
+
 Wpedantic
 Fortran
 ; Documented in common.opt
diff --git a/gcc/fortran/options.c b/gcc/fortran/options.c
index 03e14338578..1721f93f673 100644
--- a/gcc/fortran/options.c
+++ b/gcc/fortran/options.c
@@ -411,7 +411,7 @@ gfc_post_options (const char **pfilename)
   && flag_max_stack_var_size != 0)
 gfc_warning_now (0, "Flag %<-fno-automatic%> overwrites %<-fmax-stack-var-size=%d%>",
 		 flag_max_stack_var_size);
-  else if (!flag_automatic && flag_recursive)
+  else if (!flag_automatic && flag_recursive && warn_overwrite_recursive)
 gfc_warning_now (0, "Flag %<-fno-automatic%> overwrites %<-frecursive%>");
   else if (!flag_automatic && flag_openmp)
 gfc_warning_now (0, "Flag %<-fno-automatic%> overwrites %<-frecursive%> implied by "
-- 
2.11.0

[Ada] Fix bogus visibility error with nested generics and inlining

2019-09-19 Thread Pierre-Marie de Rodat

This prevents the compiler from issuing a bogus error about the
visibility of an operator in an instantiation of a nested generic
package which is itself used as an actual of an instantiation of another
generic package, when the instantiations are done in a unit withed from
the main unit and containing an inlined subprogram, and cross-unit
inlining is enabled.

In most cases, the compiler does not check the visibility of operators
in an instantiation context because this has already been done when the
generic package has been analyzed. However, there are exceptions like
the actuals of an instantiation of a generic child unit which is done
as a compilation unit and the In_Instance predicate has a special check
for these cases.

This check would incorrectly trigger here and needs to be tightened.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* sem_util.adb (In_Instance): Test whether the current unit has
been analyzed instead of being on the scope stack to detect the
case of actuals of an instantiation of a generic child unit done
as a compilation unit.

gcc/testsuite/

* gnat.dg/inline20.adb, gnat.dg/inline20_g.adb,
gnat.dg/inline20_g.ads, gnat.dg/inline20_h.ads,
gnat.dg/inline20_i.ads, gnat.dg/inline20_q-io.ads,
gnat.dg/inline20_q.ads, gnat.dg/inline20_r.ads: New testcase.--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -12380,15 +12380,15 @@ package body Sem_Util is
  if Is_Generic_Instance (S) then
 
 --  A child instance is always compiled in the context of a parent
---  instance. Nevertheless, the actuals are not analyzed in an
+--  instance. Nevertheless, its actuals must not be analyzed in an
 --  instance context. We detect this case by examining the current
 --  compilation unit, which must be a child instance, and checking
---  that it is not currently on the scope stack.
+--  that it has not been analyzed yet.
 
 if Is_Child_Unit (Curr_Unit)
   and then Nkind (Unit (Cunit (Current_Sem_Unit))) =
  N_Package_Instantiation
-  and then not In_Open_Scopes (Curr_Unit)
+  and then Ekind (Curr_Unit) = E_Void
 then
return False;
 else

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline20.adb
@@ -0,0 +1,9 @@
+--  { dg-do compile }
+--  { dg-options "-O -gnatn2" }
+with Inline20_Q.IO;
+with Inline20_R;
+
+procedure Inline20 is
+begin
+   Inline20_R.Log (Inline20_Q.IO.F);
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline20_g.adb
@@ -0,0 +1,18 @@
+with Ada.Streams; use Ada.Streams;
+
+package body Inline20_G is
+
+   package body Nested_G is
+
+  procedure Get (Data : T; Into : out Offset_Type) is
+  begin
+ Into := (T'Descriptor_Size + Data'Size) / Standard'Storage_Unit;
+  end;
+
+  function F return Integer is
+  begin
+ return 0;
+  end;
+   end Nested_G;
+
+end Inline20_G;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline20_g.ads
@@ -0,0 +1,18 @@
+with Ada.Streams;
+
+generic
+package Inline20_G is
+
+   subtype Offset_Type is Ada.Streams.Stream_Element_Offset;
+
+   generic
+  type T is private;
+   package Nested_G is
+
+  procedure Get (Data : T; Into : out Offset_Type);
+
+  function F return Integer with Inline;
+
+   end Nested_G;
+
+end Inline20_G;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline20_h.ads
@@ -0,0 +1,15 @@
+with Inline20_G;
+
+generic
+   with package Msg is new Inline20_G (<>);
+package Inline20_H is
+
+   generic
+  type T is private;
+  with function Image (Data : T) return String;
+   package Nested_H is
+  package My_Nested_G is new Msg.Nested_G (T);
+  function F return Integer renames My_Nested_G.F;
+   end Nested_H;
+
+end Inline20_H;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline20_i.ads
@@ -0,0 +1,19 @@
+with Inline20_R;
+
+generic
+package Inline20_I is
+
+   type Rec is null record;
+
+   generic
+   package Generic_IO is
+
+  function Image (Quote : Rec) return String;
+
+  package My_Nested_H is new Inline20_R.My_H.Nested_H (Rec, Image);
+
+  function F return Integer renames My_Nested_H.F;
+
+   end Generic_IO;
+
+end Inline20_I;
\ No newline at end of file

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline20_q-io.ads
@@ -0,0 +1 @@
+package Inline20_Q.IO is new Inline20_Q.Generic_IO;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline20_q.ads
@@ -0,0 +1,3 @@
+with Inline20_I;
+
+package Inline20_Q is new Inline20_I;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline20_r.ads
@@ -0,0 +1,12 @@
+with Inline20_G;
+with Inline20_H;
+
+package Inline20_R is
+
+   package My_G is new Inline20_G;
+
+   pack

Re: [PATCH] Speed up qsort in IPA ICF

2019-09-19 Thread Martin Liška

On 9/19/19 3:18 PM, Richard Biener wrote:
> On Thu, Sep 19, 2019 at 3:15 PM Alexander Monakov  wrote:
>>
>> On Thu, 19 Sep 2019, Richard Biener wrote:
>>
 Good point, there's a tested patch.
>>>
>>> OK.
>>
>> Hold on, is the new comparator really correct?
>>
>>
>> @@ -3384,20 +3372,11 @@ sort_congruence_classes_by_decl_uid (const void *a, 
>> const void *b)
>>  static int
>>  sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
>>  {
>> -  const congruence_class_group *g1
>> -= *(const congruence_class_group * const *)a;
>> -  const congruence_class_group *g2
>> -= *(const congruence_class_group * const *)b;
>> -
>> -  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
>> -  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
>> -
>> -  if (uid1 < uid2)
>> -return -1;
>> -  else if (uid1 > uid2)
>> -return 1;
>> -  else
>> -return 0;
>> +  const std::pair *g1
>> += *(const std::pair *const *) a;
>> +  const std::pair *g2
>> += *(const std::pair *const *) b;
>> +  return g1->second - g2->second;
>>  }
>>
>>
>> AFAICT the vec holds pairs, not pointers to pairs, so why does the comparator
>> cast to a pointer-to-pointer-to-pair? Shouldn't it read
>>
>>   const std::pair *g1
>> = (const std::pair *) a;
> 
> Indeed.  We need static type correctness for the comparator :/

Ooops, sorry for the mistake. I'm testing the following patch.
I checked in debugger that uids are valid for a simple test-case.

Martin

> 
> Richard.
> 
>> Thanks.
>> Alexander

>From 5f94530ee52b254a1e3cdfc0f4db4222a6b77a14 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 19 Sep 2019 15:27:20 +0200
Subject: [PATCH] Fix cast in sort_congruence_class_groups_by_decl_uid.

gcc/ChangeLog:

2019-09-19  Martin Liska  

	* ipa-icf.c (sort_congruence_class_groups_by_decl_uid):
	Use proper casting.
---
 gcc/ipa-icf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 59b7f8b1b9d..009aeb487dd 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -3373,9 +3373,9 @@ static int
 sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
 {
   const std::pair *g1
-= *(const std::pair *const *) a;
+= (const std::pair *) a;
   const std::pair *g2
-= *(const std::pair *const *) b;
+= (const std::pair *) b;
   return g1->second - g2->second;
 }
 
-- 
2.23.0

[Ada] Use declared type for deciding on SPARK pointer rules

2019-09-19 Thread Pierre-Marie de Rodat

A constant of pointer type is considered as mutable in SPARK, according
to SPARK RM 3.10, but this should be based on the declared type of the
constant instead of its underlying type.

There is no impact on compilation hence no test.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Yannick Moy  

gcc/ada/

* sem_prag.adb (Analyze_Depends_In_Decl_Part): Simplify previous
test.--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -1272,13 +1272,7 @@ package body Sem_Prag is
 if Ekind_In (Item_Id, E_Constant,
   E_Generic_In_Parameter,
   E_In_Parameter)
-
-  --  If Item_Id is of a private type whose completion has not been
-  --  analyzed yet, its Underlying_Type is empty and we handle it
-  --  as a constant.
-
-  and then Present (Underlying_Type (Etype (Item_Id)))
-  and then Is_Access_Type (Underlying_Type (Etype (Item_Id)))
+  and then Is_Access_Type (Etype (Item_Id))
 then
Adjusted_Kind := E_Variable;
 end if;

[Ada] Move SPARK borrow-checker to gnat2why codebase

2019-09-19 Thread Pierre-Marie de Rodat

Unit sem_spark was implementing the borrow-checker for the support of
ownership pointers in SPARK. It has been moved to gnat2why codebase to
facilitate its evolution and allow the more powerful flow analysis to
provide its results for better analysis on pointers.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Yannick Moy  

gcc/ada/

* gcc-interface/Make-lang.in: Remove references to sem_spark.
* sem_spark.adb, sem_spark.ads: Remove unit.

patch.diff.gz
Description: application/gzip

[Ada] Crash on predicate in full view in a generic unit

2019-09-19 Thread Pierre-Marie de Rodat

This patch fixes a compiler abort on a dynamic predicate applied to the
full view of a type in a generic package declaration, when the
expression for the predicate is a conditionql expression that contains
references to components of the full view of the type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Ed Schonberg  

gcc/ada/

* sem_ch13.adb (Check_Aspect_At_End_Of_Declarations): Simplify
handling of expressions in predicates when the context is a
generic unit.

gcc/testsuite/

* gnat.dg/predicate14.adb, gnat.dg/predicate14.ads: New
testcase.--- gcc/ada/sem_ch13.adb
+++ gcc/ada/sem_ch13.adb
@@ -9374,17 +9374,22 @@ package body Sem_Ch13 is
 
   else
  --  In a generic context freeze nodes are not always generated, so
- --  analyze the expression now. If the aspect is for a type, this
- --  makes its potential components accessible.
+ --  analyze the expression now. If the aspect is for a type, we must
+ --  also make its potential components accessible.
 
  if not Analyzed (Freeze_Expr) and then Inside_A_Generic then
 if A_Id = Aspect_Dynamic_Predicate
   or else A_Id = Aspect_Predicate
-  or else A_Id = Aspect_Priority
 then
Push_Type (Ent);
-   Preanalyze_Spec_Expression (Freeze_Expr, T);
+   Preanalyze_Spec_Expression (Freeze_Expr, Standard_Boolean);
+   Pop_Type (Ent);
+
+elsif A_Id = Aspect_Priority then
+   Push_Type (Ent);
+   Preanalyze_Spec_Expression (Freeze_Expr, Any_Integer);
Pop_Type (Ent);
+
 else
Preanalyze (Freeze_Expr);
 end if;
@@ -9395,12 +9400,23 @@ package body Sem_Ch13 is
 
  Set_Parent (End_Decl_Expr, ASN);
 
- --  In a generic context the aspect expressions have not been
- --  preanalyzed, so do it now. There are no conformance checks
- --  to perform in this case.
+ --  In a generic context the original  aspect expressions have not
+ --  been preanalyzed, so do it now. There are no conformance checks
+ --  to perform in this case. As before, we have to make components
+ --  visible for aspects that may reference them.
 
  if No (T) then
-Check_Aspect_At_Freeze_Point (ASN);
+if A_Id = Aspect_Dynamic_Predicate
+  or else A_Id = Aspect_Predicate
+  or else A_Id = Aspect_Priority
+then
+   Push_Type (Ent);
+   Check_Aspect_At_Freeze_Point (ASN);
+   Pop_Type (Ent);
+
+else
+   Check_Aspect_At_Freeze_Point (ASN);
+end if;
 return;
 
  --  The default values attributes may be defined in the private part,

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/predicate14.adb
@@ -0,0 +1,4 @@
+--  { dg-do compile }
+package body Predicate14 is
+procedure Dummy is null;
+end Predicate14;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/predicate14.ads
@@ -0,0 +1,56 @@
+generic
+package Predicate14 with
+  SPARK_Mode
+is
+
+   type Field_Type is (F_Initial, F_Payload, F_Final);
+
+   type State_Type is (S_Valid, S_Invalid);
+
+   type Cursor_Type (State : State_Type := S_Invalid) is private;
+
+   type Cursors_Type is array (Field_Type) of Cursor_Type;
+
+   type Context_Type is private;
+
+   type Result_Type (Field : Field_Type := F_Initial) is
+  record
+ case Field is
+when F_Initial | F_Final =>
+   null;
+when F_Payload =>
+   Value : Integer;
+ end case;
+  end record;
+
+   function Valid_Context (Context : Context_Type) return Boolean;
+
+private
+
+   function Valid_Type (Result : Result_Type) return Boolean is
+ (Result.Field = F_Initial);
+
+   type Cursor_Type (State : State_Type := S_Invalid) is
+  record
+ case State is
+when S_Valid =>
+   Value : Result_Type;
+when S_Invalid =>
+   null;
+ end case;
+  end record
+  with Dynamic_Predicate =>
+  (if State = S_Valid then Valid_Type (Value));
+
+   type Context_Type is
+  record
+ Field : Field_Type := F_Initial;
+ Cursors : Cursors_Type := (others => (State => S_Invalid));
+  end record;
+
+   function Valid_Context (Context : Context_Type) return Boolean is
+ (for all F in Context.Cursors'Range =>
+ (Context.Cursors (F).Value.Field = F));
+
+   procedure Dummy;
+end Predicate14;
\ No newline at end of file

[Ada] Remove duplicated routines for getting homonym number

2019-09-19 Thread Pierre-Marie de Rodat

Routines Homonym_Number and Get_Homonym_Number were exactly the same,
except for minor style differences. Keep the one in Exp_Util; remove the
one in Exp_Dbug. No test attached, because semantics is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Piotr Trojanek  

gcc/ada/

* exp_dbug.ads, exp_dbug.adb (Get_Homonym_Number): Remove.
(Append_Homonym_Number): Use Homonym_Number instead of
Get_Homonym_Number.
* exp_util.ads, exp_util.adb (Homonym_Number): Mirror style of
the removed Get_Homonym_Number routine, i.e. initialize local
objects at declaration and refine the type of result.
* sem_util.adb (Add_Homonym_Suffix): Use Homonym_Number instead
of Get_Homonym_Number.--- gcc/ada/exp_dbug.adb
+++ gcc/ada/exp_dbug.adb
@@ -27,6 +27,7 @@ with Alloc;
 with Atree;use Atree;
 with Debug;use Debug;
 with Einfo;use Einfo;
+with Exp_Util; use Exp_Util;
 with Nlists;   use Nlists;
 with Nmake;use Nmake;
 with Opt;  use Opt;
@@ -224,7 +225,7 @@ package body Exp_Dbug is
 Homonym_Numbers (Homonym_Len) := '_';
  end if;
 
- Add_Nat_To_H (Get_Homonym_Number (E));
+ Add_Nat_To_H (Homonym_Number (E));
   end if;
end Append_Homonym_Number;
 
@@ -1054,26 +1055,6 @@ package body Exp_Dbug is
   end loop;
end Build_Subprogram_Instance_Renamings;
 
-   
-   -- Get_Homonym_Number --
-   
-
-   function Get_Homonym_Number (E : Entity_Id) return Pos is
-  H  : Entity_Id := Homonym (E);
-  Nr : Pos := 1;
-
-   begin
-  while Present (H) loop
- if Scope (H) = Scope (E) then
-Nr := Nr + 1;
- end if;
-
- H := Homonym (H);
-  end loop;
-
-  return Nr;
-   end Get_Homonym_Number;
-

-- Get_Secondary_DT_External_Name --


--- gcc/ada/exp_dbug.ads
+++ gcc/ada/exp_dbug.ads
@@ -460,10 +460,6 @@ package Exp_Dbug is
-- Subprograms for Handling Qualification --

 
-   function Get_Homonym_Number (E : Entity_Id) return Pos;
-   --  Return the homonym number for E, which is its position in the homonym
-   --  chain starting at 1. This is exported for use in GNATprove.
-
procedure Qualify_Entity_Names (N : Node_Id);
--  Given a node N, that represents a block, subprogram body, or package
--  body or spec, or protected or task type, sets a fully qualified name

--- gcc/ada/exp_util.adb
+++ gcc/ada/exp_util.adb
@@ -6795,13 +6795,11 @@ package body Exp_Util is
-- Homonym_Number --

 
-   function Homonym_Number (Subp : Entity_Id) return Nat is
-  Count : Nat;
-  Hom   : Entity_Id;
+   function Homonym_Number (Subp : Entity_Id) return Pos is
+  Hom   : Entity_Id := Homonym (Subp);
+  Count : Pos := 1;
 
begin
-  Count := 1;
-  Hom := Homonym (Subp);
   while Present (Hom) loop
  if Scope (Hom) = Scope (Subp) then
 Count := Count + 1;

--- gcc/ada/exp_util.ads
+++ gcc/ada/exp_util.ads
@@ -734,7 +734,7 @@ package Exp_Util is
--  pragmas at the start of the package declaration contains
--pragma Annotate (GNATprove, External_Axiomatization);
 
-   function Homonym_Number (Subp : Entity_Id) return Nat;
+   function Homonym_Number (Subp : Entity_Id) return Pos;
--  Here subp is the entity for a subprogram. This routine returns the
--  homonym number used to disambiguate overloaded subprograms in the same
--  scope (the number is used as part of constructed names to make sure that

--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -33,7 +33,6 @@ with Elists;   use Elists;
 with Errout;   use Errout;
 with Erroutc;  use Erroutc;
 with Exp_Ch11; use Exp_Ch11;
-with Exp_Dbug; use Exp_Dbug;
 with Exp_Util; use Exp_Util;
 with Fname;use Fname;
 with Freeze;   use Freeze;
@@ -26314,7 +26313,7 @@ package body Sem_Util is
 
  if Has_Homonym (U) then
 declare
-   N : constant Pos := Get_Homonym_Number (U);
+   N : constant Pos := Homonym_Number (U);
S : constant String := N'Img;
 begin
if N > 1 then

[Ada] New routine GNAT.Sockets.Create_Socket_Pair

2019-09-19 Thread Pierre-Marie de Rodat

New routine to create 2 connected sockets. This routine is analog of the
UNIX system call socketpair. On UNIX platforms it is implemented on the
base of socketpair. On other platforms it is implemented by conecting
network sockets over loopback interface.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Dmitriy Anisimkov  

gcc/ada/

* libgnat/g-socket.ads, libgnat/g-socket.adb
(Create_Socket_Pair): New routine.
* libgnat/g-socthi.ads (OS_Has_Socketpair): Boolean constant.
(C_Socketpair): New imported routine.
* libgnat/g-socthi__mingw.ads, libgnat/g-socthi__vxworks.ads
(Default_Socket_Pair_Family): New constant.
(C_Socketpair): New routine.
* libgnat/g-socthi__mingw.adb, libgnat/g-socthi__vxworks.adb
(C_Socketpair): Is separated in anouther file.
* libgnat/g-sthcso.adb (C_Socketpair): Non UNIX implementation.
* libgnat/g-stsifd__sockets.adb: Reuse C_Socketpair.--- gcc/ada/libgnat/g-socket.adb
+++ gcc/ada/libgnat/g-socket.adb
@@ -867,6 +867,34 @@ package body GNAT.Sockets is
   Socket := Socket_Type (Res);
end Create_Socket;
 
+   
+   -- Create_Socket_Pair --
+   
+
+   procedure Create_Socket_Pair
+ (Left   : out Socket_Type;
+  Right  : out Socket_Type;
+  Family : Family_Type := Family_Unspec;
+  Mode   : Mode_Type   := Socket_Stream;
+  Level  : Level_Type  := IP_Protocol_For_IP_Level)
+   is
+  Res  : C.int;
+  Pair : aliased Thin_Common.Fd_Pair;
+
+   begin
+  Res := C_Socketpair
+((if Family = Family_Unspec then Default_Socket_Pair_Family
+  else Families (Family)),
+ Modes (Mode), Levels (Level), Pair'Access);
+
+  if Res = Failure then
+ Raise_Socket_Error (Socket_Errno);
+  end if;
+
+  Left  := Socket_Type (Pair (Pair'First));
+  Right := Socket_Type (Pair (Pair'Last));
+   end Create_Socket_Pair;
+
---
-- Empty --
---

--- gcc/ada/libgnat/g-socket.ads
+++ gcc/ada/libgnat/g-socket.ads
@@ -1104,7 +1104,17 @@ package GNAT.Sockets is
   Family : Family_Type := Family_Inet;
   Mode   : Mode_Type   := Socket_Stream;
   Level  : Level_Type  := IP_Protocol_For_IP_Level);
-   --  Create an endpoint for communication. Raises Socket_Error on error
+   --  Create an endpoint for communication. Raises Socket_Error on error.
+
+   procedure Create_Socket_Pair
+ (Left   : out Socket_Type;
+  Right  : out Socket_Type;
+  Family : Family_Type := Family_Unspec;
+  Mode   : Mode_Type   := Socket_Stream;
+  Level  : Level_Type  := IP_Protocol_For_IP_Level);
+   --  Create two connected sockets. Raises Socket_Error on error.
+   --  If Family is unspecified, it creates Family_Unix sockets on UNIX and
+   --  Family_Inet sockets on non UNIX platforms.
 
procedure Accept_Socket
  (Server  : Socket_Type;

--- gcc/ada/libgnat/g-socthi.ads
+++ gcc/ada/libgnat/g-socthi.ads
@@ -184,6 +184,16 @@ package GNAT.Sockets.Thin is
function C_System
  (Command : System.Address) return C.int;
 
+   Default_Socket_Pair_Family : constant := SOSC.AF_UNIX;
+   --  UNIX has socketpair system call and AF_UNIX family is widely supported
+
+   function C_Socketpair
+ (Domain   : C.int;
+  Typ  : C.int;
+  Protocol : C.int;
+  Fds  : not null access Fd_Pair) return C.int;
+   --  Creates pair of connected sockets
+
---
-- Signalling file descriptors for selector abortion --
---
@@ -249,6 +259,7 @@ private
pragma Import (C, C_Select, "select");
pragma Import (C, C_Setsockopt, "setsockopt");
pragma Import (C, C_Shutdown, "shutdown");
+   pragma Import (C, C_Socketpair, "socketpair");
pragma Import (C, C_System, "system");
 
pragma Import (C, Nonreentrant_Gethostbyname, "gethostbyname");

--- gcc/ada/libgnat/g-socthi__mingw.adb
+++ gcc/ada/libgnat/g-socthi__mingw.adb
@@ -511,6 +511,16 @@ package body GNAT.Sockets.Thin is
   return System.CRTL.ssize_t (Count);
end C_Sendmsg;
 
+   --
+   -- C_Socketpair --
+   --
+
+   function C_Socketpair
+ (Domain   : C.int;
+  Typ  : C.int;
+  Protocol : C.int;
+  Fds  : not null access Fd_Pair) return C.int is separate;
+
--
-- Finalize --
--

--- gcc/ada/libgnat/g-socthi__mingw.ads
+++ gcc/ada/libgnat/g-socthi__mingw.ads
@@ -177,6 +177,17 @@ package GNAT.Sockets.Thin is
   Typ  : C.int;
   Protocol : C.int) return C.int;
 
+   Default_Socket_Pair_Family : constant := SOSC.AF_INET;
+   --  Windows has not socketpair system call, and C_Socketpair below is
+   --  implemented on loopback connected network sockets.
+
+   function C_Socketpair
+ (Domain   : C.int;
+  Typ  : C.int;
+  Protocol : C.int;
+  Fds

[Ada] Accept concatentation arguments to pragma Annotate

2019-09-19 Thread Pierre-Marie de Rodat

In cases where pragma Annotate accepts a string literal as an argument,
we now also accept a concatenation of string literals.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Steve Baird  

gcc/ada/

* sem_prag.adb (Preferred_String_Type): A new function. Given an
expression, determines whether the preference rules defined for
the third-and-later arguments of pragma Annotate suffice to
determine the type of the expression. If so, then the preferred
type is returned; if not then Empty is returned. Handles
concatenations.
(Analyze_Pragma): Replace previous code, which dealt only with
string literals, with calls to the new Preferred_String_Type
function, which also handles concatenations.
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation for pragma Annotate.
* gnat_rm.texi: Regenerate.

gcc/testsuite/

* gnat.dg/annotation1.adb: New testcase.--- gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
+++ gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
@@ -455,7 +455,8 @@ not otherwise analyze it. The second optional identifier is also left
 unanalyzed, and by convention is used to control the action of the tool to
 which the annotation is addressed.  The remaining ARG arguments
 can be either string literals or more generally expressions.
-String literals are assumed to be either of type
+String literals (and concatenations of string literals) are assumed to be
+either of type
 ``Standard.String`` or else ``Wide_String`` or ``Wide_Wide_String``
 depending on the character literals they contain.
 All other kinds of arguments are analyzed as expressions, and must be

--- gcc/ada/gnat_rm.texi
+++ gcc/ada/gnat_rm.texi
@@ -1836,7 +1836,8 @@ not otherwise analyze it. The second optional identifier is also left
 unanalyzed, and by convention is used to control the action of the tool to
 which the annotation is addressed.  The remaining ARG arguments
 can be either string literals or more generally expressions.
-String literals are assumed to be either of type
+String literals (and concatenations of string literals) are assumed to be
+either of type
 @code{Standard.String} or else @code{Wide_String} or @code{Wide_Wide_String}
 depending on the character literals they contain.
 All other kinds of arguments are analyzed as expressions, and must be
@@ -7706,7 +7707,8 @@ usually supplied automatically by the project manager. A pragma
 Source_File_Name cannot appear after a @ref{ec,,Pragma Source_File_Name_Project}.
 
 For more details on the use of the @code{Source_File_Name} pragma, see the
-sections on @code{Using Other File Names} and @cite{Alternative File Naming Schemes' in the :title:`GNAT User's Guide}.
+sections on @cite{Using Other File Names} and @cite{Alternative File Naming Schemes}
+in the @cite{GNAT User's Guide}.
 
 @node Pragma Source_File_Name_Project,Pragma Source_Reference,Pragma Source_File_Name,Implementation Defined Pragmas
 @anchor{gnat_rm/implementation_defined_pragmas pragma-source-file-name-project}@anchor{ec}@anchor{gnat_rm/implementation_defined_pragmas id41}@anchor{ed}

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -13085,6 +13085,56 @@ package body Sem_Prag is
 Expr: Node_Id;
 Nam_Arg : Node_Id;
 
+--
+-- Inferred_String_Type --
+--
+
+function Preferred_String_Type (Expr : Node_Id) return Entity_Id;
+--  Infer the type to use for a string literal or a concatentation
+--  of operands whose types can be inferred. For such expressions,
+--  returns the "narrowest" of the three predefined string types
+--  that can represent the characters occuring in the expression.
+--  For other expressions, returns Empty.
+
+function Preferred_String_Type (Expr : Node_Id) return Entity_Id is
+begin
+   case Nkind (Expr) is
+  when N_String_Literal =>
+ if Has_Wide_Wide_Character (Expr) then
+return Standard_Wide_Wide_String;
+ elsif Has_Wide_Character (Expr) then
+return Standard_Wide_String;
+ else
+return Standard_String;
+ end if;
+
+  when N_Op_Concat =>
+ declare
+L_Type : constant Entity_Id
+  := Preferred_String_Type (Left_Opnd (Expr));
+R_Type : constant Entity_Id
+  := Preferred_String_Type (Right_Opnd (Expr));
+
+Type_Table : constant array (1 .. 4) of Entity_Id
+  := (Empty,
+  Standard_Wide_Wide_String,
+  Standard_Wide_String,
+

[Ada] Fix bogus "too late" error with nested generics and inlining

2019-09-19 Thread Pierre-Marie de Rodat

This prevents the compiler from issuing a bogus error about a constant
whose full declaration appears too late, if it is declared in a nested
generic package and instantiated in another nested instantiation, when
the instantiations are done in a unit withed from the main unit and
containing an inlined subprogram, and cross-unit inlining is enabled.

It turns out that, under these very peculiar conditions, the compiler
ends up instantiating the body of the generic package twice, which leads
to various semantic errors, in particular for declarations of constants.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* sem_ch12.adb (Instantiate_Package_Body): Check that the body
has not already been instantiated when the body of the parent
was being loaded.

gcc/testsuite/

* gnat.dg/inline21.adb, gnat.dg/inline21_g.ads,
gnat.dg/inline21_h.adb, gnat.dg/inline21_h.ads,
gnat.dg/inline21_q.ads: New testcase.--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -11442,6 +11442,68 @@ package body Sem_Ch12 is
  else
 Load_Parent_Of_Generic
   (Inst_Node, Specification (Gen_Decl), Body_Optional);
+
+--  Surprisingly enough, loading the body of the parent can cause
+--  the body to be instantiated and the double instantiation needs
+--  to be prevented in order to avoid giving bogus semantic errors.
+
+--  This case can occur because of the Collect_Previous_Instances
+--  machinery of Load_Parent_Of_Generic, which will instantiate
+--  bodies that are deemed to be ahead of the body of the parent
+--  in the compilation unit. But the relative position of these
+--  bodies is computed using the mere comparison of their Sloc.
+
+--  Now suppose that you have two generic packages G and H, with
+--  G containing a mere instantiation of H:
+
+--generic
+--package H is
+
+--  generic
+--  package Nested_G is
+-- ...
+--  end Nested_G;
+
+--end H;
+
+--with H;
+
+--generic
+--package G is
+
+--  package My_H is new H;
+
+--end G;
+
+--  and a third package Q instantiating G and Nested_G:
+
+--with G;
+
+--package Q is
+
+--  package My_G is new G;
+
+--  package My_Nested_G is new My_G.My_H.Nested_G;
+
+--end Q;
+
+--  The body to be instantiated is that of My_Nested_G and its
+--  parent is the instance My_G.My_H. This latter instantiation
+--  is done when My_G is analyzed, i.e. after the declarations
+--  of My_G and My_Nested_G have been parsed; as a result, the
+--  Sloc of My_G.My_H is greater than the Sloc of My_Nested_G.
+
+--  Therefore loading the body of My_G.My_H will cause the body
+--  of My_Nested_G to be instantiated because it is deemed to be
+--  ahead of My_G.My_H. This means that Load_Parent_Of_Generic
+--  will again be invoked on My_G.My_H, but this time with the
+--  Collect_Previous_Instances machinery disabled, so there is
+--  no endless mutual recursion and things are done in order.
+
+if Present (Corresponding_Body (Instance_Spec (Inst_Node))) then
+   goto Leave;
+end if;
+
 Gen_Body_Id := Corresponding_Body (Gen_Decl);
  end if;
   end if;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline21.adb
@@ -0,0 +1,9 @@
+--  { dg-compile }
+--  { dg-options "-O -gnatn" }
+
+with Inline21_Q;
+
+procedure Inline21 is
+begin
+  Inline21_Q.My_Nested_G.Proc;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline21_g.ads
@@ -0,0 +1,8 @@
+with Inline21_H;
+
+generic
+package Inline21_G is
+
+   package My_H is new Inline21_H;
+
+end Inline21_G;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline21_h.adb
@@ -0,0 +1,14 @@
+package body Inline21_H is
+
+   package body Nested_G is
+
+  C : constant Integer := 0;
+
+  procedure Proc is
+  begin
+ null;
+  end;
+
+   end Nested_G;
+
+end Inline21_H;
\ No newline at end of file

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline21_h.ads
@@ -0,0 +1,10 @@
+generic
+package Inline21_H is
+
+   generic
+   package Nested_G is
+  procedure Proc;
+  pragma Inline (Proc);
+   end Nested_G;
+
+end Inline21_H;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/inline21_q.ads
@@ -0,0 +1,9 @@
+with Inline21_G;
+
+package Inline21_Q is
+
+   package My_G is new Inline21_G;
+
+   package My_Nested_G is new My_G.My_H.Nested_G;
+
+end Inline21_Q;

[Ada] Get rid of useless temporary for slice in overaligned record type

2019-09-19 Thread Pierre-Marie de Rodat

This fixes a recent code quality regression for targets that do not
require the strict alignment of memory accesses: the compiler would
generate a useless temporary for a slice of an array component in an
overaligned record type.

Running these commands:

  gcc -c p.adb -gnatws -gnatD
  grep loop p.adb.dg

On the following sources:

procedure P (N : Positive) is

  type Rec1 is record
I : Integer;
  end record;

  type Arr is array (Positive range <>) of Rec1;

  type Rec2 is record
A : Arr (1 .. 128);
  end record;
  for Rec2'Alignment use 8;

  procedure Proc (A : Arr) is
  begin
null;
  end;

  R : Rec2;

begin
  Proc (R.A (1 .. N));
end;

Should execute silently.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* exp_util.adb (Is_Possibly_Unaligned_Slice): Do not return true
on pure alignment considerations if the target does not require
the strict alignment of memory accesses.--- gcc/ada/exp_util.adb
+++ gcc/ada/exp_util.adb
@@ -8692,9 +8692,11 @@ package body Exp_Util is
--  We are definitely in trouble if the record in question
--  has an alignment, and either we know this alignment is
--  inconsistent with the alignment of the slice, or we don't
-   --  know what the alignment of the slice should be.
+   --  know what the alignment of the slice should be. But this
+   --  really matters only if the target has strict alignment.
 
-   if Known_Alignment (Ptyp)
+   if Target_Strict_Alignment
+ and then Known_Alignment (Ptyp)
  and then (Unknown_Alignment (Styp)
 or else Alignment (Styp) > Alignment (Ptyp))
then

[Ada] Disable inlining of traversal function in GNATprove

2019-09-19 Thread Pierre-Marie de Rodat

Traversal functions as defined in SPARK RM 3.10 should not be inlined
for analysis in GNATprove, as this changes the ownership behavior.
Disable the inlining performed in GNATprove on functions which could be
interpreted as such.

There is no impact on compilation and thus no test.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Yannick Moy  

gcc/ada/

* inline.adb (Can_Be_Inlined_In_GNATprove_Mode): Add special
case for traversal functions.--- gcc/ada/inline.adb
+++ gcc/ada/inline.adb
@@ -1512,6 +1512,12 @@ package body Inline is
   --  Return True if subprogram Id is defined in the package specification,
   --  either its visible or private part.
 
+  function Maybe_Traversal_Function (Id : Entity_Id) return Boolean;
+  --  Return True if subprogram Id could be a traversal function, as
+  --  defined in SPARK RM 3.10. This is only a safe approximation, as the
+  --  knowledge of the SPARK boundary is needed to determine exactly
+  --  traversal functions.
+
   ---
   -- Has_Formal_With_Discriminant_Dependent_Fields --
   ---
@@ -1635,6 +1641,20 @@ package body Inline is
  return Nkind (Parent (Decl)) = N_Compilation_Unit;
   end Is_Unit_Subprogram;
 
+  --
+  -- Maybe_Traversal_Function --
+  --
+
+  function Maybe_Traversal_Function (Id : Entity_Id) return Boolean is
+  begin
+ return Ekind (Id) = E_Function
+
+   --  Only traversal functions return an anonymous access-to-object
+   --  type in SPARK.
+
+   and then Is_Anonymous_Access_Type (Etype (Id));
+  end Maybe_Traversal_Function;
+
   --  Local declarations
 
   Id : Entity_Id;
@@ -1757,6 +1777,15 @@ package body Inline is
   elsif Has_Formal_With_Discriminant_Dependent_Fields (Id) then
  return False;
 
+  --  Do not inline subprograms which may be traversal functions. Such
+  --  inlining introduces temporary variables of named access type for
+  --  which assignments are move instead of borrow/observe, possibly
+  --  leading to spurious errors when checking SPARK rules related to
+  --  pointer usage.
+
+  elsif Maybe_Traversal_Function (Id) then
+ return False;
+
   --  Otherwise, this is a subprogram declared inside the private part of a
   --  package, or inside a package body, or locally in a subprogram, and it
   --  does not have any contract. Inline it.

[Ada] Memory leak with 'Range of a function call in a loop

2019-09-19 Thread Pierre-Marie de Rodat

If a for loop starts with "for X in F (...)'Range loop", where F is a
function returning an unconstrained array, then memory is leaked. This
patch fixes that bug.

Running these commands:

  gnatmake -q -f main.adb
  main

On the following sources:

with Text_IO; use Text_IO;
package P is

   function Get_Objects return String;

end P;

package body P is
   function Get_Objects return String is
   begin
  return "xyzzy";
   end Get_Objects;

end P;

with Text_IO; use Text_IO;
pragma Warnings (Off, "an internal GNAT unit");
with System.Secondary_Stack;
pragma Warnings (On, "an internal GNAT unit");
with P; use P;

procedure Main is
   Max_Iterations : constant Integer := 1_000;

   procedure Leak_Call is
   begin
  for Id in Get_Objects'Range loop
 null;
  end loop;
   end Leak_Call;

   procedure SS_Info is new System.Secondary_Stack.SS_Info
(Text_IO.Put_Line);

begin
   for Iteration in 1 .. Max_Iterations loop
  Leak_Call;
   end loop;

   SS_Info;

end Main;

Should produce the following output:

  Secondary Stack information:
Total size  :  10240 bytes
Current allocated space :  0 bytes
Number of Chunks:  1
Default size of Chunks  :  10240

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Bob Duff  

gcc/ada/

* sem_attr.adb (Resolve_Attribute): Make sure the secondary
stack is properly managed in the case of a 'Range attribute in a
loop.--- gcc/ada/sem_attr.adb
+++ gcc/ada/sem_attr.adb
@@ -11570,6 +11570,16 @@ package body Sem_Attr is
  begin
 if not Is_Entity_Name (P) or else not Is_Type (Entity (P)) then
Resolve (P);
+
+   --  If the prefix is a function call returning on the secondary
+   --  stack, we must make sure to mark/release the stack.
+
+   if Nkind (P) = N_Function_Call
+ and then Nkind (Parent (N)) = N_Loop_Parameter_Specification
+ and then Requires_Transient_Scope (Etype (P))
+   then
+  Set_Uses_Sec_Stack (Scope (Current_Scope));
+   end if;
 end if;
 
 Dims := Expressions (N);

[Ada] Fix copy operation with private discriminated record type

2019-09-19 Thread Pierre-Marie de Rodat

This prevents the object code from reading too many bytes from the
source for a copy operation involving a private discriminated record
type with default discriminants and generated for the assignment of an
aggregate to a variable or the initialization of a constant.

The front-end already knows that it needs to convert the operation
involving the aggregate into individual assignments if the type of the
aggregate has mutable components, but it would not do so if this type is
private, which does not change anything for code generation.

Running these commands:

  gnatmake -q p -g -fsanitize=address
  p

On the following sources:

with Q; use Q;

procedure P is

   type Rec is record
  A : T;
   end record;

   C : constant Rec := Rec'(A => Default_T);

begin
   null;
end;

package Q is

   type T is private;

   Default_T : constant T;

private

   A : constant := 170;
   B : constant := 8192;

   type A_Index is range 1 .. A;
   type B_Index is range 1 .. B;

   type A_Array is array (A_Index) of Boolean;
   type B_Array is array (B_Index) of Boolean;

   type Data_Type is (A_Type, B_Type);

   type T (Discriminant : Data_Type := A_Type) is record
  case Discriminant is
 when A_Type =>
Field_A : A_Array;
 when B_Type =>
Field_B : B_Array;
  end case;
   end record;

   Default_T : constant T :=
 T'(Discriminant => A_Type, Field_A => (others => True));

end Q;

Should execute silently.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* exp_aggr.adb (Has_Mutable_Components): Look at the underlying
type of components to find out whether they are mutable.--- gcc/ada/exp_aggr.adb
+++ gcc/ada/exp_aggr.adb
@@ -8162,13 +8162,15 @@ package body Exp_Aggr is
 
function Has_Mutable_Components (Typ : Entity_Id) return Boolean is
   Comp : Entity_Id;
+  Ctyp : Entity_Id;
 
begin
   Comp := First_Component (Typ);
   while Present (Comp) loop
- if Is_Record_Type (Etype (Comp))
-   and then Has_Discriminants (Etype (Comp))
-   and then not Is_Constrained (Etype (Comp))
+ Ctyp := Underlying_Type (Etype (Comp));
+ if Is_Record_Type (Ctyp)
+   and then Has_Discriminants (Ctyp)
+   and then not Is_Constrained (Ctyp)
  then
 return True;
  end if;

[Ada] Fix run-time segfault with derived access-to-subprogram type

2019-09-19 Thread Pierre-Marie de Rodat

This fixes a segfault at run time for the call to a local subprogram
through an access value if the type of this access value is derived
from an initial access-to-subprogram type and the access value was
originally obtained with the initial type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* sem_ch3.adb (Build_Derived_Access_Type): If this is an access-
to-subprogram type, copy Can_Use_Internal_Rep from the parent.

gcc/testsuite/

* gnat.dg/access9.adb: New testcase.--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -6723,6 +6723,11 @@ package body Sem_Ch3 is
   Has_Private_Component (Derived_Type));
   Conditional_Delay  (Derived_Type, Subt);
 
+  if Is_Access_Subprogram_Type (Derived_Type) then
+ Set_Can_Use_Internal_Rep
+   (Derived_Type, Can_Use_Internal_Rep (Parent_Type));
+  end if;
+
   --  Ada 2005 (AI-231): Set the null-exclusion attribute, and verify
   --  that it is not redundant.
 

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/access9.adb
@@ -0,0 +1,20 @@
+--  { dg-do run }
+
+procedure Access9 is
+
+  type A_Type is access procedure;
+
+  type B_Type is new A_Type;
+
+  procedure Invoke (B : B_Type) is
+  begin
+B.all;
+  end;
+
+  procedure Nested is begin null; end;
+
+  A : A_Type := Nested'Access;
+
+begin
+  Invoke (B_Type (A));
+end;
\ No newline at end of file

[Ada] Fix spurious type mismatch failure on nested instantiations

2019-09-19 Thread Pierre-Marie de Rodat

This fixes a spurious type mismatch failure reported between formal and
actual of a call to a subprogram that comes from the instantiation of a
child generic unit that itself contains an instantiation of a slibling
child generic unit, when the parent is itself a generic unit with
private part. The regression was introduced by a recent change made to
clear the Is_Generic_Actual_Type on the implicit full view built when a
generic package is instantiated on a private type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* sem_ch12.adb (Restore_Private_Views): Comment out new code
that clear the Is_Generic_Actual_Type also on the full view.

gcc/testsuite/

* gnat.dg/generic_inst13.adb,
gnat.dg/generic_inst13_pkg-nested_g.ads,
gnat.dg/generic_inst13_pkg-ops_g.ads,
gnat.dg/generic_inst13_pkg.ads: New testcase.--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -14638,9 +14638,17 @@ package body Sem_Ch12 is
 else
Set_Is_Generic_Actual_Type (E, False);
 
-   if Is_Private_Type (E) and then Present (Full_View (E)) then
-  Set_Is_Generic_Actual_Type (Full_View (E), False);
-   end if;
+   --  It might seem reasonable to clear the Is_Generic_Actual_Type
+   --  flag also on the Full_View if the type is private, since it
+   --  was set also on this Full_View. However, this flag is relied
+   --  upon by Covers to spot "types exported from instantiations"
+   --  which are implicit Full_Views built for instantiations made
+   --  on private types and we get type mismatches if we do it when
+   --  the block exchanging the declarations below triggers ???
+
+   --  if Is_Private_Type (E) and then Present (Full_View (E)) then
+   --Set_Is_Generic_Actual_Type (Full_View (E), False);
+   --  end if;
 end if;
 
 --  An unusual case of aliasing: the actual may also be directly

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst13.adb
@@ -0,0 +1,22 @@
+--  { dg-do compile }
+
+with Generic_Inst13_Pkg;
+with Generic_Inst13_Pkg.Nested_G;
+
+procedure Generic_Inst13 is
+
+  type Item_T is range 1 .. 16;
+
+  package My_Inst is new Generic_Inst13_Pkg (Item_T);
+
+  package My_Nested is new My_Inst.Nested_G;
+
+  procedure Proc (Left, Right : My_Nested.T) is
+R : constant My_Nested.List_T := My_Nested."or" (Left, Right);
+  begin
+null;
+  end;
+
+begin
+  null;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst13_pkg-nested_g.ads
@@ -0,0 +1,14 @@
+with Generic_Inst13_Pkg.Ops_G;
+
+generic
+package Generic_Inst13_Pkg.Nested_G is
+
+  type T is new Generic_Inst13_Pkg.T;
+
+  package My_Operations is new Generic_Inst13_Pkg.Ops_G (T);
+
+  subtype List_T is My_Operations.List_T;
+
+  function "or" (Left, Right : T) return List_T renames My_Operations."or";
+
+end Generic_Inst13_Pkg.Nested_G;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst13_pkg-ops_g.ads
@@ -0,0 +1,9 @@
+generic
+  type Data_T is private;
+package Generic_Inst13_Pkg.Ops_G is
+
+  type List_T is array (Positive range <>) of Data_T;
+
+  function "or" (Left, Right : Data_T) return List_T is ((Left, Right));
+
+end Generic_Inst13_Pkg.Ops_G;
\ No newline at end of file

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst13_pkg.ads
@@ -0,0 +1,11 @@
+generic
+  type Component_T is private;
+package Generic_Inst13_Pkg is
+
+  type T is private;
+
+private
+
+  type T is array (Boolean) of Component_T;
+
+end Generic_Inst13_Pkg;

[Ada] Emit DW_AT_GNU_bias with -fgnat-encodings=gdb

2019-09-19 Thread Pierre-Marie de Rodat

Emit DW_AT_GNU_bias with -fgnat-encodings=gdb.  gdb implements this,
but not the encoded variant.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Tom Tromey  

gcc/ada/

* gcc-interface/misc.c (gnat_get_type_bias): Return the bias
when -fgnat-encodings=gdb.

gcc/testsuite/

* gnat.dg/bias1.adb: New testcase.--- gcc/ada/gcc-interface/misc.c
+++ gcc/ada/gcc-interface/misc.c
@@ -,7 +,7 @@ gnat_get_type_bias (const_tree gnu_type)
 {
   if (TREE_CODE (gnu_type) == INTEGER_TYPE
   && TYPE_BIASED_REPRESENTATION_P (gnu_type)
-  && gnat_encodings == DWARF_GNAT_ENCODINGS_MINIMAL)
+  && gnat_encodings != DWARF_GNAT_ENCODINGS_ALL)
 return TYPE_RM_MIN_VALUE (gnu_type);
 
   return NULL_TREE;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/bias1.adb
@@ -0,0 +1,34 @@
+--  { dg-do compile }
+--  { dg-options "-cargs -g -dA -gnatws -fgnat-encodings=gdb -margs" }
+--  { dg-final { scan-assembler "DW_AT_GNU_bias" } }
+
+procedure Bias1 is
+   type Small is range -7 .. -4;
+   for Small'Size use 2;
+   Y : Small := -5;
+   Y1 : Small := -7;
+
+   type Byte is mod 256;
+   type Repeat_Count_T is new Byte range 1 .. 2 ** 6;
+   for Repeat_Count_T'Size use 6;
+   X : Repeat_Count_T := 64;
+   X1 : Repeat_Count_T := 1;
+
+   type Char_Range is range 65 .. 68;
+   for Char_Range'Size use 2;
+   Cval : Char_Range := 65;
+
+   type SomePackedRecord is record
+  R: Small;
+  S: Small;
+   end record;
+   pragma Pack (SomePackedRecord);
+   SPR : SomePackedRecord := (R => -4, S => -5);
+
+   type Packed_Array is array (1 .. 3) of Small;
+   pragma pack (Packed_Array);
+   A : Packed_Array := (-7, -5, -4);
+
+begin
+   null;
+end Bias1;
\ No newline at end of file

[Ada] Streamline comparison for equality of 2-element arrays

2019-09-19 Thread Pierre-Marie de Rodat

In the general case, the comparison for equality of array objects is
implemented by a local function that contains, among other things, a
loop running over the elements, comparing them one by one and exiting
as soon as an element is not the same in the two array objects.

For the specific case of constrained 2-element arrays, this is rather
heavy and unnecessarily obfuscates the control flow of the program,
so this change implements a simple conjunction of comparisons for it.

Running these commands:

  gcc -c p.ads -O -gnatD
  grep loop p.ads.dg

On the following sources:

package P is

  type Rec is record
Re : Float;
Im : Float;
  end record;

  type Arr is array (1 .. 2) of Rec;

  function Equal (A, B : Arr) return Boolean is (A = B);

end P;

Should execute silently.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* exp_ch4.adb (Expand_Array_Equality): If optimization is
enabled, generate a simple conjunction of comparisons for the
specific case of constrained 1-dimensional 2-element arrays.
Fix formatting.--- gcc/ada/exp_ch4.adb
+++ gcc/ada/exp_ch4.adb
@@ -1582,7 +1582,7 @@ package body Exp_Ch4 is
   Index_List1 : constant List_Id:= New_List;
   Index_List2 : constant List_Id:= New_List;
 
-  Actuals   : List_Id;
+  First_Idx : Node_Id;
   Formals   : List_Id;
   Func_Name : Entity_Id;
   Func_Body : Node_Id;
@@ -1594,6 +1594,10 @@ package body Exp_Ch4 is
   Rtyp : Entity_Id;
   --  The parameter types to be used for the formals
 
+  New_Lhs : Node_Id;
+  New_Rhs : Node_Id;
+  --  The LHS and RHS converted to the parameter types
+
   function Arr_Attr
 (Arr : Entity_Id;
  Nam : Name_Id;
@@ -1962,6 +1966,82 @@ package body Exp_Ch4 is
  pragma Assert (Ltyp = Rtyp);
   end if;
 
+  --  If the array type is distinct from the type of the arguments, it
+  --  is the full view of a private type. Apply an unchecked conversion
+  --  to ensure that analysis of the code below succeeds.
+
+  if No (Etype (Lhs))
+or else Base_Type (Etype (Lhs)) /= Base_Type (Ltyp)
+  then
+ New_Lhs := OK_Convert_To (Ltyp, Lhs);
+  else
+ New_Lhs := Lhs;
+  end if;
+
+  if No (Etype (Rhs))
+or else Base_Type (Etype (Rhs)) /= Base_Type (Rtyp)
+  then
+ New_Rhs := OK_Convert_To (Rtyp, Rhs);
+  else
+ New_Rhs := Rhs;
+  end if;
+
+  First_Idx := First_Index (Ltyp);
+
+  --  If optimization is enabled and the array boils down to a couple of
+  --  consecutive elements, generate a simple conjunction of comparisons
+  --  which should be easier to optimize by the code generator.
+
+  if Optimization_Level > 0
+and then Ltyp = Rtyp
+and then Is_Constrained (Ltyp)
+and then Number_Dimensions (Ltyp) = 1
+and then Nkind (First_Idx) = N_Range
+and then Compile_Time_Known_Value (Low_Bound (First_Idx))
+and then Compile_Time_Known_Value (High_Bound (First_Idx))
+and then Expr_Value (High_Bound (First_Idx)) =
+ Expr_Value (Low_Bound (First_Idx)) + 1
+  then
+ declare
+Ctyp : constant Entity_Id := Component_Type (Ltyp);
+L, R : Node_Id;
+TestL, TestH : Node_Id;
+Index_List   : List_Id;
+
+ begin
+Index_List := New_List (New_Copy_Tree (Low_Bound (First_Idx)));
+
+L :=
+  Make_Indexed_Component (Loc,
+Prefix  => New_Copy_Tree (New_Lhs),
+Expressions => Index_List);
+
+R :=
+  Make_Indexed_Component (Loc,
+Prefix  => New_Copy_Tree (New_Rhs),
+Expressions => Index_List);
+
+TestL := Expand_Composite_Equality (Nod, Ctyp, L, R, Bodies);
+
+Index_List := New_List (New_Copy_Tree (High_Bound (First_Idx)));
+
+L :=
+  Make_Indexed_Component (Loc,
+Prefix  => New_Lhs,
+Expressions => Index_List);
+
+R :=
+  Make_Indexed_Component (Loc,
+Prefix  => New_Rhs,
+Expressions => Index_List);
+
+TestH := Expand_Composite_Equality (Nod, Ctyp, L, R, Bodies);
+
+return
+  Make_And_Then (Loc, Left_Opnd => TestL, Right_Opnd => TestH);
+ end;
+  end if;
+
   --  Build list of formals for function
 
   Formals := New_List (
@@ -2004,46 +2084,20 @@ package body Exp_Ch4 is
 Make_Simple_Return_Statement (Loc,
   Expression => New_Occurrence_Of (Standard_False, Loc,
 
-Handle_One_Dimension (1, First_Index (Ltyp)),
+Handle_One_Dimension (1, First_Idx),
 
 Make_Simple_Return_Statement (Loc,

[Ada] Suppress GNAT FE up-level reference transformation for GNAT-LLVM

2019-09-19 Thread Pierre-Marie de Rodat

In the case of GNAT-LLVM, the GNAT FE no longer does expansion of
up-level references identified by the subprogram unnesting machinery
into activation record references. This is now only done by the FE when
generating C code. This expansion is already taken care of by the
gnat-llvm middle phase, so there's no benefit to also doing it in the
front end.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Gary Dismukes  

gcc/ada/

* exp_unst.adb (Unnest_Subprogram): Bypass the transformation of
up-level references unless Opt.Generate_C_Code is enabled.--- gcc/ada/exp_unst.adb
+++ gcc/ada/exp_unst.adb
@@ -2177,11 +2177,14 @@ package body Exp_Unst is
 --  not need rewriting (e.g. the appearence in a conversion).
 --  Also ignore if no reference was specified or if the rewriting
 --  has already been done (this can happen if the N_Identifier
---  occurs more than one time in the tree).
+--  occurs more than one time in the tree). Also ignore references
+--  when not generating C code (in particular for the case of LLVM,
+--  since GNAT-LLVM will handle the processing for up-level refs).
 
 if No (UPJ.Ref)
   or else not Is_Entity_Name (UPJ.Ref)
   or else not Present (Entity (UPJ.Ref))
+  or else not Opt.Generate_C_Code
 then
goto Continue;
 end if;

[Ada] Allow constants of access type in Global contracts

2019-09-19 Thread Pierre-Marie de Rodat

Now that SPARK supports access types, global constants of access type
may appear as outputs of a subprogram, with the meaning that the
underlying memory can be modified (see SPARK RM 3.10).

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Yannick Moy  

gcc/ada/

* sem_prag.adb (Analyze_Global_In_Decl_Part): Do not issue an
error when a constant of an access type is used as output in a
Global contract.
(Analyze_Depends_In_Decl_Part): Do not issue an error when a
constant of an access type is used as output in a Depends
contract.

gcc/testsuite/

* gnat.dg/global2.adb, gnat.dg/global2.ads: New testcase.--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -1262,8 +1262,28 @@ package body Sem_Prag is
(Item_Is_Input  : out Boolean;
 Item_Is_Output : out Boolean)
  is
+--  A constant or IN parameter of access type should be handled
+--  like a variable, as the underlying memory pointed-to can be
+--  modified. Use Adjusted_Kind to do this adjustment.
+
+Adjusted_Kind : Entity_Kind := Ekind (Item_Id);
+
  begin
-case Ekind (Item_Id) is
+if Ekind_In (Item_Id, E_Constant,
+  E_Generic_In_Parameter,
+  E_In_Parameter)
+
+  --  If Item_Id is of a private type whose completion has not been
+  --  analyzed yet, its Underlying_Type is empty and we handle it
+  --  as a constant.
+
+  and then Present (Underlying_Type (Etype (Item_Id)))
+  and then Is_Access_Type (Underlying_Type (Etype (Item_Id)))
+then
+   Adjusted_Kind := E_Variable;
+end if;
+
+case Adjusted_Kind is
 
--  Abstract states
 
@@ -1303,7 +1323,9 @@ package body Sem_Prag is
 
   Item_Is_Output := False;
 
-   --  Variables and IN OUT parameters
+   --  Variables and IN OUT parameters, as well as constants and
+   --  IN parameters of access type which are handled like
+   --  variables.
 
when E_Generic_In_Out_Parameter
   | E_In_Out_Parameter
@@ -2412,10 +2434,13 @@ package body Sem_Prag is
 
--  Constant related checks
 
-   elsif Ekind (Item_Id) = E_Constant then
+   elsif Ekind (Item_Id) = E_Constant
+ and then
+   not Is_Access_Type (Underlying_Type (Etype (Item_Id)))
+   then
 
-  --  A constant is a read-only item, therefore it cannot act
-  --  as an output.
+  --  Unless it is of an access type, a constant is a read-only
+  --  item, therefore it cannot act as an output.
 
   if Nam_In (Global_Mode, Name_In_Out, Name_Output) then
  SPARK_Msg_NE

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/global2.adb
@@ -0,0 +1,12 @@
+--  { dg-do compile }
+
+package body Global2 is
+   procedure Change_X is
+   begin
+  X.all := 1;
+   end Change_X;
+   procedure Change2_X is
+   begin
+  X.all := 1;
+   end Change2_X;
+end Global2;
\ No newline at end of file

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/global2.ads
@@ -0,0 +1,6 @@
+package Global2 is
+   type Int_Acc is access Integer;
+   X : constant Int_Acc := new Integer'(34);
+   procedure Change_X with Global => (In_Out => X);
+   procedure Change2_X with Depends => (X => X);
+end Global2;

[Ada] Implement Machine_Rounding attribute in line when possible

2019-09-19 Thread Pierre-Marie de Rodat

GNAT implements Machine_Rounding as an alias for Rounding but, whereas
the implementation of the latter is in line when possible, that of the
former is always out of line, which is not aligned with the intent of
the Ada RM.

This changes the compiler to using for Machine_Rounding the same in line
implementation as Rounding when possible.

Running these commands:

  gcc -c f.adb -gnatD
  grep system f.adb.dg

On the following sources:

function F (Val : Float) return Integer is
begin
  return Integer (Float'Machine_Rounding (Val));
end;

Should execute silently.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* exp_attr.adb (Is_Inline_Floating_Point_Attribute): Treat
Machine_Rounding as an alias for Rounding.
* sem_res.adb (Simplify_Type_Conversion): Likewise.--- gcc/ada/exp_attr.adb
+++ gcc/ada/exp_attr.adb
@@ -8360,13 +8360,13 @@ package body Exp_Attr is
  return False;
   end if;
 
-  --  Here we are in the integer conversion context
+  --  Here we are in the integer conversion context. We reuse Rounding for
+  --  Machine_Rounding as System.Fat_Gen, which is a permissible behavior.
 
-  --  Very probably we should also recognize the cases of Machine_Rounding
-  --  and unbiased rounding in this conversion context, but the back end is
-  --  not yet prepared to handle these cases ???
-
-  return Id = Attribute_Rounding or else Id = Attribute_Truncation;
+  return
+Id = Attribute_Rounding
+  or else Id = Attribute_Machine_Rounding
+  or else Id = Attribute_Truncation;
end Is_Inline_Floating_Point_Attribute;
 
 end Exp_Attr;

--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -12439,7 +12439,8 @@ package body Sem_Res is
 -- ityp (x)
 
 --  with the Float_Truncate flag set to False or True respectively,
---  which is more efficient.
+--  which is more efficient. We reuse Rounding for Machine_Rounding
+--  as System.Fat_Gen, which is a permissible behavior.
 
 if Is_Floating_Point_Type (Opnd_Typ)
   and then
@@ -12448,6 +12449,7 @@ package body Sem_Res is
 and then Conversion_OK (N)))
   and then Nkind (Operand) = N_Attribute_Reference
   and then Nam_In (Attribute_Name (Operand), Name_Rounding,
+ Name_Machine_Rounding,
  Name_Truncation)
 then
declare

[Ada] Fix fallout of previous change for bit-packed arrays

2019-09-19 Thread Pierre-Marie de Rodat

This fixes a regression introduced by the previous change that improved
the handling of explicit by-reference mechanism. For the very specific
case of a component of a bit-packed array, the front-end still needs to
insert a copy around the call because this is where the rewriting into
the sequence of mask-and-shifts is done for the code generator.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* exp_ch6.adb (Add_Simple_Call_By_Copy_Code): Add
Bit_Packed_Array parameter and documet it. Always insert a copy
if it is set True.
(Expand_Actuals): Adjust the calls to
Add_Simple_Call_By_Copy_Code.

gcc/testsuite/

* gnat.dg/pack26.adb: New testcase.--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -1252,10 +1252,11 @@ package body Exp_Ch6 is
   --  also takes care of any constraint checks required for the type
   --  conversion case (on both the way in and the way out).
 
-  procedure Add_Simple_Call_By_Copy_Code;
+  procedure Add_Simple_Call_By_Copy_Code (Bit_Packed_Array : Boolean);
   --  This is similar to the above, but is used in cases where we know
   --  that all that is needed is to simply create a temporary and copy
-  --  the value in and out of the temporary.
+  --  the value in and out of the temporary. If Bit_Packed_Array is True,
+  --  the procedure is called for a bit-packed array actual.
 
   procedure Add_Validation_Call_By_Copy_Code (Act : Node_Id);
   --  Perform copy-back for actual parameter Act which denotes a validation
@@ -1269,11 +1270,11 @@ package body Exp_Ch6 is
 
   function Is_Legal_Copy return Boolean;
   --  Check that an actual can be copied before generating the temporary
-  --  to be used in the call. If the actual is of a by_reference type then
-  --  the program is illegal (this can only happen in the presence of
-  --  rep. clauses that force an incorrect alignment). If the formal is
-  --  a by_reference parameter imposed by a DEC pragma, emit a warning to
-  --  the effect that this might lead to unaligned arguments.
+  --  to be used in the call. If the formal is of a by_reference type or
+  --  is aliased, then the program is illegal (this can only happen in
+  --  the presence of representation clauses that force a misalignment)
+  --  If the formal is a by_reference parameter imposed by a DEC pragma,
+  --  emit a warning that this might lead to unaligned arguments.
 
   function Make_Var (Actual : Node_Id) return Entity_Id;
   --  Returns an entity that refers to the given actual parameter, Actual
@@ -1610,7 +1611,7 @@ package body Exp_Ch6 is
   -- Add_Simple_Call_By_Copy_Code --
   --
 
-  procedure Add_Simple_Call_By_Copy_Code is
+  procedure Add_Simple_Call_By_Copy_Code (Bit_Packed_Array : Boolean) is
  Decl   : Node_Id;
  F_Typ  : Entity_Id := Etype (Formal);
  Incod  : Node_Id;
@@ -1621,7 +1622,12 @@ package body Exp_Ch6 is
  Temp   : Entity_Id;
 
   begin
- if not Is_Legal_Copy then
+ --  ??? We need to do the copy for a bit-packed array because this is
+ --  where the rewriting into a mask-and-shift sequence is done. But of
+ --  course this may break the program if it expects bits to be really
+ --  passed by reference. That's what we have done historically though.
+
+ if not Bit_Packed_Array and then not Is_Legal_Copy then
 return;
  end if;
 
@@ -2076,7 +2082,7 @@ package body Exp_Ch6 is
 --  [in] out parameters.
 
 elsif Is_Ref_To_Bit_Packed_Array (Actual) then
-   Add_Simple_Call_By_Copy_Code;
+   Add_Simple_Call_By_Copy_Code (Bit_Packed_Array => True);
 
 --  If a nonscalar actual is possibly bit-aligned, we need a copy
 --  because the back-end cannot cope with such objects. In other
@@ -2092,7 +2098,7 @@ package body Exp_Ch6 is
 Component_May_Be_Bit_Aligned (Entity (Selector_Name (Actual)))
   and then not Represented_As_Scalar (Etype (Formal))
 then
-   Add_Simple_Call_By_Copy_Code;
+   Add_Simple_Call_By_Copy_Code (Bit_Packed_Array => False);
 
 --  References to slices of bit-packed arrays are expanded
 
@@ -2295,14 +2301,14 @@ package body Exp_Ch6 is
 --  Is this really necessary in all cases???
 
 elsif Is_Ref_To_Bit_Packed_Array (Actual) then
-   Add_Simple_Call_By_Copy_Code;
+   Add_Simple_Call_By_Copy_Code (Bit_Packed_Array => True);
 
 --  If a nonscalar actual is possibly unaligned, we need a copy
 
 elsif Is_Possibly_Unaligned_Object (Actual)
   and then not Represented_As_Scalar (Etype (Formal))
 then
-   Add_Simple_Call_By_Copy_Code;
+

[Ada] Improve handling of explicit by-reference mechanism

2019-09-19 Thread Pierre-Marie de Rodat

This improves the handling of an explicit by-reference passing mechanism
specified by means of the GNAT pragma Export_Function.  This device sort
of circumvents the rules of the language for the by-reference passing
mechanism and it's then up to the programmer to ensure that the actual
parameter is addressable; if it is not, the compiler will generate a
temporary around the call, thus effectively passing the actual by copy.

It turns out that the compiler was too conservative when determining
whether the actual parameter is addressable, in particular if it's a
component of a record type subject to a representation clause.

The change effectively moves this computation from the front-end to the
back-end, which has much more information on the layout and alignment
of types and thus can be less conservative.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Eric Botcazou  

gcc/ada/

* exp_ch6.adb (Is_Legal_Copy): Also return false for an aliased
formal and a formal passed by reference in convention Ada.  Add
missing guard to the existing test on Is_Valued_Procedure.--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -1859,12 +1859,16 @@ package body Exp_Ch6 is
  --  An attempt to copy a value of such a type can only occur if
  --  representation clauses give the actual a misaligned address.
 
- if Is_By_Reference_Type (Etype (Formal)) then
+ if Is_By_Reference_Type (Etype (Formal))
+   or else Is_Aliased (Formal)
+   or else (Mechanism (Formal) = By_Reference
+ and then not Has_Foreign_Convention (Subp))
+ then
 
 --  The actual may in fact be properly aligned but there is not
 --  enough front-end information to determine this. In that case
---  gigi will emit an error if a copy is not legal, or generate
---  the proper code.
+--  gigi will emit an error or a warning if a copy is not legal,
+--  or generate the proper code.
 
 return False;
 
@@ -1875,6 +1879,7 @@ package body Exp_Ch6 is
  --  be lurking.
 
  elsif Mechanism (Formal) = By_Reference
+   and then Ekind (Scope (Formal)) = E_Procedure
and then Is_Valued_Procedure (Scope (Formal))
  then
 Error_Msg_N

[Ada] Infinite loop with concatenation and aspect

2019-09-19 Thread Pierre-Marie de Rodat

This patch fixes a bug where an array object initialized with a
concatenation, and that has an aspect_specification for Alignment,
causes the compiler goes into an infinite loop.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Bob Duff  

gcc/ada/

* exp_ch3.adb (Rewrite_As_Renaming): Return False if there are
any aspect specifications, because otherwise Insert_Actions
blows up.

gcc/testsuite/

* gnat.dg/concat3.adb: New testcase.--- gcc/ada/exp_ch3.adb
+++ gcc/ada/exp_ch3.adb
@@ -6318,7 +6318,8 @@ package body Exp_Ch3 is
   -
 
   function Rewrite_As_Renaming return Boolean is
-  begin
+ Result : constant Boolean :=
+
  --  If the object declaration appears in the form
 
  --Obj : Ctrl_Typ := Func (...);
@@ -6336,12 +6337,12 @@ package body Exp_Ch3 is
 
  --  This part is disabled for now, because it breaks GPS builds
 
- return (False -- ???
- and then Nkind (Expr_Q) = N_Explicit_Dereference
- and then not Comes_From_Source (Expr_Q)
- and then Nkind (Original_Node (Expr_Q)) = N_Function_Call
- and then Nkind (Object_Definition (N)) in N_Has_Entity
- and then (Needs_Finalization (Entity (Object_Definition (N)
+ (False -- ???
+and then Nkind (Expr_Q) = N_Explicit_Dereference
+and then not Comes_From_Source (Expr_Q)
+and then Nkind (Original_Node (Expr_Q)) = N_Function_Call
+and then Nkind (Object_Definition (N)) in N_Has_Entity
+and then (Needs_Finalization (Entity (Object_Definition (N)
 
--  If the initializing expression is for a variable with attribute
--  OK_To_Rename set, then transform:
@@ -6362,6 +6363,14 @@ package body Exp_Ch3 is
and then Ekind (Entity (Expr_Q)) = E_Variable
and then OK_To_Rename (Entity (Expr_Q))
and then Is_Entity_Name (Obj_Def));
+  begin
+ --  Return False if there are any aspect specifications, because
+ --  otherwise we duplicate that corresponding implicit attribute
+ --  definition, and call Insert_Action, which has no place to insert
+ --  the attribute definition. The attribute definition is stored in
+ --  Aspect_Rep_Item, which is not a list.
+
+ return Result and then No (Aspect_Specifications (N));
   end Rewrite_As_Renaming;
 
   --  Local variables

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/concat3.adb
@@ -0,0 +1,14 @@
+--  { dg-do run }
+--  { dg-options "-g -O0 -gnata" }
+
+procedure Concat3 is
+   procedure Show_Bug (S : in String)
+   is
+  Str : constant String := S & "-" with Alignment => 4;
+   begin
+  null;
+   end Show_Bug;
+
+begin
+   Show_Bug ("BUG");
+end Concat3;
\ No newline at end of file

[Ada] gnatxref: infinite loop on symbols not found

2019-09-19 Thread Pierre-Marie de Rodat

This patch fixes a bug in which if a symbol is not found, gnatxref can
sometimes enter an infinite loop. No impact on compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-09-19  Bob Duff  

gcc/ada/

* xref_lib.adb (Get_Symbol_Name): If we reach EOF in the first
loop without finding the symbol, return "???". Otherwise, it's
an infinite loop.
(Parse_EOL): Assert that we're not already at EOF.  Remove
processing of LF/CR -- there are no operating systems that use
that.--- gcc/ada/xref_lib.adb
+++ gcc/ada/xref_lib.adb
@@ -723,6 +723,8 @@ package body Xref_Lib is
is
begin
   loop
+ pragma Assert (Source (Ptr) /= EOF);
+
  --  Skip to end of line
 
  while Source (Ptr) /= ASCII.CR and then Source (Ptr) /= ASCII.LF
@@ -737,11 +739,9 @@ package body Xref_Lib is
 Ptr := Ptr + 1;
  end if;
 
- --  Skip past CR/LF or LF/CR combination
+ --  Skip past CR/LF
 
- if (Source (Ptr) = ASCII.CR or else Source (Ptr) = ASCII.LF)
-   and then Source (Ptr) /= Source (Ptr - 1)
- then
+ if Source (Ptr - 1) = ASCII.CR and then Source (Ptr) = ASCII.LF then
 Ptr := Ptr + 1;
  end if;
 
@@ -783,6 +783,7 @@ package body Xref_Lib is
   --  line and column in the dependent unit number Eun. For this we need
   --  to parse the ali file again because the parent entity is not in
   --  the declaration table if it did not match the search pattern.
+  --  If the symbol is not found, we return "???".
 
   procedure Skip_To_Matching_Closing_Bracket;
   --  When Ptr points to an opening square bracket, moves it to the
@@ -803,6 +804,10 @@ package body Xref_Lib is
  --  Look for the X lines corresponding to unit Eun
 
  loop
+if Ali (Ptr) = EOF then
+   return "???";
+end if;
+
 if Ali (Ptr) = 'X' then
Ptr := Ptr + 1;
Parse_Number (Ali, Ptr, E_Eun);
@@ -832,10 +837,6 @@ package body Xref_Lib is
 exit when Ali (Ptr) = EOF;
  end loop;
 
- --  We were not able to find the symbol, this should not happen but
- --  since we don't want to stop here we return a string of three
- --  question marks as the symbol name.
-
  return "???";
   end Get_Symbol_Name;

Re: [PATCH] Speed up qsort in IPA ICF

2019-09-19 Thread Richard Biener

On Thu, Sep 19, 2019 at 3:15 PM Alexander Monakov  wrote:
>
> On Thu, 19 Sep 2019, Richard Biener wrote:
>
> > > Good point, there's a tested patch.
> >
> > OK.
>
> Hold on, is the new comparator really correct?
>
>
> @@ -3384,20 +3372,11 @@ sort_congruence_classes_by_decl_uid (const void *a, 
> const void *b)
>  static int
>  sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
>  {
> -  const congruence_class_group *g1
> -= *(const congruence_class_group * const *)a;
> -  const congruence_class_group *g2
> -= *(const congruence_class_group * const *)b;
> -
> -  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
> -  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
> -
> -  if (uid1 < uid2)
> -return -1;
> -  else if (uid1 > uid2)
> -return 1;
> -  else
> -return 0;
> +  const std::pair *g1
> += *(const std::pair *const *) a;
> +  const std::pair *g2
> += *(const std::pair *const *) b;
> +  return g1->second - g2->second;
>  }
>
>
> AFAICT the vec holds pairs, not pointers to pairs, so why does the comparator
> cast to a pointer-to-pointer-to-pair? Shouldn't it read
>
>   const std::pair *g1
> = (const std::pair *) a;

Indeed.  We need static type correctness for the comparator :/

Richard.

> Thanks.
> Alexander

Re: [PATCH] Speed up qsort in IPA ICF

2019-09-19 Thread Alexander Monakov

On Thu, 19 Sep 2019, Richard Biener wrote:

> > Good point, there's a tested patch.
> 
> OK.

Hold on, is the new comparator really correct?


@@ -3384,20 +3372,11 @@ sort_congruence_classes_by_decl_uid (const void *a, 
const void *b)
 static int
 sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
 {
-  const congruence_class_group *g1
-= *(const congruence_class_group * const *)a;
-  const congruence_class_group *g2
-= *(const congruence_class_group * const *)b;
-
-  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
-  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
-
-  if (uid1 < uid2)
-return -1;
-  else if (uid1 > uid2)
-return 1;
-  else
-return 0;
+  const std::pair *g1
+= *(const std::pair *const *) a;
+  const std::pair *g2
+= *(const std::pair *const *) b;
+  return g1->second - g2->second;
 }


AFAICT the vec holds pairs, not pointers to pairs, so why does the comparator
cast to a pointer-to-pointer-to-pair? Shouldn't it read

  const std::pair *g1
= (const std::pair *) a;

Thanks.
Alexander

Re: [PATCH] Speed up qsort in IPA ICF

2019-09-19 Thread Richard Biener

On Thu, Sep 19, 2019 at 3:02 PM Martin Liška  wrote:
>
> On 9/19/19 1:01 PM, Richard Biener wrote:
> > On Thu, Sep 19, 2019 at 11:06 AM Martin Liška  wrote:
> >>
> >> Hi.
> >>
> >> As Alexander pointed out, the sort_congruence_class_groups_by_decl_uid is 
> >> the most
> >> expensive qsort operator in tramp3d compilation. It does unfortunate 7 
> >> pointer dereferences
> >> via: DECL_UID (classes[i]->classes[0]->members[0]->decl). I'm suggesting 
> >> to cache that
> >> in congruence_class_group.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >
> > Since we're populating the classes vector right before you could elide
> > the new field and do it in the same iteration by having
> >
> > auto_vec  > classes
> >
> > and pushing *it, DECL_UID and then sorting by the readily available
> > UID in the data?  That makes the sort use a linear memory region
> > w/o dereferences.
>
> Good point, there's a tested patch.

OK.

Thanks,
Richard.

> Martin
>
> >
> > Richard.
> >
> >> Thanks,
> >> Martin
> >>
> >> gcc/ChangeLog:
> >>
> >> 2019-09-19  Martin Liska  
> >>
> >> * ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
> >> (sort_congruence_classes_by_decl_uid): Likewise.
> >> (sort_congruence_class_groups_by_decl_uid): Use
> >> congruence_class_group::uid.
> >> (sem_item_optimizer::merge_classes): Cache
> >> DECL_UID (classes[i]->classes[0]->members[0]->decl).
> >> * ipa-icf.h (struct congruence_class_group): New field.
> >> ---
> >>  gcc/ipa-icf.c | 29 +
> >>  gcc/ipa-icf.h |  1 +
> >>  2 files changed, 6 insertions(+), 24 deletions(-)
> >>
> >>
>

Re: [PATCH] Speed up qsort in IPA ICF

2019-09-19 Thread Martin Liška

On 9/19/19 1:01 PM, Richard Biener wrote:
> On Thu, Sep 19, 2019 at 11:06 AM Martin Liška  wrote:
>>
>> Hi.
>>
>> As Alexander pointed out, the sort_congruence_class_groups_by_decl_uid is 
>> the most
>> expensive qsort operator in tramp3d compilation. It does unfortunate 7 
>> pointer dereferences
>> via: DECL_UID (classes[i]->classes[0]->members[0]->decl). I'm suggesting to 
>> cache that
>> in congruence_class_group.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> Since we're populating the classes vector right before you could elide
> the new field and do it in the same iteration by having
> 
> auto_vec  > classes
> 
> and pushing *it, DECL_UID and then sorting by the readily available
> UID in the data?  That makes the sort use a linear memory region
> w/o dereferences.

Good point, there's a tested patch.

Martin

> 
> Richard.
> 
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2019-09-19  Martin Liska  
>>
>> * ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
>> (sort_congruence_classes_by_decl_uid): Likewise.
>> (sort_congruence_class_groups_by_decl_uid): Use
>> congruence_class_group::uid.
>> (sem_item_optimizer::merge_classes): Cache
>> DECL_UID (classes[i]->classes[0]->members[0]->decl).
>> * ipa-icf.h (struct congruence_class_group): New field.
>> ---
>>  gcc/ipa-icf.c | 29 +
>>  gcc/ipa-icf.h |  1 +
>>  2 files changed, 6 insertions(+), 24 deletions(-)
>>
>>

>From 05a78667e932e554dfbd69433ad66242cbca6492 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 19 Sep 2019 09:41:39 +0200
Subject: [PATCH] Speed up qsort in IPA ICF.

gcc/ChangeLog:

2019-09-19  Martin Liska  

	* ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
	(sort_congruence_classes_by_decl_uid): Likewise.
	(sort_congruence_class_groups_by_decl_uid): Use std::pair for
	easier sorting.
	(sem_item_optimizer::merge_classes): Likewise.
---
 gcc/ipa-icf.c | 49 -
 1 file changed, 16 insertions(+), 33 deletions(-)

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index c9c3cb4a331..59b7f8b1b9d 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -3350,13 +3350,7 @@ sort_sem_items_by_decl_uid (const void *a, const void *b)
 
   int uid1 = DECL_UID (i1->decl);
   int uid2 = DECL_UID (i2->decl);
-
-  if (uid1 < uid2)
-return -1;
-  else if (uid1 > uid2)
-return 1;
-  else
-return 0;
+  return uid1 - uid2;
 }
 
 /* Sort pair of congruence_classes A and B by DECL_UID of the first member.  */
@@ -3369,13 +3363,7 @@ sort_congruence_classes_by_decl_uid (const void *a, const void *b)
 
   int uid1 = DECL_UID (c1->members[0]->decl);
   int uid2 = DECL_UID (c2->members[0]->decl);
-
-  if (uid1 < uid2)
-return -1;
-  else if (uid1 > uid2)
-return 1;
-  else
-return 0;
+  return uid1 - uid2;
 }
 
 /* Sort pair of congruence_class_groups A and B by
@@ -3384,20 +3372,11 @@ sort_congruence_classes_by_decl_uid (const void *a, const void *b)
 static int
 sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
 {
-  const congruence_class_group *g1
-= *(const congruence_class_group * const *)a;
-  const congruence_class_group *g2
-= *(const congruence_class_group * const *)b;
-
-  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
-  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
-
-  if (uid1 < uid2)
-return -1;
-  else if (uid1 > uid2)
-return 1;
-  else
-return 0;
+  const std::pair *g1
+= *(const std::pair *const *) a;
+  const std::pair *g2
+= *(const std::pair *const *) b;
+  return g1->second - g2->second;
 }
 
 /* After reduction is done, we can declare all items in a group
@@ -3445,10 +3424,14 @@ sem_item_optimizer::merge_classes (unsigned int prev_class_count)
 	  }
   }
 
-  auto_vec  classes (m_classes.elements ());
+  auto_vec > classes (
+m_classes.elements ());
   for (hash_table::iterator it = m_classes.begin ();
it != m_classes.end (); ++it)
-classes.quick_push (*it);
+{
+  int uid = DECL_UID ((*it)->classes[0]->members[0]->decl);
+  classes.quick_push (std::pair (*it, uid));
+}
 
   classes.qsort (sort_congruence_class_groups_by_decl_uid);
 
@@ -3470,11 +3453,11 @@ sem_item_optimizer::merge_classes (unsigned int prev_class_count)
 }
 
   unsigned int l;
-  congruence_class_group *it;
+  std::pair *it;
   FOR_EACH_VEC_ELT (classes, l, it)
-for (unsigned int i = 0; i < it->classes.length (); i++)
+for (unsigned int i = 0; i < it->first->classes.length (); i++)
   {
-	congruence_class *c = it->classes[i];
+	congruence_class *c = it->first->classes[i];
 
 	if (c->members.length () == 1)
 	  continue;
-- 
2.23.0

[PATCH, Fortran] Character type names in errors and warnings - new version for review

2019-09-19 Thread Mark Eggleston


Original thread: https://gcc.gnu.org/ml/fortran/2019-09/msg00024.html

The original patch introduced a new field in gfc_typespec called length 
to be used only for character literals. At the time I felt that this was 
a bit of kludge.  As a results of comments from Janne Blomqvist I 
investigated whether the existing mechanism for character length in 
gfc_typespec could be used for character literals. This turn out to be 
impractical.


The character length for literals is already held in the gfc_expr 
structure for character constants. I've added a new version of 
gfc_typename that accepts gfc_expr * instead of gfc_typespec. Where 
character types are possible the gfc_expr * version is now used instead 
of the gfc_typespec * version.


I've implemented Janne's suggestions.

I think this is a better solution.

Please review.

Tested on x86_64 (built with bootstrap).

ChangeLogs

gcc/fortran

    Mark Eggleston  

    * array.c (check_element_type): Call gfc_typename with the gfc_expr
    "expr" instead of its gfc_typespec "ts".
    * check.c (gfc_check_co_reduce): Call gfc_typename with the gfc_expr
    "a" instead of its gfc_typespec "ts".
    (gfc_check_co_reduce): Call gfc_typename with the gfc_expr "a" instead
     of its gfc_typespec "ts".
    (gfc_check_eoshift): Call gfc_typename with the gfc_expr "array"
    instead of its gfc_typespec ts.
    (gfc_check_same_type_as): In two calls to gfc_typename use "a" and "b"
    of type gfc_expr instead of the "ts" fields of "a" and "b"
    * decl.c (variable_decl): Call gfc_typename with the gfc_expr
    "initializer" instead of its gfc_typespec "ts".
    * expr.c (gfc_check_assign): Use "rvalue" and "lvalue" of type gfc_expr
    in calls to gfc_typename instead of their "ts" fields of type
    gfc_typespec.
    (gfc_check_pointer_assign): Use "rvalue" and "lvalue" of type gfc_expr
    in calls to gfc_typename instead of their "ts" fields of type
    gfc_typespec.
    * gfortran.h: Add prototypes for gfc_dummy_typename and a new function
    gfc_typename for gfc_expr *.
    *interface.c (gfc_check_dummy_characteristics): Use gfc_dummy_typename
    for the dummy variable.
    (compare_parameter): Use gfc_dummy_typename for the formal argument.
    Use "actual" of type gfc_expr in call to gfc_typename for the actual
    argument.
    * intrinsic.c (check_arglist): Use gfc_dummy_typename for the formal
    argument. Use expressions of type gfc_expr from the argument list to
    call gfc_typename.
    (gfc_convert_type_warn): New local variable "is_char_constant" set if
    the expression type is a character constant. At the "bad" label
    determine source type name by calling gfc_typename with either "expr"
    for character constants or "from_ts" and use that in the warning
    messages instead of the original call to gfc_typename.
    * misc.c (gfc_typename): New function for gfc_expr *, use for where
    character types are possible it can get the character length from
    gfc_expr for character literals.
    (gfc_dummy_typename): New functionfor gfc_typespec *, if no character
    length is present the character type is assumed and the appropriate
    string is return otherwise it calls gfc_typename for gfc_typespec *.
    (gfc_typespec): for character types construct the type name with length
    and kind (if it is not default kind).

gcc/testsuite

    Mark Eggleston 

    * gfortran.dg/bad_operands.f90: New test.
    * gfortran.dg/character mismatch.f90: New test.
    * gfortran.dg/compare_interfaces.f90: New test.
    * gfortran.dg/hollerith_to_char_parameter_1.f90: New test.
    * gfortran.dg/hollerith_to_char_parameter_2.f90: New test.
    * gfortran.dg/widechar_intrinsics_1.f90: Checked for specific character
    type names instead of "Type of argument".
    * gfortran.dg/widechar_intrinsics_2.f90: Checked for specific character
    type names instead of "Type of argument".
    * gfortran.dg/widechar_intrinsics_3.f90: Checked for specific character
    type names instead of "Type of argument".

--
https://www.codethink.co.uk/privacy.html

>From c9b86acc7c3a6c1e684231af95d2b6b5c562379b Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Fri, 30 Aug 2019 11:08:26 +0100
Subject: [PATCH] Character typenames in errors and warnings

Character type names now incorporate length, kind is only shown if
the default character is not being used.

Examples:

  character(7) is reported as CHARACTER(7)
  character(len=20,kind=4) is reported as CHARACTER(20,4)

dummy character variables with assumed length:

  character(*) is reported as CHARACTER(*)
  character(*,kind=4) is reported as CHARACTER(*,4)
---
 gcc/fortran/array.c|  2 +-
 gcc/fortran/check.c| 10 +--
 gcc/fortran/decl.c |  2 +-
 gcc/fortran/expr.c |  8 +--
 gcc/fortran/gfortran.h |  2 +
 gcc/fortran/interface.c| 11 ++--

[PATCH] Remove vect_is_slp_reduction

2019-09-19 Thread Richard Biener



This implements vect_is_slp_reduction in terms of the check_reduction_path
detected SCC.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-09-19  Richard Biener  

* tree-vect-loop.c (vect_is_slp_reduction): Remove.
(check_reduction_path): New overload having the path as result.
(vect_is_simple_reduction): From the detected reduction
path build a SLP reduction chain if possible.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 275959)
+++ gcc/tree-vect-loop.c(working copy)
@@ -2541,163 +2541,6 @@ vect_valid_reduction_input_p (stmt_vec_i
  && !is_loop_header_bb_p (gimple_bb (def_stmt_info->stmt;
 }
 
-/* Detect SLP reduction of the form:
-
-   #a1 = phi 
-   a2 = operation (a1)
-   a3 = operation (a2)
-   a4 = operation (a3)
-   a5 = operation (a4)
-
-   #a = phi 
-
-   PHI is the reduction phi node (#a1 = phi  above)
-   FIRST_STMT is the first reduction stmt in the chain
-   (a2 = operation (a1)).
-
-   Return TRUE if a reduction chain was detected.  */
-
-static bool
-vect_is_slp_reduction (loop_vec_info loop_info, gimple *phi,
-  gimple *first_stmt)
-{
-  class loop *loop = (gimple_bb (phi))->loop_father;
-  class loop *vect_loop = LOOP_VINFO_LOOP (loop_info);
-  enum tree_code code;
-  gimple *loop_use_stmt = NULL;
-  stmt_vec_info use_stmt_info;
-  tree lhs;
-  imm_use_iterator imm_iter;
-  use_operand_p use_p;
-  int nloop_uses, size = 0, n_out_of_loop_uses;
-  bool found = false;
-
-  if (loop != vect_loop)
-return false;
-
-  auto_vec reduc_chain;
-  lhs = PHI_RESULT (phi);
-  code = gimple_assign_rhs_code (first_stmt);
-  while (1)
-{
-  nloop_uses = 0;
-  n_out_of_loop_uses = 0;
-  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
-{
- gimple *use_stmt = USE_STMT (use_p);
- if (is_gimple_debug (use_stmt))
-   continue;
-
-  /* Check if we got back to the reduction phi.  */
- if (use_stmt == phi)
-{
- loop_use_stmt = use_stmt;
-  found = true;
-  break;
-}
-
-  if (flow_bb_inside_loop_p (loop, gimple_bb (use_stmt)))
-{
- loop_use_stmt = use_stmt;
- nloop_uses++;
-}
-   else
- n_out_of_loop_uses++;
-
-   /* There are can be either a single use in the loop or two uses in
-  phi nodes.  */
-   if (nloop_uses > 1 || (n_out_of_loop_uses && nloop_uses))
- return false;
-}
-
-  if (found)
-break;
-
-  /* We reached a statement with no loop uses.  */
-  if (nloop_uses == 0)
-   return false;
-
-  /* This is a loop exit phi, and we haven't reached the reduction phi.  */
-  if (gimple_code (loop_use_stmt) == GIMPLE_PHI)
-return false;
-
-  if (!is_gimple_assign (loop_use_stmt)
- || code != gimple_assign_rhs_code (loop_use_stmt)
- || !flow_bb_inside_loop_p (loop, gimple_bb (loop_use_stmt)))
-return false;
-
-  /* Insert USE_STMT into reduction chain.  */
-  use_stmt_info = loop_info->lookup_stmt (loop_use_stmt);
-  reduc_chain.safe_push (use_stmt_info);
-
-  lhs = gimple_assign_lhs (loop_use_stmt);
-  size++;
-   }
-
-  if (!found || loop_use_stmt != phi || size < 2)
-return false;
-
-  /* Swap the operands, if needed, to make the reduction operand be the second
- operand.  */
-  lhs = PHI_RESULT (phi);
-  for (unsigned i = 0; i < reduc_chain.length (); ++i)
-{
-  gassign *next_stmt = as_a  (reduc_chain[i]->stmt);
-  if (gimple_assign_rhs2 (next_stmt) == lhs)
-   {
- tree op = gimple_assign_rhs1 (next_stmt);
- stmt_vec_info def_stmt_info = loop_info->lookup_def (op);
-
- /* Check that the other def is either defined in the loop
-("vect_internal_def"), or it's an induction (defined by a
-loop-header phi-node).  */
- if (def_stmt_info
- && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt_info->stmt))
- && vect_valid_reduction_input_p (def_stmt_info))
-   {
- lhs = gimple_assign_lhs (next_stmt);
- continue;
-   }
-
- return false;
-   }
-  else
-   {
- gcc_assert (gimple_assign_rhs1 (next_stmt) == lhs);
- tree op = gimple_assign_rhs2 (next_stmt);
- stmt_vec_info def_stmt_info = loop_info->lookup_def (op);
-
-  /* Check that the other def is either defined in the loop
-("vect_internal_def"), or it's an induction (defined by a
-loop-header phi-node).  */
- if (def_stmt_info
- && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt_info->stmt))
- && vect_valid_reduction_input_p (def_stmt_info))
-   {
- lhs = gimple_assign_lhs (next_st

[PATCH] Fix PR91812

2019-09-19 Thread Richard Biener



Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2019-09-19  Richard Biener  

PR tree-optimization/91812
* tree-ssa-phiprop.c (propagate_with_phi): Do not replace
volatile loads.

* gcc.dg/torture/pr91812.c: New testcase.

Index: gcc/tree-ssa-phiprop.c
===
--- gcc/tree-ssa-phiprop.c  (revision 275698)
+++ gcc/tree-ssa-phiprop.c  (working copy)
@@ -338,8 +338,15 @@ propagate_with_phi (basic_block bb, gphi
&& (!type
|| types_compatible_p
 (TREE_TYPE (gimple_assign_lhs (use_stmt)), type))
-   /* We cannot replace a load that may throw or is volatile.  */
-   && !stmt_can_throw_internal (cfun, use_stmt)))
+   /* We cannot replace a load that may throw or is volatile.
+  For volatiles the transform can change the number of
+  executions if the load is inside a loop but the address
+  computations outside (PR91812).  We could relax this
+  if we guard against that appropriately.  For loads that can
+  throw we could relax things if the moved loads all are
+  known to not throw.  */
+   && !stmt_can_throw_internal (cfun, use_stmt)
+   && !gimple_has_volatile_ops (use_stmt)))
continue;
 
   /* Check if we can move the loads.  The def stmt of the virtual use
Index: gcc/testsuite/gcc.dg/torture/pr91812.c
===
--- gcc/testsuite/gcc.dg/torture/pr91812.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr91812.c  (working copy)
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-optimized-blocks" } */
+
+unsigned register1;
+unsigned register2;
+
+void busy_wait_for_register (int x)
+{
+  volatile unsigned* ptr;
+  switch(x) {
+case 0x:
+ptr = ®ister1;
+break;
+
+case 0x:
+ptr = ®ister2;
+break;
+
+default:
+return;
+  }
+  while (*ptr) {}
+}
+
+/* { dg-final { scan-tree-dump "loop depth 1" "optimized" } } */

Re: [PATCH 5/9] Come up with an abstraction.

2019-09-19 Thread Richard Biener

On Wed, Sep 18, 2019 at 9:56 AM Martin Liška  wrote:
>
> Hello.
>
> Ok, so the current IPA ICF transformation is being blocked by the
> patch 2/9 (about FIELD_DECL). I asked Honza for a help here.
> In the meantime, can you Richi make an opinion about the part 5 which
> is about the interaction in between old operand_equal_p and a new
> hook in IPA ICF?

+static operand_compare default_compare_instance;
+
+/* Conveinece wrapper around operand_compare class because usually we do
+   not need to play with the valueizer.  */
+
+bool
+operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
+{
+  return default_compare_instance.operand_equal_p (arg0, arg1, flags);
+}

can we devirtualize this and thus further clone and devirtualize the
recursions in the default instance?  if not does making the default
instance const help (you need to make the methods const, possibly
a good idea if that works for your ICF use as well?)

+  if (flag_checking && !(flags & OEP_NO_HASH_CHECK))
+{

better keep that in the caller (avoids the virtual call and also
then the function does what it says...).

Otherwise it looks OK.  Note I'd really like to see the
overhead for the regular operand_equal_p calls being zero,
thus devirt is important here - but it should work with
IPA-CP of 'this'?  But maybe the function is too big to
clone :/

Richard.

> Thanks,
> Martin

Re: [PATCH] Speed up qsort in IPA ICF

2019-09-19 Thread Richard Biener

On Thu, Sep 19, 2019 at 11:06 AM Martin Liška  wrote:
>
> Hi.
>
> As Alexander pointed out, the sort_congruence_class_groups_by_decl_uid is the 
> most
> expensive qsort operator in tramp3d compilation. It does unfortunate 7 
> pointer dereferences
> via: DECL_UID (classes[i]->classes[0]->members[0]->decl). I'm suggesting to 
> cache that
> in congruence_class_group.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Since we're populating the classes vector right before you could elide
the new field and do it in the same iteration by having

auto_vec  > classes

and pushing *it, DECL_UID and then sorting by the readily available
UID in the data?  That makes the sort use a linear memory region
w/o dereferences.

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2019-09-19  Martin Liska  
>
> * ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
> (sort_congruence_classes_by_decl_uid): Likewise.
> (sort_congruence_class_groups_by_decl_uid): Use
> congruence_class_group::uid.
> (sem_item_optimizer::merge_classes): Cache
> DECL_UID (classes[i]->classes[0]->members[0]->decl).
> * ipa-icf.h (struct congruence_class_group): New field.
> ---
>  gcc/ipa-icf.c | 29 +
>  gcc/ipa-icf.h |  1 +
>  2 files changed, 6 insertions(+), 24 deletions(-)
>
>

Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-19 Thread Richard Biener

On Thu, Sep 19, 2019 at 3:28 AM Prathamesh Kulkarni
 wrote:
>
> On Wed, 18 Sep 2019 at 22:17, Prathamesh Kulkarni
>  wrote:
> >
> > On Wed, 18 Sep 2019 at 01:46, Richard Biener  
> > wrote:
> > >
> > > On Tue, Sep 17, 2019 at 7:18 PM Wilco Dijkstra  
> > > wrote:
> > > >
> > > > Hi Richard,
> > > >
> > > > > The issue with the bugzilla is that it lacked appropriate testcase(s) 
> > > > > and thus
> > > > > it is now a mess.  There are clear testcases (maybe not in the 
> > > > > benchmarks you
> > > >
> > > > Agreed - it's not clear whether any of the proposed changes would 
> > > > actually
> > > > help the original issue. My patch absolutely does.
> > > >
> > > > > care about) that benefit from code hoisting as enabler, mainly when 
> > > > > control
> > > > > flow can be then converted to data flow.  Also note that "size 
> > > > > optimizations"
> > > > > are important for all cases where followup transforms have size 
> > > > > limits on the IL
> > > > > in place.
> > > >
> > > > The gain from -fcode-hoisting is about 0.2% overall on Thumb-2. Ie. 
> > > > it's definitely
> > > > useful, but there are much larger gains to be had from other tweaks 
> > > > [1]. So we can
> > > > live without it until a better solution is found.
> > >
> > > A "solution" for better eembc benchmark results?
> > >
> > > The issues are all latent even w/o code-hoisting since you can do the
> > > same transform at the source level.  Which is usually why I argue
> > > trying to fix this in code-hoisting is not a complete fix.  Nor is turning
> > > off random GIMPLE passes for specific benchmark regressions.
> > >
> > > Anyway, it's arm maintainers call if you want to have such hacks in
> > > place or not.
> > >
> > > As a release manager I say that GCC isn't a benchmark compiler.
> > >
> > > As the one "responsible" for the code-hoisting introduction I say that
> > > as long as I don't have access to the actual benchmark I can't assess
> > > wrongdoing of the pass nor suggest an appropriate place for optimization.
> > Hi Richard,
> > The actual benchmark function for PR80155 is almost identical to FMS()
> > function defined in
> > pr77445-2.c, and the test-case reproduces the same issue as in the 
> > benchmark.
> Hi,
> The attached patch is another workaround for hoisting. The rationale
> behind the patch is
> to avoid "long range" hoistings  for a "large enough" CFG.

But that isn't a good heuristic.  The issue with coming up with a good heuristic
is that we'd need to evaluate the effect on register pressure which, on GIMPLE
at least, can be only approximated.  The insertion phase of PRE isn't a good
place to do that, the only place you could reasonably decide on this is
the elimination phase which sees the earlier inserted load and increment
and needs to decide whether to replace the later load and increment with
the loaded value.  But in the end this boils down to do scheduling and
register rematerialization on GIMPLE based on the value graph which is
of course possible but not implemented and if implemented is probably
better done as a standalone pass.

> The patch walks dom tree for the block and finds the "furthest" dom
> block, for which
> intersect_p (availout_in_some, AVAIL_OUT (dom_block)) is true. The
> "distance" is measured
> in terms of dom depth.
>
> We can have two params say param_hoist_n_bbs to determine "large enough"
> CFG, and param_hoist_expr_dist, that avoids hoisting if the "distance"
> exceeds the param threshold.
> For the values (hardcoded) in the patch, it "works" for avoiding the
> spill and does not regress ssa-pre-*.c and ssa-hoist-*.c
> tests (the param values could be made target-specific). Does the
> approach look reasonable ?

As said, not really.  First you only see parts of the expression hoisted;
in the testcase it's *transitions + 1 where you first see _1 = *transition
and then _2 = _1 + 1; if it were *transitions + _3 and _3 dies here then
hoisting doesn't make register pressure worse; if it were _4 + _3 and
both operands die here then hoisting decreases register pressure.

On x86 I don't see any spilling so the other question is whether this
is really a register pressure issue?

> My concern with current version is that walking dom tree per block in
> do_hoist_insertion can end up being
> pretty expensive. Any suggestions on how we could speed that up ?

You can probably use the DFS numbers directly?

> Alternatively, we could consider hoisting from "bottom up", one block
> at a time, and keep a map to count number of
> times an expr is hoisted and refuse hoisting the expr further up if it
> exceeds target-specific param threshold ?

Hoisting piggy-backs on the ANTIC data flow solution so that doesn't work
in the current framework.  In fact the current framework relies on top-down
operation to not hoist things one at a time.

Richard.

> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> > >
> > > Richard.
> > >
> > > >
> > > > [1] https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01739

Re: [PATCH][ARM] Enable code hoisting with -Os (PR80155)

2019-09-19 Thread Richard Biener

On Wed, Sep 18, 2019 at 6:47 PM Prathamesh Kulkarni
 wrote:
>
> On Wed, 18 Sep 2019 at 01:46, Richard Biener  
> wrote:
> >
> > On Tue, Sep 17, 2019 at 7:18 PM Wilco Dijkstra  
> > wrote:
> > >
> > > Hi Richard,
> > >
> > > > The issue with the bugzilla is that it lacked appropriate testcase(s) 
> > > > and thus
> > > > it is now a mess.  There are clear testcases (maybe not in the 
> > > > benchmarks you
> > >
> > > Agreed - it's not clear whether any of the proposed changes would actually
> > > help the original issue. My patch absolutely does.
> > >
> > > > care about) that benefit from code hoisting as enabler, mainly when 
> > > > control
> > > > flow can be then converted to data flow.  Also note that "size 
> > > > optimizations"
> > > > are important for all cases where followup transforms have size limits 
> > > > on the IL
> > > > in place.
> > >
> > > The gain from -fcode-hoisting is about 0.2% overall on Thumb-2. Ie. it's 
> > > definitely
> > > useful, but there are much larger gains to be had from other tweaks [1]. 
> > > So we can
> > > live without it until a better solution is found.
> >
> > A "solution" for better eembc benchmark results?
> >
> > The issues are all latent even w/o code-hoisting since you can do the
> > same transform at the source level.  Which is usually why I argue
> > trying to fix this in code-hoisting is not a complete fix.  Nor is turning
> > off random GIMPLE passes for specific benchmark regressions.
> >
> > Anyway, it's arm maintainers call if you want to have such hacks in
> > place or not.
> >
> > As a release manager I say that GCC isn't a benchmark compiler.
> >
> > As the one "responsible" for the code-hoisting introduction I say that
> > as long as I don't have access to the actual benchmark I can't assess
> > wrongdoing of the pass nor suggest an appropriate place for optimization.
> Hi Richard,
> The actual benchmark function for PR80155 is almost identical to FMS()
> function defined in
> pr77445-2.c, and the test-case reproduces the same issue as in the benchmark.

So on that code-hoisting identifies the redudnant
transitions[S1]++;
stmts, but only the load and increment because it doesn't hoist stores.
I think that's ultimately useful and would be even profitable if we manage
to hoist the stores as well.  If you have access to the actual benchmark
you can do this transform in the source and measure the effect.

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> > >
> > > [1] https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01739.html
> > >
> > > Wilco

Re: [PATCH][ARM] Add logical DImode expanders

2019-09-19 Thread Richard Earnshaw (lists)


On 18/09/2019 20:55, Wilco Dijkstra wrote:

Hi Kyrill,


We should be able to "compress" the above 3 patterns into one using code
iterators.


Good point, that makes sense. I've committed this:

ChangeLog:
2019-09-18  Wilco Dijkstra  

PR target/91738
* config/arm/arm.md (di3): Expand explicitly.
(one_cmpldi2): Likewise.
* config/arm/arm.c (const_ok_for_dimode_op): Return true if one
of the constant parts is simple.
* config/arm/iterators.md (LOGICAL): Add new code iterator.
(logical_op): Add new code attribute.
(logical_OP): Likewise.
* config/arm/predicates.md (arm_anddi_operand): Add predicate.
(arm_iordi_operand): Add predicate.
(arm_xordi_operand): Add predicate.


except we can do better than this...




--
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
173e6363682c35aa72a9fa36c14b6324b59e347b..1fc90c62798978ea5abddb11fbf1d7acbc8a8dc3
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -4273,8 +4273,8 @@ const_ok_for_dimode_op (HOST_WIDE_INT i, enum rtx_code 
code)
  case AND:
  case IOR:
  case XOR:
-  return (const_ok_for_op (hi_val, code) || hi_val == 0x)
-  && (const_ok_for_op (lo_val, code) || lo_val == 0x);
+  return const_ok_for_op (hi_val, code) || hi_val == 0x
+|| const_ok_for_op (lo_val, code) || lo_val == 0x;
  case PLUS:
return arm_not_operand (hi, SImode) && arm_add_operand (lo, SImode);
  
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md

index 
e23683114087f6cc9ee78376529da97cfe31d3a6..3943c4252b272d30f88f265e90ebc4cb88e3a615
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2176,6 +2176,49 @@ (define_expand "divdf3"
"")
  
  
+; Expand logical operations.  The mid-end expander does not split off memory

+; operands or complex immediates, which leads to fewer LDRD/STRD instructions.
+; So an explicit expander is needed to generate better code.
+
+(define_expand "di3"


(see below).  With that change, this just becomes di3


+  [(set (match_operand:DI0 "s_register_operand")
+   (LOGICAL:DI (match_operand:DI 1 "s_register_operand")
+   (match_operand:DI 2 "arm_di_operand")))]
+  "TARGET_32BIT"
+  {
+  rtx low  = simplify_gen_binary (, SImode,
+ gen_lowpart (SImode, operands[1]),
+ gen_lowpart (SImode, operands[2]));
+  rtx high = simplify_gen_binary (, SImode,
+ gen_highpart (SImode, operands[1]),
+ gen_highpart_mode (SImode, DImode,
+operands[2]));
+
+  emit_insn (gen_rtx_SET (gen_lowpart (SImode, operands[0]), low));
+  emit_insn (gen_rtx_SET (gen_highpart (SImode, operands[0]), high));
+  DONE;
+  }
+)
+
+(define_expand "one_cmpldi2"
+  [(set (match_operand:DI 0 "s_register_operand")
+   (not:DI (match_operand:DI 1 "s_register_operand")))]
+  "TARGET_32BIT"
+  {
+  rtx low  = simplify_gen_unary (NOT, SImode,
+gen_lowpart (SImode, operands[1]),
+SImode);
+  rtx high = simplify_gen_unary (NOT, SImode,
+gen_highpart_mode (SImode, DImode,
+   operands[1]),
+SImode);
+
+  emit_insn (gen_rtx_SET (gen_lowpart (SImode, operands[0]), low));
+  emit_insn (gen_rtx_SET (gen_highpart (SImode, operands[0]), high));
+  DONE;
+  }
+)
+
  ;; Split DImode and, ior, xor operations.  Simply perform the logical
  ;; operation on the upper and lower halves of the registers.
  ;; This is needed for atomic operations in arm_split_atomic_op.
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 
fa6f0c0529d5364b1e1df705cb1029868578e38c..20fd96cb0445fcdf821c7c72f2dd30bae8590d0c
 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -239,6 +239,8 @@ (define_code_iterator COMPARISONS [eq gt ge le lt])
  ;; A list of ...
  (define_code_iterator IOR_XOR [ior xor])
  
+(define_code_iterator LOGICAL [and ior xor])

+
  ;; Operations on two halves of a quadword vector.
  (define_code_iterator VQH_OPS [plus smin smax umin umax])
  
@@ -285,6 +287,9 @@ (define_code_attr cmp_type [(eq "i") (gt "s") (ge "s") (lt "s") (le "s")])
  
  (define_code_attr vfml_op [(plus "a") (minus "s")])
  
+(define_code_attr logical_op [(ior "ior") (xor "xor") (and "and")])


These should just be added to the existing 'optab' attribute, there's no 
need for an additional attribute.



+(define_code_attr logical_OP [(ior "IOR") (xor "XOR") (and "AND")])


You don't need this, just use  where you want this substitution.


+
  ;;---

Re: [PATCH 2/4] New parameter manipulation infrastructure

2019-09-19 Thread Martin Jambor

Hi,

On Fri, Sep 13 2019, Jan Hubicka wrote:
>> 2019-08-20  Martin Jambor  
>> 
>> * Makefile.in (GTFILES): Added ipa-param-manipulation.h.
>> * cgraph.h (ipa_replace_map): Removed fields old_tree, replace_p
>> and ref_p, added fields param_adjustments and performed_splits.
>> (struct cgraph_clone_info): Remove ags_to_skip and
>> combined_args_to_skip, new field param_adjustments.
>> (cgraph_node::create_clone): Changed parameters to use
>> ipa_param_adjustments.
>> (cgraph_node::create_virtual_clone): Likewise.
>> (cgraph_node::create_virtual_clone_with_body): Likewise.
>> (tree_function_versioning): Likewise.
>> (cgraph_build_function_type_skip_args): Removed.
>> * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Convert to
>> using ipa_param_adjustments.
>> (clone_of_p): Likewise.
>> * cgraphclones.c (cgraph_build_function_type_skip_args): Removed.
>> (build_function_decl_skip_args): Likewise.
>> (duplicate_thunk_for_node): Adjust parameters using
>> ipa_param_body_adjustments, copy param_adjustments instead of
>> args_to_skip.
>> (cgraph_node::create_clone): Convert to using ipa_param_adjustments.
>> (cgraph_node::create_virtual_clone): Likewise.
>> (cgraph_node::create_version_clone_with_body): Likewise.
>> (cgraph_materialize_clone): Likewise.
>> (symbol_table::materialize_all_clones): Likewise.
>> * coretypes.h (cgraph_edge): Declare.
>> * ipa-cp.c (get_replacement_map): Do not initialize removed fields.
>> (initialize_node_lattices): Make aware that some parameters might 
>> have
>> already been removed.
>> (want_remove_some_param_p): New function.
>> (create_specialized_node): Convert to using ipa_param_adjustments and
>> deal with possibly pre-existing adjustments.
>> * ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Simplify
>> ipa_replace_map check.
>> * ipa-inline-transform.c (save_inline_function_body): Update to
>> refelct new tree_function_versioning signature.
>> * ipa-param-manipulation.c: Rewrite.
>> * ipa-param-manipulation.h: Likewise.
>> * ipa-prop.c (adjust_agg_replacement_values): Use a helper from
>> ipa_param_adjustments to get current parameter indices.
>> (ipcp_modif_dom_walker::before_dom_children): Likewise.
>> (ipcp_update_bits): Likewise.
>> (ipcp_update_vr): Likewise.
>> * ipa-split.c (split_function): Convert to using 
>> ipa_param_adjustments.
>> * lto-cgraph.c (output_cgraph_opt_summary_p): Likewise.
>> (output_node_opt_summary): Do not stream removed fields.  Stream
>> parameter adjustments instead of argumetns to skip.
>> (input_node_opt_summary): Likewise.
>> (input_node_opt_summary): Likewise.
>> * multiple_target.c (create_target_clone): Update to reflet new type
>> of create_version_clone_with_body.
>> * omp-simd-clone.c (simd_clone_vector_of_formal_parm_types): Adjust
>> for the new interface.
>> (simd_clone_clauses_extract): Likewise, make args an auto_vec.
>> (simd_clone_compute_base_data_type): Likewise.
>> (simd_clone_init_simd_arrays): Adjust for the new interface.
>> (simd_clone_adjust_argument_types): Likewise.
>> (struct modify_stmt_info): Likewise.
>> (ipa_simd_modify_stmt_ops): Likewise.
>> (ipa_simd_modify_function_body): Likewise.
>> (simd_clone_adjust): Likewise.
>> * trans-mem.c (ipa_tm_create_version): Update to reflect new type of
>> tree_function_versioning.
>> * tree-inline.h (copy_body_data): New fields killed_new_ssa_names and
>> param_body_adjs.
>> (copy_decl_to_var): Declare.
>> * tree-inline.c (update_clone_info): Do not remap old_tree.
>> (remap_gimple_stmt): Use ipa_param_body_adjustments to modify gimple
>> statements, walk all extra generated statements and remap their
>> operands.
>> (redirect_all_calls): Add killed SSA names to a hash set.
>> (remap_ssa_name): Do not remap killed SSA names.
>> (copy_arguments_for_versioning): Renames to copy_arguments_nochange,
>> half of functionality moved to ipa_param_body_adjustments.
>> (copy_decl_to_var): Make exported.
>> (copy_body): Destroy killed_new_ssa_names hash set.
>> (expand_call_inline): Remap performed splits.
>> (update_clone_info): Likewise.
>> (tree_function_versioning): Simplify tree_map processing.  Updated to
>> accept ipa_param_adjustments and use ipa_param_body_adjustments.
>
> OK

Thanks!

>> +/* Modify actual arguments of a function call in statement STMT, assuming it
>> +   calls CALLEE_DECL.  CALLER_ADJ must be the description of parameter
>> +   ad

Re: [PATCH][ARM] Remove support for MULS

2019-09-19 Thread Richard Earnshaw (lists)


On 18/09/2019 17:31, Kyrill Tkachov wrote:

Hi Wilco,

On 9/9/19 6:07 PM, Wilco Dijkstra wrote:

ping


Remove various MULS/MLAS patterns which are enabled when optimizing for
 size.  However the codesize gain from these patterns is so minimal that
 there is no point in keeping them.

I disagree. If they still trigger and generate better code than without 
we should keep them.


What kind of code is *common* varies greatly from user to user.


Also, the main reason for restricting their use was that in the 'olden 
days', when we had multi-cycle implementations of the multiply 
instructions with short-circuit fast termination when the result was 
completed, the flag setting variants would never short-circuit.


These days we have fixed cycle counts for multiply instructions, so this 
is no-longer a penalty.  In the thumb2 case in particular we can often 
reduce mul-cmp (6 bytes) to muls (2 bytes), that's a 66% saving on this 
sequence and definitely worth exploiting when we can, even if it's not 
all that common.


In fact, give the latest architectural implementations, I think we 
should look again at re-enabling these when doing normal optimization as 
well.


R.



Thanks,

Kyrill



 Bootstrap OK on armhf, regress passes.

 ChangeLog:
 2019-09-03  Wilco Dijkstra  

 * config/arm/arm.md (mulsi3_compare0): Remove pattern.
 (mulsi3_compare0_v6): Likewise.
 (mulsi_compare0_scratch): Likewise.
 (mulsi_compare0_scratch_v6): Likewise.
 (mulsi3addsi_compare0): Likewise.
 (mulsi3addsi_compare0_v6): Likewise.
 (mulsi3addsi_compare0_scratch): Likewise.
 (mulsi3addsi_compare0_scratch_v6): Likewise.
 * config/arm/thumb2.md (thumb2_mulsi_short_compare0): Remove 
pattern.

 (thumb2_mulsi_short_compare0_scratch): Likewise.

 --
 diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
 index 
738d42fd164f117f1dec1108a824d984ccd70d09..66dafdc47b7cfc37c131764e482d47bcaab90538 
100644

 --- a/gcc/config/arm/arm.md
 +++ b/gcc/config/arm/arm.md
 @@ -1618,60 +1618,6 @@ (define_insn "*arm_mulsi3_v6"
 (set_attr "predicable_short_it" "yes,yes,no")]
  )

 -(define_insn "*mulsi3_compare0"
 -  [(set (reg:CC_NOOV CC_REGNUM)
 -   (compare:CC_NOOV (mult:SI
 - (match_operand:SI 2 "s_register_operand" 
"r,r")
 - (match_operand:SI 1 "s_register_operand" 
"%0,r"))

 -    (const_int 0)))
 -   (set (match_operand:SI 0 "s_register_operand" "=&r,&r")
 -   (mult:SI (match_dup 2) (match_dup 1)))]
 -  "TARGET_ARM && !arm_arch6"
 -  "muls%?\\t%0, %2, %1"
 -  [(set_attr "conds" "set")
 -   (set_attr "type" "muls")]
 -)
 -
 -(define_insn "*mulsi3_compare0_v6"
 -  [(set (reg:CC_NOOV CC_REGNUM)
 -   (compare:CC_NOOV (mult:SI
 - (match_operand:SI 2 "s_register_operand" "r")
 - (match_operand:SI 1 "s_register_operand" "r"))
 -    (const_int 0)))
 -   (set (match_operand:SI 0 "s_register_operand" "=r")
 -   (mult:SI (match_dup 2) (match_dup 1)))]
 -  "TARGET_ARM && arm_arch6 && optimize_size"
 -  "muls%?\\t%0, %2, %1"
 -  [(set_attr "conds" "set")
 -   (set_attr "type" "muls")]
 -)
 -
 -(define_insn "*mulsi_compare0_scratch"
 -  [(set (reg:CC_NOOV CC_REGNUM)
 -   (compare:CC_NOOV (mult:SI
 - (match_operand:SI 2 "s_register_operand" 
"r,r")
 - (match_operand:SI 1 "s_register_operand" 
"%0,r"))

 -    (const_int 0)))
 -   (clobber (match_scratch:SI 0 "=&r,&r"))]
 -  "TARGET_ARM && !arm_arch6"
 -  "muls%?\\t%0, %2, %1"
 -  [(set_attr "conds" "set")
 -   (set_attr "type" "muls")]
 -)
 -
 -(define_insn "*mulsi_compare0_scratch_v6"
 -  [(set (reg:CC_NOOV CC_REGNUM)
 -   (compare:CC_NOOV (mult:SI
 - (match_operand:SI 2 "s_register_operand" "r")
 - (match_operand:SI 1 "s_register_operand" "r"))
 -    (const_int 0)))
 -   (clobber (match_scratch:SI 0 "=r"))]
 -  "TARGET_ARM && arm_arch6 && optimize_size"
 -  "muls%?\\t%0, %2, %1"
 -  [(set_attr "conds" "set")
 -   (set_attr "type" "muls")]
 -)
 -
  ;; Unnamed templates to match MLA instruction.

  (define_insn "*mulsi3addsi"
 @@ -1698,70 +1644,6 @@ (define_insn "*mulsi3addsi_v6"
 (set_attr "predicable" "yes")]
  )

 -(define_insn "*mulsi3addsi_compare0"
 -  [(set (reg:CC_NOOV CC_REGNUM)
 -   (compare:CC_NOOV
 -    (plus:SI (mult:SI
 -  (match_operand:SI 2 "s_register_operand" "r,r,r,r")
 -  (match_operand:SI 1 "s_register_operand" "%0,r,0,r"))
 - (match_operand:SI 3 "s_register_operand" "r,r,0,0"))
 -    (const_int 0)))
 -   (set (match_operand:SI 0 "s_register_operand" "=&r,&r,&r,&r")
 -   (plus:SI (mult:SI (match_dup 2) (match_dup 1))
 -    (match_dup 3)))]
 -  "TARGET_ARM && arm_arch6"
 -  "mlas%?\\t%0, %2, %1, %3"
 -  [(set_attr "conds" "set")
 -   (set

Re: [PATCH] Remove vectorizer reduction operand swapping

2019-09-19 Thread Richard Biener

On Thu, 19 Sep 2019, Richard Sandiford wrote:

> Richard Biener  writes:
> > It shouldn't be neccessary.
> 
> SVE is the counter-example :-)  But the fix is simpler than the code
> you removed, so it's still a net win.

Yeah, I meant it shouldn't be necessary to swap operands of the
original scalar stmts since we keep track of the "order" via
reduc_index.

> Fixes various vect.exp and aarch64-sve.exp ICEs.  OK if it passes
> proper testing?

OK.

Richard.

> Thanks,
> Richard
> 
> 
> 2019-09-19  Richard Sandiford  
> 
> gcc/
>   * tree-vectorizer.h (vectorizable_condition): Take an int
>   reduction index instead of a boolean flag.
>   * tree-vect-stmts.c (vectorizable_condition): Likewise.
>   Swap the "then" and "else" values for EXTRACT_LAST_REDUCTION
>   reductions if the reduction accumulator is the "then" rather
>   than the "else" value.
>   (vect_analyze_stmt): Update call accordingly.
>   (vect_transform_stmt): Likewise.
>   * tree-vect-loop.c (vectorizable_reduction): Likewise,
>   asserting that the index is > 0.
> 
> Index: gcc/tree-vectorizer.h
> ===
> --- gcc/tree-vectorizer.h 2019-09-05 08:49:30.717740417 +0100
> +++ gcc/tree-vectorizer.h 2019-09-19 08:43:53.965866883 +0100
> @@ -1541,7 +1541,7 @@ extern void vect_remove_stores (stmt_vec
>  extern opt_result vect_analyze_stmt (stmt_vec_info, bool *, slp_tree,
>slp_instance, stmt_vector_for_cost *);
>  extern bool vectorizable_condition (stmt_vec_info, gimple_stmt_iterator *,
> - stmt_vec_info *, bool, slp_tree,
> + stmt_vec_info *, int, slp_tree,
>   stmt_vector_for_cost *);
>  extern bool vectorizable_shift (stmt_vec_info, gimple_stmt_iterator *,
>   stmt_vec_info *, slp_tree,
> Index: gcc/tree-vect-stmts.c
> ===
> --- gcc/tree-vect-stmts.c 2019-09-17 15:27:13.090053952 +0100
> +++ gcc/tree-vect-stmts.c 2019-09-19 08:43:53.965866883 +0100
> @@ -9778,7 +9778,7 @@ vect_is_simple_cond (tree cond, vec_info
>  
>  bool
>  vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
> - stmt_vec_info *vec_stmt, bool for_reduction,
> + stmt_vec_info *vec_stmt, int reduc_index,
>   slp_tree slp_node, stmt_vector_for_cost *cost_vec)
>  {
>vec_info *vinfo = stmt_info->vinfo;
> @@ -9807,6 +9807,7 @@ vectorizable_condition (stmt_vec_info st
>vec vec_oprnds3 = vNULL;
>tree vec_cmp_type;
>bool masked = false;
> +  bool for_reduction = (reduc_index > 0);
>  
>if (for_reduction && STMT_SLP_TYPE (stmt_info))
>  return false;
> @@ -9888,6 +9889,29 @@ vectorizable_condition (stmt_vec_info st
>cond_expr1 = TREE_OPERAND (cond_expr, 1);
>  }
>  
> +  /* For conditional reductions, the "then" value needs to be the candidate
> + value calculated by this iteration while the "else" value needs to be
> + the result carried over from previous iterations.  If the COND_EXPR
> + is the other way around, we need to swap it.  */
> +  bool must_invert_cmp_result = false;
> +  if (reduction_type == EXTRACT_LAST_REDUCTION && reduc_index == 1)
> +{
> +  if (masked)
> + must_invert_cmp_result = true;
> +  else
> + {
> +   bool honor_nans = HONOR_NANS (TREE_TYPE (cond_expr0));
> +   tree_code new_code = invert_tree_comparison (cond_code, honor_nans);
> +   if (new_code == ERROR_MARK)
> + must_invert_cmp_result = true;
> +   else
> + cond_code = new_code;
> + }
> +  /* Make sure we don't accidentally use the old condition.  */
> +  cond_expr = NULL_TREE;
> +  std::swap (then_clause, else_clause);
> +}
> +
>if (!masked && VECTOR_BOOLEAN_TYPE_P (comp_vectype))
>  {
>/* Boolean values may have another representation in vectors
> @@ -10102,6 +10126,15 @@ vectorizable_condition (stmt_vec_info st
> vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
> vec_compare = vec_compare_name;
>   }
> +   if (must_invert_cmp_result)
> + {
> +   tree vec_compare_name = make_ssa_name (vec_cmp_type);
> +   gassign *new_stmt = gimple_build_assign (vec_compare_name,
> +BIT_NOT_EXPR,
> +vec_compare);
> +   vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
> +   vec_compare = vec_compare_name;
> + }
> gcall *new_stmt = gimple_build_call_internal
>   (IFN_FOLD_EXTRACT_LAST, 3, else_clause, vec_compare,
>vec_then_clause);
> @@ -10635,7 +10668,7 @@ vect_analyze_stmt (stmt_vec_info

[PATCH] Speed up qsort in IPA ICF

2019-09-19 Thread Martin Liška

Hi.

As Alexander pointed out, the sort_congruence_class_groups_by_decl_uid is the 
most
expensive qsort operator in tramp3d compilation. It does unfortunate 7 pointer 
dereferences
via: DECL_UID (classes[i]->classes[0]->members[0]->decl). I'm suggesting to 
cache that
in congruence_class_group.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-09-19  Martin Liska  

* ipa-icf.c (sort_sem_items_by_decl_uid): Simplify comparator.
(sort_congruence_classes_by_decl_uid): Likewise.
(sort_congruence_class_groups_by_decl_uid): Use
congruence_class_group::uid.
(sem_item_optimizer::merge_classes): Cache
DECL_UID (classes[i]->classes[0]->members[0]->decl).
* ipa-icf.h (struct congruence_class_group): New field.
---
 gcc/ipa-icf.c | 29 +
 gcc/ipa-icf.h |  1 +
 2 files changed, 6 insertions(+), 24 deletions(-)


diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index c9c3cb4a331..d78635ad6b3 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -3350,13 +3350,7 @@ sort_sem_items_by_decl_uid (const void *a, const void *b)
 
   int uid1 = DECL_UID (i1->decl);
   int uid2 = DECL_UID (i2->decl);
-
-  if (uid1 < uid2)
-return -1;
-  else if (uid1 > uid2)
-return 1;
-  else
-return 0;
+  return uid1 - uid2;
 }
 
 /* Sort pair of congruence_classes A and B by DECL_UID of the first member.  */
@@ -3369,13 +3363,7 @@ sort_congruence_classes_by_decl_uid (const void *a, const void *b)
 
   int uid1 = DECL_UID (c1->members[0]->decl);
   int uid2 = DECL_UID (c2->members[0]->decl);
-
-  if (uid1 < uid2)
-return -1;
-  else if (uid1 > uid2)
-return 1;
-  else
-return 0;
+  return uid1 - uid2;
 }
 
 /* Sort pair of congruence_class_groups A and B by
@@ -3388,16 +3376,7 @@ sort_congruence_class_groups_by_decl_uid (const void *a, const void *b)
 = *(const congruence_class_group * const *)a;
   const congruence_class_group *g2
 = *(const congruence_class_group * const *)b;
-
-  int uid1 = DECL_UID (g1->classes[0]->members[0]->decl);
-  int uid2 = DECL_UID (g2->classes[0]->members[0]->decl);
-
-  if (uid1 < uid2)
-return -1;
-  else if (uid1 > uid2)
-return 1;
-  else
-return 0;
+  return g1->uid - g2->uid;
 }
 
 /* After reduction is done, we can declare all items in a group
@@ -3450,6 +3429,8 @@ sem_item_optimizer::merge_classes (unsigned int prev_class_count)
it != m_classes.end (); ++it)
 classes.quick_push (*it);
 
+  for (unsigned i = 0; i < classes.length (); i++)
+classes[i]->uid = DECL_UID (classes[i]->classes[0]->members[0]->decl);
   classes.qsort (sort_congruence_class_groups_by_decl_uid);
 
   if (dump_file)
diff --git a/gcc/ipa-icf.h b/gcc/ipa-icf.h
index 2bf0f156ef6..e7bb4afcfee 100644
--- a/gcc/ipa-icf.h
+++ b/gcc/ipa-icf.h
@@ -461,6 +461,7 @@ struct congruence_class_group
 {
   hashval_t hash;
   sem_item_type type;
+  int uid;
   vec  classes;
 };

Re: Rework constant subreg folds and handle more variable-length cases

2019-09-19 Thread Richard Sandiford

Jeff Law  writes:
> On 7/12/19 1:44 AM, Richard Sandiford wrote:
>> Richard Sandiford  writes:
>>> This patch rewrites the way simplify_subreg handles constants.
>>> It uses similar native_encode/native_decode routines to the
>>> tree-level handling of VIEW_CONVERT_EXPR, meaning that we can
>>> move between rtx constants and the target memory image of them.
>>>
>>> The main point of this patch is to support subregs of constant-length
>>> vectors for VLA vectors, beyond the very simple cases that were already
>>> handled.  Many of the new tests failed before the patch for variable-
>>> length vectors.
>>>
>>> The boolean side is tested more by the upcoming SVE ACLE work.
>>>
>>> Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
>>> OK to install?
>> I made a last-minute change after testing, to use uintNN_t types
>> for target_unit rather than the original unsigned char/short/int.
>> Of course, that doesn't survive a libgcc build since 
>> isn't included there.
>> 
>> Fixed below, and posted as tested this time.
>> 
>> Richard
>> 
>> 
>> 2019-07-12  Richard Sandiford  
>> 
>> gcc/
>>  * defaults.h (TARGET_UNIT): New macro.
>>  (target_unit): New type.
>>  * rtl.h (native_encode_rtx, native_decode_rtx)
>>  (native_decode_vector_rtx, subreg_size_lsb): Declare.
>>  (subreg_lsb_1): Turn into an inline wrapper around subreg_size_lsb.
>>  * rtlanal.c (subreg_lsb_1): Delete.
>>  (subreg_size_lsb): New function.
>>  * simplify-rtx.c: Include rtx-vector-builder.h
>>  (simplify_immed_subreg): Delete.
>>  (native_encode_rtx, native_decode_vector_rtx, native_decode_rtx)
>>  (simplify_const_vector_byte_offset, simplify_const_vector_subreg): New
>>  functions.
>>  (simplify_subreg): Use them.
>>  (test_vector_subregs_modes, test_vector_subregs_repeating)
>>  (test_vector_subregs_fore_back, test_vector_subregs_stepped)
>>  (test_vector_subregs): New functions.
>>  (test_vector_ops): Call test_vector_subregs for integer vector
>>  modes with at least 2 elements.
> This just turns out to be amazingly painful to work through and I don't
> particularly see any good breakdown which would make it obvious where
> the behavioral changes are vs just refactoring.
>
> Given your long history with GCC and your expertise in RTL as well as
> the SVE space I'm inclined to say go for it and we'll cope with any fallout.

Thanks.  Here's what I (very) belatedly applied.  Further cross-target
testing showed I needed some tweaks:

(1) Keep:

  /* Some ports misuse CCmode.  */
  if (GET_MODE_CLASS (outermode) == MODE_CC && CONST_INT_P (op))
return op;

which unfortunately is still needed.

(2) The old version filled the undefined upper bits of a paradoxical
subreg with zeros, but some ports expected it to be sign-extended
for integers.

(3) In the self tests, skip over non-IEEE floating-point modes,
since bitcasting to and from others can drop bits.

Richard


2019-09-19  Richard Sandiford  

gcc/
* defaults.h (TARGET_UNIT): New macro.
(target_unit): New type.
* rtl.h (native_encode_rtx, native_decode_rtx)
(native_decode_vector_rtx, subreg_size_lsb): Declare.
(subreg_lsb_1): Turn into an inline wrapper around subreg_size_lsb.
* rtlanal.c (subreg_lsb_1): Delete.
(subreg_size_lsb): New function.
* simplify-rtx.c: Include rtx-vector-builder.h
(simplify_immed_subreg): Delete.
(native_encode_rtx, native_decode_vector_rtx, native_decode_rtx)
(simplify_const_vector_byte_offset, simplify_const_vector_subreg): New
functions.
(simplify_subreg): Use them.
(test_vector_subregs_modes, test_vector_subregs_repeating)
(test_vector_subregs_fore_back, test_vector_subregs_stepped)
(test_vector_subregs): New functions.
(test_vector_ops): Call test_vector_subregs for integer vector
modes with at least 2 elements.

Index: gcc/defaults.h
===
*** gcc/defaults.h  2019-07-12 08:53:06.0 +0100
--- gcc/defaults.h  2019-09-19 09:56:43.873352025 +0100
*** #define TARGET_VTABLE_USES_DESCRIPTORS 0
*** 1459,1462 
--- 1459,1476 
  #define DWARF_GNAT_ENCODINGS_DEFAULT DWARF_GNAT_ENCODINGS_GDB
  #endif
  
+ #ifndef USED_FOR_TARGET
+ /* Done this way to keep gengtype happy.  */
+ #if BITS_PER_UNIT == 8
+ #define TARGET_UNIT uint8_t
+ #elif BITS_PER_UNIT == 16
+ #define TARGET_UNIT uint16_t
+ #elif BITS_PER_UNIT == 32
+ #define TARGET_UNIT uint32_t
+ #else
+ #error Unknown BITS_PER_UNIT
+ #endif
+ typedef TARGET_UNIT target_unit;
+ #endif
+ 
  #endif  /* ! GCC_DEFAULTS_H */
Index: gcc/rtl.h
===
*** gcc/rtl.h   2019-09-12 10:52:56.0 +0100
--- gcc/rtl.h   2019-09-19 09:56:43.877351995 +0100
*** extern int rtx_cost (rtx, machine_mode,
*** 2406,

Re: [patch, testsuite, arm] Fix ICE in gcc.dg/gimplefe-28.c

2019-09-19 Thread Kyrill Tkachov

Hi Sandra, Mike,

On 9/18/19 7:14 PM, Mike Stump wrote:
On Sep 13, 2019, at 12:06 PM, Sandra Loosemore 
 wrote:

>
> For the default multilib on arm-none-eabi, gcc.dg/gimplefe-28 has 
been getting an ICE because, while the target-supports infrastructure 
is probing to see if it can add the command-line options to enable the 
sqrt insn ("-mfpu=vfp -mfloat-abi=softfp"), it is not actually adding 
those options when building this testcase.  :-S  The hook to do this 
is already there; it just needs a case for arm.

>
> OK to commit?

Ok.

Hum, usually the arm people are so responsive.  I don't think I've 
seen a review of this, so when they don't, I will.

General note, I do prefer the target folk chime in on such patches 
instead of me, as there can be subtle target things that target folks 
track better than I and arm is one of those targets with so many 
wonder and subtle things.  :-)

I'm sorry for this. It slipped through the cracks among the big arm 
patch series recently (DImode cleanups and FDPIC).

In the future, putting the maintainers on CC in the submission helps, at 
least for me, to make it more visible.

As for the patch...

arm-sqrt.log

2019-09-13  Sandra Loosemore

gcc/testsuite/
* lib/target-supports.exp (add_options_for_sqrt_insn): Add
arm options consistent with check_effective_target_arm_vfp_ok.

arm-sqrt.patch

Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 275699)
+++ gcc/testsuite/lib/target-supports.exp   (working copy)
@@ -6670,6 +6670,9 @@ proc add_options_for_sqrt_insn { flags }
 if { [istarget amdgcn*-*-*] } {
return "$flags -ffast-math"
 }
+if { [istarget arm*-*-*] } {
+   return "$flags -mfpu=vfp -mfloat-abi=softfp"
+}
 return $flags
 }

I tend to think it's a bit cleaner to use one of the add_options_for_* helpers 
we have that know the right float-abi and mfpu options to pass.
Would using add_options_for_arm_fp or add_options_for_arm_vfp3 work here?

Thanks,
Kyrill

[PATCH][RFC] Add new ipa-reorder pass

2019-09-19 Thread Martin Liška

Hi.

Function reordering has been around for quite some time and a naive
implementation was also part of my diploma thesis some time ago.
Currently, the GCC can reorder function based on first execution, which
happens with PGO and LTO of course. Known limitation is that the order
is preserved only partially as various symbols go into different LTRANS 
partitions.

There has been some research in the area and I would point out the Facebook 
paper
([1]) and Sony presentation ([2]). Based on that, I decided to make a new 
implementation
in the GCC that does the same (in a proper way). First part of the enablement 
are patches
to ld.bfd and ld.gold that come up with a new section .text.sorted, that is 
always sorted.
There's a snippet from the modified default linker script:

  .text   :
  {
*(.text.unlikely .text.*_unlikely .text.unlikely.*)
*(.text.exit .text.exit.*)
*(.text.startup .text.startup.*)
*(.text.hot .text.hot.*)
*(SORT(.text.sorted.*))
*(.text .stub .text.* .gnu.linkonce.t.*)
/* .gnu.warning sections are handled specially by elf.em.  */
*(.gnu.warning)
  }

Second part is the reordering pass where I implemented both approaches:
the existing one (-freorder-functions-algorithm=first-run) and the new one
based on the Facebook paper 
(-freorder-functions-algorithm=call-chain-clustering).
The implementation is straightforward and utilizes the newly introduced text 
subsection.

Results:
I made a PGO LTO bootstrap of the GCC with the patch and 
-freorder-functions-algorithm=call-chain-clustering.
The pass makes sorting order for about 7500 functions and from these about 7300 
are really put in the desired
order. The remaining are the functions that are later observed as cold and are 
eventually put into .text.unlikely
section. I verified the order with nm (sorted by address) and my 
-fdump-ipa-reorder dump. The aforementioned
sorted function occupy 5.0MB (out of 18.3MB) of the final binary of cc1plus. I 
made a simple benchmark of:

./cc1plus /home/marxin/Programming/tramp3d/tramp3d-v4.ii -O2 -quiet and run it 
20x before and after my patch:
run #020: old:15.033 s, new: 14.975 s, cmp: 99.61%

The numbers are stable and present a speed up of about 0.4%. To verify that I 
also measured iTLB misses
before:

18,475,504  iTLB-load-misses:u
and after:
13,779,347  iTLB-load-misses:u

Both these numbers are very stable based on my measurements. That said, I 
believe the patch is correct
and can help in huge binaries that have a flat profile.

Notes and limitations:
- The call-chain-clustering algorithm requires to fit as many as possible 
functions into page size (4K).
  Based on my measurements that should correspond to ~1000 GIMPLE statements 
(IPA inliner size). I can
  make it a param in the future.
- One needs modified binutils and I that would probably require a configure 
detection. The only way
  which I see is based on ld --version. I'm planning to make the binutils 
submission soon.
- Right now, I prefer to put functions into .text.sorted.* rather than 
.text.hot. That's required by the
  algorithm to put hot function calls next to each other.
- Observation: Very hot calls are likely inlined and so that the clustering 
reached quite big functions.

Thoughts? I would definitely welcome any interesting measurement on a bigger 
load.

Martin

[1] https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf
[2] 
https://llvm.org/devmtg/2018-10/slides/Spencer-Profile%20Guided%20Function%20Layout%20in%20LLVM%20and%20LLD.pdf
>From 69aa0fd381d8a1362f68812a70c9ccdeaa1e51b0 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 18 Sep 2019 13:23:34 +0200
Subject: [PATCH] Introduce new .text.sorted.* sections.

---
 gold/layout.cc| 9 ++---
 ld/scripttempl/arclinux.sc| 1 +
 ld/scripttempl/elf.sc | 1 +
 ld/scripttempl/elf64bpf.sc| 1 +
 ld/scripttempl/nds32elf.sc| 1 +
 ld/testsuite/ld-arm/arm-no-rel-plt.ld | 1 +
 ld/testsuite/ld-arm/fdpic-main.ld | 1 +
 ld/testsuite/ld-arm/fdpic-shared.ld   | 1 +
 8 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/gold/layout.cc b/gold/layout.cc
index fc7cdf8b8b..b3a668544f 100644
--- a/gold/layout.cc
+++ b/gold/layout.cc
@@ -1131,12 +1131,15 @@ Layout::special_ordering_of_input_section(const char* name)
 ".text.hot"
   };
 
-  for (size_t i = 0;
-   i < sizeof(text_section_sort) / sizeof(text_section_sort[0]);
-   i++)
+  unsigned scount = sizeof(text_section_sort) / sizeof(text_section_sort[0]);
+  for (size_t i = 0; i < scount; i++)
 if (is_prefix_of(text_section_sort[i], name))
   return i;
 
+  int order;
+  if (sscanf (name, ".text.sorted.%d", &order) == 1)
+return order + scount;
+
   return -1;
 }
 
diff --git a/ld/scripttempl/arclinux.sc b/ld/scripttempl/arclinux.sc
index e13969ede0..41e8ccdf1b 100644
--- a/ld/scripttempl/arclinux.sc
+++ b/ld/scripttempl/arclinux.sc
@@ -491,6 +491,7 @@ cat

Re: [PATCH][ARM] Cleanup multiply patterns

2019-09-19 Thread Kyrill Tkachov




On 9/18/19 7:19 PM, Wilco Dijkstra wrote:

Hi Kyrill,


  + (mult:SI (match_operand:SI 3 "s_register_operand" "r")
  +  (match_operand:SI 2 "s_register_operand" "r"]

Looks like we'll want to mark operand 2 here with '%' as well?

That doesn't make any difference since both operands are identical.
It only helps if the operands are not the same, eg. rather than
"r,0" "0,r" you can write "%r" "0".


Right, I remember that now.

The patch is ok then.

Thanks,

Kyrill




Wilco

Re: [PATCH] Remove vectorizer reduction operand swapping

2019-09-19 Thread Richard Sandiford

Richard Biener  writes:
> It shouldn't be neccessary.

SVE is the counter-example :-)  But the fix is simpler than the code
you removed, so it's still a net win.

Fixes various vect.exp and aarch64-sve.exp ICEs.  OK if it passes
proper testing?

Thanks,
Richard


2019-09-19  Richard Sandiford  

gcc/
* tree-vectorizer.h (vectorizable_condition): Take an int
reduction index instead of a boolean flag.
* tree-vect-stmts.c (vectorizable_condition): Likewise.
Swap the "then" and "else" values for EXTRACT_LAST_REDUCTION
reductions if the reduction accumulator is the "then" rather
than the "else" value.
(vect_analyze_stmt): Update call accordingly.
(vect_transform_stmt): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Likewise,
asserting that the index is > 0.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2019-09-05 08:49:30.717740417 +0100
+++ gcc/tree-vectorizer.h   2019-09-19 08:43:53.965866883 +0100
@@ -1541,7 +1541,7 @@ extern void vect_remove_stores (stmt_vec
 extern opt_result vect_analyze_stmt (stmt_vec_info, bool *, slp_tree,
 slp_instance, stmt_vector_for_cost *);
 extern bool vectorizable_condition (stmt_vec_info, gimple_stmt_iterator *,
-   stmt_vec_info *, bool, slp_tree,
+   stmt_vec_info *, int, slp_tree,
stmt_vector_for_cost *);
 extern bool vectorizable_shift (stmt_vec_info, gimple_stmt_iterator *,
stmt_vec_info *, slp_tree,
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2019-09-17 15:27:13.090053952 +0100
+++ gcc/tree-vect-stmts.c   2019-09-19 08:43:53.965866883 +0100
@@ -9778,7 +9778,7 @@ vect_is_simple_cond (tree cond, vec_info
 
 bool
 vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
-   stmt_vec_info *vec_stmt, bool for_reduction,
+   stmt_vec_info *vec_stmt, int reduc_index,
slp_tree slp_node, stmt_vector_for_cost *cost_vec)
 {
   vec_info *vinfo = stmt_info->vinfo;
@@ -9807,6 +9807,7 @@ vectorizable_condition (stmt_vec_info st
   vec vec_oprnds3 = vNULL;
   tree vec_cmp_type;
   bool masked = false;
+  bool for_reduction = (reduc_index > 0);
 
   if (for_reduction && STMT_SLP_TYPE (stmt_info))
 return false;
@@ -9888,6 +9889,29 @@ vectorizable_condition (stmt_vec_info st
   cond_expr1 = TREE_OPERAND (cond_expr, 1);
 }
 
+  /* For conditional reductions, the "then" value needs to be the candidate
+ value calculated by this iteration while the "else" value needs to be
+ the result carried over from previous iterations.  If the COND_EXPR
+ is the other way around, we need to swap it.  */
+  bool must_invert_cmp_result = false;
+  if (reduction_type == EXTRACT_LAST_REDUCTION && reduc_index == 1)
+{
+  if (masked)
+   must_invert_cmp_result = true;
+  else
+   {
+ bool honor_nans = HONOR_NANS (TREE_TYPE (cond_expr0));
+ tree_code new_code = invert_tree_comparison (cond_code, honor_nans);
+ if (new_code == ERROR_MARK)
+   must_invert_cmp_result = true;
+ else
+   cond_code = new_code;
+   }
+  /* Make sure we don't accidentally use the old condition.  */
+  cond_expr = NULL_TREE;
+  std::swap (then_clause, else_clause);
+}
+
   if (!masked && VECTOR_BOOLEAN_TYPE_P (comp_vectype))
 {
   /* Boolean values may have another representation in vectors
@@ -10102,6 +10126,15 @@ vectorizable_condition (stmt_vec_info st
  vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
  vec_compare = vec_compare_name;
}
+ if (must_invert_cmp_result)
+   {
+ tree vec_compare_name = make_ssa_name (vec_cmp_type);
+ gassign *new_stmt = gimple_build_assign (vec_compare_name,
+  BIT_NOT_EXPR,
+  vec_compare);
+ vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
+ vec_compare = vec_compare_name;
+   }
  gcall *new_stmt = gimple_build_call_internal
(IFN_FOLD_EXTRACT_LAST, 3, else_clause, vec_compare,
 vec_then_clause);
@@ -10635,7 +10668,7 @@ vect_analyze_stmt (stmt_vec_info stmt_in
 node_instance, cost_vec)
  || vectorizable_induction (stmt_info, NULL, NULL, node, cost_vec)
  || vectorizable_shift (stmt_info, NULL, NULL, node, cost_vec)
- || vectorizable_condition (stmt_info, NULL, NULL, false, node,
+ || vectorizable_cond

87 matches

Mail list logo