Re: [2/3][vect] Add widening add, subtract vect patterns

2020-11-12 Thread Richard Biener
On Thu, 12 Nov 2020, Joel Hutton wrote:

> Hi all,
> 
> This patch adds widening add and widening subtract patterns to 
> tree-vect-patterns.

I am missing documentation in md.texi for the new patterns.  In
particular I wonder why you need signed and unsigned variants
for the add/subtract patterns.
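
For context, a scalar model of a widening add shows where signedness could matter. This is only a sketch, not the patch's implementation: the function names are hypothetical, and it assumes the operands are extended to the wider type before the addition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one vector lane; not a GCC API.
// Signed flavour: both 8-bit operands are sign-extended to 16 bits first.
int16_t widen_add_s8 (int8_t a, int8_t b)
{
  return static_cast<int16_t> (static_cast<int16_t> (a)
                               + static_cast<int16_t> (b));
}

// Unsigned flavour: both 8-bit operands are zero-extended to 16 bits first.
uint16_t widen_add_u8 (uint8_t a, uint8_t b)
{
  return static_cast<uint16_t> (static_cast<uint16_t> (a)
                                + static_cast<uint16_t> (b));
}
```

With the same input bit pattern 0xFF in both lanes, the signed model yields -2 (0xFFFE) while the unsigned model yields 510 (0x01FE), so the extension step is where the two flavours diverge.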

We're walking away from adding tree codes for new vectorizer
pieces and instead want to use direct internal functions for them.
Can you rework the patch to use this approach?

Thanks,
Richard.

> All 3 patches together bootstrapped and regression tested on aarch64.
> 
> gcc/ChangeLog:
> 
> 2020-11-12  Joel Hutton
> 
>         * expr.c (expand_expr_real_2): add widen_add,widen_subtract cases
>         * optabs-tree.c (optab_for_tree_code): optabs for widening 
> adds,subtracts
>         * optabs.def (OPTAB_D): define vectorized widen add, subtracts
>         * tree-cfg.c (verify_gimple_assign_binary): Add case for widening 
> adds, subtracts
>         * tree-inline.c (estimate_operator_cost): Add case for widening adds, 
> subtracts
>         * tree-vect-generic.c (expand_vector_operations_1): Add case for 
> widening adds, subtracts
>         * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog 
> pattern
>         (vect_recog_widen_sub_pattern): New recog pattern
>         (vect_recog_average_pattern): Update widened add code
>         (vect_recog_average_pattern): Update widened add code
>         * tree-vect-stmts.c (vectorizable_conversion): Add case for widened 
> add, subtract
>         (supportable_widening_operation): Add case for widened add, subtract
>         * tree.def (WIDEN_ADD_EXPR): New tree code
>         (WIDEN_SUB_EXPR): New tree code
>         (VEC_WIDEN_ADD_HI_EXPR): New tree code
>         (VEC_WIDEN_ADD_LO_EXPR): New tree code
>         (VEC_WIDEN_SUB_HI_EXPR): New tree code
>         (VEC_WIDEN_SUB_LO_EXPR): New tree code
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-12  Joel Hutton
> 
>         * gcc.target/aarch64/vect-widen-add.c: New test.
>         * gcc.target/aarch64/vect-widen-sub.c: New test.
> 
> 
> Ok for trunk?
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: Improve handling of memory operands in ipa-icf 2/4

2020-11-12 Thread Richard Biener
On Thu, 12 Nov 2020, Jan Hubicka wrote:

> Hi,
> this is the updated patch.  It fixes the comparison of bitfields, where I now
> check that their bitsizes and bitoffsets match (and OEP_ADDRESS_OF is not
> used for bitfield references).
> I also noticed a problem with the dependence clique in ao_refs_may_alias that
> I copied here.  Instead of base, rbase should be used.
> 
> Finally I ran statistics on when access paths mismatch and noticed
> that I do not really need to check that component_refs and array_refs
> are semantically equivalent since this is implied by earlier tests.
> This is described in an inline comment and simplifies the code.
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Thanks,
Richard.
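
The quoted patch reports a mismatch between two memory references as a bitmask and turns the first set bit into a diagnostic. A minimal sketch of that reporting pattern follows. The flag names mirror the quoted diff, but the numeric values, the `describe_mismatch` helper and the exact message strings are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Flag names follow the quoted diff; the bit values are assumed.
enum ao_diff : unsigned
{
  SEMANTICS      = 1u << 0,
  BASE_ALIAS_SET = 1u << 1,
  REF_ALIAS_SET  = 1u << 2,
  ACCESS_PATH    = 1u << 3
};

// Hypothetical helper: map a difference mask to a single message,
// testing the flags in the same order as the quoted hunk does.
std::string describe_mismatch (unsigned flags)
{
  if (!flags)
    return "match";
  if (flags & SEMANTICS)
    return "semantic difference";
  if (flags & BASE_ALIAS_SET)
    return "base alias set difference";
  if (flags & REF_ALIAS_SET)
    return "ref alias set difference";
  if (flags & ACCESS_PATH)
    return "access path difference";  // wording assumed
  return "unknown difference";
}
```

A zero mask means the references compare equal; when several bits are set, only the first one in test order is reported.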

> Honza
> 
> 
>   * ipa-icf-gimple.c: Include tree-ssa-alias-compare.h.
>   (func_checker::func_checker): Initialize m_tbaa.
>   (func_checker::hash_operand): Use hash_ao_ref for memory accesses.
>   (func_checker::compare_operand): Use compare_ao_refs for memory
>   accesses.
>   (func_checker::compare_gimple_assign): Do not check LHS types
>   of memory stores.
>   * ipa-icf-gimple.h (func_checker): Derive from ao_compare;
>   add m_tbaa.
>   * ipa-icf.c: Include tree-ssa-alias-compare.h.
>   (sem_function::equals_private): Update call of
>   func_checker::func_checker.
>   * ipa-utils.h (lto_streaming_expected_p): New inline
>   predicate.
>   * tree-ssa-alias-compare.h: New file.
>   * tree-ssa-alias.c: Include tree-ssa-alias-compare.h
>   and builtins.h
>   (view_converted_memref_p): New function.
>   (types_equal_for_same_type_for_tbaa_p): New function.
>   (ao_compare::compare_ao_refs): New member function.
>   (ao_compare::hash_ao_ref): New function
> 
>   * c-c++-common/Wstringop-overflow-2.c: Disable ICF.
>   * g++.dg/warn/Warray-bounds-8.C: Disable ICF.
> 
> index f75951f7c49..26337dd7384 100644
> --- a/gcc/ipa-icf-gimple.c
> +++ b/gcc/ipa-icf-gimple.c
> @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "attribs.h"
>  #include "gimple-walk.h"
>  
> +#include "tree-ssa-alias-compare.h"
>  #include "ipa-icf-gimple.h"
>  
>  namespace ipa_icf_gimple {
> @@ -52,13 +53,13 @@ namespace ipa_icf_gimple {
> of declarations that can be skipped.  */
>  
>  func_checker::func_checker (tree source_func_decl, tree target_func_decl,
> - bool ignore_labels,
> + bool ignore_labels, bool tbaa,
>   hash_set<symtab_node *> *ignored_source_nodes,
>   hash_set<symtab_node *> *ignored_target_nodes)
>: m_source_func_decl (source_func_decl), m_target_func_decl 
> (target_func_decl),
>  m_ignored_source_nodes (ignored_source_nodes),
>  m_ignored_target_nodes (ignored_target_nodes),
> -m_ignore_labels (ignore_labels)
> +m_ignore_labels (ignore_labels), m_tbaa (tbaa)
>  {
>function *source_func = DECL_STRUCT_FUNCTION (source_func_decl);
>function *target_func = DECL_STRUCT_FUNCTION (target_func_decl);
> @@ -252,9 +253,16 @@ func_checker::hash_operand (const_tree arg, 
> inchash::hash &hstate,
>  
>  void
>  func_checker::hash_operand (const_tree arg, inchash::hash &hstate,
> - unsigned int flags, operand_access_type)
> + unsigned int flags, operand_access_type access)
>  {
> -  return hash_operand (arg, hstate, flags);
> +  if (access == OP_MEMORY)
> +{
> +  ao_ref ref;
> +  ao_ref_init (&ref, const_cast <tree> (arg));
> +  return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa, hstate);
> +}
> +  else
> +return hash_operand (arg, hstate, flags);
>  }
>  
>  bool
> @@ -314,18 +322,40 @@ func_checker::compare_operand (tree t1, tree t2, 
> operand_access_type access)
>  return true;
>else if (!t1 || !t2)
>  return false;
> -  if (operand_equal_p (t1, t2, OEP_MATCH_SIDE_EFFECTS))
> -return true;
> -  switch (access)
> +  if (access == OP_MEMORY)
>  {
> -case OP_MEMORY:
> -  return return_false_with_msg
> -  ("operand_equal_p failed (access == memory)");
> -case OP_NORMAL:
> +  ao_ref ref1, ref2;
> +  ao_ref_init (&ref1, const_cast <tree> (t1));
> +  ao_ref_init (&ref2, const_cast <tree> (t2));
> +  int flags = compare_ao_refs (&ref1, &ref2,
> +lto_streaming_expected_p (), m_tbaa);
> +
> +  if (!flags)
> + return true;
> +  if (flags & SEMANTICS)
> + return return_false_with_msg
> + ("compare_ao_refs failed (semantic difference)");
> +  if (flags & BASE_ALIAS_SET)
> + return return_false_with_msg
> + ("compare_ao_refs failed (base alias set difference)");
> +  if (flags & REF_ALIAS_SET)
> + return return_false_with_msg
> +  ("compare_ao_refs failed (ref alias set difference)");
> +  if (flags & ACCESS_PATH)
> + return return_false_with_msg
> +  ("compare_ao_refs failed (access pa

Re: Compare field offsets in fold_const when checking addresses

2020-11-12 Thread Richard Biener
On Thu, 12 Nov 2020, Jan Hubicka wrote:

> > On Thu, 12 Nov 2020, Jan Hubicka wrote:
> > 
> > > Hi,
> > > this is the updated patch I am re-testing and plan to commit if it succeeds.
> > > 
> > >   * fold-const.c (operand_compare::operand_equal_p): Compare
> > >   offsets of fields in component_refs when comparing addresses.
> > >   (operand_compare::hash_operand): Likewise.
> > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> > > index c47557daeba..273ee25ceda 100644
> > > --- a/gcc/fold-const.c
> > > +++ b/gcc/fold-const.c
> > > @@ -3312,11 +3312,36 @@ operand_compare::operand_equal_p (const_tree 
> > > arg0, const_tree arg1,
> > >   case COMPONENT_REF:
> > > /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
> > >may be NULL when we're called to compare MEM_EXPRs.  */
> > > -   if (!OP_SAME_WITH_NULL (0)
> > > -   || !OP_SAME (1))
> > > +   if (!OP_SAME_WITH_NULL (0))
> > >   return false;
> > > -   flags &= ~OEP_ADDRESS_OF;
> > > -   return OP_SAME_WITH_NULL (2);
> > > +   /* Most of time we only need to compare FIELD_DECLs for equality.
> > > +  However when determining address look into actual offsets.
> > > +  These may match for unions and unshared record types.  */
> > 
> > looks like you can simplify by doing
> > 
> >   flags &= ~OEP_ADDRESS_OF;
> > 
> > here.  Neither the FIELD_DECL compare nor the offsets need it
> 
> Yep
> > 
> > You elided
> > 
> >   flags &= ~OEP_ADDRESS_OF;
> > - return OP_SAME_WITH_NULL (2);
> > 
> > that was here when OP_SAME (1), please reinstate.
> Sorry for that, that was not very careful.
> Here is updated patch I re-tested x86_64-linux.

OK.

Richard.
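
The idea behind the patch can be illustrated outside GCC: two different fields of a union are distinct as values, yet their addresses coincide because their offsets match. The sketch below uses hypothetical stand-in types (`field_desc`, `ADDRESS_OF`), not GCC's trees or flags, and it reads the address-of bit on entry so that the offset comparison only happens for address comparisons.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for OEP_ADDRESS_OF; the value is an assumption.
enum : unsigned { ADDRESS_OF = 1u << 0 };

// Hypothetical description of a field: identity plus position.
struct field_desc
{
  int id;
  std::size_t byte_offset;
  unsigned bit_offset;
};

union value { int as_int; float as_float; };
struct pair_t { int first; int second; };

const field_desc int_field    = { 0, offsetof (value, as_int), 0 };
const field_desc float_field  = { 1, offsetof (value, as_float), 0 };
const field_desc first_field  = { 2, offsetof (pair_t, first), 0 };
const field_desc second_field = { 3, offsetof (pair_t, second), 0 };

// Same field always matches.  Different fields match only when we are
// comparing addresses and both offsets agree.
bool same_component (const field_desc &a, const field_desc &b, unsigned flags)
{
  const bool compare_address = (flags & ADDRESS_OF) != 0;
  if (a.id == b.id)
    return true;
  if (!compare_address)
    return false;
  return a.byte_offset == b.byte_offset && a.bit_offset == b.bit_offset;
}
```

Under this model the two union members compare equal only when `ADDRESS_OF` is passed, while the two struct members never do because their byte offsets differ.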

> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index c47557daeba..ddf18f27cb7 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -3312,10 +3312,32 @@ operand_compare::operand_equal_p (const_tree arg0, 
> const_tree arg1,
>   case COMPONENT_REF:
> /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
>may be NULL when we're called to compare MEM_EXPRs.  */
> -   if (!OP_SAME_WITH_NULL (0)
> -   || !OP_SAME (1))
> +   if (!OP_SAME_WITH_NULL (0))
>   return false;
> +   /* Most of time we only need to compare FIELD_DECLs for equality.
> +  However when determining address look into actual offsets.
> +  These may match for unions and unshared record types.  */
> flags &= ~OEP_ADDRESS_OF;
> +   if (!OP_SAME (1))
> + {
> +   if (flags & OEP_ADDRESS_OF)
> + {
> +   if (TREE_OPERAND (arg0, 2)
> +   || TREE_OPERAND (arg1, 2))
> + return OP_SAME_WITH_NULL (2);
> +   tree field0 = TREE_OPERAND (arg0, 1);
> +   tree field1 = TREE_OPERAND (arg1, 1);
> +
> +   if (!operand_equal_p (DECL_FIELD_OFFSET (field0),
> + DECL_FIELD_OFFSET (field1), flags)
> +   || !operand_equal_p (DECL_FIELD_BIT_OFFSET (field0),
> +DECL_FIELD_BIT_OFFSET (field1),
> +flags))
> + return false;
> + }
> +   else
> + return false;
> + }
> return OP_SAME_WITH_NULL (2);
>  
>   case BIT_FIELD_REF:
> @@ -3787,9 +3809,26 @@ operand_compare::hash_operand (const_tree t, 
> inchash::hash &hstate,
> sflags = flags;
> break;
>  
> + case COMPONENT_REF:
> +   if (sflags & OEP_ADDRESS_OF)
> + {
> +   hash_operand (TREE_OPERAND (t, 0), hstate, flags);
> +   if (TREE_OPERAND (t, 2))
> + hash_operand (TREE_OPERAND (t, 2), hstate,
> +   flags & ~OEP_ADDRESS_OF);
> +   else
> + {
> +   tree field = TREE_OPERAND (t, 1);
> +   hash_operand (DECL_FIELD_OFFSET (field),
> + hstate, flags & ~OEP_ADDRESS_OF);
> +   hash_operand (DECL_FIELD_BIT_OFFSET (field),
> + hstate, flags & ~OEP_ADDRESS_OF);
> + }
> +   return;
> + }
> +   break;
>   case ARRAY_REF:
>   case ARRAY_RANGE_REF:
> - case COMPONENT_REF:
>   case BIT_FIELD_REF:
> sflags &= ~OEP_ADDRESS_OF;
> break;
> 



Re: [PATCH 3/3] RISC-V: Support version controlling for ISA standard extensions

2020-11-12 Thread Kito Cheng via Gcc-patches
Oops, I meant this to be a dry run but accidentally CC'd gcc-patches. The
patch set itself is correct; it was just sent twice.



On Fri, Nov 13, 2020 at 3:29 PM Kito Cheng  wrote:
>
>  - New option -misa-spec support: -misa-spec=[2.2|20190608|20191213] and
>corresponding configuration option --with-isa-spec.
>
>  - Current default ISA spec set to 2.2, but we intend to bump this to
>20191213 or later in next release.
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.c (riscv_ext_version): New.
> (riscv_ext_version_table): Ditto.
> (get_default_version): Ditto.
> (riscv_subset_t::implied_p): New field.
> (riscv_subset_t::riscv_subset_t): Init implied_p.
> (riscv_subset_list::add): New.
> (riscv_subset_list::handle_implied_ext): Pass riscv_subset_t
> instead of separated argument.
> (riscv_subset_list::to_string): Handle zifencei and zicsr, and
> omit version if version is unknown.
> (riscv_subset_list::parsing_subset_version): New argument `ext`,
> remove default_major_version and default_minor_version, get
> default version info via get_default_version.
> (riscv_subset_list::parse_std_ext): Update argument for
> parsing_subset_version calls.
> Handle 2.2 ISA spec, always enable zicsr and zifencei, they are
> included in baseline ISA in that time.
> (riscv_subset_list::parse_multiletter_ext): Update argument for
> `parsing_subset_version` and `add` calls.
> (riscv_subset_list::parse): Adjust argument for
> riscv_subset_list::handle_implied_ext call.
> * config.gcc (riscv*-*-*): Handle --with-isa-spec=.
> * config.in (HAVE_AS_MISA_SPEC): New.
> (HAVE_AS_MARCH_ZIFENCEI): Ditto.
> * config/riscv/riscv-opts.h (riscv_isa_spec_class): New.
> (riscv_isa_spec): Ditto.
> * config/riscv/riscv.h (HAVE_AS_MISA_SPEC): New.
> (ASM_SPEC): Pass -misa-spec if gas supported.
> * config/riscv/riscv.opt (riscv_isa_spec_class) New.
> * configure.ac (HAVE_AS_MARCH_ZIFENCEI): New test.
> (HAVE_AS_MISA_SPEC): Ditto.
> * configure: Regen.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/arch-9.c: New.
> * gcc.target/riscv/arch-10.c: Ditto.
> * gcc.target/riscv/arch-11.c: Ditto.
> * gcc.target/riscv/attribute-6.c: Remove, we don't support G
> with version anymore.
> * gcc.target/riscv/attribute-8.c: Reorder arch string to fit canonical
> ordering.
> * gcc.target/riscv/attribute-9.c: We don't emit version for
> unknown extensions now.
> * gcc.target/riscv/attribute-11.c: Add -misa-spec=2.2 flags.
> * gcc.target/riscv/attribute-12.c: Ditto.
> * gcc.target/riscv/attribute-13.c: Ditto.
> * gcc.target/riscv/attribute-14.c: Ditto.
> * gcc.target/riscv/attribute-15.c: New.
> * gcc.target/riscv/attribute-16.c: Ditto.
> * gcc.target/riscv/attribute-17.c: Ditto.
> ---
>  gcc/common/config/riscv/riscv-common.c| 288 +-
>  gcc/config.gcc|  17 +-
>  gcc/config.in |  12 +
>  gcc/config/riscv/riscv-opts.h |  10 +
>  gcc/config/riscv/riscv.h  |   9 +-
>  gcc/config/riscv/riscv.opt|  17 ++
>  gcc/configure |  62 
>  gcc/configure.ac  |  10 +
>  gcc/testsuite/gcc.target/riscv/arch-10.c  |   6 +
>  gcc/testsuite/gcc.target/riscv/arch-11.c  |   5 +
>  gcc/testsuite/gcc.target/riscv/arch-9.c   |   6 +
>  gcc/testsuite/gcc.target/riscv/attribute-11.c |   2 +-
>  gcc/testsuite/gcc.target/riscv/attribute-12.c |   2 +-
>  gcc/testsuite/gcc.target/riscv/attribute-13.c |   2 +-
>  gcc/testsuite/gcc.target/riscv/attribute-14.c |   4 +-
>  gcc/testsuite/gcc.target/riscv/attribute-15.c |   6 +
>  gcc/testsuite/gcc.target/riscv/attribute-16.c |   6 +
>  gcc/testsuite/gcc.target/riscv/attribute-17.c |   6 +
>  gcc/testsuite/gcc.target/riscv/attribute-6.c  |   6 -
>  gcc/testsuite/gcc.target/riscv/attribute-8.c  |   4 +-
>  gcc/testsuite/gcc.target/riscv/attribute-9.c  |   2 +-
>  21 files changed, 394 insertions(+), 88 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-10.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-11.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-9.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-15.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-16.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-17.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/attribute-6.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.c 
> b/gcc/common/config/riscv/riscv-common.c
> index ca88ca1dacd..ea2d516bb36 100644
> --- a/gcc/common/config/riscv/riscv-comm

[PATCH 2/3] RISC-V: Support zicsr and zifencei extension for -march.

2020-11-12 Thread Kito Cheng
 - CSR-related instructions and fence instructions have to be split out of
   the baseline ISA; zicsr and zifencei are the corresponding sub-extensions.
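
The implied-extension table touched by this patch can be read as a set of rules applied until nothing new is added. The sketch below is an illustration only: the three rules are taken from the diff in this mail, while `expand_implied` and the use of `std::set` are assumptions and not how riscv-common.c stores its subset list.

```cpp
#include <cassert>
#include <set>
#include <string>

// One rule: if `ext` is enabled, `implied` is enabled too.
struct implied_info
{
  const char *ext;
  const char *implied;
};

// Rules as they appear in the patch's riscv_implied_info table.
static const implied_info k_implied[] = {
  {"d", "f"},
  {"f", "zicsr"},
  {"d", "zicsr"},
};

// Hypothetical helper: apply the rules until a fixed point is reached.
std::set<std::string> expand_implied (std::set<std::string> exts)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (const implied_info &info : k_implied)
        if (exts.count (std::string (info.ext)) != 0
            && exts.insert (std::string (info.implied)).second)
          changed = true;
    }
  return exts;
}
```

Starting from {i, d}, the result is {d, f, i, zicsr}; starting from {i} alone, nothing is added.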

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (riscv_implied_info):
d and f imply zicsr.
(riscv_ext_flag_table): Handle zicsr and zifencei.
* config/riscv/riscv-opts.h (MASK_ZICSR): New.
(MASK_ZIFENCEI): Ditto.
(TARGET_ZICSR): Ditto.
(TARGET_ZIFENCEI): Ditto.
* config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
Check fence is available by TARGET_ZIFENCEI.
* config/riscv/riscv.opt (riscv_zi_subext): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-8.c: New.
* gcc.target/riscv/attribute-14.c: Ditto.
---
 gcc/common/config/riscv/riscv-common.c| 6 ++
 gcc/config/riscv/riscv-opts.h | 6 ++
 gcc/config/riscv/riscv.c  | 3 +++
 gcc/config/riscv/riscv.md | 7 ---
 gcc/config/riscv/riscv.opt| 3 +++
 gcc/testsuite/gcc.target/riscv/arch-8.c   | 5 +
 gcc/testsuite/gcc.target/riscv/attribute-14.c | 6 ++
 7 files changed, 33 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-14.c

diff --git a/gcc/common/config/riscv/riscv-common.c 
b/gcc/common/config/riscv/riscv-common.c
index f5f7be3cfff..ca88ca1dacd 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -57,6 +57,8 @@ struct riscv_implied_info_t
 static const riscv_implied_info_t riscv_implied_info[] =
 {
   {"d", "f"},
+  {"f", "zicsr"},
+  {"d", "zicsr"},
   {NULL, NULL}
 };
 
@@ -812,6 +814,10 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] 
=
   {"f", &gcc_options::x_target_flags, MASK_HARD_FLOAT},
   {"d", &gcc_options::x_target_flags, MASK_DOUBLE_FLOAT},
   {"c", &gcc_options::x_target_flags, MASK_RVC},
+
+  {"zicsr",    &gcc_options::x_riscv_zi_subext, MASK_ZICSR},
+  {"zifencei", &gcc_options::x_riscv_zi_subext, MASK_ZIFENCEI},
+
   {NULL, NULL, 0}
 };
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 2a3f9d9eef5..de8ac0e038d 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -57,4 +57,10 @@ enum stack_protector_guard {
   SSP_GLOBAL   /* global canary */
 };
 
+#define MASK_ZICSR    (1 << 0)
+#define MASK_ZIFENCEI (1 << 1)
+
+#define TARGET_ZICSR    ((riscv_zi_subext & MASK_ZICSR) != 0)
+#define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 738556539f6..2aaa8e96451 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3337,6 +3337,9 @@ riscv_memmodel_needs_amo_acquire (enum memmodel model)
 static bool
 riscv_memmodel_needs_release_fence (enum memmodel model)
 {
+  if (!TARGET_ZIFENCEI)
+return false;
+
   switch (model)
 {
   case MEMMODEL_ACQ_REL:
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index f15bad3b29e..756b35fb8c0 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1543,19 +1543,20 @@
 LCT_NORMAL, VOIDmode, operands[0], Pmode,
 operands[1], Pmode, const0_rtx, Pmode);
 #else
-  emit_insn (gen_fence_i ());
+  if (TARGET_ZIFENCEI)
+emit_insn (gen_fence_i ());
 #endif
   DONE;
 })
 
 (define_insn "fence"
   [(unspec_volatile [(const_int 0)] UNSPECV_FENCE)]
-  ""
+  "TARGET_ZIFENCEI"
   "%|fence%-")
 
 (define_insn "fence_i"
   [(unspec_volatile [(const_int 0)] UNSPECV_FENCE_I)]
-  ""
+  "TARGET_ZIFENCEI"
   "fence.i")
 
 ;;
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 808b4a04405..ca2fc7c8021 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -183,3 +183,6 @@ Use the given offset for addressing the stack-protector 
guard.
 
 TargetVariable
 long riscv_stack_protector_guard_offset = 0
+
+TargetVariable
+int riscv_zi_subext
diff --git a/gcc/testsuite/gcc.target/riscv/arch-8.c 
b/gcc/testsuite/gcc.target/riscv/arch-8.c
new file mode 100644
index 000..d7760fc576f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-8.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=rv32id_zicsr_zifence -mabi=ilp32" } */
+int foo()
+{
+}
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-14.c 
b/gcc/testsuite/gcc.target/riscv/attribute-14.c
new file mode 100644
index 000..48456277152
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-14.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mriscv-attribute -march=rv32if -mabi=ilp32" } */
+int foo()
+{
+}
+/* { dg-final { scan-assembler ".attribute arch, \"rv32i2p0_f2p0_zicsr2p0\"" } 
} */
-- 
2.29.2



[PATCH 1/3] RISC-V: Handle implied extension in canonical ordering.

2020-11-12 Thread Kito Cheng
 - The ISA spec specifies the order of multi-letter extensions, and implied
   extensions must also be stored in canonical order, so the easiest way is
   to keep the list in order during insertion.
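
As a rough illustration of keeping the list ordered during insertion, the sketch below inserts each extension at its sorted position. It is a simplified model, not the patch's code: it uses a `std::vector` instead of the linked list, the letter order in `k_std_ext` is assumed, and unknown multi-letter prefixes are simply sorted last.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Assumed order of standard single-letter extensions after "i" and "e".
static const char k_std_ext[] = "mafdqlcbjtpvn";

// Lower rank sorts earlier.
static int single_rank (char ext)
{
  if (ext == 'i')
    return 0;
  if (ext == 'e')
    return 1;
  const char *pos = std::strchr (k_std_ext, ext);
  if (pos == nullptr)
    // Unknown letters go after all known ones, alphabetically.
    return static_cast<int> (std::strlen (k_std_ext)) + 2 + (ext - 'a');
  return static_cast<int> (pos - k_std_ext) + 2;
}

// Class order for multi-letter extensions: s, h, z, x.
static int multi_rank (const std::string &s)
{
  int high;
  switch (s[0])
    {
    case 's': high = 0; break;
    case 'h': high = 1; break;
    case 'z': high = 2; break;
    case 'x': high = 3; break;
    default:  high = 4; break;  // assumption: unknown prefixes sort last
    }
  // "Z" extensions are grouped by the category letter that follows.
  int low = (s[0] == 'z' && s.size () > 1) ? single_rank (s[1]) : 0;
  return (high << 8) + low;
}

// True when A must appear before B in the canonical arch string.
static bool comes_before (const std::string &a, const std::string &b)
{
  assert (!a.empty () && !b.empty ());
  if (a.size () == 1 || b.size () == 1)
    {
      if (a.size () != b.size ())
        return a.size () == 1;  // single-letter before multi-letter
      return single_rank (a[0]) < single_rank (b[0]);
    }
  int rank_a = multi_rank (a), rank_b = multi_rank (b);
  if (rank_a != rank_b)
    return rank_a < rank_b;
  return a < b;  // same rank: alphabetical
}

// Insert EXT at its canonical position in an already ordered LIST.
void insert_canonical (std::vector<std::string> &list, const std::string &ext)
{
  list.insert (std::upper_bound (list.begin (), list.end (), ext,
                                 comes_before),
               ext);
}
```

Inserting i, zifencei, f, zicsr, d, xfoo and m in that order leaves the list as i, m, f, d, zicsr, zifencei, xfoo, so implied extensions added late still end up where the arch string expects them.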

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (single_letter_subset_rank): New.
(multi_letter_subset_rank): Ditto.
(subset_cmp): Ditto.
(riscv_subset_list::add): Insert subext in canonical ordering.
(riscv_subset_list::parse_std_ext): Move handle_implied_ext to ...
(riscv_subset_list::parse): ... here.
---
 gcc/common/config/riscv/riscv-common.c | 177 -
 1 file changed, 172 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.c 
b/gcc/common/config/riscv/riscv-common.c
index 9a576eb689b..f5f7be3cfff 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -145,6 +145,129 @@ riscv_subset_list::~riscv_subset_list ()
 }
 }
 
+/* Get the rank for single-letter subsets, lower value meaning higher
+   priority.  */
+
+static int
+single_letter_subset_rank (char ext)
+{
+  int rank;
+
+  switch (ext)
+{
+case 'i':
+  return 0;
+case 'e':
+  return 1;
+default:
+  break;
+}
+
+  const char *all_ext = riscv_supported_std_ext ();
+  const char *ext_pos = strchr (all_ext, ext);
+  if (ext_pos == NULL)
+/* If got an unknown extension letter, then give it an alphabetical
+   order, but after all known standard extension.  */
+rank = strlen (all_ext) + ext - 'a';
+  else
+rank = (int)(ext_pos - all_ext) + 2 /* e and i has higher rank.  */;
+
+  return rank;
+}
+
+/* Get the rank for multi-letter subsets, lower value meaning higher
+   priority.  */
+
+static int
+multi_letter_subset_rank (const std::string &subset)
+{
+  gcc_assert (subset.length () >= 2);
+  int high_order = -1;
+  int low_order = 0;
+  /* The order between multi-char extensions: s -> h -> z -> x.  */
+  char multiletter_class = subset[0];
+  switch (multiletter_class)
+{
+case 's':
+  high_order = 0;
+  break;
+case 'h':
+  high_order = 1;
+  break;
+case 'z':
+  gcc_assert (subset.length () > 2);
+  high_order = 2;
+  break;
+case 'x':
+  high_order = 3;
+  break;
+default:
+  gcc_unreachable ();
+  return -1;
+}
+
+  if (multiletter_class == 'z')
+/* Order for z extension on spec: If multiple "Z" extensions are named, 
they
+   should be ordered first by category, then alphabetically within a
+   category - for example, "Zicsr_Zifencei_Zam". */
+low_order = single_letter_subset_rank (subset[1]);
+  else
+low_order = 0;
+
+  return (high_order << 8) + low_order;
+}
+
+/* subset compare
+
+  Returns an integral value indicating the relationship between the subsets:
+  Return value  indicates
+  -1            B has higher order than A.
+  0             A and B are same subset.
+  1             A has higher order than B.
+
+*/
+
+static int
+subset_cmp (const std::string &a, const std::string &b)
+{
+  if (a == b)
+return 0;
+
+  size_t a_len = a.length ();
+  size_t b_len = b.length ();
+
+  /* Single-letter extension always get higher order than
+ multi-letter extension.  */
+  if (a_len == 1 && b_len != 1)
+return 1;
+
+  if (a_len != 1 && b_len == 1)
+return -1;
+
+  if (a_len == 1 && b_len == 1)
+{
+  int rank_a = single_letter_subset_rank (a[0]);
+  int rank_b = single_letter_subset_rank (b[0]);
+
+  if (rank_a < rank_b)
+   return 1;
+  else
+   return -1;
+}
+  else
+{
+  int rank_a = multi_letter_subset_rank(a);
+  int rank_b = multi_letter_subset_rank(b);
+
+  /* Using alphabetical/lexicographical order if they have same rank.  */
+  if (rank_a == rank_b)
+   /* The return value of strcmp has opposite meaning.  */
+   return -strcmp (a.c_str (), b.c_str ());
+  else
+   return (rank_a < rank_b) ? 1 : -1;
+}
+}
+
 /* Add new subset to list.  */
 
 void
@@ -152,6 +275,7 @@ riscv_subset_list::add (const char *subset, int 
major_version,
int minor_version, bool explicit_version_p)
 {
   riscv_subset_t *s = new riscv_subset_t ();
+  riscv_subset_t *itr;
 
   if (m_head == NULL)
 m_head = s;
@@ -162,9 +286,45 @@ riscv_subset_list::add (const char *subset, int 
major_version,
   s->explicit_version_p = explicit_version_p;
   s->next = NULL;
 
-  if (m_tail != NULL)
-m_tail->next = s;
+  if (m_tail == NULL)
+{
+  m_tail = s;
+  return;
+}
+
+  /* e, i or g should be first subext, never come here.  */
+  gcc_assert (subset[0] != 'e'
+ && subset[0] != 'i'
+ && subset[0] != 'g');
+
+  if (m_tail == m_head)
+{
+  gcc_assert (m_head->next == NULL);
+  m_head->next = s;
+  m_tail = s;
+  return;
+}
+
+  gcc_assert (m_head->next != NULL);
+
+  /* Subset list must in canonical order, but impli

[PATCH 3/3] RISC-V: Support version controlling for ISA standard extensions

2020-11-12 Thread Kito Cheng
 - New option -misa-spec support: -misa-spec=[2.2|20190608|20191213] and
   corresponding configuration option --with-isa-spec.

 - Current default ISA spec set to 2.2, but we intend to bump this to
   20191213 or later in next release.

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (riscv_ext_version): New.
(riscv_ext_version_table): Ditto.
(get_default_version): Ditto.
(riscv_subset_t::implied_p): New field.
(riscv_subset_t::riscv_subset_t): Init implied_p.
(riscv_subset_list::add): New.
(riscv_subset_list::handle_implied_ext): Pass riscv_subset_t
instead of separated argument.
(riscv_subset_list::to_string): Handle zifencei and zicsr, and
omit version if version is unknown.
(riscv_subset_list::parsing_subset_version): New argument `ext`,
remove default_major_version and default_minor_version, get
default version info via get_default_version.
(riscv_subset_list::parse_std_ext): Update argument for
parsing_subset_version calls.
Handle 2.2 ISA spec, always enable zicsr and zifencei, they are
included in baseline ISA in that time.
(riscv_subset_list::parse_multiletter_ext): Update argument for
`parsing_subset_version` and `add` calls.
(riscv_subset_list::parse): Adjust argument for
riscv_subset_list::handle_implied_ext call.
* config.gcc (riscv*-*-*): Handle --with-isa-spec=.
* config.in (HAVE_AS_MISA_SPEC): New.
(HAVE_AS_MARCH_ZIFENCEI): Ditto.
* config/riscv/riscv-opts.h (riscv_isa_spec_class): New.
(riscv_isa_spec): Ditto.
* config/riscv/riscv.h (HAVE_AS_MISA_SPEC): New.
(ASM_SPEC): Pass -misa-spec if gas supported.
* config/riscv/riscv.opt (riscv_isa_spec_class) New.
* configure.ac (HAVE_AS_MARCH_ZIFENCEI): New test.
(HAVE_AS_MISA_SPEC): Ditto.
* configure: Regen.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-9.c: New.
* gcc.target/riscv/arch-10.c: Ditto.
* gcc.target/riscv/arch-11.c: Ditto.
* gcc.target/riscv/attribute-6.c: Remove, we don't support G
with version anymore.
* gcc.target/riscv/attribute-8.c: Reorder arch string to fit canonical
ordering.
* gcc.target/riscv/attribute-9.c: We don't emit version for
unknown extensions now.
* gcc.target/riscv/attribute-11.c: Add -misa-spec=2.2 flags.
* gcc.target/riscv/attribute-12.c: Ditto.
* gcc.target/riscv/attribute-13.c: Ditto.
* gcc.target/riscv/attribute-14.c: Ditto.
* gcc.target/riscv/attribute-15.c: New.
* gcc.target/riscv/attribute-16.c: Ditto.
* gcc.target/riscv/attribute-17.c: Ditto.
---
 gcc/common/config/riscv/riscv-common.c| 288 +-
 gcc/config.gcc|  17 +-
 gcc/config.in |  12 +
 gcc/config/riscv/riscv-opts.h |  10 +
 gcc/config/riscv/riscv.h  |   9 +-
 gcc/config/riscv/riscv.opt|  17 ++
 gcc/configure |  62 
 gcc/configure.ac  |  10 +
 gcc/testsuite/gcc.target/riscv/arch-10.c  |   6 +
 gcc/testsuite/gcc.target/riscv/arch-11.c  |   5 +
 gcc/testsuite/gcc.target/riscv/arch-9.c   |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-11.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-12.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-13.c |   2 +-
 gcc/testsuite/gcc.target/riscv/attribute-14.c |   4 +-
 gcc/testsuite/gcc.target/riscv/attribute-15.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-16.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-17.c |   6 +
 gcc/testsuite/gcc.target/riscv/attribute-6.c  |   6 -
 gcc/testsuite/gcc.target/riscv/attribute-8.c  |   4 +-
 gcc/testsuite/gcc.target/riscv/attribute-9.c  |   2 +-
 21 files changed, 394 insertions(+), 88 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-15.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-17.c
 delete mode 100644 gcc/testsuite/gcc.target/riscv/attribute-6.c

diff --git a/gcc/common/config/riscv/riscv-common.c 
b/gcc/common/config/riscv/riscv-common.c
index ca88ca1dacd..ea2d516bb36 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -44,6 +44,7 @@ struct riscv_subset_t
   struct riscv_subset_t *next;
 
   bool explicit_version_p;
+  bool implied_p;
 };
 
 /* Type for implied ISA info.  */
@@ -62,6 +63,58 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {NULL, NULL}
 };
 
+/* This structure holds version information for

[PATCH 0/3] RISC-V: Support version controlling for ISA standard extensions

2020-11-12 Thread Kito Cheng
The current GCC implementation follows RISC-V ISA spec 2.2. This patch set
implements v20190608 and v20191213, and adds the option
-misa-spec=[2.2|20190608|20191213] to change the default ISA spec version.

There is one major incompatible

That option affects the default version of each sub-extension; for example,
the I extension is 2.0 under spec 2.2 and 2.1 under v20190608 and v20191213.

We also update the -march parser to follow the latest standard: it uses the
canonical ordering for multi-letter extensions, drops version support for the
G extension, and omits the version for unrecognized extensions.
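
A small sketch of how a per-spec default version lookup might feed the arch string. Only the I-extension versions stated above are filled in; the table layout and the names `default_version` and `arch_fragment` are hypothetical, and the "2p0" spelling follows the `.attribute arch` test expectation quoted elsewhere in this series.

```cpp
#include <cassert>
#include <string>

enum class isa_spec { v2_2, v20190608, v20191213 };

// Hypothetical table row: default version of an extension under a spec.
struct ext_version
{
  const char *name;
  isa_spec spec;
  int major_version;
  int minor_version;
};

// Only the entries stated in the cover letter; a real table would have more.
static const ext_version k_versions[] = {
  {"i", isa_spec::v2_2, 2, 0},
  {"i", isa_spec::v20190608, 2, 1},
  {"i", isa_spec::v20191213, 2, 1},
};

// Look up the default version; returns false for unknown extensions.
bool default_version (const std::string &ext, isa_spec spec,
                      int &major_version, int &minor_version)
{
  for (const ext_version &v : k_versions)
    if (ext == v.name && spec == v.spec)
      {
        major_version = v.major_version;
        minor_version = v.minor_version;
        return true;
      }
  return false;
}

// Format one arch-string fragment, omitting the version when unknown.
std::string arch_fragment (const std::string &ext, isa_spec spec)
{
  int major_version = 0, minor_version = 0;
  if (!default_version (ext, spec, major_version, minor_version))
    return ext;
  return ext + std::to_string (major_version) + "p"
         + std::to_string (minor_version);
}
```

Under these assumptions `arch_fragment ("i", isa_spec::v2_2)` gives "i2p0", the same call with `isa_spec::v20191213` gives "i2p1", and an extension missing from the table is emitted by name only.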

We also add a special rule for the G extension: imafd can't appear again if
the G extension is present, but zicsr and zifencei can.

The default ISA spec stays at 2.2 for now; we intend to change it in the next GCC release.






diff --git a/gcc/common/config/riscv/riscv-common.c b/gcc/common/config/riscv/riscv-common.c
index ca88ca1dacd..ea2d516bb36 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -44,6 +44,7 @@ struct riscv_subset_t
   struct riscv_subset_t *next;
 
   bool explicit_version_p;
+  bool implied_p;
 };
 
 /* Type for implied ISA info.  */
@@ -62,6 +63,58 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {NULL, NULL}
 };
 
+/* This structure holds version information for

[PATCH 1/3] RISC-V: Handle implied extension in canonical ordering.

2020-11-12 Thread Kito Cheng
 - The ISA spec specifies the order of multi-letter extensions, and implied
   extensions must also be stored in canonical order, so the easiest way
   is to keep the list in order during insertion.

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (single_letter_subset_rank): New.
(multi_letter_subset_rank): Ditto.
(subset_cmp): Ditto.
(riscv_subset_list::add): Insert subext in canonical ordering.
(riscv_subset_list::parse_std_ext): Move handle_implied_ext to ...
(riscv_subset_list::parse): ... here.
---
 gcc/common/config/riscv/riscv-common.c | 177 -
 1 file changed, 172 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.c b/gcc/common/config/riscv/riscv-common.c
index 9a576eb689b..f5f7be3cfff 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -145,6 +145,129 @@ riscv_subset_list::~riscv_subset_list ()
 }
 }
 
+/* Get the rank for single-letter subsets, lower value meaning higher
+   priority.  */
+
+static int
+single_letter_subset_rank (char ext)
+{
+  int rank;
+
+  switch (ext)
+{
+case 'i':
+  return 0;
+case 'e':
+  return 1;
+default:
+  break;
+}
+
+  const char *all_ext = riscv_supported_std_ext ();
+  const char *ext_pos = strchr (all_ext, ext);
+  if (ext_pos == NULL)
+/* If we got an unknown extension letter, then give it an alphabetical
+   order, but after all known standard extensions.  */
+rank = strlen (all_ext) + ext - 'a';
+  else
+rank = (int)(ext_pos - all_ext) + 2 /* e and i have higher rank.  */;
+
+  return rank;
+}
+
+/* Get the rank for multi-letter subsets, lower value meaning higher
+   priority.  */
+
+static int
+multi_letter_subset_rank (const std::string &subset)
+{
+  gcc_assert (subset.length () >= 2);
+  int high_order = -1;
+  int low_order = 0;
+  /* The order between multi-char extensions: s -> h -> z -> x.  */
+  char multiletter_class = subset[0];
+  switch (multiletter_class)
+{
+case 's':
+  high_order = 0;
+  break;
+case 'h':
+  high_order = 1;
+  break;
+case 'z':
+  gcc_assert (subset.length () > 2);
+  high_order = 2;
+  break;
+case 'x':
+  high_order = 3;
+  break;
+default:
+  gcc_unreachable ();
+  return -1;
+}
+
+  if (multiletter_class == 'z')
+/* Order for z extension on spec: If multiple "Z" extensions are named, they
+   should be ordered first by category, then alphabetically within a
+   category - for example, "Zicsr_Zifencei_Zam". */
+low_order = single_letter_subset_rank (subset[1]);
+  else
+low_order = 0;
+
+  return (high_order << 8) + low_order;
+}
+
+/* subset compare
+
+  Returns an integral value indicating the relationship between the subsets:
+  Return value  indicates
+  -1            B has higher order than A.
+  0             A and B are same subset.
+  1             A has higher order than B.
+
+*/
+
+static int
+subset_cmp (const std::string &a, const std::string &b)
+{
+  if (a == b)
+return 0;
+
+  size_t a_len = a.length ();
+  size_t b_len = b.length ();
+
+  /* Single-letter extensions always get higher order than
+ multi-letter extensions.  */
+  if (a_len == 1 && b_len != 1)
+return 1;
+
+  if (a_len != 1 && b_len == 1)
+return -1;
+
+  if (a_len == 1 && b_len == 1)
+{
+  int rank_a = single_letter_subset_rank (a[0]);
+  int rank_b = single_letter_subset_rank (b[0]);
+
+  if (rank_a < rank_b)
+   return 1;
+  else
+   return -1;
+}
+  else
+{
+  int rank_a = multi_letter_subset_rank(a);
+  int rank_b = multi_letter_subset_rank(b);
+
+  /* Using alphabetical/lexicographical order if they have same rank.  */
+  if (rank_a == rank_b)
+   /* The return value of strcmp has opposite meaning.  */
+   return -strcmp (a.c_str (), b.c_str ());
+  else
+   return (rank_a < rank_b) ? 1 : -1;
+}
+}
+
 /* Add new subset to list.  */
 
 void
@@ -152,6 +275,7 @@ riscv_subset_list::add (const char *subset, int major_version,
int minor_version, bool explicit_version_p)
 {
   riscv_subset_t *s = new riscv_subset_t ();
+  riscv_subset_t *itr;
 
   if (m_head == NULL)
 m_head = s;
@@ -162,9 +286,45 @@ riscv_subset_list::add (const char *subset, int major_version,
   s->explicit_version_p = explicit_version_p;
   s->next = NULL;
 
-  if (m_tail != NULL)
-m_tail->next = s;
+  if (m_tail == NULL)
+{
+  m_tail = s;
+  return;
+}
+
+  /* e, i or g should be the first subext, so they never reach here.  */
+  gcc_assert (subset[0] != 'e'
+ && subset[0] != 'i'
+ && subset[0] != 'g');
+
+  if (m_tail == m_head)
+{
+  gcc_assert (m_head->next == NULL);
+  m_head->next = s;
+  m_tail = s;
+  return;
+}
+
+  gcc_assert (m_head->next != NULL);
+
+  /* Subset list must in canonical order, but impli

RISC-V: Support version controlling for ISA standard extensions

2020-11-12 Thread Kito Cheng
The current GCC implementation follows RISC-V ISA spec 2.2. This patch set
implements v20190608 and v20191213, and adds the option
-misa-spec=[2.2|20190608|20191213] to change the default ISA spec version.

There is one major incompatible

That option affects the default version of each sub-extension; for example,
the I extension is 2.0 under 2.2 and 2.1 under v20190608 and v20191213.

We also update the -march parser to fit the latest standard: it uses the
canonical ordering for multi-letter extensions, drops version support for the
G extension, and omits the version for unrecognized extensions.

We also add a special rule for the G extension: imafd can't appear again if
the G extension is present, but zicsr and zifencei can.

The default ISA spec stays at 2.2 for now and will change in the next GCC release.
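
As a rough illustration of how the selected ISA spec could drive the default
version of an extension, here is a minimal table-lookup sketch. Only the
I-extension rows come from this cover letter (2.0 under 2.2, 2.1 under
20190608 and 20191213); the type and function names (isa_spec, ext_version,
default_version) are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ISA spec versions accepted by -misa-spec in this patch set.  */
enum isa_spec { ISA_SPEC_2_2, ISA_SPEC_20190608, ISA_SPEC_20191213 };

/* One row per (extension, spec) pair.  */
struct ext_version
{
  const char *ext;
  enum isa_spec spec;
  int major;
  int minor;
};

/* Only the I-extension rows are stated in the cover letter; a real table
   would carry rows for every versioned extension.  */
static const struct ext_version version_table[] =
{
  { "i", ISA_SPEC_2_2,      2, 0 },
  { "i", ISA_SPEC_20190608, 2, 1 },
  { "i", ISA_SPEC_20191213, 2, 1 },
};

/* Look up the default version of EXT under SPEC.  Returns 1 and fills
   *MAJOR and *MINOR on success, or 0 if the extension is not in the
   table, in which case the caller would omit the version.  */
static int
default_version (const char *ext, enum isa_spec spec, int *major, int *minor)
{
  assert (ext != NULL && major != NULL && minor != NULL);
  for (size_t i = 0; i < sizeof version_table / sizeof version_table[0]; i++)
    if (version_table[i].spec == spec
        && strcmp (version_table[i].ext, ext) == 0)
      {
        *major = version_table[i].major;
        *minor = version_table[i].minor;
        return 1;
      }
  return 0;
}
```

Returning 0 for an unknown extension mirrors the behaviour described above,
where the version is omitted for unrecognized extensions.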




[PATCH 2/3] RISC-V: Support zicsr and zifencei extension for -march.

2020-11-12 Thread Kito Cheng
 - CSR-related instructions and fence instructions have been split out of
   the baseline ISA; zicsr and zifencei are the corresponding sub-extensions.

gcc/ChangeLog:

* common/config/riscv/riscv-common.c (riscv_implied_info):
d and f imply zicsr.
(riscv_ext_flag_table): Handle zicsr and zifencei.
* config/riscv/riscv-opts.h (MASK_ZICSR): New.
(MASK_ZIFENCEI): Ditto.
(TARGET_ZICSR): Ditto.
(TARGET_ZIFENCEI): Ditto.
* config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
Check fence is available by TARGET_ZIFENCEI.
* config/riscv/riscv.opt (riscv_zi_subext): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-8.c: New.
* gcc.target/riscv/attribute-14.c: Ditto.
---
 gcc/common/config/riscv/riscv-common.c| 6 ++
 gcc/config/riscv/riscv-opts.h | 6 ++
 gcc/config/riscv/riscv.c  | 3 +++
 gcc/config/riscv/riscv.md | 7 ---
 gcc/config/riscv/riscv.opt| 3 +++
 gcc/testsuite/gcc.target/riscv/arch-8.c   | 5 +
 gcc/testsuite/gcc.target/riscv/attribute-14.c | 6 ++
 7 files changed, 33 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-14.c

diff --git a/gcc/common/config/riscv/riscv-common.c b/gcc/common/config/riscv/riscv-common.c
index f5f7be3cfff..ca88ca1dacd 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -57,6 +57,8 @@ struct riscv_implied_info_t
 static const riscv_implied_info_t riscv_implied_info[] =
 {
   {"d", "f"},
+  {"f", "zicsr"},
+  {"d", "zicsr"},
   {NULL, NULL}
 };
 
@@ -812,6 +814,10 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"f", &gcc_options::x_target_flags, MASK_HARD_FLOAT},
   {"d", &gcc_options::x_target_flags, MASK_DOUBLE_FLOAT},
   {"c", &gcc_options::x_target_flags, MASK_RVC},
+
+  {"zicsr",    &gcc_options::x_riscv_zi_subext, MASK_ZICSR},
+  {"zifencei", &gcc_options::x_riscv_zi_subext, MASK_ZIFENCEI},
+
   {NULL, NULL, 0}
 };
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 2a3f9d9eef5..de8ac0e038d 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -57,4 +57,10 @@ enum stack_protector_guard {
   SSP_GLOBAL   /* global canary */
 };
 
+#define MASK_ZICSR    (1 << 0)
+#define MASK_ZIFENCEI (1 << 1)
+
+#define TARGET_ZICSR    ((riscv_zi_subext & MASK_ZICSR) != 0)
+#define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 738556539f6..2aaa8e96451 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3337,6 +3337,9 @@ riscv_memmodel_needs_amo_acquire (enum memmodel model)
 static bool
 riscv_memmodel_needs_release_fence (enum memmodel model)
 {
+  if (!TARGET_ZIFENCEI)
+return false;
+
   switch (model)
 {
   case MEMMODEL_ACQ_REL:
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index f15bad3b29e..756b35fb8c0 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1543,19 +1543,20 @@
 LCT_NORMAL, VOIDmode, operands[0], Pmode,
 operands[1], Pmode, const0_rtx, Pmode);
 #else
-  emit_insn (gen_fence_i ());
+  if (TARGET_ZIFENCEI)
+emit_insn (gen_fence_i ());
 #endif
   DONE;
 })
 
 (define_insn "fence"
   [(unspec_volatile [(const_int 0)] UNSPECV_FENCE)]
-  ""
+  "TARGET_ZIFENCEI"
   "%|fence%-")
 
 (define_insn "fence_i"
   [(unspec_volatile [(const_int 0)] UNSPECV_FENCE_I)]
-  ""
+  "TARGET_ZIFENCEI"
   "fence.i")
 
 ;;
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 808b4a04405..ca2fc7c8021 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -183,3 +183,6 @@ Use the given offset for addressing the stack-protector guard.
 
 TargetVariable
 long riscv_stack_protector_guard_offset = 0
+
+TargetVariable
+int riscv_zi_subext
diff --git a/gcc/testsuite/gcc.target/riscv/arch-8.c b/gcc/testsuite/gcc.target/riscv/arch-8.c
new file mode 100644
index 000..d7760fc576f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-8.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=rv32id_zicsr_zifencei -mabi=ilp32" } */
+int foo()
+{
+}
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-14.c b/gcc/testsuite/gcc.target/riscv/attribute-14.c
new file mode 100644
index 000..48456277152
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/attribute-14.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mriscv-attribute -march=rv32if -mabi=ilp32" } */
+int foo()
+{
+}
+/* { dg-final { scan-assembler ".attribute arch, \"rv32i2p0_f2p0_zicsr2p0\"" } } */
-- 
2.29.2
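
The effect of the new riscv_implied_info rows can be sketched as a small
standalone expansion routine. The table rows below are the ones from this
patch; everything else (ext_set, ext_set_has, ext_set_add, add_with_implied,
MAX_EXTS) is a hypothetical simplification and not how riscv_subset_list is
actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct implied_info
{
  const char *ext;
  const char *implied;
};

/* Same rows as riscv_implied_info after this patch.  */
static const struct implied_info implied_table[] =
{
  { "d", "f" },
  { "f", "zicsr" },
  { "d", "zicsr" },
  { NULL, NULL }
};

#define MAX_EXTS 16

/* Tiny fixed-size set of extension names (hypothetical helper type).  */
struct ext_set
{
  const char *names[MAX_EXTS];
  size_t count;
};

/* Return 1 if NAME is already in S.  */
static int
ext_set_has (const struct ext_set *s, const char *name)
{
  for (size_t i = 0; i < s->count; i++)
    if (strcmp (s->names[i], name) == 0)
      return 1;
  return 0;
}

/* Append NAME to S unless it is present or the set is full.  */
static void
ext_set_add (struct ext_set *s, const char *name)
{
  if (!ext_set_has (s, name) && s->count < MAX_EXTS)
    s->names[s->count++] = name;
}

/* Add EXT and, transitively, every extension it implies.  */
static void
add_with_implied (struct ext_set *s, const char *ext)
{
  assert (s != NULL && ext != NULL);
  if (ext_set_has (s, ext))
    return;
  ext_set_add (s, ext);
  for (const struct implied_info *p = implied_table; p->ext != NULL; p++)
    if (strcmp (p->ext, ext) == 0)
      add_with_implied (s, p->implied);
}
```

Starting from i and f, this yields i, f and zicsr, which matches the arch
string rv32i2p0_f2p0_zicsr2p0 that attribute-14.c expects for -march=rv32if.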



Re: [PATCH 2/2] loops: Invoke lim after successful loop interchange

2020-11-12 Thread Richard Biener
On Thu, 12 Nov 2020, Martin Jambor wrote:

> Hi,
> 
> On Wed, Nov 11 2020, Richard Biener wrote:
> > On Mon, 9 Nov 2020, Martin Jambor wrote:
> >
> >> this patch modifies the loop invariant pass so that it can operate
> >> only on a single requested loop and its sub-loops and ignore the rest
> >> of the function, much like it currently ignores basic blocks that are
> >> not in any real loop.  It then invokes it from within the loop
> >> interchange pass when it successfully swaps two loops.  This avoids
> >> the non-LTO -Ofast run-time regressions of 410.bwaves and 503.bwaves_r
> >> (which are 19% and 15% faster than current master on an AMD zen2
> >> machine) while not introducing a full LIM pass into the pass pipeline.
> >> 
> >> I have not modified the LIM data structures, this means that it still
> >> contains vectors indexed by loop->num even though only a single loop
> >> nest is actually processed.  I also did not replace the uses of
> >> pre_and_rev_post_order_compute_fn with a function that would count a
> >> postorder only for a given loop.  I can of course do so if the
> >> approach is otherwise deemed viable.
> >> 
> >> The patch adds one additional global variable requested_loop to the
> >> pass and then at various places behaves differently when it is set.  I
> >> was considering storing the fake root loop into it for normal
> >> operation, but since this loop often requires special handling anyway,
> >> I came to the conclusion that the code would actually end up less
> >> straightforward.
> >> 
> >> I have bootstrapped and tested the patch on x86_64-linux and a very
> >> similar one on aarch64-linux.  I have also tested it by modifying the
> >> tree_ssa_lim function to run loop_invariant_motion_from_loop on each
> >> real outermost loop in a function and this variant also passed
> >> bootstrap and all tests, including dump scans, of all languages.
> >> 
> >> I have built the entire SPEC 2006 FPrate monitoring the activity of
> >> the LIM pass without and with the patch (on top of commit b642fca1c31
> >> with which 526.blender_r and 538.imagick_r seemed to be failing) and
> >> it only examined 0.2% more loops, 0.02% more BBs and even fewer
> >> percent of statements because it is invoked only in a rather special
> >> circumstance.  But the patch allows for more such need-based uses at
> >> hopefully reasonable cost.
> >> 
> >> Since I do not have much experience with loop optimizers, I expect
> >> that there will be requests to adjust the patch during the review.
> >> Still, it fixes a performance regression against GCC 9 and so I hope
> >> to address the concerns in time to get it into GCC 11.
> >> 
> 
> [...]
> 
> >
> > That said, in the way it's currently structured I think it's
> > "better" to export tree_ssa_lim () and call it from interchange
> > if any loop was interchanged (thus run a full pass but conditional
> > on interchange done).  You can make it cheaper by adding a flag
> > to tree_ssa_lim whether to do store-motion (I guess this might
> > be an interesting user-visible flag as well and a possibility
> > to make select lim passes cheaper via a pass flag) and not do
> > store-motion from the interchange call.  I think that's how we should
> > fix the regression, refactoring LIM properly requires more work
> > that doesn't seem to fit the stage1 deadline.
> >
> 
> So just like this?  Bootstrapped and tested on x86_64-linux and I have
> verified it fixes the bwaves reduction.

OK.

Thanks,
Richard.


> Thanks,
> 
> Martin
> 
> 
> 
> gcc/ChangeLog:
> 
> 2020-11-12  Martin Jambor  
> 
>   PR tree-optimization/94406
>   * tree-ssa-loop-im.c (tree_ssa_lim): Renamed to
>   loop_invariant_motion_in_fun, added a parameter to control store
>   motion.
>   (pass_lim::execute): Adjust call to tree_ssa_lim, now
>   loop_invariant_motion_in_fun.
>   * tree-ssa-loop-manip.h (loop_invariant_motion_in_fun): Declare.
>   * gimple-loop-interchange.cc (pass_linterchange::execute): Call
>   loop_invariant_motion_in_fun if any interchange has been done.
> ---
>  gcc/gimple-loop-interchange.cc |  9 +++--
>  gcc/tree-ssa-loop-im.c | 12 +++-
>  gcc/tree-ssa-loop-manip.h  |  2 +-
>  3 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
> index 1656004ecf0..a36dbb49b1f 100644
> --- a/gcc/gimple-loop-interchange.cc
> +++ b/gcc/gimple-loop-interchange.cc
> @@ -2085,8 +2085,13 @@ pass_linterchange::execute (function *fun)
>  }
>  
>if (changed_p)
> -scev_reset ();
> -  return changed_p ? (TODO_update_ssa_only_virtuals) : 0;
> +{
> +  unsigned todo = TODO_update_ssa_only_virtuals;
> +  todo |= loop_invariant_motion_in_fun (cfun, false);
> +  scev_reset ();
> +  return todo;
> +}
> +  return 0;
>  }
>  
>  } // anon namespace
> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
> index 6bb07e133cd..3c7412737f0 100644
> --- a/gcc

[PATCH v2] PR target/97682 - Fix to reuse t1 register between call address and epilogue.

2020-11-12 Thread Monk Chiang
  - When expanding the call pattern, the t1 register is chosen as the jump
register. The epilogue also uses t1 to adjust the stack pointer, so if both
are generated in the same function, t1 is initialized twice. The call
pattern emits 'la t1,symbol' and 'jalr t1' instructions, and the epilogue
emits 'li t1,4096' and 'add sp,sp,t1' instructions, but the li and add
instructions are placed between the la and jalr instructions. Some
optimizations then remove the la instruction: because t1 is defined twice,
the first definition looks redundant.

  - To resolve this issue, the prologue and epilogue use the t0 register
as their temporary register, and the call pattern uses the t1 register as
its temporary register.

  gcc/ChangeLog:

PR target/97682
* config/riscv/riscv.h (RISCV_PROLOGUE_TEMP_REGNUM): Change register to t0.
(RISCV_CALL_ADDRESS_TEMP_REGNUM): New macro, defined as the t1 register.
(RISCV_CALL_ADDRESS_TEMP): Use it for call instructions.
* config/riscv/riscv.c (riscv_legitimize_call_address): Use
RISCV_CALL_ADDRESS_TEMP.
(riscv_compute_frame_info): Change temporary register to t0 from t1.
(riscv_trampoline_init): Adjust comment.

  gcc/testsuite/ChangeLog

PR target/97682
* g++.target/riscv/pr97682.C: New test.
* gcc.target/riscv/interrupt-3.c: Check register for t0.
* gcc.target/riscv/interrupt-4.c: Likewise.
---
 gcc/config/riscv/riscv.c |  23 +--
 gcc/config/riscv/riscv.h |   6 +-
 gcc/testsuite/g++.target/riscv/pr97682.C | 160 +++
 gcc/testsuite/gcc.target/riscv/interrupt-3.c |   4 +-
 gcc/testsuite/gcc.target/riscv/interrupt-4.c |   4 +-
 5 files changed, 181 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/pr97682.C

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 989a9f15250..35029e7b435 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3110,7 +3110,7 @@ riscv_legitimize_call_address (rtx addr)
 {
   if (!call_insn_operand (addr, VOIDmode))
 {
-  rtx reg = RISCV_PROLOGUE_TEMP (Pmode);
+  rtx reg = RISCV_CALL_ADDRESS_TEMP (Pmode);
   riscv_emit_move (reg, addr);
   return reg;
 }
@@ -3707,18 +3707,18 @@ riscv_compute_frame_info (void)
 {
   struct riscv_frame_info *frame;
   HOST_WIDE_INT offset;
-  bool interrupt_save_t1 = false;
+  bool interrupt_save_prologue_temp = false;
   unsigned int regno, i, num_x_saved = 0, num_f_saved = 0;
 
   frame = &cfun->machine->frame;
 
   /* In an interrupt function, if we have a large frame, then we need to
- save/restore t1.  We check for this before clearing the frame struct.  */
+ save/restore t0.  We check for this before clearing the frame struct.  */
   if (cfun->machine->interrupt_handler_p)
 {
   HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
   if (! SMALL_OPERAND (frame->total_size - step1))
-   interrupt_save_t1 = true;
+   interrupt_save_prologue_temp = true;
 }
 
   memset (frame, 0, sizeof (*frame));
@@ -3728,7 +3728,8 @@ riscv_compute_frame_info (void)
   /* Find out which GPRs we need to save.  */
   for (regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
if (riscv_save_reg_p (regno)
-   || (interrupt_save_t1 && (regno == T1_REGNUM)))
+   || (interrupt_save_prologue_temp
+   && (regno == RISCV_PROLOGUE_TEMP_REGNUM)))
  frame->mask |= 1 << (regno - GP_REG_FIRST), num_x_saved++;
 
   /* If this function calls eh_return, we must also save and restore the
@@ -4902,9 +4903,9 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 
   rtx target_function = force_reg (Pmode, XEXP (DECL_RTL (fndecl), 0));
   /* lui t2, hi(chain)
-     lui t1, hi(func)
+     lui t0, hi(func)
      addi t2, t2, lo(chain)
-     jr  r1, lo(func)
+     jr  t0, lo(func)
   */
   unsigned HOST_WIDE_INT lui_hi_chain_code, lui_hi_func_code;
   unsigned HOST_WIDE_INT lo_chain_code, lo_func_code;
@@ -4929,7 +4930,7 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   mem = adjust_address (m_tramp, SImode, 0);
   riscv_emit_move (mem, lui_hi_chain);
 
-  /* Gen lui t1, hi(func).  */
+  /* Gen lui t0, hi(func).  */
   rtx hi_func = riscv_force_binary (SImode, PLUS, target_function,
fixup_value);
   hi_func = riscv_force_binary (SImode, AND, hi_func,
@@ -4956,7 +4957,7 @@ riscv_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   mem = adjust_address (m_tramp, SImode, 2 * GET_MODE_SIZE (SImode));
   riscv_emit_move (mem, addi_lo_chain);
 
-  /* Gen jr r1, lo(func).  */
+  /* Gen jr t0, lo(func).  */
   rtx lo_func = riscv_force_binary (SImode, AND, target_function,
 

Re: [PATCH] Support the new ("v0") mangling scheme in rust-demangle.

2020-11-12 Thread Nikhil Benesch via Gcc-patches

On 11/6/20 12:09 PM, Jeff Law wrote:

> So I think the best path forward is to let you and Eduard-Mihai make the
> technical decisions about what bits are ready for the trunk.  When y'all
> think something is ready, let's go ahead and get it installed and
> iterate on things that aren't quite ready yet.
>
> For bits y'all think are ready, ISTM that Eduard-Mihai should commit the
> changes.


I've attached an updated version of the patch that contains some
additional unit tests that eddyb noticed I lost. From my perspective,
this is now ready for commit.

Neither eddyb nor I have write access, so someone else will need to
commit. (But please wait for eddyb to sign off too.)


> It's better to get it in sooner, but there is some degree of freedom
> depending on the impact of the changes.  Changes in the rust demangler
> aren't likely to trigger codegen or ABI breakages in the compiler itself
> -- so with that in mind I think we should give this code a higher degree
> of freedom to land after the stage1 close deadline.


Got it. Thanks. That's very helpful context.

Nikhil
diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
index b87365c85fe..08c615f6d8b 100644
--- a/libiberty/rust-demangle.c
+++ b/libiberty/rust-demangle.c
@@ -1,6 +1,7 @@
 /* Demangler for the Rust programming language
Copyright (C) 2016-2020 Free Software Foundation, Inc.
Written by David Tolnay (dtol...@gmail.com).
+   Rewritten by Eduard-Mihai Burtescu (ed...@lyken.rs) for v0 support.
 
 This file is part of the libiberty library.
 Libiberty is free software; you can redistribute it and/or
@@ -64,11 +65,16 @@ struct rust_demangler
   /* Non-zero if any error occurred. */
   int errored;
 
+  /* Non-zero if nothing should be printed. */
+  int skipping_printing;
+
   /* Non-zero if printing should be verbose (e.g. include hashes). */
   int verbose;
 
   /* Rust mangling version, with legacy mangling being -1. */
   int version;
+
+  uint64_t bound_lifetime_depth;
 };
 
 /* Parsing functions. */
@@ -81,6 +87,18 @@ peek (const struct rust_demangler *rdm)
   return 0;
 }
 
+static int
+eat (struct rust_demangler *rdm, char c)
+{
+  if (peek (rdm) == c)
+{
+  rdm->next++;
+  return 1;
+}
+  else
+return 0;
+}
+
 static char
 next (struct rust_demangler *rdm)
 {
@@ -92,11 +110,87 @@ next (struct rust_demangler *rdm)
   return c;
 }
 
+static uint64_t
+parse_integer_62 (struct rust_demangler *rdm)
+{
+  char c;
+  uint64_t x;
+
+  if (eat (rdm, '_'))
+return 0;
+
+  x = 0;
+  while (!eat (rdm, '_'))
+{
+  c = next (rdm);
+  x *= 62;
+  if (ISDIGIT (c))
+x += c - '0';
+  else if (ISLOWER (c))
+x += 10 + (c - 'a');
+  else if (ISUPPER (c))
+x += 10 + 26 + (c - 'A');
+  else
+{
+  rdm->errored = 1;
+  return 0;
+}
+}
+  return x + 1;
+}
+
+static uint64_t
+parse_opt_integer_62 (struct rust_demangler *rdm, char tag)
+{
+  if (!eat (rdm, tag))
+return 0;
+  return 1 + parse_integer_62 (rdm);
+}
+
+static uint64_t
+parse_disambiguator (struct rust_demangler *rdm)
+{
+  return parse_opt_integer_62 (rdm, 's');
+}
+
+static size_t
+parse_hex_nibbles (struct rust_demangler *rdm, uint64_t *value)
+{
+  char c;
+  size_t hex_len;
+
+  hex_len = 0;
+  *value = 0;
+
+  while (!eat (rdm, '_'))
+{
+  *value <<= 4;
+
+  c = next (rdm);
+  if (ISDIGIT (c))
+*value |= c - '0';
+  else if (c >= 'a' && c <= 'f')
+*value |= 10 + (c - 'a');
+  else
+{
+  rdm->errored = 1;
+  return 0;
+}
+  hex_len++;
+}
+
+  return hex_len;
+}
+
 struct rust_mangled_ident
 {
   /* ASCII part of the identifier. */
   const char *ascii;
   size_t ascii_len;
+
+  /* Punycode insertion codes for Unicode codepoints, if any. */
+  const char *punycode;
+  size_t punycode_len;
 };
 
 static struct rust_mangled_ident
@@ -104,10 +198,16 @@ parse_ident (struct rust_demangler *rdm)
 {
   char c;
   size_t start, len;
+  int is_punycode = 0;
   struct rust_mangled_ident ident;
 
   ident.ascii = NULL;
   ident.ascii_len = 0;
+  ident.punycode = NULL;
+  ident.punycode_len = 0;
+
+  if (rdm->version != -1)
+is_punycode = eat (rdm, 'u');
 
   c = next (rdm);
   if (!ISDIGIT (c))
@@ -121,6 +221,10 @@ parse_ident (struct rust_demangler *rdm)
 while (ISDIGIT (peek (rdm)))
   len = len * 10 + (next (rdm) - '0');
 
+  /* Skip past the optional `_` separator (v0). */
+  if (rdm->version != -1)
+eat (rdm, '_');
+
   start = rdm->next;
   rdm->next += len;
   /* Check for overflows. */
@@ -133,6 +237,27 @@ parse_ident (struct rust_demangler *rdm)
   ident.ascii = rdm->sym + start;
   ident.ascii_len = len;
 
+  if (is_punycode)
+{
+  ident.punycode_len = 0;
+  while (ident.ascii_len > 0)
+{
+  ident.ascii_len--;
+
+  /* The last '_' is a separator between ascii & punycode. */
+  if (ident.ascii[ident.ascii_len] == '_')
+break;
+
+ 

Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.

2020-11-12 Thread Hongyu Wang via Gcc-patches
Hi

Thanks for reminding me about this patch. I didn't remove any existing
intrinsics; I just removed redundant builtin functions that end users
are unlikely to use.

I'm also OK with keeping the current implementation, in case someone
is using the builtins directly.

On Fri, Nov 13, 2020 at 1:43 PM, Jeff Law wrote:
>
>
> On 12/23/19 10:31 PM, Hongyu Wang wrote:
>
> Hi:
>   For avx512f scalar instructions, current builtin function like
> __builtin_ia32_*{sd,ss}_round can be replaced by
> __builtin_ia32_*{sd,ss}_mask_round with mask parameter set to -1. This
> patch did the replacement and remove the corresponding redundant
> builtins.
>
>   Bootstrap is ok, make-check ok for i386 target.
>   Ok for trunk?
>
> Changelog
>
> gcc/
> * config/i386/avx512fintrin.h
> (_mm_add_round_sd, _mm_add_round_ss): Use
>  __builtin_ia32_adds?_mask_round builtins instead of
> __builtin_ia32_adds?_round.
> (_mm_sub_round_sd, _mm_sub_round_ss,
> _mm_mul_round_sd, _mm_mul_round_ss,
> _mm_div_round_sd, _mm_div_round_ss,
> _mm_getexp_sd, _mm_getexp_ss,
> _mm_getexp_round_sd, _mm_getexp_round_ss,
> _mm_getmant_sd, _mm_getmant_ss,
> _mm_getmant_round_sd, _mm_getmant_round_ss,
> _mm_max_round_sd, _mm_max_round_ss,
> _mm_min_round_sd, _mm_min_round_ss,
> _mm_fmadd_round_sd, _mm_fmadd_round_ss,
> _mm_fmsub_round_sd, _mm_fmsub_round_ss,
> _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> * config/i386/i386-builtin.def
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> * config/i386/i386-expand.c
> (ix86_expand_round_builtin): Remove corresponding case.
>
> gcc/testsuite/
> * lib/target-supports.exp
> (check_effective_target_avx512f): Use
> __builtin_ia32_getmantsd_mask_round builtins instead of
> __builtin_ia32_getmantsd_round.
> * gcc.target/i386/avx-1.c
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
>
> So I like the idea of simplifying the implementation of some of the 
> intrinsics when we can, but ISTM that removing existing intrinsics would be a 
> mistake since end-users could be using them in their code.   I'd think we'd 
> want to keep the existing APIs, even if we change the implementation under 
> the hood.
>
>
> Thoughts?
>
>
> jeff
>
>
> Hongyu Wang
>
>
> 0001-Remove-redundant-round-builtins-for-avx512f-scalar-i.patch
>
> From 9cc4928aad5770c53ff580f5c996092cdaf2f9ba Mon Sep 17 00:00:00 2001
> From: hongyuw1 
> Date: Wed, 18 Dec 2019 14:52:54 +
> Subject: [PATCH] Remove redundant round builtins for avx512f scalar
>  instructions
>
> Changelog
>
> gcc/
> * config/i386/avx512fintrin.h
> (_mm_add_round_sd, _mm_add_round_ss): Use
> __builtin_ia32_adds?_mask_round builtins instead of
> __builtin_ia32_adds?_round.
> (_mm_sub_round_sd, _mm_sub_round_ss,
> _mm_mul_round_sd, _mm_mul_round_ss,
> _mm_div_round_sd, _mm_div_round_ss,
> _mm_getexp_sd, _mm_getexp_ss,
> _mm_getexp_round_sd, _mm_getexp_round_ss,
> _mm_getmant_sd, _mm_getmant_ss,
> _mm_getmant_round_sd, _mm_getmant_round_ss,
> _mm_max_round_sd, _mm_max_round_ss,
> _mm_min_round_sd, _mm_min_round_ss,
> _mm_fmadd_round_sd, _mm_fmadd_round_ss,
> _mm_fmsub_round_sd, _mm_fmsub_round_ss,
> _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> * config/i386/i386-builtin.def
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_

Re: [PATCH] Remove redundant builtins for avx512f scalar instructions.

2020-11-12 Thread Jeff Law via Gcc-patches


On 12/23/19 10:31 PM, Hongyu Wang wrote:
> Hi:
>   For avx512f scalar instructions, the current builtin functions like
> __builtin_ia32_*{sd,ss}_round can be replaced by
> __builtin_ia32_*{sd,ss}_mask_round with the mask parameter set to -1. This
> patch does the replacement and removes the corresponding redundant
> builtins.
>
>   Bootstrap is ok, make-check ok for i386 target.
>   Ok for trunk?
>
> Changelog
>
> gcc/
> * config/i386/avx512fintrin.h
> (_mm_add_round_sd, _mm_add_round_ss): Use
>  __builtin_ia32_adds?_mask_round builtins instead of
> __builtin_ia32_adds?_round.
> (_mm_sub_round_sd, _mm_sub_round_ss,
> _mm_mul_round_sd, _mm_mul_round_ss,
> _mm_div_round_sd, _mm_div_round_ss,
> _mm_getexp_sd, _mm_getexp_ss,
> _mm_getexp_round_sd, _mm_getexp_round_ss,
> _mm_getmant_sd, _mm_getmant_ss,
> _mm_getmant_round_sd, _mm_getmant_round_ss,
> _mm_max_round_sd, _mm_max_round_ss,
> _mm_min_round_sd, _mm_min_round_ss,
> _mm_fmadd_round_sd, _mm_fmadd_round_ss,
> _mm_fmsub_round_sd, _mm_fmsub_round_ss,
> _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
> _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
> * config/i386/i386-builtin.def
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> * config/i386/i386-expand.c
> (ix86_expand_round_builtin): Remove corresponding case.
>
> gcc/testsuite/
> * lib/target-supports.exp
> (check_effective_target_avx512f): Use
> __builtin_ia32_getmantsd_mask_round builtins instead of
> __builtin_ia32_getmantsd_round.
> *gcc.target/i386/avx-1.c
> (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
> __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
> __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
> __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
> __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
> __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
> __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
> __builtin_ia32_minsd_round, __builtin_ia32_minss_round,
> __builtin_ia32_vfmaddsd3_round,
> __builtin_ia32_vfmaddss3_round): Remove.
> *gcc.target/i386/sse-13.c: Ditto.
> *gcc.target/i386/sse-23.c: Ditto.

So I like the idea of simplifying the implementation of some of the
intrinsics when we can, but ISTM that removing existing intrinsics would
be a mistake since end-users could be using them in their code.   I'd
think we'd want to keep the existing APIs, even if we change the
implementation under the hood.
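
The point about keeping the API while changing the implementation can be illustrated with a toy scalar model. This is only a sketch: `masked_add_model` and `add_model` are hypothetical names, not GCC builtins or intrinsics, and the model ignores the rounding-mode argument and the upper vector elements that the real instructions handle. It shows why an all-ones mask makes the masked form behave like the unmasked one, so the unmasked entry point can remain available as a thin wrapper.

```cpp
#include <cassert>
#include <cstdint>

// Toy scalar model of a masked scalar add (NOT the real builtin):
// bit 0 of `mask` selects between the computed sum and the
// pass-through value `src`.
inline double masked_add_model(double src, std::uint8_t mask,
                               double a, double b)
{
  return (mask & 1u) ? a + b : src;
}

// The unmasked entry point stays visible to users, but is implemented
// on top of the masked form with every mask bit set (-1).
inline double add_model(double a, double b)
{
  const std::uint8_t all_ones = static_cast<std::uint8_t>(-1);
  return masked_add_model(/*src=*/a, all_ones, a, b);
}
```

With this shape, removing the separate unmasked implementation does not require removing the name that existing user code calls.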


Thoughts?


jeff


> Hongyu Wang
>
> 0001-Remove-redundant-round-builtins-for-avx512f-scalar-i.patch
>
> From 9cc4928aad5770c53ff580f5c996092cdaf2f9ba Mon Sep 17 00:00:00 2001
> From: hongyuw1 
> Date: Wed, 18 Dec 2019 14:52:54 +
> Subject: [PATCH] Remove redundant round builtins for avx512f scalar
>  instructions
>
> Changelog
>
> gcc/
>   * config/i386/avx512fintrin.h
>   (_mm_add_round_sd, _mm_add_round_ss): Use
>__builtin_ia32_adds?_mask_round builtins instead of
>   __builtin_ia32_adds?_round.
>   (_mm_sub_round_sd, _mm_sub_round_ss,
>   _mm_mul_round_sd, _mm_mul_round_ss,
>   _mm_div_round_sd, _mm_div_round_ss,
>   _mm_getexp_sd, _mm_getexp_ss,
>   _mm_getexp_round_sd, _mm_getexp_round_ss,
>   _mm_getmant_sd, _mm_getmant_ss,
>   _mm_getmant_round_sd, _mm_getmant_round_ss,
>   _mm_max_round_sd, _mm_max_round_ss,
>   _mm_min_round_sd, _mm_min_round_ss,
>   _mm_fmadd_round_sd, _mm_fmadd_round_ss,
>   _mm_fmsub_round_sd, _mm_fmsub_round_ss,
>   _mm_fnmadd_round_sd, _mm_fnmadd_round_ss,
>   _mm_fnmsub_round_sd, _mm_fnmsub_round_ss): Likewise.
>   * config/i386/i386-builtin.def
>   (__builtin_ia32_addsd_round, __builtin_ia32_addss_round,
>   __builtin_ia32_subsd_round, __builtin_ia32_subss_round,
>   __builtin_ia32_mulsd_round, __builtin_ia32_mulss_round,
>   __builtin_ia32_divsd_round, __builtin_ia32_divss_round,
>   __builtin_ia32_getexpsd128_round, __builtin_ia32_getexpss128_round,
>   __builtin_ia32_getmantsd_round, __builtin_ia32_getmantss_round,
>   __builtin_ia32_maxsd_round, __builtin_ia32_maxss_round,
>   __builtin_ia32_minsd_round, __bui

[r11-4958 Regression] FAIL: 30_threads/future/members/poll.cc execution test on Linux/x86_64

2020-11-12 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

93fc47746815ea9dac413322fcade2931f757e7f is the first bad commit
commit 93fc47746815ea9dac413322fcade2931f757e7f
Author: Jonathan Wakely 
Date:   Thu Nov 12 21:25:14 2020 +

libstdc++: Optimise std::future::wait_for and fix futex polling

caused

FAIL: 30_threads/future/members/poll.cc execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-4958/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=30_threads/future/members/poll.cc 
--target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=30_threads/future/members/poll.cc 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=30_threads/future/members/poll.cc 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH] [libiberty] Fix write buffer overflow in cplus_demangle

2020-11-12 Thread Jeff Law via Gcc-patches


On 11/29/19 12:15 PM, Tim Rühsen wrote:
> * cplus-dem.c (ada_demangle): Correctly calculate the demangled
>   size by using two passes.

So I'm not sure why, but I can't get this patch to apply.  What's even
more interesting is ada_demangle doesn't seem to have changed since 2010
and even if I checkout a Nov 2019 trunk, I still can't apply the patch.


I can see what you're doing with your patch (it's primarily introducing
a loop where you count on the first pass and allocate on the second and
re-indent all the necessary code), but I'd prefer not to muck it up trying
to apply by hand.


Any chance you could update the patch so that it applies to the trunk? 
The review is done, so it should be able to go straight in.  If you have
commit privs (I don't recall if you do or not), you can go ahead and
commit it yourself.


Sorry for the insane delays here.

jeff




Re: [PATCH v2] c: Silently ignore pragma region [PR85487]

2020-11-12 Thread Jeff Law via Gcc-patches


On 9/2/20 6:59 PM, Austin Morton via Gcc-patches wrote:
> #pragma region is a feature introduced by Microsoft in order to allow
> manual grouping and folding of code within Visual Studio.  It is
> entirely ignored by the compiler.  Clang has supported this feature
> since 2012 when in MSVC compatibility mode, and enabled it across the
> board in 2018.
>
> As it stands, you cannot use #pragma region within GCC without
> disabling unknown pragma warnings, which is not advisable.
>
> I propose GCC adopt "#pragma region" and "#pragma endregion" in order
> to alleviate these issues.  Because the pragma has no purpose at
> compile time, the implementation is trivial.
>
>
> Microsoft Documentation on the feature:
> https://docs.microsoft.com/en-us/cpp/preprocessor/region-endregion
>
> LLVM change which enabled pragma region across the board:
> https://reviews.llvm.org/D42248
> ---
>  gcc/ChangeLog|  5 +
>  gcc/c-family/ChangeLog   |  5 +
>  gcc/c-family/c-pragma.c  | 10 ++
>  gcc/doc/cpp.texi |  6 ++
>  gcc/testsuite/ChangeLog  |  5 +
>  gcc/testsuite/gcc.dg/pragma-region.c | 21 +
>  6 files changed, 52 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pragma-region.c

I'm not sure that this is really the way we want to handle this stuff. 
I understand the problem you're trying to solve, but embedding a list of
pragmas to ignore into the compiler itself just seems like the wrong
approach -- it bakes that set of pragmas to ignore into the compiler.


ISTM that we'd be better off either having a command line option to list
the set of pragmas to ignore, or they should be pulled from a file
specified on the command line.   That would seem to be a lot more
friendly to downstream users since each project could set the list of
pragmas to ignore on their own and have that set updated dynamically
over time without having to patch and update GCC.
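
A minimal sketch of the idea, assuming the ignore list arrives as a comma-separated string (how it would be supplied, by option or by file, is left open here). Both function names are hypothetical and none of this is GCC code; it only shows that a user-controlled list can drive the decision to diagnose an unknown pragma.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: split "region,endregion" into a set of names.
// Empty entries (for example from "a,,b") are skipped.
std::set<std::string> parse_ignored_pragmas(const std::string& list)
{
  std::set<std::string> names;
  std::istringstream in(list);
  std::string name;
  while (std::getline(in, name, ','))
    if (!name.empty())
      names.insert(name);
  return names;
}

// Hypothetical check: an unknown pragma is diagnosed only when its
// name is not in the user-supplied ignore list.
bool should_warn_unknown_pragma(const std::set<std::string>& ignored,
                                const std::string& pragma_name)
{
  return ignored.count(pragma_name) == 0;
}
```

A project that uses `#pragma region` would list `region` and `endregion`, and could extend the list later without a compiler change.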


Any chance you would be willing to work on that?

Jeff



[committed] MAINTAINERS: add myself for write after approval

2020-11-12 Thread HAO CHEN GUI via Gcc-patches

2020-11-13  Haochen Gui  

    * MAINTAINERS (Write After Approval): add myself
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index a0216185de9..be42e1441ca 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -409,6 +409,7 @@ Matthew Gretton-Dann 
 Yury Gribov 
 Jon Grimm 
 Laurent Guerby 
+Haochen Gui 
 Jiufu Guo 
 Xuepeng Guo 
 Wei Guozhi 
--
2.18.4



Re: [PATCH][RFC] Make mingw-w64 printf/scanf attribute alias to ms_printf/ms_scanf only for C89

2020-11-12 Thread Liu Hao via Gcc-patches
On 2020/11/13 2:46, Joseph Myers wrote:
> I'd expect these patches to include updates to the gcc.dg/format/ms_*.c 
> tests to reflect the changed semantics (or new tests there if some of the 
> changes don't result in any failures in the existing tests).
> 

Does the attached patch suffice?

I know very little about DejaGnu. I only tried compiling the function in that
test and verified that lines without `dg-warning` didn't result in any
warnings with my bootstrapped GCC last night, both on i686 and x86_64.



-- 
Best regards,
LH_Mouse
From 3f58912fb369fd1f645d880a3d967e6523b87507 Mon Sep 17 00:00:00 2001
From: Liu Hao 
Date: Thu, 12 Nov 2020 22:20:29 +0800
Subject: [PATCH] gcc: Add `ll` and `L` length modifiers for `ms_printf`

The previous code abused `FMT_LEN_L` for the `I` modifier. As `L` is a valid
modifier for `f`, `e`, `g`, etc. and `I` has the same semantics as the
C99 `z` modifier, `FMT_LEN_z` is now used.

First, in the Microsoft ABI, type `long double` has the same layout as
type `double`, so `%Lg` behaves identically to `%g`. Users should pass
in `double`s instead of `long double`s, as GCC uses the 10-byte format.

Second, with a CRT that is recent enough (MSVCRT since Vista, MSVCR80,
UCRT, or mingw-w64 8.0), `printf`-family functions can handle the `ll`
length modifier correctly. This ability is assumed to be available
universally. A lot of libraries (such as libgomp) that use the
`format(printf, ...)` attribute used to suffer from warnings about
unknown format specifiers.
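
For context, the kind of library code affected looks roughly like the sketch below. `format_message` is a hypothetical wrapper, not taken from libgomp or any other library; the `format(printf, 1, 2)` attribute is the GCC feature that makes the compiler check the format string against the arguments. On mingw-w64 targets `printf` in that attribute is treated as `ms_printf`, which is where `%lld` used to draw a warning.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical printf-style wrapper. The attribute asks the compiler
// to check `fmt` (argument 1) against the variadic arguments that
// start at argument 2.
#if defined(__GNUC__)
__attribute__((format(printf, 1, 2)))
#endif
std::string format_message(const char* fmt, ...)
{
  char buf[128];
  va_list ap;
  va_start(ap, fmt);
  const int n = std::vsnprintf(buf, sizeof buf, fmt, ap);
  va_end(ap);
  if (n < 0)
    return std::string();   // encoding error: return an empty string
  return std::string(buf);  // output is truncated to fit `buf`
}
```

A call such as `format_message("%lld", some_long_long)` is the pattern that relies on the `ll` length modifier being accepted.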

Reference: 
https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2008/tcxf1dw6(v=vs.90)
Reference: 
https://docs.microsoft.com/en-us/cpp/porting/visual-cpp-what-s-new-2003-through-2015#new-crt-features
Signed-off-by: Liu Hao 

gcc/:
* config/i386/msformat-c.c: Add more length modifiers
---
 gcc/config/i386/msformat-c.c  | 45 ++-
 gcc/testsuite/gcc.dg/format/ms_c99-printf-3.c | 22 -
 2 files changed, 44 insertions(+), 23 deletions(-)

diff --git a/gcc/config/i386/msformat-c.c b/gcc/config/i386/msformat-c.c
index 4ceec633a6e..1902b3c73d0 100644
--- a/gcc/config/i386/msformat-c.c
+++ b/gcc/config/i386/msformat-c.c
@@ -32,10 +32,11 @@ along with GCC; see the file COPYING3.  If not see
 static format_length_info ms_printf_length_specs[] =
 {
   { "h", FMT_LEN_h, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
-  { "l", FMT_LEN_l, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
+  { "l", FMT_LEN_l, STD_C89, "ll", FMT_LEN_ll, STD_C89, 0 },
+  { "L", FMT_LEN_L, STD_C89, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I32", FMT_LEN_l, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I64", FMT_LEN_ll, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
-  { "I", FMT_LEN_L, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
+  { "I", FMT_LEN_z, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { NULL, FMT_LEN_none, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 }
 };
 
@@ -90,33 +91,33 @@ static const format_flag_pair ms_strftime_flag_pairs[] =
 static const format_char_info ms_print_char_table[] =
 {
   /* C89 conversion specifiers.  */
-  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  T99_SST, 
 BADLEN, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "-wp0 +'",  "i",  NULL 
},
-  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, 
BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0#", "i",  NULL },
-  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, 
BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0'","i",  NULL },
-  { "fgG", 0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN }, "-wp0 +#'", "",   NULL },
-  { "eE",  0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN }, "-wp0 +#",  "",   NULL },
-  { "c",   0, STD_C89, { T89_I,   BADLEN,  T89_S,  T94_WI,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w","",   NULL 
},
-  { "s",   1, STD_C89, { T89_C,   BADLEN,  T89_S,  T94_W,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",   "cR", NULL 
},
-  { "p",   1, STD_C89, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w","c",  NULL 
},
-  { "n",   1, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  BADLEN,  
BADLEN, BADLEN,  T99_IM,  BADLEN,  BADLEN,  BADLEN }, "",  "W",  NULL },
+  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  BADLEN, 
T99_SST, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0 +'",  "i",  NULL },
+  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, BADLEN, 
T99_ST,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0#","i",  NULL },
+  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, BADLEN, 
T99_ST,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "

Ping^2: [PATCH 0/4] rs6000: Enable variable vec_insert with IFN VEC_SET

2020-11-12 Thread Xionghu Luo via Gcc-patches

Ping^2, thanks.

On 2020/11/5 09:34, Xionghu Luo via Gcc-patches wrote:

Ping.

On 2020/10/10 16:08, Xionghu Luo wrote:

Originated from
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html
with patch split and some refinement per review comments.

Patch of IFN VEC_SET for ARRAY_REF(VIEW_CONVERT_EXPR) is committed,
this patch set enables expanding IFN VEC_SET for Power9 and Power8
with specific instruction sequences.

Xionghu Luo (4):
   rs6000: Change rs6000_expand_vector_set param
   rs6000: Support variable insert and Expand vec_insert in expander [PR79251]
   rs6000: Enable vec_insert for P8 with rs6000_expand_vector_set_var_p8
   rs6000: Update testcases' instruction count

  gcc/config/rs6000/rs6000-c.c  |  44 +++--
  gcc/config/rs6000/rs6000-call.c   |   2 +-
  gcc/config/rs6000/rs6000-protos.h |   3 +-
  gcc/config/rs6000/rs6000.c    | 181 +-
  gcc/config/rs6000/vector.md   |   4 +-
  .../powerpc/fold-vec-insert-char-p8.c |   8 +-
  .../powerpc/fold-vec-insert-char-p9.c |  12 +-
  .../powerpc/fold-vec-insert-double.c  |  11 +-
  .../powerpc/fold-vec-insert-float-p8.c    |   6 +-
  .../powerpc/fold-vec-insert-float-p9.c    |  10 +-
  .../powerpc/fold-vec-insert-int-p8.c  |   6 +-
  .../powerpc/fold-vec-insert-int-p9.c  |  11 +-
  .../powerpc/fold-vec-insert-longlong.c    |  10 +-
  .../powerpc/fold-vec-insert-short-p8.c    |   6 +-
  .../powerpc/fold-vec-insert-short-p9.c    |   8 +-
  .../gcc.target/powerpc/pr79251-run.c  |  28 +++
  gcc/testsuite/gcc.target/powerpc/pr79251.h    |  19 ++
  gcc/testsuite/gcc.target/powerpc/pr79251.p8.c |  17 ++
  gcc/testsuite/gcc.target/powerpc/pr79251.p9.c |  18 ++
  .../gcc.target/powerpc/vsx-builtin-7.c    |   4 +-
  20 files changed, 337 insertions(+), 71 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251-run.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251.h
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251.p8.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251.p9.c





--
Thanks,
Xionghu


Re: [committed] wwwdocs: Editorial changes around x86-64 ISA extensions

2020-11-12 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 13, 2020 at 3:32 AM Gerald Pfeifer  wrote:
>
> Per our discussion on the list (plus a grammar improvement in a
> section above).
>
> One question: why are the ISA extension lists not alphabetically
> sorted?  Wouldn't that be beneficial for users?  Easier to find
> something and also easier to compare?
>

Hmm, I just sorted them by the time they are enabled.

When I changed the wwwdocs, I was referring to the previous
gcc-8/changes.html, and didn't find that it was alphabetical.

> Gerald
>
> ---
>  htdocs/gcc-11/changes.html | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
> index fc4c74f4..106db8e9 100644
> --- a/htdocs/gcc-11/changes.html
> +++ b/htdocs/gcc-11/changes.html
> @@ -265,7 +265,8 @@ a work-in-progress.
>
>New ISA extension support for Intel AMX-TILE, AMX-INT8, AMX-BF16 was
>added to GCC. AMX-TILE, AMX-INT8, AMX-BF16 intrinsics are available
> -  via the -mamx-tile, -mamx-int8, -mamx-bf16 compiler 
> switch.
> +  via the -mamx-tile, -mamx-int8, -mamx-bf16 compiler
> +  switches.
>
>New ISA extension support for Intel AVX-VNNI was added to GCC.
>AVX-VNNI intrinsics are available via the -mavxvnni
> @@ -273,14 +274,14 @@ a work-in-progress.
>
>GCC now supports the Intel CPU named Sapphire Rapids through
>  -march=sapphirerapids.
> -The switch enables the MOVDIRI MOVDIR64B AVX512VP2INTERSECT ENQCMD 
> CLDEMOTE
> -SERIALIZE PTWRITE WAITPKG TSXLDTRK AMT-TILE AMX-INT8 AMX-BF16 AVX-VNNI
> -ISA extensions.
> +The switch enables the MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, ENQCMD,
> +CLDEMOTE, SERIALIZE, PTWRITE, WAITPKG, TSXLDTRK, AMT-TILE, AMX-INT8,
> +AMX-BF16, and AVX-VNNI ISA extensions.
>
>GCC now supports the Intel CPU named Alderlake through
>  -march=alderlake.
> -The switch enables the CLDEMOTE PTWRITE WAITPKG SERIALIZE KEYLOCKER 
> AVX-VNNI
> -HRESET ISA extensions.
> +The switch enables the CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, KEYLOCKER,
> +AVX-VNNI, and HRESET ISA extensions.
>
>  
>
> --
> 2.29.2



-- 
BR,
Hongtao


Re: [PATCH,wwwdocs] gcc-11/changes: Mention Intel AVX-VNNI

2020-11-12 Thread Hongtao Liu via Gcc-patches
Got it.

On Fri, Nov 13, 2020 at 3:26 AM Gerald Pfeifer  wrote:
>
> On Wed, 11 Nov 2020, Hongtao Liu via Gcc-patches wrote:
> > +  New ISA extension support for Intel AVX-VNNI was added to GCC.
>
> More for the future (i.e., no need to change that now): I suggest
> to skip "to GCC" in cases like this, since this is our context to
> begin with.
>
> Gerald



-- 
BR,
Hongtao


Re: [PATCH] RISC-V: Enable ifunc if it was supported in the binutils for linux toolchain.

2020-11-12 Thread Nelson Chu
On Fri, Nov 13, 2020 at 5:50 AM Jim Wilson  wrote:
>I committed and pushed it.

Thanks for your help!!

> I see some extra ifunc related testsuite failures, but that is because we 
> don't have the glibc ifunc patches upstream yet.  It will be important to get 
> those done next.

Yeah, hope we can catch up on this before the next release.

Thanks
Nelson


[PATCH] Change range_handler, was Re: Fix gimple_expr_code?

2020-11-12 Thread Andrew MacLeod via Gcc-patches

On 11/12/20 4:12 PM, Andrew MacLeod via Gcc-patches wrote:

On 11/12/20 3:53 PM, Richard Biener wrote:



But it means that gimple_expr_code() isn't returning the correct result
for GIMPLE_SINGLE_RHS

It depends. An SSA name isn't an expression code either. As said, the 
generic gimple_expr_code should be used with extreme care.


What is an expression code?  It seems like it's just a tree_code 
representing what is on the RHS?  I'm not sure I understand why one 
needs to be careful with it.  It only applies to COND, ASSIGN and 
CALL, and it's currently right for everything except GIMPLE_SINGLE_RHS?


If we don't fix gimple_expr_code, then I'm basically going to be 
reimplementing it myself... which seems kind of pointless.


Andrew


However, that said, it seems like reworking the accessor is probably 
better anyway.  Point taken on expr_type: for a GIMPLE_COND I wasn't 
actually getting the type I really wanted, as it turned out.


anyway, fixed thusly.

Bootstrapped on x86_64-pc-linux-gnu, no regressions.  pushed.

Andrew

commit ee24da1b983a89b05303f2ac8828dd8cbe28d3b4
Author: Andrew MacLeod 
Date:   Thu Nov 12 19:25:59 2020 -0500

Change range_handler, was  Re: Fix gimple_expr_code?

Adjust the range_handler to not use gimple_expr_code/type.

* gimple-range.h (gimple_range_handler): Use gimple_assign and
gimple_cond routines to get type and code.
* range-op.cc (range_op_handler): Check for integral types.

diff --git a/gcc/gimple-range.h b/gcc/gimple-range.h
index 0aa6d4672ee..88d2ada324b 100644
--- a/gcc/gimple-range.h
+++ b/gcc/gimple-range.h
@@ -97,8 +97,12 @@ extern bool gimple_range_calc_op2 (irange &r, const gimple 
*s,
 static inline range_operator *
 gimple_range_handler (const gimple *s)
 {
-  if ((gimple_code (s) == GIMPLE_ASSIGN) || (gimple_code (s) == GIMPLE_COND))
-return range_op_handler (gimple_expr_code (s), gimple_expr_type (s));
+  if (gimple_code (s) == GIMPLE_ASSIGN)
+return range_op_handler (gimple_assign_rhs_code (s),
+TREE_TYPE (gimple_assign_lhs (s)));
+  if (gimple_code (s) == GIMPLE_COND)
+return range_op_handler (gimple_cond_code (s),
+TREE_TYPE (gimple_cond_lhs (s)));
   return NULL;
 }
 
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index aff9383d936..86d1af7fe54 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -3341,10 +3341,12 @@ pointer_table::pointer_table ()
 range_operator *
 range_op_handler (enum tree_code code, tree type)
 {
-  // First check if there is apointer specialization.
+  // First check if there is a pointer specialization.
   if (POINTER_TYPE_P (type))
 return pointer_tree_table[code];
-  return integral_tree_table[code];
+  if (INTEGRAL_TYPE_P (type))
+return integral_tree_table[code];
+  return NULL;
 }
 
 // Cast the range in R to TYPE.


[committed] libstdc++: Optimise std::future::wait_for and fix futex polling

2020-11-12 Thread Jonathan Wakely via Gcc-patches
To poll a std::future to see if it's ready you have to call one of the
timed waiting functions. The most obvious way is wait_for(0s) but this
was previously very inefficient because it would turn the relative
timeout to an absolute one by calling system_clock::now(). When the
relative timeout is zero (or less) we're obviously going to get a time
that has already passed, but the overhead of obtaining the current time
can be dozens of microseconds. The alternative is to call wait_until
with an absolute timeout that is in the past. If you know the clock's
epoch is in the past you can use a default constructed time_point.
Alternatively, using some_clock::time_point::min() gives the earliest
time point supported by the clock, which should be safe to assume is in
the past. However, using a futex wait with an absolute timeout before
the UNIX epoch fails and sets errno=EINVAL. The new code using futex
waits with absolute timeouts was not checking for this case, which could
result in hangs (or killing the process if the library is built with
assertions enabled).

This patch checks for times before the epoch before attempting to wait
on a futex with an absolute timeout, which fixes the hangs or crashes.
It also makes it very fast to poll using an absolute timeout before the
epoch (because we skip the futex syscall).

It also makes future::wait_for avoid waiting at all when the relative
timeout is zero or less, to avoid the unnecessary overhead of getting
the current time. This makes polling with wait_for(0s) take only a few
cycles instead of dozens of microseconds.

libstdc++-v3/ChangeLog:

* include/std/future (future::wait_for): Do not wait for
durations less than or equal to zero.
* src/c++11/futex.cc (_M_futex_wait_until)
(_M_futex_wait_until_steady): Do not wait for timeouts before
the epoch.
* testsuite/30_threads/future/members/poll.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

I think the shortcut in future::wait_for is worth backporting. The
changes in src/c++11/futex.cc are not needed because the code using
absolute timeouts with futex waits is not present on any release
branch.
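
The two polling styles described above can be sketched as follows. The helper names `is_ready_wait_for` and `is_ready_wait_until` are hypothetical; only `wait_for`, `wait_until` and the `std::chrono` types come from the standard library. The second helper uses a default-constructed `system_clock::time_point`, that is, the clock's epoch, as a timeout that is assumed to be in the past.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Poll with a zero relative timeout.
template <typename T>
bool is_ready_wait_for(const std::future<T>& f)
{
  using namespace std::chrono_literals;
  return f.wait_for(0s) == std::future_status::ready;
}

// Poll with an absolute timeout at the clock's epoch, which is
// assumed to have already passed.
template <typename T>
bool is_ready_wait_until(const std::future<T>& f)
{
  const std::chrono::system_clock::time_point epoch{};
  return f.wait_until(epoch) == std::future_status::ready;
}
```

Both helpers return false for a deferred future, because the timed waits report `future_status::deferred` in that case rather than running the deferred function.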

commit 93fc47746815ea9dac413322fcade2931f757e7f
Author: Jonathan Wakely 
Date:   Thu Nov 12 21:25:14 2020

libstdc++: Optimise std::future::wait_for and fix futex polling

To poll a std::future to see if it's ready you have to call one of the
timed waiting functions. The most obvious way is wait_for(0s) but this
was previously very inefficient because it would turn the relative
timeout to an absolute one by calling system_clock::now(). When the
relative timeout is zero (or less) we're obviously going to get a time
that has already passed, but the overhead of obtaining the current time
can be dozens of microseconds. The alternative is to call wait_until
with an absolute timeout that is in the past. If you know the clock's
epoch is in the past you can use a default constructed time_point.
Alternatively, using some_clock::time_point::min() gives the earliest
time point supported by the clock, which should be safe to assume is in
the past. However, using a futex wait with an absolute timeout before
the UNIX epoch fails and sets errno=EINVAL. The new code using futex
waits with absolute timeouts was not checking for this case, which could
result in hangs (or killing the process if the library is built with
assertions enabled).

This patch checks for times before the epoch before attempting to wait
on a futex with an absolute timeout, which fixes the hangs or crashes.
It also makes it very fast to poll using an absolute timeout before the
epoch (because we skip the futex syscall).

It also makes future::wait_for avoid waiting at all when the relative
timeout is zero or less, to avoid the unnecessary overhead of getting
the current time. This makes polling with wait_for(0s) take only a few
cycles instead of dozens of microseconds.

libstdc++-v3/ChangeLog:

* include/std/future (future::wait_for): Do not wait for
durations less than or equal to zero.
* src/c++11/futex.cc (_M_futex_wait_until)
(_M_futex_wait_until_steady): Do not wait for timeouts before
the epoch.
* testsuite/30_threads/future/members/poll.cc: New test.

diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index 5d948018c75c..f7617cac8e93 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -345,10 +345,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  // to synchronize with the thread that made it ready.
  if (_M_status._M_load(memory_order_acquire) == _Status::__ready)
return future_status::ready;
+
  if (_M_is_deferred_future())
return future_status::deferred;
- if (_M_status._M_load_when_equal_for(_Status::__re

Re: [PATCH] C-Family, Objective-C : Implement Objective-C nullability Part 1 [PR90707].

2020-11-12 Thread Joseph Myers
On Thu, 12 Nov 2020, Iain Sandoe wrote:

> OK for the C-family changes?

OK.

> +When @var{nullability kind} is @var{"unspecified"} or @var{0}, nothing is

I think you mean @code or @samp for the second and third @var on this 
line, they look like literal code not metasyntactic variables.  Likewise 
below.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PowerPC: Use __float128 instead of __ieee128 in tests.

2020-11-12 Thread Segher Boessenkool
On Thu, Nov 12, 2020 at 04:44:09PM -0500, Michael Meissner wrote:
> On Thu, Nov 12, 2020 at 01:26:32PM -0600, Segher Boessenkool wrote:
> > On Thu, Oct 22, 2020 at 06:12:31PM -0400, Michael Meissner wrote:
> > > Two of the tests used the __ieee128 keyword instead of __float128.  This
> > > patch changes those cases to use the official keyword.
> > 
> > What is "official" about that?
> > 
> > Why make this change at all?  __ieee128 should work as well!  Did you
> > see failures without this patch?  Those need fixing, then.
> 
> We document '__float128'.  We don't document '__ieee128'.  As I said, using
> '__ieee128' internally was due to some issues in the GCC 7 time frame,
> particularly before we had the glibc changes.

Well, it is a much clearer type as well: __ibm128 is also 128 bits, and
is also a floating point type.  But if __float128 now *always* means
__ieee128, then fine :-)

(But the manual needs fixing in four places, then.)

Is __float128 a standard type?  ("Standard", in whatever context -- not
just a rs6000 GCC thing, and what else uses it then, and/or will other
things use it in the future?)

Also, we then should change things so __ieee128 becomes really only a
legacy alias for __float128.

Thanks,


Segher


Re: [PATCH v5 2/8] libstdc++ futex: Use FUTEX_CLOCK_REALTIME for wait

2020-11-12 Thread Jonathan Wakely via Gcc-patches

On 29/05/20 07:17 +0100, Mike Crowe via Libstdc++ wrote:

The futex system call supports waiting for an absolute time if
FUTEX_WAIT_BITSET is used rather than FUTEX_WAIT.  Doing so provides two
benefits:

1. The call to gettimeofday is not required in order to calculate a
  relative timeout.

2. If someone changes the system clock during the wait then the futex
  timeout will correctly expire earlier or later.  Currently that only
  happens if the clock is changed prior to the call to gettimeofday.

According to futex(2), support for FUTEX_CLOCK_REALTIME was added in the
v2.6.28 Linux kernel and FUTEX_WAIT_BITSET was added in v2.6.25.  To ensure
that the code still works correctly with earlier kernel versions, an ENOSYS
error from futex[1] results in the futex_clock_realtime_unavailable flag
being set.  This flag is used to avoid the unnecessary unsupported futex
call in the future and to fall back to the previous gettimeofday and
relative time implementation.

glibc applied an equivalent switch in pthread_cond_timedwait to use
FUTEX_CLOCK_REALTIME and FUTEX_WAIT_BITSET rather than FUTEX_WAIT for
glibc-2.10 back in 2009.  See
glibc:cbd8aeb836c8061c23a5e00419e0fb25a34abee7

The futex_clock_realtime_unavailable flag is accessed using
std::memory_order_relaxed to stop it becoming a bottleneck.  If the first
two calls to _M_futex_wait_until happen to happen simultaneously then the
only consequence is that both will try to use FUTEX_CLOCK_REALTIME, both
risk discovering that it doesn't work and, if so, both set the flag.
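
The "try the new call once, remember if it is unsupported" pattern described in this paragraph can be sketched generically. This is not the futex.cc code: `fast_path_unavailable`, `attempt` and `call_with_fallback` are hypothetical names, and the two callables stand in for the FUTEX_WAIT_BITSET call and the older gettimeofday-based path. Relaxed ordering is enough because the flag guards no other data; a racing thread at worst repeats the unsupported attempt once.

```cpp
#include <atomic>
#include <cassert>

namespace
{
  // Set once the fast path has been seen to be unsupported.
  std::atomic<bool> fast_path_unavailable{false};
}

enum class attempt { ok, unsupported };

// Returns true if the fast path handled the call, false if the
// fallback was used.
template <typename Fast, typename Slow>
bool call_with_fallback(Fast fast, Slow slow)
{
  if (!fast_path_unavailable.load(std::memory_order_relaxed))
    {
      if (fast() != attempt::unsupported)
        return true;
      // Remember the failure so later calls skip the doomed attempt.
      fast_path_unavailable.store(true, std::memory_order_relaxed);
    }
  slow();
  return false;
}
```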

[1] This is how glibc's nptl-init.c determines whether these flags are
   supported.

* libstdc++-v3/src/c++11/futex.cc: Add new constants for required
futex flags.  Add futex_clock_realtime_unavailable flag to store
result of trying to use
FUTEX_CLOCK_REALTIME. 
(__atomic_futex_unsigned_base::_M_futex_wait_until):
Try to use FUTEX_WAIT_BITSET with FUTEX_CLOCK_REALTIME and only
fall back to using gettimeofday and FUTEX_WAIT if that's not
supported.


Mike,

I've been doing some performance comparisons and this patch seems to
make quite a big difference to code that polls a future by calling
fut.wait_until(t) using any t < now() as the timeout. For example,
fut.wait_until(chrono::system_clock::time_point{}) to wait until the
UNIX epoch.

With GCC 10 (or with the if (!futex_clock_realtime_unavailable.load(...)
commented out) I see that polling takes < 100ns. With the change, it
takes 3000ns or more.

Now this is still far better than polling using fut.wait_for(0s) which
takes around 5ns due to the clock_gettime call, but I'm about to
fix that.

I'm not sure how important it is for wait_until(past) to be fast, but
the difference from 100ns to 3000ns seems significant. Do you see the
same kind of numbers? Is this just a property of the futex wait with
an absolute time?

N.B. using wait_until(system_clock::time_point::min()) or any other
time before the epoch doesn't work. The futex syscall returns EINVAL
which we don't check for. I'm about to fix that too.
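
One possible way to avoid the EINVAL case is to validate the absolute time
before making the syscall.  The helper below is a hypothetical sketch, not
the fix that was actually applied; it assumes the kernel rejects a negative
tv_sec or a tv_nsec outside [0, 999999999].

```cpp
#include <time.h>

// Hypothetical helper: returns true if the absolute time is acceptable
// as a futex timeout.  Assumption: the kernel returns EINVAL for a
// negative tv_sec or a tv_nsec outside [0, 999999999].
bool is_valid_abs_timeout(const struct timespec& ts)
{
  return ts.tv_sec >= 0 && ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L;
}
```

A caller could treat a time that fails this check because it lies before
the epoch as already expired and report a timeout without calling futex.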



libstdc++-v3/src/c++11/futex.cc | 37 ++-
1 file changed, 37 insertions(+)

diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index c9de11a..25b3e05 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -35,8 +35,16 @@

// Constants for the wait/wake futex syscall operations
const unsigned futex_wait_op = 0;
+const unsigned futex_wait_bitset_op = 9;
+const unsigned futex_clock_realtime_flag = 256;
+const unsigned futex_bitset_match_any = ~0;
const unsigned futex_wake_op = 1;

+namespace
+{
+  std::atomic<bool> futex_clock_realtime_unavailable;
+}
+
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -58,6 +66,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  }
else
  {
+   if (!futex_clock_realtime_unavailable.load(std::memory_order_relaxed))
+ {
+   struct timespec rt;
+   rt.tv_sec = __s.count();
+   rt.tv_nsec = __ns.count();
+   if (syscall (SYS_futex, __addr,
+futex_wait_bitset_op | futex_clock_realtime_flag,
+__val, &rt, nullptr, futex_bitset_match_any) == -1)
+ {
+   __glibcxx_assert(errno == EINTR || errno == EAGAIN
+   || errno == ETIMEDOUT || errno == ENOSYS);
+   if (errno == ETIMEDOUT)
+ return false;
+   if (errno == ENOSYS)
+ {
+   futex_clock_realtime_unavailable.store(true,
+   std::memory_order_relaxed);
+   // Fall through to legacy implementation if the system
+   // call is unavailable.
+ }
+   else
+ return true;
+ }
+   else
+ return true;
+ }
+
+   // We only get to he

Re: [PATCH] openmp: Retire nest-var ICV

2020-11-12 Thread Kwok Cheung Yeung

On 10/11/2020 6:01 pm, Jakub Jelinek wrote:

One thing is that max-active-levels-var in 5.0 is per-device,
but in 5.1 per-data environment.  The question is if we should implement
the problematic 5.0 way or the 5.1 one.  E.g.:
#include <omp.h>
#include <stdio.h>

int
main ()
{
   #pragma omp parallel
   {
 omp_set_nested (1);
 #pragma omp parallel num_threads(2)
 printf ("Hello, world!\n");
   }
}
which used to be valid in 4.5 (where nest-var used to be per-data
environment) is in 5.0 racy (and in 5.1 will not be racy again).
Though, as these are deprecated APIs, perhaps we can just do the 5.0 way for
now.


Since max-active-levels-var is still current in 5.1, I guess we might as well do 
it properly :-). I have now placed max-active-levels-var into gomp_task_icv. The 
definition of omp_get_nested in 5.1 refers to the active-level-var ICV which is 
currently not implemented, so the comparison is against omp_get_active_level() 
instead.
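
To make the idea concrete, here is a rough model of nesting expressed in
terms of max-active-levels-var.  It is a simplified sketch and not the
libgomp code: task_icv, set_nested, get_nested and the value of
supported_active_levels are all invented for illustration, and the exact
form of the comparison is my assumption.

```cpp
// Hypothetical, simplified model of a per-data-environment ICV block.
struct task_icv
{
  unsigned long max_active_levels_var = 1;
};

// Illustrative limit only; stands in for the implementation's maximum.
constexpr unsigned long supported_active_levels = 64;

// Enabling nesting raises the limit to the maximum; disabling it drops
// the limit back to a single active level.
void set_nested(task_icv& icv, bool enable)
{
  if (enable)
    icv.max_active_levels_var = supported_active_levels;
  else if (icv.max_active_levels_var > 1)
    icv.max_active_levels_var = 1;
}

// Nesting counts as enabled when more than one level is allowed and the
// current active level (the caller passes omp_get_active_level() here)
// has not yet reached the limit.
bool get_nested(const task_icv& icv, unsigned long active_level)
{
  return icv.max_active_levels_var > 1
         && icv.max_active_levels_var > active_level;
}
```

Because the field lives in the per-task ICV block, a change made inside one
data environment does not leak into its parent.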



--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -489,8 +489,11 @@ represent their language-specific counterparts.
  
  Nested parallel regions may be initialized at startup by the

  @env{OMP_NESTED} environment variable or at runtime using
-@code{omp_set_nested}.  If undefined, nested parallel regions are
-disabled by default.
+@code{omp_set_nested}.  Setting the maximum number of nested
+regions to above one using the @env{OMP_MAX_ACTIVE_LEVELS}
+environment variable or @code{omp_set_max_active_levels} will
+also enable nesting.  If undefined, nested parallel regions
+are disabled by default.


This doesn't really describe what env.c does.  If undefined, then
if OMP_NESTED is defined, it will be followed, and if neither is
defined, the code sets the default based on
"OMP_NUM_THREADS or OMP_PROC_BIND is set to a
comma-separated list of more than one value"
as the spec says and only is disabled otherwise.
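
The precedence described here can be sketched as below.  This is a
hypothetical model, not what env.c contains: the function names and the
nullable-pointer interface are invented, and using supported_levels as the
"enabled" value is an assumption.

```cpp
#include <cstring>

// True if the value is a comma-separated list with more than one item.
// A null pointer means the environment variable is not set.
bool is_multi_value_list(const char* value)
{
  return value != nullptr && std::strchr(value, ',') != nullptr;
}

// Hypothetical model of choosing the initial max-active-levels-var.
// Null pointers mean "not set in the environment".
unsigned long initial_max_active_levels(
    const unsigned long* max_active_levels,  // parsed OMP_MAX_ACTIVE_LEVELS
    const bool* nested,                      // parsed OMP_NESTED
    const char* num_threads,                 // raw OMP_NUM_THREADS
    const char* proc_bind,                   // raw OMP_PROC_BIND
    unsigned long supported_levels)          // assumed "enabled" value
{
  if (max_active_levels != nullptr)
    return *max_active_levels;               // explicit setting wins
  if (nested != nullptr)
    return *nested ? supported_levels : 1;   // then OMP_NESTED is followed
  if (is_multi_value_list(num_threads) || is_multi_value_list(proc_bind))
    return supported_levels;                 // a list implies nesting
  return 1;                                  // otherwise disabled
}
```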



Similarly.



Again.


I have changed these to more accurately describe what is happening. The 
descriptions are starting to get rather verbose though...



--- a/libgomp/testsuite/libgomp.c/target-5.c
+++ b/libgomp/testsuite/libgomp.c/target-5.c


Why does this testcase need updates?
It doesn't seem to use omp_[sg]et_max_active_levels and so I don't see
why it couldn't use omp_[sg]et_nested.



The problem is with max-active-levels-var (which nesting is now in terms of) 
being per device rather than per data environment. The test expects the nested 
setting to go back to its previous value after leaving a DE that sets it to 
something else.


Anyway, with max-active-levels-var now being per data environment, that is all 
moot now, and the test can remain unchanged.


Is this version okay for trunk? Bootstrapped on x86_64 and libgomp tested with 
no regressions with nvptx offloading.


Thanks

Kwok
commit bcaa3dbf1f130e3a2c7e6033a10be3f61221a951
Author: Kwok Cheung Yeung 
Date:   Thu Nov 12 13:42:28 2020 -0800

openmp: Retire nest-var ICV for OpenMP 5.1

This removes the nest-var ICV, expressing nesting in terms of the
max-active-levels-var ICV instead.  The max-active-levels-var ICV
is now per data environment rather than per device.

2020-11-12  Kwok Cheung Yeung  

libgomp/
* env.c (gomp_global_icv): Remove nest_var field.  Add
max_active_levels_var field.
(gomp_max_active_levels_var): Remove.
(parse_boolean): Return true on success.
(handle_omp_display_env): Express OMP_NESTED in terms of
max_active_levels_var.
(initialize_env): Set max_active_levels_var from
OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
OMP_PROC_BIND.
* icv.c (omp_set_nested): Express in terms of
max_active_levels_var.
(omp_get_nested): Likewise.
(omp_set_max_active_levels): Use max_active_levels_var field instead
of gomp_max_active_levels_var.
(omp_get_max_active_levels): Likewise.
* libgomp.h (struct gomp_task_icv): Remove nest_var field.  Add
max_active_levels_var field.
(gomp_max_active_levels_var): Delete.
* libgomp.texi (omp_get_nested): Update documentation.
(omp_set_nested): Likewise.
(OMP_MAX_ACTIVE_LEVELS): Likewise.
(OMP_NESTED): Likewise.
(OMP_NUM_THREADS): Likewise.
(OMP_PROC_BIND): Likewise.
* parallel.c (gomp_resolve_num_threads): Replace reference
to nest_var with max_active_levels_var.  Use max_active_levels_var
field instead of gomp_max_active_levels_var.

diff --git a/libgomp/env.c b/libgomp/env.c
index ab22525..b8ed1bd 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -68,12 +68,11 @@ struct gomp_task_icv gomp_global_icv = {
   .run_sched_chunk_size = 1,
   .default_device_var = 0,
   .dyn_var = false,
-  .nest_var = false,
+  .max_active_levels_var = 1,
   .bind_var = omp_proc_bind_false,
   .target_data = NULL
 };
 
-unsigned long gomp_max_active_levels_var = gomp_supported_active_levels;
 bool gomp_cancel_var

[PATCH] Use SHF_GNU_RETAIN to preserve symbol definitions

2020-11-12 Thread H.J. Lu via Gcc-patches
In assembly code, the section flag 'R' sets the SHF_GNU_RETAIN flag to
indicate that the section must be preserved by the linker.

Add SECTION_RETAIN to indicate a section should be retained by the linker
and set SECTION_RETAIN on section for the preserved symbol if assembler
supports SHF_GNU_RETAIN.  All retained symbols are placed in separate
sections with

.section .data.rel.local.preserved_symbol,"awR"
preserved_symbol:
...
.section .data.rel.local,"aw"
not_preserved_symbol:
...

to avoid

.section .data.rel.local,"awR"
preserved_symbol:
...
not_preserved_symbol:
...

which places not_preserved_symbol definition in the SHF_GNU_RETAIN
section.

gcc/

2020-11-XX  H.J. Lu  

* configure.ac (HAVE_GAS_SHF_GNU_RETAIN): New.  Define 1 if
the assembler supports marking sections with SHF_GNU_RETAIN flag.
* output.h (SECTION_RETAIN): New.  Defined as 0x400.
(SECTION_MACH_DEP): Changed from 0x400 to 0x800.
(default_unique_section): Add a bool argument.
* varasm.c (get_section): Set SECTION_RETAIN for the preserved
symbol with HAVE_GAS_SHF_GNU_RETAIN.
(resolve_unique_section): Use named section for the preserved
symbol if assembler supports SHF_GNU_RETAIN.
(get_variable_section): Handle the preserved common symbol with
HAVE_GAS_SHF_GNU_RETAIN.
(default_elf_asm_named_section): Require the full declaration and
use the 'R' flag for SECTION_RETAIN.
* config.in: Regenerated.
* configure: Likewise.

gcc/testsuite/

2020-11-XX  H.J. Lu  
Jozef Lawrynowicz  

* c-c++-common/attr-used.c: Check the 'R' flag.
* c-c++-common/attr-used-2.c: Likewise.
* c-c++-common/attr-used-3.c: New test.
* c-c++-common/attr-used-4.c: Likewise.
* gcc.c-torture/compile/attr-used-retain-1.c: Likewise.
* gcc.c-torture/compile/attr-used-retain-2.c: Likewise.
* lib/target-supports.exp
(check_effective_target_R_flag_in_section): New proc.
---
 gcc/config.in |  7 +++
 gcc/configure | 51 +++
 gcc/configure.ac  | 20 
 gcc/output.h  |  6 ++-
 gcc/testsuite/c-c++-common/attr-used-2.c  |  1 +
 gcc/testsuite/c-c++-common/attr-used-3.c  |  7 +++
 gcc/testsuite/c-c++-common/attr-used-4.c  |  7 +++
 gcc/testsuite/c-c++-common/attr-used.c|  1 +
 .../compile/attr-used-retain-1.c  | 32 
 .../compile/attr-used-retain-2.c  | 15 ++
 gcc/testsuite/lib/target-supports.exp | 40 +++
 gcc/varasm.c  | 17 +--
 12 files changed, 200 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-3.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-4.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-2.c

diff --git a/gcc/config.in b/gcc/config.in
index b7c3107bfe3..23ae2f9bc1b 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1352,6 +1352,13 @@
 #endif
 
 
+/* Define 0/1 if your assembler supports marking sections with SHF_GNU_RETAIN
+   flag. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GAS_SHF_GNU_RETAIN
+#endif
+
+
 /* Define 0/1 if your assembler supports marking sections with SHF_MERGE flag.
*/
 #ifndef USED_FOR_TARGET
diff --git a/gcc/configure b/gcc/configure
index dbda4415a17..a925a6e5efb 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -24272,6 +24272,57 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF
 
 
+# Test if the assembler supports the section flag 'R' for specifying
+# section with SHF_GNU_RETAIN.
+case "${target}" in
+  # Solaris may use GNU assembler with Solairs ld.  Even if GNU
+  # assembler supports the section flag 'R', it doesn't mean that
+  # Solairs ld supports it.
+  *-*-solaris2*)
+gcc_cv_as_shf_gnu_retain=no
+;;
+  *)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for section 
'R' flag" >&5
+$as_echo_n "checking assembler for section 'R' flag... " >&6; }
+if ${gcc_cv_as_shf_gnu_retain+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_shf_gnu_retain=no
+if test $in_tree_gas = yes; then
+if test $in_tree_gas_is_elf = yes \
+  && test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 36 \) \* 1000 + 0`
+  then gcc_cv_as_shf_gnu_retain=yes
+fi
+  elif test x$gcc_cv_as != x; then
+$as_echo '.section .foo,"awR",%progbits
+.byte 0' > conftest.s
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags --fatal-warnings -o conftest.o 
conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+then
+   gcc_cv_as_shf_gnu_retain=yes
+el

[committed] jit: add support for inline asm [PR87291]

2020-11-12 Thread David Malcolm via Gcc-patches
This patch adds various entrypoints to libgccjit for directly embedding
asm statements into a compile, analogous to inline asm in the C frontend:
  gcc_jit_block_add_extended_asm
  gcc_jit_block_end_with_extended_asm_goto
  gcc_jit_extended_asm_as_object
  gcc_jit_extended_asm_set_volatile_flag
  gcc_jit_extended_asm_set_inline_flag
  gcc_jit_extended_asm_add_output_operand
  gcc_jit_extended_asm_add_input_operand
  gcc_jit_extended_asm_add_clobber
  gcc_jit_context_add_top_level_asm

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as 421d0d0f54294a7bf2872b3b2ac521ce0fa9869e.

gcc/jit/ChangeLog:
PR jit/87291
* docs/cp/topics/asm.rst: New file.
* docs/cp/topics/index.rst (Topic Reference): Add it.
* docs/topics/asm.rst: New file.
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_15): New.
* docs/topics/functions.rst (Statements): Add link to extended
asm.
* docs/topics/index.rst (Topic Reference): Add asm.rst.
* docs/topics/objects.rst: Add gcc_jit_extended_asm to ASCII art.
* jit-common.h (gcc::jit::recording::extended_asm): New forward
decl.
(gcc::jit::recording::top_level_asm): Likewise.
* jit-playback.c: Include "stmt.h".
(build_string): New.
(gcc::jit::playback::context::new_string_literal): Disambiguate
build_string call.
(gcc::jit::playback::context::add_top_level_asm): New.
(build_operand_chain): New.
(build_clobbers): New.
(build_goto_operands): New.
(gcc::jit::playback::block::add_extended_asm): New.
* jit-playback.h (gcc::jit::playback::context::add_top_level_asm):
New decl.
(struct gcc::jit::playback::asm_operand): New struct.
(gcc::jit::playback::block::add_extended_asm): New decl.
* jit-recording.c (gcc::jit::recording::context::dump_to_file):
Dump top-level asms.
(gcc::jit::recording::context::add_top_level_asm): New.
(gcc::jit::recording::block::add_extended_asm): New.
(gcc::jit::recording::block::end_with_extended_asm_goto): New.
(gcc::jit::recording::asm_operand::asm_operand): New.
(gcc::jit::recording::asm_operand::print): New.
(gcc::jit::recording::asm_operand::make_debug_string): New.
(gcc::jit::recording::output_asm_operand::write_reproducer): New.
(gcc::jit::recording::output_asm_operand::print): New.
(gcc::jit::recording::input_asm_operand::write_reproducer): New.
(gcc::jit::recording::input_asm_operand::print): New.
(gcc::jit::recording::extended_asm::add_output_operand): New.
(gcc::jit::recording::extended_asm::add_input_operand): New.
(gcc::jit::recording::extended_asm::add_clobber): New.
(gcc::jit::recording::extended_asm::replay_into): New.
(gcc::jit::recording::extended_asm::make_debug_string): New.
(gcc::jit::recording::extended_asm::write_flags): New.
(gcc::jit::recording::extended_asm::write_clobbers): New.
(gcc::jit::recording::extended_asm_simple::write_reproducer): New.
(gcc::jit::recording::extended_asm::maybe_populate_playback_blocks):
New.
(gcc::jit::recording::extended_asm_goto::extended_asm_goto): New.
(gcc::jit::recording::extended_asm_goto::replay_into): New.
(gcc::jit::recording::extended_asm_goto::write_reproducer): New.
(gcc::jit::recording::extended_asm_goto::get_successor_blocks):
New.
(gcc::jit::recording::extended_asm_goto::maybe_print_gotos): New.

(gcc::jit::recording::extended_asm_goto::maybe_populate_playback_blocks):
New.
(gcc::jit::recording::top_level_asm::top_level_asm): New.
(gcc::jit::recording::top_level_asm::replay_into): New.
(gcc::jit::recording::top_level_asm::make_debug_string): New.
(gcc::jit::recording::top_level_asm::write_to_dump): New.
(gcc::jit::recording::top_level_asm::write_reproducer): New.
* jit-recording.h
(gcc::jit::recording::context::add_top_level_asm): New decl.
(gcc::jit::recording::context::m_top_level_asms): New field.
(gcc::jit::recording::block::add_extended_asm): New decl.
(gcc::jit::recording::block::end_with_extended_asm_goto): New
decl.
(gcc::jit::recording::asm_operand): New class.
(gcc::jit::recording::output_asm_operand): New class.
(gcc::jit::recording::input_asm_operand): New class.
(gcc::jit::recording::extended_asm): New class.
(gcc::jit::recording::extended_asm_simple): New class.
(gcc::jit::recording::extended_asm_goto): New class.
(gcc::jit::recording::top_level_asm): New class.
* libgccjit++.h (gccjit::extended_asm): New forward decl.
(gccjit::context::add_top_level_asm): New.
(gccjit::block::add_extended_asm): New.
(gccjit::block::end_with_extended_asm_goto): New.
(

[committed] jit: fix string escaping

2020-11-12 Thread David Malcolm via Gcc-patches
This patch fixes a bug in recording::string::make_debug_string in which
'\t' and '\n' were "escaped" by simply prepending a '\', thus emitting
'\' then '\n', rather than '\' then 'n'.  It also removes a hack that
determined if a string is to be escaped by checking for a leading '"',
by instead adding a flag.
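
For illustration, the corrected escaping rule can be written as a
standalone function.  This is a sketch using std::string rather than the
buffer and APPEND macro in jit-recording.c, and escape_for_debug is a
made-up name.

```cpp
#include <string>

// Wrap text in double quotes, escaping tab, newline, backslash and quote.
// A tab becomes the two characters '\' 't', and a newline becomes
// '\' 'n', rather than '\' followed by the raw control character.
std::string escape_for_debug(const std::string& text)
{
  std::string out = "\"";  // opening quote
  for (char ch : text)
    {
      switch (ch)
        {
        case '\t':
          out += "\\t";
          break;
        case '\n':
          out += "\\n";
          break;
        case '\\':
        case '"':
          out += '\\';
          out += ch;
          break;
        default:
          out += ch;
          break;
        }
    }
  out += '"';  // closing quote
  return out;
}
```

The result is itself a quoted string, which is why the patch tracks an
explicit "escaped" flag instead of guessing from the leading '"'.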

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as fec573408310139e1ffc42741fbe46b4f2947592.

gcc/jit/ChangeLog:
* jit-recording.c (recording::context::new_string): Add "escaped"
param and use it when creating the new recording::string instance.
(recording::string::string): Add "escaped" param and use it to
initialize m_escaped.
(recording::string::make_debug_string): Replace check that first
char is double-quote with use of m_escaped.  Fix escaping of
'\t' and '\n'.  Set "escaped" on the result.
* jit-recording.h (recording::context::new_string): Add "escaped"
param.
(recording::string::string): Add "escaped" param.
(recording::string::m_escaped): New field.

gcc/testsuite/ChangeLog:
* jit.dg/test-debug-strings.c (create_code): Add tests of
string literal escaping.
---
 gcc/jit/jit-recording.c   | 39 ---
 gcc/jit/jit-recording.h   |  9 --
 gcc/testsuite/jit.dg/test-debug-strings.c | 20 
 3 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c
index 3cbeba0f371..3a84c1fc5c0 100644
--- a/gcc/jit/jit-recording.c
+++ b/gcc/jit/jit-recording.c
@@ -724,12 +724,12 @@ recording::context::disassociate_from_playback ()
This creates a fresh copy of the given 0-terminated buffer.  */
 
 recording::string *
-recording::context::new_string (const char *text)
+recording::context::new_string (const char *text, bool escaped)
 {
   if (!text)
 return NULL;
 
-  recording::string *result = new string (this, text);
+  recording::string *result = new string (this, text, escaped);
   record (result);
   return result;
 }
@@ -1954,8 +1954,9 @@ recording::memento::write_to_dump (dump &d)
 /* Constructor for gcc::jit::recording::string::string, allocating a
copy of the given text using new char[].  */
 
-recording::string::string (context *ctxt, const char *text)
-  : memento (ctxt)
+recording::string::string (context *ctxt, const char *text, bool escaped)
+: memento (ctxt),
+  m_escaped (escaped)
 {
   m_len = strlen (text);
   m_buffer = new char[m_len + 1];
@@ -2005,9 +2006,9 @@ recording::string::from_printf (context *ctxt, const char 
*fmt, ...)
 recording::string *
 recording::string::make_debug_string ()
 {
-  /* Hack to avoid infinite recursion into strings when logging all
- mementos: don't re-escape strings:  */
-  if (m_buffer[0] == '"')
+  /* Avoid infinite recursion into strings when logging all mementos:
+ don't re-escape strings:  */
+  if (m_escaped)
 return this;
 
   /* Wrap in quotes and do escaping etc */
@@ -2024,15 +2025,31 @@ recording::string::make_debug_string ()
   for (size_t i = 0; i < m_len ; i++)
 {
   char ch = m_buffer[i];
-  if (ch == '\t' || ch == '\n' || ch == '\\' || ch == '"')
-   APPEND('\\');
-  APPEND(ch);
+  switch (ch)
+   {
+   default:
+ APPEND(ch);
+ break;
+   case '\t':
+ APPEND('\\');
+ APPEND('t');
+ break;
+   case '\n':
+ APPEND('\\');
+ APPEND('n');
+ break;
+   case '\\':
+   case '"':
+ APPEND('\\');
+ APPEND(ch);
+ break;
+   }
 }
   APPEND('"'); /* closing quote */
 #undef APPEND
   tmp[len] = '\0'; /* nil termintator */
 
-  string *result = m_ctxt->new_string (tmp);
+  string *result = m_ctxt->new_string (tmp, true);
 
   delete[] tmp;
   return result;
diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index 30e37aff387..9a43a7bf33a 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -74,7 +74,7 @@ public:
   void disassociate_from_playback ();
 
   string *
-  new_string (const char *text);
+  new_string (const char *text, bool escaped = false);
 
   location *
   new_location (const char *filename,
@@ -414,7 +414,7 @@ private:
 class string : public memento
 {
 public:
-  string (context *ctxt, const char *text);
+  string (context *ctxt, const char *text, bool escaped);
   ~string ();
 
   const char *c_str () { return m_buffer; }
@@ -431,6 +431,11 @@ private:
 private:
   size_t m_len;
   char *m_buffer;
+
+  /* Flag to track if this string is the result of string::make_debug_string,
+ to avoid infinite recursion when logging all mementos: don't re-escape
+ such strings.  */
+  bool m_escaped;
 };
 
 class location : public memento
diff --git a/gcc/testsuite/jit.dg/test-debug-strings.c 
b/gcc/testsuite/jit.dg/test-debug-strings.c
index e515a176257..03ef3370d94 100644
--- a/gcc/testsuite/jit.dg/test-debu

[committed] libgccjit.h: fix typo in comment

2020-11-12 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as 8948a5715b00fe36d20c03b6c4c4397b74cc6282.

gcc/jit/ChangeLog:
* libgccjit.h: Fix typo in comment.
---
 gcc/jit/libgccjit.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
index 7134841bb07..7fbaa9f3162 100644
--- a/gcc/jit/libgccjit.h
+++ b/gcc/jit/libgccjit.h
@@ -1504,7 +1504,7 @@ gcc_jit_context_new_rvalue_from_vector (gcc_jit_context 
*ctxt,
 
 #define LIBGCCJIT_HAVE_gcc_jit_version
 
-/* Functions to retrive libgccjit version.
+/* Functions to retrieve libgccjit version.
Analogous to __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__ in C code.
 
These API entrypoints were added in LIBGCCJIT_ABI_13; you can test for their
-- 
2.26.2



Re: [PATCH] c++: Don't form a templated TARGET_EXPR in finish_compound_literal

2020-11-12 Thread Jason Merrill via Gcc-patches

On 11/12/20 1:27 PM, Patrick Palka wrote:

The atom_cache in normalize_atom relies on the assumption that two
equivalent (templated) trees (in the sense of cp_tree_equal) must use
the same template parameters (according to find_template_parameters).

This assumption unfortunately doesn't always hold for TARGET_EXPRs,
because cp_tree_equal ignores an artificial target of a TARGET_EXPR, but
find_template_parameters walks this target (and its DECL_CONTEXT).

Hence two TARGET_EXPRs built by force_target_expr with the same
initializer but under different settings of current_function_decl may
compare equal according to cp_tree_equal, but find_template_parameters
returns a different set of template parameters for them.  This breaks
the below testcase because during normalization we build two such
TARGET_EXPRs (one under current_function_decl=f and another under =g),
and then use the same ATOMIC_CONSTR for the two corresponding atoms,
leading to a crash during satisfaction of g's associated constraints.

This patch works around this assumption violation by removing the source
of these templated TARGET_EXPRs.  The relevant call to get_target_expr was
added in r9-6043, but it seems it's no longer necessary (according to
https://gcc.gnu.org/pipermail/gcc-patches/2019-February/517323.html, the
call was added in order to avoid regressing on initlist109.C at the time).

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.  I wonder what else asserting !processing_template_decl in 
build_target_expr would find...



gcc/cp/ChangeLog:

* semantics.c (finish_compound_literal): Don't wrap the original
compound literal in a TARGET_EXPR when inside a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-decltype3.C: New test.
---
  gcc/cp/semantics.c  |  7 +--
  gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C | 15 +++
  2 files changed, 16 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 33d715edaec..172286922e7 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -3006,12 +3006,7 @@ finish_compound_literal (tree type, tree 
compound_literal,
  
/* If we're in a template, return the original compound literal.  */

if (orig_cl)
-{
-  if (!VECTOR_TYPE_P (type))
-   return get_target_expr_sfinae (orig_cl, complain);
-  else
-   return orig_cl;
-}
+return orig_cl;
  
if (TREE_CODE (compound_literal) == CONSTRUCTOR)

  {
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
new file mode 100644
index 000..837855ce8ac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
@@ -0,0 +1,15 @@
+// { dg-do compile { target c++20 } }
+
+template  concept C = requires(T t) { t; };
+
+template  using A = decltype((T{}, int{}));
+
+template  concept D = C>;
+
+template  void f() requires D;
+template  void g() requires D;
+
+void h() {
+  f();
+  g();
+}





Re: [PATCH] PR target/97682 - Fix to reuse t1 register between call address and epilogue.

2020-11-12 Thread Jim Wilson
On Mon, Nov 9, 2020 at 11:15 PM Monk Chiang  wrote:

> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 172c7ca7c98..3bd1993c4c9 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -342,9 +342,13 @@ extern const char *riscv_default_mtune (int argc,
> const char **argv);
> The epilogue temporary mustn't conflict with the return registers,
> the frame pointer, the EH stack adjustment, or the EH data registers.
> */
>
> -#define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST + 1)
> +#define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST)
>  #define RISCV_PROLOGUE_TEMP(MODE) gen_rtx_REG (MODE,
> RISCV_PROLOGUE_TEMP_REGNUM)
>
> +#define RISCV_CALL_ADDRESS_TEMP_REGNUM (GP_TEMP_FIRST + 1)
> +#define RISCV_CALL_ADDRESS_TEMP(MODE) \
> +  gen_rtx_REG (MODE, RISCV_CALL_ADDRESS_TEMP_REGNUM)
>

This looks generally OK, however there is a minor problem that we have code
in riscv_compute_frame_info to save t1 in an interrupt handler register
with a large stack frame, as we know the prologue code will clobber t1 in
this case.  However, with this patch, the prologue now clobbers t0
instead.  So riscv_compute_frame_info needs to be fixed.  I'd suggest
changing the T1_REGNUM to RISCV_PROLOGUE_TEMP_REGNUM to prevent this from
happening again, that is probably my fault.  And the interrupt_save_t1
variable should be renamed, maybe to interrupt_save_prologue_temp.

You can see the problem with gcc/testsuite/gcc.target/riscv/interrupt-3.c
if you compile with -O0 and we get
foo:
addi sp,sp,-32
sw t1,28(sp)
sw s0,24(sp)
addi s0,sp,32
li t0,-4096
addi t0,t0,16
add sp,sp,t0
so we are saving t1 and then clobbering t0 with your patch.

Otherwise this looks good.

Jim


Re: [PATCH] RISC-V: Enable ifunc if it was supported in the binutils for linux toolchain.

2020-11-12 Thread Jim Wilson
On Tue, Nov 10, 2020 at 7:33 PM Nelson Chu  wrote:

> gcc/
> * configure: Regenerated.
> * configure.ac: If ifunc was supported in the binutils for
> linux toolchain, then set enable_gnu_indirect_function to yes.
>

Looks good.  I committed and pushed it.

I see some extra ifunc related testsuite failures, but that is because we
don't have the glibc ifunc patches upstream yet.  It will be important to
get those done next.

Jim


Re: PowerPC: Use __float128 instead of __ieee128 in tests.

2020-11-12 Thread Michael Meissner via Gcc-patches
On Thu, Nov 12, 2020 at 01:26:32PM -0600, Segher Boessenkool wrote:
> Hi,
> 
> On Thu, Oct 22, 2020 at 06:12:31PM -0400, Michael Meissner wrote:
> > Two of the tests used the __ieee128 keyword instead of __float128.  This
> > patch changes those cases to use the official keyword.
> 
> What is "official" about that?
> 
> Why make this change at all?  __ieee128 should work as well!  Did you
> see failures without this patch?  Those need fixing, then.

We document '__float128'.  We don't document '__ieee128'.  As I said, using
'__ieee128' internally was due to some issues in the GCC 7 time frame,
particularly before we had the glibc changes.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: PowerPC: Add __float128 conversions to/from Decimal

2020-11-12 Thread Michael Meissner via Gcc-patches
On Thu, Oct 29, 2020 at 10:05:38PM +, Joseph Myers wrote:
> On Thu, 29 Oct 2020, Segher Boessenkool wrote:
> 
> > > Doing these conversions accurately is nontrivial.  Converting via strings 
> > > is the simple approach (i.e. the one that moves the complexity somewhere 
> > > else).  There are more complicated but more efficient approaches that can 
> > > achieve correct conversions with smaller bounds on resource usage (and 
> > > there are various papers published in this area), but those involve a lot 
> > > more code (and precomputed data, with a speed/space trade-off in how much 
> > > you precompute; the BID code in libgcc has several MB of precomputed data 
> > > for that purpose).
> > 
> > Does the printf code in libgcc handle things correctly for IEEE QP float
> > as long double, do you know?
> 
> As far as I know, the code in libgcc for conversions *from* decimal *to* 
> binary (so the direction that uses strtof128 as opposed to the one using 
> strfrom128, in the binary128 case) works correctly, if the underlying libc 
> has accurate string/numeric conversion operations.
> 
> Binary to decimal is another matter, even for cases such as float to 
> _Decimal64.  I've just filed bug 97635 for that.
> 
> Also note that if you want to use printf as opposed to strfromf128 for 
> IEEE binary128 you'll need to use __printfieee128 (the version that 
> expects long double to be IEEE binary128) which was introduced in glibc 
> 2.32, so that doesn't help with the glibc version dependencies.

My latest patches now switch to using glibc 2.32 and __sprintfieee128.
If we don't have glibc 2.32, it just calls abort, so we don't get linker
errors.  I hope to submit it tonight or tomorrow night.

> When I investigated and reported several bugs in the conversion operations 
> in libdfp, I noted (e.g. https://github.com/libdfp/libdfp/issues/29 ) that 
> the libgcc versions were working correctly for those tests (and filed and 
> subsequently fixed one glibc strtod bug, missing inexact exceptions, that 
> I'd noticed while looking at such issues in libdfp).  But the specific 
> case I tested for badly rounded conversions was the case of conversions 
> from decimal to binary, not the case of conversions from binary to 
> decimal, which, as noted above, turn out to be buggy in libgcc.
> 
> Lots of bugs have been fixed in the glibc conversion code over the years 
> (more on the strtod side than in the code shared by printf and strfrom 
> functions).  That code uses multiple-precision operations from GMP, which 
> avoids some complications but introduces others (it also needs to e.g. 
> deal with locale issues, which are irrelevant for libgcc conversions).

Using the sprintf method, I see an error in

c-c++-common/dfp/convert-bfp-11.c

that I didn't see with the method used in the patches with strtof128 and
strfromf128 directly.  I need to track down exactly what the error is.

All of the other dfp conversion tests work fine.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH] C-Family, Objective-C : Implement Objective-C nullability Part 1 [PR90707].

2020-11-12 Thread Iain Sandoe

Hi,

The PR notes that our inability to parse these keywords in GNU Objective-C
is one of the contributing factors to being unable to use some important
system headers (at least, on Darwin platforms).

tested on x86_64-darwin and x86_64-linux-gnu,
OK for the C-family changes?
thanks
Iain

— commit log

Part 1 of the implementation covers property nullability attributes
and includes the changes to common code. Follow-on changes will be needed
to cover Objective-C method definitions, but those are expected to be
local to the Objective-C front end.

The basis of the implementation is to translate the Objective-C-specific
keywords into an attribute (objc_nullability) which has the required
states to carry the attribute markup.

We introduce the keywords, and these are parsed and validated in the same
manner as other property attributes.  The resulting value is attached to
the property as an objc_nullability attribute.

gcc/c-family/ChangeLog:

PR objc/90707
* c-common.c (c_common_reswords): null_unspecified, nullable,
nonnull, null_resettable: New keywords.
* c-common.h (enum rid): RID_NULL_UNSPECIFIED, RID_NULLABLE,
RID_NONNULL, RID_NULL_RESETTABLE: New.
(OBJC_IS_PATTR_KEYWORD): Include nullability keywords in the
ranges accepted for property attributes.
* c-attribs.c (handle_objc_nullability_attribute): New.
* c-objc.h (enum objc_property_attribute_group): Add
OBJC_PROPATTR_GROUP_NULLABLE.
(enum objc_property_attribute_kind): Add
OBJC_PROPERTY_ATTR_NULL_UNSPECIFIED, OBJC_PROPERTY_ATTR_NULLABLE,
OBJC_PROPERTY_ATTR_NONNULL, OBJC_PROPERTY_ATTR_NULL_RESETTABLE.

gcc/objc/ChangeLog:

PR objc/90707
* objc-act.c (objc_prop_attr_kind_for_rid): Handle nullability.
(objc_add_property_declaration): Handle nullability attributes.
Check that these are applicable to the property type.
* objc-act.h (enum objc_property_nullability): New.

gcc/testsuite/ChangeLog:

PR objc/90707
* obj-c++.dg/property/at-property-4.mm: Add basic nullability
tests.
* objc.dg/property/at-property-4.m: Likewise.
* obj-c++.dg/attributes/nullability-00.mm: New test.
* obj-c++.dg/property/nullability-00.mm: New test.
* objc.dg/attributes/nullability-00.m: New test.
* objc.dg/property/nullability-00.m: New test.

gcc/ChangeLog:

PR objc/90707
* doc/extend.texi: Document the objc_nullability attribute.
---
 gcc/c-family/c-attribs.c  | 49 ++
 gcc/c-family/c-common.c   |  6 +++
 gcc/c-family/c-common.h   |  7 ++-
 gcc/c-family/c-objc.h |  5 ++
 gcc/doc/extend.texi   | 27 ++
 gcc/objc/objc-act.c   | 51 ++-
 gcc/objc/objc-act.h   | 10 
 .../obj-c++.dg/attributes/nullability-00.mm   | 20 
 .../obj-c++.dg/property/at-property-4.mm  | 20 +++-
 .../obj-c++.dg/property/nullability-00.mm | 21 
 .../objc.dg/attributes/nullability-00.m   | 20 
 .../objc.dg/property/at-property-4.m  | 18 +++
 .../objc.dg/property/nullability-00.m | 21 
 13 files changed, 272 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/obj-c++.dg/attributes/nullability-00.mm
 create mode 100644 gcc/testsuite/obj-c++.dg/property/nullability-00.mm
 create mode 100644 gcc/testsuite/objc.dg/attributes/nullability-00.m
 create mode 100644 gcc/testsuite/objc.dg/property/nullability-00.m

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 6718fff6efb..9c62508651c 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -161,6 +161,7 @@ static tree handle_patchable_function_entry_attribute (tree *, tree, tree,

 static tree handle_copy_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nsobject_attribute (tree *, tree, tree, int, bool *);
 static tree handle_objc_root_class_attribute (tree *, tree, tree, int, bool *);
+static tree handle_objc_nullability_attribute (tree *, tree, tree, int, bool *);


 /* Helper to define attribute exclusions.  */
 #define ATTR_EXCL(name, function, type, variable)  \
@@ -520,6 +521,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_nsobject_attribute, NULL },
   { "objc_root_class", 0, 0, true, false, false, false,
  handle_objc_root_class_attribute, NULL },
+  { "objc_nullability",1, 1, true, false, false, false,
+ handle_objc_nullability_attribute, NULL },
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
 };

@@ -5251,6 +5254,52 @@ handle_objc_root_class_attribute (tree */*node*/, tree name, tree /*args*/,

   return NULL_TREE;
 }

+/* Handle an "objc_nu

c: C2x __has_c_attribute

2020-11-12 Thread Joseph Myers
C2x adds the __has_c_attribute preprocessor operator, similar to C++
__has_cpp_attribute.

GCC implements __has_cpp_attribute as exactly equivalent to
__has_attribute.  (The documentation says they differ regarding the
values returned for standard attributes, but that's actually only a
matter of the particular nonzero value returned not being specified in
the documentation for __has_attribute; the implementation makes no
distinction between the two.)

I don't think having them exactly equivalent is actually correct,
either for __has_cpp_attribute or for __has_c_attribute.
Specifically, I think it is only correct for __has_cpp_attribute or
__has_c_attribute to return nonzero if the given attribute is
supported, with the particular pp-tokens passed to __has_cpp_attribute
or __has_c_attribute, with [[]] syntax, not if it's only accepted in
__attribute__ or with gnu:: added in [[]].  For example, they should
return nonzero for gnu::packed, but zero for plain packed, because
[[gnu::packed]] is accepted but [[packed]] is ignored as not a
standard attribute.
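
As a rough illustration of the distinction drawn above (a sketch only, not
part of the patch): the fallback macro and the two function names below are
made up for this example, and which nonzero value a compiler reports for a
standard attribute depends on the compiler and its version.

```c
/* Fallback for compilers that predate __has_c_attribute: treat every
   query as "not supported".  (Hypothetical shim, not from the patch.)  */
#ifndef __has_c_attribute
# define __has_c_attribute(x) 0
#endif

/* Returns 1 if the compiler reports [[deprecated]] as a standard
   attribute, 0 otherwise.  */
int
std_deprecated_supported (void)
{
#if __has_c_attribute(deprecated)
  return 1;
#else
  return 0;
#endif
}

/* Plain "packed" is not a standard attribute, so with the semantics
   described above this query is expected to yield zero even though
   __attribute__((packed)) is accepted.  */
int
std_packed_supported (void)
{
#if __has_c_attribute(packed)
  return 1;
#else
  return 0;
#endif
}
```

Under the behaviour described in this message, the scoped spelling
gnu::packed would be the one to query when [[gnu::packed]] is what the code
intends to write.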

This patch implements that for __has_c_attribute, leaving any changes
to __has_cpp_attribute for the C++ maintainers.  A new
BT_HAS_STD_ATTRIBUTE is added for __has_c_attribute (which I think,
based on the above, would actually be correct to use for
__has_cpp_attribute as well).  The code in c_common_has_attribute that
deals with scopes has its C++ conditional removed; instead, whether
the language is C or C++ is used only to determine the numeric values
returned for standard attributes (and which standard attributes are
handled there at all).  A new argument is passed to
c_common_has_attribute to distinguish BT_HAS_STD_ATTRIBUTE from
BT_HAS_ATTRIBUTE, and that argument is used to stop attributes with no
scope specified from being accepted with __has_c_attribute unless they
are one of the known standard attributes and so handled specially.

Although the standard specifies constants ending with 'L' as the values
for the standard attributes, there is no correctness issue with the
lack of code in GCC to add that 'L' to the expansion:
__has_c_attribute and __has_cpp_attribute are expanded in #if after
other macro expansion has occurred, with no semantics being specified
if they occur outside #if, so there is no way for a conforming program
to inspect the exact text of the expansion of those macros, only to
use the resulting pp-number in a #if expression, where long and int
have the same set of values.
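
The point about the 'L' suffix can be checked with a tiny sketch (the enum
and function names are invented for this illustration): inside #if, a
suffixed and an unsuffixed constant of the same value compare equal, so a
program cannot tell which form the expansion used.

```c
/* In a #if expression the suffix does not change the value, so both
   spellings of the constant select the same branch.  */
#if 201904L == 201904
enum { SUFFIX_CHANGES_IF_RESULT = 0 };
#else
enum { SUFFIX_CHANGES_IF_RESULT = 1 };
#endif

/* Returns 0 when the 'L' suffix made no difference in #if.  */
int
suffix_changes_if_result (void)
{
  return SUFFIX_CHANGES_IF_RESULT;
}
```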

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/
2020-11-12  Joseph Myers  

* doc/cpp.texi (__has_attribute): Document when scopes are allowed
for C.
(__has_c_attribute): New.

gcc/c-family/
2020-11-12  Joseph Myers  

* c-lex.c (c_common_has_attribute): Take argument std_syntax.
Allow scope for C.  Handle standard attributes for C.  Do not
accept unscoped attributes if std_syntax and not handled as
standard attributes.
* c-common.h (c_common_has_attribute): Update prototype.

gcc/testsuite/
2020-11-12  Joseph Myers  

* gcc.dg/c2x-has-c-attribute-1.c, gcc.dg/c2x-has-c-attribute-2.c,
gcc.dg/c2x-has-c-attribute-3.c, gcc.dg/c2x-has-c-attribute-4.c:
New tests.

libcpp/
2020-11-12  Joseph Myers  

* include/cpplib.h (struct cpp_callbacks): Add bool argument to
has_attribute.
(enum cpp_builtin_type): Add BT_HAS_STD_ATTRIBUTE.
* init.c (builtin_array): Add __has_c_attribute.
(cpp_init_special_builtins): Handle BT_HAS_STD_ATTRIBUTE.
* macro.c (_cpp_builtin_macro_text): Handle BT_HAS_STD_ATTRIBUTE.
Update call to has_attribute for BT_HAS_ATTRIBUTE.
* traditional.c (fun_like_macro): Handle BT_HAS_STD_ATTRIBUTE.

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 94f4868915a..f47097442eb 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1042,7 +1042,7 @@ extern bool c_cpp_diagnostic (cpp_reader *, enum cpp_diagnostic_level,
  enum cpp_warning_reason, rich_location *,
  const char *, va_list *)
  ATTRIBUTE_GCC_DIAG(5,0);
-extern int c_common_has_attribute (cpp_reader *);
+extern int c_common_has_attribute (cpp_reader *, bool);
 extern int c_common_has_builtin (cpp_reader *);
 
 extern bool parse_optimize_options (tree, bool);
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index e81e16ddc26..6cd3df7c96f 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -300,7 +300,7 @@ get_token_no_padding (cpp_reader *pfile)
 
 /* Callback for has_attribute.  */
 int
-c_common_has_attribute (cpp_reader *pfile)
+c_common_has_attribute (cpp_reader *pfile, bool std_syntax)
 {
   int result = 0;
   tree attr_name = NULL_TREE;
@@ -319,35 +319,37 @@ c_common_has_attribute (cpp_reader *pfile)
   attr_name = get_identifier ((const char *)
  cpp_token_as_tex

Re: Fix gimple_expr_code?

2020-11-12 Thread Andrew MacLeod via Gcc-patches

On 11/12/20 3:53 PM, Richard Biener wrote:
> On November 12, 2020 9:43:52 PM GMT+01:00, Andrew MacLeod via Gcc-patches
> wrote:
>> So I spent some time tracking down a ranger issue, and in the end, it
>> boiled down to the range-op handler not being picked up properly.
>>
>> The handler is picked up by:
>>
>>    if ((gimple_code (s) == GIMPLE_ASSIGN) || (gimple_code (s) ==
>> GIMPLE_COND))
>>      return range_op_handler (gimple_expr_code (s), gimple_expr_type
>> (s));
>
> IMHO this should use more specific functions. Gimple_expr_code should go away
> similar to gimple_expr_type.

gimple_expr_type is quite pervasive.. and each consumer is going to have
to roll their own version of it.  Why do we want to get rid of it?

If we are trying to save a few bytes by storing the information in
different places, then we're going to need some sort of accessing
function like that

>> where it is indexing the table with the gimple_expr_code..
>> the stmt being processed was for a pointer assignment,
>>    _5 = _33
>> and it was coming back with a gimple_expr_code of  VAR_DECL instead of
>> an SSA_NAME... which confused me greatly.
>>
>>
>> gimple_expr_code (const gimple *stmt)
>> {
>>    enum gimple_code code = gimple_code (stmt);
>>    if (code == GIMPLE_ASSIGN || code == GIMPLE_COND)
>>      return (enum tree_code) stmt->subcode;
>>
>> A little more digging shows this:
>>
>> static inline enum tree_code
>> gimple_assign_rhs_code (const gassign *gs)
>> {
>>    enum tree_code code = (enum tree_code) gs->subcode;
>>    /* While we initially set subcode to the TREE_CODE of the rhs for
>>   GIMPLE_SINGLE_RHS assigns we do not update that subcode to stay
>>   in sync when we rewrite stmts into SSA form or do SSA
>> propagations.  */
>>    if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS)
>>      code = TREE_CODE (gs->op[1]);
>>
>>    return code;
>> }
>>
>> Fascinating comment.
>
> ... 😬
>
>> But it means that gimple_expr_code() isn't returning the correct result
>> for GIMPLE_SINGLE_RHS
>
> It depends. A SSA name isn't an expression code either. As said, the generic
> gimple_expr_code should be used with extreme care.


What is an expression code?  It seems like it's just a tree_code
representing what is on the RHS?  I'm not sure I understand why one
needs to be careful with it.  It only applies to COND, ASSIGN and CALL,
and it's currently right for everything except GIMPLE_SINGLE_RHS?


If we don't fix gimple_expr_code, then I'm basically going to be
reimplementing it myself... which seems kind of pointless.


Andrew





Re: Fix gimple_expr_code?

2020-11-12 Thread Richard Biener via Gcc-patches
On November 12, 2020 9:43:52 PM GMT+01:00, Andrew MacLeod via Gcc-patches 
 wrote:
>So I spent some time tracking down a ranger issue, and in the end, it 
>boiled down to the range-op handler not being picked up properly.
>
>The handler is picked up by:
>
>   if ((gimple_code (s) == GIMPLE_ASSIGN) || (gimple_code (s) == 
>GIMPLE_COND))
>    return range_op_handler (gimple_expr_code (s), gimple_expr_type
>(s));

IMHO this should use more specific functions. Gimple_expr_code should go away 
similar to gimple_expr_type. 

>where it is indexing the table with the gimple_expr_code..
>the stmt being processed was for a pointer assignment,
>   _5 = _33
>and it was coming back with a gimple_expr_code of  VAR_DECL instead of 
>an SSA_NAME... which confused me greatly.
>
>
>gimple_expr_code (const gimple *stmt)
>{
>   enum gimple_code code = gimple_code (stmt);
>   if (code == GIMPLE_ASSIGN || code == GIMPLE_COND)
>     return (enum tree_code) stmt->subcode;
>
>A little more digging shows this:
>
>static inline enum tree_code
>gimple_assign_rhs_code (const gassign *gs)
>{
>   enum tree_code code = (enum tree_code) gs->subcode;
>   /* While we initially set subcode to the TREE_CODE of the rhs for
>  GIMPLE_SINGLE_RHS assigns we do not update that subcode to stay
>  in sync when we rewrite stmts into SSA form or do SSA 
>propagations.  */
>   if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS)
>     code = TREE_CODE (gs->op[1]);
>
>   return code;
>}
>
>Fascinating comment.

... 😬 

>But it means that gimple_expr_code() isn't returning the correct result
>
>for GIMPLE_SINGLE_RHS

It depends. A SSA name isn't an expression code either. As said, the generic 
gimple_expr_code should be used with extreme care. 

>Wouldn't it make sense that gimple_expr_code be changed to return 
>gimple_assign_rhs_code() for GIMPLE_ASSIGN?
>
>I tested the attached patch, and it bootstraps and passes regression
>tests.
>
>There aren't a lot of places where its used, but I saw a suspicious bit
>
>in ipa-icf-gimple.c that looks like it is working around this?
>
>
>bool
>func_checker::compare_gimple_assign (gimple *s1, gimple *s2)
>{
>   tree arg1, arg2;
>   tree_code code1, code2;
>   unsigned i;
>
>   code1 = gimple_expr_code (s1);
>   code2 = gimple_expr_code (s2);
>
>   if (code1 != code2)
>     return false;
>
>   code1 = gimple_assign_rhs_code (s1);
>   code2 = gimple_assign_rhs_code (s2);
>
>   if (code1 != code2)
>     return false;
>
>
>and  there were one or two other places where SSA_NAME occurred in the 
>cases of a switch after calling gimple_expr_code().
>
>This seems like it should be the right thing?
>Andrew



Fix gimple_expr_code?

2020-11-12 Thread Andrew MacLeod via Gcc-patches
So I spent some time tracking down a ranger issue, and in the end, it 
boiled down to the range-op handler not being picked up properly.


The handler is picked up by:

  if ((gimple_code (s) == GIMPLE_ASSIGN) || (gimple_code (s) == 
GIMPLE_COND))

    return range_op_handler (gimple_expr_code (s), gimple_expr_type (s));

where it is indexing the table with the gimple_expr_code..
the stmt being processed was for a pointer assignment,
  _5 = _33
and it was coming back with a gimple_expr_code of  VAR_DECL instead of 
an SSA_NAME... which confused me greatly.



gimple_expr_code (const gimple *stmt)
{
  enum gimple_code code = gimple_code (stmt);
  if (code == GIMPLE_ASSIGN || code == GIMPLE_COND)
    return (enum tree_code) stmt->subcode;

A little more digging shows this:

static inline enum tree_code
gimple_assign_rhs_code (const gassign *gs)
{
  enum tree_code code = (enum tree_code) gs->subcode;
  /* While we initially set subcode to the TREE_CODE of the rhs for
 GIMPLE_SINGLE_RHS assigns we do not update that subcode to stay
 in sync when we rewrite stmts into SSA form or do SSA 
propagations.  */

  if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS)
    code = TREE_CODE (gs->op[1]);

  return code;
}

Fascinating comment.

But it means that gimple_expr_code() isn't returning the correct result 
for GIMPLE_SINGLE_RHS


Wouldn't it make sense that gimple_expr_code be changed to return 
gimple_assign_rhs_code() for GIMPLE_ASSIGN?
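
To make the stale-subcode problem concrete, here is a small stand-alone
model.  It is only a sketch: the toy_* types and functions are invented for
this illustration and are not GCC's data structures, and the single_rhs flag
stands in for the get_gimple_rhs_class check.

```c
/* Toy stand-ins for tree codes; not GCC's real enum.  */
enum toy_tree_code { TOY_VAR_DECL, TOY_SSA_NAME, TOY_PLUS_EXPR };

/* Toy assignment statement.  */
struct toy_assign
{
  enum toy_tree_code subcode;   /* code cached when the stmt was built */
  enum toy_tree_code rhs1_code; /* code of the current first RHS operand */
  int single_rhs;               /* nonzero for a plain copy such as _5 = _33 */
};

/* Models the accessor that trusts the cached subcode.  */
enum toy_tree_code
toy_expr_code_cached (const struct toy_assign *s)
{
  return s->subcode;
}

/* Models the accessor that re-derives the code from the operand for
   single-RHS assignments.  */
enum toy_tree_code
toy_assign_rhs_code (const struct toy_assign *s)
{
  return s->single_rhs ? s->rhs1_code : s->subcode;
}

/* Models an SSA rewrite: the operand changes, the cached subcode does not.  */
void
toy_rewrite_into_ssa (struct toy_assign *s)
{
  s->rhs1_code = TOY_SSA_NAME;
}
```

Starting from a copy whose subcode and operand are both TOY_VAR_DECL, a call
to toy_rewrite_into_ssa leaves toy_expr_code_cached reporting TOY_VAR_DECL
while toy_assign_rhs_code reports TOY_SSA_NAME, which is the mismatch
described above.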


I tested the attached patch, and it bootstraps and passes regression tests.

There aren't a lot of places where it's used, but I saw a suspicious bit 
in ipa-icf-gimple.c that looks like it is working around this?



   bool
   func_checker::compare_gimple_assign (gimple *s1, gimple *s2)
   {
  tree arg1, arg2;
  tree_code code1, code2;
  unsigned i;

  code1 = gimple_expr_code (s1);
  code2 = gimple_expr_code (s2);

  if (code1 != code2)
    return false;

  code1 = gimple_assign_rhs_code (s1);
  code2 = gimple_assign_rhs_code (s2);

  if (code1 != code2)
    return false;


and  there were one or two other places where SSA_NAME occurred in the 
cases of a switch after calling gimple_expr_code().


This seems like it should be the right thing?
Andrew
	* gimple.h (gimple_expr_code): Return gimple_assign_rhs_code
	for GIMPLE_ASSIGN.

diff --git a/gcc/gimple.h b/gcc/gimple.h
index 62b5a8a6124..8ef2f83d412 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -2229,26 +2229,6 @@ gimple_set_modified (gimple *s, bool modifiedp)
 }
 
 
-/* Return the tree code for the expression computed by STMT.  This is
-   only valid for GIMPLE_COND, GIMPLE_CALL and GIMPLE_ASSIGN.  For
-   GIMPLE_CALL, return CALL_EXPR as the expression code for
-   consistency.  This is useful when the caller needs to deal with the
-   three kinds of computation that GIMPLE supports.  */
-
-static inline enum tree_code
-gimple_expr_code (const gimple *stmt)
-{
-  enum gimple_code code = gimple_code (stmt);
-  if (code == GIMPLE_ASSIGN || code == GIMPLE_COND)
-return (enum tree_code) stmt->subcode;
-  else
-{
-  gcc_gimple_checking_assert (code == GIMPLE_CALL);
-  return CALL_EXPR;
-}
-}
-
-
 /* Return true if statement STMT contains volatile operands.  */
 
 static inline bool
@@ -2889,6 +2869,29 @@ gimple_assign_cast_p (const gimple *s)
   return false;
 }
 
+
+/* Return the tree code for the expression computed by STMT.  This is
+   only valid for GIMPLE_COND, GIMPLE_CALL and GIMPLE_ASSIGN.  For
+   GIMPLE_CALL, return CALL_EXPR as the expression code for
+   consistency.  This is useful when the caller needs to deal with the
+   three kinds of computation that GIMPLE supports.  */
+
+static inline enum tree_code
+gimple_expr_code (const gimple *stmt)
+{
+  enum gimple_code code = gimple_code (stmt);
+  if (code == GIMPLE_ASSIGN)
+return gimple_assign_rhs_code (stmt);
+  else if (code == GIMPLE_COND)
+return (enum tree_code) stmt->subcode;
+  else
+{
+  gcc_gimple_checking_assert (code == GIMPLE_CALL);
+  return CALL_EXPR;
+}
+}
+
+
 /* Return true if S is a clobber statement.  */
 
 static inline bool


[committed] openmp: Implement allocate clause in omp lowering

2020-11-12 Thread Jakub Jelinek via Gcc-patches
Hi!

For now, task/taskloop constructs aren't handled, and C/C++ array reductions
and reductions with task or inscan modifiers need further work.
Instead of calling omp_alloc/omp_free (where the former doesn't have an
alignment argument and omp_aligned_alloc is a 5.1-only feature), this calls
GOMP_alloc/GOMP_free, so that the library can fail if the allocation would
fall back to returning NULL (the exception is zero-length allocations).
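
For readers unfamiliar with the clause, a minimal sketch of user code that
exercises it follows.  The function name is made up for this example; it
assumes a compiler with OpenMP 5.0 allocate clause support and -fopenmp.
Without -fopenmp the pragmas are ignored and the loop simply runs serially.

```c
#ifdef _OPENMP
# include <omp.h>   /* declares omp_default_mem_alloc */
#endif

/* Sums 0 .. n-1.  Each thread's private copy of SCRATCH is requested
   from the default allocator through the allocate clause.  */
long
sum_with_allocated_private (int n)
{
  long total = 0;
  long scratch = 0;

#pragma omp parallel private (scratch) \
    allocate (omp_default_mem_alloc: scratch) reduction (+: total)
  {
#pragma omp for
    for (int i = 0; i < n; i++)
      {
        scratch = 2L * i;       /* private copy, written before use */
        total += scratch / 2;
      }
  }
  return total;
}
```

The result does not depend on the number of threads, since each iteration
contributes its own index to the reduction.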

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2020-11-12  Jakub Jelinek  

gcc/
* builtin-types.def (BT_FN_PTR_SIZE_SIZE_PTRMODE): New function type.
* omp-builtins.def (BUILT_IN_GOACC_DECLARE): Move earlier.
(BUILT_IN_GOMP_ALLOC, BUILT_IN_GOMP_FREE): New builtins.
* gimplify.c (gimplify_scan_omp_clauses): Force allocator into a
decl if it is not NULL, INTEGER_CST or decl.
(gimplify_adjust_omp_clauses): Clear GOVD_EXPLICIT on explicit clauses
which are being removed.  Remove allocate clauses for variables not seen
if they are private, firstprivate or linear too.  Call
omp_notice_variable on the allocator otherwise.
(gimplify_omp_for): Handle iterator vars mentioned in allocate clauses
similarly to non-is_gimple_reg iterators.
* omp-low.c (struct omp_context): Add allocate_map field.
(delete_omp_context): Delete it.
(scan_sharing_clauses): Fill it from allocate clauses.  Remove it
if mentioned also in shared clause.
(lower_private_allocate): New function.
(lower_rec_input_clauses): Handle allocate clause for privatized
variables, except for task/taskloop, C/C++ array reductions for now
and task/inscan variables.
(lower_send_shared_vars): Don't consider variables in allocate_map
as shared.
* omp-expand.c (expand_omp_for_generic, expand_omp_for_static_nochunk,
expand_omp_for_static_chunk): Use expand_omp_build_assign instead of
gimple_build_assign + gsi_insert_after.
* builtins.c (builtin_fnspec): Handle BUILTIN_GOMP_ALLOC and
BUILTIN_GOMP_FREE.
* tree-ssa-ccp.c (evaluate_stmt): Handle BUILTIN_GOMP_ALLOC.
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Handle
BUILTIN_GOMP_ALLOC.
(mark_all_reaching_defs_necessary_1): Handle BUILTIN_GOMP_ALLOC
and BUILTIN_GOMP_FREE.
(propagate_necessity): Likewise.
gcc/fortran/
* f95-lang.c (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST):
Define.
(gfc_init_builtin_functions): Add alloc_size and warn_unused_result
attributes to __builtin_GOMP_alloc.
* types.def (BT_PTRMODE): New primitive type.
(BT_FN_VOID_PTR_PTRMODE, BT_FN_PTR_SIZE_SIZE_PTRMODE): New function
types.
libgomp/
* libgomp.map (GOMP_alloc, GOMP_free): Export at GOMP_5.0.1.
* omp.h.in (omp_alloc): Add malloc and alloc_size attributes.
* libgomp_g.h (GOMP_alloc, GOMP_free): Declare.
* allocator.c (omp_aligned_alloc): New for now static function,
add alignment argument and handle it.
(omp_alloc): Reimplement using omp_aligned_alloc.
(GOMP_alloc, GOMP_free): New functions.
(omp_free): Add ialias.
* testsuite/libgomp.c-c++-common/allocate-1.c: New test.
* testsuite/libgomp.c++/allocate-1.C: New test.

--- gcc/builtin-types.def.jj2020-11-12 11:57:58.465562360 +0100
+++ gcc/builtin-types.def   2020-11-12 12:42:06.093029492 +0100
@@ -637,6 +637,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_SIZE_SIZ
 DEF_FUNCTION_TYPE_3 (BT_FN_UINT_UINT_PTR_PTR, BT_UINT, BT_UINT, BT_PTR, BT_PTR)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_CONST_SIZE_BOOL,
 BT_PTR, BT_PTR, BT_CONST_SIZE, BT_BOOL)
+DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_PTRMODE,
+BT_PTR, BT_SIZE, BT_SIZE, BT_PTRMODE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
--- gcc/omp-builtins.def.jj 2020-11-12 11:57:58.470562304 +0100
+++ gcc/omp-builtins.def2020-11-12 12:42:06.105029360 +0100
@@ -47,6 +47,8 @@ DEF_GOACC_BUILTIN (BUILT_IN_GOACC_UPDATE
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_WAIT, "GOACC_wait",
   BT_FN_VOID_INT_INT_VAR,
   ATTR_NOTHROW_LIST)
+DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DECLARE, "GOACC_declare",
+  BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST)
 
 DEF_GOACC_BUILTIN_COMPILER (BUILT_IN_ACC_ON_DEVICE, "acc_on_device",
BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
@@ -444,5 +446,8 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TASK_RED
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_WORKSHARE_TASK_REDUCTION_UNREGISTER,
  "GOMP_workshare_task_reduction_unregister",
  BT_FN_VOID_BOOL, ATTR_NOTHROW_LEAF_LIST)
-DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DECLARE, "GOACC_declare",
-  BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHR

Re: [PATCH 2/2] loops: Invoke lim after successful loop interchange

2020-11-12 Thread Martin Jambor
Hi,

On Wed, Nov 11 2020, Richard Biener wrote:
> On Mon, 9 Nov 2020, Martin Jambor wrote:
>
>> this patch modifies the loop invariant pass so that it can operate
>> only on a single requested loop and its sub-loops and ignore the rest
>> of the function, much like it currently ignores basic blocks that are
>> not in any real loop.  It then invokes it from within the loop
>> interchange pass when it successfully swaps two loops.  This avoids
>> the non-LTO -Ofast run-time regressions of 410.bwaves and 503.bwaves_r
>> (which are 19% and 15% faster than current master on an AMD zen2
>> machine) while not introducing a full LIM pass into the pass pipeline.
>> 
>> I have not modified the LIM data structures, this means that it still
>> contains vectors indexed by loop->num even though only a single loop
>> nest is actually processed.  I also did not replace the uses of
>> pre_and_rev_post_order_compute_fn with a function that would count a
>> postorder only for a given loop.  I can of course do so if the
>> approach is otherwise deemed viable.
>> 
>> The patch adds one additional global variable requested_loop to the
>> pass and then at various places behaves differently when it is set.  I
>> was considering storing the fake root loop into it for normal
>> operation, but since this loop often requires special handling anyway,
>> I came to the conclusion that the code would actually end up less
>> straightforward.
>> 
>> I have bootstrapped and tested the patch on x86_64-linux and a very
>> similar one on aarch64-linux.  I have also tested it by modifying the
>> tree_ssa_lim function to run loop_invariant_motion_from_loop on each
>> real outermost loop in a function and this variant also passed
>> bootstrap and all tests, including dump scans, of all languages.
>> 
>> I have built the entire SPEC 2006 FPrate monitoring the activity of
>> the LIM pass without and with the patch (on top of commit b642fca1c31
>> with which 526.blender_r and 538.imagick_r seemed to be failing) and
>> it only examined 0.2% more loops, 0.02% more BBs and even fewer
>> percent of statements because it is invoked only in a rather special
>> circumstance.  But the patch allows for more such need-based uses at
>> hopefully reasonable cost.
>> 
>> Since I do not have much experience with loop optimizers, I expect
>> that there will be requests to adjust the patch during the review.
>> Still, it fixes a performance regression against GCC 9 and so I hope
>> to address the concerns in time to get it into GCC 11.
>> 

[...]

>
> That said, in the way it's currently structured I think it's
> "better" to export tree_ssa_lim () and call it from interchange
> if any loop was interchanged (thus run a full pass but conditional
> on interchange done).  You can make it cheaper by adding a flag
> to tree_ssa_lim whether to do store-motion (I guess this might
> be an interesting user-visible flag as well and a possibility
> to make select lim passes cheaper via a pass flag) and not do
> store-motion from the interchange call.  I think that's how we should
> fix the regression, refactoring LIM properly requires more work
> that doesn't seem to fit the stage1 deadline.
>

So just like this?  Bootstrapped and tested on x86_64-linux and I have
verified it fixes the bwaves reduction.
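
As a hand-written illustration of why invariant motion is worth running
after interchange (a sketch with invented function names; whether GCC
actually interchanges a given nest depends on its cost model): in the
original nest nothing is invariant in the inner loop, but once the loops are
swapped the product b[i] * c no longer changes in the inner loop and can be
hoisted.

```c
#define N 8

/* Original order: the inner loop walks down a column and b[i] * c
   changes on every inner iteration.  */
void
update_original (int a[N][N], const int b[N], int c)
{
  for (int j = 0; j < N; j++)
    for (int i = 0; i < N; i++)
      a[i][j] += b[i] * c;
}

/* After interchange the inner loop walks along a row, and b[i] * c is
   invariant in it, so it can be computed once per outer iteration.  */
void
update_interchanged_hoisted (int a[N][N], const int b[N], int c)
{
  for (int i = 0; i < N; i++)
    {
      int t = b[i] * c;
      for (int j = 0; j < N; j++)
        a[i][j] += t;
    }
}

/* Returns 1 if both versions produce identical arrays.  */
int
results_match (void)
{
  int x[N][N], y[N][N], b[N];

  for (int i = 0; i < N; i++)
    {
      b[i] = i + 1;
      for (int j = 0; j < N; j++)
        x[i][j] = y[i][j] = i * N + j;
    }

  update_original (x, b, 3);
  update_interchanged_hoisted (y, b, 3);

  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      if (x[i][j] != y[i][j])
        return 0;
  return 1;
}
```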

Thanks,

Martin



gcc/ChangeLog:

2020-11-12  Martin Jambor  

PR tree-optimization/94406
* tree-ssa-loop-im.c (tree_ssa_lim): Renamed to
loop_invariant_motion_in_fun, added a parameter to control store
motion.
(pass_lim::execute): Adjust call to tree_ssa_lim, now
loop_invariant_motion_in_fun.
* tree-ssa-loop-manip.h (loop_invariant_motion_in_fun): Declare.
* gimple-loop-interchange.cc (pass_linterchange::execute): Call
loop_invariant_motion_in_fun if any interchange has been done.
---
 gcc/gimple-loop-interchange.cc |  9 +++--
 gcc/tree-ssa-loop-im.c | 12 +++-
 gcc/tree-ssa-loop-manip.h  |  2 +-
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index 1656004ecf0..a36dbb49b1f 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -2085,8 +2085,13 @@ pass_linterchange::execute (function *fun)
 }
 
   if (changed_p)
-scev_reset ();
-  return changed_p ? (TODO_update_ssa_only_virtuals) : 0;
+{
+  unsigned todo = TODO_update_ssa_only_virtuals;
+  todo |= loop_invariant_motion_in_fun (cfun, false);
+  scev_reset ();
+  return todo;
+}
+  return 0;
 }
 
 } // anon namespace
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 6bb07e133cd..3c7412737f0 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3089,10 +3089,11 @@ tree_ssa_lim_finalize (void)
 }
 
 /* Moves invariants from loops.  Only "expensive" invariants are moved out --
-   i.e. those that are likely to be win regardless of the register pressure.  */
+   i.

[PATCH] Implementation of asm goto outputs

2020-11-12 Thread Vladimir Makarov via Gcc-patches

  The following patch implements asm goto with outputs.  Kernel
developers have several times expressed a wish to have this feature.  Asm
goto with outputs was implemented in LLVM recently.  This new feature
was presented at the 2020 Linux Plumbers Conference
(https://linuxplumbersconf.org/event/7/contributions/801/attachments/659/1212/asm_goto_w__Outputs.pdf)
and the 2020 LLVM conference
(https://www.youtube.com/watch?v=vcPD490s-hE).

  The patch permits the use of outputs in asm gotos only when LRA is used.
It is problematic to implement it in the old reload pass.  To be
honest, it was hard to implement it in LRA too until global live info
update was added to LRA a few years ago.

  Unlike the LLVM asm goto output implementation, you can use
outputs on any path from the asm goto (not only on the fallthrough path as
in LLVM).
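
  A minimal sketch of what such user code can look like is given below.  It
is not taken from the patch or its tests: the macro and function names are
made up, the version check assumes the feature is available in GCC 11 and
later on an LRA target, and the empty asm template never actually jumps.

```c
/* Assumption for this sketch: asm goto with outputs is available in
   GCC 11 and later.  Other compilers take the plain C fallback.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 11
# define HAVE_ASM_GOTO_OUTPUTS 1
#else
# define HAVE_ASM_GOTO_OUTPUTS 0
#endif

/* Returns x + 1 on the fallthrough path.  The asm declares X as an
   in/out operand and JUMPED as a possible target; X may be read on
   either path.  */
int
bump_unless_jumped (int x)
{
#if HAVE_ASM_GOTO_OUTPUTS
  __asm__ goto ("" : "+r" (x) : : : jumped);
  return x + 1;
jumped:
  return x - 1;   /* the output is usable on the label path as well */
#else
  return x + 1;
#endif
}
```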

  The patch removes critical edges on which asm output reloads could
potentially occur (it means you can have several asm gotos using the
same labels and the same outputs).  It is done in IRA as it is
difficult to create new BBs in LRA.  Most of the work (placement
of output reloads in the destination BBs of the asm goto basic block) is
done in LRA.  When it happens, LRA updates global live info to reflect that
new pseudos live on the BB borders and the old ones do not live there
anymore.

  I also tried an approach of splitting live ranges of pseudos involved in
asm goto outputs to guarantee they get hard registers in IRA.  But
this approach did not work as it is difficult to keep this assignment
through all of LRA.  Also, it would probably result in worse code as move
insn coalescing is not guaranteed.

  Asm goto with outputs will not work for targets which were not
converted to LRA (probably some outdated targets as the old reload
pass is not supported anymore).  An error will be generated when the
old reload pass meets an asm goto with an output.  A precaution is taken
not to crash the compiler after this error.

  The patch is pretty small as all the necessary infrastructure was
already implemented, in practically all of the compiler pipeline.  It did not
require adding new RTL insns, in contrast to what Google engineers did to
LLVM MIR.

  The patch could also be useful for implementing jump insns with
output reloads in the future (e.g. branch and count insns).

  I think asm gotos with outputs should be considered an experimental
feature as there is no real usage of this yet.  Early adoption of
this feature could help with debugging and hardening the
implementation.

  The patch was successfully bootstrapped and tested on x86-64, ppc64, 
and aarch64.


Are non-RA changes ok in the patch?

2020-11-12  Vladimir Makarov 

    * c/c-parser.c (c_parser_asm_statement): Parse outputs for asm
    goto too.
    * c/c-typeck.c (build_asm_expr): Remove an assert checking output
    absence for asm goto.
    * cfgexpand.c (expand_asm_stmt): Output asm goto with outputs too.
    Place insns after asm goto on edges.
    * cp/parser.c (cp_parser_asm_definition): Parse outputs for asm
    goto too.
    * doc/extend.texi: Reflect the changes in asm goto documentation.
    * gcc/gimple.c (gimple_build_asm_1): Remove an assert checking output
    absence for asm goto.
    * gimple.h (gimple_asm_label_op, gimple_asm_set_label_op): Take
    possible asm goto outputs into account.
    * ira.c (ira): Remove critical edges for potential asm goto output
    reloads.
    (ira_nullify_asm_goto): New function.
    * ira.h (ira_nullify_asm_goto): New prototype.
    * lra-assigns.c (lra_split_hard_reg_for): Use ira_nullify_asm_goto.
    Check that splitting is done inside a basic block.
    * lra-constraints.c (curr_insn_transform): Permit output reloads
    for any jump insn.
    * lra-spills.c (lra_final_code_change): Remove USEs added in ira
    for asm gotos.

    * lra.c (lra_process_new_insns): Place output reload insns after
    jumps in the beginning of destination BBs.
    * reload.c (find_reloads): Report error for asm gotos with
    outputs.  Modify them to keep CFG consistency to avoid crashes.
    * tree-into-ssa.c (rewrite_stmt): Don't put debug stmt after asm
    goto.


2020-11-12  Vladimir Makarov  

    * c-c++-common/asmgoto-2.c: Permit output in asm goto.
    * gcc.c-torture/compile/asmgoto-[2345].c: New tests.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index ecc3d2119fa..db719fad58c 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7144,10 +7144,7 @@ c_parser_asm_statement (c_parser *parser)
 	switch (section)
 	  {
 	  case 0:
-	/* For asm goto, we don't allow output operands, but reserve
-	   the slot for a future extension that does allow them.  */
-	if (!is_goto)
-	  outputs = c_parser_asm_operands (parser);
+	outputs = c_parser_asm_operands (parser);
 	break;
 	  case 1:
 	inputs = c_parser_asm_operands (parser);
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 968403

Re: [PATCH 1/3] C-family, Objective-C [1/3] : Implement Wobjc-root-class [PR77404].

2020-11-12 Thread Joseph Myers
On Thu, 12 Nov 2020, Iain Sandoe wrote:

> OK for the c-family parts?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[1/3][aarch64] Add aarch64 support for vec_widen_add, vec_widen_sub patterns

2020-11-12 Thread Joel Hutton via Gcc-patches
Hi all,

This patch adds backend patterns for vec_widen_add, vec_widen_sub on aarch64.
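
As a rough scalar model of what a lo/hi pair of widening-add patterns
computes (a sketch only; the type aliases V8HI/V4SI and the function
widen_add_half are hypothetical names for this example, and the lane
numbering assumes little-endian, where "lo" means lanes 0..3):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V8HI = std::array<std::int16_t, 8>;  // eight 16-bit input lanes
using V4SI = std::array<std::int32_t, 4>;  // four 32-bit result lanes

// Widening add of one half of the inputs.  'high' selects lanes 4..7,
// otherwise lanes 0..3 are used.  Each lane is sign-extended before the
// add, so the sum cannot wrap in 16 bits.
V4SI widen_add_half (const V8HI &a, const V8HI &b, bool high)
{
  V4SI r{};
  const std::size_t base = high ? 4 : 0;
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = std::int32_t (a[base + i]) + std::int32_t (b[base + i]);
  return r;
}
```

A full-width widening add is then the lo result followed by the hi result;
the unsigned variants would zero-extend instead of sign-extend.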

All 3 patches together bootstrapped and regression tested on aarch64.

Ok for stage 1?

gcc/ChangeLog:

2020-11-12  Joel Hutton  

        * config/aarch64/aarch64-simd.md: New patterns 
vec_widen_saddl_lo/hi_
From 3e47bc562b83417a048e780bcde52fb2c9617df3 Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Mon, 9 Nov 2020 15:35:57 +
Subject: [PATCH 1/3] [aarch64] Add vec_widen patterns to aarch64

Add widening add and subtract patterns to the aarch64
backend.
---
 gcc/config/aarch64/aarch64-simd.md | 94 ++
 1 file changed, 94 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 2cf6fe9154a2ee1b21ad9e8e2a6109805022be7f..b4f56a2295926f027bd53e7456eec729af0cd6df 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3382,6 +3382,100 @@
   [(set_attr "type" "neon__long")]
 )
 
+(define_expand "vec_widen_saddl_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+  emit_insn (gen_aarch64_saddl_lo_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_ssubl_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+  emit_insn (gen_aarch64_ssubl_lo_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+(define_expand "vec_widen_saddl_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+  emit_insn (gen_aarch64_saddl_hi_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_ssubl_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+  emit_insn (gen_aarch64_ssubl_hi_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+(define_expand "vec_widen_uaddl_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+  emit_insn (gen_aarch64_uaddl_lo_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_usubl_lo_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+  emit_insn (gen_aarch64_usubl_lo_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_uaddl_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+  emit_insn (gen_aarch64_uaddl_hi_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
+(define_expand "vec_widen_usubl_hi_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQW 1 "register_operand")
+   (match_operand:VQW 2 "register_operand")]
+  "TARGET_SIMD"
+{
+  rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+  emit_insn (gen_aarch64_usubl_hi_internal (operands[0], operands[1],
+		  operands[2], p));
+  DONE;
+})
+
 
 (define_expand "aarch64_saddl2"
   [(match_operand: 0 "register_operand")
-- 
2.17.1



[3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-12 Thread Joel Hutton via Gcc-patches
Hi all,

This patch adds support in the aarch64 backend for the vec_widen_shift 
vect-pattern and makes a minor mid-end fix to support it.
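
For reference, a scalar sketch of the kind of loop a widening left shift
targets.  The function name widen_shl_16 is hypothetical, and unsigned
element types are used here (unlike the signed test in the patch) purely to
keep the example free of shifts on negative values.  Whether a given
compiler actually vectorizes it this way depends on target and flags.

```cpp
#include <cstddef>
#include <cstdint>

// Each 16-bit element is widened to 32 bits and shifted left by 16, i.e.
// by the full width of the narrow element.
void widen_shl_16 (std::uint32_t *out, const std::uint16_t *in, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    out[i] = std::uint32_t (in[i]) << 16;
}
```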

All 3 patches together bootstrapped and regression tested on aarch64.

Ok for stage 1?

gcc/ChangeLog:

2020-11-12  Joel Hutton  

        * config/aarch64/aarch64-simd.md: vec_widen_lshift_hi/lo patterns
        * tree-vect-stmts.c (vectorizable_conversion): Fix for widen_lshift case

gcc/testsuite/ChangeLog:

2020-11-12  Joel Hutton  

        * gcc.target/aarch64/vect-widen-lshift.c: New test.
From 97af35b2d2a505dcefd8474cbd4bc3441b83ab02 Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Thu, 12 Nov 2020 11:48:25 +
Subject: [PATCH 3/3] [AArch64][vect] vec_widen_lshift pattern

Add aarch64 vec_widen_lshift_lo/hi patterns and fix bug it triggers in
mid-end.
---
 gcc/config/aarch64/aarch64-simd.md| 66 +++
 .../gcc.target/aarch64/vect-widen-lshift.c| 60 +
 gcc/tree-vect-stmts.c |  9 ++-
 3 files changed, 133 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b4f56a2295926f027bd53e7456eec729af0cd6df..2bb39c530a1a861cb9bd3df0c2943f62bd6153d7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4711,8 +4711,74 @@
   [(set_attr "type" "neon_sat_shift_reg")]
 )
 
+(define_expand "vec_widen_shiftl_lo_"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(match_operand:VQW 1 "register_operand" "w")
+			 (match_operand:SI 2
+			   "aarch64_simd_shift_imm_bitsize_" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
+emit_insn (gen_aarch64_shll_internal (operands[0], operands[1],
+		 p, operands[2]));
+DONE;
+  }
+)
+
+(define_expand "vec_widen_shiftl_hi_"
+   [(set (match_operand: 0 "register_operand")
+	(unspec: [(match_operand:VQW 1 "register_operand" "w")
+			 (match_operand:SI 2
+			   "immediate_operand" "i")]
+			  VSHLL))]
+   "TARGET_SIMD"
+   {
+rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+emit_insn (gen_aarch64_shll2_internal (operands[0], operands[1],
+		  p, operands[2]));
+DONE;
+   }
+)
+
 ;; vshll_n
 
+(define_insn "aarch64_shll_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(vec_select:
+			(match_operand:VQW 1 "register_operand" "w")
+			(match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
+			 (match_operand:SI 3
+			   "aarch64_simd_shift_imm_bitsize_" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+  return "shll\\t%0., %1., %3";
+else
+  return "shll\\t%0., %1., %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_shll2_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(unspec: [(vec_select:
+			(match_operand:VQW 1 "register_operand" "w")
+			(match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
+			 (match_operand:SI 3
+			   "aarch64_simd_shift_imm_bitsize_" "i")]
+			 VSHLL))]
+  "TARGET_SIMD"
+  {
+if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
+  return "shll2\\t%0., %1., %3";
+else
+  return "shll2\\t%0., %1., %3";
+  }
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
 (define_insn "aarch64_shll_n"
   [(set (match_operand: 0 "register_operand" "=w")
 	(unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
new file mode 100644
index ..23ed93d1dcbc3ca559efa6708b4ed5855fb6a050
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-lshift.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include 
+#include 
+
+#define ARR_SIZE 1024
+
+/* Should produce an shll,shll2 pair*/
+void sshll_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+{
+foo[i]   = a[i]   << 16;
+foo[i+1] = a[i+1] << 16;
+foo[i+2] = a[i+2] << 16;
+foo[i+3] = a[i+3] << 16;
+}
+}
+
+__attribute__((optimize (0)))
+void sshll_nonopt (int32_t *foo, int16_t *a, int16_t *b)
+{
+for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+{
+foo[i]   = a[i]   << 16;
+foo[i+1] = a[i+1] << 16;
+foo[i+2] = a[i+2] << 16;
+foo[i+3] = a[i+3] << 16;
+}
+}
+
+
+void __attribute__((optimize (0)))
+init(uint16_t *a, uint16_t *b)
+{
+for( int i = 0; i < ARR_SIZE;i++)
+{
+  a[i] = i;
+  b[i] = 2*i;
+}
+}
+
+int __attribute__((optimize (0)))
+main()
+{
+uint32_t foo_arr[ARR_SIZE];
+uint32_t bar_arr[ARR_SIZE];
+uint16_t a[ARR_SIZE];
+uint16_t b[ARR_SIZE];
+
+init(a, b);
+sshll_opt(foo_arr, a, b);
+sshll_nonop

[2/3][vect] Add widening add, subtract vect patterns

2020-11-12 Thread Joel Hutton via Gcc-patches
Hi all,

This patch adds widening add and widening subtract patterns to 
tree-vect-patterns.
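
To make the shape concrete, here is a sketch of scalar loops of the kind
such patterns are meant to match: both narrow operands are extended to the
wider type and then added or subtracted.  The function names are
hypothetical, and nothing here asserts that a particular GCC build will
recognize these loops.

```cpp
#include <cstddef>
#include <cstdint>

// Signed widening add: operands are sign-extended to 32 bits first.
void widen_add_s16 (std::int32_t *out, const std::int16_t *a,
                    const std::int16_t *b, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    out[i] = std::int32_t (a[i]) + std::int32_t (b[i]);
}

// Unsigned widening add: operands are zero-extended to 32 bits first.
void widen_add_u16 (std::uint32_t *out, const std::uint16_t *a,
                    const std::uint16_t *b, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    out[i] = std::uint32_t (a[i]) + std::uint32_t (b[i]);
}
```

The same 16-bit pattern 0xFFFF extends to -1 in the signed loop and to
65535 in the unsigned one, so the two loops give different 32-bit sums for
identical input bits.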

All 3 patches together bootstrapped and regression tested on aarch64.

gcc/ChangeLog:

2020-11-12  Joel Hutton  

        * expr.c (expand_expr_real_2): add widen_add, widen_subtract cases
        * optabs-tree.c (optab_for_tree_code): optabs for widening adds,
        subtracts
        * optabs.def (OPTAB_D): define vectorized widen add, subtracts
        * tree-cfg.c (verify_gimple_assign_binary): Add case for widening
        adds, subtracts
        * tree-inline.c (estimate_operator_cost): Add case for widening
        adds, subtracts
        * tree-vect-generic.c (expand_vector_operations_1): Add case for
        widening adds, subtracts
        * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog
        pattern
        (vect_recog_widen_sub_pattern): New recog pattern
        (vect_recog_average_pattern): Update widened add code
        * tree-vect-stmts.c (vectorizable_conversion): Add case for widened
        add, subtract
        (supportable_widening_operation): Add case for widened add, subtract
        * tree.def (WIDEN_ADD_EXPR): New tree code
        (WIDEN_SUB_EXPR): New tree code
        (VEC_WIDEN_ADD_HI_EXPR): New tree code
        (VEC_WIDEN_ADD_LO_EXPR): New tree code
        (VEC_WIDEN_SUB_HI_EXPR): New tree code
        (VEC_WIDEN_SUB_LO_EXPR): New tree code

gcc/testsuite/ChangeLog:

2020-11-12  Joel Hutton  

        * gcc.target/aarch64/vect-widen-add.c: New test.
        * gcc.target/aarch64/vect-widen-sub.c: New test.


Ok for trunk?
From e0c10ca554729b9e6d58dbd3f18ba72b2c3ee8bc Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Mon, 9 Nov 2020 15:44:18 +
Subject: [PATCH 2/3] [vect] Add widening add, subtract patterns

Add widening add, subtract patterns to tree-vect-patterns.
Add aarch64 tests for patterns.

fix sad
---
 gcc/expr.c|  6 ++
 gcc/optabs-tree.c | 17 
 gcc/optabs.def|  8 ++
 .../gcc.target/aarch64/vect-widen-add.c   | 90 +++
 .../gcc.target/aarch64/vect-widen-sub.c   | 90 +++
 gcc/tree-cfg.c|  8 ++
 gcc/tree-inline.c |  6 ++
 gcc/tree-vect-generic.c   |  4 +
 gcc/tree-vect-patterns.c  | 32 +--
 gcc/tree-vect-stmts.c | 15 +++-
 gcc/tree.def  |  6 ++
 11 files changed, 276 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c

diff --git a/gcc/expr.c b/gcc/expr.c
index ae16f07775870792729e3805436d7f2debafb6ca..ffc8aed5296174066849d9e0d73b1c352c20fd9e 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9034,6 +9034,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	  target, unsignedp);
   return target;
 
+case WIDEN_ADD_EXPR:
+case WIDEN_SUB_EXPR:
 case WIDEN_MULT_EXPR:
   /* If first operand is constant, swap them.
 	 Thus the following special case checks need only
@@ -9754,6 +9756,10 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	return temp;
   }
 
+case VEC_WIDEN_ADD_HI_EXPR:
+case VEC_WIDEN_ADD_LO_EXPR:
+case VEC_WIDEN_SUB_HI_EXPR:
+case VEC_WIDEN_SUB_LO_EXPR:
 case VEC_WIDEN_MULT_HI_EXPR:
 case VEC_WIDEN_MULT_LO_EXPR:
 case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 4dfda756932de1693667c39c6fabed043b20b63b..009dccfa3bd298bca7b3b45401a4cc2acc90ff21 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -170,6 +170,23 @@ optab_for_tree_code (enum tree_code code, const_tree type,
   return (TYPE_UNSIGNED (type)
 	  ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
+case VEC_WIDEN_ADD_LO_EXPR:
+  return (TYPE_UNSIGNED (type)
+	  ? vec_widen_uaddl_lo_optab  : vec_widen_saddl_lo_optab);
+
+case VEC_WIDEN_ADD_HI_EXPR:
+  return (TYPE_UNSIGNED (type)
+	  ? vec_widen_uaddl_hi_optab  : vec_widen_saddl_hi_optab);
+
+case VEC_WIDEN_SUB_LO_EXPR:
+  return (TYPE_UNSIGNED (type)
+	  ? vec_widen_usubl_lo_optab  : vec_widen_ssubl_lo_optab);
+
+case VEC_WIDEN_SUB_HI_EXPR:
+  return (TYPE_UNSIGNED (type)
+	  ? vec_widen_usubl_hi_optab  : vec_widen_ssubl_hi_optab);
+
+
 case VEC_UNPACK_HI_EXPR:
   return (TYPE_UNSIGNED (type)
 	  ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 78409aa14537d259bf90277751aac00d452a0d3f..a97cdb360781ca9c743e2991422c600626c75aa5 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -383,6 +383,14 @@ OPTAB_D (vec_widen_smult_even_optab, "vec_widen_smult_even_$a")
 OPTAB_D (vec_widen_smult_hi_optab, "vec_widen_smult_hi_$a")
 OPTAB_D (vec_widen_smult_lo_optab, "ve

RE: gcc-wwwdocs branch master updated. 88e29096c36837553fc841bd1fa5df6caa776b44

2020-11-12 Thread Gerald Pfeifer
On Fri, 6 Nov 2020, Liu, Hongtao wrote:
> I realize you're talking about the patch for gcc-wwwdocs.
> No, I didn't send out a patch, sorry for that, will do it in further commit.

Thanks - saw that. Jeff just beat me to it. :-)

Gerald


[committed] wwwdocs: Editorial changes around x86-64 ISA extensions

2020-11-12 Thread Gerald Pfeifer
Per our discussion on the list (plus a grammar improvement in a
section above).

One question: why are the ISA extension lists not alphabetically
sorted?  Wouldn't that be beneficial for users?  Easier to find
something and also easier to compare?

Gerald

---
 htdocs/gcc-11/changes.html | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index fc4c74f4..106db8e9 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -265,7 +265,8 @@ a work-in-progress.
   
   New ISA extension support for Intel AMX-TILE, AMX-INT8, AMX-BF16 was
   added to GCC. AMX-TILE, AMX-INT8, AMX-BF16 intrinsics are available
-  via the -mamx-tile, -mamx-int8, -mamx-bf16 compiler switch.
+  via the -mamx-tile, -mamx-int8, -mamx-bf16 compiler
+  switches.
   
   New ISA extension support for Intel AVX-VNNI was added to GCC.
   AVX-VNNI intrinsics are available via the -mavxvnni
@@ -273,14 +274,14 @@ a work-in-progress.
   
   GCC now supports the Intel CPU named Sapphire Rapids through
 -march=sapphirerapids.
-The switch enables the MOVDIRI MOVDIR64B AVX512VP2INTERSECT ENQCMD CLDEMOTE
-SERIALIZE PTWRITE WAITPKG TSXLDTRK AMT-TILE AMX-INT8 AMX-BF16 AVX-VNNI
-ISA extensions.
+The switch enables the MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, ENQCMD,
+CLDEMOTE, SERIALIZE, PTWRITE, WAITPKG, TSXLDTRK, AMT-TILE, AMX-INT8,
+AMX-BF16, and AVX-VNNI ISA extensions.
   
   GCC now supports the Intel CPU named Alderlake through
 -march=alderlake.
-The switch enables the CLDEMOTE PTWRITE WAITPKG SERIALIZE KEYLOCKER AVX-VNNI
-HRESET ISA extensions.
+The switch enables the CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, KEYLOCKER,
+AVX-VNNI, and HRESET ISA extensions.
   
 
 
-- 
2.29.2


Re: PowerPC: Use __float128 instead of __ieee128 in tests.

2020-11-12 Thread Segher Boessenkool
Hi,

On Thu, Oct 22, 2020 at 06:12:31PM -0400, Michael Meissner wrote:
> Two of the tests used the __ieee128 keyword instead of __float128.  This
> patch changes those cases to use the official keyword.

What is "official" about that?

Why make this change at all?  __ieee128 should work as well!  Did you
see failures without this patch?  Those need fixing, then.


Segher


Re: [PATCH,wwwdocs] gcc-11/changes: Mention Intel AVX-VNNI

2020-11-12 Thread Gerald Pfeifer
On Wed, 11 Nov 2020, Hongtao Liu via Gcc-patches wrote:
> +  New ISA extension support for Intel AVX-VNNI was added to GCC.

More for the future (i.e., no need to change that now): I suggest
to skip "to GCC" in cases like this, since this is our context to
begin with. 

Gerald


Re: [Patch] Fortran: improve location data for OpenACC/OpenMP directives [PR97782]

2020-11-12 Thread Thomas Schwinge
Hi!

On 2020-11-12T12:45:24+0100, Tobias Burnus  wrote:
> For code like
>   !$acc kernels
>  ... a lot of loops and other code
>   !$acc end kernels
>
> gfortran generates
>#pragma ..._kernels
>  {
>... lot of code
>  }
>
> As the PR shows, the location associated with the #pragma
> is not the 'acc kernels' line but the one near the 'acc end kernel'
> line.
>
> The reason is that [...]

> This patch [...]

> In principle, it should also have an effect on warnings (if there are
> any)

..., and there are -- one, at least (and somewhat bogus, but still).  ;-)
I've thus pushed "Adjust 'libgomp.oacc-fortran/attach-descriptor-1.f90'
for improved location information" to master branch in commit
9106c51e57c06e88a0dddf994fb5432b4bbe68c0, see attached.  (Not (yet)
relevant for releases/gcc-10 branch; the commit introducing that testcase
isn't there yet -- that's to be discussed in a different thread.)


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
From 9106c51e57c06e88a0dddf994fb5432b4bbe68c0 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 12 Nov 2020 20:07:25 +0100
Subject: [PATCH] Adjust 'libgomp.oacc-fortran/attach-descriptor-1.f90' for
 improved location information

Fix-up for commit b71ff8c15f5a7d6b1cc1524b4d27843f0d88dbda "Fortran: improve
location data for OpenACC/OpenMP directives [PR97782]".

	libgomp/
	PR fortran/97782
	* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Adjust.
---
 libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90
index 960b9f94507..2701192e37d 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90
@@ -42,9 +42,8 @@ subroutine test(variant)
  stop 1
   end if
 
-  ! FIXME: This warning is emitted on the wrong line number.
-  ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } 52 }
   !$acc serial present(myvar%arr2)
+  ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } .-1 }
   do i=1,10
 myvar%arr1(i) = i + variant
 myvar%arr2(i) = i - variant
-- 
2.17.1



Re: [PATCH] PR libstdc++/71579 assert that type traits are not misused with an incomplete type

2020-11-12 Thread Antony Polukhin via Gcc-patches
Final bits for libstdc++/71579

The std::common_type assertions try to give a proper 'required from
here' hint pointing at user code, change the implementation as little
as possible, and check all the template parameters for completeness.
In some cases a type may be checked for completeness more than
once. This seems unavoidable: std::common_type can be specialized by
the user, so we have to instantiate std::common_type recursively,
potentially repeating the check for the first type.

std::common_reference assertions make sure that we detect incomplete
types even if the user specialized the std::basic_common_reference.
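
The general idea can be sketched outside the library as follows.  This is
an illustration only, not the libstdc++ implementation: is_complete,
is_complete_probe, checked_is_convertible and Opaque are hypothetical
names, and the completeness test is a simplified sizeof-based probe that,
unlike __is_complete_or_unbounded, makes no allowance for void, function
types or arrays of unknown bound.

```cpp
#include <cstddef>
#include <type_traits>

// Viable only when sizeof (T) is well-formed, i.e. T is complete.
template <typename T, std::size_t = sizeof (T)>
std::true_type is_complete_probe (int);

// Fallback for incomplete types (and anything else sizeof rejects).
template <typename T>
std::false_type is_complete_probe (...);

template <typename T>
struct is_complete : decltype (is_complete_probe<T> (0)) { };

// A trait wrapper that refuses incomplete arguments with a readable
// diagnostic instead of silently giving a possibly wrong answer.
template <typename From, typename To>
struct checked_is_convertible : std::is_convertible<From, To>
{
  static_assert (is_complete<From>::value,
                 "first template argument must be a complete type");
  static_assert (is_complete<To>::value,
                 "second template argument must be a complete type");
};

struct Opaque;  // declared but never defined
```

Instantiating checked_is_convertible with Opaque as an argument would
trip the corresponding static_assert; with complete types it behaves like
std::is_convertible.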

Changelog:

2020-11-12  Antony Polukhin  
PR libstdc++/71579
* include/std/type_traits (is_convertible, is_nothrow_convertible)
(common_type, common_reference): Add static_asserts
to make sure that the arguments of the type traits are not misused
with incomplete types.
* testsuite/20_util/common_reference/incomplete_basic_common_neg.cc:
New test.
* testsuite/20_util/common_reference/incomplete_neg.cc: New test.
* testsuite/20_util/common_type/incomplete_neg.cc: New test.
* testsuite/20_util/common_type/requirements/sfinae_friendly_1.cc: Remove
SFINAE tests on incomplete types.
* testsuite/20_util/is_convertible/incomplete_neg.cc: New test.
* testsuite/20_util/is_nothrow_convertible/incomplete_neg.cc: New test.



--
Best regards,
Antony Polukhin
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 34e068b..00fa7f5 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1406,12 +1406,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_convertible
 : public __is_convertible_helper<_From, _To>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_From>{}),
+   "first template argument must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_To>{}),
+   "second template argument must be a complete class or an unbounded array");
+};
 
   // helper trait for unique_ptr, shared_ptr, and span
   template
 using __is_array_convertible
-  = is_convertible<_FromElementType(*)[], _ToElementType(*)[]>;
+  = typename __is_convertible_helper<
+   _FromElementType(*)[], _ToElementType(*)[]>::type;
 
   template, is_function<_To>,
@@ -1454,7 +1460,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_nothrow_convertible
 : public __is_nt_convertible_helper<_From, _To>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_From>{}),
+   "first template argument must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_To>{}),
+   "second template argument must be a complete class or an unbounded array");
+};
 
   /// is_nothrow_convertible_v
   template
@@ -2239,7 +2250,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct common_type<_Tp1, _Tp2>
 : public __common_type_impl<_Tp1, _Tp2>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp1>{}),
+   "each argument type must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp2>{}),
+   "each argument type must be a complete class or an unbounded array");
+};
 
   template
 struct __common_type_pack
@@ -2253,7 +2269,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct common_type<_Tp1, _Tp2, _Rp...>
 : public __common_type_fold,
__common_type_pack<_Rp...>>
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp1>{}),
+   "first argument type must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp2>{}),
+   "second argument type must be a complete class or an unbounded array");
+#ifdef __cpp_fold_expressions
+  static_assert((std::__is_complete_or_unbounded(
+   __type_identity<_Rp>{}) && ...),
+   "each argument type must be a complete class or an unbounded array");
+#endif
+};
 
   // Let C denote the same type, if any, as common_type_t.
   // If there is such a type C, type shall denote the same type, if any,
@@ -3315,9 +3341,10 @@ template 
 
   // If A and B are both rvalue reference types, ...
   template
-struct __common_ref_impl<_Xp&&, _Yp&&,
-  _Require>,
-  is_convertible<_Yp&&, __common_ref_C<_Xp, _Yp
+struct __common_ref_impl<_Xp&&, _Yp&&, _Require<
+  typename __is_convertible_helper<_Xp&&, __common_ref_C<_Xp, _Yp>>::type,
+  typename __is_convertible_helper<_Yp&&, __common_ref_C<_Xp, _Yp>>::type
+>>
 { using type = __common_ref_C<_Xp, _Yp>; };
 
   // let D be COMMON-REF(const X&, Y&)
@@ -3326,8 +33

Re: [PATCH][RFC] Make mingw-w64 printf/scanf attribute alias to ms_printf/ms_scanf only for C89

2020-11-12 Thread Joseph Myers
I'd expect these patches to include updates to the gcc.dg/format/ms_*.c 
tests to reflect the changed semantics (or new tests there if some of the 
changes don't result in any failures in the existing tests).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] c++: Don't form a templated TARGET_EXPR in finish_compound_literal

2020-11-12 Thread Marek Polacek via Gcc-patches
On Thu, Nov 12, 2020 at 01:27:23PM -0500, Patrick Palka wrote:
> The atom_cache in normalize_atom relies on the assumption that two
> equivalent (templated) trees (in the sense of cp_tree_equal) must use
> the same template parameters (according to find_template_parameters).
> 
> This assumption unfortunately doesn't always hold for TARGET_EXPRs,
> because cp_tree_equal ignores an artificial target of a TARGET_EXPR, but
> find_template_parameters walks this target (and its DECL_CONTEXT).
> 
> Hence two TARGET_EXPRs built by force_target_expr with the same
> initializer but under different settings of current_function_decl may
> compare equal according to cp_tree_equal, but find_template_parameters
> returns a different set of template parameters for them.  This breaks
> the below testcase because during normalization we build two such
> TARGET_EXPRs (one under current_function_decl=f and another under =g),
> and then use the same ATOMIC_CONSTR for the two corresponding atoms,
> leading to a crash during satisfaction of g's associated constraints.
> 
> This patch works around this assumption violation by removing the source
> of these templated TARGET_EXPRs.  The relevant call to get_target_expr was
> added in r9-6043, but it seems it's no longer necessary (according to
> https://gcc.gnu.org/pipermail/gcc-patches/2019-February/517323.html, the
> call was added in order to avoid regressing on initlist109.C at the time).
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?

Looks OK to me, thanks!

> gcc/cp/ChangeLog:
> 
>   * semantics.c (finish_compound_literal): Don't wrap the original
>   compound literal in a TARGET_EXPR when inside a template.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp2a/concepts-decltype3.C: New test.
> ---
>  gcc/cp/semantics.c  |  7 +--
>  gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C | 15 +++
>  2 files changed, 16 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
> 
> diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
> index 33d715edaec..172286922e7 100644
> --- a/gcc/cp/semantics.c
> +++ b/gcc/cp/semantics.c
> @@ -3006,12 +3006,7 @@ finish_compound_literal (tree type, tree compound_literal,
>  
>/* If we're in a template, return the original compound literal.  */
>if (orig_cl)
> -{
> -  if (!VECTOR_TYPE_P (type))
> - return get_target_expr_sfinae (orig_cl, complain);
> -  else
> - return orig_cl;
> -}
> +return orig_cl;
>  
>if (TREE_CODE (compound_literal) == CONSTRUCTOR)
>  {
> diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
> new file mode 100644
> index 000..837855ce8ac
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
> @@ -0,0 +1,15 @@
> +// { dg-do compile { target c++20 } }
> +
> +template  concept C = requires(T t) { t; };
> +
> +template  using A = decltype((T{}, int{}));
> +
> +template  concept D = C>;
> +
> +template  void f() requires D;
> +template  void g() requires D;
> +
> +void h() {
> +  f();
> +  g();
> +}
> -- 
> 2.29.2.260.ge31aba42fb
> 

Marek



Re: SLS Mitigation patches backported for GCC9

2020-11-12 Thread Sebastian Pop via Gcc-patches
Hi,

could the SLS Mitigation patches be back-ported to the gcc-8 branch?

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=dc586a74922 aarch64:
Introduce SLS mitigation for RET and BR instructions
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=20da13e395b aarch64:
New Straight Line Speculation (SLS) mitigation flags

Thanks,
Sebastian

On Tue, Aug 4, 2020 at 3:34 AM Kyrylo Tkachov  wrote:
>
> Hi Matthew,
>
> > -Original Message-
> > From: Matthew Malcomson 
> > Sent: 24 July 2020 17:03
> > To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
> > Cc: Richard Earnshaw ; Ross Burton
> > ; Richard Sandiford 
> > Subject: Re: SLS Mitigation patches backported for GCC9
> >
> > On 24/07/2020 12:01, Kyrylo Tkachov wrote:
> > > Hi Matthew,
> > >
> > >> -Original Message-
> > >> From: Matthew Malcomson 
> > >> Sent: 21 July 2020 16:16
> > >> To: gcc-patches@gcc.gnu.org
> > >> Cc: Richard Earnshaw ; Kyrylo Tkachov
> > >> ; Ross Burton 
> > >> Subject: SLS Mitigation patches backported for GCC9
> > >>
> > >> Hello,
> > >>
> > >> Eventually we will want to backport the SLS patches to older branches.
> > >>
> > >> When the GCC10 release is unfrozen we will work on getting the same
> > >> patches already posted backported to that branch.  The patches already
> > >> posted on the mailing list apply cleanly to the current releases/gcc-10
> > >> branch.
> > >>
> > >> I've heard interest in having the GCC 9 patches, so I'm posting the
> > >> modified versions upstream sooner than otherwise.
> > >
> > > I'd say let's go ahead with the GCC 10 patches (assuming testing works out
> > > well on there).
> > > For the GCC 9 patches it would be useful if you included a bit of text of
> > > how they differ from the GCC 10/11 patches.
> > > This would speed up the technical review.
> > > Thanks,
> > > Kyrill
> > >
> > >>
> > >> Cheers,
> > >> Matthew
> > >>
> > >> Entire patch series attached to cover letter.
> >
> > Below were the only two "interesting" hunks that failed to apply after
> > `patch -p1`.
> >
> > The differences causing these were:
> > - in GCC-9 the `retab` instruction wasn't in the "do_return" pattern.
> > - `simple_return` had "aarch64_use_simple_return_insn_p ()" as a
> > condition.
> >
> >
>
> Thanks, the backports to GCC 10 and GCC 9 are okay, let's go ahead with them.
> Kyrill
>
> >
> >
> > --- gcc/config/aarch64/aarch64.md
> > +++ gcc/config/aarch64/aarch64.md
> > @@ -863,18 +882,23 @@
> > [(return)]
> > ""
> > {
> > +const char *ret = NULL;
> >   if (aarch64_return_address_signing_enabled ()
> >  && TARGET_ARMV8_3
> >  && !crtl->calls_eh_return)
> > {
> >  if (aarch64_ra_sign_key == AARCH64_KEY_B)
> > - return "retab";
> > + ret = "retab";
> >  else
> > - return "retaa";
> > + ret = "retaa";
> > }
> > -return "ret";
> > +else
> > +  ret = "ret";
> > +output_asm_insn (ret, operands);
> > +return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> > }
> > -  [(set_attr "type" "branch")]
> > +  [(set_attr "type" "branch")
> > +   (set_attr "sls_length" "retbr")]
> >   )
> >
> >   (define_expand "return"
> > @@ -886,8 +910,12 @@
> >   (define_insn "simple_return"
> > [(simple_return)]
> > ""
> > -  "ret"
> > -  [(set_attr "type" "branch")]
> > +  {
> > +output_asm_insn ("ret", operands);
> > +return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> > +  }
> > +  [(set_attr "type" "branch")
> > +   (set_attr "sls_length" "retbr")]
> >   )
> >
> >   (define_insn "*cb1"


Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.

2020-11-12 Thread Uros Bizjak via Gcc-patches
On Thu, Nov 12, 2020 at 7:26 PM Uros Bizjak  wrote:
>
> On Thu, Nov 12, 2020 at 6:51 PM Uros Bizjak  wrote:
>
> > > > > > Yes, removed 'code' and value_mode by checking VECTOR_MODE_P and use
> > > > > > GET_MODE_INNER for value_mode.  ".md expanders" shall support for
> > > > > > integer constants index mode, but I guess they shouldn't be expanded
> > > > > > by IFN as this function is for variable index insert only?  Anyway,
> > > > > > the v3 patch used VOIDmode check...
> > > >
> > > > I'm not sure what best to do here, as said accepting "any" (integer) mode
> > > > as input is desirable (SImode, DImode but eventually also smaller modes).
> > > > How that can be best achieved I don't know.
> >
> > I was expecting something similar to how extvM/extzvM operands are
> > handled here. We have:
> >
> > Operands 0 and 1 both have mode M.  Operands 2 and 3 have a
> > target-specific mode.
> >
> > Please note operands 2 and 3 having a "target-specific" mode, handled
> > in optabs-query.c as:
> >
> >   machine_mode struct_mode = data->operand[struct_op].mode;
> >   if (struct_mode == VOIDmode)
> > struct_mode = word_mode;
> >   if (mode != struct_mode)
> > return false;
> >
> > > Why's not specifying any mode in the pattern no good?  Just make sure you
> > > appropriately extend/subreg it?  We can make sure it will be an integer
> > > mode in the expander itself.
> >
> > IIRC, having known mode, expanders can use create_convert_operand_to,
> > and the middle-end will do the above by itself. Also note that at
> > least two targets specify SImode, so register operands are currently
> > ineffective there.
>
> On a related note, the pattern is currently expanded as (see
> store_bit_field_1 in expmed.c):
>
>   create_fixed_operand (&ops[0], op0);
>   create_input_operand (&ops[1], value, innermode);
>   create_integer_operand (&ops[2], pos);
>
> I don't think calling create_integer_operand on register operand is
> correct. The function comment says:
>
> /* Make OP describe an input operand that has value INTVAL and that has
>no inherent mode.  This function should only be used for operands that
>are always expand-time constants.  The backend may request that INTVAL
>be copied into a different kind of rtx, but it must specify the mode
>of that rtx if so.  */

Ah, sorry - variable vec_set takes a different path, please disregard
my last message.

Uros.


Re: [RFC][PR target PR90000] (rs6000) Compile time hog w/impossible asm constraint lra loop

2020-11-12 Thread Segher Boessenkool
On Thu, Nov 12, 2020 at 09:15:11AM -0700, Jeff Law wrote:
> > void foo (void)
> > {
> >   register float __attribute__ ((mode(SD))) r31 __asm__ ("r31");
> >   register float __attribute__ ((mode(SD))) fr1 __asm__ ("fr1");
> >
> >   __asm__ ("#" : "=d" (fr1));
> >   r31 = fr1;
> >   __asm__ ("#" : : "r" (r31));
> > }
> 
> Looking at this again after many months away, I wonder if the real problem
> is the reloads we have to generate for copies to/from the fr1 local
> variable, which is bound to hard reg fr1 rather than the asm statements
> themselves.  It's not clear to me from the BZ and I don't have a PPC
> cross handy to look directly.

We should never do a reload of a (local) register variable.
Unfortunately we cannot currently tell during reload that something is
one!

See also PR97708, and many more, going many years back.


Segher


[PATCH] c++: Don't form a templated TARGET_EXPR in finish_compound_literal

2020-11-12 Thread Patrick Palka via Gcc-patches
The atom_cache in normalize_atom relies on the assumption that two
equivalent (templated) trees (in the sense of cp_tree_equal) must use
the same template parameters (according to find_template_parameters).

This assumption unfortunately doesn't always hold for TARGET_EXPRs,
because cp_tree_equal ignores an artificial target of a TARGET_EXPR, but
find_template_parameters walks this target (and its DECL_CONTEXT).

Hence two TARGET_EXPRs built by force_target_expr with the same
initializer but under different settings of current_function_decl may
compare equal according to cp_tree_equal, but find_template_parameters
returns a different set of template parameters for them.  This breaks
the below testcase because during normalization we build two such
TARGET_EXPRs (one under current_function_decl=f and another under =g),
and then use the same ATOMIC_CONSTR for the two corresponding atoms,
leading to a crash during satisfaction of g's associated constraints.

This patch works around this assumption violation by removing the source
of these templated TARGET_EXPRs.  The relevant call to get_target_expr was
added in r9-6043, but it seems it's no longer necessary (according to
https://gcc.gnu.org/pipermail/gcc-patches/2019-February/517323.html, the
call was added in order to avoid regressing on initlist109.C at the time).
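
To make the failure mode above concrete, here is a minimal stand-alone
model of it.  It is only a sketch and does not use any GCC data structure:
target_expr, less_ignoring_target, template_parms and lookup_or_insert are
hypothetical names, and std::map stands in for the real atom_cache.  The
point is that a cache keyed on an equivalence that ignores part of the key
hands back whatever was computed for the first equivalent key.

```cpp
#include <map>
#include <string>

// Hypothetical model of a templated TARGET_EXPR: an initializer plus an
// artificial target whose context is the function being processed.
struct target_expr
{
  std::string initializer;  // the part the equality predicate looks at
  std::string context;      // the part only the parameter walk looks at
};

// Stand-in for cp_tree_equal: orders (and hence equates) expressions by
// initializer only, ignoring the artificial target.
struct less_ignoring_target
{
  bool operator() (const target_expr &a, const target_expr &b) const
  { return a.initializer < b.initializer; }
};

// Stand-in for find_template_parameters: the result also depends on the
// context of the artificial target.
std::string template_parms (const target_expr &e)
{ return "parms-of-" + e.context; }

// Cache keyed on the equivalence above.  The cached value records the
// parameters of whichever equivalent expression was inserted first.
using atom_cache = std::map<target_expr, std::string, less_ignoring_target>;

std::string lookup_or_insert (atom_cache &cache, const target_expr &e)
{
  // emplace leaves an existing equivalent entry untouched.
  return cache.emplace (e, template_parms (e)).first->second;
}
```

Looking up the same initializer first under context "f" and then under "g"
returns the entry computed for "f" both times, although template_parms on
the second expression would report the parameters of "g".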

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* semantics.c (finish_compound_literal): Don't wrap the original
compound literal in a TARGET_EXPR when inside a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-decltype3.C: New test.
---
 gcc/cp/semantics.c  |  7 +--
 gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C | 15 +++
 2 files changed, 16 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 33d715edaec..172286922e7 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -3006,12 +3006,7 @@ finish_compound_literal (tree type, tree 
compound_literal,
 
   /* If we're in a template, return the original compound literal.  */
   if (orig_cl)
-{
-  if (!VECTOR_TYPE_P (type))
-   return get_target_expr_sfinae (orig_cl, complain);
-  else
-   return orig_cl;
-}
+return orig_cl;
 
   if (TREE_CODE (compound_literal) == CONSTRUCTOR)
 {
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
new file mode 100644
index 000..837855ce8ac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype3.C
@@ -0,0 +1,15 @@
+// { dg-do compile { target c++20 } }
+
+template  concept C = requires(T t) { t; };
+
+template  using A = decltype((T{}, int{}));
+
+template  concept D = C>;
+
+template  void f() requires D;
+template  void g() requires D;
+
+void h() {
+  f();
+  g();
+}
-- 
2.29.2.260.ge31aba42fb



Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.

2020-11-12 Thread Uros Bizjak via Gcc-patches
On Thu, Nov 12, 2020 at 6:51 PM Uros Bizjak  wrote:

> > > > Yes, removed 'code' and value_mode by checking VECTOR_MODE_P and use 
> > > > GET_MODE_INNER
> > > > for value_mode.  ".md expanders" shall support for integer constants 
> > > > index mode, but
> > > > I guess they shouldn't be expanded by IFN as this function is for 
> > > > variable index
> > > > insert only?  Anyway, the v3 patch used VOIDmode check...
> >
> > I'm not sure what best to do here, as said accepting "any" (integer) mode as
> > input is desirable (SImode, DImode but eventually also smaller modes).  How
> > that can be best achieved I don't know.
>
> I was expecting something similar to how extvM/extzvM operands are
> handled here. We have:
>
> Operands 0 and 1 both have mode M.  Operands 2 and 3 have a
> target-specific mode.
>
> Please note operands 2 and 3 having a "target-specific" mode, handled
> in optabs-query.c as:
>
>   machine_mode struct_mode = data->operand[struct_op].mode;
>   if (struct_mode == VOIDmode)
> struct_mode = word_mode;
>   if (mode != struct_mode)
> return false;
>
> > Why's not specifying any mode in the pattern no good?  Just make sure you
> > appropriately extend/subreg it?  We can make sure it will be an integer
> > mode in the expander itself.
>
> IIRC, having known mode, expanders can use create_convert_operand_to,
> and the middle-end will do the above by itself. Also note that at
> least two targets specify SImode, so register operands are currently
> ineffective there.
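
For reference, the optabs-query.c check quoted above can be modelled in
isolation as follows.  This is only a sketch: the machine_mode enum, the
choice of DImode as word_mode and the operand_mode_ok helper are
assumptions made for illustration, not GCC's real definitions.

```cpp
// Hypothetical stand-ins for a few machine modes; not the real GCC enum.
enum machine_mode { VOIDmode, SImode, DImode };

// Assumed word mode of a 64-bit target.
constexpr machine_mode word_mode = DImode;

// Mirrors the quoted logic: an operand declared without a mode (VOIDmode)
// is treated as word_mode, and a request in any other mode is rejected.
bool operand_mode_ok (machine_mode requested, machine_mode declared)
{
  machine_mode struct_mode = declared;
  if (struct_mode == VOIDmode)
    struct_mode = word_mode;
  return requested == struct_mode;
}
```

Under these assumptions a modeless operand accepts DImode but not SImode,
while an operand declared as SImode accepts only SImode.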

On a related note, the pattern is currently expanded as (see
store_bit_field_1 in expmed.c):

  create_fixed_operand (&ops[0], op0);
  create_input_operand (&ops[1], value, innermode);
  create_integer_operand (&ops[2], pos);

I don't think calling create_integer_operand on register operand is
correct. The function comment says:

/* Make OP describe an input operand that has value INTVAL and that has
   no inherent mode.  This function should only be used for operands that
   are always expand-time constants.  The backend may request that INTVAL
   be copied into a different kind of rtx, but it must specify the mode
   of that rtx if so.  */

Uros.


Re: [gcc r9-8794] aarch64: Clear canary value after stack_protect_test [PR96191]

2020-11-12 Thread Sebastian Pop via Gcc-patches
Hi,

On Fri, Aug 7, 2020 at 6:18 AM Richard Sandiford  wrote:
>
> https://gcc.gnu.org/g:5380912a17ea09a8996720fb62b1a70c16c8f9f2
>
> commit r9-8794-g5380912a17ea09a8996720fb62b1a70c16c8f9f2
> Author: Richard Sandiford 
> Date:   Fri Aug 7 12:17:37 2020 +0100

could you please also apply this change to the gcc-8 branch?

Thanks,
Sebastian

>
> aarch64: Clear canary value after stack_protect_test [PR96191]
>
> The stack_protect_test patterns were leaving the canary value in the
> temporary register, meaning that it was often still in registers on
> return from the function.  An attacker might therefore have been
> able to use it to defeat stack-smash protection for a later function.
>
> gcc/
> PR target/96191
> * config/aarch64/aarch64.md (stack_protect_test_): Set the
> CC register directly, instead of a GPR.  Replace the original GPR
> destination with an extra scratch register.  Zero out operand 3
> after use.
> (stack_protect_test): Update accordingly.
>
> gcc/testsuite/
> PR target/96191
> * gcc.target/aarch64/stack-protector-1.c: New test.
> * gcc.target/aarch64/stack-protector-2.c: Likewise.
>
> (cherry picked from commit fe1a26429038d7cd17abc53f96a6f3e2639b605f)
>
> Diff:
> ---
>  gcc/config/aarch64/aarch64.md  | 34 -
>  .../gcc.target/aarch64/stack-protector-1.c | 89 
> ++
>  .../gcc.target/aarch64/stack-protector-2.c |  6 ++
>  3 files changed, 110 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index ed8cf8ecea1..9598bac387f 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -6985,10 +6985,8 @@
> (match_operand 2)]
>""
>  {
> -  rtx result;
>machine_mode mode = GET_MODE (operands[0]);
>
> -  result = gen_reg_rtx(mode);
>if (aarch64_stack_protector_guard != SSP_GLOBAL)
>{
>  /* Generate access through the system register. The
> @@ -7013,29 +7011,27 @@
>  operands[1] = gen_rtx_MEM (mode, tmp_reg);
>}
>emit_insn ((mode == DImode
> - ? gen_stack_protect_test_di
> - : gen_stack_protect_test_si) (result,
> -   operands[0],
> -   operands[1]));
> -
> -  if (mode == DImode)
> -emit_jump_insn (gen_cbranchdi4 (gen_rtx_EQ (VOIDmode, result, 
> const0_rtx),
> -   result, const0_rtx, operands[2]));
> -  else
> -emit_jump_insn (gen_cbranchsi4 (gen_rtx_EQ (VOIDmode, result, 
> const0_rtx),
> -   result, const0_rtx, operands[2]));
> +? gen_stack_protect_test_di
> +: gen_stack_protect_test_si) (operands[0], operands[1]));
> +
> +  rtx cc_reg = gen_rtx_REG (CCmode, CC_REGNUM);
> +  emit_jump_insn (gen_condjump (gen_rtx_EQ (VOIDmode, cc_reg, const0_rtx),
> +   cc_reg, operands[2]));
>DONE;
>  })
>
> +;; DO NOT SPLIT THIS PATTERN.  It is important for security reasons that the
> +;; canary value does not live beyond the end of this sequence.
>  (define_insn "stack_protect_test_"
> -  [(set (match_operand:PTR 0 "register_operand" "=r")
> -   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")
> -(match_operand:PTR 2 "memory_operand" "m")]
> -UNSPEC_SP_TEST))
> +  [(set (reg:CC CC_REGNUM)
> +   (unspec:CC [(match_operand:PTR 0 "memory_operand" "m")
> +   (match_operand:PTR 1 "memory_operand" "m")]
> +  UNSPEC_SP_TEST))
> +   (clobber (match_scratch:PTR 2 "=&r"))
> (clobber (match_scratch:PTR 3 "=&r"))]
>""
> -  "ldr\t%3, %1\;ldr\t%0, %2\;eor\t%0, %3, %0"
> -  [(set_attr "length" "12")
> +  "ldr\t%2, %0\;ldr\t%3, %1\;subs\t%2, %2, %3\;mov\t%3, 0"
> +  [(set_attr "length" "16")
> (set_attr "type" "multiple")])
>
>  ;; Write Floating-point Control Register.
> diff --git a/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c 
> b/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
> new file mode 100644
> index 000..73e83bc413f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
> @@ -0,0 +1,89 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target fstack_protector } */
> +/* { dg-options "-fstack-protector-all -O2" } */
> +
> +extern volatile long *stack_chk_guard_ptr;
> +
> +volatile long *
> +get_ptr (void)
> +{
> +  return stack_chk_guard_ptr;
> +}
> +
> +void __attribute__ ((noipa))
> +f (void)
> +{
> +  volatile int x;
> +  x = 1;
> +  x += 1;
> +}
> +
> +#define CHECK(REG) "\tcmp\tx0, " #REG "\n\tbeq\t1f\n"
> +
> +asm (
> +"  .pushsection .data\n"
> +"  .align  3\n"
> +"  .globl  stack_chk_guard_ptr\n"
> +"stack_chk_guard_ptr:\n"
> +#if __ILP32__
> +"  .word   __stack_chk_guard\n"
> +#else
> +"  

Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.

2020-11-12 Thread Uros Bizjak via Gcc-patches
On Thu, Nov 12, 2020 at 2:59 PM Richard Biener
 wrote:
> > > > > > > > gcc/ChangeLog:
> > > > > > > >
> > > > > > > > PR target/97194
> > > > > > > > * config/i386/i386-expand.c (ix86_expand_vector_set_var): New 
> > > > > > > > function.
> > > > > > > > * config/i386/i386-protos.h (ix86_expand_vector_set_var): New 
> > > > > > > > Decl.
> > > > > > > > * config/i386/predicates.md (vec_setm_operand): New predicate,
> > > > > > > > true for const_int_operand or register_operand under 
> > > > > > > > TARGET_AVX2.
> > > > > > > > * config/i386/sse.md (vec_set): Support both constant
> > > > > > > > and variable index vec_set.
> > > > > > > >
> > > > > > > > gcc/testsuite/ChangeLog:
> > > > > > > >
> > > > > > > > * gcc.target/i386/avx2-vec-set-1.c: New test.
> > > > > > > > * gcc.target/i386/avx2-vec-set-2.c: New test.
> > > > > > > > * gcc.target/i386/avx512bw-vec-set-1.c: New test.
> > > > > > > > * gcc.target/i386/avx512bw-vec-set-2.c: New test.
> > > > > > > > * gcc.target/i386/avx512f-vec-set-2.c: New test.
> > > > > > > > * gcc.target/i386/avx512vl-vec-set-2.c: New test.
> > > > > > >
> > > > > > > +;; True for registers, or const_int_operand, used to vec_setm 
> > > > > > > expander.
> > > > > > > +(define_predicate "vec_setm_operand"
> > > > > > > +  (ior (and (match_operand 0 "register_operand")
> > > > > > > +(match_test "TARGET_AVX2"))
> > > > > > > +   (match_code "const_int")))
> > > > > > > +
> > > > > > >  ;; True for registers, or 1 or -1.  Used to optimize double-word 
> > > > > > > shifts.
> > > > > > >  (define_predicate "reg_or_pm1_operand"
> > > > > > >(ior (match_operand 0 "register_operand")
> > > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > > > > > index b153a87fb98..1798e5dea75 100644
> > > > > > > --- a/gcc/config/i386/sse.md
> > > > > > > +++ b/gcc/config/i386/sse.md
> > > > > > > @@ -8098,11 +8098,14 @@ (define_insn "vec_setv2df_0"
> > > > > > >  (define_expand "vec_set"
> > > > > > >[(match_operand:V 0 "register_operand")
> > > > > > > (match_operand: 1 "register_operand")
> > > > > > > -   (match_operand 2 "const_int_operand")]
> > > > > > > +   (match_operand 2 "vec_setm_operand")]
> > > > > > >
> > > > > > > You need to specify a mode, otherwise a register of any mode can 
> > > > > > > pass here.
> > > > > > >
> > > > > > Yes, theoretically, we only accept integer types. But in 
> > > > > > can_vec_set_var_idx_p
> > > > > > cut
> > > > > > ---
> > > > > > bool
> > > > > > can_vec_set_var_idx_p (machine_mode vec_mode)
> > > > > > {
> > > > > >   if (!VECTOR_MODE_P (vec_mode))
> > > > > > return false;
> > > > > >
> > > > > >   machine_mode inner_mode = GET_MODE_INNER (vec_mode);
> > > > > >   rtx reg1 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 1);
> > > > > >   rtx reg2 = alloca_raw_REG (inner_mode, LAST_VIRTUAL_REGISTER + 2);
> > > > > >   rtx reg3 = alloca_raw_REG (VOIDmode, LAST_VIRTUAL_REGISTER + 3);
> > > > > >
> > > > > >   enum insn_code icode = optab_handler (vec_set_optab, vec_mode);
> > > > > >
> > > > > >   return icode != CODE_FOR_nothing && insn_operand_matches (icode, 
> > > > > > 0, reg1)
> > > > > >  && insn_operand_matches (icode, 1, reg2)
> > > > > >  && insn_operand_matches (icode, 2, reg3);
> > > > > > }
> > > > > > ---
> > > > > >
> > > > > > reg3 is assumed to be VOIDmode, set anymode in match_operand 2 will
> > > > > > fail insn_operand_matches (icode, 2, reg3)
> > > > > > ---
> > > > > > (gdb) p insn_operand_matches(icode,2,reg3)
> > > > > > $5 = false
> > > > > > (gdb)
> > > > > > ---
> > > > > >
> > > > > > Maybe we need to change
> > > > > >
> > > > > > rtx reg3 = alloca_raw_REG (VOIDmode, LAST_VIRTUAL_REGISTER + 3);
> > > > > >
> > > > > > to
> > > > > >
> > > > > > rtx reg3 = alloca_raw_REG (SImode, LAST_VIRTUAL_REGISTER + 3);
> > > > > >
> > > > > > cc Richard Biener, any thoughts?
> > > > >
> > > > > There are two targets (gcn in gcn-valu.md and s390 in vector.md) that
> > > > > specify SImode for operand 2 in vec_setM pattern and allow register
> > > > > operands. I wonder if and how they manage to generate the pattern.
> > > > >
> > > > > Uros.
> > > >
> > > > Variable index vec_set is enabled by r11-3486, about two months ago in
> > > > [1]. But for the upper two targets, the codes are already there since
> > > > GCC10(maybe earlier, i just looked at gcc10 branch), I don't think
> > > > those codes are for [1].
> > > >
> > > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555905.html
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> > >
> > > Correct [1] 
> > > https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html
> > >
> > > --
> > > BR,
> > > Hongtao
> >
> > in https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554592.html
> >
> > It says
> >
> > > >> +can_vec_set_var_idx_p (enum tree_code code, machine_mode vec_mode,
> > > >> +  machine_mode value_mode, machine_mode idx_mode)
> > > >
> > > > toplevel co

Re: Improve handling of memory operands in ipa-icf 2/4

2020-11-12 Thread Jan Hubicka
Hi,
this is an updated patch.  It fixes the comparison of bitfields, where I now
check that their bitsizes and bitoffsets match (and OEP_ADDRESSOF is not
used for bitfield references).
I also noticed a problem with the dependence clique in ao_refs_may_alias that
I copied here.  Instead of base, rbase should be used.

Finally I ran statistics on when access paths mismatch and noticed
that I do not really need to check that component_refs and array_refs
are semantically equivalent since this is implied by earlier tests.
This is described in an inline comment and simplifies the code.

Bootstrapped/regtested x86_64-linux, OK?
Honza


* ipa-icf-gimple.c: Include tree-ssa-alias-compare.h.
(func_checker::func_checker): Initialize m_tbaa.
(func_checker::hash_operand): Use hash_ao_ref for memory accesses.
(func_checker::compare_operand): Use compare_ao_refs for memory
accesses.
(func_checker::compare_gimple_assign): Do not check LHS types
of memory stores.
* ipa-icf-gimple.h (func_checker): Derive from ao_compare;
add m_tbaa.
* ipa-icf.c: Include tree-ssa-alias-compare.h.
(sem_function::equals_private): Update call of
func_checker::func_checker.
* ipa-utils.h (lto_streaming_expected_p): New inline
predicate.
* tree-ssa-alias-compare.h: New file.
* tree-ssa-alias.c: Include tree-ssa-alias-compare.h
and builtins.h.
(view_converted_memref_p): New function.
(types_equal_for_same_type_for_tbaa_p): New function.
(ao_compare::compare_ao_refs): New member function.
(ao_compare::hash_ao_ref): New function

* c-c++-common/Wstringop-overflow-2.c: Disable ICF.
* g++.dg/warn/Warray-bounds-8.C: Disable ICF.

index f75951f7c49..26337dd7384 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "gimple-walk.h"
 
+#include "tree-ssa-alias-compare.h"
 #include "ipa-icf-gimple.h"
 
 namespace ipa_icf_gimple {
@@ -52,13 +53,13 @@ namespace ipa_icf_gimple {
of declarations that can be skipped.  */
 
 func_checker::func_checker (tree source_func_decl, tree target_func_decl,
-   bool ignore_labels,
+   bool ignore_labels, bool tbaa,
hash_set *ignored_source_nodes,
hash_set *ignored_target_nodes)
   : m_source_func_decl (source_func_decl), m_target_func_decl 
(target_func_decl),
 m_ignored_source_nodes (ignored_source_nodes),
 m_ignored_target_nodes (ignored_target_nodes),
-m_ignore_labels (ignore_labels)
+m_ignore_labels (ignore_labels), m_tbaa (tbaa)
 {
   function *source_func = DECL_STRUCT_FUNCTION (source_func_decl);
   function *target_func = DECL_STRUCT_FUNCTION (target_func_decl);
@@ -252,9 +253,16 @@ func_checker::hash_operand (const_tree arg, inchash::hash 
&hstate,
 
 void
 func_checker::hash_operand (const_tree arg, inchash::hash &hstate,
-   unsigned int flags, operand_access_type)
+   unsigned int flags, operand_access_type access)
 {
-  return hash_operand (arg, hstate, flags);
+  if (access == OP_MEMORY)
+{
+  ao_ref ref;
+  ao_ref_init (&ref, const_cast  (arg));
+  return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa, hstate);
+}
+  else
+return hash_operand (arg, hstate, flags);
 }
 
 bool
@@ -314,18 +322,40 @@ func_checker::compare_operand (tree t1, tree t2, 
operand_access_type access)
 return true;
   else if (!t1 || !t2)
 return false;
-  if (operand_equal_p (t1, t2, OEP_MATCH_SIDE_EFFECTS))
-return true;
-  switch (access)
+  if (access == OP_MEMORY)
 {
-case OP_MEMORY:
-  return return_false_with_msg
-("operand_equal_p failed (access == memory)");
-case OP_NORMAL:
+  ao_ref ref1, ref2;
+  ao_ref_init (&ref1, const_cast  (t1));
+  ao_ref_init (&ref2, const_cast  (t2));
+  int flags = compare_ao_refs (&ref1, &ref2,
+  lto_streaming_expected_p (), m_tbaa);
+
+  if (!flags)
+   return true;
+  if (flags & SEMANTICS)
+   return return_false_with_msg
+   ("compare_ao_refs failed (semantic difference)");
+  if (flags & BASE_ALIAS_SET)
+   return return_false_with_msg
+   ("compare_ao_refs failed (base alias set difference)");
+  if (flags & REF_ALIAS_SET)
+   return return_false_with_msg
+("compare_ao_refs failed (ref alias set difference)");
+  if (flags & ACCESS_PATH)
+   return return_false_with_msg
+("compare_ao_refs failed (access path difference)");
+  if (flags & DEPENDENCE_CLIQUE)
+   return return_false_with_msg
+("compare_ao_refs failed (dependence clique difference)");
+  gcc_unreachable ();
+}
+  else
+{
+  if (op

Re: [PATCH] libstdc++: Ensure __gthread_self doesn't call undefined weak symbol [PR 95989]

2020-11-12 Thread Jonathan Wakely via Gcc-patches

On 11/11/20 19:08 +0100, Jakub Jelinek via Libstdc++ wrote:

On Wed, Nov 11, 2020 at 05:24:42PM +, Jonathan Wakely wrote:

--- a/libgcc/gthr-posix.h
+++ b/libgcc/gthr-posix.h
@@ -684,7 +684,14 @@ __gthread_equal (__gthread_t __t1, __gthread_t __t2)
 static inline __gthread_t
 __gthread_self (void)
 {
+#if __GLIBC_PREREQ(2, 27)


What if it is a non-glibc system where __GLIBC_PREREQ macro isn't defined?
I think you'd get then
error: missing binary operator before token "("
So I think you want
#if defined __GLIBC__ && defined __GLIBC_PREREQ
#if __GLIBC_PREREQ(2, 27)
 return pthread_self ();
#else
 return __gthrw_(pthread_self) ();
#endif
#else
 return __gthrw_(pthread_self) ();
#endif
or similar.



Here's a fixed version of the patch.

I've moved the glibc-specific code in this_thread::get_id() into a new
macro defined in config/os/gnu-linux/os_defines.h (where we already
know we are dealing with glibc). That means we don't do the
__GLIBC_PREREQ check directly in <thread>, it's hidden away in a
target-specific header.

Tested powerpc64le-linux (glibc 2.17 and 2.32), sparc-solaris2.11 and
powerpc-aix.




commit 822914f1f1f4710ff252764ee634aa07ac565d53
Author: Jonathan Wakely 
Date:   Wed Nov 11 19:26:00 2020

libstdc++: Ensure __gthread_self doesn't call undefined weak symbol [PR 95989]

Since glibc 2.27 the pthread_self symbol has been defined in libc rather
than libpthread. Because we only call pthread_self through a weak alias
it's possible for statically linked executables to end up without a
definition of pthread_self. This crashes when trying to call an
undefined weak symbol.

We can use the __GLIBC_PREREQ version check to detect the version of
glibc where pthread_self is no longer in libpthread, and call it
directly rather than through the weak reference.

It would be better to check for pthread_self in libc during configure
instead of hardcoding the __GLIBC_PREREQ check. That would be somewhat
complicated by the fact that prior to glibc 2.27 only libc.so.6
contained the pthread_self symbol. The configure checks would need to
try to link both statically and dynamically, and the result would depend
on whether the static libc.a happens to be installed during configure
(which could vary between different systems using the same version of
glibc). Doing it properly is left for a future date, as it will be
needed anyway after glibc moves all pthread symbols from libpthread to
libc. When that happens we should revisit the whole approach of using
weak symbols for pthread symbols.

For the purposes of std::this_thread::get_id() we create a fake non-zero
thread ID ((__gthread_t)1) when using glibc but not linked to
libpthread. When using glibc 2.27 or later pthread_self() never returns
zero so we don't need to use (__gthread_t)1 for new glibc.

An undesirable consequence of this change is that code compiled prior to
the change might inline the old definition of this_thread::get_id()
which always returns (__gthread_t)1 in a program that isn't linked to
libpthread. Code compiled after the change will use pthread_self() and
so get a real TID. That could result in the main thread having different
thread::id values in different translation units. This seems acceptable,
as there are not expected to be many uses of thread::id in programs
that aren't linked to libpthread.

libgcc/ChangeLog:

PR libstdc++/95989
* gthr-posix.h (__gthread_self) [__GLIBC_PREREQ(2, 27)]: Call
pthread_self directly rather than using weak alias.

libstdc++-v3/ChangeLog:

PR libstdc++/95989
* config/os/gnu-linux/os_defines.h (_GLIBCXX_NATIVE_THREAD_ID):
Define new macro to get reliable thread ID.
* include/std/thread (this_thread::get_id): Use new macro if
it's defined.

diff --git a/libgcc/gthr-posix.h b/libgcc/gthr-posix.h
index 965247602acf..dc34645d1c52 100644
--- a/libgcc/gthr-posix.h
+++ b/libgcc/gthr-posix.h
@@ -684,7 +684,18 @@ __gthread_equal (__gthread_t __t1, __gthread_t __t2)
 static inline __gthread_t
 __gthread_self (void)
 {
+#if defined __GLIBC__ && defined __GLIBC_PREREQ
+# if __GLIBC_PREREQ(2, 27)
+  /* Since Glibc 2.27, pthread_self is defined in libc not libpthread.
+   * Call it directly so that we get a non-weak reference and won't call
+   * an undefined weak symbol when linked to the libc.a static lib.  */
+  return pthread_self ();
+# else
   return __gthrw_(pthread_self) ();
+# endif
+#else
+  return __gthrw_(pthread_self) ();
+#endif
 }
 
 static inline int
diff --git a/libstdc++-v3/config/os/gnu-linux/os_defines.h b/libstdc++-v3/config/os/gnu-linux/os_defines.h
index f821486ec8f5..ca61ecf60f62 100644
--- a/libstdc++-v3/config/os/gnu-linux/os_defines.h
+++ b/libstdc++-v3/config/os/gnu-linux/os_defines.h
@@ -49,4 +49,14 @@
 // version dynamically in case it has ch

Re: [PATCH] libstdc++: Add support for C++20 barriers

2020-11-12 Thread Jonathan Wakely via Gcc-patches

On 04/11/20 10:55 -0800, Thomas Rodgers wrote:

--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -603,13 +603,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }

#if __cplusplus > 201703L
+  template<typename _Func>
+   _GLIBCXX_ALWAYS_INLINE void
+   _M_wait(__int_type __old, const _Func& __fn) const noexcept
+   { std::__atomic_wait(&_M_i, __old, __fn); }
+
 _GLIBCXX_ALWAYS_INLINE void
 wait(__int_type __old,
  memory_order __m = memory_order_seq_cst) const noexcept
 {
-   std::__atomic_wait(&_M_i, __old,
-  [__m, this, __old]
-  { return this->load(__m) != __old; });
+   _M_wait(__old,
+   [__m, this, __old]
+   { return this->load(__m) != __old; });
 }


This looks like it's not meant to be part of this patch.

It also looks wrong for any patch, because it adds _M_wait as a public
member.

Not sure what this piece is for :-)



It is used at include/std/barrier:197 to keep the implementation as close as 
possible to the libc++ version upon which it is based.


So the caller in <barrier> can't use __atomic_wait directly because it
can't access the _M_i member of the atomic.

Would it be possible to use atomic_ref instead of atomic, so that the
barrier code has access to the underlying object and can use it
directly with __atomic_wait?




Re: [PATCH] cgraph: Avoid segfault when attempting to dump NULL clone_info

2020-11-12 Thread Jan Hubicka
> Hi,
> 
> cgraph_node::materialize_clone segfaulted when I tried compiling Tramp3D
> with -fdump-ipa-all because there was no clone_info - IPA-CP created a
> clone only for an aggregate constant, adding a note to its
> transformation summary but not creating any tree_map nor
> param_adjustments.
> 
> Fixed with the following obvious extra checks which I will commit after
> an obligatory round of bootstrap and testing.
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2020-11-12  Martin Jambor  
> 
>   * cgraphclones.c (cgraph_node::materialize_clone): Check that clone
>   info is not NULL before attempting to dump it.
OK, thanks!
Honza
> ---
>  gcc/cgraphclones.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
> index bc590819f78..712a54e8d0c 100644
> --- a/gcc/cgraphclones.c
> +++ b/gcc/cgraphclones.c
> @@ -1107,7 +1107,7 @@ cgraph_node::materialize_clone ()
>fprintf (symtab->dump_file, "cloning %s to %s\n",
>  clone_of->dump_name (),
>  dump_name ());
> -  if (info->tree_map)
> +  if (info && info->tree_map)
>  {
> fprintf (symtab->dump_file, "replace map:");
> for (unsigned int i = 0;
> @@ -1123,7 +1123,7 @@ cgraph_node::materialize_clone ()
>   }
> fprintf (symtab->dump_file, "\n");
>   }
> -  if (info->param_adjustments)
> +  if (info && info->param_adjustments)
>   info->param_adjustments->dump (symtab->dump_file);
>  }
>clear_stmts_in_references ();
> -- 
> 2.29.2
> 


Re: Move thunks out of cgraph_node

2020-11-12 Thread Jan Hubicka
> On Fri, 2020-10-23 at 21:45 +0200, Jan Hubicka wrote:
> > Hi,
> > this patch moves thunk_info out of cgraph_node into a symbol summary.
> > I also moved it to a separate header file since cgraph.h became really too
> > fat.  I plan to continue with similar breakup in order to clean up
> > interfaces and reduce WPA memory footprint (symbol table now consumes
> > more memory than trees)
> > 
> > Bootstrapped/regtested x86_64-linux, plan to commit it shortly.
> 
> This seems to have broken libgccjit (specifically, code that makes
> function calls).  Please can you test with --enable-languages=all,jit
> (as "jit" isn't part of "all", since it needs --enable-host-shared).

Sorry for that :(
I will try to keep that in mind and test JIT.

> 
> [...snip...]
> 
> > +/* Return thunk_info possibly creating new one.  */
> > +thunk_info *
> > +thunk_info::get_create (cgraph_node *node)
> > +{
> > +  if (!symtab->m_thunks)
> > +{
> > +  symtab->m_thunks
> > += new (ggc_alloc_no_dtor  ())
> > +thunk_infos_t (symtab, true);
> > +  symtab->m_thunks->disable_insertion_hook ();
> > +}
> > +  return symtab->m_thunks->get_create (node);
> > +}
> 
> symtab->m_thunks is allocated via ggc_alloc_no_dtor here, thus
> allocating it within the GC heap...
> 
> [...snip...]
> 
> > +/* Free thunk info summaries.  */
> > +inline void
> > +thunk_info::release ()
> > +{
> > +  if (symtab->m_thunks)
> > +delete (symtab->m_thunks);
> > +  symtab->m_thunks = NULL;
> > +}
> 
> ...but deallocated using plain "delete", attempting to release the GC-
> allocated memory into the system heap, leading to an ICE.  This seems
> to happen for any compilation of function calls in which
> toplev::finalize is called, i.e. any compilation of function calls from
> libgccjit.
> 
> Does it need to be in the GC heap (maybe for PCH?), or can it be simply
> allocated in the system heap via regular "new"?

It points to trees (alias decl) and thus it is in PCH, so PCH walks
through it.  In fact I have a cleanup patch for this (which gets rid of
the alias decl that is, in fact, another PCH workaround hack).

> 
> I hope to get some continuous integration of libgccjit going at some
> point (but am focusing on finishing my stage 1 stuff right now, and am
> hacking round this for now).
> 
> I wonder if it would be useful for debug builds to
> call toplev::finalize, so that cc1/cc1plus exercise these cleanup code
> paths and thus catch this kind of breakage much earlier.

I think we could do that for checking compilers.
It would also make it easier to check for memory leaks...

Honza
> 
> Dave
> 


[PATCH] cgraph: Avoid segfault when attempting to dump NULL clone_info

2020-11-12 Thread Martin Jambor
Hi,

cgraph_node::materialize_clone segfaulted when I tried compiling Tramp3D
with -fdump-ipa-all because there was no clone_info - IPA-CP created a
clone only for an aggregate constant, adding a note to its
transformation summary but not creating any tree_map nor
param_adjustments.

Fixed with the following obvious extra checks which I will commit after
an obligatory round of bootstrap and testing.
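
The guard used in the fix can be illustrated outside GCC with a minimal
C++ sketch.  The names clone_info_sketch and dump_clone are hypothetical
stand-ins, not GCC's real clone_info or dump code; the only point is that
the optional summary pointer is tested before any member is read through
it.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-clone summary that may be absent.
struct clone_info_sketch
{
  std::vector<std::string> *tree_map = nullptr;
  std::string *param_adjustments = nullptr;
};

// Dump whatever is available.  INFO may be null when no summary exists.
std::string
dump_clone (const clone_info_sketch *info)
{
  std::ostringstream out;
  out << "cloning\n";
  // Test the pointer itself before reading a member through it.
  if (info && info->tree_map)
    {
      out << "replace map:";
      for (const std::string &entry : *info->tree_map)
        out << ' ' << entry;
      out << '\n';
    }
  if (info && info->param_adjustments)
    out << *info->param_adjustments << '\n';
  return out.str ();
}
```

Calling dump_clone with a null pointer then prints only the header line
instead of dereferencing a missing summary.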

Thanks,

Martin


gcc/ChangeLog:

2020-11-12  Martin Jambor  

* cgraphclones.c (cgraph_node::materialize_clone): Check that clone
info is not NULL before attempting to dump it.
---
 gcc/cgraphclones.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index bc590819f78..712a54e8d0c 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -1107,7 +1107,7 @@ cgraph_node::materialize_clone ()
   fprintf (symtab->dump_file, "cloning %s to %s\n",
   clone_of->dump_name (),
   dump_name ());
-  if (info->tree_map)
+  if (info && info->tree_map)
 {
  fprintf (symtab->dump_file, "replace map:");
  for (unsigned int i = 0;
@@ -1123,7 +1123,7 @@ cgraph_node::materialize_clone ()
}
  fprintf (symtab->dump_file, "\n");
}
-  if (info->param_adjustments)
+  if (info && info->param_adjustments)
info->param_adjustments->dump (symtab->dump_file);
 }
   clear_stmts_in_references ();
-- 
2.29.2



Re: [PATCH] libstdc++: Enable without gthreads

2020-11-12 Thread Jonathan Wakely via Gcc-patches

On 11/11/20 17:31 +, Jonathan Wakely wrote:

On 11/11/20 16:13 +, Jonathan Wakely wrote:

This makes it possible to use std::thread in single-threaded builds.
All member functions are available, but attempting to create a new
thread will throw an exception.

The main benefit for most targets is that other headers such as 
do not need to include the whole of  just to be able to create a
std::thread. That avoids including  and std::jthread where
not required.


I forgot to mention that this patch also reduces the size of the
 header, by only including  instead of the
whole of . That could be done separately from the rest of the
changes here.

It would be possible to split std::thread and this_thread::get_id()
into a new header without also making them work without gthreads.

That would still reduce the size of the  header, because it
wouldn't need the whole of . But it wouldn't get rid of
preprocessor checks for _GLIBCXX_HAS_GTHREADS in .

Allowing std::this_thread::get_id() and std::this_thread::yield() to
work without threads seems worth doing (we already make
std::this_thread::sleep_until and std::this_thread::sleep_for work
without threads).


Here's a slightly more conservative version of the patch. This moves
std::thread and this_thread::get_id() and this_thread::yield() to a
new header, and makes *most* of std::thread defined without gthreads
(because we need the nested thread::id type to be returned from
this_thread::get_id()). But it doesn't declare the std::thread
constructor that creates new threads.

That means std::thread is present, but you can't even try to create
new threads. This means we don't need to export the std::thread
symbols from libstdc++.so for a target where they are unusable and
just throw an exception.
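
As a rough illustration of what remains usable when no new thread can be
created, the sketch below touches only the parts of the std::thread API
that need no second thread.  The function name single_threaded_api_works
is made up for this example, and the code has not been checked against a
--disable-threads build; it only uses standard library calls.

```cpp
#include <cassert>
#include <thread>

// Exercise the thread API without ever starting a thread.
bool
single_threaded_api_works ()
{
  std::thread t;                 // default-constructed: represents no thread
  std::thread::id none;          // the id value meaning "no thread"
  std::thread::id self = std::this_thread::get_id ();
  std::this_thread::yield ();    // allowed to be a no-op
  // The standard allows 0 here when the value is not computable.
  unsigned hc = std::thread::hardware_concurrency ();
  (void) hc;
  return !t.joinable ()
         && t.get_id () == none
         && self == std::this_thread::get_id ();
}
```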

This still has the main benefits of making  include a lot less
code, and removing some #if conditions in .

One other change from the previous patch worth mentioning is that I've
made  include  so that
std::reference_wrapper (and std::ref and std::cref) are defined by
. That isn't required, but it is a tiny header and being able
to use std::ref to pass lvalues to new threads without including
all of  seems like a kindness to users.

Both this and the previous patch require some GDB changes, because GDB
currently assumes that if std::thread is declared in  that it
is usable and multiple threads are supported. That's no longer true,
because we would declare a useless std::thread after this patch. Tom
Tromey has patches to make GDB handle this though.

Tested powerpc64le-linux, --enable-threads and --disable-threads.

Thoughts?


commit 68a99d44890957d6c5b128116a6af6bb4bcfaad3
Author: Jonathan Wakely 
Date:   Thu Nov 12 15:26:02 2020

libstdc++: Move std::thread to a new header

This makes it possible to use std::thread without including the whole of
. It also makes this_thread::get_id() and this_thread::yield()
available even when there is no gthreads support (e.g. when GCC is built
with --disable-threads or --enable-threads=single).

In order for the std::thread::id return type of this_thread::get_id() to
be defined, std::thread itself is defined unconditionally. However the
constructor that creates new threads is not defined for single-threaded
builds. The thread::join() and thread::detach() member functions are
defined inline for single-threaded builds and just throw an exception
(because we know the thread cannot be joinable if the constructor that
creates joinable threads doesn't exist).

The thread::hardware_concurrency() member function is also defined
inline and returns 0 (as suggested by the standard when the value "is
not computable or well-defined").

The main benefit for most targets is that other headers such as 
do not need to include the whole of  just to be able to create a
std::thread. That avoids including  and std::jthread where
not required.

This also means we can use this_thread::get_id() and this_thread::yield()
in  instead of using the gthread functions directly. This
removes some preprocessor conditionals, simplifying the code.

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add new  header.
* include/Makefile.in: Regenerate.
* include/std/future: Include new header instead of .
* include/std/stop_token: Include new header instead of
.
(stop_token::_S_yield()): Use this_thread::yield().
(_Stop_state_t::_M_requester): Change type to std::thread::id.
(_Stop_state_t::_M_request_stop()): Use this_thread::get_id().
(_Stop_state_t::_M_remove_callback(_Stop_cb*)): Likewise.
Use __is_single_threaded() to decide whether to synchronize.
* include/std/thread (thread, operator==, this_thread::get_id)
(this_thread::yield): Move to new header.
(operator<=>, operator!=, operator<, operator<=, 

Re: Move thunks out of cgraph_node

2020-11-12 Thread David Malcolm via Gcc-patches
On Fri, 2020-10-23 at 21:45 +0200, Jan Hubicka wrote:
> Hi,
> this patch moves thunk_info out of cgraph_node into a symbol summary.
> I also moved it to separate header file since cgraph.h became really
> too
> fat.  I plan to continue with similar breakup in order to cleanup
> interfaces
> and reduce WPA memory footprint (symbol table now consumes more
> memory than
> trees)
> 
> Bootstrapped/regtested x86_64-linux, plan to commit it shortly.

This seems to have broken libgccjit (specifically, code that makes
function calls).  Please can you test with --enable-languages=all,jit
(as "jit" isn't part of "all", since it needs --enable-host-shared).

[...snip...]

> +/* Return thunk_info possibly creating new one.  */
> +thunk_info *
> +thunk_info::get_create (cgraph_node *node)
> +{
> +  if (!symtab->m_thunks)
> +{
> +  symtab->m_thunks
> +  = new (ggc_alloc_no_dtor  ())
> +  thunk_infos_t (symtab, true);
> +  symtab->m_thunks->disable_insertion_hook ();
> +}
> +  return symtab->m_thunks->get_create (node);
> +}

symtab->m_thunks is allocated via ggc_alloc_no_dtor here, thus
allocating it within the GC heap...

[...snip...]

> +/* Free thunk info summaries.  */
> +inline void
> +thunk_info::release ()
> +{
> +  if (symtab->m_thunks)
> +delete (symtab->m_thunks);
> +  symtab->m_thunks = NULL;
> +}

...but deallocated using plain "delete", attempting to release the GC-
allocated memory into the system heap, leading to an ICE.  This seems
to happen for any compilation of function calls in which
toplev::finalize is called, i.e. any compilation of function calls from
libgccjit.

Does it need to be in the GC heap (maybe for PCH?), or can it be simply
allocated in the system heap via regular "new"?
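
The mismatch described above can be sketched in standalone C++.  This is
not GCC's ggc API: arena_alloc, arena_free and summary_sketch are
hypothetical names, and the "arena" is just malloc with a counter.  The
sketch only shows the general rule that an object built with placement
new in memory from one allocator has to be destroyed by hand and handed
back to that same allocator, rather than passed to plain delete.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical allocator standing in for a separate heap; it counts blocks.
static int live_blocks = 0;

void *
arena_alloc (std::size_t size)
{
  void *p = std::malloc (size);
  if (!p)
    throw std::bad_alloc ();
  ++live_blocks;
  return p;
}

void
arena_free (void *p)
{
  --live_blocks;
  std::free (p);
}

struct summary_sketch
{
  int entries = 0;
};

static summary_sketch *the_summary = nullptr;

// Construct in arena memory with placement new, creating it on first use.
summary_sketch *
get_create ()
{
  if (!the_summary)
    the_summary = new (arena_alloc (sizeof (summary_sketch))) summary_sketch ();
  return the_summary;
}

// Release through the same allocator: run the destructor, then free.
// Plain "delete the_summary" would hand the block to the wrong heap.
void
release ()
{
  if (the_summary)
    {
      the_summary->~summary_sketch ();
      arena_free (the_summary);
    }
  the_summary = nullptr;
}
```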

I hope to get some continuous integration of libgccjit going at some
point (but am focusing on finishing my stage 1 stuff right now, and am
hacking round this for now).

I wonder if it would be useful for debug builds to
call toplev::finalize, so that cc1/cc1plus exercise these cleanup code
paths and thus catch this kind of breakage much earlier.

Dave



Re: [PATCH][RFC] Make mingw-w64 printf/scanf attribute alias to ms_printf/ms_scanf only for C89

2020-11-12 Thread Liu Hao via Gcc-patches
On 2020/11/12 23:12, Liu Hao wrote:
> 
> My humble opinion is that people should have gotten used to the `ll` 
> specifier so I propose a
> different patch that enables it unconditionally. As Jonathan Yong pointed 
> out, GCC is impossible to

The previous patch missed a `double_name` field. A revised version has been 
attached.



-- 
Best regards,
LH_Mouse
From 1d61adae0695e7067e35f36e607a754a7cf12796 Mon Sep 17 00:00:00 2001
From: Liu Hao 
Date: Thu, 12 Nov 2020 22:20:29 +0800
Subject: [PATCH] gcc: Add `ll` and `L` length modifiers for `ms_printf`

Previous code abused `FMT_LEN_L` for the `I` modifier. As `L` is a valid
modifier for `f`, `e`, `g`, etc. and `I` has the same semantics as the
C99 `z` modifier, `FMT_LEN_z` is now used.

First, in the Microsoft ABI, type `long double` has the same layout as
type `double`, so `%Lg` behaves identically to `%g`. Users should pass
in `double`s instead of `long double`s, as GCC uses the 10-byte format.

Second, with a CRT that is recent enough (MSVCRT since Vista, MSVCR80,
UCRT, or mingw-w64 8.0), `printf`-family functions can handle the `ll`
length modifier correctly. This ability is assumed to be available
universally. A lot of libraries (such as libgomp) that use the
`format(printf, ...)` attribute used to suffer from warnings about
unknown format specifiers.
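
For readers unfamiliar with the attribute, here is a minimal sketch of a
function that opts in to format checking and is then called with the `ll`
length modifier.  The names PRINTF_LIKE and format_message are invented
for this example and do not appear in libgomp or in the patch.  How the
`printf` archetype is interpreted on mingw-w64 targets is exactly what
this thread is discussing, so the sketch makes no claim about which
warnings a given toolchain emits.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

#if defined (__GNUC__)
# define PRINTF_LIKE(fmt, first) __attribute__ ((format (printf, fmt, first)))
#else
# define PRINTF_LIKE(fmt, first)
#endif

// Hypothetical helper; the attribute asks the compiler to check the
// format string against the variadic arguments.
std::string format_message (const char *fmt, ...) PRINTF_LIKE (1, 2);

std::string
format_message (const char *fmt, ...)
{
  char buf[128];
  va_list ap;
  va_start (ap, fmt);
  int n = std::vsnprintf (buf, sizeof buf, fmt, ap);
  va_end (ap);
  if (n < 0)
    return std::string ();
  // Output longer than the buffer is truncated but still terminated.
  return std::string (buf);
}
```

A call such as format_message ("%lld", some_long_long) is the kind of use
that draws an unknown-specifier diagnostic when the checker does not know
`ll`.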

Reference: 
https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2008/tcxf1dw6(v=vs.90)
Reference: 
https://docs.microsoft.com/en-us/cpp/porting/visual-cpp-what-s-new-2003-through-2015#new-crt-features
Signed-off-by: Liu Hao 

gcc/:
* config/i386/msformat-c.c: Add more length modifiers
---
 gcc/config/i386/msformat-c.c | 45 ++--
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/gcc/config/i386/msformat-c.c b/gcc/config/i386/msformat-c.c
index 4ceec633a6e..1629b866976 100644
--- a/gcc/config/i386/msformat-c.c
+++ b/gcc/config/i386/msformat-c.c
@@ -32,10 +32,11 @@ along with GCC; see the file COPYING3.  If not see
 static format_length_info ms_printf_length_specs[] =
 {
   { "h", FMT_LEN_h, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
-  { "l", FMT_LEN_l, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
+  { "l", FMT_LEN_l, STD_C89, "ll", FMT_LEN_ll, STD_C89, 0 },
+  { "L", FMT_LEN_L, STD_C89, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I32", FMT_LEN_l, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I64", FMT_LEN_ll, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
-  { "I", FMT_LEN_L, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
+  { "I", FMT_LEN_z, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { NULL, FMT_LEN_none, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 }
 };
 
@@ -90,33 +91,33 @@ static const format_flag_pair ms_strftime_flag_pairs[] =
 static const format_char_info ms_print_char_table[] =
 {
   /* C89 conversion specifiers.  */
-  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  T99_SST, 
 BADLEN, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "-wp0 +'",  "i",  NULL 
},
-  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, 
BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0#", "i",  NULL },
-  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, 
BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0'","i",  NULL },
-  { "fgG", 0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN }, "-wp0 +#'", "",   NULL },
-  { "eE",  0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN }, "-wp0 +#",  "",   NULL },
-  { "c",   0, STD_C89, { T89_I,   BADLEN,  T89_S,  T94_WI,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w","",   NULL 
},
-  { "s",   1, STD_C89, { T89_C,   BADLEN,  T89_S,  T94_W,   BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",   "cR", NULL 
},
-  { "p",   1, STD_C89, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w","c",  NULL 
},
-  { "n",   1, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  BADLEN,  
BADLEN, BADLEN,  T99_IM,  BADLEN,  BADLEN,  BADLEN }, "",  "W",  NULL },
+  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  BADLEN, 
T99_SST, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0 +'",  "i",  NULL },
+  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, BADLEN, 
T99_ST,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0#","i",  NULL },
+  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, BADLEN, 
T99_ST,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0'","i",  NULL },
+  { "fgG", 0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T89_D,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0 +#'", "",   NULL },
+  { "eE",  0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   

Re: [24/32] module mapper

2020-11-12 Thread Nathan Sidwell

On 11/3/20 4:17 PM, Nathan Sidwell wrote:
this is the module mapper client and server pieces.  It features a 
default resolver that can read a text file, or generate default mappings 
from module name to cmi name.
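
To make the idea of a default resolver concrete, here is a small
self-contained sketch.  It is not the code in this patch: the class name
resolver_sketch, the "module-name cmi-file" line format, and the fallback
naming rule (replace ':' with '-' and append ".gcm") are all assumptions
chosen for illustration.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical resolver: explicit table entries win, anything else
// falls back to a name derived from the module name.
class resolver_sketch
{
  std::map<std::string, std::string> table;

public:
  // Read whitespace-separated "name file" pairs, one per line.
  // Lines with fewer than two fields are ignored.
  void
  read (std::istream &in)
  {
    std::string line;
    while (std::getline (in, line))
      {
        std::istringstream fields (line);
        std::string name, file;
        if (fields >> name >> file)
          table[name] = file;
      }
  }

  std::string
  cmi_name (const std::string &module) const
  {
    auto it = table.find (module);
    if (it != table.end ())
      return it->second;
    // Assumed default: map the partition separator and add a suffix.
    std::string file = module;
    for (char &c : file)
      if (c == ':')
        c = '-';
    return file + ".gcm";
  }
};
```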


Richard rightly suggested on IRC that the sample server for the module 
mapper shouldn't be in the gcc/cp dir.  It happened to be that way 
because it started out much more closely coupled, but then it grew legs.


So this patch creates a new c++tools toplevel directory and places the 
mapper-server and its default resolver there.  That means more changes 
to the toplevel Makefile.def and Makefile.tpl (I've not included the 
regenerated Makefile.in, nor other generated files in gcc/ and c++tools 
in this diff.)


We still need to build the default resolver when building cc1plus, and 
I've placed mapper-resolver.cc there, as a simple #include forwarder to 
the source in c++tools.  I also replace 'gcc/cp/mapper.h' with a 
client-specific 'gcc/cp/mapper-client.h'.  (mapper-client is only linked 
into cc1plus, so gcc/cp seems the right place for it.)


The sample server relies on gcc/version.o to pick up its version number, 
and I place it in the libexecsubdir where we place cc1plus.  I wasn't 
comfortable placing it in the install location of g++ itself.  I call it 
a sample server for a reason :)


I will of course provide changelog when committing.

nathan

--
Nathan Sidwell
diff --git c/Makefile.def w/Makefile.def
index 36fd26b0367..6e98d2d3340 100644
--- c/Makefile.def
+++ w/Makefile.def
@@ -125,12 +134,13 @@ host_modules= { module= libtermcap; no_check=true;
 missing=distclean;
 missing=maintainer-clean; };
 host_modules= { module= utils; no_check=true; };
+host_modules= { module= c++tools; };
 host_modules= { module= gnattools; };
+host_modules= { module= gotools; };
 host_modules= { module= lto-plugin; bootstrap=true;
 		extra_configure_flags='--enable-shared @extra_linker_plugin_flags@ @extra_linker_plugin_configure_flags@';
 		extra_make_flags='@extra_linker_plugin_flags@'; };
 host_modules= { module= libcc1; extra_configure_flags=--enable-shared; };
-host_modules= { module= gotools; };
 host_modules= { module= libctf; no_install=true; no_check=true;
 		bootstrap=true; };
 
@@ -381,6 +392,8 @@ dependencies = { module=all-lto-plugin; on=all-libiberty-linker-plugin; };
 dependencies = { module=configure-libcc1; on=configure-gcc; };
 dependencies = { module=all-libcc1; on=all-gcc; };
 
+// we want version.o from gcc, and implicitly depend on libcody
+dependencies = { module=all-c++tools; on=all-gcc; };
 dependencies = { module=all-gotools; on=all-target-libgo; };
 
 dependencies = { module=all-utils; on=all-libiberty; };
diff --git c/Makefile.tpl w/Makefile.tpl
index efed1511750..3b88f351d5b 100644
--- c/Makefile.tpl
+++ w/Makefile.tpl
@@ -864,8 +864,8 @@ local-distclean:
 	-rm -f texinfo/doc/Makefile texinfo/po/POTFILES
 	-rmdir texinfo/doc texinfo/info texinfo/intl texinfo/lib 2>/dev/null
 	-rmdir texinfo/makeinfo texinfo/po texinfo/util 2>/dev/null
-	-rmdir fastjar gcc gnattools gotools libcc1 libiberty 2>/dev/null
-	-rmdir texinfo zlib 2>/dev/null
+	-rmdir c++tools fastjar gcc gnattools gotools 2>/dev/null
+	-rmdir libcc1 libiberty texinfo zlib 2>/dev/null
 	-find . -name config.cache -exec rm -f {} \; \; 2>/dev/null
 
 local-maintainer-clean:
diff --git c/c++tools/configure.ac w/c++tools/configure.ac
new file mode 100644
index 000..8d882e541df
--- /dev/null
+++ w/c++tools/configure.ac
@@ -0,0 +1,210 @@
+# Configure script for c++tools
+#   Copyright (C) 2020 Free Software Foundation, Inc.
+#   Written by Nathan Sidwell  while at FaceBook
+#
+# This file is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; see the file COPYING3.  If not see
+# .
+
+# C++ has grown a C++20 mapper server.  This may be used to provide
+# and/or learn and/or build required modules.  This sample server
+# shows how the protocol introduced by wg21.link/p1184 may be used.
+# By default g++ uses an in-process mapper.
+
+sinclude(../config/acx.m4)
+
+AC_INIT(c++tools)
+
+AC_CONFIG_SRCDIR([server.cc])
+
+# Determine the noncanonical names used for directories.
+ACX_NONCANONICAL_HOST
+
+AC_CANONICAL_SYSTEM
+AC_PROG_INSTALL
+
+AC_PROG_CXX
+MISSING=`cd $ac_aux_dir && ${PWDCMD-pwd}`/missing
+AC_CHECK_PROGS([AUTOCONF], [autoconf], [$MISSING autoconf])
+AC_CHECK_PROGS([AUTOHEADER], [autoheader], [$MISSING autoheader])
+
+dnl Enab

Re: [RFC][PR target PR90000] (rs6000) Compile time hog w/impossible asm constraint lra loop

2020-11-12 Thread Jeff Law via Gcc-patches


On 4/23/20 9:48 AM, will schmidt wrote:
> On Wed, 2020-04-22 at 12:26 -0600, Jeff Law wrote:
>> On Fri, 2020-04-10 at 16:40 -0500, will schmidt via Gcc-patches
>> wrote:
>>> [RFC][PR target/90000] Compile time hog w/impossible asm constraint
>>> lra loop
>>> 
>>> Hi,
>>>   RFC for a bandaid/patch to partially address target PR/90000.
>>>
>>> This adds an escape condition from the forever loop where 
>>> LRA gets stuck while attempting to handle constraints from an 
>>> instruction that has previously suffered an impossible constraint
>>> error.
>>>
>>> This is somewhat inspired by MAX_RELOAD_INSNS_NUMBER as
>>> seen in lra-constraints.c lra_constraints().   This utilizes the
>>> existing counter variable lra_constraint_iter.
>>>
>>> More needs to be done here, as this does replace a spin-forever
>>> situation with an ICE.
>>>
>>> Thanks
>>> -Will
>>>
>>>
>>> gcc/
>>> 2020-04-10  Will Schmidt  
>>>
>>> * lra.c: Add include of rtl-error.h.
>>> (MAX_LRA_CONSTRAINT_PASSES): New define.
>>> (lra): Add check of lra_constraint_iter value.
>> Doesn't this argue that there's some other datastructure that needs
>> to be updated
>> when we removed the impossible asm?
> Yes, i think so.   I'm just not sure exactly what or where.
> The submitted patch is minimally allowing for manageable-in-size reload
> dumps for my continued debug.  :-)
>
> There is an old patch that addressed what looks like a similar issue,
> but i wasn't able to directly apply that to this situation without
> failing in other places. 
>
>> commit e86c0101ae59b32c3f10edcca78398cbf8848eaa
>> Author: Steven Bosscher 
>> Date:   Thu Jan 24 10:30:26 2013 +
>>re PR inline-asm/55934 (LRA inline asm error recovery)
> Which does a bit more, but at its core is this:
>
> + PATTERN (insn) = gen_rtx_USE (VOIDmode, const0_rtx);
> + lra_set_insn_deleted (insn);
>
>
> I suspect this particular scenario with the testcase is a dependency across
> several 'insns', so marking just one as deleted is not enough.
> (but i'm not sure,..
>
> void foo (void)
> {
>   register float __attribute__ ((mode(SD))) r31 __asm__ ("r31");
>   register float __attribute__ ((mode(SD))) fr1 __asm__ ("fr1");
>
>   __asm__ ("#" : "=d" (fr1));
>   r31 = fr1;
>   __asm__ ("#" : : "r" (r31));
> }

Looking at this again after many months away, I wonder if the real problem
is the reloads we have to generate for copies to/from the fr1 local
variable, which is bound to hard reg fr1, rather than the asm statements
themselves.  It's not clear to me from the BZ and I don't have a PPC
cross handy to look directly.
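
The escape condition proposed in the quoted RFC amounts to capping the
number of passes of a loop that is expected to converge.  A generic C++
sketch of that shape follows.  The names max_passes_sketch and
run_until_stable are hypothetical, the value 30 is arbitrary, and nothing
here is taken from lra.c; as the RFC itself notes, such a cap turns a
hang into a hard failure and does not fix the underlying cause.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical cap, playing the role of MAX_LRA_CONSTRAINT_PASSES.
constexpr int max_passes_sketch = 30;

// Run STEP until it reports that no further pass is needed.
// STEP returns true when another pass is required.
// Returns the number of passes executed; throws once the cap is hit.
int
run_until_stable (const std::function<bool ()> &step)
{
  for (int pass = 1; pass <= max_passes_sketch; ++pass)
    if (!step ())
      return pass;
  throw std::runtime_error ("pass limit exceeded");
}
```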


jeff




Re: [PATCH] IBM Z: Fix output template for "*vfees"

2020-11-12 Thread Stefan Schulze Frielinghaus via Gcc-patches
As pointed out in
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558816.html
this instruction pattern will be removed anyway.  Thus we can ignore
this patch.

On Thu, Nov 12, 2020 at 01:25:35PM +0100, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested on IBM Z.  Ok for master?
> 
> gcc/ChangeLog:
> 
>   * config/s390/vx-builtins.md ("*vfees"): Fix output
> template.
> ---
>  gcc/config/s390/vx-builtins.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
> index 010db4d1115..0c2e7170223 100644
> --- a/gcc/config/s390/vx-builtins.md
> +++ b/gcc/config/s390/vx-builtins.md
> @@ -1395,7 +1395,7 @@
>  
>if (flags == VSTRING_FLAG_ZS)
>  return "vfeezs\t%v0,%v1,%v2";
> -  return "vfees\t%v0,%v1,%v2,%b3";
> +  return "vfees\t%v0,%v1,%v2";
>  }
>[(set_attr "op_type" "VRR")])
>  
> -- 
> 2.28.0
> 


Re: [RFC, Instruction Scheduler, Stage1] New hook/code to perform fusion of dependent instructions

2020-11-12 Thread Jeff Law via Gcc-patches


On 4/7/20 2:45 PM, Pat Haugen via Gcc-patches wrote:
> The Power processor has the ability to fuse certain pairs of dependent
> instructions to improve their performance if they appear back-to-back in
> the instruction stream. In looking at the current support for
> instruction fusion in GCC I saw the following 2 options.
>
> 1) TARGET_SCHED_MACRO_FUSION target hooks: Only looks at existing
> back-to-back instructions and will ensure the scheduler keeps them together.
>
> 2) -fsched-fusion/TARGET_SCHED_FUSION_PRIORITY: Runs as a separate
> scheduling pass before peephole2. Operates independently on a single
> insn. Used by ARM backend to assign higher priorities to base/disp loads
> and stores so that the scheduling pass will schedule loads/stores to
> adjacent memory back-to-back. Later these insns will be transformed into
> load/store pair insns.
>
> Neither of these work for Power's purpose because they don't deal with
> fusion of dependent insns that may not already be back-to-back. The
> TARGET_SCHED_REORDER[2] hooks also don't work since the dependent insn
> more than likely gets queued for N cycles so wouldn't be on the ready
> list for the reorder hooks to process. We want the ability for the
> scheduler to schedule dependent insn pairs back-to-back when possible
> (i.e. other dependencies of both insns have been satisfied).
>
> I have coded up a proof of concept that implements our needs via a new
> target hook. The hook is passed a pair of dependent insns and returns if
> they are a fusion candidate. It is called while removing the forward
> dependencies of the just scheduled insn. If a dependent insn becomes
> available to schedule and it's a fusion candidate with the just
> scheduled insn, then the new code moves it to the ready list (if
> necessary) and marks it as SCHED_GROUP (piggy-backing on the existing
> code used by TARGET_SCHED_MACRO_FUSION) to make sure the fusion
> candidate will be scheduled next. Following is the scheduling part of
> the diff. Does this sound like a feasible approach? I welcome any
> comments/discussion.

It looks fairly reasonable to me.   Do you plan on trying to take this
forward at all?
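
For readers skimming the thread, the approach in the quoted proposal can
be reduced to a toy list scheduler.  This sketch is heavily simplified
and is not the haifa scheduler: there are no cycles, latencies or queue,
and insn_sketch, schedule and the fusible callback are hypothetical names.
It only shows the core step: when scheduling one insn makes a dependent
insn ready and the pair is a fusion candidate, the dependent goes to the
front of the ready list so it is issued next.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <vector>

struct insn_sketch
{
  std::vector<int> succs;   // insns that depend on this one
  int unresolved = 0;       // number of not-yet-scheduled predecessors
};

// Return the issue order.  FUSIBLE (a, b) plays the role of the new hook.
std::vector<int>
schedule (std::vector<insn_sketch> insns,
          const std::function<bool (int, int)> &fusible)
{
  std::deque<int> ready;
  for (int i = 0; i < (int) insns.size (); ++i)
    if (insns[i].unresolved == 0)
      ready.push_back (i);

  std::vector<int> order;
  while (!ready.empty ())
    {
      int cur = ready.front ();
      ready.pop_front ();
      order.push_back (cur);
      bool fused = false;
      // Resolve forward dependencies of the insn just scheduled.
      for (int s : insns[cur].succs)
        if (--insns[s].unresolved == 0)
          {
            // At most one dependent is glued to CUR.
            if (!fused && fusible (cur, s))
              {
                ready.push_front (s);
                fused = true;
              }
            else
              ready.push_back (s);
          }
    }
  return order;
}
```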


jeff




Re: [PATCH] system: Add WARN_UNUSED_RESULT

2020-11-12 Thread Jason Merrill via Gcc-patches

On 11/11/20 10:03 PM, Marek Polacek wrote:

I'd like to have the option of marking functions with
__attribute__ ((__warn_unused_result__)), so this patch adds a macro.
And use it for maybe_wrap_with_location, it's always a bug if the
return value is not used, which happened to me and got me confused.
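
Outside the GCC tree the same idea looks roughly like the sketch below.
SKETCH_WARN_UNUSED_RESULT and wrap_with_offset are made-up names, and the
version test is spelled out from __GNUC__ and __GNUC_MINOR__ on the
assumption that this matches how GCC_VERSION is computed.

```cpp
#include <cassert>

// Hypothetical macro mirroring the shape of the proposed WARN_UNUSED_RESULT.
#if defined (__GNUC__) && (__GNUC__ * 1000 + __GNUC_MINOR__ >= 3004)
# define SKETCH_WARN_UNUSED_RESULT __attribute__ ((__warn_unused_result__))
#else
# define SKETCH_WARN_UNUSED_RESULT
#endif

// Returns a new value instead of modifying its argument, so dropping
// the result is almost certainly a mistake.
SKETCH_WARN_UNUSED_RESULT int
wrap_with_offset (int value, int offset)
{
  return value + offset;
}
```

A statement such as "wrap_with_offset (x, 1);" that discards the value
should then be diagnosed by GCC under -Wunused-result.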

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


gcc/ChangeLog:

* system.h (WARN_UNUSED_RESULT): Define for GCC >= 3.4.
* tree.h (maybe_wrap_with_location): Add WARN_UNUSED_RESULT.
---
  gcc/system.h | 6 ++
  gcc/tree.h   | 2 +-
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/system.h b/gcc/system.h
index b0f3f1dd019..6f6ab616a61 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -789,6 +789,12 @@ extern void fancy_abort (const char *, int, const char *)
  #define ALWAYS_INLINE inline
  #endif
  
+#if GCC_VERSION >= 3004
+#define WARN_UNUSED_RESULT __attribute__ ((__warn_unused_result__))
+#else
+#define WARN_UNUSED_RESULT
+#endif
+
  /* Use gcc_unreachable() to mark unreachable locations (like an
 unreachable default case of a switch.  Do not use gcc_assert(0).  */
  #if (GCC_VERSION >= 4005) && !ENABLE_ASSERT_CHECKING
diff --git a/gcc/tree.h b/gcc/tree.h
index 684be10b440..9a713cdb0c7 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1214,7 +1214,7 @@ get_expr_source_range (tree expr)
  extern void protected_set_expr_location (tree, location_t);
  extern void protected_set_expr_location_if_unset (tree, location_t);
  
-extern tree maybe_wrap_with_location (tree, location_t);
+WARN_UNUSED_RESULT extern tree maybe_wrap_with_location (tree, location_t);
  
  extern int suppress_location_wrappers;
  


base-commit: 0f5f9ed5e5a041b636cc002451b1e8b2295f8e4f





Re: [PATCH] IBM Z: Define vec_vfees instruction pattern

2020-11-12 Thread Stefan Schulze Frielinghaus via Gcc-patches
On Thu, Nov 12, 2020 at 02:18:13PM +0100, Andreas Krebbel wrote:
> On 12.11.20 13:21, Stefan Schulze Frielinghaus wrote:
> > Bootstrapped and regtested on IBM Z.  Ok for master?
> > 
> > gcc/ChangeLog:
> > 
> > * config/s390/vector.md ("vec_vfees"): New insn pattern.
> > ---
> >  gcc/config/s390/vector.md | 26 ++
> >  1 file changed, 26 insertions(+)
> > 
> > diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
> > index 31d323930b2..4333a2191ae 100644
> > --- a/gcc/config/s390/vector.md
> > +++ b/gcc/config/s390/vector.md
> > @@ -1798,6 +1798,32 @@
> >"vll\t%v0,%1,%2"
> >[(set_attr "op_type" "VRS")])
> >  
> > +; vfeebs, vfeehs, vfeefs
> > +; vfeezbs, vfeezhs, vfeezfs
> > +(define_insn "vec_vfees"
> > +  [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
> > +   (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
> > +  (match_operand:VI_HW_QHS 2 "register_operand" "v")
> > +  (match_operand:QI 3 "const_mask_operand" "C")]
> > + UNSPEC_VEC_VFEE))
> > +   (set (reg:CCRAW CC_REGNUM)
> > +   (unspec:CCRAW [(match_dup 1)
> > +  (match_dup 2)
> > +  (match_dup 3)]
> > + UNSPEC_VEC_VFEECC))]
> > +  "TARGET_VX"
> > +{
> > +  unsigned HOST_WIDE_INT flags = UINTVAL (operands[3]);
> > +
> > +  gcc_assert (!(flags & ~(VSTRING_FLAG_ZS | VSTRING_FLAG_CS)));
> > +  flags &= ~VSTRING_FLAG_CS;
> > +
> > +  if (flags == VSTRING_FLAG_ZS)
> > +return "vfeezs\t%v0,%v1,%v2";
> > +  return "vfees\t%v0,%v1,%v2";
> > +}
> > +  [(set_attr "op_type" "VRR")])
> > +
> >  ; vfenebs, vfenehs, vfenefs
> >  ; vfenezbs, vfenezhs, vfenezfs
> >  (define_insn "vec_vfenes"
> > 
> 
> Since this is mostly a copy of the pattern in vx-builtins.md I think we 
> should remove the other
> version then.
> 
> I also would prefer this to be committed together with the code making use of 
> the expander. So far
> this would be dead code - right?

Ok, I will remove the dead code and commit this change in conjunction
with the user in a different patch.

Thanks,
Stefan


Re: [PATCH v3 1/2] generate EH info for volatile asm statements (PR93981)

2020-11-12 Thread Jeff Law via Gcc-patches


On 3/11/20 6:38 PM, J.W. Jagersma via Gcc-patches wrote:
> The following patch extends the generation of exception handling
> information, so that it is possible to catch exceptions thrown from
> volatile asm statements, when -fnon-call-exceptions is enabled.  Parts
> of the gcc code already suggested this should be possible, but it was
> never fully implemented.
>
> Two new test cases are added.  The target-dependent test should pass on
> platforms where throwing from a signal handler is allowed.  The only
> platform I am aware of where that is the case is *-linux-gnu, so it is
> set to XFAIL on all others.
>
> gcc/
> 2020-03-11  Jan W. Jagersma  
>
>   PR inline-asm/93981
>   * tree-cfg.c (make_edges_bb): Make EH edges for GIMPLE_ASM.
>   * tree-eh.c (lower_eh_constructs_2): Add case for GIMPLE_ASM.
>   Assign register output operands to temporaries.
>   * doc/extend.texi: Document that volatile asms can now throw.
>
> gcc/testsuite/
> 2020-03-11  Jan W. Jagersma  
>
>   PR inline-asm/93981
>   * g++.target/i386/pr93981.C: New test.
>   * g++.dg/eh/pr93981.C: New test.

Is this the final version of the patch?  Do we have agreement on the
semantics for output operands, particularly memory operands?  The last
few messages in the March thread lead me to believe that's still not
settled.


Jeff




Re: Compare field offsets in fold_const when checking addresses

2020-11-12 Thread Jan Hubicka
> On Thu, 12 Nov 2020, Jan Hubicka wrote:
> 
> > Hi,
> > this is updated patch I am re-testing and plan to commit if it succeeds.
> > 
> > * fold-const.c (operand_compare::operand_equal_p): Compare
> > offsets of fields in component_refs when comparing addresses.
> > (operand_compare::hash_operand): Likewise.
> > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> > index c47557daeba..273ee25ceda 100644
> > --- a/gcc/fold-const.c
> > +++ b/gcc/fold-const.c
> > @@ -3312,11 +3312,36 @@ operand_compare::operand_equal_p (const_tree arg0, 
> > const_tree arg1,
> > case COMPONENT_REF:
> >   /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
> >  may be NULL when we're called to compare MEM_EXPRs.  */
> > - if (!OP_SAME_WITH_NULL (0)
> > - || !OP_SAME (1))
> > + if (!OP_SAME_WITH_NULL (0))
> > return false;
> > - flags &= ~OEP_ADDRESS_OF;
> > - return OP_SAME_WITH_NULL (2);
> > + /* Most of time we only need to compare FIELD_DECLs for equality.
> > +However when determining address look into actual offsets.
> > +These may match for unions and unshared record types.  */
> 
> looks like you can simplify by doing
> 
>   flags &= ~OEP_ADDRESS_OF;
> 
> here.  Neither the FIELD_DECL compare nor the offsets need it

Yep
> 
> You elided
> 
>   flags &= ~OEP_ADDRESS_OF;
> - return OP_SAME_WITH_NULL (2);
> 
> that was here when OP_SAME (1), please reinstate it.
Sorry about that, I was not careful enough.
Here is the updated patch, re-tested on x86_64-linux.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c47557daeba..ddf18f27cb7 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -3312,10 +3312,32 @@ operand_compare::operand_equal_p (const_tree arg0, 
const_tree arg1,
case COMPONENT_REF:
  /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
 may be NULL when we're called to compare MEM_EXPRs.  */
- if (!OP_SAME_WITH_NULL (0)
- || !OP_SAME (1))
+ if (!OP_SAME_WITH_NULL (0))
return false;
+ /* Most of time we only need to compare FIELD_DECLs for equality.
+However when determining address look into actual offsets.
+These may match for unions and unshared record types.  */
  flags &= ~OEP_ADDRESS_OF;
+ if (!OP_SAME (1))
+   {
+ if (flags & OEP_ADDRESS_OF)
+   {
+ if (TREE_OPERAND (arg0, 2)
+ || TREE_OPERAND (arg1, 2))
+   return OP_SAME_WITH_NULL (2);
+ tree field0 = TREE_OPERAND (arg0, 1);
+ tree field1 = TREE_OPERAND (arg1, 1);
+
+ if (!operand_equal_p (DECL_FIELD_OFFSET (field0),
+   DECL_FIELD_OFFSET (field1), flags)
+ || !operand_equal_p (DECL_FIELD_BIT_OFFSET (field0),
+  DECL_FIELD_BIT_OFFSET (field1),
+  flags))
+   return false;
+   }
+ else
+   return false;
+   }
  return OP_SAME_WITH_NULL (2);
 
case BIT_FIELD_REF:
@@ -3787,9 +3809,26 @@ operand_compare::hash_operand (const_tree t, 
inchash::hash &hstate,
  sflags = flags;
  break;
 
+   case COMPONENT_REF:
+ if (sflags & OEP_ADDRESS_OF)
+   {
+ hash_operand (TREE_OPERAND (t, 0), hstate, flags);
+ if (TREE_OPERAND (t, 2))
+   hash_operand (TREE_OPERAND (t, 2), hstate,
+ flags & ~OEP_ADDRESS_OF);
+ else
+   {
+ tree field = TREE_OPERAND (t, 1);
+ hash_operand (DECL_FIELD_OFFSET (field),
+   hstate, flags & ~OEP_ADDRESS_OF);
+ hash_operand (DECL_FIELD_BIT_OFFSET (field),
+   hstate, flags & ~OEP_ADDRESS_OF);
+   }
+ return;
+   }
+ break;
case ARRAY_REF:
case ARRAY_RANGE_REF:
-   case COMPONENT_REF:
case BIT_FIELD_REF:
  sflags &= ~OEP_ADDRESS_OF;
  break;
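
The comment in the patch says that field offsets "may match for unions and
unshared record types" even when the FIELD_DECLs differ.  A minimal sketch of
that situation at the source level, assuming made-up type names (Overlay,
PairA and PairB are hypothetical and do not come from fold-const.c):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration; not taken from fold-const.c.
union Overlay
{
  int as_int;
  float as_float;
};

struct PairA { int x; int y; };
struct PairB { int x; int y; };   // distinct type, identical layout

// True when both members of the union start at the same byte.
inline bool
union_members_share_address (const Overlay &u)
{
  return static_cast<const void *> (&u.as_int)
         == static_cast<const void *> (&u.as_float);
}

// True when the `y` members of the two record types sit at the same offset.
inline bool
record_fields_share_offset ()
{
  return offsetof (PairA, y) == offsetof (PairB, y);
}
```

In both cases the fields are different declarations, yet an address
comparison that looks at offsets rather than declarations would treat them as
equal.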


[PATCH 5/5] Inline delegators in vrp_folder.

2020-11-12 Thread Aldy Hernandez via Gcc-patches
Will push pending aarch64 tests.

gcc/ChangeLog:

* tree-vrp.c (class vrp_folder): Make visit_stmt, visit_phi,
and m_vr_values private.
(vrp_folder::vrp_evaluate_conditional): Remove.
(vrp_folder::vrp_simplify_stmt_using_ranges): Remove.
(vrp_folder::fold_predicate_in): Inline
vrp_evaluate_conditional and vrp_simplify_stmt_using_ranges.
(vrp_folder::fold_stmt): Same.
---
 gcc/tree-vrp.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 81bbaefd642..54ce017e8b2 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3824,10 +3824,10 @@ public:
   void initialize (struct function *);
   void finalize ();
 
+private:
   enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) FINAL OVERRIDE;
   enum ssa_prop_result visit_phi (gphi *) FINAL OVERRIDE;
 
-private:
   struct function *fun;
   vr_values *m_vr_values;
 };
@@ -4063,23 +4063,16 @@ class vrp_folder : public substitute_and_fold_engine
 : substitute_and_fold_engine (/* Fold all stmts.  */ true),
   m_vr_values (v), simplifier (v)
 {  }
-  bool fold_stmt (gimple_stmt_iterator *) FINAL OVERRIDE;
 
+private:
   tree value_of_expr (tree name, gimple *stmt) OVERRIDE
 {
   return m_vr_values->value_of_expr (name, stmt);
 }
-  class vr_values *m_vr_values;
-
-private:
+  bool fold_stmt (gimple_stmt_iterator *) FINAL OVERRIDE;
   bool fold_predicate_in (gimple_stmt_iterator *);
-  /* Delegators.  */
-  tree vrp_evaluate_conditional (tree_code code, tree op0,
-tree op1, gimple *stmt)
-{ return simplifier.vrp_evaluate_conditional (code, op0, op1, stmt); }
-  bool simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
-{ return simplifier.simplify (gsi); }
 
+  vr_values *m_vr_values;
   simplify_using_ranges simplifier;
 };
 
@@ -4098,16 +4091,16 @@ vrp_folder::fold_predicate_in (gimple_stmt_iterator *si)
   && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_comparison)
 {
   assignment_p = true;
-  val = vrp_evaluate_conditional (gimple_assign_rhs_code (stmt),
- gimple_assign_rhs1 (stmt),
- gimple_assign_rhs2 (stmt),
- stmt);
+  val = simplifier.vrp_evaluate_conditional (gimple_assign_rhs_code (stmt),
+gimple_assign_rhs1 (stmt),
+gimple_assign_rhs2 (stmt),
+stmt);
 }
   else if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
-val = vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
-   gimple_cond_lhs (cond_stmt),
-   gimple_cond_rhs (cond_stmt),
-   stmt);
+val = simplifier.vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
+  gimple_cond_lhs (cond_stmt),
+  gimple_cond_rhs (cond_stmt),
+  stmt);
   else
 return false;
 
@@ -4153,7 +4146,7 @@ vrp_folder::fold_stmt (gimple_stmt_iterator *si)
   if (fold_predicate_in (si))
 return true;
 
-  return simplify_stmt_using_ranges (si);
+  return simplifier.simplify (si);
 }
 
 /* Blocks which have more than one predecessor and more than
-- 
2.26.2



[PATCH 1/5] Group tree-vrp.c by functionality.

2020-11-12 Thread Aldy Hernandez via Gcc-patches
Earlier in this cycle there was some work by Giuliano Belinassi and
myself to refactor tree-vrp.c.  A lot of functions and globals were
moved into independent classes, but the haphazard layout remained.
Assertion methods were interspersed with the propagation code, and with
the jump threading code, etc.

This series of patches moves things around so that common
functionality is geographically close.  There is no change in
behavior.

I know this is all slated to go in the next release, but finding
things in the current code base, even if just to compare with the
ranger, is difficult.

Tested on x86-64 Linux.  Aarch64 tests are still going.

Since I keep getting bitten by aarch64 regressions, I'll push when the
entire patchset finishes tests on aarch64.

gcc/ChangeLog:

* tree-vrp.c (struct assert_locus): Move.
(class vrp_insert): Rename to vrp_asserts.
(vrp_insert::build_assert_expr_for): Move to vrp_asserts.
(fp_predicate): Same.
(vrp_insert::dump): Same.
(vrp_insert::register_new_assert_for): Same.
(extract_code_and_val_from_cond_with_ops): Move.
(vrp_insert::finish_register_edge_assert_for): Move to vrp_asserts.
(maybe_set_nonzero_bits): Move.
(vrp_insert::find_conditional_asserts): Move to vrp_asserts.
(stmt_interesting_for_vrp): Move.
(struct case_info): Move.
(compare_case_labels): Move.
(lhs_of_dominating_assert): Move.
(find_case_label_index): Move.
(find_case_label_range): Move.
(class vrp_asserts): New.
(vrp_asserts::build_assert_expr_for): Rename from vrp_insert.
(vrp_asserts::dump): Same.
(vrp_asserts::register_new_assert_for): Same.
(vrp_asserts::finish_register_edge_assert_for): Same.
(vrp_asserts::find_conditional_asserts): Same.
(vrp_asserts::compare_case_labels): Same.
(vrp_asserts::find_switch_asserts): Same.
(vrp_asserts::find_assert_locations_in_bb): Same.
(vrp_asserts::find_assert_locations): Same.
(vrp_asserts::process_assert_insertions_for): Same.
(vrp_asserts::compare_assert_loc): Same.
(vrp_asserts::process_assert_insertions): Same.
(vrp_asserts::insert_range_assertions): Same.
(vrp_asserts::all_imm_uses_in_stmt_or_feed_cond): Same.
(vrp_asserts::remove_range_assertions): Same.
(class vrp_prop): Move.
(all_imm_uses_in_stmt_or_feed_cond): Move.
(vrp_prop::vrp_initialize): Move.
(class vrp_folder): Move.
(vrp_folder::fold_predicate_in): Move.
(vrp_folder::fold_stmt): Move.
(vrp_prop::initialize): Move.
(vrp_prop::visit_stmt): Move.
(enum ssa_prop_result): Move.
(vrp_prop::visit_phi): Move.
(vrp_prop::finalize): Move.
(class vrp_dom_walker): Rename to...
(class vrp_jump_threader): ...this.
(vrp_jump_threader::before_dom_children): Rename from
vrp_dom_walker.
(simplify_stmt_for_jump_threading): Rename to...
(vrp_jump_threader::simplify_stmt): ...here.
(vrp_jump_threader::after_dom_children): Same.
(identify_jump_threads): Move.
(vrp_prop::vrp_finalize): Move array bounds setup code to...
(execute_vrp): ...here.
---
 gcc/tree-vrp.c | 2127 
 1 file changed, 1057 insertions(+), 1070 deletions(-)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index e00c034fee3..d3816ab569e 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -161,153 +161,6 @@ live_names::live_on_block_p (tree name, basic_block bb)
  && bitmap_bit_p (live[bb->index], SSA_NAME_VERSION (name)));
 }
 
-
-/* Location information for ASSERT_EXPRs.  Each instance of this
-   structure describes an ASSERT_EXPR for an SSA name.  Since a single
-   SSA name may have more than one assertion associated with it, these
-   locations are kept in a linked list attached to the corresponding
-   SSA name.  */
-struct assert_locus
-{
-  /* Basic block where the assertion would be inserted.  */
-  basic_block bb;
-
-  /* Some assertions need to be inserted on an edge (e.g., assertions
- generated by COND_EXPRs).  In those cases, BB will be NULL.  */
-  edge e;
-
-  /* Pointer to the statement that generated this assertion.  */
-  gimple_stmt_iterator si;
-
-  /* Predicate code for the ASSERT_EXPR.  Must be COMPARISON_CLASS_P.  */
-  enum tree_code comp_code;
-
-  /* Value being compared against.  */
-  tree val;
-
-  /* Expression to compare.  */
-  tree expr;
-
-  /* Next node in the linked list.  */
-  assert_locus *next;
-};
-
-class vrp_insert
-{
-public:
-  vrp_insert (struct function *fn) : fun (fn) { }
-
-  /* Traverse the flowgraph looking for conditional jumps to insert range
- expressions.  These range expressions are meant to provide information
- to optimizations that need to reason in terms of value ranges.  They
- will not be expanded into RTL.

[PATCH 4/5] Move vr_values out of vrp_prop into execute_vrp so it can be shared.

2020-11-12 Thread Aldy Hernandez via Gcc-patches
vr_values is being shared among the propagator and the folder and
passed around.  I've pulled it out from the propagator so it can be
passed around to each, instead of being publicly accessible from the
propagator.
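
The ownership change described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: value_table,
propagator, folder and run_pass are hypothetical stand-ins for vr_values,
vrp_prop, vrp_folder and execute_vrp, and none of the member functions
mirror the real GCC interfaces.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for vr_values: a table of known values by name.
class value_table
{
public:
  void set (const std::string &name, int v) { m_values[name] = v; }

  bool lookup (const std::string &name, int *out) const
  {
    auto it = m_values.find (name);
    if (it == m_values.end ())
      return false;
    *out = it->second;
    return true;
  }

private:
  std::map<std::string, int> m_values;
};

// Stand-in for the propagator: records facts into the shared table.
class propagator
{
public:
  explicit propagator (value_table *v) : m_values (v) { }
  void record (const std::string &name, int v) { m_values->set (name, v); }

private:
  value_table *m_values;   // not owned
};

// Stand-in for the folder: reads facts from the same table.
class folder
{
public:
  explicit folder (value_table *v) : m_values (v) { }
  bool value_of (const std::string &name, int *out) const
  { return m_values->lookup (name, out); }

private:
  value_table *m_values;   // not owned
};

// Stand-in for the driver: owns the table and hands it to both engines.
inline bool
run_pass (int *result)
{
  value_table values;
  propagator prop (&values);
  folder fold (&values);
  prop.record ("x_1", 42);
  return fold.value_of ("x_1", result);
}
```

With the table owned by the driver, neither engine needs to expose it as a
public member for the other to reach.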

Will push pending aarch64 tests.

gcc/ChangeLog:

* tree-vrp.c (class vrp_prop): Rename vr_values to m_vr_values.
(vrp_prop::vrp_prop): New.
(vrp_prop::initialize): Rename vr_values to m_vr_values.
(vrp_prop::visit_stmt): Same.
(vrp_prop::visit_phi): Same.
(vrp_prop::finalize): Same.
(execute_vrp): Instantiate vrp_vr_values and pass it to folder
and propagator.
---
 gcc/tree-vrp.c | 53 +++---
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 15267e3d878..81bbaefd642 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3817,15 +3817,19 @@ vrp_asserts::remove_range_assertions ()
 class vrp_prop : public ssa_propagation_engine
 {
 public:
-  enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) FINAL OVERRIDE;
-  enum ssa_prop_result visit_phi (gphi *) FINAL OVERRIDE;
-
-  struct function *fun;
+  vrp_prop (vr_values *v)
+: ssa_propagation_engine (),
+  m_vr_values (v) { }
 
   void initialize (struct function *);
   void finalize ();
 
-  class vr_values vr_values;
+  enum ssa_prop_result visit_stmt (gimple *, edge *, tree *) FINAL OVERRIDE;
+  enum ssa_prop_result visit_phi (gphi *) FINAL OVERRIDE;
+
+private:
+  struct function *fun;
+  vr_values *m_vr_values;
 };
 
 /* Initialization required by ssa_propagate engine.  */
@@ -3845,7 +3849,7 @@ vrp_prop::initialize (struct function *fn)
  if (!stmt_interesting_for_vrp (phi))
{
  tree lhs = PHI_RESULT (phi);
- vr_values.set_def_to_varying (lhs);
+ m_vr_values->set_def_to_varying (lhs);
  prop_set_simulate_again (phi, false);
}
  else
@@ -3864,7 +3868,7 @@ vrp_prop::initialize (struct function *fn)
prop_set_simulate_again (stmt, true);
  else if (!stmt_interesting_for_vrp (stmt))
{
- vr_values.set_defs_to_varying (stmt);
+ m_vr_values->set_defs_to_varying (stmt);
  prop_set_simulate_again (stmt, false);
}
  else
@@ -3887,11 +3891,11 @@ vrp_prop::visit_stmt (gimple *stmt, edge *taken_edge_p, 
tree *output_p)
 {
   tree lhs = gimple_get_lhs (stmt);
   value_range_equiv vr;
-  vr_values.extract_range_from_stmt (stmt, taken_edge_p, output_p, &vr);
+  m_vr_values->extract_range_from_stmt (stmt, taken_edge_p, output_p, &vr);
 
   if (*output_p)
 {
-  if (vr_values.update_value_range (*output_p, &vr))
+  if (m_vr_values->update_value_range (*output_p, &vr))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -3926,7 +3930,7 @@ vrp_prop::visit_stmt (gimple *stmt, edge *taken_edge_p, 
tree *output_p)
use_operand_p use_p;
enum ssa_prop_result res = SSA_PROP_VARYING;
 
-   vr_values.set_def_to_varying (lhs);
+   m_vr_values->set_def_to_varying (lhs);
 
FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
  {
@@ -3956,9 +3960,9 @@ vrp_prop::visit_stmt (gimple *stmt, edge *taken_edge_p, 
tree *output_p)
   {REAL,IMAG}PART_EXPR uses at all,
   return SSA_PROP_VARYING.  */
value_range_equiv new_vr;
-   vr_values.extract_range_basic (&new_vr, use_stmt);
+   m_vr_values->extract_range_basic (&new_vr, use_stmt);
const value_range_equiv *old_vr
- = vr_values.get_value_range (use_lhs);
+ = m_vr_values->get_value_range (use_lhs);
if (!old_vr->equal_p (new_vr, /*ignore_equivs=*/false))
  res = SSA_PROP_INTERESTING;
else
@@ -3980,7 +3984,7 @@ vrp_prop::visit_stmt (gimple *stmt, edge *taken_edge_p, 
tree *output_p)
 
   /* All other statements produce nothing of interest for VRP, so mark
  their outputs varying and prevent further simulation.  */
-  vr_values.set_defs_to_varying (stmt);
+  m_vr_values->set_defs_to_varying (stmt);
 
   return (*taken_edge_p) ? SSA_PROP_INTERESTING : SSA_PROP_VARYING;
 }
@@ -3994,8 +3998,8 @@ vrp_prop::visit_phi (gphi *phi)
 {
   tree lhs = PHI_RESULT (phi);
   value_range_equiv vr_result;
-  vr_values.extract_range_from_phi_node (phi, &vr_result);
-  if (vr_values.update_value_range (lhs, &vr_result))
+  m_vr_values->extract_range_from_phi_node (phi, &vr_result);
+  if (m_vr_values->update_value_range (lhs, &vr_result))
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -4024,12 +4028,12 @@ vrp_prop::finalize ()
   size_t i;
 
   /* We have completed propagating through the lattice.  */
-  vr_values.set_lattice_propagation_complete ();
+  m_vr_values->set_lattice_propagation_complete ();
 
   if (dump_file)
 {

[PATCH 3/5] Move vrp_prop before vrp_folder.

2020-11-12 Thread Aldy Hernandez via Gcc-patches
Will push pending aarch64 tests.

gcc/ChangeLog:

* tree-vrp.c (class vrp_prop): Move entire class...
(class vrp_folder): ...before here.
---
 gcc/tree-vrp.c | 200 -
 1 file changed, 100 insertions(+), 100 deletions(-)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 6b77c357a8f..15267e3d878 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3814,106 +3814,6 @@ vrp_asserts::remove_range_assertions ()
   }
 }
 
-class vrp_folder : public substitute_and_fold_engine
-{
- public:
-  vrp_folder (vr_values *v)
-: substitute_and_fold_engine (/* Fold all stmts.  */ true),
-  m_vr_values (v), simplifier (v)
-{  }
-  bool fold_stmt (gimple_stmt_iterator *) FINAL OVERRIDE;
-
-  tree value_of_expr (tree name, gimple *stmt) OVERRIDE
-{
-  return m_vr_values->value_of_expr (name, stmt);
-}
-  class vr_values *m_vr_values;
-
-private:
-  bool fold_predicate_in (gimple_stmt_iterator *);
-  /* Delegators.  */
-  tree vrp_evaluate_conditional (tree_code code, tree op0,
-tree op1, gimple *stmt)
-{ return simplifier.vrp_evaluate_conditional (code, op0, op1, stmt); }
-  bool simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
-{ return simplifier.simplify (gsi); }
-
-  simplify_using_ranges simplifier;
-};
-
-/* If the statement pointed by SI has a predicate whose value can be
-   computed using the value range information computed by VRP, compute
-   its value and return true.  Otherwise, return false.  */
-
-bool
-vrp_folder::fold_predicate_in (gimple_stmt_iterator *si)
-{
-  bool assignment_p = false;
-  tree val;
-  gimple *stmt = gsi_stmt (*si);
-
-  if (is_gimple_assign (stmt)
-  && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_comparison)
-{
-  assignment_p = true;
-  val = vrp_evaluate_conditional (gimple_assign_rhs_code (stmt),
- gimple_assign_rhs1 (stmt),
- gimple_assign_rhs2 (stmt),
- stmt);
-}
-  else if (gcond *cond_stmt = dyn_cast <gcond *> (stmt))
-val = vrp_evaluate_conditional (gimple_cond_code (cond_stmt),
-   gimple_cond_lhs (cond_stmt),
-   gimple_cond_rhs (cond_stmt),
-   stmt);
-  else
-return false;
-
-  if (val)
-{
-  if (assignment_p)
-val = fold_convert (gimple_expr_type (stmt), val);
-
-  if (dump_file)
-   {
- fprintf (dump_file, "Folding predicate ");
- print_gimple_expr (dump_file, stmt, 0);
- fprintf (dump_file, " to ");
- print_generic_expr (dump_file, val);
- fprintf (dump_file, "\n");
-   }
-
-  if (is_gimple_assign (stmt))
-   gimple_assign_set_rhs_from_tree (si, val);
-  else
-   {
- gcc_assert (gimple_code (stmt) == GIMPLE_COND);
- gcond *cond_stmt = as_a <gcond *> (stmt);
- if (integer_zerop (val))
-   gimple_cond_make_false (cond_stmt);
- else if (integer_onep (val))
-   gimple_cond_make_true (cond_stmt);
- else
-   gcc_unreachable ();
-   }
-
-  return true;
-}
-
-  return false;
-}
-
-/* Callback for substitute_and_fold folding the stmt at *SI.  */
-
-bool
-vrp_folder::fold_stmt (gimple_stmt_iterator *si)
-{
-  if (fold_predicate_in (si))
-return true;
-
-  return simplify_stmt_using_ranges (si);
-}
-
 class vrp_prop : public ssa_propagation_engine
 {
 public:
@@ -4152,6 +4052,106 @@ vrp_prop::finalize ()
 }
 }
 
+class vrp_folder : public substitute_and_fold_engine
+{
+ public:
+  vrp_folder (vr_values *v)
+: substitute_and_fold_engine (/* Fold all stmts.  */ true),
+  m_vr_values (v), simplifier (v)
+{  }
+  bool fold_stmt (gimple_stmt_iterator *) FINAL OVERRIDE;
+
+  tree value_of_expr (tree name, gimple *stmt) OVERRIDE
+{
+  return m_vr_values->value_of_expr (name, stmt);
+}
+  class vr_values *m_vr_values;
+
+private:
+  bool fold_predicate_in (gimple_stmt_iterator *);
+  /* Delegators.  */
+  tree vrp_evaluate_conditional (tree_code code, tree op0,
+tree op1, gimple *stmt)
+{ return simplifier.vrp_evaluate_conditional (code, op0, op1, stmt); }
+  bool simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
+{ return simplifier.simplify (gsi); }
+
+  simplify_using_ranges simplifier;
+};
+
+/* If the statement pointed by SI has a predicate whose value can be
+   computed using the value range information computed by VRP, compute
+   its value and return true.  Otherwise, return false.  */
+
+bool
+vrp_folder::fold_predicate_in (gimple_stmt_iterator *si)
+{
+  bool assignment_p = false;
+  tree val;
+  gimple *stmt = gsi_stmt (*si);
+
+  if (is_gimple_assign (stmt)
+  && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_comparison)
+{
+  assignment_p = true;
+  val = vrp_evaluate

[PATCH 2/5] Refactor VRP threading code into vrp_jump_threader class.

2020-11-12 Thread Aldy Hernandez via Gcc-patches
Will push pending aarch64 tests.

gcc/ChangeLog:

* tree-vrp.c (identify_jump_threads): Refactor to..
(vrp_jump_threader::vrp_jump_threader): ...here
(vrp_jump_threader::~vrp_jump_threader): ...and here.
(vrp_jump_threader::after_dom_children): Rename vr_values to
m_vr_values.
(execute_vrp): Use vrp_jump_threader.
---
 gcc/tree-vrp.c | 144 -
 1 file changed, 72 insertions(+), 72 deletions(-)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d3816ab569e..6b77c357a8f 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -4152,32 +4152,87 @@ vrp_prop::finalize ()
 }
 }
 
+/* Blocks which have more than one predecessor and more than
+   one successor present jump threading opportunities, i.e.,
+   when the block is reached from a specific predecessor, we
+   may be able to determine which of the outgoing edges will
+   be traversed.  When this optimization applies, we are able
+   to avoid conditionals at runtime and we may expose secondary
+   optimization opportunities.
+
+   This class is effectively a driver for the generic jump
+   threading code.  It basically just presents the generic code
+   with edges that may be suitable for jump threading.
+
+   Unlike DOM, we do not iterate VRP if jump threading was successful.
+   While iterating may expose new opportunities for VRP, it is expected
+   those opportunities would be very limited and the compile time cost
+   to expose those opportunities would be significant.
+
+   As jump threading opportunities are discovered, they are registered
+   for later realization.  */
+
 class vrp_jump_threader : public dom_walker
 {
 public:
-  vrp_jump_threader (cdi_direction direction,
-class const_and_copies *const_and_copies,
-class avail_exprs_stack *avail_exprs_stack)
-: dom_walker (direction, REACHABLE_BLOCKS),
-  m_const_and_copies (const_and_copies),
-  m_avail_exprs_stack (avail_exprs_stack),
-  m_dummy_cond (NULL) {}
-
-  virtual edge before_dom_children (basic_block);
-  virtual void after_dom_children (basic_block);
+  vrp_jump_threader (struct function *, vr_values *);
+  ~vrp_jump_threader ();
 
-  class vr_values *vr_values;
+  void thread_jumps ()
+  {
+walk (m_fun->cfg->x_entry_block_ptr);
+  }
 
 private:
   static tree simplify_stmt (gimple *stmt, gimple *within_stmt,
 avail_exprs_stack *, basic_block);
+  virtual edge before_dom_children (basic_block);
+  virtual void after_dom_children (basic_block);
 
-  class const_and_copies *m_const_and_copies;
-  class avail_exprs_stack *m_avail_exprs_stack;
-
+  function *m_fun;
+  vr_values *m_vr_values;
+  const_and_copies *m_const_and_copies;
+  avail_exprs_stack *m_avail_exprs_stack;
+  hash_table<expr_elt_hasher> *m_avail_exprs;
   gcond *m_dummy_cond;
 };
 
+vrp_jump_threader::vrp_jump_threader (struct function *fun, vr_values *v)
+  : dom_walker (CDI_DOMINATORS, REACHABLE_BLOCKS)
+{
+  /* Ugh.  When substituting values earlier in this pass we can wipe
+ the dominance information.  So rebuild the dominator information
+ as we need it within the jump threading code.  */
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  /* We do not allow VRP information to be used for jump threading
+ across a back edge in the CFG.  Otherwise it becomes too
+ difficult to avoid eliminating loop exit tests.  Of course
+ EDGE_DFS_BACK is not accurate at this time so we have to
+ recompute it.  */
+  mark_dfs_back_edges ();
+
+  /* Allocate our unwinder stack to unwind any temporary equivalences
+ that might be recorded.  */
+  m_const_and_copies = new const_and_copies ();
+
+  m_dummy_cond = NULL;
+  m_fun = fun;
+  m_vr_values = v;
+  m_avail_exprs = new hash_table<expr_elt_hasher> (1024);
+  m_avail_exprs_stack = new avail_exprs_stack (m_avail_exprs);
+}
+
+vrp_jump_threader::~vrp_jump_threader ()
+{
+  /* We do not actually update the CFG or SSA graphs at this point as
+ ASSERT_EXPRs are still in the IL and cfg cleanup code does not
+ yet handle ASSERT_EXPRs gracefully.  */
+  delete m_const_and_copies;
+  delete m_avail_exprs;
+  delete m_avail_exprs_stack;
+}
+
 /* Called before processing dominator children of BB.  We want to look
at ASSERT_EXPRs and record information from them in the appropriate
tables.
@@ -4295,7 +4350,7 @@ vrp_jump_threader::after_dom_children (basic_block bb)
  integer_zero_node, integer_zero_node,
  NULL, NULL);
 
-  x_vr_values = vr_values;
+  x_vr_values = m_vr_values;
   thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
 m_avail_exprs_stack, NULL,
 simplify_stmt);
@@ -4305,62 +4360,6 @@ vrp_jump_threader::after_dom_children (basic_block bb)
   m_const_and_copies->pop_to_marker ();
 }
 
-/* Blocks which have more than one predecessor and more than
-   one successor pre

Re: [PATCH][RFC] Make mingw-w64 printf/scanf attribute alias to ms_printf/ms_scanf only for C89

2020-11-12 Thread Liu Hao via Gcc-patches
On 2020/11/12 18:18, Jonathan Yong wrote:
> libgomp build fails because of the false -Wformat error, even though:
> 1. Correct C99 inttypes.h macros are used.
> 2. __mingw_* C99 wrappers are used.
> 3. The printf attribute is used, but it was aliased to ms_printf
> 
> The attached patch makes the mingw-w64 printf attribute equivalent to other
> platforms on C99 or later. This allows libgomp to build again with -Werror
> on. This patch should not affect the original mingw.org distribution in any
> way.
> 

According to the conversation on IRC, I personally consider this inappropriate.
Although the `ll` modifier is specified by C99 for `long long`, there are many
more that don't conform to C99 without UCRT:

1. The `z` modifier for `size_t` is unrecognized.
2. The `t` modifier for `ptrdiff_t` is unrecognized.
3. The `L` modifier for `long double` is accepted but ignored due to
   the fact that MSABI uses an 8-byte type.
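
To make the modifiers in question concrete, here is a small sketch of a
variadic helper annotated with GCC's `format (printf, ...)` attribute, so that
-Wformat checks the format string against the arguments.  The names
format_string and describe_sizes are hypothetical, and the sketch assumes a
C99-conforming runtime; per the list above, `%zu` and `%td` would not be
recognized by the older MSVCRT.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format into a std::string, with GCC checking the
// format string against the arguments via the `printf` archetype.
#if defined(__GNUC__)
__attribute__ ((format (printf, 1, 2)))
#endif
inline std::string
format_string (const char *fmt, ...)
{
  char buf[128];
  va_list ap;
  va_start (ap, fmt);
  std::vsnprintf (buf, sizeof buf, fmt, ap);
  va_end (ap);
  return buf;
}

// Uses the C99 `ll`, `z` and `t` length modifiers discussed in this thread.
inline std::string
describe_sizes ()
{
  long long big = 1LL << 40;
  std::size_t count = 3;
  std::ptrdiff_t delta = -2;
  return format_string ("%lld %zu %td", big, count, delta);
}
```

Whether such a call draws a -Wformat warning on mingw-w64 depends on which
archetype `printf` is aliased to, which is the question under discussion here.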


> For C99 or later, the mingw-w64 headers already wrap printf/scanf properly,
> and inttypes.h also gives the correct C99 specifiers, so it makes sense to
> treat the printf attribute as C99 compliant. Under C89 mode, the headers
> would produce MS specific specifiers, so the printf attribute under C89
> reverts to the old behavior of being aliased to ms_printf.
> 
> This might break other code that assumes differently however. I don't think
> there is a solution to satisfy everyone, but at least this allows C99/C++11
> compliant code to build again with -Werror.
> Comments?

My humble opinion is that people should have gotten used to the `ll` specifier,
so I propose a different patch that enables it unconditionally. As Jonathan
Yong pointed out, it is impossible for GCC to predict where the target
executable will run. It may be reasonable to expect Vista or later rather than
an older system. Users who still target XP or earlier should probably handle
such differentiation themselves. In comparison, MSVC does not have such format
checks at all.

I started bootstrapping GCC a few minutes ago. It is not going to finish very
soon, so I am sending this patch for your comments.



-- 
Best regards,
LH_Mouse
From 1d61adae0695e7067e35f36e607a754a7cf12796 Mon Sep 17 00:00:00 2001
From: Liu Hao 
Date: Thu, 12 Nov 2020 22:20:29 +0800
Subject: [PATCH] gcc: Add `ll` and `L` length modifiers for `ms_printf`

The previous code abused `FMT_LEN_L` for the `I` modifier. As `L` is a valid
modifier for `f`, `e`, `g`, etc. and `I` has the same semantics as the
C99 `z` modifier, `FMT_LEN_z` is now used.

First, in the Microsoft ABI, type `long double` has the same layout as
type `double`, so `%Lg` behaves identically to `%g`. Users should pass
in `double`s instead of `long double`s, as GCC uses the 10-byte format.

Second, with a CRT that is recent enough (MSVCRT since Vista, MSVCR80,
UCRT, or mingw-w64 8.0), `printf`-family functions can handle the `ll`
length modifier correctly. This ability is assumed to be available
universally. A lot of libraries (such as libgomp) that use the
`format(printf, ...)` attribute used to suffer from warnings about
unknown format specifiers.

Reference: https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2008/tcxf1dw6(v=vs.90)
Reference: https://docs.microsoft.com/en-us/cpp/porting/visual-cpp-what-s-new-2003-through-2015#new-crt-features
Signed-off-by: Liu Hao 

gcc/:
* config/i386/msformat-c.c: Add more length modifiers
---
 gcc/config/i386/msformat-c.c | 45 ++--
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/gcc/config/i386/msformat-c.c b/gcc/config/i386/msformat-c.c
index 4ceec633a6e..1629b866976 100644
--- a/gcc/config/i386/msformat-c.c
+++ b/gcc/config/i386/msformat-c.c
@@ -32,10 +32,11 @@ along with GCC; see the file COPYING3.  If not see
 static format_length_info ms_printf_length_specs[] =
 {
   { "h", FMT_LEN_h, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
-  { "l", FMT_LEN_l, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
+  { "l", FMT_LEN_l, STD_C89, NULL, FMT_LEN_ll, STD_C9L, 0 },
+  { "L", FMT_LEN_L, STD_C89, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I32", FMT_LEN_l, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I64", FMT_LEN_ll, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
-  { "I", FMT_LEN_L, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
+  { "I", FMT_LEN_z, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { NULL, FMT_LEN_none, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 }
 };
 
@@ -90,33 +91,33 @@ static const format_flag_pair ms_strftime_flag_pairs[] =
 static const format_char_info ms_print_char_table[] =
 {
   /* C89 conversion specifiers.  */
-  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  T99_SST, 
 BADLEN, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "-wp0 +'",  "i",  NULL 
},
-  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, 
BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0#", "i",  NULL },
-  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL, 

Re: [PATCH] Add a new pattern in 4-insn combine

2020-11-12 Thread Segher Boessenkool
On Wed, Nov 11, 2020 at 06:22:53PM -0600, Segher Boessenkool wrote:
> I'm running an all-arch comparison with this patch, just to see what it
> does, but [...]

Results: C0 is trunk, C1 with patch:

                    C0        C1
       alpha   6422312   99.971%
         arc   3783838  100.000%
         arm  10168277  100.000%
       arm64  20077721         0
       armhf  14886534  100.000%
         c6x   2509915  100.000%
        csky         0         0
       h8300   1229802  100.000%
        i386  12040952         0
        ia64  18555229  100.000%
        m68k   3868729  100.000%
  microblaze   5885763  100.000%
        mips   9158101  100.000%
      mips64   7402870  100.001%
       nds32   4833031  100.000%
       nios2   3917080  100.000%
    openrisc   4571561  100.000%
      parisc   7725308  100.000%
    parisc64         0         0
     powerpc  11004119  100.000%
   powerpc64  22618492  100.000%
 powerpc64le  19609678  100.000%
     riscv32   1639840  100.000%
     riscv64   7658668         0
        s390  15345481         0
          sh         0         0
     shnommu   1694176  100.000%
       sparc   4744809  100.000%
     sparc64   7205254  100.000%
      x86_64  19870124         0
      xtensa   2658455  100.002%

0 means it did not build...  So some targets newly ICE (x86, riscv, z).

It surprisingly only helps alpha a bit, and all other changes are in the
wrong direction (but very slightly).


Segher


Re: [committed] libstdc++: Fix __numeric_traits_integer<__int20> [PR 97798]

2020-11-12 Thread Jonathan Wakely via Gcc-patches

Here's a small tweak to __numeric_traits that I decided to do after
the previous patch.

Tested on powerpc64le-linux. Committed to trunk.

commit d21776ef90361e66401cd99c8ff0d98b46d3b0d6
Author: Jonathan Wakely 
Date:   Thu Nov 12 13:31:02 2020

libstdc++: Simplify __numeric_traits definition

This changes the __numeric_traits primary template to assume its
argument is an integer type. For the three floating point types that are
supported by __numeric_traits_floating an explicit specialization of
__numeric_traits chooses the right base class.

This improves the failure mode for using __numeric_traits with an
unsupported type. Previously it would use __numeric_traits_floating as
the base class, and give somewhat obscure errors for trying to access
the static data members. Now it will use __numeric_traits_integer which
has a static_assert to check for supported types.

As a side effect of this change there is no need to instantiate
__conditional_type to decide which base class to use.

libstdc++-v3/ChangeLog:

* include/ext/numeric_traits.h (__numeric_traits): Change
primary template to always derive from __numeric_traits_integer.
(__numeric_traits<float>, __numeric_traits<double>)
(__numeric_traits<long double>): Add explicit specializations.

diff --git a/libstdc++-v3/include/ext/numeric_traits.h b/libstdc++-v3/include/ext/numeric_traits.h
index c29f9f21d1aa..2cac7f1d1edc 100644
--- a/libstdc++-v3/include/ext/numeric_traits.h
+++ b/libstdc++-v3/include/ext/numeric_traits.h
@@ -176,19 +176,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Value>
 const int __numeric_traits_floating<_Value>::__max_exponent10;
 
-  template<typename _Value>
-struct __numeric_traits
-: public __conditional_type<__is_integer_nonstrict<_Value>::__value,
-__numeric_traits_integer<_Value>,
-__numeric_traits_floating<_Value> >::__type
-{ };
-
-_GLIBCXX_END_NAMESPACE_VERSION
-} // namespace
-
 #undef __glibcxx_floating
 #undef __glibcxx_max_digits10
 #undef __glibcxx_digits10
 #undef __glibcxx_max_exponent10
 
+  template<typename _Value>
+struct __numeric_traits
+: public __numeric_traits_integer<_Value>
+{ };
+
+  template<>
+struct __numeric_traits<float>
+: public __numeric_traits_floating<float>
+{ };
+
+  template<>
+struct __numeric_traits<double>
+: public __numeric_traits_floating<double>
+{ };
+
+  template<>
+struct __numeric_traits<long double>
+: public __numeric_traits_floating<long double>
+{ };
+
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace
+
 #endif
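
For readers skimming the diff, the dispatch described in the commit message can be sketched roughly as follows. This is only an illustration, not the libstdc++ code: every name ending in _sketch is hypothetical, and std::is_integral stands in for the internal check used by __numeric_traits_integer.

```cpp
#include <type_traits>

// Hypothetical stand-in for __numeric_traits_integer: rejects unsupported
// types with a readable static_assert instead of an obscure member error.
template<typename T>
struct integer_traits_sketch
{
  static_assert(std::is_integral<T>::value,
                "unsupported type: expected an integer type");
  static constexpr bool is_integer_like = true;
};

// Hypothetical stand-in for __numeric_traits_floating.
template<typename T>
struct floating_traits_sketch
{
  static constexpr bool is_integer_like = false;
};

// Primary template: assume the argument is an integer type.
template<typename T>
struct numeric_traits_sketch : public integer_traits_sketch<T> { };

// Explicit specializations choose the floating-point base class, so no
// conditional type needs to be instantiated to pick the base.
template<>
struct numeric_traits_sketch<float> : public floating_traits_sketch<float> { };

template<>
struct numeric_traits_sketch<double> : public floating_traits_sketch<double> { };

template<>
struct numeric_traits_sketch<long double>
  : public floating_traits_sketch<long double> { };

static_assert(std::is_base_of<floating_traits_sketch<float>,
                              numeric_traits_sketch<float> >::value,
              "float uses the floating-point base class");
```

With this shape, instantiating numeric_traits_sketch for an unsupported type (a pointer, say) would hit the static_assert in the integer base class.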


[PATCH] More PRE compile-time optimizations

2020-11-12 Thread Richard Biener
This fixes a bug in bitmap_list_view which could end up with
a NULL head->current which makes followup searches fail.  Oops.

It also further optimizes the PRE DFS walk by removing useless
stuff and special-casing bitmaps with just one element for
EXECUTE_IF_AND_IN_BITMAP which makes a quite big difference.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
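
The idea behind the special case can be sketched outside GCC as follows. This is an analogy only: std::set stands in for GCC's bitmap, common_elements is a hypothetical helper, and "one element" here means one member, whereas the patch tests for a single bitmap element (a block of bits).

```cpp
#include <set>
#include <vector>

// Return the members present in both A and B, in increasing order.
// When A holds a single member, probe B directly instead of running
// the generic lockstep walk over both containers.
std::vector<unsigned>
common_elements (const std::set<unsigned> &a, const std::set<unsigned> &b)
{
  std::vector<unsigned> out;

  // Fast path: one membership test replaces the two-container iteration.
  if (a.size () == 1)
    {
      unsigned only = *a.begin ();
      if (b.count (only))
        out.push_back (only);
      return out;
    }

  // Generic path: advance both sorted sequences in lockstep.
  std::set<unsigned>::const_iterator ia = a.begin ();
  std::set<unsigned>::const_iterator ib = b.begin ();
  while (ia != a.end () && ib != b.end ())
    {
      if (*ia < *ib)
        ++ia;
      else if (*ib < *ia)
        ++ib;
      else
        {
          out.push_back (*ia);
          ++ia;
          ++ib;
        }
    }
  return out;
}
```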

2020-11-12  Richard Biener  

* bitmap.c (bitmap_list_view): Restore head->current.
* tree-ssa-pre.c (pre_expr_DFS): Elide expr_visited bitmap.
Special-case value expression bitmaps with one element.
(bitmap_find_leader): Likewise.
(sorted_array_from_bitmap_set): Elide expr_visited bitmap.
---
 gcc/bitmap.c   |  5 +
 gcc/tree-ssa-pre.c | 40 +++-
 2 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/gcc/bitmap.c b/gcc/bitmap.c
index 810b80be1ba..c849b0d22f5 100644
--- a/gcc/bitmap.c
+++ b/gcc/bitmap.c
@@ -678,6 +678,11 @@ bitmap_list_view (bitmap head)
 }
 
   head->tree_form = false;
+  if (!head->current)
+{
+  head->current = head->first;
+  head->indx = head->current ? head->current->indx : 0;
+}
 }
 
 /* Convert bitmap HEAD from linked-list view to splay-tree view.
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 9db1b0258f7..e25cec7ffa1 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -806,15 +806,15 @@ bitmap_set_free (bitmap_set_t set)
 }
 
 static void
-pre_expr_DFS (pre_expr expr, bitmap_set_t set, bitmap expr_visited,
- bitmap val_visited, vec<pre_expr> &post);
+pre_expr_DFS (pre_expr expr, bitmap_set_t set, bitmap val_visited,
+ vec<pre_expr> &post);
 
 /* DFS walk leaders of VAL to their operands with leaders in SET, collecting
expressions in SET in postorder into POST.  */
 
 static void
-pre_expr_DFS (unsigned val, bitmap_set_t set, bitmap expr_visited,
- bitmap val_visited, vec<pre_expr> &post)
+pre_expr_DFS (unsigned val, bitmap_set_t set, bitmap val_visited,
+ vec<pre_expr> &post)
 {
   unsigned int i;
   bitmap_iterator bi;
@@ -822,21 +822,25 @@ pre_expr_DFS (unsigned val, bitmap_set_t set, bitmap 
expr_visited,
   /* Iterate over all leaders and DFS recurse.  Borrowed from
  bitmap_find_leader.  */
   bitmap exprset = value_expressions[val];
+  if (!exprset->first->next)
+{
+  EXECUTE_IF_SET_IN_BITMAP (exprset, 0, i, bi)
+   if (bitmap_bit_p (&set->expressions, i))
+ pre_expr_DFS (expression_for_id (i), set, val_visited, post);
+  return;
+}
+
   EXECUTE_IF_AND_IN_BITMAP (exprset, &set->expressions, 0, i, bi)
-pre_expr_DFS (expression_for_id (i),
- set, expr_visited, val_visited, post);
+pre_expr_DFS (expression_for_id (i), set, val_visited, post);
 }
 
 /* DFS walk EXPR to its operands with leaders in SET, collecting
expressions in SET in postorder into POST.  */
 
 static void
-pre_expr_DFS (pre_expr expr, bitmap_set_t set, bitmap expr_visited,
- bitmap val_visited, vec<pre_expr> &post)
+pre_expr_DFS (pre_expr expr, bitmap_set_t set, bitmap val_visited,
+ vec<pre_expr> &post)
 {
-  if (!bitmap_set_bit (expr_visited, get_expression_id (expr)))
-return;
-
   switch (expr->kind)
 {
 case NARY:
@@ -851,7 +855,7 @@ pre_expr_DFS (pre_expr expr, bitmap_set_t set, bitmap 
expr_visited,
   recursed already.  Avoid the costly bitmap_find_leader.  */
if (bitmap_bit_p (&set->values, op_val_id)
&& bitmap_set_bit (val_visited, op_val_id))
- pre_expr_DFS (op_val_id, set, expr_visited, val_visited, post);
+ pre_expr_DFS (op_val_id, set, val_visited, post);
  }
break;
   }
@@ -873,8 +877,7 @@ pre_expr_DFS (pre_expr expr, bitmap_set_t set, bitmap 
expr_visited,
unsigned op_val_id = VN_INFO (op[n])->value_id;
if (bitmap_bit_p (&set->values, op_val_id)
&& bitmap_set_bit (val_visited, op_val_id))
- pre_expr_DFS (op_val_id,
-   set, expr_visited, val_visited, post);
+ pre_expr_DFS (op_val_id, set, val_visited, post);
  }
  }
break;
@@ -896,13 +899,11 @@ sorted_array_from_bitmap_set (bitmap_set_t set)
   /* Pre-allocate enough space for the array.  */
   result.create (bitmap_count_bits (&set->expressions));
 
-  auto_bitmap expr_visited (&grand_bitmap_obstack);
   auto_bitmap val_visited (&grand_bitmap_obstack);
-  bitmap_tree_view (expr_visited);
   bitmap_tree_view (val_visited);
   FOR_EACH_VALUE_ID_IN_SET (set, i, bi)
 if (bitmap_set_bit (val_visited, i))
-  pre_expr_DFS (i, set, expr_visited, val_visited, result);
+  pre_expr_DFS (i, set, val_visited, result);
 
   return result;
 }
@@ -1883,6 +1884,11 @@ bitmap_find_leader (bitmap_set_t set, unsigned int val)
   bitmap_iterator bi;
   bitmap exprset = value_expressions[val];
 
+  if (!exprset->first->next)
+   EXECUTE_

Re: [PATCH] pch: Specify reason of -Winvalid-pch warning [PR86674]

2020-11-12 Thread Jeff Law via Gcc-patches


On 3/9/20 2:55 AM, Nicholas Guriev wrote:
> gcc/c-family/ChangeLog:
>
>   PR pch/86674
>   * c-pch.c (c_common_valid_pch): Use cpp_warning with CPP_W_INVALID_PCH
>   reason to fix -Werror=invalid-pch and -Wno-error=invalid-pch switches.
> ---
>  gcc/c-family/ChangeLog |  6 ++
>  gcc/c-family/c-pch.c   | 40 +++-
>  libcpp/files.c |  2 +-
>  3 files changed, 22 insertions(+), 26 deletions(-)
>
> diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
> index 2e11b..1c83eeb0f 100644
> --- a/gcc/c-family/ChangeLog
> +++ b/gcc/c-family/ChangeLog
> @@ -1,3 +1,9 @@
> +2020-03-09  Nicholas Guriev 
> +
> + PR pch/86674
> + * c-pch.c (c_common_valid_pch): Use cpp_warning with CPP_W_INVALID_PCH
> + reason to fix -Werror=invalid-pch and -Wno-error=invalid-pch switches.

Thanks.  I've added a ChangeLog entry for the libcpp/files.c change,
re-tested and pushed this to the trunk.


Jeff




Re: Compare field offsets in fold_const when checking addresses

2020-11-12 Thread Richard Biener
On Thu, 12 Nov 2020, Jan Hubicka wrote:

> Hi,
> this is the updated patch I am re-testing and plan to commit if it succeeds.
> 
>   * fold-const.c (operand_compare::operand_equal_p): Compare
>   offsets of fields in component_refs when comparing addresses.
>   (operand_compare::hash_operand): Likewise.
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index c47557daeba..273ee25ceda 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -3312,11 +3312,36 @@ operand_compare::operand_equal_p (const_tree arg0, 
> const_tree arg1,
>   case COMPONENT_REF:
> /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
>may be NULL when we're called to compare MEM_EXPRs.  */
> -   if (!OP_SAME_WITH_NULL (0)
> -   || !OP_SAME (1))
> +   if (!OP_SAME_WITH_NULL (0))
>   return false;
> -   flags &= ~OEP_ADDRESS_OF;
> -   return OP_SAME_WITH_NULL (2);
> +   /* Most of time we only need to compare FIELD_DECLs for equality.
> +  However when determining address look into actual offsets.
> +  These may match for unions and unshared record types.  */

looks like you can simplify by doing

  flags &= ~OEP_ADDRESS_OF;

here.  Neither the FIELD_DECL compare nor the offsets need it.

> +   if (!OP_SAME (1))
> + {
> +   if (flags & OEP_ADDRESS_OF)
> + {
> +   if (TREE_OPERAND (arg0, 2)
> +   || TREE_OPERAND (arg1, 2))
> + {
> +   flags &= ~OEP_ADDRESS_OF;
> +   return OP_SAME_WITH_NULL (2);
> + }
> +   tree field0 = TREE_OPERAND (arg0, 1);
> +   tree field1 = TREE_OPERAND (arg1, 1);
> +
> +   if (!operand_equal_p (DECL_FIELD_OFFSET (field0),
> + DECL_FIELD_OFFSET (field1),
> + flags & ~OEP_ADDRESS_OF)
> +   || !operand_equal_p (DECL_FIELD_BIT_OFFSET (field0),
> +DECL_FIELD_BIT_OFFSET (field1),
> +flags & ~OEP_ADDRESS_OF))
> + return false;
> + }
> +   else
> + return false;
> + }

You elided

  flags &= ~OEP_ADDRESS_OF;
- return OP_SAME_WITH_NULL (2);

that was here when OP_SAME (1), please reinstate it.

> +   return true;
>  
>   case BIT_FIELD_REF:
> if (!OP_SAME (0))
> @@ -3787,9 +3812,26 @@ operand_compare::hash_operand (const_tree t, 
> inchash::hash &hstate,
> sflags = flags;
> break;
>  
> + case COMPONENT_REF:
> +   if (sflags & OEP_ADDRESS_OF)
> + {
> +   hash_operand (TREE_OPERAND (t, 0), hstate, flags);
> +   if (TREE_OPERAND (t, 2))
> + hash_operand (TREE_OPERAND (t, 2), hstate,
> +   flags & ~OEP_ADDRESS_OF);
> +   else
> + {
> +   tree field = TREE_OPERAND (t, 1);
> +   hash_operand (DECL_FIELD_OFFSET (field),
> + hstate, flags & ~OEP_ADDRESS_OF);
> +   hash_operand (DECL_FIELD_BIT_OFFSET (field),
> + hstate, flags & ~OEP_ADDRESS_OF);
> + }
> +   return;
> + }
> +   break;
>   case ARRAY_REF:
>   case ARRAY_RANGE_REF:
> - case COMPONENT_REF:
>   case BIT_FIELD_REF:
> sflags &= ~OEP_ADDRESS_OF;
> break;
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.

2020-11-12 Thread Richard Biener via Gcc-patches
On Thu, Nov 12, 2020 at 10:23 AM Hongtao Liu  wrote:
>
> On Thu, Nov 12, 2020 at 5:15 PM Hongtao Liu  wrote:
> >
> > On Thu, Nov 12, 2020 at 5:12 PM Hongtao Liu  wrote:
> > >
> > > On Thu, Nov 12, 2020 at 4:21 PM Uros Bizjak  wrote:
> > > >
> > > > On Thu, Nov 12, 2020 at 3:04 AM Hongtao Liu  wrote:
> > > >
> > > > > > > gcc/ChangeLog:
> > > > > > >
> > > > > > > PR target/97194
> > > > > > > * config/i386/i386-expand.c (ix86_expand_vector_set_var): New 
> > > > > > > function.
> > > > > > > * config/i386/i386-protos.h (ix86_expand_vector_set_var): New 
> > > > > > > Decl.
> > > > > > > * config/i386/predicates.md (vec_setm_operand): New predicate,
> > > > > > > true for const_int_operand or register_operand under TARGET_AVX2.
> > > > > > > > * config/i386/sse.md (vec_set<mode>): Support both constant
> > > > > > > > and variable index vec_set.
> > > > > > >
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > >
> > > > > > > * gcc.target/i386/avx2-vec-set-1.c: New test.
> > > > > > > * gcc.target/i386/avx2-vec-set-2.c: New test.
> > > > > > > * gcc.target/i386/avx512bw-vec-set-1.c: New test.
> > > > > > > * gcc.target/i386/avx512bw-vec-set-2.c: New test.
> > > > > > > * gcc.target/i386/avx512f-vec-set-2.c: New test.
> > > > > > > * gcc.target/i386/avx512vl-vec-set-2.c: New test.
> > > > > >
> > > > > > +;; True for registers, or const_int_operand, used to vec_setm 
> > > > > > expander.
> > > > > > +(define_predicate "vec_setm_operand"
> > > > > > +  (ior (and (match_operand 0 "register_operand")
> > > > > > +(match_test "TARGET_AVX2"))
> > > > > > +   (match_code "const_int")))
> > > > > > +
> > > > > >  ;; True for registers, or 1 or -1.  Used to optimize double-word 
> > > > > > shifts.
> > > > > >  (define_predicate "reg_or_pm1_operand"
> > > > > >(ior (match_operand 0 "register_operand")
> > > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > > > > index b153a87fb98..1798e5dea75 100644
> > > > > > --- a/gcc/config/i386/sse.md
> > > > > > +++ b/gcc/config/i386/sse.md
> > > > > > @@ -8098,11 +8098,14 @@ (define_insn "vec_setv2df_0"
> > > > > > >  (define_expand "vec_set<mode>"
> > > > > > >[(match_operand:V 0 "register_operand")
> > > > > > > (match_operand:<ssescalarmode> 1 "register_operand")
> > > > > > -   (match_operand 2 "const_int_operand")]
> > > > > > +   (match_operand 2 "vec_setm_operand")]
> > > > > >
> > > > > > You need to specify a mode, otherwise a register of any mode can 
> > > > > > pass here.
> > > > > >
> > > > > Yes, theoretically, we only accept integer types. But in 
> > > > > can_vec_set_var_idx_p
> > > > > cut
> > > > > ---
> > > > > bool
> > > > > can_vec_set_var_idx_p (machine_mode vec_mode)
> > > > > {
> > > > >   if (!VECTOR_MODE_P (vec_mode))
> > > > > return false;
> > > > >
> > > > >   machine_mode inner_mode = GET_MODE_INNER (vec_mode);
> > > > >   rtx reg1 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 1);
> > > > >   rtx reg2 = alloca_raw_REG (inner_mode, LAST_VIRTUAL_REGISTER + 2);
> > > > >   rtx reg3 = alloca_raw_REG (VOIDmode, LAST_VIRTUAL_REGISTER + 3);
> > > > >
> > > > >   enum insn_code icode = optab_handler (vec_set_optab, vec_mode);
> > > > >
> > > > >   return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, 
> > > > > reg1)
> > > > >  && insn_operand_matches (icode, 1, reg2)
> > > > >  && insn_operand_matches (icode, 2, reg3);
> > > > > }
> > > > > ---
> > > > >
> > > > > > reg3 is assumed to be VOIDmode, so setting any mode in match_operand 2
> > > > > > will make insn_operand_matches (icode, 2, reg3) fail:
> > > > > ---
> > > > > (gdb) p insn_operand_matches(icode,2,reg3)
> > > > > $5 = false
> > > > > (gdb)
> > > > > ---
> > > > >
> > > > > Maybe we need to change
> > > > >
> > > > > rtx reg3 = alloca_raw_REG (VOIDmode, LAST_VIRTUAL_REGISTER + 3);
> > > > >
> > > > > to
> > > > >
> > > > > rtx reg3 = alloca_raw_REG (SImode, LAST_VIRTUAL_REGISTER + 3);
> > > > >
> > > > > cc Richard Biener, any thoughts?
> > > >
> > > > There are two targets (gcn in gcn-valu.md and s390 in vector.md) that
> > > > specify SImode for operand 2 in vec_setM pattern and allow register
> > > > operands. I wonder if and how they manage to generate the pattern.
> > > >
> > > > Uros.
> > >
> > > > Variable index vec_set was enabled by r11-3486, about two months ago in
> > > > [1]. But for the two targets above, the code has been there since
> > > > GCC 10 (maybe earlier, I just looked at the gcc10 branch), so I don't
> > > > think that code is for [1].
> > >
> > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555905.html
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
> >
> > Correct [1] 
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html
> >
> > --
> > BR,
> > Hongtao
>
> in https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554592.html
>
> It says
>
> > >> +can_vec_set_var_idx_p (enum tree_code code, machine_mode vec_mode,
> > >> +  machine_mode value_mode, machine_mode idx_mode)
> > >
> > > toplevel comment

Re: Compare field offsets in fold_const when checking addresses

2020-11-12 Thread Jan Hubicka
Hi,
this is the updated patch I am re-testing and plan to commit if it succeeds.

* fold-const.c (operand_compare::operand_equal_p): Compare
offsets of fields in component_refs when comparing addresses.
(operand_compare::hash_operand): Likewise.
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c47557daeba..273ee25ceda 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -3312,11 +3312,36 @@ operand_compare::operand_equal_p (const_tree arg0, 
const_tree arg1,
case COMPONENT_REF:
  /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
 may be NULL when we're called to compare MEM_EXPRs.  */
- if (!OP_SAME_WITH_NULL (0)
- || !OP_SAME (1))
+ if (!OP_SAME_WITH_NULL (0))
return false;
- flags &= ~OEP_ADDRESS_OF;
- return OP_SAME_WITH_NULL (2);
+ /* Most of time we only need to compare FIELD_DECLs for equality.
+However when determining address look into actual offsets.
+These may match for unions and unshared record types.  */
+ if (!OP_SAME (1))
+   {
+ if (flags & OEP_ADDRESS_OF)
+   {
+ if (TREE_OPERAND (arg0, 2)
+ || TREE_OPERAND (arg1, 2))
+   {
+ flags &= ~OEP_ADDRESS_OF;
+ return OP_SAME_WITH_NULL (2);
+   }
+ tree field0 = TREE_OPERAND (arg0, 1);
+ tree field1 = TREE_OPERAND (arg1, 1);
+
+ if (!operand_equal_p (DECL_FIELD_OFFSET (field0),
+   DECL_FIELD_OFFSET (field1),
+   flags & ~OEP_ADDRESS_OF)
+ || !operand_equal_p (DECL_FIELD_BIT_OFFSET (field0),
+  DECL_FIELD_BIT_OFFSET (field1),
+  flags & ~OEP_ADDRESS_OF))
+   return false;
+   }
+ else
+   return false;
+   }
+ return true;
 
case BIT_FIELD_REF:
  if (!OP_SAME (0))
@@ -3787,9 +3812,26 @@ operand_compare::hash_operand (const_tree t, 
inchash::hash &hstate,
  sflags = flags;
  break;
 
+   case COMPONENT_REF:
+ if (sflags & OEP_ADDRESS_OF)
+   {
+ hash_operand (TREE_OPERAND (t, 0), hstate, flags);
+ if (TREE_OPERAND (t, 2))
+   hash_operand (TREE_OPERAND (t, 2), hstate,
+ flags & ~OEP_ADDRESS_OF);
+ else
+   {
+ tree field = TREE_OPERAND (t, 1);
+ hash_operand (DECL_FIELD_OFFSET (field),
+   hstate, flags & ~OEP_ADDRESS_OF);
+ hash_operand (DECL_FIELD_BIT_OFFSET (field),
+   hstate, flags & ~OEP_ADDRESS_OF);
+   }
+ return;
+   }
+ break;
case ARRAY_REF:
case ARRAY_RANGE_REF:
-   case COMPONENT_REF:
case BIT_FIELD_REF:
  sflags &= ~OEP_ADDRESS_OF;
  break;
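
To illustrate why comparing offsets matters when only the address is of interest, here is a small standalone sketch. The types value_sketch, pair_a and pair_b are hypothetical and exist only for this example; they play the roles of a union and of two unshared but identically laid out record types.

```cpp
#include <cstddef>

// Two distinct members of a union: different fields, same address.
union value_sketch
{
  int as_int;
  float as_float;
};

// Two unshared record types with the same layout: the 'payload' fields
// are different declarations but sit at the same offset.
struct pair_a { int tag; int payload; };
struct pair_b { int tag; int payload; };

// True when both union members start at the same address.
bool
union_members_share_address ()
{
  value_sketch v;
  v.as_int = 0;
  return static_cast<void *> (&v.as_int) == static_cast<void *> (&v.as_float);
}

// True when the corresponding fields of the two records share an offset.
bool
record_fields_share_offset ()
{
  return offsetof (pair_a, payload) == offsetof (pair_b, payload);
}
```

In both cases a comparison of the field declarations alone would say "different", while a comparison of the offsets says the addresses match.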


Re: Add support for copy specifier to fnspec

2020-11-12 Thread Richard Biener
On Thu, 12 Nov 2020, Jan Hubicka wrote:

> Hi,
> here is the updated patch that replaces 'C' by '1'...'9' so we still have
> room to specify the size.
> As discussed on IRC, this seems the better alternative.
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Richard.

> Honza
> 
> gcc/ChangeLog:
> 
> 2020-11-12  Jan Hubicka  
> 
>   * attr-fnspec.h: Update toplevel comment.
>   (attr_fnspec::arg_direct_p): Accept 1...9.
>   (attr_fnspec::arg_maybe_written_p): Reject 1...9.
>   (attr_fnspec::arg_copied_to_arg_p): New member function.
>   * builtins.c (builtin_fnspec): Update fnspec of block copy.
>   * tree-ssa-alias.c (attr_fnspec::verify): Update.
> 
> diff --git a/gcc/attr-fnspec.h b/gcc/attr-fnspec.h
> index 28135328437..766414a2520 100644
> --- a/gcc/attr-fnspec.h
> +++ b/gcc/attr-fnspec.h
> @@ -41,6 +41,9 @@
>   written and does not escape
>   'w' or 'W' specifies that the memory pointed to by the parameter does 
> not
>   escape
> + '1'...'9' specifies that the memory pointed to by the parameter is
> + copied to memory pointed to by different parameter
> + (as in memcpy).
>   '.' specifies that nothing is known.
> The uppercase letter in addition specifies that the memory pointed to
> by the parameter is not dereferenced.  For 'r' only read applies
> @@ -51,8 +54,8 @@
>   ' 'nothing is known
>   't' the size of value written/read corresponds to the size of
>   of the pointed-to type of the argument type
> - '1'...'9'  the size of value written/read is given by the specified
> - argument
> + '1'...'9'  specifies the size of value written/read is given by the
> + specified argument
>   */
>  
>  #ifndef ATTR_FNSPEC_H
> @@ -122,7 +125,8 @@ public:
>{
>  unsigned int idx = arg_idx (i);
>  gcc_checking_assert (arg_specified_p (i));
> -return str[idx] == 'R' || str[idx] == 'O' || str[idx] == 'W';
> +return str[idx] == 'R' || str[idx] == 'O'
> +|| str[idx] == 'W' || (str[idx] >= '1' && str[idx] <= '9');
>}
>  
>/* True if argument is used.  */
> @@ -161,6 +165,7 @@ public:
>  unsigned int idx = arg_idx (i);
>  gcc_checking_assert (arg_specified_p (i));
>  return str[idx] != 'r' && str[idx] != 'R'
> +&& (str[idx] < '1' || str[idx] > '9')
>  && str[idx] != 'x' && str[idx] != 'X';
>}
>  
> @@ -190,6 +195,21 @@ public:
>  return str[idx + 1] == 't';
>}
>  
> +  /* Return true if memory pointed to by argument is copied to a memory
> + pointed to by a different argument (as in memcpy).
> + In this case set ARG.  */
> +  bool
> +  arg_copied_to_arg_p (unsigned int i, unsigned int *arg)
> +  {
> +unsigned int idx = arg_idx (i);
> +gcc_checking_assert (arg_specified_p (i));
> +if (str[idx] < '1' || str[idx] > '9')
> +  return false;
> +*arg = str[idx] - '1';
> +return true;
> +  }
> +
> +
>/* True if the argument does not escape.  */
>bool
>arg_noescape_p (unsigned int i)
> @@ -230,7 +250,7 @@ public:
>  return str[1] != 'c' && str[1] != 'C';
>}
>  
> -  /* Return true if all memory written by the function 
> +  /* Return true if all memory written by the function
>   is specified by fnspec.  */
>bool
>global_memory_written_p ()
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index da25343beb1..4ec1766cffd 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -12939,16 +12939,16 @@ builtin_fnspec (tree callee)
>argument.  */
>case BUILT_IN_STRCAT:
>case BUILT_IN_STRCAT_CHK:
> - return "1cW R ";
> + return "1cW 1 ";
>case BUILT_IN_STRNCAT:
>case BUILT_IN_STRNCAT_CHK:
> - return "1cW R3";
> + return "1cW 13";
>case BUILT_IN_STRCPY:
>case BUILT_IN_STRCPY_CHK:
> - return "1cO R ";
> + return "1cO 1 ";
>case BUILT_IN_STPCPY:
>case BUILT_IN_STPCPY_CHK:
> - return ".cO R ";
> + return ".cO 1 ";
>case BUILT_IN_STRNCPY:
>case BUILT_IN_MEMCPY:
>case BUILT_IN_MEMMOVE:
> @@ -12957,15 +12957,15 @@ builtin_fnspec (tree callee)
>case BUILT_IN_STRNCPY_CHK:
>case BUILT_IN_MEMCPY_CHK:
>case BUILT_IN_MEMMOVE_CHK:
> - return "1cO3R3";
> + return "1cO313";
>case BUILT_IN_MEMPCPY:
>case BUILT_IN_MEMPCPY_CHK:
> - return ".cO3R3";
> + return ".cO313";
>case BUILT_IN_STPNCPY:
>case BUILT_IN_STPNCPY_CHK:
> - return ".cO3R3";
> + return ".cO313";
>case BUILT_IN_BCOPY:
> - return ".cR3O3";
> + return ".c23O3";
>case BUILT_IN_BZERO:
>   return ".cO2";
>case BUILT_IN_MEMCMP:
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index e64011d04df..b1e8e5b5352 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -3797,6 +3797,8 @@ attr_fnspec::verify ()
>default:
>   err = true;
>  
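
As a reading aid for the new specifier, the following standalone sketch decodes it from a fnspec string. It is not the attr_fnspec class: arg_copied_to_arg_sketch is a hypothetical free function, and the layout (two characters describing the return value, then two characters per argument) is an assumption inferred from the strings in the builtins.c hunk.

```cpp
#include <cstddef>
#include <cstring>

// Assumed layout of a fnspec string: 2 characters for the return value,
// followed by 2 characters for each argument.
static const std::size_t return_desc_size = 2;
static const std::size_t arg_desc_size = 2;

// If argument I is described by '1'...'9', store the zero-based index of
// the destination argument in *ARG and return true; otherwise return false.
bool
arg_copied_to_arg_sketch (const char *fnspec, unsigned int i,
                          unsigned int *arg)
{
  std::size_t idx = return_desc_size + arg_desc_size * i;
  if (idx >= std::strlen (fnspec))
    return false;                 // argument is not described at all
  char c = fnspec[idx];
  if (c < '1' || c > '9')
    return false;                 // some other specifier ('O', 'R', ...)
  *arg = c - '1';                 // '1' names the first argument (index 0)
  return true;
}
```

For the block copy string "1cO313" this reports that argument 1 (the source) is copied to argument 0, and for the bcopy string ".c23O3" that argument 0 is copied to argument 1.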

Re: Compare field offsets in fold_const when checking addresses

2020-11-12 Thread Jan Hubicka
> > * fold-const.c (operand_compare::operand_equal_p): When comparing 
> > addresses
> > look into field offsets for COMPONENT_REFs.
> > (operand_compare::hash_operand): Likewise.
> > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> > index c47557daeba..a4e8cccb1b7 100644
> > --- a/gcc/fold-const.c
> > +++ b/gcc/fold-const.c
> > @@ -3312,9 +3312,41 @@ operand_compare::operand_equal_p (const_tree arg0, 
> > const_tree arg1,
> > case COMPONENT_REF:
> >   /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
> >  may be NULL when we're called to compare MEM_EXPRs.  */
> > - if (!OP_SAME_WITH_NULL (0)
> > - || !OP_SAME (1))
> > + if (!OP_SAME_WITH_NULL (0))
> > return false;
> > + /* Most of time we only need to compare FIELD_DECLs for equality.
> > +However when determining address look into actual offsets.
> > +These may match for unions and unshared record types.  */
> > + if (!OP_SAME (1))
> > +   {
> > + if (flags & OEP_ADDRESS_OF)
> > +   {
> 
> actually if OP2 is not NULL for both you can just compare that (and that's
> more correct then).
> 
> > + tree field0 = TREE_OPERAND (arg0, 1);
> > + tree field1 = TREE_OPERAND (arg1, 1);
> > + tree type0 = DECL_CONTEXT (field0);
> > + tree type1 = DECL_CONTEXT (field1);
> > +
> > + if (TREE_CODE (type0) == RECORD_TYPE
> > + && DECL_BIT_FIELD_REPRESENTATIVE (field0))
> > +   field0 = DECL_BIT_FIELD_REPRESENTATIVE (field0);
> > + if (TREE_CODE (type1) == RECORD_TYPE
> > + && DECL_BIT_FIELD_REPRESENTATIVE (field1))
> > +   field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
> 
> Why does the representative matter?  For a 32bit bitfield if you'd
> have two addresses at 8bit boundary but different you'd make them
> equal this way.  Soo ...
> 
> > + /* Assume that different FIELD_DECLs never overlap within a
> > +RECORD_TYPE.  */
> > + if (type0 == type1 && TREE_CODE (type0) == RECORD_TYPE)
> > +   return false;
> 
> this isn't really about "overlap", OEP_ADDRESS_OF is just about
> the address (not its extent).

We discussed this with Jakub, so I have already tested a version dropping
both of these, but indeed I should check for operand 2.  Will do that
now.

Thanks!
Honza
> 
> > + if (!operand_equal_p (DECL_FIELD_OFFSET (field0),
> > +   DECL_FIELD_OFFSET (field1),
> > +   flags & ~OEP_ADDRESS_OF)
> > + || !operand_equal_p (DECL_FIELD_BIT_OFFSET (field0),
> > +  DECL_FIELD_BIT_OFFSET (field1),
> > +  flags & ~OEP_ADDRESS_OF))
> > +   return false;
> 
> So this should suffice (on the original fields).
> 
> > +   }
> > + else
> > +   return false;
> > +   }
> >   flags &= ~OEP_ADDRESS_OF;
> >   return OP_SAME_WITH_NULL (2);
> >  
> > @@ -3787,9 +3819,28 @@ operand_compare::hash_operand (const_tree t, 
> > inchash::hash &hstate,
> >   sflags = flags;
> >   break;
> >  
> > +   case COMPONENT_REF:
> > + if (flags & OEP_ADDRESS_OF)
> > +   {
> > + tree field = TREE_OPERAND (t, 1);
> > + tree type = DECL_CONTEXT (field);
> > +
> > + if (TREE_CODE (type) == RECORD_TYPE
> > + && DECL_BIT_FIELD_REPRESENTATIVE (field))
> > +   field = DECL_BIT_FIELD_REPRESENTATIVE (field);
> 
> see above.
> 
> > + hash_operand (TREE_OPERAND (t, 0), hstate, flags);
> > + hash_operand (DECL_FIELD_OFFSET (field),
> > +   hstate, flags & ~OEP_ADDRESS_OF);
> > + hash_operand (DECL_FIELD_BIT_OFFSET (field),
> > +   hstate, flags & ~OEP_ADDRESS_OF);
> > + hash_operand (TREE_OPERAND (t, 2), hstate,
> > +   flags & ~OEP_ADDRESS_OF);
> 
> otherwise this looks ok.
> 
> > + return;
> > +   }
> > + break;
> > case ARRAY_REF:
> > case ARRAY_RANGE_REF:
> > -   case COMPONENT_REF:
> > case BIT_FIELD_REF:
> >   sflags &= ~OEP_ADDRESS_OF;
> >   break;
> > 
> 
> -- 
> Richard Biener 
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imend

