[PATCH PR93674]Avoid introducing IV of enumeral type in case of -fstrict-enums

2020-03-02 Thread bin.cheng
Hi,
This is a simple fix for PR93674.  It adds candidates carefully for
enumeral-type iv_use in the case of -fstrict-enums; it also avoids computing
or replacing an iv_use with an enumeral-type candidate, so that no IV of
enumeral type is introduced with the -fstrict-enums option.

A testcase is also added.  Bootstrapped and tested on x86_64.  Any comments?

Thanks,
bin
2020-03-02  Bin Cheng  

PR tree-optimization/93674
* tree-ssa-loop-ivopts.c (add_iv_candidate_for_use): Add candidate
for enumeral type iv_use converted from other iv.
(get_computation_cost, may_eliminate_iv): Avoid compute, eliminate
iv_use with enumeral type iv_cand in case of -fstrict-enums.

gcc/testsuite
2020-03-02  Bin Cheng  

PR tree-optimization/93674
* g++.dg/pr93674.C: New test.

pr93674-20200302.txt
Description: Binary data


Re: [PATCH] libstdc++: P0769R2 Add shift to <algorithm>

2020-03-02 Thread Stephan Bergmann

On 21/02/2020 20:29, Patrick Palka wrote:

diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
b/libstdc++-v3/include/bits/ranges_algo.h
index 7de1072abf0..c36afc6e19b 100644
--- a/libstdc++-v3/include/bits/ranges_algo.h
+++ b/libstdc++-v3/include/bits/ranges_algo.h
@@ -3683,6 +3683,54 @@ namespace ranges
inline constexpr __prev_permutation_fn prev_permutation{};
  
  } // namespace ranges

+
+  template<class ForwardIterator>
+constexpr ForwardIterator
+shift_left(ForwardIterator __first, ForwardIterator __last,
+  typename iterator_traits<ForwardIterator>::difference_type __n)
+{
+  __glibcxx_assert(__n >= 0);
+  if (__n == 0)
+   return __last;
+
+  auto __mid = ranges::next(__first, __n, __last);
+  if (__mid == __last)
+   return __first;
+  return std::move(std::move(__mid), std::move(__last), std::move(__first));
+}
+
+  template<class ForwardIterator>
+constexpr ForwardIterator
+shift_right(ForwardIterator __first, ForwardIterator __last,
+   typename iterator_traits<ForwardIterator>::difference_type __n)
+{
+  __glibcxx_assert(__n >= 0);
+  if (__n == 0)
+   return __first;
+
+  using _Cat = iterator_traits<ForwardIterator>::iterator_category;


^ FYI, the above line causes recent Clang 10 trunk with -std=c++20 to 
fail due to a "missing" typedef



+  if constexpr (derived_from<_Cat, bidirectional_iterator_tag>)
+   {
+ auto __mid = ranges::next(__last, -__n, __first);
+ if (__mid == __first)
+   return __last;
+ return std::move_backward(std::move(__first), std::move(__mid),
+   std::move(__last));
+   }
+  else
+   {
+ auto __result = ranges::next(__first, __n, __last);
+ if (__result == __last)
+   return __last;
+ auto __dest = __result;
+ do
+   __dest = ranges::swap_ranges(__first, __result,
+std::move(__dest), __last).in2;
+ while (__dest != __last);
+ return __result;
+   }
+}
+
  _GLIBCXX_END_NAMESPACE_VERSION
  } // namespace std
  #endif // concepts
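
For readers following the quoted hunk, here is a minimal standalone sketch of the shift_left logic it adds. It is not the libstdc++ code: the name shift_left_sketch is hypothetical, and a plain bounded loop stands in for ranges::next(__first, __n, __last) so that it builds without C++20 library support.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the patch's shift_left: moves the elements of
// [first, last) left by n positions and returns the end of the shifted range.
template<typename It>
It shift_left_sketch(It first, It last,
                     typename std::iterator_traits<It>::difference_type n)
{
  assert(n >= 0);
  if (n == 0)
    return last;

  // Bounded advance: step at most n times, stopping early at last.
  It mid = first;
  for (; n > 0 && mid != last; --n)
    ++mid;

  // Shifting by the whole length (or more) leaves nothing to keep.
  if (mid == last)
    return first;

  // Three-argument std::move from <algorithm> moves [mid, last) to first.
  return std::move(mid, last, first);
}

// Convenience wrapper for trying the sketch on a vector of ints.
std::vector<int> shifted_prefix(std::vector<int> v, int n)
{
  std::vector<int>::iterator new_end
    = shift_left_sketch(v.begin(), v.end(), n);
  return std::vector<int>(v.begin(), new_end);
}
```

Calling shifted_prefix on {1, 2, 3, 4, 5} with n = 2 yields the three surviving elements 3, 4, 5; with n equal to or larger than the length, the result is empty.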




Re: gcov: reduce code quality loss by reproducible topn merging [PR92924]

2020-03-02 Thread Martin Liška

On 2/28/20 12:34 AM, Gerald Pfeifer wrote:

Okay?  Or does this qualify as obvious?


The patch seems to me obvious. Please install it.

Martin


Re: [PATCH] [9/10 Regression] lto: Also copy .note.gnu.property section

2020-03-02 Thread Richard Biener
On Sat, Feb 29, 2020 at 2:00 PM H.J. Lu  wrote:
>
> On Fri, Feb 28, 2020 at 7:38 AM H.J. Lu  wrote:
> >
> > On Fri, Feb 28, 2020 at 6:30 AM H.J. Lu  wrote:
> > >
> > > When generating the separate file with LTO debug sections, we should
> > > also copy .note.gnu.property section.
> > >
> > > OK for master if there is no regression?
> > >
> > > Thanks.
> > >
> > > H.J.
> > > ---
> > > libiberty/
> > >
> > > PR lto/93966
> > > * simple-object.c (handle_lto_debug_sections): Also copy
> > > .note.gnu.property section.
> > >
> >
> > The test will fail on a non-CET-enabled OS.  Here is the updated patch
> > without a testcase.  OK for master and backport to GCC 8/9 branches?
> >
>
> This is a GCC 9/10 regression introduced by early LTO debug patches.  Is my
> patch:
>
> https://gcc.gnu.org/ml/gcc-patches/2020-02/msg01626.html
>
> OK for master and backport for GCC 9 branch?

OK everywhere.

Thanks,
Richard.

> Thanks.
>
> --
> H.J.


Re: GCC 8 backports

2020-03-02 Thread Martin Liška

Hi.

There's one patch that was approved by Jakub before
another RC of GCC 8.4.0.

Martin
From 5da6f38276fac87c89d86e0d447aefb7058d1880 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 28 Feb 2020 17:52:57 +0100
Subject: [PATCH] Backport 08bf7bde9f2987b1c623d272cc71fc14a1622442

gcc/ChangeLog:

2020-02-28  Martin Liska  

	PR other/93965
	* configure.ac: Improve detection of ld_date by requiring
	either two dashes or none.
	* configure: Regenerate.
---
 gcc/configure    | 2 +-
 gcc/configure.ac | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/configure b/gcc/configure
index 7313088fc2c..97ba7d7d69c 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -22785,7 +22785,7 @@ if test $in_tree_ld != yes ; then
   ld_vers=`echo $ld_ver | sed -n \
 	  -e 's,^.*[	 ]\([0-9][0-9]*\.[0-9][0-9]*.*\)$,\1,p'`
 fi
-ld_date=`echo $ld_ver | sed -n 's,^.*\([2-9][0-9][0-9][0-9]\)[-]*\([01][0-9]\)[-]*\([0-3][0-9]\).*$,\1\2\3,p'`
+ld_date=`echo $ld_ver | sed -n 's,^.*\([2-9][0-9][0-9][0-9]\)\(-*\)\([01][0-9]\)\2\([0-3][0-9]\).*$,\1\3\4,p'`
 ld_vers_major=`expr "$ld_vers" : '\([0-9]*\)'`
 ld_vers_minor=`expr "$ld_vers" : '[0-9]*\.\([0-9]*\)'`
 ld_vers_patch=`expr "$ld_vers" : '[0-9]*\.[0-9]*\.\([0-9]*\)'`
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 9bed32ad43f..d6f2d5b2ed0 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -2580,7 +2580,7 @@ if test $in_tree_ld != yes ; then
   ld_vers=`echo $ld_ver | sed -n \
 	  -e 's,^.*[	 ]\([0-9][0-9]*\.[0-9][0-9]*.*\)$,\1,p'`
 fi
-ld_date=`echo $ld_ver | sed -n 's,^.*\([2-9][0-9][0-9][0-9]\)[-]*\([01][0-9]\)[-]*\([0-3][0-9]\).*$,\1\2\3,p'`
+ld_date=`echo $ld_ver | sed -n 's,^.*\([2-9][0-9][0-9][0-9]\)\(-*\)\([01][0-9]\)\2\([0-3][0-9]\).*$,\1\3\4,p'`
 ld_vers_major=`expr "$ld_vers" : '\([0-9]*\)'`
 ld_vers_minor=`expr "$ld_vers" : '[0-9]*\.\([0-9]*\)'`
 ld_vers_patch=`expr "$ld_vers" : '[0-9]*\.[0-9]*\.\([0-9]*\)'`
-- 
2.25.1
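
The "either two dashes or none" rule in the new sed expression relies on a backreference: the separator after the year is captured, and the same text must reappear after the month. A rough C++ illustration of that idea follows. The helper name extract_ld_date is hypothetical, and unlike the sed command (which has a greedy leading ^.*) this sketch returns the first match rather than the last.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper, not part of GCC: pull a YYYYMMDD date out of a linker
// version string, accepting "YYYYMMDD" or "YYYY-MM-DD" but not a mixture.
std::string extract_ld_date(const std::string& ld_ver)
{
  assert(!ld_ver.empty());
  // Group 2 captures the separator (zero or more dashes); \2 then demands
  // the identical separator between month and day.
  static const std::regex re(
      "([2-9][0-9]{3})(-*)([01][0-9])\\2([0-3][0-9])");
  std::smatch m;
  if (!std::regex_search(ld_ver, m, re))
    return "";
  return m[1].str() + m[3].str() + m[4].str();
}
```

With this pattern, a string containing "2020-02-20" or "20200220" yields "20200220", while a half-dashed "2020-0220" yields an empty string.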



Re: Minor regression due to recent IRA changes

2020-03-02 Thread Richard Biener
On Sat, Feb 29, 2020 at 8:35 PM Jeff Law  wrote:
>
> On Sun, 2020-03-01 at 01:47 +0900, Oleg Endo wrote:
> > On Sat, 2020-02-29 at 09:38 -0700, Jeff Law wrote:
> > > It really would have just been a workaround for some of the R0 issues
> > > anyway.  I think at its core R0 on the SH probably needs to be treated
> > > more like a temporary rather than a general register.  But that's
> > > probably a huge change, both in terms of just getting it working right
> > > and in terms of addressing the code quality regressions that would
> > > introduce.
> > >
> >
> > I think one of the major issues is that R0 is a constraint in several
> > addressing modes for memory accesses.  I believe I once had the idea of
> > hiding R0 from RA ... then insert reg-reg copies (to load R0) after
> > RA/reload ... and then somehow do back propagation to get rid of the
> > reg-reg copies again.  Another idea was to run a pre-RA pass to pre-
> > allocate all R0 things.  But I think it's all just running in sqrt(1)
> > circles after all.
> Yup.  That was roughly what I was thinking and roughly the worry I had with
> trying to squash out the quality regressions.  But it may ultimately be the
> only way to really resolve these issues.

One could also simply pessimize R0 for RA via either an existing mechanism
or a new target hook ...

> DJ's work on the m32c IIRC might be useful if you do try to chase this stuff
> down.  Essentially there weren't really enough registers.  So he had the port
> pretend to have more than it really did, then had a post-reload pass to do the
> final allocation into the target's actual register file.
>
> jeff
>


[PATCH] coroutines: Don't make duplicate frame copies of awaitables.

2020-03-02 Thread Iain Sandoe
Hi,

this corrects a thinko that seemed initially to be a missed
optimisation, but turns out to lead to wrong code in some
cases.

tested on x86_64 darwin, linux and powerpc linux
OK for trunk?
thanks
Iain

In general, we need to manage the lifetime of compiler-
generated awaitable instances in the coroutine frame, since
these must persist across suspension points.

However, it is quite possible that the user might provide the
awaitable instances, either as function params or as a local
variable.  We will already generate a frame entry for these as
required.

At present, under this circumstance, we are duplicating these
awaitables, initialising a second frame copy for them (which we
then subsequently destroy manually after the suspension point).
That's not efficient - so an undesirable thinko in the first place.
However, there is also an actual bug; if the compiler elects to
elide the copy (which is perfectly legal), it does not have visibility
of the manual management of the post-suspend destruction
- this subsequently leads to double-free errors.

The solution is not to make the second copy (as noted, params
and local vars already have frame copies with managed lifetimes).

gcc/cp/ChangeLog:

2020-03-02  Iain Sandoe  

* coroutines.cc (build_co_await): Do not build frame
awaitable proxy vars when the co_await expression is
a function parameter or local var.
(co_await_expander): Do not initialise a frame var with
itself.
(transform_await_expr): Only substitute the awaitable
frame var if it's needed.
(register_awaits): Do not make frame copies for param
or local vars that are awaitables.

gcc/testsuite/ChangeLog:

2020-03-02  Iain Sandoe  

* g++.dg/coroutines/torture/func-params-09-awaitable-parms.C: New test.
* g++.dg/coroutines/torture/local-var-5-awaitable.C: New test.
---
 gcc/cp/coroutines.cc  |  89 ++-
 .../torture/func-params-09-awaitable-parms.C  | 105 ++
 .../torture/local-var-5-awaitable.C   |  73 
 3 files changed, 241 insertions(+), 26 deletions(-)
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/torture/func-params-09-awaitable-parms.C
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/torture/local-var-5-awaitable.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index ffc33aa1534..3e06f079787 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -738,8 +738,21 @@ build_co_await (location_t loc, tree a, suspend_point_kind 
suspend_kind)
   /* To complete the lookups, we need an instance of 'e' which is built from
  'o' according to [expr.await] 3.4.  However, we don't want to materialize
  'e' here (it might need to be placed in the coroutine frame) so we will
- make a temp placeholder instead.  */
-  tree e_proxy = build_lang_decl (VAR_DECL, NULL_TREE, o_type);
+ make a temp placeholder instead.  If 'o' is a parameter or a local var,
+ then we do not need an additional var (parms and local vars are already
+ copied into the frame and will have lifetimes according to their original
+ scope).  */
+  tree e_proxy = STRIP_NOPS (o);
+  if (INDIRECT_REF_P (e_proxy))
+e_proxy = TREE_OPERAND (e_proxy, 0);
+  if (TREE_CODE (e_proxy) == PARM_DECL
+  || (TREE_CODE (e_proxy) == VAR_DECL && !DECL_ARTIFICIAL (e_proxy)))
+e_proxy = o;
+  else
+{
+  e_proxy = build_lang_decl (VAR_DECL, NULL_TREE, o_type);
+  DECL_ARTIFICIAL (e_proxy) = true;
+}
 
   /* I suppose we could check that this is contextually convertible to bool.  
*/
   tree awrd_func = NULL_TREE;
@@ -1452,10 +1465,17 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, 
void *d)
  tf_warning_or_error);
 
   tree stmt_list = NULL;
+  tree t_expr = STRIP_NOPS (expr);
+  tree r;
+  if (t_expr == var)
+dtor = NULL_TREE;
+  else
+{
   /* Initialize the var from the provided 'o' expression.  */
-  tree r = build2 (INIT_EXPR, await_type, var, expr);
+r = build2 (INIT_EXPR, await_type, var, expr);
   r = coro_build_cvt_void_expr_stmt (r, loc);
   append_to_statement_list (r, &stmt_list);
+}
 
   /* Use the await_ready() call to test if we need to suspend.  */
   tree ready_cond = TREE_VEC_ELT (awaiter_calls, 0); /* await_ready().  */
@@ -1687,20 +1707,26 @@ transform_await_expr (tree await_expr, await_xform_data 
*xform)
  and an empty pointer for void return.  */
   TREE_OPERAND (await_expr, 0) = ah;
 
-  /* Get a reference to the initial suspend var in the frame.  */
-  tree as_m
-= lookup_member (coro_frame_type, si->await_field_id,
-/*protect=*/1, /*want_type=*/0, tf_warning_or_error);
-  tree as = build_class_member_access_expr (xform->actor_frame, as_m, 
NULL_TREE,
-   true, tf_warning_or_error);
+  /* If we have a frame var for the awaitable, get a reference to it.  */
+  proxy_replace 

Re: [PATCH PR93674]Avoid introducing IV of enumeral type in case of -fstrict-enums

2020-03-02 Thread Richard Biener
On Mon, Mar 2, 2020 at 9:07 AM bin.cheng  wrote:
>
> Hi,
> This is a simple fix for PR93674.  It adds candidates carefully for
> enumeral-type iv_use in the case of -fstrict-enums; it also avoids computing
> or replacing an iv_use with an enumeral-type candidate, so that no IV of
> enumeral type is introduced with the -fstrict-enums option.
>
> Testcase is also added.  Bootstrap and test on x86_64.  Any comment?

I think we should avoid enum-typed (or bool-typed) IVs in general, not just
with -fstrict-enums.  That said, the ENUMERAL_TYPE checks should be
!(INTEGER_TYPE || POINTER_TYPE_P) checks.

+  /* Check if cand can represent values of use for strict enums.  */
+  else if (TREE_CODE (ctype) == ENUMERAL_TYPE && flag_strict_enums)
+{

if we don't have enum-typed IV candidates then the computation should
be carried out in INTEGER_TYPE and then be converted to enum type.
So why are this and the may_eliminate_iv hunks necessary?

Richard.
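
As a source-level analogy only (ivopts works on GIMPLE, not on C++ source), the shape suggested above is an induction variable kept in an integer type, with the conversion to the enum type happening at the use. The enum and function below are hypothetical and are not taken from the PR's testcase.

```cpp
#include <cassert>

// Hypothetical enum for illustration; NumColors is one past the last value.
enum Color { Red, Green, Blue, NumColors };

// Walk all enumerators with an int induction variable and convert to the
// enum type only where a Color value is actually needed.
int count_colors()
{
  int visited = 0;
  for (int i = Red; i != NumColors; ++i)
    {
      Color c = static_cast<Color>(i);  // conversion at the point of use
      if (c == Red || c == Green || c == Blue)
        ++visited;
    }
  assert(visited <= NumColors);
  return visited;
}
```

The loop counter never has enumeral type, so its increment and exit test do not depend on which values the enum type is assumed to hold.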

> Thanks,
> bin
> 2020-03-02  Bin Cheng  
>
> PR tree-optimization/93674
> * tree-ssa-loop-ivopts.c (add_iv_candidate_for_use): Add candidate
> for enumeral type iv_use converted from other iv.
> (get_computation_cost, may_eliminate_iv): Avoid compute, eliminate
> iv_use with enumeral type iv_cand in case of -fstrict-enums.
>
> gcc/testsuite
> 2020-03-02  Bin Cheng  
>
> PR tree-optimization/93674
> * g++.dg/pr93674.C: New test.


[PATCH] coroutines: Update lambda capture handling to n4849.

2020-03-02 Thread Iain Sandoe
Hi,

In the absence of specific comment on the handling of closures I'd
implemented something more than was intended (extending the lifetime
of lambda capture-by-copy vars to the duration of the coro).

After discussion at WG21 in February and by email, the correct handling
is to treat the closure "this" pointer the same way as for a regular one,
and thus it is the user's responsibility to ensure that the lambda capture
object has suitable lifetime for the coroutine.  It is noted that users
frequently get this wrong, so it would be a good thing to revisit for C++23.

This patch removes the additional copying behaviour for lambda capture-by-
copy vars.

@JunMa, this supersedes your fix to the aliases, which should no longer be
necessary, but I've added your testcases to this patch.

gcc/cp/ChangeLog:

2020-03-02  Iain Sandoe  

* coroutines.cc (struct local_var_info): Adjust to remove the
reference to the captured var, and just to note that this is a
lambda capture proxy.
(transform_local_var_uses): Handle lambda captures specially.
(struct param_frame_data): Add a visited set.
(register_param_uses): Also check for param uses in lambda
capture proxies.
(struct local_vars_frame_data): Remove captures list.
(register_local_var_uses): Handle lambda capture proxies by
noting and bypassing them.
(morph_fn_to_coro): Update to remove lifetime extension of
lambda capture-by-copy vars.

gcc/testsuite/ChangeLog:

2020-03-02  Iain Sandoe  
Jun Ma 

* g++.dg/coroutines/torture/class-05-lambda-capture-copy-local.C:
Update to have multiple uses for the lambda parm.
* g++.dg/coroutines/torture/lambda-09-init-captures.C: New test.
* g++.dg/coroutines/torture/lambda-10-mutable.C: New test.

---
 gcc/cp/coroutines.cc  | 174 +++---
 .../class-05-lambda-capture-copy-local.C  |   4 +-
 .../torture/lambda-09-init-captures.C |  55 ++
 .../coroutines/torture/lambda-10-mutable.C|  48 +
 4 files changed, 171 insertions(+), 110 deletions(-)
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/torture/lambda-09-init-captures.C
 create mode 100644 gcc/testsuite/g++.dg/coroutines/torture/lambda-10-mutable.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 3e06f079787..303e6e83d54 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1783,7 +1783,7 @@ struct local_var_info
   tree field_id;
   tree field_idx;
   tree frame_type;
-  tree captured;
+  bool is_lambda_capture;
   location_t def_loc;
 };
 
@@ -1828,6 +1828,14 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
  cp_walk_tree (&DECL_SIZE_UNIT (lvar), transform_local_var_uses, d,
NULL);
 
+   /* For capture proxies, this could include the decl value expr.  */
+   if (local_var.is_lambda_capture)
+ {
+   tree ve = DECL_VALUE_EXPR (lvar);
+   cp_walk_tree (&ve, transform_local_var_uses, d, NULL);
+   continue; /* No frame entry for this.  */
+ }
+
  /* TODO: implement selective generation of fields when vars are
 known not-used.  */
  if (local_var.field_id == NULL_TREE)
@@ -1842,8 +1850,9 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
  local_var.field_idx = fld_idx;
}
   cp_walk_tree (&BIND_EXPR_BODY (*stmt), transform_local_var_uses, d, 
NULL);
+
   /* Now we have processed and removed references to the original vars,
-we can drop those from the bind.  */
+we can drop those from the bind - leaving capture proxies alone.  */
   for (tree *pvar = &BIND_EXPR_VARS (*stmt); *pvar != NULL;)
{
  bool existed;
@@ -1851,10 +1860,24 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
= lvd->local_var_uses->get_or_insert (*pvar, &existed);
  gcc_checking_assert (existed);
 
+ /* Leave lambda closure captures alone, we replace the *this
+pointer with the frame version and let the normal process
+deal with the rest.  */
+ if (local_var.is_lambda_capture)
+   {
+ pvar = &DECL_CHAIN (*pvar);
+ continue;
+   }
+
+ /* It's not used, but we can let the optimizer deal with that.  */
  if (local_var.field_id == NULL_TREE)
-   pvar = &DECL_CHAIN (*pvar); /* Wasn't used.  */
+   {
+ pvar = &DECL_CHAIN (*pvar);
+ continue;
+   }
 
- *pvar = DECL_CHAIN (*pvar); /* discard this one, we replaced it.  */
+ /* Discard this one, we replaced it.  */
+ *pvar = DECL_CHAIN (*pvar);
}
 
   *do_subtree = 0; /* We've done the body already.  */
@@ -1884,6 +1907,9 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
   if (local_var_i == NULL)
 return NULL_TREE

Re: [mid-end] Add notes to dataflow insn info when re-emitting (PR92410)

2020-03-02 Thread Martin Liška

On 2/27/20 7:55 AM, Roman Zhuykov wrote:

Has anybody considered backporting this?


I'm for the backport (after 8.4.0 is released).
I would like to get permission from the release managers for this patch.

Martin


Re: [PATCH 2/4 GCC11] Add target hook stride_dform_valid_p

2020-03-02 Thread Richard Sandiford
"Kewen.Lin"  writes:
> on 2020/1/20 下午9:14, Segher Boessenkool wrote:
>> Hi!
>> 
>> On Mon, Jan 20, 2020 at 10:42:12AM +, Richard Sandiford wrote:
>>> "Kewen.Lin"  writes:
>>>> gcc/ChangeLog
>>>>
>>>> 2020-01-16  Kewen Lin
>>>>
>>>> 	* config/rs6000/rs6000.c (TARGET_STRIDE_DFORM_VALID_P): New macro.
>>>> 	(rs6000_stride_dform_valid_p): New function.
>>>> 	* doc/tm.texi: Regenerate.
>>>> 	* doc/tm.texi.in (TARGET_STRIDE_DFORM_VALID_P): New hook.
>>>> 	* target.def (stride_dform_valid_p): New hook.
>>>
>>> It looks like we should be able to derive this information from the normal
>>> legitimate_address_p hook.
>> 
>> Yes, probably.
>> 
>>> Also, "D-form" vs. "X-form" is AFAIK a PowerPC-specific classification.
>>> It would be good to use a more generic term in target-independent code.
>> 
>> Yeah.  X-form is [reg+reg] addressing; D-form is [reg+imm] addressing.
>> We can do simple [reg] addressing in either form as well.  Whether D-form
>> can be used for some access depends on many factors (ISA version, mode of
>> the datum, alignment, and how big the offset is of course).  But the usual
>> legitimate_address_p hook should do fine.  The ivopts code already has an
>> addr_offset_valid_p function, maybe that could be adjusted for this?
>> 
>> 
>> Segher
>> 
>
> Hi Segher and Richard S.,
>
> Sorry for the late response.  Thanks for your comments on the
> legitimate_address_p hook and the function addr_offset_valid_p.  I updated
> the IVOPTs part to use addr_offset_valid_p; although
> rs6000_legitimate_offset_address_p doesn't check strictly all the time
> (e.g. when worst_case is false), it works well with SPEC2017.
> Based on it, the hook is simplified as in the attached patch.

Thanks for the update.  I think it would be better to add a --param
rather than a bool hook though.  Targets can then change the default
(if necessary) using SET_OPTION_IF_UNSET.  The user can override the
default if they want to.

It might also be better to start with an opt-out rather than an opt-in
(i.e. with the default param value being true rather than false).
With a default-off option, it's much harder to tell whether something
has been deliberately turned off or whether no-one's thought about it
either way.  We can always flip the default later if it turns out that
nothing other than rs6000 benefits.

Richard


Re: [PATCH] sccvn: Improve handling of load masked with integer constant [PR93582]

2020-03-02 Thread Richard Biener
On Fri, 28 Feb 2020, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR and discussed on IRC, the following patch is the
> patch that fixes the originally reported issue.
> We have there because of the premature bitfield comparison -> BIT_FIELD_REF
> optimization:
>   s$s4_19 = 0;
>   s.s4 = s$s4_19;
>   _10 = BIT_FIELD_REF ;
>   _13 = _10 & 8;
> and no other s fields are initialized.  If they would be all initialized with
> constants, then my earlier PR93582 bitfield handling patches would handle it
> already, but if at least one bit we ignore after the BIT_AND_EXPR masking
> is not initialized or is initialized earlier to non-constant, we aren't able
> to look through it until combine, which is too late for the warnings on the
> dead code.
> This patch handles BIT_AND_EXPR where the first operand is a SSA_NAME
> initialized with a memory load and second operand is INTEGER_CST, by trying
> a partial def lookup after pushing the ranges of 0 bits in the mask as
> artificial initializers.  In the above case on little-endian, we push
> offset 0 size 3 {} partial def and offset 4 size 4 (the result is unsigned
> char) and then perform normal partial def handling.
> My initial version of the patch failed miserably during bootstrap, because
> data->finish (...) called vn_reference_lookup_or_insert_for_pieces
> which I believe tried to remember the masked value rather than real for the
> reference, or for failed lookup visit_reference_op_load called
> vn_reference_insert.  The following version makes sure we aren't calling
> either of those functions in the masked case, as we don't know anything
> better about the reference from whatever has been discovered when the load
> stmt has been visited, the patch just calls vn_nary_op_insert_stmt on
> failure with the lhs (apparently calling it with the INTEGER_CST doesn't
> work).
> 
> Bootstrapped/regtested on powerpc64{,le}-linux, I've additionally
> gathered statistics of successful BIT_AND_EXPR optimizations (attached, first
> column is uniq -c count, second BITS_PER_WORD, then filename, function name,
> the mask on BIT_AND_EXPR and finally the value it returned).
> Ok for trunk if it also passes bootstrap/regtest on x86_64-linux and
> i686-linux?

Comments below.
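
For orientation, the "ranges of 0 bits in the mask" from the quoted description can be enumerated as below. This is only a sketch of the bit arithmetic, not the sccvn code: the helper name zero_bit_ranges and its (offset, size) pair representation are assumptions, and bit offsets count from the least significant bit as in the little-endian example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: list (offset, size) runs of zero bits within the low
// `width` bits of `mask`, counting offsets from the least significant bit.
std::vector<std::pair<unsigned, unsigned> >
zero_bit_ranges(unsigned long mask, unsigned width)
{
  assert(width <= sizeof(unsigned long) * 8);
  std::vector<std::pair<unsigned, unsigned> > ranges;
  unsigned i = 0;
  while (i < width)
    {
      if ((mask >> i) & 1UL)
        {
          ++i;  // bit is kept by the mask, nothing to record
          continue;
        }
      unsigned start = i;
      while (i < width && !((mask >> i) & 1UL))
        ++i;    // extend the run of masked-out bits
      ranges.push_back(std::make_pair(start, i - start));
    }
  return ranges;
}
```

For mask 8 on an 8-bit value this gives offset 0 size 3 and offset 4 size 4, matching the two artificial partial defs mentioned in the quoted text.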

> 2020-02-28  Jakub Jelinek  
> 
>   PR tree-optimization/93582
>   * tree-ssa-sccvn.h (vn_reference_lookup): Add mask argument.
>   * tree-ssa-sccvn.c (struct vn_walk_cb_data): Add masked_p member,
>   initialize it in the constructor.
>   (vn_walk_cb_data::finish): If masked_p is true, return val instead
>   of calling vn_reference_lookup_or_insert_for_pieces.  Formatting fix.
>   (vn_reference_lookup_pieces): Adjust vn_walk_cb_data initialization.
>   Formatting fix.
>   (vn_reference_lookup): Add mask argument.  If non-NULL, don't call
>   fully_constant_vn_reference_p nor vn_reference_lookup_1 and
>   artificially push partial {} defs for the portions of the mask that
>   contains zeros.
>   (visit_nary_op): Handle BIT_AND_EXPR of a memory load and INTEGER_CST
>   mask.
>   (visit_reference_op_load): Add mask argument, pass it through
>   to vn_reference_lookup.  If non-NULL, don't call vn_reference_insert.
>   (visit_stmt): Adjust visit_reference_op_load caller.  Formatting fix.
> 
>   * gcc.dg/tree-ssa/pr93582-10.c: New test.
>   * gcc.dg/pr93582.c: New test.
>   * gcc.c-torture/execute/pr93582.c: New test.
> 
> --- gcc/tree-ssa-sccvn.h.jj   2020-02-28 11:56:39.506941888 +0100
> +++ gcc/tree-ssa-sccvn.h  2020-02-28 12:01:42.677404902 +0100
> @@ -256,7 +256,7 @@ tree vn_reference_lookup_pieces (tree, a
>vec ,
>vn_reference_t *, vn_lookup_kind);
>  tree vn_reference_lookup (tree, tree, vn_lookup_kind, vn_reference_t *, bool,
> -   tree * = NULL);
> +   tree * = NULL, tree = NULL_TREE);
>  void vn_reference_lookup_call (gcall *, vn_reference_t *, vn_reference_t);
>  vn_reference_t vn_reference_insert_pieces (tree, alias_set_type, tree,
>  vec ,
> --- gcc/tree-ssa-sccvn.c.jj   2020-02-28 11:56:39.506941888 +0100
> +++ gcc/tree-ssa-sccvn.c  2020-02-28 12:53:58.500459041 +0100
> @@ -1686,9 +1686,9 @@ struct pd_data
>  struct vn_walk_cb_data
>  {
>vn_walk_cb_data (vn_reference_t vr_, tree orig_ref_, tree *last_vuse_ptr_,
> -vn_lookup_kind vn_walk_kind_, bool tbaa_p_)
> +vn_lookup_kind vn_walk_kind_, bool tbaa_p_, bool masked_p_)
>  : vr (vr_), last_vuse_ptr (last_vuse_ptr_), last_vuse (NULL_TREE),
> -  vn_walk_kind (vn_walk_kind_), tbaa_p (tbaa_p_),
> +  vn_walk_kind (vn_walk_kind_), tbaa_p (tbaa_p_), masked_p (masked_p_),
>saved_operands (vNULL), first_set (-2), known_ranges (NULL)
> {
>   if (!last_vuse_ptr)
> @@ -1705,6 +1705,7 @@ struct vn_walk_cb_data
>tree last_vuse;
>vn_lookup_kind vn_walk_kind;

Re: RFA: Fix libiberty testsuite failure

2020-03-02 Thread H.J. Lu
On Mon, Jan 20, 2020 at 9:53 AM Ian Lance Taylor  wrote:
>
> kamlesh kumar  writes:
>
> > yes, current expected entry is wrong and
> > Nick's patch corrects that.
>
> Thanks.  Nick, the patch is OK.
>
> Ian

> >> > libiberty/ChangeLog
> >> > 2020-01-20  Nick Clifton  
> >> >
> >> >   * testsuite/demangle-expected: Fix expected demangling.
> >> >
> >> > Index: libiberty/testsuite/demangle-expected
> >> > ===
> >> > --- libiberty/testsuite/demangle-expected (revision 280157)
> >> > +++ libiberty/testsuite/demangle-expected (working copy)
> >> > @@ -1449,4 +1449,4 @@
> >> >  #PR91979 demangling nullptr expression
> >> >
> >> >  _Z3fooILPv0EEvPN9enable_ifIXeqT_LDnEEvE4typeE
> >> > -void foo<(void*)0>(enable_if<((void*)0)==((decltype(nullptr))),
> >> void>::type*)
> >> > +void foo<(void*)0>(enable_if<((void*)0)==(decltype(nullptr)),
> >> void>::type*)
> >>

I checked in the patch for Nick.

-- 
H.J.
From 3bb6abbf4bb98e58663ec6c9bc77ae0bdbac6e41 Mon Sep 17 00:00:00 2001
From: Nick Clifton 
Date: Mon, 2 Mar 2020 03:50:34 -0800
Subject: [PATCH] Fix a libiberty testsuite failure

	* testsuite/demangle-expected: Update expected demangling of
	enable_if pattern.
---
 libiberty/ChangeLog   | 5 +
 libiberty/testsuite/demangle-expected | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index 91ae004005f..4c8b236cf78 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,8 @@
+2020-03-02  Nick Clifton  
+
+	* testsuite/demangle-expected: Update expected demangling of
+	enable_if pattern.
+
 2020-03-02  H.J. Lu  
 
 	PR lto/93966
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index daffe782112..ccadf84e608 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -1449,7 +1449,7 @@ Foo()::{lambda(auto:1)#1}::operator()(char) const::X::fn
 #PR91979 demangling nullptr expression
 
 _Z3fooILPv0EEvPN9enable_ifIXeqT_LDnEEvE4typeE
-void foo<(void*)0>(enable_if<((void*)0)==((decltype(nullptr))), void>::type*)
+void foo<(void*)0>(enable_if<((void*)0)==(decltype(nullptr)), void>::type*)
 
 _ZNK5coro15emptyawEv
 coro1::empty::operator co_await() const
-- 
2.24.1



[PATCH][OBVIOUS] Remove duplicate declaration.

2020-03-02 Thread Martin Liška

Hi.

One obvious patch where I remove a duplicate declaration.

Martin

libgcc/ChangeLog:

2020-03-02  Martin Liska  

* libgcov-interface.c: Remove duplicate
declaration of __gcov_flush_mx.
---
 libgcc/libgcov-interface.c | 1 -
 1 file changed, 1 deletion(-)


diff --git a/libgcc/libgcov-interface.c b/libgcc/libgcov-interface.c
index 49b44d5095b..048b9029ff3 100644
--- a/libgcc/libgcov-interface.c
+++ b/libgcc/libgcov-interface.c
@@ -52,7 +52,6 @@ void __gcov_dump (void) {}
   { src (); }
 
 extern __gthread_mutex_t __gcov_flush_mx ATTRIBUTE_HIDDEN;
-extern __gthread_mutex_t __gcov_flush_mx ATTRIBUTE_HIDDEN;
 
 #ifdef L_gcov_flush
 #ifdef __GTHREAD_MUTEX_INIT



[PATCH][OBVIOUS] Update comment to reflect optimization.

2020-03-02 Thread Martin Liška

Hi.

The patch is about a wrong comment: we can now optimize
the testcase.

Martin

gcc/testsuite/ChangeLog:

2020-03-02  Martin Liska  

* gcc.dg/vect/bb-slp-19.c: The comment
does not align with fact that we started
to SLP the testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-19.c | 1 -
 1 file changed, 1 deletion(-)


diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-19.c b/gcc/testsuite/gcc.dg/vect/bb-slp-19.c
index c2821551c86..db446be7454 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-19.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-19.c
@@ -15,7 +15,6 @@ main1 ()
   unsigned short *pin = &in[0];
   unsigned short *pout = &out[0];
  
-  /* A group of 9 shorts - unsupported for now.  */
   *pout++ = *pin++;
   *pout++ = *pin++;
   *pout++ = *pin++;



Re: [PATCH] libstdc++: P0769R2 Add shift to <algorithm>

2020-03-02 Thread Jonathan Wakely

On 02/03/20 10:03 +0100, Stephan Bergmann wrote:

On 21/02/2020 20:29, Patrick Palka wrote:

diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
b/libstdc++-v3/include/bits/ranges_algo.h
index 7de1072abf0..c36afc6e19b 100644
--- a/libstdc++-v3/include/bits/ranges_algo.h
+++ b/libstdc++-v3/include/bits/ranges_algo.h
@@ -3683,6 +3683,54 @@ namespace ranges
   inline constexpr __prev_permutation_fn prev_permutation{};
 } // namespace ranges
+
+  template<class ForwardIterator>
+constexpr ForwardIterator
+shift_left(ForwardIterator __first, ForwardIterator __last,
+  typename iterator_traits<ForwardIterator>::difference_type __n)
+{
+  __glibcxx_assert(__n >= 0);
+  if (__n == 0)
+   return __last;
+
+  auto __mid = ranges::next(__first, __n, __last);
+  if (__mid == __last)
+   return __first;
+  return std::move(std::move(__mid), std::move(__last), std::move(__first));
+}
+
+  template<class ForwardIterator>
+constexpr ForwardIterator
+shift_right(ForwardIterator __first, ForwardIterator __last,
+   typename iterator_traits<ForwardIterator>::difference_type __n)
+{
+  __glibcxx_assert(__n >= 0);
+  if (__n == 0)
+   return __first;
+
+  using _Cat = iterator_traits<ForwardIterator>::iterator_category;


^ FYI, the above line causes recent Clang 10 trunk with -std=c++20 to 
fail due to a "missing" typedef


Thanks, fixed by this patch, committed to master now.
commit 5fad000324d0bcc87283dc339423bfad6fa42c74
Author: Jonathan Wakely 
Date:   Mon Mar 2 12:18:45 2020 +

libstdc++: Add 'typename' to fix compilation with Clang

* include/bits/ranges_algo.h (shift_right): Add 'typename' to
dependent type.

diff --git a/libstdc++-v3/include/bits/ranges_algo.h b/libstdc++-v3/include/bits/ranges_algo.h
index 8fa4a8a9161..a34f75f53d8 100644
--- a/libstdc++-v3/include/bits/ranges_algo.h
+++ b/libstdc++-v3/include/bits/ranges_algo.h
@@ -3710,7 +3710,7 @@ namespace ranges
   if (__n == 0)
 	return __first;
 
-  using _Cat = iterator_traits<ForwardIterator>::iterator_category;
+  using _Cat = typename iterator_traits<ForwardIterator>::iterator_category;
   if constexpr (derived_from<_Cat, bidirectional_iterator_tag>)
 	{
 	  auto __mid = ranges::next(__last, -__n, __first);
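
For readers unfamiliar with the rule behind this fix, here is a minimal
sketch.  The helper name is_bidirectional is hypothetical and is not part of
libstdc++; it only illustrates that a type nested in a template-dependent
class traditionally needs a 'typename' prefix.  C++20 relaxes the rule in
some contexts, which is presumably why GCC accepted the original line while
Clang did not.

```cpp
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper (not part of libstdc++): reports whether an iterator
// type is at least a bidirectional iterator.
template<typename Iter>
constexpr bool is_bidirectional()
{
  // iterator_traits<Iter>::iterator_category depends on the template
  // parameter Iter, so 'typename' tells the compiler it names a type.
  using Cat = typename std::iterator_traits<Iter>::iterator_category;
  return std::is_base_of<std::bidirectional_iterator_tag, Cat>::value;
}
```

Writing the 'typename' explicitly costs nothing and keeps the code portable
to compilers that do not yet implement the relaxed C++20 rule.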


Re: [PATCH] libstdc++: Fix bogus use of memcmp in ranges::lexicographical_compare (PR 93972)

2020-03-02 Thread Christophe Lyon
On Fri, 28 Feb 2020 at 22:53, Jonathan Wakely  wrote:
>
> On 28/02/20 14:59 -0500, Patrick Palka wrote:
> >We were enabling the memcmp optimization in ranges::lexicographical_compare 
> >for
> >signed integral types and for integral types larger than a byte.  But memcmp
> >gives the wrong answer for arrays of such types.  This patch fixes this 
> >issue by
> >refining the condition that enables the memcmp optimization.  It's now
> >consistent with the corresponding condition used in
> >std::lexicographical_compare.
> >
> >libstdc++-v3/ChangeLog:
> >
> >   PR libstdc++/93972
> >   * include/bits/ranges_algo.h 
> > (__lexicographical_compare_fn::operator()):
> >   Fix condition for when to use memcmp, making it consistent with the
> >   corresponding condition used in std::lexicographical_compare.
> >   * testsuite/25_algorithms/lexicographical_compare/93972.cc: New test.


Hi,

The new test fails on aarch64 and arm, and other targets according to
gcc-testresults.
On aarch64, my log says:
FAIL: 25_algorithms/lexicographical_compare/93972.cc (test for excess errors)
Excess errors:
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(char*&, unsigned
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(char*&, unsigned
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(unsigned char*&,
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(unsigned char*&,
char*&, const long int&)'

UNRESOLVED: 25_algorithms/lexicographical_compare/93972.cc compilation
failed to produce executable

Christophe

> >---
> > libstdc++-v3/include/bits/ranges_algo.h   |   8 +-
> > .../lexicographical_compare/93972.cc  | 169 ++
> > 2 files changed, 175 insertions(+), 2 deletions(-)
> > create mode 100644 
> > libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/93972.cc
> >
> >diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
> >b/libstdc++-v3/include/bits/ranges_algo.h
> >index 05c0851d411..8fa4a8a9161 100644
> >--- a/libstdc++-v3/include/bits/ranges_algo.h
> >+++ b/libstdc++-v3/include/bits/ranges_algo.h
> >@@ -3466,9 +3466,13 @@ namespace ranges
> > {
> >   using _ValueType1 = iter_value_t<_Iter1>;
> >   using _ValueType2 = iter_value_t<_Iter2>;
> >+  // This condition is consistent with the one in
> >+  // __lexicographical_compare_aux in .
> >   constexpr bool __use_memcmp
> >-= ((is_integral_v<_ValueType1> || is_pointer_v<_ValueType1>)
> >-   && is_same_v<_ValueType1, _ValueType2>
> >+= (__is_byte<_ValueType1>::__value
> >+   && __is_byte<_ValueType2>::__value
> >+   && !__gnu_cxx::__numeric_traits<_ValueType1>::__is_signed
> >+   && !__gnu_cxx::__numeric_traits<_ValueType2>::__is_signed
>
> I think this could be:
>
>   && !is_signed_v<_ValueType1>
>   && !is_signed_v<_ValueType2>
>
> because this code doesn't need to be valid for C++98. But on the other
> hand, there's value in being consistent with the condition in
> std::lexicographical_compare.
>
> OK for master, thanks for the quick fix.
>


[PATCH][DOCS] Document -fprofile-reproducible.

2020-03-02 Thread Martin Liška

Hi.

It's a documentation update for the new -fprofile-reproducible option.

Martin

---
 htdocs/gcc-10/changes.html | 6 ++
 1 file changed, 6 insertions(+)


diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index 53d0ca08..6e08ba41 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -78,6 +78,12 @@ a work-in-progress.
 	  can now be used to inform the compiler that code paths not covered by the
 	  training run should not be optimized for size.
   
+  https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-fprofile-reproducible";>-fprofile-reproducible
+	  controls level of reproducibility of profile gathered by
+	  https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-fprofile-generate";>-fprofile-generate.
+	  This makes it possible to rebuild program
+	  with same outcome which is useful, for example, for distribution packages.
+  
 
   
   



Re: [PATCH, v3] wwwdocs: e-mail subject lines for contributions

2020-03-02 Thread Richard Earnshaw (lists)

On 27/02/2020 13:37, Nathan Sidwell wrote:

On 2/3/20 6:41 AM, Richard Earnshaw (lists) wrote:

On 22/01/2020 17:45, Richard Earnshaw (lists) wrote:


[updated based on v2 discussions]

This patch proposes some new (additional) rules for email subject lines
when contributing to GCC.  The goal is to make sure that, as far as
possible, the subject for a patch will form a good summary when the
message is committed to the repository if applied with 'git am'.  Where
possible, I've tried to align these rules with those already in
use for glibc, so that the differences are minimal and only where
necessary.

Some things that differ from existing practice (at least by some people)
are:

- Use ':' rather than '[]'
   - This is more git friendly and works with 'git am'.
- Put bug numbers at the end of the line rather than the beginning.
   - The bug number is useful, but not as useful as the brief summary.
 Also, use the shortened form, as the topic part is more usefully
 conveyed in the proper topic field (see above).


I've not seen any follow-up to this version.  Should we go ahead and 
adopt this?


do it!

do it! do it! do it!

nathan


:-)

I'd like to.  But have we reached consensus?  Seems that every time I 
produce a revised version of the text we end up in another round of bike 
shedding.  (Is that a word?)


R.


Re: [PATCH] libstdc++: Fix bogus use of memcmp in ranges::lexicographical_compare (PR 93972)

2020-03-02 Thread Jonathan Wakely

On 02/03/20 13:22 +0100, Christophe Lyon wrote:

On Fri, 28 Feb 2020 at 22:53, Jonathan Wakely  wrote:


On 28/02/20 14:59 -0500, Patrick Palka wrote:
>We were enabling the memcmp optimization in ranges::lexicographical_compare for
>signed integral types and for integral types larger than a byte.  But memcmp
>gives the wrong answer for arrays of such types.  This patch fixes this issue 
by
>refining the condition that enables the memcmp optimization.  It's now
>consistent with the corresponding condition used in
>std::lexicographical_compare.
>
>libstdc++-v3/ChangeLog:
>
>   PR libstdc++/93972
>   * include/bits/ranges_algo.h (__lexicographical_compare_fn::operator()):
>   Fix condition for when to use memcmp, making it consistent with the
>   corresponding condition used in std::lexicographical_compare.
>   * testsuite/25_algorithms/lexicographical_compare/93972.cc: New test.



Hi,

The new test fails on aarch64 and arm, and other targets according to
gcc-testresults.
On aarch64, my log says:
FAIL: 25_algorithms/lexicographical_compare/93972.cc (test for excess errors)
Excess errors:
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(char*&, unsigned
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(char*&, unsigned
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(unsigned char*&,
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(unsigned char*&,
char*&, const long int&)'

UNRESOLVED: 25_algorithms/lexicographical_compare/93972.cc compilation
failed to produce executable



Hmm, I think this was already broken in std::lexicographical_compare,
we just didn't have a test for it. I think the following will also
fail to compile on aarch64 and ARM (and any target where char is
unsigned):

#include <algorithm>
#include <cassert>

int main()
{
  unsigned char a[] = {1, 2, 3, 4};
  char b[] = {1, 2, 3, 5};

  assert( std::lexicographical_compare(a, a+4, b, b+4) );
}

So Patrick's ranges::lexicographical_compare didn't introduce the bug,
it just found it by having better tests.

The std::__memcmp function is broken in a similar way to the
std::__memmove function that I removed last week. I'll fix that
today...
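
As background on why the memcmp shortcut must be restricted to unsigned
byte types, here is a small sketch.  The helper names less_elementwise and
less_bytewise are made up for illustration and do not exist in libstdc++.
memcmp compares the object representation as unsigned bytes, so it can
disagree with an element-wise comparison as soon as the elements are signed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Reference result: compare the arrays element by element with operator<.
template<typename T, std::size_t N>
bool less_elementwise(const T (&a)[N], const T (&b)[N])
{
  return std::lexicographical_compare(a, a + N, b, b + N);
}

// What a memcmp-based shortcut computes: the raw bytes compared as
// unsigned char values, ignoring the element type entirely.
template<typename T, std::size_t N>
bool less_bytewise(const T (&a)[N], const T (&b)[N])
{
  return std::memcmp(a, b, sizeof(a)) < 0;
}
```

With signed char arrays {-1} and {1} the element-wise answer is "less",
while the byte-wise comparison sees 0xFF against 0x01 and says the opposite.
For unsigned char arrays the two always agree.  Multi-byte integers can
disagree as well, depending on the byte order of the target.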




[PATCH] Clear --help=language and --help=common interaction.

2020-03-02 Thread Martin Liška

Hi.

For situations like -Q --help=warning,c one can't see a warning
that is common. For that one needs to use -Q --help=warning,common.
My patch explains the behavior in documentation.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2020-03-02  Martin Liska  

PR c/93886
PR c/93887
* doc/invoke.texi: Clarify --help=language and --help=common
interaction.
---
 gcc/doc/invoke.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4f88fe68999..e43d954283f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1665,7 +1665,8 @@ option.
 @item @var{language}
 Display the options supported for @var{language}, where
 @var{language} is the name of one of the languages supported in this
-version of GCC@.
+version of GCC@.  If an option is supported by all languages, one needs
+to use @var{common} qualifier instead.
 
 @item @samp{common}
 Display the options that are common to all languages.



Re: [PATCH, v3] wwwdocs: e-mail subject lines for contributions

2020-03-02 Thread Segher Boessenkool
On Mon, Mar 02, 2020 at 01:01:46PM +, Richard Earnshaw (lists) wrote:
> I'd like to.  But have we reached consensus?  Seems that every time I 
> produce a revised version of the text we end up in another round of bike 
> shedding.  (Is that a word?)

It's called a "lap", but "criterium" would be more suitable here.


Segher


Re: [PATCH] coroutines: Don't make duplicate frame copies of awaitables.

2020-03-02 Thread Nathan Sidwell

On 3/2/20 4:37 AM, Iain Sandoe wrote:

Hi,

this corrects a thinko that seemed initially to be a missed
optimisation, but turns out to lead to wrong code in some
cases.

tested on x86_64 darwin, linux and powerpc linux
OK for trunk?
thanks
Iain

In general, we need to manage the lifetime of compiler-
generated awaitable instances in the coroutine frame, since
these must persist across suspension points.

However, it is quite possible that the user might provide the
awaitable instances, either as function params or as a local
variable.  We will already generate a frame entry for these as
required.

At present, under this circumstance, we are duplicating these
awaitables, initialising a second frame copy for them (which we
then subsequently destroy manually after the suspension point).
That's not efficient - so an undesirable thinko in the first place.
However, there is also an actual bug; if the compiler elects to
elide the copy (which is perfectly legal), it does not have visibility
of the manual management of the post-suspend destruction
- this subsequently leads to double-free errors.

The solution is not to make the second copy (as noted, params
and local vars already have frame copies with managed lifetimes).

gcc/cp/ChangeLog:

2020-03-02  Iain Sandoe  

* coroutines.cc (build_co_await): Do not build frame
awaitable proxy vars when the co_await expression is
a function parameter or local var.
(co_await_expander): Do not initialise a frame var with
itself.
(transform_await_expr): Only substitute the awaitable
frame var if it's needed.
(register_awaits): Do not make frame copies for param
or local vars that are awaitables.



ok


--
Nathan Sidwell


Re: [PATCH, v3] wwwdocs: e-mail subject lines for contributions

2020-03-02 Thread Nathan Sidwell

On 3/2/20 8:01 AM, Richard Earnshaw (lists) wrote:

On 27/02/2020 13:37, Nathan Sidwell wrote:

On 2/3/20 6:41 AM, Richard Earnshaw (lists) wrote:

On 22/01/2020 17:45, Richard Earnshaw (lists) wrote:


[updated based on v2 discussions]

This patch proposes some new (additional) rules for email subject lines
when contributing to GCC.  The goal is to make sure that, as far as
possible, the subject for a patch will form a good summary when the
message is committed to the repository if applied with 'git am'.  Where
possible, I've tried to align these rules with those already in
use for glibc, so that the differences are minimal and only where
necessary.

Some things that differ from existing practice (at least by some 
people)

are:

- Use ':' rather than '[]'
   - This is more git friendly and works with 'git am'.
- Put bug numbers at the end of the line rather than the beginning.
   - The bug number is useful, but not as useful as the brief summary.
 Also, use the shortened form, as the topic part is more usefully
 conveyed in the proper topic field (see above).


I've not seen any follow-up to this version.  Should we go ahead and 
adopt this?


I'd like to.  But have we reached consensus?  Seems that every time I 
produce a revised version of the text we end up in another round of bike 
shedding.  (Is that a word?)


I'm not sure I've seen a specific proposal following yours.  Some 
suggestions for differences, with varying degrees of forcefulness.  I 
still say go for it.


nathan

--
Nathan Sidwell


Re: [PATCH] coroutines: Update lambda capture handling to n4849.

2020-03-02 Thread Nathan Sidwell

On 3/2/20 4:43 AM, Iain Sandoe wrote:

Hi,

In the absence of specific comment on the handling of closures I'd
implemented something more than was intended (extending the lifetime
of lambda capture-by-copy vars to the duration of the coro).

After discussion at WG21 in February and by email, the correct handling
is to treat the closure "this" pointer the same way as for a regular one,
and thus it is the user's responsibility to ensure that the lambda capture
object has suitable lifetime for the coroutine.  It is noted that users
frequently get this wrong, so it would be a good thing to revisit for C++23.

This patch removes the additional copying behaviour for lambda capture-by-
copy vars.

@JunMa, this supersedes your fix to the aliases, which should no longer be
necessary, but I've added your testcases to this patch.

gcc/cp/ChangeLog:

2020-03-02  Iain Sandoe  

* coroutines.cc (struct local_var_info): Adjust to remove the
reference to the captured var, and just to note that this is a
lambda capture proxy.
(transform_local_var_uses): Handle lambda captures specially.
(struct param_frame_data): Add a visited set.
(register_param_uses): Also check for param uses in lambda
capture proxies.
(struct local_vars_frame_data): Remove captures list.
(register_local_var_uses): Handle lambda capture proxies by
noting and bypassing them.
(morph_fn_to_coro): Update to remove lifetime extension of
lambda capture-by-copy vars.


ok


--
Nathan Sidwell


Re: [PATCH Coroutines]Pickup more CO_AWAIT_EXPR expanding cases

2020-03-02 Thread Nathan Sidwell

On 2/10/20 3:33 AM, bin.cheng wrote:

Hi,

We found more ICEs caused by unexpanded CO_AWAIT_EXPR.  It turns out we
can fix these issues with more simplification in function co_await_expander.
Here is the patch with a new test.

Bootstrap and test on x86_64.  Is it OK?

Thanks,
bin

gcc/cp
2020-02-10  Bin Cheng  

 * coroutines.cc (co_await_expander): Simplify.


ok


--
Nathan Sidwell


Re: One more patch for PR93564

2020-03-02 Thread Christophe Lyon
On Fri, 28 Feb 2020 at 17:39, Vladimir Makarov  wrote:
>
>   The following patch is dealing with arm failures after submitting
> original patch for PR93564.
>
>Changing heuristics in the original patch resulted in different order
> of allocation and creating gaps in hard reg file which were not enough
> for pseudos requiring double regs.  So RA started to use caller-saved
> regs and additional store/load insns in function prologue. That is the
> reason for some arm failures.
>
>The patch was successfully bootstrapped and benchmarked on x86-64.
> On x86-64 SPEC2000 the patch generates a bit smaller and faster in
> average code.
>

Hi,

This is causing another set of regressions on arm.
For instance on arm-linux-gnueabihf --with-cpu cortex-a9
--with-fpu neon-fp16:
FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-not vmov\\.f16
FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fuse-caller-save.c scan-assembler-times mov\tr3, r0 1
FAIL: gcc.target/arm/unaligned-argument-2.c scan-assembler-times stm 1

Christophe


Re: [PATCH, v3] wwwdocs: e-mail subject lines for contributions

2020-03-02 Thread Jonathan Wakely
On Mon, 2 Mar 2020 at 14:31, Nathan Sidwell  wrote:
>
> On 3/2/20 8:01 AM, Richard Earnshaw (lists) wrote:
> > On 27/02/2020 13:37, Nathan Sidwell wrote:
> >> On 2/3/20 6:41 AM, Richard Earnshaw (lists) wrote:
> >>> On 22/01/2020 17:45, Richard Earnshaw (lists) wrote:
> 
>  [updated based on v2 discussions]
> 
>  This patch proposes some new (additional) rules for email subject lines
>  when contributing to GCC.  The goal is to make sure that, as far as
>  possible, the subject for a patch will form a good summary when the
>  message is committed to the repository if applied with 'git am'.  Where
>  possible, I've tried to align these rules with those already in
>  use for glibc, so that the differences are minimal and only where
>  necessary.
> 
>  Some things that differ from existing practice (at least by some
>  people)
>  are:
> 
>  - Use ':' rather than '[]'
> - This is more git friendly and works with 'git am'.
>  - Put bug numbers at the end of the line rather than the beginning.
> - The bug number is useful, but not as useful as the brief summary.
>   Also, use the shortened form, as the topic part is more usefully
>   conveyed in the proper topic field (see above).
> >>>
> >>> I've not seen any follow-up to this version.  Should we go ahead and
> >>> adopt this?
>
> > I'd like to.  But have we reached consensus?  Seems that every time I
> > produce a revised version of the text we end up in another round of bike
> > shedding.  (Is that a word?)
>
> I'm not sure I've seen a specific proposal following yours.  Some
> suggestions for differences, with varying degrees of forcefulness.  I
> still say go for it.

Go for it.

It's not like we're going to take away commit privs from people who
use slight variations on the scheme. It's better to have a written
policy that people should aim towards, and most people will follow in
most cases.


Re: [PATCH coroutines v1] Build co_await/yield_expr with unknown_type in processing_template_decl phase

2020-03-02 Thread Nathan Sidwell

On 2/5/20 4:17 AM, JunMa wrote:

On 2020/2/5 2:14 PM, JunMa wrote:

Hi
This patch builds co_await/yield_expr with unknown_type when we cannot
know the promise type in the processing_template_decl phase.  This avoids
confusing the compiler when handling type deduction and conversion.

Bootstrap and test on X86_64, is it OK?

Regards
JunMa


Hi
Sorry, a '}' was removed by mistake; here is the updated patch :)

Regards
JunMa

gcc/cp
2020-02-05  Jun Ma 

    * coroutines.cc (finish_co_await_expr): Build co_await_expr
    with unknown_type_node.
    (finish_co_yield_expr): Ditto.
    *pt.c (type_dependent_expression_p): Set co_await/yield_expr
    with unknown type as dependent.

gcc/testsuite
2020-02-05  Jun Ma 

    * g++.dg/coroutines/torture/co-await-14-template-traits.C: New 
test.


ok


--
Nathan Sidwell


Re: [PATCH coroutines] Handle component_ref in captures_temporary

2020-03-02 Thread Nathan Sidwell

On 2/12/20 2:23 AM, JunMa wrote:

Hi
In captures_temporary, the current implementation fails to handle
component_ref. This causes an ICE for co_await A when
operator co_await is defined in a base class of A. It is also necessary
to capture the base-class object as if it were a temporary object.

This patch strips component_ref to its base object and checks it as usual.

Bootstrap and test on X86_64, is it OK?

Regards
JunMa

gcc/cp
2020-02-12  Jun Ma 

     * coroutines.cc (captures_temporary): Strip component_ref
     to its base object.

gcc/testsuite
2020-02-12  Jun Ma 

     * g++.dg/coroutines/torture/co-await-15-capture-comp-ref.C: New 
test.


+
+  /* In case of component_ref, we need to capture the object of base
+class as if it is temporary object.  There are two possibilities:
+(*base).field and base->field.  */
+  while (TREE_CODE (parm) == COMPONENT_REF)
+   {
+ parm = TREE_OPERAND (parm, 0);
+ if (TREE_CODE (parm) == INDIRECT_REF)
+   parm = TREE_OPERAND (parm, 0);
+ while (TREE_CODE (parm) == NOP_EXPR)
+   parm = TREE_OPERAND (parm, 0);


Use STRIP_NOPS.


+   }
+
   if (TREE_CODE (parm) == VAR_DECL && !DECL_ARTIFICIAL (parm))
/* This isn't a temporary... */
continue;
 
-  if (TREE_CODE (parm) == PARM_DECL)

+  if (TREE_CODE (parm) == PARM_DECL  || TREE_CODE (parm) == 
NON_LVALUE_EXPR)
/* .. nor is this... */
continue;


Either a separate if, or merging both ifs (my preference) would be better.

nathan

--
Nathan Sidwell


Re: [PATCH] sccvn: Improve handling of load masked with integer constant [PR93582]

2020-03-02 Thread Jakub Jelinek
On Mon, Mar 02, 2020 at 12:46:30PM +0100, Richard Biener wrote:
> > + void *r = data.push_partial_def (pd, 0, prec);
> > + if (r == (void *) -1)
> > +   return NULL_TREE;
> > + gcc_assert (r == NULL_TREE);
> > +   }
> > + pos += tz;
> > + if (pos == prec)
> > +   break;
> > + w = wi::lrshift (w, tz);
> > + tz = wi::ctz (wi::bit_not (w));
> > + if (pos + tz > prec)
> > +   tz = prec - pos;
> > + pos += tz;
> > + w = wi::lrshift (w, tz);
> > +   }
> 
> I'd do this in the vn_walk_cb_data CTOR instead - you pass mask != 
> NULL_TREE anyway so you can as well pass mask.

I've tried, but have no idea how to handle the case where
data.push_partial_def (pd, 0, prec); fails above if it is done in the
constructor.
Though, the BIT_AND_EXPR case already checks:
+ && CHAR_BIT == 8
+ && BITS_PER_UNIT == 8
+ && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN
and also checks the pathological cases of mask being all ones or all zeros,
so it is just the theoretical case of
maxsizei > bufsize * BITS_PER_UNIT
so maybe it is moot and we can just assert that push_partial_def
returned NULL.

> I wonder if we can instead make the above return NULL (finish
> return (void *)-1) and do sth like
> 
>  if (!wvnresult && mask)
>return data.masked_result;
> 
> and thus avoid the type-"unsafe" return frobbing by storing the
> result value in an extra member of the vn_walk_cb_data struct.

Done that way.

> Any reason you piggy-back on visit_reference_op_load instead of using
> vn_reference_lookup directly?  I'd very much prefer that since it
> doesn't even try to mess with the SSA lattice.

I didn't want to duplicate the VCE case, but it isn't that long.

So, like this if it passes bootstrap/regtest?

2020-03-02  Jakub Jelinek  

PR tree-optimization/93582
* tree-ssa-sccvn.h (vn_reference_lookup): Add mask argument.
* tree-ssa-sccvn.c (struct vn_walk_cb_data): Add mask and masked_result
members, initialize them in the constructor and if mask is non-NULL,
artificially push_partial_def {} for the portions of the mask that
contain zeros.
(vn_walk_cb_data::finish): If mask is non-NULL, set masked_result to
val and return (void *)-1.  Formatting fix.
(vn_reference_lookup_pieces): Adjust vn_walk_cb_data initialization.
Formatting fix.
(vn_reference_lookup): Add mask argument.  If non-NULL, don't call
fully_constant_vn_reference_p nor vn_reference_lookup_1 and return
data.masked_result.
(visit_nary_op): Handle BIT_AND_EXPR of a memory load and INTEGER_CST
mask.
(visit_stmt): Formatting fix.

* gcc.dg/tree-ssa/pr93582-10.c: New test.
* gcc.dg/pr93582.c: New test.
* gcc.c-torture/execute/pr93582.c: New test.
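
To make the "portions of the mask that contain zeros" step concrete, here
is a standalone sketch of the same walk over a plain 64-bit integer instead
of a wide_int.  The names ctz_capped and zero_runs are invented for this
illustration and are not GCC functions; offsets are counted from the least
significant bit, and prec is assumed to be at most 64.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Count trailing zero bits of w, but never report more than limit.
static unsigned ctz_capped(std::uint64_t w, unsigned limit)
{
  unsigned n = 0;
  while (n < limit && (w & 1) == 0)
    {
      w >>= 1;
      ++n;
    }
  return n;
}

// Return (offset, size) for every run of zero bits in the low prec bits
// of mask.  After "load & mask" those bits are known to be zero, so each
// run corresponds to one artificial partial definition.
std::vector<std::pair<unsigned, unsigned>>
zero_runs(std::uint64_t mask, unsigned prec)
{
  std::vector<std::pair<unsigned, unsigned>> runs;
  std::uint64_t w = mask;
  unsigned pos = 0;
  while (pos < prec)
    {
      // Length of the run of zeros starting at pos.
      unsigned tz = ctz_capped(w, prec - pos);
      if (tz)
        runs.emplace_back(pos, tz);
      pos += tz;
      if (pos == prec)
        break;
      w >>= tz;
      // Skip the following run of ones; those bits still need real data.
      unsigned ones = ctz_capped(~w, prec - pos);
      pos += ones;
      if (pos < prec)
        w >>= ones;
    }
  return runs;
}
```

For a 32-bit mask 0xFF00FF00 this yields two runs, bits 0..7 and bits
16..23; an all-ones mask yields none, and an all-zeros mask yields a single
run covering the whole value.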

--- gcc/tree-ssa-sccvn.h.jj 2020-02-28 17:32:56.391363613 +0100
+++ gcc/tree-ssa-sccvn.h 2020-03-02 13:52:17.488680037 +0100
@@ -256,7 +256,7 @@ tree vn_reference_lookup_pieces (tree, a
 vec ,
 vn_reference_t *, vn_lookup_kind);
 tree vn_reference_lookup (tree, tree, vn_lookup_kind, vn_reference_t *, bool,
- tree * = NULL);
+ tree * = NULL, tree = NULL_TREE);
 void vn_reference_lookup_call (gcall *, vn_reference_t *, vn_reference_t);
 vn_reference_t vn_reference_insert_pieces (tree, alias_set_type, tree,
   vec ,
--- gcc/tree-ssa-sccvn.c.jj 2020-02-28 17:32:56.390363628 +0100
+++ gcc/tree-ssa-sccvn.c 2020-03-02 15:48:12.982620557 +0100
@@ -1686,15 +1686,55 @@ struct pd_data
 struct vn_walk_cb_data
 {
   vn_walk_cb_data (vn_reference_t vr_, tree orig_ref_, tree *last_vuse_ptr_,
-  vn_lookup_kind vn_walk_kind_, bool tbaa_p_)
+  vn_lookup_kind vn_walk_kind_, bool tbaa_p_, tree mask_)
 : vr (vr_), last_vuse_ptr (last_vuse_ptr_), last_vuse (NULL_TREE),
-  vn_walk_kind (vn_walk_kind_), tbaa_p (tbaa_p_),
-  saved_operands (vNULL), first_set (-2), known_ranges (NULL)
-   {
- if (!last_vuse_ptr)
-   last_vuse_ptr = &last_vuse;
- ao_ref_init (&orig_ref, orig_ref_);
-   }
+  mask (mask_), masked_result (NULL_TREE), vn_walk_kind (vn_walk_kind_),
+  tbaa_p (tbaa_p_), saved_operands (vNULL), first_set (-2),
+  known_ranges (NULL)
+  {
+if (!last_vuse_ptr)
+  last_vuse_ptr = &last_vuse;
+ao_ref_init (&orig_ref, orig_ref_);
+if (mask)
+  {
+   wide_int w = wi::to_wide (mask);
+   unsigned int pos = 0, prec = w.get_precision ();
+   pd_data pd;
+   pd.rhs = build_constructor (NULL_TREE, NULL);
+   /* When bitwise and with a constant is done on a memory load,
+  we don't really need all the bits to be defined or defined
+  to constants, we don't really care 

Re: One more patch for PR93564

2020-03-02 Thread Jeff Law
On Mon, 2020-03-02 at 15:37 +0100, Christophe Lyon wrote:
> On Fri, 28 Feb 2020 at 17:39, Vladimir Makarov  wrote:
> >   The following patch is dealing with arm failures after submitting
> > original patch for PR93564.
> > 
> >Changing heuristics in the original patch resulted in different order
> > of allocation and creating gaps in hard reg file which were not enough
> > for pseudos requiring double regs.  So RA started to use caller-saved
> > regs and additional store/load insns in function prologue. That is the
> > reason for some arm failures.
> > 
> >The patch was successfully bootstrapped and benchmarked on x86-64.
> > On x86-64 SPEC2000 the patch generates a bit smaller and faster in
> > average code.
> > 
> 
> Hi,
> 
> This is causing another set of regressions on arm.
> For instance on arm-linux-gnueabihf --with-cpu cortex-a9
> --with-fpu neon-fp16:
> FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-not vmov\\.f16
> FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
> FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
> FAIL: gcc.target/arm/fuse-caller-save.c scan-assembler-times mov\tr3, r0 1
> FAIL: gcc.target/arm/unaligned-argument-2.c scan-assembler-times stm 1
I suspect at least some of these are likely just register assignments changing.

Jeff
> 



Re: [PATCH, v3] wwwdocs: e-mail subject lines for contributions

2020-03-02 Thread Richard Earnshaw (lists)

On 02/03/2020 14:41, Jonathan Wakely wrote:

On Mon, 2 Mar 2020 at 14:31, Nathan Sidwell  wrote:


On 3/2/20 8:01 AM, Richard Earnshaw (lists) wrote:

On 27/02/2020 13:37, Nathan Sidwell wrote:

On 2/3/20 6:41 AM, Richard Earnshaw (lists) wrote:

On 22/01/2020 17:45, Richard Earnshaw (lists) wrote:


[updated based on v2 discussions]

This patch proposes some new (additional) rules for email subject lines
when contributing to GCC.  The goal is to make sure that, as far as
possible, the subject for a patch will form a good summary when the
message is committed to the repository if applied with 'git am'.  Where
possible, I've tried to align these rules with those already in
use for glibc, so that the differences are minimal and only where
necessary.

Some things that differ from existing practice (at least by some
people)
are:

- Use ':' rather than '[]'
- This is more git friendly and works with 'git am'.
- Put bug numbers at the end of the line rather than the beginning.
- The bug number is useful, but not as useful as the brief summary.
  Also, use the shortened form, as the topic part is more usefully
  conveyed in the proper topic field (see above).


I've not seen any follow-up to this version.  Should we go ahead and
adopt this?



I'd like to.  But have we reached consensus?  Seems that every time I
produce a revised version of the text we end up in another round of bike
shedding.  (Is that a word?)


I'm not sure I've seen a specific proposal following yours.  Some
suggestions for differences, with varying degrees of forcefulness.  I
still say go for it.


Go for it.

It's not like we're going to take away commit privs from people who
use slight variations on the scheme. It's better to have a written
policy that people should aim towards, and most people will follow in
most cases.



OK, pushed.   Folk can, of course, now propose changes to the text as it 
stands...


R.


Re: One more patch for PR93564

2020-03-02 Thread Jeff Law
On Mon, 2020-03-02 at 08:17 -0700, Jeff Law wrote:
> On Mon, 2020-03-02 at 15:37 +0100, Christophe Lyon wrote:
> > On Fri, 28 Feb 2020 at 17:39, Vladimir Makarov  wrote:
> > >   The following patch is dealing with arm failures after submitting
> > > original patch for PR93564.
> > > 
> > >Changing heuristics in the original patch resulted in different order
> > > of allocation and creating gaps in hard reg file which were not enough
> > > for pseudos requiring double regs.  So RA started to use caller-saved
> > > regs and additional store/load insns in function prologue. That is the
> > > reason for some arm failures.
> > > 
> > >The patch was successfully bootstrapped and benchmarked on x86-64.
> > > On x86-64 SPEC2000 the patch generates a bit smaller and faster in
> > > average code.
> > > 
> > 
> > Hi,
> > 
> > This is causing another set of regressions on arm.
> > For instance on arm-linux-gnueabihf --with-cpu cortex-a9
> > --with-fpu neon-fp16:
> > FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-not vmov\\.f16
> > FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
> > FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
> > FAIL: gcc.target/arm/fuse-caller-save.c scan-assembler-times mov\tr3, r0 1
> > FAIL: gcc.target/arm/unaligned-argument-2.c scan-assembler-times stm 1
> I suspect at least some of these are likely just register assignments
> changing.
In fact, I'm certain that's the case for fuse-caller-save.c.  I'll also be
looking at armv8_2-fp16-move-1.c, since my tester tripped over that as well.
If you could evaluate the others it'd be appreciated.

jeff



Re: [PATCH] c++: Implement P1957R2, T* to bool should be considered narrowing.

2020-03-02 Thread Marek Polacek
Ping: Jeff, see the question below.  Thanks.

On Mon, Feb 24, 2020 at 10:19:20AM -0500, Marek Polacek wrote:
> On Mon, Feb 24, 2020 at 09:59:51AM -0500, Jason Merrill wrote:
> > On 2/24/20 9:58 AM, Jason Merrill wrote:
> > > On 2/21/20 2:14 PM, Marek Polacek wrote:
> > > > This was approved in the Prague 2020 WG21 meeting so let's adjust the
> > > > comment.  Since it's supposed to be a DR I think we should no longer
> > > > limit it to C++20.
> > > 
> > > I'm a bit nervous about the impact, but OK.  It's easy enough to turn
> > > off -Wnarrowing if it's a problem for users.
> > 
> > Hmm, have you tried doing a Fedora build with this change?
> 
> I'll admit I have not.
> 
> Jeff, would it be possible to apply this patch onto one of your testers?
> We'd be much more comfortable going with this change then.
> 
> > > > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > > > 
> > > > 2020-02-21  Marek Polacek  
> > > > 
> > > > P1957R2
> > > > * typeck2.c (check_narrowing): Consider T* to bool narrowing
> > > > in C++11 and up.
> > > > 
> > > > * g++.dg/cpp0x/initlist92.C: Don't expect an error in C++20 only.
> > > > ---
> > > >   gcc/cp/typeck2.c    | 7 ---
> > > >   gcc/testsuite/g++.dg/cpp0x/initlist92.C | 2 +-
> > > >   2 files changed, 5 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
> > > > index 48920894b3b..68bc2e5c170 100644
> > > > --- a/gcc/cp/typeck2.c
> > > > +++ b/gcc/cp/typeck2.c
> > > > @@ -1036,9 +1036,10 @@ check_narrowing (tree type, tree init,
> > > > tsubst_flags_t complain,
> > > >   }
> > > >     else if (TREE_CODE (type) == BOOLEAN_TYPE
> > > >  && (TYPE_PTR_P (ftype) || TYPE_PTRMEM_P (ftype)))
> > > > -    /* This hasn't actually made it into C++20 yet, but let's add
> > > > it now to get
> > > > -   an idea of the impact.  */
> > > > -    ok = (cxx_dialect < cxx2a);
> > > > +    /* C++20 P1957R2: converting from a pointer type or a
> > > > pointer-to-member
> > > > +   type to bool should be considered narrowing.  This is a DR
> > > > so is not
> > > > +   limited to C++20 only.  */
> > > > +    ok = false;
> > > >     bool almost_ok = ok;
> > > >     if (!ok && !CONSTANT_CLASS_P (init) && (complain &
> > > > tf_warning_or_error))
> > > > diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist92.C
> > > > b/gcc/testsuite/g++.dg/cpp0x/initlist92.C
> > > > index 319264ae274..213b192d441 100644
> > > > --- a/gcc/testsuite/g++.dg/cpp0x/initlist92.C
> > > > +++ b/gcc/testsuite/g++.dg/cpp0x/initlist92.C
> > > > @@ -23,7 +23,7 @@ bool Test4(std::initializer_list);
> > > >   int main ()
> > > >   {
> > > > -  ( Test1({"false"}) );    // { dg-error "narrowing" "" { target
> > > > c++2a } }
> > > > +  ( Test1({"false"}) );    // { dg-error "narrowing" }
> > > >     ( Test2({123}) );
> > > >     ( Test3({456}) );
> > > >     ( Test4({"false"}) );
> > > > 
> > > > base-commit: dbfba41e95d1d93b17e907b7f516b52ed3a3c415
> > > > 
> > > 
> > 
> 
> Marek
> 

Marek



Re: One more patch for PR93564

2020-03-02 Thread Vladimir Makarov



On 2020-03-02 10:17 a.m., Jeff Law wrote:

On Mon, 2020-03-02 at 15:37 +0100, Christophe Lyon wrote:

On Fri, 28 Feb 2020 at 17:39, Vladimir Makarov  wrote:

   The following patch is dealing with arm failures after submitting
original patch for PR93564.

Changing heuristics in the original patch resulted in different order
of allocation and creating gaps in hard reg file which were not enough
for pseudos requiring double regs.  So RA started to use caller-saved
regs and additional store/load insns in function prologue. That is the
reason for some arm failures.

The patch was successfully bootstrapped and benchmarked on x86-64.
On x86-64 SPEC2000 the patch generates a bit smaller and faster in
average code.


Hi,

This is causing another set of regressions on arm.
For instance on arm-linux-gnueabihf --with-cpu cortex-a9
--with-fpu neon-fp16:
FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-not vmov\\.f16
FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fuse-caller-save.c scan-assembler-times mov\tr3, r0 1
FAIL: gcc.target/arm/unaligned-argument-2.c scan-assembler-times stm 1

I suspect at least some of these are likely just register assignments changing.

  This is generation of unexpected but still correct code.  Changing 
heuristics can create small gaps in the hard reg file which are too small 
to fit multi-reg pseudos, so callee-saved regs are more likely to be used, 
which means loads/stores in the prologue/epilogue.
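
The fragmentation effect described above can be pictured with a toy model.  This is only a sketch under stated assumptions: the four-register file used in the note below, the rule that a double-register pseudo needs two adjacent free registers, and the name has_free_pair are all made up for illustration and are not IRA's data structures.

```cpp
#include <cstddef>
#include <vector>

// Toy model (not IRA code): used[i] is true when hard register i is taken.
// Returns true if two adjacent registers are still free, i.e. a pseudo that
// needs a register pair can be placed without reaching for other registers.
bool has_free_pair(const std::vector<bool>& used)
{
  for (std::size_t i = 0; i + 1 < used.size(); ++i)
    if (!used[i] && !used[i + 1])
      return true;
  return false;
}
```

With single-register pseudos placed in r0 and r2, the remaining r1 and r3 form no pair; with the same pseudos in r0 and r1, the pair r2/r3 is still available.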


  As assigning to multi-reg pseudos first was never the highest 
priority in the assignment (execution frequency has a higher priority), 
we were just lucky to generate the expected code before.  In general, these 
kinds of failures are for very small functions without loops where even 
the stack is not used.  The more important cases are RA for big functions 
(as we have aggressive inlining) with loops, and for these cases the 
latest patch decreases SPEC2000 code size and visibly improves the 
performance, at least for x86-64.


  In any case, I'll look at these tests, but fixing all RA performance 
issues and the tests checking them might be just chasing a rainbow.





[committed][ARM] Fix minor testsuite fallout on ARM due to recent IRA changes

2020-03-02 Thread Jeff Law

More minor fallout from Vlad's IRA changes.

Previously this test used r3 to hold a value across a call (it's an ipa-ra
test).  After Vlad's changes we're using r1 instead.

This patch makes the obvious change to the pattern we scan for, which should
bring the test back to a passing status.

There's a note about r3 being special on thumb1 and the pattern check is
skipped for thumb1.  That special casing may not be necessary anymore -- I leave
that to the ARM maintainers to resolve one way or the other.

Committing on the trunk momentarily.

jeff
commit 0ce38183001095c804b45bab0370ff50b34f886f
Author: Jeff Law 
Date:   Mon Mar 2 08:44:28 2020 -0700

Fix testsuite regression due to recent IRA changes.

* gcc.target/arm/fuse-caller-save.c: Update expected output.

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 988d49af3b8..3f2c2851799 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2020-03-02  Jeff Law  
+
+   * gcc.target/arm/fuse-caller-save.c: Update expected output.
+
 2020-03-02  Martin Liska  
 
* gcc.dg/vect/bb-slp-19.c: The comment
diff --git a/gcc/testsuite/gcc.target/arm/fuse-caller-save.c b/gcc/testsuite/gcc.target/arm/fuse-caller-save.c
index ef9256dced9..20837125c0a 100644
--- a/gcc/testsuite/gcc.target/arm/fuse-caller-save.c
+++ b/gcc/testsuite/gcc.target/arm/fuse-caller-save.c
@@ -22,4 +22,4 @@ main (void)
 
 /* For thumb1, r3 is considered likely spilled, and treated differently in
ira_build_conflicts, which inhibits the fipa-ra optimization.  */
-/* { dg-final { scan-assembler-times "mov\tr3, r0" 1 { target { ! arm_thumb1 } } } } */
+/* { dg-final { scan-assembler-times "mov\tr1, r0" 1 { target { ! arm_thumb1 } } } } */


Re: GLIBC libmvec status

2020-03-02 Thread Bill Schmidt

On 2/28/20 10:31 AM, Jakub Jelinek wrote:

On Fri, Feb 28, 2020 at 04:23:03PM +, GT wrote:

Do we want to change the name and title of the document since Segher doesn't 
believe it
is an ABI. My initial suggestion: "POWER Architecture Specification of Scalar 
Function
to Vector Function Mapping".

It is an ABI, similarly like e.g. the C++ Itanium ABI is an ABI, it specifies
mangling of certain functions and how the function argument types and return
types are transformed.


Agreed, let's leave that as is.

One tiny nit on the document:  For the "b"  value, let's just say "VSX" 
rather than
"VSX as defined in PowerISA v2.07)."  We will plan to only change  values 
in case
a different vector length is defined in future.

Looks good otherwise!

Thanks,
Bill



Jakub



Re: One more patch for PR93564

2020-03-02 Thread Jeff Law
On Mon, 2020-03-02 at 08:40 -0700, Jeff Law wrote:
> On Mon, 2020-03-02 at 08:17 -0700, Jeff Law wrote:
> > On Mon, 2020-03-02 at 15:37 +0100, Christophe Lyon wrote:
> > > On Fri, 28 Feb 2020 at 17:39, Vladimir Makarov 
> > > wrote:
> > > >   The following patch is dealing with arm failures after submitting
> > > > original patch for PR93564.
> > > > 
> > > >Changing heuristics in the original patch resulted in different
> > > > order
> > > > of allocation and creating gaps in hard reg file which were not enough
> > > > for pseudos requiring double regs.  So RA started to use caller-saved
> > > > regs and additional store/load insns in function prologue. That is the
> > > > reason for some arm failures.
> > > > 
> > > >The patch was successfully bootstrapped and benchmarked on x86-64.
> > > > On x86-64 SPEC2000 the patch generates a bit smaller and faster in
> > > > average code.
> > > > 
> > > 
> > > Hi,
> > > 
> > > This is causing another set of regressions on arm.
> > > For instance on arm-linux-gnueabihf --with-cpu cortex-a9
> > > --with-fpu neon-fp16:
> > > FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-not vmov\\.f16
> > > FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
> > > FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
> > > FAIL: gcc.target/arm/fuse-caller-save.c scan-assembler-times mov\tr3, r0
> > > 1
> > > FAIL: gcc.target/arm/unaligned-argument-2.c scan-assembler-times stm 1
> > I suspect at least some of these are likely just register assignments
> > changing.
> In fact, I'm certain that's the case for fuse-caller-save.c.  I'll be looking
> at armv8_2-fp16-move-1.c as well since my tester tripped over that as
> well.  If you could evaluate the others it'd be appreciated.
And I'm now certain armv8_2-fp16-move-1.c is of a similar nature.

In that test we get a slightly different packing of registers after Vlad's IRA
changes.  The different packing into registers ultimately results in one hard
register cprop not happening after Vlad's changes.  As a result we end up with
an extra reg->reg copy and the test fails.

This may be one we just have to live with.  As we come into cprop_hardreg we
have this after Vlad's changes:

(set (reg 0) (reg 18))
(set (reg 18) (float_extend ...)

[ ... ]
set (reg 17) (reg 0)


Obviously the set to reg18 in the middle insn blocks the ability to propagate
the source of the first set into the source of the last set.  Prior to Vlad's
change that middle set used a different hard register and thus didn't block the
hard register cprop.  But that was more of an accident than anything -- Vlad's
work results in, IMHO, a better hard register allocation -- which in turn
inhibits hard register cprop.
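
To make the blocking effect concrete, here is a toy model of forward copy propagation over hard registers.  It is a sketch, not GCC's regcprop pass: the Insn struct, the convention that a negative src means "computed value", and the function name propagated_source are assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Toy model (not GCC's regcprop pass).  An insn writes `dest`; a
// non-negative `src` means a plain register copy, a negative `src` means
// the value is computed (like the float_extend above).
struct Insn { int dest; int src; };

// Walk the insns before `use_idx`, tracking which registers are known
// copies of which others, and return the register that the source of the
// copy at `use_idx` could be replaced with.
int propagated_source(const std::vector<Insn>& insns, std::size_t use_idx)
{
  std::map<int, int> copy_of;  // copy_of[r] == s: r currently holds s's value
  for (std::size_t i = 0; i < use_idx; ++i)
    {
      const Insn& insn = insns[i];
      // Resolve the source before the write invalidates anything.
      int src = insn.src;
      if (src >= 0 && copy_of.count(src))
        src = copy_of[src];
      // Writing `dest` kills every recorded copy that mentions it.
      for (auto it = copy_of.begin(); it != copy_of.end(); )
        if (it->first == insn.dest || it->second == insn.dest)
          it = copy_of.erase(it);
        else
          ++it;
      if (src >= 0 && src != insn.dest)
        copy_of[insn.dest] = src;
    }
  const int src = insns[use_idx].src;
  const auto it = copy_of.find(src);
  return it != copy_of.end() ? it->second : src;
}
```

In this model the sequence (0 <- 18), (18 <- computed), (17 <- 0) leaves the last source as reg 0, because the write to reg 18 kills the recorded copy.  If the middle insn writes some other register instead, the last source can be replaced by reg 18.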

I think the thing to do is either expect the single copy or xfail the test. 
I'm going to leave it to the ARM maintainers to decide how they want to handle
that.   I don't think this very minor code quality regression is significant
enough to warrant backing out Vlad's change.

Another approach would be to see if register renaming helps here, but that's a
can of worms I don't think we want to open at this point.


Jeff



Re: Minor regression due to recent IRA changes

2020-03-02 Thread Vladimir Makarov



On 2020-02-29 10:47 a.m., Jeff Law wrote:

On Sun, 2020-03-01 at 00:43 +0900, Oleg Endo wrote:

This could well be a target issue.  I haven't tried to debug it.  If
it's a
target issue, I'm fully comfortable punting it to the SH folks for
resolving.

The R0_REGS spill failure is a general problem, in particular with old
reload.  The atomic patterns tend to trigger it in one circumstance or
the other.  The IRA change probably just stresses it more.  Perhaps it
will go away with -mlra.

However, LRA on SH still has its own issues, so it can't be generally
enabled by default yet, unfortunately.  See also some of the recent
posts in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93877

It's almost certainly the case that the recent IRA changes are going to stress
R0 more.  If I'm reading what Vlad did correctly, one of the tie-breakers it's
using now is to choose the lowest numbered register when all else is equal.  So
R0 on SH is likely going to be more problematic.

The last patch does not do it for targets that are required to honor the reg 
allocation order.  I'd recommend trying to define the macro 
HONOR_REG_ALLOC_ORDER for sh.

I wonder if just reordering the regs on the SH (and adjusting the debug output
to keep that working) would be enough to mitigate some of the R0 problems.

And yes, I saw 93877 fly by too :(




Re: Minor regression due to recent IRA changes

2020-03-02 Thread Vladimir Makarov



On 2020-03-02 4:34 a.m., Richard Biener wrote:


One could also simply pessimize R0 for RA via either an existing mechanism
or a new target hook ...


Yes, it could be a good strategy.  I'd recommend to try 
HONOR_REG_ALLOC_ORDER first with/without LRA.


If it does not work I am ready to accept a reasonable new hook for IRA 
and/or LRA.  Some GCC targets are too specific and require special 
treatment.





Re: [PATCH] c++: Implement P1957R2, T* to bool should be considered narrowing.

2020-03-02 Thread Jeff Law
On Mon, 2020-03-02 at 10:45 -0500, Marek Polacek wrote:
> Ping: Jeff, see the question below.  Thanks.
Sorry, totally missed the question.  I'm guessing you want me to run it through
the Fedora build tester?

jeff
> 



Re: [PATCH] c++: Implement P1957R2, T* to bool should be considered narrowing.

2020-03-02 Thread Marek Polacek
On Mon, Mar 02, 2020 at 09:25:35AM -0700, Jeff Law wrote:
> On Mon, 2020-03-02 at 10:45 -0500, Marek Polacek wrote:
> > Ping: Jeff, see the question below.  Thanks.
> Sorry, totally missed the question.  I'm guessing you want me to run it 
> through
> the Fedora build tester?

Yeah -- we want to allow stricter narrowing checking even in pre-C++20 modes,
and we're wondering about the fallout.  So re-compiling Fedora packages would
be useful.
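
For reference, a minimal sketch of the kind of code affected.  The helper names count_true and count_non_null are hypothetical; only the std::initializer_list parameter mirrors the shape of Test1 in initlist92.C.  The rejected form is left in a comment so the sketch compiles with or without the patch.

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical helper: takes a braced list of bools and counts the true ones.
std::size_t count_true(std::initializer_list<bool> flags)
{
  std::size_t n = 0;
  for (bool f : flags)
    if (f)
      ++n;
  return n;
}

// With the patch, count_true({p}) for a pointer p is diagnosed as a
// narrowing conversion in C++11 and up, so the pointer test is spelled
// out explicitly instead of relying on the pointer-to-bool conversion.
std::size_t count_non_null(const char* p, const char* q)
{
  return count_true({p != nullptr, q != nullptr});
}
```

Code that relied on the implicit conversion inside braces would need a rewrite along these lines, which is the fallout a distro rebuild would measure.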

Marek



Re: [PATCH] c++: Implement P1957R2, T* to bool should be considered narrowing.

2020-03-02 Thread Jeff Law
On Mon, 2020-03-02 at 11:29 -0500, Marek Polacek wrote:
> On Mon, Mar 02, 2020 at 09:25:35AM -0700, Jeff Law wrote:
> > On Mon, 2020-03-02 at 10:45 -0500, Marek Polacek wrote:
> > > Ping: Jeff, see the question below.  Thanks.
> > Sorry, totally missed the question.  I'm guessing you want me to run it
> > through
> > the Fedora build tester?
> 
> Yeah -- we want to allow stricter narrowing checking even in pre-C++20
> modes, and we're wondering about the fallout.  So re-compiling Fedora
> packages would be useful.
Is the error easy to identify from a diagnostic?  If so that would avoid having
to get fresh baselines.  I can just apply the patch, build gcc, then spin up a
single build of everything.

Jeff



Re: [committed][ARM] Fix minor testsuite fallout on ARM due to recent IRA changes

2020-03-02 Thread Richard Earnshaw (lists)

On 02/03/2020 15:46, Jeff Law wrote:


More minor fallout from Vlad's IRA changes.

Previously this test used r3 to hold a value across a call (it's an ipa-ra
test).  After Vlad's changes we're using r1 instead.

This patch makes the obvious change to the pattern we scan for, which should
bring the test back to a passing status.

There's a note about r3 being special on thumb1 and the pattern check is
skipped for thumb1.  That special casing may not be necessary anymore -- I leave
that to the ARM maintainers to resolve one way or the other.

Committing on the trunk momentarily.

jeff



Any of r1, r2, r3 could be chosen for the 'save' register, so why not 
put that in the regexp?


Something like:

+/* { dg-final { scan-assembler-times "mov\tr[123], r0" 1 { target { ! arm_thumb1 } } } } */


And then we are future-proof.
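
The effect of the wider pattern can be sketched outside the testsuite.  This is only an analogy: std::regex defaults to ECMAScript syntax while DejaGnu uses Tcl regular expressions, and count_matches is a hypothetical helper, not part of any test harness.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Count non-overlapping matches of `pattern` in `text`, similar in spirit
// to what a scan-assembler-times directive checks.
std::size_t count_matches(const std::string& text, const std::string& pattern)
{
  const std::regex re(pattern);
  const auto first = std::sregex_iterator(text.begin(), text.end(), re);
  const auto last = std::sregex_iterator();
  return static_cast<std::size_t>(std::distance(first, last));
}
```

For assembler output containing "mov\tr1, r0", the pattern "mov\tr[123], r0" matches once while the old "mov\tr3, r0" does not match at all.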

R.


Re: [committed][ARM] Fix minor testsuite fallout on ARM due to recent IRA changes

2020-03-02 Thread Jeff Law
On Mon, 2020-03-02 at 16:40 +, Richard Earnshaw (lists) wrote:
> On 02/03/2020 15:46, Jeff Law wrote:
> > More minor fallout from Vlad's IRA changes.
> > 
> > Previously this test used r3 to hold a value across a call (it's an ipa-ra
> > test).  After Vlad's changes we're using r1 instead.
> > 
> > This patch makes the obvious change to the pattern we scan for, which
> > should bring the test back to a passing status.
> > 
> > There's a note about r3 being special on thumb1 and the pattern check is
> > skipped for thumb1.  That special casing may not be necessary anymore -- I
> > leave that to the ARM maintainers to resolve one way or the other.
> > 
> > Committing on the trunk momentarily.
> > 
> > jeff
> > 
> 
> Any of r1, r2, r3 could be chosen for the 'save' register, so why not 
> put that in the regexp?
> 
> Something like:
> 
> +/* { dg-final { scan-assembler-times "mov\tr[123], r0" 1 { target { ! 
> arm_thumb1 } } } } */
> 
> And then we are future-proof.
Seems reasonable.  I'll do that later today once the tester is finished with
its current run of arm-linux-gnueabi.

Any thoughts on the thumb1 issue?  I guess leaving it as-is just means slightly
less coverage for thumb1...

jeff



[committed] amdgcn: Extend reductions to all types

2020-03-02 Thread Andrew Stubbs

This patch adds support for vector reductions in all remaining vector modes.

Previously, only those modes directly supported by hardware instructions 
were available.  This patch implements multi-instruction sequences for 
all the others.


There are no visible test changes. Either the tests still fail for other 
reasons, or the reductions were previously vectorized by fall-back 
algorithms (permutations and masks). However, the performance is 
somewhat improved.


Andrew
amdgcn: Extend reductions to all types

Add support for V64DFmode addition, and V64DImode min, max.  There's no
direct hardware support for these, so we use regular vector instructions
and separate lane shift instructions.

Also add support for V64QI and V64HI reductions. Some of these require
additional extends and truncates, because AMD GCN has 32-bit vector lanes.
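
As a rough picture of a reduction built from ordinary vector adds plus lane shifts, here is a scalar simulation.  It is a sketch under assumptions: the doubling shift schedule and the names kLanes and reduce_add_by_shifts are illustrative and not taken from gcn_expand_reduc_scalar; only the 64-lane width and the result landing in lane 63 follow the patch.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 64;

// Simulate a shift-and-add reduction: at each step every lane adds the
// value found `shift` lanes below it.  After shifts of 1, 2, 4, ..., 32,
// lane 63 holds the sum of all 64 inputs.
double reduce_add_by_shifts(std::array<double, kLanes> v)
{
  for (std::size_t shift = 1; shift < kLanes; shift *= 2)
    {
      std::array<double, kLanes> shifted{};  // lanes shifted in read as 0.0
      for (std::size_t lane = shift; lane < kLanes; ++lane)
        shifted[lane] = v[lane - shift];
      for (std::size_t lane = 0; lane < kLanes; ++lane)
        v[lane] += shifted[lane];
    }
  return v[kLanes - 1];
}
```

Six shift steps replace 63 sequential additions, which is where the performance benefit over a scalar loop would come from in this model.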

2020-03-02  Andrew Stubbs  

	gcc/
	* config/gcn/gcn-valu.md (dpp_move): New.
	(reduc_insn): Use 'U' and 'B' operand codes.
	(reduc__scal_): Allow all types.
	(reduc__scal_v64di): Delete.
	(*_dpp_shr_): Allow all 1reg types.
	(*plus_carry_dpp_shr_v64si): Change to ...
	(*plus_carry_dpp_shr_): ... this and allow all 1reg int types.
	(mov_from_lane63_v64di): Change to ...
	(mov_from_lane63_): ... this, and allow all 64-bit modes.
	* config/gcn/gcn.c (gcn_expand_dpp_shr_insn): Increase buffer size.
	Support UNSPEC_MOV_DPP_SHR output formats.
	(gcn_expand_reduc_scalar): Add "use_moves" reductions.
	Add "use_extends" reductions.
	(print_operand_address): Add 'I' and 'U' codes.
	* config/gcn/gcn.md (unspec): Add UNSPEC_MOV_DPP_SHR.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 40e864a8de7..a8034f77798 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -985,6 +985,20 @@
   [(set_attr "type" "vmult")
(set_attr "length" "24")])
 
+(define_insn "@dpp_move"
+  [(set (match_operand:VEC_REG_MODE 0 "register_operand""=v")
+	(unspec:VEC_REG_MODE
+	  [(match_operand:VEC_REG_MODE 1 "register_operand" " v")
+	   (match_operand:SI 2 "const_int_operand"	" n")]
+	  UNSPEC_MOV_DPP_SHR))]
+  ""
+  {
+return gcn_expand_dpp_shr_insn (mode, "v_mov_b32",
+UNSPEC_MOV_DPP_SHR, INTVAL (operands[2]));
+  }
+  [(set_attr "type" "vop_dpp")
+   (set_attr "length" "16")])
+
 ;; }}}
 ;; {{{ ALU special case: add/sub
 
@@ -2969,15 +2983,15 @@
 			 (UNSPEC_SMAX_DPP_SHR "v_max%i0")
 			 (UNSPEC_UMIN_DPP_SHR "v_min%u0")
 			 (UNSPEC_UMAX_DPP_SHR "v_max%u0")
-			 (UNSPEC_PLUS_DPP_SHR "v_add%u0")
-			 (UNSPEC_AND_DPP_SHR  "v_and%b0")
-			 (UNSPEC_IOR_DPP_SHR  "v_or%b0")
-			 (UNSPEC_XOR_DPP_SHR  "v_xor%b0")])
+			 (UNSPEC_PLUS_DPP_SHR "v_add%U0")
+			 (UNSPEC_AND_DPP_SHR  "v_and%B0")
+			 (UNSPEC_IOR_DPP_SHR  "v_or%B0")
+			 (UNSPEC_XOR_DPP_SHR  "v_xor%B0")])
 
 (define_expand "reduc__scal_"
   [(set (match_operand: 0 "register_operand")
 	(unspec:
-	  [(match_operand:VEC_1REG_MODE 1 "register_operand")]
+	  [(match_operand:VEC_ALLREG_MODE 1 "register_operand")]
 	  REDUC_UNSPEC))]
   ""
   {
@@ -2990,29 +3004,15 @@
 DONE;
   })
 
-(define_expand "reduc__scal_v64di"
-  [(set (match_operand:DI 0 "register_operand")
-	(unspec:DI
-	  [(match_operand:V64DI 1 "register_operand")]
-	  REDUC_2REG_UNSPEC))]
-  ""
-  {
-rtx tmp = gcn_expand_reduc_scalar (V64DImode, operands[1],
-   );
-
-/* The result of the reduction is in lane 63 of tmp.  */
-emit_insn (gen_mov_from_lane63_v64di (operands[0], tmp));
-
-DONE;
-  })
 
 (define_insn "*_dpp_shr_"
-  [(set (match_operand:VEC_1REG_MODE 0 "register_operand"   "=v")
-	(unspec:VEC_1REG_MODE
-	  [(match_operand:VEC_1REG_MODE 1 "register_operand" "v")
-	   (match_operand:VEC_1REG_MODE 2 "register_operand" "v")
-	   (match_operand:SI 3 "const_int_operand"	 "n")]
+  [(set (match_operand:VEC_ALL1REG_MODE 0 "register_operand"   "=v")
+	(unspec:VEC_ALL1REG_MODE
+	  [(match_operand:VEC_ALL1REG_MODE 1 "register_operand" "v")
+	   (match_operand:VEC_ALL1REG_MODE 2 "register_operand" "v")
+	   (match_operand:SI 3 "const_int_operand"		"n")]
 	  REDUC_UNSPEC))]
+  ; GCN3 requires a carry out, GCN5 not
   "!(TARGET_GCN3 && SCALAR_INT_MODE_P (mode)
  &&  == UNSPEC_PLUS_DPP_SHR)"
   {
@@ -3051,18 +3051,17 @@
 
 ; Special cases for addition.
 
-(define_insn "*plus_carry_dpp_shr_v64si"
-  [(set (match_operand:V64SI 0 "register_operand"   "=v")
-	(unspec:V64SI
-	  [(match_operand:V64SI 1 "register_operand" "v")
-	   (match_operand:V64SI 2 "register_operand" "v")
-	   (match_operand:SI 3 "const_int_operand"   "n")]
+(define_insn "*plus_carry_dpp_shr_"
+  [(set (match_operand:VEC_ALL1REG_INT_MODE 0 "register_operand"   "=v")
+	(unspec:VEC_ALL1REG_INT_MODE
+	  [(match_operand:VEC_ALL1REG_INT_MODE 1 "register_operand" "v")
+	   (match_operand:VEC_ALL1REG_INT_MODE 2 "register_operand" "v")
+	   (match_operand:SI 3 "const_int_operand"		"n")]
 	  UNSPEC_PLUS_CARRY_DPP_SHR))
(clobber (reg:DI VCC_REG))]
   ""
   {
-const char *insn = TARGE

Re: [PATCH] c++: Add -std=gnu++20 option [PR93958]

2020-03-02 Thread Jason Merrill

On 2/29/20 3:05 PM, Marek Polacek wrote:

One missing bit from r10-6656.  The docs and target-supports.exp
already handle -std=gnu++20.


OK.



2020-02-29  Marek Polacek  

PR c++/93958 - add missing -std=gnu++20.
* c.opt: Add -std=gnu++20.
---
  gcc/c-family/c.opt | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index b7e4fe146b2..1cd585fa71d 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -2149,7 +2149,11 @@ Conform to the ISO 2017 C++ standard with GNU extensions.
  
  std=gnu++2a

  C++ ObjC++
-Conform to the ISO 2020(?) C++ draft standard with GNU extensions 
(experimental and incomplete support).
+Conform to the ISO 2020 C++ draft standard with GNU extensions (experimental 
and incomplete support).
+
+std=gnu++20
+C++ ObjC++ Alias(std=gnu++2a)
+Conform to the ISO 2020 C++ draft standard with GNU extensions (experimental 
and incomplete support).
  
  std=gnu11

  C ObjC

base-commit: 38b1722d5d44c52e06a8694b8fa36793735e27d1





Re: [PATCH] c++: Implement P1957R2, T* to bool should be considered narrowing.

2020-03-02 Thread Jason Merrill

On 3/2/20 11:38 AM, Jeff Law wrote:

On Mon, 2020-03-02 at 11:29 -0500, Marek Polacek wrote:

On Mon, Mar 02, 2020 at 09:25:35AM -0700, Jeff Law wrote:

On Mon, 2020-03-02 at 10:45 -0500, Marek Polacek wrote:

Ping: Jeff, see the question below.  Thanks.

Sorry, totally missed the question.  I'm guessing you want me to run it
through
the Fedora build tester?


Yeah -- we want to allow stricter narrowing checking even in pre-C++20
modes, and we're wondering about the fallout.  So re-compiling Fedora
packages would be useful.

Is the error easy to identify from a diagnostic?  If so that would avoid having
to get fresh baselines.  I can just apply the patch, build gcc, then spin up a
single build of everything.


Not always; sometimes making something ill-formed just changes overload 
resolution.


On further thought, it seems unnecessary to make this change at this 
point in the release cycle.  Let's defer this to GCC 11.


Jason



Re: [PATCH] c++: Implement P1957R2, T* to bool should be considered narrowing.

2020-03-02 Thread Marek Polacek
On Mon, Mar 02, 2020 at 11:59:12AM -0500, Jason Merrill wrote:
> On 3/2/20 11:38 AM, Jeff Law wrote:
> > On Mon, 2020-03-02 at 11:29 -0500, Marek Polacek wrote:
> > > On Mon, Mar 02, 2020 at 09:25:35AM -0700, Jeff Law wrote:
> > > > On Mon, 2020-03-02 at 10:45 -0500, Marek Polacek wrote:
> > > > > Ping: Jeff, see the question below.  Thanks.
> > > > Sorry, totally missed the question.  I'm guessing you want me to run it
> > > > through
> > > > the Fedora build tester?
> > > 
> > > Yeah -- we want to allow stricter narrowing checking even in pre-C++20
> > > modes, and we're wondering about the fallout.  So re-compiling Fedora
> > > packages would be useful.
> > Is the error easy to identify from a diagnostic?  If so that would avoid 
> > having
> > to get fresh baselines.  I can just apply the patch, build gcc, then spin 
> > up a
> > single build of everything.
> 
> Not always; sometimes making something ill-formed just changes overload
> resolution
> 
> On further thought, it seems unnecessary to make this change at this point
> in the release cycle.  Let's defer this to GCC 11.

Ack, will defer then.

Marek



Re: GLIBC libmvec status

2020-03-02 Thread Tulio Magno Quites Machado Filho
Bill Schmidt  writes:

> One tiny nit on the document:  For the "b"  value, let's just say "VSX" 
> rather than
> "VSX as defined in PowerISA v2.07)."  We will plan to only change  
> values in case
> a different vector length is defined in future.

That change would have more implications: all libmvec functions would have to
work on Power ISA v2.06 HW too.  But half of the functions do use v2.07
instructions now.

-- 
Tulio Magno


Re: [PATCH] libstdc++: Fix bogus use of memcmp in ranges::lexicographical_compare (PR 93972)

2020-03-02 Thread Jonathan Wakely

On 02/03/20 13:12 +, Jonathan Wakely wrote:

On 02/03/20 13:22 +0100, Christophe Lyon wrote:

On Fri, 28 Feb 2020 at 22:53, Jonathan Wakely  wrote:


On 28/02/20 14:59 -0500, Patrick Palka wrote:

We were enabling the memcmp optimization in ranges::lexicographical_compare for
signed integral types and for integral types larger than a byte.  But memcmp
gives the wrong answer for arrays of such types.  This patch fixes this issue by
refining the condition that enables the memcmp optimization.  It's now
consistent with the corresponding condition used in
std::lexicographical_compare.
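
A small sketch of why memcmp is wrong for signed element types.  The helper names less_by_elements and less_by_memcmp are hypothetical; the point is only that memcmp compares bytes as unsigned char, so a negative element looks larger than a positive one.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Element-wise comparison: uses operator< on signed char, so -1 < 1.
bool less_by_elements(const signed char* a, const signed char* b,
                      std::size_t n)
{
  return std::lexicographical_compare(a, a + n, b, b + n);
}

// Byte-wise comparison: memcmp interprets each byte as unsigned char,
// so the byte holding -1 (0xFF) compares greater than the byte holding 1.
bool less_by_memcmp(const signed char* a, const signed char* b,
                    std::size_t n)
{
  return std::memcmp(a, b, n) < 0;
}
```

For the one-element arrays {-1} and {1} the two functions disagree, which is the mismatch the refined condition avoids.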

libstdc++-v3/ChangeLog:

  PR libstdc++/93972
  * include/bits/ranges_algo.h (__lexicographical_compare_fn::operator()):
  Fix condition for when to use memcmp, making it consistent with the
  corresponding condition used in std::lexicographical_compare.
  * testsuite/25_algorithms/lexicographical_compare/93972.cc: New test.



Hi,

The new test fails on aarch64 and arm, and other targets according to
gcc-testresults.
On aarch64, my log says:
FAIL: 25_algorithms/lexicographical_compare/93972.cc (test for excess errors)
Excess errors:
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(char*&, unsigned
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(char*&, unsigned
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(unsigned char*&,
char*&, const long int&)'
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/aarch64-none-linux-gnu/libstdc++-v3/include/bits/ranges_algo.h:3490:
error: no matching function for call to '__memcmp(unsigned char*&,
char*&, const long int&)'

UNRESOLVED: 25_algorithms/lexicographical_compare/93972.cc compilation
failed to produce executable



Hmm, I think this was already broken in std::lexicographical_compare,
we just didn't have a test for it. I think the following will also
fail to compile on aarch64 and ARM (and any target where char is
unsigned):

#include <algorithm>
#include <cassert>

int main()
{
 unsigned char a[] = {1, 2, 3, 4};
 char b[] = {1, 2, 3, 5};

 assert( std::lexicographical_compare(a, a+4, b, b+4) );
}

So Patrick's ranges::lexicographical_compare didn't introduce the bug,
it just found it by having better tests.

The std::__memcmp function is broken in a similar way to the
std::__memmove function that I removed last week. I'll fix that
today...


Fixed with this patch, tested powerpc64le-linux and committed to
master.
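
The shape of the fix can be sketched independently of libstdc++.  The name memcmp_elems is hypothetical and the sketch omits the constant-evaluation path of the real helper; it only shows the two ideas in the patch: separate template parameters for the two pointers, and a size check.

```cpp
#include <cstddef>
#include <cstring>

// Compare `num` elements (not bytes) with memcmp.  The two pointer types
// may differ, e.g. char and unsigned char, as long as the element sizes
// agree.
template<typename T, typename U>
inline int memcmp_elems(const T* first1, const U* first2, std::size_t num)
{
  static_assert(sizeof(T) == sizeof(U), "can be compared with memcmp");
  return std::memcmp(first1, first2, sizeof(T) * num);
}
```

With a single template parameter, a call mixing unsigned char* and char* fails deduction, which matches the "no matching function" errors in the log above.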

Thanks for noticing the new FAIL.


commit d112e173ea093f55a16a14b26ef65088381ee09c
Author: Jonathan Wakely 
Date:   Mon Mar 2 17:03:28 2020 +

libstdc++: Fix std::lexicographical_compare for unsigned char (PR 93972)

The new 25_algorithms/lexicographical_compare/93972.cc test fails on
targets where char is unsigned, revealing an existing regression with
the std::__memcmp helper that had gone unnoticed in
std::lexicographical_compare. When comparing char and unsigned char, the
memcmp optimisation is enabled, but the new std::__memcmp function fails
to compile for mismatched types.

PR libstdc++/93972
* include/bits/stl_algobase.h (__memcmp): Allow pointer types to
differ.
* testsuite/25_algorithms/lexicographical_compare/uchar.cc: New test.

diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index 5ec2f25424d..7a9d932b421 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -84,11 +84,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* A constexpr wrapper for __builtin_memcmp.
* @param __num The number of elements of type _Tp (not bytes).
*/
-  template<typename _Tp>
+  template<typename _Tp, typename _Up>
 _GLIBCXX14_CONSTEXPR
 inline int
-__memcmp(const _Tp* __first1, const _Tp* __first2, size_t __num)
+__memcmp(const _Tp* __first1, const _Up* __first2, size_t __num)
 {
+#if __cplusplus >= 201103L
+  static_assert(sizeof(_Tp) == sizeof(_Up), "can be compared with memcmp");
+#endif
 #ifdef __cpp_lib_is_constant_evaluated
   if (std::is_constant_evaluated())
 	{
diff --git a/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/uchar.cc b/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/uchar.cc
new file mode 100644
index 000..990bb1e7489
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/uchar.cc
@@ -0,0 +1,61 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistrib

Re: GLIBC libmvec status

2020-03-02 Thread Bill Schmidt

On 3/2/20 11:10 AM, Tulio Magno Quites Machado Filho wrote:

Bill Schmidt  writes:


One tiny nit on the document:  For the "b"  value, let's just say "VSX" 
rather than
"VSX as defined in PowerISA v2.07)."  We will plan to only change  values 
in case
a different vector length is defined in future.

That change would have more implications: all libmvec functions would have to
work on Power ISA v2.06 HW too.  But half of the functions do use v2.07
instructions now.


Ah, I see.  Well, then language such as "VSX defined at least at the level of
PowerISA v2.07" would be appropriate.  We want to define a minimum subset 
without
further implied constraint.  (Higher levels can be handled with ifunc without
needing to reference this in the ABI, as previously discussed.)

Thanks,
Bill



Re: [PATCH PR93674]Avoid introducing IV of enumeral type in case of -fstrict-enums

2020-03-02 Thread Andrew Pinski
On Mon, Mar 2, 2020 at 1:40 AM Richard Biener
 wrote:
>
> On Mon, Mar 2, 2020 at 9:07 AM bin.cheng  wrote:
> >
> > Hi,
> > This is a simple fix for PR93674.  It adds cand carefully for enumeral type 
> > iv_use in
> > case of -fstrict-enums, it also avoids computing, replacing iv_use with the 
> > candidate
> > so that no IV of enumeral type is introduced with -fstrict-enums option.
> >
> > Testcase is also added.  Bootstrap and test on x86_64.  Any comment?
>
> I think we should avoid enum-typed (or bool-typed) IVs in general, not just
> with -fstrict-enums.  That said, the ENUMERAL_TYPE checks should be
> !(INTEGER_TYPE || POINTER_TYPE_P) checks.

Maybe even check type_has_mode_precision_p, or that
TYPE_MIN_VALUE/TYPE_MAX_VALUE are the same as the min/max for that
precision/signedness.
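
As background for the range check suggested above, here is a minimal sketch of how the value range of an enumeration without a fixed underlying type can be narrower than its mode precision. It follows the C++17 wording of [dcl.enum] and assumes two's complement; the function name strict_enum_range is hypothetical and not part of GCC.

```cpp
#include <cstdint>
#include <utility>

// Illustrative only: compute the value range [bmin, bmax] that the
// C++17 wording of [dcl.enum] assigns to an enumeration without a
// fixed underlying type, given its smallest and largest enumerators.
// Assumes two's complement (K = 1) and magnitudes well below 2^62.
std::pair<std::int64_t, std::int64_t>
strict_enum_range (std::int64_t emin, std::int64_t emax)
{
  // bmax must cover max(|emin| - 1, |emax|).
  std::int64_t need = emax < 0 ? -emax : emax;
  if (emin < 0)
    {
      std::int64_t neg = -(emin + 1);  // |emin| - 1
      if (neg > need)
        need = neg;
    }

  // Smallest value of the form 2^M - 1 that is >= need.
  std::int64_t bmax = 0;
  while (bmax < need)
    bmax = bmax * 2 + 1;

  // bmin is zero for non-negative enumerators, -(bmax + 1) otherwise.
  std::int64_t bmin = emin < 0 ? -(bmax + 1) : 0;
  return {bmin, bmax};
}
```

Under these assumptions an enum with enumerators 0, 1 and 2 has the range [0, 3], so a value such as 4 lies outside what -fstrict-enums lets the optimizer assume, even though the underlying type is wider.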

Thanks,
Andrew

>
> +  /* Check if cand can represent values of use for strict enums.  */
> +  else if (TREE_CODE (ctype) == ENUMERAL_TYPE && flag_strict_enums)
> +{
>
> if we don't have enum-typed IV candidates then the computation should
> be carried out in INTEGER_TYPE and then be converted to enum type.
> So why's this and the may_eliminate_iv hunks necessary?
>
> Richard.
>
> > Thanks,
> > bin
> > 2020-03-02  Bin Cheng  
> >
> > PR tree-optimization/93674
> > * tree-ssa-loop-ivopts.c (add_iv_candidate_for_use): Add candidate
> > for enumeral type iv_use converted from other iv.
> > (get_computation_cost, may_eliminate_iv): Avoid compute, eliminate
> > iv_use with enumeral type iv_cand in case of -fstrict-enums.
> >
> > gcc/testsuite
> > 2020-03-02  Bin Cheng  
> >
> > PR tree-optimization/93674
> > * g++.dg/pr93674.C: New test.


Re: [patch, fortran] PR93486 - ICE on valid with nested submodules and long submodule names

2020-03-02 Thread Andrew Benson
Hi Paul,

Thanks for the review. This is now committed as:

r10-6976-gf3c276aec26d9e406cc4bbf0e18b1105df63f0ee

I'll keep this in mind for future patches - this one seemed simple enough that 
I'd be confident to commit it without review after waiting for a few days. I'm 
hoping to find time to finish some other patches soon, some of which are more 
complicated and I'd definitely want to get reviewed before I commit them.

Thanks again everyone.

-Andrew

On Monday, March 2, 2020 6:41:46 AM PST Paul Richard Thomas wrote:
> Andrew,
> 
> I agree with Steve. That said, I took a look at your patch and it's
> just fine. OK to commit.
> 
> Cheers
> 
> Paul
> 
> On Mon, 2 Mar 2020 at 02:10, Steve Kargl wrote:
> > On Sun, Mar 01, 2020 at 11:43:23PM +0100, Thomas Koenig wrote:
> > > Am 01.03.20 um 23:42 schrieb Steve Kargl:
> > > > PS: in general, after multiple
> > > > pings, just commit the patch.
> > > 
> > > ... well, maybe after a "If there is no reply within a
> > > couple of days, I will commit this" :-)
> > 
> > Andrew submitted the patch and pinged it twice.  gfortran
> > development is running on fumes.  Beating one's head
> > against a wall seems counterproductive.  I'm operating
> > on a principle that if one has commit access for gfortran,
> > one is committing a patch with the best intentions.  Could
> > this lead to a regression?  Sure.  The alternative of
> > constantly pinging patches is to simply stop submitting
> > patches.
> > 
> > 
> > --
> > Steve


-- 

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus



[Ping][PATCH][Arm] ACLE intrinsics: AdvSIMD BFloat16 convert instructions

2020-03-02 Thread Dennis Zhang

Hi all,

On 17/01/2020 16:46, Dennis Zhang wrote:

Hi all,

This patch is part of a series adding support for Armv8.6-A features.
It depends on Arm BFMode patch 
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html


This patch implements intrinsics to convert between bfloat16 and float32 
formats.

ACLE documents are at https://developer.arm.com/docs/101028/latest
ISA documents are at https://developer.arm.com/docs/ddi0596/latest

Regression tested.

Is it OK for trunk please?

Thanks,
Dennis

gcc/ChangeLog:

2020-01-17  Dennis Zhang  

 * config/arm/arm_bf16.h (vcvtah_f32_bf16, vcvth_bf16_f32): New.
 * config/arm/arm_neon.h (vcvt_f32_bf16, vcvtq_low_f32_bf16): New.
 (vcvtq_high_f32_bf16, vcvt_bf16_f32): New.
 (vcvtq_low_bf16_f32, vcvtq_high_bf16_f32): New.
 * config/arm/arm_neon_builtins.def (vbfcvt, vbfcvt_high): New entries.
 (vbfcvtv4sf, vbfcvtv4sf_high): Likewise.
 * config/arm/iterators.md (VBFCVT, VBFCVTM): New mode iterators.
 (V_bf_low, V_bf_cvt_m): New mode attributes.
 * config/arm/neon.md (neon_vbfcvtv4sf): New.
 (neon_vbfcvtv4sf_highv8bf, neon_vbfcvtsf): New.
 (neon_vbfcvt, neon_vbfcvt_highv8bf): New.
 (neon_vbfcvtbf_cvtmode, neon_vbfcvtbf): New.
 * config/arm/unspecs.md (UNSPEC_BFCVT, UNSPEC_BFCVT_HIG): New.

gcc/testsuite/ChangeLog:

2020-01-17  Dennis Zhang  

 * gcc.target/arm/simd/bf16_cvt_1.c: New test.




The tests in this patch have been updated to check the generated assembly.
Rebased to trunk top.

Is it OK to commit please?

Cheers
Dennis
diff --git a/gcc/config/arm/arm_bf16.h b/gcc/config/arm/arm_bf16.h
index decf23f3834..1aa593192c0 100644
--- a/gcc/config/arm/arm_bf16.h
+++ b/gcc/config/arm/arm_bf16.h
@@ -34,6 +34,20 @@ extern "C" {
 typedef __bf16 bfloat16_t;
 typedef float float32_t;
 
+__extension__ extern __inline float32_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvtah_f32_bf16 (bfloat16_t __a)
+{
+  return __builtin_neon_vbfcvtbf (__a);
+}
+
+__extension__ extern __inline bfloat16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvth_bf16_f32 (float32_t __a)
+{
+  return __builtin_neon_vbfcvtsf (__a);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 81c407f5152..a66961d0c51 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -19379,6 +19379,55 @@ vbfdotq_lane_f32 (float32x4_t __r, bfloat16x8_t __a, bfloat16x4_t __b,
 
 #pragma GCC pop_options
 
+#pragma GCC push_options
+#pragma GCC target ("arch=armv8.2-a+bf16")
+
+__extension__ extern __inline float32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvt_f32_bf16 (bfloat16x4_t __a)
+{
+  return __builtin_neon_vbfcvtv4bf (__a);
+}
+
+__extension__ extern __inline float32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvtq_low_f32_bf16 (bfloat16x8_t __a)
+{
+  return __builtin_neon_vbfcvtv8bf (__a);
+}
+
+__extension__ extern __inline float32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvtq_high_f32_bf16 (bfloat16x8_t __a)
+{
+  return __builtin_neon_vbfcvt_highv8bf (__a);
+}
+
+__extension__ extern __inline bfloat16x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvt_bf16_f32 (float32x4_t __a)
+{
+  return __builtin_neon_vbfcvtv4sfv4bf (__a);
+}
+
+__extension__ extern __inline bfloat16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvtq_low_bf16_f32 (float32x4_t __a)
+{
+  return __builtin_neon_vbfcvtv4sfv8bf (__a);
+}
+
+/* The 'inactive' operand is not converted but it provides the
+   low 64 bits to assemble the final 128-bit result.  */
+__extension__ extern __inline bfloat16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvtq_high_bf16_f32 (bfloat16x8_t inactive, float32x4_t __a)
+{
+  return __builtin_neon_vbfcvtv4sf_highv8bf (inactive, __a);
+}
+
+#pragma GCC pop_options
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def
index 4b4d1c808d8..48c06c43a17 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -385,3 +385,9 @@ VAR1 (USTERNOP, usmmla, v16qi)
 VAR2 (TERNOP, vbfdot, v2sf, v4sf)
 VAR2 (MAC_LANE_PAIR, vbfdot_lanev4bf, v2sf, v4sf)
 VAR2 (MAC_LANE_PAIR, vbfdot_lanev8bf, v2sf, v4sf)
+
+VAR2 (UNOP, vbfcvt, sf, bf)
+VAR2 (UNOP, vbfcvt, v4bf, v8bf)
+VAR1 (UNOP, vbfcvt_high, v8bf)
+VAR2 (UNOP, vbfcvtv4sf, v4bf, v8bf)
+VAR1 (BINOP, vbfcvtv4sf_high, v8bf)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index ab30c371583..5f4e3d12358 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -229,6 +229,10 @@
 ;; Modes for polynomial or float values.
 (define_mode_iterator VPF [V8QI V16QI V2SF V4SF])
 
+;; Modes for BF16 convert instructions.
+(define_mode_iterator VBFCVT [V4BF V8BF])
+(define_mode_iterator VBFCVTM [V2SI SF])
+

Re: [PATCH] libstdc++: Implement integer-class types as defined in [iterator.concept.winc]

2020-03-02 Thread Patrick Palka
On Mon, 24 Feb 2020, Patrick Palka wrote:

> On Mon, 24 Feb 2020, Patrick Palka wrote:
> 
> > This implements signed and unsigned integer-class types, whose width is one bit
> > larger than the widest native signed and unsigned integral type respectively.
> > In our case this is either __int128 and unsigned __int128, or long long and
> > unsigned long long.
> > 
> > Internally, the two integer-class types are represented as a largest native
> > unsigned integral type plus one extra bit.  The signed integer-class type is
> > represented in two's complement form with the extra bit acting as the sign bit.
> > 
> > libstdc++-v3/ChangeLog:
> > 
> > * include/bits/iterator_concepts.h (ranges::__detail::__max_diff_type):
> > Remove definition, replace with forward declaration of class
> > __max_diff_type.
> > (ranges::__detail::__max_size_type): Remove definition, replace with
> > forward declaration of class __max_size_type.
> > (__detail::__is_integer_like): Accept __int128 and unsigned __int128.
> > (__detail::__is_signed_integer_like): Accept __int128.
> > * include/bits/range_access.h (__detail::__max_size_type): New class.
> > (__detail::__max_diff_type): New class.
> > (__detail::__max_size_type::__max_size_type): Define this constructor
> > out-of-line to break the cycle.
> > (__detail::__to_unsigned_like): New function.
> > (numeric_limits<__detail::__max_size_type>): New explicit specialization.
> > (numeric_limits<__detail::__max_diff_type>): New explicit specialization.
> > * testsuite/std/ranges/iota/difference_type.cc: New test.
> 
> Here's v2 of the patch that splits out __max_size_type and
> __max_diff_type into a dedicated header, along with other misc
> improvements and fixes.
> 
> -- >8 --

Here's v3 of the patch.  Changes from v2:

* The arithmetic tests in difference_type.cc have been split out to a
separate file.

* The arithmetic tests now run successfully in strict ANSI mode.  The
issue was that __int128 does not model the integral concept in strict
ANSI mode, and that concept is what we use to make operations on this
type behave as integer operations do.  We need to always treat __int128
as an integer type in this API, so a new concept, __integralish, which
is always modelled by __int128, is introduced and used in the API instead.

* Comments have been added explaining why __int128 is always used as the
underlying type even when the widest integer type in strict ANSI mode is
long long.

* New tests, some minor code clean-ups, and added comments to the
unsigned division and multiplication routines.

Tested on x86_64-pc-linux-gnu in both strict and GNU compilation modes,
with and without -U__SIZEOF_INT128__.

-- >8 --

This implements signed and unsigned integer-class types, whose width is one bit
larger than the widest supported signed and unsigned integral type respectively.
In our case this is either __int128 and unsigned __int128, or long long and
unsigned long long.

Internally, the two integer-class types are represented as the largest native
unsigned integral type plus one extra bit.  The signed integer-class type is
represented in two's complement form with the extra bit acting as the sign bit.
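
To illustrate the representation described above, here is a deliberately simplified sketch, not the code from the patch. It models an unsigned integer-class value as a 64-bit word plus one extra most-significant bit; the name wide_unsigned and its members are hypothetical, and the real __max_size_type supports far more operations.

```cpp
#include <cstdint>

// Hypothetical, simplified model of an unsigned integer-class type:
// a 64-bit native value plus one extra most-significant bit, giving
// 65 bits of precision in total.
struct wide_unsigned
{
  std::uint64_t val;  // low 64 bits
  bool msb;           // bit 64

  // Addition modulo 2^65: a carry out of the low word feeds the extra bit.
  friend wide_unsigned
  operator+ (wide_unsigned a, wide_unsigned b)
  {
    std::uint64_t sum = a.val + b.val;
    bool carry = sum < a.val;  // unsigned wrap-around means a carry occurred
    return { sum, static_cast<bool> (a.msb ^ b.msb ^ carry) };
  }

  friend bool
  operator== (wide_unsigned a, wide_unsigned b)
  { return a.val == b.val && a.msb == b.msb; }
};
```

In this model, adding 1 to the largest 64-bit value does not wrap to zero; it sets the extra bit instead, which is the property that makes the type one bit wider than the native one.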

libstdc++-v3/ChangeLog:

* include/Makefile.am (bits_headers): Add new header
.
* include/Makefile.in: Regenerate.
* include/bits/iterator_concepts.h (ranges::__detail::__max_diff_type):
Remove definition, replace with forward declaration of class
__max_diff_type.
(ranges::__detail::__max_size_type): Remove definition, replace with
forward declaration of class __max_size_type.
(__detail::__is_integer_like): Accept __int128 and unsigned __int128.
(__detail::__is_signed_integer_like): Accept __int128.
* include/bits/max_size_type.h: New header.
* include/bits/range_access.h: Include .
(__detail::__to_unsigned_like): Two new overloads.
* testsuite/std/ranges/iota/difference_type.cc: New test.
* testsuite/std/ranges/iota/max_size_type.cc: New test.
---
 libstdc++-v3/include/Makefile.am  |   1 +
 libstdc++-v3/include/Makefile.in  |   1 +
 libstdc++-v3/include/bits/iterator_concepts.h |  15 +-
 libstdc++-v3/include/bits/max_size_type.h | 764 ++
 libstdc++-v3/include/bits/range_access.h  |  11 +
 .../std/ranges/iota/difference_type.cc|  57 ++
 .../std/ranges/iota/max_size_type.cc  | 376 +
 7 files changed, 1218 insertions(+), 7 deletions(-)
 create mode 100644 libstdc++-v3/include/bits/max_size_type.h
 create mode 100644 libstdc++-v3/testsuite/std/ranges/iota/difference_type.cc
 create mode 100644 libstdc++-v3/testsuite/std/ranges/iota/max_size_type.cc

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 80aeb3f8959..a1460b98247 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Ma

[committed]: i386: Allow only registers with VALID_INT_MODE_P modes in movstrict [PR93997]

2020-03-02 Thread Uros Bizjak
The *movstrict_1 insn pattern allows only general registers,
so we have to reject modes not suitable for general regs in
the corresponding movstrict expander.

2020-03-02  Uroš Bizjak  

PR target/93997
* config/i386/i386.md (movstrict): Allow only
registers with VALID_INT_MODE_P modes.

testsuite/ChangeLog:

2020-03-02  Uroš Bizjak  

PR target/93997
* gcc.target/i386/pr93997.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6c57500ae8ec..8e29dffafa6e 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2778,7 +2778,7 @@
 {
   gcc_assert (SUBREG_P (operands[0]));
   if ((TARGET_PARTIAL_REG_STALL && optimize_function_for_speed_p (cfun))
-  || GET_MODE_CLASS (GET_MODE (SUBREG_REG (operands[0]))) != MODE_INT)
+  || !VALID_INT_MODE_P (GET_MODE (SUBREG_REG (operands[0]))))
 FAIL;
 })
 


Re: GLIBC libmvec status

2020-03-02 Thread Jakub Jelinek
On Mon, Mar 02, 2020 at 08:20:01PM +, GT wrote:
> Which raises the question: what use-case motivated allowing the compiler
> to auto-vectorize user defined functions? From having manually created vector

The feature made it into the OpenMP standard (already OpenMP 4.0) and so got
implemented as part of the OpenMP 4.0 implementation.

> versions of sin, cos and other libmvec functions, I'm wondering how GCC is able to
> autovectorize a non-trivial user defined function.

There are various tests that cover it, look e.g. at tests that require vect_simd_clones
effective target (e.g. in gcc/testsuite/*/{vect,gomp}/ and libgomp/testsuite/*/).

In the OpenMP standard, see e.g.
https://www.openmp.org/spec-html/5.0/openmpsu42.html#x65-1390002.9.3

Jakub



Re: GLIBC libmvec status

2020-03-02 Thread GT
On Thursday, February 27, 2020 9:52 AM, Jakub Jelinek  wrote:

> On Thu, Feb 27, 2020 at 08:47:19AM -0600, Bill Schmidt wrote:
>
> > But is this actually a good idea? It seems to me this will generate lousy
> > code in the absence of hardware support. Won't we be better off warning and
> > ignoring the directive, leaving the code in scalar form?
>
> Depends on the exact code, I think sometimes it will be just fine and will
> allow vectorizing something that really couldn't be otherwise.
> Isn't it better to leave it for the user to decide?
> They can always ask for it not to be generated (add notinbranch) if it isn't
> worthwhile.
>

I'm trying to understand what the x86_64 implementation does w.r.t. masked versions
of user defined functions. I haven't found any test under directory testsuite which verifies
that compiler-generated versions (from inbranch being specified) produce expected
results.

Which raises the question: what use-case motivated allowing the compiler
to auto-vectorize user defined functions? From having manually created vector
versions of sin, cos and other libmvec functions, I'm wondering how GCC is able to
autovectorize a non-trivial user defined function.

Any pointers to relevant tests and documentation will be really appreciated.

Thanks.
Bert.


[committed] analyzer: detect malloc, free, calloc within "std" [PR93959]

2020-03-02 Thread David Malcolm
PR analyzer/93959 reported that g++.dg/analyzer/malloc.C was failing
with no output on Solaris.

The issue is that  there has "using std::free;", converting
all the "free" calls to std::free, which fails the name-matching via
is_named_call_p.

This patch implements an is_std_named_call_p variant of is_named_call_p
to check for the name within "std", and uses it in sm-malloc.c to check
for std::malloc, std::calloc, and std::free.
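
The idea can be modelled outside GCC with a toy declaration type. This is only a sketch: toy_decl and both function names are hypothetical stand-ins for GCC's tree accessors (DECL_NAME, DECL_CONTEXT), and inline namespaces inside std are ignored here as well.

```cpp
#include <cstring>

// Toy stand-in for a declaration: a name plus its enclosing scope.
struct toy_decl
{
  const char *name;
  const toy_decl *context;  // nullptr means file (global) scope
  bool is_namespace;
};

// True if FN is declared directly inside a top-level namespace "std".
bool
toy_in_std_p (const toy_decl &fn)
{
  const toy_decl *ns = fn.context;
  if (!ns || !ns->is_namespace)
    return false;
  if (ns->context)  // "std" must itself sit at file scope
    return false;
  return std::strcmp (ns->name, "std") == 0;
}

// True if FN is std::FUNCNAME; unlike a global-scope match, the name
// must match exactly, with no leading underscores stripped.
bool
toy_std_named_call_p (const toy_decl &fn, const char *funcname)
{
  return toy_in_std_p (fn) && std::strcmp (fn.name, funcname) == 0;
}
```

A checker built this way would test both the global-scope match and the std match, so that "free" and "std::free" are treated alike.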

Verified the fix on sparc-sun-solaris2.11 (gcc211.fsffrance.org).
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r10-6981-g9f00b22f98ec0688fcd9816a03aa3f7eea58bcf7.

gcc/analyzer/ChangeLog:
PR analyzer/93959
* analyzer.cc (is_std_function_p): New function.
(is_std_named_call_p): New functions.
* analyzer.h (is_std_named_call_p): New decl.
* sm-malloc.cc (malloc_state_machine::on_stmt): Check for "std::"
variants when checking for malloc, calloc and free.

gcc/testsuite/ChangeLog:
PR analyzer/93959
* g++.dg/analyzer/cstdlib-2.C: New test.
* g++.dg/analyzer/cstdlib.C: New test.
---
 gcc/analyzer/analyzer.cc  | 61 +++
 gcc/analyzer/analyzer.h   |  2 +
 gcc/analyzer/sm-malloc.cc |  3 ++
 gcc/testsuite/g++.dg/analyzer/cstdlib-2.C | 25 ++
 gcc/testsuite/g++.dg/analyzer/cstdlib.C   | 17 +++
 5 files changed, 108 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/analyzer/cstdlib-2.C
 create mode 100644 gcc/testsuite/g++.dg/analyzer/cstdlib.C

diff --git a/gcc/analyzer/analyzer.cc b/gcc/analyzer/analyzer.cc
index 5cf745ea632..8bc3ce49f07 100644
--- a/gcc/analyzer/analyzer.cc
+++ b/gcc/analyzer/analyzer.cc
@@ -86,6 +86,49 @@ is_named_call_p (tree fndecl, const char *funcname)
   return 0 == strcmp (tname, funcname);
 }
 
+/* Return true if FNDECL is within the namespace "std".
+   Compare with cp/typeck.c: decl_in_std_namespace_p, but this doesn't
+   rely on being the C++ FE (or handle inline namespaces inside of std).  */
+
+static inline bool
+is_std_function_p (const_tree fndecl)
+{
+  tree name_decl = DECL_NAME (fndecl);
+  if (!name_decl)
+return false;
+  if (!DECL_CONTEXT (fndecl))
+return false;
+  if (TREE_CODE (DECL_CONTEXT (fndecl)) != NAMESPACE_DECL)
+return false;
+  tree ns = DECL_CONTEXT (fndecl);
+  if (!(DECL_CONTEXT (ns) == NULL_TREE
+   || TREE_CODE (DECL_CONTEXT (ns)) == TRANSLATION_UNIT_DECL))
+return false;
+  if (!DECL_NAME (ns))
+return false;
+  return id_equal ("std", DECL_NAME (ns));
+}
+
+/* Like is_named_call_p, but look for std::FUNCNAME.  */
+
+bool
+is_std_named_call_p (tree fndecl, const char *funcname)
+{
+  gcc_assert (fndecl);
+  gcc_assert (funcname);
+
+  if (!is_std_function_p (fndecl))
+return false;
+
+  tree identifier = DECL_NAME (fndecl);
+  const char *name = IDENTIFIER_POINTER (identifier);
+  const char *tname = name;
+
+  /* Don't disregard prefix _ or __ in FNDECL's name.  */
+
+  return 0 == strcmp (tname, funcname);
+}
+
 /* Helper function for checkers.  Is FNDECL an extern fndecl at file scope
that has the given FUNCNAME, and does CALL have the given number of
arguments?  */
@@ -106,6 +149,24 @@ is_named_call_p (tree fndecl, const char *funcname,
   return true;
 }
 
+/* Like is_named_call_p, but check for std::FUNCNAME.  */
+
+bool
+is_std_named_call_p (tree fndecl, const char *funcname,
+const gcall *call, unsigned int num_args)
+{
+  gcc_assert (fndecl);
+  gcc_assert (funcname);
+
+  if (!is_std_named_call_p (fndecl, funcname))
+return false;
+
+  if (gimple_call_num_args (call) != num_args)
+return false;
+
+  return true;
+}
+
 /* Return true if stmt is a setjmp or sigsetjmp call.  */
 
 bool
diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index 1ae76cc4ea0..5364edb3d96 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -78,6 +78,8 @@ extern bool is_special_named_call_p (const gcall *call, const char *funcname,
 extern bool is_named_call_p (tree fndecl, const char *funcname);
 extern bool is_named_call_p (tree fndecl, const char *funcname,
 const gcall *call, unsigned int num_args);
+extern bool is_std_named_call_p (tree fndecl, const char *funcname,
+const gcall *call, unsigned int num_args);
 extern bool is_setjmp_call_p (const gcall *call);
 extern bool is_longjmp_call_p (const gcall *call);
 
diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
index 46225b6f700..aaef6959362 100644
--- a/gcc/analyzer/sm-malloc.cc
+++ b/gcc/analyzer/sm-malloc.cc
@@ -611,6 +611,8 @@ malloc_state_machine::on_stmt (sm_context *sm_ctxt,
   {
if (is_named_call_p (callee_fndecl, "malloc", call, 1)
|| is_named_call_p (callee_fndecl, "calloc", call, 2)
+   || is_std_named_call_p (callee_fndecl, "malloc", call, 1)
+   || is_std_named_call_p (callee_fndecl,

[committed] invoke.texi: add missing option to -fanalyzer list

2020-03-02 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as 6e078aec716aa8214c13d7d20292aa232b5b.

gcc/ChangeLog:
* doc/invoke.texi (Static Analyzer Options): Add
-Wanalyzer-stale-setjmp-buffer to the list of options enabled
by -fanalyzer.
---
 gcc/doc/invoke.texi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4f88fe68999..dc7440db103 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8247,6 +8247,7 @@ Enabling this option effectively enables the following warnings:
 -Wanalyzer-possible-null-dereference @gol
 -Wanalyzer-null-argument @gol
 -Wanalyzer-null-dereference @gol
+-Wanalyzer-stale-setjmp-buffer @gol
 -Wanalyzer-tainted-array-index @gol
 -Wanalyzer-unsafe-call-within-signal-handler @gol
 -Wanalyzer-use-after-free @gol
-- 
2.21.0



[committed] analyzer: don't print the duplicate count by default

2020-03-02 Thread David Malcolm
The note about duplicates attached to analyzer diagnostics feels like an
implementation detail; it's likely just noise from the perspective of an
end-user.

This patch disables it by default, introducing a flag to re-enable it.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as 13b7691238f189c7a233aedec49306a7cb2b0a15.

gcc/analyzer/ChangeLog:
* analyzer.opt (fanalyzer-show-duplicate-count): New option.
* diagnostic-manager.cc
(diagnostic_manager::emit_saved_diagnostic): Use the above to
guard the printing of the duplicate count.

gcc/ChangeLog:
* doc/invoke.texi (-fanalyzer-show-duplicate-count): New.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/CVE-2005-1689-dedupe-issue.c: Add
-fanalyzer-show-duplicate-count.
---
 gcc/analyzer/analyzer.opt | 4 
 gcc/analyzer/diagnostic-manager.cc| 2 +-
 gcc/doc/invoke.texi   | 8 
 .../gcc.dg/analyzer/CVE-2005-1689-dedupe-issue.c  | 2 ++
 4 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
index 4d122f3593a..22cf4b0ad3b 100644
--- a/gcc/analyzer/analyzer.opt
+++ b/gcc/analyzer/analyzer.opt
@@ -114,6 +114,10 @@ fanalyzer-fine-grained
 Common Var(flag_analyzer_fine_grained) Init(0)
 Avoid combining multiple statements into one exploded edge.
 
+fanalyzer-show-duplicate-count
+Common Var(flag_analyzer_show_duplicate_count) Init(0)
+Issue a note when diagnostics are deduplicated.
+
 fanalyzer-state-purge
 Common Var(flag_analyzer_state_purge) Init(1)
 Purge unneeded state during analysis.
diff --git a/gcc/analyzer/diagnostic-manager.cc b/gcc/analyzer/diagnostic-manager.cc
index b8e59334374..7435092e2d7 100644
--- a/gcc/analyzer/diagnostic-manager.cc
+++ b/gcc/analyzer/diagnostic-manager.cc
@@ -541,7 +541,7 @@ diagnostic_manager::emit_saved_diagnostic (const exploded_graph &eg,
   auto_cfun sentinel (sd.m_snode->m_fun);
   if (sd.m_d->emit (&rich_loc))
 {
-  if (num_dupes > 0)
+  if (flag_analyzer_show_duplicate_count && num_dupes > 0)
inform_n (stmt->location, num_dupes,
  "%i duplicate", "%i duplicates",
  num_dupes);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dc7440db103..54375ebd679 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8477,6 +8477,14 @@ By default, an edge in this graph can contain the effects of a run
 of multiple statements within a basic block.  With
 @option{-fanalyzer-fine-grained}, each statement gets its own edge.
 
+@item -fanalyzer-show-duplicate-count
+@opindex fanalyzer-show-duplicate-count
+@opindex fno-analyzer-show-duplicate-count
+This option is intended for analyzer developers: if multiple diagnostics
+have been detected as being duplicates of each other, it emits a note when
+reporting the best diagnostic, giving the number of additional diagnostics
+that were suppressed by the deduplication logic.
+
 @item -fno-analyzer-state-merge
 @opindex fanalyzer-state-merge
 @opindex fno-analyzer-state-merge
diff --git a/gcc/testsuite/gcc.dg/analyzer/CVE-2005-1689-dedupe-issue.c b/gcc/testsuite/gcc.dg/analyzer/CVE-2005-1689-dedupe-issue.c
index 53c046ed12f..b43148cb4a7 100644
--- a/gcc/testsuite/gcc.dg/analyzer/CVE-2005-1689-dedupe-issue.c
+++ b/gcc/testsuite/gcc.dg/analyzer/CVE-2005-1689-dedupe-issue.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-fanalyzer-show-duplicate-count" } */
+
 #include 
 
 typedef struct _krb5_data {
-- 
2.21.0



Re: GLIBC libmvec status

2020-03-02 Thread GT
On Monday, March 2, 2020 3:31 PM, Jakub Jelinek  wrote:

> On Mon, Mar 02, 2020 at 08:20:01PM +, GT wrote:
>
> > Which raises the question: what use-case motivated allowing the compiler
> > to auto-vectorize user defined functions? From having manually created vector
>
> The feature made it into the OpenMP standard (already OpenMP 4.0) and so got
> implemented as part of the OpenMP 4.0 implementation.
>
> > versions of sin, cos and other libmvec functions, I'm wondering how GCC is able to
> > autovectorize a non-trivial user defined function.
>

Searching openmp.org located document "OpenMP API Examples". The relevant example
for inbranch/notinbranch shows very simple functions (SIMD.6.c). GCC testsuite
functions are similarly simple.
Wouldn't the same effect be achieved by letting GCC inline such functions and having
the loop autovectorizer handle the resulting code?


> There are various tests that cover it, look e.g. at tests that require vect_simd_clones
> effective target (e.g. in gcc/testsuite/*/{vect,gomp}/ and libgomp/testsuite/*/).
>

Sorry, I can't identify any test that ensures a masked vector function variant produces
expected results. I'll check again but I need more help here.

Bert.


Re: GLIBC libmvec status

2020-03-02 Thread Jakub Jelinek
On Mon, Mar 02, 2020 at 09:40:59PM +, GT wrote:
> Searching openmp.org located document "OpenMP API Examples". The relevant example
> for inbranch/notinbranch shows very simple functions (SIMD.6.c). GCC testsuite
> functions are similarly simple.
> Wouldn't the same effect be achieved by letting GCC inline such functions and having
> the loop autovectorizer handle the resulting code?

If it is defined in headers and inlinable, sure, then you don't need to mark
it any way.
The pragmas are mainly for functions that aren't inlinable for whatever
reason.
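
A minimal sketch of that use case, with hypothetical function names: the callee is assumed to live in another translation unit in practice (it is defined here only so the example is self-contained), and the pragmas take effect only when compiling with -fopenmp or -fopenmp-simd.

```cpp
#include <cstddef>

// In practice the definition would be in a different translation unit
// and only the declaration, carrying the same pragma, would be visible
// to callers.  "notinbranch" promises that the function is never called
// conditionally inside a SIMD loop, so no masked clone is needed.
#pragma omp declare simd notinbranch
int
scale_add (int x, int y)
{
  return 2 * x + y;
}

// With OpenMP SIMD support enabled, the compiler may vectorize this
// loop by calling a vector clone of scale_add instead of inlining it.
void
apply (int *out, const int *a, const int *b, std::size_t n)
{
  #pragma omp simd
  for (std::size_t i = 0; i < n; i++)
    out[i] = scale_add (a[i], b[i]);
}
```

Without the OpenMP options the pragmas are ignored and the code still computes the same results with scalar calls.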

> > There are various tests that cover it, look e.g. at tests that require vect_simd_clones
> > effective target (e.g. in gcc/testsuite/*/{vect,gomp}/ and libgomp/testsuite/*/).
> >
> 
> Sorry, I can't identify any test that ensures a masked vector function variant produces
> expected results. I'll check again but I need more help here.

Indeed, there aren't any yet on the vectorizer side.  I thought I had already
implemented it in the vectorizer, but apparently I didn't; only the omp-simd-clone.c
part is implemented (the more important part, as it matters for the ABI).  A testcase
could be something along the lines of
#pragma omp declare simd
int foo (int, int);

void
bar (int *a, int *b, int *c)
{
  #pragma omp simd
  for (int i = 0; i < 1024; i++)
    {
      int d = b[i], e = c[i], f;
      if (b[i] < 20)
        f = foo (d, e);
      else
        f = d + e;
    }
}
To make this work, one would need to tweak tree-if-conv.c (invent some way
to represent the conditional calls in the IL during the vect pass,
probably some new internal function) and then handle that in
vectorizable_simd_clone_call.

Jakub



[PATCH] rs6000: Fix -mpower9-vector -mno-altivec ICE (PR87560)

2020-03-02 Thread Bill Schmidt
PR87560 reports an ICE when a test case is compiled with -mpower9-vector
and -mno-altivec.  This patch terminates compilation with an error when
this combination (and other unreasonable ones) are requested.
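
The table-driven check can be pictured with the following simplified model. It is a sketch under stated assumptions, not the rs6000 code: the mask values, the dep_entry type and check_switches are all hypothetical, and the real function works on the target's option masks and reports errors through GCC's diagnostics.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flag bits; the real OPTION_MASK_* values differ.
enum : std::uint32_t
{
  MASK_ALTIVEC   = 1u << 0,
  MASK_VSX       = 1u << 1,
  MASK_P8_VECTOR = 1u << 2,
  MASK_P9_VECTOR = 1u << 3
};

struct dep_entry
{
  std::uint32_t no_flag;    // option the user explicitly disabled
  std::uint32_t dep_flags;  // options that cannot work without it
  const char *name;         // option name without the "-m" prefix
};

static const char *
flag_name (std::uint32_t bit)
{
  switch (bit)
    {
    case MASK_ALTIVEC:   return "altivec";
    case MASK_VSX:       return "vsx";
    case MASK_P8_VECTOR: return "power8-vector";
    case MASK_P9_VECTOR: return "power9-vector";
    default:             return "unknown";
    }
}

// Report every explicitly enabled option that depends on an
// explicitly disabled one.
std::vector<std::string>
check_switches (std::uint32_t explicit_off, std::uint32_t explicit_on)
{
  static const dep_entry table[] = {
    { MASK_P8_VECTOR, MASK_P9_VECTOR, "power8-vector" },
    { MASK_VSX, MASK_P8_VECTOR | MASK_P9_VECTOR, "vsx" },
    { MASK_ALTIVEC, MASK_VSX | MASK_P8_VECTOR | MASK_P9_VECTOR, "altivec" },
  };

  std::vector<std::string> errors;
  for (const dep_entry &e : table)
    {
      if (!(explicit_off & e.no_flag))
        continue;
      std::uint32_t conflicts = e.dep_flags & explicit_on;
      for (std::uint32_t bit = 1; bit != 0 && bit <= conflicts; bit <<= 1)
        if (conflicts & bit)
          errors.push_back (std::string ("-mno-") + e.name
                            + " turns off -m" + flag_name (bit));
    }
  return errors;
}
```

In this model the altivec row is what catches the combination from the PR; without it, nothing relates an explicit -mno-altivec to an explicit -mpower9-vector.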

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Reported error is now:

f951: Error: '-mno-altivec' turns off '-mpower9-vector'

Is this okay for master, and for backport to releases/gcc-9 after the
9.3 release?  There's no urgency in getting this in 9.3.

Thanks,
Bill

2020-03-02  Bill Schmidt  

* rs6000-cpus.def (OTHER_ALTIVEC_MASKS): New #define.
* rs6000.c (rs6000_disable_incompatible_switches): Add table entry
for OPTION_MASK_ALTIVEC.
---
 gcc/config/rs6000/rs6000-cpus.def | 4 
 gcc/config/rs6000/rs6000.c| 1 +
 2 files changed, 5 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 193d77eb954..ff1db6019de 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -101,6 +101,10 @@
 | OPTION_MASK_FLOAT128_KEYWORD \
 | OPTION_MASK_P8_VECTOR)
 
+/* Flags that need to be turned off if -mno-altivec.  */
+#define OTHER_ALTIVEC_MASKS    (OTHER_VSX_VECTOR_MASKS \
+                                | OPTION_MASK_VSX)
+
 #define POWERPC_7400_MASK  (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
 
 /* Deal with ports that do not have -mstrict-align.  */
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9910b27ed24..ecbf7ae0c59 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -23632,6 +23632,7 @@ rs6000_disable_incompatible_switches (void)
 { OPTION_MASK_P9_VECTOR,   OTHER_P9_VECTOR_MASKS,  "power9-vector" },
 { OPTION_MASK_P8_VECTOR,   OTHER_P8_VECTOR_MASKS,  "power8-vector" },
 { OPTION_MASK_VSX, OTHER_VSX_VECTOR_MASKS, "vsx"   },
+{ OPTION_MASK_ALTIVEC, OTHER_ALTIVEC_MASKS,"altivec"   },
   };
 
   for (i = 0; i < ARRAY_SIZE (flags); i++)
-- 
2.17.1



[PING #2][PATCH] drop weakref attribute on function definitions (PR 92799)

2020-03-02 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00883.html

On 2/21/20 9:49 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00883.html

On 2/14/20 3:41 PM, Martin Sebor wrote:

Because attribute weakref introduces a kind of a definition, it can
only be applied to declarations of symbols that are not defined.  GCC
normally issues a warning when the attribute is applied to a defined
symbol, but PR 92799 shows that it misses some cases, which then
lead to an ICE.

The ICE was introduced in GCC 4.5.  Before that, GCC accepted such
invalid definitions and silently dropped the weakref attribute.

The attached patch avoids the ICE while again dropping the invalid
attribute from the definition, except with the (now) usual warning.

Tested on x86_64-linux.

I also looked for code bases that make use of attribute weakref to
rebuild them as another test but couldn't find any.  (There are
a couple of instances in the Linux kernel but they look #ifdef'd
out).  Does anyone know of any that do use it that I could try to
build on Linux?

Martin






Re: [PATCH] c++: Fix convert_like in template [PR91465, PR93870, PR92031]

2020-03-02 Thread Jason Merrill

On 2/29/20 2:32 PM, Marek Polacek wrote:

The point of this patch is to fix the recurring problem of trees
generated by convert_like while processing a template that break when
substituting.  For instance, when convert_like creates a CALL_EXPR
while in a template, substituting such a call breaks in finish_call_expr
because we have two 'this' arguments.  Another problem is that we
can create &TARGET_EXPR<> and then fail when substituting because we're
taking the address of an rvalue.  I've analyzed some of the already fixed
PRs and also some of the currently open ones:

In c++/93870 we create EnumWrapper::operator E(&operator~(E)).
In c++/87145 we create S::operator int (&{N}).
In c++/92031 we create &TARGET_EXPR <0>.

And so on.  I'd like to fix it once and for all.  I wanted something
that fixes all the existing cases, removes the ugly check in
convert_nontype_argument, and something suitable for stage4.  I.e.,
I didn't implement any cleanups suggested in
 regarding
the pattern in e.g. build_explicit_specifier.


Hmm, it seems to me that addressing that pattern is an important part of 
fixing this once and for all.



The gist of the problem is when convert_like_real creates a call for
a ck_user or wraps a TARGET_EXPR in & in a template.  So in these cases
use IMPLICIT_CONV_EXPR.  In a template we shouldn't need to perform the
actual conversion; we only need its result type.  Is that something
that convert_like_real shouldn't do?
perform_direct_initialization_if_possible and perform_implicit_conversion_flags
can also create an IMPLICIT_CONV_EXPR.
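
For readers without the PRs at hand, here is a minimal sketch of the kind of source that exercises this path: a class-type value converted through a user-defined conversion operator while the enclosing template is still being parsed.  The names Three, Holder and holder_value are made up for illustration; this is not one of the testcases added by the patch.

```cpp
#include <cassert>

// Hypothetical class with a constexpr user-defined conversion to int.
struct Three
{
  constexpr operator int () const { return 3; }
};

// Hypothetical template taking a non-type argument of type int.
template <int N>
struct Holder
{
  static constexpr int value = N;
};

// The conversion Three -> int for the template argument happens while
// this function template is being processed, i.e. "in a template".
template <typename T>
int holder_value ()
{
  return Holder<Three{}>::value;
}
```

The argument Three{} is not type-dependent, so the front end has to work out the conversion to int at template definition time, which is where the trees discussed above get built.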


This seems like a reasonable approach.


Given the change above, build_converted_constant_expr can return an
IMPLICIT_CONV_EXPR so call fold_non_dependent_expr rather than
maybe_constant_value to deal with that.  A problem with that is that now
we may instantiate something twice in a row (?).


Right, and we must not do that.


Handling all of it in
build_converted_constant_expr won't be that straightforward because we
sometimes call cxx_constant_value to give errors, or use manifestly_const_eval
which should be honored.


Fair enough.  The alternative is ensuring that it's OK to call 
build_converted_constant_expr when processing_template_decl is true, and 
that the result in that case is still template trees.  I think that's 
what you are doing.


With this approach, we can change

  expr = instantiate_non_dependent_expr_sfinae (expr, complain);
  /* Don't let convert_like_real create more template codes.  */
  processing_template_decl_sentinel s;
  expr = build_converted_constant_bool_expr (expr, complain);
  expr = cxx_constant_value (expr);

to

  expr = build_converted_constant_bool_expr (expr, complain);
  expr = instantiate_non_dependent_expr_sfinae (expr, complain);
  expr = cxx_constant_value (expr);

Yes?


Bootstrapped/regtested on x86_64-linux, ok for trunk?

2020-02-29  Marek Polacek  

PR c++/92031 - bogus taking address of rvalue error.
PR c++/91465 - ICE with template codes in check_narrowing.
PR c++/93870 - wrong error when converting template non-type arg.
* call.c (convert_like_real) : Return IMPLICIT_CONV_EXPR
in a template.
(convert_like_real) : Likewise.
* decl.c (compute_array_index_type_loc): Call fold_non_dependent_expr
instead of maybe_constant_value.
* pt.c (convert_nontype_argument): Don't build IMPLICIT_CONV_EXPR.
Set IMPLICIT_CONV_EXPR_NONTYPE_ARG if that's what
build_converted_constant_expr returned.
* typeck2.c (check_narrowing): Call fold_non_dependent_expr instead
of maybe_constant_value.

* g++.dg/cpp0x/conv-tmpl2.C: New test.
* g++.dg/cpp0x/conv-tmpl3.C: New test.
* g++.dg/cpp0x/conv-tmpl4.C: New test.
* g++.dg/cpp1z/conv-tmpl1.C: New test.
---
  gcc/cp/call.c   | 12 +
  gcc/cp/decl.c   |  4 +--
  gcc/cp/pt.c | 25 ---
  gcc/cp/typeck2.c|  6 -
  gcc/testsuite/g++.dg/cpp0x/conv-tmpl2.C | 21 
  gcc/testsuite/g++.dg/cpp0x/conv-tmpl3.C | 16 
  gcc/testsuite/g++.dg/cpp0x/conv-tmpl4.C | 33 +
  gcc/testsuite/g++.dg/cpp1z/conv-tmpl1.C | 10 
  8 files changed, 104 insertions(+), 23 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/conv-tmpl2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/conv-tmpl3.C
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/conv-tmpl4.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/conv-tmpl1.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 85bbd043a1d..4cb07b61695 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7383,6 +7383,12 @@ convert_like_real (conversion *convs, tree expr, tree 
fn, int argnum,
{
struct z_candidate *cand = convs->cand;
  
+	/* Creating &TARGET_EXPR<> in a template breaks whe

Re: [PATCH] Clear --help=language and --help=common interaction.

2020-03-02 Thread Joseph Myers
On Mon, 2 Mar 2020, Martin Liška wrote:

> +version of GCC@.  If an option is supported by all languages, one needs
> +to use @var{common} qualifier instead.

"common" is literal text, so it should be @samp{common} not @var{common}, 
and the existing documentation here describes it as a "class" with other 
things such as "undocumented" or "joined" being "qualifiers"

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] use all same precision in wide_int arguments (PR 93986)

2020-03-02 Thread Martin Sebor

The wide_int APIs expect operands to have the same precision and
abort when they don't.  This is especially insidious in code where
the operands normally do have the same precision but where mixed
precision arguments can come up as a result of unusual combinations
of optimization options.  That is also what precipitated pr93986.

The attached patch adjusts the code to extend all wide_int operands
to the same precision to avoid the ICE.
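
To show the failure mode and the fix in isolation, here is a toy model.  ToyInt, toy_from, toy_add and toy_add_widened are hypothetical stand-ins and not GCC's wide_int API; the sketch only assumes what is described above, namely that arithmetic on operands of different precision is a hard error and that converting both operands to one common precision first avoids it.

```cpp
#include <cassert>
#include <cstdint>

// Toy precision-tagged integer; NOT GCC's wide_int.
struct ToyInt
{
  std::int64_t value;   // payload, kept sign-extended
  unsigned precision;   // number of significant bits, 1..64
};

inline bool same_precision (ToyInt a, ToyInt b)
{
  return a.precision == b.precision;
}

// Re-express V at precision PREC: keep the low PREC bits of the payload
// and sign-extend them.
inline ToyInt toy_from (ToyInt v, unsigned prec)
{
  assert (prec >= 1 && prec <= 64);
  if (prec == 64)
    return ToyInt{v.value, 64};
  const std::uint64_t mask = (std::uint64_t{1} << prec) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (prec - 1);
  const std::uint64_t bits = static_cast<std::uint64_t> (v.value) & mask;
  return ToyInt{static_cast<std::int64_t> ((bits ^ sign) - sign), prec};
}

// Mixed precisions trip the assertion, mirroring the abort described above.
inline ToyInt toy_add (ToyInt a, ToyInt b)
{
  assert (same_precision (a, b));
  const std::uint64_t sum = static_cast<std::uint64_t> (a.value)
                            + static_cast<std::uint64_t> (b.value);
  return toy_from (ToyInt{static_cast<std::int64_t> (sum), a.precision},
                   a.precision);
}

// The pattern of the fix: bring both operands to one precision first.
inline ToyInt toy_add_widened (ToyInt a, ToyInt b, unsigned prec)
{
  return toy_add (toy_from (a, prec), toy_from (b, prec));
}
```

Calling toy_add directly on a 32-bit offset and a 64-bit size would fail the assertion; toy_add_widened with a precision at least as wide as both operands does not.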

Besides the usual bootstrap/testing I also compiled all string tests
in gcc.dg with the same options as in the test case in pr93986 in
an effort to weed out any lingering bugs like it (found none).

Martin
PR tree-optimization/93986 - ICE on mixed-precision wide_int arguments

gcc/testsuite/ChangeLog:

	PR tree-optimization/93986
	* gcc.dg/pr93986.c: New test.

gcc/ChangeLog:

	PR tree-optimization/93986
	* tree-ssa-strlen.c (maybe_warn_overflow): Convert all wide_int
	operands to the same precision to avoid ICEs.

diff --git a/gcc/testsuite/gcc.dg/pr93986.c b/gcc/testsuite/gcc.dg/pr93986.c
new file mode 100644
index 000..bdbc192a01d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr93986.c
@@ -0,0 +1,16 @@
+/* PR tree-optimization/93986 - ICE in decompose, at wide-int.h:984
+   { dg-do compile }
+   { dg-options "-O1 -foptimize-strlen -ftree-slp-vectorize" } */
+
+int dd (void);
+
+void ya (int cm)
+{
+  char s2[cm];
+
+  s2[cm-12] = s2[cm-11] = s2[cm-10] = s2[cm-9]
+= s2[cm-8] = s2[cm-7] = s2[cm-6] = s2[cm-5] = ' ';
+
+  if (dd ())
+__builtin_exit (0);
+}
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index b76b54efbd8..136a72700d9 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -1924,11 +1924,22 @@ maybe_warn_overflow (gimple *stmt, tree len,
   if (TREE_NO_WARNING (dest))
 return;
 
+  /* Use maximum precision to avoid overflow in the addition below.
+ Make sure all operands have the same precision to keep wide_int
+ from ICE'ing.  */
+  const int prec = ADDR_MAX_PRECISION;
+  /* Convenience constants.  */
+  const wide_int diff_min
+= wi::to_wide (TYPE_MIN_VALUE (ptrdiff_type_node), prec);
+  const wide_int diff_max
+= wi::to_wide (TYPE_MAX_VALUE (ptrdiff_type_node), prec);
+  const wide_int size_max
+= wi::to_wide (TYPE_MAX_VALUE (size_type_node), prec);
+
   /* The offset into the destination object computed below and not
  reflected in DESTSIZE.  */
   wide_int offrng[2];
-  const int off_prec = TYPE_PRECISION (ptrdiff_type_node);
-  offrng[0] = offrng[1] = wi::zero (off_prec);
+  offrng[0] = offrng[1] = wi::zero (prec);
 
   if (!si)
 {
@@ -1943,13 +1954,14 @@ maybe_warn_overflow (gimple *stmt, tree len,
 	  ref = TREE_OPERAND (ref, 0);
 	  if (get_range (off, offrng, rvals))
 	{
-	  offrng[0] = offrng[0].from (offrng[0], off_prec, SIGNED);
-	  offrng[1] = offrng[1].from (offrng[1], off_prec, SIGNED);
+	  /* Convert offsets to the expected precision.  */
+	  offrng[0] = wide_int::from (offrng[0], prec, SIGNED);
+	  offrng[1] = wide_int::from (offrng[1], prec, SIGNED);
 	}
 	  else
 	{
-	  offrng[0] = wi::to_wide (TYPE_MIN_VALUE (ptrdiff_type_node));
-	  offrng[1] = wi::to_wide (TYPE_MAX_VALUE (ptrdiff_type_node));
+	  offrng[0] = diff_min;
+	  offrng[1] = diff_max;
 	}
 	}
 
@@ -1960,13 +1972,13 @@ maybe_warn_overflow (gimple *stmt, tree len,
 	  wide_int memoffrng[2];
 	  if (get_range (mem_off, memoffrng, rvals))
 	{
-	  offrng[0] += memoffrng[0];
-	  offrng[1] += memoffrng[1];
+	  offrng[0] += wide_int::from (memoffrng[0], prec, SIGNED);
+	  offrng[1] += wide_int::from (memoffrng[1], prec, SIGNED);
 	}
 	  else
 	{
-	  offrng[0] = wi::to_wide (TYPE_MIN_VALUE (ptrdiff_type_node));
-	  offrng[1] = wi::to_wide (TYPE_MAX_VALUE (ptrdiff_type_node));
+	  offrng[0] = diff_min;
+	  offrng[1] = diff_max;
 	}
 	}
 
@@ -1974,8 +1986,8 @@ maybe_warn_overflow (gimple *stmt, tree len,
   if (int idx = get_stridx (ref, stroffrng, rvals))
 	{
 	  si = get_strinfo (idx);
-	  offrng[0] += stroffrng[0];
-	  offrng[1] += stroffrng[1];
+	  offrng[0] += wide_int::from (stroffrng[0], prec, SIGNED);
+	  offrng[1] += wide_int::from (stroffrng[1], prec, SIGNED);
 	}
 }
 
@@ -1995,7 +2007,6 @@ maybe_warn_overflow (gimple *stmt, tree len,
   /* Compute the range of sizes of the destination object.  The range
  is constant for declared objects but may be a range for allocated
  objects.  */
-  const int siz_prec = TYPE_PRECISION (size_type_node);
   wide_int sizrng[2];
   if (si)
 {
@@ -2003,7 +2014,7 @@ maybe_warn_overflow (gimple *stmt, tree len,
   alloc_call = si->alloc;
 }
   else
-offrng[0] = offrng[1] = wi::zero (off_prec);
+offrng[0] = offrng[1] = wi::zero (prec);
 
   if (!destsize)
 {
@@ -2014,7 +2025,7 @@ maybe_warn_overflow (gimple *stmt, tree len,
 	{
 	  /* Remember OFF but clear OFFRNG that may have been set above.  */
 	  destoff = off;
-	  offrng[0] = offrng[1] = wi::z

Re: [PATCH], PR target/93937, Fix variable vec_extract insn that will never match

2020-03-02 Thread Michael Meissner
On Fri, Feb 28, 2020 at 06:45:25AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Feb 28, 2020 at 12:32:06AM -0500, Michael Meissner wrote:
> > As part of my work in adding support for -mcpu=future, I noticed an insn 
> > that
> > would never match.
> 
> > It will never match, because the zero_extend result is the same mode as the
> > input, so the machine independent parts of the compiler would never insert 
> > the
> > zero extend.
> 
> It's not valid RTL, even:
>   @findex zero_extend
>   @item (zero_extend:@var{m} @var{x})
>   Represents the result of zero-extending the value @var{x}
>   to machine mode @var{m}.  @var{m} must be a fixed-point mode
>   and @var{x} a fixed-point value of a mode narrower than @var{m}.
> 
> > There is a wider issue to optimize all cases of vec_extract to do the sign,
> > zero, and float extension automatically when we are loading from memory, 
> > which
> > is PR target/93230.  I have patches for all of the cases for 93230, but they
> > will need to wait until GCC 11 opens up.
> 
> If you don't use reload_completed in the split condition you do not have
> this problem (in the normal case).  Please work on that?

No.  I tend to think that if we do the split before reload, that it will cause
some regressions, because the register allocator will take the opportunity to
change loads to vector registers to be loads to GPRs and direct moves.  One of
the original motivations for some of these patches is to avoid direct moves.

I also worry that things like having to use SUBREG's before RA (instead of just
changing the mode and/or the register number that we can do after reload) will
not work because generally vectors and scalars aren't tieable.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] coroutines: Update lambda capture handling to n4849.

2020-03-02 Thread JunMa

On 2020/3/2 5:43 PM, Iain Sandoe wrote:

Hi,

In the absence of specific comment on the handling of closures I'd
implemented something more than was intended (extending the lifetime
of lambda capture-by-copy vars to the duration of the coro).

After discussion at WG21 in February and by email, the correct handling
is to treat the closure "this" pointer the same way as for a regular one,
and thus it is the user's responsibility to ensure that the lambda capture
object has suitable lifetime for the coroutine.  It is noted that users
frequently get this wrong, so it would be a good thing to revisit for C++23.

This patch removes the additional copying behaviour for lambda capture-by-
copy vars.

@JunMa, this supersedes your fix to the aliases, which should no longer be
necessary, but I've added your testcases to this patch.

Hi Iain
Most of your patch follows the same idea as my patch, so this LGTM with
some comments.


Regards
JunMa

gcc/cp/ChangeLog:

2020-03-02  Iain Sandoe  

* coroutines.cc (struct local_var_info): Adjust to remove the
reference to the captured var, and just to note that this is a
lambda capture proxy.
(transform_local_var_uses): Handle lambda captures specially.
(struct param_frame_data): Add a visited set.
(register_param_uses): Also check for param uses in lambda
capture proxies.
(struct local_vars_frame_data): Remove captures list.
(register_local_var_uses): Handle lambda capture proxies by
noting and bypassing them.
(morph_fn_to_coro): Update to remove lifetime extension of
lambda capture-by-copy vars.

gcc/testsuite/ChangeLog:

2020-03-02  Iain Sandoe  
Jun Ma 

* g++.dg/coroutines/torture/class-05-lambda-capture-copy-local.C:
Update to have multiple uses for the lambda parm.
* g++.dg/coroutines/torture/lambda-09-init-captures.C: New test.
* g++.dg/coroutines/torture/lambda-10-mutable.C: New test.

---
  gcc/cp/coroutines.cc  | 174 +++---
  .../class-05-lambda-capture-copy-local.C  |   4 +-
  .../torture/lambda-09-init-captures.C |  55 ++
  .../coroutines/torture/lambda-10-mutable.C|  48 +
  4 files changed, 171 insertions(+), 110 deletions(-)
  create mode 100644 
gcc/testsuite/g++.dg/coroutines/torture/lambda-09-init-captures.C
  create mode 100644 gcc/testsuite/g++.dg/coroutines/torture/lambda-10-mutable.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 3e06f079787..303e6e83d54 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1783,7 +1783,7 @@ struct local_var_info
tree field_id;
tree field_idx;
tree frame_type;
-  tree captured;
+  bool is_lambda_capture;
location_t def_loc;
  };
  
@@ -1828,6 +1828,14 @@ transform_local_var_uses (tree *stmt, int *do_subtree, void *d)

  cp_walk_tree (&DECL_SIZE_UNIT (lvar), transform_local_var_uses, d,
NULL);
  
+	/* For capture proxies, this could include the decl value expr.  */

+   if (local_var.is_lambda_capture)
+ {
+   tree ve = DECL_VALUE_EXPR (lvar);
+   cp_walk_tree (&ve, transform_local_var_uses, d, NULL);
+   continue; /* No frame entry for this.  */
+ }
+
  /* TODO: implement selective generation of fields when vars are
 known not-used.  */
  if (local_var.field_id == NULL_TREE)
@@ -1842,8 +1850,9 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
  local_var.field_idx = fld_idx;
}
cp_walk_tree (&BIND_EXPR_BODY (*stmt), transform_local_var_uses, d, 
NULL);
+
/* Now we have processed and removed references to the original vars,
-we can drop those from the bind.  */
+we can drop those from the bind - leaving capture proxies alone.  */
for (tree *pvar = &BIND_EXPR_VARS (*stmt); *pvar != NULL;)
{
  bool existed;
@@ -1851,10 +1860,24 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
= lvd->local_var_uses->get_or_insert (*pvar, &existed);
  gcc_checking_assert (existed);
  
+	  /* Leave lambda closure captures alone, we replace the *this

+pointer with the frame version and let the normal process
+deal with the rest.  */
+ if (local_var.is_lambda_capture)
+   {
+ pvar = &DECL_CHAIN (*pvar);
+ continue;
+   }
+
+ /* It's not used, but we can let the optimizer deal with that.  */
  if (local_var.field_id == NULL_TREE)
-   pvar = &DECL_CHAIN (*pvar); /* Wasn't used.  */
+   {
+ pvar = &DECL_CHAIN (*pvar);
+ continue;
+   }
  

Merging these two ifs may be better.

- *pvar = DECL_CHAIN (*pvar); /* discard this one, we replaced it.  */
+ /* Discard this one, we replaced it.  */
+ *pvar = DECL_CHAIN (*pvar);
}
  
*do_s

Re: [PATCH coroutines] Handle component_ref in captures_temporary

2020-03-02 Thread JunMa

On 2020/3/2 10:49 PM, Nathan Sidwell wrote:

On 2/12/20 2:23 AM, JunMa wrote:

Hi
In captures_temporary, the current implementation fails to handle
component_ref.  This causes an ICE for co_await A when operator co_await
is defined in a base class of A.  It is also necessary to capture the
base-class object as if it were a temporary object.

This patch strips the component_ref down to its base object and checks it
as usual.
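
The stripping described above can be sketched on a toy tree.  Code, Node, strip_nops and base_object are hypothetical names for illustration and do not use GCC's tree API; the loop only mirrors the shape of the one in the patch.

```cpp
#include <cassert>

// Toy node kinds standing in for a few tree codes.
enum class Code { VarDecl, ParmDecl, TargetExpr, ComponentRef, IndirectRef, Nop };

struct Node
{
  Code code;
  const Node *operand;  // operand 0, or nullptr for leaves
};

// Skip over any chain of no-op conversion wrappers.
inline const Node *strip_nops (const Node *n)
{
  assert (n != nullptr);
  while (n->code == Code::Nop)
    n = n->operand;
  return n;
}

// Walk from a member access down to the object it is based on,
// looking through one dereference and any no-op wrappers per level.
inline const Node *base_object (const Node *n)
{
  assert (n != nullptr);
  while (n->code == Code::ComponentRef)
    {
      n = n->operand;
      if (n->code == Code::IndirectRef)
        n = n->operand;
      n = strip_nops (n);
    }
  return n;
}
```

With a chain such as ComponentRef -> ComponentRef -> IndirectRef -> Nop -> TargetExpr, base_object returns the TargetExpr node, which the caller can then classify the same way as any other operand.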


Bootstrapped and tested on x86_64; is it OK?

Regards
JunMa

gcc/cp
2020-02-12  Jun Ma 

 * coroutines.cc (captures_temporary): Strip component_ref
 to its base object.

gcc/testsuite
2020-02-12  Jun Ma 

 * g++.dg/coroutines/torture/co-await-15-capture-comp-ref.C: 
New test.


+
+  /* In case of component_ref, we need to capture the object of 
base

+ class as if it is temporary object.  There are two possibilities:
+ (*base).field and base->field.  */
+  while (TREE_CODE (parm) == COMPONENT_REF)
+    {
+  parm = TREE_OPERAND (parm, 0);
+  if (TREE_CODE (parm) == INDIRECT_REF)
+    parm = TREE_OPERAND (parm, 0);
+  while (TREE_CODE (parm) == NOP_EXPR)
+    parm = TREE_OPERAND (parm, 0);


Use STRIP_NOPS.


+    }
+
   if (TREE_CODE (parm) == VAR_DECL && !DECL_ARTIFICIAL (parm))
 /* This isn't a temporary... */
 continue;

-  if (TREE_CODE (parm) == PARM_DECL)
+  if (TREE_CODE (parm) == PARM_DECL  || TREE_CODE (parm) == 
NON_LVALUE_EXPR)

 /* .. nor is this... */
 continue;


Either a separate if, or merging both ifs (my preference) would be 
better.


nathan


Hi nathan

Here is the updated patch

Regards
JunMa
---
 gcc/cp/coroutines.cc  | 20 +++-
 .../torture/co-await-15-capture-comp-ref.C| 99 +++
 2 files changed, 114 insertions(+), 5 deletions(-)
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/torture/co-await-15-capture-comp-ref.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 966ec0583aa..2a54bcefc1e 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2613,12 +2613,22 @@ captures_temporary (tree *stmt, int *do_subtree, void 
*d)
continue;
 
   parm = TREE_OPERAND (parm, 0);
-  if (TREE_CODE (parm) == VAR_DECL && !DECL_ARTIFICIAL (parm))
-   /* This isn't a temporary... */
-   continue;
 
-  if (TREE_CODE (parm) == PARM_DECL)
-   /* .. nor is this... */
+  /* In case of component_ref, we need to capture the object of base
+class as if it is temporary object.  There are two possibilities:
+(*base).field and base->field.  */
+  while (TREE_CODE (parm) == COMPONENT_REF)
+   {
+ parm = TREE_OPERAND (parm, 0);
+ if (TREE_CODE (parm) == INDIRECT_REF)
+   parm = TREE_OPERAND (parm, 0);
+ parm = STRIP_NOPS (parm);
+   }
+
+  /* This isn't a temporary or argument.  */
+  if ((TREE_CODE (parm) == VAR_DECL && !DECL_ARTIFICIAL (parm))
+ || TREE_CODE (parm) == PARM_DECL
+ || TREE_CODE (parm) == NON_LVALUE_EXPR)
continue;
 
   if (TREE_CODE (parm) == TARGET_EXPR)
diff --git 
a/gcc/testsuite/g++.dg/coroutines/torture/co-await-15-capture-comp-ref.C 
b/gcc/testsuite/g++.dg/coroutines/torture/co-await-15-capture-comp-ref.C
new file mode 100644
index 000..93a43fbd298
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/co-await-15-capture-comp-ref.C
@@ -0,0 +1,99 @@
+//  { dg-do run }
+
+#include "../coro.h"
+
+class resumable {
+public:
+  struct promise_type;
+  using coro_handle = std::coroutine_handle;
+  resumable(coro_handle handle) : handle_(handle) { }
+  resumable(resumable&) = delete;
+  resumable(resumable&&) = delete;
+  ~resumable() { handle_.destroy(); }
+  coro_handle handle_;
+};
+
+struct resumable::promise_type {
+  using coro_handle = std::coroutine_handle;
+  int used;
+  auto get_return_object() {
+return coro_handle::from_promise(*this);
+  }
+  auto initial_suspend() { return std::suspend_never(); }
+  auto final_suspend() { return std::suspend_always(); }
+  void return_value(int x) {used = x;}
+  void unhandled_exception() {}
+
+  struct TestAwaiter {
+int recent_test;
+TestAwaiter(int test) : recent_test{test} {}
+bool await_ready() { return false; }
+void await_suspend(std::coroutine_handle) {}
+int await_resume() {
+  return recent_test;
+}
+auto operator co_await() {
+  return *this;
+}
+  };
+
+  struct TestAwaiterCH :TestAwaiter { 
+TestAwaiterCH(int test) : TestAwaiter(test) {};
+  };
+
+  struct TestAwaiterCHCH :TestAwaiterCH {
+TestAwaiterCHCH(int test) : TestAwaiterCH(test) {};
+
+resumable foo(){
+int x = co_await *this;
+co_return x;
+}
+  };
+};
+
+struct TestP {
+ resumable::promise_type::TestAwaiterCHCH  tp = 
resumable::promise_type::TestAwaiterCHCH(6);
+};
+
+resumable foo1(int t){
+  int x = co_await resumable::promise_type::TestAwaiterCH(t);
+  co_return x;
+}
+
+resumable foo2(){
+  struct TestP  T

Re: [PATCH] PR libstdc++/91620 Implement DR 526 for std::[forward_]list::remove_if/unique

2020-03-02 Thread François Dumont

Hi

    Isn't this something to fix before the GCC 10 release?

François

On 12/27/19 11:57 AM, François Dumont wrote:
Here is the patch to extend DR 526 to forward_list and list remove_if 
and unique.


As the adopted pattern is simpler I also applied it to the remove 
methods.
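
As a rough illustration of the pattern (not the libstdc++ code itself), the idea is to move matching nodes into a temporary list instead of erasing them on the spot, so that a value which refers to an element of the list stays alive until every comparison is done.  remove_deferred is a hypothetical free function written against the public std::list interface.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper: remove every element equal to VALUE, even when
// VALUE is a reference to an element of ITEMS.  Returns the number removed.
template <typename T>
std::size_t remove_deferred (std::list<T> &items, const T &value)
{
  const std::size_t original_size = items.size ();
  std::list<T> doomed;  // matching nodes are parked here, not destroyed
  for (auto it = items.begin (); it != items.end ();)
    {
      auto next = std::next (it);
      if (*it == value)
        doomed.splice (doomed.end (), items, it);  // relink the node only
      it = next;
    }
  // Splicing moves nodes without creating or destroying elements.
  assert (items.size () + doomed.size () == original_size);
  return doomed.size ();  // DOOMED is destroyed here, after all comparisons
}
```

Because the elements are only destroyed when the temporary list goes out of scope, a call such as remove_deferred (l, l.front ()) never compares against a destroyed object.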


    PR libstdc++/91620
    * include/bits/forward_list.tcc (forward_list<>::remove): Collect nodes
    to destroy in an intermediate forward_list.
    (forward_list<>::remove_if, forward_list<>::unique): Likewise.
    * include/bits/list.tcc (list<>::remove, list<>::unique): Likewise.
    (list<>::remove_if): Likewise.
    * include/debug/forward_list (forward_list<>::_M_erase_after): Remove.
    (forward_list<>::erase_after): Adapt.
    (forward_list<>::remove, forward_list<>::remove_if): Collect nodes to
    destroy in an intermediate forward_list.
    (forward_list<>::unique): Likewise.
    * include/debug/list (list<>::remove, list<>::unique): Likewise.
    (list<>::remove_if): Likewise.

Tested under Linux x86_64 normal and debug modes.

Ok to commit ?

François





Re: [PATCH] gimple-fold: Verify builtin prototype before folding [PR93927]

2020-03-02 Thread Jakub Jelinek
On Wed, Feb 26, 2020 at 11:29:18AM +0100, Richard Biener wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

This got fixed with Martin's C FE change, so I've just committed
the testcases as obvious to trunk.

2020-03-03  Jakub Jelinek  

PR tree-optimization/93927
* gcc.c-torture/compile/pr93927-1.c: New test.
* gcc.c-torture/compile/pr93927-2.c: New test.

--- gcc/testsuite/gcc.c-torture/compile/pr93927-1.c.jj  2020-02-25 
11:47:10.983971425 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr93927-1.c 2020-02-25 
11:46:43.004382966 +0100
@@ -0,0 +1,9 @@
+/* PR tree-optimization/93927 */
+
+__SIZE_TYPE__ strstr (const char *, const char *);
+
+char *
+foo (char *x)
+{
+  return !!strstr (x, "0") ? "0" : "1";
+}
--- gcc/testsuite/gcc.c-torture/compile/pr93927-2.c.jj  2020-02-25 
11:47:13.930928081 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr93927-2.c 2020-02-25 
11:46:57.304172632 +0100
@@ -0,0 +1,9 @@
+/* PR tree-optimization/93927 */
+
+__SIZE_TYPE__ strchr (const char *, int);
+
+char *
+foo (char *x)
+{
+  return !!strchr (x, 0) ? "0" : "1";
+}


Jakub



Re: [PATCH] sccvn: Improve handling of load masked with integer constant [PR93582]

2020-03-02 Thread Jakub Jelinek
On Mon, Mar 02, 2020 at 03:54:36PM +0100, Jakub Jelinek wrote:
> So, like this if it passes bootstrap/regtest?

Bootstrapped/regtested successfully on x86_64-linux and i686-linux.

> 2020-03-02  Jakub Jelinek  
> 
>   PR tree-optimization/93582
>   * tree-ssa-sccvn.h (vn_reference_lookup): Add mask argument.
>   * tree-ssa-sccvn.c (struct vn_walk_cb_data): Add mask and masked_result
>   members, initialize them in the constructor and if mask is non-NULL,
>   artificially push_partial_def {} for the portions of the mask that
>   contain zeros.
>   (vn_walk_cb_data::finish): If mask is non-NULL, set masked_result to
>   val and return (void *)-1.  Formatting fix.
>   (vn_reference_lookup_pieces): Adjust vn_walk_cb_data initialization.
>   Formatting fix.
>   (vn_reference_lookup): Add mask argument.  If non-NULL, don't call
>   fully_constant_vn_reference_p nor vn_reference_lookup_1 and return
>   data.mask_result.
>   (visit_nary_op): Handle BIT_AND_EXPR of a memory load and INTEGER_CST
>   mask.
>   (visit_stmt): Formatting fix.
> 
>   * gcc.dg/tree-ssa/pr93582-10.c: New test.
>   * gcc.dg/pr93582.c: New test.
>   * gcc.c-torture/execute/pr93582.c: New test.

Jakub



[PATCH] re PR tree-optimization/90883 (Generated code is worse if returned struct is unnamed)

2020-03-02 Thread Kito Cheng
After adding --param max-inline-insns-size=1, all targets remove the
redundant store at dse1, except that some targets, such as AArch64 and
MIPS, expand the struct initialization into a loop because of CLEAR_RATIO.

Tested with cross compilers for riscv32, riscv64, x86, x86_64, mips, mips64,
aarch64, nds32 and arm.

gcc/testsuite/ChangeLog

PR tree-optimization/90883
* g++.dg/tree-ssa/pr90883.C: Add --param max-inline-insns-size=1.
Add aarch64-*-* mips*-*-* to XFAIL.
---
 gcc/testsuite/g++.dg/tree-ssa/pr90883.C | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr90883.C 
b/gcc/testsuite/g++.dg/tree-ssa/pr90883.C
index c5faffa1f32..0e622f263d2 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr90883.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr90883.C
@@ -1,4 +1,4 @@
-// { dg-options "-O2 -Os -fdump-tree-dse-details -std=c++11" }
+// { dg-options "-O2 -Os -fdump-tree-dse-details -std=c++11 --param 
max-inline-insns-size=1" }
 
 
 class C
@@ -15,6 +15,6 @@
 
 // We want to match enough here to capture that we deleted an empty
 // constructor store
-// { dg-final { scan-tree-dump "Deleted redundant store: .*\.a = {}" "dse1" { 
target { ! i?86-*-* } } } }
-// { dg-final { scan-tree-dump "Deleted redundant store: .*\.a = {}" "dse2" { 
target i?86-*-* } } }
+// aarch64 and mips will expand to loop to clear because CLEAR_RATIO.
+// { dg-final { scan-tree-dump "Deleted redundant store: .*\.a = {}" "dse1" { 
xfail { aarch64-*-* mips*-*-* } } } }
 
-- 
2.25.1



Re: [PATCH] s390: Fix --with-arch=... --with-tune=... [PR26877]

2020-03-02 Thread Andreas Krebbel
On 2/25/20 10:20 AM, Jakub Jelinek wrote:
> Hi!
> 
> In Fedora we configure GCC with --with-arch=zEC12 --with-tune=z13 right now
> and furthermore redhat-rpm-config adds to rpm packages -march=zEC12 -mtune=z13
> options (among others).  While looking at the git compilation, I've been
> surprised that -O2 actually behaves differently from -O2 -mtune=z13 in this
> configuration, and indeed, seems --with-tune= is completely ignored on s390
> if --with-arch= is specified.
> 
> i386 had the same problem, but got that fixed in 2006, see PR26877.
> The thing is that for tune, we add -mtune=%(VALUE) only if neither -mtune=
> nor -march= is present, but as arch is processed first, it adds
> -march=%(VALUE) first and then -march= is always present and so -mtune= is
> never added.
> By reordering it in OPTION_DEFAULT_SPECS, we process tune first, add the
> default -mtune=%(VALUE) if -mtune= or -march= isn't seen, and then
> add -march=%(VALUE) if -march= isn't seen.  It is true that cc1 etc.
> will be then invoked with -mtune=z13 -march=zEC12, but like if the user
> specifies it in that order, it should still use z13 tuning and zEC12
> ISA set.
> 
> Bootstrapped/regtested on s390x-linux, ok for trunk?
> 
> 2020-02-25  Jakub Jelinek  
> 
>   PR target/26877
>   * config/s390/s390.h (OPTION_DEFAULT_SPECS): Reorder.

Ok. Thanks for fixing this.

Andreas

> 
> --- gcc/config/s390/s390.h.jj 2020-01-12 11:54:36.412413424 +0100
> +++ gcc/config/s390/s390.h2020-02-24 19:04:14.104259482 +0100
> @@ -227,11 +227,13 @@ enum processor_flags
>  #define TARGET_DEFAULT 0
>  #endif
>  
> -/* Support for configure-time defaults.  */
> +/* Support for configure-time defaults.
> +   The order here is important so that -march doesn't squash the
> +   tune values.  */
>  #define OPTION_DEFAULT_SPECS \
>{ "mode", "%{!mesa:%{!mzarch:-m%(VALUE)}}" },  \
> -  { "arch", "%{!march=*:-march=%(VALUE)}" }, \
> -  { "tune", "%{!mtune=*:%{!march=*:-mtune=%(VALUE)}}" }
> +  { "tune", "%{!mtune=*:%{!march=*:-mtune=%(VALUE)}}" }, \
> +  { "arch", "%{!march=*:-march=%(VALUE)}" }
>  
>  #ifdef __s390__
>  extern const char *s390_host_detect_local_cpu (int argc, const char **argv);
> 
>   Jakub
> 
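
The ordering problem described in the quoted message can be modelled in a few lines.  This is a simplified sketch, not the driver's spec machinery: DefaultSpec, has_prefix, apply_defaults, arch_spec and tune_spec are hypothetical names, and the only assumption is the behaviour stated above, that defaults are applied in order and each one sees the options added by the defaults before it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one configure-time default: OPTION+VALUE is added
// unless an option starting with one of BLOCKERS is already present.
struct DefaultSpec
{
  std::string option;
  std::string value;
  std::vector<std::string> blockers;
};

inline bool has_prefix (const std::vector<std::string> &args,
                        const std::string &prefix)
{
  for (const std::string &a : args)
    if (a.compare (0, prefix.size (), prefix) == 0)
      return true;
  return false;
}

// Apply the defaults strictly in the order given.
inline std::vector<std::string>
apply_defaults (std::vector<std::string> args,
                const std::vector<DefaultSpec> &specs)
{
  for (const DefaultSpec &s : specs)
    {
      assert (!s.option.empty ());
      bool blocked = false;
      for (const std::string &b : s.blockers)
        blocked = blocked || has_prefix (args, b);
      if (!blocked)
        args.push_back (s.option + s.value);
    }
  return args;
}

// --with-arch=zEC12: added unless -march= is already there.
inline DefaultSpec arch_spec ()
{
  return {"-march=", "zEC12", {"-march="}};
}

// --with-tune=z13: added unless -mtune= or -march= is already there.
inline DefaultSpec tune_spec ()
{
  return {"-mtune=", "z13", {"-mtune=", "-march="}};
}
```

With the arch default first, the freshly added -march= blocks the tune default; with the tune default first, both end up on the command line, as -mtune=z13 followed by -march=zEC12.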



[PATCH] explow: Fix ICE caused by plus_constant [PR94002]

2020-03-02 Thread Jakub Jelinek
Hi!

The following testcase ICEs in a cross compiler to riscv64-linux.  The
problem is that we have a DImode integral constant (one that doesn't fit
into SImode) which is pushed into a constant pool, and later just the first
half of it is accessed using a MEM.  When plus_constant is called on such
a MEM, if the constant has a mode, we verify the mode, but if it doesn't,
we don't, and we ICE later on when we think the CONST_INT is a valid SImode
constant.
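
A toy model of the situation, for illustration only: PoolEntry, fits_signed and try_fold_add are hypothetical and do not use GCC's RTL API.  The sketch assumes only what is described above, that a constant stored at one width must not be folded as if it were a constant of the narrower access width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant-pool entry together with the width it was stored at.
struct PoolEntry
{
  std::int64_t value;
  unsigned mode_bits;  // 32 for an SImode-like entry, 64 for a DImode-like one
};

// True if V is representable as a signed integer of BITS bits.
inline bool fits_signed (std::int64_t v, unsigned bits)
{
  assert (bits >= 1 && bits <= 64);
  if (bits == 64)
    return true;
  const std::int64_t limit = std::int64_t{1} << (bits - 1);
  return v >= -limit && v < limit;
}

// Fold "entry + addend" for an access of ACCESS_BITS.  Punt (return false)
// when the access width differs from the width the entry was stored at.
// Overflow handling is left out of this toy.
inline bool try_fold_add (const PoolEntry &entry, unsigned access_bits,
                          std::int64_t addend, std::int64_t *result)
{
  if (entry.mode_bits != access_bits)
    return false;
  *result = entry.value + addend;
  return true;
}
```

The constant from the testcase, 44852956282, does not fit in 32 bits, so a 32-bit access to its pool entry is refused here rather than treated as a valid 32-bit constant.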

Fixed thusly, tested with cross to riscv64-linux and bootstrapped/regtested
on x86_64-linux and i686-linux, ok for trunk?

2020-03-03  Jakub Jelinek  

PR rtl-optimization/94002
* explow.c (plus_constant): Punt if cst has VOIDmode and
get_pool_mode is different from mode.

* gcc.dg/pr94002.c: New test.

--- gcc/explow.c.jj 2020-01-12 11:54:36.564411130 +0100
+++ gcc/explow.c2020-03-02 22:09:19.544380020 +0100
@@ -128,6 +128,9 @@ plus_constant (machine_mode mode, rtx x,
  cst = gen_lowpart (mode, cst);
  gcc_assert (cst);
}
+  else if (GET_MODE (cst) == VOIDmode
+  && get_pool_mode (XEXP (x, 0)) != mode)
+   break;
  if (GET_MODE (cst) == VOIDmode || GET_MODE (cst) == mode)
{
  tem = plus_constant (mode, cst, c);
--- gcc/testsuite/gcc.dg/pr94002.c.jj   2020-03-02 22:05:58.508338170 +0100
+++ gcc/testsuite/gcc.dg/pr94002.c  2020-03-02 22:05:32.864715503 +0100
@@ -0,0 +1,13 @@
+/* PR rtl-optimization/94002 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fno-tree-dce -fno-tree-reassoc" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+
+unsigned a, b;
+
+void
+foo (void)
+{
+  __builtin_sub_overflow (b, 44852956282LL, &a);
+  a += ~b;
+}

Jakub