date:20190313

V2 [PATCH] i386: Handle REG_EH_REGION note

2019-03-13 Thread H.J. Lu

On Thu, Mar 14, 2019 at 1:28 AM Jakub Jelinek  wrote:
>
> On Tue, Mar 12, 2019 at 09:36:32AM +0800, H.J. Lu wrote:
> >   PR target/89650
> >   * config/i386/i386.c (remove_partial_avx_dependency): Handle
> >   REG_EH_REGION note.
> >
> > gcc/testsuite/
> >
> >   PR target/89650
> >   * g++.target/i386/pr89650.C: New test.
> > ---
> >  gcc/config/i386/i386.c  |  6 ++
> >  gcc/testsuite/g++.target/i386/pr89650.C | 19 +++
> >  2 files changed, 25 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.target/i386/pr89650.C
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 896c6f33d40..b702703074c 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -2873,6 +2873,12 @@ remove_partial_avx_dependency (void)
> > /* Generate an XMM vector SET.  */
> > set = gen_rtx_SET (vec, src);
> > set_insn = emit_insn_before (set, insn);
> > +
> > +   /* Handle REG_EH_REGION note.  */
> > +   rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
> > +   if (note)
> > + add_reg_note (insn, REG_EH_REGION, XEXP (note, 0));
> > +
>
> How can this work?  If insn has REG_EH_REGION note, you add that note
> to it once more?  At minimum that should be add_reg_note (set_insn, ...).
> But perhaps better use:
>   copy_reg_eh_region_note_forward (insn, set_insn, insn);
> instead?
>
> That said, with either change the testcase ICEs, as we have control flow
> insn in the middle of a bb.
>
> Tried to do:
> --- gcc/config/i386/i386.c.jj   2019-03-13 09:23:37.138538182 +0100
> +++ gcc/config/i386/i386.c  2019-03-13 18:24:07.773761751 +0100
> @@ -91,6 +91,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-vector-builder.h"
>  #include "debug.h"
>  #include "dwarf2out.h"
> +#include "cfgcleanup.h"
>
>  /* This file should be included last.  */
>  #include "target-def.h"
> @@ -2814,6 +2815,9 @@ remove_partial_avx_dependency (void)
>bitmap_obstack_initialize (NULL);
>bitmap convert_bbs = BITMAP_ALLOC (NULL);
>
> +  auto_sbitmap blocks_with_eh (last_basic_block_for_fn (cfun));
> +  bitmap_clear (blocks_with_eh);
> +
>basic_block bb;
>rtx_insn *insn, *set_insn;
>rtx set;
> @@ -2873,6 +2877,15 @@ remove_partial_avx_dependency (void)
>   /* Generate an XMM vector SET.  */
>   set = gen_rtx_SET (vec, src);
>   set_insn = emit_insn_before (set, insn);
> +
> + /* Handle REG_EH_REGION note.  */
> + rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
> + if (note)
> +   {
> + bitmap_set_bit (blocks_with_eh, bb->index);
> + add_reg_note (set_insn, REG_EH_REGION, XEXP (note, 0));
> +   }
> +
>   df_insn_rescan (set_insn);
>
>   src = gen_rtx_SUBREG (dest_mode, vec, 0);
> @@ -2930,6 +2943,14 @@ remove_partial_avx_dependency (void)
>bitmap_obstack_release (NULL);
>BITMAP_FREE (convert_bbs);
>
> +  if (!bitmap_empty_p (blocks_with_eh))
> +{
> +  find_many_sub_basic_blocks (blocks_with_eh);
> +
> +  /* If we've moved any REG_EH_REGION notes, also cleanup the cfg.  */
> +  cleanup_cfg (0);
> +}
> +
>timevar_pop (TV_MACH_DEP);
>return 0;
>  }
>
> but that still ICEs.
>
> Jakub

We need to split the basic block if we create new insns, which may
throw exceptions, in the middle of the basic blocks.

Tested on AVX2 and AVX512 machines with and without

--with-arch=native

OK for trunk?

Thanks.

-- 
H.J.
From 713d7648c4018f2f08e179c165d0360de8ecb618 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 12 Mar 2019 08:55:24 +0800
Subject: [PATCH] i386: Handle REG_EH_REGION note

When we split:

(insn 18 17 76 2 (set (reg:SF 88 [ _19 ])
(float:SF (mem/c:SI (symbol_ref:DI ("d") [flags 0x2]  ) [1 d+0 S4 A32]))) "x.ii":4:20 170 {*floatsisf2}
 (expr_list:REG_EH_REGION (const_int 2 [0x2])
(nil)))

to

(insn 94 17 18 2 (set (reg:V4SF 115)
(vec_merge:V4SF (vec_duplicate:V4SF (float:SF (mem/c:SI (symbol_ref:DI ("d") [flags 0x2]  ) [1 d+0 S4 A32])))
(reg:V4SF 114)
(const_int 1 [0x1]))) "x.ii":4:20 -1
 (nil))
(insn 18 94 76 2 (set (reg:SF 88 [ _19 ])
(subreg:SF (reg:V4SF 115) 0)) "x.ii":4:20 112 {*movsf_internal}
 (expr_list:REG_EH_REGION (const_int 2 [0x2])
(nil)))

we must copy the REG_EH_REGION note to the first insn and split the block
after the newly added insn.  The REG_EH_REGION on the second insn will be
removed later since it no longer traps.

gcc/

	PR target/89650
	* config/i386/i386.c (remove_partial_avx_dependency): Handle
	REG_EH_REGION note.

gcc/testsuite/

	PR target/89650
	* g++.target/i386/pr89650.C: New test.
---
 gcc/config/i386/i386.c  | 45 +
 gcc/testsuite/g++.target/i386/pr89650.C | 19 +++
 2 files changed, 64 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/pr89650.C

Re: [PATCH] Loop split upon semi-invariant condition (PR tree-optimization/89134)

2019-03-13 Thread Feng Xue OS

Ok. Got it. And I will add some cases.

Thanks,
Feng

From: Kyrill Tkachov 
Sent: Wednesday, March 13, 2019 5:40:37 PM
To: Feng Xue OS; Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Loop split upon semi-invariant condition (PR 
tree-optimization/89134)

Hi Feng,

On 3/13/19 1:56 AM, Feng Xue OS wrote:
> Richard,
>
> Thanks for your comment. Yes, it is like kind of jump threading
> with knowledge of loop structure. And what is rough time for GCC 10?
>
>

GCC 10 will be released once the number of P1 regressions gets down to
zero. Past experience shows that it's around the April/May timeframe.

In the meantime my comment on the patch is that you should add some
tests to the testsuite that showcase this transformation.

Thanks,

Kyrill


> Regards,
>
> Feng
>
>
> 
> From: Richard Biener 
> Sent: Tuesday, March 12, 2019 4:31:49 PM
> To: Feng Xue OS
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Loop split upon semi-invariant condition (PR
> tree-optimization/89134)
>
> On Tue, Mar 12, 2019 at 7:20 AM Feng Xue OS
>  wrote:
> >
> > This patch is composed to implement a loop transformation on one of
> its conditional statements, which we call it semi-invariant, in that
> its computation is impacted in only one of its branches.
> >
> > Suppose a loop as:
> >
> > void f (std::map m)
> > {
> > for (auto it = m.begin (); it != m.end (); ++it) {
> > /* if (b) is semi-invariant. */
> > if (b) {
> > b = do_something();/* Has effect on b */
> > } else {
> > /* No effect on b */
> > }
> > statements;  /* Also no effect on b */
> > }
> > }
> >
> > A transformation, kind of loop split, could be:
> >
> > void f (std::map m)
> > {
> > for (auto it = m.begin (); it != m.end (); ++it) {
> > if (b) {
> > b = do_something();
> > } else {
> > ++it;
> > statements;
> > break;
> > }
> > statements;
> > }
> >
> > for (; it != m.end (); ++it) {
> > statements;
> > }
> > }
> >
> > If "statements" contains nothing, the second loop becomes an empty
> one, which can be removed. (This part will be given in another patch).
> And if "statements" are straight line instructions, we get an
> opportunity to vectorize the second loop. In practice, this
> optimization is found to improve some real application by %7.
> >
> > Since it is just a kind of loop split, the codes are mainly placed
> in existing tree-ssa-loop-split module, and is controlled by
> -fsplit-loop, and is enabled with -O3.
>
> Note the transform itself is jump-threading with the threading
> duplicating a whole CFG cycle.
>
> I didn't look at the patch details yet since this is suitable for GCC
> 10 only.
>
> Thanks for implementing this.
> Richard.
>
> > Feng
> >
> >
> > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > index 64bf6017d16..a6c2878d652 100644
> > --- a/gcc/ChangeLog
> > +++ b/gcc/ChangeLog
> > @@ -1,3 +1,23 @@
> > +2019-03-12  Feng Xue 
> > +
> > +   PR tree-optimization/89134
> > +* doc/invoke.texi (max-cond-loop-split-insns): Document new
> --params.
> > +   (min-cond-loop-split-prob): Likewise.
> > +   * params.def: Add max-cond-loop-split-insns,
> min-cond-loop-split-prob.
> > +   * passes.def (pass_cond_loop_split) : New pass.
> > +   * timevar.def (TV_COND_LOOP_SPLIT): New time variable.
> > +   * tree-pass.h (make_pass_cond_loop_split): New declaration.
> > +   * tree-ssa-loop-split.c (split_info): New class.
> > +   (find_vdef_in_loop, vuse_semi_invariant_p): New functions.
> > +   (ssa_semi_invariant_p, stmt_semi_invariant_p): Likewise.
> > +   (can_branch_be_excluded, get_cond_invariant_branch): Likewise.
> > +   (is_cond_in_hidden_loop, compute_added_num_insns): Likewise.
> > +   (can_split_loop_on_cond, mark_cond_to_split_loop): Likewise.
> > +   (split_loop_for_cond, tree_ssa_split_loops_for_cond): Likewise.
> > +   (pass_data_cond_loop_split): New variable.
> > +   (pass_cond_loop_split): New class.
> > +   (make_pass_cond_loop_split): New function.
> > +
> >  2019-03-11  Jakub Jelinek  
> >
> > PR middle-end/89655
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index df0883f2fc9..f5e09bd71fd 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -11316,6 +11316,14 @@ The maximum number of branches unswitched
> in a single loop.
> >  @item lim-expensive
> >  The minimum cost of an expensive expression in the loop invariant
> motion.
> >
> > +@item max-cond-loop-split-insns
> > +The maximum number of insns to be increased due to loop split on
> > +semi-invariant condition statement.
> > +
> > +@item min-cond-loop-split-prob
> > +The minimum threshold for

Re: [C++ PATCH] Fix ICE on variable template access without template args (PR c++/89512)

2019-03-13 Thread Jason Merrill


On 3/13/19 6:37 PM, Jakub Jelinek wrote:

Hi!

When a variable template (both static data member of some class or at
namespace level) is accessed without template arguments,
finish_id_expression_1 handles it through:
   /* If we didn't find anything, or what we found was a type,
  then this wasn't really an id-expression.  */
   if (TREE_CODE (decl) == TEMPLATE_DECL
   && !DECL_FUNCTION_TEMPLATE_P (decl))
 {
   *error_msg = G_("missing template arguments");
   return error_mark_node;
 }
On the following invalid testcase the access is from within a template and
in that case finish_id_expression{,_1} isn't really called for that, instead
tsubst_qualified_id just calls finish_qualified_id_expr.

The ICE is because we pass over a TEMPLATE_DECL to the middle-end (e.g. on
rhs of an INIT_EXPR etc.).

The following patch does something similar to what finish_id_expression_1
does.  Bootstrapped/regtested on x86_64-linux and i686-linux.  Or should it
be done in tsubst_qualified_id instead (before or after the
finish_qualified_id_expr call?

2019-03-13  Jakub Jelinek  

PR c++/89512
* semantics.c (finish_qualified_id_expr): Reject variable templates.


OK.

Jason

Re: [C++ PATCH] Fix cxx_eval_loop_expr ICE (PR c++/89652, take 2)

2019-03-13 Thread Jason Merrill


On 3/13/19 6:20 PM, Jakub Jelinek wrote:

On Mon, Mar 11, 2019 at 11:21:00PM +0100, Jakub Jelinek wrote:

The following testcase ICEs since my recent cxx_eval_loop_expr changes.
The problem is that the Forget saved values of SAVE_EXPRs. inside of the
loop can remove SAVE_EXPRs from new_ctx.values and if that is the last
iteration, we can also do the loop at the end of function (which has been
added there mainly to handle cases where the main loop breaks earlier)
and new_ctx.values->remove ICEs because *iter is already not in the table.

While I could e.g. add some bool whether there is anything to be removed
after the loop or not which would fix the regression part of this PR,
I believe there is a latent issue as well.  The thing is, new_ctx.save_exprs


That is actually not true, the bug is fully my fault and previously it was
ok, as cxx_eval_loop_expr used to do:
   do
 {
   hash_set save_exprs;
   new_ctx.save_exprs = _exprs;
...
 }
   while (...)
and thus there was a new hash_set for each iteration, it was just me moving
it out of the loop (so that I don't have to repeat the ->remove loop on each
break or use gotos).


Yet another possibility might be to turn save_exprs into a vec, from what


That said, I believe using a vec must be faster and the hash_set seems
overkill, as if SAVE_EXPR wasn't seen yet in the current iteration, it is
desirable to add it to the vec and if it has been added there,
ctx->value->get will be non-NULL, so we'll never add a duplicate, and by
using a vec we don't have to allocate storage for each iteration etc.

So, here is a variant patch I've bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?


This is OK.  Is my hash-table.h fix also OK for trunk?  It does also fix 
the testcase.


Jason

[PATCH] improve constant string length folding (PR 89688)

2019-03-13 Thread Martin Sebor


PR 89688 points out a bogus warning about an unterminated
character array argument to strlen.  The root cause is
an oversight in the transformation of braced initializer lists
to STRING_CSTs where the solution implemented last summer only
considers one-dimensional arrays and skips more complex aggregates
such as multi-dimensional arrays or structs.

The folder (string_constant), on the other hand, assumes that
every a constant character array with an initializer is either
a properly nul-terminated string (i.e., STRING_CST), or
an unterminated array or a single character.  If the latter
then unless the character value is zero it indicates to its
caller that the constant is not a string.  As a result, we
end up with a warning.

To avoid the false positives the attached patch extends
the solution to those other kinds of aggregates.

Martin
PR tree-optimization/89688 - -Wstringop-overflow confused by const 2D array of char

gcc/c/ChangeLog:

	PR tree-optimization/89688
	* c-decl.c (finish_decl): Call braced_lists_to_string for more
	kinds of initializers.

gcc/c-family/ChangeLog:

	PR tree-optimization/89688
	* c-common.c (braced_list_to_string): Make static.
	(braced_lists_to_strings): Define new function.
	* c-common.h (braced_list_to_string): Remove.
	(braced_lists_to_strings): Declare.

gcc/cp/ChangeLog:

	PR tree-optimization/89688
	* typeck2.c (store_init_value): Call braced_lists_to_string for more
	kinds of initializers.

gcc/testsuite/ChangeLog:

	PR tree-optimization/89688
	* gcc.dg/strlenopt-61.c: New test.
	* g++.dg/warn/Wstringop-overflow-2.C: New test.

Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c	(revision 269657)
+++ gcc/c/c-decl.c	(working copy)
@@ -5165,11 +5165,10 @@ finish_decl (tree decl, location_t init_loc, tree
   relayout_decl (decl);
 }
 
-  if (TREE_CODE (type) == ARRAY_TYPE
-  && TYPE_STRING_FLAG (TREE_TYPE (type))
-  && DECL_INITIAL (decl)
-  && TREE_CODE (DECL_INITIAL (decl)) == CONSTRUCTOR)
-DECL_INITIAL (decl) = braced_list_to_string (type, DECL_INITIAL (decl));
+  /* Look for braced array initializers for character arrays and
+ recursively convert them into STRING_CSTs.  */
+  if (tree init = DECL_INITIAL (decl))
+DECL_INITIAL (decl) = braced_lists_to_strings (type, init);
 
   if (VAR_P (decl))
 {
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 269657)
+++ gcc/c-family/c-common.c	(working copy)
@@ -8814,7 +8814,7 @@ maybe_add_include_fixit (rich_location *richloc, c
TYPE into a STRING_CST for convenience and efficiency.  Return
the converted string on success or the original ctor on failure.  */
 
-tree
+static tree
 braced_list_to_string (tree type, tree ctor)
 {
   if (!tree_fits_uhwi_p (TYPE_SIZE_UNIT (type)))
@@ -8895,4 +8895,52 @@ braced_list_to_string (tree type, tree ctor)
   return res;
 }
 
+/* Attempt to convert a CTOR containing braced array initializer lists
+   for array TYPE into one containing STRING_CSTs, for convenience and
+   efficiency.  Recurse for arrays of arrays and member initializers.
+   Return the converted CTOR or STRING_CST on success or the original
+   CTOR otherwise.  */
+
+tree
+braced_lists_to_strings (tree type, tree ctor)
+{
+  if (TREE_CODE (ctor) != CONSTRUCTOR)
+return ctor;
+
+  tree_code code = TREE_CODE (type);
+
+  tree ttp;
+  if (code == ARRAY_TYPE)
+ttp = TREE_TYPE (type);
+  else if (code == RECORD_TYPE)
+{
+  ttp = TREE_TYPE (ctor);
+  if (TREE_CODE (ttp) == ARRAY_TYPE)
+	{
+	  type = ttp;
+	  ttp = TREE_TYPE (ttp);
+	}
+}
+  else
+return ctor;
+
+  if (TYPE_STRING_FLAG (ttp))
+return braced_list_to_string (type, ctor);
+
+  code = TREE_CODE (ttp);
+  if (code == ARRAY_TYPE || code == RECORD_TYPE)
+{
+  /* Handle array of arrays or struct member initializers.  */
+  tree val;
+  unsigned HOST_WIDE_INT idx;
+  FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (ctor), idx, val)
+	{
+	  val = braced_lists_to_strings (ttp, val);
+	  CONSTRUCTOR_ELT (ctor, idx)->value = val;
+	}
+}
+
+  return ctor;
+}
+
 #include "gt-c-family-c-common.h"
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h	(revision 269657)
+++ gcc/c-family/c-common.h	(working copy)
@@ -1372,7 +1372,8 @@ extern void maybe_add_include_fixit (rich_location
 extern void maybe_suggest_missing_token_insertion (rich_location *richloc,
 		   enum cpp_ttype token_type,
 		   location_t prev_token_loc);
-extern tree braced_list_to_string (tree, tree);
+extern tree braced_lists_to_strings (tree, tree);
+
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 
 #if CHECKING_P
Index: gcc/cp/typeck2.c
===
--- gcc/cp/typeck2.c	(revision 269657)
+++ gcc/cp/typeck2.c	(working

Re: [C++ PATCH] Fix cxx_eval_loop_expr ICE (PR c++/89652)

2019-03-13 Thread Jason Merrill


On 3/13/19 10:08 PM, Jason Merrill wrote:

On 3/11/19 6:21 PM, Jakub Jelinek wrote:

Hi!

The following testcase ICEs since my recent cxx_eval_loop_expr changes.
The problem is that the Forget saved values of SAVE_EXPRs. inside of the
loop can remove SAVE_EXPRs from new_ctx.values and if that is the last
iteration, we can also do the loop at the end of function (which has been
added there mainly to handle cases where the main loop breaks earlier)
and new_ctx.values->remove ICEs because *iter is already not in the 
table.


It shouldn't ICE, remove_elt_with_hash is documented to do nothing if 
the element is not in the table.


Ah, now I see you sent a follow-up, reading that now.

Jason

Re: [C++ PATCH] Fix cxx_eval_loop_expr ICE (PR c++/89652)

2019-03-13 Thread Jason Merrill


On 3/11/19 6:21 PM, Jakub Jelinek wrote:

Hi!

The following testcase ICEs since my recent cxx_eval_loop_expr changes.
The problem is that the Forget saved values of SAVE_EXPRs. inside of the
loop can remove SAVE_EXPRs from new_ctx.values and if that is the last
iteration, we can also do the loop at the end of function (which has been
added there mainly to handle cases where the main loop breaks earlier)
and new_ctx.values->remove ICEs because *iter is already not in the table.


It shouldn't ICE, remove_elt_with_hash is documented to do nothing if 
the element is not in the table.


Does this untested patch fix the test?

Jason
diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 1fd36946a53..fb2f77fdbee 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -940,7 +940,7 @@ hash_table
 ::remove_elt_with_hash (const compare_type , hashval_t hash)
 {
   value_type *slot = find_slot_with_hash (comparable, hash, NO_INSERT);
-  if (is_empty (*slot))
+  if (!slot || is_empty (*slot))
 return;
 
   Descriptor::remove (*slot);

Re: [PATCH] Fix up tree-ssa-strlen.c ICEs (PR tree-optimization/89703)

2019-03-13 Thread Joseph Myers

On Wed, 13 Mar 2019, Jakub Jelinek wrote:

> Hi!
> 
> The C FE sadly passes through some really bad prototypes of builtin
> functions as "harmless":
>   /* Accept "harmless" mismatches in function types such
>  as missing qualifiers or pointer vs same size integer
>  mismatches.  This is for the ffs and fprintf builtins.
>  However, with -Wextra in effect, diagnose return and
>  argument types that are incompatible according to
>  language rules.  */

Note that this isn't just about pre-standard headers with unexpected types 
(if it were, it might be obsolete).  It's also needed for building glibc 
(to build glibc with -Wextra with GCC trunk, one of the -Wno- options 
needed is -Wno-builtin-declaration-mismatch), because of how various code 
creates function aliases between functions that have the same ABI but 
different types (long versus int, etc.).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH v2, rs6000] PR87532: Bad Results from vec_extract(unsigned char, foo) dependent upon function inline

2019-03-13 Thread Segher Boessenkool

Hi Kelvin,

On Wed, Mar 13, 2019 at 04:43:35PM -0500, Kelvin Nilsen wrote:
[ snip ]

This looks great, thanks!  Okay for trunk.  A few more comments, do
with it as you like:

>   (altivec_expand_vec_ext_builtin): Use modular arithmetic to
>   computer index.

s/computer/compute/

> @@ -14723,9 +14742,17 @@ altivec_expand_vec_ext_builtin (tree exp, rtx targ
>op0 = expand_normal (arg0);
>op1 = expand_normal (arg1);
>  
> -  /* Call get_element_number to validate arg1 if it is a constant.  */
>if (TREE_CODE (arg1) == INTEGER_CST)
> -(void) get_element_number (TREE_TYPE (arg0), arg1);
> +{
> +  unsigned HOST_WIDE_INT elt;
> +  unsigned HOST_WIDE_INT size = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
> +  unsigned int truncated_selector;
> +  if (!tree_fits_uhwi_p (arg1))
> + error ("selector expression is out of range");
> +  elt = tree_to_uhwi (arg1);
> +  truncated_selector = elt % size;
> +  op1 = GEN_INT (truncated_selector);
> +}

This error really should never happen, but maybe there is a better way to
report it (error_at maybe), or we can avoid it completely?  It shouldn't
be too hard to extract the low few bits from any precision number.

Have you tested this?  Is the error reasonable?  Can you provoke it at all,
even with negative indices?

> +/* { dg-do run { target { int128 } } } */

You can write this simpler as

/* { dg-do run { target int128 } } */


Segher

[C++ PATCH] PR c++/86521 - C++17 copy elision in initialization by constructor.

2019-03-13 Thread Jason Merrill

This is an overlooked case in C++17 mandatory copy elision: We want overload
resolution to reflect that initializing an object from a prvalue does not
involve a copy or move constructor even when [over.match.ctor] says that
only constructors are candidates.  Here I implement that by looking through
the copy/move constructor in joust.

Clang does something similar in this situation.

Tested x86_64-pc-linux-gnu, applying to trunk.

* call.c (joust_maybe_elide_copy): New.
(joust): Call it.
---
 gcc/cp/call.c| 53 
 gcc/testsuite/g++.dg/cpp0x/overload-conv-3.C |  2 +-
 gcc/testsuite/g++.dg/overload/conv-op2.C |  6 +--
 gcc/cp/ChangeLog |  6 +++
 4 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index bf48ae2c27a..d1f50551cd3 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -10508,6 +10508,33 @@ add_warning (struct z_candidate *winner, struct 
z_candidate *loser)
   winner->warnings = cw;
 }
 
+/* CAND is a constructor candidate in joust in C++17 and up.  If it copies a
+   prvalue returned from a conversion function, replace CAND with the candidate
+   for the conversion and return true.  Otherwise, return false.  */
+
+static bool
+joust_maybe_elide_copy (z_candidate *)
+{
+  tree fn = cand->fn;
+  if (!DECL_COPY_CONSTRUCTOR_P (fn) && !DECL_MOVE_CONSTRUCTOR_P (fn))
+return false;
+  conversion *conv = cand->convs[0];
+  gcc_checking_assert (conv->kind == ck_ref_bind);
+  conv = next_conversion (conv);
+  if (conv->kind == ck_user && !TYPE_REF_P (conv->type))
+{
+  gcc_checking_assert (same_type_ignoring_top_level_qualifiers_p
+  (conv->type, DECL_CONTEXT (fn)));
+  z_candidate *uc = conv->cand;
+  if (DECL_CONV_FN_P (uc->fn))
+   {
+ cand = uc;
+ return true;
+   }
+}
+  return false;
+}
+
 /* Compare two candidates for overloading as described in
[over.match.best].  Return values:
 
@@ -10588,6 +10615,27 @@ joust (struct z_candidate *cand1, struct z_candidate 
*cand2, bool warn,
}
 }
 
+  /* Handle C++17 copy elision in [over.match.ctor] (direct-init) context.  The
+ standard currently says that only constructors are candidates, but if one
+ copies a prvalue returned by a conversion function we want to treat the
+ conversion as the candidate instead.
+
+ Clang does something similar, as discussed at
+ http://lists.isocpp.org/core/2017/10/3166.php
+ http://lists.isocpp.org/core/2019/03/5721.php  */
+  int elided_tiebreaker = 0;
+  if (len == 1 && cxx_dialect >= cxx17
+  && DECL_P (cand1->fn)
+  && DECL_COMPLETE_CONSTRUCTOR_P (cand1->fn)
+  && !(cand1->flags & LOOKUP_ONLYCONVERTING))
+{
+  bool elided1 = joust_maybe_elide_copy (cand1);
+  bool elided2 = joust_maybe_elide_copy (cand2);
+  /* As a tiebreaker below we will prefer a constructor to a conversion
+operator exposed this way.  */
+  elided_tiebreaker = elided2 - elided1;
+}
+
   for (i = 0; i < len; ++i)
 {
   conversion *t1 = cand1->convs[i + off1];
@@ -10697,6 +10745,11 @@ joust (struct z_candidate *cand1, struct z_candidate 
*cand2, bool warn,
   if (winner)
 return winner;
 
+  /* Put this tiebreaker first, so that we don't try to look at second_conv of
+ a constructor candidate that doesn't have one.  */
+  if (elided_tiebreaker)
+return elided_tiebreaker;
+
   /* DR 495 moved this tiebreaker above the template ones.  */
   /* or, if not that,
  the  context  is  an  initialization by user-defined conversion (see
diff --git a/gcc/testsuite/g++.dg/cpp0x/overload-conv-3.C 
b/gcc/testsuite/g++.dg/cpp0x/overload-conv-3.C
index 42a135dbf44..ae587f9673b 100644
--- a/gcc/testsuite/g++.dg/cpp0x/overload-conv-3.C
+++ b/gcc/testsuite/g++.dg/cpp0x/overload-conv-3.C
@@ -17,5 +17,5 @@ struct Source {
 
 int main() {
   Source x;
-  Dest d(move(x)); // { dg-error "ambiguous" }
+  Dest d(move(x));// { dg-error "ambiguous" "" { target c++14_down } }
 }
diff --git a/gcc/testsuite/g++.dg/overload/conv-op2.C 
b/gcc/testsuite/g++.dg/overload/conv-op2.C
index e8e533b1dc5..cc5ebe78619 100644
--- a/gcc/testsuite/g++.dg/overload/conv-op2.C
+++ b/gcc/testsuite/g++.dg/overload/conv-op2.C
@@ -1,5 +1,5 @@
 // PR c++/81311
-// { dg-do link }
+// { dg-do compile { target c++11 } }
 
 struct function
 {
@@ -8,12 +8,12 @@ struct function
 
 struct ref
 {
-  operator function&() const;
+  operator function&() const = delete;
 } r;
 
 struct val
 {
-  operator function() const;
+  operator function() const = delete;
 } v;
 
 int main()
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index ca8bc03b406..1164652268f 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,9 @@
+2019-03-13  Jason Merrill  
+
+   PR c++/86521 - C++17 copy elision in initialization by constructor.
+   * call.c (joust_maybe_elide_copy): New.
+

Re: [patch, doc, fortran] Add Q edit descriptor

2019-03-13 Thread Steve Kargl

On Thu, Mar 14, 2019 at 12:18:08AM +0100, Thomas Koenig wrote:
> Am 13.03.19 um 23:55 schrieb Steve Kargl:
> > On Wed, Mar 13, 2019 at 11:48:12PM +0100, Thomas Koenig wrote:
> >> Hello world,
> >>
> >> this doc patch adds the Q edit descriptor to the non-implemented
> >> extension list. Tested with "make dvi", "make pdf" and "make info".
> >>
> >> Suggestions? OK for trunk?
> > 
> > How about asttaching the patch?  :-)
> 
> You're right, that would probably help :-)
> 

See below.

> Index: gfortran.texi
> ===
> --- gfortran.texi (Revision 269624)
> +++ gfortran.texi (Arbeitskopie)
> @@ -2896,6 +2896,7 @@
>  * Alternate complex function syntax::
>  * Volatile COMMON blocks::
>  * OPEN( ... NAME=)::
> +* Q edit descriptor::
>  @end menu
>  
>  @node ENCODE and DECODE statements
> @@ -3018,7 +3019,7 @@
>  
>  @node OPEN( ... NAME=)
>  @subsection @code{OPEN( ... NAME=)}
> -@cindex @code{NAM}
> +@cindex @code{NAME}
>  
>  Some Fortran compilers, including @command{g77}, let the user declare
>  @code{OPEN( ... NAME=)}. This is
> @@ -3026,8 +3027,28 @@
>  @command{gfortran}.  @code{OPEN( ... NAME=)} should be replaced
>  with @code{OPEN( ... FILE=)}.
>  
> +@node Q edit descriptor
> +@subsection @code{Q} edit descriptor
> +@cindex @code{Q} edit descriptor
>  
> +Some Fortran compilers include the @code{Q} edit descritpor, which

s/include the/provide a
s/descritpor/descriptor

> +transfers the numer of characters left on an input record into an

s/numer/number
s/on/within

> +integer variable.

s/integer/@code{INTEGER}


> +A direct replacement of the @code{Q} edit descriptor is not available
> +in @command{gfortran}.  How to replicate its functionality using
> +standard-conforming code depends on what exactly it does the original
> +code.

s/what exactly it does/the intent of 

> +
> +Options to replace @code{Q} may be to read the whole line into a
> +character variable and then counting the number of non-blank
> +characters left using @code{LEN_TRIM}.  Another method may be to use
> +formatted stream, read the data up to the position where the @code{Q}
> +descriptor occurred, use @code{INQUIRE} to get the file position,
> +count the characters up to the next @code{NEW_LINE} and then start
> +reading from the position marked previously.
> +
> +

With the suggested changes, OK.

-- 
Steve

Re: [patch, doc, fortran] Add Q edit descriptor

2019-03-13 Thread Thomas Koenig


Am 13.03.19 um 23:55 schrieb Steve Kargl:

On Wed, Mar 13, 2019 at 11:48:12PM +0100, Thomas Koenig wrote:

Hello world,

this doc patch adds the Q edit descriptor to the non-implemented
extension list. Tested with "make dvi", "make pdf" and "make info".

Suggestions? OK for trunk?


How about asttaching the patch?  :-)


You're right, that would probably help :-)


In all seriousness, thanks for your attack on bugzilla
over the last month or so.


Glad to be of service :-)

I have only fairly recently begun to understand some regions of the
compiler that were an almost complete mystery to me before, such as
what is going on in trans-*.

Generally, we are having a good run right now - we have been closing
two bugs per one new bug submitted since the new year started. And
this is very much a team effort.

Regards

Thomas
Index: gfortran.texi
===
--- gfortran.texi	(Revision 269624)
+++ gfortran.texi	(Arbeitskopie)
@@ -2896,6 +2896,7 @@
 * Alternate complex function syntax::
 * Volatile COMMON blocks::
 * OPEN( ... NAME=)::
+* Q edit descriptor::
 @end menu
 
 @node ENCODE and DECODE statements
@@ -3018,7 +3019,7 @@
 
 @node OPEN( ... NAME=)
 @subsection @code{OPEN( ... NAME=)}
-@cindex @code{NAM}
+@cindex @code{NAME}
 
 Some Fortran compilers, including @command{g77}, let the user declare
 @code{OPEN( ... NAME=)}. This is
@@ -3026,8 +3027,28 @@
 @command{gfortran}.  @code{OPEN( ... NAME=)} should be replaced
 with @code{OPEN( ... FILE=)}.
 
+@node Q edit descriptor
+@subsection @code{Q} edit descriptor
+@cindex @code{Q} edit descriptor
 
+Some Fortran compilers include the @code{Q} edit descritpor, which
+transfers the numer of characters left on an input record into an
+integer variable.
 
+A direct replacement of the @code{Q} edit descriptor is not available
+in @command{gfortran}.  How to replicate its functionality using
+standard-conforming code depends on what exactly it does the original
+code.
+
+Options to replace @code{Q} may be to read the whole line into a
+character variable and then counting the number of non-blank
+characters left using @code{LEN_TRIM}.  Another method may be to use
+formatted stream, read the data up to the position where the @code{Q}
+descriptor occurred, use @code{INQUIRE} to get the file position,
+count the characters up to the next @code{NEW_LINE} and then start
+reading from the position marked previously.
+
+
 @c -
 @c -
 @c Mixed-Language Programming

Re: [patch, doc, fortran] Add Q edit descriptor

2019-03-13 Thread Steve Kargl

On Wed, Mar 13, 2019 at 11:48:12PM +0100, Thomas Koenig wrote:
> Hello world,
> 
> this doc patch adds the Q edit descriptor to the non-implemented
> extension list. Tested with "make dvi", "make pdf" and "make info".
> 
> Suggestions? OK for trunk?

How about asttaching the patch?  :-)

In all seriousness, thanks for your attack on bugzilla
over the last month or so.

-- 
Steve

[patch, doc, fortran] Add Q edit descriptor

2019-03-13 Thread Thomas Koenig


Hello world,

this doc patch adds the Q edit descriptor to the non-implemented
extension list. Tested with "make dvi", "make pdf" and "make info".

Suggestions? OK for trunk?

Regards

Thomas

[PATCH] Fix up tree-ssa-strlen.c ICEs (PR tree-optimization/89703)

2019-03-13 Thread Jakub Jelinek

Hi!

The C FE sadly passes through some really bad prototypes of builtin
functions as "harmless":
  /* Accept "harmless" mismatches in function types such
 as missing qualifiers or pointer vs same size integer
 mismatches.  This is for the ffs and fprintf builtins.
 However, with -Wextra in effect, diagnose return and
 argument types that are incompatible according to
 language rules.  */
In the strlen pass, we'd better have pointers where we expect pointers and
rightly sized integers where we expect integers.  The following patch
ensures that by calling gimple_builtin_call_types_compatible_p against
the builtin_decl_explicit decl.

Furthermore, I've noticed that a bunch of later added builtins that
the strlen pass now handles weren't added to valid_builtin_call, so the
patch adds that too (that is to handle cases where e.g. somebody declares
strncat as pure, or strcmp as const etc.).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-13  Jakub Jelinek  

PR tree-optimization/89703
* tree-ssa-strlen.c (valid_builtin_call): Punt if stmt call types
aren't compatible also with builtin_decl_explicit.  Check pure
or non-pure status of BUILT_IN_STR{{,N}CMP,N{LEN,{CAT,CPY}{,_CHK}}}
and BUILT_IN_STPNCPY{,_CHK}.

* gcc.c-torture/compile/pr89703-1.c: New test.
* gcc.c-torture/compile/pr89703-2.c: New test.

--- gcc/tree-ssa-strlen.c.jj2019-02-26 21:33:38.992826180 +0100
+++ gcc/tree-ssa-strlen.c   2019-03-13 19:53:20.152551962 +0100
@@ -971,12 +971,21 @@ valid_builtin_call (gimple *stmt)
 return false;
 
   tree callee = gimple_call_fndecl (stmt);
+  tree decl = builtin_decl_explicit (DECL_FUNCTION_CODE (callee));
+  if (decl
+  && decl != callee
+  && !gimple_builtin_call_types_compatible_p (stmt, decl))
+return false;
+
   switch (DECL_FUNCTION_CODE (callee))
 {
 case BUILT_IN_MEMCMP:
 case BUILT_IN_MEMCMP_EQ:
+case BUILT_IN_STRCMP:
+case BUILT_IN_STRNCMP:
 case BUILT_IN_STRCHR:
 case BUILT_IN_STRLEN:
+case BUILT_IN_STRNLEN:
   /* The above functions should be pure.  Punt if they aren't.  */
   if (gimple_vdef (stmt) || gimple_vuse (stmt) == NULL_TREE)
return false;
@@ -991,10 +1000,16 @@ valid_builtin_call (gimple *stmt)
 case BUILT_IN_MEMSET:
 case BUILT_IN_STPCPY:
 case BUILT_IN_STPCPY_CHK:
+case BUILT_IN_STPNCPY:
+case BUILT_IN_STPNCPY_CHK:
 case BUILT_IN_STRCAT:
 case BUILT_IN_STRCAT_CHK:
 case BUILT_IN_STRCPY:
 case BUILT_IN_STRCPY_CHK:
+case BUILT_IN_STRNCAT:
+case BUILT_IN_STRNCAT_CHK:
+case BUILT_IN_STRNCPY:
+case BUILT_IN_STRNCPY_CHK:
   /* The above functions should be neither const nor pure.  Punt if they
 aren't.  */
   if (gimple_vdef (stmt) == NULL_TREE || gimple_vuse (stmt) == NULL_TREE)
--- gcc/testsuite/gcc.c-torture/compile/pr89703-1.c.jj  2019-03-13 
19:56:10.619785685 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr89703-1.c 2019-03-13 
19:55:48.490146110 +0100
@@ -0,0 +1,13 @@
+/* PR tree-optimization/89703 */
+
+typedef __SIZE_TYPE__ size_t;
+extern char *strlen (const char *);
+extern char *strnlen (const char *, size_t);
+extern char c[2];
+
+void
+foo (char **q)
+{
+  q[0] = strlen (c);
+  q[1] = strnlen (c, 2);
+}
--- gcc/testsuite/gcc.c-torture/compile/pr89703-2.c.jj  2019-03-13 
19:56:13.495738838 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr89703-2.c 2019-03-13 
19:55:55.310035032 +0100
@@ -0,0 +1,13 @@
+/* PR tree-optimization/89703 */
+
+typedef __SIZE_TYPE__ size_t;
+extern void *memcpy (void *, const void *, size_t);
+extern char *strlen (const char *);
+extern char c[2];
+
+void
+foo (char **q)
+{
+  memcpy (c, "a", 2);
+  q[0] = strlen (c);
+}

Jakub

[C++ PATCH] Fix ICE on variable template access without template args (PR c++/89512)

2019-03-13 Thread Jakub Jelinek

Hi!

When a variable template (both static data member of some class or at
namespace level) is accessed without template arguments,
finish_id_expression_1 handles it through:
  /* If we didn't find anything, or what we found was a type,
 then this wasn't really an id-expression.  */
  if (TREE_CODE (decl) == TEMPLATE_DECL
  && !DECL_FUNCTION_TEMPLATE_P (decl))
{
  *error_msg = G_("missing template arguments");
  return error_mark_node;
}
On the following invalid testcase the access is from within a template and
in that case finish_id_expression{,_1} isn't really called for that, instead
tsubst_qualified_id just calls finish_qualified_id_expr.

The ICE is because we pass over a TEMPLATE_DECL to the middle-end (e.g. on
rhs of an INIT_EXPR etc.).

The following patch does something similar to what finish_id_expression_1
does.  Bootstrapped/regtested on x86_64-linux and i686-linux.  Or should it
be done in tsubst_qualified_id instead (before or after the
finish_qualified_id_expr call?

2019-03-13  Jakub Jelinek  

PR c++/89512
* semantics.c (finish_qualified_id_expr): Reject variable templates.

* g++.dg/cpp1y/var-templ61.C: New test.

--- gcc/cp/semantics.c.jj   2019-03-08 11:45:27.556465237 +0100
+++ gcc/cp/semantics.c  2019-03-13 17:41:58.694698814 +0100
@@ -2112,6 +2112,14 @@ finish_qualified_id_expr (tree qualifyin
expr = build_offset_ref (qualifying_class, expr, /*address_p=*/false,
 complain);
 }
+  else if (!template_p
+  && TREE_CODE (expr) == TEMPLATE_DECL
+  && !DECL_FUNCTION_TEMPLATE_P (expr))
+{
+  if (complain & tf_error)
+   error ("%qE missing template arguments", expr);
+  return error_mark_node;
+}
   else
 {
   /* In a template, return a SCOPE_REF for most qualified-ids
--- gcc/testsuite/g++.dg/cpp1y/var-templ61.C.jj 2019-03-13 17:36:33.895936102 
+0100
+++ gcc/testsuite/g++.dg/cpp1y/var-templ61.C2019-03-13 17:41:30.868147514 
+0100
@@ -0,0 +1,20 @@
+// PR c++/89512
+// { dg-do compile { target c++14 } }
+
+struct A {
+  template 
+  static const int a = 0;
+};
+
+struct B {
+  template 
+  static int foo ()
+  {
+return T::a;   // { dg-error "missing template arguments" }
+  }
+};
+
+int bar ()
+{
+  return B::foo (); // { dg-message "required from here" }
+}

Jakub

Re: RFC: Patch to allow use of DECL_NAME in libmvec calls instead of DECL_ASSEMBER_NAME

2019-03-13 Thread Joseph Myers

On Wed, 13 Mar 2019, Richard Biener wrote:

> And that fortran support patch would need yet another iteration.

Fortran never uses the _finite names because it never uses the C header 
that can declare the functions to use those asm names.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] Fix create_dispatcher_calls ICE (PR ipa/89684)

2019-03-13 Thread Jakub Jelinek

Hi!

create_dispatcher_calls iterates over ipa_ref *s referring the current node
and wants to remove them all and create new ones.  The problem is
that if there are any ipa_ref objects where two or more of them are from the
same cgraph node to the current node (in the testcases there are both cases
where there are two or more foo -> foo self-references, or two or more
foo.avx -> foo references), when we ref->remove_reference () one of them,
that method will actually move some other ipa_ref that was last in
references vector over to that and thus invalidate what some other ipa_ref
pointers point to (the pointers still point to some elements of some
references vector, but could either point past the last one, or could point
to an element that was overwritten with something else).

The following patch fixes it by remove_reference all references immediately,
but push into the vector their content first; besides that
ref->remove_reference we are just accessing a couple of fields from that,
not calling any other methods.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-13  Jakub Jelinek  

PR ipa/89684
* multiple_target.c (create_dispatcher_calls): Change
references_to_redirect from vector of ipa_ref * to vector of ipa_ref.
In the node->iterate_referring loop, push *ref rather than ref, call
ref->remove_reference () and always pass 0 to iterate_referring.

--- gcc/multiple_target.c.jj2019-03-13 09:23:35.345567298 +0100
+++ gcc/multiple_target.c   2019-03-13 10:12:49.071371927 +0100
@@ -103,10 +103,16 @@ create_dispatcher_calls (struct cgraph_n
 inode->resolve_alias (cgraph_node::get (resolver_decl));
 
   auto_vec edges_to_redirect;
-  auto_vec references_to_redirect;
+  /* We need to capture the references by value rather than just pointers to 
them
+ and remove them right away, as removing them later would invalidate what
+ some other reference pointers point to.  */
+  auto_vec references_to_redirect;
 
-  for (unsigned i = 0; node->iterate_referring (i, ref); i++)
-references_to_redirect.safe_push (ref);
+  while (node->iterate_referring (0, ref))
+{
+  references_to_redirect.safe_push (*ref);
+  ref->remove_reference ();
+}
 
   /* We need to remember NEXT_CALLER as it could be modified in the loop.  */
   for (cgraph_edge *e = node->callers; e ; e = e->next_caller)
@@ -146,13 +152,11 @@ create_dispatcher_calls (struct cgraph_n
}
 
  symtab_node *source = ref->referring;
- ref->remove_reference ();
  source->create_reference (inode, IPA_REF_ADDR);
}
  else if (ref->use == IPA_REF_ALIAS)
{
  symtab_node *source = ref->referring;
- ref->remove_reference ();
  source->create_reference (inode, IPA_REF_ALIAS);
  source->add_to_same_comdat_group (inode);
}
--- gcc/testsuite/gcc.target/i386/pr89684.c.jj  2019-03-13 10:12:05.677080120 
+0100
+++ gcc/testsuite/gcc.target/i386/pr89684.c 2019-03-13 10:11:56.390231682 
+0100
@@ -0,0 +1,23 @@
+/* PR ipa/89684 */
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+
+void bar (int, void (*) (void));
+
+__attribute__((target_clones ("default", "avx")))
+void foo (void)
+{
+  bar (0, foo);
+  bar (0, foo);
+}
+
+__attribute__((target_clones ("default", "avx", "avx2")))
+void baz (void)
+{
+  bar (0, foo);
+  bar (0, foo);
+  bar (0, foo);
+  bar (0, foo);
+  bar (0, foo);
+  bar (0, foo);
+}

Jakub

[C++ PATCH] Fix cxx_eval_loop_expr ICE (PR c++/89652, take 2)

2019-03-13 Thread Jakub Jelinek

On Mon, Mar 11, 2019 at 11:21:00PM +0100, Jakub Jelinek wrote:
> The following testcase ICEs since my recent cxx_eval_loop_expr changes.
> The problem is that the Forget saved values of SAVE_EXPRs. inside of the
> loop can remove SAVE_EXPRs from new_ctx.values and if that is the last
> iteration, we can also do the loop at the end of function (which has been
> added there mainly to handle cases where the main loop breaks earlier)
> and new_ctx.values->remove ICEs because *iter is already not in the table.
> 
> While I could e.g. add some bool whether there is anything to be removed
> after the loop or not which would fix the regression part of this PR,
> I believe there is a latent issue as well.  The thing is, new_ctx.save_exprs

That is actually not true, the bug is fully my fault and previously it was
ok, as cxx_eval_loop_expr used to do:
  do
{
  hash_set save_exprs;
  new_ctx.save_exprs = _exprs;
...
}
  while (...)
and thus there was a new hash_set for each iteration, it was just me moving
it out of the loop (so that I don't have to repeat the ->remove loop on each
break or use gotos).

> Yet another possibility might be to turn save_exprs into a vec, from what

That said, I believe using a vec must be faster and the hash_set seems
overkill, as if SAVE_EXPR wasn't seen yet in the current iteration, it is
desirable to add it to the vec and if it has been added there,
ctx->value->get will be non-NULL, so we'll never add a duplicate, and by
using a vec we don't have to allocate storage for each iteration etc.

So, here is a variant patch I've bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2019-03-13  Jakub Jelinek  

PR c++/89652
* constexpr.c (struct constexpr_ctx): Change save_exprs type from
hash_set to vec.
(cxx_eval_call_expression): Adjust for save_exprs being a vec instead
of hash_set.
(cxx_eval_loop_expr): Likewise.  Truncate the vector after each
removal of SAVE_EXPRs from values.
(cxx_eval_constant_expression) : Call safe_push
method on save_exprs instead of add.

* g++.dg/cpp1y/constexpr-89652.C: New test.

--- gcc/cp/constexpr.c.jj   2019-03-11 22:56:51.856732730 +0100
+++ gcc/cp/constexpr.c  2019-03-13 13:21:48.986687463 +0100
@@ -1024,7 +1024,7 @@ struct constexpr_ctx {
   hash_map *values;
   /* SAVE_EXPRs that we've seen within the current LOOP_EXPR.  NULL if we
  aren't inside a loop.  */
-  hash_set *save_exprs;
+  vec *save_exprs;
   /* The CONSTRUCTOR we're currently building up for an aggregate
  initializer.  */
   tree ctor;
@@ -1831,7 +1831,7 @@ cxx_eval_call_expression (const constexp
  /* Track the callee's evaluated SAVE_EXPRs so that we can forget
 their values after the call.  */
  constexpr_ctx ctx_with_save_exprs = *ctx;
- hash_set save_exprs;
+ auto_vec save_exprs;
  ctx_with_save_exprs.save_exprs = _exprs;
  ctx_with_save_exprs.call = _call;
 
@@ -1862,9 +1862,10 @@ cxx_eval_call_expression (const constexp
}
 
  /* Forget the saved values of the callee's SAVE_EXPRs.  */
- for (hash_set::iterator iter = save_exprs.begin();
-  iter != save_exprs.end(); ++iter)
-   ctx_with_save_exprs.values->remove (*iter);
+ unsigned int i;
+ tree save_expr;
+ FOR_EACH_VEC_ELT (save_exprs, i, save_expr)
+   ctx_with_save_exprs.values->remove (save_expr);
 
  /* Remove the parms/result from the values map.  Is it worth
 bothering to do this when the map itself is only live for
@@ -4190,7 +4191,7 @@ cxx_eval_loop_expr (const constexpr_ctx
 default:
   gcc_unreachable ();
 }
-  hash_set save_exprs;
+  auto_vec save_exprs;
   new_ctx.save_exprs = _exprs;
   do
 {
@@ -4234,9 +4235,11 @@ cxx_eval_loop_expr (const constexpr_ctx
}
 
   /* Forget saved values of SAVE_EXPRs.  */
-  for (hash_set::iterator iter = save_exprs.begin();
-  iter != save_exprs.end(); ++iter)
-   new_ctx.values->remove (*iter);
+  unsigned int i;
+  tree save_expr;
+  FOR_EACH_VEC_ELT (save_exprs, i, save_expr)
+   new_ctx.values->remove (save_expr);
+  save_exprs.truncate (0);
 
   if (++count >= constexpr_loop_limit)
{
@@ -4256,9 +4259,10 @@ cxx_eval_loop_expr (const constexpr_ctx
 && !*non_constant_p);
 
   /* Forget saved values of SAVE_EXPRs.  */
-  for (hash_set::iterator iter = save_exprs.begin();
-   iter != save_exprs.end(); ++iter)
-new_ctx.values->remove (*iter);
+  unsigned int i;
+  tree save_expr;
+  FOR_EACH_VEC_ELT (save_exprs, i, save_expr)
+new_ctx.values->remove (save_expr);
 
   return NULL_TREE;
 }
@@ -4616,7 +4620,7 @@ cxx_eval_constant_expression (const cons
non_constant_p, overflow_p);
  ctx->values->put (t, r);
  if (ctx->save_exprs)
-

Re: [PR72741] Encode OpenACC 'routine' directive's level of parallelism inside Fortran module files

2019-03-13 Thread Thomas Koenig


Am 13.03.19 um 18:50 schrieb Thomas Schwinge:

There are many ways to deal with it without bumping MOD_VERSION in a
backwards but not forwards compatible way, so that a newer compiler will be
able to parse old *.mod files, and newer compiler new ones as long as this
problematic stuff doesn't appear in.

Like the attached, actually pretty simple now?


Can you explain a) how this works, and b) how you tested it?

[PATCH v2, rs6000] PR87532: Bad Results from vec_extract(unsigned char, foo) dependent upon function inline

2019-03-13 Thread Kelvin Nilsen



An initial draft patch was distributed on 3/8/19.  Thanks Segher for careful 
review and detailed feedback.  This second draft patch differs from the first 
in the following regards:

  1. Simplified dg directives in the new tests cases:
 a) Removed { target { powerpc*-*-* } } from dg-do run directives because 
this is redundant with powerpc.exp
 b) Removed { dg-skip-if "" { powerpc*-*-darwin* } } directives because 
this is redundant with requiring vsx or altivec
 c) Changed effective target requirement from dfp_hw to vmx_hw
 d) Required { target { int128 } } instead of lp64 for tests that require 
int128 support
 e) Removed dg-skip-if for -mcpu= overrides, because these tests are not 
setting cpu
 f) Removed "-save-temps -dp -g" from the dg-options directive on certain 
tests
  2. Corrected certain __asm__ statements to require the "v" output constraint 
rather than the "wa" output constraint (when compiling with -maltivec)
  3. In rs6000-c.c, made modular computation of constant selector expression 
unconditional
  4. In rs6000.c:
 a) Changed computation of bits_in_element to use GET_MODE_INNER macro.
 b) Changed error message in case of selector expression overflow to not 
make reference to HOST_WIDE_INT


This problem report, though initially motivated by differences in behavior 
between constant and non-constant selector arguments, uncovered a number of 
inconsistencies in the implementation of vec_extract.

This patch provides several fixes to make handling of constant selector 
expressions the same as the handling of non-constant selector expressions.  In 
the process of testing, it was observed that certain existing regression tests 
were looking for the wrong instructions to be emitted and those tests have been 
updated.

This has bootstrapped and tested without regressions on 
powerpc64le-unknown-linux (both P8 and P9) and on powerpc-linux (P7 big-endian, 
with both -m32 and -m64 target options).

Is this ok for trunk?

gcc/ChangeLog:

2019-03-13  Kelvin Nilsen  

PR target/87532
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
When handling vec_extract, use modular arithmetic to allow
constant selectors greater than vector length.
* config/rs6000/rs6000.c (rs6000_expand_vector_extract): Allow
V1TImode vectors to have constant selector values greater than 0.
Use modular arithmetic to compute vector index.
(rs6000_split_vec_extract_var): Use modular arithmetic to compute
index for in-memory vectors.  Correct code generation for
in-register vectors.
(altivec_expand_vec_ext_builtin): Use modular arithmetic to
computer index.

gcc/testsuite/ChangeLog:

2019-03-13  Kelvin Nilsen  

PR target/87532
* gcc.target/powerpc/fold-vec-extract-char.p8.c: Modify expected
instruction selection.
* gcc.target/powerpc/fold-vec-extract-int.p8.c: Likewise.
* gcc.target/powerpc/fold-vec-extract-short.p8.c: Likewise.
* gcc.target/powerpc/pr87532-mc.c: New test.
* gcc.target/powerpc/pr87532.c: New test.
* gcc.target/powerpc/vec-extract-v16qiu-v2.h: New test.
* gcc.target/powerpc/vec-extract-v16qiu-v2a.c: New test.
* gcc.target/powerpc/vec-extract-v16qiu-v2b.c: New test.
* gcc.target/powerpc/vsx-builtin-10a.c: New test.
* gcc.target/powerpc/vsx-builtin-10b.c: New test.
* gcc.target/powerpc/vsx-builtin-11a.c: New test.
* gcc.target/powerpc/vsx-builtin-11b.c: New test.
* gcc.target/powerpc/vsx-builtin-12a.c: New test.
* gcc.target/powerpc/vsx-builtin-12b.c: New test.
* gcc.target/powerpc/vsx-builtin-13a.c: New test.
* gcc.target/powerpc/vsx-builtin-13b.c: New test.
* gcc.target/powerpc/vsx-builtin-14a.c: New test.
* gcc.target/powerpc/vsx-builtin-14b.c: New test.
* gcc.target/powerpc/vsx-builtin-15a.c: New test.
* gcc.target/powerpc/vsx-builtin-15b.c: New test.
* gcc.target/powerpc/vsx-builtin-16a.c: New test.
* gcc.target/powerpc/vsx-builtin-16b.c: New test.
* gcc.target/powerpc/vsx-builtin-17a.c: New test.
* gcc.target/powerpc/vsx-builtin-17b.c: New test.
* gcc.target/powerpc/vsx-builtin-18a.c: New test.
* gcc.target/powerpc/vsx-builtin-18b.c: New test.
* gcc.target/powerpc/vsx-builtin-19a.c: New test.
* gcc.target/powerpc/vsx-builtin-19b.c: New test.
* gcc.target/powerpc/vsx-builtin-20a.c: New test.
* gcc.target/powerpc/vsx-builtin-20b.c: New test.
* gcc.target/powerpc/vsx-builtin-9a.c: New test.
* gcc.target/powerpc/vsx-builtin-9b.c: New test.

Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 269492)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -6565,12 +6565,14 @@

Re: [PR fortran/87045, patch] - pointer to array of character

2019-03-13 Thread Harald Anlauf

Committed as r269664.

Thanks for the review!

Harald

On 03/13/19 01:41, Steve Kargl wrote:
> On Sun, Mar 10, 2019 at 10:19:03PM +0100, Harald Anlauf wrote:
>> The code in the testcase died due to a run-time bounds-check
>> generated slightly too early, leading to a crash for deferred
>> character length.  Moving the character length check after the
>> assignment solves the issue.
>>
>> Regtests cleanly on x86_64-pc-linux-gnu.
>>
>> OK for trunk?
>>
> 
> OK.
>

patch to fix PR85860

2019-03-13 Thread Vladimir Makarov


The following patch fixes

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85860

The patch was successfully bootstrapped and tested on x86-64.

Committed as rev. 269662 to trunk and as rev. 269663 to gcc-8-branch.


Index: ChangeLog
===
--- ChangeLog	(revision 269661)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2019-03-13  Vladimir Makarov  
+
+	PR target/85860
+	* lra-constraints.c (inherit_in_ebb): Update
+	potential_reload_hard_regs along with live_hard_regs.
+
 2019-03-13  Jakub Jelinek  
 
 	PR debug/89498
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 269651)
+++ lra-constraints.c	(working copy)
@@ -6365,6 +6365,7 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 			add_to_hard_reg_set (, PSEUDO_REGNO_MODE (dst_regno),
 	 reg_renumber[dst_regno]);
 		  AND_COMPL_HARD_REG_SET (live_hard_regs, s);
+		  AND_COMPL_HARD_REG_SET (potential_reload_hard_regs, s);
 		}
 		  /* We should invalidate potential inheritance or
 		 splitting for the current insn usages to the next
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 269661)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2019-03-13  Vladimir Makarov  
+
+	PR target/85860
+	* gcc.target/i386/pr85860.c: New.
+
 2019-03-13  Marek Polacek  
 
 	PR c++/89686 - mixing init-capture and simple-capture in lambda.
Index: testsuite/gcc.target/i386/pr85860.c
===
--- testsuite/gcc.target/i386/pr85860.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr85860.c	(working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -fno-guess-branch-probability -flive-range-shrinkage -mbmi2" } */
+
+int a, b, c, d, e;
+
+extern int bar(void);
+
+__int128
+foo (unsigned g, int h, long i, __int128 j, short k, __int128 l)
+{
+  unsigned __int128 m = j;
+  do
+{
+  j %= 5;
+  c = c >> (m & 31);
+  e = __builtin_sub_overflow (b, 0, );
+  d = bar ();
+  l *= __builtin_mul_overflow_p ((unsigned) d, ~(unsigned __int128) 1,
+ (unsigned __int128) 0);
+}
+  while (a);
+  return m + j + k + l;
+}

Re: [PATCH] gcc/riscv: Correctly ignore empty C++ structs when flattening for ABI

2019-03-13 Thread Jim Wilson

On Wed, Mar 13, 2019 at 11:39 AM Andrew Burgess
 wrote:
> gcc/ChangeLog:
>
> PR target/89627
> * config/riscv/riscv.c (riscv_pass_fpr_single): Add offset
> parameter, and make use of it.
> (riscv_get_arg_info): Pass offset to riscv_pass_fpr_single.
>
> gcc/testsuite/ChangeLog:
>
> PR target/89627
> * g++.target/riscv/call-with-empty-struct-float.C: New file.
> * g++.target/riscv/call-with-empty-struct-int.C: New file.
> * g++.target/riscv/call-with-empty-struct.H: New file.
> * g++.target/riscv/riscv.exp: New file.

This is OK.

We can always move the testcases later if that is useful.

Jim

Re: [PATCH] gcc/riscv: Correctly ignore empty C++ structs when flattening for ABI

2019-03-13 Thread Jim Wilson

On Wed, Mar 13, 2019 at 11:53 AM Andrew Pinski  wrote:
> This testcase seems generic, that is it has no ABI dependent values
> attached to it.
> Can it be moved to a more generic location instead?  Maybe there are
> other targets which get this incorrect also.

I'm not aware of any other target that passes structs the same way.
RISC-V fails the testcase because it will decompose a struct
containing only floats into fields and pass the fields as scalars.
IA-64 and Aarch64 have support for HFA (homogenous FP aggregates), but
this case doesn't qualify as an HFA because of the empty struct, and
hence this case gets no special treatment for those targets and isn't
broken.

Jim

Re: C++ PATCH for c++/89660 - bogus error with -Wredundant-move

2019-03-13 Thread Jason Merrill


On 3/11/19 6:56 PM, Marek Polacek wrote:

My recent patch caused us to call convert_for_initialization for a std:move's
argument to see if it would have succeeded had the returned expression been
just that argument.

That caused a bogus error in this test, because convert_for_initialization
might cause additional instantiations, and they might fail.  My first
version of the patch fixed this by adding "cp_unevaluated e;", preventing
add_pending_template from adding further instantiations, but I no longer think
that's the best fix, because in this case the argument isn't an id-expression,
and the implicit move wouldn't be performed, so we shouldn't warn.  Thus fixed
by making the maybe_warn_pessimizing_move condition more strict -- that fixes
both the bogus error and the bogus warning.  Specifically, make sure that the
argument is of form "(T &) " and not "(T &) (T *) " or similar.

Also add a test with template-ids.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-03-11  Marek Polacek  

PR c++/89660 - bogus error with -Wredundant-move.
* typeck.c (maybe_warn_pessimizing_move): Only accept (T &) 
as the std::move's argument.  Don't call convert_for_initialization
when warn_redundant_move isn't on.


OK.

Jason

Re: [Patch, Fortran] PR 89601: [8/9 Regression] [PDT] ICE: Segmentation fault (in resolve_component)

2019-03-13 Thread Janus Weil

I have just committed the updated patch to trunk as r269658. If anyone
thinks this should be backported to 8-branch, please let me know.

Cheers,
Janus


Am Di., 12. März 2019 um 13:00 Uhr schrieb Janus Weil :
>
> Hi Steve,
>
> > > Technically the ICE is a regression, but since it happens on invalid
> > > code only, backporting is not essential IMHO (but might still be
> > > useful). Ok for trunk? And 8-branch?
> >
> > Looks good to me with a minor suggestion.  See below.
>
> thanks for the comments. Updated patch attached.
>
>
> > You may want to wait a bit or ping Paul to give him
> > a chance to peek at the patch.
>
> I'm putting Paul in CC and will wait until tomorrow night before committing.
>
>
> > >if (gfc_match_char (')') == MATCH_YES)
> > > -goto ok;
> > > +  {
> > > +if (typeparam)
> > > +  {
> > > + gfc_error_now ("A parameter name is required at %C");
> >
> > Can you insert "type" so the error reads "A type parameter name"?
> > Unfortunately, the words parameter has a few too many meanings.
>
> True. I'm using 'type parameter list' now.
>
> Cheers,
> Janus

Re: Patch ping (Re: [PATCH] Fix -gdwarf-5 -gsplit-dwarf ICEs (PR debug/89498))

2019-03-13 Thread Jason Merrill


On 3/12/19 7:51 AM, Jakub Jelinek wrote:

On Mon, Mar 04, 2019 at 11:35:50PM +0100, Jakub Jelinek wrote:

2019-03-04  Jakub Jelinek  

PR debug/89498
* dwarf2out.c (size_of_die): For dw_val_class_view_list always use
DWARF_OFFSET_SIZE.
(value_format): For dw_val_class_view_list never use DW_FORM_loclistx.


Given the clarifications Alex has just posted, I believe this patch does
what is documented in there.  Ok for trunk?


OK.

Jason

Re: C++ PATCH for c++/89686 - mixing init-capture and simple-capture in lambda

2019-03-13 Thread Jason Merrill


On 3/13/19 3:32 PM, Marek Polacek wrote:

On Wed, Mar 13, 2019 at 02:15:25PM -0400, Jason Merrill wrote:

On 3/12/19 5:42 PM, Marek Polacek wrote:

On Tue, Mar 12, 2019 at 04:07:56PM -0400, Jason Merrill wrote:

On 3/12/19 3:59 PM, Marek Polacek wrote:

As Barry explained in the PR, lambda capture is one of

 simple-capture ...[opt]
 ...[opt] init-capture

where init-capture requires an initializer.  Here we have

 [...xs...]

which is ill-formed as it's mingling both of these.

Bootstrapped/regtested on x86_64-linux, ok for trunk?  Or should I defer to GCC
10?

2019-03-12  Marek Polacek  

PR c++/89686 - mixing init-capture and simple-capture in lambda.
* parser.c (cp_parser_lambda_introducer): Give error when combining
init-capture and simple-capture.

* g++.dg/cpp2a/lambda-pack-init2.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index f9569ed..d5d8f364752 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -10721,6 +10721,15 @@ cp_parser_lambda_introducer (cp_parser* parser, tree 
lambda_expr)
{
  cp_lexer_consume_token (parser->lexer);
  capture_init_expr = make_pack_expansion (capture_init_expr);
+ if (init_pack_expansion)
+   {
+ /* We'd already seen '...' so we were expecting an
+init-capture.  But we just saw another '...' which
+would imply a simple-capture.  */
+ error_at (capture_token->location,
+   "combining init-capture and simple-capture");


That diagnostic seems a bit obscure, how about something like "too many
%<...%> in lambda capture"?


Yup, sounds better.


Or perhaps check to see if there's an initializer after the ..., and
complain about the second ... if so, or the first ... if not.


I tried but we already give diagnostic for [...xs=xs...] or [...xs=...xs]
and similar.


I was thinking about '...xs...=xs'.


Ah, I see.


But let's at least use a proper location for the '...'.


This patch uses the location of ... after the identifier, but if there's no
initializer, the ... before the identifier is the one we want to flag as
invalid.


How about this, then?

Bootstrap/regtest running on x86_64-linux, ok for trunk?


OK, thanks.

Jason

Re: C++ PATCH for c++/89686 - mixing init-capture and simple-capture in lambda

2019-03-13 Thread Marek Polacek

On Wed, Mar 13, 2019 at 02:15:25PM -0400, Jason Merrill wrote:
> On 3/12/19 5:42 PM, Marek Polacek wrote:
> > On Tue, Mar 12, 2019 at 04:07:56PM -0400, Jason Merrill wrote:
> > > On 3/12/19 3:59 PM, Marek Polacek wrote:
> > > > As Barry explained in the PR, lambda capture is one of
> > > > 
> > > > simple-capture ...[opt]
> > > > ...[opt] init-capture
> > > > 
> > > > where init-capture requires an initializer.  Here we have
> > > > 
> > > > [...xs...]
> > > > 
> > > > which is ill-formed as it's mingling both of these.
> > > > 
> > > > Bootstrapped/regtested on x86_64-linux, ok for trunk?  Or should I 
> > > > defer to GCC
> > > > 10?
> > > > 
> > > > 2019-03-12  Marek Polacek  
> > > > 
> > > > PR c++/89686 - mixing init-capture and simple-capture in lambda.
> > > > * parser.c (cp_parser_lambda_introducer): Give error when 
> > > > combining
> > > > init-capture and simple-capture.
> > > > 
> > > > * g++.dg/cpp2a/lambda-pack-init2.C: New test.
> > > > 
> > > > diff --git gcc/cp/parser.c gcc/cp/parser.c
> > > > index f9569ed..d5d8f364752 100644
> > > > --- gcc/cp/parser.c
> > > > +++ gcc/cp/parser.c
> > > > @@ -10721,6 +10721,15 @@ cp_parser_lambda_introducer (cp_parser* 
> > > > parser, tree lambda_expr)
> > > > {
> > > >   cp_lexer_consume_token (parser->lexer);
> > > >   capture_init_expr = make_pack_expansion 
> > > > (capture_init_expr);
> > > > + if (init_pack_expansion)
> > > > +   {
> > > > + /* We'd already seen '...' so we were expecting an
> > > > +init-capture.  But we just saw another '...' which
> > > > +would imply a simple-capture.  */
> > > > + error_at (capture_token->location,
> > > > +   "combining init-capture and 
> > > > simple-capture");
> > > 
> > > That diagnostic seems a bit obscure, how about something like "too many
> > > %<...%> in lambda capture"?
> > 
> > Yup, sounds better.
> > 
> > > Or perhaps check to see if there's an initializer after the ..., and
> > > complain about the second ... if so, or the first ... if not.
> > 
> > I tried but we already give diagnostic for [...xs=xs...] or [...xs=...xs]
> > and similar.
> 
> I was thinking about '...xs...=xs'.

Ah, I see.

> > But let's at least use a proper location for the '...'.
> 
> This patch uses the location of ... after the identifier, but if there's no
> initializer, the ... before the identifier is the one we want to flag as
> invalid.

How about this, then?

Bootstrap/regtest running on x86_64-linux, ok for trunk?

2019-03-13  Marek Polacek  

PR c++/89686 - mixing init-capture and simple-capture in lambda.
* parser.c (cp_parser_lambda_introducer): Give error when combining
init-capture and simple-capture.

* g++.dg/cpp2a/lambda-pack-init2.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index 8244366e669..14da1a14501 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -10609,12 +10609,13 @@ cp_parser_lambda_introducer (cp_parser* parser, tree 
lambda_expr)
}
 
   bool init_pack_expansion = false;
+  location_t ellipsis_loc = UNKNOWN_LOCATION;
   if (cp_lexer_next_token_is (parser->lexer, CPP_ELLIPSIS))
{
- location_t loc = cp_lexer_peek_token (parser->lexer)->location;
+ ellipsis_loc = cp_lexer_peek_token (parser->lexer)->location;
  if (cxx_dialect < cxx2a)
-   pedwarn (loc, 0, "pack init-capture only available with "
-"%<-std=c++2a%> or %<-std=gnu++2a%>");
+   pedwarn (ellipsis_loc, 0, "pack init-capture only available with "
+"%<-std=c++2a%> or %<-std=gnu++2a%>");
  cp_lexer_consume_token (parser->lexer);
  init_pack_expansion = true;
}
@@ -10719,8 +10720,21 @@ cp_parser_lambda_introducer (cp_parser* parser, tree 
lambda_expr)
 
  if (cp_lexer_next_token_is (parser->lexer, CPP_ELLIPSIS))
{
+ location_t loc = cp_lexer_peek_token (parser->lexer)->location;
  cp_lexer_consume_token (parser->lexer);
  capture_init_expr = make_pack_expansion (capture_init_expr);
+ if (init_pack_expansion)
+   {
+ /* If what follows is an initializer, the second '...' is
+invalid.  But for cases like [...xs...], the first one
+is invalid.  */
+ if (cp_lexer_next_token_is (parser->lexer, CPP_EQ)
+ || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)
+ || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
+   ellipsis_loc = loc;
+ error_at (ellipsis_loc, "too many %<...%> in lambda capture");
+ continue;
+   }
}
}
 
diff --git gcc/testsuite/g++.dg/cpp2a/lambda-pack-init2.C

Re: [PATCH] gcc/riscv: Correctly ignore empty C++ structs when flattening for ABI

2019-03-13 Thread Andrew Pinski

On Wed, Mar 13, 2019 at 11:40 AM Andrew Burgess
 wrote:
>
> Jim,
>
> Sorry for the delay in sending this patch.
>
> Thanks,
> Andrew
>
> ---
>
> This fixes PR target/89627.
>
> The RISC-V ABI document[1] says:
>
>For the purposes of this section, "struct" refers to a C struct
>with its hierarchy flattened, including any array fields. That is,
>struct { struct { float f[1]; } g[2]; } and struct { float f; float
>g; } are treated the same. Fields containing empty structs or
>unions are ignored while flattening, even in C++, unless they have
>nontrivial copy constructors or destructors.
>
> However, this flattening only applies when one of the fields of the
> flattened structure can be placed into a floating point register,
> otherwise no flattening occurs.
>
> Currently GCC fails to correctly consider that empty C++ structures
> have a non-zero size when constructing the arguments from a flattened
> structure, and as a result, trying to pass a C++ structure like this:
>
>   struct sf { struct {} e; float f; };
>
> Doesn't work correctly, GCC fails to take the offset of 'f' within
> 'sf' into account and will actually pass the space backing 'e' as the
> contents of 'f'.
>
> This patch fixes this so that 'f' will be passed correctly.  A couple
> of new tests are added to cover this functionality.
>
> [1] https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md
>
> gcc/ChangeLog:
>
> PR target/89627
> * config/riscv/riscv.c (riscv_pass_fpr_single): Add offset
> parameter, and make use of it.
> (riscv_get_arg_info): Pass offset to riscv_pass_fpr_single.
>
> gcc/testsuite/ChangeLog:
>
> PR target/89627
> * g++.target/riscv/call-with-empty-struct-float.C: New file.
> * g++.target/riscv/call-with-empty-struct-int.C: New file.
> * g++.target/riscv/call-with-empty-struct.H: New file.
> * g++.target/riscv/riscv.exp: New file.

This testcase seems generic, that is it has no ABI dependent values
attached to it.
Can it be moved to a more generic location instead?  Maybe there are
other targets which get this incorrect also.

Thanks,
Andrew Pinski


> ---
>  gcc/ChangeLog  |  7 +
>  gcc/config/riscv/riscv.c   |  8 +++--
>  gcc/testsuite/ChangeLog|  8 +
>  .../riscv/call-with-empty-struct-float.C   |  6 
>  .../g++.target/riscv/call-with-empty-struct-int.C  |  6 
>  .../g++.target/riscv/call-with-empty-struct.H  | 19 
>  gcc/testsuite/g++.target/riscv/riscv.exp   | 34 
> ++
>  7 files changed, 85 insertions(+), 3 deletions(-)
>  create mode 100644 
> gcc/testsuite/g++.target/riscv/call-with-empty-struct-float.C
>  create mode 100644 
> gcc/testsuite/g++.target/riscv/call-with-empty-struct-int.C
>  create mode 100644 gcc/testsuite/g++.target/riscv/call-with-empty-struct.H
>  create mode 100644 gcc/testsuite/g++.target/riscv/riscv.exp
>
> diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> index 8881f80e18f..8e78ab76375 100644
> --- a/gcc/config/riscv/riscv.c
> +++ b/gcc/config/riscv/riscv.c
> @@ -2449,13 +2449,14 @@ riscv_pass_aggregate_in_fpr_and_gpr_p (const_tree 
> type,
>
>  static rtx
>  riscv_pass_fpr_single (machine_mode type_mode, unsigned regno,
> -  machine_mode value_mode)
> +  machine_mode value_mode,
> +  HOST_WIDE_INT offset)
>  {
>rtx x = gen_rtx_REG (value_mode, regno);
>
>if (type_mode != value_mode)
>  {
> -  x = gen_rtx_EXPR_LIST (VOIDmode, x, const0_rtx);
> +  x = gen_rtx_EXPR_LIST (VOIDmode, x, GEN_INT (offset));
>x = gen_rtx_PARALLEL (type_mode, gen_rtvec (1, x));
>  }
>return x;
> @@ -2517,7 +2518,8 @@ riscv_get_arg_info (struct riscv_arg_info *info, const 
> CUMULATIVE_ARGS *cum,
>   {
>   case 1:
> return riscv_pass_fpr_single (mode, fregno,
> - TYPE_MODE (fields[0].type));
> + TYPE_MODE (fields[0].type),
> + fields[0].offset);
>
>   case 2:
> return riscv_pass_fpr_pair (mode, fregno,
> diff --git a/gcc/testsuite/g++.target/riscv/call-with-empty-struct-float.C 
> b/gcc/testsuite/g++.target/riscv/call-with-empty-struct-float.C
> new file mode 100644
> index 000..76d0dc6d93d
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/riscv/call-with-empty-struct-float.C
> @@ -0,0 +1,6 @@
> +// { dg-do run }
> +
> +#include "call-with-empty-struct.H"
> +
> +MAKE_STRUCT_PASSING_TEST(float,2.5)
> +
> diff --git a/gcc/testsuite/g++.target/riscv/call-with-empty-struct-int.C 
> b/gcc/testsuite/g++.target/riscv/call-with-empty-struct-int.C
> new file mode 100644
> index 000..cc82912dc3e
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/riscv/call-with-empty-struct-int.C

[PATCH] gcc/riscv: Correctly ignore empty C++ structs when flattening for ABI

2019-03-13 Thread Andrew Burgess

Jim,

Sorry for the delay in sending this patch.

Thanks,
Andrew

---

This fixes PR target/89627.

The RISC-V ABI document[1] says:

   For the purposes of this section, "struct" refers to a C struct
   with its hierarchy flattened, including any array fields. That is,
   struct { struct { float f[1]; } g[2]; } and struct { float f; float
   g; } are treated the same. Fields containing empty structs or
   unions are ignored while flattening, even in C++, unless they have
   nontrivial copy constructors or destructors.

However, this flattening only applies when one of the fields of the
flattened structure can be placed into a floating point register,
otherwise no flattening occurs.

Currently GCC fails to correctly consider that empty C++ structures
have a non-zero size when constructing the arguments from a flattened
structure, and as a result, trying to pass a C++ structure like this:

  struct sf { struct {} e; float f; };

Doesn't work correctly, GCC fails to take the offset of 'f' within
'sf' into account and will actually pass the space backing 'e' as the
contents of 'f'.

This patch fixes this so that 'f' will be passed correctly.  A couple
of new tests are added to cover this functionality.

[1] https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md

gcc/ChangeLog:

PR target/89627
* config/riscv/riscv.c (riscv_pass_fpr_single): Add offset
parameter, and make use of it.
(riscv_get_arg_info): Pass offset to riscv_pass_fpr_single.

gcc/testsuite/ChangeLog:

PR target/89627
* g++.target/riscv/call-with-empty-struct-float.C: New file.
* g++.target/riscv/call-with-empty-struct-int.C: New file.
* g++.target/riscv/call-with-empty-struct.H: New file.
* g++.target/riscv/riscv.exp: New file.
---
 gcc/ChangeLog  |  7 +
 gcc/config/riscv/riscv.c   |  8 +++--
 gcc/testsuite/ChangeLog|  8 +
 .../riscv/call-with-empty-struct-float.C   |  6 
 .../g++.target/riscv/call-with-empty-struct-int.C  |  6 
 .../g++.target/riscv/call-with-empty-struct.H  | 19 
 gcc/testsuite/g++.target/riscv/riscv.exp   | 34 ++
 7 files changed, 85 insertions(+), 3 deletions(-)
 create mode 100644 
gcc/testsuite/g++.target/riscv/call-with-empty-struct-float.C
 create mode 100644 gcc/testsuite/g++.target/riscv/call-with-empty-struct-int.C
 create mode 100644 gcc/testsuite/g++.target/riscv/call-with-empty-struct.H
 create mode 100644 gcc/testsuite/g++.target/riscv/riscv.exp

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 8881f80e18f..8e78ab76375 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -2449,13 +2449,14 @@ riscv_pass_aggregate_in_fpr_and_gpr_p (const_tree type,
 
 static rtx
 riscv_pass_fpr_single (machine_mode type_mode, unsigned regno,
-  machine_mode value_mode)
+  machine_mode value_mode,
+  HOST_WIDE_INT offset)
 {
   rtx x = gen_rtx_REG (value_mode, regno);
 
   if (type_mode != value_mode)
 {
-  x = gen_rtx_EXPR_LIST (VOIDmode, x, const0_rtx);
+  x = gen_rtx_EXPR_LIST (VOIDmode, x, GEN_INT (offset));
   x = gen_rtx_PARALLEL (type_mode, gen_rtvec (1, x));
 }
   return x;
@@ -2517,7 +2518,8 @@ riscv_get_arg_info (struct riscv_arg_info *info, const 
CUMULATIVE_ARGS *cum,
  {
  case 1:
return riscv_pass_fpr_single (mode, fregno,
- TYPE_MODE (fields[0].type));
+ TYPE_MODE (fields[0].type),
+ fields[0].offset);
 
  case 2:
return riscv_pass_fpr_pair (mode, fregno,
diff --git a/gcc/testsuite/g++.target/riscv/call-with-empty-struct-float.C 
b/gcc/testsuite/g++.target/riscv/call-with-empty-struct-float.C
new file mode 100644
index 000..76d0dc6d93d
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/call-with-empty-struct-float.C
@@ -0,0 +1,6 @@
+// { dg-do run }
+
+#include "call-with-empty-struct.H"
+
+MAKE_STRUCT_PASSING_TEST(float,2.5)
+
diff --git a/gcc/testsuite/g++.target/riscv/call-with-empty-struct-int.C 
b/gcc/testsuite/g++.target/riscv/call-with-empty-struct-int.C
new file mode 100644
index 000..cc82912dc3e
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/call-with-empty-struct-int.C
@@ -0,0 +1,6 @@
+/* { dg-do run } */
+
+#include "call-with-empty-struct.H"
+
+MAKE_STRUCT_PASSING_TEST(int,2)
+
diff --git a/gcc/testsuite/g++.target/riscv/call-with-empty-struct.H 
b/gcc/testsuite/g++.target/riscv/call-with-empty-struct.H
new file mode 100644
index 000..d2a46e78619
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/call-with-empty-struct.H
@@ -0,0 +1,19 @@
+#define MAKE_STRUCT_PASSING_TEST(type,val)  \
+  static struct struct_ ## type ## _t

Re: C++ PATCH for c++/88979 - further P0634 fix for constructors

2019-03-13 Thread Marek Polacek

On Wed, Mar 13, 2019 at 02:32:36PM -0400, Jason Merrill wrote:
> On 3/13/19 1:25 PM, Marek Polacek wrote:
> > This PR points out that our P0634 implementation misses constructor 
> > parameters
> > and still requires 'typename'.  I'm not 100% sure if the standard really 
> > says
> > we should also handle constructors.  Does a constructor have a function 
> > type?
> 
> Not that we can ask about, but a constructor declaration is a function
> declaration.

That settles it, then.

> > But I guess it makes sense, so this patch handles it.  To handle a 
> > constructor
> > in a class, we need to pass flags to cp_parser_constructor_declarator_p.  To
> > handle an out-of-line constructor, use constructor_name_p.
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> OK.

Thanks!

Marek

Re: C++ PATCH for c++/88979 - further P0634 fix for constructors

2019-03-13 Thread Jason Merrill


On 3/13/19 1:25 PM, Marek Polacek wrote:

This PR points out that our P0634 implementation misses constructor parameters
and still requires 'typename'.  I'm not 100% sure if the standard really says
we should also handle constructors.  Does a constructor have a function type?


Not that we can ask about, but a constructor declaration is a function 
declaration.



But I guess it makes sense, so this patch handles it.  To handle a constructor
in a class, we need to pass flags to cp_parser_constructor_declarator_p.  To
handle an out-of-line constructor, use constructor_name_p.

Bootstrapped/regtested on x86_64-linux, ok for trunk?


OK.

Jason

Re: C++ PATCH for c++/89686 - mixing init-capture and simple-capture in lambda

2019-03-13 Thread Jason Merrill


On 3/12/19 5:42 PM, Marek Polacek wrote:

On Tue, Mar 12, 2019 at 04:07:56PM -0400, Jason Merrill wrote:

On 3/12/19 3:59 PM, Marek Polacek wrote:

As Barry explained in the PR, lambda capture is one of

simple-capture ...[opt]
...[opt] init-capture

where init-capture requires an initializer.  Here we have

[...xs...]

which is ill-formed as it's mingling both of these.

Bootstrapped/regtested on x86_64-linux, ok for trunk?  Or should I defer to GCC
10?

2019-03-12  Marek Polacek  

PR c++/89686 - mixing init-capture and simple-capture in lambda.
* parser.c (cp_parser_lambda_introducer): Give error when combining
init-capture and simple-capture.

* g++.dg/cpp2a/lambda-pack-init2.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index f9569ed..d5d8f364752 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -10721,6 +10721,15 @@ cp_parser_lambda_introducer (cp_parser* parser, tree 
lambda_expr)
{
  cp_lexer_consume_token (parser->lexer);
  capture_init_expr = make_pack_expansion (capture_init_expr);
+ if (init_pack_expansion)
+   {
+ /* We'd already seen '...' so we were expecting an
+init-capture.  But we just saw another '...' which
+would imply a simple-capture.  */
+ error_at (capture_token->location,
+   "combining init-capture and simple-capture");


That diagnostic seems a bit obscure, how about something like "too many
%<...%> in lambda capture"?


Yup, sounds better.


Or perhaps check to see if there's an initializer after the ..., and
complain about the second ... if so, or the first ... if not.


I tried but we already give diagnostic for [...xs=xs...] or [...xs=...xs]
and similar.


I was thinking about '...xs...=xs'.


But let's at least use a proper location for the '...'.


This patch uses the location of ... after the identifier, but if there's 
no initializer, the ... before the identifier is the one we want to flag 
as invalid.


Jason

[PR72741] Encode OpenACC 'routine' directive's level of parallelism inside Fortran module files

2019-03-13 Thread Thomas Schwinge

Hi!

On Thu, 28 Feb 2019 22:17:01 +0100, Jakub Jelinek  wrote:
> On Thu, Feb 28, 2019 at 10:12:00PM +0100, Thomas Schwinge wrote:
> > On Wed, 15 Jun 2016 20:12:15 -0700, Cesar Philippidis 
> >  wrote:
> > The code changes now are actually very simple.  The "problem" is that
> > we're incrementing the Fortran module version, 'MOD_VERSION', which
> > breaks binary compatibility with Fortran module files created with
> > earlier versions of GCC, which is something that is to be avoided, as
> > I've heard.  Or, is it not that bad actually?
> 
> It is bad and we certainly shouldn't change it on release branches.

ACK.

> There are many ways to deal with it without bumping MOD_VERSION in a
> backwards but not forwards compatible way, so that a newer compiler will be
> able to parse old *.mod files, and newer compiler new ones as long as this
> problematic stuff doesn't appear in.

Like the attached, actually pretty simple now?

It may seem wasteful to use individual bits for 'gang', 'worker',
'vector', 'seq', but that makes it easy to implement the behavior
described above by Jakub, and I've heard rumors that OpenACC might at
some point allow several of these level of parallelism clauses to be
specified (plus a new 'auto' clause?), and then this will be necessary
anyway (several of these bits can then in fact appear).

If approving this patch, please respond with "Reviewed-by: NAME "
so that your effort will be recorded in the commit log, see
.


Grüße
 Thomas


>From b2f200d24d040c6d34b5b4421e4cb1be9786030f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 13 Mar 2019 18:39:53 +0100
Subject: [PATCH] [PR72741] Encode OpenACC 'routine' directive's level of
 parallelism inside Fortran module files

If 'use'ing with an old GCC a new module file (with OpenACC 'routine'
directive's level of parallelism encoded), then that expectedly fails as
follows:

f951: Fatal Error: Reading module 'routine_module_mod_1' at line 27 column 21: find_enum(): Enum not found

If 'use'ing with a new GCC an old module file (without OpenACC 'routine'
directive's level of parallelism encoded), then that (silently) continues to
accept the module file, and will proceed with the previous, erroneous behavior.

These seem to be acceptable compromises, instead of incrementing 'MOD_VERSION'.

	gcc/fortran/
	* module.c (verify_OACC_ROUTINE_LOP_NONE): New function.
	(enum ab_attribute): Add AB_OACC_ROUTINE_LOP_GANG,
	AB_OACC_ROUTINE_LOP_WORKER, AB_OACC_ROUTINE_LOP_VECTOR,
	AB_OACC_ROUTINE_LOP_SEQ.
	(attr_bits): Add these.
	(mio_symbol_attribute): Handle these.
	gcc/testsuite/
	* gfortran.dg/goacc/routine-module-1.f90: New file.
	* gfortran.dg/goacc/routine-module-2.f90: Likewise.
	* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
---
 gcc/fortran/module.c  | 57 -
 .../gfortran.dg/goacc/routine-module-1.f90| 47 +++
 .../gfortran.dg/goacc/routine-module-2.f90| 31 
 .../goacc/routine-module-mod-1.f90| 79 +++
 4 files changed, 213 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/routine-module-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/routine-module-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/routine-module-mod-1.f90

diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index 0572b8e02c17..39b420039dff 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -2011,7 +2011,9 @@ enum ab_attribute
   AB_OACC_DECLARE_COPYIN, AB_OACC_DECLARE_DEVICEPTR,
   AB_OACC_DECLARE_DEVICE_RESIDENT, AB_OACC_DECLARE_LINK,
   AB_OMP_DECLARE_TARGET_LINK, AB_PDT_KIND, AB_PDT_LEN, AB_PDT_TYPE,
-  AB_PDT_TEMPLATE, AB_PDT_ARRAY, AB_PDT_STRING
+  AB_PDT_TEMPLATE, AB_PDT_ARRAY, AB_PDT_STRING,
+  AB_OACC_ROUTINE_LOP_GANG, AB_OACC_ROUTINE_LOP_WORKER,
+  AB_OACC_ROUTINE_LOP_VECTOR, AB_OACC_ROUTINE_LOP_SEQ
 };
 
 static const mstring attr_bits[] =
@@ -2081,6 +2083,10 @@ static const mstring attr_bits[] =
 minit ("PDT_TEMPLATE", AB_PDT_TEMPLATE),
 minit ("PDT_ARRAY", AB_PDT_ARRAY),
 minit ("PDT_STRING", AB_PDT_STRING),
+minit ("OACC_ROUTINE_LOP_GANG", AB_OACC_ROUTINE_LOP_GANG),
+minit ("OACC_ROUTINE_LOP_WORKER", AB_OACC_ROUTINE_LOP_WORKER),
+minit ("OACC_ROUTINE_LOP_VECTOR", AB_OACC_ROUTINE_LOP_VECTOR),
+minit ("OACC_ROUTINE_LOP_SEQ", AB_OACC_ROUTINE_LOP_SEQ),
 minit (NULL, -1)
 };
 
@@ -2128,6 +2134,15 @@ DECL_MIO_NAME (sym_intent)
 DECL_MIO_NAME (inquiry_type)
 #undef DECL_MIO_NAME
 
+/* Verify OACC_ROUTINE_LOP_NONE.  */
+
+static void
+verify_OACC_ROUTINE_LOP_NONE (enum oacc_routine_lop lop)
+{
+  if (lop != OACC_ROUTINE_LOP_NONE)
+bad_module ("Unsupported: multiple OpenACC 'routine' levels of parallelism");
+}
+
 /* Symbol attributes are stored in list with the first three elements
being the enumerated fields, while the remaining elements (if any)
indicate the individual attribute bits.  The access field is not
@@

Re: [PATCH] i386: Handle REG_EH_REGION note

2019-03-13 Thread Jakub Jelinek

On Tue, Mar 12, 2019 at 09:36:32AM +0800, H.J. Lu wrote:
>   PR target/89650
>   * config/i386/i386.c (remove_partial_avx_dependency): Handle
>   REG_EH_REGION note.
> 
> gcc/testsuite/
> 
>   PR target/89650
>   * g++.target/i386/pr89650.C: New test.
> ---
>  gcc/config/i386/i386.c  |  6 ++
>  gcc/testsuite/g++.target/i386/pr89650.C | 19 +++
>  2 files changed, 25 insertions(+)
>  create mode 100644 gcc/testsuite/g++.target/i386/pr89650.C
> 
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 896c6f33d40..b702703074c 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2873,6 +2873,12 @@ remove_partial_avx_dependency (void)
> /* Generate an XMM vector SET.  */
> set = gen_rtx_SET (vec, src);
> set_insn = emit_insn_before (set, insn);
> +
> +   /* Handle REG_EH_REGION note.  */
> +   rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
> +   if (note)
> + add_reg_note (insn, REG_EH_REGION, XEXP (note, 0));
> +

How can this work?  If insn has REG_EH_REGION note, you add that note
to it once more?  At minimum that should be add_reg_note (set_insn, ...).
But perhaps better use:
  copy_reg_eh_region_note_forward (insn, set_insn, insn);
instead?

That said, with either change the testcase ICEs, as we have control flow
insn in the middle of a bb.

Tried to do:
--- gcc/config/i386/i386.c.jj   2019-03-13 09:23:37.138538182 +0100
+++ gcc/config/i386/i386.c  2019-03-13 18:24:07.773761751 +0100
@@ -91,6 +91,7 @@ along with GCC; see the file COPYING3.
 #include "tree-vector-builder.h"
 #include "debug.h"
 #include "dwarf2out.h"
+#include "cfgcleanup.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -2814,6 +2815,9 @@ remove_partial_avx_dependency (void)
   bitmap_obstack_initialize (NULL);
   bitmap convert_bbs = BITMAP_ALLOC (NULL);
 
+  auto_sbitmap blocks_with_eh (last_basic_block_for_fn (cfun));
+  bitmap_clear (blocks_with_eh);
+
   basic_block bb;
   rtx_insn *insn, *set_insn;
   rtx set;
@@ -2873,6 +2877,15 @@ remove_partial_avx_dependency (void)
  /* Generate an XMM vector SET.  */
  set = gen_rtx_SET (vec, src);
  set_insn = emit_insn_before (set, insn);
+
+ /* Handle REG_EH_REGION note.  */
+ rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
+ if (note)
+   {
+ bitmap_set_bit (blocks_with_eh, bb->index);
+ add_reg_note (set_insn, REG_EH_REGION, XEXP (note, 0));
+   }
+
  df_insn_rescan (set_insn);
 
  src = gen_rtx_SUBREG (dest_mode, vec, 0);
@@ -2930,6 +2943,14 @@ remove_partial_avx_dependency (void)
   bitmap_obstack_release (NULL);
   BITMAP_FREE (convert_bbs);
 
+  if (!bitmap_empty_p (blocks_with_eh))
+{
+  find_many_sub_basic_blocks (blocks_with_eh);
+
+  /* If we've moved any REG_EH_REGION notes, also cleanup the cfg.  */
+  cleanup_cfg (0);
+}
+
   timevar_pop (TV_MACH_DEP);
   return 0;
 }

but that still ICEs.

Jakub

C++ PATCH for c++/88979 - further P0634 fix for constructors

2019-03-13 Thread Marek Polacek

This PR points out that our P0634 implementation misses constructor parameters
and still requires 'typename'.  I'm not 100% sure if the standard really says
we should also handle constructors.  Does a constructor have a function type?

But I guess it makes sense, so this patch handles it.  To handle a constructor
in a class, we need to pass flags to cp_parser_constructor_declarator_p.  To
handle an out-of-line constructor, use constructor_name_p.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-03-13  Marek Polacek  

PR c++/88979 - further P0634 fix for constructors.
* parser.c (cp_parser_decl_specifier_seq): Pass flags to
cp_parser_constructor_declarator_p.
(cp_parser_direct_declarator): Allow missing typename for constructor
parameters.
(cp_parser_constructor_declarator_p): Add FLAGS parameter.  Pass it to
cp_parser_type_specifier.

* g++.dg/cpp2a/typename15.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index f9569ed..8244366e669 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -2506,7 +2506,7 @@ static cp_expr cp_parser_simple_cast_expression
 static tree cp_parser_global_scope_opt
   (cp_parser *, bool);
 static bool cp_parser_constructor_declarator_p
-  (cp_parser *, bool);
+  (cp_parser *, cp_parser_flags, bool);
 static tree cp_parser_function_definition_from_specifiers_and_declarator
   (cp_parser *, cp_decl_specifier_seq *, tree, const cp_declarator *);
 static tree cp_parser_function_definition_after_declarator
@@ -14052,7 +14052,8 @@ cp_parser_decl_specifier_seq (cp_parser* parser,
= (!found_decl_spec
   && constructor_possible_p
   && (cp_parser_constructor_declarator_p
-  (parser, decl_spec_seq_has_spec_p (decl_specs, ds_friend;
+  (parser, flags, decl_spec_seq_has_spec_p (decl_specs,
+ds_friend;
 
   /* If we don't have a DECL_SPEC yet, then we must be looking at
 a type-specifier.  */
@@ -21160,7 +21161,13 @@ cp_parser_direct_declarator (cp_parser* parser,
tree decl
  = cp_parser_lookup_name_simple (parser, unqualified_name,
  token->location);
-   if (!is_overloaded_fn (decl))
+   if (!is_overloaded_fn (decl)
+   /* Allow
+  template
+  A::A(T::type) { }  */
+   && !(MAYBE_CLASS_TYPE_P (qualifying_scope)
+&& constructor_name_p (unqualified_name,
+   qualifying_scope)))
  flags &= ~CP_PARSER_FLAGS_TYPENAME_OPTIONAL;
  }
  }
@@ -27380,10 +27387,12 @@ cp_parser_global_scope_opt (cp_parser* parser, bool 
current_scope_valid_p)
 
 /* Returns TRUE if the upcoming token sequence is the start of a
constructor declarator or C++17 deduction guide.  If FRIEND_P is true, the
-   declarator is preceded by the `friend' specifier.  */
+   declarator is preceded by the `friend' specifier.  The parser flags FLAGS
+   is used to control type-specifier parsing.  */
 
 static bool
-cp_parser_constructor_declarator_p (cp_parser *parser, bool friend_p)
+cp_parser_constructor_declarator_p (cp_parser *parser, cp_parser_flags flags,
+   bool friend_p)
 {
   bool constructor_p;
   bool outside_class_specifier_p;
@@ -27562,9 +27571,10 @@ cp_parser_constructor_declarator_p (cp_parser *parser, 
bool friend_p)
= parser->num_template_parameter_lists;
  parser->num_template_parameter_lists = 0;
 
- /* Look for the type-specifier.  */
+ /* Look for the type-specifier.  It's not optional, but its typename
+might be.  */
  cp_parser_type_specifier (parser,
-   CP_PARSER_FLAGS_NONE,
+   (flags & ~CP_PARSER_FLAGS_OPTIONAL),
/*decl_specs=*/NULL,
/*is_declarator=*/true,
/*declares_class_or_enum=*/NULL,
diff --git gcc/testsuite/g++.dg/cpp2a/typename15.C 
gcc/testsuite/g++.dg/cpp2a/typename15.C
new file mode 100644
index 000..9094190cec9
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp2a/typename15.C
@@ -0,0 +1,15 @@
+// PR c++/88979
+// { dg-do compile { target c++2a } }
+
+template
+struct B {
+  B(T::type);
+};
+
+template
+struct A {
+  A(T::type);
+};
+
+template
+A::A(T::type) { }

bootstrap error due to constexpr in go/gofrontend/ast-dump.cc

2019-03-13 Thread Martin Sebor


A recent change (r269633 AFAICS) introduced the constexpr keyword
into go which breaks bootstrap with a C++ 98 compiler.  I fixed
it like this in my tree but haven't fully tested it.  I just
thought I'd send a heads up before others run into it.

Martin

===
--- go/gofrontend/ast-dump.cc   (revision 269652)
+++ go/gofrontend/ast-dump.cc   (working copy)
@@ -610,7 +610,7 @@ class Type_dumper
  const char *tag);
   std::pair lookup(const Type*);

-  static constexpr unsigned notag = 0x;
+  static const unsigned notag = 0x;

  private:
   const Type* top_;

Re: RFC: Patch to allow use of DECL_NAME in libmvec calls instead of DECL_ASSEMBER_NAME

2019-03-13 Thread Steve Ellcey

On Wed, 2019-03-13 at 14:38 +, Joseph Myers wrote:
> 
> ---
> ---
> On Wed, 13 Mar 2019, Jakub Jelinek wrote:
> 
> > If the finite only doesn't buy anything, then another option is to drop the
> > math-finite.h stuff or portions thereof.
> 
> Well, finite-only entry points avoid wrappers for the scalar functions - 
> it's just that suitable optimized implementations could avoid the wrappers 
> in all cases without needing a separate finite-only variant.  It's not 
> clear that adding wrappers in this case for scalar functions to avoid them 
> for vector functions is a good idea.  And regardless of the merits of a 
> particular set of entry points, I think requiring the same set of variants 
> for both vector and scalar functions is flawed; the headers should be able 
> to declare a scalar variant to be used under certain conditions without 
> requiring a corresponding vector variant, or of course the other way 
> round.

If some targets don't need _finite variants it might be useful to tell
the compiler not to generate them.  For example, on aarch64 we know
that __exp_finite and __expf_finite are just aliases of exp and expf,
so if we could turn off the math-finite.h functionality for those
functions on this target, GCC would also not generate calls to the
vector versions of exp_finite and expf_finite.  This doesn't directly
address the issue of the scalar and vector variants being independent
of each other but it does make the problem moot in this specific case.

Steve Ellcey
sell...@marvell.com

Re: RFC: Patch to allow use of DECL_NAME in libmvec calls instead of DECL_ASSEMBER_NAME

2019-03-13 Thread Joseph Myers

On Wed, 13 Mar 2019, Jakub Jelinek wrote:

> If the finite only doesn't buy anything, then another option is to drop the
> math-finite.h stuff or portions thereof.

Well, finite-only entry points avoid wrappers for the scalar functions - 
it's just that suitable optimized implementations could avoid the wrappers 
in all cases without needing a separate finite-only variant.  It's not 
clear that adding wrappers in this case for scalar functions to avoid them 
for vector functions is a good idea.  And regardless of the merits of a 
particular set of entry points, I think requiring the same set of variants 
for both vector and scalar functions is flawed; the headers should be able 
to declare a scalar variant to be used under certain conditions without 
requiring a corresponding vector variant, or of course the other way 
round.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] Slightly altered backport for PR89572

2019-03-13 Thread Richard Biener



I am testing the following with additional hunks in 
tree-ssa-loop-ivcanon.c.

Bootstrap / regtest running on x86_64-unknown-linux-gnu (gcc-8-branch).

Richard.

2019-03-14  Richard Biener  

PR middle-end/89572
* tree-scalar-evolution.c (get_loop_exit_condition): Use
safe_dyn_cast.
* tree-ssa-loop-ivcanon.c (canonicalize_loop_induction_variables):
Use gimple_location_safe.

* gcc.dg/torture/pr89572.c: New testcase.

Index: gcc/testsuite/gcc.dg/torture/pr89572.c
===
--- gcc/testsuite/gcc.dg/torture/pr89572.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr89572.c  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-finline-functions" } */
+
+int vh, it, k1;
+
+void
+vn (void)
+{
+  ++vh;
+  if (vh == 0 && it == 0)
+k1 = -k1;
+}
+
+__attribute__ ((returns_twice)) void
+ef (int *uw)
+{
+  while (uw != (void *) 0)
+{
+  vn ();
+  *uw = 0;
+}
+}
+
+void
+gu (int *uw)
+{
+  ef (uw);
+}
Index: gcc/tree-scalar-evolution.c
===
--- gcc/tree-scalar-evolution.c (revision 269650)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -875,7 +875,7 @@ get_loop_exit_condition (const struct lo
   gimple *stmt;
 
   stmt = last_stmt (exit_edge->src);
-  if (gcond *cond_stmt = dyn_cast  (stmt))
+  if (gcond *cond_stmt = safe_dyn_cast  (stmt))
res = cond_stmt;
 }
 
Index: gcc/tree-ssa-loop-ivcanon.c
===
--- gcc/tree-ssa-loop-ivcanon.c (revision 269650)
+++ gcc/tree-ssa-loop-ivcanon.c (working copy)
@@ -1177,7 +1177,7 @@ canonicalize_loop_induction_variables (s
= niter_desc.may_be_zero && !integer_zerop (niter_desc.may_be_zero);
 }
   if (TREE_CODE (niter) == INTEGER_CST)
-locus = gimple_location (last_stmt (exit->src));
+locus = gimple_location_safe (last_stmt (exit->src));
   else
 {
   /* For non-constant niter fold may_be_zero into niter again.  */
@@ -1204,7 +1204,7 @@ canonicalize_loop_induction_variables (s
niter = find_loop_niter_by_eval (loop, );
 
   if (exit)
-locus = gimple_location (last_stmt (exit->src));
+locus = gimple_location_safe (last_stmt (exit->src));
 
   if (TREE_CODE (niter) != INTEGER_CST)
exit = NULL;

[PATCH] Fix miscompilation due to REG_EQUAL note on a paradoxical subreg operation (PR rtl-optimization/89679)

2019-03-13 Thread Jakub Jelinek

Hi!

The following testcase is miscompiled on powerpc64le-linux.
As mentioned in the PR, the problem is with a REG_EQUAL note which
expand_mult_const attaches to:

(insn 17 16 18 2 (set (reg:SI 140)
(plus:SI (subreg:SI (reg:HI 138) 0)
(subreg:SI (reg:HI 136) 0))) "pr89679.c":16:3 -1
 (expr_list:REG_EQUAL (mult:SI (subreg:SI (reg:HI 136) 0)
(const_int 257 [0x101]))
(nil)))

The REG_EQUAL here is only approximate, as the operation is done on
SImode and using paradoxical subregs, so the upper bits can be anything,
only the low part of the SImode reg is guaranteed to be
equal to low part of the REG_EQUAL note.

Later on fwprop propagates separately into the insn PATTERN and into the
REG_EQUAL note and we end up with:

(insn 17 6 19 2 (set (reg:SI 140)
(const_int -1 [0x])) "pr89679.c":16:3 494 
{*movsi_internal1}
 (expr_list:REG_DEAD (reg:HI 138 [ c ])
(expr_list:REG_DEAD (reg:HI 136 [ c ])
(expr_list:REG_EQUAL (const_int 65535 [0x])
(nil)

which is clearly invalid, as it is never equal rather than always.
Later on CSE uses the REG_EQUAL note and miscompiles the testcase.

Below is one patch, more in the spirit of
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00418.html
where we refuse to add REG_EQUAL note in that case.

Attached is another variant, which keeps it around just in case something
would make use of it, but makes sure we don't fwprop into such REG_EQUAL
notes replacing a paradoxical subreg in there with something else.

Both patches were bootstrapped/regtested on
{x86_64,i686,powerpc64le,s390x,aarch64}-linux, ok for trunk (which one)?

2019-03-13  Jakub Jelinek  

PR rtl-optimization/89679
* expmed.c (expand_mult_const): Don't add a REG_EQUAL note if it
would contain a paradoxical SUBREG.

* gcc.dg/pr89679.c: New test.

--- gcc/expmed.c.jj 2019-01-10 11:43:14.387377695 +0100
+++ gcc/expmed.c2019-03-12 17:38:55.286511666 +0100
@@ -3356,13 +3356,20 @@ expand_mult_const (machine_mode mode, rt
  tem = gen_lowpart (nmode, op0);
}
 
- insn = get_last_insn ();
- wide_int wval_so_far
-   = wi::uhwi (val_so_far,
-   GET_MODE_PRECISION (as_a  (nmode)));
- rtx c = immed_wide_int_const (wval_so_far, nmode);
- set_dst_reg_note (insn, REG_EQUAL, gen_rtx_MULT (nmode, tem, c),
-   accum_inner);
+ /* Don't add a REG_EQUAL note if tem is a paradoxical SUBREG.
+In that case, only the low bits of accum would be guaranteed to
+be equal to the content of the REG_EQUAL note, the upper bits
+can be anything.  */
+ if (!paradoxical_subreg_p (tem))
+   {
+ insn = get_last_insn ();
+ wide_int wval_so_far
+   = wi::uhwi (val_so_far,
+   GET_MODE_PRECISION (as_a  (nmode)));
+ rtx c = immed_wide_int_const (wval_so_far, nmode);
+ set_dst_reg_note (insn, REG_EQUAL, gen_rtx_MULT (nmode, tem, c),
+   accum_inner);
+   }
}
 }
 
--- gcc/testsuite/gcc.dg/pr89679.c.jj   2019-02-22 00:02:11.718740936 +0100
+++ gcc/testsuite/gcc.dg/pr89679.c  2019-03-12 18:08:07.240502975 +0100
@@ -0,0 +1,26 @@
+/* PR rtl-optimization/89679 */
+/* { dg-do run } */
+/* { dg-options "-Og -frerun-cse-after-loop -fno-tree-fre" } */
+
+unsigned short g;
+
+void
+foo (unsigned long long x)
+{
+  if (x != 0x)
+__builtin_abort ();
+}
+
+int
+main ()
+{
+#if __SIZEOF_SHORT__ == 2 && __SIZEOF_INT__ == 4 && __CHAR_BIT__ == 8
+  unsigned short d = 0;
+  unsigned long long x, c = ~0;
+  c = c >> d;
+  __builtin_memset (, c, 2);
+  x = d + g;
+  foo (x);
+#endif
+  return 0;
+}

Jakub
2019-03-13  Jakub Jelinek  

PR rtl-optimization/89679
* fwprop.c (forward_propagate_and_simplify): Don't propagate into
paradoxical SUBREGs in REG_EQUAL notes.

* gcc.dg/pr89679.c: New test.

--- gcc/fwprop.c.jj 2019-01-01 12:37:21.190908780 +0100
+++ gcc/fwprop.c2019-03-12 17:24:14.623863160 +0100
@@ -1327,7 +1327,12 @@ forward_propagate_and_simplify (df_ref u
 {
   rtx note = find_reg_note (use_insn, REG_EQUAL, NULL_RTX);
   if (DF_REF_FLAGS (use) & DF_REF_IN_NOTE)
-   loc =  (note, 0);
+   {
+ loc =  (note, 0);
+ /* Don't simplify a paradoxical SUBREG in a REG_EQUAL note.  */
+ if (paradoxical_subreg_p (DF_REF_REG (use)))
+   return false;
+   }
   else
loc = _SRC (use_set);
 
--- gcc/testsuite/gcc.dg/pr89679.c.jj   2019-02-22 00:02:11.718740936 +0100
+++ gcc/testsuite/gcc.dg/pr89679.c  2019-03-12 18:08:07.240502975 +0100
@@ -0,0 +1,26 @@
+/* PR rtl-optimization/89679 */
+/* { dg-do run } */
+/* { dg-options "-Og -frerun-cse-after-loop -fno-tree-fre" } */
+
+unsigned short g;
+
+void
+foo (unsigned

Re: RFC: Patch to allow use of DECL_NAME in libmvec calls instead of DECL_ASSEMBER_NAME

2019-03-13 Thread Jakub Jelinek

On Wed, Mar 13, 2019 at 02:07:29PM +0100, Jakub Jelinek wrote:
> On Wed, Mar 13, 2019 at 01:00:58PM +, Joseph Myers wrote:
> > > Yeah, an alias doesn't really cost much, and has the advantage that if in
> > > the vector version you need at some point to differentiate between the
> > > finite only vs. full implementations, you can just by tweaking libmvec
> > > implementation, the callers will have proper calls depending on if they 
> > > were
> > > compiled with -Ofast or -O3 etc.
> > 
> > Experience is showing that some or all of the finite-only versions in 
> > glibc were mistaken premature optimization - that proper optimized 
> > implementations do not gain anything from adding a finite-only 
> > restriction.  There is no good basis to suppose that if additional 
> > variants of the vector functions were useful in future, finite-only would 
> > be the right conditional (or that the right set of variants would be the 
> > same as the right set of variants for scalar functions).
> 
> If the finite only doesn't buy anything, then another option is to drop the
> math-finite.h stuff or portions thereof.
> But adding a new GCC extension that other compilers will need to implement
> too, instead of just adding a couple of aliases seems to be overkill to me.

Not to mention that fixing it on the glibc side will make it work also with
GCC 8, GCC 7 and GCC 6.

Jakub

Re: RFC: Patch to allow use of DECL_NAME in libmvec calls instead of DECL_ASSEMBER_NAME

2019-03-13 Thread Jakub Jelinek

On Wed, Mar 13, 2019 at 01:00:58PM +, Joseph Myers wrote:
> > Yeah, an alias doesn't really cost much, and has the advantage that if in
> > the vector version you need at some point to differentiate between the
> > finite only vs. full implementations, you can just by tweaking libmvec
> > implementation, the callers will have proper calls depending on if they were
> > compiled with -Ofast or -O3 etc.
> 
> Experience is showing that some or all of the finite-only versions in 
> glibc were mistaken premature optimization - that proper optimized 
> implementations do not gain anything from adding a finite-only 
> restriction.  There is no good basis to suppose that if additional 
> variants of the vector functions were useful in future, finite-only would 
> be the right conditional (or that the right set of variants would be the 
> same as the right set of variants for scalar functions).

If the finite only doesn't buy anything, then another option is to drop the
math-finite.h stuff or portions thereof.
But adding a new GCC extension that other compilers will need to implement
too, instead of just adding a couple of aliases seems to be overkill to me.

Jakub

Re: RFC: Patch to allow use of DECL_NAME in libmvec calls instead of DECL_ASSEMBER_NAME

2019-03-13 Thread Joseph Myers

On Wed, 13 Mar 2019, Jakub Jelinek wrote:

> Yeah, an alias doesn't really cost much, and has the advantage that if in
> the vector version you need at some point to differentiate between the
> finite only vs. full implementations, you can just by tweaking libmvec
> implementation, the callers will have proper calls depending on if they were
> compiled with -Ofast or -O3 etc.

Experience is showing that some or all of the finite-only versions in 
glibc were mistaken premature optimization - that proper optimized 
implementations do not gain anything from adding a finite-only 
restriction.  There is no good basis to suppose that if additional 
variants of the vector functions were useful in future, finite-only would 
be the right conditional (or that the right set of variants would be the 
same as the right set of variants for scalar functions).

-- 
Joseph S. Myers
jos...@codesourcery.com

Aw: Re: [PR fortran/60091, patch] - Misleading error messages in rank-2 pointer assignment to rank-1 target

2019-03-13 Thread Harald Anlauf

Hi Thomas,

I am not so convinced that "plain english" messages are the way to go,
even if they appear better readable at first sight, if conciseness is
lost.  The main reason is that there three variants of the messages,
depending on context.  One of them refers to expecting either a
bounds-specification-list or a bounds-remapping-list.

Do you prefer sth. like

"All lower bounds and all or none of the upper bounds must be specified"

or

"Either all or none of the upper bounds must be specified at (1)"

(which we currently print in another context where it is wrong),

while other compilers print:

E.g. Crayftn:

  p(1 ,2:3) => t
^
ftn-1768 crayftn: ERROR SUB, File = ptr-remap.f90, Line = 3, Column = 5 
  Invalid bounds-spec-list or bounds-remapping-list for this pointer assignment.

E.g. Intel:

ptr-remap.f90(3): error #8524: The syntax of this data pointer assignment is 
incorrect: either 'bound spec' or 'bound remapping' is expected in this 
context.   [1]
  p(1 ,2:3) => t
^

Pointer remapping belongs IMHO to the 'more advanced' features and requires
some technical insight to get it right, which is why I think the related
error messages should be more technical and concise.

I'll think for another day or two.

Thanks,
Harald

> Gesendet: Dienstag, 12. März 2019 um 23:19 Uhr
> Von: "Thomas Koenig" 
> An: "Harald Anlauf" , "Dominique d'Humières" 
> 
> Cc: gfortran , gcc-patches 
> Betreff: Re: [PR fortran/60091, patch] - Misleading error messages in rank-2 
> pointer assignment to rank-1 target
>
> Hi Harald,
> 
> > how about the attached version?  It is quite verbose and produces
> > messages like
> > 
> > Error: Expected list of 'lower-bound-expr:' or list of
> > 'lower-bound-expr:upper-bound-expr' at (1)
> 
> I think this way of specifying error messages
> 
> +#define BOUNDS_SPEC_LIST "list of %"
> 
> ...
> 
> +   gfc_error ("Rank remapping requires a "
> +  BOUNDS_SPEC_LIST " at %L",
>>where);
> 
> will cause trouble in translation of the error messages.
> 
> Could you maybe use something like
> 
> +   gfc_error ("Rank remapping requires "
> +  lower and upper bounds at %L",
>>where);
> 
> and possibly, instead of
> 
> -   gfc_error ("Either all or none of the upper bounds"
> -  " must be specified at %L", >where);
> +   gfc_error ("Rank remapping requires a "
> +  BOUNDS_SPEC_LIST " at %L",
> +  >where);
> return false;
> 
> use
> 
> " Rank remapping requires that all lower and upper bounds be specified"
> 
> ?
> 
> (And I am fairly certain that my versions are not the best possible
> ones...)
> 
> So, either something like what you propsed (but without the #defines)
> or something like what I wrote above would be OK.
> 
> Regards
> 
>   Thomas
>

Re: [PATCH] Loop split upon semi-invariant condition (PR tree-optimization/89134)

2019-03-13 Thread Kyrill Tkachov




On 3/13/19 12:07 PM, Richard Biener wrote:

On Wed, Mar 13, 2019 at 10:40 AM Kyrill Tkachov
 wrote:

Hi Feng,

On 3/13/19 1:56 AM, Feng Xue OS wrote:

Richard,

 Thanks for your comment. Yes, it is like kind of jump threading
with knowledge of loop structure. And what is rough time for GCC 10?



GCC 10 will be released once the number of P1 regressions gets down to
zero. Past experience shows that it's around the April/May timeframe.

Note GCC 10 is due only next year.


Errr, yes. I meant that GCC 10 *development* will start once GCC 9 is 
*released*.


Thanks,

Kyrill




In the meantime my comment on the patch is that you should add some
tests to the testsuite that showcase this transformation.

Thanks,

Kyrill



Regards,

Feng



From: Richard Biener 
Sent: Tuesday, March 12, 2019 4:31:49 PM
To: Feng Xue OS
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Loop split upon semi-invariant condition (PR
tree-optimization/89134)

On Tue, Mar 12, 2019 at 7:20 AM Feng Xue OS
 wrote:

This patch is composed to implement a loop transformation on one of

its conditional statements, which we call it semi-invariant, in that
its computation is impacted in only one of its branches.

Suppose a loop as:

 void f (std::map m)
 {
 for (auto it = m.begin (); it != m.end (); ++it) {
 /* if (b) is semi-invariant. */
 if (b) {
 b = do_something();/* Has effect on b */
 } else {
/* No effect on b */
 }
 statements;  /* Also no effect on b */
 }
 }

A transformation, kind of loop split, could be:

 void f (std::map m)
 {
 for (auto it = m.begin (); it != m.end (); ++it) {
 if (b) {
 b = do_something();
 } else {
 ++it;
 statements;
 break;
 }
 statements;
 }

 for (; it != m.end (); ++it) {
 statements;
 }
 }

If "statements" contains nothing, the second loop becomes an empty

one, which can be removed. (This part will be given in another patch).
And if "statements" are straight line instructions, we get an
opportunity to vectorize the second loop. In practice, this
optimization is found to improve some real application by %7.

Since it is just a kind of loop split, the codes are mainly placed

in existing tree-ssa-loop-split module, and is controlled by
-fsplit-loop, and is enabled with -O3.

Note the transform itself is jump-threading with the threading
duplicating a whole CFG cycle.

I didn't look at the patch details yet since this is suitable for GCC
10 only.

Thanks for implementing this.
Richard.


Feng


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 64bf6017d16..a6c2878d652 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,23 @@
+2019-03-12  Feng Xue 
+
+   PR tree-optimization/89134
+* doc/invoke.texi (max-cond-loop-split-insns): Document new

--params.

+   (min-cond-loop-split-prob): Likewise.
+   * params.def: Add max-cond-loop-split-insns,

min-cond-loop-split-prob.

+   * passes.def (pass_cond_loop_split) : New pass.
+   * timevar.def (TV_COND_LOOP_SPLIT): New time variable.
+   * tree-pass.h (make_pass_cond_loop_split): New declaration.
+   * tree-ssa-loop-split.c (split_info): New class.
+   (find_vdef_in_loop, vuse_semi_invariant_p): New functions.
+   (ssa_semi_invariant_p, stmt_semi_invariant_p): Likewise.
+   (can_branch_be_excluded, get_cond_invariant_branch): Likewise.
+   (is_cond_in_hidden_loop, compute_added_num_insns): Likewise.
+   (can_split_loop_on_cond, mark_cond_to_split_loop): Likewise.
+   (split_loop_for_cond, tree_ssa_split_loops_for_cond): Likewise.
+   (pass_data_cond_loop_split): New variable.
+   (pass_cond_loop_split): New class.
+   (make_pass_cond_loop_split): New function.
+
  2019-03-11  Jakub Jelinek  

 PR middle-end/89655
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index df0883f2fc9..f5e09bd71fd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11316,6 +11316,14 @@ The maximum number of branches unswitched

in a single loop.

  @item lim-expensive
  The minimum cost of an expensive expression in the loop invariant

motion.

+@item max-cond-loop-split-insns
+The maximum number of insns to be increased due to loop split on
+semi-invariant condition statement.
+
+@item min-cond-loop-split-prob
+The minimum threshold for probability of semi-invaraint condition
+statement to trigger loop split.
+
  @item iv-consider-all-candidates-bound
  Bound on number of candidates for induction variables, below which
  all candidates are considered for each use in induction variable
diff --git a/gcc/params.def b/gcc/params.def
index 3f1576448be..2e067526958 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -386,6 +386,18 @@

Re: [PATCH] Loop split upon semi-invariant condition (PR tree-optimization/89134)

2019-03-13 Thread Richard Biener

On Wed, Mar 13, 2019 at 10:40 AM Kyrill Tkachov
 wrote:
>
> Hi Feng,
>
> On 3/13/19 1:56 AM, Feng Xue OS wrote:
> > Richard,
> >
> > Thanks for your comment. Yes, it is like kind of jump threading
> > with knowledge of loop structure. And what is rough time for GCC 10?
> >
> >
>
> GCC 10 will be released once the number of P1 regressions gets down to
> zero. Past experience shows that it's around the April/May timeframe.

Note GCC 10 is due only next year.

> In the meantime my comment on the patch is that you should add some
> tests to the testsuite that showcase this transformation.
>
> Thanks,
>
> Kyrill
>
>
> > Regards,
> >
> > Feng
> >
> >
> > 
> > From: Richard Biener 
> > Sent: Tuesday, March 12, 2019 4:31:49 PM
> > To: Feng Xue OS
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] Loop split upon semi-invariant condition (PR
> > tree-optimization/89134)
> >
> > On Tue, Mar 12, 2019 at 7:20 AM Feng Xue OS
> >  wrote:
> > >
> > > This patch is composed to implement a loop transformation on one of
> > its conditional statements, which we call it semi-invariant, in that
> > its computation is impacted in only one of its branches.
> > >
> > > Suppose a loop as:
> > >
> > > void f (std::map m)
> > > {
> > > for (auto it = m.begin (); it != m.end (); ++it) {
> > > /* if (b) is semi-invariant. */
> > > if (b) {
> > > b = do_something();/* Has effect on b */
> > > } else {
> > > /* No effect on b */
> > > }
> > > statements;  /* Also no effect on b */
> > > }
> > > }
> > >
> > > A transformation, kind of loop split, could be:
> > >
> > > void f (std::map m)
> > > {
> > > for (auto it = m.begin (); it != m.end (); ++it) {
> > > if (b) {
> > > b = do_something();
> > > } else {
> > > ++it;
> > > statements;
> > > break;
> > > }
> > > statements;
> > > }
> > >
> > > for (; it != m.end (); ++it) {
> > > statements;
> > > }
> > > }
> > >
> > > If "statements" contains nothing, the second loop becomes an empty
> > one, which can be removed. (This part will be given in another patch).
> > And if "statements" are straight line instructions, we get an
> > opportunity to vectorize the second loop. In practice, this
> > optimization is found to improve some real application by %7.
> > >
> > > Since it is just a kind of loop split, the codes are mainly placed
> > in existing tree-ssa-loop-split module, and is controlled by
> > -fsplit-loop, and is enabled with -O3.
> >
> > Note the transform itself is jump-threading with the threading
> > duplicating a whole CFG cycle.
> >
> > I didn't look at the patch details yet since this is suitable for GCC
> > 10 only.
> >
> > Thanks for implementing this.
> > Richard.
> >
> > > Feng
> > >
> > >
> > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > > index 64bf6017d16..a6c2878d652 100644
> > > --- a/gcc/ChangeLog
> > > +++ b/gcc/ChangeLog
> > > @@ -1,3 +1,23 @@
> > > +2019-03-12  Feng Xue 
> > > +
> > > +   PR tree-optimization/89134
> > > +* doc/invoke.texi (max-cond-loop-split-insns): Document new
> > --params.
> > > +   (min-cond-loop-split-prob): Likewise.
> > > +   * params.def: Add max-cond-loop-split-insns,
> > min-cond-loop-split-prob.
> > > +   * passes.def (pass_cond_loop_split) : New pass.
> > > +   * timevar.def (TV_COND_LOOP_SPLIT): New time variable.
> > > +   * tree-pass.h (make_pass_cond_loop_split): New declaration.
> > > +   * tree-ssa-loop-split.c (split_info): New class.
> > > +   (find_vdef_in_loop, vuse_semi_invariant_p): New functions.
> > > +   (ssa_semi_invariant_p, stmt_semi_invariant_p): Likewise.
> > > +   (can_branch_be_excluded, get_cond_invariant_branch): Likewise.
> > > +   (is_cond_in_hidden_loop, compute_added_num_insns): Likewise.
> > > +   (can_split_loop_on_cond, mark_cond_to_split_loop): Likewise.
> > > +   (split_loop_for_cond, tree_ssa_split_loops_for_cond): Likewise.
> > > +   (pass_data_cond_loop_split): New variable.
> > > +   (pass_cond_loop_split): New class.
> > > +   (make_pass_cond_loop_split): New function.
> > > +
> > >  2019-03-11  Jakub Jelinek  
> > >
> > > PR middle-end/89655
> > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > index df0883f2fc9..f5e09bd71fd 100644
> > > --- a/gcc/doc/invoke.texi
> > > +++ b/gcc/doc/invoke.texi
> > > @@ -11316,6 +11316,14 @@ The maximum number of branches unswitched
> > in a single loop.
> > >  @item lim-expensive
> > >  The minimum cost of an expensive expression in the loop invariant
> > motion.
> > >
> > > +@item max-cond-loop-split-insns
> > > +The maximum number of insns to be increased due to loop split on
> > > +semi-invariant condition statement.
> > > +
> > >

Re: RFC: Patch to allow use of DECL_NAME in libmvec calls instead of DECL_ASSEMBER_NAME

2019-03-13 Thread Jakub Jelinek

On Wed, Mar 13, 2019 at 01:01:54PM +0100, Richard Biener wrote:
> On Wed, Mar 13, 2019 at 12:40 AM Jakub Jelinek  wrote:
> >
> > On Tue, Mar 12, 2019 at 11:21:26PM +, Steve Ellcey wrote:
> > > I like this idea.  I have prototyped something, I created 'vector_asm'
> > > as an aarch64 attribute because I couldn't find where to put global
> > > attributes like __simd__.  Does anyone know where these are listed?
> >
> > gcc/c-family/c-attribs.c
> >
> > Note, vector_asm seems like a bad name when the attribute is simd
> > or directive #pragma omp declare simd, it should be simd_asm instead.
> >
> > If we go this route, glibc headers would need to use it conditional on gcc
> > version though, because older gccs wouldn't support that attribute.
> 
> And that fortran support patch would need yet another iteration.
> 
> That said, I dislike it and I do not see why a solution in glibc needs to be
> inefficient either.

Yeah, an alias doesn't really cost much, and has the advantage that if in
the vector version you need at some point to differentiate between the
finite only vs. full implementations, you can just by tweaking libmvec
implementation, the callers will have proper calls depending on if they were
compiled with -Ofast or -O3 etc.

Jakub

Re: RFC: Patch to allow use of DECL_NAME in libmvec calls instead of DECL_ASSEMBER_NAME

2019-03-13 Thread Richard Biener

On Wed, Mar 13, 2019 at 12:40 AM Jakub Jelinek  wrote:
>
> On Tue, Mar 12, 2019 at 11:21:26PM +, Steve Ellcey wrote:
> > I like this idea.  I have prototyped something, I created 'vector_asm'
> > as an aarch64 attribute because I couldn't find where to put global
> > attributes like __simd__.  Does anyone know where these are listed?
>
> gcc/c-family/c-attribs.c
>
> Note, vector_asm seems like a bad name when the attribute is simd
> or directive #pragma omp declare simd, it should be simd_asm instead.
>
> If we go this route, glibc headers would need to use it conditional on gcc
> version though, because older gccs wouldn't support that attribute.

And that fortran support patch would need yet another iteration.

That said, I dislike it and I do not see why a solution in glibc needs to be
inefficient either.

Richard.

> Jakub

Re: [PATCH] avoid assuming every type has a size (PR 89662)

2019-03-13 Thread Richard Biener

On Tue, Mar 12, 2019 at 5:36 PM Martin Sebor  wrote:
>
> On 3/12/19 2:20 AM, Richard Biener wrote:
> > On Mon, Mar 11, 2019 at 9:16 PM Martin Sebor  wrote:
> >>
> >> A -Warray-bounds enhancement committed last year into GCC 9
> >> introduced an assumption that the MEM_REF type argument has
> >> a size.  The test case submitted in PR89662 does pointer
> >> addition on void*, in which the MEM_REF type is void*, which
> >> breaks the assumption.
> >>
> >> The attached change removes this assumption and considers such
> >> types to have the size of 1.  (The result is used to scale
> >> the offset in diagnostics after it has been determined to be
> >> out of bounds.)
> >
> > Why's this not catched here:
> >
> >if (POINTER_TYPE_P (reftype)
> >|| !COMPLETE_TYPE_P (reftype)
> > ^^^
> >
> >|| TREE_CODE (TYPE_SIZE_UNIT (reftype)) != INTEGER_CST
> >|| RECORD_OR_UNION_TYPE_P (reftype))
> >  return;
>
> Reftype is the type of the argument referenced by the MEM_REF,
> not the type of the MEM_REF itself.

Ugh how misleading...

> The type examined in
> the patch is the latter.  In the test case in PR 89662, reftype
> is unsigned char but the MEM_REF type is void*.
>
>void *f (void *c) { return c; }
>void g () {
>  // unsigned char D.1930[1];
>  int n = 1;
>  char c[n];
>  h (f(c) - 1);   // h ([(void *) + -1B]);
>}
>
> >
> > and what avoids the bad situation for
> >
> >char (*a)[n];
> >sink (a - 1);
> >
> > ?  That is, the code assumes TYPE_SIZE_UNIT is an INTEGER_CST
> > but the above should get you a non-constant type?  It's probably
> > easier to generate a gimple testcase with this.
>
> I think it will depend on what a points to.  The only kind of VLA
> the code works with is one whose size is known (for others, like
> in some range, I haven't seen a MEM_REF) .  In the following for
> instance we get:
>
>void f (void)
>{
>  int n = 4;
>  char a[n];  // unsigned char D.1935[4];
>  int (*p)[n] = 
>  sink (p - 1);   // [(void *) - 16B]
>  // warning: array subscript -4 is outside array bounds of ‘unsigned
> char[4]’
>  sink (*p - 1); // [(void *) - 4B]
>  // warning: array subscript -1 is outside array bounds of ‘unsigned
> char[4]’
>}
>
> I'm not sure what a GIMPLE test case might look like that would
> make the code misbehave.  I tried a few variations but most of
> them ICE somewhere else.

Thanks for trying.  The patch is OK if you adjust it to verify
TYPE_SIZE_UNIT is an INTEGER_CST before throwing it
at wi::to_offset.

Richard.

> Martin

Re: [GCC-8, Arm, committed] Fix availability of FP16-FP64 conversion instructions

2019-03-13 Thread Andre Vieira (lists)


Hi,

After also testing on the gcc-7 branch I committed a backport of r269499 
including the testism fix 269596 to gcc-7 branch in r269647.


Cheers,
Andre

gcc/ChangeLog:
2019-03-13  Andre Vieira  

Backport from mainline
2019-03-08  Andre Vieira  

* config/arm/arm.h (TARGET_FP16_TO_DOUBLE): Add TARGET_VFP_DOUBLE
requirement.

gcc/testsuite/ChangeLog:
2019-03-13  Andre Vieira  

Backport from mainline
2019-03-08  Andre Vieira  

* gcc.target/arm/f16_f64_conv_no_dp.c: New test.

Backport from mainline
2019-03-11  Christophe Lyon  

* gcc.target/arm/f16_f64_conv_no_dp.c: Add arm_fp16_ok effective
target.

On 12/03/2019 14:54, Andre Vieira (lists) wrote:

Hi,

Thanks Christophe! I have committed a backport of r269499 including the 
testism fix r269596 to gcc-8 branch in r269613.


gcc/ChangeLog:
2019-03-12  Andre Vieira  

     Backport from mainline
     2019-03-08  Andre Vieira  

     * config/arm/arm.h (TARGET_FP16_TO_DOUBLE): Add TARGET_VFP_DOUBLE
     requirement.

gcc/testsuite/ChangeLog:
2019-03-12  Andre Vieira  

     Backport from mainline
     2019-03-08  Andre Vieira  

     * gcc.target/arm/f16_f64_conv_no_dp.c: New test.

     Backport from mainline
     2019-03-11  Christophe Lyon  

     * gcc.target/arm/f16_f64_conv_no_dp.c: Add arm_fp16_ok effective
     target.

On 11/03/2019 20:50, Ramana Radhakrishnan wrote:

Nope, just do it after testing it and adjust with Christophes follow up

R

On Mon, 11 Mar 2019, 10:36 Andre Vieira (lists), 
> wrote:


    Hi,

    Any objections to me backporting this to GCC 8 and 7?

    Cheers,
    Andre

    On 08/03/2019 17:30, Andre Vieira (lists) wrote:
 > Hi,
 >
 > vcvtb.f16.f64 and vcvtb.f64.f16 were being made available even
    for FPUs
 > that do not support double precision.  This patch fixes that.
 >
 > Regression tested for arm-none-eabi.
 >
 > Committed in r269499.
 >
 > Cheers,
 > Andre
 >
 > gcc/ChangeLog:
 > 2019-03-08  Andre Vieira  mailto:andre.simoesdiasvie...@arm.com>>
 >
 >      * config/arm/arm.h (TARGET_FP16_TO_DOUBLE): Add
    TARGET_VFP_DOUBLE
 >      requirement.
 >
 > gcc/testsuite/ChangeLog:
 >
 > 2019-03-08  Andre Vieira  mailto:andre.simoesdiasvie...@arm.com>>
 >
 >      * gcc.target/arm/f16_f64_conv_no_dp.c: New test.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 68a6fa56c7cc3cf1c369156d6b2f68f8997ed58b..d89b6d345f40c83fd525ff50dd5fcdbe74746b0b 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -186,7 +186,7 @@ extern tree arm_fp16_type_node;
 /* FPU supports converting between HFmode and DFmode in a single hardware
step.  */
 #define TARGET_FP16_TO_DOUBLE		\
-  (TARGET_HARD_FLOAT && (TARGET_FP16 && TARGET_VFP5))
+  (TARGET_HARD_FLOAT && TARGET_FP16 && TARGET_VFP5 && TARGET_VFP_DOUBLE)
 
 /* FPU supports fused-multiply-add operations.  */
 #define TARGET_FMA (bitmap_bit_p (arm_active_target.isa, isa_bit_VFPv4))
diff --git a/gcc/testsuite/gcc.target/arm/f16_f64_conv_no_dp.c b/gcc/testsuite/gcc.target/arm/f16_f64_conv_no_dp.c
new file mode 100644
index ..2620e57000425665c3891f7965f00d2c7645781b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/f16_f64_conv_no_dp.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_fp16_ok } */
+/* { dg-skip-if "do not override fpu" { *-*-* } { "-mfpu=*" } { "-mfpu=fpv5-sp-d16" } } */
+/* { dg-skip-if "do not disable fpu" { *-*-* } { "-mfloat-abi=soft" } { * } } */
+/* { dg-skip-if "do not override fp16-format" { *-*-* } { "-mfp16-format=*" } { "-mfp16-format=ieee" } } */
+/* { dg-options "-O1 -mfpu=fpv5-sp-d16 -mfloat-abi=hard -mfp16-format=ieee" } */
+
+__fp16 foo (double a)
+{
+  return a;
+}
+
+double bar (__fp16 a)
+{
+  return a;
+}

[PATCH] Fix PR89698, bogus folding to BIT_FIELD_REF

2019-03-13 Thread Richard Biener



Our beloved condition combining code to BIT_FIELD_REFs miscompiles
the testcase because it relies on operand_equal_p to detect equal
bases.  For some reason that very same operand_equal_p is
treating any dereference of a pointer based on the same pointer
or decl the same - idependent on the actual type used for the
access (in this case, two different sized structs).  Weirdly
it already checks alignment...

The following patch makes operand_equal_p behave and more
in line with what it does for MEM_REF handling.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-03-14  Richard Biener  

PR middle-end/89698
* fold-const.c (operand_equal_p): For INDIRECT_REF check
that the access types are similar.

* g++.dg/torture/pr89698.C: New testcase.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 269641)
+++ gcc/fold-const.c(working copy)
@@ -3220,10 +3220,16 @@ operand_equal_p (const_tree arg0, const_
   switch (TREE_CODE (arg0))
{
case INDIRECT_REF:
- if (!(flags & OEP_ADDRESS_OF)
- && (TYPE_ALIGN (TREE_TYPE (arg0))
- != TYPE_ALIGN (TREE_TYPE (arg1
-   return 0;
+ if (!(flags & OEP_ADDRESS_OF))
+   {
+ if (TYPE_ALIGN (TREE_TYPE (arg0))
+ != TYPE_ALIGN (TREE_TYPE (arg1)))
+   return 0;
+ /* Verify that the access types are compatible.  */
+ if (TYPE_MAIN_VARIANT (TREE_TYPE (arg0))
+ != TYPE_MAIN_VARIANT (TREE_TYPE (arg1)))
+   return 0;
+   }
  flags &= ~OEP_ADDRESS_OF;
  return OP_SAME (0);
 
Index: gcc/testsuite/g++.dg/torture/pr89698.C
===
--- gcc/testsuite/g++.dg/torture/pr89698.C  (nonexistent)
+++ gcc/testsuite/g++.dg/torture/pr89698.C  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+extern "C" void abort (void);
+
+class A {
+virtual void f(){};
+public:
+int x;
+A(int in): x(in) {};
+};
+
+class B: public A {
+public:
+int y;
+B(int in):A(in-1), y(in) {};
+};
+
+int test(void)
+{
+  int res;
+  B b(2);
+  A* bp = 
+  void* vp = dynamic_cast(bp);
+  if (((A*)vp)->x == 1 && ((B*)vp)->y == 2)
+return 1;
+  return 0;
+}
+int main() { if (test() != 1) abort (); return 0; }

[PATCH] Fold unfolded constants from initializers

2019-03-13 Thread Richard Biener



This fixes a missed optimization in the testcase in PR89698
where we fail to fold an access to a virtual table slot because
it is (int (*) ()) 0 - yes, an unfolded INTEGER_CST.  Callers
are not happy with this so the following makes sure to
return a properly folded constant via canonicalize_constructor_val.

This avoids the dynamic_cast<> runtime overhead in the testcase.

Bootstrap & regtest running on x86_64-unknown-linux-gnu, queued for GCC 
10.

Richard.

2019-03-14  Richard Biener  

PR tree-optimization/89698
* gimple-fold.c (canonicalize_constructor_val): Early out
for constants, handle unfolded INTEGER_CSTs as they appear in
C++ virtual table ctors.

* g++.dg/tree-ssa/pr89698.C: New testcase.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 269641)
+++ gcc/gimple-fold.c   (working copy)
@@ -207,6 +207,9 @@ create_tmp_reg_or_ssa_name (tree type, g
 tree
 canonicalize_constructor_val (tree cval, tree from_decl)
 {
+  if (CONSTANT_CLASS_P (cval))
+return cval;
+
   tree orig_cval = cval;
   STRIP_NOPS (cval);
   if (TREE_CODE (cval) == POINTER_PLUS_EXPR
@@ -257,8 +260,15 @@ canonicalize_constructor_val (tree cval,
cval = fold_convert (TREE_TYPE (orig_cval), cval);
   return cval;
 }
-  if (TREE_OVERFLOW_P (cval))
-return drop_tree_overflow (cval);
+  /* In CONSTRUCTORs we may see unfolded constants like (int (*) ()) 0.  */
+  if (TREE_CODE (cval) == INTEGER_CST)
+{
+  if (TREE_OVERFLOW_P (cval))
+   cval = drop_tree_overflow (cval);
+  if (!useless_type_conversion_p (TREE_TYPE (orig_cval), TREE_TYPE (cval)))
+   cval = fold_convert (TREE_TYPE (orig_cval), cval);
+  return cval;
+}
   return orig_cval;
 }
 
Index: gcc/testsuite/g++.dg/tree-ssa/pr89698.C
===
--- gcc/testsuite/g++.dg/tree-ssa/pr89698.C (nonexistent)
+++ gcc/testsuite/g++.dg/tree-ssa/pr89698.C (working copy)
@@ -0,0 +1,29 @@
+// { dg-do compile }
+// { dg-options "-O -fdump-tree-fre1" }
+
+class A {
+virtual void f(){};
+public:
+int x;
+A(int in): x(in) {};
+};
+
+class B: public A {
+public:
+int y;
+B(int in):A(in-1), y(in) {};
+};
+
+void bar(void *);
+void test()
+{
+  B b(2);
+  A* bp = 
+  void* vp = dynamic_cast(bp);
+  bar (vp);
+}
+
+// We should be able to constant fold from the virtual table
+// the offset added to bp for the dynamic cast and forward
+//  to the argument of bar
+// { dg-final { scan-tree-dump "bar \\\(" "fre1" } }

Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-13 Thread Jakub Jelinek

On Wed, Mar 13, 2019 at 12:21:15PM +0100, Eric Botcazou wrote:
> > So, when using the MemToShadow(addr) (1UL << 43) + ((addr << 12) >> (12 +
> > 3)) mapping, the first valid address above the hole will have shadow at:
> >  0x00020700UL (will not work, as it is inside of the VA hole)
> >  0x0001f800UL (will not work, as it is inside of the VA hole)
> >  0x00010800UL (this is the only case that will work)
> >  0x0800UL (will not work; that would mean that both the low and
> >high memory share the same shadow memory;
> >it could be made to work by doing mmap
> >(0xfff0UL, 0x8UL, MAP_FIXED, 
> > PROT_NONE)
> >at libasan initialization and failing if that doesn't
> >succeed)
> 
> OK, I can certainly do the last thing.
> 
> > Note for the first VA layout even the shadow offset 1UL << 43 will not work
> > at all even for the low part of the memory, as all of shadow memory is then
> > in the hole.
> 
> Yes, you need a kernel configured for SPARC-T4 or later.
> 
> > I think hardcoding one of the 4 choices in the ABI is undesirable.
> 
> Frankly I'm not sure why you care about the ABI of the AddressSanitizer...

Because in real world, people do build asan instrumented shared libraries etc.,
sometimes as second set of libs and not everybody builds stuff for just his
own machine.  Plus, by hardcoding it in the compiler you even don't give a
choice to change it for other systems.  There is no runtime diagnostics if
you mix objects or shared libraries or executable with different settings.

> > Another possibility is instead of using constant offset + 2 shifts use a
> > variable offset + the normal >> 3, where the offset would be chosen by the
> > runtime library depending on the VA hole size and where the shadow for the
> > high memory would precede the shadow for the low memory.
> 
> But you still need to chop the high bits, otherwise you end up in the hole.

Not if the >> 3 shift is arithmetic shift.

For the
[0x0800UL,0xf800UL) 

[0x8000UL,0x8000UL) 

[0x0008UL,0xfff8UL) 

[0x0010UL,0xfff0UL) 

if shadow is shadow_offset + ((long) addr >> 3), then shadow_offset could be
e.g.
0x0300UL
0x3000UL
0x0003UL
0x0004UL
or something similar, the shadow memory would map into something below the
hole (of course one needs to also take into account where exactly the
dynamic linker, stack and shared libraries are usually mapped).

Jakub

Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-13 Thread Eric Botcazou

> So, when using the MemToShadow(addr) (1UL << 43) + ((addr << 12) >> (12 +
> 3)) mapping, the first valid address above the hole will have shadow at:
>  0x00020700UL (will not work, as it is inside of the VA hole)
>  0x0001f800UL (will not work, as it is inside of the VA hole)
>  0x00010800UL (this is the only case that will work)
>  0x0800UL (will not work; that would mean that both the low and
>  high memory share the same shadow memory;
>  it could be made to work by doing mmap
>  (0xfff0UL, 0x8UL, MAP_FIXED, 
> PROT_NONE)
>  at libasan initialization and failing if that doesn't
>  succeed)

OK, I can certainly do the last thing.

> Note for the first VA layout even the shadow offset 1UL << 43 will not work
> at all even for the low part of the memory, as all of shadow memory is then
> in the hole.

Yes, you need a kernel configured for SPARC-T4 or later.

> I think hardcoding one of the 4 choices in the ABI is undesirable.

Frankly I'm not sure why you care about the ABI of the AddressSanitizer...

> Another possibility is instead of using constant offset + 2 shifts use a
> variable offset + the normal >> 3, where the offset would be chosen by the
> runtime library depending on the VA hole size and where the shadow for the
> high memory would precede the shadow for the low memory.

But you still need to chop the high bits, otherwise you end up in the hole.

-- 
Eric Botcazou

Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-13 Thread Eric Botcazou

> It actually is something that works with all the VA sizes that are
> supported.

Well, there were changes in the past that seem to indicate that this has not 
always been true but, of course, the very specific VM layout on SPARC 64-bit 
(apparently inherited from Solaris) makes things much more convoluted...

Moreover, I'm not sure this is a very important issue, people presumably don't 
run binaries compiled with -fsanitize=address in production, so having to 
recompile with a matching GCC version doesn't seem that much of a hurdle.

-- 
Eric Botcazou

Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-13 Thread Jakub Jelinek

On Wed, Mar 13, 2019 at 11:17:49AM +0100, Jakub Jelinek wrote:
> On Wed, Mar 13, 2019 at 10:58:41AM +0100, Eric Botcazou wrote:
> > > Is the size of the virtual address space hole constant though (and will it
> > > remain constant)?
> > 
> > The kernel sources say that it's constant and with this position for 
> > SPARC-T4 
> > and later.  It's different (larger hole) for SPARC-T3 and earlier but I 
> > cannot 
> > really test.  I don't think that it will change for a given processor.
> > 
> > > E.g. on powerpc64 or aarch64 there are in each case like 4-5 different VA
> > > size configurations over the last 10+ years of kernel history and
> > > configuration options and fortunately all that is hidden inside of 
> > > libasan,
> > > if you have older gcc and run into an unsupported VA configuration, all it
> > > takes is update libasan to one that supports it and binaries continue to
> > > work.
> > 
> > But a few targets have hardcoded VA size in TARGET_ASAN_SHADOW_OFFSET too.
> 
> It actually is something that works with all the VA sizes that are
> supported.

The kernel says ATM that there are following possibilities for the hole:
[0x0800UL,0xf800UL)
[0x8000UL,0x8000UL)
[0x0008UL,0xfff8UL)
[0x0010UL,0xfff0UL)

So, when using the MemToShadow(addr) (1UL << 43) + ((addr << 12) >> (12 + 3)) 
mapping,
the first valid address above the hole will have shadow at:
 0x00020700UL (will not work, as it is inside of the VA hole)
 0x0001f800UL (will not work, as it is inside of the VA hole)
 0x00010800UL (this is the only case that will work)
 0x0800UL (will not work; that would mean that both the low and
   high memory share the same shadow memory;
   it could be made to work by doing mmap
   (0xfff0UL, 0x8UL, MAP_FIXED, 
PROT_NONE)
   at libasan initialization and failing if that doesn't
   succeed)
Note for the first VA layout even the shadow offset 1UL << 43 will not work
at all even for the low part of the memory, as all of shadow memory is then in 
the
hole.

I think hardcoding one of the 4 choices in the ABI is undesirable.

Another possibility is instead of using constant offset + 2 shifts use a
variable offset + the normal >> 3, where the offset would be chosen by the
runtime library depending on the VA hole size and where the shadow for the
high memory would precede the shadow for the low memory.

Jakub

Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-13 Thread Jakub Jelinek

On Wed, Mar 13, 2019 at 10:58:41AM +0100, Eric Botcazou wrote:
> > Is the size of the virtual address space hole constant though (and will it
> > remain constant)?
> 
> The kernel sources say that it's constant and with this position for SPARC-T4 
> and later.  It's different (larger hole) for SPARC-T3 and earlier but I 
> cannot 
> really test.  I don't think that it will change for a given processor.
> 
> > E.g. on powerpc64 or aarch64 there are in each case like 4-5 different VA
> > size configurations over the last 10+ years of kernel history and
> > configuration options and fortunately all that is hidden inside of libasan,
> > if you have older gcc and run into an unsupported VA configuration, all it
> > takes is update libasan to one that supports it and binaries continue to
> > work.
> 
> But a few targets have hardcoded VA size in TARGET_ASAN_SHADOW_OFFSET too.

It actually is something that works with all the VA sizes that are
supported.

Jakub

Plastics injection mold/molding

2019-03-13 Thread iksonsale5

Greetings,
 
Please allow me to recommend ourselves IKSON Mould to you as the company 
specializing in exporting all kinds of molds in the world.
We offer as followings:
 
-ALUMINUM DIE CASTING
-ZINC DIE CASTING
-CNC MACHINING
-TOOL & DIE
-INJECTION MOLDING
-PRODUCT DEVELOPMENT
If you are interested of us, please feel free to contact me for a quotation, we 
are all here to provide you service at any time, thank you!


Alice
Overseas Marketing Dept.
Ikson Mould Technology Co.,Ltd
Tel:86-769-88023015
Fax:86-769-88023016
Email:al...@ikson-mould.com
Add:#3,1 Road,Buxinji industrial,Guanjingtou,Fenggang,Dongguan 
city,Guangdong,China
~~~

Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-13 Thread Eric Botcazou

> Is the size of the virtual address space hole constant though (and will it
> remain constant)?

The kernel sources say that it's constant and with this position for SPARC-T4 
and later.  It's different (larger hole) for SPARC-T3 and earlier but I cannot 
really test.  I don't think that it will change for a given processor.

> E.g. on powerpc64 or aarch64 there are in each case like 4-5 different VA
> size configurations over the last 10+ years of kernel history and
> configuration options and fortunately all that is hidden inside of libasan,
> if you have older gcc and run into an unsupported VA configuration, all it
> takes is update libasan to one that supports it and binaries continue to
> work.

But a few targets have hardcoded VA size in TARGET_ASAN_SHADOW_OFFSET too.

> Could libasan initialization if it detects just PROT_NONE mmap from the end
> of hole to the start of the region it really supports (and fail if that
> fails), so that backward compatibility is ensured?

I'll investigate how targets supporting multiple VA size behave, but I don't 
have access to a large range of SPARC machines...

> Also, don't you need some corresponding libsanitizer changes?

Of course, just merged.

-- 
Eric Botcazou

[PATCH] Fix -gimple dumping for negative constants

2019-03-13 Thread Richard Biener



The following makes

int foo (int i)
{
  int j = -1 + i;
  return j + i;
}

-gimple dumps consumable by -fgimple without hand-editing the
non-obvious error.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-03-14  Richard Biener  

* tree-pretty-print.c (dump_generic_node): For -gimple properly
dump negative integer constants using _Literal (type) -num.

Index: gcc/tree-pretty-print.c
===
--- gcc/tree-pretty-print.c (revision 269569)
+++ gcc/tree-pretty-print.c (working copy)
@@ -1830,7 +1830,8 @@ dump_generic_node (pretty_printer *pp, t
  && (POINTER_TYPE_P (TREE_TYPE (node))
  || (TYPE_PRECISION (TREE_TYPE (node))
  < TYPE_PRECISION (integer_type_node))
- || exact_log2 (TYPE_PRECISION (TREE_TYPE (node))) == -1))
+ || exact_log2 (TYPE_PRECISION (TREE_TYPE (node))) == -1
+ || tree_int_cst_sgn (node) < 0))
{
  pp_string (pp, "_Literal (");
  dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);

[PATCH] Fix PR89677

2019-03-13 Thread Richard Biener



The following makes SCEV not throw FP expressions at tree-affine
which really isn't prepared for that.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-03-13  Richard Biener  

PR middle-end/89677
* tree-scalar-evolution.c (simplify_peeled_chrec): Do not
throw FP expressions at tree-affine.

* gcc.dg/torture/pr89677.c: New testcase.

Index: gcc/tree-scalar-evolution.c
===
--- gcc/tree-scalar-evolution.c (revision 269383)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -1421,6 +1421,11 @@ simplify_peeled_chrec (struct loop *loop
   return build_polynomial_chrec (loop->num, init_cond, right);
 }
 
+  /* The affine code only deals with pointer and integer types.  */
+  if (!POINTER_TYPE_P (type)
+  && !INTEGRAL_TYPE_P (type))
+return chrec_dont_know;
+
   /* Try harder to check if they are equal.  */
   tree_to_aff_combination_expand (left, type, , _chrec_map);
   tree_to_aff_combination_expand (step_val, type, , _chrec_map);
Index: gcc/testsuite/gcc.dg/torture/pr89677.c
===
--- gcc/testsuite/gcc.dg/torture/pr89677.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr89677.c  (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int32plus } */
+
+int a, b, d;
+unsigned c;
+float e, f, g;
+void h() {
+float *i = 
+for (; c < 10; c += 3)
+  for (; d; d += 3) {
+ a = *i;
+ g = f + 0;
+ f = b + *i + (b - e + 305219) + -b + 3;
+  }
+}

Re: [PATCH] Loop split upon semi-invariant condition (PR tree-optimization/89134)

2019-03-13 Thread Kyrill Tkachov


Hi Feng,

On 3/13/19 1:56 AM, Feng Xue OS wrote:

Richard,

    Thanks for your comment. Yes, it is like kind of jump threading 
with knowledge of loop structure. And what is rough time for GCC 10?





GCC 10 will be released once the number of P1 regressions gets down to 
zero. Past experience shows that it's around the April/May timeframe.


In the meantime my comment on the patch is that you should add some 
tests to the testsuite that showcase this transformation.


Thanks,

Kyrill



Regards,

Feng



From: Richard Biener 
Sent: Tuesday, March 12, 2019 4:31:49 PM
To: Feng Xue OS
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Loop split upon semi-invariant condition (PR 
tree-optimization/89134)


On Tue, Mar 12, 2019 at 7:20 AM Feng Xue OS 
 wrote:

>
> This patch is composed to implement a loop transformation on one of 
its conditional statements, which we call it semi-invariant, in that 
its computation is impacted in only one of its branches.

>
> Suppose a loop as:
>
> void f (std::map m)
> {
> for (auto it = m.begin (); it != m.end (); ++it) {
> /* if (b) is semi-invariant. */
> if (b) {
> b = do_something();    /* Has effect on b */
> } else {
> /* No effect on b */
> }
> statements;  /* Also no effect on b */
> }
> }
>
> A transformation, kind of loop split, could be:
>
> void f (std::map m)
> {
> for (auto it = m.begin (); it != m.end (); ++it) {
> if (b) {
> b = do_something();
> } else {
> ++it;
> statements;
> break;
> }
> statements;
> }
>
> for (; it != m.end (); ++it) {
> statements;
> }
> }
>
> If "statements" contains nothing, the second loop becomes an empty 
one, which can be removed. (This part will be given in another patch). 
And if "statements" are straight line instructions, we get an 
opportunity to vectorize the second loop. In practice, this 
optimization is found to improve some real application by %7.

>
> Since it is just a kind of loop split, the codes are mainly placed 
in existing tree-ssa-loop-split module, and is controlled by 
-fsplit-loop, and is enabled with -O3.


Note the transform itself is jump-threading with the threading
duplicating a whole CFG cycle.

I didn't look at the patch details yet since this is suitable for GCC 
10 only.


Thanks for implementing this.
Richard.

> Feng
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 64bf6017d16..a6c2878d652 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,23 @@
> +2019-03-12  Feng Xue 
> +
> +   PR tree-optimization/89134
> +    * doc/invoke.texi (max-cond-loop-split-insns): Document new 
--params.

> +   (min-cond-loop-split-prob): Likewise.
> +   * params.def: Add max-cond-loop-split-insns, 
min-cond-loop-split-prob.

> +   * passes.def (pass_cond_loop_split) : New pass.
> +   * timevar.def (TV_COND_LOOP_SPLIT): New time variable.
> +   * tree-pass.h (make_pass_cond_loop_split): New declaration.
> +   * tree-ssa-loop-split.c (split_info): New class.
> +   (find_vdef_in_loop, vuse_semi_invariant_p): New functions.
> +   (ssa_semi_invariant_p, stmt_semi_invariant_p): Likewise.
> +   (can_branch_be_excluded, get_cond_invariant_branch): Likewise.
> +   (is_cond_in_hidden_loop, compute_added_num_insns): Likewise.
> +   (can_split_loop_on_cond, mark_cond_to_split_loop): Likewise.
> +   (split_loop_for_cond, tree_ssa_split_loops_for_cond): Likewise.
> +   (pass_data_cond_loop_split): New variable.
> +   (pass_cond_loop_split): New class.
> +   (make_pass_cond_loop_split): New function.
> +
>  2019-03-11  Jakub Jelinek  
>
> PR middle-end/89655
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index df0883f2fc9..f5e09bd71fd 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11316,6 +11316,14 @@ The maximum number of branches unswitched 
in a single loop.

>  @item lim-expensive
>  The minimum cost of an expensive expression in the loop invariant 
motion.

>
> +@item max-cond-loop-split-insns
> +The maximum number of insns to be increased due to loop split on
> +semi-invariant condition statement.
> +
> +@item min-cond-loop-split-prob
> +The minimum threshold for probability of semi-invaraint condition
> +statement to trigger loop split.
> +
>  @item iv-consider-all-candidates-bound
>  Bound on number of candidates for induction variables, below which
>  all candidates are considered for each use in induction variable
> diff --git a/gcc/params.def b/gcc/params.def
> index 3f1576448be..2e067526958 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -386,6 +386,18 @@ DEFPARAM(PARAM_MAX_UNSWITCH_LEVEL,
> "The maximum number of unswitchings in a single loop.",
>

Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-13 Thread Jakub Jelinek

On Mon, Mar 11, 2019 at 11:29:39AM +0100, Eric Botcazou wrote:
> ASAN was enabled for the SPARC architecture during GCC 9 development but it 
> doesn't really work on SPARC64/Linux because of the specific layout of the 
> virtual memory address space.  Fortunately this is (easily) fixable and the 
> fix has been accepted upstream, along with other fixes for SPARC (I have 
> attached the asan/asan_mapping_sparc64.h file accepted upstream).

Is the size of the virtual address space hole constant though (and will it
remain constant)?
E.g. on powerpc64 or aarch64 there are in each case like 4-5 different VA
size configurations over the last 10+ years of kernel history and
configuration options and fortunately all that is hidden inside of libasan,
if you have older gcc and run into an unsupported VA configuration, all it
takes is update libasan to one that supports it and binaries continue to
work.
While in this testcase, the VA size is hardcoded into all the generated
code.  I guess running it on a VA layout that has the hole larger than
the one picked up (i.e. the high part of memory above the hole smaller;
supposedly for older kernel versions or older hw) should not be an issue,
but a big issue will be if the hole shrinks further, thus the high part of
memory above the hole grows.
Could libasan initialization if it detects just PROT_NONE mmap from the end
of hole to the start of the region it really supports (and fail if that
fails), so that backward compatibility is ensured?

> But, since GCC also hardcodes the scaling done by ASAN, this also requires a 
> small adjustment to the compiler proper by means of a hook, tentatively 
> called 
> TARGET_ASAN_SHADOW_LEFT_SHIFT, which is defined to NULL except for SPARC.  It
> yields a 100% clean ASAN testsuite on SPARC64/Linux (32-bit and 64-bit).
> 
> Tested on SPARC64/Linux, SPARC/Solaris and x86-64/Linux, OK for the mainline?
> 
> 
> 2019-03-11  Eric Botcazou  
> 
>   PR sanitizer/80953
>   * target.def (asan_shadow_left_shift): New hook.
>   (asan_shadow_offset): Minor tweak.
>   * doc/tm.texi.in: Add TARGET_ASAN_SHADOW_LEFT_SHIFT.
>   * doc/tm.texi: Regenerate.
>   * asan.c (asan_emit_stack_protection): Do a preliminary left shift if
>   TARGET_ASAN_SHADOW_LEFT_SHIFT is positive.
>   (build_shadow_mem_access): Likewise.
>   * config/sparc/sparc.c (TARGET_ASAN_SHADOW_LEFT_SHIFT): Define to...
>   (sparc_asan_shadow_left_shift): ...this.  New function.

Also, don't you need some corresponding libsanitizer changes?

Jakub

[libsanitizer] AddressSanitizer: 64-bit SPARC/Linux port

2019-03-13 Thread Eric Botcazou

This patch contains the bits required to make the AddressSanitizer work on 
SPARC64/Linux (SPARC-T4 and later).  It only affects the SPARC ports and has 
been tested on SPARC/Solaris and SPARC64/Linux.

It merges r355980 of the LLVM repository.  Installed on the mainline.


2019-03-13  Eric Botcazou  

PR sanitizer/80953
Merge from LLVM revision 355980
* asan/asan_allocator.h (kAllocatorSpace): Define for SPARC.
(kAllocatorSize): Likewise.
(DefaultSizeClassMap): Likewise.
* asan/asan_mapping.h (kSPARC64_ShadowOffset64): Define.
(SHADOW_OFFSET): Define for SPARC.
Include asan_mapping_sparc64.h for SPARC 64-bit.
* asan/asan_mapping_sparc64.h: New file.

-- 
Eric BotcazouIndex: asan/asan_allocator.h
===
--- asan/asan_allocator.h	(revision 269546)
+++ asan/asan_allocator.h	(working copy)
@@ -132,11 +132,15 @@ const uptr kAllocatorSpace =  ~(uptr)0;
 const uptr kAllocatorSize  =  0x20ULL;  // 128G.
 typedef VeryCompactSizeClassMap SizeClassMap;
 # elif defined(__aarch64__)
-// AArch64/SANITIZER_CAN_USER_ALLOCATOR64 is only for 42-bit VMA
+// AArch64/SANITIZER_CAN_USE_ALLOCATOR64 is only for 42-bit VMA
 // so no need to different values for different VMA.
 const uptr kAllocatorSpace =  0x100ULL;
 const uptr kAllocatorSize  =  0x100ULL;  // 3T.
 typedef DefaultSizeClassMap SizeClassMap;
+# elif defined(__sparc__)
+const uptr kAllocatorSpace = ~(uptr)0;
+const uptr kAllocatorSize  =  0x200ULL;  // 2T.
+typedef DefaultSizeClassMap SizeClassMap;
 # elif SANITIZER_WINDOWS
 const uptr kAllocatorSpace = ~(uptr)0;
 const uptr kAllocatorSize  =  0x80ULL;  // 500G
Index: asan/asan_mapping.h
===
--- asan/asan_mapping.h	(revision 269546)
+++ asan/asan_mapping.h	(working copy)
@@ -99,6 +99,13 @@
 // || `[0x10, 0x11]` || LowShadow  ||
 // || `[0x00, 0x0f]` || LowMem ||
 //
+// Default Linux/SPARC64 (52-bit VMA) mapping:
+// || `[0x8, 0xf]` || HighMem||
+// || `[0x10800, 0x207ff]` || HighShadow ||
+// || `[0x00900, 0x107ff]` || ShadowGap  ||
+// || `[0x00800, 0x008ff]` || LowShadow  ||
+// || `[0x0, 0x007ff]` || LowMem ||
+//
 // Shadow mapping on FreeBSD/x86-64 with SHADOW_OFFSET == 0x4000:
 // || `[0x5000, 0x7fff]` || HighMem||
 // || `[0x4a00, 0x4fff]` || HighShadow ||
@@ -161,6 +168,7 @@ static const u64 kMIPS32_ShadowOffset32
 static const u64 kMIPS64_ShadowOffset64 = 1ULL << 37;
 static const u64 kPPC64_ShadowOffset64 = 1ULL << 41;
 static const u64 kSystemZ_ShadowOffset64 = 1ULL << 52;
+static const u64 kSPARC64_ShadowOffset64 = 1ULL << 43;  // 0x800
 static const u64 kFreeBSD_ShadowOffset32 = 1ULL << 30;  // 0x4000
 static const u64 kFreeBSD_ShadowOffset64 = 1ULL << 46;  // 0x4000
 static const u64 kNetBSD_ShadowOffset32 = 1ULL << 30;  // 0x4000
@@ -223,6 +231,8 @@ static const u64 kMyriadCacheBitMask32 =
 #   define SHADOW_OFFSET kDefaultShadowOffset64
 #  elif defined(__mips64)
 #   define SHADOW_OFFSET kMIPS64_ShadowOffset64
+#  elif defined(__sparc__)
+#   define SHADOW_OFFSET kSPARC64_ShadowOffset64
 #  elif SANITIZER_WINDOWS64
 #   define SHADOW_OFFSET __asan_shadow_memory_dynamic_address
 #  else
@@ -269,6 +279,8 @@ extern uptr kHighMemEnd, kMidMemBeg, kMi
 
 #if SANITIZER_MYRIAD2
 #include "asan_mapping_myriad.h"
+#elif defined(__sparc__) && SANITIZER_WORDSIZE == 64
+#include "asan_mapping_sparc64.h"
 #else
 #define MEM_TO_SHADOW(mem) (((mem) >> SHADOW_SCALE) + (SHADOW_OFFSET))
 
Index: asan/asan_mapping_sparc64.h
===
--- asan/asan_mapping_sparc64.h	(nonexistent)
+++ asan/asan_mapping_sparc64.h	(working copy)
@@ -0,0 +1,100 @@
+//===-- asan_mapping_sparc64.h --*- C++ -*-===//
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details
+//
+//===--===//
+//
+// This file is a part of AddressSanitizer, an address sanity checker.
+//
+// SPARC64-specific definitions for ASan memory mapping.
+//===--===//
+#ifndef ASAN_MAPPING_SPARC64_H
+#define ASAN_MAPPING_SPARC64_H
+
+// This is tailored to the 52-bit VM layout on SPARC-T4 and later.
+// The VM space is split into two 51-bit halves at both ends: the low part
+// has all the bits above the 51st cleared, while the high part has them set.
+//   0xfff8 - 0x
+//   0x - 0x0007
+
+#define VMA_BITS 52
+#define HIGH_BITS (64 - VMA_BITS)
+
+// The idea is to chop the high bits

[libsanitizer] AddressSanitizer: fix for SPARC with GCC

2019-03-13 Thread Eric Botcazou

This patch contains a fixlet for the AddressSanitizer on the SPARC with GCC, 
which would otherwise generate a problematic call to the intercepted memcpy 
routine.  It only affects the SPARC ports and has been tested on SPARC/Solaris 
and SPARC64/Linux.

It merges r355979 of the LLVM repository.  Installed on the mainline.


2019-03-13  Eric Botcazou  

PR sanitizer/80953
Merge from LLVM revision 355979
* asan/asan_globals.c (GetGlobalsForAddress): Use internal_memcpy to
copy Global objects for SPARC with GCC.

-- 
Eric BotcazouIndex: asan/asan_globals.cc
===
--- asan/asan_globals.cc	(revision 269546)
+++ asan/asan_globals.cc	(working copy)
@@ -112,7 +112,11 @@ int GetGlobalsForAddress(uptr addr, Glob
 if (flags()->report_globals >= 2)
   ReportGlobal(g, "Search");
 if (IsAddressNearGlobal(addr, g)) {
+#if defined(__GNUC__) && defined(__sparc__)
+  internal_memcpy([res], , sizeof(g));
+#else
   globals[res] = g;
+#endif
   if (reg_sites)
 reg_sites[res] = FindRegistrationSite();
   res++;

[libsanitizer] SanitizerCommon: 64-bit SPARC/Linux port

2019-03-13 Thread Eric Botcazou

This patch contains the bits required to make the common 32-bit allocator work 
on SPARC64/Linux.  It only affects the SPARC ports and has been tested on 
SPARC/Solaris and SPARC64/Linux.

It merges r355978 of the LLVM repository.  Installed on the mainline.


2019-03-13  Eric Botcazou  

PR sanitizer/80953
Merge from LLVM revision 355978
* sanitizer_common/sanitizer_allocator_primary32.h
(class SizeClassAllocator32): Assert that kSpaceSize is power of 2 if
SANITIZER_SIGN_EXTENDED_ADDRESSES is set.
(PointerIsMine): Deal with SANITIZER_SIGN_EXTENDED_ADDRESSES.
(ComputeRegionId): Likewise.
* sanitizer_common/sanitizer_linux.cc (GetMaxVirtualAddress): Return
appropriate value for SPARC 64-bit.
* sanitizer_common/sanitizer_platform.h (SANITIZER_MMAP_RANGE_SIZE):
Define for SPARC.
(SANITIZER_SIGN_EXTENDED_ADDRESSES): Define to 1 for SPARC 64-bit.

-- 
Eric BotcazouIndex: sanitizer_common/sanitizer_allocator_primary32.h
===
--- sanitizer_common/sanitizer_allocator_primary32.h	(revision 269546)
+++ sanitizer_common/sanitizer_allocator_primary32.h	(working copy)
@@ -54,6 +54,9 @@ class SizeClassAllocator32 {
   typedef typename Params::ByteMap ByteMap;
   typedef typename Params::MapUnmapCallback MapUnmapCallback;
 
+  COMPILER_CHECK(!SANITIZER_SIGN_EXTENDED_ADDRESSES ||
+ (kSpaceSize & (kSpaceSize - 1)) == 0);
+
   static const bool kRandomShuffleChunks = Params::kFlags &
   SizeClassAllocator32FlagMasks::kRandomShuffleChunks;
   static const bool kUseSeparateSizeClassForBatch = Params::kFlags &
@@ -175,6 +178,8 @@ class SizeClassAllocator32 {
 
   bool PointerIsMine(const void *p) {
 uptr mem = reinterpret_cast(p);
+if (SANITIZER_SIGN_EXTENDED_ADDRESSES)
+  mem &= (kSpaceSize - 1);
 if (mem < kSpaceBeg || mem >= kSpaceBeg + kSpaceSize)
   return false;
 return GetSizeClass(p) != 0;
@@ -267,6 +272,8 @@ class SizeClassAllocator32 {
   COMPILER_CHECK(sizeof(SizeClassInfo) % kCacheLineSize == 0);
 
   uptr ComputeRegionId(uptr mem) {
+if (SANITIZER_SIGN_EXTENDED_ADDRESSES)
+  mem &= (kSpaceSize - 1);
 const uptr res = mem >> kRegionSizeLog;
 CHECK_LT(res, kNumPossibleRegions);
 return res;
Index: sanitizer_common/sanitizer_linux.cc
===
--- sanitizer_common/sanitizer_linux.cc	(revision 269546)
+++ sanitizer_common/sanitizer_linux.cc	(working copy)
@@ -1064,6 +1064,8 @@ uptr GetMaxVirtualAddress() {
   return (1ULL << 40) - 1;  // 0x00ffUL;
 # elif defined(__s390x__)
   return (1ULL << 53) - 1;  // 0x001fUL;
+# elif defined(__sparc__)
+  return ~(uptr)0;
 # else
   return (1ULL << 47) - 1;  // 0x7fffUL;
 # endif
Index: sanitizer_common/sanitizer_platform.h
===
--- sanitizer_common/sanitizer_platform.h	(revision 269546)
+++ sanitizer_common/sanitizer_platform.h	(working copy)
@@ -239,10 +239,21 @@
 # else
 #  define SANITIZER_MMAP_RANGE_SIZE FIRST_32_SECOND_64(1ULL << 32, 1ULL << 48)
 # endif
+#elif defined(__sparc__)
+# define SANITIZER_MMAP_RANGE_SIZE FIRST_32_SECOND_64(1ULL << 32, 1ULL << 52)
 #else
 # define SANITIZER_MMAP_RANGE_SIZE FIRST_32_SECOND_64(1ULL << 32, 1ULL << 47)
 #endif
 
+// Whether the addresses are sign-extended from the VMA range to the word.
+// The SPARC64 Linux port implements this to split the VMA space into two
+// non-contiguous halves with a huge hole in the middle.
+#if defined(__sparc__) && SANITIZER_WORDSIZE == 64
+# define SANITIZER_SIGN_EXTENDED_ADDRESSES 1
+#else
+# define SANITIZER_SIGN_EXTENDED_ADDRESSES 0
+#endif
+
 // The AArch64 linux port uses the canonical syscall set as mandated by
 // the upstream linux community for all new ports. Other ports may still
 // use legacy syscalls.

[libsanitizer] SanitizerCommon: fixes for unwinding & backtrace on SPARC

2019-03-13 Thread Eric Botcazou

This patch contains various fixes for the unwinding and backtrace machinery 
on the SPARC, which doesn't work correctly in some cases.  It only affects the 
SPARC ports and has been tested on SPARC/Solaris and SPARC64/Linux.

It merges r355965 of the LLVM repository.  Installed on the mainline.


2019-03-13  Eric Botcazou  

PR sanitizer/80953
Merge from LLVM revision 355965
* sanitizer_common/sanitizer_linux.cc (GetWriteFlag): Implement for
SPARC/Linux.
(GetPcSpBp): Likewise.
* sanitizer_common/sanitizer_stacktrace.cc (GetNextInstructionPc):
Adjust for SPARC.
* sanitizer_common/sanitizer_stacktrace.h (SANITIZER_CAN_FAST_UNWIND):
Define to 1 for SPARC.
* sanitizer_common/sanitizer_stacktrace_sparc.cc: Rewrite.
* sanitizer_common/sanitizer_unwind_linux_libcdep.cc (SlowUnwindStack):
Adjust the PC address for SPARC with GCC.

-- 
Eric BotcazouIndex: sanitizer_common/sanitizer_linux.cc
===
--- sanitizer_common/sanitizer_linux.cc	(revision 269546)
+++ sanitizer_common/sanitizer_linux.cc	(working copy)
@@ -1848,10 +1850,20 @@ SignalContext::WriteFlag SignalContext::
   u64 esr;
   if (!Aarch64GetESR(ucontext, )) return UNKNOWN;
   return esr & ESR_ELx_WNR ? WRITE : READ;
-#elif SANITIZER_SOLARIS && defined(__sparc__)
+#elif defined(__sparc__)
   // Decode the instruction to determine the access type.
   // From OpenSolaris $SRC/uts/sun4/os/trap.c (get_accesstype).
+# if SANITIZER_SOLARIS
   uptr pc = ucontext->uc_mcontext.gregs[REG_PC];
+# else
+  // Historical BSDism here.
+  struct sigcontext *scontext = (struct sigcontext *)context;
+#  if defined(__arch64__)
+  uptr pc = scontext->sigc_regs.tpc;
+#  else
+  uptr pc = scontext->si_regs.pc;
+#  endif
+# endif
   u32 instr = *(u32 *)pc;
   return (instr >> 21) & 1 ? WRITE: READ;
 #else
@@ -1942,28 +1954,27 @@ static void GetPcSpBp(void *context, upt
   // pointer, but GCC always uses r31 when we need a frame pointer.
   *bp = ucontext->uc_mcontext.regs->gpr[PT_R31];
 #elif defined(__sparc__)
-  ucontext_t *ucontext = (ucontext_t*)context;
-  uptr *stk_ptr;
-# if defined(__sparcv9) || defined (__arch64__)
-# ifndef MC_PC
-#  define MC_PC REG_PC
-# endif
-# ifndef MC_O6
-#  define MC_O6 REG_O6
+# if defined(__arch64__) || defined(__sparcv9)
+#  define STACK_BIAS 2047
+# else
+#  define STACK_BIAS 0
 # endif
 # if SANITIZER_SOLARIS
-#  define mc_gregs gregs
-# endif
-  *pc = ucontext->uc_mcontext.mc_gregs[MC_PC];
-  *sp = ucontext->uc_mcontext.mc_gregs[MC_O6];
-  stk_ptr = (uptr *) (*sp + 2047);
-  *bp = stk_ptr[15];
-# else
+  ucontext_t *ucontext = (ucontext_t*)context;
   *pc = ucontext->uc_mcontext.gregs[REG_PC];
-  *sp = ucontext->uc_mcontext.gregs[REG_O6];
-  stk_ptr = (uptr *) *sp;
-  *bp = stk_ptr[15];
+  *sp = ucontext->uc_mcontext.gregs[REG_O6] + STACK_BIAS;
+# else
+  // Historical BSDism here.
+  struct sigcontext *scontext = (struct sigcontext *)context;
+#  if defined(__arch64__)
+  *pc = scontext->sigc_regs.tpc;
+  *sp = scontext->sigc_regs.u_regs[14] + STACK_BIAS;
+#  else
+  *pc = scontext->si_regs.pc;
+  *sp = scontext->si_regs.u_regs[14];
+#  endif
 # endif
+  *bp = (uptr) ((uhwptr *) *sp)[14] + STACK_BIAS;
 #elif defined(__mips__)
   ucontext_t *ucontext = (ucontext_t*)context;
   *pc = ucontext->uc_mcontext.pc;
Index: sanitizer_common/sanitizer_stacktrace.cc
===
--- sanitizer_common/sanitizer_stacktrace.cc	(revision 269546)
+++ sanitizer_common/sanitizer_stacktrace.cc	(working copy)
@@ -16,10 +16,9 @@
 namespace __sanitizer {
 
 uptr StackTrace::GetNextInstructionPc(uptr pc) {
-#if defined(__mips__)
+#if defined(__sparc__) || defined(__mips__)
   return pc + 8;
-#elif defined(__powerpc__) || defined(__sparc__) || defined(__arm__) || \
-defined(__aarch64__)
+#elif defined(__powerpc__) || defined(__arm__) || defined(__aarch64__)
   return pc + 4;
 #else
   return pc + 1;
Index: sanitizer_common/sanitizer_stacktrace.h
===
--- sanitizer_common/sanitizer_stacktrace.h	(revision 269546)
+++ sanitizer_common/sanitizer_stacktrace.h	(working copy)
@@ -17,7 +17,7 @@ namespace __sanitizer {
 
 static const u32 kStackTraceMax = 256;
 
-#if defined(__sparc__) || (SANITIZER_LINUX && defined(__mips__))
+#if SANITIZER_LINUX && defined(__mips__)
 # define SANITIZER_CAN_FAST_UNWIND 0
 #elif SANITIZER_WINDOWS
 # define SANITIZER_CAN_FAST_UNWIND 0
Index: sanitizer_common/sanitizer_stacktrace_sparc.cc
===
--- sanitizer_common/sanitizer_stacktrace_sparc.cc	(revision 269546)
+++ sanitizer_common/sanitizer_stacktrace_sparc.cc	(working copy)
@@ -11,9 +11,13 @@
 // Implemention of fast stack unwinding for Sparc.
 //===--===//
 
-// This file is ported

Re: [PATCH] Remove unused nonlocal_value member of ipa_parm_adjustment

2019-03-13 Thread Richard Biener

On Wed, 13 Mar 2019, Jakub Jelinek wrote:

> Hi!
> 
> This field seems to have been unused since it has been added 10 years ago,
> the patch had just some comments about it in the code, but even those are
> long gone now.  https://gcc.gnu.org/ml/gcc-patches/2009-07/msg00514.html
> 
> Can we remove it now and only add if we ever need it for something?
> Bootstrapped/regtested on x86_64-linux and i686-linux.

You can remove it now.

Richard.

> 2019-03-13  Jakub Jelinek  
> 
>   * ipa-param-manipulation.h (struct ipa_parm_adjustment): Remove
>   nonlocal_value member.
> 
> --- gcc/ipa-param-manipulation.h.jj   2019-01-01 12:37:17.909962612 +0100
> +++ gcc/ipa-param-manipulation.h  2019-03-12 19:06:14.949156719 +0100
> @@ -75,10 +75,6 @@ struct ipa_parm_adjustment
>   non-default-def ssa names when a parm decl is going away.  */
>tree new_ssa_base;
>  
> -  /* If non-NULL and the original parameter is to be removed (copy_param 
> below
> - is NULL), this is going to be its nonlocalized vars value.  */
> -  tree nonlocal_value;
> -
>/* This holds the prefix to be used for the new DECL_NAME.  */
>const char *arg_prefix;
>  
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH][RFC] Teach GIMPLE FE to build proper CFG + SSA (+ loops)

2019-03-13 Thread Richard Biener

On Wed, 13 Mar 2019, Bin.Cheng wrote:

> On Wed, Mar 13, 2019 at 3:58 AM Richard Biener  wrote:
> >
> >
> > This makes an attempt at fixing the most annoying parts of the GIMPLE
> > FE unit testing - the lack of proper CFG preservation and hoops you
> > need to jump through to make the CFG and SSA builders happy.
> >
> > Due to this the __GIMPLE specifiers takes two new flags, "cfg"
> > for GIMPLE-with-a-CFG and "ssa" for GIMPLE-with-a-CFG-and-SSA.
> > When there is a CFG you need to start basic-block boundaries with
> >
> > __BB(index):
> >
> > where 'index' is the basic-block index.  That implicitely defines
> > a label __BBindex for use in goto stmts and friends (but doesn't
> > actually add a GIMPLE_LABEL).
> >
> > The parsing code isn't defensive right now so you need to watch
> > out to not use index 0 or 1 (entry and exit block) which are
> > only implicitely present.
> >
> > As a proof of concept I added one BB annotation - loop_header(num)
> > where "num" is the loop number.  This means you can now also
> > have loop structures preserved (to some extent - the loop tree is
> > not explicitely represented nor are loop fathers, both are re-built
> > via fixup).
> >
> > I've not yet adjusted -gimple dumping.
> >
> > I've adjusted all testcases I could find.
> >
> > The CFG build inside the frontend is a bit awkward (spread out...),
> > likewise some functionality is asking for split-out.
> >
> > This is mainly a RFC for whether you think this is forward looking
> > enough.  Future enhancements would include EH and abnormal edges
> > via stmt annotations:
> Thanks very much for doing this.  Do we have a centralized
> wiki/document on the formal definition?  It would be very helpful even
> it's still evolving, otherwise we would need to dig into messages for
> fragmented clues.

Earlier this week I transformed the old 
https://gcc.gnu.org/wiki/GimpleFrontEnd page into one documenting
the current state but this doesn't yet include any kind of documentation.

I suppose special syntax documentation should go into our texinfo
docs, I guess into a new section in the internals manual.  Currently
we only have documentation for the -fgimple switch.

Note that I view -gimple dumping as the thing that should give you
a clue about expected syntax.  I hope to spend a little more time
on the patch to make it ready for GCC9.

Richard.


> Thanks,
> bin
> >
> > __BB(2):
> >   foo () goto __BB3 __EDGE_EH; // EH edge to BB3
> >
> > __BB(3):
> >   setjmp () goto __BB4 __EDGE_AB; // abnormal edge to BB4
> >
> > __BB(4):
> >   _1 = _5 / 0. __NOTHROW; // gimple_nothrow_p is true
> >   _3 = _2 / _4 goto __BB5 __EDGE_EH; // EH edge to BB5
> >
> > etc.  extra edge flags:
> >
> >   goto __BB5 __EDGE_EXECUTABLE;
> >
> > I guess now is the time for bike-shedding.
> >
> > Richard.
> >
> > 2019-03-12  Richard Biener  
> >
> > c/
> > * c-tree.h (enum c_declspec_il): New.
> > (struct c_declspecs): Merge gimple_p and rtl_p into declspec_il
> > enum bitfield.
> > * c-parser.c (c_parser_declaration_or_fndef): Adjust accordingly.
> > Pass start pass and declspec_il to c_parser_parse_gimple_body.
> > (c_parser_declspecs): Adjust.
> > * gimple-parser.c: Include cfg.h, cfghooks.h, cfganal.h, tree-cfg.h,
> > gimple-iterator.h, cfgloop.h, tree-phinodes.h, tree-into-ssa.h
> > and bitmap.h.
> > (struct gimple_parser_edge): New.
> > (edges, current_bb): New globals.
> > (c_parser_gimple_parse_bb_spec): New helper.
> > (c_parser_parse_gimple_body): Get start pass and IL specification.
> > Initialize SSA and CFG.
> > (c_parser_gimple_compound_statement): Handle CFG build.
> > (c_parser_gimple_statement): Change intermittend __PHI internal
> > function arg.
> > (c_parser_gimple_or_rtl_pass_list): Handle ssa, cfg flags.
> > (c_parser_gimple_goto_stmt): Record edges to build.
> > (c_parser_gimple_if_stmt): Likewise.
> > * gimple-parser.h (c_parser_parse_gimple_body): Adjust.
> > (c_parser_gimple_or_rtl_pass_list): Likewise.
> >
> > * gcc.dg/gimplefe-13.c: Adjust.
> > ...
> >
> > Index: gcc/c/c-parser.c
> > ===
> > --- gcc/c/c-parser.c(revision 269604)
> > +++ gcc/c/c-parser.c(working copy)
> > @@ -2324,19 +2324,9 @@ c_parser_declaration_or_fndef (c_parser
> >DECL_STRUCT_FUNCTION (current_function_decl)->function_start_locus
> > = c_parser_peek_token (parser)->location;
> >
> > -  /* If the definition was marked with __GIMPLE then parse the
> > - function body as GIMPLE.  */
> > -  if (specs->gimple_p)
> > -   {
> > - cfun->pass_startwith = specs->gimple_or_rtl_pass;
> > - bool saved = in_late_binary_op;
> > - in_late_binary_op = true;
> > - c_parser_parse_gimple_body (parser);
> > - in_late_binary_op =

[committed] Handle PHIs in omp-simd-clone.c cloning (PR middle-end/88588)

2019-03-13 Thread Jakub Jelinek

Hi!

As the testcase shows (not really useful example of what simd functions
should be used for though), I forgot to adjust PHI arguments.  They can
contain  and if the parm is to be passed as vector, we want to replace
it with [iter_N].  That is not a valid gimple invariant though,
so we need to emit it in some other bb.  I chose to just emit it on the
first bb in the function, which is going to become the body of the new loop
that is supposed to be vectorized.  The code abuses otherwise unused
cand->alias_ptr_type (used just for IPA SRA, not in this case) for caching
a SSA_NAME containing the address, so that if we have many PHI nodes with
, we don't create a SSA_NAME for each of them and hope some SCCVN will
clean it up.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-03-13  Jakub Jelinek  

PR middle-end/88588
* omp-simd-clone.c (ipa_simd_modify_stmt_ops): Handle PHI args.
(ipa_simd_modify_function_body): Handle PHIs.

* c-c++-common/gomp/pr88588.c: New test.

--- gcc/omp-simd-clone.c.jj 2019-02-08 16:02:42.096813415 +0100
+++ gcc/omp-simd-clone.c2019-03-12 20:56:25.949386379 +0100
@@ -866,6 +866,18 @@ ipa_simd_modify_stmt_ops (tree *tp, int
 
   if (tp != orig_tp)
 {
+  if (gimple_code (info->stmt) == GIMPLE_PHI
+ && cand
+ && TREE_CODE (*orig_tp) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (*orig_tp, 0)) == PARM_DECL
+ && cand->alias_ptr_type)
+   {
+ gcc_assert (TREE_CODE (cand->alias_ptr_type) == SSA_NAME);
+ *orig_tp = cand->alias_ptr_type;
+ info->modified = true;
+ return NULL_TREE;
+   }
+
   repl = build_fold_addr_expr (repl);
   gimple *stmt;
   if (is_gimple_debug (info->stmt))
@@ -882,7 +894,18 @@ ipa_simd_modify_stmt_ops (tree *tp, int
  stmt = gimple_build_assign (make_ssa_name (TREE_TYPE (repl)), repl);
  repl = gimple_assign_lhs (stmt);
}
-  gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
+  gimple_stmt_iterator gsi;
+  if (gimple_code (info->stmt) == GIMPLE_PHI)
+   {
+ gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
+ /* Cache SSA_NAME for next time.  */
+ if (cand
+ && TREE_CODE (*orig_tp) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (*orig_tp, 0)) == PARM_DECL)
+   cand->alias_ptr_type = repl;
+   }
+  else
+   gsi = gsi_for_stmt (info->stmt);
   gsi_insert_before (, stmt, GSI_SAME_STMT);
   *orig_tp = repl;
 }
@@ -983,6 +1006,31 @@ ipa_simd_modify_function_body (struct cg
 {
   gimple_stmt_iterator gsi;
 
+  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next ())
+   {
+ gphi *phi = as_a  (gsi_stmt (gsi));
+ int i, n = gimple_phi_num_args (phi);
+ info.stmt = phi;
+ struct walk_stmt_info wi;
+ memset (, 0, sizeof (wi));
+ info.modified = false;
+ wi.info = 
+ for (i = 0; i < n; ++i)
+   {
+ int walk_subtrees = 1;
+ tree arg = gimple_phi_arg_def (phi, i);
+ tree op = arg;
+ ipa_simd_modify_stmt_ops (, _subtrees, );
+ if (op != arg)
+   {
+ SET_PHI_ARG_DEF (phi, i, op);
+ gcc_assert (TREE_CODE (op) == SSA_NAME);
+ if (gimple_phi_arg_edge (phi, i)->flags & EDGE_ABNORMAL)
+   SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op) = 1;
+   }
+   }
+   }
+
   gsi = gsi_start_bb (bb);
   while (!gsi_end_p (gsi))
{
--- gcc/testsuite/c-c++-common/gomp/pr88588.c.jj2019-03-12 
21:00:11.886712729 +0100
+++ gcc/testsuite/c-c++-common/gomp/pr88588.c   2019-03-12 21:00:51.799063771 
+0100
@@ -0,0 +1,18 @@
+/* PR middle-end/88588 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -O1" } */
+
+int *v;
+
+#pragma omp declare simd
+void
+foo (int x)
+{
+  int *a = 
+
+  for (;;)
+{
+  *v = *a;
+  a = v;
+}
+}

Jakub

[PATCH] Remove unused nonlocal_value member of ipa_parm_adjustment

2019-03-13 Thread Jakub Jelinek

Hi!

This field seems to have been unused since it has been added 10 years ago,
the patch had just some comments about it in the code, but even those are
long gone now.  https://gcc.gnu.org/ml/gcc-patches/2009-07/msg00514.html

Can we remove it now and only add if we ever need it for something?
Bootstrapped/regtested on x86_64-linux and i686-linux.

2019-03-13  Jakub Jelinek  

* ipa-param-manipulation.h (struct ipa_parm_adjustment): Remove
nonlocal_value member.

--- gcc/ipa-param-manipulation.h.jj 2019-01-01 12:37:17.909962612 +0100
+++ gcc/ipa-param-manipulation.h2019-03-12 19:06:14.949156719 +0100
@@ -75,10 +75,6 @@ struct ipa_parm_adjustment
  non-default-def ssa names when a parm decl is going away.  */
   tree new_ssa_base;
 
-  /* If non-NULL and the original parameter is to be removed (copy_param below
- is NULL), this is going to be its nonlocalized vars value.  */
-  tree nonlocal_value;
-
   /* This holds the prefix to be used for the new DECL_NAME.  */
   const char *arg_prefix;
 

Jakub

Re: [patch, fortran] Fix PR 66695, 77746 and 79485

2019-03-13 Thread Thomas Koenig


Hi Steve,


Regressoin-tested. OK for trunk?

Looks good to me.


Committed as r269635, with one correction: When checking with SVN, I
found that I had misnumbered the test cases - I have now committed
them as binding_label_tests_30.f90 to binding_label_tests_33.f90.

Thanks for the review!

Regards

Thomas

Re: [PATCH][RFC] Teach GIMPLE FE to build proper CFG + SSA (+ loops)

2019-03-13 Thread Bin.Cheng

On Wed, Mar 13, 2019 at 3:58 AM Richard Biener  wrote:
>
>
> This makes an attempt at fixing the most annoying parts of the GIMPLE
> FE unit testing - the lack of proper CFG preservation and hoops you
> need to jump through to make the CFG and SSA builders happy.
>
> Due to this the __GIMPLE specifiers takes two new flags, "cfg"
> for GIMPLE-with-a-CFG and "ssa" for GIMPLE-with-a-CFG-and-SSA.
> When there is a CFG you need to start basic-block boundaries with
>
> __BB(index):
>
> where 'index' is the basic-block index.  That implicitely defines
> a label __BBindex for use in goto stmts and friends (but doesn't
> actually add a GIMPLE_LABEL).
>
> The parsing code isn't defensive right now so you need to watch
> out to not use index 0 or 1 (entry and exit block) which are
> only implicitely present.
>
> As a proof of concept I added one BB annotation - loop_header(num)
> where "num" is the loop number.  This means you can now also
> have loop structures preserved (to some extent - the loop tree is
> not explicitely represented nor are loop fathers, both are re-built
> via fixup).
>
> I've not yet adjusted -gimple dumping.
>
> I've adjusted all testcases I could find.
>
> The CFG build inside the frontend is a bit awkward (spread out...),
> likewise some functionality is asking for split-out.
>
> This is mainly a RFC for whether you think this is forward looking
> enough.  Future enhancements would include EH and abnormal edges
> via stmt annotations:
Thanks very much for doing this.  Do we have a centralized
wiki/document on the formal definition?  It would be very helpful even
it's still evolving, otherwise we would need to dig into messages for
fragmented clues.

Thanks,
bin
>
> __BB(2):
>   foo () goto __BB3 __EDGE_EH; // EH edge to BB3
>
> __BB(3):
>   setjmp () goto __BB4 __EDGE_AB; // abnormal edge to BB4
>
> __BB(4):
>   _1 = _5 / 0. __NOTHROW; // gimple_nothrow_p is true
>   _3 = _2 / _4 goto __BB5 __EDGE_EH; // EH edge to BB5
>
> etc.  extra edge flags:
>
>   goto __BB5 __EDGE_EXECUTABLE;
>
> I guess now is the time for bike-shedding.
>
> Richard.
>
> 2019-03-12  Richard Biener  
>
> c/
> * c-tree.h (enum c_declspec_il): New.
> (struct c_declspecs): Merge gimple_p and rtl_p into declspec_il
> enum bitfield.
> * c-parser.c (c_parser_declaration_or_fndef): Adjust accordingly.
> Pass start pass and declspec_il to c_parser_parse_gimple_body.
> (c_parser_declspecs): Adjust.
> * gimple-parser.c: Include cfg.h, cfghooks.h, cfganal.h, tree-cfg.h,
> gimple-iterator.h, cfgloop.h, tree-phinodes.h, tree-into-ssa.h
> and bitmap.h.
> (struct gimple_parser_edge): New.
> (edges, current_bb): New globals.
> (c_parser_gimple_parse_bb_spec): New helper.
> (c_parser_parse_gimple_body): Get start pass and IL specification.
> Initialize SSA and CFG.
> (c_parser_gimple_compound_statement): Handle CFG build.
> (c_parser_gimple_statement): Change intermittend __PHI internal
> function arg.
> (c_parser_gimple_or_rtl_pass_list): Handle ssa, cfg flags.
> (c_parser_gimple_goto_stmt): Record edges to build.
> (c_parser_gimple_if_stmt): Likewise.
> * gimple-parser.h (c_parser_parse_gimple_body): Adjust.
> (c_parser_gimple_or_rtl_pass_list): Likewise.
>
> * gcc.dg/gimplefe-13.c: Adjust.
> ...
>
> Index: gcc/c/c-parser.c
> ===
> --- gcc/c/c-parser.c(revision 269604)
> +++ gcc/c/c-parser.c(working copy)
> @@ -2324,19 +2324,9 @@ c_parser_declaration_or_fndef (c_parser
>DECL_STRUCT_FUNCTION (current_function_decl)->function_start_locus
> = c_parser_peek_token (parser)->location;
>
> -  /* If the definition was marked with __GIMPLE then parse the
> - function body as GIMPLE.  */
> -  if (specs->gimple_p)
> -   {
> - cfun->pass_startwith = specs->gimple_or_rtl_pass;
> - bool saved = in_late_binary_op;
> - in_late_binary_op = true;
> - c_parser_parse_gimple_body (parser);
> - in_late_binary_op = saved;
> -   }
> -  /* Similarly, if it was marked with __RTL, use the RTL parser now,
> +  /* If the definition was marked with __RTL, use the RTL parser now,
>  consuming the function body.  */
> -  else if (specs->rtl_p)
> +  if (specs->declspec_il == cdil_rtl)
> {
>   c_parser_parse_rtl_body (parser, specs->gimple_or_rtl_pass);
>
> @@ -2350,6 +2340,16 @@ c_parser_declaration_or_fndef (c_parser
>   finish_function ();
>   return;
> }
> +  /* If the definition was marked with __GIMPLE then parse the
> + function body as GIMPLE.  */
> +  else if (specs->declspec_il != cdil_none)
> +   {
> + bool saved = in_late_binary_op;
> + in_late_binary_op = true;
> + c_parser_parse_gimple_body (parser,

76 matches

Mail list logo