Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Andreas Schwab
On May 05 2017, Thomas Koenig wrote:

> @@ -227,6 +226,17 @@ sinclude(`matmul_asm_'rtype_code`.m4')dnl
>if (m == 0 || n == 0 || k == 0)
>   return;
>  
> +  /* Adjust size of t1 to what is needed.  */
> +  index_type t1_dim;
> +  t1_dim = (a_dim1-1) * 256 + b_dim1;
> +  if (t1_dim > 65536)
> + t1_dim = 65536;
> +
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wvla"
> +  'rtype_name` t1[t1_dim]; /* was [256][256] */

That does the wrong thing if b_dim1 == 0xDEADBEEF.

(gdb) p (a_dim1-1) * 256 + b_dim1
$2 = -764456190

Andreas.
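
To illustrate the objection, here is a minimal sketch of why an upper-bound-only clamp does not protect a VLA size against a garbage dimension. The names index_type, clamp_upper_only and clamp_both are made up for this sketch, and the fallback in clamp_both is only one possible guard (an assumption, not the fix that went into libgfortran).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for libgfortran's index_type.
using index_type = std::int64_t;

// The size computation from the quoted hunk: only the upper bound is checked,
// so a negative result passes through unchanged.
index_type clamp_upper_only(index_type a_dim1, index_type b_dim1) {
  index_type t1_dim = (a_dim1 - 1) * 256 + b_dim1;
  if (t1_dim > 65536)
    t1_dim = 65536;
  return t1_dim;
}

// One possible guard (assumption): fall back to the old fixed size
// (256 * 256) whenever the estimate is not a usable positive count.
index_type clamp_both(index_type a_dim1, index_type b_dim1) {
  index_type t1_dim = (a_dim1 - 1) * 256 + b_dim1;
  if (t1_dim <= 0 || t1_dim > 65536)
    t1_dim = 65536;
  return t1_dim;
}
```

With a_dim1 == 1 and b_dim1 == -559038737 (0xDEADBEEF read as a 32-bit signed value), the first function returns a negative size, which is not a valid array bound.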

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: avoid remove&reinsert of call when splitting block for inlining

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:12 PM, Alexandre Oliva  wrote:
> We used to split the inlined-into block at (= after) the call, and then
> remove the call from the first block to insert it in the second.
>
> The removal may cause unnecessary and unrecoverable resetting of debug
> insns: we do not generate debug temps for calls.
>
> Avoid the remove-and-reinsert dance by splitting the block before the
> call.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

Ok if you included Ada in bootstrap / testing.

Thanks,
Richard.
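
For readers skimming the patch below, a toy model of the idea, not GCC code: a std::list of strings stands in for the statements of a basic block, and split_before is a hypothetical helper. Splitting before the call relinks the existing nodes, so the call statement is never removed and reinserted.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block: an ordered list of statements.
using block = std::list<std::string>;

// Split BB before the statement at CALL.  Everything from CALL onward is
// moved to the returned block.  std::list::splice relinks nodes in place,
// so the call statement keeps its identity (same node, same address).
block split_before(block &bb, block::iterator call) {
  block tail;
  tail.splice(tail.begin(), bb, call, bb.end());
  return tail;
}
```

In the real patch the equivalent step is split_block on the statement preceding the call (or NULL when the call is first in the block).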

> for  gcc/ChangeLog
>
> * tree-inline.c (expand_call_inline): Split block at stmt
> before the call.
>
> for  gcc/testsuite/ChangeLog
>
> * gcc.dg/guality/inline-params-2.c: New.
> ---
>  gcc/testsuite/gcc.dg/guality/inline-params-2.c |   38 
>  gcc/tree-inline.c                              |   25 
>  2 files changed, 44 insertions(+), 19 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/guality/inline-params-2.c
>
> diff --git a/gcc/testsuite/gcc.dg/guality/inline-params-2.c b/gcc/testsuite/gcc.dg/guality/inline-params-2.c
> new file mode 100644
> index 000..e00188ca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/guality/inline-params-2.c
> @@ -0,0 +1,38 @@
> +/* { dg-do run } */
> +/* tree inline used to split the block for inlining after the call,
> +   then move the call to the after-the-call block.  This move
> +   temporarily deletes the assignment to the result, which in turn
> +   resets any debug bind stmts referencing the result.  Make sure we
> +   don't do that, verifying that the result is visible after the call,
> +   and when passed to another inline function.  */
> +/* { dg-options "-g" } */
> +/* { dg-xfail-run-if "" { "*-*-*" } { "-fno-fat-lto-objects" } } */
> +
> +#define GUALITY_DONT_FORCE_LIVE_AFTER -1
> +
> +#ifndef STATIC_INLINE
> +#define STATIC_INLINE /*static*/
> +#endif
> +
> +
> +#include "guality.h"
> +
> +__attribute__ ((always_inline)) static inline int
> +t1 (int i)
> +{
> +  GUALCHKVAL (i);
> +  return i;
> +}
> +__attribute__ ((always_inline)) static inline int
> +t2 (int i)
> +{
> +  GUALCHKVAL (i);
> +  return i - 42;
> +}
> +int
> +main (int argc, char *argv[])
> +{
> +  int i = t1(42);
> +  GUALCHKVAL (i);
> +  return t2(i);
> +}
> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> index bfaaede..db3e08f 100644
> --- a/gcc/tree-inline.c
> +++ b/gcc/tree-inline.c
> @@ -4542,33 +4542,20 @@ expand_call_inline (basic_block bb, gimple *stmt, copy_body_data *id)
>  DECL_FUNCTION_PERSONALITY (cg_edge->caller->decl)
>= DECL_FUNCTION_PERSONALITY (cg_edge->callee->decl);
>
> -  /* Split the block holding the GIMPLE_CALL.  */
> -  e = split_block (bb, stmt);
> +  /* Split the block before the GIMPLE_CALL.  */
> +  stmt_gsi = gsi_for_stmt (stmt);
> +  gsi_prev (&stmt_gsi);
> +  e = split_block (bb, gsi_end_p (stmt_gsi) ? NULL : gsi_stmt (stmt_gsi));
>bb = e->src;
>return_block = e->dest;
>remove_edge (e);
>
> -  /* split_block splits after the statement; work around this by
> - moving the call into the second block manually.  Not pretty,
> - but seems easier than doing the CFG manipulation by hand
> - when the GIMPLE_CALL is in the last statement of BB.  */
> -  stmt_gsi = gsi_last_bb (bb);
> -  gsi_remove (&stmt_gsi, false);
> -
>/* If the GIMPLE_CALL was in the last statement of BB, it may have
>   been the source of abnormal edges.  In this case, schedule
>   the removal of dead abnormal edges.  */
>gsi = gsi_start_bb (return_block);
> -  if (gsi_end_p (gsi))
> -{
> -  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
> -  purge_dead_abnormal_edges = true;
> -}
> -  else
> -{
> -  gsi_insert_before (&gsi, stmt, GSI_NEW_STMT);
> -  purge_dead_abnormal_edges = false;
> -}
> +  gsi_next (&gsi);
> +  purge_dead_abnormal_edges = gsi_end_p (gsi);
>
>stmt_gsi = gsi_start_bb (return_block);
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH 01/13] improve safety of freeing bitmaps

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> There are two groups of changes here.  First, taking an sbitmap &, so
> that we can assign null to the pointer after freeing the sbitmap to
> prevent use after free through that pointer.  Second, we define overloads
> of sbitmap_free and bitmap_free taking auto_sbitmap and auto_bitmap
> respectively, so that you can't double free the bitmap owned by an
> auto_{s,}bitmap.

Looks good - but what do you need the void *& overload for?!  That at least
needs a comment.

Richard.
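
A minimal sketch of the two ideas in the description above, with made-up names (free_and_null, owned_buffer). Note one deliberate swap: the patch declares the owning-type overload and leaves it unimplemented, which fails at link time, while this sketch uses C++11 "= delete" so the misuse is rejected at compile time.

```cpp
#include <cassert>
#include <cstdlib>

class owned_buffer;  // hypothetical owning wrapper, only declared here

// Taking the pointer by reference lets the function null it after freeing,
// so a later use through the same variable is a null dereference rather
// than a use after free.
inline void free_and_null(int *&p) {
  std::free(p);
  p = nullptr;
}

// Reject attempts to free something an owning wrapper will free itself.
void free_and_null(owned_buffer &) = delete;
```

Because free (NULL) is a no-op, calling free_and_null twice on the same variable is harmless in this sketch.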

> gcc/ChangeLog:
>
> 2017-05-09  Trevor Saunders  
>
> * bitmap.h (BITMAP_FREE): Convert from macro to inline function
> and add overloaded decl for auto_bitmap.
> * sbitmap.h (inline void sbitmap_free): Add overload for
> auto_sbitmap, and change sbitmap to point to null.
> ---
>  gcc/bitmap.h  | 21 +++--
>  gcc/sbitmap.h |  7 ++-
>  2 files changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/bitmap.h b/gcc/bitmap.h
> index f158b447357..7508239cff9 100644
> --- a/gcc/bitmap.h
> +++ b/gcc/bitmap.h
> @@ -129,6 +129,8 @@ along with GCC; see the file COPYING3.  If not see
>
>  #include "obstack.h"
>
> +   class auto_bitmap;
> +
>  /* Bitmap memory usage.  */
>  struct bitmap_usage: public mem_usage
>  {
> @@ -372,8 +374,23 @@ extern hashval_t bitmap_hash (const_bitmap);
>  #define BITMAP_GGC_ALLOC() bitmap_gc_alloc ()
>
>  /* Do any cleanup needed on a bitmap when it is no longer used.  */
> -#define BITMAP_FREE(BITMAP) \
> -   ((void) (bitmap_obstack_free ((bitmap) BITMAP), (BITMAP) = (bitmap) NULL))
> +inline void
> +BITMAP_FREE (bitmap &b)
> +{
> +  bitmap_obstack_free ((bitmap) b);
> +  b = NULL;
> +}
> +
> +inline void
> +BITMAP_FREE (void *&b)
> +{
> +  bitmap_obstack_free ((bitmap) b);
> +  b = NULL;
> +}
> +
> +/* Intentionally unimplemented to ensure it is never called with an
> +   auto_bitmap argument.  */
> +void BITMAP_FREE (auto_bitmap);
>
>  /* Iterator for bitmaps.  */
>
> diff --git a/gcc/sbitmap.h b/gcc/sbitmap.h
> index ce4d27d927c..cba0452cdb9 100644
> --- a/gcc/sbitmap.h
> +++ b/gcc/sbitmap.h
> @@ -82,6 +82,8 @@ along with GCC; see the file COPYING3.  If not see
>  #define SBITMAP_ELT_BITS (HOST_BITS_PER_WIDEST_FAST_INT * 1u)
>  #define SBITMAP_ELT_TYPE unsigned HOST_WIDEST_FAST_INT
>
> +class auto_sbitmap;
> +
>  struct simple_bitmap_def
>  {
>unsigned int n_bits; /* Number of bits.  */
> @@ -208,11 +210,14 @@ bmp_iter_next (sbitmap_iterator *i, unsigned *bit_no ATTRIBUTE_UNUSED)
> bmp_iter_next (&(ITER), &(BITNUM)))
>  #endif
>
> -inline void sbitmap_free (sbitmap map)
> +inline void sbitmap_free (sbitmap &map)
>  {
>free (map);
> +  map = NULL;
>  }
>
> +void sbitmap_free (auto_sbitmap);
> +
>  inline void sbitmap_vector_free (sbitmap * vec)
>  {
>free (vec);
> --
> 2.11.0
>


Re: [PATCH 03/13] store the bitmap_head within the auto_bitmap

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> This gets rid of one allocation per bitmap.  Often the bitmap_head is
> now on the stack; when it isn't, it's part of some other struct on the
> heap instead of being referred to by that struct.  On 64 bit platforms
> this will increase the size of such structs by 24 bytes, but it's an
> overall win since we don't need an 8 byte pointer pointing at the
> bitmap_head.  Given that the auto_bitmap owns the bitmap_head anyway we
> know there would never be a place where two auto_bitmaps would refer to
> the same bitmap_head object.

Ok.

Richard.
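
A rough sketch of the layout change described above. The head struct and its fields are hypothetical (they are not copied from bitmap.h), and auto_head only mimics the shape of auto_bitmap: the head lives inside the wrapper and the conversion operator hands out its address.

```cpp
#include <cassert>

// Hypothetical head structure; field names and sizes are assumptions.
struct head {
  void *first;
  void *current;
  unsigned indx;
  void *obstack;
};

// Wrapper that stores the head by value instead of pointing at a
// separately allocated one.
class auto_head {
 public:
  auto_head() : m_bits{} {}
  auto_head(const auto_head &) = delete;
  auto_head &operator=(const auto_head &) = delete;
  // Allow passing the wrapper to functions that take a head *.
  operator head *() { return &m_bits; }

 private:
  head m_bits;
};

inline void set_index(head *h, unsigned v) { h->indx = v; }
inline unsigned get_index(head *h) { return h->indx; }
```

On a typical LP64 target this hypothetical head is 32 bytes, so embedding it costs 24 bytes more than an 8 byte pointer while saving the separate allocation, which matches the trade-off stated in the description.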

> gcc/ChangeLog:
>
> 2017-05-07  Trevor Saunders  
>
> * bitmap.h (class auto_bitmap): Change type of m_bits to
> bitmap_head, and adjust ctor / dtor and member operators.
> ---
>  gcc/bitmap.h | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/bitmap.h b/gcc/bitmap.h
> index 7508239cff9..49aec001cb0 100644
> --- a/gcc/bitmap.h
> +++ b/gcc/bitmap.h
> @@ -823,10 +823,10 @@ bmp_iter_and_compl (bitmap_iterator *bi, unsigned *bit_no)
>  class auto_bitmap
>  {
>   public:
> -  auto_bitmap () { bits = BITMAP_ALLOC (NULL); }
> -  ~auto_bitmap () { BITMAP_FREE (bits); }
> +  auto_bitmap () { bitmap_initialize (&m_bits, &bitmap_default_obstack); }
> +  ~auto_bitmap () { bitmap_clear (&m_bits); }
>// Allow calling bitmap functions on our bitmap.
> -  operator bitmap () { return bits; }
> +  operator bitmap () { return &m_bits; }
>
>   private:
>// Prevent making a copy that references our bitmap.
> @@ -837,7 +837,7 @@ class auto_bitmap
>auto_bitmap &operator = (auto_bitmap &&);
>  #endif
>
> -  bitmap bits;
> +  bitmap_head m_bits;
>  };
>
>  #endif /* GCC_BITMAP_H */
> --
> 2.11.0
>


Re: [PATCH 06/13] replace some manual stacks with auto_vec

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:

Ok.

Richard.
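
As an illustration of the pattern the ChangeLog below refers to, here is a small iterative post-order walk where a container replaces the hand-managed array plus stack pointer. std::vector stands in for auto_vec, and post_order with its adjacency-list argument is made up for this sketch; it is not the code in cfganal.c.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Iterative DFS post-order over an adjacency list.  Each stack entry is
// (node, index of the next successor to look at), playing the role of the
// edge_iterator entries in the original code.
std::vector<int> post_order(const std::vector<std::vector<int>> &succs,
                            int entry) {
  std::vector<int> order;
  std::vector<bool> visited(succs.size(), false);
  std::vector<std::pair<int, std::size_t>> stack;
  stack.reserve(succs.size());  // like passing a size hint to auto_vec

  visited[entry] = true;
  stack.emplace_back(entry, 0);
  while (!stack.empty()) {
    int node = stack.back().first;
    std::size_t next = stack.back().second;
    if (next < succs[node].size()) {
      // Advance the top entry first, then maybe push the successor.
      stack.back().second = next + 1;
      int dest = succs[node][next];
      if (!visited[dest]) {
        visited[dest] = true;
        stack.emplace_back(dest, 0);
      }
    } else {
      // All successors handled: emit the node and pop it.
      order.push_back(node);
      stack.pop_back();
    }
  }
  return order;
}
```

The stack storage is released when the function returns, so there is no counterpart to the removed free (stack) call.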

> 2017-05-09  Trevor Saunders  
>
> * cfganal.c (mark_dfs_back_edges): Replace manual stack with
> auto_vec.
> (post_order_compute): Likewise.
> (inverted_post_order_compute): Likewise.
> (pre_and_rev_post_order_compute_fn): Likewise.
> ---
>  gcc/cfganal.c | 92 
> +++
>  1 file changed, 36 insertions(+), 56 deletions(-)
>
> diff --git a/gcc/cfganal.c b/gcc/cfganal.c
> index 7377a7a0434..1b01564e8c7 100644
> --- a/gcc/cfganal.c
> +++ b/gcc/cfganal.c
> @@ -61,10 +61,8 @@ static void flow_dfs_compute_reverse_finish (depth_first_search_ds *);
>  bool
>  mark_dfs_back_edges (void)
>  {
> -  edge_iterator *stack;
>int *pre;
>int *post;
> -  int sp;
>int prenum = 1;
>int postnum = 1;
>bool found = false;
> @@ -74,8 +72,7 @@ mark_dfs_back_edges (void)
>post = XCNEWVEC (int, last_basic_block_for_fn (cfun));
>
>/* Allocate stack for back-tracking up CFG.  */
> -  stack = XNEWVEC (edge_iterator, n_basic_blocks_for_fn (cfun) + 1);
> -  sp = 0;
> +  auto_vec stack (n_basic_blocks_for_fn (cfun) + 1);
>
>/* Allocate bitmap to track nodes that have been visited.  */
>auto_sbitmap visited (last_basic_block_for_fn (cfun));
> @@ -84,16 +81,15 @@ mark_dfs_back_edges (void)
>bitmap_clear (visited);
>
>/* Push the first edge on to the stack.  */
> -  stack[sp++] = ei_start (ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs);
> +  stack.quick_push (ei_start (ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs));
>
> -  while (sp)
> +  while (!stack.is_empty ())
>  {
> -  edge_iterator ei;
>basic_block src;
>basic_block dest;
>
>/* Look at the edge on the top of the stack.  */
> -  ei = stack[sp - 1];
> +  edge_iterator ei = stack.last ();
>src = ei_edge (ei)->src;
>dest = ei_edge (ei)->dest;
>ei_edge (ei)->flags &= ~EDGE_DFS_BACK;
> @@ -110,7 +106,7 @@ mark_dfs_back_edges (void)
> {
>   /* Since the DEST node has been visited for the first
>  time, check its successors.  */
> - stack[sp++] = ei_start (dest->succs);
> + stack.quick_push (ei_start (dest->succs));
> }
>   else
> post[dest->index] = postnum++;
> @@ -128,15 +124,14 @@ mark_dfs_back_edges (void)
> post[src->index] = postnum++;
>
>   if (!ei_one_before_end_p (ei))
> -   ei_next (&stack[sp - 1]);
> +   ei_next (&stack.last ());
>   else
> -   sp--;
> +   stack.pop ();
> }
>  }
>
>free (pre);
>free (post);
> -  free (stack);
>
>return found;
>  }
> @@ -637,8 +632,6 @@ int
>  post_order_compute (int *post_order, bool include_entry_exit,
> bool delete_unreachable)
>  {
> -  edge_iterator *stack;
> -  int sp;
>int post_order_num = 0;
>int count;
>
> @@ -646,8 +639,7 @@ post_order_compute (int *post_order, bool include_entry_exit,
>  post_order[post_order_num++] = EXIT_BLOCK;
>
>/* Allocate stack for back-tracking up CFG.  */
> -  stack = XNEWVEC (edge_iterator, n_basic_blocks_for_fn (cfun) + 1);
> -  sp = 0;
> +  auto_vec stack (n_basic_blocks_for_fn (cfun) + 1);
>
>/* Allocate bitmap to track nodes that have been visited.  */
>auto_sbitmap visited (last_basic_block_for_fn (cfun));
> @@ -656,16 +648,15 @@ post_order_compute (int *post_order, bool include_entry_exit,
>bitmap_clear (visited);
>
>/* Push the first edge on to the stack.  */
> -  stack[sp++] = ei_start (ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs);
> +  stack.quick_push (ei_start (ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs));
>
> -  while (sp)
> +  while (!stack.is_empty ())
>  {
> -  edge_iterator ei;
>basic_block src;
>basic_block dest;
>
>/* Look at the edge on the top of the stack.  */
> -  ei = stack[sp - 1];
> +  edge_iterator ei = stack.last ();
>src = ei_edge (ei)->src;
>dest = ei_edge (ei)->dest;
>
> @@ -679,7 +670,7 @@ post_order_compute (int *post_order, bool include_entry_exit,
>   if (EDGE_COUNT (dest->succs) > 0)
> /* Since the DEST node has been visited for the first
>time, check its successors.  */
> -   stack[sp++] = ei_start (dest->succs);
> +   stack.quick_push (ei_start (dest->succs));
>   else
> post_order[post_order_num++] = dest->index;
> }
> @@ -690,9 +681,9 @@ post_order_compute (int *post_order, bool include_entry_exit,
> post_order[post_order_num++] = src->index;
>
>   if (!ei_one_before_end_p (ei))
> -   ei_next (&stack[sp - 1]);
> +   ei_next (&stack.last ());
>   else
> -   sp--;
> +   stack.pop ();
> }
>  }
>
> @@ -722,7 +713,6 @@ post_order_compute (int *post_order, bool 
>

Re: [PATCH] prevent -Wno-system-headers from suppressing -Wstringop-overflow (PR 79214)

2017-05-10 Thread Rainer Orth
Hi Martin,

>> The testcase also regresses on sparc-sun-solaris2.12:
>
> The test (intentionally) does bad things that trigger the new
> warning.  The failures were reported last week in bug 80643 along
> with a bunch of others.  The latter were also fixed last week but
> I overlooked these.  I pruned the new diagnostics this morning
> to avoid the excess failures.  The test should pass again on all
> targets.  Thank you for your patience.

it turned out my last bootstrap was just one revision before your fix.

Thanks.
Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH 08/13] move several bitmaps from gc memory to the default obstack and use auto_bitmap

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> These places were probably trying to use the default bitmap obstack,
> but passing 0 to bitmap_initialize actually uses gc allocation.  In any
> case they are all cleaned up before going out of scope so using
> auto_bitmap should be fine.

Ok.

Richard.
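
A small sketch of why the explicit clear calls can go away in functions with early returns such as estimate_shadow_tick. scoped_scratch and live_scratch are hypothetical names; the counter only stands in for "resources currently held".

```cpp
#include <cassert>

// Hypothetical counter standing in for the number of live scratch bitmaps.
static int live_scratch = 0;

// RAII holder: acquires in the constructor, releases in the destructor.
class scoped_scratch {
 public:
  scoped_scratch() { ++live_scratch; }
  ~scoped_scratch() { --live_scratch; }
  scoped_scratch(const scoped_scratch &) = delete;
  scoped_scratch &operator=(const scoped_scratch &) = delete;
};

// Both return paths release the scratch object without explicit cleanup.
int estimate(bool cutoff) {
  scoped_scratch processed;
  if (cutoff)
    return -1;  // early return: the destructor still runs
  return 42;
}
```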

> gcc/ChangeLog:
>
> 2017-05-09  Trevor Saunders  
>
> * haifa-sched.c (estimate_shadow_tick): Replace manual bitmap
> management with auto_bitmap.
> (fix_inter_tick): Likewise.
> (fix_recovery_deps): Likewise.
> * ira.c (add_store_equivs): Likewise.
> (find_moveable_pseudos): Likewise.
> (split_live_ranges_for_shrink_wrap): Likewise.
> * print-rtl.c (rtx_reuse_manager::rtx_reuse_manager): Likewise.
> (rtx_reuse_manager::seen_def_p): Likewise.
> (rtx_reuse_manager::set_seen_def): Likewise.
> * print-rtl.h (class rtx_reuse_manager): Likewise.
> ---
>  gcc/haifa-sched.c | 23 +--
>  gcc/ira.c | 84 
> +++
>  gcc/print-rtl.c   |  5 ++--
>  gcc/print-rtl.h   |  2 +-
>  4 files changed, 38 insertions(+), 76 deletions(-)
>
> diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
> index 0ebf110471c..1fcc01d04ae 100644
> --- a/gcc/haifa-sched.c
> +++ b/gcc/haifa-sched.c
> @@ -4843,14 +4843,12 @@ estimate_insn_tick (bitmap processed, rtx_insn *insn, int budget)
>  static int
>  estimate_shadow_tick (struct delay_pair *p)
>  {
> -  bitmap_head processed;
> +  auto_bitmap processed;
>int t;
>bool cutoff;
> -  bitmap_initialize (&processed, 0);
>
> -  cutoff = !estimate_insn_tick (&processed, p->i2,
> +  cutoff = !estimate_insn_tick (processed, p->i2,
> max_insn_queue_index + pair_delay (p));
> -  bitmap_clear (&processed);
>if (cutoff)
>  return max_insn_queue_index;
>t = INSN_TICK_ESTIMATE (p->i2) - (clock_var + pair_delay (p) + 1);
> @@ -7515,15 +7513,13 @@ static void
>  fix_inter_tick (rtx_insn *head, rtx_insn *tail)
>  {
>/* Set of instructions with corrected INSN_TICK.  */
> -  bitmap_head processed;
> +  auto_bitmap processed;
>/* ??? It is doubtful if we should assume that cycle advance happens on
>   basic block boundaries.  Basically insns that are unconditionally ready
>   on the start of the block are more preferable then those which have
>   a one cycle dependency over insn from the previous block.  */
>int next_clock = clock_var + 1;
>
> -  bitmap_initialize (&processed, 0);
> -
>/* Iterates over scheduled instructions and fix their INSN_TICKs and
>   INSN_TICKs of dependent instructions, so that INSN_TICKs are consistent
>   across different blocks.  */
> @@ -7539,7 +7535,7 @@ fix_inter_tick (rtx_insn *head, rtx_insn *tail)
>   gcc_assert (tick >= MIN_TICK);
>
>   /* Fix INSN_TICK of instruction from just scheduled block.  */
> - if (bitmap_set_bit (&processed, INSN_LUID (head)))
> + if (bitmap_set_bit (processed, INSN_LUID (head)))
> {
>   tick -= next_clock;
>
> @@ -7563,7 +7559,7 @@ fix_inter_tick (rtx_insn *head, rtx_insn *tail)
>   /* If NEXT has its INSN_TICK calculated, fix it.
>  If not - it will be properly calculated from
>  scratch later in fix_tick_ready.  */
> - && bitmap_set_bit (&processed, INSN_LUID (next)))
> + && bitmap_set_bit (processed, INSN_LUID (next)))
> {
>   tick -= next_clock;
>
> @@ -7580,7 +7576,6 @@ fix_inter_tick (rtx_insn *head, rtx_insn *tail)
> }
> }
>  }
> -  bitmap_clear (&processed);
>  }
>
>  /* Check if NEXT is ready to be added to the ready or queue list.
> @@ -8617,9 +8612,7 @@ fix_recovery_deps (basic_block rec)
>  {
>rtx_insn *note, *insn, *jump;
>auto_vec ready_list;
> -  bitmap_head in_ready;
> -
> -  bitmap_initialize (&in_ready, 0);
> +  auto_bitmap in_ready;
>
>/* NOTE - a basic block note.  */
>note = NEXT_INSN (BB_HEAD (rec));
> @@ -8642,7 +8635,7 @@ fix_recovery_deps (basic_block rec)
> {
>   sd_delete_dep (sd_it);
>
> - if (bitmap_set_bit (&in_ready, INSN_LUID (consumer)))
> + if (bitmap_set_bit (in_ready, INSN_LUID (consumer)))
> ready_list.safe_push (consumer);
> }
>   else
> @@ -8657,8 +8650,6 @@ fix_recovery_deps (basic_block rec)
>  }
>while (insn != note);
>
> -  bitmap_clear (&in_ready);
> -
>/* Try to add instructions to the ready or queue list.  */
>unsigned int i;
>rtx_insn *temp;
> diff --git a/gcc/ira.c b/gcc/ira.c
> index c9751ce81ba..36a779bd37f 100644
> --- a/gcc/ira.c
> +++ b/gcc/ira.c
> @@ -3635,16 +3635,15 @@ update_equiv_regs (void)
>  static void
>  add_store_equivs (void)
>  {
> -  bitmap_head seen_insns;
> +  auto_bitmap seen_insns;
>
> -  bitmap_initial

Re: [PATCH 10/13] make a member an auto_sbitmap

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:

Ok.

> 2017-05-09  Trevor Saunders  
>
> * tree-ssa-dse.c (dse_dom_walker): Make m_live_bytes an
> auto_sbitmap.
> ---
>  gcc/tree-ssa-dse.c | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
> index 90230abe822..3ebc19948e1 100644
> --- a/gcc/tree-ssa-dse.c
> +++ b/gcc/tree-ssa-dse.c
> @@ -601,16 +601,14 @@ class dse_dom_walker : public dom_walker
>  {
>  public:
>dse_dom_walker (cdi_direction direction)
> -: dom_walker (direction), m_byte_tracking_enabled (false)
> -
> -  { m_live_bytes = sbitmap_alloc (PARAM_VALUE (PARAM_DSE_MAX_OBJECT_SIZE)); }
> -
> -  ~dse_dom_walker () { sbitmap_free (m_live_bytes); }
> +: dom_walker (direction),
> +m_live_bytes (PARAM_VALUE (PARAM_DSE_MAX_OBJECT_SIZE)),
> +m_byte_tracking_enabled (false) {}
>
>virtual edge before_dom_children (basic_block);
>
>  private:
> -  sbitmap m_live_bytes;
> +  auto_sbitmap m_live_bytes;
>bool m_byte_tracking_enabled;
>void dse_optimize_stmt (gimple_stmt_iterator *);
>  };
> --
> 2.11.0
>


Re: [PATCH 11/13] make more vars auto_sbitmaps

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:

Ok.
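
One detail of the hunks below, sketched with made-up names: once the temporaries clean up after themselves, the result no longer needs to be parked in a local (answer, result) before the frees, because the return expression is evaluated before the locals are destroyed. std::vector<bool> stands in for auto_sbitmap, and count_common assumes both inputs have the same length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Count positions set in both inputs.  TMP is released automatically after
// the return value has been computed from it.
int count_common(const std::vector<bool> &from, const std::vector<bool> &to) {
  std::vector<bool> tmp(from.size());
  for (std::size_t i = 0; i < from.size(); ++i)
    tmp[i] = from[i] && to[i];
  return static_cast<int>(std::count(tmp.begin(), tmp.end(), true));
}
```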

> 2017-05-09  Trevor Saunders  
>
> * ddg.c (find_nodes_on_paths): Use auto_sbitmap.
> (longest_simple_path): Likewise.
> * shrink-wrap.c (spread_components): Likewise.
> (disqualify_problematic_components): Likewise.
> (emit_common_heads_for_components): Likewise.
> (emit_common_tails_for_components): Likewise.
> (insert_prologue_epilogue_for_components): Likewise.
> ---
>  gcc/ddg.c | 26 --
>  gcc/shrink-wrap.c | 38 +++---
>  2 files changed, 19 insertions(+), 45 deletions(-)
>
> diff --git a/gcc/ddg.c b/gcc/ddg.c
> index 9ea98d6f40f..8aaed80dec4 100644
> --- a/gcc/ddg.c
> +++ b/gcc/ddg.c
> @@ -1081,16 +1081,15 @@ free_ddg_all_sccs (ddg_all_sccs_ptr all_sccs)
>  int
>  find_nodes_on_paths (sbitmap result, ddg_ptr g, sbitmap from, sbitmap to)
>  {
> -  int answer;
>int change;
>unsigned int u = 0;
>int num_nodes = g->num_nodes;
>sbitmap_iterator sbi;
>
> -  sbitmap workset = sbitmap_alloc (num_nodes);
> -  sbitmap reachable_from = sbitmap_alloc (num_nodes);
> -  sbitmap reach_to = sbitmap_alloc (num_nodes);
> -  sbitmap tmp = sbitmap_alloc (num_nodes);
> +  auto_sbitmap workset (num_nodes);
> +  auto_sbitmap reachable_from (num_nodes);
> +  auto_sbitmap reach_to (num_nodes);
> +  auto_sbitmap tmp (num_nodes);
>
>bitmap_copy (reachable_from, from);
>bitmap_copy (tmp, from);
> @@ -1150,12 +1149,7 @@ find_nodes_on_paths (sbitmap result, ddg_ptr g, sbitmap from, sbitmap to)
> }
>  }
>
> -  answer = bitmap_and (result, reachable_from, reach_to);
> -  sbitmap_free (workset);
> -  sbitmap_free (reachable_from);
> -  sbitmap_free (reach_to);
> -  sbitmap_free (tmp);
> -  return answer;
> +  return bitmap_and (result, reachable_from, reach_to);
>  }
>
>
> @@ -1195,10 +1189,9 @@ longest_simple_path (struct ddg * g, int src, int dest, sbitmap nodes)
>int i;
>unsigned int u = 0;
>int change = 1;
> -  int result;
>int num_nodes = g->num_nodes;
> -  sbitmap workset = sbitmap_alloc (num_nodes);
> -  sbitmap tmp = sbitmap_alloc (num_nodes);
> +  auto_sbitmap workset (num_nodes);
> +  auto_sbitmap tmp (num_nodes);
>
>
>/* Data will hold the distance of the longest path found so far from
> @@ -1224,10 +1217,7 @@ longest_simple_path (struct ddg * g, int src, int dest, sbitmap nodes)
>   change |= update_dist_to_successors (u_node, nodes, tmp);
> }
>  }
> -  result = g->nodes[dest].aux.count;
> -  sbitmap_free (workset);
> -  sbitmap_free (tmp);
> -  return result;
> +  return g->nodes[dest].aux.count;
>  }
>
>  #endif /* INSN_SCHEDULING */
> diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
> index 492376d949b..1ac4ea3b054 100644
> --- a/gcc/shrink-wrap.c
> +++ b/gcc/shrink-wrap.c
> @@ -1264,7 +1264,7 @@ spread_components (sbitmap components)
>todo.create (n_basic_blocks_for_fn (cfun));
>auto_bitmap seen;
>
> -  sbitmap old = sbitmap_alloc (SBITMAP_SIZE (components));
> +  auto_sbitmap old (SBITMAP_SIZE (components));
>
>/* Find for every block the components that are *not* needed on some path
>   from the entry to that block.  Do this with a flood fill from the entry
> @@ -1390,8 +1390,6 @@ spread_components (sbitmap components)
>   fprintf (dump_file, "\n");
> }
>  }
> -
> -  sbitmap_free (old);
>  }
>
>  /* If we cannot handle placing some component's prologues or epilogues where
> @@ -1400,8 +1398,8 @@ spread_components (sbitmap components)
>  static void
>  disqualify_problematic_components (sbitmap components)
>  {
> -  sbitmap pro = sbitmap_alloc (SBITMAP_SIZE (components));
> -  sbitmap epi = sbitmap_alloc (SBITMAP_SIZE (components));
> +  auto_sbitmap pro (SBITMAP_SIZE (components));
> +  auto_sbitmap epi (SBITMAP_SIZE (components));
>
>basic_block bb;
>FOR_EACH_BB_FN (bb, cfun)
> @@ -1466,9 +1464,6 @@ disqualify_problematic_components (sbitmap components)
> }
> }
>  }
> -
> -  sbitmap_free (pro);
> -  sbitmap_free (epi);
>  }
>
>  /* Place code for prologues and epilogues for COMPONENTS where we can put
> @@ -1476,9 +1471,9 @@ disqualify_problematic_components (sbitmap components)
>  static void
>  emit_common_heads_for_components (sbitmap components)
>  {
> -  sbitmap pro = sbitmap_alloc (SBITMAP_SIZE (components));
> -  sbitmap epi = sbitmap_alloc (SBITMAP_SIZE (components));
> -  sbitmap tmp = sbitmap_alloc (SBITMAP_SIZE (components));
> +  auto_sbitmap pro (SBITMAP_SIZE (components));
> +  auto_sbitmap epi (SBITMAP_SIZE (components));
> +  auto_sbitmap tmp (SBITMAP_SIZE (components));
>
>basic_block bb;
>FOR_ALL_BB_FN (bb, cfun)
> @@ -1554,10 +1549,6 @@ emit_common_heads_for_components (sbitmap components)
>   bitmap_ior (SW (bb)->head_components, SW (bb)->head_components, epi);
> }
>  }
> -
> -  sbitmap_free (pro);
> - 

Re: [PATCH 04/13] allow auto_bitmap to use other bitmap obstacks

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:

Ok.

> 2017-05-07  Trevor Saunders  
>
> * bitmap.h (class auto_bitmap): New constructor taking
> bitmap_obstack * argument.
> ---
>  gcc/bitmap.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/bitmap.h b/gcc/bitmap.h
> index 49aec001cb0..2ddeee6bc10 100644
> --- a/gcc/bitmap.h
> +++ b/gcc/bitmap.h
> @@ -824,6 +824,7 @@ class auto_bitmap
>  {
>   public:
>auto_bitmap () { bitmap_initialize (&m_bits, &bitmap_default_obstack); }
> +  explicit auto_bitmap (bitmap_obstack *o) { bitmap_initialize (&m_bits, o); }
>~auto_bitmap () { bitmap_clear (&m_bits); }
>// Allow calling bitmap functions on our bitmap.
>operator bitmap () { return &m_bits; }
> --
> 2.11.0
>


Re: [PATCH 07/13] use auto_bitmap more

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:

Ok.

Richard.

> 2017-05-09  Trevor Saunders  
>
> * bt-load.c (combine_btr_defs): Use auto_bitmap to manage bitmap
> lifetime.
> (migrate_btr_def): Likewise.
> * cfgloop.c (get_loop_body_in_bfs_order): Likewise.
> * df-core.c (loop_post_order_compute): Likewise.
> (loop_inverted_post_order_compute): Likewise.
> * hsa-common.h: Likewise.
> * hsa-gen.c (hsa_bb::~hsa_bb): Likewise.
> * init-regs.c (initialize_uninitialized_regs): Likewise.
> * ipa-inline.c (resolve_noninline_speculation): Likewise.
> (inline_small_functions): Likewise.
> * ipa-reference.c (ipa_reference_write_optimization_summary):
> Likewise.
> * ira.c (combine_and_move_insns): Likewise.
> (build_insn_chain): Likewise.
> * loop-invariant.c (find_invariants): Likewise.
> * lower-subreg.c (propagate_pseudo_copies): Likewise.
> * predict.c (tree_predict_by_opcode): Likewise.
> (predict_paths_leading_to): Likewise.
> (predict_paths_leading_to_edge): Likewise.
> (estimate_loops_at_level): Likewise.
> (estimate_loops): Likewise.
> * shrink-wrap.c (try_shrink_wrapping): Likewise.
> (spread_components): Likewise.
> * tree-cfg.c (remove_edge_and_dominated_blocks): Likewise.
> * tree-loop-distribution.c (rdg_build_partitions): Likewise.
> * tree-predcom.c (tree_predictive_commoning_loop): Likewise.
> * tree-ssa-coalesce.c (coalesce_ssa_name): Likewise.
> * tree-ssa-phionlycprop.c (pass_phi_only_cprop::execute):
> Likewise.
> * tree-ssa-pre.c (remove_dead_inserted_code): Likewise.
> * tree-ssa-sink.c (nearest_common_dominator_of_uses): Likewise.
> * tree-ssa-threadupdate.c (compute_path_counts): Likewise.
> (mark_threaded_blocks): Likewise.
> (thread_through_all_blocks): Likewise.
> * tree-ssa.c (verify_ssa): Likewise.
> (execute_update_addresses_taken): Likewise.
> * tree-ssanames.c (verify_ssaname_freelists): Likewise.
> ---
>  gcc/bt-load.c|  8 +++-
>  gcc/cfgloop.c|  4 +---
>  gcc/df-core.c|  8 ++--
>  gcc/hsa-common.h |  4 ++--
>  gcc/hsa-gen.c| 14 ++
>  gcc/init-regs.c  |  4 +---
>  gcc/ipa-inline.c |  6 ++
>  gcc/ipa-reference.c  |  3 +--
>  gcc/ira.c| 13 -
>  gcc/loop-invariant.c | 12 
>  gcc/lower-subreg.c   |  8 +---
>  gcc/predict.c| 19 +--
>  gcc/shrink-wrap.c| 10 +++---
>  gcc/tree-cfg.c   |  7 +--
>  gcc/tree-loop-distribution.c |  4 +---
>  gcc/tree-predcom.c   |  4 +---
>  gcc/tree-ssa-coalesce.c  |  4 +---
>  gcc/tree-ssa-phionlycprop.c  | 15 ---
>  gcc/tree-ssa-pre.c   |  4 +---
>  gcc/tree-ssa-sink.c  |  9 +++--
>  gcc/tree-ssa-threadupdate.c  | 13 +++--
>  gcc/tree-ssa.c   | 12 
>  gcc/tree-ssanames.c  | 10 +++---
>  23 files changed, 53 insertions(+), 142 deletions(-)
>
> diff --git a/gcc/bt-load.c b/gcc/bt-load.c
> index 27be6a382c4..32924e2ecc5 100644
> --- a/gcc/bt-load.c
> +++ b/gcc/bt-load.c
> @@ -1058,7 +1058,7 @@ combine_btr_defs (btr_def *def, HARD_REG_SET *btrs_live_in_range)
>  target registers live over the merged range.  */
>   int btr;
>   HARD_REG_SET combined_btrs_live;
> - bitmap combined_live_range = BITMAP_ALLOC (NULL);
> + auto_bitmap combined_live_range;
>   btr_user *user;
>
>   if (other_def->live_range == NULL)
> @@ -1116,7 +1116,6 @@ combine_btr_defs (btr_def *def, HARD_REG_SET *btrs_live_in_range)
>   delete_insn (other_def->insn);
>
> }
> - BITMAP_FREE (combined_live_range);
> }
>  }
>  }
> @@ -1255,7 +1254,6 @@ can_move_up (const_basic_block bb, const rtx_insn *insn, int n_insns)
>  static int
>  migrate_btr_def (btr_def *def, int min_cost)
>  {
> -  bitmap live_range;
>HARD_REG_SET btrs_live_in_range;
>int btr_used_near_def = 0;
>int def_basic_block_freq;
> @@ -1289,7 +1287,7 @@ migrate_btr_def (btr_def *def, int min_cost)
>  }
>
>btr_def_live_range (def, &btrs_live_in_range);
> -  live_range = BITMAP_ALLOC (NULL);
> +  auto_bitmap live_range;
>bitmap_copy (live_range, def->live_range);
>
>  #ifdef INSN_SCHEDULING
> @@ -1373,7 +1371,7 @@ migrate_btr_def (btr_def *def, int min_cost)
>if (dump_file)
> fprintf (dump_file, "failed to move\n");
>  }
> -  BITMAP_FREE (live_range);
> +
>return !give_up;
>  }
>
> diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
> index afd56bb8cf7..654d188e8b5 100644
> --- a/gcc/cfgloop.c
> +++ b/gcc/cfgloop.c
> @@ -

Re: [PATCH 12/13] make depth_first_search_ds a class

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:

Ok.

> 2017-05-09  Trevor Saunders  
>
> * cfganal.c (connect_infinite_loops_to_exit): Adjust.
> (depth_first_search::depth_first_search): Change structure init
> function to this constructor.
> (depth_first_search::add_bb): Rename function to this member.
> (depth_first_search::execute): Likewise.
> (flow_dfs_compute_reverse_finish): Adjust.
> ---
>  gcc/cfganal.c | 96 
> +--
>  1 file changed, 34 insertions(+), 62 deletions(-)
>
> diff --git a/gcc/cfganal.c b/gcc/cfganal.c
> index 1b01564e8c7..27b453ca3f7 100644
> --- a/gcc/cfganal.c
> +++ b/gcc/cfganal.c
> @@ -28,25 +28,24 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfganal.h"
>  #include "cfgloop.h"
>
> +namespace {
>  /* Store the data structures necessary for depth-first search.  */
> -struct depth_first_search_ds {
> -  /* stack for backtracking during the algorithm */
> -  basic_block *stack;
> +class depth_first_search
> +  {
> +public:
> +depth_first_search ();
> +
> +basic_block execute (basic_block);
> +void add_bb (basic_block);
>
> -  /* number of edges in the stack.  That is, positions 0, ..., sp-1
> - have edges.  */
> -  unsigned int sp;
> +private:
> +  /* stack for backtracking during the algorithm */
> +  auto_vec m_stack;
>
>/* record of basic blocks already seen by depth-first search */
> -  sbitmap visited_blocks;
> +  auto_sbitmap m_visited_blocks;
>  };
> -
> -static void flow_dfs_compute_reverse_init (depth_first_search_ds *);
> -static void flow_dfs_compute_reverse_add_bb (depth_first_search_ds *,
> -basic_block);
> -static basic_block flow_dfs_compute_reverse_execute (depth_first_search_ds *,
> -basic_block);
> -static void flow_dfs_compute_reverse_finish (depth_first_search_ds *);
> +}
>
>  /* Mark the back edges in DFS traversal.
> Return nonzero if a loop (natural or otherwise) is present.
> @@ -597,30 +596,23 @@ add_noreturn_fake_exit_edges (void)
>  void
>  connect_infinite_loops_to_exit (void)
>  {
> -  basic_block unvisited_block = EXIT_BLOCK_PTR_FOR_FN (cfun);
> -  basic_block deadend_block;
> -  depth_first_search_ds dfs_ds;
> -
>/* Perform depth-first search in the reverse graph to find nodes
>   reachable from the exit block.  */
> -  flow_dfs_compute_reverse_init (&dfs_ds);
> -  flow_dfs_compute_reverse_add_bb (&dfs_ds, EXIT_BLOCK_PTR_FOR_FN (cfun));
> +  depth_first_search dfs;
> +  dfs.add_bb (EXIT_BLOCK_PTR_FOR_FN (cfun));
>
>/* Repeatedly add fake edges, updating the unreachable nodes.  */
> +  basic_block unvisited_block = EXIT_BLOCK_PTR_FOR_FN (cfun);
>while (1)
>  {
> -  unvisited_block = flow_dfs_compute_reverse_execute (&dfs_ds,
> - unvisited_block);
> +  unvisited_block = dfs.execute (unvisited_block);
>if (!unvisited_block)
> break;
>
> -  deadend_block = dfs_find_deadend (unvisited_block);
> +  basic_block deadend_block = dfs_find_deadend (unvisited_block);
>make_edge (deadend_block, EXIT_BLOCK_PTR_FOR_FN (cfun), EDGE_FAKE);
> -  flow_dfs_compute_reverse_add_bb (&dfs_ds, deadend_block);
> +  dfs.add_bb (deadend_block);
>  }
> -
> -  flow_dfs_compute_reverse_finish (&dfs_ds);
> -  return;
>  }
>
>  /* Compute reverse top sort order.  This is computing a post order
> @@ -1094,31 +1086,22 @@ pre_and_rev_post_order_compute (int *pre_order, int 
> *rev_post_order,
> search context.  If INITIALIZE_STACK is nonzero, there is an
> element on the stack.  */
>
> -static void
> -flow_dfs_compute_reverse_init (depth_first_search_ds *data)
> +depth_first_search::depth_first_search () :
> +  m_stack (n_basic_blocks_for_fn (cfun)),
> +  m_visited_blocks (last_basic_block_for_fn (cfun))
>  {
> -  /* Allocate stack for back-tracking up CFG.  */
> -  data->stack = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
> -  data->sp = 0;
> -
> -  /* Allocate bitmap to track nodes that have been visited.  */
> -  data->visited_blocks = sbitmap_alloc (last_basic_block_for_fn (cfun));
> -
> -  /* None of the nodes in the CFG have been visited yet.  */
> -  bitmap_clear (data->visited_blocks);
> -
> -  return;
> +  bitmap_clear (m_visited_blocks);
>  }
>
>  /* Add the specified basic block to the top of the dfs data
> structures.  When the search continues, it will start at the
> block.  */
>
> -static void
> -flow_dfs_compute_reverse_add_bb (depth_first_search_ds *data, basic_block bb)
> +void
> +depth_first_search::add_bb (basic_block bb)
>  {
> -  data->stack[data->sp++] = bb;
> -  bitmap_set_bit (data->visited_blocks, bb->index);
> +  m_stack.quick_push (bb);
> +  bitmap_set_bit (m_visited_blocks, bb->index);
>  }
>
>  /* Continue the depth-first search through the

Re: [PATCH 09/13] use auto_bitmap more with alternate obstacks

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:

Ok.

> 2017-05-09  Trevor Saunders  
>
> * df-core.c (df_set_blocks): Start using auto_bitmap.
> (df_compact_blocks): Likewise.
> * df-problems.c (df_rd_confluence_n): Likewise.
> * df-scan.c (df_insn_rescan_all): Likewise.
> (df_process_deferred_rescans): Likewise.
> (df_update_entry_block_defs): Likewise.
> (df_update_exit_block_uses): Likewise.
> (df_entry_block_bitmap_verify): Likewise.
> (df_exit_block_bitmap_verify): Likewise.
> (df_scan_verify): Likewise.
> * lra-constraints.c (lra_constraints): Likewise.
> (undo_optional_reloads): Likewise.
> (lra_undo_inheritance): Likewise.
> * lra-remat.c (calculate_gen_cands): Likewise.
> (do_remat): Likewise.
> * lra-spills.c (assign_spill_hard_regs): Likewise.
> (spill_pseudos): Likewise.
> * tree-ssa-pre.c (bitmap_set_and): Likewise.
> (bitmap_set_subtract_values): Likewise.
> ---
>  gcc/df-core.c | 30 +++--
>  gcc/df-problems.c | 10 +++---
>  gcc/df-scan.c | 93 
> ---
>  gcc/lra-constraints.c | 42 ++-
>  gcc/lra-remat.c   | 43 ++--
>  gcc/lra-spills.c  | 25 ++
>  gcc/tree-ssa-pre.c| 17 --
>  7 files changed, 104 insertions(+), 156 deletions(-)
>
> diff --git a/gcc/df-core.c b/gcc/df-core.c
> index 98787a768c6..1b270d417aa 100644
> --- a/gcc/df-core.c
> +++ b/gcc/df-core.c
> @@ -497,9 +497,8 @@ df_set_blocks (bitmap blocks)
>   /* This block is called to change the focus from one subset
>  to another.  */
>   int p;
> - bitmap_head diff;
> - bitmap_initialize (&diff, &df_bitmap_obstack);
> - bitmap_and_compl (&diff, df->blocks_to_analyze, blocks);
> + auto_bitmap diff (&df_bitmap_obstack);
> + bitmap_and_compl (diff, df->blocks_to_analyze, blocks);
>   for (p = 0; p < df->num_problems_defined; p++)
> {
>   struct dataflow *dflow = df->problems_in_order[p];
> @@ -510,7 +509,7 @@ df_set_blocks (bitmap blocks)
>   bitmap_iterator bi;
>   unsigned int bb_index;
>
> - EXECUTE_IF_SET_IN_BITMAP (&diff, 0, bb_index, bi)
> + EXECUTE_IF_SET_IN_BITMAP (diff, 0, bb_index, bi)
> {
>   basic_block bb = BASIC_BLOCK_FOR_FN (cfun, bb_index);
>   if (bb)
> @@ -522,8 +521,6 @@ df_set_blocks (bitmap blocks)
> }
> }
> }
> -
> -  bitmap_clear (&diff);
> }
>else
> {
> @@ -1652,9 +1649,8 @@ df_compact_blocks (void)
>int i, p;
>basic_block bb;
>void *problem_temps;
> -  bitmap_head tmp;
>
> -  bitmap_initialize (&tmp, &df_bitmap_obstack);
> +  auto_bitmap tmp (&df_bitmap_obstack);
>for (p = 0; p < df->num_problems_defined; p++)
>  {
>struct dataflow *dflow = df->problems_in_order[p];
> @@ -1663,17 +1659,17 @@ df_compact_blocks (void)
>  dflow problem.  */
>if (dflow->out_of_date_transfer_functions)
> {
> - bitmap_copy (&tmp, dflow->out_of_date_transfer_functions);
> + bitmap_copy (tmp, dflow->out_of_date_transfer_functions);
>   bitmap_clear (dflow->out_of_date_transfer_functions);
> - if (bitmap_bit_p (&tmp, ENTRY_BLOCK))
> + if (bitmap_bit_p (tmp, ENTRY_BLOCK))
> bitmap_set_bit (dflow->out_of_date_transfer_functions, 
> ENTRY_BLOCK);
> - if (bitmap_bit_p (&tmp, EXIT_BLOCK))
> + if (bitmap_bit_p (tmp, EXIT_BLOCK))
> bitmap_set_bit (dflow->out_of_date_transfer_functions, 
> EXIT_BLOCK);
>
>   i = NUM_FIXED_BLOCKS;
>   FOR_EACH_BB_FN (bb, cfun)
> {
> - if (bitmap_bit_p (&tmp, bb->index))
> + if (bitmap_bit_p (tmp, bb->index))
> bitmap_set_bit (dflow->out_of_date_transfer_functions, i);
>   i++;
> }
> @@ -1711,23 +1707,21 @@ df_compact_blocks (void)
>
>if (df->blocks_to_analyze)
>  {
> -  if (bitmap_bit_p (&tmp, ENTRY_BLOCK))
> +  if (bitmap_bit_p (tmp, ENTRY_BLOCK))
> bitmap_set_bit (df->blocks_to_analyze, ENTRY_BLOCK);
> -  if (bitmap_bit_p (&tmp, EXIT_BLOCK))
> +  if (bitmap_bit_p (tmp, EXIT_BLOCK))
> bitmap_set_bit (df->blocks_to_analyze, EXIT_BLOCK);
> -  bitmap_copy (&tmp, df->blocks_to_analyze);
> +  bitmap_copy (tmp, df->blocks_to_analyze);
>bitmap_clear (df->blocks_to_analyze);
>i = NUM_FIXED_BLOCKS;
>FOR_EACH_BB_FN (bb, cfun)
> {
> - if (bitmap_bit_p (&tmp, bb->index))
> + if (bitmap_bit_p (tmp, bb->index))
> bitmap_set_bit (df->blocks_to_analyze, i);
>   i++;

Re: [PATCH 13/13] make inverted_post_order_compute() operate on a vec

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 10:52 PM,   wrote:
> From: Trevor Saunders 
>
> gcc/ChangeLog:

Ok.

Richard.

> 2017-05-09  Trevor Saunders  
>
> * cfganal.c (inverted_post_order_compute): Change argument type
> to vec *.
> * cfganal.h (inverted_post_order_compute): Adjust prototype.
> * df-core.c (rest_of_handle_df_initialize): Adjust.
> (rest_of_handle_df_finish): Likewise.
> (df_analyze_1): Likewise.
> (df_analyze): Likewise.
> (loop_inverted_post_order_compute): Change argument to be a vec *.
> (df_analyze_loop): Adjust.
> (df_get_n_blocks): Likewise.
> (df_get_postorder): Likewise.
> * df.h (struct df_d): Change field to be a vec.
> * lcm.c (compute_laterin): Adjust.
> (compute_available): Likewise.
> * lra-lives.c (lra_create_live_ranges_1): Likewise.
> * tree-ssa-dce.c (remove_dead_stmt): Likewise.
> * tree-ssa-pre.c (compute_antic): Likewise.
> ---
>  gcc/cfganal.c  | 14 ++
>  gcc/cfganal.h  |  2 +-
>  gcc/df-core.c  | 56 
> +-
>  gcc/df.h   |  4 +---
>  gcc/lcm.c  | 14 ++
>  gcc/lra-lives.c|  9 -
>  gcc/tree-ssa-dce.c | 10 --
>  gcc/tree-ssa-pre.c |  9 -
>  8 files changed, 52 insertions(+), 66 deletions(-)
>
> diff --git a/gcc/cfganal.c b/gcc/cfganal.c
> index 27b453ca3f7..a3a6ea86994 100644
> --- a/gcc/cfganal.c
> +++ b/gcc/cfganal.c
> @@ -790,12 +790,12 @@ dfs_find_deadend (basic_block bb)
> and start looking for a "dead end" from that block
> and do another inverted traversal from that block.  */
>
> -int
> -inverted_post_order_compute (int *post_order,
> +void
> +inverted_post_order_compute (vec<int> *post_order,
>  sbitmap *start_points)
>  {
>basic_block bb;
> -  int post_order_num = 0;
> +  post_order->reserve_exact (n_basic_blocks_for_fn (cfun));
>
>if (flag_checking)
>  verify_no_unreachable_blocks ();
> @@ -863,13 +863,13 @@ inverted_post_order_compute (int *post_order,
> time, check its predecessors.  */
> stack.quick_push (ei_start (pred->preds));
>else
> -post_order[post_order_num++] = pred->index;
> +   post_order->quick_push (pred->index);
>  }
>else
>  {
>   if (bb != EXIT_BLOCK_PTR_FOR_FN (cfun)
>   && ei_one_before_end_p (ei))
> -post_order[post_order_num++] = bb->index;
> +   post_order->quick_push (bb->index);
>
>if (!ei_one_before_end_p (ei))
> ei_next (&stack.last ());
> @@ -927,9 +927,7 @@ inverted_post_order_compute (int *post_order,
>while (!stack.is_empty ());
>
>/* EXIT_BLOCK is always included.  */
> -  post_order[post_order_num++] = EXIT_BLOCK;
> -
> -  return post_order_num;
> +  post_order->quick_push (EXIT_BLOCK);
>  }
>
>  /* Compute the depth first search order of FN and store in the array
> diff --git a/gcc/cfganal.h b/gcc/cfganal.h
> index 7df484b8441..39bb5e547a5 100644
> --- a/gcc/cfganal.h
> +++ b/gcc/cfganal.h
> @@ -63,7 +63,7 @@ extern void add_noreturn_fake_exit_edges (void);
>  extern void connect_infinite_loops_to_exit (void);
>  extern int post_order_compute (int *, bool, bool);
>  extern basic_block dfs_find_deadend (basic_block);
> -extern int inverted_post_order_compute (int *, sbitmap *start_points = 0);
> +extern void inverted_post_order_compute (vec<int> *postorder, sbitmap *start_points = 0);
>  extern int pre_and_rev_post_order_compute_fn (struct function *,
>   int *, int *, bool);
>  extern int pre_and_rev_post_order_compute (int *, int *, bool);
> diff --git a/gcc/df-core.c b/gcc/df-core.c
> index 1b270d417aa..1e84d4d948f 100644
> --- a/gcc/df-core.c
> +++ b/gcc/df-core.c
> @@ -702,10 +702,9 @@ rest_of_handle_df_initialize (void)
>  df_live_add_problem ();
>
>df->postorder = XNEWVEC (int, last_basic_block_for_fn (cfun));
> -  df->postorder_inverted = XNEWVEC (int, last_basic_block_for_fn (cfun));
>df->n_blocks = post_order_compute (df->postorder, true, true);
> -  df->n_blocks_inverted = inverted_post_order_compute 
> (df->postorder_inverted);
> -  gcc_assert (df->n_blocks == df->n_blocks_inverted);
> +  inverted_post_order_compute (&df->postorder_inverted);
> +  gcc_assert ((unsigned) df->n_blocks == df->postorder_inverted.length ());
>
>df->hard_regs_live_count = XCNEWVEC (unsigned int, FIRST_PSEUDO_REGISTER);
>
> @@ -816,7 +815,7 @@ rest_of_handle_df_finish (void)
>  }
>
>free (df->postorder);
> -  free (df->postorder_inverted);
> +  df->postorder_inverted.release ();
>free (df->hard_regs_live_count);
>free (df);
>df = NULL;
> @@ -1198,7 +1197,7 @@ df_analyze_1 (void)
>int i;
>
>/* These should be the same.  */
> -  gcc_assert (df->n_blocks == df->n

Re: [PATCH] non-checking pure attribute

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 5:16 PM, Nathan Sidwell  wrote:
> On 05/09/2017 09:39 AM, Richard Biener wrote:
>>
>> On Tue, May 9, 2017 at 3:33 PM, Nathan Sidwell  wrote:
>
>
>>> I wondered if we'd get sane backtraces and what not, if the optimizer
>>> thought such functions never barfed.
>>
>>
>> Well, I think you'd either ICE in the first check or can safely CSE the
>> second.
>
>
> Done
>

Thanks.
Richard.

> --
> Nathan Sidwell


Re: [PATCH] Kill -fdump-translation-unit

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 5:41 PM, Nathan Sidwell  wrote:
> -fdump-translation-unit is an inscrutably opaque dump.  It turned out that
> most of the uses of the tree-dump header file were to indirectly get at
> dumpfile.h, and the dump_function entry point it had forwarded to a dumper
> in tree-cfg.c.  The gimple dumper would use its node dumper when asked for a
> raw dump, but that was about it.
>
> We have prettier printers now.  This patch nukes the tu dumper.  ok?

Ok if nobody objects within 24 hours.

Thanks,
Richard.

> nathan
>
> --
> Nathan Sidwell


Re: PR77644

2017-05-10 Thread Richard Biener
On Tue, 9 May 2017, Prathamesh Kulkarni wrote:

> Hi,
> The attached patch adds the following pattern to match.pd
> sqrt(x) cmp sqrt(y) -> x cmp y.
> and is enabled with -funsafe-math-optimizations and -fno-math-errno.
> 
> Bootstrapped+tested on x86_64-unknown-linux-gnu.
> Cross-tested on arm*-*-*, aarch64*-*-*.
> OK for trunk ?

+  (cmp @0 { build_real (TREE_TYPE (@0), c2); })
+
+   /* PR77644: Transform sqrt(x) cmp sqrt(y) -> x cmp y.  */

Do not reference PRs here please (and omit the vertical space before
the sub-pattern).

+   (simplify
+(cmp (sq @0) (sq @1))
+  (if (! HONOR_NANS (type))
+   (cmp @0 @1))
 
It should be HONOR_NANS (@0), and not on 'type' (that's bool!).

Looks ok otherwise.

Thanks,
Richard.


Re: [PATCH] make RTL/TREE/IPA dump kind an index

2017-05-10 Thread Richard Biener
On Tue, May 9, 2017 at 9:00 PM, Nathan Sidwell  wrote:
> Currently, the TDF_foo flags serve 3 purposes:
> 1) what kind of dump
> 2) how detailed to print it
> 3) auxiliary message control
>
> This addresses #1, which currently uses a bit mask of TDF_{TREE,RTL,IPA}, of
> which exactly one must be set.  The patch changes things so that these are
> now an index value (I hesitate to say enumeration, because they're still raw
> ints).  A TDF_KIND(X) accessor extracts this value.  (I left the spare bit
> between the TDF_KIND_MASK and TDF_ADDRESS for the moment.)
>
> In addition I added 'TDF_LANG' for language-specific dump control, which is
> what -fdump-translation-unit and -fdump-class-hierarchy become.  These can
> also be controlled by -fdump-lang-all (rather than -fdump-tree-all).
>
> Next move will be to move -fdump-class-hierarchy into a more generic
> structure (and -fdump-translation-unit, if my patch to remove it is not
> accepted).
>
> ok?

   TDI_nested,  /* dump each function after unnesting it */
+
+  TDI_lang_all,    /* enable all the language dumps.  */

extra vertical space

+
+#define TDF_ADDRESS    (1 << 3)    /* dump node addresses */

this leaves 1 << 2 unused.

Otherwise looks like a great cleanup.  You might want to coordinate with
Martin a bit here.  It also looks like with this we can start re-using
bits when they are restricted to one TDF_KIND.

Thanks,
Richard.
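
A rough sketch of the "kind as an index" layout discussed above, assuming a two-bit kind field with the detail flags starting at bit 3 (so bit 2 stays spare, as noted). All names and values here are hypothetical; the real TDF_* constants are defined in GCC's dumpfile.h and may differ:

```cpp
// Hypothetical names and values for illustration only.
enum dump_kind
{
  KIND_LANG = 0,
  KIND_TREE = 1,
  KIND_RTL = 2,
  KIND_IPA = 3
};

constexpr unsigned KIND_MASK = 0x3;        // low two bits hold the kind index
constexpr unsigned FLAG_ADDRESS = 1u << 3; // detail flags start above the mask
constexpr unsigned FLAG_DETAILS = 1u << 4;

// Accessor in the spirit of TDF_KIND (X): extract the kind index.
constexpr dump_kind kind_of (unsigned flags)
{
  return static_cast<dump_kind> (flags & KIND_MASK);
}

// Combine a kind index with a set of detail flags.
constexpr unsigned make_flags (dump_kind kind, unsigned detail)
{
  return static_cast<unsigned> (kind) | detail;
}
```

With an index instead of one bit per kind, a flags word cannot claim two kinds at once, and bits above the mask could be reused for flags that only make sense for a single kind.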

>
> nathan
> --
> Nathan Sidwell


Re: Trivial iter_swap cleanup

2017-05-10 Thread Jonathan Wakely

On 09/05/17 22:33 +0200, François Dumont wrote:

Hi

   A trivial code simplification in a pre-C++11 piece of code.

   Ok to commit ?


Good catch! OK, thanks.



Re: Bump version namespace and remove _Rb_tree useless template parameter

2017-05-10 Thread Jonathan Wakely

On 09/05/17 22:03 +0200, François Dumont wrote:

On 05/05/2017 15:08, Jonathan Wakely wrote:

On 04/05/17 22:16 +0200, François Dumont wrote:

Hi

  Here is the patch to remove the useless _Is_pod_comparator
_Rb_tree_impl template parameter. As this is an ABI-breaking change, it is
limited to the versioned namespace mode, and the patch also bumps the
namespace version.


  Working on this patch I wonder if the 
gnu-versioned-namespace.ver is really up to date. The list of 
export expressions is far smaller than the one in gnu.ver.


Because it uses wildcards that match all symbols: with the versioned
namespace everything gets the same symbol version, so we don't need to
assign different versions to different symbols.


Would the testsuite show that some symbols are not properly exported ?


Yes (as long as we have a test that exercises the feature).



  Bump version namespace.
  * config/abi/pre/gnu-versioned-namespace.ver: Bump version namespace
  from __7 to __8. Bump GLIBCXX_7.0 into GLIBCXX_8.0.
  * include/bits/c++config: Adapt.
  * include/bits/regex.h: Adapt.
  * include/experimental/bits/fs_fwd.h: Adapt.
  * include/experimental/bits/lfts_config.h: Adapt.
  * include/std/variant: Adapt.
  * python/libstdcxx/v6/printers.py: Adapt.
  * testsuite/libstdc++-prettyprinters/48362.cc: Adapt.
  * include/bits/stl_tree.h (_Rb_tree_impl<>): Remove _Is_pod_comparator
  template parameter when version namespace is active.


The patch also needs to update libtool_VERSION in acinclude.m4 so that
the shared library goes from libstdc++.so.7 to libstdc++.so.8 (because
after this change we're absolutely not compatible with libstdc++.so.7).


Ok, updated with attached patch. Ok to commit ?


Looks good to me. Paolo, what do you think about bumping the versioned
namespace and SONAME to 8?

It's not really important for this change, but I'd like to make the
new SSO std::string work for the versioned namespace, so we might want
it for that.




Re: Bump version namespace and remove _Rb_tree useless template parameter

2017-05-10 Thread Paolo Carlini

Hi,

On 10/05/2017 11:12, Jonathan Wakely wrote:

Looks good to me. Paolo, what do you think about bumping the versioned
namespace and SONAME to 8?

Sure, makes sense to me too.

Paolo.


Re: PR77644

2017-05-10 Thread Prathamesh Kulkarni
On 10 May 2017 at 14:28, Richard Biener  wrote:
> On Tue, 9 May 2017, Prathamesh Kulkarni wrote:
>
>> Hi,
>> The attached patch adds the following pattern to match.pd
>> sqrt(x) cmp sqrt(y) -> x cmp y.
>> and is enabled with -funsafe-math-optimizations and -fno-math-errno.
>>
>> Bootstrapped+tested on x86_64-unknown-linux-gnu.
>> Cross-tested on arm*-*-*, aarch64*-*-*.
>> OK for trunk ?
>
> +  (cmp @0 { build_real (TREE_TYPE (@0), c2); })
> +
> +   /* PR77644: Transform sqrt(x) cmp sqrt(y) -> x cmp y.  */
>
> Do not reference PRs here please (and omit the vertical space before
> the sub-pattern).
>
> +   (simplify
> +(cmp (sq @0) (sq @1))
> +  (if (! HONOR_NANS (type))
> +   (cmp @0 @1))
>
> It should be HONOR_NANS (@0), and not on 'type' (that's bool!).
Ah indeed, sorry about that :/
Does the attached version look OK ?

Thanks,
Prathamesh
>
> Looks ok otherwise.
>
> Thanks,
> Richard.
diff --git a/gcc/match.pd b/gcc/match.pd
index e3d98baa12f..80a17ba3d23 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2633,7 +2633,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (GENERIC)
  (truth_andif
   (ge @0 { build_real (TREE_TYPE (@0), dconst0); })
-  (cmp @0 { build_real (TREE_TYPE (@0), c2); }
+  (cmp @0 { build_real (TREE_TYPE (@0), c2); })
+   /* Transform sqrt(x) cmp sqrt(y) -> x cmp y.  */
+   (simplify
+(cmp (sq @0) (sq @1))
+  (if (! HONOR_NANS (@0))
+   (cmp @0 @1))
 
 /* Fold A /[ex] B CMP C to A CMP B * C.  */
 (for cmp (eq ne)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr77644.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr77644.c
new file mode 100644
index 000..c73bb73afdb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr77644.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target c99_runtime } */
+/* { dg-options "-O2 -fdump-tree-optimized -funsafe-math-optimizations -fno-math-errno -ffinite-math-only" } */
+
+#define FOO(type, cmp, suffix, no)  \
+int f_##no(type x, type y) \
+{ \
+  type gen_##no(); \
+  type xs = __builtin_sqrt##suffix((gen_##no())); \
+  type xy = __builtin_sqrt##suffix((gen_##no())); \
+  return (xs cmp xy); \
+}
+
+#define GEN_FOO(type, suffix) \
+FOO(type, <, suffix, suffix##1) \
+FOO(type, <=, suffix, suffix##2) \
+FOO(type, >, suffix, suffix##3) \
+FOO(type, >=, suffix, suffix##4) \
+FOO(type, ==, suffix, suffix##5) \
+FOO(type, !=, suffix, suffix##6)
+
+GEN_FOO(float, f)
+GEN_FOO(double, )
+GEN_FOO(long double, l)
+
+/* { dg-final { scan-tree-dump-not "__builtin_sqrtf" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "__builtin_sqrt" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "__builtin_sqrtl" "optimized" } } */


Re: Bump version namespace and remove _Rb_tree useless template parameter

2017-05-10 Thread Jonathan Wakely

On 10/05/17 11:15 +0200, Paolo Carlini wrote:

Hi,

On 10/05/2017 11:12, Jonathan Wakely wrote:

Looks good to me. Paolo, what do you think about bumping the versioned
namespace and SONAME to 8?

Sure, makes sense to me too.


Please commit it then, François - thanks!



Re: [c++ PATCH] PR c++/80682

2017-05-10 Thread Ville Voutilainen
On 10 May 2017 at 09:57, Ville Voutilainen  wrote:
> On 9 May 2017 at 17:14, Nathan Sidwell  wrote:
>> On 05/09/2017 08:06 AM, Ville Voutilainen wrote:
>>>
>>> Tested on Linux-x64, not tested with the full suite yet.
>>>
>>> 2017-05-09  Ville Voutilainen  
>>>
>>>  gcc/
>>>
>>>  PR c++/80682
>>>  * cp/method.c (is_trivially_xible): Reject void types.
>>>
>>>  testsuite/
>>>
>>>  PR c++/80682
>>>  * g++.dg/ext/is_trivially_constructible1.C: Add tests for void
>>> target.
>>>
>>
>> +  if (to == void_type_node)
>> +return false;
>>
>> VOID_TYPE_P.
>>
>> ok with that change
>
>
> Full testsuite run is clean. Is it ok to backport this change to
> gcc-6? (And gcc-7, too)

..and gcc-5. Backporting everywhere allows library implementations,
including libc++, to just use the intrinsic, without using
std::is_constructible in addition.
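
As a sketch of what "just use the intrinsic" could look like on the library side, assuming a compiler that provides the __is_trivially_constructible builtin (GCC and Clang do) and that includes the fix discussed here. The trait name below is hypothetical and is not the libstdc++ or libc++ implementation:

```cpp
#include <type_traits>

// Hypothetical library-side trait, for illustration only.  It forwards
// directly to the compiler builtin, with no extra std::is_constructible
// guard in front of it.
template <typename T, typename... Args>
struct my_is_trivially_constructible
  : std::integral_constant<bool, __is_trivially_constructible (T, Args...)>
{
};
```

On a compiler without the fix, the void cases are exactly where such an unguarded trait could misbehave, which is the motivation for backporting.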


Re: PR77644

2017-05-10 Thread Richard Biener
On Wed, 10 May 2017, Prathamesh Kulkarni wrote:

> On 10 May 2017 at 14:28, Richard Biener  wrote:
> > On Tue, 9 May 2017, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> The attached patch adds the following pattern to match.pd
> >> sqrt(x) cmp sqrt(y) -> x cmp y.
> >> and is enabled with -funsafe-math-optimizations and -fno-math-errno.
> >>
> >> Bootstrapped+tested on x86_64-unknown-linux-gnu.
> >> Cross-tested on arm*-*-*, aarch64*-*-*.
> >> OK for trunk ?
> >
> > +  (cmp @0 { build_real (TREE_TYPE (@0), c2); })
> > +
> > +   /* PR77644: Transform sqrt(x) cmp sqrt(y) -> x cmp y.  */
> >
> > Do not reference PRs here please (and omit the vertical space before
> > the sub-pattern).
> >
> > +   (simplify
> > +(cmp (sq @0) (sq @1))
> > +  (if (! HONOR_NANS (type))
> > +   (cmp @0 @1))
> >
> > It should be HONOR_NANS (@0), and not on 'type' (that's bool!).
> Ah indeed, sorry about that :/
> Does the attached version look OK ?

Yes.

Thanks,
Richard.


Re: [PATCH 01/13] improve safety of freeing bitmaps

2017-05-10 Thread Trevor Saunders
On Wed, May 10, 2017 at 10:14:17AM +0200, Richard Biener wrote:
> On Tue, May 9, 2017 at 10:52 PM,   wrote:
> > From: Trevor Saunders 
> >
> > There are two groups of changes here.  First, sbitmap_free takes an
> > sbitmap &, so that we can assign null to the pointer after freeing the
> > sbitmap to prevent use after free through that pointer.  Second, we define
> > overloads of sbitmap_free and BITMAP_FREE taking auto_sbitmap and
> > auto_bitmap respectively, so that you can't double free the bitmap owned
> > by an auto_{s,}bitmap.
> 
> Looks good - but what do you need the void *& overload for?!  That at least
> needs a comment.

Yeah, it's gross; I put it in to be compatible with the previous macro.
The first problem with removing it is that cfgexpand.c:663 and presumably
other places do BITMAP_FREE(bb->aux), which of course depends on being able
to pass in a void *.  I'll add a comment and try to look into removing it.

  Trev
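
To make the free-by-reference idea concrete, here is a small sketch using hypothetical stand-in types (toy_map and its helpers are not the GCC bitmap API). It only shows the "null the caller's pointer" part:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-ins for illustration; these are not the GCC types.
struct toy_map
{
  unsigned bits;
};

inline toy_map *toy_map_alloc ()
{
  return static_cast<toy_map *> (std::calloc (1, sizeof (toy_map)));
}

// Taking the pointer by reference lets the function null the caller's copy,
// so a stale pointer cannot be used or freed again by accident.
inline void toy_map_free (toy_map *&map)
{
  std::free (map);
  map = nullptr;
}

// Returns true if the caller's pointer is null after freeing.
inline bool freed_pointer_is_null ()
{
  toy_map *map = toy_map_alloc ();
  assert (map != nullptr);
  toy_map_free (map);
  // A second call is now harmless: free (nullptr) is a no-op.
  toy_map_free (map);
  return map == nullptr;
}
```

A void * lvalue cannot bind to a toy_map *& parameter, which mirrors why callers such as BITMAP_FREE (bb->aux) need either the extra void *& overload or a fix at the call site. The declared-but-undefined overload for the auto_ types is omitted from this sketch.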

> 
> Richard.
> 
> > gcc/ChangeLog:
> >
> > 2017-05-09  Trevor Saunders  
> >
> > * bitmap.h (BITMAP_FREE): Convert from macro to inline function
> > and add overloaded decl for auto_bitmap.
> > * sbitmap.h (inline void sbitmap_free): Add overload for
> > auto_sbitmap, and change sbitmap to  point to null.
> > ---
> >  gcc/bitmap.h  | 21 +++--
> >  gcc/sbitmap.h |  7 ++-
> >  2 files changed, 25 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/bitmap.h b/gcc/bitmap.h
> > index f158b447357..7508239cff9 100644
> > --- a/gcc/bitmap.h
> > +++ b/gcc/bitmap.h
> > @@ -129,6 +129,8 @@ along with GCC; see the file COPYING3.  If not see
> >
> >  #include "obstack.h"
> >
> > +   class auto_bitmap;
> > +
> >  /* Bitmap memory usage.  */
> >  struct bitmap_usage: public mem_usage
> >  {
> > @@ -372,8 +374,23 @@ extern hashval_t bitmap_hash (const_bitmap);
> >  #define BITMAP_GGC_ALLOC() bitmap_gc_alloc ()
> >
> >  /* Do any cleanup needed on a bitmap when it is no longer used.  */
> > -#define BITMAP_FREE(BITMAP) \
> > -   ((void) (bitmap_obstack_free ((bitmap) BITMAP), (BITMAP) = (bitmap) 
> > NULL))
> > +inline void
> > +BITMAP_FREE (bitmap &b)
> > +{
> > +  bitmap_obstack_free ((bitmap) b);
> > +  b = NULL;
> > +}
> > +
> > +inline void
> > +BITMAP_FREE (void *&b)
> > +{
> > +  bitmap_obstack_free ((bitmap) b);
> > +  b = NULL;
> > +}
> > +
> > +/* Intentionally unimplemented to ensure it is never called with an
> > +   auto_bitmap argument.  */
> > +void BITMAP_FREE (auto_bitmap);
> >
> >  /* Iterator for bitmaps.  */
> >
> > diff --git a/gcc/sbitmap.h b/gcc/sbitmap.h
> > index ce4d27d927c..cba0452cdb9 100644
> > --- a/gcc/sbitmap.h
> > +++ b/gcc/sbitmap.h
> > @@ -82,6 +82,8 @@ along with GCC; see the file COPYING3.  If not see
> >  #define SBITMAP_ELT_BITS (HOST_BITS_PER_WIDEST_FAST_INT * 1u)
> >  #define SBITMAP_ELT_TYPE unsigned HOST_WIDEST_FAST_INT
> >
> > +class auto_sbitmap;
> > +
> >  struct simple_bitmap_def
> >  {
> >unsigned int n_bits; /* Number of bits.  */
> > @@ -208,11 +210,14 @@ bmp_iter_next (sbitmap_iterator *i, unsigned *bit_no 
> > ATTRIBUTE_UNUSED)
> > bmp_iter_next (&(ITER), &(BITNUM)))
> >  #endif
> >
> > -inline void sbitmap_free (sbitmap map)
> > +inline void sbitmap_free (sbitmap &map)
> >  {
> >free (map);
> > +  map = NULL;
> >  }
> >
> > +void sbitmap_free (auto_sbitmap);
> > +
> >  inline void sbitmap_vector_free (sbitmap * vec)
> >  {
> >free (vec);
> > --
> > 2.11.0
> >


Re: [PATCH, GCC/LTO, ping3] Fix PR69866: LTO with def for weak alias in regular object file

2017-05-10 Thread Thomas Preudhomme

Hi,

On 09/05/17 23:36, Jan Hubicka wrote:

Ping?

Sorry for the late reply.

Hi,

This patch fixes an assert failure when linking one LTOed object file
having a weak alias with a regular object file containing a strong
definition for that same symbol. The patch is twofold:

+ do not add an alias to a partition if it is external
+ do not declare (.globl) an alias if it is external


Adding external aliases to partitions is important to keep the information
that two symbols are the same.
The second part makes sense to me.  What breaks when you drop the first
change?


Adding aliases to partitions is what causes the ICE referenced in the PR. It
fails on the following assert:


  gcc_assert (c != SYMBOL_EXTERNAL
  && (c == SYMBOL_DUPLICATE || !symbol_partitioned_p (node)));


The second change came about while doing the first one, because the linker was
complaining about the alias being defined twice (once in the trans object, once
in the non-LTO object).


Best regards,

Thomas



Honza


ChangeLog entries are as follows:

*** gcc/lto/ChangeLog ***

2017-03-01  Thomas Preud'homme  

   PR lto/69866
   * lto/lto-partition.c (add_symbol_to_partition_1): Do not add external
   aliases to partition.

*** gcc/ChangeLog ***

2017-03-01  Thomas Preud'homme  

   PR lto/69866
   * cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Do not
   declare external aliases.

*** gcc/testsuite/ChangeLog ***

2017-02-28  Thomas Preud'homme  

   PR lto/69866
   * gcc.dg/lto/pr69866_0.c: New test.
   * gcc.dg/lto/pr69866_1.c: Likewise.


Testing: The testsuite shows no regressions when targeting Cortex-M3 with an
arm-none-eabi GCC cross-compiler, nor with native LTO-bootstrapped
x86_64-linux-gnu and aarch64-linux-gnu compilers.

Is this ok for stage4?

Best regards,

Thomas

On 31/03/17 18:07, Richard Biener wrote:

On March 31, 2017 5:23:03 PM GMT+02:00, Jeff Law  wrote:

On 03/16/2017 08:05 AM, Thomas Preudhomme wrote:

Ping?

Is this ok for stage4?

Given the lack of response from Richi, I'd suggest deferring to stage1.


Honza needs to review this; I have too little knowledge here.

Richard.


jeff





diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c82a88a599ca61b068dd9783d2a6158163809b37..580500ff922b8546d33119261a2455235edbf16d 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1972,7 +1972,7 @@ cgraph_node::assemble_thunks_and_aliases (void)
   FOR_EACH_ALIAS (this, ref)
 {
   cgraph_node *alias = dyn_cast <cgraph_node *> (ref->referring);
-  if (!alias->transparent_alias)
+  if (!alias->transparent_alias && !DECL_EXTERNAL (alias->decl))
{
  bool saved_written = TREE_ASM_WRITTEN (decl);

diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index e27d0d1690c1fcfb39e2fac03ce0f4154031fc7c..f44fd435ed075a27e373bdfdf0464eb06e1731ef 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -178,7 +178,8 @@ add_symbol_to_partition_1 (ltrans_partition part, 
symtab_node *node)
   /* Add all aliases associated with the symbol.  */

   FOR_EACH_ALIAS (node, ref)
-if (!ref->referring->transparent_alias)
+if (!ref->referring->transparent_alias
+   && ref->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
   add_symbol_to_partition_1 (part, ref->referring);
 else
   {
@@ -189,7 +190,8 @@ add_symbol_to_partition_1 (ltrans_partition part, 
symtab_node *node)
  {
/* Nested transparent aliases are not permitted.  */
gcc_checking_assert (!ref2->referring->transparent_alias);
-   add_symbol_to_partition_1 (part, ref2->referring);
+   if (ref2->referring->get_partitioning_class () != SYMBOL_EXTERNAL)
+ add_symbol_to_partition_1 (part, ref2->referring);
  }
   }

diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_0.c 
b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
new file mode 100644
index 
..f49ef8d4c1da7a21d1bfb5409d647bd18141595b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69866_0.c
@@ -0,0 +1,13 @@
+/* { dg-lto-do link } */
+
+int _umh(int i)
+{
+  return i+1;
+}
+
+int weaks(int i) __attribute__((weak, alias("_umh")));
+
+int main()
+{
+  return weaks(10);
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr69866_1.c 
b/gcc/testsuite/gcc.dg/lto/pr69866_1.c
new file mode 100644
index 
..3a14f850eefaffbf659ce4642adef7900330f4ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr69866_1.c
@@ -0,0 +1,6 @@
+/* { dg-options { -fno-lto } } */
+
+int weaks(int i)
+{
+  return i+1;
+}




Re: [PATCH 01/13] improve safety of freeing bitmaps

2017-05-10 Thread Richard Biener
On Wed, May 10, 2017 at 12:52 PM, Trevor Saunders  wrote:
> On Wed, May 10, 2017 at 10:14:17AM +0200, Richard Biener wrote:
>> On Tue, May 9, 2017 at 10:52 PM,   wrote:
>> > From: Trevor Saunders 
>> >
>> > There are two groups of changes here.  First, sbitmap_free takes an
>> > sbitmap &, so that we can assign null to the pointer after freeing the
>> > sbitmap to prevent use after free through that pointer.  Second, we define
>> > overloads of sbitmap_free and BITMAP_FREE taking auto_sbitmap and
>> > auto_bitmap respectively, so that you can't double free the bitmap owned
>> > by an auto_{s,}bitmap.
>>
>> Looks good - but what do you need the void *& overload for?!  That at least
>> needs a comment.
>
> Yeah, it's gross; I put it in to be compatible with the previous macro.
> The first problem with removing it is that cfgexpand.c:663 and presumably
> other places do BITMAP_FREE(bb->aux), which of course depends on being able
> to pass in a void *.  I'll add a comment and try to look into removing it.

Yeah, please remove it by fixing callers instead.

Richard.

>   Trev
>
>>
>> Richard.
>>
>> > gcc/ChangeLog:
>> >
>> > 2017-05-09  Trevor Saunders  
>> >
>> > * bitmap.h (BITMAP_FREE): Convert from macro to inline function
>> > and add overloaded decl for auto_bitmap.
>> > * sbitmap.h (inline void sbitmap_free): Add overload for
>> > auto_sbitmap, and change sbitmap to  point to null.
>> > ---
>> >  gcc/bitmap.h  | 21 +++--
>> >  gcc/sbitmap.h |  7 ++-
>> >  2 files changed, 25 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/gcc/bitmap.h b/gcc/bitmap.h
>> > index f158b447357..7508239cff9 100644
>> > --- a/gcc/bitmap.h
>> > +++ b/gcc/bitmap.h
>> > @@ -129,6 +129,8 @@ along with GCC; see the file COPYING3.  If not see
>> >
>> >  #include "obstack.h"
>> >
>> > +   class auto_bitmap;
>> > +
>> >  /* Bitmap memory usage.  */
>> >  struct bitmap_usage: public mem_usage
>> >  {
>> > @@ -372,8 +374,23 @@ extern hashval_t bitmap_hash (const_bitmap);
>> >  #define BITMAP_GGC_ALLOC() bitmap_gc_alloc ()
>> >
>> >  /* Do any cleanup needed on a bitmap when it is no longer used.  */
>> > -#define BITMAP_FREE(BITMAP) \
>> > -   ((void) (bitmap_obstack_free ((bitmap) BITMAP), (BITMAP) = (bitmap) NULL))
>> > +inline void
>> > +BITMAP_FREE (bitmap &b)
>> > +{
>> > +  bitmap_obstack_free ((bitmap) b);
>> > +  b = NULL;
>> > +}
>> > +
>> > +inline void
>> > +BITMAP_FREE (void *&b)
>> > +{
>> > +  bitmap_obstack_free ((bitmap) b);
>> > +  b = NULL;
>> > +}
>> > +
>> > +/* Intentionally unimplemented to ensure it is never called with an
>> > +   auto_bitmap argument.  */
>> > +void BITMAP_FREE (auto_bitmap);
>> >
>> >  /* Iterator for bitmaps.  */
>> >
>> > diff --git a/gcc/sbitmap.h b/gcc/sbitmap.h
>> > index ce4d27d927c..cba0452cdb9 100644
>> > --- a/gcc/sbitmap.h
>> > +++ b/gcc/sbitmap.h
>> > @@ -82,6 +82,8 @@ along with GCC; see the file COPYING3.  If not see
>> >  #define SBITMAP_ELT_BITS (HOST_BITS_PER_WIDEST_FAST_INT * 1u)
>> >  #define SBITMAP_ELT_TYPE unsigned HOST_WIDEST_FAST_INT
>> >
>> > +class auto_sbitmap;
>> > +
>> >  struct simple_bitmap_def
>> >  {
>> >unsigned int n_bits; /* Number of bits.  */
>> > @@ -208,11 +210,14 @@ bmp_iter_next (sbitmap_iterator *i, unsigned *bit_no 
>> > ATTRIBUTE_UNUSED)
>> > bmp_iter_next (&(ITER), &(BITNUM)))
>> >  #endif
>> >
>> > -inline void sbitmap_free (sbitmap map)
>> > +inline void sbitmap_free (sbitmap &map)
>> >  {
>> >free (map);
>> > +  map = NULL;
>> >  }
>> >
>> > +void sbitmap_free (auto_sbitmap);
>> > +
>> >  inline void sbitmap_vector_free (sbitmap * vec)
>> >  {
>> >free (vec);
>> > --
>> > 2.11.0
>> >
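
The two ideas in the patch description above can be sketched in isolation.
This is only an illustration, not the code under review: toy_bitmap,
toy_auto_bitmap and toy_bitmap_free are hypothetical names, and the sketch
uses a C++11 deleted overload where the patch declares an unimplemented one.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for sbitmap; not GCC's real type.
struct toy_bitmap_def
{
  unsigned n_bits;
};
typedef toy_bitmap_def *toy_bitmap;

// Hypothetical owning wrapper, in the role of auto_sbitmap.
class toy_auto_bitmap
{
public:
  explicit toy_auto_bitmap (unsigned n)
    : m_map (static_cast<toy_bitmap> (std::malloc (sizeof (toy_bitmap_def))))
  {
    if (m_map)
      m_map->n_bits = n;
  }
  ~toy_auto_bitmap () { std::free (m_map); }
  toy_auto_bitmap (const toy_auto_bitmap &) = delete;
  toy_auto_bitmap &operator= (const toy_auto_bitmap &) = delete;
  toy_bitmap get () const { return m_map; }

private:
  toy_bitmap m_map;
};

// Takes the pointer by reference so that it can be cleared after the free.
inline void
toy_bitmap_free (toy_bitmap &map)
{
  std::free (map);
  map = nullptr;
}

// Freeing a wrapper-owned bitmap by hand would be a double free, so
// reject the call at compile time.
void toy_bitmap_free (toy_auto_bitmap &) = delete;

// Allocate, free, and report whether the caller's pointer was cleared.
inline bool
toy_free_clears_pointer ()
{
  toy_bitmap map
    = static_cast<toy_bitmap> (std::malloc (sizeof (toy_bitmap_def)));
  toy_bitmap_free (map);
  return map == nullptr;
}
```

With these definitions, "toy_auto_bitmap a (8); toy_bitmap_free (a);" is
rejected by the compiler, while a raw toy_bitmap is left null after the call.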


[Committed][AArch64] Fix PR80671

2017-05-10 Thread Wilco Dijkstra
Move a use-after-free access before the delete.

Committed as obvious.

ChangeLog:

2017-05-10  Wilco Dijkstra  

PR target/80671
* config/aarch64/cortex-a57-fma-steering.c (merge_forest):
Move member access before delete.
--
diff --git a/gcc/config/aarch64/cortex-a57-fma-steering.c b/gcc/config/aarch64/cortex-a57-fma-steering.c
index 4a3887984b4a0242b8a10bec0c6285ba184517ab..94d7f9c58692a417cba01720a7f05ec12b323c85 100644
--- a/gcc/config/aarch64/cortex-a57-fma-steering.c
+++ b/gcc/config/aarch64/cortex-a57-fma-steering.c
@@ -411,9 +411,9 @@ fma_forest::merge_forest (fma_forest *other_forest)
  the list of tree roots of ref_forest.  */
   this->m_globals->remove_forest (other_forest);
   this->m_roots->splice (this->m_roots->begin (), *other_roots);
-  delete other_forest;
-
   this->m_nb_nodes += other_forest->m_nb_nodes;
+
+  delete other_forest;
 }
 
 /* Dump information about the forest FOREST.  */
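
For readers skimming the archive, the ordering issue fixed above can be
shown with a small self-contained sketch.  toy_forest is a hypothetical
miniature, not the real fma_forest class; the only point is that every
field of the other object is read before it is deleted.

```cpp
#include <cassert>
#include <list>

// Hypothetical miniature of a forest; not the real fma_forest class.
struct toy_forest
{
  std::list<int> roots;
  int nb_nodes;

  // Absorb OTHER, which must be heap-allocated, then destroy it.
  void merge (toy_forest *other)
  {
    roots.splice (roots.begin (), other->roots);
    // Read every field that is still needed first ...
    nb_nodes += other->nb_nodes;
    // ... and only then release the object.
    delete other;
  }
};

// Build two small forests, merge them, and return the merged node count.
inline int
toy_merged_node_count ()
{
  toy_forest mine;
  mine.roots.push_back (1);
  mine.nb_nodes = 1;

  toy_forest *theirs = new toy_forest;
  theirs->roots.push_back (2);
  theirs->roots.push_back (3);
  theirs->nb_nodes = 2;

  mine.merge (theirs);
  return mine.nb_nodes;
}
```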


Re: [PATCH] make RTL/TREE/IPA dump kind an index

2017-05-10 Thread Nathan Sidwell

On 05/10/2017 05:05 AM, Richard Biener wrote:

On Tue, May 9, 2017 at 9:00 PM, Nathan Sidwell  wrote:



+
#define TDF_ADDRESS (1 << 3) /* dump node addresses */

this leaves 1 << 2 unused.


Yes, that was intentional (though I suspect my note about it was 
hidden).  As you say, I expect further cleanup and didn't want 
gratuitous churn.  I'll add a comment about bit 2 being free.



Otherwise looks like a great cleanup.  You might want to coordinate with
Martin a bit here.  It also looks like with this we can start re-using
bits when they are restricted to one TDF_KIND.


Indeed, we coordinated a bit yesterday.  Thanks for review!

nathan

--
Nathan Sidwell


Re: [c++ PATCH] PR c++/80682

2017-05-10 Thread Nathan Sidwell

On 05/10/2017 05:37 AM, Ville Voutilainen wrote:

On 10 May 2017 at 09:57, Ville Voutilainen  wrote:




Full testsuite run is clean. Is it ok to backport this change to
gcc-6? (And gcc-7, too)


..and gcc-5. Backporting everywhere allows library implementations
including libc++ to
just use the intrinsic, without using std::is_constructible in addition.


I have no objection, and this is a very simple fix.

nathan

--
Nathan Sidwell


[PATCH] tabify dumpfile.h

2017-05-10 Thread Nathan Sidwell
Committed this pre-patch to tabify dumpfile.h.  It was inconsistently 
using spaces and tabs for alignment.


nathan
--
Nathan Sidwell
2017-05-10  Nathan Sidwell  

	* dumpfile.h: Tabify.

Index: dumpfile.h
===
--- dumpfile.h	(revision 247831)
+++ dumpfile.h	(working copy)
@@ -27,17 +27,17 @@ along with GCC; see the file COPYING3.
 enum tree_dump_index
 {
   TDI_none,			/* No dump */
-  TDI_cgraph,   /* dump function call graph.  */
-  TDI_inheritance,  /* dump type inheritance graph.  */
+  TDI_cgraph,			/* dump function call graph.  */
+  TDI_inheritance,		/* dump type inheritance graph.  */
   TDI_clones,			/* dump IPA cloning decisions.  */
   TDI_tu,			/* dump the whole translation unit.  */
   TDI_class,			/* dump class hierarchy.  */
   TDI_original,			/* dump each function before optimizing it */
   TDI_generic,			/* dump each function after genericizing it */
   TDI_nested,			/* dump each function after unnesting it */
-  TDI_tree_all, /* enable all the GENERIC/GIMPLE dumps.  */
-  TDI_rtl_all,  /* enable all the RTL dumps.  */
-  TDI_ipa_all,  /* enable all the IPA dumps.  */
+  TDI_tree_all,			/* enable all the GENERIC/GIMPLE dumps.  */
+  TDI_rtl_all,			/* enable all the RTL dumps.  */
+  TDI_ipa_all,			/* enable all the IPA dumps.  */
 
   TDI_end
 };
@@ -49,7 +49,7 @@ enum tree_dump_index
allow that.  */
 #define TDF_ADDRESS	(1 << 0)	/* dump node addresses */
 #define TDF_SLIM	(1 << 1)	/* don't go wild following links */
-#define TDF_RAW  	(1 << 2)	/* don't unparse the function */
+#define TDF_RAW		(1 << 2)	/* don't unparse the function */
 #define TDF_DETAILS	(1 << 3)	/* show more detailed info about
 	   each pass */
 #define TDF_STATS	(1 << 4)	/* dump various statistics about
@@ -66,11 +66,11 @@ enum tree_dump_index
 
 #define TDF_GRAPH	(1 << 13)	/* a graph dump is being emitted */
 #define TDF_MEMSYMS	(1 << 14)	/* display memory symbols in expr.
-   Implies TDF_VOPS.  */
+	   Implies TDF_VOPS.  */
 
 #define TDF_DIAGNOSTIC	(1 << 15)	/* A dump to be put in a diagnostic
 	   message.  */
-#define TDF_VERBOSE (1 << 16)   /* A dump that uses the full tree
+#define TDF_VERBOSE	(1 << 16)	/* A dump that uses the full tree
 	   dumper to print stmts.  */
 #define TDF_RHS_ONLY	(1 << 17)	/* a flag to only print the RHS of
 	   a gimple stmt.  */
@@ -84,49 +84,50 @@ enum tree_dump_index
 #define TDF_SCEV	(1 << 24)	/* Dump SCEV details.  */
 #define TDF_COMMENT	(1 << 25)	/* Dump lines with prefix ";;"  */
 #define TDF_GIMPLE	(1 << 26)	/* Dump in GIMPLE FE syntax  */
-#define MSG_OPTIMIZED_LOCATIONS  (1 << 27)  /* -fopt-info optimized sources */
-#define MSG_MISSED_OPTIMIZATION  (1 << 28)  /* missed opportunities */
-#define MSG_NOTE (1 << 29)  /* general optimization info */
-#define MSG_ALL (MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION \
- | MSG_NOTE)
+#define MSG_OPTIMIZED_LOCATIONS	 (1 << 27)  /* -fopt-info optimized sources */
+#define MSG_MISSED_OPTIMIZATION	 (1 << 28)  /* missed opportunities */
+#define MSG_NOTE		 (1 << 29)  /* general optimization info */
+#define MSG_ALL		(MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION \
+			 | MSG_NOTE)
 
 
 /* Flags to control high-level -fopt-info dumps.  Usually these flags
define a group of passes.  An optimization pass can be part of
multiple groups.  */
-#define OPTGROUP_NONE(0)
-#define OPTGROUP_IPA (1 << 1)   /* IPA optimization passes */
-#define OPTGROUP_LOOP(1 << 2)   /* Loop optimization passes */
-#define OPTGROUP_INLINE  (1 << 3)   /* Inlining passes */
-#define OPTGROUP_OMP (1 << 4)   /* OMP (Offloading and Multi
+#define OPTGROUP_NONE	 (0)
+#define OPTGROUP_IPA	 (1 << 1)	/* IPA optimization passes */
+#define OPTGROUP_LOOP	 (1 << 2)	/* Loop optimization passes */
+#define OPTGROUP_INLINE	 (1 << 3)	/* Inlining passes */
+#define OPTGROUP_OMP	 (1 << 4)	/* OMP (Offloading and Multi
 	   Processing) transformations */
-#define OPTGROUP_VEC (1 << 5)   /* Vectorization passes */
-#define OPTGROUP_OTHER   (1 << 6)   /* All other passes */
+#define OPTGROUP_VEC	 (1 << 5)	/* Vectorization passes */
+#define OPTGROUP_OTHER	 (1 << 6)	/* All other passes */
 #define OPTGROUP_ALL	 (OPTGROUP_IPA | OPTGROUP_LOOP | OPTGROUP_INLINE \
-  | OPTGROUP_OMP | OPTGROUP_VEC | OPTGROUP_OTHER)
+			  | OPTGROUP_OMP | OPTGROUP_VEC | OPTGROUP_OTHER)
 
 /* Define a tree dump switch.  */
 struct dump_file_info
 {
-  const char *suffix;   /* suffix to give output file.  */
-  const char *swtch;/* command line dump switch */
-  const char *glob; /* command line glob  */
-  const char *pfilename;/* filename for the pass-specific stream  */
-  const cha

Re: [1/2] PR 78736: New warning -Wenum-conversion

2017-05-10 Thread Prathamesh Kulkarni
On 9 May 2017 at 23:34, Martin Sebor  wrote:
> On 05/09/2017 07:24 AM, Prathamesh Kulkarni wrote:
>>
>> ping https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00161.html
>>
>> Thanks,
>> Prathamesh
>>
>> On 3 May 2017 at 11:30, Prathamesh Kulkarni
>>  wrote:
>>>
>>> On 3 May 2017 at 03:28, Martin Sebor  wrote:

 On 05/02/2017 11:11 AM, Prathamesh Kulkarni wrote:
>
>
> Hi,
> The attached patch attempts to add option -Wenum-conversion for C and
> objective-C similar to clang, which warns when an enum value of a type
> is implicitly converted to enum value of another type and is enabled
> by Wall.



 It seems quite useful.  My only high-level concern is with
 the growing number of specialized warnings and options for each
 and their interaction.

 I've been working on a -Wenum-assign patch that complains about
 assigning to an enum variable an integer constant that doesn't
 match any of the enumerators of the type.  Testing revealed that
 the -Wenum-assign duplicated a subset of warnings already issued
 by -Wconversion enabled with -Wpedantic.  I'm debating whether
 to suppress that part of -Wenum-assign altogether or only when
 -Wconversion and -Wpedantic are enabled.

 My point is that these dependencies tend to be hard to discover
 and understand, and the interactions tricky to get right (e.g.,
 avoid duplicate warnings for similar but distinct problems).

 This is not meant to be a negative comment on your patch, but
 rather a comment about a general problem that might be worth
 starting to think about.

 One comment on the patch itself:

 + warning_at_rich_loc (&loc, 0, "implicit conversion from"
 +  " enum type of %qT to %qT", checktype,
 type);

 Unlike C++, the C front end formats an enumerated type E using
 %qT as 'enum E', so the warning prints "enum type of 'enum E'",
 duplicating the "enum" part.

 I would suggest simplifying that to:

   warning_at_rich_loc (&loc, 0, "implicit conversion from "
"%qT to %qT", checktype, ...

>>> Thanks for the suggestions. I have updated the patch accordingly.
>>> Hmm the issue you pointed out of warnings interaction is indeed of
>>> concern.
>>> I was wondering then if we should merge this warning with -Wconversion
>>> instead of having a separate option -Wenum-conversion ? Although that
>>> will not
>>> really help with your example below.

 Martin

 PS As an example to illustrate my concern above, consider this:

   enum __attribute__ ((packed)) E { e1 = 1 };
   enum F { f256 = 256 };

   enum E e = f256;

 It triggers -Woverflow:

 warning: large integer implicitly truncated to unsigned type
 [-Woverflow]
enum E e = f256;
   ^~~~

 also my -Wenum-assign:

 warning: integer constant ‘256’ converted to ‘0’ due to limited range [0, 255]
 of type ‘‘enum E’’ [-Wassign-enum]
enum E e = f256;
   ^~~~

 and (IIUC) will trigger your new -Wenum-conversion.
>>>
>>> Yep, on my branch it triggered -Woverflow and -Wenum-conversion.
>>> Running the example on clang shows a single warning, which they call
>>> as -Wconstant-conversion, which
>>> I suppose is similar to your -Wassign-enum.
>
>
> -Wassign-enum is a Clang warning too, it just isn't included in
> either -Wall or -Wextra.  It warns when a constant is assigned
> to a variable of an enumerated type and is not representable in
> it.  I enhanced it for GCC to also warn when the constant doesn't
> correspond to an enumerator in the type, but I'm starting to think
> that rather than adding yet another option to GCC it might be better
> to extend your -Wenum-conversion once it's committed to cover those
> cases (and also to avoid issuing multiple warnings for essentially
> the same problem).  Let me ponder that some more.
>
> I can't approve patches but it looks good to me for the most part.
> There is one minor issue that needs to be corrected:
>
> + gcc_rich_location loc (location);
> + warning_at_rich_loc (&loc, 0, "implicit conversion from"
> +  " %qT to %qT", checktype, type);
>
> Here the zero should be replaced with OPT_Wenum_conversion,
> otherwise the warning option won't be included in the message.
Oops, sorry about that, updated in the attached patch.
In the patch, I have left the warning in Wall, however I was wondering
whether it should be
in Wextra instead ?
The warning triggered for icv.c in libgomp for following assignment:
icv->run_sched_var = kind;

because icv->run_sched_var was of type enum gomp_schedule_type and
'kind' was of type enum omp_sched_t.
However although these enums have different names, they are
structurally identical (same values),
so the warning in this case, although no

[C++ PATCH] add_method, clone_function_decl

2017-05-10 Thread Nathan Sidwell
Another cleanup from modules.  Changed the final parms to bool from tree 
and int respectively.  In the former case we were only using its 
non-nullness.


Although cloning a ctor can clone an inherited ctor (via a using-decl), 
we don't need to record the usingness on the complete or base ctors.


nathan
--
Nathan Sidwell
2017-05-10  Nathan Sidwell  

	gcc/cp/
	* cp-tree.h (add_method, clone_function_decl): Change last arg to
	bool.
	* class.c (add_method): Change third arg to bool.  Adjust.
	(one_inheriting_sig, one_inherited_ctor): Adjust.
	(clone_function_decl): Change 2nd arg to bool.  Adjust.
	(clone_constructors_and_destructors): Adjust.
	* lambda.c (maybe_add_lambda_conv_op): Adjust.
	* method.c (lazily_declare_fn): Adjust.
	* pt.c (tsubst_decl, instantiate_template_1): Adjust.
	* semantics.c (finish_member_declaration): Adjust.

	libcc1/
	* libcp1plugin.cc (plugin_build_decl): Adjust add_method call.

Index: gcc/cp/class.c
===
--- gcc/cp/class.c	(revision 247831)
+++ gcc/cp/class.c	(working copy)
@@ -1002,12 +1002,12 @@ modify_vtable_entry (tree t,
 }
 
 
-/* Add method METHOD to class TYPE.  If USING_DECL is non-null, it is
-   the USING_DECL naming METHOD.  Returns true if the method could be
-   added to the method vec.  */
+/* Add method METHOD to class TYPE.  VIA_USING indicates whether
+   METHOD is being injected via a using_decl.  Returns true if the
+   method could be added to the method vec.  */
 
 bool
-add_method (tree type, tree method, tree using_decl)
+add_method (tree type, tree method, bool via_using)
 {
   unsigned slot;
   tree overload;
@@ -1097,7 +1097,7 @@ add_method (tree type, tree method, tree
 
   /* Two using-declarations can coexist, we'll complain about ambiguity in
 	 overload resolution.  */
-  if (using_decl && TREE_CODE (fns) == OVERLOAD && OVL_USED (fns)
+  if (via_using && TREE_CODE (fns) == OVERLOAD && OVL_USED (fns)
 	  /* Except handle inherited constructors specially.  */
 	  && ! DECL_CONSTRUCTOR_P (fn))
 	goto cont;
@@ -1221,12 +1221,10 @@ add_method (tree type, tree method, tree
 	  /* Otherwise defer to the other function.  */
 	  return false;
 	}
-	  if (using_decl)
-	{
-	  if (DECL_CONTEXT (fn) == type)
-		/* Defer to the local function.  */
-		return false;
-	}
+
+	  if (via_using)
+	/* Defer to the local function.  */
+	return false;
 	  else if (flag_new_inheriting_ctors
 		   && DECL_INHERITED_CTOR (fn))
 	{
@@ -1238,13 +1236,9 @@ add_method (tree type, tree method, tree
 	{
 	  error ("%q+#D cannot be overloaded", method);
 	  error ("with %q+#D", fn);
+	  return false;
 	}
 
-	  /* We don't call duplicate_decls here to merge the
-	 declarations because that will confuse things if the
-	 methods have inline definitions.  In particular, we
-	 will crash while processing the definitions.  */
-	  return false;
 	}
 
 cont:
@@ -1259,7 +1253,7 @@ add_method (tree type, tree method, tree
 return false;
 
   /* Add the new binding.  */
-  if (using_decl)
+  if (via_using)
 {
   overload = ovl_cons (method, current_fns);
   OVL_USED (overload) = true;
@@ -3340,7 +3334,7 @@ one_inheriting_sig (tree t, tree ctor, t
   tree fn = implicitly_declare_fn (sfk_inheriting_constructor,
    t, false, ctor, parmlist);
   gcc_assert (TYPE_MAIN_VARIANT (t) == t);
-  if (add_method (t, fn, NULL_TREE))
+  if (add_method (t, fn, false))
 {
   DECL_CHAIN (fn) = TYPE_METHODS (t);
   TYPE_METHODS (t) = fn;
@@ -3359,7 +3353,7 @@ one_inherited_ctor (tree ctor, tree t, t
 {
   ctor = implicitly_declare_fn (sfk_inheriting_constructor,
 t, /*const*/false, ctor, parms);
-  add_method (t, ctor, using_decl);
+  add_method (t, ctor, using_decl != NULL_TREE);
   TYPE_HAS_USER_CONSTRUCTOR (t) = true;
   return;
 }
@@ -4890,11 +4884,12 @@ decl_cloned_function_p (const_tree decl,
 }
 
 /* Produce declarations for all appropriate clones of FN.  If
-   UPDATE_METHOD_VEC_P is nonzero, the clones are added to the
-   CLASTYPE_METHOD_VEC as well.  */
+   UPDATE_METHODS is true, the clones are added to the
+   CLASTYPE_METHOD_VEC.  VIA_USING indicates whether these are cloning
+   decls brought in via using declarations (i.e. inheriting ctors).  */
 
 void
-clone_function_decl (tree fn, int update_method_vec_p)
+clone_function_decl (tree fn, bool update_methods)
 {
   tree clone;
 
@@ -4908,11 +4903,11 @@ clone_function_decl (tree fn, int update
   /* For each constructor, we need two variants: an in-charge version
 	 and a not-in-charge version.  */
   clone = build_clone (fn, complete_ctor_identifier);
-  if (update_method_vec_p)
-	add_method (DECL_CONTEXT (clone), clone, NULL_TREE);
+  if (update_methods)
+	add_method (DECL_CONTEXT (clone), clone, false);
   clone = build_clone (fn, base_ctor_identifier);
-  if (update_method_vec_p)
-	add_method (DEC

Re: [PATCH] Fix bootstrap on arm target

2017-05-10 Thread Bernd Edlinger
On 05/09/17 15:10, Arnaud Charlet wrote:
>>
>> since a few days the bootstrap of ada fails on a native arm target.
>>
>> It is due to a -Werror warning when passing GNAT_EXCEPTION_CLASS
>> which is a string constant to exception_class_eq, but C++ forbids to cast
>> that to "char*".
>>
>> Not sure what is the smartest solution, I tried the following and it
>> seems to work for x86_64-pc-linux-gnu and arm-linux-gnueabihf.
>>
>> Is it OK for trunk?
>
> Patch looks OK to me FWIW. Tristan?
>

so, should I go ahead and commit it?

>> 2017-05-09  Bernd Edlinger  
>>
>>  * raise-gcc.c (exception_class_eq): Make ec parameter const.
>>
>> --- gcc/ada/raise-gcc.c.jj   2017-04-27 12:00:42.0 +0200
>> +++ gcc/ada/raise-gcc.c  2017-05-09 09:45:59.557507045 +0200
>> @@ -909,7 +909,8 @@
>>  /* Return true iff the exception class of EXCEPT is EC.  */
>>
>>  static int
>> -exception_class_eq (const _GNAT_Exception *except,
>> _Unwind_Exception_Class ec)
>> +exception_class_eq (const _GNAT_Exception *except,
>> +const _Unwind_Exception_Class ec)
>>  {
>>  #ifdef __ARM_EABI_UNWINDER__
>>return memcmp (except->common.exception_class, ec, 8) == 0;
>


Re: [PATCH] Fix bootstrap on arm target

2017-05-10 Thread Arnaud Charlet
> >> It is due to a -Werror warning when passing GNAT_EXCEPTION_CLASS
> >> which is a string constant to exception_class_eq, but C++ forbids to
> >> cast
> >> that to "char*".
> >>
> >> Not sure what is the smartest solution, I tried the following and it
> >> seems to work for x86_64-pc-linux-gnu and arm-linux-gnueabihf.
> >>
> >> Is it OK for trunk?
> >
> > Patch looks OK to me FWIW. Tristan?
> >
> 
> so, should I go ahead and commit it?

Go ahead.

> >> 2017-05-09  Bernd Edlinger  
> >>
> >>* raise-gcc.c (exception_class_eq): Make ec parameter const.
> >>
> >> --- gcc/ada/raise-gcc.c.jj 2017-04-27 12:00:42.0 +0200
> >> +++ gcc/ada/raise-gcc.c2017-05-09 09:45:59.557507045 +0200
> >> @@ -909,7 +909,8 @@
> >>  /* Return true iff the exception class of EXCEPT is EC.  */
> >>
> >>  static int
> >> -exception_class_eq (const _GNAT_Exception *except,
> >> _Unwind_Exception_Class ec)
> >> +exception_class_eq (const _GNAT_Exception *except,
> >> +  const _Unwind_Exception_Class ec)
> >>  {
> >>  #ifdef __ARM_EABI_UNWINDER__
> >>return memcmp (except->common.exception_class, ec, 8) == 0;
> >


Re: [PATCH, v4] Fix PR51513, switch statement with default case containing __builtin_unreachable leads to wild branch

2017-05-10 Thread Richard Biener
On Tue, 9 May 2017, Peter Bergner wrote:

> Here is the updated patch to use gimple_seq_unreachable_p() which scans the
> sequence backwards so it bails out earlier in the common case (ie, no call
> to __builtin_unreachable).  As discussed in the previous thread, we remove
> case statement labels from the jump table that lead to unreachable blocks,
> which leads to fewer compare/branches in the decision tree case and
> possibly a smaller jump table in the jump table case.  Unreachable default
> case statements are only handled here when generating a jump table, since
> the current code for decision trees seems to prefer the status quo.  We can
> revisit this later if someone finds a test case that would benefit from
> handling it for decision trees too.
> 
> This passes bootstrap and regtesting on powerpc64le-linux and x86_64-linux
> with no regressions.  Ok for trunk now?

Ok.

Thanks,
Richard.

> Peter
> 
> gcc/
>   * tree-cfg.c (gimple_seq_unreachable_p): New function.
>   (assert_unreachable_fallthru_edge_p): Use it.
>   (group_case_labels_stmt): Likewise.
>   * tree-cfg.h: Prototype it.
>   * stmt.c: Include cfghooks.h and tree-cfg.h.
>   (emit_case_dispatch_table) : New local variable.
>   Use it to fill dispatch table gaps.
>   Test for default_label before updating probabilities.
>   (expand_case) : Remove unneeded initialization.
>   Test for unreachable default case statement and remove its edge.
>   Set default_label accordingly.
>   * tree-ssa-ccp.c (optimize_unreachable): Update comment.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/pr51513.c: New test.
>   * gcc.dg/predict-13.c: Replace __builtin_unreachable() with
>   __builtin_abort().
>   * gcc.dg/predict-14.c: Likewise.
> 
> Index: gcc/tree-cfg.c
> ===
> --- gcc/tree-cfg.c(revision 247291)
> +++ gcc/tree-cfg.c(working copy)
> @@ -452,6 +452,31 @@ computed_goto_p (gimple *t)
> && TREE_CODE (gimple_goto_dest (t)) != LABEL_DECL);
>  }
>  
> +/* Returns true if the sequence of statements STMTS only contains
> +   a call to __builtin_unreachable ().  */
> +
> +bool
> +gimple_seq_unreachable_p (gimple_seq stmts)
> +{
> +  if (stmts == NULL)
> +return false;
> +
> +  gimple_stmt_iterator gsi = gsi_last (stmts);
> +
> +  if (!gimple_call_builtin_p (gsi_stmt (gsi), BUILT_IN_UNREACHABLE))
> +return false;
> +
> +  for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
> +{
> +  gimple *stmt = gsi_stmt (gsi);
> +  if (gimple_code (stmt) != GIMPLE_LABEL
> +   && !is_gimple_debug (stmt)
> +   && !gimple_clobber_p (stmt))
> +  return false;
> +}
> +  return true;
> +}
> +
>  /* Returns true for edge E where e->src ends with a GIMPLE_COND and
> the other edge points to a bb with just __builtin_unreachable ().
> I.e. return true for C->M edge in:
> @@ -476,22 +501,7 @@ assert_unreachable_fallthru_edge_p (edge
>if (other_bb == e->dest)
>   other_bb = EDGE_SUCC (pred_bb, 1)->dest;
>if (EDGE_COUNT (other_bb->succs) == 0)
> - {
> -   gimple_stmt_iterator gsi = gsi_after_labels (other_bb);
> -   gimple *stmt;
> -
> -   if (gsi_end_p (gsi))
> - return false;
> -   stmt = gsi_stmt (gsi);
> -   while (is_gimple_debug (stmt) || gimple_clobber_p (stmt))
> - {
> -   gsi_next (&gsi);
> -   if (gsi_end_p (gsi))
> - return false;
> -   stmt = gsi_stmt (gsi);
> - }
> -   return gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE);
> - }
> + return gimple_seq_unreachable_p (bb_seq (other_bb));
>  }
>return false;
>  }
> @@ -1668,9 +1678,11 @@ group_case_labels_stmt (gswitch *stmt)
>gcc_assert (base_case);
>base_bb = label_to_block (CASE_LABEL (base_case));
>  
> -  /* Discard cases that have the same destination as the
> -  default case.  */
> -  if (base_bb == default_bb)
> +  /* Discard cases that have the same destination as the default case
> +  or if their destination block is unreachable.  */
> +  if (base_bb == default_bb
> +   || (EDGE_COUNT (base_bb->succs) == 0
> +   && gimple_seq_unreachable_p (bb_seq (base_bb))))
>   {
> gimple_switch_set_label (stmt, i, NULL_TREE);
> i++;
> Index: gcc/tree-cfg.h
> ===
> --- gcc/tree-cfg.h(revision 247291)
> +++ gcc/tree-cfg.h(working copy)
> @@ -56,6 +56,7 @@ extern bool is_ctrl_stmt (gimple *);
>  extern bool is_ctrl_altering_stmt (gimple *);
>  extern bool simple_goto_p (gimple *);
>  extern bool stmt_ends_bb_p (gimple *);
> +extern bool gimple_seq_unreachable_p (gimple_seq);
>  extern bool assert_unreachable_fallthru_edge_p (edge);
>  extern void delete_tree_cfg_annotations (function *);
>  extern gphi *get_virtual_phi (basic_block);
> Index: gcc/stmt.c
> ===
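
The backwards scan in gimple_seq_unreachable_p quoted above can be
illustrated outside GCC.  This is a sketch on a hypothetical toy_stmt
enumeration and a std::vector, not on real GIMPLE sequences; it mirrors only
the shape of the check (test the last statement first, then require that
everything before it is a label, debug statement or clobber).

```cpp
#include <cassert>
#include <vector>

// Hypothetical statement kinds; the real code inspects GIMPLE statements.
enum toy_stmt
{
  TOY_LABEL,
  TOY_DEBUG,
  TOY_CLOBBER,
  TOY_ASSIGN,
  TOY_UNREACHABLE_CALL
};

// Return true if STMTS ends in an unreachable call and everything before
// it is a label, debug statement or clobber.
inline bool
toy_seq_unreachable_p (const std::vector<toy_stmt> &stmts)
{
  if (stmts.empty ())
    return false;

  // Check the last statement first: in the common case it is not an
  // unreachable call and the scan stops immediately.
  if (stmts.back () != TOY_UNREACHABLE_CALL)
    return false;

  for (std::vector<toy_stmt>::const_reverse_iterator it = stmts.rbegin () + 1;
       it != stmts.rend (); ++it)
    if (*it != TOY_LABEL && *it != TOY_DEBUG && *it != TOY_CLOBBER)
      return false;

  return true;
}
```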

Re: [PATCH 0/3] v3 of C++ template type diffing

2017-05-10 Thread Nathan Sidwell

On 05/09/2017 09:41 PM, David Malcolm wrote:

On Tue, 2017-05-09 at 10:52 -0400, Nathan Sidwell wrote:



I split out the non-c++ bits into a separate patch.

v3 of the patch kit is thus three parts:

[1/3] Non-C++ parts of template type diff printing
   I believe I can self-approve this part

[2/3] (v3) C++ template type diff printing
   Updated as above.


C++ bits ok -- thanks for addressing my points.  Give Jason a day to 
respond if he wants too.  The gcc/c-family/ and gcc/ changes will need 
someone other than me.



[3/3] (v3) Use %qH and %qI throughout C++ frontend
   Mostly mechanical


ok.

nathan

--
Nathan Sidwell


[gomp5] Add C/C++ lastprivate(conditional: ...) parsing

2017-05-10 Thread Jakub Jelinek
Hi!

This patch adds, so far, just parsing and diagnostics for the conditional:
modifier of the lastprivate clause.
Implementation for simd will require some vectorizer improvements;
for non-orphaned worksharing I think I can add a shared variable on the
parallel holding the max iteration # stored so far and reduce based on that;
orphaned worksharing will require some library work too.
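
For context, a sketch of the kind of user code the new modifier is meant for.
This is an assumption-laden illustration, not a test from the patch:
toy_last_negative is a hypothetical function, the clause needs a compiler
with OpenMP 5.0 support when OpenMP is enabled, and when OpenMP is disabled
the pragma is simply ignored and the loop runs serially.

```cpp
#include <cassert>

// Return the index of the last negative element of V.  V is assumed to
// contain at least one negative element among its N entries.
inline int
toy_last_negative (const int *v, int n)
{
  int last = -1;
  // With the conditional: modifier, LAST receives the value assigned in
  // the sequentially last iteration that actually wrote to it.
#pragma omp parallel for lastprivate(conditional: last)
  for (int i = 0; i < n; i++)
    if (v[i] < 0)
      last = i;
  return last;
}
```

A plain lastprivate(last) would instead copy out the value from the final
iteration, whether or not that iteration assigned the variable.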

2017-05-10  Jakub Jelinek  

* tree.h (OMP_CLAUSE_LASTPRIVATE_CONDITIONAL): Define.
* tree-pretty-print.c (dump_omp_clause) :
Print conditional: for OMP_CLAUSE_LASTPRIVATE_CONDITIONAL.
* gimplify.c (gimplify_scan_omp_clauses): Diagnose invalid
gimplify_scan_omp_clauses.
c-family/
* c-omp.c (c_omp_split_clauses): Copy
OMP_CLAUSE_LASTPRIVATE_CONDITIONAL.
c/
* c-parser.c (c_parser_omp_clause_lastprivate): Parse optional
conditional: modifier.
(c_parser_cilk_all_clauses): Call c_parser_omp_var_list_parens
directly.
cp/
* parser.c (cp_parser_omp_clause_lastprivate): New function.
(cp_parser_omp_all_clauses): Call it for OpenMP lastprivate clause.

--- gcc/tree.h.jj   2017-05-04 15:05:06.0 +0200
+++ gcc/tree.h  2017-05-09 16:42:05.460935117 +0200
@@ -1469,6 +1469,10 @@ extern void protected_set_expr_location
 #define OMP_CLAUSE_LASTPRIVATE_TASKLOOP_IV(NODE) \
   TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LASTPRIVATE))
 
+/* True if a LASTPRIVATE clause has CONDITIONAL: modifier.  */
+#define OMP_CLAUSE_LASTPRIVATE_CONDITIONAL(NODE) \
+  TREE_PRIVATE (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_LASTPRIVATE))
+
 /* True on a SHARED clause if a FIRSTPRIVATE clause for the same
decl is present in the chain (this can happen only for taskloop
with FIRSTPRIVATE/LASTPRIVATE on it originally.  */
--- gcc/tree-pretty-print.c.jj  2017-05-04 15:05:05.0 +0200
+++ gcc/tree-pretty-print.c 2017-05-09 16:45:46.542093806 +0200
@@ -389,7 +389,13 @@ dump_omp_clause (pretty_printer *pp, tre
   goto print_remap;
 case OMP_CLAUSE_LASTPRIVATE:
   name = "lastprivate";
-  goto print_remap;
+  if (!OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (clause))
+   goto print_remap;
+  pp_string (pp, "lastprivate(conditional:");
+  dump_generic_node (pp, OMP_CLAUSE_DECL (clause),
+spc, flags, false);
+  pp_right_paren (pp);
+  break;
 case OMP_CLAUSE_COPYIN:
   name = "copyin";
   goto print_remap;
--- gcc/gimplify.c.jj   2017-05-04 15:05:58.0 +0200
+++ gcc/gimplify.c  2017-05-10 12:58:56.400849056 +0200
@@ -7431,16 +7431,42 @@ gimplify_scan_omp_clauses (tree *list_p,
  check_non_private = "firstprivate";
  goto do_add;
case OMP_CLAUSE_LASTPRIVATE:
+ if (OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (c))
+   switch (code)
+ {
+ case OMP_DISTRIBUTE:
+   error_at (OMP_CLAUSE_LOCATION (c),
+ "conditional %<lastprivate%> clause on "
+ "%<distribute%> construct");
+   OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (c) = 0;
+   break;
+ case OMP_TASKLOOP:
+   error_at (OMP_CLAUSE_LOCATION (c),
+ "conditional %<lastprivate%> clause on "
+ "%<taskloop%> construct");
+   OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (c) = 0;
+   break;
+ default:
+   break;
+ }
  flags = GOVD_LASTPRIVATE | GOVD_SEEN | GOVD_EXPLICIT;
  check_non_private = "lastprivate";
  decl = OMP_CLAUSE_DECL (c);
  if (error_operand_p (decl))
goto do_add;
- else if (outer_ctx
-  && (outer_ctx->region_type == ORT_COMBINED_PARALLEL
-  || outer_ctx->region_type == ORT_COMBINED_TEAMS)
-  && splay_tree_lookup (outer_ctx->variables,
-(splay_tree_key) decl) == NULL)
+ if (OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (c)
+ && !lang_hooks.decls.omp_scalar_p (decl))
+   {
+ error_at (OMP_CLAUSE_LOCATION (c),
+   "non-scalar variable %qD in conditional "
+   "%<lastprivate%> clause", decl);
+ OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (c) = 0;
+   }
+ if (outer_ctx
+ && (outer_ctx->region_type == ORT_COMBINED_PARALLEL
+ || outer_ctx->region_type == ORT_COMBINED_TEAMS)
+ && splay_tree_lookup (outer_ctx->variables,
+   (splay_tree_key) decl) == NULL)
{
  omp_add_variable (outer_ctx, decl, GOVD_SHARED | GOVD_SEEN);
  if (outer_ctx->outer_context)
--- gcc/c-family/c-omp.c.jj 2017-05-04 15:05:05.0 +0200
+++ gcc/c-family/c-omp.c2017-05-09 17:39:18.503872276 +0200
@@ -1191,6 +1191,8 @@ c_omp_split_clauses (location_t loc, enu
OMP_CLAUSE_LASTP

[PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

2017-05-10 Thread Alexander Monakov
Hi,

When expanding __atomic_thread_fence(x) to RTL, the i386 backend doesn't emit
any instruction except for x==__ATOMIC_SEQ_CST (which emits 'mfence').  This 
is incorrect: although no machine barrier is needed, the compiler still must
emit a compiler barrier into the IR to prevent propagation and code motion
across the fence.  The testcase added with the patch shows how it can lead
to a miscompilation.

The proposed patch fixes it by handling non-seq-cst fences exactly like
__atomic_signal_fence is expanded, by emitting asm volatile("":::"memory").

The s390 backend uses a similar mem_thread_fence expansion, so the patch
fixes both backends in the same manner.

Bootstrapped and regtested on x86_64; also checked that s390-linux cc1
successfully builds after the change.  OK for trunk?

(the original source code in the PR was misusing atomic fences by doing
something like

  void f(int *p)
  {
while (*p)
  __atomic_thread_fence(__ATOMIC_ACQUIRE);
  }

but since *p is not atomic, a concurrent write to *p would cause a data race and
thus invoke undefined behavior; also, if *p is false prior to entering the loop,
execution does not encounter the fence; new test here has code usable without 
UB)

Alexander

* config/i386/sync.md (mem_thread_fence): Emit a compiler barrier for
non-seq-cst fences.  Adjust comment.
* config/s390/s390.md (mem_thread_fence): Likewise.
* optabs.c (expand_asm_memory_barrier): Export.
* optabs.h (expand_asm_memory_barrier): Declare.
testsuite/
* gcc.target/i386/pr80640-1.c: New testcase.
---
 gcc/config/i386/sync.md   |  7 ++-
 gcc/config/s390/s390.md   | 11 +--
 gcc/optabs.c  |  2 +-
 gcc/optabs.h  |  3 +++
 gcc/testsuite/gcc.target/i386/pr80640-1.c | 12 
 5 files changed, 31 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80640-1.c

diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index 20d46fe..619d53b 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -108,7 +108,7 @@ (define_expand "mem_thread_fence"
   enum memmodel model = memmodel_from_int (INTVAL (operands[0]));
 
   /* Unless this is a SEQ_CST fence, the i386 memory model is strong
- enough not to require barriers of any kind.  */
+ enough not to require a processor barrier of any kind.  */
   if (is_mm_seq_cst (model))
 {
   rtx (*mfence_insn)(rtx);
@@ -124,6 +124,11 @@ (define_expand "mem_thread_fence"
 
   emit_insn (mfence_insn (mem));
 }
+  else if (!is_mm_relaxed (model))
+{
+  /* However, a compiler barrier is still required.  */
+  expand_asm_memory_barrier ();
+}
   DONE;
 })
 
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index c9fd19a..65e54c4 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -10109,14 +10109,21 @@ (define_expand "mem_thread_fence"
   [(match_operand:SI 0 "const_int_operand")]   ;; model
   ""
 {
+  enum memmodel model = memmodel_from_int (INTVAL (operands[0]));
+
   /* Unless this is a SEQ_CST fence, the s390 memory model is strong
- enough not to require barriers of any kind.  */
-  if (is_mm_seq_cst (memmodel_from_int (INTVAL (operands[0]))))
+ enough not to require a processor barrier of any kind.  */
+  if (is_mm_seq_cst (model))
 {
   rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
   MEM_VOLATILE_P (mem) = 1;
   emit_insn (gen_mem_thread_fence_1 (mem));
 }
+  else if (!is_mm_relaxed (model))
+{
+  /* However, a compiler barrier is still required.  */
+  expand_asm_memory_barrier ();
+}
   DONE;
 })
 
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 48e37f8..1f1fbc3 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6269,7 +6269,7 @@ expand_atomic_compare_and_swap (rtx *ptarget_bool, rtx 
*ptarget_oval,
 
 /* Generate asm volatile("" : : : "memory") as the memory barrier.  */
 
-static void
+void
 expand_asm_memory_barrier (void)
 {
   rtx asm_op, clob;
diff --git a/gcc/optabs.h b/gcc/optabs.h
index b2dd31a..aca6755 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -322,6 +322,9 @@ extern bool expand_atomic_compare_and_swap (rtx *, rtx *, 
rtx, rtx, rtx, bool,
 extern void expand_mem_thread_fence (enum memmodel);
 extern void expand_mem_signal_fence (enum memmodel);
 
+/* Generate a compile-time memory barrier.  */
+extern void expand_asm_memory_barrier (void);
+
 rtx expand_atomic_load (rtx, rtx, enum memmodel);
 rtx expand_atomic_store (rtx, rtx, enum memmodel, bool);
 rtx expand_atomic_fetch_op (rtx, rtx, rtx, enum rtx_code, enum memmodel, 
diff --git a/gcc/testsuite/gcc.target/i386/pr80640-1.c 
b/gcc/testsuite/gcc.target/i386/pr80640-1.c
new file mode 100644
index 000..f1d1e55
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80640-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-ter" } *

Re: Make tree-ssa-strlen.c handle partial unterminated strings

2017-05-10 Thread Jakub Jelinek
Hi!

Note the intent of the pass is to handle the most common cases; it is fine
if some uncommon cases aren't handled.  It is all about the extra
complexity vs. how much it helps on real-world code.

On Sun, May 07, 2017 at 10:10:48AM +0100, Richard Sandiford wrote:
> I've got most of the way through a version that uses min_length instead.
> But one thing that the terminated flag allows that a constant min_length
> doesn't is:
> 
>   size_t
>   f1 (char *a1)
>   {
> size_t x = strlen (a1);
> char *a3 = a1 + x;
> a3[0] = '1';  // a1 length x + 1, unterminated  (min length x + 1)
> a3[1] = '2';  // a1 length x + 2, unterminated  (min length x + 2)
> a3[2] = '3';  // a1 length x + 3, unterminated  (min length x + 3)
> a3[3] = 0;// a1 length x + 3, terminated
> return strlen (a1);
>   }
> 
> For the a3[3] = 0, we know a3's min_length is 3 and so it's obvious
> that we can convert its min_length to a length.  But even if we allow
> a1's min_length to be nonconstant, it seems a bit risky to assume that
> we can convert its min_length to a length as well.  It would only work
> if the min_lengths in related strinfos are kept in sync, whereas it
> ought to be safe to say that the minimum length of something is 0.

And we have code for that.  If verify_related_strinfos returns non-NULL,
we can adjust all the related strinfos that need adjusting.
See e.g. zero_length_string on how it uses that.  It is just that we should
decide what is the acceptable complexity of the length/min_length
expressions (whether INTEGER_CST or SSA_NAME is enough, then the above
would not work, but is that really that important), or if we e.g. allow
SSA_NAME + INTEGER_CST in addition to that, or sum of 2 SSA_NAMEs, etc.).
I don't see how terminated vs. unterminated (which is misnamed anyway, it
means that it isn't known to be terminated, it might be terminated or not)
would help with that.

> So I think that gives four possibilities:
> 
>   (1) Keep the terminated flag, but extend the original patch to handle
>   strings built up a character at a time.  This would handle f1 above.

Only if you allow complex expressions like SSA_NAME + INTEGER_CST in length.

>   (2) Replace the terminated flag with a constant minimum length, don't
>   handle f1 above.

Sure, if you only allow constants, it will be limited to constants.

>   (3) Replace the terminated flag with an arbitrary minimum length and
>   ensure that it's always valid to copy the minimum length to the
>   length when we do so for the final strinfo in a chain.

Even length doesn't allow arbitrary expressions; the more complex it is,
the more expensive it will be to compute when you e.g. replace
strlen with it.

I'd introduce min_length, start with INTEGER_CST, once it is handled
everywhere in the pass properly, see if there is enough code in the wild
that would justify allowing more than that.

min_length is a simple guarantee that there are no zero bytes among the
first min_length bytes, length is the same plus that there is a zero byte
right after that, so it is easy to argue about what happens if you store
non-zero somewhere into it, or store zero, etc.
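
As a rough sketch of that reasoning (a toy model only, not code from
tree-ssa-strlen.c), the state for one string can be kept as a constant
min_length plus a flag saying whether a zero byte is known to sit right
after it.  The struct str_state and the function record_store below are
hypothetical names invented for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: no zero bytes among the first MIN_LENGTH bytes; if
   TERMINATED, the byte at offset MIN_LENGTH is known to be zero, so
   MIN_LENGTH is the exact length.  */
struct str_state
{
  size_t min_length;
  bool terminated;
};

/* Update the state for a store of byte C at offset OFF.  */
void
record_store (struct str_state *s, size_t off, char c)
{
  /* A store past a known terminator does not change the string.  */
  if (s->terminated && off > s->min_length)
    return;

  if (c != 0)
    {
      /* Overwriting the first possibly-zero byte extends the guarantee
         by one and drops any knowledge about termination.  Stores
         elsewhere teach us nothing new in this simple model.  */
      if (off == s->min_length)
        {
          s->min_length = off + 1;
          s->terminated = false;
        }
    }
  else if (off <= s->min_length)
    {
      /* A zero inside or right after the known non-zero prefix fixes
         the exact length.  */
      s->min_length = off;
      s->terminated = true;
    }
}
```

Starting from an empty, terminated string and storing '1', '2', '3' and then
0 at offsets 0 to 3 walks through the same states as the a3 stores in f1
above.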

Jakub


[PATCH] Verify loops with TODO_verify_il

2017-05-10 Thread Richard Biener

This fixes two remaining issues with properly preserving loops.
One is academic: -dx skips all RTL optimizers and then ICEs in
clean_state when trying to verify loops after we freed everything
else.  The other may be serious: TM instrumentation wrecks loops
but doesn't tell others about it (TM might be a reason to not
"ignore" loops with abnormal edges?).

Re-bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-05-10  Richard Biener  

* passes.c (execute_function_todo): Verify loops if they are
said to be up-to-date.
* cfgexpand.c (pass_expand::execute): Discard loops for -dx.
* trans-mem.c (pass_tm_edges::execute): Mark loops for fixup.

Index: gcc/passes.c
===
--- gcc/passes.c(revision 247838)
+++ gcc/passes.c(working copy)
@@ -1979,8 +1979,12 @@ execute_function_todo (function *fn, voi
  && !from_ipa_pass)
verify_flow_info ();
  if (current_loops
- && loops_state_satisfies_p (LOOP_CLOSED_SSA))
-   verify_loop_closed_ssa (false);
+ && ! loops_state_satisfies_p (LOOPS_NEED_FIXUP))
+   {
+ verify_loop_structure ();
+ if (loops_state_satisfies_p (LOOP_CLOSED_SSA))
+   verify_loop_closed_ssa (false);
+   }
  if (cfun->curr_properties & PROP_rtl)
verify_rtl_sharing ();
}
Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c (revision 247838)
+++ gcc/cfgexpand.c (working copy)
@@ -6542,6 +6542,14 @@ pass_expand::execute (function *fun)
   set_block_levels (DECL_INITIAL (fun->decl), 0);
   default_rtl_profile ();
 
+  /* For -dx discard loops now, otherwise IL verify in clean_state will
+ ICE.  */
+  if (rtl_dump_and_exit)
+{
+  cfun->curr_properties &= ~PROP_loops;
+  loop_optimizer_finalize ();
+}
+
   timevar_pop (TV_POST_EXPAND);
 
   return 0;
Index: gcc/trans-mem.c
===
--- gcc/trans-mem.c (revision 247838)
+++ gcc/trans-mem.c (working copy)
@@ -3369,6 +3369,8 @@ pass_tm_edges::execute (function *fun)
  must be rebuilt completely.  Otherwise we'll crash trying to update
  the SSA web in the TODO section following this pass.  */
   free_dominance_info (CDI_DOMINATORS);
+  /* We've also wrecked loops badly with inserting of abnormal edges.  */
+  loops_state_set (LOOPS_NEED_FIXUP);
   bitmap_obstack_release (&tm_obstack);
   all_tm_regions = NULL;
 


[PATCH] Make PRE/FRE elimination not do useless work

2017-05-10 Thread Richard Biener

So this is a patch that makes skipping unreachable code when
doing elimination possible.  Previously interesting interactions
with tail-merging made this impossible, now I seem to have
figured a way around this.
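
The reachability test used for the skipping boils down to "does any
incoming edge carry the executable flag".  A stand-alone sketch of that
check, with a made-up struct toy_edge and flag value standing in for GCC's
edge type and EDGE_EXECUTABLE, might look like this:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's edge representation.  */
#define TOY_EDGE_EXECUTABLE 0x1u

struct toy_edge
{
  unsigned flags;
};

/* Return true if at least one of the N_PREDS incoming edges in PREDS
   was marked executable by the earlier value-numbering walk.  A block
   with no executable incoming edge is treated as unreachable and can
   be skipped.  */
bool
block_reachable_p (const struct toy_edge *preds, size_t n_preds)
{
  for (size_t i = 0; i < n_preds; i++)
    if (preds[i].flags & TOY_EDGE_EXECUTABLE)
      return true;
  return false;
}
```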

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

With this more elaborate stmt folding during elimination might
be in reach.

Richard.

2017-05-10  Richard Biener  

* tree-ssa-pre.c (eliminate_dom_walker::before_dom_children):
Skip unreachable blocks and destinations.
(eliminate): Move stmt removal and fixup ...
(fini_eliminate): ... here.  Skip inserted exprs.
(pass_pre::execute): Move fini_pre after fini_eliminate.
* tree-ssa-tailmerge.c: Include tree-cfgcleanup.h.
(tail_merge_optimize): Run cleanup_tree_cfg if requested by
PRE to get rid of dead code that has invalid SSA form and
split critical edges again.

Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 247831)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -4196,9 +4196,14 @@ eliminate_dom_walker::before_dom_childre
   /* Mark new bb.  */
   el_avail_stack.safe_push (NULL_TREE);
 
-  /* ???  If we do nothing for unreachable blocks then this will confuse
- tailmerging.  Eventually we can reduce its reliance on SCCVN now
- that we fully copy/constant-propagate (most) things.  */
+  /* Skip unreachable blocks marked unreachable during the SCCVN domwalk.  */
+  edge_iterator ei;
+  edge e;
+  FOR_EACH_EDGE (e, ei, b->preds)
+if (e->flags & EDGE_EXECUTABLE)
+  break;
+  if (! e)
+return NULL;
 
   for (gphi_iterator gsi = gsi_start_phis (b); !gsi_end_p (gsi);)
 {
@@ -4695,10 +4700,8 @@ eliminate_dom_walker::before_dom_childre
 }
 
   /* Replace destination PHI arguments.  */
-  edge_iterator ei;
-  edge e;
   FOR_EACH_EDGE (e, ei, b->succs)
-{
+if (e->flags & EDGE_EXECUTABLE)
   for (gphi_iterator gsi = gsi_start_phis (e->dest);
   !gsi_end_p (gsi);
   gsi_next (&gsi))
@@ -4717,7 +4720,6 @@ eliminate_dom_walker::before_dom_childre
gimple_set_plf (SSA_NAME_DEF_STMT (sprime), NECESSARY, true);
}
}
-}
   return NULL;
 }
 
@@ -4743,9 +4745,6 @@ eliminate_dom_walker::after_dom_children
 static unsigned int
 eliminate (bool do_pre)
 {
-  gimple_stmt_iterator gsi;
-  gimple *stmt;
-
   need_eh_cleanup = BITMAP_ALLOC (NULL);
   need_ab_cleanup = BITMAP_ALLOC (NULL);
 
@@ -4761,6 +4760,18 @@ eliminate (bool do_pre)
   el_avail.release ();
   el_avail_stack.release ();
 
+  return el_todo;
+}
+
+/* Perform CFG cleanups made necessary by elimination.  */
+
+static unsigned 
+fini_eliminate (void)
+{
+  gimple_stmt_iterator gsi;
+  gimple *stmt;
+  unsigned todo = 0;
+
   /* We cannot remove stmts during BB walk, especially not release SSA
  names there as this confuses the VN machinery.  The stmts ending
  up in el_to_remove are either stores or simple copies.
@@ -4782,8 +4793,9 @@ eliminate (bool do_pre)
lhs = gimple_get_lhs (stmt);
 
   if (inserted_exprs
- && TREE_CODE (lhs) == SSA_NAME)
-   bitmap_clear_bit (inserted_exprs, SSA_NAME_VERSION (lhs));
+ && TREE_CODE (lhs) == SSA_NAME
+ && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (lhs)))
+   continue;
 
   gsi = gsi_for_stmt (stmt);
   if (gimple_code (stmt) == GIMPLE_PHI)
@@ -4800,7 +4812,7 @@ eliminate (bool do_pre)
}
 
   /* Removing a stmt may expose a forwarder block.  */
-  el_todo |= TODO_cleanup_cfg;
+  todo |= TODO_cleanup_cfg;
 }
   el_to_remove.release ();
 
@@ -4819,18 +4831,10 @@ eliminate (bool do_pre)
}
 
   if (fixup_noreturn_call (stmt))
-   el_todo |= TODO_cleanup_cfg;
+   todo |= TODO_cleanup_cfg;
 }
   el_to_fixup.release ();
 
-  return el_todo;
-}
-
-/* Perform CFG cleanups made necessary by elimination.  */
-
-static unsigned 
-fini_eliminate (void)
-{
   bool do_eh_cleanup = !bitmap_empty_p (need_eh_cleanup);
   bool do_ab_cleanup = !bitmap_empty_p (need_ab_cleanup);
 
@@ -4844,8 +4848,8 @@ fini_eliminate (void)
   BITMAP_FREE (need_ab_cleanup);
 
   if (do_eh_cleanup || do_ab_cleanup)
-return TODO_cleanup_cfg;
-  return 0;
+todo |= TODO_cleanup_cfg;
+  return todo;
 }
 
 /* Borrow a bit of tree-ssa-dce.c for the moment.
@@ -5110,8 +5114,8 @@ pass_pre::execute (function *fun)
   remove_dead_inserted_code ();
 
   scev_finalize ();
-  fini_pre ();
   todo |= fini_eliminate ();
+  fini_pre ();
   loop_optimizer_finalize ();
 
   /* Restore SSA info before tail-merging as that resets it as well.  */
Index: gcc/tree-ssa-tail-merge.c
===
--- gcc/tree-ssa-tail-merge.c   (revision 247831)
+++ gcc/tree-ssa-tail-merge.c   (working copy)
@@ -205,6 +205,7 @@ along with GCC; see the file COPYING3.
 #include "tree-ssa-sccvn.h"
 #include "cfgloop.h"
 #include 

Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

2017-05-10 Thread Alexander Monakov
While fixing the fences issue of PR80640 I've noticed that a similar oversight
is present in expansion of atomic loads on x86: they become volatile loads, but
that is insufficient, a compiler memory barrier is still needed.  Volatility
prevents tearing the load (preserves non-divisibility of atomic access), but
does not prevent propagation and code motion of non-volatile accesses across the
load.  The testcase below demonstrates that even SEQ_CST loads are mishandled.

It's less clear how to fix this one.  AFAICS there are two possible approaches:
either emit explicit compiler barriers on either side of the load (a barrier
before the load is needed for SEQ_CST loads), or change how atomic loads are
expressed in RTL: atomic stores already use ATOMIC_STA UNSPECs, so they don't
suffer from this issue.  The asymmetry seems odd to me.

The former approach is easier and exposes non-seq-cst loads upwards for
optimization, so that's what the proposed patch implements.

The s390 backend suffers from the same issue, except there the atomic stores are
affected too.
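
At the C level, the barrier placement proposed here corresponds roughly to
the following sketch.  It assumes GNU C inline asm; the enum toy_order and
the function load_with_barriers are hypothetical names for illustration and
do not exist in GCC.

```c
/* Compiler-only barrier: emits no instruction, but the "memory" clobber
   stops the compiler from moving memory accesses across it.  */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" : : : "memory")

/* Hypothetical, simplified memory orders for this sketch.  */
enum toy_order { TOY_RELAXED, TOY_ACQUIRE, TOY_SEQ_CST };

/* Load *P as a volatile access, adding compiler barriers the way the
   patch does: one before the load for seq-cst only, one after the load
   for everything stronger than relaxed.  */
int
load_with_barriers (const volatile int *p, enum toy_order order)
{
  if (order == TOY_SEQ_CST)
    COMPILER_BARRIER ();

  int value = *p;

  if (order != TOY_RELAXED)
    COMPILER_BARRIER ();

  return value;
}
```

The volatile access alone keeps the load from being torn or dropped; the
barriers are what keep neighbouring non-volatile accesses in place.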

Alexander

* config/i386/sync.md (atomic_load): Emit a compiler barrier
before (only for SEQ_CST order) and after the load.
testsuite/
* gcc.target/i386/pr80640-2.c: New testcase.

---
 gcc/config/i386/sync.md   |  7 +++
 gcc/testsuite/gcc.target/i386/pr80640-2.c | 11 +++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80640-2.c

diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index 619d53b..35a7b38 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -155,6 +155,11 @@ (define_expand "atomic_load"
   UNSPEC_LDA))]
   ""
 {
+  enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
+
+  if (is_mm_seq_cst (model))
+expand_asm_memory_barrier ();
+
   /* For DImode on 32-bit, we can use the FPU to perform the load.  */
   if (<MODE>mode == DImode && !TARGET_64BIT)
 emit_insn (gen_atomic_loaddi_fpu
@@ -173,6 +178,8 @@ (define_expand "atomic_load"
   if (dst != operands[0])
emit_move_insn (operands[0], dst);
 }
+  if (!is_mm_relaxed (model))
+expand_asm_memory_barrier ();
   DONE;
 })
 
diff --git a/gcc/testsuite/gcc.target/i386/pr80640-2.c 
b/gcc/testsuite/gcc.target/i386/pr80640-2.c
new file mode 100644
index 000..f0a050c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80640-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-ter" } */
+/* { dg-final { scan-assembler-not "xorl\[\\t \]*\\\%eax,\[\\t \]*%eax" } } */
+
+int f(int *p, volatile _Bool *p1, _Atomic _Bool *p2)
+{
+  int v = *p;
+  *p1 = !v;
+  while (__atomic_load_n (p2, __ATOMIC_SEQ_CST));
+  return v - *p;
+}
-- 
1.8.3.1



Re: [c++ PATCH] PR c++/80682

2017-05-10 Thread Ville Voutilainen
On 10 May 2017 at 14:40, Nathan Sidwell  wrote:
>>> Full testsuite run is clean. Is it ok to backport this change to
>>> gcc-6? (And gcc-7, too)
>>
>>
>> ..and gcc-5. Backporting everywhere allows library implementations
>> including libc++ to
>> just use the intrinsic, without using std::is_constructible in addition.
>
>
> I have no objection, and this is a very simple fix.


I appreciate that, but given that I operate under Write-After Approval, I need
more than a no-objection, I will need an actual ok from a maintainer. :)


Re: [RFC PATCH, i386]: Enable post-reload compare elimination pass

2017-05-10 Thread Jakub Jelinek
On Tue, May 09, 2017 at 06:06:47PM +0200, Uros Bizjak wrote:
> Attached patch enables post-reload compare elimination pass by
> providing expected patterns (duplicates of existing patterns with
> setters of reg and flags switched in the parallel) for flag setting
> arithmetic instructions.
> 
> The merge triggers more than 3000 times during the gcc bootstrap,
> mostly in cases where intervening memory load or store prevents
> combine from merging the arithmetic insn and the following compare.
> 
> Also, some recent linux x86_64 defconfig build results in ~200 merges,
> removing ~200 test/cmp insns. Not much, but I think the results still
> warrant the pass to be enabled.

Isn't the right fix instead to change the compare-elim.c pass to either
accept both reg vs. flags orderings in parallel, or both depending
on some target hook, or change it to the order i386.md and most other
major targets use and just fix up mn10300/rx (and aarch64?) to use the same
order?
I think this has been discussed before already several times.

> 
> 2017-05-09  Uros Bizjak  
> 
> * config/i386/i386-protos.h (ix86_match_ccmode_last): New prototype.
> * config/i386/i386.c (ix86_match_ccmode_1): Rename from
> ix86_match_ccmode.  Add "last" argument.  Make function static inline.
> (ix86_match_ccmode): New function.
> (ix86_match_ccmode_last): Ditto.
> (TARGET_FLAGS_REGNUM): Define.
> * config/i386/i386.md (*add_2b): New insn pattern.
> (*sub_2b): Ditto.
> (*and_2b): Ditto.
> (*_2b): Ditto.

Jakub


Re: [PATCH GCC8][01/33]Handle TRUNCATE between tieable modes in rtx_cost

2017-05-10 Thread Bin.Cheng
On Wed, May 3, 2017 at 11:09 AM, Kyrill Tkachov
 wrote:
> Hi Bin,
>
>
> On 03/05/17 11:02, Bin.Cheng wrote:
>>
>> On Wed, May 3, 2017 at 9:38 AM, Bin.Cheng  wrote:
>>>
>>> On Wed, May 3, 2017 at 7:17 AM, Eric Botcazou 
>>> wrote:
>
> 2017-04-11  Bin Cheng  
>
>* rtlanal.c (rtx_cost): Handle TRUNCATE between tieable modes.

 This breaks bootstrap with RTL checking:

 /home/eric/build/gcc/native/./gcc/xgcc
 -B/home/eric/build/gcc/native/./gcc/ -
 nostdinc -x c /dev/null -S -o /dev/null -fself-
 test=/home/eric/svn/gcc/gcc/testsuite/selftests
 cc1: internal compiler error: RTL check: expected code 'subreg', have
 'truncate' in rtx_cost, at rtlanal.c:4169
 0xbae338 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*,
 int,
 char const*)
  /home/eric/svn/gcc/gcc/rtl.c:829
 0xbbc9b4 rtx_cost(rtx_def*, machine_mode, rtx_code, int, bool)
  /home/eric/svn/gcc/gcc/rtlanal.c:4169
 0x8517e6 set_src_cost
  /home/eric/svn/gcc/gcc/rtl.h:2685
 0x8517e6 init_expmed_one_conv
  /home/eric/svn/gcc/gcc/expmed.c:142
 0x8517e6 init_expmed_one_mode
  /home/eric/svn/gcc/gcc/expmed.c:209
 0x853fb2 init_expmed()
  /home/eric/svn/gcc/gcc/expmed.c:270
 0xc45974 backend_init_target
  /home/eric/svn/gcc/gcc/toplev.c:1665
 0xc45974 initialize_rtl()

>>> Sorry for disturbing, I will revert this if can't fix today.
>>
>> It looks bogus and I couldn't find the motivating case for it, so
>> revert with attached patch.  Build on x86 and commit as obvious.
>>
>> Thanks,
>> bin
>> 2017-05-03  Bin Cheng  
>>
>>  Revert
>>  2017-05-02  Bin Cheng  
>>  * rtlanal.c (rtx_cost): Handle TRUNCATE between tieable modes.
>
>
> Looking at the code in the patch...
>
> +case TRUNCATE:
> +  /* If we can tie these modes, make this cheap.  */
> +  if (MODES_TIEABLE_P (mode, GET_MODE (SUBREG_REG (x))))
>
> 'code' here is GET_CODE (x) and in this case it is TRUNCATE.
> SUBREG_REG asserts (in RTL checking mode) that its argument is a SUBREG, so
> passing it a TRUNCATE rtx would cause
> the checking failure Eric reported. I think you meant to use XEXP (x, 0)
> instead of SUBREG_REG (x) ?
It turned out this is still necessary for the test case pr49781-1.c.  I
updated the patch as suggested.  This is kind of an obvious update.
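
For readers not familiar with RTL checking, the difference between the two
accessors can be sketched with a toy tagged node.  Everything below
(toy_rtx, toy_xexp0, toy_subreg_reg) is invented for this illustration; the
real SUBREG_REG aborts with an ICE under RTL checking, whereas this sketch
returns NULL on a code mismatch so that it stays testable.

```c
#include <stddef.h>

/* Hypothetical miniature of an RTL expression with one operand.  */
enum toy_code { TOY_REG, TOY_SUBREG, TOY_TRUNCATE };

struct toy_rtx
{
  enum toy_code code;
  struct toy_rtx *op0;
};

/* Generic accessor, in the spirit of XEXP (x, 0): valid for any node
   that has a first operand.  */
struct toy_rtx *
toy_xexp0 (struct toy_rtx *x)
{
  return x->op0;
}

/* Code-specific accessor, in the spirit of SUBREG_REG (x): only valid
   for SUBREG nodes.  Returns NULL where the checked macro would fail.  */
struct toy_rtx *
toy_subreg_reg (struct toy_rtx *x)
{
  return x->code == TOY_SUBREG ? x->op0 : NULL;
}
```

Passing a TRUNCATE node to the SUBREG-only accessor is the mistake the
backtrace above reports; the generic accessor works for both codes.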

Thanks,
bin
>
> Thanks,
> Kyrill
>
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 321363f..d9f57c3 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -4164,6 +4164,13 @@ rtx_cost (rtx x, machine_mode mode, enum rtx_code 
outer_code,
return COSTS_N_INSNS (2 + factor);
   break;
 
+case TRUNCATE:
+  if (MODES_TIEABLE_P (mode, GET_MODE (XEXP (x, 0))))
+   {
+ total = 0;
+ break;
+   }
+  /* FALLTHRU */
 default:
   if (targetm.rtx_costs (x, mode, outer_code, opno, &total, speed))
return total;


[PATCH] Ada/x32: PR ada/80626: Correct Memory_Size

2017-05-10 Thread H.J. Lu
On Tue, May 9, 2017 at 7:32 PM, H.J. Lu  wrote:
> On Tue, Apr 4, 2017 at 4:46 AM, Andreas Krebbel
>  wrote:
>> On 04/03/2017 06:18 PM, Eric Botcazou wrote:
>>>> On S/390 UNITS_PER_WORD is:
>>>> 8 with -m64
>>>> 4 with -m31
>>>> 8 with -m31 -mzarch
>>>>
>>>> This has been chosen to support use of 64 bit registers also in 32 bit
>>>> code.  Code compiled with -m31 -mzarch is supposed to adhere to the 32
>>>> bit ABI.  In order to make that work it was required to prevent
>>>> UNITS_PER_WORD from being used in ABI-relevant contexts.  That's why
>>>> Ulrich added the TARGET_UNWIND_WORD_MODE in 2008 (for SPU).
>>>
>>> We do that for 32-bit SPARC on Solaris (-mv8plus) but UNITS_PER_WORD is 4.
>>>
>>>> Now I could either fix this by reverting that change for S/390
>>>> (similar to what Andreas Schwab did to fix the BZ) or I could just
>>>> use the size of the long data type (as we do in the ABI-relevant parts
>>>> of the backend as well).  Which one do you prefer?
>>>
>>> Having System.Word_Size != Standard'Word_Size is a bit disturbing.  Does it
>>> work to change only Memory_Size to 2 ** Long_Integer'Size?  This will also
>>> correct the definition of Address below.
>>
>> This worked as well. I've committed the following patch:
>>
>> gcc/ada/ChangeLog:
>>
>> 2017-04-04  Andreas Krebbel  
>>
>> * system-linux-s390.ads: Use Long_Integer'Size to define
>> Memory_Size.
>> ---
>>  gcc/ada/system-linux-s390.ads | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/ada/system-linux-s390.ads b/gcc/ada/system-linux-s390.ads
>> index 485a8de..9bf8375 100644
>> --- a/gcc/ada/system-linux-s390.ads
>> +++ b/gcc/ada/system-linux-s390.ads
>> @@ -70,7 +70,7 @@ package System is
>>
>> Storage_Unit : constant := 8;
>> Word_Size: constant := Standard'Word_Size;
>> -   Memory_Size  : constant := 2 ** Word_Size;
>> +   Memory_Size  : constant := 2 ** Long_Integer'Size;
>>
>> --  Address comparison
>>
>
> X32 needs something similar:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80626
>

Here is the patch:

X32 uses 64 as word size instead of 32.  This must not affect the
Address type definition which is based on Memory_Size.
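
To see the mismatch from the C side, the sketch below reports the width of
long and of a data pointer.  It assumes that Ada's Long_Integer has the
same size as C long on these targets, and the helper names long_bits and
pointer_bits are made up for this illustration.  With -mx32 both should be
32 even though the registers are 64 bits wide, which is why
2 ** Long_Integer'Size gives the right Memory_Size there.

```c
#include <limits.h>

/* Width of C long in bits (assumed here to match Long_Integer'Size).  */
unsigned
long_bits (void)
{
  return (unsigned) (sizeof (long) * CHAR_BIT);
}

/* Width of a data pointer in bits, i.e. the size an address needs.  */
unsigned
pointer_bits (void)
{
  return (unsigned) (sizeof (void *) * CHAR_BIT);
}
```

Building this with -m64, -m32 and -mx32 and printing both values is a quick
way to compare them against the word size on each ABI.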

PR ada/80626
* system-linux-x86.ads (Memory_Size): Use Long_Integer'Size
instead of Word_Size.

Tested on x86-64 with -m64/-m32/-mx32.   OK for trunk and
gcc-7-branch?

Thanks.

-- 
H.J.
From 813f0651e7c2a506903d0dfd0daff8895c339800 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Wed, 10 May 2017 07:31:21 -0700
Subject: [PATCH] Ada/x32: PR ada/80626: Correct Memory_Size

X32 uses 64 as word size instead of 32.  This must not affect the
Address type definition which is based on Memory_Size.

	PR ada/80626
	* system-linux-x86.ads (Memory_Size): Use Long_Integer'Size
	instead of Word_Size.
---
 gcc/ada/system-linux-x86.ads | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/system-linux-x86.ads b/gcc/ada/system-linux-x86.ads
index 22a212e..533d94e 100644
--- a/gcc/ada/system-linux-x86.ads
+++ b/gcc/ada/system-linux-x86.ads
@@ -70,7 +70,7 @@ package System is
 
Storage_Unit : constant := 8;
Word_Size: constant := Standard'Word_Size;
-   Memory_Size  : constant := 2 ** Word_Size;
+   Memory_Size  : constant := 2 ** Long_Integer'Size;
 
--  Address comparison
 
-- 
2.9.3



Re: [1/2] PR 78736: New warning -Wenum-conversion

2017-05-10 Thread Martin Sebor

On 05/10/2017 06:19 AM, Prathamesh Kulkarni wrote:

On 9 May 2017 at 23:34, Martin Sebor  wrote:

On 05/09/2017 07:24 AM, Prathamesh Kulkarni wrote:


ping https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00161.html

Thanks,
Prathamesh

On 3 May 2017 at 11:30, Prathamesh Kulkarni
 wrote:


On 3 May 2017 at 03:28, Martin Sebor  wrote:


On 05/02/2017 11:11 AM, Prathamesh Kulkarni wrote:



Hi,
The attached patch attempts to add option -Wenum-conversion for C and
objective-C similar to clang, which warns when an enum value of a type
is implicitly converted to enum value of another type and is enabled
by Wall.




It seems quite useful.  My only high-level concern is with
the growing number of specialized warnings and options for each
and their interaction.

I've been working on a -Wenum-assign patch that complains about
assigning to an enum variable an integer constant that doesn't
match any of the enumerators of the type.  Testing revealed that
the -Wenum-assign duplicated a subset of warnings already issued
by -Wconversion enabled with -Wpedantic.  I'm debating whether
to suppress that part of -Wenum-assign altogether or only when
-Wconversion and -Wpedantic are enabled.

My point is that these dependencies tend to be hard to discover
and understand, and the interactions tricky to get right (e.g.,
avoid duplicate warnings for similar but distinct problems).

This is not meant to be a negative comment on your patch, but
rather a comment about a general problem that might be worth
starting to think about.

One comment on the patch itself:

+ warning_at_rich_loc (&loc, 0, "implicit conversion from"
+  " enum type of %qT to %qT", checktype,
type);

Unlike C++, the C front end formats an enumerated type E using
%qT as 'enum E' so the warning prints 'enum type of 'enum E'),
duplicating the "enum" part.

I would suggest to simplify that to:

  warning_at_rich_loc (&loc, 0, "implicit conversion from "
   "%qT to %qT", checktype, ...


Thanks for the suggestions. I have updated the patch accordingly.
Hmm the issue you pointed out of warnings interaction is indeed of
concern.
I was wondering then if we should merge this warning with -Wconversion
instead of having a separate option -Wenum-conversion ? Although that
will not
really help with your example below.


Martin

PS As an example to illustrate my concern above, consider this:

  enum __attribute__ ((packed)) E { e1 = 1 };
  enum F { f256 = 256 };

  enum E e = f256;

It triggers -Woverflow:

warning: large integer implicitly truncated to unsigned type
[-Woverflow]
   enum E e = f256;
  ^~~~

also my -Wenum-assign:

warning: integer constant ‘256’ converted to ‘0’ due to limited range
[0,
255] of type ‘‘enum E’’ [-Wassign-enum]
   enum E e = f256;
  ^~~~

and (IIUC) will trigger your new -Wenum-conversion.


Yep, on my branch it triggered -Woverflow and -Wenum-conversion.
Running the example on clang shows a single warning, which they call
as -Wconstant-conversion, which
I suppose is similar to your -Wassign-enum.



-Wassign-enum is a Clang warning too, it just isn't included in
either -Wall or -Wextra.  It warns when a constant is assigned
to a variable of an enumerated type and is not representable in
it.  I enhanced it for GCC to also warn when the constant doesn't
correspond to an enumerator in the type, but I'm starting to think
that rather than adding yet another option to GCC it might be better
to extend your -Wenum-conversion once it's committed to cover those
cases (and also to avoid issuing multiple warnings for essentially
the same problem).  Let me ponder that some more.

I can't approve patches but it looks good to me for the most part.
There is one minor issue that needs to be corrected:

+ gcc_rich_location loc (location);
+ warning_at_rich_loc (&loc, 0, "implicit conversion from"
+  " %qT to %qT", checktype, type);

Here the zero should be replaced with OPT_Wenum_conversion,
otherwise the warning option won't be included in the message.

Oops, sorry about that, updated in the attached patch.
In the patch, I have left the warning in Wall, however I was wondering
whether it should be
in Wextra instead ?
The warning triggered for icv.c in libgomp for following assignment:
icv->run_sched_var = kind;

because icv->run_sched_var was of type enum gomp_schedule_type and
'kind' was of type enum omp_sched_t.
However although these enums have different names, they are
structurally identical (same values),
so the warning in this case, although not a false positive, seems a
bit artificial ?
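
The libgomp situation can be reduced to something like the sketch below.
The type and enumerator names are invented for this illustration and are
not the real libgomp declarations.  Without the cast, the assignment is the
kind of implicit enum-to-enum conversion the proposed warning is meant to
flag, assuming it behaves as described in this thread; with the cast the
intent is explicit.

```c
/* Two hypothetical enums with identical values but distinct types.  */
enum toy_user_sched { TOY_USER_STATIC = 1, TOY_USER_DYNAMIC = 2 };
enum toy_internal_sched { TOY_INT_STATIC = 1, TOY_INT_DYNAMIC = 2 };

struct toy_icv
{
  enum toy_internal_sched run_sched_var;
};

/* Store KIND into ICV.  The explicit cast documents that the two enums
   are meant to be interchangeable.  */
void
toy_set_schedule (struct toy_icv *icv, enum toy_user_sched kind)
{
  icv->run_sched_var = (enum toy_internal_sched) kind;
}
```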


I'd say the warning is justified in this case, even if the two
enums are clearly designed to be interchangeable.  It will be
a reminder to review code like it to make sure it is, in fact
intended and correct.  If it is, it's easy to suppress by
an explicit cast.  So based on this example alone I wouldn't
feel compelled to remove it from -Wall just y

Re: [RFC PATCH, i386]: Enable post-reload compare elimination pass

2017-05-10 Thread Uros Bizjak
On Wed, May 10, 2017 at 4:27 PM, Jakub Jelinek  wrote:
> On Tue, May 09, 2017 at 06:06:47PM +0200, Uros Bizjak wrote:
>> Attached patch enables post-reload compare elimination pass by
>> providing expected patterns (duplicates of existing patterns with
>> setters of reg and flags switched in the parallel) for flag setting
>> arithmetic instructions.
>>
>> The merge triggers more than 3000 times during the gcc bootstrap,
>> mostly in cases where intervening memory load or store prevents
>> combine from merging the arithmetic insn and the following compare.
>>
>> Also, some recent linux x86_64 defconfig build results in ~200 merges,
>> removing ~200 test/cmp insns. Not much, but I think the results still
>> warrant the pass to be enabled.
>
> Isn't the right fix instead to change the compare-elim.c pass to either
> accept both reg vs. flags orderings in parallel, or both depending
> on some target hook, or change it to the order i386.md and most other
> major targets use and just fix up mn10300/rx (and aarch64?) to use the same
> order?

I was looking at compare-elim.c, where in line 675 the function
try_eliminate_compare simply substitutes the clobber of the CC reg with
a new compare RTX through a validate_change call.  I'm not sure what
would be the best way to handle both insn variants here.  I was hoping
perhaps Jeff would help with the correct approach.  An additional copy
of several patterns indeed seems heavily counter-productive.

> I think this has been discussed before already several times.

True, but there was no resolution.  As an experiment, I was surprised
how many cases the patched compiler caught, even with the limited set of
additional patterns.  So the postreload cmpelim pass certainly brings
some benefits, in the same sense the postreload ree pass does: it catches
additional opportunities that the combine pass wasn't able to merge.
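
A source-level shape that plausibly leads to such a case is shown below.
This is only a sketch with a made-up function name; whether a given GCC
version actually leaves a redundant test here depends on the options and on
what earlier passes do.

```c
/* The subtraction already sets the flags on x86, but the store to *OUT
   sits between it and the comparison against zero.  A separate test/cmp
   after the store would be redundant, because the store does not touch
   the flags.  Callers are assumed to avoid signed overflow.  */
int
sub_store_test (int a, int b, int *out)
{
  int diff = a - b;   /* arithmetic insn, sets the flags */
  *out = diff;        /* intervening memory store */
  return diff == 0;   /* compare that could reuse those flags */
}
```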

Uros.


Re: [PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-05-10 Thread Jakub Jelinek
On Fri, May 05, 2017 at 10:23:59AM -0700, Kevin Buettner wrote:
> On Fri, 5 May 2017 14:23:14 +0300 (MSK)
> Alexander Monakov  wrote:
> 
> > On Thu, 4 May 2017, Kevin Buettner wrote:
> > > diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
> > > index 5c48b78..7029951 100644
> > > --- a/gcc/omp-expand.c
> > > +++ b/gcc/omp-expand.c
> > > @@ -667,6 +667,25 @@ expand_parallel_call (struct omp_region *region, 
> > > basic_block bb,  
> > 
> > > Outlined functions are also used for 'omp task' and 'omp target' regions, but
> > > here only 'omp parallel' is handled. Will this code need to be duplicated for
> > > those region types?
> 
> For 'omp task' and 'omp target', I think it's possible or even likely
> that the original context which started these parallel tasks will no
> longer exist.  So, it might not make sense to do something equivalent
> for 'task' and 'target'.

It depends.  E.g. #pragma omp taskloop without a nogroup clause acts the
same as #pragma omp parallel with regard to nesting: the GOMP_taskloop*
function will not return until all the tasks have finished.  The same holds
if you have #pragma omp task with #pragma omp taskwait on the next line, or
#pragma omp taskgroup around it, or #pragma omp target without a nowait clause.
Then there are cases where the encountering function will still be around,
but not all of the lexical scopes (or inline functions), e.g. if there
is a #pragma omp taskwait or taskgroup etc. outside of the innermost lexical
scope(s), but still somewhere in the function.  Should the debugger in that
case figure out that the spot where the task was created has passed, and so
not show vars in the lexical scopes already left, but still show the others?
Then of course if nothing in the current function waits for the
task or async target, the function's frame could already have been
left, and perhaps those of multiple callers too.

> > >tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
> > >t2 = build_fold_addr_expr (child_fndecl);
> > >  
> > > +  if (gimple_block (entry_stmt) != NULL_TREE
> > > +  && TREE_CODE (gimple_block (entry_stmt)) == BLOCK)  
> > 
> > Here and also below, ...
> > 
> > > +{
> > > +  tree b = BLOCK_SUPERCONTEXT (gimple_block (entry_stmt));
> > > +
> > > +  /* Add child_fndecl to var chain of the supercontext of the
> > > +block corresponding to entry_stmt.  This ensures that debug
> > > +info for the outlined function will be emitted for the correct
> > > +lexical scope.  */
> > > +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)  
> > 
> > > ... here, I'm curious why the conditionals are necessary -- I don't see why the
> > > conditions can be sometimes true and sometimes false.  Sorry if I'm missing
> > > something obvious.

gimple_block can be NULL.  And most callers of gimple_block that rely on it
being a BLOCK actually do verify that it is a BLOCK.  While anything else is
unlikely, and it is usually just LTO that screws things up, I'd keep the check.

> > > +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)  
> 
> I check to make sure that b is a block so that I can later refer to
> BLOCK_VARS (b).

I believe BLOCK_SUPERCONTEXT of a BLOCK should always be non-NULL, either
another BLOCK, or FUNCTION_DECL.  Thus I think b != NULL_TREE && is
redundant here.

What I don't like is that the patch is inconsistent, it sets DECL_CONTEXT
of the child function for all kinds of outlined functions, but then you just
choose one of the many places and add it into the BLOCK tree.  Any reason
why the DECL_CONTEXT change can't be done in a helper function together
with all the changes you've added into omp-expand.c, and then call it from
expand_omp_parallel (with the child_fn and entry_stmt arguments) so that
you can call it easily also for other constructs, either now or later on?
Also, is there any rationale for appending the FUNCTION_DECL to BLOCK_VARS
instead of prepending it there (which is cheaper)?  Does the debugger
care about relative order of those artificial functions vs. other
variables in the lexical scope?

Jakub
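
The cost difference between prepending and appending to a singly linked
chain can be sketched in standalone C.  This is only an illustration:
struct decl and both helpers are hypothetical stand-ins and not GCC's tree
API; the point is merely that appending has to walk the chain while
prepending does not.

```c
#include <stddef.h>

/* Hypothetical stand-in for a declaration linked through a chain field.  */
struct decl {
    const char *name;
    struct decl *chain;
};

/* Prepend: constant time, the new node becomes the head.  */
static struct decl *chain_prepend(struct decl *head, struct decl *d)
{
    d->chain = head;
    return d;
}

/* Append: walks the whole chain to find the tail, so linear time.  */
static struct decl *chain_append(struct decl *head, struct decl *d)
{
    d->chain = NULL;
    if (head == NULL)
        return d;
    struct decl *tail = head;
    while (tail->chain != NULL)
        tail = tail->chain;
    tail->chain = d;
    return head;
}
```

The two helpers produce different orderings, which is why the question of
whether a debugger cares about the relative order matters.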


Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Thomas Koenig

Hi Andreas,


>> +  index_type t1_dim;
>> +  t1_dim = (a_dim1-1) * 256 + b_dim1;
>> +  if (t1_dim > 65536)
>> +   t1_dim = 65536;
>> +
>> +#pragma GCC diagnostic push
>> +#pragma GCC diagnostic ignored "-Wvla"
>> +  'rtype_name` t1[t1_dim]; /* was [256][256] */
>
> That does the wrong thing if b_dim1 == 0xDEADBEEF.
>
> (gdb) p (a_dim1-1) * 256 + b_dim1
> $2 = -764456190


A look into the source code shows that b_dim1 is index_type,
which is 32 bits on 32-bit systems and 64 bits on 64-bit systems.

Now, consider if it is possible to declare an array on a 32-bit
system where the number of elements along one direction exceeds 2**31-1
(so sign extension would come into play), or if it would be
possible to declare an array on a 64-bit system where the number of
elements along one direction exceeds 2**63-1.

If you manage to come up with a legal Fortran testcase which
sets b_dim1 to 0xdeadbeef, I owe you a beer :-)

Regards

Thomas
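
The arithmetic under discussion can be tried outside libgfortran with a
small C sketch.  Everything here is an assumption for illustration: index32
stands in for a 32-bit index_type, the helper names are invented, and the
computation is widened to 64 bits so that the sketch itself has no signed
overflow.  It only shows that a clamp checking the upper bound alone lets a
negative b_dim1 through, whatever the origin of such a value:

```c
#include <stdint.h>

/* Hypothetical stand-in for index_type on a 32-bit target.  */
typedef int32_t index32;

/* Clamp as in the quoted hunk: only the upper bound is checked.  */
static int64_t t1_dim_upper_only(index32 a_dim1, index32 b_dim1)
{
    int64_t t1_dim = ((int64_t) a_dim1 - 1) * 256 + b_dim1;
    if (t1_dim > 65536)
        t1_dim = 65536;
    return t1_dim;
}

/* Variant that additionally refuses non-positive sizes.  */
static int64_t t1_dim_both_bounds(index32 a_dim1, index32 b_dim1)
{
    int64_t t1_dim = t1_dim_upper_only(a_dim1, b_dim1);
    if (t1_dim < 1)
        t1_dim = 1;
    return t1_dim;
}
```

With a_dim1 == 1 and b_dim1 holding the bit pattern 0xDEADBEEF (that is,
-559038737 as a 32-bit signed value), the first helper returns a negative
size and the second returns 1.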


Re: OpenACC 2.5 default (present) clause

2017-05-10 Thread Jakub Jelinek
On Fri, Apr 07, 2017 at 05:08:55PM +0200, Thomas Schwinge wrote:
> Hi!
> 
> OpenACC 2.5 added a default (present) clause, which "causes all arrays or
> variables of aggregate data type used in the compute construct that have
> implicitly determined data attributes to be treated as if they appeared
> in a present clause".  Preceded by the following cleanup patch (see
>  for its
> origin), OK for trunk in next stage 1?
> 
> commit 787fea9e71f693c1b629a699f8476f392c4bc55d
> Author: Thomas Schwinge 
> Date:   Thu Apr 6 07:58:37 2017 +0200
> 
> Clarify gcc/gimplify.c:oacc_default_clause
> 
> gcc/
> * gimplify.c (oacc_default_clause): Clarify.
> ---
>  gcc/gimplify.c | 40 ++--
>  1 file changed, 22 insertions(+), 18 deletions(-)
> 
> @@ -6897,6 +6900,7 @@ omp_default_clause (struct gimplify_omp_ctx *ctx, tree decl,
>  found_outer:
>break;
>  
> +case OMP_CLAUSE_DEFAULT_PRESENT:
>  default:
>gcc_unreachable ();
>  }

What is the point of this hunk?  Document that present clause is not in
OpenMP?  I don't think that is needed.

> + case 0 | GOVD_MAP_FORCE:
> +   kind = GOMP_MAP_TOFROM | GOMP_MAP_FLAG_FORCE;
> +   break;

Please drop the "0 | ".

> @@ -357,7 +357,9 @@ enum omp_clause_code {
>/* OpenMP clause: ordered [(constant-integer-expression)].  */
>OMP_CLAUSE_ORDERED,
>  
> -  /* OpenMP clause: default.  */
> +  /* OpenACC clause: default ( none | present ).
> +
> + OpenMP clause: default ( firstprivate | none | private | shared ). */
>OMP_CLAUSE_DEFAULT,
>  
>/* OpenACC/OpenMP clause: collapse (constant-integer-expression).  */

I think this hunk isn't needed (plus it is not accurate anyway).

Otherwise LGTM.

Jakub


Re: Test cases to check OpenACC offloaded function's attributes and classification

2017-05-10 Thread Jakub Jelinek
On Mon, May 08, 2017 at 07:02:15PM +0200, Thomas Schwinge wrote:
> Hi!
> 
> Ping.
> 
> On Thu, 4 Aug 2016 16:06:10 +0200, I wrote:
> > Ping.
> > 
> > On Wed, 27 Jul 2016 10:59:02 +0200, I wrote:
> > > OK for trunk?
> 
> (In the mean time, I also added some more testing.)
> 
> commit b7d61270dfc581a6ea130f7a4fa7506a0a5762d8
> Author: Thomas Schwinge 
> Date:   Mon May 8 18:22:50 2017 +0200
> 
> Test cases to check OpenACC offloaded function's attributes and classification
> 
> gcc/testsuite/
> * c-c++-common/goacc/classify-kernels-unparallelized.c: New file.
> * c-c++-common/goacc/classify-kernels.c: Likewise.
> * c-c++-common/goacc/classify-parallel.c: Likewise.
> * c-c++-common/goacc/classify-routine.c: Likewise.
> * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
> * gfortran.dg/goacc/classify-kernels.f95: Likewise.
> * gfortran.dg/goacc/classify-parallel.f95: Likewise.
> * gfortran.dg/goacc/classify-routine.f95: Likewise.

Dunno if it isn't too fragile, but if you are willing to maintain it, ok.

Jakub


Re: [C++ Patch] PR 80186 ("ICE on C++ code with invalid constructor...")

2017-05-10 Thread Jason Merrill
> On Mon, May 8, 2017 at 7:25 AM, Paolo Carlini  
> wrote:
>> in order to avoid this error recovery issue I think we want to check the
>> return value of grok_ctor_properties, as we do in decl.c, for its only other
>> use. Tested x86_64-linux.

The new testcase fails with -std=c++1z because the rvalue directly
initializes the parameter, so we don't get the second error.  Fixing
thus.

Jason
commit 6a6c7d18384241ce7b7a9a23490e074b196b8ef9
Author: Jason Merrill 
Date:   Wed May 10 11:14:28 2017 -0400

* g++.dg/template/crash126.C: Second error doesn't apply to C++17.

diff --git a/gcc/testsuite/g++.dg/template/crash126.C b/gcc/testsuite/g++.dg/template/crash126.C
index 8a3112e..903cab8 100644
--- a/gcc/testsuite/g++.dg/template/crash126.C
+++ b/gcc/testsuite/g++.dg/template/crash126.C
@@ -9,5 +9,5 @@ template < class T, class > struct A
 
 void f () 
 {
-  A < int, int > (A < int, int >());  // { dg-error "cannot bind" }
+  A < int, int > (A < int, int >());  // { dg-error "cannot bind" "" { target c++14_down } }
 }


Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Thomas Koenig

On 10.05.2017 at 17:42, Thomas Koenig wrote:

> If you manage to come up with a legal Fortran testcase which
> sets b_dim1 to 0xdeadbeef, I owe you a beer :-)


... on a 32-bit system, of course.


Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Andreas Schwab
On May 10 2017, Thomas Koenig wrote:

> If you manage to come up with a legal Fortran testcase which
> sets b_dim1 to 0xdeadbeef, I owe you a beer :-)

grep is your friend.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: C++ PATCHes for c++/79549 and 79556 (ICE with auto parameter pack)

2017-05-10 Thread Jason Merrill
On Fri, Feb 17, 2017 at 1:41 PM, Jason Merrill  wrote:
> In 79556, we try to deduce an auto type from a dependent initializer
> with null TREE_TYPE, which doesn't work; fixed by catching that case
> in do_auto_deduction.
>
> In 79549, we try to tsubst into the type of a NONTYPE_ARGUMENT_PACK,
> which doesn't make sense for an auto parameter pack; in fact, it
> doesn't make sense for the argument pack to have a type at all.  For
> GCC 7 I'm fixing this by leaving the auto type in place; for GCC 8
> we'll do away with TREE_TYPE on all NONTYPE_ARGUMENT_PACKs.

And now I'm applying the GCC8 patch.

Jason


Re: [patch, fortran] Reduce stack use in blocked matmul

2017-05-10 Thread Andreas Schwab
On May 10 2017, Thomas Koenig wrote:

> ... on a 32-bit system, of course.

http://gcc.gnu.org/ml/gcc-testresults/2017-05/msg01063.html

FAIL: gfortran.dg/generic_20.f90   -O0  execution test
FAIL: gfortran.dg/generic_20.f90   -O1  execution test
FAIL: gfortran.dg/generic_20.f90   -O2  execution test
FAIL: gfortran.dg/generic_20.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/generic_20.f90   -O3 -g  execution test
FAIL: gfortran.dg/generic_20.f90   -Os  execution test
FAIL: gfortran.dg/matmul_6.f90   -O0  execution test
FAIL: gfortran.dg/matmul_bounds_5.f90   -O0  output pattern test
FAIL: gfortran.dg/matmul_bounds_6.f90   -O0  execution test
FAIL: gfortran.dg/matmul_bounds_6.f90   -O1  execution test
FAIL: gfortran.dg/matmul_bounds_6.f90   -O2  execution test
FAIL: gfortran.dg/matmul_bounds_6.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/matmul_bounds_6.f90   -O3 -g  execution test
FAIL: gfortran.dg/matmul_bounds_6.f90   -Os  execution test

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Use "oacc kernels" attribute for OpenACC kernels

2017-05-10 Thread Jakub Jelinek
On Mon, May 08, 2017 at 09:29:28PM +0200, Thomas Schwinge wrote:
> commit fac5c3214f58812881635d3fb1e1751446d4b660
> Author: Thomas Schwinge 
> Date:   Mon May 8 21:24:46 2017 +0200
> 
> Use "oacc kernels" attribute for OpenACC kernels
> 
> gcc/
> * omp-expand.c (expand_omp_target)
> : Set "oacc kernels" attribute.

I think
* omp-expand.c (expand_omp_target) :
Set "oacc kernels" attribute.
fits better.

> * omp-general.c (oacc_set_fn_attrib): Remove is_kernel formal
> parameter.  Adjust all users.
> (oacc_fn_attrib_kernels_p): Remove function.
> (execute_oacc_device_lower): Look for "oacc kernels" attribute
> instead of calling oacc_fn_attrib_kernels_p.
> * tree-ssa-loop.c (gate_oacc_kernels): Likewise.
> * tree-parloops.c (create_parallel_loop): If oacc_kernels_p,
> assert "oacc kernels" attribute is set.
> gcc/testsuite/
> * c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust.
> * c-c++-common/goacc/classify-kernels.c: Likewise.
> * c-c++-common/goacc/classify-parallel.c: Likewise.
> * c-c++-common/goacc/classify-routine.c: Likewise.
> * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
> * gfortran.dg/goacc/classify-kernels.f95: Likewise.
> * gfortran.dg/goacc/classify-parallel.f95: Likewise.
> * gfortran.dg/goacc/classify-routine.f95: Likewise.

> @@ -7451,7 +7457,7 @@ expand_omp_target (struct omp_region *region)
>break;
>  case BUILT_IN_GOACC_PARALLEL:
>{
> - oacc_set_fn_attrib (child_fn, clauses, oacc_kernels_p, &args);
> + oacc_set_fn_attrib (child_fn, clauses, &args);
>   tagging = true;
>}
>/* FALLTHRU */

The {}s aren't needed around this, could you drop them?

> + pos = tree_cons (purpose[ix],
> +  build_int_cst (integer_type_node, dims[ix]),
> +  pos);

pos); would fit on the earlier line.

Ok with those changes.

Jakub


Re: Use "oacc kernels parallelized" attribute for parallelized OpenACC kernels (was: Use "oacc kernels" attribute for OpenACC kernels)

2017-05-10 Thread Jakub Jelinek
On Tue, May 09, 2017 at 10:57:34PM +0200, Thomas Schwinge wrote:
> commit b6b5d549089423e3fbe387f63467d052b956f3f7
> Author: Thomas Schwinge 
> Date:   Tue May 9 20:14:03 2017 +0200
> 
> Use "oacc kernels parallelized" attribute for parallelized OpenACC kernels
> 
> gcc/
> * tree-parloops.c (create_parallel_loop): Set "oacc kernels
> parallelized" attribute for parallelized OpenACC kernels.
> * omp-offload.c (execute_oacc_device_lower): Use it.
> gcc/testsuite/
> * c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust.
> * c-c++-common/goacc/classify-kernels.c: Likewise.
> * c-c++-common/goacc/kernels-counter-vars-function-scope.c:
> Likewise.
> * c-c++-common/goacc/kernels-double-reduction-n.c: Likewise.
> * c-c++-common/goacc/kernels-double-reduction.c: Likewise.
> * c-c++-common/goacc/kernels-loop-2.c: Likewise.
> * c-c++-common/goacc/kernels-loop-3.c: Likewise.
> * c-c++-common/goacc/kernels-loop-g.c: Likewise.
> * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
> * c-c++-common/goacc/kernels-loop-n.c: Likewise.
> * c-c++-common/goacc/kernels-loop-nest.c: Likewise.
> * c-c++-common/goacc/kernels-loop.c: Likewise.
> * c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
> * c-c++-common/goacc/kernels-reduction.c: Likewise.
> * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
> * gfortran.dg/goacc/classify-kernels.f95: Likewise.
> * gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
> * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
> * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
> * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
> * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
> * gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
> * gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
> * gfortran.dg/goacc/kernels-loop.f95: Likewise.

Ok.

Jakub


Minor C++ PATCH for dependent_type_p sanity check

2017-05-10 Thread Jason Merrill
A while back I noticed that we were calling dependent_type_p with
global_type_node, which should never happen.  I fixed that separately,
but this patch adds a check to catch it if it happens again.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit f9b1055070c90162f505d11a929a797f55d3230f
Author: Jason Merrill 
Date:   Fri Feb 17 15:24:17 2017 -0500

* pt.c (dependent_type_p): Make sure we aren't called with
global_type_node.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 72256b3..b9e7af7 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -23436,6 +23436,10 @@ dependent_type_p (tree type)
   if (type == error_mark_node)
 return false;
 
+  /* Getting here with global_type_node means we improperly called this
+ function on the TREE_TYPE of an IDENTIFIER_NODE.  */
+  gcc_checking_assert (type != global_type_node);
+
   /* If we have not already computed the appropriate value for TYPE,
  do so now.  */
   if (!TYPE_DEPENDENT_P_VALID (type))
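
The pattern of guarding a function against a sentinel argument with an
assertion that is only active in checking builds can be sketched in plain C.
This is not GCC code: the macro, the SKETCH_CHECKING switch, the struct and
the two sentinel objects are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical switch: set to 0 to compile the check out.  */
#ifndef SKETCH_CHECKING
#define SKETCH_CHECKING 1
#endif

/* Hypothetical checking assert, active only when SKETCH_CHECKING is nonzero.  */
#define sketch_checking_assert(expr) \
    do { if (SKETCH_CHECKING) assert(expr); } while (0)

/* Hypothetical type node with a precomputed "dependent" flag.  */
struct type_node {
    bool dependent;
};

/* Sentinels: one is a legitimate "error" input, the other must never be
   passed to is_dependent.  */
static struct type_node error_node;
static struct type_node global_node;

static bool is_dependent(const struct type_node *type)
{
    if (type == &error_node)
        return false;

    /* Catch callers that hand in the wrong sentinel.  */
    sketch_checking_assert(type != &global_node);

    return type->dependent;
}
```

The error sentinel is handled as an ordinary early return, while the other
sentinel indicates a caller bug and is therefore only asserted against.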


Re: OpenACC C front end maintenance: c_parser_oacc_single_int_clause (was: [OpenACC] num_gangs, num_workers and vector_length in c++)

2017-05-10 Thread Jakub Jelinek
On Tue, May 09, 2017 at 11:27:14PM +0200, Thomas Schwinge wrote:
> Hi!
> 
> On Wed, 4 Nov 2015 14:10:29 -0800, Cesar Philippidis wrote:
> > > [...]
> > 
> > Thanks. I've applied this patch to trunk.
> 
> > gcc/cp/
> > * (cp_parser_oacc_single_int_clause): New function.
> > (cp_parser_oacc_clause_vector_length): Delete.
> > (cp_parser_omp_clause_num_gangs): Delete.
> > (cp_parser_omp_clause_num_workers): Delete.
> > (cp_parser_oacc_all_clauses): Use cp_parser_oacc_single_int_clause
> > for num_gangs, num_workers and vector_length.
> 
> Here is a similar patch for the C front end, and it also adds test cases.
> OK for trunk?  During testing I also noticed some "strangeness" (also
> regarding some similar OpenMP clauses), which I'll file a PR for, later
> on.
> 
> commit 70baf3fc0c544fe63b9d3b3bebcca88daf7ce554
> Author: Thomas Schwinge 
> Date:   Fri May 5 16:38:52 2017 +0200
> 
> OpenACC C front end maintenance: c_parser_oacc_single_int_clause
> 
> gcc/c/
> * c-parser.c (c_parser_omp_clause_num_gangs)
> (c_parser_omp_clause_num_workers)
> (c_parser_omp_clause_vector_length): Merge functions into...
> (c_parser_oacc_single_int_clause): ... this new function.  Adjust
> all users.
> gcc/testsuite/
> * c-c++-common/goacc/parallel-dims-1.c: New file.
> * c-c++-common/goacc/parallel-dims-2.c: Likewise.

Ok.

Jakub


Re: [c++ PATCH] PR c++/80682

2017-05-10 Thread Nathan Sidwell

On 05/10/2017 10:21 AM, Ville Voutilainen wrote:
> On 10 May 2017 at 14:40, Nathan Sidwell wrote:

> I appreciate that, but given that I operate under Write-After Approval, I need
> more than a no-objection, I will need an actual ok from a maintainer. :)

IIUC backports have to fix a regression.  So you'll need to convince the
release maintainer.  From a C++ POV you're approved.


nathan

--
Nathan Sidwell


Re: [c++ PATCH] PR c++/80682

2017-05-10 Thread Richard Biener
On May 10, 2017 6:36:20 PM GMT+02:00, Nathan Sidwell  wrote:
>On 05/10/2017 10:21 AM, Ville Voutilainen wrote:
>> On 10 May 2017 at 14:40, Nathan Sidwell wrote:
>>
>> I appreciate that, but given that I operate under Write-After Approval,
>> I need more than a no-objection, I will need an actual ok from a
>> maintainer. :)
>
>IIUC backports have to fix a regression.  So you'll need to convince
>the release maintainer.  From a C++ POV you're approved.

Wrong-code, rejects-valid and ice-on-valid-code are generally fine to backport 
if the fixes are not too intrusive.

Richard.

>nathan



Re: [CHKP] Fix for PR79990

2017-05-10 Thread Ilya Enkovich
2017-05-09 16:29 GMT+03:00 Alexander Ivchenko :
> Hi,
>
> Here is the latest version of the patch with all comments addressed:
>
> gcc/ChangeLog:
>
> 2017-05-09  Alexander Ivchenko  
>
> * tree-chkp.c (chkp_get_hard_register_var_fake_base_address):
> New function.
> (chkp_get_hard_register_fake_addr_expr): Ditto.
> (chkp_build_addr_expr): Add check for hard reg case.
> (chkp_parse_array_and_component_ref): Ditto.
> (chkp_find_bounds_1): Ditto.
> (chkp_process_stmt): Don't generate bounds store for
> hard reg case.
>
>
> gcc/testsuite/ChangeLog:
>
> 2017-05-09  Alexander Ivchenko  
>
> * gcc.target/i386/mpx/hard-reg-2-lbv.c: New test.
> * gcc.target/i386/mpx/hard-reg-2-nov.c: New test.
> * gcc.target/i386/mpx/hard-reg-2-ubv.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-1-lbv.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-1-nov.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-1-ubv.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-2-lbv.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-2-nov.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-2-ubv.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-lbv.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-nov.c: New test.
> * gcc.target/i386/mpx/hard-reg-3-ubv.c: New test.
> * gcc.target/i386/mpx/hard-reg-4-1-lbv.c: New test.
> * gcc.target/i386/mpx/hard-reg-4-1-nov.c: New test.
> * gcc.target/i386/mpx/hard-reg-4-1-ubv.c: New test.
> * gcc.target/i386/mpx/hard-reg-4-2-lbv.c: New test.
> * gcc.target/i386/mpx/hard-reg-4-2-nov.c: New test.
> * gcc.target/i386/mpx/hard-reg-4-2-ubv.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c
> new file mode 100644
> index 000..319e1ec
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-lbv.c
> @@ -0,0 +1,21 @@
> +/* { dg-do run } */
> +/* { dg-shouldfail "bounds violation" } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +
> +
> +#define SHOULDFAIL
> +
> +#include "mpx-check.h"
> +
> +typedef int v16 __attribute__((vector_size(16)));
> +
> +int foo(int i) {
> +  register v16 u asm("xmm0");
> +  return u[i];
> +}
> +
> +int mpx_test (int argc, const char **argv)
> +{
> +  printf ("%d\n", foo (-1));
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c
> new file mode 100644
> index 000..3c6d39a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-nov.c
> @@ -0,0 +1,18 @@
> +/* { dg-do run } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +
> +#include "mpx-check.h"
> +
> +typedef int v16 __attribute__((vector_size(16)));
> +
> +int foo (int i) {
> +  register v16 u asm ("xmm0");
> +  return u[i];
> +}
> +
> +int mpx_test (int argc, const char **argv)
> +{
> +  printf ("%d\n", foo (3));
> +  printf ("%d\n", foo (0));
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c
> new file mode 100644
> index 000..7fe76c4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-2-ubv.c
> @@ -0,0 +1,21 @@
> +/* { dg-do run } */
> +/* { dg-shouldfail "bounds violation" } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +
> +
> +#define SHOULDFAIL
> +
> +#include "mpx-check.h"
> +
> +typedef int v16 __attribute__((vector_size(16)));
> +
> +int foo (int i) {
> +  register v16 u asm ("xmm0");
> +  return u[i];
> +}
> +
> +int mpx_test (int argc, const char **argv)
> +{
> +  printf ("%d\n", foo (5));
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-lbv.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-lbv.c
> new file mode 100644
> index 000..7e4451f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-lbv.c
> @@ -0,0 +1,33 @@
> +/* { dg-do run } */
> +/* { dg-shouldfail "bounds violation" } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +
> +
> +#define SHOULDFAIL
> +
> +#include "mpx-check.h"
> +
> +typedef int v8 __attribute__ ((vector_size (8)));
> +
> +struct S1
> +{
> +  v8 s1f;
> +};
> +
> +struct S2
> +{
> +  struct S1 s2f1;
> +  v8 s2f2;
> +};
> +
> +int foo_s2f1 (int i)
> +{
> +  register struct S2 b asm ("xmm0");
> +  return b.s2f1.s1f[i];
> +}
> +
> +int mpx_test (int argc, const char **argv)
> +{
> +  printf ("%d\n", foo_s2f1 (-1));
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-nov.c b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-nov.c
> new file mode 100644
> index 000..73bd7fb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/hard-reg-3-1-nov.c
> @@ -0,0 +1,30 @@
> +/* { dg-do run } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +
> +
> +#include "mpx-check.h"
> +
> +type

Re: [PATCH] Ada/x32: PR ada/80626: Correct Memory_Size

2017-05-10 Thread Arnaud Charlet
> Here is the patch:
> 
> X32 uses 64 as word size instead of 32.  This must not affect the
> Address type definition which is based on Memory_Size.
> 
> PR ada/80626
> * system-linux-x86.ads (Memory_Size): Use Long_Integer'Size
> instead of Word_Size.
> 
> Tested on x86-64 with -m64/-m32/-mx32.   OK for trunk and
> gcc-7-branch?

That's OK.


Go patches committed: merge recent changes to gofrontend

2017-05-10 Thread Ian Lance Taylor
I have committed a large patch to update the Go frontend and libgo to
the recent changes in the gofrontend repository.  I had postponed
merging changes during the GCC 7 release process.  I am now merging
all the changes that were pending during that period.  Although this
is a merged patch, the changes can be seen individually in the
gofrontend repo (https://go.googlesource.com/gofrontend).  They are
also listed below.

This is a fairly significant patch that brings in the concurrent
garbage collector used in the Go 1.8 runtime.  This significantly
reduces pauses due to garbage collection while running a Go program.

This patch also brings in experimental support for AIX for gccgo,
contributed by Matthieu Sarter and others at Atos Infogérance.

The actual patch is too large for this e-mail patch, but I have
attached all the changes to the gcc/go directory.

Ian


2017-05-10  Than McIntosh  

* go-backend.c: Include "go-c.h".
* go-gcc.cc (Gcc_backend::write_export_data): New method.

2017-05-10  Ian Lance Taylor  

* go-gcc.cc (Gcc_backend::Gcc_backend): Declare
__builtin_prefetch.
* Make-lang.in (GO_OBJS): Add go/wb.o.

commit 884c9f2cafb3fc1decaca70f1817ae269e4c6889
Author: Than McIntosh 
Date:   Mon Jan 23 15:07:07 2017 -0500

compiler: insert additional conversion for type desc ptr expr

Change the method Type::type_descriptor_pointer to apply an additional
type conversion to its result Bexpression, to avoid type clashes in
the back end. The backend expression for a given type descriptor var
is given a type of "_type", however the virtual calls that create the
variable use types derived from _type, hence the need to force a
conversion.

Reviewed-on: https://go-review.googlesource.com/35506


commit 5f0647c71e3b29eddcd0eecc44e7ba44ae7fc8dd
Author: Than McIntosh 
Date:   Mon Jan 23 15:22:26 2017 -0500

compiler: insure tree integrity in Call_expression::set_result

Depending on the back end, it can be problematic to reuse Bexpressions
(passing the same Bexpression to more than one Backend call to create
additional Bexpressions or Bstatements). The Call_expression::set_result
method was reusing its Bexpression input in more than one tree
context; the fix is to pass in an Expression instead and generate
multiple Bexpression references to it within the method.

Reviewed-on: https://go-review.googlesource.com/35505


commit 7a8e49870885af898c3c790275e513d1764a2828
Author: Ian Lance Taylor 
Date:   Tue Jan 24 21:19:06 2017 -0800

runtime: copy more of the scheduler from the Go 1.8 runtime

Copies mstart, newm, m0, g0, and friends.

Reviewed-on: https://go-review.googlesource.com/35645


commit 3546e2f002d0277d805ec59c5403bc1d4eda4ed9
Author: Ian Lance Taylor 
Date:   Thu Jan 26 19:47:37 2017 -0800

runtime: remove a few C functions that are no longer used

Reviewed-on: https://go-review.googlesource.com/35849


commit a71b835254f6d3164a0e6beaf54f2b175d1a6a92
Author: Ian Lance Taylor 
Date:   Thu Jan 26 16:51:16 2017 -0800

runtime: copy over more of the Go 1.8 scheduler

In particular __go_go (aka newproc) and goexit[01].

Reviewed-on: https://go-review.googlesource.com/35847


commit c3725adbe54d8283c373b6aa7dc95d6fc27f
Author: Ian Lance Taylor 
Date:   Fri Jan 27 16:58:20 2017 -0800

runtime: copy syscall handling from Go 1.8 runtime

Entering a syscall still has to start in C, to save the registers.
Fix entersyscallblock to save them more reliably.

This copies over the tracing code for syscalls, which we previously
weren't doing, and lets us turn on runtime/trace/check.

Reviewed-on: https://go-review.googlesource.com/35912


commit d5b921de4a28b04000fc4c8dac7f529a4a624dfc
Author: Ian Lance Taylor 
Date:   Fri Jan 27 18:34:11 2017 -0800

runtime: copy SIGPROF handling from Go 1.8 runtime

Also copy over Breakpoint.

Fix Func.Name and Func.Entry to not crash on a nil Func.

Reviewed-on: https://go-review.googlesource.com/35913


commit cc60235e55aef14b15c3d2114030245beb3adfef
Author: Than McIntosh 
Date:   Mon Feb 6 11:12:12 2017 -0500

compiler: convert go_write_export_data to Backend method.

Convert the helper function 'go_write_export_data' into a Backend
class method, to allow for an implementation of this function that
needs to access backend state.

Reviewed-on: https://go-review.googlesource.com/36357


commit e387439bfd24d5e142874b8e68e7039f74c744d7
Author: Than McIntosh 
Date:   Wed Feb 8 11:13:46 2017 -0500

compiler: insert backend conversion in temporary statement init

Insert an additional type conversion in Temporary_statement::do_get_backend
when assigning a Bexpression initializer to the temporary variable, to
avoid potential clashes in the back end. This can come up when assigning
something of concrete pointer-to-function type to a variable of generic
pointer-to-function type.

Reviewed-on: https://go-rev

Re: [PATCH] Kill -fdump-translation-unit

2017-05-10 Thread Alexander Monakov
On Wed, 10 May 2017, Richard Biener wrote:

> On Tue, May 9, 2017 at 5:41 PM, Nathan Sidwell  wrote:
> > -fdump-translation-unit is an inscrutably opaque dump.  It turned out that
> > most of the uses of the tree-dump header file was to indirectly get at
> > dumpfile.h, and the dump_function entry point it had forwarded to a dumper
> > in tree-cfg.c.  The gimple dumper would use its node dumper when asked for a
> > raw dump, but that was about it.
> >
> > We have prettier printers now.  This patch nukes the tu dumper.  ok?
> 
> Ok if nobody objects within 24 hours.

There was a reasonable IMO objection on the IRC (sadly, I can't say the same
about the responses that person received from Nathan).

A quick search indicates that people have published .tu parsers in Perl, JS
(producing json), the person objecting on IRC apparently used Python, and I'm
aware of another Python-based parser by Bruce Merry.

My takeaway from this is that people cared enough about this to build and
publish parsers in their language of choice, and that apparently it is or was
feature-rich enough for them to use.  Despite the format being undocumented and
formally not supported.

The motivation put forward in the opening mail ("is an inscrutably opaque dump")
seems like a weak reason for removal.

Alexander


Re: [c++ PATCH] PR c++/80682

2017-05-10 Thread Nathan Sidwell

On 05/10/2017 01:13 PM, Richard Biener wrote:
> On May 10, 2017 6:36:20 PM GMT+02:00, Nathan Sidwell wrote:
>> On 05/10/2017 10:21 AM, Ville Voutilainen wrote:
>>
>> IIUC backports have to fix a regression.  So you'll need to convince
>> the release maintainer.  From a C++ POV you're approved.
>
> Wrong-code, rejects-valid and ice-on-valid-code are generally fine to backport
> if the fixes are not too intrusive.

ok, thanks.  I think this is ice-on-valid (or at least rejects-valid).
So good to backport.


nathan

--
Nathan Sidwell


Re: [PATCH] Kill -fdump-translation-unit

2017-05-10 Thread Jakub Jelinek
On Wed, May 10, 2017 at 08:51:22PM +0300, Alexander Monakov wrote:
> On Wed, 10 May 2017, Richard Biener wrote:
> 
> > On Tue, May 9, 2017 at 5:41 PM, Nathan Sidwell  wrote:
> > > -fdump-translation-unit is an inscrutably opaque dump.  It turned out that
> > > most of the uses of the tree-dump header file was to indirectly get at
> > > dumpfile.h, and the dump_function entry point it had forwarded to a dumper
> > > in tree-cfg.c.  The gimple dumper would use its node dumper when asked for a
> > > raw dump, but that was about it.
> > >
> > > We have prettier printers now.  This patch nukes the tu dumper.  ok?
> > 
> > Ok if nobody objects within 24 hours.
> 
> There was a reasonable IMO objection on the IRC (sadly, I can't say the same
> about the responses that person received from Nathan).
> 
> A quick search indicates that people have published .tu parsers in Perl, JS
> (producing json), the person objecting on IRC apparently used Python, and I'm
> aware of another Python-based parser by Bruce Merry.
> 
> My takeaway from this is that people cared enough about this to build and
> publish parsers in their language of choice, and that apparently it is or was
> feature-rich enough for them to use.  Despite the format being undocumented and
> formally not supported.
> 
> The motivation put forward in the opening mail ("is an inscrutably opaque dump")
> seems like a weak reason for removal.

Can it at least be taken out of -fdump-tree-all?  It is huge, often larger
than the sum of all the other dump files, and I don't remember ever using it
for anything.  Instead of trying to write a parser for it and reconstructing
something you can then later analyze, isn't it better to just write a plugin
that can analyze it directly?

Jakub


Re: [PATCH, rs6000] Add x86 instrinsic headers to GCC PPC64LE taget

2017-05-10 Thread Steven Munroe
On Tue, 2017-05-09 at 16:03 -0500, Segher Boessenkool wrote:
> On Tue, May 09, 2017 at 02:33:00PM -0500, Steven Munroe wrote:
> > On Tue, 2017-05-09 at 12:23 -0500, Segher Boessenkool wrote:
> > > On Mon, May 08, 2017 at 09:49:57AM -0500, Steven Munroe wrote:
> > > > Thus I would like to restrict this support to PowerPC
> > > > targets that support VMX/VSX and PowerISA-2.07 (power8) and later.
> > > 
> > > What happens if you run it on an older machine, or as BE or 32-bit,
> > > or with vectors disabled?
> > > 
> > Well I hope that I set the dg-require-effective-target correctly because
> > while some of these intrinsics might work on the BE or 32-bit machine,
> > most will not.
> 
> That is just for the testsuite; I meant what happens if a user tries
> to use it with an older target (or BE, or 32-bit)?  Is there a useful,
> obvious error message?
> 
So looking at the X86 headers, their current practice falls into two
areas.

1) guard 64-bit dependent intrinsic functions with:

#ifdef __x86_64__
#endif

But they do not provide any warnings. I assume that attempting to use an
intrinsic of this class would result in an implicit function declaration
and a link-time failure.

2) guard architecture level dependent intrinsic header content with:

#ifndef __AVX__
#pragma GCC push_options
#pragma GCC target("avx")
#define __DISABLE_AVX__
#endif /* __AVX__ */
...

#ifdef __DISABLE_AVX__
#undef __DISABLE_AVX__
#pragma GCC pop_options
#endif /* __DISABLE_AVX__ */

So they don't make any attempt to prevent users from using a specific
header. If the compiler version does not support the "GCC target" I
assume that specific header did not exist in that version.

If GCC does support that target then the '#pragma GCC target("avx")'
will enable code generation, but the user might get a SIGILL if the
hardware they have does not support those instructions.
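
As a hedged illustration of how a caller might avoid that SIGILL, the
sketch below checks for AVX at run time before taking an AVX code path.
It assumes GCC (or a compatible compiler) on x86, where
__builtin_cpu_supports is available; cpu_has_avx is a made-up helper
name, not something the x86 headers provide.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when the running CPU reports AVX
   support, 0 otherwise.  On non-x86 targets, or with compilers that
   lack the builtin, it conservatively returns 0.  */
int
cpu_has_avx (void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports ("avx") != 0;
#else
  return 0;
#endif
}
```

A caller would test cpu_has_avx () once and fall back to a non-AVX
implementation when it returns 0.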

In the BMI headers I already guard with:

#ifdef  __PPC64__
#endif

This means that like x86_64, attempting to use _pext_u64 on a 32-bit
compiler will result in an implicit function declaration and cause a
linker error.

This is sufficient for most of BMI and BMI2 (registers only / endian
agnostic). But this does not address the larger issues (for SSE/SSE2+)
of needing a VSX implementation or restricting to LE.

So should I check for:

#ifdef __VSX__
#endif

or 

#ifdef __POWER8_VECTOR__

or 

#ifdef _ARCH_PWR8

and perhaps:

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__

as well to enforce this. 

And are you suggesting I add an #else clause with #warning or #error? Or
is the implicit function and link failure sufficient?
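
One possible shape for such a guard, purely as a sketch: combine the
macros listed above into a single check and put the diagnostic in the
failing branch.  The X86_COMPAT_* names and the message text are
assumptions made for illustration.  The #error is kept behind an opt-in
macro here only so that the fragment compiles on any host; a real header
would presumably emit it unconditionally.

```c
#include <assert.h>

/* Hypothetical summary macro: 1 only for 64-bit little-endian PowerPC
   with VSX and PowerISA 2.07 (power8) enabled.  */
#if defined(__PPC64__) && defined(__VSX__) && defined(_ARCH_PWR8) \
    && defined(__BYTE_ORDER__) \
    && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
# define X86_COMPAT_TARGET_OK 1
#else
# define X86_COMPAT_TARGET_OK 0
#endif

/* Hypothetical opt-in switch, used only to keep this sketch portable.  */
#if !X86_COMPAT_TARGET_OK && defined(X86_COMPAT_STRICT)
# error "x86 intrinsic headers need powerpc64le with -mcpu=power8 or later"
#endif

/* Lets ordinary code query the result of the preprocessor check.  */
int
x86_compat_target_ok (void)
{
  return X86_COMPAT_TARGET_OK;
}
```

With an #error in place, a user on an unsupported target gets the
message at the point of inclusion instead of an implicit declaration
followed by a link failure.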

> > The situation gets more complicated when we start looking at the
> > SSE/SSE2. These headers define many variants of load and store
> > instructions that are decidedly LE and many unaligned forms. While
> > powerpc64le handles this with ease, implementing LE semantics in BE mode
> > gets seriously tricky. I think it is better to avoid this and only
> > support these headers for LE.
> 
> Right.
> 
> > And while some SSE instrinsics can be implemented with VMX instructions
> > all the SSE2 double float intrinsics require VSX. And some PowerISA 2.07
> > instructions simplify implementation if available. As power8 is also the
> > first supported powerpc64le system it seems the logical starting point
> > for most of this work. 
> 
> Agreed as well.
> 
> > I don't plan to spend effort on supporting Intel intrinsic functions on
> > older PowerPC machines (before power8) or BE.
> 
> Just make sure if anyone tries anyway, there is a clear error message
> that tells them not to.
> 
> 
> Segher
> 




Re: [PATCH][x86] Add missing intrinsics for vrcp14sd/ss instructions.

2017-05-10 Thread Uros Bizjak
On Tue, May 9, 2017 at 8:21 AM, Koval, Julia  wrote:
> Hi,
>
> This patch adds missing intrinsics for VRCP14SD and VRCP14SS instructions:
> _mm_mask_rcp14_sd
> _mm_maskz_rcp14_sd
> _mm_mask_rcp14_ss
> _mm_maskz_rcp14_ss
>
> These instructions and intrinsics are described in SDM Vol. 2C 5-487.
>
> gcc/
> * config/i386/avx512fintrin.h (_mm_mask_rcp14_sd,
> _mm_maskz_rcp14_sd, _mm_mask_rcp14_ss,
> _mm_maskz_rcp14_ss): New intrinsics.
> * config/i386/i386-builtin.def (__builtin_ia32_rcp14sd_mask,
> __builtin_ia32_rcp14ss_mask): New builtins.
> * config/i386/sse.md (srcp14_mask): New pattern.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-vrcp14sd-1.c: Test new intrinsics.
> * gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
> * gcc.target/i386/avx512f-vrcp14ss-1.c: Ditto.
> * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

Approved and committed to mainline SVN.

Thanks,
Uros.


Re: [PR80582][X86] Add missing __mm256_set[r] intrinsics

2017-05-10 Thread Uros Bizjak
On Tue, May 9, 2017 at 11:42 AM, Koval, Julia  wrote:
> Sorry, fixed that.
>
> Thanks,
> Julia
>
> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Tuesday, May 09, 2017 11:36 AM
> To: Koval, Julia 
> Cc: GCC Patches ; Uros Bizjak ; 
> Kirill Yukhin 
> Subject: Re: [PR80582][X86] Add missing __mm256_set[r] intrinsics
>
> On Tue, May 09, 2017 at 09:28:40AM +, Koval, Julia wrote:
>> Hi,
>>
>> This patch implements missing intrinsics:
>> _mm256_set_m128
>> _mm256_set_m128d
>> _mm256_set_m128i
>> _mm256_setr_m128
>> _mm256_setr_m128d
>> _mm256_setr_m128i
>>
>> gcc/
>>   * config/i386/avxintrin.h (_mm256_set_m128, _mm256_set_m128d,
>>   _mm256_set_m128i, _mm256_setr_m128, _mm256_setr_m128d,
>>   _mm256_setr_m128i): New intrinsics.
>>
>> gcc/testsuite/
>>   * gcc.target/i386/avx-vinsertf128-256-1: Test new intrinsics.
>>   * gcc.target/i386/avx-vinsertf128-256-2: Ditto.
>>   * gcc.target/i386/avx-vinsertf128-256-3: Ditto.
>>
>> Ok for trunk?

Approved and committed to mainline SVN.

Thanks,
Uros.


Re: [PATCH][x86] Add missing intrinsics for DIV[SD,SS] and MUL[SD,SS]

2017-05-10 Thread Uros Bizjak
On Tue, May 9, 2017 at 1:39 PM, Peryt, Sebastian
 wrote:
> Hi,
>
> This patch adds missing intrinsics for DIVSD, DIVSS, MULSD and MULSS 
> instructions.
>
> 2017-05-09  Sebastian Peryt  
>
> gcc/
> * config/i386/avx512fintrin.h (_mm_mask_mul_round_sd,
> _mm_maskz_mul_round_sd, _mm_mask_mul_round_ss,
> _mm_maskz_mul_round_ss, _mm_mask_div_round_sd,
> _mm_maskz_div_round_sd, _mm_mask_div_round_ss,
> _mm_maskz_div_round_ss, _mm_mask_mul_sd, _mm_maskz_mul_sd,
> _mm_mask_mul_ss, _mm_maskz_mul_ss, _mm_mask_div_sd,
> _mm_maskz_div_sd, _mm_mask_div_ss, _mm_maskz_div_ss): New intrinsics.
> * config/i386/i386-builtin-types.def 
> (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT,
> V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): New function type aliases.
> * config/i386/i386-builtin.def (__builtin_ia32_divsd_mask_round,
> __builtin_ia32_divss_mask_round, __builtin_ia32_mulsd_mask_round,
> __builtin_ia32_mulss_mask_round): New builtins.
> * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT,
> V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types.
> * config/i386/sse.md (_vm3): 
> Renamed to ...
> (_vm3): ... this.
> (v\t{%2, %1, %0|%0, 
> %1, %2}): Changed to ...
> (v\t{%2, %1, 
> %0|%0, %1, %2}): ... this.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-vdivsd-1.c (_mm_mask_div_sd,
> _mm_maskz_div_sd, _mm_mask_div_round_sd,
> _mm_maskz_div_round_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vdivsd-2.c: New.
> * gcc.target/i386/avx512f-vdivss-1.c (_mm_mask_div_ss,
> _mm_maskz_div_ss, _mm_mask_div_round_ss,
> _mm_maskz_div_round_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vdivss-2.c: New.
> * gcc.target/i386/avx512f-vmulsd-1.c (_mm_mask_mul_sd,
> _mm_maskz_mul_sd, _mm_mask_mul_round_sd,
> _mm_maskz_mul_round_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vmulsd-2.c: New.
> * gcc.target/i386/avx512f-vmulss-1.c (_mm_mask_mul_ss,
> _mm_maskz_mul_ss, _mm_mask_mul_round_ss,
> _mm_maskz_mul_round_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vmulss-2.c: New.
> * gcc.target/i386/avx-1.c (__builtin_ia32_divsd_mask_round,
> __builtin_ia32_divss_mask_round, __builtin_ia32_mulsd_mask_round,
> __builtin_ia32_mulss_mask_round): Test new builtins.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * gcc.target/i386/sse-14.c (_mm_maskz_div_round_sd,
> _mm_maskz_div_round_ss, _mm_maskz_mul_round_sd,
> _mm_maskz_mul_round_ss): Test new intrinsics.
> * gcc.target/i386/testround-1.c: Ditto.

Approved and committed to mainline SVN.

Thanks,
Uros.


Re: [PATCH][x86] Add missing intrinsics for MAX[SD,SS] and MIN[SD,SS]

2017-05-10 Thread Uros Bizjak
On Tue, May 9, 2017 at 1:43 PM, Peryt, Sebastian
 wrote:
> Hi,
>
> This patch adds missing intrinsics for MAXSD, MAXSS, MINSD and MINSS 
> instructions.
>
> 2017-05-09  Sebastian Peryt  
>
> gcc/
> * config/i386/avx512fintrin.h (_mm_mask_max_round_sd,
> _mm_maskz_max_round_sd, _mm_mask_max_round_ss,
> _mm_maskz_max_round_ss, _mm_mask_min_round_sd,
> _mm_maskz_min_round_sd, _mm_mask_min_round_ss,
> _mm_maskz_min_round_ss): New intrinsics.
> * config/i386/i386-builtin-types.def (V2DF, V2DF, V2DF, V2DF, UQI, 
> INT,
> V4SF, V4SF, V4SF, V4SF, UQI, INT): New function type aliases.
> * config/i386/i386-builtin.def (__builtin_ia32_maxsd_mask_round,
> __builtin_ia32_maxss_mask_round, __builtin_ia32_minsd_mask_round,
> __builtin_ia32_minss_mask_round): New builtins.
> * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT,
> V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types.
> * config/i386/sse.md (_vm3): 
> Renamed to ...
> (_vm3): ... this.
> (v\t{%2, %1, 
> %0|%0, %1, %2}): Changed to ...
> (v\t{%2, 
> %1, %0|%0, %1, 
> %2}): ... this.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-vmaxsd-1.c (_mm_mask_max_round_sd,
> _mm_maskz_max_round_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vmaxsd-2.c: New.
> * gcc.target/i386/avx512f-vmaxss-1.c (_mm_mask_max_round_ss,
> _mm_maskz_max_round_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vmaxss-2.c: New.
> * gcc.target/i386/avx512f-vminsd-1.c (_mm_mask_min_round_sd,
> _mm_maskz_min_round_sd): Test new intrinsics.
> * gcc.target/i386/avx512f-vminsd-2.c: New.
> * gcc.target/i386/avx512f-vminss-1.c (_mm_mask_min_round_ss,
> _mm_maskz_min_round_ss): Test new intrinsics.
> * gcc.target/i386/avx512f-vminss-2.c: New.
> * gcc.target/i386/avx-1.c (__builtin_ia32_maxsd_mask_round,
> __builtin_ia32_maxss_mask_round, __builtin_ia32_minsd_mask_round,
> __builtin_ia32_minss_mask_round): Test new builtins.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * gcc.target/i386/sse-14.c (_mm_maskz_max_round_sd,
> _mm_maskz_max_round_ss, _mm_maskz_min_round_sd,
> _mm_maskz_min_round_ss, _mm_mask_max_round_sd,
> _mm_mask_max_round_ss, _mm_mask_min_round_sd,
> _mm_mask_min_round_ss): Test new intrinsics.
> * gcc.target/i386/testround-1.c: Ditto.

Approved and committed to mainline SVN.

Thanks,
Uros.


Re: [RFC PATCH, i386]: Enable post-reload compare elimination pass

2017-05-10 Thread Uros Bizjak
On Wed, May 10, 2017 at 5:18 PM, Uros Bizjak  wrote:
> On Wed, May 10, 2017 at 4:27 PM, Jakub Jelinek  wrote:
>> On Tue, May 09, 2017 at 06:06:47PM +0200, Uros Bizjak wrote:
>>> Attached patch enables post-reload compare elimination pass by
>>> providing expected patterns (duplicates of existing patterns with
>>> setters of reg and flags switched in the parallel) for flag setting
>>> arithmetic instructions.
>>>
>>> The merge triggers more than 3000 times during the gcc bootstrap,
>>> mostly in cases where intervening memory load or store prevents
>>> combine from merging the arithmetic insn and the following compare.
>>>
>>> Also, some recent linux x86_64 defconfig build results in ~200 merges,
>>> removing ~200 test/cmp insns. Not much, but I think the results still
>>> warrant the pass to be enabled.
>>
>> Isn't the right fix instead to change the compare-elim.c pass to either
>> accept both reg vs. flags orderings in parallel, or both depending
>> on some target hook, or change it to the order i386.md and most other
>> major targets use and just fix up mn10300/rx (and aarch64?) to use the same
>> order?

Attached patch changes compare-elim.c order to what i386.md expects.

Thoughts?

Uros.
Index: compare-elim.c
===
--- compare-elim.c  (revision 247850)
+++ compare-elim.c  (working copy)
@@ -45,9 +45,9 @@
(3) If an insn of form (2) can usefully set the flags, there is
another pattern of the form
 
-   [(set (reg) (operation)
-(set (reg:CCM) (compare:CCM (operation) (immediate)))]
-
+   [(set (reg:CCM) (compare:CCM (operation) (immediate)))
+(set (reg) (operation)]
+
The mode CCM will be chosen as if by SELECT_CC_MODE.
 
Note that unlike NOTICE_UPDATE_CC, we do not handle memory operands.
@@ -582,7 +582,7 @@
 static bool
 try_eliminate_compare (struct comparison *cmp)
 {
-  rtx x, flags, in_a, in_b, cmp_src;
+  rtx flags, in_a, in_b, cmp_src;
 
   /* We must have found an interesting "clobber" preceding the compare.  */
   if (cmp->prev_clobber == NULL)
@@ -628,7 +628,8 @@
  Validate that PREV_CLOBBER itself does in fact refer to IN_A.  Do
  recall that we've already validated the shape of PREV_CLOBBER.  */
   rtx_insn *insn = cmp->prev_clobber;
-  x = XVECEXP (PATTERN (insn), 0, 0);
+
+  rtx x = XVECEXP (PATTERN (insn), 0, 0);
   if (rtx_equal_p (SET_DEST (x), in_a))
 cmp_src = SET_SRC (x);
 
@@ -666,16 +667,30 @@
 flags = gen_rtx_REG (cmp->orig_mode, targetm.flags_regnum);
 
   /* Generate a new comparison for installation in the setter.  */
-  x = copy_rtx (cmp_src);
-  x = gen_rtx_COMPARE (GET_MODE (flags), x, in_b);
-  x = gen_rtx_SET (flags, x);
+  rtx y = copy_rtx (cmp_src);
+  y = gen_rtx_COMPARE (GET_MODE (flags), y, in_b);
+  y = gen_rtx_SET (flags, y);
 
+  /* Canonicalize instruction to:
+   [(set (reg:CCM) (compare:CCM (operation) (immediate)))
+(set (reg) (operation)]
+  */
+
+  rtvec v = rtvec_alloc (2);
+  RTVEC_ELT (v, 0) = y;
+  RTVEC_ELT (v, 1) = x;
+  
+  rtx pat = gen_rtx_PARALLEL (VOIDmode, v);
+  
   /* Succeed if the new instruction is valid.  Note that we may have started
  a change group within maybe_select_cc_mode, therefore we must continue. */
-  validate_change (insn, &XVECEXP (PATTERN (insn), 0, 1), x, true);
+  validate_change (insn, &PATTERN (insn), pat, true);
   if (!apply_change_group ())
 return false;
 
+  printf ("TRIGGERED\n");
+  debug_rtx (pat);
+
   /* Success.  Delete the compare insn...  */
   delete_insn (cmp->insn);
 
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 247850)
+++ config/i386/i386.c  (working copy)
@@ -52043,6 +52043,8 @@
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST ix86_address_cost
 
+#undef TARGET_FLAGS_REGNUM
+#define TARGET_FLAGS_REGNUM FLAGS_REG
 #undef TARGET_FIXED_CONDITION_CODE_REGS
 #define TARGET_FIXED_CONDITION_CODE_REGS ix86_fixed_condition_code_regs
 #undef TARGET_CC_MODES_COMPATIBLE


Re: Make tree-ssa-strlen.c handle partial unterminated strings

2017-05-10 Thread Richard Sandiford
Jakub Jelinek  writes:
> Hi!
>
> Note the intent of the pass is to handle the most common cases, it is fine
> if some cases that aren't common aren't handled, it is all about the extra
> complexity vs. how much it helps on real-world code.

OK.

> On Sun, May 07, 2017 at 10:10:48AM +0100, Richard Sandiford wrote:
>> I've got most of the way through a version that uses min_length instead.
>> But one thing that the terminated flag allows that a constant min_length
>> doesn't is:
>> 
>>   size_t
>>   f1 (char *a1)
>>   {
>> size_t x = strlen (a1);
>> char *a3 = a1 + x;
>> a3[0] = '1';  // a1 length x + 1, unterminated  (min length x + 1)
>> a3[1] = '2';  // a1 length x + 2, unterminated  (min length x + 2)
>> a3[2] = '3';  // a1 length x + 3, unterminated  (min length x + 3)
>> a3[3] = 0;// a1 length x + 3, terminated
>> return strlen (a1);
>>   }
>> 
>> For the a3[3] = 0, we know a3's min_length is 3 and so it's obvious
>> that we can convert its min_length to a length.  But even if we allow
>> a1's min_length to be nonconstant, it seems a bit risky to assume that
>> we can convert its min_length to a length as well.  It would only work
>> if the min_lengths in related strinfos are kept in sync, whereas it
>> ought to be safe to say that the minimum length of something is 0.
>
> And we have code for that.  If verify_related_strinfos returns non-NULL,
> we can adjust all the related strinfos that need adjusting.
> See e.g. zero_length_string on how it uses that.  It is just that we should
> decide what is the acceptable complexity of the length/min_length
> expressions (whether INTEGER_CST or SSA_NAME is enough, then the above
> would not work, but is that really that important), or if we e.g. allow
> SSA_NAME + INTEGER_CST in addition to that, or sum of 2 SSA_NAMEs, etc.).
> I don't see how terminated vs. unterminated (which is misnamed anyway, it
> means that it isn't known to be terminated, it might be terminated or not)
> would help with that.

The example above works with the flag because we already allow
SSA_NAME + INTEGER_CST for the length field, thanks to:

  tree adj = NULL_TREE;
  if (oldlen == NULL_TREE)
;
  else if (integer_zerop (oldlen))
adj = srclen;
  else if (TREE_CODE (oldlen) == INTEGER_CST
   || TREE_CODE (srclen) == INTEGER_CST)
adj = fold_build2_loc (loc, MINUS_EXPR,
   TREE_TYPE (srclen), srclen,
   fold_convert_loc (loc, TREE_TYPE (srclen),
 oldlen));
  if (adj != NULL_TREE)
adjust_related_strinfos (loc, dsi, adj);

etc.  So with a constant min_length we lose out (compared to the flag)
by making min_length more restrictive.

Like you say later, min_length is the number of characters that are
known to be nonzero, and length is the number of characters that are
known to be nonzero and followed by a zero, so even if we do relax the
rules for min_length to match length, I think in almost all useful cases
the length will be equal to the min_length or will be null (i.e. it'll
act almost like a de facto flag).

If the name's a problem, how about "known_terminated_p" instead of
"terminated_p"?

>> So I think that gives four possiblities:
>> 
>>   (1) Keep the terminated flag, but extend the original patch to handle
>>   strings built up a character at a time.  This would handle f1 above.
>
> Only if you allow complex expressions like SSA_NAME + INTEGER_CST in length.
>
>>   (2) Replace the terminated flag with a constant minimum length, don't
>>   handle f1 above.
>
> Sure, if you only allow constants, it will be limited to constants.
>
>>   (3) Replace the terminated flag with an arbitrary minimum length and
>>   ensure that it's always valid to copy the minimum length to the
>>   length when we do so for the final strinfo in a chain.
>
> Even length doesn't allow arbitrary expressions, the more complex it is,
> the more expensive will it be to compute it when you e.g. replace
> strlen with that.
>
> I'd introduce min_length, start with INTEGER_CST, once it is handled
> everywhere in the pass properly, see if there is enough code in the wild
> that would justify allowing more than that.
>
> min_length is a simple guarantee that there are no zero bytes among the
> first min_length bytes, length is the same plus that there is a zero byte
> right after that, so it is easy to argue about what happens if you store
> non-zero somewhere into it, or store zero, etc.

I think that's true of the flag version too though.  If you store a zero
into X <= length, you set the length to X and set known_terminated_p.
If you store a nonzero into X < length, nothing changes.  If you store
an unknown value into X < length, you set the length to X and clear
the known_terminated_p flag.
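
The three rules above can be modelled in a few lines of C.  This is only
a sketch for reasoning about the flag version, not the
tree-ssa-strlen.c code: struct str_model, enum store_kind and
model_store are made-up names, and the handling of a nonzero store at
X == length is taken from the comments in f1 rather than from the rules
themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum store_kind { STORE_ZERO, STORE_NONZERO, STORE_UNKNOWN };

/* Hypothetical model of one string's state.  */
struct str_model
{
  size_t length;          /* leading bytes known to be nonzero */
  bool known_terminated;  /* true if byte [length] is known to be zero */
};

/* Apply a one-byte store at offset X.  Returns false when X lies beyond
   what the rules cover, leaving the state untouched.  */
bool
model_store (struct str_model *s, size_t x, enum store_kind kind)
{
  if (x > s->length)
    return false;

  switch (kind)
    {
    case STORE_ZERO:
      /* The string now ends at X.  */
      s->length = x;
      s->known_terminated = true;
      return true;

    case STORE_NONZERO:
      /* Inside the string nothing changes.  At X == length the byte
         that might have been the terminator becomes nonzero (assumption
         based on the f1 comments).  */
      if (x == s->length)
        {
          s->length = x + 1;
          s->known_terminated = false;
        }
      return true;

    case STORE_UNKNOWN:
      /* Only the first X bytes are still known to be nonzero.  */
      s->length = x;
      s->known_terminated = false;
      return true;
    }
  return false;
}
```

Replaying f1 with an initial length of x: the three nonzero stores move
the length to x + 3 with the flag clear, and the final zero store sets
the flag without changing the length.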

Thanks,
Richard


Re: [RFC PATCH, i386]: Enable post-reload compare elimination pass

2017-05-10 Thread Uros Bizjak
On Wed, May 10, 2017 at 9:05 PM, Uros Bizjak  wrote:
> On Wed, May 10, 2017 at 5:18 PM, Uros Bizjak  wrote:
>> On Wed, May 10, 2017 at 4:27 PM, Jakub Jelinek  wrote:
>>> On Tue, May 09, 2017 at 06:06:47PM +0200, Uros Bizjak wrote:
>>>> Attached patch enables post-reload compare elimination pass by
>>>> providing expected patterns (duplicates of existing patterns with
>>>> setters of reg and flags switched in the parallel) for flag setting
>>>> arithmetic instructions.
>>>>
>>>> The merge triggers more than 3000 times during the gcc bootstrap,
>>>> mostly in cases where intervening memory load or store prevents
>>>> combine from merging the arithmetic insn and the following compare.
>>>>
>>>> Also, some recent linux x86_64 defconfig build results in ~200 merges,
>>>> removing ~200 test/cmp insns. Not much, but I think the results still
>>>> warrant the pass to be enabled.
>>>
>>> Isn't the right fix instead to change the compare-elim.c pass to either
>>> accept both reg vs. flags orderings in parallel, or both depending
>>> on some target hook, or change it to the order i386.md and most other
>>> major targets use and just fix up mn10300/rx (and aarch64?) to use the same
>>> order?
>
> Attached patch changes compare-elim.c order to what i386.md expects.

BTW: This patch now catches 417 cases (instead of 200+) in linux
build, including e.g.:

(parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (lshiftrt:SI (reg:SI 4 si [orig:93 _10 ] [93])
(const_int 1 [0x1]))
(const_int 0 [0])))
(set (reg:DI 4 si)
(zero_extend:DI (lshiftrt:SI (reg:SI 4 si [orig:93 _10 ] [93])
(const_int 1 [0x1]))))
])
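
For orientation, source of roughly the following shape could give rise
to such a pattern: a 32-bit shift whose zero-extended result is stored
and then tested against zero.  This is a hypothetical example, not code
from the linux build, and whether a separate test insn is emitted and
later merged depends on the compiler version, options and surrounding
code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example.  Shifts V right by one, stores the widened
   result through OUT, and returns 1 when that result is zero.  */
int
halve_and_test (uint32_t v, uint64_t *out)
{
  uint64_t half = v >> 1;  /* 32-bit logical shift, zero-extended */
  *out = half;             /* intervening store */
  return half == 0;        /* compare of the shift result with zero */
}
```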

Uros.


C++ PATCH for CWG DR 1847, partial ordering and non-deduced context

2017-05-10 Thread Jason Merrill
The resolution of Core DRs 1391 and 1847 clarified that function
parameters that don't involve deducible template parameters are not
considered for partial ordering.  I also experimented with handling
this at a finer-grained level, in unify, so that we would handle a
typename vs. a concrete type even if there were other deducible
template parameters, but that broke template/partial15.C.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 8e4f455347781e161b73a121038efd76dc376aaa
Author: Jason Merrill 
Date:   Wed Mar 1 16:56:20 2017 -1000

CWG 1847 - Clarifying compatibility during partial ordering

* pt.c (more_specialized_fn): No order between two non-deducible
parameters.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index b9e7af7..17398c9 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -21182,6 +21182,13 @@ more_specialized_fn (tree pat1, tree pat2, int len)
   len = 0;
 }
 
+  /* DR 1847: If a particular P contains no template-parameters that
+participate in template argument deduction, that P is not used to
+determine the ordering.  */
+  if (!uses_deducible_template_parms (arg1)
+ && !uses_deducible_template_parms (arg2))
+   goto next;
+
   if (TREE_CODE (arg1) == REFERENCE_TYPE)
{
  ref1 = TYPE_REF_IS_RVALUE (arg1) + 1;
@@ -21303,6 +21310,8 @@ more_specialized_fn (tree pat1, tree pat2, int len)
   These must be unordered.  */
break;
 
+next:
+
   if (TREE_CODE (arg1) == TYPE_PACK_EXPANSION
   || TREE_CODE (arg2) == TYPE_PACK_EXPANSION)
 /* We have already processed all of the arguments in our
diff --git a/gcc/testsuite/g++.dg/template/partial-order1.C 
b/gcc/testsuite/g++.dg/template/partial-order1.C
new file mode 100644
index 000..0832ea5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/partial-order1.C
@@ -0,0 +1,18 @@
+// { dg-do compile { target c++11 } }
+
+using size_t = decltype(sizeof(0));
+template <class T> struct A
+{
+  using size_type = size_t;
+};
+
+template <class T>
+void f(size_t, T);
+
+template <class T>
+void f(typename A<T>::size_type, T);
+
+int main()
+{
+  f(1,2);  // { dg-error "ambiguous" }
+}


Minor C++ PATCH to make unify_invalid a common breakpoint

2017-05-10 Thread Jason Merrill
It's periodically been a minor headache that there was no one place I
could set a breakpoint on when exactly deduction fails.  This patch
makes unify_invalid that place.
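
The idea, reduced to a standalone sketch in plain C: every specific
failure helper prints its own note and then returns through one common
function, so a single breakpoint on that function catches all of them.
The names below (report_invalid, report_type_mismatch, report_arity) are
invented for illustration and are not the pt.c functions.

```c
#include <assert.h>
#include <stdio.h>

/* Common exit for every failure helper: the one place to set a
   breakpoint.  Always returns 1, meaning "failed".  */
int
report_invalid (void)
{
  return 1;
}

/* Each specific helper optionally explains, then funnels through
   report_invalid instead of returning 1 directly.  */
int
report_type_mismatch (int explain_p, const char *parm, const char *arg)
{
  if (explain_p)
    fprintf (stderr, "  mismatched types %s and %s\n", parm, arg);
  return report_invalid ();
}

int
report_arity (int explain_p, int have, int wanted)
{
  if (explain_p)
    fprintf (stderr, "  candidate expects %d arguments, %d provided\n",
             wanted, have);
  return report_invalid ();
}
```

In a debugger, "break report_invalid" then stops on any of the
failures, and the backtrace shows which helper was taken.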

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 387302d431ce2d837c947c6893cbcbce1cdd1318
Author: Jason Merrill 
Date:   Wed May 10 12:29:26 2017 -0400

Have other unify failure functions call unify_invalid.

* pt.c (unify_parameter_deduction_failure, unify_cv_qual_mismatch)
(unify_type_mismatch, unify_parameter_pack_mismatch)
(unify_ptrmem_cst_mismatch, unify_expression_unequal)
(unify_parameter_pack_inconsistent, unify_inconsistency)
(unify_vla_arg, unify_method_type_error, unify_arity)
(unify_arg_conversion, unify_no_common_base)
(unify_inconsistent_template_template_parameters)
(unify_template_deduction_failure)
(unify_template_argument_mismatch)
(unify_overload_resolution_failure): Call unify_invalid.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 17398c9..f80d7a5 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -6105,19 +6105,22 @@ unify_success (bool /*explain_p*/)
   return 0;
 }
 
+/* Other failure functions should call this one, to provide a single function
+   for setting a breakpoint on.  */
+
+static int
+unify_invalid (bool /*explain_p*/)
+{
+  return 1;
+}
+
 static int
 unify_parameter_deduction_failure (bool explain_p, tree parm)
 {
   if (explain_p)
 inform (input_location,
"  couldn't deduce template parameter %qD", parm);
-  return 1;
-}
-
-static int
-unify_invalid (bool /*explain_p*/)
-{
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6127,7 +6130,7 @@ unify_cv_qual_mismatch (bool explain_p, tree parm, tree 
arg)
 inform (input_location,
"  types %qT and %qT have incompatible cv-qualifiers",
parm, arg);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6135,7 +6138,7 @@ unify_type_mismatch (bool explain_p, tree parm, tree arg)
 {
   if (explain_p)
 inform (input_location, "  mismatched types %qT and %qT", parm, arg);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6146,7 +6149,7 @@ unify_parameter_pack_mismatch (bool explain_p, tree parm, 
tree arg)
"  template parameter %qD is not a parameter pack, but "
"argument %qD is",
parm, arg);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6157,7 +6160,7 @@ unify_ptrmem_cst_mismatch (bool explain_p, tree parm, 
tree arg)
"  template argument %qE does not match "
"pointer-to-member constant %qE",
arg, parm);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6165,7 +6168,7 @@ unify_expression_unequal (bool explain_p, tree parm, tree 
arg)
 {
   if (explain_p)
 inform (input_location, "  %qE is not equivalent to %qE", parm, arg);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6175,7 +6178,7 @@ unify_parameter_pack_inconsistent (bool explain_p, tree 
old_arg, tree new_arg)
 inform (input_location,
"  inconsistent parameter pack deduction with %qT and %qT",
old_arg, new_arg);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6192,7 +6195,7 @@ unify_inconsistency (bool explain_p, tree parm, tree 
first, tree second)
"  deduced conflicting values for non-type parameter "
"%qE (%qE and %qE)", parm, first, second);
 }
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6203,7 +6206,7 @@ unify_vla_arg (bool explain_p, tree arg)
"  variable-sized array type %qT is not "
"a valid template argument",
arg);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6213,7 +6216,7 @@ unify_method_type_error (bool explain_p, tree arg)
 inform (input_location,
"  member function type %qT is not a valid template argument",
arg);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6232,7 +6235,7 @@ unify_arity (bool explain_p, int have, int wanted, bool 
least_p = false)
  "  candidate expects %d arguments, %d provided",
  wanted, have);
 }
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6256,7 +6259,7 @@ unify_arg_conversion (bool explain_p, tree to_type,
 inform (EXPR_LOC_OR_LOC (arg, input_location),
"  cannot convert %qE (type %qT) to type %qT",
arg, from_type, to_type);
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6274,7 +6277,7 @@ unify_no_common_base (bool explain_p, enum 
template_base_result r,
inform (input_location, "  %qT is not derived from %qT", arg, parm);
break;
   }
-  return 1;
+  return unify_invalid (explain_p);
 }
 
 static int
@@ -6284,7 +6287,7 @@ unify_in

[PATCH] sparc: Set noexecstack on mulsi3, divsi3, and modsi3

2017-05-10 Thread Adhemerval Zanella
A recent GLIBC fix for sparc [1] caused some configurations to end up with
an executable stack in ld.so (shown by the elf/check-execstack testcase
failure).

This happens because, with the sparc toolchain generated by
build-many-glibcs.py (a GLIBC script to produce cross-compiling toolchains),
the minimum supported sparc32 version is pre-v9, which requires a software
implementation of '.udiv'.  Since we are now using the libgcc.a one instead,
it must have the '.note.GNU-stack' section so the linker can properly mark
the stack non-executable.

From a build using a toolchain from build-many-glibcs.py:

elf/librtld.os.map

[...]
.../sparc64-glibc-linux-gnu/6.2.1/32/libgcc.a(_divsi3.o)
  
.../sparc64-glibc-linux-gnu/6.2.1/32/libgcc.a(_udivdi3.o) (.udiv)
.../sparc64-glibc-linux-gnu/6.2.1/32/libgcc.a(_clz.o)
  
.../lib/gcc/sparc64-glibc-linux-gnu/6.2.1/32/libgcc.a(_udivdi3.o) (__clz_tab)
[...]

And dumping _udivdi3.o section headers:

  [Nr] Name  TypeAddr OffSize   ES Flg Lk Inf Al
  [ 0]   NULL 00 00 00  0   0  0
  [ 1] .text PROGBITS 34 0002b0 00  AX  0   0  4
  [ 2] .data PROGBITS 0002e4 00 00  WA  0   0  1
  [ 3] .bss  NOBITS   0002e4 00 00  WA  0   0  1
  [ 4] .debug_line   PROGBITS 0002e4 00010d 00  0   0  1
  [ 5] .rela.debug_line  RELA 0007c0 0c 0c   I 12   4  4
  [ 6] .debug_info   PROGBITS 0003f1 ab 00  0   0  1
  [ 7] .rela.debug_info  RELA 0007cc 30 0c   I 12   6  4
  [ 8] .debug_abbrev PROGBITS 00049c 14 00  0   0  1
  [ 9] .debug_arangesPROGBITS 0004b0 20 00  0   0  8
  [10] .rela.debug_arang RELA 0007fc 18 0c   I 12   9  4
  [11] .shstrtab STRTAB   000814 70 00  0   0  1
  [12] .symtab   SYMTAB   0004d0 000220 10 13  32  4
  [13] .strtab   STRTAB   0006f0 cf 00  0   0  1

I am not seeing this on a native gcc build which I configured with:

' --with-arch-directory=sparc64 --enable-multiarch --enable-targets=all
  --with-cpu-32=ultrasparc --with-long-double-128 --enable-multilib'

Neither libgcc's __udivdi3 nor __umoddi3 pulls in .udiv, since for this
libgcc build both use hardware instructions:

elf/librtld.os.map

/home/azanella/gcc/install/lib/gcc/sparc64-linux-gnu/6.3.1/32/libgcc.a(_udivdi3.o)
  
/home/azanella/glibc/glibc-git-build-sparcv9/elf/dl-allobjs.os (__udivdi3)
/home/azanella/gcc/install/lib/gcc/sparc64-linux-gnu/6.3.1/32/libgcc.a(_umoddi3.o)
  
/home/azanella/glibc/glibc-git-build-sparcv9/elf/dl-allobjs.os (__umoddi3)

This patch adds the missing noexecstack marker to the sparc assembly
implementation.  I saw no regressions in the gcc testsuite and it fixes the
regression on the GLIBC side.

libgcc/

* config/sparc/lb1spc.S [__ELF__ && __linux__]: Emit .note.GNU-stack
section for a non-executable stack.

[1] 
https://sourceware.org/git/?p=glibc.git;a=commit;h=bdc543e338281da051b3dc06eae96c330a485ce6
---
 libgcc/ChangeLog | 5 +
 libgcc/config/sparc/lb1spc.S | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/libgcc/config/sparc/lb1spc.S b/libgcc/config/sparc/lb1spc.S
index b60bd57..e693864 100644
--- a/libgcc/config/sparc/lb1spc.S
+++ b/libgcc/config/sparc/lb1spc.S
@@ -5,6 +5,12 @@
slightly edited to match the desired calling convention, and also to
optimize them for our purposes.  */
 
+/* An executable stack is *not* required for these functions.  */
+#if defined(__ELF__) && defined(__linux__)
+.section .note.GNU-stack,"",%progbits
+.previous
+#endif
+
 #ifdef L_mulsi3
 .text
.align 4
-- 
2.7.4



Re: [RFC PATCH, i386]: Enable post-reload compare elimination pass

2017-05-10 Thread Jakub Jelinek
On Wed, May 10, 2017 at 09:57:56PM +0200, Uros Bizjak wrote:
> BTW: This patch now catches 417 cases (instead of 200+) in linux
> build, including e.g.:
> 
> (parallel [
> (set (reg:CCZ 17 flags)
> (compare:CCZ (lshiftrt:SI (reg:SI 4 si [orig:93 _10 ] [93])
> (const_int 1 [0x1]))
> (const_int 0 [0])))
> (set (reg:DI 4 si)
> (zero_extend:DI (lshiftrt:SI (reg:SI 4 si [orig:93 _10 ] [93])
> (const_int 1 [0x1]
> ])

That looks nice.  So, I think we need analysis on what order which targets
use.  I have looked at mn10300.md, I see {add,sub}si3_flags patterns that
would need PARALLEL reordering for this compare-elim.c change and then
cmp_liw vs. liw_cmp patterns I have no clue what they do and whether
compare-elim.c would care about those or not (they have UNSPECs).  Jeff/Alex?

In rx.md I see {add,sub}si3_flags too, then ssaddsi3 and 2 peepholes that
would need changing.

In visium.md I see flags_subst_{logic,arith} define_substs,
*{add,sub}3_insn_set_{carry,overflow}
{add,sub}si3_insn_set_{carry,overflow}, negsi2_insn_set_carry
and *neg2_insn_set_overflow that would need changing.

aarch64 is the only remaining compare-elim.c enabled target
(one that defines TARGET_FLAGS_REGNUM), and that one seems to use
the same parallel order as i386.md, so compare-elim.c most likely just
doesn't work there at all.

So all in all, sounds like we need to change at least 17 patterns on 3
not very widely used targets.

Jakub


[C++ PATCH] remove unnecessary hidden pruning

2017-05-10 Thread Nathan Sidwell
Name lookup as it currently is already removes hidden names for regular 
lookup (sometimes more than once!).   It also strips anticipated builtins.


So there's no need to do it again when building a function call.

Applied to trunk.

nathan
--
Nathan Sidwell
2017-05-10  Nathan Sidwell  

	* cp-tree.h (build_new_function_call): Lose koenig_p arg.  Fix
	line breaking.
	* call.c (build_new_function_call): Lose koenig_p arg.  Remove
	koenig_p handling here.
	* pt.c (push_template_decl_real): Unconditionally retrofit_lang_decl.
	(tsubst_omp_clauses): Likewise.
	(do_class_deduction): Adjust build_new_function_call calls.
	* semantics.c (finish_call_expr): Likewise.

Index: call.c
===
--- call.c	(revision 247851)
+++ call.c	(working copy)
@@ -4192,7 +4192,7 @@ print_error_for_call_failure (tree fn, v
ARGS.  */
 
 tree
-build_new_function_call (tree fn, vec<tree, va_gc> **args, bool koenig_p, 
+build_new_function_call (tree fn, vec<tree, va_gc> **args,
 			 tsubst_flags_t complain)
 {
   struct z_candidate *candidates, *cand;
@@ -4210,22 +4210,6 @@ build_new_function_call (tree fn, vec **, bool, 
+extern tree build_new_function_call		(tree, vec<tree, va_gc> **,
 		 tsubst_flags_t);
-extern tree build_operator_new_call		(tree, vec<tree, va_gc> **, tree *,
-		 tree *, tree, tree, tree *,
-		 tsubst_flags_t);
-extern tree build_new_method_call		(tree, tree, vec<tree, va_gc> **,
-		 tree, int, tree *,
-		 tsubst_flags_t);
-extern tree build_special_member_call		(tree, tree, vec<tree, va_gc> **,
+extern tree build_operator_new_call		(tree, vec<tree, va_gc> **,
+		 tree *, tree *, tree, tree,
+		 tree *, tsubst_flags_t);
+extern tree build_new_method_call		(tree, tree,
+		 vec<tree, va_gc> **, tree,
+		 int, tree *, tsubst_flags_t);
+extern tree build_special_member_call		(tree, tree,
+		 vec<tree, va_gc> **,
 		 tree, int, tsubst_flags_t);
 extern tree build_new_op			(location_t, enum tree_code,
 		 int, tree, tree, tree, tree *,
@@ -5665,7 +5666,7 @@ extern tree build_new_op			(location_t,
 extern tree build_op_call			(tree, vec<tree, va_gc> **,
 		 tsubst_flags_t);
 extern bool aligned_allocation_fn_p		(tree);
-extern bool usual_deallocation_fn_p	(tree);
+extern bool usual_deallocation_fn_p		(tree);
 extern tree build_op_delete_call		(enum tree_code, tree, tree,
 		 bool, tree, tree,
 		 tsubst_flags_t);
Index: pt.c
===
--- pt.c	(revision 247851)
+++ pt.c	(working copy)
@@ -5570,7 +5570,7 @@ template arguments to %qD do not match o
 SET_TYPE_TEMPLATE_INFO (TREE_TYPE (tmpl), info);
   else
 {
-  if (is_primary && !DECL_LANG_SPECIFIC (decl))
+  if (is_primary)
 	retrofit_lang_decl (decl);
   if (DECL_LANG_SPECIFIC (decl))
 	DECL_TEMPLATE_INFO (decl) = info;
@@ -15250,8 +15250,7 @@ tsubst_omp_clauses (tree clauses, enum c
 		tree decl = OMP_CLAUSE_DECL (nc);
 		if (VAR_P (decl))
 		  {
-		if (!DECL_LANG_SPECIFIC (decl))
-		  retrofit_lang_decl (decl);
+		retrofit_lang_decl (decl);
 		DECL_OMP_PRIVATIZED_MEMBER (decl) = 1;
 		  }
 	  }
@@ -25238,14 +25237,12 @@ do_class_deduction (tree ptype, tree tmp
 }
 
   ++cp_unevaluated_operand;
-  tree t = build_new_function_call (cands, &args, /*koenig*/false,
-tf_decltype);
+  tree t = build_new_function_call (cands, &args, tf_decltype);
 
   if (t == error_mark_node && (complain & tf_warning_or_error))
 {
   error ("class template argument deduction failed:");
-  t = build_new_function_call (cands, &args, /*koenig*/false,
-   complain | tf_decltype);
+  t = build_new_function_call (cands, &args, complain | tf_decltype);
   if (old_cands != cands)
 	inform (input_location, "explicit deduction guides not considered "
 		"for copy-initialization");
Index: semantics.c
===
--- semantics.c	(revision 247851)
+++ semantics.c	(working copy)
@@ -2438,7 +2438,7 @@ finish_call_expr (tree fn, vec

Re: Bump version namespace and remove _Rb_tree useless template parameter

2017-05-10 Thread François Dumont

On 10/05/2017 11:28, Jonathan Wakely wrote:

On 10/05/17 11:15 +0200, Paolo Carlini wrote:

Hi,

On 10/05/2017 11:12, Jonathan Wakely wrote:

Looks good to me. Paolo, what do you think about bumping the versioned
namespace and SONAME to 8?

Sure, makes sense to me too.


Please commit it then, François - thanks!



Done, don't hesitate to update the ChangeLog if I used some wrong terms.

I'll also send some changes requiring this bump.

François



Re: [patch] FreeBSD arm libgcc config.host

2017-05-10 Thread Andreas Tobler

On 07.05.17 21:30, Andreas Tobler wrote:

Hi all,

I'm going to commit the below patch to all active branches. (8,7,6,5)
It makes arm*-*-freebsd* use the generic FreeBSD t-slibgcc-elf-ver
definition. This makes all FreeBSD targets 'consistent' in this area.

If not ok, please speak up soon.


Commit done.

Andreas


TIA,
Andreas

2017-05-07  Andreas Tobler  

* config.host: Use the generic FreeBSD t-slibgcc-elf-ver for
arm*-*-freebsd instead of the t-slibgcc-libgcc.

Index: config.host
===
--- config.host (revision 247727)
+++ config.host (working copy)
@@ -397,7 +397,7 @@
;;
   arm*-*-freebsd*)# ARM FreeBSD EABI
tmake_file="${tmake_file} arm/t-arm t-fixedpoint-gnu-prefix arm/t-elf"
-   tmake_file="${tmake_file} arm/t-bpabi arm/t-freebsd t-slibgcc-libgcc"
+   tmake_file="${tmake_file} arm/t-bpabi arm/t-freebsd"
tm_file="${tm_file} arm/bpabi-lib.h"
unwind_header=config/arm/unwind-arm.h
tmake_file="${tmake_file} t-softfp-sfdf t-softfp-excl arm/t-softfp
t-softfp"





[C++ PATCH] ambiguous candidate printing

2017-05-10 Thread Nathan Sidwell
Ambiguous lookups return a tree list that can contain embedded overload 
lists.  This patch separates the printing somewhat more, so that the 
overload iterator I'll be introducing will slot in better.


It'd be nice for ambiguous lookups to return a specially marked 
overload, but that's a change for another day (I don't have such a 
change handy).
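
As an illustration only (not code from the patch): the prefix handling that
the reworked print_candidates_1 performs can be sketched in plain C.  The
first candidate gets "candidate is:" or "candidates are:" depending on
whether more follow, and later lines are indented by the same width.  The
function format_candidates and its buffer interface are assumptions made for
this sketch; the real code emits diagnostics via inform and obtains the
padding from get_spaces.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: format N candidate names into OUT (capacity CAP).
   The first line carries the prefix; later lines are padded with spaces
   to the same width so the names line up.  */
void
format_candidates (const char *const *names, size_t n, char *out, size_t cap)
{
  const char *pfx = n > 1 ? "candidates are:" : "candidate is:";
  size_t width = strlen (pfx);
  size_t used = 0;

  if (cap == 0)
    return;
  out[0] = '\0';

  for (size_t i = 0; i < n && used < cap; i++)
    {
      int len;
      if (i == 0)
	len = snprintf (out + used, cap - used, "%s %s\n", pfx, names[i]);
      else
	/* "%*s" with an empty string yields WIDTH spaces.  */
	len = snprintf (out + used, cap - used, "%*s %s\n",
			(int) width, "", names[i]);
      if (len < 0)
	break;
      used += (size_t) len;
    }
}
```

With a single name the output is one "candidate is:" line; with several, the
plural prefix appears once and the remaining names are aligned under the
first.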


Applied to trunk.

nathan
--
Nathan Sidwell
2017-05-10  Nathan Sidwell  

	* pt.c (print_candidates_1): Separate TREE_LIST and OVERLOAD
	printing.
	(print_candidates): Adjust.

Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c	(revision 247860)
+++ gcc/cp/pt.c	(working copy)
@@ -1922,43 +1922,28 @@ explicit_class_specialization_p (tree ty
in *STR when it ends.  */
 
 static void
-print_candidates_1 (tree fns, bool more, const char **str)
+print_candidates_1 (tree fns, char **str, bool more = false)
 {
-  tree fn, fn2;
-  char *spaces = NULL;
-
-  for (fn = fns; fn; fn = OVL_NEXT (fn))
-if (TREE_CODE (fn) == TREE_LIST)
-  {
-for (fn2 = fn; fn2 != NULL_TREE; fn2 = TREE_CHAIN (fn2))
-  print_candidates_1 (TREE_VALUE (fn2),
-  TREE_CHAIN (fn2) || more, str);
-  }
-else
+  if (TREE_CODE (fns) == TREE_LIST)
+for (; fns; fns = TREE_CHAIN (fns))
+  print_candidates_1 (TREE_VALUE (fns), str, more || TREE_CHAIN (fns));
+  else
+while (fns)
   {
-	tree cand = OVL_CURRENT (fn);
-if (!*str)
-  {
-/* Pick the prefix string.  */
-if (!more && !OVL_NEXT (fns))
-  {
-inform (DECL_SOURCE_LOCATION (cand),
-			"candidate is: %#qD", cand);
-continue;
-  }
+	tree cand = OVL_CURRENT (fns);
 
-*str = _("candidates are:");
-spaces = get_spaces (*str);
-  }
-	inform (DECL_SOURCE_LOCATION (cand), "%s %#qD", *str, cand);
-*str = spaces ? spaces : *str;
+	fns = OVL_NEXT (fns);
+	const char *pfx = *str;
+	if (!pfx)
+	  {
+	if (more || fns)
+	  pfx = _("candidates are:");
+	else
+	  pfx = _("candidate is:");
+	*str = get_spaces (pfx);
+	  }
+	inform (DECL_SOURCE_LOCATION (cand), "%s %#qD", pfx, cand);
   }
-
-  if (!more)
-{
-  free (spaces);
-  *str = NULL;
-}
 }
 
 /* Print the list of candidate FNS in an error message.  FNS can also
@@ -1967,9 +1952,9 @@ print_candidates_1 (tree fns, bool more,
 void
 print_candidates (tree fns)
 {
-  const char *str = NULL;
-  print_candidates_1 (fns, false, &str);
-  gcc_assert (str == NULL);
+  char *str = NULL;
+  print_candidates_1 (fns, &str);
+  free (str);
 }
 
 /* Get a (possibly) constrained template declaration for the


Re: [PATCH] Kill -fdump-translation-unit

2017-05-10 Thread Alexander Monakov
On Wed, 10 May 2017, Jakub Jelinek wrote:
> Can it at least be taken out of -fdump-tree-all?  It is huge, often larger
> than the sum of all the other dump files, and don't remember ever using it
> for anything.

Yes, apart from advertising the capability I don't imagine it's useful to
produce that dump without a special flag.


> Instead of trying to write a parser for it and reconstructing
> something you can then later analyze, isn't it better to just write a plugin
> that can analyze it directly?

I think I can understand people writing a parser when it's sufficient; it won't
need to be recompiled for a specific compiler version (with headers from that
compiler), won't crash the compiler if you did something wrong.  For people more
familiar with a dynamic language like Python than C/C++ it may be just more
comfortable to do it that way.

Alexander


[C++ PATCH] address of overload

2017-05-10 Thread Nathan Sidwell
We were gratuitously checking for an overload before using OVL_CURRENT, 
and then unilaterally using OVL_CURRENT on the result of that.


I also simplify taking the address of an overloaded fn by using the
'for (tree fn ...)' idiom, and strip out another unneeded anticipated-decl
stripping.


Applied to trunk.

nathan
--
Nathan Sidwell
2017-05-10  Nathan Sidwell  

	* class.c (handle_using_decl): Always use OVL_CURRENT.
	(resolve_address_of_overloaded_function): Move iterator decl into
	for scope.  Don't strip anticipated decls here.

Index: class.c
===
--- class.c	(revision 247863)
+++ class.c	(working copy)
@@ -1359,8 +1359,7 @@ handle_using_decl (tree using_decl, tree
 			 tf_warning_or_error);
   if (old_value)
 {
-  if (is_overloaded_fn (old_value))
-	old_value = OVL_CURRENT (old_value);
+  old_value = OVL_CURRENT (old_value);
 
   if (DECL_P (old_value) && DECL_CONTEXT (old_value) == t)
 	/* OK */;
@@ -1384,7 +1383,7 @@ handle_using_decl (tree using_decl, tree
 	{
 	  error ("%q+D invalid in %q#T", using_decl, t);
 	  error ("  because of local method %q+#D with same name",
-		 OVL_CURRENT (old_value));
+		 old_value);
 	  return;
 	}
 }
@@ -8184,39 +8183,29 @@ resolve_address_of_overloaded_function (
  if we're just going to throw them out anyhow.  But, of course, we
  can only do this when we don't *need* a template function.  */
   if (!template_only)
-{
-  tree fns;
-
-  for (fns = overload; fns; fns = OVL_NEXT (fns))
-	{
-	  tree fn = OVL_CURRENT (fns);
-
-	  if (TREE_CODE (fn) == TEMPLATE_DECL)
-	/* We're not looking for templates just yet.  */
-	continue;
+for (tree fns = overload; fns; fns = OVL_NEXT (fns))
+  {
+	tree fn = OVL_CURRENT (fns);
 
-	  if ((TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
-	  != is_ptrmem)
-	/* We're looking for a non-static member, and this isn't
-	   one, or vice versa.  */
-	continue;
+	if (TREE_CODE (fn) == TEMPLATE_DECL)
+	  /* We're not looking for templates just yet.  */
+	  continue;
 
-	  /* Ignore functions which haven't been explicitly
-	 declared.  */
-	  if (DECL_ANTICIPATED (fn))
-	continue;
+	if ((TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE) != is_ptrmem)
+	  /* We're looking for a non-static member, and this isn't
+	 one, or vice versa.  */
+	  continue;
 
-	  /* In C++17 we need the noexcept-qualifier to compare types.  */
-	  if (flag_noexcept_type)
-	maybe_instantiate_noexcept (fn);
-
-	  /* See if there's a match.  */
-	  tree fntype = static_fn_type (fn);
-	  if (same_type_p (target_fn_type, fntype)
-	  || fnptr_conv_p (target_fn_type, fntype))
-	matches = tree_cons (fn, NULL_TREE, matches);
-	}
-}
+	/* In C++17 we need the noexcept-qualifier to compare types.  */
+	if (flag_noexcept_type)
+	  maybe_instantiate_noexcept (fn);
+
+	/* See if there's a match.  */
+	tree fntype = static_fn_type (fn);
+	if (same_type_p (target_fn_type, fntype)
+	|| fnptr_conv_p (target_fn_type, fntype))
+	  matches = tree_cons (fn, NULL_TREE, matches);
+  }
 
   /* Now, if we've already got a match (or matches), there's no need
  to proceed to the template functions.  But, if we don't have a


Re: [PATCH] Kill -fdump-translation-unit

2017-05-10 Thread Nathan Sidwell

On 05/10/2017 01:58 PM, Jakub Jelinek wrote:


A quick search indicates that people have published .tu parsers in Perl, JS
(producing json), the person objecting on IRC apparently used Python, and I'm
aware of another Python-based parser by Bruce Merry.


Prior to Alex mentioning it, I was unaware of such parsers -- I'm 
surprised.  This is not a data interchange format, it's a debugging dump.


The fellow on IRC failed to mention that, and made the claim that the TU 
dump was the simplest way of determining sizeof (time_t) when one has a 
cross compiler.


nathan

--
Nathan Sidwell


[RFA] Improve tree-ssa-uninit.c's predicate simplification

2017-05-10 Thread Jeff Law


So I have some improvements to jump threading that are regressing one of 
the uninit-preds testcases.


The problem is we end up threading deeper into the CFG during VRP1. 
This changes the shape of the CFG such that the condition guarding a use 
changes in an interesting way.


Background:

The form of predicates in tree-ssa-uninit.c is a chain of IOR operations 
at the toplevel.  Each IOR operand can be a chain of AND operations.


ie we represent things like


(X & Y) (no IOR operations at all, just a chain of ANDs)

X | Y

X | ( Y & Z)

(A & B) | (Y & Z) | (P & D & Q)

You hopefully get the idea.



We cannot represent something like this:

(X | Y) & (A | B)

In this case the IORs are operands of the AND.
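
As an illustration only: the representation described above is an OR of AND
chains.  A minimal C sketch of that shape and of how it is evaluated follows.
The struct names, the fixed-size arrays and the use of integer indexes for
terms are assumptions made for this example; they are not the pred_chain and
pred_chain_union types in tree-ssa-uninit.c.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TERMS 4
#define MAX_CHAINS 4

/* One AND chain: every listed term must hold.  A term is an index into
   a truth assignment supplied by the caller.  */
struct and_chain
{
  size_t n;
  int term[MAX_TERMS];
};

/* The toplevel: an OR over AND chains.  */
struct or_of_chains
{
  size_t n;
  struct and_chain chain[MAX_CHAINS];
};

/* Return true if at least one chain has all of its terms true.  */
bool
eval_or_of_chains (const struct or_of_chains *u, const bool *value)
{
  for (size_t i = 0; i < u->n; i++)
    {
      bool all = true;
      for (size_t j = 0; j < u->chain[i].n; j++)
	if (!value[u->chain[i].term[j]])
	  {
	    all = false;
	    break;
	  }
      if (all)
	return true;
    }
  return false;
}
```

X | (Y & Z) fits this shape as two chains, {X} and {Y, Z}.  (X | Y) & (A | B)
does not, because an AND chain here can only hold plain terms.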

--


Without the additional threading we have a use predicate that looks 
something like this:


_3 != 0 (.AND.) _9 != 0
(.OR.)
_3 != 0 (.AND.)  (.NOT.) _9 != 0 (.AND.) r_10(D) <= 9
(.OR.)
 (.NOT.) _3 != 0 (.AND.) r_10(D) <= 9



Which simplifies nicely into:

_9 != 0
(.OR.)
r_10(D) <= 9


Which normalizes into:


m_7(D) > 100
(.OR.)
n_5(D) <= 9
(.OR.)
r_10(D) <= 9


Which is easily determined to be a subset of the problematical PHI's 
argument's guard.


With the additional threading the predicate chain for the use instead 
looks something like this:


_11 != 0 (.AND.) _30 != 0

If we were to look inside each predicate we'd see each is set from a 
BIT_IOR and it ought to expand into something like this:


(X | Y) & (X | Z)

But that's not a form we can really represent.  So no notable 
simplification or normalization occurs and the result is we're unable to 
determine the use guard is a subset of the conditions of the PHI 
argument's guard.  Thus the use does not appear to be properly guarded 
and we issue the false positive warning.


But you will notice that form has a common term, X.  We can rewrite it 
as X | (Y & Z) which is a form suitable for tree-ssa-uninit.c.  And 
that's precisely what this patch does.


It walks through the toplevel pred_chain_union.  Each element is a 
pred_chain.  Within the pred_chain we look for cases where the predicate 
is set from a BIT_IOR.  Given two predicates set from a BIT_IOR, we then 
check if there's a common term.


If there is a common term, then we extract the common term and add it to 
the toplevel pred_chain_union (X above).  The two existing predicates 
are replaced by the unique terms.  (Y and Z above).


By replacing the predicates within the pred_chain (as opposed to removal 
and pushing on new predicates), we can trivially look for additional 
opportunities to simplify the active pred_chain.


Anyway once rewritten as X | (Y & Z)  we can again see that use is 
properly guarded relative to the offending PHI argument and we do not 
warn for the use.
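
As an illustration only (not the code in the patch): the rewrite relies on
the identity (X | Y) & (X | Z) == X | (Y & Z) and on spotting a term shared
by two IOR-defined predicates.  Both can be sketched in a few lines of C.
The struct ior_pred, the integer term ids and the two function names are
assumptions made for this example.

```c
#include <stdbool.h>

/* Hypothetical model of a predicate defined as "lhs | rhs", with each
   operand identified by an integer id (think SSA name version).  */
struct ior_pred
{
  int lhs;
  int rhs;
};

/* If P and Q share an operand, store it in *COMMON, store the remaining
   operand of P in *Y and of Q in *Z, and return true.  */
bool
factor_common_term (struct ior_pred p, struct ior_pred q,
		    int *common, int *y, int *z)
{
  if (p.lhs == q.lhs)
    { *common = p.lhs; *y = p.rhs; *z = q.rhs; return true; }
  if (p.lhs == q.rhs)
    { *common = p.lhs; *y = p.rhs; *z = q.lhs; return true; }
  if (p.rhs == q.lhs)
    { *common = p.rhs; *y = p.lhs; *z = q.rhs; return true; }
  if (p.rhs == q.rhs)
    { *common = p.rhs; *y = p.lhs; *z = q.lhs; return true; }
  return false;
}

/* Check (X | Y) & (X | Z) == X | (Y & Z) over all eight assignments.  */
bool
identity_holds (void)
{
  for (int bits = 0; bits < 8; bits++)
    {
      bool x = (bits & 1) != 0;
      bool y = (bits & 2) != 0;
      bool z = (bits & 4) != 0;
      if (((x || y) && (x || z)) != (x || (y && z)))
	return false;
    }
  return true;
}
```

In the new testcase, _24 = _3 | _27 and _26 = _3 | _25 share _3, so the
sketch would report _3 as the common term with _27 and _25 left over.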


Bootstrapped and regression tested on x86_64-linux-gnu.  I wandered the 
bugs attached to our uninitialized meta BZ and didn't see anything which 
might obviously be fixed by this improvement (sigh).


The testcase is derived from uninit-pred-8_b.c with the one jump thread 
manually applied.  It will give a false positive uninit warning with the 
trunk, but does not with this patch applied.


OK for the trunk?

Jeff

ps. This is blocking moving forward with eliminating VRP's jump 
threading dependency on ASSERT_EXPRs :-)




* tree-ssa-uninit.c (simplify_preds_1): Simplify (X | Y) & (X | Z)
into X | (Y & Z).
(simplify_preds): Call it.

* gcc.dg/uninit-pred-8_e.c: New test.

diff --git a/gcc/testsuite/gcc.dg/uninit-pred-8_e.c 
b/gcc/testsuite/gcc.dg/uninit-pred-8_e.c
new file mode 100644
index 000..ede02a7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/uninit-pred-8_e.c
@@ -0,0 +1,52 @@
+/* { dg-do compile } */
+/* { dg-options "-Wuninitialized -O2" } */
+
+void bar (void);
+void blah1 (int);
+void blah2 (int);
+int g;
+int
+foo (int n, int l, int m, int r)
+{
+  int v;
+  _Bool _1;
+  _Bool _2;
+  _Bool _3;
+  _Bool _5;
+  _Bool _6;
+  _Bool _24;
+  _Bool _25;
+  _Bool _26;
+  _Bool _27;
+
+  _1 = n <= 9;
+  _2 = m > 100;
+  _3 = _1 | _2;
+  _27 = r <= 19;
+  if (_3 != 0)
+v = r;
+  else
+{
+  _5 = l != 0;
+  _6 = _5 | _27;
+  if (_6 != 0)
+v = r;
+}
+
+  if (m == 0)
+bar ();
+  else
+g++;
+
+  _24 = _3 | _27;
+  if (_24 == 0)
+return 0;
+
+  blah1 (v);   /* { dg-bogus "uninitialized" "bogus warning" } */
+  _25 = r <= 9;
+  _26 = _3 | _25;
+  if (_26 != 0)
+blah2 (v);  /* { dg-bogus "uninitialized" "bogus warning" } */
+
+  return 0;
+}
diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
index 60731b2..be99949 100644
--- a/gcc/tree-ssa-uninit.c
+++ b/gcc/tree-ssa-uninit.c
@@ -1582,6 +1582,8 @@ pred_neg_p (pred_info x1, pred_info x2)
   (x != 0 AND y != 0)
5) (X AND Y) OR (!X AND Z) OR (!Y AND Z) is equivalent to
   (X AND Y) OR Z
+   6) (X | Y) AND (X | Z) is equivalent to
+  X | (Y & Z)
 
PREDS is the predicate chains, and N is the number of chains.  */
 
@@ -1648,6 +1650,125 @@ simplify_pred (pre

[PATCH] Move target independent code to support target_clones attributes from i386 to common code

2017-05-10 Thread Michael Meissner
As I mentioned in the mail message:
https://gcc.gnu.org/ml/gcc/2017-05/msg00060.html

I'm working on adding the target_clones attribute support to the PowerPC.  I
have an implementation right now, but I want to iterate on it somewhat.

In doing the patch, I noticed there were several functions that were added to
the i386 port to enable target_clones that I could use without modification in
the PowerPC.  This patch moves these functions from i386.c to attribs.c.

I made a few changes to the functions in order to make them common code:

1)  I removed 'static' on the declarations.

2)  I renamed 'ix86_function_versions' to 'common_function_versions' and
changed TARGET_OPTION_FUNCTION_VERSIONS to point to that.

3)  I renamed make_name to make_unique_name.

4)  I removed a trailing space in one of the functions.

I have done bootstraps and make check tests on both x86_64 and PowerPC and
there were no regressions.  On the PowerPC, I included my initial
implementation of the target_clones support, but those patches are not part of
this patch submission.

Can I check this into the trunk?
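
As an illustration only, for readers unfamiliar with the helpers being moved:
the comment on sorted_attr_string describes tokenizing the comma separated
arguments, sorting them, and replacing non-identifier characters with "_".
A standalone sketch of that behaviour on a plain C string follows.  The
function name sorted_attr_sketch is made up, it takes a string rather than a
tree list, and joining the sorted tokens with "_" is an assumption of this
sketch.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of char pointers.  */
static int
cmp_tokens (const void *a, const void *b)
{
  return strcmp (*(char *const *) a, *(char *const *) b);
}

/* Hypothetical sketch: split ATTRS at commas, replace '=' and '-' with
   '_', sort the tokens and join them with '_'.  Returns a malloc'ed
   string the caller must free, or NULL on allocation failure.  */
char *
sorted_attr_sketch (const char *attrs)
{
  size_t len = strlen (attrs);
  size_t ntok = 1;
  for (const char *p = attrs; *p; p++)
    if (*p == ',')
      ntok++;

  char *copy = malloc (len + 1);
  char *out = malloc (len + 1);
  char **toks = malloc (ntok * sizeof *toks);
  if (!copy || !out || !toks)
    {
      free (copy);
      free (out);
      free (toks);
      return NULL;
    }

  memcpy (copy, attrs, len + 1);
  for (char *p = copy; *p; p++)
    if (*p == '=' || *p == '-')
      *p = '_';

  /* strtok skips empty tokens, so N never exceeds NTOK.  */
  size_t n = 0;
  for (char *t = strtok (copy, ","); t; t = strtok (NULL, ","))
    toks[n++] = t;
  qsort (toks, n, sizeof *toks, cmp_tokens);

  /* The joined result is never longer than the input.  */
  out[0] = '\0';
  for (size_t i = 0; i < n; i++)
    {
      if (i > 0)
	strcat (out, "_");
      strcat (out, toks[i]);
    }

  free (toks);
  free (copy);
  return out;
}
```

Because the tokens are sorted before joining, "sse4.2,arch=core2" and
"arch=core2,sse4.2" map to the same identifier.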

2017-05-10  Michael Meissner  

* attribs.h (sorted_attr_string): Move machine independent
functions for target clone support from the i386 port to common
code.  Rename ix86_function_versions to common_function_versions.
Rename make_name to make_unique_name.
(common_function_versions): Likewise.
(make_unique_name): Likewise.
(make_dispatcher_decl): Likewise.
(is_function_default_version): Likewise.
* attribs.c (attr_strcmp): Likewise.
(sorted_attr_string): Likewise.
(common_function_versions): Likewise.
(make_unique_name): Likewise.
(make_dispatcher_decl): Likewise.
(is_function_default_version): Likewise.
* config/i386/i386.c (attr_strcmp): Likewise.
(sorted_attr_string): Likewise.
(ix86_function_versions): Likewise.
(make_name): Likewise.
(make_dispatcher_decl): Likewise.
(is_function_default_version): Likewise.
(TARGET_OPTION_FUNCTION_VERSIONS): Update target function hook.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/attribs.h
===
--- gcc/attribs.h   (revision 247770)
+++ gcc/attribs.h   (working copy)
@@ -41,4 +41,10 @@ extern tree make_attribute (const char *
 extern struct scoped_attributes* register_scoped_attributes (const struct 
attribute_spec *,
 const char *);
 
+extern char *sorted_attr_string (tree);
+extern bool common_function_versions (tree, tree);
+extern char *make_unique_name (tree, const char *, bool);
+extern tree make_dispatcher_decl (const tree);
+extern bool is_function_default_version (const tree);
+
 #endif // GCC_ATTRIBS_H
Index: gcc/attribs.c
===
--- gcc/attribs.c   (revision 247770)
+++ gcc/attribs.c   (working copy)
@@ -690,3 +690,242 @@ make_attribute (const char *name, const 
   attr = tree_cons (attr_name, attr_args, chain);
   return attr;
 }
+
+
+/* Common functions used for target clone support.  */
+
+/* Comparator function to be used in qsort routine to sort attribute
+   specification strings to "target".  */
+
+static int
+attr_strcmp (const void *v1, const void *v2)
+{
+  const char *c1 = *(char *const*)v1;
+  const char *c2 = *(char *const*)v2;
+  return strcmp (c1, c2);
+}
+
+/* ARGLIST is the argument to target attribute.  This function tokenizes
+   the comma separated arguments, sorts them and returns a string which
+   is a unique identifier for the comma separated arguments.   It also
+   replaces non-identifier characters "=,-" with "_".  */
+
+char *
+sorted_attr_string (tree arglist)
+{
+  tree arg;
+  size_t str_len_sum = 0;
+  char **args = NULL;
+  char *attr_str, *ret_str;
+  char *attr = NULL;
+  unsigned int argnum = 1;
+  unsigned int i;
+
+  for (arg = arglist; arg; arg = TREE_CHAIN (arg))
+{
+  const char *str = TREE_STRING_POINTER (TREE_VALUE (arg));
+  size_t len = strlen (str);
+  str_len_sum += len + 1;
+  if (arg != arglist)
+   argnum++;
+  for (i = 0; i < strlen (str); i++)
+   if (str[i] == ',')
+ argnum++;
+}
+
+  attr_str = XNEWVEC (char, str_len_sum);
+  str_len_sum = 0;
+  for (arg = arglist; arg; arg = TREE_CHAIN (arg))
+{
+  const char *str = TREE_STRING_POINTER (TREE_VALUE (arg));
+  size_t len = strlen (str);
+  memcpy (attr_str + str_len_sum, str, len);
+  attr_str[str_len_sum + len] = TREE_CHAIN (arg) ? ',' : '\0';
+  str_len_sum += len + 1;
+}
+
+  /* Replace "=,-" with "_".  */
+  for (i = 0; i < strlen (attr_str); i++)
+if (attr_str[i] == '=' || attr_str[i]== '-')

[Committed] hppa: Reject changes to/from modes with zero size in pa_cannot_change_mode_class

2017-05-10 Thread John David Anglin
The attached change fixes PR target/79027.  On gcc-8 and gcc-7, under some
conditions that are a bit hard to reproduce, we get a floating point exception
when regcprop.c tries to change an SImode value to BLKmode.  The attached patch
to pa_cannot_change_mode_class rejects such changes.  As a result, fold-const.c
compiles successfully.

The bug is latent in all active branches.

Tested on hppa-unknown-linux-gnu and hppa2.0w-hp-hpux11.11 trunk.  Committed to 
trunk.

Dave
--
John David Anglin   dave.ang...@bell.net


2017-05-10  John David Anglin  

PR target/79027
* config/pa/pa.c (pa_cannot_change_mode_class): Reject changes to/from
modes with zero size.  Enhance comment.

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 247726)
+++ config/pa/pa.c  (working copy)
@@ -9962,19 +9981,23 @@
   if (from == to)
 return false;
 
+  if (GET_MODE_SIZE (from) == GET_MODE_SIZE (to))
+return false;
+
+  /* Reject changes to/from modes with zero size.  */
+  if (!GET_MODE_SIZE (from) || !GET_MODE_SIZE (to))
+return true;
+
   /* Reject changes to/from complex and vector modes.  */
   if (COMPLEX_MODE_P (from) || VECTOR_MODE_P (from)
   || COMPLEX_MODE_P (to) || VECTOR_MODE_P (to))
 return true;
   
-  if (GET_MODE_SIZE (from) == GET_MODE_SIZE (to))
-return false;
-
-  /* There is no way to load QImode or HImode values directly from
- memory.  SImode loads to the FP registers are not zero extended.
- On the 64-bit target, this conflicts with the definition of
- LOAD_EXTEND_OP.  Thus, we can't allow changing between modes
- with different sizes in the floating-point registers.  */
+  /* There is no way to load QImode or HImode values directly from memory
+ to a FP register.  SImode loads to the FP registers are not zero
+ extended.  On the 64-bit target, this conflicts with the definition
+ of LOAD_EXTEND_OP.  Thus, we can't allow changing between modes with
+ different sizes in the floating-point registers.  */
   if (MAYBE_FP_REG_CLASS_P (rclass))
 return true;
 


[PATCH] hppa: Fix pa_assemble_integer

2017-05-10 Thread John David Anglin
There are situations on hppa when calling assemble_external will result in
actual assembly code being generated.  An example is shown in PR target/80090,
where compiling autodock-vina fails.

The problem is that output_addr_const may output visibility information while 
outputting an address constant:

case SYMBOL_REF:
  if (SYMBOL_REF_DECL (x))
assemble_external (SYMBOL_REF_DECL (x));
#ifdef ASM_OUTPUT_SYMBOL_REF
  ASM_OUTPUT_SYMBOL_REF (file, x);
#else
  assemble_name (file, XSTR (x, 0));
#endif
  break;

This can result in visibility information being output in the middle of an 
assembly line.  The attached change works
around this issue by calling assemble_external earlier, when we have a 
SYMBOL_REF_DECL, and temporarily
setting the SYMBOL_REF_DECL to NULL.
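
As an illustration only (not the code in the patch): the workaround is a
save, clear and restore around the generic printer.  A minimal standalone
sketch follows.  The struct symbol, the helper names, the buffer interface
and the ".import" and ".word" text are stand-ins chosen for this example;
they do not reproduce the real output of assemble_external or
output_addr_const.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a SYMBOL_REF with an optional decl.  */
struct symbol
{
  const char *name;
  const char *decl;	/* NULL when nothing needs declaring.  */
};

/* Append S to BUF (capacity CAP), truncating if needed.  */
static void
append (char *buf, size_t cap, const char *s)
{
  size_t used = strlen (buf);
  if (used + 1 < cap)
    strncat (buf, s, cap - used - 1);
}

/* Emit the declaration for SYM on a line of its own.  */
static void
declare_symbol (char *buf, size_t cap, const char *decl)
{
  append (buf, cap, "\t.import ");
  append (buf, cap, decl);
  append (buf, cap, "\n");
}

/* Stand-in for the generic printer: it declares the symbol on use,
   which lands mid-line if the caller already started a directive.  */
static void
print_symbol (char *buf, size_t cap, const struct symbol *sym)
{
  if (sym->decl)
    declare_symbol (buf, cap, sym->decl);
  append (buf, cap, sym->name);
}

/* Declare first, hide the decl while printing, then restore it.  */
void
emit_word (char *buf, size_t cap, struct symbol *sym)
{
  const char *saved = sym->decl;
  if (saved)
    {
      declare_symbol (buf, cap, saved);
      sym->decl = NULL;
    }

  append (buf, cap, "\t.word ");
  print_symbol (buf, cap, sym);
  append (buf, cap, "\n");

  sym->decl = saved;
}
```

Without the clear step, the declaration text would be appended after
"\t.word " and split the directive in two.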

Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and 
hppa64-hp-hpux11.11.  Committed to trunk.

Dave
--
John David Anglin   dave.ang...@bell.net


2017-05-10  John David Anglin  

PR target/80090
* config/pa/pa.c (pa_assemble_integer): When outputting a SYMBOL_REF,
handle calling assemble_external ourself.

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 247871)
+++ config/pa/pa.c  (working copy)
@@ -3299,6 +3299,24 @@
 static bool
 pa_assemble_integer (rtx x, unsigned int size, int aligned_p)
 {
+  bool result;
+  tree decl = NULL;
+
+  /* When we have a SYMBOL_REF with a SYMBOL_REF_DECL, we need to call
+ assemble_external and set the SYMBOL_REF_DECL to NULL before
+ calling output_addr_const.  Otherwise, it may call assemble_external
+ in the midst of outputting the assembler code for the SYMBOL_REF.
+ We restore the SYMBOL_REF_DECL after the output is done.  */
+  if (GET_CODE (x) == SYMBOL_REF)
+{
+  decl = SYMBOL_REF_DECL (x);
+  if (decl)
+   {
+ assemble_external (decl);
+ SET_SYMBOL_REF_DECL (x, NULL);
+   }
+}
+
   if (size == UNITS_PER_WORD
   && aligned_p
   && function_label_operand (x, VOIDmode))
@@ -3311,9 +3329,15 @@
 
   output_addr_const (asm_out_file, x);
   fputc ('\n', asm_out_file);
-  return true;
+  result = true;
 }
-  return default_assemble_integer (x, size, aligned_p);
+  else
+result = default_assemble_integer (x, size, aligned_p);
+
+  if (decl)
+SET_SYMBOL_REF_DECL (x, decl);
+
+  return result;
 }
 
 /* Output an ascii string.  */


Re: Go patches committed: merge recent changes to gofrontend

2017-05-10 Thread Andrew Pinski
On Wed, May 10, 2017 at 10:26 AM, Ian Lance Taylor  wrote:
> I have committed a large patch to update the Go frontend and libgo to
> the recent changes in the gofrontend repository.  I had postponed
> merging changes during the GCC 7 release process.  I am now merging
> all the changes that were pending during that period.  Although this
> is a merged patch, the changes can be seen individually in the
> gofrontend repo (https://go.googlesource.com/gofrontend).  They are
> also listed below.
>
> This is a fairly significant patch that brings in the concurrent
> garbage collector used in the Go 1.8 runtime.  This significantly
> reduces pauses due to garbage collection while running a Go program.
>
> This patch also brings in experimental support for AIX for gccgo,
> contributed by Matthieu Sarter and others at Atos Infogérance.
>
> The actual patch is too large for this e-mail patch, but I have
> attached all the changes to the gcc/go directory.
>
> Ian


This causes a build failure on aarch64-linux-gnu:
../../../gcc/libgo/runtime/proc.c: In function ‘runtime_malg’:
../../../gcc/libgo/runtime/proc.c:729:43: warning: implicit
declaration of function ‘mstats’; did you mean ‘mstart1’?
[-Wimplicit-function-declaration]
void *p = runtime_sysAlloc(stacksize, &mstats()->other_sys);
   ^~
   mstart1
../../../gcc/libgo/runtime/proc.c:729:51: error: invalid type argument
of ‘->’ (have ‘int’)
void *p = runtime_sysAlloc(stacksize, &mstats()->other_sys);
   ^~


Thanks,
Andrew



>
>
> 2017-05-10  Than McIntosh  
>
> * go-backend.c: Include "go-c.h".
> * go-gcc.cc (Gcc_backend::write_export_data): New method.
>
> 2017-05-10  Ian Lance Taylor  
>
> * go-gcc.cc (Gcc_backend::Gcc_backend): Declare
> __builtin_prefetch.
> * Make-lang.in (GO_OBJS): Add go/wb.o.
>
> commit 884c9f2cafb3fc1decaca70f1817ae269e4c6889
> Author: Than McIntosh 
> Date:   Mon Jan 23 15:07:07 2017 -0500
>
> compiler: insert additional conversion for type desc ptr expr
>
> Change the method Type::type_descriptor_pointer to apply an additional
> type conversion to its result Bexpression, to avoid type clashes in
> the back end. The backend expression for a given type descriptor var
> is given a type of "_type", however the virtual calls that create the
> variable use types derived from _type, hence the need to force a
> conversion.
>
> Reviewed-on: https://go-review.googlesource.com/35506
>
>
> commit 5f0647c71e3b29eddcd0eecc44e7ba44ae7fc8dd
> Author: Than McIntosh 
> Date:   Mon Jan 23 15:22:26 2017 -0500
>
> compiler: insure tree integrity in Call_expression::set_result
>
> Depending on the back end, it can be problematic to reuse Bexpressions
> (passing the same Bexpression to more than one Backend call to create
> additional Bexpressions or Bstatements). The Call_expression::set_result
> method was reusing its Bexpression input in more than one tree
> context; the fix is to pass in an Expression instead and generate
> multiple Bexpression references to it within the method.
>
> Reviewed-on: https://go-review.googlesource.com/35505
>
>
> commit 7a8e49870885af898c3c790275e513d1764a2828
> Author: Ian Lance Taylor 
> Date:   Tue Jan 24 21:19:06 2017 -0800
>
> runtime: copy more of the scheduler from the Go 1.8 runtime
>
> Copies mstart, newm, m0, g0, and friends.
>
> Reviewed-on: https://go-review.googlesource.com/35645
>
>
> commit 3546e2f002d0277d805ec59c5403bc1d4eda4ed9
> Author: Ian Lance Taylor 
> Date:   Thu Jan 26 19:47:37 2017 -0800
>
> runtime: remove a few C functions that are no longer used
>
> Reviewed-on: https://go-review.googlesource.com/35849
>
>
> commit a71b835254f6d3164a0e6beaf54f2b175d1a6a92
> Author: Ian Lance Taylor 
> Date:   Thu Jan 26 16:51:16 2017 -0800
>
> runtime: copy over more of the Go 1.8 scheduler
>
> In particular __go_go (aka newproc) and goexit[01].
>
> Reviewed-on: https://go-review.googlesource.com/35847
>
>
> commit c3725adbe54d8283c373b6aa7dc95d6fc27f
> Author: Ian Lance Taylor 
> Date:   Fri Jan 27 16:58:20 2017 -0800
>
> runtime: copy syscall handling from Go 1.8 runtime
>
> Entering a syscall still has to start in C, to save the registers.
> Fix entersyscallblock to save them more reliably.
>
> This copies over the tracing code for syscalls, which we previously
> weren't doing, and lets us turn on runtime/trace/check.
>
> Reviewed-on: https://go-review.googlesource.com/35912
>
>
> commit d5b921de4a28b04000fc4c8dac7f529a4a624dfc
> Author: Ian Lance Taylor 
> Date:   Fri Jan 27 18:34:11 2017 -0800
>
> runtime: copy SIGPROF handling from Go 1.8 runtime
>
> Also copy over Breakpoint.
>
> Fix Func.Name and Func.Entry to not crash on a nil Func.
>
> Reviewed-on: https://go-review.googlesource.com/35913
>
>
> commit cc60235e55aef14b15c3d211403

Re: Go patches committed: merge recent changes to gofrontend

2017-05-10 Thread Andrew Pinski
On Wed, May 10, 2017 at 5:37 PM, Andrew Pinski  wrote:
> On Wed, May 10, 2017 at 10:26 AM, Ian Lance Taylor  wrote:
>> I have committed a large patch to update the Go frontend and libgo to
>> the recent changes in the gofrontend repository.  I had postponed
>> merging changes during the GCC 7 release process.  I am now merging
>> all the changes that were pending during that period.  Although this
>> is a merged patch, the changes can be seen individually in the
>> gofrontend repo (https://go.googlesource.com/gofrontend).  They are
>> also listed below.
>>
>> This is a fairly significant patch that brings in the concurrent
>> garbage collector used in the Go 1.8 runtime.  This significantly
>> reduces pauses due to garbage collection while running a Go program.
>>
>> This patch also brings in experimental support for AIX for gccgo,
>> contributed by Matthieu Sarter and others at Atos Infogérance.
>>
>> The actual patch is too large for this e-mail patch, but I have
>> attached all the changes to the gcc/go directory.
>>
>> Ian
>
>
> This causes a build failure on aarch64-linux-gnu:
> ../../../gcc/libgo/runtime/proc.c: In function ‘runtime_malg’:
> ../../../gcc/libgo/runtime/proc.c:729:43: warning: implicit
> declaration of function ‘mstats’; did you mean ‘mstart1’?
> [-Wimplicit-function-declaration]
> void *p = runtime_sysAlloc(stacksize, &mstats()->other_sys);
>^~
>mstart1
> ../../../gcc/libgo/runtime/proc.c:729:51: error: invalid type argument
> of ‘->’ (have ‘int’)
> void *p = runtime_sysAlloc(stacksize, &mstats()->other_sys);
>^~
>

Just FYI, the reason it fails on aarch64-linux-gnu but not on
x86_64-linux-gnu is that this code is only enabled for targets
which do not support split stacks.
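
A minimal sketch of the failure mode, assuming the call sits in the
non-split-stack branch of a preprocessor conditional.  The names
SKETCH_SPLIT_STACK, mstats_sketch and account_alloc are hypothetical
stand-ins, not the libgo identifiers.  With the accessor declared before
use, `->` is applied to a pointer; with no declaration in scope, C treats
the call as returning int, which is the error quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's memory statistics record.  */
struct mstats_sketch_t
{
  size_t other_sys;
};

static struct mstats_sketch_t stats_storage;

/* The accessor is declared before its first use.  If this declaration
   were missing, the call below would be implicitly declared as
   returning int, and "->" on an int is rejected.  */
static struct mstats_sketch_t *
mstats_sketch (void)
{
  return &stats_storage;
}

/* Hypothetical allocator stand-in: it only accounts for the size.  */
static void
account_alloc (size_t size, size_t *stat)
{
  assert (stat != NULL);
  *stat += size;
}

/* Returns the running total recorded in other_sys.  */
static size_t
alloc_stack (size_t stacksize)
{
#ifdef SKETCH_SPLIT_STACK
  /* Split-stack targets compile this branch, so the accessor call in
     the other branch is never seen by the compiler there.  */
  (void) stacksize;
  return 0;
#else
  account_alloc (stacksize, &mstats_sketch ()->other_sys);
  return mstats_sketch ()->other_sys;
#endif
}
```

Building with -DSKETCH_SPLIT_STACK hides the accessor call entirely,
which is why a missing declaration can go unnoticed on targets that take
that branch.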

Thanks,
Andrew Pinski

>
> Thanks,
> Andrew
>
>
>
>>
>>
>> 2017-05-10  Than McIntosh  
>>
>> * go-backend.c: Include "go-c.h".
>> * go-gcc.cc (Gcc_backend::write_export_data): New method.
>>
>> 2017-05-10  Ian Lance Taylor  
>>
>> * go-gcc.cc (Gcc_backend::Gcc_backend): Declare
>> __builtin_prefetch.
>> * Make-lang.in (GO_OBJS): Add go/wb.o.
>>
>> commit 884c9f2cafb3fc1decaca70f1817ae269e4c6889
>> Author: Than McIntosh 
>> Date:   Mon Jan 23 15:07:07 2017 -0500
>>
>> compiler: insert additional conversion for type desc ptr expr
>>
>> Change the method Type::type_descriptor_pointer to apply an additional
>> type conversion to its result Bexpression, to avoid type clashes in
>> the back end. The backend expression for a given type descriptor var
>> is given a type of "_type", however the virtual calls that create the
>> variable use types derived from _type, hence the need to force a
>> conversion.
>>
>> Reviewed-on: https://go-review.googlesource.com/35506
>>
>>
>> commit 5f0647c71e3b29eddcd0eecc44e7ba44ae7fc8dd
>> Author: Than McIntosh 
>> Date:   Mon Jan 23 15:22:26 2017 -0500
>>
>> compiler: insure tree integrity in Call_expression::set_result
>>
>> Depending on the back end, it can be problematic to reuse Bexpressions
>> (passing the same Bexpression to more than one Backend call to create
>> additional Bexpressions or Bstatements). The Call_expression::set_result
>> method was reusing its Bexpression input in more than one tree
>> context; the fix is to pass in an Expression instead and generate
>> multiple Bexpression references to it within the method.
>>
>> Reviewed-on: https://go-review.googlesource.com/35505
>>
>>
>> commit 7a8e49870885af898c3c790275e513d1764a2828
>> Author: Ian Lance Taylor 
>> Date:   Tue Jan 24 21:19:06 2017 -0800
>>
>> runtime: copy more of the scheduler from the Go 1.8 runtime
>>
>> Copies mstart, newm, m0, g0, and friends.
>>
>> Reviewed-on: https://go-review.googlesource.com/35645
>>
>>
>> commit 3546e2f002d0277d805ec59c5403bc1d4eda4ed9
>> Author: Ian Lance Taylor 
>> Date:   Thu Jan 26 19:47:37 2017 -0800
>>
>> runtime: remove a few C functions that are no longer used
>>
>> Reviewed-on: https://go-review.googlesource.com/35849
>>
>>
>> commit a71b835254f6d3164a0e6beaf54f2b175d1a6a92
>> Author: Ian Lance Taylor 
>> Date:   Thu Jan 26 16:51:16 2017 -0800
>>
>> runtime: copy over more of the Go 1.8 scheduler
>>
>> In particular __go_go (aka newproc) and goexit[01].
>>
>> Reviewed-on: https://go-review.googlesource.com/35847
>>
>>
>> commit c3725adbe54d8283c373b6aa7dc95d6fc27f
>> Author: Ian Lance Taylor 
>> Date:   Fri Jan 27 16:58:20 2017 -0800
>>
>> runtime: copy syscall handling from Go 1.8 runtime
>>
>> Entering a syscall still has to start in C, to save the registers.
>> Fix entersyscallblock to save them more reliably.
>>
>> This copies over the tracing code for syscalls, which we previously
>> weren't doing, and lets us turn on runtime/trace/check.
>>
>> Reviewed-on: https://go-review.googlesource.com

Re: [PING][PATCH] Move the check for any_condjump_p from sched-deps to target macros

2017-05-10 Thread Hurugalawadi, Naveen
Hi,

>> Doesn't this avoid calling the target hook in cases where it used to 
>> call it before?

Yes. Thanks for pointing it out.

>> Consider a conditional jump inside a parallel that is not a single set.

Please find attached the modified patch, which handles the case mentioned.

Bootstrapped and regression tested on AArch64 and X86_64.
Please review the patch and let us know if it's okay.

Thanks,
Naveen
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 2e385c4..b38b8b7 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -13973,13 +13973,23 @@ aarch_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
 {
   enum attr_type prev_type = get_attr_type (prev);
 
-  /* FIXME: this misses some which is considered simple arthematic
- instructions for ThunderX.  Simple shifts are missed here.  */
-  if (prev_type == TYPE_ALUS_SREG
-  || prev_type == TYPE_ALUS_IMM
-  || prev_type == TYPE_LOGICS_REG
-  || prev_type == TYPE_LOGICS_IMM)
-return true;
+  unsigned int condreg1, condreg2;
+  rtx cc_reg_1;
+  aarch64_fixed_condition_code_regs (&condreg1, &condreg2);
+  cc_reg_1 = gen_rtx_REG (CCmode, condreg1);
+
+  if (reg_referenced_p (cc_reg_1, PATTERN (curr))
+	  && prev
+	  && modified_in_p (cc_reg_1, prev))
+	{
+	  /* FIXME: this misses some which is considered simple arthematic
+	 instructions for ThunderX.  Simple shifts are missed here.  */
+	  if (prev_type == TYPE_ALUS_SREG
+	  || prev_type == TYPE_ALUS_IMM
+	  || prev_type == TYPE_LOGICS_REG
+	  || prev_type == TYPE_LOGICS_IMM)
+	return true;
+	}
 }
 
   return false;
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0b2fa1b..af14c90 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -29483,6 +29483,15 @@ ix86_macro_fusion_pair_p (rtx_insn *condgen, rtx_insn *condjmp)
   if (!any_condjump_p (condjmp))
 return false;
 
+  unsigned int condreg1, condreg2;
+  rtx cc_reg_1;
+  ix86_fixed_condition_code_regs (&condreg1, &condreg2);
+  cc_reg_1 = gen_rtx_REG (CCmode, condreg1);
+  if (!reg_referenced_p (cc_reg_1, PATTERN (condjmp))
+  || !condgen
+  || !modified_in_p (cc_reg_1, condgen))
+return false;
+
   if (get_attr_type (condgen) != TYPE_TEST
   && get_attr_type (condgen) != TYPE_ICMP
   && get_attr_type (condgen) != TYPE_INCDEC
diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index b2393bf..4c459e6 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -2834,34 +2834,30 @@ static void
 sched_macro_fuse_insns (rtx_insn *insn)
 {
   rtx_insn *prev;
-
+  prev = prev_nonnote_nondebug_insn (insn);
+  if (!prev)
+return;
+ 
   if (any_condjump_p (insn))
 {
   unsigned int condreg1, condreg2;
   rtx cc_reg_1;
   targetm.fixed_condition_code_regs (&condreg1, &condreg2);
   cc_reg_1 = gen_rtx_REG (CCmode, condreg1);
-  prev = prev_nonnote_nondebug_insn (insn);
-  if (!reg_referenced_p (cc_reg_1, PATTERN (insn))
-  || !prev
-  || !modified_in_p (cc_reg_1, prev))
-return;
+  if (reg_referenced_p (cc_reg_1, PATTERN (insn))
+	  && modified_in_p (cc_reg_1, prev))
+	{
+	  if (targetm.sched.macro_fusion_pair_p (prev, insn))
+	SCHED_GROUP_P (insn) = 1;
+	  return;
+	}
 }
-  else
-{
-  rtx insn_set = single_set (insn);
-
-  prev = prev_nonnote_nondebug_insn (insn);
-  if (!prev
-  || !insn_set
-  || !single_set (prev))
-return;
 
+  if (single_set (insn) && single_set (prev))
+{
+  if (targetm.sched.macro_fusion_pair_p (prev, insn))
+	SCHED_GROUP_P (insn) = 1;
 }
-
-  if (targetm.sched.macro_fusion_pair_p (prev, insn))
-SCHED_GROUP_P (insn) = 1;
-
 }
 
 /* Get the implicit reg pending clobbers for INSN and save them in TEMP.  */
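
The control flow that the sched-deps.c hunk above produces can be
sketched as a standalone model.  Every type and helper here is a
hypothetical stand-in rather than GCC's API: the boolean fields replace
any_condjump_p, reg_referenced_p, modified_in_p and single_set, and
target_fusion_pair_p replaces the target hook and simply accepts every
pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the properties the scheduler queries.  */
struct insn_model
{
  bool is_condjump;   /* stands in for any_condjump_p (insn) */
  bool reads_cc;      /* stands in for reg_referenced_p (cc_reg_1, ...) */
  bool writes_cc;     /* stands in for modified_in_p (cc_reg_1, ...) */
  bool single_set;    /* stands in for single_set (insn) != NULL */
  bool sched_group;   /* stands in for SCHED_GROUP_P (insn) */
};

/* Hypothetical target hook: this sketch accepts every pair.  */
static bool
target_fusion_pair_p (const struct insn_model *prev,
		      const struct insn_model *insn)
{
  (void) prev;
  (void) insn;
  return true;
}

/* Mirrors the patched flow: bail out without a previous insn, try the
   condition-code path for conditional jumps, then fall back to the
   single-set path.  */
static void
fuse_model (struct insn_model *prev, struct insn_model *insn)
{
  assert (insn != NULL);
  if (prev == NULL)
    return;

  if (insn->is_condjump && insn->reads_cc && prev->writes_cc)
    {
      if (target_fusion_pair_p (prev, insn))
	insn->sched_group = true;
      return;
    }

  if (insn->single_set && prev->single_set)
    {
      if (target_fusion_pair_p (prev, insn))
	insn->sched_group = true;
    }
}
```

In this model a conditional jump that is not a single set still reaches
the hook through the condition-code path, which is the case raised in
the review.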


Re: [PING2][PATCH][AArch64] Add addr_type attribute

2017-05-10 Thread Hurugalawadi, Naveen
Hi,  

Please consider this as a personal reminder to review the patch
at the following link and let me know your comments on the same.

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00222.html

Thanks,
Naveen


    
